git@vger.kernel.org mailing list mirror (one of many)
 help / Atom feed
Search results ordered by [date|relevance]  view[summary|threaded|Atom feed]

* [PATCHv2 5/6] builtin/grep.c: respect 'submodule.recurse' option
      [irrelevant] <20170522194806.13568-1-sbeller@google.com>
@ 2017-05-22 19:48 ` Stefan Beller
  2017-05-22 19:48 ` [PATCHv2 6/6] builtin/push.c: respect 'submodule.recurse' option Stefan Beller
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-05-22 19:48 UTC (permalink / raw)
  To: gitster; +Cc: git, bmwill, Stefan Beller

In builtin/grep.c we parse the config before evaluating the command line
options. This makes the task of teaching grep to respect the new config
option 'submodule.recurse' very easy by just parsing that option.

As an alternative I had implemented a similar structure to treat
submodules as the fetch/push command have, including
* aligning the meaning of the 'recurse_submodules' to possible submodule
  values RECURSE_SUBMODULES_* as defined in submodule.h.
* having a callback to parse the value and
* reacting to the RECURSE_SUBMODULES_DEFAULT state that was the initial
  state.

However all this is not needed for a true boolean value, so let's keep
it simple. However this adds another place where "submodule.recurse" is
parsed.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 builtin/grep.c                     |  3 +++
 t/t7814-grep-recurse-submodules.sh | 18 ++++++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/builtin/grep.c b/builtin/grep.c
index 65070c52fc..7b998801fe 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -291,6 +291,9 @@ static int grep_cmd_config(const char *var, const char *value, void *cb)
 			    num_threads, var);
 	}
 
+	if (!strcmp(var, "submodule.recurse"))
+		recurse_submodules = git_config_bool(var, value);
+
 	return st;
 }
 
diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh
index 5b6eb3a65e..234d2d188b 100755
--- a/t/t7814-grep-recurse-submodules.sh
+++ b/t/t7814-grep-recurse-submodules.sh
@@ -33,6 +33,24 @@ test_expect_success 'grep correctly finds patterns in a submodule' '
 	test_cmp expect actual
 '
 
+test_expect_success 'grep finds patterns in a submodule via config' '
+	test_config submodule.recurse true &&
+	# expect from previous test
+	git grep -e "bar" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'grep --no-recurse-submodules overrides config' '
+	test_config submodule.recurse true &&
+	cat >expect <<-\EOF &&
+	a:foobar
+	b/b:bar
+	EOF
+
+	git grep -e "bar" --no-recurse-submodules >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'grep and basic pathspecs' '
 	cat >expect <<-\EOF &&
 	submodule/a:foobar
-- 
2.13.0.18.g7d86cc8ba0


^ permalink raw reply	[relevance 31%]

* [PATCHv2 6/6] builtin/push.c: respect 'submodule.recurse' option
      [irrelevant] <20170522194806.13568-1-sbeller@google.com>
  2017-05-22 19:48 ` [PATCHv2 5/6] builtin/grep.c: respect 'submodule.recurse' option Stefan Beller
@ 2017-05-22 19:48 ` Stefan Beller
      [irrelevant] ` <20170522194806.13568-3-sbeller@google.com>
      [irrelevant] ` <20170522194806.13568-2-sbeller@google.com>
  3 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-05-22 19:48 UTC (permalink / raw)
  To: gitster; +Cc: git, bmwill, Stefan Beller

The closest mapping from the boolean 'submodule.recurse' set to "yes"
to the variety of submodule push modes is "on-demand", so implement that.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 builtin/push.c                 |  4 ++++
 t/t5531-deep-submodule-push.sh | 21 +++++++++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/builtin/push.c b/builtin/push.c
index 5c22e9f2e5..fcf66b3bec 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -498,6 +498,10 @@ static int git_push_config(const char *k, const char *v, void *cb)
 		const char *value;
 		if (!git_config_get_value("push.recursesubmodules", &value))
 			recurse_submodules = parse_push_recurse_submodules_arg(k, value);
+	} else if (!strcmp(k, "submodule.recurse")) {
+		int val = git_config_bool(k, v) ?
+			RECURSE_SUBMODULES_ON_DEMAND : RECURSE_SUBMODULES_OFF;
+		recurse_submodules = val;
 	}
 
 	return git_default_config(k, v, NULL);
diff --git a/t/t5531-deep-submodule-push.sh b/t/t5531-deep-submodule-push.sh
index f55137f76f..97c1f14f6b 100755
--- a/t/t5531-deep-submodule-push.sh
+++ b/t/t5531-deep-submodule-push.sh
@@ -126,6 +126,27 @@ test_expect_success 'push succeeds if submodule commit not on remote but using o
 	)
 '
 
+test_expect_success 'push succeeds if submodule commit not on remote but using auto-on-demand via submodule.recurse config' '
+	(
+		cd work/gar/bage &&
+		>recurse-on-demand-from-submodule-recurse-config &&
+		git add recurse-on-demand-from-submodule-recurse-config &&
+		git commit -m "Recurse submodule.recurse from config junk"
+	) &&
+	(
+		cd work &&
+		git add gar/bage &&
+		git commit -m "Recurse submodule.recurse from config for gar/bage" &&
+		git -c submodule.recurse push ../pub.git master &&
+		# Check that the supermodule commit got there
+		git fetch ../pub.git &&
+		git diff --quiet FETCH_HEAD master &&
+		# Check that the submodule commit got there too
+		cd gar/bage &&
+		git diff --quiet origin/master master
+	)
+'
+
 test_expect_success 'push recurse-submodules on command line overrides config' '
 	(
 		cd work/gar/bage &&
-- 
2.13.0.18.g7d86cc8ba0


^ permalink raw reply	[relevance 30%]

* Re: [GSoC][PATCH v4 2/2] submodule: port subcommand foreach from shell to C
      [irrelevant]   ` <20170521125814.26255-2-pc44800@gmail.com>
@ 2017-05-22 20:04     ` Stefan Beller
  2017-05-23 19:09       ` Brandon Williams
  2017-05-23 19:36     ` Brandon Williams
  2017-05-26 15:17     ` [GSoC][PATCH v5 1/3] submodule: fix buggy $path and $sm_path variable's value Prathamesh Chavan
  2 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-05-22 20:04 UTC (permalink / raw)
  To: Prathamesh Chavan; +Cc: git, Christian Couder, Jeff King, Brandon Williams, Ramsay Jones

On Sun, May 21, 2017 at 5:58 AM, Prathamesh Chavan <pc44800@gmail.com> wrote:

> I have also made some changes in git-submodule.sh for correcting
> the $path variable. And hence made the corresponding changes in
> the new test introduced in t7407-submodule-foreach as well.
> I have push this work at:
> https://github.com/pratham-pc/git/commits/foreach-bug-fixed

This one seems to pass the test suite by having the bug fixed.
(The patches posted here seems to be
https://github.com/pratham-pc/git/commits/foreach
which does not pass tests? These two series seem to only differ in
the bug fix commit, which I think is a good idea to include, as then we
have a bug fixed and the tests pass.)

> +static void for_each_submodule_list(const struct module_list list, submodule_list_func_t fn, void *cb_data)
..
> +       return;

no need for an explicit return in a void function.

> +struct cb_foreach {
> +       int argc;
> +       const char **argv;
> +       const char *prefix;
> +       unsigned int quiet: 1;
> +       unsigned int recursive: 1;
> +};
> +#define CB_FOREACH_INIT { 0, NULL, 0, 0 }

This static initializer doesn't quite match the struct,
(I would expect two NULLs as we have two const char pointers).

> +
> +       info.argc = argc;
> +       info.argv = argv;
> +       info.prefix = prefix;
> +       info.quiet = !!quiet;
> +       info.recursive = !!recursive;

as you assign all fields of the struct yourself, you could also omit the
static initialization via _INIT above.


Apart from these two minor nits the code looks good to me.
However we'd really want to have the bug fix patch as well.
(At the time of submission of a patch we should not be aware
of any tests failing, which we are without said bug fix patch)

Thanks,
Stefan

^ permalink raw reply	[relevance 15%]

* Re: [GSoC][PATCH v1 2/2] submodule: port submodule subcommand status
      [irrelevant] ` <20170521122711.22021-2-pc44800@gmail.com>
@ 2017-05-22 21:28   ` Stefan Beller
  2017-06-05 20:25     ` [GSoC][PATCH v2 1/2] submodule: port set_name_rev from shell to C Prathamesh Chavan
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-05-22 21:28 UTC (permalink / raw)
  To: Prathamesh Chavan; +Cc: git, Christian Couder, Jeff King

On Sun, May 21, 2017 at 5:27 AM, Prathamesh Chavan <pc44800@gmail.com> wrote:
> This aims to make git-submodule status a builtin. 'status' is ported
> to submodule--helper, and submodule--helper is called from
> git-submodule.sh.
>
> For the purpose of porting cmd_status, the code is split up such that
> one function obtains all the list of submodules, acting as the
> front-end of git-submodule status. This function later calls the
> second function for_each_submodule_list,it which basically loops
> through the list of submodules and calls function fn, which in this
> case is status_submodule. The third function, status submodule returns
> the status of submodule and also takes care of the recursive flag.
>
> The first function module_status parses the options present in argv,
> and then with the help of module_list_compute, generates the list of
> submodules present in the current working tree.
>
> The second function for_each_submodule_list traverses through the list,
> and calls function fn (which in the case of submodule subcommand
> foreach is status_submodule) is called for each entry.
>
> The third function status_foreach checks for the various conditions,
> and prints the status of the submodule accordingly. Also, this
> function takes care of the recursive flag by creating a separate
> child_process and running it inside the submodule.
>
> Mentored-by: Christian Couder <christian.couder@gmail.com>
> Mentored-by: Stefan Beller <sbeller@google.com>
> Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
> ---
> A new function, get_submodule_displaypath is also introduced for getting
> the displaypath of the submodule while taking care of its prefix and
> superprefix.
>
>  builtin/submodule--helper.c | 162 ++++++++++++++++++++++++++++++++++++++++++++
>  git-submodule.sh            |  48 +------------
>  2 files changed, 163 insertions(+), 47 deletions(-)
>
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index 5f0ddd8ad..7c040a375 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -13,6 +13,8 @@
>  #include "refs.h"
>  #include "connect.h"
>
> +typedef void (*submodule_list_func_t)(const struct cache_entry *list_item, void *cb_data);
> +
>  static char *get_default_remote(void)
>  {
>         char *dest = NULL, *ret;
> @@ -219,6 +221,23 @@ static int resolve_relative_url_test(int argc, const char **argv, const char *pr
>         return 0;
>  }
>
> +static char *get_submodule_displaypath(const char *path, const char *prefix)
> +{
> +       const char *super_prefix = get_super_prefix();
> +
> +       if (prefix && super_prefix) {
> +               BUG("cannot have prefix '%s' and superprefix '%s'",
> +                   prefix, super_prefix);
> +       } else if (prefix) {
> +               struct strbuf sb = STRBUF_INIT;
> +               return xstrdup(relative_path(path, prefix, &sb));
> +       } else if (super_prefix) {
> +               return xstrfmt("%s/%s", super_prefix, path);
> +       } else {
> +               return xstrdup(path);
> +       }
> +}
> +
>  enum describe_step {
>         step_bare = 0,
>         step_tags,
> @@ -395,6 +414,13 @@ static int module_list(int argc, const char **argv, const char *prefix)
>         return 0;
>  }
>
> +static void for_each_submodule_list(const struct module_list list, submodule_list_func_t fn, void *cb_data)
> +{
> +       int i;
> +       for (i = 0; i < list.nr; i++)
> +               fn(list.entries[i], cb_data);
> +}

Up to here it looks like the patch in
https://public-inbox.org/git/20170521125814.26255-2-pc44800@gmail.com/
(without the nit of having an extra void return)

Maybe it is worth it to combine the two patch series, such that we'd need
to review the common parts only once?

> +
>  static void init_submodule(const char *path, const char *prefix, int quiet)
>  {
>         const struct submodule *sub;
> @@ -532,6 +558,141 @@ static int module_init(int argc, const char **argv, const char *prefix)
>         return 0;
>  }
>
> +struct cb_status {
> +       const char *prefix;
> +       unsigned int quiet: 1;
> +       unsigned int cached: 1;
> +       unsigned int recursive: 1;
> +};
> +#define CB_STATUS_INIT { NULL, 0, 0, 0 }



> +
> +               if (run_command(&cpr))
> +                       die(_("Failed to recurse into submodule path %s"), list_item->name);

I thought this is a badly worded error message, but it turns out it
is just as in the shell code, which is good for a direct translation.

Maybe we can adapt the error message in a later follow up to be more
aligned to other submodule error messages. (dropping "path" and putting
single quotes around %s, also un-capitalize the first letter)


> +static int module_status(int argc, const char **argv, const char *prefix)
> +{
> +       struct cb_status info = CB_STATUS_INIT;
> +       struct pathspec pathspec;
> +       struct module_list list = MODULE_LIST_INIT;
> +       int quiet = 0;
> +       int cached = 0;
> +       int recursive = 0;
> +
> +       struct option module_status_options[] = {
> +               OPT__QUIET(&quiet, N_("Suppress output for initializing a submodule")),
> +               OPT_BOOL(0, "cached", &cached, N_("Use commit stored in the index instead of the one stored in the submodule HEAD")),
> +               OPT_BOOL(0, "recursive", &recursive, N_("Recurse into nested submodules")),
> +               OPT_END(),
> +       };
> +
> +       const char *const git_submodule_helper_usage[] = {
> +               N_("git submodule status [--quiet] [--cached] [--recursive] [<path>]"),
> +               NULL
> +       };
> +
> +       argc = parse_options(argc, argv, prefix, module_status_options,
> +                            git_submodule_helper_usage, 0);
> +
> +       if (module_list_compute(argc, argv, prefix, &pathspec, &list) < 0)
> +               return 1;
> +
> +       info.prefix = prefix;
> +       info.quiet = !!quiet;
> +       info.cached = !!cached;
> +       info.recursive = !!recursive;
> +
> +       for_each_submodule_list(list, status_submodule, &info);
> +
> +       return 0;
> +}

This function looks good. Though my gut reaction was to suggest to
add another layer of abstraction. Then I checked wt-status.c, but that
makes use of "submodule summary" and not "submodule status". So all is good.


> +       git ${wt_prefix:+-C "$wt_prefix"} ${prefix:+--super-prefix "$prefix"} submodule--helper status ${GIT_QUIET:+--quiet} ${cached:+--cached} ${recursive:+--recursive} "$@"

I'd think we would not need to pass down superprefix here as we do not call
"submodule status" in a recursive way. The recursion works on
the submodule helper itself, so we could simplify it to just

    git ${wt_prefix:+-C "$wt_prefix"} submodule--helper status
${GIT_QUIET:+--quiet} ${cached:+--cached} ${recursive:+--recursive}
"$@"

Another idea that I just had:
Maybe we could drop --cached, --recursive as well,
as they are just command line options, which could
be just part of "$@".

For --quiet this is a bit more complicated as it may come in
via an environment variable (which we could also check for
in C in theory. I know I omitted that when writing some
submodule--helper code a couple months ago, but the reason
escaped me)

Thanks,
Stefan

^ permalink raw reply	[relevance 26%]

* Re: [PATCH v2 0/2] Update sha1dc from upstream & optionally make it a submodule
      [irrelevant]   ` <xmqqbmqko7c2.fsf@gitster.mtv.corp.google.com>
@ 2017-05-22 22:48     ` Stefan Beller
  2017-05-23  3:22       ` Junio C Hamano
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-05-22 22:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ævar Arnfjörð Bjarmason, git, Marc Stevens, Michael Kebe, Jeff King, Brandon Williams

On Mon, May 22, 2017 at 3:27 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>
>> I liked the suggestion to make the URL a relative path, but this would
>> require you to maintain a mirror in the same places you push git.git
>> to, is that something you'd be willing to do?
>
> After thinking about this a bit more, I know what I think we want a
> bit better.
>
> Relative URL (e.g. ../sha1collisiondetection that sits next to the
> copy of git.git) may be a good way to go.  I can arrange to create
> necessary repository next to git.git on k.org and github.com but I
> need to double check about other places

And here we see another deficit with a single URL:
We have to abide by the same scheme at all hosting endpoints.

For example consider the host https://kernel.googlesource.com/pub/scm/git/git
that mirrors from kernel.org. It would be able to bind the
submodule at  https://kernel.googlesource.com/pub/scm/git/git/sha1dc
i.e. it would look like a subdirectory of the main git repo.

This is not an issue for our desired usecase, as all hosts can comply
with the scheme that you outlined (url=../sha1...), but worth noting that
in the long term we may want to have the ability to "configure" each
remote individually by having out-of-history config options. I think we
would want to solve that via a "refs/meta/gitmodules" branch that can be
adapted per remote. (original idea from jrnieder@)

> Whether the submodule is referenced by a relative URL from the main
> project, the submodule should not come directly from the upstream,
> and various mirrors that sit next to git.git should not be blind and
> automated "mirrors".

That sounds reasonable for our sanity.

> This is because I do not want us to trust the
> security measures of https://github.com/cr-marcstevens/ repository.
> The consumers already need to trust k.org/pub/scm/git/git.git and by
> ensuring k.org/pub/scm/git/sha1dc is managed the same way, they do
> not have to trust anything extra.

The trust would be transitive, as the said submodule is referenced via
sha1, so all malicious actions upstream could perform are:
* denial of service: (by remove a commit that we pointed at in our history)
* denial of service 2: add a huge blob to their repo, such that anyone
  obtaining the submodule not carefully is annoyed by a super large repo.
* add additional malicious data (such as illegal numbers and algorithms)
  to a branch, which would be obtained by users cloning the submodule
  carelessly.

> Another reason is that we want to make sure all commits in the
> submodule that we bind to the superproject (i.e. git.git) are always
> in the submodule, regardless of what our upstream does, and one way
> to do so is to have control over _our_ canonical repository for the
> submodule.

By having all repos under one entity of trust, we would not need to discuss
all kinds of possible attacks as above.

>  In normal times, it will faithfully follow the upstream
> without doing anything else, but we'd keep the option of anchoring a
> submodule commit that is referenced by the superproject history with
> our own tag, if it is ever rewound away in the upstream history for
> whatever reason.

That makes sense.

Thanks,
Stefan

^ permalink raw reply	[relevance 23%]

* [PATCHv4 00/17] Diff machine: highlight moved lines.
      [irrelevant] <20170518193746.486-1-sbeller@google.com>
@ 2017-05-23  2:40 ` Stefan Beller
  2017-05-23  2:40   ` [PATCHv4 09/17] submodule.c: convert show_submodule_summary to use emit_line_fmt Stefan Beller
                     ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Stefan Beller @ 2017-05-23  2:40 UTC (permalink / raw)
  To: gitster; +Cc: git, bmwill, jrnieder, jonathantanmy, peff, mhagger, Stefan Beller

v4:
* interdiff to v3 (what is currently origin/sb/diff-color-move) below.
* renamed the "buffered_patch_line" to "diff_line". Originally I planned
  to not carry the "line" part as it can be a piece of a line as well.
  But for the intended functionality it is best to keep the name.
  If we'd want to add more functionality to say have a move detection
  for words as well, we'd rename the struct to have a better name then.
  For now diff_line is the best. (Thanks Jonathan Nieder!)
* tests to demonstrate it doesn't mess with --color-words as well as
  submodules. (Thanks Jonathan Tan!)
* added in the statics (Thanks Ramsay!)
* smaller scope for the hashmaps (Thanks Jonathan Tan!)
* some commit messages were updated, prior patch 4-7 is squashed into one
  (Thanks Jonathan Tan!)
* the tests added revealed an actual fault: now that the submodule process
  is not attached to a dupe of our stdout, it would stop coloring the
  output. We need to pass on use-color explicitly.
* updated the NEEDSWORK comment in the second last patch.

Thanks for bearing,
Stefan

v3:
* see interdiff below.
* fixing one invalid computation (Thanks Junio!)
* I reasoned more about submodule and word diffing, see the commit message
  of the last patch:
  
    A note on the options '--submodule=diff' and '--color-words/--word-diff':
    In the conversion to use emit_line in the prior patches both submodules
    as well as word diff output carefully chose to call emit_line with sign=0.
    All output with sign=0 is ignored for move detection purposes in this
    patch, such that no weird looking output will be generated for these
    cases. This leads to another thought: We could pass on '--color-moved' to
    submodules such that they color up moved lines for themselves. If we'd do
    so only line moves within a repository boundary are marked up.

* better name for emit_line outside of diff.[ch]

v2:
* emit_line now takes an argument that indicates if we want it
  to emit the line prefix as well. This should allow for a more faithful
  refactoring in the beginning. (Thanks Jonathan!)
* fixed memleaks (Thanks Brandon!)
* "git -c color.moved=true log -p" works now! (Thanks Jeff)
* interdiff below, though it is large.
* less intrusive than v1 (Thanks Jonathan!)

v1:

For details on *why* see the commit message of the last commit.

The first five patches are slight refactorings to get into good
shape, the next patches are funneling all output through emit_line_*.

The second last patch introduces an option to buffer up all output
before printing, and then the last patch can color up moved lines
of code.

Any feedback welcome.

Thanks,
Stefan

Stefan Beller (17):
  diff: readability fix
  diff: move line ending check into emit_hunk_header
  diff.c: factor out diff_flush_patch_all_file_pairs
  diff: introduce more flexible emit function
  diff.c: convert fn_out_consume to use emit_line
  diff.c: convert builtin_diff to use emit_line_*
  diff.c: convert emit_rewrite_diff to use emit_line_*
  diff.c: convert emit_rewrite_lines to use emit_line_*
  submodule.c: convert show_submodule_summary to use emit_line_fmt
  diff.c: convert emit_binary_diff_body to use emit_line_*
  diff.c: convert show_stats to use emit_line_*
  diff.c: convert word diffing to use emit_line_*
  diff.c: convert diff_flush to use emit_line_*
  diff.c: convert diff_summary to use emit_line_*
  diff.c: emit_line includes whitespace highlighting
  diff: buffer all output if asked to
  diff.c: color moved lines differently

 Documentation/config.txt   |  14 +-
 diff.c                     | 858 +++++++++++++++++++++++++++++++++------------
 diff.h                     |  59 +++-
 submodule.c                |  89 ++---
 submodule.h                |   9 +-
 t/t4015-diff-whitespace.sh | 267 ++++++++++++++
 6 files changed, 1018 insertions(+), 278 deletions(-)

diff --git a/diff.c b/diff.c
index b977a5d95b..23e70d348e 100644
--- a/diff.c
+++ b/diff.c
@@ -373,11 +373,11 @@ int git_diff_basic_config(const char *var, const char *value, void *cb)
 
 struct moved_entry {
 	struct hashmap_entry ent;
-	const struct buffered_patch_line *line;
+	const struct diff_line *line;
 	struct moved_entry *next_line;
 };
 
-static void get_ws_cleaned_string(const struct buffered_patch_line *l,
+static void get_ws_cleaned_string(const struct diff_line *l,
 				  struct strbuf *out)
 {
 	int i;
@@ -388,8 +388,8 @@ static void get_ws_cleaned_string(const struct buffered_patch_line *l,
 	}
 }
 
-static int buffered_patch_line_cmp_no_ws(const struct buffered_patch_line *a,
-					 const struct buffered_patch_line *b,
+static int diff_line_cmp_no_ws(const struct diff_line *a,
+					 const struct diff_line *b,
 					 const void *keydata)
 {
 	int ret;
@@ -405,8 +405,8 @@ static int buffered_patch_line_cmp_no_ws(const struct buffered_patch_line *a,
 	return ret;
 }
 
-static int buffered_patch_line_cmp(const struct buffered_patch_line *a,
-				   const struct buffered_patch_line *b,
+static int diff_line_cmp(const struct diff_line *a,
+				   const struct diff_line *b,
 				   const void *keydata)
 {
 	return a->len != b->len || strncmp(a->line, b->line, a->len);
@@ -416,17 +416,17 @@ static int moved_entry_cmp(const struct moved_entry *a,
 			   const struct moved_entry *b,
 			   const void *keydata)
 {
-	return buffered_patch_line_cmp(a->line, b->line, keydata);
+	return diff_line_cmp(a->line, b->line, keydata);
 }
 
 static int moved_entry_cmp_no_ws(const struct moved_entry *a,
 				 const struct moved_entry *b,
 				 const void *keydata)
 {
-	return buffered_patch_line_cmp_no_ws(a->line, b->line, keydata);
+	return diff_line_cmp_no_ws(a->line, b->line, keydata);
 }
 
-static unsigned get_line_hash(struct buffered_patch_line *line, unsigned ignore_ws)
+static unsigned get_line_hash(struct diff_line *line, unsigned ignore_ws)
 {
 	static struct strbuf sb = STRBUF_INIT;
 
@@ -444,7 +444,7 @@ static struct moved_entry *prepare_entry(struct diff_options *o,
 {
 	struct moved_entry *ret = xmalloc(sizeof(*ret));
 	unsigned ignore_ws = DIFF_XDL_TST(o, IGNORE_WHITESPACE);
-	struct buffered_patch_line *l = &o->line_buffer[line_no];
+	struct diff_line *l = &o->line_buffer[line_no];
 
 	ret->ent.hash = get_line_hash(l, ignore_ws);
 	ret->line = l;
@@ -615,7 +615,9 @@ static void check_blank_at_eof(mmfile_t *mf1, mmfile_t *mf2,
 	ecbdata->blank_at_eof_in_postimage = (at - l2) + 1;
 }
 
-static void add_lines_to_move_detection(struct diff_options *o)
+static void add_lines_to_move_detection(struct diff_options *o,
+					struct hashmap *add_lines,
+					struct hashmap *del_lines)
 {
 	struct moved_entry *prev_line = NULL;
 
@@ -628,11 +630,11 @@ static void add_lines_to_move_detection(struct diff_options *o)
 		switch (o->line_buffer[n].sign) {
 		case '+':
 			sign = '+';
-			hm = o->added_lines;
+			hm = add_lines;
 			break;
 		case '-':
 			sign = '-';
-			hm = o->deleted_lines;
+			hm = del_lines;
 			break;
 		case ' ':
 		default:
@@ -650,29 +652,31 @@ static void add_lines_to_move_detection(struct diff_options *o)
 	}
 }
 
-static void mark_color_as_moved(struct diff_options *o)
+static void mark_color_as_moved(struct diff_options *o,
+				struct hashmap *add_lines,
+				struct hashmap *del_lines)
 {
 	struct moved_entry **pmb = NULL; /* potentially moved blocks */
 	int pmb_nr = 0, pmb_alloc = 0;
-	int alt_flag = 0;
+	int use_alt_color = 0;
 	int n;
 
 	for (n = 0; n < o->line_buffer_nr; n++) {
 		struct hashmap *hm = NULL;
 		struct moved_entry *key;
 		struct moved_entry *match = NULL;
-		struct buffered_patch_line *l = &o->line_buffer[n];
+		struct diff_line *l = &o->line_buffer[n];
 		int i, lp, rp;
 
 		switch (l->sign) {
 		case '+':
-			hm = o->deleted_lines;
+			hm = del_lines;
 			break;
 		case '-':
-			hm = o->added_lines;
+			hm = add_lines;
 			break;
 		default:
-			alt_flag = 0; /* reset to standard, no-alt move color */
+			use_alt_color = 0;
 			pmb_nr = 0; /* no running sets */
 			continue;
 		}
@@ -690,7 +694,7 @@ static void mark_color_as_moved(struct diff_options *o)
 			struct moved_entry *pnext = (p && p->next_line) ?
 					p->next_line : NULL;
 			if (pnext &&
-			    !buffered_patch_line_cmp(pnext->line, l, o)) {
+			    !diff_line_cmp(pnext->line, l, o)) {
 				pmb[i] = p->next_line;
 			} else {
 				pmb[i] = NULL;
@@ -720,7 +724,7 @@ static void mark_color_as_moved(struct diff_options *o)
 			pmb_nr = rp + 1;
 		} else {
 			/* Toggle color */
-			alt_flag = (alt_flag + 1) % 2;
+			use_alt_color = (use_alt_color + 1) % 2;
 
 			/* Build up a new set */
 			pmb_nr = 0;
@@ -732,10 +736,12 @@ static void mark_color_as_moved(struct diff_options *o)
 
 		switch (l->sign) {
 		case '+':
-			l->set = diff_get_color_opt(o, DIFF_FILE_NEW_MOVED + alt_flag);
+			l->set = diff_get_color_opt(o,
+				DIFF_FILE_NEW_MOVED + use_alt_color);
 			break;
 		case '-':
-			l->set = diff_get_color_opt(o, DIFF_FILE_OLD_MOVED + alt_flag);
+			l->set = diff_get_color_opt(o,
+				DIFF_FILE_OLD_MOVED + use_alt_color);
 			break;
 		default:
 			die("BUG: we should have continued earlier?");
@@ -744,8 +750,8 @@ static void mark_color_as_moved(struct diff_options *o)
 	free(pmb);
 }
 
-static void emit_buffered_patch_line(struct diff_options *o,
-				     struct buffered_patch_line *e)
+static void emit_diff_line(struct diff_options *o,
+				     struct diff_line *e)
 {
 	const char *ws;
 	int has_trailing_newline, has_trailing_carriage_return;
@@ -756,7 +762,7 @@ static void emit_buffered_patch_line(struct diff_options *o,
 		fputs(diff_line_prefix(o), file);
 
 	switch (e->state) {
-	case BPL_EMIT_LINE_WS:
+	case DIFF_LINE_WS:
 		ws = diff_get_color(o->use_color, DIFF_WHITESPACE);
 		if (e->set)
 			fputs(e->set, file);
@@ -767,7 +773,7 @@ static void emit_buffered_patch_line(struct diff_options *o,
 		ws_check_emit(e->line, e->len, o->ws_rule,
 			      file, e->set, e->reset, ws);
 		return;
-	case BPL_EMIT_LINE_ASIS:
+	case DIFF_LINE_ASIS:
 		has_trailing_newline = (len > 0 && e->line[len-1] == '\n');
 		if (has_trailing_newline)
 			len--;
@@ -789,46 +795,46 @@ static void emit_buffered_patch_line(struct diff_options *o,
 		if (has_trailing_newline)
 			fputc('\n', file);
 		return;
-	case BPL_HANDOVER:
-		o->ws_rule = whitespace_rule(e->line); /*read from file, stored in line?*/
+	case DIFF_LINE_RELOAD_WS_RULE:
+		o->ws_rule = whitespace_rule(e->line);
 		return;
 	default:
 		die("BUG: malformatted buffered patch line: '%d'", e->state);
 	}
 }
 
-static void append_buffered_patch_line(struct diff_options *o,
-				       struct buffered_patch_line *e)
+static void append_diff_line(struct diff_options *o,
+				       struct diff_line *e)
 {
-	struct buffered_patch_line *f;
+	struct diff_line *f;
 	ALLOC_GROW(o->line_buffer,
 		   o->line_buffer_nr + 1,
 		   o->line_buffer_alloc);
 	f = &o->line_buffer[o->line_buffer_nr++];
 
-	memcpy(f, e, sizeof(struct buffered_patch_line));
+	memcpy(f, e, sizeof(struct diff_line));
 	f->line = e->line ? xmemdupz(e->line, e->len) : NULL;
 }
 
-void emit_line(struct diff_options *o,
-	       const char *set, const char *reset,
-	       int add_line_prefix, int markup_ws,
-	       int sign, const char *line, int len)
+static void emit_line(struct diff_options *o,
+		      const char *set, const char *reset,
+		      int add_line_prefix, int markup_ws,
+		      int sign, const char *line, int len)
 {
-	struct buffered_patch_line e = {set, reset, line,
+	struct diff_line e = {set, reset, line,
 		len, sign, add_line_prefix,
-		markup_ws ? BPL_EMIT_LINE_WS : BPL_EMIT_LINE_ASIS};
+		markup_ws ? DIFF_LINE_WS : DIFF_LINE_ASIS};
 
 	if (o->use_buffer)
-		append_buffered_patch_line(o, &e);
+		append_diff_line(o, &e);
 	else
-		emit_buffered_patch_line(o, &e);
+		emit_diff_line(o, &e);
 }
 
-void emit_line_fmt(struct diff_options *o,
-		   const char *set, const char *reset,
-		   int add_line_prefix,
-		   const char *fmt, ...)
+static void emit_line_fmt(struct diff_options *o,
+			  const char *set, const char *reset,
+			  int add_line_prefix,
+			  const char *fmt, ...)
 {
 	struct strbuf sb = STRBUF_INIT;
 	va_list ap;
@@ -1435,7 +1441,7 @@ static void diff_words_flush(struct emit_callback *ecbdata)
 	if (ecbdata->diff_words->opt->line_buffer_nr) {
 		int i;
 		for (i = 0; i < ecbdata->diff_words->opt->line_buffer_nr; i++)
-			append_buffered_patch_line(ecbdata->opt,
+			append_diff_line(ecbdata->opt,
 				&ecbdata->diff_words->opt->line_buffer[i]);
 
 		for (i = 0; i < ecbdata->diff_words->opt->line_buffer_nr; i++)
@@ -1862,8 +1868,8 @@ static void fill_print_name(struct diffstat_file *file)
 	file->print_name = pname;
 }
 
-void print_stat_summary_0(struct diff_options *options, int files,
-			  int insertions, int deletions)
+static void print_stat_summary_0(struct diff_options *options, int files,
+				 int insertions, int deletions)
 {
 	struct strbuf sb = STRBUF_INIT;
 
@@ -2857,11 +2863,11 @@ static void builtin_diff(const char *name_a,
 		if (o->word_diff)
 			init_diff_words_data(&ecbdata, o, one, two);
 		if (o->use_buffer) {
-			struct buffered_patch_line e = BUFFERED_PATCH_LINE_INIT;
-			e.state = BPL_HANDOVER;
+			struct diff_line e = diff_line_INIT;
+			e.state = DIFF_LINE_RELOAD_WS_RULE;
 			e.line = name_b;
 			e.len = strlen(name_b);
-			append_buffered_patch_line(o, &e);
+			append_diff_line(o, &e);
 		}
 		if (xdi_diff_outf(&mf1, &mf2, fn_out_consume, &ecbdata,
 				  &xpp, &xecfg))
@@ -5094,18 +5100,8 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o)
 	int i;
 	struct diff_queue_struct *q = &diff_queued_diff;
 
-	if (o->color_moved) {
-		unsigned ignore_ws = DIFF_XDL_TST(o, IGNORE_WHITESPACE);
+	if (o->color_moved)
 		o->use_buffer = 1;
-		o->deleted_lines = xmallocz(sizeof(*o->deleted_lines));
-		o->added_lines = xmallocz(sizeof(*o->added_lines));
-		hashmap_init(o->deleted_lines, ignore_ws ?
-			(hashmap_cmp_fn)moved_entry_cmp_no_ws :
-			(hashmap_cmp_fn)moved_entry_cmp, 0);
-		hashmap_init(o->added_lines, ignore_ws ?
-			(hashmap_cmp_fn)moved_entry_cmp_no_ws :
-			(hashmap_cmp_fn)moved_entry_cmp, 0);
-	}
 
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
@@ -5115,12 +5111,25 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o)
 
 	if (o->use_buffer) {
 		if (o->color_moved) {
-			add_lines_to_move_detection(o);
-			mark_color_as_moved(o);
+			struct hashmap add_lines, del_lines;
+			unsigned ignore_ws = DIFF_XDL_TST(o, IGNORE_WHITESPACE);
+
+			hashmap_init(&del_lines, ignore_ws ?
+				(hashmap_cmp_fn)moved_entry_cmp_no_ws :
+				(hashmap_cmp_fn)moved_entry_cmp, 0);
+			hashmap_init(&add_lines, ignore_ws ?
+				(hashmap_cmp_fn)moved_entry_cmp_no_ws :
+				(hashmap_cmp_fn)moved_entry_cmp, 0);
+
+			add_lines_to_move_detection(o, &add_lines, &del_lines);
+			mark_color_as_moved(o, &add_lines, &del_lines);
+
+			hashmap_free(&add_lines, 0);
+			hashmap_free(&del_lines, 0);
 		}
 
 		for (i = 0; i < o->line_buffer_nr; i++)
-			emit_buffered_patch_line(o, &o->line_buffer[i]);
+			emit_diff_line(o, &o->line_buffer[i]);
 
 		for (i = 0; i < o->line_buffer_nr; i++)
 			free((void *)o->line_buffer[i].line);
diff --git a/diff.h b/diff.h
index 2d86e3a012..445259ebf7 100644
--- a/diff.h
+++ b/diff.h
@@ -123,11 +123,12 @@ enum diff_submodule_format {
  * into the pre/post image file. This pointer could be a union with the
  * line pointer. By storing an offset into the file instead of the literal line,
  * we can decrease the memory footprint for the buffered output. At first we
- * may want to only have indirection for the content lines, but we could
- * also have an enum (based on sign?) that stores prefabricated lines, e.g.
- * the similarity score line or hunk/file headers.
+ * may want to only have indirection for the content lines, but we could also
+ * enhance the state for emitting prefabricated lines, e.g. the similarity
+ * score line or hunk/file headers would only need to store a number or path
+ * and then the output can be constructed later on depending on state.
  */
-struct buffered_patch_line {
+struct diff_line {
 	const char *set;
 	const char *reset;
 	const char *line;
@@ -140,16 +141,16 @@ struct buffered_patch_line {
 		 * ws_check_emit which will output "line", marked up
 		 * according to ws_rule.
 		 */
-		BPL_EMIT_LINE_WS,
+		DIFF_LINE_WS,
 
 		/* Emits [lineprefix][set][sign] line [reset] */
-		BPL_EMIT_LINE_ASIS,
+		DIFF_LINE_ASIS,
 
 		/* Reloads the ws_rule; line contains the file name */
-		BPL_HANDOVER
+		DIFF_LINE_RELOAD_WS_RULE
 	} state;
 };
-#define BUFFERED_PATCH_LINE_INIT {NULL, NULL, NULL, 0, 0, 0}
+#define diff_line_INIT {NULL, NULL, NULL, 0, 0, 0}
 
 struct diff_options {
 	const char *orderfile;
@@ -226,14 +227,13 @@ struct diff_options {
 	unsigned ws_rule;
 	int use_buffer;
 
-	struct buffered_patch_line *line_buffer;
+	struct diff_line *line_buffer;
 	int line_buffer_nr, line_buffer_alloc;
 
 	int color_moved;
-	struct hashmap *deleted_lines;
-	struct hashmap *added_lines;
 };
 
+/* Emit [line_prefix] [set] line [reset] */
 void diff_emit_line(struct diff_options *o, const char *set, const char *reset,
 		    const char *line, int len);
 
diff --git a/submodule.c b/submodule.c
index 19c63197fb..428c996c97 100644
--- a/submodule.c
+++ b/submodule.c
@@ -550,6 +550,8 @@ void show_submodule_inline_diff(struct diff_options *o, const char *path,
 
 	/* TODO: other options may need to be passed here. */
 	argv_array_push(&cp.args, "diff");
+	if (o->use_color)
+		argv_array_push(&cp.args, "--color=always");
 	argv_array_pushf(&cp.args, "--line-prefix=%s", diff_line_prefix(o));
 	if (DIFF_OPT_TST(o, REVERSE_DIFF)) {
 		argv_array_pushf(&cp.args, "--src-prefix=%s%s/",
diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index 232d9ad55e..0e92bf94bf 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -1124,7 +1124,7 @@ test_expect_success 'detect moved code, inside file' '
 	test_cmp expected actual
 '
 
-test_expect_success 'detect permutations inside moved code, ' '
+test_expect_success 'detect permutations inside moved code' '
 	# reusing the move example from last test:
 	cat <<-\EOF >main.c &&
 		#include<stdio.h>
@@ -1201,4 +1201,42 @@ test_expect_success 'detect permutations inside moved code, ' '
 	test_cmp expected actual
 '
 
+test_expect_success 'move detection does not mess up colored words' '
+	cat <<-\EOF >text.txt &&
+	Lorem Ipsum is simply dummy text of the printing and typesetting industry.
+	EOF
+	git add text.txt &&
+	git commit -a -m "clean state" &&
+	cat <<-\EOF >text.txt &&
+	simply Lorem Ipsum dummy is text of the typesetting and printing industry.
+	EOF
+	git diff --color-moved --word-diff >actual &&
+	git diff --word-diff >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'move detection with submodules' '
+	test_create_repo bananas &&
+	echo ripe >bananas/recipe &&
+	git -C bananas add recipe &&
+	test_commit fruit &&
+	test_commit -C bananas recipe &&
+	git submodule add ./bananas &&
+	git add bananas &&
+	git commit -a -m "bananas are like a heavy library?" &&
+	echo foul >bananas/recipe &&
+	echo ripe >fruit.t &&
+
+	git diff --submodule=diff --color-moved >actual &&
+
+	# no move detection as the moved line is across repository boundaries.
+	test_decode_color <actual >decoded_actual &&
+	! grep BGREEN decoded_actual &&
+	! grep BRED decoded_actual &&
+
+	# nor did we mess with it another way
+	git diff --submodule=diff | test_decode_color >expect &&
+	test_cmp expect decoded_actual
+'
+
 test_done



^ permalink raw reply	[relevance 13%]

* [PATCHv4 09/17] submodule.c: convert show_submodule_summary to use emit_line_fmt
  2017-05-23  2:40 ` [PATCHv4 00/17] Diff machine: highlight moved lines. Stefan Beller
@ 2017-05-23  2:40   ` Stefan Beller
  2017-05-23  5:59     ` Junio C Hamano
  2017-05-23  2:40   ` [PATCHv4 17/17] diff.c: color moved lines differently Stefan Beller
  2017-05-27  1:04   ` Jacob Keller
  2 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-05-23  2:40 UTC (permalink / raw)
  To: gitster; +Cc: git, bmwill, jrnieder, jonathantanmy, peff, mhagger, Stefan Beller

In a later patch, I want to propose an option to detect&color
moved lines in a diff, which cannot be done in a one-pass over
the diff. Instead we need to go over the whole diff twice,
because we cannot detect the first line of the two corresponding
lines (+ and -) that got moved.

So to prepare the diff machinery for two pass algorithms
(i.e. buffer it all up and then operate on the result),
move all emissions to places, such that the only emitting
function is emit_line_0.

This prepares the code for submodules to go through the
emit_line function.

As the submodule process is no longer attached to the
same stdout as the superprojects process we need to
pass on the usage of colors explicitly.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 diff.c      | 14 ++++++----
 diff.h      |  3 +++
 submodule.c | 89 +++++++++++++++++++++++++++++++++----------------------------
 submodule.h |  9 +++----
 4 files changed, 63 insertions(+), 52 deletions(-)

diff --git a/diff.c b/diff.c
index ca6b48cf49..3357c0fca3 100644
--- a/diff.c
+++ b/diff.c
@@ -562,6 +562,12 @@ static void emit_line_fmt(struct diff_options *o,
 	strbuf_release(&sb);
 }
 
+void diff_emit_line(struct diff_options *o, const char *set, const char *reset,
+		    const char *line, int len)
+{
+	emit_line(o, set, reset, 1, 0, line, len);
+}
+
 static int new_blank_line_at_eof(struct emit_callback *ecbdata, const char *line, int len)
 {
 	if (!((ecbdata->ws_rule & WS_BLANK_AT_EOF) &&
@@ -2384,8 +2390,7 @@ static void builtin_diff(const char *name_a,
 	    (!two->mode || S_ISGITLINK(two->mode))) {
 		const char *del = diff_get_color_opt(o, DIFF_FILE_OLD);
 		const char *add = diff_get_color_opt(o, DIFF_FILE_NEW);
-		show_submodule_summary(o->file, one->path ? one->path : two->path,
-				line_prefix,
+		show_submodule_summary(o, one->path ? one->path : two->path,
 				&one->oid, &two->oid,
 				two->dirty_submodule,
 				meta, del, add, reset);
@@ -2395,11 +2400,10 @@ static void builtin_diff(const char *name_a,
 		   (!two->mode || S_ISGITLINK(two->mode))) {
 		const char *del = diff_get_color_opt(o, DIFF_FILE_OLD);
 		const char *add = diff_get_color_opt(o, DIFF_FILE_NEW);
-		show_submodule_inline_diff(o->file, one->path ? one->path : two->path,
-				line_prefix,
+		show_submodule_inline_diff(o, one->path ? one->path : two->path,
 				&one->oid, &two->oid,
 				two->dirty_submodule,
-				meta, del, add, reset, o);
+				meta, del, add, reset);
 		return;
 	}
 
diff --git a/diff.h b/diff.h
index 5be1ee77a7..9ad546361a 100644
--- a/diff.h
+++ b/diff.h
@@ -188,6 +188,9 @@ struct diff_options {
 	int diff_path_counter;
 };
 
+void diff_emit_line(struct diff_options *o, const char *set, const char *reset,
+		    const char *line, int len);
+
 enum color_diff {
 	DIFF_RESET = 0,
 	DIFF_CONTEXT = 1,
diff --git a/submodule.c b/submodule.c
index d3299e29c0..428c996c97 100644
--- a/submodule.c
+++ b/submodule.c
@@ -362,8 +362,8 @@ static int prepare_submodule_summary(struct rev_info *rev, const char *path,
 	return prepare_revision_walk(rev);
 }
 
-static void print_submodule_summary(struct rev_info *rev, FILE *f,
-		const char *line_prefix,
+static void print_submodule_summary(struct rev_info *rev,
+		struct diff_options *o,
 		const char *del, const char *add, const char *reset)
 {
 	static const char format[] = "  %m %s";
@@ -375,18 +375,12 @@ static void print_submodule_summary(struct rev_info *rev, FILE *f,
 		ctx.date_mode = rev->date_mode;
 		ctx.output_encoding = get_log_output_encoding();
 		strbuf_setlen(&sb, 0);
-		strbuf_addstr(&sb, line_prefix);
-		if (commit->object.flags & SYMMETRIC_LEFT) {
-			if (del)
-				strbuf_addstr(&sb, del);
-		}
-		else if (add)
-			strbuf_addstr(&sb, add);
 		format_commit_message(commit, format, &sb, &ctx);
-		if (reset)
-			strbuf_addstr(&sb, reset);
 		strbuf_addch(&sb, '\n');
-		fprintf(f, "%s", sb.buf);
+		if (commit->object.flags & SYMMETRIC_LEFT)
+			diff_emit_line(o, del, reset, sb.buf, sb.len);
+		else if (add)
+			diff_emit_line(o, add, reset, sb.buf, sb.len);
 	}
 	strbuf_release(&sb);
 }
@@ -413,8 +407,7 @@ void prepare_submodule_repo_env(struct argv_array *out)
  * attempt to lookup both the left and right commits and put them into the
  * left and right pointers.
  */
-static void show_submodule_header(FILE *f, const char *path,
-		const char *line_prefix,
+static void show_submodule_header(struct diff_options *o, const char *path,
 		struct object_id *one, struct object_id *two,
 		unsigned dirty_submodule, const char *meta,
 		const char *reset,
@@ -425,12 +418,17 @@ static void show_submodule_header(FILE *f, const char *path,
 	struct strbuf sb = STRBUF_INIT;
 	int fast_forward = 0, fast_backward = 0;
 
-	if (dirty_submodule & DIRTY_SUBMODULE_UNTRACKED)
-		fprintf(f, "%sSubmodule %s contains untracked content\n",
-			line_prefix, path);
-	if (dirty_submodule & DIRTY_SUBMODULE_MODIFIED)
-		fprintf(f, "%sSubmodule %s contains modified content\n",
-			line_prefix, path);
+	if (dirty_submodule & DIRTY_SUBMODULE_UNTRACKED) {
+		strbuf_addf(&sb, "Submodule %s contains untracked content\n", path);
+		diff_emit_line(o, NULL, NULL, sb.buf, sb.len);
+		strbuf_reset(&sb);
+	}
+
+	if (dirty_submodule & DIRTY_SUBMODULE_MODIFIED) {
+		strbuf_addf(&sb, "Submodule %s contains modified content\n", path);
+		diff_emit_line(o, NULL, NULL, sb.buf, sb.len);
+		strbuf_reset(&sb);
+	}
 
 	if (is_null_oid(one))
 		message = "(new submodule)";
@@ -472,21 +470,20 @@ static void show_submodule_header(FILE *f, const char *path,
 	}
 
 output_header:
-	strbuf_addf(&sb, "%s%sSubmodule %s ", line_prefix, meta, path);
+	strbuf_addf(&sb, "Submodule %s ", path);
 	strbuf_add_unique_abbrev(&sb, one->hash, DEFAULT_ABBREV);
 	strbuf_addstr(&sb, (fast_backward || fast_forward) ? ".." : "...");
 	strbuf_add_unique_abbrev(&sb, two->hash, DEFAULT_ABBREV);
 	if (message)
-		strbuf_addf(&sb, " %s%s\n", message, reset);
+		strbuf_addf(&sb, " %s\n", message);
 	else
-		strbuf_addf(&sb, "%s:%s\n", fast_backward ? " (rewind)" : "", reset);
-	fwrite(sb.buf, sb.len, 1, f);
+		strbuf_addf(&sb, "%s:\n", fast_backward ? " (rewind)" : "");
+	diff_emit_line(o, meta, reset, sb.buf, sb.len);
 
 	strbuf_release(&sb);
 }
 
-void show_submodule_summary(FILE *f, const char *path,
-		const char *line_prefix,
+void show_submodule_summary(struct diff_options *o, const char *path,
 		struct object_id *one, struct object_id *two,
 		unsigned dirty_submodule, const char *meta,
 		const char *del, const char *add, const char *reset)
@@ -495,7 +492,7 @@ void show_submodule_summary(FILE *f, const char *path,
 	struct commit *left = NULL, *right = NULL;
 	struct commit_list *merge_bases = NULL;
 
-	show_submodule_header(f, path, line_prefix, one, two, dirty_submodule,
+	show_submodule_header(o, path, one, two, dirty_submodule,
 			      meta, reset, &left, &right, &merge_bases);
 
 	/*
@@ -508,11 +505,12 @@ void show_submodule_summary(FILE *f, const char *path,
 
 	/* Treat revision walker failure the same as missing commits */
 	if (prepare_submodule_summary(&rev, path, left, right, merge_bases)) {
-		fprintf(f, "%s(revision walker failed)\n", line_prefix);
+		const char *error = "(revision walker failed)\n";
+		diff_emit_line(o, NULL, NULL, error, strlen(error));
 		goto out;
 	}
 
-	print_submodule_summary(&rev, f, line_prefix, del, add, reset);
+	print_submodule_summary(&rev, o, del, add, reset);
 
 out:
 	if (merge_bases)
@@ -521,20 +519,18 @@ void show_submodule_summary(FILE *f, const char *path,
 	clear_commit_marks(right, ~0);
 }
 
-void show_submodule_inline_diff(FILE *f, const char *path,
-		const char *line_prefix,
+void show_submodule_inline_diff(struct diff_options *o, const char *path,
 		struct object_id *one, struct object_id *two,
 		unsigned dirty_submodule, const char *meta,
-		const char *del, const char *add, const char *reset,
-		const struct diff_options *o)
+		const char *del, const char *add, const char *reset)
 {
 	const struct object_id *old = &empty_tree_oid, *new = &empty_tree_oid;
 	struct commit *left = NULL, *right = NULL;
 	struct commit_list *merge_bases = NULL;
-	struct strbuf submodule_dir = STRBUF_INIT;
 	struct child_process cp = CHILD_PROCESS_INIT;
+	struct strbuf sb = STRBUF_INIT;
 
-	show_submodule_header(f, path, line_prefix, one, two, dirty_submodule,
+	show_submodule_header(o, path, one, two, dirty_submodule,
 			      meta, reset, &left, &right, &merge_bases);
 
 	/* We need a valid left and right commit to display a difference */
@@ -547,15 +543,16 @@ void show_submodule_inline_diff(FILE *f, const char *path,
 	if (right)
 		new = two;
 
-	fflush(f);
 	cp.git_cmd = 1;
 	cp.dir = path;
-	cp.out = dup(fileno(f));
+	cp.out = -1;
 	cp.no_stdin = 1;
 
 	/* TODO: other options may need to be passed here. */
 	argv_array_push(&cp.args, "diff");
-	argv_array_pushf(&cp.args, "--line-prefix=%s", line_prefix);
+	if (o->use_color)
+		argv_array_push(&cp.args, "--color=always");
+	argv_array_pushf(&cp.args, "--line-prefix=%s", diff_line_prefix(o));
 	if (DIFF_OPT_TST(o, REVERSE_DIFF)) {
 		argv_array_pushf(&cp.args, "--src-prefix=%s%s/",
 				 o->b_prefix, path);
@@ -578,11 +575,21 @@ void show_submodule_inline_diff(FILE *f, const char *path,
 		argv_array_push(&cp.args, oid_to_hex(new));
 
 	prepare_submodule_repo_env(&cp.env_array);
-	if (run_command(&cp))
-		fprintf(f, "(diff failed)\n");
+	if (start_command(&cp)) {
+		const char *error = "(diff failed)\n";
+		diff_emit_line(o, NULL, NULL, error, strlen(error));
+	}
+
+	while (strbuf_getwholeline_fd(&sb, cp.out, '\n') != EOF)
+		diff_emit_line(o, NULL, NULL, sb.buf, sb.len);
+
+	if (finish_command(&cp)) {
+		const char *error = "(diff failed)\n";
+		diff_emit_line(o, NULL, NULL, error, strlen(error));
+	}
 
 done:
-	strbuf_release(&submodule_dir);
+	strbuf_release(&sb);
 	if (merge_bases)
 		free_commit_list(merge_bases);
 	if (left)
diff --git a/submodule.h b/submodule.h
index 1277480add..9df0a3aea2 100644
--- a/submodule.h
+++ b/submodule.h
@@ -53,17 +53,14 @@ extern int parse_submodule_update_strategy(const char *value,
 		struct submodule_update_strategy *dst);
 extern const char *submodule_strategy_to_string(const struct submodule_update_strategy *s);
 extern void handle_ignore_submodules_arg(struct diff_options *, const char *);
-extern void show_submodule_summary(FILE *f, const char *path,
-		const char *line_prefix,
+extern void show_submodule_summary(struct diff_options *o, const char *path,
 		struct object_id *one, struct object_id *two,
 		unsigned dirty_submodule, const char *meta,
 		const char *del, const char *add, const char *reset);
-extern void show_submodule_inline_diff(FILE *f, const char *path,
-		const char *line_prefix,
+extern void show_submodule_inline_diff(struct diff_options *o, const char *path,
 		struct object_id *one, struct object_id *two,
 		unsigned dirty_submodule, const char *meta,
-		const char *del, const char *add, const char *reset,
-		const struct diff_options *opt);
+		const char *del, const char *add, const char *reset);
 extern void set_config_fetch_recurse_submodules(int value);
 extern void set_config_update_recurse_submodules(int value);
 /* Check if we want to update any submodule.*/
-- 
2.13.0.18.g7d86cc8ba0


^ permalink raw reply	[relevance 18%]

* [PATCHv4 17/17] diff.c: color moved lines differently
  2017-05-23  2:40 ` [PATCHv4 00/17] Diff machine: highlight moved lines. Stefan Beller
  2017-05-23  2:40   ` [PATCHv4 09/17] submodule.c: convert show_submodule_summary to use emit_line_fmt Stefan Beller
@ 2017-05-23  2:40   ` Stefan Beller
  2017-05-27  1:04   ` Jacob Keller
  2 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-05-23  2:40 UTC (permalink / raw)
  To: gitster; +Cc: git, bmwill, jrnieder, jonathantanmy, peff, mhagger, Stefan Beller

When a patch consists mostly of moving blocks of code around, it can
be quite tedious to ensure that the blocks are moved verbatim, and not
undesirably modified in the move. To that end, color blocks that are
moved within the same patch differently. For example (OM, del, add,
and NM are different colors):

    [OM]  -void sensitive_stuff(void)
    [OM]  -{
    [OM]  -        if (!is_authorized_user())
    [OM]  -                die("unauthorized");
    [OM]  -        sensitive_stuff(spanning,
    [OM]  -                        multiple,
    [OM]  -                        lines);
    [OM]  -}

           void another_function()
           {
    [del] -        printf("foo");
    [add] +        printf("bar");
           }

    [NM]  +void sensitive_stuff(void)
    [NM]  +{
    [NM]  +        if (!is_authorized_user())
    [NM]  +                die("unauthorized");
    [NM]  +        sensitive_stuff(spanning,
    [NM]  +                        multiple,
    [NM]  +                        lines);
    [NM]  +}

Adjacent blocks are colored differently. For example, in this
potentially malicious patch, the swapping of blocks can be spotted:

    [OM]  -void sensitive_stuff(void)
    [OM]  -{
    [OMA] -        if (!is_authorized_user())
    [OMA] -                die("unauthorized");
    [OM]  -        sensitive_stuff(spanning,
    [OM]  -                        multiple,
    [OM]  -                        lines);
    [OMA] -}

           void another_function()
           {
    [del] -        printf("foo");
    [add] +        printf("bar");
           }

    [NM]  +void sensitive_stuff(void)
    [NM]  +{
    [NMA] +        sensitive_stuff(spanning,
    [NMA] +                        multiple,
    [NMA] +                        lines);
    [NM]  +        if (!is_authorized_user())
    [NM]  +                die("unauthorized");
    [NMA] +}

If the moved code is larger, it is easier to hide some permutation in the
code, which is why the alternative coloring is really needed.

As the reviewers attention should be brought to the places, where the
difference is introduced to the moved code, we cannot just have one new
color for all of moved code.

First I implemented an alternative design, which would show a moved hunk
in one color, and its boundaries in another color. This idea was error
prone as it inspected each line and its neighboring lines to determine
if the line was (a) moved and (b) if was deep inside a hunk by having
matching neighboring lines. This is unreliable as the we can construct
hunks which have equal neighbors that just exceed the number of lines
inspected. (Think of 'AXYZBXYZCXYZD..' with each letter as a line, that
is permutated to AXYZCXYZBXYZD..').

Instead this provides a dynamic programming greedy algorithm that finds
the largest moved hunk and then switches color to the alternative color
for the next hunk. By doing this any permutation is recognized and
displayed. That implies that there is no dedicated boundary or
inside-hunk color, but instead we'll have just two colors alternating
for hunks.

It would be a bit more UX friendly if the two corresponding hunks
(of added and deleted lines) for one move would get the same color id.
(Both get "regular moved" or "alternative moved"). This problem is
deferred to a later patch for now.

A note on the options '--submodule=diff' and '--color-words/--word-diff':
In the conversion to use emit_line in the prior patches both submodules
as well as word diff output carefully chose to call emit_line with sign=0.
All output with sign=0 is ignored for move detection purposes in this
patch, such that no weird looking output will be generated for these
cases. This leads to another thought: We could pass on '--color-moved' to
submodules such that they color up moved lines for themselves. If we'd do
so only line moves within a repository boundary are marked up.

Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

# Conflicts:
#	diff.c
---
 Documentation/config.txt   |  14 ++-
 diff.c                     | 275 +++++++++++++++++++++++++++++++++++++++++++--
 diff.h                     |   9 +-
 t/t4015-diff-whitespace.sh | 267 +++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 552 insertions(+), 13 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 475e874d51..902d017c3b 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1051,14 +1051,24 @@ This does not affect linkgit:git-format-patch[1] or the
 'git-diff-{asterisk}' plumbing commands.  Can be overridden on the
 command line with the `--color[=<when>]` option.
 
+color.moved::
+	A boolean value, whether a diff should color moved lines
+	differently. The moved lines are searched for in the diff only.
+	Duplicated lines from somewhere in the project that are not
+	part of the diff are not colored as moved.
+	Defaults to false.
+
 color.diff.<slot>::
 	Use customized color for diff colorization.  `<slot>` specifies
 	which part of the patch to use the specified color, and is one
 	of `context` (context text - `plain` is a historical synonym),
 	`meta` (metainformation), `frag`
 	(hunk header), 'func' (function in hunk header), `old` (removed lines),
-	`new` (added lines), `commit` (commit headers), or `whitespace`
-	(highlighting whitespace errors).
+	`new` (added lines), `commit` (commit headers), `whitespace`
+	(highlighting whitespace errors), `oldMoved` (removed lines that
+	reappear), `newMoved` (added lines that were removed elsewhere),
+	`oldMovedAlternative` and `newMovedAlternative` (as a fallback to
+	cover adjacent blocks of moved code)
 
 color.decorate.<slot>::
 	Use customized color for 'git log --decorate' output.  `<slot>` is one
diff --git a/diff.c b/diff.c
index c0b8afa38f..23e70d348e 100644
--- a/diff.c
+++ b/diff.c
@@ -31,6 +31,7 @@ static int diff_indent_heuristic; /* experimental */
 static int diff_rename_limit_default = 400;
 static int diff_suppress_blank_empty;
 static int diff_use_color_default = -1;
+static int diff_color_moved_default;
 static int diff_context_default = 3;
 static int diff_interhunk_context_default;
 static const char *diff_word_regex_cfg;
@@ -55,6 +56,10 @@ static char diff_colors[][COLOR_MAXLEN] = {
 	GIT_COLOR_YELLOW,	/* COMMIT */
 	GIT_COLOR_BG_RED,	/* WHITESPACE */
 	GIT_COLOR_NORMAL,	/* FUNCINFO */
+	GIT_COLOR_BOLD_RED,	/* OLD_MOVED_A */
+	GIT_COLOR_BG_RED,	/* OLD_MOVED_B */
+	GIT_COLOR_BOLD_GREEN,	/* NEW_MOVED_A */
+	GIT_COLOR_BG_GREEN,	/* NEW_MOVED_B */
 };
 
 static NORETURN void die_want_option(const char *option_name)
@@ -80,6 +85,14 @@ static int parse_diff_color_slot(const char *var)
 		return DIFF_WHITESPACE;
 	if (!strcasecmp(var, "func"))
 		return DIFF_FUNCINFO;
+	if (!strcasecmp(var, "oldmoved"))
+		return DIFF_FILE_OLD_MOVED;
+	if (!strcasecmp(var, "oldmovedalternative"))
+		return DIFF_FILE_OLD_MOVED_ALT;
+	if (!strcasecmp(var, "newmoved"))
+		return DIFF_FILE_NEW_MOVED;
+	if (!strcasecmp(var, "newmovedalternative"))
+		return DIFF_FILE_NEW_MOVED_ALT;
 	return -1;
 }
 
@@ -234,6 +247,10 @@ int git_diff_ui_config(const char *var, const char *value, void *cb)
 		diff_use_color_default = git_config_colorbool(var, value);
 		return 0;
 	}
+	if (!strcmp(var, "color.moved")) {
+		diff_color_moved_default = git_config_bool(var, value);
+		return 0;
+	}
 	if (!strcmp(var, "diff.context")) {
 		diff_context_default = git_config_int(var, value);
 		if (diff_context_default < 0)
@@ -354,6 +371,88 @@ int git_diff_basic_config(const char *var, const char *value, void *cb)
 	return git_default_config(var, value, cb);
 }
 
+struct moved_entry {
+	struct hashmap_entry ent;
+	const struct diff_line *line;
+	struct moved_entry *next_line;
+};
+
+static void get_ws_cleaned_string(const struct diff_line *l,
+				  struct strbuf *out)
+{
+	int i;
+	for (i = 0; i < l->len; i++) {
+		if (isspace(l->line[i]))
+			continue;
+		strbuf_addch(out, l->line[i]);
+	}
+}
+
+static int diff_line_cmp_no_ws(const struct diff_line *a,
+					 const struct diff_line *b,
+					 const void *keydata)
+{
+	int ret;
+	struct strbuf sba = STRBUF_INIT;
+	struct strbuf sbb = STRBUF_INIT;
+
+	get_ws_cleaned_string(a, &sba);
+	get_ws_cleaned_string(b, &sbb);
+	ret = sba.len != sbb.len || strncmp(sba.buf, sbb.buf, sba.len);
+
+	strbuf_release(&sba);
+	strbuf_release(&sbb);
+	return ret;
+}
+
+static int diff_line_cmp(const struct diff_line *a,
+				   const struct diff_line *b,
+				   const void *keydata)
+{
+	return a->len != b->len || strncmp(a->line, b->line, a->len);
+}
+
+static int moved_entry_cmp(const struct moved_entry *a,
+			   const struct moved_entry *b,
+			   const void *keydata)
+{
+	return diff_line_cmp(a->line, b->line, keydata);
+}
+
+static int moved_entry_cmp_no_ws(const struct moved_entry *a,
+				 const struct moved_entry *b,
+				 const void *keydata)
+{
+	return diff_line_cmp_no_ws(a->line, b->line, keydata);
+}
+
+static unsigned get_line_hash(struct diff_line *line, unsigned ignore_ws)
+{
+	static struct strbuf sb = STRBUF_INIT;
+
+	if (ignore_ws) {
+		strbuf_reset(&sb);
+		get_ws_cleaned_string(line, &sb);
+		return memhash(sb.buf, sb.len);
+	} else {
+		return memhash(line->line, line->len);
+	}
+}
+
+static struct moved_entry *prepare_entry(struct diff_options *o,
+					 int line_no)
+{
+	struct moved_entry *ret = xmalloc(sizeof(*ret));
+	unsigned ignore_ws = DIFF_XDL_TST(o, IGNORE_WHITESPACE);
+	struct diff_line *l = &o->line_buffer[line_no];
+
+	ret->ent.hash = get_line_hash(l, ignore_ws);
+	ret->line = l;
+	ret->next_line = NULL;
+
+	return ret;
+}
+
 static char *quote_two(const char *one, const char *two)
 {
 	int need_one = quote_c_style(one, NULL, NULL, 1);
@@ -516,6 +615,141 @@ static void check_blank_at_eof(mmfile_t *mf1, mmfile_t *mf2,
 	ecbdata->blank_at_eof_in_postimage = (at - l2) + 1;
 }
 
+static void add_lines_to_move_detection(struct diff_options *o,
+					struct hashmap *add_lines,
+					struct hashmap *del_lines)
+{
+	struct moved_entry *prev_line = NULL;
+
+	int n;
+	for (n = 0; n < o->line_buffer_nr; n++) {
+		int sign = 0;
+		struct hashmap *hm;
+		struct moved_entry *key;
+
+		switch (o->line_buffer[n].sign) {
+		case '+':
+			sign = '+';
+			hm = add_lines;
+			break;
+		case '-':
+			sign = '-';
+			hm = del_lines;
+			break;
+		case ' ':
+		default:
+			prev_line = NULL;
+			continue;
+		}
+
+		key = prepare_entry(o, n);
+		if (prev_line &&
+		    prev_line->line->sign == sign)
+			prev_line->next_line = key;
+
+		hashmap_add(hm, key);
+		prev_line = key;
+	}
+}
+
+static void mark_color_as_moved(struct diff_options *o,
+				struct hashmap *add_lines,
+				struct hashmap *del_lines)
+{
+	struct moved_entry **pmb = NULL; /* potentially moved blocks */
+	int pmb_nr = 0, pmb_alloc = 0;
+	int use_alt_color = 0;
+	int n;
+
+	for (n = 0; n < o->line_buffer_nr; n++) {
+		struct hashmap *hm = NULL;
+		struct moved_entry *key;
+		struct moved_entry *match = NULL;
+		struct diff_line *l = &o->line_buffer[n];
+		int i, lp, rp;
+
+		switch (l->sign) {
+		case '+':
+			hm = del_lines;
+			break;
+		case '-':
+			hm = add_lines;
+			break;
+		default:
+			use_alt_color = 0;
+			pmb_nr = 0; /* no running sets */
+			continue;
+		}
+
+		/* Check for any match to color it as a move. */
+		key = prepare_entry(o, n);
+		match = hashmap_get(hm, key, o);
+		free(key);
+		if (!match)
+			continue;
+
+		/* Check any potential block runs, advance each or nullify */
+		for (i = 0; i < pmb_nr; i++) {
+			struct moved_entry *p = pmb[i];
+			struct moved_entry *pnext = (p && p->next_line) ?
+					p->next_line : NULL;
+			if (pnext &&
+			    !diff_line_cmp(pnext->line, l, o)) {
+				pmb[i] = p->next_line;
+			} else {
+				pmb[i] = NULL;
+			}
+		}
+
+		/* Shrink the set to the remaining runs */
+		for (lp = 0, rp = pmb_nr - 1; lp <= rp;) {
+			while (lp < pmb_nr && pmb[lp])
+				lp++;
+			/* lp points at the first NULL now */
+
+			while (rp > -1 && !pmb[rp])
+				rp--;
+			/* rp points at the last non-NULL */
+
+			if (lp < pmb_nr && rp > -1 && lp < rp) {
+				pmb[lp] = pmb[rp];
+				pmb[rp] = NULL;
+				rp--;
+				lp++;
+			}
+		}
+
+		if (rp > -1) {
+			/* Remember the number of running sets */
+			pmb_nr = rp + 1;
+		} else {
+			/* Toggle color */
+			use_alt_color = (use_alt_color + 1) % 2;
+
+			/* Build up a new set */
+			pmb_nr = 0;
+			for (; match; match = hashmap_get_next(hm, match)) {
+				ALLOC_GROW(pmb, pmb_nr + 1, pmb_alloc);
+				pmb[pmb_nr++] = match;
+			}
+		}
+
+		switch (l->sign) {
+		case '+':
+			l->set = diff_get_color_opt(o,
+				DIFF_FILE_NEW_MOVED + use_alt_color);
+			break;
+		case '-':
+			l->set = diff_get_color_opt(o,
+				DIFF_FILE_OLD_MOVED + use_alt_color);
+			break;
+		default:
+			die("BUG: we should have continued earlier?");
+		}
+	}
+	free(pmb);
+}
+
 static void emit_diff_line(struct diff_options *o,
 				     struct diff_line *e)
 {
@@ -3518,6 +3752,8 @@ void diff_setup(struct diff_options *options)
 	options->line_buffer = NULL;
 	options->line_buffer_nr = 0;
 	options->line_buffer_alloc = 0;
+
+	options->color_moved = diff_color_moved_default;
 }
 
 void diff_setup_done(struct diff_options *options)
@@ -3627,6 +3863,9 @@ void diff_setup_done(struct diff_options *options)
 
 	if (DIFF_OPT_TST(options, FOLLOW_RENAMES) && options->pathspec.nr != 1)
 		die(_("--follow requires exactly one pathspec"));
+
+	if (!options->use_color || external_diff())
+		options->color_moved = 0;
 }
 
 static int opt_arg(const char *arg, int arg_short, const char *arg_long, int *val)
@@ -4051,6 +4290,10 @@ int diff_opt_parse(struct diff_options *options,
 	}
 	else if (!strcmp(arg, "--no-color"))
 		options->use_color = 0;
+	else if (!strcmp(arg, "--color-moved"))
+		options->color_moved = 1;
+	else if (!strcmp(arg, "--no-color-moved"))
+		options->color_moved = 0;
 	else if (!strcmp(arg, "--color-words")) {
 		options->use_color = 1;
 		options->word_diff = DIFF_WORDS_COLOR;
@@ -4856,16 +5099,9 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o)
 {
 	int i;
 	struct diff_queue_struct *q = &diff_queued_diff;
-	/*
-	 * For testing purposes we want to make sure the diff machinery
-	 * works completely with the buffer. If there is anything emitted
-	 * outside the emit_diff_line, then the order is screwed
-	 * up and the tests will fail.
-	 *
-	 * TODO (later in this series):
-	 * We'll unset this flag in a later patch.
-	 */
-	o->use_buffer = 1;
+
+	if (o->color_moved)
+		o->use_buffer = 1;
 
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
@@ -4874,6 +5110,24 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o)
 	}
 
 	if (o->use_buffer) {
+		if (o->color_moved) {
+			struct hashmap add_lines, del_lines;
+			unsigned ignore_ws = DIFF_XDL_TST(o, IGNORE_WHITESPACE);
+
+			hashmap_init(&del_lines, ignore_ws ?
+				(hashmap_cmp_fn)moved_entry_cmp_no_ws :
+				(hashmap_cmp_fn)moved_entry_cmp, 0);
+			hashmap_init(&add_lines, ignore_ws ?
+				(hashmap_cmp_fn)moved_entry_cmp_no_ws :
+				(hashmap_cmp_fn)moved_entry_cmp, 0);
+
+			add_lines_to_move_detection(o, &add_lines, &del_lines);
+			mark_color_as_moved(o, &add_lines, &del_lines);
+
+			hashmap_free(&add_lines, 0);
+			hashmap_free(&del_lines, 0);
+		}
+
 		for (i = 0; i < o->line_buffer_nr; i++)
 			emit_diff_line(o, &o->line_buffer[i]);
 
@@ -4962,6 +5216,7 @@ void diff_flush(struct diff_options *options)
 		if (!options->file)
 			die_errno("Could not open /dev/null");
 		options->close_file = 1;
+		options->color_moved = 0;
 		for (i = 0; i < q->nr; i++) {
 			struct diff_filepair *p = q->queue[i];
 			if (check_pair_status(p))
diff --git a/diff.h b/diff.h
index fad1258556..445259ebf7 100644
--- a/diff.h
+++ b/diff.h
@@ -7,6 +7,7 @@
 #include "tree-walk.h"
 #include "pathspec.h"
 #include "object.h"
+#include "hashmap.h"
 
 struct rev_info;
 struct diff_options;
@@ -228,6 +229,8 @@ struct diff_options {
 
 	struct diff_line *line_buffer;
 	int line_buffer_nr, line_buffer_alloc;
+
+	int color_moved;
 };
 
 /* Emit [line_prefix] [set] line [reset] */
@@ -243,7 +246,11 @@ enum color_diff {
 	DIFF_FILE_NEW = 5,
 	DIFF_COMMIT = 6,
 	DIFF_WHITESPACE = 7,
-	DIFF_FUNCINFO = 8
+	DIFF_FUNCINFO = 8,
+	DIFF_FILE_OLD_MOVED = 9,
+	DIFF_FILE_OLD_MOVED_ALT = 10,
+	DIFF_FILE_NEW_MOVED = 11,
+	DIFF_FILE_NEW_MOVED_ALT = 12
 };
 const char *diff_get_color(int diff_use_color, enum color_diff ix);
 #define diff_get_color_opt(o, ix) \
diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index 289806d0c7..0e92bf94bf 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -972,4 +972,271 @@ test_expect_success 'option overrides diff.wsErrorHighlight' '
 
 '
 
+test_expect_success 'detect moved code, complete file' '
+	git reset --hard &&
+	cat <<-\EOF >test.c &&
+	#include<stdio.h>
+	main()
+	{
+	printf("Hello World");
+	}
+	EOF
+	git add test.c &&
+	git commit -m "add main function" &&
+	git mv test.c main.c &&
+	git diff HEAD --color-moved --no-renames | test_decode_color >actual &&
+	cat >expected <<-\EOF &&
+	<BOLD>diff --git a/main.c b/main.c<RESET>
+	<BOLD>new file mode 100644<RESET>
+	<BOLD>index 0000000..a986c57<RESET>
+	<BOLD>--- /dev/null<RESET>
+	<BOLD>+++ b/main.c<RESET>
+	<CYAN>@@ -0,0 +1,5 @@<RESET>
+	<BGREEN>+<RESET><BGREEN>#include<stdio.h><RESET>
+	<BGREEN>+<RESET><BGREEN>main()<RESET>
+	<BGREEN>+<RESET><BGREEN>{<RESET>
+	<BGREEN>+<RESET><BGREEN>printf("Hello World");<RESET>
+	<BGREEN>+<RESET><BGREEN>}<RESET>
+	<BOLD>diff --git a/test.c b/test.c<RESET>
+	<BOLD>deleted file mode 100644<RESET>
+	<BOLD>index a986c57..0000000<RESET>
+	<BOLD>--- a/test.c<RESET>
+	<BOLD>+++ /dev/null<RESET>
+	<CYAN>@@ -1,5 +0,0 @@<RESET>
+	<BRED>-#include<stdio.h><RESET>
+	<BRED>-main()<RESET>
+	<BRED>-{<RESET>
+	<BRED>-printf("Hello World");<RESET>
+	<BRED>-}<RESET>
+	EOF
+
+	test_cmp expected actual
+'
+
+test_expect_success 'detect moved code, inside file' '
+	git reset --hard &&
+	cat <<-\EOF >main.c &&
+		#include<stdio.h>
+		int stuff()
+		{
+			printf("Hello ");
+			printf("World\n");
+		}
+
+		int secure_foo(struct user *u)
+		{
+			if (!u->is_allowed_foo)
+				return;
+			foo(u);
+		}
+
+		int main()
+		{
+			foo();
+		}
+	EOF
+	cat <<-\EOF >test.c &&
+		#include<stdio.h>
+		int bar()
+		{
+			printf("Hello World, but different\n");
+		}
+
+		int another_function()
+		{
+			bar();
+		}
+	EOF
+	git add main.c test.c &&
+	git commit -m "add main and test file" &&
+	cat <<-\EOF >main.c &&
+		#include<stdio.h>
+		int stuff()
+		{
+			printf("Hello ");
+			printf("World\n");
+		}
+
+		int main()
+		{
+			foo();
+		}
+	EOF
+	cat <<-\EOF >test.c &&
+		#include<stdio.h>
+		int bar()
+		{
+			printf("Hello World, but different\n");
+		}
+
+		int secure_foo(struct user *u)
+		{
+			if (!u->is_allowed_foo)
+				return;
+			foo(u);
+		}
+
+		int another_function()
+		{
+			bar();
+		}
+	EOF
+	git diff HEAD --no-renames --color-moved| test_decode_color >actual &&
+	cat <<-\EOF >expected &&
+	<BOLD>diff --git a/main.c b/main.c<RESET>
+	<BOLD>index 27a619c..7cf9336 100644<RESET>
+	<BOLD>--- a/main.c<RESET>
+	<BOLD>+++ b/main.c<RESET>
+	<CYAN>@@ -5,13 +5,6 @@<RESET> <RESET>printf("Hello ");<RESET>
+	 printf("World\n");<RESET>
+	 }<RESET>
+	 <RESET>
+	<BRED>-int secure_foo(struct user *u)<RESET>
+	<BRED>-{<RESET>
+	<BRED>-if (!u->is_allowed_foo)<RESET>
+	<BRED>-return;<RESET>
+	<BRED>-foo(u);<RESET>
+	<BRED>-}<RESET>
+	<BRED>-<RESET>
+	 int main()<RESET>
+	 {<RESET>
+	 foo();<RESET>
+	<BOLD>diff --git a/test.c b/test.c<RESET>
+	<BOLD>index 1dc1d85..e34eb69 100644<RESET>
+	<BOLD>--- a/test.c<RESET>
+	<BOLD>+++ b/test.c<RESET>
+	<CYAN>@@ -4,6 +4,13 @@<RESET> <RESET>int bar()<RESET>
+	 printf("Hello World, but different\n");<RESET>
+	 }<RESET>
+	 <RESET>
+	<BGREEN>+<RESET><BGREEN>int secure_foo(struct user *u)<RESET>
+	<BGREEN>+<RESET><BGREEN>{<RESET>
+	<BGREEN>+<RESET><BGREEN>if (!u->is_allowed_foo)<RESET>
+	<BGREEN>+<RESET><BGREEN>return;<RESET>
+	<BGREEN>+<RESET><BGREEN>foo(u);<RESET>
+	<BGREEN>+<RESET><BGREEN>}<RESET>
+	<BGREEN>+<RESET>
+	 int another_function()<RESET>
+	 {<RESET>
+	 bar();<RESET>
+	EOF
+
+	test_cmp expected actual
+'
+
+test_expect_success 'detect permutations inside moved code' '
+	# reusing the move example from last test:
+	cat <<-\EOF >main.c &&
+		#include<stdio.h>
+		int stuff()
+		{
+			printf("Hello ");
+			printf("World\n");
+		}
+
+		int main()
+		{
+			foo();
+		}
+	EOF
+	cat <<-\EOF >test.c &&
+		#include<stdio.h>
+		int bar()
+		{
+			printf("Hello World, but different\n");
+		}
+
+		int secure_foo(struct user *u)
+		{
+			foo(u);
+			if (!u->is_allowed_foo)
+				return;
+		}
+
+		int another_function()
+		{
+			bar();
+		}
+	EOF
+	git diff HEAD --no-renames --color-moved| test_decode_color >actual &&
+	cat <<-\EOF >expected &&
+	<BOLD>diff --git a/main.c b/main.c<RESET>
+	<BOLD>index 27a619c..7cf9336 100644<RESET>
+	<BOLD>--- a/main.c<RESET>
+	<BOLD>+++ b/main.c<RESET>
+	<CYAN>@@ -5,13 +5,6 @@<RESET> <RESET>printf("Hello ");<RESET>
+	 printf("World\n");<RESET>
+	 }<RESET>
+	 <RESET>
+	<BRED>-int secure_foo(struct user *u)<RESET>
+	<BRED>-{<RESET>
+	<BOLD;RED>-if (!u->is_allowed_foo)<RESET>
+	<BOLD;RED>-return;<RESET>
+	<BRED>-foo(u);<RESET>
+	<BOLD;RED>-}<RESET>
+	<BOLD;RED>-<RESET>
+	 int main()<RESET>
+	 {<RESET>
+	 foo();<RESET>
+	<BOLD>diff --git a/test.c b/test.c<RESET>
+	<BOLD>index 1dc1d85..2bedec9 100644<RESET>
+	<BOLD>--- a/test.c<RESET>
+	<BOLD>+++ b/test.c<RESET>
+	<CYAN>@@ -4,6 +4,13 @@<RESET> <RESET>int bar()<RESET>
+	 printf("Hello World, but different\n");<RESET>
+	 }<RESET>
+	 <RESET>
+	<BGREEN>+<RESET><BGREEN>int secure_foo(struct user *u)<RESET>
+	<BGREEN>+<RESET><BGREEN>{<RESET>
+	<BOLD;GREEN>+<RESET><BOLD;GREEN>foo(u);<RESET>
+	<BGREEN>+<RESET><BGREEN>if (!u->is_allowed_foo)<RESET>
+	<BGREEN>+<RESET><BGREEN>return;<RESET>
+	<BOLD;GREEN>+<RESET><BOLD;GREEN>}<RESET>
+	<BOLD;GREEN>+<RESET>
+	 int another_function()<RESET>
+	 {<RESET>
+	 bar();<RESET>
+	EOF
+
+	test_cmp expected actual
+'
+
+test_expect_success 'move detection does not mess up colored words' '
+	cat <<-\EOF >text.txt &&
+	Lorem Ipsum is simply dummy text of the printing and typesetting industry.
+	EOF
+	git add text.txt &&
+	git commit -a -m "clean state" &&
+	cat <<-\EOF >text.txt &&
+	simply Lorem Ipsum dummy is text of the typesetting and printing industry.
+	EOF
+	git diff --color-moved --word-diff >actual &&
+	git diff --word-diff >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'move detection with submodules' '
+	test_create_repo bananas &&
+	echo ripe >bananas/recipe &&
+	git -C bananas add recipe &&
+	test_commit fruit &&
+	test_commit -C bananas recipe &&
+	git submodule add ./bananas &&
+	git add bananas &&
+	git commit -a -m "bananas are like a heavy library?" &&
+	echo foul >bananas/recipe &&
+	echo ripe >fruit.t &&
+
+	git diff --submodule=diff --color-moved >actual &&
+
+	# no move detection as the moved line is across repository boundaries.
+	test_decode_color <actual >decoded_actual &&
+	! grep BGREEN decoded_actual &&
+	! grep BRED decoded_actual &&
+
+	# nor did we mess with it another way
+	git diff --submodule=diff | test_decode_color >expect &&
+	test_cmp expect decoded_actual
+'
+
 test_done
-- 
2.13.0.18.g7d86cc8ba0


^ permalink raw reply	[relevance 9%]

* Re: [PATCH v2 0/2] Update sha1dc from upstream & optionally make it a submodule
  2017-05-22 22:48     ` Re: [PATCH v2 0/2] Update sha1dc from upstream & optionally make it a submodule Stefan Beller
@ 2017-05-23  3:22       ` Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2017-05-23  3:22 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Ævar Arnfjörð Bjarmason, git\, Marc Stevens, Michael Kebe, Jeff King, Brandon Williams

Stefan Beller <sbeller@google.com> writes:

> On Mon, May 22, 2017 at 3:27 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>>
>>> I liked the suggestion to make the URL a relative path, but this would
>>> require you to maintain a mirror in the same places you push git.git
>>> to, is that something you'd be willing to do?
>>
>> After thinking about this a bit more, I know what I think we want a
>> bit better.
>>
>> Relative URL (e.g. ../sha1collisiondetection that sits next to the
>> copy of git.git) may be a good way to go.  I can arrange to create
>> necessary repository next to git.git on k.org and github.com but I
>> need to double check about other places
>
> And here we see another deficit with a single URL:
> We have to abide by the same scheme at all hosting endpoints.

FWIW, I do not see it a deficit.  It is a price you may or may not
be willing to pay for simplicity, and I think it is a reasonable
trade-off.

The .gitmodules format can be enhanced to list multiple URLs quite
easily.  I think the current users all use the equivalent of "git
config -f .gitmodules submodule.foo.url" to grab one value.  Unless
the user chooses to do anything special, they will continue to get
the same behaviour whensuch an enhancement happens, which is a good
thing.

But then, you need to design what users choose to do that is
"something special".  Should "git clone --recurse-submodules" have a
way to control which one of the not-yet-known-before-cloning URLs
that may be listed in .gitmodules?  Will we have a way to say "For
North American users, we recommend this URL, while Asians may want
to fetch from this other URL" in .gitmodules and then the recursive
clone have a way to say "I want the European option"?  Would the
recursive clone have a way to go interactive?

And from that point of view, "you'll find the submodules relative to
the superproject" convention is one way (not necessarily the only
way) to allow users not to care too much.  The simplicity comes with
price and that is perfectly acceptable.

Also a single URL scheme may still perfectly fine.  .gitmodules may
have new submodule.<name>.alternateURL fields and recursive clone
can be told to optionally go interactive when such fields are
present.

Or README can list alternate URLs and instruct the users to use the
insteadOf if they want to go to mirrors instead.  Those users who do
care about picking particular mirror are likely not favor simplicity
over flexibility, so they would not likely to do a recursive clone
(after all, clone is a single-time operation) and it may be
sufficient if they can clone the top-level, read README and then
decide how and from where they get their submodules.

^ permalink raw reply	[relevance 23%]

* Re: [PATCHv4 09/17] submodule.c: convert show_submodule_summary to use emit_line_fmt
  2017-05-23  2:40   ` [PATCHv4 09/17] submodule.c: convert show_submodule_summary to use emit_line_fmt Stefan Beller
@ 2017-05-23  5:59     ` Junio C Hamano
  2017-05-23 18:14       ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Junio C Hamano @ 2017-05-23  5:59 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, bmwill, jrnieder, jonathantanmy, peff, mhagger

Stefan Beller <sbeller@google.com> writes:

> diff --git a/submodule.c b/submodule.c
> index d3299e29c0..428c996c97 100644
> --- a/submodule.c
> +++ b/submodule.c
> ...
> @@ -547,15 +543,16 @@ void show_submodule_inline_diff(FILE *f, const char *path,
>  	if (right)
>  		new = two;
>  
> -	fflush(f);
>  	cp.git_cmd = 1;
>  	cp.dir = path;
> -	cp.out = dup(fileno(f));
> +	cp.out = -1;
>  	cp.no_stdin = 1;
>  
>  	/* TODO: other options may need to be passed here. */
>  	argv_array_push(&cp.args, "diff");
> -	argv_array_pushf(&cp.args, "--line-prefix=%s", line_prefix);
> +	if (o->use_color)
> +		argv_array_push(&cp.args, "--color=always");
> +	argv_array_pushf(&cp.args, "--line-prefix=%s", diff_line_prefix(o));

This makes me wonder if we also need to explicitly decline coloring
when o->use_color is not set.  After all, even if configuration in
the submodule's config file says diff.color=never, we will enable
the color with this codepath (because the user explicitly asked to
use the color in the top-level), so we should do the same for the
opposite case where the config says yes/auto if the user said no at
the top-level, no?

^ permalink raw reply	[relevance 22%]

* Re: [PATCHv2 2/6] submodule test invocation: only pass additional arguments
      [irrelevant] ` <20170522194806.13568-3-sbeller@google.com>
@ 2017-05-23  6:26   ` Junio C Hamano
  2017-05-23 18:29     ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Junio C Hamano @ 2017-05-23  6:26 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, bmwill

Stefan Beller <sbeller@google.com> writes:

> diff --git a/t/t2013-checkout-submodule.sh b/t/t2013-checkout-submodule.sh
> index e8f70b806f..2672f104cf 100755
> --- a/t/t2013-checkout-submodule.sh
> +++ b/t/t2013-checkout-submodule.sh
> @@ -65,9 +65,9 @@ test_expect_success '"checkout <submodule>" honors submodule.*.ignore from .git/
>  
>  KNOWN_FAILURE_DIRECTORY_SUBMODULE_CONFLICTS=1
>  KNOWN_FAILURE_SUBMODULE_RECURSIVE_NESTED=1
> -test_submodule_switch_recursing "git checkout --recurse-submodules"
> +test_submodule_switch_recursing "checkout"
>  
> -test_submodule_forced_switch_recursing "git checkout -f --recurse-submodules"
> +test_submodule_forced_switch_recursing "checkout -f"
>  
>  test_submodule_switch "git checkout"

Doesn't the above look crazy to you?  

It is hostile to other people (and those who need to make merges)
who have to work with test_submodule_switch_recursing that older one
used to take the full command but its definition suddenly changes so
that the caller now must omit the leading "git".  Even worse,
another helper with a similar-sounding name, test_submodule_switch,
still must be called with the leading "git".

The same comment applies to the one we can see below.

> diff --git a/t/t7112-reset-submodule.sh b/t/t7112-reset-submodule.sh
> index f86ccdf215..a000304221 100755
> --- a/t/t7112-reset-submodule.sh
> +++ b/t/t7112-reset-submodule.sh
> @@ -9,9 +9,9 @@ KNOWN_FAILURE_SUBMODULE_RECURSIVE_NESTED=1
>  KNOWN_FAILURE_DIRECTORY_SUBMODULE_CONFLICTS=1
>  KNOWN_FAILURE_SUBMODULE_OVERWRITE_IGNORED_UNTRACKED=1
>  
> -test_submodule_switch_recursing "git reset --recurse-submodules --keep"
> +test_submodule_switch_recursing "reset --keep"
>  
> -test_submodule_forced_switch_recursing "git reset --hard --recurse-submodules"
> +test_submodule_forced_switch_recursing "reset --hard"
>  
>  test_submodule_switch "git reset --keep"

^ permalink raw reply	[relevance 16%]

* Re: BUG: The .gitignore rules can't be made to cross submodule boundaries
      [irrelevant]   ` <CACBZZX5EQhoEBvj2e6ogXU5Y=EfwSPCx+jFvTJ1P2KbYNpADyw@mail.gmail.com>
@ 2017-05-23 17:51     ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-05-23 17:51 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Johannes Schindelin, Git Mailing List, Brandon Williams

On Tue, May 23, 2017 at 2:55 AM, Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> On Tue, May 23, 2017 at 11:17 AM, Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
>> Hi Ævar,
>>
>> On Mon, 22 May 2017, Ævar Arnfjörð Bjarmason wrote:
>>
>>> When I was adding the sha1collisiondetection submodule to git.git I
>>> noticed that building git would dirty the submodule.
>>>
>>> This is because our own Makefile adds .depend/ directories. I hacked
>>> around it by just getting the upstream project accept carrying an ignore
>>> rule for that around:
>>> https://github.com/cr-marcstevens/sha1collisiondetection/commit/e8397b26
>>>
>>> A workaround for this is to have the Makefile add such a rule to
>>> .git/modules/sha1collisiondetection/info/exclude, but that's less
>>> convenient than being able to distribute it as a normal .gitignore rule.
>>>
>>> The submodule.<name>.ignore config provides an overly big hammer to
>>> solve this, it would be better if we had something like
>>> submodule.<name>.gitignore=<path>. Then we could have e.g.
>>> .gitignore.sha1collisiondetection which would be added to whatever rules
>>> the repo's own .gitignore provides.
>>
>> While I have nothing but the utmost respect for Stefan and Brandon for
>> trying to improve submodules, maybe it would be a wiser idea to imitate
>> the same strategy with sha1dc as we use with git-gui and gitk, i.e.
>> perform a subtree merge instead of adding it as a submodule. It's not like
>> 570kB will kill us.

Actually that is a very valid bug report outside that series for the
behavior of submodules.

In a world where you use a submodule to track say a third party
library, the current behavior of .gitignore applying to each repo makes
sense.

When it is no third party, but a first party lib, then it is sensible to expect
that the building/testing infrastructure works across the whole repo set,
and the user wants just one central place to specify things, such as
ignoring certain files or applying .gitattributes.

This topic came up in various forms on the mailing list, most often for
config that ought to be applied across all repos[1].

That said I have no good idea yet how to fix this issue without introducing
the ultimate user confusion.

The conditional include of config files (by Duy as part of 2.13) seems like
an interesting approach, which we could build on top of.
We currently have a main config and a per-working-tree config, so I would
expect we'd introduce another config file that is included by all submodules
by default. It could be located in the superproject at ".git/config.super".
This config file could then specify

[submodule]
    recursiveIgnore = [yes/no]
    recursiveAttributes = [yes/no]

In that way commands run from within the submodule as well as from
the superproject would realize that the submodule needs to lookup
the superproject and use the attribute/ignore/config settings from there.

[1] Here the example for URL.insteadOf
https://public-inbox.org/git/CAPZ477MCsBsfbqKzp69MT_brwz-0aes6twJofQrhizUBV7ZoeA@mail.gmail.com/

>
> The submodule/.gitignore bug/feature-request being reported here isn't
> something that impacts the ab/sha1dc series in practice.
>
> It was something I noticed while working with an earlier commit in
> that repo, but that's a commit that'll never be pinned by the
> git.git:sha1collisiondetection submodule.

Thanks for the bug report. As outlined above, we'd still need to bikeshed
how to fix it properly I'd think.

Thanks,
Stefan

^ permalink raw reply	[relevance 24%]

* Re: [PATCHv4 09/17] submodule.c: convert show_submodule_summary to use emit_line_fmt
  2017-05-23  5:59     ` Junio C Hamano
@ 2017-05-23 18:14       ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-05-23 18:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Brandon Williams, Jonathan Nieder, Jonathan Tan, Jeff King, Michael Haggerty

On Mon, May 22, 2017 at 10:59 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> diff --git a/submodule.c b/submodule.c
>> index d3299e29c0..428c996c97 100644
>> --- a/submodule.c
>> +++ b/submodule.c
>> ...
>> @@ -547,15 +543,16 @@ void show_submodule_inline_diff(FILE *f, const char *path,
>>       if (right)
>>               new = two;
>>
>> -     fflush(f);
>>       cp.git_cmd = 1;
>>       cp.dir = path;
>> -     cp.out = dup(fileno(f));
>> +     cp.out = -1;
>>       cp.no_stdin = 1;
>>
>>       /* TODO: other options may need to be passed here. */
>>       argv_array_push(&cp.args, "diff");
>> -     argv_array_pushf(&cp.args, "--line-prefix=%s", line_prefix);
>> +     if (o->use_color)
>> +             argv_array_push(&cp.args, "--color=always");
>> +     argv_array_pushf(&cp.args, "--line-prefix=%s", diff_line_prefix(o));
>
> This makes me wonder if we also need to explicitly decline coloring
> when o->use_color is not set.  After all, even if configuration in
> the submodule's config file says diff.color=never, we will enable
> the color with this codepath (because the user explicitly asked to
> use the color in the top-level), so we should do the same for the
> opposite case where the config says yes/auto if the user said no at
> the top-level, no?

That makes sense, so instead we'd do

             argv_array_push(&cp.args, "--color=%s", o->use_color ?
"always" : "never");

to override the submodule config in all cases.

However that changes from current behavior.

You could imagine that you want to see the superproject colored
and the submodule non-colored to easily spot that it is a submodule change.
Currently this can be made to work via setting color=never in the
submodule and then run the diff from the superproject.

What we really want here is a switch that influences the automatic detection
and say: pretend "dup(fileno(f));" was your stdout, now run your auto-detection
to decide for yourself.

I am not sure if it worth the effort to fix this hypothetical situation, though.

Thanks,
Stefan

^ permalink raw reply	[relevance 25%]

* Re: [PATCHv2 2/6] submodule test invocation: only pass additional arguments
  2017-05-23  6:26   ` Re: [PATCHv2 2/6] submodule test invocation: only pass additional arguments Junio C Hamano
@ 2017-05-23 18:29     ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-05-23 18:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Brandon Williams

On Mon, May 22, 2017 at 11:26 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> diff --git a/t/t2013-checkout-submodule.sh b/t/t2013-checkout-submodule.sh
>> index e8f70b806f..2672f104cf 100755
>> --- a/t/t2013-checkout-submodule.sh
>> +++ b/t/t2013-checkout-submodule.sh
>> @@ -65,9 +65,9 @@ test_expect_success '"checkout <submodule>" honors submodule.*.ignore from .git/
>>
>>  KNOWN_FAILURE_DIRECTORY_SUBMODULE_CONFLICTS=1
>>  KNOWN_FAILURE_SUBMODULE_RECURSIVE_NESTED=1
>> -test_submodule_switch_recursing "git checkout --recurse-submodules"
>> +test_submodule_switch_recursing "checkout"
>>
>> -test_submodule_forced_switch_recursing "git checkout -f --recurse-submodules"
>> +test_submodule_forced_switch_recursing "checkout -f"
>>
>>  test_submodule_switch "git checkout"
>
> Doesn't the above look crazy to you?

Oh well. The commit message doesn't explain why the craziness is
required here (really!).

    submodule test invocation: only pass additional arguments

    In a later patch we want to introduce a config option to trigger
    the submodule recursing by default. As this option should be
    available and uniform across all commands that deal with submodules
    we'd want to test for this option in the submodule update library.

    So instead of calling the whole test set again for
    "git -c submodule.recurse foo" instead of
    "git foo --recurse-submodules", we'd only want to introduce one
    basic test that tests if the option is recognized and respected
    to not overload the test suite.

>
> It is hostile to other people (and those who need to make merges)
> who have to work with test_submodule_switch_recursing that older one
> used to take the full command but its definition suddenly changes so
> that the caller now must omit the leading "git".

I am not aware of other people (or other series in flight by myself) that use
one of the switches currently.

>  Even worse,
> another helper with a similar-sounding name, test_submodule_switch,
> still must be called with the leading "git".

Oh, yeah that is a real issue. I will migrate all of them.

>
> The same comment applies to the one we can see below.

An alternative would be to come up with a slightly different name
to ensure we do not have issues with other series in flight. The function
name is already pretty long, so encoding even more information in it
may be not a good idea. But the argument is shorter, so maybe:

- test_submodule_switch_recursing "git reset --hard --recurse-submodules"
+ test_submodule_switch_recursing_args_only  "reset --hard"

Thanks,
Stefan

^ permalink raw reply	[relevance 25%]

* Re: [PATCHv2 1/6] submodule.c: add has_submodules to check if we have any submodules
      [irrelevant] ` <20170522194806.13568-2-sbeller@google.com>
@ 2017-05-23 18:40   ` Brandon Williams
  2017-05-23 18:52     ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Brandon Williams @ 2017-05-23 18:40 UTC (permalink / raw)
  To: Stefan Beller; +Cc: gitster, git

On 05/22, Stefan Beller wrote:
> When submodules are involved, it often slows down the process, as most
> submodule related handling is either done via a child process or by
> iterating over the index finding all gitlinks.
> 
> For most commands that may interact with submodules, we need have a
> quick check if we do have any submodules at all, such that we can
> be fast in the case when no submodules are in use.  For this quick
> check introduce a function that checks with different heuristics if
> we do have submodules around, checking if
> * anything related to submodules is configured,
> * absorbed git dirs for submodules are present,
> * the '.gitmodules' file exists
> * gitlinks are recorded in the index.
> 
> Each heuristic has advantages and disadvantages.
> For example in a later patch, when we first use this function in
> git-clone, we'll just check for the existence of the '.gitmodules'
> file, because at the time of running the clone command there will
> be no absorbed git dirs or submodule configuration around.
> 
> Checking for any configuration related to submodules would be useful
> in a later stage (after cloning) to see if the submodules are actually
> in use.
> 
> Checking for absorbed git directories is good to see if the user has
> actually cloned submodules already (i.e. not just initialized them by
> configuring them).
> 
> The heuristic for checking the configuration requires this patch
> to have have a global state, whether the submodule config has already
> been read, and if there were any submodule related keys. Make
> 'submodule_config' private to the submodule code, and introduce
> 'load_submodule_config' that will take care of this global state.

It doesn't look like any patches actually use this helper, is this
intended?

> 
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>  builtin/checkout.c          |  2 +-
>  builtin/fetch.c             |  3 +-
>  builtin/read-tree.c         |  3 +-
>  builtin/reset.c             |  3 +-
>  builtin/submodule--helper.c | 10 ++----
>  submodule.c                 | 80 ++++++++++++++++++++++++++++++++++++---------
>  submodule.h                 |  8 ++++-
>  unpack-trees.c              |  3 +-
>  8 files changed, 79 insertions(+), 33 deletions(-)
> 
> diff --git a/builtin/checkout.c b/builtin/checkout.c
> index bfa5419f33..2787b343b1 100644
> --- a/builtin/checkout.c
> +++ b/builtin/checkout.c
> @@ -1215,7 +1215,7 @@ int cmd_checkout(int argc, const char **argv, const char *prefix)
>  	}
>  
>  	if (recurse_submodules != RECURSE_SUBMODULES_OFF) {
> -		git_config(submodule_config, NULL);
> +		load_submodule_config();
>  		if (recurse_submodules != RECURSE_SUBMODULES_DEFAULT)
>  			set_config_update_recurse_submodules(recurse_submodules);
>  	}
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 4ef7a08afc..4b5f172623 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -1343,8 +1343,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>  			int arg = parse_fetch_recurse_submodules_arg("--recurse-submodules-default", recurse_submodules_default);
>  			set_config_fetch_recurse_submodules(arg);
>  		}
> -		gitmodules_config();
> -		git_config(submodule_config, NULL);
> +		load_submodule_config();
>  	}
>  
>  	if (all) {
> diff --git a/builtin/read-tree.c b/builtin/read-tree.c
> index 23e212ee8c..2f7f085b82 100644
> --- a/builtin/read-tree.c
> +++ b/builtin/read-tree.c
> @@ -176,8 +176,7 @@ int cmd_read_tree(int argc, const char **argv, const char *unused_prefix)
>  	hold_locked_index(&lock_file, LOCK_DIE_ON_ERROR);
>  
>  	if (recurse_submodules != RECURSE_SUBMODULES_DEFAULT) {
> -		gitmodules_config();
> -		git_config(submodule_config, NULL);
> +		load_submodule_config();
>  		set_config_update_recurse_submodules(RECURSE_SUBMODULES_ON);
>  	}
>  
> diff --git a/builtin/reset.c b/builtin/reset.c
> index 5ce27fcaed..319d8c1201 100644
> --- a/builtin/reset.c
> +++ b/builtin/reset.c
> @@ -320,8 +320,7 @@ int cmd_reset(int argc, const char **argv, const char *prefix)
>  	parse_args(&pathspec, argv, prefix, patch_mode, &rev);
>  
>  	if (recurse_submodules != RECURSE_SUBMODULES_DEFAULT) {
> -		gitmodules_config();
> -		git_config(submodule_config, NULL);
> +		load_submodule_config();
>  		set_config_update_recurse_submodules(RECURSE_SUBMODULES_ON);
>  	}
>  
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index 85aafe46a4..92e13abe2d 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -1013,9 +1013,7 @@ static int update_clone(int argc, const char **argv, const char *prefix)
>  	if (pathspec.nr)
>  		suc.warn_if_uninitialized = 1;
>  
> -	/* Overlay the parsed .gitmodules file with .git/config */
> -	gitmodules_config();
> -	git_config(submodule_config, NULL);
> +	load_submodule_config();
>  
>  	if (max_jobs < 0)
>  		max_jobs = parallel_submodules();
> @@ -1057,9 +1055,8 @@ static int resolve_relative_path(int argc, const char **argv, const char *prefix
>  static const char *remote_submodule_branch(const char *path)
>  {
>  	const struct submodule *sub;
> -	gitmodules_config();
> -	git_config(submodule_config, NULL);
>  
> +	load_submodule_config();
>  	sub = submodule_from_path(null_sha1, path);
>  	if (!sub)
>  		return NULL;
> @@ -1129,8 +1126,7 @@ static int absorb_git_dirs(int argc, const char **argv, const char *prefix)
>  	argc = parse_options(argc, argv, prefix, embed_gitdir_options,
>  			     git_submodule_helper_usage, 0);
>  
> -	gitmodules_config();
> -	git_config(submodule_config, NULL);
> +	load_submodule_config();
>  
>  	if (module_list_compute(argc, argv, prefix, &pathspec, &list) < 0)
>  		return 1;
> diff --git a/submodule.c b/submodule.c
> index 20ed5b5681..dda5ed210f 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -24,6 +24,12 @@ static int initialized_fetch_ref_tips;
>  static struct sha1_array ref_tips_before_fetch;
>  static struct sha1_array ref_tips_after_fetch;
>  
> +static enum {
> +	SUBMODULE_CONFIG_NOT_READ = 0,
> +	SUBMODULE_CONFIG_NO_CONFIG,
> +	SUBMODULE_CONFIG_EXISTS,
> +} submodule_config_reading;
> +
>  /*
>   * The following flag is set if the .gitmodules file is unmerged. We then
>   * disable recursion for all submodules where .git/config doesn't have a
> @@ -83,6 +89,64 @@ int update_path_in_gitmodules(const char *oldpath, const char *newpath)
>  	return 0;
>  }
>  
> +static int submodule_config(const char *var, const char *value, void *cb)
> +{
> +	if (!strcmp(var, "submodule.fetchjobs")) {
> +		submodule_config_reading = SUBMODULE_CONFIG_EXISTS;
> +		parallel_jobs = git_config_int(var, value);
> +		if (parallel_jobs < 0)
> +			die(_("negative values not allowed for submodule.fetchJobs"));
> +		return 0;
> +	} else if (starts_with(var, "submodule.")) {
> +		submodule_config_reading = SUBMODULE_CONFIG_EXISTS;
> +		return parse_submodule_config_option(var, value);
> +	} else if (!strcmp(var, "fetch.recursesubmodules")) {
> +		submodule_config_reading = SUBMODULE_CONFIG_EXISTS;
> +		config_fetch_recurse_submodules = parse_fetch_recurse_submodules_arg(var, value);
> +		return 0;
> +	}
> +	return 0;
> +}
> +
> +void load_submodule_config(void)
> +{
> +	submodule_config_reading = SUBMODULE_CONFIG_NO_CONFIG;
> +
> +	gitmodules_config();
> +	git_config(submodule_config, NULL);
> +}
> +
> +int has_submodules(unsigned what_to_check)
> +{
> +	if (what_to_check & SUBMODULE_CHECK_ANY_CONFIG) {
> +		if (submodule_config_reading == SUBMODULE_CONFIG_NOT_READ)
> +			load_submodule_config();
> +		if (submodule_config_reading == SUBMODULE_CONFIG_EXISTS)
> +			return 1;
> +	}
> +
> +	if ((what_to_check & SUBMODULE_CHECK_ABSORBED_GIT_DIRS) &&
> +	    file_exists(git_path("modules")))
> +		return 1;
> +
> +	if ((what_to_check & SUBMODULE_CHECK_GITMODULES_IN_WT) &&
> +	    (!is_bare_repository() && file_exists(".gitmodules")))
> +		return 1;
> +
> +	if (what_to_check & SUBMODULE_CHECK_GITLINKS_IN_TREE) {
> +		int i;
> +
> +		if (read_cache() < 0)
> +			die(_("index file corrupt"));
> +
> +		for (i = 0; i < active_nr; i++)
> +			if (S_ISGITLINK(active_cache[i]->ce_mode))
> +				return 1;
> +	}
> +
> +	return 0;
> +}
> +
>  /*
>   * Try to remove the "submodule.<name>" section from .gitmodules where the given
>   * path is configured. Return 0 only if a .gitmodules file was found, a section
> @@ -152,22 +216,6 @@ void set_diffopt_flags_from_submodule_config(struct diff_options *diffopt,
>  	}
>  }
>  
> -int submodule_config(const char *var, const char *value, void *cb)
> -{
> -	if (!strcmp(var, "submodule.fetchjobs")) {
> -		parallel_jobs = git_config_int(var, value);
> -		if (parallel_jobs < 0)
> -			die(_("negative values not allowed for submodule.fetchJobs"));
> -		return 0;
> -	} else if (starts_with(var, "submodule."))
> -		return parse_submodule_config_option(var, value);
> -	else if (!strcmp(var, "fetch.recursesubmodules")) {
> -		config_fetch_recurse_submodules = parse_fetch_recurse_submodules_arg(var, value);
> -		return 0;
> -	}
> -	return 0;
> -}
> -
>  void gitmodules_config(void)
>  {
>  	const char *work_tree = get_git_work_tree();
> diff --git a/submodule.h b/submodule.h
> index 8a8bc49dc9..5ec72fbb16 100644
> --- a/submodule.h
> +++ b/submodule.h
> @@ -1,6 +1,12 @@
>  #ifndef SUBMODULE_H
>  #define SUBMODULE_H
>  
> +#define SUBMODULE_CHECK_ANY_CONFIG		(1<<0)
> +#define SUBMODULE_CHECK_ABSORBED_GIT_DIRS	(1<<1)
> +#define SUBMODULE_CHECK_GITMODULES_IN_WT	(1<<2)
> +#define SUBMODULE_CHECK_GITLINKS_IN_TREE 	(1<<3)
> +int has_submodules(unsigned what_to_check);
> +
>  struct diff_options;
>  struct argv_array;
>  struct sha1_array;
> @@ -37,7 +43,7 @@ extern int remove_path_from_gitmodules(const char *path);
>  extern void stage_updated_gitmodules(void);
>  extern void set_diffopt_flags_from_submodule_config(struct diff_options *,
>  		const char *path);
> -extern int submodule_config(const char *var, const char *value, void *cb);
> +extern void load_submodule_config(void);
>  extern void gitmodules_config(void);
>  extern void gitmodules_config_sha1(const unsigned char *commit_sha1);
>  extern int is_submodule_initialized(const char *path);
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 4b3f9518e5..e3174b3b66 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -291,8 +291,7 @@ static void reload_gitmodules_file(struct index_state *index,
>  			else if (r == 0) {
>  				submodule_free();
>  				checkout_entry(ce, state, NULL);
> -				gitmodules_config();
> -				git_config(submodule_config, NULL);
> +				load_submodule_config();
>  			} else
>  				break;
>  		}
> -- 
> 2.13.0.18.g7d86cc8ba0
> 

-- 
Brandon Williams

^ permalink raw reply	[relevance 18%]

* Re: [PATCHv2 1/6] submodule.c: add has_submodules to check if we have any submodules
  2017-05-23 18:40   ` Re: [PATCHv2 1/6] submodule.c: add has_submodules to check if we have any submodules Brandon Williams
@ 2017-05-23 18:52     ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-05-23 18:52 UTC (permalink / raw)
  To: Brandon Williams; +Cc: Junio C Hamano, git

On Tue, May 23, 2017 at 11:40 AM, Brandon Williams <bmwill@google.com> wrote:
> It doesn't look like any patches actually use this helper, is this
> intended?

It was needed for
https://public-inbox.org/git/20170411194616.4963-1-sbeller@google.com/
which we do not have in this series any more. Will drop this patch.

^ permalink raw reply	[relevance 18%]

* Re: [GSoC][PATCH v4 1/2] t7407: test "submodule foreach --recursive" from subdirectory added
      [irrelevant] ` <20170521125814.26255-1-pc44800@gmail.com>
@ 2017-05-23 19:06   ` Brandon Williams
      [irrelevant]   ` <20170521125814.26255-2-pc44800@gmail.com>
  1 sibling, 0 replies; 200+ results
From: Brandon Williams @ 2017-05-23 19:06 UTC (permalink / raw)
  To: Prathamesh Chavan; +Cc: git, sbeller, christian.couder, peff, ramsay

On 05/21, Prathamesh Chavan wrote:
> Additional test cases added to the submodule-foreach test suite
> to check the submodule foreach --recursive behavior from a
> subdirectory as this was missing from the test suite.
> 
> Mentored-by: Christian Couder <christian.couder@gmail.com>
> Mentored-by: Stefan Beller <sbeller@google.com>
> Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
> ---
> It was observed that after porting the submodule subcommand to
> C, it passed all the test from the existing test-suite.
> But since there was some observation made, where the output of
> the orignal submodule foreach subcommand wasn't matching to that
> of the newly ported function, this test has been added.
> 
> After which, it can been seen that the patch fails in test #9
> of t7407-submodule-foreach, which is the newly added
> test to that suite. The main reason of adding this test
> was to bring the behavior of $path for the submodule
> foreach --recursive case.
> 
> The observation made was as follows:
> 
> For a project - super containing dir (not a submodule)
> and a submodule sub which contains another submodule
> subsub. When we run a command from super/dir:
> 
> git submodule foreach "echo \$path-\$sm_path"
> 
> actual results:
> Entering '../sub'
> ../sub-../sub
> Entering '../sub/subsub'
> ../subsub-../subsub
> 
> ported function's result:
> Entering '../sub'
> sub-../sub
> Entering '../sub/subsub'
> subsub-../sub/subsub
> 
> This is occurring since in cmd_foreach of git-submodule.sh
> when we use to recurse, we call cmd_foreach
> and hence the process ran in the same shell.
> Because of this, the variable $wt_prefix is set only once
> which is at the beginning of the submodule foreach execution.
> wt_prefix=$(git rev-parse --show-prefix)
> 
> And since sm_path and path are set using $wt_prefix as :
> sm_path=$(git submodule--helper relative-path "$sm_path" "$wt_prefix") &&
> path=$sm_path
> It differs with the value of displaypath as well.
> 
> This make the value of $path confusing and I also feel it
> deviates from its documentation:
> $path is the name of the submodule directory relative
> to the superproject.
> 
> But since in refactoring the code, we wish to maintain the
> code in same way, we need to pass wt_prefix on every
> recursive call, which may result in complex C code.
> Another option could be to first correct the $path value
> in git-submodule.sh and then port the updated cmd_foreach.
> 
>  t/t7407-submodule-foreach.sh | 35 ++++++++++++++++++++++++++++++++++-
>  1 file changed, 34 insertions(+), 1 deletion(-)
> 
> diff --git a/t/t7407-submodule-foreach.sh b/t/t7407-submodule-foreach.sh
> index 6ba5daf42..58a890e31 100755
> --- a/t/t7407-submodule-foreach.sh
> +++ b/t/t7407-submodule-foreach.sh
> @@ -79,7 +79,6 @@ test_expect_success 'test basic "submodule foreach" usage' '
>  	) &&
>  	test_i18ncmp expect actual
>  '
> -

The removal of this line seems unrelated to the rest of this patch.  Was
this intended?

>  cat >expect <<EOF
>  Entering '../sub1'
>  $pwd/clone-foo1-../sub1-$sub1sha1
> @@ -197,6 +196,40 @@ test_expect_success 'test messages from "foreach --recursive" from subdirectory'
>  	test_i18ncmp expect actual
>  '
>  
> +sub1sha1=$(cd clone2/sub1 && git rev-parse HEAD)
> +sub2sha1=$(cd clone2/sub2 && git rev-parse HEAD)
> +sub3sha1=$(cd clone2/sub3 && git rev-parse HEAD)
> +nested1sha1=$(cd clone2/nested1 && git rev-parse HEAD)
> +nested2sha1=$(cd clone2/nested1/nested2 && git rev-parse HEAD)
> +nested3sha1=$(cd clone2/nested1/nested2/nested3 && git rev-parse HEAD)
> +submodulesha1=$(cd clone2/nested1/nested2/nested3/submodule && git rev-parse HEAD)
> +
> +cat >expect <<EOF
> +Entering '../nested1'
> +$pwd/clone2-nested1-../nested1-$nested1sha1
> +Entering '../nested1/nested2'
> +$pwd/clone2/nested1-nested2-../nested2-$nested2sha1
> +Entering '../nested1/nested2/nested3'
> +$pwd/clone2/nested1/nested2-nested3-../nested3-$nested3sha1
> +Entering '../nested1/nested2/nested3/submodule'
> +$pwd/clone2/nested1/nested2/nested3-submodule-../submodule-$submodulesha1
> +Entering '../sub1'
> +$pwd/clone2-foo1-../sub1-$sub1sha1
> +Entering '../sub2'
> +$pwd/clone2-foo2-../sub2-$sub2sha1
> +Entering '../sub3'
> +$pwd/clone2-foo3-../sub3-$sub3sha1
> +EOF
> +
> +test_expect_success 'test "submodule foreach --recursive" from subdirectory' '
> +	(
> +		cd clone2 &&
> +		cd untracked &&
> +		git submodule foreach --recursive "echo \$toplevel-\$name-\$sm_path-\$sha1" >../../actual
> +	) &&
> +	test_i18ncmp expect actual
> +'
> +
>  cat > expect <<EOF
>  nested1-nested1
>  nested2-nested2
> -- 
> 2.11.0
> 

-- 
Brandon Williams

^ permalink raw reply	[relevance 7%]

* Re: What's cooking in git.git (May 2017, #07; Tue, 23)
      [irrelevant] <xmqqwp98j8q2.fsf@gitster.mtv.corp.google.com>
@ 2017-05-23 19:08 ` Stefan Beller
  2017-05-23 19:38   ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-05-23 19:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, May 23, 2017 at 1:08 AM, Junio C Hamano <gitster@pobox.com> wrote:

> * sb/submodule-blanket-recursive (2017-05-23) 6 commits
>  . builtin/push.c: respect 'submodule.recurse' option
>  . builtin/grep.c: respect 'submodule.recurse' option
>  . builtin/fetch.c: respect 'submodule.recurse' option
>  . Introduce submodule.recurse option for worktree manipulators
>  . submodule test invocation: only pass additional arguments
>  . submodule.c: add has_submodules to check if we have any submodules
>  (this branch uses sb/reset-recurse-submodules.)
>
>  A new configuration variable "submodule.recurse" can be set to true
>  to force various commands run at the top-level superproject to
>  behave as if they were invoked with the "--recurse-submodules"
>  option.
>
>  Seems to break t7814 when merged to 'pu'.

I will investigate! (It passes on its own, so I guess it is some
interference with a recent grep series)


> * sb/diff-color-move (2017-05-23) 17 commits
>  . diff.c: color moved lines differently
>  . diff: buffer all output if asked to
>  . diff.c: emit_line includes whitespace highlighting
>  . diff.c: convert diff_summary to use emit_line_*
>  . diff.c: convert diff_flush to use emit_line_*
>  . diff.c: convert word diffing to use emit_line_*
>  . diff.c: convert show_stats to use emit_line_*
>  . diff.c: convert emit_binary_diff_body to use emit_line_*
>  . submodule.c: convert show_submodule_summary to use emit_line_fmt
>  . diff.c: convert emit_rewrite_lines to use emit_line_*
>  . diff.c: convert emit_rewrite_diff to use emit_line_*
>  . diff.c: convert builtin_diff to use emit_line_*
>  . diff.c: convert fn_out_consume to use emit_line
>  . diff: introduce more flexible emit function
>  . diff.c: factor out diff_flush_patch_all_file_pairs
>  . diff: move line ending check into emit_hunk_header
>  . diff: readability fix
>
>  "git diff" has been taught to optionally paint new lines that are
>  the same as deleted lines elsewhere differently from genuinely new
>  lines.
>
>  Seems to break t4060 when merged to 'next'.

It breaks own its own, but when merged to next it breaks, too. :(

The reason for this is the submodule color thing that I added
last minute as manual inspection of submodule diffs seemed
odd to me.

It turns out submodule diffs were never colored appropriately,
so I'll resend with this interdiff (that let's test pass again),
once the discussion settles:

diff --git a/submodule.c b/submodule.c
index 428c996c97..19c63197fb 100644
--- a/submodule.c
+++ b/submodule.c
@@ -550,8 +550,6 @@ void show_submodule_inline_diff(struct
diff_options *o, const char *path,

        /* TODO: other options may need to be passed here. */
        argv_array_push(&cp.args, "diff");
-       if (o->use_color)
-               argv_array_push(&cp.args, "--color=always");
        argv_array_pushf(&cp.args, "--line-prefix=%s", diff_line_prefix(o));
        if (DIFF_OPT_TST(o, REVERSE_DIFF)) {
                argv_array_pushf(&cp.args, "--src-prefix=%s%s/",

^ permalink raw reply	[relevance 19%]

* Re: [GSoC][PATCH v4 2/2] submodule: port subcommand foreach from shell to C
  2017-05-22 20:04     ` Re: [GSoC][PATCH v4 2/2] submodule: port subcommand foreach from shell to C Stefan Beller
@ 2017-05-23 19:09       ` Brandon Williams
  0 siblings, 0 replies; 200+ results
From: Brandon Williams @ 2017-05-23 19:09 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Prathamesh Chavan, git, Christian Couder, Jeff King, Ramsay Jones

On 05/22, Stefan Beller wrote:
> On Sun, May 21, 2017 at 5:58 AM, Prathamesh Chavan <pc44800@gmail.com> wrote:
> 
> > I have also made some changes in git-submodule.sh for correcting
> > the $path variable. And hence made the corresponding changes in
> > the new test introduced in t7407-submodule-foreach as well.
> > I have push this work at:
> > https://github.com/pratham-pc/git/commits/foreach-bug-fixed
> 
> This one seems to pass the test suite by having the bug fixed.
> (The patches posted here seems to be
> https://github.com/pratham-pc/git/commits/foreach
> which does not pass tests? These two series seem to only differ in
> the bug fix commit, which I think is a good idea to include, as then we
> have a bug fixed and the tests pass.)
> 
> > +static void for_each_submodule_list(const struct module_list list, submodule_list_func_t fn, void *cb_data)
> ..
> > +       return;
> 
> no need for an explicit return in a void function.
> 
> > +struct cb_foreach {
> > +       int argc;
> > +       const char **argv;
> > +       const char *prefix;
> > +       unsigned int quiet: 1;
> > +       unsigned int recursive: 1;
> > +};
> > +#define CB_FOREACH_INIT { 0, NULL, 0, 0 }
> 
> This static initializer doesn't quite match the struct,
> (I would expect two NULLs as we have two const char pointers).

If we ever move to a new version of C, these initializers would be much
more readable as we could assign values to the fields themselves.  But
that is unrelated to this change.

> 
> > +
> > +       info.argc = argc;
> > +       info.argv = argv;
> > +       info.prefix = prefix;
> > +       info.quiet = !!quiet;
> > +       info.recursive = !!recursive;
> 
> as you assign all fields of the struct yourself, you could also omit the
> static initialization via _INIT above.
> 
> 
> Apart from these two minor nits the code looks good to me.
> However we'd really want to have the bug fix patch as well.
> (At the time of submission of a patch we should not be aware
> of any tests failing, which we are without said bug fix patch)
> 
> Thanks,
> Stefan

-- 
Brandon Williams

^ permalink raw reply	[relevance 16%]

* Re: [GSoC][PATCH v4 2/2] submodule: port subcommand foreach from shell to C
      [irrelevant]   ` <20170521125814.26255-2-pc44800@gmail.com>
  2017-05-22 20:04     ` Re: [GSoC][PATCH v4 2/2] submodule: port subcommand foreach from shell to C Stefan Beller
@ 2017-05-23 19:36     ` Brandon Williams
  2017-05-23 20:57       ` Stefan Beller
  2017-05-26 15:17     ` [GSoC][PATCH v5 1/3] submodule: fix buggy $path and $sm_path variable's value Prathamesh Chavan
  2 siblings, 1 reply; 200+ results
From: Brandon Williams @ 2017-05-23 19:36 UTC (permalink / raw)
  To: Prathamesh Chavan; +Cc: git, sbeller, christian.couder, peff, ramsay

On 05/21, Prathamesh Chavan wrote:
> This aims to make git-submodule foreach a builtin. This is the very
> first step taken in this direction. Hence, 'foreach' is ported to
> submodule--helper, and submodule--helper is called from git-submodule.sh.
> The code is split up to have one function to obtain all the list of
> submodules. This function acts as the front-end of git-submodule foreach
> subcommand. It calls the function for_each_submodule_list, which basically
> loops through the list and calls function fn, which in this case is
> runcommand_in_submodule. This third function is a calling function that
> takes care of running the command in that submodule, and recursively
> perform the same when --recursive is flagged.
> 
> The first function module_foreach first parses the options present in
> argv, and then with the help of module_list_compute, generates the list of
> submodules present in the current working tree.
> 
> The second function for_each_submodule_list traverses through the
> list, and calls function fn (which in case of submodule subcommand
> foreach is runcommand_in_submodule) is called for each entry.
> 
> The third function runcommand_in_submodule, generates a submodule struct sub
> for $name, value and then later prepends name=sub->name; and other
> value assignment to the env argv_array structure of a child_process.
> Also the <command> of submodule-foreach is push to args argv_array
> structure and finally, using run_command the commands are executed
> using a shell.
> 
> The third function also takes care of the recursive flag, by creating
> a separate child_process structure and prepending "--super-prefix displaypath",
> to the args argv_array structure. Other required arguments and the
> input <command> of submodule-foreach is also appended to this argv_array.
> 
> The commit 1c4fb136db (submodule foreach: skip eval for more than one
> argument, 2013-09-27), which explains that why for the case when argc>1,
> we do not use eval. But since in this patch, we are calling the
> command in a separate shell itself for all values of argc, this case
> is not considered separately.
> 
> Both env variable $path and $sm_path were added since both are used in
> tests in t7407.
> 
> Helped-by: Brandon Williams <bmwill@google.com>
> Mentored-by: Christian Couder <christian.couder@gmail.com>
> Mentored-by: Stefan Beller <sbeller@google.com>
> Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
> ---
> This series of patch is based on gitster/jk/bug-to-abort for untilizing its
> BUG() macro.
> 
> In this new version of patch, a new function
> get_submodule_displaypath is introduced, which is the same one
> as that in the patch series for porting of submodule subcommand
> status. I had to again introduce this in this patch as well as
> I am working on two separate branches for parting of each function.
> Also, the function for_each_submodule_list repeats for the same
> reason.
> 
> I have pushed this work on Github at:
> https://github.com/pratham-pc/git/commits/foreach
> 
> Its build report is available at:
> https://travis-ci.org/pratham-pc/git/builds/
> Branch: foreach
> Build #67
> 
> I have also made some changes in git-submodule.sh for correcting
> the $path variable. And hence made the corresponding changes in
> the new test introduced in t7407-submodule-foreach as well.
> I have push this work at:
> https://github.com/pratham-pc/git/commits/foreach-bug-fixed
> 
> Its build report is available at:
> https://travis-ci.org/pratham-pc/git/builds/
> Branch: foreach-bug-fixed
> Build #66
> 
>  builtin/submodule--helper.c | 142 ++++++++++++++++++++++++++++++++++++++++++++
>  git-submodule.sh            |  39 +-----------
>  2 files changed, 143 insertions(+), 38 deletions(-)
> 
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index 566a5b6a6..4e19beaff 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -13,6 +13,8 @@
>  #include "refs.h"
>  #include "connect.h"
>  
> +typedef void (*submodule_list_func_t)(const struct cache_entry *list_item, void *cb_data);
> +
>  static char *get_default_remote(void)
>  {
>  	char *dest = NULL, *ret;
> @@ -219,6 +221,23 @@ static int resolve_relative_url_test(int argc, const char **argv, const char *pr
>  	return 0;
>  }
>  
> +static char *get_submodule_displaypath(const char *path, const char *prefix)
> +{
> +	const char *super_prefix = get_super_prefix();
> +
> +	if (prefix && super_prefix) {
> +		BUG("cannot have prefix '%s' and superprefix '%s'",
> +		    prefix, super_prefix);
> +	} else if (prefix) {
> +		struct strbuf sb = STRBUF_INIT;
> +		return xstrdup(relative_path(path, prefix, &sb));

You have a potential memory leak here, you need to release the strbuf
before returning.

> +	} else if (super_prefix) {
> +		return xstrfmt("%s/%s", super_prefix, path);
> +	} else {
> +		return xstrdup(path);
> +	}
> +}
> +
>  struct module_list {
>  	const struct cache_entry **entries;
>  	int alloc, nr;
> @@ -331,6 +350,15 @@ static int module_list(int argc, const char **argv, const char *prefix)
>  	return 0;
>  }
>  
> +static void for_each_submodule_list(const struct module_list list, submodule_list_func_t fn, void *cb_data)

nit: You could probably break this line so its not longer than 80 chars.

> +{
> +	int i;
> +	for (i = 0; i < list.nr; i++)
> +		fn(list.entries[i], cb_data);
> +
> +	return;

No return needed.

> +}

small nit, and not that important, but could this function potentially
be moved closer to where it is used? What as the rational for placing it
here?

> +
>  static void init_submodule(const char *path, const char *prefix, int quiet)
>  {
>  	const struct submodule *sub;
> @@ -487,6 +515,119 @@ static int module_name(int argc, const char **argv, const char *prefix)
>  	return 0;
>  }
>  
> +struct cb_foreach {
> +	int argc;
> +	const char **argv;
> +	const char *prefix;
> +	unsigned int quiet: 1;
> +	unsigned int recursive: 1;
> +};
> +#define CB_FOREACH_INIT { 0, NULL, 0, 0 }

Need an extra NULL as Stefan pointed out:
  { 0, NULL, NULL, 0, 0 }

> +
> +static void runcommand_in_submodule(const struct cache_entry *list_item, void *cb_data)
> +{
> +	struct cb_foreach *info = cb_data;
> +	char *toplevel = xgetcwd();
> +	const struct submodule *sub;
> +	struct child_process cp = CHILD_PROCESS_INIT;
> +	char* displaypath = NULL;
> +	int i;
> +
> +	/* Only loads from .gitmodules, no overlay with .git/config */
> +	gitmodules_config();
> +
> +	displaypath = get_submodule_displaypath(list_item->name, info->prefix);
> +
> +	sub = submodule_from_path(null_sha1, list_item->name);
> +
> +	if (!sub)
> +		die(_("No url found for submodule path '%s' in .gitmodules"),
> +		      displaypath);
> +
> +	prepare_submodule_repo_env(&cp.env_array);
> +	cp.use_shell = 1;
> +	cp.dir = list_item->name;
> +
> +	argv_array_pushf(&cp.env_array, "name=%s", sub->name);
> +	argv_array_pushf(&cp.env_array, "sm_path=%s", displaypath);
> +	argv_array_pushf(&cp.env_array, "path=%s", list_item->name);
> +	argv_array_pushf(&cp.env_array, "sha1=%s", oid_to_hex(&list_item->oid));
> +	argv_array_pushf(&cp.env_array, "toplevel=%s", toplevel);
> +
> +	for (i = 0; i < info->argc; i++)
> +		argv_array_push(&cp.args, info->argv[i]);
> +
> +	if (!is_submodule_populated_gently(list_item->name, NULL))
> +		return;

This check needs to be hoisted up probably before calculating the
display path, otherwise you have a bunch of memory leaks that need to be
plugged. Something like:

    +	sub = submodule_from_path(null_sha1, list_item->name);
    +
    +	if (!sub)
    +		die(_("No url found for submodule path '%s' in .gitmodules"),
    +		      displaypath);
    +
    +	if (!is_submodule_populated_gently(list_item->name, NULL))
    +		return;
    +
    +	displaypath = get_submodule_displaypath(list_item->name, info->prefix);

> +
> +	if (!info->quiet)
> +		printf(_("Entering '%s'\n"), displaypath);
> +	if (info->argv[0] && run_command(&cp))
> +		die(_("run_command returned non-zero status for %s\n."), displaypath);
> +
> +	if (info->recursive) {
> +		struct child_process cpr = CHILD_PROCESS_INIT;
> +
> +		cpr.use_shell = 1;

You can set .git_cmd = 1 instead.

> +		cpr.dir = list_item->name;
> +		prepare_submodule_repo_env(&cpr.env_array);
> +
> +		argv_array_pushl(&cpr.args, "git", "--super-prefix", displaypath,

And then you don't need to include "git" here.

> +				 "submodule--helper", "foreach", "--recursive", NULL);
> +
> +		if (info->quiet)
> +			argv_array_push(&cpr.args, "--quiet");
> +
> +		for (i = 0; i < info->argc; i++)
> +			argv_array_push(&cpr.args, info->argv[i]);
> +
> +		if (run_command(&cpr))
> +			die(_("run_command returned non-zero status while \
> +			      recuring in the nested submodules of %s\n."),

If you're going to split these two lines up then it may make more sense
to use concatenation instead of a continuation '\'.  I'm not sure how
the spaces at the beginning of the line will look when printed.
Something like this:

    +			die(_("run_command returned non-zero status while"
    +			      "recursing in the nested submodules of %s\n."),


Also s/recuring/recursing

> +			      displaypath);
> +	}
> +
> +	free(displaypath);
> +	free(toplevel);
> +}
> +
> +static int module_foreach(int argc, const char **argv, const char *prefix)
> +{
> +	struct cb_foreach info = CB_FOREACH_INIT;
> +	struct pathspec pathspec;
> +	struct module_list list = MODULE_LIST_INIT;
> +	int quiet = 0;
> +	int recursive = 0;
> +
> +	struct option module_foreach_options[] = {
> +		OPT__QUIET(&quiet, N_("Suppress output of entering each submodule command")),
> +		OPT_BOOL(0, "recursive", &recursive,
> +			 N_("Recurse into nested submodules")),
> +		OPT_END()
> +	};
> +
> +	const char *const git_submodule_helper_usage[] = {
> +		N_("git submodule--helper foreach [--quiet] [--recursive] <command>"),
> +		NULL
> +	};
> +
> +	argc = parse_options(argc, argv, prefix, module_foreach_options,
> +			     git_submodule_helper_usage, PARSE_OPT_KEEP_UNKNOWN);
> +
> +	if (module_list_compute(0, NULL, prefix, &pathspec, &list) < 0)
> +		die("BUG: module_list_compute should not choke on empty pathspec");

You mentioned using the 'BUG()' function call, and you use it up above,
so why not use it here too.

> +
> +	info.argc = argc;
> +	info.argv = argv;
> +	info.prefix = prefix;
> +	info.quiet = !!quiet;
> +	info.recursive = !!recursive;

If these values are boolean why do we need to do the extra '!!'?

> +
> +	for_each_submodule_list(list, runcommand_in_submodule, &info);
> +
> +	return 0;
> +}
> +
>  static int clone_submodule(const char *path, const char *gitdir, const char *url,
>  			   const char *depth, struct string_list *reference,
>  			   int quiet, int progress)
> @@ -1212,6 +1353,7 @@ static struct cmd_struct commands[] = {
>  	{"relative-path", resolve_relative_path, 0},
>  	{"resolve-relative-url", resolve_relative_url, 0},
>  	{"resolve-relative-url-test", resolve_relative_url_test, 0},
> +	{"foreach", module_foreach, SUPPORT_SUPER_PREFIX},
>  	{"init", module_init, SUPPORT_SUPER_PREFIX},
>  	{"remote-branch", resolve_remote_submodule_branch, 0},
>  	{"push-check", push_check, 0},
> diff --git a/git-submodule.sh b/git-submodule.sh
> index c0d0e9a4c..032fd2540 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -322,45 +322,8 @@ cmd_foreach()
>  		shift
>  	done
>  
> -	toplevel=$(pwd)
> +	git ${wt_prefix:+-C "$wt_prefix"} ${prefix:+--super-prefix "$prefix"} submodule--helper foreach ${GIT_QUIET:+--quiet} ${recursive:+--recursive} "$@"
>  
> -	# dup stdin so that it can be restored when running the external
> -	# command in the subshell (and a recursive call to this function)
> -	exec 3<&0
> -
> -	{
> -		git submodule--helper list --prefix "$wt_prefix" ||
> -		echo "#unmatched" $?
> -	} |
> -	while read -r mode sha1 stage sm_path
> -	do
> -		die_if_unmatched "$mode" "$sha1"
> -		if test -e "$sm_path"/.git
> -		then
> -			displaypath=$(git submodule--helper relative-path "$prefix$sm_path" "$wt_prefix")
> -			say "$(eval_gettext "Entering '\$displaypath'")"
> -			name=$(git submodule--helper name "$sm_path")
> -			(
> -				prefix="$prefix$sm_path/"
> -				sanitize_submodule_env
> -				cd "$sm_path" &&
> -				sm_path=$(git submodule--helper relative-path "$sm_path" "$wt_prefix") &&
> -				# we make $path available to scripts ...
> -				path=$sm_path &&
> -				if test $# -eq 1
> -				then
> -					eval "$1"
> -				else
> -					"$@"
> -				fi &&
> -				if test -n "$recursive"
> -				then
> -					cmd_foreach "--recursive" "$@"
> -				fi
> -			) <&3 3<&- ||
> -			die "$(eval_gettext "Stopping at '\$displaypath'; script returned non-zero status.")"
> -		fi
> -	done
>  }
>  
>  #
> -- 
> 2.11.0
> 

-- 
Brandon Williams

^ permalink raw reply	[relevance 14%]

* Re: What's cooking in git.git (May 2017, #07; Tue, 23)
  2017-05-23 19:08 ` Re: What's cooking in git.git (May 2017, #07; Tue, 23) Stefan Beller
@ 2017-05-23 19:38   ` Stefan Beller
  2017-05-23 19:50     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-05-23 19:38 UTC (permalink / raw)
  To: Junio C Hamano, Ævar Arnfjörð Bjarmason; +Cc: git

On Tue, May 23, 2017 at 12:08 PM, Stefan Beller <sbeller@google.com> wrote:
> On Tue, May 23, 2017 at 1:08 AM, Junio C Hamano <gitster@pobox.com> wrote:
>
>> * sb/submodule-blanket-recursive (2017-05-23) 6 commits
>>  . builtin/push.c: respect 'submodule.recurse' option
>>  . builtin/grep.c: respect 'submodule.recurse' option
>>  . builtin/fetch.c: respect 'submodule.recurse' option
>>  . Introduce submodule.recurse option for worktree manipulators
>>  . submodule test invocation: only pass additional arguments
>>  . submodule.c: add has_submodules to check if we have any submodules
>>  (this branch uses sb/reset-recurse-submodules.)
>>
>>  A new configuration variable "submodule.recurse" can be set to true
>>  to force various commands run at the top-level superproject to
>>  behave as if they were invoked with the "--recurse-submodules"
>>  option.
>>
>>  Seems to break t7814 when merged to 'pu'.
>
> I will investigate! (It passes on its own, so I guess it is some
> interference with a recent grep series)

And the winner is 5d52a30eda (grep: amend submodule recursion
test for regex engine testing, 2017-05-20, by Ævar)

The tests added by grep rely on the old content of
test 2 'grep correctly finds patterns in a submodule'.

The (whitespace broken) diff below fixes it.
I think the best way forward is that my series relies on
that series as a foundation then, and writes correct tests based
on the file contents at that version.

---8<---
diff --git a/t/t7814-grep-recurse-submodules.sh
b/t/t7814-grep-recurse-submodules.sh
index 14eeb54b4b..ce9fbbc1f6 100755
--- a/t/t7814-grep-recurse-submodules.sh
+++ b/t/t7814-grep-recurse-submodules.sh
@@ -36,18 +36,18 @@ test_expect_success 'grep correctly finds patterns
in a submodule' '
 test_expect_success 'grep finds patterns in a submodule via config' '
        test_config submodule.recurse true &&
        # expect from previous test
-       git grep -e "bar" >actual &&
+       git grep -e3 >actual &&
        test_cmp expect actual
 '

 test_expect_success 'grep --no-recurse-submodules overrides config' '
        test_config submodule.recurse true &&
        cat >expect <<-\EOF &&
-       a:foobar
-       b/b:bar
+       a:(1|2)d(3|4)
+       b/b:(3|4)
        EOF

-       git grep -e "bar" --no-recurse-submodules >actual &&
+       git grep -e4 --no-recurse-submodules >actual &&
        test_cmp expect actual
 '

---8<---

Thanks,
Stefan

^ permalink raw reply	[relevance 27%]

* Re: What's cooking in git.git (May 2017, #07; Tue, 23)
  2017-05-23 19:38   ` Stefan Beller
@ 2017-05-23 19:50     ` Ævar Arnfjörð Bjarmason
      [irrelevant]       ` <xmqq60gpfvqj.fsf@gitster.mtv.corp.google.com>
  0 siblings, 1 reply; 200+ results
From: Ævar Arnfjörð Bjarmason @ 2017-05-23 19:50 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Junio C Hamano, git

On Tue, May 23, 2017 at 9:38 PM, Stefan Beller <sbeller@google.com> wrote:
> On Tue, May 23, 2017 at 12:08 PM, Stefan Beller <sbeller@google.com> wrote:
>> On Tue, May 23, 2017 at 1:08 AM, Junio C Hamano <gitster@pobox.com> wrote:
>>
>>> * sb/submodule-blanket-recursive (2017-05-23) 6 commits
>>>  . builtin/push.c: respect 'submodule.recurse' option
>>>  . builtin/grep.c: respect 'submodule.recurse' option
>>>  . builtin/fetch.c: respect 'submodule.recurse' option
>>>  . Introduce submodule.recurse option for worktree manipulators
>>>  . submodule test invocation: only pass additional arguments
>>>  . submodule.c: add has_submodules to check if we have any submodules
>>>  (this branch uses sb/reset-recurse-submodules.)
>>>
>>>  A new configuration variable "submodule.recurse" can be set to true
>>>  to force various commands run at the top-level superproject to
>>>  behave as if they were invoked with the "--recurse-submodules"
>>>  option.
>>>
>>>  Seems to break t7814 when merged to 'pu'.
>>
>> I will investigate! (It passes on its own, so I guess it is some
>> interference with a recent grep series)
>
> And the winner is 5d52a30eda (grep: amend submodule recursion
> test for regex engine testing, 2017-05-20, by Ævar)
>
> The tests added by grep rely on the old content of
> test 2 'grep correctly finds patterns in a submodule'.

Sorry about the fallout.

> The (whitespace broken) diff below fixes it.
> I think the best way forward is that my series relies on
> that series as a foundation then, and writes correct tests based
> on the file contents at that version.
>
> ---8<---
> diff --git a/t/t7814-grep-recurse-submodules.sh
> b/t/t7814-grep-recurse-submodules.sh
> index 14eeb54b4b..ce9fbbc1f6 100755
> --- a/t/t7814-grep-recurse-submodules.sh
> +++ b/t/t7814-grep-recurse-submodules.sh
> @@ -36,18 +36,18 @@ test_expect_success 'grep correctly finds patterns
> in a submodule' '
>  test_expect_success 'grep finds patterns in a submodule via config' '
>         test_config submodule.recurse true &&
>         # expect from previous test
> -       git grep -e "bar" >actual &&
> +       git grep -e3 >actual &&
>         test_cmp expect actual
>  '
>
>  test_expect_success 'grep --no-recurse-submodules overrides config' '
>         test_config submodule.recurse true &&
>         cat >expect <<-\EOF &&
> -       a:foobar
> -       b/b:bar
> +       a:(1|2)d(3|4)
> +       b/b:(3|4)
>         EOF
>
> -       git grep -e "bar" --no-recurse-submodules >actual &&
> +       git grep -e4 --no-recurse-submodules >actual &&

The rest of my changed just did:

        foobar -> (1|2)d(3|4)
        foo    -> (1|2)
        bar    -> (3|4)

While this works might want to do e.g. `-e "(3|4)"` here like the
rest. This works, but probably confusing going forward when it's the
only exception.

>         test_cmp expect actual
>  '
> ---8<---
>
> Thanks,
> Stefan

^ permalink raw reply	[relevance 8%]

* Re: [GSoC][PATCH v4 2/2] submodule: port subcommand foreach from shell to C
  2017-05-23 19:36     ` Brandon Williams
@ 2017-05-23 20:57       ` Stefan Beller
  2017-05-23 21:05         ` Brandon Williams
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-05-23 20:57 UTC (permalink / raw)
  To: Brandon Williams; +Cc: Prathamesh Chavan, git, Christian Couder, Jeff King, Ramsay Jones

On Tue, May 23, 2017 at 12:36 PM, Brandon Williams <bmwill@google.com> wrote:
>
> You can set .git_cmd = 1 instead.
>
>> +             cpr.dir = list_item->name;
>> +             prepare_submodule_repo_env(&cpr.env_array);
>> +
>> +             argv_array_pushl(&cpr.args, "git", "--super-prefix", displaypath,
>
> And then you don't need to include "git" here.

even if git_cmd = 1 is set, you'd need a first dummy argument?
cf. find_unpushed_submodules, See comment in 9cfa1c260f
(serialize collection of refs that contain submodule changes, 2016-11-16)

>> +
>> +     info.argc = argc;
>> +     info.argv = argv;
>> +     info.prefix = prefix;
>> +     info.quiet = !!quiet;
>> +     info.recursive = !!recursive;
>
> If these values are boolean why do we need to do the extra '!!'?

Actually that was my advice. As we only have a limited space in a single
bit, strange things happen when you were to do:

    quiet = 2; /* be extra quiet */
    info.quiet = quiet;

This is not the case here, but other commands have evolved over time
to first take a OPT_BOOL, and then in a later patch an OPT_INT.
(some commands take a "-v -v -v")

And by having the double negative we'd have some defensive programming
right here. (To prove I am not telling crazy stories, $ git log -S \!\!)

^ permalink raw reply	[relevance 22%]

* Re: [GSoC][PATCH v4 2/2] submodule: port subcommand foreach from shell to C
  2017-05-23 20:57       ` Stefan Beller
@ 2017-05-23 21:05         ` Brandon Williams
  0 siblings, 0 replies; 200+ results
From: Brandon Williams @ 2017-05-23 21:05 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Prathamesh Chavan, git, Christian Couder, Jeff King, Ramsay Jones

On 05/23, Stefan Beller wrote:
> On Tue, May 23, 2017 at 12:36 PM, Brandon Williams <bmwill@google.com> wrote:
> >
> > You can set .git_cmd = 1 instead.
> >
> >> +             cpr.dir = list_item->name;
> >> +             prepare_submodule_repo_env(&cpr.env_array);
> >> +
> >> +             argv_array_pushl(&cpr.args, "git", "--super-prefix", displaypath,
> >
> > And then you don't need to include "git" here.
> 
> even if git_cmd = 1 is set, you'd need a first dummy argument?
> cf. find_unpushed_submodules, See comment in 9cfa1c260f
> (serialize collection of refs that contain submodule changes, 2016-11-16)

Different subsystem, you don't need a dummy first argument.  The
revision walking code does (for some reason) need a dummy first
argument.

> 
> >> +
> >> +     info.argc = argc;
> >> +     info.argv = argv;
> >> +     info.prefix = prefix;
> >> +     info.quiet = !!quiet;
> >> +     info.recursive = !!recursive;
> >
> > If these values are boolean why do we need to do the extra '!!'?
> 
> Actually that was my advice. As we only have a limited space in a single
> bit, strange things happen when you were to do:
> 
>     quiet = 2; /* be extra quiet */
>     info.quiet = quiet;
> 
> This is not the case here, but other commands have evolved over time
> to first take a OPT_BOOL, and then in a later patch an OPT_INT.
> (some commands take a "-v -v -v")
> 
> And by having the double negative we'd have some defensive programming
> right here. (To prove I am not telling crazy stories, $ git log -S \!\!)

All good, I didn't notice that they were bit fields.

-- 
Brandon Williams

^ permalink raw reply	[relevance 16%]

* [PATCHv5 00/17] Diff machine: highlight moved lines.
      [irrelevant] <20170523024048.16879-1-sbeller@google.com/>
@ 2017-05-24 21:40 ` Stefan Beller
  2017-05-24 21:40   ` [PATCHv5 09/17] submodule.c: convert show_submodule_summary to use emit_line_fmt Stefan Beller
  2017-05-24 21:40   ` [PATCHv5 17/17] diff.c: color moved lines differently Stefan Beller
  0 siblings, 2 replies; 200+ results
From: Stefan Beller @ 2017-05-24 21:40 UTC (permalink / raw)
  To: gitster; +Cc: git, bmwill, jrnieder, jonathantanmy, peff, mhagger, Stefan Beller

v5:
* removed the color passing to the submodule to make the tests pass again.
* fixed an indentation issue that was introduced from v3 -> v4.
* I merged it with origin/next and tests pass here.

Thanks,
Stefan

diff to v4:
diff --git a/diff.c b/diff.c
index 23e70d348e..1292d3c4ad 100644
--- a/diff.c
+++ b/diff.c
@@ -751,7 +751,7 @@ static void mark_color_as_moved(struct diff_options *o,
 }
 
 static void emit_diff_line(struct diff_options *o,
-				     struct diff_line *e)
+			   struct diff_line *e)
 {
 	const char *ws;
 	int has_trailing_newline, has_trailing_carriage_return;
@@ -804,7 +804,7 @@ static void emit_diff_line(struct diff_options *o,
 }
 
 static void append_diff_line(struct diff_options *o,
-				       struct diff_line *e)
+			     struct diff_line *e)
 {
 	struct diff_line *f;
 	ALLOC_GROW(o->line_buffer,
diff --git a/submodule.c b/submodule.c
index 428c996c97..19c63197fb 100644
--- a/submodule.c
+++ b/submodule.c
@@ -550,8 +550,6 @@ void show_submodule_inline_diff(struct diff_options *o, const char *path,
 
 	/* TODO: other options may need to be passed here. */
 	argv_array_push(&cp.args, "diff");
-	if (o->use_color)
-		argv_array_push(&cp.args, "--color=always");
 	argv_array_pushf(&cp.args, "--line-prefix=%s", diff_line_prefix(o));
 	if (DIFF_OPT_TST(o, REVERSE_DIFF)) {
 		argv_array_pushf(&cp.args, "--src-prefix=%s%s/",

v4:
* interdiff to v3 (what is currently origin/sb/diff-color-move) below.
* renamed the "buffered_patch_line" to "diff_line". Originally I planned
  to not carry the "line" part as it can be a piece of a line as well.
  But for the intended functionality it is best to keep the name.
  If we'd want to add more functionality to say have a move detection
  for words as well, we'd rename the struct to have a better name then.
  For now diff_line is the best. (Thanks Jonathan Nieder!)
* tests to demonstrate it doesn't mess with --color-words as well as
  submodules. (Thanks Jonathan Tan!)
* added in the statics (Thanks Ramsay!)
* smaller scope for the hashmaps (Thanks Jonathan Tan!)
* some commit messages were updated, prior patch 4-7 is squashed into one
  (Thanks Jonathan Tan!)
* the tests added revealed an actual fault: now that the submodule process
  is not attached to a dupe of our stdout, it would stop coloring the
  output. We need to pass on use-color explicitly.
* updated the NEEDSWORK comment in the second last patch.

Thanks for bearing,
Stefan

v3:
* see interdiff below.
* fixing one invalid computation (Thanks Junio!)
* I reasoned more about submodule and word diffing, see the commit message
  of the last patch:
  
    A note on the options '--submodule=diff' and '--color-words/--word-diff':
    In the conversion to use emit_line in the prior patches both submodules
    as well as word diff output carefully chose to call emit_line with sign=0.
    All output with sign=0 is ignored for move detection purposes in this
    patch, such that no weird looking output will be generated for these
    cases. This leads to another thought: We could pass on '--color-moved' to
    submodules such that they color up moved lines for themselves. If we'd do
    so only line moves within a repository boundary are marked up.

* better name for emit_line outside of diff.[ch]

v2:
* emit_line now takes an argument that indicates if we want it
  to emit the line prefix as well. This should allow for a more faithful
  refactoring in the beginning. (Thanks Jonathan!)
* fixed memleaks (Thanks Brandon!)
* "git -c color.moved=true log -p" works now! (Thanks Jeff)
* interdiff below, though it is large.
* less intrusive than v1 (Thanks Jonathan!)

v1:

For details on *why* see the commit message of the last commit.

The first five patches are slight refactorings to get into good
shape, the next patches are funneling all output through emit_line_*.

The second last patch introduces an option to buffer up all output
before printing, and then the last patch can color up moved lines
of code.

Any feedback welcome.

Thanks,
Stefan

Stefan Beller (17):
  diff: readability fix
  diff: move line ending check into emit_hunk_header
  diff.c: factor out diff_flush_patch_all_file_pairs
  diff: introduce more flexible emit function
  diff.c: convert fn_out_consume to use emit_line
  diff.c: convert builtin_diff to use emit_line_*
  diff.c: convert emit_rewrite_diff to use emit_line_*
  diff.c: convert emit_rewrite_lines to use emit_line_*
  submodule.c: convert show_submodule_summary to use emit_line_fmt
  diff.c: convert emit_binary_diff_body to use emit_line_*
  diff.c: convert show_stats to use emit_line_*
  diff.c: convert word diffing to use emit_line_*
  diff.c: convert diff_flush to use emit_line_*
  diff.c: convert diff_summary to use emit_line_*
  diff.c: emit_line includes whitespace highlighting
  diff: buffer all output if asked to
  diff.c: color moved lines differently

 Documentation/config.txt   |  14 +-
 diff.c                     | 858 +++++++++++++++++++++++++++++++++------------
 diff.h                     |  59 +++-
 submodule.c                |  87 ++---
 submodule.h                |   9 +-
 t/t4015-diff-whitespace.sh | 267 ++++++++++++++
 6 files changed, 1016 insertions(+), 278 deletions(-)

-- 
2.13.0.18.g7d86cc8ba0


^ permalink raw reply	[relevance 21%]

* [PATCHv5 09/17] submodule.c: convert show_submodule_summary to use emit_line_fmt
  2017-05-24 21:40 ` [PATCHv5 00/17] Diff machine: highlight moved lines. Stefan Beller
@ 2017-05-24 21:40   ` Stefan Beller
  2017-05-24 21:40   ` [PATCHv5 17/17] diff.c: color moved lines differently Stefan Beller
  1 sibling, 0 replies; 200+ results
From: Stefan Beller @ 2017-05-24 21:40 UTC (permalink / raw)
  To: gitster; +Cc: git, bmwill, jrnieder, jonathantanmy, peff, mhagger, Stefan Beller

In a later patch, I want to propose an option to detect&color
moved lines in a diff, which cannot be done in a one-pass over
the diff. Instead we need to go over the whole diff twice,
because we cannot detect the first line of the two corresponding
lines (+ and -) that got moved.

So to prepare the diff machinery for two pass algorithms
(i.e. buffer it all up and then operate on the result),
move all emissions to places, such that the only emitting
function is emit_line_0.

This prepares the code for submodules to go through the emit_line
function.

As the submodule process is no longer attached to the same stdout as
the superprojects process, one might imagine that we would need to
pass on the usage of colors explicitly as the subprocess can no longer
determine where the output is going to land eventually. But this is not
the case. Apparently coloring submodule diffs never worked, so defer
the submodule diff coloring to a future patch series.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 diff.c      | 14 ++++++----
 diff.h      |  3 +++
 submodule.c | 87 ++++++++++++++++++++++++++++++++-----------------------------
 submodule.h |  9 +++----
 4 files changed, 61 insertions(+), 52 deletions(-)

diff --git a/diff.c b/diff.c
index ca6b48cf49..3357c0fca3 100644
--- a/diff.c
+++ b/diff.c
@@ -562,6 +562,12 @@ static void emit_line_fmt(struct diff_options *o,
 	strbuf_release(&sb);
 }
 
+void diff_emit_line(struct diff_options *o, const char *set, const char *reset,
+		    const char *line, int len)
+{
+	emit_line(o, set, reset, 1, 0, line, len);
+}
+
 static int new_blank_line_at_eof(struct emit_callback *ecbdata, const char *line, int len)
 {
 	if (!((ecbdata->ws_rule & WS_BLANK_AT_EOF) &&
@@ -2384,8 +2390,7 @@ static void builtin_diff(const char *name_a,
 	    (!two->mode || S_ISGITLINK(two->mode))) {
 		const char *del = diff_get_color_opt(o, DIFF_FILE_OLD);
 		const char *add = diff_get_color_opt(o, DIFF_FILE_NEW);
-		show_submodule_summary(o->file, one->path ? one->path : two->path,
-				line_prefix,
+		show_submodule_summary(o, one->path ? one->path : two->path,
 				&one->oid, &two->oid,
 				two->dirty_submodule,
 				meta, del, add, reset);
@@ -2395,11 +2400,10 @@ static void builtin_diff(const char *name_a,
 		   (!two->mode || S_ISGITLINK(two->mode))) {
 		const char *del = diff_get_color_opt(o, DIFF_FILE_OLD);
 		const char *add = diff_get_color_opt(o, DIFF_FILE_NEW);
-		show_submodule_inline_diff(o->file, one->path ? one->path : two->path,
-				line_prefix,
+		show_submodule_inline_diff(o, one->path ? one->path : two->path,
 				&one->oid, &two->oid,
 				two->dirty_submodule,
-				meta, del, add, reset, o);
+				meta, del, add, reset);
 		return;
 	}
 
diff --git a/diff.h b/diff.h
index 5be1ee77a7..9ad546361a 100644
--- a/diff.h
+++ b/diff.h
@@ -188,6 +188,9 @@ struct diff_options {
 	int diff_path_counter;
 };
 
+void diff_emit_line(struct diff_options *o, const char *set, const char *reset,
+		    const char *line, int len);
+
 enum color_diff {
 	DIFF_RESET = 0,
 	DIFF_CONTEXT = 1,
diff --git a/submodule.c b/submodule.c
index d3299e29c0..19c63197fb 100644
--- a/submodule.c
+++ b/submodule.c
@@ -362,8 +362,8 @@ static int prepare_submodule_summary(struct rev_info *rev, const char *path,
 	return prepare_revision_walk(rev);
 }
 
-static void print_submodule_summary(struct rev_info *rev, FILE *f,
-		const char *line_prefix,
+static void print_submodule_summary(struct rev_info *rev,
+		struct diff_options *o,
 		const char *del, const char *add, const char *reset)
 {
 	static const char format[] = "  %m %s";
@@ -375,18 +375,12 @@ static void print_submodule_summary(struct rev_info *rev, FILE *f,
 		ctx.date_mode = rev->date_mode;
 		ctx.output_encoding = get_log_output_encoding();
 		strbuf_setlen(&sb, 0);
-		strbuf_addstr(&sb, line_prefix);
-		if (commit->object.flags & SYMMETRIC_LEFT) {
-			if (del)
-				strbuf_addstr(&sb, del);
-		}
-		else if (add)
-			strbuf_addstr(&sb, add);
 		format_commit_message(commit, format, &sb, &ctx);
-		if (reset)
-			strbuf_addstr(&sb, reset);
 		strbuf_addch(&sb, '\n');
-		fprintf(f, "%s", sb.buf);
+		if (commit->object.flags & SYMMETRIC_LEFT)
+			diff_emit_line(o, del, reset, sb.buf, sb.len);
+		else if (add)
+			diff_emit_line(o, add, reset, sb.buf, sb.len);
 	}
 	strbuf_release(&sb);
 }
@@ -413,8 +407,7 @@ void prepare_submodule_repo_env(struct argv_array *out)
  * attempt to lookup both the left and right commits and put them into the
  * left and right pointers.
  */
-static void show_submodule_header(FILE *f, const char *path,
-		const char *line_prefix,
+static void show_submodule_header(struct diff_options *o, const char *path,
 		struct object_id *one, struct object_id *two,
 		unsigned dirty_submodule, const char *meta,
 		const char *reset,
@@ -425,12 +418,17 @@ static void show_submodule_header(FILE *f, const char *path,
 	struct strbuf sb = STRBUF_INIT;
 	int fast_forward = 0, fast_backward = 0;
 
-	if (dirty_submodule & DIRTY_SUBMODULE_UNTRACKED)
-		fprintf(f, "%sSubmodule %s contains untracked content\n",
-			line_prefix, path);
-	if (dirty_submodule & DIRTY_SUBMODULE_MODIFIED)
-		fprintf(f, "%sSubmodule %s contains modified content\n",
-			line_prefix, path);
+	if (dirty_submodule & DIRTY_SUBMODULE_UNTRACKED) {
+		strbuf_addf(&sb, "Submodule %s contains untracked content\n", path);
+		diff_emit_line(o, NULL, NULL, sb.buf, sb.len);
+		strbuf_reset(&sb);
+	}
+
+	if (dirty_submodule & DIRTY_SUBMODULE_MODIFIED) {
+		strbuf_addf(&sb, "Submodule %s contains modified content\n", path);
+		diff_emit_line(o, NULL, NULL, sb.buf, sb.len);
+		strbuf_reset(&sb);
+	}
 
 	if (is_null_oid(one))
 		message = "(new submodule)";
@@ -472,21 +470,20 @@ static void show_submodule_header(FILE *f, const char *path,
 	}
 
 output_header:
-	strbuf_addf(&sb, "%s%sSubmodule %s ", line_prefix, meta, path);
+	strbuf_addf(&sb, "Submodule %s ", path);
 	strbuf_add_unique_abbrev(&sb, one->hash, DEFAULT_ABBREV);
 	strbuf_addstr(&sb, (fast_backward || fast_forward) ? ".." : "...");
 	strbuf_add_unique_abbrev(&sb, two->hash, DEFAULT_ABBREV);
 	if (message)
-		strbuf_addf(&sb, " %s%s\n", message, reset);
+		strbuf_addf(&sb, " %s\n", message);
 	else
-		strbuf_addf(&sb, "%s:%s\n", fast_backward ? " (rewind)" : "", reset);
-	fwrite(sb.buf, sb.len, 1, f);
+		strbuf_addf(&sb, "%s:\n", fast_backward ? " (rewind)" : "");
+	diff_emit_line(o, meta, reset, sb.buf, sb.len);
 
 	strbuf_release(&sb);
 }
 
-void show_submodule_summary(FILE *f, const char *path,
-		const char *line_prefix,
+void show_submodule_summary(struct diff_options *o, const char *path,
 		struct object_id *one, struct object_id *two,
 		unsigned dirty_submodule, const char *meta,
 		const char *del, const char *add, const char *reset)
@@ -495,7 +492,7 @@ void show_submodule_summary(FILE *f, const char *path,
 	struct commit *left = NULL, *right = NULL;
 	struct commit_list *merge_bases = NULL;
 
-	show_submodule_header(f, path, line_prefix, one, two, dirty_submodule,
+	show_submodule_header(o, path, one, two, dirty_submodule,
 			      meta, reset, &left, &right, &merge_bases);
 
 	/*
@@ -508,11 +505,12 @@ void show_submodule_summary(FILE *f, const char *path,
 
 	/* Treat revision walker failure the same as missing commits */
 	if (prepare_submodule_summary(&rev, path, left, right, merge_bases)) {
-		fprintf(f, "%s(revision walker failed)\n", line_prefix);
+		const char *error = "(revision walker failed)\n";
+		diff_emit_line(o, NULL, NULL, error, strlen(error));
 		goto out;
 	}
 
-	print_submodule_summary(&rev, f, line_prefix, del, add, reset);
+	print_submodule_summary(&rev, o, del, add, reset);
 
 out:
 	if (merge_bases)
@@ -521,20 +519,18 @@ void show_submodule_summary(FILE *f, const char *path,
 	clear_commit_marks(right, ~0);
 }
 
-void show_submodule_inline_diff(FILE *f, const char *path,
-		const char *line_prefix,
+void show_submodule_inline_diff(struct diff_options *o, const char *path,
 		struct object_id *one, struct object_id *two,
 		unsigned dirty_submodule, const char *meta,
-		const char *del, const char *add, const char *reset,
-		const struct diff_options *o)
+		const char *del, const char *add, const char *reset)
 {
 	const struct object_id *old = &empty_tree_oid, *new = &empty_tree_oid;
 	struct commit *left = NULL, *right = NULL;
 	struct commit_list *merge_bases = NULL;
-	struct strbuf submodule_dir = STRBUF_INIT;
 	struct child_process cp = CHILD_PROCESS_INIT;
+	struct strbuf sb = STRBUF_INIT;
 
-	show_submodule_header(f, path, line_prefix, one, two, dirty_submodule,
+	show_submodule_header(o, path, one, two, dirty_submodule,
 			      meta, reset, &left, &right, &merge_bases);
 
 	/* We need a valid left and right commit to display a difference */
@@ -547,15 +543,14 @@ void show_submodule_inline_diff(FILE *f, const char *path,
 	if (right)
 		new = two;
 
-	fflush(f);
 	cp.git_cmd = 1;
 	cp.dir = path;
-	cp.out = dup(fileno(f));
+	cp.out = -1;
 	cp.no_stdin = 1;
 
 	/* TODO: other options may need to be passed here. */
 	argv_array_push(&cp.args, "diff");
-	argv_array_pushf(&cp.args, "--line-prefix=%s", line_prefix);
+	argv_array_pushf(&cp.args, "--line-prefix=%s", diff_line_prefix(o));
 	if (DIFF_OPT_TST(o, REVERSE_DIFF)) {
 		argv_array_pushf(&cp.args, "--src-prefix=%s%s/",
 				 o->b_prefix, path);
@@ -578,11 +573,21 @@ void show_submodule_inline_diff(FILE *f, const char *path,
 		argv_array_push(&cp.args, oid_to_hex(new));
 
 	prepare_submodule_repo_env(&cp.env_array);
-	if (run_command(&cp))
-		fprintf(f, "(diff failed)\n");
+	if (start_command(&cp)) {
+		const char *error = "(diff failed)\n";
+		diff_emit_line(o, NULL, NULL, error, strlen(error));
+	}
+
+	while (strbuf_getwholeline_fd(&sb, cp.out, '\n') != EOF)
+		diff_emit_line(o, NULL, NULL, sb.buf, sb.len);
+
+	if (finish_command(&cp)) {
+		const char *error = "(diff failed)\n";
+		diff_emit_line(o, NULL, NULL, error, strlen(error));
+	}
 
 done:
-	strbuf_release(&submodule_dir);
+	strbuf_release(&sb);
 	if (merge_bases)
 		free_commit_list(merge_bases);
 	if (left)
diff --git a/submodule.h b/submodule.h
index 1277480add..9df0a3aea2 100644
--- a/submodule.h
+++ b/submodule.h
@@ -53,17 +53,14 @@ extern int parse_submodule_update_strategy(const char *value,
 		struct submodule_update_strategy *dst);
 extern const char *submodule_strategy_to_string(const struct submodule_update_strategy *s);
 extern void handle_ignore_submodules_arg(struct diff_options *, const char *);
-extern void show_submodule_summary(FILE *f, const char *path,
-		const char *line_prefix,
+extern void show_submodule_summary(struct diff_options *o, const char *path,
 		struct object_id *one, struct object_id *two,
 		unsigned dirty_submodule, const char *meta,
 		const char *del, const char *add, const char *reset);
-extern void show_submodule_inline_diff(FILE *f, const char *path,
-		const char *line_prefix,
+extern void show_submodule_inline_diff(struct diff_options *o, const char *path,
 		struct object_id *one, struct object_id *two,
 		unsigned dirty_submodule, const char *meta,
-		const char *del, const char *add, const char *reset,
-		const struct diff_options *opt);
+		const char *del, const char *add, const char *reset);
 extern void set_config_fetch_recurse_submodules(int value);
 extern void set_config_update_recurse_submodules(int value);
 /* Check if we want to update any submodule.*/
-- 
2.13.0.18.g7d86cc8ba0


^ permalink raw reply	[relevance 18%]

* [PATCHv5 17/17] diff.c: color moved lines differently
  2017-05-24 21:40 ` [PATCHv5 00/17] Diff machine: highlight moved lines. Stefan Beller
  2017-05-24 21:40   ` [PATCHv5 09/17] submodule.c: convert show_submodule_summary to use emit_line_fmt Stefan Beller
@ 2017-05-24 21:40   ` Stefan Beller
  2017-05-25  2:27     ` Junio C Hamano
  1 sibling, 1 reply; 200+ results
From: Stefan Beller @ 2017-05-24 21:40 UTC (permalink / raw)
  To: gitster; +Cc: git, bmwill, jrnieder, jonathantanmy, peff, mhagger, Stefan Beller

When a patch consists mostly of moving blocks of code around, it can
be quite tedious to ensure that the blocks are moved verbatim, and not
undesirably modified in the move. To that end, color blocks that are
moved within the same patch differently. For example (OM, del, add,
and NM are different colors):

    [OM]  -void sensitive_stuff(void)
    [OM]  -{
    [OM]  -        if (!is_authorized_user())
    [OM]  -                die("unauthorized");
    [OM]  -        sensitive_stuff(spanning,
    [OM]  -                        multiple,
    [OM]  -                        lines);
    [OM]  -}

           void another_function()
           {
    [del] -        printf("foo");
    [add] +        printf("bar");
           }

    [NM]  +void sensitive_stuff(void)
    [NM]  +{
    [NM]  +        if (!is_authorized_user())
    [NM]  +                die("unauthorized");
    [NM]  +        sensitive_stuff(spanning,
    [NM]  +                        multiple,
    [NM]  +                        lines);
    [NM]  +}

Adjacent blocks are colored differently. For example, in this
potentially malicious patch, the swapping of blocks can be spotted:

    [OM]  -void sensitive_stuff(void)
    [OM]  -{
    [OMA] -        if (!is_authorized_user())
    [OMA] -                die("unauthorized");
    [OM]  -        sensitive_stuff(spanning,
    [OM]  -                        multiple,
    [OM]  -                        lines);
    [OMA] -}

           void another_function()
           {
    [del] -        printf("foo");
    [add] +        printf("bar");
           }

    [NM]  +void sensitive_stuff(void)
    [NM]  +{
    [NMA] +        sensitive_stuff(spanning,
    [NMA] +                        multiple,
    [NMA] +                        lines);
    [NM]  +        if (!is_authorized_user())
    [NM]  +                die("unauthorized");
    [NMA] +}

If the moved code is larger, it is easier to hide some permutation in the
code, which is why the alternative coloring is really needed.

As the reviewers attention should be brought to the places, where the
difference is introduced to the moved code, we cannot just have one new
color for all of moved code.

First I implemented an alternative design, which would show a moved hunk
in one color, and its boundaries in another color. This idea was error
prone as it inspected each line and its neighboring lines to determine
if the line was (a) moved and (b) if was deep inside a hunk by having
matching neighboring lines. This is unreliable as the we can construct
hunks which have equal neighbors that just exceed the number of lines
inspected. (Think of 'AXYZBXYZCXYZD..' with each letter as a line, that
is permutated to AXYZCXYZBXYZD..').

Instead this provides a dynamic programming greedy algorithm that finds
the largest moved hunk and then switches color to the alternative color
for the next hunk. By doing this any permutation is recognized and
displayed. That implies that there is no dedicated boundary or
inside-hunk color, but instead we'll have just two colors alternating
for hunks.

It would be a bit more UX friendly if the two corresponding hunks
(of added and deleted lines) for one move would get the same color id.
(Both get "regular moved" or "alternative moved"). This problem is
deferred to a later patch for now.

A note on the options '--submodule=diff' and '--color-words/--word-diff':
In the conversion to use emit_line in the prior patches both submodules
as well as word diff output carefully chose to call emit_line with sign=0.
All output with sign=0 is ignored for move detection purposes in this
patch, such that no weird looking output will be generated for these
cases. This leads to another thought: We could pass on '--color-moved' to
submodules such that they color up moved lines for themselves. If we'd do
so only line moves within a repository boundary are marked up.

Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

# Conflicts:
#	diff.c
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/config.txt   |  14 ++-
 diff.c                     | 275 +++++++++++++++++++++++++++++++++++++++++++--
 diff.h                     |   9 +-
 t/t4015-diff-whitespace.sh | 267 +++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 552 insertions(+), 13 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 475e874d51..902d017c3b 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1051,14 +1051,24 @@ This does not affect linkgit:git-format-patch[1] or the
 'git-diff-{asterisk}' plumbing commands.  Can be overridden on the
 command line with the `--color[=<when>]` option.
 
+color.moved::
+	A boolean value, whether a diff should color moved lines
+	differently. The moved lines are searched for in the diff only.
+	Duplicated lines from somewhere in the project that are not
+	part of the diff are not colored as moved.
+	Defaults to false.
+
 color.diff.<slot>::
 	Use customized color for diff colorization.  `<slot>` specifies
 	which part of the patch to use the specified color, and is one
 	of `context` (context text - `plain` is a historical synonym),
 	`meta` (metainformation), `frag`
 	(hunk header), 'func' (function in hunk header), `old` (removed lines),
-	`new` (added lines), `commit` (commit headers), or `whitespace`
-	(highlighting whitespace errors).
+	`new` (added lines), `commit` (commit headers), `whitespace`
+	(highlighting whitespace errors), `oldMoved` (removed lines that
+	reappear), `newMoved` (added lines that were removed elsewhere),
+	`oldMovedAlternative` and `newMovedAlternative` (as a fallback to
+	cover adjacent blocks of moved code)
 
 color.decorate.<slot>::
 	Use customized color for 'git log --decorate' output.  `<slot>` is one
diff --git a/diff.c b/diff.c
index 8e06206881..1292d3c4ad 100644
--- a/diff.c
+++ b/diff.c
@@ -31,6 +31,7 @@ static int diff_indent_heuristic; /* experimental */
 static int diff_rename_limit_default = 400;
 static int diff_suppress_blank_empty;
 static int diff_use_color_default = -1;
+static int diff_color_moved_default;
 static int diff_context_default = 3;
 static int diff_interhunk_context_default;
 static const char *diff_word_regex_cfg;
@@ -55,6 +56,10 @@ static char diff_colors[][COLOR_MAXLEN] = {
 	GIT_COLOR_YELLOW,	/* COMMIT */
 	GIT_COLOR_BG_RED,	/* WHITESPACE */
 	GIT_COLOR_NORMAL,	/* FUNCINFO */
+	GIT_COLOR_BOLD_RED,	/* OLD_MOVED_A */
+	GIT_COLOR_BG_RED,	/* OLD_MOVED_B */
+	GIT_COLOR_BOLD_GREEN,	/* NEW_MOVED_A */
+	GIT_COLOR_BG_GREEN,	/* NEW_MOVED_B */
 };
 
 static NORETURN void die_want_option(const char *option_name)
@@ -80,6 +85,14 @@ static int parse_diff_color_slot(const char *var)
 		return DIFF_WHITESPACE;
 	if (!strcasecmp(var, "func"))
 		return DIFF_FUNCINFO;
+	if (!strcasecmp(var, "oldmoved"))
+		return DIFF_FILE_OLD_MOVED;
+	if (!strcasecmp(var, "oldmovedalternative"))
+		return DIFF_FILE_OLD_MOVED_ALT;
+	if (!strcasecmp(var, "newmoved"))
+		return DIFF_FILE_NEW_MOVED;
+	if (!strcasecmp(var, "newmovedalternative"))
+		return DIFF_FILE_NEW_MOVED_ALT;
 	return -1;
 }
 
@@ -234,6 +247,10 @@ int git_diff_ui_config(const char *var, const char *value, void *cb)
 		diff_use_color_default = git_config_colorbool(var, value);
 		return 0;
 	}
+	if (!strcmp(var, "color.moved")) {
+		diff_color_moved_default = git_config_bool(var, value);
+		return 0;
+	}
 	if (!strcmp(var, "diff.context")) {
 		diff_context_default = git_config_int(var, value);
 		if (diff_context_default < 0)
@@ -354,6 +371,88 @@ int git_diff_basic_config(const char *var, const char *value, void *cb)
 	return git_default_config(var, value, cb);
 }
 
+struct moved_entry {
+	struct hashmap_entry ent;
+	const struct diff_line *line;
+	struct moved_entry *next_line;
+};
+
+static void get_ws_cleaned_string(const struct diff_line *l,
+				  struct strbuf *out)
+{
+	int i;
+	for (i = 0; i < l->len; i++) {
+		if (isspace(l->line[i]))
+			continue;
+		strbuf_addch(out, l->line[i]);
+	}
+}
+
+static int diff_line_cmp_no_ws(const struct diff_line *a,
+					 const struct diff_line *b,
+					 const void *keydata)
+{
+	int ret;
+	struct strbuf sba = STRBUF_INIT;
+	struct strbuf sbb = STRBUF_INIT;
+
+	get_ws_cleaned_string(a, &sba);
+	get_ws_cleaned_string(b, &sbb);
+	ret = sba.len != sbb.len || strncmp(sba.buf, sbb.buf, sba.len);
+
+	strbuf_release(&sba);
+	strbuf_release(&sbb);
+	return ret;
+}
+
+static int diff_line_cmp(const struct diff_line *a,
+				   const struct diff_line *b,
+				   const void *keydata)
+{
+	return a->len != b->len || strncmp(a->line, b->line, a->len);
+}
+
+static int moved_entry_cmp(const struct moved_entry *a,
+			   const struct moved_entry *b,
+			   const void *keydata)
+{
+	return diff_line_cmp(a->line, b->line, keydata);
+}
+
+static int moved_entry_cmp_no_ws(const struct moved_entry *a,
+				 const struct moved_entry *b,
+				 const void *keydata)
+{
+	return diff_line_cmp_no_ws(a->line, b->line, keydata);
+}
+
+static unsigned get_line_hash(struct diff_line *line, unsigned ignore_ws)
+{
+	static struct strbuf sb = STRBUF_INIT;
+
+	if (ignore_ws) {
+		strbuf_reset(&sb);
+		get_ws_cleaned_string(line, &sb);
+		return memhash(sb.buf, sb.len);
+	} else {
+		return memhash(line->line, line->len);
+	}
+}
+
+static struct moved_entry *prepare_entry(struct diff_options *o,
+					 int line_no)
+{
+	struct moved_entry *ret = xmalloc(sizeof(*ret));
+	unsigned ignore_ws = DIFF_XDL_TST(o, IGNORE_WHITESPACE);
+	struct diff_line *l = &o->line_buffer[line_no];
+
+	ret->ent.hash = get_line_hash(l, ignore_ws);
+	ret->line = l;
+	ret->next_line = NULL;
+
+	return ret;
+}
+
 static char *quote_two(const char *one, const char *two)
 {
 	int need_one = quote_c_style(one, NULL, NULL, 1);
@@ -516,6 +615,141 @@ static void check_blank_at_eof(mmfile_t *mf1, mmfile_t *mf2,
 	ecbdata->blank_at_eof_in_postimage = (at - l2) + 1;
 }
 
+static void add_lines_to_move_detection(struct diff_options *o,
+					struct hashmap *add_lines,
+					struct hashmap *del_lines)
+{
+	struct moved_entry *prev_line = NULL;
+
+	int n;
+	for (n = 0; n < o->line_buffer_nr; n++) {
+		int sign = 0;
+		struct hashmap *hm;
+		struct moved_entry *key;
+
+		switch (o->line_buffer[n].sign) {
+		case '+':
+			sign = '+';
+			hm = add_lines;
+			break;
+		case '-':
+			sign = '-';
+			hm = del_lines;
+			break;
+		case ' ':
+		default:
+			prev_line = NULL;
+			continue;
+		}
+
+		key = prepare_entry(o, n);
+		if (prev_line &&
+		    prev_line->line->sign == sign)
+			prev_line->next_line = key;
+
+		hashmap_add(hm, key);
+		prev_line = key;
+	}
+}
+
+static void mark_color_as_moved(struct diff_options *o,
+				struct hashmap *add_lines,
+				struct hashmap *del_lines)
+{
+	struct moved_entry **pmb = NULL; /* potentially moved blocks */
+	int pmb_nr = 0, pmb_alloc = 0;
+	int use_alt_color = 0;
+	int n;
+
+	for (n = 0; n < o->line_buffer_nr; n++) {
+		struct hashmap *hm = NULL;
+		struct moved_entry *key;
+		struct moved_entry *match = NULL;
+		struct diff_line *l = &o->line_buffer[n];
+		int i, lp, rp;
+
+		switch (l->sign) {
+		case '+':
+			hm = del_lines;
+			break;
+		case '-':
+			hm = add_lines;
+			break;
+		default:
+			use_alt_color = 0;
+			pmb_nr = 0; /* no running sets */
+			continue;
+		}
+
+		/* Check for any match to color it as a move. */
+		key = prepare_entry(o, n);
+		match = hashmap_get(hm, key, o);
+		free(key);
+		if (!match)
+			continue;
+
+		/* Check any potential block runs, advance each or nullify */
+		for (i = 0; i < pmb_nr; i++) {
+			struct moved_entry *p = pmb[i];
+			struct moved_entry *pnext = (p && p->next_line) ?
+					p->next_line : NULL;
+			if (pnext &&
+			    !diff_line_cmp(pnext->line, l, o)) {
+				pmb[i] = p->next_line;
+			} else {
+				pmb[i] = NULL;
+			}
+		}
+
+		/* Shrink the set to the remaining runs */
+		for (lp = 0, rp = pmb_nr - 1; lp <= rp;) {
+			while (lp < pmb_nr && pmb[lp])
+				lp++;
+			/* lp points at the first NULL now */
+
+			while (rp > -1 && !pmb[rp])
+				rp--;
+			/* rp points at the last non-NULL */
+
+			if (lp < pmb_nr && rp > -1 && lp < rp) {
+				pmb[lp] = pmb[rp];
+				pmb[rp] = NULL;
+				rp--;
+				lp++;
+			}
+		}
+
+		if (rp > -1) {
+			/* Remember the number of running sets */
+			pmb_nr = rp + 1;
+		} else {
+			/* Toggle color */
+			use_alt_color = (use_alt_color + 1) % 2;
+
+			/* Build up a new set */
+			pmb_nr = 0;
+			for (; match; match = hashmap_get_next(hm, match)) {
+				ALLOC_GROW(pmb, pmb_nr + 1, pmb_alloc);
+				pmb[pmb_nr++] = match;
+			}
+		}
+
+		switch (l->sign) {
+		case '+':
+			l->set = diff_get_color_opt(o,
+				DIFF_FILE_NEW_MOVED + use_alt_color);
+			break;
+		case '-':
+			l->set = diff_get_color_opt(o,
+				DIFF_FILE_OLD_MOVED + use_alt_color);
+			break;
+		default:
+			die("BUG: we should have continued earlier?");
+		}
+	}
+	free(pmb);
+}
+
 static void emit_diff_line(struct diff_options *o,
 			   struct diff_line *e)
 {
@@ -3518,6 +3752,8 @@ void diff_setup(struct diff_options *options)
 	options->line_buffer = NULL;
 	options->line_buffer_nr = 0;
 	options->line_buffer_alloc = 0;
+
+	options->color_moved = diff_color_moved_default;
 }
 
 void diff_setup_done(struct diff_options *options)
@@ -3627,6 +3863,9 @@ void diff_setup_done(struct diff_options *options)
 
 	if (DIFF_OPT_TST(options, FOLLOW_RENAMES) && options->pathspec.nr != 1)
 		die(_("--follow requires exactly one pathspec"));
+
+	if (!options->use_color || external_diff())
+		options->color_moved = 0;
 }
 
 static int opt_arg(const char *arg, int arg_short, const char *arg_long, int *val)
@@ -4051,6 +4290,10 @@ int diff_opt_parse(struct diff_options *options,
 	}
 	else if (!strcmp(arg, "--no-color"))
 		options->use_color = 0;
+	else if (!strcmp(arg, "--color-moved"))
+		options->color_moved = 1;
+	else if (!strcmp(arg, "--no-color-moved"))
+		options->color_moved = 0;
 	else if (!strcmp(arg, "--color-words")) {
 		options->use_color = 1;
 		options->word_diff = DIFF_WORDS_COLOR;
@@ -4856,16 +5099,9 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o)
 {
 	int i;
 	struct diff_queue_struct *q = &diff_queued_diff;
-	/*
-	 * For testing purposes we want to make sure the diff machinery
-	 * works completely with the buffer. If there is anything emitted
-	 * outside the emit_diff_line, then the order is screwed
-	 * up and the tests will fail.
-	 *
-	 * TODO (later in this series):
-	 * We'll unset this flag in a later patch.
-	 */
-	o->use_buffer = 1;
+
+	if (o->color_moved)
+		o->use_buffer = 1;
 
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
@@ -4874,6 +5110,24 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o)
 	}
 
 	if (o->use_buffer) {
+		if (o->color_moved) {
+			struct hashmap add_lines, del_lines;
+			unsigned ignore_ws = DIFF_XDL_TST(o, IGNORE_WHITESPACE);
+
+			hashmap_init(&del_lines, ignore_ws ?
+				(hashmap_cmp_fn)moved_entry_cmp_no_ws :
+				(hashmap_cmp_fn)moved_entry_cmp, 0);
+			hashmap_init(&add_lines, ignore_ws ?
+				(hashmap_cmp_fn)moved_entry_cmp_no_ws :
+				(hashmap_cmp_fn)moved_entry_cmp, 0);
+
+			add_lines_to_move_detection(o, &add_lines, &del_lines);
+			mark_color_as_moved(o, &add_lines, &del_lines);
+
+			hashmap_free(&add_lines, 0);
+			hashmap_free(&del_lines, 0);
+		}
+
 		for (i = 0; i < o->line_buffer_nr; i++)
 			emit_diff_line(o, &o->line_buffer[i]);
 
@@ -4962,6 +5216,7 @@ void diff_flush(struct diff_options *options)
 		if (!options->file)
 			die_errno("Could not open /dev/null");
 		options->close_file = 1;
+		options->color_moved = 0;
 		for (i = 0; i < q->nr; i++) {
 			struct diff_filepair *p = q->queue[i];
 			if (check_pair_status(p))
diff --git a/diff.h b/diff.h
index fad1258556..445259ebf7 100644
--- a/diff.h
+++ b/diff.h
@@ -7,6 +7,7 @@
 #include "tree-walk.h"
 #include "pathspec.h"
 #include "object.h"
+#include "hashmap.h"
 
 struct rev_info;
 struct diff_options;
@@ -228,6 +229,8 @@ struct diff_options {
 
 	struct diff_line *line_buffer;
 	int line_buffer_nr, line_buffer_alloc;
+
+	int color_moved;
 };
 
 /* Emit [line_prefix] [set] line [reset] */
@@ -243,7 +246,11 @@ enum color_diff {
 	DIFF_FILE_NEW = 5,
 	DIFF_COMMIT = 6,
 	DIFF_WHITESPACE = 7,
-	DIFF_FUNCINFO = 8
+	DIFF_FUNCINFO = 8,
+	DIFF_FILE_OLD_MOVED = 9,
+	DIFF_FILE_OLD_MOVED_ALT = 10,
+	DIFF_FILE_NEW_MOVED = 11,
+	DIFF_FILE_NEW_MOVED_ALT = 12
 };
 const char *diff_get_color(int diff_use_color, enum color_diff ix);
 #define diff_get_color_opt(o, ix) \
diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index 289806d0c7..0e92bf94bf 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -972,4 +972,271 @@ test_expect_success 'option overrides diff.wsErrorHighlight' '
 
 '
 
+test_expect_success 'detect moved code, complete file' '
+	git reset --hard &&
+	cat <<-\EOF >test.c &&
+	#include<stdio.h>
+	main()
+	{
+	printf("Hello World");
+	}
+	EOF
+	git add test.c &&
+	git commit -m "add main function" &&
+	git mv test.c main.c &&
+	git diff HEAD --color-moved --no-renames | test_decode_color >actual &&
+	cat >expected <<-\EOF &&
+	<BOLD>diff --git a/main.c b/main.c<RESET>
+	<BOLD>new file mode 100644<RESET>
+	<BOLD>index 0000000..a986c57<RESET>
+	<BOLD>--- /dev/null<RESET>
+	<BOLD>+++ b/main.c<RESET>
+	<CYAN>@@ -0,0 +1,5 @@<RESET>
+	<BGREEN>+<RESET><BGREEN>#include<stdio.h><RESET>
+	<BGREEN>+<RESET><BGREEN>main()<RESET>
+	<BGREEN>+<RESET><BGREEN>{<RESET>
+	<BGREEN>+<RESET><BGREEN>printf("Hello World");<RESET>
+	<BGREEN>+<RESET><BGREEN>}<RESET>
+	<BOLD>diff --git a/test.c b/test.c<RESET>
+	<BOLD>deleted file mode 100644<RESET>
+	<BOLD>index a986c57..0000000<RESET>
+	<BOLD>--- a/test.c<RESET>
+	<BOLD>+++ /dev/null<RESET>
+	<CYAN>@@ -1,5 +0,0 @@<RESET>
+	<BRED>-#include<stdio.h><RESET>
+	<BRED>-main()<RESET>
+	<BRED>-{<RESET>
+	<BRED>-printf("Hello World");<RESET>
+	<BRED>-}<RESET>
+	EOF
+
+	test_cmp expected actual
+'
+
+test_expect_success 'detect moved code, inside file' '
+	git reset --hard &&
+	cat <<-\EOF >main.c &&
+		#include<stdio.h>
+		int stuff()
+		{
+			printf("Hello ");
+			printf("World\n");
+		}
+
+		int secure_foo(struct user *u)
+		{
+			if (!u->is_allowed_foo)
+				return;
+			foo(u);
+		}
+
+		int main()
+		{
+			foo();
+		}
+	EOF
+	cat <<-\EOF >test.c &&
+		#include<stdio.h>
+		int bar()
+		{
+			printf("Hello World, but different\n");
+		}
+
+		int another_function()
+		{
+			bar();
+		}
+	EOF
+	git add main.c test.c &&
+	git commit -m "add main and test file" &&
+	cat <<-\EOF >main.c &&
+		#include<stdio.h>
+		int stuff()
+		{
+			printf("Hello ");
+			printf("World\n");
+		}
+
+		int main()
+		{
+			foo();
+		}
+	EOF
+	cat <<-\EOF >test.c &&
+		#include<stdio.h>
+		int bar()
+		{
+			printf("Hello World, but different\n");
+		}
+
+		int secure_foo(struct user *u)
+		{
+			if (!u->is_allowed_foo)
+				return;
+			foo(u);
+		}
+
+		int another_function()
+		{
+			bar();
+		}
+	EOF
+	git diff HEAD --no-renames --color-moved| test_decode_color >actual &&
+	cat <<-\EOF >expected &&
+	<BOLD>diff --git a/main.c b/main.c<RESET>
+	<BOLD>index 27a619c..7cf9336 100644<RESET>
+	<BOLD>--- a/main.c<RESET>
+	<BOLD>+++ b/main.c<RESET>
+	<CYAN>@@ -5,13 +5,6 @@<RESET> <RESET>printf("Hello ");<RESET>
+	 printf("World\n");<RESET>
+	 }<RESET>
+	 <RESET>
+	<BRED>-int secure_foo(struct user *u)<RESET>
+	<BRED>-{<RESET>
+	<BRED>-if (!u->is_allowed_foo)<RESET>
+	<BRED>-return;<RESET>
+	<BRED>-foo(u);<RESET>
+	<BRED>-}<RESET>
+	<BRED>-<RESET>
+	 int main()<RESET>
+	 {<RESET>
+	 foo();<RESET>
+	<BOLD>diff --git a/test.c b/test.c<RESET>
+	<BOLD>index 1dc1d85..e34eb69 100644<RESET>
+	<BOLD>--- a/test.c<RESET>
+	<BOLD>+++ b/test.c<RESET>
+	<CYAN>@@ -4,6 +4,13 @@<RESET> <RESET>int bar()<RESET>
+	 printf("Hello World, but different\n");<RESET>
+	 }<RESET>
+	 <RESET>
+	<BGREEN>+<RESET><BGREEN>int secure_foo(struct user *u)<RESET>
+	<BGREEN>+<RESET><BGREEN>{<RESET>
+	<BGREEN>+<RESET><BGREEN>if (!u->is_allowed_foo)<RESET>
+	<BGREEN>+<RESET><BGREEN>return;<RESET>
+	<BGREEN>+<RESET><BGREEN>foo(u);<RESET>
+	<BGREEN>+<RESET><BGREEN>}<RESET>
+	<BGREEN>+<RESET>
+	 int another_function()<RESET>
+	 {<RESET>
+	 bar();<RESET>
+	EOF
+
+	test_cmp expected actual
+'
+
+test_expect_success 'detect permutations inside moved code' '
+	# reusing the move example from last test:
+	cat <<-\EOF >main.c &&
+		#include<stdio.h>
+		int stuff()
+		{
+			printf("Hello ");
+			printf("World\n");
+		}
+
+		int main()
+		{
+			foo();
+		}
+	EOF
+	cat <<-\EOF >test.c &&
+		#include<stdio.h>
+		int bar()
+		{
+			printf("Hello World, but different\n");
+		}
+
+		int secure_foo(struct user *u)
+		{
+			foo(u);
+			if (!u->is_allowed_foo)
+				return;
+		}
+
+		int another_function()
+		{
+			bar();
+		}
+	EOF
+	git diff HEAD --no-renames --color-moved| test_decode_color >actual &&
+	cat <<-\EOF >expected &&
+	<BOLD>diff --git a/main.c b/main.c<RESET>
+	<BOLD>index 27a619c..7cf9336 100644<RESET>
+	<BOLD>--- a/main.c<RESET>
+	<BOLD>+++ b/main.c<RESET>
+	<CYAN>@@ -5,13 +5,6 @@<RESET> <RESET>printf("Hello ");<RESET>
+	 printf("World\n");<RESET>
+	 }<RESET>
+	 <RESET>
+	<BRED>-int secure_foo(struct user *u)<RESET>
+	<BRED>-{<RESET>
+	<BOLD;RED>-if (!u->is_allowed_foo)<RESET>
+	<BOLD;RED>-return;<RESET>
+	<BRED>-foo(u);<RESET>
+	<BOLD;RED>-}<RESET>
+	<BOLD;RED>-<RESET>
+	 int main()<RESET>
+	 {<RESET>
+	 foo();<RESET>
+	<BOLD>diff --git a/test.c b/test.c<RESET>
+	<BOLD>index 1dc1d85..2bedec9 100644<RESET>
+	<BOLD>--- a/test.c<RESET>
+	<BOLD>+++ b/test.c<RESET>
+	<CYAN>@@ -4,6 +4,13 @@<RESET> <RESET>int bar()<RESET>
+	 printf("Hello World, but different\n");<RESET>
+	 }<RESET>
+	 <RESET>
+	<BGREEN>+<RESET><BGREEN>int secure_foo(struct user *u)<RESET>
+	<BGREEN>+<RESET><BGREEN>{<RESET>
+	<BOLD;GREEN>+<RESET><BOLD;GREEN>foo(u);<RESET>
+	<BGREEN>+<RESET><BGREEN>if (!u->is_allowed_foo)<RESET>
+	<BGREEN>+<RESET><BGREEN>return;<RESET>
+	<BOLD;GREEN>+<RESET><BOLD;GREEN>}<RESET>
+	<BOLD;GREEN>+<RESET>
+	 int another_function()<RESET>
+	 {<RESET>
+	 bar();<RESET>
+	EOF
+
+	test_cmp expected actual
+'
+
+test_expect_success 'move detection does not mess up colored words' '
+	cat <<-\EOF >text.txt &&
+	Lorem Ipsum is simply dummy text of the printing and typesetting industry.
+	EOF
+	git add text.txt &&
+	git commit -a -m "clean state" &&
+	cat <<-\EOF >text.txt &&
+	simply Lorem Ipsum dummy is text of the typesetting and printing industry.
+	EOF
+	git diff --color-moved --word-diff >actual &&
+	git diff --word-diff >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'move detection with submodules' '
+	test_create_repo bananas &&
+	echo ripe >bananas/recipe &&
+	git -C bananas add recipe &&
+	test_commit fruit &&
+	test_commit -C bananas recipe &&
+	git submodule add ./bananas &&
+	git add bananas &&
+	git commit -a -m "bananas are like a heavy library?" &&
+	echo foul >bananas/recipe &&
+	echo ripe >fruit.t &&
+
+	git diff --submodule=diff --color-moved >actual &&
+
+	# no move detection as the moved line is across repository boundaries.
+	test_decode_color <actual >decoded_actual &&
+	! grep BGREEN decoded_actual &&
+	! grep BRED decoded_actual &&
+
+	# nor did we mess with it another way
+	git diff --submodule=diff | test_decode_color >expect &&
+	test_cmp expect decoded_actual
+'
+
 test_done
-- 
2.13.0.18.g7d86cc8ba0


^ permalink raw reply	[relevance 10%]

* Re: [PATCHv5 17/17] diff.c: color moved lines differently
  2017-05-24 21:40   ` [PATCHv5 17/17] diff.c: color moved lines differently Stefan Beller
@ 2017-05-25  2:27     ` Junio C Hamano
  2017-05-25  5:39       ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Junio C Hamano @ 2017-05-25  2:27 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, bmwill, jrnieder, jonathantanmy, peff, mhagger

Stefan Beller <sbeller@google.com> writes:

> When a patch consists mostly of moving blocks of code around, it can
> be quite tedious to ensure that the blocks are moved verbatim, and not
> ...
> cases. This leads to another thought: We could pass on '--color-moved' to
> submodules such that they color up moved lines for themselves. If we'd do
> so only line moves within a repository boundary are marked up.
>
> Helped-by: Jonathan Tan <jonathantanmy@google.com>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
>
> # Conflicts:
> #	diff.c
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---

Hmph, what are these final lines about?

^ permalink raw reply	[relevance 8%]

* Re: [PATCHv5 17/17] diff.c: color moved lines differently
  2017-05-25  2:27     ` Junio C Hamano
@ 2017-05-25  5:39       ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-05-25  5:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Brandon Williams, Jonathan Nieder, Jonathan Tan, Jeff King, Michael Haggerty

On Wed, May 24, 2017 at 7:27 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> When a patch consists mostly of moving blocks of code around, it can
>> be quite tedious to ensure that the blocks are moved verbatim, and not
>> ...
>> cases. This leads to another thought: We could pass on '--color-moved' to
>> submodules such that they color up moved lines for themselves. If we'd do
>> so only line moves within a repository boundary are marked up.
>>
>> Helped-by: Jonathan Tan <jonathantanmy@google.com>
>> Signed-off-by: Stefan Beller <sbeller@google.com>
>> Signed-off-by: Junio C Hamano <gitster@pobox.com>
>>
>> # Conflicts:
>> #     diff.c
>> Signed-off-by: Junio C Hamano <gitster@pobox.com>
>> Signed-off-by: Stefan Beller <sbeller@google.com>
>> ---
>
> Hmph, what are these final lines about?

See the explanation in the patch 16/17.
My guess is that one of us (me) was careless again.

The commented lines are easily produced by git-gui, that
I use. I think it just takes the commit message from the underlying
git-core. But unlike git-core, it doesn't strip off commented lines
as there is no extra information in comments presented.
I wonder how you are the first signoff after the conflict markers
though, as that would hint that you signed off a commit message
with the commented conflict lines first, before I had them.

Puzzeled, will fix in a reroll.

Thanks,
Stefan

^ permalink raw reply	[relevance 9%]

* Re: What's cooking in git.git (May 2017, #07; Tue, 23)
      [irrelevant]       ` <xmqq60gpfvqj.fsf@gitster.mtv.corp.google.com>
@ 2017-05-25  5:42         ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-05-25  5:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ævar Arnfjörð Bjarmason, git

On Wed, May 24, 2017 at 8:42 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>>> The tests added by grep rely on the old content of
>>> test 2 'grep correctly finds patterns in a submodule'.
>>
>> Sorry about the fallout.
>>
>>> The (whitespace broken) diff below fixes it.
>
> Ah, then, this was an example of maintainer not doing a good job.
> When I see a topic that pass its own test that fails when merged to
> 'pu', I usually try to see where it goes wrong myself and come up
> with a fix in an evil merge, but this time I didn't have enough time
> to do so before sending out the "What's cooking" report.
>
> Here is what I taught my merge-fix machinery to apply after
> mechanical merge of the two topics.

Please evict (or stop paying attention to)
sb/submodule-blanket-recursive as it is fundamentally broken.

I hoped to resend a fixed version today, but it took me longer than expected
to figure out the config machinery playing with submodules.

The diff below looks correct to me.

Thanks,
Stefan
>
>  t/t7814-grep-recurse-submodules.sh | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh
> index 14eeb54b4b..7184113b9b 100755
> --- a/t/t7814-grep-recurse-submodules.sh
> +++ b/t/t7814-grep-recurse-submodules.sh
> @@ -36,18 +36,18 @@ test_expect_success 'grep correctly finds patterns in a submodule' '
>  test_expect_success 'grep finds patterns in a submodule via config' '
>         test_config submodule.recurse true &&
>         # expect from previous test
> -       git grep -e "bar" >actual &&
> +       git grep -e "(3|4)" >actual &&
>         test_cmp expect actual
>  '
>
>  test_expect_success 'grep --no-recurse-submodules overrides config' '
>         test_config submodule.recurse true &&
>         cat >expect <<-\EOF &&
> -       a:foobar
> -       b/b:bar
> +       a:(1|2)d(3|4)
> +       b/b:(3|4)
>         EOF
>
> -       git grep -e "bar" --no-recurse-submodules >actual &&
> +       git grep -e "(3|4)" --no-recurse-submodules >actual &&
>         test_cmp expect actual
>  '
>
> --
> 2.13.0-491-g71cfeddc25
>

^ permalink raw reply	[relevance 17%]

* [GSoC][PATCH v5 1/3] submodule: fix buggy $path and $sm_path variable's value
      [irrelevant]   ` <20170521125814.26255-2-pc44800@gmail.com>
  2017-05-22 20:04     ` Re: [GSoC][PATCH v4 2/2] submodule: port subcommand foreach from shell to C Stefan Beller
  2017-05-23 19:36     ` Brandon Williams
@ 2017-05-26 15:17     ` Prathamesh Chavan
  2017-05-26 15:17       ` [GSoC][PATCH v5 2/3] t7407: test "submodule foreach --recursive" from subdirectory added Prathamesh Chavan
                         ` (2 more replies)
  2 siblings, 3 replies; 200+ results
From: Prathamesh Chavan @ 2017-05-26 15:17 UTC (permalink / raw)
  To: git; +Cc: bmwill, christian.couder, ramsay, sbeller, Prathamesh Chavan

According to the documentation about git-submodule foreach subcommand's
$path variable:
$path is the name of the submodule directory relative to the superproject

But it was observed when the value of the $path value deviates from this
for the nested submodules when the <command> is run from a subdirectory.
This patch aims for its correction.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
---
This series of patch is based on gitster/jk/bug-to-abort for untilizing its 
BUG() macro.

The observation made was as follows:
For a project - super containing dir (not a submodule) and a submodule sub 
which contains another submodule subsub. When we run a command from super/dir:

git submodule foreach "echo \$path-\$sm_path"

actual results:
Entering '../sub'
../sub-../sub
Entering '../sub/subsub'
../subsub-../subsub

expected result wrt documentation and current test suite:
Entering '../sub'
sub-../sub
Entering '../sub/subsub'
subsub-../sub/subsub

This make the value of $path confusing and I also feel it deviates from its 
documentation:
$path is the name of the submodule directory relative to the superproject.
Hence, this patch corrects the value assigned to the $path and $sm_path.

 git-submodule.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index c0d0e9a4c..ea6f56337 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -344,9 +344,9 @@ cmd_foreach()
 				prefix="$prefix$sm_path/"
 				sanitize_submodule_env
 				cd "$sm_path" &&
-				sm_path=$(git submodule--helper relative-path "$sm_path" "$wt_prefix") &&
 				# we make $path available to scripts ...
 				path=$sm_path &&
+				sm_path=$displaypath &&
 				if test $# -eq 1
 				then
 					eval "$1"
-- 
2.11.0


^ permalink raw reply	[relevance 24%]

* [GSoC][PATCH v5 2/3] t7407: test "submodule foreach --recursive" from subdirectory added
  2017-05-26 15:17     ` [GSoC][PATCH v5 1/3] submodule: fix buggy $path and $sm_path variable's value Prathamesh Chavan
@ 2017-05-26 15:17       ` Prathamesh Chavan
  2017-05-26 16:19         ` Stefan Beller
  2017-05-26 16:33         ` Brandon Williams
  2017-05-26 15:17       ` [GSoC][PATCH v5 3/3] submodule: port subcommand foreach from shell to C Prathamesh Chavan
  2017-05-26 16:31       ` Ramsay Jones
  2 siblings, 2 replies; 200+ results
From: Prathamesh Chavan @ 2017-05-26 15:17 UTC (permalink / raw)
  To: git; +Cc: bmwill, christian.couder, ramsay, sbeller, Prathamesh Chavan

Additional test cases added to the submodule-foreach test suite
to check the submodule foreach --recursive behavior from a
subdirectory as this was missing from the test suite.

Helped-by: Brandon Williams <bmwill@google.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
---
Additional test added to check the bug fixed in the [PATCH v5 1/3] of
this patch series.

 t/t7407-submodule-foreach.sh | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/t/t7407-submodule-foreach.sh b/t/t7407-submodule-foreach.sh
index 6ba5daf42..1c8d132d8 100755
--- a/t/t7407-submodule-foreach.sh
+++ b/t/t7407-submodule-foreach.sh
@@ -197,6 +197,40 @@ test_expect_success 'test messages from "foreach --recursive" from subdirectory'
 	test_i18ncmp expect actual
 '
 
+sub1sha1=$(cd clone2/sub1 && git rev-parse HEAD)
+sub2sha1=$(cd clone2/sub2 && git rev-parse HEAD)
+sub3sha1=$(cd clone2/sub3 && git rev-parse HEAD)
+nested1sha1=$(cd clone2/nested1 && git rev-parse HEAD)
+nested2sha1=$(cd clone2/nested1/nested2 && git rev-parse HEAD)
+nested3sha1=$(cd clone2/nested1/nested2/nested3 && git rev-parse HEAD)
+submodulesha1=$(cd clone2/nested1/nested2/nested3/submodule && git rev-parse HEAD)
+
+cat >expect <<EOF
+Entering '../nested1'
+$pwd/clone2-nested1-../nested1-$nested1sha1
+Entering '../nested1/nested2'
+$pwd/clone2/nested1-nested2-../nested1/nested2-$nested2sha1
+Entering '../nested1/nested2/nested3'
+$pwd/clone2/nested1/nested2-nested3-../nested1/nested2/nested3-$nested3sha1
+Entering '../nested1/nested2/nested3/submodule'
+$pwd/clone2/nested1/nested2/nested3-submodule-../nested1/nested2/nested3/submodule-$submodulesha1
+Entering '../sub1'
+$pwd/clone2-foo1-../sub1-$sub1sha1
+Entering '../sub2'
+$pwd/clone2-foo2-../sub2-$sub2sha1
+Entering '../sub3'
+$pwd/clone2-foo3-../sub3-$sub3sha1
+EOF
+
+test_expect_success 'test "submodule foreach --recursive" from subdirectory' '
+	(
+		cd clone2 &&
+		cd untracked &&
+		git submodule foreach --recursive "echo \$toplevel-\$name-\$sm_path-\$sha1" >../../actual
+	) &&
+	test_i18ncmp expect actual
+'
+
 cat > expect <<EOF
 nested1-nested1
 nested2-nested2
-- 
2.11.0


^ permalink raw reply	[relevance 23%]

* [GSoC][PATCH v5 3/3] submodule: port subcommand foreach from shell to C
  2017-05-26 15:17     ` [GSoC][PATCH v5 1/3] submodule: fix buggy $path and $sm_path variable's value Prathamesh Chavan
  2017-05-26 15:17       ` [GSoC][PATCH v5 2/3] t7407: test "submodule foreach --recursive" from subdirectory added Prathamesh Chavan
@ 2017-05-26 15:17       ` Prathamesh Chavan
  2017-05-26 16:14         ` Stefan Beller
  2017-05-26 16:44         ` Brandon Williams
  2017-05-26 16:31       ` Ramsay Jones
  2 siblings, 2 replies; 200+ results
From: Prathamesh Chavan @ 2017-05-26 15:17 UTC (permalink / raw)
  To: git; +Cc: bmwill, christian.couder, ramsay, sbeller, Prathamesh Chavan

This aims to make git-submodule foreach a builtin. This is the very
first step taken in this direction. Hence, 'foreach' is ported to
submodule--helper, and submodule--helper is called from git-submodule.sh.
The code is split up to have one function to obtain all the list of
submodules. This function acts as the front-end of git-submodule foreach
subcommand. It calls the function for_each_submodule_list, which basically
loops through the list and calls function fn, which in this case is
runcommand_in_submodule. This third function is a calling function that
takes care of running the command in that submodule, and recursively
perform the same when --recursive is flagged.

The first function module_foreach first parses the options present in
argv, and then with the help of module_list_compute, generates the list of
submodules present in the current working tree.

The second function for_each_submodule_list traverses through the
list, and calls function fn (which in case of submodule subcommand
foreach is runcommand_in_submodule) is called for each entry.

The third function runcommand_in_submodule, generates a submodule struct sub
for $name, value and then later prepends name=sub->name; and other
value assignment to the env argv_array structure of a child_process.
Also the <command> of submodule-foreach is push to args argv_array
structure and finally, using run_command the commands are executed
using a shell.

The third function also takes care of the recursive flag, by creating
a separate child_process structure and prepending "--super-prefix displaypath",
to the args argv_array structure. Other required arguments and the
input <command> of submodule-foreach is also appended to this argv_array.

The commit 1c4fb136db (submodule foreach: skip eval for more than one
argument, 2013-09-27), which explains that why for the case when argc>1,
we do not use eval. But since in this patch, we are calling the
command in a separate shell itself for all values of argc, this case
is not considered separately.

Both env variable $path and $sm_path were added since both are used in
tests in t7407.

Helped-by: Brandon Williams <bmwill@google.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
---
These series of patches passes the complete test suite.
Its build report is available at:
https://travis-ci.org/pratham-pc/git/builds
Branch: submodule-foreach
Build #71

 builtin/submodule--helper.c | 148 ++++++++++++++++++++++++++++++++++++++++++++
 git-submodule.sh            |  39 +-----------
 2 files changed, 149 insertions(+), 38 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 566a5b6a6..343b6269c 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -13,6 +13,8 @@
 #include "refs.h"
 #include "connect.h"
 
+typedef void (*submodule_list_func_t)(const struct cache_entry *list_item, void *cb_data);
+
 static char *get_default_remote(void)
 {
 	char *dest = NULL, *ret;
@@ -219,6 +221,26 @@ static int resolve_relative_url_test(int argc, const char **argv, const char *pr
 	return 0;
 }
 
+static char *get_submodule_displaypath(const char *path, const char *prefix)
+{
+	const char *super_prefix = get_super_prefix();
+
+	if (prefix && super_prefix) {
+		BUG("cannot have prefix '%s' and superprefix '%s'",
+		    prefix, super_prefix);
+	} else if (prefix) {
+		struct strbuf sb = STRBUF_INIT;
+		char *displaypath;
+		displaypath = xstrdup(relative_path(path, prefix, &sb));
+		strbuf_release(&sb);
+		return displaypath;
+	} else if (super_prefix) {
+		return xstrfmt("%s/%s", super_prefix, path);
+	} else {
+		return xstrdup(path);
+	}
+}
+
 struct module_list {
 	const struct cache_entry **entries;
 	int alloc, nr;
@@ -331,6 +353,14 @@ static int module_list(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
+static void for_each_submodule_list(const struct module_list list,
+				    submodule_list_func_t fn, void *cb_data)
+{
+	int i;
+	for (i = 0; i < list.nr; i++)
+		fn(list.entries[i], cb_data);
+}
+
 static void init_submodule(const char *path, const char *prefix, int quiet)
 {
 	const struct submodule *sub;
@@ -487,6 +517,123 @@ static int module_name(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
+struct cb_foreach {
+	int argc;
+	const char **argv;
+	const char *prefix;
+	unsigned int quiet: 1;
+	unsigned int recursive: 1;
+};
+#define CB_FOREACH_INIT { 0, NULL, NULL, 0, 0 }
+
+static void runcommand_in_submodule(const struct cache_entry *list_item,
+				    void *cb_data)
+{
+	struct cb_foreach *info = cb_data;
+	char *toplevel = xgetcwd();
+	const struct submodule *sub;
+	struct child_process cp = CHILD_PROCESS_INIT;
+	char* displaypath = NULL;
+	int i;
+
+	/* Only loads from .gitmodules, no overlay with .git/config */
+	gitmodules_config();
+
+	sub = submodule_from_path(null_sha1, list_item->name);
+
+	if (!sub)
+		die(_("No url found for submodule path '%s' in .gitmodules"),
+		      displaypath);
+
+	if (!is_submodule_populated_gently(list_item->name, NULL))
+		return;
+
+	displaypath = get_submodule_displaypath(list_item->name, info->prefix);
+
+	prepare_submodule_repo_env(&cp.env_array);
+	cp.use_shell = 1;
+	cp.dir = list_item->name;
+
+	argv_array_pushf(&cp.env_array, "name=%s", sub->name);
+	argv_array_pushf(&cp.env_array, "sm_path=%s", displaypath);
+	argv_array_pushf(&cp.env_array, "path=%s", list_item->name);
+	argv_array_pushf(&cp.env_array, "sha1=%s", oid_to_hex(&list_item->oid));
+	argv_array_pushf(&cp.env_array, "toplevel=%s", toplevel);
+
+	for (i = 0; i < info->argc; i++)
+		argv_array_push(&cp.args, info->argv[i]);
+
+	if (!info->quiet)
+		printf(_("Entering '%s'\n"), displaypath);
+
+	if (info->argv[0] && run_command(&cp))
+		die(_("run_command returned non-zero status for %s\n."),
+		      displaypath);
+
+	if (info->recursive) {
+		struct child_process cpr = CHILD_PROCESS_INIT;
+
+		cpr.git_cmd = 1;
+		cpr.dir = list_item->name;
+		prepare_submodule_repo_env(&cpr.env_array);
+
+		argv_array_pushl(&cpr.args, "--super-prefix", displaypath,
+				 "submodule--helper", "foreach", "--recursive",
+				 NULL);
+
+		if (info->quiet)
+			argv_array_push(&cpr.args, "--quiet");
+
+		for (i = 0; i < info->argc; i++)
+			argv_array_push(&cpr.args, info->argv[i]);
+
+		if (run_command(&cpr))
+			die(_("run_command returned non-zero status while"
+			      "recursing in the nested submodules of %s\n."),
+			      displaypath);
+	}
+
+	free(displaypath);
+	free(toplevel);
+}
+
+static int module_foreach(int argc, const char **argv, const char *prefix)
+{
+	struct cb_foreach info;
+	struct pathspec pathspec;
+	struct module_list list = MODULE_LIST_INIT;
+	int quiet = 0;
+	int recursive = 0;
+
+	struct option module_foreach_options[] = {
+		OPT__QUIET(&quiet, N_("Suppress output of entering each submodule command")),
+		OPT_BOOL(0, "recursive", &recursive,
+			 N_("Recurse into nested submodules")),
+		OPT_END()
+	};
+
+	const char *const git_submodule_helper_usage[] = {
+		N_("git submodule--helper foreach [--quiet] [--recursive] <command>"),
+		NULL
+	};
+
+	argc = parse_options(argc, argv, prefix, module_foreach_options,
+			     git_submodule_helper_usage, PARSE_OPT_KEEP_UNKNOWN);
+
+	if (module_list_compute(0, NULL, prefix, &pathspec, &list) < 0)
+		BUG("module_list_compute should not choke on empty pathspec");
+
+	info.argc = argc;
+	info.argv = argv;
+	info.prefix = prefix;
+	info.quiet = quiet;
+	info.recursive = recursive;
+
+	for_each_submodule_list(list, runcommand_in_submodule, &info);
+
+	return 0;
+}
+
 static int clone_submodule(const char *path, const char *gitdir, const char *url,
 			   const char *depth, struct string_list *reference,
 			   int quiet, int progress)
@@ -1212,6 +1359,7 @@ static struct cmd_struct commands[] = {
 	{"relative-path", resolve_relative_path, 0},
 	{"resolve-relative-url", resolve_relative_url, 0},
 	{"resolve-relative-url-test", resolve_relative_url_test, 0},
+	{"foreach", module_foreach, SUPPORT_SUPER_PREFIX},
 	{"init", module_init, SUPPORT_SUPER_PREFIX},
 	{"remote-branch", resolve_remote_submodule_branch, 0},
 	{"push-check", push_check, 0},
diff --git a/git-submodule.sh b/git-submodule.sh
index ea6f56337..032fd2540 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -322,45 +322,8 @@ cmd_foreach()
 		shift
 	done
 
-	toplevel=$(pwd)
+	git ${wt_prefix:+-C "$wt_prefix"} ${prefix:+--super-prefix "$prefix"} submodule--helper foreach ${GIT_QUIET:+--quiet} ${recursive:+--recursive} "$@"
 
-	# dup stdin so that it can be restored when running the external
-	# command in the subshell (and a recursive call to this function)
-	exec 3<&0
-
-	{
-		git submodule--helper list --prefix "$wt_prefix" ||
-		echo "#unmatched" $?
-	} |
-	while read -r mode sha1 stage sm_path
-	do
-		die_if_unmatched "$mode" "$sha1"
-		if test -e "$sm_path"/.git
-		then
-			displaypath=$(git submodule--helper relative-path "$prefix$sm_path" "$wt_prefix")
-			say "$(eval_gettext "Entering '\$displaypath'")"
-			name=$(git submodule--helper name "$sm_path")
-			(
-				prefix="$prefix$sm_path/"
-				sanitize_submodule_env
-				cd "$sm_path" &&
-				# we make $path available to scripts ...
-				path=$sm_path &&
-				sm_path=$displaypath &&
-				if test $# -eq 1
-				then
-					eval "$1"
-				else
-					"$@"
-				fi &&
-				if test -n "$recursive"
-				then
-					cmd_foreach "--recursive" "$@"
-				fi
-			) <&3 3<&- ||
-			die "$(eval_gettext "Stopping at '\$displaypath'; script returned non-zero status.")"
-		fi
-	done
 }
 
 #
-- 
2.11.0


^ permalink raw reply	[relevance 17%]

* Re: [GSoC][PATCH v5 3/3] submodule: port subcommand foreach from shell to C
  2017-05-26 15:17       ` [GSoC][PATCH v5 3/3] submodule: port subcommand foreach from shell to C Prathamesh Chavan
@ 2017-05-26 16:14         ` Stefan Beller
  2017-05-26 16:44         ` Brandon Williams
  1 sibling, 0 replies; 200+ results
From: Stefan Beller @ 2017-05-26 16:14 UTC (permalink / raw)
  To: Prathamesh Chavan; +Cc: git, Brandon Williams, Christian Couder, Ramsay Jones

On Fri, May 26, 2017 at 8:17 AM, Prathamesh Chavan <pc44800@gmail.com> wrote:
> This aims to make git-submodule foreach a builtin.

Cool. I reviewed the code and only have one minor nit.

> +static void runcommand_in_submodule(const struct cache_entry *list_item,
> +                                   void *cb_data)
> +{

> +       /* Only loads from .gitmodules, no overlay with .git/config */
> +       gitmodules_config();

Performance nit: We only need to load the gitmodules file once instead
of foreach submodule separately, so we could move this to module_foreach().

Thanks,
Stefan

^ permalink raw reply	[relevance 22%]

* Re: [GSoC][PATCH v5 2/3] t7407: test "submodule foreach --recursive" from subdirectory added
  2017-05-26 15:17       ` [GSoC][PATCH v5 2/3] t7407: test "submodule foreach --recursive" from subdirectory added Prathamesh Chavan
@ 2017-05-26 16:19         ` Stefan Beller
  2017-05-26 16:33         ` Brandon Williams
  1 sibling, 0 replies; 200+ results
From: Stefan Beller @ 2017-05-26 16:19 UTC (permalink / raw)
  To: Prathamesh Chavan; +Cc: git, Brandon Williams, Christian Couder, Ramsay Jones

On Fri, May 26, 2017 at 8:17 AM, Prathamesh Chavan <pc44800@gmail.com> wrote:
> Additional test cases added to the submodule-foreach test suite
> to check the submodule foreach --recursive behavior from a
> subdirectory as this was missing from the test suite.

As this demonstrates the fixture of the first patch,
this could be squashed into the first commit.

Reason: When someone is looking at that first commit, they
may wonder if there is no test that demonstrates the fix. (as fixing
a bug with no test is bad style. ;) And given the data structures of
Git it is only easy to find the previous commit, but hard to find the
next commit (this one) later on.

I think with only the minor nit in patch 3, the foreach is tackled. :)

Thanks,
Stefan

^ permalink raw reply	[relevance 16%]

* Re: [GSoC][PATCH v5 1/3] submodule: fix buggy $path and $sm_path variable's value
  2017-05-26 15:17     ` [GSoC][PATCH v5 1/3] submodule: fix buggy $path and $sm_path variable's value Prathamesh Chavan
  2017-05-26 15:17       ` [GSoC][PATCH v5 2/3] t7407: test "submodule foreach --recursive" from subdirectory added Prathamesh Chavan
  2017-05-26 15:17       ` [GSoC][PATCH v5 3/3] submodule: port subcommand foreach from shell to C Prathamesh Chavan
@ 2017-05-26 16:31       ` Ramsay Jones
  2017-05-26 17:07         ` Stefan Beller
  2 siblings, 1 reply; 200+ results
From: Ramsay Jones @ 2017-05-26 16:31 UTC (permalink / raw)
  To: Prathamesh Chavan, git; +Cc: bmwill, christian.couder, sbeller



On 26/05/17 16:17, Prathamesh Chavan wrote:
> According to the documentation about git-submodule foreach subcommand's
> $path variable:
> $path is the name of the submodule directory relative to the superproject
> 
> But it was observed when the value of the $path value deviates from this
> for the nested submodules when the <command> is run from a subdirectory.
> This patch aims for its correction.
> 
> Mentored-by: Christian Couder <christian.couder@gmail.com>
> Mentored-by: Stefan Beller <sbeller@google.com>
> Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
> ---
> This series of patch is based on gitster/jk/bug-to-abort for untilizing its 
> BUG() macro.
> 
> The observation made was as follows:
> For a project - super containing dir (not a submodule) and a submodule sub 
> which contains another submodule subsub. When we run a command from super/dir:
> 
> git submodule foreach "echo \$path-\$sm_path"
> 
> actual results:
> Entering '../sub'
> ../sub-../sub
> Entering '../sub/subsub'
> ../subsub-../subsub
> 
> expected result wrt documentation and current test suite:
> Entering '../sub'
> sub-../sub
> Entering '../sub/subsub'
> subsub-../sub/subsub
> 
> This make the value of $path confusing and I also feel it deviates from its 
> documentation:
> $path is the name of the submodule directory relative to the superproject.
> Hence, this patch corrects the value assigned to the $path and $sm_path.
> 
>  git-submodule.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/git-submodule.sh b/git-submodule.sh
> index c0d0e9a4c..ea6f56337 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -344,9 +344,9 @@ cmd_foreach()
>  				prefix="$prefix$sm_path/"
>  				sanitize_submodule_env
>  				cd "$sm_path" &&
> -				sm_path=$(git submodule--helper relative-path "$sm_path" "$wt_prefix") &&
>  				# we make $path available to scripts ...
>  				path=$sm_path &&
> +				sm_path=$displaypath &&
>  				if test $# -eq 1
>  				then
>  					eval "$1"
> 

Hmm, I'm not sure which documentation you are referring to, but if
$path != $sm_path then something is wrong. (unless their definition
has changed, of course). commit 091a6eb0fe may have muddied the water
a little by using $sm_path in the test in t7407, since (as far as I
know) $path is the user-facing variable (NOT $sm_path).

ATB,
Ramsay Jones



^ permalink raw reply	[relevance 7%]

* Re: [GSoC][PATCH v5 2/3] t7407: test "submodule foreach --recursive" from subdirectory added
  2017-05-26 15:17       ` [GSoC][PATCH v5 2/3] t7407: test "submodule foreach --recursive" from subdirectory added Prathamesh Chavan
  2017-05-26 16:19         ` Stefan Beller
@ 2017-05-26 16:33         ` Brandon Williams
  1 sibling, 0 replies; 200+ results
From: Brandon Williams @ 2017-05-26 16:33 UTC (permalink / raw)
  To: Prathamesh Chavan; +Cc: git, christian.couder, ramsay, sbeller

On 05/26, Prathamesh Chavan wrote:
> Additional test cases added to the submodule-foreach test suite
> to check the submodule foreach --recursive behavior from a
> subdirectory as this was missing from the test suite.
> 
> Helped-by: Brandon Williams <bmwill@google.com>
> Mentored-by: Christian Couder <christian.couder@gmail.com>
> Mentored-by: Stefan Beller <sbeller@google.com>
> Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
> ---
> Additional test added to check the bug fixed in the [PATCH v5 1/3] of
> this patch series.
> 
>  t/t7407-submodule-foreach.sh | 34 ++++++++++++++++++++++++++++++++++
>  1 file changed, 34 insertions(+)
> 
> diff --git a/t/t7407-submodule-foreach.sh b/t/t7407-submodule-foreach.sh
> index 6ba5daf42..1c8d132d8 100755
> --- a/t/t7407-submodule-foreach.sh
> +++ b/t/t7407-submodule-foreach.sh
> @@ -197,6 +197,40 @@ test_expect_success 'test messages from "foreach --recursive" from subdirectory'
>  	test_i18ncmp expect actual
>  '
>  
> +sub1sha1=$(cd clone2/sub1 && git rev-parse HEAD)
> +sub2sha1=$(cd clone2/sub2 && git rev-parse HEAD)
> +sub3sha1=$(cd clone2/sub3 && git rev-parse HEAD)
> +nested1sha1=$(cd clone2/nested1 && git rev-parse HEAD)
> +nested2sha1=$(cd clone2/nested1/nested2 && git rev-parse HEAD)
> +nested3sha1=$(cd clone2/nested1/nested2/nested3 && git rev-parse HEAD)
> +submodulesha1=$(cd clone2/nested1/nested2/nested3/submodule && git rev-parse HEAD)
> +
> +cat >expect <<EOF
> +Entering '../nested1'
> +$pwd/clone2-nested1-../nested1-$nested1sha1
> +Entering '../nested1/nested2'
> +$pwd/clone2/nested1-nested2-../nested1/nested2-$nested2sha1
> +Entering '../nested1/nested2/nested3'
> +$pwd/clone2/nested1/nested2-nested3-../nested1/nested2/nested3-$nested3sha1
> +Entering '../nested1/nested2/nested3/submodule'
> +$pwd/clone2/nested1/nested2/nested3-submodule-../nested1/nested2/nested3/submodule-$submodulesha1
> +Entering '../sub1'
> +$pwd/clone2-foo1-../sub1-$sub1sha1
> +Entering '../sub2'
> +$pwd/clone2-foo2-../sub2-$sub2sha1
> +Entering '../sub3'
> +$pwd/clone2-foo3-../sub3-$sub3sha1
> +EOF
> +
> +test_expect_success 'test "submodule foreach --recursive" from subdirectory' '
> +	(
> +		cd clone2 &&
> +		cd untracked &&
> +		git submodule foreach --recursive "echo \$toplevel-\$name-\$sm_path-\$sha1" >../../actual
> +	) &&

small nit: You can either merge the two cd commands to 'cd clone2/untracked' or
better you can even avoid the subshell entirely by doing the following:

  git -C clone2/untracked submodule foreach --recursive \
    "echo \$toplevel-\$name-\$sm_path-\$sha1" >actual

Or something akin to that.

> +	test_i18ncmp expect actual
> +'
> +
>  cat > expect <<EOF
>  nested1-nested1
>  nested2-nested2
> -- 
> 2.11.0
> 

-- 
Brandon Williams

^ permalink raw reply	[relevance 13%]

* Re: [GSoC][PATCH v5 3/3] submodule: port subcommand foreach from shell to C
  2017-05-26 15:17       ` [GSoC][PATCH v5 3/3] submodule: port subcommand foreach from shell to C Prathamesh Chavan
  2017-05-26 16:14         ` Stefan Beller
@ 2017-05-26 16:44         ` Brandon Williams
  1 sibling, 0 replies; 200+ results
From: Brandon Williams @ 2017-05-26 16:44 UTC (permalink / raw)
  To: Prathamesh Chavan; +Cc: git, christian.couder, ramsay, sbeller

On 05/26, Prathamesh Chavan wrote:
> This aims to make git-submodule foreach a builtin. This is the very
> first step taken in this direction. Hence, 'foreach' is ported to
> submodule--helper, and submodule--helper is called from git-submodule.sh.
> The code is split up to have one function to obtain all the list of
> submodules. This function acts as the front-end of git-submodule foreach
> subcommand. It calls the function for_each_submodule_list, which basically
> loops through the list and calls function fn, which in this case is
> runcommand_in_submodule. This third function is a calling function that
> takes care of running the command in that submodule, and recursively
> perform the same when --recursive is flagged.
> 
> The first function module_foreach first parses the options present in
> argv, and then with the help of module_list_compute, generates the list of
> submodules present in the current working tree.
> 
> The second function for_each_submodule_list traverses through the
> list, and calls function fn (which in case of submodule subcommand
> foreach is runcommand_in_submodule) is called for each entry.
> 
> The third function runcommand_in_submodule, generates a submodule struct sub
> for $name, value and then later prepends name=sub->name; and other
> value assignment to the env argv_array structure of a child_process.
> Also the <command> of submodule-foreach is push to args argv_array
> structure and finally, using run_command the commands are executed
> using a shell.
> 
> The third function also takes care of the recursive flag, by creating
> a separate child_process structure and prepending "--super-prefix displaypath",
> to the args argv_array structure. Other required arguments and the
> input <command> of submodule-foreach is also appended to this argv_array.
> 
> The commit 1c4fb136db (submodule foreach: skip eval for more than one
> argument, 2013-09-27), which explains that why for the case when argc>1,
> we do not use eval. But since in this patch, we are calling the
> command in a separate shell itself for all values of argc, this case
> is not considered separately.
> 
> Both env variable $path and $sm_path were added since both are used in
> tests in t7407.
> 
> Helped-by: Brandon Williams <bmwill@google.com>
> Mentored-by: Christian Couder <christian.couder@gmail.com>
> Mentored-by: Stefan Beller <sbeller@google.com>
> Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
> ---
> These series of patches passes the complete test suite.
> Its build report is available at:
> https://travis-ci.org/pratham-pc/git/builds
> Branch: submodule-foreach
> Build #71
> 
>  builtin/submodule--helper.c | 148 ++++++++++++++++++++++++++++++++++++++++++++
>  git-submodule.sh            |  39 +-----------
>  2 files changed, 149 insertions(+), 38 deletions(-)
> 
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index 566a5b6a6..343b6269c 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -13,6 +13,8 @@
>  #include "refs.h"
>  #include "connect.h"
>  
> +typedef void (*submodule_list_func_t)(const struct cache_entry *list_item, void *cb_data);
> +
>  static char *get_default_remote(void)
>  {
>  	char *dest = NULL, *ret;
> @@ -219,6 +221,26 @@ static int resolve_relative_url_test(int argc, const char **argv, const char *pr
>  	return 0;
>  }
>  
> +static char *get_submodule_displaypath(const char *path, const char *prefix)
> +{
> +	const char *super_prefix = get_super_prefix();
> +
> +	if (prefix && super_prefix) {
> +		BUG("cannot have prefix '%s' and superprefix '%s'",
> +		    prefix, super_prefix);
> +	} else if (prefix) {
> +		struct strbuf sb = STRBUF_INIT;
> +		char *displaypath;
> +		displaypath = xstrdup(relative_path(path, prefix, &sb));

These can probably go on the same line:
  
  char *displaypath = xstrdup(relative_path(path, prefix, &sb));

> +		strbuf_release(&sb);
> +		return displaypath;
> +	} else if (super_prefix) {
> +		return xstrfmt("%s/%s", super_prefix, path);
> +	} else {
> +		return xstrdup(path);
> +	}
> +}
> +
>  struct module_list {
>  	const struct cache_entry **entries;
>  	int alloc, nr;
> @@ -331,6 +353,14 @@ static int module_list(int argc, const char **argv, const char *prefix)
>  	return 0;
>  }
>  
> +static void for_each_submodule_list(const struct module_list list,
> +				    submodule_list_func_t fn, void *cb_data)
> +{
> +	int i;
> +	for (i = 0; i < list.nr; i++)
> +		fn(list.entries[i], cb_data);
> +}
> +
>  static void init_submodule(const char *path, const char *prefix, int quiet)
>  {
>  	const struct submodule *sub;
> @@ -487,6 +517,123 @@ static int module_name(int argc, const char **argv, const char *prefix)
>  	return 0;
>  }
>  
> +struct cb_foreach {
> +	int argc;
> +	const char **argv;
> +	const char *prefix;
> +	unsigned int quiet: 1;
> +	unsigned int recursive: 1;
> +};
> +#define CB_FOREACH_INIT { 0, NULL, NULL, 0, 0 }
> +
> +static void runcommand_in_submodule(const struct cache_entry *list_item,
> +				    void *cb_data)
> +{
> +	struct cb_foreach *info = cb_data;
> +	char *toplevel = xgetcwd();
> +	const struct submodule *sub;
> +	struct child_process cp = CHILD_PROCESS_INIT;
> +	char* displaypath = NULL;
> +	int i;
> +
> +	/* Only loads from .gitmodules, no overlay with .git/config */
> +	gitmodules_config();
> +
> +	sub = submodule_from_path(null_sha1, list_item->name);
> +
> +	if (!sub)
> +		die(_("No url found for submodule path '%s' in .gitmodules"),
> +		      displaypath);
> +
> +	if (!is_submodule_populated_gently(list_item->name, NULL))
> +		return;

I missed one other memory leak from the last round.  You should probably
call xgetcwd() to fill 'toplevel' here to avoid leaking the memory if
you do an early return.

> +
> +	displaypath = get_submodule_displaypath(list_item->name, info->prefix);
> +
> +	prepare_submodule_repo_env(&cp.env_array);
> +	cp.use_shell = 1;
> +	cp.dir = list_item->name;
> +
> +	argv_array_pushf(&cp.env_array, "name=%s", sub->name);
> +	argv_array_pushf(&cp.env_array, "sm_path=%s", displaypath);
> +	argv_array_pushf(&cp.env_array, "path=%s", list_item->name);
> +	argv_array_pushf(&cp.env_array, "sha1=%s", oid_to_hex(&list_item->oid));
> +	argv_array_pushf(&cp.env_array, "toplevel=%s", toplevel);
> +
> +	for (i = 0; i < info->argc; i++)
> +		argv_array_push(&cp.args, info->argv[i]);
> +
> +	if (!info->quiet)
> +		printf(_("Entering '%s'\n"), displaypath);
> +
> +	if (info->argv[0] && run_command(&cp))
> +		die(_("run_command returned non-zero status for %s\n."),
> +		      displaypath);
> +
> +	if (info->recursive) {
> +		struct child_process cpr = CHILD_PROCESS_INIT;
> +
> +		cpr.git_cmd = 1;
> +		cpr.dir = list_item->name;
> +		prepare_submodule_repo_env(&cpr.env_array);
> +
> +		argv_array_pushl(&cpr.args, "--super-prefix", displaypath,
> +				 "submodule--helper", "foreach", "--recursive",
> +				 NULL);
> +
> +		if (info->quiet)
> +			argv_array_push(&cpr.args, "--quiet");
> +
> +		for (i = 0; i < info->argc; i++)
> +			argv_array_push(&cpr.args, info->argv[i]);
> +
> +		if (run_command(&cpr))
> +			die(_("run_command returned non-zero status while"
> +			      "recursing in the nested submodules of %s\n."),
> +			      displaypath);
> +	}
> +
> +	free(displaypath);
> +	free(toplevel);
> +}
> +
> +static int module_foreach(int argc, const char **argv, const char *prefix)
> +{
> +	struct cb_foreach info;
> +	struct pathspec pathspec;
> +	struct module_list list = MODULE_LIST_INIT;
> +	int quiet = 0;
> +	int recursive = 0;
> +
> +	struct option module_foreach_options[] = {
> +		OPT__QUIET(&quiet, N_("Suppress output of entering each submodule command")),
> +		OPT_BOOL(0, "recursive", &recursive,
> +			 N_("Recurse into nested submodules")),
> +		OPT_END()
> +	};
> +
> +	const char *const git_submodule_helper_usage[] = {
> +		N_("git submodule--helper foreach [--quiet] [--recursive] <command>"),
> +		NULL
> +	};
> +
> +	argc = parse_options(argc, argv, prefix, module_foreach_options,
> +			     git_submodule_helper_usage, PARSE_OPT_KEEP_UNKNOWN);
> +
> +	if (module_list_compute(0, NULL, prefix, &pathspec, &list) < 0)
> +		BUG("module_list_compute should not choke on empty pathspec");
> +
> +	info.argc = argc;
> +	info.argv = argv;
> +	info.prefix = prefix;
> +	info.quiet = quiet;
> +	info.recursive = recursive;
> +
> +	for_each_submodule_list(list, runcommand_in_submodule, &info);
> +
> +	return 0;
> +}
> +
>  static int clone_submodule(const char *path, const char *gitdir, const char *url,
>  			   const char *depth, struct string_list *reference,
>  			   int quiet, int progress)
> @@ -1212,6 +1359,7 @@ static struct cmd_struct commands[] = {
>  	{"relative-path", resolve_relative_path, 0},
>  	{"resolve-relative-url", resolve_relative_url, 0},
>  	{"resolve-relative-url-test", resolve_relative_url_test, 0},
> +	{"foreach", module_foreach, SUPPORT_SUPER_PREFIX},
>  	{"init", module_init, SUPPORT_SUPER_PREFIX},
>  	{"remote-branch", resolve_remote_submodule_branch, 0},
>  	{"push-check", push_check, 0},
> diff --git a/git-submodule.sh b/git-submodule.sh
> index ea6f56337..032fd2540 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -322,45 +322,8 @@ cmd_foreach()
>  		shift
>  	done
>  
> -	toplevel=$(pwd)
> +	git ${wt_prefix:+-C "$wt_prefix"} ${prefix:+--super-prefix "$prefix"} submodule--helper foreach ${GIT_QUIET:+--quiet} ${recursive:+--recursive} "$@"
>  
> -	# dup stdin so that it can be restored when running the external
> -	# command in the subshell (and a recursive call to this function)
> -	exec 3<&0
> -
> -	{
> -		git submodule--helper list --prefix "$wt_prefix" ||
> -		echo "#unmatched" $?
> -	} |
> -	while read -r mode sha1 stage sm_path
> -	do
> -		die_if_unmatched "$mode" "$sha1"
> -		if test -e "$sm_path"/.git
> -		then
> -			displaypath=$(git submodule--helper relative-path "$prefix$sm_path" "$wt_prefix")
> -			say "$(eval_gettext "Entering '\$displaypath'")"
> -			name=$(git submodule--helper name "$sm_path")
> -			(
> -				prefix="$prefix$sm_path/"
> -				sanitize_submodule_env
> -				cd "$sm_path" &&
> -				# we make $path available to scripts ...
> -				path=$sm_path &&
> -				sm_path=$displaypath &&
> -				if test $# -eq 1
> -				then
> -					eval "$1"
> -				else
> -					"$@"
> -				fi &&
> -				if test -n "$recursive"
> -				then
> -					cmd_foreach "--recursive" "$@"
> -				fi
> -			) <&3 3<&- ||
> -			die "$(eval_gettext "Stopping at '\$displaypath'; script returned non-zero status.")"
> -		fi
> -	done
>  }
>  
>  #
> -- 
> 2.11.0
> 

Looking good!

-- 
Brandon Williams

^ permalink raw reply	[relevance 7%]

* Re: [GSoC][PATCH v5 1/3] submodule: fix buggy $path and $sm_path variable's value
  2017-05-26 16:31       ` Ramsay Jones
@ 2017-05-26 17:07         ` Stefan Beller
  2017-05-27  1:10           ` Ramsay Jones
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-05-26 17:07 UTC (permalink / raw)
  To: Ramsay Jones; +Cc: Prathamesh Chavan, git, Brandon Williams, Christian Couder

On Fri, May 26, 2017 at 9:31 AM, Ramsay Jones
<ramsay@ramsayjones.plus.com> wrote:
>
>
> On 26/05/17 16:17, Prathamesh Chavan wrote:
>> According to the documentation about git-submodule foreach subcommand's
>> $path variable:
>> $path is the name of the submodule directory relative to the superproject
>>
>> But it was observed when the value of the $path value deviates from this
>> for the nested submodules when the <command> is run from a subdirectory.
>> This patch aims for its correction.
>>
>> Mentored-by: Christian Couder <christian.couder@gmail.com>
>> Mentored-by: Stefan Beller <sbeller@google.com>
>> Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
>> ---
>> This series of patch is based on gitster/jk/bug-to-abort for untilizing its
>> BUG() macro.
>>
>> The observation made was as follows:
>> For a project - super containing dir (not a submodule) and a submodule sub
>> which contains another submodule subsub. When we run a command from super/dir:
>>
>> git submodule foreach "echo \$path-\$sm_path"
>>
>> actual results:
>> Entering '../sub'
>> ../sub-../sub
>> Entering '../sub/subsub'
>> ../subsub-../subsub
>>
>> expected result wrt documentation and current test suite:
>> Entering '../sub'
>> sub-../sub
>> Entering '../sub/subsub'
>> subsub-../sub/subsub
>>
>> This make the value of $path confusing and I also feel it deviates from its
>> documentation:
>> $path is the name of the submodule directory relative to the superproject.
>> Hence, this patch corrects the value assigned to the $path and $sm_path.
>>
>>  git-submodule.sh | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/git-submodule.sh b/git-submodule.sh
>> index c0d0e9a4c..ea6f56337 100755
>> --- a/git-submodule.sh
>> +++ b/git-submodule.sh
>> @@ -344,9 +344,9 @@ cmd_foreach()
>>                               prefix="$prefix$sm_path/"
>>                               sanitize_submodule_env
>>                               cd "$sm_path" &&
>> -                             sm_path=$(git submodule--helper relative-path "$sm_path" "$wt_prefix") &&
>>                               # we make $path available to scripts ...
>>                               path=$sm_path &&
>> +                             sm_path=$displaypath &&
>>                               if test $# -eq 1
>>                               then
>>                                       eval "$1"
>>
>
> Hmm, I'm not sure which documentation you are referring to,

Quite likely our fine manual pages. ;)

       foreach [--recursive] <command>
           Evaluates an arbitrary shell command in each checked out submodule.
           The command has access to the variables $name, $path, $sha1 and
           $toplevel: $name is the name of the relevant submodule section in
           .gitmodules, $path is the name of the submodule directory relative
           to the superproject, $sha1 is the commit as recorded in the
           superproject, and $toplevel is the absolute path to the top-level
           of the superproject. Any submodules defined in the superproject but
           not checked out are ignored by this command. Unless given --quiet,
           foreach prints the name of each submodule before evaluating the
           command. If --recursive is given, submodules are traversed
           recursively (i.e. the given shell command is evaluated in nested
           submodules as well). A non-zero return from the command in any
           submodule causes the processing to terminate. This can be
           overridden by adding || : to the end of the command.

As $path is documented and $sm_path is not, we should care about
$path first to be correct and either fix the documentation or the implementation
such that we have a consistent world view. :)

> but if
> $path != $sm_path then something is wrong. (unless their definition
> has changed, of course).

I would lean in doing so (changing their definition):

    $path (as documented) is the name of the submodule directory
    relative to the direct superproject (so in nested submodules you
    go up only one level).

$sm_path on the other hand is not documented at all and yields
non-sense results in corner cases.

With this patch it becomes less non-sensey and could be documented as:

    $sm_path is the relative path from the current working directory
    to the submodule (ignoring relations to the superproject or nesting
    of submodules). This documentation also fits into the narrative of
    the test in t7407.

Thanks,
Stefan

^ permalink raw reply	[relevance 25%]

* [PATCHv2 0/8] A reroll of sb/submodule-blanket-recursive
@ 2017-05-26 19:10 Stefan Beller
  2017-05-26 19:10 ` [PATCH 1/8] submodule recursing: do not write a config variable twice Stefan Beller
                   ` (8 more replies)
  0 siblings, 9 replies; 200+ results
From: Stefan Beller @ 2017-05-26 19:10 UTC (permalink / raw)
  To: bmwill; +Cc: git, gitster, Stefan Beller

v2:
* A reroll of sb/submodule-blanket-recursive.
* This requires ab/grep-preparatory-cleanup 
* It changed a lot from v1, as in v1 the tests did not work,
  hence the code was broken. Now it actually works.
* it also includes grep, fetch, push in addition to plain working tree
  manipulators.

Thanks,
Stefan

Stefan Beller (8):
  submodule recursing: do not write a config variable twice
  submodule test invocation: only pass additional arguments
  reset/checkout/read-tree: unify config callback for submodule
    recursion
  submodule loading: separate code path for .gitmodules and config
    overlay
  Introduce 'submodule.recurse' option for worktree manipulators
  builtin/grep.c: respect 'submodule.recurse' option
  builtin/push.c: respect 'submodule.recurse' option
  builtin/fetch.c: respect 'submodule.recurse' option

 Documentation/config.txt           |  5 +++
 builtin/checkout.c                 | 31 ++----------------
 builtin/fetch.c                    |  7 +++++
 builtin/grep.c                     |  3 ++
 builtin/push.c                     |  4 +++
 builtin/read-tree.c                | 32 ++++++-------------
 builtin/reset.c                    | 39 +++++++----------------
 submodule.c                        | 64 +++++++++++++++++++++++++++++++++-----
 submodule.h                        |  7 ++++-
 t/lib-submodule-update.sh          | 22 ++++++++++---
 t/t1013-read-tree-submodule.sh     |  4 +--
 t/t2013-checkout-submodule.sh      |  4 +--
 t/t5526-fetch-submodules.sh        | 10 ++++++
 t/t5531-deep-submodule-push.sh     | 21 +++++++++++++
 t/t7112-reset-submodule.sh         |  4 +--
 t/t7814-grep-recurse-submodules.sh | 18 +++++++++++
 16 files changed, 178 insertions(+), 97 deletions(-)

-- 
2.13.0.17.g582985b1e4


^ permalink raw reply	[relevance 35%]

* [PATCH 1/8] submodule recursing: do not write a config variable twice
  2017-05-26 19:10 [PATCHv2 0/8] A reroll of sb/submodule-blanket-recursive Stefan Beller
@ 2017-05-26 19:10 ` Stefan Beller
  2017-05-26 19:10 ` [PATCH 2/8] submodule test invocation: only pass additional arguments Stefan Beller
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-05-26 19:10 UTC (permalink / raw)
  To: bmwill; +Cc: git, gitster, Stefan Beller

The command line option for '--recurse-submodules' is implemented
using an OPTION_CALLBACK, which takes both the callback (that sets
the file static global variable) as well as passes the same file
static global variable to the option parsing machinery to assign it.
This is fixed in this commit by passing NULL as the variable. The
callback sets it instead

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 builtin/checkout.c  | 2 +-
 builtin/read-tree.c | 2 +-
 builtin/reset.c     | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index bfa5419f33..0fd57672cc 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -1181,7 +1181,7 @@ int cmd_checkout(int argc, const char **argv, const char *prefix)
 				N_("second guess 'git checkout <no-such-branch>'")),
 		OPT_BOOL(0, "ignore-other-worktrees", &opts.ignore_other_worktrees,
 			 N_("do not check if another worktree is holding the given ref")),
-		{ OPTION_CALLBACK, 0, "recurse-submodules", &recurse_submodules,
+		{ OPTION_CALLBACK, 0, "recurse-submodules", NULL,
 			    "checkout", "control recursive updating of submodules",
 			    PARSE_OPT_OPTARG, option_parse_recurse_submodules },
 		OPT_BOOL(0, "progress", &opts.show_progress, N_("force progress reporting")),
diff --git a/builtin/read-tree.c b/builtin/read-tree.c
index 23e212ee8c..2a1b8a530e 100644
--- a/builtin/read-tree.c
+++ b/builtin/read-tree.c
@@ -157,7 +157,7 @@ int cmd_read_tree(int argc, const char **argv, const char *unused_prefix)
 			 N_("skip applying sparse checkout filter")),
 		OPT_BOOL(0, "debug-unpack", &opts.debug_unpack,
 			 N_("debug unpack-trees")),
-		{ OPTION_CALLBACK, 0, "recurse-submodules", &recurse_submodules,
+		{ OPTION_CALLBACK, 0, "recurse-submodules", NULL,
 			    "checkout", "control recursive updating of submodules",
 			    PARSE_OPT_OPTARG, option_parse_recurse_submodules },
 		OPT_END()
diff --git a/builtin/reset.c b/builtin/reset.c
index 5ce27fcaed..1e5f85b1fb 100644
--- a/builtin/reset.c
+++ b/builtin/reset.c
@@ -304,7 +304,7 @@ int cmd_reset(int argc, const char **argv, const char *prefix)
 				N_("reset HEAD, index and working tree"), MERGE),
 		OPT_SET_INT(0, "keep", &reset_type,
 				N_("reset HEAD but keep local changes"), KEEP),
-		{ OPTION_CALLBACK, 0, "recurse-submodules", &recurse_submodules,
+		{ OPTION_CALLBACK, 0, "recurse-submodules", NULL,
 			    "reset", "control recursive updating of submodules",
 			    PARSE_OPT_OPTARG, option_parse_recurse_submodules },
 		OPT_BOOL('p', "patch", &patch_mode, N_("select hunks interactively")),
-- 
2.13.0.17.g582985b1e4


^ permalink raw reply	[relevance 29%]

* [PATCH 2/8] submodule test invocation: only pass additional arguments
  2017-05-26 19:10 [PATCHv2 0/8] A reroll of sb/submodule-blanket-recursive Stefan Beller
  2017-05-26 19:10 ` [PATCH 1/8] submodule recursing: do not write a config variable twice Stefan Beller
@ 2017-05-26 19:10 ` Stefan Beller
  2017-05-26 19:10 ` [PATCH 3/8] reset/checkout/read-tree: unify config callback for submodule recursion Stefan Beller
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-05-26 19:10 UTC (permalink / raw)
  To: bmwill; +Cc: git, gitster, Stefan Beller

In a later patch we want to introduce a config option to trigger the
submodule recursing by default. As this option should be available and
uniform across all commands that deal with submodules we'd want to test
for this option in the submodule update library.

So instead of calling the whole test set again for
"git -c submodule.recurse foo" instead of "git foo --recurse-submodules",
we'd only want to introduce one basic test that tests if the option is
recognized and respected to not overload the test suite.

Change the test functions by taking only the argument and assemble the
command inside the test function by embedding the arguments into the
command that is "git $arguments --recurse-submodules".

It would be nice to do this for all functions in lib-submodule-update,
but we cannot do that for the non-recursing tests, as there we do not
just pass in a git command but whole functions. (See t3426 for example)

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 t/lib-submodule-update.sh      | 10 ++++++----
 t/t1013-read-tree-submodule.sh |  4 ++--
 t/t2013-checkout-submodule.sh  |  4 ++--
 t/t7112-reset-submodule.sh     |  4 ++--
 4 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/t/lib-submodule-update.sh b/t/lib-submodule-update.sh
index f0b1b18206..0272c4d8ca 100755
--- a/t/lib-submodule-update.sh
+++ b/t/lib-submodule-update.sh
@@ -781,8 +781,9 @@ test_submodule_forced_switch () {
 # - Removing a submodule with a git directory absorbs the submodules
 #   git directory first into the superproject.
 
-test_submodule_switch_recursing () {
-	command="$1"
+test_submodule_switch_recursing_with_args () {
+	cmd_args="$1"
+	command="git $cmd_args --recurse-submodules"
 	RESULTDS=success
 	if test "$KNOWN_FAILURE_DIRECTORY_SUBMODULE_CONFLICTS" = 1
 	then
@@ -1021,8 +1022,9 @@ test_submodule_switch_recursing () {
 # Test that submodule contents are updated when switching between commits
 # that change a submodule, but throwing away local changes in
 # the superproject as well as the submodule is allowed.
-test_submodule_forced_switch_recursing () {
-	command="$1"
+test_submodule_forced_switch_recursing_with_args () {
+	cmd_args="$1"
+	command="git $cmd_args --recurse-submodules"
 	RESULT=success
 	if test "$KNOWN_FAILURE_DIRECTORY_SUBMODULE_CONFLICTS" = 1
 	then
diff --git a/t/t1013-read-tree-submodule.sh b/t/t1013-read-tree-submodule.sh
index de1ba02dc5..2c8d620324 100755
--- a/t/t1013-read-tree-submodule.sh
+++ b/t/t1013-read-tree-submodule.sh
@@ -9,9 +9,9 @@ KNOWN_FAILURE_SUBMODULE_RECURSIVE_NESTED=1
 KNOWN_FAILURE_DIRECTORY_SUBMODULE_CONFLICTS=1
 KNOWN_FAILURE_SUBMODULE_OVERWRITE_IGNORED_UNTRACKED=1
 
-test_submodule_switch_recursing "git read-tree --recurse-submodules -u -m"
+test_submodule_switch_recursing_with_args "read-tree -u -m"
 
-test_submodule_forced_switch_recursing "git read-tree --recurse-submodules -u --reset"
+test_submodule_forced_switch_recursing_with_args "read-tree -u --reset"
 
 test_submodule_switch "git read-tree -u -m"
 
diff --git a/t/t2013-checkout-submodule.sh b/t/t2013-checkout-submodule.sh
index e8f70b806f..c962a02277 100755
--- a/t/t2013-checkout-submodule.sh
+++ b/t/t2013-checkout-submodule.sh
@@ -65,9 +65,9 @@ test_expect_success '"checkout <submodule>" honors submodule.*.ignore from .git/
 
 KNOWN_FAILURE_DIRECTORY_SUBMODULE_CONFLICTS=1
 KNOWN_FAILURE_SUBMODULE_RECURSIVE_NESTED=1
-test_submodule_switch_recursing "git checkout --recurse-submodules"
+test_submodule_switch_recursing_with_args "checkout"
 
-test_submodule_forced_switch_recursing "git checkout -f --recurse-submodules"
+test_submodule_forced_switch_recursing_with_args "checkout -f"
 
 test_submodule_switch "git checkout"
 
diff --git a/t/t7112-reset-submodule.sh b/t/t7112-reset-submodule.sh
index f86ccdf215..a1cb9ff858 100755
--- a/t/t7112-reset-submodule.sh
+++ b/t/t7112-reset-submodule.sh
@@ -9,9 +9,9 @@ KNOWN_FAILURE_SUBMODULE_RECURSIVE_NESTED=1
 KNOWN_FAILURE_DIRECTORY_SUBMODULE_CONFLICTS=1
 KNOWN_FAILURE_SUBMODULE_OVERWRITE_IGNORED_UNTRACKED=1
 
-test_submodule_switch_recursing "git reset --recurse-submodules --keep"
+test_submodule_switch_recursing_with_args "reset --keep"
 
-test_submodule_forced_switch_recursing "git reset --hard --recurse-submodules"
+test_submodule_forced_switch_recursing_with_args "reset --hard"
 
 test_submodule_switch "git reset --keep"
 
-- 
2.13.0.17.g582985b1e4


^ permalink raw reply	[relevance 28%]

* [PATCH 3/8] reset/checkout/read-tree: unify config callback for submodule recursion
  2017-05-26 19:10 [PATCHv2 0/8] A reroll of sb/submodule-blanket-recursive Stefan Beller
  2017-05-26 19:10 ` [PATCH 1/8] submodule recursing: do not write a config variable twice Stefan Beller
  2017-05-26 19:10 ` [PATCH 2/8] submodule test invocation: only pass additional arguments Stefan Beller
@ 2017-05-26 19:10 ` Stefan Beller
  2017-05-26 19:10 ` [PATCH 4/8] submodule loading: separate code path for .gitmodules and config overlay Stefan Beller
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-05-26 19:10 UTC (permalink / raw)
  To: bmwill; +Cc: git, gitster, Stefan Beller

The callback function is essentially duplicated 3 times. Remove all
of them and offer a new callback function, that lives in submodule.c

By putting the callback function there, we no longer need the function
'set_config_update_recurse_submodules', nor duplicate the global variable
in each builtin as well as submodule.c

In the three builtins we have different 2 ways how to load the .gitmodules
and config file, which are slightly different. git-checkout has to load
the submodule config all the time due to 23b4c7bcc5 (checkout: Use
submodule.*.ignore settings from .git/config and .gitmodules, 2010-08-28)

git-reset and git-read-tree do not respect these diff settings, so loading
the submodule configuration is optional. Also put that into submodule.c
for code deduplication.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 builtin/checkout.c  | 27 +--------------------------
 builtin/read-tree.c | 28 +++-------------------------
 builtin/reset.c     | 27 ++-------------------------
 submodule.c         | 33 +++++++++++++++++++++++++++------
 submodule.h         |  6 +++++-
 5 files changed, 38 insertions(+), 83 deletions(-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index 0fd57672cc..acff6039d6 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -21,31 +21,12 @@
 #include "submodule-config.h"
 #include "submodule.h"
 
-static int recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
-
 static const char * const checkout_usage[] = {
 	N_("git checkout [<options>] <branch>"),
 	N_("git checkout [<options>] [<branch>] -- <file>..."),
 	NULL,
 };
 
-static int option_parse_recurse_submodules(const struct option *opt,
-					   const char *arg, int unset)
-{
-	if (unset) {
-		recurse_submodules = RECURSE_SUBMODULES_OFF;
-		return 0;
-	}
-	if (arg)
-		recurse_submodules =
-			parse_update_recurse_submodules_arg(opt->long_name,
-							    arg);
-	else
-		recurse_submodules = RECURSE_SUBMODULES_ON;
-
-	return 0;
-}
-
 struct checkout_opts {
 	int patch_mode;
 	int quiet;
@@ -1183,7 +1164,7 @@ int cmd_checkout(int argc, const char **argv, const char *prefix)
 			 N_("do not check if another worktree is holding the given ref")),
 		{ OPTION_CALLBACK, 0, "recurse-submodules", NULL,
 			    "checkout", "control recursive updating of submodules",
-			    PARSE_OPT_OPTARG, option_parse_recurse_submodules },
+			    PARSE_OPT_OPTARG, option_parse_recurse_submodules_worktree_updater },
 		OPT_BOOL(0, "progress", &opts.show_progress, N_("force progress reporting")),
 		OPT_END(),
 	};
@@ -1214,12 +1195,6 @@ int cmd_checkout(int argc, const char **argv, const char *prefix)
 		git_xmerge_config("merge.conflictstyle", conflict_style, NULL);
 	}
 
-	if (recurse_submodules != RECURSE_SUBMODULES_OFF) {
-		git_config(submodule_config, NULL);
-		if (recurse_submodules != RECURSE_SUBMODULES_DEFAULT)
-			set_config_update_recurse_submodules(recurse_submodules);
-	}
-
 	if ((!!opts.new_branch + !!opts.new_branch_force + !!opts.new_orphan_branch) > 1)
 		die(_("-b, -B and --orphan are mutually exclusive"));
 
diff --git a/builtin/read-tree.c b/builtin/read-tree.c
index 2a1b8a530e..8a889ef4c3 100644
--- a/builtin/read-tree.c
+++ b/builtin/read-tree.c
@@ -21,7 +21,6 @@
 static int nr_trees;
 static int read_empty;
 static struct tree *trees[MAX_UNPACK_TREES];
-static int recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
 
 static int list_tree(unsigned char *sha1)
 {
@@ -99,23 +98,6 @@ static int debug_merge(const struct cache_entry * const *stages,
 	return 0;
 }
 
-static int option_parse_recurse_submodules(const struct option *opt,
-					   const char *arg, int unset)
-{
-	if (unset) {
-		recurse_submodules = RECURSE_SUBMODULES_OFF;
-		return 0;
-	}
-	if (arg)
-		recurse_submodules =
-			parse_update_recurse_submodules_arg(opt->long_name,
-							    arg);
-	else
-		recurse_submodules = RECURSE_SUBMODULES_ON;
-
-	return 0;
-}
-
 static struct lock_file lock_file;
 
 int cmd_read_tree(int argc, const char **argv, const char *unused_prefix)
@@ -159,7 +141,7 @@ int cmd_read_tree(int argc, const char **argv, const char *unused_prefix)
 			 N_("debug unpack-trees")),
 		{ OPTION_CALLBACK, 0, "recurse-submodules", NULL,
 			    "checkout", "control recursive updating of submodules",
-			    PARSE_OPT_OPTARG, option_parse_recurse_submodules },
+			    PARSE_OPT_OPTARG, option_parse_recurse_submodules_worktree_updater },
 		OPT_END()
 	};
 
@@ -173,13 +155,9 @@ int cmd_read_tree(int argc, const char **argv, const char *unused_prefix)
 	argc = parse_options(argc, argv, unused_prefix, read_tree_options,
 			     read_tree_usage, 0);
 
-	hold_locked_index(&lock_file, LOCK_DIE_ON_ERROR);
+	load_submodule_cache();
 
-	if (recurse_submodules != RECURSE_SUBMODULES_DEFAULT) {
-		gitmodules_config();
-		git_config(submodule_config, NULL);
-		set_config_update_recurse_submodules(RECURSE_SUBMODULES_ON);
-	}
+	hold_locked_index(&lock_file, LOCK_DIE_ON_ERROR);
 
 	prefix_set = opts.prefix ? 1 : 0;
 	if (1 < opts.merge + opts.reset + prefix_set)
diff --git a/builtin/reset.c b/builtin/reset.c
index 1e5f85b1fb..6f89dc5494 100644
--- a/builtin/reset.c
+++ b/builtin/reset.c
@@ -24,25 +24,6 @@
 #include "submodule.h"
 #include "submodule-config.h"
 
-static int recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
-
-static int option_parse_recurse_submodules(const struct option *opt,
-					   const char *arg, int unset)
-{
-	if (unset) {
-		recurse_submodules = RECURSE_SUBMODULES_OFF;
-		return 0;
-	}
-	if (arg)
-		recurse_submodules =
-			parse_update_recurse_submodules_arg(opt->long_name,
-							    arg);
-	else
-		recurse_submodules = RECURSE_SUBMODULES_ON;
-
-	return 0;
-}
-
 static const char * const git_reset_usage[] = {
 	N_("git reset [--mixed | --soft | --hard | --merge | --keep] [-q] [<commit>]"),
 	N_("git reset [-q] [<tree-ish>] [--] <paths>..."),
@@ -306,7 +287,7 @@ int cmd_reset(int argc, const char **argv, const char *prefix)
 				N_("reset HEAD but keep local changes"), KEEP),
 		{ OPTION_CALLBACK, 0, "recurse-submodules", NULL,
 			    "reset", "control recursive updating of submodules",
-			    PARSE_OPT_OPTARG, option_parse_recurse_submodules },
+			    PARSE_OPT_OPTARG, option_parse_recurse_submodules_worktree_updater },
 		OPT_BOOL('p', "patch", &patch_mode, N_("select hunks interactively")),
 		OPT_BOOL('N', "intent-to-add", &intent_to_add,
 				N_("record only the fact that removed paths will be added later")),
@@ -319,11 +300,7 @@ int cmd_reset(int argc, const char **argv, const char *prefix)
 						PARSE_OPT_KEEP_DASHDASH);
 	parse_args(&pathspec, argv, prefix, patch_mode, &rev);
 
-	if (recurse_submodules != RECURSE_SUBMODULES_DEFAULT) {
-		gitmodules_config();
-		git_config(submodule_config, NULL);
-		set_config_update_recurse_submodules(RECURSE_SUBMODULES_ON);
-	}
+	load_submodule_cache();
 
 	unborn = !strcmp(rev, "HEAD") && get_sha1("HEAD", oid.hash);
 	if (unborn) {
diff --git a/submodule.c b/submodule.c
index 54825100b2..c9e764b519 100644
--- a/submodule.c
+++ b/submodule.c
@@ -18,7 +18,7 @@
 #include "worktree.h"
 
 static int config_fetch_recurse_submodules = RECURSE_SUBMODULES_ON_DEMAND;
-static int config_update_recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
+static int config_update_recurse_submodules = RECURSE_SUBMODULES_OFF;
 static int parallel_jobs = 1;
 static struct string_list changed_submodule_paths = STRING_LIST_INIT_DUP;
 static int initialized_fetch_ref_tips;
@@ -169,6 +169,32 @@ int submodule_config(const char *var, const char *value, void *cb)
 	return 0;
 }
 
+int option_parse_recurse_submodules_worktree_updater(const struct option *opt,
+						     const char *arg, int unset)
+{
+	if (unset) {
+		config_update_recurse_submodules = RECURSE_SUBMODULES_OFF;
+		return 0;
+	}
+	if (arg)
+		config_update_recurse_submodules =
+			parse_update_recurse_submodules_arg(opt->long_name,
+							    arg);
+	else
+		config_update_recurse_submodules = RECURSE_SUBMODULES_ON;
+
+	return 0;
+}
+
+void load_submodule_cache(void)
+{
+	if (config_update_recurse_submodules == RECURSE_SUBMODULES_OFF)
+		return;
+
+	gitmodules_config();
+	git_config(submodule_config, NULL);
+}
+
 void gitmodules_config(void)
 {
 	const char *work_tree = get_git_work_tree();
@@ -596,11 +622,6 @@ void set_config_fetch_recurse_submodules(int value)
 	config_fetch_recurse_submodules = value;
 }
 
-void set_config_update_recurse_submodules(int value)
-{
-	config_update_recurse_submodules = value;
-}
-
 int should_update_submodules(void)
 {
 	return config_update_recurse_submodules == RECURSE_SUBMODULES_ON;
diff --git a/submodule.h b/submodule.h
index 1277480add..b13f120f76 100644
--- a/submodule.h
+++ b/submodule.h
@@ -39,6 +39,11 @@ extern void stage_updated_gitmodules(void);
 extern void set_diffopt_flags_from_submodule_config(struct diff_options *,
 		const char *path);
 extern int submodule_config(const char *var, const char *value, void *cb);
+
+struct option;
+int option_parse_recurse_submodules_worktree_updater(const struct option *opt,
+						     const char *arg, int unset);
+void load_submodule_cache(void);
 extern void gitmodules_config(void);
 extern void gitmodules_config_sha1(const unsigned char *commit_sha1);
 extern int is_submodule_initialized(const char *path);
@@ -65,7 +70,6 @@ extern void show_submodule_inline_diff(FILE *f, const char *path,
 		const char *del, const char *add, const char *reset,
 		const struct diff_options *opt);
 extern void set_config_fetch_recurse_submodules(int value);
-extern void set_config_update_recurse_submodules(int value);
 /* Check if we want to update any submodule.*/
 extern int should_update_submodules(void);
 /*
-- 
2.13.0.17.g582985b1e4


^ permalink raw reply	[relevance 22%]

* [PATCH 4/8] submodule loading: separate code path for .gitmodules and config overlay
  2017-05-26 19:10 [PATCHv2 0/8] A reroll of sb/submodule-blanket-recursive Stefan Beller
                   ` (2 preceding siblings ...)
  2017-05-26 19:10 ` [PATCH 3/8] reset/checkout/read-tree: unify config callback for submodule recursion Stefan Beller
@ 2017-05-26 19:10 ` Stefan Beller
  2017-05-26 19:10 ` [PATCH 5/8] Introduce 'submodule.recurse' option for worktree manipulators Stefan Beller
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-05-26 19:10 UTC (permalink / raw)
  To: bmwill; +Cc: git, gitster, Stefan Beller

The .gitmodules file is not supposed to have all the options available,
that are available in the configuration so separate it out.

A configuration option such as the hypothetical submodule.color.diff
that determines in which color a submodule change is printed,
is a very user specific thing, that the .gitmodules file should
not tamper with.

The .gitmodules file should only be used for settings that required
to setup the project in which the .gitmodules file is tracked. As the
minimum this would only include the name<->path mapping of the
submodule and its URL and branch.

Any further setting (such as 'fetch.recursesubmodules' or
'submodule.<name>.{update, ignore, shallow}') is not specific
to the project setup requirements, but rather is a distribution
of suggested developer configurations.  In other areas of Git
a suggested developer configuration is not transported in-tree
but via other means.  In an organisation this could be done
by deploying an opinionated system wide config (/etc/gitconfig)
or by putting the settings in the users home directory when
they start at the organisation. In open source projects this
is often accomplished via extensive READMEs (cf. our
SubmittingPatches/CodingGuidlines).

As a later patch in this series wants to introduce
a generic submodule recursion option, we want to make
sure that switch is not exposed via the gitmodules file.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 submodule.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/submodule.c b/submodule.c
index c9e764b519..78cccb7563 100644
--- a/submodule.c
+++ b/submodule.c
@@ -153,7 +153,8 @@ void set_diffopt_flags_from_submodule_config(struct diff_options *diffopt,
 	}
 }
 
-int submodule_config(const char *var, const char *value, void *cb)
+/* For loading from the .gitmodules file. */
+static int git_modules_config(const char *var, const char *value, void *cb)
 {
 	if (!strcmp(var, "submodule.fetchjobs")) {
 		parallel_jobs = git_config_int(var, value);
@@ -169,6 +170,12 @@ int submodule_config(const char *var, const char *value, void *cb)
 	return 0;
 }
 
+/* Loads all submodule settings from the config */
+int submodule_config(const char *var, const char *value, void *cb)
+{
+	return git_modules_config(var, value, cb);
+}
+
 int option_parse_recurse_submodules_worktree_updater(const struct option *opt,
 						     const char *arg, int unset)
 {
@@ -222,7 +229,8 @@ void gitmodules_config(void)
 		}
 
 		if (!gitmodules_is_unmerged)
-			git_config_from_file(submodule_config, gitmodules_path.buf, NULL);
+			git_config_from_file(git_modules_config,
+				gitmodules_path.buf, NULL);
 		strbuf_release(&gitmodules_path);
 	}
 }
@@ -233,7 +241,7 @@ void gitmodules_config_sha1(const unsigned char *commit_sha1)
 	unsigned char sha1[20];
 
 	if (gitmodule_sha1_from_commit(commit_sha1, sha1, &rev)) {
-		git_config_from_blob_sha1(submodule_config, rev.buf,
+		git_config_from_blob_sha1(git_modules_config, rev.buf,
 					  sha1, NULL);
 	}
 	strbuf_release(&rev);
-- 
2.13.0.17.g582985b1e4


^ permalink raw reply	[relevance 29%]

* [PATCH 5/8] Introduce 'submodule.recurse' option for worktree manipulators
  2017-05-26 19:10 [PATCHv2 0/8] A reroll of sb/submodule-blanket-recursive Stefan Beller
                   ` (3 preceding siblings ...)
  2017-05-26 19:10 ` [PATCH 4/8] submodule loading: separate code path for .gitmodules and config overlay Stefan Beller
@ 2017-05-26 19:10 ` Stefan Beller
  2017-05-26 19:10 ` [PATCH 6/8] builtin/grep.c: respect 'submodule.recurse' option Stefan Beller
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-05-26 19:10 UTC (permalink / raw)
  To: bmwill; +Cc: git, gitster, Stefan Beller

Any command that understands '--recurse-submodules' can have its
default changed to true, by setting the new 'submodule.recurse'
option.

This patch includes read-tree/checkout/reset for working tree
manipulating commands. Later patches will cover other commands.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/config.txt  |  5 +++++
 builtin/checkout.c        |  2 +-
 builtin/read-tree.c       | 10 +++++++++-
 builtin/reset.c           | 10 +++++++++-
 submodule.c               | 23 +++++++++++++++++++++--
 submodule.h               |  1 +
 t/lib-submodule-update.sh | 12 ++++++++++++
 7 files changed, 58 insertions(+), 5 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 475e874d51..e367becf72 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -3063,6 +3063,11 @@ submodule.active::
 	submodule's path to determine if the submodule is of interest to git
 	commands.
 
+submodule.recurse::
+	Specifies if commands recurse into submodules by default. This
+	applies to all commands that have a `--recurse-submodules` option.
+	Defaults to false.
+
 submodule.fetchJobs::
 	Specifies how many submodules are fetched/cloned at the same time.
 	A positive integer allows up to that number of submodules fetched
diff --git a/builtin/checkout.c b/builtin/checkout.c
index acff6039d6..9ccc4a1d52 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -854,7 +854,7 @@ static int git_checkout_config(const char *var, const char *value, void *cb)
 	}
 
 	if (starts_with(var, "submodule."))
-		return parse_submodule_config_option(var, value);
+		return submodule_config(var, value, NULL);
 
 	return git_xmerge_config(var, value, NULL);
 }
diff --git a/builtin/read-tree.c b/builtin/read-tree.c
index 8a889ef4c3..6dd70cd430 100644
--- a/builtin/read-tree.c
+++ b/builtin/read-tree.c
@@ -98,6 +98,14 @@ static int debug_merge(const struct cache_entry * const *stages,
 	return 0;
 }
 
+int git_read_tree_config(const char *var, const char *value, void *cb)
+{
+	if (!strcmp(var, "submodule.recurse"))
+		return git_default_submodule_config(var, value, cb);
+
+	return git_default_config(var, value, cb);
+}
+
 static struct lock_file lock_file;
 
 int cmd_read_tree(int argc, const char **argv, const char *unused_prefix)
@@ -150,7 +158,7 @@ int cmd_read_tree(int argc, const char **argv, const char *unused_prefix)
 	opts.src_index = &the_index;
 	opts.dst_index = &the_index;
 
-	git_config(git_default_config, NULL);
+	git_config(git_read_tree_config, NULL);
 
 	argc = parse_options(argc, argv, unused_prefix, read_tree_options,
 			     read_tree_usage, 0);
diff --git a/builtin/reset.c b/builtin/reset.c
index 6f89dc5494..8ccdb7437e 100644
--- a/builtin/reset.c
+++ b/builtin/reset.c
@@ -266,6 +266,14 @@ static int reset_refs(const char *rev, const struct object_id *oid)
 	return update_ref_status;
 }
 
+int git_reset_config(const char *var, const char *value, void *cb)
+{
+	if (!strcmp(var, "submodule.recurse"))
+		return git_default_submodule_config(var, value, cb);
+
+	return git_default_config(var, value, cb);
+}
+
 int cmd_reset(int argc, const char **argv, const char *prefix)
 {
 	int reset_type = NONE, update_ref_status = 0, quiet = 0;
@@ -294,7 +302,7 @@ int cmd_reset(int argc, const char **argv, const char *prefix)
 		OPT_END()
 	};
 
-	git_config(git_default_config, NULL);
+	git_config(git_reset_config, NULL);
 
 	argc = parse_options(argc, argv, prefix, options, git_reset_usage,
 						PARSE_OPT_KEEP_DASHDASH);
diff --git a/submodule.c b/submodule.c
index 78cccb7563..2b157dc995 100644
--- a/submodule.c
+++ b/submodule.c
@@ -16,6 +16,7 @@
 #include "quote.h"
 #include "remote.h"
 #include "worktree.h"
+#include "parse-options.h"
 
 static int config_fetch_recurse_submodules = RECURSE_SUBMODULES_ON_DEMAND;
 static int config_update_recurse_submodules = RECURSE_SUBMODULES_OFF;
@@ -170,10 +171,28 @@ static int git_modules_config(const char *var, const char *value, void *cb)
 	return 0;
 }
 
-/* Loads all submodule settings from the config */
+/* Loads all submodule settings from the config. */
 int submodule_config(const char *var, const char *value, void *cb)
 {
-	return git_modules_config(var, value, cb);
+	if (!strcmp(var, "submodule.recurse")) {
+		int v = git_config_bool(var, value) ?
+			RECURSE_SUBMODULES_ON : RECURSE_SUBMODULES_OFF;
+		config_update_recurse_submodules = v;
+		return 0;
+	} else {
+		return git_modules_config(var, value, cb);
+	}
+}
+
+/* Cheap function that only determines if we're interested in submodules at all */
+int git_default_submodule_config(const char *var, const char *value, void *cb)
+{
+	if (!strcmp(var, "submodule.recurse")) {
+		int v = git_config_bool(var, value) ?
+			RECURSE_SUBMODULES_ON : RECURSE_SUBMODULES_OFF;
+		config_update_recurse_submodules = v;
+	}
+	return 0;
 }
 
 int option_parse_recurse_submodules_worktree_updater(const struct option *opt,
diff --git a/submodule.h b/submodule.h
index b13f120f76..d920ca1d5a 100644
--- a/submodule.h
+++ b/submodule.h
@@ -39,6 +39,7 @@ extern void stage_updated_gitmodules(void);
 extern void set_diffopt_flags_from_submodule_config(struct diff_options *,
 		const char *path);
 extern int submodule_config(const char *var, const char *value, void *cb);
+extern int git_default_submodule_config(const char *var, const char *value, void *cb);
 
 struct option;
 int option_parse_recurse_submodules_worktree_updater(const struct option *opt,
diff --git a/t/lib-submodule-update.sh b/t/lib-submodule-update.sh
index 0272c4d8ca..52beadad96 100755
--- a/t/lib-submodule-update.sh
+++ b/t/lib-submodule-update.sh
@@ -990,6 +990,18 @@ test_submodule_switch_recursing_with_args () {
 		)
 	'
 
+	test_expect_success "git -c submodule.recurse=true $cmd_args: modified submodule updates submodule work tree" '
+		prolog &&
+		reset_work_tree_to_interested add_sub1 &&
+		(
+			cd submodule_update &&
+			git branch -t modify_sub1 origin/modify_sub1 &&
+			git -c submodule.recurse=true $cmd_args modify_sub1 &&
+			test_superproject_content origin/modify_sub1 &&
+			test_submodule_content sub1 origin/modify_sub1
+		)
+	'
+
 	# Updating a submodule to an invalid sha1 doesn't update the
 	# superproject nor the submodule's work tree.
 	test_expect_success "$command: updating to a missing submodule commit fails" '
-- 
2.13.0.17.g582985b1e4


^ permalink raw reply	[relevance 24%]

* [PATCH 6/8] builtin/grep.c: respect 'submodule.recurse' option
  2017-05-26 19:10 [PATCHv2 0/8] A reroll of sb/submodule-blanket-recursive Stefan Beller
                   ` (4 preceding siblings ...)
  2017-05-26 19:10 ` [PATCH 5/8] Introduce 'submodule.recurse' option for worktree manipulators Stefan Beller
@ 2017-05-26 19:10 ` Stefan Beller
  2017-05-26 19:10 ` [PATCH 7/8] builtin/push.c: respect 'submodule.recurse' option Stefan Beller
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-05-26 19:10 UTC (permalink / raw)
  To: bmwill; +Cc: git, gitster, Stefan Beller

In builtin/grep.c we parse the config before evaluating the command line
options. This makes the task of teaching grep to respect the new config
option 'submodule.recurse' very easy by just parsing that option.

As an alternative I had implemented a similar structure to treat
submodules as the fetch/push command have, including
* aligning the meaning of the 'recurse_submodules' to possible submodule
  values RECURSE_SUBMODULES_* as defined in submodule.h.
* having a callback to parse the value and
* reacting to the RECURSE_SUBMODULES_DEFAULT state that was the initial
  state.

However all this is not needed for a true boolean value, so let's keep
it simple. However this adds another place where "submodule.recurse" is
parsed.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/grep.c                     |  3 +++
 t/t7814-grep-recurse-submodules.sh | 18 ++++++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/builtin/grep.c b/builtin/grep.c
index b1095362fb..454e263820 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -302,6 +302,9 @@ static int grep_cmd_config(const char *var, const char *value, void *cb)
 #endif
 	}
 
+	if (!strcmp(var, "submodule.recurse"))
+		recurse_submodules = git_config_bool(var, value);
+
 	return st;
 }
 
diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh
index 3a58197f47..7184113b9b 100755
--- a/t/t7814-grep-recurse-submodules.sh
+++ b/t/t7814-grep-recurse-submodules.sh
@@ -33,6 +33,24 @@ test_expect_success 'grep correctly finds patterns in a submodule' '
 	test_cmp expect actual
 '
 
+test_expect_success 'grep finds patterns in a submodule via config' '
+	test_config submodule.recurse true &&
+	# expect from previous test
+	git grep -e "(3|4)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'grep --no-recurse-submodules overrides config' '
+	test_config submodule.recurse true &&
+	cat >expect <<-\EOF &&
+	a:(1|2)d(3|4)
+	b/b:(3|4)
+	EOF
+
+	git grep -e "(3|4)" --no-recurse-submodules >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'grep and basic pathspecs' '
 	cat >expect <<-\EOF &&
 	submodule/a:(1|2)d(3|4)
-- 
2.13.0.17.g582985b1e4


^ permalink raw reply	[relevance 31%]

* [PATCH 7/8] builtin/push.c: respect 'submodule.recurse' option
  2017-05-26 19:10 [PATCHv2 0/8] A reroll of sb/submodule-blanket-recursive Stefan Beller
                   ` (5 preceding siblings ...)
  2017-05-26 19:10 ` [PATCH 6/8] builtin/grep.c: respect 'submodule.recurse' option Stefan Beller
@ 2017-05-26 19:10 ` Stefan Beller
  2017-05-26 19:10 ` [PATCH 8/8] builtin/fetch.c: respect 'submodule.recurse' option Stefan Beller
  2017-05-30  5:30 ` Junio C Hamano
  8 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-05-26 19:10 UTC (permalink / raw)
  To: bmwill; +Cc: git, gitster, Stefan Beller

The closest mapping from the boolean 'submodule.recurse' set to "yes"
to the variety of submodule push modes is "on-demand", so implement that.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/push.c                 |  4 ++++
 t/t5531-deep-submodule-push.sh | 21 +++++++++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/builtin/push.c b/builtin/push.c
index a597759d8f..258648d5fd 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -498,6 +498,10 @@ static int git_push_config(const char *k, const char *v, void *cb)
 		const char *value;
 		if (!git_config_get_value("push.recursesubmodules", &value))
 			recurse_submodules = parse_push_recurse_submodules_arg(k, value);
+	} else if (!strcmp(k, "submodule.recurse")) {
+		int val = git_config_bool(k, v) ?
+			RECURSE_SUBMODULES_ON_DEMAND : RECURSE_SUBMODULES_OFF;
+		recurse_submodules = val;
 	}
 
 	return git_default_config(k, v, NULL);
diff --git a/t/t5531-deep-submodule-push.sh b/t/t5531-deep-submodule-push.sh
index 57ba322628..712c595fd8 100755
--- a/t/t5531-deep-submodule-push.sh
+++ b/t/t5531-deep-submodule-push.sh
@@ -126,6 +126,27 @@ test_expect_success 'push succeeds if submodule commit not on remote but using o
 	)
 '
 
+test_expect_success 'push succeeds if submodule commit not on remote but using auto-on-demand via submodule.recurse config' '
+	(
+		cd work/gar/bage &&
+		>recurse-on-demand-from-submodule-recurse-config &&
+		git add recurse-on-demand-from-submodule-recurse-config &&
+		git commit -m "Recurse submodule.recurse from config junk"
+	) &&
+	(
+		cd work &&
+		git add gar/bage &&
+		git commit -m "Recurse submodule.recurse from config for gar/bage" &&
+		git -c submodule.recurse push ../pub.git master &&
+		# Check that the supermodule commit got there
+		git fetch ../pub.git &&
+		git diff --quiet FETCH_HEAD master &&
+		# Check that the submodule commit got there too
+		cd gar/bage &&
+		git diff --quiet origin/master master
+	)
+'
+
 test_expect_success 'push recurse-submodules on command line overrides config' '
 	(
 		cd work/gar/bage &&
-- 
2.13.0.17.g582985b1e4


^ permalink raw reply	[relevance 30%]

* [PATCH 8/8] builtin/fetch.c: respect 'submodule.recurse' option
  2017-05-26 19:10 [PATCHv2 0/8] A reroll of sb/submodule-blanket-recursive Stefan Beller
                   ` (6 preceding siblings ...)
  2017-05-26 19:10 ` [PATCH 7/8] builtin/push.c: respect 'submodule.recurse' option Stefan Beller
@ 2017-05-26 19:10 ` Stefan Beller
  2017-05-30  5:30 ` Junio C Hamano
  8 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-05-26 19:10 UTC (permalink / raw)
  To: bmwill; +Cc: git, gitster, Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 builtin/fetch.c             |  7 +++++++
 t/t5526-fetch-submodules.sh | 10 ++++++++++
 2 files changed, 17 insertions(+)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 5f2c2ab23e..c1ec3b03c3 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -73,6 +73,13 @@ static int git_fetch_config(const char *k, const char *v, void *cb)
 		fetch_prune_config = git_config_bool(k, v);
 		return 0;
 	}
+
+	if (!strcmp(k, "submodule.recurse")) {
+		int r = git_config_bool(k, v) ?
+			RECURSE_SUBMODULES_ON : RECURSE_SUBMODULES_OFF;
+		recurse_submodules = r;
+	}
+
 	return git_default_config(k, v, cb);
 }
 
diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index f3b0a8d30a..162baf101f 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -71,6 +71,16 @@ test_expect_success "fetch --recurse-submodules recurses into submodules" '
 	test_i18ncmp expect.err actual.err
 '
 
+test_expect_success "submodule.recurse option triggers recursive fetch" '
+	add_upstream_commit &&
+	(
+		cd downstream &&
+		git -c submodule.recurse fetch >../actual.out 2>../actual.err
+	) &&
+	test_must_be_empty actual.out &&
+	test_i18ncmp expect.err actual.err
+'
+
 test_expect_success "fetch --recurse-submodules -j2 has the same output behaviour" '
 	add_upstream_commit &&
 	(
-- 
2.13.0.17.g582985b1e4


^ permalink raw reply	[relevance 33%]

* [PATCH 1/1] diff.c: color moved lines differently
      [irrelevant] ` <20170527001820.25214-1-sbeller@google.com>
@ 2017-05-27  0:18   ` Stefan Beller
  2017-05-27  7:05     ` Philip Oakley
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-05-27  0:18 UTC (permalink / raw)
  To: gitster; +Cc: git, bmwill, jrnieder, peff, mhagger, jonathantanmy, Stefan Beller

When a patch consists mostly of moving blocks of code around, it can
be quite tedious to ensure that the blocks are moved verbatim, and not
undesirably modified in the move. To that end, color blocks that are
moved within the same patch differently. For example (OM, del, add,
and NM are different colors):

    [OM]  -void sensitive_stuff(void)
    [OM]  -{
    [OM]  -        if (!is_authorized_user())
    [OM]  -                die("unauthorized");
    [OM]  -        sensitive_stuff(spanning,
    [OM]  -                        multiple,
    [OM]  -                        lines);
    [OM]  -}

           void another_function()
           {
    [del] -        printf("foo");
    [add] +        printf("bar");
           }

    [NM]  +void sensitive_stuff(void)
    [NM]  +{
    [NM]  +        if (!is_authorized_user())
    [NM]  +                die("unauthorized");
    [NM]  +        sensitive_stuff(spanning,
    [NM]  +                        multiple,
    [NM]  +                        lines);
    [NM]  +}

However adjacent blocks may be problematic. For example, in this
potentially malicious patch, the swapping of blocks can be spotted:

    [OM]  -void sensitive_stuff(void)
    [OM]  -{
    [OMA] -        if (!is_authorized_user())
    [OMA] -                die("unauthorized");
    [OM]  -        sensitive_stuff(spanning,
    [OM]  -                        multiple,
    [OM]  -                        lines);
    [OMA] -}

           void another_function()
           {
    [del] -        printf("foo");
    [add] +        printf("bar");
           }

    [NM]  +void sensitive_stuff(void)
    [NM]  +{
    [NMA] +        sensitive_stuff(spanning,
    [NMA] +                        multiple,
    [NMA] +                        lines);
    [NM]  +        if (!is_authorized_user())
    [NM]  +                die("unauthorized");
    [NMA] +}

If the moved code is larger, it is easier to hide some permutation in the
code, which is why some alternative coloring is needed.

As the reviewers attention should be brought to the places, where the
difference is introduced to the moved code, we cannot just have one new
color for all of moved code.

First I implemented an alternative design, which would try to fingerprint
a line by its neighbors to detect if we are in a block or at the boundary.
This idea iss error prone as it inspected each line and its neighboring
lines to determine if the line was (a) moved and (b) if was deep inside
a hunk by having matching neighboring lines. This is unreliable as the
we can construct hunks which have equal neighbors that just exceed the
number of lines inspected. (Think of 'AXYZBXYZCXYZD..' with each letter
as a line, that is permutated to AXYZCXYZBXYZD..').

Instead this provides a dynamic programming greedy algorithm that finds
the largest moved hunk and then has several modes on highlighting bounds.

A note on the options '--submodule=diff' and '--color-words/--word-diff':
In the conversion to use emit_line in the prior patches both submodules
as well as word diff output carefully chose to call emit_line with sign=0.
All output with sign=0 is ignored for move detection purposes in this
patch, such that no weird looking output will be generated for these
cases. This leads to another thought: We could pass on '--color-moved' to
submodules such that they color up moved lines for themselves. If we'd do
so only line moves within a repository boundary are marked up.

Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config.txt       |  10 +-
 Documentation/diff-options.txt |  32 ++++
 color.h                        |   2 +
 diff.c                         | 342 +++++++++++++++++++++++++++++++++++--
 diff.h                         |  15 +-
 t/t4015-diff-whitespace.sh     | 373 +++++++++++++++++++++++++++++++++++++++++
 6 files changed, 760 insertions(+), 14 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 475e874d51..73511a4603 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1051,14 +1051,20 @@ This does not affect linkgit:git-format-patch[1] or the
 'git-diff-{asterisk}' plumbing commands.  Can be overridden on the
 command line with the `--color[=<when>]` option.
 
+diff.colorMoved::
+	If set moved lines in a diff are colored differently,
+	for details see '--color-moved' in linkgit:git-diff[1].
+
 color.diff.<slot>::
 	Use customized color for diff colorization.  `<slot>` specifies
 	which part of the patch to use the specified color, and is one
 	of `context` (context text - `plain` is a historical synonym),
 	`meta` (metainformation), `frag`
 	(hunk header), 'func' (function in hunk header), `old` (removed lines),
-	`new` (added lines), `commit` (commit headers), or `whitespace`
-	(highlighting whitespace errors).
+	`new` (added lines), `commit` (commit headers), `whitespace`
+	(highlighting whitespace errors), `oldMoved`, `newMoved`,
+	`oldMovedAlternative` and `newMovedAlternative` (See the '<mode>'
+	setting of '--color-moved' in linkgit:git-diff[1] for details).
 
 color.decorate.<slot>::
 	Use customized color for 'git log --decorate' output.  `<slot>` is one
diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 89cc0f48de..25259dbbc3 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -231,6 +231,38 @@ ifdef::git-diff[]
 endif::git-diff[]
 	It is the same as `--color=never`.
 
+--color-moved[=<mode>]::
+	Moved lines of code are colored differently.
+ifdef::git-diff[]
+	It can be changed by the `diff.colorMoved` configuration setting.
+endif::git-diff[]
+	The <mode> defaults to 'no' if the option is not given
+	and to 'adjacentbounds' if the option with no mode is given.
+	The mode must be one of:
++
+--
+no::
+	Moved lines are not highlighted.
+nobounds::
+	Any line that is added in on location and was removed
+	in another location will be colored with 'color.diff.newmoved'.
+	Any line that is removed in on location and was added
+	in another location will be colored with 'color.diff.oldmoved'.
+allbounds::
+	Based on 'nobounds'. Additionally blocks of moved code are
+	detected and the first and last line of a block will be highlighted
+	using 'color.diff.newMovedAlternate' or
+	'color.diff.oldMovedAlternate'.
+adjacentbounds::
+	The same as 'allbounds' except that highlighting is only performed
+	at adjacent block boundaries of blocks that have the same sign.
+alternate::
+	Based on 'nobounds'. Additionally blocks of moved code are
+	detected. If moved blocks are adjacent mark one of them with the
+	alternative move color using 'color.diff.newMovedAlternate' or
+	'color.diff.oldMovedAlternate'.
+--
+
 --word-diff[=<mode>]::
 	Show a word diff, using the <mode> to delimit changed words.
 	By default, words are delimited by whitespace; see
diff --git a/color.h b/color.h
index 90627650fc..04b3b87929 100644
--- a/color.h
+++ b/color.h
@@ -42,6 +42,8 @@ struct strbuf;
 #define GIT_COLOR_BG_BLUE	"\033[44m"
 #define GIT_COLOR_BG_MAGENTA	"\033[45m"
 #define GIT_COLOR_BG_CYAN	"\033[46m"
+#define GIT_COLOR_DI_IT_CYAN	"\033[2;3;36m"
+#define GIT_COLOR_DI_IT_MAGENTA	"\033[2;3;35m"
 
 /* A special value meaning "no color selected" */
 #define GIT_COLOR_NIL "NIL"
diff --git a/diff.c b/diff.c
index a3c16ef827..efd2530a89 100644
--- a/diff.c
+++ b/diff.c
@@ -31,6 +31,7 @@ static int diff_indent_heuristic; /* experimental */
 static int diff_rename_limit_default = 400;
 static int diff_suppress_blank_empty;
 static int diff_use_color_default = -1;
+static int diff_color_moved_default;
 static int diff_context_default = 3;
 static int diff_interhunk_context_default;
 static const char *diff_word_regex_cfg;
@@ -55,6 +56,10 @@ static char diff_colors[][COLOR_MAXLEN] = {
 	GIT_COLOR_YELLOW,	/* COMMIT */
 	GIT_COLOR_BG_RED,	/* WHITESPACE */
 	GIT_COLOR_NORMAL,	/* FUNCINFO */
+	GIT_COLOR_DI_IT_MAGENTA,/* OLD_MOVED */
+	GIT_COLOR_BG_RED,	/* OLD_MOVED ALTERNATIVE */
+	GIT_COLOR_DI_IT_CYAN,	/* NEW_MOVED */
+	GIT_COLOR_BG_GREEN,	/* NEW_MOVED ALTERNATIVE */
 };
 
 static NORETURN void die_want_option(const char *option_name)
@@ -80,6 +85,14 @@ static int parse_diff_color_slot(const char *var)
 		return DIFF_WHITESPACE;
 	if (!strcasecmp(var, "func"))
 		return DIFF_FUNCINFO;
+	if (!strcasecmp(var, "oldmoved"))
+		return DIFF_FILE_OLD_MOVED;
+	if (!strcasecmp(var, "oldmovedalternative"))
+		return DIFF_FILE_OLD_MOVED_ALT;
+	if (!strcasecmp(var, "newmoved"))
+		return DIFF_FILE_NEW_MOVED;
+	if (!strcasecmp(var, "newmovedalternative"))
+		return DIFF_FILE_NEW_MOVED_ALT;
 	return -1;
 }
 
@@ -228,12 +241,35 @@ int git_diff_heuristic_config(const char *var, const char *value, void *cb)
 	return 0;
 }
 
+static int parse_color_moved(const char *arg)
+{
+	if (!strcmp(arg, "no"))
+		return MOVED_LINES_NO;
+	else if (!strcmp(arg, "nobounds"))
+		return MOVED_LINES_BOUNDARY_NO;
+	else if (!strcmp(arg, "allbounds"))
+		return MOVED_LINES_BOUNDARY_ALL;
+	else if (!strcmp(arg, "adjacentbounds"))
+		return MOVED_LINES_BOUNDARY_ADJACENT;
+	else if (!strcmp(arg, "alternate"))
+		return MOVED_LINES_ALTERNATE;
+	else
+		return -1;
+}
+
 int git_diff_ui_config(const char *var, const char *value, void *cb)
 {
 	if (!strcmp(var, "diff.color") || !strcmp(var, "color.diff")) {
 		diff_use_color_default = git_config_colorbool(var, value);
 		return 0;
 	}
+	if (!strcmp(var, "diff.colormoved")) {
+		int cm = parse_color_moved(value);
+		if (cm < 0)
+			return -1;
+		diff_color_moved_default = cm;
+		return 0;
+	}
 	if (!strcmp(var, "diff.context")) {
 		diff_context_default = git_config_int(var, value);
 		if (diff_context_default < 0)
@@ -354,6 +390,88 @@ int git_diff_basic_config(const char *var, const char *value, void *cb)
 	return git_default_config(var, value, cb);
 }
 
+struct moved_entry {
+	struct hashmap_entry ent;
+	const struct diff_line *line;
+	struct moved_entry *next_line;
+};
+
+static void get_ws_cleaned_string(const struct diff_line *l,
+				  struct strbuf *out)
+{
+	int i;
+	for (i = 0; i < l->len; i++) {
+		if (isspace(l->line[i]))
+			continue;
+		strbuf_addch(out, l->line[i]);
+	}
+}
+
+static int diff_line_cmp_no_ws(const struct diff_line *a,
+					 const struct diff_line *b,
+					 const void *keydata)
+{
+	int ret;
+	struct strbuf sba = STRBUF_INIT;
+	struct strbuf sbb = STRBUF_INIT;
+
+	get_ws_cleaned_string(a, &sba);
+	get_ws_cleaned_string(b, &sbb);
+	ret = sba.len != sbb.len || strncmp(sba.buf, sbb.buf, sba.len);
+
+	strbuf_release(&sba);
+	strbuf_release(&sbb);
+	return ret;
+}
+
+static int diff_line_cmp(const struct diff_line *a,
+				   const struct diff_line *b,
+				   const void *keydata)
+{
+	return a->len != b->len || strncmp(a->line, b->line, a->len);
+}
+
+static int moved_entry_cmp(const struct moved_entry *a,
+			   const struct moved_entry *b,
+			   const void *keydata)
+{
+	return diff_line_cmp(a->line, b->line, keydata);
+}
+
+static int moved_entry_cmp_no_ws(const struct moved_entry *a,
+				 const struct moved_entry *b,
+				 const void *keydata)
+{
+	return diff_line_cmp_no_ws(a->line, b->line, keydata);
+}
+
+static unsigned get_line_hash(struct diff_line *line, unsigned ignore_ws)
+{
+	static struct strbuf sb = STRBUF_INIT;
+
+	if (ignore_ws) {
+		strbuf_reset(&sb);
+		get_ws_cleaned_string(line, &sb);
+		return memhash(sb.buf, sb.len);
+	} else {
+		return memhash(line->line, line->len);
+	}
+}
+
+static struct moved_entry *prepare_entry(struct diff_options *o,
+					 int line_no)
+{
+	struct moved_entry *ret = xmalloc(sizeof(*ret));
+	unsigned ignore_ws = DIFF_XDL_TST(o, IGNORE_WHITESPACE);
+	struct diff_line *l = &o->line_buffer[line_no];
+
+	ret->ent.hash = get_line_hash(l, ignore_ws);
+	ret->line = l;
+	ret->next_line = NULL;
+
+	return ret;
+}
+
 static char *quote_two(const char *one, const char *two)
 {
 	int need_one = quote_c_style(one, NULL, NULL, 1);
@@ -516,6 +634,179 @@ static void check_blank_at_eof(mmfile_t *mf1, mmfile_t *mf2,
 	ecbdata->blank_at_eof_in_postimage = (at - l2) + 1;
 }
 
+static void add_lines_to_move_detection(struct diff_options *o,
+					struct hashmap *add_lines,
+					struct hashmap *del_lines)
+{
+	struct moved_entry *prev_line = NULL;
+
+	int n;
+	for (n = 0; n < o->line_buffer_nr; n++) {
+		int sign = 0;
+		struct hashmap *hm;
+		struct moved_entry *key;
+
+		switch (o->line_buffer[n].sign) {
+		case '+':
+			sign = '+';
+			hm = add_lines;
+			break;
+		case '-':
+			sign = '-';
+			hm = del_lines;
+			break;
+		case ' ':
+		default:
+			prev_line = NULL;
+			continue;
+		}
+
+		key = prepare_entry(o, n);
+		if (prev_line &&
+		    prev_line->line->sign == sign)
+			prev_line->next_line = key;
+
+		hashmap_add(hm, key);
+		prev_line = key;
+	}
+}
+
+static void mark_color_as_moved_single_line(struct diff_options *o,
+					    struct diff_line *l, int alt_color)
+{
+	switch (l->sign) {
+	case '+':
+		l->set = diff_get_color_opt(o,
+			DIFF_FILE_NEW_MOVED + alt_color);
+		break;
+	case '-':
+		l->set = diff_get_color_opt(o,
+			DIFF_FILE_OLD_MOVED + alt_color);
+		break;
+	default:
+		die("BUG: we should have continued earlier?");
+	}
+}
+
+static void mark_color_as_moved(struct diff_options *o,
+				struct hashmap *add_lines,
+				struct hashmap *del_lines)
+{
+	struct moved_entry **pmb = NULL; /* potentially moved blocks */
+	struct diff_line *prev_line = NULL;
+	int pmb_nr = 0, pmb_alloc = 0;
+	int n, flipped_block = 0;
+
+	for (n = 0; n < o->line_buffer_nr; n++) {
+		struct hashmap *hm = NULL;
+		struct moved_entry *key;
+		struct moved_entry *match = NULL;
+		struct diff_line *l = &o->line_buffer[n];
+		int i, lp, rp, adjacent_blocks = 0;
+
+		/* Check for any match to color it as a move. */
+		switch (l->sign) {
+		case '+':
+			hm = del_lines;
+			key = prepare_entry(o, n);
+			match = hashmap_get(hm, key, o);
+			free(key);
+			break;
+		case '-':
+			hm = add_lines;
+			key = prepare_entry(o, n);
+			match = hashmap_get(hm, key, o);
+			free(key);
+			break;
+		default: ;
+		}
+
+		if (!match) {
+			pmb_nr = 0;
+			if (prev_line &&
+			    o->color_moved == MOVED_LINES_BOUNDARY_ALL)
+				mark_color_as_moved_single_line(o, prev_line, 1);
+			prev_line = NULL;
+			continue;
+		}
+
+		if (o->color_moved == MOVED_LINES_BOUNDARY_NO) {
+			mark_color_as_moved_single_line(o, l, 0);
+			continue;
+		}
+
+		/* Check any potential block runs, advance each or nullify */
+		for (i = 0; i < pmb_nr; i++) {
+			struct moved_entry *p = pmb[i];
+			struct moved_entry *pnext = (p && p->next_line) ?
+					p->next_line : NULL;
+			if (pnext &&
+			    !diff_line_cmp(pnext->line, l, o)) {
+				pmb[i] = p->next_line;
+			} else {
+				pmb[i] = NULL;
+			}
+		}
+
+		/* Shrink the set of potential block to the remaining running */
+		for (lp = 0, rp = pmb_nr - 1; lp <= rp;) {
+			while (lp < pmb_nr && pmb[lp])
+				lp++;
+			/* lp points at the first NULL now */
+
+			while (rp > -1 && !pmb[rp])
+				rp--;
+			/* rp points at the last non-NULL */
+
+			if (lp < pmb_nr && rp > -1 && lp < rp) {
+				pmb[lp] = pmb[rp];
+				pmb[rp] = NULL;
+				rp--;
+				lp++;
+			}
+		}
+
+		/* Remember the number of running sets */
+		pmb_nr = rp + 1;
+
+		if (pmb_nr == 0) {
+			/*
+			 * This line is the start of a new block.
+			 * Setup the set of potential blocks.
+			 */
+			for (; match; match = hashmap_get_next(hm, match)) {
+				ALLOC_GROW(pmb, pmb_nr + 1, pmb_alloc);
+				pmb[pmb_nr++] = match;
+			}
+
+			if (o->color_moved == MOVED_LINES_BOUNDARY_ALL) {
+				adjacent_blocks = 1;
+			} else {
+				/* Check if two blocks are adjacent */
+				adjacent_blocks = prev_line &&
+						  prev_line->sign == l->sign;
+			}
+		}
+
+		if (o->color_moved == MOVED_LINES_ALTERNATE) {
+			if (adjacent_blocks)
+				flipped_block = (flipped_block + 1) % 2;
+			mark_color_as_moved_single_line(o, l, flipped_block);
+		} else {
+			/* MOVED_LINES_BOUNDARY_{ADJACENT, ALL} */
+			mark_color_as_moved_single_line(o, l, adjacent_blocks);
+			if (adjacent_blocks && prev_line)
+				prev_line->set = l->set;
+		}
+
+		prev_line = l;
+	}
+	if (prev_line && o->color_moved == MOVED_LINES_BOUNDARY_ALL)
+		mark_color_as_moved_single_line(o, prev_line, 1);
+
+	free(pmb);
+}
+
 static void emit_diff_line(struct diff_options *o,
 			   struct diff_line *e)
 {
@@ -3518,6 +3809,8 @@ void diff_setup(struct diff_options *options)
 	options->line_buffer = NULL;
 	options->line_buffer_nr = 0;
 	options->line_buffer_alloc = 0;
+
+	options->color_moved = diff_color_moved_default;
 }
 
 void diff_setup_done(struct diff_options *options)
@@ -3627,6 +3920,9 @@ void diff_setup_done(struct diff_options *options)
 
 	if (DIFF_OPT_TST(options, FOLLOW_RENAMES) && options->pathspec.nr != 1)
 		die(_("--follow requires exactly one pathspec"));
+
+	if (!options->use_color || external_diff())
+		options->color_moved = 0;
 }
 
 static int opt_arg(const char *arg, int arg_short, const char *arg_long, int *val)
@@ -4051,7 +4347,19 @@ int diff_opt_parse(struct diff_options *options,
 	}
 	else if (!strcmp(arg, "--no-color"))
 		options->use_color = 0;
-	else if (!strcmp(arg, "--color-words")) {
+	else if (!strcmp(arg, "--color-moved"))
+		if (diff_color_moved_default)
+			options->color_moved = diff_color_moved_default;
+		else
+			options->color_moved = MOVED_LINES_BOUNDARY_ADJACENT;
+	else if (!strcmp(arg, "--no-color-moved"))
+		options->color_moved = MOVED_LINES_NO;
+	else if (skip_prefix(arg, "--color-moved=", &arg)) {
+		int cm = parse_color_moved(arg);
+		if (cm < 0)
+			die("bad --color-moved argument: %s", arg);
+		options->color_moved = cm;
+	} else if (!strcmp(arg, "--color-words")) {
 		options->use_color = 1;
 		options->word_diff = DIFF_WORDS_COLOR;
 	}
@@ -4856,16 +5164,9 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o)
 {
 	int i;
 	struct diff_queue_struct *q = &diff_queued_diff;
-	/*
-	 * For testing purposes we want to make sure the diff machinery
-	 * works completely with the buffer. If there is anything emitted
-	 * outside the emit_diff_line, then the order is screwed
-	 * up and the tests will fail.
-	 *
-	 * TODO (later in this series):
-	 * We'll unset this flag in a later patch.
-	 */
-	o->use_buffer = 1;
+
+	if (o->color_moved)
+		o->use_buffer = 1;
 
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
@@ -4874,6 +5175,24 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o)
 	}
 
 	if (o->use_buffer) {
+		if (o->color_moved) {
+			struct hashmap add_lines, del_lines;
+			unsigned ignore_ws = DIFF_XDL_TST(o, IGNORE_WHITESPACE);
+
+			hashmap_init(&del_lines, ignore_ws ?
+				(hashmap_cmp_fn)moved_entry_cmp_no_ws :
+				(hashmap_cmp_fn)moved_entry_cmp, 0);
+			hashmap_init(&add_lines, ignore_ws ?
+				(hashmap_cmp_fn)moved_entry_cmp_no_ws :
+				(hashmap_cmp_fn)moved_entry_cmp, 0);
+
+			add_lines_to_move_detection(o, &add_lines, &del_lines);
+			mark_color_as_moved(o, &add_lines, &del_lines);
+
+			hashmap_free(&add_lines, 0);
+			hashmap_free(&del_lines, 0);
+		}
+
 		for (i = 0; i < o->line_buffer_nr; i++)
 			emit_diff_line(o, &o->line_buffer[i]);
 
@@ -4962,6 +5281,7 @@ void diff_flush(struct diff_options *options)
 		if (!options->file)
 			die_errno("Could not open /dev/null");
 		options->close_file = 1;
+		options->color_moved = 0;
 		for (i = 0; i < q->nr; i++) {
 			struct diff_filepair *p = q->queue[i];
 			if (check_pair_status(p))
diff --git a/diff.h b/diff.h
index be51e8f867..d9fbafd383 100644
--- a/diff.h
+++ b/diff.h
@@ -7,6 +7,7 @@
 #include "tree-walk.h"
 #include "pathspec.h"
 #include "object.h"
+#include "hashmap.h"
 
 struct rev_info;
 struct diff_options;
@@ -228,6 +229,14 @@ struct diff_options {
 
 	struct diff_line *line_buffer;
 	int line_buffer_nr, line_buffer_alloc;
+
+	enum {
+		MOVED_LINES_NO = 0,
+		MOVED_LINES_BOUNDARY_NO = 1,
+		MOVED_LINES_BOUNDARY_ALL = 2,
+		MOVED_LINES_BOUNDARY_ADJACENT = 3,
+		MOVED_LINES_ALTERNATE = 4,
+	} color_moved;
 };
 
 /* Emit [line_prefix] [set] line [reset] */
@@ -243,7 +252,11 @@ enum color_diff {
 	DIFF_FILE_NEW = 5,
 	DIFF_COMMIT = 6,
 	DIFF_WHITESPACE = 7,
-	DIFF_FUNCINFO = 8
+	DIFF_FUNCINFO = 8,
+	DIFF_FILE_OLD_MOVED = 9,
+	DIFF_FILE_OLD_MOVED_ALT = 10,
+	DIFF_FILE_NEW_MOVED = 11,
+	DIFF_FILE_NEW_MOVED_ALT = 12
 };
 const char *diff_get_color(int diff_use_color, enum color_diff ix);
 #define diff_get_color_opt(o, ix) \
diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index 289806d0c7..d4bd082af7 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -972,4 +972,377 @@ test_expect_success 'option overrides diff.wsErrorHighlight' '
 
 '
 
+test_expect_success 'detect moved code, complete file' '
+	git reset --hard &&
+	cat <<-\EOF >test.c &&
+	#include<stdio.h>
+	main()
+	{
+	printf("Hello World");
+	}
+	EOF
+	git add test.c &&
+	git commit -m "add main function" &&
+	git mv test.c main.c &&
+	test_config color.diff.oldMoved "normal red" &&
+	test_config color.diff.newMoved "normal green" &&
+	git diff HEAD --color-moved --no-renames | test_decode_color >actual &&
+	cat >expected <<-\EOF &&
+	<BOLD>diff --git a/main.c b/main.c<RESET>
+	<BOLD>new file mode 100644<RESET>
+	<BOLD>index 0000000..a986c57<RESET>
+	<BOLD>--- /dev/null<RESET>
+	<BOLD>+++ b/main.c<RESET>
+	<CYAN>@@ -0,0 +1,5 @@<RESET>
+	<BGREEN>+<RESET><BGREEN>#include<stdio.h><RESET>
+	<BGREEN>+<RESET><BGREEN>main()<RESET>
+	<BGREEN>+<RESET><BGREEN>{<RESET>
+	<BGREEN>+<RESET><BGREEN>printf("Hello World");<RESET>
+	<BGREEN>+<RESET><BGREEN>}<RESET>
+	<BOLD>diff --git a/test.c b/test.c<RESET>
+	<BOLD>deleted file mode 100644<RESET>
+	<BOLD>index a986c57..0000000<RESET>
+	<BOLD>--- a/test.c<RESET>
+	<BOLD>+++ /dev/null<RESET>
+	<CYAN>@@ -1,5 +0,0 @@<RESET>
+	<BRED>-#include<stdio.h><RESET>
+	<BRED>-main()<RESET>
+	<BRED>-{<RESET>
+	<BRED>-printf("Hello World");<RESET>
+	<BRED>-}<RESET>
+	EOF
+
+	test_cmp expected actual
+'
+
+test_expect_success 'detect moved code, inside file' '
+	git reset --hard &&
+	cat <<-\EOF >main.c &&
+		#include<stdio.h>
+		int stuff()
+		{
+			printf("Hello ");
+			printf("World\n");
+		}
+
+		int secure_foo(struct user *u)
+		{
+			if (!u->is_allowed_foo)
+				return;
+			foo(u);
+		}
+
+		int main()
+		{
+			foo();
+		}
+	EOF
+	cat <<-\EOF >test.c &&
+		#include<stdio.h>
+		int bar()
+		{
+			printf("Hello World, but different\n");
+		}
+
+		int another_function()
+		{
+			bar();
+		}
+	EOF
+	git add main.c test.c &&
+	git commit -m "add main and test file" &&
+	cat <<-\EOF >main.c &&
+		#include<stdio.h>
+		int stuff()
+		{
+			printf("Hello ");
+			printf("World\n");
+		}
+
+		int main()
+		{
+			foo();
+		}
+	EOF
+	cat <<-\EOF >test.c &&
+		#include<stdio.h>
+		int bar()
+		{
+			printf("Hello World, but different\n");
+		}
+
+		int secure_foo(struct user *u)
+		{
+			if (!u->is_allowed_foo)
+				return;
+			foo(u);
+		}
+
+		int another_function()
+		{
+			bar();
+		}
+	EOF
+	test_config color.diff.oldMoved "normal red" &&
+	test_config color.diff.newMoved "normal green" &&
+	test_config color.diff.oldMovedAlternative "bold red" &&
+	test_config color.diff.newMovedAlternative "bold green" &&
+	git diff HEAD --no-renames --color-moved| test_decode_color >actual &&
+	cat <<-\EOF >expected &&
+	<BOLD>diff --git a/main.c b/main.c<RESET>
+	<BOLD>index 27a619c..7cf9336 100644<RESET>
+	<BOLD>--- a/main.c<RESET>
+	<BOLD>+++ b/main.c<RESET>
+	<CYAN>@@ -5,13 +5,6 @@<RESET> <RESET>printf("Hello ");<RESET>
+	 printf("World\n");<RESET>
+	 }<RESET>
+	 <RESET>
+	<BRED>-int secure_foo(struct user *u)<RESET>
+	<BRED>-{<RESET>
+	<BRED>-if (!u->is_allowed_foo)<RESET>
+	<BRED>-return;<RESET>
+	<BRED>-foo(u);<RESET>
+	<BRED>-}<RESET>
+	<BRED>-<RESET>
+	 int main()<RESET>
+	 {<RESET>
+	 foo();<RESET>
+	<BOLD>diff --git a/test.c b/test.c<RESET>
+	<BOLD>index 1dc1d85..e34eb69 100644<RESET>
+	<BOLD>--- a/test.c<RESET>
+	<BOLD>+++ b/test.c<RESET>
+	<CYAN>@@ -4,6 +4,13 @@<RESET> <RESET>int bar()<RESET>
+	 printf("Hello World, but different\n");<RESET>
+	 }<RESET>
+	 <RESET>
+	<BGREEN>+<RESET><BGREEN>int secure_foo(struct user *u)<RESET>
+	<BGREEN>+<RESET><BGREEN>{<RESET>
+	<BGREEN>+<RESET><BGREEN>if (!u->is_allowed_foo)<RESET>
+	<BGREEN>+<RESET><BGREEN>return;<RESET>
+	<BGREEN>+<RESET><BGREEN>foo(u);<RESET>
+	<BGREEN>+<RESET><BGREEN>}<RESET>
+	<BGREEN>+<RESET>
+	 int another_function()<RESET>
+	 {<RESET>
+	 bar();<RESET>
+	EOF
+
+	test_cmp expected actual
+'
+
+test_expect_success 'detect permutations inside moved code' '
+	git reset --hard &&
+	cat <<-\EOF >lines.txt &&
+		line 1
+		line 2
+		line 3
+		line 4
+		line 5
+		line 6
+		line 7
+		line 8
+		line 9
+		line 10
+		line 11
+		line 12
+		line 13
+		line 14
+		line 15
+		line 16
+	EOF
+	git add lines.txt &&
+	git commit -m "add poetry" &&
+	cat <<-\EOF >lines.txt &&
+		line 4
+		line 5
+		line 6
+		line 7
+		line 8
+		line 9
+		line 1
+		line 2
+		line 3
+		line 14
+		line 15
+		line 16
+		line 10
+		line 11
+		line 12
+		line 13
+	EOF
+	test_config color.diff.oldMoved "magenta" &&
+	test_config color.diff.newMoved "cyan" &&
+	test_config color.diff.oldMovedAlternative "blue" &&
+	test_config color.diff.newMovedAlternative "yellow" &&
+
+
+	git diff HEAD --no-renames --color-moved=nobounds| test_decode_color >actual &&
+	cat <<-\EOF >expected &&
+		<BOLD>diff --git a/lines.txt b/lines.txt<RESET>
+		<BOLD>index 47ea9c3..ba96a38 100644<RESET>
+		<BOLD>--- a/lines.txt<RESET>
+		<BOLD>+++ b/lines.txt<RESET>
+		<CYAN>@@ -1,16 +1,16 @@<RESET>
+		<MAGENTA>-line 1<RESET>
+		<MAGENTA>-line 2<RESET>
+		<MAGENTA>-line 3<RESET>
+		 line 4<RESET>
+		 line 5<RESET>
+		 line 6<RESET>
+		 line 7<RESET>
+		 line 8<RESET>
+		 line 9<RESET>
+		<CYAN>+<RESET><CYAN>line 1<RESET>
+		<CYAN>+<RESET><CYAN>line 2<RESET>
+		<CYAN>+<RESET><CYAN>line 3<RESET>
+		<CYAN>+<RESET><CYAN>line 14<RESET>
+		<CYAN>+<RESET><CYAN>line 15<RESET>
+		<CYAN>+<RESET><CYAN>line 16<RESET>
+		 line 10<RESET>
+		 line 11<RESET>
+		 line 12<RESET>
+		 line 13<RESET>
+		<MAGENTA>-line 14<RESET>
+		<MAGENTA>-line 15<RESET>
+		<MAGENTA>-line 16<RESET>
+	EOF
+	test_cmp expected actual &&
+
+	git diff HEAD --no-renames --color-moved=adjacentbounds| test_decode_color >actual &&
+	cat <<-\EOF >expected &&
+	<BOLD>diff --git a/lines.txt b/lines.txt<RESET>
+	<BOLD>index 47ea9c3..ba96a38 100644<RESET>
+	<BOLD>--- a/lines.txt<RESET>
+	<BOLD>+++ b/lines.txt<RESET>
+	<CYAN>@@ -1,16 +1,16 @@<RESET>
+	<MAGENTA>-line 1<RESET>
+	<MAGENTA>-line 2<RESET>
+	<MAGENTA>-line 3<RESET>
+	 line 4<RESET>
+	 line 5<RESET>
+	 line 6<RESET>
+	 line 7<RESET>
+	 line 8<RESET>
+	 line 9<RESET>
+	<CYAN>+<RESET><CYAN>line 1<RESET>
+	<CYAN>+<RESET><CYAN>line 2<RESET>
+	<YELLOW>+<RESET><YELLOW>line 3<RESET>
+	<YELLOW>+<RESET><YELLOW>line 14<RESET>
+	<CYAN>+<RESET><CYAN>line 15<RESET>
+	<CYAN>+<RESET><CYAN>line 16<RESET>
+	 line 10<RESET>
+	 line 11<RESET>
+	 line 12<RESET>
+	 line 13<RESET>
+	<MAGENTA>-line 14<RESET>
+	<MAGENTA>-line 15<RESET>
+	<MAGENTA>-line 16<RESET>
+	EOF
+	test_cmp expected actual &&
+
+	test_config diff.colorMoved alternate &&
+	git diff HEAD --no-renames --color-moved| test_decode_color >actual &&
+	cat <<-\EOF >expected &&
+	<BOLD>diff --git a/lines.txt b/lines.txt<RESET>
+	<BOLD>index 47ea9c3..ba96a38 100644<RESET>
+	<BOLD>--- a/lines.txt<RESET>
+	<BOLD>+++ b/lines.txt<RESET>
+	<CYAN>@@ -1,16 +1,16 @@<RESET>
+	<MAGENTA>-line 1<RESET>
+	<MAGENTA>-line 2<RESET>
+	<MAGENTA>-line 3<RESET>
+	 line 4<RESET>
+	 line 5<RESET>
+	 line 6<RESET>
+	 line 7<RESET>
+	 line 8<RESET>
+	 line 9<RESET>
+	<CYAN>+<RESET><CYAN>line 1<RESET>
+	<CYAN>+<RESET><CYAN>line 2<RESET>
+	<CYAN>+<RESET><CYAN>line 3<RESET>
+	<YELLOW>+<RESET><YELLOW>line 14<RESET>
+	<YELLOW>+<RESET><YELLOW>line 15<RESET>
+	<YELLOW>+<RESET><YELLOW>line 16<RESET>
+	 line 10<RESET>
+	 line 11<RESET>
+	 line 12<RESET>
+	 line 13<RESET>
+	<BLUE>-line 14<RESET>
+	<BLUE>-line 15<RESET>
+	<BLUE>-line 16<RESET>
+	EOF
+	test_cmp expected actual &&
+
+	test_config diff.colorMoved allbounds &&
+	git diff HEAD --no-renames --color-moved| test_decode_color >actual &&
+	cat <<-\EOF >expected &&
+	<BOLD>diff --git a/lines.txt b/lines.txt<RESET>
+	<BOLD>index 47ea9c3..ba96a38 100644<RESET>
+	<BOLD>--- a/lines.txt<RESET>
+	<BOLD>+++ b/lines.txt<RESET>
+	<CYAN>@@ -1,16 +1,16 @@<RESET>
+	<BLUE>-line 1<RESET>
+	<MAGENTA>-line 2<RESET>
+	<BLUE>-line 3<RESET>
+	 line 4<RESET>
+	 line 5<RESET>
+	 line 6<RESET>
+	 line 7<RESET>
+	 line 8<RESET>
+	 line 9<RESET>
+	<YELLOW>+<RESET><YELLOW>line 1<RESET>
+	<CYAN>+<RESET><CYAN>line 2<RESET>
+	<YELLOW>+<RESET><YELLOW>line 3<RESET>
+	<YELLOW>+<RESET><YELLOW>line 14<RESET>
+	<CYAN>+<RESET><CYAN>line 15<RESET>
+	<YELLOW>+<RESET><YELLOW>line 16<RESET>
+	 line 10<RESET>
+	 line 11<RESET>
+	 line 12<RESET>
+	 line 13<RESET>
+	<BLUE>-line 14<RESET>
+	<MAGENTA>-line 15<RESET>
+	<BLUE>-line 16<RESET>
+	EOF
+	test_cmp expected actual
+'
+
+test_expect_success 'move detection does not mess up colored words' '
+	cat <<-\EOF >text.txt &&
+	Lorem Ipsum is simply dummy text of the printing and typesetting industry.
+	EOF
+	git add text.txt &&
+	git commit -a -m "clean state" &&
+	cat <<-\EOF >text.txt &&
+	simply Lorem Ipsum dummy is text of the typesetting and printing industry.
+	EOF
+	git diff --color-moved --word-diff >actual &&
+	git diff --word-diff >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'move detection with submodules' '
+	test_create_repo bananas &&
+	echo ripe >bananas/recipe &&
+	git -C bananas add recipe &&
+	test_commit fruit &&
+	test_commit -C bananas recipe &&
+	git submodule add ./bananas &&
+	git add bananas &&
+	git commit -a -m "bananas are like a heavy library?" &&
+	echo foul >bananas/recipe &&
+	echo ripe >fruit.t &&
+
+	git diff --submodule=diff --color-moved >actual &&
+
+	# no move detection as the moved line is across repository boundaries.
+	test_decode_color <actual >decoded_actual &&
+	! grep BGREEN decoded_actual &&
+	! grep BRED decoded_actual &&
+
+	# nor did we mess with it another way
+	git diff --submodule=diff | test_decode_color >expect &&
+	test_cmp expect decoded_actual
+'
+
 test_done
-- 
2.13.0.17.gab62347cd9


^ permalink raw reply	[relevance 7%]

* Re: [PATCHv4 00/17] Diff machine: highlight moved lines.
  2017-05-23  2:40 ` [PATCHv4 00/17] Diff machine: highlight moved lines. Stefan Beller
  2017-05-23  2:40   ` [PATCHv4 09/17] submodule.c: convert show_submodule_summary to use emit_line_fmt Stefan Beller
  2017-05-23  2:40   ` [PATCHv4 17/17] diff.c: color moved lines differently Stefan Beller
@ 2017-05-27  1:04   ` Jacob Keller
  2017-05-30 21:38     ` Stefan Beller
  2 siblings, 1 reply; 200+ results
From: Jacob Keller @ 2017-05-27  1:04 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Junio C Hamano, Git mailing list, Brandon Williams, Jonathan Nieder, Jonathan Tan, Jeff King, Michael Haggerty

On Mon, May 22, 2017 at 7:40 PM, Stefan Beller <sbeller@google.com> wrote:
> v4:
> * interdiff to v3 (what is currently origin/sb/diff-color-move) below.
> * renamed the "buffered_patch_line" to "diff_line". Originally I planned
>   to not carry the "line" part as it can be a piece of a line as well.
>   But for the intended functionality it is best to keep the name.
>   If we'd want to add more functionality to say have a move detection
>   for words as well, we'd rename the struct to have a better name then.
>   For now diff_line is the best. (Thanks Jonathan Nieder!)
> * tests to demonstrate it doesn't mess with --color-words as well as
>   submodules. (Thanks Jonathan Tan!)
> * added in the statics (Thanks Ramsay!)
> * smaller scope for the hashmaps (Thanks Jonathan Tan!)
> * some commit messages were updated, prior patch 4-7 is squashed into one
>   (Thanks Jonathan Tan!)
> * the tests added revealed an actual fault: now that the submodule process
>   is not attached to a dupe of our stdout, it would stop coloring the
>   output. We need to pass on use-color explicitly.
> * updated the NEEDSWORK comment in the second last patch.
>
> Thanks for bearing,
> Stefan
>

One thing to note when I was playing around with what's on pu right
now, I noticed that the oldMovedAlternative and newMovedAlternative
are the first moved colors to be used if there is only one move. (Ie:
a simple case of literally one section moved) This is a bit weird that
the alternative colors are used before the "main" colors. I would have
thought it would be the other way.

I noticed this because the default colors do not work well for my
terminal color scheme and I had to configure but realized that I
needed to configure the alternative ones to make a difference in the
simple diff I was viewing.

Thanks,
Jake

^ permalink raw reply	[relevance 8%]

* Re: [GSoC][PATCH v5 1/3] submodule: fix buggy $path and $sm_path variable's value
  2017-05-26 17:07         ` Stefan Beller
@ 2017-05-27  1:10           ` Ramsay Jones
  2017-05-30 21:53             ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Ramsay Jones @ 2017-05-27  1:10 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Prathamesh Chavan, git, Brandon Williams, Christian Couder



On 26/05/17 18:07, Stefan Beller wrote:
> On Fri, May 26, 2017 at 9:31 AM, Ramsay Jones
> <ramsay@ramsayjones.plus.com> wrote:
>> Hmm, I'm not sure which documentation you are referring to,
> 
> Quite likely our fine manual pages. ;)
> 
>        foreach [--recursive] <command>
>            Evaluates an arbitrary shell command in each checked out submodule.
>            The command has access to the variables $name, $path, $sha1 and
>            $toplevel: $name is the name of the relevant submodule section in
>            .gitmodules, $path is the name of the submodule directory relative
>            to the superproject, $sha1 is the commit as recorded in the
>            superproject, and $toplevel is the absolute path to the top-level
>            of the superproject. Any submodules defined in the superproject but
>            not checked out are ignored by this command. Unless given --quiet,
>            foreach prints the name of each submodule before evaluating the
>            command. If --recursive is given, submodules are traversed
>            recursively (i.e. the given shell command is evaluated in nested
>            submodules as well). A non-zero return from the command in any
>            submodule causes the processing to terminate. This can be
>            overridden by adding || : to the end of the command.

I suspected as much, but I was wondering specifically if $sm_path
had been documented anywhere. I didn't think so, but ...

> As $path is documented and $sm_path is not, we should care about
> $path first to be correct and either fix the documentation or the implementation
> such that we have a consistent world view. :)

Sure, but what is that world view? :-D

I suspect that commit 091a6eb0fe did not intend (should not have)
used $sm_path in that test. If we were to 'fix' that test, would
it still work?

Back in 2012, the submodule list was generated by filtering the
output of 'git ls-files --error-unmatch --stage --'; but I don't
recall if (at that time) git-ls-files required being at the top
of the working tree, or if it would execute fine in a sub-directory.
So, it's possible that the documentation of $path was wrong all along.
;-)

At that time, by definition, $path == $sm_path. However, you know this
stuff much better than me (I don't use git-submodule), so ...

>> but if
>> $path != $sm_path then something is wrong. (unless their definition
>> has changed, of course).
> 
> I would lean in doing so (changing their definition):
> 
>     $path (as documented) is the name of the submodule directory
>     relative to the direct superproject (so in nested submodules you
>     go up only one level).
> 
> $sm_path on the other hand is not documented at all and yields
> non-sense results in corner cases.

Hmm, at what point did '$sm_path yields non-sense results' start
being the case? (perhaps the corner cases need to be fixed first).

> With this patch it becomes less non-sensey and could be documented as:
> 
>     $sm_path is the relative path from the current working directory
>     to the submodule (ignoring relations to the superproject or nesting
>     of submodules). 

OK.

>                      This documentation also fits into the narrative of
>     the test in t7407.

Hmm, does it?

ATB,
Ramsay Jones



^ permalink raw reply	[relevance 23%]

* Re: [PATCH 1/1] diff.c: color moved lines differently
  2017-05-27  0:18   ` [PATCH 1/1] diff.c: color moved lines differently Stefan Beller
@ 2017-05-27  7:05     ` Philip Oakley
  0 siblings, 0 replies; 200+ results
From: Philip Oakley @ 2017-05-27  7:05 UTC (permalink / raw)
  To: Stefan Beller, gitster; +Cc: git, bmwill, jrnieder, peff, mhagger, jonathantanmy, Stefan Beller

a couple of mispellings in the doc parts:
  s/on location/one location/
[code not checked]
----- Original Message ----- 
From: "Stefan Beller" <sbeller@google.com>
Subject: [PATCH 1/1] diff.c: color moved lines differently


> When a patch consists mostly of moving blocks of code around, it can
> be quite tedious to ensure that the blocks are moved verbatim, and not
> undesirably modified in the move. To that end, color blocks that are
> moved within the same patch differently. For example (OM, del, add,
> and NM are different colors):
>
>    [OM]  -void sensitive_stuff(void)
>    [OM]  -{
>    [OM]  -        if (!is_authorized_user())
>    [OM]  -                die("unauthorized");
>    [OM]  -        sensitive_stuff(spanning,
>    [OM]  -                        multiple,
>    [OM]  -                        lines);
>    [OM]  -}
>
>           void another_function()
>           {
>    [del] -        printf("foo");
>    [add] +        printf("bar");
>           }
>
>    [NM]  +void sensitive_stuff(void)
>    [NM]  +{
>    [NM]  +        if (!is_authorized_user())
>    [NM]  +                die("unauthorized");
>    [NM]  +        sensitive_stuff(spanning,
>    [NM]  +                        multiple,
>    [NM]  +                        lines);
>    [NM]  +}
>
> However adjacent blocks may be problematic. For example, in this
> potentially malicious patch, the swapping of blocks can be spotted:
>
>    [OM]  -void sensitive_stuff(void)
>    [OM]  -{
>    [OMA] -        if (!is_authorized_user())
>    [OMA] -                die("unauthorized");
>    [OM]  -        sensitive_stuff(spanning,
>    [OM]  -                        multiple,
>    [OM]  -                        lines);
>    [OMA] -}
>
>           void another_function()
>           {
>    [del] -        printf("foo");
>    [add] +        printf("bar");
>           }
>
>    [NM]  +void sensitive_stuff(void)
>    [NM]  +{
>    [NMA] +        sensitive_stuff(spanning,
>    [NMA] +                        multiple,
>    [NMA] +                        lines);
>    [NM]  +        if (!is_authorized_user())
>    [NM]  +                die("unauthorized");
>    [NMA] +}
>
> If the moved code is larger, it is easier to hide some permutation in the
> code, which is why some alternative coloring is needed.
>
> As the reviewers attention should be brought to the places, where the
> difference is introduced to the moved code, we cannot just have one new
> color for all of moved code.
>
> First I implemented an alternative design, which would try to fingerprint
> a line by its neighbors to detect if we are in a block or at the boundary.
> This idea iss error prone as it inspected each line and its neighboring
> lines to determine if the line was (a) moved and (b) if was deep inside
> a hunk by having matching neighboring lines. This is unreliable as the
> we can construct hunks which have equal neighbors that just exceed the
> number of lines inspected. (Think of 'AXYZBXYZCXYZD..' with each letter
> as a line, that is permutated to AXYZCXYZBXYZD..').
>
> Instead this provides a dynamic programming greedy algorithm that finds
> the largest moved hunk and then has several modes on highlighting bounds.
>
> A note on the options '--submodule=diff' and '--color-words/--word-diff':
> In the conversion to use emit_line in the prior patches both submodules
> as well as word diff output carefully chose to call emit_line with sign=0.
> All output with sign=0 is ignored for move detection purposes in this
> patch, such that no weird looking output will be generated for these
> cases. This leads to another thought: We could pass on '--color-moved' to
> submodules such that they color up moved lines for themselves. If we'd do
> so only line moves within a repository boundary are marked up.
>
> Helped-by: Jonathan Tan <jonathantanmy@google.com>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
> Documentation/config.txt       |  10 +-
> Documentation/diff-options.txt |  32 ++++
> color.h                        |   2 +
> diff.c                         | 342 +++++++++++++++++++++++++++++++++++--
> diff.h                         |  15 +-
> t/t4015-diff-whitespace.sh     | 373 
> +++++++++++++++++++++++++++++++++++++++++
> 6 files changed, 760 insertions(+), 14 deletions(-)
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 475e874d51..73511a4603 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -1051,14 +1051,20 @@ This does not affect linkgit:git-format-patch[1] 
> or the
> 'git-diff-{asterisk}' plumbing commands.  Can be overridden on the
> command line with the `--color[=<when>]` option.
>
> +diff.colorMoved::
> + If set moved lines in a diff are colored differently,
> + for details see '--color-moved' in linkgit:git-diff[1].
> +
> color.diff.<slot>::
>  Use customized color for diff colorization.  `<slot>` specifies
>  which part of the patch to use the specified color, and is one
>  of `context` (context text - `plain` is a historical synonym),
>  `meta` (metainformation), `frag`
>  (hunk header), 'func' (function in hunk header), `old` (removed lines),
> - `new` (added lines), `commit` (commit headers), or `whitespace`
> - (highlighting whitespace errors).
> + `new` (added lines), `commit` (commit headers), `whitespace`
> + (highlighting whitespace errors), `oldMoved`, `newMoved`,
> + `oldMovedAlternative` and `newMovedAlternative` (See the '<mode>'
> + setting of '--color-moved' in linkgit:git-diff[1] for details).
>
> color.decorate.<slot>::
>  Use customized color for 'git log --decorate' output.  `<slot>` is one
> diff --git a/Documentation/diff-options.txt 
> b/Documentation/diff-options.txt
> index 89cc0f48de..25259dbbc3 100644
> --- a/Documentation/diff-options.txt
> +++ b/Documentation/diff-options.txt
> @@ -231,6 +231,38 @@ ifdef::git-diff[]
> endif::git-diff[]
>  It is the same as `--color=never`.
>
> +--color-moved[=<mode>]::
> + Moved lines of code are colored differently.
> +ifdef::git-diff[]
> + It can be changed by the `diff.colorMoved` configuration setting.
> +endif::git-diff[]
> + The <mode> defaults to 'no' if the option is not given
> + and to 'adjacentbounds' if the option with no mode is given.
> + The mode must be one of:
> ++
> +--
> +no::
> + Moved lines are not highlighted.
> +nobounds::
> + Any line that is added in on location and was removed
s/on location/one location/

> + in another location will be colored with 'color.diff.newmoved'.
> + Any line that is removed in on location and was added
s/on location/one location/

> + in another location will be colored with 'color.diff.oldmoved'.
> +allbounds::
> + Based on 'nobounds'. Additionally blocks of moved code are
> + detected and the first and last line of a block will be highlighted
> + using 'color.diff.newMovedAlternate' or
> + 'color.diff.oldMovedAlternate'.
> +adjacentbounds::
> + The same as 'allbounds' except that highlighting is only performed
> + at adjacent block boundaries of blocks that have the same sign.
> +alternate::
> + Based on 'nobounds'. Additionally blocks of moved code are
> + detected. If moved blocks are adjacent mark one of them with the
> + alternative move color using 'color.diff.newMovedAlternate' or
> + 'color.diff.oldMovedAlternate'.
> +--
> +
> --word-diff[=<mode>]::
>  Show a word diff, using the <mode> to delimit changed words.
>  By default, words are delimited by whitespace; see

--
Philip

> diff --git a/color.h b/color.h
> index 90627650fc..04b3b87929 100644
> --- a/color.h
> +++ b/color.h
> @@ -42,6 +42,8 @@ struct strbuf;
> #define GIT_COLOR_BG_BLUE "\033[44m"
> #define GIT_COLOR_BG_MAGENTA "\033[45m"
> #define GIT_COLOR_BG_CYAN "\033[46m"
> +#define GIT_COLOR_DI_IT_CYAN "\033[2;3;36m"
> +#define GIT_COLOR_DI_IT_MAGENTA "\033[2;3;35m"
>
> /* A special value meaning "no color selected" */
> #define GIT_COLOR_NIL "NIL"
> diff --git a/diff.c b/diff.c
> index a3c16ef827..efd2530a89 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -31,6 +31,7 @@ static int diff_indent_heuristic; /* experimental */
> static int diff_rename_limit_default = 400;
> static int diff_suppress_blank_empty;
> static int diff_use_color_default = -1;
> +static int diff_color_moved_default;
> static int diff_context_default = 3;
> static int diff_interhunk_context_default;
> static const char *diff_word_regex_cfg;
> @@ -55,6 +56,10 @@ static char diff_colors[][COLOR_MAXLEN] = {
>  GIT_COLOR_YELLOW, /* COMMIT */
>  GIT_COLOR_BG_RED, /* WHITESPACE */
>  GIT_COLOR_NORMAL, /* FUNCINFO */
> + GIT_COLOR_DI_IT_MAGENTA,/* OLD_MOVED */
> + GIT_COLOR_BG_RED, /* OLD_MOVED ALTERNATIVE */
> + GIT_COLOR_DI_IT_CYAN, /* NEW_MOVED */
> + GIT_COLOR_BG_GREEN, /* NEW_MOVED ALTERNATIVE */
> };
>
> static NORETURN void die_want_option(const char *option_name)
> @@ -80,6 +85,14 @@ static int parse_diff_color_slot(const char *var)
>  return DIFF_WHITESPACE;
>  if (!strcasecmp(var, "func"))
>  return DIFF_FUNCINFO;
> + if (!strcasecmp(var, "oldmoved"))
> + return DIFF_FILE_OLD_MOVED;
> + if (!strcasecmp(var, "oldmovedalternative"))
> + return DIFF_FILE_OLD_MOVED_ALT;
> + if (!strcasecmp(var, "newmoved"))
> + return DIFF_FILE_NEW_MOVED;
> + if (!strcasecmp(var, "newmovedalternative"))
> + return DIFF_FILE_NEW_MOVED_ALT;
>  return -1;
> }
>
> @@ -228,12 +241,35 @@ int git_diff_heuristic_config(const char *var, const 
> char *value, void *cb)
>  return 0;
> }
>
> +static int parse_color_moved(const char *arg)
> +{
> + if (!strcmp(arg, "no"))
> + return MOVED_LINES_NO;
> + else if (!strcmp(arg, "nobounds"))
> + return MOVED_LINES_BOUNDARY_NO;
> + else if (!strcmp(arg, "allbounds"))
> + return MOVED_LINES_BOUNDARY_ALL;
> + else if (!strcmp(arg, "adjacentbounds"))
> + return MOVED_LINES_BOUNDARY_ADJACENT;
> + else if (!strcmp(arg, "alternate"))
> + return MOVED_LINES_ALTERNATE;
> + else
> + return -1;
> +}
> +
> int git_diff_ui_config(const char *var, const char *value, void *cb)
> {
>  if (!strcmp(var, "diff.color") || !strcmp(var, "color.diff")) {
>  diff_use_color_default = git_config_colorbool(var, value);
>  return 0;
>  }
> + if (!strcmp(var, "diff.colormoved")) {
> + int cm = parse_color_moved(value);
> + if (cm < 0)
> + return -1;
> + diff_color_moved_default = cm;
> + return 0;
> + }
>  if (!strcmp(var, "diff.context")) {
>  diff_context_default = git_config_int(var, value);
>  if (diff_context_default < 0)
> @@ -354,6 +390,88 @@ int git_diff_basic_config(const char *var, const char 
> *value, void *cb)
>  return git_default_config(var, value, cb);
> }
>
> +struct moved_entry {
> + struct hashmap_entry ent;
> + const struct diff_line *line;
> + struct moved_entry *next_line;
> +};
> +
> +static void get_ws_cleaned_string(const struct diff_line *l,
> +   struct strbuf *out)
> +{
> + int i;
> + for (i = 0; i < l->len; i++) {
> + if (isspace(l->line[i]))
> + continue;
> + strbuf_addch(out, l->line[i]);
> + }
> +}
> +
> +static int diff_line_cmp_no_ws(const struct diff_line *a,
> + const struct diff_line *b,
> + const void *keydata)
> +{
> + int ret;
> + struct strbuf sba = STRBUF_INIT;
> + struct strbuf sbb = STRBUF_INIT;
> +
> + get_ws_cleaned_string(a, &sba);
> + get_ws_cleaned_string(b, &sbb);
> + ret = sba.len != sbb.len || strncmp(sba.buf, sbb.buf, sba.len);
> +
> + strbuf_release(&sba);
> + strbuf_release(&sbb);
> + return ret;
> +}
> +
> +static int diff_line_cmp(const struct diff_line *a,
> +    const struct diff_line *b,
> +    const void *keydata)
> +{
> + return a->len != b->len || strncmp(a->line, b->line, a->len);
> +}
> +
> +static int moved_entry_cmp(const struct moved_entry *a,
> +    const struct moved_entry *b,
> +    const void *keydata)
> +{
> + return diff_line_cmp(a->line, b->line, keydata);
> +}
> +
> +static int moved_entry_cmp_no_ws(const struct moved_entry *a,
> + const struct moved_entry *b,
> + const void *keydata)
> +{
> + return diff_line_cmp_no_ws(a->line, b->line, keydata);
> +}
> +
> +static unsigned get_line_hash(struct diff_line *line, unsigned ignore_ws)
> +{
> + static struct strbuf sb = STRBUF_INIT;
> +
> + if (ignore_ws) {
> + strbuf_reset(&sb);
> + get_ws_cleaned_string(line, &sb);
> + return memhash(sb.buf, sb.len);
> + } else {
> + return memhash(line->line, line->len);
> + }
> +}
> +
> +static struct moved_entry *prepare_entry(struct diff_options *o,
> + int line_no)
> +{
> + struct moved_entry *ret = xmalloc(sizeof(*ret));
> + unsigned ignore_ws = DIFF_XDL_TST(o, IGNORE_WHITESPACE);
> + struct diff_line *l = &o->line_buffer[line_no];
> +
> + ret->ent.hash = get_line_hash(l, ignore_ws);
> + ret->line = l;
> + ret->next_line = NULL;
> +
> + return ret;
> +}
> +
> static char *quote_two(const char *one, const char *two)
> {
>  int need_one = quote_c_style(one, NULL, NULL, 1);
> @@ -516,6 +634,179 @@ static void check_blank_at_eof(mmfile_t *mf1, 
> mmfile_t *mf2,
>  ecbdata->blank_at_eof_in_postimage = (at - l2) + 1;
> }
>
> +static void add_lines_to_move_detection(struct diff_options *o,
> + struct hashmap *add_lines,
> + struct hashmap *del_lines)
> +{
> + struct moved_entry *prev_line = NULL;
> +
> + int n;
> + for (n = 0; n < o->line_buffer_nr; n++) {
> + int sign = 0;
> + struct hashmap *hm;
> + struct moved_entry *key;
> +
> + switch (o->line_buffer[n].sign) {
> + case '+':
> + sign = '+';
> + hm = add_lines;
> + break;
> + case '-':
> + sign = '-';
> + hm = del_lines;
> + break;
> + case ' ':
> + default:
> + prev_line = NULL;
> + continue;
> + }
> +
> + key = prepare_entry(o, n);
> + if (prev_line &&
> +     prev_line->line->sign == sign)
> + prev_line->next_line = key;
> +
> + hashmap_add(hm, key);
> + prev_line = key;
> + }
> +}
> +
> +static void mark_color_as_moved_single_line(struct diff_options *o,
> +     struct diff_line *l, int alt_color)
> +{
> + switch (l->sign) {
> + case '+':
> + l->set = diff_get_color_opt(o,
> + DIFF_FILE_NEW_MOVED + alt_color);
> + break;
> + case '-':
> + l->set = diff_get_color_opt(o,
> + DIFF_FILE_OLD_MOVED + alt_color);
> + break;
> + default:
> + die("BUG: we should have continued earlier?");
> + }
> +}
> +
> +static void mark_color_as_moved(struct diff_options *o,
> + struct hashmap *add_lines,
> + struct hashmap *del_lines)
> +{
> + struct moved_entry **pmb = NULL; /* potentially moved blocks */
> + struct diff_line *prev_line = NULL;
> + int pmb_nr = 0, pmb_alloc = 0;
> + int n, flipped_block = 0;
> +
> + for (n = 0; n < o->line_buffer_nr; n++) {
> + struct hashmap *hm = NULL;
> + struct moved_entry *key;
> + struct moved_entry *match = NULL;
> + struct diff_line *l = &o->line_buffer[n];
> + int i, lp, rp, adjacent_blocks = 0;
> +
> + /* Check for any match to color it as a move. */
> + switch (l->sign) {
> + case '+':
> + hm = del_lines;
> + key = prepare_entry(o, n);
> + match = hashmap_get(hm, key, o);
> + free(key);
> + break;
> + case '-':
> + hm = add_lines;
> + key = prepare_entry(o, n);
> + match = hashmap_get(hm, key, o);
> + free(key);
> + break;
> + default: ;
> + }
> +
> + if (!match) {
> + pmb_nr = 0;
> + if (prev_line &&
> +     o->color_moved == MOVED_LINES_BOUNDARY_ALL)
> + mark_color_as_moved_single_line(o, prev_line, 1);
> + prev_line = NULL;
> + continue;
> + }
> +
> + if (o->color_moved == MOVED_LINES_BOUNDARY_NO) {
> + mark_color_as_moved_single_line(o, l, 0);
> + continue;
> + }
> +
> + /* Check any potential block runs, advance each or nullify */
> + for (i = 0; i < pmb_nr; i++) {
> + struct moved_entry *p = pmb[i];
> + struct moved_entry *pnext = (p && p->next_line) ?
> + p->next_line : NULL;
> + if (pnext &&
> +     !diff_line_cmp(pnext->line, l, o)) {
> + pmb[i] = p->next_line;
> + } else {
> + pmb[i] = NULL;
> + }
> + }
> +
> + /* Shrink the set of potential block to the remaining running */
> + for (lp = 0, rp = pmb_nr - 1; lp <= rp;) {
> + while (lp < pmb_nr && pmb[lp])
> + lp++;
> + /* lp points at the first NULL now */
> +
> + while (rp > -1 && !pmb[rp])
> + rp--;
> + /* rp points at the last non-NULL */
> +
> + if (lp < pmb_nr && rp > -1 && lp < rp) {
> + pmb[lp] = pmb[rp];
> + pmb[rp] = NULL;
> + rp--;
> + lp++;
> + }
> + }
> +
> + /* Remember the number of running sets */
> + pmb_nr = rp + 1;
> +
> + if (pmb_nr == 0) {
> + /*
> + * This line is the start of a new block.
> + * Setup the set of potential blocks.
> + */
> + for (; match; match = hashmap_get_next(hm, match)) {
> + ALLOC_GROW(pmb, pmb_nr + 1, pmb_alloc);
> + pmb[pmb_nr++] = match;
> + }
> +
> + if (o->color_moved == MOVED_LINES_BOUNDARY_ALL) {
> + adjacent_blocks = 1;
> + } else {
> + /* Check if two blocks are adjacent */
> + adjacent_blocks = prev_line &&
> +   prev_line->sign == l->sign;
> + }
> + }
> +
> + if (o->color_moved == MOVED_LINES_ALTERNATE) {
> + if (adjacent_blocks)
> + flipped_block = (flipped_block + 1) % 2;
> + mark_color_as_moved_single_line(o, l, flipped_block);
> + } else {
> + /* MOVED_LINES_BOUNDARY_{ADJACENT, ALL} */
> + mark_color_as_moved_single_line(o, l, adjacent_blocks);
> + if (adjacent_blocks && prev_line)
> + prev_line->set = l->set;
> + }
> +
> + prev_line = l;
> + }
> + if (prev_line && o->color_moved == MOVED_LINES_BOUNDARY_ALL)
> + mark_color_as_moved_single_line(o, prev_line, 1);
> +
> + free(pmb);
> +}
> +
> static void emit_diff_line(struct diff_options *o,
>     struct diff_line *e)
> {
> @@ -3518,6 +3809,8 @@ void diff_setup(struct diff_options *options)
>  options->line_buffer = NULL;
>  options->line_buffer_nr = 0;
>  options->line_buffer_alloc = 0;
> +
> + options->color_moved = diff_color_moved_default;
> }
>
> void diff_setup_done(struct diff_options *options)
> @@ -3627,6 +3920,9 @@ void diff_setup_done(struct diff_options *options)
>
>  if (DIFF_OPT_TST(options, FOLLOW_RENAMES) && options->pathspec.nr != 1)
>  die(_("--follow requires exactly one pathspec"));
> +
> + if (!options->use_color || external_diff())
> + options->color_moved = 0;
> }
>
> static int opt_arg(const char *arg, int arg_short, const char *arg_long, 
> int *val)
> @@ -4051,7 +4347,19 @@ int diff_opt_parse(struct diff_options *options,
>  }
>  else if (!strcmp(arg, "--no-color"))
>  options->use_color = 0;
> - else if (!strcmp(arg, "--color-words")) {
> + else if (!strcmp(arg, "--color-moved"))
> + if (diff_color_moved_default)
> + options->color_moved = diff_color_moved_default;
> + else
> + options->color_moved = MOVED_LINES_BOUNDARY_ADJACENT;
> + else if (!strcmp(arg, "--no-color-moved"))
> + options->color_moved = MOVED_LINES_NO;
> + else if (skip_prefix(arg, "--color-moved=", &arg)) {
> + int cm = parse_color_moved(arg);
> + if (cm < 0)
> + die("bad --color-moved argument: %s", arg);
> + options->color_moved = cm;
> + } else if (!strcmp(arg, "--color-words")) {
>  options->use_color = 1;
>  options->word_diff = DIFF_WORDS_COLOR;
>  }
> @@ -4856,16 +5164,9 @@ static void diff_flush_patch_all_file_pairs(struct 
> diff_options *o)
> {
>  int i;
>  struct diff_queue_struct *q = &diff_queued_diff;
> - /*
> - * For testing purposes we want to make sure the diff machinery
> - * works completely with the buffer. If there is anything emitted
> - * outside the emit_diff_line, then the order is screwed
> - * up and the tests will fail.
> - *
> - * TODO (later in this series):
> - * We'll unset this flag in a later patch.
> - */
> - o->use_buffer = 1;
> +
> + if (o->color_moved)
> + o->use_buffer = 1;
>
>  for (i = 0; i < q->nr; i++) {
>  struct diff_filepair *p = q->queue[i];
> @@ -4874,6 +5175,24 @@ static void diff_flush_patch_all_file_pairs(struct 
> diff_options *o)
>  }
>
>  if (o->use_buffer) {
> + if (o->color_moved) {
> + struct hashmap add_lines, del_lines;
> + unsigned ignore_ws = DIFF_XDL_TST(o, IGNORE_WHITESPACE);
> +
> + hashmap_init(&del_lines, ignore_ws ?
> + (hashmap_cmp_fn)moved_entry_cmp_no_ws :
> + (hashmap_cmp_fn)moved_entry_cmp, 0);
> + hashmap_init(&add_lines, ignore_ws ?
> + (hashmap_cmp_fn)moved_entry_cmp_no_ws :
> + (hashmap_cmp_fn)moved_entry_cmp, 0);
> +
> + add_lines_to_move_detection(o, &add_lines, &del_lines);
> + mark_color_as_moved(o, &add_lines, &del_lines);
> +
> + hashmap_free(&add_lines, 0);
> + hashmap_free(&del_lines, 0);
> + }
> +
>  for (i = 0; i < o->line_buffer_nr; i++)
>  emit_diff_line(o, &o->line_buffer[i]);
>
> @@ -4962,6 +5281,7 @@ void diff_flush(struct diff_options *options)
>  if (!options->file)
>  die_errno("Could not open /dev/null");
>  options->close_file = 1;
> + options->color_moved = 0;
>  for (i = 0; i < q->nr; i++) {
>  struct diff_filepair *p = q->queue[i];
>  if (check_pair_status(p))
> diff --git a/diff.h b/diff.h
> index be51e8f867..d9fbafd383 100644
> --- a/diff.h
> +++ b/diff.h
> @@ -7,6 +7,7 @@
> #include "tree-walk.h"
> #include "pathspec.h"
> #include "object.h"
> +#include "hashmap.h"
>
> struct rev_info;
> struct diff_options;
> @@ -228,6 +229,14 @@ struct diff_options {
>
>  struct diff_line *line_buffer;
>  int line_buffer_nr, line_buffer_alloc;
> +
> + enum {
> + MOVED_LINES_NO = 0,
> + MOVED_LINES_BOUNDARY_NO = 1,
> + MOVED_LINES_BOUNDARY_ALL = 2,
> + MOVED_LINES_BOUNDARY_ADJACENT = 3,
> + MOVED_LINES_ALTERNATE = 4,
> + } color_moved;
> };
>
> /* Emit [line_prefix] [set] line [reset] */
> @@ -243,7 +252,11 @@ enum color_diff {
>  DIFF_FILE_NEW = 5,
>  DIFF_COMMIT = 6,
>  DIFF_WHITESPACE = 7,
> - DIFF_FUNCINFO = 8
> + DIFF_FUNCINFO = 8,
> + DIFF_FILE_OLD_MOVED = 9,
> + DIFF_FILE_OLD_MOVED_ALT = 10,
> + DIFF_FILE_NEW_MOVED = 11,
> + DIFF_FILE_NEW_MOVED_ALT = 12
> };
> const char *diff_get_color(int diff_use_color, enum color_diff ix);
> #define diff_get_color_opt(o, ix) \
> diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
> index 289806d0c7..d4bd082af7 100755
> --- a/t/t4015-diff-whitespace.sh
> +++ b/t/t4015-diff-whitespace.sh
> @@ -972,4 +972,377 @@ test_expect_success 'option overrides 
> diff.wsErrorHighlight' '
>
> '
>
> +test_expect_success 'detect moved code, complete file' '
> + git reset --hard &&
> + cat <<-\EOF >test.c &&
> + #include<stdio.h>
> + main()
> + {
> + printf("Hello World");
> + }
> + EOF
> + git add test.c &&
> + git commit -m "add main function" &&
> + git mv test.c main.c &&
> + test_config color.diff.oldMoved "normal red" &&
> + test_config color.diff.newMoved "normal green" &&
> + git diff HEAD --color-moved --no-renames | test_decode_color >actual &&
> + cat >expected <<-\EOF &&
> + <BOLD>diff --git a/main.c b/main.c<RESET>
> + <BOLD>new file mode 100644<RESET>
> + <BOLD>index 0000000..a986c57<RESET>
> + <BOLD>--- /dev/null<RESET>
> + <BOLD>+++ b/main.c<RESET>
> + <CYAN>@@ -0,0 +1,5 @@<RESET>
> + <BGREEN>+<RESET><BGREEN>#include<stdio.h><RESET>
> + <BGREEN>+<RESET><BGREEN>main()<RESET>
> + <BGREEN>+<RESET><BGREEN>{<RESET>
> + <BGREEN>+<RESET><BGREEN>printf("Hello World");<RESET>
> + <BGREEN>+<RESET><BGREEN>}<RESET>
> + <BOLD>diff --git a/test.c b/test.c<RESET>
> + <BOLD>deleted file mode 100644<RESET>
> + <BOLD>index a986c57..0000000<RESET>
> + <BOLD>--- a/test.c<RESET>
> + <BOLD>+++ /dev/null<RESET>
> + <CYAN>@@ -1,5 +0,0 @@<RESET>
> + <BRED>-#include<stdio.h><RESET>
> + <BRED>-main()<RESET>
> + <BRED>-{<RESET>
> + <BRED>-printf("Hello World");<RESET>
> + <BRED>-}<RESET>
> + EOF
> +
> + test_cmp expected actual
> +'
> +
> +test_expect_success 'detect moved code, inside file' '
> + git reset --hard &&
> + cat <<-\EOF >main.c &&
> + #include<stdio.h>
> + int stuff()
> + {
> + printf("Hello ");
> + printf("World\n");
> + }
> +
> + int secure_foo(struct user *u)
> + {
> + if (!u->is_allowed_foo)
> + return;
> + foo(u);
> + }
> +
> + int main()
> + {
> + foo();
> + }
> + EOF
> + cat <<-\EOF >test.c &&
> + #include<stdio.h>
> + int bar()
> + {
> + printf("Hello World, but different\n");
> + }
> +
> + int another_function()
> + {
> + bar();
> + }
> + EOF
> + git add main.c test.c &&
> + git commit -m "add main and test file" &&
> + cat <<-\EOF >main.c &&
> + #include<stdio.h>
> + int stuff()
> + {
> + printf("Hello ");
> + printf("World\n");
> + }
> +
> + int main()
> + {
> + foo();
> + }
> + EOF
> + cat <<-\EOF >test.c &&
> + #include<stdio.h>
> + int bar()
> + {
> + printf("Hello World, but different\n");
> + }
> +
> + int secure_foo(struct user *u)
> + {
> + if (!u->is_allowed_foo)
> + return;
> + foo(u);
> + }
> +
> + int another_function()
> + {
> + bar();
> + }
> + EOF
> + test_config color.diff.oldMoved "normal red" &&
> + test_config color.diff.newMoved "normal green" &&
> + test_config color.diff.oldMovedAlternative "bold red" &&
> + test_config color.diff.newMovedAlternative "bold green" &&
> + git diff HEAD --no-renames --color-moved| test_decode_color >actual &&
> + cat <<-\EOF >expected &&
> + <BOLD>diff --git a/main.c b/main.c<RESET>
> + <BOLD>index 27a619c..7cf9336 100644<RESET>
> + <BOLD>--- a/main.c<RESET>
> + <BOLD>+++ b/main.c<RESET>
> + <CYAN>@@ -5,13 +5,6 @@<RESET> <RESET>printf("Hello ");<RESET>
> + printf("World\n");<RESET>
> + }<RESET>
> + <RESET>
> + <BRED>-int secure_foo(struct user *u)<RESET>
> + <BRED>-{<RESET>
> + <BRED>-if (!u->is_allowed_foo)<RESET>
> + <BRED>-return;<RESET>
> + <BRED>-foo(u);<RESET>
> + <BRED>-}<RESET>
> + <BRED>-<RESET>
> + int main()<RESET>
> + {<RESET>
> + foo();<RESET>
> + <BOLD>diff --git a/test.c b/test.c<RESET>
> + <BOLD>index 1dc1d85..e34eb69 100644<RESET>
> + <BOLD>--- a/test.c<RESET>
> + <BOLD>+++ b/test.c<RESET>
> + <CYAN>@@ -4,6 +4,13 @@<RESET> <RESET>int bar()<RESET>
> + printf("Hello World, but different\n");<RESET>
> + }<RESET>
> + <RESET>
> + <BGREEN>+<RESET><BGREEN>int secure_foo(struct user *u)<RESET>
> + <BGREEN>+<RESET><BGREEN>{<RESET>
> + <BGREEN>+<RESET><BGREEN>if (!u->is_allowed_foo)<RESET>
> + <BGREEN>+<RESET><BGREEN>return;<RESET>
> + <BGREEN>+<RESET><BGREEN>foo(u);<RESET>
> + <BGREEN>+<RESET><BGREEN>}<RESET>
> + <BGREEN>+<RESET>
> + int another_function()<RESET>
> + {<RESET>
> + bar();<RESET>
> + EOF
> +
> + test_cmp expected actual
> +'
> +
> +test_expect_success 'detect permutations inside moved code' '
> + git reset --hard &&
> + cat <<-\EOF >lines.txt &&
> + line 1
> + line 2
> + line 3
> + line 4
> + line 5
> + line 6
> + line 7
> + line 8
> + line 9
> + line 10
> + line 11
> + line 12
> + line 13
> + line 14
> + line 15
> + line 16
> + EOF
> + git add lines.txt &&
> + git commit -m "add poetry" &&
> + cat <<-\EOF >lines.txt &&
> + line 4
> + line 5
> + line 6
> + line 7
> + line 8
> + line 9
> + line 1
> + line 2
> + line 3
> + line 14
> + line 15
> + line 16
> + line 10
> + line 11
> + line 12
> + line 13
> + EOF
> + test_config color.diff.oldMoved "magenta" &&
> + test_config color.diff.newMoved "cyan" &&
> + test_config color.diff.oldMovedAlternative "blue" &&
> + test_config color.diff.newMovedAlternative "yellow" &&
> +
> +
> + git diff HEAD --no-renames --color-moved=nobounds| test_decode_color 
>  >actual &&
> + cat <<-\EOF >expected &&
> + <BOLD>diff --git a/lines.txt b/lines.txt<RESET>
> + <BOLD>index 47ea9c3..ba96a38 100644<RESET>
> + <BOLD>--- a/lines.txt<RESET>
> + <BOLD>+++ b/lines.txt<RESET>
> + <CYAN>@@ -1,16 +1,16 @@<RESET>
> + <MAGENTA>-line 1<RESET>
> + <MAGENTA>-line 2<RESET>
> + <MAGENTA>-line 3<RESET>
> + line 4<RESET>
> + line 5<RESET>
> + line 6<RESET>
> + line 7<RESET>
> + line 8<RESET>
> + line 9<RESET>
> + <CYAN>+<RESET><CYAN>line 1<RESET>
> + <CYAN>+<RESET><CYAN>line 2<RESET>
> + <CYAN>+<RESET><CYAN>line 3<RESET>
> + <CYAN>+<RESET><CYAN>line 14<RESET>
> + <CYAN>+<RESET><CYAN>line 15<RESET>
> + <CYAN>+<RESET><CYAN>line 16<RESET>
> + line 10<RESET>
> + line 11<RESET>
> + line 12<RESET>
> + line 13<RESET>
> + <MAGENTA>-line 14<RESET>
> + <MAGENTA>-line 15<RESET>
> + <MAGENTA>-line 16<RESET>
> + EOF
> + test_cmp expected actual &&
> +
> + git diff HEAD --no-renames --color-moved=adjacentbounds| 
> test_decode_color >actual &&
> + cat <<-\EOF >expected &&
> + <BOLD>diff --git a/lines.txt b/lines.txt<RESET>
> + <BOLD>index 47ea9c3..ba96a38 100644<RESET>
> + <BOLD>--- a/lines.txt<RESET>
> + <BOLD>+++ b/lines.txt<RESET>
> + <CYAN>@@ -1,16 +1,16 @@<RESET>
> + <MAGENTA>-line 1<RESET>
> + <MAGENTA>-line 2<RESET>
> + <MAGENTA>-line 3<RESET>
> + line 4<RESET>
> + line 5<RESET>
> + line 6<RESET>
> + line 7<RESET>
> + line 8<RESET>
> + line 9<RESET>
> + <CYAN>+<RESET><CYAN>line 1<RESET>
> + <CYAN>+<RESET><CYAN>line 2<RESET>
> + <YELLOW>+<RESET><YELLOW>line 3<RESET>
> + <YELLOW>+<RESET><YELLOW>line 14<RESET>
> + <CYAN>+<RESET><CYAN>line 15<RESET>
> + <CYAN>+<RESET><CYAN>line 16<RESET>
> + line 10<RESET>
> + line 11<RESET>
> + line 12<RESET>
> + line 13<RESET>
> + <MAGENTA>-line 14<RESET>
> + <MAGENTA>-line 15<RESET>
> + <MAGENTA>-line 16<RESET>
> + EOF
> + test_cmp expected actual &&
> +
> + test_config diff.colorMoved alternate &&
> + git diff HEAD --no-renames --color-moved| test_decode_color >actual &&
> + cat <<-\EOF >expected &&
> + <BOLD>diff --git a/lines.txt b/lines.txt<RESET>
> + <BOLD>index 47ea9c3..ba96a38 100644<RESET>
> + <BOLD>--- a/lines.txt<RESET>
> + <BOLD>+++ b/lines.txt<RESET>
> + <CYAN>@@ -1,16 +1,16 @@<RESET>
> + <MAGENTA>-line 1<RESET>
> + <MAGENTA>-line 2<RESET>
> + <MAGENTA>-line 3<RESET>
> + line 4<RESET>
> + line 5<RESET>
> + line 6<RESET>
> + line 7<RESET>
> + line 8<RESET>
> + line 9<RESET>
> + <CYAN>+<RESET><CYAN>line 1<RESET>
> + <CYAN>+<RESET><CYAN>line 2<RESET>
> + <CYAN>+<RESET><CYAN>line 3<RESET>
> + <YELLOW>+<RESET><YELLOW>line 14<RESET>
> + <YELLOW>+<RESET><YELLOW>line 15<RESET>
> + <YELLOW>+<RESET><YELLOW>line 16<RESET>
> + line 10<RESET>
> + line 11<RESET>
> + line 12<RESET>
> + line 13<RESET>
> + <BLUE>-line 14<RESET>
> + <BLUE>-line 15<RESET>
> + <BLUE>-line 16<RESET>
> + EOF
> + test_cmp expected actual &&
> +
> + test_config diff.colorMoved allbounds &&
> + git diff HEAD --no-renames --color-moved| test_decode_color >actual &&
> + cat <<-\EOF >expected &&
> + <BOLD>diff --git a/lines.txt b/lines.txt<RESET>
> + <BOLD>index 47ea9c3..ba96a38 100644<RESET>
> + <BOLD>--- a/lines.txt<RESET>
> + <BOLD>+++ b/lines.txt<RESET>
> + <CYAN>@@ -1,16 +1,16 @@<RESET>
> + <BLUE>-line 1<RESET>
> + <MAGENTA>-line 2<RESET>
> + <BLUE>-line 3<RESET>
> + line 4<RESET>
> + line 5<RESET>
> + line 6<RESET>
> + line 7<RESET>
> + line 8<RESET>
> + line 9<RESET>
> + <YELLOW>+<RESET><YELLOW>line 1<RESET>
> + <CYAN>+<RESET><CYAN>line 2<RESET>
> + <YELLOW>+<RESET><YELLOW>line 3<RESET>
> + <YELLOW>+<RESET><YELLOW>line 14<RESET>
> + <CYAN>+<RESET><CYAN>line 15<RESET>
> + <YELLOW>+<RESET><YELLOW>line 16<RESET>
> + line 10<RESET>
> + line 11<RESET>
> + line 12<RESET>
> + line 13<RESET>
> + <BLUE>-line 14<RESET>
> + <MAGENTA>-line 15<RESET>
> + <BLUE>-line 16<RESET>
> + EOF
> + test_cmp expected actual
> +'
> +
> +test_expect_success 'move detection does not mess up colored words' '
> + cat <<-\EOF >text.txt &&
> + Lorem Ipsum is simply dummy text of the printing and typesetting 
> industry.
> + EOF
> + git add text.txt &&
> + git commit -a -m "clean state" &&
> + cat <<-\EOF >text.txt &&
> + simply Lorem Ipsum dummy is text of the typesetting and printing 
> industry.
> + EOF
> + git diff --color-moved --word-diff >actual &&
> + git diff --word-diff >expect &&
> + test_cmp expect actual
> +'
> +
> +test_expect_success 'move detection with submodules' '
> + test_create_repo bananas &&
> + echo ripe >bananas/recipe &&
> + git -C bananas add recipe &&
> + test_commit fruit &&
> + test_commit -C bananas recipe &&
> + git submodule add ./bananas &&
> + git add bananas &&
> + git commit -a -m "bananas are like a heavy library?" &&
> + echo foul >bananas/recipe &&
> + echo ripe >fruit.t &&
> +
> + git diff --submodule=diff --color-moved >actual &&
> +
> + # no move detection as the moved line is across repository boundaries.
> + test_decode_color <actual >decoded_actual &&
> + ! grep BGREEN decoded_actual &&
> + ! grep BRED decoded_actual &&
> +
> + # nor did we mess with it another way
> + git diff --submodule=diff | test_decode_color >expect &&
> + test_cmp expect decoded_actual
> +'
> +
> test_done
> -- 
> 2.13.0.17.gab62347cd9
> 


^ permalink raw reply	[relevance 8%]

* Re: git push recurse.submodules behavior changed in 2.13
      [irrelevant] ` <xmqqinkk8jqm.fsf@gitster.mtv.corp.google.com>
@ 2017-05-29  4:20   ` Stefan Beller
  2017-05-30 12:01     ` John Shahid
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-05-29  4:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Brandon Williams, git, John Shahid

On Sun, May 28, 2017 at 7:44 PM, Junio C Hamano <gitster@pobox.com> wrote:
> John Shahid <jvshahid@gmail.com> writes:
>
>> It looks like the git push recurse-submodules behavior has changed.
>> Currently with 2.13 you cannot run "git push
>> --recurse-submodules=on-demand" if the parent repo is on a different
>> branch than the sub repos, e.g. parent repo is on "develop" and
>> sub-repo on "master". I created a test that can be found here [1].
>>
>> A bisect shows that the change to propagate refspec [2] to the
>> submodules is the culprit. imho this is an undesired change in
>> behavior. I looked at the code but couldn't see an easy way to fix
>> this issue without breaking the feature mentioned above. The only
>> option I can think of is to control the refspec propagation behavior
>> using a flag, e.g. "--propagate-refspecs" or add another
>> recurse-submodules option, e.g. "--recurse-submodules=propagate"
>>
>> What do you all think ?
>>
>> [1] https://gist.github.com/jvshahid/b778702cc3d825c6887d2707e866a9c8
>> [2] https://github.com/git/git/commit/06bf4ad1db92c32af38e16d9b7f928edbd647780
>
> Brandon?  I cannot quite tell from the report what "has changed"
> refers to, what failures "you cannot run" gets, and if that is a
> desirable thing to do (i.e. if letting the user run it in such a
> configuration would somehow break things, actively erroring out may
> be a deliberate change) or not (i.e. an unintended regression).
>

Before the refspec was passed down into the submodules,
we'd just invoke "git push" in the submodule assuming the user
setup a remote tracking branch and a push strategy such that
"git push" would do the right thing.
And because the submodule is configured independently, it
doesn't matter which branch you're on in the superproject.

Looking at the test[1], you run "git push --recurse-submodules"
without any remote/branch that was called out in the commit
message[2] to not have changed. Is that understanding correct?

Looking at the test cases of [2] we did not test for explicit
"still works with no args given", though one could have expected
we'd have a test for that already. :/

Thanks,
Stefan

^ permalink raw reply	[relevance 26%]

* [GSoC] Update: Week 2
@ 2017-05-29 20:41 Prathamesh Chavan
  0 siblings, 0 replies; 200+ results
From: Prathamesh Chavan @ 2017-05-29 20:41 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller, Christian Couder

SUMMARY OF MY PROJECT:

Git submodule subcommands are currently implemented by using shell script
'git-submodule.sh'. There are several reasons why we'll prefer not to
use the shell script. My project intends to convert the subcommands into
C code, thus making them builtins. This will increase Git's portability
and hence the efficiency of working with the git-submodule commands.
Link to the complete proposal: [1]

Mentors:
Stefan Beller <sbeller@google.com>
Christian Couder <christian.couder@gmail.com>

UPDATES:

As planned for the second week, I continued working on completing the porting
of submodule subcommand foreach[2][3][4] and status.[5][6] An updated version
of these was added to the mailing list as well.

For the submodule-status, I have implemented the suggestions received on the
previous patch. But for submodule-foreach, still, some issues are left to be
solved.

Apart from this, in this week, porting of submodule subcommand sync was  also
carried out. But instead of adding anymore floating patches on the  mailing
list, I have started discussing the patch with my mentors itself, so that on
the mailing list, the focus would remain with the ported submodule subcommands
status and foreach patches.

I have also taken up with the submodule subcommand summary for porting.

PLAN FOR WEEK-3 (30 May 2017 to 5 June 2017):

As suggested by my mentors, in this week, instead of adding more floating
patches to the mailing list and porting more submodule subcommand, I would
like to polish the existing patches and try to resolve the issues they
currently have, eventually aiming for getting them merged.

Also, since I have also completed porting of submodule subcommand sync, after
reviewing the patches with mentors I'll soon be posting it on the  mailing
list.

Additionally, I will also try to complete porting of submodule-subcommand
summary in this week itself.

[1]: https://docs.google.com/document/d/1krxVLooWl--75Pot3dazhfygR3wCUUWZWzTXtK1L-xU/
[2]: https://public-inbox.org/git/20170526151713.10974-1-pc44800@gmail.com/
[3]: https://public-inbox.org/git/20170526151713.10974-2-pc44800@gmail.com/
[4]: https://public-inbox.org/git/20170526151713.10974-3-pc44800@gmail.com/
[5]: https://public-inbox.org/git/20170521122711.22021-1-pc44800@gmail.com/
[6]: https://public-inbox.org/git/20170521122711.22021-2-pc44800@gmail.com/

^ permalink raw reply	[relevance 18%]

* Re: [PATCHv2 0/8] A reroll of sb/submodule-blanket-recursive
  2017-05-26 19:10 [PATCHv2 0/8] A reroll of sb/submodule-blanket-recursive Stefan Beller
                   ` (7 preceding siblings ...)
  2017-05-26 19:10 ` [PATCH 8/8] builtin/fetch.c: respect 'submodule.recurse' option Stefan Beller
@ 2017-05-30  5:30 ` Junio C Hamano
  2017-06-01  0:30   ` [PATCHv3 0/4] A reroll of sb/submodule-blanket-recursive Stefan Beller
  8 siblings, 1 reply; 200+ results
From: Junio C Hamano @ 2017-05-30  5:30 UTC (permalink / raw)
  To: Stefan Beller; +Cc: bmwill, git

Stefan Beller <sbeller@google.com> writes:

> v2:
> * A reroll of sb/submodule-blanket-recursive.
> * This requires ab/grep-preparatory-cleanup 

2/8 seems to be more stale than sb/checkout-recurse-submodules that
was merged at f1101cef to 'master'.  I'll try to merge Ævar's series
to 'master' before that merge, queue these patches and see if the
resulting history is too messy.

Thanks.

^ permalink raw reply	[relevance 22%]

* Re: git push recurse.submodules behavior changed in 2.13
  2017-05-29  4:20   ` Re: git push recurse.submodules behavior changed in 2.13 Stefan Beller
@ 2017-05-30 12:01     ` John Shahid
  2017-05-30 17:05       ` Brandon Williams
  0 siblings, 1 reply; 200+ results
From: John Shahid @ 2017-05-30 12:01 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Junio C Hamano, Brandon Williams, git

Junio, sorry for the poor report. I totally forgot to describe the
behavior that i'm currently getting vs what i expect.

Expected behavior:

We have a parent repo on a branch called "develop" and a submodule on
a branch called "master". Prior to git version 2.13 if we had an
unpushed commit in the submodule and ran "git push origin develop
--recurse-submodules=on-demand" git would happily push the develop
branch of the parent repo and the master branch of the submodule,
e.g.:

> Pushing submodule 'sub'
> Counting objects: 2, done.
> Delta compression using up to 4 threads.
> Compressing objects: 100% (2/2), done.
> Writing objects: 100% (2/2), 242 bytes | 0 bytes/s, done.
> Total 2 (delta 0), reused 0 (delta 0)
> To /home/jvshahid/codez/git/t/trash directory.t9904-diff-branch-submodule-push/sub.git
>    3cd2129..69cbc06  master -> master
> Counting objects: 2, done.
> Delta compression using up to 4 threads.
> Compressing objects: 100% (2/2), done.
> Writing objects: 100% (2/2), 283 bytes | 0 bytes/s, done.
> Total 2 (delta 0), reused 0 (delta 0)
> To ../pub.git
>    7ff6fca..945bc12  develop -> develop
> ok 2 - push if submodule has is on a different branch

Actual behavior:

After the change mentioned in my previous email, git would propagate
the refspec from the parent repo to the submodule, i.e. it would try
to push a branch called "develop" in the submodule which would error
since no branch with that name exist in the submodule. Here is a
sample output with git v2.13.0:

> pushing to ref refs/heads/develop:refs/heads/develop
> pushging to remote origin
> fatal: src refspec 'refs/heads/develop' must name a ref
> fatal: process for submodule 'sub' failed
> fatal: The remote end hung up unexpectedly

I hope this clarifies my bug report.

Stefan, one little correction. I don't think the commit called out
this behavior. The commit message was talking about unconfigured
remotes (i.e. pushing to a url or local path) to not propagate the
refspec and preserve the current behavior. Judging from the code and a
test case that I wrote, this behavior is working as expected. That is,
git won't propagate the refspec.

Cheers,

JS

On Mon, May 29, 2017 at 12:20 AM, Stefan Beller <sbeller@google.com> wrote:
> On Sun, May 28, 2017 at 7:44 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> John Shahid <jvshahid@gmail.com> writes:
>>
>>> It looks like the git push recurse-submodules behavior has changed.
>>> Currently with 2.13 you cannot run "git push
>>> --recurse-submodules=on-demand" if the parent repo is on a different
>>> branch than the sub repos, e.g. parent repo is on "develop" and
>>> sub-repo on "master". I created a test that can be found here [1].
>>>
>>> A bisect shows that the change to propagate refspec [2] to the
>>> submodules is the culprit. imho this is an undesired change in
>>> behavior. I looked at the code but couldn't see an easy way to fix
>>> this issue without breaking the feature mentioned above. The only
>>> option I can think of is to control the refspec propagation behavior
>>> using a flag, e.g. "--propagate-refspecs" or add another
>>> recurse-submodules option, e.g. "--recurse-submodules=propagate"
>>>
>>> What do you all think ?
>>>
>>> [1] https://gist.github.com/jvshahid/b778702cc3d825c6887d2707e866a9c8
>>> [2] https://github.com/git/git/commit/06bf4ad1db92c32af38e16d9b7f928edbd647780
>>
>> Brandon?  I cannot quite tell from the report what "has changed"
>> refers to, what failures "you cannot run" gets, and if that is a
>> desirable thing to do (i.e. if letting the user run it in such a
>> configuration would somehow break things, actively erroring out may
>> be a deliberate change) or not (i.e. an unintended regression).
>>
>
> Before the refspec was passed down into the submodules,
> we'd just invoke "git push" in the submodule assuming the user
> setup a remote tracking branch and a push strategy such that
> "git push" would do the right thing.
> And because the submodule is configured independently, it
> doesn't matter which branch you're on in the superproject.
>
> Looking at the test[1], you run "git push --recurse-submodules"
> without any remote/branch that was called out in the commit
> message[2] to not have changed. Is that understanding correct?
>
> Looking at the test cases of [2] we did not test for explicit
> "still works with no args given", though one could have expected
> we'd have a test for that already. :/
>
> Thanks,
> Stefan

^ permalink raw reply	[relevance 25%]

* Re: git push recurse.submodules behavior changed in 2.13
  2017-05-30 12:01     ` John Shahid
@ 2017-05-30 17:05       ` Brandon Williams
  0 siblings, 0 replies; 200+ results
From: Brandon Williams @ 2017-05-30 17:05 UTC (permalink / raw)
  To: John Shahid; +Cc: Stefan Beller, Junio C Hamano, git

On 05/30, John Shahid wrote:
> Junio, sorry for the poor report. I totally forgot to describe the
> behavior that i'm currently getting vs what i expect.
> 
> Expected behavior:
> 
> We have a parent repo on a branch called "develop" and a submodule on
> a branch called "master". Prior to git version 2.13 if we had an
> unpushed commit in the submodule and ran "git push origin develop
> --recurse-submodules=on-demand" git would happily push the develop
> branch of the parent repo and the master branch of the submodule,
> e.g.:

Yeah my patches would definitely break that kind of workflow because
they assumed that if you actually provided a refspec + --recurse that
you would want it propagated down.  When developing those patches I was
trying to avoid needing to add an additional flag to do the propagation
but given people were already relying on this behavior it looks like
that may be the only course of action.

> 
> > Pushing submodule 'sub'
> > Counting objects: 2, done.
> > Delta compression using up to 4 threads.
> > Compressing objects: 100% (2/2), done.
> > Writing objects: 100% (2/2), 242 bytes | 0 bytes/s, done.
> > Total 2 (delta 0), reused 0 (delta 0)
> > To /home/jvshahid/codez/git/t/trash directory.t9904-diff-branch-submodule-push/sub.git
> >    3cd2129..69cbc06  master -> master
> > Counting objects: 2, done.
> > Delta compression using up to 4 threads.
> > Compressing objects: 100% (2/2), done.
> > Writing objects: 100% (2/2), 283 bytes | 0 bytes/s, done.
> > Total 2 (delta 0), reused 0 (delta 0)
> > To ../pub.git
> >    7ff6fca..945bc12  develop -> develop
> > ok 2 - push if submodule has is on a different branch
> 
> Actual behavior:
> 
> After the change mentioned in my previous email, git would propagate
> the refspec from the parent repo to the submodule, i.e. it would try
> to push a branch called "develop" in the submodule which would error
> since no branch with that name exist in the submodule. Here is a
> sample output with git v2.13.0:
> 
> > pushing to ref refs/heads/develop:refs/heads/develop
> > pushging to remote origin
> > fatal: src refspec 'refs/heads/develop' must name a ref
> > fatal: process for submodule 'sub' failed
> > fatal: The remote end hung up unexpectedly
> 
> I hope this clarifies my bug report.
> 
> Stefan, one little correction. I don't think the commit called out
> this behavior. The commit message was talking about unconfigured
> remotes (i.e. pushing to a url or local path) to not propagate the
> refspec and preserve the current behavior. Judging from the code and a
> test case that I wrote, this behavior is working as expected. That is,
> git won't propagate the refspec.
> 
> Cheers,
> 
> JS
> 
> On Mon, May 29, 2017 at 12:20 AM, Stefan Beller <sbeller@google.com> wrote:
> > On Sun, May 28, 2017 at 7:44 PM, Junio C Hamano <gitster@pobox.com> wrote:
> >> John Shahid <jvshahid@gmail.com> writes:
> >>
> >>> It looks like the git push recurse-submodules behavior has changed.
> >>> Currently with 2.13 you cannot run "git push
> >>> --recurse-submodules=on-demand" if the parent repo is on a different
> >>> branch than the sub repos, e.g. parent repo is on "develop" and
> >>> sub-repo on "master". I created a test that can be found here [1].
> >>>
> >>> A bisect shows that the change to propagate refspec [2] to the
> >>> submodules is the culprit. imho this is an undesired change in
> >>> behavior. I looked at the code but couldn't see an easy way to fix
> >>> this issue without breaking the feature mentioned above. The only
> >>> option I can think of is to control the refspec propagation behavior
> >>> using a flag, e.g. "--propagate-refspecs" or add another
> >>> recurse-submodules option, e.g. "--recurse-submodules=propagate"
> >>>
> >>> What do you all think ?
> >>>
> >>> [1] https://gist.github.com/jvshahid/b778702cc3d825c6887d2707e866a9c8
> >>> [2] https://github.com/git/git/commit/06bf4ad1db92c32af38e16d9b7f928edbd647780
> >>
> >> Brandon?  I cannot quite tell from the report what "has changed"
> >> refers to, what failures "you cannot run" gets, and if that is a
> >> desirable thing to do (i.e. if letting the user run it in such a
> >> configuration would somehow break things, actively erroring out may
> >> be a deliberate change) or not (i.e. an unintended regression).
> >>
> >
> > Before the refspec was passed down into the submodules,
> > we'd just invoke "git push" in the submodule assuming the user
> > setup a remote tracking branch and a push strategy such that
> > "git push" would do the right thing.
> > And because the submodule is configured independently, it
> > doesn't matter which branch you're on in the superproject.
> >
> > Looking at the test[1], you run "git push --recurse-submodules"
> > without any remote/branch that was called out in the commit
> > message[2] to not have changed. Is that understanding correct?
> >
> > Looking at the test cases of [2] we did not test for explicit
> > "still works with no args given", though one could have expected
> > we'd have a test for that already. :/
> >
> > Thanks,
> > Stefan

-- 
Brandon Williams

^ permalink raw reply	[relevance 7%]

* Re: What's cooking in git.git (May 2017, #08; Mon, 29)
      [irrelevant] <xmqq1sr889lb.fsf@gitster.mtv.corp.google.com>
@ 2017-05-30 17:42 ` Stefan Beller
  2017-05-30 23:07   ` Junio C Hamano
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-05-30 17:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Sun, May 28, 2017 at 11:23 PM, Junio C Hamano <gitster@pobox.com> wrote:

> A bit more topics are now in 'master'.  One unfortunate thing is
> that the SHA1 breakage in 2.13 for big-endian platforms were lost in
> the noise with excitement felt by some subset of contributors with
> the possible use of submodules.

I am sorry about being excited, without considering the immediate
pressing issues.

> The first step in the series is
> neutral to the excitement, and should be fast-tracked down to
> 'maint' soonish.

Yes I agree on that. Thanks for being calm and unexcited about that!

>
>
> * sb/diff-color-move (2017-05-25) 17 commits
>  - diff.c: color moved lines differently
...
>
>  "git diff" has been taught to optionally paint new lines that are
>  the same as deleted lines elsewhere differently from genuinely new
>  lines.

My current understanding is that we agree on having the first n-1 patches
in good shape[1] and are only discussing how the exact line coloring
algorithm should look like, so I resent that separately[2]. While
it has better documentation and tests (also a command line option)
Philip still found some issues in there, so I will revisit that patch once
more.

[1] https://public-inbox.org/git/xmqq7f15e8pu.fsf@gitster.mtv.corp.google.com/
[2] https://public-inbox.org/git/20170527001820.25214-2-sbeller@google.com/

>
> * sb/submodule-blanket-recursive (2017-05-23) 6 commits
...
>
>  Retracted for now.
>  cf. <CAGZ79kZexcwh=E6_ks83=pJh=ZvKnLvJ54eLsn+HURsTZOpvqg@mail.gmail.com>

And the retraction is retracted by sending a new series.
You remarked that it still misbehaves with other series in flight,
so I'll inspect it again.

^ permalink raw reply	[relevance 8%]

* Re: [PATCHv4 00/17] Diff machine: highlight moved lines.
  2017-05-27  1:04   ` Jacob Keller
@ 2017-05-30 21:38     ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-05-30 21:38 UTC (permalink / raw)
  To: Jacob Keller; +Cc: Junio C Hamano, Git mailing list, Brandon Williams, Jonathan Nieder, Jonathan Tan, Jeff King, Michael Haggerty

On Fri, May 26, 2017 at 6:04 PM, Jacob Keller <jacob.keller@gmail.com> wrote:
> On Mon, May 22, 2017 at 7:40 PM, Stefan Beller <sbeller@google.com> wrote:
>> v4:
>> * interdiff to v3 (what is currently origin/sb/diff-color-move) below.
>> * renamed the "buffered_patch_line" to "diff_line". Originally I planned
>>   to not carry the "line" part as it can be a piece of a line as well.
>>   But for the intended functionality it is best to keep the name.
>>   If we'd want to add more functionality to say have a move detection
>>   for words as well, we'd rename the struct to have a better name then.
>>   For now diff_line is the best. (Thanks Jonathan Nieder!)
>> * tests to demonstrate it doesn't mess with --color-words as well as
>>   submodules. (Thanks Jonathan Tan!)
>> * added in the statics (Thanks Ramsay!)
>> * smaller scope for the hashmaps (Thanks Jonathan Tan!)
>> * some commit messages were updated, prior patch 4-7 is squashed into one
>>   (Thanks Jonathan Tan!)
>> * the tests added revealed an actual fault: now that the submodule process
>>   is not attached to a dupe of our stdout, it would stop coloring the
>>   output. We need to pass on use-color explicitly.
>> * updated the NEEDSWORK comment in the second last patch.
>>
>> Thanks for bearing,
>> Stefan
>>
>
> One thing to note when I was playing around with what's on pu right
> now, I noticed that the oldMovedAlternative and newMovedAlternative
> are the first moved colors to be used if there is only one move. (Ie:
> a simple case of literally one section moved) This is a bit weird that
> the alternative colors are used before the "main" colors. I would have
> thought it would be the other way.

While pu is not up-to-date, I double checked with the most recent
implementation and that is no longer the case.

> I noticed this because the default colors do not work well for my
> terminal color scheme and I had to configure but realized that I
> needed to configure the alternative ones to make a difference in the
> simple diff I was viewing.

The v4 that you tested, is the "alternate" scheme in the resend
https://public-inbox.org/git/20170527001820.25214-2-sbeller@google.com/

Thanks,
Stefan

^ permalink raw reply	[relevance 9%]

* Re: [GSoC][PATCH v5 1/3] submodule: fix buggy $path and $sm_path variable's value
  2017-05-27  1:10           ` Ramsay Jones
@ 2017-05-30 21:53             ` Stefan Beller
  2017-05-30 23:07               ` Ramsay Jones
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-05-30 21:53 UTC (permalink / raw)
  To: Ramsay Jones; +Cc: Prathamesh Chavan, git, Brandon Williams, Christian Couder

On Fri, May 26, 2017 at 6:10 PM, Ramsay Jones
<ramsay@ramsayjones.plus.com> wrote:
>
>
> On 26/05/17 18:07, Stefan Beller wrote:
>> On Fri, May 26, 2017 at 9:31 AM, Ramsay Jones
>> <ramsay@ramsayjones.plus.com> wrote:
>>> Hmm, I'm not sure which documentation you are referring to,
>>
>> Quite likely our fine manual pages. ;)
>>
>>        foreach [--recursive] <command>
>>            Evaluates an arbitrary shell command in each checked out submodule.
>>            The command has access to the variables $name, $path, $sha1 and
>>            $toplevel: $name is the name of the relevant submodule section in
>>            .gitmodules, $path is the name of the submodule directory relative
>>            to the superproject, $sha1 is the commit as recorded in the
>>            superproject, and $toplevel is the absolute path to the top-level
>>            of the superproject. Any submodules defined in the superproject but
>>            not checked out are ignored by this command. Unless given --quiet,
>>            foreach prints the name of each submodule before evaluating the
>>            command. If --recursive is given, submodules are traversed
>>            recursively (i.e. the given shell command is evaluated in nested
>>            submodules as well). A non-zero return from the command in any
>>            submodule causes the processing to terminate. This can be
>>            overridden by adding || : to the end of the command.
>
> I suspected as much, but I was wondering specifically if $sm_path
> had been documented anywhere. I didn't think so, but ...
>
>> As $path is documented and $sm_path is not, we should care about
>> $path first to be correct and either fix the documentation or the implementation
>> such that we have a consistent world view. :)
>
> Sure, but what is that world view? :-D
>
> I suspect that commit 091a6eb0fe did not intend (should not have)
> used $sm_path in that test. If we were to 'fix' that test, would
> it still work?
>
> Back in 2012, the submodule list was generated by filtering the
> output of 'git ls-files --error-unmatch --stage --'; but I don't
> recall if (at that time) git-ls-files required being at the top
> of the working tree, or if it would execute fine in a sub-directory.
> So, it's possible that the documentation of $path was wrong all along.
> ;-)
>
> At that time, by definition, $path == $sm_path. However, you know this
> stuff much better than me (I don't use git-submodule), so ...

Don't take that stance. Sometimes I shoot quickly from the hip without
considering consequences (Figuratively).

In a foreach command I can see value both in the 'displaypath'
(what sm_path would become here) and the 'submodule path'
from the superproject. The naming of 'sm_path' hints at that it ought
to be the 'submodule path'.

>>
>>     $path (as documented) is the name of the submodule directory
>>     relative to the direct superproject (so in nested submodules you
>>     go up only one level).
>>
>> $sm_path on the other hand is not documented at all and yields
>> non-sense results in corner cases.
>
> Hmm, at what point did '$sm_path yields non-sense results' start
> being the case? (perhaps the corner cases need to be fixed first).

Well the corner case is described in the patchs notes.
So that patch would fix it to be consistent with the new world view
(that I have in mind) as I do not know about the 2012 ideas how submodules
ought to behave correctly.

>> With this patch it becomes less non-sensey and could be documented as:
>>
>>     $sm_path is the relative path from the current working directory
>>     to the submodule (ignoring relations to the superproject or nesting
>>     of submodules).
>
> OK.
>
>>                      This documentation also fits into the narrative of
>>     the test in t7407.
>
> Hmm, does it?

After rereading that test, I would think so?

Thanks for keeping discussing this.

So maybe we want to
* keep path=sm_path
* fix the documentation via s/$path/$sm_path/g in that section quoted above.
* Introduce a new variable sm_display_path that contains what this patch
  proposes sm_path to be.
* fix the test in t7407 by checking both sm_path (fixed) as well
  as sm_display_path (what is currently recorded in sm_path)
---
In the next patch:
* Additionally in the rewrite in C, we would do an

    #ifndef WINDOWS /* need to lookup the exact macro */
        argv_array_push(env_vars, "path=%s", sm_path);
    #endif

such that Windows users are forced to migrate to sm_path
as path/Path is case sensitive there. sm_path being documented
value, so it should work fine?

Thanks,
Stefan

^ permalink raw reply	[relevance 23%]

* Re: [GSoC][PATCH v5 1/3] submodule: fix buggy $path and $sm_path variable's value
  2017-05-30 21:53             ` Stefan Beller
@ 2017-05-30 23:07               ` Ramsay Jones
  2017-05-30 23:29                 ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Ramsay Jones @ 2017-05-30 23:07 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Prathamesh Chavan, git, Brandon Williams, Christian Couder, Johannes Sixt



On 30/05/17 22:53, Stefan Beller wrote:
> On Fri, May 26, 2017 at 6:10 PM, Ramsay Jones
> <ramsay@ramsayjones.plus.com> wrote:
>> On 26/05/17 18:07, Stefan Beller wrote:
>>> On Fri, May 26, 2017 at 9:31 AM, Ramsay Jones
>>> <ramsay@ramsayjones.plus.com> wrote:

>> Back in 2012, the submodule list was generated by filtering the
>> output of 'git ls-files --error-unmatch --stage --'; but I don't
>> recall if (at that time) git-ls-files required being at the top
>> of the working tree, or if it would execute fine in a sub-directory.
>> So, it's possible that the documentation of $path was wrong all along.
>> ;-)
>>
>> At that time, by definition, $path == $sm_path. However, you know this
>> stuff much better than me (I don't use git-submodule), so ...
> 
> Don't take that stance. Sometimes I shoot quickly from the hip without
> considering consequences (Figuratively).
> 
> In a foreach command I can see value both in the 'displaypath'
> (what sm_path would become here) and the 'submodule path'
> from the superproject. The naming of 'sm_path' hints at that it ought
> to be the 'submodule path'.

Well, since I introduced it, I can confidently proclaim that it is
indeed the 'submodule path'. :-D

As I said above, I can't remember how git-ls-files worked back then,
but it seems that I thought of it as the path to the submodule from
the root of the working tree. Again, by definition, $sm_path == $path
(as documented). Of course, that may have changed since then.

>>> With this patch it becomes less non-sensey and could be documented as:
>>>
>>>     $sm_path is the relative path from the current working directory
>>>     to the submodule (ignoring relations to the superproject or nesting
>>>     of submodules).
>>
>> OK.
>>
>>>                      This documentation also fits into the narrative of
>>>     the test in t7407.
>>
>> Hmm, does it?
> 
> After rereading that test, I would think so?

Really? So, if it differs from $path, then something changed between
commit 64394e3ae9 and commit 091a6eb0fe. I haven't really read that
commit/test, so take what I say with a pinch of salt ...

> Thanks for keeping discussing this.
> 
> So maybe we want to
> * keep path=sm_path

As I said in commit 64394e3ae9, $path was part of the API, so I could
not just rename it, without a deprecation period, etc ... So, I was
'crossing my fingers' that nobody would export $path in their user
scripts (not very likely, after all).

> * fix the documentation via s/$path/$sm_path/g in that section quoted above.

So, "$path is the name of the submodule directory relative to the
superproject", as currently documented in the man page, yes?

So, $sm_path == $path, at least for some period?

> * Introduce a new variable sm_display_path that contains what this patch
>   proposes sm_path to be.

So, this would be the path from the cwd to the submodule, yes?

> * fix the test in t7407 by checking both sm_path (fixed) as well
>   as sm_display_path (what is currently recorded in sm_path)

Hmm, ...

> In the next patch:
> * Additionally in the rewrite in C, we would do an
> 
>     #ifndef WINDOWS /* need to lookup the exact macro */
>         argv_array_push(env_vars, "path=%s", sm_path);
>     #endif
> 
> such that Windows users are forced to migrate to sm_path
> as path/Path is case sensitive there. sm_path being documented
> value, so it should work fine?

Well, as you saw in a separate thread, I can no longer get
cygwin to fail, so something (probably in the cygwin runtime)
has changed since 2012 to make this work now, despite the
case insensitive win32 environment block. (This may also be
true of MSYS2, but I haven't tested it).

I have not tested this on MYSY2/MinGW/Git-for-windows, but
Johannes Sixt was concerned about this, so I guess it may
still be a problem there.

I don't know how windows folks will feel about simply
removing $path, ...


ATB,
Ramsay Jones


^ permalink raw reply	[relevance 23%]

* Re: What's cooking in git.git (May 2017, #08; Mon, 29)
  2017-05-30 17:42 ` Re: What's cooking in git.git (May 2017, #08; Mon, 29) Stefan Beller
@ 2017-05-30 23:07   ` Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2017-05-30 23:07 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git\

Stefan Beller <sbeller@google.com> writes:

>>
>> * sb/submodule-blanket-recursive (2017-05-23) 6 commits
> ...
>
> And the retraction is retracted by sending a new series.
> You remarked that it still misbehaves with other series in flight,
> so I'll inspect it again.

What I said (or at least what I ment to say) was "it looked like it
is based on an older codebase and I do not yet know how messy the
conflict resolution and resulting history would look like".



^ permalink raw reply	[relevance 8%]

* Re: [GSoC][PATCH v5 1/3] submodule: fix buggy $path and $sm_path variable's value
  2017-05-30 23:07               ` Ramsay Jones
@ 2017-05-30 23:29                 ` Stefan Beller
  2017-05-31  0:13                   ` Ramsay Jones
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-05-30 23:29 UTC (permalink / raw)
  To: Ramsay Jones; +Cc: Prathamesh Chavan, git, Brandon Williams, Christian Couder, Johannes Sixt

On Tue, May 30, 2017 at 4:07 PM, Ramsay Jones
<ramsay@ramsayjones.plus.com> wrote:
>
>
> On 30/05/17 22:53, Stefan Beller wrote:
>> On Fri, May 26, 2017 at 6:10 PM, Ramsay Jones
>> <ramsay@ramsayjones.plus.com> wrote:
>>> On 26/05/17 18:07, Stefan Beller wrote:
>>>> On Fri, May 26, 2017 at 9:31 AM, Ramsay Jones
>>>> <ramsay@ramsayjones.plus.com> wrote:
>
>>> Back in 2012, the submodule list was generated by filtering the
>>> output of 'git ls-files --error-unmatch --stage --'; but I don't
>>> recall if (at that time) git-ls-files required being at the top
>>> of the working tree, or if it would execute fine in a sub-directory.
>>> So, it's possible that the documentation of $path was wrong all along.
>>> ;-)
>>>
>>> At that time, by definition, $path == $sm_path. However, you know this
>>> stuff much better than me (I don't use git-submodule), so ...
>>
>> Don't take that stance. Sometimes I shoot quickly from the hip without
>> considering consequences (Figuratively).
>>
>> In a foreach command I can see value both in the 'displaypath'
>> (what sm_path would become here) and the 'submodule path'
>> from the superproject. The naming of 'sm_path' hints at that it ought
>> to be the 'submodule path'.
>
> Well, since I introduced it, I can confidently proclaim that it is
> indeed the 'submodule path'. :-D

ok. :)

> As I said above, I can't remember how git-ls-files worked back then,
> but it seems that I thought of it as the path to the submodule from
> the root of the working tree. Again, by definition, $sm_path == $path
> (as documented). Of course, that may have changed since then.

Documented in 64394e3 (git-submodule.sh: Don't use $path variable in
eval_gettext string, by yourself)

What I intended to say above was "documented to the end user",
and I do not count our commit messages as such. The end user facing
documentation only talks about path, not mentioning sm_path.

>>>> With this patch it becomes less non-sensey and could be documented as:
>>>>
>>>>     $sm_path is the relative path from the current working directory
>>>>     to the submodule (ignoring relations to the superproject or nesting
>>>>     of submodules).
>>>
>>> OK.
>>>
>>>>                      This documentation also fits into the narrative of
>>>>     the test in t7407.
>>>
>>> Hmm, does it?
>>
>> After rereading that test, I would think so?
>
> Really? So, if it differs from $path, then something changed between
> commit 64394e3ae9 and commit 091a6eb0fe. I haven't really read that
> commit/test, so take what I say with a pinch of salt ...

Well yes. I am specifically reading 091a6eb0fe, the changes to t7407.

In that test sm_path contains the relative path from $PWD to the
submodule. (It does NOT: "$[sm_]path is the name of the submodule
directory relative to the superproject" as documented but rather
... relative to the $PWD)

>
>> Thanks for keeping discussing this.
>>
>> So maybe we want to
>> * keep path=sm_path
>
> As I said in commit 64394e3ae9, $path was part of the API, so I could
> not just rename it, without a deprecation period, etc ... So, I was
> 'crossing my fingers' that nobody would export $path in their user
> scripts (not very likely, after all).

Ok. So another approach to get away in the C conversion:
* export the sm_path as all other environment variables
* for "$path" we do not export it into the environment, but
  prefix the command with it, i.e. we'd ask our shell to run
  "path=%s; %s", sm_path, argv[0]
  to preserve the historic behavior.

>
>> * fix the documentation via s/$path/$sm_path/g in that section quoted above.
>
> So, "$path is the name of the submodule directory relative to the
> superproject", as currently documented in the man page, yes?

No, the documentation does not match reality. The reality is that
both sm_path as well as path give the display path.

> So, $sm_path == $path, at least for some period?

yes that is current reality.

>
>> * Introduce a new variable sm_display_path that contains what this patch
>>   proposes sm_path to be.
>
> So, this would be the path from the cwd to the submodule, yes?

yes.

>
> I don't know how windows folks will feel about simply
> removing $path, ...

I agree that this is a bad idea now. As said above we'd
just not export the path as an env variable and should be
"fine" in the sense that we do not break historical expectations,
but have to deal with slightly messy code.

Just digging further, there was ea2fa1040d (submodule foreach:
correct path display in recursive submodules, 2016-03-29), which
is tangent to the issue of whether we think it is a display path
or the superproject->submodule path.

Thanks,
Stefan

^ permalink raw reply	[relevance 24%]

* Re: [GSoC][PATCH v5 1/3] submodule: fix buggy $path and $sm_path variable's value
  2017-05-30 23:29                 ` Stefan Beller
@ 2017-05-31  0:13                   ` Ramsay Jones
  2017-05-31  0:48                     ` Ramsay Jones
  0 siblings, 1 reply; 200+ results
From: Ramsay Jones @ 2017-05-31  0:13 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Prathamesh Chavan, git, Brandon Williams, Christian Couder, Johannes Sixt



On 31/05/17 00:29, Stefan Beller wrote:
 
>> As I said above, I can't remember how git-ls-files worked back then,
>> but it seems that I thought of it as the path to the submodule from
>> the root of the working tree. Again, by definition, $sm_path == $path
>> (as documented). Of course, that may have changed since then.
> 
> Documented in 64394e3 (git-submodule.sh: Don't use $path variable in
> eval_gettext string, by yourself)
> 
> What I intended to say above was "documented to the end user",
> and I do not count our commit messages as such. The end user facing
> documentation only talks about path, not mentioning sm_path.

Correct, and that is exactly what I was saying. ie. $path as
'documented to the end user'. (again by _definition_ $sm_path
_is_ $path).

>>> After rereading that test, I would think so?
>>
>> Really? So, if it differs from $path, then something changed between
>> commit 64394e3ae9 and commit 091a6eb0fe. I haven't really read that
>> commit/test, so take what I say with a pinch of salt ...
> 
> Well yes. I am specifically reading 091a6eb0fe, the changes to t7407.
> 
> In that test sm_path contains the relative path from $PWD to the
> submodule. (It does NOT: "$[sm_]path is the name of the submodule
> directory relative to the superproject" as documented but rather
> ... relative to the $PWD)

In that case, the current user documentation does not agree with
the current implementation, yes?

So, was the user documentation always wrong? (did git-ls-files work
from a sub-directory, limiting its output to the cwd, or did it
chdir() to the top of the worktree first?).

>> As I said in commit 64394e3ae9, $path was part of the API, so I could
>> not just rename it, without a deprecation period, etc ... So, I was
>> 'crossing my fingers' that nobody would export $path in their user
>> scripts (not very likely, after all).
> 
> Ok. So another approach to get away in the C conversion:
> * export the sm_path as all other environment variables
> * for "$path" we do not export it into the environment, but
>   prefix the command with it, i.e. we'd ask our shell to run
>   "path=%s; %s", sm_path, argv[0]
>   to preserve the historic behavior.

Yes, that would probably work.

ATB,
Ramsay Jones


^ permalink raw reply	[relevance 16%]

* Re: [GSoC][PATCH v5 1/3] submodule: fix buggy $path and $sm_path variable's value
  2017-05-31  0:13                   ` Ramsay Jones
@ 2017-05-31  0:48                     ` Ramsay Jones
  2017-06-02 11:24                       ` [GSoC][PATCH v6 1/2] submodule: fix buggy $path and $sm_path variable's value Prathamesh Chavan
  0 siblings, 1 reply; 200+ results
From: Ramsay Jones @ 2017-05-31  0:48 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Prathamesh Chavan, git, Brandon Williams, Christian Couder, Johannes Sixt



On 31/05/17 01:13, Ramsay Jones wrote:

> On 31/05/17 00:29, Stefan Beller wrote:
>  
>> In that test sm_path contains the relative path from $PWD to the
>> submodule. (It does NOT: "$[sm_]path is the name of the submodule
>> directory relative to the superproject" as documented but rather
>> ... relative to the $PWD)
> 
> In that case, the current user documentation does not agree with
> the current implementation, yes?
> 
> So, was the user documentation always wrong? (did git-ls-files work
> from a sub-directory, limiting its output to the cwd, or did it
> chdir() to the top of the worktree first?).

To answer my own question, I rebuilt git and tried directly:

  $ pwd
  /home/ramsay/git
  $ 
  $ ./git version
  git version 1.7.10.1.g64394
  $ 
  $ cd git_remote_helpers/
  $ ../git-ls-files --error-unmatch --stage --
  100644 2247d5f95a7193c7221b9464debe167763b4fae3 0	.gitignore
  100644 74b05dc91e42414147d5f3dc7b4fc66fb86c0eca 0	Makefile
  100644 00f69cbeda277b24e8ab35cb7db2c25cc0cc122e 0	__init__.py
  100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	git/__init__.py
  100644 9ee5f96d4ce313f4f94505ff65b560943bfd21cb 0	git/exporter.py
  100644 007a1bfdf37d231470f69d9d0cffa46e80127f34 0	git/git.py
  100644 5c6b595e16665bc508625ab0e96c95776bacba1a 0	git/importer.py
  100644 e70025095dcfb31d3944e72ac1f83dd7d4109103 0	git/non_local.py
  100644 acbf8d7785e2253777456f8910e2352992dda474 0	git/repo.py
  100644 4bff8878d14ccaf02c552073ef55d519df0b4cad 0	setup.cfg
  100644 4d434b65cbf5c42a455d5cd3bced030bfb51a245 0	setup.py
  100644 fbbb01b14619c1d2ed6bcc8f304f019fbe98697f 0	util.py
  $ 
  
Hmm, so it looks like $path was always relative to cwd!

(got to get some sleep now ...)

ATB,
Ramsay Jones



^ permalink raw reply	[relevance 7%]

* Re: [PATCH 0/5] Some submodule bugfixes and "reattaching detached HEADs"
      [irrelevant]     ` <CAGZ79kb52QDUG0RtTXNEEpMJR1CSMYMrRHTRRvGn0-cF=HnzWw@mail.gmail.com>
@ 2017-05-31 22:09       ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-05-31 22:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Brandon Williams, Jonathan Nieder, git

On Mon, May 1, 2017 at 9:04 PM, Stefan Beller <sbeller@google.com> wrote:
>>
>>> I don't know why submodules were originally designed to be in a
>>> detached HEAD state but I much prefer working on branches (as I'm sure
>>> many other developers do) so the prospect of this becoming the norm is
>>> exciting! :D
>>
>
> I'll think about this more.

What the current model is missing is the possibility to have
a symbolic link not just to a ref within a repository, but to the outside
of a repository (such as the superproject in this case).

So we could have a HEAD with a content like:

    "super: <superprojects git dir> [LF <path inside superproject>]"

Then we would use the HEAD to determine if the superproject
would touch a submodule at all. Example workflow:

    git -C <sub> checkout --reattach-to-superproject

    # hack away in the submodule

    # This will make a commit in <sub> and add the
    # resulting object to the index of the superproject
    # because HEAD is tracking the superproject.
    # so in order to have HEAD containing the new
    # commit we have to change the superproject:
    git -C <sub> commit -a -m "message"

    # This has also interesting consequences for
    # submodule related commands:
    git checkout --recurse-submodules <tree-ish>
    # Any submodule whose HEAD is attached to the
    # superproject would be touched, the others would
    # not.

By being directly attached to the superproject, it would be
easy to find all submodules that are changed, via a

    git -C <super> status # no need to recurse, even!




















The whole "checkout --recurse-submodules" series is based on
assumptions of the current mental model of how branches and
detached HEADs work.


A submodule would have a symref

^ permalink raw reply	[relevance 33%]

* Re: [PATCH 00/31] repository object
      [irrelevant] <20170531214417.38857-1-bmwill@google.com>
@ 2017-05-31 22:56 ` Stefan Beller
  2017-05-31 23:01   ` Brandon Williams
      [irrelevant] ` <20170531214417.38857-7-bmwill@google.com>
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-05-31 22:56 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, Jonathan Nieder, Jacob Keller, Johannes Schindelin, brian m. carlson, Ben Peart, Duy Nguyen, Junio C Hamano, Jeff King, Jeff Hostetler, Ævar Arnfjörð Bjarmason, Jonathan Tan

On Wed, May 31, 2017 at 2:43 PM, Brandon Williams <bmwill@google.com> wrote:
> Given the vast interest expressed when I sent out my RFC series I decided it
> would be worth it to invest more time to making a repository object a reality.
>
> This series is an extension of the last series I sent out (in that ls-files is
> converted to working on submodules in-process using repository objects instead
> of spawning a child process to do the work).  The big difference from the RFC
> series is that I went through and did the work to migrate key repository state
> from global variables in 'environment.c' to being stored in a repository object
> itself.  I migrated the bits of state that seemed reasonable for this series,
> there is still a lot of global state which could be migrated in the future.
>
> I do think that we need to be slightly cautious about moving global state into
> the repository object though, I don't want 'struct repo' to simply become a
> kitchen sink where everything gets dumped.  But this is just a warning for the
> future.

Or in other words:
You want to have another struct e.g. 'the_command_line_arguments',
which would carry the verbosity/color options for example as they are
not related to a repo object, but to the current command being run?

> Since this is a v1 I'm fairly certain that it still has a lot of rough edges
> (like I think I need to write better commit messages, and we should probably
> have more comments documenting object fields/contract) but I want to get the
> review process started sooner rather than later since I'm sure people will have
> opinions (e.g. should it be called 'struct repo' or 'struct repository'?!).

IMHO this is the most obvious, but bikesheddable part of the series. ;)
Keep it short as everyone knows what a 'repo' is.

^ permalink raw reply	[relevance 9%]

* Re: [PATCH 00/31] repository object
  2017-05-31 22:56 ` Re: [PATCH 00/31] repository object Stefan Beller
@ 2017-05-31 23:01   ` Brandon Williams
  0 siblings, 0 replies; 200+ results
From: Brandon Williams @ 2017-05-31 23:01 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Jonathan Nieder, Jacob Keller, Johannes Schindelin, brian m. carlson, Ben Peart, Duy Nguyen, Junio C Hamano, Jeff King, Jeff Hostetler, Ævar Arnfjörð Bjarmason, Jonathan Tan

On 05/31, Stefan Beller wrote:
> On Wed, May 31, 2017 at 2:43 PM, Brandon Williams <bmwill@google.com> wrote:
> > Given the vast interest expressed when I sent out my RFC series I decided it
> > would be worth it to invest more time to making a repository object a reality.
> >
> > This series is an extension of the last series I sent out (in that ls-files is
> > converted to working on submodules in-process using repository objects instead
> > of spawning a child process to do the work).  The big difference from the RFC
> > series is that I went through and did the work to migrate key repository state
> > from global variables in 'environment.c' to being stored in a repository object
> > itself.  I migrated the bits of state that seemed reasonable for this series,
> > there is still a lot of global state which could be migrated in the future.
> >
> > I do think that we need to be slightly cautious about moving global state into
> > the repository object though, I don't want 'struct repo' to simply become a
> > kitchen sink where everything gets dumped.  But this is just a warning for the
> > future.
> 
> Or in other words:
> You want to have another struct e.g. 'the_command_line_arguments',
> which would carry the verbosity/color options for example as they are
> not related to a repo object, but to the current command being run?

Yes exactly.  Library code that needs to operate on a repository would
then be able to take arguments like:

  some_library_function(struct repo *repo, struct lib_opts *ops)

Much like how the grep machinery takes a grep_opts struct.

> 
> > Since this is a v1 I'm fairly certain that it still has a lot of rough edges
> > (like I think I need to write better commit messages, and we should probably
> > have more comments documenting object fields/contract) but I want to get the
> > review process started sooner rather than later since I'm sure people will have
> > opinions (e.g. should it be called 'struct repo' or 'struct repository'?!).
> 
> IMHO this is the most obvious, but bikesheddable part of the series. ;)

I know, that's why I mentioned it ;)

> Keep it short as everyone knows what a 'repo' is.

-- 
Brandon Williams

^ permalink raw reply	[relevance 8%]

* [PATCH] diff.c: color moved lines differently
      [irrelevant] <CAGZ79kbq3XiP8W_01FV133aMjZP9_GvpEg86N=XC2rTy24ZZGQ@mail.gmail.com>
@ 2017-06-01  0:24 ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-06-01  0:24 UTC (permalink / raw)
  To: sbeller; +Cc: bmwill, git, gitster, jonathantanmy, jrnieder, mhagger, peff, philipoakley

When a patch consists mostly of moving blocks of code around, it can
be quite tedious to ensure that the blocks are moved verbatim, and not
undesirably modified in the move. To that end, color blocks that are
moved within the same patch differently. For example (OM, del, add,
and NM are different colors):

    [OM]  -void sensitive_stuff(void)
    [OM]  -{
    [OM]  -        if (!is_authorized_user())
    [OM]  -                die("unauthorized");
    [OM]  -        sensitive_stuff(spanning,
    [OM]  -                        multiple,
    [OM]  -                        lines);
    [OM]  -}

           void another_function()
           {
    [del] -        printf("foo");
    [add] +        printf("bar");
           }

    [NM]  +void sensitive_stuff(void)
    [NM]  +{
    [NM]  +        if (!is_authorized_user())
    [NM]  +                die("unauthorized");
    [NM]  +        sensitive_stuff(spanning,
    [NM]  +                        multiple,
    [NM]  +                        lines);
    [NM]  +}

However adjacent blocks may be problematic. For example, in this
potentially malicious patch, the swapping of blocks can be spotted:

    [OM]  -void sensitive_stuff(void)
    [OM]  -{
    [OMA] -        if (!is_authorized_user())
    [OMA] -                die("unauthorized");
    [OM]  -        sensitive_stuff(spanning,
    [OM]  -                        multiple,
    [OM]  -                        lines);
    [OMA] -}

           void another_function()
           {
    [del] -        printf("foo");
    [add] +        printf("bar");
           }

    [NM]  +void sensitive_stuff(void)
    [NM]  +{
    [NMA] +        sensitive_stuff(spanning,
    [NMA] +                        multiple,
    [NMA] +                        lines);
    [NM]  +        if (!is_authorized_user())
    [NM]  +                die("unauthorized");
    [NMA] +}

If the moved code is larger, it is easier to hide some permutation in the
code, which is why some alternative coloring is needed.

As the reviewers attention should be brought to the places, where the
difference is introduced to the moved code, we cannot just have one new
color for all of moved code.

First I implemented an alternative design, which would try to fingerprint
a line by its neighbors to detect if we are in a block or at the boundary.
This idea iss error prone as it inspected each line and its neighboring
lines to determine if the line was (a) moved and (b) if was deep inside
a hunk by having matching neighboring lines. This is unreliable as the
we can construct hunks which have equal neighbors that just exceed the
number of lines inspected. (Think of 'AXYZBXYZCXYZD..' with each letter
as a line, that is permutated to AXYZCXYZBXYZD..').

Instead this provides a dynamic programming greedy algorithm that finds
the largest moved hunk and then has several modes on highlighting bounds.

A note on the options '--submodule=diff' and '--color-words/--word-diff':
In the conversion to use emit_line in the prior patches both submodules
as well as word diff output carefully chose to call emit_line with sign=0.
All output with sign=0 is ignored for move detection purposes in this
patch, such that no weird looking output will be generated for these
cases. This leads to another thought: We could pass on '--color-moved' to
submodules such that they color up moved lines for themselves. If we'd do
so only line moves within a repository boundary are marked up.

Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
---

 Replacing the top commit in origin/sb/diff-color-move, 
 this has the spelling fixes by Philip.
 
 Also a minor fix for the 'alternate' mode, to go back to the default
 after empty lines. Thanks to Jacob.
 
 Thanks,
 Stefan

 Documentation/config.txt       |  10 +-
 Documentation/diff-options.txt |  32 ++++
 color.h                        |   2 +
 diff.c                         | 343 +++++++++++++++++++++++++++++++++++--
 diff.h                         |  15 +-
 t/t4015-diff-whitespace.sh     | 373 +++++++++++++++++++++++++++++++++++++++++
 6 files changed, 761 insertions(+), 14 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 475e874d51..73511a4603 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1051,14 +1051,20 @@ This does not affect linkgit:git-format-patch[1] or the
 'git-diff-{asterisk}' plumbing commands.  Can be overridden on the
 command line with the `--color[=<when>]` option.
 
+diff.colorMoved::
+	If set moved lines in a diff are colored differently,
+	for details see '--color-moved' in linkgit:git-diff[1].
+
 color.diff.<slot>::
 	Use customized color for diff colorization.  `<slot>` specifies
 	which part of the patch to use the specified color, and is one
 	of `context` (context text - `plain` is a historical synonym),
 	`meta` (metainformation), `frag`
 	(hunk header), 'func' (function in hunk header), `old` (removed lines),
-	`new` (added lines), `commit` (commit headers), or `whitespace`
-	(highlighting whitespace errors).
+	`new` (added lines), `commit` (commit headers), `whitespace`
+	(highlighting whitespace errors), `oldMoved`, `newMoved`,
+	`oldMovedAlternative` and `newMovedAlternative` (See the '<mode>'
+	setting of '--color-moved' in linkgit:git-diff[1] for details).
 
 color.decorate.<slot>::
 	Use customized color for 'git log --decorate' output.  `<slot>` is one
diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 89cc0f48de..69bf061c5c 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -231,6 +231,38 @@ ifdef::git-diff[]
 endif::git-diff[]
 	It is the same as `--color=never`.
 
+--color-moved[=<mode>]::
+	Moved lines of code are colored differently.
+ifdef::git-diff[]
+	It can be changed by the `diff.colorMoved` configuration setting.
+endif::git-diff[]
+	The <mode> defaults to 'no' if the option is not given
+	and to 'adjacentbounds' if the option with no mode is given.
+	The mode must be one of:
++
+--
+no::
+	Moved lines are not highlighted.
+nobounds::
+	Any line that is added in one location and was removed
+	in another location will be colored with 'color.diff.newmoved'.
+	Similarly 'color.diff.oldmoved' will be used for removed lines
+	that are added somewhere else in the diff.
+allbounds::
+	Based on 'nobounds'. Additionally blocks of moved code are
+	detected and the first and last line of a block will be highlighted
+	using 'color.diff.newMovedAlternate' or
+	'color.diff.oldMovedAlternate'.
+adjacentbounds::
+	The same as 'allbounds' except that highlighting is only performed
+	at adjacent block boundaries of blocks that have the same sign.
+alternate::
+	Based on 'nobounds'. Additionally blocks of moved code are
+	detected. If moved blocks are adjacent mark one of them with the
+	alternative move color using 'color.diff.newMovedAlternate' or
+	'color.diff.oldMovedAlternate'.
+--
+
 --word-diff[=<mode>]::
 	Show a word diff, using the <mode> to delimit changed words.
 	By default, words are delimited by whitespace; see
diff --git a/color.h b/color.h
index 90627650fc..04b3b87929 100644
--- a/color.h
+++ b/color.h
@@ -42,6 +42,8 @@ struct strbuf;
 #define GIT_COLOR_BG_BLUE	"\033[44m"
 #define GIT_COLOR_BG_MAGENTA	"\033[45m"
 #define GIT_COLOR_BG_CYAN	"\033[46m"
+#define GIT_COLOR_DI_IT_CYAN	"\033[2;3;36m"
+#define GIT_COLOR_DI_IT_MAGENTA	"\033[2;3;35m"
 
 /* A special value meaning "no color selected" */
 #define GIT_COLOR_NIL "NIL"
diff --git a/diff.c b/diff.c
index a3c16ef827..a1f919ba57 100644
--- a/diff.c
+++ b/diff.c
@@ -31,6 +31,7 @@ static int diff_indent_heuristic; /* experimental */
 static int diff_rename_limit_default = 400;
 static int diff_suppress_blank_empty;
 static int diff_use_color_default = -1;
+static int diff_color_moved_default;
 static int diff_context_default = 3;
 static int diff_interhunk_context_default;
 static const char *diff_word_regex_cfg;
@@ -55,6 +56,10 @@ static char diff_colors[][COLOR_MAXLEN] = {
 	GIT_COLOR_YELLOW,	/* COMMIT */
 	GIT_COLOR_BG_RED,	/* WHITESPACE */
 	GIT_COLOR_NORMAL,	/* FUNCINFO */
+	GIT_COLOR_DI_IT_MAGENTA,/* OLD_MOVED */
+	GIT_COLOR_BG_RED,	/* OLD_MOVED ALTERNATIVE */
+	GIT_COLOR_DI_IT_CYAN,	/* NEW_MOVED */
+	GIT_COLOR_BG_GREEN,	/* NEW_MOVED ALTERNATIVE */
 };
 
 static NORETURN void die_want_option(const char *option_name)
@@ -80,6 +85,14 @@ static int parse_diff_color_slot(const char *var)
 		return DIFF_WHITESPACE;
 	if (!strcasecmp(var, "func"))
 		return DIFF_FUNCINFO;
+	if (!strcasecmp(var, "oldmoved"))
+		return DIFF_FILE_OLD_MOVED;
+	if (!strcasecmp(var, "oldmovedalternative"))
+		return DIFF_FILE_OLD_MOVED_ALT;
+	if (!strcasecmp(var, "newmoved"))
+		return DIFF_FILE_NEW_MOVED;
+	if (!strcasecmp(var, "newmovedalternative"))
+		return DIFF_FILE_NEW_MOVED_ALT;
 	return -1;
 }
 
@@ -228,12 +241,35 @@ int git_diff_heuristic_config(const char *var, const char *value, void *cb)
 	return 0;
 }
 
+static int parse_color_moved(const char *arg)
+{
+	if (!strcmp(arg, "no"))
+		return MOVED_LINES_NO;
+	else if (!strcmp(arg, "nobounds"))
+		return MOVED_LINES_BOUNDARY_NO;
+	else if (!strcmp(arg, "allbounds"))
+		return MOVED_LINES_BOUNDARY_ALL;
+	else if (!strcmp(arg, "adjacentbounds"))
+		return MOVED_LINES_BOUNDARY_ADJACENT;
+	else if (!strcmp(arg, "alternate"))
+		return MOVED_LINES_ALTERNATE;
+	else
+		return -1;
+}
+
 int git_diff_ui_config(const char *var, const char *value, void *cb)
 {
 	if (!strcmp(var, "diff.color") || !strcmp(var, "color.diff")) {
 		diff_use_color_default = git_config_colorbool(var, value);
 		return 0;
 	}
+	if (!strcmp(var, "diff.colormoved")) {
+		int cm = parse_color_moved(value);
+		if (cm < 0)
+			return -1;
+		diff_color_moved_default = cm;
+		return 0;
+	}
 	if (!strcmp(var, "diff.context")) {
 		diff_context_default = git_config_int(var, value);
 		if (diff_context_default < 0)
@@ -354,6 +390,88 @@ int git_diff_basic_config(const char *var, const char *value, void *cb)
 	return git_default_config(var, value, cb);
 }
 
+struct moved_entry {
+	struct hashmap_entry ent;
+	const struct diff_line *line;
+	struct moved_entry *next_line;
+};
+
+static void get_ws_cleaned_string(const struct diff_line *l,
+				  struct strbuf *out)
+{
+	int i;
+	for (i = 0; i < l->len; i++) {
+		if (isspace(l->line[i]))
+			continue;
+		strbuf_addch(out, l->line[i]);
+	}
+}
+
+static int diff_line_cmp_no_ws(const struct diff_line *a,
+					 const struct diff_line *b,
+					 const void *keydata)
+{
+	int ret;
+	struct strbuf sba = STRBUF_INIT;
+	struct strbuf sbb = STRBUF_INIT;
+
+	get_ws_cleaned_string(a, &sba);
+	get_ws_cleaned_string(b, &sbb);
+	ret = sba.len != sbb.len || strncmp(sba.buf, sbb.buf, sba.len);
+
+	strbuf_release(&sba);
+	strbuf_release(&sbb);
+	return ret;
+}
+
+static int diff_line_cmp(const struct diff_line *a,
+				   const struct diff_line *b,
+				   const void *keydata)
+{
+	return a->len != b->len || strncmp(a->line, b->line, a->len);
+}
+
+static int moved_entry_cmp(const struct moved_entry *a,
+			   const struct moved_entry *b,
+			   const void *keydata)
+{
+	return diff_line_cmp(a->line, b->line, keydata);
+}
+
+static int moved_entry_cmp_no_ws(const struct moved_entry *a,
+				 const struct moved_entry *b,
+				 const void *keydata)
+{
+	return diff_line_cmp_no_ws(a->line, b->line, keydata);
+}
+
+static unsigned get_line_hash(struct diff_line *line, unsigned ignore_ws)
+{
+	static struct strbuf sb = STRBUF_INIT;
+
+	if (ignore_ws) {
+		strbuf_reset(&sb);
+		get_ws_cleaned_string(line, &sb);
+		return memhash(sb.buf, sb.len);
+	} else {
+		return memhash(line->line, line->len);
+	}
+}
+
+static struct moved_entry *prepare_entry(struct diff_options *o,
+					 int line_no)
+{
+	struct moved_entry *ret = xmalloc(sizeof(*ret));
+	unsigned ignore_ws = DIFF_XDL_TST(o, IGNORE_WHITESPACE);
+	struct diff_line *l = &o->line_buffer[line_no];
+
+	ret->ent.hash = get_line_hash(l, ignore_ws);
+	ret->line = l;
+	ret->next_line = NULL;
+
+	return ret;
+}
+
 static char *quote_two(const char *one, const char *two)
 {
 	int need_one = quote_c_style(one, NULL, NULL, 1);
@@ -516,6 +634,180 @@ static void check_blank_at_eof(mmfile_t *mf1, mmfile_t *mf2,
 	ecbdata->blank_at_eof_in_postimage = (at - l2) + 1;
 }
 
+static void add_lines_to_move_detection(struct diff_options *o,
+					struct hashmap *add_lines,
+					struct hashmap *del_lines)
+{
+	struct moved_entry *prev_line = NULL;
+
+	int n;
+	for (n = 0; n < o->line_buffer_nr; n++) {
+		int sign = 0;
+		struct hashmap *hm;
+		struct moved_entry *key;
+
+		switch (o->line_buffer[n].sign) {
+		case '+':
+			sign = '+';
+			hm = add_lines;
+			break;
+		case '-':
+			sign = '-';
+			hm = del_lines;
+			break;
+		case ' ':
+		default:
+			prev_line = NULL;
+			continue;
+		}
+
+		key = prepare_entry(o, n);
+		if (prev_line &&
+		    prev_line->line->sign == sign)
+			prev_line->next_line = key;
+
+		hashmap_add(hm, key);
+		prev_line = key;
+	}
+}
+
+static void mark_color_as_moved_single_line(struct diff_options *o,
+					    struct diff_line *l, int alt_color)
+{
+	switch (l->sign) {
+	case '+':
+		l->set = diff_get_color_opt(o,
+			DIFF_FILE_NEW_MOVED + alt_color);
+		break;
+	case '-':
+		l->set = diff_get_color_opt(o,
+			DIFF_FILE_OLD_MOVED + alt_color);
+		break;
+	default:
+		die("BUG: we should have continued earlier?");
+	}
+}
+
+static void mark_color_as_moved(struct diff_options *o,
+				struct hashmap *add_lines,
+				struct hashmap *del_lines)
+{
+	struct moved_entry **pmb = NULL; /* potentially moved blocks */
+	struct diff_line *prev_line = NULL;
+	int pmb_nr = 0, pmb_alloc = 0;
+	int n, flipped_block = 0;
+
+	for (n = 0; n < o->line_buffer_nr; n++) {
+		struct hashmap *hm = NULL;
+		struct moved_entry *key;
+		struct moved_entry *match = NULL;
+		struct diff_line *l = &o->line_buffer[n];
+		int i, lp, rp, adjacent_blocks = 0;
+
+		/* Check for any match to color it as a move. */
+		switch (l->sign) {
+		case '+':
+			hm = del_lines;
+			key = prepare_entry(o, n);
+			match = hashmap_get(hm, key, o);
+			free(key);
+			break;
+		case '-':
+			hm = add_lines;
+			key = prepare_entry(o, n);
+			match = hashmap_get(hm, key, o);
+			free(key);
+			break;
+		default: ;
+			flipped_block = 0;
+		}
+
+		if (!match) {
+			pmb_nr = 0;
+			if (prev_line &&
+			    o->color_moved == MOVED_LINES_BOUNDARY_ALL)
+				mark_color_as_moved_single_line(o, prev_line, 1);
+			prev_line = NULL;
+			continue;
+		}
+
+		if (o->color_moved == MOVED_LINES_BOUNDARY_NO) {
+			mark_color_as_moved_single_line(o, l, 0);
+			continue;
+		}
+
+		/* Check any potential block runs, advance each or nullify */
+		for (i = 0; i < pmb_nr; i++) {
+			struct moved_entry *p = pmb[i];
+			struct moved_entry *pnext = (p && p->next_line) ?
+					p->next_line : NULL;
+			if (pnext &&
+			    !diff_line_cmp(pnext->line, l, o)) {
+				pmb[i] = p->next_line;
+			} else {
+				pmb[i] = NULL;
+			}
+		}
+
+		/* Shrink the set of potential block to the remaining running */
+		for (lp = 0, rp = pmb_nr - 1; lp <= rp;) {
+			while (lp < pmb_nr && pmb[lp])
+				lp++;
+			/* lp points at the first NULL now */
+
+			while (rp > -1 && !pmb[rp])
+				rp--;
+			/* rp points at the last non-NULL */
+
+			if (lp < pmb_nr && rp > -1 && lp < rp) {
+				pmb[lp] = pmb[rp];
+				pmb[rp] = NULL;
+				rp--;
+				lp++;
+			}
+		}
+
+		/* Remember the number of running sets */
+		pmb_nr = rp + 1;
+
+		if (pmb_nr == 0) {
+			/*
+			 * This line is the start of a new block.
+			 * Setup the set of potential blocks.
+			 */
+			for (; match; match = hashmap_get_next(hm, match)) {
+				ALLOC_GROW(pmb, pmb_nr + 1, pmb_alloc);
+				pmb[pmb_nr++] = match;
+			}
+
+			if (o->color_moved == MOVED_LINES_BOUNDARY_ALL) {
+				adjacent_blocks = 1;
+			} else {
+				/* Check if two blocks are adjacent */
+				adjacent_blocks = prev_line &&
+						  prev_line->sign == l->sign;
+			}
+		}
+
+		if (o->color_moved == MOVED_LINES_ALTERNATE) {
+			if (adjacent_blocks)
+				flipped_block = (flipped_block + 1) % 2;
+			mark_color_as_moved_single_line(o, l, flipped_block);
+		} else {
+			/* MOVED_LINES_BOUNDARY_{ADJACENT, ALL} */
+			mark_color_as_moved_single_line(o, l, adjacent_blocks);
+			if (adjacent_blocks && prev_line)
+				prev_line->set = l->set;
+		}
+
+		prev_line = l;
+	}
+	if (prev_line && o->color_moved == MOVED_LINES_BOUNDARY_ALL)
+		mark_color_as_moved_single_line(o, prev_line, 1);
+
+	free(pmb);
+}
+
 static void emit_diff_line(struct diff_options *o,
 			   struct diff_line *e)
 {
@@ -3518,6 +3810,8 @@ void diff_setup(struct diff_options *options)
 	options->line_buffer = NULL;
 	options->line_buffer_nr = 0;
 	options->line_buffer_alloc = 0;
+
+	options->color_moved = diff_color_moved_default;
 }
 
 void diff_setup_done(struct diff_options *options)
@@ -3627,6 +3921,9 @@ void diff_setup_done(struct diff_options *options)
 
 	if (DIFF_OPT_TST(options, FOLLOW_RENAMES) && options->pathspec.nr != 1)
 		die(_("--follow requires exactly one pathspec"));
+
+	if (!options->use_color || external_diff())
+		options->color_moved = 0;
 }
 
 static int opt_arg(const char *arg, int arg_short, const char *arg_long, int *val)
@@ -4051,7 +4348,19 @@ int diff_opt_parse(struct diff_options *options,
 	}
 	else if (!strcmp(arg, "--no-color"))
 		options->use_color = 0;
-	else if (!strcmp(arg, "--color-words")) {
+	else if (!strcmp(arg, "--color-moved"))
+		if (diff_color_moved_default)
+			options->color_moved = diff_color_moved_default;
+		else
+			options->color_moved = MOVED_LINES_BOUNDARY_ADJACENT;
+	else if (!strcmp(arg, "--no-color-moved"))
+		options->color_moved = MOVED_LINES_NO;
+	else if (skip_prefix(arg, "--color-moved=", &arg)) {
+		int cm = parse_color_moved(arg);
+		if (cm < 0)
+			die("bad --color-moved argument: %s", arg);
+		options->color_moved = cm;
+	} else if (!strcmp(arg, "--color-words")) {
 		options->use_color = 1;
 		options->word_diff = DIFF_WORDS_COLOR;
 	}
@@ -4856,16 +5165,9 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o)
 {
 	int i;
 	struct diff_queue_struct *q = &diff_queued_diff;
-	/*
-	 * For testing purposes we want to make sure the diff machinery
-	 * works completely with the buffer. If there is anything emitted
-	 * outside the emit_diff_line, then the order is screwed
-	 * up and the tests will fail.
-	 *
-	 * TODO (later in this series):
-	 * We'll unset this flag in a later patch.
-	 */
-	o->use_buffer = 1;
+
+	if (o->color_moved)
+		o->use_buffer = 1;
 
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
@@ -4874,6 +5176,24 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o)
 	}
 
 	if (o->use_buffer) {
+		if (o->color_moved) {
+			struct hashmap add_lines, del_lines;
+			unsigned ignore_ws = DIFF_XDL_TST(o, IGNORE_WHITESPACE);
+
+			hashmap_init(&del_lines, ignore_ws ?
+				(hashmap_cmp_fn)moved_entry_cmp_no_ws :
+				(hashmap_cmp_fn)moved_entry_cmp, 0);
+			hashmap_init(&add_lines, ignore_ws ?
+				(hashmap_cmp_fn)moved_entry_cmp_no_ws :
+				(hashmap_cmp_fn)moved_entry_cmp, 0);
+
+			add_lines_to_move_detection(o, &add_lines, &del_lines);
+			mark_color_as_moved(o, &add_lines, &del_lines);
+
+			hashmap_free(&add_lines, 0);
+			hashmap_free(&del_lines, 0);
+		}
+
 		for (i = 0; i < o->line_buffer_nr; i++)
 			emit_diff_line(o, &o->line_buffer[i]);
 
@@ -4962,6 +5282,7 @@ void diff_flush(struct diff_options *options)
 		if (!options->file)
 			die_errno("Could not open /dev/null");
 		options->close_file = 1;
+		options->color_moved = 0;
 		for (i = 0; i < q->nr; i++) {
 			struct diff_filepair *p = q->queue[i];
 			if (check_pair_status(p))
diff --git a/diff.h b/diff.h
index be51e8f867..d9fbafd383 100644
--- a/diff.h
+++ b/diff.h
@@ -7,6 +7,7 @@
 #include "tree-walk.h"
 #include "pathspec.h"
 #include "object.h"
+#include "hashmap.h"
 
 struct rev_info;
 struct diff_options;
@@ -228,6 +229,14 @@ struct diff_options {
 
 	struct diff_line *line_buffer;
 	int line_buffer_nr, line_buffer_alloc;
+
+	enum {
+		MOVED_LINES_NO = 0,
+		MOVED_LINES_BOUNDARY_NO = 1,
+		MOVED_LINES_BOUNDARY_ALL = 2,
+		MOVED_LINES_BOUNDARY_ADJACENT = 3,
+		MOVED_LINES_ALTERNATE = 4,
+	} color_moved;
 };
 
 /* Emit [line_prefix] [set] line [reset] */
@@ -243,7 +252,11 @@ enum color_diff {
 	DIFF_FILE_NEW = 5,
 	DIFF_COMMIT = 6,
 	DIFF_WHITESPACE = 7,
-	DIFF_FUNCINFO = 8
+	DIFF_FUNCINFO = 8,
+	DIFF_FILE_OLD_MOVED = 9,
+	DIFF_FILE_OLD_MOVED_ALT = 10,
+	DIFF_FILE_NEW_MOVED = 11,
+	DIFF_FILE_NEW_MOVED_ALT = 12
 };
 const char *diff_get_color(int diff_use_color, enum color_diff ix);
 #define diff_get_color_opt(o, ix) \
diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index 289806d0c7..e7b821be0a 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -972,4 +972,377 @@ test_expect_success 'option overrides diff.wsErrorHighlight' '
 
 '
 
+test_expect_success 'detect moved code, complete file' '
+	git reset --hard &&
+	cat <<-\EOF >test.c &&
+	#include<stdio.h>
+	main()
+	{
+	printf("Hello World");
+	}
+	EOF
+	git add test.c &&
+	git commit -m "add main function" &&
+	git mv test.c main.c &&
+	test_config color.diff.oldMoved "normal red" &&
+	test_config color.diff.newMoved "normal green" &&
+	git diff HEAD --color-moved --no-renames | test_decode_color >actual &&
+	cat >expected <<-\EOF &&
+	<BOLD>diff --git a/main.c b/main.c<RESET>
+	<BOLD>new file mode 100644<RESET>
+	<BOLD>index 0000000..a986c57<RESET>
+	<BOLD>--- /dev/null<RESET>
+	<BOLD>+++ b/main.c<RESET>
+	<CYAN>@@ -0,0 +1,5 @@<RESET>
+	<BGREEN>+<RESET><BGREEN>#include<stdio.h><RESET>
+	<BGREEN>+<RESET><BGREEN>main()<RESET>
+	<BGREEN>+<RESET><BGREEN>{<RESET>
+	<BGREEN>+<RESET><BGREEN>printf("Hello World");<RESET>
+	<BGREEN>+<RESET><BGREEN>}<RESET>
+	<BOLD>diff --git a/test.c b/test.c<RESET>
+	<BOLD>deleted file mode 100644<RESET>
+	<BOLD>index a986c57..0000000<RESET>
+	<BOLD>--- a/test.c<RESET>
+	<BOLD>+++ /dev/null<RESET>
+	<CYAN>@@ -1,5 +0,0 @@<RESET>
+	<BRED>-#include<stdio.h><RESET>
+	<BRED>-main()<RESET>
+	<BRED>-{<RESET>
+	<BRED>-printf("Hello World");<RESET>
+	<BRED>-}<RESET>
+	EOF
+
+	test_cmp expected actual
+'
+
+test_expect_success 'detect moved code, inside file' '
+	git reset --hard &&
+	cat <<-\EOF >main.c &&
+		#include<stdio.h>
+		int stuff()
+		{
+			printf("Hello ");
+			printf("World\n");
+		}
+
+		int secure_foo(struct user *u)
+		{
+			if (!u->is_allowed_foo)
+				return;
+			foo(u);
+		}
+
+		int main()
+		{
+			foo();
+		}
+	EOF
+	cat <<-\EOF >test.c &&
+		#include<stdio.h>
+		int bar()
+		{
+			printf("Hello World, but different\n");
+		}
+
+		int another_function()
+		{
+			bar();
+		}
+	EOF
+	git add main.c test.c &&
+	git commit -m "add main and test file" &&
+	cat <<-\EOF >main.c &&
+		#include<stdio.h>
+		int stuff()
+		{
+			printf("Hello ");
+			printf("World\n");
+		}
+
+		int main()
+		{
+			foo();
+		}
+	EOF
+	cat <<-\EOF >test.c &&
+		#include<stdio.h>
+		int bar()
+		{
+			printf("Hello World, but different\n");
+		}
+
+		int secure_foo(struct user *u)
+		{
+			if (!u->is_allowed_foo)
+				return;
+			foo(u);
+		}
+
+		int another_function()
+		{
+			bar();
+		}
+	EOF
+	test_config color.diff.oldMoved "normal red" &&
+	test_config color.diff.newMoved "normal green" &&
+	test_config color.diff.oldMovedAlternative "bold red" &&
+	test_config color.diff.newMovedAlternative "bold green" &&
+	git diff HEAD --no-renames --color-moved| test_decode_color >actual &&
+	cat <<-\EOF >expected &&
+	<BOLD>diff --git a/main.c b/main.c<RESET>
+	<BOLD>index 27a619c..7cf9336 100644<RESET>
+	<BOLD>--- a/main.c<RESET>
+	<BOLD>+++ b/main.c<RESET>
+	<CYAN>@@ -5,13 +5,6 @@<RESET> <RESET>printf("Hello ");<RESET>
+	 printf("World\n");<RESET>
+	 }<RESET>
+	 <RESET>
+	<BRED>-int secure_foo(struct user *u)<RESET>
+	<BRED>-{<RESET>
+	<BRED>-if (!u->is_allowed_foo)<RESET>
+	<BRED>-return;<RESET>
+	<BRED>-foo(u);<RESET>
+	<BRED>-}<RESET>
+	<BRED>-<RESET>
+	 int main()<RESET>
+	 {<RESET>
+	 foo();<RESET>
+	<BOLD>diff --git a/test.c b/test.c<RESET>
+	<BOLD>index 1dc1d85..e34eb69 100644<RESET>
+	<BOLD>--- a/test.c<RESET>
+	<BOLD>+++ b/test.c<RESET>
+	<CYAN>@@ -4,6 +4,13 @@<RESET> <RESET>int bar()<RESET>
+	 printf("Hello World, but different\n");<RESET>
+	 }<RESET>
+	 <RESET>
+	<BGREEN>+<RESET><BGREEN>int secure_foo(struct user *u)<RESET>
+	<BGREEN>+<RESET><BGREEN>{<RESET>
+	<BGREEN>+<RESET><BGREEN>if (!u->is_allowed_foo)<RESET>
+	<BGREEN>+<RESET><BGREEN>return;<RESET>
+	<BGREEN>+<RESET><BGREEN>foo(u);<RESET>
+	<BGREEN>+<RESET><BGREEN>}<RESET>
+	<BGREEN>+<RESET>
+	 int another_function()<RESET>
+	 {<RESET>
+	 bar();<RESET>
+	EOF
+
+	test_cmp expected actual
+'
+
+test_expect_success 'detect permutations inside moved code' '
+	git reset --hard &&
+	cat <<-\EOF >lines.txt &&
+		line 1
+		line 2
+		line 3
+		line 4
+		line 5
+		line 6
+		line 7
+		line 8
+		line 9
+		line 10
+		line 11
+		line 12
+		line 13
+		line 14
+		line 15
+		line 16
+	EOF
+	git add lines.txt &&
+	git commit -m "add poetry" &&
+	cat <<-\EOF >lines.txt &&
+		line 4
+		line 5
+		line 6
+		line 7
+		line 8
+		line 9
+		line 1
+		line 2
+		line 3
+		line 14
+		line 15
+		line 16
+		line 10
+		line 11
+		line 12
+		line 13
+	EOF
+	test_config color.diff.oldMoved "magenta" &&
+	test_config color.diff.newMoved "cyan" &&
+	test_config color.diff.oldMovedAlternative "blue" &&
+	test_config color.diff.newMovedAlternative "yellow" &&
+
+
+	git diff HEAD --no-renames --color-moved=nobounds| test_decode_color >actual &&
+	cat <<-\EOF >expected &&
+		<BOLD>diff --git a/lines.txt b/lines.txt<RESET>
+		<BOLD>index 47ea9c3..ba96a38 100644<RESET>
+		<BOLD>--- a/lines.txt<RESET>
+		<BOLD>+++ b/lines.txt<RESET>
+		<CYAN>@@ -1,16 +1,16 @@<RESET>
+		<MAGENTA>-line 1<RESET>
+		<MAGENTA>-line 2<RESET>
+		<MAGENTA>-line 3<RESET>
+		 line 4<RESET>
+		 line 5<RESET>
+		 line 6<RESET>
+		 line 7<RESET>
+		 line 8<RESET>
+		 line 9<RESET>
+		<CYAN>+<RESET><CYAN>line 1<RESET>
+		<CYAN>+<RESET><CYAN>line 2<RESET>
+		<CYAN>+<RESET><CYAN>line 3<RESET>
+		<CYAN>+<RESET><CYAN>line 14<RESET>
+		<CYAN>+<RESET><CYAN>line 15<RESET>
+		<CYAN>+<RESET><CYAN>line 16<RESET>
+		 line 10<RESET>
+		 line 11<RESET>
+		 line 12<RESET>
+		 line 13<RESET>
+		<MAGENTA>-line 14<RESET>
+		<MAGENTA>-line 15<RESET>
+		<MAGENTA>-line 16<RESET>
+	EOF
+	test_cmp expected actual &&
+
+	git diff HEAD --no-renames --color-moved=adjacentbounds| test_decode_color >actual &&
+	cat <<-\EOF >expected &&
+	<BOLD>diff --git a/lines.txt b/lines.txt<RESET>
+	<BOLD>index 47ea9c3..ba96a38 100644<RESET>
+	<BOLD>--- a/lines.txt<RESET>
+	<BOLD>+++ b/lines.txt<RESET>
+	<CYAN>@@ -1,16 +1,16 @@<RESET>
+	<MAGENTA>-line 1<RESET>
+	<MAGENTA>-line 2<RESET>
+	<MAGENTA>-line 3<RESET>
+	 line 4<RESET>
+	 line 5<RESET>
+	 line 6<RESET>
+	 line 7<RESET>
+	 line 8<RESET>
+	 line 9<RESET>
+	<CYAN>+<RESET><CYAN>line 1<RESET>
+	<CYAN>+<RESET><CYAN>line 2<RESET>
+	<YELLOW>+<RESET><YELLOW>line 3<RESET>
+	<YELLOW>+<RESET><YELLOW>line 14<RESET>
+	<CYAN>+<RESET><CYAN>line 15<RESET>
+	<CYAN>+<RESET><CYAN>line 16<RESET>
+	 line 10<RESET>
+	 line 11<RESET>
+	 line 12<RESET>
+	 line 13<RESET>
+	<MAGENTA>-line 14<RESET>
+	<MAGENTA>-line 15<RESET>
+	<MAGENTA>-line 16<RESET>
+	EOF
+	test_cmp expected actual &&
+
+	test_config diff.colorMoved alternate &&
+	git diff HEAD --no-renames --color-moved| test_decode_color >actual &&
+	cat <<-\EOF >expected &&
+	<BOLD>diff --git a/lines.txt b/lines.txt<RESET>
+	<BOLD>index 47ea9c3..ba96a38 100644<RESET>
+	<BOLD>--- a/lines.txt<RESET>
+	<BOLD>+++ b/lines.txt<RESET>
+	<CYAN>@@ -1,16 +1,16 @@<RESET>
+	<MAGENTA>-line 1<RESET>
+	<MAGENTA>-line 2<RESET>
+	<MAGENTA>-line 3<RESET>
+	 line 4<RESET>
+	 line 5<RESET>
+	 line 6<RESET>
+	 line 7<RESET>
+	 line 8<RESET>
+	 line 9<RESET>
+	<CYAN>+<RESET><CYAN>line 1<RESET>
+	<CYAN>+<RESET><CYAN>line 2<RESET>
+	<CYAN>+<RESET><CYAN>line 3<RESET>
+	<YELLOW>+<RESET><YELLOW>line 14<RESET>
+	<YELLOW>+<RESET><YELLOW>line 15<RESET>
+	<YELLOW>+<RESET><YELLOW>line 16<RESET>
+	 line 10<RESET>
+	 line 11<RESET>
+	 line 12<RESET>
+	 line 13<RESET>
+	<MAGENTA>-line 14<RESET>
+	<MAGENTA>-line 15<RESET>
+	<MAGENTA>-line 16<RESET>
+	EOF
+	test_cmp expected actual &&
+
+	test_config diff.colorMoved allbounds &&
+	git diff HEAD --no-renames --color-moved| test_decode_color >actual &&
+	cat <<-\EOF >expected &&
+	<BOLD>diff --git a/lines.txt b/lines.txt<RESET>
+	<BOLD>index 47ea9c3..ba96a38 100644<RESET>
+	<BOLD>--- a/lines.txt<RESET>
+	<BOLD>+++ b/lines.txt<RESET>
+	<CYAN>@@ -1,16 +1,16 @@<RESET>
+	<BLUE>-line 1<RESET>
+	<MAGENTA>-line 2<RESET>
+	<BLUE>-line 3<RESET>
+	 line 4<RESET>
+	 line 5<RESET>
+	 line 6<RESET>
+	 line 7<RESET>
+	 line 8<RESET>
+	 line 9<RESET>
+	<YELLOW>+<RESET><YELLOW>line 1<RESET>
+	<CYAN>+<RESET><CYAN>line 2<RESET>
+	<YELLOW>+<RESET><YELLOW>line 3<RESET>
+	<YELLOW>+<RESET><YELLOW>line 14<RESET>
+	<CYAN>+<RESET><CYAN>line 15<RESET>
+	<YELLOW>+<RESET><YELLOW>line 16<RESET>
+	 line 10<RESET>
+	 line 11<RESET>
+	 line 12<RESET>
+	 line 13<RESET>
+	<BLUE>-line 14<RESET>
+	<MAGENTA>-line 15<RESET>
+	<BLUE>-line 16<RESET>
+	EOF
+	test_cmp expected actual
+'
+
+test_expect_success 'move detection does not mess up colored words' '
+	cat <<-\EOF >text.txt &&
+	Lorem Ipsum is simply dummy text of the printing and typesetting industry.
+	EOF
+	git add text.txt &&
+	git commit -a -m "clean state" &&
+	cat <<-\EOF >text.txt &&
+	simply Lorem Ipsum dummy is text of the typesetting and printing industry.
+	EOF
+	git diff --color-moved --word-diff >actual &&
+	git diff --word-diff >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'move detection with submodules' '
+	test_create_repo bananas &&
+	echo ripe >bananas/recipe &&
+	git -C bananas add recipe &&
+	test_commit fruit &&
+	test_commit -C bananas recipe &&
+	git submodule add ./bananas &&
+	git add bananas &&
+	git commit -a -m "bananas are like a heavy library?" &&
+	echo foul >bananas/recipe &&
+	echo ripe >fruit.t &&
+
+	git diff --submodule=diff --color-moved >actual &&
+
+	# no move detection as the moved line is across repository boundaries.
+	test_decode_color <actual >decoded_actual &&
+	! grep BGREEN decoded_actual &&
+	! grep BRED decoded_actual &&
+
+	# nor did we mess with it another way
+	git diff --submodule=diff | test_decode_color >expect &&
+	test_cmp expect decoded_actual
+'
+
 test_done
-- 
2.13.0.17.gab62347cd9


^ permalink raw reply	[relevance 7%]

* [PATCHv3 0/4] A reroll of sb/submodule-blanket-recursive
  2017-05-30  5:30 ` Junio C Hamano
@ 2017-06-01  0:30   ` Stefan Beller
  2017-06-01  0:30     ` [PATCHv3 1/4] Introduce 'submodule.recurse' option for worktree manipulators Stefan Beller
                       ` (3 more replies)
  0 siblings, 4 replies; 200+ results
From: Stefan Beller @ 2017-06-01  0:30 UTC (permalink / raw)
  To: gitster; +Cc: bmwill, git, sbeller

v3:
* rerolling only the top-4 patches of sb/submodule-blanket-recursive.
  (base: 1d789d089280539ca39b83aabb67860929d39b75)
* fixes function declarations that should be static, thanks Ramsay!

v2:
* A reroll of sb/submodule-blanket-recursive.
* This requires ab/grep-preparatory-cleanup
* It changed a lot from v1, as in v1 the tests did not work,
  hence the code was broken. Now it actually works.
* it also includes grep, fetch, push in addition to plain working tree
  manipulators.

Thanks,
Stefan

Stefan Beller (4):
  Introduce 'submodule.recurse' option for worktree manipulators
  builtin/grep.c: respect 'submodule.recurse' option
  builtin/push.c: respect 'submodule.recurse' option
  builtin/fetch.c: respect 'submodule.recurse' option

 Documentation/config.txt           |  5 +++++
 builtin/checkout.c                 |  2 +-
 builtin/fetch.c                    |  7 +++++++
 builtin/grep.c                     |  3 +++
 builtin/push.c                     |  4 ++++
 builtin/read-tree.c                | 10 +++++++++-
 builtin/reset.c                    | 10 +++++++++-
 submodule.c                        | 23 +++++++++++++++++++++--
 submodule.h                        |  1 +
 t/lib-submodule-update.sh          | 12 ++++++++++++
 t/t5526-fetch-submodules.sh        | 10 ++++++++++
 t/t5531-deep-submodule-push.sh     | 21 +++++++++++++++++++++
 t/t7814-grep-recurse-submodules.sh | 18 ++++++++++++++++++
 13 files changed, 121 insertions(+), 5 deletions(-)

-- 
2.13.0.17.gab62347cd9


^ permalink raw reply	[relevance 35%]

* [PATCHv3 1/4] Introduce 'submodule.recurse' option for worktree manipulators
  2017-06-01  0:30   ` [PATCHv3 0/4] A reroll of sb/submodule-blanket-recursive Stefan Beller
@ 2017-06-01  0:30     ` Stefan Beller
  2017-06-01  0:30     ` [PATCHv3 2/4] builtin/grep.c: respect 'submodule.recurse' option Stefan Beller
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-06-01  0:30 UTC (permalink / raw)
  To: gitster; +Cc: bmwill, git, sbeller

Any command that understands '--recurse-submodules' can have its
default changed to true, by setting the new 'submodule.recurse'
option.

This patch includes read-tree/checkout/reset for working tree
manipulating commands. Later patches will cover other commands.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config.txt  |  5 +++++
 builtin/checkout.c        |  2 +-
 builtin/read-tree.c       | 10 +++++++++-
 builtin/reset.c           | 10 +++++++++-
 submodule.c               | 23 +++++++++++++++++++++--
 submodule.h               |  1 +
 t/lib-submodule-update.sh | 12 ++++++++++++
 7 files changed, 58 insertions(+), 5 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index e0b9fd0bc3..f60c504e86 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -3065,6 +3065,11 @@ submodule.active::
 	submodule's path to determine if the submodule is of interest to git
 	commands.
 
+submodule.recurse::
+	Specifies if commands recurse into submodules by default. This
+	applies to all commands that have a `--recurse-submodules` option.
+	Defaults to false.
+
 submodule.fetchJobs::
 	Specifies how many submodules are fetched/cloned at the same time.
 	A positive integer allows up to that number of submodules fetched
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 56ea723b75..e289b7d477 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -855,7 +855,7 @@ static int git_checkout_config(const char *var, const char *value, void *cb)
 	}
 
 	if (starts_with(var, "submodule."))
-		return parse_submodule_config_option(var, value);
+		return submodule_config(var, value, NULL);
 
 	return git_xmerge_config(var, value, NULL);
 }
diff --git a/builtin/read-tree.c b/builtin/read-tree.c
index 8a889ef4c3..7fd55140db 100644
--- a/builtin/read-tree.c
+++ b/builtin/read-tree.c
@@ -98,6 +98,14 @@ static int debug_merge(const struct cache_entry * const *stages,
 	return 0;
 }
 
+static int git_read_tree_config(const char *var, const char *value, void *cb)
+{
+	if (!strcmp(var, "submodule.recurse"))
+		return git_default_submodule_config(var, value, cb);
+
+	return git_default_config(var, value, cb);
+}
+
 static struct lock_file lock_file;
 
 int cmd_read_tree(int argc, const char **argv, const char *unused_prefix)
@@ -150,7 +158,7 @@ int cmd_read_tree(int argc, const char **argv, const char *unused_prefix)
 	opts.src_index = &the_index;
 	opts.dst_index = &the_index;
 
-	git_config(git_default_config, NULL);
+	git_config(git_read_tree_config, NULL);
 
 	argc = parse_options(argc, argv, unused_prefix, read_tree_options,
 			     read_tree_usage, 0);
diff --git a/builtin/reset.c b/builtin/reset.c
index 6f89dc5494..585cfe0745 100644
--- a/builtin/reset.c
+++ b/builtin/reset.c
@@ -266,6 +266,14 @@ static int reset_refs(const char *rev, const struct object_id *oid)
 	return update_ref_status;
 }
 
+static int git_reset_config(const char *var, const char *value, void *cb)
+{
+	if (!strcmp(var, "submodule.recurse"))
+		return git_default_submodule_config(var, value, cb);
+
+	return git_default_config(var, value, cb);
+}
+
 int cmd_reset(int argc, const char **argv, const char *prefix)
 {
 	int reset_type = NONE, update_ref_status = 0, quiet = 0;
@@ -294,7 +302,7 @@ int cmd_reset(int argc, const char **argv, const char *prefix)
 		OPT_END()
 	};
 
-	git_config(git_default_config, NULL);
+	git_config(git_reset_config, NULL);
 
 	argc = parse_options(argc, argv, prefix, options, git_reset_usage,
 						PARSE_OPT_KEEP_DASHDASH);
diff --git a/submodule.c b/submodule.c
index 78cccb7563..2b157dc995 100644
--- a/submodule.c
+++ b/submodule.c
@@ -16,6 +16,7 @@
 #include "quote.h"
 #include "remote.h"
 #include "worktree.h"
+#include "parse-options.h"
 
 static int config_fetch_recurse_submodules = RECURSE_SUBMODULES_ON_DEMAND;
 static int config_update_recurse_submodules = RECURSE_SUBMODULES_OFF;
@@ -170,10 +171,28 @@ static int git_modules_config(const char *var, const char *value, void *cb)
 	return 0;
 }
 
-/* Loads all submodule settings from the config */
+/* Loads all submodule settings from the config. */
 int submodule_config(const char *var, const char *value, void *cb)
 {
-	return git_modules_config(var, value, cb);
+	if (!strcmp(var, "submodule.recurse")) {
+		int v = git_config_bool(var, value) ?
+			RECURSE_SUBMODULES_ON : RECURSE_SUBMODULES_OFF;
+		config_update_recurse_submodules = v;
+		return 0;
+	} else {
+		return git_modules_config(var, value, cb);
+	}
+}
+
+/* Cheap function that only determines if we're interested in submodules at all */
+int git_default_submodule_config(const char *var, const char *value, void *cb)
+{
+	if (!strcmp(var, "submodule.recurse")) {
+		int v = git_config_bool(var, value) ?
+			RECURSE_SUBMODULES_ON : RECURSE_SUBMODULES_OFF;
+		config_update_recurse_submodules = v;
+	}
+	return 0;
 }
 
 int option_parse_recurse_submodules_worktree_updater(const struct option *opt,
diff --git a/submodule.h b/submodule.h
index b13f120f76..d920ca1d5a 100644
--- a/submodule.h
+++ b/submodule.h
@@ -39,6 +39,7 @@ extern void stage_updated_gitmodules(void);
 extern void set_diffopt_flags_from_submodule_config(struct diff_options *,
 		const char *path);
 extern int submodule_config(const char *var, const char *value, void *cb);
+extern int git_default_submodule_config(const char *var, const char *value, void *cb);
 
 struct option;
 int option_parse_recurse_submodules_worktree_updater(const struct option *opt,
diff --git a/t/lib-submodule-update.sh b/t/lib-submodule-update.sh
index 0272c4d8ca..52beadad96 100755
--- a/t/lib-submodule-update.sh
+++ b/t/lib-submodule-update.sh
@@ -990,6 +990,18 @@ test_submodule_switch_recursing_with_args () {
 		)
 	'
 
+	test_expect_success "git -c submodule.recurse=true $cmd_args: modified submodule updates submodule work tree" '
+		prolog &&
+		reset_work_tree_to_interested add_sub1 &&
+		(
+			cd submodule_update &&
+			git branch -t modify_sub1 origin/modify_sub1 &&
+			git -c submodule.recurse=true $cmd_args modify_sub1 &&
+			test_superproject_content origin/modify_sub1 &&
+			test_submodule_content sub1 origin/modify_sub1
+		)
+	'
+
 	# Updating a submodule to an invalid sha1 doesn't update the
 	# superproject nor the submodule's work tree.
 	test_expect_success "$command: updating to a missing submodule commit fails" '
-- 
2.13.0.17.gab62347cd9


^ permalink raw reply	[relevance 24%]

* [PATCHv3 2/4] builtin/grep.c: respect 'submodule.recurse' option
  2017-06-01  0:30   ` [PATCHv3 0/4] A reroll of sb/submodule-blanket-recursive Stefan Beller
  2017-06-01  0:30     ` [PATCHv3 1/4] Introduce 'submodule.recurse' option for worktree manipulators Stefan Beller
@ 2017-06-01  0:30     ` Stefan Beller
  2017-06-01  0:30     ` [PATCHv3 3/4] builtin/push.c: respect 'submodule.recurse' option Stefan Beller
  2017-06-01  0:30     ` [PATCHv3 4/4] builtin/fetch.c: respect 'submodule.recurse' option Stefan Beller
  3 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-06-01  0:30 UTC (permalink / raw)
  To: gitster; +Cc: bmwill, git, sbeller

In builtin/grep.c we parse the config before evaluating the command line
options. This makes the task of teaching grep to respect the new config
option 'submodule.recurse' very easy by just parsing that option.

As an alternative I had implemented a similar structure to treat
submodules as the fetch/push command have, including
* aligning the meaning of the 'recurse_submodules' to possible submodule
  values RECURSE_SUBMODULES_* as defined in submodule.h.
* having a callback to parse the value and
* reacting to the RECURSE_SUBMODULES_DEFAULT state that was the initial
  state.

However all this is not needed for a true boolean value, so let's keep
it simple. However this adds another place where "submodule.recurse" is
parsed.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/grep.c                     |  3 +++
 t/t7814-grep-recurse-submodules.sh | 18 ++++++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/builtin/grep.c b/builtin/grep.c
index b1095362fb..454e263820 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -302,6 +302,9 @@ static int grep_cmd_config(const char *var, const char *value, void *cb)
 #endif
 	}
 
+	if (!strcmp(var, "submodule.recurse"))
+		recurse_submodules = git_config_bool(var, value);
+
 	return st;
 }
 
diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh
index 3a58197f47..7184113b9b 100755
--- a/t/t7814-grep-recurse-submodules.sh
+++ b/t/t7814-grep-recurse-submodules.sh
@@ -33,6 +33,24 @@ test_expect_success 'grep correctly finds patterns in a submodule' '
 	test_cmp expect actual
 '
 
+test_expect_success 'grep finds patterns in a submodule via config' '
+	test_config submodule.recurse true &&
+	# expect from previous test
+	git grep -e "(3|4)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'grep --no-recurse-submodules overrides config' '
+	test_config submodule.recurse true &&
+	cat >expect <<-\EOF &&
+	a:(1|2)d(3|4)
+	b/b:(3|4)
+	EOF
+
+	git grep -e "(3|4)" --no-recurse-submodules >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'grep and basic pathspecs' '
 	cat >expect <<-\EOF &&
 	submodule/a:(1|2)d(3|4)
-- 
2.13.0.17.gab62347cd9


^ permalink raw reply	[relevance 31%]

* [PATCHv3 3/4] builtin/push.c: respect 'submodule.recurse' option
  2017-06-01  0:30   ` [PATCHv3 0/4] A reroll of sb/submodule-blanket-recursive Stefan Beller
  2017-06-01  0:30     ` [PATCHv3 1/4] Introduce 'submodule.recurse' option for worktree manipulators Stefan Beller
  2017-06-01  0:30     ` [PATCHv3 2/4] builtin/grep.c: respect 'submodule.recurse' option Stefan Beller
@ 2017-06-01  0:30     ` Stefan Beller
  2017-06-01  0:30     ` [PATCHv3 4/4] builtin/fetch.c: respect 'submodule.recurse' option Stefan Beller
  3 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-06-01  0:30 UTC (permalink / raw)
  To: gitster; +Cc: bmwill, git, sbeller

The closest mapping from the boolean 'submodule.recurse' set to "yes"
to the variety of submodule push modes is "on-demand", so implement that.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/push.c                 |  4 ++++
 t/t5531-deep-submodule-push.sh | 21 +++++++++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/builtin/push.c b/builtin/push.c
index a597759d8f..258648d5fd 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -498,6 +498,10 @@ static int git_push_config(const char *k, const char *v, void *cb)
 		const char *value;
 		if (!git_config_get_value("push.recursesubmodules", &value))
 			recurse_submodules = parse_push_recurse_submodules_arg(k, value);
+	} else if (!strcmp(k, "submodule.recurse")) {
+		int val = git_config_bool(k, v) ?
+			RECURSE_SUBMODULES_ON_DEMAND : RECURSE_SUBMODULES_OFF;
+		recurse_submodules = val;
 	}
 
 	return git_default_config(k, v, NULL);
diff --git a/t/t5531-deep-submodule-push.sh b/t/t5531-deep-submodule-push.sh
index 57ba322628..712c595fd8 100755
--- a/t/t5531-deep-submodule-push.sh
+++ b/t/t5531-deep-submodule-push.sh
@@ -126,6 +126,27 @@ test_expect_success 'push succeeds if submodule commit not on remote but using o
 	)
 '
 
+test_expect_success 'push succeeds if submodule commit not on remote but using auto-on-demand via submodule.recurse config' '
+	(
+		cd work/gar/bage &&
+		>recurse-on-demand-from-submodule-recurse-config &&
+		git add recurse-on-demand-from-submodule-recurse-config &&
+		git commit -m "Recurse submodule.recurse from config junk"
+	) &&
+	(
+		cd work &&
+		git add gar/bage &&
+		git commit -m "Recurse submodule.recurse from config for gar/bage" &&
+		git -c submodule.recurse push ../pub.git master &&
+		# Check that the supermodule commit got there
+		git fetch ../pub.git &&
+		git diff --quiet FETCH_HEAD master &&
+		# Check that the submodule commit got there too
+		cd gar/bage &&
+		git diff --quiet origin/master master
+	)
+'
+
 test_expect_success 'push recurse-submodules on command line overrides config' '
 	(
 		cd work/gar/bage &&
-- 
2.13.0.17.gab62347cd9


^ permalink raw reply	[relevance 30%]

* [PATCHv3 4/4] builtin/fetch.c: respect 'submodule.recurse' option
  2017-06-01  0:30   ` [PATCHv3 0/4] A reroll of sb/submodule-blanket-recursive Stefan Beller
                       ` (2 preceding siblings ...)
  2017-06-01  0:30     ` [PATCHv3 3/4] builtin/push.c: respect 'submodule.recurse' option Stefan Beller
@ 2017-06-01  0:30     ` Stefan Beller
  3 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-06-01  0:30 UTC (permalink / raw)
  To: gitster; +Cc: bmwill, git, sbeller

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/fetch.c             |  7 +++++++
 t/t5526-fetch-submodules.sh | 10 ++++++++++
 2 files changed, 17 insertions(+)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 5f2c2ab23e..c1ec3b03c3 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -73,6 +73,13 @@ static int git_fetch_config(const char *k, const char *v, void *cb)
 		fetch_prune_config = git_config_bool(k, v);
 		return 0;
 	}
+
+	if (!strcmp(k, "submodule.recurse")) {
+		int r = git_config_bool(k, v) ?
+			RECURSE_SUBMODULES_ON : RECURSE_SUBMODULES_OFF;
+		recurse_submodules = r;
+	}
+
 	return git_default_config(k, v, cb);
 }
 
diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index f3b0a8d30a..162baf101f 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -71,6 +71,16 @@ test_expect_success "fetch --recurse-submodules recurses into submodules" '
 	test_i18ncmp expect.err actual.err
 '
 
+test_expect_success "submodule.recurse option triggers recursive fetch" '
+	add_upstream_commit &&
+	(
+		cd downstream &&
+		git -c submodule.recurse fetch >../actual.out 2>../actual.err
+	) &&
+	test_must_be_empty actual.out &&
+	test_i18ncmp expect.err actual.err
+'
+
 test_expect_success "fetch --recurse-submodules -j2 has the same output behaviour" '
 	add_upstream_commit &&
 	(
-- 
2.13.0.17.gab62347cd9


^ permalink raw reply	[relevance 33%]

* Re: RFC: Would a config fetch.retryCount make sense?
      [irrelevant] <6762A30E-C558-4085-943B-AB85EBF18706@gmail.com>
@ 2017-06-01 17:59 ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-06-01 17:59 UTC (permalink / raw)
  To: Lars Schneider; +Cc: Git Mailing List, Junio C Hamano

On Thu, Jun 1, 2017 at 5:48 AM, Lars Schneider <larsxschneider@gmail.com> wrote:
> Hi,
>
> we occasionally see "The remote end hung up unexpectedly" (pkt-line.c:265)
> on our `git fetch` calls (most noticeably in our automations). I expect
> random network glitches to be the cause.

There is 665b35eccd (submodule--helper: initial clone learns retry
logic, 2016-06-09)
but that is for submodules and only the initial clone.

I tried searching the mailing list archive if it was discussed for
fetch before (I am sure it was), but could not find a good hint to link at.

IIRC one major concern was:
* When a human operates git-fetch, then they want to have fast feedback.
  The failure may be non-transient, for example when I forgot to up the
  wifi connection. Then the human can inspect and fix the root cause.
  (Assumption in human workflow: these non transient errors happen more
  often than the occasional fetch error due to network glitches.)

For automation I would expect that the retry logic is actually beneficial,
such that you would want to have a command line options such as
"git fetch --retries=5 --delay-between-retries=10s".

>
> In some places we added a basic retry mechanism and I was wondering
> if this could be a useful feature for Git itself.

There are already retries in other places. :) Cf. f4ab4f3ab1
(lock_packed_refs():
allow retries when acquiring the packed-refs lock, 2015-05-11), which
solves the need
of github on the serverside, when they have a very active repo that
multiple people
push to at the same time. (to different branches. I believe that forks
are internally
handled as the same repo, just with different namespaces. So if there
are 1000 forks
of linux.git you see a lot of pushes to the "same" repo)

>
> E.g. a Git config such as "fetch.retryCount" or something.
> Or is there something like this in Git already and I missed it?

I like it.

Thanks,
Stefan

^ permalink raw reply	[relevance 16%]

* Re: [PATCH 06/31] repo: introduce the repository object
      [irrelevant] ` <20170531214417.38857-7-bmwill@google.com>
@ 2017-06-01 19:53   ` Stefan Beller
  2017-06-05 17:53     ` Brandon Williams
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-06-01 19:53 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, Jonathan Nieder, Jacob Keller, Johannes Schindelin, brian m. carlson, Ben Peart, Duy Nguyen, Junio C Hamano, Jeff King, Jeff Hostetler, Ævar Arnfjörð Bjarmason, Jonathan Tan

On Wed, May 31, 2017 at 2:43 PM, Brandon Williams <bmwill@google.com> wrote:
> Introduce the repository object 'struct repo' which can be used hold all
> state pertaining to a git repository.
>
> The aim of object-ifying the repository is to (1) make the code base
> more readable and easier to reason about and (2) allow for working on
> multiple repositories, specifically submodules, within the same process.
>
> TODO: Add more motivating points for adding a repository object?

Yes please (or delete this line).
https://public-inbox.org/git/alpine.DEB.2.21.1.1705221501540.3610@virtualbox/

> +++ b/repo.c
> @@ -0,0 +1,124 @@
> +#include "cache.h"
> +#include "repo.h"
> +
> +/*
> + * This may be the wrong place for this.
> + * It may be better to go in env.c or setup for the time being?

In env.c we say:
/*
 * We put all the git config variables in this same object
 * file, so that programs can link against the config parser
 * without having to link against all the rest of git.
 *
 * In particular, no need to bring in libz etc unless needed,
 * even if you might want to know where the git directory etc
 * are.
 */

And setup.c only has a few variables that matter there locally.
So I would think having 'the_repository' in repo.c is acceptable.

> + */
> +struct repo the_repository;
> +
> +static char *git_path_from_env(const char *envvar, const char *git_dir,
> +                              const char *path, int fromenv)
> +{
> +       if (fromenv) {
> +               const char *value = getenv(envvar);
> +               if (value)
> +                       return xstrdup(value);
> +       }
> +
> +       return xstrfmt("%s/%s", git_dir, path);
> +}
> +
> +static int find_common_dir(struct strbuf *sb, const char *gitdir, int fromenv)
> +{
> +       if (fromenv) {
> +               const char *value = getenv(GIT_COMMON_DIR_ENVIRONMENT);
> +               if (value) {
> +                       strbuf_addstr(sb, value);
> +                       return 1;
> +               }
> +       }
> +
> +       return get_common_dir_noenv(sb, gitdir);
> +}
> +
> +/* called after setting gitdir */
> +static void repo_setup_env(struct repo *repo)
> +{
> +       struct strbuf sb = STRBUF_INIT;
> +
> +       if (!repo->gitdir)
> +               BUG("gitdir wasn't set before setting up the environment");
> +
> +       repo->different_commondir = find_common_dir(&sb, repo->gitdir,
> +                                                   !repo->ignore_env);
> +       repo->commondir = strbuf_detach(&sb, NULL);
> +       repo->objectdir = git_path_from_env(DB_ENVIRONMENT, repo->commondir,
> +                                           "objects", !repo->ignore_env);
> +       repo->index_file = git_path_from_env(INDEX_ENVIRONMENT, repo->gitdir,
> +                                            "index", !repo->ignore_env);
> +       repo->graft_file = git_path_from_env(GRAFT_ENVIRONMENT, repo->commondir,
> +                                            "info/grafts", !repo->ignore_env);
> +       repo->namespace = expand_namespace(repo->ignore_env ? NULL :
> +                                          getenv(GIT_NAMESPACE_ENVIRONMENT));
> +}
> +
> +static void repo_clear_env(struct repo *repo)
> +{
> +       free(repo->gitdir);
> +       repo->gitdir = NULL;
> +       free(repo->commondir);
> +       repo->commondir = NULL;
> +       free(repo->objectdir);
> +       repo->objectdir = NULL;
> +       free(repo->index_file);
> +       repo->index_file = NULL;
> +       free(repo->graft_file);
> +       repo->graft_file = NULL;
> +       free(repo->namespace);
> +       repo->namespace = NULL;

I wonder if we can defer the NULL assignments to
repo_clear, where we would just do a
memset(repo, 0, sizeof(struct repo));

> +
> +       repo_set_gitdir(repo, resolved_gitdir);
> +
> +       /* NEEDSWORK: Verify repository format version */

Care to elaborate on this? I do not understand why we would want
to check the format version here?

> +
> +extern void repo_set_gitdir(struct repo *repo, const char *path);
> +extern int repo_init(struct repo *repo, const char *path);
> +extern void repo_clear(struct repo *repo);

The init and clear method seem obvious to me, but what do we need the
repo_set_gitdir for externally? I would assume the repo auto-discovers its
gitdir on its own?

^ permalink raw reply	[relevance 8%]

* Re: [PATCH 31/31] ls-files: use repository object
      [irrelevant] ` <20170531214417.38857-32-bmwill@google.com>
@ 2017-06-01 20:36   ` Stefan Beller
  2017-06-05 17:46     ` Brandon Williams
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-06-01 20:36 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, Jonathan Nieder, Jacob Keller, Johannes Schindelin, brian m. carlson, Ben Peart, Duy Nguyen, Junio C Hamano, Jeff King, Jeff Hostetler, Ævar Arnfjörð Bjarmason, Jonathan Tan

On Wed, May 31, 2017 at 2:44 PM, Brandon Williams <bmwill@google.com> wrote:
> Convert ls-files to use a repository struct and recurse submodules
> inprocess.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>


> +static void show_submodule(const struct repo *superproject,
> +                          struct dir_struct *dir, const char *path)
>  {
> +       struct repo submodule;
> +       char *gitdir = mkpathdup("%s/%s", superproject->worktree, path);
> +       repo_init(&submodule, gitdir);
> +
> +       repo_read_index(&submodule);
> +       repo_read_gitmodules(&submodule);
> +
> +       if (superproject->submodule_prefix)
> +               submodule.submodule_prefix = xstrfmt("%s%s/", superproject->submodule_prefix, path);
> +       else
> +               submodule.submodule_prefix = xstrfmt("%s/", path);
> +       show_files(&submodule, dir);
> +
> +       repo_clear(&submodule);
> +       free(gitdir);
>  }

I like how it seems easy now to do work in another repository. :)

> -       { "ls-files", cmd_ls_files, RUN_SETUP | SUPPORT_SUPER_PREFIX },
> +       { "ls-files", cmd_ls_files, RUN_SETUP },

With this step, we can get rid of SUPPORT_SUPER_PREFIX eventually.

I do not have comments on the patches in the middle, but they
cleared up some of the questions that I asked in the early patches.

Thanks,
Stefan

^ permalink raw reply	[relevance 9%]

* [GSoC][PATCH v6 1/2] submodule: fix buggy $path and $sm_path variable's value
  2017-05-31  0:48                     ` Ramsay Jones
@ 2017-06-02 11:24                       ` Prathamesh Chavan
  2017-06-02 11:24                         ` [GSoC][PATCH v6 2/2] submodule: port subcommand foreach from shell to C Prathamesh Chavan
  0 siblings, 1 reply; 200+ results
From: Prathamesh Chavan @ 2017-06-02 11:24 UTC (permalink / raw)
  To: git; +Cc: sbeller, j6t, christian.couder, ramsay, Prathamesh Chavan

According to the documentation about git-submodule foreach subcommand's
$path variable:
$path is the name of the submodule directory relative to the superproject

But it was observed when the value of the $path value deviates from this
for the nested submodules when the <command> is run from a subdirectory.
This patch aims for its correction.

Additional test cases added to the submodule-foreach test suite in t7407,
to check the submodule foreach --recursive behavior from a subdirectory
as this was missing from the test suite.

Helped-by: Brandon Williams <bmwill@google.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
---
 git-submodule.sh             |  2 +-
 t/t7407-submodule-foreach.sh | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index c0d0e9a4c..ea6f56337 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -344,9 +344,9 @@ cmd_foreach()
 				prefix="$prefix$sm_path/"
 				sanitize_submodule_env
 				cd "$sm_path" &&
-				sm_path=$(git submodule--helper relative-path "$sm_path" "$wt_prefix") &&
 				# we make $path available to scripts ...
 				path=$sm_path &&
+				sm_path=$displaypath &&
 				if test $# -eq 1
 				then
 					eval "$1"
diff --git a/t/t7407-submodule-foreach.sh b/t/t7407-submodule-foreach.sh
index 6ba5daf42..f4e07eee3 100755
--- a/t/t7407-submodule-foreach.sh
+++ b/t/t7407-submodule-foreach.sh
@@ -197,6 +197,39 @@ test_expect_success 'test messages from "foreach --recursive" from subdirectory'
 	test_i18ncmp expect actual
 '
 
+sub1sha1=$(cd clone2/sub1 && git rev-parse HEAD)
+sub2sha1=$(cd clone2/sub2 && git rev-parse HEAD)
+sub3sha1=$(cd clone2/sub3 && git rev-parse HEAD)
+nested1sha1=$(cd clone2/nested1 && git rev-parse HEAD)
+nested2sha1=$(cd clone2/nested1/nested2 && git rev-parse HEAD)
+nested3sha1=$(cd clone2/nested1/nested2/nested3 && git rev-parse HEAD)
+submodulesha1=$(cd clone2/nested1/nested2/nested3/submodule && git rev-parse HEAD)
+
+cat >expect <<EOF
+Entering '../nested1'
+$pwd/clone2-nested1-../nested1-$nested1sha1
+Entering '../nested1/nested2'
+$pwd/clone2/nested1-nested2-../nested1/nested2-$nested2sha1
+Entering '../nested1/nested2/nested3'
+$pwd/clone2/nested1/nested2-nested3-../nested1/nested2/nested3-$nested3sha1
+Entering '../nested1/nested2/nested3/submodule'
+$pwd/clone2/nested1/nested2/nested3-submodule-../nested1/nested2/nested3/submodule-$submodulesha1
+Entering '../sub1'
+$pwd/clone2-foo1-../sub1-$sub1sha1
+Entering '../sub2'
+$pwd/clone2-foo2-../sub2-$sub2sha1
+Entering '../sub3'
+$pwd/clone2-foo3-../sub3-$sub3sha1
+EOF
+
+test_expect_success 'test "submodule foreach --recursive" from subdirectory' '
+	(
+		cd clone2/untracked &&
+		git submodule foreach --recursive "echo \$toplevel-\$name-\$sm_path-\$sha1" >../../actual
+	) &&
+	test_i18ncmp expect actual
+'
+
 cat > expect <<EOF
 nested1-nested1
 nested2-nested2
-- 
2.13.0


^ permalink raw reply	[relevance 22%]

* [GSoC][PATCH v6 2/2] submodule: port subcommand foreach from shell to C
  2017-06-02 11:24                       ` [GSoC][PATCH v6 1/2] submodule: fix buggy $path and $sm_path variable's value Prathamesh Chavan
@ 2017-06-02 11:24                         ` Prathamesh Chavan
  2017-06-03  2:13                           ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Prathamesh Chavan @ 2017-06-02 11:24 UTC (permalink / raw)
  To: git; +Cc: sbeller, j6t, christian.couder, ramsay, Prathamesh Chavan

This aims to make git-submodule foreach a builtin. This is the very
first step taken in this direction. Hence, 'foreach' is ported to
submodule--helper, and submodule--helper is called from git-submodule.sh.
The code is split up to have one function to obtain all the list of
submodules. This function acts as the front-end of git-submodule foreach
subcommand. It calls the function for_each_submodule_list, which basically
loops through the list and calls function fn, which in this case is
runcommand_in_submodule. This third function is a calling function that
takes care of running the command in that submodule, and recursively
perform the same when --recursive is flagged.

The first function module_foreach first parses the options present in
argv, and then with the help of module_list_compute, generates the list of
submodules present in the current working tree.

The second function for_each_submodule_list traverses through the
list, and calls function fn (which in case of submodule subcommand
foreach is runcommand_in_submodule) is called for each entry.

The third function runcommand_in_submodule, generates a submodule struct sub
for $name, value and then later prepends name=sub->name; and other
value assignment to the env argv_array structure of a child_process.
Also the <command> of submodule-foreach is push to args argv_array
structure and finally, using run_command the commands are executed
using a shell.

The third function also takes care of the recursive flag, by creating
a separate child_process structure and prepending "--super-prefix displaypath",
to the args argv_array structure. Other required arguments and the
input <command> of submodule-foreach is also appended to this argv_array.

Helped-by: Brandon Williams <bmwill@google.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
---
In this new version of, we avoid exporting the $path variable to
the environment, but instead prefix the <command> with it.
In this way, we avoid the issue it creates with the env variable
$PATH in windows.

Other than that, additionally the case of no. of arugments in <command>
being equal to 1 is also considered separetly.
THe reason of having this change in the shell script was given in the
commit 1c4fb136db.
According to my understanding, eval "$1" executes $1 in same shell,
whereas "$@" gets executed in a separate shell, which doesn't allow
"$@" to access the env variables $name, $path, etc.
Hence, to keep the ported function similar, this condition is also
added. 

Apart from this, other suggested changes are also implemented.

Complete build report is available at:
https://travis-ci.org/pratham-pc/git/builds
Branch: submodule-foreach
Build #87
 
 builtin/submodule--helper.c | 153 ++++++++++++++++++++++++++++++++++++++++++++
 git-submodule.sh            |  39 +----------
 2 files changed, 154 insertions(+), 38 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 566a5b6a6..d08a02ad9 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -13,6 +13,9 @@
 #include "refs.h"
 #include "connect.h"
 
+typedef void (*submodule_list_func_t)(const struct cache_entry *list_item,
+	      void *cb_data);
+
 static char *get_default_remote(void)
 {
 	char *dest = NULL, *ret;
@@ -219,6 +222,25 @@ static int resolve_relative_url_test(int argc, const char **argv, const char *pr
 	return 0;
 }
 
+static char *get_submodule_displaypath(const char *path, const char *prefix)
+{
+	const char *super_prefix = get_super_prefix();
+
+	if (prefix && super_prefix) {
+		BUG("cannot have prefix '%s' and superprefix '%s'",
+		    prefix, super_prefix);
+	} else if (prefix) {
+		struct strbuf sb = STRBUF_INIT;
+		char *displaypath = xstrdup(relative_path(path, prefix, &sb));
+		strbuf_release(&sb);
+		return displaypath;
+	} else if (super_prefix) {
+		return xstrfmt("%s/%s", super_prefix, path);
+	} else {
+		return xstrdup(path);
+	}
+}
+
 struct module_list {
 	const struct cache_entry **entries;
 	int alloc, nr;
@@ -331,6 +353,14 @@ static int module_list(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
+static void for_each_submodule_list(const struct module_list list,
+				    submodule_list_func_t fn, void *cb_data)
+{
+	int i;
+	for (i = 0; i < list.nr; i++)
+		fn(list.entries[i], cb_data);
+}
+
 static void init_submodule(const char *path, const char *prefix, int quiet)
 {
 	const struct submodule *sub;
@@ -487,6 +517,128 @@ static int module_name(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
+struct cb_foreach {
+	int argc;
+	const char **argv;
+	const char *prefix;
+	unsigned int quiet: 1;
+	unsigned int recursive: 1;
+};
+#define CB_FOREACH_INIT { 0, NULL, NULL, 0, 0 }
+
+static void runcommand_in_submodule(const struct cache_entry *list_item,
+				    void *cb_data)
+{
+	struct cb_foreach *info = cb_data;
+	char *toplevel;
+	const struct submodule *sub;
+	struct child_process cp = CHILD_PROCESS_INIT;
+	char *displaypath;
+	int i;
+
+	sub = submodule_from_path(null_sha1, list_item->name);
+
+	displaypath = get_submodule_displaypath(list_item->name, info->prefix);
+
+	if (!sub)
+		die(_("No url found for submodule path '%s' in .gitmodules"),
+		      displaypath);
+
+	if (!is_submodule_populated_gently(list_item->name, NULL))
+		return;
+
+	toplevel = xgetcwd();
+
+	prepare_submodule_repo_env(&cp.env_array);
+	cp.use_shell = 1;
+	cp.dir = list_item->name;
+
+	if (info->argc == 1) {
+		argv_array_pushf(&cp.env_array, "name=%s", sub->name);
+		argv_array_pushf(&cp.env_array, "sm_path=%s", displaypath);
+		argv_array_pushf(&cp.env_array, "sha1=%s",
+				 oid_to_hex(&list_item->oid));
+		argv_array_pushf(&cp.env_array, "toplevel=%s", toplevel);
+
+		argv_array_pushf(&cp.args, "path=%s; %s", list_item->name,
+				 info->argv[0]);
+	} else {
+		for (i = 0; i < info->argc; i++)
+			argv_array_push(&cp.args, info->argv[i]);
+	}
+
+	if (!info->quiet)
+		printf(_("Entering '%s'\n"), displaypath);
+
+	if (info->argv[0] && run_command(&cp))
+		die(_("run_command returned non-zero status for %s\n."),
+		      displaypath);
+
+	if (info->recursive) {
+		struct child_process cpr = CHILD_PROCESS_INIT;
+
+		cpr.git_cmd = 1;
+		cpr.dir = list_item->name;
+		prepare_submodule_repo_env(&cpr.env_array);
+
+		argv_array_pushl(&cpr.args, "--super-prefix", displaypath,
+				 "submodule--helper", "foreach", "--recursive",
+				 NULL);
+
+		if (info->quiet)
+			argv_array_push(&cpr.args, "--quiet");
+
+		for (i = 0; i < info->argc; i++)
+			argv_array_push(&cpr.args, info->argv[i]);
+
+		if (run_command(&cpr))
+			die(_("run_command returned non-zero status while"
+			      "recursing in the nested submodules of %s\n."),
+			      displaypath);
+	}
+
+	free(displaypath);
+	free(toplevel);
+}
+
+static int module_foreach(int argc, const char **argv, const char *prefix)
+{
+	struct cb_foreach info;
+	struct pathspec pathspec;
+	struct module_list list = MODULE_LIST_INIT;
+	int quiet = 0;
+	int recursive = 0;
+
+	struct option module_foreach_options[] = {
+		OPT__QUIET(&quiet, N_("Suppress output of entering each submodule command")),
+		OPT_BOOL(0, "recursive", &recursive,
+			 N_("Recurse into nested submodules")),
+		OPT_END()
+	};
+
+	const char *const git_submodule_helper_usage[] = {
+		N_("git submodule--helper foreach [--quiet] [--recursive] <command>"),
+		NULL
+	};
+
+	argc = parse_options(argc, argv, prefix, module_foreach_options,
+			     git_submodule_helper_usage, PARSE_OPT_KEEP_UNKNOWN);
+
+	if (module_list_compute(0, NULL, prefix, &pathspec, &list) < 0)
+		BUG("module_list_compute should not choke on empty pathspec");
+
+	info.argc = argc;
+	info.argv = argv;
+	info.prefix = prefix;
+	info.quiet = !!quiet;
+	info.recursive = !!recursive;
+
+	gitmodules_config();
+	for_each_submodule_list(list, runcommand_in_submodule, &info);
+
+	return 0;
+}
+
 static int clone_submodule(const char *path, const char *gitdir, const char *url,
 			   const char *depth, struct string_list *reference,
 			   int quiet, int progress)
@@ -1212,6 +1364,7 @@ static struct cmd_struct commands[] = {
 	{"relative-path", resolve_relative_path, 0},
 	{"resolve-relative-url", resolve_relative_url, 0},
 	{"resolve-relative-url-test", resolve_relative_url_test, 0},
+	{"foreach", module_foreach, SUPPORT_SUPER_PREFIX},
 	{"init", module_init, SUPPORT_SUPER_PREFIX},
 	{"remote-branch", resolve_remote_submodule_branch, 0},
 	{"push-check", push_check, 0},
diff --git a/git-submodule.sh b/git-submodule.sh
index ea6f56337..032fd2540 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -322,45 +322,8 @@ cmd_foreach()
 		shift
 	done
 
-	toplevel=$(pwd)
+	git ${wt_prefix:+-C "$wt_prefix"} ${prefix:+--super-prefix "$prefix"} submodule--helper foreach ${GIT_QUIET:+--quiet} ${recursive:+--recursive} "$@"
 
-	# dup stdin so that it can be restored when running the external
-	# command in the subshell (and a recursive call to this function)
-	exec 3<&0
-
-	{
-		git submodule--helper list --prefix "$wt_prefix" ||
-		echo "#unmatched" $?
-	} |
-	while read -r mode sha1 stage sm_path
-	do
-		die_if_unmatched "$mode" "$sha1"
-		if test -e "$sm_path"/.git
-		then
-			displaypath=$(git submodule--helper relative-path "$prefix$sm_path" "$wt_prefix")
-			say "$(eval_gettext "Entering '\$displaypath'")"
-			name=$(git submodule--helper name "$sm_path")
-			(
-				prefix="$prefix$sm_path/"
-				sanitize_submodule_env
-				cd "$sm_path" &&
-				# we make $path available to scripts ...
-				path=$sm_path &&
-				sm_path=$displaypath &&
-				if test $# -eq 1
-				then
-					eval "$1"
-				else
-					"$@"
-				fi &&
-				if test -n "$recursive"
-				then
-					cmd_foreach "--recursive" "$@"
-				fi
-			) <&3 3<&- ||
-			die "$(eval_gettext "Stopping at '\$displaypath'; script returned non-zero status.")"
-		fi
-	done
 }
 
 #
-- 
2.13.0


^ permalink raw reply	[relevance 17%]

* [PATCH] Documentation/git-rm: correct submodule description
@ 2017-06-02 19:28 Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-06-02 19:28 UTC (permalink / raw)
  To: gitster; +Cc: git, Stefan Beller

Since 3ccd681c2a (Merge branch 'sb/submodule-rm-absorb', 2017-01-18)
git-rm tries to absorb any submodules git dir before deleting the
submodule. Correct the documentation to say so.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/git-rm.txt | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-rm.txt b/Documentation/git-rm.txt
index f1efc116eb..8c87e8cdd7 100644
--- a/Documentation/git-rm.txt
+++ b/Documentation/git-rm.txt
@@ -140,10 +140,11 @@ Only submodules using a gitfile (which means they were cloned
 with a Git version 1.7.8 or newer) will be removed from the work
 tree, as their repository lives inside the .git directory of the
 superproject. If a submodule (or one of those nested inside it)
-still uses a .git directory, `git rm` will fail - no matter if forced
-or not - to protect the submodule's history. If it exists the
-submodule.<name> section in the linkgit:gitmodules[5] file will also
-be removed and that file will be staged (unless --cached or -n are used).
+still uses a .git directory, `git rm` will move the submodules
+git directory into the superprojects git directory to protect
+the submodule's history. If it exists the submodule.<name> section
+in the linkgit:gitmodules[5] file will also be removed and that file
+will be staged (unless --cached or -n are used).
 
 A submodule is considered up-to-date when the HEAD is the same as
 recorded in the index, no tracked files are modified and no untracked
-- 
2.13.0.17.gab62347cd9


^ permalink raw reply	[relevance 31%]

* [PATCH] submodule foreach: correct $sm_path in nested submodules from a dir
      [irrelevant] <20170515183405.GA79147@google.com>
      [irrelevant] ` <20170521125814.26255-1-pc44800@gmail.com>
@ 2017-06-03  0:37 ` Stefan Beller
  2017-06-03 14:07   ` Ramsay Jones
  2017-06-05 22:20   ` Jonathan Nieder
  1 sibling, 2 replies; 200+ results
From: Stefan Beller @ 2017-06-03  0:37 UTC (permalink / raw)
  To: bmwill, gitster; +Cc: git, pc44800, ramsay, sbeller

When running 'git submodule foreach' from a subdirectory of your
repository, nested submodules get a bogus value for $sm_path:
For a submodule 'sub' that contains a nested submodule 'nested',
running 'git -C dir submodule foreach echo $path' would report
path='../nested' for the nested submodule. The first part '../' is
derived from the logic computing the relative path from $pwd to the
root of the superproject. The second part is the submodule path inside
the submodule. This value is of little use and is hard to document.

There are two different possible solutions that have more value:
(a) The path value is documented as the path from the toplevel of the
    superproject to the mount point of the submodule.
    In this case we would want to have path='sub/nested'.

(b) As Ramsay noticed the documented value is wrong. For the non-nested
    case the path is equal to the relative path from $pwd to the
    submodules working directory. When following this model,
    the expected value would be path='../sub/nested'.

The behavior for (b) was introduced in 091a6eb0fe (submodule: drop the
top-level requirement, 2013-06-16) the intent for $path seemed to be
relative to $cwd to the submodule worktree, but that did not work for
nested submodules, as the intermittent submodules were not included in
the path.

If we were to fix the meaning of the $path using (a) such that "path"
is "the path from the toplevel of the superproject to the mount point
of the submodule", we would break any existing submodule user that runs
foreach from non-root of the superproject as the non-nested submodule
'../sub' would change its path to 'sub'.

If we would fix the meaning of the $path using (b), such that "path"
is "the relative path from $pwd to the submodule", then we would break
any user that uses nested submodules (even from the root directory) as
the 'nested' would become 'sub/nested'.

Both groups can be found in the wild.  The author has no data if one group
outweighs the other by large margin, and offending each one seems equally
bad at first.  However in the authors imagination it is better to go with
(a) as running from a sub directory sounds like it is carried out
by a human rather than by some automation task.  With a human on
the keyboard the feedback loop is short and the changed behavior can be
adapted to quickly unlike some automation that can break silently.

To ameliorate the situation, perform these changes
* Document 'sm_path' instead of 'path'.
  As using a variable '$path' may be harmful to users due to
  capitalization issues, see 64394e3ae9 (git-submodule.sh: Don't
  use $path variable in eval_gettext string, 2012-04-17). Adjust
  the documentation to advocate for using $sm_path,  which contains
  the same value. We still make the 'path' variable available,
  though not documented.

* Clarify the 'toplevel' variable documentation.
  It does not contain the topmost superproject as the author assumed,
  but the direct superproject, such that $toplevel/$sm_path is the
  actual absolute path of the submodule.

* The variable '$displaypath' was accessible but undocumented.
  Rename it '$displaypath' to '$dpath'. Document what it contains.
  Users that are broken by the behavior change of 'sm_path' introduced
  in this commit, can switch from '$path' to '$dpath'.

Discussed-with: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
---

Ramsay,
I would think this would be least offensive to all parties involved.
With my current understanding of the situation
I think this is the best fix for now.

Prathamesh, sorry for the missleading suggestions earlier
on how to approach this bug. Let's see how the discussion turns out on this
one before rebasing the rewrite in C on this one.

Thanks,
Stefan

 Documentation/git-submodule.txt | 32 ++++++++++++++++++--------------
 git-submodule.sh                |  7 +++----
 t/t7407-submodule-foreach.sh    | 39 ++++++++++++++++++++++++++++++++++++---
 3 files changed, 57 insertions(+), 21 deletions(-)

diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index 74bc6200d5..52e3ef1325 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -218,20 +218,24 @@ information too.
 
 foreach [--recursive] <command>::
 	Evaluates an arbitrary shell command in each checked out submodule.
-	The command has access to the variables $name, $path, $sha1 and
-	$toplevel:
-	$name is the name of the relevant submodule section in .gitmodules,
-	$path is the name of the submodule directory relative to the
-	superproject, $sha1 is the commit as recorded in the superproject,
-	and $toplevel is the absolute path to the top-level of the superproject.
-	Any submodules defined in the superproject but not checked out are
-	ignored by this command. Unless given `--quiet`, foreach prints the name
-	of each submodule before evaluating the command.
-	If `--recursive` is given, submodules are traversed recursively (i.e.
-	the given shell command is evaluated in nested submodules as well).
-	A non-zero return from the command in any submodule causes
-	the processing to terminate. This can be overridden by adding '|| :'
-	to the end of the command.
+	The command has access to the following variables:
++
+* `$name` is the name of the relevant submodule section in .gitmodules,
+* `$sha1` is the commit as recorded in the superproject.
+* `$sm_path` is the path recorded in the superproject.
+* `$toplevel` is the absolute path to its superproject, such that
+  `$toplevel/$sm_path` is the absolute path of the submodule.
+* `$dpath` contains the relative path from the current working directory
+   to the submodules root directory.
++
+Any submodules defined in the superproject but not checked out are
+ignored by this command. Unless given `--quiet`, foreach prints the name
+of each submodule before evaluating the command.
+If `--recursive` is given, submodules are traversed recursively (i.e.
+the given shell command is evaluated in nested submodules as well).
+A non-zero return from the command in any submodule causes
+the processing to terminate. This can be overridden by adding '|| :'
+to the end of the command.
 +
 As an example, the command below will show the path and currently
 checked out commit for each submodule:
diff --git a/git-submodule.sh b/git-submodule.sh
index c0d0e9a4c6..8133640fa1 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -337,14 +337,13 @@ cmd_foreach()
 		die_if_unmatched "$mode" "$sha1"
 		if test -e "$sm_path"/.git
 		then
-			displaypath=$(git submodule--helper relative-path "$prefix$sm_path" "$wt_prefix")
-			say "$(eval_gettext "Entering '\$displaypath'")"
+			dpath=$(git submodule--helper relative-path "$prefix$sm_path" "$wt_prefix")
+			say "$(eval_gettext "Entering '\$dpath'")"
 			name=$(git submodule--helper name "$sm_path")
 			(
 				prefix="$prefix$sm_path/"
 				sanitize_submodule_env
 				cd "$sm_path" &&
-				sm_path=$(git submodule--helper relative-path "$sm_path" "$wt_prefix") &&
 				# we make $path available to scripts ...
 				path=$sm_path &&
 				if test $# -eq 1
@@ -358,7 +357,7 @@ cmd_foreach()
 					cmd_foreach "--recursive" "$@"
 				fi
 			) <&3 3<&- ||
-			die "$(eval_gettext "Stopping at '\$displaypath'; script returned non-zero status.")"
+			die "$(eval_gettext "Stopping at '\$dpath'; script returned non-zero status.")"
 		fi
 	done
 }
diff --git a/t/t7407-submodule-foreach.sh b/t/t7407-submodule-foreach.sh
index 6ba5daf42e..a082ca75aa 100755
--- a/t/t7407-submodule-foreach.sh
+++ b/t/t7407-submodule-foreach.sh
@@ -82,16 +82,16 @@ test_expect_success 'test basic "submodule foreach" usage' '
 
 cat >expect <<EOF
 Entering '../sub1'
-$pwd/clone-foo1-../sub1-$sub1sha1
+$pwd/clone-foo1-sub1-../sub1-$sub1sha1
 Entering '../sub3'
-$pwd/clone-foo3-../sub3-$sub3sha1
+$pwd/clone-foo3-sub3-../sub3-$sub3sha1
 EOF
 
 test_expect_success 'test "submodule foreach" from subdirectory' '
 	mkdir clone/sub &&
 	(
 		cd clone/sub &&
-		git submodule foreach "echo \$toplevel-\$name-\$sm_path-\$sha1" >../../actual
+		git submodule foreach "echo \$toplevel-\$name-\$sm_path-\$dpath-\$sha1" >../../actual
 	) &&
 	test_i18ncmp expect actual
 '
@@ -197,6 +197,39 @@ test_expect_success 'test messages from "foreach --recursive" from subdirectory'
 	test_i18ncmp expect actual
 '
 
+sub1sha1=$(cd clone2/sub1 && git rev-parse HEAD)
+sub2sha1=$(cd clone2/sub2 && git rev-parse HEAD)
+sub3sha1=$(cd clone2/sub3 && git rev-parse HEAD)
+nested1sha1=$(cd clone2/nested1 && git rev-parse HEAD)
+nested2sha1=$(cd clone2/nested1/nested2 && git rev-parse HEAD)
+nested3sha1=$(cd clone2/nested1/nested2/nested3 && git rev-parse HEAD)
+submodulesha1=$(cd clone2/nested1/nested2/nested3/submodule && git rev-parse HEAD)
+
+cat >expect <<EOF
+Entering '../nested1'
+$pwd/clone2-nested1-nested1-../nested1-$nested1sha1
+Entering '../nested1/nested2'
+$pwd/clone2/nested1-nested2-nested2-../nested1/nested2-$nested2sha1
+Entering '../nested1/nested2/nested3'
+$pwd/clone2/nested1/nested2-nested3-nested3-../nested1/nested2/nested3-$nested3sha1
+Entering '../nested1/nested2/nested3/submodule'
+$pwd/clone2/nested1/nested2/nested3-submodule-submodule-../nested1/nested2/nested3/submodule-$submodulesha1
+Entering '../sub1'
+$pwd/clone2-foo1-sub1-../sub1-$sub1sha1
+Entering '../sub2'
+$pwd/clone2-foo2-sub2-../sub2-$sub2sha1
+Entering '../sub3'
+$pwd/clone2-foo3-sub3-../sub3-$sub3sha1
+EOF
+
+test_expect_success 'test "submodule foreach --recursive" from subdirectory' '
+	(
+		cd clone2/untracked &&
+		git submodule foreach --recursive "echo \$toplevel-\$name-\$sm_path-\$dpath-\$sha1" >../../actual
+	) &&
+	test_i18ncmp expect actual
+'
+
 cat > expect <<EOF
 nested1-nested1
 nested2-nested2
-- 
2.13.0.17.gab62347cd9


^ permalink raw reply	[relevance 22%]

* Re: [GSoC][PATCH v6 2/2] submodule: port subcommand foreach from shell to C
  2017-06-02 11:24                         ` [GSoC][PATCH v6 2/2] submodule: port subcommand foreach from shell to C Prathamesh Chavan
@ 2017-06-03  2:13                           ` Stefan Beller
  2017-06-04 10:32                             ` Prathamesh Chavan
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-06-03  2:13 UTC (permalink / raw)
  To: Prathamesh Chavan; +Cc: git, Johannes Sixt, Christian Couder, Ramsay Jones

On Fri, Jun 2, 2017 at 4:24 AM, Prathamesh Chavan <pc44800@gmail.com> wrote:
> This aims to make git-submodule foreach a builtin. This is the very
> first step taken in this direction. Hence, 'foreach' is ported to
> submodule--helper, and submodule--helper is called from git-submodule.sh.
> The code is split up to have one function to obtain all the list of
> submodules. This function acts as the front-end of git-submodule foreach
> subcommand. It calls the function for_each_submodule_list, which basically
> loops through the list and calls function fn, which in this case is
> runcommand_in_submodule. This third function is a calling function that
> takes care of running the command in that submodule, and recursively
> perform the same when --recursive is flagged.
>
> The first function module_foreach first parses the options present in
> argv, and then with the help of module_list_compute, generates the list of
> submodules present in the current working tree.
>
> The second function for_each_submodule_list traverses through the
> list, and calls function fn (which in case of submodule subcommand
> foreach is runcommand_in_submodule) is called for each entry.
>
> The third function runcommand_in_submodule, generates a submodule struct sub
> for $name, value and then later prepends name=sub->name; and other
> value assignment to the env argv_array structure of a child_process.
> Also the <command> of submodule-foreach is push to args argv_array
> structure and finally, using run_command the commands are executed
> using a shell.
>
> The third function also takes care of the recursive flag, by creating
> a separate child_process structure and prepending "--super-prefix displaypath",
> to the args argv_array structure. Other required arguments and the
> input <command> of submodule-foreach is also appended to this argv_array.
>

Is the commit message still accurate?
You describe the changes between the versions below the --- line,
that is not recorded in the permanent commit history.

In the commit message is less important to write "what" is happening,
because that can easily be read from the patch/commit itself, but rather
"why" things happen, such as design choices, maybe:

    This aims to make git-submodule foreach a builtin. This is the very
    first step taken in this direction. Hence, 'foreach' is ported to
    submodule--helper, and submodule--helper is called from git-submodule.sh.

    We'll introduce 3 functions, one that is exposed to the command line
    and handles command line arguments, one to iterate over a set of
    submodules, and finally one to execute an arbitrary shell command
    in the submodule.

    Attention must be paid to the 'path' variable, see 64394e3ae9
    (git-submodule.sh: Don't use $path variable in eval_gettext string,
    2012-04-17) details. The path varialbe is not exposed into the environment
    of the invoked shell, but we just give "path=%s;" as the first argument.

    We do not need to condition on the number of arguments as in 1c4fb136db
    (submodule foreach: skip eval for more than one argument, 2013-09-27)
    as we will run exactly one shell in the submodules directory.

    Sign-off-...

>
> Other than that, additionally the case of no. of arugments in <command>
> being equal to 1 is also considered separetly.
> THe reason of having this change in the shell script was given in the
> commit 1c4fb136db.
> According to my understanding, eval "$1" executes $1 in same shell,
> whereas "$@" gets executed in a separate shell, which doesn't allow
> "$@" to access the env variables $name, $path, etc.
> Hence, to keep the ported function similar, this condition is also
> added.

This paragraph would be a good candidate for the commit message, too.
However as we rewrite it in C, we will spawn exactly one shell no matter
how many arguments we have (well for 0 we have no shell, but for 1 or more
arguments we'll spawn exactly one shell?)

> +       } else if (prefix) {
> +               struct strbuf sb = STRBUF_INIT;
> +               char *displaypath = xstrdup(relative_path(path, prefix, &sb));
> +               strbuf_release(&sb);
> +               return displaypath;

Note to self (or any other that is interested in long term clean code):
    I have seen this pattern a couple of times, a strbuf just to appease
    the argument list of relative_path.
    (init_submodule, prepare_to_clone_next_submodule,
    resolve_relative_path in submodule--helper
    cmd_rev_parse in builtin/rev-parse
    connect_work_tree_and_git_dir in dir.c
    write_name_quoted_relative in quote.c
    get_superproject_working_tree in submodule.c
    cmd_main in test-path-utils;
    actually all uses of this function :( )
    We should really make a relative_path function that can work
    without the need of a strbuf, maybe just wrap the 3 lines into a new
    function, or remove the strbuf from the argument list.
    (The potential memleak is horrible to fix though. But as seen here
    we could just always return an allocated string and
    mandate the caller to free it)

> +struct cb_foreach {
> +       int argc;
> +       const char **argv;
> +       const char *prefix;
> +       unsigned int quiet: 1;
> +       unsigned int recursive: 1;
> +};
> +#define CB_FOREACH_INIT { 0, NULL, NULL, 0, 0 }
> +
> +static void runcommand_in_submodule(const struct cache_entry *list_item,
> +                                   void *cb_data)

As we only ever use list_item->name, we could also change
the argument list to take a "const char *[submodule_]path".

> +       prepare_submodule_repo_env(&cp.env_array);
> +       cp.use_shell = 1;
> +       cp.dir = list_item->name;
> +
> +       if (info->argc == 1) {
> +               argv_array_pushf(&cp.env_array, "name=%s", sub->name);
> +               argv_array_pushf(&cp.env_array, "sm_path=%s", displaypath);
> +               argv_array_pushf(&cp.env_array, "sha1=%s",
> +                                oid_to_hex(&list_item->oid));
> +               argv_array_pushf(&cp.env_array, "toplevel=%s", toplevel);
> +
> +               argv_array_pushf(&cp.args, "path=%s; %s", list_item->name,
> +                                info->argv[0]);

In the argc != 1 case we also want to have the env_array filled with
useful variables. (It seems we do not have tests for that?)
To test for that we'd need a test similar as in 1c4fb136d,
just with
    git submodule foreach echo \$sm_path \$dpath
for example.

So I think you can move the pushes to the env array outside this condition.
As we set cp.use_shell unconditionally, we do not need to construct
the first argument specially with path preset, but instead we can just prepend
it in the array:

    argv_array_pushf(&cp.env_array, "name=%s", sub->name);
    ... // more env stuff

    argv_array_pushf(&cp.args, "path=%s;", list_item->name);
    for (i = 0; i < info->argc; i++)
        argv_array_push(&cp.args, info->argv[i]);

should do?


> +
> +       if (info->argv[0] && run_command(&cp))
> +               die(_("run_command returned non-zero status for %s\n."),

This would rather be
    die(_("Stopping at '%s'; script returned non-zero status."), displaypath);
to imitate the shell version faithfully?


> +
> +               if (run_command(&cpr))
> +                       die(_("run_command returned non-zero status while"
> +                             "recursing in the nested submodules of %s\n."),
> +                             displaypath);

same here. As the inner process would have already printed the "Stopping..."
we would not need to repeat it, though.

So maybe

    /* no need to report error, child does: */
    run_command(&cpr);

The rest below looks good. :)

Thanks,
Stefan

^ permalink raw reply	[relevance 22%]

* Re: [PATCH] submodule foreach: correct $sm_path in nested submodules from a dir
  2017-06-03  0:37 ` [PATCH] submodule foreach: correct $sm_path in nested submodules from a dir Stefan Beller
@ 2017-06-03 14:07   ` Ramsay Jones
  2017-06-05 22:20   ` Jonathan Nieder
  1 sibling, 0 replies; 200+ results
From: Ramsay Jones @ 2017-06-03 14:07 UTC (permalink / raw)
  To: Stefan Beller, bmwill, gitster; +Cc: git, pc44800



On 03/06/17 01:37, Stefan Beller wrote:
> When running 'git submodule foreach' from a subdirectory of your
> repository, nested submodules get a bogus value for $sm_path:
> For a submodule 'sub' that contains a nested submodule 'nested',
> running 'git -C dir submodule foreach echo $path' would report
> path='../nested' for the nested submodule. The first part '../' is
> derived from the logic computing the relative path from $pwd to the
> root of the superproject. The second part is the submodule path inside
> the submodule. This value is of little use and is hard to document.
> 
> There are two different possible solutions that have more value:
> (a) The path value is documented as the path from the toplevel of the
>     superproject to the mount point of the submodule.
>     In this case we would want to have path='sub/nested'.
> 
> (b) As Ramsay noticed the documented value is wrong. For the non-nested
>     case the path is equal to the relative path from $pwd to the
>     submodules working directory. When following this model,
>     the expected value would be path='../sub/nested'.
> 
> The behavior for (b) was introduced in 091a6eb0fe (submodule: drop the

Ah, so prior to 091a6eb0fe the documentation of $path was actually
correct - you could not run the command from a sub-directory, so
$path _was_ the 'name of the submodule directory relative to the
superproject'. (The fact that git-ls-files worked from a subdirectory
is not in the least relevant - it never was!) ;-)

> top-level requirement, 2013-06-16) the intent for $path seemed to be
> relative to $cwd to the submodule worktree, but that did not work for
> nested submodules, as the intermittent submodules were not included in
                            ^^^^^^^^^^^^
intermediate?

Hmm, so nobody noticed the change in behaviour (and, of course, that
the documentation hadn't been updated to match) since 2013!

> the path.
> 
> If we were to fix the meaning of the $path using (a) such that "path"
> is "the path from the toplevel of the superproject to the mount point
> of the submodule", we would break any existing submodule user that runs
> foreach from non-root of the superproject as the non-nested submodule
> '../sub' would change its path to 'sub'.
> 
> If we would fix the meaning of the $path using (b), such that "path"
> is "the relative path from $pwd to the submodule", then we would break
> any user that uses nested submodules (even from the root directory) as
> the 'nested' would become 'sub/nested'.
> 
> Both groups can be found in the wild.  The author has no data if one group
> outweighs the other by large margin, and offending each one seems equally
> bad at first.  However in the authors imagination it is better to go with
> (a) as running from a sub directory sounds like it is carried out
> by a human rather than by some automation task.  With a human on
> the keyboard the feedback loop is short and the changed behavior can be
> adapted to quickly unlike some automation that can break silently.

I do not use submodules (I have absolutely no interest in them, except
in a general wish to improve git sense). So, I have no idea what kind
of impact either change will have. However, FWIW, I agree with your
reasoning here. (Not that I actually get a vote, but I vote for a!)

> To ameliorate the situation, perform these changes
> * Document 'sm_path' instead of 'path'.
>   As using a variable '$path' may be harmful to users due to
>   capitalization issues, see 64394e3ae9 (git-submodule.sh: Don't
>   use $path variable in eval_gettext string, 2012-04-17). Adjust
>   the documentation to advocate for using $sm_path,  which contains
>   the same value. We still make the 'path' variable available,
>   though not documented.
> 
> * Clarify the 'toplevel' variable documentation.
>   It does not contain the topmost superproject as the author assumed,
>   but the direct superproject, such that $toplevel/$sm_path is the
>   actual absolute path of the submodule.
> 
> * The variable '$displaypath' was accessible but undocumented.
>   Rename it '$displaypath' to '$dpath'. Document what it contains.
>   Users that are broken by the behavior change of 'sm_path' introduced
>   in this commit, can switch from '$path' to '$dpath'.
> 
> Discussed-with: Ramsay Jones <ramsay@ramsayjones.plus.com>
> Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
> 

  Documentation/git-submodule.txt | 32 ++++++++++++++++++--------------
>  git-submodule.sh                |  7 +++----
>  t/t7407-submodule-foreach.sh    | 39 ++++++++++++++++++++++++++++++++++++---
>  3 files changed, 57 insertions(+), 21 deletions(-)
> 
> diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
> index 74bc6200d5..52e3ef1325 100644
> --- a/Documentation/git-submodule.txt
> +++ b/Documentation/git-submodule.txt
> @@ -218,20 +218,24 @@ information too.
>  
>  foreach [--recursive] <command>::
>  	Evaluates an arbitrary shell command in each checked out submodule.
> -	The command has access to the variables $name, $path, $sha1 and
> -	$toplevel:
> -	$name is the name of the relevant submodule section in .gitmodules,
> -	$path is the name of the submodule directory relative to the
> -	superproject, $sha1 is the commit as recorded in the superproject,
> -	and $toplevel is the absolute path to the top-level of the superproject.
> -	Any submodules defined in the superproject but not checked out are
> -	ignored by this command. Unless given `--quiet`, foreach prints the name
> -	of each submodule before evaluating the command.
> -	If `--recursive` is given, submodules are traversed recursively (i.e.
> -	the given shell command is evaluated in nested submodules as well).
> -	A non-zero return from the command in any submodule causes
> -	the processing to terminate. This can be overridden by adding '|| :'
> -	to the end of the command.
> +	The command has access to the following variables:
> ++
> +* `$name` is the name of the relevant submodule section in .gitmodules,
> +* `$sha1` is the commit as recorded in the superproject.
> +* `$sm_path` is the path recorded in the superproject.

Just to be sure, the 'path recorded in the superproject' means the
same thing as the 'name of the submodule directory relative to the
superproject'. Yes?

> +* `$toplevel` is the absolute path to its superproject, such that
> +  `$toplevel/$sm_path` is the absolute path of the submodule.
> +* `$dpath` contains the relative path from the current working directory
> +   to the submodules root directory.

Subject to the above, this all looks good to me. (I can't comment
on the implementation - I just assume that it correctly implements
the above).

Thanks!

ATB,
Ramsay Jones



^ permalink raw reply	[relevance 26%]

* Re: [GSoC][PATCH v6 2/2] submodule: port subcommand foreach from shell to C
  2017-06-03  2:13                           ` Stefan Beller
@ 2017-06-04 10:32                             ` Prathamesh Chavan
  0 siblings, 0 replies; 200+ results
From: Prathamesh Chavan @ 2017-06-04 10:32 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Johannes Sixt, Christian Couder, Ramsay Jones

On Sat, Jun 3, 2017 at 7:43 AM, Stefan Beller <sbeller@google.com> wrote:
> On Fri, Jun 2, 2017 at 4:24 AM, Prathamesh Chavan <pc44800@gmail.com> wrote:
>> This aims to make git-submodule foreach a builtin. This is the very
>> first step taken in this direction. Hence, 'foreach' is ported to
>> submodule--helper, and submodule--helper is called from git-submodule.sh.
>> The code is split up to have one function to obtain all the list of
>> submodules. This function acts as the front-end of git-submodule foreach
>> subcommand. It calls the function for_each_submodule_list, which basically
>> loops through the list and calls function fn, which in this case is
>> runcommand_in_submodule. This third function is a calling function that
>> takes care of running the command in that submodule, and recursively
>> perform the same when --recursive is flagged.
>>
>> The first function module_foreach first parses the options present in
>> argv, and then with the help of module_list_compute, generates the list of
>> submodules present in the current working tree.
>>
>> The second function for_each_submodule_list traverses through the
>> list, and calls function fn (which in case of submodule subcommand
>> foreach is runcommand_in_submodule) is called for each entry.
>>
>> The third function runcommand_in_submodule, generates a submodule struct sub
>> for $name, value and then later prepends name=sub->name; and other
>> value assignment to the env argv_array structure of a child_process.
>> Also the <command> of submodule-foreach is push to args argv_array
>> structure and finally, using run_command the commands are executed
>> using a shell.
>>
>> The third function also takes care of the recursive flag, by creating
>> a separate child_process structure and prepending "--super-prefix displaypath",
>> to the args argv_array structure. Other required arguments and the
>> input <command> of submodule-foreach is also appended to this argv_array.
>>
>
> Is the commit message still accurate?
> You describe the changes between the versions below the --- line,
> that is not recorded in the permanent commit history.
>
> In the commit message is less important to write "what" is happening,
> because that can easily be read from the patch/commit itself, but rather
> "why" things happen, such as design choices, maybe:
>
>     This aims to make git-submodule foreach a builtin. This is the very
>     first step taken in this direction. Hence, 'foreach' is ported to
>     submodule--helper, and submodule--helper is called from git-submodule.sh.
>
>     We'll introduce 3 functions, one that is exposed to the command line
>     and handles command line arguments, one to iterate over a set of
>     submodules, and finally one to execute an arbitrary shell command
>     in the submodule.
>
>     Attention must be paid to the 'path' variable, see 64394e3ae9
>     (git-submodule.sh: Don't use $path variable in eval_gettext string,
>     2012-04-17) details. The path varialbe is not exposed into the environment
>     of the invoked shell, but we just give "path=%s;" as the first argument.
>
>     We do not need to condition on the number of arguments as in 1c4fb136db
>     (submodule foreach: skip eval for more than one argument, 2013-09-27)
>     as we will run exactly one shell in the submodules directory.
>
>     Sign-off-...
>
>>
>> Other than that, additionally the case of no. of arugments in <command>
>> being equal to 1 is also considered separetly.
>> THe reason of having this change in the shell script was given in the
>> commit 1c4fb136db.
>> According to my understanding, eval "$1" executes $1 in same shell,
>> whereas "$@" gets executed in a separate shell, which doesn't allow
>> "$@" to access the env variables $name, $path, etc.
>> Hence, to keep the ported function similar, this condition is also
>> added.
>
> This paragraph would be a good candidate for the commit message, too.
> However as we rewrite it in C, we will spawn exactly one shell no matter
> how many arguments we have (well for 0 we have no shell, but for 1 or more
> arguments we'll spawn exactly one shell?)
>

I was trying to explaing the condition of the code in git-submodule.sh,
before porting. To be more clear, I meant that when we run the command
eval "$1", it runs in the same shell in which the cmd_foreach has been running,
unlike in the case of "$@", in which case, the command in executed in a separate
shell.

>> +       } else if (prefix) {
>> +               struct strbuf sb = STRBUF_INIT;
>> +               char *displaypath = xstrdup(relative_path(path, prefix, &sb));
>> +               strbuf_release(&sb);
>> +               return displaypath;
>
> Note to self (or any other that is interested in long term clean code):
>     I have seen this pattern a couple of times, a strbuf just to appease
>     the argument list of relative_path.
>     (init_submodule, prepare_to_clone_next_submodule,
>     resolve_relative_path in submodule--helper
>     cmd_rev_parse in builtin/rev-parse
>     connect_work_tree_and_git_dir in dir.c
>     write_name_quoted_relative in quote.c
>     get_superproject_working_tree in submodule.c
>     cmd_main in test-path-utils;
>     actually all uses of this function :( )
>     We should really make a relative_path function that can work
>     without the need of a strbuf, maybe just wrap the 3 lines into a new
>     function, or remove the strbuf from the argument list.
>     (The potential memleak is horrible to fix though. But as seen here
>     we could just always return an allocated string and
>     mandate the caller to free it)
>
>> +struct cb_foreach {
>> +       int argc;
>> +       const char **argv;
>> +       const char *prefix;
>> +       unsigned int quiet: 1;
>> +       unsigned int recursive: 1;
>> +};
>> +#define CB_FOREACH_INIT { 0, NULL, NULL, 0, 0 }
>> +
>> +static void runcommand_in_submodule(const struct cache_entry *list_item,
>> +                                   void *cb_data)
>
> As we only ever use list_item->name, we could also change
> the argument list to take a "const char *[submodule_]path".
>
>> +       prepare_submodule_repo_env(&cp.env_array);
>> +       cp.use_shell = 1;
>> +       cp.dir = list_item->name;
>> +
>> +       if (info->argc == 1) {
>> +               argv_array_pushf(&cp.env_array, "name=%s", sub->name);
>> +               argv_array_pushf(&cp.env_array, "sm_path=%s", displaypath);
>> +               argv_array_pushf(&cp.env_array, "sha1=%s",
>> +                                oid_to_hex(&list_item->oid));
>> +               argv_array_pushf(&cp.env_array, "toplevel=%s", toplevel);
>> +
>> +               argv_array_pushf(&cp.args, "path=%s; %s", list_item->name,
>> +                                info->argv[0]);
>
> In the argc != 1 case we also want to have the env_array filled with
> useful variables. (It seems we do not have tests for that?)
> To test for that we'd need a test similar as in 1c4fb136d,
> just with
>     git submodule foreach echo \$sm_path \$dpath
> for example.
>
> So I think you can move the pushes to the env array outside this condition.
> As we set cp.use_shell unconditionally, we do not need to construct
> the first argument specially with path preset, but instead we can just prepend
> it in the array:
>
>     argv_array_pushf(&cp.env_array, "name=%s", sub->name);
>     ... // more env stuff
>
>     argv_array_pushf(&cp.args, "path=%s;", list_item->name);
>     for (i = 0; i < info->argc; i++)
>         argv_array_push(&cp.args, info->argv[i]);
>
> should do?
>
>
Yes, even I think this is what should be done. But for the given code,

if test $# -eq 1
then
    eval "$1"
else
    "$@"
fi &&

in case when $# is not equal to 1,
the <command> "$@" gets executed in a separate shell and hence could not
access the $name, $path, $toplevel, $sha1 variables.
Infact, it can be observed, that the output of the <command>
echo \$name is something bogus and not the submodule's name.

Also, in the given test suite, the env variables are used only in those
cases where no. of arguments is one.

Hence, to keep the ported code similar to the privious one, such an
additional case of argc == 1 needs to be considered.

Thanks,
Prathamesh Chavan

^ permalink raw reply	[relevance 23%]

* Re: [PATCH 31/31] ls-files: use repository object
  2017-06-01 20:36   ` Re: [PATCH 31/31] ls-files: use repository object Stefan Beller
@ 2017-06-05 17:46     ` Brandon Williams
  0 siblings, 0 replies; 200+ results
From: Brandon Williams @ 2017-06-05 17:46 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Jonathan Nieder, Jacob Keller, Johannes Schindelin, brian m. carlson, Ben Peart, Duy Nguyen, Junio C Hamano, Jeff King, Jeff Hostetler, Ævar Arnfjörð Bjarmason, Jonathan Tan

On 06/01, Stefan Beller wrote:
> On Wed, May 31, 2017 at 2:44 PM, Brandon Williams <bmwill@google.com> wrote:
> > Convert ls-files to use a repository struct and recurse submodules
> > inprocess.
> >
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> 
> 
> > +static void show_submodule(const struct repo *superproject,
> > +                          struct dir_struct *dir, const char *path)
> >  {
> > +       struct repo submodule;
> > +       char *gitdir = mkpathdup("%s/%s", superproject->worktree, path);
> > +       repo_init(&submodule, gitdir);
> > +
> > +       repo_read_index(&submodule);
> > +       repo_read_gitmodules(&submodule);
> > +
> > +       if (superproject->submodule_prefix)
> > +               submodule.submodule_prefix = xstrfmt("%s%s/", superproject->submodule_prefix, path);
> > +       else
> > +               submodule.submodule_prefix = xstrfmt("%s/", path);
> > +       show_files(&submodule, dir);
> > +
> > +       repo_clear(&submodule);
> > +       free(gitdir);
> >  }
> 
> I like how it seems easy now to do work in another repository. :)

It really does make working with another repo easy!  No more compiling
argv options and spawning child processes :D

> 
> > -       { "ls-files", cmd_ls_files, RUN_SETUP | SUPPORT_SUPER_PREFIX },
> > +       { "ls-files", cmd_ls_files, RUN_SETUP },
> 
> With this step, we can get rid of SUPPORT_SUPER_PREFIX eventually.

Yes Ideally most of the little hacks I introduced when originally
teaching ls-files and grep to recurse could be removed.

> 
> I do not have comments on the patches in the middle, but they
> cleared up some of the questions that I asked in the early patches.
> 
> Thanks,
> Stefan

-- 
Brandon Williams

^ permalink raw reply	[relevance 8%]

* Re: [PATCH 06/31] repo: introduce the repository object
  2017-06-01 19:53   ` Re: [PATCH 06/31] repo: introduce the repository object Stefan Beller
@ 2017-06-05 17:53     ` Brandon Williams
  0 siblings, 0 replies; 200+ results
From: Brandon Williams @ 2017-06-05 17:53 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Jonathan Nieder, Jacob Keller, Johannes Schindelin, brian m. carlson, Ben Peart, Duy Nguyen, Junio C Hamano, Jeff King, Jeff Hostetler, Ævar Arnfjörð Bjarmason, Jonathan Tan

On 06/01, Stefan Beller wrote:
> On Wed, May 31, 2017 at 2:43 PM, Brandon Williams <bmwill@google.com> wrote:
> > Introduce the repository object 'struct repo' which can be used hold all
> > state pertaining to a git repository.
> >
> > The aim of object-ifying the repository is to (1) make the code base
> > more readable and easier to reason about and (2) allow for working on
> > multiple repositories, specifically submodules, within the same process.
> >
> > TODO: Add more motivating points for adding a repository object?
> 
> Yes please (or delete this line).
> https://public-inbox.org/git/alpine.DEB.2.21.1.1705221501540.3610@virtualbox/
> 
> > +++ b/repo.c
> > @@ -0,0 +1,124 @@
> > +#include "cache.h"
> > +#include "repo.h"
> > +
> > +/*
> > + * This may be the wrong place for this.
> > + * It may be better to go in env.c or setup for the time being?
> 
> In env.c we say:
> /*
>  * We put all the git config variables in this same object
>  * file, so that programs can link against the config parser
>  * without having to link against all the rest of git.
>  *
>  * In particular, no need to bring in libz etc unless needed,
>  * even if you might want to know where the git directory etc
>  * are.
>  */
> 
> And setup.c only has a few variables that matter there locally.
> So I would think having 'the_repository' in repo.c is acceptable.

And perhaps (far down the road) 'the_repoository' could be removed such
that builtin commands take a pointer to a repo object as a parameter.

> 
> > + */
> > +struct repo the_repository;
> > +
> > +static char *git_path_from_env(const char *envvar, const char *git_dir,
> > +                              const char *path, int fromenv)
> > +{
> > +       if (fromenv) {
> > +               const char *value = getenv(envvar);
> > +               if (value)
> > +                       return xstrdup(value);
> > +       }
> > +
> > +       return xstrfmt("%s/%s", git_dir, path);
> > +}
> > +
> > +static int find_common_dir(struct strbuf *sb, const char *gitdir, int fromenv)
> > +{
> > +       if (fromenv) {
> > +               const char *value = getenv(GIT_COMMON_DIR_ENVIRONMENT);
> > +               if (value) {
> > +                       strbuf_addstr(sb, value);
> > +                       return 1;
> > +               }
> > +       }
> > +
> > +       return get_common_dir_noenv(sb, gitdir);
> > +}
> > +
> > +/* called after setting gitdir */
> > +static void repo_setup_env(struct repo *repo)
> > +{
> > +       struct strbuf sb = STRBUF_INIT;
> > +
> > +       if (!repo->gitdir)
> > +               BUG("gitdir wasn't set before setting up the environment");
> > +
> > +       repo->different_commondir = find_common_dir(&sb, repo->gitdir,
> > +                                                   !repo->ignore_env);
> > +       repo->commondir = strbuf_detach(&sb, NULL);
> > +       repo->objectdir = git_path_from_env(DB_ENVIRONMENT, repo->commondir,
> > +                                           "objects", !repo->ignore_env);
> > +       repo->index_file = git_path_from_env(INDEX_ENVIRONMENT, repo->gitdir,
> > +                                            "index", !repo->ignore_env);
> > +       repo->graft_file = git_path_from_env(GRAFT_ENVIRONMENT, repo->commondir,
> > +                                            "info/grafts", !repo->ignore_env);
> > +       repo->namespace = expand_namespace(repo->ignore_env ? NULL :
> > +                                          getenv(GIT_NAMESPACE_ENVIRONMENT));
> > +}
> > +
> > +static void repo_clear_env(struct repo *repo)
> > +{
> > +       free(repo->gitdir);
> > +       repo->gitdir = NULL;
> > +       free(repo->commondir);
> > +       repo->commondir = NULL;
> > +       free(repo->objectdir);
> > +       repo->objectdir = NULL;
> > +       free(repo->index_file);
> > +       repo->index_file = NULL;
> > +       free(repo->graft_file);
> > +       repo->graft_file = NULL;
> > +       free(repo->namespace);
> > +       repo->namespace = NULL;
> 
> I wonder if we can defer the NULL assignments to
> repo_clear, where we would just do a
> memset(repo, 0, sizeof(struct repo));
> 

Yeah perhaps, clearing the env should either be done when setting gitdir
again (so setting up the env will happen again and we don't need to
clear the fields) or clearing the struct as a whole so using memset
would work.

> > +
> > +       repo_set_gitdir(repo, resolved_gitdir);
> > +
> > +       /* NEEDSWORK: Verify repository format version */
> 
> Care to elaborate on this? I do not understand why we would want
> to check the format version here?

When opening up a repository git needs to check if it understands the
repository format and all extensions.  If it doesn't git needs to bail
out and not operate on the repository.  So a part of initializing a repo
object would be to verify that it understands the repository format
version.

> 
> > +
> > +extern void repo_set_gitdir(struct repo *repo, const char *path);
> > +extern int repo_init(struct repo *repo, const char *path);
> > +extern void repo_clear(struct repo *repo);
> 
> The init and clear method seem obvious to me, but what do we need the
> repo_set_gitdir for externally? I would assume the repo auto-discovers its
> gitdir on its own?

Well I didn't completely overhaul the setup code in this series so I
really just hooked into where the setup code already sets the gitdir,
hence why this function is exposed atm.

-- 
Brandon Williams

^ permalink raw reply	[relevance 8%]

* Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)
      [irrelevant] <xmqq1sqzkrui.fsf@gitster.mtv.corp.google.com>
@ 2017-06-05 18:23 ` Stefan Beller
  2017-06-06  6:44   ` Jacob Keller
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-06-05 18:23 UTC (permalink / raw)
  To: Junio C Hamano, Jacob Keller, Michael Haggerty; +Cc: git

> * sb/diff-color-move (2017-06-01) 17 commits
>  - diff.c: color moved lines differently
>  - diff: buffer all output if asked to
>  - diff.c: emit_line includes whitespace highlighting
>  - diff.c: convert diff_summary to use emit_line_*
>  - diff.c: convert diff_flush to use emit_line_*
>  - diff.c: convert word diffing to use emit_line_*
>  - diff.c: convert show_stats to use emit_line_*
>  - diff.c: convert emit_binary_diff_body to use emit_line_*
>  - submodule.c: convert show_submodule_summary to use emit_line_fmt
>  - diff.c: convert emit_rewrite_lines to use emit_line_*
>  - diff.c: convert emit_rewrite_diff to use emit_line_*
>  - diff.c: convert builtin_diff to use emit_line_*
>  - diff.c: convert fn_out_consume to use emit_line
>  - diff: introduce more flexible emit function
>  - diff.c: factor out diff_flush_patch_all_file_pairs
>  - diff: move line ending check into emit_hunk_header
>  - diff: readability fix
>
>  "git diff" has been taught to optionally paint new lines that are
>  the same as deleted lines elsewhere differently from genuinely new
>  lines.
>
>  Are we happy with these changes?

I advertised this series e.g. for reviewing Brandons
repo object refactoring series and used it myself to inspect
some patches there[1]. I am certainly happy (but biased) with
what we have available there.

Jacob intended to use this series
for review as well, but has given no opinion yet.

You seemed to have used it for js/blame-lib?

--
Those patches had a wide reviewer audience cc'd,
so I would think people are aware of this series.

--
Things to come, but not in this series as they are more advanced:

    Discuss if a block/line needs a minimum requirement.

When doing reviews with this series, a couple of lines such
as "\t\t}" were marked as a moved, which is not wrong as they
really occurred in the text with opposing sign.
But it was annoying as it drew my attention to just closing
braces, which IMO is not the point of code review.

To solve this issue I had the idea of a "minimum requirement", e.g.
* at least 3 consecutive lines or
* at least one line with at least 3 non-ws characters or
* compute the entropy of a given moved block and if it is too low, do
  not mark it up.

I am not sure if such a "minimum requirement" is the right approach
at all. The nature of this discussion comes close to the diff heuristics
at which Michael did present a wonderful solution, hence I had him cc'd
on the series as he may have some good insights on how to improve
the diffs. :)

--
In conclusion:

We are happy to move to next as it seems technically sound.

But we want more exposure on usage to point out UX bugs.
(e.g. is the default mode for just giving --color-moved good for the
majority of people/use cases? Are there subtle annoyances such
as the closing braces?)

So maybe merge to next with the strong option to evict it
when finding more fundamentally wrong things?

Thanks,
Stefan

[1]
https://public-inbox.org/git/CAGZ79kZJF9iDsVgyi-hSKb6N8w0uhVCU4W-r89F0eRJPXe_4Og@mail.gmail.com/

^ permalink raw reply	[relevance 7%]

* Re: [PATCH] test-lib: add ability to cap the runtime of tests
      [irrelevant]       ` <CACBZZX5FP_jxXaT+NW8g2JqH89iYajHPjHhxCj=_vWnkxZ=rYQ@mail.gmail.com>
@ 2017-06-05 19:03         ` Stefan Beller
  2017-06-05 20:37           ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-06-05 19:03 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Lars Schneider, Junio C Hamano, Git Mailing List

> That's never going to be a problem on a less beefy machine with
> --state=slow,save, since the 30s test is going to be long over by the
> time the rest of the tests run.
>
> Cutting down on these long tail tests allows me to e.g. replace this:
>
>     git rebase -i --exec '(make -j56 all && cd t && prove -j56 <some
> limited glob>)'
>
> With a glob that runs the entire test suite, with the rebase only
> taking marginally longer in most cases while getting much better test
> coverage than I'd otherwise bother with.

I wonder if this functionality is rather best put into prove?

Also prove doesn't know which tests are "interesting",
e.g. if you were working on interactive rebase, then you really
want the longest test to be run in full?

And this "judge by time, not by interest" doesn't bode well with
me.

I have a non-beefy machine such that this particular problem
doesn't apply to me, but instead the whole test suite takes just
long to run.

For that I reduce testing intelligently, i.e. I know where I am
working on, so I run only some given tests (in case of
submodules I'd go with "prove t74*") which would also fix
your issue IIUC?

^ permalink raw reply	[relevance 15%]

* [GSoC][PATCH v2 1/2] submodule: port set_name_rev from shell to C
  2017-05-22 21:28   ` Re: [GSoC][PATCH v1 2/2] submodule: port submodule subcommand status Stefan Beller
@ 2017-06-05 20:25     ` Prathamesh Chavan
  2017-06-05 20:25       ` [GSoC][PATCH v2 2/2] submodule: port submodule subcommand status Prathamesh Chavan
                         ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Prathamesh Chavan @ 2017-06-05 20:25 UTC (permalink / raw)
  To: sbeller; +Cc: git, christian.couder, Prathamesh Chavan

Since later on we want to port submodule subcommand status, and since
set_name_rev is part of cmd_status, hence this function is ported. It
has been ported to function print_name_rev in C, which calls get_name_rev
to get the revname, and after formatting it, print_name_rev prints it.
And hence in this way, the command `git submodule--helper print-name-rev
"sm_path" "sha1"` sets value of revname in git-submodule.sh

The function get_name_rev returns the stdout of the git describe
commands. Since there are four different git-describe commands used for
generating the name rev, four child_process are introduced, each successive
child process running only when previous has no stdout. The order of these
four git-describe commands is maintained the same as it was in the function
set_name_rev() in shell script.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
---
 builtin/submodule--helper.c | 67 +++++++++++++++++++++++++++++++++++++++++++++
 git-submodule.sh            | 16 ++---------
 2 files changed, 69 insertions(+), 14 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 566a5b6a6..3022118d1 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -219,6 +219,72 @@ static int resolve_relative_url_test(int argc, const char **argv, const char *pr
 	return 0;
 }
 
+enum describe_step {
+	step_bare = 0,
+	step_tags,
+	step_contains,
+	step_all_always,
+	step_end
+};
+
+static char *get_name_rev(int argc, const char **argv, const char *prefix)
+{
+	struct child_process cp;
+	struct strbuf sb = STRBUF_INIT;
+	enum describe_step cur_step;
+
+	for (cur_step = step_bare; cur_step < step_end; cur_step++) {
+		child_process_init(&cp);
+		prepare_submodule_repo_env(&cp.env_array);
+		cp.dir = argv[1];
+		cp.no_stderr = 1;
+
+		switch (cur_step) {
+			case step_bare:
+				argv_array_pushl(&cp.args, "git", "describe",
+						 argv[2], NULL);
+				break;
+			case step_tags:
+				argv_array_pushl(&cp.args, "git", "describe",
+						 "--tags", argv[2], NULL);
+				break;
+			case step_contains:
+				argv_array_pushl(&cp.args, "git", "describe",
+						 "--contains", argv[2], NULL);
+				break;
+			case step_all_always:
+				argv_array_pushl(&cp.args, "git", "describe",
+						 "--all", "--always", argv[2],
+						 NULL);
+				break;
+			default:
+				BUG("unknown describe step '%d'", cur_step);
+		}
+
+		if (!capture_command(&cp, &sb, 0) && sb.len) {
+			strbuf_strip_suffix(&sb, "\n");
+			return strbuf_detach(&sb, NULL);
+		}
+	}
+
+	strbuf_release(&sb);
+	return NULL;
+}
+
+static int print_name_rev(int argc, const char **argv, const char *prefix)
+{
+	char *namerev;
+	if (argc != 3)
+		die("print-name-rev only accepts two arguments: <path> <sha1>");
+
+	namerev = get_name_rev(argc, argv, prefix);
+	if (namerev && namerev[0])
+		printf(" (%s)", namerev);
+	printf("\n");
+
+	return 0;
+}
+
 struct module_list {
 	const struct cache_entry **entries;
 	int alloc, nr;
@@ -1212,6 +1278,7 @@ static struct cmd_struct commands[] = {
 	{"relative-path", resolve_relative_path, 0},
 	{"resolve-relative-url", resolve_relative_url, 0},
 	{"resolve-relative-url-test", resolve_relative_url_test, 0},
+	{"print-name-rev", print_name_rev, 0},
 	{"init", module_init, SUPPORT_SUPER_PREFIX},
 	{"remote-branch", resolve_remote_submodule_branch, 0},
 	{"push-check", push_check, 0},
diff --git a/git-submodule.sh b/git-submodule.sh
index c0d0e9a4c..091051891 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -758,18 +758,6 @@ cmd_update()
 	}
 }
 
-set_name_rev () {
-	revname=$( (
-		sanitize_submodule_env
-		cd "$1" && {
-			git describe "$2" 2>/dev/null ||
-			git describe --tags "$2" 2>/dev/null ||
-			git describe --contains "$2" 2>/dev/null ||
-			git describe --all --always "$2"
-		}
-	) )
-	test -z "$revname" || revname=" ($revname)"
-}
 #
 # Show commit summary for submodules in index or working tree
 #
@@ -1041,14 +1029,14 @@ cmd_status()
 		fi
 		if git diff-files --ignore-submodules=dirty --quiet -- "$sm_path"
 		then
-			set_name_rev "$sm_path" "$sha1"
+			revname=$(git submodule--helper print-name-rev "$sm_path" "$sha1")
 			say " $sha1 $displaypath$revname"
 		else
 			if test -z "$cached"
 			then
 				sha1=$(sanitize_submodule_env; cd "$sm_path" && git rev-parse --verify HEAD)
 			fi
-			set_name_rev "$sm_path" "$sha1"
+			revname=$(git submodule--helper print-name-rev "$sm_path" "$sha1")
 			say "+$sha1 $displaypath$revname"
 		fi
 
-- 
2.13.0


^ permalink raw reply	[relevance 20%]

* [GSoC][PATCH v2 2/2] submodule: port submodule subcommand status
  2017-06-05 20:25     ` [GSoC][PATCH v2 1/2] submodule: port set_name_rev from shell to C Prathamesh Chavan
@ 2017-06-05 20:25       ` Prathamesh Chavan
  2017-06-05 23:12         ` Stefan Beller
  2017-06-05 22:50       ` Stefan Beller
  2017-06-05 23:20       ` Brandon Williams
  2 siblings, 1 reply; 200+ results
From: Prathamesh Chavan @ 2017-06-05 20:25 UTC (permalink / raw)
  To: sbeller; +Cc: git, christian.couder, Prathamesh Chavan

This aims to make git-submodule subcommand status a builtin. Here
'status' is ported to submodule--helper, and submodule--helper is
called from git-submodule.sh.

For the purpose of porting cmd_status, the code is split up such that
one function obtains all the list of submodules, acting as the front-end
of git-submodule status. This function later calls the second function
for_each_submodule_list,it which basically loops through the list of
submodules and calls function fn, which in this case is status_submodule.
The third function, status submodule returns the status of submodule and
also takes care of the recursive flag.

The first function module_status parses the options present in argv,
and then with the help of module_list_compute, generates the list of
submodules present in the current working tree.

The second function for_each_submodule_list traverses through the list,
and calls function fn (which in the case of submodule subcommand status
is status_submodule) is called for each entry.

The third function status_submodule checks for the various conditions,
and prints the status of the submodule accordingly. Also, this function
takes care of the recursive flag by creating a separate child_process
and running it inside the submodule. The function print_status handles the
printing of submodule's status.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
---
In this new version of patch, function print_status is introduced.

The functions for_each_submodule_list and get_submodule_displaypath
are found to be the same as those in the ported submodule subcommand
foreach's patches. The reason for doing so is to keep both the patches
independant and on separate branches. Also this patch is build on
the branch gitster/jk/bug-to-abort for utilizing its BUG() macro.
 
Complete build report is available at:
https://travis-ci.org/pratham-pc/git/builds/
Branch: submodule-status-new
Build #91

 builtin/submodule--helper.c | 181 ++++++++++++++++++++++++++++++++++++++++++++
 git-submodule.sh            |  49 +-----------
 2 files changed, 182 insertions(+), 48 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 3022118d1..85da05550 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -13,6 +13,9 @@
 #include "refs.h"
 #include "connect.h"
 
+typedef void (*submodule_list_func_t)(const struct cache_entry *list_item,
+				      void *cb_data);
+
 static char *get_default_remote(void)
 {
 	char *dest = NULL, *ret;
@@ -219,6 +222,25 @@ static int resolve_relative_url_test(int argc, const char **argv, const char *pr
 	return 0;
 }
 
+static char *get_submodule_displaypath(const char *path, const char *prefix)
+{
+	const char *super_prefix = get_super_prefix();
+
+	if (prefix && super_prefix) {
+		BUG("cannot have prefix '%s' and superprefix '%s'",
+		    prefix, super_prefix);
+	} else if (prefix) {
+		struct strbuf sb = STRBUF_INIT;
+		char *displaypath = xstrdup(relative_path(path, prefix, &sb));
+		strbuf_release(&sb);
+		return displaypath;
+	} else if (super_prefix) {
+		return xstrfmt("%s/%s", super_prefix, path);
+	} else {
+		return xstrdup(path);
+	}
+}
+
 enum describe_step {
 	step_bare = 0,
 	step_tags,
@@ -397,6 +419,13 @@ static int module_list(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
+static void for_each_submodule_list(const struct module_list list, submodule_list_func_t fn, void *cb_data)
+{
+	int i;
+	for (i = 0; i < list.nr; i++)
+		fn(list.entries[i], cb_data);
+}
+
 static void init_submodule(const char *path, const char *prefix, int quiet)
 {
 	const struct submodule *sub;
@@ -534,6 +563,157 @@ static int module_init(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
+struct status_cb {
+	const char *prefix;
+	unsigned int quiet: 1;
+	unsigned int recursive: 1;
+	unsigned int cached: 1;
+};
+#define STATUS_CB_INIT { NULL, 0, 0, 0 }
+
+static void print_status(struct status_cb *info, char state, const char *path,
+			 char *sub_sha1, char *displaypath)
+{
+	if (info->quiet)
+		return;
+
+	printf("%c%s %s", state, sub_sha1, displaypath);
+
+	if (state == ' ' || state == '+') {
+		struct argv_array name_rev_args = ARGV_ARRAY_INIT;
+
+		argv_array_pushl(&name_rev_args, "print-name-rev",
+				 path, sub_sha1, NULL);
+		print_name_rev(name_rev_args.argc, name_rev_args.argv,
+			       info->prefix);
+	} else {
+		printf("\n");
+	}
+}
+
+static void status_submodule(const struct cache_entry *list_item, void *cb_data)
+{
+	struct status_cb *info = cb_data;
+	char *sub_sha1 = xstrdup(oid_to_hex(&list_item->oid));
+	char *displaypath;
+	struct argv_array diff_files_args = ARGV_ARRAY_INIT;
+
+	if (!submodule_from_path(null_sha1, list_item->name))
+		die(_("no submodule mapping found in .gitmodules for path '%s'"),
+		      list_item->name);
+
+	displaypath = get_submodule_displaypath(list_item->name, info->prefix);
+
+	if (list_item->ce_flags) {
+		print_status(info, 'U', list_item->name,
+			     sha1_to_hex(null_sha1), displaypath);
+		goto cleanup;
+	}
+
+	if (!is_submodule_initialized(list_item->name)) {
+		print_status(info, '-', list_item->name, sub_sha1, displaypath);
+		goto cleanup;
+	}
+
+	argv_array_pushl(&diff_files_args, "diff-files",
+			 "--ignore-submodules=dirty", "--quiet", "--",
+			 list_item->name, NULL);
+
+	if (!cmd_diff_files(diff_files_args.argc, diff_files_args.argv,
+			    info->prefix)) {
+		print_status(info, ' ', list_item->name, sub_sha1, displaypath);
+	} else {
+		if (!info->cached) {
+			struct child_process cp = CHILD_PROCESS_INIT;
+			struct strbuf sb = STRBUF_INIT;
+
+			prepare_submodule_repo_env(&cp.env_array);
+			cp.git_cmd = 1;
+			cp.dir = list_item->name;
+
+			argv_array_pushl(&cp.args, "rev-parse",
+					 "--verify", "HEAD", NULL);
+
+			if (capture_command(&cp, &sb, 0))
+				die(_("could not run 'git rev-parse --verify"
+				      "HEAD' in submodule %s"),
+				      list_item->name);
+
+			strbuf_strip_suffix(&sb, "\n");
+			print_status(info, '+', list_item->name, sb.buf,
+				     displaypath);
+			strbuf_release(&sb);
+		} else {
+			print_status(info, '+', list_item->name, sub_sha1,
+				     displaypath);
+		}
+	}
+
+	if (info->recursive) {
+		struct child_process cpr = CHILD_PROCESS_INIT;
+
+		cpr.git_cmd = 1;
+		cpr.dir = list_item->name;
+		prepare_submodule_repo_env(&cpr.env_array);
+
+		argv_array_pushl(&cpr.args, "--super-prefix", displaypath,
+				 "submodule--helper", "status", "--recursive",
+				 NULL);
+
+		if (info->cached)
+			argv_array_push(&cpr.args, "--cached");
+
+		if (info->quiet)
+			argv_array_push(&cpr.args, "--quiet");
+
+		if (run_command(&cpr))
+			die(_("failed to recurse into submodule '%s'"),
+			      list_item->name);
+	}
+
+cleanup:
+	free(displaypath);
+	free(sub_sha1);
+}
+
+static int module_status(int argc, const char **argv, const char *prefix)
+{
+	struct status_cb info = STATUS_CB_INIT;
+	struct pathspec pathspec;
+	struct module_list list = MODULE_LIST_INIT;
+	int quiet = 0;
+	int cached = 0;
+	int recursive = 0;
+
+	struct option module_status_options[] = {
+		OPT__QUIET(&quiet, N_("Suppress submodule status output")),
+		OPT_BOOL(0, "cached", &cached, N_("Use commit stored in the index instead of the one stored in the submodule HEAD")),
+		OPT_BOOL(0, "recursive", &recursive, N_("Recurse into nested submodules")),
+		OPT_END()
+	};
+
+	const char *const git_submodule_helper_usage[] = {
+		N_("git submodule status [--quiet] [--cached] [--recursive] [<path>]"),
+		NULL
+	};
+
+	argc = parse_options(argc, argv, prefix, module_status_options,
+			     git_submodule_helper_usage, 0);
+
+	if (module_list_compute(argc, argv, prefix, &pathspec, &list) < 0)
+		return 1;
+
+	info.prefix = prefix;
+	info.quiet = !!quiet;
+	info.recursive = !!recursive;
+	info.cached = !!cached;
+
+	gitmodules_config();
+	for_each_submodule_list(list, status_submodule, &info);
+
+	return 0;
+}
+
 static int module_name(int argc, const char **argv, const char *prefix)
 {
 	const struct submodule *sub;
@@ -1280,6 +1460,7 @@ static struct cmd_struct commands[] = {
 	{"resolve-relative-url-test", resolve_relative_url_test, 0},
 	{"print-name-rev", print_name_rev, 0},
 	{"init", module_init, SUPPORT_SUPER_PREFIX},
+	{"status", module_status, SUPPORT_SUPER_PREFIX},
 	{"remote-branch", resolve_remote_submodule_branch, 0},
 	{"push-check", push_check, 0},
 	{"absorb-git-dirs", absorb_git_dirs, SUPPORT_SUPER_PREFIX},
diff --git a/git-submodule.sh b/git-submodule.sh
index 091051891..a24b1b91b 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -1004,54 +1004,7 @@ cmd_status()
 		shift
 	done
 
-	{
-		git submodule--helper list --prefix "$wt_prefix" "$@" ||
-		echo "#unmatched" $?
-	} |
-	while read -r mode sha1 stage sm_path
-	do
-		die_if_unmatched "$mode" "$sha1"
-		name=$(git submodule--helper name "$sm_path") || exit
-		displaypath=$(git submodule--helper relative-path "$prefix$sm_path" "$wt_prefix")
-		if test "$stage" = U
-		then
-			say "U$sha1 $displaypath"
-			continue
-		fi
-		if ! git submodule--helper is-active "$sm_path" ||
-		{
-			! test -d "$sm_path"/.git &&
-			! test -f "$sm_path"/.git
-		}
-		then
-			say "-$sha1 $displaypath"
-			continue;
-		fi
-		if git diff-files --ignore-submodules=dirty --quiet -- "$sm_path"
-		then
-			revname=$(git submodule--helper print-name-rev "$sm_path" "$sha1")
-			say " $sha1 $displaypath$revname"
-		else
-			if test -z "$cached"
-			then
-				sha1=$(sanitize_submodule_env; cd "$sm_path" && git rev-parse --verify HEAD)
-			fi
-			revname=$(git submodule--helper print-name-rev "$sm_path" "$sha1")
-			say "+$sha1 $displaypath$revname"
-		fi
-
-		if test -n "$recursive"
-		then
-			(
-				prefix="$displaypath/"
-				sanitize_submodule_env
-				wt_prefix=
-				cd "$sm_path" &&
-				eval cmd_status
-			) ||
-			die "$(eval_gettext "Failed to recurse into submodule path '\$sm_path'")"
-		fi
-	done
+	git ${wt_prefix:+-C "$wt_prefix"} ${prefix:+--super-prefix "$prefix"} submodule--helper status ${GIT_QUIET:+--quiet} ${cached:+--cached} ${recursive:+--recursive} "$@"
 }
 #
 # Sync remote urls for submodules
-- 
2.13.0


^ permalink raw reply	[relevance 19%]

* Re: [PATCH] test-lib: add ability to cap the runtime of tests
  2017-06-05 19:03         ` Re: [PATCH] test-lib: add ability to cap the runtime of tests Stefan Beller
@ 2017-06-05 20:37           ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 200+ results
From: Ævar Arnfjörð Bjarmason @ 2017-06-05 20:37 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Lars Schneider, Junio C Hamano, Git Mailing List

On Mon, Jun 5, 2017 at 9:03 PM, Stefan Beller <sbeller@google.com> wrote:
>> That's never going to be a problem on a less beefy machine with
>> --state=slow,save, since the 30s test is going to be long over by the
>> time the rest of the tests run.
>>
>> Cutting down on these long tail tests allows me to e.g. replace this:
>>
>>     git rebase -i --exec '(make -j56 all && cd t && prove -j56 <some
>> limited glob>)'
>>
>> With a glob that runs the entire test suite, with the rebase only
>> taking marginally longer in most cases while getting much better test
>> coverage than I'd otherwise bother with.
>
> I wonder if this functionality is rather best put into prove?

It would be nice to have a general facility to abort & kill tests
based on some criteria as they're run by Test::Harness, but making
that work reliably with all the edge cases prove needs to deal with
(tens/hundreds of thousands of test suites) is a much bigger project
than this.

> Also prove doesn't know which tests are "interesting",
> e.g. if you were working on interactive rebase, then you really
> want the longest test to be run in full?

If I were hacking rebase or another feature which has such a long
running test then the long running test without the timeout would be
part of my "regular" testing.

The point of this feature is that most tests aren't like that, then
you can use this and do the full test suite every time.

> And this "judge by time, not by interest" doesn't bode well with
> me.

They're not mutually exclusive.

> I have a non-beefy machine such that this particular problem
> doesn't apply to me, but instead the whole test suite takes just
> long to run.
>
> For that I reduce testing intelligently, i.e. I know where I am
> working on, so I run only some given tests (in case of
> submodules I'd go with "prove t74*") which would also fix
> your issue IIUC?

No, because even when you're working on e.g. "grep" something you're
doing occasionally breaks in some completely unrelated test because it
happens to cover an aspect of grep which is not part of the main
tests.

I ran into this recently while hacking the wildmatch() implementation.
There's dozens of tests all over the test suite that'll break in
subtle ways if wildmatch() breaks, often in cases where the main
wildmatch test is still passing.

Running the whole thing, even in a limited timeout fashion, has a much
higher chance of catching whatever I've screwed up earlier, before I
do an occasional full test suite run. Running the tests in 10 or 15s
is a much shorter time to wait for during a edit/compile/test cycle.

^ permalink raw reply	[relevance 7%]

* [GSoC] Update: Week 3
@ 2017-06-05 20:56 Prathamesh Chavan
  2017-06-05 22:25 ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Prathamesh Chavan @ 2017-06-05 20:56 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller, Christian Couder

SUMMARY OF MY PROJECT:

Git submodule subcommands are currently implemented by using shell script
'git-submodule.sh'. There are several reasons why we'll prefer not to
use the shell script. My project intends to convert the subcommands into
C code, thus making them builtins. This will increase Git's portability
and hence the efficiency of working with the git-submodule commands.
Link to the complete proposal: [1]

Mentors:
Stefan Beller <sbeller@google.com>
Christian Couder <christian.couder@gmail.com>

UPDATES:

As planned for the third week, most of the time was spent on improving the
conditions of the present patches, which deal with porting the following
submodule subcommands:
1. foreach: After a discussion over the issue of the path variable in
   windows, in this week my mentor, Stefan Beller came up with the
   appropriate solution for the problem after discussing it with Ramsay
   Jones. It is being posted on the mailing list for further discussion on
   including it.[2]
   Also, some changes are suggested to the posted version of ported foreach
   function which needs to be taken care of.[3]

2. status: The subcommand's porting is completed and has been discussed
   with the mentors for the last two week after which, a new version of the
   ported function is posted on the mailing list.[4][5]

3. sync: The subcommand's porting is completed and I am currently discussing
   it with my mentors for improving the ported function.

4. summary: Porting of this subcommand is underway and will be trying to
   finish it in the following week.

PLAN FOR WEEK-4 (6 June 2017 to 12 June 2017):

1. sync: since this ported function is currently under discussion with
   the mentors, firstly I'll be looking forward to improvising it as
   suggested and do the needful changes.

2. ported function on the mailing list: currently the ported functions
   foreach and status are on the mailing list, under discussion.
   I will be updating the patches on the list and improvising
   them as required for eventually getting these merged.

3. summary and deinit: I will resume porting submodule subcommands
   from shell to C, firstly git-submodule summary and then git-submodule
   deinit for this week.

[1]: https://docs.google.com/document/d/1krxVLooWl--75Pot3dazhfygR3wCUUWZWzTXtK1L-xU/
[2]: https://public-inbox.org/git/20170603003710.5558-1-sbeller@google.com/
[3]: https://public-inbox.org/git/20170602112428.11131-2-pc44800@gmail.com/
[4]: https://public-inbox.org/git/20170605202529.22959-1-pc44800@gmail.com/
[5]: https://public-inbox.org/git/20170605202529.22959-2-pc44800@gmail.com/

Thanks,
Prathamesh Chavan

^ permalink raw reply	[relevance 19%]

* Re: [PATCH] submodule foreach: correct $sm_path in nested submodules from a dir
  2017-06-03  0:37 ` [PATCH] submodule foreach: correct $sm_path in nested submodules from a dir Stefan Beller
  2017-06-03 14:07   ` Ramsay Jones
@ 2017-06-05 22:20   ` Jonathan Nieder
  1 sibling, 0 replies; 200+ results
From: Jonathan Nieder @ 2017-06-05 22:20 UTC (permalink / raw)
  To: Stefan Beller; +Cc: bmwill, gitster, git, pc44800, ramsay

Hi,

This patch seems to aim to do multiple things.  Initial thoughts:

Stefan Beller wrote:

[...]
> To ameliorate the situation, perform these changes
> * Document 'sm_path' instead of 'path'.
>   As using a variable '$path' may be harmful to users due to
>   capitalization issues, see 64394e3ae9 (git-submodule.sh: Don't
>   use $path variable in eval_gettext string, 2012-04-17). Adjust
>   the documentation to advocate for using $sm_path,  which contains
>   the same value. We still make the 'path' variable available,
>   though not documented.

Making sm_path part of the public API as described here sounds like a
good idea (as a separate patch), to avoid conflicting with $PATH on
Windows.  It's convenient that scripts have access to the private
variable 'sm_path'.  The 'path' variable would still need to be
documented as a deprecated synonym so people working with existing
scripts can know how to update them.

> * Clarify the 'toplevel' variable documentation.
>   It does not contain the topmost superproject as the author assumed,
>   but the direct superproject, such that $toplevel/$sm_path is the
>   actual absolute path of the submodule.

This is very confusing.  I suspect it's a bug.  Can we make 'toplevel'
point to the topmost superproject (as a separate path)?

> * The variable '$displaypath' was accessible but undocumented.
>   Rename it '$displaypath' to '$dpath'. Document what it contains.
>   Users that are broken by the behavior change of 'sm_path' introduced
>   in this commit, can switch from '$path' to '$dpath'.

What does dpath stand for?  Renaming the variable to $dpath means that
scripts trying to adapt to this change would not work with previous
versions of git.  Would it make sense to use $displaypath for this for
compatibility?

What is the intent behind the sm_path behavior change in this patch?
Stepping back, what kind of scripts is this interface meant to support
(e.g., what is an example script that used this interface that would
be affected), and is there a straightforward way to support those use
cases without breaking existing scripts except where necessary?

To summarize, the patch leaves me a bit confused.  I think it would be
best to have multiple patches that solve one problem at a time, which
would hopefully make the story clearer.

Thanks and hope that helps,
Jonathan

^ permalink raw reply	[relevance 16%]

* Re: [GSoC] Update: Week 3
  2017-06-05 20:56 [GSoC] Update: Week 3 Prathamesh Chavan
@ 2017-06-05 22:25 ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-06-05 22:25 UTC (permalink / raw)
  To: Prathamesh Chavan; +Cc: git, Christian Couder

On Mon, Jun 5, 2017 at 1:56 PM, Prathamesh Chavan <pc44800@gmail.com> wrote:
> 1. foreach: After a discussion over the issue of the path variable in
>    windows, in this week my mentor, Stefan Beller came up with the
>    appropriate solution for the problem after discussing it with Ramsay
>    Jones.

Thanks for having so much faith in my abilities, but it may not be
appropriate, yet. (It does multiple things at once, which is generally
a bad sign already. )

Maybe to be unblocked on the conversion of foreach, you could make
the patch have the original behavior, i.e.

<up_path><submodule path>

which makes sense in the way that it is only converting from shell to C,
not fixing a bug along the way. As we discovered a bug, you could just put
a NEEDSWORK comment explaining what the problem is; deferring solving
the issue until later.

I'll review the other patches.

Thanks,
Stefan

^ permalink raw reply	[relevance 15%]

* Re: [GSoC][PATCH v2 1/2] submodule: port set_name_rev from shell to C
  2017-06-05 20:25     ` [GSoC][PATCH v2 1/2] submodule: port set_name_rev from shell to C Prathamesh Chavan
  2017-06-05 20:25       ` [GSoC][PATCH v2 2/2] submodule: port submodule subcommand status Prathamesh Chavan
@ 2017-06-05 22:50       ` Stefan Beller
  2017-06-05 23:20       ` Brandon Williams
  2 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-06-05 22:50 UTC (permalink / raw)
  To: Prathamesh Chavan; +Cc: git, Christian Couder

On Mon, Jun 5, 2017 at 1:25 PM, Prathamesh Chavan <pc44800@gmail.com> wrote:
> Since later on we want to port submodule subcommand status, and since
> set_name_rev is part of cmd_status, hence this function is ported. It
> has been ported to function print_name_rev in C, which calls get_name_rev
> to get the revname, and after formatting it, print_name_rev prints it.
> And hence in this way, the command `git submodule--helper print-name-rev
> "sm_path" "sha1"` sets value of revname in git-submodule.sh
>
> The function get_name_rev returns the stdout of the git describe
> commands. Since there are four different git-describe commands used for
> generating the name rev, four child_process are introduced, each successive
> child process running only when previous has no stdout. The order of these
> four git-describe commands is maintained the same as it was in the function
> set_name_rev() in shell script.
>
> Mentored-by: Christian Couder <christian.couder@gmail.com>
> Mentored-by: Stefan Beller <sbeller@google.com>
> Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
> ---
>  builtin/submodule--helper.c | 67 +++++++++++++++++++++++++++++++++++++++++++++
>  git-submodule.sh            | 16 ++---------
>  2 files changed, 69 insertions(+), 14 deletions(-)
>
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index 566a5b6a6..3022118d1 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -219,6 +219,72 @@ static int resolve_relative_url_test(int argc, const char **argv, const char *pr
>         return 0;
>  }
>
> +enum describe_step {
> +       step_bare = 0,

Do we rely on step_bare to be equal to 0?
(This is the hint I am reading from '=0' here.
If we do not, please omit.)

> +       step_tags,
> +       step_contains,
> +       step_all_always,
> +       step_end
> +};
> +
> +static char *get_name_rev(int argc, const char **argv, const char *prefix)

So we split up the functionality into two functions.
get_name_rev, which does the heavy lifting work, and
print_name_rev, that is a wrapper around having to deal with
going from shell to C and back.

One of C strength' compared to shell is type safety,
so maybe we can tighten the contract that get_name_rev
offers to its callers and make it

  get_name_rev(const char *sub_path, const char *object_id / sha1)

and then have print_name_rev call it via

  get_name_rev (argv[1], argv[2])

(which coincidentally is right after checking for
argc != 3, which reinforces that the contract of the
wrapper is "just making sure we have valid input" and
this function "just does heavy lifting, assuming input
is valid".

> +{
> +       struct child_process cp;
> +       struct strbuf sb = STRBUF_INIT;
> +       enum describe_step cur_step;
> +
> +       for (cur_step = step_bare; cur_step < step_end; cur_step++) {
> +               child_process_init(&cp);

(minor nit, personal opinion, feel free to ignore:)
Alternatively, you could declare cp inside the loop assigned to
CHILD_PROCESS_INIT.
Same for strbuf sb as well, such that you only declare the iterator
variable outside the loop.

> +               prepare_submodule_repo_env(&cp.env_array);
> +               cp.dir = argv[1];
> +               cp.no_stderr = 1;

cp.git_cmd = 1; as well?

Thanks,
Stefan

^ permalink raw reply	[relevance 15%]

* Re: [GSoC][PATCH v2 2/2] submodule: port submodule subcommand status
  2017-06-05 20:25       ` [GSoC][PATCH v2 2/2] submodule: port submodule subcommand status Prathamesh Chavan
@ 2017-06-05 23:12         ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-06-05 23:12 UTC (permalink / raw)
  To: Prathamesh Chavan; +Cc: git, Christian Couder

On Mon, Jun 5, 2017 at 1:25 PM, Prathamesh Chavan <pc44800@gmail.com> wrote:
> This aims to make git-submodule subcommand status a builtin. Here
> 'status' is ported to submodule--helper, and submodule--helper is
> called from git-submodule.sh.
>
> For the purpose of porting cmd_status, the code is split up such that
> one function obtains all the list of submodules, acting as the front-end
> of git-submodule status. This function later calls the second function
> for_each_submodule_list,it which basically loops through the list of
> submodules and calls function fn, which in this case is status_submodule.
> The third function, status submodule returns the status of submodule and
> also takes care of the recursive flag.
>
> The first function module_status parses the options present in argv,
> and then with the help of module_list_compute, generates the list of
> submodules present in the current working tree.
>
> The second function for_each_submodule_list traverses through the list,
> and calls function fn (which in the case of submodule subcommand status
> is status_submodule) is called for each entry.
>
> The third function status_submodule checks for the various conditions,
> and prints the status of the submodule accordingly. Also, this function
> takes care of the recursive flag by creating a separate child_process
> and running it inside the submodule. The function print_status handles the
> printing of submodule's status.
>
> Mentored-by: Christian Couder <christian.couder@gmail.com>
> Mentored-by: Stefan Beller <sbeller@google.com>
> Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
> ---
> In this new version of patch, function print_status is introduced.
>
> The functions for_each_submodule_list and get_submodule_displaypath
> are found to be the same as those in the ported submodule subcommand
> foreach's patches. The reason for doing so is to keep both the patches
> independant and on separate branches.


Maybe keep it even in a separate patch, such that
the status series becomes:
  patch 1: introduce for_each_submodule_list and get_submodule_displaypath
  patch 2: port print_name_rev
  patch 3: port status

whereas the foreach series (and other series later) could
re-use patch 1, and build on top of it.

For reviewing patches, it is fine to have the
get_submodule_displaypath is both series, though for applying
patches it for less complication/deduplication from the maintainer
I would think.

> +
> +static void print_status(struct status_cb *info, char state, const char *path,
> +                        char *sub_sha1, char *displaypath)
> +{
> +       if (info->quiet)
> +               return;
> +
> +       printf("%c%s %s", state, sub_sha1, displaypath);
> +
> +       if (state == ' ' || state == '+') {
> +               struct argv_array name_rev_args = ARGV_ARRAY_INIT;
> +
> +               argv_array_pushl(&name_rev_args, "print-name-rev",
> +                                path, sub_sha1, NULL);
> +               print_name_rev(name_rev_args.argc, name_rev_args.argv,
> +                              info->prefix);

... with the suggestion given in the print_name_rev patch, this would
become a one liner. :)


The rest looks good to me :)

^ permalink raw reply	[relevance 18%]

* Re: [GSoC][PATCH v2 1/2] submodule: port set_name_rev from shell to C
  2017-06-05 20:25     ` [GSoC][PATCH v2 1/2] submodule: port set_name_rev from shell to C Prathamesh Chavan
  2017-06-05 20:25       ` [GSoC][PATCH v2 2/2] submodule: port submodule subcommand status Prathamesh Chavan
  2017-06-05 22:50       ` Stefan Beller
@ 2017-06-05 23:20       ` Brandon Williams
  2 siblings, 0 replies; 200+ results
From: Brandon Williams @ 2017-06-05 23:20 UTC (permalink / raw)
  To: Prathamesh Chavan; +Cc: sbeller, git, christian.couder

On 06/06, Prathamesh Chavan wrote:
> Since later on we want to port submodule subcommand status, and since
> set_name_rev is part of cmd_status, hence this function is ported. It
> has been ported to function print_name_rev in C, which calls get_name_rev
> to get the revname, and after formatting it, print_name_rev prints it.
> And hence in this way, the command `git submodule--helper print-name-rev
> "sm_path" "sha1"` sets value of revname in git-submodule.sh
> 
> The function get_name_rev returns the stdout of the git describe
> commands. Since there are four different git-describe commands used for
> generating the name rev, four child_process are introduced, each successive
> child process running only when previous has no stdout. The order of these
> four git-describe commands is maintained the same as it was in the function
> set_name_rev() in shell script.
> 
> Mentored-by: Christian Couder <christian.couder@gmail.com>
> Mentored-by: Stefan Beller <sbeller@google.com>
> Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
> ---
>  builtin/submodule--helper.c | 67 +++++++++++++++++++++++++++++++++++++++++++++
>  git-submodule.sh            | 16 ++---------
>  2 files changed, 69 insertions(+), 14 deletions(-)
> 
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index 566a5b6a6..3022118d1 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -219,6 +219,72 @@ static int resolve_relative_url_test(int argc, const char **argv, const char *pr
>  	return 0;
>  }
>  
> +enum describe_step {
> +	step_bare = 0,
> +	step_tags,
> +	step_contains,
> +	step_all_always,
> +	step_end
> +};
> +
> +static char *get_name_rev(int argc, const char **argv, const char *prefix)
> +{
> +	struct child_process cp;
> +	struct strbuf sb = STRBUF_INIT;
> +	enum describe_step cur_step;
> +
> +	for (cur_step = step_bare; cur_step < step_end; cur_step++) {
> +		child_process_init(&cp);
> +		prepare_submodule_repo_env(&cp.env_array);
> +		cp.dir = argv[1];
> +		cp.no_stderr = 1;

set cp.git = 1 so that you can avoid pushing "git" onto the arg array.

> +
> +		switch (cur_step) {
> +			case step_bare:
> +				argv_array_pushl(&cp.args, "git", "describe",
> +						 argv[2], NULL);
> +				break;
> +			case step_tags:
> +				argv_array_pushl(&cp.args, "git", "describe",
> +						 "--tags", argv[2], NULL);
> +				break;
> +			case step_contains:
> +				argv_array_pushl(&cp.args, "git", "describe",
> +						 "--contains", argv[2], NULL);
> +				break;
> +			case step_all_always:
> +				argv_array_pushl(&cp.args, "git", "describe",
> +						 "--all", "--always", argv[2],
> +						 NULL);
> +				break;
> +			default:
> +				BUG("unknown describe step '%d'", cur_step);
> +		}
> +
> +		if (!capture_command(&cp, &sb, 0) && sb.len) {
> +			strbuf_strip_suffix(&sb, "\n");
> +			return strbuf_detach(&sb, NULL);
> +		}
> +	}
> +
> +	strbuf_release(&sb);
> +	return NULL;
> +}
> +
> +static int print_name_rev(int argc, const char **argv, const char *prefix)
> +{
> +	char *namerev;
> +	if (argc != 3)
> +		die("print-name-rev only accepts two arguments: <path> <sha1>");
> +
> +	namerev = get_name_rev(argc, argv, prefix);
> +	if (namerev && namerev[0])
> +		printf(" (%s)", namerev);
> +	printf("\n");
> +
> +	return 0;
> +}
> +
>  struct module_list {
>  	const struct cache_entry **entries;
>  	int alloc, nr;
> @@ -1212,6 +1278,7 @@ static struct cmd_struct commands[] = {
>  	{"relative-path", resolve_relative_path, 0},
>  	{"resolve-relative-url", resolve_relative_url, 0},
>  	{"resolve-relative-url-test", resolve_relative_url_test, 0},
> +	{"print-name-rev", print_name_rev, 0},
>  	{"init", module_init, SUPPORT_SUPER_PREFIX},
>  	{"remote-branch", resolve_remote_submodule_branch, 0},
>  	{"push-check", push_check, 0},
> diff --git a/git-submodule.sh b/git-submodule.sh
> index c0d0e9a4c..091051891 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -758,18 +758,6 @@ cmd_update()
>  	}
>  }
>  
> -set_name_rev () {
> -	revname=$( (
> -		sanitize_submodule_env
> -		cd "$1" && {
> -			git describe "$2" 2>/dev/null ||
> -			git describe --tags "$2" 2>/dev/null ||
> -			git describe --contains "$2" 2>/dev/null ||
> -			git describe --all --always "$2"
> -		}
> -	) )
> -	test -z "$revname" || revname=" ($revname)"
> -}
>  #
>  # Show commit summary for submodules in index or working tree
>  #
> @@ -1041,14 +1029,14 @@ cmd_status()
>  		fi
>  		if git diff-files --ignore-submodules=dirty --quiet -- "$sm_path"
>  		then
> -			set_name_rev "$sm_path" "$sha1"
> +			revname=$(git submodule--helper print-name-rev "$sm_path" "$sha1")
>  			say " $sha1 $displaypath$revname"
>  		else
>  			if test -z "$cached"
>  			then
>  				sha1=$(sanitize_submodule_env; cd "$sm_path" && git rev-parse --verify HEAD)
>  			fi
> -			set_name_rev "$sm_path" "$sha1"
> +			revname=$(git submodule--helper print-name-rev "$sm_path" "$sha1")
>  			say "+$sha1 $displaypath$revname"
>  		fi
>  
> -- 
> 2.13.0
> 

-- 
Brandon Williams

^ permalink raw reply	[relevance 7%]

* Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)
  2017-06-05 18:23 ` Re: What's cooking in git.git (Jun 2017, #03; Mon, 5) Stefan Beller
@ 2017-06-06  6:44   ` Jacob Keller
  0 siblings, 0 replies; 200+ results
From: Jacob Keller @ 2017-06-06  6:44 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Junio C Hamano, Michael Haggerty, git

On Mon, Jun 5, 2017 at 11:23 AM, Stefan Beller <sbeller@google.com> wrote:
>> * sb/diff-color-move (2017-06-01) 17 commits
>>  - diff.c: color moved lines differently
>>  - diff: buffer all output if asked to
>>  - diff.c: emit_line includes whitespace highlighting
>>  - diff.c: convert diff_summary to use emit_line_*
>>  - diff.c: convert diff_flush to use emit_line_*
>>  - diff.c: convert word diffing to use emit_line_*
>>  - diff.c: convert show_stats to use emit_line_*
>>  - diff.c: convert emit_binary_diff_body to use emit_line_*
>>  - submodule.c: convert show_submodule_summary to use emit_line_fmt
>>  - diff.c: convert emit_rewrite_lines to use emit_line_*
>>  - diff.c: convert emit_rewrite_diff to use emit_line_*
>>  - diff.c: convert builtin_diff to use emit_line_*
>>  - diff.c: convert fn_out_consume to use emit_line
>>  - diff: introduce more flexible emit function
>>  - diff.c: factor out diff_flush_patch_all_file_pairs
>>  - diff: move line ending check into emit_hunk_header
>>  - diff: readability fix
>>
>>  "git diff" has been taught to optionally paint new lines that are
>>  the same as deleted lines elsewhere differently from genuinely new
>>  lines.
>>
>>  Are we happy with these changes?
>
> I advertised this series e.g. for reviewing Brandons
> repo object refactoring series and used it myself to inspect
> some patches there[1]. I am certainly happy (but biased) with
> what we have available there.
>
> Jacob intended to use this series
> for review as well, but has given no opinion yet.

I haven't had any problems thus far. Been using it for the past few
days at $DAYJOB.

Haven't said anything yet because I haven't really had anything to
add. I like it.

Thanks,
Jake

>
> You seemed to have used it for js/blame-lib?
>
> --
> Those patches had a wide reviewer audience cc'd,
> so I would think people are aware of this series.
>
> --
> Things to come, but not in this series as they are more advanced:
>
>     Discuss if a block/line needs a minimum requirement.
>
> When doing reviews with this series, a couple of lines such
> as "\t\t}" were marked as a moved, which is not wrong as they
> really occurred in the text with opposing sign.
> But it was annoying as it drew my attention to just closing
> braces, which IMO is not the point of code review.
>
> To solve this issue I had the idea of a "minimum requirement", e.g.
> * at least 3 consecutive lines or
> * at least one line with at least 3 non-ws characters or
> * compute the entropy of a given moved block and if it is too low, do
>   not mark it up.
>
> I am not sure if such a "minimum requirement" is the right approach
> at all. The nature of this discussion comes close to the diff heuristics
> at which Michael did present a wonderful solution, hence I had him cc'd
> on the series as he may have some good insights on how to improve
> the diffs. :)
>
> --
> In conclusion:
>
> We are happy to move to next as it seems technically sound.
>
> But we want more exposure on usage to point out UX bugs.
> (e.g. is the default mode for just giving --color-moved good for the
> majority of people/use cases? Are there subtle annoyances such
> as the closing braces?)
>
> So maybe merge to next with the strong option to evict it
> when finding more fundamentally wrong things?
>
> Thanks,
> Stefan
>
> [1]
> https://public-inbox.org/git/CAGZ79kZJF9iDsVgyi-hSKb6N8w0uhVCU4W-r89F0eRJPXe_4Og@mail.gmail.com/

^ permalink raw reply	[relevance 8%]

* Re: [BUG?] gitlink without .gitmodules no longer fails recursive clone
      [irrelevant] <20170606035650.oykbz2uc4xkr3cr2@sigill.intra.peff.net>
@ 2017-06-06 18:01 ` Stefan Beller
  2017-06-06 18:10   ` Brandon Williams
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-06-06 18:01 UTC (permalink / raw)
  To: Jeff King; +Cc: Brandon Williams, git

On Mon, Jun 5, 2017 at 8:56 PM, Jeff King <peff@peff.net> wrote:
> While running some regression tests with v2.13, I noticed an odd
> behavior. If I create a repository where there's a gitlink with no
> matching .gitmodules entry:
>
>   git init repo
>   cd repo
>   n10=1234abcdef
>   n40=$n10$n10$n10$n10
>   git update-index --add --cacheinfo 160000 $n40 foo
>   git commit -m "gitlink without .gitmodule entry"
>
> and then I clone it recursively with v2.12, it fails:
>
>   $ git.v2.12.3 clone --recurse-submodules . dst; echo exit=$?
>   Cloning into 'dst'...
>   done.
>   fatal: No url found for submodule path 'foo' in .gitmodules
>   exit=128
>
> But with v2.13, it silently ignores the submodule:
>
>   $ git.v2.13.1 clone --recurse-submodules . dst; echo exit=$?
>   Cloning into 'dst'...
>   done.
>   exit=0
>
> This bisects to your bb62e0a99 (clone: teach --recurse-submodules to
> optionally take a pathspec, 2017-03-17). That patch just sets
> submodule.active by default, so I think the real issue is probably in
> a086f921a (submodule: decouple url and submodule interest, 2017-03-17).

It's a feature, not a bug, IMO.

When starting out the journey to improve submodules, one of the major
principle was to not interfere with gitlinks too much, as they are used in
ways git cannot fathom (cf git-series storing patches in gitlink form).

And building on that: You asked for recursing into *submodules*, not
into *gitlinks*. And submodules in the new Git have stronger requirements
w.r.t. the gitmodules file. (You have to tell us exactly how you want your
submodule to be treated, and we do not want to half-ass guess around
the shortcomings of a user not telling us about the submodule)

> I also wasn't sure if this might be intentional. I.e., that we'd just
> consider gitlink entries which aren't even configured as not-submodules
> and ignore them.

I think this is what we want to do, and we should do it consistently.
The only downside for this is that more unintentional gitlinks may be
added to repositories as Git will be very good at ignoring them.

> I couldn't certainly see an argument for moving in that
> direction, but it is different than what we used to do. But I couldn't
> find anything in any of the commit messages that mentioned this either
> way, so I figured I'd punt and ask. :)

Yeah, yesterday we had a big discussion if we want to publish our
roadmap and long term vision (as a team, as a company, or as a
community?) This would help newcomers and outsiders to see where
e.g. submodules are headed and people could speak up early if we miss
their use case.

Thanks for asking,
Stefan

>
> -Peff

^ permalink raw reply	[relevance 18%]

* Re: [BUG?] gitlink without .gitmodules no longer fails recursive clone
  2017-06-06 18:01 ` Re: [BUG?] gitlink without .gitmodules no longer fails recursive clone Stefan Beller
@ 2017-06-06 18:10   ` Brandon Williams
      [irrelevant]     ` <20170606183914.6iowfhimo5yrvmtf@sigill.intra.peff.net>
  0 siblings, 1 reply; 200+ results
From: Brandon Williams @ 2017-06-06 18:10 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Jeff King, git

On 06/06, Stefan Beller wrote:
> On Mon, Jun 5, 2017 at 8:56 PM, Jeff King <peff@peff.net> wrote:
> > While running some regression tests with v2.13, I noticed an odd
> > behavior. If I create a repository where there's a gitlink with no
> > matching .gitmodules entry:
> >
> >   git init repo
> >   cd repo
> >   n10=1234abcdef
> >   n40=$n10$n10$n10$n10
> >   git update-index --add --cacheinfo 160000 $n40 foo
> >   git commit -m "gitlink without .gitmodule entry"
> >
> > and then I clone it recursively with v2.12, it fails:
> >
> >   $ git.v2.12.3 clone --recurse-submodules . dst; echo exit=$?
> >   Cloning into 'dst'...
> >   done.
> >   fatal: No url found for submodule path 'foo' in .gitmodules
> >   exit=128
> >
> > But with v2.13, it silently ignores the submodule:
> >
> >   $ git.v2.13.1 clone --recurse-submodules . dst; echo exit=$?
> >   Cloning into 'dst'...
> >   done.
> >   exit=0
> >
> > This bisects to your bb62e0a99 (clone: teach --recurse-submodules to
> > optionally take a pathspec, 2017-03-17). That patch just sets
> > submodule.active by default, so I think the real issue is probably in
> > a086f921a (submodule: decouple url and submodule interest, 2017-03-17).
> 
> It's a feature, not a bug, IMO.
> 
> When starting out the journey to improve submodules, one of the major
> principle was to not interfere with gitlinks too much, as they are used in
> ways git cannot fathom (cf git-series storing patches in gitlink form).
> 
> And building on that: You asked for recursing into *submodules*, not
> into *gitlinks*. And submodules in the new Git have stronger requirements
> w.r.t. the gitmodules file. (You have to tell us exactly how you want your
> submodule to be treated, and we do not want to half-ass guess around
> the shortcomings of a user not telling us about the submodule)

Just for some background on the new behavior and how this functionality
changed: My series changed how 'submodule init' behaved if you have
'submodule.active' set.  Once set (like how clone --recurse does now)
when not provided any path to a submodule, a list of 'active' submodules
matching the 'submodule.active' pathspec will be initialized.  One of
the requirements to be 'active' is to have an entry in the .gitmodules
file so gitlinks without an entry in the .gitmodules file will simply be
ignored now.

> 
> > I also wasn't sure if this might be intentional. I.e., that we'd just
> > consider gitlink entries which aren't even configured as not-submodules
> > and ignore them.
> 
> I think this is what we want to do, and we should do it consistently.
> The only downside for this is that more unintentional gitlinks may be
> added to repositories as Git will be very good at ignoring them.
> 
> > I couldn't certainly see an argument for moving in that
> > direction, but it is different than what we used to do. But I couldn't
> > find anything in any of the commit messages that mentioned this either
> > way, so I figured I'd punt and ask. :)
> 
> Yeah, yesterday we had a big discussion if we want to publish our
> roadmap and long term vision (as a team, as a company, or as a
> community?) This would help newcomers and outsiders to see where
> e.g. submodules are headed and people could speak up early if we miss
> their use case.
> 
> Thanks for asking,
> Stefan
> 
> >
> > -Peff

-- 
Brandon Williams

^ permalink raw reply	[relevance 18%]

* Re: [PATCH 0/3] update sha1dc
      [irrelevant] ` <20170606151231.25172-1-avarab@gmail.com>
@ 2017-06-06 18:23   ` Stefan Beller
  2017-06-06 18:51     ` Ævar Arnfjörð Bjarmason
      [irrelevant]   ` <20170606151231.25172-3-avarab@gmail.com>
  1 sibling, 1 reply; 200+ results
From: Stefan Beller @ 2017-06-06 18:23 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Junio C Hamano, Adam Dinwoodie, Ramsay Jones, Liam R . Howlett, Michael Kebe

On Tue, Jun 6, 2017 at 8:12 AM, Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> This updates sha1dc fixing the issue on Cygwin introduced in 2.13.1,
> and hopefully not regressing elsewhere. Liam, it would be much
> appreciated if you could test this on SPARC.
>
> As before the "sha1dc: update from upstream" patch is what should
> fast-track to master/maint and be in 2.13.2, the other two are the
> cooking submodule use, that's all unchanged aside from of course the
> submodule pointing to the same upstream commit as the code import
> itself does.
>
> Junio: There's a whitespace change to sha1.h that am warns about, but
> which it applies anyway that you didn't apply from my previous
> patch. I think it probably makes sense to just take upstream's
> whitespace shenanigans as-is instead of seeing that diff every time we
> update. I guess we could also send them a pull request...

I would suggest the pull request.

Also as to not make the mistake from before that I jump on the
submodule bandwagon here:
Patch 1 ought to go in its on series/patch, so with that out the way
we have more time to consider the pros and cons of the rest of
the series?

Thanks,
Stefan

^ permalink raw reply	[relevance 15%]

* Re: [PATCH 2/3] sha1dc: optionally use sha1collisiondetection as a submodule
      [irrelevant]   ` <20170606151231.25172-3-avarab@gmail.com>
@ 2017-06-06 18:48     ` Stefan Beller
  2017-06-06 19:03       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-06-06 18:48 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Junio C Hamano, Adam Dinwoodie, Ramsay Jones, Liam R . Howlett, Michael Kebe

On Tue, Jun 6, 2017 at 8:12 AM, Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> Add an option to use the sha1collisiondetection library from the
> submodule in sha1collisiondetection/ instead of in the copy in the
> sha1dc/ directory.
>
> This allows us to try out the submodule in sha1collisiondetection
> without breaking the build for anyone who's not expecting them as we
> work out any kinks.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Other projects using submodules sometimes have
a .gitattributes entry to have .gitmodules not exported
via git-archive. Do we want a similar thing?

Speaking of attributes, I wonder if we want to specify
the .gitmodules file to be text with unixy file endings:
Having an entry
    .gitattributes eol=crlf
to simulate a Windows environment doesn't harm
submodule operation, which is good. I'll check if we
have a test for that.

^ permalink raw reply	[relevance 24%]

* Re: [PATCH 0/3] update sha1dc
  2017-06-06 18:23   ` Re: [PATCH 0/3] update sha1dc Stefan Beller
@ 2017-06-06 18:51     ` Ævar Arnfjörð Bjarmason
  2017-06-06 19:01       ` [PATCH] sha1dc: ignore indent-with-non-tab whitespace violations Jeff King
  0 siblings, 1 reply; 200+ results
From: Ævar Arnfjörð Bjarmason @ 2017-06-06 18:51 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Junio C Hamano, Adam Dinwoodie, Ramsay Jones, Liam R . Howlett, Michael Kebe

On Tue, Jun 6, 2017 at 8:23 PM, Stefan Beller <sbeller@google.com> wrote:
> On Tue, Jun 6, 2017 at 8:12 AM, Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>> This updates sha1dc fixing the issue on Cygwin introduced in 2.13.1,
>> and hopefully not regressing elsewhere. Liam, it would be much
>> appreciated if you could test this on SPARC.
>>
>> As before the "sha1dc: update from upstream" patch is what should
>> fast-track to master/maint and be in 2.13.2, the other two are the
>> cooking submodule use, that's all unchanged aside from of course the
>> submodule pointing to the same upstream commit as the code import
>> itself does.
>>
>> Junio: There's a whitespace change to sha1.h that am warns about, but
>> which it applies anyway that you didn't apply from my previous
>> patch. I think it probably makes sense to just take upstream's
>> whitespace shenanigans as-is instead of seeing that diff every time we
>> update. I guess we could also send them a pull request...
>
> I would suggest the pull request.

Looking at this again it's not a bug, just upstream choosing to indent
a comment with spaces, not a bug.

So it makes sense to just apply as-is so we don't have that diff with
them / different sha1s on the files etc.

> Also as to not make the mistake from before that I jump on the
> submodule bandwagon here:
> Patch 1 ought to go in its on series/patch, so with that out the way
> we have more time to consider the pros and cons of the rest of
> the series?

Yes it makes perfect sense to just take the 1st patch here and make
the submodule changes cook. This is just how I submitted it the last
time and Junio took the 1st patch into a maint topic, so I figured I'd
send it like this again.

^ permalink raw reply	[relevance 15%]

* [PATCH] sha1dc: ignore indent-with-non-tab whitespace violations
  2017-06-06 18:51     ` Ævar Arnfjörð Bjarmason
@ 2017-06-06 19:01       ` Jeff King
  2017-06-06 19:04         ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 200+ results
From: Jeff King @ 2017-06-06 19:01 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Stefan Beller, git, Junio C Hamano, Adam Dinwoodie, Ramsay Jones, Liam R . Howlett, Michael Kebe

On Tue, Jun 06, 2017 at 08:51:35PM +0200, Ævar Arnfjörð Bjarmason wrote:

> On Tue, Jun 6, 2017 at 8:23 PM, Stefan Beller <sbeller@google.com> wrote:
> > On Tue, Jun 6, 2017 at 8:12 AM, Ævar Arnfjörð Bjarmason
> > <avarab@gmail.com> wrote:
> >> This updates sha1dc fixing the issue on Cygwin introduced in 2.13.1,
> >> and hopefully not regressing elsewhere. Liam, it would be much
> >> appreciated if you could test this on SPARC.
> >>
> >> As before the "sha1dc: update from upstream" patch is what should
> >> fast-track to master/maint and be in 2.13.2, the other two are the
> >> cooking submodule use, that's all unchanged aside from of course the
> >> submodule pointing to the same upstream commit as the code import
> >> itself does.
> >>
> >> Junio: There's a whitespace change to sha1.h that am warns about, but
> >> which it applies anyway that you didn't apply from my previous
> >> patch. I think it probably makes sense to just take upstream's
> >> whitespace shenanigans as-is instead of seeing that diff every time we
> >> update. I guess we could also send them a pull request...
> >
> > I would suggest the pull request.
> 
> Looking at this again it's not a bug, just upstream choosing to indent
> a comment with spaces, not a bug.
> 
> So it makes sense to just apply as-is so we don't have that diff with
> them / different sha1s on the files etc.

Agreed. Maybe we'd also want this patch:

-- >8 --
Subject: sha1dc: ignore indent-with-non-tab whitespace violations

The upstream sha1dc code indents some lines with spaces.
While this doesn't match Git's coding guidelines, it's better
to leave this imported code untouched than to try to make it
match our style. However, we can use .gitattributes to tell
"diff --check" and "git am" not to bother us about it.

Signed-off-by: Jeff King <peff@peff.net>
---
 sha1dc/.gitattributes | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 sha1dc/.gitattributes

diff --git a/sha1dc/.gitattributes b/sha1dc/.gitattributes
new file mode 100644
index 000000000..da53f4054
--- /dev/null
+++ b/sha1dc/.gitattributes
@@ -0,0 +1 @@
+* whitespace=-indent-with-non-tab
-- 
2.13.1.664.g1b5a21ec3


^ permalink raw reply	[relevance 1%]

* Re: [PATCH 2/3] sha1dc: optionally use sha1collisiondetection as a submodule
  2017-06-06 18:48     ` Re: [PATCH 2/3] sha1dc: optionally use sha1collisiondetection as a submodule Stefan Beller
@ 2017-06-06 19:03       ` Ævar Arnfjörð Bjarmason
  2017-06-06 19:09         ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Ævar Arnfjörð Bjarmason @ 2017-06-06 19:03 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Junio C Hamano, Adam Dinwoodie, Ramsay Jones, Liam R . Howlett, Michael Kebe

On Tue, Jun 6, 2017 at 8:48 PM, Stefan Beller <sbeller@google.com> wrote:
> On Tue, Jun 6, 2017 at 8:12 AM, Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>> Add an option to use the sha1collisiondetection library from the
>> submodule in sha1collisiondetection/ instead of in the copy in the
>> sha1dc/ directory.
>>
>> This allows us to try out the submodule in sha1collisiondetection
>> without breaking the build for anyone who's not expecting them as we
>> work out any kinks.
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>
> Other projects using submodules sometimes have
> a .gitattributes entry to have .gitmodules not exported
> via git-archive. Do we want a similar thing?

Right now we end up with an empty directory due to the issue you noted
in https://public-inbox.org/git/CAGZ79kZC98CxA69QjmX2s_SU6z1CSgKgwZeqvwiMRAQc6+S3xg@mail.gmail.com/

It's probably best to have the .gitmodules file as some hint that
something should be there. We also ship the other .git* files.

> Speaking of attributes, I wonder if we want to specify
> the .gitmodules file to be text with unixy file endings:
> Having an entry
>     .gitattributes eol=crlf
> to simulate a Windows environment doesn't harm
> submodule operation, which is good. I'll check if we
> have a test for that.

I have no idea what that would do or why we'd have it, but I'm going
to understand this as you looking into it :)

^ permalink raw reply	[relevance 16%]

* Re: [PATCH] sha1dc: ignore indent-with-non-tab whitespace violations
  2017-06-06 19:01       ` [PATCH] sha1dc: ignore indent-with-non-tab whitespace violations Jeff King
@ 2017-06-06 19:04         ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 200+ results
From: Ævar Arnfjörð Bjarmason @ 2017-06-06 19:04 UTC (permalink / raw)
  To: Jeff King; +Cc: Stefan Beller, git, Junio C Hamano, Adam Dinwoodie, Ramsay Jones, Liam R . Howlett, Michael Kebe

On Tue, Jun 6, 2017 at 9:01 PM, Jeff King <peff@peff.net> wrote:
> On Tue, Jun 06, 2017 at 08:51:35PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> On Tue, Jun 6, 2017 at 8:23 PM, Stefan Beller <sbeller@google.com> wrote:
>> > On Tue, Jun 6, 2017 at 8:12 AM, Ævar Arnfjörð Bjarmason
>> > <avarab@gmail.com> wrote:
>> >> This updates sha1dc fixing the issue on Cygwin introduced in 2.13.1,
>> >> and hopefully not regressing elsewhere. Liam, it would be much
>> >> appreciated if you could test this on SPARC.
>> >>
>> >> As before the "sha1dc: update from upstream" patch is what should
>> >> fast-track to master/maint and be in 2.13.2, the other two are the
>> >> cooking submodule use, that's all unchanged aside from of course the
>> >> submodule pointing to the same upstream commit as the code import
>> >> itself does.
>> >>
>> >> Junio: There's a whitespace change to sha1.h that am warns about, but
>> >> which it applies anyway that you didn't apply from my previous
>> >> patch. I think it probably makes sense to just take upstream's
>> >> whitespace shenanigans as-is instead of seeing that diff every time we
>> >> update. I guess we could also send them a pull request...
>> >
>> > I would suggest the pull request.
>>
>> Looking at this again it's not a bug, just upstream choosing to indent
>> a comment with spaces, not a bug.
>>
>> So it makes sense to just apply as-is so we don't have that diff with
>> them / different sha1s on the files etc.
>
> Agreed. Maybe we'd also want this patch:

Great, that makes perfect sense for prepending to the series.

> -- >8 --
> Subject: sha1dc: ignore indent-with-non-tab whitespace violations
>
> The upstream sha1dc code indents some lines with spaces.
> While this doesn't match Git's coding guidelines, it's better
> to leave this imported code untouched than to try to make it
> match our style. However, we can use .gitattributes to tell
> "diff --check" and "git am" not to bother us about it.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  sha1dc/.gitattributes | 1 +
>  1 file changed, 1 insertion(+)
>  create mode 100644 sha1dc/.gitattributes
>
> diff --git a/sha1dc/.gitattributes b/sha1dc/.gitattributes
> new file mode 100644
> index 000000000..da53f4054
> --- /dev/null
> +++ b/sha1dc/.gitattributes
> @@ -0,0 +1 @@
> +* whitespace=-indent-with-non-tab
> --
> 2.13.1.664.g1b5a21ec3
>

^ permalink raw reply	[relevance 1%]

* Re: [PATCH 2/3] sha1dc: optionally use sha1collisiondetection as a submodule
  2017-06-06 19:03       ` Ævar Arnfjörð Bjarmason
@ 2017-06-06 19:09         ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-06-06 19:09 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Junio C Hamano, Adam Dinwoodie, Ramsay Jones, Liam R . Howlett, Michael Kebe

On Tue, Jun 6, 2017 at 12:03 PM, Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> On Tue, Jun 6, 2017 at 8:48 PM, Stefan Beller <sbeller@google.com> wrote:
>> On Tue, Jun 6, 2017 at 8:12 AM, Ævar Arnfjörð Bjarmason
>> <avarab@gmail.com> wrote:
>>> Add an option to use the sha1collisiondetection library from the
>>> submodule in sha1collisiondetection/ instead of in the copy in the
>>> sha1dc/ directory.
>>>
>>> This allows us to try out the submodule in sha1collisiondetection
>>> without breaking the build for anyone who's not expecting them as we
>>> work out any kinks.
>>>
>>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>>
>> Other projects using submodules sometimes have
>> a .gitattributes entry to have .gitmodules not exported
>> via git-archive. Do we want a similar thing?
>
> Right now we end up with an empty directory due to the issue you noted
> in https://public-inbox.org/git/CAGZ79kZC98CxA69QjmX2s_SU6z1CSgKgwZeqvwiMRAQc6+S3xg@mail.gmail.com/
>
> It's probably best to have the .gitmodules file as some hint that
> something should be there. We also ship the other .git* files.

Ok, but then let's talk about the other .git* files, would we want to
distribute these via tarballs? (I guess it is a minor thing if at all and
nobody downloading a git tarball would be surprised by these metadata
files or annoyed by them, so all is good?)

>
>> Speaking of attributes, I wonder if we want to specify
>> the .gitmodules file to be text with unixy file endings:
>> Having an entry
>>     .gitattributes eol=crlf
>> to simulate a Windows environment doesn't harm
>> submodule operation, which is good. I'll check if we
>> have a test for that.
>
> I have no idea what that would do or why we'd have it, but I'm going
> to understand this as you looking into it :)

I looked briefly into it and it seems to be no problem just as config files
on Windows are no problem. I just spoke up too quickly.

^ permalink raw reply	[relevance 16%]

* Re: pushing for a new hash, was Re: [PATCH 2/3] rebase: Add tests for console output
      [irrelevant]                 ` <alpine.DEB.2.21.1.1706071520280.171564@virtualbox>
@ 2017-06-07 16:53                   ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-06-07 16:53 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Jonathan Nieder, Junio C Hamano, Phillip Wood, git, Ævar Arnfjörð Bjarmason

On Wed, Jun 7, 2017 at 7:47 AM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Hi Stefan,
>
> On Tue, 6 Jun 2017, Stefan Beller wrote:
>
>> On Tue, Jun 6, 2017 at 3:22 PM, Johannes Schindelin
>> <Johannes.Schindelin@gmx.de> wrote:
>>
>> > 3) the only person who could make that call is Junio
>>
>> Occasionally I think the same, but in fact it is not true.
>
> Again my poor English skillz make sure I get misunderstood. So bear with
> me, please, and let me try again.

I don't think it is a language thing, but a matter of perspective.

>
> The current What's cooking mails are full of stuff other than the
> transition from SHA-1 to a new function.

True, but there is also

    * bw/object-id (2017-06-05) 33 commits
    ...

     Conversion from uchar[20] to struct object_id continues.

     Will merge to 'next'.


>  In fact, every once in a while I
> see brian carlson's patch series with the remark "Needs review" while
> other patch series get reviewed even by Junio.

So are you trying to impose priorities on what Junio has to review?
(It sounds like so, but maybe you are just stating an observation,
and a conclusion with an actionable item comes next)

Sometimes I disagree with what Junio does and in which order, too.
But he has more experience in how to guide a successful community,
so I respect what he does even if I would have done it differently (such
as a different order).

>
> In my mind, this sends a message.

In the simplest form the message could be understood as a call for help
to review the patches Brian sent.

And the fact that YOU are not reviewing the patches, tells me that
you have more important things to do, such as running a fork of Git.

In my perception the conversion is picking up speed. It used to be Brian
who decided to start this years ago as a one-man show, but now we
have multiple people working on it
* Brian sending out more patches, as more review happens:
https://public-inbox.org/git/5973919a-e282-a02e-9b04-d313c77e250d@google.com/
https://public-inbox.org/git/20170509221322.GA106700@google.com/
* Brandon picking up one part of the conversion series (mentioned before, see
current cooking email)
* A potential migration plan has been made
  "Git hash function transition" (https://goo.gl/gh2Mzc).
  See the note atop the document "Note: this draft is somewhat out
  of date and is being reworked (in particular to use a different hash
function).
  See public-inbox.org/git for more current discussion."
* There are not a lot of patches, which do not have written
  "SHA1 CONVERISON" all over their face. (This one has, I made it last night
  as a one off response to this thread:
https://public-inbox.org/git/20170607021805.11849-1-sbeller@google.com/)

>
> If, hypothetically, a couple of What's cooking mails would have in their
> header some language to the extent that we need to focus on transitioning
> away from SHA-1,  and maybe even have the promise that Junio would not
> review other patch series as long as there are patches to review that
> prepare the tests for the transition, that convert more 20 and 40
> constants, that convert more users to object_ids (and maybe strongly
> encourage to coordinate with brian so as not to trip over each others'
> toes), to implement a command to convert a SHA-1 based repository to a
> repository based on a different hash, to implement caching of legacy SHA-1
> <=> new hash mapping, then that would send a wholly different message.

That message sent there would be "Junio thinks the SHA1 conversion is
the most important thing now, so he does the work (the work a maintainer
does to guide the project)".

You could send the same message "Johannes thinks the SHA1 conversion
is the most important thing and do the work (Johannes being a well respected
contributor would send patches, and that would attract a lot of
reviews for sure.
-- I don't mean this snarky, please don't read any snark in this.)

> And in my mind, if anybody else than Junio sent this message, it would
> sound ludicrous.

Yes I have seen a couple of these messages (unrelated topics).
"Git should do X. Git should not do Y. k, thx bye!" and I ignored them,
because these one-offs do not convince me to invest my own time
in it to produce a reasonable ROI.

If there are patches attached, it is not ludicrous any more as the "proof
of work done" shows that the voice raised is more than just hot talk, but
actual a genuine interest in moving things towards the right direction.

Another big one: "Move Git away from global state all over the place".
If you think about all subsystems, it may even reach the order of
magnitude to the sha1 conversion, but the way the message was sent
it did not seem ludicrous to me:
https://public-inbox.org/git/20170530171217.GB2798@google.com/

Or the other big project "Protocol v2, that scales and serves large
repos well (number of refs, large binary files omitted, refs in wants
and so on)" took a different approach, and mostly discussed design
(I recall emails both from Microsoft as well as Google discussing the
design, most of them having RFC patches attached, such that it very
much looked like "proof of work done, so it is not ludicrous")

> For example, if I sent a mail to that extent, I would
> find it ridiculous myself, in particular since I am a very unprolific
> reviewer, and the promise to focus on favoring reviews of SHA-1 transition
> related patches would sound very unsincere from somebody like me.

If you were to actually follow through after such an announcement,
and in fact review the sha1 conversion patches thoroughly and in a timely
manner, I would think such a call up front would be well received.

I thought about doing that myself, but I dread my future commitment.
Specifically as my $DAY_JOBs wants me to work on submodules
instead of the sha1 conversion. ("Submodules are more important
than the SHA1 conversion [for me] ", if I were to trust the infinite wisdom
of our management process. )


>
>> As said above, Junio has strong veto power for things going off rails,
>> but in his role as a maintainer he does not coordinate people. (He
>> occasionally asks them to coordinate between themselves, though)
>
> I never had in mind that Junio would coordinate people or distribute
> tasks.

Coordinate is a strong word here. Most recent observed examples:
https://public-inbox.org/git/xmqq60gpfvqj.fsf@gitster.mtv.corp.google.com/
(My patch series conflicted with Ævars series, so we had to figure
out how to fix it)

https://public-inbox.org/git/20170602182215.GA57260@google.com/
(Aforementioned object id conversion series having merge conflicts)

Note that this is only about coordination "You should talk to person Y
so we can figure out how to make these 2 patches work well together,
not about distributing tasks, as in "You must do X".

> Instead, I had in mind that a certain time period could be called out as
> focusing on that pretty important direction.

As remarked in an earlier email, if such a thing is called out, I would very
much prefer it in the next quarter, as then I can convince my manager to
have more time following such a goal. Otherwise *I* would ignore it.
The community at large may be different and jump on it like crazy.

> That would be mostly symbolic, of course. And encouraging. In a positive
> way. With a direction.

So you want a project roadmap?

As hinted at before the best would be to lead a good example here
and show *your* roadmap such that others see how valuable of a tool
it is to have a roadmap in the open.

>> > 4) we still have the problem that there is no cryptography expert among
>> > those who in the Git project are listened to
>>
>> I can assure you that Jonathan listened to crypto experts. It just did
>> not happen on the mailing list, which is sad regarding openness and
>> transparency.
>
> True. Same goes for me, of course. I just felt pretty uncomfortable
> sharing the contents of my private conversation publicly, when I tried
> very hard to convince my conversation partner to join the discussion on
> this mailing list, and they refused.
>
> The gist of it was: SHA-256 should be preferred to SHA3-256 because we
> will soon have good hardware support (and performance is really, really
> important when you need to work on the largest Git repository on this
> planet). And if there is no consensus about that, BLAKE should be
> considered over other algorithms because it has been studied pretty well.

BLAKE is what we're currently leaning on. (we=authors of "Git hash function
transition"; Leaning in the sense: If nobody ever speaks up until all work is
done, we'd just go with that. As soon as someone comes up with any
reasonable argument either publicly or privately on why other hashes are
better, we're easily persuaded to go with that)

> Again my poor English skillz make sure I get misunderstood. So bear with
> me, please, and let me try again.

Same for me, if I misunderstood you please point out.

tl;dr: Discussions are nice, but someone has to do the actual work, too.

Thanks,
Stefan

^ permalink raw reply	[relevance 11%]

* [RFC/PATCH] submodules: overhaul documentation
@ 2017-06-07 18:53 Stefan Beller
  2017-06-13 19:29 ` Junio C Hamano
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Stefan Beller @ 2017-06-07 18:53 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

This patch aims to detangle (a) the usage of `git-submodule`
from (b) the concept of submodules and (c) how the actual
implementation looks like, such as where they are configured
and (d) what the best practices are.

To do so, move the conceptual parts of the 'git-submodule'
man page to a new man page gitsubmodules(7). This new page
is just like gitmodules(5), gitattributes(5), gitcredentials(7),
gitnamespaces(7), gittutorial(7), which introduce a concept
rather than explaining a specific command.

The moved part of text has been slightly restructured:
* Rewrite first paragraph ("allows" is wrong. For example you can keep
  untracked repos as well, submodules enable tracking across versions)
  (Also remove short example as we have examples later)

* Remove "that is completely separate" from the second sentence as
  that was said in the first sentence.

* Introduce the gitmodules file in the third paragraph, mention name
  as the basic requirement. The URL is optional though strongly
  suggested. Leave it out as gitmodules(5) explains the url.

* The paragraphs about other mechanisms and implementation details
  are moved further down, as they are not as relevant to the concept of
  gitmodules.

Signed-off-by: Stefan Beller <sbeller@google.com>
---

This is kind of a resend from [RFC-PATCHv2] submodules: add a background story
https://public-inbox.org/git/20170209020855.23486-1-sbeller@google.com/
but the new man page is completely reworked, so I'd expect it go over better
for the first half at least.

(In the "data model" section it begins to differ from reality,
as it mentions a new not-yet-implemented place where to put submodule
related config)

Thanks,
Stefan

 Documentation/Makefile          |   1 +
 Documentation/git-rm.txt        |   4 +-
 Documentation/git-submodule.txt |  44 ++-------
 Documentation/gitsubmodules.txt | 214 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 227 insertions(+), 36 deletions(-)
 create mode 100644 Documentation/gitsubmodules.txt

diff --git a/Documentation/Makefile b/Documentation/Makefile
index b5be2e2d3f..2415e0d657 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -31,6 +31,7 @@ MAN7_TXT += giteveryday.txt
 MAN7_TXT += gitglossary.txt
 MAN7_TXT += gitnamespaces.txt
 MAN7_TXT += gitrevisions.txt
+MAN7_TXT += gitsubmodules.txt
 MAN7_TXT += gittutorial-2.txt
 MAN7_TXT += gittutorial.txt
 MAN7_TXT += gitworkflows.txt
diff --git a/Documentation/git-rm.txt b/Documentation/git-rm.txt
index f1efc116eb..db444693dd 100644
--- a/Documentation/git-rm.txt
+++ b/Documentation/git-rm.txt
@@ -152,8 +152,8 @@ Ignored files are deemed expendable and won't stop a submodule's work
 tree from being removed.
 
 If you only want to remove the local checkout of a submodule from your
-work tree without committing the removal,
-use linkgit:git-submodule[1] `deinit` instead.
+work tree without committing the removal, use linkgit:git-submodule[1] `deinit`
+instead. Also see linkgit:gitsubmodules[7] for details on submodule removal.
 
 EXAMPLES
 --------
diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index 74bc6200d5..032590d828 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -24,37 +24,7 @@ DESCRIPTION
 -----------
 Inspects, updates and manages submodules.
 
-A submodule allows you to keep another Git repository in a subdirectory
-of your repository. The other repository has its own history, which does not
-interfere with the history of the current repository. This can be used to
-have external dependencies such as third party libraries for example.
-
-When cloning or pulling a repository containing submodules however,
-these will not be checked out by default; the 'init' and 'update'
-subcommands will maintain submodules checked out and at
-appropriate revision in your working tree.
-
-Submodules are composed from a so-called `gitlink` tree entry
-in the main repository that refers to a particular commit object
-within the inner repository that is completely separate.
-A record in the `.gitmodules` (see linkgit:gitmodules[5]) file at the
-root of the source tree assigns a logical name to the submodule and
-describes the default URL the submodule shall be cloned from.
-The logical name can be used for overriding this URL within your
-local repository configuration (see 'submodule init').
-
-Submodules are not to be confused with remotes, which are other
-repositories of the same project; submodules are meant for
-different projects you would like to make part of your source tree,
-while the history of the two projects still stays completely
-independent and you cannot modify the contents of the submodule
-from within the main project.
-If you want to merge the project histories and want to treat the
-aggregated whole as a single project from then on, you may want to
-add a remote for the other project and use the 'subtree' merge strategy,
-instead of treating the other project as a submodule. Directories
-that come from both projects can be cloned and checked out as a whole
-if you choose to go that route.
+For more information about submodules, see linkgit:gitsubmodules[5]
 
 COMMANDS
 --------
@@ -149,15 +119,17 @@ deinit [-f|--force] (--all|[--] <path>...)::
 	tree. Further calls to `git submodule update`, `git submodule foreach`
 	and `git submodule sync` will skip any unregistered submodules until
 	they are initialized again, so use this command if you don't want to
-	have a local checkout of the submodule in your working tree anymore. If
-	you really want to remove a submodule from the repository and commit
-	that use linkgit:git-rm[1] instead.
+	have a local checkout of the submodule in your working tree anymore.
 +
 When the command is run without pathspec, it errors out,
 instead of deinit-ing everything, to prevent mistakes.
 +
 If `--force` is specified, the submodule's working tree will
 be removed even if it contains local modifications.
++
+If you really want to remove a submodule from the repository and commit
+that use linkgit:git-rm[1] instead. See linkgit:gitsubmodules[7] for removal
+options.
 
 update [--init] [--remote] [-N|--no-fetch] [--[no-]recommend-shallow] [-f|--force] [--checkout|--rebase|--merge] [--reference <repository>] [--depth <depth>] [--recursive] [--jobs <n>] [--] [<path>...]::
 +
@@ -435,6 +407,10 @@ This file should be formatted in the same way as `$GIT_DIR/config`. The key
 to each submodule url is "submodule.$name.url".  See linkgit:gitmodules[5]
 for details.
 
+SEE ALSO
+--------
+linkgit:gitsubmodules[7], linkgit:gitmodules[5].
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/Documentation/gitsubmodules.txt b/Documentation/gitsubmodules.txt
new file mode 100644
index 0000000000..2bf3149b68
--- /dev/null
+++ b/Documentation/gitsubmodules.txt
@@ -0,0 +1,214 @@
+gitsubmodules(7)
+================
+
+NAME
+----
+gitsubmodules - mounting one repository inside another
+
+SYNOPSIS
+--------
+.gitmodules, $GIT_DIR/config
+------------------
+git submodule
+git <command> --recurse-submodules
+------------------
+
+DESCRIPTION
+-----------
+
+A submodule is another Git repository tracked in a subdirectory of your
+repository. The tracked repository has its own history, which does not
+interfere with the history of the current repository.
+
+Submodules are composed from a so-called `gitlink` tree entry
+in the main repository that refers to a particular commit object
+within the inner repository.
+
+Additionally to the gitlink entry the `.gitmodules` file (see
+linkgit:gitmodules[5]) at the root of the source tree contains
+information needed for submodules. The only required information
+is the path setting, which estabishes a logical name for the submodule.
+
+The usual git configuration (see linkgit:git-config[1]) can be used to
+override settings given by the `.gitmodules` file.
+
+Submodules can be used for two different use cases:
+
+1. Using another project that stands on its own.
+  When you want to use a third party library, submodules allow you to
+  have a clean history for your own project as well as for the library.
+  This also allows for updating the third party library as needed.
+
+2. Artificially split a (logically single) project into multiple
+   repositories and tying them back together. This can be used to
+   overcome deficiences in the data model of Git, such as:
+
+* To have finer grained access control.
+  The design principles of Git do not allow for partial repositories to be
+  checked out or transferred. A repository is the smallest unit that a user
+  can be given access to. Submodules are separate repositories, such that
+  you can restrict access to parts of your project via the use of submodules.
+* In its current form Git scales up poorly for very large repositories that
+  change a lot, as the history grows very large. For that you may want to look
+  at shallow clone, sparse checkout, or git-LFS.
+  However you can also use submodules to e.g. hold large binary assets
+  and these repositories are then shallowly cloned such that you do not
+  have a large history locally.
+
+The data model
+--------------
+
+A submodule can be considered its own autonomous repository, that has a
+worktree and a git directory at a different place than the superproject.
+
+The superproject only records the commit object name in its tree, such that
+any other information, e.g. where to obtain a copy from, is not recorded
+in the core data structures of Git. The porcelain layer of Git however
+makes use of the `.gitmodules` file that gives hints where and how to
+obtain a copy of the submodule git repository from.
+
+Submodule operations can be configured using the following mechanisms
+(from highest to lowest precedence):
+
+ * the command line for those commands that support taking submodule specs.
+
+ * the configuration file `$GIT_DIR/config`.
+
+ * the configuration file `config` found in the `refs/submodule/config` branch.
+   This can be used to overwrite the upstream configuration in the `.gitmodules`
+   file without changing the history of the project.
+   Useful options here are overwriting the base, where relative URLs apply to,
+   when mirroring only parts of the larger collection of submodules.
+
+ * the `.gitmodules` file inside the repository. A project usually includes this
+   file to suggest defaults for the upstream collection of repositories.
+
+On the location of the git directory
+------------------------------------
+
+Since v1.7.7 of Git, the git directory of submodules is stored inside the
+superprojects git directory at $GIT_DIR/modules/<submodule-name>
+This location allows for the working tree to be non existent while keeping
+the history around. So we can use `git-rm` on a submodule without loosing
+information that may only be local; it is also possible to checkout the
+superproject before and after the deletion of the submodule without the
+need to reclone the submodule as it is kept locally.
+
+Workflow for a third party library
+----------------------------------
+
+  # add the submodule
+  git submodule add <url> <path>
+
+  # occasionally update the submodule to a new version:
+  git -C <path> checkout <new version>
+  git add <path>
+  git commit -m "update submodule to new version"
+
+  # see the discussion below on deleting submodules
+
+
+Workflow for an artifically split repo
+--------------------------------------
+
+  # Enable recursion for relevant commands, such that
+  # regular commands recurse into submodules by default
+  git config --global submodule.recurse true
+
+  # Unlike the other commands below clone still needs
+  # its own recurse flag:
+  git clone --recurse <URL> <directory>
+  cd <directory>
+
+  # Get to know the code:
+  git grep foo
+  git ls-files
+
+  # Get new code
+  git fetch
+  git pull --rebase
+
+  # change worktree
+  git checkout
+  git reset
+
+Deleting a submodule
+--------------------
+
+Deleting a submodule can happen on different levels:
+
+1) Removing it from the local working tree without tampering with
+   the history of the superproject.
+
+You may no longer need the submodule, but still want to keep it recorded
+in the superproject history as others may have use for it.
+--
+  git submodule deinit <submodule path>
+--
+will remove the configuration entries
+as well as the work
+
+2) Remove it from history:
+--
+   git rm <submodule>
+--
+
+3) Remove the submodules git directory:
+
+When you also want to free up the disk space that the submodules git
+directory uses, you have to delete it manually. It is found in
+`$GIT_DIR/modules`.
+The steps 1 and 2 can be undone via `git submodule init` or
+`git revert`, respectively.  This step may incur data loss,
+and cannot be undone. That is why there is no builtin.
+
+Other mechanisms
+----------------
+
+Git repositories are allowed to be kept inside other repositories without
+the need to use submodules. This however does not enable cross-repository
+versioning as the inner repository is unaware of the outer repository,
+which in turn ignores the inner.
+
+Submodules are not to be confused with remotes, which are other
+repositories of the same project; submodules are meant for
+different projects you would like to make part of your source tree,
+while the history of the two projects still stays completely
+independent and you cannot modify the contents of the submodule
+from within the main project.
+If you want to merge the project histories and want to treat the
+aggregated whole as a single project from then on, you may want to
+add a remote for the other project and use the 'subtree' merge strategy,
+instead of treating the other project as a submodule. Directories
+that come from both projects can be cloned and checked out as a whole
+if you choose to go that route.
+
+Third party tools
+-----------------
+
+There are a variety of third party tools that manage multiple repositories
+and their relationships to each other, such as Androids repo tool or git-slave.
+Often these tools lack cross repository versioning.
+
+https://source.android.com/source/using-repo
+
+http://gitslave.sourceforge.net/
+
+Implementation details
+----------------------
+
+When cloning or pulling a repository containing submodules the submodules
+will not be checked out by default; You can instruct 'clone' to recurse
+into submodules. The 'init' and 'update' subcommands of 'git submodule'
+will maintain submodules checked out and at an appropriate revision in
+your working tree. Alternatively you can set 'submodule.recurse' to have
+'checkout' recursing into submodules.
+
+
+SEE ALSO
+--------
+linkgit:git-submodule[1], linkgit:gitmodules[5].
+
+GIT
+---
+Part of the linkgit:git[1] suite
-- 
2.13.0.17.gf3d7728391


^ permalink raw reply	[relevance 17%]

* [PATCH v1] dir: create function count_slashes
@ 2017-06-08 18:08 Prathamesh Chavan
  2017-06-12  5:33 ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Prathamesh Chavan @ 2017-06-08 18:08 UTC (permalink / raw)
  To: git; +Cc: sbeller, christian.couder, gitster, Prathamesh Chavan

Similar functions exist in apply.c and builtin/show-branch.c for
counting the number of slashes in a string. Also in the later
patches, we introduce a third caller for the same. Hence, we unify
it now by cleaning the existing functions and declaring a common
function count_slashes in dir.h and implementing it in dir.c to
remove this code duplication.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
---
In future, I intend to use this function in builtin/submodule--helper.c
as well, hence this change was introduced now.

The patch passes all the tests.
Complete build report of this patch is available at:
https://travis-ci.org/pratham-pc/git/builds
Branch: count_slashes
Build #97

 apply.c               | 11 -----------
 builtin/show-branch.c | 13 +++----------
 dir.c                 |  9 +++++++++
 dir.h                 |  3 +++
 4 files changed, 15 insertions(+), 21 deletions(-)

diff --git a/apply.c b/apply.c
index c49cef063..121e53406 100644
--- a/apply.c
+++ b/apply.c
@@ -762,17 +762,6 @@ static char *find_name_traditional(struct apply_state *state,
 	return find_name_common(state, line, def, p_value, line + len, 0);
 }
 
-static int count_slashes(const char *cp)
-{
-	int cnt = 0;
-	char ch;
-
-	while ((ch = *cp++))
-		if (ch == '/')
-			cnt++;
-	return cnt;
-}
-
 /*
  * Given the string after "--- " or "+++ ", guess the appropriate
  * p_value for the given patch.
diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index 4a6cc6f49..3636a0559 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -5,6 +5,7 @@
 #include "color.h"
 #include "argv-array.h"
 #include "parse-options.h"
+#include "dir.h"
 
 static const char* show_branch_usage[] = {
     N_("git show-branch [-a | --all] [-r | --remotes] [--topo-order | --date-order]\n"
@@ -421,14 +422,6 @@ static int append_tag_ref(const char *refname, const struct object_id *oid,
 
 static const char *match_ref_pattern = NULL;
 static int match_ref_slash = 0;
-static int count_slash(const char *s)
-{
-	int cnt = 0;
-	while (*s)
-		if (*s++ == '/')
-			cnt++;
-	return cnt;
-}
 
 static int append_matching_ref(const char *refname, const struct object_id *oid,
 			       int flag, void *cb_data)
@@ -438,7 +431,7 @@ static int append_matching_ref(const char *refname, const struct object_id *oid,
 	 * refs/tags/v0.99.9a and friends.
 	 */
 	const char *tail;
-	int slash = count_slash(refname);
+	int slash = count_slashes(refname);
 	for (tail = refname; *tail && match_ref_slash < slash; )
 		if (*tail++ == '/')
 			slash--;
@@ -529,7 +522,7 @@ static void append_one_rev(const char *av)
 		int saved_matches = ref_name_cnt;
 
 		match_ref_pattern = av;
-		match_ref_slash = count_slash(av);
+		match_ref_slash = count_slashes(av);
 		for_each_ref(append_matching_ref, NULL);
 		if (saved_matches == ref_name_cnt &&
 		    ref_name_cnt < MAX_REVS)
diff --git a/dir.c b/dir.c
index 9efcf1eab..4a953c16a 100644
--- a/dir.c
+++ b/dir.c
@@ -52,6 +52,15 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 static int get_dtype(struct dirent *de, struct index_state *istate,
 		     const char *path, int len);
 
+int count_slashes(const char *s)
+{
+	int cnt = 0;
+	while (*s)
+		if (*s++ == '/')
+			cnt++;
+	return cnt;
+}
+
 int fspathcmp(const char *a, const char *b)
 {
 	return ignore_case ? strcasecmp(a, b) : strcmp(a, b);
diff --git a/dir.h b/dir.h
index a89c13e27..e3717055d 100644
--- a/dir.h
+++ b/dir.h
@@ -197,6 +197,9 @@ struct dir_struct {
 	unsigned unmanaged_exclude_files;
 };
 
+/*Count the number of slashes for string s*/
+extern int count_slashes(const char *s);
+
 /*
  * The ordering of these constants is significant, with
  * higher-numbered match types signifying "closer" (i.e. more
-- 
2.13.0


^ permalink raw reply	[relevance 10%]

* 'pu' broken at t5304 tonight
@ 2017-06-10  6:07 Junio C Hamano
  2017-06-10 12:48 ` Kevin Daudt
  0 siblings, 1 reply; 200+ results
From: Junio C Hamano @ 2017-06-10  6:07 UTC (permalink / raw)
  To: git

I didn't check where it goes wrong.  Here is a list of suspects,
taken by

    $ git shortlog --no-merges pu@{8.hours}..pu

i.e. patches that weren't in pu before today's integration cycle.

Andreas Heiduk (1):
      doc: describe git svn init --ignore-refs

Brandon Williams (32):
      config: create config.h
      config: remove git_config_iter
      config: don't include config.h by default
      config: don't implicitly use gitdir
      setup: don't perform lazy initialization of repository state
      environment: remove namespace_len variable
      repository: introduce the repository object
      environment: place key repository state in the_repository
      environment: store worktree in the_repository
      setup: add comment indicating a hack
      config: read config from a repository object
      repository: add index_state to struct repo
      submodule-config: store the_submodule_cache in the_repository
      submodule: add repo_read_gitmodules
      submodule: convert is_submodule_initialized to work on a repository
      convert: convert get_cached_convert_stats_ascii to take an index
      convert: convert crlf_to_git to take an index
      convert: convert convert_to_git_filter_fd to take an index
      convert: convert convert_to_git to take an index
      convert: convert renormalize_buffer to take an index
      tree: convert read_tree to take an index parameter
      ls-files: convert overlay_tree_on_cache to take an index
      ls-files: convert write_eolinfo to take an index
      ls-files: convert show_killed_files to take an index
      ls-files: convert show_other_files to take an index
      ls-files: convert show_ru_info to take an index
      ls-files: convert ce_excluded to take an index
      ls-files: convert prune_cache to take an index
      ls-files: convert show_files to take an index
      ls-files: factor out debug info into a function
      ls-files: factor out tag calculation
      ls-files: use repository object

Jeff King (1):
      date: use localtime() for "-local" time formats

Johannes Schindelin (8):
      discover_git_directory(): avoid setting invalid git_dir
      config: report correct line number upon error
      help: use early config when autocorrecting aliases
      read_early_config(): optionally return the worktree's top-level directory
      t1308: relax the test verifying that empty alias values are disallowed
      t7006: demonstrate a problem with aliases in subdirectories
      alias_lookup(): optionally return top-level directory
      Use the early config machinery to expand aliases

Junio C Hamano (1):
      ### match next

Prathamesh Chavan (1):
      dir: create function count_slashes

SZEDER Gábor (5):
      revision.h: turn rev_info.early_output back into an unsigned int
      revision.c: stricter parsing of '--no-{min,max}-parents'
      revision.c: stricter parsing of '--early-output'
      revision.c: use skip_prefix() in handle_revision_opt()
      revision.c: use skip_prefix() in handle_revision_pseudo_opt()

Stefan Beller (1):
      t4005: modernize style and drop hard coded sha1


^ permalink raw reply	[relevance 15%]

* Re: 'pu' broken at t5304 tonight
  2017-06-10  6:07 'pu' broken at t5304 tonight Junio C Hamano
@ 2017-06-10 12:48 ` Kevin Daudt
  2017-06-10 19:05   ` Kevin Daudt
  0 siblings, 1 reply; 200+ results
From: Kevin Daudt @ 2017-06-10 12:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Sat, Jun 10, 2017 at 03:07:01PM +0900, Junio C Hamano wrote:
> I didn't check where it goes wrong.  Here is a list of suspects,
> taken by
> 
>     $ git shortlog --no-merges pu@{8.hours}..pu
> 
> i.e. patches that weren't in pu before today's integration cycle.
> 
> Andreas Heiduk (1):
>       doc: describe git svn init --ignore-refs
> 
> Brandon Williams (32):
>       config: create config.h
>       config: remove git_config_iter
>       config: don't include config.h by default
>       config: don't implicitly use gitdir
>       setup: don't perform lazy initialization of repository state
>       environment: remove namespace_len variable
>       repository: introduce the repository object
>       environment: place key repository state in the_repository
>       environment: store worktree in the_repository
>       setup: add comment indicating a hack
>       config: read config from a repository object
>       repository: add index_state to struct repo
>       submodule-config: store the_submodule_cache in the_repository
>       submodule: add repo_read_gitmodules
>       submodule: convert is_submodule_initialized to work on a repository
>       convert: convert get_cached_convert_stats_ascii to take an index
>       convert: convert crlf_to_git to take an index
>       convert: convert convert_to_git_filter_fd to take an index
>       convert: convert convert_to_git to take an index
>       convert: convert renormalize_buffer to take an index
>       tree: convert read_tree to take an index parameter
>       ls-files: convert overlay_tree_on_cache to take an index
>       ls-files: convert write_eolinfo to take an index
>       ls-files: convert show_killed_files to take an index
>       ls-files: convert show_other_files to take an index
>       ls-files: convert show_ru_info to take an index
>       ls-files: convert ce_excluded to take an index
>       ls-files: convert prune_cache to take an index
>       ls-files: convert show_files to take an index
>       ls-files: factor out debug info into a function
>       ls-files: factor out tag calculation
>       ls-files: use repository object
> 
> Jeff King (1):
>       date: use localtime() for "-local" time formats
> 
> Johannes Schindelin (8):
>       discover_git_directory(): avoid setting invalid git_dir
>       config: report correct line number upon error
>       help: use early config when autocorrecting aliases
>       read_early_config(): optionally return the worktree's top-level directory
>       t1308: relax the test verifying that empty alias values are disallowed
>       t7006: demonstrate a problem with aliases in subdirectories
>       alias_lookup(): optionally return top-level directory
>       Use the early config machinery to expand aliases
> 
> Junio C Hamano (1):
>       ### match next
> 
> Prathamesh Chavan (1):
>       dir: create function count_slashes
> 
> SZEDER Gábor (5):
>       revision.h: turn rev_info.early_output back into an unsigned int
>       revision.c: stricter parsing of '--no-{min,max}-parents'
>       revision.c: stricter parsing of '--early-output'
>       revision.c: use skip_prefix() in handle_revision_opt()
>       revision.c: use skip_prefix() in handle_revision_pseudo_opt()
> 
> Stefan Beller (1):
>       t4005: modernize style and drop hard coded sha1
> 

For me, this bisects to the latest merge:

2047eebd3 (Merge branch 'bw/repo-object' into pu, 2017-06-10), but
neither of the parent of the merge break this test, so it looks like
it's because of an interaction between the repo-object topic and another
topic.

^ permalink raw reply	[relevance 1%]

* Re: 'pu' broken at t5304 tonight
  2017-06-10 12:48 ` Kevin Daudt
@ 2017-06-10 19:05   ` Kevin Daudt
  0 siblings, 0 replies; 200+ results
From: Kevin Daudt @ 2017-06-10 19:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Sat, Jun 10, 2017 at 02:48:36PM +0200, Kevin Daudt wrote:
> On Sat, Jun 10, 2017 at 03:07:01PM +0900, Junio C Hamano wrote:
> > I didn't check where it goes wrong.  Here is a list of suspects,
> > taken by
> > 
> >     $ git shortlog --no-merges pu@{8.hours}..pu
> > 
> > i.e. patches that weren't in pu before today's integration cycle.
> > 
> > Andreas Heiduk (1):
> >       doc: describe git svn init --ignore-refs
> > 
> > Brandon Williams (32):
> >       config: create config.h
> >       config: remove git_config_iter
> >       config: don't include config.h by default
> >       config: don't implicitly use gitdir
> >       setup: don't perform lazy initialization of repository state
> >       environment: remove namespace_len variable
> >       repository: introduce the repository object
> >       environment: place key repository state in the_repository
> >       environment: store worktree in the_repository
> >       setup: add comment indicating a hack
> >       config: read config from a repository object
> >       repository: add index_state to struct repo
> >       submodule-config: store the_submodule_cache in the_repository
> >       submodule: add repo_read_gitmodules
> >       submodule: convert is_submodule_initialized to work on a repository
> >       convert: convert get_cached_convert_stats_ascii to take an index
> >       convert: convert crlf_to_git to take an index
> >       convert: convert convert_to_git_filter_fd to take an index
> >       convert: convert convert_to_git to take an index
> >       convert: convert renormalize_buffer to take an index
> >       tree: convert read_tree to take an index parameter
> >       ls-files: convert overlay_tree_on_cache to take an index
> >       ls-files: convert write_eolinfo to take an index
> >       ls-files: convert show_killed_files to take an index
> >       ls-files: convert show_other_files to take an index
> >       ls-files: convert show_ru_info to take an index
> >       ls-files: convert ce_excluded to take an index
> >       ls-files: convert prune_cache to take an index
> >       ls-files: convert show_files to take an index
> >       ls-files: factor out debug info into a function
> >       ls-files: factor out tag calculation
> >       ls-files: use repository object
> > 
> > Jeff King (1):
> >       date: use localtime() for "-local" time formats
> > 
> > Johannes Schindelin (8):
> >       discover_git_directory(): avoid setting invalid git_dir
> >       config: report correct line number upon error
> >       help: use early config when autocorrecting aliases
> >       read_early_config(): optionally return the worktree's top-level directory
> >       t1308: relax the test verifying that empty alias values are disallowed
> >       t7006: demonstrate a problem with aliases in subdirectories
> >       alias_lookup(): optionally return top-level directory
> >       Use the early config machinery to expand aliases
> > 
> > Junio C Hamano (1):
> >       ### match next
> > 
> > Prathamesh Chavan (1):
> >       dir: create function count_slashes
> > 
> > SZEDER Gábor (5):
> >       revision.h: turn rev_info.early_output back into an unsigned int
> >       revision.c: stricter parsing of '--no-{min,max}-parents'
> >       revision.c: stricter parsing of '--early-output'
> >       revision.c: use skip_prefix() in handle_revision_opt()
> >       revision.c: use skip_prefix() in handle_revision_pseudo_opt()
> > 
> > Stefan Beller (1):
> >       t4005: modernize style and drop hard coded sha1
> > 
> 
> For me, this bisects to the latest merge:
> 
> 2047eebd3 (Merge branch 'bw/repo-object' into pu, 2017-06-10), but
> neither of the parent of the merge break this test, so it looks like
> it's because of an interaction between the repo-object topic and another
> topic.

Merging the repo-object with different other topic branches reveals this
topic to cause the bad interaction:

b56c91004 (Merge branch 'nd/prune-in-worktree' into pu, 2017-06-10)

Still investigating why it happens.

^ permalink raw reply	[relevance 1%]

* Re: [PATCH v2 00/32] repository object
      [irrelevant]     ` <20170610060712.foqre5fscaxu3tnx@sigill.intra.peff.net>
@ 2017-06-12  5:24       ` Stefan Beller
  2017-06-12 21:23         ` Jeff King
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-06-12  5:24 UTC (permalink / raw)
  To: Jeff King; +Cc: Jonathan Tan, Brandon Williams, git, Jonathan Nieder, Jacob Keller, Johannes Schindelin, brian m. carlson, Ben Peart, Duy Nguyen, Junio C Hamano, Jeff Hostetler, Ævar Arnfjörð Bjarmason

On Fri, Jun 9, 2017 at 11:07 PM, Jeff King <peff@peff.net> wrote:
> On Fri, Jun 09, 2017 at 05:40:34PM -0700, Jonathan Tan wrote:
>
>> Before I get into the details, I have some questions:
>>
>> 1. I am concerned that "struct repository" will end up growing without
>> bounds as we store more and more repo-specific concerns in it. Could it
>> be restricted to just the fields populated by repo_init()?
>> repo_read_index() will then return the index itself, instead of using
>> "struct repository" as a cache. This means that code using
>> repo_read_index() will need to maintain its own variable holding the
>> returned index, but that is likely a positive - it's better for code to
>> just pass around the specific thing needed between functions anyway, as
>> opposed to passing a giant "struct repository" (which partially defeats
>> the purpose of eliminating the usage of globals).
>
> I think the repository object has to become a kitchen sink of sorts,
> because we have tons of global variables representing repo-wide config.

AFAICT we want to operate on struct 'the_repo' and struct 'the_cmd_options'
eventually. In our use case of submodules the submodules would ignore the
settings of the main repo, but still accept guidance of the_cmd_config or
'the_config.

> So I have a feeling that we're always going to need some
> big object to hold all that context when doing multi-repo operations in
> a single process.

Well not just one big struct, but two. (or more?)

^ permalink raw reply	[relevance 17%]

* Re: [BUG?] gitlink without .gitmodules no longer fails recursive clone
      [irrelevant]         ` <xmqqbmpw4mpo.fsf@gitster.mtv.corp.google.com>
@ 2017-06-12  5:30           ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-06-12  5:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Brandon Williams, git

On Fri, Jun 9, 2017 at 7:10 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Jeff King <peff@peff.net> writes:
>
>> In an ideal world the user do:
>>
>>   git submodule add git://host/repo.git path
>>
>> which adds the gitlink and the .gitmodules entry. But it doesn't seem
>> unreasonable for somebody unfamiliar with submodules to do:
>>
>>   git clone git://host/repo.git path
>>   git add path
>>
>> This does add the entry as a gitlink, but doesn't write any sort of
>> .gitmodules entry.
>
> I actually would think that is a perfectly valid state.

me too.

But on the other hand I do not want to offend non-submodule-gitlink-users
too much. So maybe:

  $ git add <gitlink>
  # Adding a raw gitlink to the index,
  # in case you want to use a submodule,
  # use add a .gitmodules file or use 'git submodule add'

> In that
> original repository pair (i.e. the superproject with a submodule
> without an entry in .gitmodules), as long as the configuration in
> the submodule repository "path/.git/config" has necessary remote
> definitions, "git push/fetch --recursive" etc., should also be able
> to work without having to consult .gitmodules at the top-level
> superproject, I would think.

but these are the 2nd step, clone fails first.

>
>> With the old code, cloning the repository (either by
>> another user, or in our case during a Pages build), a recursive clone or
>> submodule init would complain loudly. But now it's just quietly ignored.
>> Which seems unfortunate.
>
> Of course, if such an original superproject gets pushed to a
> publishing location and then the result is cloned, without an entry
> in .gitmodules, no information "git submodule" can use to work on
> that "path" exists in that clone.  I would say it is OK to leave it
> as-is when going "--recursive" (what you called "inactive because
> it does not even have a .gitmodules entry).
>
> But even in such a clone, once the user who cloned learns where the
> submodule commit that is recorded in the superproject's tree can be
> obtained out-of-band and makes a clone at "path" manually (which
> replicates the state the original repository pair), things that only
> need to look at "path/.git/config" should be able to work (e.g. "git
> fetch --recursive"), I'd say.

But the user would never learn, because

  $ git clone --recurse-submodules ...

does not complain, but put an empty dir instead.


>
>

^ permalink raw reply	[relevance 18%]

* Re: [PATCH v1] dir: create function count_slashes
  2017-06-08 18:08 [PATCH v1] dir: create function count_slashes Prathamesh Chavan
@ 2017-06-12  5:33 ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-06-12  5:33 UTC (permalink / raw)
  To: Prathamesh Chavan; +Cc: git, Christian Couder, Junio C Hamano

On Thu, Jun 8, 2017 at 11:08 AM, Prathamesh Chavan <pc44800@gmail.com> wrote:
> Similar functions exist in apply.c and builtin/show-branch.c for
> counting the number of slashes in a string. Also in the later
> patches, we introduce a third caller for the same. Hence, we unify
> it now by cleaning the existing functions and declaring a common
> function count_slashes in dir.h and implementing it in dir.c to
> remove this code duplication.
>
> Mentored-by: Christian Couder <christian.couder@gmail.com>
> Mentored-by: Stefan Beller <sbeller@google.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
> ---
> In future, I intend to use this function in builtin/submodule--helper.c
> as well, hence this change was introduced now.
>

Thanks for upstreaming this early!

I think this is a good change even
without the submodule work thrown into the soup as well,
but with the given promise to have it used a third time this
is a no-brainer later on.

^ permalink raw reply	[relevance 15%]

* Re: git push recurse.submodules behavior changed in 2.13
      [irrelevant]     ` <CAE5=+KUJr2=w3W=ZDTbd=L+8=KtwsV95Q7bcJassjzFncrnBKQ@mail.gmail.com>
@ 2017-06-12 17:27       ` Stefan Beller
  2017-06-16 14:11         ` John Shahid
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-06-12 17:27 UTC (permalink / raw)
  To: John Shahid; +Cc: Jonathan Nieder, git, Brandon Williams

On Sat, Jun 10, 2017 at 6:28 AM, John Shahid <jvshahid@gmail.com> wrote:
> bump. it's been a while and I'm still not clear what the next steps
> are. I'm happy to send a patch but I would like to get a consensus
> first.

What do you want a consensus on?
(Is the change in 2.13 a bug or feature? I considered it enough
of a feature to not pursue an urgent bug fix. Maybe I misunderstood
the discussion)

This thread has diverged into lots of things that could be done. Jonathan
pointed out 3 possible ways forward. (1) and (3) are being worked on,
(2) is a new thing that nobody (not Brandon nor me) have on our radar
for what to work on next.  So maybe that is a good idea to get started
when you want to get into sending a patch?

Also I have the impression that there may be one corner case, in which
the handling of refspecs is missed, i.e. maybe we'd need a list (in the
man page eventually) on what currently happens when recursing in
combination with remote/refspecs or other options.

^ permalink raw reply	[relevance 16%]

* Re: [PATCH v2 00/32] repository object
  2017-06-12  5:24       ` Re: [PATCH v2 00/32] repository object Stefan Beller
@ 2017-06-12 21:23         ` Jeff King
  0 siblings, 0 replies; 200+ results
From: Jeff King @ 2017-06-12 21:23 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Jonathan Tan, Brandon Williams, git, Jonathan Nieder, Jacob Keller, Johannes Schindelin, brian m. carlson, Ben Peart, Duy Nguyen, Junio C Hamano, Jeff Hostetler, Ævar Arnfjörð Bjarmason

On Sun, Jun 11, 2017 at 10:24:12PM -0700, Stefan Beller wrote:

> On Fri, Jun 9, 2017 at 11:07 PM, Jeff King <peff@peff.net> wrote:
> > On Fri, Jun 09, 2017 at 05:40:34PM -0700, Jonathan Tan wrote:
> >
> >> Before I get into the details, I have some questions:
> >>
> >> 1. I am concerned that "struct repository" will end up growing without
> >> bounds as we store more and more repo-specific concerns in it. Could it
> >> be restricted to just the fields populated by repo_init()?
> >> repo_read_index() will then return the index itself, instead of using
> >> "struct repository" as a cache. This means that code using
> >> repo_read_index() will need to maintain its own variable holding the
> >> returned index, but that is likely a positive - it's better for code to
> >> just pass around the specific thing needed between functions anyway, as
> >> opposed to passing a giant "struct repository" (which partially defeats
> >> the purpose of eliminating the usage of globals).
> >
> > I think the repository object has to become a kitchen sink of sorts,
> > because we have tons of global variables representing repo-wide config.
> 
> AFAICT we want to operate on struct 'the_repo' and struct 'the_cmd_options'
> eventually. In our use case of submodules the submodules would ignore the
> settings of the main repo, but still accept guidance of the_cmd_config or
> 'the_config.
> 
> > So I have a feeling that we're always going to need some
> > big object to hold all that context when doing multi-repo operations in
> > a single process.
> 
> Well not just one big struct, but two. (or more?)

Right, I think you could have a separate kitchen-sink struct that isn't
the "repo" one. But now you have to pass both of those around, which is
going to get cumbersome. Almost every function is going to end up
passing around the context struct.

I almost think it would be easier to shove them all of the context into
a big global struct and "push" and "pop" contexts from a stack of
structs. That gets you the in-process benefits, though of course it's
absolutely horrible if you ever want to multi-thread across two contexts.

-Peff

^ permalink raw reply	[relevance 8%]

* [GSoC] Update: Week 4
@ 2017-06-12 22:10 Prathamesh Chavan
  0 siblings, 0 replies; 200+ results
From: Prathamesh Chavan @ 2017-06-12 22:10 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller, Christian Couder

SUMMARY OF MY PROJECT:

Git submodule subcommands are currently implemented by using shell script
'git-submodule.sh'. There are several reasons why we'll prefer not to
use the shell script. My project intends to convert the subcommands into
C code, thus making them builtins. This will increase Git's portability
and hence the efficiency of working with the git-submodule commands.
Link to the complete proposal: [1]

Mentors:
Stefan Beller <sbeller@google.com>
Christian Couder <christian.couder@gmail.com>

UPDATES:

Following are the updates about my ongoing project:

1. sync and status: The improvisions for the ported functions were
   implemented. I'm planning on floating a series patches,
   containing all the ported functions put together, but
   recently encountered some issue with the get_submodule_displaypath
   function, I haven't yet posted an updated version of the above
   ported function.The issues are now resolved. Hence, soon I plan
   on posting their updated versions.

2. deinit: As planned for the week, this submodule subcommand is
   ported from shell to C. But still, there are some tests, the
   ported functions are failing. Along with the updated versions
   of 'status' and 'sync', I'll also be posting a WIP patch
   about this subcommand ported.

3. summary: Porting of this subcommand is still underway. I choose
   to do the porting of this subcommand after deinit as it was
   smaller and hence porting this subcommand is still left.

4. count_slashes: A function was introduced in dir.h for reducing
   the code-duplication as similar functions exist in apply.c and
   builtin/show-branch.c

PLAN FOR WEEK-5 (13 June 2017 to 19 June 2017):

1. sync and status: Since the changes are ready, I plan to post the
   complete series of the updated version soon as a single series of
   patches.

2. summary: As this is still underway, I'll be finishing this submodule
   subcommand in the following week.

3. foreach: To unblock the conversion of this submodule subcommand,
   I'll be focusing on porting the original cmd_foreach, and
   will not be including the BUG-FIX patch here.
   An additional NEEDSWORK comment will be added to the ported function,
   stating the reported bug, and not resolving the bug in this patch
   series.

4. deinit: As mentioned earlier, there is still some debugging left for
   the ported functions. I plan to debug them and discuss the patch
   for further improvisions this week.

[1]: https://docs.google.com/document/d/1krxVLooWl--75Pot3dazhfygR3wCUUWZWzTXtK1L-xU/

Thanks,
Prathamesh Chavan

^ permalink raw reply	[relevance 17%]

* Re: [PATCH 1/2] add: warn when adding an embedded repository
      [irrelevant] ` <20170613092408.db22ygki6wg2t23d@sigill.intra.peff.net>
@ 2017-06-13 17:07   ` Stefan Beller
  2017-06-14  6:36     ` Jeff King
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-06-13 17:07 UTC (permalink / raw)
  To: Jeff King; +Cc: Brandon Williams, Junio C Hamano, git

On Tue, Jun 13, 2017 at 2:24 AM, Jeff King <peff@peff.net> wrote:
> It's an easy mistake to add a repository inside another
> repository, like:
>
>   git clone $url
>   git add .
>
> The resulting entry is a gitlink, but there's no matching
> .gitmodules entry. Trying to use "submodule init" (or clone
> with --recursive) doesn't do anything useful. Prior to
> v2.13, such an entry caused git-submodule to barf entirely.
> In v2.13, the entry is considered "inactive" and quietly
> ignored. Either way, no clone of your repository can do
> anything useful with the gitlink without the user manually
> adding the submodule config.
>
> In most cases, the user probably meant to either add a real
> submodule, or they forgot to put the embedded repository in
> their .gitignore file.
>
> Let's issue a warning when we see this case. There are a few
> things to note:
>
>   - the warning will go in the git-add porcelain; anybody
>     wanting to do low-level manipulation of the index is
>     welcome to create whatever funny states they want.
>
>   - we detect the case by looking for a newly added gitlink;
>     updates via "git add submodule" are perfectly reasonable,
>     and this avoids us having to investigate .gitmodules
>     entirely
>
>   - there's a command-line option to suppress the warning.
>     This is needed for git-submodule itself (which adds the
>     entry before adding any submodule config), but also
>     provides a mechanism for other scripts doing
>     submodule-like things.
>
> We could make this a hard error instead of a warning.
> However, we do add lots of sub-repos in our test suite. It's
> not _wrong_ to do so. It just creates a state where users
> may be surprised. Pointing them in the right direction with
> a gentle hint is probably the best option.

Sounds good up to here (and right).

> There is a config knob that can disable the (long) hint. But
> I intentionally omitted a config knob to disable the warning
> entirely. Whether the warning is sensible or not is
> generally about context, not about the user's preferences.
> If there's a tool or workflow that adds gitlinks without
> matching .gitmodules, it should probably be taught about the
> new command-line option, rather than blanket-disabling the
> warning.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> The check for "is this a gitlink" is done by looking for a
> trailing "/" in the added path. This feels kind of hacky,
> but actually seems to work well in practice.

This whole "slash at the end" thing comes from extensive use
of shell completion adding the slash at the end of a directory
IMHO. (cf. PATHSPEC_STRIP_SUBMODULE_SLASH_* is
the same underlying hack.)

> We've already
> expanded the pathspecs to real filenames via dir.c, and that
> omits trees. So anything with a trailing slash must be a
> gitlink.

Oh!

>
> And I really didn't want to incur any extra cost in the
> common case here (e.g., checking for "path/.git"). We could
> do it at zero-cost by pushing the check much further down
> (i.e., when we'd realize anyway that it's a gitlink), but I
> didn't want to pollute read-cache.c with what is essentially
> a porcelain warning. The actual check done there seems to be
> checking S_ISDIR, but I didn't even want to incur an extra
> stat per-file.

makes sense.

>
> I also waffled on whether we should ask the submodule code
> whether it knows about a particular path. Technically:
>
>   git config submodule.foo.path foo
>   git config submodule.foo.url git://...
>   git add foo
>
> is legal, but would still warn with this patch. I don't know
> how much we should care (it would also be easy to do on
> top).

And here I was thinking this is not legal, because you may override
anything *except* submodule.*.path in the config. That is because
all the other settings (such as url, active flag, branch,
shallow recommendation) are dependent on the use case, the user,
changes to the environment (url) or such. The name<->path mapping
however is only to be changed via changes to the tracked content.
That is why it would make sense to disallow overriding the path
outside the tracked content.

In my ideal dream world of submodules we would have the following:

  $ cat .gitmodules
  [submodule "sub42"]
    path = foo
  # path only in tree!

  $ cat .git/config
  ...
  [submodule]
    active = .
    active = :(exclude)Irrelevant/submodules/for/my/usecase/*
  # note how this is user centric

  $ git show refs/meta/magic/for/refs/heads/master:.gitmodules
  [submodule "sub42"]
    url = https://example.org/foo
    branch = .
  # Note how this is neither centering on the in-tree
  # contents, nor the user. Instead it focuses on the
  # project or group. It is *workflow* centric.
  # Workflows may change over time, e.g. the url could
  # be repointed to k.org or an in-house mirror without tree
  # changes.


But back to reviewing this patch.

>
>  Documentation/config.txt      |  3 +++
>  Documentation/git-add.txt     |  7 +++++++
>  advice.c                      |  2 ++
>  advice.h                      |  1 +
>  builtin/add.c                 | 45 ++++++++++++++++++++++++++++++++++++++++++-
>  git-submodule.sh              |  5 +++--
>  t/t7414-submodule-mistakes.sh | 40 ++++++++++++++++++++++++++++++++++++++
>  7 files changed, 100 insertions(+), 3 deletions(-)
>  create mode 100755 t/t7414-submodule-mistakes.sh
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index dd4beec39..e909239bc 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -348,6 +348,9 @@ advice.*::
>         rmHints::
>                 In case of failure in the output of linkgit:git-rm[1],
>                 show directions on how to proceed from the current state.
> +       addEmbeddedRepo::
> +               Advice on what to do when you've accidentally added one
> +               git repo inside of another.
>  --
>
>  core.fileMode::
> diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
> index 7ed63dce0..f4169fb1e 100644
> --- a/Documentation/git-add.txt
> +++ b/Documentation/git-add.txt
> @@ -165,6 +165,13 @@ for "git add --no-all <pathspec>...", i.e. ignored removed files.
>         be ignored, no matter if they are already present in the work
>         tree or not.
>
> +--no-warn-embedded-repo::
> +       By default, `git add` will warn when adding an embedded
> +       repository to the index without using `git submodule add` to
> +       create an entry in `.gitmodules`. This option will suppress the
> +       warning (e.g., if you are manually performing operations on
> +       submodules).
> +
>  --chmod=(+|-)x::
>         Override the executable bit of the added files.  The executable
>         bit is only changed in the index, the files on disk are left
> diff --git a/advice.c b/advice.c
> index b84ae4960..e0611d52b 100644
> --- a/advice.c
> +++ b/advice.c
> @@ -15,6 +15,7 @@ int advice_detached_head = 1;
>  int advice_set_upstream_failure = 1;
>  int advice_object_name_warning = 1;
>  int advice_rm_hints = 1;
> +int advice_add_embedded_repo = 1;
>
>  static struct {
>         const char *name;
> @@ -35,6 +36,7 @@ static struct {
>         { "setupstreamfailure", &advice_set_upstream_failure },
>         { "objectnamewarning", &advice_object_name_warning },
>         { "rmhints", &advice_rm_hints },
> +       { "addembeddedrepo", &advice_add_embedded_repo },
>
>         /* make this an alias for backward compatibility */
>         { "pushnonfastforward", &advice_push_update_rejected }
> diff --git a/advice.h b/advice.h
> index b341a55ce..c84a44531 100644
> --- a/advice.h
> +++ b/advice.h
> @@ -18,6 +18,7 @@ extern int advice_detached_head;
>  extern int advice_set_upstream_failure;
>  extern int advice_object_name_warning;
>  extern int advice_rm_hints;
> +extern int advice_add_embedded_repo;
>
>  int git_default_advice_config(const char *var, const char *value);
>  __attribute__((format (printf, 1, 2)))
> diff --git a/builtin/add.c b/builtin/add.c
> index d9a2491e4..ea88db281 100644
> --- a/builtin/add.c
> +++ b/builtin/add.c
> @@ -249,6 +249,7 @@ N_("The following paths are ignored by one of your .gitignore files:\n");
>
>  static int verbose, show_only, ignored_too, refresh_only;
>  static int ignore_add_errors, intent_to_add, ignore_missing;
> +static int warn_on_embedded_repo = 1;
>
>  #define ADDREMOVE_DEFAULT 1
>  static int addremove = ADDREMOVE_DEFAULT;
> @@ -282,6 +283,8 @@ static struct option builtin_add_options[] = {
>         OPT_BOOL( 0 , "ignore-errors", &ignore_add_errors, N_("just skip files which cannot be added because of errors")),
>         OPT_BOOL( 0 , "ignore-missing", &ignore_missing, N_("check if - even missing - files are ignored in dry run")),
>         OPT_STRING( 0 , "chmod", &chmod_arg, N_("(+/-)x"), N_("override the executable bit of the listed files")),
> +       OPT_HIDDEN_BOOL(0, "warn-embedded-repo", &warn_on_embedded_repo,
> +                       N_("warn when adding an embedded repository")),

We do not have a lot of OPT_HIDDEN_BOOLs throughout the code base.
We should use them more often.

It makes sense though in this case.

>         OPT_END(),
>  };
>
> @@ -295,6 +298,44 @@ static int add_config(const char *var, const char *value, void *cb)
>         return git_default_config(var, value, cb);
>  }
>
> +static const char embedded_advice[] = N_(
> +"You've added another git repository inside your current repository.\n"
> +"Clones of the outer repository will not also contain the contents of\n"
> +"the embedded repository. If you meant to add a submodule, use:\n"

The "will not also" sounds a bit off to me. Maybe:
  ...
  Clones of the outer repository will not contain the contents
  of the embedded repository and has no way of knowing how
  to obtain the inner repo. If you meant to add a submodule ...


> +"\n"
> +"      git submodule add <url> %s\n"
> +"\n"
> +"If you added this path by mistake, you can remove it from the\n"
> +"index with:\n"
> +"\n"
> +"      git rm --cached %s\n"
> +"\n"
> +"See \"git help submodule\" for more information."

Once the overhaul of the submodule documentation
comes along[1], we rather want to point at
"man 7 git-submodules", which explains the concepts and
then tell you about commands how to use it. For now the
git-submodule man page is ok.

[1] https://public-inbox.org/git/20170607185354.10050-1-sbeller@google.com/


> +);
> +
> +static void check_embedded_repo(const char *path)
> +{
> +       struct strbuf name = STRBUF_INIT;
> +
> +       if (!warn_on_embedded_repo)
> +               return;
> +       if (!ends_with(path, "/"))
> +               return;
> +
> +       /* Drop trailing slash for aesthetics */
> +       strbuf_addstr(&name, path);
> +       strbuf_strip_suffix(&name, "/");
> +
> +       warning(_("adding embedded git repository: %s"), name.buf);
> +       if (advice_add_embedded_repo) {
> +               advise(embedded_advice, name.buf, name.buf);
> +               /* there may be multiple entries; advise only once */
> +               advice_add_embedded_repo = 0;
> +       }
> +
> +       strbuf_release(&name);
> +}
> +
>  static int add_files(struct dir_struct *dir, int flags)
>  {
>         int i, exit_status = 0;
> @@ -307,12 +348,14 @@ static int add_files(struct dir_struct *dir, int flags)
>                 exit_status = 1;
>         }
>
> -       for (i = 0; i < dir->nr; i++)
> +       for (i = 0; i < dir->nr; i++) {
> +               check_embedded_repo(dir->entries[i]->name);
>                 if (add_file_to_index(&the_index, dir->entries[i]->name, flags)) {
>                         if (!ignore_add_errors)
>                                 die(_("adding files failed"));
>                         exit_status = 1;
>                 }
> +       }
>         return exit_status;
>  }
>
> diff --git a/git-submodule.sh b/git-submodule.sh
> index c0d0e9a4c..e131760ee 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -213,7 +213,8 @@ cmd_add()
>                 die "$(eval_gettext "'\$sm_path' already exists in the index and is not a submodule")"
>         fi
>
> -       if test -z "$force" && ! git add --dry-run --ignore-missing "$sm_path" > /dev/null 2>&1
> +       if test -z "$force" &&
> +               ! git add --dry-run --ignore-missing --no-warn-embedded-repo "$sm_path" > /dev/null 2>&1
>         then
>                 eval_gettextln "The following path is ignored by one of your .gitignore files:
>  \$sm_path
> @@ -267,7 +268,7 @@ or you are unsure what this means choose another name with the '--name' option."
>         fi
>         git config submodule."$sm_name".url "$realrepo"
>
> -       git add $force "$sm_path" ||
> +       git add --no-warn-embedded-repo $force "$sm_path" ||
>         die "$(eval_gettext "Failed to add submodule '\$sm_path'")"
>
>         git config -f .gitmodules submodule."$sm_name".path "$sm_path" &&
> diff --git a/t/t7414-submodule-mistakes.sh b/t/t7414-submodule-mistakes.sh
> new file mode 100755
> index 000000000..8059bcb7f
> --- /dev/null
> +++ b/t/t7414-submodule-mistakes.sh
> @@ -0,0 +1,40 @@
> +#!/bin/sh
> +
> +test_description='handling of common mistakes people may make with submodules'

That is one way to say it. Do we have other tests for
"you think it is a bug, but it is features" ? ;)
I like it though. :)

> +. ./test-lib.sh
> +
> +test_expect_success 'create embedded repository' '
> +       git init embed &&
> +       (
> +               cd embed &&
> +               test_commit one
> +       )

shorter via:

  test_create_repo embed &&
  test_commit -C embed one

(and saves a shell IIRC)


> +test_expect_success 'git-add on embedded repository warns' '
> +       test_when_finished "git rm --cached -f embed" &&
> +       git add embed 2>stderr &&
> +       test_i18ngrep warning stderr
> +'
> +
> +test_expect_success '--no-warn-embedded-repo suppresses warning' '
> +       test_when_finished "git rm --cached -f embed" &&
> +       git add --no-warn-embedded-repo embed 2>stderr &&
> +       test_i18ngrep ! warning stderr
> +'
> +
> +test_expect_success 'no warning when updating entry' '
> +       test_when_finished "git rm --cached -f embed" &&
> +       git add embed &&
> +       git -C embed commit --allow-empty -m two &&
> +       git add embed 2>stderr &&
> +       test_i18ngrep ! warning stderr
> +'
> +
> +test_expect_success 'submodule add does not warn' '
> +       test_when_finished "git rm -rf submodule .gitmodules" &&
> +       git submodule add ./embed submodule 2>stderr &&
> +       test_i18ngrep ! warning stderr
> +'

Thanks for these tests.

This patch looks good to me, apart from the perceived wording nits.

Thanks,
Stefan

> +
> +test_done
> --
> 2.13.1.675.g57c06d071
>

^ permalink raw reply	[relevance 17%]

* Re: [PATCH 2/2] t: move "git add submodule" into test blocks
      [irrelevant] ` <20170613092419.hzrtbn2jvykoxsry@sigill.intra.peff.net>
@ 2017-06-13 17:15   ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-06-13 17:15 UTC (permalink / raw)
  To: Jeff King; +Cc: Brandon Williams, Junio C Hamano, git

On Tue, Jun 13, 2017 at 2:24 AM, Jeff King <peff@peff.net> wrote:
> Some submodule tests do some setup outside of a test_expect
> block. This is bad because we won't actually check the
> outcome of those commands. But it's doubly so because "git
> add submodule" now produces a warning to stderr, which is
> not suppressed by the test scripts in non-verbose mode.

Makes sense.

> This patch does the minimal to fix the annoying warnings.
> All three of these scripts could use more cleanup of related
> setup.

agreed.

>
> Signed-off-by: Jeff King <peff@peff.net>

Reviewed-by: Stefan Beller <sbeller@google.com>

^ permalink raw reply	[relevance 25%]

* Re: [RFC/PATCH] builtin/blame: darken redundant line information
      [irrelevant]                 ` <xmqq37b37p9o.fsf@gitster.mtv.corp.google.com>
@ 2017-06-13 18:00                   ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-06-13 18:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, Jun 13, 2017 at 10:48 AM, Junio C Hamano <gitster@pobox.com> wrote:
>
> I never said "start and end" (you did).  I just wanted the boundary
> of A and B and C clear, so I'd be perfectly happy with:
>
>          context
>         +A      dim
>         +A      dim
>         +A      highlight #1
>         +C      highlight #2
>         +B      highlight #1
>         +B      dim
>         +B      dim
>          context
>
> You can do that still with only two highlight colors, no?

So to put it into an algorithm:

1) detect blocks
2) if blocks are adjacent, their bounds are eligible for highlighting
3) the highlighting is implemented using the "alternate" strategy
  in that any line highlighted belonging to a different block flips
  the highlighting, such that:

          context
         +A      dim
         +A      dim
         +A      highlight #1
         +B      highlight #2
         +B      dim
         +B      dim
          context

So if we go this way, we would need indeed 6 colors:

  Dimmed, Highlighted, HighlightedAlternative

color-moved modes:
nobounds::
  uses dimmed only
allbounds::
adjacentbounds::
  See algorithm above, using dimmed for inside the block and
  both highlights for bounds, making sure adjacent block bounds
  alternate the highlighting color.
alternate::
  Uses only highlighting colors, complete block is colored with
  one of the highlights

I think that is reasonable to implement. But I do still wonder if
we really want to add so many new colors.
I'll give it a try after my next submodule series.

Thanks,
Stefan

^ permalink raw reply	[relevance 15%]

* Re: [RFC/PATCH] submodules: overhaul documentation
  2017-06-07 18:53 [RFC/PATCH] submodules: overhaul documentation Stefan Beller
@ 2017-06-13 19:29 ` Junio C Hamano
  2017-06-13 21:06   ` Stefan Beller
  2017-06-20 18:18 ` Jonathan Tan
  2017-06-20 22:56 ` [PATCHv2] submodules: overhaul documentation Stefan Beller
  2 siblings, 1 reply; 200+ results
From: Junio C Hamano @ 2017-06-13 19:29 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

Stefan Beller <sbeller@google.com> writes:

> @@ -149,15 +119,17 @@ deinit [-f|--force] (--all|[--] <path>...)::
>  	tree. Further calls to `git submodule update`, `git submodule foreach`
>  	and `git submodule sync` will skip any unregistered submodules until
>  	they are initialized again, so use this command if you don't want to
> -	have a local checkout of the submodule in your working tree anymore. If
> -	you really want to remove a submodule from the repository and commit
> -	that use linkgit:git-rm[1] instead.
> +	have a local checkout of the submodule in your working tree anymore.
>  +
>  When the command is run without pathspec, it errors out,
>  instead of deinit-ing everything, to prevent mistakes.
>  +
>  If `--force` is specified, the submodule's working tree will
>  be removed even if it contains local modifications.
> ++
> +If you really want to remove a submodule from the repository and commit
> +that use linkgit:git-rm[1] instead. See linkgit:gitsubmodules[7] for removal
> +options.

Good reorganization.

> diff --git a/Documentation/gitsubmodules.txt b/Documentation/gitsubmodules.txt
> new file mode 100644
> index 0000000000..2bf3149b68
> --- /dev/null
> +++ b/Documentation/gitsubmodules.txt
> @@ -0,0 +1,214 @@
> +gitsubmodules(7)
> +================
> +
> +NAME
> +----
> +gitsubmodules - mounting one repository inside another
> +
> +SYNOPSIS
> +--------
> +.gitmodules, $GIT_DIR/config
> +------------------
> +git submodule
> +git <command> --recurse-submodules
> +------------------
> +
> +DESCRIPTION
> +-----------
> +
> +A submodule is another Git repository tracked in a subdirectory of your
> +repository. The tracked repository has its own history, which does not
> +interfere with the history of the current repository.

"tracked in a subdirectory" sounds as if your top-level superproject
has a dedicated submodules/ directory and in it there live a bunch
of submodules.  Which obviously is not what you meant.  If phrased
"tracked as a subdirectory", I think the sentence makes sense.

While "which does not interfere" may be technically correct, I am
not sure what the value of saying that is.

> +Submodules are composed from a so-called `gitlink` tree entry
> +in the main repository that refers to a particular commit object
> +within the inner repository.

Correct, but it may be unclear to the readers why we do so.  Perhaps

        ... and this way, the tree of each commit in the main repository
        "knows" which commit from the submodule's history is "tied" to it.

or something like that?

> +Additionally to the gitlink entry the `.gitmodules` file (see
> +linkgit:gitmodules[5]) at the root of the source tree contains
> +information needed for submodules.

Is that really true?  Each submodule do not *need* what is in
.gitmodules; the top-level superproject needs to learn about
its submodules from the contents of that file, though.

> +The only required information
> +is the path setting, which estabishes a logical name for the submodule.

The phrase "the path setting" feels a bit unfortunate.  Is that
"only" thing we need?  Without URL we have no way to populate it,
no?

> +The usual git configuration (see linkgit:git-config[1]) can be used to
> +override settings given by the `.gitmodules` file.
> +
> +Submodules can be used for two different use cases:
> +
> +1. Using another project that stands on its own.
> +  When you want to use a third party library, submodules allow you to
> +  have a clean history for your own project as well as for the library.
> +  This also allows for updating the third party library as needed.
> +
> +2. Artificially split a (logically single) project into multiple
> +   repositories and tying them back together. This can be used to
> +   overcome deficiences in the data model of Git, such as:

s/deficiences in the data model/current limitations/ perhaps?

> +* To have finer grained access control.
> +  The design principles of Git do not allow for partial repositories to be
> +  checked out or transferred. A repository is the smallest unit that a user
> +  can be given access to. Submodules are separate repositories, such that
> +  you can restrict access to parts of your project via the use of submodules.

Some servers implement per-branch access control that seems to work
rather well.  Given that "shallow history" is possible (i.e. you
could give one commit without exposing older parts of the history),
I think the limitation this paragrah refers to is that "a tree is
the smallest unit that the user can be given access to."

> +* In its current form Git scales up poorly for very large repositories that
> +  change a lot, as the history grows very large. For that you may want to look
> +  at shallow clone, sparse checkout, or git-LFS.
> +  However you can also use submodules to e.g. hold large binary assets
> +  and these repositories are then shallowly cloned such that you do not
> +  have a large history locally.

This is why I suggest "current limitations"; this is not about
deficiency in the data model.

> +A submodule can be considered its own autonomous repository, that has a
> +worktree and a git directory at a different place than the superproject.

"Its own" I agree, but autonomous?

The mention of "main repository" in the earlier part of the document
may want to use the same phrase "superproject".

> +The superproject only records the commit object name in its tree, such that
> +any other information, e.g. where to obtain a copy from, is not recorded
> +in the core data structures of Git. The porcelain layer of Git however
> +makes use of the `.gitmodules` file that gives hints where and how to
> +obtain a copy of the submodule git repository from.

OK.

> +On the location of the git directory
> +------------------------------------
> +
> +Since v1.7.7 of Git, the git directory of submodules is stored inside the
> +superprojects git directory at $GIT_DIR/modules/<submodule-name>
> +This location allows for the working tree to be non existent while keeping
> +the history around. So we can use `git-rm` on a submodule without loosing

s/git-rm/git -rm/
s/loosing/losing/

> +Workflow for a third party library
> +----------------------------------
> +
> +  # add the submodule
> +  git submodule add <url> <path>
> +
> +  # occasionally update the submodule to a new version:
> +  git -C <path> checkout <new version>
> +  git add <path>
> +  git commit -m "update submodule to new version"

OK.

> +Workflow for an artifically split repo
> +--------------------------------------
> +
> +  # Enable recursion for relevant commands, such that
> +  # regular commands recurse into submodules by default
> +  git config --global submodule.recurse true
> +
> +  # Unlike the other commands below clone still needs
> +  # its own recurse flag:
> +  git clone --recurse <URL> <directory>
> +  cd <directory>
> +
> +  # Get to know the code:
> +  git grep foo
> +  git ls-files
> +
> +  # Get new code
> +  git fetch
> +  git pull --rebase
> +
> +  # change worktree
> +  git checkout
> +  git reset

This part is interesting ;-)

> +Deleting a submodule
> +--------------------
> +
> +Deleting a submodule can happen on different levels:
> +
> +1) Removing it from the local working tree without tampering with
> +   the history of the superproject.
> +
> +You may no longer need the submodule, but still want to keep it recorded
> +in the superproject history as others may have use for it.
> +--
> +  git submodule deinit <submodule path>
> +--
> +will remove the configuration entries
> +as well as the work

Do we have an adjective used for submodules that are checked out
vs deleted in this manner (I am thinking of "active" from earlier
work by Brandon)?  Do we want to mention it around here?

> +2) Remove it from history:
> +--
> +   git rm <submodule>
> +--

Is this removing from "history"?  Isn't it merely removing it from
the index of the superproject (hence potentially removing it from
the tree of the upcoming commit in the superproject)?

> +3) Remove the submodules git directory:
> +
> +When you also want to free up the disk space that the submodules git
> +directory uses, you have to delete it manually. It is found in
> +`$GIT_DIR/modules`.
> +The steps 1 and 2 can be undone via `git submodule init` or
> +`git revert`, respectively.  This step may incur data loss,
> +and cannot be undone. That is why there is no builtin.

Perhaps "deinit" can learn an option to do this (tangent).  When you
are a follower, it is OK to do so.

When you are removing the only copy of the repository, of course
there will be some data loss ;-)


> +Other mechanisms
> +----------------
> +
> +Git repositories are allowed to be kept inside other repositories without
> +the need to use submodules. This however does not enable cross-repository
> +versioning as the inner repository is unaware of the outer repository,
> +which in turn ignores the inner.

s/the inner/& repository/;

> +Submodules are not to be confused with remotes, which are other
> +repositories of the same project; submodules are meant for
> +different projects you would like to make part of your source tree,
> +while the history of the two projects still stays completely
> +independent and you cannot modify the contents of the submodule
> +from within the main project.

Would anybody make such a confusion, though?  Perhaps drop the first
sentence up to ';' in a follow-up patch?

> +If you want to merge the project histories and want to treat the
> +aggregated whole as a single project from then on, you may want to
> +add a remote for the other project and use the 'subtree' merge strategy,
> +instead of treating the other project as a submodule. Directories
> +that come from both projects can be cloned and checked out as a whole
> +if you choose to go that route.

While it is correct, is this something we want to mention in
gitsubmodule.txt?  It sounds more like what "git merge" should say,
if we wanted to.

Thanks.

^ permalink raw reply	[relevance 22%]

* Re: [RFC/PATCH] submodules: overhaul documentation
  2017-06-13 19:29 ` Junio C Hamano
@ 2017-06-13 21:06   ` Stefan Beller
  2017-06-19 18:10     ` Brandon Williams
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-06-13 21:06 UTC (permalink / raw)
  To: Junio C Hamano, Brandon Williams, Jonathan Nieder; +Cc: git

Adding two native speakers as we start word smithing.

On Tue, Jun 13, 2017 at 12:29 PM, Junio C Hamano <gitster@pobox.com> wrote:

>> +
>> +A submodule is another Git repository tracked in a subdirectory of your
>> +repository. The tracked repository has its own history, which does not
>> +interfere with the history of the current repository.
>
> "tracked in a subdirectory" sounds as if your top-level superproject
> has a dedicated submodules/ directory and in it there live a bunch
> of submodules.  Which obviously is not what you meant.  If phrased
> "tracked as a subdirectory", I think the sentence makes sense.

Given this explanation "as a" also sounds wrong[1], maybe we need to
separate (1) where it is put/mounted and (2) the fact that is tracked,
i.e. the superproject has an idea of what should be there at a given
revision. (I shortly thought about /s/as a/using/ in the above, but):

  A submodule is another Git repository at an arbitrary place inside
  the working tree, and also tracked. The tracked repository has its
  own history, which does not interfere with the history of the current
  repository.

[1] http://www.thesaurus.com/browse/as

>
> While "which does not interfere" may be technically correct, I am
> not sure what the value of saying that is.

I think we can drop it here. When writing I wanted to separate it from
subtrees, but this is the wrong place for that.

>
>> +Submodules are composed from a so-called `gitlink` tree entry
>> +in the main repository that refers to a particular commit object
>> +within the inner repository.
>
> Correct, but it may be unclear to the readers why we do so.  Perhaps
>
>         ... and this way, the tree of each commit in the main repository
>         "knows" which commit from the submodule's history is "tied" to it.
>
> or something like that?

sounds good to me.

>
>> +Additionally to the gitlink entry the `.gitmodules` file (see
>> +linkgit:gitmodules[5]) at the root of the source tree contains
>> +information needed for submodules.
>
> Is that really true?  Each submodule do not *need* what is in
> .gitmodules; the top-level superproject needs to learn about
> its submodules from the contents of that file, though.

Ha! The ediled words in my mind were:

 ... information needed for submodules [to work in the superproject].

But maybe we need to reword that as

  Additionally to the gitlink entry the `.gitmodules` file (see
  linkgit:gitmodules[5]) at the root of the source tree contains
  information on how to handle submodules.

I'd like to keep this part short and not go into detail.

>
>> +The only required information
>> +is the path setting, which estabishes a logical name for the submodule.
>
> The phrase "the path setting" feels a bit unfortunate.  Is that
> "only" thing we need?  Without URL we have no way to populate it,
> no?

    git config -f .gitmodules submodule.foo.path foo
    git config submodule.foo.url example.org/foo
    git submodule update --init

ought to work just fine. It is not the recommended way of working,
but it should work.

I think (in the far future) we actually should only have the path information
in-tree and *any* other information outside the tree, which includes the URL,

See[2], where I state how I'd like to shape the future:

  $ cat .gitmodules
  [submodule "sub42"]
    path = foo
  # path only in tree!

  $ cat .git/config
  ...
  [submodule]
    active = .
    active = :(exclude)Irrelevant/submodules/for/my/usecase/*
  # note how this is user centric

  $ git show refs/meta/magic/for/refs/heads/master:.gitmodules
  [submodule "sub42"]
    url = https://example.org/foo
    branch = .
  # Note how this is neither centering on the in-tree
  # contents, nor the user. Instead it focuses on the
  # project or group. It is *workflow* centric.
  # Workflows may change over time, e.g. the url could
  # be repointed to k.org or an in-house mirror without tree
  # changes.

Jonathan pointed out the ref name is chosen poorly, but conceptually
I would want to keep the URL setting outside the tree. The URL may
change over time, independently from the history currently checked out
(think of bisect, that includes an "submodule update --init" to bisect across
a fully populated superproject 'at the time')

[2] https://public-inbox.org/git/CAGZ79kbbTwQicVkRs51fV91R_7ZhDtC+FR8Z-SQzRpF2cjFfag@mail.gmail.com/




>
>> +The usual git configuration (see linkgit:git-config[1]) can be used to
>> +override settings given by the `.gitmodules` file.
>> +
>> +Submodules can be used for two different use cases:
>> +
>> +1. Using another project that stands on its own.
>> +  When you want to use a third party library, submodules allow you to
>> +  have a clean history for your own project as well as for the library.
>> +  This also allows for updating the third party library as needed.
>> +
>> +2. Artificially split a (logically single) project into multiple
>> +   repositories and tying them back together. This can be used to
>> +   overcome deficiences in the data model of Git, such as:
>
> s/deficiences in the data model/current limitations/ perhaps?

makes sense.

>
>> +* To have finer grained access control.
>> +  The design principles of Git do not allow for partial repositories to be
>> +  checked out or transferred. A repository is the smallest unit that a user
>> +  can be given access to. Submodules are separate repositories, such that
>> +  you can restrict access to parts of your project via the use of submodules.
>
> Some servers implement per-branch access control that seems to work
> rather well.

True. So maybe s/partial repository/partial working tree/

> Given that "shallow history" is possible (i.e. you
> could give one commit without exposing older parts of the history),
> I think the limitation this paragrah refers to is that "a tree is
> the smallest unit that the user can be given access to."

yes. Though in theory (with the work on omitting blobs and potentially trees)
we could omit partial trees as well and just tell the user they cannot have it.

>
>> +* In its current form Git scales up poorly for very large repositories that
>> +  change a lot, as the history grows very large. For that you may want to look
>> +  at shallow clone, sparse checkout, or git-LFS.
>> +  However you can also use submodules to e.g. hold large binary assets
>> +  and these repositories are then shallowly cloned such that you do not
>> +  have a large history locally.
>
> This is why I suggest "current limitations"; this is not about
> deficiency in the data model.

ok.

>
>> +A submodule can be considered its own autonomous repository, that has a
>> +worktree and a git directory at a different place than the superproject.
>
> "Its own" I agree, but autonomous?

I'll drop that word.


>> +Workflow for an artifically split repo
>> +--------------------------------------
>> +
...
>> +
>> +  # change worktree
>> +  git checkout
>> +  git reset
>
> This part is interesting ;-)

and the problem is this is still in flux ...


>
>> +Deleting a submodule
>> +--------------------
>> +
>> +Deleting a submodule can happen on different levels:
>> +
>> +1) Removing it from the local working tree without tampering with
>> +   the history of the superproject.
>> +
>> +You may no longer need the submodule, but still want to keep it recorded
>> +in the superproject history as others may have use for it.
>> +--
>> +  git submodule deinit <submodule path>
>> +--
>> +will remove the configuration entries
>> +as well as the work
>
> Do we have an adjective used for submodules that are checked out
> vs deleted in this manner (I am thinking of "active" from earlier
> work by Brandon)?  Do we want to mention it around here?

We'd want to propagate "active" more throughout our documentation,
too.

I think this state would be called "unpopulated" (as: the working
tree is not populated, no hint wither the git dir of the submodule
exists)

>
>> +2) Remove it from history:
>> +--
>> +   git rm <submodule>
>> +--
>
> Is this removing from "history"?  Isn't it merely removing it from
> the index of the superproject (hence potentially removing it from
> the tree of the upcoming commit in the superproject)?

True.

>
>> +3) Remove the submodules git directory:
>> +
>> +When you also want to free up the disk space that the submodules git
>> +directory uses, you have to delete it manually. It is found in
>> +`$GIT_DIR/modules`.
>> +The steps 1 and 2 can be undone via `git submodule init` or
>> +`git revert`, respectively.  This step may incur data loss,
>> +and cannot be undone. That is why there is no builtin.
>
> Perhaps "deinit" can learn an option to do this (tangent).  When you
> are a follower, it is OK to do so.
>
> When you are removing the only copy of the repository, of course
> there will be some data loss ;-)

Good point. deinit seems to be the logical place to put it.
Although we could also argue to not hide it in a flag of deinit,
but have a new subcommand "git submodule delete" that removes
the working tree and the git dir, but not the gitlink.

>> +Other mechanisms
>> +----------------
>> +
>> +Git repositories are allowed to be kept inside other repositories without
>> +the need to use submodules. This however does not enable cross-repository
>> +versioning as the inner repository is unaware of the outer repository,
>> +which in turn ignores the inner.
>
> s/the inner/& repository/;
>
>> +Submodules are not to be confused with remotes, which are other
>> +repositories of the same project; submodules are meant for
>> +different projects you would like to make part of your source tree,
>> +while the history of the two projects still stays completely
>> +independent and you cannot modify the contents of the submodule
>> +from within the main project.
>
> Would anybody make such a confusion, though?  Perhaps drop the first
> sentence up to ';' in a follow-up patch?

This code was moved from the current git-submodule man page.
I questioned this confusion as well. Maybe this was confusing when
it was new?

Will remove.

>
>> +If you want to merge the project histories and want to treat the
>> +aggregated whole as a single project from then on, you may want to
>> +add a remote for the other project and use the 'subtree' merge strategy,
>> +instead of treating the other project as a submodule. Directories
>> +that come from both projects can be cloned and checked out as a whole
>> +if you choose to go that route.
>
> While it is correct, is this something we want to mention in
> gitsubmodule.txt?  It sounds more like what "git merge" should say,
> if we wanted to.

The section "Other mechanisms" would want to point out all
things that are useful for slightly different use cases, which includes
sub trees?

>
> Thanks.

^ permalink raw reply	[relevance 22%]

* Re: [PATCH 1/2] add: warn when adding an embedded repository
  2017-06-13 17:07   ` Re: [PATCH 1/2] add: warn when adding an embedded repository Stefan Beller
@ 2017-06-14  6:36     ` Jeff King
  2017-06-14 17:53       ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Jeff King @ 2017-06-14  6:36 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Brandon Williams, Junio C Hamano, git

On Tue, Jun 13, 2017 at 10:07:43AM -0700, Stefan Beller wrote:

> > I also waffled on whether we should ask the submodule code
> > whether it knows about a particular path. Technically:
> >
> >   git config submodule.foo.path foo
> >   git config submodule.foo.url git://...
> >   git add foo
> >
> > is legal, but would still warn with this patch. I don't know
> > how much we should care (it would also be easy to do on
> > top).
> 
> And here I was thinking this is not legal, because you may override
> anything *except* submodule.*.path in the config. That is because
> all the other settings (such as url, active flag, branch,
> shallow recommendation) are dependent on the use case, the user,
> changes to the environment (url) or such. The name<->path mapping
> however is only to be changed via changes to the tracked content.
> That is why it would make sense to disallow overriding the path
> outside the tracked content.

It was probably a mistake to use normal config as the example. Junio
mentioned it as a case that could work if you communicate the submodule
URL to somebody else out-of-band. My understanding was that you could
set whatever you like in the regular config, but I think that is just
showing my ignorance of submodules.

Pretend like I said "-f .gitmodules" in each line above. ;)

> In my ideal dream world of submodules we would have the following:
> 
>   $ cat .gitmodules
>   [submodule "sub42"]
>     path = foo
>   # path only in tree!

TBH, I am not sure why we need "path"; couldn't we just use the
subsection name as an implicit path?

> > +       OPT_HIDDEN_BOOL(0, "warn-embedded-repo", &warn_on_embedded_repo,
> > +                       N_("warn when adding an embedded repository")),
> 
> We do not have a lot of OPT_HIDDEN_BOOLs throughout the code base.
> We should use them more often.
> 
> It makes sense though in this case.

Actually, my main reason is that it's nonsense to show
"--warn-embedded-repo" in the help, when it's already the default. I
would like to have written:

  OPT_NEGBOOL(0, "no-warn-embedded-repo", &warn_on_embedded_repo,
		N_("disable warning when adding an embedded repository"))

but we don't have such a thing (and the last discussion on it a few
months ago left a lot of open questions). So given that this really
isn't something I'd expect users to want, I figured hiding it was a good
idea. I mentioned it in the manpage for script writers, but it's really
not worth cluttering "git add -h".

> > +static const char embedded_advice[] = N_(
> > +"You've added another git repository inside your current repository.\n"
> > +"Clones of the outer repository will not also contain the contents of\n"
> > +"the embedded repository. If you meant to add a submodule, use:\n"
> 
> The "will not also" sounds a bit off to me. Maybe:
>   ...
>   Clones of the outer repository will not contain the contents
>   of the embedded repository and has no way of knowing how
>   to obtain the inner repo. If you meant to add a submodule ...

Yeah, I think we could just strike the "also" (I played around with the
wording here quite a bit and I think it was left from an earlier attempt
where it made more sense).

Your "no way of knowing" is probably a good thing to mention.

> > +"See \"git help submodule\" for more information."
> 
> Once the overhaul of the submodule documentation
> comes along[1], we rather want to point at
> "man 7 git-submodules", which explains the concepts and
> then tell you about commands how to use it. For now the
> git-submodule man page is ok.
> 
> [1] https://public-inbox.org/git/20170607185354.10050-1-sbeller@google.com/

Yeah, I poked around looking for a definitive "here's how submodules
work" intro. I'm happy one is in the works, and I agree this should
point there once it exists.

> > +++ b/t/t7414-submodule-mistakes.sh
> > @@ -0,0 +1,40 @@
> > +#!/bin/sh
> > +
> > +test_description='handling of common mistakes people may make with submodules'
> 
> That is one way to say it. Do we have other tests for
> "you think it is a bug, but it is features" ? ;)
> I like it though. :)

Heh. I didn't know how else to lump it together. Just "test git add on a
repository" felt like too little for its own script. I almost added it
to t7400, but I think that script is plenty long enough as it is (it's
also one of the longest-running scripts, I think).

> > +test_expect_success 'create embedded repository' '
> > +       git init embed &&
> > +       (
> > +               cd embed &&
> > +               test_commit one
> > +       )
> 
> shorter via:
> 
>   test_create_repo embed &&
>   test_commit -C embed one
> 
> (and saves a shell IIRC)

Right, I forgot we added -C there. Will change.

> Thanks for these tests.
> 
> This patch looks good to me, apart from the perceived wording nits.

Thanks. I'll re-roll with a few tweaks based on your feedback.

-Peff

^ permalink raw reply	[relevance 16%]

* Re: [PATCH 1/2] add: warn when adding an embedded repository
  2017-06-14  6:36     ` Jeff King
@ 2017-06-14 17:53       ` Stefan Beller
  2017-06-15  6:01         ` Jeff King
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-06-14 17:53 UTC (permalink / raw)
  To: Jeff King; +Cc: Brandon Williams, Junio C Hamano, git

On Tue, Jun 13, 2017 at 11:36 PM, Jeff King <peff@peff.net> wrote:
> On Tue, Jun 13, 2017 at 10:07:43AM -0700, Stefan Beller wrote:
>
>> > I also waffled on whether we should ask the submodule code
>> > whether it knows about a particular path. Technically:
>> >
>> >   git config submodule.foo.path foo
>> >   git config submodule.foo.url git://...
>> >   git add foo
>> >
>> > is legal, but would still warn with this patch. I don't know
>> > how much we should care (it would also be easy to do on
>> > top).
>>
>> And here I was thinking this is not legal, because you may override
>> anything *except* submodule.*.path in the config. That is because
>> all the other settings (such as url, active flag, branch,
>> shallow recommendation) are dependent on the use case, the user,
>> changes to the environment (url) or such. The name<->path mapping
>> however is only to be changed via changes to the tracked content.
>> That is why it would make sense to disallow overriding the path
>> outside the tracked content.
>
> It was probably a mistake to use normal config as the example. Junio
> mentioned it as a case that could work if you communicate the submodule
> URL to somebody else out-of-band. My understanding was that you could
> set whatever you like in the regular config, but I think that is just
> showing my ignorance of submodules.
>
> Pretend like I said "-f .gitmodules" in each line above. ;)
>
>> In my ideal dream world of submodules we would have the following:
>>
>>   $ cat .gitmodules
>>   [submodule "sub42"]
>>     path = foo
>>   # path only in tree!
>
> TBH, I am not sure why we need "path"; couldn't we just use the
> subsection name as an implicit path?

That is what was done back in the time. But then people wanted to rename
submodules (i.e. move them around in the worktree), so the path is not
constant, so either we'd have to move around the git dir whenever the
submodule is renamed (bad idea IMO), or instead introduce a mapping
between (constant name <-> variable path). So that was done.

Historically (IIUC) we had submodule.path.url which then was changed
to submodule.name.url + name->path resolution. And as a hack(?) or
easy way out of a problem then, the name is often the same as the path
hence confusing people, when they see:

    [submodule "foo"]
        path = foo
        url = dadada/foo

What foo means what now? ;)
As a tangent: I want to make the default name different to the path.

So yeah, we want to keep the name and not mingle with implicit path.

I think we may even have bugs in our code base where the
name/path confusion shows.

Talking about another tangent:

  For files there is a rename detection available. For submodules
  It is hard to imagine that there ever will be such a rename detection
  as files have because of the explciit name<->path mapping.

  We *know* when a submodule was moved. So why even try
  to do rename detection? As we record only sha1s for a submodule
  you could swap two submodule object names by accident.
  Consider a superproject that contains different kernels, such as
  a kernel for your phone/embedded device and then a kernel for
  your workstation or other device. And these two kernels are different
  for technical reasons but share the same history.

  Now the inattentive user may make a mistake and git-add the
  "wrong" kernel submodule.  The smart Git would tell that it is a
  rename/move just as we have with files.

>
>> > +       OPT_HIDDEN_BOOL(0, "warn-embedded-repo", &warn_on_embedded_repo,
>> > +                       N_("warn when adding an embedded repository")),
>>
>> We do not have a lot of OPT_HIDDEN_BOOLs throughout the code base.
>> We should use them more often.
>>
>> It makes sense though in this case.
>
> Actually, my main reason is that it's nonsense to show
> "--warn-embedded-repo" in the help, when it's already the default. I
> would like to have written:
>
>   OPT_NEGBOOL(0, "no-warn-embedded-repo", &warn_on_embedded_repo,
>                 N_("disable warning when adding an embedded repository"))
>
> but we don't have such a thing (and the last discussion on it a few
> months ago left a lot of open questions). So given that this really
> isn't something I'd expect users to want, I figured hiding it was a good
> idea. I mentioned it in the manpage for script writers, but it's really
> not worth cluttering "git add -h".

ok :) If you really wanted, you could go with a raw OPTION though. ;)
This is fine with me though.

>
>> > +static const char embedded_advice[] = N_(
>> > +"You've added another git repository inside your current repository.\n"
>> > +"Clones of the outer repository will not also contain the contents of\n"
>> > +"the embedded repository. If you meant to add a submodule, use:\n"
>>
>> The "will not also" sounds a bit off to me. Maybe:
>>   ...
>>   Clones of the outer repository will not contain the contents
>>   of the embedded repository and has no way of knowing how
>>   to obtain the inner repo. If you meant to add a submodule ...
>
> Yeah, I think we could just strike the "also" (I played around with the
> wording here quite a bit and I think it was left from an earlier attempt
> where it made more sense).
>
> Your "no way of knowing" is probably a good thing to mention.
>
>> > +"See \"git help submodule\" for more information."
>>
>> Once the overhaul of the submodule documentation
>> comes along[1], we rather want to point at
>> "man 7 git-submodules", which explains the concepts and
>> then tell you about commands how to use it. For now the
>> git-submodule man page is ok.
>>
>> [1] https://public-inbox.org/git/20170607185354.10050-1-sbeller@google.com/
>
> Yeah, I poked around looking for a definitive "here's how submodules
> work" intro. I'm happy one is in the works, and I agree this should
> point there once it exists.
>
>> > +++ b/t/t7414-submodule-mistakes.sh
>> > @@ -0,0 +1,40 @@
>> > +#!/bin/sh
>> > +
>> > +test_description='handling of common mistakes people may make with submodules'
>>
>> That is one way to say it. Do we have other tests for
>> "you think it is a bug, but it is features" ? ;)
>> I like it though. :)
>
> Heh. I didn't know how else to lump it together. Just "test git add on a
> repository" felt like too little for its own script. I almost added it
> to t7400, but I think that script is plenty long enough as it is (it's
> also one of the longest-running scripts, I think).

Thanks for not doing that. :)

^ permalink raw reply	[relevance 18%]

* Re: [PATCH 1/2] add: warn when adding an embedded repository
  2017-06-14 17:53       ` Stefan Beller
@ 2017-06-15  6:01         ` Jeff King
  0 siblings, 0 replies; 200+ results
From: Jeff King @ 2017-06-15  6:01 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Brandon Williams, Junio C Hamano, git

On Wed, Jun 14, 2017 at 10:53:12AM -0700, Stefan Beller wrote:

> >> In my ideal dream world of submodules we would have the following:
> >>
> >>   $ cat .gitmodules
> >>   [submodule "sub42"]
> >>     path = foo
> >>   # path only in tree!
> >
> > TBH, I am not sure why we need "path"; couldn't we just use the
> > subsection name as an implicit path?
> 
> That is what was done back in the time. But then people wanted to rename
> submodules (i.e. move them around in the worktree), so the path is not
> constant, so either we'd have to move around the git dir whenever the
> submodule is renamed (bad idea IMO), or instead introduce a mapping
> between (constant name <-> variable path). So that was done.

Ah, right. That makes sense. I forgot that in addition to the in-tree
path, we have to store the submodule repository itself as some name. The
extra level of indirection there isn't strictly necessary, but it lets
the "name" act as a unique id.

> Historically (IIUC) we had submodule.path.url which then was changed
> to submodule.name.url + name->path resolution. And as a hack(?) or
> easy way out of a problem then, the name is often the same as the path
> hence confusing people, when they see:
> 
>     [submodule "foo"]
>         path = foo
>         url = dadada/foo
> 
> What foo means what now? ;)

Right, I am such a person that has been confused. ;)

Thanks for explaining.

> Talking about another tangent:
> 
>   For files there is a rename detection available. For submodules
>   It is hard to imagine that there ever will be such a rename detection
>   as files have because of the explciit name<->path mapping.
> 
>   We *know* when a submodule was moved. So why even try
>   to do rename detection? As we record only sha1s for a submodule
>   you could swap two submodule object names by accident.
>   Consider a superproject that contains different kernels, such as
>   a kernel for your phone/embedded device and then a kernel for
>   your workstation or other device. And these two kernels are different
>   for technical reasons but share the same history.

Do you mean during the rename detection phase of a diff, check to see if
.gitmodules registered a change in path for a particular module (by
finding its entry in the diff and looking at both sides), and if so then
mark that as a rename for the submodule paths?

From a cursory glance, that sounds like an interesting approach.

-Peff

^ permalink raw reply	[relevance 17%]

* Re: git push recurse.submodules behavior changed in 2.13
  2017-06-12 17:27       ` Stefan Beller
@ 2017-06-16 14:11         ` John Shahid
  0 siblings, 0 replies; 200+ results
From: John Shahid @ 2017-06-16 14:11 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Jonathan Nieder, git, Brandon Williams

On Mon, Jun 12, 2017 at 1:27 PM, Stefan Beller <sbeller@google.com> wrote:
> On Sat, Jun 10, 2017 at 6:28 AM, John Shahid <jvshahid@gmail.com> wrote:
>> bump. it's been a while and I'm still not clear what the next steps
>> are. I'm happy to send a patch but I would like to get a consensus
>> first.
>
> What do you want a consensus on?
> (Is the change in 2.13 a bug or feature? I considered it enough
> of a feature to not pursue an urgent bug fix. Maybe I misunderstood
> the discussion)

I was under the impression that Jonathan and may be others considered
the fact that `git push --recurse-submodules=on-demand` doesn't work
as before an unintentional change. He asked me previously if pushing
without a refspec will work for us and I responded with a yes. The
question remains if everyone is on board with change push without
refspec to use `push.default` in the parent repo as well as
submodules.

Cheers,

JS

^ permalink raw reply	[relevance 24%]

* Re: [PATCH/RFC] Cleanup Documentation
      [irrelevant] ` <xmqq1sqgv9ax.fsf@gitster.mtv.corp.google.com>
@ 2017-06-19  5:50   ` Stefan Beller
  2017-06-19 17:33     ` Kaartic Sivaraam
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-06-19  5:50 UTC (permalink / raw)
  To: Junio C Hamano, mlevedahl; +Cc: git, Kaartic Sivaraam

On Sun, Jun 18, 2017 at 10:24 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan, this was sent in my way, but I know you are the primary
> person who is looking into updating submodule documentation these
> days, so I am forwarding it in your way to ask you to give the first
> comment.
>
> Thanks.

AFAICT this is specific to the arguments of 'add', such that it would not
collide with sb/submodule-doc[1]. However my series was RFC, while this
is on the order of "documentation bug fix", so this would be more important
than rewriting the documentation from scrach any way. :)



[1] https://public-inbox.org/git/20170607185354.10050-1-sbeller@google.com/


>
> Kaartic Sivaraam <kaarticsivaraam91196@gmail.com> writes:
>
>> 1. Remove redundancy from documentation
>> 2. Remove unclear reference to alternative
>>
>> Signed-off-by: Kaartic Sivaraam <kaarticsivaraam91196@gmail.com>
>> ---
>>
>> The following line seemes unclear and hence was removed for now. Suggest any
>> changes that could make it clear.
>>
>> "This second form is provided to ease creating a new submodule from scratch,
>> and presumes the user will later push the submodule to the given URL."

+cc Marc who wrote this sentence originally.


>>
>>
>>  Documentation/git-submodule.txt | 37 ++++++++++++++++---------------------
>>  1 file changed, 16 insertions(+), 21 deletions(-)
>>
>> diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
>> index 74bc6200d..9812b0655 100644
>> --- a/Documentation/git-submodule.txt
>> +++ b/Documentation/git-submodule.txt
>> @@ -63,13 +63,7 @@ add [-b <branch>] [-f|--force] [--name <name>] [--reference <repository>] [--dep
>>       to the changeset to be committed next to the current
>>       project: the current project is termed the "superproject".
>>  +
>> -This requires at least one argument: <repository>. The optional
>> -argument <path> is the relative location for the cloned submodule
>> -to exist in the superproject. If <path> is not given, the
>> -"humanish" part of the source repository is used ("repo" for
>> -"/path/to/repo.git" and "foo" for "host.xz:foo/.git").
>> -The <path> is also used as the submodule's logical name in its
>> -configuration entries unless `--name` is used to specify a logical name.
>> +This requires at least one argument: <repository>.
>>  +

So we're losing the information how the submodule name is chosen.
This may be fine as I plan (long term) to make the name an arbitrary random
string (IMHO that reduces confusion as there will be less 'nearly the same'
things)

On the other hand the newly added line
  'This requires at least one argument: <repository'
(actually moved, but) is sort of redundant. The notation in the argument line
should make that clear, already?


>>  <repository> is the URL of the new submodule's origin repository.
>>  This may be either an absolute URL, or (if it begins with ./
>> @@ -87,21 +81,22 @@ If the superproject doesn't have a default remote configured
>>  the superproject is its own authoritative upstream and the current
>>  working directory is used instead.
>>  +
>> -<path> is the relative location for the cloned submodule to
>> -exist in the superproject. If <path> does not exist, then the
>> -submodule is created by cloning from the named URL. If <path> does
>> -exist and is already a valid Git repository, then this is added
>> -to the changeset without cloning. This second form is provided
>> -to ease creating a new submodule from scratch, and presumes
>> -the user will later push the submodule to the given URL.
>> +The optional argument <path> is the relative location for the cloned
>> +submodule to exist in the superproject. If <path> is not given, the
>> +"humanish" part of the source repository is used ("repo" for
>> +"/path/to/repo.git" and "foo" for "host.xz:foo/.git"). If <path>
>> +exists and is already a valid Git repository, then this is added
>> +to the changeset without cloning. The <path> is also used as the
>> +submodule's logical name in its configuration entries unless `--name`
>> +is used to specify a logical name.

This sounds good, it consolidates all information about [<path>]
in one paragraph. While at it, maybe let's find another (better)
substitute for "humanish" as that can be anything(?).

Maybe "the last part of the URL" (without any .git)

>>  +
>> -In either case, the given URL is recorded into .gitmodules for
>> -use by subsequent users cloning the superproject. If the URL is
>> -given relative to the superproject's repository, the presumption
>> -is the superproject and submodule repositories will be kept
>> -together in the same relative location, and only the
>> -superproject's URL needs to be provided: git-submodule will correctly
>> -locate the submodule using the relative URL in .gitmodules.
>> +The given URL is recorded into .gitmodules for use by subsequent users
>> +cloning the superproject. If the URL is given relative to the
>> +superproject's repository, the presumption is the superproject and
>> +submodule repositories will be kept together in the same relative
>> +location, and only the superproject's URL needs to be provided.
>> +git-submodule will correctly locate the submodule using the relative
>> +URL in .gitmodules.
>>

(While at it:)
Please markup the '.gitmodules' either via single quotes or `.
(or even link to 'gitmodules(5)' )

>>  status [--cached] [--recursive] [--] [<path>...]::
>>       Show the status of the submodules. This will print the SHA-1 of the

I am undecided if this is really removing (2) unclearness, but the
(1) redundancy seems fine to me.

Thanks,
Stefan

^ permalink raw reply	[relevance 16%]

* AW: Restoring detached HEADs after Git operations
      [irrelevant] ` <88AC6179-75D6-416B-9235-C628D6C59CA5@gmail.com>
@ 2017-06-19  9:52   ` Patrick Lehmann
  2017-06-19 16:37     ` Re: Restoring detached HEADs after Git operations Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Patrick Lehmann @ 2017-06-19  9:52 UTC (permalink / raw)
  To: Lars Schneider; +Cc: Git Mailinglist, Stefan Beller

Hello Lars,

for your questions:
> If there are multiple branches with the same hash then your script would pick the first one. Can you imagine a situation where this would be a problem?

I can't think of a good solution to resolve it automatically. Maybe a script could print that there are multiple possibilities and it choose the first branch in the list.


> Plus, you are looking only at local branches. Wouldn't it make sense to look at remote branches, too?

This is also related to restoring tags. If we go this way, we should have this priority list:
- local branches
- remote branches
- tags


> Submodule processing is already quite slow if you have many of them. I wonder how much this approach would affect the performance.

Yes. It takes a few seconds to iterate all the submodules. It could be improved if the processing wouldn't be based on slow Bash scripts spawning lot's of sub-shells to execute multiple Git commands.



Is there a way to avoid detached DEADs at the beginning?
Many submodules are attached to a reference and get detached to a hash of the same reference. It would be better, if they never get detached when the current and new hash are the same.


Kind regards
    Patrick

________________________________________
Von: git-owner@vger.kernel.org [git-owner@vger.kernel.org]&quot; im Auftrag von &quot;Lars Schneider [larsxschneider@gmail.com]
Gesendet: Montag, 19. Juni 2017 11:30
Bis: Patrick Lehmann
Cc: Git Mailinglist; Stefan Beller
Betreff: Re: Restoring detached HEADs after Git operations

> On 19 Jun 2017, at 10:46, Patrick Lehmann <Patrick.Lehmann@plc2.de> wrote:
>
> Hello,
>
> I wrote a Bash script to recover branch names after Git operations have create detached HEADs in a Git repository containing lots of Git submodules. The script works recursively.

I did run into this situation myself and therefore
I understand your motivation. I've CC'ed Stefan as
he is a Submodule expert!


> I would like to see:
> a) that script or algorithm being integrated into Git by default
> b) that as a default behavior for all Git operations creating detached HEADs
>
> That's the command:
> --------------------------------
> git submodule foreach --recursive  'HEAD=$(git branch --list | head -n 1); if [[ "$HEAD" == *HEAD* ]]; then REF=$(git rev-parse HEAD); FOUND=0; for Branch in $(git branch --list | grep "^  " | sed -e "s/  //" ); do if [[ "$(git rev-parse "$Branch")" == $REF ]]; then echo -e "  \e[36mCheckout $Branch...\e[0m"; git checkout $Branch; FOUND=1; break; fi done; if [[ $FOUND -eq 0 ]]; then echo -e "  \e[31mNo matching branch found.\e[0m"; fi else echo -e "  \e[36mNothing to do.\e[0m"; fi'
> --------------------------------
>
> How does it work:
> 1. It uses git submodule foreach to dive into each Git submodule and execute a series of Bash commands.
> 2. It's reading the list of branches and checks if the submodule is in detached mode. The first line contains the string HEAD.
> 3. Retrieve the hash of the detached HEAD
> 4. Iterate all local branches and get their hashes
> 5. Compare the branch hashes with the detached HEAD's hash. If it matches do a checkout.

If there are multiple branches with the same hash then
your script would pick the first one. Can you imagine a
situation where this would be a problem?

Plus, you are looking only at local branches. Wouldn't it
make sense to look at remote branches, too?


> 6. Report if no branch name was found or if a HEAD was not in detached mode.
>
> The Bash code with line breaks and indentation:
> --------------------------------
> HEAD=$(git branch --list | head -n 1)
> if [[ "$HEAD" == *HEAD* ]]; then
>  REF=$(git rev-parse HEAD)
>  FOUND=0
>  for Branch in $(git branch --list | grep "^  " | sed -e "s/  //" ); do

There is a convenient "git for-each-ref" function to iterate over
branches in scripts. See here an example:
https://github.com/larsxschneider/scotty/blob/master/admin/oss-fork.sh#L88


>    if [[ "$(git rev-parse "$Branch")" == $REF ]]; then
>      echo -e "  \e[36mCheckout $Branch...\e[0m"
>      git checkout $Branch
>      FOUND=1
>      break
>    fi
>  done
>  if [[ $FOUND -eq 0 ]]; then
>    echo -e "  \e[31mNo matching branch found.\e[0m"
>  fi
> else
>  echo -e "  \e[36mNothing to do.\e[0m"
> fi
> --------------------------------
>
> Are their any chances to get it integrated into Git?
>
> I tried to register that code as a Git alias, but git config complains about quote problem not showing where. It neither specifies if it's a single or double quote problem. Any advice on how to register that piece of code as an alias?

Try to escape ". See here for an example:
https://github.com/Autodesk/enterprise-config-for-git/blob/master/config.include#L76-L94


> If wished, I think I could expand the script to also recover hash values to Git tags if no branch was found.

It would be indeed nice to see the tagged version on my prompt.

--

Submodule processing is already quite slow if you have many of them.
I wonder how much this approach would affect the performance.

- Lars

^ permalink raw reply	[relevance 16%]

* in case you want a use-case with lots of submodules
@ 2017-06-19 15:59 Yaroslav Halchenko
  2017-06-19 19:30 ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Yaroslav Halchenko @ 2017-06-19 15:59 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

Hi All,

On a recent trip I've listened to the git minutes podcast episode and
got excited to hear  Stefan Beller (CCed just in case) describing
ongoing work on submodules mechanism.  I got excited, since e.g.
performance improvements would be of great benefit to us too.

In our project, http://datalad.org, git submodules is the basic
mechanism to bring multiple "datasets" (mix of git and git-annex'ed
repositories)  under the same roof so we could non-ambiguously
version them all at any level.

http://datasets.datalad.org ATM provides quite a sizeable (ATM 370
repositories, up to 4 levels deep) hierarchy of git/git-annex
repositories all tied together via git submodules mechanism.  And as the
collection grows, interactions with it become slower, so additional
options (such as --ignore-submodules=dirty  to status) become our
friends.

So I thought to share this as a use-case happen you need more
motivation or just a real-case test-bed for your work.  And thank
you again for making Git even Greater.

P.S. Please CCme in your replies (if any), I am not on the list

With best regards,
-- 
Yaroslav O. Halchenko
Center for Open Neuroscience     http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik        

^ permalink raw reply	[relevance 24%]

* Re: Restoring detached HEADs after Git operations
      [irrelevant] <0092CDD27C5F9D418B0F3E9B5D05BE08010287DF@SBS2011.opfingen.plc2.de>
      [irrelevant] ` <88AC6179-75D6-416B-9235-C628D6C59CA5@gmail.com>
@ 2017-06-19 16:31 ` Stefan Beller
  1 sibling, 0 replies; 200+ results
From: Stefan Beller @ 2017-06-19 16:31 UTC (permalink / raw)
  To: Patrick Lehmann; +Cc: Git Mailinglist

On Mon, Jun 19, 2017 at 1:46 AM, Patrick Lehmann
<Patrick.Lehmann@plc2.de> wrote:
> Hello,
>
> I wrote a Bash script to recover branch names after Git operations have create detached HEADs in a Git repository containing lots of Git submodules. The script works recursively.

Cool. :)

You may also like
https://public-inbox.org/git/20170501180058.8063-5-sbeller@google.com/
https://public-inbox.org/git/20170501180058.8063-6-sbeller@google.com/

These patches are still on my plate, they are not landed yet as I had issues
coming up with a good convincing commit message.

They are essentially putting submodules back on a branch (if configured).
Let's see how this differs from your solution.


> I would like to see:
> a) that script or algorithm being integrated into Git by default

For that you'd want to send a patch, see Documentation/SubmittingPatches.
We'd want to discuss if this command is an independent command
("git submodule reattachHEADs", name subject to bikeshedding ;) )
or if it is a configurable option that is obeyed by anything that touches
submodules (which I would prefer, as this mode seems to be the
"correct default". When having it as a mode we can switch the default
eventually such that submodules are always on a branch).

> b) that as a default behavior for all Git operations creating detached HEADs

changing defaults is hard. Let's go with a) first and then people will
report how
awesome the new mode/command is and then it is easier to see how this
may be a good default. :)

>
> That's the command:
> --------------------------------
(reformatted for readability:)

git submodule foreach --recursive
  'HEAD=$(git branch --list | head -n 1);
    if [[ "$HEAD" == *HEAD* ]]; then
      REF=$(git rev-parse HEAD);
      FOUND=0;
      for Branch in $(git branch --list | grep "^  " | sed -e "s/  //" );
      do
        if [[ "$(git rev-parse "$Branch")" == $REF ]]; then
          echo -e "  \e[36mCheckout $Branch...\e[0m";
          git checkout $Branch;
          FOUND=1;
          break;
        fi
      done;
      if [[ $FOUND -eq 0 ]]; then
        echo -e "  \e[31mNo matching branch found.\e[0m";
      fi
    else
      echo -e "  \e[36mNothing to do.\e[0m";
    fi'

> --------------------------------
>
> How does it work:
> 1. It uses git submodule foreach to dive into each Git submodule and execute a series of Bash commands.

If you want to see it upstream eventually, we'd make it shell commands.
There are some subtle differences between shell and bash,
one of them is the way conditions are written. I think plain shell
does not support [[ ]], so that would become

  if test $FOUND -eq 0
  then
    echo ...

Maybe look at git-submodule.sh for coding style suggestions.

> 2. It's reading the list of branches and checks if the submodule is in detached mode. The first line contains the string HEAD.

This works for you but some crazy person may have a branch containing
HEAD in their branch name. ;)
("git checkout -b notADetachedHEAD")

I think that check can be improved via

    if test $(git symbolic-ref HEAD 2>/dev/null >/dev/null) -eq 128
    then
      # detached HEAD
    else
      # on a branch
    fi

so if the output of symbolic-ref starts with ref then it is on a
branch. In detached HEAD

> 3. Retrieve the hash of the detached HEAD
> 4. Iterate all local branches and get their hashes

  What happens (/should happen) when multiple branches have the same sha1?
  With this implementation the first wins? Is this 'lazy guessing' desired?
  The patches referenced above assumed you'd have submodule.NAME.branch
  set and we'd reattach to that branch only (if matching hashes)

> 5. Compare the branch hashes with the detached HEAD's hash. If it matches do a checkout.

Speaking of checkout: checkout --recurse-submodules is a
thing in the latest version of Git, but it also detaches HEADs.

I'd like to have reattaching HEADs in there and then combined with
"git config submodule.recurse true", which is in master but no release
a plain "git checkout <branch>" in the superproject would put the submodules
on branches.

Using checkout within git submodule-foreach works of course just as fine.
Note: Currently Prathamesh Chavan converts git-submodule-foreach to C
https://public-inbox.org/git/CAME+mvUrzVxpRdPDvA1ZyatNm2R27QGJVjSB3=KX85CEedMaRQ@mail.gmail.com/
so it will be faster. In the process of doing so, we surfaced a couple
of bugs, but
they would not impact this script AFAICT.


> 6. Report if no branch name was found or if a HEAD was not in detached mode.

... and it is colored unconditionally in red. Maybe have a look at
    git config --get-color[bool]
which can help in figuring out if we want to print color codes.

> The Bash code with line breaks and indentation:
> --------------------------------
> HEAD=$(git branch --list | head -n 1)
> if [[ "$HEAD" == *HEAD* ]]; then
>   REF=$(git rev-parse HEAD)
>   FOUND=0
>   for Branch in $(git branch --list | grep "^  " | sed -e "s/  //" ); do
>     if [[ "$(git rev-parse "$Branch")" == $REF ]]; then
>       echo -e "  \e[36mCheckout $Branch...\e[0m"
>       git checkout $Branch
>       FOUND=1
>       break
>     fi
>   done
>   if [[ $FOUND -eq 0 ]]; then
>     echo -e "  \e[31mNo matching branch found.\e[0m"
>   fi
> else
>   echo -e "  \e[36mNothing to do.\e[0m"
> fi
> --------------------------------
>
> Are their any chances to get it integrated into Git?

I like the idea and I'd be happy to review patches. :)
Also you may want to look at the C version that I provided
above and tell me why yours is better. ;)
(Maybe the chosen defaults are saner, or such?)

>
> I tried to register that code as a Git alias, but git config complains about quote problem not showing where. It neither specifies if it's a single or double quote problem. Any advice on how to register that piece of code as an alias?

(a) not using the alias system for everything:
* You can define this as a (ba)sh function in e.g. .bashrc and then
just call the shell function from the alias.
* or you can put the code into an executable script "git-NAME" and
then the alias would be just "git submodule foreach --recursive git
NAME"
(b) define the function inside the alias, cf.
https://www.atlassian.com/blog/git/advanced-git-aliases

> If wished, I think I could expand the script to also recover hash values to Git tags if no branch was found.

Personally I do not think we should attach a HEAD to a tag in that case.
tags are just like branches with special meaning, i.e. they are also
in the refs/* hierarchy.  Note how git-checkout <tag> detaches from
the tag such that you do not modify the tag by default.

>
> Kind regards
>     Patrick Lehmann

^ permalink raw reply	[relevance 16%]

* Re: Restoring detached HEADs after Git operations
  2017-06-19  9:52   ` AW: Restoring detached HEADs after Git operations Patrick Lehmann
@ 2017-06-19 16:37     ` Stefan Beller
  2017-06-19 17:34       ` Patrick Lehmann
  2017-06-19 17:55       ` Junio C Hamano
  0 siblings, 2 replies; 200+ results
From: Stefan Beller @ 2017-06-19 16:37 UTC (permalink / raw)
  To: Patrick Lehmann; +Cc: Lars Schneider, Git Mailinglist

On Mon, Jun 19, 2017 at 2:52 AM, Patrick Lehmann
<Patrick.Lehmann@plc2.de> wrote:
> Hello Lars,
>
> for your questions:
>> If there are multiple branches with the same hash then your script would pick the first one. Can you imagine a situation where this would be a problem?
>
> I can't think of a good solution to resolve it automatically. Maybe a script could print that there are multiple possibilities and it choose the first branch in the list.
>
>
>> Plus, you are looking only at local branches. Wouldn't it make sense to look at remote branches, too?
>
> This is also related to restoring tags. If we go this way, we should have this priority list:
> - local branches
> - remote branches

For remote branches you would create a local branch of the same name
(if such a branch would not exist, possibly setting it up to track that remote
branch)?

> - tags

as said in the other email and similar to remote branches, we'd not want to have
HEAD pointing to them directly but somehow have a local branch.

>> Submodule processing is already quite slow if you have many of them. I wonder how much this approach would affect the performance.
>
> Yes. It takes a few seconds to iterate all the submodules. It could be improved if the processing wouldn't be based on slow Bash scripts spawning lot's of sub-shells to execute multiple Git commands.

How many submodules are we talking about? (Are you on Windows to make
shell even more fun?)

^ permalink raw reply	[relevance 15%]

* Re: [PATCH/RFC] Cleanup Documentation
  2017-06-19  5:50   ` Re: [PATCH/RFC] Cleanup Documentation Stefan Beller
@ 2017-06-19 17:33     ` Kaartic Sivaraam
  0 siblings, 0 replies; 200+ results
From: Kaartic Sivaraam @ 2017-06-19 17:33 UTC (permalink / raw)
  To: Stefan Beller, Junio C Hamano, mlevedahl; +Cc: git

On Sun, 2017-06-18 at 22:50 -0700, Stefan Beller wrote:
> > > diff --git a/Documentation/git-submodule.txt b/Documentation/git-
> > > submodule.txt
> > > index 74bc6200d..9812b0655 100644
> > > --- a/Documentation/git-submodule.txt
> > > +++ b/Documentation/git-submodule.txt
> > > @@ -63,13 +63,7 @@ add [-b <branch>] [-f|--force] [--name <name>]
> > > [--reference <repository>] [--dep
> > >       to the changeset to be committed next to the current
> > >       project: the current project is termed the "superproject".
> > >  +
> > > -This requires at least one argument: <repository>. The optional
> > > -argument <path> is the relative location for the cloned
> > > submodule
> > > -to exist in the superproject. If <path> is not given, the
> > > -"humanish" part of the source repository is used ("repo" for
> > > -"/path/to/repo.git" and "foo" for "host.xz:foo/.git").
> > > -The <path> is also used as the submodule's logical name in its
> > > -configuration entries unless `--name` is used to specify a
> > > logical name.
> > > +This requires at least one argument: <repository>.
> > >  +
> 
> So we're losing the information how the submodule name is chosen.
I just moved it. I don't think we're losing anything related to how the
name is chosen. Please let me know if I misinterpreted your statement.

> This may be fine as I plan (long term) to make the name an arbitrary
> random
> string (IMHO that reduces confusion as there will be less 'nearly the
> same'
> things)
> 
> On the other hand the newly added line
>   'This requires at least one argument: <repository'
> (actually moved, but) is sort of redundant. The notation in the
> argument line
> should make that clear, already?
> 
Makes clear sense. Removed it.

> This sounds good, it consolidates all information about [<path>]
> in one paragraph. While at it, maybe let's find another (better)
> substitute for "humanish" as that can be anything(?).
> 
> Maybe "the last part of the URL" (without any .git)
> 
How about "meaningful"? Put in place it reads like,

If <path> is not given, the meaningful part of the source repository
...

> Please markup the '.gitmodules' either via single quotes or `.
> (or even link to 'gitmodules(5)' )
> 
Marked it up using `. Help needed to link to 'gitmodules(5)', as I'm
not sure how to provide alternative text to 'linkgit:'.

> I am undecided if this is really removing (2) unclearness, but the
> (1) redundancy seems fine to me.
> 
Sorry about that. The commit message should have been,

...
2. Removed unclear back reference
...

by which I intend to denote the following removal,
> -In either case, the given URL is recorded into .gitmodules for
> -use by subsequent users cloning the superproject.

Note: Will follow up with a patch, soon.
-- 
Regards,
Kaartic Sivaraam <kaarticsivaraam91196@gmail.com>

^ permalink raw reply	[relevance 8%]

* AW: Restoring detached HEADs after Git operations
  2017-06-19 16:37     ` Re: Restoring detached HEADs after Git operations Stefan Beller
@ 2017-06-19 17:34       ` Patrick Lehmann
  2017-06-19 17:47         ` Stefan Beller
  2017-06-19 17:55       ` Junio C Hamano
  1 sibling, 1 reply; 200+ results
From: Patrick Lehmann @ 2017-06-19 17:34 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Lars Schneider, Git Mailinglist

Hello,

I'm just an advanced Git user, not a Git developer. So I might find some time to improve the suggested script, which I provided with the hints given on the mailing list, but I have no time to do a complete feature release in your patch based Git flow.

I'm currently involved in 8 other open source projects. One can't improve the world alone by supplying patches to any open source project one is using...

I have no experience with other shells then Bash. So if you rely on a Bash with less features, please port the syntax to such a shell system. (I personally do not support legacy programs or out-date programs).

------
We are talking about circa 50 submodules in total with a maximum depth of 4. The platforms are:
- Mint OS with Git in Bash
- Windows 7 with Git-Bash
- Windows 10 with Git-Bash
- Windows 10 with Posh-Git


Kind regards
    Patrick

________________________________________
Von: Stefan Beller [sbeller@google.com]
Gesendet: Montag, 19. Juni 2017 18:37
Bis: Patrick Lehmann
Cc: Lars Schneider; Git Mailinglist
Betreff: Re: Restoring detached HEADs after Git operations

On Mon, Jun 19, 2017 at 2:52 AM, Patrick Lehmann
<Patrick.Lehmann@plc2.de> wrote:
> Hello Lars,
>
> for your questions:
>> If there are multiple branches with the same hash then your script would pick the first one. Can you imagine a situation where this would be a problem?
>
> I can't think of a good solution to resolve it automatically. Maybe a script could print that there are multiple possibilities and it choose the first branch in the list.
>
>
>> Plus, you are looking only at local branches. Wouldn't it make sense to look at remote branches, too?
>
> This is also related to restoring tags. If we go this way, we should have this priority list:
> - local branches
> - remote branches

For remote branches you would create a local branch of the same name
(if such a branch would not exist, possibly setting it up to track that remote
branch)?

> - tags

as said in the other email and similar to remote branches, we'd not want to have
HEAD pointing to them directly but somehow have a local branch.

>> Submodule processing is already quite slow if you have many of them. I wonder how much this approach would affect the performance.
>
> Yes. It takes a few seconds to iterate all the submodules. It could be improved if the processing wouldn't be based on slow Bash scripts spawning lot's of sub-shells to execute multiple Git commands.

How many submodules are we talking about? (Are you on Windows to make
shell even more fun?)

^ permalink raw reply	[relevance 15%]

* Re: Restoring detached HEADs after Git operations
  2017-06-19 17:34       ` Patrick Lehmann
@ 2017-06-19 17:47         ` Stefan Beller
  2017-06-19 18:09           ` Patrick Lehmann
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-06-19 17:47 UTC (permalink / raw)
  To: Patrick Lehmann; +Cc: Lars Schneider, Git Mailinglist

On Mon, Jun 19, 2017 at 10:34 AM, Patrick Lehmann
<Patrick.Lehmann@plc2.de> wrote:
> Hello,
>
> I'm just an advanced Git user, not a Git developer. So I might find some time to improve the suggested script, which I provided with the hints given on the mailing list, but I have no time to do a complete feature release in your patch based Git flow.

ok, thanks for letting us know. I may re-prioritize the "reattach
HEAD" patches that I referenced earlier.
I would have hoped that additionally to the shell lines you'd have
given a good use case/summary.

> I have no experience with other shells then Bash. So if you rely on a Bash with less features, please port the syntax to such a shell system. (I personally do not support legacy programs or out-date programs).
>
> ------
> We are talking about circa 50 submodules in total with a maximum depth of 4. The platforms are:
> - Mint OS with Git in Bash
> - Windows 7 with Git-Bash
> - Windows 10 with Git-Bash
> - Windows 10 with Posh-Git

Thanks,
Stefan

^ permalink raw reply	[relevance 9%]

* Re: Restoring detached HEADs after Git operations
  2017-06-19 16:37     ` Re: Restoring detached HEADs after Git operations Stefan Beller
  2017-06-19 17:34       ` Patrick Lehmann
@ 2017-06-19 17:55       ` Junio C Hamano
  2017-06-19 19:11         ` Stefan Beller
  1 sibling, 1 reply; 200+ results
From: Junio C Hamano @ 2017-06-19 17:55 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Patrick Lehmann, Lars Schneider, Git Mailinglist

Stefan Beller <sbeller@google.com> writes:

> On Mon, Jun 19, 2017 at 2:52 AM, Patrick Lehmann
> <Patrick.Lehmann@plc2.de> wrote:
>> Hello Lars,
>>
>> for your questions:
>>> If there are multiple branches with the same hash then your script would pick the first one. Can you imagine a situation where this would be a problem?
>>
>> I can't think of a good solution to resolve it automatically. Maybe a script could print that there are multiple possibilities and it choose the first branch in the list.
>>
>>
>>> Plus, you are looking only at local branches. Wouldn't it make sense to look at remote branches, too?
>>
>> This is also related to restoring tags. If we go this way, we should have this priority list:
>> - local branches
>> - remote branches
>
> For remote branches you would create a local branch of the same name
> (if such a branch would not exist, possibly setting it up to track that remote
> branch)?
>
>> - tags
>
> as said in the other email and similar to remote branches, we'd not want to have
> HEAD pointing to them directly but somehow have a local branch.

Let's step back a bit.  We detach the HEAD for a good reason, no?
Why is it a good idea to move them back on to a branch picked among
multiple ones that all happen to be pointing at the same commit?

The user may build on a history of a submodule, and then may push
the result out to a particular branch at the other side; that is
when being on a named branch in the submodule becomes useful, but
even then I do not think randomly picking one branch and be on it
is a good thing to do.

I would understand the workflow would go more like so:

 - You do something at the superproject (e.g. create a new branch X
   from an existing commit and check it out), which results in
   submodules' HEADs getting detached at the commits bound to the
   superproject's tree.

 - Because you want to make changes to both submodules and the
   superproject in a consistent way, you'd want to commit changes to
   all of these repositories and the push the result out in an
   atomic way.

 - Hence you tell "Hey, Git, I want all the submodules that I
   modified to be on branch X" from the superproject.

   - This may succeed in a submodule where X is a new name, or the
     current tip of branch X is an ancestor of the detached HEAD.

   - This may fail in a submodule where there is branch X that does
     not want to move to the detached HEAD's state.  In this latter
     case, the user needs to deal with the situation (perhaps the
     old X is expendable; perhaps the HEAD's commit may need to be
     merged to old X; perhaps there are other cases).

though.

^ permalink raw reply	[relevance 17%]

* AW: Restoring detached HEADs after Git operations
  2017-06-19 17:47         ` Stefan Beller
@ 2017-06-19 18:09           ` Patrick Lehmann
  2017-06-19 19:21             ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Patrick Lehmann @ 2017-06-19 18:09 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Lars Schneider, Git Mailinglist

Hello Stefan,

the use case is as follows:

The projects consists of circa 18 IP cores. Each IP core is represented by a Git repository. Think of an IP core as of a lonestanding DLL or SO file project. Each IP core references 2 submodules, which bring the verification environments for testing the IP core standalone.

These 18 IP cores are grouped to bigger IP cores, referencing the low-level IP cores and each again the 2 verification submodules. Finally, the main project references the bigger IP cores and again the 2 verification cores.

TOPLEVEL
  o- IP1
       o- UVVM
       o- VUnit
  o- IP2
       o- UVVM
       o- VUnit
  o- IP3
       o- UVVM
       o- VUnit
  o- IP4
       o- UVVM
       o- VUnit
       o- IP5
           o- UVVM
           o- VUnit
       o- IP6
           o- UVVM
           o- VUnit
       o- IP7
           o- UVVM
           o- VUnit
  o- IP8
       o- UVVM
       o- VUnit
       o- IP9
           o- UVVM
           o- VUnit
       o- IP10
           o- UVVM
           o- VUnit
  o- IP11
       o- UVVM
       o- VUnit
       o- IP9
           o- UVVM
           o- VUnit
       o- IP12
           o- UVVM
           o- VUnit
   o- UVVM
   o- VUnit

That's the simplified structure. I can't write more, because it's a closed source project. You can find other usecases e.g. in my other open source projects. E.g. The PoC-Library or The PicoBlaze-Library and the corresponding PoC-Examples repository.

Example: PoC
Pile of Cores includes 4 Git submodules and is itself an IP core library.
So PoC-Examples again references PoC. This looks like this tree:

PoC-Examples
  |- lib/
       o- PoC
            |- lib
                o- Cocotb
                o- OSVVM
                o- VUnit
                     o- .... OSVVM
                o- UVVM

The library VUnit itself already includes OSVVM as a library.

----------------------
Forcast:
I'll write a new question / idea about multiple equal submodules and the memory footprint soon...
Here is my original question posted on StackOverflow: https://stackoverflow.com/questions/44585425/how-to-reduce-the-memory-footprint-for-multiple-submodules-of-the-same-source
----------------------

Do you need more use cases?


Kind regards
    Patrick
________________________________________
Von: git-owner@vger.kernel.org [git-owner@vger.kernel.org]&quot; im Auftrag von &quot;Stefan Beller [sbeller@google.com]
Gesendet: Montag, 19. Juni 2017 19:47
Bis: Patrick Lehmann
Cc: Lars Schneider; Git Mailinglist
Betreff: Re: Restoring detached HEADs after Git operations

On Mon, Jun 19, 2017 at 10:34 AM, Patrick Lehmann
<Patrick.Lehmann@plc2.de> wrote:
> Hello,
>
> I'm just an advanced Git user, not a Git developer. So I might find some time to improve the suggested script, which I provided with the hints given on the mailing list, but I have no time to do a complete feature release in your patch based Git flow.

ok, thanks for letting us know. I may re-prioritize the "reattach
HEAD" patches that I referenced earlier.
I would have hoped that additionally to the shell lines you'd have
given a good use case/summary.

> I have no experience with other shells then Bash. So if you rely on a Bash with less features, please port the syntax to such a shell system. (I personally do not support legacy programs or out-date programs).
>
> ------
> We are talking about circa 50 submodules in total with a maximum depth of 4. The platforms are:
> - Mint OS with Git in Bash
> - Windows 7 with Git-Bash
> - Windows 10 with Git-Bash
> - Windows 10 with Posh-Git

Thanks,
Stefan

^ permalink raw reply	[relevance 16%]

* Re: [RFC/PATCH] submodules: overhaul documentation
  2017-06-13 21:06   ` Stefan Beller
@ 2017-06-19 18:10     ` Brandon Williams
  2017-06-20 21:42       ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Brandon Williams @ 2017-06-19 18:10 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Junio C Hamano, Jonathan Nieder, git

On 06/13, Stefan Beller wrote:
> Adding two native speakers as we start word smithing.
> 
> On Tue, Jun 13, 2017 at 12:29 PM, Junio C Hamano <gitster@pobox.com> wrote:
> 
> >> +
> >> +A submodule is another Git repository tracked in a subdirectory of your
> >> +repository. The tracked repository has its own history, which does not
> >> +interfere with the history of the current repository.
> >
> > "tracked in a subdirectory" sounds as if your top-level superproject
> > has a dedicated submodules/ directory and in it there live a bunch
> > of submodules.  Which obviously is not what you meant.  If phrased
> > "tracked as a subdirectory", I think the sentence makes sense.
> 
> Given this explanation "as a" also sounds wrong[1], maybe we need to
> separate (1) where it is put/mounted and (2) the fact that is tracked,
> i.e. the superproject has an idea of what should be there at a given
> revision. (I shortly thought about /s/as a/using/ in the above, but):
> 
>   A submodule is another Git repository at an arbitrary place inside
>   the working tree, and also tracked. The tracked repository has its
>   own history, which does not interfere with the history of the current
>   repository.

I would probably change the first sentence to:

  A submodule is another Git repository tracked at an arbitrary place
  inside the working tree.

> 
> [1] http://www.thesaurus.com/browse/as
> 
> >
> > While "which does not interfere" may be technically correct, I am
> > not sure what the value of saying that is.
> 
> I think we can drop it here. When writing I wanted to separate it from
> subtrees, but this is the wrong place for that.
> 
> >
> >> +Submodules are composed from a so-called `gitlink` tree entry
> >> +in the main repository that refers to a particular commit object
> >> +within the inner repository.
> >
> > Correct, but it may be unclear to the readers why we do so.  Perhaps
> >
> >         ... and this way, the tree of each commit in the main repository
> >         "knows" which commit from the submodule's history is "tied" to it.
> >
> > or something like that?
> 
> sounds good to me.
> 
> >
> >> +Additionally to the gitlink entry the `.gitmodules` file (see
> >> +linkgit:gitmodules[5]) at the root of the source tree contains
> >> +information needed for submodules.
> >
> > Is that really true?  Each submodule do not *need* what is in
> > .gitmodules; the top-level superproject needs to learn about
> > its submodules from the contents of that file, though.
> 
> Ha! The ediled words in my mind were:
> 
>  ... information needed for submodules [to work in the superproject].
> 
> But maybe we need to reword that as
> 
>   Additionally to the gitlink entry the `.gitmodules` file (see
>   linkgit:gitmodules[5]) at the root of the source tree contains
>   information on how to handle submodules.

This sounds slightly awkward.  Maybe:

    In addition to the gitlink entry, the `.gitmodules` file (see
    linkgit:gitmodules[5]) at the root of the source tree contains
    information on how to handle submodules.


-- 
Brandon Williams

^ permalink raw reply	[relevance 24%]

* Re: Restoring detached HEADs after Git operations
  2017-06-19 17:55       ` Junio C Hamano
@ 2017-06-19 19:11         ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-06-19 19:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Patrick Lehmann, Lars Schneider, Git Mailinglist

On Mon, Jun 19, 2017 at 10:55 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> On Mon, Jun 19, 2017 at 2:52 AM, Patrick Lehmann
>> <Patrick.Lehmann@plc2.de> wrote:
>>> Hello Lars,
>>>
>>> for your questions:
>>>> If there are multiple branches with the same hash then your script would pick the first one. Can you imagine a situation where this would be a problem?
>>>
>>> I can't think of a good solution to resolve it automatically. Maybe a script could print that there are multiple possibilities and it choose the first branch in the list.
>>>
>>>
>>>> Plus, you are looking only at local branches. Wouldn't it make sense to look at remote branches, too?
>>>
>>> This is also related to restoring tags. If we go this way, we should have this priority list:
>>> - local branches
>>> - remote branches
>>
>> For remote branches you would create a local branch of the same name
>> (if such a branch would not exist, possibly setting it up to track that remote
>> branch)?
>>
>>> - tags
>>
>> as said in the other email and similar to remote branches, we'd not want to have
>> HEAD pointing to them directly but somehow have a local branch.
>
> Let's step back a bit.  We detach the HEAD for a good reason, no?

And the 'good reason' being that at the time git-submodule was written
we did not know what would be best, and having a detached HEAD
would be (a) easy to implement, and (b) removing one moving thing
from the whole construction, hence making it a bit safer,
(c) it sort of follows the mental model:

    the superproject said it had the submodule at X
        (and not at branch Y!)
    the submodule itself is a whole repo on its own
        (it doesn't need to be aware of the superproject)

so in this world detaching at X is the best we can do.

> Why is it a good idea to move them back on to a branch picked among
> multiple ones that all happen to be pointing at the same commit?

This (rhetorical?) question reads like 2 questions actually:
(a) "Why is it a good idea to move them back on to a branch?"
It makes working easier as the submodule is not detached,
but on a proper branch
(b) "picked among multiple ones that all ..."
I think this is a bad idea and we'd rather want to follow
some configuration instead of wild-guessing by Git.

> The user may build on a history of a submodule, and then may push
> the result out to a particular branch at the other side; that is
> when being on a named branch in the submodule becomes useful, but
> even then I do not think randomly picking one branch and be on it
> is a good thing to do.

so you provide one reason why it is useful, but then claiming it is
'not a good thing' (yet). Can you give a reason why this is a 'bad thing'?

> I would understand the workflow would go more like so:
>
>  - You do something at the superproject (e.g. create a new branch X
>    from an existing commit and check it out), which results in
>    submodules' HEADs getting detached at the commits bound to the
>    superproject's tree.

And here we'd want to discuss if we *really* want to detach the HEADs
or rather have a symbolic ref "following the superproject".

>  - Because you want to make changes to both submodules and the
>    superproject in a consistent way, you'd want to commit changes to
>    all of these repositories and the push the result out in an
>    atomic way.

Committing and pushing are different things. You should not care if
I commit atomically as you (in the general "upstream" sense)
cannot observe my local commits.

For pushing we would want to have an atomic push, but that is
not the scope of this discussion. (As a Gerrit user, we implemented
the submodule atomicity serverside, but in plain Git server you'd
not need the atomicity either:

    git commit -a -m "update submodule pointers"
    git submodule foreach git push
    git push

should be fine w.r.t. any non-atomic race condition.)

>  - Hence you tell "Hey, Git, I want all the submodules that I
>    modified to be on branch X" from the superproject.
>
>    - This may succeed in a submodule where X is a new name, or the
>      current tip of branch X is an ancestor of the detached HEAD.

so we'd allow fast forward for X. This seems arbitrary to me. I could
also say "If X exists I allow a merge to be made between old X and
the object name given by the superproject". (maybe as a config option)

>    - This may fail in a submodule where there is branch X that does
>      not want to move to the detached HEAD's state.  In this latter
>      case, the user needs to deal with the situation (perhaps the
>      old X is expendable; perhaps the HEAD's commit may need to be
>      merged to old X; perhaps there are other cases).

makes sense.

>
> though.

^ permalink raw reply	[relevance 17%]

* Re: Restoring detached HEADs after Git operations
  2017-06-19 18:09           ` Patrick Lehmann
@ 2017-06-19 19:21             ` Stefan Beller
  2017-06-19 20:13               ` Patrick Lehmann
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-06-19 19:21 UTC (permalink / raw)
  To: Patrick Lehmann; +Cc: Lars Schneider, Git Mailinglist

On Mon, Jun 19, 2017 at 11:09 AM, Patrick Lehmann
<Patrick.Lehmann@plc2.de> wrote:
> Hello Stefan,
>
> the use case is as follows:
>
> The projects consists of circa 18 IP cores. Each IP core is represented by a Git repository. Think of an IP core as of a lonestanding DLL or SO file project. Each IP core references 2 submodules, which bring the verification environments for testing the IP core standalone.

So phrased differently: You are using submodules to avoid "DLL hell"
(sharing a lib, with ease of versioning as the submodules in the different IP
cores may be pointing at different versions).

>
> These 18 IP cores are grouped to bigger IP cores, referencing the low-level IP cores and each again the 2 verification submodules. Finally, the main project references the bigger IP cores and again the 2 verification cores.
>
> TOPLEVEL
>   o- IP1
>        o- UVVM
>        o- VUnit
>   o- IP2
>        o- UVVM
>        o- VUnit
>   o- IP3
>        o- UVVM
>        o- VUnit
>   o- IP4
>        o- UVVM
>        o- VUnit
>        o- IP5
>            o- UVVM
>            o- VUnit
>        o- IP6
>            o- UVVM
>            o- VUnit
>        o- IP7
>            o- UVVM
>            o- VUnit
>   o- IP8
>        o- UVVM
>        o- VUnit
>        o- IP9
>            o- UVVM
>            o- VUnit
>        o- IP10
>            o- UVVM
>            o- VUnit
>   o- IP11
>        o- UVVM
>        o- VUnit
>        o- IP9
>            o- UVVM
>            o- VUnit
>        o- IP12
>            o- UVVM
>            o- VUnit
>    o- UVVM
>    o- VUnit
>
> That's the simplified structure. I can't write more, because it's a closed source project. You can find other usecases e.g. in my other open source projects. E.g. The PoC-Library or The PicoBlaze-Library and the corresponding PoC-Examples repository.
>
> Example: PoC
> Pile of Cores includes 4 Git submodules and is itself an IP core library.
> So PoC-Examples again references PoC. This looks like this tree:
>
> PoC-Examples
>   |- lib/
>        o- PoC
>             |- lib
>                 o- Cocotb
>                 o- OSVVM
>                 o- VUnit
>                      o- .... OSVVM
>                 o- UVVM
>
> The library VUnit itself already includes OSVVM as a library.
>
> ----------------------
> Forcast:
> I'll write a new question / idea about multiple equal submodules and the memory footprint soon...
> Here is my original question posted on StackOverflow: https://stackoverflow.com/questions/44585425/how-to-reduce-the-memory-footprint-for-multiple-submodules-of-the-same-source
> ----------------------
>
> Do you need more use cases?
>

Well this use case points out a different issue than I hoped for. ;)
From the stackoverflow post and from looking at the layout here,
one of the major questions is how to deduplicate the submodule
object store for example.

By use case I rather meant a sales pitch for your initial email:

    I use this bash script because it fits in my workflow because
    I need branches instead of detached HEADS, because $REASONS

and I'd be interested in these $REASONS, which I assumed to be
* easier to work with branches than detached HEADS (it aids the workflow)
* we're not challenging the underlying mental model of tracking sha1s in
  the superproject rather than branches.

At least I gave these reasons in the "reattach HEAD" stuff that I wrote,
but maybe there are others? (I know the code base of submodules very
well, but I do not work with submodules on a day-to-day basis myself...)

^ permalink raw reply	[relevance 18%]

* Re: in case you want a use-case with lots of submodules
  2017-06-19 15:59 in case you want a use-case with lots of submodules Yaroslav Halchenko
@ 2017-06-19 19:30 ` Stefan Beller
  2017-06-19 20:20   ` Yaroslav Halchenko
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-06-19 19:30 UTC (permalink / raw)
  To: Yaroslav Halchenko, Prathamesh Chavan; +Cc: git

On Mon, Jun 19, 2017 at 8:59 AM, Yaroslav Halchenko <yoh@onerussian.com> wrote:
> Hi All,
>
> On a recent trip I've listened to the git minutes podcast episode and
> got excited to hear  Stefan Beller (CCed just in case) describing
> ongoing work on submodules mechanism.  I got excited, since e.g.
> performance improvements would be of great benefit to us too.

If you're mostly interested in performance improvements of the status
quo (i.e. "make git-submodule fast"), then the work of Prathamesh
Chavan (cc'd) might be more interesting to you than what I do.
He is porting git-submodule (which is mostly a shell script nowadays)
to C, such that we can save a lot of process invocations and can do
processing within one process.

> In our project, http://datalad.org, git submodules is the basic
> mechanism to bring multiple "datasets" (mix of git and git-annex'ed
> repositories)  under the same roof so we could non-ambiguously
> version them all at any level.

Cool, glad to here submodules being useful. :)

> http://datasets.datalad.org ATM provides quite a sizeable (ATM 370
> repositories, up to 4 levels deep) hierarchy of git/git-annex
> repositories all tied together via git submodules mechanism.  And as the
> collection grows, interactions with it become slower, so additional
> options (such as --ignore-submodules=dirty  to status) become our
> friends.

I am not as much concerned about the 370 number than about the
4 layers of nesting. In my experience the nested submodule case
is a little bit error prone and the bug reports are not as frequent as
there are not as many users of nesting, yet(?)

In a neighboring thread on the mailing list we have a discussion
on the usefulness of being on branches than in detached HEAD
in the submodules.
https://public-inbox.org/git/0092CDD27C5F9D418B0F3E9B5D05BE08010287DF@SBS2011.opfingen.plc2.de/

This would not break non-ambiguously, rather it would add
ease of use.

> So I thought to share this as a use-case happen you need more
> motivation or just a real-case test-bed for your work.  And thank
> you again for making Git even Greater.

Thanks for the motivation. :)

> P.S. Please CCme in your replies (if any), I am not on the list
>
> With best regards,

Cheers,
Stefan

^ permalink raw reply	[relevance 25%]

* AW: Restoring detached HEADs after Git operations
  2017-06-19 19:21             ` Stefan Beller
@ 2017-06-19 20:13               ` Patrick Lehmann
  0 siblings, 0 replies; 200+ results
From: Patrick Lehmann @ 2017-06-19 20:13 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Lars Schneider, Git Mailinglist

Hello Stefan,

I never have tapped into the DLL Hell trap. That's maybe I never did C++ development or I started with VB .NET / C# as .NET solved major parts of the DLL Hell :). That doesn't mean my new beloved language Python doesn't have a similar problem ...


Thinking about DLL Hell is a thinking in big version numbers like 1.0, 2.0 oder even 2.1, 2.2, ...
We are here talking about revisions in the build numbers which need to be synchronized between the parent repository and the sub modules (IP cores). Both sides are under heavy development and interfaces evolving from day to day because hardware design can't be planned as easy as software design.

So by using Git submodules a developer - responsible for a submodule / IP core - can after he finished interface level 1 now go on and implement interface level 2. The parent project can finish it's integration and testing of the level 1 interface before proceeding with level 2. More over if the same IP core is used multiple time in different sub IP cores, it's possible to update one usage place to interface level 2 by a second developer so he can finish his IP core at level 2, which other usage places can still use the level 1 interface.

Start situation:
--------------------------------------
TOPLEVEL (developer A)
  o- IP_1 @level1 (developer B)
       o- IP_2 @level1 (developer C)
  o- IP_3 @level1 (developer D)
       o- IP_2 @level1


Developer C creates interface level 2, but all instances use level1 of IP_2:
--------------------------------------
TOPLEVEL (developer A)
  o- IP_1 @level1 (developer B)
       o- IP_2 @level1 (developer C)
  o- IP_3 @level1 (developer D)
       o- IP_2 @level1


Developer D updates instance of IP_2 to level 2 and completes level 2 of IP_3:
--------------------------------------
TOPLEVEL (developer A)
  o- IP_1 @level1 (developer B)
       o- IP_2 @level1 (developer C)
  o- IP_3 @level1 (developer D)
       o- IP_2 @level2

Developer A updates instance of IP_3 to level 2:
--------------------------------------
TOPLEVEL (developer A)
  o- IP_1 @level1 (developer B)
       o- IP_2 @level1 (developer C)
  o- IP_3 @level2 (developer D)
       o- IP_2 @level2

Developer B has finished his testing for IP_1 and can now update the instance if IP_2:
--------------------------------------
TOPLEVEL (developer A)
  o- IP_1 @level1 (developer B)
       o- IP_2 @level2 (developer C)
  o- IP_3 @level2 (developer D)
       o- IP_2 @level2


So now imaging 8 developers, whereof 6 are working remote on the project. There is one responsible developer per IP core (maintainer) and an overall maintainer overseeing all integration merges and test results (CI).


Kind regards
    Patrick

________________________________________
Von: Stefan Beller [sbeller@google.com]
Gesendet: Montag, 19. Juni 2017 21:21
Bis: Patrick Lehmann
Cc: Lars Schneider; Git Mailinglist
Betreff: Re: Restoring detached HEADs after Git operations

On Mon, Jun 19, 2017 at 11:09 AM, Patrick Lehmann
<Patrick.Lehmann@plc2.de> wrote:
> Hello Stefan,
>
> the use case is as follows:
>
> The projects consists of circa 18 IP cores. Each IP core is represented by a Git repository. Think of an IP core as of a lonestanding DLL or SO file project. Each IP core references 2 submodules, which bring the verification environments for testing the IP core standalone.

So phrased differently: You are using submodules to avoid "DLL hell"
(sharing a lib, with ease of versioning as the submodules in the different IP
cores may be pointing at different versions).

>
> These 18 IP cores are grouped to bigger IP cores, referencing the low-level IP cores and each again the 2 verification submodules. Finally, the main project references the bigger IP cores and again the 2 verification cores.
>
> TOPLEVEL
>   o- IP1
>        o- UVVM
>        o- VUnit
>   o- IP2
>        o- UVVM
>        o- VUnit
>   o- IP3
>        o- UVVM
>        o- VUnit
>   o- IP4
>        o- UVVM
>        o- VUnit
>        o- IP5
>            o- UVVM
>            o- VUnit
>        o- IP6
>            o- UVVM
>            o- VUnit
>        o- IP7
>            o- UVVM
>            o- VUnit
>   o- IP8
>        o- UVVM
>        o- VUnit
>        o- IP9
>            o- UVVM
>            o- VUnit
>        o- IP10
>            o- UVVM
>            o- VUnit
>   o- IP11
>        o- UVVM
>        o- VUnit
>        o- IP9
>            o- UVVM
>            o- VUnit
>        o- IP12
>            o- UVVM
>            o- VUnit
>    o- UVVM
>    o- VUnit
>
> That's the simplified structure. I can't write more, because it's a closed source project. You can find other usecases e.g. in my other open source projects. E.g. The PoC-Library or The PicoBlaze-Library and the corresponding PoC-Examples repository.
>
> Example: PoC
> Pile of Cores includes 4 Git submodules and is itself an IP core library.
> So PoC-Examples again references PoC. This looks like this tree:
>
> PoC-Examples
>   |- lib/
>        o- PoC
>             |- lib
>                 o- Cocotb
>                 o- OSVVM
>                 o- VUnit
>                      o- .... OSVVM
>                 o- UVVM
>
> The library VUnit itself already includes OSVVM as a library.
>
> ----------------------
> Forcast:
> I'll write a new question / idea about multiple equal submodules and the memory footprint soon...
> Here is my original question posted on StackOverflow: https://stackoverflow.com/questions/44585425/how-to-reduce-the-memory-footprint-for-multiple-submodules-of-the-same-source
> ----------------------
>
> Do you need more use cases?
>

Well this use case points out a different issue than I hoped for. ;)
From the stackoverflow post and from looking at the layout here,
one of the major questions is how to deduplicate the submodule
object store for example.

By use case I rather meant a sales pitch for your initial email:

    I use this bash script because it fits in my workflow because
    I need branches instead of detached HEADS, because $REASONS

and I'd be interested in these $REASONS, which I assumed to be
* easier to work with branches than detached HEADS (it aids the workflow)
* we're not challenging the underlying mental model of tracking sha1s in
  the superproject rather than branches.

At least I gave these reasons in the "reattach HEAD" stuff that I wrote,
but maybe there are others? (I know the code base of submodules very
well, but I do not work with submodules on a day-to-day basis myself...)

^ permalink raw reply	[relevance 15%]

* Re: in case you want a use-case with lots of submodules
  2017-06-19 19:30 ` Stefan Beller
@ 2017-06-19 20:20   ` Yaroslav Halchenko
  2017-06-20  5:43     ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Yaroslav Halchenko @ 2017-06-19 20:20 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Prathamesh Chavan, git


On Mon, 19 Jun 2017, Stefan Beller wrote:

> On Mon, Jun 19, 2017 at 8:59 AM, Yaroslav Halchenko <yoh@onerussian.com> wrote:
> > Hi All,

> > On a recent trip I've listened to the git minutes podcast episode and
> > got excited to hear  Stefan Beller (CCed just in case) describing
> > ongoing work on submodules mechanism.  I got excited, since e.g.
> > performance improvements would be of great benefit to us too.

> If you're mostly interested in performance improvements of the status
> quo (i.e. "make git-submodule fast"), then the work of Prathamesh
> Chavan (cc'd) might be more interesting to you than what I do.
> He is porting git-submodule (which is mostly a shell script nowadays)
> to C, such that we can save a lot of process invocations and can do
> processing within one process.

ah -- cool.  I would be eager to test it out, thanks!  would be
interesting to see if it positively affects our overall performance.
Pointers to that development would be welcome!

> > http://datasets.datalad.org ATM provides quite a sizeable (ATM 370
> > repositories, up to 4 levels deep) hierarchy of git/git-annex
> > repositories all tied together via git submodules mechanism.  And as the
> > collection grows, interactions with it become slower, so additional
> > options (such as --ignore-submodules=dirty  to status) become our
> > friends.

> I am not as much concerned about the 370 number than about the
> 4 layers of nesting. In my experience the nested submodule case
> is a little bit error prone and the bug reports are not as frequent as
> there are not as many users of nesting, yet(?)

well -- part of the story here is that we are forced to use/have full
blown .git/ directories (for git-annex symlinks to content files to
work) within submodules instead of .git file with a reference under
parent's .git/modules.   So we can 'slice' at any level and I
guess that is why may be avoiding some possibly issues due to nesting
and the "parent has all .git/modules" approach.

> In a neighboring thread on the mailing list we have a discussion
> on the usefulness of being on branches than in detached HEAD
> in the submodules.
> https://public-inbox.org/git/0092CDD27C5F9D418B0F3E9B5D05BE08010287DF@SBS2011.opfingen.plc2.de/

> This would not break non-ambiguously, rather it would add
> ease of use.

that is indeed a common caveat... I am not sure if any heuristic
approach would provide a 'bullet proof' solution.  I might even prefer a
hardcoded 'branch-name' to be listed/associated with each submodule
within .gitmodules.  In the datalad case, detached HEAD is common
whenever someone installs "outdated" (branch of which progressed
forward) submodule.  In this case we just check if the branch after "git
clone"  (but before git submodule update) includes the pointed by
Subproject commit, and if so -- we announce that it must be the branch
(so far it is always "master" branch anyways ;) )

> > So I thought to share this as a use-case happen you need more
> > motivation or just a real-case test-bed for your work.  And thank
> > you again for making Git even Greater.

> Thanks for the motivation. :)

the least I could do ;)

-- 
Yaroslav O. Halchenko
Center for Open Neuroscience     http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik        

^ permalink raw reply	[relevance 24%]

* Re: What's cooking in git.git (Jun 2017, #05; Mon, 19)
      [irrelevant] <xmqqh8zbspm7.fsf@gitster.mtv.corp.google.com>
@ 2017-06-19 20:57 ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-06-19 20:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

>
> * sb/submodule-doc (2017-06-13) 1 commit
>  - submodules: overhaul documentation
>
>  Doc update.
>
>  Waiting for discussion to settle.

Please hold back, this definitely needs
another version.

> * sb/diff-color-move (2017-06-01) 17 commits
>  - diff.c: color moved lines differently
> ...
>  "git diff" has been taught to optionally paint new lines that are
>  the same as deleted lines elsewhere differently from genuinely new
>  lines.
>
>  Is any more update coming?

Yes.

I do have a series locally that replaces "diff_line" with
"emitted_string" (but the same structure) and also
changed the algorithm to have 8 colors configurable.

But then I got dragged away in "doing it right", which will be
presented shortly with a different approach for the
first patches and how they are refactored.

So the refactoring of that is done and I need to apply the
patches that bring in the new functionality on top.

Thanks,
Stefan

^ permalink raw reply	[relevance 9%]

* [GSoC] Update: Week 5
@ 2017-06-19 21:41 Prathamesh Chavan
      [irrelevant] ` <20170619215025.10086-1-pc44800@gmail.com>
  0 siblings, 1 reply; 200+ results
From: Prathamesh Chavan @ 2017-06-19 21:41 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller, Christian Couder

SUMMARY OF MY PROJECT:

Git submodule subcommands are currently implemented by using shell script
'git-submodule.sh'. There are several reasons why we'll prefer not to
use the shell script. My project intends to convert the subcommands into
C code, thus making them builtins. This will increase Git's portability
and hence the efficiency of working with the git-submodule commands.
Link to the complete proposal: [1]

Mentors:
Stefan Beller <sbeller@google.com>
Christian Couder <christian.couder@gmail.com>

UPDATES:

Following are the updates about my ongoing project:

1. sync and status: The patches were discussed with the mentors
   and after that, are being posted with this patch.

2. deinit: The patch is finally debugged, and is ready to be
   discussed. It is also attached with this update.

3. summary: While porting the subcommand, I underwent certain
   issues. After getting them clarified from my mentors, I
   have resumed working on it. I'm aware of the time I have
   taken for porting this subcommand is more than the previous
   ones. Hence will try my best to finish this in this week.

4. foreach: As stated in the previous update, the subcommand was
   ported without resolving the bug, and simply translating the
   present code, and adding a NEEDSWORK tag to the comment for
   mentioning the reported bug as well.
   But as communicating between child_process is still an issue
   and so there was no simple was to current carry out the
   porting. And hence, a hack was used instead. But after
   discussing it, instead using the repository-object patch
   series will help to resolve these issues in this situation.

PLAN FOR WEEK-6 (20 June 2017 to 26 June 2017):

1. summary: Mostly I'll be working on this and post the patch
   for discussion as soon as possible.

2. foreach: As it was decided that unblock the conversion of
   this submodule subcommand, the original cmd_foreach was
   ported without including the BUG-FIX patch here.
   Hence, for this week I will try to utilize the
   'repository-object' series by Brandon Williams.

3. deinit: I will be working on improvising this patch as it was
   recently debugged and posted for discussion.

[1]: https://docs.google.com/document/d/1krxVLooWl--75Pot3dazhfygR3wCUUWZWzTXtK1L-xU/

Thanks,
Prathamesh Chavan

^ permalink raw reply	[relevance 16%]

* [GSoC][PATCH 2/6] submodule--helper: introduce get_submodule_displaypath and for_each_submodule_list
      [irrelevant] ` <20170619215025.10086-1-pc44800@gmail.com>
@ 2017-06-19 21:50   ` Prathamesh Chavan
  2017-06-20 18:22     ` Brandon Williams
  2017-06-19 21:50   ` [GSoC][PATCH 3/6] submodule: port set_name_rev from shell to C Prathamesh Chavan
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 200+ results
From: Prathamesh Chavan @ 2017-06-19 21:50 UTC (permalink / raw)
  To: git; +Cc: sbeller, christian.couder, Prathamesh Chavan

Functions get_submodule_displaypath and for_each_submodule_list
for using them in the later patches, related to porting submodule
subcommands from shell to C.
These new functions are also used in ported submodule subcommand
init

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
---
 builtin/submodule--helper.c | 69 ++++++++++++++++++++++++++++++++-------------
 1 file changed, 50 insertions(+), 19 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 8cc648d85..f7adca95b 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -13,6 +13,9 @@
 #include "refs.h"
 #include "connect.h"
 
+typedef void (*submodule_list_func_t)(const struct cache_entry *list_item,
+				      void *cb_data);
+
 static char *get_default_remote(void)
 {
 	char *dest = NULL, *ret;
@@ -219,6 +222,27 @@ static int resolve_relative_url_test(int argc, const char **argv, const char *pr
 	return 0;
 }
 
+static char *get_submodule_displaypath(const char *path, const char *prefix)
+{
+	const char *super_prefix = get_super_prefix();
+
+	if (prefix && super_prefix) {
+		BUG("cannot have prefix '%s' and superprefix '%s'",
+		    prefix, super_prefix);
+	} else if (prefix) {
+		struct strbuf sb = STRBUF_INIT;
+		char *displaypath = xstrdup(relative_path(path, prefix, &sb));
+		strbuf_release(&sb);
+		return displaypath;
+	} else if (super_prefix) {
+		int len = strlen(super_prefix);
+		const char *format = is_dir_sep(super_prefix[len-1]) ? "%s%s" : "%s/%s";
+		return xstrfmt(format, super_prefix, path);
+	} else {
+		return xstrdup(path);
+	}
+}
+
 struct module_list {
 	const struct cache_entry **entries;
 	int alloc, nr;
@@ -330,26 +354,30 @@ static int module_list(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
-static void init_submodule(const char *path, const char *prefix, int quiet)
+static void for_each_submodule_list(const struct module_list list,
+				    submodule_list_func_t fn, void *cb_data)
 {
+	int i;
+	for (i = 0; i < list.nr; i++)
+		fn(list.entries[i], cb_data);
+}
+
+struct init_cb {
+	const char *prefix;
+	unsigned int quiet: 1;
+};
+#define INIT_CB_INIT { NULL, 0 }
+
+static void init_submodule(const struct cache_entry *list_item, void *cb_data)
+{
+	struct init_cb *info = cb_data;
 	const struct submodule *sub;
 	struct strbuf sb = STRBUF_INIT;
 	char *upd = NULL, *url = NULL, *displaypath;
 
-	/* Only loads from .gitmodules, no overlay with .git/config */
-	gitmodules_config();
-
-	if (prefix && get_super_prefix())
-		die("BUG: cannot have prefix and superprefix");
-	else if (prefix)
-		displaypath = xstrdup(relative_path(path, prefix, &sb));
-	else if (get_super_prefix()) {
-		strbuf_addf(&sb, "%s%s", get_super_prefix(), path);
-		displaypath = strbuf_detach(&sb, NULL);
-	} else
-		displaypath = xstrdup(path);
+	displaypath = get_submodule_displaypath(list_item->name, info->prefix);
 
-	sub = submodule_from_path(null_sha1, path);
+	sub = submodule_from_path(null_sha1, list_item->name);
 
 	if (!sub)
 		die(_("No url found for submodule path '%s' in .gitmodules"),
@@ -361,7 +389,7 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
 	 *
 	 * Set active flag for the submodule being initialized
 	 */
-	if (!is_submodule_initialized(path)) {
+	if (!is_submodule_initialized(list_item->name)) {
 		strbuf_reset(&sb);
 		strbuf_addf(&sb, "submodule.%s.active", sub->name);
 		git_config_set_gently(sb.buf, "true");
@@ -404,7 +432,7 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
 		if (git_config_set_gently(sb.buf, url))
 			die(_("Failed to register url for submodule path '%s'"),
 			    displaypath);
-		if (!quiet)
+		if (!info->quiet)
 			fprintf(stderr,
 				_("Submodule '%s' (%s) registered for path '%s'\n"),
 				sub->name, url, displaypath);
@@ -433,10 +461,10 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
 
 static int module_init(int argc, const char **argv, const char *prefix)
 {
+	struct init_cb info = INIT_CB_INIT;
 	struct pathspec pathspec;
 	struct module_list list = MODULE_LIST_INIT;
 	int quiet = 0;
-	int i;
 
 	struct option module_init_options[] = {
 		OPT__QUIET(&quiet, N_("Suppress output for initializing a submodule")),
@@ -461,8 +489,11 @@ static int module_init(int argc, const char **argv, const char *prefix)
 	if (!argc && git_config_get_value_multi("submodule.active"))
 		module_list_active(&list);
 
-	for (i = 0; i < list.nr; i++)
-		init_submodule(list.entries[i]->name, prefix, quiet);
+	info.prefix = prefix;
+	info.quiet = !!quiet;
+
+	gitmodules_config();
+	for_each_submodule_list(list, init_submodule, &info);
 
 	return 0;
 }
-- 
2.13.0


^ permalink raw reply	[relevance 19%]

* [GSoC][PATCH 3/6] submodule: port set_name_rev from shell to C
      [irrelevant] ` <20170619215025.10086-1-pc44800@gmail.com>
  2017-06-19 21:50   ` [GSoC][PATCH 2/6] submodule--helper: introduce get_submodule_displaypath and for_each_submodule_list Prathamesh Chavan
@ 2017-06-19 21:50   ` Prathamesh Chavan
  2017-06-19 21:50   ` [GSoC][PATCH 4/6] submodule: port submodule subcommand status Prathamesh Chavan
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 200+ results
From: Prathamesh Chavan @ 2017-06-19 21:50 UTC (permalink / raw)
  To: git; +Cc: sbeller, christian.couder, Prathamesh Chavan

Since later on we want to port submodule subcommand status, and since
set_name_rev is part of cmd_status, hence this function is ported. It
has been ported to function print_name_rev in C, which calls get_name_rev
to get the revname, and after formatting it, print_name_rev prints it.
And hence in this way, the command `git submodule--helper print-name-rev
"sm_path" "sha1"` sets value of revname in git-submodule.sh

The function get_name_rev returns the stdout of the git describe
commands. Since there are four different git-describe commands used for
generating the name rev, four child_process are introduced, each successive
child process running only when previous has no stdout. The order of these
four git-describe commands is maintained the same as it was in the function
set_name_rev() in shell script.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
---
 builtin/submodule--helper.c | 69 +++++++++++++++++++++++++++++++++++++++++++++
 git-submodule.sh            | 16 ++---------
 2 files changed, 71 insertions(+), 14 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index f7adca95b..6fd861e42 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -243,6 +243,74 @@ static char *get_submodule_displaypath(const char *path, const char *prefix)
 	}
 }
 
+enum describe_step {
+	step_bare,
+	step_tags,
+	step_contains,
+	step_all_always,
+	step_end
+};
+
+static char *get_name_rev(const char *sub_path, const char* object_id)
+{
+	struct strbuf sb = STRBUF_INIT;
+	enum describe_step cur_step;
+
+	for (cur_step = step_bare; cur_step < step_end; cur_step++) {
+		struct child_process cp = CHILD_PROCESS_INIT;
+		prepare_submodule_repo_env(&cp.env_array);
+		cp.dir = sub_path;
+		cp.git_cmd = 1;
+		cp.no_stderr = 1;
+
+		switch (cur_step) {
+			case step_bare:
+				argv_array_pushl(&cp.args, "describe",
+						 object_id, NULL);
+				break;
+			case step_tags:	
+				argv_array_pushl(&cp.args, "describe",
+						 "--tags", object_id, NULL);
+				break;
+			case step_contains:
+				argv_array_pushl(&cp.args, "describe",
+						 "--contains", object_id,
+						 NULL);
+				break;
+			case step_all_always:
+				argv_array_pushl(&cp.args, "describe",
+						 "--all", "--always",
+						 object_id, NULL);
+				break;
+			default:
+				BUG("unknown describe step '%d'", cur_step);
+		}
+
+		if (!capture_command(&cp, &sb, 0) && sb.len) {
+			strbuf_strip_suffix(&sb, "\n");
+			return strbuf_detach(&sb, NULL);
+		}
+
+	}
+
+	strbuf_release(&sb);
+	return NULL;
+}
+
+static int print_name_rev(int argc, const char **argv, const char *prefix)
+{
+	char *namerev;
+	if (argc != 3)
+		die("print-name-rev only accepts two arguments: <path> <sha1>");
+
+	namerev = get_name_rev(argv[1], argv[2]);
+	if (namerev && namerev[0])
+		printf(" (%s)", namerev);
+	printf("\n");
+
+	return 0;
+}
+
 struct module_list {
 	const struct cache_entry **entries;
 	int alloc, nr;
@@ -1242,6 +1310,7 @@ static struct cmd_struct commands[] = {
 	{"relative-path", resolve_relative_path, 0},
 	{"resolve-relative-url", resolve_relative_url, 0},
 	{"resolve-relative-url-test", resolve_relative_url_test, 0},
+	{"print-name-rev", print_name_rev, 0},
 	{"init", module_init, SUPPORT_SUPER_PREFIX},
 	{"remote-branch", resolve_remote_submodule_branch, 0},
 	{"push-check", push_check, 0},
diff --git a/git-submodule.sh b/git-submodule.sh
index c0d0e9a4c..091051891 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -758,18 +758,6 @@ cmd_update()
 	}
 }
 
-set_name_rev () {
-	revname=$( (
-		sanitize_submodule_env
-		cd "$1" && {
-			git describe "$2" 2>/dev/null ||
-			git describe --tags "$2" 2>/dev/null ||
-			git describe --contains "$2" 2>/dev/null ||
-			git describe --all --always "$2"
-		}
-	) )
-	test -z "$revname" || revname=" ($revname)"
-}
 #
 # Show commit summary for submodules in index or working tree
 #
@@ -1041,14 +1029,14 @@ cmd_status()
 		fi
 		if git diff-files --ignore-submodules=dirty --quiet -- "$sm_path"
 		then
-			set_name_rev "$sm_path" "$sha1"
+			revname=$(git submodule--helper print-name-rev "$sm_path" "$sha1")
 			say " $sha1 $displaypath$revname"
 		else
 			if test -z "$cached"
 			then
 				sha1=$(sanitize_submodule_env; cd "$sm_path" && git rev-parse --verify HEAD)
 			fi
-			set_name_rev "$sm_path" "$sha1"
+			revname=$(git submodule--helper print-name-rev "$sm_path" "$sha1")
 			say "+$sha1 $displaypath$revname"
 		fi
 
-- 
2.13.0


^ permalink raw reply	[relevance 20%]

* [GSoC][PATCH 4/6] submodule: port submodule subcommand status
      [irrelevant] ` <20170619215025.10086-1-pc44800@gmail.com>
  2017-06-19 21:50   ` [GSoC][PATCH 2/6] submodule--helper: introduce get_submodule_displaypath and for_each_submodule_list Prathamesh Chavan
  2017-06-19 21:50   ` [GSoC][PATCH 3/6] submodule: port set_name_rev from shell to C Prathamesh Chavan
@ 2017-06-19 21:50   ` Prathamesh Chavan
  2017-06-20 18:44     ` Brandon Williams
  2017-06-19 21:50   ` [GSoC][PATCH 5/6] submodule: port submodule subcommand sync from shell to C Prathamesh Chavan
  2017-06-19 21:50   ` [GSoC][PATCH 6/6] submodule: port submodule subcommand 'deinit' from shell to C Prathamesh Chavan
  4 siblings, 1 reply; 200+ results
From: Prathamesh Chavan @ 2017-06-19 21:50 UTC (permalink / raw)
  To: git; +Cc: sbeller, christian.couder, Prathamesh Chavan

The mechanism used for porting submodule subcommand 'status'
is similar to that used for subcommand 'foreach'.
The function cmd_status from git-submodule is ported to three
functions in the builtin submodule--helper namely: module_status,
for_each_submodule_list and status_submodule.

print_status is also introduced for handling the output of
the subcommand and also to reduce the code size.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
---
 builtin/submodule--helper.c | 152 ++++++++++++++++++++++++++++++++++++++++++++
 git-submodule.sh            |  49 +-------------
 2 files changed, 153 insertions(+), 48 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 6fd861e42..78b21ab22 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -566,6 +566,157 @@ static int module_init(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
+struct status_cb {
+	const char *prefix;
+	unsigned int quiet: 1;
+	unsigned int recursive: 1;
+	unsigned int cached: 1;
+};
+#define STATUS_CB_INIT { NULL, 0, 0, 0 }
+
+static void print_status(struct status_cb *info, char state, const char *path,
+			 char *sub_sha1, char *displaypath)
+{
+	if (info->quiet)
+		return;
+
+	printf("%c%s %s", state, sub_sha1, displaypath);
+
+	if (state == ' ' || state == '+') {
+		struct argv_array name_rev_args = ARGV_ARRAY_INIT;
+
+		argv_array_pushl(&name_rev_args, "print-name-rev",
+				 path, sub_sha1, NULL);
+		print_name_rev(name_rev_args.argc, name_rev_args.argv,
+			       info->prefix);
+	} else {
+		printf("\n");
+	}
+}
+
+static void status_submodule(const struct cache_entry *list_item, void *cb_data)
+{
+	struct status_cb *info = cb_data;
+	char *sub_sha1 = xstrdup(oid_to_hex(&list_item->oid));
+	char *displaypath;
+	struct argv_array diff_files_args = ARGV_ARRAY_INIT;
+
+	if (!submodule_from_path(null_sha1, list_item->name))
+		die(_("no submodule mapping found in .gitmodules for path '%s'"),
+		      list_item->name);
+
+	displaypath = get_submodule_displaypath(list_item->name, info->prefix);
+
+	if (list_item->ce_flags) {
+		print_status(info, 'U', list_item->name,
+			     sha1_to_hex(null_sha1), displaypath);
+		goto cleanup;
+	}
+
+	if (!is_submodule_initialized(list_item->name)) {
+		print_status(info, '-', list_item->name, sub_sha1, displaypath);
+		goto cleanup;
+	}
+
+	argv_array_pushl(&diff_files_args, "diff-files",
+			 "--ignore-submodules=dirty", "--quiet", "--",
+			 list_item->name, NULL);
+
+	if (!cmd_diff_files(diff_files_args.argc, diff_files_args.argv,
+			    info->prefix)) {
+		print_status(info, ' ', list_item->name, sub_sha1, displaypath);
+	} else {
+		if (!info->cached) {
+			struct child_process cp = CHILD_PROCESS_INIT;
+			struct strbuf sb = STRBUF_INIT;
+
+			prepare_submodule_repo_env(&cp.env_array);
+			cp.git_cmd = 1;
+			cp.dir = list_item->name;
+
+			argv_array_pushl(&cp.args, "rev-parse",
+					 "--verify", "HEAD", NULL);
+
+			if (capture_command(&cp, &sb, 0))
+				die(_("could not run 'git rev-parse --verify"
+				      "HEAD' in submodule %s"),
+				      list_item->name);
+
+			strbuf_strip_suffix(&sb, "\n");
+			print_status(info, '+', list_item->name, sb.buf,
+				     displaypath);
+			strbuf_release(&sb);
+		} else {
+			print_status(info, '+', list_item->name, sub_sha1,
+				     displaypath);
+		}
+	}
+
+	if (info->recursive) {
+		struct child_process cpr = CHILD_PROCESS_INIT;
+
+		cpr.git_cmd = 1;
+		cpr.dir = list_item->name;
+		prepare_submodule_repo_env(&cpr.env_array);
+
+		argv_array_pushl(&cpr.args, "--super-prefix", displaypath,
+				 "submodule--helper", "status", "--recursive",
+				 NULL);
+
+		if (info->cached)
+			argv_array_push(&cpr.args, "--cached");
+
+		if (info->quiet)
+			argv_array_push(&cpr.args, "--quiet");
+
+		if (run_command(&cpr))
+			die(_("failed to recurse into submodule '%s'"),
+			      list_item->name);
+	}
+
+cleanup:
+	free(displaypath);
+	free(sub_sha1);
+}
+
+static int module_status(int argc, const char **argv, const char *prefix)
+{
+	struct status_cb info = STATUS_CB_INIT;
+	struct pathspec pathspec;
+	struct module_list list = MODULE_LIST_INIT;
+	int quiet = 0;
+	int cached = 0;
+	int recursive = 0;
+
+	struct option module_status_options[] = {
+		OPT__QUIET(&quiet, N_("Suppress submodule status output")),
+		OPT_BOOL(0, "cached", &cached, N_("Use commit stored in the index instead of the one stored in the submodule HEAD")),
+		OPT_BOOL(0, "recursive", &recursive, N_("Recurse into nested submodules")),
+		OPT_END()
+	};
+
+	const char *const git_submodule_helper_usage[] = {
+		N_("git submodule status [--quiet] [--cached] [--recursive] [<path>]"),
+		NULL
+	};
+
+	argc = parse_options(argc, argv, prefix, module_status_options,
+			     git_submodule_helper_usage, 0);
+
+	if (module_list_compute(argc, argv, prefix, &pathspec, &list) < 0)
+		return 1;
+
+	info.prefix = prefix;
+	info.quiet = !!quiet;
+	info.recursive = !!recursive;
+	info.cached = !!cached;
+
+	gitmodules_config();
+	for_each_submodule_list(list, status_submodule, &info);
+
+	return 0;
+}
+
 static int module_name(int argc, const char **argv, const char *prefix)
 {
 	const struct submodule *sub;
@@ -1312,6 +1463,7 @@ static struct cmd_struct commands[] = {
 	{"resolve-relative-url-test", resolve_relative_url_test, 0},
 	{"print-name-rev", print_name_rev, 0},
 	{"init", module_init, SUPPORT_SUPER_PREFIX},
+	{"status", module_status, SUPPORT_SUPER_PREFIX},
 	{"remote-branch", resolve_remote_submodule_branch, 0},
 	{"push-check", push_check, 0},
 	{"absorb-git-dirs", absorb_git_dirs, SUPPORT_SUPER_PREFIX},
diff --git a/git-submodule.sh b/git-submodule.sh
index 091051891..a24b1b91b 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -1004,54 +1004,7 @@ cmd_status()
 		shift
 	done
 
-	{
-		git submodule--helper list --prefix "$wt_prefix" "$@" ||
-		echo "#unmatched" $?
-	} |
-	while read -r mode sha1 stage sm_path
-	do
-		die_if_unmatched "$mode" "$sha1"
-		name=$(git submodule--helper name "$sm_path") || exit
-		displaypath=$(git submodule--helper relative-path "$prefix$sm_path" "$wt_prefix")
-		if test "$stage" = U
-		then
-			say "U$sha1 $displaypath"
-			continue
-		fi
-		if ! git submodule--helper is-active "$sm_path" ||
-		{
-			! test -d "$sm_path"/.git &&
-			! test -f "$sm_path"/.git
-		}
-		then
-			say "-$sha1 $displaypath"
-			continue;
-		fi
-		if git diff-files --ignore-submodules=dirty --quiet -- "$sm_path"
-		then
-			revname=$(git submodule--helper print-name-rev "$sm_path" "$sha1")
-			say " $sha1 $displaypath$revname"
-		else
-			if test -z "$cached"
-			then
-				sha1=$(sanitize_submodule_env; cd "$sm_path" && git rev-parse --verify HEAD)
-			fi
-			revname=$(git submodule--helper print-name-rev "$sm_path" "$sha1")
-			say "+$sha1 $displaypath$revname"
-		fi
-
-		if test -n "$recursive"
-		then
-			(
-				prefix="$displaypath/"
-				sanitize_submodule_env
-				wt_prefix=
-				cd "$sm_path" &&
-				eval cmd_status
-			) ||
-			die "$(eval_gettext "Failed to recurse into submodule path '\$sm_path'")"
-		fi
-	done
+	git ${wt_prefix:+-C "$wt_prefix"} ${prefix:+--super-prefix "$prefix"} submodule--helper status ${GIT_QUIET:+--quiet} ${cached:+--cached} ${recursive:+--recursive} "$@"
 }
 #
 # Sync remote urls for submodules
-- 
2.13.0


^ permalink raw reply	[relevance 20%]

* [GSoC][PATCH 5/6] submodule: port submodule subcommand sync from shell to C
      [irrelevant] ` <20170619215025.10086-1-pc44800@gmail.com>
                     ` (2 preceding siblings ...)
  2017-06-19 21:50   ` [GSoC][PATCH 4/6] submodule: port submodule subcommand status Prathamesh Chavan
@ 2017-06-19 21:50   ` Prathamesh Chavan
  2017-06-20 17:35     ` Stefan Beller
  2017-06-19 21:50   ` [GSoC][PATCH 6/6] submodule: port submodule subcommand 'deinit' from shell to C Prathamesh Chavan
  4 siblings, 1 reply; 200+ results
From: Prathamesh Chavan @ 2017-06-19 21:50 UTC (permalink / raw)
  To: git; +Cc: sbeller, christian.couder, Prathamesh Chavan

The mechanism used for porting the submodule subcommand 'sync' is
similar to that of 'foreach', where we split the function cmd_sync
from shell into three functions in C, module_sync,
for_each_submodule_list and sync_submodule.

print_default_remote is introduced as a submodule--helper
subcommand for getting the default remote as stdout.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
---
 builtin/submodule--helper.c | 180 ++++++++++++++++++++++++++++++++++++++++++++
 git-submodule.sh            |  56 +-------------
 2 files changed, 181 insertions(+), 55 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 78b21ab22..e10cac462 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -43,6 +43,20 @@ static char *get_default_remote(void)
 	return ret;
 }
 
+static int print_default_remote(int argc, const char **argv, const char *prefix)
+{
+	const char *remote;
+
+	if (argc != 1)
+		die(_("submodule--helper print-default-remote takes no arguments"));
+
+	remote = get_default_remote();
+	if (remote)
+		puts(remote);
+
+	return 0;
+}
+
 static int starts_with_dot_slash(const char *str)
 {
 	return str[0] == '.' && is_dir_sep(str[1]);
@@ -311,6 +325,25 @@ static int print_name_rev(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
+static char *get_up_path(const char *path)
+{
+	int i = count_slashes(path);
+	struct strbuf sb = STRBUF_INIT;
+
+	while (i--)
+		strbuf_addstr(&sb, "../");
+
+	/*
+	 *Check if 'path' ends with slash or not
+	 *for having the same output for dir/sub_dir
+	 *and dir/sub_dir/
+	 */
+	if (!is_dir_sep(path[i - 1]))
+		strbuf_addstr(&sb, "../");
+
+	return strbuf_detach(&sb, NULL);
+}
+
 struct module_list {
 	const struct cache_entry **entries;
 	int alloc, nr;
@@ -736,6 +769,151 @@ static int module_name(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
+struct sync_cb {
+	const char *prefix;
+	unsigned int quiet: 1;
+	unsigned int recursive: 1;
+};
+#define SYNC_CB_INIT { NULL, 0, 0 }
+
+static void sync_submodule(const struct cache_entry *list_item, void *cb_data)
+{
+	struct sync_cb *info = cb_data;
+	const struct submodule *sub;
+	char *sub_key, *remote_key;
+	char *url, *sub_origin_url, *super_config_url, *displaypath;
+	struct strbuf sb = STRBUF_INIT;
+	struct child_process cp = CHILD_PROCESS_INIT;
+
+	if (!is_submodule_initialized(list_item->name))
+		return;
+
+	sub = submodule_from_path(null_sha1, list_item->name);
+
+	if (!sub->url)
+		die(_("no url found for submodule path '%s' in .gitmodules"),
+		      list_item->name);
+
+	url = xstrdup(sub->url);
+
+	if (starts_with_dot_dot_slash(url) || starts_with_dot_slash(url)) {
+		char *remote_url, *up_path;
+		char *remote = get_default_remote();
+		char *remote_key = xstrfmt("remote.%s.url", remote);
+		free(remote);
+
+		if (git_config_get_string(remote_key, &remote_url))
+			remote_url = xgetcwd();
+		up_path = get_up_path(list_item->name);
+		sub_origin_url = relative_url(remote_url, url, up_path);
+		super_config_url = relative_url(remote_url, url, NULL);
+		free(remote_key);
+		free(up_path);
+		free(remote_url);
+	} else {
+		sub_origin_url = xstrdup(url);
+		super_config_url = xstrdup(url);
+	}
+
+	displaypath = get_submodule_displaypath(list_item->name, info->prefix);
+
+	if (!info->quiet)
+		printf(_("Synchronizing submodule url for '%s'\n"),
+			 displaypath);
+
+	sub_key = xstrfmt("submodule.%s.url", sub->name);
+	if (git_config_set_gently(sub_key, super_config_url))
+		die(_("failed to register url for submodule path '%s'"),
+		      displaypath);
+
+	if (!is_submodule_populated_gently(list_item->name, NULL))
+		goto cleanup;
+
+	prepare_submodule_repo_env(&cp.env_array);
+	cp.git_cmd = 1;
+	cp.dir = list_item->name;
+	argv_array_pushl(&cp.args, "submodule--helper",
+			 "print-default-remote", NULL);
+	if (capture_command(&cp, &sb, 0))
+		die(_("failed to get the default remote for submodule '%s'"),
+		      list_item->name);
+
+	strbuf_strip_suffix(&sb, "\n");
+	remote_key = xstrfmt("remote.%s.url", sb.buf);
+	strbuf_release(&sb);
+
+	child_process_init(&cp);
+	prepare_submodule_repo_env(&cp.env_array);
+	cp.git_cmd = 1;
+	cp.dir = list_item->name;
+	argv_array_pushl(&cp.args, "config", remote_key, sub_origin_url, NULL);
+	if (run_command(&cp))
+		die(_("failed to update remote for submodule '%s'"),
+		      list_item->name);
+
+	if (info->recursive) {
+		struct child_process cpr = CHILD_PROCESS_INIT;
+
+		cpr.git_cmd = 1;
+		cpr.dir = list_item->name;
+		prepare_submodule_repo_env(&cpr.env_array);
+
+		argv_array_pushl(&cpr.args, "--super-prefix", displaypath,
+				 "submodule--helper", "sync", "--recursive",
+				 NULL);
+
+		if (info->quiet)
+			argv_array_push(&cpr.args, "--quiet");
+
+		if (run_command(&cpr))
+			die(_("failed to recurse into submodule '%s'"),
+			      list_item->name);
+	}
+
+cleanup:
+	free(sub_key);
+	free(url);
+	free(super_config_url);
+	free(displaypath);
+	free(sub_origin_url);
+}
+
+static int module_sync(int argc, const char **argv, const char *prefix)
+{
+	struct sync_cb info = SYNC_CB_INIT;
+	struct pathspec pathspec;
+	struct module_list list = MODULE_LIST_INIT;
+	int quiet = 0;
+	int recursive = 0;
+
+	struct option module_sync_options[] = {
+		OPT__QUIET(&quiet, N_("Suppress output of synchronizing submodule url")),
+		OPT_BOOL(0, "recursive", &recursive,
+			N_("Recurse into nested submodules")),
+		OPT_END()
+	};
+
+	const char *const git_submodule_helper_usage[] = {
+		N_("git submodule--helper sync [--quiet] [--recursive] [<path>]"),
+		NULL
+	};
+
+	argc = parse_options(argc, argv, prefix, module_sync_options,
+			     git_submodule_helper_usage, 0);
+
+	if (module_list_compute(argc, argv, prefix, &pathspec, &list) < 0)
+		return 1;
+
+	info.prefix = prefix;
+	info.quiet = !!quiet;
+	info.recursive = !!recursive;
+
+	gitmodules_config();
+	for_each_submodule_list(list, sync_submodule, &info);
+
+	return 0;
+}
+
 static int clone_submodule(const char *path, const char *gitdir, const char *url,
 			   const char *depth, struct string_list *reference,
 			   int quiet, int progress)
@@ -1464,6 +1642,8 @@ static struct cmd_struct commands[] = {
 	{"print-name-rev", print_name_rev, 0},
 	{"init", module_init, SUPPORT_SUPER_PREFIX},
 	{"status", module_status, SUPPORT_SUPER_PREFIX},
+	{"print-default-remote", print_default_remote, 0},
+	{"sync", module_sync, SUPPORT_SUPER_PREFIX},
 	{"remote-branch", resolve_remote_submodule_branch, 0},
 	{"push-check", push_check, 0},
 	{"absorb-git-dirs", absorb_git_dirs, SUPPORT_SUPER_PREFIX},
diff --git a/git-submodule.sh b/git-submodule.sh
index a24b1b91b..33b4b7306 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -1036,63 +1036,9 @@ cmd_sync()
 			;;
 		esac
 	done
-	cd_to_toplevel
-	{
-		git submodule--helper list --prefix "$wt_prefix" "$@" ||
-		echo "#unmatched" $?
-	} |
-	while read -r mode sha1 stage sm_path
-	do
-		die_if_unmatched "$mode" "$sha1"
-
-		# skip inactive submodules
-		if ! git submodule--helper is-active "$sm_path"
-		then
-			continue
-		fi
-
-		name=$(git submodule--helper name "$sm_path")
-		url=$(git config -f .gitmodules --get submodule."$name".url)
-
-		# Possibly a url relative to parent
-		case "$url" in
-		./*|../*)
-			# rewrite foo/bar as ../.. to find path from
-			# submodule work tree to superproject work tree
-			up_path="$(printf '%s\n' "$sm_path" | sed "s/[^/][^/]*/../g")" &&
-			# guarantee a trailing /
-			up_path=${up_path%/}/ &&
-			# path from submodule work tree to submodule origin repo
-			sub_origin_url=$(git submodule--helper resolve-relative-url "$url" "$up_path") &&
-			# path from superproject work tree to submodule origin repo
-			super_config_url=$(git submodule--helper resolve-relative-url "$url") || exit
-			;;
-		*)
-			sub_origin_url="$url"
-			super_config_url="$url"
-			;;
-		esac
 
-		displaypath=$(git submodule--helper relative-path "$prefix$sm_path" "$wt_prefix")
-		say "$(eval_gettext "Synchronizing submodule url for '\$displaypath'")"
-		git config submodule."$name".url "$super_config_url"
-
-		if test -e "$sm_path"/.git
-		then
-		(
-			sanitize_submodule_env
-			cd "$sm_path"
-			remote=$(get_default_remote)
-			git config remote."$remote".url "$sub_origin_url"
+	git ${wt_prefix:+-C "$wt_prefix"} ${prefix:+--super-prefix "$prefix"} submodule--helper sync ${GIT_QUIET:+--quiet} ${recursive:+--recursive} "$@"
 
-			if test -n "$recursive"
-			then
-				prefix="$prefix$sm_path/"
-				eval cmd_sync
-			fi
-		)
-		fi
-	done
 }
 
 cmd_absorbgitdirs()
-- 
2.13.0


^ permalink raw reply	[relevance 19%]

* [GSoC][PATCH 6/6] submodule: port submodule subcommand 'deinit' from shell to C
      [irrelevant] ` <20170619215025.10086-1-pc44800@gmail.com>
                     ` (3 preceding siblings ...)
  2017-06-19 21:50   ` [GSoC][PATCH 5/6] submodule: port submodule subcommand sync from shell to C Prathamesh Chavan
@ 2017-06-19 21:50   ` Prathamesh Chavan
  4 siblings, 0 replies; 200+ results
From: Prathamesh Chavan @ 2017-06-19 21:50 UTC (permalink / raw)
  To: git; +Cc: sbeller, christian.couder, Prathamesh Chavan

The same mechanism is used even for porting this submodule
subcommand, as used in the ported subcommands till now.
The function cmd_deinit in split up after porting into three
functions: module_deinit, for_each_submodule_list and
deinit_submodule.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
---
 builtin/submodule--helper.c | 140 ++++++++++++++++++++++++++++++++++++++++++++
 git-submodule.sh            |  55 +----------------
 2 files changed, 141 insertions(+), 54 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index e10cac462..f029f5fae 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -750,6 +750,145 @@ static int module_status(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
+struct deinit_cb {
+	const char *prefix;
+	unsigned int quiet: 1;
+	unsigned int force: 1;
+	unsigned int all: 1;
+};
+#define DEINIT_CB_INIT { NULL, 0, 0, 0 }
+
+static void deinit_submodule(const struct cache_entry *list_item,
+			     void *cb_data)
+{
+	struct deinit_cb *info = cb_data;
+	const struct submodule *sub;
+	char *displaypath = NULL;
+	struct child_process cp_config = CHILD_PROCESS_INIT;
+	struct strbuf sb_config = STRBUF_INIT;
+	char *sm_path = xstrdup(list_item->name);
+	char *sub_git_dir = xstrfmt("%s/.git", sm_path);
+
+	sub = submodule_from_path(null_sha1, sm_path);
+
+	if (!sub->name)
+		goto cleanup;
+
+	displaypath = get_submodule_displaypath(sm_path, info->prefix);
+
+	/* remove the submodule work tree (unless the user already did it) */
+	if (is_directory(sm_path)) {
+		struct child_process cp = CHILD_PROCESS_INIT;
+
+		/* protect submodules containing a .git directory */
+		if (is_git_directory(sub_git_dir))
+			die(_("Submodule work tree '%s' contains a .git "
+			      "directory use 'rm -rf' if you really want "
+			      "to remove it including all of its history"),
+			      displaypath);
+
+		if (!info->force) {
+			struct child_process cp_rm = CHILD_PROCESS_INIT;
+			cp_rm.git_cmd = 1;
+			argv_array_pushl(&cp_rm.args, "rm", "-qn", sm_path,
+					 NULL);
+
+			/* list_item->name is changed by cmd_rm() below */
+			if (run_command(&cp_rm))
+				die(_("Submodule work tree '%s' contains local "
+				      "modifications; use '-f' to discard them"),
+				      displaypath);
+		}
+
+		cp.use_shell = 1;
+		argv_array_pushl(&cp.args, "rm", "-rf", sm_path, NULL);
+		if (!run_command(&cp)) {
+			if (!info->quiet)
+				printf(_("Cleared directory '%s'\n"),
+					 displaypath);
+		} else {
+			if (!info->quiet)
+				printf(_("Could not remove submodule work tree '%s'\n"),
+					 displaypath);
+		}
+	}
+
+	if (mkdir(sm_path, 0700))
+		die(_("could not create empty submodule directory %s"),
+		      displaypath);
+
+	cp_config.git_cmd = 1;
+	argv_array_pushl(&cp_config.args, "config", "--get-regexp", NULL);
+	argv_array_pushf(&cp_config.args, "submodule.%s\\.", sub->name);
+
+	/* remove the .git/config entries (unless the user already did it) */
+	if (!capture_command(&cp_config, &sb_config, 0) && sb_config.len) {
+		char *sub_key = xstrfmt("submodule.%s", sub->name);
+		/*
+		 * remove the whole section so we have a clean state when
+		 * the user later decides to init this submodule again
+		 */
+		git_config_rename_section_in_file(NULL, sub_key, NULL);
+		if (!info->quiet)
+			printf(_("Submodule '%s' (%s) unregistered for path '%s'\n"),
+				 sub->name, sub->url, displaypath);
+		free(sub_key);
+	}
+
+cleanup:
+	free(displaypath);
+	free(sub_git_dir);
+	free(sm_path);
+	strbuf_release(&sb_config);
+}
+
+static int module_deinit(int argc, const char **argv, const char *prefix)
+{
+	struct deinit_cb info = DEINIT_CB_INIT;
+	struct pathspec pathspec;
+	struct module_list list = MODULE_LIST_INIT;
+	int quiet = 0;
+	int force = 0;
+	int all = 0;
+
+	struct option module_deinit_options[] = {
+		OPT__QUIET(&quiet, N_("Suppress submodule status output")),
+		OPT__FORCE(&force, N_("Remove submodule working trees even if they contain local changes")),
+		OPT_BOOL(0, "all", &all, N_("Unregister all submodules")),
+		OPT_END()
+	};
+
+	const char *const git_submodule_helper_usage[] = {
+		N_("git submodule deinit [--quiet] [-f | --force] [--all | [--] [<path>...]]"),
+		NULL
+	};
+
+	argc = parse_options(argc, argv, prefix, module_deinit_options,
+			     git_submodule_helper_usage, 0);
+
+	if (module_list_compute(argc, argv, prefix, &pathspec, &list) < 0)
+		BUG("module_list_compute should not choke on empty pathspec");
+
+	info.prefix = prefix;
+	info.quiet = !!quiet;
+	info.all = !!all;
+	info.force = !!force;
+
+	if (all && argc) {
+		error("pathspec and --all are incompatible");
+		usage_with_options(git_submodule_helper_usage,
+				   module_deinit_options);
+	}
+
+	if (!argc && !all)
+		die(_("Use '--all' if you really want to deinitialize all submodules"));
+
+	gitmodules_config();
+	for_each_submodule_list(list, deinit_submodule, &info);
+
+	return 0;
+}
+
 static int module_name(int argc, const char **argv, const char *prefix)
 {
 	const struct submodule *sub;
@@ -1644,6 +1783,7 @@ static struct cmd_struct commands[] = {
 	{"status", module_status, SUPPORT_SUPER_PREFIX},
 	{"print-default-remote", print_default_remote, 0},
 	{"sync", module_sync, SUPPORT_SUPER_PREFIX},
+	{"deinit", module_deinit, SUPPORT_SUPER_PREFIX},
 	{"remote-branch", resolve_remote_submodule_branch, 0},
 	{"push-check", push_check, 0},
 	{"absorb-git-dirs", absorb_git_dirs, SUPPORT_SUPER_PREFIX},
diff --git a/git-submodule.sh b/git-submodule.sh
index 33b4b7306..2b2c6f5da 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -427,60 +427,7 @@ cmd_deinit()
 		shift
 	done
 
-	if test -n "$deinit_all" && test "$#" -ne 0
-	then
-		echo >&2 "$(eval_gettext "pathspec and --all are incompatible")"
-		usage
-	fi
-	if test $# = 0 && test -z "$deinit_all"
-	then
-		die "$(eval_gettext "Use '--all' if you really want to deinitialize all submodules")"
-	fi
-
-	{
-		git submodule--helper list --prefix "$wt_prefix" "$@" ||
-		echo "#unmatched" $?
-	} |
-	while read -r mode sha1 stage sm_path
-	do
-		die_if_unmatched "$mode" "$sha1"
-		name=$(git submodule--helper name "$sm_path") || exit
-
-		displaypath=$(git submodule--helper relative-path "$sm_path" "$wt_prefix")
-
-		# Remove the submodule work tree (unless the user already did it)
-		if test -d "$sm_path"
-		then
-			# Protect submodules containing a .git directory
-			if test -d "$sm_path/.git"
-			then
-				die "$(eval_gettext "\
-Submodule work tree '\$displaypath' contains a .git directory
-(use 'rm -rf' if you really want to remove it including all of its history)")"
-			fi
-
-			if test -z "$force"
-			then
-				git rm -qn "$sm_path" ||
-				die "$(eval_gettext "Submodule work tree '\$displaypath' contains local modifications; use '-f' to discard them")"
-			fi
-			rm -rf "$sm_path" &&
-			say "$(eval_gettext "Cleared directory '\$displaypath'")" ||
-			say "$(eval_gettext "Could not remove submodule work tree '\$displaypath'")"
-		fi
-
-		mkdir "$sm_path" || say "$(eval_gettext "Could not create empty submodule directory '\$displaypath'")"
-
-		# Remove the .git/config entries (unless the user already did it)
-		if test -n "$(git config --get-regexp submodule."$name\.")"
-		then
-			# Remove the whole section so we have a clean state when
-			# the user later decides to init this submodule again
-			url=$(git config submodule."$name".url)
-			git config --remove-section submodule."$name" 2>/dev/null &&
-			say "$(eval_gettext "Submodule '\$name' (\$url) unregistered for path '\$displaypath'")"
-		fi
-	done
+	git ${wt_prefix:+-C "$wt_prefix"} submodule--helper deinit ${GIT_QUIET:+--quiet} ${prefix:+--prefix "$prefix"} ${force:+--force} ${deinit_all:+--all} "$@"
 }
 
 is_tip_reachable () (
-- 
2.13.0


^ permalink raw reply	[relevance 20%]

* [PATCH 00/26] reroll of sb/diff-color-moved
      [irrelevant] <CAGZ79kaqjQYmkt77kk5m=fdBfbZAvwd0YhhT7=O5b-FkQmDfHg@mail.gmail.com>
@ 2017-06-20  2:47 ` Stefan Beller
  2017-06-20  2:48   ` [PATCH 15/26] submodule.c: migrate diff output to use emit_diff_symbol Stefan Beller
                     ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Stefan Beller @ 2017-06-20  2:47 UTC (permalink / raw)
  To: sbeller; +Cc: bmwill, git, gitster, jonathantanmy, jrnieder, mhagger, peff, philipoakley

This is a complete rewrite of the series. Highlights:
* instead of buffering partial lines, we'll pretend all diff output
  follows a well defined grammar, and we emit symbols thereof.
  (The difference is mostly mental, though by this trick we reduce
  the memory footprint for storing one of these symbols from 7 variables
  (3 pointers, 3 ints, one state (also int) down to 4 variables
  (one pointer, 2 ints, one state).
* The algorithm for color painting was detangled:
  -> different functions for block detection and dimming
  -> The last patch (not to be applied) is an RFC that shows
     how we would approach non-colored, but machine parseable highlighting
     of moved lines.

Thanks,
Stefan

Stefan Beller (26):
  diff.c: readability fix
  diff.c: move line ending check into emit_hunk_header
  diff.c: factor out diff_flush_patch_all_file_pairs
  diff.c: introduce emit_diff_symbol
  diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_MARKER
  diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_FRAGINFO
  diff.c: emit_diff_symbol learns DIFF_SYMBOL_NO_LF_EOF
  diff.c: migrate emit_line_checked to use emit_diff_symbol
  diff.c: emit_diff_symbol learns DIFF_SYMBOL_WORDS{_PORCELAIN}
  diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_INCOMPLETE
  diff.c: emit_diff_symbol learns DIFF_SYMBOL_FILEPAIR
  diff.c: emit_diff_symbol learns DIFF_SYMBOL_HEADER
  diff.c: emit_diff_symbol learns about DIFF_SYMBOL_BINARY_FILES
  diff.c: emit_diff_symbol learns DIFF_SYMBOL_REWRITE_DIFF
  submodule.c: migrate diff output to use emit_diff_symbol
  diff.c: convert emit_binary_diff_body to use emit_diff_symbol
  diff.c: convert show_stats to use emit_diff_symbol
  diff.c: convert word diffing to use emit_diff_symbol
  diff.c: emit_diff_symbol learns about DIFF_SYMBOL_STAT_SEP
  diff.c: emit_diff_symbol learns about DIFF_SYMBOL_SUMMARY
  diff.c: buffer all output if asked to
  diff.c: color moved lines differently
  diff.c: color moved lines differently, plain mode
  diff.c: add dimming to moved line detection
  diff: document the new --color-moved setting
  WIP/RFC: diff.c: have a "machine parseable" move coloring

 Documentation/config.txt       |   12 +-
 Documentation/diff-options.txt |   27 +
 cache.h                        |    1 +
 color.h                        |    2 +
 diff.c                         | 1283 ++++++++++++++++++++++++++++++++--------
 diff.h                         |   39 +-
 submodule.c                    |   85 ++-
 submodule.h                    |   13 +-
 t/t4015-diff-whitespace.sh     |  369 ++++++++++++
 9 files changed, 1515 insertions(+), 316 deletions(-)

-- 
2.12.2.575.gb14f27f917


^ permalink raw reply	[relevance 24%]

* [PATCH 15/26] submodule.c: migrate diff output to use emit_diff_symbol
  2017-06-20  2:47 ` [PATCH 00/26] reroll of sb/diff-color-moved Stefan Beller
@ 2017-06-20  2:48   ` Stefan Beller
  2017-06-20 20:09     ` Jonathan Tan
  2017-06-20  2:48   ` [PATCH 22/26] diff.c: color moved lines differently Stefan Beller
  2017-06-23  1:28   ` [PATCHv2 00/25] reroll of sb/diff-color-moved Stefan Beller
  2 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-06-20  2:48 UTC (permalink / raw)
  To: sbeller; +Cc: bmwill, git, gitster, jonathantanmy, jrnieder, mhagger, peff, philipoakley

As the submodule process is no longer attached to the same stdout as
the superprojects process, we need to pass coloring explicitly.

Remove the colors from the function signatures, as all the coloring
decisions will be made either inside the child process or the final
emit_diff_symbol.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 diff.c      | 89 ++++++++++++++++++++++++++++++++++++++++++++++++++++---------
 diff.h      |  9 +++++++
 submodule.c | 85 ++++++++++++++++++++++++++--------------------------------
 submodule.h | 13 +++------
 4 files changed, 128 insertions(+), 68 deletions(-)

diff --git a/diff.c b/diff.c
index 96ce53c5cf..42a9020d95 100644
--- a/diff.c
+++ b/diff.c
@@ -574,6 +574,13 @@ enum diff_symbol {
 	DIFF_SYMBOL_HEADER,
 	DIFF_SYMBOL_BINARY_FILES,
 	DIFF_SYMBOL_REWRITE_DIFF,
+	DIFF_SYMBOL_SUBMODULE_ADD,
+	DIFF_SYMBOL_SUBMODULE_DEL,
+	DIFF_SYMBOL_SUBMODULE_UNTRACKED,
+	DIFF_SYMBOL_SUBMODULE_MODIFIED,
+	DIFF_SYMBOL_SUBMODULE_HEADER,
+	DIFF_SYMBOL_SUBMODULE_ERROR,
+	DIFF_SYMBOL_SUBMODULE_PIPETHROUGH,
 };
 /*
  * Flags for content lines:
@@ -700,11 +707,77 @@ static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s,
 		reset = diff_get_color_opt(o, DIFF_RESET);
 		emit_line(o, fraginfo, reset, line, len);
 		break;
+	case DIFF_SYMBOL_SUBMODULE_ADD:
+		set = diff_get_color_opt(o, DIFF_FILE_NEW);
+		reset = diff_get_color_opt(o, DIFF_RESET);
+		emit_line(o, set, reset, line, len);
+		break;
+	case DIFF_SYMBOL_SUBMODULE_DEL:
+		set = diff_get_color_opt(o, DIFF_FILE_OLD);
+		reset = diff_get_color_opt(o, DIFF_RESET);
+		emit_line(o, set, reset, line, len);
+		break;
+	case DIFF_SYMBOL_SUBMODULE_UNTRACKED:
+		fprintf(o->file, "%sSubmodule %s contains untracked content\n",
+			diff_line_prefix(o), line);
+		break;
+	case DIFF_SYMBOL_SUBMODULE_MODIFIED:
+		fprintf(o->file, "%sSubmodule %s contains modified content\n",
+			diff_line_prefix(o), line);
+		break;
+	case DIFF_SYMBOL_SUBMODULE_HEADER:
+		fprintf(o->file, "%s%s", diff_line_prefix(o), line);
+		break;
+	case DIFF_SYMBOL_SUBMODULE_ERROR:
+		emit_line(o, "", "", line, len);
+		break;
+	case DIFF_SYMBOL_SUBMODULE_PIPETHROUGH:
+		emit_line(o, "", "", line, len);
+		break;
 	default:
 		die("BUG: unknown diff symbol");
 	}
 }
 
+void diff_emit_submodule_del(struct diff_options *o, const char *line)
+{
+	emit_diff_symbol(o, DIFF_SYMBOL_SUBMODULE_DEL, line, strlen(line), 0);
+}
+
+void diff_emit_submodule_add(struct diff_options *o, const char *line)
+{
+	emit_diff_symbol(o, DIFF_SYMBOL_SUBMODULE_ADD, line, strlen(line), 0);
+}
+
+void diff_emit_submodule_untracked(struct diff_options *o, const char *path)
+{
+	emit_diff_symbol(o, DIFF_SYMBOL_SUBMODULE_UNTRACKED,
+			 path, strlen(path), 0);
+}
+
+void diff_emit_submodule_modified(struct diff_options *o, const char *path)
+{
+	emit_diff_symbol(o, DIFF_SYMBOL_SUBMODULE_MODIFIED,
+			 path, strlen(path), 0);
+}
+
+void diff_emit_submodule_header(struct diff_options *o, const char *header)
+{
+	emit_diff_symbol(o, DIFF_SYMBOL_SUBMODULE_HEADER,
+			 header, strlen(header), 0);
+}
+
+void diff_emit_submodule_error(struct diff_options *o, const char *err)
+{
+	emit_diff_symbol(o, DIFF_SYMBOL_SUBMODULE_ERROR, err, strlen(err), 0);
+}
+
+void diff_emit_submodule_pipethrough(struct diff_options *o,
+				     const char *line, int len)
+{
+	emit_diff_symbol(o, DIFF_SYMBOL_SUBMODULE_PIPETHROUGH, line, len, 0);
+}
+
 static int new_blank_line_at_eof(struct emit_callback *ecbdata, const char *line, int len)
 {
 	if (!((ecbdata->ws_rule & WS_BLANK_AT_EOF) &&
@@ -2465,24 +2538,16 @@ static void builtin_diff(const char *name_a,
 	if (o->submodule_format == DIFF_SUBMODULE_LOG &&
 	    (!one->mode || S_ISGITLINK(one->mode)) &&
 	    (!two->mode || S_ISGITLINK(two->mode))) {
-		const char *del = diff_get_color_opt(o, DIFF_FILE_OLD);
-		const char *add = diff_get_color_opt(o, DIFF_FILE_NEW);
-		show_submodule_summary(o->file, one->path ? one->path : two->path,
-				line_prefix,
+		show_submodule_summary(o, one->path ? one->path : two->path,
 				&one->oid, &two->oid,
-				two->dirty_submodule,
-				meta, del, add, reset);
+				two->dirty_submodule);
 		return;
 	} else if (o->submodule_format == DIFF_SUBMODULE_INLINE_DIFF &&
 		   (!one->mode || S_ISGITLINK(one->mode)) &&
 		   (!two->mode || S_ISGITLINK(two->mode))) {
-		const char *del = diff_get_color_opt(o, DIFF_FILE_OLD);
-		const char *add = diff_get_color_opt(o, DIFF_FILE_NEW);
-		show_submodule_inline_diff(o->file, one->path ? one->path : two->path,
-				line_prefix,
+		show_submodule_inline_diff(o, one->path ? one->path : two->path,
 				&one->oid, &two->oid,
-				two->dirty_submodule,
-				meta, del, add, reset, o);
+				two->dirty_submodule);
 		return;
 	}
 
diff --git a/diff.h b/diff.h
index 8483ca0991..2ee0ef3908 100644
--- a/diff.h
+++ b/diff.h
@@ -188,6 +188,15 @@ struct diff_options {
 	int diff_path_counter;
 };
 
+void diff_emit_submodule_del(struct diff_options *o, const char *line);
+void diff_emit_submodule_add(struct diff_options *o, const char *line);
+void diff_emit_submodule_untracked(struct diff_options *o, const char *path);
+void diff_emit_submodule_modified(struct diff_options *o, const char *path);
+void diff_emit_submodule_header(struct diff_options *o, const char *header);
+void diff_emit_submodule_error(struct diff_options *o, const char *err);
+void diff_emit_submodule_pipethrough(struct diff_options *o,
+				     const char *line, int len);
+
 enum color_diff {
 	DIFF_RESET = 0,
 	DIFF_CONTEXT = 1,
diff --git a/submodule.c b/submodule.c
index d3299e29c0..71471f3241 100644
--- a/submodule.c
+++ b/submodule.c
@@ -362,9 +362,7 @@ static int prepare_submodule_summary(struct rev_info *rev, const char *path,
 	return prepare_revision_walk(rev);
 }
 
-static void print_submodule_summary(struct rev_info *rev, FILE *f,
-		const char *line_prefix,
-		const char *del, const char *add, const char *reset)
+static void print_submodule_summary(struct rev_info *rev, struct diff_options *o)
 {
 	static const char format[] = "  %m %s";
 	struct strbuf sb = STRBUF_INIT;
@@ -375,18 +373,12 @@ static void print_submodule_summary(struct rev_info *rev, FILE *f,
 		ctx.date_mode = rev->date_mode;
 		ctx.output_encoding = get_log_output_encoding();
 		strbuf_setlen(&sb, 0);
-		strbuf_addstr(&sb, line_prefix);
-		if (commit->object.flags & SYMMETRIC_LEFT) {
-			if (del)
-				strbuf_addstr(&sb, del);
-		}
-		else if (add)
-			strbuf_addstr(&sb, add);
 		format_commit_message(commit, format, &sb, &ctx);
-		if (reset)
-			strbuf_addstr(&sb, reset);
 		strbuf_addch(&sb, '\n');
-		fprintf(f, "%s", sb.buf);
+		if (commit->object.flags & SYMMETRIC_LEFT)
+			diff_emit_submodule_del(o, sb.buf);
+		else
+			diff_emit_submodule_add(o, sb.buf);
 	}
 	strbuf_release(&sb);
 }
@@ -413,11 +405,9 @@ void prepare_submodule_repo_env(struct argv_array *out)
  * attempt to lookup both the left and right commits and put them into the
  * left and right pointers.
  */
-static void show_submodule_header(FILE *f, const char *path,
-		const char *line_prefix,
+static void show_submodule_header(struct diff_options *o, const char *path,
 		struct object_id *one, struct object_id *two,
-		unsigned dirty_submodule, const char *meta,
-		const char *reset,
+		unsigned dirty_submodule,
 		struct commit **left, struct commit **right,
 		struct commit_list **merge_bases)
 {
@@ -426,11 +416,10 @@ static void show_submodule_header(FILE *f, const char *path,
 	int fast_forward = 0, fast_backward = 0;
 
 	if (dirty_submodule & DIRTY_SUBMODULE_UNTRACKED)
-		fprintf(f, "%sSubmodule %s contains untracked content\n",
-			line_prefix, path);
+		diff_emit_submodule_untracked(o, path);
+
 	if (dirty_submodule & DIRTY_SUBMODULE_MODIFIED)
-		fprintf(f, "%sSubmodule %s contains modified content\n",
-			line_prefix, path);
+		diff_emit_submodule_modified(o, path);
 
 	if (is_null_oid(one))
 		message = "(new submodule)";
@@ -472,31 +461,29 @@ static void show_submodule_header(FILE *f, const char *path,
 	}
 
 output_header:
-	strbuf_addf(&sb, "%s%sSubmodule %s ", line_prefix, meta, path);
+	strbuf_addf(&sb, "Submodule %s ", path);
 	strbuf_add_unique_abbrev(&sb, one->hash, DEFAULT_ABBREV);
 	strbuf_addstr(&sb, (fast_backward || fast_forward) ? ".." : "...");
 	strbuf_add_unique_abbrev(&sb, two->hash, DEFAULT_ABBREV);
 	if (message)
-		strbuf_addf(&sb, " %s%s\n", message, reset);
+		strbuf_addf(&sb, " %s\n", message);
 	else
-		strbuf_addf(&sb, "%s:%s\n", fast_backward ? " (rewind)" : "", reset);
-	fwrite(sb.buf, sb.len, 1, f);
+		strbuf_addf(&sb, "%s:\n", fast_backward ? " (rewind)" : "");
+	diff_emit_submodule_header(o, sb.buf);
 
 	strbuf_release(&sb);
 }
 
-void show_submodule_summary(FILE *f, const char *path,
-		const char *line_prefix,
+void show_submodule_summary(struct diff_options *o, const char *path,
 		struct object_id *one, struct object_id *two,
-		unsigned dirty_submodule, const char *meta,
-		const char *del, const char *add, const char *reset)
+		unsigned dirty_submodule)
 {
 	struct rev_info rev;
 	struct commit *left = NULL, *right = NULL;
 	struct commit_list *merge_bases = NULL;
 
-	show_submodule_header(f, path, line_prefix, one, two, dirty_submodule,
-			      meta, reset, &left, &right, &merge_bases);
+	show_submodule_header(o, path, one, two, dirty_submodule,
+			      &left, &right, &merge_bases);
 
 	/*
 	 * If we don't have both a left and a right pointer, there is no
@@ -508,11 +495,11 @@ void show_submodule_summary(FILE *f, const char *path,
 
 	/* Treat revision walker failure the same as missing commits */
 	if (prepare_submodule_summary(&rev, path, left, right, merge_bases)) {
-		fprintf(f, "%s(revision walker failed)\n", line_prefix);
+		diff_emit_submodule_error(o, "(revision walker failed)\n");
 		goto out;
 	}
 
-	print_submodule_summary(&rev, f, line_prefix, del, add, reset);
+	print_submodule_summary(&rev, o);
 
 out:
 	if (merge_bases)
@@ -521,21 +508,18 @@ void show_submodule_summary(FILE *f, const char *path,
 	clear_commit_marks(right, ~0);
 }
 
-void show_submodule_inline_diff(FILE *f, const char *path,
-		const char *line_prefix,
+void show_submodule_inline_diff(struct diff_options *o, const char *path,
 		struct object_id *one, struct object_id *two,
-		unsigned dirty_submodule, const char *meta,
-		const char *del, const char *add, const char *reset,
-		const struct diff_options *o)
+		unsigned dirty_submodule)
 {
 	const struct object_id *old = &empty_tree_oid, *new = &empty_tree_oid;
 	struct commit *left = NULL, *right = NULL;
 	struct commit_list *merge_bases = NULL;
-	struct strbuf submodule_dir = STRBUF_INIT;
 	struct child_process cp = CHILD_PROCESS_INIT;
+	struct strbuf sb = STRBUF_INIT;
 
-	show_submodule_header(f, path, line_prefix, one, two, dirty_submodule,
-			      meta, reset, &left, &right, &merge_bases);
+	show_submodule_header(o, path, one, two, dirty_submodule,
+			      &left, &right, &merge_bases);
 
 	/* We need a valid left and right commit to display a difference */
 	if (!(left || is_null_oid(one)) ||
@@ -547,15 +531,16 @@ void show_submodule_inline_diff(FILE *f, const char *path,
 	if (right)
 		new = two;
 
-	fflush(f);
 	cp.git_cmd = 1;
 	cp.dir = path;
-	cp.out = dup(fileno(f));
+	cp.out = -1;
 	cp.no_stdin = 1;
 
 	/* TODO: other options may need to be passed here. */
 	argv_array_push(&cp.args, "diff");
-	argv_array_pushf(&cp.args, "--line-prefix=%s", line_prefix);
+	argv_array_pushf(&cp.args, "--color=%s", want_color(o->use_color) ?
+			 "always" : "never");
+
 	if (DIFF_OPT_TST(o, REVERSE_DIFF)) {
 		argv_array_pushf(&cp.args, "--src-prefix=%s%s/",
 				 o->b_prefix, path);
@@ -578,11 +563,17 @@ void show_submodule_inline_diff(FILE *f, const char *path,
 		argv_array_push(&cp.args, oid_to_hex(new));
 
 	prepare_submodule_repo_env(&cp.env_array);
-	if (run_command(&cp))
-		fprintf(f, "(diff failed)\n");
+	if (start_command(&cp))
+		diff_emit_submodule_error(o, "(diff failed)\n");
+
+	while (strbuf_getwholeline_fd(&sb, cp.out, '\n') != EOF)
+		diff_emit_submodule_pipethrough(o, sb.buf, sb.len);
+
+	if (finish_command(&cp))
+		diff_emit_submodule_error(o, "(diff failed)\n");
 
 done:
-	strbuf_release(&submodule_dir);
+	strbuf_release(&sb);
 	if (merge_bases)
 		free_commit_list(merge_bases);
 	if (left)
diff --git a/submodule.h b/submodule.h
index 1277480add..e7358863e9 100644
--- a/submodule.h
+++ b/submodule.h
@@ -53,17 +53,12 @@ extern int parse_submodule_update_strategy(const char *value,
 		struct submodule_update_strategy *dst);
 extern const char *submodule_strategy_to_string(const struct submodule_update_strategy *s);
 extern void handle_ignore_submodules_arg(struct diff_options *, const char *);
-extern void show_submodule_summary(FILE *f, const char *path,
-		const char *line_prefix,
+extern void show_submodule_summary(struct diff_options *o, const char *path,
 		struct object_id *one, struct object_id *two,
-		unsigned dirty_submodule, const char *meta,
-		const char *del, const char *add, const char *reset);
-extern void show_submodule_inline_diff(FILE *f, const char *path,
-		const char *line_prefix,
+		unsigned dirty_submodule);
+extern void show_submodule_inline_diff(struct diff_options *o, const char *path,
 		struct object_id *one, struct object_id *two,
-		unsigned dirty_submodule, const char *meta,
-		const char *del, const char *add, const char *reset,
-		const struct diff_options *opt);
+		unsigned dirty_submodule);
 extern void set_config_fetch_recurse_submodules(int value);
 extern void set_config_update_recurse_submodules(int value);
 /* Check if we want to update any submodule.*/
-- 
2.12.2.575.gb14f27f917


^ permalink raw reply	[relevance 16%]

* [PATCH 22/26] diff.c: color moved lines differently
  2017-06-20  2:47 ` [PATCH 00/26] reroll of sb/diff-color-moved Stefan Beller
  2017-06-20  2:48   ` [PATCH 15/26] submodule.c: migrate diff output to use emit_diff_symbol Stefan Beller
@ 2017-06-20  2:48   ` Stefan Beller
  2017-06-23  1:28   ` [PATCHv2 00/25] reroll of sb/diff-color-moved Stefan Beller
  2 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-06-20  2:48 UTC (permalink / raw)
  To: sbeller; +Cc: bmwill, git, gitster, jonathantanmy, jrnieder, mhagger, peff, philipoakley

When a patch consists mostly of moving blocks of code around, it can
be quite tedious to ensure that the blocks are moved verbatim, and not
undesirably modified in the move. To that end, color blocks that are
moved within the same patch differently. For example (OM, del, add,
and NM are different colors):

    [OM]  -void sensitive_stuff(void)
    [OM]  -{
    [OM]  -        if (!is_authorized_user())
    [OM]  -                die("unauthorized");
    [OM]  -        sensitive_stuff(spanning,
    [OM]  -                        multiple,
    [OM]  -                        lines);
    [OM]  -}

           void another_function()
           {
    [del] -        printf("foo");
    [add] +        printf("bar");
           }

    [NM]  +void sensitive_stuff(void)
    [NM]  +{
    [NM]  +        if (!is_authorized_user())
    [NM]  +                die("unauthorized");
    [NM]  +        sensitive_stuff(spanning,
    [NM]  +                        multiple,
    [NM]  +                        lines);
    [NM]  +}

However adjacent blocks may be problematic. For example, in this
potentially malicious patch, the swapping of blocks can be spotted:

    [OM]  -void sensitive_stuff(void)
    [OM]  -{
    [OMA] -        if (!is_authorized_user())
    [OMA] -                die("unauthorized");
    [OM]  -        sensitive_stuff(spanning,
    [OM]  -                        multiple,
    [OM]  -                        lines);
    [OMA] -}

           void another_function()
           {
    [del] -        printf("foo");
    [add] +        printf("bar");
           }

    [NM]  +void sensitive_stuff(void)
    [NM]  +{
    [NMA] +        sensitive_stuff(spanning,
    [NMA] +                        multiple,
    [NMA] +                        lines);
    [NM]  +        if (!is_authorized_user())
    [NM]  +                die("unauthorized");
    [NMA] +}

If the moved code is larger, it is easier to hide some permutation in the
code, which is why some alternative coloring is needed.

This patch implements the first mode:
* basic alternating 'Zebra' mode
  This conveys all information needed to the user.  Defer customization to
  later patches.

First I implemented an alternative design, which would try to fingerprint
a line by its neighbors to detect if we are in a block or at the boundary.
This idea iss error prone as it inspected each line and its neighboring
lines to determine if the line was (a) moved and (b) if was deep inside
a hunk by having matching neighboring lines. This is unreliable as the
we can construct hunks which have equal neighbors that just exceed the
number of lines inspected. (Think of 'AXYZBXYZCXYZD..' with each letter
as a line, that is permutated to AXYZCXYZBXYZD..').

Instead this provides a dynamic programming greedy algorithm that finds
the largest moved hunk and then has several modes on highlighting bounds.

A note on the options '--submodule=diff' and '--color-words/--word-diff':
In the conversion to use emit_line in the prior patches both submodules
as well as word diff output carefully chose to call emit_line with sign=0.
All output with sign=0 is ignored for move detection purposes in this
patch, such that no weird looking output will be generated for these
cases. This leads to another thought: We could pass on '--color-moved' to
submodules such that they color up moved lines for themselves. If we'd do
so only line moves within a repository boundary are marked up.

Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
---
 diff.c                     | 304 ++++++++++++++++++++++++++++++++++++++++++---
 diff.h                     |  10 +-
 t/t4015-diff-whitespace.sh | 196 +++++++++++++++++++++++++++++
 3 files changed, 495 insertions(+), 15 deletions(-)

diff --git a/diff.c b/diff.c
index 35b5924ff2..20c1f9b99f 100644
--- a/diff.c
+++ b/diff.c
@@ -15,6 +15,7 @@
 #include "userdiff.h"
 #include "submodule-config.h"
 #include "submodule.h"
+#include "hashmap.h"
 #include "ll-merge.h"
 #include "string-list.h"
 #include "argv-array.h"
@@ -31,6 +32,7 @@ static int diff_indent_heuristic; /* experimental */
 static int diff_rename_limit_default = 400;
 static int diff_suppress_blank_empty;
 static int diff_use_color_default = -1;
+static int diff_color_moved_default;
 static int diff_context_default = 3;
 static int diff_interhunk_context_default;
 static const char *diff_word_regex_cfg;
@@ -55,6 +57,10 @@ static char diff_colors[][COLOR_MAXLEN] = {
 	GIT_COLOR_YELLOW,	/* COMMIT */
 	GIT_COLOR_BG_RED,	/* WHITESPACE */
 	GIT_COLOR_NORMAL,	/* FUNCINFO */
+	GIT_COLOR_MAGENTA,	/* OLD_MOVED */
+	GIT_COLOR_BLUE,		/* OLD_MOVED ALTERNATIVE */
+	GIT_COLOR_CYAN,		/* NEW_MOVED */
+	GIT_COLOR_YELLOW,	/* NEW_MOVED ALTERNATIVE */
 };
 
 static NORETURN void die_want_option(const char *option_name)
@@ -80,6 +86,14 @@ static int parse_diff_color_slot(const char *var)
 		return DIFF_WHITESPACE;
 	if (!strcasecmp(var, "func"))
 		return DIFF_FUNCINFO;
+	if (!strcasecmp(var, "oldmoved"))
+		return DIFF_FILE_OLD_MOVED;
+	if (!strcasecmp(var, "oldmovedalternative"))
+		return DIFF_FILE_OLD_MOVED_ALT;
+	if (!strcasecmp(var, "newmoved"))
+		return DIFF_FILE_NEW_MOVED;
+	if (!strcasecmp(var, "newmovedalternative"))
+		return DIFF_FILE_NEW_MOVED_ALT;
 	return -1;
 }
 
@@ -228,12 +242,29 @@ int git_diff_heuristic_config(const char *var, const char *value, void *cb)
 	return 0;
 }
 
+static int parse_color_moved(const char *arg)
+{
+	if (!strcmp(arg, "no"))
+		return COLOR_MOVED_NO;
+	else if (!strcmp(arg, "zebra"))
+		return COLOR_MOVED_ZEBRA;
+	else
+		return -1;
+}
+
 int git_diff_ui_config(const char *var, const char *value, void *cb)
 {
 	if (!strcmp(var, "diff.color") || !strcmp(var, "color.diff")) {
 		diff_use_color_default = git_config_colorbool(var, value);
 		return 0;
 	}
+	if (!strcmp(var, "diff.colormoved")) {
+		int cm = parse_color_moved(value);
+		if (cm < 0)
+			return -1;
+		diff_color_moved_default = cm;
+		return 0;
+	}
 	if (!strcmp(var, "diff.context")) {
 		diff_context_default = git_config_int(var, value);
 		if (diff_context_default < 0)
@@ -600,7 +631,9 @@ enum diff_symbol {
  * 13-15 are WSEH_NEW | WSEH_OLD | WSEH_CONTEXT
  * 16 is marking if the line is blank at EOF
  */
-#define DIFF_SYMBOL_CONTENT_BLANK_LINE_EOF (1<<16)
+#define DIFF_SYMBOL_CONTENT_BLANK_LINE_EOF	(1<<16)
+#define DIFF_SYMBOL_MOVED_LINE			(1<<17)
+#define DIFF_SYMBOL_MOVED_LINE_ZEBRA		(1<<18)
 #define DIFF_SYMBOL_CONTENT_WS_MASK (WSEH_NEW | WSEH_OLD | WSEH_CONTEXT | WS_RULE_MASK)
 
 /*
@@ -643,6 +676,213 @@ static void append_emitted_diff_symbol(struct diff_options *o,
 	f->line = e->line ? xmemdupz(e->line, e->len) : NULL;
 }
 
+struct moved_entry {
+	struct hashmap_entry ent;
+	const struct emitted_diff_symbol *es;
+	struct moved_entry *next_line;
+};
+
+static void get_ws_cleaned_string(const struct emitted_diff_symbol *l,
+				  struct strbuf *out)
+{
+	int i;
+	for (i = 0; i < l->len; i++) {
+		if (isspace(l->line[i]))
+			continue;
+		strbuf_addch(out, l->line[i]);
+	}
+}
+
+static int emitted_symbol_cmp_no_ws(const struct emitted_diff_symbol *a,
+				    const struct emitted_diff_symbol *b,
+				    const void *keydata)
+{
+	int ret;
+	struct strbuf sba = STRBUF_INIT;
+	struct strbuf sbb = STRBUF_INIT;
+
+	get_ws_cleaned_string(a, &sba);
+	get_ws_cleaned_string(b, &sbb);
+	ret = sba.len != sbb.len || strncmp(sba.buf, sbb.buf, sba.len);
+
+	strbuf_release(&sba);
+	strbuf_release(&sbb);
+	return ret;
+}
+
+static int emitted_symbol_cmp(const struct emitted_diff_symbol *a,
+			      const struct emitted_diff_symbol *b,
+			      const void *keydata)
+{
+	return a->len != b->len || strncmp(a->line, b->line, a->len);
+}
+
+static int moved_entry_cmp(const struct moved_entry *a,
+			   const struct moved_entry *b,
+			   const void *keydata)
+{
+	return emitted_symbol_cmp(a->es, b->es, keydata);
+}
+
+static int moved_entry_cmp_no_ws(const struct moved_entry *a,
+				 const struct moved_entry *b,
+				 const void *keydata)
+{
+	return emitted_symbol_cmp_no_ws(a->es, b->es, keydata);
+}
+
+static unsigned get_string_hash(struct emitted_diff_symbol *es, unsigned ignore_ws)
+{
+	static struct strbuf sb = STRBUF_INIT;
+
+	if (ignore_ws) {
+		strbuf_reset(&sb);
+		get_ws_cleaned_string(es, &sb);
+		return memhash(sb.buf, sb.len);
+	} else {
+		return memhash(es->line, es->len);
+	}
+}
+
+static struct moved_entry *prepare_entry(struct diff_options *o,
+					 int line_no)
+{
+	struct moved_entry *ret = xmalloc(sizeof(*ret));
+	unsigned ignore_ws = DIFF_XDL_TST(o, IGNORE_WHITESPACE);
+	struct emitted_diff_symbol *l = &o->emitted_symbols->buf[line_no];
+
+	ret->ent.hash = get_string_hash(l, ignore_ws);
+	ret->es = l;
+	ret->next_line = NULL;
+
+	return ret;
+}
+
+static void add_lines_to_move_detection(struct diff_options *o,
+					struct hashmap *add_lines,
+					struct hashmap *del_lines)
+{
+	struct moved_entry *prev_line = NULL;
+
+	int n;
+	for (n = 0; n < o->emitted_symbols->nr; n++) {
+		struct hashmap *hm;
+		struct moved_entry *key;
+
+		switch (o->emitted_symbols->buf[n].s) {
+		case DIFF_SYMBOL_PLUS:
+			hm = add_lines;
+			break;
+		case DIFF_SYMBOL_MINUS:
+			hm = del_lines;
+			break;
+		default:
+			prev_line = NULL;
+			continue;
+		}
+
+		key = prepare_entry(o, n);
+		if (prev_line && prev_line->es->s == o->emitted_symbols->buf[n].s)
+			prev_line->next_line = key;
+
+		hashmap_add(hm, key);
+		prev_line = key;
+	}
+}
+
+/* Find blocks of moved code, delegate actual coloring decision to helper */
+static void mark_color_as_moved(struct diff_options *o,
+				struct hashmap *add_lines,
+				struct hashmap *del_lines)
+{
+	struct moved_entry **pmb = NULL; /* potentially moved blocks */
+	int pmb_nr = 0, pmb_alloc = 0;
+	int n, flipped_block = 1;
+
+	for (n = 0; n < o->emitted_symbols->nr; n++) {
+		struct hashmap *hm = NULL;
+		struct moved_entry *key;
+		struct moved_entry *match = NULL;
+		struct emitted_diff_symbol *l = &o->emitted_symbols->buf[n];
+		int i, lp, rp;
+
+		switch (l->s) {
+		case DIFF_SYMBOL_PLUS:
+			hm = del_lines;
+			key = prepare_entry(o, n);
+			match = hashmap_get(hm, key, o);
+			free(key);
+			break;
+		case DIFF_SYMBOL_MINUS:
+			hm = add_lines;
+			key = prepare_entry(o, n);
+			match = hashmap_get(hm, key, o);
+			free(key);
+			break;
+		default:
+			flipped_block = 1;
+		}
+
+		if (!match) {
+			pmb_nr = 0;
+			continue;
+		}
+
+		l->flags |= DIFF_SYMBOL_MOVED_LINE;
+
+		/* Check any potential block runs, advance each or nullify */
+		for (i = 0; i < pmb_nr; i++) {
+			struct moved_entry *p = pmb[i];
+			struct moved_entry *pnext = (p && p->next_line) ?
+					p->next_line : NULL;
+			if (pnext &&
+			    !emitted_symbol_cmp(pnext->es, l, o)) {
+				pmb[i] = p->next_line;
+			} else {
+				pmb[i] = NULL;
+			}
+		}
+
+		/* Shrink the set of potential block to the remaining running */
+		for (lp = 0, rp = pmb_nr - 1; lp <= rp;) {
+			while (lp < pmb_nr && pmb[lp])
+				lp++;
+			/* lp points at the first NULL now */
+
+			while (rp > -1 && !pmb[rp])
+				rp--;
+			/* rp points at the last non-NULL */
+
+			if (lp < pmb_nr && rp > -1 && lp < rp) {
+				pmb[lp] = pmb[rp];
+				pmb[rp] = NULL;
+				rp--;
+				lp++;
+			}
+		}
+
+		/* Remember the number of running sets */
+		pmb_nr = rp + 1;
+
+		if (pmb_nr == 0) {
+			/*
+			 * The current line is the start of a new block.
+			 * Setup the set of potential blocks.
+			 */
+			for (; match; match = hashmap_get_next(hm, match)) {
+				ALLOC_GROW(pmb, pmb_nr + 1, pmb_alloc);
+				pmb[pmb_nr++] = match;
+			}
+
+			flipped_block = (flipped_block + 1) % 2;
+		}
+
+		if (flipped_block)
+			l->flags |= DIFF_SYMBOL_MOVED_LINE_ZEBRA;
+	}
+
+	free(pmb);
+}
 
 static void emit_line_ws_markup(struct diff_options *o,
 				const char *set, const char *reset,
@@ -715,14 +955,24 @@ static void emit_diff_symbol_from_struct(struct diff_options *o,
 		emit_line(o, context, reset, line, len);
 		break;
 	case DIFF_SYMBOL_PLUS:
-		set = diff_get_color_opt(o, DIFF_FILE_NEW);
+		if (flags & DIFF_SYMBOL_MOVED_LINE_ZEBRA)
+			set = diff_get_color_opt(o, DIFF_FILE_NEW_MOVED_ALT);
+		else if (flags & DIFF_SYMBOL_MOVED_LINE)
+			set = diff_get_color_opt(o, DIFF_FILE_NEW_MOVED);
+		else
+			set = diff_get_color_opt(o, DIFF_FILE_NEW);
 		reset = diff_get_color_opt(o, DIFF_RESET);
 		emit_line_ws_markup(o, set, reset, line, len, '+',
 				    flags & DIFF_SYMBOL_CONTENT_WS_MASK,
 				    flags & DIFF_SYMBOL_CONTENT_BLANK_LINE_EOF);
 		break;
 	case DIFF_SYMBOL_MINUS:
-		set = diff_get_color_opt(o, DIFF_FILE_OLD);
+		if (flags & DIFF_SYMBOL_MOVED_LINE_ZEBRA)
+			set = diff_get_color_opt(o, DIFF_FILE_OLD_MOVED_ALT);
+		else if (flags & DIFF_SYMBOL_MOVED_LINE)
+			set = diff_get_color_opt(o, DIFF_FILE_OLD_MOVED);
+		else
+			set = diff_get_color_opt(o, DIFF_FILE_OLD);
 		reset = diff_get_color_opt(o, DIFF_RESET);
 		emit_line_ws_markup(o, set, reset, line, len, '-',
 				    flags & DIFF_SYMBOL_CONTENT_WS_MASK, 0);
@@ -3750,6 +4000,8 @@ void diff_setup(struct diff_options *options)
 		options->a_prefix = "a/";
 		options->b_prefix = "b/";
 	}
+
+	options->color_moved = diff_color_moved_default;
 }
 
 void diff_setup_done(struct diff_options *options)
@@ -3859,6 +4111,9 @@ void diff_setup_done(struct diff_options *options)
 
 	if (DIFF_OPT_TST(options, FOLLOW_RENAMES) && options->pathspec.nr != 1)
 		die(_("--follow requires exactly one pathspec"));
+
+	if (!options->use_color || external_diff())
+		options->color_moved = 0;
 }
 
 static int opt_arg(const char *arg, int arg_short, const char *arg_long, int *val)
@@ -4283,7 +4538,16 @@ int diff_opt_parse(struct diff_options *options,
 	}
 	else if (!strcmp(arg, "--no-color"))
 		options->use_color = 0;
-	else if (!strcmp(arg, "--color-words")) {
+	else if (!strcmp(arg, "--color-moved"))
+		options->color_moved = COLOR_MOVED_ZEBRA;
+	else if (!strcmp(arg, "--no-color-moved"))
+		options->color_moved = COLOR_MOVED_NO;
+	else if (skip_prefix(arg, "--color-moved=", &arg)) {
+		int cm = parse_color_moved(arg);
+		if (cm < 0)
+			die("bad --color-moved argument: %s", arg);
+		options->color_moved = cm;
+	} else if (!strcmp(arg, "--color-words")) {
 		options->use_color = 1;
 		options->word_diff = DIFF_WORDS_COLOR;
 	}
@@ -5094,16 +5358,9 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o)
 	int i;
 	static struct emitted_diff_symbols esm = EMITTED_DIFF_SYMBOLS_INIT;
 	struct diff_queue_struct *q = &diff_queued_diff;
-	/*
-	 * For testing purposes we want to make sure the diff machinery
-	 * works completely with the buffer. If there is anything emitted
-	 * outside the emit_string, then the order is screwed
-	 * up and the tests will fail.
-	 *
-	 * TODO (later in this series):
-	 * We'll unset this pointer in a later patch.
-	 */
-	o->emitted_symbols = &esm;
+
+	if (o->color_moved)
+		o->emitted_symbols = &esm;
 
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
@@ -5112,6 +5369,24 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o)
 	}
 
 	if (o->emitted_symbols) {
+		if (o->color_moved) {
+			struct hashmap add_lines, del_lines;
+			unsigned ignore_ws = DIFF_XDL_TST(o, IGNORE_WHITESPACE);
+
+			hashmap_init(&del_lines, ignore_ws ?
+				(hashmap_cmp_fn)moved_entry_cmp_no_ws :
+				(hashmap_cmp_fn)moved_entry_cmp, 0);
+			hashmap_init(&add_lines, ignore_ws ?
+				(hashmap_cmp_fn)moved_entry_cmp_no_ws :
+				(hashmap_cmp_fn)moved_entry_cmp, 0);
+
+			add_lines_to_move_detection(o, &add_lines, &del_lines);
+			mark_color_as_moved(o, &add_lines, &del_lines);
+
+			hashmap_free(&add_lines, 0);
+			hashmap_free(&del_lines, 0);
+		}
+
 		for (i = 0; i < esm.nr; i++)
 			emit_diff_symbol_from_struct(o, &esm.buf[i]);
 
@@ -5195,6 +5470,7 @@ void diff_flush(struct diff_options *options)
 		if (!options->file)
 			die_errno("Could not open /dev/null");
 		options->close_file = 1;
+		options->color_moved = 0;
 		for (i = 0; i < q->nr; i++) {
 			struct diff_filepair *p = q->queue[i];
 			if (check_pair_status(p))
diff --git a/diff.h b/diff.h
index 65fc9dbb4b..7726ad255c 100644
--- a/diff.h
+++ b/diff.h
@@ -188,6 +188,10 @@ struct diff_options {
 	int diff_path_counter;
 
 	struct emitted_diff_symbols *emitted_symbols;
+	enum {
+		COLOR_MOVED_NO = 0,
+		COLOR_MOVED_ZEBRA = 2,
+	} color_moved;
 };
 
 void diff_emit_submodule_del(struct diff_options *o, const char *line);
@@ -208,7 +212,11 @@ enum color_diff {
 	DIFF_FILE_NEW = 5,
 	DIFF_COMMIT = 6,
 	DIFF_WHITESPACE = 7,
-	DIFF_FUNCINFO = 8
+	DIFF_FUNCINFO = 8,
+	DIFF_FILE_OLD_MOVED = 9,
+	DIFF_FILE_OLD_MOVED_ALT = 10,
+	DIFF_FILE_NEW_MOVED = 11,
+	DIFF_FILE_NEW_MOVED_ALT = 12
 };
 const char *diff_get_color(int diff_use_color, enum color_diff ix);
 #define diff_get_color_opt(o, ix) \
diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index 289806d0c7..4a03766f1f 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -972,4 +972,200 @@ test_expect_success 'option overrides diff.wsErrorHighlight' '
 
 '
 
+test_expect_success 'detect moved code, complete file' '
+	git reset --hard &&
+	cat <<-\EOF >test.c &&
+	#include<stdio.h>
+	main()
+	{
+	printf("Hello World");
+	}
+	EOF
+	git add test.c &&
+	git commit -m "add main function" &&
+	git mv test.c main.c &&
+	test_config color.diff.oldMoved "normal red" &&
+	test_config color.diff.newMoved "normal green" &&
+	git diff HEAD --color-moved --no-renames | test_decode_color >actual &&
+	cat >expected <<-\EOF &&
+	<BOLD>diff --git a/main.c b/main.c<RESET>
+	<BOLD>new file mode 100644<RESET>
+	<BOLD>index 0000000..a986c57<RESET>
+	<BOLD>--- /dev/null<RESET>
+	<BOLD>+++ b/main.c<RESET>
+	<CYAN>@@ -0,0 +1,5 @@<RESET>
+	<BGREEN>+<RESET><BGREEN>#include<stdio.h><RESET>
+	<BGREEN>+<RESET><BGREEN>main()<RESET>
+	<BGREEN>+<RESET><BGREEN>{<RESET>
+	<BGREEN>+<RESET><BGREEN>printf("Hello World");<RESET>
+	<BGREEN>+<RESET><BGREEN>}<RESET>
+	<BOLD>diff --git a/test.c b/test.c<RESET>
+	<BOLD>deleted file mode 100644<RESET>
+	<BOLD>index a986c57..0000000<RESET>
+	<BOLD>--- a/test.c<RESET>
+	<BOLD>+++ /dev/null<RESET>
+	<CYAN>@@ -1,5 +0,0 @@<RESET>
+	<BRED>-#include<stdio.h><RESET>
+	<BRED>-main()<RESET>
+	<BRED>-{<RESET>
+	<BRED>-printf("Hello World");<RESET>
+	<BRED>-}<RESET>
+	EOF
+
+	test_cmp expected actual
+'
+
+test_expect_success 'detect malicious moved code, inside file' '
+	test_config color.diff.oldMoved "normal red" &&
+	test_config color.diff.newMoved "normal green" &&
+	test_config color.diff.oldMovedAlternative "blue" &&
+	test_config color.diff.newMovedAlternative "yellow" &&
+	git reset --hard &&
+	cat <<-\EOF >main.c &&
+		#include<stdio.h>
+		int stuff()
+		{
+			printf("Hello ");
+			printf("World\n");
+		}
+
+		int secure_foo(struct user *u)
+		{
+			if (!u->is_allowed_foo)
+				return;
+			foo(u);
+		}
+
+		int main()
+		{
+			foo();
+		}
+	EOF
+	cat <<-\EOF >test.c &&
+		#include<stdio.h>
+		int bar()
+		{
+			printf("Hello World, but different\n");
+		}
+
+		int another_function()
+		{
+			bar();
+		}
+	EOF
+	git add main.c test.c &&
+	git commit -m "add main and test file" &&
+	cat <<-\EOF >main.c &&
+		#include<stdio.h>
+		int stuff()
+		{
+			printf("Hello ");
+			printf("World\n");
+		}
+
+		int main()
+		{
+			foo();
+		}
+	EOF
+	cat <<-\EOF >test.c &&
+		#include<stdio.h>
+		int bar()
+		{
+			printf("Hello World, but different\n");
+		}
+
+		int secure_foo(struct user *u)
+		{
+			foo(u);
+			if (!u->is_allowed_foo)
+				return;
+		}
+
+		int another_function()
+		{
+			bar();
+		}
+	EOF
+	git diff HEAD --no-renames --color-moved=zebra| test_decode_color >actual &&
+	cat <<-\EOF >expected &&
+	<BOLD>diff --git a/main.c b/main.c<RESET>
+	<BOLD>index 27a619c..7cf9336 100644<RESET>
+	<BOLD>--- a/main.c<RESET>
+	<BOLD>+++ b/main.c<RESET>
+	<CYAN>@@ -5,13 +5,6 @@<RESET> <RESET>printf("Hello ");<RESET>
+	 printf("World\n");<RESET>
+	 }<RESET>
+	 <RESET>
+	<BRED>-int secure_foo(struct user *u)<RESET>
+	<BRED>-{<RESET>
+	<BLUE>-if (!u->is_allowed_foo)<RESET>
+	<BLUE>-return;<RESET>
+	<BRED>-foo(u);<RESET>
+	<BLUE>-}<RESET>
+	<BLUE>-<RESET>
+	 int main()<RESET>
+	 {<RESET>
+	 foo();<RESET>
+	<BOLD>diff --git a/test.c b/test.c<RESET>
+	<BOLD>index 1dc1d85..2bedec9 100644<RESET>
+	<BOLD>--- a/test.c<RESET>
+	<BOLD>+++ b/test.c<RESET>
+	<CYAN>@@ -4,6 +4,13 @@<RESET> <RESET>int bar()<RESET>
+	 printf("Hello World, but different\n");<RESET>
+	 }<RESET>
+	 <RESET>
+	<BGREEN>+<RESET><BGREEN>int secure_foo(struct user *u)<RESET>
+	<BGREEN>+<RESET><BGREEN>{<RESET>
+	<YELLOW>+<RESET><YELLOW>foo(u);<RESET>
+	<BGREEN>+<RESET><BGREEN>if (!u->is_allowed_foo)<RESET>
+	<BGREEN>+<RESET><BGREEN>return;<RESET>
+	<YELLOW>+<RESET><YELLOW>}<RESET>
+	<YELLOW>+<RESET>
+	 int another_function()<RESET>
+	 {<RESET>
+	 bar();<RESET>
+	EOF
+
+	test_cmp expected actual
+'
+
+test_expect_success 'no effect from --color-moved with --word-diff' '
+	cat <<-\EOF >text.txt &&
+	Lorem Ipsum is simply dummy text of the printing and typesetting industry.
+	EOF
+	git add text.txt &&
+	git commit -a -m "clean state" &&
+	cat <<-\EOF >text.txt &&
+	simply Lorem Ipsum dummy is text of the typesetting and printing industry.
+	EOF
+	git diff --color-moved --word-diff >actual &&
+	git diff --word-diff >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'move detection with submodules' '
+	test_create_repo bananas &&
+	echo ripe >bananas/recipe &&
+	git -C bananas add recipe &&
+	test_commit fruit &&
+	test_commit -C bananas recipe &&
+	git submodule add ./bananas &&
+	git add bananas &&
+	git commit -a -m "bananas are like a heavy library?" &&
+	echo foul >bananas/recipe &&
+	echo ripe >fruit.t &&
+
+	git diff --submodule=diff --color-moved >actual &&
+
+	# no move detection as the moved line is across repository boundaries.
+	test_decode_color <actual >decoded_actual &&
+	! grep BGREEN decoded_actual &&
+	! grep BRED decoded_actual &&
+
+	# nor did we mess with it another way
+	git diff --submodule=diff | test_decode_color >expect &&
+	test_cmp expect decoded_actual
+'
+
 test_done
-- 
2.12.2.575.gb14f27f917


^ permalink raw reply	[relevance 10%]

* [PATCH/RFC] Cleanup Documentation
      [irrelevant] <CAGZ79kbMhQpxUa5TXK=WCzzKUCZ5vx3oC+fFTTozpgQihsUjTA@mail.gmail.com>
@ 2017-06-20  3:12 ` Kaartic Sivaraam
  2017-06-20 16:57   ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Kaartic Sivaraam @ 2017-06-20  3:12 UTC (permalink / raw)
  To: sbeller; +Cc: gitster, git, Kaartic Sivaraam

Make following changes to the git-submodule
documentation:

* Remove redundancy
* Remove unclear back reference
* Use more appropriate word
* Quote important word

Suggestions-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Kaartic Sivaraam <kaarticsivaraam91196@gmail.com>
---
 Currently used the word "canonical" instead of "humanish". If that word
 sounds more suitable then this is a [PATCH] and not a [PATCH/RFC].


 Documentation/git-submodule.txt | 37 +++++++++++++++----------------------
 1 file changed, 15 insertions(+), 22 deletions(-)

diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index 74bc6200d..045fef417 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -63,14 +63,6 @@ add [-b <branch>] [-f|--force] [--name <name>] [--reference <repository>] [--dep
 	to the changeset to be committed next to the current
 	project: the current project is termed the "superproject".
 +
-This requires at least one argument: <repository>. The optional
-argument <path> is the relative location for the cloned submodule
-to exist in the superproject. If <path> is not given, the
-"humanish" part of the source repository is used ("repo" for
-"/path/to/repo.git" and "foo" for "host.xz:foo/.git").
-The <path> is also used as the submodule's logical name in its
-configuration entries unless `--name` is used to specify a logical name.
-+
 <repository> is the URL of the new submodule's origin repository.
 This may be either an absolute URL, or (if it begins with ./
 or ../), the location relative to the superproject's default remote
@@ -87,21 +79,22 @@ If the superproject doesn't have a default remote configured
 the superproject is its own authoritative upstream and the current
 working directory is used instead.
 +
-<path> is the relative location for the cloned submodule to
-exist in the superproject. If <path> does not exist, then the
-submodule is created by cloning from the named URL. If <path> does
-exist and is already a valid Git repository, then this is added
-to the changeset without cloning. This second form is provided
-to ease creating a new submodule from scratch, and presumes
-the user will later push the submodule to the given URL.
+The optional argument <path> is the relative location for the cloned
+submodule to exist in the superproject. If <path> is not given, the
+canonical part of the source repository is used ("repo" for
+"/path/to/repo.git" and "foo" for "host.xz:foo/.git"). If <path>
+exists and is already a valid Git repository, then this is added
+to the changeset without cloning. The <path> is also used as the
+submodule's logical name in its configuration entries unless `--name`
+is used to specify a logical name.
 +
-In either case, the given URL is recorded into .gitmodules for
-use by subsequent users cloning the superproject. If the URL is
-given relative to the superproject's repository, the presumption
-is the superproject and submodule repositories will be kept
-together in the same relative location, and only the
-superproject's URL needs to be provided: git-submodule will correctly
-locate the submodule using the relative URL in .gitmodules.
+The given URL is recorded into `.gitmodules` for use by subsequent users
+cloning the superproject. If the URL is given relative to the
+superproject's repository, the presumption is the superproject and
+submodule repositories will be kept together in the same relative
+location, and only the superproject's URL needs to be provided.
+git-submodule will correctly locate the submodule using the relative
+URL in .gitmodules.
 
 status [--cached] [--recursive] [--] [<path>...]::
 	Show the status of the submodules. This will print the SHA-1 of the
-- 
2.11.0


^ permalink raw reply	[relevance 15%]

* Re: in case you want a use-case with lots of submodules
  2017-06-19 20:20   ` Yaroslav Halchenko
@ 2017-06-20  5:43     ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-06-20  5:43 UTC (permalink / raw)
  To: Yaroslav Halchenko; +Cc: Prathamesh Chavan, git

On Mon, Jun 19, 2017 at 1:20 PM, Yaroslav Halchenko <yoh@onerussian.com> wrote:
>
> On Mon, 19 Jun 2017, Stefan Beller wrote:
>
>> On Mon, Jun 19, 2017 at 8:59 AM, Yaroslav Halchenko <yoh@onerussian.com> wrote:
>> > Hi All,
>
>> > On a recent trip I've listened to the git minutes podcast episode and
>> > got excited to hear  Stefan Beller (CCed just in case) describing
>> > ongoing work on submodules mechanism.  I got excited, since e.g.
>> > performance improvements would be of great benefit to us too.
>
>> If you're mostly interested in performance improvements of the status
>> quo (i.e. "make git-submodule fast"), then the work of Prathamesh
>> Chavan (cc'd) might be more interesting to you than what I do.
>> He is porting git-submodule (which is mostly a shell script nowadays)
>> to C, such that we can save a lot of process invocations and can do
>> processing within one process.
>
> ah -- cool.  I would be eager to test it out, thanks!  would be
> interesting to see if it positively affects our overall performance.
> Pointers to that development would be welcome!

The latest from today:
https://public-inbox.org/git/CAME+mvUQJFneV7b1G7zmAidP-5L=nimvY43V0ug-Gtesr83tzg@mail.gmail.com/


>
>> > http://datasets.datalad.org ATM provides quite a sizeable (ATM 370
>> > repositories, up to 4 levels deep) hierarchy of git/git-annex
>> > repositories all tied together via git submodules mechanism.  And as the
>> > collection grows, interactions with it become slower, so additional
>> > options (such as --ignore-submodules=dirty  to status) become our
>> > friends.
>
>> I am not as much concerned about the 370 number than about the
>> 4 layers of nesting. In my experience the nested submodule case
>> is a little bit error prone and the bug reports are not as frequent as
>> there are not as many users of nesting, yet(?)
>
> well -- part of the story here is that we are forced to use/have full
> blown .git/ directories (for git-annex symlinks to content files to
> work) within submodules instead of .git file with a reference under
> parent's .git/modules.   So we can 'slice' at any level and I
> guess that is why may be avoiding some possibly issues due to nesting
> and the "parent has all .git/modules" approach.

That sounds like you either want to configure to have the submodules
git dirs in-place or you want to convince git-annex to learn about the
gitdir pointer files.

>
>> In a neighboring thread on the mailing list we have a discussion
>> on the usefulness of being on branches than in detached HEAD
>> in the submodules.
>> https://public-inbox.org/git/0092CDD27C5F9D418B0F3E9B5D05BE08010287DF@SBS2011.opfingen.plc2.de/
>
>> This would not break non-ambiguously, rather it would add
>> ease of use.
>
> that is indeed a common caveat... I am not sure if any heuristic
> approach would provide a 'bullet proof' solution.  I might even prefer a
> hardcoded 'branch-name' to be listed/associated with each submodule
> within .gitmodules.

hardcoded as submodule.NAME.branch, maybe?
https://git-scm.com/docs/gitmodules

>  In the datalad case, detached HEAD is common

So you are accustomed to detached HEADs and would not
gain much from being back on a branch?  That's cool, too.


> whenever someone installs "outdated" (branch of which progressed
> forward) submodule.  In this case we just check if the branch after "git
> clone"  (but before git submodule update) includes the pointed by
> Subproject commit, and if so -- we announce that it must be the branch
> (so far it is always "master" branch anyways ;) )

heh, having just one branch. That is retro-style. :)

^ permalink raw reply	[relevance 24%]

* Re: [PATCH/RFC] Cleanup Documentation
  2017-06-20  3:12 ` Kaartic Sivaraam
@ 2017-06-20 16:57   ` Stefan Beller
  2017-06-20 17:05     ` Stefan Beller
  2017-06-20 18:27     ` Kaartic Sivaraam
  0 siblings, 2 replies; 200+ results
From: Stefan Beller @ 2017-06-20 16:57 UTC (permalink / raw)
  To: Kaartic Sivaraam; +Cc: Junio C Hamano, git

On Mon, Jun 19, 2017 at 8:12 PM, Kaartic Sivaraam
<kaarticsivaraam91196@gmail.com> wrote:
> Make following changes to the git-submodule
> documentation:
>
> * Remove redundancy
> * Remove unclear back reference
> * Use more appropriate word
> * Quote important word
>
> Suggestions-by: Stefan Beller <sbeller@google.com>
> Signed-off-by: Kaartic Sivaraam <kaarticsivaraam91196@gmail.com>
> ---
>  Currently used the word "canonical" instead of "humanish". If that word
>  sounds more suitable then this is a [PATCH] and not a [PATCH/RFC].

canonical: "according to recognized rules or scientific laws."
sounds about right. :)

>
>  Documentation/git-submodule.txt | 37 +++++++++++++++----------------------
>  1 file changed, 15 insertions(+), 22 deletions(-)
>
> diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
> index 74bc6200d..045fef417 100644
> --- a/Documentation/git-submodule.txt
> +++ b/Documentation/git-submodule.txt
> @@ -63,14 +63,6 @@ add [-b <branch>] [-f|--force] [--name <name>] [--reference <repository>] [--dep
>         to the changeset to be committed next to the current
>         project: the current project is termed the "superproject".
>  +
> -This requires at least one argument: <repository>. The optional
> -argument <path> is the relative location for the cloned submodule
> -to exist in the superproject. If <path> is not given, the
> -"humanish" part of the source repository is used ("repo" for
> -"/path/to/repo.git" and "foo" for "host.xz:foo/.git").
> -The <path> is also used as the submodule's logical name in its
> -configuration entries unless `--name` is used to specify a logical name.
> -+
>  <repository> is the URL of the new submodule's origin repository.
>  This may be either an absolute URL, or (if it begins with ./
>  or ../), the location relative to the superproject's default remote
> @@ -87,21 +79,22 @@ If the superproject doesn't have a default remote configured
>  the superproject is its own authoritative upstream and the current
>  working directory is used instead.
>  +
> -<path> is the relative location for the cloned submodule to
> -exist in the superproject. If <path> does not exist, then the
> -submodule is created by cloning from the named URL. If <path> does
> -exist and is already a valid Git repository, then this is added
> -to the changeset without cloning. This second form is provided
> -to ease creating a new submodule from scratch, and presumes
> -the user will later push the submodule to the given URL.
> +The optional argument <path> is the relative location for the cloned
> +submodule to exist in the superproject. If <path> is not given, the
> +canonical part of the source repository is used ("repo" for
> +"/path/to/repo.git" and "foo" for "host.xz:foo/.git"). If <path>
> +exists and is already a valid Git repository, then this is added
> +to the changeset without cloning.

While this was just reflowed and not newly introduced, I am still left
wondering what a changeset is in Git terms. Our Documentation/glossary
says:

  [[def_changeset]]changeset::
  BitKeeper/cvsps speak for "<<def_commit,commit>>". Since Git does not
  store changes, but states, it really does not make sense to use the term
  "changesets" with Git.

Maybe we should say instead:

    If <path>exists and is already a valid Git repository,
    then this is staged for commit without cloning.




> The <path> is also used as the
> +submodule's logical name in its configuration entries unless `--name`
> +is used to specify a logical name.
>  +
> -In either case, the given URL is recorded into .gitmodules for
> -use by subsequent users cloning the superproject. If the URL is
> -given relative to the superproject's repository, the presumption
> -is the superproject and submodule repositories will be kept
> -together in the same relative location, and only the
> -superproject's URL needs to be provided: git-submodule will correctly
> -locate the submodule using the relative URL in .gitmodules.
> +The given URL is recorded into `.gitmodules` for use by subsequent users
> +cloning the superproject. If the URL is given relative to the
> +superproject's repository, the presumption is the superproject and
> +submodule repositories will be kept together in the same relative
> +location, and only the superproject's URL needs to be provided.
> +git-submodule will correctly locate the submodule using the relative
> +URL in .gitmodules.
>

With or without this nit addressed, this patch looks good to me,

Thanks,
Stefan

^ permalink raw reply	[relevance 9%]

* Re: [PATCH/RFC] Cleanup Documentation
  2017-06-20 16:57   ` Stefan Beller
@ 2017-06-20 17:05     ` Stefan Beller
  2017-06-21  3:02       ` [PATCH/FINALRFC] Documentation/git-submodule: cleanup Kaartic Sivaraam
  2017-06-20 18:27     ` Kaartic Sivaraam
  1 sibling, 1 reply; 200+ results
From: Stefan Beller @ 2017-06-20 17:05 UTC (permalink / raw)
  To: Kaartic Sivaraam; +Cc: Junio C Hamano, git

On Tue, Jun 20, 2017 at 9:57 AM, Stefan Beller <sbeller@google.com> wrote:
>
> With or without this nit addressed, this patch looks good to me,
>

Well actually not quite. The subject (and commit message) is very vague,
maybe:

    Documentation/git-submodule: cleanup "add" section

    The "add" section for 'git-submodule' is redundant in its description and
    the short synopsis line. Remove the redundant mentioning of the
    'repository' argument being mandatory.

    The text is hard to read because of back-references, so remove those.

    Replace the word "humanish" by "canonical" as that conveys better what
    we do to guess the path.

    While at it, quote all occurrences of '.gitmodules' as that is an important
    file in the submodule context, also link to it on its first mention.
    (This paragraph is not exactly what happens in the commit, but I wrote it
    as a way how I would write commit messages. It shows the reader how
    you addressed the given problem, with the quantifiers "all" "the
first" showing
    what you think is important, and that you deliberately