git@vger.kernel.org mailing list mirror (one of many)
 help / Atom feed
Search results ordered by [date|relevance]  view[summary|nested|Atom feed]
						download: 
* Re: [PATCH] git: add --no-optional-locks option
  2017-09-22  4:25   ` Jeff King
@ 2017-09-22 20:09     ` Stefan Beller
  2017-09-22 21:25       ` Jeff King
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-09-22 20:09 UTC (permalink / raw)
  To: Jeff King; +Cc: Johannes Sixt, Johannes Schindelin, git

On Thu, Sep 21, 2017 at 9:25 PM, Jeff King <peff@peff.net> wrote:

>
> But imagine that "git status" learns to recurse into submodules and run
> "git status" inside them. Surely we would want the submodule repos to
> also avoid taking any unnecessary locks?

You can teach Git to recurse into submodules already at home,
just 'git config status.submoduleSummary none'. ;)

It occurs to me that the config name is badly choosen, as it stores
an argument for git status --ignore-submodules[=mode]

^ permalink raw reply	[relevance 17%]

* Re: [PATCH] git: add --no-optional-locks option
  2017-09-22 20:09     ` Stefan Beller
@ 2017-09-22 21:25       ` Jeff King
  2017-09-22 21:41         ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Jeff King @ 2017-09-22 21:25 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Johannes Sixt, Johannes Schindelin, git

On Fri, Sep 22, 2017 at 01:09:32PM -0700, Stefan Beller wrote:

> On Thu, Sep 21, 2017 at 9:25 PM, Jeff King <peff@peff.net> wrote:
> 
> >
> > But imagine that "git status" learns to recurse into submodules and run
> > "git status" inside them. Surely we would want the submodule repos to
> > also avoid taking any unnecessary locks?
> 
> You can teach Git to recurse into submodules already at home,
> just 'git config status.submoduleSummary none'. ;)
> 
> It occurs to me that the config name is badly choosen, as it stores
> an argument for git status --ignore-submodules[=mode]

Ah, thanks. I _thought_ we could already do that but when I went looking
for the standard --recursive option I couldn't find it.

So yes, I would think we would want this option to apply recursively in
that case, even when we cross repository boundaries.

-Peff

^ permalink raw reply	[relevance 8%]

* [PATCH] Documentation/config: clarify the meaning of submodule.<name>.update
@ 2017-09-22 21:28 Stefan Beller
  2017-09-22 21:37 ` Jonathan Nieder
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-09-22 21:28 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

With more commands (that potentially change a submodule) paying attention
to submodules as well as the recent discussion[1] on submodule.<name>.update,
let's spell out that submodule.<name>.update is strictly to be used
for configuring the "submodule update" command and not to be obeyed
by other commands.

These other commands usually have a strict meaning of what they should
do (i.e. checkout, reset, rebase, merge) as well as have their name
overlapping with the modes possible for submodule.<name>.update.

[1] https://public-inbox.org/git/4283F0B0-BC1C-4ED1-8126-7E512D84484B@gmail.com/

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/config.txt | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index dc4e3f58a2..b0ded777fe 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -3085,10 +3085,9 @@ submodule.<name>.url::
 	See linkgit:git-submodule[1] and linkgit:gitmodules[5] for details.
 
 submodule.<name>.update::
-	The default update procedure for a submodule. This variable
-	is populated by `git submodule init` from the
-	linkgit:gitmodules[5] file. See description of 'update'
-	command in linkgit:git-submodule[1].
+	The method how a submodule is updated via 'git submodule update'.
+	It is populated by `git submodule init` from the linkgit:gitmodules[5]
+	file. See description of 'update' command in linkgit:git-submodule[1].
 
 submodule.<name>.branch::
 	The remote branch name for a submodule, used by `git submodule
-- 
2.14.0.rc0.3.g6c2e499285


^ permalink raw reply	[relevance 32%]

* Re: [PATCH] Documentation/config: clarify the meaning of submodule.<name>.update
  2017-09-22 21:28 [PATCH] Documentation/config: clarify the meaning of submodule.<name>.update Stefan Beller
@ 2017-09-22 21:37 ` Jonathan Nieder
  2017-09-22 22:52   ` [PATCHv2] " Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Jonathan Nieder @ 2017-09-22 21:37 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

Hi,

Stefan Beller wrote:

> With more commands (that potentially change a submodule) paying attention
> to submodules as well as the recent discussion[1] on submodule.<name>.update,
> let's spell out that submodule.<name>.update is strictly to be used
> for configuring the "submodule update" command and not to be obeyed
> by other commands.

Good idea, thank you.

You'll want to update Documentation/gitmodules.txt, too.

I think this can go further: it should say explicitly that commands
like "git checkout --recurse-submodules" do not pay attention to this
option.

> These other commands usually have a strict meaning of what they should
> do (i.e. checkout, reset, rebase, merge) as well as have their name
> overlapping with the modes possible for submodule.<name>.update.
>
> [1] https://public-inbox.org/git/4283F0B0-BC1C-4ED1-8126-7E512D84484B@gmail.com/

Can you summarize what this discussion concluded with so the reader
does not have to look far to understand it?

> Signed-off-by: Stefan Beller <sbeller@google.com>

Reported-by: Lars Schneider <larsxschneider@gmail.com>

> ---
>  Documentation/config.txt | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index dc4e3f58a2..b0ded777fe 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -3085,10 +3085,9 @@ submodule.<name>.url::
>  	See linkgit:git-submodule[1] and linkgit:gitmodules[5] for details.
>  
>  submodule.<name>.update::
> -	The default update procedure for a submodule. This variable
> -	is populated by `git submodule init` from the
> -	linkgit:gitmodules[5] file. See description of 'update'
> -	command in linkgit:git-submodule[1].
> +	The method how a submodule is updated via 'git submodule update'.
> +	It is populated by `git submodule init` from the linkgit:gitmodules[5]
> +	file. See description of 'update' command in linkgit:git-submodule[1].
>  

Wording nits: s/The method how/The method by which/; s/via/by/

More importantly, can this be more explicit about how it is meant to
be used?  E.g. to say

 1. This only affects "git submodule update" and doesn't affect
    commands like "git checkout --recurse-submodules".

 2. It exists for historical reasons; settings like submodule.active
    and pull.rebase are more likely to be what someone is looking for.

Thanks and hope that helps,
Jonathan

^ permalink raw reply	[relevance 25%]

* Re: [PATCH] git: add --no-optional-locks option
  2017-09-22 21:25       ` Jeff King
@ 2017-09-22 21:41         ` Stefan Beller
  2017-09-23  3:34           ` Jeff King
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-09-22 21:41 UTC (permalink / raw)
  To: Jeff King; +Cc: Johannes Sixt, Johannes Schindelin, git

On Fri, Sep 22, 2017 at 2:25 PM, Jeff King <peff@peff.net> wrote:
> On Fri, Sep 22, 2017 at 01:09:32PM -0700, Stefan Beller wrote:
>
>> On Thu, Sep 21, 2017 at 9:25 PM, Jeff King <peff@peff.net> wrote:
>>
>> >
>> > But imagine that "git status" learns to recurse into submodules and run
>> > "git status" inside them. Surely we would want the submodule repos to
>> > also avoid taking any unnecessary locks?
>>
>> You can teach Git to recurse into submodules already at home,
>> just 'git config status.submoduleSummary none'. ;)
>>
>> It occurs to me that the config name is badly choosen, as it stores
>> an argument for git status --ignore-submodules[=mode]
>
> Ah, thanks. I _thought_ we could already do that but when I went looking
> for the standard --recursive option I couldn't find it.

Thanks for checking for submodules there.

I personally prefer --recurse-submodules despite the longer name,
because a plain recursive option can mean anything in a sufficiently
complex program such as Git (recurse into the tree (c.f. ls-tree), or
for the algorithm used (c.f. merge, diff) or yet another dimension
I did not think of).

> So yes, I would think we would want this option to apply recursively in
> that case, even when we cross repository boundaries.

Regarding the actual patch which is heavily aimed at coping with IDEs
despite the command line being used, I wonder how many IDEs pass
--ignore-submodules and recurse themselves (if needed). Reason for
my suspicion is [1] which does pay attention to submodules:

>    Our application calls status including the following flags:
>    --porcelain=v2 --ignored --untracked-files=all --ignore-submodules=none

[1] https://public-inbox.org/git/2bbb1d0f-ae06-1878-d185-112bd51f75c9@gmail.com/

There might be another option to cope with the situation:

 4. Teach all commands to spinlock / busywait shortly for important
     locks instead of giving up. In that case git-status rewriting
     the index ought to be fine?

Thanks,
Stefan

^ permalink raw reply	[relevance 18%]

* [PATCHv2] Documentation/config: clarify the meaning of submodule.<name>.update
  2017-09-22 21:37 ` Jonathan Nieder
@ 2017-09-22 22:52   ` " Stefan Beller
  2017-09-22 22:58     ` Jonathan Nieder
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-09-22 22:52 UTC (permalink / raw)
  To: jrnieder; +Cc: git, sbeller

With more commands (that potentially change a submodule) paying attention
to submodules as well as the recent discussion[1] on
submodule.<name>.update, let's spell out that submodule.<name>.update
is strictly to be used for configuring the "submodule update" command
and not to be obeyed by other commands.

These other commands usually have a strict meaning of what they should
do (i.e. checkout, reset, rebase, merge) as well as have their name
overlapping with the modes possible for submodule.<name>.update.

[1] https://public-inbox.org/git/4283F0B0-BC1C-4ED1-8126-7E512D84484B@gmail.com/
    submodule.<name>.update was set to "none", triggering unexpected
    behavior as the submodule was thought to never be touched.
    However a newer version of Git taught 'git pull --rebase' to also
    populate and rebase submodules if they were active.
    The newer options such as submodule.active and command specific
    flags would not have triggered unexpected behavior.

Reported-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/config.txt | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

Jonathan writes:
> You'll want to update Documentation/gitmodules.txt, too.

No. /grumpycat

It should already be fine, as I read it as if it is only relevant to 
"git submodule update" there, already:

  submodule.<name>.update::
	Defines the default update procedure for the named submodule,
	i.e. how the submodule is updated by "git submodule update"
	command in the superproject. This is only used by `git
	submodule init` to initialize the configuration variable of
	the same name. Allowed values here are 'checkout', 'rebase',
	'merge' or 'none'. See description of 'update' command in
	linkgit:git-submodule[1] for their meaning. Note that the
	'!command' form is intentionally ignored here for security
	reasons.
 

diff --git a/Documentation/config.txt b/Documentation/config.txt
index dc4e3f58a2..1ac0ae6adb 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -3085,10 +3085,14 @@ submodule.<name>.url::
 	See linkgit:git-submodule[1] and linkgit:gitmodules[5] for details.
 
 submodule.<name>.update::
-	The default update procedure for a submodule. This variable
-	is populated by `git submodule init` from the
-	linkgit:gitmodules[5] file. See description of 'update'
-	command in linkgit:git-submodule[1].
+	The method by which a submodule is updated by 'git submodule update',
+	which is the only affected command, others such as
+	'git checkout --recurse-submodules' are unaffected. It exists for
+	historical reasons, when 'git submodule' was the only command to
+	interact with submodules; settings like `submodule.active`
+	and `pull.rebase` are more specific. It is populated by
+	`git submodule init` from the linkgit:gitmodules[5] file.
+	See description of 'update' command in linkgit:git-submodule[1].
 
 submodule.<name>.branch::
 	The remote branch name for a submodule, used by `git submodule
-- 
2.14.0.rc0.3.g6c2e499285


^ permalink raw reply	[relevance 29%]

* Re: [PATCHv2] Documentation/config: clarify the meaning of submodule.<name>.update
  2017-09-22 22:52   ` [PATCHv2] " Stefan Beller
@ 2017-09-22 22:58     ` Jonathan Nieder
  2017-09-23 23:52       ` Junio C Hamano
  0 siblings, 1 reply; 200+ results
From: Jonathan Nieder @ 2017-09-22 22:58 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

Stefan Beller wrote:

> Reported-by: Lars Schneider <larsxschneider@gmail.com>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>  Documentation/config.txt | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
>
> Jonathan writes:

>> You'll want to update Documentation/gitmodules.txt, too.
>
> No. /grumpycat
>
> It should already be fine, as I read it as if it is only relevant to 
> "git submodule update" there, already:
>
>   submodule.<name>.update::
> 	Defines the default update procedure for the named submodule,
> 	i.e. how the submodule is updated by "git submodule update"
> 	command in the superproject. This is only used by `git
> 	submodule init` to initialize the configuration variable of
> 	the same name. Allowed values here are 'checkout', 'rebase',
> 	'merge' or 'none'. See description of 'update' command in
> 	linkgit:git-submodule[1] for their meaning. Note that the
> 	'!command' form is intentionally ignored here for security
> 	reasons.

I disagree.  Actually, I think the git-config(1) blurb could just
point here, and that the text here ought to be clear about what
commands it affects and how an end user would use it.

^ permalink raw reply	[relevance 15%]

* Re: [PATCH] git: add --no-optional-locks option
  2017-09-22 21:41         ` Stefan Beller
@ 2017-09-23  3:34           ` Jeff King
  0 siblings, 0 replies; 200+ results
From: Jeff King @ 2017-09-23  3:34 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Johannes Sixt, Johannes Schindelin, git

On Fri, Sep 22, 2017 at 02:41:06PM -0700, Stefan Beller wrote:

> > Ah, thanks. I _thought_ we could already do that but when I went looking
> > for the standard --recursive option I couldn't find it.
> 
> Thanks for checking for submodules there.
> 
> I personally prefer --recurse-submodules despite the longer name,
> because a plain recursive option can mean anything in a sufficiently
> complex program such as Git (recurse into the tree (c.f. ls-tree), or
> for the algorithm used (c.f. merge, diff) or yet another dimension
> I did not think of).

Yeah, I agree that's a better name (mentioning --recursive is probably
just showing my general ignorance of submodules).

> > So yes, I would think we would want this option to apply recursively in
> > that case, even when we cross repository boundaries.
> 
> Regarding the actual patch which is heavily aimed at coping with IDEs
> despite the command line being used, I wonder how many IDEs pass
> --ignore-submodules and recurse themselves (if needed). Reason for
> my suspicion is [1] which does pay attention to submodules:
> 
> >    Our application calls status including the following flags:
> >    --porcelain=v2 --ignored --untracked-files=all --ignore-submodules=none
> 
> [1] https://public-inbox.org/git/2bbb1d0f-ae06-1878-d185-112bd51f75c9@gmail.com/

Yeah, I think that's probably Visual Studio (which is what Johannes
presumably wrote the original patch for). GitHub Desktop does not
currently pass it, though perhaps should consider doing so (though of
course the user could tweak their config, too).

> There might be another option to cope with the situation:
> 
>  4. Teach all commands to spinlock / busywait shortly for important
>      locks instead of giving up. In that case git-status rewriting
>      the index ought to be fine?

We do have all the infrastructure in place to do a reasonable busywait
with backoff. I think the patch is roughly just:

diff --git a/read-cache.c b/read-cache.c
index 65f4fe8375..fc1ba122a3 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1563,7 +1563,8 @@ static int read_index_extension(struct index_state *istate,
 
 int hold_locked_index(struct lock_file *lk, int lock_flags)
 {
-	return hold_lock_file_for_update(lk, get_index_file(), lock_flags);
+	return hold_lock_file_for_update_timeout(lk, get_index_file(),
+						 lock_flags, 500);
 }
 
 int read_index(struct index_state *istate)

though I think there are a few sites which manually call
hold_lock_file_for_update() on the index that would need similar
treatment.

I suspect it would work OK in practice, unless your index is so big that
500ms isn't enough. The user may also see minor stalls when there's lock
contention. I'm not sure how annoying that would be.

-Peff

^ permalink raw reply	[relevance 14%]

* Re: [PATCHv2] Documentation/config: clarify the meaning of submodule.<name>.update
  2017-09-22 22:58     ` Jonathan Nieder
@ 2017-09-23 23:52       ` Junio C Hamano
  2017-09-25 19:10         ` [PATCH] Documentation: consolidate " Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Junio C Hamano @ 2017-09-23 23:52 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Stefan Beller, git

Jonathan Nieder <jrnieder@gmail.com> writes:

> Stefan Beller wrote:
>
>> Reported-by: Lars Schneider <larsxschneider@gmail.com>
>> Signed-off-by: Stefan Beller <sbeller@google.com>
>> ---
>>  Documentation/config.txt | 12 ++++++++----
>>  1 file changed, 8 insertions(+), 4 deletions(-)
>>
>> Jonathan writes:
>
>>> You'll want to update Documentation/gitmodules.txt, too.
>>
>> No. /grumpycat
>>
>> It should already be fine, as I read it as if it is only relevant to 
>> "git submodule update" there, already:
>>
>>   submodule.<name>.update::
>> 	Defines the default update procedure for the named submodule,
>> 	i.e. how the submodule is updated by "git submodule update"
>> 	command in the superproject. This is only used by `git
>> 	submodule init` to initialize the configuration variable of
>> 	the same name. Allowed values here are 'checkout', 'rebase',
>> 	'merge' or 'none'. See description of 'update' command in
>> 	linkgit:git-submodule[1] for their meaning. Note that the
>> 	'!command' form is intentionally ignored here for security
>> 	reasons.
>
> I disagree.  Actually, I think the git-config(1) blurb could just
> point here, and that the text here ought to be clear about what
> commands it affects and how an end user would use it.

I tend to agree with the consolidation.  It's not like this patch
makes things worse, but as we've already diverted our attention to
this area and took time to review, if it is a simple update we can
make before we apply, that is better than having to remember and
come back to patch the result of this step further.

In any case, thanks for reporting, attempting to improve and
reviewing, all.

^ permalink raw reply	[relevance 7%]

* [PATCH v5 1/4] submodule--helper: introduce get_submodule_displaypath()
  2017-09-24 12:08 ` [PATCH v5 0/4] Incremental rewrite of git-submodules Prathamesh Chavan
@ 2017-09-24 12:08   ` Prathamesh Chavan
  2017-09-25  3:35     ` Junio C Hamano
  2017-09-24 12:08   ` [PATCH v5 2/4] submodule--helper: introduce for_each_listed_submodule() Prathamesh Chavan
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 200+ results
From: Prathamesh Chavan @ 2017-09-24 12:08 UTC (permalink / raw)
  To: hanwen; +Cc: christian.couder, git, gitster, pc44800, sbeller

Introduce function get_submodule_displaypath() to replace the code
occurring in submodule_init() for generating displaypath of the
submodule with a call to it.

This new function will also be used in other parts of the system
in later patches.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
---
 builtin/submodule--helper.c | 38 ++++++++++++++++++++++++++------------
 1 file changed, 26 insertions(+), 12 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 818fe74f0..d24ac9028 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -220,6 +220,29 @@ static int resolve_relative_url_test(int argc, const char **argv, const char *pr
 	return 0;
 }
 
+/* the result should be freed by the caller. */
+static char *get_submodule_displaypath(const char *path, const char *prefix)
+{
+	const char *super_prefix = get_super_prefix();
+
+	if (prefix && super_prefix) {
+		BUG("cannot have prefix '%s' and superprefix '%s'",
+		    prefix, super_prefix);
+	} else if (prefix) {
+		struct strbuf sb = STRBUF_INIT;
+		char *displaypath = xstrdup(relative_path(path, prefix, &sb));
+		strbuf_release(&sb);
+		return displaypath;
+	} else if (super_prefix) {
+		int len = strlen(super_prefix);
+		const char *format = (len > 0 && is_dir_sep(super_prefix[len - 1])) ? "%s%s" : "%s/%s";
+
+		return xstrfmt(format, super_prefix, path);
+	} else {
+		return xstrdup(path);
+	}
+}
+
 struct module_list {
 	const struct cache_entry **entries;
 	int alloc, nr;
@@ -335,15 +358,7 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
 	struct strbuf sb = STRBUF_INIT;
 	char *upd = NULL, *url = NULL, *displaypath;
 
-	if (prefix && get_super_prefix())
-		die("BUG: cannot have prefix and superprefix");
-	else if (prefix)
-		displaypath = xstrdup(relative_path(path, prefix, &sb));
-	else if (get_super_prefix()) {
-		strbuf_addf(&sb, "%s%s", get_super_prefix(), path);
-		displaypath = strbuf_detach(&sb, NULL);
-	} else
-		displaypath = xstrdup(path);
+	displaypath = get_submodule_displaypath(path, prefix);
 
 	sub = submodule_from_path(&null_oid, path);
 
@@ -358,9 +373,9 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
 	 * Set active flag for the submodule being initialized
 	 */
 	if (!is_submodule_active(the_repository, path)) {
-		strbuf_reset(&sb);
 		strbuf_addf(&sb, "submodule.%s.active", sub->name);
 		git_config_set_gently(sb.buf, "true");
+		strbuf_reset(&sb);
 	}
 
 	/*
@@ -368,7 +383,6 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
 	 * To look up the url in .git/config, we must not fall back to
 	 * .gitmodules, so look it up directly.
 	 */
-	strbuf_reset(&sb);
 	strbuf_addf(&sb, "submodule.%s.url", sub->name);
 	if (git_config_get_string(sb.buf, &url)) {
 		if (!sub->url)
@@ -405,9 +419,9 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
 				_("Submodule '%s' (%s) registered for path '%s'\n"),
 				sub->name, url, displaypath);
 	}
+	strbuf_reset(&sb);
 
 	/* Copy "update" setting when it is not set yet */
-	strbuf_reset(&sb);
 	strbuf_addf(&sb, "submodule.%s.update", sub->name);
 	if (git_config_get_string(sb.buf, &upd) &&
 	    sub->update_strategy.type != SM_UPDATE_UNSPECIFIED) {
-- 
2.13.0


^ permalink raw reply	[relevance 21%]

* [PATCH v5 2/4] submodule--helper: introduce for_each_listed_submodule()
  2017-09-24 12:08 ` [PATCH v5 0/4] Incremental rewrite of git-submodules Prathamesh Chavan
  2017-09-24 12:08   ` [PATCH v5 1/4] submodule--helper: introduce get_submodule_displaypath() Prathamesh Chavan
@ 2017-09-24 12:08   ` Prathamesh Chavan
  2017-09-25  3:43     ` Junio C Hamano
  2017-09-24 12:08   ` [PATCH v5 3/4] submodule: port set_name_rev() from shell to C Prathamesh Chavan
  2017-09-24 12:08   ` [PATCH v5 4/4] submodule: port submodule subcommand 'status' " Prathamesh Chavan
  3 siblings, 1 reply; 200+ results
From: Prathamesh Chavan @ 2017-09-24 12:08 UTC (permalink / raw)
  To: hanwen; +Cc: christian.couder, git, gitster, pc44800, sbeller

Introduce function for_each_listed_submodule() and replace a loop
in module_init() with a call to it.

The new function will also be used in other parts of the
system in later patches.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
---
 builtin/submodule--helper.c | 36 ++++++++++++++++++++++++++++--------
 1 file changed, 28 insertions(+), 8 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index d24ac9028..d12790b5c 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -14,6 +14,9 @@
 #include "refs.h"
 #include "connect.h"
 
+typedef void (*each_submodule_fn)(const struct cache_entry *list_item,
+				  void *cb_data);
+
 static char *get_default_remote(void)
 {
 	char *dest = NULL, *ret;
@@ -352,15 +355,30 @@ static int module_list(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
-static void init_submodule(const char *path, const char *prefix, int quiet)
+static void for_each_listed_submodule(const struct module_list *list,
+				      each_submodule_fn fn, void *cb_data)
+{
+	int i;
+	for (i = 0; i < list->nr; i++)
+		fn(list->entries[i], cb_data);
+}
+
+struct init_cb {
+	const char *prefix;
+	unsigned int quiet: 1;
+};
+#define INIT_CB_INIT { NULL, 0 }
+
+static void init_submodule(const struct cache_entry *list_item, void *cb_data)
 {
+	struct init_cb *info = cb_data;
 	const struct submodule *sub;
 	struct strbuf sb = STRBUF_INIT;
 	char *upd = NULL, *url = NULL, *displaypath;
 
-	displaypath = get_submodule_displaypath(path, prefix);
+	displaypath = get_submodule_displaypath(list_item->name, info->prefix);
 
-	sub = submodule_from_path(&null_oid, path);
+	sub = submodule_from_path(&null_oid, list_item->name);
 
 	if (!sub)
 		die(_("No url found for submodule path '%s' in .gitmodules"),
@@ -372,7 +390,7 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
 	 *
 	 * Set active flag for the submodule being initialized
 	 */
-	if (!is_submodule_active(the_repository, path)) {
+	if (!is_submodule_active(the_repository, list_item->name)) {
 		strbuf_addf(&sb, "submodule.%s.active", sub->name);
 		git_config_set_gently(sb.buf, "true");
 		strbuf_reset(&sb);
@@ -414,7 +432,7 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
 		if (git_config_set_gently(sb.buf, url))
 			die(_("Failed to register url for submodule path '%s'"),
 			    displaypath);
-		if (!quiet)
+		if (!info->quiet)
 			fprintf(stderr,
 				_("Submodule '%s' (%s) registered for path '%s'\n"),
 				sub->name, url, displaypath);
@@ -443,10 +461,10 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
 
 static int module_init(int argc, const char **argv, const char *prefix)
 {
+	struct init_cb info = INIT_CB_INIT;
 	struct pathspec pathspec;
 	struct module_list list = MODULE_LIST_INIT;
 	int quiet = 0;
-	int i;
 
 	struct option module_init_options[] = {
 		OPT__QUIET(&quiet, N_("Suppress output for initializing a submodule")),
@@ -471,8 +489,10 @@ static int module_init(int argc, const char **argv, const char *prefix)
 	if (!argc && git_config_get_value_multi("submodule.active"))
 		module_list_active(&list);
 
-	for (i = 0; i < list.nr; i++)
-		init_submodule(list.entries[i]->name, prefix, quiet);
+	info.prefix = prefix;
+	info.quiet = !!quiet;
+
+	for_each_listed_submodule(&list, init_submodule, &info);
 
 	return 0;
 }
-- 
2.13.0


^ permalink raw reply	[relevance 21%]

* [PATCH v5 3/4] submodule: port set_name_rev() from shell to C
  2017-09-24 12:08 ` [PATCH v5 0/4] Incremental rewrite of git-submodules Prathamesh Chavan
  2017-09-24 12:08   ` [PATCH v5 1/4] submodule--helper: introduce get_submodule_displaypath() Prathamesh Chavan
  2017-09-24 12:08   ` [PATCH v5 2/4] submodule--helper: introduce for_each_listed_submodule() Prathamesh Chavan
@ 2017-09-24 12:08   ` Prathamesh Chavan
  2017-09-24 12:08   ` [PATCH v5 4/4] submodule: port submodule subcommand 'status' " Prathamesh Chavan
  3 siblings, 0 replies; 200+ results
From: Prathamesh Chavan @ 2017-09-24 12:08 UTC (permalink / raw)
  To: hanwen; +Cc: christian.couder, git, gitster, pc44800, sbeller

Function set_name_rev() is ported from git-submodule to the
submodule--helper builtin. The function compute_rev_name() generates the
value of the revision name as required.
The function get_rev_name() calls compute_rev_name() and receives the
revision name, and later handles its formatting and printing.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
---
 builtin/submodule--helper.c | 63 +++++++++++++++++++++++++++++++++++++++++++++
 git-submodule.sh            | 16 ++----------
 2 files changed, 65 insertions(+), 14 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index d12790b5c..7ca8e8153 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -246,6 +246,68 @@ static char *get_submodule_displaypath(const char *path, const char *prefix)
 	}
 }
 
+static char *compute_rev_name(const char *sub_path, const char* object_id)
+{
+	struct strbuf sb = STRBUF_INIT;
+	const char ***d;
+
+	static const char *describe_bare[] = {
+		NULL
+	};
+
+	static const char *describe_tags[] = {
+		"--tags", NULL
+	};
+
+	static const char *describe_contains[] = {
+		"--contains", NULL
+	};
+
+	static const char *describe_all_always[] = {
+		"--all", "--always", NULL
+	};
+
+	static const char **describe_argv[] = {
+		describe_bare, describe_tags, describe_contains,
+		describe_all_always, NULL
+	};
+
+	for (d = describe_argv; *d; d++) {
+		struct child_process cp = CHILD_PROCESS_INIT;
+		prepare_submodule_repo_env(&cp.env_array);
+		cp.dir = sub_path;
+		cp.git_cmd = 1;
+		cp.no_stderr = 1;
+
+		argv_array_push(&cp.args, "describe");
+		argv_array_pushv(&cp.args, *d);
+		argv_array_push(&cp.args, object_id);
+
+		if (!capture_command(&cp, &sb, 0) && sb.len) {
+			strbuf_strip_suffix(&sb, "\n");
+			return strbuf_detach(&sb, NULL);
+		}
+	}
+
+	strbuf_release(&sb);
+	return NULL;
+}
+
+static int get_rev_name(int argc, const char **argv, const char *prefix)
+{
+	char *revname;
+	if (argc != 3)
+		die("get-rev-name only accepts two arguments: <path> <sha1>");
+
+	revname = compute_rev_name(argv[1], argv[2]);
+	if (revname && revname[0])
+		printf(" (%s)", revname);
+	printf("\n");
+
+	free(revname);
+	return 0;
+}
+
 struct module_list {
 	const struct cache_entry **entries;
 	int alloc, nr;
@@ -1293,6 +1355,7 @@ static struct cmd_struct commands[] = {
 	{"relative-path", resolve_relative_path, 0},
 	{"resolve-relative-url", resolve_relative_url, 0},
 	{"resolve-relative-url-test", resolve_relative_url_test, 0},
+	{"get-rev-name", get_rev_name, 0},
 	{"init", module_init, SUPPORT_SUPER_PREFIX},
 	{"remote-branch", resolve_remote_submodule_branch, 0},
 	{"push-check", push_check, 0},
diff --git a/git-submodule.sh b/git-submodule.sh
index 66d1ae8ef..5211361c5 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -758,18 +758,6 @@ cmd_update()
 	}
 }
 
-set_name_rev () {
-	revname=$( (
-		sanitize_submodule_env
-		cd "$1" && {
-			git describe "$2" 2>/dev/null ||
-			git describe --tags "$2" 2>/dev/null ||
-			git describe --contains "$2" 2>/dev/null ||
-			git describe --all --always "$2"
-		}
-	) )
-	test -z "$revname" || revname=" ($revname)"
-}
 #
 # Show commit summary for submodules in index or working tree
 #
@@ -1041,14 +1029,14 @@ cmd_status()
 		fi
 		if git diff-files --ignore-submodules=dirty --quiet -- "$sm_path"
 		then
-			set_name_rev "$sm_path" "$sha1"
+			revname=$(git submodule--helper get-rev-name "$sm_path" "$sha1")
 			say " $sha1 $displaypath$revname"
 		else
 			if test -z "$cached"
 			then
 				sha1=$(sanitize_submodule_env; cd "$sm_path" && git rev-parse --verify HEAD)
 			fi
-			set_name_rev "$sm_path" "$sha1"
+			revname=$(git submodule--helper get-rev-name "$sm_path" "$sha1")
 			say "+$sha1 $displaypath$revname"
 		fi
 
-- 
2.13.0


^ permalink raw reply	[relevance 21%]

* [PATCH v5 4/4] submodule: port submodule subcommand 'status' from shell to C
  2017-09-24 12:08 ` [PATCH v5 0/4] Incremental rewrite of git-submodules Prathamesh Chavan
                     ` (2 preceding siblings ...)
  2017-09-24 12:08   ` [PATCH v5 3/4] submodule: port set_name_rev() from shell to C Prathamesh Chavan
@ 2017-09-24 12:08   ` " Prathamesh Chavan
  3 siblings, 0 replies; 200+ results
From: Prathamesh Chavan @ 2017-09-24 12:08 UTC (permalink / raw)
  To: hanwen; +Cc: christian.couder, git, gitster, pc44800, sbeller

This aims to make git-submodule 'status' a built-in. Hence, the function
cmd_status() is ported from shell to C. This is done by introducing
three functions: module_status(), submodule_status() and print_status().

The function module_status() acts as the front-end of the subcommand.
It parses subcommand's options and then calls the function
module_list_compute() for computing the list of submodules. Then
this functions calls for_each_listed_submodule() looping through the
list obtained.

Then for_each_listed_submodule() calls submodule_status() for each of the
submodule in its list. The function submodule_status() is responsible
for generating the status each submodule it is called for, and
then calls print_status().

Finally, the function print_status() handles the printing of submodule's
status.

Helper function get_rev_name() removed after porting the above subcommand
as it is no longer used.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
---
 builtin/submodule--helper.c | 162 +++++++++++++++++++++++++++++++++++++++-----
 git-submodule.sh            |  49 +-------------
 2 files changed, 147 insertions(+), 64 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 7ca8e8153..8876a4a08 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -293,21 +293,6 @@ static char *compute_rev_name(const char *sub_path, const char* object_id)
 	return NULL;
 }
 
-static int get_rev_name(int argc, const char **argv, const char *prefix)
-{
-	char *revname;
-	if (argc != 3)
-		die("get-rev-name only accepts two arguments: <path> <sha1>");
-
-	revname = compute_rev_name(argv[1], argv[2]);
-	if (revname && revname[0])
-		printf(" (%s)", revname);
-	printf("\n");
-
-	free(revname);
-	return 0;
-}
-
 struct module_list {
 	const struct cache_entry **entries;
 	int alloc, nr;
@@ -559,6 +544,151 @@ static int module_init(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
+struct status_cb {
+	const char *prefix;
+	unsigned int quiet: 1;
+	unsigned int recursive: 1;
+	unsigned int cached: 1;
+};
+#define STATUS_CB_INIT { NULL, 0, 0, 0 }
+
+static void print_status(struct status_cb *info, char state, const char *path,
+			 const struct object_id *oid, const char *displaypath)
+{
+	if (info->quiet)
+		return;
+
+	printf("%c%s %s", state, oid_to_hex(oid), displaypath);
+
+	if (state == ' ' || state == '+')
+		printf(" (%s)", compute_rev_name(path, oid_to_hex(oid)));
+
+	printf("\n");
+}
+
+static int handle_submodule_head_ref(const char *refname,
+				     const struct object_id *oid, int flags,
+				     void *cb_data)
+{
+	struct object_id *output = cb_data;
+	if (oid)
+		oidcpy(output, oid);
+
+	return 0;
+}
+
+static void status_submodule(const struct cache_entry *list_item, void *cb_data)
+{
+	struct status_cb *info = cb_data;
+	char *displaypath;
+	struct argv_array diff_files_args = ARGV_ARRAY_INIT;
+
+	if (!submodule_from_path(&null_oid, list_item->name))
+		die(_("no submodule mapping found in .gitmodules for path '%s'"),
+		      list_item->name);
+
+	displaypath = get_submodule_displaypath(list_item->name, info->prefix);
+
+	if (ce_stage(list_item)) {
+		print_status(info, 'U', list_item->name,
+			     &null_oid, displaypath);
+		goto cleanup;
+	}
+
+	if (!is_submodule_active(the_repository, list_item->name)) {
+		print_status(info, '-', list_item->name, &list_item->oid,
+			     displaypath);
+		goto cleanup;
+	}
+
+	argv_array_pushl(&diff_files_args, "diff-files",
+			 "--ignore-submodules=dirty", "--quiet", "--",
+			 list_item->name, NULL);
+
+	if (!cmd_diff_files(diff_files_args.argc, diff_files_args.argv,
+			    info->prefix)) {
+		print_status(info, ' ', list_item->name, &list_item->oid,
+			     displaypath);
+	} else {
+		if (!info->cached) {
+			struct object_id oid;
+
+			if (refs_head_ref(get_submodule_ref_store(list_item->name), handle_submodule_head_ref, &oid))
+				die(_("could not resolve HEAD ref inside the"
+				      "submodule '%s'"), list_item->name);
+
+			print_status(info, '+', list_item->name, &oid,
+				     displaypath);
+		} else {
+			print_status(info, '+', list_item->name,
+				     &list_item->oid, displaypath);
+		}
+	}
+
+	if (info->recursive) {
+		struct child_process cpr = CHILD_PROCESS_INIT;
+
+		cpr.git_cmd = 1;
+		cpr.dir = list_item->name;
+		prepare_submodule_repo_env(&cpr.env_array);
+
+		argv_array_pushl(&cpr.args, "--super-prefix", displaypath,
+				 "submodule--helper", "status", "--recursive",
+				 NULL);
+
+		if (info->cached)
+			argv_array_push(&cpr.args, "--cached");
+
+		if (info->quiet)
+			argv_array_push(&cpr.args, "--quiet");
+
+		if (run_command(&cpr))
+			die(_("failed to recurse into submodule '%s'"),
+			      list_item->name);
+	}
+
+cleanup:
+	argv_array_clear(&diff_files_args);
+	free(displaypath);
+}
+
+static int module_status(int argc, const char **argv, const char *prefix)
+{
+	struct status_cb info = STATUS_CB_INIT;
+	struct pathspec pathspec;
+	struct module_list list = MODULE_LIST_INIT;
+	int quiet = 0;
+	int cached = 0;
+	int recursive = 0;
+
+	struct option module_status_options[] = {
+		OPT__QUIET(&quiet, N_("Suppress submodule status output")),
+		OPT_BOOL(0, "cached", &cached, N_("Use commit stored in the index instead of the one stored in the submodule HEAD")),
+		OPT_BOOL(0, "recursive", &recursive, N_("Recurse into nested submodules")),
+		OPT_END()
+	};
+
+	const char *const git_submodule_helper_usage[] = {
+		N_("git submodule status [--quiet] [--cached] [--recursive] [<path>...]"),
+		NULL
+	};
+
+	argc = parse_options(argc, argv, prefix, module_status_options,
+			     git_submodule_helper_usage, 0);
+
+	if (module_list_compute(argc, argv, prefix, &pathspec, &list) < 0)
+		return 1;
+
+	info.prefix = prefix;
+	info.quiet = !!quiet;
+	info.recursive = !!recursive;
+	info.cached = !!cached;
+
+	for_each_listed_submodule(&list, status_submodule, &info);
+
+	return 0;
+}
+
 static int module_name(int argc, const char **argv, const char *prefix)
 {
 	const struct submodule *sub;
@@ -1355,8 +1485,8 @@ static struct cmd_struct commands[] = {
 	{"relative-path", resolve_relative_path, 0},
 	{"resolve-relative-url", resolve_relative_url, 0},
 	{"resolve-relative-url-test", resolve_relative_url_test, 0},
-	{"get-rev-name", get_rev_name, 0},
 	{"init", module_init, SUPPORT_SUPER_PREFIX},
+	{"status", module_status, SUPPORT_SUPER_PREFIX},
 	{"remote-branch", resolve_remote_submodule_branch, 0},
 	{"push-check", push_check, 0},
 	{"absorb-git-dirs", absorb_git_dirs, SUPPORT_SUPER_PREFIX},
diff --git a/git-submodule.sh b/git-submodule.sh
index 5211361c5..156255a9e 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -1004,54 +1004,7 @@ cmd_status()
 		shift
 	done
 
-	{
-		git submodule--helper list --prefix "$wt_prefix" "$@" ||
-		echo "#unmatched" $?
-	} |
-	while read -r mode sha1 stage sm_path
-	do
-		die_if_unmatched "$mode" "$sha1"
-		name=$(git submodule--helper name "$sm_path") || exit
-		displaypath=$(git submodule--helper relative-path "$prefix$sm_path" "$wt_prefix")
-		if test "$stage" = U
-		then
-			say "U$sha1 $displaypath"
-			continue
-		fi
-		if ! git submodule--helper is-active "$sm_path" ||
-		{
-			! test -d "$sm_path"/.git &&
-			! test -f "$sm_path"/.git
-		}
-		then
-			say "-$sha1 $displaypath"
-			continue;
-		fi
-		if git diff-files --ignore-submodules=dirty --quiet -- "$sm_path"
-		then
-			revname=$(git submodule--helper get-rev-name "$sm_path" "$sha1")
-			say " $sha1 $displaypath$revname"
-		else
-			if test -z "$cached"
-			then
-				sha1=$(sanitize_submodule_env; cd "$sm_path" && git rev-parse --verify HEAD)
-			fi
-			revname=$(git submodule--helper get-rev-name "$sm_path" "$sha1")
-			say "+$sha1 $displaypath$revname"
-		fi
-
-		if test -n "$recursive"
-		then
-			(
-				prefix="$displaypath/"
-				sanitize_submodule_env
-				wt_prefix=
-				cd "$sm_path" &&
-				eval cmd_status
-			) ||
-			die "$(eval_gettext "Failed to recurse into submodule path '\$sm_path'")"
-		fi
-	done
+	git ${wt_prefix:+-C "$wt_prefix"} ${prefix:+--super-prefix "$prefix"} submodule--helper status ${GIT_QUIET:+--quiet} ${cached:+--cached} ${recursive:+--recursive} "$@"
 }
 #
 # Sync remote urls for submodules
-- 
2.13.0


^ permalink raw reply	[relevance 20%]

* Re: [PATCH v5 1/4] submodule--helper: introduce get_submodule_displaypath()
  2017-09-24 12:08   ` [PATCH v5 1/4] submodule--helper: introduce get_submodule_displaypath() Prathamesh Chavan
@ 2017-09-25  3:35     ` Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2017-09-25  3:35 UTC (permalink / raw)
  To: Prathamesh Chavan; +Cc: hanwen, christian.couder, git, sbeller

Prathamesh Chavan <pc44800@gmail.com> writes:

> Introduce function get_submodule_displaypath() to replace the code
> occurring in submodule_init() for generating displaypath of the
> submodule with a call to it.
>
> This new function will also be used in other parts of the system
> in later patches.
>
> Mentored-by: Christian Couder <christian.couder@gmail.com>
> Mentored-by: Stefan Beller <sbeller@google.com>
> Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
> ---
>  builtin/submodule--helper.c | 38 ++++++++++++++++++++++++++------------
>  1 file changed, 26 insertions(+), 12 deletions(-)
>
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index 818fe74f0..d24ac9028 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -220,6 +220,29 @@ static int resolve_relative_url_test(int argc, const char **argv, const char *pr
>  	return 0;
>  }
>  
> +/* the result should be freed by the caller. */
> +static char *get_submodule_displaypath(const char *path, const char *prefix)
> +{
> +	const char *super_prefix = get_super_prefix();
> +
> +	if (prefix && super_prefix) {
> +		BUG("cannot have prefix '%s' and superprefix '%s'",
> +		    prefix, super_prefix);
> +	} else if (prefix) {
> +		struct strbuf sb = STRBUF_INIT;
> +		char *displaypath = xstrdup(relative_path(path, prefix, &sb));
> +		strbuf_release(&sb);
> +		return displaypath;
> +	} else if (super_prefix) {
> +		int len = strlen(super_prefix);
> +		const char *format = (len > 0 && is_dir_sep(super_prefix[len - 1])) ? "%s%s" : "%s/%s";
> +
> +		return xstrfmt(format, super_prefix, path);
> +	} else {
> +		return xstrdup(path);
> +	}
> +}

Looks like a fairly faithful rewrite of the original below, with a
reasonable clean-up (e.g. use of xstrfmt() instead of addf()).

One thing I noticed is that future callers of this function, unlike
init_submodule() which is its original caller, are allowed to have
super-prefix that does not end in a slash, because the helper
automatically supplies one when it is missing.

I am not sure the added leniency is desirable [*1*], but in any
case, the added leniency deserves a mention in the log message, I
would think.


[Footnote]

*1* Often it is harder to introduce bugs when the internal rules are
stricter, e.g. "prefix must end with slash", than when they are
looser, e.g. "prefix may or may not end with slash", because
allowing different codepaths to have the same thing in different
representations would hide bugs that is only uncovered when these
codepaths eventually meet (e.g. one codepath has a path with and the
other without trailing slash and they both think that is the
prefix---then they use strcmp() to see if they have the same prefix,
which would be a bug, which may not be noticed for a long time).

>  struct module_list {
>  	const struct cache_entry **entries;
>  	int alloc, nr;
> @@ -335,15 +358,7 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
>  	struct strbuf sb = STRBUF_INIT;
>  	char *upd = NULL, *url = NULL, *displaypath;
>  
> -	if (prefix && get_super_prefix())
> -		die("BUG: cannot have prefix and superprefix");
> -	else if (prefix)
> -		displaypath = xstrdup(relative_path(path, prefix, &sb));
> -	else if (get_super_prefix()) {
> -		strbuf_addf(&sb, "%s%s", get_super_prefix(), path);
> -		displaypath = strbuf_detach(&sb, NULL);
> -	} else
> -		displaypath = xstrdup(path);
> +	displaypath = get_submodule_displaypath(path, prefix);
>  
>  	sub = submodule_from_path(&null_oid, path);
>  
> @@ -358,9 +373,9 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
>  	 * Set active flag for the submodule being initialized
>  	 */
>  	if (!is_submodule_active(the_repository, path)) {
> -		strbuf_reset(&sb);
>  		strbuf_addf(&sb, "submodule.%s.active", sub->name);
>  		git_config_set_gently(sb.buf, "true");
> +		strbuf_reset(&sb);
>  	}
>  
>  	/*
> @@ -368,7 +383,6 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
>  	 * To look up the url in .git/config, we must not fall back to
>  	 * .gitmodules, so look it up directly.
>  	 */
> -	strbuf_reset(&sb);
>  	strbuf_addf(&sb, "submodule.%s.url", sub->name);
>  	if (git_config_get_string(sb.buf, &url)) {
>  		if (!sub->url)
> @@ -405,9 +419,9 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
>  				_("Submodule '%s' (%s) registered for path '%s'\n"),
>  				sub->name, url, displaypath);
>  	}
> +	strbuf_reset(&sb);
>  
>  	/* Copy "update" setting when it is not set yet */
> -	strbuf_reset(&sb);
>  	strbuf_addf(&sb, "submodule.%s.update", sub->name);
>  	if (git_config_get_string(sb.buf, &upd) &&
>  	    sub->update_strategy.type != SM_UPDATE_UNSPECIFIED) {

^ permalink raw reply	[relevance 7%]

* Re: [PATCH v5 2/4] submodule--helper: introduce for_each_listed_submodule()
  2017-09-24 12:08   ` [PATCH v5 2/4] submodule--helper: introduce for_each_listed_submodule() Prathamesh Chavan
@ 2017-09-25  3:43     ` Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2017-09-25  3:43 UTC (permalink / raw)
  To: Prathamesh Chavan; +Cc: hanwen, christian.couder, git, sbeller

Prathamesh Chavan <pc44800@gmail.com> writes:

> Introduce function for_each_listed_submodule() and replace a loop
> in module_init() with a call to it.
>
> The new function will also be used in other parts of the
> system in later patches.
>
> Mentored-by: Christian Couder <christian.couder@gmail.com>
> Mentored-by: Stefan Beller <sbeller@google.com>
> Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
> ---
>  builtin/submodule--helper.c | 36 ++++++++++++++++++++++++++++--------
>  1 file changed, 28 insertions(+), 8 deletions(-)

This looks more or less perfectly done, but I say this basing it on
an assumption that nobody ever would want to initialize a single
submodule without going through module_init() interface.

If there will be a caller that would have made a direct call to
init_submodule() if this patch did not exist, then I think this
patch is better done by keeping the external interface to the
init_submodule() function intact, and introducing an intermediate
helper init_submodule_cb() function that is to be used as the
callback used in for_each_listed_submodule() API.  In general, we
should make sure that these callback functions that take "void
*cb_data" parameters are called _only_ by these iterators that
defined the callback (in this case, for_each_listed_submodule())
and not other functions.

> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index d24ac9028..d12790b5c 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -14,6 +14,9 @@
>  #include "refs.h"
>  #include "connect.h"
>  
> +typedef void (*each_submodule_fn)(const struct cache_entry *list_item,
> +				  void *cb_data);
> +
>  static char *get_default_remote(void)
>  {
>  	char *dest = NULL, *ret;
> @@ -352,15 +355,30 @@ static int module_list(int argc, const char **argv, const char *prefix)
>  	return 0;
>  }
>  
> -static void init_submodule(const char *path, const char *prefix, int quiet)
> +static void for_each_listed_submodule(const struct module_list *list,
> +				      each_submodule_fn fn, void *cb_data)
> +{
> +	int i;
> +	for (i = 0; i < list->nr; i++)
> +		fn(list->entries[i], cb_data);
> +}
> +
> +struct init_cb {
> +	const char *prefix;
> +	unsigned int quiet: 1;
> +};
> +#define INIT_CB_INIT { NULL, 0 }
> +
> +static void init_submodule(const struct cache_entry *list_item, void *cb_data)
>  {
> +	struct init_cb *info = cb_data;
>  	const struct submodule *sub;
>  	struct strbuf sb = STRBUF_INIT;
>  	char *upd = NULL, *url = NULL, *displaypath;
>  
> -	displaypath = get_submodule_displaypath(path, prefix);
> +	displaypath = get_submodule_displaypath(list_item->name, info->prefix);
>  
> -	sub = submodule_from_path(&null_oid, path);
> +	sub = submodule_from_path(&null_oid, list_item->name);
>  
>  	if (!sub)
>  		die(_("No url found for submodule path '%s' in .gitmodules"),
> @@ -372,7 +390,7 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
>  	 *
>  	 * Set active flag for the submodule being initialized
>  	 */
> -	if (!is_submodule_active(the_repository, path)) {
> +	if (!is_submodule_active(the_repository, list_item->name)) {
>  		strbuf_addf(&sb, "submodule.%s.active", sub->name);
>  		git_config_set_gently(sb.buf, "true");
>  		strbuf_reset(&sb);
> @@ -414,7 +432,7 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
>  		if (git_config_set_gently(sb.buf, url))
>  			die(_("Failed to register url for submodule path '%s'"),
>  			    displaypath);
> -		if (!quiet)
> +		if (!info->quiet)
>  			fprintf(stderr,
>  				_("Submodule '%s' (%s) registered for path '%s'\n"),
>  				sub->name, url, displaypath);
> @@ -443,10 +461,10 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
>  
>  static int module_init(int argc, const char **argv, const char *prefix)
>  {
> +	struct init_cb info = INIT_CB_INIT;
>  	struct pathspec pathspec;
>  	struct module_list list = MODULE_LIST_INIT;
>  	int quiet = 0;
> -	int i;
>  
>  	struct option module_init_options[] = {
>  		OPT__QUIET(&quiet, N_("Suppress output for initializing a submodule")),
> @@ -471,8 +489,10 @@ static int module_init(int argc, const char **argv, const char *prefix)
>  	if (!argc && git_config_get_value_multi("submodule.active"))
>  		module_list_active(&list);
>  
> -	for (i = 0; i < list.nr; i++)
> -		init_submodule(list.entries[i]->name, prefix, quiet);
> +	info.prefix = prefix;
> +	info.quiet = !!quiet;
> +
> +	for_each_listed_submodule(&list, init_submodule, &info);
>  
>  	return 0;
>  }

^ permalink raw reply	[relevance 14%]

* Re: What's cooking in git.git (Sep 2017, #04; Mon, 25)
  2017-09-25  7:25 What's cooking in git.git (Sep 2017, #04; Mon, 25) Junio C Hamano
@ 2017-09-25 18:30 ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-09-25 18:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

>
> * sb/parse-options-blank-line-before-option-list (2017-08-25) 1 commit
>  - usage_with_options: omit double new line on empty option list
>
>  "git worktree" with no option and no subcommand showed too many
>  blank lines in its help text, which has been reduced.
>
>  Superseded by bc/rev-parse-parseopt-fix?

yes it is, please retract this one.


> * sb/submodule-recursive-checkout-detach-head (2017-07-28) 2 commits
>  - Documentation/checkout: clarify submodule HEADs to be detached
>  - recursive submodules: detach HEAD from new state
>
>  "git checkout --recursive" may overwrite and rewind the history of
>  the branch that happens to be checked out in submodule
>  repositories, which might not be desirable.  Detach the HEAD but
>  still allow the recursive checkout to succeed in such a case.
>
>  Undecided.
>  This needs justification in a larger picture; it is unclear why
>  this is better than rejecting recursive checkout, for example.

I'll revisit this with fresh eyes, the original plan was to get
jn/per-repo-obj-store done to have more fundamental operations
available to do the right thing, whatever it is, though.

>
> * jn/per-repo-obj-store (2017-09-07) 39 commits

We intend to get this one rolling again.

^ permalink raw reply	[relevance 8%]

* [PATCH] Documentation: consolidate submodule.<name>.update
  2017-09-23 23:52       ` Junio C Hamano
@ 2017-09-25 19:10         ` " Stefan Beller
  2017-09-25 19:17           ` Brandon Williams
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-09-25 19:10 UTC (permalink / raw)
  To: gitster; +Cc: git, jrnieder, sbeller

Have one place to explain the effects of setting submodule.<name>.update
instead of two.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
>> I disagree.  Actually, I think the git-config(1) blurb could just
>> point here, and that the text here ought to be clear about what
>> commands it affects and how an end user would use it.
>
> I tend to agree with the consolidation.

Something like this?

 Documentation/config.txt     |  9 +--------
 Documentation/gitmodules.txt | 20 +++++++++++---------
 2 files changed, 12 insertions(+), 17 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 1ac0ae6adb..0d5a296b6c 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -3085,14 +3085,7 @@ submodule.<name>.url::
 	See linkgit:git-submodule[1] and linkgit:gitmodules[5] for details.
 
 submodule.<name>.update::
-	The method by which a submodule is updated by 'git submodule update',
-	which is the only affected command, others such as
-	'git checkout --recurse-submodules' are unaffected. It exists for
-	historical reasons, when 'git submodule' was the only command to
-	interact with submodules; settings like `submodule.active`
-	and `pull.rebase` are more specific. It is populated by
-	`git submodule init` from the linkgit:gitmodules[5] file.
-	See description of 'update' command in linkgit:git-submodule[1].
+	See `submodule.<name>.update` in linkgit:gitmodules[5].
 
 submodule.<name>.branch::
 	The remote branch name for a submodule, used by `git submodule
diff --git a/Documentation/gitmodules.txt b/Documentation/gitmodules.txt
index db5d47eb19..d156dee387 100644
--- a/Documentation/gitmodules.txt
+++ b/Documentation/gitmodules.txt
@@ -38,15 +38,17 @@ submodule.<name>.url::
 In addition, there are a number of optional keys:
 
 submodule.<name>.update::
-	Defines the default update procedure for the named submodule,
-	i.e. how the submodule is updated by "git submodule update"
-	command in the superproject. This is only used by `git
-	submodule init` to initialize the configuration variable of
-	the same name. Allowed values here are 'checkout', 'rebase',
-	'merge' or 'none'. See description of 'update' command in
-	linkgit:git-submodule[1] for their meaning. Note that the
-	'!command' form is intentionally ignored here for security
-	reasons.
+	The method by which a submodule is updated by 'git submodule update',
+	which is the only affected command, others such as
+	'git checkout --recurse-submodules' are unaffected. It exists for
+	historical reasons, when 'git submodule' was the only command to
+	interact with submodules; settings like `submodule.active`
+	and `pull.rebase` are more specific. It is copied to the config
+	by `git submodule init` from the .gitmodules file.
+	Allowed values here are 'checkout', 'rebase', 'merge' or 'none'.
+	See description of 'update' command in linkgit:git-submodule[1]
+	for their meaning. Note that the '!command' form is intentionally
+	ignored here for security reasons.
 
 submodule.<name>.branch::
 	A remote branch name for tracking updates in the upstream submodule.
-- 
2.14.0.rc0.3.g6c2e499285


^ permalink raw reply	[relevance 28%]

* Re: [PATCH] Documentation: consolidate submodule.<name>.update
  2017-09-25 19:10         ` [PATCH] Documentation: consolidate " Stefan Beller
@ 2017-09-25 19:17           ` Brandon Williams
  2017-09-25 22:18             ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Brandon Williams @ 2017-09-25 19:17 UTC (permalink / raw)
  To: Stefan Beller; +Cc: gitster, git, jrnieder

On 09/25, Stefan Beller wrote:
> Have one place to explain the effects of setting submodule.<name>.update
> instead of two.
> 
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
> >> I disagree.  Actually, I think the git-config(1) blurb could just
> >> point here, and that the text here ought to be clear about what
> >> commands it affects and how an end user would use it.
> >
> > I tend to agree with the consolidation.
> 
> Something like this?

I like the consolidation, its easier to keep up to date when its only in
one place.

> 
>  Documentation/config.txt     |  9 +--------
>  Documentation/gitmodules.txt | 20 +++++++++++---------
>  2 files changed, 12 insertions(+), 17 deletions(-)
> 
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 1ac0ae6adb..0d5a296b6c 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -3085,14 +3085,7 @@ submodule.<name>.url::
>  	See linkgit:git-submodule[1] and linkgit:gitmodules[5] for details.
>  
>  submodule.<name>.update::
> -	The method by which a submodule is updated by 'git submodule update',
> -	which is the only affected command, others such as
> -	'git checkout --recurse-submodules' are unaffected. It exists for
> -	historical reasons, when 'git submodule' was the only command to
> -	interact with submodules; settings like `submodule.active`
> -	and `pull.rebase` are more specific. It is populated by
> -	`git submodule init` from the linkgit:gitmodules[5] file.
> -	See description of 'update' command in linkgit:git-submodule[1].
> +	See `submodule.<name>.update` in linkgit:gitmodules[5].
>  
>  submodule.<name>.branch::
>  	The remote branch name for a submodule, used by `git submodule
> diff --git a/Documentation/gitmodules.txt b/Documentation/gitmodules.txt
> index db5d47eb19..d156dee387 100644
> --- a/Documentation/gitmodules.txt
> +++ b/Documentation/gitmodules.txt
> @@ -38,15 +38,17 @@ submodule.<name>.url::
>  In addition, there are a number of optional keys:
>  
>  submodule.<name>.update::
> -	Defines the default update procedure for the named submodule,
> -	i.e. how the submodule is updated by "git submodule update"
> -	command in the superproject. This is only used by `git
> -	submodule init` to initialize the configuration variable of
> -	the same name. Allowed values here are 'checkout', 'rebase',
> -	'merge' or 'none'. See description of 'update' command in
> -	linkgit:git-submodule[1] for their meaning. Note that the
> -	'!command' form is intentionally ignored here for security
> -	reasons.
> +	The method by which a submodule is updated by 'git submodule update',
> +	which is the only affected command, others such as
> +	'git checkout --recurse-submodules' are unaffected. It exists for
> +	historical reasons, when 'git submodule' was the only command to
> +	interact with submodules; settings like `submodule.active`
> +	and `pull.rebase` are more specific. It is copied to the config
> +	by `git submodule init` from the .gitmodules file.
> +	Allowed values here are 'checkout', 'rebase', 'merge' or 'none'.
> +	See description of 'update' command in linkgit:git-submodule[1]
> +	for their meaning. Note that the '!command' form is intentionally
> +	ignored here for security reasons.

This probably needs to be tweaked a bit to say that the '!command' form
is ignored by submodule init, in that it isn't copied over from the
.gitmodules file, but if it is configured in your config it will be
respected by 'submodule update'.

>  
>  submodule.<name>.branch::
>  	A remote branch name for tracking updates in the upstream submodule.
> -- 
> 2.14.0.rc0.3.g6c2e499285
> 

-- 
Brandon Williams

^ permalink raw reply	[relevance 24%]

* [PATCH] t7406: submodule.<name>.update command must not be run from .gitmodules
@ 2017-09-25 19:55 Stefan Beller
  2017-09-25 20:04 ` Jonathan Nieder
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-09-25 19:55 UTC (permalink / raw)
  To: git; +Cc: jrnieder, Stefan Beller

submodule.<name>.update can be assigned an arbitrary command via setting
it to "!command". When this command is found in the regular config, Git
ought to just run that command instead of other update mechanisms.

However if that command is just found in the .gitmodules file, it is
potentially untrusted, which is why we do not run it.  Add a test
confirming the behavior.

Suggested-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
---
 t/t7406-submodule-update.sh | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index 034914a14f..780af4e6f5 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -406,6 +406,16 @@ test_expect_success 'submodule update - command in .git/config' '
 	)
 '
 
+test_expect_success 'submodule update - command in .gitmodules is ignored' '
+	test_when_finished "git -C super reset --hard HEAD^" &&
+
+	git -C super config -f .gitmodules submodule.submodule.update "!false || echo >bad" &&
+	git -C super commit -a -m "add command to .gitmodules file" &&
+	git -C super/submodule reset --hard $submodulesha1^ &&
+	git -C super submodule update submodule 2>../actual &&
+	test_path_is_missing super/bad
+'
+
 cat << EOF >expect
 Execution of 'false $submodulesha1' failed in submodule path 'submodule'
 EOF
-- 
2.14.0.rc0.3.g6c2e499285


^ permalink raw reply	[relevance 33%]

* Re: [PATCH] t7406: submodule.<name>.update command must not be run from .gitmodules
  2017-09-25 19:55 [PATCH] t7406: submodule.<name>.update command must not be run from .gitmodules Stefan Beller
@ 2017-09-25 20:04 ` Jonathan Nieder
  2017-09-25 22:50   ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Jonathan Nieder @ 2017-09-25 20:04 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

Stefan Beller wrote:

> submodule.<name>.update can be assigned an arbitrary command via setting
> it to "!command". When this command is found in the regular config, Git
> ought to just run that command instead of other update mechanisms.
> 
> However if that command is just found in the .gitmodules file, it is
> potentially untrusted, which is why we do not run it.  Add a test
> confirming the behavior.
> 
> Suggested-by: Jonathan Nieder <jrnieder@gmail.com>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>  t/t7406-submodule-update.sh | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
> index 034914a14f..780af4e6f5 100755
> --- a/t/t7406-submodule-update.sh
> +++ b/t/t7406-submodule-update.sh
> @@ -406,6 +406,16 @@ test_expect_success 'submodule update - command in .git/config' '
>  	)
>  '
>  
> +test_expect_success 'submodule update - command in .gitmodules is ignored' '
> +	test_when_finished "git -C super reset --hard HEAD^" &&
> +
> +	git -C super config -f .gitmodules submodule.submodule.update "!false || echo >bad" &&

What does the '!false || echo >bad' do?

Ideally we want this test to be super robust: e.g. if it runs the
command but from a different directory, we still want the test to fail,
and if it runs the command but using exec instead of a shell, we still
want the test to fail.

Maybe write_script would help with this.  E.g. would something like

	test_when_finished ... &&
	write_script must_not_run.sh <<-EOF &&
	>$TEST_DIRECTORY/bad
	EOF

	git -C super config -f .gitmodules submodule.submodule.update \
		"!$TEST_DIRECTORY/must_not_run.sh" &&
	...

work?

Thanks,
Jonathan

^ permalink raw reply	[relevance 24%]

* Re: [PATCH] Documentation: consolidate submodule.<name>.update
  2017-09-25 19:17           ` Brandon Williams
@ 2017-09-25 22:18             ` Stefan Beller
  2017-09-25 22:22               ` Jonathan Nieder
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-09-25 22:18 UTC (permalink / raw)
  To: Brandon Williams; +Cc: Junio C Hamano, git, Jonathan Nieder

On Mon, Sep 25, 2017 at 12:17 PM, Brandon Williams <bmwill@google.com> wrote:
> On 09/25, Stefan Beller wrote:
>> Have one place to explain the effects of setting submodule.<name>.update
>> instead of two.
>>
>> Signed-off-by: Stefan Beller <sbeller@google.com>
>> ---
>> >> I disagree.  Actually, I think the git-config(1) blurb could just
>> >> point here, and that the text here ought to be clear about what
>> >> commands it affects and how an end user would use it.
>> >
>> > I tend to agree with the consolidation.
>>
>> Something like this?
>
> I like the consolidation, its easier to keep up to date when its only in
> one place.

After thinking about it further, I'd like to withdraw this patch
for now.

The reason is that this would also forward point to
git-submodule.txt, such that we'd still have 2 places
in which it is explained. The current situation with explaining
it in 3 places is not too bad as each place hints at their specific
circumstance:
git-submodule.txt explains the values to set itself.
gitmodules.txt explains what the possible fields in that file are,
  and regarding this field it points out it is ignored in the init call.
config.txt: actually describe the effects of the setting.

I think we can keep it as-is for now.

Thanks,
Stefan

^ permalink raw reply	[relevance 24%]

* Re: [PATCH] Documentation: consolidate submodule.<name>.update
  2017-09-25 22:18             ` Stefan Beller
@ 2017-09-25 22:22               ` Jonathan Nieder
  2017-09-25 22:42                 ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Jonathan Nieder @ 2017-09-25 22:22 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Brandon Williams, Junio C Hamano, git

Stefan Beller wrote:
> On Mon, Sep 25, 2017 at 12:17 PM, Brandon Williams <bmwill@google.com> wrote:
>> On 09/25, Stefan Beller wrote:

>>> Have one place to explain the effects of setting submodule.<name>.update
>>> instead of two.
>>>
>>> Signed-off-by: Stefan Beller <sbeller@google.com>
>>> ---
>>>>> I disagree.  Actually, I think the git-config(1) blurb could just
>>>>> point here, and that the text here ought to be clear about what
>>>>> commands it affects and how an end user would use it.
>>>>
>>>> I tend to agree with the consolidation.
>>>
>>> Something like this?
>>
>> I like the consolidation, its easier to keep up to date when its only in
>> one place.
>
> After thinking about it further, I'd like to withdraw this patch
> for now.
>
> The reason is that this would also forward point to
> git-submodule.txt, such that we'd still have 2 places
> in which it is explained. The current situation with explaining
> it in 3 places is not too bad as each place hints at their specific
> circumstance:
> git-submodule.txt explains the values to set itself.
> gitmodules.txt explains what the possible fields in that file are,
>   and regarding this field it points out it is ignored in the init call.
> config.txt: actually describe the effects of the setting.
>
> I think we can keep it as-is for now.

Sorry, I got lost.  Which state is as-is?

As a user, how do I find out what submodule.*.update is going to do
and which commands will respect it?

Maybe we should work on first wordsmithing the doc and then figuring
out where it goes?  The current state of the doc with (?) three
different texts, all wrong, doesn't seem like a good state to
preserve.

Thanks,
Jonathan

^ permalink raw reply	[relevance 22%]

* Re: [PATCH] Documentation: consolidate submodule.<name>.update
  2017-09-25 22:22               ` Jonathan Nieder
@ 2017-09-25 22:42                 ` Stefan Beller
  2017-09-25 23:44                   ` Jonathan Nieder
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-09-25 22:42 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Brandon Williams, Junio C Hamano, git

On Mon, Sep 25, 2017 at 3:22 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> Stefan Beller wrote:
>> On Mon, Sep 25, 2017 at 12:17 PM, Brandon Williams <bmwill@google.com> wrote:
>>> On 09/25, Stefan Beller wrote:
>
>>>> Have one place to explain the effects of setting submodule.<name>.update
>>>> instead of two.
>>>>
>>>> Signed-off-by: Stefan Beller <sbeller@google.com>
>>>> ---
>>>>>> I disagree.  Actually, I think the git-config(1) blurb could just
>>>>>> point here, and that the text here ought to be clear about what
>>>>>> commands it affects and how an end user would use it.
>>>>>
>>>>> I tend to agree with the consolidation.
>>>>
>>>> Something like this?
>>>
>>> I like the consolidation, its easier to keep up to date when its only in
>>> one place.
>>
>> After thinking about it further, I'd like to withdraw this patch
>> for now.
>>
>> The reason is that this would also forward point to
>> git-submodule.txt, such that we'd still have 2 places
>> in which it is explained. The current situation with explaining
>> it in 3 places is not too bad as each place hints at their specific
>> circumstance:
>> git-submodule.txt explains the values to set itself.
>> gitmodules.txt explains what the possible fields in that file are,
>>   and regarding this field it points out it is ignored in the init call.
>> config.txt: actually describe the effects of the setting.
>>
>> I think we can keep it as-is for now.
>
> Sorry, I got lost.  Which state is as-is?

By as-is I refer to origin/pu.

> As a user, how do I find out what submodule.*.update is going to do
> and which commands will respect it?

The user would discover it via 'man git-config' I would assume, which
covers any config variable? It also directs to git-submodule which is
more detailed, but the text there is suitable for the casual reader.
(pu has origin/sb/doc-config-submodule-update)

> Maybe we should work on first wordsmithing the doc and then figuring
> out where it goes?  The current state of the doc with (?) three
> different texts,

such that each different text highlights each locations purpose.

> all wrong,

Care to spell out this bold claim?

Thanks,
Stefan

^ permalink raw reply	[relevance 24%]

* [PATCH] t7406: submodule.<name>.update command must not be run from .gitmodules
  2017-09-25 20:04 ` Jonathan Nieder
@ 2017-09-25 22:50   ` Stefan Beller
  2017-09-26  0:01     ` Jonathan Nieder
  2017-09-26  5:37     ` Johannes Sixt
  0 siblings, 2 replies; 200+ results
From: Stefan Beller @ 2017-09-25 22:50 UTC (permalink / raw)
  To: jrnieder; +Cc: git, sbeller

submodule.<name>.update can be assigned an arbitrary command via setting
it to "!command". When this command is found in the regular config, Git
ought to just run that command instead of other update mechanisms.

However if that command is just found in the .gitmodules file, it is
potentially untrusted, which is why we do not run it.  Add a test
confirming the behavior.

Suggested-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
---

 updated to use the super robust script.
 Thanks Jonathan,
 
 Stefan

 t/t7406-submodule-update.sh | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index 034914a14f..d718cb00e7 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -406,6 +406,20 @@ test_expect_success 'submodule update - command in .git/config' '
 	)
 '
 
+test_expect_success 'submodule update - command in .gitmodules is ignored' '
+	test_when_finished "git -C super reset --hard HEAD^" &&
+
+	write_script must_not_run.sh <<-EOF &&
+	>$TEST_DIRECTORY/bad
+	EOF
+
+	git -C super config -f .gitmodules submodule.submodule.update "!$TEST_DIRECTORY/must_not_run.sh" &&
+	git -C super commit -a -m "add command to .gitmodules file" &&
+	git -C super/submodule reset --hard $submodulesha1^ &&
+	git -C super submodule update submodule &&
+	test_path_is_missing bad
+'
+
 cat << EOF >expect
 Execution of 'false $submodulesha1' failed in submodule path 'submodule'
 EOF
-- 
2.14.0.rc0.3.g6c2e499285


^ permalink raw reply	[relevance 33%]

* Re: [PATCH] Documentation: consolidate submodule.<name>.update
  2017-09-25 22:42                 ` Stefan Beller
@ 2017-09-25 23:44                   ` Jonathan Nieder
  0 siblings, 0 replies; 200+ results
From: Jonathan Nieder @ 2017-09-25 23:44 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Brandon Williams, Junio C Hamano, git

Stefan Beller wrote:

> By as-is I refer to origin/pu.

Ah, okay.  I'm not in that habit because pu frequently loses topics
--- it's mostly useful as a tool to (1) distribute topic branches and
(2) check whether they are compatible with each other.

[...]
> On Mon, Sep 25, 2017 at 3:22 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:

>> Maybe we should work on first wordsmithing the doc and then figuring
>> out where it goes?  The current state of the doc with (?) three
>> different texts,
>
> such that each different text highlights each locations purpose.
>
>> all wrong,
>
> Care to spell out this bold claim?

In the state of "master", "man git-config" tells me that

	[submodule "<name>"]
		update = ...

sets the "default update procedure for a submodule".  It suggests that
I read about "git submodule update" in git-submodule(1) for more
details.  This is problematic because

 1. It doesn't tell me when I should and should not set it.
 2. I would expect "git checkout --recurse-submodules" to respect the
    default update procedure.
 3. It doesn't tell me what commands like "git fetch
    --recurse-submodules" will do.
 4. It doesn't point me to better alternatives, when this
    configuration doesn't fit my use case.

"man git-submodule" doesn't have a section on submodule.<name>.update.
It has references scattered throughout the manpage.  It only tells me
how the setting affects the "git submodule update" command and doesn't
address the problems (1)-(4).

"man gitmodules" also refers to this setting as the "default update
procedure for the named submodule".  It doesn't answer the questions
(1)-(4) either.

Thanks and hope that helps,
Jonathan

^ permalink raw reply	[relevance 26%]

* Re: [PATCH] t7406: submodule.<name>.update command must not be run from .gitmodules
  2017-09-25 22:50   ` Stefan Beller
@ 2017-09-26  0:01     ` Jonathan Nieder
  2017-09-26  5:37     ` Johannes Sixt
  1 sibling, 0 replies; 200+ results
From: Jonathan Nieder @ 2017-09-26  0:01 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

Stefan Beller wrote:

> submodule.<name>.update can be assigned an arbitrary command via setting
> it to "!command". When this command is found in the regular config, Git
> ought to just run that command instead of other update mechanisms.
>
> However if that command is just found in the .gitmodules file, it is
> potentially untrusted, which is why we do not run it.  Add a test
> confirming the behavior.
>
> Suggested-by: Jonathan Nieder <jrnieder@gmail.com>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>  t/t7406-submodule-update.sh | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
>
> diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
> index 034914a14f..d718cb00e7 100755
> --- a/t/t7406-submodule-update.sh
> +++ b/t/t7406-submodule-update.sh
> @@ -406,6 +406,20 @@ test_expect_success 'submodule update - command in .git/config' '
>  	)
>  '
>  
> +test_expect_success 'submodule update - command in .gitmodules is ignored' '
> +	test_when_finished "git -C super reset --hard HEAD^" &&
> +
> +	write_script must_not_run.sh <<-EOF &&
> +	>$TEST_DIRECTORY/bad
> +	EOF
> +
> +	git -C super config -f .gitmodules submodule.submodule.update "!$TEST_DIRECTORY/must_not_run.sh" &&

Long line, but I don't think I care.  I wish there were a tool like
"make style" to format shell scripts.

> +	git -C super commit -a -m "add command to .gitmodules file" &&
> +	git -C super/submodule reset --hard $submodulesha1^ &&
> +	git -C super submodule update submodule &&
> +	test_path_is_missing bad
> +'

Per offline discussion, you tested that this fails when you use
.git/config instead of .gitmodules, so there aren't any subtle typos
here. :)

Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>

Thanks for writing it.

^ permalink raw reply	[relevance 15%]

* Re: [PATCH] t7406: submodule.<name>.update command must not be run from .gitmodules
  2017-09-25 22:50   ` Stefan Beller
  2017-09-26  0:01     ` Jonathan Nieder
@ 2017-09-26  5:37     ` Johannes Sixt
  1 sibling, 0 replies; 200+ results
From: Johannes Sixt @ 2017-09-26  5:37 UTC (permalink / raw)
  To: Stefan Beller; +Cc: jrnieder, git

Am 26.09.2017 um 00:50 schrieb Stefan Beller:
> submodule.<name>.update can be assigned an arbitrary command via setting
> it to "!command". When this command is found in the regular config, Git
> ought to just run that command instead of other update mechanisms.
> 
> However if that command is just found in the .gitmodules file, it is
> potentially untrusted, which is why we do not run it.  Add a test
> confirming the behavior.
> 
> Suggested-by: Jonathan Nieder <jrnieder@gmail.com>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
> 
>   updated to use the super robust script.
>   Thanks Jonathan,
>   
>   Stefan
> 
>   t/t7406-submodule-update.sh | 14 ++++++++++++++
>   1 file changed, 14 insertions(+)
> 
> diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
> index 034914a14f..d718cb00e7 100755
> --- a/t/t7406-submodule-update.sh
> +++ b/t/t7406-submodule-update.sh
> @@ -406,6 +406,20 @@ test_expect_success 'submodule update - command in .git/config' '
>   	)
>   '
>   
> +test_expect_success 'submodule update - command in .gitmodules is ignored' '
> +	test_when_finished "git -C super reset --hard HEAD^" &&
> +
> +	write_script must_not_run.sh <<-EOF &&
> +	>$TEST_DIRECTORY/bad
> +	EOF

I am pretty confident that this does not test what you intend to test. 
Notice that $TEST_DIRECTORY is expanded when the script is written. But 
that path contains a blank, and we have something like this in the test 
script:

	#!/bin/sh
	>/the/build/directory/t/trash directory.t7406/bad

If you inject the bug against which this test protects into 
git-submodule, you should find a file "trash" in your t directory, and 
the file "bad" still absent. Not to mention that the script fails 
because it cannot run "directory.t7406/bad".

To fix that, you should use and exported variable and access that from 
the test script, for example:

	write_script must_not_run.sh <<-\EOF &&
	>"$TEST_DIRECTORY"/bad
	EOF
...
	(
		export TEST_DIRECTORY &&
		git -C super submodule update submodule
	) &&
	test_path_is_missing bad

> +
> +	git -C super config -f .gitmodules submodule.submodule.update "!$TEST_DIRECTORY/must_not_run.sh" &&
> +	git -C super commit -a -m "add command to .gitmodules file" &&
> +	git -C super/submodule reset --hard $submodulesha1^ &&
> +	git -C super submodule update submodule &&
> +	test_path_is_missing bad
> +'
> +
>   cat << EOF >expect
>   Execution of 'false $submodulesha1' failed in submodule path 'submodule'
>   EOF
> 

-- Hannes

^ permalink raw reply	[relevance 25%]

* [ANNOUNCE] Git v2.14.2
@ 2017-09-26  6:09 Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2017-09-26  6:09 UTC (permalink / raw)
  To: git; +Cc: Linux Kernel

The latest maintenance release Git v2.14.2 is now available at the
usual places.  In addition to the cvsserver related fixes that
appear in 2.10.5, 2.11.4, 2.12.5 and 2.13.6 announced separately,
this also includes various fixes that were merged already to the
'master' branch.

The tarballs are found at:

    https://www.kernel.org/pub/software/scm/git/

The following public repositories all have a copy of the 'v2.14.2'
tag and the 'maint' branch that the tag points at:

  url = https://kernel.googlesource.com/pub/scm/git/git
  url = git://repo.or.cz/alt-git.git
  url = https://github.com/gitster/git

----------------------------------------------------------------

Git v2.14.2 Release Notes
=========================

Fixes since v2.14.1
-------------------

 * Because recent Git for Windows do come with a real msgfmt, the
   build procedure for git-gui has been updated to use it instead of a
   hand-rolled substitute.

 * "%C(color name)" in the pretty print format always produced ANSI
   color escape codes, which was an early design mistake.  They now
   honor the configuration (e.g. "color.ui = never") and also tty-ness
   of the output medium.

 * The http.{sslkey,sslCert} configuration variables are to be
   interpreted as a pathname that honors "~[username]/" prefix, but
   weren't, which has been fixed.

 * Numerous bugs in walking of reflogs via "log -g" and friends have
   been fixed.

 * "git commit" when seeing an totally empty message said "you did not
   edit the message", which is clearly wrong.  The message has been
   corrected.

 * When a directory is not readable, "gitweb" fails to build the
   project list.  Work this around by skipping such a directory.

 * A recently added test for the "credential-cache" helper revealed
   that EOF detection done around the time the connection to the cache
   daemon is torn down were flaky.  This was fixed by reacting to
   ECONNRESET and behaving as if we got an EOF.

 * Some versions of GnuPG fail to kill gpg-agent it auto-spawned
   and such a left-over agent can interfere with a test.  Work it
   around by attempting to kill one before starting a new test.

 * "git log --tag=no-such-tag" showed log starting from HEAD, which
   has been fixed---it now shows nothing.

 * The "tag.pager" configuration variable was useless for those who
   actually create tag objects, as it interfered with the use of an
   editor.  A new mechanism has been introduced for commands to enable
   pager depending on what operation is being carried out to fix this,
   and then "git tag -l" is made to run pager by default.

 * "git push --recurse-submodules $there HEAD:$target" was not
   propagated down to the submodules, but now it is.

 * Commands like "git rebase" accepted the --rerere-autoupdate option
   from the command line, but did not always use it.  This has been
   fixed.

 * "git clone --recurse-submodules --quiet" did not pass the quiet
   option down to submodules.

 * "git am -s" has been taught that some input may end with a trailer
   block that is not Signed-off-by: and it should refrain from adding
   an extra blank line before adding a new sign-off in such a case.

 * "git svn" used with "--localtime" option did not compute the tz
   offset for the timestamp in question and instead always used the
   current time, which has been corrected.

 * Memory leaks in a few error codepaths have been plugged.

 * bash 4.4 or newer gave a warning on NUL byte in command
   substitution done in "git stash"; this has been squelched.

 * "git grep -L" and "git grep --quiet -L" reported different exit
   codes; this has been corrected.

 * When handshake with a subprocess filter notices that the process
   asked for an unknown capability, Git did not report what program
   the offending subprocess was running.  This has been corrected.

 * "git apply" that is used as a better "patch -p1" failed to apply a
   taken from a file with CRLF line endings to a file with CRLF line
   endings.  The root cause was because it misused convert_to_git()
   that tried to do "safe-crlf" processing by looking at the index
   entry at the same path, which is a nonsense---in that mode, "apply"
   is not working on the data in (or derived from) the index at all.
   This has been fixed.

 * Killing "git merge --edit" before the editor returns control left
   the repository in a state with MERGE_MSG but without MERGE_HEAD,
   which incorrectly tells the subsequent "git commit" that there was
   a squash merge in progress.  This has been fixed.

 * "git archive" did not work well with pathspecs and the
   export-ignore attribute.

 * "git cvsserver" no longer is invoked by "git shell" by default,
   as it is old and largely unmaintained.

 * Various Perl scripts did not use safe_pipe_capture() instead of
   backticks, leaving them susceptible to end-user input.  They have
   been corrected.

Also contains various documentation updates and code clean-ups.

Credits go to joernchen <joernchen@phenoelit.de> for finding the
unsafe constructs in "git cvsserver", and to Jeff King at GitHub for
finding and fixing instances of the same issue in other scripts.

----------------------------------------------------------------

Changes since v2.14.1 are as follows:

Andreas Heiduk (2):
      doc: add missing values "none" and "default" for diff.wsErrorHighlight
      doc: clarify "config --bool" behaviour with empty string

Anthony Sottile (1):
      git-grep: correct exit code with --quiet and -L

Brandon Williams (2):
      submodule--helper: teach push-check to handle HEAD
      clone: teach recursive clones to respect -q

Christian Couder (2):
      refs: use skip_prefix() in ref_is_hidden()
      sub-process: print the cmd when a capability is unsupported

Dimitrios Christidis (1):
      fmt-merge-msg: fix coding style

Heiko Voigt (1):
      t5526: fix some broken && chains

Hielke Christian Braun (1):
      gitweb: skip unreadable subdirectories

Jeff King (32):
      t1414: document some reflog-walk oddities
      revision: disallow reflog walking with revs->limited
      log: clarify comment about reflog cycles
      log: do not free parents when walking reflog
      get_revision_1(): replace do-while with an early return
      rev-list: check reflog_info before showing usage
      reflog-walk: stop using fake parents
      reflog-walk: apply --since/--until to reflog dates
      check return value of verify_ref_format()
      docs/for-each-ref: update pointer to color syntax
      t: use test_decode_color rather than literal ANSI codes
      ref-filter: simplify automatic color reset
      ref-filter: abstract ref format into its own struct
      ref-filter: move need_color_reset_at_eol into ref_format
      ref-filter: provide a function for parsing sort options
      ref-filter: make parse_ref_filter_atom a private function
      ref-filter: factor out the parsing of sorting atoms
      ref-filter: pass ref_format struct to atom parsers
      color: check color.ui in git_default_config()
      for-each-ref: load config earlier
      rev-list: pass diffopt->use_colors through to pretty-print
      pretty: respect color settings for %C placeholders
      ref-filter: consult want_color() before emitting colors
      t6018: flesh out empty input/output rev-list tests
      revision: add rev_input_given flag
      rev-list: don't show usage when we see empty ref patterns
      revision: do not fallback to default when rev_input_given is set
      hashcmp: use memcmp instead of open-coded loop
      doc: fix typo in sendemail.identity
      shell: drop git-cvsserver support by default
      archimport: use safe_pipe_capture for user input
      cvsimport: shell-quote variable used in backticks

Johannes Schindelin (2):
      run_processes_parallel: change confusing task_cb convention
      git-gui (MinGW): make use of MSys2's msgfmt

Jonathan Nieder (4):
      vcs-svn: remove more unused prototypes and declarations
      vcs-svn: remove custom mode constants
      vcs-svn: remove repo_delete wrapper function
      vcs-svn: move remaining repo_tree functions to fast_export.h

Jonathan Tan (7):
      fsck: remove redundant parse_tree() invocation
      object: remove "used" field from struct object
      fsck: cleanup unused variable
      Documentation: migrate sub-process docs to header
      sub-process: refactor handshake to common function
      tests: ensure fsck fails on corrupt packfiles
      Doc: clarify that pack-objects makes packs, plural

Junio C Hamano (11):
      http.c: http.sslcert and http.sslkey are both pathnames
      perl/Git.pm: typofix in a comment
      Prepare for 2.14.2
      RelNotes: further fixes for 2.14.2 from the master front
      cvsserver: move safe_pipe_capture() to the main package
      cvsserver: use safe_pipe_capture for `constant commands` as well
      Git 2.10.5
      Git 2.11.4
      Git 2.12.5
      Git 2.13.6
      Git 2.14.2

Kaartic Sivaraam (1):
      commit: check for empty message before the check for untouched template

Kevin Daudt (1):
      stash: prevent warning about null bytes in input

Lars Schneider (7):
      t0021: keep filter log files on comparison
      t0021: make debug log file name configurable
      t0021: write "OUT <size>" only on success
      convert: put the flags field before the flag itself for consistent style
      convert: move multiple file filter error handling to separate function
      convert: refactor capabilities negotiation
      convert: add "status=delayed" to filter process protocol

Martin Ågren (7):
      builtin.h: take over documentation from api-builtin.txt
      git.c: let builtins opt for handling `pager.foo` themselves
      git.c: provide setup_auto_pager()
      t7006: add tests for how git tag paginates
      tag: respect `pager.tag` in list-mode only
      tag: change default of `pager.tag` to "on"
      git.c: ignore pager.* when launching builtin as dashed external

Michael Forney (1):
      scripts: use "git foo" not "git-foo"

Michael J Gruber (6):
      Documentation: use proper wording for ref format strings
      Documentation/git-for-each-ref: clarify peeling of tags for --format
      Documentation/git-merge: explain --continue
      merge: clarify call chain
      merge: split write_merge_state in two
      merge: save merge state earlier

Philip Oakley (4):
      git-gui: remove duplicate entries from .gitconfig's gui.recentrepo
      git gui: cope with duplicates in _get_recentrepo
      git gui: de-dup selected repo from recentrepo history
      git gui: allow for a long recentrepo list

Phillip Wood (7):
      am: remember --rerere-autoupdate setting
      rebase: honor --rerere-autoupdate
      rebase -i: honor --rerere-autoupdate
      t3504: use test_commit
      cherry-pick/revert: remember --rerere-autoupdate
      cherry-pick/revert: reject --rerere-autoupdate when continuing
      am: fix signoff when other trailers are present

Ramsay Jones (2):
      credential-cache: interpret an ECONNRESET as an EOF
      builtin/add: add detail to a 'cannot chmod' error message

René Scharfe (23):
      bswap: convert to unsigned before shifting in get_be32
      bswap: convert get_be16, get_be32 and put_be32 to inline functions
      add MOVE_ARRAY
      use MOVE_ARRAY
      apply: use COPY_ARRAY and MOVE_ARRAY in update_image()
      ls-files: don't try to prune an empty index
      dir: support platforms that require aligned reads
      pack-objects: remove unnecessary NULL check
      t0001: skip test with restrictive permissions if getpwd(3) respects them
      test-path-utils: handle const parameter of basename and dirname
      t3700: fix broken test under !POSIXPERM
      t4062: use less than 256 repetitions in regex
      sha1_file: avoid comparison if no packed hash matches the first byte
      apply: remove prefix_length member from apply_state
      merge: use skip_prefix()
      win32: plug memory leak on realloc() failure in syslog()
      fsck: free buffers on error in fsck_obj()
      sha1_file: release delta_stack on error in unpack_entry()
      t1002: stop using sum(1)
      t5001: add tests for export-ignore attributes and exclude pathspecs
      archive: factor out helper functions for handling attributes
      archive: don't queue excluded directories
      commit: remove unused inline function single_parent()

Santiago Torres (1):
      t: lib-gpg: flush gpg agent on startup

Stefan Beller (3):
      t8008: rely on rev-parse'd HEAD instead of sha1 value
      sha1_file: make read_info_alternates static
      submodule.sh: remove unused variable

Torsten Bögershausen (2):
      convert: add SAFE_CRLF_KEEP_CRLF
      apply: file commited with CRLF should roundtrip diff and apply

Urs Thuermann (1):
      git svn fetch: Create correct commit timestamp when using --localtime

brian m. carlson (2):
      vcs-svn: remove unused prototypes
      vcs-svn: rename repo functions to "svn_repo"

joernchen (1):
      cvsserver: use safe_pipe_capture instead of backticks

Ævar Arnfjörð Bjarmason (1):
      tests: don't give unportable ">" to "test" built-in, use -gt



^ permalink raw reply	[relevance 11%]

* [PATCH] submodule: correct error message for missing commits.
@ 2017-09-26 18:27 Stefan Beller
  2017-09-26 23:37 ` Jacob Keller
  2017-09-27  0:52 ` Junio C Hamano
  0 siblings, 2 replies; 200+ results
From: Stefan Beller @ 2017-09-26 18:27 UTC (permalink / raw)
  To: git; +Cc: jacob.keller, Stefan Beller

When a submodule diff should be displayed we currently just add the
submodule objects to the main object store and then e.g. walk the
revision graph and create a summary for that submodule.

It is possible that we are missing the submodule either completely or
partially, which we currently differentiate with different error messages
depending on whether (1) the whole submodule object store is missing or
(2) just the needed for this particular diff. (1) is reported as
"not initialized", and (2) is reported as "commits not present".

If a submodule is deinit'ed its repository data is still around inside
the superproject, such that the diff can still be produced. In that way
the error message (1) is misleading as we can have a diff despite the
submodule being not initialized.

Downgrade the error message (1) to be the same as (2) and just say
the commits are not present, as that is the true reason why the diff
cannot be shown.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 submodule.c                               | 2 +-
 t/t4059-diff-submodule-not-initialized.sh | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/submodule.c b/submodule.c
index 6531c5d609..280c246477 100644
--- a/submodule.c
+++ b/submodule.c
@@ -567,7 +567,7 @@ static void show_submodule_header(FILE *f, const char *path,
 
 	if (add_submodule_odb(path)) {
 		if (!message)
-			message = "(not initialized)";
+			message = "(commits not present)";
 		goto output_header;
 	}
 
diff --git a/t/t4059-diff-submodule-not-initialized.sh b/t/t4059-diff-submodule-not-initialized.sh
index cd70fd5192..49bca7b48d 100755
--- a/t/t4059-diff-submodule-not-initialized.sh
+++ b/t/t4059-diff-submodule-not-initialized.sh
@@ -95,7 +95,7 @@ test_expect_success 'submodule not initialized in new clone' '
 	git clone . sm3 &&
 	git -C sm3 diff-tree -p --no-commit-id --submodule=log HEAD >actual &&
 	cat >expected <<-EOF &&
-	Submodule sm1 $smhead1...$smhead2 (not initialized)
+	Submodule sm1 $smhead1...$smhead2 (commits not present)
 	EOF
 	test_cmp expected actual
 '
-- 
2.14.0.rc0.3.g6c2e499285


^ permalink raw reply	[relevance 32%]

* [PATCH] t7406: submodule.<name>.update command must not be run from .gitmodules
  2017-09-26  6:28 [PATCH] t7406: submodule.<name>.update command must not be run from .gitmodules Junio C Hamano
@ 2017-09-26 18:54 ` Stefan Beller
  2017-09-26 19:46   ` Johannes Sixt
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-09-26 18:54 UTC (permalink / raw)
  To: gitster; +Cc: git, j6t, jrnieder, sbeller

submodule.<name>.update can be assigned an arbitrary command via setting
it to "!command". When this command is found in the regular config, Git
ought to just run that command instead of other update mechanisms.

However if that command is just found in the .gitmodules file, it is
potentially untrusted, which is why we do not run it.  Add a test
confirming the behavior.

Suggested-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
---

Johannes wrote:
> I am pretty confident that this does not test what you intend to
> test. Notice that $TEST_DIRECTORY is expanded when the script is
> written. But that path contains a blank, and we have something like
> this in the test script:
>
>       #!/bin/sh
>       >/the/build/directory/t/trash directory.t7406/bad

I can confirm that.

Instead of mucking around with writing a script, "to make it robust",
I decided to got the simplest route and just have the command "!false",
which would make "git submodule update submodule" return an error.

Thanks,
Stefan


 t/t7406-submodule-update.sh | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index 034914a14f..a9ea098e55 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -406,6 +406,16 @@ test_expect_success 'submodule update - command in .git/config' '
 	)
 '
 
+test_expect_success 'submodule update - command in .gitmodules is ignored' '
+	test_when_finished "git -C super reset --hard HEAD^" &&
+
+	git -C super config -f .gitmodules submodule.submodule.update "!false" &&
+	git -C super commit -a -m "add command to .gitmodules file" &&
+	git -C super/submodule reset --hard $submodulesha1^ &&
+	git -C super submodule update submodule &&
+	test_path_is_missing bad
+'
+
 cat << EOF >expect
 Execution of 'false $submodulesha1' failed in submodule path 'submodule'
 EOF
-- 
2.14.0.rc0.3.g6c2e499285


^ permalink raw reply	[relevance 32%]

* Re: [PATCH] t7406: submodule.<name>.update command must not be run from .gitmodules
  2017-09-26 18:54 ` Stefan Beller
@ 2017-09-26 19:46   ` Johannes Sixt
  2017-09-26 19:54     ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Johannes Sixt @ 2017-09-26 19:46 UTC (permalink / raw)
  To: Stefan Beller; +Cc: gitster, git, jrnieder

Am 26.09.2017 um 20:54 schrieb Stefan Beller:
> +test_expect_success 'submodule update - command in .gitmodules is ignored' '
> +	test_when_finished "git -C super reset --hard HEAD^" &&
> +
> +	git -C super config -f .gitmodules submodule.submodule.update "!false" &&
> +	git -C super commit -a -m "add command to .gitmodules file" &&
> +	git -C super/submodule reset --hard $submodulesha1^ &&
> +	git -C super submodule update submodule &&
> +	test_path_is_missing bad

This test for a missing file is certainly a remnant from the previous 
iteration, isn't it?

-- Hannes

^ permalink raw reply	[relevance 15%]

* [PATCH] t7406: submodule.<name>.update command must not be run from .gitmodules
  2017-09-26 19:46   ` Johannes Sixt
@ 2017-09-26 19:54     ` Stefan Beller
  2017-09-27  3:21       ` Junio C Hamano
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-09-26 19:54 UTC (permalink / raw)
  To: j6t; +Cc: git, gitster, jrnieder, sbeller

submodule.<name>.update can be assigned an arbitrary command via setting
it to "!command". When this command is found in the regular config, Git
ought to just run that command instead of other update mechanisms.

However if that command is just found in the .gitmodules file, it is
potentially untrusted, which is why we do not run it.  Add a test
confirming the behavior.

Suggested-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
---

> This test for a missing file is certainly a remnant from the
> previous iteration, isn't it?

Yes. This is a good indicator I need some vacation.

Thanks,
Stefan

 t/t7406-submodule-update.sh | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index 034914a14f..6f083c4d68 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -406,6 +406,14 @@ test_expect_success 'submodule update - command in .git/config' '
 	)
 '
 
+test_expect_success 'submodule update - command in .gitmodules is ignored' '
+	test_when_finished "git -C super reset --hard HEAD^" &&
+	git -C super config -f .gitmodules submodule.submodule.update "!false" &&
+	git -C super commit -a -m "add command to .gitmodules file" &&
+	git -C super/submodule reset --hard $submodulesha1^ &&
+	git -C super submodule update submodule
+'
+
 cat << EOF >expect
 Execution of 'false $submodulesha1' failed in submodule path 'submodule'
 EOF
-- 
2.14.0.rc0.3.g6c2e499285


^ permalink raw reply	[relevance 33%]

* [PATCH] technical doc: add a design doc for hash function transition
  2017-09-26 22:11 RFC v3: Another proposed hash function transition plan Johannes Schindelin
@ 2017-09-26 22:25 ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-09-26 22:25 UTC (permalink / raw)
  To: johannes.schindelin; +Cc: bmwill, david, demerphq, git, gitster, jason, jonathantanmy, jrnieder, peff, sandals, sbeller, torvalds, Jonathan Nieder

From: Jonathan Nieder <jrn@google.com>

This is "RFC v3: Another proposed hash function transition plan" from
the git mailing list.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
---

 This takes the original Google Doc[1] and adds it to our history,
 such that the discussion can be on on list and in the commit messages.
 
 * replaced SHA3-256 with NEWHASH, sha3 with newhash
 * added section 'Implementation plan'
 * added section 'Future work'
 * added section 'Agreed-upon criteria for selecting NewHash'
 
 As the discussion restarts again, here is our attempt
 to add value to the discussion, we planned to polish it more, but as the
 discussion is restarting, we might just post it as-is.
  
 Thanks.

[1] https://docs.google.com/document/d/18hYAQCTsDgaFUo-VJGhT0UqyetL2LbAzkWNK1fYS8R0/edit

 Documentation/Makefile                             |   1 +
 .../technical/hash-function-transition.txt         | 571 +++++++++++++++++++++
 2 files changed, 572 insertions(+)
 create mode 100644 Documentation/technical/hash-function-transition.txt

diff --git a/Documentation/Makefile b/Documentation/Makefile
index 2415e0d657..471bb29725 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -67,6 +67,7 @@ SP_ARTICLES += howto/maintain-git
 API_DOCS = $(patsubst %.txt,%,$(filter-out technical/api-index-skel.txt technical/api-index.txt, $(wildcard technical/api-*.txt)))
 SP_ARTICLES += $(API_DOCS)
 
+TECH_DOCS += technical/hash-function-transition
 TECH_DOCS += technical/http-protocol
 TECH_DOCS += technical/index-format
 TECH_DOCS += technical/pack-format
diff --git a/Documentation/technical/hash-function-transition.txt b/Documentation/technical/hash-function-transition.txt
new file mode 100644
index 0000000000..0ac751d600
--- /dev/null
+++ b/Documentation/technical/hash-function-transition.txt
@@ -0,0 +1,571 @@
+Git hash function transition
+============================
+
+Objective
+---------
+Migrate Git from SHA-1 to a stronger hash function.
+
+Background
+----------
+At its core, the Git version control system is a content addressable
+filesystem. It uses the SHA-1 hash function to name content. For
+example, files, directories, and revisions are referred to by hash
+values unlike in other traditional version control systems where files
+or versions are referred to via sequential numbers. The use of a hash
+function to address its content delivers a few advantages:
+
+* Integrity checking is easy. Bit flips, for example, are easily
+  detected, as the hash of corrupted content does not match its name.
+* Lookup of objects is fast.
+
+Using a cryptographically secure hash function brings additional
+advantages:
+
+* Object names can be signed and third parties can trust the hash to
+  address the signed object and all objects it references.
+* Communication using Git protocol and out of band communication
+  methods have a short reliable string that can be used to reliably
+  address stored content.
+
+Over time some flaws in SHA-1 have been discovered by security
+researchers. https://shattered.io demonstrated a practical SHA-1 hash
+collision. As a result, SHA-1 cannot be considered cryptographically
+secure any more. This impacts the communication of hash values because
+we cannot trust that a given hash value represents the known good
+version of content that the speaker intended.
+
+SHA-1 still possesses the other properties such as fast object lookup
+and safe error checking, but other hash functions are equally suitable
+that are believed to be cryptographically secure.
+
+Goals
+-----
+1. The transition to NEWHASH can be done one local repository at a time.
+   a. Requiring no action by any other party.
+   b. A NEWHASH repository can communicate with SHA-1 Git servers
+      (push/fetch).
+   c. Users can use SHA-1 and NEWHASH identifiers for objects
+      interchangeably.
+   d. New signed objects make use of a stronger hash function than
+      SHA-1 for their security guarantees.
+2. Allow a complete transition away from SHA-1.
+   a. Local metadata for SHA-1 compatibility can be removed from a
+      repository if compatibility with SHA-1 is no longer needed.
+3. Maintainability throughout the process.
+   a. The object format is kept simple and consistent.
+   b. Creation of a generalized repository conversion tool.
+
+Non-Goals
+---------
+1. Add NEWHASH support to Git protocol. This is valuable and the
+   logical next step but it is out of scope for this initial design.
+2. Transparently improving the security of existing SHA-1 signed
+   objects.
+3. Intermixing objects using multiple hash functions in a single
+   repository.
+4. Taking the opportunity to fix other bugs in git's formats and
+   protocols.
+5. Shallow clones and fetches into a NEWHASH repository. (This will
+   change when we add NEWHASH support to Git protocol.)
+6. Skip fetching some submodules of a project into a NEWHASH
+   repository. (This also depends on NEWHASH support in Git
+   protocol.)
+
+Overview
+--------
+We introduce a new repository format extension `newhash`. Repositories
+with this extension enabled use NEWHASH instead of SHA-1 to name
+their objects. This affects both object names and object content ---
+both the names of objects and all references to other objects within
+an object are switched to the new hash function.
+
+newhash repositories cannot be read by older versions of Git.
+
+Alongside the packfile, a newhash repository stores a bidirectional
+mapping between newhash and sha1 object names in a new format of .idx files.
+The mapping is generated locally and can be verified using "git fsck".
+Object lookups use this mapping to allow naming objects using either
+their sha1 and newhash names interchangeably.
+
+"git cat-file" and "git hash-object" gain options to display an object
+in its sha1 form and write an object given its sha1 form. This
+requires all objects referenced by that object to be present in the
+object database so that they can be named using the appropriate name
+(using the bidirectional hash mapping).
+
+Fetches from a SHA-1 based server convert the fetched objects into
+newhash form and record the mapping in the bidirectional mapping table
+(see below for details). Pushes to a SHA-1 based server convert the
+objects being pushed into sha1 form so the server does not have to be
+aware of the hash function the client is using.
+
+Detailed Design
+---------------
+Object names
+~~~~~~~~~~~~
+Objects can be named by their 40 hexadecimal digit sha1-name or <n>
+hexadecimal digit newhash-name, plus names derived from those (see
+gitrevisions(7)).
+
+The sha1-name of an object is the SHA-1 of the concatenation of its
+type, length, a nul byte, and the object's sha1-content. This is the
+traditional <sha1> used in Git to name objects.
+
+The newhash-name of an object is the NEWHASH of the concatenation of its
+type, length, a nul byte, and the object's newhash-content.
+
+Object format
+~~~~~~~~~~~~~
+The content as a byte sequence of a tag, commit, or tree object named
+by sha1 and newhash differ because an object named by newhash-name refers to
+other objects by their newhash-names and an object named by sha1-name
+refers to other objects by their sha1-names.
+
+The newhash-content of an object is the same as its sha1-content, except
+that objects referenced by the object are named using their newhash-names
+instead of sha1-names. Because a blob object does not refer to any
+other object, its sha1-content and newhash-content are the same.
+
+The format allows round-trip conversion between newhash-content and
+sha1-content.
+
+Object storage
+~~~~~~~~~~~~~~
+Loose objects use zlib compression and packed objects use the packed
+format described in Documentation/technical/pack-format.txt, just like
+today. The content that is compressed and stored uses newhash-content
+instead of sha1-content.
+
+Translation table
+~~~~~~~~~~~~~~~~~
+A fast bidirectional mapping between sha1-names and newhash-names of all
+local objects in the repository is kept on disk.
+
+For pack files, upgrade the .idx file to be as follows:
+
+  4 magic bytes
+  header, containing pointers to the 3 lists below
+
+  list of
+  abbrev sha1 -> ordinal, sorted by sha1
+
+  list of
+  abbrev newhash -> ordinal, sorted by newhash
+
+  list of
+  ordinal, complete sha1, complete new hash,
+  sorted by ordinal, such that a lookup can be computed after looking into
+  one of the first lists.
+
+For unpacked objects, keep a simple list
+  sha1 -> newhash
+around at $OBJECT_DIR/loose-lookup
+
+All operations that make new objects (e.g., "git commit") add the new
+objects to the translation table.
+
+(This work could have been deferred to push time, but that would
+significantly complicate and slow down pushes. Calculating the
+sha1-name at object creation time at the same time it is being
+streamed to disk and having its newhash-name calculated should be an
+acceptable cost.)
+
+Reading an object's sha1-content
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The sha1-content of an object can be read by converting all newhash-names
+its newhash-content references to sha1-names using the translation table.
+
+Fetch
+~~~~~
+Fetching from a SHA-1 based server requires translating between SHA-1
+and NEWHASH based representations on the fly.
+
+SHA-1s named in the ref advertisement that are present on the client
+can be translated to NEWHASH and looked up as local objects using the
+translation table.
+
+Negotiation proceeds as today. Any "have"s generated locally are
+converted to SHA-1 before being sent to the server, and SHA-1s
+mentioned by the server are converted to NEWHASH when looking them up
+locally.
+
+After negotiation, the server sends a packfile containing the
+requested objects. We convert the packfile to NEWHASH format using
+the following steps:
+
+1. index-pack: inflate each object in the packfile and compute its
+   SHA-1. Objects can contain deltas in OBJ_REF_DELTA format against
+   objects the client has locally. These objects can be looked up
+   using the translation table and their sha1-content read as
+   described above to resolve the deltas.
+2. topological sort: starting at the "want"s from the negotiation
+   phase, walk through objects in the pack and emit a list of them,
+   excluding blobs, in reverse topologically sorted order, with each
+   object coming later in the list than all objects it references.
+   (This list only contains objects reachable from the "wants". If the
+   pack from the server contained additional extraneous objects, then
+   they will be discarded.)
+3. convert to newhash: open a new (newhash) packfile. Read the topologically
+   sorted list just generated. For each object, inflate its
+   sha1-content, convert to newhash-content, and write it to the newhash
+   pack. Include the new sha1<->newhash mapping entry in the translation
+   table.
+4. sort: reorder entries in the new pack to match the order of objects
+   in the pack the server generated and include blobs. Write a newhash idx
+   file.
+5. clean up: remove the SHA-1 based pack file, index, and
+   topologically sorted list obtained from the server and steps 1
+   and 2.
+
+Step 3 requires every object referenced by the new object to be in the
+translation table. This is why the topological sort step is necessary.
+
+As an optimization, step 1 could write a file describing what non-blob
+objects each object it has inflated from the packfile references. This
+makes the topological sort in step 2 possible without inflating the
+objects in the packfile for a second time. The objects need to be
+inflated again in step 3, for a total of two inflations.
+
+Step 4 is probably necessary for good read-time performance. "git
+pack-objects" on the server optimizes the pack file for good data
+locality (see Documentation/technical/pack-heuristics.txt).
+
+Details of this process are likely to change. It will take some
+experimenting to get this to perform well.
+
+Push
+~~~~
+Push is simpler than fetch because the objects referenced by the
+pushed objects are already in the translation table. The sha1-content
+of each object being pushed can be read as described in the "Reading
+an object's sha1-content" section to generate the pack written by git
+send-pack.
+
+Signed Commits
+~~~~~~~~~~~~~~
+We add a new field "gpgsig-newhash" to the commit object format to allow
+signing commits without relying on SHA-1. It is similar to the
+existing "gpgsig" field. Its signed payload is the newhash-content of the
+commit object with any "gpgsig" and "gpgsig-newhash" fields removed.
+
+This means commits can be signed
+1. using SHA-1 only, as in existing signed commit objects
+2. using both SHA-1 and NEWHASH, by using both gpgsig-newhash and gpgsig
+   fields.
+3. using only NEWHASH, by only using the gpgsig-newhash field.
+
+Old versions of "git verify-commit" can verify the gpgsig signature in
+cases (1) and (2) without modifications and view case (3) as an
+ordinary unsigned commit.
+
+Signed Tags
+~~~~~~~~~~~
+We add a new field "gpgsig-newhash" to the tag object format to allow
+signing tags without relying on SHA-1. Its signed payload is the
+newhash-content of the tag with its gpgsig-newhash field and "-----BEGIN PGP
+SIGNATURE-----" delimited in-body signature removed.
+
+This means tags can be signed
+1. using SHA-1 only, as in existing signed tag objects
+2. using both SHA-1 and NEWHASH, by using gpgsig-newhash and an in-body
+   signature.
+3. using only NEWHASH, by only using the gpgsig-newhash field.
+
+Mergetag embedding
+~~~~~~~~~~~~~~~~~~
+The mergetag field in the sha1-content of a commit contains the
+sha1-content of a tag that was merged by that commit.
+
+The mergetag field in the newhash-content of the same commit contains the
+newhash-content of the same tag.
+
+Submodules
+~~~~~~~~~~
+To convert recorded submodule pointers, you need to have the converted
+submodule repository in place. The translation table of the submodule
+can be used to look up the new hash.
+
+Caveats
+-------
+Invalid objects
+~~~~~~~~~~~~~~~
+The conversion from sha1-content to newhash-content retains any
+brokenness in the original object (e.g., tree entry modes encoded with
+leading 0, tree objects whose paths are not sorted correctly, and
+commit objects without an author or committer). This is a deliberate
+feature of the design to allow the conversion to round-trip.
+
+More profoundly broken objects (e.g., a commit with a truncated "tree"
+header line) cannot be converted but were not usable by current Git
+anyway.
+
+Shallow clone and submodules
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Because it requires all referenced objects to be available in the
+locally generated translation table, this design does not support
+shallow clone or unfetched submodules. Protocol improvements might
+allow lifting this restriction.
+
+Alternates
+~~~~~~~~~~
+For the same reason, a newhash repository cannot borrow objects from a
+sha1 repository using objects/info/alternates or
+$GIT_ALTERNATE_OBJECT_REPOSITORIES.
+
+git notes
+~~~~~~~~~
+The "git notes" tool annotates objects using their sha1-name as key.
+This design does not describe a way to migrate notes trees to use
+newhash-names. That migration is expected to happen separately (for
+example using a file at the root of the notes tree to describe which
+hash it uses).
+
+Server-side cost
+~~~~~~~~~~~~~~~~
+Until Git protocol gains NEWHASH support, using newhash based storage on
+public-facing Git servers is strongly discouraged. Once Git protocol
+gains NEWHASH support, newhash based servers are likely not to support
+sha1 compatibility, to avoid what may be a very expensive hash
+reencode during clone and to encourage peers to modernize.
+
+The design described here allows fetches by SHA-1 clients of a
+personal NEWHASH repository because it's not much more difficult than
+allowing pushes from that repository. This support needs to be guarded
+by a configuration option --- servers like git.kernel.org that serve a
+large number of clients would not be expected to bear that cost.
+
+Meaning of signatures
+~~~~~~~~~~~~~~~~~~~~~
+The signed payload for signed commits and tags does not explicitly
+name the hash used to identify objects. If some day Git adopts a new
+hash function with the same length as the current SHA-1 (40
+hexadecimal digit) or NEWHASH (64 hexadecimal digit) objects then the
+intent behind the PGP signed payload in an object signature is
+unclear:
+
+	object e7e07d5a4fcc2a203d9873968ad3e6bd4d7419d7
+	type commit
+	tag v2.12.0
+	tagger Junio C Hamano <gitster@pobox.com> 1487962205 -0800
+
+	Git 2.12
+
+Does this mean Git v2.12.0 is the commit with sha1-name
+e7e07d5a4fcc2a203d9873968ad3e6bd4d7419d7 or the commit with
+new-40-digit-hash-name e7e07d5a4fcc2a203d9873968ad3e6bd4d7419d7?
+
+Fortunately NEWHASH and SHA-1 have different lengths. If Git starts
+using another hash with the same length to name objects, then it will
+need to change the format of signed payloads using that hash to
+address this issue.
+
+Alternatives considered
+-----------------------
+Upgrading everyone working on a particular project on a flag day
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Projects like the Linux kernel are large and complex enough that
+flipping the switch for all projects based on the repository at once
+is infeasible.
+
+Not only would all developers and server operators supporting
+developers have to switch on the same flag day, but supporting tooling
+(continuous integration, code review, bug trackers, etc) would have to
+be adapted as well. This also makes it difficult to get early feedback
+from some project participants testing before it is time for mass
+adoption.
+
+Using hash functions in parallel
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+(e.g. https://public-inbox.org/git/22708.8913.864049.452252@chiark.greenend.org.uk/ )
+Objects newly created would be addressed by the new hash, but inside
+such an object (e.g. commit) it is still possible to address objects
+using the old hash function.
+* You cannot trust its history (needed for bisectability) in the
+  future without further work
+* Maintenance burden as the number of supported hash functions grows
+  (they will never go away, so they accumulate). In this proposal, by
+  comparison, converted objects lose all references to SHA-1.
+
+Signed objects with multiple hashes
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Instead of introducing the gpgsig-newhash field in commit and tag objects
+for newhash-content based signatures, an earlier version of this design
+added "hash newhash <newhash-name>" fields to strengthen the existing
+sha1-content based signatures.
+
+In other words, a single signature was used to attest to the object
+content using both hash functions. This had some advantages:
+* Using one signature instead of two speeds up the signing process.
+* Having one signed payload with both hashes allows the signer to
+  attest to the sha1-name and newhash-name referring to the same object.
+* All users consume the same signature. Broken signatures are likely
+  to be detected quickly using current versions of git.
+
+However, it also came with disadvantages:
+* Verifying a signed object requires access to the sha1-names of all
+  objects it references, even after the transition is complete and
+  translation table is no longer needed for anything else. To support
+  this, the design added fields such as "hash sha1 tree <sha1-name>"
+  and "hash sha1 parent <sha1-name>" to the newhash-content of a signed
+  commit, complicating the conversion process.
+* Allowing signed objects without a sha1 (for after the transition is
+  complete) complicated the design further, requiring a "nohash sha1"
+  field to suppress including "hash sha1" fields in the newhash-content
+  and signed payload.
+
+
+Implementation plan
+-------------------
+
+Here's a rough list of some useful tasks, in no particular order:
+
+1. bc/object-id: This patch series continues, eliminating assumptions
+   about the size of object ids by encapsulating them in a struct.
+   One straightforward way to find code that still needs to be
+   converted is to grep for "sha" --- often the conversion patches
+   change function and variable names to refer to oid_ where they used
+   to use sha1_, making the stragglers easier to spot.
+
+2. Hard-coded object ids in tests: Many tests beyond t00* make assumptions
+   about the exact values of object ids.  That's bad for maintainability
+   for other reasons beyond the hash function transition, too.
+
+   It should be possible to suss them out by patching git's sha1
+   routine to use the ones-complement of sha1 (~sha1) instead and
+   seeing which tests fail.
+
+3. Repository format extension to use a different hash function: we
+   want git to be able to work with two hash functions: sha1 and
+   something else.  For interoperability and simplity, it is useful
+   for a single git binary to support both hash functions.
+
+   That means a repository needs to be able to specify what hash
+   function is used for the objects in that repository.  This can be
+   configured by setting '[core] repositoryformatversion=1' (to avoid
+   confusing old versions of git) and
+   '[extensions] experimentalNewHashFunction = true'.
+   Documentation/technical/repository-version.txt has more details.
+
+   We can start experimenting with this using e.g. the ~sha1 function
+   described at (2), or the 160-bit hash of the patch author's choice
+   (e.g. truncated blake2bp-256).
+
+4. When choosing a hash function, people may argue about performance.
+   It would be useful for run some benchmarks for git (running
+   the test suite, t/perf tests, etc) using a variety of hash
+   functions as input to such a discussion.
+
+5. Longer hash: Even once all object id references in git use struct
+   object_id (see (1)), we need to tackle other assumptions about
+   object id size in git and its tests.
+
+   It should be possible to suss them out by replacing git's sha1
+   routine with a 40-byte hash: sha1 with each byte repeated (sha1+sha1)
+   and seeing what fails.
+
+6. Repository format extension for longer hash: As in (3), we could
+   add a repository format extension to experiment with using the
+   sha1+sha1 function.
+
+7. Avoiding wasted memory from unused hash functions: struct object_id
+   has definition 'unsigned char hash[GIT_MAX_RAWSZ]', where
+   GIT_MAX_RAWSZ is the size of the largest supported hash function.
+   When operating on a repository that only uses sha1, this wastes
+   memory.
+
+   Avoid that by making object identifiers variable-sized.  That is,
+   something like
+
+     struct object_id {
+        union {
+           unsigned char hash20[20];
+           unsigned char hash32[32];
+        } *hash;
+     }
+
+   or
+
+     struct object_id {
+       unsigned char *hash;
+     }
+
+   The hard part is that allocation and destruction have to be
+   explicit instead of happening automatically when an object_id is an
+   automatic variable.
+
+8. Implementation of this plan (roughly in order):
+   - abstract the hash computation to be able to plug in another hash
+   - make the choice of hash dependant on repository extension
+   - implement the new .idx format
+   - implement cat-file's flag to show things in old/new hash
+   - convert fetch, push
+
+9. We can use help from security experts in all of this.  Fuzzing,
+   analysis of how we use cryptography, security review of other parts
+   of the design, and information to help choose a hash function are
+   all appreciated.
+
+Agreed-upon criteria for selecting NewHash
+------------------------------------------
+
+The discussion which hash function to use is going in circles, so let's
+first argree on criteria on how to select the new hash function. These
+could include:
+* cryptografic strength
+* performance
+* other cryptografic aspects(?)
+* portability / availability of properly licensed implementations
+
+Future work
+-----------
+
+* other compression instead of zlib (this is a stated non goal, though!)
+* rehash discussion whether to include generation numbers natively
+  (this is a stated non goal, though!)
+* describing (1) the possibility of caching translated objects
+* and (2) protocol changes.
+* other format changes
+
+Document History
+----------------
+
+2017-03-03
+bmwill@google.com, jonathantanmy@google.com, jrnieder@gmail.com,
+sbeller@google.com
+
+Initial version sent to
+http://public-inbox.org/git/20170304011251.GA26789@aiede.mtv.corp.google.com
+
+2017-03-03 jrnieder@gmail.com
+Incorporated suggestions from jonathantanmy and sbeller:
+* describe purpose of signed objects with each hash type
+* redefine signed object verification using object content under the
+  first hash function
+
+2017-03-06 jrnieder@gmail.com
+* Use SHA3-256 instead of SHA2 (thanks, Linus and brian m. carlson).[1][2]
+* Make sha3-based signatures a separate field, avoiding the need for
+  "hash" and "nohash" fields (thanks to peff[3]).
+* Add a sorting phase to fetch (thanks to Junio for noticing the need
+  for this).
+* Omit blobs from the topological sort during fetch (thanks to peff).
+* Discuss alternates, git notes, and git servers in the caveats
+  section (thanks to Junio Hamano, brian m. carlson[4], and Shawn
+  Pearce).
+* Clarify language throughout (thanks to various commenters,
+  especially Junio).
+
+[1] http://public-inbox.org/git/CA+55aFzJtejiCjV0e43+9oR3QuJK2PiFiLQemytoLpyJWe6P9w@mail.gmail.com/
+[2] http://public-inbox.org/git/CA+55aFz+gkAsDZ24zmePQuEs1XPS9BP_s8O7Q4wQ7LV7X5-oDA@mail.gmail.com/
+[3] http://public-inbox.org/git/20170306084353.nrns455dvkdsfgo5@sigill.intra.peff.net/
+[4] http://public-inbox.org/git/20170304224936.rqqtkdvfjgyezsht@genre.crustytoothpaste.net
+
+2017-09-25
+* replaced SHA3-256 with NEWHASH, sha3 with newhash
+* added section 'Implementation plan'
+* added section 'Future work'
+
+* This version is sent to the list; to be incorporated into git.git, such
+  that further document history is found using git-log.
+
+
-- 
2.14.0.rc0.3.g6c2e499285


^ permalink raw reply	[relevance 7%]

* Re: [PATCH] submodule: correct error message for missing commits.
  2017-09-26 18:27 [PATCH] submodule: correct error message for missing commits Stefan Beller
@ 2017-09-26 23:37 ` Jacob Keller
  2017-09-27  0:52 ` Junio C Hamano
  1 sibling, 0 replies; 200+ results
From: Jacob Keller @ 2017-09-26 23:37 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Git mailing list

On Tue, Sep 26, 2017 at 11:27 AM, Stefan Beller <sbeller@google.com> wrote:
> When a submodule diff should be displayed we currently just add the
> submodule objects to the main object store and then e.g. walk the
> revision graph and create a summary for that submodule.
>
> It is possible that we are missing the submodule either completely or
> partially, which we currently differentiate with different error messages
> depending on whether (1) the whole submodule object store is missing or
> (2) just the needed for this particular diff. (1) is reported as
> "not initialized", and (2) is reported as "commits not present".
>
> If a submodule is deinit'ed its repository data is still around inside
> the superproject, such that the diff can still be produced. In that way
> the error message (1) is misleading as we can have a diff despite the
> submodule being not initialized.
>
> Downgrade the error message (1) to be the same as (2) and just say
> the commits are not present, as that is the true reason why the diff
> cannot be shown.
>
> Signed-off-by: Stefan Beller <sbeller@google.com>

This makes sense to me.

Thanks,
Jake

> ---
>  submodule.c                               | 2 +-
>  t/t4059-diff-submodule-not-initialized.sh | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/submodule.c b/submodule.c
> index 6531c5d609..280c246477 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -567,7 +567,7 @@ static void show_submodule_header(FILE *f, const char *path,
>
>         if (add_submodule_odb(path)) {
>                 if (!message)
> -                       message = "(not initialized)";
> +                       message = "(commits not present)";
>                 goto output_header;
>         }
>
> diff --git a/t/t4059-diff-submodule-not-initialized.sh b/t/t4059-diff-submodule-not-initialized.sh
> index cd70fd5192..49bca7b48d 100755
> --- a/t/t4059-diff-submodule-not-initialized.sh
> +++ b/t/t4059-diff-submodule-not-initialized.sh
> @@ -95,7 +95,7 @@ test_expect_success 'submodule not initialized in new clone' '
>         git clone . sm3 &&
>         git -C sm3 diff-tree -p --no-commit-id --submodule=log HEAD >actual &&
>         cat >expected <<-EOF &&
> -       Submodule sm1 $smhead1...$smhead2 (not initialized)
> +       Submodule sm1 $smhead1...$smhead2 (commits not present)
>         EOF
>         test_cmp expected actual
>  '
> --
> 2.14.0.rc0.3.g6c2e499285
>

^ permalink raw reply	[relevance 15%]

* Re: [PATCH] submodule: correct error message for missing commits.
  2017-09-26 18:27 [PATCH] submodule: correct error message for missing commits Stefan Beller
  2017-09-26 23:37 ` Jacob Keller
@ 2017-09-27  0:52 ` Junio C Hamano
  1 sibling, 0 replies; 200+ results
From: Junio C Hamano @ 2017-09-27  0:52 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, jacob.keller

Stefan Beller <sbeller@google.com> writes:

> When a submodule diff should be displayed we currently just add the
> submodule objects to the main object store and then e.g. walk the
> revision graph and create a summary for that submodule.
>
> It is possible that we are missing the submodule either completely or
> partially, which we currently differentiate with different error messages
> depending on whether (1) the whole submodule object store is missing or
> (2) just the needed for this particular diff. (1) is reported as
> "not initialized", and (2) is reported as "commits not present".
>
> If a submodule is deinit'ed its repository data is still around inside
> the superproject, such that the diff can still be produced. In that way
> the error message (1) is misleading as we can have a diff despite the
> submodule being not initialized.
>
> Downgrade the error message (1) to be the same as (2) and just say
> the commits are not present, as that is the true reason why the diff
> cannot be shown.
>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---

Sounds good.  Thanks for working on it.


^ permalink raw reply	[relevance 15%]

* Re: [PATCH] t7406: submodule.<name>.update command must not be run from .gitmodules
  2017-09-26 19:54     ` Stefan Beller
@ 2017-09-27  3:21       ` Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2017-09-27  3:21 UTC (permalink / raw)
  To: Stefan Beller; +Cc: j6t, git, jrnieder

Stefan Beller <sbeller@google.com> writes:

> submodule.<name>.update can be assigned an arbitrary command via setting
> it to "!command". When this command is found in the regular config, Git
> ought to just run that command instead of other update mechanisms.
>
> However if that command is just found in the .gitmodules file, it is
> potentially untrusted, which is why we do not run it.  Add a test
> confirming the behavior.
>
> Suggested-by: Jonathan Nieder <jrnieder@gmail.com>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---

Earlier, we saw:

    Ideally we want this test to be super robust: e.g. if it runs the
    command but from a different directory, we still want the test to fail,
    and if it runs the command but using exec instead of a shell, we still
    want the test to fail.

and this one (i.e. signal that it is a command by prefixing with
'!', and then have a single command that would fail whether it is
run via run_command() with or without shell) would satisfy that
criteria, I would think.

>> This test for a missing file is certainly a remnant from the
>> previous iteration, isn't it?
>
> Yes. This is a good indicator I need some vacation.

Or just take a deep breath before making a knee-jerk reaction public
and instead double-check before sending things out ;-)

Will queue.  Thanks.

>
> Thanks,
> Stefan
>
>  t/t7406-submodule-update.sh | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
> index 034914a14f..6f083c4d68 100755
> --- a/t/t7406-submodule-update.sh
> +++ b/t/t7406-submodule-update.sh
> @@ -406,6 +406,14 @@ test_expect_success 'submodule update - command in .git/config' '
>  	)
>  '
>  
> +test_expect_success 'submodule update - command in .gitmodules is ignored' '
> +	test_when_finished "git -C super reset --hard HEAD^" &&
> +	git -C super config -f .gitmodules submodule.submodule.update "!false" &&
> +	git -C super commit -a -m "add command to .gitmodules file" &&
> +	git -C super/submodule reset --hard $submodulesha1^ &&
> +	git -C super submodule update submodule
> +'
> +
>  cat << EOF >expect
>  Execution of 'false $submodulesha1' failed in submodule path 'submodule'
>  EOF

^ permalink raw reply	[relevance 15%]

* [PATCH v4] technical doc: add a design doc for hash function transition
  2017-03-09 19:14     ` Shawn Pearce
@ 2017-09-28  4:43       ` Jonathan Nieder
  2017-09-29  6:06         ` Junio C Hamano
  0 siblings, 1 reply; 200+ results
From: Jonathan Nieder @ 2017-09-28  4:43 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Linus Torvalds, Git Mailing List, Stefan Beller, bmwill, Jonathan Tan, Jeff King, David Lang, brian m. carlson, Masaya Suzuki, demerphq, The Keccak Team, Johannes Schindelin

This document describes what a transition to a new hash function for
Git would look like.  Add it to Documentation/technical/ as the plan
of record so that future changes can be recorded as patches.

Also-by: Brandon Williams <bmwill@google.com>
Also-by: Jonathan Tan <jonathantanmy@google.com>
Also-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
On Thu, Mar 09, 2017 at 11:14 AM, Shawn Pearce wrote:
> On Mon, Mar 6, 2017 at 4:17 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:

>> Thanks for the kind words on what had quite a few flaws still.  Here's
>> a new draft.  I think the next version will be a patch against
>> Documentation/technical/.
>
> FWIW, I like this approach.

Okay, here goes.

Instead of sharding the loose object translation tables by first byte,
we went for a single table.  It simplifies the design and we need to
keep the number of loose objects under control anyway.

We also included a description of the transition plan and tried to
include a summary of what has been agreed upon so far about the choice
of hash function.

Thanks to Junio for reviving the discussion and in particular to Dscho
for pushing this forward and making the missing pieces clearer.

Thoughts of all kinds welcome, as always.

 Documentation/Makefile                             |   1 +
 .../technical/hash-function-transition.txt         | 797 +++++++++++++++++++++
 2 files changed, 798 insertions(+)
 create mode 100644 Documentation/technical/hash-function-transition.txt

diff --git a/Documentation/Makefile b/Documentation/Makefile
index 2415e0d657..471bb29725 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -67,6 +67,7 @@ SP_ARTICLES += howto/maintain-git
 API_DOCS = $(patsubst %.txt,%,$(filter-out technical/api-index-skel.txt technical/api-index.txt, $(wildcard technical/api-*.txt)))
 SP_ARTICLES += $(API_DOCS)
 
+TECH_DOCS += technical/hash-function-transition
 TECH_DOCS += technical/http-protocol
 TECH_DOCS += technical/index-format
 TECH_DOCS += technical/pack-format
diff --git a/Documentation/technical/hash-function-transition.txt b/Documentation/technical/hash-function-transition.txt
new file mode 100644
index 0000000000..417ba491d0
--- /dev/null
+++ b/Documentation/technical/hash-function-transition.txt
@@ -0,0 +1,797 @@
+Git hash function transition
+============================
+
+Objective
+---------
+Migrate Git from SHA-1 to a stronger hash function.
+
+Background
+----------
+At its core, the Git version control system is a content addressable
+filesystem. It uses the SHA-1 hash function to name content. For
+example, files, directories, and revisions are referred to by hash
+values unlike in other traditional version control systems where files
+or versions are referred to via sequential numbers. The use of a hash
+function to address its content delivers a few advantages:
+
+* Integrity checking is easy. Bit flips, for example, are easily
+  detected, as the hash of corrupted content does not match its name.
+* Lookup of objects is fast.
+
+Using a cryptographically secure hash function brings additional
+advantages:
+
+* Object names can be signed and third parties can trust the hash to
+  address the signed object and all objects it references.
+* Communication using Git protocol and out of band communication
+  methods have a short reliable string that can be used to reliably
+  address stored content.
+
+Over time some flaws in SHA-1 have been discovered by security
+researchers. https://shattered.io demonstrated a practical SHA-1 hash
+collision. As a result, SHA-1 cannot be considered cryptographically
+secure any more. This impacts the communication of hash values because
+we cannot trust that a given hash value represents the known good
+version of content that the speaker intended.
+
+SHA-1 still possesses the other properties such as fast object lookup
+and safe error checking, but other hash functions are equally suitable
+that are believed to be cryptographically secure.
+
+Goals
+-----
+Where NewHash is a strong 256-bit hash function to replace SHA-1 (see
+"Selection of a New Hash", below):
+
+1. The transition to NewHash can be done one local repository at a time.
+   a. Requiring no action by any other party.
+   b. A NewHash repository can communicate with SHA-1 Git servers
+      (push/fetch).
+   c. Users can use SHA-1 and NewHash identifiers for objects
+      interchangeably (see "Object names on the command line", below).
+   d. New signed objects make use of a stronger hash function than
+      SHA-1 for their security guarantees.
+2. Allow a complete transition away from SHA-1.
+   a. Local metadata for SHA-1 compatibility can be removed from a
+      repository if compatibility with SHA-1 is no longer needed.
+3. Maintainability throughout the process.
+   a. The object format is kept simple and consistent.
+   b. Creation of a generalized repository conversion tool.
+
+Non-Goals
+---------
+1. Add NewHash support to Git protocol. This is valuable and the
+   logical next step but it is out of scope for this initial design.
+2. Transparently improving the security of existing SHA-1 signed
+   objects.
+3. Intermixing objects using multiple hash functions in a single
+   repository.
+4. Taking the opportunity to fix other bugs in Git's formats and
+   protocols.
+5. Shallow clones and fetches into a NewHash repository. (This will
+   change when we add NewHash support to Git protocol.)
+6. Skip fetching some submodules of a project into a NewHash
+   repository. (This also depends on NewHash support in Git
+   protocol.)
+
+Overview
+--------
+We introduce a new repository format extension. Repositories with this
+extension enabled use NewHash instead of SHA-1 to name their objects.
+This affects both object names and object content --- both the names
+of objects and all references to other objects within an object are
+switched to the new hash function.
+
+NewHash repositories cannot be read by older versions of Git.
+
+Alongside the packfile, a NewHash repository stores a bidirectional
+mapping between NewHash and SHA-1 object names. The mapping is generated
+locally and can be verified using "git fsck". Object lookups use this
+mapping to allow naming objects using either their SHA-1 and NewHash names
+interchangeably.
+
+"git cat-file" and "git hash-object" gain options to display an object
+in its sha1 form and write an object given its sha1 form. This
+requires all objects referenced by that object to be present in the
+object database so that they can be named using the appropriate name
+(using the bidirectional hash mapping).
+
+Fetches from a SHA-1 based server convert the fetched objects into
+NewHash form and record the mapping in the bidirectional mapping table
+(see below for details). Pushes to a SHA-1 based server convert the
+objects being pushed into sha1 form so the server does not have to be
+aware of the hash function the client is using.
+
+Detailed Design
+---------------
+Repository format extension
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+A NewHash repository uses repository format version `1` (see
+Documentation/technical/repository-version.txt) with extensions
+`objectFormat` and `compatObjectFormat`:
+
+	[core]
+		repositoryFormatVersion = 1
+	[extensions]
+		objectFormat = newhash
+		compatObjectFormat = sha1
+
+Specifying a repository format extension ensures that versions of Git
+not aware of NewHash do not try to operate on these repositories,
+instead producing an error message:
+
+	$ git status
+	fatal: unknown repository extensions found:
+		objectformat
+		compatobjectformat
+
+See the "Transition plan" section below for more details on these
+repository extensions.
+
+Object names
+~~~~~~~~~~~~
+Objects can be named by their 40 hexadecimal digit sha1-name or 64
+hexadecimal digit newhash-name, plus names derived from those (see
+gitrevisions(7)).
+
+The sha1-name of an object is the SHA-1 of the concatenation of its
+type, length, a nul byte, and the object's sha1-content. This is the
+traditional <sha1> used in Git to name objects.
+
+The newhash-name of an object is the NewHash of the concatenation of its
+type, length, a nul byte, and the object's newhash-content.
+
+Object format
+~~~~~~~~~~~~~
+The content as a byte sequence of a tag, commit, or tree object named
+by sha1 and newhash differ because an object named by newhash-name refers to
+other objects by their newhash-names and an object named by sha1-name
+refers to other objects by their sha1-names.
+
+The newhash-content of an object is the same as its sha1-content, except
+that objects referenced by the object are named using their newhash-names
+instead of sha1-names. Because a blob object does not refer to any
+other object, its sha1-content and newhash-content are the same.
+
+The format allows round-trip conversion between newhash-content and
+sha1-content.
+
+Object storage
+~~~~~~~~~~~~~~
+Loose objects use zlib compression and packed objects use the packed
+format described in Documentation/technical/pack-format.txt, just like
+today. The content that is compressed and stored uses newhash-content
+instead of sha1-content.
+
+Pack index
+~~~~~~~~~~
+Pack index (.idx) files use a new v3 format that supports multiple
+hash functions. They have the following format (all integers are in
+network byte order):
+
+- A header appears at the beginning and consists of the following:
+  - The 4-byte pack index signature: '\377t0c'
+  - 4-byte version number: 3
+  - 4-byte length of the header section, including the signature and
+    version number
+  - 4-byte number of objects contained in the pack
+  - 4-byte number of object formats in this pack index: 2
+  - For each object format:
+    - 4-byte format identifier (e.g., 'sha1' for SHA-1)
+    - 4-byte length in bytes of shortened object names. This is the
+      shortest possible length needed to make names in the shortened
+      object name table unambiguous.
+    - 4-byte integer, recording where tables relating to this format
+      are stored in this index file, as an offset from the beginning.
+  - 4-byte offset to the trailer from the beginning of this file.
+  - Zero or more additional key/value pairs (4-byte key, 4-byte
+    value). Only one key is supported: 'PSRC'. See the "Loose objects
+    and unreachable objects" section for supported values and how this
+    is used.  All other keys are reserved. Readers must ignore
+    unrecognized keys.
+- Zero or more NUL bytes. This can optionally be used to improve the
+  alignment of the full object name table below.
+- Tables for the first object format:
+  - A sorted table of shortened object names.  These are prefixes of
+    the names of all objects in this pack file, packed together
+    without offset values to reduce the cache footprint of the binary
+    search for a specific object name.
+
+  - A table of full object names in pack order. This allows resolving
+    a reference to "the nth object in the pack file" (from a
+    reachability bitmap or from the next table of another object
+    format) to its object name.
+
+  - A table of 4-byte values mapping object name order to pack order.
+    For an object in the table of sorted shortened object names, the
+    value at the corresponding index in this table is the index in the
+    previous table for that same object.
+
+    This can be used to look up the object in reachability bitmaps or
+    to look up its name in another object format.
+
+  - A table of 4-byte CRC32 values of the packed object data, in the
+    order that the objects appear in the pack file. This is to allow
+    compressed data to be copied directly from pack to pack during
+    repacking without undetected data corruption.
+
+  - A table of 4-byte offset values. For an object in the table of
+    sorted shortened object names, the value at the corresponding
+    index in this table indicates where that object can be found in
+    the pack file. These are usually 31-bit pack file offsets, but
+    large offsets are encoded as an index into the next table with the
+    most significant bit set.
+
+  - A table of 8-byte offset entries (empty for pack files less than
+    2 GiB). Pack files are organized with heavily used objects toward
+    the front, so most object references should not need to refer to
+    this table.
+- Zero or more NUL bytes.
+- Tables for the second object format, with the same layout as above,
+  up to and not including the table of CRC32 values.
+- Zero or more NUL bytes.
+- The trailer consists of the following:
+  - A copy of the 20-byte NewHash checksum at the end of the
+    corresponding packfile.
+
+  - 20-byte NewHash checksum of all of the above.
+
+Loose object index
+~~~~~~~~~~~~~~~~~~
+A new file $GIT_OBJECT_DIR/loose-object-idx contains information about
+all loose objects. Its format is
+
+  # loose-object-idx
+  (newhash-name SP sha1-name LF)*
+
+where the object names are in hexadecimal format. The file is not
+sorted.
+
+The loose object index is protected against concurrent writes by a
+lock file $GIT_OBJECT_DIR/loose-object-idx.lock. To add a new loose
+object:
+
+1. Write the loose object to a temporary file, like today.
+2. Open loose-object-idx.lock with O_CREAT | O_EXCL to acquire the lock.
+3. Rename the loose object into place.
+4. Open loose-object-idx with O_APPEND and write the new object
+5. Unlink loose-object-idx.lock to release the lock.
+
+To remove entries (e.g. in "git pack-refs" or "git-prune"):
+
+1. Open loose-object-idx.lock with O_CREAT | O_EXCL to acquire the
+   lock.
+2. Write the new content to loose-object-idx.lock.
+3. Unlink any loose objects being removed.
+4. Rename to replace loose-object-idx, releasing the lock.
+
+Translation table
+~~~~~~~~~~~~~~~~~
+The index files support a bidirectional mapping between sha1-names
+and newhash-names. The lookup proceeds similarly to ordinary object
+lookups. For example, to convert a sha1-name to a newhash-name:
+
+ 1. Look for the object in idx files. If a match is present in the
+    idx's sorted list of truncated sha1-names, then:
+    a. Read the corresponding entry in the sha1-name order to pack
+       name order mapping.
+    b. Read the corresponding entry in the full sha1-name table to
+       verify we found the right object. If it is, then
+    c. Read the corresponding entry in the full newhash-name table.
+       That is the object's newhash-name.
+ 2. Check for a loose object. Read lines from loose-object-idx until
+    we find a match.
+
+Step (1) takes the same amount of time as an ordinary object lookup:
+O(number of packs * log(objects per pack)). Step (2) takes O(number of
+loose objects) time. To maintain good performance it will be necessary
+to keep the number of loose objects low. See the "Loose objects and
+unreachable objects" section below for more details.
+
+Since all operations that make new objects (e.g., "git commit") add
+the new objects to the corresponding index, this mapping is possible
+for all objects in the object store.
+
+Reading an object's sha1-content
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The sha1-content of an object can be read by converting all newhash-names
+its newhash-content references to sha1-names using the translation table.
+
+Fetch
+~~~~~
+Fetching from a SHA-1 based server requires translating between SHA-1
+and NewHash based representations on the fly.
+
+SHA-1s named in the ref advertisement that are present on the client
+can be translated to NewHash and looked up as local objects using the
+translation table.
+
+Negotiation proceeds as today. Any "have"s generated locally are
+converted to SHA-1 before being sent to the server, and SHA-1s
+mentioned by the server are converted to NewHash when looking them up
+locally.
+
+After negotiation, the server sends a packfile containing the
+requested objects. We convert the packfile to NewHash format using
+the following steps:
+
+1. index-pack: inflate each object in the packfile and compute its
+   SHA-1. Objects can contain deltas in OBJ_REF_DELTA format against
+   objects the client has locally. These objects can be looked up
+   using the translation table and their sha1-content read as
+   described above to resolve the deltas.
+2. topological sort: starting at the "want"s from the negotiation
+   phase, walk through objects in the pack and emit a list of them,
+   excluding blobs, in reverse topologically sorted order, with each
+   object coming later in the list than all objects it references.
+   (This list only contains objects reachable from the "wants". If the
+   pack from the server contained additional extraneous objects, then
+   they will be discarded.)
+3. convert to newhash: open a new (newhash) packfile. Read the topologically
+   sorted list just generated. For each object, inflate its
+   sha1-content, convert to newhash-content, and write it to the newhash
+   pack. Record the new sha1<->newhash mapping entry for use in the idx.
+4. sort: reorder entries in the new pack to match the order of objects
+   in the pack the server generated and include blobs. Write a newhash idx
+   file
+5. clean up: remove the SHA-1 based pack file, index, and
+   topologically sorted list obtained from the server in steps 1
+   and 2.
+
+Step 3 requires every object referenced by the new object to be in the
+translation table. This is why the topological sort step is necessary.
+
+As an optimization, step 1 could write a file describing what non-blob
+objects each object it has inflated from the packfile references. This
+makes the topological sort in step 2 possible without inflating the
+objects in the packfile for a second time. The objects need to be
+inflated again in step 3, for a total of two inflations.
+
+Step 4 is probably necessary for good read-time performance. "git
+pack-objects" on the server optimizes the pack file for good data
+locality (see Documentation/technical/pack-heuristics.txt).
+
+Details of this process are likely to change. It will take some
+experimenting to get this to perform well.
+
+Push
+~~~~
+Push is simpler than fetch because the objects referenced by the
+pushed objects are already in the translation table. The sha1-content
+of each object being pushed can be read as described in the "Reading
+an object's sha1-content" section to generate the pack written by git
+send-pack.
+
+Signed Commits
+~~~~~~~~~~~~~~
+We add a new field "gpgsig-newhash" to the commit object format to allow
+signing commits without relying on SHA-1. It is similar to the
+existing "gpgsig" field. Its signed payload is the newhash-content of the
+commit object with any "gpgsig" and "gpgsig-newhash" fields removed.
+
+This means commits can be signed
+1. using SHA-1 only, as in existing signed commit objects
+2. using both SHA-1 and NewHash, by using both gpgsig-newhash and gpgsig
+   fields.
+3. using only NewHash, by only using the gpgsig-newhash field.
+
+Old versions of "git verify-commit" can verify the gpgsig signature in
+cases (1) and (2) without modifications and view case (3) as an
+ordinary unsigned commit.
+
+Signed Tags
+~~~~~~~~~~~
+We add a new field "gpgsig-newhash" to the tag object format to allow
+signing tags without relying on SHA-1. Its signed payload is the
+newhash-content of the tag with its gpgsig-newhash field and "-----BEGIN PGP
+SIGNATURE-----" delimited in-body signature removed.
+
+This means tags can be signed
+1. using SHA-1 only, as in existing signed tag objects
+2. using both SHA-1 and NewHash, by using gpgsig-newhash and an in-body
+   signature.
+3. using only NewHash, by only using the gpgsig-newhash field.
+
+Mergetag embedding
+~~~~~~~~~~~~~~~~~~
+The mergetag field in the sha1-content of a commit contains the
+sha1-content of a tag that was merged by that commit.
+
+The mergetag field in the newhash-content of the same commit contains the
+newhash-content of the same tag.
+
+Submodules
+~~~~~~~~~~
+To convert recorded submodule pointers, you need to have the converted
+submodule repository in place. The translation table of the submodule
+can be used to look up the new hash.
+
+Loose objects and unreachable objects
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Fast lookups in the loose-object-idx require that the number of loose
+objects not grow too high.
+
+"git gc --auto" currently waits for there to be 6700 loose objects
+present before consolidating them into a packfile. We will need to
+measure to find a more appropriate threshold for it to use.
+
+"git gc --auto" currently waits for there to be 50 packs present
+before combining packfiles. Packing loose objects more aggressively
+may cause the number of pack files to grow too quickly. This can be
+mitigated by using a strategy similar to Martin Fick's exponential
+rolling garbage collection script:
+https://gerrit-review.googlesource.com/c/gerrit/+/35215
+
+"git gc" currently expels any unreachable objects it encounters in
+pack files to loose objects in an attempt to prevent a race when
+pruning them (in case another process is simultaneously writing a new
+object that refers to the about-to-be-deleted object). This leads to
+an explosion in the number of loose objects present and disk space
+usage due to the objects in delta form being replaced with independent
+loose objects.  Worse, the race is still present for loose objects.
+
+Instead, "git gc" will need to move unreachable objects to a new
+packfile marked as UNREACHABLE_GARBAGE (using the PSRC field; see
+below). To avoid the race when writing new objects referring to an
+about-to-be-deleted object, code paths that write new objects will
+need to copy any objects from UNREACHABLE_GARBAGE packs that they
+refer to to new, non-UNREACHABLE_GARBAGE packs (or loose objects).
+UNREACHABLE_GARBAGE are then safe to delete if their creation time (as
+indicated by the file's mtime) is long enough ago.
+
+To avoid a proliferation of UNREACHABLE_GARBAGE packs, they can be
+combined under certain circumstances. If "gc.garbageTtl" is set to
+greater than one day, then packs created within a single calendar day,
+UTC, can be coalesced together. The resulting packfile would have an
+mtime before midnight on that day, so this makes the effective maximum
+ttl the garbageTtl + 1 day. If "gc.garbageTtl" is less than one day,
+then we divide the calendar day into intervals one-third of that ttl
+in duration. Packs created within the same interval can be coalesced
+together. The resulting packfile would have an mtime before the end of
+the interval, so this makes the effective maximum ttl equal to the
+garbageTtl * 4/3.
+
+This rule comes from Thirumala Reddy Mutchukota's JGit change
+https://git.eclipse.org/r/90465.
+
+The UNREACHABLE_GARBAGE setting goes in the PSRC field of the pack
+index. More generally, that field indicates where a pack came from:
+
+ - 1 (PACK_SOURCE_RECEIVE) for a pack received over the network
+ - 2 (PACK_SOURCE_AUTO) for a pack created by a lightweight
+   "gc --auto" operation
+ - 3 (PACK_SOURCE_GC) for a pack created by a full gc
+ - 4 (PACK_SOURCE_UNREACHABLE_GARBAGE) for potential garbage
+   discovered by gc
+ - 5 (PACK_SOURCE_INSERT) for locally created objects that were
+   written directly to a pack file, e.g. from "git add ."
+
+This information can be useful for debugging and for "gc --auto" to
+make appropriate choices about which packs to coalesce.
+
+Caveats
+-------
+Invalid objects
+~~~~~~~~~~~~~~~
+The conversion from sha1-content to newhash-content retains any
+brokenness in the original object (e.g., tree entry modes encoded with
+leading 0, tree objects whose paths are not sorted correctly, and
+commit objects without an author or committer). This is a deliberate
+feature of the design to allow the conversion to round-trip.
+
+More profoundly broken objects (e.g., a commit with a truncated "tree"
+header line) cannot be converted but were not usable by current Git
+anyway.
+
+Shallow clone and submodules
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Because it requires all referenced objects to be available in the
+locally generated translation table, this design does not support
+shallow clone or unfetched submodules. Protocol improvements might
+allow lifting this restriction.
+
+Alternates
+~~~~~~~~~~
+For the same reason, a newhash repository cannot borrow objects from a
+sha1 repository using objects/info/alternates or
+$GIT_ALTERNATE_OBJECT_REPOSITORIES.
+
+git notes
+~~~~~~~~~
+The "git notes" tool annotates objects using their sha1-name as key.
+This design does not describe a way to migrate notes trees to use
+newhash-names. That migration is expected to happen separately (for
+example using a file at the root of the notes tree to describe which
+hash it uses).
+
+Server-side cost
+~~~~~~~~~~~~~~~~
+Until Git protocol gains NewHash support, using NewHash based storage
+on public-facing Git servers is strongly discouraged. Once Git
+protocol gains NewHash support, NewHash based servers are likely not
+to support SHA-1 compatibility, to avoid what may be a very expensive
+hash reencode during clone and to encourage peers to modernize.
+
+The design described here allows fetches by SHA-1 clients of a
+personal NewHash repository because it's not much more difficult than
+allowing pushes from that repository. This support needs to be guarded
+by a configuration option --- servers like git.kernel.org that serve a
+large number of clients would not be expected to bear that cost.
+
+Meaning of signatures
+~~~~~~~~~~~~~~~~~~~~~
+The signed payload for signed commits and tags does not explicitly
+name the hash used to identify objects. If some day Git adopts a new
+hash function with the same length as the current SHA-1 (40
+hexadecimal digit) or NewHash (64 hexadecimal digit) objects then the
+intent behind the PGP signed payload in an object signature is
+unclear:
+
+	object e7e07d5a4fcc2a203d9873968ad3e6bd4d7419d7
+	type commit
+	tag v2.12.0
+	tagger Junio C Hamano <gitster@pobox.com> 1487962205 -0800
+
+	Git 2.12
+
+Does this mean Git v2.12.0 is the commit with sha1-name
+e7e07d5a4fcc2a203d9873968ad3e6bd4d7419d7 or the commit with
+new-40-digit-hash-name e7e07d5a4fcc2a203d9873968ad3e6bd4d7419d7?
+
+Fortunately NewHash and SHA-1 have different lengths. If Git starts
+using another hash with the same length to name objects, then it will
+need to change the format of signed payloads using that hash to
+address this issue.
+
+Object names on the command line
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+To support the transition (see Transition plan below), this design
+supports four different modes of operation:
+
+ 1. ("dark launch") Treat object names input by the user as SHA-1 and
+    convert any object names written to output to SHA-1, but store
+    objects using NewHash.  This allows users to test the code with no
+    visible behavior change except for performance.  This allows
+    allows running even tests that assume the SHA-1 hash function, to
+    sanity-check the behavior of the new mode.
+
+ 2. ("early transition") Allow both SHA-1 and NewHash object names in
+    input. Any object names written to output use SHA-1. This allows
+    users to continue to make use of SHA-1 to communicate with peers
+    (e.g. by email) that have not migrated yet and prepares for mode 3.
+
+ 3. ("late transition") Allow both SHA-1 and NewHash object names in
+    input. Any object names written to output use NewHash. In this
+    mode, users are using a more secure object naming method by
+    default.  The disruption is minimal as long as most of their peers
+    are in mode 2 or mode 3.
+
+ 4. ("post-transition") Treat object names input by the user as
+    NewHash and write output using NewHash. This is safer than mode 3
+    because there is less risk that input is incorrectly interpreted
+    using the wrong hash function.
+
+The mode is specified in configuration.
+
+The user can also explicitly specify which format to use for a
+particular revision specifier and for output, overriding the mode. For
+example:
+
+git --output-format=sha1 log abac87a^{sha1}..f787cac^{newhash}
+
+Selection of a New Hash
+-----------------------
+In early 2005, around the time that Git was written,  Xiaoyun Wang,
+Yiqun Lisa Yin, and Hongbo Yu announced an attack finding SHA-1
+collisions in 2^69 operations. In August they published details.
+Luckily, no practical demonstrations of a collision in full SHA-1 were
+published until 10 years later, in 2017.
+
+The hash function NewHash to replace SHA-1 should be stronger than
+SHA-1 was: we would like it to be trustworthy and useful in practice
+for at least 10 years.
+
+Some other relevant properties:
+
+1. A 256-bit hash (long enough to match common security practice; not
+   excessively long to hurt performance and disk usage).
+
+2. High quality implementations should be widely available (e.g. in
+   OpenSSL).
+
+3. The hash function's properties should match Git's needs (e.g. Git
+   requires collision and 2nd preimage resistance and does not require
+   length extension resistance).
+
+4. As a tiebreaker, the hash should be fast to compute (fortunately
+   many contenders are faster than SHA-1).
+
+Some hashes under consideration are SHA-256, SHA-512/256, SHA-256x16,
+K12, and BLAKE2bp-256.
+
+Transition plan
+---------------
+Some initial steps can be implemented independently of one another:
+- adding a hash function API (vtable)
+- teaching fsck to tolerate the gpgsig-newhash field
+- excluding gpgsig-* from the fields copied by "git commit --amend"
+- annotating tests that depend on SHA-1 values with a SHA1 test
+  prerequisite
+- using "struct object_id", GIT_MAX_RAWSZ, and GIT_MAX_HEXSZ
+  consistently instead of "unsigned char *" and the hardcoded
+  constants 20 and 40.
+- introducing index v3
+- adding support for the PSRC field and safer object pruning
+
+
+The first user-visible change is the introduction of the objectFormat
+extension (without compatObjectFormat). This requires:
+- implementing the loose-object-idx
+- teaching fsck about this mode of operation
+- using the hash function API (vtable) when computing object names
+- signing objects and verifying signatures
+- rejecting attempts to fetch from or push to an incompatible
+  repository
+
+Next comes introduction of compatObjectFormat:
+- translating object names between object formats
+- translating object content between object formats
+- generating and verifying signatures in the compat format
+- adding appropriate index entries when adding a new object to the
+  object store
+- --output-format option
+- ^{sha1} and ^{newhash} revision notation
+- configuration to specify default input and output format (see
+  "Object names on the command line" above)
+
+The next step is supporting fetches and pushes to SHA-1 repositories:
+- allow pushes to a repository using the compat format
+- generate a topologically sorted list of the SHA-1 names of fetched
+  objects
+- convert the fetched packfile to newhash format and generate an idx
+  file
+- re-sort to match the order of objects in the fetched packfile
+
+The infrastructure supporting fetch also allows converting an existing
+repository. In converted repositories and new clones, end users can
+gain support for the new hash function without any visible change in
+behavior (see "dark launch" in the "Object names on the command line"
+section). In particular this allows users to verify NewHash signatures
+on objects in the repository, and it should ensure the transition code
+is stable in production in preparation for using it more widely.
+
+Over time projects would encourage their users to adopt the "early
+transition" and then "late transition" modes to take advantage of the
+new, more futureproof NewHash object names.
+
+When objectFormat and compatObjectFormat are both set, commands
+generating signatures would generate both SHA-1 and NewHash signatures
+by default to support both new and old users.
+
+In projects using NewHash heavily, users could be encouraged to adopt
+the "post-transition" mode to avoid accidentally making implicit use
+of SHA-1 object names.
+
+Once a critical mass of users have upgraded to a version of Git that
+can verify NewHash signatures and have converted their existing
+repositories to support verifying them, we can add support for a
+setting to generate only NewHash signatures. This is expected to be at
+least a year later.
+
+That is also a good moment to advertise the ability to convert
+repositories to use NewHash only, stripping out all SHA-1 related
+metadata. This improves performance by eliminating translation
+overhead and security by avoiding the possibility of accidentally
+relying on the safety of SHA-1.
+
+Updating Git's protocols to allow a server to specify which hash
+functions it supports is also an important part of this transition. It
+is not discussed in detail in this document but this transition plan
+assumes it happens. :)
+
+Alternatives considered
+-----------------------
+Upgrading everyone working on a particular project on a flag day
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Projects like the Linux kernel are large and complex enough that
+flipping the switch for all projects based on the repository at once
+is infeasible.
+
+Not only would all developers and server operators supporting
+developers have to switch on the same flag day, but supporting tooling
+(continuous integration, code review, bug trackers, etc) would have to
+be adapted as well. This also makes it difficult to get early feedback
+from some project participants testing before it is time for mass
+adoption.
+
+Using hash functions in parallel
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+(e.g. https://public-inbox.org/git/22708.8913.864049.452252@chiark.greenend.org.uk/ )
+Objects newly created would be addressed by the new hash, but inside
+such an object (e.g. commit) it is still possible to address objects
+using the old hash function.
+* You cannot trust its history (needed for bisectability) in the
+  future without further work
+* Maintenance burden as the number of supported hash functions grows
+  (they will never go away, so they accumulate). In this proposal, by
+  comparison, converted objects lose all references to SHA-1.
+
+Signed objects with multiple hashes
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Instead of introducing the gpgsig-newhash field in commit and tag objects
+for newhash-content based signatures, an earlier version of this design
+added "hash newhash <newhash-name>" fields to strengthen the existing
+sha1-content based signatures.
+
+In other words, a single signature was used to attest to the object
+content using both hash functions. This had some advantages:
+* Using one signature instead of two speeds up the signing process.
+* Having one signed payload with both hashes allows the signer to
+  attest to the sha1-name and newhash-name referring to the same object.
+* All users consume the same signature. Broken signatures are likely
+  to be detected quickly using current versions of git.
+
+However, it also came with disadvantages:
+* Verifying a signed object requires access to the sha1-names of all
+  objects it references, even after the transition is complete and
+  translation table is no longer needed for anything else. To support
+  this, the design added fields such as "hash sha1 tree <sha1-name>"
+  and "hash sha1 parent <sha1-name>" to the newhash-content of a signed
+  commit, complicating the conversion process.
+* Allowing signed objects without a sha1 (for after the transition is
+  complete) complicated the design further, requiring a "nohash sha1"
+  field to suppress including "hash sha1" fields in the newhash-content
+  and signed payload.
+
+Lazily populated translation table
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Some of the work of building the translation table could be deferred to
+push time, but that would significantly complicate and slow down pushes.
+Calculating the sha1-name at object creation time at the same time it is
+being streamed to disk and having its newhash-name calculated should be
+an acceptable cost.
+
+Document History
+----------------
+
+2017-03-03
+bmwill@google.com, jonathantanmy@google.com, jrnieder@gmail.com,
+sbeller@google.com
+
+Initial version sent to
+http://public-inbox.org/git/20170304011251.GA26789@aiede.mtv.corp.google.com
+
+2017-03-03 jrnieder@gmail.com
+Incorporated suggestions from jonathantanmy and sbeller:
+* describe purpose of signed objects with each hash type
+* redefine signed object verification using object content under the
+  first hash function
+
+2017-03-06 jrnieder@gmail.com
+* Use SHA3-256 instead of SHA2 (thanks, Linus and brian m. carlson).[1][2]
+* Make sha3-based signatures a separate field, avoiding the need for
+  "hash" and "nohash" fields (thanks to peff[3]).
+* Add a sorting phase to fetch (thanks to Junio for noticing the need
+  for this).
+* Omit blobs from the topological sort during fetch (thanks to peff).
+* Discuss alternates, git notes, and git servers in the caveats
+  section (thanks to Junio Hamano, brian m. carlson[4], and Shawn
+  Pearce).
+* Clarify language throughout (thanks to various commenters,
+  especially Junio).
+
+2017-09-27 jrnieder@gmail.com, sbeller@google.com
+* use placeholder NewHash instead of SHA3-256
+* describe criteria for picking a hash function.
+* include a transition plan (thanks especially to Brandon Williams
+  for fleshing these ideas out)
+* define the translation table (thanks, Shawn Pearce[5], Jonathan
+  Tan, and Masaya Suzuki)
+* avoid loose object overhead by packing more aggressively in
+  "git gc --auto"
+
+[1] http://public-inbox.org/git/CA+55aFzJtejiCjV0e43+9oR3QuJK2PiFiLQemytoLpyJWe6P9w@mail.gmail.com/
+[2] http://public-inbox.org/git/CA+55aFz+gkAsDZ24zmePQuEs1XPS9BP_s8O7Q4wQ7LV7X5-oDA@mail.gmail.com/
+[3] http://public-inbox.org/git/20170306084353.nrns455dvkdsfgo5@sigill.intra.peff.net/
+[4] http://public-inbox.org/git/20170304224936.rqqtkdvfjgyezsht@genre.crustytoothpaste.net
+[5] https://public-inbox.org/git/CAJo=hJtoX9=AyLHHpUJS7fueV9ciZ_MNpnEPHUz8Whui6g9F0A@mail.gmail.com/
-- 
2.14.2.822.g60be5d43e6-goog


^ permalink raw reply	[relevance 4%]

* Re: [RFC] clang-format: outline the git project's coding style
  2017-08-09 22:33       ` Johannes Schindelin
@ 2017-09-28 11:41         ` Johannes Schindelin
  2017-09-28 17:16           ` Brandon Williams
  0 siblings, 1 reply; 200+ results
From: Johannes Schindelin @ 2017-09-28 11:41 UTC (permalink / raw)
  To: Brandon Williams; +Cc: Stefan Beller, git

Hi all,

On Thu, 10 Aug 2017, Johannes Schindelin wrote:

> On Tue, 8 Aug 2017, Brandon Williams wrote:
> 
> > On 08/08, Stefan Beller wrote:
> > > On Tue, Aug 8, 2017 at 5:05 AM, Johannes Schindelin
> > > <Johannes.Schindelin@gmx.de> wrote:
> > > >
> > > > On Mon, 7 Aug 2017, Brandon Williams wrote:
> > > >
> > > >> Add a '.clang-format' file which outlines the git project's
> > > >> coding style.  This can be used with clang-format to auto-format
> > > >> .c and .h files to conform with git's style.
> > > >>
> > > >> Signed-off-by: Brandon Williams <bmwill@google.com>
> > > >> ---
> > > >>
> > > >> I'm sure this sort of thing comes up every so often on the list
> > > >> but back at git-merge I mentioned how it would be nice to not
> > > >> have to worry about style when reviewing patches as that is
> > > >> something mechanical and best left to a machine (for the most
> > > >> part).
> > > >
> > > > Amen.
> > > >
> > > > If I never have to see a review mentioning an unwrapped line, I am quite
> > > > certain I will be quite content.
> > > 
> > > As a thought experiment I'd like to propose to take it one step further:
> > > 
> > >   If the code was formatted perfectly in one style such that a
> > >   formatter for this style would not produce changes when rerun
> > >   again on the code, then each individual could have a clean/smudge
> > >   filter to work in their preferred style, and only the exchange and
> > >   storage of code is in a mutual agreed style. If the mutually
> > >   agreed style is close to what I prefer, I don't have to use
> > >   clean/smudge filters.
> > > 
> > > Additionally to this patch, we'd want to either put a note into
> > > SubmittingPatches or Documentation/gitworkflows.txt to hint at
> > > how to use this formatting to just affect the patch that is currently
> > > worked on or rather a pre-commit hook?
> > 
> > I'm sure both of these things will probably happen if we decide to make
> > use of a code-formatter.  This RFC is really just trying to ask the
> > question: "Is this something we want to entertain doing?"
> > My response would be 'Yes' but there's many other opinions to consider
> > first :)
> 
> I am sure that something even better will be possible: a Continuous
> "Integration" that fixes the coding style automatically by using
> `filter-branch` (avoiding the merge conflicts that would arise if `rebase
> -i` was used).

FWIW I just set this up in my VSTS account, with the following build step
(performed in one of the Hosted Linux agents, i.e. I do not even have to
have a VM dedicated to the task):

-- snip --
die () {
    echo "$*" >&2
    exit 1
}

head=$(git rev-parse HEAD) ||
die "Could not determine HEAD"

test -n "$BUILD_SOURCEBRANCHNAME" ||
die "Need a source branch name to work with"

base=$(git merge-base origin/core/master HEAD) &&
count=$(git rev-list --count $base..) &&
test 0 -lt $count ||
die "Could not determine commits to clean up (count: $count)"

test -f .clang-format ||
git show origin/core/pu/.clang-format >.clang-format ||
die "Need a .clang-format"

sudo add-apt-repository 'deb http://apt.llvm.org/xenial/
llvm-toolchain-xenial main' &&
sudo apt-get update &&
sudo apt-get --allow-unauthenticated -y install clang-format-6.0 ||
die "Could not install clang-format 6.0"

git filter-branch -f --tree-filter \
    'git diff HEAD^.. |
     clang-format-diff-6.0 -p 1 -i &&

     git update-index --refresh --ignore-submodules &&
     git diff-files --quiet --ignore-submodules ||
     git commit --amend -C HEAD' $base.. ||
die "Could not rewrite branch"

if test "$head" = "$(git rev-parse HEAD)"
then
    echo "No changes in $BUILD_SOURCEBRANCHNAME introduced by clang-format" >&2
else
    git push vsts +HEAD:refs/heads/clang-format/"$BUILD_SOURCEBRANCHNAME" ||
    die "Could not push clang-format/$BUILD_SOURCEBRANCHNAME"
    echo "Clean branch pushed as clang-format/$BUILD_SOURCEBRANCHNAME" >&2
    exit 123
fi
-- snap --

A couple of notes for the interested:

- you can easily set this up for yourself, as Visual Studio Team Services
  is free for small teams up to five people (including single developer
  "teams"): https://www.visualstudio.com/team-services/, and of course you
  can back it by your own git.git fork on GitHub, no need to host the code
  in VSTS.

  Disclaimer: I work closely with the developers behind Visual Studio Team
  Services, and I am a genuine fan, yet I understand if anybody thinks of
  this as advertising the service, so this will be the only time I mention
  this.

- the script assumes that there is a `core/master` tracking upstream Git's
  master branch, then reformats the commits in the current branch that are
  not also reachable from core/master.

- The push credentials to push the result at the end are of course not
  included in the script, they need to be provided separately.

- the exit code 123 when the branch needed to be rewritten indicates to
  any consumer that the build "failed". The reason is that I want to
  integrate this into a system where I open a PR in my own account, which
  triggers the build automatically contingent on the base branch being
  core/master, and if the build fails, the PR gets "blocked", providing a
  very easy way to see that there is still work to be done.

- for the moment, I do not push back to the original branch, even if I
  could. The reason is that already my first test produced a dubious
  result, see below.

I am reasonably happy with the way this build job works right now,
especially given that I do not have to mess up any other setup I have just
to get the bleeding edge version of Clang.

Now for the dubious result.

I took my most recent contribution, the lazyload one (which you can easily
get yourself by fetching the lazyload-v2 tag from
https://github.com/dscho/git), because it was pretty self-contained and
small, only one patch. With the current .clang-format as per git.git's
master (or for that matter, pu, as they are identical), the output `git
show | clang-format-diff-6.0 -p 1` ends in this hunk:

-- snip --
@@ -43,8 +43,7 @@
        if (!proc->initialized) {
                HANDLE hnd;
                proc->initialized = 1;
-               hnd = LoadLibraryExA(proc->dll, NULL,
-                                    LOAD_LIBRARY_SEARCH_SYSTEM32);
+               hnd = LoadLibraryExA(proc->dll, NULL, LOAD_LIBRARY_SEARCH_SYSTEM32);
                if (hnd)
                        proc->pfunction = GetProcAddress(hnd, proc->function);
        }
-- snap --

(tabs intentionally converted to spaces, to show the extent of the damage
clearly)

In other words, despite the column limit of 80 (with a tab width of 8
spaces), it takes a perfectly well formatted pair of lines and combines
them into a single line that now has 84 columns. Absolutely not what we
want.

Even worse: if I replace the column limit of 80 by 79, like so:

-- snip --
diff --git a/.clang-format b/.clang-format
index 3ede2628d2d..9f686c1ed5a 100644
--- a/.clang-format
+++ b/.clang-format
@@ -6,7 +6,7 @@ UseTab: Always
 TabWidth: 8
 IndentWidth: 8
 ContinuationIndentWidth: 8
-ColumnLimit: 80
+ColumnLimit: 79
 
 # C Language specifics
 Language: Cpp
-- snap --

then that hunk vanishes and clang-format leaves the LoadLibraryEx() lines
alone!

Even stranger, if I revert to 80 columns, copy the offending line above
the conditional block so that the indentation level is different (based on
a hunch that this may have something to do with clang's understanding of
tab widths) and extend the last parameter artificially, it still breaks at
exactly 84 columns, i.e. if I make the line 84 columns long, it keeps it
as one line, if I extend it to 85 columns, it breaks the line into two.

In fact, the same holds true even with no indentation at all: if I turn
this line into a static variable assignment outside of the function, it
again breaks the line as soon as I extend it to 85 columns.

Then I repeated the exercise with clang-format-6.0 instead of
clang-format-diff-6.0, and the finding still holds. 85 columns, despite
the explicit ColumnLimit: 80 in .clang-format.

I then tried to format a file containing only the line "int i123, j123;"
with various values for ColumnLimit, and could not get it to break at all.

Any insights?

Thanks,
Dscho

^ permalink raw reply	[relevance 6%]

* Re: [RFC] clang-format: outline the git project's coding style
  2017-09-28 11:41         ` Johannes Schindelin
@ 2017-09-28 17:16           ` Brandon Williams
  0 siblings, 0 replies; 200+ results
From: Brandon Williams @ 2017-09-28 17:16 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Stefan Beller, git

On 09/28, Johannes Schindelin wrote:
> Hi all,
>
> On Thu, 10 Aug 2017, Johannes Schindelin wrote:
>
> > On Tue, 8 Aug 2017, Brandon Williams wrote:
> >
> > > On 08/08, Stefan Beller wrote:
> > > > On Tue, Aug 8, 2017 at 5:05 AM, Johannes Schindelin
> > > > <Johannes.Schindelin@gmx.de> wrote:
> > > > >
> > > > > On Mon, 7 Aug 2017, Brandon Williams wrote:
> > > > >
> > > > >> Add a '.clang-format' file which outlines the git project's
> > > > >> coding style.  This can be used with clang-format to auto-format
> > > > >> .c and .h files to conform with git's style.
> > > > >>
> > > > >> Signed-off-by: Brandon Williams <bmwill@google.com>
> > > > >> ---
> > > > >>
> > > > >> I'm sure this sort of thing comes up every so often on the list
> > > > >> but back at git-merge I mentioned how it would be nice to not
> > > > >> have to worry about style when reviewing patches as that is
> > > > >> something mechanical and best left to a machine (for the most
> > > > >> part).
> > > > >
> > > > > Amen.
> > > > >
> > > > > If I never have to see a review mentioning an unwrapped line, I am quite
> > > > > certain I will be quite content.
> > > >
> > > > As a thought experiment I'd like to propose to take it one step further:
> > > >
> > > >   If the code was formatted perfectly in one style such that a
> > > >   formatter for this style would not produce changes when rerun
> > > >   again on the code, then each individual could have a clean/smudge
> > > >   filter to work in their preferred style, and only the exchange and
> > > >   storage of code is in a mutual agreed style. If the mutually
> > > >   agreed style is close to what I prefer, I don't have to use
> > > >   clean/smudge filters.
> > > >
> > > > Additionally to this patch, we'd want to either put a note into
> > > > SubmittingPatches or Documentation/gitworkflows.txt to hint at
> > > > how to use this formatting to just affect the patch that is currently
> > > > worked on or rather a pre-commit hook?
> > >
> > > I'm sure both of these things will probably happen if we decide to make
> > > use of a code-formatter.  This RFC is really just trying to ask the
> > > question: "Is this something we want to entertain doing?"
> > > My response would be 'Yes' but there's many other opinions to consider
> > > first :)
> >
> > I am sure that something even better will be possible: a Continuous
> > "Integration" that fixes the coding style automatically by using
> > `filter-branch` (avoiding the merge conflicts that would arise if `rebase
> > -i` was used).
>
> FWIW I just set this up in my VSTS account, with the following build step
> (performed in one of the Hosted Linux agents, i.e. I do not even have to
> have a VM dedicated to the task):
>
> -- snip --
> die () {
>     echo "$*" >&2
>     exit 1
> }
>
> head=$(git rev-parse HEAD) ||
> die "Could not determine HEAD"
>
> test -n "$BUILD_SOURCEBRANCHNAME" ||
> die "Need a source branch name to work with"
>
> base=$(git merge-base origin/core/master HEAD) &&
> count=$(git rev-list --count $base..) &&
> test 0 -lt $count ||
> die "Could not determine commits to clean up (count: $count)"
>
> test -f .clang-format ||
> git show origin/core/pu/.clang-format >.clang-format ||
> die "Need a .clang-format"
>
> sudo add-apt-repository 'deb http://apt.llvm.org/xenial/
> llvm-toolchain-xenial main' &&
> sudo apt-get update &&
> sudo apt-get --allow-unauthenticated -y install clang-format-6.0 ||
> die "Could not install clang-format 6.0"
>
> git filter-branch -f --tree-filter \
>     'git diff HEAD^.. |
>      clang-format-diff-6.0 -p 1 -i &&
>
>      git update-index --refresh --ignore-submodules &&
>      git diff-files --quiet --ignore-submodules ||
>      git commit --amend -C HEAD' $base.. ||
> die "Could not rewrite branch"
>
> if test "$head" = "$(git rev-parse HEAD)"
> then
>     echo "No changes in $BUILD_SOURCEBRANCHNAME introduced by clang-format" >&2
> else
>     git push vsts +HEAD:refs/heads/clang-format/"$BUILD_SOURCEBRANCHNAME" ||
>     die "Could not push clang-format/$BUILD_SOURCEBRANCHNAME"
>     echo "Clean branch pushed as clang-format/$BUILD_SOURCEBRANCHNAME" >&2
>     exit 123
> fi
> -- snap --
>
> A couple of notes for the interested:
>
> - you can easily set this up for yourself, as Visual Studio Team Services
>   is free for small teams up to five people (including single developer
>   "teams"): https://www.visualstudio.com/team-services/, and of course you
>   can back it by your own git.git fork on GitHub, no need to host the code
>   in VSTS.
>
>   Disclaimer: I work closely with the developers behind Visual Studio Team
>   Services, and I am a genuine fan, yet I understand if anybody thinks of
>   this as advertising the service, so this will be the only time I mention
>   this.
>
> - the script assumes that there is a `core/master` tracking upstream Git's
>   master branch, then reformats the commits in the current branch that are
>   not also reachable from core/master.
>
> - The push credentials to push the result at the end are of course not
>   included in the script, they need to be provided separately.
>
> - the exit code 123 when the branch needed to be rewritten indicates to
>   any consumer that the build "failed". The reason is that I want to
>   integrate this into a system where I open a PR in my own account, which
>   triggers the build automatically contingent on the base branch being
>   core/master, and if the build fails, the PR gets "blocked", providing a
>   very easy way to see that there is still work to be done.
>
> - for the moment, I do not push back to the original branch, even if I
>   could. The reason is that already my first test produced a dubious
>   result, see below.
>
> I am reasonably happy with the way this build job works right now,
> especially given that I do not have to mess up any other setup I have just
> to get the bleeding edge version of Clang.
>
> Now for the dubious result.
>
> I took my most recent contribution, the lazyload one (which you can easily
> get yourself by fetching the lazyload-v2 tag from
> https://github.com/dscho/git), because it was pretty self-contained and
> small, only one patch. With the current .clang-format as per git.git's
> master (or for that matter, pu, as they are identical), the output `git
> show | clang-format-diff-6.0 -p 1` ends in this hunk:
>
> -- snip --
> @@ -43,8 +43,7 @@
>         if (!proc->initialized) {
>                 HANDLE hnd;
>                 proc->initialized = 1;
> -               hnd = LoadLibraryExA(proc->dll, NULL,
> -                                    LOAD_LIBRARY_SEARCH_SYSTEM32);
> +               hnd = LoadLibraryExA(proc->dll, NULL, LOAD_LIBRARY_SEARCH_SYSTEM32);
>                 if (hnd)
>                         proc->pfunction = GetProcAddress(hnd, proc->function);
>         }
> -- snap --
>
> (tabs intentionally converted to spaces, to show the extent of the damage
> clearly)
>
> In other words, despite the column limit of 80 (with a tab width of 8
> spaces), it takes a perfectly well formatted pair of lines and combines
> them into a single line that now has 84 columns. Absolutely not what we
> want.
>
> Even worse: if I replace the column limit of 80 by 79, like so:
>
> -- snip --
> diff --git a/.clang-format b/.clang-format
> index 3ede2628d2d..9f686c1ed5a 100644
> --- a/.clang-format
> +++ b/.clang-format
> @@ -6,7 +6,7 @@ UseTab: Always
>  TabWidth: 8
>  IndentWidth: 8
>  ContinuationIndentWidth: 8
> -ColumnLimit: 80
> +ColumnLimit: 79
>
>  # C Language specifics
>  Language: Cpp
> -- snap --
>
> then that hunk vanishes and clang-format leaves the LoadLibraryEx() lines
> alone!
>
> Even stranger, if I revert to 80 columns, copy the offending line above
> the conditional block so that the indentation level is different (based on
> a hunch that this may have something to do with clang's understanding of
> tab widths) and extend the last parameter artificially, it still breaks at
> exactly 84 columns, i.e. if I make the line 84 columns long, it keeps it
> as one line, if I extend it to 85 columns, it breaks the line into two.
>
> In fact, the same holds true even with no indentation at all: if I turn
> this line into a static variable assignment outside of the function, it
> again breaks the line as soon as I extend it to 85 columns.
>
> Then I repeated the exercise with clang-format-6.0 instead of
> clang-format-diff-6.0, and the finding still holds. 85 columns, despite
> the explicit ColumnLimit: 80 in .clang-format.
>
> I then tried to format a file containing only the line "int i123, j123;"
> with various values for ColumnLimit, and could not get it to break at all.
>
> Any insights?

I think I know what is happening here, and this is something that we
will need to tweak based on more people beginning to use the formatting
tool.

In the .clang-format file there are a number of parameters at the end:

    # Penalties
    # This decides what order things should be done if a line is too long
    PenaltyBreakAssignment: 100
    PenaltyBreakBeforeFirstCallParameter: 100
    PenaltyBreakComment: 100
    PenaltyBreakFirstLessLess: 0
    PenaltyBreakString: 100
    PenaltyExcessCharacter: 5
    PenaltyReturnTypeOnItsOwnLine: 0

These penalties are used to determine which line breaking events should be
done based on some programmable weight (or penalty).  I thing that if we
bump up the penalty on excess characters to something higher than '5'
(which i think is quite low) then it may format more like you were
expecting.

--
Brandon Williams

^ permalink raw reply	[relevance 1%]

* Re: [PATCH v4] technical doc: add a design doc for hash function transition
  2017-09-28  4:43       ` [PATCH v4] technical doc: add a design doc for hash function transition Jonathan Nieder
@ 2017-09-29  6:06         ` Junio C Hamano
  2017-09-29 17:34           ` Jonathan Nieder
  0 siblings, 1 reply; 200+ results
From: Junio C Hamano @ 2017-09-29  6:06 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Shawn Pearce, Linus Torvalds, Git Mailing List, Stefan Beller, bmwill, Jonathan Tan, Jeff King, David Lang, brian m. carlson, Masaya Suzuki, demerphq, The Keccak Team, Johannes Schindelin

Jonathan Nieder <jrnieder@gmail.com> writes:

> This document describes what a transition to a new hash function for
> Git would look like.  Add it to Documentation/technical/ as the plan
> of record so that future changes can be recorded as patches.
>
> Also-by: Brandon Williams <bmwill@google.com>
> Also-by: Jonathan Tan <jonathantanmy@google.com>
> Also-by: Stefan Beller <sbeller@google.com>
> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
> ---

Shoudln't these all be s-o-b: (with a note immediately before that
to say all four contributed equally or something)?

> +Background
> +----------
> +At its core, the Git version control system is a content addressable
> +filesystem. It uses the SHA-1 hash function to name content. For
> +example, files, directories, and revisions are referred to by hash
> +values unlike in other traditional version control systems where files
> +or versions are referred to via sequential numbers. The use of a hash

Traditional systems refer to files via numbers???  Perhaps "where
versions of files are referred to via sequential numbers" or
something?

> +function to address its content delivers a few advantages:
> +
> +* Integrity checking is easy. Bit flips, for example, are easily
> +  detected, as the hash of corrupted content does not match its name.
> +* Lookup of objects is fast.

* There is no ambiguity what the object's name should be, given its
  content.

* Deduping the same content copied across versions and paths is
  automatic.

> +SHA-1 still possesses the other properties such as fast object lookup
> +and safe error checking, but other hash functions are equally suitable
> +that are believed to be cryptographically secure.

s/secure/more &/, perhaps?

> +Goals
> +-----
> +...
> +   c. Users can use SHA-1 and NewHash identifiers for objects
> +      interchangeably (see "Object names on the command line", below).

Mental note.  This needs to extend to the "index X..Y" lines in the
patch output, which is used by "apply -3" and "am -3".

> +2. Allow a complete transition away from SHA-1.
> +   a. Local metadata for SHA-1 compatibility can be removed from a
> +      repository if compatibility with SHA-1 is no longer needed.

I like the emphasis on "Local" here.  Metadata for compatiblity that
is embedded in the objects obviously cannot be removed.

From that point of view, one of the goals ought to be "make sure
that as much SHA-1 compatibility metadata as possible is local and
outside the object".  This goal may not be able to say more than "as
much as possible", as signed objects that came from SHA-1 world
needs to carry the compatibility metadata somewhere somehow.  

Or perhaps we could.  There is nothing that says a signed tag
created in the SHA-1 world must have the PGP/SHA-1 signature in the
NewHash payload---it could be split off of the object data and
stored in a local metadata cache, to be used only when we need to
convert it back to the SHA-1 world.

But I am getting ahead of myself before reading the proposal
through.

> +Non-Goals
> +---------
> ...
> +6. Skip fetching some submodules of a project into a NewHash
> +   repository. (This also depends on NewHash support in Git
> +   protocol.)

It is unclear what this means.  Around submodule support, one thing
I can think of is that a NewHash tree in a superproject would record
a gitlink that is a NewHash commit object name in it, therefore it
cannot refer to an unconverted SHA-1 submodule repository.  But it
is unclear if the above description refers to the same issue, or
something else.

> +Overview
> +--------
> +We introduce a new repository format extension. Repositories with this
> +extension enabled use NewHash instead of SHA-1 to name their objects.
> +This affects both object names and object content --- both the names
> +of objects and all references to other objects within an object are
> +switched to the new hash function.
> +
> +NewHash repositories cannot be read by older versions of Git.
> +
> +Alongside the packfile, a NewHash repository stores a bidirectional
> +mapping between NewHash and SHA-1 object names. The mapping is generated
> +locally and can be verified using "git fsck". Object lookups use this
> +mapping to allow naming objects using either their SHA-1 and NewHash names
> +interchangeably.
> +
> +"git cat-file" and "git hash-object" gain options to display an object
> +in its sha1 form and write an object given its sha1 form.

Both of these are somewhat unclear.  I am guessing that "git
cat-file --convert-to=sha1 <type> <NewHashName>" would emit the
object contents converted from their NewHash payload to SHA-1
payload (blobs are unchanged, trees, commits and tags get their
outgoing references converted from NewHash to their SHA-1
counterparts), and that is what you mean by "options to display an
object in its sha1 form".  

I am not sure how "git hash-object" with the option would work,
though.  Do you give an option "--hash=sha1 --stdout --stdin -t
<type>" to feed a NewHash contents (file, tree, commit or tag) to
the command, convert it to the SHA-1 content (hmm, how's that
different from the cat-file's new option???) and then write out its
loose object representation suitable to be used in the SHA-1 workd?
Where do you write it to?  It won't be in the repository, as we
rejected mixed repository in our Non-Goals section.

> +Object names
> +~~~~~~~~~~~~
> +Objects can be named by their 40 hexadecimal digit sha1-name or 64
> +hexadecimal digit newhash-name, plus names derived from those (see
> +gitrevisions(7)).
> +
> +The sha1-name of an object is the SHA-1 of the concatenation of its
> +type, length, a nul byte, and the object's sha1-content. This is the
> +traditional <sha1> used in Git to name objects.
> +
> +The newhash-name of an object is the NewHash of the concatenation of its
> +type, length, a nul byte, and the object's newhash-content.

It makes me wonder if we want to add the hashname in this object
header.  "length" would be different for non-blob objects anyway,
and it is not "compat metadata" we want to avoid baked in, yet it
would help diagnose a mistake of attempting to use a "mixed" objects
in a single repository.  Not a big issue, though.

> +The format allows round-trip conversion between newhash-content and
> +sha1-content.

If it is a goal to eventually be able to lose SHA-1 compatibility
metadata from the objects, then we might want to remove SHA-1 based
signature bits (e.g. PGP trailer in signed tag, gpgsig header in the
commit object) from NewHash contents, and instead have them stored
in a side "metadata" table, only to be used while converting back.
I dunno if that is desirable.

> +Pack index
> +~~~~~~~~~~
> +Pack index (.idx) files use a new v3 format that supports multiple
> +hash functions. They have the following format (all integers are in
> +network byte order):
> +
> +- A header appears at the beginning and consists of the following:
> +  - The 4-byte pack index signature: '\377t0c'
> +  - 4-byte version number: 3
> +  - 4-byte length of the header section, including the signature and
> +    version number
> +  - 4-byte number of objects contained in the pack
> +  - 4-byte number of object formats in this pack index: 2
> +  - For each object format:
> +    - 4-byte format identifier (e.g., 'sha1' for SHA-1)
> +    - 4-byte length in bytes of shortened object names. This is the
> +      shortest possible length needed to make names in the shortened
> +      object name table unambiguous.
> +    - 4-byte integer, recording where tables relating to this format
> +      are stored in this index file, as an offset from the beginning.
> +  - 4-byte offset to the trailer from the beginning of this file.
> +  - Zero or more additional key/value pairs (4-byte key, 4-byte
> +    value). Only one key is supported: 'PSRC'. See the "Loose objects
> +    and unreachable objects" section for supported values and how this
> +    is used.  All other keys are reserved. Readers must ignore
> +    unrecognized keys.
> +- Zero or more NUL bytes. This can optionally be used to improve the
> +  alignment of the full object name table below.
> +- Tables for the first object format:
> +  - A sorted table of shortened object names.  These are prefixes of
> +    the names of all objects in this pack file, packed together
> +    without offset values to reduce the cache footprint of the binary
> +    search for a specific object name.

I take it to mean that the stride is defined in the "length in bytes
of shortened object names" in the file header.  If so, I can see how
this would work.  This "sorted table", unlike the next one, does not
say how it is sorted, but I assume this is just the object name
order (as opposed to the pack location order the next table uses)?

> +  - A table of full object names in pack order. This allows resolving
> +    a reference to "the nth object in the pack file" (from a
> +    reachability bitmap or from the next table of another object
> +    format) to its object name.
> +
> +  - A table of 4-byte values mapping object name order to pack order.
> +    For an object in the table of sorted shortened object names, the
> +    value at the corresponding index in this table is the index in the
> +    previous table for that same object.
> +
> +    This can be used to look up the object in reachability bitmaps or
> +    to look up its name in another object format.

And this is a separate table because the short-name table wants to
be as compact as possible for binary search?  Otherwise an entry in
the short-name table could be <pack order number, n-bytes that is
short unique prefix>.

> +  - A table of 4-byte CRC32 values of the packed object data, in the
> +    order that the objects appear in the pack file. This is to allow
> +    compressed data to be copied directly from pack to pack during
> +    repacking without undetected data corruption.

An obvious alternative would be to have the CRC32 checksum near
(e.g. immediately before) the object data in the packfile (as
opposed to the .idx file like this document specifies).  I am not
sure what the pros and cons are between the two, though, and that is
why I mention the possiblity here.

Hmm, as the corresponding packfile stores object data only in
NewHash content format, it is somewhat curious that this table that
stores CRC32 of the data appears in the "Tables for each object
format" section, as they would be identical, no?  Unless I am
grossly misleading the spec, the checksum should either go outside
the "Tables for each object format" section but still in .idx, or
should be eliminated and become part of the packdata stream instead,
perhaps?

> +  - A table of 4-byte offset values. For an object in the table of
> +    sorted shortened object names, the value at the corresponding
> +    index in this table indicates where that object can be found in
> +    the pack file. These are usually 31-bit pack file offsets, but
> +    large offsets are encoded as an index into the next table with the
> +    most significant bit set.

Oy.  So we can go from a short prefix to the pack location by first
finding it via binsearch in the short-name table, realize that it is
nth object in the object name order, and consulting this table.
When we know the pack-order of an object, there is no direct way to
go to its location (short of reversing the name-order-to-pack-order
table)?

> +  - A table of 8-byte offset entries (empty for pack files less than
> +    2 GiB). Pack files are organized with heavily used objects toward
> +    the front, so most object references should not need to refer to
> +    this table.

> +- Zero or more NUL bytes.

... for padding/aligning.

> +- Tables for the second object format, with the same layout as above,
> +  up to and not including the table of CRC32 values.
> +- Zero or more NUL bytes.
> +- The trailer consists of the following:
> +  - A copy of the 20-byte NewHash checksum at the end of the
> +    corresponding packfile.
> +
> +  - 20-byte NewHash checksum of all of the above.

When did NewHash shrink to 20-byte suddenly?  I think the above two
are both "32-byte"?

> +Loose object index
> +~~~~~~~~~~~~~~~~~~
> +A new file $GIT_OBJECT_DIR/loose-object-idx contains information about
> +all loose objects. Its format is
> +
> +  # loose-object-idx
> +  (newhash-name SP sha1-name LF)*
> +
> +where the object names are in hexadecimal format. The file is not
> +sorted.

Shouldn't the file somehow say what hashes are involved to allow us
match it with extension.{objectFormat,compatObjectFormat}, perhaps
at the end of the "# loose-object-idx" line?

> +The loose object index is protected against concurrent writes by a
> +lock file $GIT_OBJECT_DIR/loose-object-idx.lock. To add a new loose
> +object:
> +
> +1. Write the loose object to a temporary file, like today.
> +2. Open loose-object-idx.lock with O_CREAT | O_EXCL to acquire the lock.
> +3. Rename the loose object into place.
> +4. Open loose-object-idx with O_APPEND and write the new object

"write the new entry, fsync and close"?

> +Translation table
> +~~~~~~~~~~~~~~~~~
> +The index files support a bidirectional mapping between sha1-names
> +and newhash-names. The lookup proceeds similarly to ordinary object
> +lookups. For example, to convert a sha1-name to a newhash-name:
> +
> + 1. Look for the object in idx files. If a match is present in the
> +    idx's sorted list of truncated sha1-names, then:
> +    a. Read the corresponding entry in the sha1-name order to pack
> +       name order mapping.
> +    b. Read the corresponding entry in the full sha1-name table to
> +       verify we found the right object. If it is, then
> +    c. Read the corresponding entry in the full newhash-name table.
> +       That is the object's newhash-name.

c. is possible because b. and c. are sorted the same way, i.e. the
index used to consult the full sha1-name table, which is the pack
order number, can be used to find its full newhash in the "full
newhash sorted by pack order" table?

> +Reading an object's sha1-content
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I'd stop here and continue in a separate message.  Thanks for a
detailed write-up.

^ permalink raw reply	[relevance 5%]

* [PATCH v6 1/3] submodule--helper: introduce get_submodule_displaypath()
  2017-09-29  9:44 ` [PATCH v6 0/3] Incremental rewrite of git-submodules Prathamesh Chavan
@ 2017-09-29  9:44   ` Prathamesh Chavan
  2017-09-29  9:44   ` [PATCH v6 2/3] submodule--helper: introduce for_each_listed_submodule() Prathamesh Chavan
  2017-09-29  9:44   ` [PATCH v6 3/3] submodule: port submodule subcommand 'status' from shell to C Prathamesh Chavan
  2 siblings, 0 replies; 200+ results
From: Prathamesh Chavan @ 2017-09-29  9:44 UTC (permalink / raw)
  To: gitster; +Cc: christian.couder, git, hanwen, sbeller, Prathamesh Chavan

Introduce function get_submodule_displaypath() to replace the code
occurring in submodule_init() for generating displaypath of the
submodule with a call to it.

This new function will also be used in other parts of the system
in later patches.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
---
 builtin/submodule--helper.c | 35 +++++++++++++++++++++++------------
 1 file changed, 23 insertions(+), 12 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 818fe74f0..cdae54426 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -220,6 +220,26 @@ static int resolve_relative_url_test(int argc, const char **argv, const char *pr
 	return 0;
 }
 
+/* the result should be freed by the caller. */
+static char *get_submodule_displaypath(const char *path, const char *prefix)
+{
+	const char *super_prefix = get_super_prefix();
+
+	if (prefix && super_prefix) {
+		BUG("cannot have prefix '%s' and superprefix '%s'",
+		    prefix, super_prefix);
+	} else if (prefix) {
+		struct strbuf sb = STRBUF_INIT;
+		char *displaypath = xstrdup(relative_path(path, prefix, &sb));
+		strbuf_release(&sb);
+		return displaypath;
+	} else if (super_prefix) {
+		return xstrfmt("%s%s", super_prefix, path);
+	} else {
+		return xstrdup(path);
+	}
+}
+
 struct module_list {
 	const struct cache_entry **entries;
 	int alloc, nr;
@@ -335,15 +355,7 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
 	struct strbuf sb = STRBUF_INIT;
 	char *upd = NULL, *url = NULL, *displaypath;
 
-	if (prefix && get_super_prefix())
-		die("BUG: cannot have prefix and superprefix");
-	else if (prefix)
-		displaypath = xstrdup(relative_path(path, prefix, &sb));
-	else if (get_super_prefix()) {
-		strbuf_addf(&sb, "%s%s", get_super_prefix(), path);
-		displaypath = strbuf_detach(&sb, NULL);
-	} else
-		displaypath = xstrdup(path);
+	displaypath = get_submodule_displaypath(path, prefix);
 
 	sub = submodule_from_path(&null_oid, path);
 
@@ -358,9 +370,9 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
 	 * Set active flag for the submodule being initialized
 	 */
 	if (!is_submodule_active(the_repository, path)) {
-		strbuf_reset(&sb);
 		strbuf_addf(&sb, "submodule.%s.active", sub->name);
 		git_config_set_gently(sb.buf, "true");
+		strbuf_reset(&sb);
 	}
 
 	/*
@@ -368,7 +380,6 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
 	 * To look up the url in .git/config, we must not fall back to
 	 * .gitmodules, so look it up directly.
 	 */
-	strbuf_reset(&sb);
 	strbuf_addf(&sb, "submodule.%s.url", sub->name);
 	if (git_config_get_string(sb.buf, &url)) {
 		if (!sub->url)
@@ -405,9 +416,9 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
 				_("Submodule '%s' (%s) registered for path '%s'\n"),
 				sub->name, url, displaypath);
 	}
+	strbuf_reset(&sb);
 
 	/* Copy "update" setting when it is not set yet */
-	strbuf_reset(&sb);
 	strbuf_addf(&sb, "submodule.%s.update", sub->name);
 	if (git_config_get_string(sb.buf, &upd) &&
 	    sub->update_strategy.type != SM_UPDATE_UNSPECIFIED) {
-- 
2.13.0


^ permalink raw reply	[relevance 21%]

* [PATCH v6 2/3] submodule--helper: introduce for_each_listed_submodule()
  2017-09-29  9:44 ` [PATCH v6 0/3] Incremental rewrite of git-submodules Prathamesh Chavan
  2017-09-29  9:44   ` [PATCH v6 1/3] submodule--helper: introduce get_submodule_displaypath() Prathamesh Chavan
@ 2017-09-29  9:44   ` Prathamesh Chavan
  2017-10-02  0:55     ` Junio C Hamano
  2017-09-29  9:44   ` [PATCH v6 3/3] submodule: port submodule subcommand 'status' from shell to C Prathamesh Chavan
  2 siblings, 1 reply; 200+ results
From: Prathamesh Chavan @ 2017-09-29  9:44 UTC (permalink / raw)
  To: gitster; +Cc: christian.couder, git, hanwen, sbeller, Prathamesh Chavan

Introduce function for_each_listed_submodule() and replace a loop
in module_init() with a call to it.

The new function will also be used in other parts of the
system in later patches.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
---
 builtin/submodule--helper.c | 39 ++++++++++++++++++++++++++++++++++-----
 1 file changed, 34 insertions(+), 5 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index cdae54426..20a1ef868 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -14,6 +14,11 @@
 #include "refs.h"
 #include "connect.h"
 
+#define CB_OPT_QUIET		(1<<0)
+
+typedef void (*each_submodule_fn)(const struct cache_entry *list_item,
+				  void *cb_data);
+
 static char *get_default_remote(void)
 {
 	char *dest = NULL, *ret;
@@ -349,7 +354,22 @@ static int module_list(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
-static void init_submodule(const char *path, const char *prefix, int quiet)
+static void for_each_listed_submodule(const struct module_list *list,
+				      each_submodule_fn fn, void *cb_data)
+{
+	int i;
+	for (i = 0; i < list->nr; i++)
+		fn(list->entries[i], cb_data);
+}
+
+struct init_cb {
+	const char *prefix;
+	unsigned int cb_flags;
+};
+#define INIT_CB_INIT { NULL, 0 }
+
+static void init_submodule(const char *path, const char *prefix,
+			   unsigned int cb_flags)
 {
 	const struct submodule *sub;
 	struct strbuf sb = STRBUF_INIT;
@@ -411,7 +431,7 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
 		if (git_config_set_gently(sb.buf, url))
 			die(_("Failed to register url for submodule path '%s'"),
 			    displaypath);
-		if (!quiet)
+		if (!(cb_flags & CB_OPT_QUIET))
 			fprintf(stderr,
 				_("Submodule '%s' (%s) registered for path '%s'\n"),
 				sub->name, url, displaypath);
@@ -438,12 +458,18 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
 	free(upd);
 }
 
+static void init_submodule_cb(const struct cache_entry *list_item, void *cb_data)
+{
+	struct init_cb *info = cb_data;
+	init_submodule(list_item->name, info->prefix, info->cb_flags);
+}
+
 static int module_init(int argc, const char **argv, const char *prefix)
 {
+	struct init_cb info = INIT_CB_INIT;
 	struct pathspec pathspec;
 	struct module_list list = MODULE_LIST_INIT;
 	int quiet = 0;
-	int i;
 
 	struct option module_init_options[] = {
 		OPT__QUIET(&quiet, N_("Suppress output for initializing a submodule")),
@@ -468,8 +494,11 @@ static int module_init(int argc, const char **argv, const char *prefix)
 	if (!argc && git_config_get_value_multi("submodule.active"))
 		module_list_active(&list);
 
-	for (i = 0; i < list.nr; i++)
-		init_submodule(list.entries[i]->name, prefix, quiet);
+	info.prefix = prefix;
+	if (quiet)
+		info.cb_flags |= CB_OPT_QUIET;
+
+	for_each_listed_submodule(&list, init_submodule_cb, &info);
 
 	return 0;
 }
-- 
2.13.0


^ permalink raw reply	[relevance 21%]

* [PATCH v6 3/3] submodule: port submodule subcommand 'status' from shell to C
  2017-09-29  9:44 ` [PATCH v6 0/3] Incremental rewrite of git-submodules Prathamesh Chavan
  2017-09-29  9:44   ` [PATCH v6 1/3] submodule--helper: introduce get_submodule_displaypath() Prathamesh Chavan
  2017-09-29  9:44   ` [PATCH v6 2/3] submodule--helper: introduce for_each_listed_submodule() Prathamesh Chavan
@ 2017-09-29  9:44   ` Prathamesh Chavan
  2 siblings, 0 replies; 200+ results
From: Prathamesh Chavan @ 2017-09-29  9:44 UTC (permalink / raw)
  To: gitster; +Cc: christian.couder, git, hanwen, sbeller, Prathamesh Chavan

This aims to make git-submodule 'status' a built-in. Hence, the function
cmd_status() is ported from shell to C. This is done by introducing
four functions: module_status(), submodule_status_cb(),
submodule_status() and print_status().

The function module_status() acts as the front-end of the subcommand.
It parses subcommand's options and then calls the function
module_list_compute() for computing the list of submodules. Then
this functions calls for_each_listed_submodule() looping through the
list obtained.

Then for_each_listed_submodule() calls submodule_status_cb() for each of
the submodule in its list. The function submodule_status_cb() calls
submodule_status() after passing appropriate arguments to the funciton.
Function submodule_status() is responsible for generating the status
each submodule it is called for, and then calls print_status().

Finally, the function print_status() handles the printing of submodule's
status.

Function set_name_rev() is also ported from git-submodule to the
submodule--helper builtin function compute_rev_name(), which now
generates the value of the revision name as required.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
---
 builtin/submodule--helper.c | 207 ++++++++++++++++++++++++++++++++++++++++++++
 git-submodule.sh            |  61 +------------
 2 files changed, 208 insertions(+), 60 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 20a1ef868..50d38fc20 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -13,8 +13,13 @@
 #include "remote.h"
 #include "refs.h"
 #include "connect.h"
+#include "revision.h"
+#include "diffcore.h"
+#include "diff.h"
 
 #define CB_OPT_QUIET		(1<<0)
+#define CB_OPT_CACHED		(1<<1)
+#define CB_OPT_RECURSIVE	(1<<2)
 
 typedef void (*each_submodule_fn)(const struct cache_entry *list_item,
 				  void *cb_data);
@@ -245,6 +250,53 @@ static char *get_submodule_displaypath(const char *path, const char *prefix)
 	}
 }
 
+static char *compute_rev_name(const char *sub_path, const char* object_id)
+{
+	struct strbuf sb = STRBUF_INIT;
+	const char ***d;
+
+	static const char *describe_bare[] = {
+		NULL
+	};
+
+	static const char *describe_tags[] = {
+		"--tags", NULL
+	};
+
+	static const char *describe_contains[] = {
+		"--contains", NULL
+	};
+
+	static const char *describe_all_always[] = {
+		"--all", "--always", NULL
+	};
+
+	static const char **describe_argv[] = {
+		describe_bare, describe_tags, describe_contains,
+		describe_all_always, NULL
+	};
+
+	for (d = describe_argv; *d; d++) {
+		struct child_process cp = CHILD_PROCESS_INIT;
+		prepare_submodule_repo_env(&cp.env_array);
+		cp.dir = sub_path;
+		cp.git_cmd = 1;
+		cp.no_stderr = 1;
+
+		argv_array_push(&cp.args, "describe");
+		argv_array_pushv(&cp.args, *d);
+		argv_array_push(&cp.args, object_id);
+
+		if (!capture_command(&cp, &sb, 0)) {
+			strbuf_strip_suffix(&sb, "\n");
+			return strbuf_detach(&sb, NULL);
+		}
+	}
+
+	strbuf_release(&sb);
+	return NULL;
+}
+
 struct module_list {
 	const struct cache_entry **entries;
 	int alloc, nr;
@@ -503,6 +555,160 @@ static int module_init(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
+struct status_cb {
+	const char *prefix;
+	unsigned int cb_flags;
+};
+#define STATUS_CB_INIT { NULL, 0 }
+
+static void print_status(unsigned int flags, char state, const char *path,
+			 const struct object_id *oid, const char *displaypath)
+{
+	if (flags & CB_OPT_QUIET)
+		return;
+
+	printf("%c%s %s", state, oid_to_hex(oid), displaypath);
+
+	if (state == ' ' || state == '+')
+		printf(" (%s)", compute_rev_name(path, oid_to_hex(oid)));
+
+	printf("\n");
+}
+
+static int handle_submodule_head_ref(const char *refname,
+				     const struct object_id *oid, int flags,
+				     void *cb_data)
+{
+	struct object_id *output = cb_data;
+	if (oid)
+		oidcpy(output, oid);
+
+	return 0;
+}
+
+static void status_submodule(const char *path, const struct object_id *ce_oid,
+			     unsigned int ce_flags, const char *prefix,
+			     unsigned int cb_flags)
+{
+	char *displaypath;
+	struct argv_array diff_files_args = ARGV_ARRAY_INIT;
+	struct rev_info rev;
+	int diff_files_result;
+
+	if (!submodule_from_path(&null_oid, path))
+		die(_("no submodule mapping found in .gitmodules for path '%s'"),
+		      path);
+
+	displaypath = get_submodule_displaypath(path, prefix);
+
+	if ((CE_STAGEMASK & ce_flags) >> CE_STAGESHIFT) {
+		print_status(cb_flags, 'U', path, &null_oid, displaypath);
+		goto cleanup;
+	}
+
+	if (!is_submodule_active(the_repository, path)) {
+		print_status(cb_flags, '-', path, ce_oid, displaypath);
+		goto cleanup;
+	}
+
+	argv_array_pushl(&diff_files_args, "diff-files",
+			 "--ignore-submodules=dirty", "--quiet", "--",
+			 path, NULL);
+
+	git_config(git_diff_basic_config, NULL);
+	init_revisions(&rev, prefix);
+	rev.abbrev = 0;
+	precompose_argv(diff_files_args.argc, diff_files_args.argv);
+	diff_files_args.argc = setup_revisions(diff_files_args.argc,
+					       diff_files_args.argv,
+					       &rev, NULL);
+	diff_files_result = run_diff_files(&rev, 0);
+
+	if (!diff_result_code(&rev.diffopt, diff_files_result)) {
+		print_status(cb_flags, ' ', path, ce_oid,
+			     displaypath);
+	} else if (!(cb_flags & CB_OPT_CACHED)) {
+		struct object_id oid;
+
+		if (refs_head_ref(get_submodule_ref_store(path),
+				  handle_submodule_head_ref, &oid))
+			die(_("could not resolve HEAD ref inside the"
+			      "submodule '%s'"), path);
+
+		print_status(cb_flags, '+', path, &oid, displaypath);
+	} else {
+		print_status(cb_flags, '+', path, ce_oid, displaypath);
+	}
+
+	if (cb_flags & CB_OPT_RECURSIVE) {
+		struct child_process cpr = CHILD_PROCESS_INIT;
+
+		cpr.git_cmd = 1;
+		cpr.dir = path;
+		prepare_submodule_repo_env(&cpr.env_array);
+
+		argv_array_push(&cpr.args, "--super-prefix");
+		argv_array_pushf(&cpr.args, "%s/", displaypath);
+		argv_array_pushl(&cpr.args, "submodule--helper", "status",
+				 "--recursive", NULL);
+
+		if (cb_flags & CB_OPT_CACHED)
+			argv_array_push(&cpr.args, "--cached");
+
+		if (cb_flags & CB_OPT_QUIET)
+			argv_array_push(&cpr.args, "--quiet");
+
+		if (run_command(&cpr))
+			die(_("failed to recurse into submodule '%s'"), path);
+	}
+
+cleanup:
+	argv_array_clear(&diff_files_args);
+	free(displaypath);
+}
+
+static void status_submodule_cb(const struct cache_entry *list_item,
+				void *cb_data)
+{
+	struct status_cb *info = cb_data;
+	status_submodule(list_item->name, &list_item->oid, list_item->ce_flags,
+			 info->prefix, info->cb_flags);
+}
+
+static int module_status(int argc, const char **argv, const char *prefix)
+{
+	struct status_cb info = STATUS_CB_INIT;
+	struct pathspec pathspec;
+	struct module_list list = MODULE_LIST_INIT;
+	int quiet = 0;
+
+	struct option module_status_options[] = {
+		OPT__QUIET(&quiet, N_("Suppress submodule status output")),
+		OPT_BIT(0, "cached", &info.cb_flags, N_("Use commit stored in the index instead of the one stored in the submodule HEAD"), CB_OPT_CACHED),
+		OPT_BIT(0, "recursive", &info.cb_flags, N_("recurse into nested submodules"), CB_OPT_RECURSIVE),
+		OPT_END()
+	};
+
+	const char *const git_submodule_helper_usage[] = {
+		N_("git submodule status [--quiet] [--cached] [--recursive] [<path>...]"),
+		NULL
+	};
+
+	argc = parse_options(argc, argv, prefix, module_status_options,
+			     git_submodule_helper_usage, 0);
+
+	if (module_list_compute(argc, argv, prefix, &pathspec, &list) < 0)
+		return 1;
+
+	info.prefix = prefix;
+	if (quiet)
+		info.cb_flags |= CB_OPT_QUIET;
+
+	for_each_listed_submodule(&list, status_submodule_cb, &info);
+
+	return 0;
+}
+
 static int module_name(int argc, const char **argv, const char *prefix)
 {
 	const struct submodule *sub;
@@ -1300,6 +1506,7 @@ static struct cmd_struct commands[] = {
 	{"resolve-relative-url", resolve_relative_url, 0},
 	{"resolve-relative-url-test", resolve_relative_url_test, 0},
 	{"init", module_init, SUPPORT_SUPER_PREFIX},
+	{"status", module_status, SUPPORT_SUPER_PREFIX},
 	{"remote-branch", resolve_remote_submodule_branch, 0},
 	{"push-check", push_check, 0},
 	{"absorb-git-dirs", absorb_git_dirs, SUPPORT_SUPER_PREFIX},
diff --git a/git-submodule.sh b/git-submodule.sh
index 66d1ae8ef..156255a9e 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -758,18 +758,6 @@ cmd_update()
 	}
 }
 
-set_name_rev () {
-	revname=$( (
-		sanitize_submodule_env
-		cd "$1" && {
-			git describe "$2" 2>/dev/null ||
-			git describe --tags "$2" 2>/dev/null ||
-			git describe --contains "$2" 2>/dev/null ||
-			git describe --all --always "$2"
-		}
-	) )
-	test -z "$revname" || revname=" ($revname)"
-}
 #
 # Show commit summary for submodules in index or working tree
 #
@@ -1016,54 +1004,7 @@ cmd_status()
 		shift
 	done
 
-	{
-		git submodule--helper list --prefix "$wt_prefix" "$@" ||
-		echo "#unmatched" $?
-	} |
-	while read -r mode sha1 stage sm_path
-	do
-		die_if_unmatched "$mode" "$sha1"
-		name=$(git submodule--helper name "$sm_path") || exit
-		displaypath=$(git submodule--helper relative-path "$prefix$sm_path" "$wt_prefix")
-		if test "$stage" = U
-		then
-			say "U$sha1 $displaypath"
-			continue
-		fi
-		if ! git submodule--helper is-active "$sm_path" ||
-		{
-			! test -d "$sm_path"/.git &&
-			! test -f "$sm_path"/.git
-		}
-		then
-			say "-$sha1 $displaypath"
-			continue;
-		fi
-		if git diff-files --ignore-submodules=dirty --quiet -- "$sm_path"
-		then
-			set_name_rev "$sm_path" "$sha1"
-			say " $sha1 $displaypath$revname"
-		else
-			if test -z "$cached"
-			then
-				sha1=$(sanitize_submodule_env; cd "$sm_path" && git rev-parse --verify HEAD)
-			fi
-			set_name_rev "$sm_path" "$sha1"
-			say "+$sha1 $displaypath$revname"
-		fi
-
-		if test -n "$recursive"
-		then
-			(
-				prefix="$displaypath/"
-				sanitize_submodule_env
-				wt_prefix=
-				cd "$sm_path" &&
-				eval cmd_status
-			) ||
-			die "$(eval_gettext "Failed to recurse into submodule path '\$sm_path'")"
-		fi
-	done
+	git ${wt_prefix:+-C "$wt_prefix"} ${prefix:+--super-prefix "$prefix"} submodule--helper status ${GIT_QUIET:+--quiet} ${cached:+--cached} ${recursive:+--recursive} "$@"
 }
 #
 # Sync remote urls for submodules
-- 
2.13.0


^ permalink raw reply	[relevance 19%]

* Re: [PATCH v4] technical doc: add a design doc for hash function transition
  2017-09-29  6:06         ` Junio C Hamano
@ 2017-09-29 17:34           ` Jonathan Nieder
  0 siblings, 0 replies; 200+ results
From: Jonathan Nieder @ 2017-09-29 17:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn Pearce, Linus Torvalds, Git Mailing List, Stefan Beller, bmwill, Jonathan Tan, Jeff King, David Lang, brian m. carlson, Masaya Suzuki, demerphq, The Keccak Team, Johannes Schindelin

Junio C Hamano wrote:
> Jonathan Nieder <jrnieder@gmail.com> writes:

>> This document describes what a transition to a new hash function for
>> Git would look like.  Add it to Documentation/technical/ as the plan
>> of record so that future changes can be recorded as patches.
>>
>> Also-by: Brandon Williams <bmwill@google.com>
>> Also-by: Jonathan Tan <jonathantanmy@google.com>
>> Also-by: Stefan Beller <sbeller@google.com>
>> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
>> ---
>
> Shoudln't these all be s-o-b: (with a note immediately before that
> to say all four contributed equally or something)?

I don't want to get lost in the weeds in the question of how to
represent such a collaborative effort in git's metadata.

You're right that I should collect their sign-offs!  Your approach of
using text instead of machine-readable data for common authorship also
seems okay.

In any event, this is indeed

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>

(I just checked :)).

>> +Background
>> +----------
>> +At its core, the Git version control system is a content addressable
>> +filesystem. It uses the SHA-1 hash function to name content. For
>> +example, files, directories, and revisions are referred to by hash
>> +values unlike in other traditional version control systems where files
>> +or versions are referred to via sequential numbers. The use of a hash
>
> Traditional systems refer to files via numbers???  Perhaps "where
> versions of files are referred to via sequential numbers" or
> something?

Good point.  The wording you suggested will work well.

>> +function to address its content delivers a few advantages:
>> +
>> +* Integrity checking is easy. Bit flips, for example, are easily
>> +  detected, as the hash of corrupted content does not match its name.
>> +* Lookup of objects is fast.
>
> * There is no ambiguity what the object's name should be, given its
>   content.
>
> * Deduping the same content copied across versions and paths is
>   automatic.

:)  Yep, these are nice too, especially that second one.

It also is how we make diff-ing fast.

>> +SHA-1 still possesses the other properties such as fast object lookup
>> +and safe error checking, but other hash functions are equally suitable
>> +that are believed to be cryptographically secure.
>
> s/secure/more &/, perhaps?

We were looking for a phrase meaning that it should be a cryptographic
hash function in good standing, which SHA-1 is at least approaching
not being.

"more secure" should work fine.  Let's go with that.

>> +Goals
>> +-----
>> +...
>> +   c. Users can use SHA-1 and NewHash identifiers for objects
>> +      interchangeably (see "Object names on the command line", below).
>
> Mental note.  This needs to extend to the "index X..Y" lines in the
> patch output, which is used by "apply -3" and "am -3".

Will add a note about this to "Object names on the command line".  Stefan
had already pointed out that that section should really be renamed to
something like "Object names in input and output".

>> +2. Allow a complete transition away from SHA-1.
>> +   a. Local metadata for SHA-1 compatibility can be removed from a
>> +      repository if compatibility with SHA-1 is no longer needed.
>
> I like the emphasis on "Local" here.  Metadata for compatiblity that
> is embedded in the objects obviously cannot be removed.
>
> From that point of view, one of the goals ought to be "make sure
> that as much SHA-1 compatibility metadata as possible is local and
> outside the object".  This goal may not be able to say more than "as
> much as possible", as signed objects that came from SHA-1 world
> needs to carry the compatibility metadata somewhere somehow.
>
> Or perhaps we could.  There is nothing that says a signed tag
> created in the SHA-1 world must have the PGP/SHA-1 signature in the
> NewHash payload---it could be split off of the object data and
> stored in a local metadata cache, to be used only when we need to
> convert it back to the SHA-1 world.

That would break round-tripping and would mean that multiple SHA-1
objects could have the same NewHash name.  In other words, from
my point of view there is something that says that such data must
be preserved.

Another way to put it: even after removing all SHA-1 compatibility
metadata, one nice feature of this design is that it can be recovered
if I change my mind, from data in the NewHash based repository alone.

[...]
>> +Non-Goals
>> +---------
>> ...
>> +6. Skip fetching some submodules of a project into a NewHash
>> +   repository. (This also depends on NewHash support in Git
>> +   protocol.)
>
> It is unclear what this means.  Around submodule support, one thing
> I can think of is that a NewHash tree in a superproject would record
> a gitlink that is a NewHash commit object name in it, therefore it
> cannot refer to an unconverted SHA-1 submodule repository.  But it
> is unclear if the above description refers to the same issue, or
> something else.

It refers to that issue.

[...]
>> +Overview
>> +--------
>> +We introduce a new repository format extension. Repositories with this
>> +extension enabled use NewHash instead of SHA-1 to name their objects.
>> +This affects both object names and object content --- both the names
>> +of objects and all references to other objects within an object are
>> +switched to the new hash function.
>> +
>> +NewHash repositories cannot be read by older versions of Git.
>> +
>> +Alongside the packfile, a NewHash repository stores a bidirectional
>> +mapping between NewHash and SHA-1 object names. The mapping is generated
>> +locally and can be verified using "git fsck". Object lookups use this
>> +mapping to allow naming objects using either their SHA-1 and NewHash names
>> +interchangeably.
>> +
>> +"git cat-file" and "git hash-object" gain options to display an object
>> +in its sha1 form and write an object given its sha1 form.
>
> Both of these are somewhat unclear.

I think we can delete this paragraph.  It was written before the
"Object names on the command line" section that goes into such issues
in more detail.

[...]
>> +Object names
>> +~~~~~~~~~~~~
>> +Objects can be named by their 40 hexadecimal digit sha1-name or 64
>> +hexadecimal digit newhash-name, plus names derived from those (see
>> +gitrevisions(7)).
>> +
>> +The sha1-name of an object is the SHA-1 of the concatenation of its
>> +type, length, a nul byte, and the object's sha1-content. This is the
>> +traditional <sha1> used in Git to name objects.
>> +
>> +The newhash-name of an object is the NewHash of the concatenation of its
>> +type, length, a nul byte, and the object's newhash-content.
>
> It makes me wonder if we want to add the hashname in this object
> header.  "length" would be different for non-blob objects anyway,
> and it is not "compat metadata" we want to avoid baked in, yet it
> would help diagnose a mistake of attempting to use a "mixed" objects
> in a single repository.  Not a big issue, though.

Do you mean that adding the hashname into the computation that
produces the object name would help in some use case?

Or do you mean storing the hashname on disk somewhere, even if it
doesn't enter into the object name?  For the latter, we store the
hashname in the .git/config extensions.* configuration and the pack
index files.  You also suggested storing the hash name in
.git/objects/loose-object-idx, which seems to me like a good idea.

We didn't touch on the .pack format but we probably need to (if only
because of the size of REF_DELTAs and the cksum trailer), and it would
also need to name what object format it is using.

For loose objects, it would be nice to name the hash in the file, so
that "file" can understand what is happening if someone accidentally
mixes types using "cp".  The only downside is losing the ability to
copy blobs (which have the same content despite being named using
different hashes) between repositories after determining their new
names.  That doesn't seem like a strong downside --- it's pretty
harmless to include the hash type in loose object files, too.  I think
I would prefer this to be a "magic number" instead of part of the
zlib-deflated payload, since this way "file" can discover it more
easily.

>> +The format allows round-trip conversion between newhash-content and
>> +sha1-content.
>
> If it is a goal to eventually be able to lose SHA-1 compatibility
> metadata from the objects, then we might want to remove SHA-1 based
> signature bits (e.g. PGP trailer in signed tag, gpgsig header in the
> commit object) from NewHash contents, and instead have them stored
> in a side "metadata" table, only to be used while converting back.
> I dunno if that is desirable.

I don't consider that desirable.

A SHA-1 based signature is still of historical interest even if my
centuries-newer version of Git is not able to verify it.

[...]
> I take it to mean that the stride is defined in the "length in bytes
> of shortened object names" in the file header.  If so, I can see how
> this would work.  This "sorted table", unlike the next one, does not
> say how it is sorted, but I assume this is just the object name
> order (as opposed to the pack location order the next table uses)?

Yes.  Will clarify.

>> +  - A table of full object names in pack order. This allows resolving
>> +    a reference to "the nth object in the pack file" (from a
>> +    reachability bitmap or from the next table of another object
>> +    format) to its object name.
>> +
>> +  - A table of 4-byte values mapping object name order to pack order.
>> +    For an object in the table of sorted shortened object names, the
>> +    value at the corresponding index in this table is the index in the
>> +    previous table for that same object.
>> +
>> +    This can be used to look up the object in reachability bitmaps or
>> +    to look up its name in another object format.
>
> And this is a separate table because the short-name table wants to
> be as compact as possible for binary search?  Otherwise an entry in
> the short-name table could be <pack order number, n-bytes that is
> short unique prefix>.

Yes.  The idx v2 format has a similar design.

>> +  - A table of 4-byte CRC32 values of the packed object data, in the
>> +    order that the objects appear in the pack file. This is to allow
>> +    compressed data to be copied directly from pack to pack during
>> +    repacking without undetected data corruption.
>
> An obvious alternative would be to have the CRC32 checksum near
> (e.g. immediately before) the object data in the packfile (as
> opposed to the .idx file like this document specifies).  I am not
> sure what the pros and cons are between the two, though, and that is
> why I mention the possiblity here.

As you mentioned under separate cover, it is useful for derived data
like this to be outside the packfile.

> Hmm, as the corresponding packfile stores object data only in
> NewHash content format, it is somewhat curious that this table that
> stores CRC32 of the data appears in the "Tables for each object
> format" section, as they would be identical, no?  Unless I am
> grossly misleading the spec, the checksum should either go outside
> the "Tables for each object format" section but still in .idx, or
> should be eliminated and become part of the packdata stream instead,
> perhaps?

It's actually only present for the first object format.  Will find a
better way to describe this.

>> +  - A table of 4-byte offset values. For an object in the table of
>> +    sorted shortened object names, the value at the corresponding
>> +    index in this table indicates where that object can be found in
>> +    the pack file. These are usually 31-bit pack file offsets, but
>> +    large offsets are encoded as an index into the next table with the
>> +    most significant bit set.
>
> Oy.  So we can go from a short prefix to the pack location by first
> finding it via binsearch in the short-name table, realize that it is
> nth object in the object name order, and consulting this table.
> When we know the pack-order of an object, there is no direct way to
> go to its location (short of reversing the name-order-to-pack-order
> table)?

An earlier version of the design also had a pack-order-to-pack-offset
table, but we weren't able to think of any cases where that would be
used without also looking up the object name that can be used to
verify the integrity of the inflated object.

Do you have an application in mind?

[...]
>> +- Tables for the second object format, with the same layout as above,
>> +  up to and not including the table of CRC32 values.
>> +- Zero or more NUL bytes.
>> +- The trailer consists of the following:
>> +  - A copy of the 20-byte NewHash checksum at the end of the
>> +    corresponding packfile.
>> +
>> +  - 20-byte NewHash checksum of all of the above.
>
> When did NewHash shrink to 20-byte suddenly?  I think the above two
> are both "32-byte"?

Yes, good catch.

[...]
>> +Loose object index
>> +~~~~~~~~~~~~~~~~~~
>> +A new file $GIT_OBJECT_DIR/loose-object-idx contains information about
>> +all loose objects. Its format is
>> +
>> +  # loose-object-idx
>> +  (newhash-name SP sha1-name LF)*
>> +
>> +where the object names are in hexadecimal format. The file is not
>> +sorted.
>
> Shouldn't the file somehow say what hashes are involved to allow us
> match it with extension.{objectFormat,compatObjectFormat}, perhaps
> at the end of the "# loose-object-idx" line?

Good idea!

[...]
>> +The loose object index is protected against concurrent writes by a
>> +lock file $GIT_OBJECT_DIR/loose-object-idx.lock. To add a new loose
>> +object:
>> +
>> +1. Write the loose object to a temporary file, like today.
>> +2. Open loose-object-idx.lock with O_CREAT | O_EXCL to acquire the lock.
>> +3. Rename the loose object into place.
>> +4. Open loose-object-idx with O_APPEND and write the new object
>
> "write the new entry, fsync and close"?

Yes, I think we do need to fsync. :/

[...]
>> +Translation table
>> +~~~~~~~~~~~~~~~~~
>> +The index files support a bidirectional mapping between sha1-names
>> +and newhash-names. The lookup proceeds similarly to ordinary object
>> +lookups. For example, to convert a sha1-name to a newhash-name:
>> +
>> + 1. Look for the object in idx files. If a match is present in the
>> +    idx's sorted list of truncated sha1-names, then:
>> +    a. Read the corresponding entry in the sha1-name order to pack
>> +       name order mapping.
>> +    b. Read the corresponding entry in the full sha1-name table to
>> +       verify we found the right object. If it is, then
>> +    c. Read the corresponding entry in the full newhash-name table.
>> +       That is the object's newhash-name.
>
> c. is possible because b. and c. are sorted the same way, i.e. the
> index used to consult the full sha1-name table, which is the pack
> order number, can be used to find its full newhash in the "full
> newhash sorted by pack order" table?

Yes.

>> +Reading an object's sha1-content
>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> I'd stop here and continue in a separate message.  Thanks for a
> detailed write-up.

Thanks for looking it over.

Jonathan

^ permalink raw reply	[relevance 5%]

* Re: [PATCH v6 2/3] submodule--helper: introduce for_each_listed_submodule()
  2017-09-29  9:44   ` [PATCH v6 2/3] submodule--helper: introduce for_each_listed_submodule() Prathamesh Chavan
@ 2017-10-02  0:55     ` Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2017-10-02  0:55 UTC (permalink / raw)
  To: Prathamesh Chavan; +Cc: christian.couder, git, hanwen, sbeller

Prathamesh Chavan <pc44800@gmail.com> writes:

> Introduce function for_each_listed_submodule() and replace a loop
> in module_init() with a call to it.
>
> The new function will also be used in other parts of the
> system in later patches.
>
> Mentored-by: Christian Couder <christian.couder@gmail.com>
> Mentored-by: Stefan Beller <sbeller@google.com>
> Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
> ---
>  builtin/submodule--helper.c | 39 ++++++++++++++++++++++++++++++++++-----
>  1 file changed, 34 insertions(+), 5 deletions(-)
>
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index cdae54426..20a1ef868 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -14,6 +14,11 @@
>  #include "refs.h"
>  #include "connect.h"
>  
> +#define CB_OPT_QUIET		(1<<0)

Is the purpose of this bit to make the callback quiet?  I do not
think so.  Is there a reason why we cannot call it just OPT_QUIET or
something instead?

When the set of functions that pay attention to these flags include
both ones that are callable for a single submodule and ones meant as
callbacks for for-each interface, having to flip bit whose name
screams "CallBack!" in a caller of a single-short version feels very
wrong.

"make style" tells me to format the above like so:

	#define OPT_QUIET (1 << 0)

and I think I agree.

> @@ -349,7 +354,22 @@ static int module_list(int argc, const char **argv, const char *prefix)
>  	return 0;
>  }
>  
> -static void init_submodule(const char *path, const char *prefix, int quiet)
> +static void for_each_listed_submodule(const struct module_list *list,
> +				      each_submodule_fn fn, void *cb_data)
> +{
> +	int i;
> +	for (i = 0; i < list->nr; i++)
> +		fn(list->entries[i], cb_data);
> +}

Good.

> +struct init_cb {

I take it is a short-hand for "submodule init callback"?  As long as
the name stays inside this file, I think we are OK.

> +	const char *prefix;
> +	unsigned int cb_flags;

Call this just "flags"; call-back ness is plenty clear from the fact
that it lives in a structure meant as a callback interface already.

> +};

Blank line here?

> +#define INIT_CB_INIT { NULL, 0 }
> +
> +static void init_submodule(const char *path, const char *prefix,
> +			   unsigned int cb_flags)

Call this also "flags"; a direct caller of this function that wants
to initialize a single submodule without going thru the for-each
callback interface would not be passing "callback flags"--they are
just passing a set of flags.

^ permalink raw reply	[relevance 16%]

* Re: git submodule add fails when using both --branch and --depth
  2017-09-30 17:20 git submodule add fails when using both --branch and --depth Thadeus Fleming
@ 2017-10-02 19:55 ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-10-02 19:55 UTC (permalink / raw)
  To: Thadeus Fleming; +Cc: git

On Sat, Sep 30, 2017 at 10:20 AM, Thadeus Fleming
<thadeus.j.fleming@gmail.com> wrote:
> I'm running git 2.14.2 on Ubuntu 16.04.
>
> Compare the behavior of
>
>> git clone --branch pu --depth 1 https://github.com/git/git git-pu
>
> which clones only the latest commit of the pu branch and
>
>> mkdir tmp && cd tmp && git init
>> git submodule add --branch pu --depth 1 https://github.com/git/git \
>   git-pu
>
> which gives the error
>
> fatal: 'origin/pu' is not a commit and a branch 'pu' cannot be created
> from it
> Unable to checkout submodule 'git-pu'
>
> Investigating further, there is indeed only one commit in the local repo:
>
>> cd git-pu
>> git log --oneline | wc -l
> 1
>
> But that commit is the head of master.
>
>> git branch -a
> * master                                 remotes/origin/master
>   remotes/origin/HEAD -> origin/master
>
> This appears to be because git-submodule--helper does not accept a
> --branch option. Using the --depth N option causes it to only clone N
> commits from the default branch, which generally do not include the
> desired branch. Thus, the next step,
>
> git checkout -f -q -B "$branch" "origin/$branch"
>
> fails, and provides the rather confusing error message above.
>
> I'd suggest that git-submodule--helper learn a --branch option
> consistent with git clone, and if that is impossible, that
> git submodule add rejects the simultaneous use of both the --branch and
> --depth options.

Adding the branch field to the submodule helper is a great idea.

>
>
>
> --tjf

^ permalink raw reply	[relevance 23%]

* Re: Security of .git/config and .git/hooks
  2017-10-03 12:32 ` Jeff King
@ 2017-10-03 15:10   ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-10-03 15:10 UTC (permalink / raw)
  To: Jeff King; +Cc: Jonathan Nieder, git, Loic Guelorget, Sitaram Chamarty

So once upon a time we compared Gits security model with a
web browser. A web browser lets you execute 3rd party code
(e.g. javascript) and it is supposedly safe to look at malicious sites.

Currently Git only promises to have the clone/fetch operation safe,
not the "here is a zip of my whole repo" case, which sounds more
like the web browser experience ("here is a site with js, even zipped
in transfer"). Tightening the security model of Git towards this seems
like a good idea to me.

>>  1. Introduce a (configurable) list of "safe" configuration items that
>>     can be set in .git/config and don't respect any others.
>
> A whitelist is obviously safer than a blacklist. Though I also feel like
> some of the options may give an unexpectedly wide attack surface. I.e.,
> I wouldn't be surprised if some innocent-looking option ends up being
> used in a tricky way to gain more access. E.g., submodule config
> pointing to paths outside of the repository.
>
> Do you plan to run in safe-mode all the time? What if safe-mode was a
> lot more extreme, and simply avoided reading repo-level config at all
> (except for check_repository_format(), which should be pretty innocent).
>
> I have a feeling there are some features (like submodules) that would
> simply be broken in safe-mode.

I would think that the essential submodule things would be "safe"
to look at. But e.g. submodule.<name>.update = "!rm -rf /" would be
not ok, hence the .update configuration would be in the unsafe space.

Any unsafe config option would need to be set outside the actual
repository (~/.config/git/<repo-id>/config ?)

>
>>  2. But what if I want to set a different pager per-repository?
>>     I think we could do this using configuration "profiles".
>>     My ~/.config/git/profiles/ directory would contain git-style
>>     config files for repositories to include.  Repositories could
>>     then contain
>>
>>       [include]
>>               path = ~/.config/git/profiles/fancy-log-pager
>>
>>     to make use of those settings.  The facility (1) would
>>     special-case this directory to allow it to set "unsafe" settings
>>     since files there are assumed not to be under the control of an
>>     attacker.
>
> You can do something quite similar already:
>
>   git config --global \
>     include.gitdir:/path/to/specific/repo.path
>     .gitconfig-fancy-log-pager
>
> The main difference is that the profile inclusion is done by path rather
> than riding along with the repository directory if it gets moved. In
> practice I doubt that matters much, and I think the security model for
> include.gitdir is a lot simpler.

I am not sure if this works so well for the submodule.<name>.update
config (that we want to deprecate anyway, but still)

>> For backward compatibility, this facility would be controlled by a
>> global configuration setting.  If that setting is not enabled, then
>> the current, less safe behavior would remain.
>
> Are config and symlinks everything we need to care about? I can't think
> of anything else, but Git really has quite a large attack surface when
> accessing a local repo. Right now the safest thing you can do is "git
> clone --no-local" an untrusted repo and then look only at the clone. Of
> course nobody _actually_ does that, so any "safe" mode seems like it
> would be an improvement. But would claiming to have a "safe" mode
> encourage people to use it to look at risky repositories, exacerbating
> any holes (e.g., exploiting a bug in the index format)? I don't know.

Good point. Though we only care about the case of breaking out and
executing untrusted code; most of the index exploits would rather
trigger a segfault or infinite lop (in my imagination at least).

>
> -Peff

^ permalink raw reply	[relevance 17%]

* [ANNOUNCE] Git v2.15.0-rc0
@ 2017-10-05  5:55 Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2017-10-05  5:55 UTC (permalink / raw)
  To: git; +Cc: Linux Kernel

An early preview release Git v2.15.0-rc0 is now available for
testing at the usual places.  It is comprised of 672 non-merge
commits since v2.14.0, contributed by 66 people, 20 of which are
new faces.

The tarballs are found at:

    https://www.kernel.org/pub/software/scm/git/testing/

The following public repositories all have a copy of the
'v2.15.0-rc0' tag and the 'master' branch that the tag points at:

  url = https://kernel.googlesource.com/pub/scm/git/git
  url = git://repo.or.cz/alt-git.git
  url = https://github.com/gitster/git

New contributors whose contributions weren't in v2.14.0 are as follows.
Welcome to the Git development community!

  Ann T Ropea, Daniel Watkins, Dimitrios Christidis, Eric Rannaud,
  Evan Zacks, Hielke Christian Braun, Ian Campbell, Ilya Kantor,
  Jameson Miller, Job Snijders, Joel Teichroeb, joernchen,
  Łukasz Gryglicki, Manav Rathi, Martin Ågren, Michael Forney,
  Patryk Obara, Rene Scharfe, Ross Kabus, and Urs Thuermann.

Returning contributors who helped this release are as follows.
Thanks for your continued support.

  Adam Dinwoodie, Ævar Arnfjörð Bjarmason, Andreas Heiduk,
  Anthony Sottile, Ben Boeckel, Brandon Casey, Brandon Williams,
  brian m. carlson, Christian Couder, Eric Blake, Han-Wen Nienhuys,
  Heiko Voigt, Jeff Hostetler, Jeff King, Johannes Schindelin,
  Jonathan Nieder, Jonathan Tan, Junio C Hamano, Kaartic Sivaraam,
  Kevin Daudt, Kevin Willford, Lars Schneider, Martin Koegler,
  Matthieu Moy, Max Kirillov, Michael Haggerty, Michael J Gruber,
  Nguyễn Thái Ngọc Duy, Nicolas Morey-Chaisemartin, Øystein
  Walle, Paolo Bonzini, Pat Thoyts, Philip Oakley, Phillip
  Wood, Raman Gupta, Ramsay Jones, René Scharfe, Sahil Dua,
  Santiago Torres, Stefan Beller, Stephan Beyer, Takashi Iwai,
  Thomas Gummerer, Tom G. Christensen, Torsten Bögershausen,
  and William Duclot.

----------------------------------------------------------------

Git 2.15 Release Notes (draft)
==============================

Backward compatibility notes and other notable changes.

 * Use of an empty string as a pathspec element that is used for
   'everything matches' is still warned and Git asks users to use a
   more explicit '.' for that instead.  The hope is that existing
   users will not mind this change, and eventually the warning can be
   turned into a hard error, upgrading the deprecation into removal of
   this (mis)feature.  That is now scheduled to happen in the upcoming
   release.

 * Git now avoids blindly falling back to ".git" when the setup
   sequence said we are _not_ in Git repository.  A corner case that
   happens to work right now may be broken by a call to die("BUG").
   We've tried hard to locate such cases and fixed them, but there
   might still be cases that need to be addressed--bug reports are
   greatly appreciated.

 * "branch --set-upstream" that has been deprecated in Git 1.8 has
   finally been retired.


Updates since v2.14
-------------------

UI, Workflows & Features

 * An example that is now obsolete has been removed from a sample hook,
   and an old example in it that added a sign-off manually has been
   improved to use the interpret-trailers command.

 * The advice message given when "git rebase" stops for conflicting
   changes has been improved.

 * The "rerere-train" script (in contrib/) learned the "--overwrite"
   option to allow overwriting existing recorded resolutions.

 * "git contacts" (in contrib/) now lists the address on the
   "Reported-by:" trailer to its output, in addition to those on
   S-o-b: and other trailers, to make it easier to notify (and thank)
   the original bug reporter.

 * "git rebase", especially when it is run by mistake and ends up
   trying to replay many changes, spent long time in silence.  The
   command has been taught to show progress report when it spends
   long time preparing these many changes to replay (which would give
   the user a chance to abort with ^C).

 * "git merge" learned a "--signoff" option to add the Signed-off-by:
   trailer with the committer's name.

 * "git diff" learned to optionally paint new lines that are the same
   as deleted lines elsewhere differently from genuinely new lines.

 * "git interpret-trailers" learned to take the trailer specifications
   from the command line that overrides the configured values.

 * "git interpret-trailers" has been taught a "--parse" and a few
   other options to make it easier for scripts to grab existing
   trailer lines from a commit log message.

 * "gitweb" shows a link to visit the 'raw' contents of blbos in the
   history overview page.

 * "[gc] rerereResolved = 5.days" used to be invalid, as the variable
   is defined to take an integer counting the number of days.  It now
   is allowed.

 * The code to acquire a lock on a reference (e.g. while accepting a
   push from a client) used to immediately fail when the reference is
   already locked---now it waits for a very short while and retries,
   which can make it succeed if the lock holder was holding it during
   a read-only operation.

 * "branch --set-upstream" that has been deprecated in Git 1.8 has
   finally been retired.

 * The codepath to call external process filter for smudge/clean
   operation learned to show the progress meter.

 * "git rev-parse" learned "--is-shallow-repository", that is to be
   used in a way similar to existing "--is-bare-repository" and
   friends.

 * "git describe --match <pattern>" has been taught to play well with
   the "--all" option.

 * "git branch" learned "-c/-C" to create a new branch by copying an
   existing one.

 * Some commands (most notably "git status") makes an opportunistic
   update when performing a read-only operation to help optimize later
   operations in the same repository.  The new "--no-optional-locks"
   option can be passed to Git to disable them.


Performance, Internal Implementation, Development Support etc.

 * Conversion from uchar[20] to struct object_id continues.

 * Start using selected c99 constructs in small, stable and
   essentialpart of the system to catch people who care about
   older compilers that do not grok them.

 * The filter-process interface learned to allow a process with long
   latency give a "delayed" response.

 * Many uses of comparision callback function the hashmap API uses
   cast the callback function type when registering it to
   hashmap_init(), which defeats the compile time type checking when
   the callback interface changes (e.g. gaining more parameters).
   The callback implementations have been updated to take "void *"
   pointers and cast them to the type they expect instead.

 * Because recent Git for Windows do come with a real msgfmt, the
   build procedure for git-gui has been updated to use it instead of a
   hand-rolled substitute.

 * "git grep --recurse-submodules" has been reworked to give a more
   consistent output across submodule boundary (and do its thing
   without having to fork a separate process).

 * A helper function to read a single whole line into strbuf
   mistakenly triggered OOM error at EOF under certain conditions,
   which has been fixed.
   (merge 642956cf45 rs/strbuf-getwholeline-fix later to maint).

 * The "ref-store" code reorganization continues.

 * "git commit" used to discard the index and re-read from the filesystem
   just in case the pre-commit hook has updated it in the middle; this
   has been optimized out when we know we do not run the pre-commit hook.
   (merge 680ee550d7 kw/commit-keep-index-when-pre-commit-is-not-run later to maint).

 * Updates to the HTTP layer we made recently unconditionally used
   features of libCurl without checking the existence of them, causing
   compilation errors, which has been fixed.  Also migrate the code to
   check feature macros, not version numbers, to cope better with
   libCurl that vendor ships with backported features.

 * The API to start showing progress meter after a short delay has
   been simplified.
   (merge 8aade107dd jc/simplify-progress later to maint).

 * Code clean-up to avoid mixing values read from the .gitmodules file
   and values read from the .git/config file.

 * We used to spend more than necessary cycles allocating and freeing
   piece of memory while writing each index entry out.  This has been
   optimized.

 * Platforms that ship with a separate sha1 with collision detection
   library can link to it instead of using the copy we ship as part of
   our source tree.

 * Code around "notes" have been cleaned up.
   (merge 3964281524 mh/notes-cleanup later to maint).

 * The long-standing rule that an in-core lockfile instance, once it
   is used, must not be freed, has been lifted and the lockfile and
   tempfile APIs have been updated to reduce the chance of programming
   errors.

 * Our hashmap implementation in hashmap.[ch] is not thread-safe when
   adding a new item needs to expand the hashtable by rehashing; add
   an API to disable the automatic rehashing to work it around.

 * Many of our programs consider that it is OK to release dynamic
   storage that is used throughout the life of the program by simply
   exiting, but this makes it harder to leak detection tools to avoid
   reporting false positives.  Plug many existing leaks and introduce
   a mechanism for developers to mark that the region of memory
   pointed by a pointer is not lost/leaking to help these tools.

 * As "git commit" to conclude a conflicted "git merge" honors the
   commit-msg hook, "git merge" that records a merge commit that
   cleanly auto-merges should, but it didn't.

 * The codepath for "git merge-recursive" has been cleaned up.

 * Many leaks of strbuf have been fixed.

 * "git imap-send" has our own implementation of the protocol and also
   can use more recent libCurl with the imap protocol support.  Update
   the latter so that it can use the credential subsystem, and then
   make it the default option to use, so that we can eventually
   deprecate and remove the former.

 * "make style" runs git-clang-format to help developers by pointing
   out coding style issues.

 * A test to demonstrate "git mv" failing to adjust nested submodules
   has been added.
   (merge c514167df2 hv/mv-nested-submodules-test later to maint).

 * On Cygwin, "ulimit -s" does not report failure but it does not work
   at all, which causes an unexpected success of some tests that
   expect failures under a limited stack situation.  This has been
   fixed.

 * Many codepaths have been updated to squelch -Wimplicit-fallthrough
   warnings from Gcc 7 (which is a good code hygiene).

 * Add a helper for DLL loading in anticipation for its need in a
   future topic RSN.

 * "git status --ignored", when noticing that a directory without any
   tracked path is ignored, still enumerated all the ignored paths in
   the directory, which is unnecessary.  The codepath has been
   optimized to avoid this overhead.

 * The final batch to "git rebase -i" updates to move more code from
   the shell script to C has been merged.

 * Operations that do not touch (majority of) packed refs have been
   optimized by making accesses to packed-refs file lazy; we no longer
   pre-parse everything, and an access to a single ref in the
   packed-refs does not touch majority of irrelevant refs, either.

 * Add comment to clarify that the style file is meant to be used with
   clang-5 and the rules are still work in progress.

Also contains various documentation updates and code clean-ups.


Fixes since v2.14
-----------------

 * "%C(color name)" in the pretty print format always produced ANSI
   color escape codes, which was an early design mistake.  They now
   honor the configuration (e.g. "color.ui = never") and also tty-ness
   of the output medium.

 * The http.{sslkey,sslCert} configuration variables are to be
   interpreted as a pathname that honors "~[username]/" prefix, but
   weren't, which has been fixed.

 * Numerous bugs in walking of reflogs via "log -g" and friends have
   been fixed.

 * "git commit" when seeing an totally empty message said "you did not
   edit the message", which is clearly wrong.  The message has been
   corrected.

 * When a directory is not readable, "gitweb" fails to build the
   project list.  Work this around by skipping such a directory.

 * Some versions of GnuPG fails to kill gpg-agent it auto-spawned
   and such a left-over agent can interfere with a test.  Work it
   around by attempting to kill one before starting a new test.

 * A recently added test for the "credential-cache" helper revealed
   that EOF detection done around the time the connection to the cache
   daemon is torn down were flaky.  This was fixed by reacting to
   ECONNRESET and behaving as if we got an EOF.

 * "git log --tag=no-such-tag" showed log starting from HEAD, which
   has been fixed---it now shows nothing.

 * The "tag.pager" configuration variable was useless for those who
   actually create tag objects, as it interfered with the use of an
   editor.  A new mechanism has been introduced for commands to enable
   pager depending on what operation is being carried out to fix this,
   and then "git tag -l" is made to run pager by default.

 * "git push --recurse-submodules $there HEAD:$target" was not
   propagated down to the submodules, but now it is.

 * Commands like "git rebase" accepted the --rerere-autoupdate option
   from the command line, but did not always use it.  This has been
   fixed.

 * "git clone --recurse-submodules --quiet" did not pass the quiet
   option down to submodules.

 * Test portability fix for OBSD.

 * Portability fix for OBSD.

 * "git am -s" has been taught that some input may end with a trailer
   block that is not Signed-off-by: and it should refrain from adding
   an extra blank line before adding a new sign-off in such a case.

 * "git svn" used with "--localtime" option did not compute the tz
   offset for the timestamp in question and instead always used the
   current time, which has been corrected.

 * Memory leak in an error codepath has been plugged.

 * "git stash -u" used the contents of the committed version of the
   ".gitignore" file to decide which paths are ignored, even when the
   file has local changes.  The command has been taught to instead use
   the locally modified contents.

 * bash 4.4 or newer gave a warning on NUL byte in command
   substitution done in "git stash"; this has been squelched.

 * "git grep -L" and "git grep --quiet -L" reported different exit
   codes; this has been corrected.

 * When handshake with a subprocess filter notices that the process
   asked for an unknown capability, Git did not report what program
   the offending subprocess was running.  This has been corrected.

 * "git apply" that is used as a better "patch -p1" failed to apply a
   taken from a file with CRLF line endings to a file with CRLF line
   endings.  The root cause was because it misused convert_to_git()
   that tried to do "safe-crlf" processing by looking at the index
   entry at the same path, which is a nonsense---in that mode, "apply"
   is not working on the data in (or derived from) the index at all.
   This has been fixed.

 * Killing "git merge --edit" before the editor returns control left
   the repository in a state with MERGE_MSG but without MERGE_HEAD,
   which incorrectly tells the subsequent "git commit" that there was
   a squash merge in progress.  This has been fixed.

 * "git archive" did not work well with pathspecs and the
   export-ignore attribute.

 * In addition to "cc: <a@dd.re.ss> # cruft", "cc: a@dd.re.ss # cruft"
   was taught to "git send-email" as a valid way to tell it that it
   needs to also send a carbon copy to <a@dd.re.ss> in the trailer
   section.
   (merge cc90750677 mm/send-email-cc-cruft later to maint).

 * "git branch -M a b" while on a branch that is completely unrelated
   to either branch a or branch b misbehaved when multiple worktree
   was in use.  This has been fixed.
   (merge 31824d180d nd/worktree-kill-parse-ref later to maint).

 * "git gc" and friends when multiple worktrees are used off of a
   single repository did not consider the index and per-worktree refs
   of other worktrees as the root for reachability traversal, making
   objects that are in use only in other worktrees to be subject to
   garbage collection.

 * A regression to "gitk --bisect" by a recent update has been fixed.
   (merge 1d0538e486 mh/packed-ref-store-prep later to maint).

 * "git -c submodule.recurse=yes pull" did not work as if the
   "--recurse-submodules" option was given from the command line.
   This has been corrected.

 * Unlike "git commit-tree < file", "git commit-tree -F file" did not
   pass the contents of the file verbatim and instead completed an
   incomplete line at the end, if exists.  The latter has been updated
   to match the behaviour of the former.
   (merge c818e74332 rk/commit-tree-make-F-verbatim later to maint).

 * Many codepaths did not diagnose write failures correctly when disks
   go full, due to their misuse of write_in_full() helper function,
   which have been corrected.
   (merge f48ecd38cb jk/write-in-full-fix later to maint).

 * "git help co" now says "co is aliased to ...", not "git co is".
   (merge b3a8076e0d ks/help-alias-label later to maint).

 * "git archive", especially when used with pathspec, stored an empty
   directory in its output, even though Git itself never does so.
   This has been fixed.
   (merge 4318094047 rs/archive-excluded-directory later to maint).

 * API error-proofing which happens to also squelch warnings from GCC.
   (merge c788c54cde tg/refs-allowed-flags later to maint).

 * The explanation of the cut-line in the commit log editor has been
   slightly tweaked.
   (merge 8c4b1a3593 ks/commit-do-not-touch-cut-line later to maint).

 * "git gc" tries to avoid running two instances at the same time by
   reading and writing pid/host from and to a lock file; it used to
   use an incorrect fscanf() format when reading, which has been
   corrected.
   (merge afe2fab72c aw/gc-lockfile-fscanf-fix later to maint).

 * The scripts to drive TravisCI has been reorganized and then an
   optimization to avoid spending cycles on a branch whose tip is
   tagged has been implemented.
   (merge 8376eb4a8f ls/travis-scriptify later to maint).

 * The test linter has been taught that we do not like "echo -e".
   (merge 1a6d46895d tb/test-lint-echo-e later to maint).

 * Code cmp.std.c nitpick.
   (merge ac7da78ede mh/for-each-string-list-item-empty-fix later to maint).

 * A regression fix for 2.11 that made the code to read the list of
   alternate object stores overrun the end of the string.
   (merge f0f7bebef7 jk/info-alternates-fix later to maint).

 * "git describe --match" learned to take multiple patterns in v2.13
   series, but the feature ignored the patterns after the first one
   and did not work at all.  This has been fixed.
   (merge da769d2986 jk/describe-omit-some-refs later to maint).

 * "git filter-branch" cannot reproduce a history with a tag without
   the tagger field, which only ancient versions of Git allowed to be
   created.  This has been corrected.
   (merge b2c1ca6b4b ic/fix-filter-branch-to-handle-tag-without-tagger later to maint).

 * "git cat-file --textconv" started segfaulting recently, which
   has been corrected.
   (merge cc0ea7c9e5 jk/diff-blob later to maint).

 * The built-in pattern to detect the "function header" for HTML did
   not match <H1>..<H6> elements without any attributes, which has
   been fixed.
   (merge 9c03caca2c ik/userdiff-html-h-element-fix later to maint).

 * "git mailinfo" was loose in decoding quoted printable and produced
   garbage when the two letters after the equal sign are not
   hexadecimal.  This has been fixed.
   (merge c8cf423eab rs/mailinfo-qp-decode-fix later to maint).

 * The machinery to create xdelta used in pack files received the
   sizes of the data in size_t, but lost the higher bits of them by
   storing them in "unsigned int" during the computation, which is
   fixed.

 * The delta format used in the packfile cannot reference data at
   offset larger than what can be expressed in 4-byte, but the
   generator for the data failed to make sure the offset does not
   overflow.  This has been corrected.

 * The documentation for '-X<option>' for merges was misleadingly
   written to suggest that "-s theirs" exists, which is not the case.
   (merge c25d98b2a7 jc/merge-x-theirs-docfix later to maint).

 * "git fast-export" with -M/-C option issued "copy" instruction on a
   path that is simultaneously modified, which was incorrect.
   (merge b3e8ca89cf jt/fast-export-copy-modify-fix later to maint).

 * Many codepaths have been updated to squelch -Wsign-compare
   warnings.
   (merge 071bcaab64 rj/no-sign-compare later to maint).

 * Memory leaks in various codepaths have been plugged.
   (merge 4d01a7fa65 ma/leakplugs later to maint).

 * Recent versions of "git rev-parse --parseopt" did not parse the
   option specification that does not have the optional flags (*=?!)
   correctly, which has been corrected.
   (merge a6304fa4c2 bc/rev-parse-parseopt-fix later to maint).

 * The checkpoint command "git fast-import" did not flush updates to
   refs and marks unless at least one object was created since the
   last checkpoint, which has been corrected, as these things can
   happen without any new object getting created.
   (merge 30e215a65c er/fast-import-dump-refs-on-checkpoint later to maint).

 * Spell the name of our system as "Git" in the output from
   request-pull script.
   (merge e66d7c37a5 ar/request-pull-phrasofix later to maint).

 * Other minor doc, test and build updates and code cleanups.
   (merge f094b89a4d ma/parse-maybe-bool later to maint).
   (merge 39b00fa4d4 jk/drop-sha1-entry-pos later to maint).
   (merge 6cdf8a7929 ma/ts-cleanups later to maint).
   (merge 7560f547e6 ma/up-to-date later to maint).
   (merge 0db3dc75f3 rs/apply-epoch later to maint).
   (merge 74f1bd912b dw/diff-highlight-makefile-fix later to maint).
   (merge f991761eb8 jk/config-lockfile-leak-fix later to maint).
   (merge 150efef1e7 ma/pkt-line-leakfix later to maint).
   (merge 5554451de6 mg/timestamp-t-fix later to maint).
   (merge 276d0e35c0 ma/split-symref-update-fix later to maint).
   (merge 3bc4b8f7c7 bb/doc-eol-dirty later to maint).
   (merge c1bb33c99c jk/system-path-cleanup later to maint).
   (merge ab46e6fc72 cc/subprocess-handshake-missing-capabilities later to maint).
   (merge f7a32dd97f kd/doc-for-each-ref later to maint).
   (merge be94568bc7 ez/doc-duplicated-words-fix later to maint).
   (merge 01e4be6c3d ks/test-readme-phrasofix later to maint).
   (merge 217bb56d4f hn/typofix later to maint).
   (merge c08fd6388c jk/doc-read-tree-table-asciidoctor-fix later to maint).
   (merge c3342b362e ks/doc-use-camelcase-for-config-name later to maint).
   (merge 0bca165fdb jk/validate-headref-fix later to maint).
   (merge 93dbefb389 mr/doc-negative-pathspec later to maint).
   (merge 5e633326e4 ad/doc-markup-fix later to maint).
   (merge 9ca356fa8b rs/cocci-de-paren-call-params later to maint).
   (merge 7099153e8d rs/tag-null-pointer-arith-fix later to maint).
   (merge 0e187d758c rs/run-command-use-alloc-array later to maint).

----------------------------------------------------------------

Changes since v2.14.0 are as follows:

Adam Dinwoodie (1):
      doc: correct command formatting

Andreas Heiduk (2):
      doc: add missing values "none" and "default" for diff.wsErrorHighlight
      doc: clarify "config --bool" behaviour with empty string

Ann T Ropea (1):
      request-pull: capitalise "Git" to make it a proper noun

Anthony Sottile (1):
      git-grep: correct exit code with --quiet and -L

Ben Boeckel (1):
      Documentation: mention that `eol` can change the dirty status of paths

Brandon Casey (7):
      t1502: demonstrate rev-parse --parseopt option mis-parsing
      rev-parse parseopt: do not search help text for flag chars
      rev-parse parseopt: interpret any whitespace as start of help text
      git-rebase: don't ignore unexpected command line arguments
      t0040,t1502: Demonstrate parse_options bugs
      parse-options: write blank line to correct output stream
      parse-options: only insert newline in help text if needed

Brandon Williams (29):
      repo_read_index: don't discard the index
      repository: have the_repository use the_index
      submodule--helper: teach push-check to handle HEAD
      cache.h: add GITMODULES_FILE macro
      config: add config_from_gitmodules
      submodule: remove submodule.fetchjobs from submodule-config parsing
      submodule: remove fetch.recursesubmodules from submodule-config parsing
      submodule: check for unstaged .gitmodules outside of config parsing
      submodule: check for unmerged .gitmodules outside of config parsing
      submodule: merge repo_read_gitmodules and gitmodules_config
      grep: recurse in-process using 'struct repository'
      t7411: check configuration parsing errors
      submodule: don't use submodule_from_name
      add, reset: ensure submodules can be added or reset
      submodule--helper: don't overlay config in remote_submodule_branch
      submodule--helper: don't overlay config in update-clone
      fetch: don't overlay config with submodule-config
      submodule: don't rely on overlayed config when setting diffopts
      unpack-trees: don't respect submodule.update
      submodule: remove submodule_config callback routine
      diff: stop allowing diff to have submodules configured in .git/config
      submodule-config: remove support for overlaying repository config
      submodule-config: move submodule-config functions to submodule-config.c
      submodule-config: lazy-load a repository's .gitmodules file
      unpack-trees: improve loading of .gitmodules
      submodule: remove gitmodules_config
      clone: teach recursive clones to respect -q
      clang-format: outline the git project's coding style
      Makefile: add style build rule

Christian Couder (3):
      refs: use skip_prefix() in ref_is_hidden()
      sub-process: print the cmd when a capability is unsupported
      sha1-lookup: remove sha1_entry_pos() from header file

Daniel Watkins (1):
      diff-highlight: add clean target to Makefile

Dimitrios Christidis (1):
      fmt-merge-msg: fix coding style

Eric Blake (1):
      git-contacts: also recognise "Reported-by:"

Eric Rannaud (1):
      fast-import: checkpoint: dump branches/tags/marks even if object_count==0

Evan Zacks (1):
      doc: fix minor typos (extra/duplicated words)

Han-Wen Nienhuys (5):
      submodule.h: typofix
      submodule.c: describe submodule_to_gitdir() in a new comment
      real_path: clarify return value ownership
      read_gitfile_gently: clarify return value ownership.
      string-list.h: move documentation from Documentation/api/ into header

Heiko Voigt (2):
      t5526: fix some broken && chains
      add test for bug in git-mv for recursive submodules

Hielke Christian Braun (1):
      gitweb: skip unreadable subdirectories

Ian Campbell (4):
      filter-branch: reset $GIT_* before cleaning up
      filter-branch: preserve and restore $GIT_AUTHOR_* and $GIT_COMMITTER_*
      filter-branch: stash away ref map in a branch
      filter-branch: use hash-object instead of mktag

Ilya Kantor (1):
      userdiff: fix HTML hunk header regexp

Jameson Miller (1):
      Improve performance of git status --ignored

Jeff Hostetler (1):
      hashmap: add API to disable item counting when threaded

Jeff King (110):
      t1414: document some reflog-walk oddities
      revision: disallow reflog walking with revs->limited
      log: clarify comment about reflog cycles
      log: do not free parents when walking reflog
      get_revision_1(): replace do-while with an early return
      rev-list: check reflog_info before showing usage
      reflog-walk: stop using fake parents
      reflog-walk: apply --since/--until to reflog dates
      check return value of verify_ref_format()
      docs/for-each-ref: update pointer to color syntax
      t: use test_decode_color rather than literal ANSI codes
      ref-filter: simplify automatic color reset
      ref-filter: abstract ref format into its own struct
      ref-filter: move need_color_reset_at_eol into ref_format
      ref-filter: provide a function for parsing sort options
      ref-filter: make parse_ref_filter_atom a private function
      ref-filter: factor out the parsing of sorting atoms
      ref-filter: pass ref_format struct to atom parsers
      color: check color.ui in git_default_config()
      for-each-ref: load config earlier
      rev-list: pass diffopt->use_colors through to pretty-print
      pretty: respect color settings for %C placeholders
      ref-filter: consult want_color() before emitting colors
      strbuf: use designated initializers in STRBUF_INIT
      t/lib-proto-disable: restore protocol.allow after config tests
      t5813: add test for hostname starting with dash
      connect: factor out "looks like command line option" check
      connect: reject dashed arguments for proxy commands
      connect: reject paths that look like command line options
      t6018: flesh out empty input/output rev-list tests
      revision: add rev_input_given flag
      rev-list: don't show usage when we see empty ref patterns
      revision: do not fallback to default when rev_input_given is set
      hashcmp: use memcmp instead of open-coded loop
      sha1_file: drop experimental GIT_USE_LOOKUP search
      trailer: put process_trailers() options into a struct
      interpret-trailers: add an option to show only the trailers
      interpret-trailers: add an option to show only existing trailers
      interpret-trailers: add an option to unfold values
      interpret-trailers: add --parse convenience option
      pretty: move trailer formatting to trailer.c
      t4205: refactor %(trailers) tests
      pretty: support normalization options for %(trailers)
      doc: fix typo in sendemail.identity
      config: use a static lock_file struct
      write_index_as_tree: cleanup tempfile on error
      setup_temporary_shallow: avoid using inactive tempfile
      setup_temporary_shallow: move tempfile struct into function
      verify_signed_buffer: prefer close_tempfile() to close()
      always check return value of close_tempfile
      tempfile: do not delete tempfile on failed close
      lockfile: do not rollback lock on failed close
      tempfile: prefer is_tempfile_active to bare access
      tempfile: handle NULL tempfile pointers gracefully
      tempfile: replace die("BUG") with BUG()
      tempfile: factor out activation
      tempfile: factor out deactivation
      tempfile: robustify cleanup handler
      tempfile: release deactivated strbufs instead of resetting
      tempfile: use list.h for linked list
      tempfile: remove deactivated list entries
      tempfile: auto-allocate tempfiles on heap
      lockfile: update lifetime requirements in documentation
      ref_lock: stop leaking lock_files
      stop leaking lock structs in some simple cases
      test-lib: --valgrind should not override --verbose-log
      test-lib: set LSAN_OPTIONS to abort by default
      add: free leaked pathspec after add_files_to_cache()
      update-index: fix cache entry leak in add_one_file()
      config: plug user_config leak
      reset: make tree counting less confusing
      reset: free allocated tree buffers
      repository: free fields before overwriting them
      set_git_dir: handle feeding gitdir to itself
      rev-parse: don't trim bisect refnames
      system_path: move RUNTIME_PREFIX to a sub-function
      git_extract_argv0_path: do nothing without RUNTIME_PREFIX
      add UNLEAK annotation for reducing leak false positives
      shortlog: skip format/parse roundtrip for internal traversal
      shell: drop git-cvsserver support by default
      archimport: use safe_pipe_capture for user input
      cvsimport: shell-quote variable used in backticks
      config: avoid "write_in_full(fd, buf, len) < len" pattern
      get-tar-commit-id: check write_in_full() return against 0
      avoid "write_in_full(fd, buf, len) != len" pattern
      convert less-trivial versions of "write_in_full() != len"
      pkt-line: check write_in_full() errors against "< 0"
      notes-merge: use ssize_t for write_in_full() return value
      config: flip return value of store_write_*()
      read_pack_header: handle signed/unsigned comparison in read result
      prefix_ref_iterator: break when we leave the prefix
      read_info_alternates: read contents into strbuf
      read_info_alternates: warn on non-trivial errors
      revision: replace "struct cmdline_pathspec" with argv_array
      cat-file: handle NULL object_context.path
      test-line-buffer: simplify command parsing
      curl_trace(): eliminate switch fallthrough
      consistently use "fallthrough" comments in switches
      doc: put literal block delimiter around table
      files-backend: prefer "0" for write_in_full() error check
      notes-merge: drop dead zero-write code
      prefer "!=" when checking read_in_full() result
      avoid looking at errno for short read_in_full() returns
      distinguish error versus short read from read_in_full()
      worktree: use xsize_t to access file size
      worktree: check the result of read_in_full()
      validate_headref: NUL-terminate HEAD buffer
      validate_headref: use skip_prefix for symref parsing
      validate_headref: use get_oid_hex for detached HEADs
      git: add --no-optional-locks option

Job Snijders (1):
      gitweb: add 'raw' blob_plain link in history overview

Joel Teichroeb (3):
      stash: add a test for stash create with no files
      stash: add a test for when apply fails during stash branch
      stash: add a test for stashing in a detached state

Johannes Schindelin (14):
      run_processes_parallel: change confusing task_cb convention
      git-gui (MinGW): make use of MSys2's msgfmt
      t3415: verify that an empty instructionFormat is handled as before
      rebase -i: generate the script via rebase--helper
      rebase -i: remove useless indentation
      rebase -i: do not invent onelines when expanding/collapsing SHA-1s
      rebase -i: also expand/collapse the SHA-1s via the rebase--helper
      t3404: relax rebase.missingCommitsCheck tests
      rebase -i: check for missing commits in the rebase--helper
      rebase -i: skip unnecessary picks using the rebase--helper
      t3415: test fixup with wrapped oneline
      rebase -i: rearrange fixup/squash lines using the rebase--helper
      Win32: simplify loading of DLL functions
      clang-format: adjust line break penalties

Jonathan Nieder (6):
      vcs-svn: remove more unused prototypes and declarations
      vcs-svn: remove custom mode constants
      vcs-svn: remove repo_delete wrapper function
      vcs-svn: move remaining repo_tree functions to fast_export.h
      pack: make packed_git_mru global a value instead of a pointer
      pathspec doc: parse_pathspec does not maintain references to args

Jonathan Tan (39):
      fsck: remove redundant parse_tree() invocation
      object: remove "used" field from struct object
      fsck: cleanup unused variable
      Documentation: migrate sub-process docs to header
      sub-process: refactor handshake to common function
      tests: ensure fsck fails on corrupt packfiles
      sha1_file: set whence in storage-specific info fn
      sha1_file: remove read_packed_sha1()
      diff: avoid redundantly clearing a flag
      diff: respect MIN_BLOCK_LENGTH for last block
      diff: define block by number of alphanumeric chars
      Doc: clarify that pack-objects makes packs, plural
      pack: move pack name-related functions
      pack: move static state variables
      pack: move pack_report()
      pack: move open_pack_index(), parse_pack_index()
      pack: move release_pack_memory()
      pack: move pack-closing functions
      pack: move use_pack()
      pack: move unuse_pack()
      pack: move add_packed_git()
      pack: move install_packed_git()
      pack: move {,re}prepare_packed_git and approximate_object_count
      pack: move unpack_object_header_buffer()
      pack: move get_size_from_delta()
      pack: move unpack_object_header()
      pack: move clear_delta_base_cache(), packed_object_info(), unpack_entry()
      pack: move nth_packed_object_{sha1,oid}
      pack: move check_pack_index_ptr(), nth_packed_object_offset()
      pack: move find_pack_entry_one(), is_pack_valid()
      pack: move find_sha1_pack()
      pack: move find_pack_entry() and make it global
      pack: move has_sha1_pack()
      pack: move has_pack_index()
      pack: move for_each_packed_object()
      Remove inadvertently added outgoing/packfile.h
      Add t/helper/test-write-cache to .gitignore
      git-compat-util: make UNLEAK less error-prone
      fast-export: do not copy from modified file

Junio C Hamano (49):
      t1408: add a test of stale packed refs covered by loose refs
      clean.c: use designated initializer
      http.c: http.sslcert and http.sslkey are both pathnames
      connect: reject ssh hostname that begins with a dash
      Git 2.7.6
      Git 2.8.6
      Git 2.9.5
      Git 2.10.4
      Git 2.11.3
      Git 2.12.4
      Git 2.13.5
      Git 2.14.1
      Start post 2.14 cycle
      perl/Git.pm: typofix in a comment
      The first batch of topics after the 2.14 cycle
      diff: retire sane_truncate_fn
      progress: simplify "delayed" progress API
      The second batch post 2.14
      t4200: give us a clean slate after "rerere gc" tests
      t4200: make "rerere gc" test more robust
      t4200: gather "rerere gc" together
      t4200: parameterize "rerere gc" custom expiry test
      rerere: represent time duration in timestamp_t internally
      rerere: allow approxidate in gc.rerereResolved/gc.rerereUnresolved
      The third batch post 2.14
      Prepare for 2.14.2
      The fourth batch post 2.14
      The fifth batch post 2.14
      The sixth batch post 2.14
      RelNotes: further fixes for 2.14.2 from the master front
      The seventh batch post 2.14
      travis: dedent a few scripts that are indented overly deeply
      subprocess: loudly die when subprocess asks for an unsupported capability
      cvsserver: move safe_pipe_capture() to the main package
      cvsserver: use safe_pipe_capture for `constant commands` as well
      gc: call fscanf() with %<len>s, not %<len>c, when reading hostname
      The eighth batch for 2.15
      Git 2.10.5
      Git 2.11.4
      Git 2.12.5
      Git 2.13.6
      Git 2.14.2
      branch: fix "copy" to never touch HEAD
      merge-strategies: avoid implying that "-s theirs" exists
      The ninth batch for 2.15
      The tenth batch for 2.15
      The eleventh batch for 2.15
      The twelfth batch for 2.15
      Git 2.15-rc0

Kaartic Sivaraam (13):
      hook: cleanup script
      hook: name the positional variables
      hook: add sign-off using "interpret-trailers"
      hook: add a simple first example
      commit: check for empty message before the check for untouched template
      hook: use correct logical variable
      t3200: cleanup cruft of a test
      builtin/branch: stop supporting the "--set-upstream" option
      branch: quote branch/ref names to improve readability
      help: change a message to be more precise
      commit-template: change a message to be more intuitive
      t/README: fix typo and grammatically improve a sentence
      doc: camelCase the config variables to improve readability

Kevin Daudt (3):
      stash: prevent warning about null bytes in input
      doc/for-each-ref: consistently use '=' to between argument names and values
      doc/for-each-ref: explicitly specify option names

Kevin Willford (9):
      format-patch: have progress option while generating patches
      rebase: turn on progress option by default for format-patch
      commit: skip discarding the index if there is no pre-commit hook
      perf: add test for writing the index
      read-cache: fix memory leak in do_write_index
      read-cache: avoid allocating every ondisk entry when writing
      merge-recursive: fix memory leak
      merge-recursive: remove return value from get_files_dirs
      merge-recursive: change current file dir string_lists to hashmap

Lars Schneider (11):
      t0021: keep filter log files on comparison
      t0021: make debug log file name configurable
      t0021: write "OUT <size>" only on success
      convert: put the flags field before the flag itself for consistent style
      convert: move multiple file filter error handling to separate function
      convert: refactor capabilities negotiation
      convert: add "status=delayed" to filter process protocol
      convert: display progress for filtered objects that have been delayed
      travis-ci: move Travis CI code into dedicated scripts
      travis-ci: skip a branch build if equal tag is present
      travis-ci: fix "skip_branch_tip_with_tag()" string comparison

Manav Rathi (1):
      docs: improve discoverability of exclude pathspec

Martin Koegler (2):
      diff-delta: fix encoding size that would not fit in "unsigned int"
      diff-delta: do not allow delta offset truncation

Martin Ågren (32):
      builtin.h: take over documentation from api-builtin.txt
      git.c: let builtins opt for handling `pager.foo` themselves
      git.c: provide setup_auto_pager()
      t7006: add tests for how git tag paginates
      tag: respect `pager.tag` in list-mode only
      tag: change default of `pager.tag` to "on"
      git.c: ignore pager.* when launching builtin as dashed external
      Doc/git-{push,send-pack}: correct --sign= to --signed=
      t5334: document that git push --signed=1 does not work
      config: introduce git_parse_maybe_bool_text
      config: make git_{config,parse}_maybe_bool equivalent
      treewide: deprecate git_config_maybe_bool, use git_parse_maybe_bool
      parse_decoration_style: drop unused argument `var`
      doc/interpret-trailers: fix "the this" typo
      convert: always initialize attr_action in convert_attrs
      pack-objects: take lock before accessing `remaining`
      strbuf_setlen: don't write to strbuf_slopbuf
      ThreadSanitizer: add suppressions
      Documentation/user-manual: update outdated example output
      treewide: correct several "up-to-date" to "up to date"
      pkt-line: re-'static'-ify buffer in packet_write_fmt_1()
      config: remove git_config_maybe_bool
      refs/files-backend: add longer-scoped copy of string to list
      refs/files-backend: fix memory leak in lock_ref_for_update
      refs/files-backend: correct return value in lock_ref_for_update
      refs/files-backend: add `refname`, not "HEAD", to list
      builtin/commit: fix memory leak in `prepare_index()`
      commit: fix memory leak in `reduce_heads()`
      leak_pending: use `object_array_clear()`, not `free()`
      object_array: use `object_array_clear()`, not `free()`
      object_array: add and use `object_array_pop()`
      pack-bitmap[-write]: use `object_array_clear()`, don't leak

Matthieu Moy (2):
      send-email: fix garbage removal after address
      send-email: don't use Mail::Address, even if available

Max Kirillov (2):
      describe: fix matching to actually match all patterns
      describe: teach --match to handle branches and remotes

Michael Forney (1):
      scripts: use "git foo" not "git-foo"

Michael Haggerty (77):
      add_packed_ref(): teach function to overwrite existing refs
      packed_ref_store: new struct
      packed_ref_store: move `packed_refs_path` here
      packed_ref_store: move `packed_refs_lock` member here
      clear_packed_ref_cache(): take a `packed_ref_store *` parameter
      validate_packed_ref_cache(): take a `packed_ref_store *` parameter
      get_packed_ref_cache(): take a `packed_ref_store *` parameter
      get_packed_refs(): take a `packed_ref_store *` parameter
      add_packed_ref(): take a `packed_ref_store *` parameter
      lock_packed_refs(): take a `packed_ref_store *` parameter
      commit_packed_refs(): take a `packed_ref_store *` parameter
      rollback_packed_refs(): take a `packed_ref_store *` parameter
      get_packed_ref(): take a `packed_ref_store *` parameter
      repack_without_refs(): take a `packed_ref_store *` parameter
      packed_peel_ref(): new function, extracted from `files_peel_ref()`
      packed_ref_store: support iteration
      packed_read_raw_ref(): new function, replacing `resolve_packed_ref()`
      packed-backend: new module for handling packed references
      packed_ref_store: make class into a subclass of `ref_store`
      commit_packed_refs(): report errors rather than dying
      commit_packed_refs(): use a staging file separate from the lockfile
      packed_refs_lock(): function renamed from lock_packed_refs()
      packed_refs_lock(): report errors via a `struct strbuf *err`
      packed_refs_unlock(), packed_refs_is_locked(): new functions
      clear_packed_ref_cache(): don't protest if the lock is held
      commit_packed_refs(): remove call to `packed_refs_unlock()`
      repack_without_refs(): don't lock or unlock the packed refs
      t3210: add some tests of bogus packed-refs file contents
      read_packed_refs(): die if `packed-refs` contains bogus data
      packed_ref_store: handle a packed-refs file that is a symlink
      files-backend: cheapen refname_available check when locking refs
      refs: retry acquiring reference locks for 100ms
      notes: make GET_NIBBLE macro more robust
      load_subtree(): remove unnecessary conditional
      load_subtree(): reduce the scope of some local variables
      load_subtree(): fix incorrect comment
      load_subtree(): separate logic for internal vs. terminal entries
      load_subtree(): check earlier whether an internal node is a tree entry
      load_subtree(): only consider blobs to be potential notes
      get_oid_hex_segment(): return 0 on success
      load_subtree(): combine some common code
      get_oid_hex_segment(): don't pad the rest of `oid`
      hex_to_bytes(): simpler replacement for `get_oid_hex_segment()`
      load_subtree(): declare some variables to be `size_t`
      load_subtree(): check that `prefix_len` is in the expected range
      packed-backend: don't adjust the reference count on lock/unlock
      struct ref_transaction: add a place for backends to store data
      packed_ref_store: implement reference transactions
      packed_delete_refs(): implement method
      files_pack_refs(): use a reference transaction to write packed refs
      prune_refs(): also free the linked list
      files_initial_transaction_commit(): use a transaction for packed refs
      t1404: demonstrate two problems with reference transactions
      files_ref_store: use a transaction to update packed refs
      packed-backend: rip out some now-unused code
      files_transaction_finish(): delete reflogs before references
      ref_iterator: keep track of whether the iterator output is ordered
      packed_ref_cache: add a backlink to the associated `packed_ref_store`
      die_unterminated_line(), die_invalid_line(): new functions
      read_packed_refs(): use mmap to read the `packed-refs` file
      read_packed_refs(): only check for a header at the top of the file
      read_packed_refs(): make parsing of the header line more robust
      for_each_string_list_item: avoid undefined behavior for empty list
      read_packed_refs(): read references with minimal copying
      packed_ref_cache: remember the file-wide peeling state
      mmapped_ref_iterator: add iterator over a packed-refs file
      mmapped_ref_iterator_advance(): no peeled value for broken refs
      packed-backend.c: reorder some definitions
      packed_ref_cache: keep the `packed-refs` file mmapped if possible
      read_packed_refs(): ensure that references are ordered when read
      packed_ref_iterator_begin(): iterate using `mmapped_ref_iterator`
      packed_read_raw_ref(): read the reference from the mmapped buffer
      ref_store: implement `refs_peel_ref()` generically
      packed_ref_store: get rid of the `ref_cache` entirely
      ref_cache: remove support for storing peeled values
      mmapped_ref_iterator: inline into `packed_ref_iterator`
      packed-backend.c: rename a bunch of things and update comments

Michael J Gruber (11):
      Documentation: use proper wording for ref format strings
      Documentation/git-for-each-ref: clarify peeling of tags for --format
      Documentation/git-merge: explain --continue
      merge: clarify call chain
      merge: split write_merge_state in two
      merge: save merge state earlier
      name-rev: change ULONG_MAX to TIME_MAX
      t7004: move limited stack prereq to test-lib
      t6120: test name-rev --all and --stdin
      t6120: clean up state after breaking repo
      t6120: test describe and name-rev with deep repos

Nguyễn Thái Ngọc Duy (17):
      branch: fix branch renaming not updating HEADs correctly
      revision.h: new flag in struct rev_info wrt. worktree-related refs
      refs.c: use is_dir_sep() in resolve_gitlink_ref()
      revision.c: refactor add_index_objects_to_pending()
      revision.c: --indexed-objects add objects from all worktrees
      refs.c: refactor get_submodule_ref_store(), share common free block
      refs: move submodule slash stripping code to get_submodule_ref_store
      refs: add refs_head_ref()
      revision.c: use refs_for_each*() instead of for_each_*_submodule()
      refs.c: move for_each_remote_ref_submodule() to submodule.c
      refs: remove dead for_each_*_submodule()
      revision.c: --all adds HEAD from all worktrees
      files-backend: make reflog iterator go through per-worktree reflog
      revision.c: --reflog add HEAD reflog from all worktrees
      rev-list: expose and document --single-worktree
      refs.c: remove fallback-to-main-store code get_submodule_ref_store()
      refs.c: reindent get_submodule_ref_store()

Nicolas Morey-Chaisemartin (7):
      stash: clean untracked files before reset
      pull: fix cli and config option parsing order
      pull: honor submodule.recurse config option
      imap-send: return with error if curl failed
      imap-send: add wrapper to get server credentials if needed
      imap_send: setup_curl: retreive credentials if not set in config file
      imap-send: use curl by default when possible

Paolo Bonzini (4):
      trailers: export action enums and corresponding lookup functions
      trailers: introduce struct new_trailer_item
      interpret-trailers: add options for actions
      interpret-trailers: fix documentation typo

Patryk Obara (10):
      sha1_file: fix definition of null_sha1
      commit: replace the raw buffer with strbuf in read_graft_line
      commit: allocate array using object_id size
      commit: rewrite read_graft_line
      builtin/hash-object: convert to struct object_id
      read-cache: convert to struct object_id
      sha1_file: convert index_path to struct object_id
      sha1_file: convert index_fd to struct object_id
      sha1_file: convert hash_sha1_file_literally to struct object_id
      sha1_file: convert index_stream to struct object_id

Philip Oakley (4):
      git-gui: remove duplicate entries from .gitconfig's gui.recentrepo
      git gui: cope with duplicates in _get_recentrepo
      git gui: de-dup selected repo from recentrepo history
      git gui: allow for a long recentrepo list

Phillip Wood (7):
      am: remember --rerere-autoupdate setting
      rebase: honor --rerere-autoupdate
      rebase -i: honor --rerere-autoupdate
      t3504: use test_commit
      cherry-pick/revert: remember --rerere-autoupdate
      cherry-pick/revert: reject --rerere-autoupdate when continuing
      am: fix signoff when other trailers are present

Raman Gupta (1):
      contrib/rerere-train: optionally overwrite existing resolutions

Ramsay Jones (9):
      credential-cache: interpret an ECONNRESET as an EOF
      builtin/add: add detail to a 'cannot chmod' error message
      test-lib: don't use ulimit in test prerequisites on cygwin
      test-lib: use more compact expression in PIPE prerequisite
      t9010-*.sh: skip all tests if the PIPE prereq is missing
      git-compat-util.h: xsize_t() - avoid -Wsign-compare warnings
      commit-slab.h: avoid -Wsign-compare warnings
      cache.h: hex2chr() - avoid -Wsign-compare warnings
      ALLOC_GROW: avoid -Wsign-compare warnings

Rene Scharfe (34):
      am: release strbufs after use in detect_patch_format()
      am: release strbuf on error return in hg_patch_to_mail()
      am: release strbuf after use in safe_to_abort()
      check-ref-format: release strbuf after use in check_ref_format_branch()
      clean: release strbuf after use in remove_dirs()
      clone: release strbuf after use in remove_junk()
      commit: release strbuf on error return in commit_tree_extended()
      connect: release strbuf on error return in git_connect()
      convert: release strbuf on error return in filter_buffer_or_fd()
      diff: release strbuf after use in diff_summary()
      diff: release strbuf after use in show_rename_copy()
      diff: release strbuf after use in show_stats()
      help: release strbuf on error return in exec_man_konqueror()
      help: release strbuf on error return in exec_man_man()
      help: release strbuf on error return in exec_woman_emacs()
      mailinfo: release strbuf after use in handle_from()
      mailinfo: release strbuf on error return in handle_boundary()
      merge: release strbuf after use in save_state()
      merge: release strbuf after use in write_merge_heads()
      notes: release strbuf after use in notes_copy_from_stdin()
      refs: release strbuf on error return in write_pseudoref()
      remote: release strbuf after use in read_remote_branches()
      remote: release strbuf after use in migrate_file()
      remote: release strbuf after use in set_url()
      send-pack: release strbuf on error return in send_pack()
      sha1_file: release strbuf on error return in index_path()
      shortlog: release strbuf after use in insert_one_record()
      sequencer: release strbuf after use in save_head()
      transport-helper: release strbuf after use in process_connect_service()
      userdiff: release strbuf after use in userdiff_get_textconv()
      utf8: release strbuf on error return in strbuf_utf8_replace()
      vcs-svn: release strbuf after use in end_revision()
      wt-status: release strbuf after use in read_rebase_todolist()
      wt-status: release strbuf after use in wt_longstatus_print_tracking()

René Scharfe (44):
      tree-diff: don't access hash of NULL object_id pointer
      notes: don't access hash of NULL object_id pointer
      receive-pack: don't access hash of NULL object_id pointer
      bswap: convert to unsigned before shifting in get_be32
      bswap: convert get_be16, get_be32 and put_be32 to inline functions
      add MOVE_ARRAY
      use MOVE_ARRAY
      apply: use COPY_ARRAY and MOVE_ARRAY in update_image()
      ls-files: don't try to prune an empty index
      dir: support platforms that require aligned reads
      pack-objects: remove unnecessary NULL check
      t0001: skip test with restrictive permissions if getpwd(3) respects them
      test-path-utils: handle const parameter of basename and dirname
      t3700: fix broken test under !POSIXPERM
      t4062: use less than 256 repetitions in regex
      sha1_file: avoid comparison if no packed hash matches the first byte
      apply: remove prefix_length member from apply_state
      merge: use skip_prefix()
      win32: plug memory leak on realloc() failure in syslog()
      strbuf: clear errno before calling getdelim(3)
      fsck: free buffers on error in fsck_obj()
      sha1_file: release delta_stack on error in unpack_entry()
      tree-walk: convert fill_tree_descriptor() to object_id
      t1002: stop using sum(1)
      t5001: add tests for export-ignore attributes and exclude pathspecs
      archive: factor out helper functions for handling attributes
      archive: don't queue excluded directories
      commit: remove unused inline function single_parent()
      apply: check date of potential epoch timestamps first
      apply: remove epoch date from regex
      archive: don't add empty directories to archives
      refs: make sha1 output parameter of refs_resolve_ref_unsafe() optional
      refs: pass NULL to refs_resolve_ref_unsafe() if hash is not needed
      refs: pass NULL to resolve_ref_unsafe() if hash is not needed
      mailinfo: don't decode invalid =XY quoted-printable sequences
      refs: pass NULL to refs_resolve_refdup() if hash is not needed
      refs: pass NULL to resolve_refdup() if hash is not needed
      coccinelle: remove parentheses that become unnecessary
      path: use strbuf_add_real_path()
      use strbuf_addstr() for adding strings to strbufs
      graph: use strbuf_addchars() to add spaces
      tag: avoid NULL pointer arithmetic
      repository: use FREE_AND_NULL
      run-command: use ALLOC_ARRAY

Ross Kabus (1):
      commit-tree: do not complete line in -F input

Sahil Dua (2):
      config: create a function to format section headers
      branch: add a --copy (-c) option to go with --move (-m)

Santiago Torres (1):
      t: lib-gpg: flush gpg agent on startup

Stefan Beller (49):
      diff.c: readability fix
      diff.c: move line ending check into emit_hunk_header
      diff.c: factor out diff_flush_patch_all_file_pairs
      diff.c: introduce emit_diff_symbol
      diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_MARKER
      diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_FRAGINFO
      diff.c: emit_diff_symbol learns DIFF_SYMBOL_NO_LF_EOF
      diff.c: migrate emit_line_checked to use emit_diff_symbol
      diff.c: emit_diff_symbol learns DIFF_SYMBOL_WORDS[_PORCELAIN]
      diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_INCOMPLETE
      diff.c: emit_diff_symbol learns DIFF_SYMBOL_FILEPAIR_{PLUS, MINUS}
      diff.c: emit_diff_symbol learns DIFF_SYMBOL_HEADER
      diff.c: emit_diff_symbol learns about DIFF_SYMBOL_BINARY_FILES
      diff.c: emit_diff_symbol learns DIFF_SYMBOL_REWRITE_DIFF
      submodule.c: migrate diff output to use emit_diff_symbol
      diff.c: convert emit_binary_diff_body to use emit_diff_symbol
      diff.c: convert show_stats to use emit_diff_symbol
      diff.c: convert word diffing to use emit_diff_symbol
      diff.c: emit_diff_symbol learns about DIFF_SYMBOL_STAT_SEP
      diff.c: emit_diff_symbol learns about DIFF_SYMBOL_SUMMARY
      diff.c: buffer all output if asked to
      diff.c: color moved lines differently
      diff.c: color moved lines differently, plain mode
      diff.c: add dimming to moved line detection
      diff: document the new --color-moved setting
      attr.c: drop hashmap_cmp_fn cast
      builtin/difftool.c: drop hashmap_cmp_fn cast
      builtin/describe: drop hashmap_cmp_fn cast
      config.c: drop hashmap_cmp_fn cast
      convert/sub-process: drop cast to hashmap_cmp_fn
      patch-ids.c: drop hashmap_cmp_fn cast
      remote.c: drop hashmap_cmp_fn cast
      submodule-config.c: drop hashmap_cmp_fn cast
      name-hash.c: drop hashmap_cmp_fn cast
      t/helper/test-hashmap: use custom data instead of duplicate cmp functions
      commit: convert lookup_commit_graft to struct object_id
      tag: convert gpg_verify_tag to use struct object_id
      t8008: rely on rev-parse'd HEAD instead of sha1 value
      t1200: remove t1200-tutorial.sh
      sha1_file: make read_info_alternates static
      submodule.sh: remove unused variable
      builtin/merge: honor commit-msg hook for merges
      push, fetch: error out for submodule entries not pointing to commits
      replace-objects: evaluate replacement refs without using the object store
      Documentation/githooks: mention merge in commit-msg hook
      Documentation/config: clarify the meaning of submodule.<name>.update
      t7406: submodule.<name>.update command must not be run from .gitmodules
      diff: correct newline in summary for renamed files
      submodule: correct error message for missing commits

Stephan Beyer (1):
      clang-format: add a comment about the meaning/status of the

Takashi Iwai (2):
      sha1dc: build git plumbing code more explicitly
      sha1dc: allow building with the external sha1dc library

Thomas Gummerer (2):
      read-cache: fix index corruption with index v4
      refs: strip out not allowed flags from ref_transaction_update

Tom G. Christensen (2):
      http: fix handling of missing CURLPROTO_*
      http: use a feature check to enable GSSAPI delegation control

Torsten Bögershausen (3):
      convert: add SAFE_CRLF_KEEP_CRLF
      apply: file commited with CRLF should roundtrip diff and apply
      test-lint: echo -e (or -E) is not portable

Urs Thuermann (1):
      git svn fetch: Create correct commit timestamp when using --localtime

William Duclot (1):
      rebase: make resolve message clearer for inexperienced users

brian m. carlson (14):
      builtin/fsck: convert remaining caller of get_sha1 to object_id
      builtin/merge-tree: convert remaining caller of get_sha1 to object_id
      submodule: convert submodule config lookup to use object_id
      remote: convert struct push_cas to struct object_id
      sequencer: convert to struct object_id
      builtin/update_ref: convert to struct object_id
      bisect: convert bisect_checkout to struct object_id
      builtin/unpack-file: convert to struct object_id
      Convert remaining callers of get_sha1 to get_oid.
      sha1_name: convert get_sha1* to get_oid*
      sha1_name: convert GET_SHA1* flags to GET_OID*
      sha1_name: convert uses of 40 to GIT_SHA1_HEXSZ
      vcs-svn: remove unused prototypes
      vcs-svn: rename repo functions to "svn_repo"

joernchen (1):
      cvsserver: use safe_pipe_capture instead of backticks

Ævar Arnfjörð Bjarmason (2):
      branch: add test for -m renaming multiple config sections
      tests: don't give unportable ">" to "test" built-in, use -gt

Øystein Walle (1):
      rev-parse: rev-parse: add --is-shallow-repository

Łukasz Gryglicki (1):
      merge: add a --signoff flag


^ permalink raw reply	[relevance 11%]

* Re: [PATCH 2/6] Update documentation for new directory and status logic
  2017-10-05 20:54 ` [PATCH 2/6] Update documentation for new directory and status logic jameson.miller81
@ 2017-10-05 21:51   ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-10-05 21:51 UTC (permalink / raw)
  To: Jameson Miller; +Cc: git, Junio C Hamano, Jeff King, Brandon Williams, Jameson Miller

On Thu, Oct 5, 2017 at 1:54 PM,  <jameson.miller81@gmail.com> wrote:
> From: Jameson Miller <jamill@microsoft.com>
>
> Signed-off-by: Jameson Miller <jamill@microsoft.com>
> ---
>  Documentation/git-status.txt                      | 22 +++++++++++++++++-
>  Documentation/technical/api-directory-listing.txt | 28 +++++++++++++++++++----
>  2 files changed, 45 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/git-status.txt b/Documentation/git-status.txt
> index 9f3a78a36c..7d1410ff3f 100644
> --- a/Documentation/git-status.txt
> +++ b/Documentation/git-status.txt
> @@ -97,8 +97,28 @@ configuration variable documented in linkgit:git-config[1].
>         (and suppresses the output of submodule summaries when the config option
>         `status.submoduleSummary` is set).
>
> ---ignored::
> +--ignored[=<mode>]::
>         Show ignored files as well.
> ++
> +The mode parameter is used to specify the handling of ignored files.
> +It is optional: it defaults to 'default', and if specified, it must be
> +stuck to the option (e.g. '--ignored=default`, but not `--ignored default`).

Is there a rationale for not accepting `--ignored default`?
(It's just the way OPTION_STRING inputs seem to work,
but in other cases [c.f. man git gc, search "--prune=" ] this is
just implied). Or rather: no need to spell this out explicitly IMHO,
as that draws the users attention to it, which might be confusing.

> ++
> +The possible options are:
> ++
> +       - 'traditional' - Shows ignored files and directories, unless
> +                         --untracked-files=all is specifed, in which case
> +                         individual files in ignored directories are
> +                         displayed.
> +       - 'no'          - Show no ignored files.
> +       - 'matching'    - Shows ignored files and directories matching an
> +                         ignore pattern.
> ++
> +When 'matching' mode is specified, paths that explicity match an
> +ignored pattern are shown. If a directory matches an ignore pattern,
> +then it is shown, but not paths contained in the ignored directory. If
> +a directory does not match an ignore pattern, but all contents are
> +ignored, then the directory is not shown, but all contents are shown.
>
>  -z::
>         Terminate entries with NUL, instead of LF.  This implies
> diff --git a/Documentation/technical/api-directory-listing.txt b/Documentation/technical/api-directory-listing.txt
> index 6c77b4920c..86c981c29c 100644
> --- a/Documentation/technical/api-directory-listing.txt
> +++ b/Documentation/technical/api-directory-listing.txt
> @@ -22,16 +22,20 @@ The notable options are:
>
>  `flags`::
>
> -       A bit-field of options (the `*IGNORED*` flags are mutually exclusive):
> +       A bit-field of options:
>
>  `DIR_SHOW_IGNORED`:::
>
> -       Return just ignored files in `entries[]`, not untracked files.
> +       Return just ignored files in `entries[]`, not untracked
> +       files. This is flag is mutually exclusive with

"is flag is" ?


> +       `DIR_SHOW_IGNORED_TOO`.
>
>  `DIR_SHOW_IGNORED_TOO`:::
>
> -       Similar to `DIR_SHOW_IGNORED`, but return ignored files in `ignored[]`
> -       in addition to untracked files in `entries[]`.
> +       Similar to `DIR_SHOW_IGNORED`, but return ignored files in
> +       `ignored[]` in addition to untracked files in
> +       `entries[]`. This flag is mutually exclusive with
> +       `DIR_SHOW_IGNORED`.
>
>  `DIR_KEEP_UNTRACKED_CONTENTS`:::
>
> @@ -39,6 +43,22 @@ The notable options are:
>         untracked contents of untracked directories are also returned in
>         `entries[]`.
>
> +`DIR_SHOW_IGNORED_TOO_MODE_MATCHING`:::
> +
> +       Only has meaning if `DIR_SHOW_IGNORED_TOO` is also set; if
> +       this is set, returns ignored files and directories that match
> +       an exclude pattern. If a directory matches an exclude pattern,
> +       then the directory is returned and the contained paths are
> +       not. A directory that does not match an exclude pattern will
> +       not be returned even if all of its contents are ignored. In
> +       this case, the contents are returned as individual entries.
> +

I think to make the asciidoc happy, you'd put a '+ LF' as paragraph
delimiter and the subsequent paragraphs are not indented.
C.f. Documentation/git-gc.txt "--auto".
Oh, screw that. It turns out, this place is the only place in our docs
where we use triple colons. So I don't know what I am talking about.

> +If this is set, files and directories
> +       that explicity match an ignore pattern are reported. Implicity
> +       ignored directories (directories that do not match an ignore
> +       pattern, but whose contents are all ignored) are not reported,
> +       instead all of the contents are reported.
> +
>  `DIR_COLLECT_IGNORED`:::
>
>         Special mode for git-add. Return ignored files in `ignored[]` and
> --
> 2.13.6
>

^ permalink raw reply	[relevance 8%]

* [PATCH v7 1/3] submodule--helper: introduce get_submodule_displaypath()
  2017-10-06 13:24 ` [PATCH v7 0/3] Incremental rewrite of git-submodules Prathamesh Chavan
@ 2017-10-06 13:24   ` Prathamesh Chavan
  2017-10-06 13:24   ` [PATCH v7 2/3] submodule--helper: introduce for_each_listed_submodule() Prathamesh Chavan
  2017-10-06 13:24   ` [PATCH v7 3/3] submodule: port submodule subcommand 'status' from shell to C Prathamesh Chavan
  2 siblings, 0 replies; 200+ results
From: Prathamesh Chavan @ 2017-10-06 13:24 UTC (permalink / raw)
  To: gitster; +Cc: christian.couder, git, hanwen, pc44800, sbeller

Introduce function get_submodule_displaypath() to replace the code
occurring in submodule_init() for generating displaypath of the
submodule with a call to it.

This new function will also be used in other parts of the system
in later patches.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
---
 builtin/submodule--helper.c | 35 +++++++++++++++++++++++------------
 1 file changed, 23 insertions(+), 12 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 06ed02f99..56c1c52e2 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -219,6 +219,26 @@ static int resolve_relative_url_test(int argc, const char **argv, const char *pr
 	return 0;
 }
 
+/* the result should be freed by the caller. */
+static char *get_submodule_displaypath(const char *path, const char *prefix)
+{
+	const char *super_prefix = get_super_prefix();
+
+	if (prefix && super_prefix) {
+		BUG("cannot have prefix '%s' and superprefix '%s'",
+		    prefix, super_prefix);
+	} else if (prefix) {
+		struct strbuf sb = STRBUF_INIT;
+		char *displaypath = xstrdup(relative_path(path, prefix, &sb));
+		strbuf_release(&sb);
+		return displaypath;
+	} else if (super_prefix) {
+		return xstrfmt("%s%s", super_prefix, path);
+	} else {
+		return xstrdup(path);
+	}
+}
+
 struct module_list {
 	const struct cache_entry **entries;
 	int alloc, nr;
@@ -334,15 +354,7 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
 	struct strbuf sb = STRBUF_INIT;
 	char *upd = NULL, *url = NULL, *displaypath;
 
-	if (prefix && get_super_prefix())
-		die("BUG: cannot have prefix and superprefix");
-	else if (prefix)
-		displaypath = xstrdup(relative_path(path, prefix, &sb));
-	else if (get_super_prefix()) {
-		strbuf_addf(&sb, "%s%s", get_super_prefix(), path);
-		displaypath = strbuf_detach(&sb, NULL);
-	} else
-		displaypath = xstrdup(path);
+	displaypath = get_submodule_displaypath(path, prefix);
 
 	sub = submodule_from_path(&null_oid, path);
 
@@ -357,9 +369,9 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
 	 * Set active flag for the submodule being initialized
 	 */
 	if (!is_submodule_active(the_repository, path)) {
-		strbuf_reset(&sb);
 		strbuf_addf(&sb, "submodule.%s.active", sub->name);
 		git_config_set_gently(sb.buf, "true");
+		strbuf_reset(&sb);
 	}
 
 	/*
@@ -367,7 +379,6 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
 	 * To look up the url in .git/config, we must not fall back to
 	 * .gitmodules, so look it up directly.
 	 */
-	strbuf_reset(&sb);
 	strbuf_addf(&sb, "submodule.%s.url", sub->name);
 	if (git_config_get_string(sb.buf, &url)) {
 		if (!sub->url)
@@ -404,9 +415,9 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
 				_("Submodule '%s' (%s) registered for path '%s'\n"),
 				sub->name, url, displaypath);
 	}
+	strbuf_reset(&sb);
 
 	/* Copy "update" setting when it is not set yet */
-	strbuf_reset(&sb);
 	strbuf_addf(&sb, "submodule.%s.update", sub->name);
 	if (git_config_get_string(sb.buf, &upd) &&
 	    sub->update_strategy.type != SM_UPDATE_UNSPECIFIED) {
-- 
2.14.2


^ permalink raw reply	[relevance 21%]

* [PATCH v7 2/3] submodule--helper: introduce for_each_listed_submodule()
  2017-10-06 13:24 ` [PATCH v7 0/3] Incremental rewrite of git-submodules Prathamesh Chavan
  2017-10-06 13:24   ` [PATCH v7 1/3] submodule--helper: introduce get_submodule_displaypath() Prathamesh Chavan
@ 2017-10-06 13:24   ` Prathamesh Chavan
  2017-10-06 13:24   ` [PATCH v7 3/3] submodule: port submodule subcommand 'status' from shell to C Prathamesh Chavan
  2 siblings, 0 replies; 200+ results
From: Prathamesh Chavan @ 2017-10-06 13:24 UTC (permalink / raw)
  To: gitster; +Cc: christian.couder, git, hanwen, pc44800, sbeller

Introduce function for_each_listed_submodule() and replace a loop
in module_init() with a call to it.

The new function will also be used in other parts of the
system in later patches.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
---
 builtin/submodule--helper.c | 40 +++++++++++++++++++++++++++++++++++-----
 1 file changed, 35 insertions(+), 5 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 56c1c52e2..29e3fde16 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -14,6 +14,11 @@
 #include "refs.h"
 #include "connect.h"
 
+#define OPT_QUIET (1 << 0)
+
+typedef void (*each_submodule_fn)(const struct cache_entry *list_item,
+				  void *cb_data);
+
 static char *get_default_remote(void)
 {
 	char *dest = NULL, *ret;
@@ -348,7 +353,23 @@ static int module_list(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
-static void init_submodule(const char *path, const char *prefix, int quiet)
+static void for_each_listed_submodule(const struct module_list *list,
+				      each_submodule_fn fn, void *cb_data)
+{
+	int i;
+	for (i = 0; i < list->nr; i++)
+		fn(list->entries[i], cb_data);
+}
+
+struct init_cb {
+	const char *prefix;
+	unsigned int flags;
+};
+
+#define INIT_CB_INIT { NULL, 0 }
+
+static void init_submodule(const char *path, const char *prefix,
+			   unsigned int flags)
 {
 	const struct submodule *sub;
 	struct strbuf sb = STRBUF_INIT;
@@ -410,7 +431,7 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
 		if (git_config_set_gently(sb.buf, url))
 			die(_("Failed to register url for submodule path '%s'"),
 			    displaypath);
-		if (!quiet)
+		if (!(flags & OPT_QUIET))
 			fprintf(stderr,
 				_("Submodule '%s' (%s) registered for path '%s'\n"),
 				sub->name, url, displaypath);
@@ -437,12 +458,18 @@ static void init_submodule(const char *path, const char *prefix, int quiet)
 	free(upd);
 }
 
+static void init_submodule_cb(const struct cache_entry *list_item, void *cb_data)
+{
+	struct init_cb *info = cb_data;
+	init_submodule(list_item->name, info->prefix, info->flags);
+}
+
 static int module_init(int argc, const char **argv, const char *prefix)
 {
+	struct init_cb info = INIT_CB_INIT;
 	struct pathspec pathspec;
 	struct module_list list = MODULE_LIST_INIT;
 	int quiet = 0;
-	int i;
 
 	struct option module_init_options[] = {
 		OPT__QUIET(&quiet, N_("Suppress output for initializing a submodule")),
@@ -467,8 +494,11 @@ static int module_init(int argc, const char **argv, const char *prefix)
 	if (!argc && git_config_get_value_multi("submodule.active"))
 		module_list_active(&list);
 
-	for (i = 0; i < list.nr; i++)
-		init_submodule(list.entries[i]->name, prefix, quiet);
+	info.prefix = prefix;
+	if (quiet)
+		info.flags |= OPT_QUIET;
+
+	for_each_listed_submodule(&list, init_submodule_cb, &info);
 
 	return 0;
 }
-- 
2.14.2


^ permalink raw reply	[relevance 21%]

* [PATCH v7 3/3] submodule: port submodule subcommand 'status' from shell to C
  2017-10-06 13:24 ` [PATCH v7 0/3] Incremental rewrite of git-submodules Prathamesh Chavan
  2017-10-06 13:24   ` [PATCH v7 1/3] submodule--helper: introduce get_submodule_displaypath() Prathamesh Chavan
  2017-10-06 13:24   ` [PATCH v7 2/3] submodule--helper: introduce for_each_listed_submodule() Prathamesh Chavan
@ 2017-10-06 13:24   ` Prathamesh Chavan
  2 siblings, 0 replies; 200+ results
From: Prathamesh Chavan @ 2017-10-06 13:24 UTC (permalink / raw)
  To: gitster; +Cc: christian.couder, git, hanwen, pc44800, sbeller

This aims to make git-submodule 'status' a built-in. Hence, the function
cmd_status() is ported from shell to C. This is done by introducing
four functions: module_status(), submodule_status_cb(),
submodule_status() and print_status().

The function module_status() acts as the front-end of the subcommand.
It parses subcommand's options and then calls the function
module_list_compute() for computing the list of submodules. Then
this functions calls for_each_listed_submodule() looping through the
list obtained.

Then for_each_listed_submodule() calls submodule_status_cb() for each of
the submodule in its list. The function submodule_status_cb() calls
submodule_status() after passing appropriate arguments to the funciton.
Function submodule_status() is responsible for generating the status
each submodule it is called for, and then calls print_status().

Finally, the function print_status() handles the printing of submodule's
status.

Function set_name_rev() is also ported from git-submodule to the
submodule--helper builtin function compute_rev_name(), which now
generates the value of the revision name as required.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
---
 builtin/submodule--helper.c | 198 ++++++++++++++++++++++++++++++++++++++++++++
 git-submodule.sh            |  61 +-------------
 2 files changed, 199 insertions(+), 60 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 29e3fde16..d366e8e7b 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -13,8 +13,13 @@
 #include "remote.h"
 #include "refs.h"
 #include "connect.h"
+#include "revision.h"
+#include "diffcore.h"
+#include "diff.h"
 
 #define OPT_QUIET (1 << 0)
+#define OPT_CACHED (1 << 1)
+#define OPT_RECURSIVE (1 << 2)
 
 typedef void (*each_submodule_fn)(const struct cache_entry *list_item,
 				  void *cb_data);
@@ -244,6 +249,44 @@ static char *get_submodule_displaypath(const char *path, const char *prefix)
 	}
 }
 
+static char *compute_rev_name(const char *sub_path, const char* object_id)
+{
+	struct strbuf sb = STRBUF_INIT;
+	const char ***d;
+
+	static const char *describe_bare[] = { NULL };
+
+	static const char *describe_tags[] = { "--tags", NULL };
+
+	static const char *describe_contains[] = { "--contains", NULL };
+
+	static const char *describe_all_always[] = { "--all", "--always", NULL };
+
+	static const char **describe_argv[] = { describe_bare, describe_tags,
+						describe_contains,
+						describe_all_always, NULL };
+
+	for (d = describe_argv; *d; d++) {
+		struct child_process cp = CHILD_PROCESS_INIT;
+		prepare_submodule_repo_env(&cp.env_array);
+		cp.dir = sub_path;
+		cp.git_cmd = 1;
+		cp.no_stderr = 1;
+
+		argv_array_push(&cp.args, "describe");
+		argv_array_pushv(&cp.args, *d);
+		argv_array_push(&cp.args, object_id);
+
+		if (!capture_command(&cp, &sb, 0)) {
+			strbuf_strip_suffix(&sb, "\n");
+			return strbuf_detach(&sb, NULL);
+		}
+	}
+
+	strbuf_release(&sb);
+	return NULL;
+}
+
 struct module_list {
 	const struct cache_entry **entries;
 	int alloc, nr;
@@ -503,6 +546,160 @@ static int module_init(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
+struct status_cb {
+	const char *prefix;
+	unsigned int flags;
+};
+
+#define STATUS_CB_INIT { NULL, 0 }
+
+static void print_status(unsigned int flags, char state, const char *path,
+			 const struct object_id *oid, const char *displaypath)
+{
+	if (flags & OPT_QUIET)
+		return;
+
+	printf("%c%s %s", state, oid_to_hex(oid), displaypath);
+
+	if (state == ' ' || state == '+')
+		printf(" (%s)", compute_rev_name(path, oid_to_hex(oid)));
+
+	printf("\n");
+}
+
+static int handle_submodule_head_ref(const char *refname,
+				     const struct object_id *oid, int flags,
+				     void *cb_data)
+{
+	struct object_id *output = cb_data;
+	if (oid)
+		oidcpy(output, oid);
+
+	return 0;
+}
+
+static void status_submodule(const char *path, const struct object_id *ce_oid,
+			     unsigned int ce_flags, const char *prefix,
+			     unsigned int flags)
+{
+	char *displaypath;
+	struct argv_array diff_files_args = ARGV_ARRAY_INIT;
+	struct rev_info rev;
+	int diff_files_result;
+
+	if (!submodule_from_path(&null_oid, path))
+		die(_("no submodule mapping found in .gitmodules for path '%s'"),
+		      path);
+
+	displaypath = get_submodule_displaypath(path, prefix);
+
+	if ((CE_STAGEMASK & ce_flags) >> CE_STAGESHIFT) {
+		print_status(flags, 'U', path, &null_oid, displaypath);
+		goto cleanup;
+	}
+
+	if (!is_submodule_active(the_repository, path)) {
+		print_status(flags, '-', path, ce_oid, displaypath);
+		goto cleanup;
+	}
+
+	argv_array_pushl(&diff_files_args, "diff-files",
+			 "--ignore-submodules=dirty", "--quiet", "--",
+			 path, NULL);
+
+	git_config(git_diff_basic_config, NULL);
+	init_revisions(&rev, prefix);
+	rev.abbrev = 0;
+	diff_files_args.argc = setup_revisions(diff_files_args.argc,
+					       diff_files_args.argv,
+					       &rev, NULL);
+	diff_files_result = run_diff_files(&rev, 0);
+
+	if (!diff_result_code(&rev.diffopt, diff_files_result)) {
+		print_status(flags, ' ', path, ce_oid,
+			     displaypath);
+	} else if (!(flags & OPT_CACHED)) {
+		struct object_id oid;
+
+		if (refs_head_ref(get_submodule_ref_store(path),
+				  handle_submodule_head_ref, &oid))
+			die(_("could not resolve HEAD ref inside the"
+			      "submodule '%s'"), path);
+
+		print_status(flags, '+', path, &oid, displaypath);
+	} else {
+		print_status(flags, '+', path, ce_oid, displaypath);
+	}
+
+	if (flags & OPT_RECURSIVE) {
+		struct child_process cpr = CHILD_PROCESS_INIT;
+
+		cpr.git_cmd = 1;
+		cpr.dir = path;
+		prepare_submodule_repo_env(&cpr.env_array);
+
+		argv_array_push(&cpr.args, "--super-prefix");
+		argv_array_pushf(&cpr.args, "%s/", displaypath);
+		argv_array_pushl(&cpr.args, "submodule--helper", "status",
+				 "--recursive", NULL);
+
+		if (flags & OPT_CACHED)
+			argv_array_push(&cpr.args, "--cached");
+
+		if (flags & OPT_QUIET)
+			argv_array_push(&cpr.args, "--quiet");
+
+		if (run_command(&cpr))
+			die(_("failed to recurse into submodule '%s'"), path);
+	}
+
+cleanup:
+	argv_array_clear(&diff_files_args);
+	free(displaypath);
+}
+
+static void status_submodule_cb(const struct cache_entry *list_item,
+				void *cb_data)
+{
+	struct status_cb *info = cb_data;
+	status_submodule(list_item->name, &list_item->oid, list_item->ce_flags,
+			 info->prefix, info->flags);
+}
+
+static int module_status(int argc, const char **argv, const char *prefix)
+{
+	struct status_cb info = STATUS_CB_INIT;
+	struct pathspec pathspec;
+	struct module_list list = MODULE_LIST_INIT;
+	int quiet = 0;
+
+	struct option module_status_options[] = {
+		OPT__QUIET(&quiet, N_("Suppress submodule status output")),
+		OPT_BIT(0, "cached", &info.flags, N_("Use commit stored in the index instead of the one stored in the submodule HEAD"), OPT_CACHED),
+		OPT_BIT(0, "recursive", &info.flags, N_("recurse into nested submodules"), OPT_RECURSIVE),
+		OPT_END()
+	};
+
+	const char *const git_submodule_helper_usage[] = {
+		N_("git submodule status [--quiet] [--cached] [--recursive] [<path>...]"),
+		NULL
+	};
+
+	argc = parse_options(argc, argv, prefix, module_status_options,
+			     git_submodule_helper_usage, 0);
+
+	if (module_list_compute(argc, argv, prefix, &pathspec, &list) < 0)
+		return 1;
+
+	info.prefix = prefix;
+	if (quiet)
+		info.flags |= OPT_QUIET;
+
+	for_each_listed_submodule(&list, status_submodule_cb, &info);
+
+	return 0;
+}
+
 static int module_name(int argc, const char **argv, const char *prefix)
 {
 	const struct submodule *sub;
@@ -1300,6 +1497,7 @@ static struct cmd_struct commands[] = {
 	{"resolve-relative-url", resolve_relative_url, 0},
 	{"resolve-relative-url-test", resolve_relative_url_test, 0},
 	{"init", module_init, SUPPORT_SUPER_PREFIX},
+	{"status", module_status, SUPPORT_SUPER_PREFIX},
 	{"remote-branch", resolve_remote_submodule_branch, 0},
 	{"push-check", push_check, 0},
 	{"absorb-git-dirs", absorb_git_dirs, SUPPORT_SUPER_PREFIX},
diff --git a/git-submodule.sh b/git-submodule.sh
index 66d1ae8ef..156255a9e 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -758,18 +758,6 @@ cmd_update()
 	}
 }
 
-set_name_rev () {
-	revname=$( (
-		sanitize_submodule_env
-		cd "$1" && {
-			git describe "$2" 2>/dev/null ||
-			git describe --tags "$2" 2>/dev/null ||
-			git describe --contains "$2" 2>/dev/null ||
-			git describe --all --always "$2"
-		}
-	) )
-	test -z "$revname" || revname=" ($revname)"
-}
 #
 # Show commit summary for submodules in index or working tree
 #
@@ -1016,54 +1004,7 @@ cmd_status()
 		shift
 	done
 
-	{
-		git submodule--helper list --prefix "$wt_prefix" "$@" ||
-		echo "#unmatched" $?
-	} |
-	while read -r mode sha1 stage sm_path
-	do
-		die_if_unmatched "$mode" "$sha1"
-		name=$(git submodule--helper name "$sm_path") || exit
-		displaypath=$(git submodule--helper relative-path "$prefix$sm_path" "$wt_prefix")
-		if test "$stage" = U
-		then
-			say "U$sha1 $displaypath"
-			continue
-		fi
-		if ! git submodule--helper is-active "$sm_path" ||
-		{
-			! test -d "$sm_path"/.git &&
-			! test -f "$sm_path"/.git
-		}
-		then
-			say "-$sha1 $displaypath"
-			continue;
-		fi
-		if git diff-files --ignore-submodules=dirty --quiet -- "$sm_path"
-		then
-			set_name_rev "$sm_path" "$sha1"
-			say " $sha1 $displaypath$revname"
-		else
-			if test -z "$cached"
-			then
-				sha1=$(sanitize_submodule_env; cd "$sm_path" && git rev-parse --verify HEAD)
-			fi
-			set_name_rev "$sm_path" "$sha1"
-			say "+$sha1 $displaypath$revname"
-		fi
-
-		if test -n "$recursive"
-		then
-			(
-				prefix="$displaypath/"
-				sanitize_submodule_env
-				wt_prefix=
-				cd "$sm_path" &&
-				eval cmd_status
-			) ||
-			die "$(eval_gettext "Failed to recurse into submodule path '\$sm_path'")"
-		fi
-	done
+	git ${wt_prefix:+-C "$wt_prefix"} ${prefix:+--super-prefix "$prefix"} submodule--helper status ${GIT_QUIET:+--quiet} ${cached:+--cached} ${recursive:+--recursive} "$@"
 }
 #
 # Sync remote urls for submodules
-- 
2.14.2


^ permalink raw reply	[relevance 19%]

* [PATCH 2/2] tests: fix diff order arguments in test_cmp
  2017-10-06 19:00 [PATCH 1/2] tests: use shell negation instead of test_must_fail for test_cmp Stefan Beller
@ 2017-10-06 19:00 ` " Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-10-06 19:00 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

Fix the argument order for test_cmp. When given the expected
result first the diff shows the actual output with '+' and the
expectation with '-', which is the convention for our tests.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 t/t1004-read-tree-m-u-wf.sh          |  2 +-
 t/t4015-diff-whitespace.sh           |  4 ++--
 t/t4205-log-pretty-formats.sh        |  2 +-
 t/t6007-rev-list-cherry-pick-file.sh | 32 ++++++++++++++++----------------
 t/t6013-rev-list-reverse-parents.sh  |  4 ++--
 t/t7001-mv.sh                        |  2 +-
 t/t7005-editor.sh                    |  6 +++---
 t/t7102-reset.sh                     |  4 ++--
 t/t7201-co.sh                        |  4 ++--
 t/t7400-submodule-basic.sh           |  2 +-
 t/t7405-submodule-merge.sh           |  2 +-
 t/t7506-status-submodule.sh          |  4 ++--
 t/t7600-merge.sh                     |  6 +++---
 t/t7610-mergetool.sh                 |  4 ++--
 t/t9001-send-email.sh                |  6 +++---
 15 files changed, 42 insertions(+), 42 deletions(-)

diff --git a/t/t1004-read-tree-m-u-wf.sh b/t/t1004-read-tree-m-u-wf.sh
index c70cf42300..c7ce5d8bb5 100755
--- a/t/t1004-read-tree-m-u-wf.sh
+++ b/t/t1004-read-tree-m-u-wf.sh
@@ -218,7 +218,7 @@ test_expect_success 'D/F' '
 		echo "100644 $a 2	subdir/file2"
 		echo "100644 $b 3	subdir/file2/another"
 	) >expect &&
-	test_cmp actual expect
+	test_cmp expect actual
 
 '
 
diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index 12d182dc1b..1709e4578b 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -155,7 +155,7 @@ test_expect_success 'ignore-blank-lines: only new lines' '
 " >x &&
 	git diff --ignore-blank-lines >out &&
 	>expect &&
-	test_cmp out expect
+	test_cmp expect out
 '
 
 test_expect_success 'ignore-blank-lines: only new lines with space' '
@@ -165,7 +165,7 @@ test_expect_success 'ignore-blank-lines: only new lines with space' '
  " >x &&
 	git diff -w --ignore-blank-lines >out &&
 	>expect &&
-	test_cmp out expect
+	test_cmp expect out
 '
 
 test_expect_success 'ignore-blank-lines: after change' '
diff --git a/t/t4205-log-pretty-formats.sh b/t/t4205-log-pretty-formats.sh
index ec5f530102..42f584f8b3 100755
--- a/t/t4205-log-pretty-formats.sh
+++ b/t/t4205-log-pretty-formats.sh
@@ -590,7 +590,7 @@ test_expect_success '%(trailers:unfold) unfolds trailers' '
 test_expect_success ':only and :unfold work together' '
 	git log --no-walk --pretty="%(trailers:only:unfold)" >actual &&
 	git log --no-walk --pretty="%(trailers:unfold:only)" >reverse &&
-	test_cmp actual reverse &&
+	test_cmp reverse actual &&
 	{
 		grep -v patch.description <trailers | unfold &&
 		echo
diff --git a/t/t6007-rev-list-cherry-pick-file.sh b/t/t6007-rev-list-cherry-pick-file.sh
index 2959745196..f0268372d2 100755
--- a/t/t6007-rev-list-cherry-pick-file.sh
+++ b/t/t6007-rev-list-cherry-pick-file.sh
@@ -57,7 +57,7 @@ test_expect_success '--left-right' '
 	git rev-list --left-right B...C > actual &&
 	git name-rev --stdin --name-only --refs="*tags/*" \
 		< actual > actual.named &&
-	test_cmp actual.named expect
+	test_cmp expect actual.named
 '
 
 test_expect_success '--count' '
@@ -77,14 +77,14 @@ test_expect_success '--cherry-pick bar does not come up empty' '
 	git rev-list --left-right --cherry-pick B...C -- bar > actual &&
 	git name-rev --stdin --name-only --refs="*tags/*" \
 		< actual > actual.named &&
-	test_cmp actual.named expect
+	test_cmp expect actual.named
 '
 
 test_expect_success 'bar does not come up empty' '
 	git rev-list --left-right B...C -- bar > actual &&
 	git name-rev --stdin --name-only --refs="*tags/*" \
 		< actual > actual.named &&
-	test_cmp actual.named expect
+	test_cmp expect actual.named
 '
 
 cat >expect <<EOF
@@ -96,14 +96,14 @@ test_expect_success '--cherry-pick bar does not come up empty (II)' '
 	git rev-list --left-right --cherry-pick F...E -- bar > actual &&
 	git name-rev --stdin --name-only --refs="*tags/*" \
 		< actual > actual.named &&
-	test_cmp actual.named expect
+	test_cmp expect actual.named
 '
 
 test_expect_success 'name-rev multiple --refs combine inclusive' '
 	git rev-list --left-right --cherry-pick F...E -- bar >actual &&
 	git name-rev --stdin --name-only --refs="*tags/F" --refs="*tags/E" \
 		<actual >actual.named &&
-	test_cmp actual.named expect
+	test_cmp expect actual.named
 '
 
 cat >expect <<EOF
@@ -115,7 +115,7 @@ test_expect_success 'name-rev --refs excludes non-matched patterns' '
 	git rev-list --left-right --cherry-pick F...E -- bar >actual &&
 	git name-rev --stdin --name-only --refs="*tags/F" \
 		<actual >actual.named &&
-	test_cmp actual.named expect
+	test_cmp expect actual.named
 '
 
 cat >expect <<EOF
@@ -127,14 +127,14 @@ test_expect_success 'name-rev --exclude excludes matched patterns' '
 	git rev-list --left-right --cherry-pick F...E -- bar >actual &&
 	git name-rev --stdin --name-only --refs="*tags/*" --exclude="*E" \
 		<actual >actual.named &&
-	test_cmp actual.named expect
+	test_cmp expect actual.named
 '
 
 test_expect_success 'name-rev --no-refs clears the refs list' '
 	git rev-list --left-right --cherry-pick F...E -- bar >expect &&
 	git name-rev --stdin --name-only --refs="*tags/F" --refs="*tags/E" --no-refs --refs="*tags/G" \
 		<expect >actual &&
-	test_cmp actual expect
+	test_cmp expect actual
 '
 
 cat >expect <<EOF
@@ -148,7 +148,7 @@ test_expect_success '--cherry-mark' '
 	git rev-list --cherry-mark F...E -- bar > actual &&
 	git name-rev --stdin --name-only --refs="*tags/*" \
 		< actual > actual.named &&
-	test_cmp actual.named expect
+	test_cmp expect actual.named
 '
 
 cat >expect <<EOF
@@ -162,7 +162,7 @@ test_expect_success '--cherry-mark --left-right' '
 	git rev-list --cherry-mark --left-right F...E -- bar > actual &&
 	git name-rev --stdin --name-only --refs="*tags/*" \
 		< actual > actual.named &&
-	test_cmp actual.named expect
+	test_cmp expect actual.named
 '
 
 cat >expect <<EOF
@@ -173,14 +173,14 @@ test_expect_success '--cherry-pick --right-only' '
 	git rev-list --cherry-pick --right-only F...E -- bar > actual &&
 	git name-rev --stdin --name-only --refs="*tags/*" \
 		< actual > actual.named &&
-	test_cmp actual.named expect
+	test_cmp expect actual.named
 '
 
 test_expect_success '--cherry-pick --left-only' '
 	git rev-list --cherry-pick --left-only E...F -- bar > actual &&
 	git name-rev --stdin --name-only --refs="*tags/*" \
 		< actual > actual.named &&
-	test_cmp actual.named expect
+	test_cmp expect actual.named
 '
 
 cat >expect <<EOF
@@ -192,7 +192,7 @@ test_expect_success '--cherry' '
 	git rev-list --cherry F...E -- bar > actual &&
 	git name-rev --stdin --name-only --refs="*tags/*" \
 		< actual > actual.named &&
-	test_cmp actual.named expect
+	test_cmp expect actual.named
 '
 
 cat >expect <<EOF
@@ -201,7 +201,7 @@ EOF
 
 test_expect_success '--cherry --count' '
 	git rev-list --cherry --count F...E -- bar > actual &&
-	test_cmp actual expect
+	test_cmp expect actual
 '
 
 cat >expect <<EOF
@@ -210,7 +210,7 @@ EOF
 
 test_expect_success '--cherry-mark --count' '
 	git rev-list --cherry-mark --count F...E -- bar > actual &&
-	test_cmp actual expect
+	test_cmp expect actual
 '
 
 cat >expect <<EOF
@@ -219,7 +219,7 @@ EOF
 
 test_expect_success '--cherry-mark --left-right --count' '
 	git rev-list --cherry-mark --left-right --count F...E -- bar > actual &&
-	test_cmp actual expect
+	test_cmp expect actual
 '
 
 test_expect_success '--cherry-pick with independent, but identical branches' '
diff --git a/t/t6013-rev-list-reverse-parents.sh b/t/t6013-rev-list-reverse-parents.sh
index 59fc2f06e0..89458d370f 100755
--- a/t/t6013-rev-list-reverse-parents.sh
+++ b/t/t6013-rev-list-reverse-parents.sh
@@ -28,7 +28,7 @@ test_expect_success '--reverse --parents --full-history combines correctly' '
 		perl -e "print reverse <>" > expected &&
 	git rev-list --reverse --parents --full-history master -- foo \
 		> actual &&
-	test_cmp actual expected
+	test_cmp expected actual
 	'
 
 test_expect_success '--boundary does too' '
@@ -36,7 +36,7 @@ test_expect_success '--boundary does too' '
 		perl -e "print reverse <>" > expected &&
 	git rev-list --boundary --reverse --parents --full-history \
 		master ^root -- foo > actual &&
-	test_cmp actual expected
+	test_cmp expected actual
 	'
 
 test_done
diff --git a/t/t7001-mv.sh b/t/t7001-mv.sh
index cbc5fb37fe..f5929c46f3 100755
--- a/t/t7001-mv.sh
+++ b/t/t7001-mv.sh
@@ -488,7 +488,7 @@ test_expect_success 'moving a submodule in nested directories' '
 		git config -f ../.gitmodules submodule.deep/directory/hierarchy/sub.path >../actual &&
 		echo "directory/hierarchy/sub" >../expect
 	) &&
-	test_cmp actual expect
+	test_cmp expect actual
 '
 
 test_expect_failure 'moving nested submodules' '
diff --git a/t/t7005-editor.sh b/t/t7005-editor.sh
index 1b530b5022..29e5043b94 100755
--- a/t/t7005-editor.sh
+++ b/t/t7005-editor.sh
@@ -38,7 +38,7 @@ test_expect_success setup '
 	test_commit "$msg" &&
 	echo "$msg" >expect &&
 	git show -s --format=%s > actual &&
-	test_cmp actual expect
+	test_cmp expect actual
 
 '
 
@@ -85,7 +85,7 @@ do
 		git --exec-path=. commit --amend &&
 		git show -s --pretty=oneline |
 		sed -e "s/^[0-9a-f]* //" >actual &&
-		test_cmp actual expect
+		test_cmp expect actual
 	'
 done
 
@@ -107,7 +107,7 @@ do
 		git --exec-path=. commit --amend &&
 		git show -s --pretty=oneline |
 		sed -e "s/^[0-9a-f]* //" >actual &&
-		test_cmp actual expect
+		test_cmp expect actual
 	'
 done
 
diff --git a/t/t7102-reset.sh b/t/t7102-reset.sh
index 86f23be34a..95653a08ca 100755
--- a/t/t7102-reset.sh
+++ b/t/t7102-reset.sh
@@ -428,9 +428,9 @@ test_expect_success 'test --mixed <paths>' '
 	git reset HEAD -- file1 file2 file3 &&
 	test_must_fail git diff --quiet &&
 	git diff > output &&
-	test_cmp output expect &&
+	test_cmp expect output &&
 	git diff --cached > output &&
-	test_cmp output cached_expect
+	test_cmp cached_expect output
 '
 
 test_expect_success 'test resetting the index at give paths' '
diff --git a/t/t7201-co.sh b/t/t7201-co.sh
index d4b217b0ee..76c223c967 100755
--- a/t/t7201-co.sh
+++ b/t/t7201-co.sh
@@ -187,7 +187,7 @@ test_expect_success 'format of merge conflict from checkout -m' '
 	d
 	>>>>>>> local
 	EOF
-	test_cmp two expect
+	test_cmp expect two
 '
 
 test_expect_success 'checkout --merge --conflict=diff3 <branch>' '
@@ -213,7 +213,7 @@ test_expect_success 'checkout --merge --conflict=diff3 <branch>' '
 	d
 	>>>>>>> local
 	EOF
-	test_cmp two expect
+	test_cmp expect two
 '
 
 test_expect_success 'switch to another branch while carrying a deletion' '
diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index 6f8337ffb5..a39e69a3eb 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -1211,7 +1211,7 @@ test_expect_success 'clone --recurse-submodules with a pathspec works' '
 
 	git clone --recurse-submodules="sub0" multisuper multisuper_clone &&
 	git -C multisuper_clone submodule status |cut -c1,43- >actual &&
-	test_cmp actual expected
+	test_cmp expected actual
 '
 
 test_expect_success 'clone with multiple --recurse-submodules options' '
diff --git a/t/t7405-submodule-merge.sh b/t/t7405-submodule-merge.sh
index 0d5b42a25b..7bfb2f498d 100755
--- a/t/t7405-submodule-merge.sh
+++ b/t/t7405-submodule-merge.sh
@@ -119,7 +119,7 @@ test_expect_success 'merge with one side as a fast-forward of the other' '
 	 git ls-tree test-forward sub | cut -f1 | cut -f3 -d" " > actual &&
 	 (cd sub &&
 	  git rev-parse sub-d > ../expect) &&
-	 test_cmp actual expect)
+	 test_cmp expect actual)
 '
 
 test_expect_success 'merging should conflict for non fast-forward' '
diff --git a/t/t7506-status-submodule.sh b/t/t7506-status-submodule.sh
index 055c90736e..9edf6572ed 100755
--- a/t/t7506-status-submodule.sh
+++ b/t/t7506-status-submodule.sh
@@ -306,7 +306,7 @@ test_expect_success 'diff with merge conflict in .gitmodules' '
 		cd super &&
 		git diff >../diff_actual 2>&1
 	) &&
-	test_cmp diff_actual diff_expect
+	test_cmp diff_expect diff_actual
 '
 
 test_expect_success 'diff --submodule with merge conflict in .gitmodules' '
@@ -314,7 +314,7 @@ test_expect_success 'diff --submodule with merge conflict in .gitmodules' '
 		cd super &&
 		git diff --submodule >../diff_submodule_actual 2>&1
 	) &&
-	test_cmp diff_submodule_actual diff_submodule_expect
+	test_cmp diff_submodule_expect diff_submodule_actual
 '
 
 # We'll setup different cases for further testing:
diff --git a/t/t7600-merge.sh b/t/t7600-merge.sh
index 80194b79f9..dfde6a675a 100755
--- a/t/t7600-merge.sh
+++ b/t/t7600-merge.sh
@@ -697,7 +697,7 @@ test_expect_success 'merge --no-ff --edit' '
 	git cat-file commit HEAD >raw &&
 	grep "work done on the side branch" raw &&
 	sed "1,/^$/d" >actual raw &&
-	test_cmp actual expected
+	test_cmp expected actual
 '
 
 test_expect_success GPG 'merge --ff-only tag' '
@@ -709,7 +709,7 @@ test_expect_success GPG 'merge --ff-only tag' '
 	git merge --ff-only signed &&
 	git rev-parse signed^0 >expect &&
 	git rev-parse HEAD >actual &&
-	test_cmp actual expect
+	test_cmp expect actual
 '
 
 test_expect_success GPG 'merge --no-edit tag should skip editor' '
@@ -721,7 +721,7 @@ test_expect_success GPG 'merge --no-edit tag should skip editor' '
 	EDITOR=false git merge --no-edit signed &&
 	git rev-parse signed^0 >expect &&
 	git rev-parse HEAD^2 >actual &&
-	test_cmp actual expect
+	test_cmp expect actual
 '
 
 test_expect_success 'set up mod-256 conflict scenario' '
diff --git a/t/t7610-mergetool.sh b/t/t7610-mergetool.sh
index 381b7df452..1a430b9c40 100755
--- a/t/t7610-mergetool.sh
+++ b/t/t7610-mergetool.sh
@@ -621,7 +621,7 @@ test_expect_success 'file with no base' '
 	test_must_fail git merge master &&
 	git mergetool --no-prompt --tool mybase -- both &&
 	>expected &&
-	test_cmp both expected
+	test_cmp expected both
 '
 
 test_expect_success 'custom commands override built-ins' '
@@ -632,7 +632,7 @@ test_expect_success 'custom commands override built-ins' '
 	test_must_fail git merge master &&
 	git mergetool --no-prompt --tool defaults -- both &&
 	echo master both added >expected &&
-	test_cmp both expected
+	test_cmp expected both
 '
 
 test_expect_success 'filenames seen by tools start with ./' '
diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index f30980895c..4d261c2a9c 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -1266,7 +1266,7 @@ test_expect_success $PREREQ 'asks about and fixes 8bit encodings' '
 	grep email-using-8bit stdout &&
 	grep "Which 8bit encoding" stdout &&
 	egrep "Content|MIME" msgtxt1 >actual &&
-	test_cmp actual content-type-decl
+	test_cmp content-type-decl actual
 '
 
 test_expect_success $PREREQ 'sendemail.8bitEncoding works' '
@@ -1277,7 +1277,7 @@ test_expect_success $PREREQ 'sendemail.8bitEncoding works' '
 			--smtp-server="$(pwd)/fake.sendmail" \
 			email-using-8bit >stdout &&
 	egrep "Content|MIME" msgtxt1 >actual &&
-	test_cmp actual content-type-decl
+	test_cmp content-type-decl actual
 '
 
 test_expect_success $PREREQ '--8bit-encoding overrides sendemail.8bitEncoding' '
@@ -1289,7 +1289,7 @@ test_expect_success $PREREQ '--8bit-encoding overrides sendemail.8bitEncoding' '
 			--8bit-encoding=UTF-8 \
 			email-using-8bit >stdout &&
 	egrep "Content|MIME" msgtxt1 >actual &&
-	test_cmp actual content-type-decl
+	test_cmp content-type-decl actual
 '
 
 test_expect_success $PREREQ 'setup expect' '
-- 
2.14.0.rc0.3.g6c2e499285


^ permalink raw reply	[relevance 14%]

* Re: [RFC PATCH v3 0/4] implement fetching of moved submodules
  2017-10-06 22:25 [RFC PATCH v3 0/4] implement fetching of moved submodules Heiko Voigt
  2017-10-06 22:32 ` [RFC PATCH 2/4] change submodule push test to use proper repository setup Heiko Voigt
@ 2017-10-06 22:57 ` Stefan Beller
  1 sibling, 0 replies; 200+ results
From: Stefan Beller @ 2017-10-06 22:57 UTC (permalink / raw)
  To: Heiko Voigt; +Cc: git, Jonathan Nieder, Jens Lehmann, Brandon Williams

On Fri, Oct 6, 2017 at 3:25 PM, Heiko Voigt <hvoigt@hvoigt.net> wrote:
> The last iteration can be found here:
>
> https://public-inbox.org/git/20170817105349.GC52233@book.hvoigt.net/
>
> This is mainly a status update and to let people know that I am still
> working on this.

Cool. :)

> I struggled quite a bit with reviving my original test for the path
> based recursive fetch (first patch). The behavior seems to haved changed
> and simply setting the submodule configuration in .git/config without
> one in .gitmodules does not work anymore. I did not have time to
> investigate whether this was a deliberate change or a maybe a bug?

I think it is deliberate. (We wrote this man page "gitsubmodules"[1] and there
was so much discussion on "What is a submodule?". Key take away is this:
* a gitlink alone is not a submodule.
* a submodule consists of at least
  -> the gitlink in the superproject
  -> a mapping of path -> name via
      $(git config -f .gitmodules submodule.<name>.path)
  -> Depending on config in .git/config or the existence of its git directory,
      it may be [in]active or [de]initialized.

[1] not to be confused with "gitmodules" or "git-submodule"

Sometimes we accept a plain git-link without the config in .gitmodules,
(a) due to historic reasons or (b) because it seems sane even for
a repo "that just happens to exist at the gitlinks location"
(example git-diff)

> So the solution for now is that I write my fake configuration (to avoid
> skipping submodule handling altogether) into a .gitmodules file.

I'll try to spot what is fake about the config.

> The second patch (cleanup of a submodule push testcase) was written
> because that currently is the only test failing. It is not meant for
> inclusion but rather as a demonstration of what might be happening when
> we cleanup testcases: Because of the behavioral change above, on first
> sight, it seemed like there was a shortcut in fetch and so on-demand
> fetch without submodule configuration would not be supported anymore.
>
> IIRC there were a lot more tests failing before when I implemented my
> patch without the fallback on paths. So my guess is that some tests have
> been cleaned up to use proper (.gitmodules) submodule setup.

I don't remember any large recent activity for submodule things lately.

> So the thing here is: If we want to make sure that we stay backwards
> compatible by supporting the setup with gitlinks without configuration.
> Then we also should keep tests around that have the plain manual setup
> without .gitmodules files. Just something, I think, we should keep in
> mind.

But do we want this?

Without the name<->path mapping, we can only have the "old style"
submodules, that have their git repo inside its tree instead of inside
the superprojects git dir.
So renaming/moving "old style with no name<->path mapping" will not
work. (That may be an acceptable trade off. But then again, just providing
the mapping, such that the superproject can absorb the git directory
of the submodule for this use case, doesn't seem like a big deal to me.
Something you want to have anyway, for ease of use of the superproject
w.r.t. cloning for example)

So while I do not try to deliberately break these old behaviors, I'd rather
want us to go forward with a saner model than "if we happen to have
enough data around, the operation succeeds", i.e. ignore anything
that is not following the rather strict definition of a submodule.

FYI: Once upon a time I found "fake submodules"
http://debuggable.com/posts/git-fake-submodules:4b563ee4-f3cc-4061-967e-0e48cbdd56cb
Last time this was discussed on list, this was considered a bug not
worth fixing instead of a feature IIRC. (Personally I think this is
a rather cool hack, which we may want to abuse ourselves for
things like "convert a subtree into a submodule and back again",
but we could also go without this hack)

> Apart from the tests nothing has been added in this iteration. Since I
> finally have a working test now I will continue with reviving the
> fallback to paths.

I'll have a look.

Cheers,
Stefan

^ permalink raw reply	[relevance 24%]

* Re: [PATCH] submodule: avoid sentence-lego in translated string
  2017-10-09  4:14 ` Martin Ågren
@ 2017-10-09 16:47   ` Stefan Beller
  2017-10-09 21:11     ` Jean-Noël AVILA
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-10-09 16:47 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Junio C Hamano, brian m. carlson, Philip Oakley, Git Mailing List, Jiang Xin, Alexander Shopov, Jordi Mas, Ralf Thielow, Christopher Díaz, Jean-Noël Avila, Marco Paolone, Changwoo Ryu, Vasco Almeida, Dimitriy Ryazantcev, Peter Krefting, Trần Ngọc Quân, Jacques Viviers, m4sk1n

On Sun, Oct 8, 2017 at 9:14 PM, Martin Ågren <martin.agren@gmail.com> wrote:
> In this particular case, we could have three specific messages plus one default
> message (which at the time shouldn't ever occur). Or we have one specific
> message for the "tag"-case, which seems to have been Stefan's original
> motivation, and a generic message for the other cases.

I think the original patch is an improvement for the translations,
but I'd really want to deliver as much information to the user,
(it's an error message after all, so tell as much as possible that can
be helpful,)
which is why I would prefer to keep `typename(type)` in there.

I always assumed that translators are aware of these issues and sort of
work around this somehow, maybe like this:

  "submodule entry '%s' (%s) is not a commit. It is of type %s"

Thanks,
Stefan

^ permalink raw reply	[relevance 22%]

* Re: [RFC PATCH 2/4] change submodule push test to use proper repository setup
  2017-10-06 22:32 ` [RFC PATCH 2/4] change submodule push test to use proper repository setup Heiko Voigt
@ 2017-10-09 18:20   ` Stefan Beller
  2017-10-10 13:03     ` Heiko Voigt
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-10-09 18:20 UTC (permalink / raw)
  To: Heiko Voigt; +Cc: git, Jonathan Nieder, Jens Lehmann, Brandon Williams

On Fri, Oct 6, 2017 at 3:32 PM, Heiko Voigt <hvoigt@hvoigt.net> wrote:
> NOTE: The argument in this message is not correct, see description in
> cover letter.
>
> The setup of the repositories in this test is using gitlinks without the
> .gitmodules infrastructure. It is however testing convenience features
> like --recurse-submodules=on-demand. These features are already not
> supported by fetch without a .gitmodules file. This leads us to the
> conclusion that it is not really used here as well.
>
> Let's use the usual submodule commands to setup the repository in a
> typical way. This also has the advantage that we are testing with a
> repository structure that is more similar to one we could expect on a
> users setup.
>
> Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
> ---
>
> As mentioned in the cover letter. This seems to be the only test that
> ensures that we stay compatible with setups without .gitmodules. Maybe
> we should add/revive some?

An interesting discussion covering this topic is found at
https://public-inbox.org/git/20170606035650.oykbz2uc4xkr3cr2@sigill.intra.peff.net/

^ permalink raw reply	[relevance 16%]

* Re: [PATCH] submodule: avoid sentence-lego in translated string
  2017-10-09 16:47   ` Stefan Beller
@ 2017-10-09 21:11     ` Jean-Noël AVILA
  2017-10-10  1:26       ` Junio C Hamano
  0 siblings, 1 reply; 200+ results
From: Jean-Noël AVILA @ 2017-10-09 21:11 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Martin Ågren, Junio C Hamano, brian m. carlson, Philip Oakley, Git Mailing List, Jiang Xin, Alexander Shopov, Jordi Mas, Ralf Thielow, Christopher Díaz, Marco Paolone, Changwoo Ryu, Vasco Almeida, Dimitriy Ryazantcev, Peter Krefting, Trần Ngọc Quân, Jacques Viviers, m4sk1n

On Monday, 9 October 2017, 09:47:26 CEST Stefan Beller wrote:

> I always assumed that translators are aware of these issues and sort of
> work around this somehow, maybe like this:
> 
>   "submodule entry '%s' (%s) is not a commit. It is of type %s"

Translators can be aware of the issue if the coder commented the 
internationalization string with some possible candidates for the placeholders 
when it is not clear unless you check in the source code. Much effort was 
poured into translating the technical terms in other parts of Git; it seems 
awkward to just step back in this occurence.


^ permalink raw reply	[relevance 15%]

* Re: Branch switching with submodules where the submodule replaces a folder aborts unexpectedly
  2017-10-09 21:29 Branch switching with submodules where the submodule replaces a folder aborts unexpectedly Thomas Braun
@ 2017-10-09 21:59 ` Stefan Beller
  2017-10-12 11:48   ` Thomas Braun
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-10-09 21:59 UTC (permalink / raw)
  To: Thomas Braun; +Cc: git

On Mon, Oct 9, 2017 at 2:29 PM, Thomas Braun
<thomas.braun@virtuell-zuhause.de> wrote:
> Hi,
>
> I'm currently in the progress of pulling some subprojects in a git repository of mine into their own repositories and adding these subprojects back as submodules.
>
> While doing this I enountered a potential bug as checkout complains on branch switching that a file already exists.

(And I presume you know about --recurse-submodules as a flag for git-checkout)

This is consistent with our tests, unfortunately.

git/t$ ./t2013-checkout-submodule.sh
...
not ok 15 - git checkout --recurse-submodules: replace submodule with
a directory # TODO known breakage
...

> If I'm misusing git here I'm glad for any advice.

You are not.

Apart from this bug report, would you think that such filtering of
trees into submodules (and back again) might be an interesting feature
of Git or are these cases rare and special?

Thanks,
Stefan

^ permalink raw reply	[relevance 28%]

* [PATCH] submodule: spell out API of submodule_move_head
@ 2017-10-09 22:06 Stefan Beller
  2017-10-09 22:22 ` Jonathan Nieder
  2017-10-10  0:16 ` Junio C Hamano
  0 siblings, 2 replies; 200+ results
From: Stefan Beller @ 2017-10-09 22:06 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller, Jonathan Nieder

This way users of this function do not need to read the implementation to
know what it will do.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
---
 submodule.c |  5 -----
 submodule.h | 18 ++++++++++++++++++
 2 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/submodule.c b/submodule.c
index c1edc91a13..d12e4ea96c 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1507,11 +1507,6 @@ static void submodule_reset_index(const char *path)
 		die("could not reset submodule index");
 }
 
-/**
- * Moves a submodule at a given path from a given head to another new head.
- * For edge cases (a submodule coming into existence or removing a submodule)
- * pass NULL for old or new respectively.
- */
 int submodule_move_head(const char *path,
 			 const char *old,
 			 const char *new,
diff --git a/submodule.h b/submodule.h
index f0da0277a4..3c15977be9 100644
--- a/submodule.h
+++ b/submodule.h
@@ -111,6 +111,24 @@ extern void connect_work_tree_and_git_dir(const char *work_tree, const char *git
  */
 int submodule_to_gitdir(struct strbuf *buf, const char *submodule);
 
+/**
+ * Move the HEAD and content of the active submodule at 'path' from object id
+ * 'old' to 'new'.
+ *
+ * Updates the submodule at 'path' and files in its work tree to commit
+ * 'new'. The commit previously pointed to by the submodule is named by
+ * 'old'. This updates the submodule's HEAD, index, and work tree but
+ * does not touch its gitlink entry in the superproject.
+ *
+ * If the submodule did not previously exist, then 'old' should be NULL.
+ * Similarly, if the submodule is to be removed, 'new' should be NULL.
+ *
+ * If updating the submodule would cause local changes to be overwritten,
+ * then instead of updating the submodule, this function prints an error
+ * message and returns -1.
+ *
+ * If the submodule is not active, this does nothing and returns 0.
+ */
 #define SUBMODULE_MOVE_HEAD_DRY_RUN (1<<0)
 #define SUBMODULE_MOVE_HEAD_FORCE   (1<<1)
 extern int submodule_move_head(const char *path,
-- 
2.14.0.rc0.3.g6c2e499285


^ permalink raw reply	[relevance 30%]

* Re: [PATCH] submodule: spell out API of submodule_move_head
  2017-10-09 22:06 [PATCH] submodule: spell out API of submodule_move_head Stefan Beller
@ 2017-10-09 22:22 ` Jonathan Nieder
  2017-10-10  0:16 ` Junio C Hamano
  1 sibling, 0 replies; 200+ results
From: Jonathan Nieder @ 2017-10-09 22:22 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

Stefan Beller wrote:

> This way users of this function do not need to read the implementation to
> know what it will do.
>
> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>  submodule.c |  5 -----
>  submodule.h | 18 ++++++++++++++++++
>  2 files changed, 18 insertions(+), 5 deletions(-)

Looks good (well, you'd expect me to think that :)).  Thanks.

^ permalink raw reply	[relevance 15%]

* Re: [PATCH] submodule: spell out API of submodule_move_head
  2017-10-09 22:06 [PATCH] submodule: spell out API of submodule_move_head Stefan Beller
  2017-10-09 22:22 ` Jonathan Nieder
@ 2017-10-10  0:16 ` Junio C Hamano
  2017-10-10  0:27   ` Jonathan Nieder
  1 sibling, 1 reply; 200+ results
From: Junio C Hamano @ 2017-10-10  0:16 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Jonathan Nieder

Stefan Beller <sbeller@google.com> writes:

> +/**
> + * Move the HEAD and content of the active submodule at 'path' from object id
> + * 'old' to 'new'.
> + *
> + * Updates the submodule at 'path' and files in its work tree to commit
> + * 'new'. The commit previously pointed to by the submodule is named by
> + * 'old'. This updates the submodule's HEAD, index, and work tree but
> + * does not touch its gitlink entry in the superproject.
> + *
> + * If the submodule did not previously exist, then 'old' should be NULL.
> + * Similarly, if the submodule is to be removed, 'new' should be NULL.
> + *
> + * If updating the submodule would cause local changes to be overwritten,
> + * then instead of updating the submodule, this function prints an error
> + * message and returns -1.

This is not a new issue (the removed comment did not mention this at
all), but is it correct to say that updates to "index and work tree"
was as if we did "git -C $path checkout new" (and of course, HEAD in
the $path submodule must be at 'old')?

What should happen if 'old' does not match reality (i.e. old is NULL
but HEAD does point at some commit, old and HEAD are different,
etc.)?  Should the call be aborted?

> + * If the submodule is not active, this does nothing and returns 0.
> + */
>  #define SUBMODULE_MOVE_HEAD_DRY_RUN (1<<0)
>  #define SUBMODULE_MOVE_HEAD_FORCE   (1<<1)
>  extern int submodule_move_head(const char *path,

^ permalink raw reply	[relevance 22%]

* Re: [PATCH] submodule: spell out API of submodule_move_head
  2017-10-10  0:16 ` Junio C Hamano
@ 2017-10-10  0:27   ` Jonathan Nieder
  0 siblings, 0 replies; 200+ results
From: Jonathan Nieder @ 2017-10-10  0:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Stefan Beller, git

Junio C Hamano wrote:
> Stefan Beller <sbeller@google.com> writes:

>> +/**
>> + * Move the HEAD and content of the active submodule at 'path' from object id
>> + * 'old' to 'new'.
>> + *
>> + * Updates the submodule at 'path' and files in its work tree to commit
>> + * 'new'. The commit previously pointed to by the submodule is named by
>> + * 'old'. This updates the submodule's HEAD, index, and work tree but
>> + * does not touch its gitlink entry in the superproject.
>> + *
>> + * If the submodule did not previously exist, then 'old' should be NULL.
>> + * Similarly, if the submodule is to be removed, 'new' should be NULL.
>> + *
>> + * If updating the submodule would cause local changes to be overwritten,
>> + * then instead of updating the submodule, this function prints an error
>> + * message and returns -1.
>
> This is not a new issue (the removed comment did not mention this at
> all), but is it correct to say that updates to "index and work tree"
> was as if we did "git -C $path checkout new" (and of course, HEAD in
> the $path submodule must be at 'old')?

I don't understand the question.  This comment doesn't say it's like
"git checkout" --- are you saying it should?

The function is more like "git read-tree -m -u" (or --reset when
SUBMODULE_MOVE_HEAD_FORCE is passed) than like "git checkout".
Perhaps what you are hinting at is that read-tree --recurse-submodules
is not necessarily the right primitive to implement "git checkout"
with.  But that's a separate issue from documenting the current
behavior of the function.

> What should happen if 'old' does not match reality (i.e. old is NULL
> but HEAD does point at some commit, old and HEAD are different,
> etc.)?  Should the call be aborted?

No.

Thanks,
Jonathan

>> + * If the submodule is not active, this does nothing and returns 0.
>> + */
>>  #define SUBMODULE_MOVE_HEAD_DRY_RUN (1<<0)
>>  #define SUBMODULE_MOVE_HEAD_FORCE   (1<<1)
>>  extern int submodule_move_head(const char *path,

^ permalink raw reply	[relevance 14%]

* Re: [PATCH] submodule: avoid sentence-lego in translated string
  2017-10-09 21:11     ` Jean-Noël AVILA
@ 2017-10-10  1:26       ` Junio C Hamano
  2017-10-10  2:35         ` Changwoo Ryu
  0 siblings, 1 reply; 200+ results
From: Junio C Hamano @ 2017-10-10  1:26 UTC (permalink / raw)
  To: Jean-Noël AVILA; +Cc: Stefan Beller, Martin Ågren, brian m. carlson, Philip Oakley, Git Mailing List, Jiang Xin, Alexander Shopov, Jordi Mas, Ralf Thielow, Christopher Díaz, Marco Paolone, Changwoo Ryu, Vasco Almeida, Dimitriy Ryazantcev, Peter Krefting, Trần Ngọc Quân, Jacques Viviers, m4sk1n

Jean-Noël AVILA <jn.avila@free.fr> writes:

> On Monday, 9 October 2017, 09:47:26 CEST Stefan Beller wrote:
>
>> I always assumed that translators are aware of these issues and sort of
>> work around this somehow, maybe like this:
>> 
>>   "submodule entry '%s' (%s) is not a commit. It is of type %s"
>
> Translators can be aware of the issue if the coder commented the 
> internationalization string with some possible candidates for the placeholders 
> when it is not clear unless you check in the source code. Much effort was 
> poured into translating the technical terms in other parts of Git; it seems 
> awkward to just step back in this occurence.

I do not see this particular case as "stepping back", though.

Our users do not spell "git cat-file -t commit v2.0^{commit}" with
'commit' translated to their language, right?  Shouldn't an error
message output use the same phrase the input side requests users to
use?

^ permalink raw reply	[relevance 7%]

* Re: [PATCH] submodule: avoid sentence-lego in translated string
  2017-10-10  1:26       ` Junio C Hamano
@ 2017-10-10  2:35         ` Changwoo Ryu
  2017-10-10 18:14           ` Martin Ågren
  0 siblings, 1 reply; 200+ results
From: Changwoo Ryu @ 2017-10-10  2:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jean-Noël AVILA, Stefan Beller, Martin Ågren, brian m. carlson, Philip Oakley, Git Mailing List, Jiang Xin, Alexander Shopov, Jordi Mas, Ralf Thielow, Christopher Díaz, Marco Paolone, Vasco Almeida, Dimitriy Ryazantcev, Peter Krefting, Trần Ngọc Quân, Jacques Viviers, m4sk1n

2017-10-10 10:26 GMT+09:00 Junio C Hamano <gitster@pobox.com>:
> Jean-Noël AVILA <jn.avila@free.fr> writes:
>
>> On Monday, 9 October 2017, 09:47:26 CEST Stefan Beller wrote:
>>
>>> I always assumed that translators are aware of these issues and sort of
>>> work around this somehow, maybe like this:
>>>
>>>   "submodule entry '%s' (%s) is not a commit. It is of type %s"
>>
>> Translators can be aware of the issue if the coder commented the
>> internationalization string with some possible candidates for the placeholders
>> when it is not clear unless you check in the source code. Much effort was
>> poured into translating the technical terms in other parts of Git; it seems
>> awkward to just step back in this occurence.
>
> I do not see this particular case as "stepping back", though.
>
> Our users do not spell "git cat-file -t commit v2.0^{commit}" with
> 'commit' translated to their language, right?  Shouldn't an error
> message output use the same phrase the input side requests users to
> use?

Users know the limit of command-line translation. They type "commit"
to commit but they see translated "commit" in output messages. It is
actually confusing. But the untranslated English literals in the
middle of translated sentences does not remove the confusion but
increase it in a different way.

^ permalink raw reply	[relevance 7%]

* Re: [RFC PATCH 2/4] change submodule push test to use proper repository setup
  2017-10-09 18:20   ` Stefan Beller
@ 2017-10-10 13:03     ` Heiko Voigt
  2017-10-10 18:39       ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Heiko Voigt @ 2017-10-10 13:03 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Jonathan Nieder, Jens Lehmann, Brandon Williams

Hi,

On Mon, Oct 09, 2017 at 11:20:51AM -0700, Stefan Beller wrote:
> On Fri, Oct 6, 2017 at 3:32 PM, Heiko Voigt <hvoigt@hvoigt.net> wrote:
> > NOTE: The argument in this message is not correct, see description in
> > cover letter.
> >
> > The setup of the repositories in this test is using gitlinks without the
> > .gitmodules infrastructure. It is however testing convenience features
> > like --recurse-submodules=on-demand. These features are already not
> > supported by fetch without a .gitmodules file. This leads us to the
> > conclusion that it is not really used here as well.
> >
> > Let's use the usual submodule commands to setup the repository in a
> > typical way. This also has the advantage that we are testing with a
> > repository structure that is more similar to one we could expect on a
> > users setup.
> >
> > Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
> > ---
> >
> > As mentioned in the cover letter. This seems to be the only test that
> > ensures that we stay compatible with setups without .gitmodules. Maybe
> > we should add/revive some?
> 
> An interesting discussion covering this topic is found at
> https://public-inbox.org/git/20170606035650.oykbz2uc4xkr3cr2@sigill.intra.peff.net/

Thanks for that pointer. So in that discussion Junio said that the
recursive operations should succeed if we have everything necessary at
hand. I kind of agree because why should we limit usage when not
necessary. On the other hand we want git to be easy to use. And that
example from Peff is perfect as a demonstration of a incosistency we
currently have:

git clone git://some.where.git/submodule.git
git add submodule

is an operation I remember, I did, when first getting in contact with
submodules (many years back), since that is one intuitive way. And the
thing is: It works, kind of... Only later I discovered that one actually
needs to us a special submodule command to get everything approriately
setup to work together with others.

If everyone agrees that submodules are the default way of handling
repositories insided repositories, IMO, 'git add' should also alter
.gitmodules by default. We could provide a switch to avoid doing that.

An intermediate solution would be to warn but in the long run my goal
for submodules is and always was: Make them behave as close to files as
possible. And why should a 'git add submodule' not magically do
everything it can to make submodules just work? I can look into a patch
for that if people agree here...

Regarding handling of gitlinks with or without .gitmodules:

Currently we are actually in some intermediate state:

 * If there is no .gitmodules file: No submodule processing on any
   gitlinks (AFAIK)
 * If there is a .gitmodules files with some submodule configured: Do
   recursive fetch and push as far as possible on gitlinks.

So I am not sure whether there are actually many users (knowingly)
using a mix of some submodules configured and some not and then relying
on the submodule infrastructure.

I would rather expect two sorts of users:

  1. Those that do use .gitmodules

  2. Those that do *not* use .gitmodules

Users that do not use any .gitmodules file will currently (AFAIK) not
get any submodule handling. So the question is are there really many
"mixed users"? My guess would be no.
Because without those using this mixed we could switch to saying: "You
need to have a .gitmodules file for submodule handling" without much
fallout from breaking users use cases.

Maybe we can test this out somehow? My patch series would be ready in
that case, just had to drop the first patch and adjust the commit
message of this one.

Cheers Heiko

^ permalink raw reply	[relevance 23%]

* Re: [PATCH] submodule: avoid sentence-lego in translated string
  2017-10-10  2:35         ` Changwoo Ryu
@ 2017-10-10 18:14           ` Martin Ågren
  0 siblings, 0 replies; 200+ results
From: Martin Ågren @ 2017-10-10 18:14 UTC (permalink / raw)
  To: Changwoo Ryu; +Cc: Junio C Hamano, Jean-Noël AVILA, Stefan Beller, brian m. carlson, Philip Oakley, Git Mailing List, Jiang Xin, Alexander Shopov, Jordi Mas, Ralf Thielow, Christopher Díaz, Marco Paolone, Vasco Almeida, Dimitriy Ryazantcev, Peter Krefting, Trần Ngọc Quân, Jacques Viviers, m4sk1n

On 10 October 2017 at 04:35, Changwoo Ryu <cwryu@debian.org> wrote:
> 2017-10-10 10:26 GMT+09:00 Junio C Hamano <gitster@pobox.com>:
>> Jean-Noël AVILA <jn.avila@free.fr> writes:
>>
>>> On Monday, 9 October 2017, 09:47:26 CEST Stefan Beller wrote:
>>>
>>>> I always assumed that translators are aware of these issues and sort of
>>>> work around this somehow, maybe like this:
>>>>
>>>>   "submodule entry '%s' (%s) is not a commit. It is of type %s"
>>>
>>> Translators can be aware of the issue if the coder commented the
>>> internationalization string with some possible candidates for the placeholders
>>> when it is not clear unless you check in the source code. Much effort was
>>> poured into translating the technical terms in other parts of Git; it seems
>>> awkward to just step back in this occurence.
>>
>> I do not see this particular case as "stepping back", though.
>>
>> Our users do not spell "git cat-file -t commit v2.0^{commit}" with
>> 'commit' translated to their language, right?  Shouldn't an error
>> message output use the same phrase the input side requests users to
>> use?

I thought Jean-Noël meant at least partially the translator-experience,
not just the user-experience, but I might be wrong.

I prepared a patch to give a TRANSLATORS:-comment, but then I realized
that we have more instances like this with `typename()`. Actually, quite
often we avoid the issue (intentionally or unintentionally) by writing
"of type %s", but other times, we do the "is a %s". So I don't know,
maybe it all works anyway. The regular translators have now received 10
mails (11 counting this) so might be aware of this particular string by
now. :-/

> Users know the limit of command-line translation. They type "commit"
> to commit but they see translated "commit" in output messages. It is
> actually confusing. But the untranslated English literals in the
> middle of translated sentences does not remove the confusion but
> increase it in a different way.

What you describe seems plausible, but I have to admit that I don't use
i18n-ized software myself.

Martin

^ permalink raw reply	[relevance 6%]

* Re: [RFC PATCH 2/4] change submodule push test to use proper repository setup
  2017-10-10 13:03     ` Heiko Voigt
@ 2017-10-10 18:39       ` Stefan Beller
  2017-10-10 23:31         ` Junio C Hamano
                           ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Stefan Beller @ 2017-10-10 18:39 UTC (permalink / raw)
  To: Heiko Voigt, Josh Triplett; +Cc: git, Jonathan Nieder, Jens Lehmann, Brandon Williams

On Tue, Oct 10, 2017 at 6:03 AM, Heiko Voigt <hvoigt@hvoigt.net> wrote:

>> > As mentioned in the cover letter. This seems to be the only test that
>> > ensures that we stay compatible with setups without .gitmodules. Maybe
>> > we should add/revive some?
>>
>> An interesting discussion covering this topic is found at
>> https://public-inbox.org/git/20170606035650.oykbz2uc4xkr3cr2@sigill.intra.peff.net/
>
> Thanks for that pointer. So in that discussion Junio said that the
> recursive operations should succeed if we have everything necessary at
> hand. I kind of agree because why should we limit usage when not
> necessary. On the other hand we want git to be easy to use. And that
> example from Peff is perfect as a demonstration of a incosistency we
> currently have:
>
> git clone git://some.where.git/submodule.git
> git add submodule
>
> is an operation I remember, I did, when first getting in contact with
> submodules (many years back), since that is one intuitive way. And the
> thing is: It works, kind of... Only later I discovered that one actually
> needs to us a special submodule command to get everything approriately
> setup to work together with others.

I agree that we ought to not block off users "because we can", but rather
perform the operation if possible with the data at hand.

Note that the result of the discussion `jk/warn-add-gitlink actually`
warns about adding a raw gitlink now, such that we hint at using
"git submodule add", directly.

So you propose to make git-add behave like "git submodule add"
(i.e. also add the .gitmodules entry for name/path/URL), which I
like from a submodule perspective.

However other users of gitlinks might be confused[1], which is why
I refrained from "making every gitlink into a submodule". Specifically
the more powerful a submodule operation is (the more fluff adds),
the harder it should be for people to mis-use it.

[1] https://github.com/git-series/git-series/blob/master/INTERNALS.md
     "git-series uses gitlinks to store pointer to commits in its own repo."

> If everyone agrees that submodules are the default way of handling
> repositories insided repositories, IMO, 'git add' should also alter
> .gitmodules by default. We could provide a switch to avoid doing that.

I wonder if that switch should be default-on (i.e. not treat a gitlink as
a submodule initially, behavior as-is, and then eventually we will
die() on unconfigured repos, expecting the user to make the decision)

> An intermediate solution would be to warn

That is already implemented by Peff.

> but in the long run my goal
> for submodules is and always was: Make them behave as close to files as
> possible. And why should a 'git add submodule' not magically do
> everything it can to make submodules just work? I can look into a patch
> for that if people agree here...

I'd love to see this implemented. I cc'd Josh (the author of git-series), who
may disagree with this, or has some good input how to go forward without
breaking git-series.

> Regarding handling of gitlinks with or without .gitmodules:
>
> Currently we are actually in some intermediate state:
>
>  * If there is no .gitmodules file: No submodule processing on any
>    gitlinks (AFAIK)

AFAIK this is true.

>  * If there is a .gitmodules files with some submodule configured: Do
>    recursive fetch and push as far as possible on gitlinks.

* If submodule.recurse is set, then we also treat submodules like files
  for checkout, reset, read-tree.

> So I am not sure whether there are actually many users (knowingly)
> using a mix of some submodules configured and some not and then relying
> on the submodule infrastructure.
>
> I would rather expect two sorts of users:
>
>   1. Those that do use .gitmodules

Those want to reap all benefits of good submodules.

>
>   2. Those that do *not* use .gitmodules

As said above, we don't know if those users are
"holding submodules wrong" or are using gitlinks for
magic tricks (unrelated to submodules).

>
> Users that do not use any .gitmodules file will currently (AFAIK) not
> get any submodule handling. So the question is are there really many
> "mixed users"? My guess would be no.

I hope that there are few (if any) users of these mixed setups.

> Because without those using this mixed we could switch to saying: "You
> need to have a .gitmodules file for submodule handling" without much
> fallout from breaking users use cases.

That seems reasonable to me, actually.

> Maybe we can test this out somehow? My patch series would be ready in
> that case, just had to drop the first patch and adjust the commit
> message of this one.

I wonder how we would test this, though? Do you have any idea
(even vague) how we'd accomplish such a measurement?
I fear we'll have to go this way blindly.

Cheers,
Stefan

>
> Cheers Heiko

^ permalink raw reply	[relevance 24%]

* Re: [RFC PATCH 2/4] change submodule push test to use proper repository setup
  2017-10-10 18:39       ` Stefan Beller
@ 2017-10-10 23:31         ` Junio C Hamano
  2017-10-10 23:41           ` Stefan Beller
  2017-10-11 14:56           ` Heiko Voigt
  2017-10-11 14:52         ` Heiko Voigt
  2017-10-11 15:10         ` Josh Triplett
  2 siblings, 2 replies; 200+ results
From: Junio C Hamano @ 2017-10-10 23:31 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Heiko Voigt, Josh Triplett, git\, Jonathan Nieder, Jens Lehmann, Brandon Williams

Stefan Beller <sbeller@google.com> writes:

> So you propose to make git-add behave like "git submodule add"
> (i.e. also add the .gitmodules entry for name/path/URL), which I
> like from a submodule perspective.
>
> However other users of gitlinks might be confused[1], which is why
> I refrained from "making every gitlink into a submodule". Specifically
> the more powerful a submodule operation is (the more fluff adds),
> the harder it should be for people to mis-use it.

A few questions that come to mind are:

 - Does "git add sub/" have enough information to populate
   .gitmodules?  If we have reasonable "default" values for
   .gitmodules entries (e.g. missing URL means we won't fetch when
   asked to go recursively fetch), perhaps we can leave everything
   other than "submodule.$name.path" undefined.

 - Can't we help those who have gitlinks without .gitmodules entries
   exactly the same way as above, i.e. when we see a gitlink and try
   to treat it as a submodule, we'd first try to look it up from
   .gitmodules (by going from path to name and then to
   submodule.$name.$var); the above "'git add sub/' would add an
   entry for .gitmodules" wish is based on the assumption that there
   are reasonable "default" values for each of these $var--so by
   basing on the same assumption, we can "pretend" as if these
   submodule.$name.$var were in .gitmodules file when we see
   gitlinks without .gitmodules entries.  IOW, if "git add sub/" can
   add .gitmodules to help people without having to type "git
   submodule add sub/", then we can give exactly the same degree of
   help without even modifying .gitmodules when "git add sub/" is
   run.

 - Even if we could solve it with "git add sub/" that adds to
   .gitmodules, is it a good solution, when we can solve the same
   thing without having to do so?




^ permalink raw reply	[relevance 24%]

* Re: [RFC PATCH 2/4] change submodule push test to use proper repository setup
  2017-10-10 23:31         ` Junio C Hamano
@ 2017-10-10 23:41           ` Stefan Beller
  2017-10-11 14:56           ` Heiko Voigt
  1 sibling, 0 replies; 200+ results
From: Stefan Beller @ 2017-10-10 23:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Heiko Voigt, Josh Triplett, git, Jonathan Nieder, Jens Lehmann, Brandon Williams

On Tue, Oct 10, 2017 at 4:31 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> So you propose to make git-add behave like "git submodule add"
>> (i.e. also add the .gitmodules entry for name/path/URL), which I
>> like from a submodule perspective.
>>
>> However other users of gitlinks might be confused[1], which is why
>> I refrained from "making every gitlink into a submodule". Specifically
>> the more powerful a submodule operation is (the more fluff adds),
>> the harder it should be for people to mis-use it.
>
> A few questions that come to mind are:
>
>  - Does "git add sub/" have enough information to populate
>    .gitmodules?  If we have reasonable "default" values for
>    .gitmodules entries (e.g. missing URL means we won't fetch when
>    asked to go recursively fetch), perhaps we can leave everything
>    other than "submodule.$name.path" undefined.

I think we would want to populate path and URL only.

>
>  - Can't we help those who have gitlinks without .gitmodules entries
>    exactly the same way as above, i.e. when we see a gitlink and try
>    to treat it as a submodule, we'd first try to look it up from
>    .gitmodules (by going from path to name and then to
>    submodule.$name.$var); the above "'git add sub/' would add an
>    entry for .gitmodules" wish is based on the assumption that there
>    are reasonable "default" values for each of these $var--so by
>    basing on the same assumption, we can "pretend" as if these
>    submodule.$name.$var were in .gitmodules file when we see
>    gitlinks without .gitmodules entries.  IOW, if "git add sub/" can
>    add .gitmodules to help people without having to type "git
>    submodule add sub/", then we can give exactly the same degree of
>    help without even modifying .gitmodules when "git add sub/" is
>    run.

I do not understand the gist of this paragraph, other then:

  "When git-add <repository> encounters a section submodule.<name>.*,
   do not modify it; We can assume it is sane already."

>  - Even if we could solve it with "git add sub/" that adds to
>    .gitmodules, is it a good solution, when we can solve the same
>    thing without having to do so?

I am confused even more.

So you suggest that "git add [--gitlink=submodule]" taking on the
responsibilities of "git submodule add" is a bad idea?

I thought we had the same transition from "git remote update" to
"git fetch", which eventually superseded the former.

^ permalink raw reply	[relevance 25%]

* [ANNOUNCE] Git v2.15.0-rc1
@ 2017-10-11  6:35 Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2017-10-11  6:35 UTC (permalink / raw)
  To: git; +Cc: Linux Kernel

A release candidate Git v2.15.0-rc1 is now available for testing
at the usual places.  It is comprised of 721 non-merge commits
since v2.14.0, contributed by 72 people, 22 of which are new faces.

The tarballs are found at:

    https://www.kernel.org/pub/software/scm/git/testing/

The following public repositories all have a copy of the
'v2.15.0-rc1' tag and the 'master' branch that the tag points at:

  url = https://kernel.googlesource.com/pub/scm/git/git
  url = git://repo.or.cz/alt-git.git
  url = https://github.com/gitster/git

New contributors whose contributions weren't in v2.14.0 are as follows.
Welcome to the Git development community!

  Ann T Ropea, Daniel Watkins, Derrick Stolee, Dimitrios
  Christidis, Eric Rannaud, Evan Zacks, Hielke Christian Braun,
  Ian Campbell, Ilya Kantor, Jameson Miller, Job Snijders, Joel
  Teichroeb, joernchen, Łukasz Gryglicki, Manav Rathi, Martin
  Ågren, Michael Forney, Patryk Obara, Randall S. Becker, Ross
  Kabus, Taylor Blau, and Urs Thuermann.

Returning contributors who helped this release are as follows.
Thanks for your continued support.

  Adam Dinwoodie, Ævar Arnfjörð Bjarmason, Andreas Heiduk,
  Anthony Sottile, Ben Boeckel, Brandon Casey, Brandon Williams,
  brian m. carlson, Christian Couder, Eric Blake, Han-Wen Nienhuys,
  Heiko Voigt, Jean-Noel Avila, Jeff Hostetler, Jeff King, Johannes
  Schindelin, Johannes Sixt, Jonathan Nieder, Jonathan Tan,
  Junio C Hamano, Kaartic Sivaraam, Kevin Daudt, Kevin Willford,
  Lars Schneider, Martin Koegler, Matthieu Moy, Max Kirillov,
  Michael Haggerty, Michael J Gruber, Nguyễn Thái Ngọc Duy,
  Nicolas Morey-Chaisemartin, Øystein Walle, Paolo Bonzini,
  Pat Thoyts, Philip Oakley, Phillip Wood, Raman Gupta, Ramsay
  Jones, René Scharfe, Sahil Dua, Santiago Torres, Stefan Beller,
  Stephan Beyer, Takashi Iwai, Thomas Braun, Thomas Gummerer,
  Todd Zullinger, Tom G. Christensen, Torsten Bögershausen,
  and William Duclot.

----------------------------------------------------------------

Git 2.15 Release Notes (draft)
==============================

Backward compatibility notes and other notable changes.

 * Use of an empty string as a pathspec element that is used for
   'everything matches' is still warned and Git asks users to use a
   more explicit '.' for that instead.  The hope is that existing
   users will not mind this change, and eventually the warning can be
   turned into a hard error, upgrading the deprecation into removal of
   this (mis)feature.  That is now scheduled to happen in Git v2.16,
   the next major release after this one.

 * Git now avoids blindly falling back to ".git" when the setup
   sequence said we are _not_ in Git repository.  A corner case that
   happens to work right now may be broken by a call to BUG().
   We've tried hard to locate such cases and fixed them, but there
   might still be cases that need to be addressed--bug reports are
   greatly appreciated.

 * "branch --set-upstream" that has been deprecated in Git 1.8 has
   finally been retired.


Updates since v2.14
-------------------

UI, Workflows & Features

 * An example that is now obsolete has been removed from a sample hook,
   and an old example in it that added a sign-off manually has been
   improved to use the interpret-trailers command.

 * The advice message given when "git rebase" stops for conflicting
   changes has been improved.

 * The "rerere-train" script (in contrib/) learned the "--overwrite"
   option to allow overwriting existing recorded resolutions.

 * "git contacts" (in contrib/) now lists the address on the
   "Reported-by:" trailer to its output, in addition to those on
   S-o-b: and other trailers, to make it easier to notify (and thank)
   the original bug reporter.

 * "git rebase", especially when it is run by mistake and ends up
   trying to replay many changes, spent long time in silence.  The
   command has been taught to show progress report when it spends
   long time preparing these many changes to replay (which would give
   the user a chance to abort with ^C).

 * "git merge" learned a "--signoff" option to add the Signed-off-by:
   trailer with the committer's name.

 * "git diff" learned to optionally paint new lines that are the same
   as deleted lines elsewhere differently from genuinely new lines.

 * "git interpret-trailers" learned to take the trailer specifications
   from the command line that overrides the configured values.

 * "git interpret-trailers" has been taught a "--parse" and a few
   other options to make it easier for scripts to grab existing
   trailer lines from a commit log message.

 * The "--format=%(trailers)" option "git log" and its friends take
   learned to take the 'unfold' and 'only' modifiers to normalize its
   output, e.g. "git log --format=%(trailers:only,unfold)".

 * "gitweb" shows a link to visit the 'raw' contents of blbos in the
   history overview page.

 * "[gc] rerereResolved = 5.days" used to be invalid, as the variable
   is defined to take an integer counting the number of days.  It now
   is allowed.

 * The code to acquire a lock on a reference (e.g. while accepting a
   push from a client) used to immediately fail when the reference is
   already locked---now it waits for a very short while and retries,
   which can make it succeed if the lock holder was holding it during
   a read-only operation.

 * "branch --set-upstream" that has been deprecated in Git 1.8 has
   finally been retired.

 * The codepath to call external process filter for smudge/clean
   operation learned to show the progress meter.

 * "git rev-parse" learned "--is-shallow-repository", that is to be
   used in a way similar to existing "--is-bare-repository" and
   friends.

 * "git describe --match <pattern>" has been taught to play well with
   the "--all" option.

 * "git branch" learned "-c/-C" to create a new branch by copying an
   existing one.

 * Some commands (most notably "git status") makes an opportunistic
   update when performing a read-only operation to help optimize later
   operations in the same repository.  The new "--no-optional-locks"
   option can be passed to Git to disable them.

 * "git for-each-ref --format=..." learned a new format element,
   %(trailers), to show only the commit log trailer part of the log
   message.


Performance, Internal Implementation, Development Support etc.

 * Conversion from uchar[20] to struct object_id continues.

 * Start using selected c99 constructs in small, stable and
   essentialpart of the system to catch people who care about
   older compilers that do not grok them.

 * The filter-process interface learned to allow a process with long
   latency give a "delayed" response.

 * Many uses of comparision callback function the hashmap API uses
   cast the callback function type when registering it to
   hashmap_init(), which defeats the compile time type checking when
   the callback interface changes (e.g. gaining more parameters).
   The callback implementations have been updated to take "void *"
   pointers and cast them to the type they expect instead.

 * Because recent Git for Windows do come with a real msgfmt, the
   build procedure for git-gui has been updated to use it instead of a
   hand-rolled substitute.

 * "git grep --recurse-submodules" has been reworked to give a more
   consistent output across submodule boundary (and do its thing
   without having to fork a separate process).

 * A helper function to read a single whole line into strbuf
   mistakenly triggered OOM error at EOF under certain conditions,
   which has been fixed.
   (merge 642956cf45 rs/strbuf-getwholeline-fix later to maint).

 * The "ref-store" code reorganization continues.

 * "git commit" used to discard the index and re-read from the filesystem
   just in case the pre-commit hook has updated it in the middle; this
   has been optimized out when we know we do not run the pre-commit hook.
   (merge 680ee550d7 kw/commit-keep-index-when-pre-commit-is-not-run later to maint).

 * Updates to the HTTP layer we made recently unconditionally used
   features of libCurl without checking the existence of them, causing
   compilation errors, which has been fixed.  Also migrate the code to
   check feature macros, not version numbers, to cope better with
   libCurl that vendor ships with backported features.

 * The API to start showing progress meter after a short delay has
   been simplified.
   (merge 8aade107dd jc/simplify-progress later to maint).

 * Code clean-up to avoid mixing values read from the .gitmodules file
   and values read from the .git/config file.

 * We used to spend more than necessary cycles allocating and freeing
   piece of memory while writing each index entry out.  This has been
   optimized.

 * Platforms that ship with a separate sha1 with collision detection
   library can link to it instead of using the copy we ship as part of
   our source tree.

 * Code around "notes" have been cleaned up.
   (merge 3964281524 mh/notes-cleanup later to maint).

 * The long-standing rule that an in-core lockfile instance, once it
   is used, must not be freed, has been lifted and the lockfile and
   tempfile APIs have been updated to reduce the chance of programming
   errors.

 * Our hashmap implementation in hashmap.[ch] is not thread-safe when
   adding a new item needs to expand the hashtable by rehashing; add
   an API to disable the automatic rehashing to work it around.

 * Many of our programs consider that it is OK to release dynamic
   storage that is used throughout the life of the program by simply
   exiting, but this makes it harder to leak detection tools to avoid
   reporting false positives.  Plug many existing leaks and introduce
   a mechanism for developers to mark that the region of memory
   pointed by a pointer is not lost/leaking to help these tools.

 * As "git commit" to conclude a conflicted "git merge" honors the
   commit-msg hook, "git merge" that records a merge commit that
   cleanly auto-merges should, but it didn't.

 * The codepath for "git merge-recursive" has been cleaned up.

 * Many leaks of strbuf have been fixed.

 * "git imap-send" has our own implementation of the protocol and also
   can use more recent libCurl with the imap protocol support.  Update
   the latter so that it can use the credential subsystem, and then
   make it the default option to use, so that we can eventually
   deprecate and remove the former.

 * "make style" runs git-clang-format to help developers by pointing
   out coding style issues.

 * A test to demonstrate "git mv" failing to adjust nested submodules
   has been added.
   (merge c514167df2 hv/mv-nested-submodules-test later to maint).

 * On Cygwin, "ulimit -s" does not report failure but it does not work
   at all, which causes an unexpected success of some tests that
   expect failures under a limited stack situation.  This has been
   fixed.

 * Many codepaths have been updated to squelch -Wimplicit-fallthrough
   warnings from Gcc 7 (which is a good code hygiene).

 * Add a helper for DLL loading in anticipation for its need in a
   future topic RSN.

 * "git status --ignored", when noticing that a directory without any
   tracked path is ignored, still enumerated all the ignored paths in
   the directory, which is unnecessary.  The codepath has been
   optimized to avoid this overhead.

 * The final batch to "git rebase -i" updates to move more code from
   the shell script to C has been merged.

 * Operations that do not touch (majority of) packed refs have been
   optimized by making accesses to packed-refs file lazy; we no longer
   pre-parse everything, and an access to a single ref in the
   packed-refs does not touch majority of irrelevant refs, either.

 * Add comment to clarify that the style file is meant to be used with
   clang-5 and the rules are still work in progress.

 * Many variables that points at a region of memory that will live
   throughout the life of the program have been marked with UNLEAK
   marker to help the leak checkers concentrate on real leaks..

 * Plans for weaning us off of SHA-1 has been documented.

 * A new "oidmap" API has been introduced and oidset API has been
   rewritten to use it.


Also contains various documentation updates and code clean-ups.


Fixes since v2.14
-----------------

 * "%C(color name)" in the pretty print format always produced ANSI
   color escape codes, which was an early design mistake.  They now
   honor the configuration (e.g. "color.ui = never") and also tty-ness
   of the output medium.

 * The http.{sslkey,sslCert} configuration variables are to be
   interpreted as a pathname that honors "~[username]/" prefix, but
   weren't, which has been fixed.

 * Numerous bugs in walking of reflogs via "log -g" and friends have
   been fixed.

 * "git commit" when seeing an totally empty message said "you did not
   edit the message", which is clearly wrong.  The message has been
   corrected.

 * When a directory is not readable, "gitweb" fails to build the
   project list.  Work this around by skipping such a directory.

 * Some versions of GnuPG fails to kill gpg-agent it auto-spawned
   and such a left-over agent can interfere with a test.  Work it
   around by attempting to kill one before starting a new test.

 * A recently added test for the "credential-cache" helper revealed
   that EOF detection done around the time the connection to the cache
   daemon is torn down were flaky.  This was fixed by reacting to
   ECONNRESET and behaving as if we got an EOF.

 * "git log --tag=no-such-tag" showed log starting from HEAD, which
   has been fixed---it now shows nothing.

 * The "tag.pager" configuration variable was useless for those who
   actually create tag objects, as it interfered with the use of an
   editor.  A new mechanism has been introduced for commands to enable
   pager depending on what operation is being carried out to fix this,
   and then "git tag -l" is made to run pager by default.

 * "git push --recurse-submodules $there HEAD:$target" was not
   propagated down to the submodules, but now it is.

 * Commands like "git rebase" accepted the --rerere-autoupdate option
   from the command line, but did not always use it.  This has been
   fixed.

 * "git clone --recurse-submodules --quiet" did not pass the quiet
   option down to submodules.

 * Test portability fix for OBSD.

 * Portability fix for OBSD.

 * "git am -s" has been taught that some input may end with a trailer
   block that is not Signed-off-by: and it should refrain from adding
   an extra blank line before adding a new sign-off in such a case.

 * "git svn" used with "--localtime" option did not compute the tz
   offset for the timestamp in question and instead always used the
   current time, which has been corrected.

 * Memory leak in an error codepath has been plugged.

 * "git stash -u" used the contents of the committed version of the
   ".gitignore" file to decide which paths are ignored, even when the
   file has local changes.  The command has been taught to instead use
   the locally modified contents.

 * bash 4.4 or newer gave a warning on NUL byte in command
   substitution done in "git stash"; this has been squelched.

 * "git grep -L" and "git grep --quiet -L" reported different exit
   codes; this has been corrected.

 * When handshake with a subprocess filter notices that the process
   asked for an unknown capability, Git did not report what program
   the offending subprocess was running.  This has been corrected.

 * "git apply" that is used as a better "patch -p1" failed to apply a
   taken from a file with CRLF line endings to a file with CRLF line
   endings.  The root cause was because it misused convert_to_git()
   that tried to do "safe-crlf" processing by looking at the index
   entry at the same path, which is a nonsense---in that mode, "apply"
   is not working on the data in (or derived from) the index at all.
   This has been fixed.

 * Killing "git merge --edit" before the editor returns control left
   the repository in a state with MERGE_MSG but without MERGE_HEAD,
   which incorrectly tells the subsequent "git commit" that there was
   a squash merge in progress.  This has been fixed.

 * "git archive" did not work well with pathspecs and the
   export-ignore attribute.

 * In addition to "cc: <a@dd.re.ss> # cruft", "cc: a@dd.re.ss # cruft"
   was taught to "git send-email" as a valid way to tell it that it
   needs to also send a carbon copy to <a@dd.re.ss> in the trailer
   section.
   (merge cc90750677 mm/send-email-cc-cruft later to maint).

 * "git branch -M a b" while on a branch that is completely unrelated
   to either branch a or branch b misbehaved when multiple worktree
   was in use.  This has been fixed.
   (merge 31824d180d nd/worktree-kill-parse-ref later to maint).

 * "git gc" and friends when multiple worktrees are used off of a
   single repository did not consider the index and per-worktree refs
   of other worktrees as the root for reachability traversal, making
   objects that are in use only in other worktrees to be subject to
   garbage collection.

 * A regression to "gitk --bisect" by a recent update has been fixed.
   (merge 1d0538e486 mh/packed-ref-store-prep later to maint).

 * "git -c submodule.recurse=yes pull" did not work as if the
   "--recurse-submodules" option was given from the command line.
   This has been corrected.

 * Unlike "git commit-tree < file", "git commit-tree -F file" did not
   pass the contents of the file verbatim and instead completed an
   incomplete line at the end, if exists.  The latter has been updated
   to match the behaviour of the former.
   (merge c818e74332 rk/commit-tree-make-F-verbatim later to maint).

 * Many codepaths did not diagnose write failures correctly when disks
   go full, due to their misuse of write_in_full() helper function,
   which have been corrected.
   (merge f48ecd38cb jk/write-in-full-fix later to maint).

 * "git help co" now says "co is aliased to ...", not "git co is".
   (merge b3a8076e0d ks/help-alias-label later to maint).

 * "git archive", especially when used with pathspec, stored an empty
   directory in its output, even though Git itself never does so.
   This has been fixed.
   (merge 4318094047 rs/archive-excluded-directory later to maint).

 * API error-proofing which happens to also squelch warnings from GCC.
   (merge c788c54cde tg/refs-allowed-flags later to maint).

 * The explanation of the cut-line in the commit log editor has been
   slightly tweaked.
   (merge 8c4b1a3593 ks/commit-do-not-touch-cut-line later to maint).

 * "git gc" tries to avoid running two instances at the same time by
   reading and writing pid/host from and to a lock file; it used to
   use an incorrect fscanf() format when reading, which has been
   corrected.
   (merge afe2fab72c aw/gc-lockfile-fscanf-fix later to maint).

 * The scripts to drive TravisCI has been reorganized and then an
   optimization to avoid spending cycles on a branch whose tip is
   tagged has been implemented.
   (merge 8376eb4a8f ls/travis-scriptify later to maint).

 * The test linter has been taught that we do not like "echo -e".
   (merge 1a6d46895d tb/test-lint-echo-e later to maint).

 * Code cmp.std.c nitpick.
   (merge ac7da78ede mh/for-each-string-list-item-empty-fix later to maint).

 * A regression fix for 2.11 that made the code to read the list of
   alternate object stores overrun the end of the string.
   (merge f0f7bebef7 jk/info-alternates-fix later to maint).

 * "git describe --match" learned to take multiple patterns in v2.13
   series, but the feature ignored the patterns after the first one
   and did not work at all.  This has been fixed.
   (merge da769d2986 jk/describe-omit-some-refs later to maint).

 * "git filter-branch" cannot reproduce a history with a tag without
   the tagger field, which only ancient versions of Git allowed to be
   created.  This has been corrected.
   (merge b2c1ca6b4b ic/fix-filter-branch-to-handle-tag-without-tagger later to maint).

 * "git cat-file --textconv" started segfaulting recently, which
   has been corrected.
   (merge cc0ea7c9e5 jk/diff-blob later to maint).

 * The built-in pattern to detect the "function header" for HTML did
   not match <H1>..<H6> elements without any attributes, which has
   been fixed.
   (merge 9c03caca2c ik/userdiff-html-h-element-fix later to maint).

 * "git mailinfo" was loose in decoding quoted printable and produced
   garbage when the two letters after the equal sign are not
   hexadecimal.  This has been fixed.
   (merge c8cf423eab rs/mailinfo-qp-decode-fix later to maint).

 * The machinery to create xdelta used in pack files received the
   sizes of the data in size_t, but lost the higher bits of them by
   storing them in "unsigned int" during the computation, which is
   fixed.

 * The delta format used in the packfile cannot reference data at
   offset larger than what can be expressed in 4-byte, but the
   generator for the data failed to make sure the offset does not
   overflow.  This has been corrected.

 * The documentation for '-X<option>' for merges was misleadingly
   written to suggest that "-s theirs" exists, which is not the case.
   (merge c25d98b2a7 jc/merge-x-theirs-docfix later to maint).

 * "git fast-export" with -M/-C option issued "copy" instruction on a
   path that is simultaneously modified, which was incorrect.
   (merge b3e8ca89cf jt/fast-export-copy-modify-fix later to maint).

 * Many codepaths have been updated to squelch -Wsign-compare
   warnings.
   (merge 071bcaab64 rj/no-sign-compare later to maint).

 * Memory leaks in various codepaths have been plugged.
   (merge 4d01a7fa65 ma/leakplugs later to maint).

 * Recent versions of "git rev-parse --parseopt" did not parse the
   option specification that does not have the optional flags (*=?!)
   correctly, which has been corrected.
   (merge a6304fa4c2 bc/rev-parse-parseopt-fix later to maint).

 * The checkpoint command "git fast-import" did not flush updates to
   refs and marks unless at least one object was created since the
   last checkpoint, which has been corrected, as these things can
   happen without any new object getting created.
   (merge 30e215a65c er/fast-import-dump-refs-on-checkpoint later to maint).

 * Spell the name of our system as "Git" in the output from
   request-pull script.
   (merge e66d7c37a5 ar/request-pull-phrasofix later to maint).

 * Fixes for a handful memory access issues identified by valgrind.
   (merge 2944a94c6b tg/memfixes later to maint).

 * Backports a moral equivalent of 2015 fix to the poll() emulation
   from the upstream gnulib to fix occasional breakages on HPE NonStop.
   (merge 61b2a1acaa rb/compat-poll-fix later to maint).

 * Users with "color.ui = always" in their configuration were broken
   by a recent change that made plumbing commands to pay attention to
   them as the patch created internally by "git add -p" were colored
   (heh) and made unusable.  Fix this regression by redefining
   'always' to mean the same thing as 'auto'.
   (merge 6be4595edb jk/ui-color-always-to-auto-maint later to maint).

 * In the "--format=..." option of the "git for-each-ref" command (and
   its friends, i.e. the listing mode of "git branch/tag"), "%(atom:)"
   (e.g. "%(refname:)", "%(body:)" used to error out.  Instead, treat
   them as if the colon and an empty string that follows it were not
   there.
   (merge bea4dbeafd tb/ref-filter-empty-modifier later to maint).

 * An ancient bug that made Git misbehave with creation/renaming of
   refs has been fixed.

 * Other minor doc, test and build updates and code cleanups.
   (merge f094b89a4d ma/parse-maybe-bool later to maint).
   (merge 39b00fa4d4 jk/drop-sha1-entry-pos later to maint).
   (merge 6cdf8a7929 ma/ts-cleanups later to maint).
   (merge 7560f547e6 ma/up-to-date later to maint).
   (merge 0db3dc75f3 rs/apply-epoch later to maint).
   (merge 74f1bd912b dw/diff-highlight-makefile-fix later to maint).
   (merge f991761eb8 jk/config-lockfile-leak-fix later to maint).
   (merge 150efef1e7 ma/pkt-line-leakfix later to maint).
   (merge 5554451de6 mg/timestamp-t-fix later to maint).
   (merge 276d0e35c0 ma/split-symref-update-fix later to maint).
   (merge 3bc4b8f7c7 bb/doc-eol-dirty later to maint).
   (merge c1bb33c99c jk/system-path-cleanup later to maint).
   (merge ab46e6fc72 cc/subprocess-handshake-missing-capabilities later to maint).
   (merge f7a32dd97f kd/doc-for-each-ref later to maint).
   (merge be94568bc7 ez/doc-duplicated-words-fix later to maint).
   (merge 01e4be6c3d ks/test-readme-phrasofix later to maint).
   (merge 217bb56d4f hn/typofix later to maint).
   (merge c08fd6388c jk/doc-read-tree-table-asciidoctor-fix later to maint).
   (merge c3342b362e ks/doc-use-camelcase-for-config-name later to maint).
   (merge 0bca165fdb jk/validate-headref-fix later to maint).
   (merge 93dbefb389 mr/doc-negative-pathspec later to maint).
   (merge 5e633326e4 ad/doc-markup-fix later to maint).
   (merge 9ca356fa8b rs/cocci-de-paren-call-params later to maint).
   (merge 7099153e8d rs/tag-null-pointer-arith-fix later to maint).
   (merge 0e187d758c rs/run-command-use-alloc-array later to maint).
   (merge e0222159fa jn/strbuf-doc-re-reuse later to maint).
   (merge 97487ea11a rs/qsort-s later to maint).
   (merge a9155c50bd sb/branch-avoid-repeated-strbuf-release later to maint).
   (merge f777623514 ks/branch-tweak-error-message-for-extra-args later to maint).
   (merge 33f3c683ec ks/verify-filename-non-option-error-message-tweak later to maint).
   (merge b3ea7dd32d jk/sha1-loose-object-info-fix later to maint).
   (merge 2720f6db5d rs/fsck-null-return-from-lookup later to maint).
   (merge 99b7b687a6 rs/rs-mailmap later to maint).
   (merge 7823655082 tb/complete-describe later to maint).
   (merge 7cbbf9d6a2 ls/filter-process-delayed later to maint).

----------------------------------------------------------------

Changes since v2.14.0 are as follows:

Adam Dinwoodie (1):
      doc: correct command formatting

Andreas Heiduk (2):
      doc: add missing values "none" and "default" for diff.wsErrorHighlight
      doc: clarify "config --bool" behaviour with empty string

Ann T Ropea (1):
      request-pull: capitalise "Git" to make it a proper noun

Anthony Sottile (1):
      git-grep: correct exit code with --quiet and -L

Ben Boeckel (1):
      Documentation: mention that `eol` can change the dirty status of paths

Brandon Casey (7):
      t1502: demonstrate rev-parse --parseopt option mis-parsing
      rev-parse parseopt: do not search help text for flag chars
      rev-parse parseopt: interpret any whitespace as start of help text
      git-rebase: don't ignore unexpected command line arguments
      t0040,t1502: Demonstrate parse_options bugs
      parse-options: write blank line to correct output stream
      parse-options: only insert newline in help text if needed

Brandon Williams (29):
      repo_read_index: don't discard the index
      repository: have the_repository use the_index
      submodule--helper: teach push-check to handle HEAD
      cache.h: add GITMODULES_FILE macro
      config: add config_from_gitmodules
      submodule: remove submodule.fetchjobs from submodule-config parsing
      submodule: remove fetch.recursesubmodules from submodule-config parsing
      submodule: check for unstaged .gitmodules outside of config parsing
      submodule: check for unmerged .gitmodules outside of config parsing
      submodule: merge repo_read_gitmodules and gitmodules_config
      grep: recurse in-process using 'struct repository'
      t7411: check configuration parsing errors
      submodule: don't use submodule_from_name
      add, reset: ensure submodules can be added or reset
      submodule--helper: don't overlay config in remote_submodule_branch
      submodule--helper: don't overlay config in update-clone
      fetch: don't overlay config with submodule-config
      submodule: don't rely on overlayed config when setting diffopts
      unpack-trees: don't respect submodule.update
      submodule: remove submodule_config callback routine
      diff: stop allowing diff to have submodules configured in .git/config
      submodule-config: remove support for overlaying repository config
      submodule-config: move submodule-config functions to submodule-config.c
      submodule-config: lazy-load a repository's .gitmodules file
      unpack-trees: improve loading of .gitmodules
      submodule: remove gitmodules_config
      clone: teach recursive clones to respect -q
      clang-format: outline the git project's coding style
      Makefile: add style build rule

Christian Couder (3):
      refs: use skip_prefix() in ref_is_hidden()
      sub-process: print the cmd when a capability is unsupported
      sha1-lookup: remove sha1_entry_pos() from header file

Daniel Watkins (1):
      diff-highlight: add clean target to Makefile

Derrick Stolee (1):
      cleanup: fix possible overflow errors in binary search

Dimitrios Christidis (1):
      fmt-merge-msg: fix coding style

Eric Blake (1):
      git-contacts: also recognise "Reported-by:"

Eric Rannaud (1):
      fast-import: checkpoint: dump branches/tags/marks even if object_count==0

Evan Zacks (1):
      doc: fix minor typos (extra/duplicated words)

Han-Wen Nienhuys (5):
      submodule.h: typofix
      submodule.c: describe submodule_to_gitdir() in a new comment
      real_path: clarify return value ownership
      read_gitfile_gently: clarify return value ownership.
      string-list.h: move documentation from Documentation/api/ into header

Heiko Voigt (2):
      t5526: fix some broken && chains
      add test for bug in git-mv for recursive submodules

Hielke Christian Braun (1):
      gitweb: skip unreadable subdirectories

Ian Campbell (4):
      filter-branch: reset $GIT_* before cleaning up
      filter-branch: preserve and restore $GIT_AUTHOR_* and $GIT_COMMITTER_*
      filter-branch: stash away ref map in a branch
      filter-branch: use hash-object instead of mktag

Ilya Kantor (1):
      userdiff: fix HTML hunk header regexp

Jameson Miller (1):
      Improve performance of git status --ignored

Jean-Noel Avila (1):
      i18n: add a missing space in message

Jeff Hostetler (1):
      hashmap: add API to disable item counting when threaded

Jeff King (129):
      t1414: document some reflog-walk oddities
      revision: disallow reflog walking with revs->limited
      log: clarify comment about reflog cycles
      log: do not free parents when walking reflog
      get_revision_1(): replace do-while with an early return
      rev-list: check reflog_info before showing usage
      reflog-walk: stop using fake parents
      reflog-walk: apply --since/--until to reflog dates
      check return value of verify_ref_format()
      docs/for-each-ref: update pointer to color syntax
      t: use test_decode_color rather than literal ANSI codes
      ref-filter: simplify automatic color reset
      ref-filter: abstract ref format into its own struct
      ref-filter: move need_color_reset_at_eol into ref_format
      ref-filter: provide a function for parsing sort options
      ref-filter: make parse_ref_filter_atom a private function
      ref-filter: factor out the parsing of sorting atoms
      ref-filter: pass ref_format struct to atom parsers
      color: check color.ui in git_default_config()
      for-each-ref: load config earlier
      rev-list: pass diffopt->use_colors through to pretty-print
      pretty: respect color settings for %C placeholders
      ref-filter: consult want_color() before emitting colors
      strbuf: use designated initializers in STRBUF_INIT
      t/lib-proto-disable: restore protocol.allow after config tests
      t5813: add test for hostname starting with dash
      connect: factor out "looks like command line option" check
      connect: reject dashed arguments for proxy commands
      connect: reject paths that look like command line options
      t6018: flesh out empty input/output rev-list tests
      revision: add rev_input_given flag
      rev-list: don't show usage when we see empty ref patterns
      revision: do not fallback to default when rev_input_given is set
      hashcmp: use memcmp instead of open-coded loop
      sha1_file: drop experimental GIT_USE_LOOKUP search
      trailer: put process_trailers() options into a struct
      interpret-trailers: add an option to show only the trailers
      interpret-trailers: add an option to show only existing trailers
      interpret-trailers: add an option to unfold values
      interpret-trailers: add --parse convenience option
      pretty: move trailer formatting to trailer.c
      t4205: refactor %(trailers) tests
      pretty: support normalization options for %(trailers)
      doc: fix typo in sendemail.identity
      config: use a static lock_file struct
      write_index_as_tree: cleanup tempfile on error
      setup_temporary_shallow: avoid using inactive tempfile
      setup_temporary_shallow: move tempfile struct into function
      verify_signed_buffer: prefer close_tempfile() to close()
      always check return value of close_tempfile
      tempfile: do not delete tempfile on failed close
      lockfile: do not rollback lock on failed close
      tempfile: prefer is_tempfile_active to bare access
      tempfile: handle NULL tempfile pointers gracefully
      tempfile: replace die("BUG") with BUG()
      tempfile: factor out activation
      tempfile: factor out deactivation
      tempfile: robustify cleanup handler
      tempfile: release deactivated strbufs instead of resetting
      tempfile: use list.h for linked list
      tempfile: remove deactivated list entries
      tempfile: auto-allocate tempfiles on heap
      lockfile: update lifetime requirements in documentation
      ref_lock: stop leaking lock_files
      stop leaking lock structs in some simple cases
      test-lib: --valgrind should not override --verbose-log
      test-lib: set LSAN_OPTIONS to abort by default
      add: free leaked pathspec after add_files_to_cache()
      update-index: fix cache entry leak in add_one_file()
      config: plug user_config leak
      reset: make tree counting less confusing
      reset: free allocated tree buffers
      repository: free fields before overwriting them
      set_git_dir: handle feeding gitdir to itself
      rev-parse: don't trim bisect refnames
      system_path: move RUNTIME_PREFIX to a sub-function
      git_extract_argv0_path: do nothing without RUNTIME_PREFIX
      add UNLEAK annotation for reducing leak false positives
      shortlog: skip format/parse roundtrip for internal traversal
      shell: drop git-cvsserver support by default
      archimport: use safe_pipe_capture for user input
      cvsimport: shell-quote variable used in backticks
      config: avoid "write_in_full(fd, buf, len) < len" pattern
      get-tar-commit-id: check write_in_full() return against 0
      avoid "write_in_full(fd, buf, len) != len" pattern
      convert less-trivial versions of "write_in_full() != len"
      pkt-line: check write_in_full() errors against "< 0"
      notes-merge: use ssize_t for write_in_full() return value
      config: flip return value of store_write_*()
      read_pack_header: handle signed/unsigned comparison in read result
      prefix_ref_iterator: break when we leave the prefix
      read_info_alternates: read contents into strbuf
      read_info_alternates: warn on non-trivial errors
      revision: replace "struct cmdline_pathspec" with argv_array
      cat-file: handle NULL object_context.path
      test-line-buffer: simplify command parsing
      curl_trace(): eliminate switch fallthrough
      consistently use "fallthrough" comments in switches
      doc: put literal block delimiter around table
      files-backend: prefer "0" for write_in_full() error check
      notes-merge: drop dead zero-write code
      prefer "!=" when checking read_in_full() result
      avoid looking at errno for short read_in_full() returns
      distinguish error versus short read from read_in_full()
      worktree: use xsize_t to access file size
      worktree: check the result of read_in_full()
      validate_headref: NUL-terminate HEAD buffer
      validate_headref: use skip_prefix for symref parsing
      validate_headref: use get_oid_hex for detached HEADs
      git: add --no-optional-locks option
      test-terminal: set TERM=vt100
      t4015: prefer --color to -c color.diff=always
      t3701: use test-terminal to collect color output
      t7508: use test_terminal for color output
      t7502: use diff.noprefix for --verbose test
      t6006: drop "always" color config tests
      t3203: drop "always" color test
      t3205: use --color instead of color.branch=always
      provide --color option for all ref-filter users
      color: make "always" the same as "auto" in config
      t4015: use --color with --color-moved
      t7301: use test_terminal to check color
      path.c: fix uninitialized memory access
      sha1_loose_object_info: handle errors from unpack_sha1_rest
      t3308: create a real ref directory/file conflict
      refs_resolve_ref_unsafe: handle d/f conflicts for writes
      write_entry: fix leak when retrying delayed filter
      write_entry: avoid reading blobs in CE_RETRY case
      write_entry: untangle symlink and regular-file cases

Job Snijders (1):
      gitweb: add 'raw' blob_plain link in history overview

Joel Teichroeb (3):
      stash: add a test for stash create with no files
      stash: add a test for when apply fails during stash branch
      stash: add a test for stashing in a detached state

Johannes Schindelin (14):
      run_processes_parallel: change confusing task_cb convention
      git-gui (MinGW): make use of MSys2's msgfmt
      t3415: verify that an empty instructionFormat is handled as before
      rebase -i: generate the script via rebase--helper
      rebase -i: remove useless indentation
      rebase -i: do not invent onelines when expanding/collapsing SHA-1s
      rebase -i: also expand/collapse the SHA-1s via the rebase--helper
      t3404: relax rebase.missingCommitsCheck tests
      rebase -i: check for missing commits in the rebase--helper
      rebase -i: skip unnecessary picks using the rebase--helper
      t3415: test fixup with wrapped oneline
      rebase -i: rearrange fixup/squash lines using the rebase--helper
      Win32: simplify loading of DLL functions
      clang-format: adjust line break penalties

Johannes Sixt (1):
      sub-process: use child_process.args instead of child_process.argv

Jonathan Nieder (8):
      vcs-svn: remove more unused prototypes and declarations
      vcs-svn: remove custom mode constants
      vcs-svn: remove repo_delete wrapper function
      vcs-svn: move remaining repo_tree functions to fast_export.h
      pack: make packed_git_mru global a value instead of a pointer
      pathspec doc: parse_pathspec does not maintain references to args
      technical doc: add a design doc for hash function transition
      strbuf doc: reuse after strbuf_release is fine

Jonathan Tan (40):
      fsck: remove redundant parse_tree() invocation
      object: remove "used" field from struct object
      fsck: cleanup unused variable
      Documentation: migrate sub-process docs to header
      sub-process: refactor handshake to common function
      tests: ensure fsck fails on corrupt packfiles
      sha1_file: set whence in storage-specific info fn
      sha1_file: remove read_packed_sha1()
      diff: avoid redundantly clearing a flag
      diff: respect MIN_BLOCK_LENGTH for last block
      diff: define block by number of alphanumeric chars
      Doc: clarify that pack-objects makes packs, plural
      pack: move pack name-related functions
      pack: move static state variables
      pack: move pack_report()
      pack: move open_pack_index(), parse_pack_index()
      pack: move release_pack_memory()
      pack: move pack-closing functions
      pack: move use_pack()
      pack: move unuse_pack()
      pack: move add_packed_git()
      pack: move install_packed_git()
      pack: move {,re}prepare_packed_git and approximate_object_count
      pack: move unpack_object_header_buffer()
      pack: move get_size_from_delta()
      pack: move unpack_object_header()
      pack: move clear_delta_base_cache(), packed_object_info(), unpack_entry()
      pack: move nth_packed_object_{sha1,oid}
      pack: move check_pack_index_ptr(), nth_packed_object_offset()
      pack: move find_pack_entry_one(), is_pack_valid()
      pack: move find_sha1_pack()
      pack: move find_pack_entry() and make it global
      pack: move has_sha1_pack()
      pack: move has_pack_index()
      pack: move for_each_packed_object()
      Remove inadvertently added outgoing/packfile.h
      Add t/helper/test-write-cache to .gitignore
      git-compat-util: make UNLEAK less error-prone
      fast-export: do not copy from modified file
      oidmap: map with OID as key

Junio C Hamano (51):
      t1408: add a test of stale packed refs covered by loose refs
      clean.c: use designated initializer
      http.c: http.sslcert and http.sslkey are both pathnames
      connect: reject ssh hostname that begins with a dash
      Git 2.7.6
      Git 2.8.6
      Git 2.9.5
      Git 2.10.4
      Git 2.11.3
      Git 2.12.4
      Git 2.13.5
      Git 2.14.1
      Start post 2.14 cycle
      perl/Git.pm: typofix in a comment
      The first batch of topics after the 2.14 cycle
      diff: retire sane_truncate_fn
      progress: simplify "delayed" progress API
      The second batch post 2.14
      t4200: give us a clean slate after "rerere gc" tests
      t4200: make "rerere gc" test more robust
      t4200: gather "rerere gc" together
      t4200: parameterize "rerere gc" custom expiry test
      rerere: represent time duration in timestamp_t internally
      rerere: allow approxidate in gc.rerereResolved/gc.rerereUnresolved
      The third batch post 2.14
      Prepare for 2.14.2
      The fourth batch post 2.14
      The fifth batch post 2.14
      The sixth batch post 2.14
      RelNotes: further fixes for 2.14.2 from the master front
      The seventh batch post 2.14
      travis: dedent a few scripts that are indented overly deeply
      subprocess: loudly die when subprocess asks for an unsupported capability
      cvsserver: move safe_pipe_capture() to the main package
      cvsserver: use safe_pipe_capture for `constant commands` as well
      gc: call fscanf() with %<len>s, not %<len>c, when reading hostname
      The eighth batch for 2.15
      Git 2.10.5
      Git 2.11.4
      Git 2.12.5
      Git 2.13.6
      Git 2.14.2
      branch: fix "copy" to never touch HEAD
      merge-strategies: avoid implying that "-s theirs" exists
      The ninth batch for 2.15
      The tenth batch for 2.15
      The eleventh batch for 2.15
      The twelfth batch for 2.15
      Git 2.15-rc0
      Prepare for -rc1
      Git 2.15-rc1

Kaartic Sivaraam (15):
      hook: cleanup script
      hook: name the positional variables
      hook: add sign-off using "interpret-trailers"
      hook: add a simple first example
      commit: check for empty message before the check for untouched template
      hook: use correct logical variable
      t3200: cleanup cruft of a test
      builtin/branch: stop supporting the "--set-upstream" option
      branch: quote branch/ref names to improve readability
      help: change a message to be more precise
      commit-template: change a message to be more intuitive
      t/README: fix typo and grammatically improve a sentence
      doc: camelCase the config variables to improve readability
      branch: change the error messages to be more meaningful
      setup: update error message to be more meaningful

Kevin Daudt (3):
      stash: prevent warning about null bytes in input
      doc/for-each-ref: consistently use '=' to between argument names and values
      doc/for-each-ref: explicitly specify option names

Kevin Willford (9):
      format-patch: have progress option while generating patches
      rebase: turn on progress option by default for format-patch
      commit: skip discarding the index if there is no pre-commit hook
      perf: add test for writing the index
      read-cache: fix memory leak in do_write_index
      read-cache: avoid allocating every ondisk entry when writing
      merge-recursive: fix memory leak
      merge-recursive: remove return value from get_files_dirs
      merge-recursive: change current file dir string_lists to hashmap

Lars Schneider (13):
      t0021: keep filter log files on comparison
      t0021: make debug log file name configurable
      t0021: write "OUT <size>" only on success
      convert: put the flags field before the flag itself for consistent style
      convert: move multiple file filter error handling to separate function
      convert: refactor capabilities negotiation
      convert: add "status=delayed" to filter process protocol
      convert: display progress for filtered objects that have been delayed
      travis-ci: move Travis CI code into dedicated scripts
      travis-ci: skip a branch build if equal tag is present
      travis-ci: fix "skip_branch_tip_with_tag()" string comparison
      entry.c: update cache entry only for existing files
      entry.c: check if file exists after checkout

Manav Rathi (1):
      docs: improve discoverability of exclude pathspec

Martin Koegler (2):
      diff-delta: fix encoding size that would not fit in "unsigned int"
      diff-delta: do not allow delta offset truncation

Martin Ågren (33):
      builtin.h: take over documentation from api-builtin.txt
      git.c: let builtins opt for handling `pager.foo` themselves
      git.c: provide setup_auto_pager()
      t7006: add tests for how git tag paginates
      tag: respect `pager.tag` in list-mode only
      tag: change default of `pager.tag` to "on"
      git.c: ignore pager.* when launching builtin as dashed external
      Doc/git-{push,send-pack}: correct --sign= to --signed=
      t5334: document that git push --signed=1 does not work
      config: introduce git_parse_maybe_bool_text
      config: make git_{config,parse}_maybe_bool equivalent
      treewide: deprecate git_config_maybe_bool, use git_parse_maybe_bool
      parse_decoration_style: drop unused argument `var`
      doc/interpret-trailers: fix "the this" typo
      convert: always initialize attr_action in convert_attrs
      pack-objects: take lock before accessing `remaining`
      strbuf_setlen: don't write to strbuf_slopbuf
      ThreadSanitizer: add suppressions
      Documentation/user-manual: update outdated example output
      treewide: correct several "up-to-date" to "up to date"
      pkt-line: re-'static'-ify buffer in packet_write_fmt_1()
      config: remove git_config_maybe_bool
      refs/files-backend: add longer-scoped copy of string to list
      refs/files-backend: fix memory leak in lock_ref_for_update
      refs/files-backend: correct return value in lock_ref_for_update
      refs/files-backend: add `refname`, not "HEAD", to list
      builtin/commit: fix memory leak in `prepare_index()`
      commit: fix memory leak in `reduce_heads()`
      leak_pending: use `object_array_clear()`, not `free()`
      object_array: use `object_array_clear()`, not `free()`
      object_array: add and use `object_array_pop()`
      pack-bitmap[-write]: use `object_array_clear()`, don't leak
      builtin/: add UNLEAKs

Matthieu Moy (2):
      send-email: fix garbage removal after address
      send-email: don't use Mail::Address, even if available

Max Kirillov (2):
      describe: fix matching to actually match all patterns
      describe: teach --match to handle branches and remotes

Michael Forney (1):
      scripts: use "git foo" not "git-foo"

Michael Haggerty (77):
      add_packed_ref(): teach function to overwrite existing refs
      packed_ref_store: new struct
      packed_ref_store: move `packed_refs_path` here
      packed_ref_store: move `packed_refs_lock` member here
      clear_packed_ref_cache(): take a `packed_ref_store *` parameter
      validate_packed_ref_cache(): take a `packed_ref_store *` parameter
      get_packed_ref_cache(): take a `packed_ref_store *` parameter
      get_packed_refs(): take a `packed_ref_store *` parameter
      add_packed_ref(): take a `packed_ref_store *` parameter
      lock_packed_refs(): take a `packed_ref_store *` parameter
      commit_packed_refs(): take a `packed_ref_store *` parameter
      rollback_packed_refs(): take a `packed_ref_store *` parameter
      get_packed_ref(): take a `packed_ref_store *` parameter
      repack_without_refs(): take a `packed_ref_store *` parameter
      packed_peel_ref(): new function, extracted from `files_peel_ref()`
      packed_ref_store: support iteration
      packed_read_raw_ref(): new function, replacing `resolve_packed_ref()`
      packed-backend: new module for handling packed references
      packed_ref_store: make class into a subclass of `ref_store`
      commit_packed_refs(): report errors rather than dying
      commit_packed_refs(): use a staging file separate from the lockfile
      packed_refs_lock(): function renamed from lock_packed_refs()
      packed_refs_lock(): report errors via a `struct strbuf *err`
      packed_refs_unlock(), packed_refs_is_locked(): new functions
      clear_packed_ref_cache(): don't protest if the lock is held
      commit_packed_refs(): remove call to `packed_refs_unlock()`
      repack_without_refs(): don't lock or unlock the packed refs
      t3210: add some tests of bogus packed-refs file contents
      read_packed_refs(): die if `packed-refs` contains bogus data
      packed_ref_store: handle a packed-refs file that is a symlink
      files-backend: cheapen refname_available check when locking refs
      refs: retry acquiring reference locks for 100ms
      notes: make GET_NIBBLE macro more robust
      load_subtree(): remove unnecessary conditional
      load_subtree(): reduce the scope of some local variables
      load_subtree(): fix incorrect comment
      load_subtree(): separate logic for internal vs. terminal entries
      load_subtree(): check earlier whether an internal node is a tree entry
      load_subtree(): only consider blobs to be potential notes
      get_oid_hex_segment(): return 0 on success
      load_subtree(): combine some common code
      get_oid_hex_segment(): don't pad the rest of `oid`
      hex_to_bytes(): simpler replacement for `get_oid_hex_segment()`
      load_subtree(): declare some variables to be `size_t`
      load_subtree(): check that `prefix_len` is in the expected range
      packed-backend: don't adjust the reference count on lock/unlock
      struct ref_transaction: add a place for backends to store data
      packed_ref_store: implement reference transactions
      packed_delete_refs(): implement method
      files_pack_refs(): use a reference transaction to write packed refs
      prune_refs(): also free the linked list
      files_initial_transaction_commit(): use a transaction for packed refs
      t1404: demonstrate two problems with reference transactions
      files_ref_store: use a transaction to update packed refs
      packed-backend: rip out some now-unused code
      files_transaction_finish(): delete reflogs before references
      ref_iterator: keep track of whether the iterator output is ordered
      packed_ref_cache: add a backlink to the associated `packed_ref_store`
      die_unterminated_line(), die_invalid_line(): new functions
      read_packed_refs(): use mmap to read the `packed-refs` file
      read_packed_refs(): only check for a header at the top of the file
      read_packed_refs(): make parsing of the header line more robust
      for_each_string_list_item: avoid undefined behavior for empty list
      read_packed_refs(): read references with minimal copying
      packed_ref_cache: remember the file-wide peeling state
      mmapped_ref_iterator: add iterator over a packed-refs file
      mmapped_ref_iterator_advance(): no peeled value for broken refs
      packed-backend.c: reorder some definitions
      packed_ref_cache: keep the `packed-refs` file mmapped if possible
      read_packed_refs(): ensure that references are ordered when read
      packed_ref_iterator_begin(): iterate using `mmapped_ref_iterator`
      packed_read_raw_ref(): read the reference from the mmapped buffer
      ref_store: implement `refs_peel_ref()` generically
      packed_ref_store: get rid of the `ref_cache` entirely
      ref_cache: remove support for storing peeled values
      mmapped_ref_iterator: inline into `packed_ref_iterator`
      packed-backend.c: rename a bunch of things and update comments

Michael J Gruber (11):
      Documentation: use proper wording for ref format strings
      Documentation/git-for-each-ref: clarify peeling of tags for --format
      Documentation/git-merge: explain --continue
      merge: clarify call chain
      merge: split write_merge_state in two
      merge: save merge state earlier
      name-rev: change ULONG_MAX to TIME_MAX
      t7004: move limited stack prereq to test-lib
      t6120: test name-rev --all and --stdin
      t6120: clean up state after breaking repo
      t6120: test describe and name-rev with deep repos

Nguyễn Thái Ngọc Duy (17):
      branch: fix branch renaming not updating HEADs correctly
      revision.h: new flag in struct rev_info wrt. worktree-related refs
      refs.c: use is_dir_sep() in resolve_gitlink_ref()
      revision.c: refactor add_index_objects_to_pending()
      revision.c: --indexed-objects add objects from all worktrees
      refs.c: refactor get_submodule_ref_store(), share common free block
      refs: move submodule slash stripping code to get_submodule_ref_store
      refs: add refs_head_ref()
      revision.c: use refs_for_each*() instead of for_each_*_submodule()
      refs.c: move for_each_remote_ref_submodule() to submodule.c
      refs: remove dead for_each_*_submodule()
      revision.c: --all adds HEAD from all worktrees
      files-backend: make reflog iterator go through per-worktree reflog
      revision.c: --reflog add HEAD reflog from all worktrees
      rev-list: expose and document --single-worktree
      refs.c: remove fallback-to-main-store code get_submodule_ref_store()
      refs.c: reindent get_submodule_ref_store()

Nicolas Morey-Chaisemartin (7):
      stash: clean untracked files before reset
      pull: fix cli and config option parsing order
      pull: honor submodule.recurse config option
      imap-send: return with error if curl failed
      imap-send: add wrapper to get server credentials if needed
      imap_send: setup_curl: retreive credentials if not set in config file
      imap-send: use curl by default when possible

Paolo Bonzini (4):
      trailers: export action enums and corresponding lookup functions
      trailers: introduce struct new_trailer_item
      interpret-trailers: add options for actions
      interpret-trailers: fix documentation typo

Patryk Obara (10):
      sha1_file: fix definition of null_sha1
      commit: replace the raw buffer with strbuf in read_graft_line
      commit: allocate array using object_id size
      commit: rewrite read_graft_line
      builtin/hash-object: convert to struct object_id
      read-cache: convert to struct object_id
      sha1_file: convert index_path to struct object_id
      sha1_file: convert index_fd to struct object_id
      sha1_file: convert hash_sha1_file_literally to struct object_id
      sha1_file: convert index_stream to struct object_id

Philip Oakley (4):
      git-gui: remove duplicate entries from .gitconfig's gui.recentrepo
      git gui: cope with duplicates in _get_recentrepo
      git gui: de-dup selected repo from recentrepo history
      git gui: allow for a long recentrepo list

Phillip Wood (7):
      am: remember --rerere-autoupdate setting
      rebase: honor --rerere-autoupdate
      rebase -i: honor --rerere-autoupdate
      t3504: use test_commit
      cherry-pick/revert: remember --rerere-autoupdate
      cherry-pick/revert: reject --rerere-autoupdate when continuing
      am: fix signoff when other trailers are present

Raman Gupta (1):
      contrib/rerere-train: optionally overwrite existing resolutions

Ramsay Jones (9):
      credential-cache: interpret an ECONNRESET as an EOF
      builtin/add: add detail to a 'cannot chmod' error message
      test-lib: don't use ulimit in test prerequisites on cygwin
      test-lib: use more compact expression in PIPE prerequisite
      t9010-*.sh: skip all tests if the PIPE prereq is missing
      git-compat-util.h: xsize_t() - avoid -Wsign-compare warnings
      commit-slab.h: avoid -Wsign-compare warnings
      cache.h: hex2chr() - avoid -Wsign-compare warnings
      ALLOC_GROW: avoid -Wsign-compare warnings

Randall S. Becker (1):
      poll.c: always set revents, even if to zero

René Scharfe (81):
      tree-diff: don't access hash of NULL object_id pointer
      notes: don't access hash of NULL object_id pointer
      receive-pack: don't access hash of NULL object_id pointer
      bswap: convert to unsigned before shifting in get_be32
      bswap: convert get_be16, get_be32 and put_be32 to inline functions
      add MOVE_ARRAY
      use MOVE_ARRAY
      apply: use COPY_ARRAY and MOVE_ARRAY in update_image()
      ls-files: don't try to prune an empty index
      dir: support platforms that require aligned reads
      pack-objects: remove unnecessary NULL check
      t0001: skip test with restrictive permissions if getpwd(3) respects them
      test-path-utils: handle const parameter of basename and dirname
      t3700: fix broken test under !POSIXPERM
      t4062: use less than 256 repetitions in regex
      sha1_file: avoid comparison if no packed hash matches the first byte
      apply: remove prefix_length member from apply_state
      merge: use skip_prefix()
      win32: plug memory leak on realloc() failure in syslog()
      strbuf: clear errno before calling getdelim(3)
      fsck: free buffers on error in fsck_obj()
      sha1_file: release delta_stack on error in unpack_entry()
      tree-walk: convert fill_tree_descriptor() to object_id
      t1002: stop using sum(1)
      t5001: add tests for export-ignore attributes and exclude pathspecs
      archive: factor out helper functions for handling attributes
      archive: don't queue excluded directories
      commit: remove unused inline function single_parent()
      apply: check date of potential epoch timestamps first
      apply: remove epoch date from regex
      am: release strbufs after use in detect_patch_format()
      am: release strbuf on error return in hg_patch_to_mail()
      am: release strbuf after use in safe_to_abort()
      check-ref-format: release strbuf after use in check_ref_format_branch()
      clean: release strbuf after use in remove_dirs()
      clone: release strbuf after use in remove_junk()
      commit: release strbuf on error return in commit_tree_extended()
      connect: release strbuf on error return in git_connect()
      convert: release strbuf on error return in filter_buffer_or_fd()
      diff: release strbuf after use in diff_summary()
      diff: release strbuf after use in show_rename_copy()
      diff: release strbuf after use in show_stats()
      help: release strbuf on error return in exec_man_konqueror()
      help: release strbuf on error return in exec_man_man()
      help: release strbuf on error return in exec_woman_emacs()
      mailinfo: release strbuf after use in handle_from()
      mailinfo: release strbuf on error return in handle_boundary()
      merge: release strbuf after use in save_state()
      merge: release strbuf after use in write_merge_heads()
      notes: release strbuf after use in notes_copy_from_stdin()
      refs: release strbuf on error return in write_pseudoref()
      remote: release strbuf after use in read_remote_branches()
      remote: release strbuf after use in migrate_file()
      remote: release strbuf after use in set_url()
      send-pack: release strbuf on error return in send_pack()
      sha1_file: release strbuf on error return in index_path()
      shortlog: release strbuf after use in insert_one_record()
      sequencer: release strbuf after use in save_head()
      transport-helper: release strbuf after use in process_connect_service()
      userdiff: release strbuf after use in userdiff_get_textconv()
      utf8: release strbuf on error return in strbuf_utf8_replace()
      vcs-svn: release strbuf after use in end_revision()
      wt-status: release strbuf after use in read_rebase_todolist()
      wt-status: release strbuf after use in wt_longstatus_print_tracking()
      archive: don't add empty directories to archives
      refs: make sha1 output parameter of refs_resolve_ref_unsafe() optional
      refs: pass NULL to refs_resolve_ref_unsafe() if hash is not needed
      refs: pass NULL to resolve_ref_unsafe() if hash is not needed
      mailinfo: don't decode invalid =XY quoted-printable sequences
      refs: pass NULL to refs_resolve_refdup() if hash is not needed
      refs: pass NULL to resolve_refdup() if hash is not needed
      coccinelle: remove parentheses that become unnecessary
      path: use strbuf_add_real_path()
      use strbuf_addstr() for adding strings to strbufs
      graph: use strbuf_addchars() to add spaces
      tag: avoid NULL pointer arithmetic
      repository: use FREE_AND_NULL
      run-command: use ALLOC_ARRAY
      test-stringlist: avoid buffer underrun when sorting nothing
      fsck: handle NULL return of lookup_blob() and lookup_tree()
      .mailmap: normalize name for René Scharfe

Ross Kabus (1):
      commit-tree: do not complete line in -F input

Sahil Dua (2):
      config: create a function to format section headers
      branch: add a --copy (-c) option to go with --move (-m)

Santiago Torres (1):
      t: lib-gpg: flush gpg agent on startup

Stefan Beller (51):
      diff.c: readability fix
      diff.c: move line ending check into emit_hunk_header
      diff.c: factor out diff_flush_patch_all_file_pairs
      diff.c: introduce emit_diff_symbol
      diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_MARKER
      diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_FRAGINFO
      diff.c: emit_diff_symbol learns DIFF_SYMBOL_NO_LF_EOF
      diff.c: migrate emit_line_checked to use emit_diff_symbol
      diff.c: emit_diff_symbol learns DIFF_SYMBOL_WORDS[_PORCELAIN]
      diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_INCOMPLETE
      diff.c: emit_diff_symbol learns DIFF_SYMBOL_FILEPAIR_{PLUS, MINUS}
      diff.c: emit_diff_symbol learns DIFF_SYMBOL_HEADER
      diff.c: emit_diff_symbol learns about DIFF_SYMBOL_BINARY_FILES
      diff.c: emit_diff_symbol learns DIFF_SYMBOL_REWRITE_DIFF
      submodule.c: migrate diff output to use emit_diff_symbol
      diff.c: convert emit_binary_diff_body to use emit_diff_symbol
      diff.c: convert show_stats to use emit_diff_symbol
      diff.c: convert word diffing to use emit_diff_symbol
      diff.c: emit_diff_symbol learns about DIFF_SYMBOL_STAT_SEP
      diff.c: emit_diff_symbol learns about DIFF_SYMBOL_SUMMARY
      diff.c: buffer all output if asked to
      diff.c: color moved lines differently
      diff.c: color moved lines differently, plain mode
      diff.c: add dimming to moved line detection
      diff: document the new --color-moved setting
      attr.c: drop hashmap_cmp_fn cast
      builtin/difftool.c: drop hashmap_cmp_fn cast
      builtin/describe: drop hashmap_cmp_fn cast
      config.c: drop hashmap_cmp_fn cast
      convert/sub-process: drop cast to hashmap_cmp_fn
      patch-ids.c: drop hashmap_cmp_fn cast
      remote.c: drop hashmap_cmp_fn cast
      submodule-config.c: drop hashmap_cmp_fn cast
      name-hash.c: drop hashmap_cmp_fn cast
      t/helper/test-hashmap: use custom data instead of duplicate cmp functions
      commit: convert lookup_commit_graft to struct object_id
      tag: convert gpg_verify_tag to use struct object_id
      t8008: rely on rev-parse'd HEAD instead of sha1 value
      t1200: remove t1200-tutorial.sh
      sha1_file: make read_info_alternates static
      submodule.sh: remove unused variable
      builtin/merge: honor commit-msg hook for merges
      push, fetch: error out for submodule entries not pointing to commits
      replace-objects: evaluate replacement refs without using the object store
      Documentation/githooks: mention merge in commit-msg hook
      Documentation/config: clarify the meaning of submodule.<name>.update
      t7406: submodule.<name>.update command must not be run from .gitmodules
      diff: correct newline in summary for renamed files
      submodule: correct error message for missing commits
      branch: reset instead of release a strbuf
      tests: fix diff order arguments in test_cmp

Stephan Beyer (1):
      clang-format: add a comment about the meaning/status of the

Takashi Iwai (2):
      sha1dc: build git plumbing code more explicitly
      sha1dc: allow building with the external sha1dc library

Taylor Blau (8):
      pretty.c: delimit "%(trailers)" arguments with ","
      t4205: unfold across multiple lines
      doc: 'trailers' is the preferred way to format trailers
      doc: use "`<literal>`"-style quoting for literal strings
      t6300: refactor %(trailers) tests
      ref-filter.c: use trailer_opts to format trailers
      ref-filter.c: parse trailers arguments with %(contents) atom
      ref-filter.c: pass empty-string as NULL to atom parsers

Thomas Braun (1):
      completion: add --broken and --dirty to describe

Thomas Gummerer (3):
      read-cache: fix index corruption with index v4
      refs: strip out not allowed flags from ref_transaction_update
      http-push: fix construction of hex value from path

Todd Zullinger (1):
      api-argv-array.txt: remove broken link to string-list API

Tom G. Christensen (2):
      http: fix handling of missing CURLPROTO_*
      http: use a feature check to enable GSSAPI delegation control

Torsten Bögershausen (3):
      convert: add SAFE_CRLF_KEEP_CRLF
      apply: file commited with CRLF should roundtrip diff and apply
      test-lint: echo -e (or -E) is not portable

Urs Thuermann (1):
      git svn fetch: Create correct commit timestamp when using --localtime

William Duclot (1):
      rebase: make resolve message clearer for inexperienced users

brian m. carlson (14):
      builtin/fsck: convert remaining caller of get_sha1 to object_id
      builtin/merge-tree: convert remaining caller of get_sha1 to object_id
      submodule: convert submodule config lookup to use object_id
      remote: convert struct push_cas to struct object_id
      sequencer: convert to struct object_id
      builtin/update_ref: convert to struct object_id
      bisect: convert bisect_checkout to struct object_id
      builtin/unpack-file: convert to struct object_id
      Convert remaining callers of get_sha1 to get_oid.
      sha1_name: convert get_sha1* to get_oid*
      sha1_name: convert GET_SHA1* flags to GET_OID*
      sha1_name: convert uses of 40 to GIT_SHA1_HEXSZ
      vcs-svn: remove unused prototypes
      vcs-svn: rename repo functions to "svn_repo"

joernchen (1):
      cvsserver: use safe_pipe_capture instead of backticks

Ævar Arnfjörð Bjarmason (2):
      branch: add test for -m renaming multiple config sections
      tests: don't give unportable ">" to "test" built-in, use -gt

Øystein Walle (1):
      rev-parse: rev-parse: add --is-shallow-repository

Łukasz Gryglicki (1):
      merge: add a --signoff flag


^ permalink raw reply	[relevance 11%]

* Re: Enhancement request: git-push: Allow (configurable) default push-option
  2017-10-04 15:20   ` Marius Paliga
@ 2017-10-11  7:14     ` Marius Paliga
  2017-10-11  9:18       ` Marius Paliga
  0 siblings, 1 reply; 200+ results
From: Marius Paliga @ 2017-10-11  7:14 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

Including proposed patch...


Signed-off-by: Marius Paliga <marius.paliga@gmail.com>
---
 Documentation/git-push.txt |  3 +++
 builtin/push.c             | 11 ++++++++++-
 t/t5545-push-options.sh    | 48 ++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 61 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt
index 3e76e99f3..133c42183 100644
--- a/Documentation/git-push.txt
+++ b/Documentation/git-push.txt
@@ -161,6 +161,9 @@ already exists on the remote side.
     Transmit the given string to the server, which passes them to
     the pre-receive as well as the post-receive hook. The given string
     must not contain a NUL or LF character.
+    Default push options can also be specified with configuration
+    variable `push.optiondefault`. String(s) specified here will always
+    be passed to the server without need to specify it using `--push-option`

 --receive-pack=<git-receive-pack>::
 --exec=<git-receive-pack>::
diff --git a/builtin/push.c b/builtin/push.c
index 2ac810422..4dd5d6f0e 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -32,6 +32,8 @@ static const char **refspec;
 static int refspec_nr;
 static int refspec_alloc;

+static struct string_list push_options = STRING_LIST_INIT_DUP;
+
 static void add_refspec(const char *ref)
 {
     refspec_nr++;
@@ -467,6 +469,8 @@ static int git_push_config(const char *k, const
char *v, void *cb)
 {
     int *flags = cb;
     int status;
+    const struct string_list *default_push_options;
+    struct string_list_item *item;

     status = git_gpg_config(k, v, NULL);
     if (status)
@@ -505,6 +509,12 @@ static int git_push_config(const char *k, const
char *v, void *cb)
         recurse_submodules = val;
     }

+    default_push_options = git_config_get_value_multi("push.optiondefault");
+    if (default_push_options)
+        for_each_string_list_item(item, default_push_options)
+            if (!string_list_has_string(&push_options, item->string))
+                string_list_append(&push_options, item->string);
+
     return git_default_config(k, v, NULL);
 }

@@ -515,7 +525,6 @@ int cmd_push(int argc, const char **argv, const
char *prefix)
     int push_cert = -1;
     int rc;
     const char *repo = NULL;    /* default repository */
-    struct string_list push_options = STRING_LIST_INIT_DUP;
     const struct string_list_item *item;

     struct option options[] = {
diff --git a/t/t5545-push-options.sh b/t/t5545-push-options.sh
index 90a4b0d2f..575f3dc38 100755
--- a/t/t5545-push-options.sh
+++ b/t/t5545-push-options.sh
@@ -140,6 +140,54 @@ test_expect_success 'push options and submodules' '
     test_cmp expect parent_upstream/.git/hooks/post-receive.push_options
 '

+test_expect_success 'default push option' '
+    mk_repo_pair &&
+    git -C upstream config receive.advertisePushOptions true &&
+    (
+        cd workbench &&
+        test_commit one &&
+        git push --mirror up &&
+        test_commit two &&
+        git -c push.optiondefault=default push up master
+    ) &&
+    test_refs master master &&
+    echo "default" >expect &&
+    test_cmp expect upstream/.git/hooks/pre-receive.push_options &&
+    test_cmp expect upstream/.git/hooks/post-receive.push_options
+'
+
+test_expect_success 'two default push options' '
+    mk_repo_pair &&
+    git -C upstream config receive.advertisePushOptions true &&
+    (
+        cd workbench &&
+        test_commit one &&
+        git push --mirror up &&
+        test_commit two &&
+        git -c push.optiondefault=default1 -c
push.optiondefault=default2 push up master
+    ) &&
+    test_refs master master &&
+    printf "default1\ndefault2\n" >expect &&
+    test_cmp expect upstream/.git/hooks/pre-receive.push_options &&
+    test_cmp expect upstream/.git/hooks/post-receive.push_options
+'
+
+test_expect_success 'default and manual push options' '
+    mk_repo_pair &&
+    git -C upstream config receive.advertisePushOptions true &&
+    (
+        cd workbench &&
+        test_commit one &&
+        git push --mirror up &&
+        test_commit two &&
+        git -c push.optiondefault=default push --push-option=manual up master
+    ) &&
+    test_refs master master &&
+    printf "default\nmanual\n" >expect &&
+    test_cmp expect upstream/.git/hooks/pre-receive.push_options &&
+    test_cmp expect upstream/.git/hooks/post-receive.push_options
+'
+
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd

-- 
2.14.1


2017-10-04 17:20 GMT+02:00 Marius Paliga <marius.paliga@gmail.com>:
> Hi Stefan,
>
> I will look at it.
>
> Thanks,
> Marius
>
>
> 2017-10-03 18:53 GMT+02:00 Stefan Beller <sbeller@google.com>:
>> On Tue, Oct 3, 2017 at 3:15 AM, Marius Paliga <marius.paliga@gmail.com> wrote:
>>> There is a need to pass predefined push-option during "git push"
>>> without need to specify it explicitly.
>>>
>>> In another words we need to have a new "git config" variable to
>>> specify string that will be automatically passed as "--push-option"
>>> when pushing to remote.
>>>
>>> Something like the following:
>>>
>>> git config push.optionDefault AllowMultipleCommits
>>>
>>> and then command
>>>   git push
>>> would silently run
>>>   git push --push-option "AllowMultipleCommits"
>>
>> We would need to
>> * design this feature (seems like you already have a good idea what you need)
>> * implement it (see builtin/push.c):
>>  - move "struct string_list push_options = STRING_LIST_INIT_DUP;"
>>   to be a file-static variable, such that we have access to it outside
>> of cmd_push.
>>  - In git_push_config in builtin/push.c that parses the config, we'd
>> need to check
>>   for "push.optionDefault" and add these to the push_options (I assume multiple
>>   are allowed)
>> * document it (Documentation/git-push.txt)
>> * add a test for it ? (t/t5545-push-options.sh)
>>
>> Care to write a patch? Otherwise I'd mark it up as part of
>> #leftoverbits for now,
>> as it seems like a good starter project.
>>
>> Thanks,
>> Stefan

^ permalink raw reply	[relevance 4%]

* Re: Enhancement request: git-push: Allow (configurable) default push-option
  2017-10-11  7:14     ` Marius Paliga
@ 2017-10-11  9:18       ` Marius Paliga
  0 siblings, 0 replies; 200+ results
From: Marius Paliga @ 2017-10-11  9:18 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

Found one possible issue when looking for duplicates, we need to use
  "unsorted_string_list_has_string" instead of "string_list_has_string"

-                       if (!string_list_has_string(&push_options,
item->string))
+                       if
(!unsorted_string_list_has_string(&push_options, item->string)) {

New (fixed) patch follows...


Signed-off-by: Marius Paliga <marius.paliga@gmail.com>
---
 Documentation/git-push.txt |  3 +++
 builtin/push.c             | 11 ++++++++++-
 t/t5545-push-options.sh    | 48 ++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 61 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt
index 3e76e99f3..133c42183 100644
--- a/Documentation/git-push.txt
+++ b/Documentation/git-push.txt
@@ -161,6 +161,9 @@ already exists on the remote side.
     Transmit the given string to the server, which passes them to
     the pre-receive as well as the post-receive hook. The given string
     must not contain a NUL or LF character.
+    Default push options can also be specified with configuration
+    variable `push.optiondefault`. String(s) specified here will always
+    be passed to the server without need to specify it using `--push-option`

 --receive-pack=<git-receive-pack>::
 --exec=<git-receive-pack>::
diff --git a/builtin/push.c b/builtin/push.c
index 2ac810422..ab458419a 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -32,6 +32,8 @@ static const char **refspec;
 static int refspec_nr;
 static int refspec_alloc;

+static struct string_list push_options = STRING_LIST_INIT_DUP;
+
 static void add_refspec(const char *ref)
 {
     refspec_nr++;
@@ -467,6 +469,8 @@ static int git_push_config(const char *k, const
char *v, void *cb)
 {
     int *flags = cb;
     int status;
+    const struct string_list *default_push_options;
+    struct string_list_item *item;

     status = git_gpg_config(k, v, NULL);
     if (status)
@@ -505,6 +509,12 @@ static int git_push_config(const char *k, const
char *v, void *cb)
         recurse_submodules = val;
     }

+    default_push_options = git_config_get_value_multi("push.optiondefault");
+    if (default_push_options)
+        for_each_string_list_item(item, default_push_options)
+            if (!unsorted_string_list_has_string(&push_options, item->string))
+                string_list_append(&push_options, item->string);
+
     return git_default_config(k, v, NULL);
 }

@@ -515,7 +525,6 @@ int cmd_push(int argc, const char **argv, const
char *prefix)
     int push_cert = -1;
     int rc;
     const char *repo = NULL;    /* default repository */
-    struct string_list push_options = STRING_LIST_INIT_DUP;
     const struct string_list_item *item;

     struct option options[] = {
diff --git a/t/t5545-push-options.sh b/t/t5545-push-options.sh
index 90a4b0d2f..575f3dc38 100755
--- a/t/t5545-push-options.sh
+++ b/t/t5545-push-options.sh
@@ -140,6 +140,54 @@ test_expect_success 'push options and submodules' '
     test_cmp expect parent_upstream/.git/hooks/post-receive.push_options
 '

+test_expect_success 'default push option' '
+    mk_repo_pair &&
+    git -C upstream config receive.advertisePushOptions true &&
+    (
+        cd workbench &&
+        test_commit one &&
+        git push --mirror up &&
+        test_commit two &&
+        git -c push.optiondefault=default push up master
+    ) &&
+    test_refs master master &&
+    echo "default" >expect &&
+    test_cmp expect upstream/.git/hooks/pre-receive.push_options &&
+    test_cmp expect upstream/.git/hooks/post-receive.push_options
+'
+
+test_expect_success 'two default push options' '
+    mk_repo_pair &&
+    git -C upstream config receive.advertisePushOptions true &&
+    (
+        cd workbench &&
+        test_commit one &&
+        git push --mirror up &&
+        test_commit two &&
+        git -c push.optiondefault=default1 -c
push.optiondefault=default2 push up master
+    ) &&
+    test_refs master master &&
+    printf "default1\ndefault2\n" >expect &&
+    test_cmp expect upstream/.git/hooks/pre-receive.push_options &&
+    test_cmp expect upstream/.git/hooks/post-receive.push_options
+'
+
+test_expect_success 'default and manual push options' '
+    mk_repo_pair &&
+    git -C upstream config receive.advertisePushOptions true &&
+    (
+        cd workbench &&
+        test_commit one &&
+        git push --mirror up &&
+        test_commit two &&
+        git -c push.optiondefault=default push --push-option=manual up master
+    ) &&
+    test_refs master master &&
+    printf "default\nmanual\n" >expect &&
+    test_cmp expect upstream/.git/hooks/pre-receive.push_options &&
+    test_cmp expect upstream/.git/hooks/post-receive.push_options
+'
+
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd

-- 
2.14.1

2017-10-11 9:14 GMT+02:00 Marius Paliga <marius.paliga@gmail.com>:
> Including proposed patch...
>
>
> Signed-off-by: Marius Paliga <marius.paliga@gmail.com>
> ---
>  Documentation/git-push.txt |  3 +++
>  builtin/push.c             | 11 ++++++++++-
>  t/t5545-push-options.sh    | 48 ++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 61 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt
> index 3e76e99f3..133c42183 100644
> --- a/Documentation/git-push.txt
> +++ b/Documentation/git-push.txt
> @@ -161,6 +161,9 @@ already exists on the remote side.
>      Transmit the given string to the server, which passes them to
>      the pre-receive as well as the post-receive hook. The given string
>      must not contain a NUL or LF character.
> +    Default push options can also be specified with configuration
> +    variable `push.optiondefault`. String(s) specified here will always
> +    be passed to the server without need to specify it using `--push-option`
>
>  --receive-pack=<git-receive-pack>::
>  --exec=<git-receive-pack>::
> diff --git a/builtin/push.c b/builtin/push.c
> index 2ac810422..4dd5d6f0e 100644
> --- a/builtin/push.c
> +++ b/builtin/push.c
> @@ -32,6 +32,8 @@ static const char **refspec;
>  static int refspec_nr;
>  static int refspec_alloc;
>
> +static struct string_list push_options = STRING_LIST_INIT_DUP;
> +
>  static void add_refspec(const char *ref)
>  {
>      refspec_nr++;
> @@ -467,6 +469,8 @@ static int git_push_config(const char *k, const
> char *v, void *cb)
>  {
>      int *flags = cb;
>      int status;
> +    const struct string_list *default_push_options;
> +    struct string_list_item *item;
>
>      status = git_gpg_config(k, v, NULL);
>      if (status)
> @@ -505,6 +509,12 @@ static int git_push_config(const char *k, const
> char *v, void *cb)
>          recurse_submodules = val;
>      }
>
> +    default_push_options = git_config_get_value_multi("push.optiondefault");
> +    if (default_push_options)
> +        for_each_string_list_item(item, default_push_options)
> +            if (!string_list_has_string(&push_options, item->string))
> +                string_list_append(&push_options, item->string);
> +
>      return git_default_config(k, v, NULL);
>  }
>
> @@ -515,7 +525,6 @@ int cmd_push(int argc, const char **argv, const
> char *prefix)
>      int push_cert = -1;
>      int rc;
>      const char *repo = NULL;    /* default repository */
> -    struct string_list push_options = STRING_LIST_INIT_DUP;
>      const struct string_list_item *item;
>
>      struct option options[] = {
> diff --git a/t/t5545-push-options.sh b/t/t5545-push-options.sh
> index 90a4b0d2f..575f3dc38 100755
> --- a/t/t5545-push-options.sh
> +++ b/t/t5545-push-options.sh
> @@ -140,6 +140,54 @@ test_expect_success 'push options and submodules' '
>      test_cmp expect parent_upstream/.git/hooks/post-receive.push_options
>  '
>
> +test_expect_success 'default push option' '
> +    mk_repo_pair &&
> +    git -C upstream config receive.advertisePushOptions true &&
> +    (
> +        cd workbench &&
> +        test_commit one &&
> +        git push --mirror up &&
> +        test_commit two &&
> +        git -c push.optiondefault=default push up master
> +    ) &&
> +    test_refs master master &&
> +    echo "default" >expect &&
> +    test_cmp expect upstream/.git/hooks/pre-receive.push_options &&
> +    test_cmp expect upstream/.git/hooks/post-receive.push_options
> +'
> +
> +test_expect_success 'two default push options' '
> +    mk_repo_pair &&
> +    git -C upstream config receive.advertisePushOptions true &&
> +    (
> +        cd workbench &&
> +        test_commit one &&
> +        git push --mirror up &&
> +        test_commit two &&
> +        git -c push.optiondefault=default1 -c
> push.optiondefault=default2 push up master
> +    ) &&
> +    test_refs master master &&
> +    printf "default1\ndefault2\n" >expect &&
> +    test_cmp expect upstream/.git/hooks/pre-receive.push_options &&
> +    test_cmp expect upstream/.git/hooks/post-receive.push_options
> +'
> +
> +test_expect_success 'default and manual push options' '
> +    mk_repo_pair &&
> +    git -C upstream config receive.advertisePushOptions true &&
> +    (
> +        cd workbench &&
> +        test_commit one &&
> +        git push --mirror up &&
> +        test_commit two &&
> +        git -c push.optiondefault=default push --push-option=manual up master
> +    ) &&
> +    test_refs master master &&
> +    printf "default\nmanual\n" >expect &&
> +    test_cmp expect upstream/.git/hooks/pre-receive.push_options &&
> +    test_cmp expect upstream/.git/hooks/post-receive.push_options
> +'
> +
>  . "$TEST_DIRECTORY"/lib-httpd.sh
>  start_httpd
>
> --
> 2.14.1
>
>
> 2017-10-04 17:20 GMT+02:00 Marius Paliga <marius.paliga@gmail.com>:
>> Hi Stefan,
>>
>> I will look at it.
>>
>> Thanks,
>> Marius
>>
>>
>> 2017-10-03 18:53 GMT+02:00 Stefan Beller <sbeller@google.com>:
>>> On Tue, Oct 3, 2017 at 3:15 AM, Marius Paliga <marius.paliga@gmail.com> wrote:
>>>> There is a need to pass predefined push-option during "git push"
>>>> without need to specify it explicitly.
>>>>
>>>> In another words we need to have a new "git config" variable to
>>>> specify string that will be automatically passed as "--push-option"
>>>> when pushing to remote.
>>>>
>>>> Something like the following:
>>>>
>>>> git config push.optionDefault AllowMultipleCommits
>>>>
>>>> and then command
>>>>   git push
>>>> would silently run
>>>>   git push --push-option "AllowMultipleCommits"
>>>
>>> We would need to
>>> * design this feature (seems like you already have a good idea what you need)
>>> * implement it (see builtin/push.c):
>>>  - move "struct string_list push_options = STRING_LIST_INIT_DUP;"
>>>   to be a file-static variable, such that we have access to it outside
>>> of cmd_push.
>>>  - In git_push_config in builtin/push.c that parses the config, we'd
>>> need to check
>>>   for "push.optionDefault" and add these to the push_options (I assume multiple
>>>   are allowed)
>>> * document it (Documentation/git-push.txt)
>>> * add a test for it ? (t/t5545-push-options.sh)
>>>
>>> Care to write a patch? Otherwise I'd mark it up as part of
>>> #leftoverbits for now,
>>> as it seems like a good starter project.
>>>
>>> Thanks,
>>> Stefan

^ permalink raw reply	[relevance 4%]

* Re: [RFC PATCH 2/4] change submodule push test to use proper repository setup
  2017-10-10 18:39       ` Stefan Beller
  2017-10-10 23:31         ` Junio C Hamano
@ 2017-10-11 14:52         ` Heiko Voigt
  2017-10-11 15:10         ` Josh Triplett
  2 siblings, 0 replies; 200+ results
From: Heiko Voigt @ 2017-10-11 14:52 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Josh Triplett, git, Jonathan Nieder, Jens Lehmann, Brandon Williams

On Tue, Oct 10, 2017 at 11:39:21AM -0700, Stefan Beller wrote:
> So you propose to make git-add behave like "git submodule add"
> (i.e. also add the .gitmodules entry for name/path/URL), which I
> like from a submodule perspective.

Well more like: clone and add will behave like "git submodule add" but
basically yes.

> However other users of gitlinks might be confused[1], which is why
> I refrained from "making every gitlink into a submodule". Specifically
> the more powerful a submodule operation is (the more fluff adds),
> the harder it should be for people to mis-use it.
> 
> [1] https://github.com/git-series/git-series/blob/master/INTERNALS.md
>      "git-series uses gitlinks to store pointer to commits in its own repo."

But would those users use

    git add

to add a gitlink? From the description in that file I read that it
points to commits in its own repository. Will there also be files
checked out like submodules at that location?

Otherwise I would propose that 'git add' could detect whether a gitlink
is a submodule by trying to read its git configuration. If we do not
find that we simply do not do anything.

> > If everyone agrees that submodules are the default way of handling
> > repositories insided repositories, IMO, 'git add' should also alter
> > .gitmodules by default. We could provide a switch to avoid doing that.
> 
> I wonder if that switch should be default-on (i.e. not treat a gitlink as
> a submodule initially, behavior as-is, and then eventually we will
> die() on unconfigured repos, expecting the user to make the decision)
> 
> > An intermediate solution would be to warn
> 
> That is already implemented by Peff.

Ah ok, thanks I suspected so when I realized that this discussion was
older.

> > but in the long run my goal
> > for submodules is and always was: Make them behave as close to files as
> > possible. And why should a 'git add submodule' not magically do
> > everything it can to make submodules just work? I can look into a patch
> > for that if people agree here...
> 
> I'd love to see this implemented. I cc'd Josh (the author of git-series), who
> may disagree with this, or has some good input how to go forward without
> breaking git-series.

Yeah, lets see if, as described above, that actually would break
git-series.

> > Regarding handling of gitlinks with or without .gitmodules:
> >
> > Currently we are actually in some intermediate state:
> >
> >  * If there is no .gitmodules file: No submodule processing on any
> >    gitlinks (AFAIK)
> 
> AFAIK this is true.
> 
> >  * If there is a .gitmodules files with some submodule configured: Do
> >    recursive fetch and push as far as possible on gitlinks.
> 
> * If submodule.recurse is set, then we also treat submodules like files
>   for checkout, reset, read-tree.

To clarify: If submodule.recurse is set but there is no .gitmodules file
we do submodule processing for the above commands?

> > So I am not sure whether there are actually many users (knowingly)
> > using a mix of some submodules configured and some not and then relying
> > on the submodule infrastructure.
> >
> > I would rather expect two sorts of users:
> >
> >   1. Those that do use .gitmodules
> 
> Those want to reap all benefits of good submodules.
> 
> >
> >   2. Those that do *not* use .gitmodules
> 
> As said above, we don't know if those users are
> "holding submodules wrong" or are using gitlinks for
> magic tricks (unrelated to submodules).

I did not want to say that they are "holding submodules wrong" but
rather that if they do not use .gitmodules they do that knowingly and
thus consistently not use .gitmodules for any gitlink.

> > Users that do not use any .gitmodules file will currently (AFAIK) not
> > get any submodule handling. So the question is are there really many
> > "mixed users"? My guess would be no.
> 
> I hope that there are few (if any) users of these mixed setups.

That sounds promising.

> > Because without those using this mixed we could switch to saying: "You
> > need to have a .gitmodules file for submodule handling" without much
> > fallout from breaking users use cases.
> 
> That seems reasonable to me, actually.

Nice.

> > Maybe we can test this out somehow? My patch series would be ready in
> > that case, just had to drop the first patch and adjust the commit
> > message of this one.
> 
> I wonder how we would test this, though? Do you have any idea
> (even vague) how we'd accomplish such a measurement?
> I fear we'll have to go this way blindly.

One idea would be to expose this somewhere to a limited amount of users.
I remember Jonathan was suggesting, back when Jens was working on the
recursive checkout, that he could add the series to the debian package
and see what happens. Or we could use Junios next branch? Something like
that. If we get complaints we know the assumption was wrong and we need
a fallback.

Cheers Heiko

^ permalink raw reply	[relevance 24%]

* Re: [RFC PATCH 2/4] change submodule push test to use proper repository setup
  2017-10-10 23:31         ` Junio C Hamano
  2017-10-10 23:41           ` Stefan Beller
@ 2017-10-11 14:56           ` Heiko Voigt
  1 sibling, 0 replies; 200+ results
From: Heiko Voigt @ 2017-10-11 14:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Stefan Beller, Josh Triplett, git, Jonathan Nieder, Jens Lehmann, Brandon Williams

On Wed, Oct 11, 2017 at 08:31:37AM +0900, Junio C Hamano wrote:
> Stefan Beller <sbeller@google.com> writes:
> 
> > So you propose to make git-add behave like "git submodule add"
> > (i.e. also add the .gitmodules entry for name/path/URL), which I
> > like from a submodule perspective.
> >
> > However other users of gitlinks might be confused[1], which is why
> > I refrained from "making every gitlink into a submodule". Specifically
> > the more powerful a submodule operation is (the more fluff adds),
> > the harder it should be for people to mis-use it.
> 
> A few questions that come to mind are:
> 
>  - Does "git add sub/" have enough information to populate
>    .gitmodules?  If we have reasonable "default" values for
>    .gitmodules entries (e.g. missing URL means we won't fetch when
>    asked to go recursively fetch), perhaps we can leave everything
>    other than "submodule.$name.path" undefined.

My suggestion would be: If we do not have them we do not populate them.
We could even go further and say: If we do not have the set "git
submodule add" would populate then we do not add anything to .gitmodules
and warn the user.

>  - Can't we help those who have gitlinks without .gitmodules entries
>    exactly the same way as above, i.e. when we see a gitlink and try
>    to treat it as a submodule, we'd first try to look it up from
>    .gitmodules (by going from path to name and then to
>    submodule.$name.$var); the above "'git add sub/' would add an
>    entry for .gitmodules" wish is based on the assumption that there
>    are reasonable "default" values for each of these $var--so by
>    basing on the same assumption, we can "pretend" as if these
>    submodule.$name.$var were in .gitmodules file when we see
>    gitlinks without .gitmodules entries.  IOW, if "git add sub/" can
>    add .gitmodules to help people without having to type "git
>    submodule add sub/", then we can give exactly the same degree of
>    help without even modifying .gitmodules when "git add sub/" is
>    run.

This "default" value thing got me thinking in a different direction. We
could use a scheme like that to get names (and values) for submodules
that are missing from the .gitmodules file. If we decide that we need to
handle them.

Cheers Heiko

^ permalink raw reply	[relevance 15%]

* Re: [RFC PATCH 2/4] change submodule push test to use proper repository setup
  2017-10-10 18:39       ` Stefan Beller
  2017-10-10 23:31         ` Junio C Hamano
  2017-10-11 14:52         ` Heiko Voigt
@ 2017-10-11 15:10         ` Josh Triplett
  2017-10-12 16:17           ` Brandon Williams
  2 siblings, 1 reply; 200+ results
From: Josh Triplett @ 2017-10-11 15:10 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Heiko Voigt, git, Jonathan Nieder, Jens Lehmann, Brandon Williams

On Tue, Oct 10, 2017 at 11:39:21AM -0700, Stefan Beller wrote:
> On Tue, Oct 10, 2017 at 6:03 AM, Heiko Voigt <hvoigt@hvoigt.net> wrote:
> > but in the long run my goal
> > for submodules is and always was: Make them behave as close to files as
> > possible. And why should a 'git add submodule' not magically do
> > everything it can to make submodules just work? I can look into a patch
> > for that if people agree here...
> 
> I'd love to see this implemented. I cc'd Josh (the author of git-series), who
> may disagree with this, or has some good input how to go forward without
> breaking git-series.

git-series doesn't use the git-submodule command at all, nor does it
construct series trees using git-add or any other git command-line tool;
it constructs gitlinks directly. Most of the time, it doesn't even make
sense to `git checkout` a series branch. Modifying commands like git-add
and similar to automatically manage .gitmodules won't cause any issue at
all, as long as git itself doesn't start rejecting or complaining about
repositories that have gitlinks without a .gitmodules file.

- Josh Triplett

^ permalink raw reply	[relevance 22%]

* Re: Branch switching with submodules where the submodule replaces a folder aborts unexpectedly
  2017-10-09 21:59 ` Stefan Beller
@ 2017-10-12 11:48   ` Thomas Braun
  0 siblings, 0 replies; 200+ results
From: Thomas Braun @ 2017-10-12 11:48 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

> Stefan Beller <sbeller@google.com> hat am 9. Oktober 2017 um 23:59 geschrieben:
> 
> 
> On Mon, Oct 9, 2017 at 2:29 PM, Thomas Braun
> <thomas.braun@virtuell-zuhause.de> wrote:
> > Hi,
> >
> > I'm currently in the progress of pulling some subprojects in a git repository of mine into their
> > own repositories and adding these subprojects back as submodules.
> >
> > While doing this I enountered a potential bug as checkout complains on branch switching that a 
> > file already exists.
> 
> (And I presume you know about --recurse-submodules as a flag for git-checkout)

No I did not know about it. I tend to not know options which don't complete in my shell (patch follows for that).

> This is consistent with our tests, unfortunately.
> 
> git/t$ ./t2013-checkout-submodule.sh
> ...
> not ok 15 - git checkout --recurse-submodules: replace submodule with
> a directory # TODO known breakage
> ...
> 
> > If I'm misusing git here I'm glad for any advice.
> 
> You are not.

Glad to know that.

> Apart from this bug report, would you think that such filtering of
> trees into submodules (and back again) might be an interesting feature
> of Git or are these cases rare and special?

For me not particularly. In my case it is a one time thing going from an embedded project folder to a submodule.

^ permalink raw reply	[relevance 16%]

* Re: [RFC PATCH 2/4] change submodule push test to use proper repository setup
  2017-10-11 15:10         ` Josh Triplett
@ 2017-10-12 16:17           ` Brandon Williams
  0 siblings, 0 replies; 200+ results
From: Brandon Williams @ 2017-10-12 16:17 UTC (permalink / raw)
  To: Josh Triplett; +Cc: Stefan Beller, Heiko Voigt, git, Jonathan Nieder, Jens Lehmann

On 10/11, Josh Triplett wrote:
> On Tue, Oct 10, 2017 at 11:39:21AM -0700, Stefan Beller wrote:
> > On Tue, Oct 10, 2017 at 6:03 AM, Heiko Voigt <hvoigt@hvoigt.net> wrote:
> > > but in the long run my goal
> > > for submodules is and always was: Make them behave as close to files as
> > > possible. And why should a 'git add submodule' not magically do
> > > everything it can to make submodules just work? I can look into a patch
> > > for that if people agree here...
> > 
> > I'd love to see this implemented. I cc'd Josh (the author of git-series), who
> > may disagree with this, or has some good input how to go forward without
> > breaking git-series.
> 
> git-series doesn't use the git-submodule command at all, nor does it
> construct series trees using git-add or any other git command-line tool;
> it constructs gitlinks directly. Most of the time, it doesn't even make
> sense to `git checkout` a series branch. Modifying commands like git-add
> and similar to automatically manage .gitmodules won't cause any issue at
> all, as long as git itself doesn't start rejecting or complaining about
> repositories that have gitlinks without a .gitmodules file.

That's good to know!  And from what I remember, with the commands we've
begun teaching to understand submodules we have been requiring a
.gitmodules entry for a submodule in order to do the recursion, and a
gitlink without a .gitmodules entry would simply be ignored or skipped.

-- 
Brandon Williams

^ permalink raw reply	[relevance 15%]

* [PATCH] diff.c: increment buffer pointer in all code path
  2017-10-12 20:05 Out of memory with diff.colormoved enabled Jeff King
@ 2017-10-12 23:33 ` Stefan Beller
  2017-10-13  0:18   ` Jeff King
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-10-12 23:33 UTC (permalink / raw)
  To: peff; +Cc: git, orgads, sbeller

The added test would hang up Git due to an infinite loop. The function
`next_byte()` doesn't make any forward progress in the buffer with
`--ignore-space-change`.

Fix this by only returning early when there was actual white space
to be covered, fall back to the default case at the end of the function
when there is no white space.

Reported-by: Orgad Shaneh <orgads@gmail.com>
Debugged-by: Jeff King <peff@peff.net>
Signed-off-by: Stefan Beller <sbeller@google.com>
---

Peff, feel free to take ownership here. I merely made it to a patch.

Thanks,
Stefan

 diff.c                     | 12 ++++++++----
 t/t4015-diff-whitespace.sh |  8 ++++++++
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/diff.c b/diff.c
index 69f03570ad..6fe84e6994 100644
--- a/diff.c
+++ b/diff.c
@@ -713,13 +713,17 @@ static int next_byte(const char **cp, const char **endp,
 		return -1;
 
 	if (DIFF_XDL_TST(diffopt, IGNORE_WHITESPACE_CHANGE)) {
-		while (*cp < *endp && isspace(**cp))
+		int saw_whitespace = 0;
+		while (*cp < *endp && isspace(**cp)) {
 			(*cp)++;
+			saw_whitespace = 1;
+		}
 		/*
-		 * After skipping a couple of whitespaces, we still have to
-		 * account for one space.
+		 * After skipping a couple of whitespaces,
+		 * we still have to account for one space.
 		 */
-		return (int)' ';
+		if (saw_whitespace)
+			return (int)' ';
 	}
 
 	if (DIFF_XDL_TST(diffopt, IGNORE_WHITESPACE)) {
diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index bd0f75d9f7..c088ae86af 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -1530,4 +1530,12 @@ test_expect_success 'move detection with submodules' '
 	test_cmp expect decoded_actual
 '
 
+test_expect_success 'move detection with whitespace changes' '
+	test_seq 10 > test &&
+	git add test &&
+	sed -i "s/3/42/" test &&
+	git -c diff.colormoved diff --ignore-space-change -- test &&
+	git reset --hard
+'
+
 test_done
-- 
2.14.0.rc0.3.g6c2e499285


^ permalink raw reply	[relevance 20%]

* Re: [PATCH] diff.c: increment buffer pointer in all code path
  2017-10-12 23:33 ` [PATCH] diff.c: increment buffer pointer in all code path Stefan Beller
@ 2017-10-13  0:18   ` Jeff King
  0 siblings, 0 replies; 200+ results
From: Jeff King @ 2017-10-13  0:18 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, orgads

On Thu, Oct 12, 2017 at 04:33:22PM -0700, Stefan Beller wrote:

> Peff, feel free to take ownership here. I merely made it to a patch.

I think compared to my original, it makes sense to actually wrap the
whole thing with a check for the whitespace. You can do it just in the
IGNORE_WHITESPACE_CHANGE conditional, but I think it makes more sense to
cover both. Like the patch below (view with "-w" to see the logic more
clearly).

I also tweaked the test to remove "sed -i", which isn't portable, and
fixed up a few style nits.

-- >8 --
Subject: diff: fix infinite loop with --color-moved --ignore-space-change

The --color-moved code uses next_byte() to advance through
the blob contents. When the user has asked to ignore
whitespace changes, we try to collapse any whitespace change
down to a single space.

However, we enter the conditional block whenever we see the
IGNORE_WHITESPACE_CHANGE flag, even if the next byte isn't
whitespace.

This means that the combination of "--color-moved and
--ignore-space-change" was completely broken. Worse, because
we return from next_byte() without having advanced our
pointer, the function makes no forward progress in the
buffer and loops infinitely.

Fix this by entering the conditional only when we actually
see whitespace. We can apply this also to the
IGNORE_WHITESPACE change. That code path isn't buggy
(because it falls through to returning the next
non-whitespace byte), but it makes the logic more clear if
we only bother to look at whitespace flags after seeing that
the next byte is whitespace.

Reported-by: Orgad Shaneh <orgads@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
---
 diff.c                     | 28 +++++++++++++++-------------
 t/t4015-diff-whitespace.sh |  9 +++++++++
 2 files changed, 24 insertions(+), 13 deletions(-)

diff --git a/diff.c b/diff.c
index 69f03570ad..d76bb937c1 100644
--- a/diff.c
+++ b/diff.c
@@ -712,20 +712,22 @@ static int next_byte(const char **cp, const char **endp,
 	if (*cp > *endp)
 		return -1;
 
-	if (DIFF_XDL_TST(diffopt, IGNORE_WHITESPACE_CHANGE)) {
-		while (*cp < *endp && isspace(**cp))
-			(*cp)++;
-		/*
-		 * After skipping a couple of whitespaces, we still have to
-		 * account for one space.
-		 */
-		return (int)' ';
-	}
+	if (isspace(**cp)) {
+		if (DIFF_XDL_TST(diffopt, IGNORE_WHITESPACE_CHANGE)) {
+			while (*cp < *endp && isspace(**cp))
+				(*cp)++;
+			/*
+			 * After skipping a couple of whitespaces,
+			 * we still have to account for one space.
+			 */
+			return (int)' ';
+		}
 
-	if (DIFF_XDL_TST(diffopt, IGNORE_WHITESPACE)) {
-		while (*cp < *endp && isspace(**cp))
-			(*cp)++;
-		/* return the first non-ws character via the usual below */
+		if (DIFF_XDL_TST(diffopt, IGNORE_WHITESPACE)) {
+			while (*cp < *endp && isspace(**cp))
+				(*cp)++;
+			/* return the first non-ws character via the usual below */
+		}
 	}
 
 	retval = (unsigned char)(**cp);
diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index bd0f75d9f7..87083f728f 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -1530,4 +1530,13 @@ test_expect_success 'move detection with submodules' '
 	test_cmp expect decoded_actual
 '
 
+test_expect_success 'move detection with whitespace changes' '
+	test_when_finished "git reset --hard" &&
+	test_seq 10 >test &&
+	git add test &&
+	sed s/3/42/ <test >test.tmp &&
+	mv test.tmp test &&
+	git -c diff.colormoved diff --ignore-space-change -- test
+'
+
 test_done
-- 
2.15.0.rc1.381.g94fac273d4


^ permalink raw reply	[relevance 10%]

* Re: [PATCH v4 2/3] implement fetching of moved submodules
  2017-10-16 13:58 ` [PATCH v4 2/3] implement fetching of moved submodules Heiko Voigt
@ 2017-10-17 17:47   ` Stefan Beller
  2017-10-18  0:03     ` Junio C Hamano
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-10-17 17:47 UTC (permalink / raw)
  To: Heiko Voigt; +Cc: Junio C Hamano, Jonathan Nieder, Jens Lehmann, Brandon Williams, git

On Mon, Oct 16, 2017 at 6:58 AM, Heiko Voigt <hvoigt@hvoigt.net> wrote:
> We store the changed submodules paths to calculate which submodule needs
> fetching. This does not work for moved submodules since their paths do
> not stay the same in case of a moved submodules. In case of new
> submodules we do not have a path in the current checkout, since they
> just appeared in this fetch.
>
> It is more general to collect the submodule names for changes instead of
> their paths to include the above cases. If we do not have a
> configuration for a gitlink we rely on constructing a default name from
> the path if a git repository can be found at its path. We skip
> non-configured gitlinks whose default name collides with a configured
> one.

Thanks for working on this!

As detailed below, I wonder if it is easier (in maintenance, explaining
correctness, reviewing) if we'd rather keep two lists around. One for
based on names, and if we cannot lookup a name for a submodule, we
rather use the second path based list as a fallback. That would avoid
potential namespace collisions between names and paths, as well as
not having the confusion of mapping back and forth.

Most functions would then need to operate on path, as the name->path
mapping can be looked up for the first list, but the path->name mapping
cannot be looked up for the second list.

Cheers,
Stefan

> With the change described above we implement 'on-demand' fetching of
> changes in moved submodules.
>
> Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>

> ---
>  submodule-config.h          |   3 +
>  submodule.c                 | 138 ++++++++++++++++++++++++++++++++------------
>  t/t5526-fetch-submodules.sh |  35 +++++++++++
>  3 files changed, 138 insertions(+), 38 deletions(-)
>
> diff --git a/submodule-config.h b/submodule-config.h
> index e3845831f6..a5503a5d17 100644
> --- a/submodule-config.h
> +++ b/submodule-config.h
> @@ -22,6 +22,9 @@ struct submodule {
>         int recommend_shallow;
>  };
>
> +#define SUBMODULE_INIT { NULL, NULL, NULL, RECURSE_SUBMODULES_NONE, \
> +       NULL, NULL, SUBMODULE_UPDATE_STRATEGY_INIT, {0}, -1 };
> +
>  struct submodule_cache;
>  struct repository;
>
> diff --git a/submodule.c b/submodule.c
> index 63e7094e16..71d1773e2e 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -21,7 +21,7 @@
>  #include "parse-options.h"
>
>  static int config_update_recurse_submodules = RECURSE_SUBMODULES_OFF;
> -static struct string_list changed_submodule_paths = STRING_LIST_INIT_DUP;
> +static struct string_list changed_submodule_names = STRING_LIST_INIT_DUP;
>  static int initialized_fetch_ref_tips;
>  static struct oid_array ref_tips_before_fetch;
>  static struct oid_array ref_tips_after_fetch;
> @@ -674,11 +674,11 @@ const struct submodule *submodule_from_ce(const struct cache_entry *ce)
>  }
>
>  static struct oid_array *submodule_commits(struct string_list *submodules,
> -                                          const char *path)
> +                                          const char *name)
>  {
>         struct string_list_item *item;
>
> -       item = string_list_insert(submodules, path);
> +       item = string_list_insert(submodules, name);
>         if (item->util)
>                 return (struct oid_array *) item->util;
>
> @@ -687,39 +687,67 @@ static struct oid_array *submodule_commits(struct string_list *submodules,
>         return (struct oid_array *) item->util;
>  }
>
> +struct collect_changed_submodules_cb_data {
> +       struct string_list *changed;
> +       const struct object_id *commit_oid;
> +};
> +
> +/*
> + * this would normally be two functions: default_name_from_path() and

Please start the comment capitalised. (minor nit)

> + * path_from_default_name(). Since the default name is the same as
> + * the submodule path we can get away with just one function which only
> + * checks whether there is a submodule in the working directory at that
> + * location.

This is an interesting comment, as it hints that we ought to keep it that way.
Earlier I was wondering if we want to make the name distinctively different
than its path, as that will confuse users *less* IMHO. (I just remember
someone asking on the mailing list why their "rename" did not work, as
they just renamed everything in the .gitmodules that looked like the path)

As the path/name is confusing, I'd wish we'd be super concise, such that
errors are harder to sneak into. For example, the arguments name should
be "path" as that is the only thing we can look up using is_sub_pop_gently,
if a "name" is given, than it just works because the chosen default name
was its path.

> +static const char *default_name_or_path(const char *path_or_name)
> +{
> +       int error_code;
> +
> +       if (!is_submodule_populated_gently(path_or_name, &error_code))
> +               return NULL;
> +
> +       return path_or_name;
> +}
> +


> +               if (submodule)
> +                       name = submodule->name;
> +               else {
> +                       name = default_name_or_path(p->two->path);

Here we use the path, as expected. So ideally we'd use
"default_name_for_path".


> +                       /* make sure name does not collide with existing one */
> +                       submodule = submodule_from_name(commit_oid, name);
> +                       if (submodule) {
> +                               warning("Submodule in commit %s at path: "
> +                                       "'%s' collides with a submodule named "
> +                                       "the same. Skipping it.",
> +                                       oid_to_hex(commit_oid), name);
> +                               name = NULL;
> +                       }

This is the ugly part of using one string list and storing names or
path in it. I wonder if we could omit this warning if we had 2 string lists?
One for names (which will then be used for renamed and new submodules)
and the "fall back" path based list. In such a world we would not need
to map back and forth between names and path.

> +               submodule = submodule_from_name(&null_oid, name->string);
> +               if (submodule)
> +                       path = submodule->path;
> +               else
> +                       path = default_name_or_path(name->string);
> +
> +               if (!path)
> +                       continue;


> +               submodule = submodule_from_name(&null_oid, name->string);
> +               if (submodule)
> +                       path = submodule->path;
> +               else
> +                       path = default_name_or_path(name->string);
> +
> +               if (!path)
> +                       continue;


>                 submodule = submodule_from_path(&null_oid, ce->name);
> +               if (!submodule) {
> +                       const char *name = default_name_or_path(ce->name);
> +                       if (name) {
> +                               default_submodule.path = default_submodule.name = name;
> +                               submodule = &default_submodule;
> +                       }
> +               }
>

>
> +test_expect_success "fetch new commits when submodule got renamed" '
> +       git clone . downstream_rename &&
> +       (
> +               cd downstream_rename &&
> +               git submodule update --init &&
> +# NEEDSWORK: we omitted --recursive for the submodule update here since
> +# that does not work. See test 7001 for mv "moving nested submodules"
> +# for details. Once that is fixed we should add the --recursive option
> +# here.
> +               git checkout -b rename &&
> +               git mv submodule submodule_renamed &&
> +               (
> +                       cd submodule_renamed &&
> +                       git checkout -b rename_sub &&
> +                       echo a >a &&
> +                       git add a &&
> +                       git commit -ma &&
> +                       git push origin rename_sub &&
> +                       git rev-parse HEAD >../../expect
> +               ) &&
> +               git add submodule_renamed &&
> +               git commit -m "update renamed submodule" &&
> +               git push origin rename
> +       ) &&
> +       (
> +               cd downstream &&
> +               git fetch --recurse-submodules=on-demand &&
> +               (
> +                       cd submodule &&
> +                       git rev-parse origin/rename_sub >../../actual
> +               )
> +       ) &&
> +       test_cmp expect actual
> +'
> +
>  test_done
> --
> 2.14.1.145.gb3622a4
>

^ permalink raw reply	[relevance 21%]

* Re: [PATCH v4 1/3] fetch: add test to make sure we stay backwards compatible
  2017-10-16 13:57 ` [PATCH v4 1/3] fetch: add test to make sure we stay backwards compatible Heiko Voigt
@ 2017-10-17 17:56   ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-10-17 17:56 UTC (permalink / raw)
  To: Heiko Voigt; +Cc: Junio C Hamano, Jonathan Nieder, Jens Lehmann, Brandon Williams, git

On Mon, Oct 16, 2017 at 6:57 AM, Heiko Voigt <hvoigt@hvoigt.net> wrote:
> The current implementation of submodules supports on-demand fetch if
> there is no .gitmodules entry for a submodule. Let's add a test to
> document this behavior.
>
> Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
> ---
>  t/t5526-fetch-submodules.sh | 42 +++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 41 insertions(+), 1 deletion(-)
>
> diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
> index 42251f7f3a..43a22f680f 100755
> --- a/t/t5526-fetch-submodules.sh
> +++ b/t/t5526-fetch-submodules.sh
> @@ -478,7 +478,47 @@ test_expect_success "don't fetch submodule when newly recorded commits are alrea
>                 git fetch >../actual.out 2>../actual.err
>         ) &&
>         ! test -s actual.out &&
> -       test_i18ncmp expect.err actual.err
> +       test_i18ncmp expect.err actual.err &&
> +       (
> +               cd submodule &&
> +               git checkout -q master
> +       )

For few instructions inside another repo, I tend to use the
-C option:

  git -C submodule checkout -q master

That saves a shell, which is noticeable cost on Windows I was told.
(also fewer lines to type).

Oh, I see, that is consistent with the rest of the file. Oh well.
(Otherwise I would have lobbied to even move it further up and
put it inside a test_when_finished "<cmd>"


> +'
> +
> +test_expect_success "'fetch.recurseSubmodules=on-demand' works also without .gitmodule entry" '
> +       (
> +               cd downstream &&
> +               git fetch --recurse-submodules
> +       ) &&

This is consistent with the rest of the file as well, so I shall
refrain from complaining. ;)

> +       add_upstream_commit &&
> +       head1=$(git rev-parse --short HEAD) &&
> +       git add submodule &&
> +       git rm .gitmodules &&
> +       git commit -m "new submodule without .gitmodules" &&
> +       printf "" >expect.out &&

This could be just

    : >expect.out

no need to invoke a function to print nothing.

> +       head2=$(git rev-parse --short HEAD) &&
> +       echo "From $pwd/." >expect.err.2 &&
> +       echo "   $head1..$head2  master     -> origin/master" >>expect.err.2 &&
> +       head -3 expect.err >>expect.err.2 &&
> +       (
> +               cd downstream &&
> +               rm .gitmodules &&
> +               git config fetch.recurseSubmodules on-demand &&
> +               # fake submodule configuration to avoid skipping submodule handling
> +               git config -f .gitmodules submodule.fake.path fake &&
> +               git config -f .gitmodules submodule.fake.url fakeurl &&
> +               git add .gitmodules &&
> +               git config --unset submodule.submodule.url &&
> +               git fetch >../actual.out 2>../actual.err &&
> +               # cleanup
> +               git config --unset fetch.recurseSubmodules &&
> +               git reset --hard
> +       ) &&
> +       test_i18ncmp expect.out actual.out &&
> +       test_i18ncmp expect.err.2 actual.err &&
> +       git checkout HEAD^ -- .gitmodules &&
> +       git add .gitmodules &&
> +       git commit -m "new submodule restored .gitmodules"
>  '

Thanks for writing this test.
With or without the nits addressed, this is

Reviewed-by: Stefan Beller <sbeller@google.com>

^ permalink raw reply	[relevance 23%]

* Re: [PATCH v4 3/3] submodule: simplify decision tree whether to or not to fetch
  2017-10-16 13:59 ` [PATCH v4 3/3] submodule: simplify decision tree whether to or not to fetch Heiko Voigt
@ 2017-10-17 18:22   ` Stefan Beller
  2017-10-18  0:03     ` Junio C Hamano
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-10-17 18:22 UTC (permalink / raw)
  To: Heiko Voigt; +Cc: Junio C Hamano, Jonathan Nieder, Jens Lehmann, Brandon Williams, git

On Mon, Oct 16, 2017 at 6:59 AM, Heiko Voigt <hvoigt@hvoigt.net> wrote:
> To make extending this logic later easier.
>
> Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>

Thanks for this readability fix!

Stefan

^ permalink raw reply	[relevance 16%]

* Re: [PATCH v4 2/3] implement fetching of moved submodules
  2017-10-17 17:47   ` Stefan Beller
@ 2017-10-18  0:03     ` Junio C Hamano
  2017-10-18 17:56       ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Junio C Hamano @ 2017-10-18  0:03 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Heiko Voigt, Jonathan Nieder, Jens Lehmann, Brandon Williams, git\

Stefan Beller <sbeller@google.com> writes:

>> +                       /* make sure name does not collide with existing one */
>> +                       submodule = submodule_from_name(commit_oid, name);
>> +                       if (submodule) {
>> +                               warning("Submodule in commit %s at path: "
>> +                                       "'%s' collides with a submodule named "
>> +                                       "the same. Skipping it.",
>> +                                       oid_to_hex(commit_oid), name);
>> +                               name = NULL;
>> +                       }
>
> This is the ugly part of using one string list and storing names or
> path in it. I wonder if we could omit this warning if we had 2 string lists?

We are keying off of 'name', because that is what will give a module
its identity.  If we have a gitlink whose path is not in .gitmodules
in the same tree, then we are seeing an unregistered submodule.  If
we were to "git add" it, then we'd use its path as the default name,
but if we already have a submodule with that name (the most likely
explanation for its existence is because it started its life there
and then later moved), and the submodule is bound to a different
path, then that is a different submodule.  Skipping and warning both
are sensible thing to do.

I do not know what you see as ugly here, and more importantly, I am
not sure how having two lists would help.

^ permalink raw reply	[relevance 25%]

* Re: [PATCH v4 3/3] submodule: simplify decision tree whether to or not to fetch
  2017-10-17 18:22   ` Stefan Beller
@ 2017-10-18  0:03     ` Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2017-10-18  0:03 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Heiko Voigt, Jonathan Nieder, Jens Lehmann, Brandon Williams, git\

Stefan Beller <sbeller@google.com> writes:

> On Mon, Oct 16, 2017 at 6:59 AM, Heiko Voigt <hvoigt@hvoigt.net> wrote:
>> To make extending this logic later easier.
>>
>> Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
>
> Thanks for this readability fix!
>
> Stefan

Thanks, both.

^ permalink raw reply	[relevance 15%]

* Re: [PATCH v4 2/3] implement fetching of moved submodules
  2017-10-18  0:03     ` Junio C Hamano
@ 2017-10-18 17:56       ` Stefan Beller
  2017-10-19  0:35         ` Junio C Hamano
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-10-18 17:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Heiko Voigt, Jonathan Nieder, Jens Lehmann, Brandon Williams, git

On Tue, Oct 17, 2017 at 5:03 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>>> +                       /* make sure name does not collide with existing one */
>>> +                       submodule = submodule_from_name(commit_oid, name);
>>> +                       if (submodule) {
>>> +                               warning("Submodule in commit %s at path: "
>>> +                                       "'%s' collides with a submodule named "
>>> +                                       "the same. Skipping it.",
>>> +                                       oid_to_hex(commit_oid), name);
>>> +                               name = NULL;
>>> +                       }
>>
>> This is the ugly part of using one string list and storing names or
>> path in it. I wonder if we could omit this warning if we had 2 string lists?
>
> We are keying off of 'name', because that is what will give a module
> its identity.  If we have a gitlink whose path is not in .gitmodules
> in the same tree, then we are seeing an unregistered submodule.

Right, so it has no submodule specific identity and we chose to "fake it"
by pretending its path is its name. However this requires checking as
there might be overlap in the name-namespace and the path-namespace.


>  If
> we were to "git add" it, then we'd use its path as the default name,

I presume "git submodule add"

> but if we already have a submodule with that name (the most likely
> explanation for its existence is because it started its life there
> and then later moved), and the submodule is bound to a different
> path, then that is a different submodule.  Skipping and warning both
> are sensible thing to do.

Skipping and warning is sensible once we decide to go this way.

I propose to take a step back and not throw away the information
whether the given string is a name or path, as then we do not have
to warn&skip, but we can treat both correctly.

As we only need to store an additional boolean (is it path or name?),
I had suggested to just use two lists, one for key-by-name and one
key-by-path, where we intend to use the key-by-name for submodules
and the by-path only for those with no name (i.e. lone gitlinks), hence
making this a "fallback list"

>
> I do not know what you see as ugly here,

the necessity of warn&skip instead of having a solution that
works in corner cases just fine.

> and more importantly, I am
> not sure how having two lists would help.

The current situation is that we use the path of the submodules only,
which makes it work without warn&skip, but it has other disadvantages
(i.e. new & moved submodules are not detected), which we want to fix.

We can add this functionality without caving in to skip the corner case
by storing an additional bit of information. The renaming is detected by
having a constant name before and after, just the path changed.
So we could continue to use by-path logic and only have the name
for rename detection. However that seems to be ugly, too. So we
seem to think that the by-name is better (as it is more in line with what
we think should happen, it is easier to explain, review and maintain(?)).

So we could have by-name keys, with the extra information of whether the
key is genuine or a "fake" key, which is t be resolved to a path instead.
And as that is just one bit, I proposed two lists for that.

Do I miss an essential part here?

Thanks,
Stefan

^ permalink raw reply	[relevance 23%]

* Re: [PATCH v4 2/3] implement fetching of moved submodules
  2017-10-18 17:56       ` Stefan Beller
@ 2017-10-19  0:35         ` Junio C Hamano
  2017-10-19 18:11           ` [PATCH 1/2] t5526: check for name/path collision in submodule fetch Stefan Beller
  2017-10-19 23:34           ` [PATCH v4 2/3] implement fetching of moved submodules Stefan Beller
  0 siblings, 2 replies; 200+ results
From: Junio C Hamano @ 2017-10-19  0:35 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Heiko Voigt, Jonathan Nieder, Jens Lehmann, Brandon Williams, git\

Stefan Beller <sbeller@google.com> writes:

>> but if we already have a submodule with that name (the most likely
>> explanation for its existence is because it started its life there
>> and then later moved), and the submodule is bound to a different
>> path, then that is a different submodule.  Skipping and warning both
>> are sensible thing to do.
>
> Skipping and warning is sensible once we decide to go this way.
>
> I propose to take a step back and not throw away the information
> whether the given string is a name or path, as then we do not have
> to warn&skip, but we can treat both correctly.

Now either one of us is utterly confused, and I suspect it is me, as
I do not see how "treat both correctly" could possibly work in the
case this code warns and skips.

At this point in the flow, we already know that it is not name,
because we asked and got a "Nah, there is no submodule registered in
.gitmodules at that path" from submodule_from_path().  Then we ask
submodule_from_name() if there is any submodule registered under the
name it would have got if it were added there, and we indeed find
one.  And that is definitely *not* a submodule we are looking for,
because if it were, its .path would have pointed at the path we were
using to ask in the first place.  The one we originally found at
path and are interested in finding out the details is not known to
.gitmodules, and the one under that name is not the one that we are
intereted in, so fetching from the repository the other one that
happens to have the same name but is different from the submodule we
are interested in would simply be wrong.

If we only have path without any .gitmodules entry (hence there is
not even URL), how would we proceed from that point on?


^ permalink raw reply	[relevance 24%]

* [PATCH 1/2] t5526: check for name/path collision in submodule fetch
  2017-10-19  0:35         ` Junio C Hamano
@ 2017-10-19 18:11           ` Stefan Beller
  2017-10-19 18:11             ` [PATCH 2/2] fetch, push: keep separate lists of submodules and gitlinks Stefan Beller
  2017-10-23 14:16             ` [PATCH 1/2] t5526: check for name/path collision in submodule fetch Heiko Voigt
  2017-10-19 23:34           ` [PATCH v4 2/3] implement fetching of moved submodules Stefan Beller
  1 sibling, 2 replies; 200+ results
From: Stefan Beller @ 2017-10-19 18:11 UTC (permalink / raw)
  To: gitster; +Cc: Jens.Lehmann, bmwill, git, hvoigt, jrnieder, sbeller

Signed-off-by: Stefan Beller <sbeller@google.com>
---

This is just to test the corner case we're discussing.
Applies on top of origin/hv/fetch-moved-submodules-on-demand.


 t/t5526-fetch-submodules.sh | 42 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index a552ad4ead..c82d519e06 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -571,6 +571,7 @@ test_expect_success 'fetching submodule into a broken repository' '
 '
 
 test_expect_success "fetch new commits when submodule got renamed" '
+	test_when_finished "rm -rf downstream_rename" &&
 	git clone . downstream_rename &&
 	(
 		cd downstream_rename &&
@@ -605,4 +606,45 @@ test_expect_success "fetch new commits when submodule got renamed" '
 	test_cmp expect actual
 '
 
+test_expect_success "warn on submodule name/path clash, but new commits fetched in renamed" '
+	test_when_finished "rm -rf downstream_rename" &&
+	git clone . downstream_rename &&
+	(
+		cd downstream_rename &&
+		git submodule update --init &&
+# NEEDSWORK: we omitted --recursive for the submodule update here since
+# that does not work. See test 7001 for mv "moving nested submodules"
+# for details. Once that is fixed we should add the --recursive option
+# here.
+		git checkout -b rename &&
+		git mv submodule submodule_renamed &&
+		(
+			cd submodule_renamed &&
+			git checkout -b rename_sub &&
+			echo a >a &&
+			git add a &&
+			git commit -ma &&
+			git push origin rename_sub &&
+			git rev-parse HEAD >../../expect
+		) &&
+		git add submodule_renamed &&
+		git commit -m "update renamed submodule" &&
+		# produce collision, note that we use no submodule command
+		git clone ../submodule submodule &&
+		git add submodule &&
+		git commit -m "have new submodule at old path " &&
+		git push origin rename
+	) &&
+	(
+		cd downstream &&
+		git fetch --recurse-submodules=on-demand 2>err &&
+		grep "collides with a submodule named" err &&
+		(
+			cd submodule &&
+			git rev-parse origin/rename_sub >../../actual
+		)
+	) &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.14.0.rc0.3.g6c2e499285


^ permalink raw reply	[relevance 30%]

* [PATCH 2/2] fetch, push: keep separate lists of submodules and gitlinks
  2017-10-19 18:11           ` [PATCH 1/2] t5526: check for name/path collision in submodule fetch Stefan Beller
@ 2017-10-19 18:11             ` Stefan Beller
  2017-10-23 14:12               ` Heiko Voigt
  2017-10-23 14:16             ` [PATCH 1/2] t5526: check for name/path collision in submodule fetch Heiko Voigt
  1 sibling, 1 reply; 200+ results
From: Stefan Beller @ 2017-10-19 18:11 UTC (permalink / raw)
  To: gitster; +Cc: Jens.Lehmann, bmwill, git, hvoigt, jrnieder, sbeller

Currently when fetching we collect the names of submodules to be fetched
in a list. As we also want to support fetching 'gitlinks, that happen to
have a repo checked out at the right place', we'll just pretend that these
are submodules. We do that by assuming their path is their name. This in
turn can yield collisions between the name-namespace and the
path-namespace. (See the previous test for a demonstration.)

This patch rewrites the code such that we treat the 'real submodule' case
differently from the 'gitlink, but ok' case. This introduces a bit
of code duplication, but gets rid of the confusing mapping between names
and paths.

The test is incomplete as the long term vision is not achieved yet.
(which would be fetching both the renamed submodule as well as
the gitlink thing, putting them in place via e.g. git-pull)

Signed-off-by: Stefan Beller <sbeller@google.com>
---

 Heiko,
 Junio,

 I assumed the code would ease up a lot more, but now I am undecided if
 I want to keep arguing as the code is not stopping to be ugly. :)
 
 The idea is to treat submodule and gitlinks separately, with submodules
 supporting renames, and gitlinks as a historic artefact.
 
 Sorry for the noise about code ugliness.
 
 Thanks,
 Stefan
 

 submodule.c                 | 168 +++++++++++++++++++++-----------------------
 t/t5526-fetch-submodules.sh |   1 -
 2 files changed, 81 insertions(+), 88 deletions(-)

diff --git a/submodule.c b/submodule.c
index 82d206eb65..115df82f32 100644
--- a/submodule.c
+++ b/submodule.c
@@ -22,6 +22,7 @@
 
 static int config_update_recurse_submodules = RECURSE_SUBMODULES_OFF;
 static struct string_list changed_submodule_names = STRING_LIST_INIT_DUP;
+static struct string_list changed_gitlink_paths = STRING_LIST_INIT_DUP;
 static int initialized_fetch_ref_tips;
 static struct oid_array ref_tips_before_fetch;
 static struct oid_array ref_tips_after_fetch;
@@ -674,11 +675,11 @@ const struct submodule *submodule_from_ce(const struct cache_entry *ce)
 }
 
 static struct oid_array *submodule_commits(struct string_list *submodules,
-					   const char *name)
+					   const char *key)
 {
 	struct string_list_item *item;
 
-	item = string_list_insert(submodules, name);
+	item = string_list_insert(submodules, key);
 	if (item->util)
 		return (struct oid_array *) item->util;
 
@@ -688,33 +689,20 @@ static struct oid_array *submodule_commits(struct string_list *submodules,
 }
 
 struct collect_changed_submodules_cb_data {
-	struct string_list *changed;
-	const struct object_id *commit_oid;
-};
+	/* used for submodules, supports renames: */
+	struct string_list *changed_by_name;
 
-/*
- * this would normally be two functions: default_name_from_path() and
- * path_from_default_name(). Since the default name is the same as
- * the submodule path we can get away with just one function which only
- * checks whether there is a submodule in the working directory at that
- * location.
- */
-static const char *default_name_or_path(const char *path_or_name)
-{
-	int error_code;
+	/* support old 'gitlink' with repo in-place, no rename support*/
+	struct string_list *changed_by_path;
 
-	if (!is_submodule_populated_gently(path_or_name, &error_code))
-		return NULL;
-
-	return path_or_name;
-}
+	const struct object_id *commit_oid;
+};
 
 static void collect_changed_submodules_cb(struct diff_queue_struct *q,
 					  struct diff_options *options,
 					  void *data)
 {
 	struct collect_changed_submodules_cb_data *me = data;
-	struct string_list *changed = me->changed;
 	const struct object_id *commit_oid = me->commit_oid;
 	int i;
 
@@ -722,42 +710,35 @@ static void collect_changed_submodules_cb(struct diff_queue_struct *q,
 		struct diff_filepair *p = q->queue[i];
 		struct oid_array *commits;
 		const struct submodule *submodule;
-		const char *name;
 
 		if (!S_ISGITLINK(p->two->mode))
 			continue;
 
 		submodule = submodule_from_path(commit_oid, p->two->path);
-		if (submodule)
-			name = submodule->name;
-		else {
-			name = default_name_or_path(p->two->path);
-			/* make sure name does not collide with existing one */
-			submodule = submodule_from_name(commit_oid, name);
-			if (submodule) {
-				warning("Submodule in commit %s at path: "
-					"'%s' collides with a submodule named "
-					"the same. Skipping it.",
-					oid_to_hex(commit_oid), name);
-				name = NULL;
-			}
+		if (submodule) {
+			commits = submodule_commits(me->changed_by_name, submodule->name);
+			oid_array_append(commits, &p->two->oid);
+		} else {
+			commits = submodule_commits(me->changed_by_path, p->two->path);
+			oid_array_append(commits, &p->two->oid);
 		}
-
-		if (!name)
-			continue;
-
-		commits = submodule_commits(changed, name);
-		oid_array_append(commits, &p->two->oid);
 	}
 }
 
 /*
- * Collect the paths of submodules in 'changed' which have changed based on
- * the revisions as specified in 'argv'.  Each entry in 'changed' will also
- * have a corresponding 'struct oid_array' (in the 'util' field) which lists
- * what the submodule pointers were updated to during the change.
+ * Collect the paths of submodules in 'changed_by_{name, path}' which have
+ * changed based on the revisions as specified in 'argv'.
+ *
+ * Each gitlink/submodule will occur in only one of the list. We'll prefer
+ * to give it by_name as that allows rename detection. We'll fall back to
+ * by_path to support gitlinks with no entry in '.gitmodules'.
+ *
+ * Each entry in 'changed_*' will also have a corresponding 'struct oid_array'
+ * (in the 'util' field) which lists what the submodule pointers were updated
+ * to during the change.
  */
-static void collect_changed_submodules(struct string_list *changed,
+static void collect_changed_submodules(struct string_list *changed_by_name,
+				       struct string_list *changed_by_path,
 				       struct argv_array *argv)
 {
 	struct rev_info rev;
@@ -771,7 +752,8 @@ static void collect_changed_submodules(struct string_list *changed,
 	while ((commit = get_revision(&rev))) {
 		struct rev_info diff_rev;
 		struct collect_changed_submodules_cb_data data;
-		data.changed = changed;
+		data.changed_by_name = changed_by_name;
+		data.changed_by_path = changed_by_path;
 		data.commit_oid = &commit->object.oid;
 
 		init_revisions(&diff_rev, NULL);
@@ -924,8 +906,9 @@ static int submodule_needs_pushing(const char *path, struct oid_array *commits)
 int find_unpushed_submodules(struct oid_array *commits,
 		const char *remotes_name, struct string_list *needs_pushing)
 {
-	struct string_list submodules = STRING_LIST_INIT_DUP;
-	struct string_list_item *name;
+	struct string_list submodules_by_name = STRING_LIST_INIT_DUP;
+	struct string_list gitlinks_by_path = STRING_LIST_INIT_DUP;
+	struct string_list_item *item;
 	struct argv_array argv = ARGV_ARRAY_INIT;
 
 	/* argv.argv[0] will be ignored by setup_revisions */
@@ -934,27 +917,33 @@ int find_unpushed_submodules(struct oid_array *commits,
 	argv_array_push(&argv, "--not");
 	argv_array_pushf(&argv, "--remotes=%s", remotes_name);
 
-	collect_changed_submodules(&submodules, &argv);
+	collect_changed_submodules(&submodules_by_name, &gitlinks_by_path, &argv);
 
-	for_each_string_list_item(name, &submodules) {
-		struct oid_array *commits = name->util;
+	for_each_string_list_item(item, &submodules_by_name) {
+		struct oid_array *commits = item->util;
+		const char *name = item->string;
 		const struct submodule *submodule;
-		const char *path = NULL;
+		const char *path;
 
-		submodule = submodule_from_name(&null_oid, name->string);
-		if (submodule)
-			path = submodule->path;
-		else
-			path = default_name_or_path(name->string);
+		submodule = submodule_from_name(&null_oid, name);
+		if (!submodule)
+			BUG("submodule name/path mapping corrupt");
+		path = submodule->path;
 
-		if (!path)
-			continue;
+		if (submodule_needs_pushing(path, commits))
+			string_list_insert(needs_pushing, path);
+	}
+
+	for_each_string_list_item(item, &gitlinks_by_path) {
+		struct oid_array *commits = item->util;
+		const char *path = item->string;
 
 		if (submodule_needs_pushing(path, commits))
 			string_list_insert(needs_pushing, path);
 	}
 
-	free_submodules_oids(&submodules);
+	free_submodules_oids(&submodules_by_name);
+	free_submodules_oids(&gitlinks_by_path);
 	argv_array_clear(&argv);
 
 	return needs_pushing->nr;
@@ -1106,7 +1095,8 @@ static void calculate_changed_submodule_paths(void)
 {
 	struct argv_array argv = ARGV_ARRAY_INIT;
 	struct string_list changed_submodules = STRING_LIST_INIT_DUP;
-	const struct string_list_item *name;
+	struct string_list changed_gitlinks = STRING_LIST_INIT_DUP;
+	const struct string_list_item *item;
 
 	/* No need to check if there are no submodules configured */
 	if (!submodule_from_path(NULL, NULL))
@@ -1123,27 +1113,32 @@ static void calculate_changed_submodule_paths(void)
 	 * Collect all submodules (whether checked out or not) for which new
 	 * commits have been recorded upstream in "changed_submodule_names".
 	 */
-	collect_changed_submodules(&changed_submodules, &argv);
+	collect_changed_submodules(&changed_submodules, &changed_gitlinks, &argv);
 
-	for_each_string_list_item(name, &changed_submodules) {
-		struct oid_array *commits = name->util;
-		const struct submodule *submodule;
-		const char *path = NULL;
+	for_each_string_list_item(item, &changed_submodules) {
+		struct oid_array *commits = item->util;
+		const char *name = item->string;
+		const struct submodule *sub =
+			submodule_from_name(&null_oid, name);
 
-		submodule = submodule_from_name(&null_oid, name->string);
-		if (submodule)
-			path = submodule->path;
-		else
-			path = default_name_or_path(name->string);
+		if (!sub)
+			BUG("cannot lookup submodule, but we could before?");
 
-		if (!path)
-			continue;
+		if (!submodule_has_commits(sub->path, commits))
+			string_list_append(&changed_submodule_names, name);
+	}
+
+	/* the same for gitnlinks, stored in 'changed_gitlink_paths' */
+	for_each_string_list_item(item, &changed_gitlinks) {
+		const char *path = item->string;
+		struct oid_array *commits = item->util;
 
 		if (!submodule_has_commits(path, commits))
-			string_list_append(&changed_submodule_names, name->string);
+			string_list_append(&changed_gitlink_paths, path);
 	}
 
 	free_submodules_oids(&changed_submodules);
+	free_submodules_oids(&changed_gitlinks);
 	argv_array_clear(&argv);
 	oid_array_clear(&ref_tips_before_fetch);
 	oid_array_clear(&ref_tips_after_fetch);
@@ -1154,6 +1149,7 @@ int submodule_touches_in_range(struct object_id *excl_oid,
 			       struct object_id *incl_oid)
 {
 	struct string_list subs = STRING_LIST_INIT_DUP;
+	struct string_list gitlinks = STRING_LIST_INIT_DUP;
 	struct argv_array args = ARGV_ARRAY_INIT;
 	int ret;
 
@@ -1166,8 +1162,8 @@ int submodule_touches_in_range(struct object_id *excl_oid,
 	argv_array_push(&args, "--not");
 	argv_array_push(&args, oid_to_hex(excl_oid));
 
-	collect_changed_submodules(&subs, &args);
-	ret = subs.nr;
+	collect_changed_submodules(&subs, &gitlinks, &args);
+	ret = subs.nr + gitlinks.nr;
 
 	argv_array_clear(&args);
 
@@ -1225,27 +1221,25 @@ static int get_next_submodule(struct child_process *cp,
 		const struct cache_entry *ce = active_cache[spf->count];
 		const char *git_dir, *default_argv;
 		const struct submodule *submodule;
-		struct submodule default_submodule = SUBMODULE_INIT;
+		int found = 0;
 
 		if (!S_ISGITLINK(ce->ce_mode))
 			continue;
 
 		submodule = submodule_from_path(&null_oid, ce->name);
-		if (!submodule) {
-			const char *name = default_name_or_path(ce->name);
-			if (name) {
-				default_submodule.path = default_submodule.name = name;
-				submodule = &default_submodule;
-			}
-		}
 
 		switch (get_fetch_recurse_config(submodule, spf))
 		{
 		default:
 		case RECURSE_SUBMODULES_DEFAULT:
 		case RECURSE_SUBMODULES_ON_DEMAND:
-			if (!submodule || !unsorted_string_list_lookup(&changed_submodule_names,
-							 submodule->name))
+
+			if (submodule)
+				found |= !!unsorted_string_list_lookup(&changed_submodule_names, submodule->name);
+
+			found |= !!unsorted_string_list_lookup(&changed_gitlink_paths, ce->name);
+
+			if (!found)
 				continue;
 			default_argv = "on-demand";
 			break;
diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index c82d519e06..d6a6d6a4e1 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -638,7 +638,6 @@ test_expect_success "warn on submodule name/path clash, but new commits fetched
 	(
 		cd downstream &&
 		git fetch --recurse-submodules=on-demand 2>err &&
-		grep "collides with a submodule named" err &&
 		(
 			cd submodule &&
 			git rev-parse origin/rename_sub >../../actual
-- 
2.14.0.rc0.3.g6c2e499285


^ permalink raw reply	[relevance 20%]

* Re: [PATCH 3/5] t4015: test the output of "diff --color-moved -b"
  2017-10-19 20:26 ` [PATCH 3/5] t4015: test the output of "diff --color-moved -b" Jeff King
@ 2017-10-19 21:03   ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-10-19 21:03 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git, Orgad Shaneh

On Thu, Oct 19, 2017 at 1:26 PM, Jeff King <peff@peff.net> wrote:
> Commit fa5ba2c1dd (diff: fix infinite loop with
> --color-moved --ignore-space-change, 2017-10-12) added a
> test to make sure that "--color-moved -b" doesn't run
> forever, but the test in question doesn't actually have any
> moved lines in it.
>
> Let's scrap that test and add a variant of the existing
> "--color-moved -w" test, but this time we'll check that we
> find the move with whitespace changes, but not arbitrary
> whitespace additions.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---

> +
> +       git diff HEAD --no-renames --color-moved --color |
> +               grep -v "index" |
> +               test_decode_color >actual &&

The -v index grep makes it future proof (for a new hash)!
I like that. What I do not like is the pipe at the end of
git-diff as we certainly want to inspect the return code
(or for a segfault ;)

> +       git diff HEAD --no-renames -b --color-moved --color |
> +               grep -v "index" |
> +               test_decode_color >actual &&

Do we need (or rather want) to give --color additionally,
instead of having --color-moved imply it? I guess this is
extra carefulness from the "color always" series?


>  test_expect_success 'clean up whitespace-test colors' '
>         git config --unset color.diff.oldMoved &&
>         git config --unset color.diff.newMoved
> @@ -1549,13 +1613,4 @@ test_expect_success 'move detection with submodules' '
>         test_cmp expect decoded_actual
>  '
>
> -test_expect_success 'move detection with whitespace changes' '
> -       test_when_finished "git reset --hard" &&
> -       test_seq 10 >test &&
> -       git add test &&
> -       sed s/3/42/ <test >test.tmp &&
> -       mv test.tmp test &&
> -       git -c diff.colormoved diff --ignore-space-change -- test
> -'

Ok, this is covered above in testing for the unchanged lines (1-5)

^ permalink raw reply	[relevance 8%]

* Re: [PATCH 2/3] t5615: avoid re-using descriptor 4
  2017-10-19 21:07 ` [PATCH 2/3] t5615: avoid re-using descriptor 4 Jeff King
@ 2017-10-19 21:46   ` Stefan Beller
  2017-10-19 23:23     ` Jeff King
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-10-19 21:46 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Johannes Schindelin, Lars Schneider

> I also considered trying to bump the "set -x" output descriptor to "9".
> That just moves the problem around, but presumably scripts are less
> likely to go that high. :)
>
> It would also be possible to pick something insanely high, like "999".
> Many shells choke on descriptors higher than 9, but since the issue is
> related to BASH_XTRACEFD, we could make it a bash-only thing. I don't
> know if it's worth the trouble. I hate leaving a subtle "don't use
> descriptor 4 in a subshell or your script will break" hand-grenade like
> this lying around, but we do seem to have only one instance of it over
> the whole test suite.

I would imagine that a higher fd for BASH_XTRACEFD
would be less explosive than requiring the tests to skip
the low number 4. (It is not *that* low. In e.g. git-submodule.sh
we make use of 3 in cmd_foreach, not needing more.)

Thanks,
Stefan

^ permalink raw reply	[relevance 15%]

* Re: [PATCH 5/5] diff: handle NULs in get_string_hash()
  2017-10-19 21:39     ` Jeff King
@ 2017-10-19 21:50       ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-10-19 21:50 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git, Orgad Shaneh

On Thu, Oct 19, 2017 at 2:39 PM, Jeff King <peff@peff.net> wrote:
> On Thu, Oct 19, 2017 at 02:31:20PM -0700, Stefan Beller wrote:
>
>> > This is unlikely to ever come up in practice since our line
>> > boundaries generally come from calling strlen() in the first
>> > place.
>>
>> get_string_hash is called from
>>  prepare_entry, which in turn is called from
>>   add_lines_to_move_detection or mark_color_as_moved
>>    diff_flush_patch_all_file_pairs
>>
>> that constructs the arguments in
>> diff_flush_patch
>>  run_diff
>>   run_diff_cmd
>>    builtin_diff (part "/* Crazy xdl interfaces.. */")
>>     xdi_diff_outf( fn_out_consume as arg!)
>>      xdi_diff
>>       xdl_diff
>>        xdl_call_hunk_func
>>         -> fn_out_consume(cb, line, len)
>>
>> xdl_call_hunk_func however uses pointer arithmetic instead
>> of strlen. So I think this sentence is not a good idea to put in
>> the commit message.
>>
>> It may not occur in practice, due to binary files detection using
>> NUL as a signal, but conceptually our move-colored(!) diffs
>> should be compatible with NULs with this patch now.
>
> Good catch. I just skimmed over all the diff_emit_* functions, which use
> strlen(). But at the bottom is emit_add_line(), which has a real length
> marker from xdiff.

Doh!
There is diff_emit_submodule_* but I presume you meant
emit_diff_* actually as there the strlen is legit, as we generate
known non-NUL data to print.

The submodule output may get mangled in diff_emit_submodule_{add,del}
as the input is coming from user data.

> I agree we wouldn't see NULs in general, but this is maybe fixing "diff
> --color-moved -a". I dunno. It's probably not worth digging too much,
> since it seems like the right thing to do regardless.

I agree, that this digging seems not worth it; I was just agitated at
the seemingly incorrect commit message.

Thanks,
Stefan

>
> -Peff

^ permalink raw reply	[relevance 15%]

* Re: [PATCH 2/3] t5615: avoid re-using descriptor 4
  2017-10-19 21:46   ` Stefan Beller
@ 2017-10-19 23:23     ` Jeff King
  0 siblings, 0 replies; 200+ results
From: Jeff King @ 2017-10-19 23:23 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Johannes Schindelin, Lars Schneider

On Thu, Oct 19, 2017 at 02:46:33PM -0700, Stefan Beller wrote:

> > I also considered trying to bump the "set -x" output descriptor to "9".
> > That just moves the problem around, but presumably scripts are less
> > likely to go that high. :)
> >
> > It would also be possible to pick something insanely high, like "999".
> > Many shells choke on descriptors higher than 9, but since the issue is
> > related to BASH_XTRACEFD, we could make it a bash-only thing. I don't
> > know if it's worth the trouble. I hate leaving a subtle "don't use
> > descriptor 4 in a subshell or your script will break" hand-grenade like
> > this lying around, but we do seem to have only one instance of it over
> > the whole test suite.
> 
> I would imagine that a higher fd for BASH_XTRACEFD
> would be less explosive than requiring the tests to skip
> the low number 4. (It is not *that* low. In e.g. git-submodule.sh
> we make use of 3 in cmd_foreach, not needing more.)

So one trick is that we can't just set it to a higher number. We have to
also open and manage that descriptor. It might be enough to do:

  if test -n "$BASH_VERSION"
  then
	exec 999>&4
	BASH_XTRACEFD=999
  fi

but since we fiddle with descriptors on a per-test basis, I'm not sure
if that's it or not. I think for convincing output to go to it, it
probably is. We tend to point things _at_ descriptor 4, not point
descriptor 4 elsewhere. But the fix in patch 1 is an exception, where we
try to suppress the "set +x" output. For that we have to redirect 999,
too (which is hard because ">&999" is syntactically invalid in other
shells, so we have to follow a separate code path for bash or get into
evals).

Or start playing games with unsetting BASH_XTRACEFD (which closes 999,
and then we re-open 999>&4 and reset XTRACEFD).

I think it might be workable, but I'm worried we're opening a can of
worms. Or continuing to dig into the can of worms from the original
BASH_XTRACEFD, if you prefer. :)

-Peff

^ permalink raw reply	[relevance 8%]

* Re: [PATCH v4 2/3] implement fetching of moved submodules
  2017-10-19  0:35         ` Junio C Hamano
  2017-10-19 18:11           ` [PATCH 1/2] t5526: check for name/path collision in submodule fetch Stefan Beller
@ 2017-10-19 23:34           ` Stefan Beller
  1 sibling, 0 replies; 200+ results
From: Stefan Beller @ 2017-10-19 23:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Heiko Voigt, Jonathan Nieder, Jens Lehmann, Brandon Williams, git

On Wed, Oct 18, 2017 at 5:35 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>>> but if we already have a submodule with that name (the most likely
>>> explanation for its existence is because it started its life there
>>> and then later moved), and the submodule is bound to a different
>>> path, then that is a different submodule.  Skipping and warning both
>>> are sensible thing to do.
>>
>> Skipping and warning is sensible once we decide to go this way.
>>
>> I propose to take a step back and not throw away the information
>> whether the given string is a name or path, as then we do not have
>> to warn&skip, but we can treat both correctly.
>
> Now either one of us is utterly confused, and I suspect it is me, as
> I do not see how "treat both correctly" could possibly work in the
> case this code warns and skips.
>
> At this point in the flow, we already know that it is not name,
> because we asked and got a "Nah, there is no submodule registered in
> .gitmodules at that path" from submodule_from_path().  Then we ask
> submodule_from_name() if there is any submodule registered under the
> name it would have got if it were added there, and we indeed find
> one.  And that is definitely *not* a submodule we are looking for,
> because if it were, its .path would have pointed at the path we were
> using to ask in the first place.  The one we originally found at
> path and are interested in finding out the details is not known to
> .gitmodules, and the one under that name is not the one that we are
> intereted in, so fetching from the repository the other one that
> happens to have the same name but is different from the submodule we
> are interested in would simply be wrong.

Eventually we'd want to also init new submodules on fetch
(if you use submodule.active to specify the interesting submodules),
and in that case I would imagine to fetch both submodules.

As I wrote the code to further improve this series,
I realized that this is maybe "good enough" for now,
so assume that I have reviewed this series and found it good.

> If we only have path without any .gitmodules entry (hence there is
> not even URL), how would we proceed from that point on?

Oh well, right. We can only offer to keep the existing behavior
which means supporting existing repos in place at that path.

Stefan

>

^ permalink raw reply	[relevance 26%]

* "Cannot fetch git.git" (worktrees at fault? or origin/HEAD) ?
@ 2017-10-20  1:43 Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-10-20  1:43 UTC (permalink / raw)
  To: git, Duy Nguyen, Jonathan Nieder

So I ran "git fetch --all" inside git.git that you are all familiar with.
All fetches failed with a similar error as:
Fetching kernelorg
fatal: bad object HEAD
error: https://kernel.googlesource.com/pub/scm/git/git did not send
all necessary objects

error: Could not fetch kernelorg

Working with an older version of Git (2.13.0) gives me

$ GIT_TRACE=/u/trace GIT_TRACE_PACKET=/u/out git fetch git://github.com/git/git
From git://github.com/git/git
 * branch                  HEAD       -> FETCH_HEAD

After fiddling around and some printf debugging,
clean fsck(!)
I could not make sense of it, so bisecting

$ git bisect bad
d0c39a49ccb5dfe7feba4325c3374d99ab123c59 is the first bad commit
commit d0c39a49ccb5dfe7feba4325c3374d99ab123c59
Author: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Date:   Wed Aug 23 19:36:59 2017 +0700

    revision.c: --all adds HEAD from all worktrees

    Unless single_worktree is set, --all now adds HEAD from all worktrees.

    Since reachable.c code does not use setup_revisions(), we need to call
    other_head_refs_submodule() explicitly there to have the same effect on
    "git prune", so that we won't accidentally delete objects needed by some
    other HEADs.

    A new FIXME is added because we would need something like

        int refs_other_head_refs(struct ref_store *, each_ref_fn, cb_data);

    in addition to other_head_refs() to handle it, which might require

        int get_submodule_worktrees(const char *submodule, int flags);

    It could be a separate topic to reduce the scope of this one.

    Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>

And here we go, I do use worktrees nowadays!
$ git worktree list
/usr/local/google/home/sbeller/OSS/git                   117ddefdb4
(detached HEAD)
/u/git                                                   d0c39a49cc
(detached HEAD)
/usr/local/google/home/sbeller/OSS/git_origin_master     b14f27f917
(detached HEAD)
/usr/local/google/home/sbeller/OSS/submodule_remote_dot  3d9025bd69
(detached HEAD)

$ git show 3d9025bd69
fatal: ambiguous argument '3d9025bd69': unknown revision or path not
in the working tree.

ok, so I presume I just delete that working tree to "fix my copy of git"
sbeller@sbeller:/u/git$ rm -rf
/usr/local/google/home/sbeller/OSS/submodule_remote_dot
sbeller@sbeller:/u/git$ git worktree prune
sbeller@sbeller:/u/git$ git worktree list
/usr/local/google/home/sbeller/OSS/git                117ddefdb4 (detached HEAD)
/u/git                                                d0c39a49cc (detached HEAD)
/usr/local/google/home/sbeller/OSS/git_origin_master  b14f27f917 (detached HEAD)
$ git bisect reset
Previous HEAD position was d0c39a49cc... revision.c: --all adds HEAD
from all worktrees
HEAD is now at 660fb3dfa8... Sync with maint
$ make install
$ git fetch --all
# works fine!

Any idea which direction we need to heading for a real fix?

^ permalink raw reply	[relevance 12%]

* [ANNOUNCE] Git v2.15.0-rc2
@ 2017-10-20  6:49 Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2017-10-20  6:49 UTC (permalink / raw)
  To: git; +Cc: Linux Kernel

A release candidate Git v2.15.0-rc2 is now available for testing
at the usual places.  It is comprised of 737 non-merge commits
since v2.14.0, contributed by 75 people, 22 of which are new faces.

We had to back-track a bit wrt to the "git add -p" regression; for
now, we simply revert the changes that caused issues to users
without redefining 'color.ui=always' to mean 'color.ui=auto', which
may or may not happen in future releases.

The tarballs are found at:

    https://www.kernel.org/pub/software/scm/git/testing/

The following public repositories all have a copy of the
'v2.15.0-rc2' tag and the 'master' branch that the tag points at:

  url = https://kernel.googlesource.com/pub/scm/git/git
  url = git://repo.or.cz/alt-git.git
  url = https://github.com/gitster/git

New contributors whose contributions weren't in v2.14.0 are as follows.
Welcome to the Git development community!

  Ann T Ropea, Daniel Watkins, Derrick Stolee, Dimitrios
  Christidis, Eric Rannaud, Evan Zacks, Hielke Christian Braun,
  Ian Campbell, Ilya Kantor, Jameson Miller, Job Snijders, Joel
  Teichroeb, joernchen, Łukasz Gryglicki, Manav Rathi, Martin
  Ågren, Michael Forney, Patryk Obara, Randall S. Becker, Ross
  Kabus, Taylor Blau, and Urs Thuermann.

Returning contributors who helped this release are as follows.
Thanks for your continued support.

  Adam Dinwoodie, Ævar Arnfjörð Bjarmason, Andreas Heiduk,
  Anthony Sottile, Ben Boeckel, Brandon Casey, Brandon Williams,
  brian m. carlson, Christian Couder, David Glasser, Eric Blake,
  Han-Wen Nienhuys, Heiko Voigt, Jean-Noel Avila, Jeff Hostetler,
  Jeff King, Johannes Schindelin, Johannes Sixt, Jonathan Nieder,
  Jonathan Tan, Junio C Hamano, Kaartic Sivaraam, Kevin Daudt,
  Kevin Willford, Lars Schneider, Martin Koegler, Matthieu Moy,
  Max Kirillov, Michael Haggerty, Michael J Gruber, Nguyễn Thái
  Ngọc Duy, Nicolas Morey-Chaisemartin, Øystein Walle, Paolo
  Bonzini, Pat Thoyts, Philip Oakley, Phillip Wood, Ralf Thielow,
  Raman Gupta, Ramsay Jones, René Scharfe, Sahil Dua, Santiago
  Torres, Stefan Beller, Stephan Beyer, Takashi Iwai, Thomas
  Braun, Thomas Gummerer, Todd Zullinger, Tom G. Christensen,
  Torsten Bögershausen, William Duclot, and W. Trevor King.

----------------------------------------------------------------

Git 2.15 Release Notes (draft)
==============================

Backward compatibility notes and other notable changes.

 * Use of an empty string as a pathspec element that is used for
   'everything matches' is still warned and Git asks users to use a
   more explicit '.' for that instead.  The hope is that existing
   users will not mind this change, and eventually the warning can be
   turned into a hard error, upgrading the deprecation into removal of
   this (mis)feature.  That is now scheduled to happen in Git v2.16,
   the next major release after this one.

 * Git now avoids blindly falling back to ".git" when the setup
   sequence said we are _not_ in Git repository.  A corner case that
   happens to work right now may be broken by a call to BUG().
   We've tried hard to locate such cases and fixed them, but there
   might still be cases that need to be addressed--bug reports are
   greatly appreciated.

 * "branch --set-upstream" that has been deprecated in Git 1.8 has
   finally been retired.


Updates since v2.14
-------------------

UI, Workflows & Features

 * An example that is now obsolete has been removed from a sample hook,
   and an old example in it that added a sign-off manually has been
   improved to use the interpret-trailers command.

 * The advice message given when "git rebase" stops for conflicting
   changes has been improved.

 * The "rerere-train" script (in contrib/) learned the "--overwrite"
   option to allow overwriting existing recorded resolutions.

 * "git contacts" (in contrib/) now lists the address on the
   "Reported-by:" trailer to its output, in addition to those on
   S-o-b: and other trailers, to make it easier to notify (and thank)
   the original bug reporter.

 * "git rebase", especially when it is run by mistake and ends up
   trying to replay many changes, spent long time in silence.  The
   command has been taught to show progress report when it spends
   long time preparing these many changes to replay (which would give
   the user a chance to abort with ^C).

 * "git merge" learned a "--signoff" option to add the Signed-off-by:
   trailer with the committer's name.

 * "git diff" learned to optionally paint new lines that are the same
   as deleted lines elsewhere differently from genuinely new lines.

 * "git interpret-trailers" learned to take the trailer specifications
   from the command line that overrides the configured values.

 * "git interpret-trailers" has been taught a "--parse" and a few
   other options to make it easier for scripts to grab existing
   trailer lines from a commit log message.

 * The "--format=%(trailers)" option "git log" and its friends take
   learned to take the 'unfold' and 'only' modifiers to normalize its
   output, e.g. "git log --format=%(trailers:only,unfold)".

 * "gitweb" shows a link to visit the 'raw' contents of blbos in the
   history overview page.

 * "[gc] rerereResolved = 5.days" used to be invalid, as the variable
   is defined to take an integer counting the number of days.  It now
   is allowed.

 * The code to acquire a lock on a reference (e.g. while accepting a
   push from a client) used to immediately fail when the reference is
   already locked---now it waits for a very short while and retries,
   which can make it succeed if the lock holder was holding it during
   a read-only operation.

 * "branch --set-upstream" that has been deprecated in Git 1.8 has
   finally been retired.

 * The codepath to call external process filter for smudge/clean
   operation learned to show the progress meter.

 * "git rev-parse" learned "--is-shallow-repository", that is to be
   used in a way similar to existing "--is-bare-repository" and
   friends.

 * "git describe --match <pattern>" has been taught to play well with
   the "--all" option.

 * "git branch" learned "-c/-C" to create a new branch by copying an
   existing one.

 * Some commands (most notably "git status") makes an opportunistic
   update when performing a read-only operation to help optimize later
   operations in the same repository.  The new "--no-optional-locks"
   option can be passed to Git to disable them.

 * "git for-each-ref --format=..." learned a new format element,
   %(trailers), to show only the commit log trailer part of the log
   message.


Performance, Internal Implementation, Development Support etc.

 * Conversion from uchar[20] to struct object_id continues.

 * Start using selected c99 constructs in small, stable and
   essentialpart of the system to catch people who care about
   older compilers that do not grok them.

 * The filter-process interface learned to allow a process with long
   latency give a "delayed" response.

 * Many uses of comparision callback function the hashmap API uses
   cast the callback function type when registering it to
   hashmap_init(), which defeats the compile time type checking when
   the callback interface changes (e.g. gaining more parameters).
   The callback implementations have been updated to take "void *"
   pointers and cast them to the type they expect instead.

 * Because recent Git for Windows do come with a real msgfmt, the
   build procedure for git-gui has been updated to use it instead of a
   hand-rolled substitute.

 * "git grep --recurse-submodules" has been reworked to give a more
   consistent output across submodule boundary (and do its thing
   without having to fork a separate process).

 * A helper function to read a single whole line into strbuf
   mistakenly triggered OOM error at EOF under certain conditions,
   which has been fixed.

 * The "ref-store" code reorganization continues.

 * "git commit" used to discard the index and re-read from the filesystem
   just in case the pre-commit hook has updated it in the middle; this
   has been optimized out when we know we do not run the pre-commit hook.
   (merge 680ee550d7 kw/commit-keep-index-when-pre-commit-is-not-run later to maint).

 * Updates to the HTTP layer we made recently unconditionally used
   features of libCurl without checking the existence of them, causing
   compilation errors, which has been fixed.  Also migrate the code to
   check feature macros, not version numbers, to cope better with
   libCurl that vendor ships with backported features.

 * The API to start showing progress meter after a short delay has
   been simplified.
   (merge 8aade107dd jc/simplify-progress later to maint).

 * Code clean-up to avoid mixing values read from the .gitmodules file
   and values read from the .git/config file.

 * We used to spend more than necessary cycles allocating and freeing
   piece of memory while writing each index entry out.  This has been
   optimized.

 * Platforms that ship with a separate sha1 with collision detection
   library can link to it instead of using the copy we ship as part of
   our source tree.

 * Code around "notes" have been cleaned up.
   (merge 3964281524 mh/notes-cleanup later to maint).

 * The long-standing rule that an in-core lockfile instance, once it
   is used, must not be freed, has been lifted and the lockfile and
   tempfile APIs have been updated to reduce the chance of programming
   errors.

 * Our hashmap implementation in hashmap.[ch] is not thread-safe when
   adding a new item needs to expand the hashtable by rehashing; add
   an API to disable the automatic rehashing to work it around.

 * Many of our programs consider that it is OK to release dynamic
   storage that is used throughout the life of the program by simply
   exiting, but this makes it harder to leak detection tools to avoid
   reporting false positives.  Plug many existing leaks and introduce
   a mechanism for developers to mark that the region of memory
   pointed by a pointer is not lost/leaking to help these tools.

 * As "git commit" to conclude a conflicted "git merge" honors the
   commit-msg hook, "git merge" that records a merge commit that
   cleanly auto-merges should, but it didn't.

 * The codepath for "git merge-recursive" has been cleaned up.

 * Many leaks of strbuf have been fixed.

 * "git imap-send" has our own implementation of the protocol and also
   can use more recent libCurl with the imap protocol support.  Update
   the latter so that it can use the credential subsystem, and then
   make it the default option to use, so that we can eventually
   deprecate and remove the former.

 * "make style" runs git-clang-format to help developers by pointing
   out coding style issues.

 * A test to demonstrate "git mv" failing to adjust nested submodules
   has been added.
   (merge c514167df2 hv/mv-nested-submodules-test later to maint).

 * On Cygwin, "ulimit -s" does not report failure but it does not work
   at all, which causes an unexpected success of some tests that
   expect failures under a limited stack situation.  This has been
   fixed.

 * Many codepaths have been updated to squelch -Wimplicit-fallthrough
   warnings from Gcc 7 (which is a good code hygiene).

 * Add a helper for DLL loading in anticipation for its need in a
   future topic RSN.

 * "git status --ignored", when noticing that a directory without any
   tracked path is ignored, still enumerated all the ignored paths in
   the directory, which is unnecessary.  The codepath has been
   optimized to avoid this overhead.

 * The final batch to "git rebase -i" updates to move more code from
   the shell script to C has been merged.

 * Operations that do not touch (majority of) packed refs have been
   optimized by making accesses to packed-refs file lazy; we no longer
   pre-parse everything, and an access to a single ref in the
   packed-refs does not touch majority of irrelevant refs, either.

 * Add comment to clarify that the style file is meant to be used with
   clang-5 and the rules are still work in progress.

 * Many variables that points at a region of memory that will live
   throughout the life of the program have been marked with UNLEAK
   marker to help the leak checkers concentrate on real leaks..

 * Plans for weaning us off of SHA-1 has been documented.

 * A new "oidmap" API has been introduced and oidset API has been
   rewritten to use it.


Also contains various documentation updates and code clean-ups.


Fixes since v2.14
-----------------

 * "%C(color name)" in the pretty print format always produced ANSI
   color escape codes, which was an early design mistake.  They now
   honor the configuration (e.g. "color.ui = never") and also tty-ness
   of the output medium.

 * The http.{sslkey,sslCert} configuration variables are to be
   interpreted as a pathname that honors "~[username]/" prefix, but
   weren't, which has been fixed.

 * Numerous bugs in walking of reflogs via "log -g" and friends have
   been fixed.

 * "git commit" when seeing an totally empty message said "you did not
   edit the message", which is clearly wrong.  The message has been
   corrected.

 * When a directory is not readable, "gitweb" fails to build the
   project list.  Work this around by skipping such a directory.

 * Some versions of GnuPG fails to kill gpg-agent it auto-spawned
   and such a left-over agent can interfere with a test.  Work it
   around by attempting to kill one before starting a new test.

 * A recently added test for the "credential-cache" helper revealed
   that EOF detection done around the time the connection to the cache
   daemon is torn down were flaky.  This was fixed by reacting to
   ECONNRESET and behaving as if we got an EOF.

 * "git log --tag=no-such-tag" showed log starting from HEAD, which
   has been fixed---it now shows nothing.

 * The "tag.pager" configuration variable was useless for those who
   actually create tag objects, as it interfered with the use of an
   editor.  A new mechanism has been introduced for commands to enable
   pager depending on what operation is being carried out to fix this,
   and then "git tag -l" is made to run pager by default.

 * "git push --recurse-submodules $there HEAD:$target" was not
   propagated down to the submodules, but now it is.

 * Commands like "git rebase" accepted the --rerere-autoupdate option
   from the command line, but did not always use it.  This has been
   fixed.

 * "git clone --recurse-submodules --quiet" did not pass the quiet
   option down to submodules.

 * Test portability fix for OBSD.

 * Portability fix for OBSD.

 * "git am -s" has been taught that some input may end with a trailer
   block that is not Signed-off-by: and it should refrain from adding
   an extra blank line before adding a new sign-off in such a case.

 * "git svn" used with "--localtime" option did not compute the tz
   offset for the timestamp in question and instead always used the
   current time, which has been corrected.

 * Memory leak in an error codepath has been plugged.

 * "git stash -u" used the contents of the committed version of the
   ".gitignore" file to decide which paths are ignored, even when the
   file has local changes.  The command has been taught to instead use
   the locally modified contents.

 * bash 4.4 or newer gave a warning on NUL byte in command
   substitution done in "git stash"; this has been squelched.

 * "git grep -L" and "git grep --quiet -L" reported different exit
   codes; this has been corrected.

 * When handshake with a subprocess filter notices that the process
   asked for an unknown capability, Git did not report what program
   the offending subprocess was running.  This has been corrected.

 * "git apply" that is used as a better "patch -p1" failed to apply a
   taken from a file with CRLF line endings to a file with CRLF line
   endings.  The root cause was because it misused convert_to_git()
   that tried to do "safe-crlf" processing by looking at the index
   entry at the same path, which is a nonsense---in that mode, "apply"
   is not working on the data in (or derived from) the index at all.
   This has been fixed.

 * Killing "git merge --edit" before the editor returns control left
   the repository in a state with MERGE_MSG but without MERGE_HEAD,
   which incorrectly tells the subsequent "git commit" that there was
   a squash merge in progress.  This has been fixed.

 * "git archive" did not work well with pathspecs and the
   export-ignore attribute.

 * In addition to "cc: <a@dd.re.ss> # cruft", "cc: a@dd.re.ss # cruft"
   was taught to "git send-email" as a valid way to tell it that it
   needs to also send a carbon copy to <a@dd.re.ss> in the trailer
   section.

 * "git branch -M a b" while on a branch that is completely unrelated
   to either branch a or branch b misbehaved when multiple worktree
   was in use.  This has been fixed.
   (merge 31824d180d nd/worktree-kill-parse-ref later to maint).

 * "git gc" and friends when multiple worktrees are used off of a
   single repository did not consider the index and per-worktree refs
   of other worktrees as the root for reachability traversal, making
   objects that are in use only in other worktrees to be subject to
   garbage collection.

 * A regression to "gitk --bisect" by a recent update has been fixed.

 * "git -c submodule.recurse=yes pull" did not work as if the
   "--recurse-submodules" option was given from the command line.
   This has been corrected.

 * Unlike "git commit-tree < file", "git commit-tree -F file" did not
   pass the contents of the file verbatim and instead completed an
   incomplete line at the end, if exists.  The latter has been updated
   to match the behaviour of the former.

 * Many codepaths did not diagnose write failures correctly when disks
   go full, due to their misuse of write_in_full() helper function,
   which have been corrected.
   (merge f48ecd38cb jk/write-in-full-fix later to maint).

 * "git help co" now says "co is aliased to ...", not "git co is".
   (merge b3a8076e0d ks/help-alias-label later to maint).

 * "git archive", especially when used with pathspec, stored an empty
   directory in its output, even though Git itself never does so.
   This has been fixed.

 * API error-proofing which happens to also squelch warnings from GCC.

 * The explanation of the cut-line in the commit log editor has been
   slightly tweaked.
   (merge 8c4b1a3593 ks/commit-do-not-touch-cut-line later to maint).

 * "git gc" tries to avoid running two instances at the same time by
   reading and writing pid/host from and to a lock file; it used to
   use an incorrect fscanf() format when reading, which has been
   corrected.

 * The scripts to drive TravisCI has been reorganized and then an
   optimization to avoid spending cycles on a branch whose tip is
   tagged has been implemented.
   (merge 8376eb4a8f ls/travis-scriptify later to maint).

 * The test linter has been taught that we do not like "echo -e".

 * Code cmp.std.c nitpick.

 * A regression fix for 2.11 that made the code to read the list of
   alternate object stores overrun the end of the string.
   (merge f0f7bebef7 jk/info-alternates-fix later to maint).

 * "git describe --match" learned to take multiple patterns in v2.13
   series, but the feature ignored the patterns after the first one
   and did not work at all.  This has been fixed.

 * "git filter-branch" cannot reproduce a history with a tag without
   the tagger field, which only ancient versions of Git allowed to be
   created.  This has been corrected.
   (merge b2c1ca6b4b ic/fix-filter-branch-to-handle-tag-without-tagger later to maint).

 * "git cat-file --textconv" started segfaulting recently, which
   has been corrected.

 * The built-in pattern to detect the "function header" for HTML did
   not match <H1>..<H6> elements without any attributes, which has
   been fixed.

 * "git mailinfo" was loose in decoding quoted printable and produced
   garbage when the two letters after the equal sign are not
   hexadecimal.  This has been fixed.

 * The machinery to create xdelta used in pack files received the
   sizes of the data in size_t, but lost the higher bits of them by
   storing them in "unsigned int" during the computation, which is
   fixed.

 * The delta format used in the packfile cannot reference data at
   offset larger than what can be expressed in 4-byte, but the
   generator for the data failed to make sure the offset does not
   overflow.  This has been corrected.

 * The documentation for '-X<option>' for merges was misleadingly
   written to suggest that "-s theirs" exists, which is not the case.

 * "git fast-export" with -M/-C option issued "copy" instruction on a
   path that is simultaneously modified, which was incorrect.
   (merge b3e8ca89cf jt/fast-export-copy-modify-fix later to maint).

 * Many codepaths have been updated to squelch -Wsign-compare
   warnings.
   (merge 071bcaab64 rj/no-sign-compare later to maint).

 * Memory leaks in various codepaths have been plugged.
   (merge 4d01a7fa65 ma/leakplugs later to maint).

 * Recent versions of "git rev-parse --parseopt" did not parse the
   option specification that does not have the optional flags (*=?!)
   correctly, which has been corrected.
   (merge a6304fa4c2 bc/rev-parse-parseopt-fix later to maint).

 * The checkpoint command "git fast-import" did not flush updates to
   refs and marks unless at least one object was created since the
   last checkpoint, which has been corrected, as these things can
   happen without any new object getting created.
   (merge 30e215a65c er/fast-import-dump-refs-on-checkpoint later to maint).

 * Spell the name of our system as "Git" in the output from
   request-pull script.

 * Fixes for a handful memory access issues identified by valgrind.

 * Backports a moral equivalent of 2015 fix to the poll() emulation
   from the upstream gnulib to fix occasional breakages on HPE NonStop.

 * Users with "color.ui = always" in their configuration were broken
   by a recent change that made plumbing commands to pay attention to
   them as the patch created internally by "git add -p" were colored
   (heh) and made unusable.  This has been fixed by reverting the
   offending change.

 * In the "--format=..." option of the "git for-each-ref" command (and
   its friends, i.e. the listing mode of "git branch/tag"), "%(atom:)"
   (e.g. "%(refname:)", "%(body:)" used to error out.  Instead, treat
   them as if the colon and an empty string that follows it were not
   there.

 * An ancient bug that made Git misbehave with creation/renaming of
   refs has been fixed.

 * "git fetch <there> <src>:<dst>" allows an object name on the <src>
   side when the other side accepts such a request since Git v2.5, but
   the documentation was left stale.
   (merge 83558a412a jc/fetch-refspec-doc-update later to maint).

 * Update the documentation for "git filter-branch" so that the filter
   options are listed in the same order as they are applied, as
   described in an earlier part of the doc.
   (merge 07c4984508 dg/filter-branch-filter-order-doc later to maint).

 * Other minor doc, test and build updates and code cleanups.
   (merge f094b89a4d ma/parse-maybe-bool later to maint).
   (merge 6cdf8a7929 ma/ts-cleanups later to maint).
   (merge 7560f547e6 ma/up-to-date later to maint).
   (merge 0db3dc75f3 rs/apply-epoch later to maint).
   (merge 276d0e35c0 ma/split-symref-update-fix later to maint).
   (merge f777623514 ks/branch-tweak-error-message-for-extra-args later to maint).
   (merge 33f3c683ec ks/verify-filename-non-option-error-message-tweak later to maint).
   (merge 7cbbf9d6a2 ls/filter-process-delayed later to maint).
   (merge 488aa65c8f wk/merge-options-gpg-sign-doc later to maint).
   (merge e61cb19a27 jc/branch-force-doc-readability-fix later to maint).

----------------------------------------------------------------

Changes since v2.14.0 are as follows:

Adam Dinwoodie (1):
      doc: correct command formatting

Andreas Heiduk (2):
      doc: add missing values "none" and "default" for diff.wsErrorHighlight
      doc: clarify "config --bool" behaviour with empty string

Ann T Ropea (1):
      request-pull: capitalise "Git" to make it a proper noun

Anthony Sottile (1):
      git-grep: correct exit code with --quiet and -L

Ben Boeckel (1):
      Documentation: mention that `eol` can change the dirty status of paths

Brandon Casey (7):
      t1502: demonstrate rev-parse --parseopt option mis-parsing
      rev-parse parseopt: do not search help text for flag chars
      rev-parse parseopt: interpret any whitespace as start of help text
      git-rebase: don't ignore unexpected command line arguments
      t0040,t1502: Demonstrate parse_options bugs
      parse-options: write blank line to correct output stream
      parse-options: only insert newline in help text if needed

Brandon Williams (29):
      repo_read_index: don't discard the index
      repository: have the_repository use the_index
      submodule--helper: teach push-check to handle HEAD
      cache.h: add GITMODULES_FILE macro
      config: add config_from_gitmodules
      submodule: remove submodule.fetchjobs from submodule-config parsing
      submodule: remove fetch.recursesubmodules from submodule-config parsing
      submodule: check for unstaged .gitmodules outside of config parsing
      submodule: check for unmerged .gitmodules outside of config parsing
      submodule: merge repo_read_gitmodules and gitmodules_config
      grep: recurse in-process using 'struct repository'
      t7411: check configuration parsing errors
      submodule: don't use submodule_from_name
      add, reset: ensure submodules can be added or reset
      submodule--helper: don't overlay config in remote_submodule_branch
      submodule--helper: don't overlay config in update-clone
      fetch: don't overlay config with submodule-config
      submodule: don't rely on overlayed config when setting diffopts
      unpack-trees: don't respect submodule.update
      submodule: remove submodule_config callback routine
      diff: stop allowing diff to have submodules configured in .git/config
      submodule-config: remove support for overlaying repository config
      submodule-config: move submodule-config functions to submodule-config.c
      submodule-config: lazy-load a repository's .gitmodules file
      unpack-trees: improve loading of .gitmodules
      submodule: remove gitmodules_config
      clone: teach recursive clones to respect -q
      clang-format: outline the git project's coding style
      Makefile: add style build rule

Christian Couder (3):
      refs: use skip_prefix() in ref_is_hidden()
      sub-process: print the cmd when a capability is unsupported
      sha1-lookup: remove sha1_entry_pos() from header file

Daniel Watkins (1):
      diff-highlight: add clean target to Makefile

David Glasser (1):
      doc: list filter-branch subdirectory-filter first

Derrick Stolee (1):
      cleanup: fix possible overflow errors in binary search

Dimitrios Christidis (1):
      fmt-merge-msg: fix coding style

Eric Blake (1):
      git-contacts: also recognise "Reported-by:"

Eric Rannaud (1):
      fast-import: checkpoint: dump branches/tags/marks even if object_count==0

Evan Zacks (1):
      doc: fix minor typos (extra/duplicated words)

Han-Wen Nienhuys (5):
      submodule.h: typofix
      submodule.c: describe submodule_to_gitdir() in a new comment
      real_path: clarify return value ownership
      read_gitfile_gently: clarify return value ownership.
      string-list.h: move documentation from Documentation/api/ into header

Heiko Voigt (2):
      t5526: fix some broken && chains
      add test for bug in git-mv for recursive submodules

Hielke Christian Braun (1):
      gitweb: skip unreadable subdirectories

Ian Campbell (4):
      filter-branch: reset $GIT_* before cleaning up
      filter-branch: preserve and restore $GIT_AUTHOR_* and $GIT_COMMITTER_*
      filter-branch: stash away ref map in a branch
      filter-branch: use hash-object instead of mktag

Ilya Kantor (1):
      userdiff: fix HTML hunk header regexp

Jameson Miller (1):
      Improve performance of git status --ignored

Jean-Noel Avila (1):
      i18n: add a missing space in message

Jeff Hostetler (1):
      hashmap: add API to disable item counting when threaded

Jeff King (134):
      t1414: document some reflog-walk oddities
      revision: disallow reflog walking with revs->limited
      log: clarify comment about reflog cycles
      log: do not free parents when walking reflog
      get_revision_1(): replace do-while with an early return
      rev-list: check reflog_info before showing usage
      reflog-walk: stop using fake parents
      reflog-walk: apply --since/--until to reflog dates
      check return value of verify_ref_format()
      docs/for-each-ref: update pointer to color syntax
      t: use test_decode_color rather than literal ANSI codes
      ref-filter: simplify automatic color reset
      ref-filter: abstract ref format into its own struct
      ref-filter: move need_color_reset_at_eol into ref_format
      ref-filter: provide a function for parsing sort options
      ref-filter: make parse_ref_filter_atom a private function
      ref-filter: factor out the parsing of sorting atoms
      ref-filter: pass ref_format struct to atom parsers
      color: check color.ui in git_default_config()
      for-each-ref: load config earlier
      rev-list: pass diffopt->use_colors through to pretty-print
      pretty: respect color settings for %C placeholders
      ref-filter: consult want_color() before emitting colors
      strbuf: use designated initializers in STRBUF_INIT
      t/lib-proto-disable: restore protocol.allow after config tests
      t5813: add test for hostname starting with dash
      connect: factor out "looks like command line option" check
      connect: reject dashed arguments for proxy commands
      connect: reject paths that look like command line options
      t6018: flesh out empty input/output rev-list tests
      revision: add rev_input_given flag
      rev-list: don't show usage when we see empty ref patterns
      revision: do not fallback to default when rev_input_given is set
      hashcmp: use memcmp instead of open-coded loop
      sha1_file: drop experimental GIT_USE_LOOKUP search
      trailer: put process_trailers() options into a struct
      interpret-trailers: add an option to show only the trailers
      interpret-trailers: add an option to show only existing trailers
      interpret-trailers: add an option to unfold values
      interpret-trailers: add --parse convenience option
      pretty: move trailer formatting to trailer.c
      t4205: refactor %(trailers) tests
      pretty: support normalization options for %(trailers)
      doc: fix typo in sendemail.identity
      config: use a static lock_file struct
      write_index_as_tree: cleanup tempfile on error
      setup_temporary_shallow: avoid using inactive tempfile
      setup_temporary_shallow: move tempfile struct into function
      verify_signed_buffer: prefer close_tempfile() to close()
      always check return value of close_tempfile
      tempfile: do not delete tempfile on failed close
      lockfile: do not rollback lock on failed close
      tempfile: prefer is_tempfile_active to bare access
      tempfile: handle NULL tempfile pointers gracefully
      tempfile: replace die("BUG") with BUG()
      tempfile: factor out activation
      tempfile: factor out deactivation
      tempfile: robustify cleanup handler
      tempfile: release deactivated strbufs instead of resetting
      tempfile: use list.h for linked list
      tempfile: remove deactivated list entries
      tempfile: auto-allocate tempfiles on heap
      lockfile: update lifetime requirements in documentation
      ref_lock: stop leaking lock_files
      stop leaking lock structs in some simple cases
      test-lib: --valgrind should not override --verbose-log
      test-lib: set LSAN_OPTIONS to abort by default
      add: free leaked pathspec after add_files_to_cache()
      update-index: fix cache entry leak in add_one_file()
      config: plug user_config leak
      reset: make tree counting less confusing
      reset: free allocated tree buffers
      repository: free fields before overwriting them
      set_git_dir: handle feeding gitdir to itself
      rev-parse: don't trim bisect refnames
      system_path: move RUNTIME_PREFIX to a sub-function
      git_extract_argv0_path: do nothing without RUNTIME_PREFIX
      add UNLEAK annotation for reducing leak false positives
      shortlog: skip format/parse roundtrip for internal traversal
      shell: drop git-cvsserver support by default
      archimport: use safe_pipe_capture for user input
      cvsimport: shell-quote variable used in backticks
      config: avoid "write_in_full(fd, buf, len) < len" pattern
      get-tar-commit-id: check write_in_full() return against 0
      avoid "write_in_full(fd, buf, len) != len" pattern
      convert less-trivial versions of "write_in_full() != len"
      pkt-line: check write_in_full() errors against "< 0"
      notes-merge: use ssize_t for write_in_full() return value
      config: flip return value of store_write_*()
      read_pack_header: handle signed/unsigned comparison in read result
      prefix_ref_iterator: break when we leave the prefix
      read_info_alternates: read contents into strbuf
      read_info_alternates: warn on non-trivial errors
      revision: replace "struct cmdline_pathspec" with argv_array
      cat-file: handle NULL object_context.path
      test-line-buffer: simplify command parsing
      curl_trace(): eliminate switch fallthrough
      consistently use "fallthrough" comments in switches
      doc: put literal block delimiter around table
      files-backend: prefer "0" for write_in_full() error check
      notes-merge: drop dead zero-write code
      prefer "!=" when checking read_in_full() result
      avoid looking at errno for short read_in_full() returns
      distinguish error versus short read from read_in_full()
      worktree: use xsize_t to access file size
      worktree: check the result of read_in_full()
      validate_headref: NUL-terminate HEAD buffer
      validate_headref: use skip_prefix for symref parsing
      validate_headref: use get_oid_hex for detached HEADs
      git: add --no-optional-locks option
      test-terminal: set TERM=vt100
      t4015: prefer --color to -c color.diff=always
      t3701: use test-terminal to collect color output
      t7508: use test_terminal for color output
      t7502: use diff.noprefix for --verbose test
      t6006: drop "always" color config tests
      t3203: drop "always" color test
      t3205: use --color instead of color.branch=always
      provide --color option for all ref-filter users
      color: make "always" the same as "auto" in config
      t4015: use --color with --color-moved
      t7301: use test_terminal to check color
      path.c: fix uninitialized memory access
      sha1_loose_object_info: handle errors from unpack_sha1_rest
      t3308: create a real ref directory/file conflict
      refs_resolve_ref_unsafe: handle d/f conflicts for writes
      write_entry: fix leak when retrying delayed filter
      write_entry: avoid reading blobs in CE_RETRY case
      write_entry: untangle symlink and regular-file cases
      diff: fix infinite loop with --color-moved --ignore-space-change
      Revert "color: make "always" the same as "auto" in config"
      Revert "t6006: drop "always" color config tests"
      Revert "color: check color.ui in git_default_config()"
      tag: respect color.ui config

Job Snijders (1):
      gitweb: add 'raw' blob_plain link in history overview

Joel Teichroeb (3):
      stash: add a test for stash create with no files
      stash: add a test for when apply fails during stash branch
      stash: add a test for stashing in a detached state

Johannes Schindelin (14):
      run_processes_parallel: change confusing task_cb convention
      git-gui (MinGW): make use of MSys2's msgfmt
      t3415: verify that an empty instructionFormat is handled as before
      rebase -i: generate the script via rebase--helper
      rebase -i: remove useless indentation
      rebase -i: do not invent onelines when expanding/collapsing SHA-1s
      rebase -i: also expand/collapse the SHA-1s via the rebase--helper
      t3404: relax rebase.missingCommitsCheck tests
      rebase -i: check for missing commits in the rebase--helper
      rebase -i: skip unnecessary picks using the rebase--helper
      t3415: test fixup with wrapped oneline
      rebase -i: rearrange fixup/squash lines using the rebase--helper
      Win32: simplify loading of DLL functions
      clang-format: adjust line break penalties

Johannes Sixt (1):
      sub-process: use child_process.args instead of child_process.argv

Jonathan Nieder (8):
      vcs-svn: remove more unused prototypes and declarations
      vcs-svn: remove custom mode constants
      vcs-svn: remove repo_delete wrapper function
      vcs-svn: move remaining repo_tree functions to fast_export.h
      pack: make packed_git_mru global a value instead of a pointer
      pathspec doc: parse_pathspec does not maintain references to args
      technical doc: add a design doc for hash function transition
      strbuf doc: reuse after strbuf_release is fine

Jonathan Tan (40):
      fsck: remove redundant parse_tree() invocation
      object: remove "used" field from struct object
      fsck: cleanup unused variable
      Documentation: migrate sub-process docs to header
      sub-process: refactor handshake to common function
      tests: ensure fsck fails on corrupt packfiles
      sha1_file: set whence in storage-specific info fn
      sha1_file: remove read_packed_sha1()
      diff: avoid redundantly clearing a flag
      diff: respect MIN_BLOCK_LENGTH for last block
      diff: define block by number of alphanumeric chars
      Doc: clarify that pack-objects makes packs, plural
      pack: move pack name-related functions
      pack: move static state variables
      pack: move pack_report()
      pack: move open_pack_index(), parse_pack_index()
      pack: move release_pack_memory()
      pack: move pack-closing functions
      pack: move use_pack()
      pack: move unuse_pack()
      pack: move add_packed_git()
      pack: move install_packed_git()
      pack: move {,re}prepare_packed_git and approximate_object_count
      pack: move unpack_object_header_buffer()
      pack: move get_size_from_delta()
      pack: move unpack_object_header()
      pack: move clear_delta_base_cache(), packed_object_info(), unpack_entry()
      pack: move nth_packed_object_{sha1,oid}
      pack: move check_pack_index_ptr(), nth_packed_object_offset()
      pack: move find_pack_entry_one(), is_pack_valid()
      pack: move find_sha1_pack()
      pack: move find_pack_entry() and make it global
      pack: move has_sha1_pack()
      pack: move has_pack_index()
      pack: move for_each_packed_object()
      Remove inadvertently added outgoing/packfile.h
      Add t/helper/test-write-cache to .gitignore
      git-compat-util: make UNLEAK less error-prone
      fast-export: do not copy from modified file
      oidmap: map with OID as key

Junio C Hamano (58):
      t1408: add a test of stale packed refs covered by loose refs
      clean.c: use designated initializer
      http.c: http.sslcert and http.sslkey are both pathnames
      connect: reject ssh hostname that begins with a dash
      Git 2.7.6
      Git 2.8.6
      Git 2.9.5
      Git 2.10.4
      Git 2.11.3
      Git 2.12.4
      Git 2.13.5
      Git 2.14.1
      Start post 2.14 cycle
      perl/Git.pm: typofix in a comment
      The first batch of topics after the 2.14 cycle
      diff: retire sane_truncate_fn
      progress: simplify "delayed" progress API
      The second batch post 2.14
      t4200: give us a clean slate after "rerere gc" tests
      t4200: make "rerere gc" test more robust
      t4200: gather "rerere gc" together
      t4200: parameterize "rerere gc" custom expiry test
      rerere: represent time duration in timestamp_t internally
      rerere: allow approxidate in gc.rerereResolved/gc.rerereUnresolved
      The third batch post 2.14
      Prepare for 2.14.2
      The fourth batch post 2.14
      The fifth batch post 2.14
      The sixth batch post 2.14
      RelNotes: further fixes for 2.14.2 from the master front
      The seventh batch post 2.14
      travis: dedent a few scripts that are indented overly deeply
      subprocess: loudly die when subprocess asks for an unsupported capability
      cvsserver: move safe_pipe_capture() to the main package
      cvsserver: use safe_pipe_capture for `constant commands` as well
      gc: call fscanf() with %<len>s, not %<len>c, when reading hostname
      The eighth batch for 2.15
      Git 2.10.5
      Git 2.11.4
      Git 2.12.5
      Git 2.13.6
      Git 2.14.2
      branch: fix "copy" to never touch HEAD
      merge-strategies: avoid implying that "-s theirs" exists
      The ninth batch for 2.15
      The tenth batch for 2.15
      The eleventh batch for 2.15
      The twelfth batch for 2.15
      Git 2.15-rc0
      Prepare for -rc1
      Git 2.15-rc1
      checkout doc: clarify command line args for "checkout paths" mode
      Crawling towards -rc2
      fetch doc: src side of refspec could be full SHA-1
      Preparing for rc2 continues
      branch doc: sprinkle a few commas for readability
      Prepare for 2.14.3
      Git 2.15-rc2

Kaartic Sivaraam (15):
      hook: cleanup script
      hook: name the positional variables
      hook: add sign-off using "interpret-trailers"
      hook: add a simple first example
      commit: check for empty message before the check for untouched template
      hook: use correct logical variable
      t3200: cleanup cruft of a test
      builtin/branch: stop supporting the "--set-upstream" option
      branch: quote branch/ref names to improve readability
      help: change a message to be more precise
      commit-template: change a message to be more intuitive
      t/README: fix typo and grammatically improve a sentence
      doc: camelCase the config variables to improve readability
      branch: change the error messages to be more meaningful
      setup: update error message to be more meaningful

Kevin Daudt (3):
      stash: prevent warning about null bytes in input
      doc/for-each-ref: consistently use '=' to between argument names and values
      doc/for-each-ref: explicitly specify option names

Kevin Willford (9):
      format-patch: have progress option while generating patches
      rebase: turn on progress option by default for format-patch
      commit: skip discarding the index if there is no pre-commit hook
      perf: add test for writing the index
      read-cache: fix memory leak in do_write_index
      read-cache: avoid allocating every ondisk entry when writing
      merge-recursive: fix memory leak
      merge-recursive: remove return value from get_files_dirs
      merge-recursive: change current file dir string_lists to hashmap

Lars Schneider (13):
      t0021: keep filter log files on comparison
      t0021: make debug log file name configurable
      t0021: write "OUT <size>" only on success
      convert: put the flags field before the flag itself for consistent style
      convert: move multiple file filter error handling to separate function
      convert: refactor capabilities negotiation
      convert: add "status=delayed" to filter process protocol
      convert: display progress for filtered objects that have been delayed
      travis-ci: move Travis CI code into dedicated scripts
      travis-ci: skip a branch build if equal tag is present
      travis-ci: fix "skip_branch_tip_with_tag()" string comparison
      entry.c: update cache entry only for existing files
      entry.c: check if file exists after checkout

Manav Rathi (1):
      docs: improve discoverability of exclude pathspec

Martin Koegler (2):
      diff-delta: fix encoding size that would not fit in "unsigned int"
      diff-delta: do not allow delta offset truncation

Martin Ågren (33):
      builtin.h: take over documentation from api-builtin.txt
      git.c: let builtins opt for handling `pager.foo` themselves
      git.c: provide setup_auto_pager()
      t7006: add tests for how git tag paginates
      tag: respect `pager.tag` in list-mode only
      tag: change default of `pager.tag` to "on"
      git.c: ignore pager.* when launching builtin as dashed external
      Doc/git-{push,send-pack}: correct --sign= to --signed=
      t5334: document that git push --signed=1 does not work
      config: introduce git_parse_maybe_bool_text
      config: make git_{config,parse}_maybe_bool equivalent
      treewide: deprecate git_config_maybe_bool, use git_parse_maybe_bool
      parse_decoration_style: drop unused argument `var`
      doc/interpret-trailers: fix "the this" typo
      convert: always initialize attr_action in convert_attrs
      pack-objects: take lock before accessing `remaining`
      strbuf_setlen: don't write to strbuf_slopbuf
      ThreadSanitizer: add suppressions
      Documentation/user-manual: update outdated example output
      treewide: correct several "up-to-date" to "up to date"
      pkt-line: re-'static'-ify buffer in packet_write_fmt_1()
      config: remove git_config_maybe_bool
      refs/files-backend: add longer-scoped copy of string to list
      refs/files-backend: fix memory leak in lock_ref_for_update
      refs/files-backend: correct return value in lock_ref_for_update
      refs/files-backend: add `refname`, not "HEAD", to list
      builtin/commit: fix memory leak in `prepare_index()`
      commit: fix memory leak in `reduce_heads()`
      leak_pending: use `object_array_clear()`, not `free()`
      object_array: use `object_array_clear()`, not `free()`
      object_array: add and use `object_array_pop()`
      pack-bitmap[-write]: use `object_array_clear()`, don't leak
      builtin/: add UNLEAKs

Matthieu Moy (2):
      send-email: fix garbage removal after address
      send-email: don't use Mail::Address, even if available

Max Kirillov (2):
      describe: fix matching to actually match all patterns
      describe: teach --match to handle branches and remotes

Michael Forney (1):
      scripts: use "git foo" not "git-foo"

Michael Haggerty (77):
      add_packed_ref(): teach function to overwrite existing refs
      packed_ref_store: new struct
      packed_ref_store: move `packed_refs_path` here
      packed_ref_store: move `packed_refs_lock` member here
      clear_packed_ref_cache(): take a `packed_ref_store *` parameter
      validate_packed_ref_cache(): take a `packed_ref_store *` parameter
      get_packed_ref_cache(): take a `packed_ref_store *` parameter
      get_packed_refs(): take a `packed_ref_store *` parameter
      add_packed_ref(): take a `packed_ref_store *` parameter
      lock_packed_refs(): take a `packed_ref_store *` parameter
      commit_packed_refs(): take a `packed_ref_store *` parameter
      rollback_packed_refs(): take a `packed_ref_store *` parameter
      get_packed_ref(): take a `packed_ref_store *` parameter
      repack_without_refs(): take a `packed_ref_store *` parameter
      packed_peel_ref(): new function, extracted from `files_peel_ref()`
      packed_ref_store: support iteration
      packed_read_raw_ref(): new function, replacing `resolve_packed_ref()`
      packed-backend: new module for handling packed references
      packed_ref_store: make class into a subclass of `ref_store`
      commit_packed_refs(): report errors rather than dying
      commit_packed_refs(): use a staging file separate from the lockfile
      packed_refs_lock(): function renamed from lock_packed_refs()
      packed_refs_lock(): report errors via a `struct strbuf *err`
      packed_refs_unlock(), packed_refs_is_locked(): new functions
      clear_packed_ref_cache(): don't protest if the lock is held
      commit_packed_refs(): remove call to `packed_refs_unlock()`
      repack_without_refs(): don't lock or unlock the packed refs
      t3210: add some tests of bogus packed-refs file contents
      read_packed_refs(): die if `packed-refs` contains bogus data
      packed_ref_store: handle a packed-refs file that is a symlink
      files-backend: cheapen refname_available check when locking refs
      refs: retry acquiring reference locks for 100ms
      notes: make GET_NIBBLE macro more robust
      load_subtree(): remove unnecessary conditional
      load_subtree(): reduce the scope of some local variables
      load_subtree(): fix incorrect comment
      load_subtree(): separate logic for internal vs. terminal entries
      load_subtree(): check earlier whether an internal node is a tree entry
      load_subtree(): only consider blobs to be potential notes
      get_oid_hex_segment(): return 0 on success
      load_subtree(): combine some common code
      get_oid_hex_segment(): don't pad the rest of `oid`
      hex_to_bytes(): simpler replacement for `get_oid_hex_segment()`
      load_subtree(): declare some variables to be `size_t`
      load_subtree(): check that `prefix_len` is in the expected range
      packed-backend: don't adjust the reference count on lock/unlock
      struct ref_transaction: add a place for backends to store data
      packed_ref_store: implement reference transactions
      packed_delete_refs(): implement method
      files_pack_refs(): use a reference transaction to write packed refs
      prune_refs(): also free the linked list
      files_initial_transaction_commit(): use a transaction for packed refs
      t1404: demonstrate two problems with reference transactions
      files_ref_store: use a transaction to update packed refs
      packed-backend: rip out some now-unused code
      files_transaction_finish(): delete reflogs before references
      ref_iterator: keep track of whether the iterator output is ordered
      packed_ref_cache: add a backlink to the associated `packed_ref_store`
      die_unterminated_line(), die_invalid_line(): new functions
      read_packed_refs(): use mmap to read the `packed-refs` file
      read_packed_refs(): only check for a header at the top of the file
      read_packed_refs(): make parsing of the header line more robust
      for_each_string_list_item: avoid undefined behavior for empty list
      read_packed_refs(): read references with minimal copying
      packed_ref_cache: remember the file-wide peeling state
      mmapped_ref_iterator: add iterator over a packed-refs file
      mmapped_ref_iterator_advance(): no peeled value for broken refs
      packed-backend.c: reorder some definitions
      packed_ref_cache: keep the `packed-refs` file mmapped if possible
      read_packed_refs(): ensure that references are ordered when read
      packed_ref_iterator_begin(): iterate using `mmapped_ref_iterator`
      packed_read_raw_ref(): read the reference from the mmapped buffer
      ref_store: implement `refs_peel_ref()` generically
      packed_ref_store: get rid of the `ref_cache` entirely
      ref_cache: remove support for storing peeled values
      mmapped_ref_iterator: inline into `packed_ref_iterator`
      packed-backend.c: rename a bunch of things and update comments

Michael J Gruber (11):
      Documentation: use proper wording for ref format strings
      Documentation/git-for-each-ref: clarify peeling of tags for --format
      Documentation/git-merge: explain --continue
      merge: clarify call chain
      merge: split write_merge_state in two
      merge: save merge state earlier
      name-rev: change ULONG_MAX to TIME_MAX
      t7004: move limited stack prereq to test-lib
      t6120: test name-rev --all and --stdin
      t6120: clean up state after breaking repo
      t6120: test describe and name-rev with deep repos

Nguyễn Thái Ngọc Duy (17):
      branch: fix branch renaming not updating HEADs correctly
      revision.h: new flag in struct rev_info wrt. worktree-related refs
      refs.c: use is_dir_sep() in resolve_gitlink_ref()
      revision.c: refactor add_index_objects_to_pending()
      revision.c: --indexed-objects add objects from all worktrees
      refs.c: refactor get_submodule_ref_store(), share common free block
      refs: move submodule slash stripping code to get_submodule_ref_store
      refs: add refs_head_ref()
      revision.c: use refs_for_each*() instead of for_each_*_submodule()
      refs.c: move for_each_remote_ref_submodule() to submodule.c
      refs: remove dead for_each_*_submodule()
      revision.c: --all adds HEAD from all worktrees
      files-backend: make reflog iterator go through per-worktree reflog
      revision.c: --reflog add HEAD reflog from all worktrees
      rev-list: expose and document --single-worktree
      refs.c: remove fallback-to-main-store code get_submodule_ref_store()
      refs.c: reindent get_submodule_ref_store()

Nicolas Morey-Chaisemartin (7):
      stash: clean untracked files before reset
      pull: fix cli and config option parsing order
      pull: honor submodule.recurse config option
      imap-send: return with error if curl failed
      imap-send: add wrapper to get server credentials if needed
      imap_send: setup_curl: retreive credentials if not set in config file
      imap-send: use curl by default when possible

Paolo Bonzini (4):
      trailers: export action enums and corresponding lookup functions
      trailers: introduce struct new_trailer_item
      interpret-trailers: add options for actions
      interpret-trailers: fix documentation typo

Patryk Obara (10):
      sha1_file: fix definition of null_sha1
      commit: replace the raw buffer with strbuf in read_graft_line
      commit: allocate array using object_id size
      commit: rewrite read_graft_line
      builtin/hash-object: convert to struct object_id
      read-cache: convert to struct object_id
      sha1_file: convert index_path to struct object_id
      sha1_file: convert index_fd to struct object_id
      sha1_file: convert hash_sha1_file_literally to struct object_id
      sha1_file: convert index_stream to struct object_id

Philip Oakley (4):
      git-gui: remove duplicate entries from .gitconfig's gui.recentrepo
      git gui: cope with duplicates in _get_recentrepo
      git gui: de-dup selected repo from recentrepo history
      git gui: allow for a long recentrepo list

Phillip Wood (7):
      am: remember --rerere-autoupdate setting
      rebase: honor --rerere-autoupdate
      rebase -i: honor --rerere-autoupdate
      t3504: use test_commit
      cherry-pick/revert: remember --rerere-autoupdate
      cherry-pick/revert: reject --rerere-autoupdate when continuing
      am: fix signoff when other trailers are present

Ralf Thielow (2):
      sequencer.c: fix and unify error messages in rearrange_squash()
      sequencer.c: unify an error message

Raman Gupta (1):
      contrib/rerere-train: optionally overwrite existing resolutions

Ramsay Jones (9):
      credential-cache: interpret an ECONNRESET as an EOF
      builtin/add: add detail to a 'cannot chmod' error message
      test-lib: don't use ulimit in test prerequisites on cygwin
      test-lib: use more compact expression in PIPE prerequisite
      t9010-*.sh: skip all tests if the PIPE prereq is missing
      git-compat-util.h: xsize_t() - avoid -Wsign-compare warnings
      commit-slab.h: avoid -Wsign-compare warnings
      cache.h: hex2chr() - avoid -Wsign-compare warnings
      ALLOC_GROW: avoid -Wsign-compare warnings

Randall S. Becker (1):
      poll.c: always set revents, even if to zero

René Scharfe (81):
      tree-diff: don't access hash of NULL object_id pointer
      notes: don't access hash of NULL object_id pointer
      receive-pack: don't access hash of NULL object_id pointer
      bswap: convert to unsigned before shifting in get_be32
      bswap: convert get_be16, get_be32 and put_be32 to inline functions
      add MOVE_ARRAY
      use MOVE_ARRAY
      apply: use COPY_ARRAY and MOVE_ARRAY in update_image()
      ls-files: don't try to prune an empty index
      dir: support platforms that require aligned reads
      pack-objects: remove unnecessary NULL check
      t0001: skip test with restrictive permissions if getpwd(3) respects them
      test-path-utils: handle const parameter of basename and dirname
      t3700: fix broken test under !POSIXPERM
      t4062: use less than 256 repetitions in regex
      sha1_file: avoid comparison if no packed hash matches the first byte
      apply: remove prefix_length member from apply_state
      merge: use skip_prefix()
      win32: plug memory leak on realloc() failure in syslog()
      strbuf: clear errno before calling getdelim(3)
      fsck: free buffers on error in fsck_obj()
      sha1_file: release delta_stack on error in unpack_entry()
      tree-walk: convert fill_tree_descriptor() to object_id
      t1002: stop using sum(1)
      t5001: add tests for export-ignore attributes and exclude pathspecs
      archive: factor out helper functions for handling attributes
      archive: don't queue excluded directories
      commit: remove unused inline function single_parent()
      apply: check date of potential epoch timestamps first
      apply: remove epoch date from regex
      am: release strbufs after use in detect_patch_format()
      am: release strbuf on error return in hg_patch_to_mail()
      am: release strbuf after use in safe_to_abort()
      check-ref-format: release strbuf after use in check_ref_format_branch()
      clean: release strbuf after use in remove_dirs()
      clone: release strbuf after use in remove_junk()
      commit: release strbuf on error return in commit_tree_extended()
      connect: release strbuf on error return in git_connect()
      convert: release strbuf on error return in filter_buffer_or_fd()
      diff: release strbuf after use in diff_summary()
      diff: release strbuf after use in show_rename_copy()
      diff: release strbuf after use in show_stats()
      help: release strbuf on error return in exec_man_konqueror()
      help: release strbuf on error return in exec_man_man()
      help: release strbuf on error return in exec_woman_emacs()
      mailinfo: release strbuf after use in handle_from()
      mailinfo: release strbuf on error return in handle_boundary()
      merge: release strbuf after use in save_state()
      merge: release strbuf after use in write_merge_heads()
      notes: release strbuf after use in notes_copy_from_stdin()
      refs: release strbuf on error return in write_pseudoref()
      remote: release strbuf after use in read_remote_branches()
      remote: release strbuf after use in migrate_file()
      remote: release strbuf after use in set_url()
      send-pack: release strbuf on error return in send_pack()
      sha1_file: release strbuf on error return in index_path()
      shortlog: release strbuf after use in insert_one_record()
      sequencer: release strbuf after use in save_head()
      transport-helper: release strbuf after use in process_connect_service()
      userdiff: release strbuf after use in userdiff_get_textconv()
      utf8: release strbuf on error return in strbuf_utf8_replace()
      vcs-svn: release strbuf after use in end_revision()
      wt-status: release strbuf after use in read_rebase_todolist()
      wt-status: release strbuf after use in wt_longstatus_print_tracking()
      archive: don't add empty directories to archives
      refs: make sha1 output parameter of refs_resolve_ref_unsafe() optional
      refs: pass NULL to refs_resolve_ref_unsafe() if hash is not needed
      refs: pass NULL to resolve_ref_unsafe() if hash is not needed
      mailinfo: don't decode invalid =XY quoted-printable sequences
      refs: pass NULL to refs_resolve_refdup() if hash is not needed
      refs: pass NULL to resolve_refdup() if hash is not needed
      coccinelle: remove parentheses that become unnecessary
      path: use strbuf_add_real_path()
      use strbuf_addstr() for adding strings to strbufs
      graph: use strbuf_addchars() to add spaces
      tag: avoid NULL pointer arithmetic
      repository: use FREE_AND_NULL
      run-command: use ALLOC_ARRAY
      test-stringlist: avoid buffer underrun when sorting nothing
      fsck: handle NULL return of lookup_blob() and lookup_tree()
      .mailmap: normalize name for René Scharfe

Ross Kabus (1):
      commit-tree: do not complete line in -F input

Sahil Dua (2):
      config: create a function to format section headers
      branch: add a --copy (-c) option to go with --move (-m)

Santiago Torres (1):
      t: lib-gpg: flush gpg agent on startup

Stefan Beller (51):
      diff.c: readability fix
      diff.c: move line ending check into emit_hunk_header
      diff.c: factor out diff_flush_patch_all_file_pairs
      diff.c: introduce emit_diff_symbol
      diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_MARKER
      diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_FRAGINFO
      diff.c: emit_diff_symbol learns DIFF_SYMBOL_NO_LF_EOF
      diff.c: migrate emit_line_checked to use emit_diff_symbol
      diff.c: emit_diff_symbol learns DIFF_SYMBOL_WORDS[_PORCELAIN]
      diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_INCOMPLETE
      diff.c: emit_diff_symbol learns DIFF_SYMBOL_FILEPAIR_{PLUS, MINUS}
      diff.c: emit_diff_symbol learns DIFF_SYMBOL_HEADER
      diff.c: emit_diff_symbol learns about DIFF_SYMBOL_BINARY_FILES
      diff.c: emit_diff_symbol learns DIFF_SYMBOL_REWRITE_DIFF
      submodule.c: migrate diff output to use emit_diff_symbol
      diff.c: convert emit_binary_diff_body to use emit_diff_symbol
      diff.c: convert show_stats to use emit_diff_symbol
      diff.c: convert word diffing to use emit_diff_symbol
      diff.c: emit_diff_symbol learns about DIFF_SYMBOL_STAT_SEP
      diff.c: emit_diff_symbol learns about DIFF_SYMBOL_SUMMARY
      diff.c: buffer all output if asked to
      diff.c: color moved lines differently
      diff.c: color moved lines differently, plain mode
      diff.c: add dimming to moved line detection
      diff: document the new --color-moved setting
      attr.c: drop hashmap_cmp_fn cast
      builtin/difftool.c: drop hashmap_cmp_fn cast
      builtin/describe: drop hashmap_cmp_fn cast
      config.c: drop hashmap_cmp_fn cast
      convert/sub-process: drop cast to hashmap_cmp_fn
      patch-ids.c: drop hashmap_cmp_fn cast
      remote.c: drop hashmap_cmp_fn cast
      submodule-config.c: drop hashmap_cmp_fn cast
      name-hash.c: drop hashmap_cmp_fn cast
      t/helper/test-hashmap: use custom data instead of duplicate cmp functions
      commit: convert lookup_commit_graft to struct object_id
      tag: convert gpg_verify_tag to use struct object_id
      t8008: rely on rev-parse'd HEAD instead of sha1 value
      t1200: remove t1200-tutorial.sh
      sha1_file: make read_info_alternates static
      submodule.sh: remove unused variable
      builtin/merge: honor commit-msg hook for merges
      push, fetch: error out for submodule entries not pointing to commits
      replace-objects: evaluate replacement refs without using the object store
      Documentation/githooks: mention merge in commit-msg hook
      Documentation/config: clarify the meaning of submodule.<name>.update
      t7406: submodule.<name>.update command must not be run from .gitmodules
      diff: correct newline in summary for renamed files
      submodule: correct error message for missing commits
      branch: reset instead of release a strbuf
      tests: fix diff order arguments in test_cmp

Stephan Beyer (1):
      clang-format: add a comment about the meaning/status of the

Takashi Iwai (2):
      sha1dc: build git plumbing code more explicitly
      sha1dc: allow building with the external sha1dc library

Taylor Blau (8):
      pretty.c: delimit "%(trailers)" arguments with ","
      t4205: unfold across multiple lines
      doc: 'trailers' is the preferred way to format trailers
      doc: use "`<literal>`"-style quoting for literal strings
      t6300: refactor %(trailers) tests
      ref-filter.c: use trailer_opts to format trailers
      ref-filter.c: parse trailers arguments with %(contents) atom
      ref-filter.c: pass empty-string as NULL to atom parsers

Thomas Braun (1):
      completion: add --broken and --dirty to describe

Thomas Gummerer (3):
      read-cache: fix index corruption with index v4
      refs: strip out not allowed flags from ref_transaction_update
      http-push: fix construction of hex value from path

Todd Zullinger (1):
      api-argv-array.txt: remove broken link to string-list API

Tom G. Christensen (2):
      http: fix handling of missing CURLPROTO_*
      http: use a feature check to enable GSSAPI delegation control

Torsten Bögershausen (3):
      convert: add SAFE_CRLF_KEEP_CRLF
      apply: file commited with CRLF should roundtrip diff and apply
      test-lint: echo -e (or -E) is not portable

Urs Thuermann (1):
      git svn fetch: Create correct commit timestamp when using --localtime

W. Trevor King (1):
      Documentation/merge-options.txt: describe -S/--gpg-sign for 'pull'

William Duclot (1):
      rebase: make resolve message clearer for inexperienced users

brian m. carlson (14):
      builtin/fsck: convert remaining caller of get_sha1 to object_id
      builtin/merge-tree: convert remaining caller of get_sha1 to object_id
      submodule: convert submodule config lookup to use object_id
      remote: convert struct push_cas to struct object_id
      sequencer: convert to struct object_id
      builtin/update_ref: convert to struct object_id
      bisect: convert bisect_checkout to struct object_id
      builtin/unpack-file: convert to struct object_id
      Convert remaining callers of get_sha1 to get_oid.
      sha1_name: convert get_sha1* to get_oid*
      sha1_name: convert GET_SHA1* flags to GET_OID*
      sha1_name: convert uses of 40 to GIT_SHA1_HEXSZ
      vcs-svn: remove unused prototypes
      vcs-svn: rename repo functions to "svn_repo"

joernchen (1):
      cvsserver: use safe_pipe_capture instead of backticks

Ævar Arnfjörð Bjarmason (2):
      branch: add test for -m renaming multiple config sections
      tests: don't give unportable ">" to "test" built-in, use -gt

Øystein Walle (1):
      rev-parse: rev-parse: add --is-shallow-repository

Łukasz Gryglicki (1):
      merge: add a --signoff flag


^ permalink raw reply	[relevance 11%]

* Re: [RFE] Add minimal universal release management capabilities to GIT
  2017-10-20 10:40 ` [RFE] Add minimal universal release management capabilities to GIT nicolas.mailhot
@ 2017-10-20 21:12   ` Stefan Beller
  2017-10-21 13:56     ` nicolas.mailhot
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-10-20 21:12 UTC (permalink / raw)
  To: nicolas.mailhot; +Cc: git

On Fri, Oct 20, 2017 at 3:40 AM,  <nicolas.mailhot@laposte.net> wrote:
> Hi,
>
> Git is a wonderful tool, which has transformed how software is created, and made code sharing and reuse, a lot easier (both between human and software tools).
>
> Unfortunately Git is so good more and more developers start to procrastinate on any activity that happens outside of GIT, starting with cutting releases. The meme "one only needs a git commit hash" is going strong, even infecting institutions like lwn and glibc (https://lwn.net/SubscriberLink/736429/e5a8c8888cc85cc8/)

For release you would want to include more than just "the code" into
the hash, such as compiler versions, environment variables, the phase
of the moon, what have you, that may impact the release build.

> However, the properties that make a hash commit terrific at the local development level, also make it suboptimal as a release ID:
>
> – hashes are not ordered. A human can not guess the sequencing of two hashes, nor can a tool, without access to Git history. Just try to handle "critical security problem in project X, introduced with version Y and fixed in Z" when all you have is some git hashes. hashing-only introduces severe frictions when analysing deployment states.

It sounds to me as if you assume that if X, Y, Z were numbers (or
rather had some order), this can be easily deduced.
The output of git-describe ought to be sufficient for an ordering
scheme to rely on?
However the problem with deployments is that Y might be v1.8.0.1 and Z
might be v2.1.2.0 and X (that you are running) is v2.10.2.0.

> — hashes are not ranked. You can not guess, looking at a hash, if it corresponds to a project stability point, or is in a middle of a refactoring sequence, where things are expected to break. Evaluating every hash of every project you use quickly becomes prohibitive, with the only possible strategy being to just use the latest commit at a given time and pray (and if you are lucky never never update afterwards unless you have lots of fixing and testing time to waste).

That is up to the hash function. One could imagine a hash function
that generates bit patterns that you can use to obtain an order from.
SHA-1 that Git uses is not such a hash, but rather a supposedly secure
hash. One hash value looks like white noise, such that the entropy of
a SHA-1 object name can be estimated with 160 bits.

> – commit mixing is broken by design.

In Git terms a repository is the whole universe.
If you want relationships between different projects, you need to
include these projects e.g. via subtree or submodules.
It scales even up to linux distributions (e.g.
https://github.com/gittup/gittup, includes nethack!)

> One can not adapt the user of a piece of code to changes in this piece of code before those changes are committed in the first place. There will always be moments where the latest commit of a project, is incompatible with the latest commit of downsteam users of this project. It is not a problem in developer environments and automated testers, where you want things to break early and be fixed early. It is a huge problem when you follow the same early commit push strategy for actual production code, where failures are not just a red light in a build farm dashboard, but have real-world consequences. And the more interlinked git repositories you pile on one another, the higher the probability is two commits won't work with one another with failures cascading down

That is software engineering in general, I am not sure how Git relates
to this? Any change that you make (with or without utilizing Git) can
break the downstream world.

> – commits are too granular. Even assuming one could build an automated regression farm powerful enough to build and test instantaneously every commit, it is not possible to instantaneously push those rebuilds to every instance where this code is deployed (even with infinite bandwidth, infinite network reach and infinite network availability).

With infinite resources it would be possible, as the computers are
also infinitely fast. ;)

> Computers would be spending their time resetting to the latest build of one component or another, with no real work being done. So there will always be a distance, between the latest commit in a git repo, and what is actually deployed. And we've seen bare hashes make evaluating this distance difficult
>
> – commits are a bad inter-project synchronisation point. There are too many of them, they are not ranked, everyone is choosing a different commit to deploy, that effectively kills the network effects that helped making traditional releases solid (because distributors used the same release state, and could share feedback and audit results).

There are different strategies. Relevant open source projects (kernel,
glibc, git) are pretty good at not breaking the downstream users with
every commit. So "no matter which version of X you use, it ought to
work fine".
If you want faster velocity, you have to couple the projects more
(submodules or a large repo including everything)

> One could mitigate those problems in a Git management overlay (and, indeed, many try). The problem of those overlays is that they have variable maturity levels, make incompatible choices, cut corners, are not universal like Git, making building anything on top of them of dubious value, with quick fallback to commit hashes, which *are* universal among Git repos. Release handling and versioning really needs to happen in Git itself to be effective.

I am not convinced, yet. As said initially the release handling needs
to take more things into account (compiler version, hardware version
of the fleet, etc) which is usually not tracked in Git. Well you
could, but that is the job of the release management tool, no?

> Please please please add release handling and versioning capabilities to Git itself. Without it some enthusiastic Git adopters are on a fast trajectory to unmanageable hash soup states, even if they are not realising it yet, because the deleterious side effects of giving up on releases only get clear with time.
>
> Here is what such capabilities could look like (people on this list can probably invent something better, I don't care as long as something exists).
>
> 1. "release versions" are first class objects that can be attached to a commit (not just freestyle tags that look like versions, but may be something else entirely). Tools can identify release IDs reliably.

git tags ?

> 2. "release versions" have strong format constrains, that allow humans and tools to deduce their ordering without needing access to something else (full git history or project-specific conventions). The usual string of numbers separated by dots is probably simple and universal enough (if you start to allow letters people will try to use clever schemes like alpha or roman numerals, that break automation). There needs to be at least two numbers in the string to allow tracking patchlevels.

git tags are pretty open ended in their naming. the strictness would
need to be enforced by the given requirement of the environment. (Some
want to have just one integer number going up; others want patch
levels, i.e. 4 ints; yet others want dates?)

> 3. several such objects can be attached to a commit (a project may wish to promote a minor release to major one after it passes QA, versionning history should not be lost).

Multiple git tags can be attached to the same commit. You can even tag
a tag or tag a blob.


> 4. absent human intervention the release state of a repo is initialised at 0.0, for its first commit (tools can rely on at least one release existing in a repo).

An initial repo doesn't have tags, which comes close to 0.

>
> 5. a command, such as "git release", allow a human with control of the repo to set an explicit release version to a commit. Git enforces ordering (refuses versions lower than the latest repo version in git history). The most minor number of the explicit release is necessarily zero.
>
> 6. a command, such as "git release" without argument, allows a human to request setting of a minor patchlevel release version for the current commit. The computed version is:
>    "last release version in git history except most minor number"
>  + "."
>  + "number of commits in history since this version"
> (patchlevel versioning is predictable and decentralized, credits to Willy Tarreau for the idea)
>
> 7. a command, such as "git release bump", allows a human to request setting of a new non-patchlevel release version. The computed version is
>    "last release version in git history except most minor number, incrementing the remaining most minor number"
>  + "."
>  + "0"
>
> 8. a command, such as "git release promote", allows a human to request setting a new more major release version. The computed version is
>    "last release version in git history except most minor number, incrementing the next-to-remaining-most-minor-and-non-zero number, and resetting the remaining-most-minor-and-non-zero number to zero"
>  + "."
>  + "0"

This sounds fairly specific to an environment that you are in, maybe
write git-release for your environment and then open source it. The
world will love it (assuming they have the same environment and
needs).


> 9. a command, such as "git release cut", creates a release archive, named reponame-releaseversion.tar.xz, with a reponame-releaseversion root directory, a reponame-releaseversion/VERSION file containing releaseversion (so automation like makefiles can synchronize itself with the release version state), removing git metadata (.git tree) from the result. If the current commit has several release objects attached the highest one in ordering is chosen. If the current commit is lacking a release object a new minor patchlevel release version is autogenerated. Archive compression format can be overridden in repo config.

git -archive comes to mind, doing a subset here.

> 10. a command, such as "git release translate", outputs the commit hash associated to the version given in argument if it exists, the version associated to the commit hash given in argument if it exists, the version associated to the current commit without argument. If it is translating commit hashes with no version it outputs the various versions that could be computed for this hash by git release, git release bump, git release promote. This is necessary to bridge developer-oriented tools, that will continue to talk in commit hashes, and release/distribution/security-audit oriented tools, that want to manipulate release versions
>
> 11. when no releasing has been done in a repo for some time (I'd suggest 3 months to balance freshness with churn, it can be user-overidable in repo config), git reminds its human masters at the next commit events they should think about stabilizing state and cutting a release.

This is all process specific to your environment. Consider e.g. the
C++ standard committee tracking the C++ Standard in Git .
https://isocpp.org/std/the-committee
They make a release every 10 years or such, so 3 month is off!  Other
examples could be made for 3 month to be way too long.

> So nothing terribly complex, just a lot a small helpers to make releasing easier, less tedious, and cheaper for developers, that formalize, automate, and make easier existing practices of mature software projects, making them accessible to smaller projects. They would make releasing more predictable and reliable for people deploying the code, and easier to consume by higher-level cross-project management tools. That would transform the deployment stage of software just like Git already transformed early code writing and autotest stages.

Integrating with CI and release is definitely important, but Git
itself has no idea about the requirements and environments of the
project specifics, hence it is not developed a lot on that front.
For example the contribution process to git (as well as Linux) is done
via email and partially pull requests. These workflows have pretty
good integration in Git via "format-patch" and "request-pull". There
may be other workflows that are not as nicely integrated as they did
not come to mind to the Git developers.

Thanks,
Stefan


>
> Best regards,
>
> --
> Nicolas Mailhot
>

^ permalink raw reply	[relevance 13%]

* Re: [RFE] Add minimal universal release management capabilities to GIT
  2017-10-20 21:12   ` Stefan Beller
@ 2017-10-21 13:56     ` nicolas.mailhot
  0 siblings, 0 replies; 200+ results
From: nicolas.mailhot @ 2017-10-21 13:56 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git



----- Mail original -----
De: "Stefan Beller" 

>> Unfortunately Git is so good more and more developers start to procrastinate on any activity that happens outside of GIT,
>> starting with cutting releases. The meme "one only needs a git commit hash" is going strong, even infecting institutions
>> like lwn and glibc (https://lwn.net/SubscriberLink/736429/e5a8c8888cc85cc8/)

> For release you would want to include more than just "the code" into
> the hash, such as compiler versions, environment variables, the phase
> of the moon, what have you, that may impact the release build.

Yes and no. Yes because you do want to limit failure cases, and no because it's very easy to overspecify and block code reuse possibilities. Anyway I don't see a strong consensus on how to do those yet, they are very language-specific, and the first step is being able to identify other code you depend on which requires some sort of release id, which is what my message was about. You can't build any compatibility matrix, before being able to name the dimensions of the matrix.

> It sounds to me as if you assume that if X, Y, Z were numbers (or
> rather had some order), this can be easily deduced.

It's a lot more easy to use "option foo was introduced in version 2.3.4 and takes Y parameters" than "option foo was introduced in commit hash #############################################, you have version hash $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$", good luck.

> The output of git-describe ought to be sufficient for an ordering
> scheme to rely on?

That relies on git access to the repo of every bit of code your computer runs. This is not practical past the deployment phase. For deployment the ordering needs to be extracted from all the git data so you only need to manipulate short human and tool-friendly ids. You need low coupling not the strong coupling of git repo access.

>> — hashes are not ranked. You can not guess, looking at a hash, if it corresponds to a project stability point, or is in a
>> middle of a refactoring sequence, where things are expected to break. Evaluating every hash of every project you use
>> quickly becomes prohibitive, with the only possible strategy being to just use the latest commit at a given time and pra
>> (and if you are lucky never never update afterwards unless you have lots of fixing and testing time to waste).

> That is up to the hash function. One could imagine a hash function
> that generates bit patterns that you can use to obtain an order from.

No that is not up to the hash function. First because hashes are too long to be manipulated by humans, and second no hash will ever capture human intent. You need an explicit human action to mark "I want others to use this particular state of my project because I am confident it is solid".

>> – commit mixing is broken by design.

> In Git terms a repository is the whole universe.
> If you want relationships between different projects, you need to
> include these projects e.g. via subtree or submodules.
> It scales even up to linux distributions (e.g.
> https://github.com/gittup/gittup, includes nethack!)

This is still pre-deployment phase. And I wouldn't qualify this as "full linux distro", it's very small scale. If anything it demonstrated than even on a smallish perimeter relying on git alone as it stands today is too hard (3 updates in the whole 2017 year!).

>> One can not adapt the user of a piece of code to changes in this piece of code before those changes are committed in the
>> first place. There will always be moments where the latest commit of a project, is incompatible with the latest commit of
>> downsteam users of this project. It is not a problem in developer environments and automated testers, where you want things >> to break early and be fixed early. It is a huge problem when you follow the same early commit push strategy for actual
>> production code, where failures are not just a red light in a build farm dashboard, but have real-world consequences. And
>> the more interlinked git repositories you pile on one another, the higher the probability is two commits won't work with
>> one another with failures cascading down

> That is software engineering in general, I am not sure how Git relates
> to this? Any change that you make (with or without utilizing Git) can
> break the downstream world.

It's a lot easier to manage when you have discrete release synchronisation point and not just a flow of commits

>> – commits are a bad inter-project synchronisation point. There are too many of them, they are not ranked, everyone is
>> choosing a different commit to deploy, that effectively kills the network effects that helped making traditional releases
>> solid (because distributors used the same release state, and could share feedback and audit results).

> There are different strategies. Relevant open source projects (kernel,
> glibc, git) are pretty good at not breaking the downstream users with
> every commit.

Just pick any random kernel commit during merge windows, try to build/run it and we'll talk again;)
What those projects are pretty good at is a clear release strategy that helps their users identify good project states which are safe to run.

Except, the releasing happens outside git, it's still fairly manual. All I'm proposing is to integrate the basic functions in git to simplify the life of those projects and help smaller projects that want completely intergrated git workflows.

> If you want faster velocity, you have to couple the projects more
> (submodules or a large repo including everything)

Just try do to it. You'll get slower velocity because of the difficulty inherent in managing a large number of projects with strong git coupling. And if you tell me "I'll just not update everything in parallel all the time" you've just reinvented releasing without the help of explicit release states.

> I am not convinced, yet. As said initially the release handling needs
> to take more things into account (compiler version, hardware version
> of the fleet, etc) which is usually not tracked in Git. Well you
> could, but that is the job of the release management tool, no?

Yes and it is so fun to herd hundreds of management tools with different conventions and quirks. About as much fun as managing dozens of scm before most projects settled on git. All commonalities need to migrate in the common git layer to simplify management and release id is the first of those. Besides the first thing those tools want is a way to identify the states to use, they'll be the first consumers of release integration in git.

>> 1. "release versions" are first class objects that can be attached to a commit (not just freestyle tags that look like
>> versions, but may be something else entirely). Tools can identify release IDs reliably.

> git tags ?

Too loosely defined to be relied on by project-agnostic tools. That's what most tools won't ever try to use those. Anything you will define around tags as they stand is unlikely to work on the project of someone else

>> 2. "release versions" have strong format constrains, that allow humans and tools to deduce their ordering without needing
>> access to something else (full git history or project-specific conventions). The usual string of numbers separated by dots
>> is probably simple and universal enough (if you start to allow letters people will try to use clever schemes like alpha or 
>> roman numerals, that break automation). There needs to be at least two numbers in the string to allow tracking patchlevels.

> git tags are pretty open ended in their naming. the strictness would
> need to be enforced by the given requirement of the environment.

You've just lost. You can't build any complex system without some level of shared conventions, if you limit the conventions to the project level you limit what you can build above the project level, starting with tooling

> (Some
> want to have just one integer number going up; others want patch
>levels, i.e. 4 ints; 

That's why I don't propose to set any constrain on the number of levels, except for a minimum (2, because in practice every project will need a point release at some time).

This is sufficient for automation, and pretty much half of what linux distros do to manage complex multi-project systems (convert loosely under-specified versionning to chains of numbers that deb or rpm can understand). And distributions manage to do that because that's already pretty much the release id conventions everyone uses, with minor variations.

> yet others want dates?)

That never worked so well, half the time you miss the date because of delays and it's too late to change the naming everyone expects. But anyway, nothing prevents you from using 2017.10.21.0 as release id, the proposed scheme allows this.

>> 3. several such objects can be attached to a commit (a project may wish to promote a minor release to major one after it
>> passes QA, versionning history should not be lost).

> Multiple git tags can be attached to the same commit. You can even tag
> a tag or tag a blob.

Again the problem with tags is that they can be anything, you can't rely on a tag being a release id, you can't rely on a tag having ordering constrains, you can't build any tooling around those above the project level.

>> 4. absent human intervention the release state of a repo is initialised at 0.0, for its first commit (tools can rely on at >> least one release existing in a repo).

> An initial repo doesn't have tags, which comes close to 0.

And it's not defined anywere so some will insist history starts at 1 or at -52 BC or whatever. Explicit convention enforced by tooling that others tools can rely on trumps implicit convention that can be argued to death all the time.

>> 5. a command, such as "git release", allow a human with control of the repo to set an explicit release version to a commit. 

> This sounds fairly specific to an environment that you are in, maybe
> write git-release for your environment and then open source it. The
> world will love it (assuming they have the same environment and
> needs).

If you take the time to look at it it is not specific, it is generic.

But, anyway yet another project bubble presents no value. The value of conventions is that they are shared, not that they are better than the neighbour's. I'll applaud anything done at git level because all the other tools and humans rely on this level. I'm sick of looking at conversion heuristics between higher-level tools, because they can't rely on scm-level conventions.

>> 9. a command, such as "git release cut", 
> git -archive comes to mind, doing a subset here.

It is not complex to do. The value is not on its complexity, the value is in setting conventions others can rely on.

>> 11. when no releasing has been done in a repo for some time (I'd suggest 3 months to balance freshness with churn, it can >> be user-overidable in repo config), git reminds its human masters at the next commit events they should think about >> stabilizing state and cutting a release.

> This is all process specific to your environment. Consider e.g. the
> C++ standard committee tracking the C++ Standard in Git .
> https://isocpp.org/std/the-committee
> They make a release every 10 years or such, so 3 month is off!

Actually you'll find out that they do intermediary pre-standard releases way more often. 3 months is the average that works for most projects. I don't propose to set it in stone, just as a sane default.


> Integrating with CI and release is definitely important, but Git
> itself has no idea about the requirements and environments of the
> project specifics,

The proposal is not just about CI. The software life does not end when a dev pushes code to CI. You need to identify software during its whole lifecycle, and the id needs to start in the scm, because that's where the lifecycle starts.

Right now the only shared id that does not depend on project environment that git proposes is commit hashes, and it is terrible in post-dev stages of the lifecycle.

Regards,

-- 
Nicolas Mailhot

^ permalink raw reply	[relevance 5%]

* Re: [PATCH 2/2] fetch, push: keep separate lists of submodules and gitlinks
  2017-10-19 18:11             ` [PATCH 2/2] fetch, push: keep separate lists of submodules and gitlinks Stefan Beller
@ 2017-10-23 14:12               ` Heiko Voigt
  2017-10-23 18:05                 ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Heiko Voigt @ 2017-10-23 14:12 UTC (permalink / raw)
  To: Stefan Beller; +Cc: gitster, Jens.Lehmann, bmwill, git, jrnieder

On Thu, Oct 19, 2017 at 11:11:09AM -0700, Stefan Beller wrote:
> Currently when fetching we collect the names of submodules to be fetched
> in a list. As we also want to support fetching 'gitlinks, that happen to
> have a repo checked out at the right place', we'll just pretend that these
> are submodules. We do that by assuming their path is their name. This in
> turn can yield collisions between the name-namespace and the
> path-namespace. (See the previous test for a demonstration.)
> 
> This patch rewrites the code such that we treat the 'real submodule' case
> differently from the 'gitlink, but ok' case. This introduces a bit
> of code duplication, but gets rid of the confusing mapping between names
> and paths.
> 
> The test is incomplete as the long term vision is not achieved yet.
> (which would be fetching both the renamed submodule as well as
> the gitlink thing, putting them in place via e.g. git-pull)
> 
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
> 
>  Heiko,
>  Junio,
> 
>  I assumed the code would ease up a lot more, but now I am undecided if
>  I want to keep arguing as the code is not stopping to be ugly. :)

So we are basically coming to the same conclusion? :) My previous
fallback approach basically did the same but with the old architecture
(without parallel fetch, ...) and was already ugly.

With the fallback on submodule default names approach we can keep most
of the old functionality and keep the code that handles that minimal.

Since there is only a small (IMO quite unlikely) cornercase that could
break peoples expectations I would like to have a look whether anyone
even notices the behavioral change on next or master. If there are
complaints we can still extend and add the two lists.

>  The idea is to treat submodule and gitlinks separately, with submodules
>  supporting renames, and gitlinks as a historic artefact.
>  
>  Sorry for the noise about code ugliness.

Why sorry? For me it is actually interesting to see you basically coming
to the same conclusions.

Cheers Heiko

^ permalink raw reply	[relevance 22%]

* Re: [PATCH 1/2] t5526: check for name/path collision in submodule fetch
  2017-10-19 18:11           ` [PATCH 1/2] t5526: check for name/path collision in submodule fetch Stefan Beller
  2017-10-19 18:11             ` [PATCH 2/2] fetch, push: keep separate lists of submodules and gitlinks Stefan Beller
@ 2017-10-23 14:16             ` Heiko Voigt
  2017-10-23 17:58               ` Stefan Beller
  1 sibling, 1 reply; 200+ results
From: Heiko Voigt @ 2017-10-23 14:16 UTC (permalink / raw)
  To: Stefan Beller; +Cc: gitster, Jens.Lehmann, bmwill, git, jrnieder

On Thu, Oct 19, 2017 at 11:11:08AM -0700, Stefan Beller wrote:
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
> 
> This is just to test the corner case we're discussing.
> Applies on top of origin/hv/fetch-moved-submodules-on-demand.
> 
> 
>  t/t5526-fetch-submodules.sh | 42 ++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 42 insertions(+)
> 
> diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
> index a552ad4ead..c82d519e06 100755
> --- a/t/t5526-fetch-submodules.sh
> +++ b/t/t5526-fetch-submodules.sh
> @@ -571,6 +571,7 @@ test_expect_success 'fetching submodule into a broken repository' '
>  '
>  
>  test_expect_success "fetch new commits when submodule got renamed" '
> +	test_when_finished "rm -rf downstream_rename" &&
>  	git clone . downstream_rename &&
>  	(
>  		cd downstream_rename &&
> @@ -605,4 +606,45 @@ test_expect_success "fetch new commits when submodule got renamed" '
>  	test_cmp expect actual
>  '
>  
> +test_expect_success "warn on submodule name/path clash, but new commits fetched in renamed" '
> +	test_when_finished "rm -rf downstream_rename" &&
> +	git clone . downstream_rename &&
> +	(
> +		cd downstream_rename &&
> +		git submodule update --init &&
> +# NEEDSWORK: we omitted --recursive for the submodule update here since
> +# that does not work. See test 7001 for mv "moving nested submodules"
> +# for details. Once that is fixed we should add the --recursive option
> +# here.
> +		git checkout -b rename &&
> +		git mv submodule submodule_renamed &&
> +		(
> +			cd submodule_renamed &&
> +			git checkout -b rename_sub &&
> +			echo a >a &&
> +			git add a &&
> +			git commit -ma &&
> +			git push origin rename_sub &&
> +			git rev-parse HEAD >../../expect
> +		) &&
> +		git add submodule_renamed &&
> +		git commit -m "update renamed submodule" &&
> +		# produce collision, note that we use no submodule command
> +		git clone ../submodule submodule &&
> +		git add submodule &&

A small note even though this is not meant for inclusion: This would
break when I start working on teaching 'git add' to set default values
in .gitmodules when available.

But I guess I will discover a few other places, when starting that, that
will break in the tests anyway.

Cheers Heiko

^ permalink raw reply	[relevance 15%]

* Re: [PATCH 1/2] t5526: check for name/path collision in submodule fetch
  2017-10-23 14:16             ` [PATCH 1/2] t5526: check for name/path collision in submodule fetch Heiko Voigt
@ 2017-10-23 17:58               ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-10-23 17:58 UTC (permalink / raw)
  To: Heiko Voigt; +Cc: Junio C Hamano, Jens Lehmann, Brandon Williams, git, Jonathan Nieder

>> +             git add submodule &&
>
> A small note even though this is not meant for inclusion: This would
> break when I start working on teaching 'git add' to set default values
> in .gitmodules when available.

Yes. I should have used something like:

    git update-index --add --cacheinfo 160000,$(git rev-parse HEAD),sub

as that is the real plumbing command to change a gitlink.

>
> But I guess I will discover a few other places, when starting that, that
> will break in the tests anyway.

Yes, I have the same suspicion. Thanks for reminding
me to use more plumbing!

Stefan

^ permalink raw reply	[relevance 16%]

* Re: [PATCH 2/2] fetch, push: keep separate lists of submodules and gitlinks
  2017-10-23 14:12               ` Heiko Voigt
@ 2017-10-23 18:05                 ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-10-23 18:05 UTC (permalink / raw)
  To: Heiko Voigt; +Cc: Junio C Hamano, Jens Lehmann, Brandon Williams, git, Jonathan Nieder

On Mon, Oct 23, 2017 at 7:12 AM, Heiko Voigt <hvoigt@hvoigt.net> wrote:
> On Thu, Oct 19, 2017 at 11:11:09AM -0700, Stefan Beller wrote:
>> Currently when fetching we collect the names of submodules to be fetched
>> in a list. As we also want to support fetching 'gitlinks, that happen to
>> have a repo checked out at the right place', we'll just pretend that these
>> are submodules. We do that by assuming their path is their name. This in
>> turn can yield collisions between the name-namespace and the
>> path-namespace. (See the previous test for a demonstration.)
>>
>> This patch rewrites the code such that we treat the 'real submodule' case
>> differently from the 'gitlink, but ok' case. This introduces a bit
>> of code duplication, but gets rid of the confusing mapping between names
>> and paths.
>>
>> The test is incomplete as the long term vision is not achieved yet.
>> (which would be fetching both the renamed submodule as well as
>> the gitlink thing, putting them in place via e.g. git-pull)
>>
>> Signed-off-by: Stefan Beller <sbeller@google.com>
>> ---
>>
>>  Heiko,
>>  Junio,
>>
>>  I assumed the code would ease up a lot more, but now I am undecided if
>>  I want to keep arguing as the code is not stopping to be ugly. :)
>
> So we are basically coming to the same conclusion? :) My previous
> fallback approach basically did the same but with the old architecture
> (without parallel fetch, ...) and was already ugly.

It depends on the conclusion you drew. ;)
Here is my conclusion:
* It would really be nice to have this fallback separated out.
* However for the current state the ugliness of such code trumps the
  more maintainable, long term oriented thing with path/names not
  clashing. I could not spend more time polishing these patches,
  so I could not ask you to do it either
-> I think your patches are fine as is for inclusion
-> We may have #leftoverbits here to clear up the confusion around
  path/names, as well as making the code more pleasant to read.

> With the fallback on submodule default names approach we can keep most
> of the old functionality and keep the code that handles that minimal.
>
> Since there is only a small (IMO quite unlikely) cornercase that could
> break peoples expectations I would like to have a look whether anyone
> even notices the behavioral change on next or master. If there are
> complaints we can still extend and add the two lists.

That sounds good to me.

>
>>  The idea is to treat submodule and gitlinks separately, with submodules
>>  supporting renames, and gitlinks as a historic artefact.
>>
>>  Sorry for the noise about code ugliness.
>
> Why sorry? For me it is actually interesting to see you basically coming
> to the same conclusions.

I thought I might come off awkwardly criticizing code for ugliness without
having a better alternative to show.

Thanks for working on this,
Stefan

>
> Cheers Heiko

^ permalink raw reply	[relevance 16%]

* [WIP PATCH] diff: add option to ignore whitespaces for move detection only
  2017-10-23 22:16 [PATCH 1/5] connect: split git:// setup into a separate function Stefan Beller
@ 2017-10-24  0:09 ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-10-24  0:09 UTC (permalink / raw)
  To: sbeller; +Cc: bmwill, bturner, git, git, gitster, jonathantanmy, jrnieder, peff, wyan

Signed-off-by: Stefan Beller <sbeller@google.com>
---

 diff.c                     |  10 ++--
 diff.h                     |   1 +
 t/t4015-diff-whitespace.sh | 114 ++++++++++++++++++++++++++++++++++++++++++++-
 
See, only 10 lines of code! (and a few more for tests)

We have run out of space in diff_options.flags,touched_flags.
as we 1<<U31 as the highest bit. We could reuse 1<<9 that is currently
unused (removed in 882749a04f (diff: add --word-diff option that
generalizes --color-words, 2010-04-14)). But that postpones the
real fix for only a short amount of time.

Ideas welcome how to extend the flag space. (We cannot just make it
a long either, as some arcane architecures have 32 bit longs.)

Another TODO: documentation

I plan to trim the CC list for any resend that will be needed.

Thanks,
Stefan
 
 3 files changed, 121 insertions(+), 4 deletions(-)

diff --git a/diff.c b/diff.c
index c4a669ffa8..ddb2018307 100644
--- a/diff.c
+++ b/diff.c
@@ -726,7 +726,8 @@ static int next_byte(const char **cp, const char **endp,
 			return (int)' ';
 		}
 
-		if (DIFF_XDL_TST(diffopt, IGNORE_WHITESPACE)) {
+		if (DIFF_XDL_TST(diffopt, IGNORE_WHITESPACE) ||
+		    diffopt->color_moved_ignore_space) {
 			while (*cp < *endp && isspace(**cp))
 				(*cp)++;
 			/*
@@ -751,7 +752,8 @@ static int moved_entry_cmp(const struct diff_options *diffopt,
 	const char *ap = a->es->line, *ae = a->es->line + a->es->len;
 	const char *bp = b->es->line, *be = b->es->line + b->es->len;
 
-	if (!(diffopt->xdl_opts & XDF_WHITESPACE_FLAGS))
+	if (!(diffopt->xdl_opts & XDF_WHITESPACE_FLAGS) &&
+	    !diffopt->color_moved_ignore_space)
 		return a->es->len != b->es->len  || memcmp(ap, bp, a->es->len);
 
 	if (DIFF_XDL_TST(diffopt, IGNORE_WHITESPACE_AT_EOL)) {
@@ -774,7 +776,7 @@ static int moved_entry_cmp(const struct diff_options *diffopt,
 
 static unsigned get_string_hash(struct emitted_diff_symbol *es, struct diff_options *o)
 {
-	if (o->xdl_opts & XDF_WHITESPACE_FLAGS) {
+	if ((o->xdl_opts & XDF_WHITESPACE_FLAGS) || o->color_moved_ignore_space) {
 		static struct strbuf sb = STRBUF_INIT;
 		const char *ap = es->line, *ae = es->line + es->len;
 		int c;
@@ -4660,6 +4662,8 @@ int diff_opt_parse(struct diff_options *options,
 		DIFF_XDL_CLR(options, NEED_MINIMAL);
 	else if (!strcmp(arg, "-w") || !strcmp(arg, "--ignore-all-space"))
 		DIFF_XDL_SET(options, IGNORE_WHITESPACE);
+	else if (!strcmp(arg, "--ignore-all-space-in-move-detection"))
+		options->color_moved_ignore_space = 1;
 	else if (!strcmp(arg, "-b") || !strcmp(arg, "--ignore-space-change"))
 		DIFF_XDL_SET(options, IGNORE_WHITESPACE_CHANGE);
 	else if (!strcmp(arg, "--ignore-space-at-eol"))
diff --git a/diff.h b/diff.h
index aca150ba2e..6ba3f53bbd 100644
--- a/diff.h
+++ b/diff.h
@@ -196,6 +196,7 @@ struct diff_options {
 	} color_moved;
 	#define COLOR_MOVED_DEFAULT COLOR_MOVED_ZEBRA
 	#define COLOR_MOVED_MIN_ALNUM_COUNT 20
+	int color_moved_ignore_space;
 };
 
 void diff_emit_submodule_del(struct diff_options *o, const char *line);
diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index 6c9a93b734..d7ee3aabf2 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -1677,7 +1677,119 @@ test_expect_success 'move detection with submodules' '
 
 	# nor did we mess with it another way
 	git diff --submodule=diff --color | test_decode_color >expect &&
-	test_cmp expect decoded_actual
+	test_cmp expect decoded_actual &&
+	rm -rf bananas &&
+	git submodule deinit bananas
+'
+
+test_expect_success 'move detection only ignores white spaces' '
+	git reset --hard &&
+	q_to_tab <<-\EOF >function.c &&
+	int func()
+	{
+	Qif (foo) {
+	QQ// this part of the function
+	QQ// function will be very long
+	QQ// indeed. We must exceed both
+	QQ// per-line and number of line
+	QQ// minimums
+	QQ;
+	Q}
+	Qbaz();
+	Qbar();
+	Q// more unrelated stuff
+	}
+	EOF
+	git add function.c &&
+	git commit -m "add function.c" &&
+	q_to_tab <<-\EOF >function.c &&
+	int do_foo()
+	{
+	Q// this part of the function
+	Q// function will be very long
+	Q// indeed. We must exceed both
+	Q// per-line and number of line
+	Q// minimums
+	Q;
+	}
+
+	int func()
+	{
+	Qif (foo)
+	QQdo_foo();
+	Qbaz();
+	Qbar();
+	Q// more unrelated stuff
+	}
+	EOF
+
+	# Make sure we get a different diff using -w ("moved function header")
+	git diff --color --color-moved -w |
+		grep -v "index" |
+		test_decode_color >actual &&
+	q_to_tab <<-\EOF >expected &&
+	<BOLD>diff --git a/function.c b/function.c<RESET>
+	<BOLD>--- a/function.c<RESET>
+	<BOLD>+++ b/function.c<RESET>
+	<CYAN>@@ -1,6 +1,5 @@<RESET>
+	<RED>-int func()<RESET>
+	<GREEN>+<RESET><GREEN>int do_foo()<RESET>
+	 {<RESET>
+	<RED>-	if (foo) {<RESET>
+	 Q// this part of the function<RESET>
+	 Q// function will be very long<RESET>
+	 Q// indeed. We must exceed both<RESET>
+	<CYAN>@@ -8,6 +7,11 @@<RESET> <RESET>int func()<RESET>
+	 Q// minimums<RESET>
+	 Q;<RESET>
+	 }<RESET>
+	<GREEN>+<RESET>
+	<GREEN>+<RESET><GREEN>int func()<RESET>
+	<GREEN>+<RESET><GREEN>{<RESET>
+	<GREEN>+<RESET>Q<GREEN>if (foo)<RESET>
+	<GREEN>+<RESET>QQ<GREEN>do_foo();<RESET>
+	 Qbaz();<RESET>
+	 Qbar();<RESET>
+	 Q// more unrelated stuff<RESET>
+	EOF
+	test_cmp expected actual &&
+
+	# And now ignoring white space only in the move detection
+	git diff --color --color-moved --ignore-all-space-in-move-detection |
+		grep -v "index" |
+		test_decode_color >actual &&
+	q_to_tab <<-\EOF >expected &&
+	<BOLD>diff --git a/function.c b/function.c<RESET>
+	<BOLD>--- a/function.c<RESET>
+	<BOLD>+++ b/function.c<RESET>
+	<CYAN>@@ -1,13 +1,17 @@<RESET>
+	<GREEN>+<RESET><GREEN>int do_foo()<RESET>
+	<GREEN>+<RESET><GREEN>{<RESET>
+	<BOLD;CYAN>+<RESET>Q<BOLD;CYAN>// this part of the function<RESET>
+	<BOLD;CYAN>+<RESET>Q<BOLD;CYAN>// function will be very long<RESET>
+	<BOLD;CYAN>+<RESET>Q<BOLD;CYAN>// indeed. We must exceed both<RESET>
+	<BOLD;CYAN>+<RESET>Q<BOLD;CYAN>// per-line and number of line<RESET>
+	<BOLD;CYAN>+<RESET>Q<BOLD;CYAN>// minimums<RESET>
+	<BOLD;CYAN>+<RESET>Q<BOLD;CYAN>;<RESET>
+	<BOLD;CYAN>+<RESET><BOLD;CYAN>}<RESET>
+	<GREEN>+<RESET>
+	 int func()<RESET>
+	 {<RESET>
+	<RED>-Qif (foo) {<RESET>
+	<BOLD;MAGENTA>-QQ// this part of the function<RESET>
+	<BOLD;MAGENTA>-QQ// function will be very long<RESET>
+	<BOLD;MAGENTA>-QQ// indeed. We must exceed both<RESET>
+	<BOLD;MAGENTA>-QQ// per-line and number of line<RESET>
+	<BOLD;MAGENTA>-QQ// minimums<RESET>
+	<BOLD;MAGENTA>-QQ;<RESET>
+	<BOLD;MAGENTA>-Q}<RESET>
+	<GREEN>+<RESET>Q<GREEN>if (foo)<RESET>
+	<GREEN>+<RESET>QQ<GREEN>do_foo();<RESET>
+	 Qbaz();<RESET>
+	 Qbar();<RESET>
+	 Q// more unrelated stuff<RESET>
+	EOF
+	test_cmp expected actual
 '
 
 test_done
-- 
2.15.0.rc2.6.g953226eb5f


^ permalink raw reply	[relevance 14%]

* Re: [PATCH] status: do not get confused by submodules in excluded directories
  2017-10-24 12:15   ` Heiko Voigt
@ 2017-10-24 15:34     ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-10-24 15:34 UTC (permalink / raw)
  To: Heiko Voigt; +Cc: Junio C Hamano, Johannes Schindelin, git

On Tue, Oct 24, 2017 at 5:15 AM, Heiko Voigt <hvoigt@hvoigt.net> wrote:

> Looks good to me.

Same here,

Thanks,
Stefan

^ permalink raw reply	[relevance 16%]

* [PATCH 1/5] fnv: migrate code from hashmap to fnv
  2017-10-24 19:02 [PATCH 0/4] (x)diff cleanup: remove duplicate code Stefan Beller
@ 2017-10-24 23:45 ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-10-24 23:45 UTC (permalink / raw)
  To: sbeller; +Cc: git, peff, l.s.r

The functions regarding the Fowler–Noll–Vo hash function are put in a
separate compilation unit, as it is logically different from the hashmap
functionality.

Signed-off-by: Stefan Beller <sbeller@google.com>
---

This may still be an interesting cleanup on its own.

Thanks,
Stefan

 Makefile                |  1 +
 attr.c                  |  1 +
 builtin/fast-export.c   |  1 +
 diff.c                  |  1 +
 fnv.c                   | 64 +++++++++++++++++++++++++++++++++++++++++++++++++
 fnv.h                   | 20 ++++++++++++++++
 hashmap.c               | 64 +------------------------------------------------
 hashmap.h               | 15 ------------
 merge-recursive.c       |  1 +
 remote.c                |  1 +
 submodule-config.c      |  1 +
 t/helper/test-hashmap.c |  1 +
 12 files changed, 93 insertions(+), 78 deletions(-)
 create mode 100644 fnv.c
 create mode 100644 fnv.h

diff --git a/Makefile b/Makefile
index cd75985991..8e1c5988f3 100644
--- a/Makefile
+++ b/Makefile
@@ -793,6 +793,7 @@ LIB_OBJS += ewah/ewah_io.o
 LIB_OBJS += ewah/ewah_rlw.o
 LIB_OBJS += exec_cmd.o
 LIB_OBJS += fetch-pack.o
+LIB_OBJS += fnv.o
 LIB_OBJS += fsck.o
 LIB_OBJS += gettext.o
 LIB_OBJS += gpg-interface.o
diff --git a/attr.c b/attr.c
index dfc3a558d8..2e4217c4f1 100644
--- a/attr.c
+++ b/attr.c
@@ -16,6 +16,7 @@
 #include "utf8.h"
 #include "quote.h"
 #include "thread-utils.h"
+#include "fnv.h"
 
 const char git_attr__true[] = "(builtin)true";
 const char git_attr__false[] = "\0(builtin)false";
diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 2fb60d6d48..62f4010510 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -21,6 +21,7 @@
 #include "quote.h"
 #include "remote.h"
 #include "blob.h"
+#include "fnv.h"
 
 static const char *fast_export_usage[] = {
 	N_("git fast-export [rev-list-opts]"),
diff --git a/diff.c b/diff.c
index c4a669ffa8..a23f4521fb 100644
--- a/diff.c
+++ b/diff.c
@@ -22,6 +22,7 @@
 #include "argv-array.h"
 #include "graph.h"
 #include "packfile.h"
+#include "fnv.h"
 
 #ifdef NO_FAST_WORKING_DIRECTORY
 #define FAST_WORKING_DIRECTORY 0
diff --git a/fnv.c b/fnv.c
new file mode 100644
index 0000000000..b4cbf39f0a
--- /dev/null
+++ b/fnv.c
@@ -0,0 +1,64 @@
+#include "fnv.h"
+
+#define FNV32_BASE ((unsigned int) 0x811c9dc5)
+#define FNV32_PRIME ((unsigned int) 0x01000193)
+
+unsigned int strhash(const char *str)
+{
+	unsigned int c, hash = FNV32_BASE;
+	while ((c = (unsigned char) *str++))
+		hash = (hash * FNV32_PRIME) ^ c;
+	return hash;
+}
+
+unsigned int strihash(const char *str)
+{
+	unsigned int c, hash = FNV32_BASE;
+	while ((c = (unsigned char) *str++)) {
+		if (c >= 'a' && c <= 'z')
+			c -= 'a' - 'A';
+		hash = (hash * FNV32_PRIME) ^ c;
+	}
+	return hash;
+}
+
+unsigned int memhash(const void *buf, size_t len)
+{
+	unsigned int hash = FNV32_BASE;
+	unsigned char *ucbuf = (unsigned char *) buf;
+	while (len--) {
+		unsigned int c = *ucbuf++;
+		hash = (hash * FNV32_PRIME) ^ c;
+	}
+	return hash;
+}
+
+unsigned int memihash(const void *buf, size_t len)
+{
+	unsigned int hash = FNV32_BASE;
+	unsigned char *ucbuf = (unsigned char *) buf;
+	while (len--) {
+		unsigned int c = *ucbuf++;
+		if (c >= 'a' && c <= 'z')
+			c -= 'a' - 'A';
+		hash = (hash * FNV32_PRIME) ^ c;
+	}
+	return hash;
+}
+
+/*
+ * Incoporate another chunk of data into a memihash
+ * computation.
+ */
+unsigned int memihash_cont(unsigned int hash_seed, const void *buf, size_t len)
+{
+	unsigned int hash = hash_seed;
+	unsigned char *ucbuf = (unsigned char *) buf;
+	while (len--) {
+		unsigned int c = *ucbuf++;
+		if (c >= 'a' && c <= 'z')
+			c -= 'a' - 'A';
+		hash = (hash * FNV32_PRIME) ^ c;
+	}
+	return hash;
+}
diff --git a/fnv.h b/fnv.h
new file mode 100644
index 0000000000..b425c85c66
--- /dev/null
+++ b/fnv.h
@@ -0,0 +1,20 @@
+#ifndef FNV_H
+#define FNV_H
+
+#include <stdlib.h>
+/*
+ * Ready-to-use hash functions for strings, using the FNV-1 algorithm (see
+ * http://www.isthe.com/chongo/tech/comp/fnv).
+ * `strhash` and `strihash` take 0-terminated strings, while `memhash` and
+ * `memihash` operate on arbitrary-length memory.
+ * `strihash` and `memihash` are case insensitive versions.
+ * `memihash_cont` is a variant of `memihash` that allows a computation to be
+ * continued with another chunk of data.
+ */
+extern unsigned int strhash(const char *buf);
+extern unsigned int strihash(const char *buf);
+extern unsigned int memhash(const void *buf, size_t len);
+extern unsigned int memihash(const void *buf, size_t len);
+extern unsigned int memihash_cont(unsigned int hash_seed, const void *buf, size_t len);
+
+#endif
diff --git a/hashmap.c b/hashmap.c
index d42f01ff5a..1605edbbc3 100644
--- a/hashmap.c
+++ b/hashmap.c
@@ -3,69 +3,7 @@
  */
 #include "cache.h"
 #include "hashmap.h"
-
-#define FNV32_BASE ((unsigned int) 0x811c9dc5)
-#define FNV32_PRIME ((unsigned int) 0x01000193)
-
-unsigned int strhash(const char *str)
-{
-	unsigned int c, hash = FNV32_BASE;
-	while ((c = (unsigned char) *str++))
-		hash = (hash * FNV32_PRIME) ^ c;
-	return hash;
-}
-
-unsigned int strihash(const char *str)
-{
-	unsigned int c, hash = FNV32_BASE;
-	while ((c = (unsigned char) *str++)) {
-		if (c >= 'a' && c <= 'z')
-			c -= 'a' - 'A';
-		hash = (hash * FNV32_PRIME) ^ c;
-	}
-	return hash;
-}
-
-unsigned int memhash(const void *buf, size_t len)
-{
-	unsigned int hash = FNV32_BASE;
-	unsigned char *ucbuf = (unsigned char *) buf;
-	while (len--) {
-		unsigned int c = *ucbuf++;
-		hash = (hash * FNV32_PRIME) ^ c;
-	}
-	return hash;
-}
-
-unsigned int memihash(const void *buf, size_t len)
-{
-	unsigned int hash = FNV32_BASE;
-	unsigned char *ucbuf = (unsigned char *) buf;
-	while (len--) {
-		unsigned int c = *ucbuf++;
-		if (c >= 'a' && c <= 'z')
-			c -= 'a' - 'A';
-		hash = (hash * FNV32_PRIME) ^ c;
-	}
-	return hash;
-}
-
-/*
- * Incoporate another chunk of data into a memihash
- * computation.
- */
-unsigned int memihash_cont(unsigned int hash_seed, const void *buf, size_t len)
-{
-	unsigned int hash = hash_seed;
-	unsigned char *ucbuf = (unsigned char *) buf;
-	while (len--) {
-		unsigned int c = *ucbuf++;
-		if (c >= 'a' && c <= 'z')
-			c -= 'a' - 'A';
-		hash = (hash * FNV32_PRIME) ^ c;
-	}
-	return hash;
-}
+#include "fnv.h"
 
 #define HASHMAP_INITIAL_SIZE 64
 /* grow / shrink by 2^2 */
diff --git a/hashmap.h b/hashmap.h
index 7cb29a6aed..96176f7d8c 100644
--- a/hashmap.h
+++ b/hashmap.h
@@ -97,21 +97,6 @@
  * }
  */
 
-/*
- * Ready-to-use hash functions for strings, using the FNV-1 algorithm (see
- * http://www.isthe.com/chongo/tech/comp/fnv).
- * `strhash` and `strihash` take 0-terminated strings, while `memhash` and
- * `memihash` operate on arbitrary-length memory.
- * `strihash` and `memihash` are case insensitive versions.
- * `memihash_cont` is a variant of `memihash` that allows a computation to be
- * continued with another chunk of data.
- */
-extern unsigned int strhash(const char *buf);
-extern unsigned int strihash(const char *buf);
-extern unsigned int memhash(const void *buf, size_t len);
-extern unsigned int memihash(const void *buf, size_t len);
-extern unsigned int memihash_cont(unsigned int hash_seed, const void *buf, size_t len);
-
 /*
  * Converts a cryptographic hash (e.g. SHA-1) into an int-sized hash code
  * for use in hash tables. Cryptographic hashes are supposed to have
diff --git a/merge-recursive.c b/merge-recursive.c
index 1d3f8f0d22..d1fb2f5f3d 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -22,6 +22,7 @@
 #include "attr.h"
 #include "merge-recursive.h"
 #include "dir.h"
+#include "fnv.h"
 #include "submodule.h"
 
 struct path_hashmap_entry {
diff --git a/remote.c b/remote.c
index b220f0dfc6..24957ba32c 100644
--- a/remote.c
+++ b/remote.c
@@ -10,6 +10,7 @@
 #include "string-list.h"
 #include "mergesort.h"
 #include "argv-array.h"
+#include "fnv.h"
 
 enum map_direction { FROM_SRC, FROM_DST };
 
diff --git a/submodule-config.c b/submodule-config.c
index 2aa8a1747f..432423f876 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -5,6 +5,7 @@
 #include "submodule.h"
 #include "strbuf.h"
 #include "parse-options.h"
+#include "fnv.h"
 
 /*
  * submodule cache lookup structure
diff --git a/t/helper/test-hashmap.c b/t/helper/test-hashmap.c
index 1145d51671..b5ac97886f 100644
--- a/t/helper/test-hashmap.c
+++ b/t/helper/test-hashmap.c
@@ -1,5 +1,6 @@
 #include "git-compat-util.h"
 #include "hashmap.h"
+#include "fnv.h"
 
 struct test_entry
 {
-- 
2.15.0.rc2.6.g953226eb5f


^ permalink raw reply	[relevance 15%]

* [PATCH 1/2] diff: decouple white space treatment for move detection from generic option
  2017-10-25 22:46 [PATCH 0/2] color-moved: ignore all space changes by default Stefan Beller
@ 2017-10-25 22:46 ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-10-25 22:46 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

In the original implementation of the move dection logic we assumed that
the choice for ignoring white space changes is the same for the move
detection as it is for the generic diff.

It turns out, users want to have different choices regarding white spaces
for the move detection and the generic diff.  For example when reviewing
a patch that is sent to the mailing list (which doesn't ignore white space
changes at all; it is an exact patch) refactors some nested code out
into a separate function, we want to have the move detection kick in
despite different indentation levels of the old and new code.

This patch enables the user to have a different choice for ignoring
white spaces in the move detection.

The example given is covered in a test. Convert the existing tests
to be more explicit on their choice of white space behavior in the
move detection as we'll change the default shortly.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/diff-options.txt |  13 ++++
 diff.c                         |  17 ++++-
 diff.h                         |   1 +
 t/t4015-diff-whitespace.sh     | 150 +++++++++++++++++++++++++++++++++++++++--
 4 files changed, 172 insertions(+), 9 deletions(-)

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index a88c76741e..1000b53b84 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -265,6 +265,19 @@ dimmed_zebra::
 	blocks are considered interesting, the rest is uninteresting.
 --
 
+--color-moved-[no-]ignore-all-space::
+	Ignore whitespace when comparing lines when performing the move
+	detection for --color-moved.  This ignores differences even if
+	one line has whitespace where the other line has none.
+--color-moved-[no-]ignore-space-change::
+	Ignore changes in amount of whitespace when performing the move
+	detection for --color-moved.  This ignores whitespace
+	at line end, and considers all other sequences of one or
+	more whitespace characters to be equivalent.
+--color-moved-[no-]ignore-space-at-eol::
+	Ignore changes in whitespace at EOL when performing the move
+	detection for --color-moved.
+
 --word-diff[=<mode>]::
 	Show a word diff, using the <mode> to delimit changed words.
 	By default, words are delimited by whitespace; see
diff --git a/diff.c b/diff.c
index e6814b9e9c..2580315ab9 100644
--- a/diff.c
+++ b/diff.c
@@ -714,7 +714,7 @@ static int moved_entry_cmp(const struct diff_options *diffopt,
 {
 	return !xdiff_compare_lines(a->es->line, a->es->len,
 				    b->es->line, b->es->len,
-				    diffopt->xdl_opts);
+				    diffopt->color_moved_ignore_space);
 }
 
 static struct moved_entry *prepare_entry(struct diff_options *o,
@@ -723,7 +723,8 @@ static struct moved_entry *prepare_entry(struct diff_options *o,
 	struct moved_entry *ret = xmalloc(sizeof(*ret));
 	struct emitted_diff_symbol *l = &o->emitted_symbols->buf[line_no];
 
-	ret->ent.hash = xdiff_hash_string(l->line, l->len, o->xdl_opts);
+	ret->ent.hash = xdiff_hash_string(l->line, l->len,
+					  o->color_moved_ignore_space);
 	ret->es = l;
 	ret->next_line = NULL;
 
@@ -4592,6 +4593,18 @@ int diff_opt_parse(struct diff_options *options,
 		DIFF_XDL_SET(options, IGNORE_WHITESPACE_AT_EOL);
 	else if (!strcmp(arg, "--ignore-blank-lines"))
 		DIFF_XDL_SET(options, IGNORE_BLANK_LINES);
+	else if (!strcmp(arg, "--color-moved-no-ignore-all-space"))
+		options->color_moved_ignore_space &= ~XDF_IGNORE_WHITESPACE;
+	else if (!strcmp(arg, "--color-moved-no-ignore-space-change"))
+		options->color_moved_ignore_space &= ~XDF_IGNORE_WHITESPACE_CHANGE;
+	else if (!strcmp(arg, "--color-moved-no-ignore-space-at-eol"))
+		options->color_moved_ignore_space &= ~XDF_IGNORE_WHITESPACE_AT_EOL;
+	else if (!strcmp(arg, "--color-moved-ignore-all-space"))
+		options->color_moved_ignore_space |= XDF_IGNORE_WHITESPACE;
+	else if (!strcmp(arg, "--color-moved-ignore-space-change"))
+		options->color_moved_ignore_space |= XDF_IGNORE_WHITESPACE_CHANGE;
+	else if (!strcmp(arg, "--color-moved-ignore-space-at-eol"))
+		options->color_moved_ignore_space |= XDF_IGNORE_WHITESPACE_AT_EOL;
 	else if (!strcmp(arg, "--indent-heuristic"))
 		DIFF_XDL_SET(options, INDENT_HEURISTIC);
 	else if (!strcmp(arg, "--no-indent-heuristic"))
diff --git a/diff.h b/diff.h
index aca150ba2e..6ba3f53bbd 100644
--- a/diff.h
+++ b/diff.h
@@ -196,6 +196,7 @@ struct diff_options {
 	} color_moved;
 	#define COLOR_MOVED_DEFAULT COLOR_MOVED_ZEBRA
 	#define COLOR_MOVED_MIN_ALNUM_COUNT 20
+	int color_moved_ignore_space;
 };
 
 void diff_emit_submodule_del(struct diff_options *o, const char *line);
diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index 6c9a93b734..4ef4d6934a 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -1350,7 +1350,10 @@ test_expect_success 'move detection ignoring whitespace ' '
 	line 4
 	line 5
 	EOF
-	git diff HEAD --no-renames --color-moved --color |
+	git diff HEAD --no-renames --color-moved --color \
+		--color-moved-no-ignore-all-space \
+		--color-moved-no-ignore-space-change \
+		--color-moved-no-ignore-space-at-eol |
 		grep -v "index" |
 		test_decode_color >actual &&
 	cat <<-\EOF >expected &&
@@ -1374,7 +1377,10 @@ test_expect_success 'move detection ignoring whitespace ' '
 	EOF
 	test_cmp expected actual &&
 
-	git diff HEAD --no-renames -w --color-moved --color |
+	git diff HEAD --no-renames --color-moved --color \
+		--color-moved-ignore-all-space \
+		--color-moved-no-ignore-space-change \
+		--color-moved-no-ignore-space-at-eol |
 		grep -v "index" |
 		test_decode_color >actual &&
 	cat <<-\EOF >expected &&
@@ -1414,7 +1420,10 @@ test_expect_success 'move detection ignoring whitespace changes' '
 	line 5
 	EOF
 
-	git diff HEAD --no-renames --color-moved --color |
+	git diff HEAD --no-renames --color-moved --color \
+		--color-moved-no-ignore-all-space \
+		--color-moved-no-ignore-space-change \
+		--color-moved-no-ignore-space-at-eol |
 		grep -v "index" |
 		test_decode_color >actual &&
 	cat <<-\EOF >expected &&
@@ -1438,7 +1447,10 @@ test_expect_success 'move detection ignoring whitespace changes' '
 	EOF
 	test_cmp expected actual &&
 
-	git diff HEAD --no-renames -b --color-moved --color |
+	git diff HEAD --no-renames --color-moved --color \
+		--color-moved-no-ignore-all-space \
+		--color-moved-no-ignore-space-at-eol \
+		--color-moved-ignore-space-change |
 		grep -v "index" |
 		test_decode_color >actual &&
 	cat <<-\EOF >expected &&
@@ -1481,7 +1493,10 @@ test_expect_success 'move detection ignoring whitespace at eol' '
 	# avoid cluttering the output with complaints about our eol whitespace
 	test_config core.whitespace -blank-at-eol &&
 
-	git diff HEAD --no-renames --color-moved --color |
+	git diff HEAD --no-renames --color-moved --color \
+		--color-moved-no-ignore-all-space \
+		--color-moved-no-ignore-space-change \
+		--color-moved-no-ignore-space-at-eol |
 		grep -v "index" |
 		test_decode_color >actual &&
 	cat <<-\EOF >expected &&
@@ -1505,7 +1520,10 @@ test_expect_success 'move detection ignoring whitespace at eol' '
 	EOF
 	test_cmp expected actual &&
 
-	git diff HEAD --no-renames --ignore-space-at-eol --color-moved --color |
+	git diff HEAD --no-renames --color-moved --color \
+		--color-moved-no-ignore-all-space \
+		--color-moved-no-ignore-space-change \
+		--color-moved-ignore-space-at-eol |
 		grep -v "index" |
 		test_decode_color >actual &&
 	cat <<-\EOF >expected &&
@@ -1677,7 +1695,125 @@ test_expect_success 'move detection with submodules' '
 
 	# nor did we mess with it another way
 	git diff --submodule=diff --color | test_decode_color >expect &&
-	test_cmp expect decoded_actual
+	test_cmp expect decoded_actual &&
+	rm -rf bananas &&
+	git submodule deinit bananas
+'
+
+test_expect_success 'move detection only ignores white spaces' '
+	git reset --hard &&
+	q_to_tab <<-\EOF >function.c &&
+	int func()
+	{
+	Qif (foo) {
+	QQ// this part of the function
+	QQ// function will be very long
+	QQ// indeed. We must exceed both
+	QQ// per-line and number of line
+	QQ// minimums
+	QQ;
+	Q}
+	Qbaz();
+	Qbar();
+	Q// more unrelated stuff
+	}
+	EOF
+	git add function.c &&
+	git commit -m "add function.c" &&
+	q_to_tab <<-\EOF >function.c &&
+	int do_foo()
+	{
+	Q// this part of the function
+	Q// function will be very long
+	Q// indeed. We must exceed both
+	Q// per-line and number of line
+	Q// minimums
+	Q;
+	}
+
+	int func()
+	{
+	Qif (foo)
+	QQdo_foo();
+	Qbaz();
+	Qbar();
+	Q// more unrelated stuff
+	}
+	EOF
+
+	# Make sure we get a different diff using -w ("moved function header")
+	git diff --color --color-moved -w \
+		--color-moved-no-ignore-all-space \
+		--color-moved-no-ignore-space-change \
+		--color-moved-no-ignore-space-at-eol |
+		grep -v "index" |
+		test_decode_color >actual &&
+	q_to_tab <<-\EOF >expected &&
+	<BOLD>diff --git a/function.c b/function.c<RESET>
+	<BOLD>--- a/function.c<RESET>
+	<BOLD>+++ b/function.c<RESET>
+	<CYAN>@@ -1,6 +1,5 @@<RESET>
+	<RED>-int func()<RESET>
+	<GREEN>+<RESET><GREEN>int do_foo()<RESET>
+	 {<RESET>
+	<RED>-	if (foo) {<RESET>
+	 Q// this part of the function<RESET>
+	 Q// function will be very long<RESET>
+	 Q// indeed. We must exceed both<RESET>
+	<CYAN>@@ -8,6 +7,11 @@<RESET> <RESET>int func()<RESET>
+	 Q// minimums<RESET>
+	 Q;<RESET>
+	 }<RESET>
+	<GREEN>+<RESET>
+	<GREEN>+<RESET><GREEN>int func()<RESET>
+	<GREEN>+<RESET><GREEN>{<RESET>
+	<GREEN>+<RESET>Q<GREEN>if (foo)<RESET>
+	<GREEN>+<RESET>QQ<GREEN>do_foo();<RESET>
+	 Qbaz();<RESET>
+	 Qbar();<RESET>
+	 Q// more unrelated stuff<RESET>
+	EOF
+	test_cmp expected actual &&
+
+	# And now ignoring white space only in the move detection
+	git diff --color --color-moved \
+		--color-moved-ignore-all-space \
+		--color-moved-ignore-space-change \
+		--color-moved-ignore-space-at-eol |
+		grep -v "index" |
+		test_decode_color >actual &&
+	q_to_tab <<-\EOF >expected &&
+	<BOLD>diff --git a/function.c b/function.c<RESET>
+	<BOLD>--- a/function.c<RESET>
+	<BOLD>+++ b/function.c<RESET>
+	<CYAN>@@ -1,13 +1,17 @@<RESET>
+	<GREEN>+<RESET><GREEN>int do_foo()<RESET>
+	<GREEN>+<RESET><GREEN>{<RESET>
+	<BOLD;CYAN>+<RESET>Q<BOLD;CYAN>// this part of the function<RESET>
+	<BOLD;CYAN>+<RESET>Q<BOLD;CYAN>// function will be very long<RESET>
+	<BOLD;CYAN>+<RESET>Q<BOLD;CYAN>// indeed. We must exceed both<RESET>
+	<BOLD;CYAN>+<RESET>Q<BOLD;CYAN>// per-line and number of line<RESET>
+	<BOLD;CYAN>+<RESET>Q<BOLD;CYAN>// minimums<RESET>
+	<BOLD;CYAN>+<RESET>Q<BOLD;CYAN>;<RESET>
+	<BOLD;CYAN>+<RESET><BOLD;CYAN>}<RESET>
+	<GREEN>+<RESET>
+	 int func()<RESET>
+	 {<RESET>
+	<RED>-Qif (foo) {<RESET>
+	<BOLD;MAGENTA>-QQ// this part of the function<RESET>
+	<BOLD;MAGENTA>-QQ// function will be very long<RESET>
+	<BOLD;MAGENTA>-QQ// indeed. We must exceed both<RESET>
+	<BOLD;MAGENTA>-QQ// per-line and number of line<RESET>
+	<BOLD;MAGENTA>-QQ// minimums<RESET>
+	<BOLD;MAGENTA>-QQ;<RESET>
+	<BOLD;MAGENTA>-Q}<RESET>
+	<GREEN>+<RESET>Q<GREEN>if (foo)<RESET>
+	<GREEN>+<RESET>QQ<GREEN>do_foo();<RESET>
+	 Qbaz();<RESET>
+	 Qbar();<RESET>
+	 Q// more unrelated stuff<RESET>
+	EOF
+	test_cmp expected actual
 '
 
 test_done
-- 
2.15.0.rc2.6.g953226eb5f


^ permalink raw reply	[relevance 10%]

* Re: What's cooking in git.git (Oct 2017, #06; Fri, 27)
  2017-10-27  8:32 What's cooking in git.git (Oct 2017, #06; Fri, 27) Junio C Hamano
@ 2017-10-27 18:50 ` Stefan Beller
  2017-11-07  2:00   ` sb/submodule-recursive-checkout-detach-head Junio C Hamano
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-10-27 18:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

> * sb/submodule-recursive-checkout-detach-head (2017-07-28) 2 commits
>   (merged to 'next' on 2017-10-26 at 30994b4c76)
>  + Documentation/checkout: clarify submodule HEADs to be detached
>  + recursive submodules: detach HEAD from new state
>
>  "git checkout --recursive" may overwrite and rewind the history of
>  the branch that happens to be checked out in submodule
>  repositories, which might not be desirable.  Detach the HEAD but
>  still allow the recursive checkout to succeed in such a case.
>
>  Undecided.
>  This needs justification in a larger picture; it is unclear why
>  this is better than rejecting recursive checkout, for example.

We have been running this patch internally for a couple of weeks now,
and had no complaints by users.

Detaching the submodule HEAD is in line with the current thinking
of submodules, though I am about to send out a plan later
asking if we want to keep it that way long term.

Thanks,
Stefan

^ permalink raw reply	[relevance 17%]

* [ANNOUNCE] Git v2.15.0
@ 2017-10-30  6:19 Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2017-10-30  6:19 UTC (permalink / raw)
  To: git; +Cc: Linux Kernel

The latest feature release Git v2.15.0 is now available at the
usual places.  It is comprised of 769 non-merge commits since
v2.14.0, contributed by 91 people, 28 of which are new faces.

The tarballs are found at:

    https://www.kernel.org/pub/software/scm/git/

The following public repositories all have a copy of the 'v2.15.0'
tag and the 'master' branch that the tag points at:

  url = https://kernel.googlesource.com/pub/scm/git/git
  url = git://repo.or.cz/alt-git.git
  url = https://github.com/gitster/git

New contributors whose contributions weren't in v2.14.0 are as follows.
Welcome to the Git development community!

  Andre Hinrichs, Andrey Okoshkin, Ann T Ropea, Christopher Díaz,
  Christopher Díaz Riveros, Daniel Watkins, Derrick Stolee,
  Dimitrios Christidis, Eric Rannaud, Evan Zacks, Hielke Christian
  Braun, Ian Campbell, Ilya Kantor, Jameson Miller, Job Snijders,
  Joel Teichroeb, joernchen, Łukasz Gryglicki, Manav Rathi,
  Martin Ågren, Michael Forney, Nathan Payre, Nicolas Cornu,
  Patryk Obara, Randall S. Becker, Ross Kabus, Taylor Blau,
  and Urs Thuermann.

Returning contributors who helped this release are as follows.
Thanks for your continued support.

  Adam Dinwoodie, Ævar Arnfjörð Bjarmason, Alexander Shopov,
  Andreas Heiduk, Anthony Sottile, Ben Boeckel, Brandon Casey,
  Brandon Williams, brian m. carlson, Changwoo Ryu, Christian
  Couder, David Glasser, Dimitriy Ryazantcev, Eric Blake, Han-Wen
  Nienhuys, Heiko Voigt, Jean-Noel Avila, Jeff Hostetler, Jeff
  King, Jiang Xin, Johannes Schindelin, Johannes Sixt, Jonathan
  Nieder, Jonathan Tan, Jordi Mas, Junio C Hamano, Kaartic
  Sivaraam, Kevin Daudt, Kevin Willford, Lars Schneider, Martin
  Koegler, Matthieu Moy, Max Kirillov, Michael Haggerty, Michael J
  Gruber, Nguyễn Thái Ngọc Duy, Nicolas Morey-Chaisemartin,
  Øystein Walle, Paolo Bonzini, Pat Thoyts, Peter Krefting,
  Philip Oakley, Phillip Wood, Ralf Thielow, Raman Gupta, Ramsay
  Jones, Ray Chen, René Scharfe, Sahil Dua, Santiago Torres,
  Sebastian Schuberth, Stefan Beller, Stephan Beyer, SZEDER Gábor,
  Takashi Iwai, Thomas Braun, Thomas Gummerer, Todd Zullinger,
  Tom G. Christensen, Torsten Bögershausen, Trần Ngọc Quân,
  William Duclot, and W. Trevor King.

----------------------------------------------------------------

Git 2.15 Release Notes
======================

Backward compatibility notes and other notable changes.

 * Use of an empty string as a pathspec element that is used for
   'everything matches' is still warned and Git asks users to use a
   more explicit '.' for that instead.  The hope is that existing
   users will not mind this change, and eventually the warning can be
   turned into a hard error, upgrading the deprecation into removal of
   this (mis)feature.  That is now scheduled to happen in Git v2.16,
   the next major release after this one.

 * Git now avoids blindly falling back to ".git" when the setup
   sequence said we are _not_ in Git repository.  A corner case that
   happens to work right now may be broken by a call to BUG().
   We've tried hard to locate such cases and fixed them, but there
   might still be cases that need to be addressed--bug reports are
   greatly appreciated.

 * "branch --set-upstream" that has been deprecated in Git 1.8 has
   finally been retired.


Updates since v2.14
-------------------

UI, Workflows & Features

 * An example that is now obsolete has been removed from a sample hook,
   and an old example in it that added a sign-off manually has been
   improved to use the interpret-trailers command.

 * The advice message given when "git rebase" stops for conflicting
   changes has been improved.

 * The "rerere-train" script (in contrib/) learned the "--overwrite"
   option to allow overwriting existing recorded resolutions.

 * "git contacts" (in contrib/) now lists the address on the
   "Reported-by:" trailer to its output, in addition to those on
   S-o-b: and other trailers, to make it easier to notify (and thank)
   the original bug reporter.

 * "git rebase", especially when it is run by mistake and ends up
   trying to replay many changes, spent long time in silence.  The
   command has been taught to show progress report when it spends
   long time preparing these many changes to replay (which would give
   the user a chance to abort with ^C).

 * "git merge" learned a "--signoff" option to add the Signed-off-by:
   trailer with the committer's name.

 * "git diff" learned to optionally paint new lines that are the same
   as deleted lines elsewhere differently from genuinely new lines.

 * "git interpret-trailers" learned to take the trailer specifications
   from the command line that overrides the configured values.

 * "git interpret-trailers" has been taught a "--parse" and a few
   other options to make it easier for scripts to grab existing
   trailer lines from a commit log message.

 * The "--format=%(trailers)" option "git log" and its friends take
   learned to take the 'unfold' and 'only' modifiers to normalize its
   output, e.g. "git log --format=%(trailers:only,unfold)".

 * "gitweb" shows a link to visit the 'raw' contents of blbos in the
   history overview page.

 * "[gc] rerereResolved = 5.days" used to be invalid, as the variable
   is defined to take an integer counting the number of days.  It now
   is allowed.

 * The code to acquire a lock on a reference (e.g. while accepting a
   push from a client) used to immediately fail when the reference is
   already locked---now it waits for a very short while and retries,
   which can make it succeed if the lock holder was holding it during
   a read-only operation.

 * "branch --set-upstream" that has been deprecated in Git 1.8 has
   finally been retired.

 * The codepath to call external process filter for smudge/clean
   operation learned to show the progress meter.

 * "git rev-parse" learned "--is-shallow-repository", that is to be
   used in a way similar to existing "--is-bare-repository" and
   friends.

 * "git describe --match <pattern>" has been taught to play well with
   the "--all" option.

 * "git branch" learned "-c/-C" to create a new branch by copying an
   existing one.

 * Some commands (most notably "git status") makes an opportunistic
   update when performing a read-only operation to help optimize later
   operations in the same repository.  The new "--no-optional-locks"
   option can be passed to Git to disable them.

 * "git for-each-ref --format=..." learned a new format element,
   %(trailers), to show only the commit log trailer part of the log
   message.


Performance, Internal Implementation, Development Support etc.

 * Conversion from uchar[20] to struct object_id continues.

 * Start using selected c99 constructs in small, stable and
   essentialpart of the system to catch people who care about
   older compilers that do not grok them.

 * The filter-process interface learned to allow a process with long
   latency give a "delayed" response.

 * Many uses of comparision callback function the hashmap API uses
   cast the callback function type when registering it to
   hashmap_init(), which defeats the compile time type checking when
   the callback interface changes (e.g. gaining more parameters).
   The callback implementations have been updated to take "void *"
   pointers and cast them to the type they expect instead.

 * Because recent Git for Windows do come with a real msgfmt, the
   build procedure for git-gui has been updated to use it instead of a
   hand-rolled substitute.

 * "git grep --recurse-submodules" has been reworked to give a more
   consistent output across submodule boundary (and do its thing
   without having to fork a separate process).

 * A helper function to read a single whole line into strbuf
   mistakenly triggered OOM error at EOF under certain conditions,
   which has been fixed.

 * The "ref-store" code reorganization continues.

 * "git commit" used to discard the index and re-read from the filesystem
   just in case the pre-commit hook has updated it in the middle; this
   has been optimized out when we know we do not run the pre-commit hook.
   (merge 680ee550d7 kw/commit-keep-index-when-pre-commit-is-not-run later to maint).

 * Updates to the HTTP layer we made recently unconditionally used
   features of libCurl without checking the existence of them, causing
   compilation errors, which has been fixed.  Also migrate the code to
   check feature macros, not version numbers, to cope better with
   libCurl that vendor ships with backported features.

 * The API to start showing progress meter after a short delay has
   been simplified.
   (merge 8aade107dd jc/simplify-progress later to maint).

 * Code clean-up to avoid mixing values read from the .gitmodules file
   and values read from the .git/config file.

 * We used to spend more than necessary cycles allocating and freeing
   piece of memory while writing each index entry out.  This has been
   optimized.

 * Platforms that ship with a separate sha1 with collision detection
   library can link to it instead of using the copy we ship as part of
   our source tree.

 * Code around "notes" have been cleaned up.
   (merge 3964281524 mh/notes-cleanup later to maint).

 * The long-standing rule that an in-core lockfile instance, once it
   is used, must not be freed, has been lifted and the lockfile and
   tempfile APIs have been updated to reduce the chance of programming
   errors.

 * Our hashmap implementation in hashmap.[ch] is not thread-safe when
   adding a new item needs to expand the hashtable by rehashing; add
   an API to disable the automatic rehashing to work it around.

 * Many of our programs consider that it is OK to release dynamic
   storage that is used throughout the life of the program by simply
   exiting, but this makes it harder to leak detection tools to avoid
   reporting false positives.  Plug many existing leaks and introduce
   a mechanism for developers to mark that the region of memory
   pointed by a pointer is not lost/leaking to help these tools.

 * As "git commit" to conclude a conflicted "git merge" honors the
   commit-msg hook, "git merge" that records a merge commit that
   cleanly auto-merges should, but it didn't.

 * The codepath for "git merge-recursive" has been cleaned up.

 * Many leaks of strbuf have been fixed.

 * "git imap-send" has our own implementation of the protocol and also
   can use more recent libCurl with the imap protocol support.  Update
   the latter so that it can use the credential subsystem, and then
   make it the default option to use, so that we can eventually
   deprecate and remove the former.

 * "make style" runs git-clang-format to help developers by pointing
   out coding style issues.

 * A test to demonstrate "git mv" failing to adjust nested submodules
   has been added.
   (merge c514167df2 hv/mv-nested-submodules-test later to maint).

 * On Cygwin, "ulimit -s" does not report failure but it does not work
   at all, which causes an unexpected success of some tests that
   expect failures under a limited stack situation.  This has been
   fixed.

 * Many codepaths have been updated to squelch -Wimplicit-fallthrough
   warnings from Gcc 7 (which is a good code hygiene).

 * Add a helper for DLL loading in anticipation for its need in a
   future topic RSN.

 * "git status --ignored", when noticing that a directory without any
   tracked path is ignored, still enumerated all the ignored paths in
   the directory, which is unnecessary.  The codepath has been
   optimized to avoid this overhead.

 * The final batch to "git rebase -i" updates to move more code from
   the shell script to C has been merged.

 * Operations that do not touch (majority of) packed refs have been
   optimized by making accesses to packed-refs file lazy; we no longer
   pre-parse everything, and an access to a single ref in the
   packed-refs does not touch majority of irrelevant refs, either.

 * Add comment to clarify that the style file is meant to be used with
   clang-5 and the rules are still work in progress.

 * Many variables that points at a region of memory that will live
   throughout the life of the program have been marked with UNLEAK
   marker to help the leak checkers concentrate on real leaks..

 * Plans for weaning us off of SHA-1 has been documented.

 * A new "oidmap" API has been introduced and oidset API has been
   rewritten to use it.


Also contains various documentation updates and code clean-ups.


Fixes since v2.14
-----------------

 * "%C(color name)" in the pretty print format always produced ANSI
   color escape codes, which was an early design mistake.  They now
   honor the configuration (e.g. "color.ui = never") and also tty-ness
   of the output medium.

 * The http.{sslkey,sslCert} configuration variables are to be
   interpreted as a pathname that honors "~[username]/" prefix, but
   weren't, which has been fixed.

 * Numerous bugs in walking of reflogs via "log -g" and friends have
   been fixed.

 * "git commit" when seeing an totally empty message said "you did not
   edit the message", which is clearly wrong.  The message has been
   corrected.

 * When a directory is not readable, "gitweb" fails to build the
   project list.  Work this around by skipping such a directory.

 * Some versions of GnuPG fails to kill gpg-agent it auto-spawned
   and such a left-over agent can interfere with a test.  Work it
   around by attempting to kill one before starting a new test.

 * A recently added test for the "credential-cache" helper revealed
   that EOF detection done around the time the connection to the cache
   daemon is torn down were flaky.  This was fixed by reacting to
   ECONNRESET and behaving as if we got an EOF.

 * "git log --tag=no-such-tag" showed log starting from HEAD, which
   has been fixed---it now shows nothing.

 * The "tag.pager" configuration variable was useless for those who
   actually create tag objects, as it interfered with the use of an
   editor.  A new mechanism has been introduced for commands to enable
   pager depending on what operation is being carried out to fix this,
   and then "git tag -l" is made to run pager by default.

 * "git push --recurse-submodules $there HEAD:$target" was not
   propagated down to the submodules, but now it is.

 * Commands like "git rebase" accepted the --rerere-autoupdate option
   from the command line, but did not always use it.  This has been
   fixed.

 * "git clone --recurse-submodules --quiet" did not pass the quiet
   option down to submodules.

 * Test portability fix for OBSD.

 * Portability fix for OBSD.

 * "git am -s" has been taught that some input may end with a trailer
   block that is not Signed-off-by: and it should refrain from adding
   an extra blank line before adding a new sign-off in such a case.

 * "git svn" used with "--localtime" option did not compute the tz
   offset for the timestamp in question and instead always used the
   current time, which has been corrected.

 * Memory leak in an error codepath has been plugged.

 * "git stash -u" used the contents of the committed version of the
   ".gitignore" file to decide which paths are ignored, even when the
   file has local changes.  The command has been taught to instead use
   the locally modified contents.

 * bash 4.4 or newer gave a warning on NUL byte in command
   substitution done in "git stash"; this has been squelched.

 * "git grep -L" and "git grep --quiet -L" reported different exit
   codes; this has been corrected.

 * When handshake with a subprocess filter notices that the process
   asked for an unknown capability, Git did not report what program
   the offending subprocess was running.  This has been corrected.

 * "git apply" that is used as a better "patch -p1" failed to apply a
   taken from a file with CRLF line endings to a file with CRLF line
   endings.  The root cause was because it misused convert_to_git()
   that tried to do "safe-crlf" processing by looking at the index
   entry at the same path, which is a nonsense---in that mode, "apply"
   is not working on the data in (or derived from) the index at all.
   This has been fixed.

 * Killing "git merge --edit" before the editor returns control left
   the repository in a state with MERGE_MSG but without MERGE_HEAD,
   which incorrectly tells the subsequent "git commit" that there was
   a squash merge in progress.  This has been fixed.

 * "git archive" did not work well with pathspecs and the
   export-ignore attribute.

 * In addition to "cc: <a@dd.re.ss> # cruft", "cc: a@dd.re.ss # cruft"
   was taught to "git send-email" as a valid way to tell it that it
   needs to also send a carbon copy to <a@dd.re.ss> in the trailer
   section.

 * "git branch -M a b" while on a branch that is completely unrelated
   to either branch a or branch b misbehaved when multiple worktree
   was in use.  This has been fixed.
   (merge 31824d180d nd/worktree-kill-parse-ref later to maint).

 * "git gc" and friends when multiple worktrees are used off of a
   single repository did not consider the index and per-worktree refs
   of other worktrees as the root for reachability traversal, making
   objects that are in use only in other worktrees to be subject to
   garbage collection.

 * A regression to "gitk --bisect" by a recent update has been fixed.

 * "git -c submodule.recurse=yes pull" did not work as if the
   "--recurse-submodules" option was given from the command line.
   This has been corrected.

 * Unlike "git commit-tree < file", "git commit-tree -F file" did not
   pass the contents of the file verbatim and instead completed an
   incomplete line at the end, if exists.  The latter has been updated
   to match the behaviour of the former.

 * Many codepaths did not diagnose write failures correctly when disks
   go full, due to their misuse of write_in_full() helper function,
   which have been corrected.
   (merge f48ecd38cb jk/write-in-full-fix later to maint).

 * "git help co" now says "co is aliased to ...", not "git co is".
   (merge b3a8076e0d ks/help-alias-label later to maint).

 * "git archive", especially when used with pathspec, stored an empty
   directory in its output, even though Git itself never does so.
   This has been fixed.

 * API error-proofing which happens to also squelch warnings from GCC.

 * The explanation of the cut-line in the commit log editor has been
   slightly tweaked.
   (merge 8c4b1a3593 ks/commit-do-not-touch-cut-line later to maint).

 * "git gc" tries to avoid running two instances at the same time by
   reading and writing pid/host from and to a lock file; it used to
   use an incorrect fscanf() format when reading, which has been
   corrected.

 * The scripts to drive TravisCI has been reorganized and then an
   optimization to avoid spending cycles on a branch whose tip is
   tagged has been implemented.
   (merge 8376eb4a8f ls/travis-scriptify later to maint).

 * The test linter has been taught that we do not like "echo -e".

 * Code cmp.std.c nitpick.

 * A regression fix for 2.11 that made the code to read the list of
   alternate object stores overrun the end of the string.
   (merge f0f7bebef7 jk/info-alternates-fix later to maint).

 * "git describe --match" learned to take multiple patterns in v2.13
   series, but the feature ignored the patterns after the first one
   and did not work at all.  This has been fixed.

 * "git filter-branch" cannot reproduce a history with a tag without
   the tagger field, which only ancient versions of Git allowed to be
   created.  This has been corrected.
   (merge b2c1ca6b4b ic/fix-filter-branch-to-handle-tag-without-tagger later to maint).

 * "git cat-file --textconv" started segfaulting recently, which
   has been corrected.

 * The built-in pattern to detect the "function header" for HTML did
   not match <H1>..<H6> elements without any attributes, which has
   been fixed.

 * "git mailinfo" was loose in decoding quoted printable and produced
   garbage when the two letters after the equal sign are not
   hexadecimal.  This has been fixed.

 * The machinery to create xdelta used in pack files received the
   sizes of the data in size_t, but lost the higher bits of them by
   storing them in "unsigned int" during the computation, which is
   fixed.

 * The delta format used in the packfile cannot reference data at
   offset larger than what can be expressed in 4-byte, but the
   generator for the data failed to make sure the offset does not
   overflow.  This has been corrected.

 * The documentation for '-X<option>' for merges was misleadingly
   written to suggest that "-s theirs" exists, which is not the case.

 * "git fast-export" with -M/-C option issued "copy" instruction on a
   path that is simultaneously modified, which was incorrect.
   (merge b3e8ca89cf jt/fast-export-copy-modify-fix later to maint).

 * Many codepaths have been updated to squelch -Wsign-compare
   warnings.
   (merge 071bcaab64 rj/no-sign-compare later to maint).

 * Memory leaks in various codepaths have been plugged.
   (merge 4d01a7fa65 ma/leakplugs later to maint).

 * Recent versions of "git rev-parse --parseopt" did not parse the
   option specification that does not have the optional flags (*=?!)
   correctly, which has been corrected.
   (merge a6304fa4c2 bc/rev-parse-parseopt-fix later to maint).

 * The checkpoint command "git fast-import" did not flush updates to
   refs and marks unless at least one object was created since the
   last checkpoint, which has been corrected, as these things can
   happen without any new object getting created.
   (merge 30e215a65c er/fast-import-dump-refs-on-checkpoint later to maint).

 * Spell the name of our system as "Git" in the output from
   request-pull script.

 * Fixes for a handful memory access issues identified by valgrind.

 * Backports a moral equivalent of 2015 fix to the poll() emulation
   from the upstream gnulib to fix occasional breakages on HPE NonStop.

 * Users with "color.ui = always" in their configuration were broken
   by a recent change that made plumbing commands to pay attention to
   them as the patch created internally by "git add -p" were colored
   (heh) and made unusable.  This has been fixed by reverting the
   offending change.

 * In the "--format=..." option of the "git for-each-ref" command (and
   its friends, i.e. the listing mode of "git branch/tag"), "%(atom:)"
   (e.g. "%(refname:)", "%(body:)" used to error out.  Instead, treat
   them as if the colon and an empty string that follows it were not
   there.

 * An ancient bug that made Git misbehave with creation/renaming of
   refs has been fixed.

 * "git fetch <there> <src>:<dst>" allows an object name on the <src>
   side when the other side accepts such a request since Git v2.5, but
   the documentation was left stale.
   (merge 83558a412a jc/fetch-refspec-doc-update later to maint).

 * Update the documentation for "git filter-branch" so that the filter
   options are listed in the same order as they are applied, as
   described in an earlier part of the doc.
   (merge 07c4984508 dg/filter-branch-filter-order-doc later to maint).

 * A possible oom error is now caught as a fatal error, instead of
   continuing and dereferencing NULL.
   (merge 55d7d15847 ao/path-use-xmalloc later to maint).

 * Other minor doc, test and build updates and code cleanups.
   (merge f094b89a4d ma/parse-maybe-bool later to maint).
   (merge 6cdf8a7929 ma/ts-cleanups later to maint).
   (merge 7560f547e6 ma/up-to-date later to maint).
   (merge 0db3dc75f3 rs/apply-epoch later to maint).
   (merge 276d0e35c0 ma/split-symref-update-fix later to maint).
   (merge f777623514 ks/branch-tweak-error-message-for-extra-args later to maint).
   (merge 33f3c683ec ks/verify-filename-non-option-error-message-tweak later to maint).
   (merge 7cbbf9d6a2 ls/filter-process-delayed later to maint).
   (merge 488aa65c8f wk/merge-options-gpg-sign-doc later to maint).
   (merge e61cb19a27 jc/branch-force-doc-readability-fix later to maint).
   (merge 32fceba3fd np/config-path-doc later to maint).
   (merge e38c681fb7 sb/rev-parse-show-superproject-root later to maint).
   (merge 4f851dc883 sg/rev-list-doc-reorder-fix later to maint).

^ permalink raw reply	[relevance 9%]

* Re: [PATCH v2 3/4] diff: add flag to indicate textconv was set via cmdline
  2017-10-30 19:46   ` [PATCH v2 3/4] diff: add flag to indicate textconv was set via cmdline Brandon Williams
@ 2017-10-30 20:41     ` Stefan Beller
  2017-10-30 20:44       ` Brandon Williams
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-10-30 20:41 UTC (permalink / raw)
  To: Brandon Williams, Michael J Gruber; +Cc: git, Junio C Hamano

On Mon, Oct 30, 2017 at 12:46 PM, Brandon Williams <bmwill@google.com> wrote:
> git-show is unique in that it wants to use textconv by default except
> for when it is showing blobs.  When asked to show a blob, show doesn't
> want to use textconv unless the user explicitly requested that it be
> used by providing the command line flag '--textconv'.
>
> Currently this is done by using a parallel set of 'touched' flags which
> get set every time a particular flag is set or cleared.  In a future
> patch we want to eliminate this parallel set of flags so instead of
> relying on if the textconv flag has been touched, add a new flag
> 'TEXTCONV_SET_VIA_CMDLINE' which is only set if textconv is requested
> via the command line.

Is it worth mentioning 4197361e39 (Merge branch 'mg/more-textconv',
2013-10-23), that introduced the touched_flags?
(+cc Michael Gruber FYI)

> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  builtin/log.c | 2 +-
>  diff.c        | 8 +++++---
>  diff.h        | 1 +
>  3 files changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/builtin/log.c b/builtin/log.c
> index dc28d43eb..82131751d 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -485,7 +485,7 @@ static int show_blob_object(const struct object_id *oid, struct rev_info *rev, c
>         unsigned long size;
>
>         fflush(rev->diffopt.file);
> -       if (!DIFF_OPT_TOUCHED(&rev->diffopt, ALLOW_TEXTCONV) ||
> +       if (!DIFF_OPT_TST(&rev->diffopt, TEXTCONV_SET_VIA_CMDLINE) ||
>             !DIFF_OPT_TST(&rev->diffopt, ALLOW_TEXTCONV))
>                 return stream_blob_to_fd(1, oid, NULL, 0);
>
> diff --git a/diff.c b/diff.c
> index 3ad9c9b31..8b700b1bd 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -4762,11 +4762,13 @@ int diff_opt_parse(struct diff_options *options,
>                 DIFF_OPT_SET(options, ALLOW_EXTERNAL);
>         else if (!strcmp(arg, "--no-ext-diff"))
>                 DIFF_OPT_CLR(options, ALLOW_EXTERNAL);
> -       else if (!strcmp(arg, "--textconv"))
> +       else if (!strcmp(arg, "--textconv")) {
>                 DIFF_OPT_SET(options, ALLOW_TEXTCONV);
> -       else if (!strcmp(arg, "--no-textconv"))
> +               DIFF_OPT_SET(options, TEXTCONV_SET_VIA_CMDLINE);
> +       } else if (!strcmp(arg, "--no-textconv")) {
>                 DIFF_OPT_CLR(options, ALLOW_TEXTCONV);

Also clear TEXTCONV_SET_VIA_CMDLINE here?
(`git show --textconv --no-textconv` might act funny?)

> -       else if (!strcmp(arg, "--ignore-submodules")) {
> +               DIFF_OPT_CLR(options, TEXTCONV_SET_VIA_CMDLINE);
> +       } else if (!strcmp(arg, "--ignore-submodules")) {
>                 DIFF_OPT_SET(options, OVERRIDE_SUBMODULE_CONFIG);
>                 handle_ignore_submodules_arg(options, "all");
>         } else if (skip_prefix(arg, "--ignore-submodules=", &arg)) {
> diff --git a/diff.h b/diff.h
> index 47e6d43cb..4eaf9b370 100644
> --- a/diff.h
> +++ b/diff.h
> @@ -83,6 +83,7 @@ struct diff_flags {
>         unsigned DIRSTAT_CUMULATIVE:1;
>         unsigned DIRSTAT_BY_FILE:1;
>         unsigned ALLOW_TEXTCONV:1;
> +       unsigned TEXTCONV_SET_VIA_CMDLINE:1;
>         unsigned DIFF_FROM_CONTENTS:1;
>         unsigned DIRTY_SUBMODULES:1;
>         unsigned IGNORE_UNTRACKED_IN_SUBMODULES:1;
> --
> 2.15.0.403.gc27cc4dac6-goog
>

^ permalink raw reply	[relevance 8%]

* Re: [PATCH v2 3/4] diff: add flag to indicate textconv was set via cmdline
  2017-10-30 20:41     ` Stefan Beller
@ 2017-10-30 20:44       ` Brandon Williams
  2017-10-30 20:48         ` Brandon Williams
  0 siblings, 1 reply; 200+ results
From: Brandon Williams @ 2017-10-30 20:44 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Michael J Gruber, git, Junio C Hamano

On 10/30, Stefan Beller wrote:
> On Mon, Oct 30, 2017 at 12:46 PM, Brandon Williams <bmwill@google.com> wrote:
> > git-show is unique in that it wants to use textconv by default except
> > for when it is showing blobs.  When asked to show a blob, show doesn't
> > want to use textconv unless the user explicitly requested that it be
> > used by providing the command line flag '--textconv'.
> >
> > Currently this is done by using a parallel set of 'touched' flags which
> > get set every time a particular flag is set or cleared.  In a future
> > patch we want to eliminate this parallel set of flags so instead of
> > relying on if the textconv flag has been touched, add a new flag
> > 'TEXTCONV_SET_VIA_CMDLINE' which is only set if textconv is requested
> > via the command line.
> 
> Is it worth mentioning 4197361e39 (Merge branch 'mg/more-textconv',
> 2013-10-23), that introduced the touched_flags?
> (+cc Michael Gruber FYI)
> 
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> > ---
> >  builtin/log.c | 2 +-
> >  diff.c        | 8 +++++---
> >  diff.h        | 1 +
> >  3 files changed, 7 insertions(+), 4 deletions(-)
> >
> > diff --git a/builtin/log.c b/builtin/log.c
> > index dc28d43eb..82131751d 100644
> > --- a/builtin/log.c
> > +++ b/builtin/log.c
> > @@ -485,7 +485,7 @@ static int show_blob_object(const struct object_id *oid, struct rev_info *rev, c
> >         unsigned long size;
> >
> >         fflush(rev->diffopt.file);
> > -       if (!DIFF_OPT_TOUCHED(&rev->diffopt, ALLOW_TEXTCONV) ||
> > +       if (!DIFF_OPT_TST(&rev->diffopt, TEXTCONV_SET_VIA_CMDLINE) ||
> >             !DIFF_OPT_TST(&rev->diffopt, ALLOW_TEXTCONV))
> >                 return stream_blob_to_fd(1, oid, NULL, 0);
> >
> > diff --git a/diff.c b/diff.c
> > index 3ad9c9b31..8b700b1bd 100644
> > --- a/diff.c
> > +++ b/diff.c
> > @@ -4762,11 +4762,13 @@ int diff_opt_parse(struct diff_options *options,
> >                 DIFF_OPT_SET(options, ALLOW_EXTERNAL);
> >         else if (!strcmp(arg, "--no-ext-diff"))
> >                 DIFF_OPT_CLR(options, ALLOW_EXTERNAL);
> > -       else if (!strcmp(arg, "--textconv"))
> > +       else if (!strcmp(arg, "--textconv")) {
> >                 DIFF_OPT_SET(options, ALLOW_TEXTCONV);
> > -       else if (!strcmp(arg, "--no-textconv"))
> > +               DIFF_OPT_SET(options, TEXTCONV_SET_VIA_CMDLINE);
> > +       } else if (!strcmp(arg, "--no-textconv")) {
> >                 DIFF_OPT_CLR(options, ALLOW_TEXTCONV);
> 
> Also clear TEXTCONV_SET_VIA_CMDLINE here?
> (`git show --textconv --no-textconv` might act funny?)

Oops, thanks for catching this, I added it in the wrong place! (as you
can see below.

> 
> > -       else if (!strcmp(arg, "--ignore-submodules")) {
> > +               DIFF_OPT_CLR(options, TEXTCONV_SET_VIA_CMDLINE);
> > +       } else if (!strcmp(arg, "--ignore-submodules")) {
> >                 DIFF_OPT_SET(options, OVERRIDE_SUBMODULE_CONFIG);
> >                 handle_ignore_submodules_arg(options, "all");
> >         } else if (skip_prefix(arg, "--ignore-submodules=", &arg)) {
> > diff --git a/diff.h b/diff.h
> > index 47e6d43cb..4eaf9b370 100644
> > --- a/diff.h
> > +++ b/diff.h
> > @@ -83,6 +83,7 @@ struct diff_flags {
> >         unsigned DIRSTAT_CUMULATIVE:1;
> >         unsigned DIRSTAT_BY_FILE:1;
> >         unsigned ALLOW_TEXTCONV:1;
> > +       unsigned TEXTCONV_SET_VIA_CMDLINE:1;
> >         unsigned DIFF_FROM_CONTENTS:1;
> >         unsigned DIRTY_SUBMODULES:1;
> >         unsigned IGNORE_UNTRACKED_IN_SUBMODULES:1;
> > --
> > 2.15.0.403.gc27cc4dac6-goog
> >

-- 
Brandon Williams

^ permalink raw reply	[relevance 8%]

* Re: [PATCH v2 3/4] diff: add flag to indicate textconv was set via cmdline
  2017-10-30 20:44       ` Brandon Williams
@ 2017-10-30 20:48         ` Brandon Williams
  0 siblings, 0 replies; 200+ results
From: Brandon Williams @ 2017-10-30 20:48 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Michael J Gruber, git, Junio C Hamano

On 10/30, Brandon Williams wrote:
> On 10/30, Stefan Beller wrote:
> > On Mon, Oct 30, 2017 at 12:46 PM, Brandon Williams <bmwill@google.com> wrote:
> > > git-show is unique in that it wants to use textconv by default except
> > > for when it is showing blobs.  When asked to show a blob, show doesn't
> > > want to use textconv unless the user explicitly requested that it be
> > > used by providing the command line flag '--textconv'.
> > >
> > > Currently this is done by using a parallel set of 'touched' flags which
> > > get set every time a particular flag is set or cleared.  In a future
> > > patch we want to eliminate this parallel set of flags so instead of
> > > relying on if the textconv flag has been touched, add a new flag
> > > 'TEXTCONV_SET_VIA_CMDLINE' which is only set if textconv is requested
> > > via the command line.
> > 
> > Is it worth mentioning 4197361e39 (Merge branch 'mg/more-textconv',
> > 2013-10-23), that introduced the touched_flags?
> > (+cc Michael Gruber FYI)
> > 
> > > Signed-off-by: Brandon Williams <bmwill@google.com>
> > > ---
> > >  builtin/log.c | 2 +-
> > >  diff.c        | 8 +++++---
> > >  diff.h        | 1 +
> > >  3 files changed, 7 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/builtin/log.c b/builtin/log.c
> > > index dc28d43eb..82131751d 100644
> > > --- a/builtin/log.c
> > > +++ b/builtin/log.c
> > > @@ -485,7 +485,7 @@ static int show_blob_object(const struct object_id *oid, struct rev_info *rev, c
> > >         unsigned long size;
> > >
> > >         fflush(rev->diffopt.file);
> > > -       if (!DIFF_OPT_TOUCHED(&rev->diffopt, ALLOW_TEXTCONV) ||
> > > +       if (!DIFF_OPT_TST(&rev->diffopt, TEXTCONV_SET_VIA_CMDLINE) ||
> > >             !DIFF_OPT_TST(&rev->diffopt, ALLOW_TEXTCONV))
> > >                 return stream_blob_to_fd(1, oid, NULL, 0);
> > >
> > > diff --git a/diff.c b/diff.c
> > > index 3ad9c9b31..8b700b1bd 100644
> > > --- a/diff.c
> > > +++ b/diff.c
> > > @@ -4762,11 +4762,13 @@ int diff_opt_parse(struct diff_options *options,
> > >                 DIFF_OPT_SET(options, ALLOW_EXTERNAL);
> > >         else if (!strcmp(arg, "--no-ext-diff"))
> > >                 DIFF_OPT_CLR(options, ALLOW_EXTERNAL);
> > > -       else if (!strcmp(arg, "--textconv"))
> > > +       else if (!strcmp(arg, "--textconv")) {
> > >                 DIFF_OPT_SET(options, ALLOW_TEXTCONV);
> > > -       else if (!strcmp(arg, "--no-textconv"))
> > > +               DIFF_OPT_SET(options, TEXTCONV_SET_VIA_CMDLINE);
> > > +       } else if (!strcmp(arg, "--no-textconv")) {
> > >                 DIFF_OPT_CLR(options, ALLOW_TEXTCONV);
> > 
> > Also clear TEXTCONV_SET_VIA_CMDLINE here?
> > (`git show --textconv --no-textconv` might act funny?)
> 
> Oops, thanks for catching this, I added it in the wrong place! (as you
> can see below.

wait nevermind, i overreacted.  And this patch does correctly clear
TEXTCONV_SET_VIA_CMDLINE.

> 
> > 
> > > -       else if (!strcmp(arg, "--ignore-submodules")) {
> > > +               DIFF_OPT_CLR(options, TEXTCONV_SET_VIA_CMDLINE);
> > > +       } else if (!strcmp(arg, "--ignore-submodules")) {
> > >                 DIFF_OPT_SET(options, OVERRIDE_SUBMODULE_CONFIG);
> > >                 handle_ignore_submodules_arg(options, "all");
> > >         } else if (skip_prefix(arg, "--ignore-submodules=", &arg)) {
> > > diff --git a/diff.h b/diff.h
> > > index 47e6d43cb..4eaf9b370 100644
> > > --- a/diff.h
> > > +++ b/diff.h
> > > @@ -83,6 +83,7 @@ struct diff_flags {
> > >         unsigned DIRSTAT_CUMULATIVE:1;
> > >         unsigned DIRSTAT_BY_FILE:1;
> > >         unsigned ALLOW_TEXTCONV:1;
> > > +       unsigned TEXTCONV_SET_VIA_CMDLINE:1;
> > >         unsigned DIFF_FROM_CONTENTS:1;
> > >         unsigned DIRTY_SUBMODULES:1;
> > >         unsigned IGNORE_UNTRACKED_IN_SUBMODULES:1;
> > > --
> > > 2.15.0.403.gc27cc4dac6-goog
> > >
> 
> -- 
> Brandon Williams

-- 
Brandon Williams

^ permalink raw reply	[relevance 1%]

* [PATCH 6/7] builtin/describe.c: describe a blob
  2017-10-31  0:33 ` [PATCH 0/7] git-describe <blob> Stefan Beller
@ 2017-10-31  0:33   ` Stefan Beller
  2017-10-31  0:33   ` [PATCH 7/7] t6120: fix typo in test name Stefan Beller
  2017-10-31 21:18   ` [PATCHv2 0/7] git describe blob Stefan Beller
  2 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-10-31  0:33 UTC (permalink / raw)
  To: sbeller; +Cc: git, me, jacob.keller, Johannes.Schindelin

Sometimes users are given a hash of an object and they want to
identify it further (ex.: Use verify-pack to find the largest blobs,
but what are these? or [1])

"This is an interesting endeavor, because describing things is hard."
  -- me, upon writing this patch.

When describing commits, we try to anchor them to tags or refs, as these
are conceptually on a higher level than the commit. And if there is no ref
or tag that matches exactly, we're out of luck.  So we employ a heuristic
to make up a name for the commit. These names are ambivalent, there might
be different tags or refs to anchor to, and there might be different
path in the DAG to travel to arrive at the commit precisely.

When describing a blob, we want to describe the blob from a higher layer
as well, which is a tuple of (commit, deep/path) as the tree objects
involved are rather uninteresting.  The same blob can be referenced by
multiple commits, so how we decide which commit to use?  This patch
implements a rather naive approach on this: As there are no back pointers
from blobs to commits in which the blob occurs, we'll start walking from
any tips available, listing the blobs in-order of the commit and once we
found the blob, we'll take the first commit that listed the blob.  For
source code this is likely not the first commit that introduced the blob,
but rather the latest commit that contained the blob.  For example:

  git describe v0.99:Makefile
  v0.99-5-gab6625e06a:Makefile

tells us the latest commit that contained the Makefile as it was in tag
v0.99 is commit v0.99-5-gab6625e06a (and at the same path), as the next
commit on top v0.99-6-gb1de9de2b9 ([PATCH] Bootstrap "make dist",
2005-07-11) touches the Makefile.

Let's see how this description turns out, if it is useful in day-to-day
use as I have the intuition that we'd rather want to see the *first*
commit that this blob was introduced to the repository (which can be
achieved easily by giving the `--reverse` flag in the describe_blob rev
walk).

[1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob
Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/git-describe.txt | 12 +++++++-
 builtin/describe.c             | 65 ++++++++++++++++++++++++++++++++++++++----
 t/t6120-describe.sh            | 15 ++++++++++
 3 files changed, 86 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-describe.txt b/Documentation/git-describe.txt
index c924c945ba..3d618b2445 100644
--- a/Documentation/git-describe.txt
+++ b/Documentation/git-describe.txt
@@ -3,7 +3,7 @@ git-describe(1)
 
 NAME
 ----
-git-describe - Describe a commit using the most recent tag reachable from it
+git-describe - Describe a commit or blob using the most recent tag reachable from it
 
 
 SYNOPSIS
@@ -24,6 +24,16 @@ By default (without --all or --tags) `git describe` only shows
 annotated tags.  For more information about creating annotated tags
 see the -a and -s options to linkgit:git-tag[1].
 
+If the given `<commit-ish>` refers to a blob, it will be described
+as `<commit-description>:<path>`, such that the blob can be found
+at `<path>` in the `<commit-ish`. Note, that the commit is likely
+not the commit that introduced the blob, but the one that was found
+first; to find the commit that introduced the blob, you need to find
+the commit that last touched the path, e.g.
+`git log <commit-description> -- <path>`.
+As blobs do not point at the commits they are contained in,
+describing blobs is slow as we have to walk the whole graph.
+
 OPTIONS
 -------
 <commit-ish>...::
diff --git a/builtin/describe.c b/builtin/describe.c
index 9e9a5ed5d4..382cbae908 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -3,6 +3,7 @@
 #include "lockfile.h"
 #include "commit.h"
 #include "tag.h"
+#include "blob.h"
 #include "refs.h"
 #include "builtin.h"
 #include "exec_cmd.h"
@@ -11,8 +12,9 @@
 #include "hashmap.h"
 #include "argv-array.h"
 #include "run-command.h"
+#include "revision.h"
+#include "list-objects.h"
 
-#define SEEN		(1u << 0)
 #define MAX_TAGS	(FLAG_BITS - 1)
 
 static const char * const describe_usage[] = {
@@ -434,6 +436,54 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst)
 		strbuf_addstr(dst, suffix);
 }
 
+struct process_commit_data {
+	struct object_id current_commit;
+	struct object_id looking_for;
+	struct strbuf *dst;
+};
+
+static void process_commit(struct commit *commit, void *data)
+{
+	struct process_commit_data *pcd = data;
+	pcd->current_commit = commit->object.oid;
+}
+
+static void process_object(struct object *obj, const char *path, void *data)
+{
+	struct process_commit_data *pcd = data;
+
+	if (!oidcmp(&pcd->looking_for, &obj->oid) && !pcd->dst->len) {
+		reset_revision_walk();
+		describe_commit(&pcd->current_commit, pcd->dst);
+		strbuf_addf(pcd->dst, ":%s", path);
+	}
+}
+
+static void describe_blob(struct object_id oid, struct strbuf *dst)
+{
+	struct rev_info revs;
+	struct argv_array args = ARGV_ARRAY_INIT;
+	struct process_commit_data pcd = { null_oid, oid, dst};
+
+	argv_array_pushl(&args, "internal: The first arg is not parsed",
+		"--all", "--reflog", /* as many starting points as possible */
+		/* NEEDSWORK: --all is incompatible with worktrees for now: */
+		"--single-worktree",
+		"--objects",
+		"--in-commit-order",
+		NULL);
+
+	init_revisions(&revs, NULL);
+	if (setup_revisions(args.argc, args.argv, &revs, NULL) > 1)
+		BUG("setup_revisions could not handle all args?");
+
+	if (prepare_revision_walk(&revs))
+		die("revision walk setup failed");
+
+	traverse_commit_list(&revs, process_commit, process_object, &pcd);
+	reset_revision_walk();
+}
+
 static void describe(const char *arg, int last_one)
 {
 	struct object_id oid;
@@ -445,11 +495,16 @@ static void describe(const char *arg, int last_one)
 
 	if (get_oid(arg, &oid))
 		die(_("Not a valid object name %s"), arg);
-	cmit = lookup_commit_reference(&oid);
-	if (!cmit)
-		die(_("%s is not a valid '%s' object"), arg, commit_type);
+	cmit = lookup_commit_reference_gently(&oid, 1);
 
-	describe_commit(&oid, &sb);
+	if (cmit) {
+		describe_commit(&oid, &sb);
+	} else {
+		if (lookup_blob(&oid))
+			describe_blob(oid, &sb);
+		else
+			die(_("%s is neither a commit nor blob"), arg);
+	}
 
 	puts(sb.buf);
 
diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 1c0e8659d9..3be01316e8 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -310,6 +310,21 @@ test_expect_success 'describe ignoring a borken submodule' '
 	grep broken out
 '
 
+test_expect_success 'describe a blob at a tag' '
+	echo "make it a unique blob" >file &&
+	git add file && git commit -m "content in file" &&
+	git tag -a -m "latest annotated tag" unique-file &&
+	git describe HEAD:file >actual &&
+	echo "unique-file:file" >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'describe a surviving blob' '
+	git commit --allow-empty -m "empty commit" &&
+	git describe HEAD:file >actual &&
+	grep unique-file-1-g actual
+'
+
 test_expect_failure ULIMIT_STACK_SIZE 'name-rev works in a deep repo' '
 	i=1 &&
 	while test $i -lt 8000
-- 
2.15.0.rc2.443.gfcc3b81c0a


^ permalink raw reply	[relevance 11%]

* [PATCH 7/7] t6120: fix typo in test name
  2017-10-31  0:33 ` [PATCH 0/7] git-describe <blob> Stefan Beller
  2017-10-31  0:33   ` [PATCH 6/7] builtin/describe.c: describe a blob Stefan Beller
@ 2017-10-31  0:33   ` Stefan Beller
  2017-11-01  1:21     ` Junio C Hamano
  2017-10-31 21:18   ` [PATCHv2 0/7] git describe blob Stefan Beller
  2 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-10-31  0:33 UTC (permalink / raw)
  To: sbeller; +Cc: git, me, jacob.keller, Johannes.Schindelin

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 t/t6120-describe.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 3be01316e8..fd329f173a 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -304,7 +304,7 @@ test_expect_success 'describe chokes on severely broken submodules' '
 	mv .git/modules/sub1/ .git/modules/sub_moved &&
 	test_must_fail git describe --dirty
 '
-test_expect_success 'describe ignoring a borken submodule' '
+test_expect_success 'describe ignoring a broken submodule' '
 	git describe --broken >out &&
 	test_when_finished "mv .git/modules/sub_moved .git/modules/sub1" &&
 	grep broken out
-- 
2.15.0.rc2.443.gfcc3b81c0a


^ permalink raw reply	[relevance 26%]

* [PATCHv2 6/7] builtin/describe.c: describe a blob
  2017-10-31 21:18   ` [PATCHv2 0/7] git describe blob Stefan Beller
@ 2017-10-31 21:18     ` " Stefan Beller
  2017-11-01  4:11       ` Junio C Hamano
  2017-10-31 21:18     ` [PATCHv2 7/7] t6120: fix typo in test name Stefan Beller
  2017-11-02 19:41     ` [PATCHv3 0/7] git describe blob Stefan Beller
  2 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-10-31 21:18 UTC (permalink / raw)
  To: sbeller; +Cc: Johannes.Schindelin, git, jacob.keller, me

Sometimes users are given a hash of an object and they want to
identify it further (ex.: Use verify-pack to find the largest blobs,
but what are these? or [1])

"This is an interesting endeavor, because describing things is hard."
  -- me, upon writing this patch.

When describing commits, we try to anchor them to tags or refs, as these
are conceptually on a higher level than the commit. And if there is no ref
or tag that matches exactly, we're out of luck.  So we employ a heuristic
to make up a name for the commit. These names are ambivalent, there might
be different tags or refs to anchor to, and there might be different
path in the DAG to travel to arrive at the commit precisely.

When describing a blob, we want to describe the blob from a higher layer
as well, which is a tuple of (commit, deep/path) as the tree objects
involved are rather uninteresting.  The same blob can be referenced by
multiple commits, so how we decide which commit to use?  This patch
implements a rather naive approach on this: As there are no back pointers
from blobs to commits in which the blob occurs, we'll start walking from
any tips available, listing the blobs in-order of the commit and once we
found the blob, we'll take the first commit that listed the blob.  For
source code this is likely not the first commit that introduced the blob,
but rather the latest commit that contained the blob.  For example:

  git describe v0.99:Makefile
  v0.99-5-gab6625e06a:Makefile

tells us the latest commit that contained the Makefile as it was in tag
v0.99 is commit v0.99-5-gab6625e06a (and at the same path), as the next
commit on top v0.99-6-gb1de9de2b9 ([PATCH] Bootstrap "make dist",
2005-07-11) touches the Makefile.

Let's see how this description turns out, if it is useful in day-to-day
use as I have the intuition that we'd rather want to see the *first*
commit that this blob was introduced to the repository (which can be
achieved easily by giving the `--reverse` flag in the describe_blob rev
walk).

[1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob
Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/git-describe.txt | 13 ++++++++-
 builtin/describe.c             | 65 ++++++++++++++++++++++++++++++++++++++----
 t/t6120-describe.sh            | 15 ++++++++++
 3 files changed, 87 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-describe.txt b/Documentation/git-describe.txt
index c924c945ba..077c3c2193 100644
--- a/Documentation/git-describe.txt
+++ b/Documentation/git-describe.txt
@@ -3,7 +3,7 @@ git-describe(1)
 
 NAME
 ----
-git-describe - Describe a commit using the most recent tag reachable from it
+git-describe - Describe a commit or blob using the graph relations
 
 
 SYNOPSIS
@@ -11,6 +11,7 @@ SYNOPSIS
 [verse]
 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] [<commit-ish>...]
 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] --dirty[=<mark>]
+'git describe' [<options>] <blob-ish>
 
 DESCRIPTION
 -----------
@@ -24,6 +25,16 @@ By default (without --all or --tags) `git describe` only shows
 annotated tags.  For more information about creating annotated tags
 see the -a and -s options to linkgit:git-tag[1].
 
+If the given object refers to a blob, it will be described
+as `<commit-ish>:<path>`, such that the blob can be found
+at `<path>` in the `<commit-ish>`. Note, that the commit is likely
+not the commit that introduced the blob, but the one that was found
+first; to find the commit that introduced the blob, you need to find
+the commit that last touched the path, e.g.
+`git log <commit-description> -- <path>`.
+As blobs do not point at the commits they are contained in,
+describing blobs is slow as we have to walk the whole graph.
+
 OPTIONS
 -------
 <commit-ish>...::
diff --git a/builtin/describe.c b/builtin/describe.c
index 9e9a5ed5d4..382cbae908 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -3,6 +3,7 @@
 #include "lockfile.h"
 #include "commit.h"
 #include "tag.h"
+#include "blob.h"
 #include "refs.h"
 #include "builtin.h"
 #include "exec_cmd.h"
@@ -11,8 +12,9 @@
 #include "hashmap.h"
 #include "argv-array.h"
 #include "run-command.h"
+#include "revision.h"
+#include "list-objects.h"
 
-#define SEEN		(1u << 0)
 #define MAX_TAGS	(FLAG_BITS - 1)
 
 static const char * const describe_usage[] = {
@@ -434,6 +436,54 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst)
 		strbuf_addstr(dst, suffix);
 }
 
+struct process_commit_data {
+	struct object_id current_commit;
+	struct object_id looking_for;
+	struct strbuf *dst;
+};
+
+static void process_commit(struct commit *commit, void *data)
+{
+	struct process_commit_data *pcd = data;
+	pcd->current_commit = commit->object.oid;
+}
+
+static void process_object(struct object *obj, const char *path, void *data)
+{
+	struct process_commit_data *pcd = data;
+
+	if (!oidcmp(&pcd->looking_for, &obj->oid) && !pcd->dst->len) {
+		reset_revision_walk();
+		describe_commit(&pcd->current_commit, pcd->dst);
+		strbuf_addf(pcd->dst, ":%s", path);
+	}
+}
+
+static void describe_blob(struct object_id oid, struct strbuf *dst)
+{
+	struct rev_info revs;
+	struct argv_array args = ARGV_ARRAY_INIT;
+	struct process_commit_data pcd = { null_oid, oid, dst};
+
+	argv_array_pushl(&args, "internal: The first arg is not parsed",
+		"--all", "--reflog", /* as many starting points as possible */
+		/* NEEDSWORK: --all is incompatible with worktrees for now: */
+		"--single-worktree",
+		"--objects",
+		"--in-commit-order",
+		NULL);
+
+	init_revisions(&revs, NULL);
+	if (setup_revisions(args.argc, args.argv, &revs, NULL) > 1)
+		BUG("setup_revisions could not handle all args?");
+
+	if (prepare_revision_walk(&revs))
+		die("revision walk setup failed");
+
+	traverse_commit_list(&revs, process_commit, process_object, &pcd);
+	reset_revision_walk();
+}
+
 static void describe(const char *arg, int last_one)
 {
 	struct object_id oid;
@@ -445,11 +495,16 @@ static void describe(const char *arg, int last_one)
 
 	if (get_oid(arg, &oid))
 		die(_("Not a valid object name %s"), arg);
-	cmit = lookup_commit_reference(&oid);
-	if (!cmit)
-		die(_("%s is not a valid '%s' object"), arg, commit_type);
+	cmit = lookup_commit_reference_gently(&oid, 1);
 
-	describe_commit(&oid, &sb);
+	if (cmit) {
+		describe_commit(&oid, &sb);
+	} else {
+		if (lookup_blob(&oid))
+			describe_blob(oid, &sb);
+		else
+			die(_("%s is neither a commit nor blob"), arg);
+	}
 
 	puts(sb.buf);
 
diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 1c0e8659d9..3be01316e8 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -310,6 +310,21 @@ test_expect_success 'describe ignoring a borken submodule' '
 	grep broken out
 '
 
+test_expect_success 'describe a blob at a tag' '
+	echo "make it a unique blob" >file &&
+	git add file && git commit -m "content in file" &&
+	git tag -a -m "latest annotated tag" unique-file &&
+	git describe HEAD:file >actual &&
+	echo "unique-file:file" >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'describe a surviving blob' '
+	git commit --allow-empty -m "empty commit" &&
+	git describe HEAD:file >actual &&
+	grep unique-file-1-g actual
+'
+
 test_expect_failure ULIMIT_STACK_SIZE 'name-rev works in a deep repo' '
 	i=1 &&
 	while test $i -lt 8000
-- 
2.15.0.rc2.443.gfcc3b81c0a


^ permalink raw reply	[relevance 11%]

* [PATCHv2 7/7] t6120: fix typo in test name
  2017-10-31 21:18   ` [PATCHv2 0/7] git describe blob Stefan Beller
  2017-10-31 21:18     ` [PATCHv2 6/7] builtin/describe.c: describe a " Stefan Beller
@ 2017-10-31 21:18     ` Stefan Beller
  2017-11-02 19:41     ` [PATCHv3 0/7] git describe blob Stefan Beller
  2 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-10-31 21:18 UTC (permalink / raw)
  To: sbeller; +Cc: Johannes.Schindelin, git, jacob.keller, me

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 t/t6120-describe.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 3be01316e8..fd329f173a 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -304,7 +304,7 @@ test_expect_success 'describe chokes on severely broken submodules' '
 	mv .git/modules/sub1/ .git/modules/sub_moved &&
 	test_must_fail git describe --dirty
 '
-test_expect_success 'describe ignoring a borken submodule' '
+test_expect_success 'describe ignoring a broken submodule' '
 	git describe --broken >out &&
 	test_when_finished "mv .git/modules/sub_moved .git/modules/sub1" &&
 	grep broken out
-- 
2.15.0.rc2.443.gfcc3b81c0a


^ permalink raw reply	[relevance 26%]

* Re: [PATCH 7/7] t6120: fix typo in test name
  2017-10-31  0:33   ` [PATCH 7/7] t6120: fix typo in test name Stefan Beller
@ 2017-11-01  1:21     ` Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2017-11-01  1:21 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, me, jacob.keller, Johannes.Schindelin

Stefan Beller <sbeller@google.com> writes:

> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>  t/t6120-describe.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Good.  I am guessing that you are sending this as the last/optional
one because this was found _after_ you worked on other parts of the
series, but I think it is easier to reason about if this were marked
as a preliminary clean-up and moved to the front of the series.

Thanks.

>
> diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
> index 3be01316e8..fd329f173a 100755
> --- a/t/t6120-describe.sh
> +++ b/t/t6120-describe.sh
> @@ -304,7 +304,7 @@ test_expect_success 'describe chokes on severely broken submodules' '
>  	mv .git/modules/sub1/ .git/modules/sub_moved &&
>  	test_must_fail git describe --dirty
>  '
> -test_expect_success 'describe ignoring a borken submodule' '
> +test_expect_success 'describe ignoring a broken submodule' '
>  	git describe --broken >out &&
>  	test_when_finished "mv .git/modules/sub_moved .git/modules/sub1" &&
>  	grep broken out

^ permalink raw reply	[relevance 8%]

* Re: [PATCHv2 6/7] builtin/describe.c: describe a blob
  2017-10-31 21:18     ` [PATCHv2 6/7] builtin/describe.c: describe a " Stefan Beller
@ 2017-11-01  4:11       ` Junio C Hamano
  2017-11-01 21:28         ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Junio C Hamano @ 2017-11-01  4:11 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Johannes.Schindelin, git, jacob.keller, me

Stefan Beller <sbeller@google.com> writes:

> +If the given object refers to a blob, it will be described
> +as `<commit-ish>:<path>`, such that the blob can be found
> +at `<path>` in the `<commit-ish>`. Note, that the commit is likely

Does the code describe a9dbc3f12c as v2.15.0:GIT-VERSION-GEN, or
would it always be <commit>:<path>?

> +not the commit that introduced the blob, but the one that was found
> +first; to find the commit that introduced the blob, you need to find

Because we do not want to descend into the same tree object we saw
earlier, this "we show the name we happened to find first without
attempting to refine it for a better name" is a rather fundamental
limitation of this approach.  Hopefully we can later improve it with
more thought, but for now it is better than nothing (and much better
than "git log --raw | grep").

> +the commit that last touched the path, e.g.
> +`git log <commit-description> -- <path>`.
> +As blobs do not point at the commits they are contained in,
> +describing blobs is slow as we have to walk the whole graph.

Is it true that we walk the "whole graph", or do we stop immediately
we find a path to the blob?  The former makes it sound like
completely useless.

> -#define SEEN		(1u << 0)

Interesting.  Now we include revision.h this becomes redundant.

> +static void describe_blob(struct object_id oid, struct strbuf *dst)
> +{
> +	struct rev_info revs;
> +	struct argv_array args = ARGV_ARRAY_INIT;
> +	struct process_commit_data pcd = { null_oid, oid, dst};
> +
> +	argv_array_pushl(&args, "internal: The first arg is not parsed",
> +		"--all", "--reflog", /* as many starting points as possible */

Interesting.  

Do we also search in the reflog in the normal "describe" operation?
If not, perhaps we should?  I wonder what's the performance
implications would be.

That tangent aside, as "describe blob" is most likely a "what
reaches and is holding onto this blob?" query, finding something
that can only be reached from a reflog entry would make it more
helpful than without the option.

> +		/* NEEDSWORK: --all is incompatible with worktrees for now: */

What's that last colon about?

> +		"--single-worktree",
> +		"--objects",
> +		"--in-commit-order",
> +		NULL);
> +
> +	init_revisions(&revs, NULL);
> +	if (setup_revisions(args.argc, args.argv, &revs, NULL) > 1)
> +		BUG("setup_revisions could not handle all args?");
> +
> +	if (prepare_revision_walk(&revs))
> +		die("revision walk setup failed");
> +
> +	traverse_commit_list(&revs, process_commit, process_object, &pcd);
> +	reset_revision_walk();
> +}
> +

OK.

> diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
> index 1c0e8659d9..3be01316e8 100755
> --- a/t/t6120-describe.sh
> +++ b/t/t6120-describe.sh
> @@ -310,6 +310,21 @@ test_expect_success 'describe ignoring a borken submodule' '
>  	grep broken out
>  '
>  
> +test_expect_success 'describe a blob at a tag' '
> +	echo "make it a unique blob" >file &&
> +	git add file && git commit -m "content in file" &&
> +	git tag -a -m "latest annotated tag" unique-file &&
> +	git describe HEAD:file >actual &&
> +	echo "unique-file:file" >expect &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'describe a surviving blob' '
> +	git commit --allow-empty -m "empty commit" &&
> +	git describe HEAD:file >actual &&
> +	grep unique-file-1-g actual
> +'
> +

I am not sure what "surviving" means in this context.  The word
hinted that the test would be finding a blob that only appears in a
commit that only appears as a reflog entry, but that wasn't the
case, which was a bit disappointing.




^ permalink raw reply	[relevance 7%]

* Re: [PATCHv2 6/7] builtin/describe.c: describe a blob
  2017-11-01  4:11       ` Junio C Hamano
@ 2017-11-01 21:28         ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-11-01 21:28 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Schindelin, git, Jacob Keller, Kevin Daudt

On Tue, Oct 31, 2017 at 9:11 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> +If the given object refers to a blob, it will be described
>> +as `<commit-ish>:<path>`, such that the blob can be found
>> +at `<path>` in the `<commit-ish>`. Note, that the commit is likely
>
> Does the code describe a9dbc3f12c as v2.15.0:GIT-VERSION-GEN, or
> would it always be <commit>:<path>?
>
>> +not the commit that introduced the blob, but the one that was found
>> +first; to find the commit that introduced the blob, you need to find
>
> Because we do not want to descend into the same tree object we saw
> earlier, this "we show the name we happened to find first without
> attempting to refine it for a better name" is a rather fundamental
> limitation of this approach.  Hopefully we can later improve it with
> more thought, but for now it is better than nothing (and much better
> than "git log --raw | grep").

ok.


>
>> +the commit that last touched the path, e.g.
>> +`git log <commit-description> -- <path>`.
>> +As blobs do not point at the commits they are contained in,
>> +describing blobs is slow as we have to walk the whole graph.
>
> Is it true that we walk the "whole graph", or do we stop immediately
> we find a path to the blob?  The former makes it sound like
> completely useless.

Unfortunately we walk the whole graph, as I have not figured out how to
stop the walking from the callback in 'traverse_commit_list'.

I assume I have to modify the struct rev_info that we operate on,
clearing any pending commits?

>
>> -#define SEEN         (1u << 0)
>
> Interesting.  Now we include revision.h this becomes redundant.
>

Correct. In a way this small part is a revert of 8713ab3079
(Improve git-describe performance by reducing revision listing., 2007-01-13)


>> +     argv_array_pushl(&args, "internal: The first arg is not parsed",
>> +             "--all", "--reflog", /* as many starting points as possible */
>
> Interesting.
>
> Do we also search in the reflog in the normal "describe" operation?
> If not, perhaps we should?  I wonder what's the performance
> implications would be.

"normal" git describe doesn't need to walk the whole graph
as we only walk down from the given commit-ish until a land mark
is found.

For --contains, this might be an interesting though, as there we
also have to "walk backwards without pointers to follow".

>
> That tangent aside, as "describe blob" is most likely a "what
> reaches and is holding onto this blob?" query, finding something
> that can only be reached from a reflog entry would make it more
> helpful than without the option.

Yeah that is my reasoning as well.

>
>> +             /* NEEDSWORK: --all is incompatible with worktrees for now: */
>
> What's that last colon about?

will replace with a dot, it ought to hint at the line that follows,
the --single-worktree flag.

>
>> +             "--single-worktree",
>> +             "--objects",
>> +             "--in-commit-order",
>> +             NULL);
>> +
>> +     init_revisions(&revs, NULL);
>> +     if (setup_revisions(args.argc, args.argv, &revs, NULL) > 1)
>> +             BUG("setup_revisions could not handle all args?");
>> +
>> +     if (prepare_revision_walk(&revs))
>> +             die("revision walk setup failed");
>> +
>> +     traverse_commit_list(&revs, process_commit, process_object, &pcd);
>> +     reset_revision_walk();
>> +}
>> +
>
> OK.
>
>> diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
>> index 1c0e8659d9..3be01316e8 100755
>> --- a/t/t6120-describe.sh
>> +++ b/t/t6120-describe.sh
>> @@ -310,6 +310,21 @@ test_expect_success 'describe ignoring a borken submodule' '
>>       grep broken out
>>  '
>>
>> +test_expect_success 'describe a blob at a tag' '
>> +     echo "make it a unique blob" >file &&
>> +     git add file && git commit -m "content in file" &&
>> +     git tag -a -m "latest annotated tag" unique-file &&
>> +     git describe HEAD:file >actual &&
>> +     echo "unique-file:file" >expect &&
>> +     test_cmp expect actual
>> +'
>> +
>> +test_expect_success 'describe a surviving blob' '
>> +     git commit --allow-empty -m "empty commit" &&
>> +     git describe HEAD:file >actual &&
>> +     grep unique-file-1-g actual
>> +'
>> +
>
> I am not sure what "surviving" means in this context.

Maybe "unchanged", "still kept around" ?


> The word
> hinted that the test would be finding a blob that only appears in a
> commit that only appears as a reflog entry, but that wasn't the
> case, which was a bit disappointing.

oh!

^ permalink raw reply	[relevance 8%]

* Re: Regression[2.14.3->2.15]: Interactive rebase fails if submodule is modified
  2017-11-02  8:30 Regression[2.14.3->2.15]: Interactive rebase fails if submodule is modified Orgad Shaneh
@ 2017-11-02 18:34 ` Stefan Beller
  2017-11-02 21:45   ` Orgad Shaneh
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-11-02 18:34 UTC (permalink / raw)
  To: Orgad Shaneh, Johannes Schindelin, Brandon Williams, Heiko Voigt; +Cc: git

On Thu, Nov 2, 2017 at 1:30 AM, Orgad Shaneh <orgads@gmail.com> wrote:
> I can't reproduce this with a minimal example, but it happens in my project.
>
> What I tried to do for reproducing is:
> rm -rf super sub
> mkdir sub; cd sub; git init
> git commit --allow-empty -m 'Initial commit'
> mkdir ../super; cd ../super
> git init
> git submodule add ../sub
> touch foo; git add foo sub
> git commit -m 'Initial commit'
> touch a; git add a; git commit -m 'a'
> touch b; git add b; git commit -m 'b'
> cd sub; git commit --allow-empty -m 'New commit'; cd ..
> git rebase -i HEAD^^
>
> Then drop a.
>
> In my project I get:
> error: cannot rebase: You have unstaged changes.
>
> This works fine with 2.14.3.

  git log --oneline v2.14.3..v2.15.0 -- submodule.c
doesn't give any promising hints (i.e. I don't think one of a
submodule related series introduced this either by chance or
on purpose)

"rebase -i" was rewritten into C in 570676e011, though
that series was extensively tested by DScho, so I wouldn't
want to point fingers here quickly.

Would you be willing to bisect this behavior?

^ permalink raw reply	[relevance 24%]

* [PATCHv3 1/7] t6120: fix typo in test name
  2017-11-02 19:41     ` [PATCHv3 0/7] git describe blob Stefan Beller
@ 2017-11-02 19:41       ` Stefan Beller
  2017-11-02 19:41       ` [PATCHv3 7/7] builtin/describe.c: describe a blob Stefan Beller
  1 sibling, 0 replies; 200+ results
From: Stefan Beller @ 2017-11-02 19:41 UTC (permalink / raw)
  To: sbeller; +Cc: Johannes.Schindelin, git, jacob.keller, me, schwab

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 t/t6120-describe.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 1c0e8659d9..c8b7ed82d9 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -304,7 +304,7 @@ test_expect_success 'describe chokes on severely broken submodules' '
 	mv .git/modules/sub1/ .git/modules/sub_moved &&
 	test_must_fail git describe --dirty
 '
-test_expect_success 'describe ignoring a borken submodule' '
+test_expect_success 'describe ignoring a broken submodule' '
 	git describe --broken >out &&
 	test_when_finished "mv .git/modules/sub_moved .git/modules/sub1" &&
 	grep broken out
-- 
2.15.0.7.g980e40477f


^ permalink raw reply	[relevance 26%]

* [PATCHv3 7/7] builtin/describe.c: describe a blob
  2017-11-02 19:41     ` [PATCHv3 0/7] git describe blob Stefan Beller
  2017-11-02 19:41       ` [PATCHv3 1/7] t6120: fix typo in test name Stefan Beller
@ 2017-11-02 19:41       ` Stefan Beller
  1 sibling, 0 replies; 200+ results
From: Stefan Beller @ 2017-11-02 19:41 UTC (permalink / raw)
  To: sbeller; +Cc: Johannes.Schindelin, git, jacob.keller, me, schwab

Sometimes users are given a hash of an object and they want to
identify it further (ex.: Use verify-pack to find the largest blobs,
but what are these? or [1])

"This is an interesting endeavor, because describing things is hard."
  -- me, upon writing this patch.

When describing commits, we try to anchor them to tags or refs, as these
are conceptually on a higher level than the commit. And if there is no ref
or tag that matches exactly, we're out of luck.  So we employ a heuristic
to make up a name for the commit. These names are ambiguous, there might
be different tags or refs to anchor to, and there might be different
path in the DAG to travel to arrive at the commit precisely.

When describing a blob, we want to describe the blob from a higher layer
as well, which is a tuple of (commit, deep/path) as the tree objects
involved are rather uninteresting.  The same blob can be referenced by
multiple commits, so how we decide which commit to use?  This patch
implements a rather naive approach on this: As there are no back pointers
from blobs to commits in which the blob occurs, we'll start walking from
any tips available, listing the blobs in-order of the commit and once we
found the blob, we'll take the first commit that listed the blob.  For
source code this is likely not the first commit that introduced the blob,
but rather the latest commit that contained the blob.  For example:

  git describe v0.99:Makefile
  v0.99-5-gab6625e06a:Makefile

tells us the latest commit that contained the Makefile as it was in tag
v0.99 is commit v0.99-5-gab6625e06a (and at the same path), as the next
commit on top v0.99-6-gb1de9de2b9 ([PATCH] Bootstrap "make dist",
2005-07-11) touches the Makefile.

Let's see how this description turns out, if it is useful in day-to-day
use as I have the intuition that we'd rather want to see the *first*
commit that this blob was introduced to the repository (which can be
achieved easily by giving the `--reverse` flag in the describe_blob rev
walk).

[1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/git-describe.txt | 13 ++++++++-
 builtin/describe.c             | 65 ++++++++++++++++++++++++++++++++++++++----
 t/t6120-describe.sh            | 15 ++++++++++
 3 files changed, 87 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-describe.txt b/Documentation/git-describe.txt
index c924c945ba..79ec0be62a 100644
--- a/Documentation/git-describe.txt
+++ b/Documentation/git-describe.txt
@@ -3,7 +3,7 @@ git-describe(1)
 
 NAME
 ----
-git-describe - Describe a commit using the most recent tag reachable from it
+git-describe - Describe a commit or blob using the graph relations
 
 
 SYNOPSIS
@@ -11,6 +11,7 @@ SYNOPSIS
 [verse]
 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] [<commit-ish>...]
 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] --dirty[=<mark>]
+'git describe' [<options>] <blob>
 
 DESCRIPTION
 -----------
@@ -24,6 +25,16 @@ By default (without --all or --tags) `git describe` only shows
 annotated tags.  For more information about creating annotated tags
 see the -a and -s options to linkgit:git-tag[1].
 
+If the given object refers to a blob, it will be described
+as `<commit-ish>:<path>`, such that the blob can be found
+at `<path>` in the `<commit-ish>`. Note, that the commit is likely
+not the commit that introduced the blob, but the one that was found
+first; to find the commit that introduced the blob, you need to find
+the commit that last touched the path, e.g.
+`git log <commit-description> -- <path>`.
+As blobs do not point at the commits they are contained in,
+describing blobs is slow as we have to walk the whole graph.
+
 OPTIONS
 -------
 <commit-ish>...::
diff --git a/builtin/describe.c b/builtin/describe.c
index 9e9a5ed5d4..cf08bef344 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -3,6 +3,7 @@
 #include "lockfile.h"
 #include "commit.h"
 #include "tag.h"
+#include "blob.h"
 #include "refs.h"
 #include "builtin.h"
 #include "exec_cmd.h"
@@ -11,8 +12,9 @@
 #include "hashmap.h"
 #include "argv-array.h"
 #include "run-command.h"
+#include "revision.h"
+#include "list-objects.h"
 
-#define SEEN		(1u << 0)
 #define MAX_TAGS	(FLAG_BITS - 1)
 
 static const char * const describe_usage[] = {
@@ -434,6 +436,56 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst)
 		strbuf_addstr(dst, suffix);
 }
 
+struct process_commit_data {
+	struct object_id current_commit;
+	struct object_id looking_for;
+	struct strbuf *dst;
+	struct rev_info *revs;
+};
+
+static void process_commit(struct commit *commit, void *data)
+{
+	struct process_commit_data *pcd = data;
+	pcd->current_commit = commit->object.oid;
+}
+
+static void process_object(struct object *obj, const char *path, void *data)
+{
+	struct process_commit_data *pcd = data;
+
+	if (!oidcmp(&pcd->looking_for, &obj->oid) && !pcd->dst->len) {
+		reset_revision_walk();
+		describe_commit(&pcd->current_commit, pcd->dst);
+		strbuf_addf(pcd->dst, ":%s", path);
+		pcd->revs->max_count = 0;
+	}
+}
+
+static void describe_blob(struct object_id oid, struct strbuf *dst)
+{
+	struct rev_info revs;
+	struct argv_array args = ARGV_ARRAY_INIT;
+	struct process_commit_data pcd = { null_oid, oid, dst, &revs};
+
+	argv_array_pushl(&args, "internal: The first arg is not parsed",
+		"--all", "--reflog", /* as many starting points as possible */
+		/* NEEDSWORK: --all is incompatible with worktrees for now: */
+		"--single-worktree",
+		"--objects",
+		"--in-commit-order",
+		NULL);
+
+	init_revisions(&revs, NULL);
+	if (setup_revisions(args.argc, args.argv, &revs, NULL) > 1)
+		BUG("setup_revisions could not handle all args?");
+
+	if (prepare_revision_walk(&revs))
+		die("revision walk setup failed");
+
+	traverse_commit_list(&revs, process_commit, process_object, &pcd);
+	reset_revision_walk();
+}
+
 static void describe(const char *arg, int last_one)
 {
 	struct object_id oid;
@@ -445,11 +497,14 @@ static void describe(const char *arg, int last_one)
 
 	if (get_oid(arg, &oid))
 		die(_("Not a valid object name %s"), arg);
-	cmit = lookup_commit_reference(&oid);
-	if (!cmit)
-		die(_("%s is not a valid '%s' object"), arg, commit_type);
+	cmit = lookup_commit_reference_gently(&oid, 1);
 
-	describe_commit(&oid, &sb);
+	if (cmit)
+		describe_commit(&oid, &sb);
+	else if (lookup_blob(&oid))
+		describe_blob(oid, &sb);
+	else
+		die(_("%s is neither a commit nor blob"), arg);
 
 	puts(sb.buf);
 
diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index c8b7ed82d9..aec6ed192d 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -310,6 +310,21 @@ test_expect_success 'describe ignoring a broken submodule' '
 	grep broken out
 '
 
+test_expect_success 'describe a blob at a tag' '
+	echo "make it a unique blob" >file &&
+	git add file && git commit -m "content in file" &&
+	git tag -a -m "latest annotated tag" unique-file &&
+	git describe HEAD:file >actual &&
+	echo "unique-file:file" >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'describe a blob with commit-ish' '
+	git commit --allow-empty -m "empty commit" &&
+	git describe HEAD:file >actual &&
+	grep unique-file-1-g actual
+'
+
 test_expect_failure ULIMIT_STACK_SIZE 'name-rev works in a deep repo' '
 	i=1 &&
 	while test $i -lt 8000
-- 
2.15.0.7.g980e40477f


^ permalink raw reply	[relevance 11%]

* Re: No log --no-decorate completion?
  2017-10-24 15:32       ` Stefan Beller
@ 2017-11-02 20:25         ` Max Rothman
  2017-11-02 21:19           ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Max Rothman @ 2017-11-02 20:25 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

No problem! Let me know if I've done something wrong with this patch,
I'm new to git's contributor process.

completion: add missing completions for log, diff, show

* Add bash completion for the missing --no-* options on git log
* Add bash completion for --textconv and --indent-heuristic families to
  git diff and all commands that use diff's options
* Add bash completion for --no-abbrev-commit, --expand-tabs, and
  --no-expand-tabs to git show
---
 contrib/completion/git-completion.bash | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/contrib/completion/git-completion.bash
b/contrib/completion/git-completion.bash
index 0e16f017a4144..252a6c8c0c80c 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -1412,6 +1412,8 @@ __git_diff_common_options="--stat --numstat
--shortstat --summary
                        --dirstat-by-file= --cumulative
                        --diff-algorithm=
                        --submodule --submodule=
+                       --indent-heuristic --no-indent-heuristic
+                       --textconv --no-textconv
 "

 _git_diff ()
@@ -1752,6 +1754,10 @@ _git_log ()
                __gitcomp "$__git_diff_submodule_formats" ""
"${cur##--submodule=}"
                return
                ;;
+       --no-walk=*)
+               __gitcomp "sorted unsorted" "" "${cur##--no-walk=}"
+               return
+               ;;
        --*)
                __gitcomp "
                        $__git_log_common_options
@@ -1759,16 +1765,19 @@ _git_log ()
                        $__git_log_gitk_options
                        --root --topo-order --date-order --reverse
                        --follow --full-diff
-                       --abbrev-commit --abbrev=
+                       --abbrev-commit --no-abbrev-commit --abbrev=
                        --relative-date --date=
                        --pretty= --format= --oneline
                        --show-signature
                        --cherry-mark
                        --cherry-pick
                        --graph
-                       --decorate --decorate=
+                       --decorate --decorate= --no-decorate
                        --walk-reflogs
+                       --no-walk --no-walk= --do-walk
                        --parents --children
+                       --expand-tabs --expand-tabs= --no-expand-tabs
+                       --patch
                        $merge
                        $__git_diff_common_options
                        --pickaxe-all --pickaxe-regex
@@ -2816,8 +2825,9 @@ _git_show ()
                return
                ;;
        --*)
-               __gitcomp "--pretty= --format= --abbrev-commit --oneline
-                       --show-signature
+               __gitcomp "--pretty= --format= --abbrev-commit
--no-abbrev-commit
+                       --oneline --show-signature --patch
+                       --expand-tabs --expand-tabs= --no-expand-tabs
                        $__git_diff_common_options
                        "
                return

On Tue, Oct 24, 2017 at 11:32 AM, Stefan Beller <sbeller@google.com> wrote:
> On Tue, Oct 24, 2017 at 8:15 AM, Max Rothman <max.r.rothman@gmail.com> wrote:
>> Just re-discovered this in my inbox. So is this worth fixing? I could
>> (probably) figure out a patch.
>>
>> Thanks,
>> Max
>>
>
> $ git log -- <TAB>
> Display all 100 possibilities? (y or n)
>
> I guess adding one more option is no big deal, so go for it!
> We also have a couple --no-options already (as you said earlier),
> which I think makes it even more viable.
>
> Tangent:
> I wonder if we can cascade the completion by common
> prefixes, e.g. --no- or --ignore- occurs a couple of times,
> such that we could only show these --no- once and only
> if you try autocompleting thereafter you get all --no- options.
>
> Thanks for reviving this thread!
> Stefan

^ permalink raw reply	[relevance 16%]

* Re: No log --no-decorate completion?
  2017-11-02 20:25         ` Max Rothman
@ 2017-11-02 21:19           ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-11-02 21:19 UTC (permalink / raw)
  To: Max Rothman; +Cc: git

On Thu, Nov 2, 2017 at 1:25 PM, Max Rothman <max.r.rothman@gmail.com> wrote:
> No problem! Let me know if I've done something wrong with this patch,
> I'm new to git's contributor process.

Thanks for coming up with a patch!
Yeah, the contribution process has some initial road blocks;
I'll point them out below.

> completion: add missing completions for log, diff, show

I would think this is a good commit subject as it describes
the changes made superficially

> * Add bash completion for the missing --no-* options on git log
> * Add bash completion for --textconv and --indent-heuristic families to
>   git diff and all commands that use diff's options
> * Add bash completion for --no-abbrev-commit, --expand-tabs, and
>   --no-expand-tabs to git show

This describes what happens in this patch, but not why, which helps
future readers of commit message more than the "what".
I'm also just guessing but maybe:

    A user might have configured a repo to show diffs in a certain way,
    add the --no-* options to the autocompletion to override it easier
    from the command line.
    New in this patch is autocompletion for --textconv as well as abbreviation
    options as that needs to be flipped all the time in <example workflow>.

I wonder if we really want to expose indent-heuristic,
I guess by not having it experimental any more it makes sense to do so.
c.f. https://public-inbox.org/git/20171029151228.607834-1-cmn@dwim.me/

At the end of a commit message, the Git project requires a sign off.
(See section (5) in Documentation/SubmittingPatches;
tl;dr: add Signed-off-by: NAME <EMAIL> if you can agree to
https://developercertificate.org/)

> ---
>  contrib/completion/git-completion.bash | 18 ++++++++++++++----
>  1 file changed, 14 insertions(+), 4 deletions(-)
>
> diff --git a/contrib/completion/git-completion.bash
> b/contrib/completion/git-completion.bash
> index 0e16f017a4144..252a6c8c0c80c 100644
> --- a/contrib/completion/git-completion.bash
> +++ b/contrib/completion/git-completion.bash
> @@ -1412,6 +1412,8 @@ __git_diff_common_options="--stat --numstat
> --shortstat --summary
>                         --dirstat-by-file= --cumulative
>                         --diff-algorithm=
>                         --submodule --submodule=
> +                       --indent-heuristic --no-indent-heuristic
> +                       --textconv --no-textconv
>  "
>
>  _git_diff ()
> @@ -1752,6 +1754,10 @@ _git_log ()
>                 __gitcomp "$__git_diff_submodule_formats" ""
> "${cur##--submodule=}"
>                 return
>                 ;;
> +       --no-walk=*)
> +               __gitcomp "sorted unsorted" "" "${cur##--no-walk=}"
> +               return
> +               ;;
>         --*)
>                 __gitcomp "
>                         $__git_log_common_options
> @@ -1759,16 +1765,19 @@ _git_log ()
>                         $__git_log_gitk_options
>                         --root --topo-order --date-order --reverse
>                         --follow --full-diff
> -                       --abbrev-commit --abbrev=
> +                       --abbrev-commit --no-abbrev-commit --abbrev=
>                         --relative-date --date=
>                         --pretty= --format= --oneline
>                         --show-signature
>                         --cherry-mark
>                         --cherry-pick
>                         --graph
> -                       --decorate --decorate=
> +                       --decorate --decorate= --no-decorate
>                         --walk-reflogs
> +                       --no-walk --no-walk= --do-walk
>                         --parents --children
> +                       --expand-tabs --expand-tabs= --no-expand-tabs
> +                       --patch
>                         $merge
>                         $__git_diff_common_options
>                         --pickaxe-all --pickaxe-regex
> @@ -2816,8 +2825,9 @@ _git_show ()
>                 return
>                 ;;
>         --*)
> -               __gitcomp "--pretty= --format= --abbrev-commit --oneline
> -                       --show-signature
> +               __gitcomp "--pretty= --format= --abbrev-commit
> --no-abbrev-commit
> +                       --oneline --show-signature --patch
> +                       --expand-tabs --expand-tabs= --no-expand-tabs
>                         $__git_diff_common_options
>                         "
>                 return
>

The patch looks good, but doesn't apply because the email contains
white spaces instead of tabs. Maybe try https://submitgit.herokuapp.com/
(or fix/change your email client to send a patch; the gmail web interface
doesn't work. I personally got 'git send-email' up and running;
The Documentation/SubmittingPatches has a section on email clients, too)

Thanks,
Stefan

^ permalink raw reply	[relevance 8%]

* Re: Regression[2.14.3->2.15]: Interactive rebase fails if submodule is modified
  2017-11-02 18:34 ` Stefan Beller
@ 2017-11-02 21:45   ` Orgad Shaneh
  0 siblings, 0 replies; 200+ results
From: Orgad Shaneh @ 2017-11-02 21:45 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Johannes Schindelin, Brandon Williams, Heiko Voigt, git

On Thu, Nov 2, 2017 at 8:34 PM, Stefan Beller <sbeller@google.com> wrote:
> On Thu, Nov 2, 2017 at 1:30 AM, Orgad Shaneh <orgads@gmail.com> wrote:
>> I can't reproduce this with a minimal example, but it happens in my project.
>>
>> What I tried to do for reproducing is:
>> rm -rf super sub
>> mkdir sub; cd sub; git init
>> git commit --allow-empty -m 'Initial commit'
>> mkdir ../super; cd ../super
>> git init
>> git submodule add ../sub
>> touch foo; git add foo sub
>> git commit -m 'Initial commit'
>> touch a; git add a; git commit -m 'a'
>> touch b; git add b; git commit -m 'b'
>> cd sub; git commit --allow-empty -m 'New commit'; cd ..
>> git rebase -i HEAD^^
>>
>> Then drop a.
>>
>> In my project I get:
>> error: cannot rebase: You have unstaged changes.
>>
>> This works fine with 2.14.3.
>
>   git log --oneline v2.14.3..v2.15.0 -- submodule.c
> doesn't give any promising hints (i.e. I don't think one of a
> submodule related series introduced this either by chance or
> on purpose)
>
> "rebase -i" was rewritten into C in 570676e011, though
> that series was extensively tested by DScho, so I wouldn't
> want to point fingers here quickly.
>
> Would you be willing to bisect this behavior?

Bisected to ff6f1f564c48def1f8e1852826bab58af5044b06:
submodule-config: lazy-load a repository's .gitmodules file

- Orgad

^ permalink raw reply	[relevance 22%]

* Re: Bug report: git reset --hard does not fix submodule worktree
  2017-11-04  0:46 Bug report: git reset --hard does not fix submodule worktree Billy O'Neal (VC LIBS)
@ 2017-11-06 20:50 ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-11-06 20:50 UTC (permalink / raw)
  To: Billy O'Neal (VC LIBS); +Cc: git, Stephan T. Lavavej, Cody Miller

On Fri, Nov 3, 2017 at 5:46 PM, Billy O'Neal (VC LIBS)
<bion@microsoft.com> wrote:
> Hello, Git folks. I managed to accidentally break the our test lab by attempting to git mv a directory with a submodule inside. It seems like if a reset does an undo on a mv, the workfold entry should be fixed to put the submodule in its old location. Consider the following sequence of commands:
>
> Setup a git repo with a submodule:
> mkdir metaproject
> mkdir upstream
> cd metaproject
> git init
> cd ..\upstream
> git init
> echo hello > test.txt
> git add -A
> git commit -m "an example commit"
> cd ..\metapoject
> mkdir start
> git submodule add ../upstream start/upstream
> git add -A
> git commit -m "Add submodule in start/upstream."
>
> Move the directory containing the submodule:
> git mv start target
> git add -A
> git commit -m "Moved submodule parent directory."
>
> Check that the worktree got correctly fixed by git mv; this output is as expected:
> type .git\modules\start\upstream\config
> [core]
>         repositoryformatversion = 0
>         filemode = false
>         bare = false
>         logallrefupdates = true
>         symlinks = false
>         ignorecase = true
>         worktree = ../../../../target/upstream
> [remote "origin"]
>         url = C:/Users/bion/Desktop/upstream
>         fetch = +refs/heads/*:refs/remotes/origin/*
> [branch "master"]
>         remote = origin
>         merge = refs/heads/master
>
> Now try to go back to the previous commit using git reset --hard:
> git log --oneline
>  1f560be (HEAD -> master) Moved submodule parent directory.
>  a5977ce Add submodule in start/upstream.
> git reset --hard a5977ce
>  warning: unable to rmdir target/upstream: Directory not empty
>  HEAD is now at a5977ce Add submodule in start/upstream.
>
> Check that the worktree got fixed correctly; it did not:
> type .git\modules\start\upstream\config
> [core]
>         repositoryformatversion = 0
>         filemode = false
>         bare = false
>         logallrefupdates = true
>         symlinks = false
>         ignorecase = true
>         worktree = ../../../../target/upstream
> [remote "origin"]
>         url = C:/Users/bion/Desktop/upstream
>         fetch = +refs/heads/*:refs/remotes/origin/*
> [branch "master"]
>         remote = origin
>         merge = refs/heads/master
>
> Is git reset intended to put the submodule in the right place?
>
> Thanks folks!

git-reset sounds like it ought to put submodules back into another directory.
Historically git-reset did not touch submodules at all; a recent
release introduced
the --recurse-submodules flag for git-reset, which would cover this situation.
However that particular situation (with moving the submodules) seems to be
not covered yet,

./t7112-reset-submodule.sh
...
not ok 69 - git reset --hard: replace submodule with a directory must
fail # TODO known breakage



> If not, is there a command we can run before/after reset to restore consistency?

The submodule "fix-it-all" command would be

    git submodule update --init --recursive

IMHO.

^ permalink raw reply	[relevance 27%]

* Re: [PATCH] wt-status: actually ignore submodules when requested
  2017-11-06 22:08 ` [PATCH] wt-status: actually ignore submodules when requested Brandon Williams
@ 2017-11-06 22:44   ` Stefan Beller
  2017-11-06 22:52     ` Brandon Williams
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-11-06 22:44 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, Orgad Shaneh, Johannes Schindelin

On Mon, Nov 6, 2017 at 2:08 PM, Brandon Williams <bmwill@google.com> wrote:
> Since ff6f1f564 (submodule-config: lazy-load a repository's .gitmodules
> file, 2017-08-03) rebase interactive fails if there are any submodules
> with unstaged changes which have been configured with a value for
> 'submodule.<name>.ignore' in the repository's config.
>
> This is due to how configured values of 'submodule.<name>.ignore' are
> handled in addition to a change in how the submodule config is loaded.
> When the diff machinery hits a submodule (gitlink as well as a
> corresponding entry in the submodule subsystem) it will read the value
> of 'submodule.<name>.ignore' stored in the repository's config and if
> the config is present it will clear the 'IGNORE_SUBMODULES' (which is
> the flag explicitly requested by rebase interactive),
> 'IGNORE_UNTRACKED_IN_SUBMODULES', and 'IGNORE_DIRTY_SUBMODULES' diff
> flags and then set one of them based on the configured value.
>
> Historically this wasn't a problem because the submodule subsystem
> wasn't initialized because the .gitmodules file wasn't explicitly loaded
> by the rebase interactive command.  So when the diff machinery hit a
> submodule it would skip over reading any configured values of
> 'submodule.<name>.ignore'.
>
> In order to preserve the behavior of submodules being ignored by rebase
> interactive, also set the 'OVERRIDE_SUBMODULE_CONFIG' diff flag when
> submodules are requested to be ignored when checking for unstaged
> changes.
>
> Reported-by: Orgad Shaneh <orgads@gmail.com>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  t/t3426-rebase-submodule.sh | 16 ++++++++++++++++
>  wt-status.c                 |  4 +++-
>  2 files changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/t/t3426-rebase-submodule.sh b/t/t3426-rebase-submodule.sh
> index ebf4f5e4b..760c872e2 100755
> --- a/t/t3426-rebase-submodule.sh
> +++ b/t/t3426-rebase-submodule.sh
> @@ -40,4 +40,20 @@ git_rebase_interactive () {
>
>  test_submodule_switch "git_rebase_interactive"
>
> +test_expect_success 'rebase interactive ignores modified submodules' '
> +       test_when_finished "rm -rf super sub" &&
> +       git init sub &&
> +       git -C sub commit --allow-empty -m "Initial commit" &&
> +       git init super &&
> +       git -C super submodule add ../sub &&
> +       git -C super config submodule.sub.ignore dirty &&
> +       > super/foo &&
> +       git -C super add foo &&
> +       git -C super commit -m "Initial commit" &&
> +       test_commit -C super a &&
> +       test_commit -C super b &&
> +       test_commit -C super/sub c &&
> +       git -C super rebase -i HEAD^^

The previous test `set_fake_editor` already, though I am unsure
about the current directory. It turns out that the setting of the fake
editor is done via environment variable using absolute path to
the generated `fake_editor.sh`, hence it works even when
invoking a rebase in another directory/repo. Spooky.

Also we do want to be able to skip previous tests,
which this might be a problem for. Can we have a 'setup'
that sets up the fake editor instead of repeating it here or
hoping the previous tests has run? (Calling for a refactoring
during a regression fix is bad taste, so maybe just take this
patch as is and put it to the #leftoverbits ?)

> +'
> +
>  test_done
> diff --git a/wt-status.c b/wt-status.c
> index 29bc64cc0..94e5ebaf8 100644
> --- a/wt-status.c
> +++ b/wt-status.c
> @@ -2262,8 +2262,10 @@ int has_unstaged_changes(int ignore_submodules)
>         int result;
>
>         init_revisions(&rev_info, NULL);
> -       if (ignore_submodules)
> +       if (ignore_submodules) {
>                 DIFF_OPT_SET(&rev_info.diffopt, IGNORE_SUBMODULES);
> +               DIFF_OPT_SET(&rev_info.diffopt, OVERRIDE_SUBMODULE_CONFIG);
> +       }

There are a couple of submodule related flags:

#define DIFF_OPT_IGNORE_SUBMODULES   (1 << 18)
...
#define DIFF_OPT_DIRTY_SUBMODULES    (1 << 24)
#define DIFF_OPT_IGNORE_UNTRACKED_IN_SUBMODULES (1 << 25)
#define DIFF_OPT_IGNORE_DIRTY_SUBMODULES (1 << 26)
#define DIFF_OPT_OVERRIDE_SUBMODULE_CONFIG (1 << 27)

We don't need to pay attention to 24,25,26 here, because
DIRTY_SUBMODULES and IGN_* are only used to specify further
interest, the generic INORE_SUBMODULES turns off any submodule
handling. (so if we were to turn them on as well, it would still be correct,
though it may have performance impact -- I shortly wondered if we'd rather
want to have a mask covering all submodule related flags, specifically
all flags invented in the future ;) )

The patch looks good to me (i.e. I am convinced by review it fixes the
regression), so maybe we can put the test refactor on top of it, which
then doesn't need fast-tracking down to the maintenance track?

Thanks,
Stefan

^ permalink raw reply	[relevance 23%]

* Re: [PATCH] wt-status: actually ignore submodules when requested
  2017-11-06 22:44   ` Stefan Beller
@ 2017-11-06 22:52     ` Brandon Williams
  0 siblings, 0 replies; 200+ results
From: Brandon Williams @ 2017-11-06 22:52 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Orgad Shaneh, Johannes Schindelin

On 11/06, Stefan Beller wrote:
> On Mon, Nov 6, 2017 at 2:08 PM, Brandon Williams <bmwill@google.com> wrote:
> > Since ff6f1f564 (submodule-config: lazy-load a repository's .gitmodules
> > file, 2017-08-03) rebase interactive fails if there are any submodules
> > with unstaged changes which have been configured with a value for
> > 'submodule.<name>.ignore' in the repository's config.
> >
> > This is due to how configured values of 'submodule.<name>.ignore' are
> > handled in addition to a change in how the submodule config is loaded.
> > When the diff machinery hits a submodule (gitlink as well as a
> > corresponding entry in the submodule subsystem) it will read the value
> > of 'submodule.<name>.ignore' stored in the repository's config and if
> > the config is present it will clear the 'IGNORE_SUBMODULES' (which is
> > the flag explicitly requested by rebase interactive),
> > 'IGNORE_UNTRACKED_IN_SUBMODULES', and 'IGNORE_DIRTY_SUBMODULES' diff
> > flags and then set one of them based on the configured value.
> >
> > Historically this wasn't a problem because the submodule subsystem
> > wasn't initialized because the .gitmodules file wasn't explicitly loaded
> > by the rebase interactive command.  So when the diff machinery hit a
> > submodule it would skip over reading any configured values of
> > 'submodule.<name>.ignore'.
> >
> > In order to preserve the behavior of submodules being ignored by rebase
> > interactive, also set the 'OVERRIDE_SUBMODULE_CONFIG' diff flag when
> > submodules are requested to be ignored when checking for unstaged
> > changes.
> >
> > Reported-by: Orgad Shaneh <orgads@gmail.com>
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> > ---
> >  t/t3426-rebase-submodule.sh | 16 ++++++++++++++++
> >  wt-status.c                 |  4 +++-
> >  2 files changed, 19 insertions(+), 1 deletion(-)
> >
> > diff --git a/t/t3426-rebase-submodule.sh b/t/t3426-rebase-submodule.sh
> > index ebf4f5e4b..760c872e2 100755
> > --- a/t/t3426-rebase-submodule.sh
> > +++ b/t/t3426-rebase-submodule.sh
> > @@ -40,4 +40,20 @@ git_rebase_interactive () {
> >
> >  test_submodule_switch "git_rebase_interactive"
> >
> > +test_expect_success 'rebase interactive ignores modified submodules' '
> > +       test_when_finished "rm -rf super sub" &&
> > +       git init sub &&
> > +       git -C sub commit --allow-empty -m "Initial commit" &&
> > +       git init super &&
> > +       git -C super submodule add ../sub &&
> > +       git -C super config submodule.sub.ignore dirty &&
> > +       > super/foo &&
> > +       git -C super add foo &&
> > +       git -C super commit -m "Initial commit" &&
> > +       test_commit -C super a &&
> > +       test_commit -C super b &&
> > +       test_commit -C super/sub c &&
    +       set_fake_editor &&
> > +       git -C super rebase -i HEAD^^
> 
> The previous test `set_fake_editor` already, though I am unsure
> about the current directory. It turns out that the setting of the fake
> editor is done via environment variable using absolute path to
> the generated `fake_editor.sh`, hence it works even when
> invoking a rebase in another directory/repo. Spooky.
> 
> Also we do want to be able to skip previous tests,
> which this might be a problem for. Can we have a 'setup'
> that sets up the fake editor instead of repeating it here or
> hoping the previous tests has run? (Calling for a refactoring
> during a regression fix is bad taste, so maybe just take this
> patch as is and put it to the #leftoverbits ?)

Thanks for catching that.  'set_fake_editor' definitely needs to be set
(like what i did above in-line).

> 
> > +'
> > +
> >  test_done
> > diff --git a/wt-status.c b/wt-status.c
> > index 29bc64cc0..94e5ebaf8 100644
> > --- a/wt-status.c
> > +++ b/wt-status.c
> > @@ -2262,8 +2262,10 @@ int has_unstaged_changes(int ignore_submodules)
> >         int result;
> >
> >         init_revisions(&rev_info, NULL);
> > -       if (ignore_submodules)
> > +       if (ignore_submodules) {
> >                 DIFF_OPT_SET(&rev_info.diffopt, IGNORE_SUBMODULES);
> > +               DIFF_OPT_SET(&rev_info.diffopt, OVERRIDE_SUBMODULE_CONFIG);
> > +       }
> 
> There are a couple of submodule related flags:
> 
> #define DIFF_OPT_IGNORE_SUBMODULES   (1 << 18)
> ...
> #define DIFF_OPT_DIRTY_SUBMODULES    (1 << 24)
> #define DIFF_OPT_IGNORE_UNTRACKED_IN_SUBMODULES (1 << 25)
> #define DIFF_OPT_IGNORE_DIRTY_SUBMODULES (1 << 26)
> #define DIFF_OPT_OVERRIDE_SUBMODULE_CONFIG (1 << 27)
> 
> We don't need to pay attention to 24,25,26 here, because
> DIRTY_SUBMODULES and IGN_* are only used to specify further
> interest, the generic INORE_SUBMODULES turns off any submodule
> handling. (so if we were to turn them on as well, it would still be correct,
> though it may have performance impact -- I shortly wondered if we'd rather
> want to have a mask covering all submodule related flags, specifically
> all flags invented in the future ;) )
> 
> The patch looks good to me (i.e. I am convinced by review it fixes the
> regression), so maybe we can put the test refactor on top of it, which
> then doesn't need fast-tracking down to the maintenance track?
> 
> Thanks,
> Stefan

-- 
Brandon Williams

^ permalink raw reply	[relevance 15%]

* sb/submodule-recursive-checkout-detach-head
  2017-10-27 18:50 ` Stefan Beller
@ 2017-11-07  2:00   ` Junio C Hamano
  2017-11-07 17:05     ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Junio C Hamano @ 2017-11-07  2:00 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

Stefan Beller <sbeller@google.com> writes:

>> * sb/submodule-recursive-checkout-detach-head (2017-07-28) 2 commits
>>   (merged to 'next' on 2017-10-26 at 30994b4c76)
>>  + Documentation/checkout: clarify submodule HEADs to be detached
>>  + recursive submodules: detach HEAD from new state
>>
>>  "git checkout --recursive" may overwrite and rewind the history of
>>  the branch that happens to be checked out in submodule
>>  repositories, which might not be desirable.  Detach the HEAD but
>>  still allow the recursive checkout to succeed in such a case.
>>
>>  Undecided.
>>  This needs justification in a larger picture; it is unclear why
>>  this is better than rejecting recursive checkout, for example.
> ...
> Detaching the submodule HEAD is in line with the current thinking
> of submodules, though I am about to send out a plan later
> asking if we want to keep it that way long term.

Did this "send out a plan" ever happen?  I am about to rewind 'next'
and rebuild on top of v2.15, and wondering if I should keep the
topic or kick it back to 'pu' so that a better justification can be
given.


^ permalink raw reply	[relevance 15%]

* Re: sb/submodule-recursive-checkout-detach-head
  2017-11-07  2:00   ` sb/submodule-recursive-checkout-detach-head Junio C Hamano
@ 2017-11-07 17:05     ` Stefan Beller
  2017-11-08  0:21       ` Junio C Hamano
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-11-07 17:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Mon, Nov 6, 2017 at 6:00 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>>> * sb/submodule-recursive-checkout-detach-head (2017-07-28) 2 commits
>>>   (merged to 'next' on 2017-10-26 at 30994b4c76)
>>>  + Documentation/checkout: clarify submodule HEADs to be detached
>>>  + recursive submodules: detach HEAD from new state
>>>
>>>  "git checkout --recursive" may overwrite and rewind the history of
>>>  the branch that happens to be checked out in submodule
>>>  repositories, which might not be desirable.  Detach the HEAD but
>>>  still allow the recursive checkout to succeed in such a case.
>>>
>>>  Undecided.
>>>  This needs justification in a larger picture; it is unclear why
>>>  this is better than rejecting recursive checkout, for example.
>> ...
>> Detaching the submodule HEAD is in line with the current thinking
>> of submodules, though I am about to send out a plan later
>> asking if we want to keep it that way long term.
>
> Did this "send out a plan" ever happen?

No. (Not yet?)

>  I am about to rewind 'next'
> and rebuild on top of v2.15, and wondering if I should keep the
> topic or kick it back to 'pu' so that a better justification can be
> given.

Feel free to kick back to 'pu'.

Thanks,
Stefan

^ permalink raw reply	[relevance 16%]

* Re: [PATCH] Tests: document test_submodule_{,forced_}switch()
  2017-11-07  1:52 ` Junio C Hamano
@ 2017-11-07 17:27   ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-11-07 17:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Tan, git

On Mon, Nov 6, 2017 at 5:52 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Jonathan Tan <jonathantanmy@google.com> writes:
>
>> Factor out the commonalities from test_submodule_switch() and
>> test_submodule_forced_switch() in lib-submodule-update.sh, and document
>> their usage.
>>
>> This also makes explicit (through the KNOWN_FAILURE_FORCED_SWITCH_TESTS
>> variable) the fact that, currently, all functionality tested using
>> test_submodule_forced_switch() do not correctly handle the situation in
>> which a submodule is replaced with an ordinary directory.
>>
>> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
>> ---
>> I find tests that use lib-submodule-update.sh difficult to understand
>> due to the lack of clarity of what test_submodule_switch() and others do
>> with their argument - I hope this will make things easier for future
>> readers.
>> ---
>>  t/lib-submodule-update.sh | 250 +++++++++-------------------------------------
>>  1 file changed, 46 insertions(+), 204 deletions(-)
>
> I suspect that the benefit of this is a lot larger than "document" a
> test helper function or two ;-)  "document & clean-up", perhaps?

It is.

>
> I didn't compare the before-and-after with fine toothed comb, but
> a cursory look didn't find anything glaringly questionable other
> than the above.
>

I have reviewed the patch and thinks it withstands the test of a fine comb.

^ permalink raw reply	[relevance 8%]

* Re: sb/submodule-recursive-checkout-detach-head
  2017-11-07 17:05     ` Stefan Beller
@ 2017-11-08  0:21       ` Junio C Hamano
  0 siblings, 0 replies; 200+ results
From: Junio C Hamano @ 2017-11-08  0:21 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

Stefan Beller <sbeller@google.com> writes:

>>> Detaching the submodule HEAD is in line with the current thinking
>>> of submodules, though I am about to send out a plan later
>>> asking if we want to keep it that way long term.
>>
>> Did this "send out a plan" ever happen?
>
> No. (Not yet?)
>
>>  I am about to rewind 'next'
>> and rebuild on top of v2.15, and wondering if I should keep the
>> topic or kick it back to 'pu' so that a better justification can be
>> given.
>
> Feel free to kick back to 'pu'.

OK.

I am puzzled by the question mark after "Not yet", which I could
interpret as "it is even questionable to Stefan that it may or may
not happen", but I'd assume that kicking back to 'pu' and allowing
it to be reroll will be taken as an opportunity to skip such a
separate "plan" and instead to clearify the plan and design in an
updated log message--that is fine by me, too.

Thanks.


^ permalink raw reply	[relevance 15%]

* [RFC PATCH 0/4] git-status reports relation to superproject
@ 2017-11-08 19:55 Stefan Beller
  2017-11-08 19:55 ` [PATCH 2/4] submodule.c: factor start_ls_files_dot_dot out of get_superproject_working_tree Stefan Beller
                   ` (3 more replies)
  0 siblings, 4 replies; 200+ results
From: Stefan Beller @ 2017-11-08 19:55 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

  $ git -c status.superprojectinfo status
  HEAD detached at v2.15-rc2
  superproject is 6 commits behind HEAD 7070ce2..5e6d0fb
  nothing to commit, working tree clean

How cool is that?

This series side steps the questions raised in
https://public-inbox.org/git/xmqq4lq6hmp2.fsf_-_@gitster.mtv.corp.google.com/
which I am also putting together albeit slowly.

This series just reports the relationship between the superprojects gitlink
(if any) to HEAD. I think that is useful information in the current
world of submodules.

Stefan

Stefan Beller (4):
  remote, revision: factor out exclusive counting between two commits
  submodule.c: factor start_ls_files_dot_dot out of
    get_superproject_working_tree
  submodule.c: get superprojects gitlink value
  git-status: report reference to superproject

 Documentation/config.txt |  5 +++
 builtin/commit.c         |  2 ++
 remote.c                 | 40 +-----------------------
 revision.c               | 45 +++++++++++++++++++++++++++
 revision.h               |  7 +++++
 submodule.c              | 80 ++++++++++++++++++++++++++++++++++++------------
 submodule.h              |  6 ++++
 t/t7519-superproject.sh  | 57 ++++++++++++++++++++++++++++++++++
 wt-status.c              | 37 ++++++++++++++++++++++
 wt-status.h              |  1 +
 10 files changed, 222 insertions(+), 58 deletions(-)
 create mode 100755 t/t7519-superproject.sh

-- 
2.15.0.128.g40905b34bf.dirty


^ permalink raw reply	[relevance 27%]

* [PATCH 2/4] submodule.c: factor start_ls_files_dot_dot out of get_superproject_working_tree
  2017-11-08 19:55 [RFC PATCH 0/4] git-status reports relation to superproject Stefan Beller
@ 2017-11-08 19:55 ` Stefan Beller
  2017-11-08 19:55 ` [PATCH 3/4] submodule.c: get superprojects gitlink value Stefan Beller
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-11-08 19:55 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

We'll reuse the code of the factored out function shortly, when exploring
the superproject for another aspect. Instead of knowing the root of the
superproject we'll find out about the gitlink value.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 submodule.c | 53 ++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 34 insertions(+), 19 deletions(-)

diff --git a/submodule.c b/submodule.c
index 239d94d539..4fcb64469e 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1977,15 +1977,13 @@ void absorb_git_dir_into_superproject(const char *prefix,
 	}
 }
 
-const char *get_superproject_working_tree(void)
+/* Starts a child process `ls-files` one directory above the root of the repo. */
+static int start_ls_files_dot_dot(struct child_process *cp, struct strbuf *out)
 {
-	struct child_process cp = CHILD_PROCESS_INIT;
-	struct strbuf sb = STRBUF_INIT;
 	const char *one_up = real_path_if_valid("../");
-	const char *cwd = xgetcwd();
-	const char *ret = NULL;
 	const char *subpath;
-	int code;
+	char *cwd = xgetcwd();
+	struct strbuf sb = STRBUF_INIT;
 	ssize_t len;
 
 	if (!is_inside_work_tree())
@@ -1994,31 +1992,48 @@ const char *get_superproject_working_tree(void)
 		 * We might have a superproject, but it is harder
 		 * to determine.
 		 */
-		return NULL;
+		return -1;
 
 	if (!one_up)
-		return NULL;
+		return -1;
 
 	subpath = relative_path(cwd, one_up, &sb);
 
-	prepare_submodule_repo_env(&cp.env_array);
-	argv_array_pop(&cp.env_array);
+	prepare_submodule_repo_env(&cp->env_array);
+	argv_array_pop(&cp->env_array);
 
-	argv_array_pushl(&cp.args, "--literal-pathspecs", "-C", "..",
+	argv_array_pushl(&cp->args, "--literal-pathspecs", "-C", "..",
 			"ls-files", "-z", "--stage", "--full-name", "--",
 			subpath, NULL);
-	strbuf_reset(&sb);
 
-	cp.no_stdin = 1;
-	cp.no_stderr = 1;
-	cp.out = -1;
-	cp.git_cmd = 1;
+	cp->no_stdin = 1;
+	cp->no_stderr = 1;
+	cp->out = -1;
+	cp->git_cmd = 1;
 
-	if (start_command(&cp))
+	if (start_command(cp))
 		die(_("could not start ls-files in .."));
 
-	len = strbuf_read(&sb, cp.out, PATH_MAX);
-	close(cp.out);
+	len = strbuf_read(out, cp->out, PATH_MAX);
+	close(cp->out);
+
+	strbuf_release(&sb);
+	free(cwd);
+	return len;
+}
+
+const char *get_superproject_working_tree(void)
+{
+	struct child_process cp = CHILD_PROCESS_INIT;
+	struct strbuf sb = STRBUF_INIT;
+	const char *cwd = xgetcwd();
+	const char *ret = NULL;
+	int code;
+	ssize_t len;
+
+	len = start_ls_files_dot_dot(&cp, &sb);
+	if (len < 0)
+		return NULL;
 
 	if (starts_with(sb.buf, "160000")) {
 		int super_sub_len;
-- 
2.15.0.128.g40905b34bf.dirty


^ permalink raw reply	[relevance 28%]

* [PATCH 3/4] submodule.c: get superprojects gitlink value
  2017-11-08 19:55 [RFC PATCH 0/4] git-status reports relation to superproject Stefan Beller
  2017-11-08 19:55 ` [PATCH 2/4] submodule.c: factor start_ls_files_dot_dot out of get_superproject_working_tree Stefan Beller
@ 2017-11-08 19:55 ` Stefan Beller
  2017-11-08 19:55 ` [PATCH 4/4] git-status: report reference to superproject Stefan Beller
  2017-11-08 22:36 ` [RFC PATCH 0/4] git-status reports relation " Jonathan Tan
  3 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-11-08 19:55 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 submodule.c | 27 +++++++++++++++++++++++++++
 submodule.h |  6 ++++++
 2 files changed, 33 insertions(+)

diff --git a/submodule.c b/submodule.c
index 4fcb64469e..68b123eb13 100644
--- a/submodule.c
+++ b/submodule.c
@@ -2074,6 +2074,33 @@ const char *get_superproject_working_tree(void)
 	return ret;
 }
 
+/*
+ * Returns 0 when the gitlink is found in the superprojects index,
+ * the value will be found in `oid`. Otherwise return -1.
+ */
+int get_superproject_gitlink(struct object_id *oid)
+{
+	struct child_process cp = CHILD_PROCESS_INIT;
+	struct strbuf sb = STRBUF_INIT;
+	const char *hash;
+
+	if (start_ls_files_dot_dot(&cp, &sb) < 0)
+		return -1;
+
+	if (!skip_prefix(sb.buf, "160000 ", &hash))
+		/*
+		 * superproject doesn't have a gitlink at submodule position or
+		 * output is gibberish
+		 */
+		return -1;
+
+	if (get_oid_hex(hash, oid))
+		/* could not parse the object name */
+		return -1;
+
+	return 0;
+}
+
 /*
  * Put the gitdir for a submodule (given relative to the main
  * repository worktree) into `buf`, or return -1 on error.
diff --git a/submodule.h b/submodule.h
index f0da0277a4..5fc602f0c7 100644
--- a/submodule.h
+++ b/submodule.h
@@ -137,4 +137,10 @@ extern void absorb_git_dir_into_superproject(const char *prefix,
  */
 extern const char *get_superproject_working_tree(void);
 
+/*
+ * Returns 0 when the gitlink is found in the superprojects index,
+ * the value will be found in `oid`. Otherwise return -1.
+ */
+extern int get_superproject_gitlink(struct object_id *oid);
+
 #endif
-- 
2.15.0.128.g40905b34bf.dirty


^ permalink raw reply	[relevance 32%]

* [PATCH 4/4] git-status: report reference to superproject
  2017-11-08 19:55 [RFC PATCH 0/4] git-status reports relation to superproject Stefan Beller
  2017-11-08 19:55 ` [PATCH 2/4] submodule.c: factor start_ls_files_dot_dot out of get_superproject_working_tree Stefan Beller
  2017-11-08 19:55 ` [PATCH 3/4] submodule.c: get superprojects gitlink value Stefan Beller
@ 2017-11-08 19:55 ` Stefan Beller
  2017-11-08 22:36 ` [RFC PATCH 0/4] git-status reports relation " Jonathan Tan
  3 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-11-08 19:55 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

In a submodule the position of HEAD in relation to the gitlink pointer
in the superproject may be of interest.

Introduce a config option `status.superprojectInfo` that when enabled
will report the relation between HEAD and the commit pointed to by the
gitlink in the index of the superproject.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/config.txt |  5 +++++
 builtin/commit.c         |  2 ++
 t/t7519-superproject.sh  | 57 ++++++++++++++++++++++++++++++++++++++++++++++++
 wt-status.c              | 37 +++++++++++++++++++++++++++++++
 wt-status.h              |  1 +
 5 files changed, 102 insertions(+)
 create mode 100755 t/t7519-superproject.sh

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 5f0d62753d..7825a1a7be 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -3097,6 +3097,11 @@ status.submoduleSummary::
 	submodule summary' command, which shows a similar output but does
 	not honor these settings.
 
+status.superprojectInfo
+	Defaults to false.
+	This shows the relation of the current HEAD to the commit pointed
+	to be the gitlink entry in the superprojects index.
+
 stash.showPatch::
 	If this is set to true, the `git stash show` command without an
 	option will show the stash entry in patch form.  Defaults to false.
diff --git a/builtin/commit.c b/builtin/commit.c
index c38542ee46..f937f6c6cf 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1286,6 +1286,8 @@ static int git_status_config(const char *k, const char *v, void *cb)
 			s->submodule_summary = -1;
 		return 0;
 	}
+	if (!strcmp(k, "status.superprojectinfo"))
+		s->superproject_info = git_config_bool(k, v);
 	if (!strcmp(k, "status.short")) {
 		if (git_config_bool(k, v))
 			status_deferred_config.status_format = STATUS_FORMAT_SHORT;
diff --git a/t/t7519-superproject.sh b/t/t7519-superproject.sh
new file mode 100755
index 0000000000..ade2379a59
--- /dev/null
+++ b/t/t7519-superproject.sh
@@ -0,0 +1,57 @@
+
+
+test_description='git status for superproject relations'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	test_commit initial &&
+	test_create_repo sub &&
+	# this whole file tests superproject reporting, so set this config here
+	git -C sub config status.superprojectInfo true
+'
+
+test_expect_success 'repo on initial commit does not mention superproject' '
+	git -C sub status > actual &&
+	test_i18ngrep "No commits yet" actual &&
+	test_i18ngrep -v superproject actual
+'
+
+test_expect_success 'setup submodule' '
+	test_commit -C sub initial &&
+	git submodule add ./sub sub
+'
+
+test_expect_success 'submodule in sync with superproject index' '
+	git -C sub status >actual &&
+	test_i18ngrep "superproject points at HEAD" actual
+'
+
+test_expect_success 'submodule in sync with superproject' '
+	git commit -a -m "superproject adds submodule" &&
+	git -C sub status >actual &&
+	test_i18ngrep "superproject points at HEAD" actual
+'
+
+test_expect_success 'submodule advances two commits' '
+	git -C sub commit --allow-empty -m "test" &&
+	git -C sub commit --allow-empty -m "test2" &&
+	git -C sub status >actual &&
+	test_i18ngrep "superproject is 2 commits behind HEAD" actual
+'
+
+test_expect_success 'submodule behind superproject' '
+	git add sub &&
+	git commit -m "update sub" &&
+	git -C sub reset --hard HEAD^ &&
+	git -C sub status >actual &&
+	test_i18ngrep "superproject is ahead of HEAD by 1 commits" actual
+'
+
+test_expect_success 'submodule and superproject differ' '
+	git -C sub commit --allow-empty -m "test2b" &&
+	git -C sub status >actual &&
+	test_i18ngrep "superproject and HEAD differ by +1, -1 commits" actual
+'
+
+test_done
diff --git a/wt-status.c b/wt-status.c
index bedef256ce..3e8e27a550 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -1027,6 +1027,41 @@ static void wt_longstatus_print_tracking(struct wt_status *s)
 	strbuf_release(&sb);
 }
 
+static void wt_longstatus_print_superproject_relation(struct wt_status *s)
+{
+	struct object_id oid;
+	struct commit *in_gitlink, *head;
+	int head_nr, gitlink_nr;
+
+	if (get_superproject_gitlink(&oid))
+		return;
+
+	in_gitlink = lookup_commit(&oid);
+
+	read_ref("HEAD", &oid);
+	head = lookup_commit(&oid);
+
+	compare_commits(head, in_gitlink, &head_nr, &gitlink_nr);
+
+	if (!head_nr && !gitlink_nr)
+		printf(_("superproject points at HEAD\n"));
+	else if (!head_nr)
+		printf(_("superproject is ahead of HEAD by %d commits %s..%s\n"),
+			 gitlink_nr,
+			 find_unique_abbrev(head->object.oid.hash, DEFAULT_ABBREV),
+			 find_unique_abbrev(in_gitlink->object.oid.hash, DEFAULT_ABBREV));
+	else if (!gitlink_nr)
+		printf(_("superproject is %d commits behind HEAD %s..%s\n"),
+			 head_nr,
+			 find_unique_abbrev(in_gitlink->object.oid.hash, DEFAULT_ABBREV),
+			 find_unique_abbrev(head->object.oid.hash, DEFAULT_ABBREV));
+	else
+		printf(_("superproject and HEAD differ by +%d, -%d commits %s...%s\n"),
+			 gitlink_nr, head_nr,
+			 find_unique_abbrev(head->object.oid.hash, DEFAULT_ABBREV),
+			 find_unique_abbrev(in_gitlink->object.oid.hash, DEFAULT_ABBREV));
+}
+
 static int has_unmerged(struct wt_status *s)
 {
 	int i;
@@ -1593,6 +1628,8 @@ static void wt_longstatus_print(struct wt_status *s)
 		if (!s->is_initial)
 			wt_longstatus_print_tracking(s);
 	}
+	if (!s->is_initial && s->superproject_info)
+		wt_longstatus_print_superproject_relation(s);
 
 	wt_longstatus_print_state(s, &state);
 	free(state.branch);
diff --git a/wt-status.h b/wt-status.h
index 64f4d33ea1..75700980d2 100644
--- a/wt-status.h
+++ b/wt-status.h
@@ -70,6 +70,7 @@ struct wt_status {
 	int display_comment_prefix;
 	int relative_paths;
 	int submodule_summary;
+	int superproject_info;
 	int show_ignored_files;
 	enum untracked_status_type show_untracked_files;
 	const char *ignore_submodule_arg;
-- 
2.15.0.128.g40905b34bf.dirty


^ permalink raw reply	[relevance 19%]

* Re: [RFC PATCH 0/4] git-status reports relation to superproject
  2017-11-08 19:55 [RFC PATCH 0/4] git-status reports relation to superproject Stefan Beller
                   ` (2 preceding siblings ...)
  2017-11-08 19:55 ` [PATCH 4/4] git-status: report reference to superproject Stefan Beller
@ 2017-11-08 22:36 ` " Jonathan Tan
  2017-11-09  0:10   ` [RFD] Long term plan with submodule refs? Stefan Beller
  3 siblings, 1 reply; 200+ results
From: Jonathan Tan @ 2017-11-08 22:36 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

On Wed,  8 Nov 2017 11:55:05 -0800
Stefan Beller <sbeller@google.com> wrote:

>   $ git -c status.superprojectinfo status
>   HEAD detached at v2.15-rc2
>   superproject is 6 commits behind HEAD 7070ce2..5e6d0fb
>   nothing to commit, working tree clean
> 
> How cool is that?
> 
> This series side steps the questions raised in
> https://public-inbox.org/git/xmqq4lq6hmp2.fsf_-_@gitster.mtv.corp.google.com/
> which I am also putting together albeit slowly.
> 
> This series just reports the relationship between the superprojects gitlink
> (if any) to HEAD. I think that is useful information in the current
> world of submodules.

The relationship is indeed currently useful, but if the long term plan
is to strongly discourage detached submodule HEAD, then I would think
that these patches are in the wrong direction. (If the long term plan is
to end up supporting both detached and linked submodule HEAD, then these
patches are fine, of course.) So I think that the plan referenced in
Junio's email (that you linked above) still needs to be discussed.

About the patches themselves, they look OK to me. Some minor things off
the top of my head are to retain the "ours" and "theirs" (instead of
"one" and "two"), and to replicate the language in remote.c more closely
("This submodule is (ahead of/behind) the superproject by %d commit(s)")
instead of inventing your own.

^ permalink raw reply	[relevance 17%]

* [RFD] Long term plan with submodule refs?
  2017-11-08 22:36 ` [RFC PATCH 0/4] git-status reports relation " Jonathan Tan
@ 2017-11-09  0:10   ` Stefan Beller
  2017-11-09  1:29     ` Jonathan Tan
                       ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Stefan Beller @ 2017-11-09  0:10 UTC (permalink / raw)
  To: jonathantanmy; +Cc: git, sbeller

> The relationship is indeed currently useful, but if the long term plan
> is to strongly discourage detached submodule HEAD, then I would think
> that these patches are in the wrong direction. (If the long term plan is
> to end up supporting both detached and linked submodule HEAD, then these
> patches are fine, of course.) So I think that the plan referenced in
> Junio's email (that you linked above) still needs to be discussed.

This email presents different approaches.

Objective
=========
This document should summarize the current situation of Git submodules
and start a discussion of where it can be headed long term.
Show different ways in which submodule refs could evolve.

Background
==========
Submodules in Git are considered as an independet repository currently.
This is okay for current workflows, such as utilizing a library that is
rarely updated. Other workflows that require a tighter integration between
submodule and superproject are possible, but cumbersome as there is an
additional step that has to be performed, which is the update of the gitlink
pointer in the superproject.

Other discussions of the past:
"Re-attach HEAD?"
  https://public-inbox.org/git/20170501180058.8063-1-sbeller@google.com/
"Semantics of checkout --recursive for submodules on a branch"
  https://public-inbox.org/git/20170630003851.17288-1-sbeller@google.com/
"A new type of symref?"
  https://public-inbox.org/git/xmqqvamqg2fy.fsf@gitster.mtv.corp.google.com/

Workflows
=========
* Obtaining a copy of the Superproject tightly coupled with submodules
  solved via git clone --recurse-submodules=<pathspec>
* Changing the submodule selection
  solved via submodule.active flags
* Changing the remote / Interacting with a different remote for all submodules
  -> need to be solved, not core issue of this discussion
* Syncing to the latest upstream
  solved via git pull --recurse  
* Working on a local feature in one submodule
  -> How do refs work spanning superproject/submodule?
* Working on a feature spanning multiple submodules
  -> How do refs work spanning multiple repos?
* Working on a bug fix (Changing the feature that you currently work on, branches)
  -> How does switching branches in the superproject affect submodules

This discussion should resolve around refs are handled in submodules in
relation to a superproject.

Possible data models and workflow implications
==============================================
In the following different data models are presented, which aid a submodule
heavy workflow each giving pros and cons.

Keep everything as is, superproject and submodule have their own refs
---------------------------------------------------------------------
In this alternative we'd just make existing commands nicer, e.g.
git-status, git-log would give information about the superprojects
gitlink similar as they give information about a remote branch.

We might want to introduce an option that triggers adding the submodule
to the superproject once a commit is done in the submodule.

Pros:
 * easiest to implement
 * easy to understand when having a git background already
 
Cons:
 * Current tools that manage multiple repositories (e.g. repo, git-slave)
   have "branches in parallel", i.e. each repo has a branch of the same
   name, instead of using a superproject to manage the state of all repos
   involved. So users of such tools may be confused by submodules.
 * when using a detached HEAD in the submodule, we may run into git-gc issues.
 

Use replicate refs in submodules
--------------------------------
This approach will replicate the superproject refs into the submodule
ref namespace, e.g. git-branch learns about --recurse-submodules, which
creates a branch of a given name in all submodules. These (topic) branches
should be kept in sync with the superproject

Pros:
 * This seemed intuitive to Gerrit users
 * 'quick' to implement, most of the commands are already there,
   just git-branch is needed to have the workflows mentioned above complete.
Cons:
 * What does "git checkout -b A B" mean? (special case: B == HEAD)
   Is the branch name replicated as a string into the submodule operation,
   or do we dereference the superprojects gitlink and walk from there?
   When taking the superprojects gitlink, then why do we have the branches
   in the submodule in the first place? When taking the string as-is,
   then it might confuse users.
 * non-atomic of refs between superproject and submodule by design;
   This relies on superproject and submodule to stay in sync via hope.

No submodule refstore at all
----------------------------
Use refs and commits in the superproject to stitch submodule changes
together. Disallow branches in the submodule. This is only restricted
to the working tree inside the superproject, such that the output of git-branch
changes depending whether the working tree is in- or outside the superproject
working tree.

The messages of git-status inside the superproject working tree are changed
as "detached HEAD"s are common in submodule and sound scary.  Maybe
"following the superproject"

Pros:
 * solves the atomicity issue from the prior proposal
Cons:
 * In a submodule one must use a worktree outside the superproject
   to do upstream work.
 * As the detached HEAD is not referenced, we have git-gc issues.


New type of symbolic refs
=========================
A symbolic ref can currently only point at a ref or another symbolic ref.
This proposal showcases different scenarios on how this could change in the
future.

HEAD pointing at the superprojects index
----------------------------------------
Introduce a new symbolic ref that points at the superprojects
index of the gitlink. The format is

  "repo:" <superprojects gitdir> '\0' <gitlink-path> '\0'

Just like existing symrefs, the content of the ref will be read and followed.
On reading "repo:", the sha1 will be obtained equivalent to:

    git -C <superproject> ls-files -s <gitlink-path> | awk '{ print $2}'

Ref write operations driven by the submodule, affecting symrefs
  e.g. git checkout <other branch> (in the submodule)

In this scenario only the HEAD is optionally attached to the superproject,
so we can rewrite the HEAD to be anything else, such as a branch just fine.
Once the HEAD is not pointing at the superproject any more, we'll leave the
submodule alone in operations driven by the superproject.
To get back on the superproject branch, we’d need to invent new UX, such as
   git checkout --attach-superproject
as that is similar to --detach

Ref write operations driven by the submodule, affecting target ref
  e.g. git commit, reset --hard, update-ref (in the submodule)

The HEAD stays the same, pointing at the superproject.
The gitlink is changed to the target sha1, using

  git -C <superproject> update-index --add \
      --cacheinfo 160000,$SHA1,<gitlink-path>

This will affect the superprojects index, such that then a commit in
the superproject is needed.

Ref write operations driven by the superproject, changing the gitlink
  e.g. git checkout <tree-ish>, git reset --hard (in the superproject)

This will change the gitlink in the superprojects index, such that the HEAD
in the submodule changes, which would trigger an update of the
submodules working tree.

Superproject operations spanning index and worktree
  E.g. git reset --mixed
As the submodules HEAD is defined in the index, we would reset it to the
version in the last commit. As --mixed promises to not touch the working tree,
the submodules worktree would not be touched. git reset --mixed in the
superproject is the same as --soft in the submodule.

Consistency considerations (gc)
  e.g. git gc --aggressive --prune=now

The repacking logic is already aware of a detached HEAD, such that
using this new symref mechanism would not generate problems as long as
we keep the HEAD attached to the superproject. However when commits/objects
are created while the HEAD is attached to the superproject and then HEAD
switches to a local branch, there are problems with the created objects
as they seem unreachable now.

This problem is not new as a superproject may record submodule objects
that are not reachable from any of the submodule branches. Such objects
fall prey to overzealous packing in the submodule.

This proposal however exposes this problem a lot more, as the submodule
has fewer needs for branches.

Pros
 * easy to tell if a submodule is attached to the superproject,
 * no atomicity issues
 * once enough commands implement this behavior, it may be easier to understand
   than previous alternatives and feel more intuitive
Cons:
 * gc issues for now
 * lots of work as it revamps submodules alot.
 
 This last proposal might be differentiated further, e.g. the submodule HEAD
 pointing at the superprojects gitlink in the index, in its HEAD or other
 branch.

Any feedback welcome!
Stefan









^ permalink raw reply	[relevance 19%]

* Re: [RFD] Long term plan with submodule refs?
  2017-11-09  0:10   ` [RFD] Long term plan with submodule refs? Stefan Beller
@ 2017-11-09  1:29     ` Jonathan Tan
  2017-11-09  5:08     ` Junio C Hamano
  2017-11-09  6:54     ` Jacob Keller
  2 siblings, 0 replies; 200+ results
From: Jonathan Tan @ 2017-11-09  1:29 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

On Wed,  8 Nov 2017 16:10:07 -0800
Stefan Beller <sbeller@google.com> wrote:

I thought of a possible alternative and how it would work.

> Possible data models and workflow implications
> ==============================================
> In the following different data models are presented, which aid a submodule
> heavy workflow each giving pros and cons.

What if, in the submodule, we have a new ref backend that mirrors the
superproject? When initializing the submodule, its original refs are not
cloned at all, but instead virtual refs are used.

Creation of brand-new refs is forbidden in the submodule.

When reading a ref in the submodule, if that ref is the current branch
in the superproject, read the corresponding gitlink entry in the index
(which may be dirty); otherwise read the gitlink in the tree of the tip
commit.

When updating a ref in the submodule, if that ref is the current branch
in the superproject, update the index; otherwise, create a commit on top
of the tip and update the ref to point to the new tip.

No synchronicity is enforced between superproject and submodule in terms
of HEAD, though: If a submodule is currently checked out to a branch,
and the gitlink for that branch is updated through whatever means, that
is equivalent to a "git reset --soft" in the submodule.

These rules seem straightforward to me (although I have been working
with Git for a while, so perhaps I'm not the best judge), and I think
leads to a good workflow, as discussed below.

> Workflows
> =========
> * Obtaining a copy of the Superproject tightly coupled with submodules
>   solved via git clone --recurse-submodules=<pathspec>
> * Changing the submodule selection
>   solved via submodule.active flags
> * Changing the remote / Interacting with a different remote for all submodules
>   -> need to be solved, not core issue of this discussion
> * Syncing to the latest upstream
>   solved via git pull --recurse  

(skipping the above, since they are either solved or not a core issue)

> * Working on a local feature in one submodule
>   -> How do refs work spanning superproject/submodule?

This is perhaps one weak point of my proposal - you can't work on a
submodule as if it were independent. You can checkout a branch and make
commits, but (i) they will automatically affect the superproject, and
(ii) the "origin/foo" etc. branches are those of the superproject. (But
if you checkout a detached HEAD, everything should still work.)

> * Working on a feature spanning multiple submodules
>   -> How do refs work spanning multiple repos?

The above rules allow the following workflow:
 - "checkout --recurse-submodules" the branch you want on the
   superproject
 - make whatever changes you want in each submodule
 - commit each individual submodule (which updates the index of the
   superproject), then commit the superproject (we can introduce a
   commit --recurse-submodules to make this more convenient)
 - a "push --recurse-submodules" can be implemented to push the
   superproject and its submodules independently (and the same refspec
   can be legitimately used both when pushing the superproject and when
   pushing a submodule, since the ref names are the same, and not by
   coincidence)

If the user insists on making changes on a non-current branch (i.e. by
creating commits in submodules then using "git update-ref" or
equivalent), possibly multiple commits would be created in the
superproject, but the user can still squash them later if desired.

> * Working on a bug fix (Changing the feature that you currently work on, branches)
>   -> How does switching branches in the superproject affect submodules

You will have to stash or commit your changes. (Which reminds me...GC in
the subproject will need to consult the revlog of the superproject too.)

> New type of symbolic refs
> =========================
> A symbolic ref can currently only point at a ref or another symbolic ref.
> This proposal showcases different scenarios on how this could change in the
> future.
> 
> HEAD pointing at the superprojects index
> ----------------------------------------

Assuming we don't need synchronicity, the existing HEAD format can be
retained. To clarify what happens during ref writes, I'll reuse the
scenarios Stefan described:

> Ref write operations driven by the submodule, affecting target ref
>   e.g. git commit, reset --hard, update-ref (in the submodule)
> 
> The HEAD stays the same, pointing at the superproject.
> The gitlink is changed to the target sha1, using
> 
>   git -C <superproject> update-index --add \
>       --cacheinfo 160000,$SHA1,<gitlink-path>
> 
> This will affect the superprojects index, such that then a commit in
> the superproject is needed.

In this proposal, the HEAD also stays the same (pointing at the branch).

Either the index is updated or a commit is needed. If a commit is
needed, it is automatically performed.

> Ref write operations driven by the superproject, changing the gitlink
>   e.g. git checkout <tree-ish>, git reset --hard (in the superproject)
> 
> This will change the gitlink in the superprojects index, such that the HEAD
> in the submodule changes, which would trigger an update of the
> submodules working tree.

The HEAD in the submodule is unchanged. If the value of a ref has
changed "from underneath", this is as if a "git reset --soft" was done.

> Superproject operations spanning index and worktree
>   E.g. git reset --mixed
> As the submodules HEAD is defined in the index, we would reset it to the
> version in the last commit. As --mixed promises to not touch the working tree,
> the submodules worktree would not be touched. git reset --mixed in the
> superproject is the same as --soft in the submodule.

Same.

> Consistency considerations (gc)
>   e.g. git gc --aggressive --prune=now
> 
> The repacking logic is already aware of a detached HEAD, such that
> using this new symref mechanism would not generate problems as long as
> we keep the HEAD attached to the superproject. However when commits/objects
> are created while the HEAD is attached to the superproject and then HEAD
> switches to a local branch, there are problems with the created objects
> as they seem unreachable now.
> 
> This problem is not new as a superproject may record submodule objects
> that are not reachable from any of the submodule branches. Such objects
> fall prey to overzealous packing in the submodule.

The scenario Stefan describes will work OK - if a commit is created
while the HEAD is pointing to a branch, then either the superproject's
index will be updated or commits will be created in the superproject.
When GC reads the list of refs in the submodule, the new submodule
commit will be included. (Remember that if the superproject's current
branch is "foo", "refs/heads/foo" in the submodule reflects the
superproject's index, so any changes to the index, though uncommitted,
will appear as a ref.)

The problem still exists (e.g. stashes in the superproject) but is
reduced, I think.

^ permalink raw reply	[relevance 22%]

* Re: [RFD] Long term plan with submodule refs?
  2017-11-09  0:10   ` [RFD] Long term plan with submodule refs? Stefan Beller
  2017-11-09  1:29     ` Jonathan Tan
@ 2017-11-09  5:08     ` Junio C Hamano
  2017-11-09 19:57       ` Stefan Beller
  2017-11-09  6:54     ` Jacob Keller
  2 siblings, 1 reply; 200+ results
From: Junio C Hamano @ 2017-11-09  5:08 UTC (permalink / raw)
  To: Stefan Beller; +Cc: jonathantanmy, git

Stefan Beller <sbeller@google.com> writes:

>> The relationship is indeed currently useful, but if the long term plan
>> is to strongly discourage detached submodule HEAD, then I would think
>> that these patches are in the wrong direction. (If the long term plan is
>> to end up supporting both detached and linked submodule HEAD, then these
>> patches are fine, of course.) So I think that the plan referenced in
>> Junio's email (that you linked above) still needs to be discussed.
>
> This email presents different approaches.
>
> Objective
> =========
> This document should summarize the current situation of Git submodules
> and start a discussion of where it can be headed long term.
> Show different ways in which submodule refs could evolve.
>
> Background
> ==========
> Submodules in Git are considered as an independet repository currently.
> This is okay for current workflows, such as utilizing a library that is
> rarely updated. Other workflows that require a tighter integration between
> submodule and superproject are possible, but cumbersome as there is an
> additional step that has to be performed, which is the update of the gitlink
> pointer in the superproject.

I do not think "rarely updaed" is an issue.

The problem is that we may want to make it easier to use a
superproject and its submodules as if the combined whole were a
single project, which currently is not easy, primarily because
submodules are separate entities with different set of branches that
can be checked out independently from what branch the superproject
is working on.

> Workflows
> =========
> * Obtaining a copy of the Superproject tightly coupled with submodules
>   solved via git clone --recurse-submodules=<pathspec>
> * Changing the submodule selection
>   solved via submodule.active flags
> * Changing the remote / Interacting with a different remote for all submodules
>   -> need to be solved, not core issue of this discussion
> * Syncing to the latest upstream
>   solved via git pull --recurse  
> * Working on a local feature in one submodule
>   -> How do refs work spanning superproject/submodule?
> * Working on a feature spanning multiple submodules
>   -> How do refs work spanning multiple repos?
> * Working on a bug fix (Changing the feature that you currently work on, branches)
>   -> How does switching branches in the superproject affect submodules

These are good starting points for copying such a combined whole to
your local machine and start working on it.  The more interesting,
important, and potentially difficult part is how the result of such
work is shared back to where you started from.  "push --recursive"
may be a simple phrase, but a sensible definition of how it should
work won't be that simple.

> Possible data models and workflow implications
> ==============================================
> In the following different data models are presented, which aid a submodule
> heavy workflow each giving pros and cons.
>
> Keep everything as is, superproject and submodule have their own refs
> ---------------------------------------------------------------------
> ...
> Cons:
>  * Current tools that manage multiple repositories (e.g. repo, git-slave)
>    have "branches in parallel", i.e. each repo has a branch of the same
>    name, instead of using a superproject to manage the state of all repos
>    involved. So users of such tools may be confused by submodules.
>  * when using a detached HEAD in the submodule, we may run into git-gc issues.

We should make detached HEAD safe against gc if it is not,
regardless of the use of submodules.  I thought it already was made
safe long time ago.

> Use replicate refs in submodules
> --------------------------------
> This approach will replicate the superproject refs into the submodule
> ref namespace, e.g. git-branch learns about --recurse-submodules, which
> creates a branch of a given name in all submodules. These (topic) branches
> should be kept in sync with the superproject
>
> Pros:
>  * This seemed intuitive to Gerrit users
>  * 'quick' to implement, most of the commands are already there,
>    just git-branch is needed to have the workflows mentioned above complete.
> Cons:
>  * What does "git checkout -b A B" mean? (special case: B == HEAD)

The command ran at which level?  In the superproject, or in a single
submodule?

>    Is the branch name replicated as a string into the submodule operation,
>    or do we dereference the superprojects gitlink and walk from there?

If they are "kept in sync with the superproject", then there should
be no difference between the two, so I do not see any room for
wondering about that.  In other words, if there is need to worry
about the differences between the above two, then it probably is
fundamentally impossible to keep these in sync, and a design that
assumes it is possible would have to expose glitches to the end-user
experience.

I do not know if glitches resulting from there would be so severe to
be show-stoppers, though.  It might be possible to paper them over.

> No submodule refstore at all
> ----------------------------
> Use refs and commits in the superproject to stitch submodule changes
> together. Disallow branches in the submodule. This is only restricted
> to the working tree inside the superproject, such that the output of git-branch
> changes depending whether the working tree is in- or outside the superproject
> working tree.

This would need enhancement for reachability code, but it feels the
cleanest from the philosophical standpoint---if you want to treat a
superproject and its submodules as if it were a single project,
ability to check out a branch in a submodule that does not match
that of the superproject would only get in the way of preserving the
illusion of "single project"-ness.

> New type of symbolic refs
> =========================
> A symbolic ref can currently only point at a ref or another symbolic ref.
> This proposal showcases different scenarios on how this could change in the
> future.
>
> HEAD pointing at the superprojects index
> ----------------------------------------

This looks to me a mere implementation detail for a (part of)
necessary component to realize the above "No submodule refstore".

> Superproject operations spanning index and worktree
>   E.g. git reset --mixed
> As the submodules HEAD is defined in the index, we would reset it to the
> version in the last commit. As --mixed promises to not touch the working tree,
> the submodules worktree would not be touched. git reset --mixed in the
> superproject is the same as --soft in the submodule.

I am not sure if you want to take these promises low-level "single
repository" plumbing operations make too literally.  "reset --mixed"
may promise not to touch the working tree, but it also promises not
to touch submodules at all.  If you are breaking the latter anyway,
it would make more sense not to be afraid of breaking the former if
it makes sense in the context of allowing the command to do more by
breaking the latter.


^ permalink raw reply	[relevance 23%]

* Re: [RFD] Long term plan with submodule refs?
  2017-11-09  0:10   ` [RFD] Long term plan with submodule refs? Stefan Beller
  2017-11-09  1:29     ` Jonathan Tan
  2017-11-09  5:08     ` Junio C Hamano
@ 2017-11-09  6:54     ` Jacob Keller
  2017-11-09 20:16       ` Stefan Beller
  2 siblings, 1 reply; 200+ results
From: Jacob Keller @ 2017-11-09  6:54 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Jonathan Tan, Git mailing list

On Wed, Nov 8, 2017 at 4:10 PM, Stefan Beller <sbeller@google.com> wrote:
>> The relationship is indeed currently useful, but if the long term plan
>> is to strongly discourage detached submodule HEAD, then I would think
>> that these patches are in the wrong direction. (If the long term plan is
>> to end up supporting both detached and linked submodule HEAD, then these
>> patches are fine, of course.) So I think that the plan referenced in
>> Junio's email (that you linked above) still needs to be discussed.
>

> New type of symbolic refs
> =========================
> A symbolic ref can currently only point at a ref or another symbolic ref.
> This proposal showcases different scenarios on how this could change in the
> future.
>
> HEAD pointing at the superprojects index
> ----------------------------------------
> Introduce a new symbolic ref that points at the superprojects
> index of the gitlink. The format is
>
>   "repo:" <superprojects gitdir> '\0' <gitlink-path> '\0'
>
> Just like existing symrefs, the content of the ref will be read and followed.
> On reading "repo:", the sha1 will be obtained equivalent to:
>
>     git -C <superproject> ls-files -s <gitlink-path> | awk '{ print $2}'
>
> Ref write operations driven by the submodule, affecting symrefs
>   e.g. git checkout <other branch> (in the submodule)
>
> In this scenario only the HEAD is optionally attached to the superproject,
> so we can rewrite the HEAD to be anything else, such as a branch just fine.
> Once the HEAD is not pointing at the superproject any more, we'll leave the
> submodule alone in operations driven by the superproject.
> To get back on the superproject branch, we’d need to invent new UX, such as
>    git checkout --attach-superproject
> as that is similar to --detach
>

Some of the idea trimmed for brevity, but I like this aspect the most.
Currently, I work on several projects which have multiple
repositories, which are essentially submodules.

However, historically, we kept them separate. 99% of the time, you can
use all 3 projects on "master" and everything works. But if you go
back in time, there's no correlation to "what did the parent project
want this "COMMON" folder to be at?

I started promoting using submodules for this, since it seemed quite natural.

The core problem, is that several developers never quite understood or
grasped how submodules worked. There's problems like "but what if I
wanna work on master?" or people assume submodules need to be checked
out at master instead of in a detached HEAD state.

So we often get people who don't run git submodule update and thus are
confused about why their submodules are often out of date. (This can
be solved by recursive options to commands to more often recurse into
submodules and checkout and update them).

We also often get people who accidentally commit the old version of
the repository, or commit an update to the parent project pointing the
submodule at some commit which isn't yet in the upstream of the common
repository.

The proposal here seems to match the intuition about how submodules
should work, with the ability to "attach" or "detach" the submodule
when working on the submodule directly.

Ideally, I'd like for more ways to say "ignore what my submodule is
checked out at, since I will have something else checked out, and
don't intend to commit just yet."

Basically, a workflow where it's easier to have each submodule checked
out at master, and we can still keep track of historical relationship
of what commit was the submodule at some time ago, but without causing
some of these headaches.

I've often tried to use the "--skip-worktree" bit to have people set
their repository to ignore the submodule. Unfortunately, this is
pretty complex, and most of the time, developers never remember to do
this again on a fresh clone.

Thanks,
Jake

^ permalink raw reply	[relevance 24%]

* Re: [RFD] Long term plan with submodule refs?
  2017-11-09  5:08     ` Junio C Hamano
@ 2017-11-09 19:57       ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-11-09 19:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Tan, git

On Wed, Nov 8, 2017 at 9:08 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>>> The relationship is indeed currently useful, but if the long term plan
>>> is to strongly discourage detached submodule HEAD, then I would think
>>> that these patches are in the wrong direction. (If the long term plan is
>>> to end up supporting both detached and linked submodule HEAD, then these
>>> patches are fine, of course.) So I think that the plan referenced in
>>> Junio's email (that you linked above) still needs to be discussed.
>>
>> This email presents different approaches.
>>
>> Objective
>> =========
>> This document should summarize the current situation of Git submodules
>> and start a discussion of where it can be headed long term.
>> Show different ways in which submodule refs could evolve.
>>
>> Background
>> ==========
>> Submodules in Git are considered as an independet repository currently.
>> This is okay for current workflows, such as utilizing a library that is
>> rarely updated. Other workflows that require a tighter integration between
>> submodule and superproject are possible, but cumbersome as there is an
>> additional step that has to be performed, which is the update of the gitlink
>> pointer in the superproject.
>
> I do not think "rarely updaed" is an issue.
>
> The problem is that we may want to make it easier to use a
> superproject and its submodules as if the combined whole were a
> single project, which currently is not easy, primarily because
> submodules are separate entities with different set of branches that
> can be checked out independently from what branch the superproject
> is working on.

Well and this fact seems to be not a problem in the current use of submodules,
precisely because the workflow either (a) is not too cumbersome or (b)
is executed
not too often to bother enough.

> These are good starting points for copying such a combined whole to
> your local machine and start working on it.  The more interesting,
> important, and potentially difficult part is how the result of such
> work is shared back to where you started from.  "push --recursive"
> may be a simple phrase, but a sensible definition of how it should
> work won't be that simple.
...
>
> We should make detached HEAD safe against gc if it is not,
> regardless of the use of submodules.  I thought it already was made
> safe long time ago.

The detached HEAD itself is protected via its reflog (which is around
for say 2 weeks?)

If I were to develop using detached HEAD only in todays world of
submodules using different branches in the superproject, I run the risk
of loosing some commits in the submodule, as they are not the detached
HEAD all the time, but might even be loose tips.

This combined with the previous paragraph brings in another important
concern:
Some projects would have a very different history when used as a
submodule compared to when used as a stand alone project.
Other projects may be closely aligned between their branches and
what the superproject records.

So the more we deviate from the traditional branch model, the easier
we make it to have the submodule tips be very different from the
standalone tips, which may overexpose us to the gc issues as well as
the general question how much these projects have in common.

>> Use replicate refs in submodules
>> --------------------------------
>> This approach will replicate the superproject refs into the submodule
>> ref namespace, e.g. git-branch learns about --recurse-submodules, which
>> creates a branch of a given name in all submodules. These (topic) branches
>> should be kept in sync with the superproject
>>
>> Pros:
>>  * This seemed intuitive to Gerrit users
>>  * 'quick' to implement, most of the commands are already there,
>>    just git-branch is needed to have the workflows mentioned above complete.
>> Cons:
>>  * What does "git checkout -b A B" mean? (special case: B == HEAD)
>
> The command ran at which level?  In the superproject, or in a single
> submodule?

In the superproject, with --recurse-submodules, as the A and B would recurse
as strings, and not change meaning depending on the gitlink value.

>
>>    Is the branch name replicated as a string into the submodule operation,
>>    or do we dereference the superprojects gitlink and walk from there?
>
> If they are "kept in sync with the superproject", then there should
> be no difference between the two, so I do not see any room for
> wondering about that.

Except you can still break out by issuing commands in the submodule
to change the submodule refs to be different from the superproject.

This was also more along the lines of thinking about the (Gerrit) remote,
which does and okay, but not stellar job in keeping the remote branches
for superproject and submodule in sync. I'd expect glitches there.

> In other words, if there is need to worry
> about the differences between the above two, then it probably is
> fundamentally impossible to keep these in sync, and a design that
> assumes it is possible would have to expose glitches to the end-user
> experience.

yup. And by exposing you probably mean a patch series as presented?
(git status/log/diff making noise about the glitch?)

> I do not know if glitches resulting from there would be so severe to
> be show-stoppers, though.  It might be possible to paper them over.

I think so, too, as long as the user is pointed at the glitch to correct them.

>
>> No submodule refstore at all
>> ----------------------------
>> Use refs and commits in the superproject to stitch submodule changes
>> together. Disallow branches in the submodule. This is only restricted
>> to the working tree inside the superproject, such that the output of git-branch
>> changes depending whether the working tree is in- or outside the superproject
>> working tree.
>
> This would need enhancement for reachability code, but it feels the
> cleanest from the philosophical standpoint---if you want to treat a
> superproject and its submodules as if it were a single project,
> ability to check out a branch in a submodule that does not match
> that of the superproject would only get in the way of preserving the
> illusion of "single project"-ness.

I wonder if we can combine this with the approach Jonathan gave above.
In the worktree (of the submodule inside the superproject) you are allowed
to use these "mirrored" refs, whereas in any other worktree you have full
access to the normal refs of the project.

>
>> New type of symbolic refs
>> =========================
>> A symbolic ref can currently only point at a ref or another symbolic ref.
>> This proposal showcases different scenarios on how this could change in the
>> future.
>>
>> HEAD pointing at the superprojects index
>> ----------------------------------------
>
> This looks to me a mere implementation detail for a (part of)
> necessary component to realize the above "No submodule refstore".

Ah ok.

If all branches would use this new symref type, the handling would
seem to be very similar to what Jonathan described with a new type
of refstore instead.

>> Superproject operations spanning index and worktree
>>   E.g. git reset --mixed
>> As the submodules HEAD is defined in the index, we would reset it to the
>> version in the last commit. As --mixed promises to not touch the working tree,
>> the submodules worktree would not be touched. git reset --mixed in the
>> superproject is the same as --soft in the submodule.
>
> I am not sure if you want to take these promises low-level "single
> repository" plumbing operations make too literally.  "reset --mixed"
> may promise not to touch the working tree, but it also promises not
> to touch submodules at all.  If you are breaking the latter anyway,
> it would make more sense not to be afraid of breaking the former if
> it makes sense in the context of allowing the command to do more by
> breaking the latter.

ok.

Thanks,
Stefan

^ permalink raw reply	[relevance 24%]

* Re: [RFD] Long term plan with submodule refs?
  2017-11-09  6:54     ` Jacob Keller
@ 2017-11-09 20:16       ` Stefan Beller
  2017-11-10  3:37         ` Jacob Keller
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-11-09 20:16 UTC (permalink / raw)
  To: Jacob Keller; +Cc: Jonathan Tan, Git mailing list

On Wed, Nov 8, 2017 at 10:54 PM, Jacob Keller <jacob.keller@gmail.com> wrote:
> On Wed, Nov 8, 2017 at 4:10 PM, Stefan Beller <sbeller@google.com> wrote:
>>> The relationship is indeed currently useful, but if the long term plan
>>> is to strongly discourage detached submodule HEAD, then I would think
>>> that these patches are in the wrong direction. (If the long term plan is
>>> to end up supporting both detached and linked submodule HEAD, then these
>>> patches are fine, of course.) So I think that the plan referenced in
>>> Junio's email (that you linked above) still needs to be discussed.
>>
>
>> New type of symbolic refs
>> =========================
>> A symbolic ref can currently only point at a ref or another symbolic ref.
>> This proposal showcases different scenarios on how this could change in the
>> future.
>>
>> HEAD pointing at the superprojects index
>> ----------------------------------------
>> Introduce a new symbolic ref that points at the superprojects
>> index of the gitlink. The format is
>>
>>   "repo:" <superprojects gitdir> '\0' <gitlink-path> '\0'
>>
>> Just like existing symrefs, the content of the ref will be read and followed.
>> On reading "repo:", the sha1 will be obtained equivalent to:
>>
>>     git -C <superproject> ls-files -s <gitlink-path> | awk '{ print $2}'
>>
>> Ref write operations driven by the submodule, affecting symrefs
>>   e.g. git checkout <other branch> (in the submodule)
>>
>> In this scenario only the HEAD is optionally attached to the superproject,
>> so we can rewrite the HEAD to be anything else, such as a branch just fine.
>> Once the HEAD is not pointing at the superproject any more, we'll leave the
>> submodule alone in operations driven by the superproject.
>> To get back on the superproject branch, we’d need to invent new UX, such as
>>    git checkout --attach-superproject
>> as that is similar to --detach
>>
>
> Some of the idea trimmed for brevity, but I like this aspect the most.
> Currently, I work on several projects which have multiple
> repositories, which are essentially submodules.
>
> However, historically, we kept them separate. 99% of the time, you can
> use all 3 projects on "master" and everything works. But if you go
> back in time, there's no correlation to "what did the parent project
> want this "COMMON" folder to be at?

So an environment where "git submodule update --remote" is not that
harmful, but rather brings the joy of being up to date in each project?

> I started promoting using submodules for this, since it seemed quite natural.
>
> The core problem, is that several developers never quite understood or
> grasped how submodules worked. There's problems like "but what if I
> wanna work on master?" or people assume submodules need to be checked
> out at master instead of in a detached HEAD state.

So the documentation sucks?

It is intentional that from the superprojects perspective the gitlink
must be one
exact value, and rely on the submodule to get to and keep that state.

(I think we once discussed if setting the gitlink value to 00...00 or otherwise
signal that we actually want "the most recent tip of the X branch" would be
a good idea, but I do not think it as it misses the point of versioning)

> So we often get people who don't run git submodule update and thus are
> confused about why their submodules are often out of date. (This can
> be solved by recursive options to commands to more often recurse into
> submodules and checkout and update them).
>
> We also often get people who accidentally commit the old version of
> the repository, or commit an update to the parent project pointing the
> submodule at some commit which isn't yet in the upstream of the common
> repository.

Would an upstream prereceive hook (maybe even builtin and accessible via
'receive.denyUnreachableSubmodules') help? (It would require submodules
to be defined with relative URLs in the .gitmodules file and then the receive
command can check for the gitlink value present in this other repository)

> The proposal here seems to match the intuition about how submodules
> should work, with the ability to "attach" or "detach" the submodule
> when working on the submodule directly.

Well I think the big picture discussion is how easy this attaching or
detaching is. Whether only the HEAD is attached or detached, or if we
invent a new refstore that is a complete new submodule thing, which
cannot be detached from the superproject at all.

> Ideally, I'd like for more ways to say "ignore what my submodule is
> checked out at, since I will have something else checked out, and
> don't intend to commit just yet."

This is in the superproject, when doing a git add . ?

> Basically, a workflow where it's easier to have each submodule checked
> out at master, and we can still keep track of historical relationship
> of what commit was the submodule at some time ago, but without causing
> some of these headaches.

So essentially a repo or otherwise parallel workflow just with the versioning
happening magically behind your back?

> I've often tried to use the "--skip-worktree" bit to have people set
> their repository to ignore the submodule. Unfortunately, this is
> pretty complex, and most of the time, developers never remember to do
> this again on a fresh clone.

That sounds interesting.

Thanks,
Stefan

^ permalink raw reply	[relevance 24%]

* Re: [RFD] Long term plan with submodule refs?
  2017-11-09 20:16       ` Stefan Beller
@ 2017-11-10  3:37         ` Jacob Keller
  2017-11-10 20:01           ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Jacob Keller @ 2017-11-10  3:37 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Jonathan Tan, Git mailing list

On Thu, Nov 9, 2017 at 12:16 PM, Stefan Beller <sbeller@google.com> wrote:
> On Wed, Nov 8, 2017 at 10:54 PM, Jacob Keller <jacob.keller@gmail.com> wrote:
>> On Wed, Nov 8, 2017 at 4:10 PM, Stefan Beller <sbeller@google.com> wrote:
>>>> The relationship is indeed currently useful, but if the long term plan
>>>> is to strongly discourage detached submodule HEAD, then I would think
>>>> that these patches are in the wrong direction. (If the long term plan is
>>>> to end up supporting both detached and linked submodule HEAD, then these
>>>> patches are fine, of course.) So I think that the plan referenced in
>>>> Junio's email (that you linked above) still needs to be discussed.
>>>
>>
>>> New type of symbolic refs
>>> =========================
>>> A symbolic ref can currently only point at a ref or another symbolic ref.
>>> This proposal showcases different scenarios on how this could change in the
>>> future.
>>>
>>> HEAD pointing at the superprojects index
>>> ----------------------------------------
>>> Introduce a new symbolic ref that points at the superprojects
>>> index of the gitlink. The format is
>>>
>>>   "repo:" <superprojects gitdir> '\0' <gitlink-path> '\0'
>>>
>>> Just like existing symrefs, the content of the ref will be read and followed.
>>> On reading "repo:", the sha1 will be obtained equivalent to:
>>>
>>>     git -C <superproject> ls-files -s <gitlink-path> | awk '{ print $2}'
>>>
>>> Ref write operations driven by the submodule, affecting symrefs
>>>   e.g. git checkout <other branch> (in the submodule)
>>>
>>> In this scenario only the HEAD is optionally attached to the superproject,
>>> so we can rewrite the HEAD to be anything else, such as a branch just fine.
>>> Once the HEAD is not pointing at the superproject any more, we'll leave the
>>> submodule alone in operations driven by the superproject.
>>> To get back on the superproject branch, we’d need to invent new UX, such as
>>>    git checkout --attach-superproject
>>> as that is similar to --detach
>>>
>>
>> Some of the idea trimmed for brevity, but I like this aspect the most.
>> Currently, I work on several projects which have multiple
>> repositories, which are essentially submodules.
>>
>> However, historically, we kept them separate. 99% of the time, you can
>> use all 3 projects on "master" and everything works. But if you go
>> back in time, there's no correlation to "what did the parent project
>> want this "COMMON" folder to be at?
>
> So an environment where "git submodule update --remote" is not that
> harmful, but rather brings the joy of being up to date in each project?
>
>> I started promoting using submodules for this, since it seemed quite natural.
>>
>> The core problem, is that several developers never quite understood or
>> grasped how submodules worked. There's problems like "but what if I
>> wanna work on master?" or people assume submodules need to be checked
>> out at master instead of in a detached HEAD state.
>
> So the documentation sucks?
>
> It is intentional that from the superprojects perspective the gitlink
> must be one
> exact value, and rely on the submodule to get to and keep that state.
>
> (I think we once discussed if setting the gitlink value to 00...00 or otherwise
> signal that we actually want "the most recent tip of the X branch" would be
> a good idea, but I do not think it as it misses the point of versioning)
>
>> So we often get people who don't run git submodule update and thus are
>> confused about why their submodules are often out of date. (This can
>> be solved by recursive options to commands to more often recurse into
>> submodules and checkout and update them).
>>
>> We also often get people who accidentally commit the old version of
>> the repository, or commit an update to the parent project pointing the
>> submodule at some commit which isn't yet in the upstream of the common
>> repository.
>
> Would an upstream prereceive hook (maybe even builtin and accessible via
> 'receive.denyUnreachableSubmodules') help? (It would require submodules
> to be defined with relative URLs in the .gitmodules file and then the receive
> command can check for the gitlink value present in this other repository)
>
>> The proposal here seems to match the intuition about how submodules
>> should work, with the ability to "attach" or "detach" the submodule
>> when working on the submodule directly.
>
> Well I think the big picture discussion is how easy this attaching or
> detaching is. Whether only the HEAD is attached or detached, or if we
> invent a new refstore that is a complete new submodule thing, which
> cannot be detached from the superproject at all.
>
>> Ideally, I'd like for more ways to say "ignore what my submodule is
>> checked out at, since I will have something else checked out, and
>> don't intend to commit just yet."
>
> This is in the superproject, when doing a git add . ?
>

Yes.

>> Basically, a workflow where it's easier to have each submodule checked
>> out at master, and we can still keep track of historical relationship
>> of what commit was the submodule at some time ago, but without causing
>> some of these headaches.
>
> So essentially a repo or otherwise parallel workflow just with the versioning
> happening magically behind your back?

Ideally, my developers would like to just have each submodule checked
out at master.

Ideally, I'd like to be able to checkout an old version of the parent
project and have it recorded what version of the shared submodule was
at at the time.

Ideally, my developers don't want to have to worry about knowing that
they shouldn't "git add -a" or "git commit -a" when they have a
submodule checked out at a different location from the parent projects
gitlink.

Thanks,
Jake

>
>> I've often tried to use the "--skip-worktree" bit to have people set
>> their repository to ignore the submodule. Unfortunately, this is
>> pretty complex, and most of the time, developers never remember to do
>> this again on a fresh clone.
>
> That sounds interesting.
>
> Thanks,
> Stefan

^ permalink raw reply	[relevance 25%]

* Re: [RFD] Long term plan with submodule refs?
  2017-11-10  3:37         ` Jacob Keller
@ 2017-11-10 20:01           ` Stefan Beller
  2017-11-11  5:25             ` Jacob Keller
  0 siblings, 1 reply; 200+ results
From: Stefan Beller @ 2017-11-10 20:01 UTC (permalink / raw)
  To: Jacob Keller; +Cc: Jonathan Tan, Git mailing list

>
>>> Basically, a workflow where it's easier to have each submodule checked
>>> out at master, and we can still keep track of historical relationship
>>> of what commit was the submodule at some time ago, but without causing
>>> some of these headaches.
>>
>> So essentially a repo or otherwise parallel workflow just with the versioning
>> happening magically behind your back?
>
> Ideally, my developers would like to just have each submodule checked
> out at master.
>
> Ideally, I'd like to be able to checkout an old version of the parent
> project and have it recorded what version of the shared submodule was
> at at the time.

This sounds as if a "passive superproject" would work best for you, i.e.
each commit in a submodule is bubbled up into the superproject,
making a commit potentially even behind the scenes, such that the
user interaction with the superproject would be none.

However this approach also sounds careless, as there is no precondition
that e.g. the superproject builds with all the submodules as is; it is a mere
tracking of "at this time we have the submodules arranged as such",
whereas for the versioning aspect, you would want to have commit messages
in the superproject saying *why* you bumped up a specific submodule.
The user may not like to give such an explanation as they already wrote
a commit message for the individual project.

Also this approach sounds like a local approach, as it is not clear to me,
why you'd want to share the superproject history.

> Ideally, my developers don't want to have to worry about knowing that
> they shouldn't "git add -a" or "git commit -a" when they have a
> submodule checked out at a different location from the parent projects
> gitlink.
>
> Thanks,
> Jake
>

^ permalink raw reply	[relevance 26%]

* [PATCH] builtin/describe.c: describe a blob
  2017-11-10 22:44 ` [PATCH 0/1] describe a blob: with better docs Stefan Beller
@ 2017-11-10 22:44   ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-11-10 22:44 UTC (permalink / raw)
  To: gitster; +Cc: Johannes.Schindelin, git, jacob.keller, me, philipoakley, sbeller

Sometimes users are given a hash of an object and they want to
identify it further (ex.: Use verify-pack to find the largest blobs,
but what are these? or [1])

When describing commits, we try to anchor them to tags or refs, as these
are conceptually on a higher level than the commit. And if there is no ref
or tag that matches exactly, we're out of luck.  So we employ a heuristic
to make up a name for the commit. These names are ambiguous, there might
be different tags or refs to anchor to, and there might be different
path in the DAG to travel to arrive at the commit precisely.

When describing a blob, we want to describe the blob from a higher layer
as well, which is a tuple of (commit, deep/path) as the tree objects
involved are rather uninteresting.  The same blob can be referenced by
multiple commits, so how we decide which commit to use?  This patch
implements a rather naive approach on this: As there are no back pointers
from blobs to commits in which the blob occurs, we'll start walking from
any tips available, listing the blobs in-order of the commit and once we
found the blob, we'll take the first commit that listed the blob.  For
source code this is likely not the first commit that introduced the blob,
but rather the latest commit that contained the blob.  For example:

  git describe v0.99:Makefile
  v0.99-5-gab6625e06a:Makefile

tells us the latest commit that contained the Makefile as it was in tag
v0.99 is commit v0.99-5-gab6625e06a (and at the same path), as the next
commit on top v0.99-6-gb1de9de2b9 ([PATCH] Bootstrap "make dist",
2005-07-11) touches the Makefile.

Let's see how this description turns out, if it is useful in day-to-day
use as I have the intuition that we'd rather want to see the *first*
commit that this blob was introduced to the repository (which can be
achieved easily by giving the `--reverse` flag in the describe_blob rev
walk).

[1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-describe.txt | 13 +++++++-
 builtin/describe.c             | 71 ++++++++++++++++++++++++++++++++++++++----
 t/t6120-describe.sh            | 15 +++++++++
 3 files changed, 92 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-describe.txt b/Documentation/git-describe.txt
index c924c945ba..a25443ca91 100644
--- a/Documentation/git-describe.txt
+++ b/Documentation/git-describe.txt
@@ -3,7 +3,7 @@ git-describe(1)
 
 NAME
 ----
-git-describe - Describe a commit using the most recent tag reachable from it
+git-describe - Describe a commit or blob using the graph relations
 
 
 SYNOPSIS
@@ -11,6 +11,7 @@ SYNOPSIS
 [verse]
 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] [<commit-ish>...]
 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] --dirty[=<mark>]
+'git describe' <blob>
 
 DESCRIPTION
 -----------
@@ -24,6 +25,16 @@ By default (without --all or --tags) `git describe` only shows
 annotated tags.  For more information about creating annotated tags
 see the -a and -s options to linkgit:git-tag[1].
 
+If the given object refers to a blob, it will be described
+as `<commit-ish>:<path>`, such that the blob can be found
+at `<path>` in the `<commit-ish>`. Note, that the commit is likely
+not the commit that introduced the blob, but the one that was found
+first; to find the commit that introduced the blob, you need to find
+the commit that last touched the path, e.g.
+`git log <commit-description> -- <path>`.
+As blobs do not point at the commits they are contained in,
+describing blobs is slow as we have to walk the whole graph.
+
 OPTIONS
 -------
 <commit-ish>...::
diff --git a/builtin/describe.c b/builtin/describe.c
index 9e9a5ed5d4..acfd853a30 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -3,6 +3,7 @@
 #include "lockfile.h"
 #include "commit.h"
 #include "tag.h"
+#include "blob.h"
 #include "refs.h"
 #include "builtin.h"
 #include "exec_cmd.h"
@@ -11,8 +12,9 @@
 #include "hashmap.h"
 #include "argv-array.h"
 #include "run-command.h"
+#include "revision.h"
+#include "list-objects.h"
 
-#define SEEN		(1u << 0)
 #define MAX_TAGS	(FLAG_BITS - 1)
 
 static const char * const describe_usage[] = {
@@ -434,6 +436,56 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst)
 		strbuf_addstr(dst, suffix);
 }
 
+struct process_commit_data {
+	struct object_id current_commit;
+	struct object_id looking_for;
+	struct strbuf *dst;
+	struct rev_info *revs;
+};
+
+static void process_commit(struct commit *commit, void *data)
+{
+	struct process_commit_data *pcd = data;
+	pcd->current_commit = commit->object.oid;
+}
+
+static void process_object(struct object *obj, const char *path, void *data)
+{
+	struct process_commit_data *pcd = data;
+
+	if (!oidcmp(&pcd->looking_for, &obj->oid) && !pcd->dst->len) {
+		reset_revision_walk();
+		describe_commit(&pcd->current_commit, pcd->dst);
+		strbuf_addf(pcd->dst, ":%s", path);
+		pcd->revs->max_count = 0;
+	}
+}
+
+static void describe_blob(struct object_id oid, struct strbuf *dst)
+{
+	struct rev_info revs;
+	struct argv_array args = ARGV_ARRAY_INIT;
+	struct process_commit_data pcd = { null_oid, oid, dst, &revs};
+
+	argv_array_pushl(&args, "internal: The first arg is not parsed",
+		"--all", "--reflog", /* as many starting points as possible */
+		/* NEEDSWORK: --all is incompatible with worktrees for now: */
+		"--single-worktree",
+		"--objects",
+		"--in-commit-order",
+		NULL);
+
+	init_revisions(&revs, NULL);
+	if (setup_revisions(args.argc, args.argv, &revs, NULL) > 1)
+		BUG("setup_revisions could not handle all args?");
+
+	if (prepare_revision_walk(&revs))
+		die("revision walk setup failed");
+
+	traverse_commit_list(&revs, process_commit, process_object, &pcd);
+	reset_revision_walk();
+}
+
 static void describe(const char *arg, int last_one)
 {
 	struct object_id oid;
@@ -445,11 +497,18 @@ static void describe(const char *arg, int last_one)
 
 	if (get_oid(arg, &oid))
 		die(_("Not a valid object name %s"), arg);
-	cmit = lookup_commit_reference(&oid);
-	if (!cmit)
-		die(_("%s is not a valid '%s' object"), arg, commit_type);
-
-	describe_commit(&oid, &sb);
+	cmit = lookup_commit_reference_gently(&oid, 1);
+
+	if (cmit)
+		describe_commit(&oid, &sb);
+	else if (lookup_blob(&oid)) {
+		if (all || tags || longformat || first_parent ||
+		    patterns.nr || exclude_patterns.nr ||
+		    always || dirty || broken)
+			die(_("options not available for describing blobs"));
+		describe_blob(oid, &sb);
+	} else
+		die(_("%s is neither a commit nor blob"), arg);
 
 	puts(sb.buf);
 
diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index c8b7ed82d9..aec6ed192d 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -310,6 +310,21 @@ test_expect_success 'describe ignoring a broken submodule' '
 	grep broken out
 '
 
+test_expect_success 'describe a blob at a tag' '
+	echo "make it a unique blob" >file &&
+	git add file && git commit -m "content in file" &&
+	git tag -a -m "latest annotated tag" unique-file &&
+	git describe HEAD:file >actual &&
+	echo "unique-file:file" >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'describe a blob with commit-ish' '
+	git commit --allow-empty -m "empty commit" &&
+	git describe HEAD:file >actual &&
+	grep unique-file-1-g actual
+'
+
 test_expect_failure ULIMIT_STACK_SIZE 'name-rev works in a deep repo' '
 	i=1 &&
 	while test $i -lt 8000
-- 
2.15.0.128.gcadd42da22


^ permalink raw reply	[relevance 10%]

* [PATCH] t/3512: demonstrate unrelated submodule/file conflict as cherry-pick failure
  2017-11-10 16:41 Bug: cherry-pick & submodule Ефимов Василий
@ 2017-11-11  0:04 ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan Beller @ 2017-11-11  0:04 UTC (permalink / raw)
  To: real; +Cc: git, Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---

I rewrote your script to integrate with our test suite, potentially acting as
a regression test once we solve the Directory/File conflict issue.

Thanks,
Stefan

 t/t3512-cherry-pick-submodule.sh | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/t/t3512-cherry-pick-submodule.sh b/t/t3512-cherry-pick-submodule.sh
index 6863b7bb6f..17a6773247 100755
--- a/t/t3512-cherry-pick-submodule.sh
+++ b/t/t3512-cherry-pick-submodule.sh
@@ -10,4 +10,40 @@ KNOWN_FAILURE_NOFF_MERGE_DOESNT_CREATE_EMPTY_SUBMODULE_DIR=1
 KNOWN_FAILURE_NOFF_MERGE_ATTEMPTS_TO_MERGE_REMOVED_SUBMODULE_FILES=1
 test_submodule_switch "git cherry-pick"
 
+test_expect_failure 'unrelated submodule/file conflict is ignored' '
+	test_create_repo sub &&
+
+	touch sub/file &&
+	git -C sub add file &&
+	git -C sub commit -m "add a file in a submodule" &&
+
+	test_create_repo a_repo &&
+	(
+		cd a_repo &&
+		touch a_file &&
+		git add a_file &&
+		git commit -m "add a file" &&
+
+		git branch test &&
+		git checkout test &&
+
+		mkdir sub &&
+		touch sub/content &&
+		git add sub/content &&
+		git commit -m "add a regular folder with name sub" &&
+
+		echo "123" >a_file &&
+		git add a_file &&
+		git commit -m "modify a file" &&
+
+		git checkout master &&
+
+		git submodule add ../sub sub &&
+		git submodule update sub &&
+		git commit -m "add a submodule info folder with name sub" &&
+
+		git cherry-pick test
+	)
+'
+
 test_done
-- 
2.15.0.128.gcadd42da22


^ permalink raw reply	[relevance 32%]

* Re: [RFD] Long term plan with submodule refs?
  2017-11-10 20:01           ` Stefan Beller
@ 2017-11-11  5:25             ` Jacob Keller
  0 siblings, 0 replies; 200+ results
From: Jacob Keller @ 2017-11-11  5:25 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Jonathan Tan, Git mailing list

On Fri, Nov 10, 2017 at 12:01 PM, Stefan Beller <sbeller@google.com> wrote:
>>
>>>> Basically, a workflow where it's easier to have each submodule checked
>>>> out at master, and we can still keep track of historical relationship
>>>> of what commit was the submodule at some time ago, but without causing
>>>> some of these headaches.
>>>
>>> So essentially a repo or otherwise parallel workflow just with the versioning
>>> happening magically behind your back?
>>
>> Ideally, my developers would like to just have each submodule checked
>> out at master.
>>
>> Ideally, I'd like to be able to checkout an old version of the parent
>> project and have it recorded what version of the shared submodule was
>> at at the time.
>
> This sounds as if a "passive superproject" would work best for you, i.e.
> each commit in a submodule is bubbled up into the superproject,
> making a commit potentially even behind the scenes, such that the
> user interaction with the superproject would be none.
>
> However this approach also sounds careless, as there is no precondition
> that e.g. the superproject builds with all the submodules as is; it is a mere
> tracking of "at this time we have the submodules arranged as such",
> whereas for the versioning aspect, you would want to have commit messages
> in the superproject saying *why* you bumped up a specific submodule.
> The user may not like to give such an explanation as they already wrote
> a commit message for the individual project.
>
> Also this approach sounds like a local approach, as it is not clear to me,
> why you'd want to share the superproject history.
>
>> Ideally, my developers don't want to have to worry about knowing that
>> they shouldn't "git add -a" or "git commit -a" when they have a
>> submodule checked out at a different location from the parent projects
>> gitlink.
>>
>> Thanks,
>> Jake
>>


It doesn't need to be totally passive, in that some (or one
maintainer) can manage when the submodule pointer is actually updated,
but ideally other users don't have to worry about that and can
"pretend" to always keep each submodule at master, as they have always
done in the past.

Thanks,
Jake

^ permalink raw reply	[relevance 24%]

* Re: [PATCH 02/30] merge-recursive: Fix logic ordering issue
  2017-11-13 22:12       ` Stefan Beller
@ 2017-11-13 23:39         ` Elijah Newren
  2017-11-13 23:46           ` Stefan Beller
  0 siblings, 1 reply; 200+ results
From: Elijah Newren @ 2017-11-13 23:39 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

On Mon, Nov 13, 2017 at 2:12 PM, Stefan Beller <sbeller@google.com> wrote:
> I wanted to debug a very similar issue today just after reviewing this
> series, see
> https://public-inbox.org/git/743acc29-85bb-3773-b6a0-68d4a0b8fd63@ispras.ru/

Oh, bleh.  That's not a D/F conflict at all, it's the code assuming
there's a D/F conflict because the entry it is processing ("sub") is a
submodule rather than a file, and it panics when it sees "a directory
in the way" -- a directory that just so happens to be named "sub" and
which is in fact the desired submodule, meaning that the working
directory is already good and needs no changes.

In this case, the relevant code from merge-recursive.c is the following:

        /* Case B: Added in one. */
        /* [nothing|directory] -> ([nothing|directory], file) */
<snip>
        if (dir_in_way(path, !o->call_depth,
                   S_ISGITLINK(a_mode))) {
            char *new_path = unique_path(o, path, add_branch);
            clean_merge = 0;
            output(o, 1, _("CONFLICT (%s): There is a directory with
name %s in %s. "
                   "Adding %s as %s"),
                   conf, path, other_branch, path, new_path);

Note that the comment even explicitly assumes the newly added entry is
a file.  We should expect there to be a directory present (the
submodule being added), but the code doesn't have a check for that.
The S_ISGITLINK(a_mode) makes you think it has special handling for
the submodule case, but that's for the reverse situation (the
submodule isn't yet present in the working copy, it came from the
other side of history, but there is an empty directory present).

^ permalink raw reply	[relevance 18%]

* Re: [PATCH 02/30] merge-recursive: Fix logic ordering issue
  2017-11-13 23:39         ` Elijah Newren
@ 2017-11-13 23:46           ` Stefan Beller
  0 siblings, 0 replies; 200+ results
From: Stefan