git@vger.kernel.org mailing list mirror (one of many)
 help / Atom feed
* [PATCH 0/3] git-describe deals gracefully with broken submodules
@ 2017-03-21  0:11 Stefan Beller
  2017-03-21  6:28 ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Stefan Beller @ 2017-03-21  0:11 UTC (permalink / raw)
  To: jrnieder; +Cc: git, mfick, Stefan Beller

Our own version generation in GIT-VERSION-GEN is somewhat sane by testing
if we have a .git dir, and use that as a signal whether the obtained
copy of git was obtained using git (clone/fetch) or if it is just a
downloaded tar ball.

Other scripts to generate a version are not as cautious and just run
"git describe". An error from git-describe is treated as a sufficient
signal to assume it is not a git repository.

When submodules come into play, this is not true, as a submodule
may be damaged instead, such that we're still in a git repository
but error out for the sake of reporting a severly broken submodule.

Add a flag to git-describe that instructs it to treat severe submodule
errors as "dirty" instead of erroring out.

Thanks,
Stefan

Stefan Beller (3):
  submodule.c: port is_submodule_modified to use porcelain 2
  revision machinery: gentle submodule errors
  builtin/describe: introduce --submodule-error-as-dirty flag

 builtin/describe.c                           | 17 ++---
 diff-lib.c                                   | 10 ++-
 diff.h                                       |  1 +
 diffcore.h                                   |  3 +-
 revision.c                                   |  2 +
 submodule.c                                  | 94 +++++++++++++++++-----------
 submodule.h                                  | 10 ++-
 t/t4060-diff-submodule-option-diff-format.sh | 22 +++++++
 t/t6120-describe.sh                          | 17 +++++
 9 files changed, 128 insertions(+), 48 deletions(-)

-- 
2.12.0.402.g4b3201c2d6.dirty


^ permalink raw reply	[flat|threaded] 12+ messages in thread

* Re: [PATCH 0/3] git-describe deals gracefully with broken submodules
  2017-03-21  0:11 [PATCH 0/3] git-describe deals gracefully with broken submodules Stefan Beller
@ 2017-03-21  6:28 ` Junio C Hamano
  2017-03-21  6:54   ` Junio C Hamano
  2017-03-21 17:46   ` Stefan Beller
  0 siblings, 2 replies; 12+ messages in thread
From: Junio C Hamano @ 2017-03-21  6:28 UTC (permalink / raw)
  To: Stefan Beller; +Cc: jrnieder, git, mfick

Stefan Beller <sbeller@google.com> writes:

> Our own version generation in GIT-VERSION-GEN is somewhat sane by testing
> if we have a .git dir, and use that as a signal whether the obtained
> copy of git was obtained using git (clone/fetch) or if it is just a
> downloaded tar ball.
>
> Other scripts to generate a version are not as cautious and just run
> "git describe". An error from git-describe is treated as a sufficient
> signal to assume it is not a git repository.
>
> When submodules come into play, this is not true, as a submodule
> may be damaged instead, such that we're still in a git repository
> but error out for the sake of reporting a severly broken submodule.
>
> Add a flag to git-describe that instructs it to treat severe submodule
> errors as "dirty" instead of erroring out.

I do not have a strong preference for or against the "treat a broken
repository as if nothing is wrong with the revision, but just mark
it as dirty" idea.  I would be more receptive if it substituted the
"-dirty" marker with something else, e.g. "-broken", though.

My knee-jerk reaction to the code change is that treating submodule
as something very special is probably not a good idea.  Even if you
do not use submodules, if some of the objects referenced from your
index and/or HEAD are damaged or otherwise causes some error while
accessing, the diff machinery would die, wouldn't it?

I saw that some new symbolic constants in the code to tell the
machinery to "gracefully die" (or "hide the breakage under the rug")
are named with SUBMODULE in them, which is probably a bad sign that
the design is being too centric to submodules.  The implementation
that covers only breakages in submodule as its first step may be OK
(you have to start somewhere, after all), but I think the aspiration
should be to cover all kinds of breakages in the end and turn them
to be "graceful", and if you had that goal in mind, you wouldn't be
naming these constants with SUBMODULE in them.

If "treat a broken repository as just being 'dirty'" were a good
idea, I'd suspect that we would want to see all breakages, not just
ones related to submodules, to be treated the same way.  

But it is possible there may be a reason why submodules are special.
I do not think the third paragraph quoted above is a good
justification.  A repository with broken submodule is just as broken
and untrustworthy as a broken repository without a submodule, and if
you want to allow such a checkout with broken submodule to call
itself v2.0-dirty, you would also want to allow a broken checkout
without any submodule to do so, too.


^ permalink raw reply	[flat|threaded] 12+ messages in thread

* Re: [PATCH 0/3] git-describe deals gracefully with broken submodules
  2017-03-21  6:28 ` Junio C Hamano
@ 2017-03-21  6:54   ` Junio C Hamano
  2017-03-21 18:51     ` [PATCH] builtin/describe: introduce --broken flag Stefan Beller
  2017-03-21 17:46   ` Stefan Beller
  1 sibling, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2017-03-21  6:54 UTC (permalink / raw)
  To: Stefan Beller; +Cc: jrnieder, git, mfick

Junio C Hamano <gitster@pobox.com> writes:

> I do not have a strong preference for or against the "treat a broken
> repository as if nothing is wrong with the revision, but just mark
> it as dirty" idea.  I would be more receptive if it substituted the
> "-dirty" marker with something else, e.g. "-broken", though.
> ...
> But it is possible there may be a reason why submodules are special.
> I do not think the third paragraph quoted above is a good
> justification.  A repository with broken submodule is just as broken
> and untrustworthy as a broken repository without a submodule, and if
> you want to allow such a checkout with broken submodule to call
> itself v2.0-dirty, you would also want to allow a broken checkout
> without any submodule to do so, too.

Having said all that, if I thought it were a good idea to optionally
allow people to treat "repository is corrupt in some ways to make it
impossible for us if the checkout is dirty or not, even though we
are fairly certain what commit .git/HEAD points at is" as a state
that is describable, I probably would

 - introduce a new "git describe --possibly-broken" option;

 - instead of running "diff-index" internally to decide the "-dirty"
   state, spawn "diff-index" as a separate process;

 - observe the exit status from "diff-index" and add "-dirty" suffix
   when it is _known_ to be dirty, add "-broken" suffix when it
   died, and leave out the suffix when we know that the checkout is
   clean.

That way, we wouldn't have to contaminate the generic codepath with
a "treat broken and modified states as if they are the same" logic,
only to support such a niche feature that we wouldn't want to use
anywhere else.

^ permalink raw reply	[flat|threaded] 12+ messages in thread

* Re: [PATCH 0/3] git-describe deals gracefully with broken submodules
  2017-03-21  6:28 ` Junio C Hamano
  2017-03-21  6:54   ` Junio C Hamano
@ 2017-03-21 17:46   ` Stefan Beller
  1 sibling, 0 replies; 12+ messages in thread
From: Stefan Beller @ 2017-03-21 17:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Nieder, git, Martin Fick

On Mon, Mar 20, 2017 at 11:28 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> Our own version generation in GIT-VERSION-GEN is somewhat sane by testing
>> if we have a .git dir, and use that as a signal whether the obtained
>> copy of git was obtained using git (clone/fetch) or if it is just a
>> downloaded tar ball.
>>
>> Other scripts to generate a version are not as cautious and just run
>> "git describe". An error from git-describe is treated as a sufficient
>> signal to assume it is not a git repository.
>>
>> When submodules come into play, this is not true, as a submodule
>> may be damaged instead, such that we're still in a git repository
>> but error out for the sake of reporting a severly broken submodule.
>>
>> Add a flag to git-describe that instructs it to treat severe submodule
>> errors as "dirty" instead of erroring out.
>
> I do not have a strong preference for or against the "treat a broken
> repository as if nothing is wrong with the revision, but just mark
> it as dirty" idea.  I would be more receptive if it substituted the
> "-dirty" marker with something else, e.g. "-broken", though.

ok. That was my initial plan. I plan on also incorporating this into
git-status.

>
> My knee-jerk reaction to the code change is that treating submodule
> as something very special is probably not a good idea.  Even if you
> do not use submodules, if some of the objects referenced from your
> index and/or HEAD are damaged or otherwise causes some error while
> accessing, the diff machinery would die, wouldn't it?

Yes; ideally we would want to have a generic "error with working tree"
for all these kinds of errors. Note that the probability of submodule errors
is way higher. (This has a couple of reasons: (1) a submodule has more
complexity than a file, (2) that opens up new areas in which errors can occur
and (3) submodules are less popular so we have fewer
bug reports and people working on it, hence less mature code.

As an example for (2), we had relative/absolute path issues in Git 2.7
for submodules, which is what Martin was complaining about.

> I saw that some new symbolic constants in the code to tell the
> machinery to "gracefully die" (or "hide the breakage under the rug")
> are named with SUBMODULE in them, which is probably a bad sign that
> the design is being too centric to submodules.  The implementation
> that covers only breakages in submodule as its first step may be OK
> (you have to start somewhere, after all), but I think the aspiration
> should be to cover all kinds of breakages in the end and turn them
> to be "graceful", and if you had that goal in mind, you wouldn't be
> naming these constants with SUBMODULE in them.

ok

>
> If "treat a broken repository as just being 'dirty'" were a good
> idea, I'd suspect that we would want to see all breakages, not just
> ones related to submodules, to be treated the same way.
>
> But it is possible there may be a reason why submodules are special.
> I do not think the third paragraph quoted above is a good
> justification.  A repository with broken submodule is just as broken
> and untrustworthy as a broken repository without a submodule, and if
> you want to allow such a checkout with broken submodule to call
> itself v2.0-dirty, you would also want to allow a broken checkout
> without any submodule to do so, too.

agreed.

> I probably would
>
> - introduce a new "git describe --possibly-broken" option;
>
> - instead of running "diff-index" internally to decide the "-dirty"
>   state, spawn "diff-index" as a separate process;
>
> - observe the exit status from "diff-index" and add "-dirty" suffix
>   when it is _known_ to be dirty, add "-broken" suffix when it
>   died, and leave out the suffix when we know that the checkout is
>   clean.
>
> That way, we wouldn't have to contaminate the generic codepath with
> a "treat broken and modified states as if they are the same" logic,
> only to support such a niche feature that we wouldn't want to use
> anywhere else.

I planned on a follow-up that adds similar behavior to git-status,
such that git-status can be used to get the big picture before
diagnosing the individual breakages of a submodule.

That is the reason why I started with the first patch porting
'is_submodule_modified' to porcelain2 of git-status, as that
allows way better reporting on submodules. Sorry for intermixing
these two things.

Thanks,
Stefan

^ permalink raw reply	[flat|threaded] 12+ messages in thread

* [PATCH] builtin/describe: introduce --broken flag
  2017-03-21  6:54   ` Junio C Hamano
@ 2017-03-21 18:51     ` Stefan Beller
  2017-03-21 21:51       ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Stefan Beller @ 2017-03-21 18:51 UTC (permalink / raw)
  To: gitster; +Cc: git, jrnieder, mfick, Stefan Beller

git-describe tells you the version number you're at, or errors out, e.g.
when you run it outside of a repository, which may happen when downloading
a tar ball instead of using git to obtain the source code.

To keep this property of only erroring out, when not in a repository,
severe (submodule) errors must be downgraded to reporting them gently
instead of having git-describe error out completely.

To achieve that a flag '--broken' is introduced, which is in the same
vein as '--dirty' but uses an actual child process to check for dirtiness.
When that child dies unexpectedly, we'll append '-broken' instead of
'-dirty'.

This patch helps to fix the root cause in [1], which tries to work around
this situation.

[1] ("Work around git describe bug for build.")
https://gerrit-review.googlesource.com/#/c/99851/

Signed-off-by: Stefan Beller <sbeller@google.com>
---

> I probably would
> 
> - introduce a new "git describe --possibly-broken" option;
> 
> - instead of running "diff-index" internally to decide the "-dirty"
>    state, spawn "diff-index" as a separate process;
> 
>  - observe the exit status from "diff-index" and add "-dirty" suffix

This is what this patch does. It doesn't need any preceeding refactor
patches, but comes on its own; developed on origin/master.

Thanks,
Stefan

 Documentation/git-describe.txt |  7 +++++
 builtin/describe.c             | 59 ++++++++++++++++++++++++++++++------------
 t/t6120-describe.sh            | 20 ++++++++++++++
 3 files changed, 70 insertions(+), 16 deletions(-)

diff --git a/Documentation/git-describe.txt b/Documentation/git-describe.txt
index 8755f3af7b..b71fa7a4ad 100644
--- a/Documentation/git-describe.txt
+++ b/Documentation/git-describe.txt
@@ -34,6 +34,13 @@ OPTIONS
 	It means describe HEAD and appends <mark> (`-dirty` by
 	default) if the working tree is dirty.
 
+--broken[=<mark>]::
+	Describe the working tree.
+	It means describe HEAD and appends <mark> (`-broken` by
+	default) if the working tree cannot be examined for dirtiness.
+	This implies `--dirty`, which is the fallback behavior when
+	the working tree examination works.
+
 --all::
 	Instead of using only the annotated tags, use any ref
 	found in `refs/` namespace.  This option enables matching
diff --git a/builtin/describe.c b/builtin/describe.c
index 76c18059bf..37a83520c9 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -9,6 +9,7 @@
 #include "diff.h"
 #include "hashmap.h"
 #include "argv-array.h"
+#include "run-command.h"
 
 #define SEEN		(1u << 0)
 #define MAX_TAGS	(FLAG_BITS - 1)
@@ -31,12 +32,7 @@ static int have_util;
 static struct string_list patterns = STRING_LIST_INIT_NODUP;
 static struct string_list exclude_patterns = STRING_LIST_INIT_NODUP;
 static int always;
-static const char *dirty;
-
-/* diff-index command arguments to check if working tree is dirty. */
-static const char *diff_index_args[] = {
-	"diff-index", "--quiet", "HEAD", "--", NULL
-};
+static const char *append, *dirty, *broken;
 
 struct commit_name {
 	struct hashmap_entry entry;
@@ -292,8 +288,8 @@ static void describe(const char *arg, int last_one)
 		display_name(n);
 		if (longformat)
 			show_suffix(0, n->tag ? &n->tag->tagged->oid : &oid);
-		if (dirty)
-			printf("%s", dirty);
+		if (append)
+			printf("%s", append);
 		printf("\n");
 		return;
 	}
@@ -369,8 +365,8 @@ static void describe(const char *arg, int last_one)
 		struct object_id *oid = &cmit->object.oid;
 		if (always) {
 			printf("%s", find_unique_abbrev(oid->hash, abbrev));
-			if (dirty)
-				printf("%s", dirty);
+			if (append)
+				printf("%s", append);
 			printf("\n");
 			return;
 		}
@@ -413,8 +409,8 @@ static void describe(const char *arg, int last_one)
 	display_name(all_matches[0].name);
 	if (abbrev)
 		show_suffix(all_matches[0].depth, &cmit->object.oid);
-	if (dirty)
-		printf("%s", dirty);
+	if (append)
+		printf("%s", append);
 	printf("\n");
 
 	if (!last_one)
@@ -445,6 +441,9 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
 		{OPTION_STRING, 0, "dirty",  &dirty, N_("mark"),
 			N_("append <mark> on dirty working tree (default: \"-dirty\")"),
 			PARSE_OPT_OPTARG, NULL, (intptr_t) "-dirty"},
+		{OPTION_STRING, 0, "broken",  &broken, N_("mark"),
+			N_("append <mark> on broken working tree (default: \"-broken\")"),
+			PARSE_OPT_OPTARG, NULL, (intptr_t) "-broken"},
 		OPT_END(),
 	};
 
@@ -493,10 +492,35 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
 		die(_("No names found, cannot describe anything."));
 
 	if (argc == 0) {
-		if (dirty) {
+		if (broken) {
+			struct child_process cp = CHILD_PROCESS_INIT;
+			argv_array_pushl(&cp.args, "diff-index", "--quiet", "HEAD", "--", NULL);
+			cp.git_cmd = 1;
+			cp.no_stdin = 1;
+			cp.no_stdout = 1;
+
+			if (!dirty)
+				dirty = "-dirty";
+
+			switch (run_command(&cp)) {
+			case 0:
+				append = NULL;
+				break;
+			case 1:
+				/* keep dirty as is */
+				append = dirty;
+				break;
+			default:
+				/* diff-index aborted abnormally */
+				append = broken;
+			}
+		} else if (dirty) {
+			struct argv_array args = ARGV_ARRAY_INIT;
 			static struct lock_file index_lock;
 			int fd;
 
+			argv_array_pushl(&args, "diff-index", "--quiet", "HEAD", "--", NULL);
+
 			read_cache_preload(NULL);
 			refresh_index(&the_index, REFRESH_QUIET|REFRESH_UNMERGED,
 				      NULL, NULL, NULL);
@@ -504,13 +528,16 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
 			if (0 <= fd)
 				update_index_if_able(&the_index, &index_lock);
 
-			if (!cmd_diff_index(ARRAY_SIZE(diff_index_args) - 1,
-					    diff_index_args, prefix))
-				dirty = NULL;
+			if (cmd_diff_index(args.argc, args.argv, prefix))
+				append = dirty;
+			else
+				append = NULL;
 		}
 		describe("HEAD", 1);
 	} else if (dirty) {
 		die(_("--dirty is incompatible with commit-ishes"));
+	} else if (broken) {
+		die(_("--broken is incompatible with commit-ishes"));
 	} else {
 		while (argc-- > 0)
 			describe(*argv++, argc == 0);
diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 167491fd5b..16952e44fc 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -233,4 +233,24 @@ test_expect_success 'describe --contains and --no-match' '
 	test_cmp expect actual
 '
 
+test_expect_success 'setup and absorb a submodule' '
+	test_create_repo sub1 &&
+	test_commit -C sub1 initial &&
+	git submodule add ./sub1 &&
+	git submodule absorbgitdirs &&
+	git commit -a -m "add submodule" &&
+	git describe --dirty >expect &&
+	git describe --broken >out &&
+	test_cmp expect out
+'
+
+test_expect_success 'describe chokes on severly broken submodules' '
+	mv .git/modules/sub1/ .git/modules/sub_moved &&
+	test_must_fail git describe --dirty
+'
+test_expect_success 'describe ignoring a borken submodule' '
+	git describe --broken >out &&
+	grep broken out
+'
+
 test_done
-- 
2.12.0.402.g0501f7a28e.dirty


^ permalink raw reply	[flat|threaded] 12+ messages in thread

* Re: [PATCH] builtin/describe: introduce --broken flag
  2017-03-21 18:51     ` [PATCH] builtin/describe: introduce --broken flag Stefan Beller
@ 2017-03-21 21:51       ` Junio C Hamano
  2017-03-21 22:27         ` Stefan Beller
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2017-03-21 21:51 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, jrnieder, mfick

Stefan Beller <sbeller@google.com> writes:

> This patch helps to fix the root cause in [1], which tries to work around
> this situation.

I do not necessarily think it is reasonable to give $version-dirty
and proceed when a repository corruption is detected; if there is a
breakage in the repository, "git describe" is correct to die when a
populated submodule is broken.  IOW, I do not agree that [1] below
is working with a sensible expectation.

This is a tangent, but how does the Gerrit repository get corrupted
in the way described in [1] in the first place?  That might be what
needs to be corrected, perhaps?

>  Documentation/git-describe.txt |  7 +++++
>  builtin/describe.c             | 59 ++++++++++++++++++++++++++++++------------
>  t/t6120-describe.sh            | 20 ++++++++++++++
>  3 files changed, 70 insertions(+), 16 deletions(-)
>
> diff --git a/Documentation/git-describe.txt b/Documentation/git-describe.txt
> index 8755f3af7b..b71fa7a4ad 100644
> --- a/Documentation/git-describe.txt
> +++ b/Documentation/git-describe.txt
> @@ -34,6 +34,13 @@ OPTIONS
>  	It means describe HEAD and appends <mark> (`-dirty` by
>  	default) if the working tree is dirty.
>  
> +--broken[=<mark>]::
> +	Describe the working tree.
> +	It means describe HEAD and appends <mark> (`-broken` by
> +	default) if the working tree cannot be examined for dirtiness.
> +	This implies `--dirty`, which is the fallback behavior when
> +	the working tree examination works.

The wording for the "--dirty" is already awkward, but this one is
even more so ("Describe the working tree. It means" conveys no
useful information).  I however cannot come up with something much
better.  This is the best I could come up with:

	Describe the state of the working tree.  When the working
	tree matches HEAD, the output is the same as "git describe
	HEAD" and "-dirty" is appended to it if the working tree has
	local modification.  When a repository is corrupt and Git
	cannot determine if there is local modification, instead of
	dying, append "-broken" instead.

The last sentence can be tweaked to replace the description for
"--dirty".

> diff --git a/builtin/describe.c b/builtin/describe.c
> index 76c18059bf..37a83520c9 100644
> --- a/builtin/describe.c
> +++ b/builtin/describe.c
> @@ -9,6 +9,7 @@
>  #include "diff.h"
>  #include "hashmap.h"
>  #include "argv-array.h"
> +#include "run-command.h"
>  
>  #define SEEN		(1u << 0)
>  #define MAX_TAGS	(FLAG_BITS - 1)
> @@ -31,12 +32,7 @@ static int have_util;
>  static struct string_list patterns = STRING_LIST_INIT_NODUP;
>  static struct string_list exclude_patterns = STRING_LIST_INIT_NODUP;
>  static int always;
> -static const char *dirty;
> -
> -/* diff-index command arguments to check if working tree is dirty. */
> -static const char *diff_index_args[] = {
> -	"diff-index", "--quiet", "HEAD", "--", NULL
> -};
> +static const char *append, *dirty, *broken;

Perhaps call it "suffix" or something?

> +		if (broken) {
> +			struct child_process cp = CHILD_PROCESS_INIT;
> +			argv_array_pushl(&cp.args, "diff-index", "--quiet", "HEAD", "--", NULL);
> ...
> +			}
> +		} else if (dirty) {
> +			struct argv_array args = ARGV_ARRAY_INIT;
>  			static struct lock_file index_lock;
>  			int fd;
>  
> +			argv_array_pushl(&args, "diff-index", "--quiet", "HEAD", "--", NULL);


Somehow it looks like the patch is making the code a lot worse by
losing diff_index_args[] and duplicating the same command line twice
in the code.  Wouldn't argv_array_pushv() into these two different args
array from the same template work better?

^ permalink raw reply	[flat|threaded] 12+ messages in thread

* Re: [PATCH] builtin/describe: introduce --broken flag
  2017-03-21 21:51       ` Junio C Hamano
@ 2017-03-21 22:27         ` Stefan Beller
  2017-03-21 22:41           ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Stefan Beller @ 2017-03-21 22:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jonathan Nieder, Martin Fick

On Tue, Mar 21, 2017 at 2:51 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> This patch helps to fix the root cause in [1], which tries to work around
>> this situation.
>
> I do not necessarily think it is reasonable to give $version-dirty
> and proceed when a repository corruption is detected; if there is a
> breakage in the repository, "git describe" is correct to die when a
> populated submodule is broken.  IOW, I do not agree that [1] below
> is working with a sensible expectation.

ok, so I won't quote it in the commit message

>
> This is a tangent, but how does the Gerrit repository get corrupted
> in the way described in [1] in the first place?  That might be what
> needs to be corrected, perhaps?

AFAICT, someone is (was?) using a version of Git that doesn't contain
f8eaa0ba98 (submodule--helper, module_clone: always operate
on absolute paths, 2016-03-31). So then the submodule paths were
made absolute paths on creation of the Gerrit repo.

And then someone moved the repo and the absolute paths broke.

Even after an upgrade of Git to its latest and greatest version, the underlying
issue of having broken submodule paths remains in that case.

So there are a couple of ways forward
0) as an immediate fix, manually fix the absolute path or make them relative

1A) have more error resilient tools in Git
1B) have a tool in git (e.g. "git submodule fsck-setup") that rewrites
    the .git file link and the core.worktree setting to be relative and correct.

I think we should do both A and B, I decided to go with A first, specifically
"git-describe" as that was reported to not work well in this situation with the
given broken data

> The wording for the "--dirty" is already awkward, but this one is
> even more so ("Describe the working tree. It means" conveys no
> useful information).  I however cannot come up with something much
> better.  This is the best I could come up with:
>
>         Describe the state of the working tree.  When the working
>         tree matches HEAD, the output is the same as "git describe
>         HEAD" and "-dirty" is appended to it if the working tree has
>         local modification.  When a repository is corrupt and Git
>         cannot determine if there is local modification, instead of
>         dying, append "-broken" instead.

ok, I'll reuse parts of it.

>> +static const char *append, *dirty, *broken;
>
> Perhaps call it "suffix" or something?

done


>> +                     argv_array_pushl(&args, "diff-index", "--quiet", "HEAD", "--", NULL);
..
>  Wouldn't argv_array_pushv() into these two different args
> array from the same template work better?

yes, looks much saner. Thanks

^ permalink raw reply	[flat|threaded] 12+ messages in thread

* Re: [PATCH] builtin/describe: introduce --broken flag
  2017-03-21 22:27         ` Stefan Beller
@ 2017-03-21 22:41           ` Junio C Hamano
  2017-03-21 22:50             ` Stefan Beller
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2017-03-21 22:41 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git\, Jonathan Nieder, Martin Fick

Stefan Beller <sbeller@google.com> writes:

> AFAICT, someone is (was?) using a version of Git that doesn't contain
> f8eaa0ba98 (submodule--helper, module_clone: always operate
> on absolute paths, 2016-03-31). So then the submodule paths were
> made absolute paths on creation of the Gerrit repo.
>
> And then someone moved the repo and the absolute paths broke.
>
> Even after an upgrade of Git to its latest and greatest version, the underlying
> issue of having broken submodule paths remains in that case.
>
> So there are a couple of ways forward
> 0) as an immediate fix, manually fix the absolute path or make them relative
>
> 1A) have more error resilient tools in Git
> 1B) have a tool in git (e.g. "git submodule fsck-setup") that rewrites
>     the .git file link and the core.worktree setting to be relative and correct.
>
> I think we should do both A and B, I decided to go with A first, specifically
> "git-describe" as that was reported to not work well in this situation with the
> given broken data

Sounds sensible.  I would say that we should do less 1A (i.e. hiding
problems under the rug) and more 1B.  Of course, making the problem
less likely to happen in the first place would be more important ;-)

Thanks.

^ permalink raw reply	[flat|threaded] 12+ messages in thread

* Re: [PATCH] builtin/describe: introduce --broken flag
  2017-03-21 22:41           ` Junio C Hamano
@ 2017-03-21 22:50             ` Stefan Beller
  2017-03-21 22:57               ` [PATCH v2] builtin/describe: introduce --broken flag Stefan Beller
  0 siblings, 1 reply; 12+ messages in thread
From: Stefan Beller @ 2017-03-21 22:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jonathan Nieder, Martin Fick

On Tue, Mar 21, 2017 at 3:41 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> AFAICT, someone is (was?) using a version of Git that doesn't contain
>> f8eaa0ba98 (submodule--helper, module_clone: always operate
>> on absolute paths, 2016-03-31). So then the submodule paths were
>> made absolute paths on creation of the Gerrit repo.
>>
>> And then someone moved the repo and the absolute paths broke.
>>
>> Even after an upgrade of Git to its latest and greatest version, the underlying
>> issue of having broken submodule paths remains in that case.
>>
>> So there are a couple of ways forward
>> 0) as an immediate fix, manually fix the absolute path or make them relative
>>
>> 1A) have more error resilient tools in Git
>> 1B) have a tool in git (e.g. "git submodule fsck-setup") that rewrites
>>     the .git file link and the core.worktree setting to be relative and correct.
>>
>> I think we should do both A and B, I decided to go with A first, specifically
>> "git-describe" as that was reported to not work well in this situation with the
>> given broken data
>
> Sounds sensible.  I would say that we should do less 1A (i.e. hiding
> problems under the rug) and more 1B.  Of course, making the problem
> less likely to happen in the first place would be more important ;-)

Agreed on not introducing bugs as often. For (B), I wonder what the
right place is.

Maybe "reset --hard", in combination with "--recurse-submodules" would also
fix these submodule path issues. The made up "git submodule fsck-setup"
sounds like it would only detect (and not fix) the problems. But these absolute
paths issues are known how to fix, so a "fsck"-y tool may convey a wrong
impression.

^ permalink raw reply	[flat|threaded] 12+ messages in thread

* [PATCH v2] builtin/describe: introduce --broken flag
  2017-03-21 22:50             ` Stefan Beller
@ 2017-03-21 22:57               ` Stefan Beller
  2017-03-22 17:21                 ` Junio C Hamano
  2017-03-22 21:50                 ` Jakub Narębski
  0 siblings, 2 replies; 12+ messages in thread
From: Stefan Beller @ 2017-03-21 22:57 UTC (permalink / raw)
  To: gitster, jrnieder; +Cc: git, mfick, Stefan Beller

git-describe tells you the version number you're at, or errors out, e.g.
when you run it outside of a repository, which may happen when downloading
a tar ball instead of using git to obtain the source code.

To keep this property of only erroring out, when not in a repository,
severe (submodule) errors must be downgraded to reporting them gently
instead of having git-describe error out completely.

To achieve that a flag '--broken' is introduced, which is in the same
vein as '--dirty' but uses an actual child process to check for dirtiness.
When that child dies unexpectedly, we'll append '-broken' instead of
'-dirty'.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/git-describe.txt | 11 +++++++---
 builtin/describe.c             | 47 ++++++++++++++++++++++++++++++++++--------
 t/t6120-describe.sh            | 20 ++++++++++++++++++
 3 files changed, 66 insertions(+), 12 deletions(-)

diff --git a/Documentation/git-describe.txt b/Documentation/git-describe.txt
index 8755f3af7b..26f19d3b07 100644
--- a/Documentation/git-describe.txt
+++ b/Documentation/git-describe.txt
@@ -30,9 +30,14 @@ OPTIONS
 	Commit-ish object names to describe.  Defaults to HEAD if omitted.
 
 --dirty[=<mark>]::
-	Describe the working tree.
-	It means describe HEAD and appends <mark> (`-dirty` by
-	default) if the working tree is dirty.
+--broken[=<mark>]::
+	Describe the state of the working tree.  When the working
+	tree matches HEAD, the output is the same as "git describe
+	HEAD".  If the working tree has local modification "-dirty"
+	is appended to it.  If a repository is corrupt and Git
+	cannot determine if there is local modification, Git will
+	error out, unless `--broken' is given, which appends
+	the suffix "-broken" instead.
 
 --all::
 	Instead of using only the annotated tags, use any ref
diff --git a/builtin/describe.c b/builtin/describe.c
index 76c18059bf..45adbf67d5 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -9,6 +9,7 @@
 #include "diff.h"
 #include "hashmap.h"
 #include "argv-array.h"
+#include "run-command.h"
 
 #define SEEN		(1u << 0)
 #define MAX_TAGS	(FLAG_BITS - 1)
@@ -31,7 +32,7 @@ static int have_util;
 static struct string_list patterns = STRING_LIST_INIT_NODUP;
 static struct string_list exclude_patterns = STRING_LIST_INIT_NODUP;
 static int always;
-static const char *dirty;
+static const char *suffix, *dirty, *broken;
 
 /* diff-index command arguments to check if working tree is dirty. */
 static const char *diff_index_args[] = {
@@ -292,8 +293,8 @@ static void describe(const char *arg, int last_one)
 		display_name(n);
 		if (longformat)
 			show_suffix(0, n->tag ? &n->tag->tagged->oid : &oid);
-		if (dirty)
-			printf("%s", dirty);
+		if (suffix)
+			printf("%s", suffix);
 		printf("\n");
 		return;
 	}
@@ -369,8 +370,8 @@ static void describe(const char *arg, int last_one)
 		struct object_id *oid = &cmit->object.oid;
 		if (always) {
 			printf("%s", find_unique_abbrev(oid->hash, abbrev));
-			if (dirty)
-				printf("%s", dirty);
+			if (suffix)
+				printf("%s", suffix);
 			printf("\n");
 			return;
 		}
@@ -413,8 +414,8 @@ static void describe(const char *arg, int last_one)
 	display_name(all_matches[0].name);
 	if (abbrev)
 		show_suffix(all_matches[0].depth, &cmit->object.oid);
-	if (dirty)
-		printf("%s", dirty);
+	if (suffix)
+		printf("%s", suffix);
 	printf("\n");
 
 	if (!last_one)
@@ -445,6 +446,9 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
 		{OPTION_STRING, 0, "dirty",  &dirty, N_("mark"),
 			N_("append <mark> on dirty working tree (default: \"-dirty\")"),
 			PARSE_OPT_OPTARG, NULL, (intptr_t) "-dirty"},
+		{OPTION_STRING, 0, "broken",  &broken, N_("mark"),
+			N_("append <mark> on broken working tree (default: \"-broken\")"),
+			PARSE_OPT_OPTARG, NULL, (intptr_t) "-broken"},
 		OPT_END(),
 	};
 
@@ -493,7 +497,28 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
 		die(_("No names found, cannot describe anything."));
 
 	if (argc == 0) {
-		if (dirty) {
+		if (broken) {
+			struct child_process cp = CHILD_PROCESS_INIT;
+			argv_array_pushv(&cp.args, diff_index_args);
+			cp.git_cmd = 1;
+			cp.no_stdin = 1;
+			cp.no_stdout = 1;
+
+			if (!dirty)
+				dirty = "-dirty";
+
+			switch (run_command(&cp)) {
+			case 0:
+				suffix = NULL;
+				break;
+			case 1:
+				suffix = dirty;
+				break;
+			default:
+				/* diff-index aborted abnormally */
+				suffix = broken;
+			}
+		} else if (dirty) {
 			static struct lock_file index_lock;
 			int fd;
 
@@ -506,11 +531,15 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
 
 			if (!cmd_diff_index(ARRAY_SIZE(diff_index_args) - 1,
 					    diff_index_args, prefix))
-				dirty = NULL;
+				suffix = NULL;
+			else
+				suffix = dirty;
 		}
 		describe("HEAD", 1);
 	} else if (dirty) {
 		die(_("--dirty is incompatible with commit-ishes"));
+	} else if (broken) {
+		die(_("--broken is incompatible with commit-ishes"));
 	} else {
 		while (argc-- > 0)
 			describe(*argv++, argc == 0);
diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 167491fd5b..16952e44fc 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -233,4 +233,24 @@ test_expect_success 'describe --contains and --no-match' '
 	test_cmp expect actual
 '
 
+test_expect_success 'setup and absorb a submodule' '
+	test_create_repo sub1 &&
+	test_commit -C sub1 initial &&
+	git submodule add ./sub1 &&
+	git submodule absorbgitdirs &&
+	git commit -a -m "add submodule" &&
+	git describe --dirty >expect &&
+	git describe --broken >out &&
+	test_cmp expect out
+'
+
+test_expect_success 'describe chokes on severly broken submodules' '
+	mv .git/modules/sub1/ .git/modules/sub_moved &&
+	test_must_fail git describe --dirty
+'
+test_expect_success 'describe ignoring a borken submodule' '
+	git describe --broken >out &&
+	grep broken out
+'
+
 test_done
-- 
2.12.1.385.gd4d53dfa89.dirty


^ permalink raw reply	[flat|threaded] 12+ messages in thread

* Re: [PATCH v2] builtin/describe: introduce --broken flag
  2017-03-21 22:57               ` [PATCH v2] builtin/describe: introduce --broken flag Stefan Beller
@ 2017-03-22 17:21                 ` Junio C Hamano
  2017-03-22 21:50                 ` Jakub Narębski
  1 sibling, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2017-03-22 17:21 UTC (permalink / raw)
  To: Stefan Beller; +Cc: jrnieder, git, mfick

Stefan Beller <sbeller@google.com> writes:

> git-describe tells you the version number you're at, or errors out, e.g.
> when you run it outside of a repository, which may happen when downloading
> a tar ball instead of using git to obtain the source code.
>
> To keep this property of only erroring out, when not in a repository,
> severe (submodule) errors must be downgraded to reporting them gently
> instead of having git-describe error out completely.
>
> To achieve that a flag '--broken' is introduced, which is in the same
> vein as '--dirty' but uses an actual child process to check for dirtiness.
> When that child dies unexpectedly, we'll append '-broken' instead of
> '-dirty'.
>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>  Documentation/git-describe.txt | 11 +++++++---
>  builtin/describe.c             | 47 ++++++++++++++++++++++++++++++++++--------
>  t/t6120-describe.sh            | 20 ++++++++++++++++++
>  3 files changed, 66 insertions(+), 12 deletions(-)

I admit that my suggestion to use pushv() was done without trying it
out myself, but I do agree that the result looks better, especially
because the "dirty" (not "broken") side does not even have to use a
separate argv_array to prepare the command line in the first place
(which I failed to spot exactly because I didn't try it myself ;-).

We probably _could_ use cp.argv to point at the diff_index_args()
without doing pushv(&cp.args), and that would save a bit of
allocation, but it would not matter in practice as this is not a
performance critical codepath.

Thanks.
> diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
> index 167491fd5b..16952e44fc 100755
> --- a/t/t6120-describe.sh
> +++ b/t/t6120-describe.sh
> @@ -233,4 +233,24 @@ test_expect_success 'describe --contains and --no-match' '
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'setup and absorb a submodule' '
> +	test_create_repo sub1 &&
> +	test_commit -C sub1 initial &&
> +	git submodule add ./sub1 &&
> +	git submodule absorbgitdirs &&
> +	git commit -a -m "add submodule" &&
> +	git describe --dirty >expect &&
> +	git describe --broken >out &&
> +	test_cmp expect out
> +'
> +
> +test_expect_success 'describe chokes on severly broken submodules' '
> +	mv .git/modules/sub1/ .git/modules/sub_moved &&
> +	test_must_fail git describe --dirty
> +'
> +test_expect_success 'describe ignoring a borken submodule' '
> +	git describe --broken >out &&
> +	grep broken out
> +'
> +
>  test_done

^ permalink raw reply	[flat|threaded] 12+ messages in thread

* Re: [PATCH v2] builtin/describe: introduce --broken flag
  2017-03-21 22:57               ` [PATCH v2] builtin/describe: introduce --broken flag Stefan Beller
  2017-03-22 17:21                 ` Junio C Hamano
@ 2017-03-22 21:50                 ` Jakub Narębski
  1 sibling, 0 replies; 12+ messages in thread
From: Jakub Narębski @ 2017-03-22 21:50 UTC (permalink / raw)
  To: Stefan Beller, Junio C Hamano, Jonathan Nieder; +Cc: git, mfick

W dniu 21.03.2017 o 23:57, Stefan Beller pisze:

> --- a/Documentation/git-describe.txt
> +++ b/Documentation/git-describe.txt
> @@ -30,9 +30,14 @@ OPTIONS
>  	Commit-ish object names to describe.  Defaults to HEAD if omitted.
>  
>  --dirty[=<mark>]::
> -	Describe the working tree.
> -	It means describe HEAD and appends <mark> (`-dirty` by
> -	default) if the working tree is dirty.
> +--broken[=<mark>]::
> +	Describe the state of the working tree.  When the working
> +	tree matches HEAD, the output is the same as "git describe
> +	HEAD".  If the working tree has local modification "-dirty"
> +	is appended to it.  If a repository is corrupt and Git
> +	cannot determine if there is local modification, Git will
> +	error out, unless `--broken' is given, which appends
> +	the suffix "-broken" instead.

The common description reads better... but unfortunately it lost
information about the optional parameter, namely <mark>.  The
'-dirty' is just the default for <dirty-mark>, and '-broken' is
the default for <broken-mark>.

Maybe /the suffix "-broken"/<broken-mark> suffix ('-broken' by default)/
and similarly for "-dirty"?

Best,
-- 
Jakub Narębski


^ permalink raw reply	[flat|threaded] 12+ messages in thread

end of thread, back to index

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-21  0:11 [PATCH 0/3] git-describe deals gracefully with broken submodules Stefan Beller
2017-03-21  6:28 ` Junio C Hamano
2017-03-21  6:54   ` Junio C Hamano
2017-03-21 18:51     ` [PATCH] builtin/describe: introduce --broken flag Stefan Beller
2017-03-21 21:51       ` Junio C Hamano
2017-03-21 22:27         ` Stefan Beller
2017-03-21 22:41           ` Junio C Hamano
2017-03-21 22:50             ` Stefan Beller
2017-03-21 22:57               ` [PATCH v2] builtin/describe: introduce --broken flag Stefan Beller
2017-03-22 17:21                 ` Junio C Hamano
2017-03-22 21:50                 ` Jakub Narębski
2017-03-21 17:46   ` Stefan Beller

git@vger.kernel.org mailing list mirror (one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.org/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/
       or Tor2web: https://www.tor2web.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox