git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/3] git-describe deals gracefully with broken submodules
@ 2017-03-21  0:11 Stefan Beller
  2017-03-21  0:11 ` [PATCH 1/3] submodule.c: port is_submodule_modified to use porcelain 2 Stefan Beller
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Stefan Beller @ 2017-03-21  0:11 UTC (permalink / raw)
  To: jrnieder; +Cc: git, mfick, Stefan Beller

Our own version generation in GIT-VERSION-GEN is somewhat sane by testing
if we have a .git dir, and use that as a signal whether the obtained
copy of git was obtained using git (clone/fetch) or if it is just a
downloaded tar ball.

Other scripts to generate a version are not as cautious and just run
"git describe". An error from git-describe is treated as a sufficient
signal to assume it is not a git repository.

When submodules come into play, this is not true, as a submodule
may be damaged instead, such that we're still in a git repository
but error out for the sake of reporting a severly broken submodule.

Add a flag to git-describe that instructs it to treat severe submodule
errors as "dirty" instead of erroring out.

Thanks,
Stefan

Stefan Beller (3):
  submodule.c: port is_submodule_modified to use porcelain 2
  revision machinery: gentle submodule errors
  builtin/describe: introduce --submodule-error-as-dirty flag

 builtin/describe.c                           | 17 ++---
 diff-lib.c                                   | 10 ++-
 diff.h                                       |  1 +
 diffcore.h                                   |  3 +-
 revision.c                                   |  2 +
 submodule.c                                  | 94 +++++++++++++++++-----------
 submodule.h                                  | 10 ++-
 t/t4060-diff-submodule-option-diff-format.sh | 22 +++++++
 t/t6120-describe.sh                          | 17 +++++
 9 files changed, 128 insertions(+), 48 deletions(-)

-- 
2.12.0.402.g4b3201c2d6.dirty


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/3] submodule.c: port is_submodule_modified to use porcelain 2
  2017-03-21  0:11 [PATCH 0/3] git-describe deals gracefully with broken submodules Stefan Beller
@ 2017-03-21  0:11 ` Stefan Beller
  2017-03-21  0:11 ` [PATCH 2/3] revision machinery: gentle submodule errors Stefan Beller
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 15+ messages in thread
From: Stefan Beller @ 2017-03-21  0:11 UTC (permalink / raw)
  To: jrnieder; +Cc: git, mfick, Stefan Beller

Migrate 'is_submodule_modified' to the new porcelain format of
git-status.

As the old porcelain only reported ' M' for submodules, no
matter what happened inside the submodule (untracked files,
changes to tracked files or move of HEAD), the new API
properly reports the different scenarios.

In a followup patch we will make use of these finer grained
reporting for git-status.

While porting this to the new API, add another extension
point that will get used in the future: When a submodule
is broken (e.g. the .git file pointing to a wrong directory,
not containing a git dir, as fallout of e.g. f8eaa0ba98
(submodule--helper, module_clone: always operate on absolute
paths, 2016-03-31)), we can chose to not die and report it
differently.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 submodule.c | 85 +++++++++++++++++++++++++++++++++++--------------------------
 1 file changed, 49 insertions(+), 36 deletions(-)

diff --git a/submodule.c b/submodule.c
index 3200b7bb2b..81d44cb7e9 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1041,67 +1041,80 @@ int fetch_populated_submodules(const struct argv_array *options,
 
 unsigned is_submodule_modified(const char *path, int ignore_untracked)
 {
-	ssize_t len;
 	struct child_process cp = CHILD_PROCESS_INIT;
-	const char *argv[] = {
-		"status",
-		"--porcelain",
-		NULL,
-		NULL,
-	};
 	struct strbuf buf = STRBUF_INIT;
 	unsigned dirty_submodule = 0;
-	const char *line, *next_line;
 	const char *git_dir;
+	int error_code = 0;
 
 	strbuf_addf(&buf, "%s/.git", path);
-	git_dir = read_gitfile(buf.buf);
-	if (!git_dir)
-		git_dir = buf.buf;
-	if (!is_directory(git_dir)) {
-		strbuf_release(&buf);
-		/* The submodule is not checked out, so it is not modified */
-		return 0;
+	git_dir = resolve_gitdir_gently(buf.buf, &error_code);
 
+	if (!git_dir) {
+		switch (error_code) {
+		case READ_GITFILE_ERR_STAT_FAILED:
+		case READ_GITFILE_ERR_NOT_A_FILE:
+			/* We may have an uninitialized repo here */
+			return 0;
+		default:
+		case READ_GITFILE_ERR_OPEN_FAILED:
+		case READ_GITFILE_ERR_READ_FAILED:
+		case READ_GITFILE_ERR_INVALID_FORMAT:
+		case READ_GITFILE_ERR_NO_PATH:
+		case READ_GITFILE_ERR_NOT_A_REPO:
+		case READ_GITFILE_ERR_TOO_LARGE:
+			/*
+			 * All these other error codes are indicating
+			 * a broken submodule. We do not know what is
+			 * right here. Resolve again triggering die()
+			 * inside of the parsing.
+			 */
+			read_gitfile_gently(buf.buf, NULL);
+			die("BUG: read_gitfile_gently should have died.");
+		}
 	}
+
 	strbuf_reset(&buf);
 
+	argv_array_pushl(&cp.args, "status", "--porcelain=2", NULL);
 	if (ignore_untracked)
-		argv[2] = "-uno";
+		argv_array_push(&cp.args, "-uno");
 
-	cp.argv = argv;
 	prepare_submodule_repo_env(&cp.env_array);
 	cp.git_cmd = 1;
 	cp.no_stdin = 1;
 	cp.out = -1;
 	cp.dir = path;
 	if (start_command(&cp))
-		die("Could not run 'git status --porcelain' in submodule %s", path);
+		die("Could not run 'git status --porcelain=2' in submodule %s", path);
 
-	len = strbuf_read(&buf, cp.out, 1024);
-	line = buf.buf;
-	while (len > 2) {
-		if ((line[0] == '?') && (line[1] == '?')) {
+	while (strbuf_getwholeline_fd(&buf, cp.out, '\n') != EOF) {
+		/* regular untracked files */
+		if (buf.buf[0] == '?')
 			dirty_submodule |= DIRTY_SUBMODULE_UNTRACKED;
-			if (dirty_submodule & DIRTY_SUBMODULE_MODIFIED)
-				break;
-		} else {
-			dirty_submodule |= DIRTY_SUBMODULE_MODIFIED;
-			if (ignore_untracked ||
-			    (dirty_submodule & DIRTY_SUBMODULE_UNTRACKED))
-				break;
+
+		/* regular unmerged and renamed files */
+		if (buf.buf[0] == 'u' ||
+		    buf.buf[0] == '1' ||
+		    buf.buf[0] == '2') {
+			if (buf.buf[5] == 'S') {
+				/* nested submodule handling */
+				if (buf.buf[6] == 'C' || buf.buf[7] == 'M')
+					dirty_submodule |= DIRTY_SUBMODULE_MODIFIED;
+				if (buf.buf[8] == 'U')
+					dirty_submodule |= DIRTY_SUBMODULE_UNTRACKED;
+			} else
+				dirty_submodule |= DIRTY_SUBMODULE_MODIFIED;
 		}
-		next_line = strchr(line, '\n');
-		if (!next_line)
-			break;
-		next_line++;
-		len -= (next_line - line);
-		line = next_line;
+
+		if (dirty_submodule & DIRTY_SUBMODULE_MODIFIED &&
+		    dirty_submodule & DIRTY_SUBMODULE_UNTRACKED)
+				break;
 	}
 	close(cp.out);
 
 	if (finish_command(&cp))
-		die("'git status --porcelain' failed in submodule %s", path);
+		die("'git status --porcelain=2' failed in submodule %s", path);
 
 	strbuf_release(&buf);
 	return dirty_submodule;
-- 
2.12.0.402.g4b3201c2d6.dirty


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/3] revision machinery: gentle submodule errors
  2017-03-21  0:11 [PATCH 0/3] git-describe deals gracefully with broken submodules Stefan Beller
  2017-03-21  0:11 ` [PATCH 1/3] submodule.c: port is_submodule_modified to use porcelain 2 Stefan Beller
@ 2017-03-21  0:11 ` Stefan Beller
  2017-03-21  0:11 ` [PATCH 3/3] builtin/describe: introduce --submodule-error-as-dirty flag Stefan Beller
  2017-03-21  6:28 ` [PATCH 0/3] git-describe deals gracefully with broken submodules Junio C Hamano
  3 siblings, 0 replies; 15+ messages in thread
From: Stefan Beller @ 2017-03-21  0:11 UTC (permalink / raw)
  To: jrnieder; +Cc: git, mfick, Stefan Beller

When a submodule is broken (e.g. its git directory is deleted or is not
pointed to via the .git linking file), commands that deal with submodules
error out reporting the submodule error.

Sometimes users rather want to see the rest of the command succeed,
ignoring the submodule that is severely broken.

Introduce a flag for the revision machinery '--gentle-submodule-errors'
that will ask 'is_submodule_modified' to return the broken flag instead
of dying().

By switching the argument of is_submodule_modified to an unsigned bit
vector, we can signal more flags, such as the gentle bit introduced
in this patch. When this flag is given, avoid dying inside
'is_submodule_modified' for severe errors and return
DIRTY_SUBMODULE_BROKEN instead.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 diff-lib.c                                   | 10 ++++++++--
 diff.h                                       |  1 +
 diffcore.h                                   |  3 ++-
 revision.c                                   |  2 ++
 submodule.c                                  |  9 ++++++++-
 submodule.h                                  | 10 +++++++++-
 t/t4060-diff-submodule-option-diff-format.sh | 22 ++++++++++++++++++++++
 7 files changed, 52 insertions(+), 5 deletions(-)

diff --git a/diff-lib.c b/diff-lib.c
index 52447466b5..61d33b6d26 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -77,8 +77,14 @@ static int match_stat_with_submodule(struct diff_options *diffopt,
 		if (DIFF_OPT_TST(diffopt, IGNORE_SUBMODULES))
 			changed = 0;
 		else if (!DIFF_OPT_TST(diffopt, IGNORE_DIRTY_SUBMODULES)
-		    && (!changed || DIFF_OPT_TST(diffopt, DIRTY_SUBMODULES)))
-			*dirty_submodule = is_submodule_modified(ce->name, DIFF_OPT_TST(diffopt, IGNORE_UNTRACKED_IN_SUBMODULES));
+		    && (!changed || DIFF_OPT_TST(diffopt, DIRTY_SUBMODULES))) {
+			unsigned flags = 0;
+			if (DIFF_OPT_TST(diffopt, IGNORE_UNTRACKED_IN_SUBMODULES))
+				flags |= IS_SUBMODULE_MODIFIED_IGNORE_UNTRACKED;
+			if (diffopt->gentle_submodule_errors)
+				flags |= IS_SUBMODULE_MODIFIED_GENTLY;
+			*dirty_submodule = is_submodule_modified(ce->name, flags);
+		}
 		diffopt->flags = orig_flags;
 	}
 	return changed;
diff --git a/diff.h b/diff.h
index e9ccb38c26..d1e48de43c 100644
--- a/diff.h
+++ b/diff.h
@@ -164,6 +164,7 @@ struct diff_options {
 	const char *word_regex;
 	enum diff_words_type word_diff;
 	enum diff_submodule_format submodule_format;
+	int gentle_submodule_errors;
 
 	/* this is set by diffcore for DIFF_FORMAT_PATCH */
 	int found_changes;
diff --git a/diffcore.h b/diffcore.h
index 6230241354..ceef783570 100644
--- a/diffcore.h
+++ b/diffcore.h
@@ -40,9 +40,10 @@ struct diff_filespec {
 #define DIFF_FILE_VALID(spec) (((spec)->mode) != 0)
 	unsigned should_free : 1; /* data should be free()'ed */
 	unsigned should_munmap : 1; /* data should be munmap()'ed */
-	unsigned dirty_submodule : 2;  /* For submodules: its work tree is dirty */
+	unsigned dirty_submodule : 3;  /* For submodules: its work tree is dirty */
 #define DIRTY_SUBMODULE_UNTRACKED 1
 #define DIRTY_SUBMODULE_MODIFIED  2
+#define DIRTY_SUBMODULE_BROKEN 4
 	unsigned is_stdin : 1;
 	unsigned has_more_entries : 1; /* only appear in combined diff */
 	/* data should be considered "binary"; -1 means "don't know yet" */
diff --git a/revision.c b/revision.c
index 7ff61ff5f7..8ae1179b8d 100644
--- a/revision.c
+++ b/revision.c
@@ -2014,6 +2014,8 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->limited = 1;
 	} else if (!strcmp(arg, "--ignore-missing")) {
 		revs->ignore_missing = 1;
+	} else if (!strcmp(arg, "--gentle-submodule-errors")) {
+		revs->diffopt.gentle_submodule_errors = 1;
 	} else {
 		int opts = diff_opt_parse(&revs->diffopt, argv, argc, revs->prefix);
 		if (!opts)
diff --git a/submodule.c b/submodule.c
index 81d44cb7e9..d477297fab 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1039,13 +1039,18 @@ int fetch_populated_submodules(const struct argv_array *options,
 	return spf.result;
 }
 
-unsigned is_submodule_modified(const char *path, int ignore_untracked)
+/*
+ * Inspects a submodule and returns its state using flags
+ */
+unsigned is_submodule_modified(const char *path, unsigned flags)
 {
 	struct child_process cp = CHILD_PROCESS_INIT;
 	struct strbuf buf = STRBUF_INIT;
 	unsigned dirty_submodule = 0;
 	const char *git_dir;
 	int error_code = 0;
+	int ignore_untracked = flags & IS_SUBMODULE_MODIFIED_IGNORE_UNTRACKED;
+	int gently = flags & IS_SUBMODULE_MODIFIED_GENTLY;
 
 	strbuf_addf(&buf, "%s/.git", path);
 	git_dir = resolve_gitdir_gently(buf.buf, &error_code);
@@ -1069,6 +1074,8 @@ unsigned is_submodule_modified(const char *path, int ignore_untracked)
 			 * right here. Resolve again triggering die()
 			 * inside of the parsing.
 			 */
+			if (gently)
+				return DIRTY_SUBMODULE_BROKEN;
 			read_gitfile_gently(buf.buf, NULL);
 			die("BUG: read_gitfile_gently should have died.");
 		}
diff --git a/submodule.h b/submodule.h
index c8a0c9cb29..b7afac5e87 100644
--- a/submodule.h
+++ b/submodule.h
@@ -62,7 +62,15 @@ extern void check_for_new_submodule_commits(unsigned char new_sha1[20]);
 extern int fetch_populated_submodules(const struct argv_array *options,
 			       const char *prefix, int command_line_option,
 			       int quiet, int max_parallel_jobs);
-extern unsigned is_submodule_modified(const char *path, int ignore_untracked);
+
+/* Ignore untracked files */
+#define IS_SUBMODULE_MODIFIED_IGNORE_UNTRACKED 1
+/*
+ * Report severe errors in the submodule (e.g. missing git dir) via
+ * DIRTY_SUBMODULE_BROKEN instead of dying.
+ */
+#define IS_SUBMODULE_MODIFIED_GENTLY 2
+extern unsigned is_submodule_modified(const char *path, unsigned flags);
 extern int submodule_uses_gitfile(const char *path);
 
 #define SUBMODULE_REMOVAL_DIE_ON_ERROR (1<<0)
diff --git a/t/t4060-diff-submodule-option-diff-format.sh b/t/t4060-diff-submodule-option-diff-format.sh
index 7e23b55ea4..e5e11a8378 100755
--- a/t/t4060-diff-submodule-option-diff-format.sh
+++ b/t/t4060-diff-submodule-option-diff-format.sh
@@ -746,4 +746,26 @@ test_expect_success 'diff --submodule=diff with .git file' '
 	test_cmp expected actual
 '
 
+test_expect_success 'setup -- break submodule' '
+	# the gitlink in sm2 points at a missing git dir now
+	mv .real .moved_real
+'
+
+test_expect_success 'diff with severely broken submodule' '
+	test_must_fail git diff &&
+
+	sm_hash=$(git rev-parse HEAD:sm2) &&
+	cat >expect <<-EOF &&
+	diff --git a/sm2 b/sm2
+	--- a/sm2
+	+++ b/sm2
+	@@ -1 +1 @@
+	-Subproject commit $sm_hash
+	+Subproject commit $sm_hash-dirty
+	EOF
+
+	git diff --gentle-submodule-errors -- sm2 >out &&
+	test_cmp expect out
+'
+
 test_done
-- 
2.12.0.402.g4b3201c2d6.dirty


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/3] builtin/describe: introduce --submodule-error-as-dirty flag
  2017-03-21  0:11 [PATCH 0/3] git-describe deals gracefully with broken submodules Stefan Beller
  2017-03-21  0:11 ` [PATCH 1/3] submodule.c: port is_submodule_modified to use porcelain 2 Stefan Beller
  2017-03-21  0:11 ` [PATCH 2/3] revision machinery: gentle submodule errors Stefan Beller
@ 2017-03-21  0:11 ` Stefan Beller
  2017-03-21  6:28 ` [PATCH 0/3] git-describe deals gracefully with broken submodules Junio C Hamano
  3 siblings, 0 replies; 15+ messages in thread
From: Stefan Beller @ 2017-03-21  0:11 UTC (permalink / raw)
  To: jrnieder; +Cc: git, mfick, Stefan Beller

git-describe tells you the version number you're at, or errors out, e.g.
when you run it outside of a repository, which may happen when downloading
a tar ball instead of using git to obtain the source code.

To keep this property of only erroring out, when not in a repository,
severe submodule errors must be downgraded to reporting them gently
instead of having git-describe error out completely.

This patch helps to fix the root cause in [1], which tries to work around
this situation.

[1] ("Work around git describe bug for build.")
https://gerrit-review.googlesource.com/#/c/99851/

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 builtin/describe.c  | 17 ++++++++++-------
 t/t6120-describe.sh | 17 +++++++++++++++++
 2 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/builtin/describe.c b/builtin/describe.c
index 76c18059bf..569fef9ecf 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -31,13 +31,9 @@ static int have_util;
 static struct string_list patterns = STRING_LIST_INIT_NODUP;
 static struct string_list exclude_patterns = STRING_LIST_INIT_NODUP;
 static int always;
+static int gentle_submodule_errors;
 static const char *dirty;
 
-/* diff-index command arguments to check if working tree is dirty. */
-static const char *diff_index_args[] = {
-	"diff-index", "--quiet", "HEAD", "--", NULL
-};
-
 struct commit_name {
 	struct hashmap_entry entry;
 	struct object_id peeled;
@@ -442,6 +438,8 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
 			   N_("do not consider tags matching <pattern>")),
 		OPT_BOOL(0, "always",        &always,
 			N_("show abbreviated commit object as fallback")),
+		OPT_BOOL(0, "submodule-error-as-dirty", &gentle_submodule_errors,
+			N_("show abbreviated commit object as fallback")),
 		{OPTION_STRING, 0, "dirty",  &dirty, N_("mark"),
 			N_("append <mark> on dirty working tree (default: \"-dirty\")"),
 			PARSE_OPT_OPTARG, NULL, (intptr_t) "-dirty"},
@@ -496,6 +494,12 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
 		if (dirty) {
 			static struct lock_file index_lock;
 			int fd;
+			struct argv_array args = ARGV_ARRAY_INIT;
+
+			argv_array_push(&args, "diff-index");
+			if (gentle_submodule_errors)
+				argv_array_push(&args, "--gentle-submodule-errors");
+			argv_array_pushl(&args, "--quiet", "HEAD", "--", NULL);
 
 			read_cache_preload(NULL);
 			refresh_index(&the_index, REFRESH_QUIET|REFRESH_UNMERGED,
@@ -504,8 +508,7 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
 			if (0 <= fd)
 				update_index_if_able(&the_index, &index_lock);
 
-			if (!cmd_diff_index(ARRAY_SIZE(diff_index_args) - 1,
-					    diff_index_args, prefix))
+			if (!cmd_diff_index(args.argc, args.argv, prefix))
 				dirty = NULL;
 		}
 		describe("HEAD", 1);
diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 167491fd5b..99e5ba44b7 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -233,4 +233,21 @@ test_expect_success 'describe --contains and --no-match' '
 	test_cmp expect actual
 '
 
+test_expect_success 'setup and absorb a submodule' '
+	test_create_repo sub1 &&
+	test_commit -C sub1 initial &&
+	git submodule add ./sub1 &&
+	git submodule absorbgitdirs &&
+	git commit -a -m "add submodule"
+'
+
+test_expect_success 'describe chokes on severly broken submodules' '
+	mv .git/modules/sub1/ .git/modules/sub_moved &&
+	test_must_fail git describe --dirty
+'
+test_expect_success 'describe ignoring a borken submodule' '
+	git describe --dirty --submodule-error-as-dirty >out &&
+	grep dirty out
+'
+
 test_done
-- 
2.12.0.402.g4b3201c2d6.dirty


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/3] git-describe deals gracefully with broken submodules
  2017-03-21  0:11 [PATCH 0/3] git-describe deals gracefully with broken submodules Stefan Beller
                   ` (2 preceding siblings ...)
  2017-03-21  0:11 ` [PATCH 3/3] builtin/describe: introduce --submodule-error-as-dirty flag Stefan Beller
@ 2017-03-21  6:28 ` Junio C Hamano
  2017-03-21  6:54   ` Junio C Hamano
  2017-03-21 17:46   ` [PATCH 0/3] git-describe deals gracefully with broken submodules Stefan Beller
  3 siblings, 2 replies; 15+ messages in thread
From: Junio C Hamano @ 2017-03-21  6:28 UTC (permalink / raw)
  To: Stefan Beller; +Cc: jrnieder, git, mfick

Stefan Beller <sbeller@google.com> writes:

> Our own version generation in GIT-VERSION-GEN is somewhat sane by testing
> if we have a .git dir, and use that as a signal whether the obtained
> copy of git was obtained using git (clone/fetch) or if it is just a
> downloaded tar ball.
>
> Other scripts to generate a version are not as cautious and just run
> "git describe". An error from git-describe is treated as a sufficient
> signal to assume it is not a git repository.
>
> When submodules come into play, this is not true, as a submodule
> may be damaged instead, such that we're still in a git repository
> but error out for the sake of reporting a severly broken submodule.
>
> Add a flag to git-describe that instructs it to treat severe submodule
> errors as "dirty" instead of erroring out.

I do not have a strong preference for or against the "treat a broken
repository as if nothing is wrong with the revision, but just mark
it as dirty" idea.  I would be more receptive if it substituted the
"-dirty" marker with something else, e.g. "-broken", though.

My knee-jerk reaction to the code change is that treating submodule
as something very special is probably not a good idea.  Even if you
do not use submodules, if some of the objects referenced from your
index and/or HEAD are damaged or otherwise causes some error while
accessing, the diff machinery would die, wouldn't it?

I saw that some new symbolic constants in the code to tell the
machinery to "gracefully die" (or "hide the breakage under the rug")
are named with SUBMODULE in them, which is probably a bad sign that
the design is being too centric to submodules.  The implementation
that covers only breakages in submodule as its first step may be OK
(you have to start somewhere, after all), but I think the aspiration
should be to cover all kinds of breakages in the end and turn them
to be "graceful", and if you had that goal in mind, you wouldn't be
naming these constants with SUBMODULE in them.

If "treat a broken repository as just being 'dirty'" were a good
idea, I'd suspect that we would want to see all breakages, not just
ones related to submodules, to be treated the same way.  

But it is possible there may be a reason why submodules are special.
I do not think the third paragraph quoted above is a good
justification.  A repository with broken submodule is just as broken
and untrustworthy as a broken repository without a submodule, and if
you want to allow such a checkout with broken submodule to call
itself v2.0-dirty, you would also want to allow a broken checkout
without any submodule to do so, too.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/3] git-describe deals gracefully with broken submodules
  2017-03-21  6:28 ` [PATCH 0/3] git-describe deals gracefully with broken submodules Junio C Hamano
@ 2017-03-21  6:54   ` Junio C Hamano
  2017-03-21 18:51     ` [PATCH] builtin/describe: introduce --broken flag Stefan Beller
  2017-03-21 17:46   ` [PATCH 0/3] git-describe deals gracefully with broken submodules Stefan Beller
  1 sibling, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2017-03-21  6:54 UTC (permalink / raw)
  To: Stefan Beller; +Cc: jrnieder, git, mfick

Junio C Hamano <gitster@pobox.com> writes:

> I do not have a strong preference for or against the "treat a broken
> repository as if nothing is wrong with the revision, but just mark
> it as dirty" idea.  I would be more receptive if it substituted the
> "-dirty" marker with something else, e.g. "-broken", though.
> ...
> But it is possible there may be a reason why submodules are special.
> I do not think the third paragraph quoted above is a good
> justification.  A repository with broken submodule is just as broken
> and untrustworthy as a broken repository without a submodule, and if
> you want to allow such a checkout with broken submodule to call
> itself v2.0-dirty, you would also want to allow a broken checkout
> without any submodule to do so, too.

Having said all that, if I thought it were a good idea to optionally
allow people to treat "repository is corrupt in some ways to make it
impossible for us if the checkout is dirty or not, even though we
are fairly certain what commit .git/HEAD points at is" as a state
that is describable, I probably would

 - introduce a new "git describe --possibly-broken" option;

 - instead of running "diff-index" internally to decide the "-dirty"
   state, spawn "diff-index" as a separate process;

 - observe the exit status from "diff-index" and add "-dirty" suffix
   when it is _known_ to be dirty, add "-broken" suffix when it
   died, and leave out the suffix when we know that the checkout is
   clean.

That way, we wouldn't have to contaminate the generic codepath with
a "treat broken and modified states as if they are the same" logic,
only to support such a niche feature that we wouldn't want to use
anywhere else.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/3] git-describe deals gracefully with broken submodules
  2017-03-21  6:28 ` [PATCH 0/3] git-describe deals gracefully with broken submodules Junio C Hamano
  2017-03-21  6:54   ` Junio C Hamano
@ 2017-03-21 17:46   ` Stefan Beller
  1 sibling, 0 replies; 15+ messages in thread
From: Stefan Beller @ 2017-03-21 17:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Nieder, git@vger.kernel.org, Martin Fick

On Mon, Mar 20, 2017 at 11:28 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> Our own version generation in GIT-VERSION-GEN is somewhat sane by testing
>> if we have a .git dir, and use that as a signal whether the obtained
>> copy of git was obtained using git (clone/fetch) or if it is just a
>> downloaded tar ball.
>>
>> Other scripts to generate a version are not as cautious and just run
>> "git describe". An error from git-describe is treated as a sufficient
>> signal to assume it is not a git repository.
>>
>> When submodules come into play, this is not true, as a submodule
>> may be damaged instead, such that we're still in a git repository
>> but error out for the sake of reporting a severly broken submodule.
>>
>> Add a flag to git-describe that instructs it to treat severe submodule
>> errors as "dirty" instead of erroring out.
>
> I do not have a strong preference for or against the "treat a broken
> repository as if nothing is wrong with the revision, but just mark
> it as dirty" idea.  I would be more receptive if it substituted the
> "-dirty" marker with something else, e.g. "-broken", though.

ok. That was my initial plan. I plan on also incorporating this into
git-status.

>
> My knee-jerk reaction to the code change is that treating submodule
> as something very special is probably not a good idea.  Even if you
> do not use submodules, if some of the objects referenced from your
> index and/or HEAD are damaged or otherwise causes some error while
> accessing, the diff machinery would die, wouldn't it?

Yes; ideally we would want to have a generic "error with working tree"
for all these kinds of errors. Note that the probability of submodule errors
is way higher. (This has a couple of reasons: (1) a submodule has more
complexity than a file, (2) that opens up new areas in which errors can occur
and (3) submodules are less popular so we have fewer
bug reports and people working on it, hence less mature code.

As an example for (2), we had relative/absolute path issues in Git 2.7
for submodules, which is what Martin was complaining about.

> I saw that some new symbolic constants in the code to tell the
> machinery to "gracefully die" (or "hide the breakage under the rug")
> are named with SUBMODULE in them, which is probably a bad sign that
> the design is being too centric to submodules.  The implementation
> that covers only breakages in submodule as its first step may be OK
> (you have to start somewhere, after all), but I think the aspiration
> should be to cover all kinds of breakages in the end and turn them
> to be "graceful", and if you had that goal in mind, you wouldn't be
> naming these constants with SUBMODULE in them.

ok

>
> If "treat a broken repository as just being 'dirty'" were a good
> idea, I'd suspect that we would want to see all breakages, not just
> ones related to submodules, to be treated the same way.
>
> But it is possible there may be a reason why submodules are special.
> I do not think the third paragraph quoted above is a good
> justification.  A repository with broken submodule is just as broken
> and untrustworthy as a broken repository without a submodule, and if
> you want to allow such a checkout with broken submodule to call
> itself v2.0-dirty, you would also want to allow a broken checkout
> without any submodule to do so, too.

agreed.

> I probably would
>
> - introduce a new "git describe --possibly-broken" option;
>
> - instead of running "diff-index" internally to decide the "-dirty"
>   state, spawn "diff-index" as a separate process;
>
> - observe the exit status from "diff-index" and add "-dirty" suffix
>   when it is _known_ to be dirty, add "-broken" suffix when it
>   died, and leave out the suffix when we know that the checkout is
>   clean.
>
> That way, we wouldn't have to contaminate the generic codepath with
> a "treat broken and modified states as if they are the same" logic,
> only to support such a niche feature that we wouldn't want to use
> anywhere else.

I planned on a follow-up that adds similar behavior to git-status,
such that git-status can be used to get the big picture before
diagnosing the individual breakages of a submodule.

That is the reason why I started with the first patch porting
'is_submodule_modified' to porcelain2 of git-status, as that
allows way better reporting on submodules. Sorry for intermixing
these two things.

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH] builtin/describe: introduce --broken flag
  2017-03-21  6:54   ` Junio C Hamano
@ 2017-03-21 18:51     ` Stefan Beller
  2017-03-21 21:51       ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Stefan Beller @ 2017-03-21 18:51 UTC (permalink / raw)
  To: gitster; +Cc: git, jrnieder, mfick, Stefan Beller

git-describe tells you the version number you're at, or errors out, e.g.
when you run it outside of a repository, which may happen when downloading
a tar ball instead of using git to obtain the source code.

To keep this property of only erroring out, when not in a repository,
severe (submodule) errors must be downgraded to reporting them gently
instead of having git-describe error out completely.

To achieve that a flag '--broken' is introduced, which is in the same
vein as '--dirty' but uses an actual child process to check for dirtiness.
When that child dies unexpectedly, we'll append '-broken' instead of
'-dirty'.

This patch helps to fix the root cause in [1], which tries to work around
this situation.

[1] ("Work around git describe bug for build.")
https://gerrit-review.googlesource.com/#/c/99851/

Signed-off-by: Stefan Beller <sbeller@google.com>
---

> I probably would
> 
> - introduce a new "git describe --possibly-broken" option;
> 
> - instead of running "diff-index" internally to decide the "-dirty"
>    state, spawn "diff-index" as a separate process;
> 
>  - observe the exit status from "diff-index" and add "-dirty" suffix

This is what this patch does. It doesn't need any preceeding refactor
patches, but comes on its own; developed on origin/master.

Thanks,
Stefan

 Documentation/git-describe.txt |  7 +++++
 builtin/describe.c             | 59 ++++++++++++++++++++++++++++++------------
 t/t6120-describe.sh            | 20 ++++++++++++++
 3 files changed, 70 insertions(+), 16 deletions(-)

diff --git a/Documentation/git-describe.txt b/Documentation/git-describe.txt
index 8755f3af7b..b71fa7a4ad 100644
--- a/Documentation/git-describe.txt
+++ b/Documentation/git-describe.txt
@@ -34,6 +34,13 @@ OPTIONS
 	It means describe HEAD and appends <mark> (`-dirty` by
 	default) if the working tree is dirty.
 
+--broken[=<mark>]::
+	Describe the working tree.
+	It means describe HEAD and appends <mark> (`-broken` by
+	default) if the working tree cannot be examined for dirtiness.
+	This implies `--dirty`, which is the fallback behavior when
+	the working tree examination works.
+
 --all::
 	Instead of using only the annotated tags, use any ref
 	found in `refs/` namespace.  This option enables matching
diff --git a/builtin/describe.c b/builtin/describe.c
index 76c18059bf..37a83520c9 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -9,6 +9,7 @@
 #include "diff.h"
 #include "hashmap.h"
 #include "argv-array.h"
+#include "run-command.h"
 
 #define SEEN		(1u << 0)
 #define MAX_TAGS	(FLAG_BITS - 1)
@@ -31,12 +32,7 @@ static int have_util;
 static struct string_list patterns = STRING_LIST_INIT_NODUP;
 static struct string_list exclude_patterns = STRING_LIST_INIT_NODUP;
 static int always;
-static const char *dirty;
-
-/* diff-index command arguments to check if working tree is dirty. */
-static const char *diff_index_args[] = {
-	"diff-index", "--quiet", "HEAD", "--", NULL
-};
+static const char *append, *dirty, *broken;
 
 struct commit_name {
 	struct hashmap_entry entry;
@@ -292,8 +288,8 @@ static void describe(const char *arg, int last_one)
 		display_name(n);
 		if (longformat)
 			show_suffix(0, n->tag ? &n->tag->tagged->oid : &oid);
-		if (dirty)
-			printf("%s", dirty);
+		if (append)
+			printf("%s", append);
 		printf("\n");
 		return;
 	}
@@ -369,8 +365,8 @@ static void describe(const char *arg, int last_one)
 		struct object_id *oid = &cmit->object.oid;
 		if (always) {
 			printf("%s", find_unique_abbrev(oid->hash, abbrev));
-			if (dirty)
-				printf("%s", dirty);
+			if (append)
+				printf("%s", append);
 			printf("\n");
 			return;
 		}
@@ -413,8 +409,8 @@ static void describe(const char *arg, int last_one)
 	display_name(all_matches[0].name);
 	if (abbrev)
 		show_suffix(all_matches[0].depth, &cmit->object.oid);
-	if (dirty)
-		printf("%s", dirty);
+	if (append)
+		printf("%s", append);
 	printf("\n");
 
 	if (!last_one)
@@ -445,6 +441,9 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
 		{OPTION_STRING, 0, "dirty",  &dirty, N_("mark"),
 			N_("append <mark> on dirty working tree (default: \"-dirty\")"),
 			PARSE_OPT_OPTARG, NULL, (intptr_t) "-dirty"},
+		{OPTION_STRING, 0, "broken",  &broken, N_("mark"),
+			N_("append <mark> on broken working tree (default: \"-broken\")"),
+			PARSE_OPT_OPTARG, NULL, (intptr_t) "-broken"},
 		OPT_END(),
 	};
 
@@ -493,10 +492,35 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
 		die(_("No names found, cannot describe anything."));
 
 	if (argc == 0) {
-		if (dirty) {
+		if (broken) {
+			struct child_process cp = CHILD_PROCESS_INIT;
+			argv_array_pushl(&cp.args, "diff-index", "--quiet", "HEAD", "--", NULL);
+			cp.git_cmd = 1;
+			cp.no_stdin = 1;
+			cp.no_stdout = 1;
+
+			if (!dirty)
+				dirty = "-dirty";
+
+			switch (run_command(&cp)) {
+			case 0:
+				append = NULL;
+				break;
+			case 1:
+				/* keep dirty as is */
+				append = dirty;
+				break;
+			default:
+				/* diff-index aborted abnormally */
+				append = broken;
+			}
+		} else if (dirty) {
+			struct argv_array args = ARGV_ARRAY_INIT;
 			static struct lock_file index_lock;
 			int fd;
 
+			argv_array_pushl(&args, "diff-index", "--quiet", "HEAD", "--", NULL);
+
 			read_cache_preload(NULL);
 			refresh_index(&the_index, REFRESH_QUIET|REFRESH_UNMERGED,
 				      NULL, NULL, NULL);
@@ -504,13 +528,16 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
 			if (0 <= fd)
 				update_index_if_able(&the_index, &index_lock);
 
-			if (!cmd_diff_index(ARRAY_SIZE(diff_index_args) - 1,
-					    diff_index_args, prefix))
-				dirty = NULL;
+			if (cmd_diff_index(args.argc, args.argv, prefix))
+				append = dirty;
+			else
+				append = NULL;
 		}
 		describe("HEAD", 1);
 	} else if (dirty) {
 		die(_("--dirty is incompatible with commit-ishes"));
+	} else if (broken) {
+		die(_("--broken is incompatible with commit-ishes"));
 	} else {
 		while (argc-- > 0)
 			describe(*argv++, argc == 0);
diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 167491fd5b..16952e44fc 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -233,4 +233,24 @@ test_expect_success 'describe --contains and --no-match' '
 	test_cmp expect actual
 '
 
+test_expect_success 'setup and absorb a submodule' '
+	test_create_repo sub1 &&
+	test_commit -C sub1 initial &&
+	git submodule add ./sub1 &&
+	git submodule absorbgitdirs &&
+	git commit -a -m "add submodule" &&
+	git describe --dirty >expect &&
+	git describe --broken >out &&
+	test_cmp expect out
+'
+
+test_expect_success 'describe chokes on severly broken submodules' '
+	mv .git/modules/sub1/ .git/modules/sub_moved &&
+	test_must_fail git describe --dirty
+'
+test_expect_success 'describe ignoring a borken submodule' '
+	git describe --broken >out &&
+	grep broken out
+'
+
 test_done
-- 
2.12.0.402.g0501f7a28e.dirty


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] builtin/describe: introduce --broken flag
  2017-03-21 18:51     ` [PATCH] builtin/describe: introduce --broken flag Stefan Beller
@ 2017-03-21 21:51       ` Junio C Hamano
  2017-03-21 22:27         ` Stefan Beller
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2017-03-21 21:51 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, jrnieder, mfick

Stefan Beller <sbeller@google.com> writes:

> This patch helps to fix the root cause in [1], which tries to work around
> this situation.

I do not necessarily think it is reasonable to give $version-dirty
and proceed when a repository corruption is detected; if there is a
breakage in the repository, "git describe" is correct to die when a
populated submodule is broken.  IOW, I do not agree that [1] below
is working with a sensible expectation.

This is a tangent, but how does the Gerrit repository get corrupted
in the way described in [1] in the first place?  That might be what
needs to be corrected, perhaps?

>  Documentation/git-describe.txt |  7 +++++
>  builtin/describe.c             | 59 ++++++++++++++++++++++++++++++------------
>  t/t6120-describe.sh            | 20 ++++++++++++++
>  3 files changed, 70 insertions(+), 16 deletions(-)
>
> diff --git a/Documentation/git-describe.txt b/Documentation/git-describe.txt
> index 8755f3af7b..b71fa7a4ad 100644
> --- a/Documentation/git-describe.txt
> +++ b/Documentation/git-describe.txt
> @@ -34,6 +34,13 @@ OPTIONS
>  	It means describe HEAD and appends <mark> (`-dirty` by
>  	default) if the working tree is dirty.
>  
> +--broken[=<mark>]::
> +	Describe the working tree.
> +	It means describe HEAD and appends <mark> (`-broken` by
> +	default) if the working tree cannot be examined for dirtiness.
> +	This implies `--dirty`, which is the fallback behavior when
> +	the working tree examination works.

The wording for the "--dirty" is already awkward, but this one is
even more so ("Describe the working tree. It means" conveys no
useful information).  I however cannot come up with something much
better.  This is the best I could come up with:

	Describe the state of the working tree.  When the working
	tree matches HEAD, the output is the same as "git describe
	HEAD" and "-dirty" is appended to it if the working tree has
	local modification.  When a repository is corrupt and Git
	cannot determine if there is local modification, instead of
	dying, append "-broken" instead.

The last sentence can be tweaked to replace the description for
"--dirty".

> diff --git a/builtin/describe.c b/builtin/describe.c
> index 76c18059bf..37a83520c9 100644
> --- a/builtin/describe.c
> +++ b/builtin/describe.c
> @@ -9,6 +9,7 @@
>  #include "diff.h"
>  #include "hashmap.h"
>  #include "argv-array.h"
> +#include "run-command.h"
>  
>  #define SEEN		(1u << 0)
>  #define MAX_TAGS	(FLAG_BITS - 1)
> @@ -31,12 +32,7 @@ static int have_util;
>  static struct string_list patterns = STRING_LIST_INIT_NODUP;
>  static struct string_list exclude_patterns = STRING_LIST_INIT_NODUP;
>  static int always;
> -static const char *dirty;
> -
> -/* diff-index command arguments to check if working tree is dirty. */
> -static const char *diff_index_args[] = {
> -	"diff-index", "--quiet", "HEAD", "--", NULL
> -};
> +static const char *append, *dirty, *broken;

Perhaps call it "suffix" or something?

> +		if (broken) {
> +			struct child_process cp = CHILD_PROCESS_INIT;
> +			argv_array_pushl(&cp.args, "diff-index", "--quiet", "HEAD", "--", NULL);
> ...
> +			}
> +		} else if (dirty) {
> +			struct argv_array args = ARGV_ARRAY_INIT;
>  			static struct lock_file index_lock;
>  			int fd;
>  
> +			argv_array_pushl(&args, "diff-index", "--quiet", "HEAD", "--", NULL);


Somehow it looks like the patch is making the code a lot worse by
losing diff_index_args[] and duplicating the same command line twice
in the code.  Wouldn't argv_array_pushv() into these two different args
array from the same template work better?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] builtin/describe: introduce --broken flag
  2017-03-21 21:51       ` Junio C Hamano
@ 2017-03-21 22:27         ` Stefan Beller
  2017-03-21 22:41           ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Stefan Beller @ 2017-03-21 22:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git@vger.kernel.org, Jonathan Nieder, Martin Fick

On Tue, Mar 21, 2017 at 2:51 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> This patch helps to fix the root cause in [1], which tries to work around
>> this situation.
>
> I do not necessarily think it is reasonable to give $version-dirty
> and proceed when a repository corruption is detected; if there is a
> breakage in the repository, "git describe" is correct to die when a
> populated submodule is broken.  IOW, I do not agree that [1] below
> is working with a sensible expectation.

ok, so I won't quote it in the commit message

>
> This is a tangent, but how does the Gerrit repository get corrupted
> in the way described in [1] in the first place?  That might be what
> needs to be corrected, perhaps?

AFAICT, someone is (was?) using a version of Git that doesn't contain
f8eaa0ba98 (submodule--helper, module_clone: always operate
on absolute paths, 2016-03-31). So then the submodule paths were
made absolute paths on creation of the Gerrit repo.

And then someone moved the repo and the absolute paths broke.

Even after an upgrade of Git to its latest and greatest version, the underlying
issue of having broken submodule paths remains in that case.

So there are a couple of ways forward
0) as an immediate fix, manually fix the absolute path or make them relative

1A) have more error resilient tools in Git
1B) have a tool in git (e.g. "git submodule fsck-setup") that rewrites
    the .git file link and the core.worktree setting to be relative and correct.

I think we should do both A and B, I decided to go with A first, specifically
"git-describe" as that was reported to not work well in this situation with the
given broken data

> The wording for the "--dirty" is already awkward, but this one is
> even more so ("Describe the working tree. It means" conveys no
> useful information).  I however cannot come up with something much
> better.  This is the best I could come up with:
>
>         Describe the state of the working tree.  When the working
>         tree matches HEAD, the output is the same as "git describe
>         HEAD" and "-dirty" is appended to it if the working tree has
>         local modification.  When a repository is corrupt and Git
>         cannot determine if there is local modification, instead of
>         dying, append "-broken" instead.

ok, I'll reuse parts of it.

>> +static const char *append, *dirty, *broken;
>
> Perhaps call it "suffix" or something?

done


>> +                     argv_array_pushl(&args, "diff-index", "--quiet", "HEAD", "--", NULL);
..
>  Wouldn't argv_array_pushv() into these two different args
> array from the same template work better?

yes, looks much saner. Thanks

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] builtin/describe: introduce --broken flag
  2017-03-21 22:27         ` Stefan Beller
@ 2017-03-21 22:41           ` Junio C Hamano
  2017-03-21 22:50             ` Stefan Beller
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2017-03-21 22:41 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git@vger.kernel.org, Jonathan Nieder, Martin Fick

Stefan Beller <sbeller@google.com> writes:

> AFAICT, someone is (was?) using a version of Git that doesn't contain
> f8eaa0ba98 (submodule--helper, module_clone: always operate
> on absolute paths, 2016-03-31). So then the submodule paths were
> made absolute paths on creation of the Gerrit repo.
>
> And then someone moved the repo and the absolute paths broke.
>
> Even after an upgrade of Git to its latest and greatest version, the underlying
> issue of having broken submodule paths remains in that case.
>
> So there are a couple of ways forward
> 0) as an immediate fix, manually fix the absolute path or make them relative
>
> 1A) have more error resilient tools in Git
> 1B) have a tool in git (e.g. "git submodule fsck-setup") that rewrites
>     the .git file link and the core.worktree setting to be relative and correct.
>
> I think we should do both A and B, I decided to go with A first, specifically
> "git-describe" as that was reported to not work well in this situation with the
> given broken data

Sounds sensible.  I would say that we should do less 1A (i.e. hiding
problems under the rug) and more 1B.  Of course, making the problem
less likely to happen in the first place would be more important ;-)

Thanks.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] builtin/describe: introduce --broken flag
  2017-03-21 22:41           ` Junio C Hamano
@ 2017-03-21 22:50             ` Stefan Beller
  2017-03-21 22:57               ` [PATCH v2] " Stefan Beller
  0 siblings, 1 reply; 15+ messages in thread
From: Stefan Beller @ 2017-03-21 22:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git@vger.kernel.org, Jonathan Nieder, Martin Fick

On Tue, Mar 21, 2017 at 3:41 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> AFAICT, someone is (was?) using a version of Git that doesn't contain
>> f8eaa0ba98 (submodule--helper, module_clone: always operate
>> on absolute paths, 2016-03-31). So then the submodule paths were
>> made absolute paths on creation of the Gerrit repo.
>>
>> And then someone moved the repo and the absolute paths broke.
>>
>> Even after an upgrade of Git to its latest and greatest version, the underlying
>> issue of having broken submodule paths remains in that case.
>>
>> So there are a couple of ways forward
>> 0) as an immediate fix, manually fix the absolute path or make them relative
>>
>> 1A) have more error resilient tools in Git
>> 1B) have a tool in git (e.g. "git submodule fsck-setup") that rewrites
>>     the .git file link and the core.worktree setting to be relative and correct.
>>
>> I think we should do both A and B, I decided to go with A first, specifically
>> "git-describe" as that was reported to not work well in this situation with the
>> given broken data
>
> Sounds sensible.  I would say that we should do less 1A (i.e. hiding
> problems under the rug) and more 1B.  Of course, making the problem
> less likely to happen in the first place would be more important ;-)

Agreed on not introducing bugs as often. For (B), I wonder what the
right place is.

Maybe "reset --hard", in combination with "--recurse-submodules" would also
fix these submodule path issues. The made up "git submodule fsck-setup"
sounds like it would only detect (and not fix) the problems. But these absolute
paths issues are known how to fix, so a "fsck"-y tool may convey a wrong
impression.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v2] builtin/describe: introduce --broken flag
  2017-03-21 22:50             ` Stefan Beller
@ 2017-03-21 22:57               ` Stefan Beller
  2017-03-22 17:21                 ` Junio C Hamano
  2017-03-22 21:50                 ` Jakub Narębski
  0 siblings, 2 replies; 15+ messages in thread
From: Stefan Beller @ 2017-03-21 22:57 UTC (permalink / raw)
  To: gitster, jrnieder; +Cc: git, mfick, Stefan Beller

git-describe tells you the version number you're at, or errors out, e.g.
when you run it outside of a repository, which may happen when downloading
a tar ball instead of using git to obtain the source code.

To keep this property of only erroring out, when not in a repository,
severe (submodule) errors must be downgraded to reporting them gently
instead of having git-describe error out completely.

To achieve that a flag '--broken' is introduced, which is in the same
vein as '--dirty' but uses an actual child process to check for dirtiness.
When that child dies unexpectedly, we'll append '-broken' instead of
'-dirty'.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/git-describe.txt | 11 +++++++---
 builtin/describe.c             | 47 ++++++++++++++++++++++++++++++++++--------
 t/t6120-describe.sh            | 20 ++++++++++++++++++
 3 files changed, 66 insertions(+), 12 deletions(-)

diff --git a/Documentation/git-describe.txt b/Documentation/git-describe.txt
index 8755f3af7b..26f19d3b07 100644
--- a/Documentation/git-describe.txt
+++ b/Documentation/git-describe.txt
@@ -30,9 +30,14 @@ OPTIONS
 	Commit-ish object names to describe.  Defaults to HEAD if omitted.
 
 --dirty[=<mark>]::
-	Describe the working tree.
-	It means describe HEAD and appends <mark> (`-dirty` by
-	default) if the working tree is dirty.
+--broken[=<mark>]::
+	Describe the state of the working tree.  When the working
+	tree matches HEAD, the output is the same as "git describe
+	HEAD".  If the working tree has local modification "-dirty"
+	is appended to it.  If a repository is corrupt and Git
+	cannot determine if there is local modification, Git will
+	error out, unless `--broken' is given, which appends
+	the suffix "-broken" instead.
 
 --all::
 	Instead of using only the annotated tags, use any ref
diff --git a/builtin/describe.c b/builtin/describe.c
index 76c18059bf..45adbf67d5 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -9,6 +9,7 @@
 #include "diff.h"
 #include "hashmap.h"
 #include "argv-array.h"
+#include "run-command.h"
 
 #define SEEN		(1u << 0)
 #define MAX_TAGS	(FLAG_BITS - 1)
@@ -31,7 +32,7 @@ static int have_util;
 static struct string_list patterns = STRING_LIST_INIT_NODUP;
 static struct string_list exclude_patterns = STRING_LIST_INIT_NODUP;
 static int always;
-static const char *dirty;
+static const char *suffix, *dirty, *broken;
 
 /* diff-index command arguments to check if working tree is dirty. */
 static const char *diff_index_args[] = {
@@ -292,8 +293,8 @@ static void describe(const char *arg, int last_one)
 		display_name(n);
 		if (longformat)
 			show_suffix(0, n->tag ? &n->tag->tagged->oid : &oid);
-		if (dirty)
-			printf("%s", dirty);
+		if (suffix)
+			printf("%s", suffix);
 		printf("\n");
 		return;
 	}
@@ -369,8 +370,8 @@ static void describe(const char *arg, int last_one)
 		struct object_id *oid = &cmit->object.oid;
 		if (always) {
 			printf("%s", find_unique_abbrev(oid->hash, abbrev));
-			if (dirty)
-				printf("%s", dirty);
+			if (suffix)
+				printf("%s", suffix);
 			printf("\n");
 			return;
 		}
@@ -413,8 +414,8 @@ static void describe(const char *arg, int last_one)
 	display_name(all_matches[0].name);
 	if (abbrev)
 		show_suffix(all_matches[0].depth, &cmit->object.oid);
-	if (dirty)
-		printf("%s", dirty);
+	if (suffix)
+		printf("%s", suffix);
 	printf("\n");
 
 	if (!last_one)
@@ -445,6 +446,9 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
 		{OPTION_STRING, 0, "dirty",  &dirty, N_("mark"),
 			N_("append <mark> on dirty working tree (default: \"-dirty\")"),
 			PARSE_OPT_OPTARG, NULL, (intptr_t) "-dirty"},
+		{OPTION_STRING, 0, "broken",  &broken, N_("mark"),
+			N_("append <mark> on broken working tree (default: \"-broken\")"),
+			PARSE_OPT_OPTARG, NULL, (intptr_t) "-broken"},
 		OPT_END(),
 	};
 
@@ -493,7 +497,28 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
 		die(_("No names found, cannot describe anything."));
 
 	if (argc == 0) {
-		if (dirty) {
+		if (broken) {
+			struct child_process cp = CHILD_PROCESS_INIT;
+			argv_array_pushv(&cp.args, diff_index_args);
+			cp.git_cmd = 1;
+			cp.no_stdin = 1;
+			cp.no_stdout = 1;
+
+			if (!dirty)
+				dirty = "-dirty";
+
+			switch (run_command(&cp)) {
+			case 0:
+				suffix = NULL;
+				break;
+			case 1:
+				suffix = dirty;
+				break;
+			default:
+				/* diff-index aborted abnormally */
+				suffix = broken;
+			}
+		} else if (dirty) {
 			static struct lock_file index_lock;
 			int fd;
 
@@ -506,11 +531,15 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
 
 			if (!cmd_diff_index(ARRAY_SIZE(diff_index_args) - 1,
 					    diff_index_args, prefix))
-				dirty = NULL;
+				suffix = NULL;
+			else
+				suffix = dirty;
 		}
 		describe("HEAD", 1);
 	} else if (dirty) {
 		die(_("--dirty is incompatible with commit-ishes"));
+	} else if (broken) {
+		die(_("--broken is incompatible with commit-ishes"));
 	} else {
 		while (argc-- > 0)
 			describe(*argv++, argc == 0);
diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 167491fd5b..16952e44fc 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -233,4 +233,24 @@ test_expect_success 'describe --contains and --no-match' '
 	test_cmp expect actual
 '
 
+test_expect_success 'setup and absorb a submodule' '
+	test_create_repo sub1 &&
+	test_commit -C sub1 initial &&
+	git submodule add ./sub1 &&
+	git submodule absorbgitdirs &&
+	git commit -a -m "add submodule" &&
+	git describe --dirty >expect &&
+	git describe --broken >out &&
+	test_cmp expect out
+'
+
+test_expect_success 'describe chokes on severly broken submodules' '
+	mv .git/modules/sub1/ .git/modules/sub_moved &&
+	test_must_fail git describe --dirty
+'
+test_expect_success 'describe ignoring a borken submodule' '
+	git describe --broken >out &&
+	grep broken out
+'
+
 test_done
-- 
2.12.1.385.gd4d53dfa89.dirty


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] builtin/describe: introduce --broken flag
  2017-03-21 22:57               ` [PATCH v2] " Stefan Beller
@ 2017-03-22 17:21                 ` Junio C Hamano
  2017-03-22 21:50                 ` Jakub Narębski
  1 sibling, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2017-03-22 17:21 UTC (permalink / raw)
  To: Stefan Beller; +Cc: jrnieder, git, mfick

Stefan Beller <sbeller@google.com> writes:

> git-describe tells you the version number you're at, or errors out, e.g.
> when you run it outside of a repository, which may happen when downloading
> a tar ball instead of using git to obtain the source code.
>
> To keep this property of only erroring out, when not in a repository,
> severe (submodule) errors must be downgraded to reporting them gently
> instead of having git-describe error out completely.
>
> To achieve that a flag '--broken' is introduced, which is in the same
> vein as '--dirty' but uses an actual child process to check for dirtiness.
> When that child dies unexpectedly, we'll append '-broken' instead of
> '-dirty'.
>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>  Documentation/git-describe.txt | 11 +++++++---
>  builtin/describe.c             | 47 ++++++++++++++++++++++++++++++++++--------
>  t/t6120-describe.sh            | 20 ++++++++++++++++++
>  3 files changed, 66 insertions(+), 12 deletions(-)

I admit that my suggestion to use pushv() was done without trying it
out myself, but I do agree that the result looks better, especially
because the "dirty" (not "broken") side does not even have to use a
separate argv_array to prepare the command line in the first place
(which I failed to spot exactly because I didn't try it myself ;-).

We probably _could_ use cp.argv to point at the diff_index_args()
without doing pushv(&cp.args), and that would save a bit of
allocation, but it would not matter in practice as this is not a
performance critical codepath.

Thanks.
> diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
> index 167491fd5b..16952e44fc 100755
> --- a/t/t6120-describe.sh
> +++ b/t/t6120-describe.sh
> @@ -233,4 +233,24 @@ test_expect_success 'describe --contains and --no-match' '
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'setup and absorb a submodule' '
> +	test_create_repo sub1 &&
> +	test_commit -C sub1 initial &&
> +	git submodule add ./sub1 &&
> +	git submodule absorbgitdirs &&
> +	git commit -a -m "add submodule" &&
> +	git describe --dirty >expect &&
> +	git describe --broken >out &&
> +	test_cmp expect out
> +'
> +
> +test_expect_success 'describe chokes on severly broken submodules' '
> +	mv .git/modules/sub1/ .git/modules/sub_moved &&
> +	test_must_fail git describe --dirty
> +'
> +test_expect_success 'describe ignoring a borken submodule' '
> +	git describe --broken >out &&
> +	grep broken out
> +'
> +
>  test_done

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] builtin/describe: introduce --broken flag
  2017-03-21 22:57               ` [PATCH v2] " Stefan Beller
  2017-03-22 17:21                 ` Junio C Hamano
@ 2017-03-22 21:50                 ` Jakub Narębski
  1 sibling, 0 replies; 15+ messages in thread
From: Jakub Narębski @ 2017-03-22 21:50 UTC (permalink / raw)
  To: Stefan Beller, Junio C Hamano, Jonathan Nieder; +Cc: git, mfick

W dniu 21.03.2017 o 23:57, Stefan Beller pisze:

> --- a/Documentation/git-describe.txt
> +++ b/Documentation/git-describe.txt
> @@ -30,9 +30,14 @@ OPTIONS
>  	Commit-ish object names to describe.  Defaults to HEAD if omitted.
>  
>  --dirty[=<mark>]::
> -	Describe the working tree.
> -	It means describe HEAD and appends <mark> (`-dirty` by
> -	default) if the working tree is dirty.
> +--broken[=<mark>]::
> +	Describe the state of the working tree.  When the working
> +	tree matches HEAD, the output is the same as "git describe
> +	HEAD".  If the working tree has local modification "-dirty"
> +	is appended to it.  If a repository is corrupt and Git
> +	cannot determine if there is local modification, Git will
> +	error out, unless `--broken' is given, which appends
> +	the suffix "-broken" instead.

The common description reads better... but unfortunately it lost
information about the optional parameter, namely <mark>.  The
'-dirty' is just the default for <dirty-mark>, and '-broken' is
the default for <broken-mark>.

Maybe /the suffix "-broken"/<broken-mark> suffix ('-broken' by default)/
and similarly for "-dirty"?

Best,
-- 
Jakub Narębski


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2017-03-22 21:50 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-21  0:11 [PATCH 0/3] git-describe deals gracefully with broken submodules Stefan Beller
2017-03-21  0:11 ` [PATCH 1/3] submodule.c: port is_submodule_modified to use porcelain 2 Stefan Beller
2017-03-21  0:11 ` [PATCH 2/3] revision machinery: gentle submodule errors Stefan Beller
2017-03-21  0:11 ` [PATCH 3/3] builtin/describe: introduce --submodule-error-as-dirty flag Stefan Beller
2017-03-21  6:28 ` [PATCH 0/3] git-describe deals gracefully with broken submodules Junio C Hamano
2017-03-21  6:54   ` Junio C Hamano
2017-03-21 18:51     ` [PATCH] builtin/describe: introduce --broken flag Stefan Beller
2017-03-21 21:51       ` Junio C Hamano
2017-03-21 22:27         ` Stefan Beller
2017-03-21 22:41           ` Junio C Hamano
2017-03-21 22:50             ` Stefan Beller
2017-03-21 22:57               ` [PATCH v2] " Stefan Beller
2017-03-22 17:21                 ` Junio C Hamano
2017-03-22 21:50                 ` Jakub Narębski
2017-03-21 17:46   ` [PATCH 0/3] git-describe deals gracefully with broken submodules Stefan Beller

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).