git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCHv5 0/7] describe blob
@ 2017-11-16  2:00 Stefan Beller
  2017-11-16  2:00 ` [PATCHv5 1/7] t6120: fix typo in test name Stefan Beller
                   ` (6 more replies)
  0 siblings, 7 replies; 19+ messages in thread
From: Stefan Beller @ 2017-11-16  2:00 UTC (permalink / raw)
  To: git; +Cc: gitster, jonathantanmy, Stefan Beller

Thanks Jonathan and Junio for the patient review!

* fixed issues brought up in the last patch, see interdiff below.
  (I found that walking from so many refs as starting points was
   the source of confusion, hence we only want to walk from HEAD
* reworded commit messages from earlier patches
* added a BUGS section to the man page
* took Junios suggestion for the NAME section.

Stefan Beller (7):
  t6120: fix typo in test name
  list-objects.c: factor out traverse_trees_and_blobs
  revision.h: introduce blob/tree walking in order of the commits
  builtin/describe.c: rename `oid` to avoid variable shadowing
  builtin/describe.c: print debug statements earlier
  builtin/describe.c: factor out describe_commit
  builtin/describe.c: describe a blob

 Documentation/git-describe.txt     |  18 +++++-
 Documentation/rev-list-options.txt |   5 ++
 builtin/describe.c                 | 122 ++++++++++++++++++++++++++++---------
 list-objects.c                     |  58 ++++++++++++------
 revision.c                         |   2 +
 revision.h                         |   3 +-
 t/t6100-rev-list-in-order.sh       |  77 +++++++++++++++++++++++
 t/t6120-describe.sh                |  36 ++++++++++-
 8 files changed, 270 insertions(+), 51 deletions(-)
 create mode 100755 t/t6100-rev-list-in-order.sh

-- 
2.15.0.128.gcadd42da22

diff --git c/Documentation/git-describe.txt w/Documentation/git-describe.txt
index a25443ca91..e027fb8c4b 100644
--- c/Documentation/git-describe.txt
+++ w/Documentation/git-describe.txt
@@ -3,8 +3,7 @@ git-describe(1)
 
 NAME
 ----
-git-describe - Describe a commit or blob using the graph relations
-
+git-describe - Give an object a human readable name based on an available ref
 
 SYNOPSIS
 --------
@@ -27,13 +26,9 @@ see the -a and -s options to linkgit:git-tag[1].
 
 If the given object refers to a blob, it will be described
 as `<commit-ish>:<path>`, such that the blob can be found
-at `<path>` in the `<commit-ish>`. Note, that the commit is likely
-not the commit that introduced the blob, but the one that was found
-first; to find the commit that introduced the blob, you need to find
-the commit that last touched the path, e.g.
-`git log <commit-description> -- <path>`.
-As blobs do not point at the commits they are contained in,
-describing blobs is slow as we have to walk the whole graph.
+at `<path>` in the `<commit-ish>`, which itself describes the
+first commit in which this blob occurs in a reverse revision walk
+from HEAD.
 
 OPTIONS
 -------
@@ -197,6 +192,14 @@ selected and output.  Here fewest commits different is defined as
 the number of commits which would be shown by `git log tag..input`
 will be the smallest number of commits possible.
 
+BUGS
+----
+
+Tree objects as well as tag objects not pointing at commits, cannot be described.
+When describing blobs, the lightweight tags pointing at blobs are ignored,
+but the blob is still described as <committ-ish>:<path> despite the lightweight
+tag being favorable.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git c/builtin/describe.c w/builtin/describe.c
index a2a5fdc48d..5b4bfaba3f 100644
--- c/builtin/describe.c
+++ w/builtin/describe.c
@@ -457,7 +457,8 @@ static void process_object(struct object *obj, const char *path, void *data)
 		reset_revision_walk();
 		describe_commit(&pcd->current_commit, pcd->dst);
 		strbuf_addf(pcd->dst, ":%s", path);
-		pcd->revs->max_count = 0;
+		free_commit_list(pcd->revs->commits);
+		pcd->revs->commits = NULL;
 	}
 }
 
@@ -468,12 +469,7 @@ static void describe_blob(struct object_id oid, struct strbuf *dst)
 	struct process_commit_data pcd = { null_oid, oid, dst, &revs};
 
 	argv_array_pushl(&args, "internal: The first arg is not parsed",
-		"--all", "--reflog", /* as many starting points as possible */
-		/* NEEDSWORK: --all is incompatible with worktrees for now: */
-		"--single-worktree",
-		"--objects",
-		"--in-commit-order",
-		"--reverse",
+		"--objects", "--in-commit-order", "--reverse", "HEAD",
 		NULL);
 
 	init_revisions(&revs, NULL);
diff --git c/list-objects.c w/list-objects.c
index 07a92f35fe..4caa6fcb77 100644
--- c/list-objects.c
+++ w/list-objects.c
@@ -239,7 +239,13 @@ void traverse_commit_list(struct rev_info *revs,
 		if (commit->tree)
 			add_pending_tree(revs, commit->tree);
 		show_commit(commit, data);
+
 		if (revs->tree_blobs_in_commit_order)
+			/*
+			 * NEEDSWORK: Adding the tree and then flushing it here
+			 * needs a reallocation for each commit. Can we pass the
+			 * tree directory without allocation churn?
+			 */
 			traverse_trees_and_blobs(revs, &csp, show_object, data);
 	}
 	traverse_trees_and_blobs(revs, &csp, show_object, data);
diff --git c/t/t6120-describe.sh w/t/t6120-describe.sh
index ec4f25d009..4668f0058e 100755
--- c/t/t6120-describe.sh
+++ w/t/t6120-describe.sh
@@ -310,7 +310,7 @@ test_expect_success 'describe ignoring a broken submodule' '
 	grep broken out
 '
 
-test_expect_success 'describe a blob at a tag' '
+test_expect_success 'describe a blob at a directly tagged commit' '
 	echo "make it a unique blob" >file &&
 	git add file && git commit -m "content in file" &&
 	git tag -a -m "latest annotated tag" unique-file &&
@@ -319,14 +319,29 @@ test_expect_success 'describe a blob at a tag' '
 	test_cmp expect actual
 '
 
-test_expect_success 'describe a blob with its last introduction' '
+test_expect_success 'describe a blob with its first introduction' '
 	git commit --allow-empty -m "empty commit" &&
 	git rm file &&
 	git commit -m "delete blob" &&
 	git revert HEAD &&
 	git commit --allow-empty -m "empty commit" &&
 	git describe HEAD:file >actual &&
-	grep unique-file-3-g actual
+	echo "unique-file:file" >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'describe directly tagged blob' '
+	git tag test-blob unique-file:file &&
+	git describe test-blob >actual &&
+	echo "unique-file:file" >expect &&
+	# suboptimal: we rather want to see "test-blob"
+	test_cmp expect actual
+'
+
+test_expect_success 'describe tag object' '
+	git tag test-blob-1 -a -m msg unique-file:file &&
+	test_must_fail git describe test-blob-1 2>actual &&
+	grep "fatal: test-blob-1 is neither a commit nor blob" actual
 '
 
 test_expect_failure ULIMIT_STACK_SIZE 'name-rev works in a deep repo' '

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCHv5 1/7] t6120: fix typo in test name
  2017-11-16  2:00 [PATCHv5 0/7] describe blob Stefan Beller
@ 2017-11-16  2:00 ` Stefan Beller
  2017-11-16  2:00 ` [PATCHv5 2/7] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Stefan Beller @ 2017-11-16  2:00 UTC (permalink / raw)
  To: git; +Cc: gitster, jonathantanmy, Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t6120-describe.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 1c0e8659d9..c8b7ed82d9 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -304,7 +304,7 @@ test_expect_success 'describe chokes on severely broken submodules' '
 	mv .git/modules/sub1/ .git/modules/sub_moved &&
 	test_must_fail git describe --dirty
 '
-test_expect_success 'describe ignoring a borken submodule' '
+test_expect_success 'describe ignoring a broken submodule' '
 	git describe --broken >out &&
 	test_when_finished "mv .git/modules/sub_moved .git/modules/sub1" &&
 	grep broken out
-- 
2.15.0.128.gcadd42da22


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCHv5 2/7] list-objects.c: factor out traverse_trees_and_blobs
  2017-11-16  2:00 [PATCHv5 0/7] describe blob Stefan Beller
  2017-11-16  2:00 ` [PATCHv5 1/7] t6120: fix typo in test name Stefan Beller
@ 2017-11-16  2:00 ` Stefan Beller
  2017-11-16  2:00 ` [PATCHv5 3/7] revision.h: introduce blob/tree walking in order of the commits Stefan Beller
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Stefan Beller @ 2017-11-16  2:00 UTC (permalink / raw)
  To: git; +Cc: gitster, jonathantanmy, Stefan Beller

With traverse_trees_and_blobs factored out of the main traverse function,
the next patch can introduce an in-order revision walking with ease.

In the next patch we'll call `traverse_trees_and_blobs` from within the
loop walking the commits, such that we'll have one invocation of that
function per commit.  That is why we do not want to have memory allocations
in that function, such as we'd have if we were to use a strbuf locally.
Pass a strbuf from traverse_commit_list into the blob and tree traversing
function as a scratch pad that only needs to be allocated once.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 list-objects.c | 50 +++++++++++++++++++++++++++++++-------------------
 1 file changed, 31 insertions(+), 19 deletions(-)

diff --git a/list-objects.c b/list-objects.c
index b3931fa434..7c2ce9c4bd 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -183,25 +183,15 @@ static void add_pending_tree(struct rev_info *revs, struct tree *tree)
 	add_pending_object(revs, &tree->object, "");
 }
 
-void traverse_commit_list(struct rev_info *revs,
-			  show_commit_fn show_commit,
-			  show_object_fn show_object,
-			  void *data)
+static void traverse_trees_and_blobs(struct rev_info *revs,
+				     struct strbuf *base,
+				     show_object_fn show_object,
+				     void *data)
 {
 	int i;
-	struct commit *commit;
-	struct strbuf base;
 
-	strbuf_init(&base, PATH_MAX);
-	while ((commit = get_revision(revs)) != NULL) {
-		/*
-		 * an uninteresting boundary commit may not have its tree
-		 * parsed yet, but we are not going to show them anyway
-		 */
-		if (commit->tree)
-			add_pending_tree(revs, commit->tree);
-		show_commit(commit, data);
-	}
+	assert(base->len == 0);
+
 	for (i = 0; i < revs->pending.nr; i++) {
 		struct object_array_entry *pending = revs->pending.objects + i;
 		struct object *obj = pending->item;
@@ -218,17 +208,39 @@ void traverse_commit_list(struct rev_info *revs,
 			path = "";
 		if (obj->type == OBJ_TREE) {
 			process_tree(revs, (struct tree *)obj, show_object,
-				     &base, path, data);
+				     base, path, data);
 			continue;
 		}
 		if (obj->type == OBJ_BLOB) {
 			process_blob(revs, (struct blob *)obj, show_object,
-				     &base, path, data);
+				     base, path, data);
 			continue;
 		}
 		die("unknown pending object %s (%s)",
 		    oid_to_hex(&obj->oid), name);
 	}
 	object_array_clear(&revs->pending);
-	strbuf_release(&base);
+}
+
+void traverse_commit_list(struct rev_info *revs,
+			  show_commit_fn show_commit,
+			  show_object_fn show_object,
+			  void *data)
+{
+	struct commit *commit;
+	struct strbuf csp; /* callee's scratch pad */
+	strbuf_init(&csp, PATH_MAX);
+
+	while ((commit = get_revision(revs)) != NULL) {
+		/*
+		 * an uninteresting boundary commit may not have its tree
+		 * parsed yet, but we are not going to show them anyway
+		 */
+		if (commit->tree)
+			add_pending_tree(revs, commit->tree);
+		show_commit(commit, data);
+	}
+	traverse_trees_and_blobs(revs, &csp, show_object, data);
+
+	strbuf_release(&csp);
 }
-- 
2.15.0.128.gcadd42da22


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCHv5 3/7] revision.h: introduce blob/tree walking in order of the commits
  2017-11-16  2:00 [PATCHv5 0/7] describe blob Stefan Beller
  2017-11-16  2:00 ` [PATCHv5 1/7] t6120: fix typo in test name Stefan Beller
  2017-11-16  2:00 ` [PATCHv5 2/7] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller
@ 2017-11-16  2:00 ` Stefan Beller
  2017-11-16  2:00 ` [PATCHv5 4/7] builtin/describe.c: rename `oid` to avoid variable shadowing Stefan Beller
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Stefan Beller @ 2017-11-16  2:00 UTC (permalink / raw)
  To: git; +Cc: gitster, jonathantanmy, Stefan Beller

The functionality to list tree objects in the order they were seen
while traversing the commits will be used in one of the next commits,
where we teach `git describe` to describe not only commits, but blobs, too.

The change in list-objects.c is rather minimal as we'll be re-using
the infrastructure put in place of the revision walking machinery. For
example one could expect that add_pending_tree is not called, but rather
commit->tree is directly passed to the tree traversal function. This
however requires a lot more code than just emptying the queue containing
trees after each commit.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/rev-list-options.txt |  5 +++
 list-objects.c                     |  8 ++++
 revision.c                         |  2 +
 revision.h                         |  3 +-
 t/t6100-rev-list-in-order.sh       | 77 ++++++++++++++++++++++++++++++++++++++
 5 files changed, 94 insertions(+), 1 deletion(-)
 create mode 100755 t/t6100-rev-list-in-order.sh

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 13501e1556..9066e0c777 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -686,6 +686,11 @@ ifdef::git-rev-list[]
 	all object IDs which I need to download if I have the commit
 	object _bar_ but not _foo_''.
 
+--in-commit-order::
+	Print tree and blob ids in order of the commits. The tree
+	and blob ids are printed after they are first referenced
+	by a commit.
+
 --objects-edge::
 	Similar to `--objects`, but also print the IDs of excluded
 	commits prefixed with a ``-'' character.  This is used by
diff --git a/list-objects.c b/list-objects.c
index 7c2ce9c4bd..4caa6fcb77 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -239,6 +239,14 @@ void traverse_commit_list(struct rev_info *revs,
 		if (commit->tree)
 			add_pending_tree(revs, commit->tree);
 		show_commit(commit, data);
+
+		if (revs->tree_blobs_in_commit_order)
+			/*
+			 * NEEDSWORK: Adding the tree and then flushing it here
+			 * needs a reallocation for each commit. Can we pass the
+			 * tree directory without allocation churn?
+			 */
+			traverse_trees_and_blobs(revs, &csp, show_object, data);
 	}
 	traverse_trees_and_blobs(revs, &csp, show_object, data);
 
diff --git a/revision.c b/revision.c
index d167223e69..9329d4ebbf 100644
--- a/revision.c
+++ b/revision.c
@@ -1845,6 +1845,8 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->dense = 0;
 	} else if (!strcmp(arg, "--show-all")) {
 		revs->show_all = 1;
+	} else if (!strcmp(arg, "--in-commit-order")) {
+		revs->tree_blobs_in_commit_order = 1;
 	} else if (!strcmp(arg, "--remove-empty")) {
 		revs->remove_empty_trees = 1;
 	} else if (!strcmp(arg, "--merges")) {
diff --git a/revision.h b/revision.h
index 54761200ad..86985d68aa 100644
--- a/revision.h
+++ b/revision.h
@@ -121,7 +121,8 @@ struct rev_info {
 			bisect:1,
 			ancestry_path:1,
 			first_parent_only:1,
-			line_level_traverse:1;
+			line_level_traverse:1,
+			tree_blobs_in_commit_order:1;
 
 	/* Diff flags */
 	unsigned int	diff:1,
diff --git a/t/t6100-rev-list-in-order.sh b/t/t6100-rev-list-in-order.sh
new file mode 100755
index 0000000000..b2bb0a7f61
--- /dev/null
+++ b/t/t6100-rev-list-in-order.sh
@@ -0,0 +1,77 @@
+#!/bin/sh
+
+test_description='rev-list testing in-commit-order'
+
+. ./test-lib.sh
+
+test_expect_success 'setup a commit history with trees, blobs' '
+	for x in one two three four
+	do
+		echo $x >$x &&
+		git add $x &&
+		git commit -m "add file $x" ||
+		return 1
+	done &&
+	for x in four three
+	do
+		git rm $x &&
+		git commit -m "remove $x" ||
+		return 1
+	done
+'
+
+test_expect_success 'rev-list --in-commit-order' '
+	git rev-list --in-commit-order --objects HEAD >actual.raw &&
+	cut -c 1-40 >actual <actual.raw &&
+
+	git cat-file --batch-check="%(objectname)" >expect.raw <<-\EOF &&
+		HEAD^{commit}
+		HEAD^{tree}
+		HEAD^{tree}:one
+		HEAD^{tree}:two
+		HEAD~1^{commit}
+		HEAD~1^{tree}
+		HEAD~1^{tree}:three
+		HEAD~2^{commit}
+		HEAD~2^{tree}
+		HEAD~2^{tree}:four
+		HEAD~3^{commit}
+		# HEAD~3^{tree} skipped, same as HEAD~1^{tree}
+		HEAD~4^{commit}
+		# HEAD~4^{tree} skipped, same as HEAD^{tree}
+		HEAD~5^{commit}
+		HEAD~5^{tree}
+	EOF
+	grep -v "#" >expect <expect.raw &&
+
+	test_cmp expect actual
+'
+
+test_expect_success 'rev-list lists blobs and trees after commits' '
+	git rev-list --objects HEAD >actual.raw &&
+	cut -c 1-40 >actual <actual.raw &&
+
+	git cat-file --batch-check="%(objectname)" >expect.raw <<-\EOF &&
+		HEAD^{commit}
+		HEAD~1^{commit}
+		HEAD~2^{commit}
+		HEAD~3^{commit}
+		HEAD~4^{commit}
+		HEAD~5^{commit}
+		HEAD^{tree}
+		HEAD^{tree}:one
+		HEAD^{tree}:two
+		HEAD~1^{tree}
+		HEAD~1^{tree}:three
+		HEAD~2^{tree}
+		HEAD~2^{tree}:four
+		# HEAD~3^{tree} skipped, same as HEAD~1^{tree}
+		# HEAD~4^{tree} skipped, same as HEAD^{tree}
+		HEAD~5^{tree}
+	EOF
+	grep -v "#" >expect <expect.raw &&
+
+	test_cmp expect actual
+'
+
+test_done
-- 
2.15.0.128.gcadd42da22


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCHv5 4/7] builtin/describe.c: rename `oid` to avoid variable shadowing
  2017-11-16  2:00 [PATCHv5 0/7] describe blob Stefan Beller
                   ` (2 preceding siblings ...)
  2017-11-16  2:00 ` [PATCHv5 3/7] revision.h: introduce blob/tree walking in order of the commits Stefan Beller
@ 2017-11-16  2:00 ` Stefan Beller
  2017-11-16  2:00 ` [PATCHv5 5/7] builtin/describe.c: print debug statements earlier Stefan Beller
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Stefan Beller @ 2017-11-16  2:00 UTC (permalink / raw)
  To: git; +Cc: gitster, jonathantanmy, Stefan Beller

The function `describe` has already a variable named `oid` declared at
the beginning of the function for an object id.  Do not shadow that
variable with a pointer to an object id.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/describe.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/builtin/describe.c b/builtin/describe.c
index 29075dbd0f..fd61f463cf 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -381,9 +381,9 @@ static void describe(const char *arg, int last_one)
 	}
 
 	if (!match_cnt) {
-		struct object_id *oid = &cmit->object.oid;
+		struct object_id *cmit_oid = &cmit->object.oid;
 		if (always) {
-			printf("%s", find_unique_abbrev(oid->hash, abbrev));
+			printf("%s", find_unique_abbrev(cmit_oid->hash, abbrev));
 			if (suffix)
 				printf("%s", suffix);
 			printf("\n");
@@ -392,11 +392,11 @@ static void describe(const char *arg, int last_one)
 		if (unannotated_cnt)
 			die(_("No annotated tags can describe '%s'.\n"
 			    "However, there were unannotated tags: try --tags."),
-			    oid_to_hex(oid));
+			    oid_to_hex(cmit_oid));
 		else
 			die(_("No tags can describe '%s'.\n"
 			    "Try --always, or create some tags."),
-			    oid_to_hex(oid));
+			    oid_to_hex(cmit_oid));
 	}
 
 	QSORT(all_matches, match_cnt, compare_pt);
-- 
2.15.0.128.gcadd42da22


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCHv5 5/7] builtin/describe.c: print debug statements earlier
  2017-11-16  2:00 [PATCHv5 0/7] describe blob Stefan Beller
                   ` (3 preceding siblings ...)
  2017-11-16  2:00 ` [PATCHv5 4/7] builtin/describe.c: rename `oid` to avoid variable shadowing Stefan Beller
@ 2017-11-16  2:00 ` Stefan Beller
  2017-11-16  2:00 ` [PATCHv5 6/7] builtin/describe.c: factor out describe_commit Stefan Beller
  2017-11-16  2:00 ` [PATCHv5 7/7] builtin/describe.c: describe a blob Stefan Beller
  6 siblings, 0 replies; 19+ messages in thread
From: Stefan Beller @ 2017-11-16  2:00 UTC (permalink / raw)
  To: git; +Cc: gitster, jonathantanmy, Stefan Beller

When debugging, print the received argument at the start of the
function instead of in the middle. This ensures that the received
argument is printed in all code paths, and also allows a subsequent
refactoring to not need to move the "arg" parameter.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 builtin/describe.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/builtin/describe.c b/builtin/describe.c
index fd61f463cf..3136efde31 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -293,6 +293,9 @@ static void describe(const char *arg, int last_one)
 	unsigned long seen_commits = 0;
 	unsigned int unannotated_cnt = 0;
 
+	if (debug)
+		fprintf(stderr, _("describe %s\n"), arg);
+
 	if (get_oid(arg, &oid))
 		die(_("Not a valid object name %s"), arg);
 	cmit = lookup_commit_reference(&oid);
@@ -316,7 +319,7 @@ static void describe(const char *arg, int last_one)
 	if (!max_candidates)
 		die(_("no tag exactly matches '%s'"), oid_to_hex(&cmit->object.oid));
 	if (debug)
-		fprintf(stderr, _("searching to describe %s\n"), arg);
+		fprintf(stderr, _("No exact match on refs or tags, searching to describe\n"));
 
 	if (!have_util) {
 		struct hashmap_iter iter;
-- 
2.15.0.128.gcadd42da22


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCHv5 6/7] builtin/describe.c: factor out describe_commit
  2017-11-16  2:00 [PATCHv5 0/7] describe blob Stefan Beller
                   ` (4 preceding siblings ...)
  2017-11-16  2:00 ` [PATCHv5 5/7] builtin/describe.c: print debug statements earlier Stefan Beller
@ 2017-11-16  2:00 ` Stefan Beller
  2017-11-16  2:00 ` [PATCHv5 7/7] builtin/describe.c: describe a blob Stefan Beller
  6 siblings, 0 replies; 19+ messages in thread
From: Stefan Beller @ 2017-11-16  2:00 UTC (permalink / raw)
  To: git; +Cc: gitster, jonathantanmy, Stefan Beller

Factor out describing commits into its own function `describe_commit`,
which will put any output to stdout into a strbuf, to be printed
afterwards.

As the next patch will teach Git to describe blobs using a commit and path,
this refactor will make it easy to reuse the code describing commits.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 builtin/describe.c | 63 ++++++++++++++++++++++++++++++++----------------------
 1 file changed, 37 insertions(+), 26 deletions(-)

diff --git a/builtin/describe.c b/builtin/describe.c
index 3136efde31..9e9a5ed5d4 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -256,7 +256,7 @@ static unsigned long finish_depth_computation(
 	return seen_commits;
 }
 
-static void display_name(struct commit_name *n)
+static void append_name(struct commit_name *n, struct strbuf *dst)
 {
 	if (n->prio == 2 && !n->tag) {
 		n->tag = lookup_tag(&n->oid);
@@ -272,19 +272,18 @@ static void display_name(struct commit_name *n)
 	}
 
 	if (n->tag)
-		printf("%s", n->tag->tag);
+		strbuf_addstr(dst, n->tag->tag);
 	else
-		printf("%s", n->path);
+		strbuf_addstr(dst, n->path);
 }
 
-static void show_suffix(int depth, const struct object_id *oid)
+static void append_suffix(int depth, const struct object_id *oid, struct strbuf *dst)
 {
-	printf("-%d-g%s", depth, find_unique_abbrev(oid->hash, abbrev));
+	strbuf_addf(dst, "-%d-g%s", depth, find_unique_abbrev(oid->hash, abbrev));
 }
 
-static void describe(const char *arg, int last_one)
+static void describe_commit(struct object_id *oid, struct strbuf *dst)
 {
-	struct object_id oid;
 	struct commit *cmit, *gave_up_on = NULL;
 	struct commit_list *list;
 	struct commit_name *n;
@@ -293,26 +292,18 @@ static void describe(const char *arg, int last_one)
 	unsigned long seen_commits = 0;
 	unsigned int unannotated_cnt = 0;
 
-	if (debug)
-		fprintf(stderr, _("describe %s\n"), arg);
-
-	if (get_oid(arg, &oid))
-		die(_("Not a valid object name %s"), arg);
-	cmit = lookup_commit_reference(&oid);
-	if (!cmit)
-		die(_("%s is not a valid '%s' object"), arg, commit_type);
+	cmit = lookup_commit_reference(oid);
 
 	n = find_commit_name(&cmit->object.oid);
 	if (n && (tags || all || n->prio == 2)) {
 		/*
 		 * Exact match to an existing ref.
 		 */
-		display_name(n);
+		append_name(n, dst);
 		if (longformat)
-			show_suffix(0, n->tag ? &n->tag->tagged->oid : &oid);
+			append_suffix(0, n->tag ? &n->tag->tagged->oid : oid, dst);
 		if (suffix)
-			printf("%s", suffix);
-		printf("\n");
+			strbuf_addstr(dst, suffix);
 		return;
 	}
 
@@ -386,10 +377,9 @@ static void describe(const char *arg, int last_one)
 	if (!match_cnt) {
 		struct object_id *cmit_oid = &cmit->object.oid;
 		if (always) {
-			printf("%s", find_unique_abbrev(cmit_oid->hash, abbrev));
+			strbuf_addstr(dst, find_unique_abbrev(cmit_oid->hash, abbrev));
 			if (suffix)
-				printf("%s", suffix);
-			printf("\n");
+				strbuf_addstr(dst, suffix);
 			return;
 		}
 		if (unannotated_cnt)
@@ -437,15 +427,36 @@ static void describe(const char *arg, int last_one)
 		}
 	}
 
-	display_name(all_matches[0].name);
+	append_name(all_matches[0].name, dst);
 	if (abbrev)
-		show_suffix(all_matches[0].depth, &cmit->object.oid);
+		append_suffix(all_matches[0].depth, &cmit->object.oid, dst);
 	if (suffix)
-		printf("%s", suffix);
-	printf("\n");
+		strbuf_addstr(dst, suffix);
+}
+
+static void describe(const char *arg, int last_one)
+{
+	struct object_id oid;
+	struct commit *cmit;
+	struct strbuf sb = STRBUF_INIT;
+
+	if (debug)
+		fprintf(stderr, _("describe %s\n"), arg);
+
+	if (get_oid(arg, &oid))
+		die(_("Not a valid object name %s"), arg);
+	cmit = lookup_commit_reference(&oid);
+	if (!cmit)
+		die(_("%s is not a valid '%s' object"), arg, commit_type);
+
+	describe_commit(&oid, &sb);
+
+	puts(sb.buf);
 
 	if (!last_one)
 		clear_commit_marks(cmit, -1);
+
+	strbuf_release(&sb);
 }
 
 int cmd_describe(int argc, const char **argv, const char *prefix)
-- 
2.15.0.128.gcadd42da22


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCHv5 7/7] builtin/describe.c: describe a blob
  2017-11-16  2:00 [PATCHv5 0/7] describe blob Stefan Beller
                   ` (5 preceding siblings ...)
  2017-11-16  2:00 ` [PATCHv5 6/7] builtin/describe.c: factor out describe_commit Stefan Beller
@ 2017-11-16  2:00 ` Stefan Beller
  2017-11-16  3:24   ` Junio C Hamano
                     ` (2 more replies)
  6 siblings, 3 replies; 19+ messages in thread
From: Stefan Beller @ 2017-11-16  2:00 UTC (permalink / raw)
  To: git; +Cc: gitster, jonathantanmy, Stefan Beller

Sometimes users are given a hash of an object and they want to
identify it further (ex.: Use verify-pack to find the largest blobs,
but what are these? or [1])

When describing commits, we try to anchor them to tags or refs, as these
are conceptually on a higher level than the commit. And if there is no ref
or tag that matches exactly, we're out of luck.  So we employ a heuristic
to make up a name for the commit. These names are ambiguous, there might
be different tags or refs to anchor to, and there might be different
path in the DAG to travel to arrive at the commit precisely.

When describing a blob, we want to describe the blob from a higher layer
as well, which is a tuple of (commit, deep/path) as the tree objects
involved are rather uninteresting.  The same blob can be referenced by
multiple commits, so how we decide which commit to use?  This patch
implements a rather naive approach on this: As there are no back pointers
from blobs to commits in which the blob occurs, we'll start walking from
any tips available, listing the blobs in-order of the commit and once we
found the blob, we'll take the first commit that listed the blob. For
example

  git describe --tags v0.99:Makefile
  conversion-901-g7672db20c2:Makefile

tells us the Makefile as it was in v0.99 was introduced in commit 7672db20.

The walking is performed in reverse order to show the introduction of a
blob rather than its last occurrence.

[1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/git-describe.txt | 18 ++++++++++--
 builtin/describe.c             | 62 ++++++++++++++++++++++++++++++++++++++----
 t/t6120-describe.sh            | 34 +++++++++++++++++++++++
 3 files changed, 107 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-describe.txt b/Documentation/git-describe.txt
index c924c945ba..e027fb8c4b 100644
--- a/Documentation/git-describe.txt
+++ b/Documentation/git-describe.txt
@@ -3,14 +3,14 @@ git-describe(1)
 
 NAME
 ----
-git-describe - Describe a commit using the most recent tag reachable from it
-
+git-describe - Give an object a human readable name based on an available ref
 
 SYNOPSIS
 --------
 [verse]
 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] [<commit-ish>...]
 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] --dirty[=<mark>]
+'git describe' <blob>
 
 DESCRIPTION
 -----------
@@ -24,6 +24,12 @@ By default (without --all or --tags) `git describe` only shows
 annotated tags.  For more information about creating annotated tags
 see the -a and -s options to linkgit:git-tag[1].
 
+If the given object refers to a blob, it will be described
+as `<commit-ish>:<path>`, such that the blob can be found
+at `<path>` in the `<commit-ish>`, which itself describes the
+first commit in which this blob occurs in a reverse revision walk
+from HEAD.
+
 OPTIONS
 -------
 <commit-ish>...::
@@ -186,6 +192,14 @@ selected and output.  Here fewest commits different is defined as
 the number of commits which would be shown by `git log tag..input`
 will be the smallest number of commits possible.
 
+BUGS
+----
+
+Tree objects as well as tag objects not pointing at commits, cannot be described.
+When describing blobs, the lightweight tags pointing at blobs are ignored,
+but the blob is still described as <committ-ish>:<path> despite the lightweight
+tag being favorable.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/builtin/describe.c b/builtin/describe.c
index 9e9a5ed5d4..5b4bfaba3f 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -3,6 +3,7 @@
 #include "lockfile.h"
 #include "commit.h"
 #include "tag.h"
+#include "blob.h"
 #include "refs.h"
 #include "builtin.h"
 #include "exec_cmd.h"
@@ -11,8 +12,9 @@
 #include "hashmap.h"
 #include "argv-array.h"
 #include "run-command.h"
+#include "revision.h"
+#include "list-objects.h"
 
-#define SEEN		(1u << 0)
 #define MAX_TAGS	(FLAG_BITS - 1)
 
 static const char * const describe_usage[] = {
@@ -434,6 +436,53 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst)
 		strbuf_addstr(dst, suffix);
 }
 
+struct process_commit_data {
+	struct object_id current_commit;
+	struct object_id looking_for;
+	struct strbuf *dst;
+	struct rev_info *revs;
+};
+
+static void process_commit(struct commit *commit, void *data)
+{
+	struct process_commit_data *pcd = data;
+	pcd->current_commit = commit->object.oid;
+}
+
+static void process_object(struct object *obj, const char *path, void *data)
+{
+	struct process_commit_data *pcd = data;
+
+	if (!oidcmp(&pcd->looking_for, &obj->oid) && !pcd->dst->len) {
+		reset_revision_walk();
+		describe_commit(&pcd->current_commit, pcd->dst);
+		strbuf_addf(pcd->dst, ":%s", path);
+		free_commit_list(pcd->revs->commits);
+		pcd->revs->commits = NULL;
+	}
+}
+
+static void describe_blob(struct object_id oid, struct strbuf *dst)
+{
+	struct rev_info revs;
+	struct argv_array args = ARGV_ARRAY_INIT;
+	struct process_commit_data pcd = { null_oid, oid, dst, &revs};
+
+	argv_array_pushl(&args, "internal: The first arg is not parsed",
+		"--objects", "--in-commit-order", "--reverse", "HEAD",
+		NULL);
+
+	init_revisions(&revs, NULL);
+	if (setup_revisions(args.argc, args.argv, &revs, NULL) > 1)
+		BUG("setup_revisions could not handle all args?");
+
+	if (prepare_revision_walk(&revs))
+		die("revision walk setup failed");
+
+	traverse_commit_list(&revs, process_commit, process_object, &pcd);
+	reset_revision_walk();
+}
+
 static void describe(const char *arg, int last_one)
 {
 	struct object_id oid;
@@ -445,11 +494,14 @@ static void describe(const char *arg, int last_one)
 
 	if (get_oid(arg, &oid))
 		die(_("Not a valid object name %s"), arg);
-	cmit = lookup_commit_reference(&oid);
-	if (!cmit)
-		die(_("%s is not a valid '%s' object"), arg, commit_type);
+	cmit = lookup_commit_reference_gently(&oid, 1);
 
-	describe_commit(&oid, &sb);
+	if (cmit)
+		describe_commit(&oid, &sb);
+	else if (lookup_blob(&oid))
+		describe_blob(oid, &sb);
+	else
+		die(_("%s is neither a commit nor blob"), arg);
 
 	puts(sb.buf);
 
diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index c8b7ed82d9..4668f0058e 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -310,6 +310,40 @@ test_expect_success 'describe ignoring a broken submodule' '
 	grep broken out
 '
 
+test_expect_success 'describe a blob at a directly tagged commit' '
+	echo "make it a unique blob" >file &&
+	git add file && git commit -m "content in file" &&
+	git tag -a -m "latest annotated tag" unique-file &&
+	git describe HEAD:file >actual &&
+	echo "unique-file:file" >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'describe a blob with its first introduction' '
+	git commit --allow-empty -m "empty commit" &&
+	git rm file &&
+	git commit -m "delete blob" &&
+	git revert HEAD &&
+	git commit --allow-empty -m "empty commit" &&
+	git describe HEAD:file >actual &&
+	echo "unique-file:file" >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'describe directly tagged blob' '
+	git tag test-blob unique-file:file &&
+	git describe test-blob >actual &&
+	echo "unique-file:file" >expect &&
+	# suboptimal: we rather want to see "test-blob"
+	test_cmp expect actual
+'
+
+test_expect_success 'describe tag object' '
+	git tag test-blob-1 -a -m msg unique-file:file &&
+	test_must_fail git describe test-blob-1 2>actual &&
+	grep "fatal: test-blob-1 is neither a commit nor blob" actual
+'
+
 test_expect_failure ULIMIT_STACK_SIZE 'name-rev works in a deep repo' '
 	i=1 &&
 	while test $i -lt 8000
-- 
2.15.0.128.gcadd42da22


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCHv5 7/7] builtin/describe.c: describe a blob
  2017-11-16  2:00 ` [PATCHv5 7/7] builtin/describe.c: describe a blob Stefan Beller
@ 2017-11-16  3:24   ` Junio C Hamano
  2017-11-16 19:34     ` Stefan Beller
  2017-11-20 15:10   ` Philip Oakley
  2017-12-19 19:22   ` Junio C Hamano
  2 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2017-11-16  3:24 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, jonathantanmy

Stefan Beller <sbeller@google.com> writes:

> When describing commits, we try to anchor them to tags or refs, as these
> are conceptually on a higher level than the commit. And if there is no ref
> or tag that matches exactly, we're out of luck.  So we employ a heuristic
> to make up a name for the commit. These names are ambiguous, there might
> be different tags or refs to anchor to, and there might be different
> path in the DAG to travel to arrive at the commit precisely.

I am not sure if "And if there is ..." is adding much value here (I
do not think it is even technically correct for that matter).  If
there are more than one tag that point at the commit the user is
interested in, we use one of the tags, as tags conceptually sit at a
higher level.  And we use a heuristic to use one or the other tag to
make up a name for the commit, so the same commit can have two valid
names. ---So what?  Neither of these two valid names is "ambigous";
the commit object the user wanted to name _is_ correctly identified
(I would assume that we are not discussing a hash collision).

Lucikly, if we remove "And if...precisely", the logic still flows
nicely, if not more, to the next paragraph.

> When describing a blob, we want to describe the blob from a higher layer
> as well, which is a tuple of (commit, deep/path) as the tree objects
> involved are rather uninteresting.  The same blob can be referenced by
> multiple commits, so how we decide which commit to use?  This patch
> implements a rather naive approach on this: As there are no back pointers
> from blobs to commits in which the blob occurs, we'll start walking from
> any tips available, listing the blobs in-order of the commit and once we

Is "any tips" still the case?  I was wondering why you start your
traversal at HEAD and nothing else in this iteration.  There seems
to be no mention of this design decision in the documentation and no
justification in the log.

> found the blob, we'll take the first commit that listed the blob. For
> example
>
>   git describe --tags v0.99:Makefile
>   conversion-901-g7672db20c2:Makefile
>
> tells us the Makefile as it was in v0.99 was introduced in commit 7672db20.
>
> The walking is performed in reverse order to show the introduction of a
> blob rather than its last occurrence.

The reversing may improve the chance of an older commit to be chosen
rather than the newer one, but it does not even guarantee to show the
"introduction".

What this guarantees is that a long history will be traversed fully
before we start considering which commit has the blob of interest, I
am afraid.  Is this a sensible trade-off?

> +	argv_array_pushl(&args, "internal: The first arg is not parsed",
> +		"--objects", "--in-commit-order", "--reverse", "HEAD",
> +		NULL);


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv5 7/7] builtin/describe.c: describe a blob
  2017-11-16  3:24   ` Junio C Hamano
@ 2017-11-16 19:34     ` Stefan Beller
  2017-11-22  7:55       ` Junio C Hamano
  0 siblings, 1 reply; 19+ messages in thread
From: Stefan Beller @ 2017-11-16 19:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jonathan Tan

On Wed, Nov 15, 2017 at 7:24 PM, Junio C Hamano <gitster@pobox.com> wrote:

> I am not sure if "And if there is ..." is adding much value here (I
> do not think it is even technically correct for that matter).  If
> there are more than one tag that point at the commit the user is
> interested in, we use one of the tags, as tags conceptually sit at a
> higher level.  And we use a heuristic to use one or the other tag to
> make up a name for the commit, so the same commit can have two valid
> names. ---So what?  Neither of these two valid names is "ambigous";
> the commit object the user wanted to name _is_ correctly identified
> (I would assume that we are not discussing a hash collision).
>
> Lucikly, if we remove "And if...precisely", the logic still flows
> nicely, if not more, to the next paragraph.

fixed.

>> When describing a blob, we want to describe the blob from a higher layer
>> as well, which is a tuple of (commit, deep/path) as the tree objects
>> involved are rather uninteresting.  The same blob can be referenced by
>> multiple commits, so how we decide which commit to use?  This patch
>> implements a rather naive approach on this: As there are no back pointers
>> from blobs to commits in which the blob occurs, we'll start walking from
>> any tips available, listing the blobs in-order of the commit and once we
>
> Is "any tips" still the case?  I was wondering why you start your
> traversal at HEAD and nothing else in this iteration.  There seems
> to be no mention of this design decision in the documentation and no
> justification in the log.

fixed the text. The design decision to reverse walk HEAD is tied to
your observation below:

> The reversing may improve the chance of an older commit to be chosen
> rather than the newer one, but it does not even guarantee to show the
> "introduction".

This is what I realized when we started walking all refs including reflog.
The introduction can only be found when we take the commit-parents
into account and look at the diffs. I noticed you started
origin/jc/diff-blobfind
which may be helpful to find the introduction correctly, which I'll play around
with that and see if I can get that working.

> What this guarantees is that a long history will be traversed fully
> before we start considering which commit has the blob of interest, I
> am afraid.  Is this a sensible trade-off?

I am not fully convinced all descriptions are in recent history, but I
tend to agree that most are, so probably the trade off is a wash.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv5 7/7] builtin/describe.c: describe a blob
  2017-11-16  2:00 ` [PATCHv5 7/7] builtin/describe.c: describe a blob Stefan Beller
  2017-11-16  3:24   ` Junio C Hamano
@ 2017-11-20 15:10   ` Philip Oakley
  2017-12-19 19:22   ` Junio C Hamano
  2 siblings, 0 replies; 19+ messages in thread
From: Philip Oakley @ 2017-11-20 15:10 UTC (permalink / raw)
  To: Stefan Beller, git; +Cc: gitster, jonathantanmy, Stefan Beller

From: "Stefan Beller" <sbeller@google.com>
Sent: Thursday, November 16, 2017 2:00 AM

[in catch up mode..]

> Sometimes users are given a hash of an object and they want to
> identify it further (ex.: Use verify-pack to find the largest blobs,
> but what are these? or [1])
>
> When describing commits, we try to anchor them to tags or refs, as these
> are conceptually on a higher level than the commit. And if there is no ref
> or tag that matches exactly, we're out of luck.  So we employ a heuristic
> to make up a name for the commit. These names are ambiguous, there might
> be different tags or refs to anchor to, and there might be different
> path in the DAG to travel to arrive at the commit precisely.
>
> When describing a blob, we want to describe the blob from a higher layer
> as well, which is a tuple of (commit, deep/path) as the tree objects
> involved are rather uninteresting.  The same blob can be referenced by
> multiple commits, so how we decide which commit to use?  This patch
> implements a rather naive approach on this: As there are no back pointers
> from blobs to commits in which the blob occurs, we'll start walking from
> any tips available, listing the blobs in-order of the commit and once we
> found the blob, we'll take the first commit that listed the blob. For
> example
>
>  git describe --tags v0.99:Makefile
>  conversion-901-g7672db20c2:Makefile
>
> tells us the Makefile as it was in v0.99 was introduced in commit 
> 7672db20.
>
> The walking is performed in reverse order to show the introduction of a
> blob rather than its last occurrence.
>
> [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob
>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
> Documentation/git-describe.txt | 18 ++++++++++--
> builtin/describe.c             | 62 
> ++++++++++++++++++++++++++++++++++++++----
> t/t6120-describe.sh            | 34 +++++++++++++++++++++++
> 3 files changed, 107 insertions(+), 7 deletions(-)
>
> diff --git a/Documentation/git-describe.txt 
> b/Documentation/git-describe.txt
> index c924c945ba..e027fb8c4b 100644
> --- a/Documentation/git-describe.txt
> +++ b/Documentation/git-describe.txt
> @@ -3,14 +3,14 @@ git-describe(1)
>
> NAME
> ----
> -git-describe - Describe a commit using the most recent tag reachable from 
> it
> -
> +git-describe - Give an object a human readable name based on an available 
> ref
>
> SYNOPSIS
> --------
> [verse]
> 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] 
> [<commit-ish>...]
> 'git describe' [--all] [--tags] [--contains] 
> [--abbrev=<n>] --dirty[=<mark>]
> +'git describe' <blob>
>
> DESCRIPTION
> -----------
> @@ -24,6 +24,12 @@ By default (without --all or --tags) `git describe` 
> only shows
> annotated tags.  For more information about creating annotated tags
> see the -a and -s options to linkgit:git-tag[1].
>
> +If the given object refers to a blob, it will be described
> +as `<commit-ish>:<path>`, such that the blob can be found
> +at `<path>` in the `<commit-ish>`, which itself describes the
> +first commit in which this blob occurs in a reverse revision walk
> +from HEAD.
> +
> OPTIONS
> -------
> <commit-ish>...::
> @@ -186,6 +192,14 @@ selected and output.  Here fewest commits different 
> is defined as
> the number of commits which would be shown by `git log tag..input`
> will be the smallest number of commits possible.
>
> +BUGS
> +----
> +
> +Tree objects as well as tag objects not pointing at commits, cannot be 
> described.

Is this true? Is it stand alone from the describing of a blob? If so should 
it be its own patchlet. - I thought I'd read that within the series there is 
now a tree / tag (of blob/trees) description capability.

I'd prefer that we don't start with the "can't" view (relative to the 
subsequent sentences of the paragraph). It puts off the reader - we are 
about to say what can be described but in a limited way - the limitation 
being the bug. Maybe just swap the line to form a second paragraph.

> +When describing blobs, the lightweight tags pointing at blobs are 
> ignored,
> +but the blob is still described as <committ-ish>:<path> despite the 
> lightweight
> +tag being favorable.
> +
--
Philip

> GIT
> ---
> Part of the linkgit:git[1] suite
> diff --git a/builtin/describe.c b/builtin/describe.c
> index 9e9a5ed5d4..5b4bfaba3f 100644
> --- a/builtin/describe.c
> +++ b/builtin/describe.c
> @@ -3,6 +3,7 @@
> #include "lockfile.h"
> #include "commit.h"
> #include "tag.h"
> +#include "blob.h"
> #include "refs.h"
> #include "builtin.h"
> #include "exec_cmd.h"
> @@ -11,8 +12,9 @@
> #include "hashmap.h"
> #include "argv-array.h"
> #include "run-command.h"
> +#include "revision.h"
> +#include "list-objects.h"
>
> -#define SEEN (1u << 0)
> #define MAX_TAGS (FLAG_BITS - 1)
>
> static const char * const describe_usage[] = {
> @@ -434,6 +436,53 @@ static void describe_commit(struct object_id *oid, 
> struct strbuf *dst)
>  strbuf_addstr(dst, suffix);
> }
>
> +struct process_commit_data {
> + struct object_id current_commit;
> + struct object_id looking_for;
> + struct strbuf *dst;
> + struct rev_info *revs;
> +};
> +
> +static void process_commit(struct commit *commit, void *data)
> +{
> + struct process_commit_data *pcd = data;
> + pcd->current_commit = commit->object.oid;
> +}
> +
> +static void process_object(struct object *obj, const char *path, void 
> *data)
> +{
> + struct process_commit_data *pcd = data;
> +
> + if (!oidcmp(&pcd->looking_for, &obj->oid) && !pcd->dst->len) {
> + reset_revision_walk();
> + describe_commit(&pcd->current_commit, pcd->dst);
> + strbuf_addf(pcd->dst, ":%s", path);
> + free_commit_list(pcd->revs->commits);
> + pcd->revs->commits = NULL;
> + }
> +}
> +
> +static void describe_blob(struct object_id oid, struct strbuf *dst)
> +{
> + struct rev_info revs;
> + struct argv_array args = ARGV_ARRAY_INIT;
> + struct process_commit_data pcd = { null_oid, oid, dst, &revs};
> +
> + argv_array_pushl(&args, "internal: The first arg is not parsed",
> + "--objects", "--in-commit-order", "--reverse", "HEAD",
> + NULL);
> +
> + init_revisions(&revs, NULL);
> + if (setup_revisions(args.argc, args.argv, &revs, NULL) > 1)
> + BUG("setup_revisions could not handle all args?");
> +
> + if (prepare_revision_walk(&revs))
> + die("revision walk setup failed");
> +
> + traverse_commit_list(&revs, process_commit, process_object, &pcd);
> + reset_revision_walk();
> +}
> +
> static void describe(const char *arg, int last_one)
> {
>  struct object_id oid;
> @@ -445,11 +494,14 @@ static void describe(const char *arg, int last_one)
>
>  if (get_oid(arg, &oid))
>  die(_("Not a valid object name %s"), arg);
> - cmit = lookup_commit_reference(&oid);
> - if (!cmit)
> - die(_("%s is not a valid '%s' object"), arg, commit_type);
> + cmit = lookup_commit_reference_gently(&oid, 1);
>
> - describe_commit(&oid, &sb);
> + if (cmit)
> + describe_commit(&oid, &sb);
> + else if (lookup_blob(&oid))
> + describe_blob(oid, &sb);
> + else
> + die(_("%s is neither a commit nor blob"), arg);
>
>  puts(sb.buf);
>
> diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
> index c8b7ed82d9..4668f0058e 100755
> --- a/t/t6120-describe.sh
> +++ b/t/t6120-describe.sh
> @@ -310,6 +310,40 @@ test_expect_success 'describe ignoring a broken 
> submodule' '
>  grep broken out
> '
>
> +test_expect_success 'describe a blob at a directly tagged commit' '
> + echo "make it a unique blob" >file &&
> + git add file && git commit -m "content in file" &&
> + git tag -a -m "latest annotated tag" unique-file &&
> + git describe HEAD:file >actual &&
> + echo "unique-file:file" >expect &&
> + test_cmp expect actual
> +'
> +
> +test_expect_success 'describe a blob with its first introduction' '
> + git commit --allow-empty -m "empty commit" &&
> + git rm file &&
> + git commit -m "delete blob" &&
> + git revert HEAD &&
> + git commit --allow-empty -m "empty commit" &&
> + git describe HEAD:file >actual &&
> + echo "unique-file:file" >expect &&
> + test_cmp expect actual
> +'
> +
> +test_expect_success 'describe directly tagged blob' '
> + git tag test-blob unique-file:file &&
> + git describe test-blob >actual &&
> + echo "unique-file:file" >expect &&
> + # suboptimal: we rather want to see "test-blob"
> + test_cmp expect actual
> +'
> +
> +test_expect_success 'describe tag object' '
> + git tag test-blob-1 -a -m msg unique-file:file &&
> + test_must_fail git describe test-blob-1 2>actual &&
> + grep "fatal: test-blob-1 is neither a commit nor blob" actual
> +'
> +
> test_expect_failure ULIMIT_STACK_SIZE 'name-rev works in a deep repo' '
>  i=1 &&
>  while test $i -lt 8000
> -- 
> 2.15.0.128.gcadd42da22
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv5 7/7] builtin/describe.c: describe a blob
  2017-11-16 19:34     ` Stefan Beller
@ 2017-11-22  7:55       ` Junio C Hamano
  2017-11-22 17:00         ` Stefan Beller
  0 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2017-11-22  7:55 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Jonathan Tan

Stefan Beller <sbeller@google.com> writes:

> ...
> fixed.
> ...
> fixed the text...
> ...
> I am not fully convinced all descriptions are in recent history, but I
> tend to agree that most are, so probably the trade off is a wash.

So what do we want with this topic?  I think the "teach 'git log' to
highlight commits whose changes involve the given blob" is a more or
less an orthogonal thing, and I suspect that it is something users
may (although I personally do not) find valuable to have a related
but different feature in "git describe".



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv5 7/7] builtin/describe.c: describe a blob
  2017-11-22  7:55       ` Junio C Hamano
@ 2017-11-22 17:00         ` Stefan Beller
  2017-11-24  7:18           ` Junio C Hamano
  0 siblings, 1 reply; 19+ messages in thread
From: Stefan Beller @ 2017-11-22 17:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jonathan Tan

On Tue, Nov 21, 2017 at 11:55 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> ...
>> fixed.
>> ...
>> fixed the text...
>> ...
>> I am not fully convinced all descriptions are in recent history, but I
>> tend to agree that most are, so probably the trade off is a wash.
>
> So what do we want with this topic?  I think the "teach 'git log' to
> highlight commits whose changes involve the given blob" is a more or
> less an orthogonal thing,

Well, both of them solve our immediate needs, so I'd be fine with pursuing
just one of them, but I do not oppose taking both.

> and I suspect that it is something users
> may (although I personally do not) find valuable to have a related
> but different feature in "git describe".

agreed.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv5 7/7] builtin/describe.c: describe a blob
  2017-11-22 17:00         ` Stefan Beller
@ 2017-11-24  7:18           ` Junio C Hamano
  2017-11-27 19:38             ` Stefan Beller
  0 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2017-11-24  7:18 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Jonathan Tan

Stefan Beller <sbeller@google.com> writes:

> On Tue, Nov 21, 2017 at 11:55 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Stefan Beller <sbeller@google.com> writes:
>>
>>> ...
>>> fixed.
>>> ...
>>> fixed the text...
>>> ...
>>> I am not fully convinced all descriptions are in recent history, but I
>>> tend to agree that most are, so probably the trade off is a wash.
>>
>> So what do we want with this topic?  I think the "teach 'git log' to
>> highlight commits whose changes involve the given blob" is a more or
>> less an orthogonal thing,
>
> Well, both of them solve our immediate needs, so I'd be fine with pursuing
> just one of them, but I do not oppose taking both.
>
>> and I suspect that it is something users
>> may (although I personally do not) find valuable to have a related
>> but different feature in "git describe".
>
> agreed.

I was reacting to your "fixed".  So will we see a rerolled series or
not?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv5 7/7] builtin/describe.c: describe a blob
  2017-11-24  7:18           ` Junio C Hamano
@ 2017-11-27 19:38             ` Stefan Beller
  2017-11-27 23:37               ` Junio C Hamano
  0 siblings, 1 reply; 19+ messages in thread
From: Stefan Beller @ 2017-11-27 19:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jonathan Tan

On Thu, Nov 23, 2017 at 11:18 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> On Tue, Nov 21, 2017 at 11:55 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>> Stefan Beller <sbeller@google.com> writes:
>>>
>>>> ...
>>>> fixed.
>>>> ...
>>>> fixed the text...
>>>> ...
>>>> I am not fully convinced all descriptions are in recent history, but I
>>>> tend to agree that most are, so probably the trade off is a wash.
>>>
>>> So what do we want with this topic?  I think the "teach 'git log' to
>>> highlight commits whose changes involve the given blob" is a more or
>>> less an orthogonal thing,
>>
>> Well, both of them solve our immediate needs, so I'd be fine with pursuing
>> just one of them, but I do not oppose taking both.
>>
>>> and I suspect that it is something users
>>> may (although I personally do not) find valuable to have a related
>>> but different feature in "git describe".
>>
>> agreed.
>
> I was reacting to your "fixed".  So will we see a rerolled series or
> not?

I was not planning on rerolling it.

FYI: I am on vacation until next Monday, so if there is anything
to be fixed here, I'd do it in a week, which may be enough time
to warrant incremental fixes?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv5 7/7] builtin/describe.c: describe a blob
  2017-11-27 19:38             ` Stefan Beller
@ 2017-11-27 23:37               ` Junio C Hamano
  0 siblings, 0 replies; 19+ messages in thread
From: Junio C Hamano @ 2017-11-27 23:37 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Jonathan Tan

Stefan Beller <sbeller@google.com> writes:

>> I was reacting to your "fixed".  So will we see a rerolled series or
>> not?
>
> I was not planning on rerolling it.

OK.  Then I do not have to worry about this one and drop it perhaps.
One less topic on 'pu' is always good, if it is not active.

Enjoy your vacation.

Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv5 7/7] builtin/describe.c: describe a blob
  2017-11-16  2:00 ` [PATCHv5 7/7] builtin/describe.c: describe a blob Stefan Beller
  2017-11-16  3:24   ` Junio C Hamano
  2017-11-20 15:10   ` Philip Oakley
@ 2017-12-19 19:22   ` Junio C Hamano
  2017-12-19 19:39     ` Stefan Beller
  2 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2017-12-19 19:22 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, jonathantanmy

I had to squash in the following to make 'pu' pass under
gettext-poison build.  Is this ready for 'next' otherwise?

With the "log --find-object" thing, it may be that this no longer is
needed, but then again we haven't done anything with the other
Jonathan's idea to unify the --find-object thing into the --pickaxe
framework.

It seems that we tend to open and then abandon new interests without
seeing them through completion, leaving too many loose ends untied.
This has to stop.

diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 4668f0058e..3e3fb462a0 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -341,7 +341,7 @@ test_expect_success 'describe directly tagged blob' '
 test_expect_success 'describe tag object' '
 	git tag test-blob-1 -a -m msg unique-file:file &&
 	test_must_fail git describe test-blob-1 2>actual &&
-	grep "fatal: test-blob-1 is neither a commit nor blob" actual
+	test_i18ngrep "fatal: test-blob-1 is neither a commit nor blob" actual
 '
 
 test_expect_failure ULIMIT_STACK_SIZE 'name-rev works in a deep repo' '

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCHv5 7/7] builtin/describe.c: describe a blob
  2017-12-19 19:22   ` Junio C Hamano
@ 2017-12-19 19:39     ` Stefan Beller
  2017-12-22 23:10       ` Junio C Hamano
  0 siblings, 1 reply; 19+ messages in thread
From: Stefan Beller @ 2017-12-19 19:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jonathan Tan

On Tue, Dec 19, 2017 at 11:22 AM, Junio C Hamano <gitster@pobox.com> wrote:
> I had to squash in the following to make 'pu' pass under
> gettext-poison build.  Is this ready for 'next' otherwise?

I saw that in pu, thanks for squashing. I should have spoken
up earlier confirming it.

> With the "log --find-object" thing, it may be that this no longer is
> needed, but then again we haven't done anything with the other
> Jonathan's idea to unify the --find-object thing into the --pickaxe
> framework.

I'll look into that after I finish looking at a submodule related bug.

> It seems that we tend to open and then abandon new interests without
> seeing them through completion, leaving too many loose ends untied.
> This has to stop.

I'll try to find my balance again.

When working on too few topics at the same time, sometimes I get stalled
waiting for the mailing list (or internal reviewers) to respond, so I start many
topics, which then overwhelm me after a while once reviews catch up.

In this specific case it was not clear what the best way is, i.e. the
interest was
rather broad: "Hey I have this blob sha1, how do I get more information?",
which can be solved in many different ways, so I tried some of them.
(Another alternative just now considered: I could have wrapped that
script from stackoverflow into a new command and called it a day, though that
has very poor benefits for the rest of ecosystem, an addition to an established
powerful command gives more benefits IMHO)

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv5 7/7] builtin/describe.c: describe a blob
  2017-12-19 19:39     ` Stefan Beller
@ 2017-12-22 23:10       ` Junio C Hamano
  0 siblings, 0 replies; 19+ messages in thread
From: Junio C Hamano @ 2017-12-22 23:10 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Jonathan Tan

Stefan Beller <sbeller@google.com> writes:

> On Tue, Dec 19, 2017 at 11:22 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> I had to squash in the following to make 'pu' pass under
>> gettext-poison build.  Is this ready for 'next' otherwise?
>
> I saw that in pu, thanks for squashing. I should have spoken
> up earlier confirming it.

So is this ready for 'next' if it is squashed?

>> With the "log --find-object" thing, it may be that this no longer is
>> needed, but then again we haven't done anything with the other
>> Jonathan's idea to unify the --find-object thing into the --pickaxe
>> framework.
>
> I'll look into that after I finish looking at a submodule related bug.

Thanks.

>> It seems that we tend to open and then abandon new interests without
>> seeing them through completion, leaving too many loose ends untied.
>> This has to stop.
>
> I'll try to find my balance again.

Sorry, but this was not really about you, but was more about me.  

I probably should more aggressively drop a topic that stalled and
also merge a topic that reached the point of diminishing returns to
'next'.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2017-12-22 23:10 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-16  2:00 [PATCHv5 0/7] describe blob Stefan Beller
2017-11-16  2:00 ` [PATCHv5 1/7] t6120: fix typo in test name Stefan Beller
2017-11-16  2:00 ` [PATCHv5 2/7] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller
2017-11-16  2:00 ` [PATCHv5 3/7] revision.h: introduce blob/tree walking in order of the commits Stefan Beller
2017-11-16  2:00 ` [PATCHv5 4/7] builtin/describe.c: rename `oid` to avoid variable shadowing Stefan Beller
2017-11-16  2:00 ` [PATCHv5 5/7] builtin/describe.c: print debug statements earlier Stefan Beller
2017-11-16  2:00 ` [PATCHv5 6/7] builtin/describe.c: factor out describe_commit Stefan Beller
2017-11-16  2:00 ` [PATCHv5 7/7] builtin/describe.c: describe a blob Stefan Beller
2017-11-16  3:24   ` Junio C Hamano
2017-11-16 19:34     ` Stefan Beller
2017-11-22  7:55       ` Junio C Hamano
2017-11-22 17:00         ` Stefan Beller
2017-11-24  7:18           ` Junio C Hamano
2017-11-27 19:38             ` Stefan Beller
2017-11-27 23:37               ` Junio C Hamano
2017-11-20 15:10   ` Philip Oakley
2017-12-19 19:22   ` Junio C Hamano
2017-12-19 19:39     ` Stefan Beller
2017-12-22 23:10       ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).