git@vger.kernel.org mailing list mirror (one of many)
From: Stefan Beller <sbeller@google.com>
To: gitster@pobox.com
Cc: Johannes.Schindelin@gmx.de, git@vger.kernel.org, jacob.keller@gmail.com, me@ikke.info, philipoakley@iee.org, sbeller@google.com
Subject: [PATCH] builtin/describe.c: describe a blob
Date: Fri, 10 Nov 2017 14:44:25 -0800
Message-ID: <20171110224425.15299-2-sbeller@google.com>
In-Reply-To: <20171110224425.15299-1-sbeller@google.com>

Sometimes users are given a hash of an object and they want to
identify it further (e.g., use verify-pack to find the largest blobs,
but what are these? Or see [1]).

When describing commits, we try to anchor them to tags or refs, as these
are conceptually on a higher level than the commit. If no ref or tag
matches exactly, we're out of luck, so we employ a heuristic to make up
a name for the commit. These names are ambiguous: there might be
different tags or refs to anchor to, and there might be different
paths in the DAG to travel to arrive at the commit precisely.
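
The made-up names have the shape `<tag>-<n>-g<abbrev>`: the anchor tag,
the number of commits on top of it, and the abbreviated object name. A
minimal sketch of that shape, assuming a hypothetical helper
`toy_format_name` (it is not a function in builtin/describe.c, and the
object name passed in the usage note is padded with made-up digits):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of Git: build a describe-style name from
 * an anchor tag, the number of commits on top of it, and the commit's
 * hexadecimal object name, abbreviated to `abbrev` characters.
 * Returns 0 on success, -1 if `out` is too small.
 */
static int toy_format_name(char *out, size_t outlen, const char *tag,
			   int depth, const char *hex, int abbrev)
{
	int n;

	assert(out && tag && hex && depth >= 0 && abbrev > 0);
	if (depth == 0)
		/* The tag points right at the commit: the tag name suffices. */
		n = snprintf(out, outlen, "%s", tag);
	else
		n = snprintf(out, outlen, "%s-%d-g%.*s", tag, depth, abbrev, hex);
	return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}
```

Calling `toy_format_name(buf, sizeof(buf), "v0.99", 5, "ab6625e06a0000", 10)`
fills `buf` with a name of the form used in the example in this message;
a depth of 0 yields the bare tag name.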

When describing a blob, we want to describe the blob from a higher layer
as well, which is a tuple of (commit, deep/path), as the tree objects
involved are rather uninteresting.  The same blob can be referenced by
multiple commits, so how do we decide which commit to use?  This patch
implements a rather naive approach: as there are no back pointers
from blobs to the commits in which they occur, we start walking from
any tips available, listing the blobs in the order of the commits, and
once we find the blob, we take the first commit that listed it.  For
source code this is likely not the commit that introduced the blob,
but rather the latest commit that contained the blob.  For example:

  git describe v0.99:Makefile
  v0.99-5-gab6625e06a:Makefile

tells us that the latest commit that contained the Makefile as it was in
tag v0.99 is commit v0.99-5-gab6625e06a (at the same path), as the next
commit on top, v0.99-6-gb1de9de2b9 ([PATCH] Bootstrap "make dist",
2005-07-11), touches the Makefile.
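
The walk described above can be modelled roughly as follows. This is a
toy sketch, not the code in this patch: `toy_entry`, `toy_commit`,
`toy_history` and `toy_describe_blob` are hypothetical names, the blob
ids are made up, and the history is shortened to three commits. It only
shows why the newest commit that still lists the blob wins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-ins for Git objects; none of these types exist in Git. */
struct toy_entry {
	const char *path;
	const char *blob;	/* made-up blob id */
};

struct toy_commit {
	const char *name;	/* what describing the commit would print */
	const struct toy_entry *entries;
	size_t nr;
};

static const struct toy_entry toy_new[] = { { "Makefile", "blob-new" } };
static const struct toy_entry toy_old[] = { { "Makefile", "blob-old" } };

/* Newest commit first: the order in which a walk from the tips visits them. */
static const struct toy_commit toy_history[] = {
	{ "v0.99-6-gb1de9de2b9", toy_new, 1 },
	{ "v0.99-5-gab6625e06a", toy_old, 1 },
	{ "v0.99", toy_old, 1 },
};

/*
 * Write "<commit>:<path>" for the first commit in walk order that lists
 * `blob`. Returns 0 on success, -1 if the blob is not found or `out` is
 * too small.
 */
static int toy_describe_blob(const struct toy_commit *history, size_t nr,
			     const char *blob, char *out, size_t outlen)
{
	assert(history && blob && out);
	for (size_t i = 0; i < nr; i++) {
		for (size_t j = 0; j < history[i].nr; j++) {
			int n;

			if (strcmp(history[i].entries[j].blob, blob))
				continue;
			n = snprintf(out, outlen, "%s:%s", history[i].name,
				     history[i].entries[j].path);
			return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
		}
	}
	return -1;
}
```

Looking up "blob-old" in `toy_history` stops at the second commit, since
the newest one already carries a different Makefile.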

Let's see how this description turns out and whether it is useful in
day-to-day use, as I have the intuition that we'd rather want to see the
*first* commit in which this blob was introduced to the repository
(which can be achieved easily by giving the `--reverse` flag in the
describe_blob rev walk).
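
To illustrate the difference the walk direction makes, here is a toy
sketch under stated assumptions: a linear history of one path, with
hypothetical commit names C1..C4 and made-up blob ids. `toy_rev`,
`toy_revs` and `toy_find` are illustrative names only and do not exist
in Git.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model: one path, one made-up blob id per commit, newest commit first. */
struct toy_rev {
	const char *commit;
	const char *blob;
};

static const struct toy_rev toy_revs[] = {
	{ "C4", "blob-new" },
	{ "C3", "blob-old" },
	{ "C2", "blob-old" },
	{ "C1", "blob-older" },
};

/*
 * Return the first commit in walk order whose blob matches, or NULL.
 * With `reverse` set, the walk runs oldest-first, so the match is the
 * commit that introduced the blob in this linear toy history.
 */
static const char *toy_find(const struct toy_rev *revs, size_t nr,
			    const char *blob, int reverse)
{
	assert(revs && blob);
	for (size_t k = 0; k < nr; k++) {
		size_t i = reverse ? nr - 1 - k : k;

		if (!strcmp(revs[i].blob, blob))
			return revs[i].commit;
	}
	return NULL;
}
```

For "blob-old", the forward walk answers C3 (the latest commit that
contains it) and the reversed walk answers C2 (the one that introduced
it). In a history with merges, or where a blob disappears and comes
back, the two notions are less clear-cut than in this sketch.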

[1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-describe.txt | 13 +++++++-
 builtin/describe.c             | 71 ++++++++++++++++++++++++++++++++++++++----
 t/t6120-describe.sh            | 15 +++++++++
 3 files changed, 92 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-describe.txt b/Documentation/git-describe.txt
index c924c945ba..a25443ca91 100644
--- a/Documentation/git-describe.txt
+++ b/Documentation/git-describe.txt
@@ -3,7 +3,7 @@ git-describe(1)
 
 NAME
 ----
-git-describe - Describe a commit using the most recent tag reachable from it
+git-describe - Describe a commit or blob using the graph relations
 
 
 SYNOPSIS
@@ -11,6 +11,7 @@ SYNOPSIS
 [verse]
 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] [<commit-ish>...]
 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] --dirty[=<mark>]
+'git describe' <blob>
 
 DESCRIPTION
 -----------
@@ -24,6 +25,16 @@ By default (without --all or --tags) `git describe` only shows
 annotated tags.  For more information about creating annotated tags
 see the -a and -s options to linkgit:git-tag[1].
 
+If the given object refers to a blob, it will be described
+as `<commit-ish>:<path>`, such that the blob can be found
+at `<path>` in the `<commit-ish>`. Note that the commit is likely
+not the commit that introduced the blob, but the one that was found
+first; to find the commit that introduced the blob, you need to find
+the commit that last touched the path, e.g.
+`git log <commit-description> -- <path>`.
+As blobs do not point at the commits they are contained in,
+describing blobs is slow as we have to walk the whole graph.
+
 OPTIONS
 -------
 <commit-ish>...::
diff --git a/builtin/describe.c b/builtin/describe.c
index 9e9a5ed5d4..acfd853a30 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -3,6 +3,7 @@
 #include "lockfile.h"
 #include "commit.h"
 #include "tag.h"
+#include "blob.h"
 #include "refs.h"
 #include "builtin.h"
 #include "exec_cmd.h"
@@ -11,8 +12,9 @@
 #include "hashmap.h"
 #include "argv-array.h"
 #include "run-command.h"
+#include "revision.h"
+#include "list-objects.h"
 
-#define SEEN		(1u << 0)
 #define MAX_TAGS	(FLAG_BITS - 1)
 
 static const char * const describe_usage[] = {
@@ -434,6 +436,56 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst)
 		strbuf_addstr(dst, suffix);
 }
 
+struct process_commit_data {
+	struct object_id current_commit;
+	struct object_id looking_for;
+	struct strbuf *dst;
+	struct rev_info *revs;
+};
+
+static void process_commit(struct commit *commit, void *data)
+{
+	struct process_commit_data *pcd = data;
+	pcd->current_commit = commit->object.oid;
+}
+
+static void process_object(struct object *obj, const char *path, void *data)
+{
+	struct process_commit_data *pcd = data;
+
+	if (!oidcmp(&pcd->looking_for, &obj->oid) && !pcd->dst->len) {
+		reset_revision_walk();
+		describe_commit(&pcd->current_commit, pcd->dst);
+		strbuf_addf(pcd->dst, ":%s", path);
+		pcd->revs->max_count = 0;
+	}
+}
+
+static void describe_blob(struct object_id oid, struct strbuf *dst)
+{
+	struct rev_info revs;
+	struct argv_array args = ARGV_ARRAY_INIT;
+	struct process_commit_data pcd = { null_oid, oid, dst, &revs};
+
+	argv_array_pushl(&args, "internal: The first arg is not parsed",
+		"--all", "--reflog", /* as many starting points as possible */
+		/* NEEDSWORK: --all is incompatible with worktrees for now: */
+		"--single-worktree",
+		"--objects",
+		"--in-commit-order",
+		NULL);
+
+	init_revisions(&revs, NULL);
+	if (setup_revisions(args.argc, args.argv, &revs, NULL) > 1)
+		BUG("setup_revisions could not handle all args?");
+
+	if (prepare_revision_walk(&revs))
+		die("revision walk setup failed");
+
+	traverse_commit_list(&revs, process_commit, process_object, &pcd);
+	reset_revision_walk();
+}
+
 static void describe(const char *arg, int last_one)
 {
 	struct object_id oid;
@@ -445,11 +497,18 @@ static void describe(const char *arg, int last_one)
 
 	if (get_oid(arg, &oid))
 		die(_("Not a valid object name %s"), arg);
-	cmit = lookup_commit_reference(&oid);
-	if (!cmit)
-		die(_("%s is not a valid '%s' object"), arg, commit_type);
-
-	describe_commit(&oid, &sb);
+	cmit = lookup_commit_reference_gently(&oid, 1);
+
+	if (cmit)
+		describe_commit(&oid, &sb);
+	else if (lookup_blob(&oid)) {
+		if (all || tags || longformat || first_parent ||
+		    patterns.nr || exclude_patterns.nr ||
+		    always || dirty || broken)
+			die(_("options not available for describing blobs"));
+		describe_blob(oid, &sb);
+	} else
+		die(_("%s is neither a commit nor blob"), arg);
 
 	puts(sb.buf);
 
diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index c8b7ed82d9..aec6ed192d 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -310,6 +310,21 @@ test_expect_success 'describe ignoring a broken submodule' '
 	grep broken out
 '
 
+test_expect_success 'describe a blob at a tag' '
+	echo "make it a unique blob" >file &&
+	git add file && git commit -m "content in file" &&
+	git tag -a -m "latest annotated tag" unique-file &&
+	git describe HEAD:file >actual &&
+	echo "unique-file:file" >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'describe a blob with commit-ish' '
+	git commit --allow-empty -m "empty commit" &&
+	git describe HEAD:file >actual &&
+	grep unique-file-1-g actual
+'
+
 test_expect_failure ULIMIT_STACK_SIZE 'name-rev works in a deep repo' '
 	i=1 &&
 	while test $i -lt 8000
-- 
2.15.0.128.gcadd42da22



Thread overview: 110+ messages in thread
2017-10-28  0:44 [RFC PATCH 0/3] git-describe <blob> ? Stefan Beller
2017-10-28  0:45 ` [PATCH 1/3] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller
2017-10-28  0:45   ` [PATCH 2/3] revision.h: introduce blob/tree walking in order of the commits Stefan Beller
2017-10-28 17:20     ` Johannes Schindelin
2017-10-29  3:22       ` Stefan Beller
2017-10-29  3:23         ` Stefan Beller
2017-10-29  3:43           ` Junio C Hamano
2017-10-28  0:45   ` [PATCH 3/3] builtin/describe: describe blobs Stefan Beller
2017-10-28 17:32     ` Johannes Schindelin
2017-10-28 22:47       ` Jacob Keller
2017-10-29  3:28       ` Stefan Beller
2017-10-29 12:02         ` Kevin Daudt
2017-10-29 12:07         ` Johannes Schindelin
2017-10-28 17:15   ` [PATCH 1/3] list-objects.c: factor out traverse_trees_and_blobs Johannes Schindelin
2017-10-29  3:13     ` Stefan Beller
2017-10-28 16:04 ` [RFC PATCH 0/3] git-describe <blob> ? Johannes Schindelin
2017-10-31  0:33 ` [PATCH 0/7] git-describe <blob> Stefan Beller
2017-10-31  0:33   ` [PATCH 1/7] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller
2017-10-31  6:07     ` Junio C Hamano
2017-10-31  0:33   ` [PATCH 2/7] revision.h: introduce blob/tree walking in order of the commits Stefan Beller
2017-10-31  6:57     ` Junio C Hamano
2017-10-31 18:12       ` Stefan Beller
2017-10-31  0:33   ` [PATCH 3/7] builtin/describe.c: rename `oid` to avoid variable shadowing Stefan Beller
2017-10-31  8:15     ` Jacob Keller
2017-10-31  0:33   ` [PATCH 4/7] builtin/describe.c: print debug statements earlier Stefan Beller
2017-10-31  7:03     ` Junio C Hamano
2017-10-31 19:05       ` Stefan Beller
2017-10-31  0:33   ` [PATCH 5/7] builtin/describe.c: factor out describe_commit Stefan Beller
2017-10-31  0:33   ` [PATCH 6/7] builtin/describe.c: describe a blob Stefan Beller
2017-10-31  6:25     ` Junio C Hamano
2017-10-31 19:16       ` Stefan Beller
2017-11-01  3:34         ` Junio C Hamano
2017-11-01 20:58           ` Stefan Beller
2017-11-02  1:53             ` Junio C Hamano
2017-11-02  4:23               ` Junio C Hamano
2017-11-04 21:15                 ` Philip Oakley
2017-11-05  6:28                   ` Junio C Hamano
2017-11-06 23:50                     ` Philip Oakley
2017-11-09 20:30                       ` Stefan Beller
2017-11-10  0:25                         ` Philip Oakley
2017-11-10  1:24                           ` Junio C Hamano
2017-11-10 22:44                             ` [PATCH 0/1] describe a blob: with better docs Stefan Beller
2017-11-10 22:44                               ` Stefan Beller [this message]
2017-11-13  1:33                                 ` [PATCH] builtin/describe.c: describe a blob Junio C Hamano
2017-11-14 23:37                                   ` Stefan Beller
2017-11-20 15:22                             ` [PATCH 6/7] " Philip Oakley
2017-11-20 18:18                               ` Philip Oakley
2017-11-01  3:44         ` Junio C Hamano
2017-10-31  0:33   ` [PATCH 7/7] t6120: fix typo in test name Stefan Beller
2017-11-01  1:21     ` Junio C Hamano
2017-11-01 18:13       ` Stefan Beller
2017-11-02  1:36       ` Junio C Hamano
2017-10-31 21:18   ` [PATCHv2 0/7] git describe blob Stefan Beller
2017-10-31 21:18     ` [PATCHv2 1/7] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller
2017-11-01  3:46       ` Junio C Hamano
2017-10-31 21:18     ` [PATCHv2 2/7] revision.h: introduce blob/tree walking in order of the commits Stefan Beller
2017-11-01  3:50       ` Junio C Hamano
2017-11-01 12:26         ` Johannes Schindelin
2017-11-01 12:37           ` Junio C Hamano
2017-11-01 19:37             ` Stefan Beller
2017-11-01 22:08               ` Johannes Schindelin
2017-11-01 22:19                 ` Stefan Beller
2017-11-01 22:39                   ` Johannes Schindelin
2017-11-01 22:46                     ` Stefan Beller
2017-11-01 21:36             ` Johannes Schindelin
2017-11-01 21:39               ` Jeff King
2017-11-01 22:33                 ` Johannes Schindelin
2017-11-02  1:20                   ` Junio C Hamano
2017-10-31 21:18     ` [PATCHv2 3/7] builtin/describe.c: rename `oid` to avoid variable shadowing Stefan Beller
2017-10-31 21:18     ` [PATCHv2 4/7] builtin/describe.c: print debug statements earlier Stefan Beller
2017-10-31 21:31       ` Eric Sunshine
2017-10-31 21:18     ` [PATCHv2 5/7] builtin/describe.c: factor out describe_commit Stefan Beller
2017-10-31 21:18     ` [PATCHv2 6/7] builtin/describe.c: describe a blob Stefan Beller
2017-10-31 21:49       ` Eric Sunshine
2017-11-01 19:51         ` Stefan Beller
2017-11-01  4:11       ` Junio C Hamano
2017-11-01 12:32         ` Johannes Schindelin
2017-11-01 17:59           ` Stefan Beller
2017-11-01 21:05             ` Jacob Keller
2017-11-01 22:12               ` Johannes Schindelin
2017-11-01 22:21                 ` Stefan Beller
2017-11-01 22:41                   ` Johannes Schindelin
2017-11-01 22:53                     ` Stefan Beller
2017-11-02  6:05                     ` Jacob Keller
2017-11-03  5:18                       ` Junio C Hamano
2017-11-03  6:55                         ` Jacob Keller
2017-11-03 15:02                           ` Junio C Hamano
2017-11-02  7:23                     ` Andreas Schwab
2017-11-02 18:18                       ` Stefan Beller
2017-11-03 12:05                         ` Johannes Schindelin
2017-11-01 21:28         ` Stefan Beller
2017-10-31 21:18     ` [PATCHv2 7/7] t6120: fix typo in test name Stefan Beller
2017-11-01  5:14     ` [PATCHv2 0/7] git describe blob Junio C Hamano
2017-11-02 19:41     ` [PATCHv3 " Stefan Beller
2017-11-02 19:41       ` [PATCHv3 1/7] t6120: fix typo in test name Stefan Beller
2017-11-02 19:41       ` [PATCHv3 2/7] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller
2017-11-02 19:41       ` [PATCHv3 3/7] revision.h: introduce blob/tree walking in order of the commits Stefan Beller
2017-11-14 19:52         ` Jonathan Tan
2017-11-02 19:41       ` [PATCHv3 4/7] builtin/describe.c: rename `oid` to avoid variable shadowing Stefan Beller
2017-11-02 19:41       ` [PATCHv3 5/7] builtin/describe.c: print debug statements earlier Stefan Beller
2017-11-14 19:55         ` Jonathan Tan
2017-11-14 20:00           ` Stefan Beller
2017-11-02 19:41       ` [PATCHv3 6/7] builtin/describe.c: factor out describe_commit Stefan Beller
2017-11-02 19:41       ` [PATCHv3 7/7] builtin/describe.c: describe a blob Stefan Beller
2017-11-14 20:02         ` Jonathan Tan
2017-11-14 20:40           ` Stefan Beller
2017-11-14 21:17             ` Jonathan Tan
2017-11-03  0:23       ` [PATCHv3 0/7] git describe blob Jacob Keller
2017-11-03  1:46         ` Junio C Hamano
2017-11-03  2:29           ` Stefan Beller
