git@vger.kernel.org mailing list mirror (one of many)
 help / Atom feed
From: "Philip Oakley" <philipoakley@iee.org>
To: "Stefan Beller" <sbeller@google.com>, <git@vger.kernel.org>
Cc: <gitster@pobox.com>, <jonathantanmy@google.com>, "Stefan Beller" <sbeller@google.com>
Subject: Re: [PATCHv5 7/7] builtin/describe.c: describe a blob
Date: Mon, 20 Nov 2017 15:10:53 -0000
Message-ID: <B03A23EA75104E99BADBB420B6BEA614@PhilipOakley> (raw)
In-Reply-To: <20171116020039.17810-8-sbeller@google.com>

From: "Stefan Beller" <sbeller@google.com>
Sent: Thursday, November 16, 2017 2:00 AM

[in catch up mode..]

> Sometimes users are given a hash of an object and they want to
> identify it further (ex.: Use verify-pack to find the largest blobs,
> but what are these? or [1])
>
> When describing commits, we try to anchor them to tags or refs, as these
> are conceptually on a higher level than the commit. And if there is no ref
> or tag that matches exactly, we're out of luck.  So we employ a heuristic
> to make up a name for the commit. These names are ambiguous, there might
> be different tags or refs to anchor to, and there might be different
> path in the DAG to travel to arrive at the commit precisely.
>
> When describing a blob, we want to describe the blob from a higher layer
> as well, which is a tuple of (commit, deep/path) as the tree objects
> involved are rather uninteresting.  The same blob can be referenced by
> multiple commits, so how we decide which commit to use?  This patch
> implements a rather naive approach on this: As there are no back pointers
> from blobs to commits in which the blob occurs, we'll start walking from
> any tips available, listing the blobs in-order of the commit and once we
> found the blob, we'll take the first commit that listed the blob. For
> example
>
>  git describe --tags v0.99:Makefile
>  conversion-901-g7672db20c2:Makefile
>
> tells us the Makefile as it was in v0.99 was introduced in commit 
> 7672db20.
>
> The walking is performed in reverse order to show the introduction of a
> blob rather than its last occurrence.
>
> [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob
>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
> Documentation/git-describe.txt | 18 ++++++++++--
> builtin/describe.c             | 62 
> ++++++++++++++++++++++++++++++++++++++----
> t/t6120-describe.sh            | 34 +++++++++++++++++++++++
> 3 files changed, 107 insertions(+), 7 deletions(-)
>
> diff --git a/Documentation/git-describe.txt 
> b/Documentation/git-describe.txt
> index c924c945ba..e027fb8c4b 100644
> --- a/Documentation/git-describe.txt
> +++ b/Documentation/git-describe.txt
> @@ -3,14 +3,14 @@ git-describe(1)
>
> NAME
> ----
> -git-describe - Describe a commit using the most recent tag reachable from 
> it
> -
> +git-describe - Give an object a human readable name based on an available 
> ref
>
> SYNOPSIS
> --------
> [verse]
> 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] 
> [<commit-ish>...]
> 'git describe' [--all] [--tags] [--contains] 
> [--abbrev=<n>] --dirty[=<mark>]
> +'git describe' <blob>
>
> DESCRIPTION
> -----------
> @@ -24,6 +24,12 @@ By default (without --all or --tags) `git describe` 
> only shows
> annotated tags.  For more information about creating annotated tags
> see the -a and -s options to linkgit:git-tag[1].
>
> +If the given object refers to a blob, it will be described
> +as `<commit-ish>:<path>`, such that the blob can be found
> +at `<path>` in the `<commit-ish>`, which itself describes the
> +first commit in which this blob occurs in a reverse revision walk
> +from HEAD.
> +
> OPTIONS
> -------
> <commit-ish>...::
> @@ -186,6 +192,14 @@ selected and output.  Here fewest commits different 
> is defined as
> the number of commits which would be shown by `git log tag..input`
> will be the smallest number of commits possible.
>
> +BUGS
> +----
> +
> +Tree objects as well as tag objects not pointing at commits, cannot be 
> described.

Is this true? Is it stand alone from the describing of a blob? If so should 
it be its own patchlet. - I thought I'd read that within the series there is 
now a tree / tag (of blob/trees) description capability.

I'd prefer that we don't start with the "can't" view (relative to the 
subsequent sentences of the paragraph). It puts off the reader - we are 
about to say what can be described but in a limited way - the limitation 
being the bug. Maybe just swap the line to form a second paragraph.

> +When describing blobs, the lightweight tags pointing at blobs are 
> ignored,
> +but the blob is still described as <committ-ish>:<path> despite the 
> lightweight
> +tag being favorable.
> +
--
Philip

> GIT
> ---
> Part of the linkgit:git[1] suite
> diff --git a/builtin/describe.c b/builtin/describe.c
> index 9e9a5ed5d4..5b4bfaba3f 100644
> --- a/builtin/describe.c
> +++ b/builtin/describe.c
> @@ -3,6 +3,7 @@
> #include "lockfile.h"
> #include "commit.h"
> #include "tag.h"
> +#include "blob.h"
> #include "refs.h"
> #include "builtin.h"
> #include "exec_cmd.h"
> @@ -11,8 +12,9 @@
> #include "hashmap.h"
> #include "argv-array.h"
> #include "run-command.h"
> +#include "revision.h"
> +#include "list-objects.h"
>
> -#define SEEN (1u << 0)
> #define MAX_TAGS (FLAG_BITS - 1)
>
> static const char * const describe_usage[] = {
> @@ -434,6 +436,53 @@ static void describe_commit(struct object_id *oid, 
> struct strbuf *dst)
>  strbuf_addstr(dst, suffix);
> }
>
> +struct process_commit_data {
> + struct object_id current_commit;
> + struct object_id looking_for;
> + struct strbuf *dst;
> + struct rev_info *revs;
> +};
> +
> +static void process_commit(struct commit *commit, void *data)
> +{
> + struct process_commit_data *pcd = data;
> + pcd->current_commit = commit->object.oid;
> +}
> +
> +static void process_object(struct object *obj, const char *path, void 
> *data)
> +{
> + struct process_commit_data *pcd = data;
> +
> + if (!oidcmp(&pcd->looking_for, &obj->oid) && !pcd->dst->len) {
> + reset_revision_walk();
> + describe_commit(&pcd->current_commit, pcd->dst);
> + strbuf_addf(pcd->dst, ":%s", path);
> + free_commit_list(pcd->revs->commits);
> + pcd->revs->commits = NULL;
> + }
> +}
> +
> +static void describe_blob(struct object_id oid, struct strbuf *dst)
> +{
> + struct rev_info revs;
> + struct argv_array args = ARGV_ARRAY_INIT;
> + struct process_commit_data pcd = { null_oid, oid, dst, &revs};
> +
> + argv_array_pushl(&args, "internal: The first arg is not parsed",
> + "--objects", "--in-commit-order", "--reverse", "HEAD",
> + NULL);
> +
> + init_revisions(&revs, NULL);
> + if (setup_revisions(args.argc, args.argv, &revs, NULL) > 1)
> + BUG("setup_revisions could not handle all args?");
> +
> + if (prepare_revision_walk(&revs))
> + die("revision walk setup failed");
> +
> + traverse_commit_list(&revs, process_commit, process_object, &pcd);
> + reset_revision_walk();
> +}
> +
> static void describe(const char *arg, int last_one)
> {
>  struct object_id oid;
> @@ -445,11 +494,14 @@ static void describe(const char *arg, int last_one)
>
>  if (get_oid(arg, &oid))
>  die(_("Not a valid object name %s"), arg);
> - cmit = lookup_commit_reference(&oid);
> - if (!cmit)
> - die(_("%s is not a valid '%s' object"), arg, commit_type);
> + cmit = lookup_commit_reference_gently(&oid, 1);
>
> - describe_commit(&oid, &sb);
> + if (cmit)
> + describe_commit(&oid, &sb);
> + else if (lookup_blob(&oid))
> + describe_blob(oid, &sb);
> + else
> + die(_("%s is neither a commit nor blob"), arg);
>
>  puts(sb.buf);
>
> diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
> index c8b7ed82d9..4668f0058e 100755
> --- a/t/t6120-describe.sh
> +++ b/t/t6120-describe.sh
> @@ -310,6 +310,40 @@ test_expect_success 'describe ignoring a broken 
> submodule' '
>  grep broken out
> '
>
> +test_expect_success 'describe a blob at a directly tagged commit' '
> + echo "make it a unique blob" >file &&
> + git add file && git commit -m "content in file" &&
> + git tag -a -m "latest annotated tag" unique-file &&
> + git describe HEAD:file >actual &&
> + echo "unique-file:file" >expect &&
> + test_cmp expect actual
> +'
> +
> +test_expect_success 'describe a blob with its first introduction' '
> + git commit --allow-empty -m "empty commit" &&
> + git rm file &&
> + git commit -m "delete blob" &&
> + git revert HEAD &&
> + git commit --allow-empty -m "empty commit" &&
> + git describe HEAD:file >actual &&
> + echo "unique-file:file" >expect &&
> + test_cmp expect actual
> +'
> +
> +test_expect_success 'describe directly tagged blob' '
> + git tag test-blob unique-file:file &&
> + git describe test-blob >actual &&
> + echo "unique-file:file" >expect &&
> + # suboptimal: we rather want to see "test-blob"
> + test_cmp expect actual
> +'
> +
> +test_expect_success 'describe tag object' '
> + git tag test-blob-1 -a -m msg unique-file:file &&
> + test_must_fail git describe test-blob-1 2>actual &&
> + grep "fatal: test-blob-1 is neither a commit nor blob" actual
> +'
> +
> test_expect_failure ULIMIT_STACK_SIZE 'name-rev works in a deep repo' '
>  i=1 &&
>  while test $i -lt 8000
> -- 
> 2.15.0.128.gcadd42da22
> 


  parent reply index

Thread overview: 19+ messages in thread (expand / mbox.gz / Atom feed / [top])
2017-11-16  2:00 [PATCHv5 0/7] describe blob Stefan Beller
2017-11-16  2:00 ` [PATCHv5 1/7] t6120: fix typo in test name Stefan Beller
2017-11-16  2:00 ` [PATCHv5 2/7] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller
2017-11-16  2:00 ` [PATCHv5 3/7] revision.h: introduce blob/tree walking in order of the commits Stefan Beller
2017-11-16  2:00 ` [PATCHv5 4/7] builtin/describe.c: rename `oid` to avoid variable shadowing Stefan Beller
2017-11-16  2:00 ` [PATCHv5 5/7] builtin/describe.c: print debug statements earlier Stefan Beller
2017-11-16  2:00 ` [PATCHv5 6/7] builtin/describe.c: factor out describe_commit Stefan Beller
2017-11-16  2:00 ` [PATCHv5 7/7] builtin/describe.c: describe a blob Stefan Beller
2017-11-16  3:24   ` Junio C Hamano
2017-11-16 19:34     ` Stefan Beller
2017-11-22  7:55       ` Junio C Hamano
2017-11-22 17:00         ` Stefan Beller
2017-11-24  7:18           ` Junio C Hamano
2017-11-27 19:38             ` Stefan Beller
2017-11-27 23:37               ` Junio C Hamano
2017-11-20 15:10   ` Philip Oakley [this message]
2017-12-19 19:22   ` Junio C Hamano
2017-12-19 19:39     ` Stefan Beller
2017-12-22 23:10       ` Junio C Hamano

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply to all the recipients using the --to, --cc,
  and --in-reply-to switches of git-send-email(1):

  git send-email \
    --in-reply-to=B03A23EA75104E99BADBB420B6BEA614@PhilipOakley \
    --to=philipoakley@iee.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jonathantanmy@google.com \
    --cc=sbeller@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

git@vger.kernel.org mailing list mirror (one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.org/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/
       or Tor2web: https://www.tor2web.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox