* [RFC PATCH 0/3] git-describe <blob> ? @ 2017-10-28 0:44 Stefan Beller 2017-10-28 0:45 ` [PATCH 1/3] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller ` (2 more replies) 0 siblings, 3 replies; 110+ messages in thread From: Stefan Beller @ 2017-10-28 0:44 UTC (permalink / raw) To: git; +Cc: Stefan Beller Occasionally a user is given an object hash from a blob as an error message or other output (e.g. [1]). It would be useful to get a further description of such a blob, such as the (commit, path) tuple where this blob was introduced. This implements the answer in builtin/describe, but I am not sure if that is the right place. (One office mate argued it could be a "reverse-blame" that tells you all the trees/commits where the blob is referenced from). This is RFC for other reasons as well: tests, docs missing. Any feedback welcome, thanks, Stefan [1] https://stackoverflow.com/questions/10622179/how-to-find-identify-large-files-commits-in-git-history Stefan Beller (3): list-objects.c: factor out traverse_trees_and_blobs revision.h: introduce blob/tree walking in order of the commits builtin/describe: describe blobs builtin/describe.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++---- list-objects.c | 48 +++++++++++++++++++++++++++----------------- revision.h | 3 ++- 3 files changed, 86 insertions(+), 23 deletions(-) -- 2.15.0.rc2.443.gfcc3b81c0a ^ permalink raw reply [flat|nested] 110+ messages in thread
* [PATCH 1/3] list-objects.c: factor out traverse_trees_and_blobs 2017-10-28 0:44 [RFC PATCH 0/3] git-describe <blob> ? Stefan Beller @ 2017-10-28 0:45 ` Stefan Beller 2017-10-28 0:45 ` [PATCH 2/3] revision.h: introduce blob/tree walking in order of the commits Stefan Beller ` (2 more replies) 2017-10-28 16:04 ` [RFC PATCH 0/3] git-describe <blob> ? Johannes Schindelin 2017-10-31 0:33 ` [PATCH 0/7] git-describe <blob> Stefan Beller 2 siblings, 3 replies; 110+ messages in thread From: Stefan Beller @ 2017-10-28 0:45 UTC (permalink / raw) To: sbeller; +Cc: git With traverse_trees_and_blobs factored out of the main traverse function, the next patch can introduce an in-order revision walking with ease. Signed-off-by: Stefan Beller <sbeller@google.com> --- list-objects.c | 46 ++++++++++++++++++++++++++++------------------ 1 file changed, 28 insertions(+), 18 deletions(-) diff --git a/list-objects.c b/list-objects.c index b3931fa434..0ee0551604 100644 --- a/list-objects.c +++ b/list-objects.c @@ -183,25 +183,13 @@ static void add_pending_tree(struct rev_info *revs, struct tree *tree) add_pending_object(revs, &tree->object, ""); } -void traverse_commit_list(struct rev_info *revs, - show_commit_fn show_commit, - show_object_fn show_object, - void *data) +static void traverse_trees_and_blobs(struct rev_info *revs, + struct strbuf *base, + show_object_fn show_object, + void *data) { int i; - struct commit *commit; - struct strbuf base; - strbuf_init(&base, PATH_MAX); - while ((commit = get_revision(revs)) != NULL) { - /* - * an uninteresting boundary commit may not have its tree - * parsed yet, but we are not going to show them anyway - */ - if (commit->tree) - add_pending_tree(revs, commit->tree); - show_commit(commit, data); - } for (i = 0; i < revs->pending.nr; i++) { struct object_array_entry *pending = revs->pending.objects + i; struct object *obj = pending->item; @@ -218,17 +206,39 @@ void traverse_commit_list(struct rev_info *revs, path = ""; if (obj->type == OBJ_TREE) { process_tree(revs, (struct tree *)obj, show_object, - &base, path, data); + base, path, data); continue; } if (obj->type == OBJ_BLOB) { process_blob(revs, (struct blob *)obj, show_object, - &base, path, data); + base, path, data); continue; } die("unknown pending object %s (%s)", oid_to_hex(&obj->oid), name); } object_array_clear(&revs->pending); +} + +void traverse_commit_list(struct rev_info *revs, + show_commit_fn show_commit, + show_object_fn show_object, + void *data) +{ + struct commit *commit; + struct strbuf base; + strbuf_init(&base, PATH_MAX); + + while ((commit = get_revision(revs)) != NULL) { + /* + * an uninteresting boundary commit may not have its tree + * parsed yet, but we are not going to show them anyway + */ + if (commit->tree) + add_pending_tree(revs, commit->tree); + show_commit(commit, data); + } + traverse_trees_and_blobs(revs, &base, show_object, data); + strbuf_release(&base); } -- 2.15.0.rc2.443.gfcc3b81c0a ^ permalink raw reply related [flat|nested] 110+ messages in thread
* [PATCH 2/3] revision.h: introduce blob/tree walking in order of the commits 2017-10-28 0:45 ` [PATCH 1/3] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller @ 2017-10-28 0:45 ` Stefan Beller 2017-10-28 17:20 ` Johannes Schindelin 2017-10-28 0:45 ` [PATCH 3/3] builtin/describe: describe blobs Stefan Beller 2017-10-28 17:15 ` [PATCH 1/3] list-objects.c: factor out traverse_trees_and_blobs Johannes Schindelin 2 siblings, 1 reply; 110+ messages in thread From: Stefan Beller @ 2017-10-28 0:45 UTC (permalink / raw) To: sbeller; +Cc: git This will be useful shortly. Signed-off-by: Stefan Beller <sbeller@google.com> --- list-objects.c | 2 ++ revision.h | 3 ++- 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/list-objects.c b/list-objects.c index 0ee0551604..ef64a237d3 100644 --- a/list-objects.c +++ b/list-objects.c @@ -237,6 +237,8 @@ void traverse_commit_list(struct rev_info *revs, if (commit->tree) add_pending_tree(revs, commit->tree); show_commit(commit, data); + if (revs->tree_blobs_in_commit_order) + traverse_trees_and_blobs(revs, &base, show_object, data); } traverse_trees_and_blobs(revs, &base, show_object, data); diff --git a/revision.h b/revision.h index 54761200ad..86985d68aa 100644 --- a/revision.h +++ b/revision.h @@ -121,7 +121,8 @@ struct rev_info { bisect:1, ancestry_path:1, first_parent_only:1, - line_level_traverse:1; + line_level_traverse:1, + tree_blobs_in_commit_order:1; /* Diff flags */ unsigned int diff:1, -- 2.15.0.rc2.443.gfcc3b81c0a ^ permalink raw reply related [flat|nested] 110+ messages in thread
* Re: [PATCH 2/3] revision.h: introduce blob/tree walking in order of the commits 2017-10-28 0:45 ` [PATCH 2/3] revision.h: introduce blob/tree walking in order of the commits Stefan Beller @ 2017-10-28 17:20 ` Johannes Schindelin 2017-10-29 3:22 ` Stefan Beller 0 siblings, 1 reply; 110+ messages in thread From: Johannes Schindelin @ 2017-10-28 17:20 UTC (permalink / raw) To: Stefan Beller; +Cc: git Hi Stefan, On Fri, 27 Oct 2017, Stefan Beller wrote: > This will be useful shortly. Something tells me that I will hate you in the future when I read this commit message and lack the context (e.g. when blaming, where I cannot see the child commits let alone the comment in the Merge commit). How about: The functionality to list tree objects in the order they were seen while traversing the commits will be used in the next commit, where we teach `git describe` to describe not only commits, but trees and blobs, too. The diff itself is amazingly easy to review, and obviously correct. Ciao, Dscho ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 2/3] revision.h: introduce blob/tree walking in order of the commits 2017-10-28 17:20 ` Johannes Schindelin @ 2017-10-29 3:22 ` Stefan Beller 2017-10-29 3:23 ` Stefan Beller 0 siblings, 1 reply; 110+ messages in thread From: Stefan Beller @ 2017-10-29 3:22 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git On Sat, Oct 28, 2017 at 10:20 AM, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > Hi Stefan, > > On Fri, 27 Oct 2017, Stefan Beller wrote: > >> This will be useful shortly. > > Something tells me that I will hate you in the future when I read this > commit message and lack the context (e.g. when blaming, where I cannot see > the child commits let alone the comment in the Merge commit). Thanks for calling me out. (Not making excuses, but...) I remembered some other senior members having such commit messages, so I felt like I had to step up my game. https://public-inbox.org/git/70fbcd573f5c8a78a19a08ffc255437c36e7f49d.1495014840.git.mhagger@alum.mit.edu/ > > How about: > > The functionality to list tree objects in the order they were seen > while traversing the commits will be used in the next commit, > where we teach `git describe` to describe not only commits, but > trees and blobs, too. Sounds good. If only we had git-when-merged[1] as some easy upstream function, then the original commit message would provoke less hate as future us would have an easier time finding the next commit. [1] https://github.com/mhagger/git-when-merged When writing the patch I wondered if this could be useful for other use cases as well, i.e. do we want to have a command line argument to trigger this behavior? Thanks, Stefan ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 2/3] revision.h: introduce blob/tree walking in order of the commits 2017-10-29 3:22 ` Stefan Beller @ 2017-10-29 3:23 ` Stefan Beller 2017-10-29 3:43 ` Junio C Hamano 0 siblings, 1 reply; 110+ messages in thread From: Stefan Beller @ 2017-10-29 3:23 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git On Sat, Oct 28, 2017 at 8:22 PM, Stefan Beller <sbeller@google.com> wrote: > > (Not making excuses, but...) I remembered some other senior members > having such commit messages, so I felt like I had to step up my game. > > https://public-inbox.org/git/70fbcd573f5c8a78a19a08ffc255437c36e7f49d.1495014840.git.mhagger@alum.mit.edu/ I forgot to note though, that this never made it into the official tree, though. ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 2/3] revision.h: introduce blob/tree walking in order of the commits 2017-10-29 3:23 ` Stefan Beller @ 2017-10-29 3:43 ` Junio C Hamano 0 siblings, 0 replies; 110+ messages in thread From: Junio C Hamano @ 2017-10-29 3:43 UTC (permalink / raw) To: Stefan Beller; +Cc: Johannes Schindelin, git Stefan Beller <sbeller@google.com> writes: > On Sat, Oct 28, 2017 at 8:22 PM, Stefan Beller <sbeller@google.com> wrote: > >> >> (Not making excuses, but...) I remembered some other senior members >> having such commit messages, so I felt like I had to step up my game. >> >> https://public-inbox.org/git/70fbcd573f5c8a78a19a08ffc255437c36e7f49d.1495014840.git.mhagger@alum.mit.edu/ > > I forgot to note though, that this never made it into the official tree, though. Besides, leading by setting a better example would be the right approach to gain seniority (if that is your goal, that is) ;-). ^ permalink raw reply [flat|nested] 110+ messages in thread
* [PATCH 3/3] builtin/describe: describe blobs 2017-10-28 0:45 ` [PATCH 1/3] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller 2017-10-28 0:45 ` [PATCH 2/3] revision.h: introduce blob/tree walking in order of the commits Stefan Beller @ 2017-10-28 0:45 ` Stefan Beller 2017-10-28 17:32 ` Johannes Schindelin 2017-10-28 17:15 ` [PATCH 1/3] list-objects.c: factor out traverse_trees_and_blobs Johannes Schindelin 2 siblings, 1 reply; 110+ messages in thread From: Stefan Beller @ 2017-10-28 0:45 UTC (permalink / raw) To: sbeller; +Cc: git Sometimes users are given a hash of an object and they want to identify it further (ex.: Use verify-pack to find the largest blobs, but what are these? or [1]) The best identification of a blob hash is done via a its path at a given commit, which this implements. [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob Signed-off-by: Stefan Beller <sbeller@google.com> --- builtin/describe.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 54 insertions(+), 4 deletions(-) diff --git a/builtin/describe.c b/builtin/describe.c index 29075dbd0f..752de5843b 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -11,8 +11,9 @@ #include "hashmap.h" #include "argv-array.h" #include "run-command.h" +#include "revision.h" +#include "list-objects.h" -#define SEEN (1u << 0) #define MAX_TAGS (FLAG_BITS - 1) static const char * const describe_usage[] = { @@ -282,6 +283,50 @@ static void show_suffix(int depth, const struct object_id *oid) printf("-%d-g%s", depth, find_unique_abbrev(oid->hash, abbrev)); } +struct blob_descriptor { + struct object_id current_commit; + struct object_id looking_for; +}; + +static void process_commit(struct commit *commit, void *data) +{ + struct blob_descriptor *bd = data; + + bd->current_commit = commit->object.oid; +} + +static void process_object(struct object *obj, const char *name, void *data) +{ + struct blob_descriptor *bd = data; + + if (!oidcmp(&bd->looking_for, &obj->oid)) + printf(_("blob %s present at path %s in commit %s\n"), + oid_to_hex(&bd->looking_for), name, + oid_to_hex(&bd->current_commit)); +} + +static void describe_blob(struct object_id oid) +{ + struct rev_info revs; + struct argv_array args = ARGV_ARRAY_INIT; + struct blob_descriptor bd = { null_oid, oid }; + + argv_array_pushl(&args, "internal: The first arg is not parsed", + "--all", "--single-worktree", "--objects", NULL); + + revs.tree_blobs_in_commit_order = 1; + + init_revisions(&revs, NULL); + + if (setup_revisions(args.argc, args.argv, &revs, NULL) > 1) + BUG("setup_revisions could not handle all args?"); + + if (prepare_revision_walk(&revs)) + die("revision walk setup failed"); + + traverse_commit_list(&revs, process_commit, process_object, &bd); +} + static void describe(const char *arg, int last_one) { struct object_id oid; @@ -295,9 +340,14 @@ static void describe(const char *arg, int last_one) if (get_oid(arg, &oid)) die(_("Not a valid object name %s"), arg); - cmit = lookup_commit_reference(&oid); - if (!cmit) - die(_("%s is not a valid '%s' object"), arg, commit_type); + cmit = lookup_commit_reference_gently(&oid, 1); + if (!cmit) { + if (lookup_blob(&oid)) + describe_blob(oid); + else + die(_("%s is not a commit nor blob"), arg); + return; + } n = find_commit_name(&cmit->object.oid); if (n && (tags || all || n->prio == 2)) { -- 2.15.0.rc2.443.gfcc3b81c0a ^ permalink raw reply related [flat|nested] 110+ messages in thread
* Re: [PATCH 3/3] builtin/describe: describe blobs 2017-10-28 0:45 ` [PATCH 3/3] builtin/describe: describe blobs Stefan Beller @ 2017-10-28 17:32 ` Johannes Schindelin 2017-10-28 22:47 ` Jacob Keller 2017-10-29 3:28 ` Stefan Beller 0 siblings, 2 replies; 110+ messages in thread From: Johannes Schindelin @ 2017-10-28 17:32 UTC (permalink / raw) To: Stefan Beller; +Cc: git Hi Stefan, On Fri, 27 Oct 2017, Stefan Beller wrote: > Sometimes users are given a hash of an object and they want to identify > it further (ex.: Use verify-pack to find the largest blobs, but what are > these? or [1]) > > The best identification of a blob hash is done via a its path at a > given commit, which this implements. > > [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob I also came up with a script to do that: https://github.com/msysgit/msysgit/blob/master/bin/what-made-this-repo-so-large.sh Your method is much more elegant, of course (it describes the commit in the same run as it finds the object, and it does not output tons of stuff only to be filtered). > @@ -282,6 +283,50 @@ static void show_suffix(int depth, const struct object_id *oid) > printf("-%d-g%s", depth, find_unique_abbrev(oid->hash, abbrev)); > } > > +struct blob_descriptor { > + struct object_id current_commit; > + struct object_id looking_for; > +}; Personally, I would call this process_commit_data, but I do not mind too much about the name. > +static void process_object(struct object *obj, const char *name, void *data) > +{ > + struct blob_descriptor *bd = data; > + > + if (!oidcmp(&bd->looking_for, &obj->oid)) > + printf(_("blob %s present at path %s in commit %s\n"), > + oid_to_hex(&bd->looking_for), name, > + oid_to_hex(&bd->current_commit)); > +} s/name/path/ > @@ -295,9 +340,14 @@ static void describe(const char *arg, int last_one) > > if (get_oid(arg, &oid)) > die(_("Not a valid object name %s"), arg); > - cmit = lookup_commit_reference(&oid); > - if (!cmit) > - die(_("%s is not a valid '%s' object"), arg, commit_type); > + cmit = lookup_commit_reference_gently(&oid, 1); > + if (!cmit) { > + if (lookup_blob(&oid)) > + describe_blob(oid); > + else > + die(_("%s is not a commit nor blob"), arg); s/not/neither/ Nicely done, sir! I wonder whether it would make sense to extend this to tree objects while we are at it, but maybe that's an easy up-for-grabs. Thank you very much! Dscho ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 3/3] builtin/describe: describe blobs 2017-10-28 17:32 ` Johannes Schindelin @ 2017-10-28 22:47 ` Jacob Keller 2017-10-29 3:28 ` Stefan Beller 1 sibling, 0 replies; 110+ messages in thread From: Jacob Keller @ 2017-10-28 22:47 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Stefan Beller, Git mailing list On Sat, Oct 28, 2017 at 10:32 AM, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > Hi Stefan, > > > Nicely done, sir! > > I wonder whether it would make sense to extend this to tree objects while > we are at it, but maybe that's an easy up-for-grabs. > > Thank you very much! > Dscho I'd very much like the same ability for trees and gitlinks as well! But it does seem it might be fairly easy. Thanks, Jake ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 3/3] builtin/describe: describe blobs 2017-10-28 17:32 ` Johannes Schindelin 2017-10-28 22:47 ` Jacob Keller @ 2017-10-29 3:28 ` Stefan Beller 2017-10-29 12:02 ` Kevin Daudt 2017-10-29 12:07 ` Johannes Schindelin 1 sibling, 2 replies; 110+ messages in thread From: Stefan Beller @ 2017-10-29 3:28 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git On Sat, Oct 28, 2017 at 10:32 AM, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > Hi Stefan, > > On Fri, 27 Oct 2017, Stefan Beller wrote: > >> Sometimes users are given a hash of an object and they want to identify >> it further (ex.: Use verify-pack to find the largest blobs, but what are >> these? or [1]) >> >> The best identification of a blob hash is done via a its path at a >> given commit, which this implements. >> >> [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob > > I also came up with a script to do that: > https://github.com/msysgit/msysgit/blob/master/bin/what-made-this-repo-so-large.sh > > Your method is much more elegant, of course (it describes the commit in the > same run as it finds the object, and it does not output tons of stuff only > to be filtered). That was the task I was given. Discussion here turned out that users mostly only ever care about commit object names, trees and blobs are only useful when specifically given. > >> @@ -282,6 +283,50 @@ static void show_suffix(int depth, const struct object_id *oid) >> printf("-%d-g%s", depth, find_unique_abbrev(oid->hash, abbrev)); >> } >> >> +struct blob_descriptor { >> + struct object_id current_commit; >> + struct object_id looking_for; >> +}; > > Personally, I would call this process_commit_data, but I do not mind too > much about the name. I'll take any naming suggestion, as this code was assembled in a hurry using copy/paste and trial&error as the high-tech methods used. >> +static void process_object(struct object *obj, const char *name, void *data) >> +{ >> + struct blob_descriptor *bd = data; >> + >> + if (!oidcmp(&bd->looking_for, &obj->oid)) >> + printf(_("blob %s present at path %s in commit %s\n"), >> + oid_to_hex(&bd->looking_for), name, >> + oid_to_hex(&bd->current_commit)); >> +} > > s/name/path/ > >> @@ -295,9 +340,14 @@ static void describe(const char *arg, int last_one) >> >> if (get_oid(arg, &oid)) >> die(_("Not a valid object name %s"), arg); >> - cmit = lookup_commit_reference(&oid); >> - if (!cmit) >> - die(_("%s is not a valid '%s' object"), arg, commit_type); >> + cmit = lookup_commit_reference_gently(&oid, 1); >> + if (!cmit) { >> + if (lookup_blob(&oid)) >> + describe_blob(oid); >> + else >> + die(_("%s is not a commit nor blob"), arg); > > s/not/neither/ > > Nicely done, sir! > > I wonder whether it would make sense to extend this to tree objects while > we are at it, but maybe that's an easy up-for-grabs. I can look into incorporating that, too. What is the use case though? (Is there any error message, common enough that users want to identify trees?) Thanks for the review, Stefan ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 3/3] builtin/describe: describe blobs 2017-10-29 3:28 ` Stefan Beller @ 2017-10-29 12:02 ` Kevin Daudt 2017-10-29 12:07 ` Johannes Schindelin 1 sibling, 0 replies; 110+ messages in thread From: Kevin Daudt @ 2017-10-29 12:02 UTC (permalink / raw) To: Stefan Beller; +Cc: Johannes Schindelin, git On Sat, Oct 28, 2017 at 08:28:54PM -0700, Stefan Beller wrote: > On Sat, Oct 28, 2017 at 10:32 AM, Johannes Schindelin > <Johannes.Schindelin@gmx.de> wrote: > > > > [..] > > I wonder whether it would make sense to extend this to tree objects while > > we are at it, but maybe that's an easy up-for-grabs. > > I can look into incorporating that, too. What is the use case though? > (Is there any error message, common enough that users want to > identify trees?) > > Thanks for the review, > Stefan Not sure if it's really helpfulp, but sometimes with corrup repos, git would complain about a certain object missing or corrupt, where it might be usefull to find out how it's referenced. Not sure though if this would work in that case, because the repo is already corrupt. Kevin ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 3/3] builtin/describe: describe blobs 2017-10-29 3:28 ` Stefan Beller 2017-10-29 12:02 ` Kevin Daudt @ 2017-10-29 12:07 ` Johannes Schindelin 1 sibling, 0 replies; 110+ messages in thread From: Johannes Schindelin @ 2017-10-29 12:07 UTC (permalink / raw) To: Stefan Beller; +Cc: git Hi Stefan, On Sat, 28 Oct 2017, Stefan Beller wrote: > On Sat, Oct 28, 2017 at 10:32 AM, Johannes Schindelin > <Johannes.Schindelin@gmx.de> wrote: > > > I wonder whether it would make sense to extend this to tree objects > > while we are at it, but maybe that's an easy up-for-grabs. > > I can look into incorporating that, too. What is the use case though? > (Is there any error message, common enough that users want to identify > trees?) I remember that I wanted to know in the past where that tree object came from that is now missing (back in the days when worktrees gc'ed away detached worktree HEADs and their reflogs), but on second thought, I guess `git describe` would go belly up in that case (die()ing due to the very same missing tree object)... Ciao, Dscho ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 1/3] list-objects.c: factor out traverse_trees_and_blobs 2017-10-28 0:45 ` [PATCH 1/3] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller 2017-10-28 0:45 ` [PATCH 2/3] revision.h: introduce blob/tree walking in order of the commits Stefan Beller 2017-10-28 0:45 ` [PATCH 3/3] builtin/describe: describe blobs Stefan Beller @ 2017-10-28 17:15 ` Johannes Schindelin 2017-10-29 3:13 ` Stefan Beller 2 siblings, 1 reply; 110+ messages in thread From: Johannes Schindelin @ 2017-10-28 17:15 UTC (permalink / raw) To: Stefan Beller; +Cc: git Hi Stefan, [I was intrigued enough by your work to postpone to later this coming week reading the What's cooking email in favor of reviewing your patch series.] On Fri, 27 Oct 2017, Stefan Beller wrote: > With traverse_trees_and_blobs factored out of the main traverse function, > the next patch can introduce an in-order revision walking with ease. Makes sense. > diff --git a/list-objects.c b/list-objects.c > index b3931fa434..0ee0551604 100644 > --- a/list-objects.c > +++ b/list-objects.c > @@ -183,25 +183,13 @@ static void add_pending_tree(struct rev_info *revs, struct tree *tree) > add_pending_object(revs, &tree->object, ""); > } > > -void traverse_commit_list(struct rev_info *revs, > - show_commit_fn show_commit, > - show_object_fn show_object, > - void *data) > +static void traverse_trees_and_blobs(struct rev_info *revs, > + struct strbuf *base, In the context of one function, it was obvious what `base` meant. Maybe we can call it `base_path` now? Thanks, Dscho ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 1/3] list-objects.c: factor out traverse_trees_and_blobs 2017-10-28 17:15 ` [PATCH 1/3] list-objects.c: factor out traverse_trees_and_blobs Johannes Schindelin @ 2017-10-29 3:13 ` Stefan Beller 0 siblings, 0 replies; 110+ messages in thread From: Stefan Beller @ 2017-10-29 3:13 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git On Sat, Oct 28, 2017 at 10:15 AM, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > Hi Stefan, > > [I was intrigued enough by your work to postpone to later this coming week > reading the What's cooking email in favor of reviewing your patch series.] > > On Fri, 27 Oct 2017, Stefan Beller wrote: > >> With traverse_trees_and_blobs factored out of the main traverse function, >> the next patch can introduce an in-order revision walking with ease. > > Makes sense. > >> diff --git a/list-objects.c b/list-objects.c >> index b3931fa434..0ee0551604 100644 >> --- a/list-objects.c >> +++ b/list-objects.c >> @@ -183,25 +183,13 @@ static void add_pending_tree(struct rev_info *revs, struct tree *tree) >> add_pending_object(revs, &tree->object, ""); >> } >> >> -void traverse_commit_list(struct rev_info *revs, >> - show_commit_fn show_commit, >> - show_object_fn show_object, >> - void *data) >> +static void traverse_trees_and_blobs(struct rev_info *revs, >> + struct strbuf *base, > > In the context of one function, it was obvious what `base` meant. Maybe we > can call it `base_path` now? I was intrigued to keep the base local to the factored out function, but that would mean, we'd have to allocate memory for it in every call. That I wanted to avoid, so the only reason to pass it in, is memory management. base_path sounds good, will rename. Thanks for the review! Stefan > > Thanks, > Dscho ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [RFC PATCH 0/3] git-describe <blob> ? 2017-10-28 0:44 [RFC PATCH 0/3] git-describe <blob> ? Stefan Beller 2017-10-28 0:45 ` [PATCH 1/3] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller @ 2017-10-28 16:04 ` Johannes Schindelin 2017-10-31 0:33 ` [PATCH 0/7] git-describe <blob> Stefan Beller 2 siblings, 0 replies; 110+ messages in thread From: Johannes Schindelin @ 2017-10-28 16:04 UTC (permalink / raw) To: Stefan Beller; +Cc: git Hi Stefan, On Fri, 27 Oct 2017, Stefan Beller wrote: > Occasionally a user is given an object hash from a blob as an error message > or other output (e.g. [1]). > > It would be useful to get a further description of such a blob, such as > the (commit, path) tuple where this blob was introduced. > > This implements the answer in builtin/describe, but I am not sure if that > is the right place. (One office mate argued it could be a "reverse-blame" > that tells you all the trees/commits where the blob is referenced from). > > This is RFC for other reasons as well: tests, docs missing. > > Any feedback welcome, As you ask so nicely, I'll just cheer (I do not have time to review it right now, but I really, really wanted this, I even started working on the name-rev side of things some time ago but eventually had to let things slide). Ciao, Dscho ^ permalink raw reply [flat|nested] 110+ messages in thread
* [PATCH 0/7] git-describe <blob> 2017-10-28 0:44 [RFC PATCH 0/3] git-describe <blob> ? Stefan Beller 2017-10-28 0:45 ` [PATCH 1/3] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller 2017-10-28 16:04 ` [RFC PATCH 0/3] git-describe <blob> ? Johannes Schindelin @ 2017-10-31 0:33 ` Stefan Beller 2017-10-31 0:33 ` [PATCH 1/7] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller ` (7 more replies) 2 siblings, 8 replies; 110+ messages in thread From: Stefan Beller @ 2017-10-31 0:33 UTC (permalink / raw) To: sbeller; +Cc: git, me, jacob.keller, Johannes.Schindelin This is not an RFC any more, but a serious series. Occasionally a user is given an object hash from a blob as an error message or other output (e.g. [1]). It would be useful to get a further description of such a blob, such as the (commit, path) tuple where this blob was introduced. This implements the answer in builtin/describe, however the heuristics are weak. See patch 6 for details. Any feedback welcome, Thanks, Stefan [1] https://stackoverflow.com/questions/10622179/how-to-find-identify-large-files-commits-in-git-history Stefan Beller (7): list-objects.c: factor out traverse_trees_and_blobs revision.h: introduce blob/tree walking in order of the commits builtin/describe.c: rename `oid` to avoid variable shadowing builtin/describe.c: print debug statements earlier builtin/describe.c: factor out describe_commit builtin/describe.c: describe a blob t6120: fix typo in test name Documentation/git-describe.txt | 12 +++- builtin/describe.c | 125 ++++++++++++++++++++++++++++++++--------- list-objects.c | 50 ++++++++++------- revision.c | 2 + revision.h | 3 +- t/t6100-rev-list-in-order.sh | 44 +++++++++++++++ t/t6120-describe.sh | 17 +++++- 7 files changed, 203 insertions(+), 50 deletions(-) create mode 100755 t/t6100-rev-list-in-order.sh -- 2.15.0.rc2.443.gfcc3b81c0a ^ permalink raw reply [flat|nested] 110+ messages in thread
* [PATCH 1/7] list-objects.c: factor out traverse_trees_and_blobs 2017-10-31 0:33 ` [PATCH 0/7] git-describe <blob> Stefan Beller @ 2017-10-31 0:33 ` Stefan Beller 2017-10-31 6:07 ` Junio C Hamano 2017-10-31 0:33 ` [PATCH 2/7] revision.h: introduce blob/tree walking in order of the commits Stefan Beller ` (6 subsequent siblings) 7 siblings, 1 reply; 110+ messages in thread From: Stefan Beller @ 2017-10-31 0:33 UTC (permalink / raw) To: sbeller; +Cc: git, me, jacob.keller, Johannes.Schindelin With traverse_trees_and_blobs factored out of the main traverse function, the next patch can introduce an in-order revision walking with ease. The variable holding the base path is only used in the newly factored out function `traverse_trees_and_blobs`, however we keep its scope to `traverse_commit_list` to keep the number of invocations for memory allocations and release to one per commit traversal. Rename the variable to `base_path` for clarity. Signed-off-by: Stefan Beller <sbeller@google.com> --- list-objects.c | 48 +++++++++++++++++++++++++++++------------------- 1 file changed, 29 insertions(+), 19 deletions(-) diff --git a/list-objects.c b/list-objects.c index b3931fa434..bf46f80dff 100644 --- a/list-objects.c +++ b/list-objects.c @@ -183,25 +183,13 @@ static void add_pending_tree(struct rev_info *revs, struct tree *tree) add_pending_object(revs, &tree->object, ""); } -void traverse_commit_list(struct rev_info *revs, - show_commit_fn show_commit, - show_object_fn show_object, - void *data) +static void traverse_trees_and_blobs(struct rev_info *revs, + struct strbuf *base_path, + show_object_fn show_object, + void *data) { int i; - struct commit *commit; - struct strbuf base; - strbuf_init(&base, PATH_MAX); - while ((commit = get_revision(revs)) != NULL) { - /* - * an uninteresting boundary commit may not have its tree - * parsed yet, but we are not going to show them anyway - */ - if (commit->tree) - add_pending_tree(revs, commit->tree); - show_commit(commit, data); - } for (i = 0; i < revs->pending.nr; i++) { struct object_array_entry *pending = revs->pending.objects + i; struct object *obj = pending->item; @@ -218,17 +206,39 @@ void traverse_commit_list(struct rev_info *revs, path = ""; if (obj->type == OBJ_TREE) { process_tree(revs, (struct tree *)obj, show_object, - &base, path, data); + base_path, path, data); continue; } if (obj->type == OBJ_BLOB) { process_blob(revs, (struct blob *)obj, show_object, - &base, path, data); + base_path, path, data); continue; } die("unknown pending object %s (%s)", oid_to_hex(&obj->oid), name); } object_array_clear(&revs->pending); - strbuf_release(&base); +} + +void traverse_commit_list(struct rev_info *revs, + show_commit_fn show_commit, + show_object_fn show_object, + void *data) +{ + struct commit *commit; + struct strbuf base_path; + strbuf_init(&base_path, PATH_MAX); + + while ((commit = get_revision(revs)) != NULL) { + /* + * an uninteresting boundary commit may not have its tree + * parsed yet, but we are not going to show them anyway + */ + if (commit->tree) + add_pending_tree(revs, commit->tree); + show_commit(commit, data); + } + traverse_trees_and_blobs(revs, &base_path, show_object, data); + + strbuf_release(&base_path); } -- 2.15.0.rc2.443.gfcc3b81c0a ^ permalink raw reply related [flat|nested] 110+ messages in thread
* Re: [PATCH 1/7] list-objects.c: factor out traverse_trees_and_blobs 2017-10-31 0:33 ` [PATCH 1/7] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller @ 2017-10-31 6:07 ` Junio C Hamano 0 siblings, 0 replies; 110+ messages in thread From: Junio C Hamano @ 2017-10-31 6:07 UTC (permalink / raw) To: Stefan Beller; +Cc: git, me, jacob.keller, Johannes.Schindelin Stefan Beller <sbeller@google.com> writes: > With traverse_trees_and_blobs factored out of the main traverse function, > the next patch can introduce an in-order revision walking with ease. > > The variable holding the base path is only used in the newly factored out > function `traverse_trees_and_blobs`, however we keep its scope to > `traverse_commit_list` to keep the number of invocations for memory > allocations and release to one per commit traversal. In this step, the caller is calling traverse_trees_and_blobs() only once per traversal anyway, so the paragraph does not quite justify this somewhat strange calling convention, unless the reader knows more calls will be made to the same function by the caller in later steps, potentially doubling the chance the buffer can be reused with this arrangement. Even after this function gains more callers, wouldn't the same invariant that "path" recorded for trees and blobs are relative to the root of tree recorded in the object in revs->pending.objects[]? IOW, should base_path->len be always 0 upon entry to this function? We may want to have an explicit _reset() or assert(->len == 0) at the beginning of the function to ensure that. > Rename the variable to `base_path` for clarity. Hmph. Inside traverse_commit_list(), which never looks at the contents of the buffer, it could just be called with a name that does not have any meaning (e.g. "scratch area for the callee to use"), and such a change would make it clear why it is defined one level above its use and the caller never looks at it. In the callee (i.e. inside traverse_trees_and_blobs()), however, I do not see a reason why base_path is any clearer than base. Other than that, this looks like a straight-forward code movement and looks good. ^ permalink raw reply [flat|nested] 110+ messages in thread
* [PATCH 2/7] revision.h: introduce blob/tree walking in order of the commits 2017-10-31 0:33 ` [PATCH 0/7] git-describe <blob> Stefan Beller 2017-10-31 0:33 ` [PATCH 1/7] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller @ 2017-10-31 0:33 ` Stefan Beller 2017-10-31 6:57 ` Junio C Hamano 2017-10-31 0:33 ` [PATCH 3/7] builtin/describe.c: rename `oid` to avoid variable shadowing Stefan Beller ` (5 subsequent siblings) 7 siblings, 1 reply; 110+ messages in thread From: Stefan Beller @ 2017-10-31 0:33 UTC (permalink / raw) To: sbeller; +Cc: git, me, jacob.keller, Johannes.Schindelin The functionality to list tree objects in the order they were seen while traversing the commits will be used in the next commit, where we teach `git describe` to describe not only commits, but trees and blobs, too. Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Stefan Beller <sbeller@google.com> --- list-objects.c | 2 ++ revision.c | 2 ++ revision.h | 3 ++- t/t6100-rev-list-in-order.sh | 44 ++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 50 insertions(+), 1 deletion(-) create mode 100755 t/t6100-rev-list-in-order.sh diff --git a/list-objects.c b/list-objects.c index bf46f80dff..5e114c9a8a 100644 --- a/list-objects.c +++ b/list-objects.c @@ -237,6 +237,8 @@ void traverse_commit_list(struct rev_info *revs, if (commit->tree) add_pending_tree(revs, commit->tree); show_commit(commit, data); + if (revs->tree_blobs_in_commit_order) + traverse_trees_and_blobs(revs, &base_path, show_object, data); } traverse_trees_and_blobs(revs, &base_path, show_object, data); diff --git a/revision.c b/revision.c index d167223e69..9329d4ebbf 100644 --- a/revision.c +++ b/revision.c @@ -1845,6 +1845,8 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg revs->dense = 0; } else if (!strcmp(arg, "--show-all")) { revs->show_all = 1; + } else if (!strcmp(arg, "--in-commit-order")) { + revs->tree_blobs_in_commit_order = 1; } else if (!strcmp(arg, "--remove-empty")) { revs->remove_empty_trees = 1; } else if (!strcmp(arg, "--merges")) { diff --git a/revision.h b/revision.h index 54761200ad..86985d68aa 100644 --- a/revision.h +++ b/revision.h @@ -121,7 +121,8 @@ struct rev_info { bisect:1, ancestry_path:1, first_parent_only:1, - line_level_traverse:1; + line_level_traverse:1, + tree_blobs_in_commit_order:1; /* Diff flags */ unsigned int diff:1, diff --git a/t/t6100-rev-list-in-order.sh b/t/t6100-rev-list-in-order.sh new file mode 100755 index 0000000000..67ebe815d2 --- /dev/null +++ b/t/t6100-rev-list-in-order.sh @@ -0,0 +1,44 @@ +#!/bin/sh + +test_description='miscellaneous rev-list tests' + +. ./test-lib.sh + + +test_expect_success 'setup' ' + for x in one two three four + do + echo $x >$x && + git add $x && + git commit -m "add file $x" + done && + for x in four three + do + git rm $x + git commit -m "remove $x" + done && + git rev-list --in-commit-order --objects HEAD >actual.raw && + cut -c 1-40 > actual < actual.raw && + + >expect && + git rev-parse HEAD^{commit} >>expect && + git rev-parse HEAD^{tree} >>expect && + git rev-parse HEAD^{tree}:one >>expect && + git rev-parse HEAD^{tree}:two >>expect && + git rev-parse HEAD~1^{commit} >>expect && + git rev-parse HEAD~1^{tree} >>expect && + git rev-parse HEAD~1^{tree}:three >>expect && + git rev-parse HEAD~2^{commit} >>expect && + git rev-parse HEAD~2^{tree} >>expect && + git rev-parse HEAD~2^{tree}:four >>expect && + git rev-parse HEAD~3^{commit} >>expect && + # skip HEAD~3^{tree} + git rev-parse HEAD~4^{commit} >>expect && + # skip HEAD~4^{tree} + git rev-parse HEAD~5^{commit} >>expect && + git rev-parse HEAD~5^{tree} >>expect && + + test_cmp expect actual +' + +test_done -- 2.15.0.rc2.443.gfcc3b81c0a ^ permalink raw reply related [flat|nested] 110+ messages in thread
* Re: [PATCH 2/7] revision.h: introduce blob/tree walking in order of the commits 2017-10-31 0:33 ` [PATCH 2/7] revision.h: introduce blob/tree walking in order of the commits Stefan Beller @ 2017-10-31 6:57 ` Junio C Hamano 2017-10-31 18:12 ` Stefan Beller 0 siblings, 1 reply; 110+ messages in thread From: Junio C Hamano @ 2017-10-31 6:57 UTC (permalink / raw) To: Stefan Beller; +Cc: git, me, jacob.keller, Johannes.Schindelin Stefan Beller <sbeller@google.com> writes: > diff --git a/list-objects.c b/list-objects.c > index bf46f80dff..5e114c9a8a 100644 > --- a/list-objects.c > +++ b/list-objects.c > @@ -237,6 +237,8 @@ void traverse_commit_list(struct rev_info *revs, > if (commit->tree) > add_pending_tree(revs, commit->tree); > show_commit(commit, data); > + if (revs->tree_blobs_in_commit_order) > + traverse_trees_and_blobs(revs, &base_path, show_object, data); > } > traverse_trees_and_blobs(revs, &base_path, show_object, data); This one is interesting. While we walk the ancestry chain, we may add the tree for each commit that we discover to the "pending. We used to sweep the pending array at the end after we run out of the commits, but now we do the sweeping as we process each commit. Would this make the last call to traverse_trees_and_blobs() always a no-op? I am not suggesting to add a new conditional in front of it; I am just asking for curiosity's sake. > diff --git a/t/t6100-rev-list-in-order.sh b/t/t6100-rev-list-in-order.sh > new file mode 100755 > index 0000000000..67ebe815d2 > --- /dev/null > +++ b/t/t6100-rev-list-in-order.sh > @@ -0,0 +1,44 @@ > +#!/bin/sh > + > +test_description='miscellaneous rev-list tests' > + > +. ./test-lib.sh > + > + > +test_expect_success 'setup' ' > + for x in one two three four > + do > + echo $x >$x && > + git add $x && > + git commit -m "add file $x" > + done && > + for x in four three > + do > + git rm $x > + git commit -m "remove $x" > + done && There is break in &&-chain in the second loop, but in any case, for loops and &&-chains do not mix very well unless done carefully. Imagine what happens if "git commit" in the first loop failed when x is two; it won't stop and keep going to create three and four and nobody would noice any error. > + git rev-list --in-commit-order --objects HEAD >actual.raw && > + cut -c 1-40 > actual < actual.raw && > + > + >expect && > + git rev-parse HEAD^{commit} >>expect && > + git rev-parse HEAD^{tree} >>expect && > +... > + git rev-parse HEAD~5^{tree} >>expect && Ooooh, ugly ;-). I wish we had --stdin interface or something. Short of that, I'd probably write this more like... git rev-parse >expect \ A B C D \ E F Even though we do not have --stdin for rev-parse, you can still do: git cat-file --batch-check='%(objectname)' >expect <<-\EOF && HEAD^{commit} HEAD^{tree} HEAD:one HEAD:two ... EOF > + test_cmp expect actual > +' > + > +test_done ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 2/7] revision.h: introduce blob/tree walking in order of the commits 2017-10-31 6:57 ` Junio C Hamano @ 2017-10-31 18:12 ` Stefan Beller 0 siblings, 0 replies; 110+ messages in thread From: Stefan Beller @ 2017-10-31 18:12 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Kevin Daudt, Jacob Keller, Johannes Schindelin On Mon, Oct 30, 2017 at 11:57 PM, Junio C Hamano <gitster@pobox.com> wrote: > Stefan Beller <sbeller@google.com> writes: > >> diff --git a/list-objects.c b/list-objects.c >> index bf46f80dff..5e114c9a8a 100644 >> --- a/list-objects.c >> +++ b/list-objects.c >> @@ -237,6 +237,8 @@ void traverse_commit_list(struct rev_info *revs, >> if (commit->tree) >> add_pending_tree(revs, commit->tree); >> show_commit(commit, data); >> + if (revs->tree_blobs_in_commit_order) >> + traverse_trees_and_blobs(revs, &base_path, show_object, data); >> } >> traverse_trees_and_blobs(revs, &base_path, show_object, data); > > This one is interesting. While we walk the ancestry chain, we may > add the tree for each commit that we discover to the "pending. We > used to sweep the pending array at the end after we run out of the > commits, but now we do the sweeping as we process each commit. > > Would this make the last call to traverse_trees_and_blobs() always a > no-op? That is my understanding of the code, the important part of the previous patch was the move of 'object_array_clear(&revs->pending);' into the `traverse_trees_and_blobs` function. > I am not suggesting to add a new conditional in front of it; > I am just asking for curiosity's sake. Thanks for being clear on that. I have the same taste for this. (Though for a split second I wondered if we can pull some esoteric trick to skip this, but I could not find any that is sane as well as readable.) >> +test_expect_success 'setup' ' >> + for x in one two three four >> + do >> + echo $x >$x && >> + git add $x && >> + git commit -m "add file $x" >> + done && >> + for x in four three >> + do >> + git rm $x >> + git commit -m "remove $x" >> + done && > > There is break in &&-chain in the second loop, but in any case, for > loops and &&-chains do not mix very well unless done carefully. > Imagine what happens if "git commit" in the first loop failed when > x is two; it won't stop and keep going to create three and four and > nobody would noice any error. > I'll fix that. > Even though we do not have --stdin for rev-parse, you can still do: > > git cat-file --batch-check='%(objectname)' >expect <<-\EOF && > HEAD^{commit} > HEAD^{tree} > HEAD:one > HEAD:two > ... > EOF sounds better than the current implementation. ^ permalink raw reply [flat|nested] 110+ messages in thread
* [PATCH 3/7] builtin/describe.c: rename `oid` to avoid variable shadowing 2017-10-31 0:33 ` [PATCH 0/7] git-describe <blob> Stefan Beller 2017-10-31 0:33 ` [PATCH 1/7] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller 2017-10-31 0:33 ` [PATCH 2/7] revision.h: introduce blob/tree walking in order of the commits Stefan Beller @ 2017-10-31 0:33 ` Stefan Beller 2017-10-31 8:15 ` Jacob Keller 2017-10-31 0:33 ` [PATCH 4/7] builtin/describe.c: print debug statements earlier Stefan Beller ` (4 subsequent siblings) 7 siblings, 1 reply; 110+ messages in thread From: Stefan Beller @ 2017-10-31 0:33 UTC (permalink / raw) To: sbeller; +Cc: git, me, jacob.keller, Johannes.Schindelin The function `describe` has already a variable named `oid` declared at the beginning of the function for an object id. Do now shadow that variable with a pointer to an object id. Signed-off-by: Stefan Beller <sbeller@google.com> --- builtin/describe.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/builtin/describe.c b/builtin/describe.c index 29075dbd0f..fd61f463cf 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -381,9 +381,9 @@ static void describe(const char *arg, int last_one) } if (!match_cnt) { - struct object_id *oid = &cmit->object.oid; + struct object_id *cmit_oid = &cmit->object.oid; if (always) { - printf("%s", find_unique_abbrev(oid->hash, abbrev)); + printf("%s", find_unique_abbrev(cmit_oid->hash, abbrev)); if (suffix) printf("%s", suffix); printf("\n"); @@ -392,11 +392,11 @@ static void describe(const char *arg, int last_one) if (unannotated_cnt) die(_("No annotated tags can describe '%s'.\n" "However, there were unannotated tags: try --tags."), - oid_to_hex(oid)); + oid_to_hex(cmit_oid)); else die(_("No tags can describe '%s'.\n" "Try --always, or create some tags."), - oid_to_hex(oid)); + oid_to_hex(cmit_oid)); } QSORT(all_matches, match_cnt, compare_pt); -- 2.15.0.rc2.443.gfcc3b81c0a ^ permalink raw reply related [flat|nested] 110+ messages in thread
* Re: [PATCH 3/7] builtin/describe.c: rename `oid` to avoid variable shadowing 2017-10-31 0:33 ` [PATCH 3/7] builtin/describe.c: rename `oid` to avoid variable shadowing Stefan Beller @ 2017-10-31 8:15 ` Jacob Keller 0 siblings, 0 replies; 110+ messages in thread From: Jacob Keller @ 2017-10-31 8:15 UTC (permalink / raw) To: Stefan Beller, sbeller; +Cc: git, me, Johannes.Schindelin On October 30, 2017 5:33:47 PM PDT, Stefan Beller <sbeller@google.com> wrote: >The function `describe` has already a variable named `oid` declared at >the beginning of the function for an object id. Do now shadow that Nit, s/now/not/ >variable with a pointer to an object id. > >Signed-off-by: Stefan Beller <sbeller@google.com> >--- > builtin/describe.c | 8 ++++---- > 1 file changed, 4 insertions(+), 4 deletions(-) > >diff --git a/builtin/describe.c b/builtin/describe.c >index 29075dbd0f..fd61f463cf 100644 >--- a/builtin/describe.c >+++ b/builtin/describe.c >@@ -381,9 +381,9 @@ static void describe(const char *arg, int last_one) > } > > if (!match_cnt) { >- struct object_id *oid = &cmit->object.oid; >+ struct object_id *cmit_oid = &cmit->object.oid; > if (always) { >- printf("%s", find_unique_abbrev(oid->hash, abbrev)); >+ printf("%s", find_unique_abbrev(cmit_oid->hash, abbrev)); > if (suffix) > printf("%s", suffix); > printf("\n"); >@@ -392,11 +392,11 @@ static void describe(const char *arg, int >last_one) > if (unannotated_cnt) > die(_("No annotated tags can describe '%s'.\n" > "However, there were unannotated tags: try --tags."), >- oid_to_hex(oid)); >+ oid_to_hex(cmit_oid)); > else > die(_("No tags can describe '%s'.\n" > "Try --always, or create some tags."), >- oid_to_hex(oid)); >+ oid_to_hex(cmit_oid)); > } > > QSORT(all_matches, match_cnt, compare_pt); -- Sent from my Android device with K-9 Mail. Please excuse my brevity. ^ permalink raw reply [flat|nested] 110+ messages in thread
* [PATCH 4/7] builtin/describe.c: print debug statements earlier 2017-10-31 0:33 ` [PATCH 0/7] git-describe <blob> Stefan Beller ` (2 preceding siblings ...) 2017-10-31 0:33 ` [PATCH 3/7] builtin/describe.c: rename `oid` to avoid variable shadowing Stefan Beller @ 2017-10-31 0:33 ` Stefan Beller 2017-10-31 7:03 ` Junio C Hamano 2017-10-31 0:33 ` [PATCH 5/7] builtin/describe.c: factor out describe_commit Stefan Beller ` (3 subsequent siblings) 7 siblings, 1 reply; 110+ messages in thread From: Stefan Beller @ 2017-10-31 0:33 UTC (permalink / raw) To: sbeller; +Cc: git, me, jacob.keller, Johannes.Schindelin For debuggers aid we'd want to print debug statements early, so introduce a new line in the debug output that describes the whole function, and the change the next debug output to describe why we need to search. Conveniently drop the arg from the second line; which will be useful in a follow up commit, that refactors the describe function. Signed-off-by: Stefan Beller <sbeller@google.com> --- builtin/describe.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/builtin/describe.c b/builtin/describe.c index fd61f463cf..3136efde31 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -293,6 +293,9 @@ static void describe(const char *arg, int last_one) unsigned long seen_commits = 0; unsigned int unannotated_cnt = 0; + if (debug) + fprintf(stderr, _("describe %s\n"), arg); + if (get_oid(arg, &oid)) die(_("Not a valid object name %s"), arg); cmit = lookup_commit_reference(&oid); @@ -316,7 +319,7 @@ static void describe(const char *arg, int last_one) if (!max_candidates) die(_("no tag exactly matches '%s'"), oid_to_hex(&cmit->object.oid)); if (debug) - fprintf(stderr, _("searching to describe %s\n"), arg); + fprintf(stderr, _("No exact match on refs or tags, searching to describe\n")); if (!have_util) { struct hashmap_iter iter; -- 2.15.0.rc2.443.gfcc3b81c0a ^ permalink raw reply related [flat|nested] 110+ messages in thread
* Re: [PATCH 4/7] builtin/describe.c: print debug statements earlier 2017-10-31 0:33 ` [PATCH 4/7] builtin/describe.c: print debug statements earlier Stefan Beller @ 2017-10-31 7:03 ` Junio C Hamano 2017-10-31 19:05 ` Stefan Beller 0 siblings, 1 reply; 110+ messages in thread From: Junio C Hamano @ 2017-10-31 7:03 UTC (permalink / raw) To: Stefan Beller; +Cc: git, me, jacob.keller, Johannes.Schindelin Stefan Beller <sbeller@google.com> writes: > For debuggers aid we'd want to print debug statements early, so > introduce a new line in the debug output that describes the whole > function, and the change the next debug output to describe why we need > to search. Conveniently drop the arg from the second line; which will > be useful in a follow up commit, that refactors the describe function. > > Signed-off-by: Stefan Beller <sbeller@google.com> > --- Adding an eary "entry" log may indeed be a good idea. This is not a new issue, but I do not think it is a good trade-off to burden i18n teams to translate debug-only messages, though. > builtin/describe.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/builtin/describe.c b/builtin/describe.c > index fd61f463cf..3136efde31 100644 > --- a/builtin/describe.c > +++ b/builtin/describe.c > @@ -293,6 +293,9 @@ static void describe(const char *arg, int last_one) > unsigned long seen_commits = 0; > unsigned int unannotated_cnt = 0; > > + if (debug) > + fprintf(stderr, _("describe %s\n"), arg); > + > if (get_oid(arg, &oid)) > die(_("Not a valid object name %s"), arg); > cmit = lookup_commit_reference(&oid); > @@ -316,7 +319,7 @@ static void describe(const char *arg, int last_one) > if (!max_candidates) > die(_("no tag exactly matches '%s'"), oid_to_hex(&cmit->object.oid)); > if (debug) > - fprintf(stderr, _("searching to describe %s\n"), arg); > + fprintf(stderr, _("No exact match on refs or tags, searching to describe\n")); > > if (!have_util) { > struct hashmap_iter iter; ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 4/7] builtin/describe.c: print debug statements earlier 2017-10-31 7:03 ` Junio C Hamano @ 2017-10-31 19:05 ` Stefan Beller 0 siblings, 0 replies; 110+ messages in thread From: Stefan Beller @ 2017-10-31 19:05 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Kevin Daudt, Jacob Keller, Johannes Schindelin On Tue, Oct 31, 2017 at 12:03 AM, Junio C Hamano <gitster@pobox.com> wrote: > Stefan Beller <sbeller@google.com> writes: > >> For debuggers aid we'd want to print debug statements early, so >> introduce a new line in the debug output that describes the whole >> function, and the change the next debug output to describe why we need >> to search. Conveniently drop the arg from the second line; which will >> be useful in a follow up commit, that refactors the describe function. >> >> Signed-off-by: Stefan Beller <sbeller@google.com> >> --- > > Adding an eary "entry" log may indeed be a good idea. This is not a > new issue, but I do not think it is a good trade-off to burden i18n > teams to translate debug-only messages, though. The original motivation here is to use the 'arg' argument so early in the function that the next patch can omit passing in 'arg' as the lines using 'arg' are not in the factored out function. 8713ab3079 (Improve git-describe performance by reducing revision listing., 2007-01-13) added the --debug flag, and also exposed it to the man page, so it is user visible, which is why we cannot keep these messages untranslated. > >> builtin/describe.c | 5 ++++- >> 1 file changed, 4 insertions(+), 1 deletion(-) >> >> diff --git a/builtin/describe.c b/builtin/describe.c >> index fd61f463cf..3136efde31 100644 >> --- a/builtin/describe.c >> +++ b/builtin/describe.c >> @@ -293,6 +293,9 @@ static void describe(const char *arg, int last_one) >> unsigned long seen_commits = 0; >> unsigned int unannotated_cnt = 0; >> >> + if (debug) >> + fprintf(stderr, _("describe %s\n"), arg); >> + >> if (get_oid(arg, &oid)) >> die(_("Not a valid object name %s"), arg); >> cmit = lookup_commit_reference(&oid); >> @@ -316,7 +319,7 @@ static void describe(const char *arg, int last_one) >> if (!max_candidates) >> die(_("no tag exactly matches '%s'"), oid_to_hex(&cmit->object.oid)); >> if (debug) >> - fprintf(stderr, _("searching to describe %s\n"), arg); >> + fprintf(stderr, _("No exact match on refs or tags, searching to describe\n")); >> >> if (!have_util) { >> struct hashmap_iter iter; ^ permalink raw reply [flat|nested] 110+ messages in thread
* [PATCH 5/7] builtin/describe.c: factor out describe_commit 2017-10-31 0:33 ` [PATCH 0/7] git-describe <blob> Stefan Beller ` (3 preceding siblings ...) 2017-10-31 0:33 ` [PATCH 4/7] builtin/describe.c: print debug statements earlier Stefan Beller @ 2017-10-31 0:33 ` Stefan Beller 2017-10-31 0:33 ` [PATCH 6/7] builtin/describe.c: describe a blob Stefan Beller ` (2 subsequent siblings) 7 siblings, 0 replies; 110+ messages in thread From: Stefan Beller @ 2017-10-31 0:33 UTC (permalink / raw) To: sbeller; +Cc: git, me, jacob.keller, Johannes.Schindelin In the next patch we'll learn how to describe more than just commits, so factor out describing commits into its own function. That will make the next patches easy as we still need to describe a commit as part of describing blobs. While factoring out the functionality to describe_commit, make sure that any output to stdout is put into a strbuf, which we can print afterwards, using puts which also adds the line ending. Signed-off-by: Stefan Beller <sbeller@google.com> --- builtin/describe.c | 63 ++++++++++++++++++++++++++++++++---------------------- 1 file changed, 37 insertions(+), 26 deletions(-) diff --git a/builtin/describe.c b/builtin/describe.c index 3136efde31..9e9a5ed5d4 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -256,7 +256,7 @@ static unsigned long finish_depth_computation( return seen_commits; } -static void display_name(struct commit_name *n) +static void append_name(struct commit_name *n, struct strbuf *dst) { if (n->prio == 2 && !n->tag) { n->tag = lookup_tag(&n->oid); @@ -272,19 +272,18 @@ static void display_name(struct commit_name *n) } if (n->tag) - printf("%s", n->tag->tag); + strbuf_addstr(dst, n->tag->tag); else - printf("%s", n->path); + strbuf_addstr(dst, n->path); } -static void show_suffix(int depth, const struct object_id *oid) +static void append_suffix(int depth, const struct object_id *oid, struct strbuf *dst) { - printf("-%d-g%s", depth, find_unique_abbrev(oid->hash, abbrev)); + strbuf_addf(dst, "-%d-g%s", depth, find_unique_abbrev(oid->hash, abbrev)); } -static void describe(const char *arg, int last_one) +static void describe_commit(struct object_id *oid, struct strbuf *dst) { - struct object_id oid; struct commit *cmit, *gave_up_on = NULL; struct commit_list *list; struct commit_name *n; @@ -293,26 +292,18 @@ static void describe(const char *arg, int last_one) unsigned long seen_commits = 0; unsigned int unannotated_cnt = 0; - if (debug) - fprintf(stderr, _("describe %s\n"), arg); - - if (get_oid(arg, &oid)) - die(_("Not a valid object name %s"), arg); - cmit = lookup_commit_reference(&oid); - if (!cmit) - die(_("%s is not a valid '%s' object"), arg, commit_type); + cmit = lookup_commit_reference(oid); n = find_commit_name(&cmit->object.oid); if (n && (tags || all || n->prio == 2)) { /* * Exact match to an existing ref. */ - display_name(n); + append_name(n, dst); if (longformat) - show_suffix(0, n->tag ? &n->tag->tagged->oid : &oid); + append_suffix(0, n->tag ? &n->tag->tagged->oid : oid, dst); if (suffix) - printf("%s", suffix); - printf("\n"); + strbuf_addstr(dst, suffix); return; } @@ -386,10 +377,9 @@ static void describe(const char *arg, int last_one) if (!match_cnt) { struct object_id *cmit_oid = &cmit->object.oid; if (always) { - printf("%s", find_unique_abbrev(cmit_oid->hash, abbrev)); + strbuf_addstr(dst, find_unique_abbrev(cmit_oid->hash, abbrev)); if (suffix) - printf("%s", suffix); - printf("\n"); + strbuf_addstr(dst, suffix); return; } if (unannotated_cnt) @@ -437,15 +427,36 @@ static void describe(const char *arg, int last_one) } } - display_name(all_matches[0].name); + append_name(all_matches[0].name, dst); if (abbrev) - show_suffix(all_matches[0].depth, &cmit->object.oid); + append_suffix(all_matches[0].depth, &cmit->object.oid, dst); if (suffix) - printf("%s", suffix); - printf("\n"); + strbuf_addstr(dst, suffix); +} + +static void describe(const char *arg, int last_one) +{ + struct object_id oid; + struct commit *cmit; + struct strbuf sb = STRBUF_INIT; + + if (debug) + fprintf(stderr, _("describe %s\n"), arg); + + if (get_oid(arg, &oid)) + die(_("Not a valid object name %s"), arg); + cmit = lookup_commit_reference(&oid); + if (!cmit) + die(_("%s is not a valid '%s' object"), arg, commit_type); + + describe_commit(&oid, &sb); + + puts(sb.buf); if (!last_one) clear_commit_marks(cmit, -1); + + strbuf_release(&sb); } int cmd_describe(int argc, const char **argv, const char *prefix) -- 2.15.0.rc2.443.gfcc3b81c0a ^ permalink raw reply related [flat|nested] 110+ messages in thread
* [PATCH 6/7] builtin/describe.c: describe a blob 2017-10-31 0:33 ` [PATCH 0/7] git-describe <blob> Stefan Beller ` (4 preceding siblings ...) 2017-10-31 0:33 ` [PATCH 5/7] builtin/describe.c: factor out describe_commit Stefan Beller @ 2017-10-31 0:33 ` Stefan Beller 2017-10-31 6:25 ` Junio C Hamano 2017-10-31 0:33 ` [PATCH 7/7] t6120: fix typo in test name Stefan Beller 2017-10-31 21:18 ` [PATCHv2 0/7] git describe blob Stefan Beller 7 siblings, 1 reply; 110+ messages in thread From: Stefan Beller @ 2017-10-31 0:33 UTC (permalink / raw) To: sbeller; +Cc: git, me, jacob.keller, Johannes.Schindelin Sometimes users are given a hash of an object and they want to identify it further (ex.: Use verify-pack to find the largest blobs, but what are these? or [1]) "This is an interesting endeavor, because describing things is hard." -- me, upon writing this patch. When describing commits, we try to anchor them to tags or refs, as these are conceptually on a higher level than the commit. And if there is no ref or tag that matches exactly, we're out of luck. So we employ a heuristic to make up a name for the commit. These names are ambivalent, there might be different tags or refs to anchor to, and there might be different path in the DAG to travel to arrive at the commit precisely. When describing a blob, we want to describe the blob from a higher layer as well, which is a tuple of (commit, deep/path) as the tree objects involved are rather uninteresting. The same blob can be referenced by multiple commits, so how we decide which commit to use? This patch implements a rather naive approach on this: As there are no back pointers from blobs to commits in which the blob occurs, we'll start walking from any tips available, listing the blobs in-order of the commit and once we found the blob, we'll take the first commit that listed the blob. For source code this is likely not the first commit that introduced the blob, but rather the latest commit that contained the blob. For example: git describe v0.99:Makefile v0.99-5-gab6625e06a:Makefile tells us the latest commit that contained the Makefile as it was in tag v0.99 is commit v0.99-5-gab6625e06a (and at the same path), as the next commit on top v0.99-6-gb1de9de2b9 ([PATCH] Bootstrap "make dist", 2005-07-11) touches the Makefile. Let's see how this description turns out, if it is useful in day-to-day use as I have the intuition that we'd rather want to see the *first* commit that this blob was introduced to the repository (which can be achieved easily by giving the `--reverse` flag in the describe_blob rev walk). [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob Signed-off-by: Stefan Beller <sbeller@google.com> --- Documentation/git-describe.txt | 12 +++++++- builtin/describe.c | 65 ++++++++++++++++++++++++++++++++++++++---- t/t6120-describe.sh | 15 ++++++++++ 3 files changed, 86 insertions(+), 6 deletions(-) diff --git a/Documentation/git-describe.txt b/Documentation/git-describe.txt index c924c945ba..3d618b2445 100644 --- a/Documentation/git-describe.txt +++ b/Documentation/git-describe.txt @@ -3,7 +3,7 @@ git-describe(1) NAME ---- -git-describe - Describe a commit using the most recent tag reachable from it +git-describe - Describe a commit or blob using the most recent tag reachable from it SYNOPSIS @@ -24,6 +24,16 @@ By default (without --all or --tags) `git describe` only shows annotated tags. For more information about creating annotated tags see the -a and -s options to linkgit:git-tag[1]. +If the given `<commit-ish>` refers to a blob, it will be described +as `<commit-description>:<path>`, such that the blob can be found +at `<path>` in the `<commit-ish`. Note, that the commit is likely +not the commit that introduced the blob, but the one that was found +first; to find the commit that introduced the blob, you need to find +the commit that last touched the path, e.g. +`git log <commit-description> -- <path>`. +As blobs do not point at the commits they are contained in, +describing blobs is slow as we have to walk the whole graph. + OPTIONS ------- <commit-ish>...:: diff --git a/builtin/describe.c b/builtin/describe.c index 9e9a5ed5d4..382cbae908 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -3,6 +3,7 @@ #include "lockfile.h" #include "commit.h" #include "tag.h" +#include "blob.h" #include "refs.h" #include "builtin.h" #include "exec_cmd.h" @@ -11,8 +12,9 @@ #include "hashmap.h" #include "argv-array.h" #include "run-command.h" +#include "revision.h" +#include "list-objects.h" -#define SEEN (1u << 0) #define MAX_TAGS (FLAG_BITS - 1) static const char * const describe_usage[] = { @@ -434,6 +436,54 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst) strbuf_addstr(dst, suffix); } +struct process_commit_data { + struct object_id current_commit; + struct object_id looking_for; + struct strbuf *dst; +}; + +static void process_commit(struct commit *commit, void *data) +{ + struct process_commit_data *pcd = data; + pcd->current_commit = commit->object.oid; +} + +static void process_object(struct object *obj, const char *path, void *data) +{ + struct process_commit_data *pcd = data; + + if (!oidcmp(&pcd->looking_for, &obj->oid) && !pcd->dst->len) { + reset_revision_walk(); + describe_commit(&pcd->current_commit, pcd->dst); + strbuf_addf(pcd->dst, ":%s", path); + } +} + +static void describe_blob(struct object_id oid, struct strbuf *dst) +{ + struct rev_info revs; + struct argv_array args = ARGV_ARRAY_INIT; + struct process_commit_data pcd = { null_oid, oid, dst}; + + argv_array_pushl(&args, "internal: The first arg is not parsed", + "--all", "--reflog", /* as many starting points as possible */ + /* NEEDSWORK: --all is incompatible with worktrees for now: */ + "--single-worktree", + "--objects", + "--in-commit-order", + NULL); + + init_revisions(&revs, NULL); + if (setup_revisions(args.argc, args.argv, &revs, NULL) > 1) + BUG("setup_revisions could not handle all args?"); + + if (prepare_revision_walk(&revs)) + die("revision walk setup failed"); + + traverse_commit_list(&revs, process_commit, process_object, &pcd); + reset_revision_walk(); +} + static void describe(const char *arg, int last_one) { struct object_id oid; @@ -445,11 +495,16 @@ static void describe(const char *arg, int last_one) if (get_oid(arg, &oid)) die(_("Not a valid object name %s"), arg); - cmit = lookup_commit_reference(&oid); - if (!cmit) - die(_("%s is not a valid '%s' object"), arg, commit_type); + cmit = lookup_commit_reference_gently(&oid, 1); - describe_commit(&oid, &sb); + if (cmit) { + describe_commit(&oid, &sb); + } else { + if (lookup_blob(&oid)) + describe_blob(oid, &sb); + else + die(_("%s is neither a commit nor blob"), arg); + } puts(sb.buf); diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh index 1c0e8659d9..3be01316e8 100755 --- a/t/t6120-describe.sh +++ b/t/t6120-describe.sh @@ -310,6 +310,21 @@ test_expect_success 'describe ignoring a borken submodule' ' grep broken out ' +test_expect_success 'describe a blob at a tag' ' + echo "make it a unique blob" >file && + git add file && git commit -m "content in file" && + git tag -a -m "latest annotated tag" unique-file && + git describe HEAD:file >actual && + echo "unique-file:file" >expect && + test_cmp expect actual +' + +test_expect_success 'describe a surviving blob' ' + git commit --allow-empty -m "empty commit" && + git describe HEAD:file >actual && + grep unique-file-1-g actual +' + test_expect_failure ULIMIT_STACK_SIZE 'name-rev works in a deep repo' ' i=1 && while test $i -lt 8000 -- 2.15.0.rc2.443.gfcc3b81c0a ^ permalink raw reply related [flat|nested] 110+ messages in thread
* Re: [PATCH 6/7] builtin/describe.c: describe a blob 2017-10-31 0:33 ` [PATCH 6/7] builtin/describe.c: describe a blob Stefan Beller @ 2017-10-31 6:25 ` Junio C Hamano 2017-10-31 19:16 ` Stefan Beller 0 siblings, 1 reply; 110+ messages in thread From: Junio C Hamano @ 2017-10-31 6:25 UTC (permalink / raw) To: Stefan Beller; +Cc: git, me, jacob.keller, Johannes.Schindelin Stefan Beller <sbeller@google.com> writes: > diff --git a/Documentation/git-describe.txt b/Documentation/git-describe.txt > index c924c945ba..3d618b2445 100644 > --- a/Documentation/git-describe.txt > +++ b/Documentation/git-describe.txt > @@ -3,7 +3,7 @@ git-describe(1) > > NAME > ---- > -git-describe - Describe a commit using the most recent tag reachable from it > +git-describe - Describe a commit or blob using the most recent tag reachable from it If I am not mistaken, this series is about describing a blob as a tuple of a recent commit-ish and a path in the tree in it. Blob never reaches anything, so "desribing blob using X reachable from it" is a mere nonsense. The original is not great in that it ignores the "--contains" mode and referring to "tagged commit" merely as "tag" for brevity, but at least it made some sense. > @@ -24,6 +24,16 @@ By default (without --all or --tags) `git describe` only shows > annotated tags. For more information about creating annotated tags > see the -a and -s options to linkgit:git-tag[1]. > > +If the given `<commit-ish>` refers to a blob, it will be described Perhaps this step should update the SYNOPSIS so that the command takes not just commit-ish but a blob too. Given the difficulty in coming up with the single-liner description of what it does we saw above, I suspect that splitting SYNOPSIS out into two very distinct operating mode might make it easier to read. SYNOPSIS -------- [verse] 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] [<commit-ish>...] +'git describe' [<options>...] <blob>... Then this additional paragraph can say "When describin a <blob>", without using a (technically nonsense) phrase "if <commit-ish> refers to a blob", which is never true. ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 6/7] builtin/describe.c: describe a blob 2017-10-31 6:25 ` Junio C Hamano @ 2017-10-31 19:16 ` Stefan Beller 2017-11-01 3:34 ` Junio C Hamano 2017-11-01 3:44 ` Junio C Hamano 0 siblings, 2 replies; 110+ messages in thread From: Stefan Beller @ 2017-10-31 19:16 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Kevin Daudt, Jacob Keller, Johannes Schindelin On Mon, Oct 30, 2017 at 11:25 PM, Junio C Hamano <gitster@pobox.com> wrote: > Stefan Beller <sbeller@google.com> writes: > >> diff --git a/Documentation/git-describe.txt b/Documentation/git-describe.txt >> index c924c945ba..3d618b2445 100644 >> --- a/Documentation/git-describe.txt >> +++ b/Documentation/git-describe.txt >> @@ -3,7 +3,7 @@ git-describe(1) >> >> NAME >> ---- >> -git-describe - Describe a commit using the most recent tag reachable from it >> +git-describe - Describe a commit or blob using the most recent tag reachable from it > > If I am not mistaken, this series is about describing a blob as a > tuple of a recent commit-ish and a path in the tree in it. Blob > never reaches anything, so "desribing blob using X reachable from > it" is a mere nonsense. Agreed, though the commitish used for the blob description may refer to a tag. > The original is not great in that it ignores the "--contains" mode > and referring to "tagged commit" merely as "tag" for brevity, but > at least it made some sense. Maybe "git-describe - Describe a blob or commit using graph relations" though that sounds too generic, but it is accurate as all we do is a heuristic for graph walk ways. >> @@ -24,6 +24,16 @@ By default (without --all or --tags) `git describe` only shows >> annotated tags. For more information about creating annotated tags >> see the -a and -s options to linkgit:git-tag[1]. >> >> +If the given `<commit-ish>` refers to a blob, it will be described > > Perhaps this step should update the SYNOPSIS so that the command > takes not just commit-ish but a blob too. That is a good idea, as the --contains would not work for blobs currently. > Given the difficulty in > coming up with the single-liner description of what it does we saw > above, I suspect that splitting SYNOPSIS out into two very distinct > operating mode might make it easier to read. > > SYNOPSIS > -------- > [verse] > 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] [<commit-ish>...] > +'git describe' [<options>...] <blob>... > > Then this additional paragraph can say "When describin a <blob>", > without using a (technically nonsense) phrase "if <commit-ish> > refers to a blob", which is never true. > ok, do we know about 'blob-ish' as a term? ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 6/7] builtin/describe.c: describe a blob 2017-10-31 19:16 ` Stefan Beller @ 2017-11-01 3:34 ` Junio C Hamano 2017-11-01 20:58 ` Stefan Beller 2017-11-01 3:44 ` Junio C Hamano 1 sibling, 1 reply; 110+ messages in thread From: Junio C Hamano @ 2017-11-01 3:34 UTC (permalink / raw) To: Stefan Beller; +Cc: git, Kevin Daudt, Jacob Keller, Johannes Schindelin Stefan Beller <sbeller@google.com> writes: >> Given the difficulty in >> coming up with the single-liner description of what it does we saw >> above, I suspect that splitting SYNOPSIS out into two very distinct >> operating mode might make it easier to read. >> >> SYNOPSIS >> -------- >> [verse] >> 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] [<commit-ish>...] >> +'git describe' [<options>...] <blob>... >> >> Then this additional paragraph can say "When describin a <blob>", >> without using a (technically nonsense) phrase "if <commit-ish> >> refers to a blob", which is never true. > > ok, do we know about 'blob-ish' as a term? No, and I do not think there is any need to say -ish at all for this use case. After all, when we accept a <commit> when a <tree-ish> is called for, that is only because there is only one way to use the commit in place of the wanted <tree>; we take the top-level tree contained in it. You cannot say you take <blob-ish> and take a <tree>, as it is unclear which entry in the <tree> can act as the substitute for the wanted <blob>. You accept blob object name in this mode, so just saying <blob> is sufficient. ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 6/7] builtin/describe.c: describe a blob 2017-11-01 3:34 ` Junio C Hamano @ 2017-11-01 20:58 ` Stefan Beller 2017-11-02 1:53 ` Junio C Hamano 0 siblings, 1 reply; 110+ messages in thread From: Stefan Beller @ 2017-11-01 20:58 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Kevin Daudt, Jacob Keller, Johannes Schindelin On Tue, Oct 31, 2017 at 8:34 PM, Junio C Hamano <gitster@pobox.com> wrote: > Stefan Beller <sbeller@google.com> writes: > >>> Given the difficulty in >>> coming up with the single-liner description of what it does we saw >>> above, I suspect that splitting SYNOPSIS out into two very distinct >>> operating mode might make it easier to read. >>> >>> SYNOPSIS >>> -------- >>> [verse] >>> 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] [<commit-ish>...] >>> +'git describe' [<options>...] <blob>... >>> >>> Then this additional paragraph can say "When describin a <blob>", >>> without using a (technically nonsense) phrase "if <commit-ish> >>> refers to a blob", which is never true. >> >> ok, do we know about 'blob-ish' as a term? > > No, and I do not think there is any need to say -ish at all for this > use case. > > After all, when we accept a <commit> when a <tree-ish> is called > for, that is only because there is only one way to use the commit in > place of the wanted <tree>; we take the top-level tree contained in > it. You cannot say you take <blob-ish> and take a <tree>, as it is > unclear which entry in the <tree> can act as the substitute for the > wanted <blob>. But now we have a path as well, the notation of <commit-ish> COLON <path> is not a unique description of the blob, because * there can be multiple <commit-ish>s depending on the tags and walking * in boilerplate code cases, we might even have the blob at different places (e.g. pristine copies of a license file in subdirectories) When calling for a tree-ish, we also accept commits and tags plus walking directions. So I find it hard to think we have to take in exact blob names, but we would also accept blob-ishs (i.e. commit+path) ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 6/7] builtin/describe.c: describe a blob 2017-11-01 20:58 ` Stefan Beller @ 2017-11-02 1:53 ` Junio C Hamano 2017-11-02 4:23 ` Junio C Hamano 0 siblings, 1 reply; 110+ messages in thread From: Junio C Hamano @ 2017-11-02 1:53 UTC (permalink / raw) To: Stefan Beller; +Cc: git, Kevin Daudt, Jacob Keller, Johannes Schindelin Stefan Beller <sbeller@google.com> writes: > But now we have a path as well, the notation of > <commit-ish> COLON <path> > is not a unique description of the blob, because > * there can be multiple <commit-ish>s depending on the tags and walking > * in boilerplate code cases, we might even have the blob at different > places (e.g. pristine copies of a license file in subdirectories) > > When calling for a tree-ish, we also accept commits and tags > plus walking directions. I think you are confused --- that is not what "-ish" suffix is used in our conversation on objects. The reason why we say "-ish" is "Yes we know v2.15.0 is *NOT* a commit object, we very well know it is a tag object, but because we allow it to be used in a context that calls for a commit object, we mark that use context as 'this accepts commit-ish, not just commit'". But what you call "walking direction", e.g. "HEAD:Makefile", names a "BLOB". It is not "this is something like BLOB, no, we very well know it is not a blob but the context allows such a non-blob to be fed in place of a blob, so we take that even though it is not a blob". Because it is. ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 6/7] builtin/describe.c: describe a blob 2017-11-02 1:53 ` Junio C Hamano @ 2017-11-02 4:23 ` Junio C Hamano 2017-11-04 21:15 ` Philip Oakley 0 siblings, 1 reply; 110+ messages in thread From: Junio C Hamano @ 2017-11-02 4:23 UTC (permalink / raw) To: Stefan Beller; +Cc: git, Kevin Daudt, Jacob Keller, Johannes Schindelin Junio C Hamano <gitster@pobox.com> writes: > The reason why we say "-ish" is "Yes we know v2.15.0 is *NOT* a > commit object, we very well know it is a tag object, but because we > allow it to be used in a context that calls for a commit object, we > mark that use context as 'this accepts commit-ish, not just > commit'". Having said all that, there is a valid case in which we might want to say "blob-ish". To review, X-ish is the word we use when the command wants to take an X, but tolerates a lazy user who gives a Y, which is *NOT* X, without bothering to add ^{X} suffix, i.e. Y^{X}. In such a case, the command takes not just X but takes X-ish because it takes a Y and converts it internally to an X to be extra nice. When the command wants to take a blob, but tolerates something else and does "^{blob}" internally, we can say it takes "blob-ish". Technically that "something else" could be an annotated tag that points at a blob object, without any intervening commit or tree (I did not check if the "describe <blob>" code in this thread handles this, though). But because it is not usually done to tag a blob directly, it would probably be not worth to say "blob-ish" in the document and cause readers to wonder in what situation something that is not a blob can be treated as if it were a blob. It does feel like we would be pursuing technical correctness too much and sacrificing the readability of the document, at least to me, and a bad trade-off. ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 6/7] builtin/describe.c: describe a blob 2017-11-02 4:23 ` Junio C Hamano @ 2017-11-04 21:15 ` Philip Oakley 2017-11-05 6:28 ` Junio C Hamano 0 siblings, 1 reply; 110+ messages in thread From: Philip Oakley @ 2017-11-04 21:15 UTC (permalink / raw) To: Junio C Hamano, Stefan Beller Cc: git, Kevin Daudt, Jacob Keller, Johannes Schindelin From: "Junio C Hamano" <gitster@pobox.com> Sent: Thursday, November 02, 2017 4:23 AM > Junio C Hamano <gitster@pobox.com> writes: > >> The reason why we say "-ish" is "Yes we know v2.15.0 is *NOT* a >> commit object, we very well know it is a tag object, but because we >> allow it to be used in a context that calls for a commit object, we >> mark that use context as 'this accepts commit-ish, not just >> commit'". > > Having said all that, there is a valid case in which we might want > to say "blob-ish". Is this not also an alternative case, relative to the user, for the scenario where the user has an oid/sha1 value but does not know what it is, and would like to find its source and type relative to the `describe` command. IIUC the existing `describe` command only accepts <commit-ish> values, and here we are extending that to be even more inclusive, but at the same time the options become more restricted. Thus the synopsis terminology would be more about suggesting the range of options available (search style/start points) that are applicable to blobs, than being exactly about the 'allow-blobs' parameter. Or have I misunderstood how the fast commit search and the slower potentially-a-blob searching are disambiguated? -- Philip > > To review, X-ish is the word we use when the command wants to take > an X, but tolerates a lazy user who gives a Y, which is *NOT* X, > without bothering to add ^{X} suffix, i.e. Y^{X}. In such a case, > the command takes not just X but takes X-ish because it takes a Y > and converts it internally to an X to be extra nice. > > When the command wants to take a blob, but tolerates something else > and does "^{blob}" internally, we can say it takes "blob-ish". > Technically that "something else" could be an annotated tag that > points at a blob object, without any intervening commit or tree (I > did not check if the "describe <blob>" code in this thread handles > this, though). > > But because it is not usually done to tag a blob directly, it would > probably be not worth to say "blob-ish" in the document and cause > readers to wonder in what situation something that is not a blob can > be treated as if it were a blob. It does feel like we would be > pursuing technical correctness too much and sacrificing the readability > of the document, at least to me, and a bad trade-off. > > ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 6/7] builtin/describe.c: describe a blob 2017-11-04 21:15 ` Philip Oakley @ 2017-11-05 6:28 ` Junio C Hamano 2017-11-06 23:50 ` Philip Oakley 0 siblings, 1 reply; 110+ messages in thread From: Junio C Hamano @ 2017-11-05 6:28 UTC (permalink / raw) To: Philip Oakley Cc: Stefan Beller, git, Kevin Daudt, Jacob Keller, Johannes Schindelin "Philip Oakley" <philipoakley@iee.org> writes: > Is this not also an alternative case, relative to the user, for the > scenario where the user has an oid/sha1 value but does not know what > it is, and would like to find its source and type relative to the > `describe` command. I am not sure what you wanted to say with "source and type RELATIVE TO the describe command". The first thing the combination of the user and the describe command would do when the user has a 40-hex string would be to do the equivalent of "cat-file -t" to learn if it even exists and what its type is. With Stefan's patch, that is what describe command does in order to choose quite a different codeflow from the traditional mode when it learns that it was given a blob. > IIUC the existing `describe` command only accepts <commit-ish> values, > and here we are extending that to be even more inclusive, but at the > same time the options become more restricted. Do you mean that the command should check if it was given an option that would not be applicable to the "find a commit that has the blob" mode, once it learns that it was given a blob and needs to go in that codepath? I think that would make sense. > Or have I misunderstood how the fast commit search and the slower > potentially-a-blob searching are disambiguated? I do not think so. We used to barf when we got anything but commit-ish, but Stefan's new code kicks in if the object turns out to be a blob---I think that is what you mean by the disambiguation. ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 6/7] builtin/describe.c: describe a blob 2017-11-05 6:28 ` Junio C Hamano @ 2017-11-06 23:50 ` Philip Oakley 2017-11-09 20:30 ` Stefan Beller 0 siblings, 1 reply; 110+ messages in thread From: Philip Oakley @ 2017-11-06 23:50 UTC (permalink / raw) To: Junio C Hamano, Stefan Beller Cc: git, Kevin Daudt, Jacob Keller, Johannes Schindelin From: "Junio C Hamano" <gitster@pobox.com> Sent: Sunday, November 05, 2017 6:28 AM > "Philip Oakley" <philipoakley@iee.org> writes: > >> Is this not also an alternative case, relative to the user, for the >> scenario where the user has an oid/sha1 value but does not know what >> it is, and would like to find its source and type relative to the >> `describe` command. > > I am not sure what you wanted to say with "source and type RELATIVE TO > the describe command". The 'relative to' was meaning the user's expectation about this particular command. For a non-expert user, who may not have come across cat-file yet, their world view may not extend beyond 'Git describe <this>' for me. > > The first thing the combination of the user and the describe command > would do when the user has a 40-hex string would be to do the > equivalent of "cat-file -t" to learn if it even exists and what its > type is. With Stefan's patch, that is what describe command does in > order to choose quite a different codeflow from the traditional mode > when it learns that it was given a blob. I realised, after sending, that this was probably the method for non-ambiguous shortened oid's. Thanks for the reminder. > >> IIUC the existing `describe` command only accepts <commit-ish> values, >> and here we are extending that to be even more inclusive, but at the >> same time the options become more restricted. > > Do you mean that the command should check if it was given an option > that would not be applicable to the "find a commit that has the > blob" mode, once it learns that it was given a blob and needs to go > in that codepath? I think that would make sense. Correct, it was the option selection aspect. > >> Or have I misunderstood how the fast commit search and the slower >> potentially-a-blob searching are disambiguated? > > I do not think so. We used to barf when we got anything but > commit-ish, but Stefan's new code kicks in if the object turns out > to be a blob---I think that is what you mean by the disambiguation. Correct. We ask to describe an object, but then the option choices may vary by type. The new [blob] synopys only lists <options>, while the old [commit-ish] shows specifics. It wasn't clear if the options are the same for both. I quess they are the same once the cat-file -t has done its bit. Its only the speed that's affected. As a side note, the commit message example don't show any pathspec that is not in the top level directory. -- Philip ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 6/7] builtin/describe.c: describe a blob 2017-11-06 23:50 ` Philip Oakley @ 2017-11-09 20:30 ` Stefan Beller 2017-11-10 0:25 ` Philip Oakley 0 siblings, 1 reply; 110+ messages in thread From: Stefan Beller @ 2017-11-09 20:30 UTC (permalink / raw) To: Philip Oakley Cc: Junio C Hamano, git, Kevin Daudt, Jacob Keller, Johannes Schindelin Rereading this discussion, there is currently no urgent thing to address? Then the state as announced by the last cooking email, to just cook it, seems about right and we'll wait for further feedback. Thanks, Stefan ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 6/7] builtin/describe.c: describe a blob 2017-11-09 20:30 ` Stefan Beller @ 2017-11-10 0:25 ` Philip Oakley 2017-11-10 1:24 ` Junio C Hamano 0 siblings, 1 reply; 110+ messages in thread From: Philip Oakley @ 2017-11-10 0:25 UTC (permalink / raw) To: Stefan Beller Cc: Junio C Hamano, git, Kevin Daudt, Jacob Keller, Johannes Schindelin From: "Stefan Beller" <sbeller@google.com> > Rereading this discussion, there is currently no urgent thing to address? True. > Then the state as announced by the last cooking email, to just cook it, > seems > about right and we'll wait for further feedback. Possibly only checking the documenation aspects, so folks don't fall into the same trap as me.. ;-) -- Philip ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 6/7] builtin/describe.c: describe a blob 2017-11-10 0:25 ` Philip Oakley @ 2017-11-10 1:24 ` Junio C Hamano 2017-11-10 22:44 ` [PATCH 0/1] describe a blob: with better docs Stefan Beller 2017-11-20 15:22 ` [PATCH 6/7] " Philip Oakley 0 siblings, 2 replies; 110+ messages in thread From: Junio C Hamano @ 2017-11-10 1:24 UTC (permalink / raw) To: Philip Oakley Cc: Stefan Beller, git, Kevin Daudt, Jacob Keller, Johannes Schindelin "Philip Oakley" <philipoakley@iee.org> writes: > From: "Stefan Beller" <sbeller@google.com> >> Rereading this discussion, there is currently no urgent thing to address? > > True. > >> Then the state as announced by the last cooking email, to just cook >> it, seems >> about right and we'll wait for further feedback. A shiny new toy that is not a fix for a grave bug is rarely urgent, so with that criterion, we'd end up with hundreds of topics not in 'next' but in 'pu' waiting for the original contributor to get out of his or her procrastination, which certainly is not what I want to see, as I'd have to throw them into the Stalled bin and then eventually discard them, while having to worry about possible mismerges with remaining good topics caused by these topics appearing and disappearing from 'pu'. I'd rather see any topic that consumed reviewers' time to be polished enough to get into 'next' while we all recall the issues raised during previous reviews. I consider the process to further incrementally polish it after that happens a true "cooking". For this topic, aside from "known issues" that we decided to punt for now, my impression was that the code is in good enough shape, and we need a bit of documentation polishes before I can mark it as "Will merge to 'next'". > Possibly only checking the documenation aspects, so folks don't fall > into the same trap as me.. ;-) Yup, so let's resolve that documentation thing while we remember that the topic has that issue, and what part of the documentation we find needs improvement. I am not sure what "trap: you fell into, though. Are you saying that giving git describe [<option to describe a commit>...] <commit-ish> git describe [<option to describe a blob>...] <blob> in the synopsis is not helpful, because the user may not know what kind of object s/he has, and cannot decide from which set of options to pick? Then an alternative would be to list git describe [<option>...] <object> in the synopsis, say upfront that most options are applicable only when describing a commit-ish, and when describing a blob, we do quite different thing and a separate set of options apply, perhaps? ^ permalink raw reply [flat|nested] 110+ messages in thread
* [PATCH 0/1] describe a blob: with better docs 2017-11-10 1:24 ` Junio C Hamano @ 2017-11-10 22:44 ` Stefan Beller 2017-11-10 22:44 ` [PATCH] builtin/describe.c: describe a blob Stefan Beller 2017-11-20 15:22 ` [PATCH 6/7] " Philip Oakley 1 sibling, 1 reply; 110+ messages in thread From: Stefan Beller @ 2017-11-10 22:44 UTC (permalink / raw) To: gitster; +Cc: Johannes.Schindelin, git, jacob.keller, me, philipoakley, sbeller This replaces the last patch of origin/sb/describe-blob. Interdiff is below. I chose to not mention options at all, as currently none are applicable; Check for options and tell the user by die()ing that we don't know about the options for blobs. Thanks, Stefan builtin/describe.c: describe a blob Documentation/git-describe.txt | 13 +++++++- builtin/describe.c | 71 ++++++++++++++++++++++++++++++++++++++---- t/t6120-describe.sh | 15 +++++++++ 3 files changed, 92 insertions(+), 7 deletions(-) -- 2.15.0.128.gcadd42da22 diff --git c/Documentation/git-describe.txt w/Documentation/git-describe.txt index 79ec0be62a..a25443ca91 100644 --- c/Documentation/git-describe.txt +++ w/Documentation/git-describe.txt @@ -11,7 +11,7 @@ SYNOPSIS [verse] 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] [<commit-ish>...] 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] --dirty[=<mark>] -'git describe' [<options>] <blob> +'git describe' <blob> DESCRIPTION ----------- diff --git c/builtin/describe.c w/builtin/describe.c index cf08bef344..acfd853a30 100644 --- c/builtin/describe.c +++ w/builtin/describe.c @@ -501,9 +501,13 @@ static void describe(const char *arg, int last_one) if (cmit) describe_commit(&oid, &sb); - else if (lookup_blob(&oid)) + else if (lookup_blob(&oid)) { + if (all || tags || longformat || first_parent || + patterns.nr || exclude_patterns.nr || + always || dirty || broken) + die(_("options not available for describing blobs")); describe_blob(oid, &sb); - else + } else die(_("%s is neither a commit nor blob"), arg); puts(sb.buf); ^ permalink raw reply related [flat|nested] 110+ messages in thread
* [PATCH] builtin/describe.c: describe a blob 2017-11-10 22:44 ` [PATCH 0/1] describe a blob: with better docs Stefan Beller @ 2017-11-10 22:44 ` Stefan Beller 2017-11-13 1:33 ` Junio C Hamano 0 siblings, 1 reply; 110+ messages in thread From: Stefan Beller @ 2017-11-10 22:44 UTC (permalink / raw) To: gitster; +Cc: Johannes.Schindelin, git, jacob.keller, me, philipoakley, sbeller Sometimes users are given a hash of an object and they want to identify it further (ex.: Use verify-pack to find the largest blobs, but what are these? or [1]) When describing commits, we try to anchor them to tags or refs, as these are conceptually on a higher level than the commit. And if there is no ref or tag that matches exactly, we're out of luck. So we employ a heuristic to make up a name for the commit. These names are ambiguous, there might be different tags or refs to anchor to, and there might be different path in the DAG to travel to arrive at the commit precisely. When describing a blob, we want to describe the blob from a higher layer as well, which is a tuple of (commit, deep/path) as the tree objects involved are rather uninteresting. The same blob can be referenced by multiple commits, so how we decide which commit to use? This patch implements a rather naive approach on this: As there are no back pointers from blobs to commits in which the blob occurs, we'll start walking from any tips available, listing the blobs in-order of the commit and once we found the blob, we'll take the first commit that listed the blob. For source code this is likely not the first commit that introduced the blob, but rather the latest commit that contained the blob. For example: git describe v0.99:Makefile v0.99-5-gab6625e06a:Makefile tells us the latest commit that contained the Makefile as it was in tag v0.99 is commit v0.99-5-gab6625e06a (and at the same path), as the next commit on top v0.99-6-gb1de9de2b9 ([PATCH] Bootstrap "make dist", 2005-07-11) touches the Makefile. Let's see how this description turns out, if it is useful in day-to-day use as I have the intuition that we'd rather want to see the *first* commit that this blob was introduced to the repository (which can be achieved easily by giving the `--reverse` flag in the describe_blob rev walk). [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> --- Documentation/git-describe.txt | 13 +++++++- builtin/describe.c | 71 ++++++++++++++++++++++++++++++++++++++---- t/t6120-describe.sh | 15 +++++++++ 3 files changed, 92 insertions(+), 7 deletions(-) diff --git a/Documentation/git-describe.txt b/Documentation/git-describe.txt index c924c945ba..a25443ca91 100644 --- a/Documentation/git-describe.txt +++ b/Documentation/git-describe.txt @@ -3,7 +3,7 @@ git-describe(1) NAME ---- -git-describe - Describe a commit using the most recent tag reachable from it +git-describe - Describe a commit or blob using the graph relations SYNOPSIS @@ -11,6 +11,7 @@ SYNOPSIS [verse] 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] [<commit-ish>...] 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] --dirty[=<mark>] +'git describe' <blob> DESCRIPTION ----------- @@ -24,6 +25,16 @@ By default (without --all or --tags) `git describe` only shows annotated tags. For more information about creating annotated tags see the -a and -s options to linkgit:git-tag[1]. +If the given object refers to a blob, it will be described +as `<commit-ish>:<path>`, such that the blob can be found +at `<path>` in the `<commit-ish>`. Note, that the commit is likely +not the commit that introduced the blob, but the one that was found +first; to find the commit that introduced the blob, you need to find +the commit that last touched the path, e.g. +`git log <commit-description> -- <path>`. +As blobs do not point at the commits they are contained in, +describing blobs is slow as we have to walk the whole graph. + OPTIONS ------- <commit-ish>...:: diff --git a/builtin/describe.c b/builtin/describe.c index 9e9a5ed5d4..acfd853a30 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -3,6 +3,7 @@ #include "lockfile.h" #include "commit.h" #include "tag.h" +#include "blob.h" #include "refs.h" #include "builtin.h" #include "exec_cmd.h" @@ -11,8 +12,9 @@ #include "hashmap.h" #include "argv-array.h" #include "run-command.h" +#include "revision.h" +#include "list-objects.h" -#define SEEN (1u << 0) #define MAX_TAGS (FLAG_BITS - 1) static const char * const describe_usage[] = { @@ -434,6 +436,56 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst) strbuf_addstr(dst, suffix); } +struct process_commit_data { + struct object_id current_commit; + struct object_id looking_for; + struct strbuf *dst; + struct rev_info *revs; +}; + +static void process_commit(struct commit *commit, void *data) +{ + struct process_commit_data *pcd = data; + pcd->current_commit = commit->object.oid; +} + +static void process_object(struct object *obj, const char *path, void *data) +{ + struct process_commit_data *pcd = data; + + if (!oidcmp(&pcd->looking_for, &obj->oid) && !pcd->dst->len) { + reset_revision_walk(); + describe_commit(&pcd->current_commit, pcd->dst); + strbuf_addf(pcd->dst, ":%s", path); + pcd->revs->max_count = 0; + } +} + +static void describe_blob(struct object_id oid, struct strbuf *dst) +{ + struct rev_info revs; + struct argv_array args = ARGV_ARRAY_INIT; + struct process_commit_data pcd = { null_oid, oid, dst, &revs}; + + argv_array_pushl(&args, "internal: The first arg is not parsed", + "--all", "--reflog", /* as many starting points as possible */ + /* NEEDSWORK: --all is incompatible with worktrees for now: */ + "--single-worktree", + "--objects", + "--in-commit-order", + NULL); + + init_revisions(&revs, NULL); + if (setup_revisions(args.argc, args.argv, &revs, NULL) > 1) + BUG("setup_revisions could not handle all args?"); + + if (prepare_revision_walk(&revs)) + die("revision walk setup failed"); + + traverse_commit_list(&revs, process_commit, process_object, &pcd); + reset_revision_walk(); +} + static void describe(const char *arg, int last_one) { struct object_id oid; @@ -445,11 +497,18 @@ static void describe(const char *arg, int last_one) if (get_oid(arg, &oid)) die(_("Not a valid object name %s"), arg); - cmit = lookup_commit_reference(&oid); - if (!cmit) - die(_("%s is not a valid '%s' object"), arg, commit_type); - - describe_commit(&oid, &sb); + cmit = lookup_commit_reference_gently(&oid, 1); + + if (cmit) + describe_commit(&oid, &sb); + else if (lookup_blob(&oid)) { + if (all || tags || longformat || first_parent || + patterns.nr || exclude_patterns.nr || + always || dirty || broken) + die(_("options not available for describing blobs")); + describe_blob(oid, &sb); + } else + die(_("%s is neither a commit nor blob"), arg); puts(sb.buf); diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh index c8b7ed82d9..aec6ed192d 100755 --- a/t/t6120-describe.sh +++ b/t/t6120-describe.sh @@ -310,6 +310,21 @@ test_expect_success 'describe ignoring a broken submodule' ' grep broken out ' +test_expect_success 'describe a blob at a tag' ' + echo "make it a unique blob" >file && + git add file && git commit -m "content in file" && + git tag -a -m "latest annotated tag" unique-file && + git describe HEAD:file >actual && + echo "unique-file:file" >expect && + test_cmp expect actual +' + +test_expect_success 'describe a blob with commit-ish' ' + git commit --allow-empty -m "empty commit" && + git describe HEAD:file >actual && + grep unique-file-1-g actual +' + test_expect_failure ULIMIT_STACK_SIZE 'name-rev works in a deep repo' ' i=1 && while test $i -lt 8000 -- 2.15.0.128.gcadd42da22 ^ permalink raw reply related [flat|nested] 110+ messages in thread
* Re: [PATCH] builtin/describe.c: describe a blob 2017-11-10 22:44 ` [PATCH] builtin/describe.c: describe a blob Stefan Beller @ 2017-11-13 1:33 ` Junio C Hamano 2017-11-14 23:37 ` Stefan Beller 0 siblings, 1 reply; 110+ messages in thread From: Junio C Hamano @ 2017-11-13 1:33 UTC (permalink / raw) To: Stefan Beller; +Cc: Johannes.Schindelin, git, jacob.keller, me, philipoakley Stefan Beller <sbeller@google.com> writes: > Sometimes users are given a hash of an object and they want to > identify it further (ex.: Use verify-pack to find the largest blobs, > but what are these? or [1]) Thanks for finishing the topic you started. > @@ -11,6 +11,7 @@ SYNOPSIS > [verse] > 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] [<commit-ish>...] > 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] --dirty[=<mark>] > +'git describe' <blob> OK. > diff --git a/builtin/describe.c b/builtin/describe.c > index 9e9a5ed5d4..acfd853a30 100644 > --- a/builtin/describe.c > +++ b/builtin/describe.c > ... > static void describe(const char *arg, int last_one) > { > ... > @@ -445,11 +497,18 @@ static void describe(const char *arg, int last_one) > ... > + cmit = lookup_commit_reference_gently(&oid, 1); > + > + if (cmit) > + describe_commit(&oid, &sb); > + else if (lookup_blob(&oid)) { > + if (all || tags || longformat || first_parent || > + patterns.nr || exclude_patterns.nr || > + always || dirty || broken) > + die(_("options not available for describing blobs")); > + describe_blob(oid, &sb); I am not sure if I agree with some of them. > + } else > + die(_("%s is neither a commit nor blob"), arg); This side I would agree with, though. The caller of the describe() function is either * 'git describe<RETURN>' makes a single call to it with "HEAD" and exits; or * 'git describe A B C...' makes one call to it for each of these command line arguments. And 'git describe <blob>' code is most likely trigger from the latter, as it is not likely for HEAD to be pointing at a blob. $ blob=$(git rev-parse master:Makefile) $ git describe --always master $blob and trigger the above check. Is the check against "always" useful, or is it better to simply ignore it while describing $blob, but still keeping it in effect while describing 'master'? The 'dirty' and 'broken' check is redundant because we would have already errored out if either of them is set before calling describe() on user-supplied object names. If I understand the way "describe <blob>" works correctly, it traverses the history with objects, doing a moral equivalent of "rev-list --objects", stops when it finds the blob object with the given name, and when it stops, it knows the commit object that contains the blob and path in that commit to the blob. Then the commit is described to be a human-readable string, so that the path can be concatenated after it. Aren't these options that affect how a commit object is described in effect and useful when you do the "internal" describe of the commit you found the blob object in? IOW, wouldn't this $ git describe --exclude='*-wip' $blob make sure that in its output $commit:$path, $commit is not described in terms of any "wip" refs? ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH] builtin/describe.c: describe a blob 2017-11-13 1:33 ` Junio C Hamano @ 2017-11-14 23:37 ` Stefan Beller 0 siblings, 0 replies; 110+ messages in thread From: Stefan Beller @ 2017-11-14 23:37 UTC (permalink / raw) To: Junio C Hamano Cc: Johannes Schindelin, git, Jacob Keller, Kevin Daudt, Philip Oakley On Sun, Nov 12, 2017 at 5:33 PM, Junio C Hamano <gitster@pobox.com> wrote: > Stefan Beller <sbeller@google.com> writes: > >> Sometimes users are given a hash of an object and they want to >> identify it further (ex.: Use verify-pack to find the largest blobs, >> but what are these? or [1]) > > Thanks for finishing the topic you started. > >> @@ -11,6 +11,7 @@ SYNOPSIS >> [verse] >> 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] [<commit-ish>...] >> 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] --dirty[=<mark>] >> +'git describe' <blob> > > OK. > >> diff --git a/builtin/describe.c b/builtin/describe.c >> index 9e9a5ed5d4..acfd853a30 100644 >> --- a/builtin/describe.c >> +++ b/builtin/describe.c >> ... >> static void describe(const char *arg, int last_one) >> { >> ... >> @@ -445,11 +497,18 @@ static void describe(const char *arg, int last_one) >> ... >> + cmit = lookup_commit_reference_gently(&oid, 1); >> + >> + if (cmit) >> + describe_commit(&oid, &sb); >> + else if (lookup_blob(&oid)) { >> + if (all || tags || longformat || first_parent || >> + patterns.nr || exclude_patterns.nr || >> + always || dirty || broken) >> + die(_("options not available for describing blobs")); >> + describe_blob(oid, &sb); > > I am not sure if I agree with some of them. > >> + } else >> + die(_("%s is neither a commit nor blob"), arg); > > This side I would agree with, though. > > The caller of the describe() function is either > > * 'git describe<RETURN>' makes a single call to it with "HEAD" and > exits; or > * 'git describe A B C...' makes one call to it for each of these > command line arguments. > > And 'git describe <blob>' code is most likely trigger from the latter, > as it is not likely for HEAD to be pointing at a blob. > > $ blob=$(git rev-parse master:Makefile) > $ git describe --always master $blob > > and trigger the above check. Is the check against "always" useful, > or is it better to simply ignore it while describing $blob, but > still keeping it in effect while describing 'master'? > > The 'dirty' and 'broken' check is redundant because we would have > already errored out if either of them is set before calling describe() > on user-supplied object names. > > If I understand the way "describe <blob>" works correctly, it > traverses the history with objects, doing a moral equivalent of > "rev-list --objects", stops when it finds the blob object with the > given name, and when it stops, it knows the commit object that > contains the blob and path in that commit to the blob. Then the > commit is described to be a human-readable string, so that the path > can be concatenated after it. > > Aren't these options that affect how a commit object is described in > effect and useful when you do the "internal" describe of the commit > you found the blob object in? IOW, wouldn't this > > $ git describe --exclude='*-wip' $blob > > make sure that in its output $commit:$path, $commit is not described > in terms of any "wip" refs? yes, we would exclude those refs. But the name alone (without reading the docs) suggests anything given in --exclude is excluded, so what about the blob $commit:path-wip/test1.c which I might want to exclude from the search using the exclude pattern? After reading the docs, this is silly of course, and the exclusion only applies to the commit description part. --all is an interesting case, as we pass --all to the revision walking machinery for blobs, but that is slightly different than the --all flag given to describe, which also only relates to the commit name that should be produced. So, I'll go through all the options and see which we can drop. Thanks, Stefan > ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 6/7] builtin/describe.c: describe a blob 2017-11-10 1:24 ` Junio C Hamano 2017-11-10 22:44 ` [PATCH 0/1] describe a blob: with better docs Stefan Beller @ 2017-11-20 15:22 ` Philip Oakley 2017-11-20 18:18 ` Philip Oakley 1 sibling, 1 reply; 110+ messages in thread From: Philip Oakley @ 2017-11-20 15:22 UTC (permalink / raw) To: Junio C Hamano Cc: Stefan Beller, git, Kevin Daudt, Jacob Keller, Johannes Schindelin From: "Junio C Hamano" <gitster@pobox.com> : Friday, November 10, 2017 1:24 AM [catch up] > "Philip Oakley" <philipoakley@iee.org> writes: > >> From: "Stefan Beller" <sbeller@google.com> >>> Rereading this discussion, there is currently no urgent thing to >>> address? >> >> True. >> >>> Then the state as announced by the last cooking email, to just cook >>> it, seems >>> about right and we'll wait for further feedback. > > A shiny new toy that is not a fix for a grave bug is rarely urgent, > so with that criterion, we'd end up with hundreds of topics not in > 'next' but in 'pu' waiting for the original contributor to get out > of his or her procrastination, which certainly is not what I want to > see, as I'd have to throw them into the Stalled bin and then > eventually discard them, while having to worry about possible > mismerges with remaining good topics caused by these topics > appearing and disappearing from 'pu'. > > I'd rather see any topic that consumed reviewers' time to be > polished enough to get into 'next' while we all recall the issues > raised during previous reviews. I consider the process to further > incrementally polish it after that happens a true "cooking". > > For this topic, aside from "known issues" that we decided to punt > for now, my impression was that the code is in good enough shape, > and we need a bit of documentation polishes before I can mark it > as "Will merge to 'next'". > >> Possibly only checking the documenation aspects, so folks don't fall >> into the same trap as me.. ;-) > > Yup, so let's resolve that documentation thing while we remember > that the topic has that issue, and what part of the documentation > we find needs improvement. > > I am not sure what "trap: you fell into, though. Are you saying > that giving > > git describe [<option to describe a commit>...] <commit-ish> > git describe [<option to describe a blob>...] <blob> > > in the synopsis is not helpful, because the user may not know what > kind of object s/he has, and cannot decide from which set of options > to pick? Then an alternative would be to list (If I remember correctly) My nit pick was roughly along the lines you suggest, and that the two option lists (for commit-ish and blob) were shown in different ways, which could lead to the scenarion that, with knowing the oid object type (or knowing how to get it), the user could give an invalid option, and think the command failure was because the oid was invalid, not that the option was not appropriate, along with variations on that theme. The newer synopsis (v5) looks Ok in that it avoids digging the hole by not mentioning the blob options. Personally I'm more for manuals that tend toward instructional, rather than being expert references. I'd sneak in a line saying "The object type can be determined using `git cat-file`.", but maybe that's my work environment... > > git describe [<option>...] <object> > > in the synopsis, say upfront that most options are applicable only > when describing a commit-ish, and when describing a blob, we do > quite different thing and a separate set of options apply, perhaps? > -- Philip ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 6/7] builtin/describe.c: describe a blob 2017-11-20 15:22 ` [PATCH 6/7] " Philip Oakley @ 2017-11-20 18:18 ` Philip Oakley 0 siblings, 0 replies; 110+ messages in thread From: Philip Oakley @ 2017-11-20 18:18 UTC (permalink / raw) To: Philip Oakley, Junio C Hamano Cc: Stefan Beller, git, Kevin Daudt, Jacob Keller, Johannes Schindelin From: "Philip Oakley" <philipoakley@iee.org> s/with/without/ ... > From: "Junio C Hamano" <gitster@pobox.com> > : Friday, November 10, 2017 1:24 AM > [catch up] > >> "Philip Oakley" <philipoakley@iee.org> writes: >> >>> From: "Stefan Beller" <sbeller@google.com> >>>> Rereading this discussion, there is currently no urgent thing to >>>> address? >>> >>> True. >>> >>>> Then the state as announced by the last cooking email, to just cook >>>> it, seems >>>> about right and we'll wait for further feedback. >> >> A shiny new toy that is not a fix for a grave bug is rarely urgent, >> so with that criterion, we'd end up with hundreds of topics not in >> 'next' but in 'pu' waiting for the original contributor to get out >> of his or her procrastination, which certainly is not what I want to >> see, as I'd have to throw them into the Stalled bin and then >> eventually discard them, while having to worry about possible >> mismerges with remaining good topics caused by these topics >> appearing and disappearing from 'pu'. >> >> I'd rather see any topic that consumed reviewers' time to be >> polished enough to get into 'next' while we all recall the issues >> raised during previous reviews. I consider the process to further >> incrementally polish it after that happens a true "cooking". >> >> For this topic, aside from "known issues" that we decided to punt >> for now, my impression was that the code is in good enough shape, >> and we need a bit of documentation polishes before I can mark it >> as "Will merge to 'next'". >> >>> Possibly only checking the documenation aspects, so folks don't fall >>> into the same trap as me.. ;-) >> >> Yup, so let's resolve that documentation thing while we remember >> that the topic has that issue, and what part of the documentation >> we find needs improvement. >> >> I am not sure what "trap: you fell into, though. Are you saying >> that giving >> >> git describe [<option to describe a commit>...] <commit-ish> >> git describe [<option to describe a blob>...] <blob> >> >> in the synopsis is not helpful, because the user may not know what >> kind of object s/he has, and cannot decide from which set of options >> to pick? Then an alternative would be to list > > (If I remember correctly) My nit pick was roughly along the lines you > suggest, and that the two option lists (for commit-ish and blob) were > shown in different ways, which could lead to the scenarion that, with > knowing the s/with/without/ ... > oid object type (or knowing how to get it), the user could give an invalid > option, and think the command failure was because the oid was invalid, not > that the option was not appropriate, along with variations on that theme. > > The newer synopsis (v5) looks Ok in that it avoids digging the hole by not > mentioning the blob options. Personally I'm more for manuals that tend > toward instructional, rather than being expert references. I'd sneak in a > line saying "The object type can be determined using `git cat-file`.", but > maybe that's my work environment... > >> >> git describe [<option>...] <object> >> >> in the synopsis, say upfront that most options are applicable only >> when describing a commit-ish, and when describing a blob, we do >> quite different thing and a separate set of options apply, perhaps? >> > -- > Philip ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 6/7] builtin/describe.c: describe a blob 2017-10-31 19:16 ` Stefan Beller 2017-11-01 3:34 ` Junio C Hamano @ 2017-11-01 3:44 ` Junio C Hamano 1 sibling, 0 replies; 110+ messages in thread From: Junio C Hamano @ 2017-11-01 3:44 UTC (permalink / raw) To: Stefan Beller; +Cc: git, Kevin Daudt, Jacob Keller, Johannes Schindelin Stefan Beller <sbeller@google.com> writes: > Maybe > > "git-describe - Describe a blob or commit using graph relations" > > though that sounds too generic, but it is accurate as all we do is > a heuristic for graph walk ways. We used to describe commit using commit ancestry (i.e. finding the place where a "wanted" commit sits in the commit ancestry graph). Now we are extending it to describe an object by finding the place where it sits in the graph that shows object reachability relation, but saying "graph" without saying "graph over what" is probably not descriptive enough for most readers. ^ permalink raw reply [flat|nested] 110+ messages in thread
* [PATCH 7/7] t6120: fix typo in test name 2017-10-31 0:33 ` [PATCH 0/7] git-describe <blob> Stefan Beller ` (5 preceding siblings ...) 2017-10-31 0:33 ` [PATCH 6/7] builtin/describe.c: describe a blob Stefan Beller @ 2017-10-31 0:33 ` Stefan Beller 2017-11-01 1:21 ` Junio C Hamano 2017-10-31 21:18 ` [PATCHv2 0/7] git describe blob Stefan Beller 7 siblings, 1 reply; 110+ messages in thread From: Stefan Beller @ 2017-10-31 0:33 UTC (permalink / raw) To: sbeller; +Cc: git, me, jacob.keller, Johannes.Schindelin Signed-off-by: Stefan Beller <sbeller@google.com> --- t/t6120-describe.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh index 3be01316e8..fd329f173a 100755 --- a/t/t6120-describe.sh +++ b/t/t6120-describe.sh @@ -304,7 +304,7 @@ test_expect_success 'describe chokes on severely broken submodules' ' mv .git/modules/sub1/ .git/modules/sub_moved && test_must_fail git describe --dirty ' -test_expect_success 'describe ignoring a borken submodule' ' +test_expect_success 'describe ignoring a broken submodule' ' git describe --broken >out && test_when_finished "mv .git/modules/sub_moved .git/modules/sub1" && grep broken out -- 2.15.0.rc2.443.gfcc3b81c0a ^ permalink raw reply related [flat|nested] 110+ messages in thread
* Re: [PATCH 7/7] t6120: fix typo in test name 2017-10-31 0:33 ` [PATCH 7/7] t6120: fix typo in test name Stefan Beller @ 2017-11-01 1:21 ` Junio C Hamano 2017-11-01 18:13 ` Stefan Beller 2017-11-02 1:36 ` Junio C Hamano 0 siblings, 2 replies; 110+ messages in thread From: Junio C Hamano @ 2017-11-01 1:21 UTC (permalink / raw) To: Stefan Beller; +Cc: git, me, jacob.keller, Johannes.Schindelin Stefan Beller <sbeller@google.com> writes: > Signed-off-by: Stefan Beller <sbeller@google.com> > --- > t/t6120-describe.sh | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) Good. I am guessing that you are sending this as the last/optional one because this was found _after_ you worked on other parts of the series, but I think it is easier to reason about if this were marked as a preliminary clean-up and moved to the front of the series. Thanks. > > diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh > index 3be01316e8..fd329f173a 100755 > --- a/t/t6120-describe.sh > +++ b/t/t6120-describe.sh > @@ -304,7 +304,7 @@ test_expect_success 'describe chokes on severely broken submodules' ' > mv .git/modules/sub1/ .git/modules/sub_moved && > test_must_fail git describe --dirty > ' > -test_expect_success 'describe ignoring a borken submodule' ' > +test_expect_success 'describe ignoring a broken submodule' ' > git describe --broken >out && > test_when_finished "mv .git/modules/sub_moved .git/modules/sub1" && > grep broken out ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 7/7] t6120: fix typo in test name 2017-11-01 1:21 ` Junio C Hamano @ 2017-11-01 18:13 ` Stefan Beller 2017-11-02 1:36 ` Junio C Hamano 1 sibling, 0 replies; 110+ messages in thread From: Stefan Beller @ 2017-11-01 18:13 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Kevin Daudt, Jacob Keller, Johannes Schindelin On Tue, Oct 31, 2017 at 6:21 PM, Junio C Hamano <gitster@pobox.com> wrote: > Stefan Beller <sbeller@google.com> writes: > >> Signed-off-by: Stefan Beller <sbeller@google.com> >> --- >> t/t6120-describe.sh | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) > > Good. I am guessing that you are sending this as the last/optional > one because this was found _after_ you worked on other parts of the > series, but I think it is easier to reason about if this were marked > as a preliminary clean-up and moved to the front of the series. > > Thanks. It can be independent even. I did not bother to put it on its own flight, but I can shift it to the front of this series. Thanks, Stefan ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 7/7] t6120: fix typo in test name 2017-11-01 1:21 ` Junio C Hamano 2017-11-01 18:13 ` Stefan Beller @ 2017-11-02 1:36 ` Junio C Hamano 1 sibling, 0 replies; 110+ messages in thread From: Junio C Hamano @ 2017-11-02 1:36 UTC (permalink / raw) To: Stefan Beller; +Cc: git, me, jacob.keller, Johannes.Schindelin Junio C Hamano <gitster@pobox.com> writes: > Stefan Beller <sbeller@google.com> writes: > >> Signed-off-by: Stefan Beller <sbeller@google.com> >> --- >> t/t6120-describe.sh | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) > > Good. I am guessing that you are sending this as the last/optional > one because this was found _after_ you worked on other parts of the > series, but I think it is easier to reason about if this were marked > as a preliminary clean-up and moved to the front of the series. > > Thanks. Having said that, if you already have v2 that keeps it at the end, that is fine. It's just leaving such an "By the way I found this unrelated one and fixed it while at it" at the end when submitting will give an impression that the series is not as well proof-read as it could be, because a good proof-reader would notice something out-of-place like this fairly quickly and would move it to the front or even in a separate preliminary clean-up series. ^ permalink raw reply [flat|nested] 110+ messages in thread
* [PATCHv2 0/7] git describe blob 2017-10-31 0:33 ` [PATCH 0/7] git-describe <blob> Stefan Beller ` (6 preceding siblings ...) 2017-10-31 0:33 ` [PATCH 7/7] t6120: fix typo in test name Stefan Beller @ 2017-10-31 21:18 ` Stefan Beller 2017-10-31 21:18 ` [PATCHv2 1/7] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller ` (8 more replies) 7 siblings, 9 replies; 110+ messages in thread From: Stefan Beller @ 2017-10-31 21:18 UTC (permalink / raw) To: sbeller; +Cc: Johannes.Schindelin, git, jacob.keller, me v2: * other variable names in patch v1, the commit message explains the unusual strategy for the scratch pad variable, + assert * less ugly test in p2 * typofix in p3 commit msg * patch 4 (debug printing) unchanged, awaiting discussion to start/settle. * rephrased the man page in p6. Thanks, Stefan v1: This is not an RFC any more, but a serious series. Occasionally a user is given an object hash from a blob as an error message or other output (e.g. [1]). It would be useful to get a further description of such a blob, such as the (commit, path) tuple where this blob was introduced. This implements the answer in builtin/describe, however the heuristics are weak. See patch 6 for details. Any feedback welcome, Thanks, Stefan Stefan Beller (7): list-objects.c: factor out traverse_trees_and_blobs revision.h: introduce blob/tree walking in order of the commits builtin/describe.c: rename `oid` to avoid variable shadowing builtin/describe.c: print debug statements earlier builtin/describe.c: factor out describe_commit builtin/describe.c: describe a blob t6120: fix typo in test name Documentation/git-describe.txt | 13 +++- Documentation/rev-list-options.txt | 5 ++ builtin/describe.c | 125 ++++++++++++++++++++++++++++--------- list-objects.c | 52 +++++++++------ revision.c | 2 + revision.h | 3 +- t/t6100-rev-list-in-order.sh | 46 ++++++++++++++ t/t6120-describe.sh | 17 ++++- 8 files changed, 213 insertions(+), 50 deletions(-) create mode 100755 t/t6100-rev-list-in-order.sh -- 2.15.0.rc2.443.gfcc3b81c0a ^ permalink raw reply [flat|nested] 110+ messages in thread
* [PATCHv2 1/7] list-objects.c: factor out traverse_trees_and_blobs 2017-10-31 21:18 ` [PATCHv2 0/7] git describe blob Stefan Beller @ 2017-10-31 21:18 ` Stefan Beller 2017-11-01 3:46 ` Junio C Hamano 2017-10-31 21:18 ` [PATCHv2 2/7] revision.h: introduce blob/tree walking in order of the commits Stefan Beller ` (7 subsequent siblings) 8 siblings, 1 reply; 110+ messages in thread From: Stefan Beller @ 2017-10-31 21:18 UTC (permalink / raw) To: sbeller; +Cc: Johannes.Schindelin, git, jacob.keller, me With traverse_trees_and_blobs factored out of the main traverse function, the next patch can introduce an in-order revision walking with ease. In the next patch we'll call `traverse_trees_and_blobs` from within the loop walking the commits, such that we'll have one invocation of that function per commit. That is why we do not want to have memory allocations in that function, such as we'd have if we were to use a strbuf locally. Pass a strbuf from traverse_commit_list into the blob and tree traversing function as a scratch pad that only needs to be allocated once. Signed-off-by: Stefan Beller <sbeller@google.com> --- list-objects.c | 50 +++++++++++++++++++++++++++++++------------------- 1 file changed, 31 insertions(+), 19 deletions(-) diff --git a/list-objects.c b/list-objects.c index b3931fa434..ef9dbe8f92 100644 --- a/list-objects.c +++ b/list-objects.c @@ -183,25 +183,15 @@ static void add_pending_tree(struct rev_info *revs, struct tree *tree) add_pending_object(revs, &tree->object, ""); } -void traverse_commit_list(struct rev_info *revs, - show_commit_fn show_commit, - show_object_fn show_object, - void *data) +static void traverse_trees_and_blobs(struct rev_info *revs, + struct strbuf *base_path, + show_object_fn show_object, + void *data) { int i; - struct commit *commit; - struct strbuf base; - strbuf_init(&base, PATH_MAX); - while ((commit = get_revision(revs)) != NULL) { - /* - * an uninteresting boundary commit may not have its tree - * parsed yet, but we are not going to show them anyway - */ - if (commit->tree) - add_pending_tree(revs, commit->tree); - show_commit(commit, data); - } + assert(base_path->len == 0); + for (i = 0; i < revs->pending.nr; i++) { struct object_array_entry *pending = revs->pending.objects + i; struct object *obj = pending->item; @@ -218,17 +208,39 @@ void traverse_commit_list(struct rev_info *revs, path = ""; if (obj->type == OBJ_TREE) { process_tree(revs, (struct tree *)obj, show_object, - &base, path, data); + base_path, path, data); continue; } if (obj->type == OBJ_BLOB) { process_blob(revs, (struct blob *)obj, show_object, - &base, path, data); + base_path, path, data); continue; } die("unknown pending object %s (%s)", oid_to_hex(&obj->oid), name); } object_array_clear(&revs->pending); - strbuf_release(&base); +} + +void traverse_commit_list(struct rev_info *revs, + show_commit_fn show_commit, + show_object_fn show_object, + void *data) +{ + struct commit *commit; + struct strbuf csp; /* callee's scratch pad */ + strbuf_init(&csp, PATH_MAX); + + while ((commit = get_revision(revs)) != NULL) { + /* + * an uninteresting boundary commit may not have its tree + * parsed yet, but we are not going to show them anyway + */ + if (commit->tree) + add_pending_tree(revs, commit->tree); + show_commit(commit, data); + } + traverse_trees_and_blobs(revs, &csp, show_object, data); + + strbuf_release(&csp); } -- 2.15.0.rc2.443.gfcc3b81c0a ^ permalink raw reply related [flat|nested] 110+ messages in thread
* Re: [PATCHv2 1/7] list-objects.c: factor out traverse_trees_and_blobs 2017-10-31 21:18 ` [PATCHv2 1/7] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller @ 2017-11-01 3:46 ` Junio C Hamano 0 siblings, 0 replies; 110+ messages in thread From: Junio C Hamano @ 2017-11-01 3:46 UTC (permalink / raw) To: Stefan Beller; +Cc: Johannes.Schindelin, git, jacob.keller, me Stefan Beller <sbeller@google.com> writes: > With traverse_trees_and_blobs factored out of the main traverse function, > the next patch can introduce an in-order revision walking with ease. > > In the next patch we'll call `traverse_trees_and_blobs` from within the > loop walking the commits, such that we'll have one invocation of that > function per commit. That is why we do not want to have memory allocations > in that function, such as we'd have if we were to use a strbuf locally. > Pass a strbuf from traverse_commit_list into the blob and tree traversing > function as a scratch pad that only needs to be allocated once. Makes sense. I still don't think base_path is any clearer than base that was used in the original, though. ^ permalink raw reply [flat|nested] 110+ messages in thread
* [PATCHv2 2/7] revision.h: introduce blob/tree walking in order of the commits 2017-10-31 21:18 ` [PATCHv2 0/7] git describe blob Stefan Beller 2017-10-31 21:18 ` [PATCHv2 1/7] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller @ 2017-10-31 21:18 ` Stefan Beller 2017-11-01 3:50 ` Junio C Hamano 2017-10-31 21:18 ` [PATCHv2 3/7] builtin/describe.c: rename `oid` to avoid variable shadowing Stefan Beller ` (6 subsequent siblings) 8 siblings, 1 reply; 110+ messages in thread From: Stefan Beller @ 2017-10-31 21:18 UTC (permalink / raw) To: sbeller; +Cc: Johannes.Schindelin, git, jacob.keller, me The functionality to list tree objects in the order they were seen while traversing the commits will be used in the next commit, where we teach `git describe` to describe not only commits, but trees and blobs, too. Signed-off-by: Stefan Beller <sbeller@google.com> --- Documentation/rev-list-options.txt | 5 +++++ list-objects.c | 2 ++ revision.c | 2 ++ revision.h | 3 ++- t/t6100-rev-list-in-order.sh | 46 ++++++++++++++++++++++++++++++++++++++ 5 files changed, 57 insertions(+), 1 deletion(-) create mode 100755 t/t6100-rev-list-in-order.sh diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index 13501e1556..9066e0c777 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -686,6 +686,11 @@ ifdef::git-rev-list[] all object IDs which I need to download if I have the commit object _bar_ but not _foo_''. +--in-commit-order:: + Print tree and blob ids in order of the commits. The tree + and blob ids are printed after they are first referenced + by a commit. + --objects-edge:: Similar to `--objects`, but also print the IDs of excluded commits prefixed with a ``-'' character. This is used by diff --git a/list-objects.c b/list-objects.c index ef9dbe8f92..03438e5a8b 100644 --- a/list-objects.c +++ b/list-objects.c @@ -239,6 +239,8 @@ void traverse_commit_list(struct rev_info *revs, if (commit->tree) add_pending_tree(revs, commit->tree); show_commit(commit, data); + if (revs->tree_blobs_in_commit_order) + traverse_trees_and_blobs(revs, &csp, show_object, data); } traverse_trees_and_blobs(revs, &csp, show_object, data); diff --git a/revision.c b/revision.c index d167223e69..9329d4ebbf 100644 --- a/revision.c +++ b/revision.c @@ -1845,6 +1845,8 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg revs->dense = 0; } else if (!strcmp(arg, "--show-all")) { revs->show_all = 1; + } else if (!strcmp(arg, "--in-commit-order")) { + revs->tree_blobs_in_commit_order = 1; } else if (!strcmp(arg, "--remove-empty")) { revs->remove_empty_trees = 1; } else if (!strcmp(arg, "--merges")) { diff --git a/revision.h b/revision.h index 54761200ad..86985d68aa 100644 --- a/revision.h +++ b/revision.h @@ -121,7 +121,8 @@ struct rev_info { bisect:1, ancestry_path:1, first_parent_only:1, - line_level_traverse:1; + line_level_traverse:1, + tree_blobs_in_commit_order:1; /* Diff flags */ unsigned int diff:1, diff --git a/t/t6100-rev-list-in-order.sh b/t/t6100-rev-list-in-order.sh new file mode 100755 index 0000000000..651666979b --- /dev/null +++ b/t/t6100-rev-list-in-order.sh @@ -0,0 +1,46 @@ +#!/bin/sh + +test_description='miscellaneous rev-list tests' + +. ./test-lib.sh + + +test_expect_success 'setup' ' + for x in one two three four + do + echo $x >$x && + git add $x && + git commit -m "add file $x" + done && + for x in four three + do + git rm $x && + git commit -m "remove $x" + done && + git rev-list --in-commit-order --objects HEAD >actual.raw && + cut -c 1-40 >actual <actual.raw && + + git cat-file --batch-check="%(objectname)" >expect.raw <<-\EOF && + HEAD^{commit} + HEAD^{tree} + HEAD^{tree}:one + HEAD^{tree}:two + HEAD~1^{commit} + HEAD~1^{tree} + HEAD~1^{tree}:three + HEAD~2^{commit} + HEAD~2^{tree} + HEAD~2^{tree}:four + HEAD~3^{commit} + # HEAD~3^{tree} skipped + HEAD~4^{commit} + # HEAD~4^{tree} skipped + HEAD~5^{commit} + HEAD~5^{tree} + EOF + grep -v "#" >expect <expect.raw && + + test_cmp expect actual +' + +test_done -- 2.15.0.rc2.443.gfcc3b81c0a ^ permalink raw reply related [flat|nested] 110+ messages in thread
* Re: [PATCHv2 2/7] revision.h: introduce blob/tree walking in order of the commits 2017-10-31 21:18 ` [PATCHv2 2/7] revision.h: introduce blob/tree walking in order of the commits Stefan Beller @ 2017-11-01 3:50 ` Junio C Hamano 2017-11-01 12:26 ` Johannes Schindelin 0 siblings, 1 reply; 110+ messages in thread From: Junio C Hamano @ 2017-11-01 3:50 UTC (permalink / raw) To: Stefan Beller; +Cc: Johannes.Schindelin, git, jacob.keller, me Stefan Beller <sbeller@google.com> writes: > diff --git a/t/t6100-rev-list-in-order.sh b/t/t6100-rev-list-in-order.sh > new file mode 100755 > index 0000000000..651666979b > --- /dev/null > +++ b/t/t6100-rev-list-in-order.sh > @@ -0,0 +1,46 @@ > +#!/bin/sh > + > +test_description='miscellaneous rev-list tests' > + > +. ./test-lib.sh > + > + > +test_expect_success 'setup' ' > + for x in one two three four > + do > + echo $x >$x && > + git add $x && > + git commit -m "add file $x" > + done && > + for x in four three > + do > + git rm $x && > + git commit -m "remove $x" > + done && When "git commit -m 'remove four'" fails, this loop would still continue, so the &&-chain in "done &&" would still be rendered ineffetive. > + git rev-list --in-commit-order --objects HEAD >actual.raw && > + cut -c 1-40 >actual <actual.raw && > + > + git cat-file --batch-check="%(objectname)" >expect.raw <<-\EOF && > + HEAD^{commit} > + HEAD^{tree} > + HEAD^{tree}:one > + HEAD^{tree}:two > + HEAD~1^{commit} > + HEAD~1^{tree} > + HEAD~1^{tree}:three > + HEAD~2^{commit} > + HEAD~2^{tree} > + HEAD~2^{tree}:four > + HEAD~3^{commit} > + # HEAD~3^{tree} skipped > + HEAD~4^{commit} > + # HEAD~4^{tree} skipped > + HEAD~5^{commit} > + HEAD~5^{tree} > + EOF > + grep -v "#" >expect <expect.raw && Hmm, that is unfortunate that we still have to filter them out, but these have documentation values, so perhaps this is as good as it would get. These "skipped" are shorthand for "not shown here, because it was already shown as X"; it would be good to come up with a phrasing to force us show what X is for each of them. "already shown as X", perhaps? > + > + test_cmp expect actual > +' > + > +test_done ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 2/7] revision.h: introduce blob/tree walking in order of the commits 2017-11-01 3:50 ` Junio C Hamano @ 2017-11-01 12:26 ` Johannes Schindelin 2017-11-01 12:37 ` Junio C Hamano 0 siblings, 1 reply; 110+ messages in thread From: Johannes Schindelin @ 2017-11-01 12:26 UTC (permalink / raw) To: Junio C Hamano; +Cc: Stefan Beller, git, jacob.keller, me Hi, On Wed, 1 Nov 2017, Junio C Hamano wrote: > Stefan Beller <sbeller@google.com> writes: > > > diff --git a/t/t6100-rev-list-in-order.sh b/t/t6100-rev-list-in-order.sh > > new file mode 100755 > > index 0000000000..651666979b > > --- /dev/null > > +++ b/t/t6100-rev-list-in-order.sh > > @@ -0,0 +1,46 @@ > > +#!/bin/sh > > + > > +test_description='miscellaneous rev-list tests' > > + > > +. ./test-lib.sh > > + > > + > > +test_expect_success 'setup' ' > > + for x in one two three four > > + do > > + echo $x >$x && > > + git add $x && > > + git commit -m "add file $x" > > + done && > > + for x in four three > > + do > > + git rm $x && > > + git commit -m "remove $x" > > + done && > > When "git commit -m 'remove four'" fails, this loop would still > continue, so the &&-chain in "done &&" would still be rendered > ineffetive. ... so the proper fix is to append an "|| break", as in: ... for x in four three do git rm $x && git commit -m "remote $x" || break done && ... But that is still incorrect, as the "break" statement will succeed, breaking (pun intended) the && chain in a different way. The canonical way to do it is to wrap the loop in a subshell, *and also* test $? (because `(exit 1) && echo something` still prints `something`): ... ( for x in four three do git rm $x && git commit -m "remote $x" || exit done ) && test 0 -eq $? && ... Ugly? Yes. Correct? Also yes. Ciao, Dscho ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 2/7] revision.h: introduce blob/tree walking in order of the commits 2017-11-01 12:26 ` Johannes Schindelin @ 2017-11-01 12:37 ` Junio C Hamano 2017-11-01 19:37 ` Stefan Beller 2017-11-01 21:36 ` Johannes Schindelin 0 siblings, 2 replies; 110+ messages in thread From: Junio C Hamano @ 2017-11-01 12:37 UTC (permalink / raw) To: Johannes Schindelin Cc: Stefan Beller, Git Mailing List, Jacob Keller, Kevin D On Wed, Nov 1, 2017 at 9:26 PM, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > > ... > ( > for x in four three > do > git rm $x && > git commit -m "remote $x" || > exit > done > ) && > test 0 -eq $? && > ... > > Ugly? Yes. Correct? Also yes. I think returning non-zero with "return" is how other tests avoid an extra level of subshell. Ugly? Yes. Correct? Questionable but it seems to work for those who wrote them ;-) ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 2/7] revision.h: introduce blob/tree walking in order of the commits 2017-11-01 12:37 ` Junio C Hamano @ 2017-11-01 19:37 ` Stefan Beller 2017-11-01 22:08 ` Johannes Schindelin 2017-11-01 21:36 ` Johannes Schindelin 1 sibling, 1 reply; 110+ messages in thread From: Stefan Beller @ 2017-11-01 19:37 UTC (permalink / raw) To: Junio C Hamano Cc: Johannes Schindelin, Git Mailing List, Jacob Keller, Kevin D On Wed, Nov 1, 2017 at 5:37 AM, Junio C Hamano <gitster@pobox.com> wrote: > On Wed, Nov 1, 2017 at 9:26 PM, Johannes Schindelin > <Johannes.Schindelin@gmx.de> wrote: >> >> ... >> ( >> for x in four three >> do >> git rm $x && >> git commit -m "remote $x" || >> exit >> done >> ) && >> test 0 -eq $? && >> ... >> >> Ugly? Yes. Correct? Also yes. > > I think returning non-zero with "return" is how other tests avoid an > extra level of subshell. > Ugly? Yes. Correct? Questionable but it seems to work for those who > wrote them ;-) I think I'll go without the extra subshell and with s/return 1/false/ as the exact value doesn't matter. Thanks for hinting at correct implementations ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 2/7] revision.h: introduce blob/tree walking in order of the commits 2017-11-01 19:37 ` Stefan Beller @ 2017-11-01 22:08 ` Johannes Schindelin 2017-11-01 22:19 ` Stefan Beller 0 siblings, 1 reply; 110+ messages in thread From: Johannes Schindelin @ 2017-11-01 22:08 UTC (permalink / raw) To: Stefan Beller; +Cc: Junio C Hamano, Git Mailing List, Jacob Keller, Kevin D Hi Stefan, On Wed, 1 Nov 2017, Stefan Beller wrote: > On Wed, Nov 1, 2017 at 5:37 AM, Junio C Hamano <gitster@pobox.com> wrote: > > On Wed, Nov 1, 2017 at 9:26 PM, Johannes Schindelin > > <Johannes.Schindelin@gmx.de> wrote: > >> > >> ... > >> ( > >> for x in four three > >> do > >> git rm $x && > >> git commit -m "remote $x" || > >> exit > >> done > >> ) && > >> test 0 -eq $? && > >> ... > >> > >> Ugly? Yes. Correct? Also yes. > > > > I think returning non-zero with "return" is how other tests avoid an > > extra level of subshell. > > Ugly? Yes. Correct? Questionable but it seems to work for those who > > wrote them ;-) > > I think I'll go without the extra subshell and with s/return 1/false/ as > the exact value doesn't matter. You mean for ... do xyz || false done ? That does not work. Try this: for a in 1 2 3; do echo $a && false; done && echo done While it does not print `done`, it will print all of 1, 2 and 3. Ciao, Dscho ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 2/7] revision.h: introduce blob/tree walking in order of the commits 2017-11-01 22:08 ` Johannes Schindelin @ 2017-11-01 22:19 ` Stefan Beller 2017-11-01 22:39 ` Johannes Schindelin 0 siblings, 1 reply; 110+ messages in thread From: Stefan Beller @ 2017-11-01 22:19 UTC (permalink / raw) To: Johannes Schindelin Cc: Junio C Hamano, Git Mailing List, Jacob Keller, Kevin D >> I think I'll go without the extra subshell and with s/return 1/false/ as >> the exact value doesn't matter. > > You mean > > for ... > do > xyz || > false > done Yes, I do. > ? That does not work. Try this: > > for a in 1 2 3; do echo $a && false; done && echo done ... && echo $? 1 > While it does not print `done`, it will print all of 1, 2 and 3. We do not care about the internal state, aborting early, we rather care only if the whole loop body was executed. Running the test test_expect_success 'witty title' ' for a in 1 2 3; do echo $a && false; done && echo done ' not ok 1 - witty title That is all we want to care about here? Otherwise I may still have a misunderstanding. Thanks, Stefan ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 2/7] revision.h: introduce blob/tree walking in order of the commits 2017-11-01 22:19 ` Stefan Beller @ 2017-11-01 22:39 ` Johannes Schindelin 2017-11-01 22:46 ` Stefan Beller 0 siblings, 1 reply; 110+ messages in thread From: Johannes Schindelin @ 2017-11-01 22:39 UTC (permalink / raw) To: Stefan Beller; +Cc: Junio C Hamano, Git Mailing List, Jacob Keller, Kevin D Hi Stefan, On Wed, 1 Nov 2017, Stefan Beller wrote: > We do not care about the internal state, aborting early, we rather > care only if the whole loop body was executed. Running the test > > test_expect_success 'witty title' ' > for a in 1 2 3; do echo $a && false; done && echo done > ' > > not ok 1 - witty title > > That is all we want to care about here? We care about the loop body being executed successfully *each time*. A better counter example: $ for a in 1 2 3; do echo $a && test 2 != $a || false; done && echo done 1 2 3 done (even if the second of three runs failed) Ciao, Dscho ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 2/7] revision.h: introduce blob/tree walking in order of the commits 2017-11-01 22:39 ` Johannes Schindelin @ 2017-11-01 22:46 ` Stefan Beller 0 siblings, 0 replies; 110+ messages in thread From: Stefan Beller @ 2017-11-01 22:46 UTC (permalink / raw) To: Johannes Schindelin Cc: Junio C Hamano, Git Mailing List, Jacob Keller, Kevin D On Wed, Nov 1, 2017 at 3:39 PM, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: >> not ok 1 - witty title >> >> That is all we want to care about here? > > We care about the loop body being executed successfully *each time*. A > better counter example: Good point. I'll use return in that case. Thanks! Stefan ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 2/7] revision.h: introduce blob/tree walking in order of the commits 2017-11-01 12:37 ` Junio C Hamano 2017-11-01 19:37 ` Stefan Beller @ 2017-11-01 21:36 ` Johannes Schindelin 2017-11-01 21:39 ` Jeff King 1 sibling, 1 reply; 110+ messages in thread From: Johannes Schindelin @ 2017-11-01 21:36 UTC (permalink / raw) To: Junio C Hamano; +Cc: Stefan Beller, Git Mailing List, Jacob Keller, Kevin D Hi Junio, On Wed, 1 Nov 2017, Junio C Hamano wrote: > On Wed, Nov 1, 2017 at 9:26 PM, Johannes Schindelin > <Johannes.Schindelin@gmx.de> wrote: > > > > ... > > ( > > for x in four three > > do > > git rm $x && > > git commit -m "remote $x" || > > exit > > done > > ) && > > test 0 -eq $? && > > ... > > > > Ugly? Yes. Correct? Also yes. > > I think returning non-zero with "return" is how other tests avoid an > extra level of subshell. > Ugly? Yes. Correct? Questionable but it seems to work for those who > wrote them ;-) Given that the test_expect_* functions evaluate the code, it makes me wonder whether those `return` statements really return appropriately, or one call level too low. Ciao, Dscho ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 2/7] revision.h: introduce blob/tree walking in order of the commits 2017-11-01 21:36 ` Johannes Schindelin @ 2017-11-01 21:39 ` Jeff King 2017-11-01 22:33 ` Johannes Schindelin 0 siblings, 1 reply; 110+ messages in thread From: Jeff King @ 2017-11-01 21:39 UTC (permalink / raw) To: Johannes Schindelin Cc: Junio C Hamano, Stefan Beller, Git Mailing List, Jacob Keller, Kevin D On Wed, Nov 01, 2017 at 10:36:02PM +0100, Johannes Schindelin wrote: > Hi Junio, > > On Wed, 1 Nov 2017, Junio C Hamano wrote: > > > On Wed, Nov 1, 2017 at 9:26 PM, Johannes Schindelin > > <Johannes.Schindelin@gmx.de> wrote: > > > > > > ... > > > ( > > > for x in four three > > > do > > > git rm $x && > > > git commit -m "remote $x" || > > > exit > > > done > > > ) && > > > test 0 -eq $? && > > > ... > > > > > > Ugly? Yes. Correct? Also yes. > > > > I think returning non-zero with "return" is how other tests avoid an > > extra level of subshell. > > Ugly? Yes. Correct? Questionable but it seems to work for those who > > wrote them ;-) > > Given that the test_expect_* functions evaluate the code, it makes me > wonder whether those `return` statements really return appropriately, or > one call level too low. The test_expect functions eval the actual snippets inside a dummy function. This is intentional exactly to allow them to call "return" at will. -Peff ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 2/7] revision.h: introduce blob/tree walking in order of the commits 2017-11-01 21:39 ` Jeff King @ 2017-11-01 22:33 ` Johannes Schindelin 2017-11-02 1:20 ` Junio C Hamano 0 siblings, 1 reply; 110+ messages in thread From: Johannes Schindelin @ 2017-11-01 22:33 UTC (permalink / raw) To: Jeff King Cc: Junio C Hamano, Stefan Beller, Git Mailing List, Jacob Keller, Kevin D Hi, On Wed, 1 Nov 2017, Jeff King wrote: > On Wed, Nov 01, 2017 at 10:36:02PM +0100, Johannes Schindelin wrote: > > > On Wed, 1 Nov 2017, Junio C Hamano wrote: > > > > > On Wed, Nov 1, 2017 at 9:26 PM, Johannes Schindelin > > > <Johannes.Schindelin@gmx.de> wrote: > > > > > > > > ... > > > > ( > > > > for x in four three > > > > do > > > > git rm $x && > > > > git commit -m "remote $x" || > > > > exit > > > > done > > > > ) && > > > > test 0 -eq $? && > > > > ... > > > > > > > > Ugly? Yes. Correct? Also yes. > > > > > > I think returning non-zero with "return" is how other tests avoid an > > > extra level of subshell. > > > Ugly? Yes. Correct? Questionable but it seems to work for those who > > > wrote them ;-) > > > > Given that the test_expect_* functions evaluate the code, it makes me > > wonder whether those `return` statements really return appropriately, or > > one call level too low. > > The test_expect functions eval the actual snippets inside a dummy > function. This is intentional exactly to allow them to call "return" at > will. Tricksy. And a bit unintuitive for script kings such as myself, but okay. Ciao, Dscho ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 2/7] revision.h: introduce blob/tree walking in order of the commits 2017-11-01 22:33 ` Johannes Schindelin @ 2017-11-02 1:20 ` Junio C Hamano 0 siblings, 0 replies; 110+ messages in thread From: Junio C Hamano @ 2017-11-02 1:20 UTC (permalink / raw) To: Johannes Schindelin Cc: Jeff King, Stefan Beller, Git Mailing List, Jacob Keller, Kevin D Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: >> > Given that the test_expect_* functions evaluate the code, it makes me >> > wonder whether those `return` statements really return appropriately, or >> > one call level too low. >> >> The test_expect functions eval the actual snippets inside a dummy >> function. This is intentional exactly to allow them to call "return" at >> will. > > Tricksy. And a bit unintuitive for script kings such as myself, but okay. Exactly. The hidden assumption on the way how "return" interacts with the way we use "eval" is the reason why I said "Ugly? Yes, Correct? Questionable" in the first place. ^ permalink raw reply [flat|nested] 110+ messages in thread
* [PATCHv2 3/7] builtin/describe.c: rename `oid` to avoid variable shadowing 2017-10-31 21:18 ` [PATCHv2 0/7] git describe blob Stefan Beller 2017-10-31 21:18 ` [PATCHv2 1/7] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller 2017-10-31 21:18 ` [PATCHv2 2/7] revision.h: introduce blob/tree walking in order of the commits Stefan Beller @ 2017-10-31 21:18 ` Stefan Beller 2017-10-31 21:18 ` [PATCHv2 4/7] builtin/describe.c: print debug statements earlier Stefan Beller ` (5 subsequent siblings) 8 siblings, 0 replies; 110+ messages in thread From: Stefan Beller @ 2017-10-31 21:18 UTC (permalink / raw) To: sbeller; +Cc: Johannes.Schindelin, git, jacob.keller, me The function `describe` has already a variable named `oid` declared at the beginning of the function for an object id. Do not shadow that variable with a pointer to an object id. Signed-off-by: Stefan Beller <sbeller@google.com> --- builtin/describe.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/builtin/describe.c b/builtin/describe.c index 29075dbd0f..fd61f463cf 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -381,9 +381,9 @@ static void describe(const char *arg, int last_one) } if (!match_cnt) { - struct object_id *oid = &cmit->object.oid; + struct object_id *cmit_oid = &cmit->object.oid; if (always) { - printf("%s", find_unique_abbrev(oid->hash, abbrev)); + printf("%s", find_unique_abbrev(cmit_oid->hash, abbrev)); if (suffix) printf("%s", suffix); printf("\n"); @@ -392,11 +392,11 @@ static void describe(const char *arg, int last_one) if (unannotated_cnt) die(_("No annotated tags can describe '%s'.\n" "However, there were unannotated tags: try --tags."), - oid_to_hex(oid)); + oid_to_hex(cmit_oid)); else die(_("No tags can describe '%s'.\n" "Try --always, or create some tags."), - oid_to_hex(oid)); + oid_to_hex(cmit_oid)); } QSORT(all_matches, match_cnt, compare_pt); -- 2.15.0.rc2.443.gfcc3b81c0a ^ permalink raw reply related [flat|nested] 110+ messages in thread
* [PATCHv2 4/7] builtin/describe.c: print debug statements earlier 2017-10-31 21:18 ` [PATCHv2 0/7] git describe blob Stefan Beller ` (2 preceding siblings ...) 2017-10-31 21:18 ` [PATCHv2 3/7] builtin/describe.c: rename `oid` to avoid variable shadowing Stefan Beller @ 2017-10-31 21:18 ` Stefan Beller 2017-10-31 21:31 ` Eric Sunshine 2017-10-31 21:18 ` [PATCHv2 5/7] builtin/describe.c: factor out describe_commit Stefan Beller ` (4 subsequent siblings) 8 siblings, 1 reply; 110+ messages in thread From: Stefan Beller @ 2017-10-31 21:18 UTC (permalink / raw) To: sbeller; +Cc: Johannes.Schindelin, git, jacob.keller, me For debuggers aid we'd want to print debug statements early, so introduce a new line in the debug output that describes the whole function, and the change the next debug output to describe why we need to search. Conveniently drop the arg from the second line; which will be useful in a follow up commit, that refactors the describe function. Signed-off-by: Stefan Beller <sbeller@google.com> --- builtin/describe.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/builtin/describe.c b/builtin/describe.c index fd61f463cf..3136efde31 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -293,6 +293,9 @@ static void describe(const char *arg, int last_one) unsigned long seen_commits = 0; unsigned int unannotated_cnt = 0; + if (debug) + fprintf(stderr, _("describe %s\n"), arg); + if (get_oid(arg, &oid)) die(_("Not a valid object name %s"), arg); cmit = lookup_commit_reference(&oid); @@ -316,7 +319,7 @@ static void describe(const char *arg, int last_one) if (!max_candidates) die(_("no tag exactly matches '%s'"), oid_to_hex(&cmit->object.oid)); if (debug) - fprintf(stderr, _("searching to describe %s\n"), arg); + fprintf(stderr, _("No exact match on refs or tags, searching to describe\n")); if (!have_util) { struct hashmap_iter iter; -- 2.15.0.rc2.443.gfcc3b81c0a ^ permalink raw reply related [flat|nested] 110+ messages in thread
* Re: [PATCHv2 4/7] builtin/describe.c: print debug statements earlier 2017-10-31 21:18 ` [PATCHv2 4/7] builtin/describe.c: print debug statements earlier Stefan Beller @ 2017-10-31 21:31 ` Eric Sunshine 0 siblings, 0 replies; 110+ messages in thread From: Eric Sunshine @ 2017-10-31 21:31 UTC (permalink / raw) To: Stefan Beller; +Cc: Johannes Schindelin, Git List, Jacob Keller, Kevin Daudt On Tue, Oct 31, 2017 at 5:18 PM, Stefan Beller <sbeller@google.com> wrote: > For debuggers aid we'd want to print debug statements early, so > introduce a new line in the debug output that describes the whole > function, and the change the next debug output to describe why we need s/and the/and/ ...or... s/and the/and then/ > to search. Conveniently drop the arg from the second line; which will > be useful in a follow up commit, that refactors the describe function. > > Signed-off-by: Stefan Beller <sbeller@google.com> ^ permalink raw reply [flat|nested] 110+ messages in thread
* [PATCHv2 5/7] builtin/describe.c: factor out describe_commit 2017-10-31 21:18 ` [PATCHv2 0/7] git describe blob Stefan Beller ` (3 preceding siblings ...) 2017-10-31 21:18 ` [PATCHv2 4/7] builtin/describe.c: print debug statements earlier Stefan Beller @ 2017-10-31 21:18 ` Stefan Beller 2017-10-31 21:18 ` [PATCHv2 6/7] builtin/describe.c: describe a blob Stefan Beller ` (3 subsequent siblings) 8 siblings, 0 replies; 110+ messages in thread From: Stefan Beller @ 2017-10-31 21:18 UTC (permalink / raw) To: sbeller; +Cc: Johannes.Schindelin, git, jacob.keller, me In the next patch we'll learn how to describe more than just commits, so factor out describing commits into its own function. That will make the next patches easy as we still need to describe a commit as part of describing blobs. While factoring out the functionality to describe_commit, make sure that any output to stdout is put into a strbuf, which we can print afterwards, using puts which also adds the line ending. Signed-off-by: Stefan Beller <sbeller@google.com> --- builtin/describe.c | 63 ++++++++++++++++++++++++++++++++---------------------- 1 file changed, 37 insertions(+), 26 deletions(-) diff --git a/builtin/describe.c b/builtin/describe.c index 3136efde31..9e9a5ed5d4 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -256,7 +256,7 @@ static unsigned long finish_depth_computation( return seen_commits; } -static void display_name(struct commit_name *n) +static void append_name(struct commit_name *n, struct strbuf *dst) { if (n->prio == 2 && !n->tag) { n->tag = lookup_tag(&n->oid); @@ -272,19 +272,18 @@ static void display_name(struct commit_name *n) } if (n->tag) - printf("%s", n->tag->tag); + strbuf_addstr(dst, n->tag->tag); else - printf("%s", n->path); + strbuf_addstr(dst, n->path); } -static void show_suffix(int depth, const struct object_id *oid) +static void append_suffix(int depth, const struct object_id *oid, struct strbuf *dst) { - printf("-%d-g%s", depth, find_unique_abbrev(oid->hash, abbrev)); + strbuf_addf(dst, "-%d-g%s", depth, find_unique_abbrev(oid->hash, abbrev)); } -static void describe(const char *arg, int last_one) +static void describe_commit(struct object_id *oid, struct strbuf *dst) { - struct object_id oid; struct commit *cmit, *gave_up_on = NULL; struct commit_list *list; struct commit_name *n; @@ -293,26 +292,18 @@ static void describe(const char *arg, int last_one) unsigned long seen_commits = 0; unsigned int unannotated_cnt = 0; - if (debug) - fprintf(stderr, _("describe %s\n"), arg); - - if (get_oid(arg, &oid)) - die(_("Not a valid object name %s"), arg); - cmit = lookup_commit_reference(&oid); - if (!cmit) - die(_("%s is not a valid '%s' object"), arg, commit_type); + cmit = lookup_commit_reference(oid); n = find_commit_name(&cmit->object.oid); if (n && (tags || all || n->prio == 2)) { /* * Exact match to an existing ref. */ - display_name(n); + append_name(n, dst); if (longformat) - show_suffix(0, n->tag ? &n->tag->tagged->oid : &oid); + append_suffix(0, n->tag ? &n->tag->tagged->oid : oid, dst); if (suffix) - printf("%s", suffix); - printf("\n"); + strbuf_addstr(dst, suffix); return; } @@ -386,10 +377,9 @@ static void describe(const char *arg, int last_one) if (!match_cnt) { struct object_id *cmit_oid = &cmit->object.oid; if (always) { - printf("%s", find_unique_abbrev(cmit_oid->hash, abbrev)); + strbuf_addstr(dst, find_unique_abbrev(cmit_oid->hash, abbrev)); if (suffix) - printf("%s", suffix); - printf("\n"); + strbuf_addstr(dst, suffix); return; } if (unannotated_cnt) @@ -437,15 +427,36 @@ static void describe(const char *arg, int last_one) } } - display_name(all_matches[0].name); + append_name(all_matches[0].name, dst); if (abbrev) - show_suffix(all_matches[0].depth, &cmit->object.oid); + append_suffix(all_matches[0].depth, &cmit->object.oid, dst); if (suffix) - printf("%s", suffix); - printf("\n"); + strbuf_addstr(dst, suffix); +} + +static void describe(const char *arg, int last_one) +{ + struct object_id oid; + struct commit *cmit; + struct strbuf sb = STRBUF_INIT; + + if (debug) + fprintf(stderr, _("describe %s\n"), arg); + + if (get_oid(arg, &oid)) + die(_("Not a valid object name %s"), arg); + cmit = lookup_commit_reference(&oid); + if (!cmit) + die(_("%s is not a valid '%s' object"), arg, commit_type); + + describe_commit(&oid, &sb); + + puts(sb.buf); if (!last_one) clear_commit_marks(cmit, -1); + + strbuf_release(&sb); } int cmd_describe(int argc, const char **argv, const char *prefix) -- 2.15.0.rc2.443.gfcc3b81c0a ^ permalink raw reply related [flat|nested] 110+ messages in thread
* [PATCHv2 6/7] builtin/describe.c: describe a blob 2017-10-31 21:18 ` [PATCHv2 0/7] git describe blob Stefan Beller ` (4 preceding siblings ...) 2017-10-31 21:18 ` [PATCHv2 5/7] builtin/describe.c: factor out describe_commit Stefan Beller @ 2017-10-31 21:18 ` Stefan Beller 2017-10-31 21:49 ` Eric Sunshine 2017-11-01 4:11 ` Junio C Hamano 2017-10-31 21:18 ` [PATCHv2 7/7] t6120: fix typo in test name Stefan Beller ` (2 subsequent siblings) 8 siblings, 2 replies; 110+ messages in thread From: Stefan Beller @ 2017-10-31 21:18 UTC (permalink / raw) To: sbeller; +Cc: Johannes.Schindelin, git, jacob.keller, me Sometimes users are given a hash of an object and they want to identify it further (ex.: Use verify-pack to find the largest blobs, but what are these? or [1]) "This is an interesting endeavor, because describing things is hard." -- me, upon writing this patch. When describing commits, we try to anchor them to tags or refs, as these are conceptually on a higher level than the commit. And if there is no ref or tag that matches exactly, we're out of luck. So we employ a heuristic to make up a name for the commit. These names are ambivalent, there might be different tags or refs to anchor to, and there might be different path in the DAG to travel to arrive at the commit precisely. When describing a blob, we want to describe the blob from a higher layer as well, which is a tuple of (commit, deep/path) as the tree objects involved are rather uninteresting. The same blob can be referenced by multiple commits, so how we decide which commit to use? This patch implements a rather naive approach on this: As there are no back pointers from blobs to commits in which the blob occurs, we'll start walking from any tips available, listing the blobs in-order of the commit and once we found the blob, we'll take the first commit that listed the blob. For source code this is likely not the first commit that introduced the blob, but rather the latest commit that contained the blob. For example: git describe v0.99:Makefile v0.99-5-gab6625e06a:Makefile tells us the latest commit that contained the Makefile as it was in tag v0.99 is commit v0.99-5-gab6625e06a (and at the same path), as the next commit on top v0.99-6-gb1de9de2b9 ([PATCH] Bootstrap "make dist", 2005-07-11) touches the Makefile. Let's see how this description turns out, if it is useful in day-to-day use as I have the intuition that we'd rather want to see the *first* commit that this blob was introduced to the repository (which can be achieved easily by giving the `--reverse` flag in the describe_blob rev walk). [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob Signed-off-by: Stefan Beller <sbeller@google.com> --- Documentation/git-describe.txt | 13 ++++++++- builtin/describe.c | 65 ++++++++++++++++++++++++++++++++++++++---- t/t6120-describe.sh | 15 ++++++++++ 3 files changed, 87 insertions(+), 6 deletions(-) diff --git a/Documentation/git-describe.txt b/Documentation/git-describe.txt index c924c945ba..077c3c2193 100644 --- a/Documentation/git-describe.txt +++ b/Documentation/git-describe.txt @@ -3,7 +3,7 @@ git-describe(1) NAME ---- -git-describe - Describe a commit using the most recent tag reachable from it +git-describe - Describe a commit or blob using the graph relations SYNOPSIS @@ -11,6 +11,7 @@ SYNOPSIS [verse] 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] [<commit-ish>...] 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] --dirty[=<mark>] +'git describe' [<options>] <blob-ish> DESCRIPTION ----------- @@ -24,6 +25,16 @@ By default (without --all or --tags) `git describe` only shows annotated tags. For more information about creating annotated tags see the -a and -s options to linkgit:git-tag[1]. +If the given object refers to a blob, it will be described +as `<commit-ish>:<path>`, such that the blob can be found +at `<path>` in the `<commit-ish>`. Note, that the commit is likely +not the commit that introduced the blob, but the one that was found +first; to find the commit that introduced the blob, you need to find +the commit that last touched the path, e.g. +`git log <commit-description> -- <path>`. +As blobs do not point at the commits they are contained in, +describing blobs is slow as we have to walk the whole graph. + OPTIONS ------- <commit-ish>...:: diff --git a/builtin/describe.c b/builtin/describe.c index 9e9a5ed5d4..382cbae908 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -3,6 +3,7 @@ #include "lockfile.h" #include "commit.h" #include "tag.h" +#include "blob.h" #include "refs.h" #include "builtin.h" #include "exec_cmd.h" @@ -11,8 +12,9 @@ #include "hashmap.h" #include "argv-array.h" #include "run-command.h" +#include "revision.h" +#include "list-objects.h" -#define SEEN (1u << 0) #define MAX_TAGS (FLAG_BITS - 1) static const char * const describe_usage[] = { @@ -434,6 +436,54 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst) strbuf_addstr(dst, suffix); } +struct process_commit_data { + struct object_id current_commit; + struct object_id looking_for; + struct strbuf *dst; +}; + +static void process_commit(struct commit *commit, void *data) +{ + struct process_commit_data *pcd = data; + pcd->current_commit = commit->object.oid; +} + +static void process_object(struct object *obj, const char *path, void *data) +{ + struct process_commit_data *pcd = data; + + if (!oidcmp(&pcd->looking_for, &obj->oid) && !pcd->dst->len) { + reset_revision_walk(); + describe_commit(&pcd->current_commit, pcd->dst); + strbuf_addf(pcd->dst, ":%s", path); + } +} + +static void describe_blob(struct object_id oid, struct strbuf *dst) +{ + struct rev_info revs; + struct argv_array args = ARGV_ARRAY_INIT; + struct process_commit_data pcd = { null_oid, oid, dst}; + + argv_array_pushl(&args, "internal: The first arg is not parsed", + "--all", "--reflog", /* as many starting points as possible */ + /* NEEDSWORK: --all is incompatible with worktrees for now: */ + "--single-worktree", + "--objects", + "--in-commit-order", + NULL); + + init_revisions(&revs, NULL); + if (setup_revisions(args.argc, args.argv, &revs, NULL) > 1) + BUG("setup_revisions could not handle all args?"); + + if (prepare_revision_walk(&revs)) + die("revision walk setup failed"); + + traverse_commit_list(&revs, process_commit, process_object, &pcd); + reset_revision_walk(); +} + static void describe(const char *arg, int last_one) { struct object_id oid; @@ -445,11 +495,16 @@ static void describe(const char *arg, int last_one) if (get_oid(arg, &oid)) die(_("Not a valid object name %s"), arg); - cmit = lookup_commit_reference(&oid); - if (!cmit) - die(_("%s is not a valid '%s' object"), arg, commit_type); + cmit = lookup_commit_reference_gently(&oid, 1); - describe_commit(&oid, &sb); + if (cmit) { + describe_commit(&oid, &sb); + } else { + if (lookup_blob(&oid)) + describe_blob(oid, &sb); + else + die(_("%s is neither a commit nor blob"), arg); + } puts(sb.buf); diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh index 1c0e8659d9..3be01316e8 100755 --- a/t/t6120-describe.sh +++ b/t/t6120-describe.sh @@ -310,6 +310,21 @@ test_expect_success 'describe ignoring a borken submodule' ' grep broken out ' +test_expect_success 'describe a blob at a tag' ' + echo "make it a unique blob" >file && + git add file && git commit -m "content in file" && + git tag -a -m "latest annotated tag" unique-file && + git describe HEAD:file >actual && + echo "unique-file:file" >expect && + test_cmp expect actual +' + +test_expect_success 'describe a surviving blob' ' + git commit --allow-empty -m "empty commit" && + git describe HEAD:file >actual && + grep unique-file-1-g actual +' + test_expect_failure ULIMIT_STACK_SIZE 'name-rev works in a deep repo' ' i=1 && while test $i -lt 8000 -- 2.15.0.rc2.443.gfcc3b81c0a ^ permalink raw reply related [flat|nested] 110+ messages in thread
* Re: [PATCHv2 6/7] builtin/describe.c: describe a blob 2017-10-31 21:18 ` [PATCHv2 6/7] builtin/describe.c: describe a blob Stefan Beller @ 2017-10-31 21:49 ` Eric Sunshine 2017-11-01 19:51 ` Stefan Beller 2017-11-01 4:11 ` Junio C Hamano 1 sibling, 1 reply; 110+ messages in thread From: Eric Sunshine @ 2017-10-31 21:49 UTC (permalink / raw) To: Stefan Beller; +Cc: Johannes Schindelin, Git List, Jacob Keller, Kevin Daudt On Tue, Oct 31, 2017 at 5:18 PM, Stefan Beller <sbeller@google.com> wrote: > When describing commits, we try to anchor them to tags or refs, as these > are conceptually on a higher level than the commit. And if there is no ref > or tag that matches exactly, we're out of luck. So we employ a heuristic > to make up a name for the commit. These names are ambivalent, there might I guess you meant s/ambivalent/ambiguous/ ? > be different tags or refs to anchor to, and there might be different > path in the DAG to travel to arrive at the commit precisely. > > [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob > Signed-off-by: Stefan Beller <sbeller@google.com> Blank line before sign-off. > --- > diff --git a/builtin/describe.c b/builtin/describe.c > @@ -445,11 +495,16 @@ static void describe(const char *arg, int last_one) > > if (get_oid(arg, &oid)) > die(_("Not a valid object name %s"), arg); > - cmit = lookup_commit_reference(&oid); > - if (!cmit) > - die(_("%s is not a valid '%s' object"), arg, commit_type); > + cmit = lookup_commit_reference_gently(&oid, 1); > > - describe_commit(&oid, &sb); > + if (cmit) { > + describe_commit(&oid, &sb); > + } else { > + if (lookup_blob(&oid)) > + describe_blob(oid, &sb); > + else > + die(_("%s is neither a commit nor blob"), arg); > + } Not at all worth a re-roll, but less nesting and a bit less noisy: if (cmt) describe_commit(...); else if (lookup_blob(...)) describe_blob(...); else die(...); ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 6/7] builtin/describe.c: describe a blob 2017-10-31 21:49 ` Eric Sunshine @ 2017-11-01 19:51 ` Stefan Beller 0 siblings, 0 replies; 110+ messages in thread From: Stefan Beller @ 2017-11-01 19:51 UTC (permalink / raw) To: Eric Sunshine; +Cc: Johannes Schindelin, Git List, Jacob Keller, Kevin Daudt On Tue, Oct 31, 2017 at 2:49 PM, Eric Sunshine <sunshine@sunshineco.com> wrote: >> to make up a name for the commit. These names are ambivalent, there might > > I guess you meant s/ambivalent/ambiguous/ ? Indeed! ambivalent early 20th century: from ambivalence (from German Ambivalenz ), on the pattern of equivalent. In German ambivalent (~synonym to mehrdeutig, "multi meaning") actually means ambiguous, one of the false friends! Thanks for pointing out. >> [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob >> Signed-off-by: Stefan Beller <sbeller@google.com> > > Blank line before sign-off. fixes. > > Not at all worth a re-roll, but less nesting and a bit less noisy: fixed. ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 6/7] builtin/describe.c: describe a blob 2017-10-31 21:18 ` [PATCHv2 6/7] builtin/describe.c: describe a blob Stefan Beller 2017-10-31 21:49 ` Eric Sunshine @ 2017-11-01 4:11 ` Junio C Hamano 2017-11-01 12:32 ` Johannes Schindelin 2017-11-01 21:28 ` Stefan Beller 1 sibling, 2 replies; 110+ messages in thread From: Junio C Hamano @ 2017-11-01 4:11 UTC (permalink / raw) To: Stefan Beller; +Cc: Johannes.Schindelin, git, jacob.keller, me Stefan Beller <sbeller@google.com> writes: > +If the given object refers to a blob, it will be described > +as `<commit-ish>:<path>`, such that the blob can be found > +at `<path>` in the `<commit-ish>`. Note, that the commit is likely Does the code describe a9dbc3f12c as v2.15.0:GIT-VERSION-GEN, or would it always be <commit>:<path>? > +not the commit that introduced the blob, but the one that was found > +first; to find the commit that introduced the blob, you need to find Because we do not want to descend into the same tree object we saw earlier, this "we show the name we happened to find first without attempting to refine it for a better name" is a rather fundamental limitation of this approach. Hopefully we can later improve it with more thought, but for now it is better than nothing (and much better than "git log --raw | grep"). > +the commit that last touched the path, e.g. > +`git log <commit-description> -- <path>`. > +As blobs do not point at the commits they are contained in, > +describing blobs is slow as we have to walk the whole graph. Is it true that we walk the "whole graph", or do we stop immediately we find a path to the blob? The former makes it sound like completely useless. > -#define SEEN (1u << 0) Interesting. Now we include revision.h this becomes redundant. > +static void describe_blob(struct object_id oid, struct strbuf *dst) > +{ > + struct rev_info revs; > + struct argv_array args = ARGV_ARRAY_INIT; > + struct process_commit_data pcd = { null_oid, oid, dst}; > + > + argv_array_pushl(&args, "internal: The first arg is not parsed", > + "--all", "--reflog", /* as many starting points as possible */ Interesting. Do we also search in the reflog in the normal "describe" operation? If not, perhaps we should? I wonder what's the performance implications would be. That tangent aside, as "describe blob" is most likely a "what reaches and is holding onto this blob?" query, finding something that can only be reached from a reflog entry would make it more helpful than without the option. > + /* NEEDSWORK: --all is incompatible with worktrees for now: */ What's that last colon about? > + "--single-worktree", > + "--objects", > + "--in-commit-order", > + NULL); > + > + init_revisions(&revs, NULL); > + if (setup_revisions(args.argc, args.argv, &revs, NULL) > 1) > + BUG("setup_revisions could not handle all args?"); > + > + if (prepare_revision_walk(&revs)) > + die("revision walk setup failed"); > + > + traverse_commit_list(&revs, process_commit, process_object, &pcd); > + reset_revision_walk(); > +} > + OK. > diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh > index 1c0e8659d9..3be01316e8 100755 > --- a/t/t6120-describe.sh > +++ b/t/t6120-describe.sh > @@ -310,6 +310,21 @@ test_expect_success 'describe ignoring a borken submodule' ' > grep broken out > ' > > +test_expect_success 'describe a blob at a tag' ' > + echo "make it a unique blob" >file && > + git add file && git commit -m "content in file" && > + git tag -a -m "latest annotated tag" unique-file && > + git describe HEAD:file >actual && > + echo "unique-file:file" >expect && > + test_cmp expect actual > +' > + > +test_expect_success 'describe a surviving blob' ' > + git commit --allow-empty -m "empty commit" && > + git describe HEAD:file >actual && > + grep unique-file-1-g actual > +' > + I am not sure what "surviving" means in this context. The word hinted that the test would be finding a blob that only appears in a commit that only appears as a reflog entry, but that wasn't the case, which was a bit disappointing. ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 6/7] builtin/describe.c: describe a blob 2017-11-01 4:11 ` Junio C Hamano @ 2017-11-01 12:32 ` Johannes Schindelin 2017-11-01 17:59 ` Stefan Beller 2017-11-01 21:28 ` Stefan Beller 1 sibling, 1 reply; 110+ messages in thread From: Johannes Schindelin @ 2017-11-01 12:32 UTC (permalink / raw) To: Junio C Hamano; +Cc: Stefan Beller, git, jacob.keller, me Hi Junio, On Wed, 1 Nov 2017, Junio C Hamano wrote: > Stefan Beller <sbeller@google.com> writes: > > > +If the given object refers to a blob, it will be described > > +as `<commit-ish>:<path>`, such that the blob can be found > > +at `<path>` in the `<commit-ish>`. Note, that the commit is likely > > Does the code describe a9dbc3f12c as v2.15.0:GIT-VERSION-GEN, or > would it always be <commit>:<path>? As the blob is described using this function: static void process_object(struct object *obj, const char *path, void *data) { struct process_commit_data *pcd = data; if (!oidcmp(&pcd->looking_for, &obj->oid) && !pcd->dst->len) { reset_revision_walk(); describe_commit(&pcd->current_commit, pcd->dst); strbuf_addf(pcd->dst, ":%s", path); } } i.e. as `describe_commit()` is used on the commit part, the answer to your question is: the former. I guess that is why Stefan wrote `commit-ish` instead of `commit` ;-) > > +not the commit that introduced the blob, but the one that was found > > +first; to find the commit that introduced the blob, you need to find > > Because we do not want to descend into the same tree object we saw > earlier, this "we show the name we happened to find first without > attempting to refine it for a better name" is a rather fundamental > limitation of this approach. Hopefully we can later improve it with > more thought, but for now it is better than nothing (and much better > than "git log --raw | grep"). Indeed. Ciao, Dscho ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 6/7] builtin/describe.c: describe a blob 2017-11-01 12:32 ` Johannes Schindelin @ 2017-11-01 17:59 ` Stefan Beller 2017-11-01 21:05 ` Jacob Keller 0 siblings, 1 reply; 110+ messages in thread From: Stefan Beller @ 2017-11-01 17:59 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Junio C Hamano, git, Jacob Keller, Kevin Daudt >> Does the code describe a9dbc3f12c as v2.15.0:GIT-VERSION-GEN, or >> would it always be <commit>:<path>? > > As the blob is described using this function: > > static void process_object(struct object *obj, const char *path, void *data) > { > struct process_commit_data *pcd = data; > > if (!oidcmp(&pcd->looking_for, &obj->oid) && !pcd->dst->len) { > reset_revision_walk(); > describe_commit(&pcd->current_commit, pcd->dst); > strbuf_addf(pcd->dst, ":%s", path); > } > } > > i.e. as `describe_commit()` is used on the commit part, the answer to your > question is: the former. I guess that is why Stefan wrote `commit-ish` > instead of `commit` ;-) $ ./git describe a9dbc3f12c warning: reflog of 'HEAD' references pruned commits v2.15.0-7-g980e40477f:GIT-VERSION-GEN So as noted below, this output is less than ideal, but technically correct as v2.15.0-7-g980e40477f contains that blob as well (you don't have these; it is this very series consisting of 7 patches on top of 2.15, none of them touching GIT-VERSION-GEN, hence that blob stays intact.) The way Junio asked, we actually may prefer the commit-ish to give that commit that introduced the blob for the first time, i.e. add the --reverse to the graph walking. ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 6/7] builtin/describe.c: describe a blob 2017-11-01 17:59 ` Stefan Beller @ 2017-11-01 21:05 ` Jacob Keller 2017-11-01 22:12 ` Johannes Schindelin 0 siblings, 1 reply; 110+ messages in thread From: Jacob Keller @ 2017-11-01 21:05 UTC (permalink / raw) To: Stefan Beller, Johannes Schindelin; +Cc: Junio C Hamano, git, Kevin Daudt On November 1, 2017 10:59:08 AM PDT, Stefan Beller <sbeller@google.com> wrote: >>> Does the code describe a9dbc3f12c as v2.15.0:GIT-VERSION-GEN, or >>> would it always be <commit>:<path>? >> >> As the blob is described using this function: >> >> static void process_object(struct object *obj, const char *path, void >*data) >> { >> struct process_commit_data *pcd = data; >> >> if (!oidcmp(&pcd->looking_for, &obj->oid) && !pcd->dst->len) { >> reset_revision_walk(); >> describe_commit(&pcd->current_commit, pcd->dst); >> strbuf_addf(pcd->dst, ":%s", path); >> } >> } >> >> i.e. as `describe_commit()` is used on the commit part, the answer to >your >> question is: the former. I guess that is why Stefan wrote >`commit-ish` >> instead of `commit` ;-) > >$ ./git describe a9dbc3f12c >warning: reflog of 'HEAD' references pruned commits >v2.15.0-7-g980e40477f:GIT-VERSION-GEN > >So as noted below, this output is less than ideal, but technically >correct as >v2.15.0-7-g980e40477f contains that blob as well (you don't have these; >it is this very series consisting of 7 patches on top of 2.15, none of >them >touching GIT-VERSION-GEN, hence that blob stays intact.) > >The way Junio asked, we actually may prefer the commit-ish to give >that commit that introduced the blob for the first time, i.e. add the >--reverse to the graph walking. I know id prefer the first commit that introduced the blob. That's what describing a commit does, it finds the oldest tag prior to the commit, while --contains finds the first tag that contains the commit as an ancestor. Neither of these things is a perfect match for blobs since we want to only find an exact commit that still has that blob. I think finding the first commit to introduce a blob is generally more useful. Finding the last commit from some branch before the blob was changed also might be useful but I don't know if that's exactly what you implemented here... In either respect, unless it's super easy to implement, going with what you have now is better than nothing and we can improve it in the future should someone wish to take up the work. -Jake -- Sent from my Android device with K-9 Mail. Please excuse my brevity. ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 6/7] builtin/describe.c: describe a blob 2017-11-01 21:05 ` Jacob Keller @ 2017-11-01 22:12 ` Johannes Schindelin 2017-11-01 22:21 ` Stefan Beller 0 siblings, 1 reply; 110+ messages in thread From: Johannes Schindelin @ 2017-11-01 22:12 UTC (permalink / raw) To: Jacob Keller; +Cc: Stefan Beller, Junio C Hamano, git, Kevin Daudt Hi, On Wed, 1 Nov 2017, Jacob Keller wrote: > On November 1, 2017 10:59:08 AM PDT, Stefan Beller <sbeller@google.com> wrote: > >>> Does the code describe a9dbc3f12c as v2.15.0:GIT-VERSION-GEN, or > >>> would it always be <commit>:<path>? > >> > >> As the blob is described using this function: > >> > >> static void process_object(struct object *obj, const char *path, void > >*data) > >> { > >> struct process_commit_data *pcd = data; > >> > >> if (!oidcmp(&pcd->looking_for, &obj->oid) && !pcd->dst->len) { > >> reset_revision_walk(); > >> describe_commit(&pcd->current_commit, pcd->dst); > >> strbuf_addf(pcd->dst, ":%s", path); > >> } > >> } > >> > >> i.e. as `describe_commit()` is used on the commit part, the answer to > >your > >> question is: the former. I guess that is why Stefan wrote > >`commit-ish` > >> instead of `commit` ;-) > > > >$ ./git describe a9dbc3f12c > >warning: reflog of 'HEAD' references pruned commits > >v2.15.0-7-g980e40477f:GIT-VERSION-GEN > > > >So as noted below, this output is less than ideal, but technically > >correct as > >v2.15.0-7-g980e40477f contains that blob as well (you don't have these; > >it is this very series consisting of 7 patches on top of 2.15, none of > >them > >touching GIT-VERSION-GEN, hence that blob stays intact.) > > > >The way Junio asked, we actually may prefer the commit-ish to give > >that commit that introduced the blob for the first time, i.e. add the > >--reverse to the graph walking. > > I know id prefer the first commit that introduced the blob. That's what > describing a commit does, it finds the oldest tag prior to the commit, > while --contains finds the first tag that contains the commit as an > ancestor. It is very easy to wish for "the oldest commit introducing a blob". But since we're in a DAG, this question is not necessarily easy to answer: - A - B - C \ / D Let's assume that all commits except A have the blob, and that B and D (for some reason) have the same author/committer dates. Do you identify B or D as the commit introducing the blob? Ciao, Dscho ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 6/7] builtin/describe.c: describe a blob 2017-11-01 22:12 ` Johannes Schindelin @ 2017-11-01 22:21 ` Stefan Beller 2017-11-01 22:41 ` Johannes Schindelin 0 siblings, 1 reply; 110+ messages in thread From: Stefan Beller @ 2017-11-01 22:21 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Jacob Keller, Junio C Hamano, git, Kevin Daudt >> I know id prefer the first commit that introduced the blob. That's what >> describing a commit does, it finds the oldest tag prior to the commit, >> while --contains finds the first tag that contains the commit as an >> ancestor. > > It is very easy to wish for "the oldest commit introducing a blob". But > since we're in a DAG, this question is not necessarily easy to answer: > > - A - B - C > \ / > D > > Let's assume that all commits except A have the blob, and that B and D > (for some reason) have the same author/committer dates. > > Do you identify B or D as the commit introducing the blob? The current implementation gives C, though. (assuming C is HEAD, and A is ancient) With the --reverse flag one of B or D is given (not sure which, depends on the exact order). ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 6/7] builtin/describe.c: describe a blob 2017-11-01 22:21 ` Stefan Beller @ 2017-11-01 22:41 ` Johannes Schindelin 2017-11-01 22:53 ` Stefan Beller ` (2 more replies) 0 siblings, 3 replies; 110+ messages in thread From: Johannes Schindelin @ 2017-11-01 22:41 UTC (permalink / raw) To: Stefan Beller; +Cc: Jacob Keller, Junio C Hamano, git, Kevin Daudt Hi Stefan, On Wed, 1 Nov 2017, Stefan Beller wrote: > >> I know id prefer the first commit that introduced the blob. That's > >> what describing a commit does, it finds the oldest tag prior to the > >> commit, while --contains finds the first tag that contains the commit > >> as an ancestor. > > > > It is very easy to wish for "the oldest commit introducing a blob". But > > since we're in a DAG, this question is not necessarily easy to answer: > > > > - A - B - C > > \ / > > D > > > > Let's assume that all commits except A have the blob, and that B and D > > (for some reason) have the same author/committer dates. > > > > Do you identify B or D as the commit introducing the blob? > > The current implementation gives C, though. > (assuming C is HEAD, and A is ancient) > > With the --reverse flag one of B or D is given (not sure which, > depends on the exact order). Sure, but it is still a tricky thing. Imagine - A1 - B1 - A2 - B2 - B3 where all the B* commits have the blob. Do you really want to report B1 rather than B2 as the commit introducing the blob? (I would prefer B2...) Ciao, Dscho ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 6/7] builtin/describe.c: describe a blob 2017-11-01 22:41 ` Johannes Schindelin @ 2017-11-01 22:53 ` Stefan Beller 2017-11-02 6:05 ` Jacob Keller 2017-11-02 7:23 ` Andreas Schwab 2 siblings, 0 replies; 110+ messages in thread From: Stefan Beller @ 2017-11-01 22:53 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Jacob Keller, Junio C Hamano, git, Kevin Daudt On Wed, Nov 1, 2017 at 3:41 PM, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: >> The current implementation gives C, though. >> (assuming C is HEAD, and A is ancient) >> >> With the --reverse flag one of B or D is given (not sure which, >> depends on the exact order). > > Sure, but it is still a tricky thing. Imagine > > - A1 - B1 - A2 - B2 - B3 > > where all the B* commits have the blob. Do you really want to report B1 > rather than B2 as the commit introducing the blob? (I would prefer B2...) You are correct, that B2 is also important to find. Ideally you want a list of all adjacent commit groups, but that is too tricky to implement for now, deferring it to another contribution in the future. The current proposal would give you B3, such that `git log B3 -- path` will find you B2. ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 6/7] builtin/describe.c: describe a blob 2017-11-01 22:41 ` Johannes Schindelin 2017-11-01 22:53 ` Stefan Beller @ 2017-11-02 6:05 ` Jacob Keller 2017-11-03 5:18 ` Junio C Hamano 2017-11-02 7:23 ` Andreas Schwab 2 siblings, 1 reply; 110+ messages in thread From: Jacob Keller @ 2017-11-02 6:05 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Stefan Beller, Junio C Hamano, git, Kevin Daudt On Wed, Nov 1, 2017 at 3:41 PM, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > Hi Stefan, > > On Wed, 1 Nov 2017, Stefan Beller wrote: > >> >> I know id prefer the first commit that introduced the blob. That's >> >> what describing a commit does, it finds the oldest tag prior to the >> >> commit, while --contains finds the first tag that contains the commit >> >> as an ancestor. >> > >> > It is very easy to wish for "the oldest commit introducing a blob". But >> > since we're in a DAG, this question is not necessarily easy to answer: >> > >> > - A - B - C >> > \ / >> > D >> > >> > Let's assume that all commits except A have the blob, and that B and D >> > (for some reason) have the same author/committer dates. >> > >> > Do you identify B or D as the commit introducing the blob? >> >> The current implementation gives C, though. >> (assuming C is HEAD, and A is ancient) >> >> With the --reverse flag one of B or D is given (not sure which, >> depends on the exact order). > > Sure, but it is still a tricky thing. Imagine > > - A1 - B1 - A2 - B2 - B3 > > where all the B* commits have the blob. Do you really want to report B1 > rather than B2 as the commit introducing the blob? (I would prefer B2...) > > Ciao, > Dscho Yes, but I'd rather we find either B1 or B2, rather than B3, or in the first case, I'd rather we *at least* provide B or D, rather than C. Or: A - B1 - C -E - ... - Z \ B2 / In this example, if everything except a had the blob, we'd find z, which I think isn't as useful as finding one of B or D. Sure we can't find a perfect answer, but I'd rather find a commit which introduced as blob as one that adds the blob, rather than one that happens to point to it because it has history which includes it. I think both questions are valuable, the first is "which commit last had this blob", the second is "which commit first introduced this blob", neither of which can always give a definitive answer. It really depends on what question you're asking up front. Thanks, Jake ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 6/7] builtin/describe.c: describe a blob 2017-11-02 6:05 ` Jacob Keller @ 2017-11-03 5:18 ` Junio C Hamano 2017-11-03 6:55 ` Jacob Keller 0 siblings, 1 reply; 110+ messages in thread From: Junio C Hamano @ 2017-11-03 5:18 UTC (permalink / raw) To: Jacob Keller; +Cc: Johannes Schindelin, Stefan Beller, git, Kevin Daudt Jacob Keller <jacob.keller@gmail.com> writes: > I think both questions are valuable, the first is "which commit last > had this blob", the second is "which commit first introduced this > blob", neither of which can always give a definitive answer. It really > depends on what question you're asking up front. Given that "describe" is about giving an object _a_ name that is plausibly useful to refer to it, it is not a good match for the above query that wants to know where it came from, how long it remains in the tree and where it ceases to be relevant. In order to support that use case, a totally different and possibly more useful avenue would be to think how this can be hooked into "git log" machinery. A refresher for how "git log [--options] <pathspec>" works may be beneficial. We walk history and compare the tree of the commit we are looking at with the tree of its parent commits. If everything within <pathspec> is the same, we mark the transition between the parent and our commit TREESAME (other commits, i.e. the ones that have meaningful change within <pathspec>, are !TREESAME). Then the output routine presents the set of commits that includes commits that are !TREESAME, within the constraints of the --options given (e.g. with --full-history, sides of a merge that is TREESAME may still be shown to preserve connectedness of the resulting graph). It is easy to imagine that we can restrict "git log" traversal with a "--blobchange=<blob>" option instead of (or in addition to) the limitation <pathspec> gives us. Instead of treating a commit whose diff against its parent commit has any filepair that is different within <pathspec> as "!TREESAME", we can treat a commit whose diff against its parent commit has a filepair that has the <blob> on either side of the filepair as "!TREESAME" (in other words, we ignore a transition that is not about introducing or forgetting the <blob> we are looking for as an "interesting change"). That would give you a commit history graph in which only (and all) such commits that either adds or removes the <blob> in it. Hmm? ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 6/7] builtin/describe.c: describe a blob 2017-11-03 5:18 ` Junio C Hamano @ 2017-11-03 6:55 ` Jacob Keller 2017-11-03 15:02 ` Junio C Hamano 0 siblings, 1 reply; 110+ messages in thread From: Jacob Keller @ 2017-11-03 6:55 UTC (permalink / raw) To: Junio C Hamano; +Cc: Johannes Schindelin, Stefan Beller, git, Kevin Daudt On Thu, Nov 2, 2017 at 10:18 PM, Junio C Hamano <gitster@pobox.com> wrote: > Jacob Keller <jacob.keller@gmail.com> writes: > >> I think both questions are valuable, the first is "which commit last >> had this blob", the second is "which commit first introduced this >> blob", neither of which can always give a definitive answer. It really >> depends on what question you're asking up front. > > Given that "describe" is about giving an object _a_ name that is > plausibly useful to refer to it, it is not a good match for the > above query that wants to know where it came from, how long it > remains in the tree and where it ceases to be relevant. In order to > support that use case, a totally different and possibly more useful > avenue would be to think how this can be hooked into "git log" > machinery. > > A refresher for how "git log [--options] <pathspec>" works may be > beneficial. We walk history and compare the tree of the commit we > are looking at with the tree of its parent commits. If everything > within <pathspec> is the same, we mark the transition between the > parent and our commit TREESAME (other commits, i.e. the ones that > have meaningful change within <pathspec>, are !TREESAME). Then the > output routine presents the set of commits that includes commits > that are !TREESAME, within the constraints of the --options given > (e.g. with --full-history, sides of a merge that is TREESAME may > still be shown to preserve connectedness of the resulting graph). > > It is easy to imagine that we can restrict "git log" traversal with > a "--blobchange=<blob>" option instead of (or in addition to) the > limitation <pathspec> gives us. Instead of treating a commit whose > diff against its parent commit has any filepair that is different > within <pathspec> as "!TREESAME", we can treat a commit whose diff > against its parent commit has a filepair that has the <blob> on > either side of the filepair as "!TREESAME" (in other words, we > ignore a transition that is not about introducing or forgetting the > <blob> we are looking for as an "interesting change"). That would > give you a commit history graph in which only (and all) such commits > that either adds or removes the <blob> in it. > > Hmm? This seems quite useful in the context of figuring out how a file got to such a state. This is useful to me, since if I know the state of a file (ie: it's exact contents) I can determine the blob name, and then use that to lookup where it was introduced. Thanks, Jake ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 6/7] builtin/describe.c: describe a blob 2017-11-03 6:55 ` Jacob Keller @ 2017-11-03 15:02 ` Junio C Hamano 0 siblings, 0 replies; 110+ messages in thread From: Junio C Hamano @ 2017-11-03 15:02 UTC (permalink / raw) To: git; +Cc: Johannes Schindelin, Stefan Beller, Kevin Daudt, Jacob Keller Jacob Keller <jacob.keller@gmail.com> writes: > On Thu, Nov 2, 2017 at 10:18 PM, Junio C Hamano <gitster@pobox.com> wrote: >> ... >> It is easy to imagine that we can restrict "git log" traversal with >> a "--blobchange=<blob>" option instead of (or in addition to) the >> limitation <pathspec> gives us. Instead of treating a commit whose >> diff against its parent commit has any filepair that is different >> within <pathspec> as "!TREESAME", we can treat a commit whose diff >> against its parent commit has a filepair that has the <blob> on >> either side of the filepair as "!TREESAME" (in other words, we >> ignore a transition that is not about introducing or forgetting the >> <blob> we are looking for as an "interesting change"). That would >> give you a commit history graph in which only (and all) such commits >> that either adds or removes the <blob> in it. >> >> Hmm? > > This seems quite useful in the context of figuring out how a file got > to such a state. This is useful to me, since if I know the state of a > file (ie: it's exact contents) I can determine the blob name, and then > use that to lookup where it was introduced. This is probably a bit harder than an average #leftoverbits, but if somebody wants to get their hands dirty, it should be reasonably straightforward. The needed changes would roughly go like so: - Add "struct oid *blob_change;" to diff_options, initialized to NULL. - Teach diff_opt_parse() a new option "--blobchange=<blob>". Allocate a struct oid when you parse this option and point at it with the blob_change field above. - Write diffcore-blobchange.c, modeled after diffcore-pickaxe.c, but this should be a lot simpler (as there is no need for an equivalent for "pickaxe-all" option). It would - prepare an empty diff_queue_struct "outq"; - iterate over the given diff_queue "q", and - a filepair p whose p->one is blob_change and p->two is not, (or the other way around) is added to outq with diff_q() - a filepair whose p->one/p->two do not involve blob_change is freed with diff_free_filepair(). - replace "q" with "outq". - Add a call to diffcore_blobchange() early in diffcore_std(), probably immediately after skip-stat-unmatch, when blob_change field is not NULL. - Arrange that blob_change is propagated to revs->pruning in setup_revisions(). ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 6/7] builtin/describe.c: describe a blob 2017-11-01 22:41 ` Johannes Schindelin 2017-11-01 22:53 ` Stefan Beller 2017-11-02 6:05 ` Jacob Keller @ 2017-11-02 7:23 ` Andreas Schwab 2017-11-02 18:18 ` Stefan Beller 2 siblings, 1 reply; 110+ messages in thread From: Andreas Schwab @ 2017-11-02 7:23 UTC (permalink / raw) To: Johannes Schindelin Cc: Stefan Beller, Jacob Keller, Junio C Hamano, git, Kevin Daudt On Nov 01 2017, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > Sure, but it is still a tricky thing. Imagine > > - A1 - B1 - A2 - B2 - B3 > > where all the B* commits have the blob. Do you really want to report B1 > rather than B2 as the commit introducing the blob? (I would prefer B2...) What if B3 renames or copies the blob? Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 6/7] builtin/describe.c: describe a blob 2017-11-02 7:23 ` Andreas Schwab @ 2017-11-02 18:18 ` Stefan Beller 2017-11-03 12:05 ` Johannes Schindelin 0 siblings, 1 reply; 110+ messages in thread From: Stefan Beller @ 2017-11-02 18:18 UTC (permalink / raw) To: Andreas Schwab Cc: Johannes Schindelin, Jacob Keller, Junio C Hamano, git, Kevin Daudt On Thu, Nov 2, 2017 at 12:23 AM, Andreas Schwab <schwab@linux-m68k.org> wrote: > On Nov 01 2017, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > >> Sure, but it is still a tricky thing. Imagine >> >> - A1 - B1 - A2 - B2 - B3 >> >> where all the B* commits have the blob. Do you really want to report B1 >> rather than B2 as the commit introducing the blob? (I would prefer B2...) > > What if B3 renames or copies the blob? > > Andreas. With the current proposed patch you'd find B3, and then use the diff machinery to digg deeper from there (renames/copies ought to be easy to detect already?) So with a copy B3 might be a better start than B1, as starting from B1 you would not find B3 easily. For a rename, I would think a reverse log/blame on B1:path may help. With that said, I think I'll just reroll the series with the current logic fixing the other minor issues that were brought up as B3 seems to be the most versatile (though not optimal) answer for many use cases. Thanks for that thought, Stefan ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 6/7] builtin/describe.c: describe a blob 2017-11-02 18:18 ` Stefan Beller @ 2017-11-03 12:05 ` Johannes Schindelin 0 siblings, 0 replies; 110+ messages in thread From: Johannes Schindelin @ 2017-11-03 12:05 UTC (permalink / raw) To: Stefan Beller Cc: Andreas Schwab, Jacob Keller, Junio C Hamano, git, Kevin Daudt Hi Stefan, On Thu, 2 Nov 2017, Stefan Beller wrote: > On Thu, Nov 2, 2017 at 12:23 AM, Andreas Schwab <schwab@linux-m68k.org> wrote: > > On Nov 01 2017, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > > > >> Sure, but it is still a tricky thing. Imagine > >> > >> - A1 - B1 - A2 - B2 - B3 > >> > >> where all the B* commits have the blob. Do you really want to report B1 > >> rather than B2 as the commit introducing the blob? (I would prefer B2...) > > > > What if B3 renames or copies the blob? > > > > Andreas. > > With the current proposed patch you'd find B3, and then use the diff machinery > to digg deeper from there (renames/copies ought to be easy to detect already?) > > So with a copy B3 might be a better start than B1, as starting from B1 you > would not find B3 easily. > > For a rename, I would think a reverse log/blame on B1:path may help. > > With that said, I think I'll just reroll the series with the current logic > fixing the other minor issues that were brought up as B3 seems to > be the most versatile (though not optimal) answer for many use cases. I know this is a bit of semantics, but I disagree that B3 is the most versatile. For my use cases, `git describe` comes in handy relatively rarely, and usually only when I want to know what version some work was based on. In that respect, B2 would be the most versatile answer. However, if you say that B3 is the easiest answer to explain, I whole-heartedly agree. Any other answer would be necessarily more complicated to reason about, given that we're operating on a DAG, i.e. not always on a linear history. Ciao, Dscho ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv2 6/7] builtin/describe.c: describe a blob 2017-11-01 4:11 ` Junio C Hamano 2017-11-01 12:32 ` Johannes Schindelin @ 2017-11-01 21:28 ` Stefan Beller 1 sibling, 0 replies; 110+ messages in thread From: Stefan Beller @ 2017-11-01 21:28 UTC (permalink / raw) To: Junio C Hamano; +Cc: Johannes Schindelin, git, Jacob Keller, Kevin Daudt On Tue, Oct 31, 2017 at 9:11 PM, Junio C Hamano <gitster@pobox.com> wrote: > Stefan Beller <sbeller@google.com> writes: > >> +If the given object refers to a blob, it will be described >> +as `<commit-ish>:<path>`, such that the blob can be found >> +at `<path>` in the `<commit-ish>`. Note, that the commit is likely > > Does the code describe a9dbc3f12c as v2.15.0:GIT-VERSION-GEN, or > would it always be <commit>:<path>? > >> +not the commit that introduced the blob, but the one that was found >> +first; to find the commit that introduced the blob, you need to find > > Because we do not want to descend into the same tree object we saw > earlier, this "we show the name we happened to find first without > attempting to refine it for a better name" is a rather fundamental > limitation of this approach. Hopefully we can later improve it with > more thought, but for now it is better than nothing (and much better > than "git log --raw | grep"). ok. > >> +the commit that last touched the path, e.g. >> +`git log <commit-description> -- <path>`. >> +As blobs do not point at the commits they are contained in, >> +describing blobs is slow as we have to walk the whole graph. > > Is it true that we walk the "whole graph", or do we stop immediately > we find a path to the blob? The former makes it sound like > completely useless. Unfortunately we walk the whole graph, as I have not figured out how to stop the walking from the callback in 'traverse_commit_list'. I assume I have to modify the struct rev_info that we operate on, clearing any pending commits? > >> -#define SEEN (1u << 0) > > Interesting. Now we include revision.h this becomes redundant. > Correct. In a way this small part is a revert of 8713ab3079 (Improve git-describe performance by reducing revision listing., 2007-01-13) >> + argv_array_pushl(&args, "internal: The first arg is not parsed", >> + "--all", "--reflog", /* as many starting points as possible */ > > Interesting. > > Do we also search in the reflog in the normal "describe" operation? > If not, perhaps we should? I wonder what's the performance > implications would be. "normal" git describe doesn't need to walk the whole graph as we only walk down from the given commit-ish until a land mark is found. For --contains, this might be an interesting though, as there we also have to "walk backwards without pointers to follow". > > That tangent aside, as "describe blob" is most likely a "what > reaches and is holding onto this blob?" query, finding something > that can only be reached from a reflog entry would make it more > helpful than without the option. Yeah that is my reasoning as well. > >> + /* NEEDSWORK: --all is incompatible with worktrees for now: */ > > What's that last colon about? will replace with a dot, it ought to hint at the line that follows, the --single-worktree flag. > >> + "--single-worktree", >> + "--objects", >> + "--in-commit-order", >> + NULL); >> + >> + init_revisions(&revs, NULL); >> + if (setup_revisions(args.argc, args.argv, &revs, NULL) > 1) >> + BUG("setup_revisions could not handle all args?"); >> + >> + if (prepare_revision_walk(&revs)) >> + die("revision walk setup failed"); >> + >> + traverse_commit_list(&revs, process_commit, process_object, &pcd); >> + reset_revision_walk(); >> +} >> + > > OK. > >> diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh >> index 1c0e8659d9..3be01316e8 100755 >> --- a/t/t6120-describe.sh >> +++ b/t/t6120-describe.sh >> @@ -310,6 +310,21 @@ test_expect_success 'describe ignoring a borken submodule' ' >> grep broken out >> ' >> >> +test_expect_success 'describe a blob at a tag' ' >> + echo "make it a unique blob" >file && >> + git add file && git commit -m "content in file" && >> + git tag -a -m "latest annotated tag" unique-file && >> + git describe HEAD:file >actual && >> + echo "unique-file:file" >expect && >> + test_cmp expect actual >> +' >> + >> +test_expect_success 'describe a surviving blob' ' >> + git commit --allow-empty -m "empty commit" && >> + git describe HEAD:file >actual && >> + grep unique-file-1-g actual >> +' >> + > > I am not sure what "surviving" means in this context. Maybe "unchanged", "still kept around" ? > The word > hinted that the test would be finding a blob that only appears in a > commit that only appears as a reflog entry, but that wasn't the > case, which was a bit disappointing. oh! ^ permalink raw reply [flat|nested] 110+ messages in thread
* [PATCHv2 7/7] t6120: fix typo in test name 2017-10-31 21:18 ` [PATCHv2 0/7] git describe blob Stefan Beller ` (5 preceding siblings ...) 2017-10-31 21:18 ` [PATCHv2 6/7] builtin/describe.c: describe a blob Stefan Beller @ 2017-10-31 21:18 ` Stefan Beller 2017-11-01 5:14 ` [PATCHv2 0/7] git describe blob Junio C Hamano 2017-11-02 19:41 ` [PATCHv3 " Stefan Beller 8 siblings, 0 replies; 110+ messages in thread From: Stefan Beller @ 2017-10-31 21:18 UTC (permalink / raw) To: sbeller; +Cc: Johannes.Schindelin, git, jacob.keller, me Signed-off-by: Stefan Beller <sbeller@google.com> --- t/t6120-describe.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh index 3be01316e8..fd329f173a 100755 --- a/t/t6120-describe.sh +++ b/t/t6120-describe.sh @@ -304,7 +304,7 @@ test_expect_success 'describe chokes on severely broken submodules' ' mv .git/modules/sub1/ .git/modules/sub_moved && test_must_fail git describe --dirty ' -test_expect_success 'describe ignoring a borken submodule' ' +test_expect_success 'describe ignoring a broken submodule' ' git describe --broken >out && test_when_finished "mv .git/modules/sub_moved .git/modules/sub1" && grep broken out -- 2.15.0.rc2.443.gfcc3b81c0a ^ permalink raw reply related [flat|nested] 110+ messages in thread
* Re: [PATCHv2 0/7] git describe blob 2017-10-31 21:18 ` [PATCHv2 0/7] git describe blob Stefan Beller ` (6 preceding siblings ...) 2017-10-31 21:18 ` [PATCHv2 7/7] t6120: fix typo in test name Stefan Beller @ 2017-11-01 5:14 ` Junio C Hamano 2017-11-02 19:41 ` [PATCHv3 " Stefan Beller 8 siblings, 0 replies; 110+ messages in thread From: Junio C Hamano @ 2017-11-01 5:14 UTC (permalink / raw) To: Stefan Beller; +Cc: Johannes.Schindelin, git, jacob.keller, me Stefan Beller <sbeller@google.com> writes: > Occasionally a user is given an object hash from a blob as an error message > or other output (e.g. [1]). > > It would be useful to get a further description of such a blob, such as > the (commit, path) tuple where this blob was introduced. > > This implements the answer in builtin/describe, > however the heuristics are weak. See patch 6 for details. Overall this was a relatively pleasant read. Will queue. Thanks. ^ permalink raw reply [flat|nested] 110+ messages in thread
* [PATCHv3 0/7] git describe blob 2017-10-31 21:18 ` [PATCHv2 0/7] git describe blob Stefan Beller ` (7 preceding siblings ...) 2017-11-01 5:14 ` [PATCHv2 0/7] git describe blob Junio C Hamano @ 2017-11-02 19:41 ` Stefan Beller 2017-11-02 19:41 ` [PATCHv3 1/7] t6120: fix typo in test name Stefan Beller ` (7 more replies) 8 siblings, 8 replies; 110+ messages in thread From: Stefan Beller @ 2017-11-02 19:41 UTC (permalink / raw) To: sbeller; +Cc: Johannes.Schindelin, git, jacob.keller, me, schwab Thanks for the discussion on v2[1]. Interdiff is below, just fixing minor things. We'll keep the original algorithm for now, deferring an improvement on the algorithm front towards a future developer. Thanks, Stefan [1] https://public-inbox.org/git/20171031211852.13001-1-sbeller@google.com/ Stefan Beller (7): t6120: fix typo in test name list-objects.c: factor out traverse_trees_and_blobs revision.h: introduce blob/tree walking in order of the commits builtin/describe.c: rename `oid` to avoid variable shadowing builtin/describe.c: print debug statements earlier builtin/describe.c: factor out describe_commit builtin/describe.c: describe a blob Documentation/git-describe.txt | 13 +++- Documentation/rev-list-options.txt | 5 ++ builtin/describe.c | 125 ++++++++++++++++++++++++++++--------- list-objects.c | 52 +++++++++------ revision.c | 2 + revision.h | 3 +- t/t6100-rev-list-in-order.sh | 47 ++++++++++++++ t/t6120-describe.sh | 17 ++++- 8 files changed, 214 insertions(+), 50 deletions(-) create mode 100755 t/t6100-rev-list-in-order.sh diff --git c/Documentation/git-describe.txt w/Documentation/git-describe.txt index 077c3c2193..79ec0be62a 100644 --- c/Documentation/git-describe.txt +++ w/Documentation/git-describe.txt @@ -11,7 +11,7 @@ SYNOPSIS [verse] 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] [<commit-ish>...] 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] --dirty[=<mark>] -'git describe' [<options>] <blob-ish> +'git describe' [<options>] <blob> DESCRIPTION ----------- diff --git c/builtin/describe.c w/builtin/describe.c index 382cbae908..cf08bef344 100644 --- c/builtin/describe.c +++ w/builtin/describe.c @@ -440,6 +440,7 @@ struct process_commit_data { struct object_id current_commit; struct object_id looking_for; struct strbuf *dst; + struct rev_info *revs; }; static void process_commit(struct commit *commit, void *data) @@ -456,6 +457,7 @@ static void process_object(struct object *obj, const char *path, void *data) reset_revision_walk(); describe_commit(&pcd->current_commit, pcd->dst); strbuf_addf(pcd->dst, ":%s", path); + pcd->revs->max_count = 0; } } @@ -463,7 +465,7 @@ static void describe_blob(struct object_id oid, struct strbuf *dst) { struct rev_info revs; struct argv_array args = ARGV_ARRAY_INIT; - struct process_commit_data pcd = { null_oid, oid, dst}; + struct process_commit_data pcd = { null_oid, oid, dst, &revs}; argv_array_pushl(&args, "internal: The first arg is not parsed", "--all", "--reflog", /* as many starting points as possible */ @@ -497,14 +499,12 @@ static void describe(const char *arg, int last_one) die(_("Not a valid object name %s"), arg); cmit = lookup_commit_reference_gently(&oid, 1); - if (cmit) { + if (cmit) describe_commit(&oid, &sb); - } else { - if (lookup_blob(&oid)) - describe_blob(oid, &sb); - else - die(_("%s is neither a commit nor blob"), arg); - } + else if (lookup_blob(&oid)) + describe_blob(oid, &sb); + else + die(_("%s is neither a commit nor blob"), arg); puts(sb.buf); diff --git c/list-objects.c w/list-objects.c index 03438e5a8b..07a92f35fe 100644 --- c/list-objects.c +++ w/list-objects.c @@ -184,13 +184,13 @@ static void add_pending_tree(struct rev_info *revs, struct tree *tree) } static void traverse_trees_and_blobs(struct rev_info *revs, - struct strbuf *base_path, + struct strbuf *base, show_object_fn show_object, void *data) { int i; - assert(base_path->len == 0); + assert(base->len == 0); for (i = 0; i < revs->pending.nr; i++) { struct object_array_entry *pending = revs->pending.objects + i; @@ -208,12 +208,12 @@ static void traverse_trees_and_blobs(struct rev_info *revs, path = ""; if (obj->type == OBJ_TREE) { process_tree(revs, (struct tree *)obj, show_object, - base_path, path, data); + base, path, data); continue; } if (obj->type == OBJ_BLOB) { process_blob(revs, (struct blob *)obj, show_object, - base_path, path, data); + base, path, data); continue; } die("unknown pending object %s (%s)", diff --git c/t/t6100-rev-list-in-order.sh w/t/t6100-rev-list-in-order.sh index 651666979b..d4d539b0da 100755 --- c/t/t6100-rev-list-in-order.sh +++ w/t/t6100-rev-list-in-order.sh @@ -1,21 +1,22 @@ #!/bin/sh -test_description='miscellaneous rev-list tests' +test_description='rev-list testing in-commit-order' . ./test-lib.sh - -test_expect_success 'setup' ' +test_expect_success 'rev-list --in-commit-order' ' for x in one two three four do echo $x >$x && git add $x && - git commit -m "add file $x" + git commit -m "add file $x" || + return 1 done && for x in four three do git rm $x && - git commit -m "remove $x" + git commit -m "remove $x" || + return 1 done && git rev-list --in-commit-order --objects HEAD >actual.raw && cut -c 1-40 >actual <actual.raw && @@ -32,9 +33,9 @@ test_expect_success 'setup' ' HEAD~2^{tree} HEAD~2^{tree}:four HEAD~3^{commit} - # HEAD~3^{tree} skipped + # HEAD~3^{tree} skipped, same as HEAD~1^{tree} HEAD~4^{commit} - # HEAD~4^{tree} skipped + # HEAD~4^{tree} skipped, same as HEAD^{tree} HEAD~5^{commit} HEAD~5^{tree} EOF diff --git c/t/t6120-describe.sh w/t/t6120-describe.sh index fd329f173a..aec6ed192d 100755 --- c/t/t6120-describe.sh +++ w/t/t6120-describe.sh @@ -319,7 +319,7 @@ test_expect_success 'describe a blob at a tag' ' test_cmp expect actual ' -test_expect_success 'describe a surviving blob' ' +test_expect_success 'describe a blob with commit-ish' ' git commit --allow-empty -m "empty commit" && git describe HEAD:file >actual && grep unique-file-1-g actual ^ permalink raw reply related [flat|nested] 110+ messages in thread
* [PATCHv3 1/7] t6120: fix typo in test name 2017-11-02 19:41 ` [PATCHv3 " Stefan Beller @ 2017-11-02 19:41 ` Stefan Beller 2017-11-02 19:41 ` [PATCHv3 2/7] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller ` (6 subsequent siblings) 7 siblings, 0 replies; 110+ messages in thread From: Stefan Beller @ 2017-11-02 19:41 UTC (permalink / raw) To: sbeller; +Cc: Johannes.Schindelin, git, jacob.keller, me, schwab Signed-off-by: Stefan Beller <sbeller@google.com> --- t/t6120-describe.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh index 1c0e8659d9..c8b7ed82d9 100755 --- a/t/t6120-describe.sh +++ b/t/t6120-describe.sh @@ -304,7 +304,7 @@ test_expect_success 'describe chokes on severely broken submodules' ' mv .git/modules/sub1/ .git/modules/sub_moved && test_must_fail git describe --dirty ' -test_expect_success 'describe ignoring a borken submodule' ' +test_expect_success 'describe ignoring a broken submodule' ' git describe --broken >out && test_when_finished "mv .git/modules/sub_moved .git/modules/sub1" && grep broken out -- 2.15.0.7.g980e40477f ^ permalink raw reply related [flat|nested] 110+ messages in thread
* [PATCHv3 2/7] list-objects.c: factor out traverse_trees_and_blobs 2017-11-02 19:41 ` [PATCHv3 " Stefan Beller 2017-11-02 19:41 ` [PATCHv3 1/7] t6120: fix typo in test name Stefan Beller @ 2017-11-02 19:41 ` Stefan Beller 2017-11-02 19:41 ` [PATCHv3 3/7] revision.h: introduce blob/tree walking in order of the commits Stefan Beller ` (5 subsequent siblings) 7 siblings, 0 replies; 110+ messages in thread From: Stefan Beller @ 2017-11-02 19:41 UTC (permalink / raw) To: sbeller; +Cc: Johannes.Schindelin, git, jacob.keller, me, schwab With traverse_trees_and_blobs factored out of the main traverse function, the next patch can introduce an in-order revision walking with ease. In the next patch we'll call `traverse_trees_and_blobs` from within the loop walking the commits, such that we'll have one invocation of that function per commit. That is why we do not want to have memory allocations in that function, such as we'd have if we were to use a strbuf locally. Pass a strbuf from traverse_commit_list into the blob and tree traversing function as a scratch pad that only needs to be allocated once. Signed-off-by: Stefan Beller <sbeller@google.com> --- list-objects.c | 50 +++++++++++++++++++++++++++++++------------------- 1 file changed, 31 insertions(+), 19 deletions(-) diff --git a/list-objects.c b/list-objects.c index b3931fa434..7c2ce9c4bd 100644 --- a/list-objects.c +++ b/list-objects.c @@ -183,25 +183,15 @@ static void add_pending_tree(struct rev_info *revs, struct tree *tree) add_pending_object(revs, &tree->object, ""); } -void traverse_commit_list(struct rev_info *revs, - show_commit_fn show_commit, - show_object_fn show_object, - void *data) +static void traverse_trees_and_blobs(struct rev_info *revs, + struct strbuf *base, + show_object_fn show_object, + void *data) { int i; - struct commit *commit; - struct strbuf base; - strbuf_init(&base, PATH_MAX); - while ((commit = get_revision(revs)) != NULL) { - /* - * an uninteresting boundary commit may not have its tree - * parsed yet, but we are not going to show them anyway - */ - if (commit->tree) - add_pending_tree(revs, commit->tree); - show_commit(commit, data); - } + assert(base->len == 0); + for (i = 0; i < revs->pending.nr; i++) { struct object_array_entry *pending = revs->pending.objects + i; struct object *obj = pending->item; @@ -218,17 +208,39 @@ void traverse_commit_list(struct rev_info *revs, path = ""; if (obj->type == OBJ_TREE) { process_tree(revs, (struct tree *)obj, show_object, - &base, path, data); + base, path, data); continue; } if (obj->type == OBJ_BLOB) { process_blob(revs, (struct blob *)obj, show_object, - &base, path, data); + base, path, data); continue; } die("unknown pending object %s (%s)", oid_to_hex(&obj->oid), name); } object_array_clear(&revs->pending); - strbuf_release(&base); +} + +void traverse_commit_list(struct rev_info *revs, + show_commit_fn show_commit, + show_object_fn show_object, + void *data) +{ + struct commit *commit; + struct strbuf csp; /* callee's scratch pad */ + strbuf_init(&csp, PATH_MAX); + + while ((commit = get_revision(revs)) != NULL) { + /* + * an uninteresting boundary commit may not have its tree + * parsed yet, but we are not going to show them anyway + */ + if (commit->tree) + add_pending_tree(revs, commit->tree); + show_commit(commit, data); + } + traverse_trees_and_blobs(revs, &csp, show_object, data); + + strbuf_release(&csp); } -- 2.15.0.7.g980e40477f ^ permalink raw reply related [flat|nested] 110+ messages in thread
* [PATCHv3 3/7] revision.h: introduce blob/tree walking in order of the commits 2017-11-02 19:41 ` [PATCHv3 " Stefan Beller 2017-11-02 19:41 ` [PATCHv3 1/7] t6120: fix typo in test name Stefan Beller 2017-11-02 19:41 ` [PATCHv3 2/7] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller @ 2017-11-02 19:41 ` Stefan Beller 2017-11-14 19:52 ` Jonathan Tan 2017-11-02 19:41 ` [PATCHv3 4/7] builtin/describe.c: rename `oid` to avoid variable shadowing Stefan Beller ` (4 subsequent siblings) 7 siblings, 1 reply; 110+ messages in thread From: Stefan Beller @ 2017-11-02 19:41 UTC (permalink / raw) To: sbeller; +Cc: Johannes.Schindelin, git, jacob.keller, me, schwab The functionality to list tree objects in the order they were seen while traversing the commits will be used in the next commit, where we teach `git describe` to describe not only commits, but trees and blobs, too. Signed-off-by: Stefan Beller <sbeller@google.com> --- Documentation/rev-list-options.txt | 5 ++++ list-objects.c | 2 ++ revision.c | 2 ++ revision.h | 3 ++- t/t6100-rev-list-in-order.sh | 47 ++++++++++++++++++++++++++++++++++++++ 5 files changed, 58 insertions(+), 1 deletion(-) create mode 100755 t/t6100-rev-list-in-order.sh diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index 13501e1556..9066e0c777 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -686,6 +686,11 @@ ifdef::git-rev-list[] all object IDs which I need to download if I have the commit object _bar_ but not _foo_''. +--in-commit-order:: + Print tree and blob ids in order of the commits. The tree + and blob ids are printed after they are first referenced + by a commit. + --objects-edge:: Similar to `--objects`, but also print the IDs of excluded commits prefixed with a ``-'' character. This is used by diff --git a/list-objects.c b/list-objects.c index 7c2ce9c4bd..07a92f35fe 100644 --- a/list-objects.c +++ b/list-objects.c @@ -239,6 +239,8 @@ void traverse_commit_list(struct rev_info *revs, if (commit->tree) add_pending_tree(revs, commit->tree); show_commit(commit, data); + if (revs->tree_blobs_in_commit_order) + traverse_trees_and_blobs(revs, &csp, show_object, data); } traverse_trees_and_blobs(revs, &csp, show_object, data); diff --git a/revision.c b/revision.c index d167223e69..9329d4ebbf 100644 --- a/revision.c +++ b/revision.c @@ -1845,6 +1845,8 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg revs->dense = 0; } else if (!strcmp(arg, "--show-all")) { revs->show_all = 1; + } else if (!strcmp(arg, "--in-commit-order")) { + revs->tree_blobs_in_commit_order = 1; } else if (!strcmp(arg, "--remove-empty")) { revs->remove_empty_trees = 1; } else if (!strcmp(arg, "--merges")) { diff --git a/revision.h b/revision.h index 54761200ad..86985d68aa 100644 --- a/revision.h +++ b/revision.h @@ -121,7 +121,8 @@ struct rev_info { bisect:1, ancestry_path:1, first_parent_only:1, - line_level_traverse:1; + line_level_traverse:1, + tree_blobs_in_commit_order:1; /* Diff flags */ unsigned int diff:1, diff --git a/t/t6100-rev-list-in-order.sh b/t/t6100-rev-list-in-order.sh new file mode 100755 index 0000000000..d4d539b0da --- /dev/null +++ b/t/t6100-rev-list-in-order.sh @@ -0,0 +1,47 @@ +#!/bin/sh + +test_description='rev-list testing in-commit-order' + +. ./test-lib.sh + +test_expect_success 'rev-list --in-commit-order' ' + for x in one two three four + do + echo $x >$x && + git add $x && + git commit -m "add file $x" || + return 1 + done && + for x in four three + do + git rm $x && + git commit -m "remove $x" || + return 1 + done && + git rev-list --in-commit-order --objects HEAD >actual.raw && + cut -c 1-40 >actual <actual.raw && + + git cat-file --batch-check="%(objectname)" >expect.raw <<-\EOF && + HEAD^{commit} + HEAD^{tree} + HEAD^{tree}:one + HEAD^{tree}:two + HEAD~1^{commit} + HEAD~1^{tree} + HEAD~1^{tree}:three + HEAD~2^{commit} + HEAD~2^{tree} + HEAD~2^{tree}:four + HEAD~3^{commit} + # HEAD~3^{tree} skipped, same as HEAD~1^{tree} + HEAD~4^{commit} + # HEAD~4^{tree} skipped, same as HEAD^{tree} + HEAD~5^{commit} + HEAD~5^{tree} + EOF + grep -v "#" >expect <expect.raw && + + test_cmp expect actual +' + +test_done -- 2.15.0.7.g980e40477f ^ permalink raw reply related [flat|nested] 110+ messages in thread
* Re: [PATCHv3 3/7] revision.h: introduce blob/tree walking in order of the commits 2017-11-02 19:41 ` [PATCHv3 3/7] revision.h: introduce blob/tree walking in order of the commits Stefan Beller @ 2017-11-14 19:52 ` Jonathan Tan 0 siblings, 0 replies; 110+ messages in thread From: Jonathan Tan @ 2017-11-14 19:52 UTC (permalink / raw) To: Stefan Beller; +Cc: Johannes.Schindelin, git, jacob.keller, me, schwab On Thu, 2 Nov 2017 12:41:44 -0700 Stefan Beller <sbeller@google.com> wrote: > @@ -239,6 +239,8 @@ void traverse_commit_list(struct rev_info *revs, > if (commit->tree) > add_pending_tree(revs, commit->tree); > show_commit(commit, data); > + if (revs->tree_blobs_in_commit_order) > + traverse_trees_and_blobs(revs, &csp, show_object, data); > } > traverse_trees_and_blobs(revs, &csp, show_object, data); > I would have expected add_pending_tree() above to no longer be invoked. If it still needs to be invoked, maybe add an explanation in the form of a comment or commit message. > +test_expect_success 'rev-list --in-commit-order' ' > + for x in one two three four > + do > + echo $x >$x && > + git add $x && > + git commit -m "add file $x" || > + return 1 > + done && > + for x in four three > + do > + git rm $x && > + git commit -m "remove $x" || > + return 1 > + done && > + git rev-list --in-commit-order --objects HEAD >actual.raw && > + cut -c 1-40 >actual <actual.raw && > + > + git cat-file --batch-check="%(objectname)" >expect.raw <<-\EOF && > + HEAD^{commit} > + HEAD^{tree} > + HEAD^{tree}:one > + HEAD^{tree}:two > + HEAD~1^{commit} > + HEAD~1^{tree} > + HEAD~1^{tree}:three > + HEAD~2^{commit} > + HEAD~2^{tree} > + HEAD~2^{tree}:four > + HEAD~3^{commit} > + # HEAD~3^{tree} skipped, same as HEAD~1^{tree} > + HEAD~4^{commit} > + # HEAD~4^{tree} skipped, same as HEAD^{tree} > + HEAD~5^{commit} > + HEAD~5^{tree} > + EOF > + grep -v "#" >expect <expect.raw && > + > + test_cmp expect actual > +' Would it be useful to have another test without --in-commit-order, so that we can see the difference (and ensure that existing behavior is unchanged)? ^ permalink raw reply [flat|nested] 110+ messages in thread
* [PATCHv3 4/7] builtin/describe.c: rename `oid` to avoid variable shadowing 2017-11-02 19:41 ` [PATCHv3 " Stefan Beller ` (2 preceding siblings ...) 2017-11-02 19:41 ` [PATCHv3 3/7] revision.h: introduce blob/tree walking in order of the commits Stefan Beller @ 2017-11-02 19:41 ` Stefan Beller 2017-11-02 19:41 ` [PATCHv3 5/7] builtin/describe.c: print debug statements earlier Stefan Beller ` (3 subsequent siblings) 7 siblings, 0 replies; 110+ messages in thread From: Stefan Beller @ 2017-11-02 19:41 UTC (permalink / raw) To: sbeller; +Cc: Johannes.Schindelin, git, jacob.keller, me, schwab The function `describe` has already a variable named `oid` declared at the beginning of the function for an object id. Do not shadow that variable with a pointer to an object id. Signed-off-by: Stefan Beller <sbeller@google.com> --- builtin/describe.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/builtin/describe.c b/builtin/describe.c index 29075dbd0f..fd61f463cf 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -381,9 +381,9 @@ static void describe(const char *arg, int last_one) } if (!match_cnt) { - struct object_id *oid = &cmit->object.oid; + struct object_id *cmit_oid = &cmit->object.oid; if (always) { - printf("%s", find_unique_abbrev(oid->hash, abbrev)); + printf("%s", find_unique_abbrev(cmit_oid->hash, abbrev)); if (suffix) printf("%s", suffix); printf("\n"); @@ -392,11 +392,11 @@ static void describe(const char *arg, int last_one) if (unannotated_cnt) die(_("No annotated tags can describe '%s'.\n" "However, there were unannotated tags: try --tags."), - oid_to_hex(oid)); + oid_to_hex(cmit_oid)); else die(_("No tags can describe '%s'.\n" "Try --always, or create some tags."), - oid_to_hex(oid)); + oid_to_hex(cmit_oid)); } QSORT(all_matches, match_cnt, compare_pt); -- 2.15.0.7.g980e40477f ^ permalink raw reply related [flat|nested] 110+ messages in thread
* [PATCHv3 5/7] builtin/describe.c: print debug statements earlier 2017-11-02 19:41 ` [PATCHv3 " Stefan Beller ` (3 preceding siblings ...) 2017-11-02 19:41 ` [PATCHv3 4/7] builtin/describe.c: rename `oid` to avoid variable shadowing Stefan Beller @ 2017-11-02 19:41 ` Stefan Beller 2017-11-14 19:55 ` Jonathan Tan 2017-11-02 19:41 ` [PATCHv3 6/7] builtin/describe.c: factor out describe_commit Stefan Beller ` (2 subsequent siblings) 7 siblings, 1 reply; 110+ messages in thread From: Stefan Beller @ 2017-11-02 19:41 UTC (permalink / raw) To: sbeller; +Cc: Johannes.Schindelin, git, jacob.keller, me, schwab For debuggers aid we'd want to print debug statements early, so introduce a new line in the debug output that describes the whole function, and then change the next debug output to describe why we need to search. Conveniently drop the arg from the second line; which will be useful in a follow up commit, that refactors the\ describe function. Signed-off-by: Stefan Beller <sbeller@google.com> --- builtin/describe.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/builtin/describe.c b/builtin/describe.c index fd61f463cf..3136efde31 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -293,6 +293,9 @@ static void describe(const char *arg, int last_one) unsigned long seen_commits = 0; unsigned int unannotated_cnt = 0; + if (debug) + fprintf(stderr, _("describe %s\n"), arg); + if (get_oid(arg, &oid)) die(_("Not a valid object name %s"), arg); cmit = lookup_commit_reference(&oid); @@ -316,7 +319,7 @@ static void describe(const char *arg, int last_one) if (!max_candidates) die(_("no tag exactly matches '%s'"), oid_to_hex(&cmit->object.oid)); if (debug) - fprintf(stderr, _("searching to describe %s\n"), arg); + fprintf(stderr, _("No exact match on refs or tags, searching to describe\n")); if (!have_util) { struct hashmap_iter iter; -- 2.15.0.7.g980e40477f ^ permalink raw reply related [flat|nested] 110+ messages in thread
* Re: [PATCHv3 5/7] builtin/describe.c: print debug statements earlier 2017-11-02 19:41 ` [PATCHv3 5/7] builtin/describe.c: print debug statements earlier Stefan Beller @ 2017-11-14 19:55 ` Jonathan Tan 2017-11-14 20:00 ` Stefan Beller 0 siblings, 1 reply; 110+ messages in thread From: Jonathan Tan @ 2017-11-14 19:55 UTC (permalink / raw) To: Stefan Beller; +Cc: Johannes.Schindelin, git, jacob.keller, me, schwab On Thu, 2 Nov 2017 12:41:46 -0700 Stefan Beller <sbeller@google.com> wrote: > For debuggers aid we'd want to print debug statements early, so > introduce a new line in the debug output that describes the whole > function, and then change the next debug output to describe why we > need to search. Conveniently drop the arg from the second line; > which will be useful in a follow up commit, that refactors the\ > describe function. > > Signed-off-by: Stefan Beller <sbeller@google.com> > --- > builtin/describe.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/builtin/describe.c b/builtin/describe.c > index fd61f463cf..3136efde31 100644 > --- a/builtin/describe.c > +++ b/builtin/describe.c > @@ -293,6 +293,9 @@ static void describe(const char *arg, int last_one) > unsigned long seen_commits = 0; > unsigned int unannotated_cnt = 0; > > + if (debug) > + fprintf(stderr, _("describe %s\n"), arg); > + Could you explain in the commit message why this wasn't needed before (if it wasn't needed), and why this is needed now? > if (get_oid(arg, &oid)) > die(_("Not a valid object name %s"), arg); > cmit = lookup_commit_reference(&oid); > @@ -316,7 +319,7 @@ static void describe(const char *arg, int last_one) > if (!max_candidates) > die(_("no tag exactly matches '%s'"), oid_to_hex(&cmit->object.oid)); > if (debug) > - fprintf(stderr, _("searching to describe %s\n"), arg); > + fprintf(stderr, _("No exact match on refs or tags, searching to describe\n")); What is this arg that can be safely dropped? You mention that it is for convenience (since the describe() function will be refactored), but could the arg just be passed to the new function? ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv3 5/7] builtin/describe.c: print debug statements earlier 2017-11-14 19:55 ` Jonathan Tan @ 2017-11-14 20:00 ` Stefan Beller 0 siblings, 0 replies; 110+ messages in thread From: Stefan Beller @ 2017-11-14 20:00 UTC (permalink / raw) To: Jonathan Tan Cc: Johannes Schindelin, git, Jacob Keller, Kevin Daudt, Andreas Schwab On Tue, Nov 14, 2017 at 11:55 AM, Jonathan Tan <jonathantanmy@google.com> wrote: > On Thu, 2 Nov 2017 12:41:46 -0700 > Stefan Beller <sbeller@google.com> wrote: > >> For debuggers aid we'd want to print debug statements early, so >> introduce a new line in the debug output that describes the whole >> function, and then change the next debug output to describe why we >> need to search. Conveniently drop the arg from the second line; >> which will be useful in a follow up commit, that refactors the\ >> describe function. >> >> Signed-off-by: Stefan Beller <sbeller@google.com> >> --- >> builtin/describe.c | 5 ++++- >> 1 file changed, 4 insertions(+), 1 deletion(-) >> >> diff --git a/builtin/describe.c b/builtin/describe.c >> index fd61f463cf..3136efde31 100644 >> --- a/builtin/describe.c >> +++ b/builtin/describe.c >> @@ -293,6 +293,9 @@ static void describe(const char *arg, int last_one) >> unsigned long seen_commits = 0; >> unsigned int unannotated_cnt = 0; >> >> + if (debug) >> + fprintf(stderr, _("describe %s\n"), arg); >> + > > Could you explain in the commit message why this wasn't needed before > (if it wasn't needed), and why this is needed now? > >> if (get_oid(arg, &oid)) >> die(_("Not a valid object name %s"), arg); >> cmit = lookup_commit_reference(&oid); >> @@ -316,7 +319,7 @@ static void describe(const char *arg, int last_one) >> if (!max_candidates) >> die(_("no tag exactly matches '%s'"), oid_to_hex(&cmit->object.oid)); >> if (debug) >> - fprintf(stderr, _("searching to describe %s\n"), arg); >> + fprintf(stderr, _("No exact match on refs or tags, searching to describe\n")); > > What is this arg that can be safely dropped? > > You mention that it is for convenience (since the describe() function > will be refactored), but could the arg just be passed to the new > function? It could, but I want to avoid that just to print a debugging statement inside the function. So I factor the debugging printing out of the function introduced in the next patch. ^ permalink raw reply [flat|nested] 110+ messages in thread
* [PATCHv3 6/7] builtin/describe.c: factor out describe_commit 2017-11-02 19:41 ` [PATCHv3 " Stefan Beller ` (4 preceding siblings ...) 2017-11-02 19:41 ` [PATCHv3 5/7] builtin/describe.c: print debug statements earlier Stefan Beller @ 2017-11-02 19:41 ` Stefan Beller 2017-11-02 19:41 ` [PATCHv3 7/7] builtin/describe.c: describe a blob Stefan Beller 2017-11-03 0:23 ` [PATCHv3 0/7] git describe blob Jacob Keller 7 siblings, 0 replies; 110+ messages in thread From: Stefan Beller @ 2017-11-02 19:41 UTC (permalink / raw) To: sbeller; +Cc: Johannes.Schindelin, git, jacob.keller, me, schwab In the next patch we'll learn how to describe more than just commits, so factor out describing commits into its own function. That will make the next patches easy as we still need to describe a commit as part of describing blobs. While factoring out the functionality to describe_commit, make sure that any output to stdout is put into a strbuf, which we can print afterwards, using puts which also adds the line ending. Signed-off-by: Stefan Beller <sbeller@google.com> --- builtin/describe.c | 63 ++++++++++++++++++++++++++++++++---------------------- 1 file changed, 37 insertions(+), 26 deletions(-) diff --git a/builtin/describe.c b/builtin/describe.c index 3136efde31..9e9a5ed5d4 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -256,7 +256,7 @@ static unsigned long finish_depth_computation( return seen_commits; } -static void display_name(struct commit_name *n) +static void append_name(struct commit_name *n, struct strbuf *dst) { if (n->prio == 2 && !n->tag) { n->tag = lookup_tag(&n->oid); @@ -272,19 +272,18 @@ static void display_name(struct commit_name *n) } if (n->tag) - printf("%s", n->tag->tag); + strbuf_addstr(dst, n->tag->tag); else - printf("%s", n->path); + strbuf_addstr(dst, n->path); } -static void show_suffix(int depth, const struct object_id *oid) +static void append_suffix(int depth, const struct object_id *oid, struct strbuf *dst) { - printf("-%d-g%s", depth, find_unique_abbrev(oid->hash, abbrev)); + strbuf_addf(dst, "-%d-g%s", depth, find_unique_abbrev(oid->hash, abbrev)); } -static void describe(const char *arg, int last_one) +static void describe_commit(struct object_id *oid, struct strbuf *dst) { - struct object_id oid; struct commit *cmit, *gave_up_on = NULL; struct commit_list *list; struct commit_name *n; @@ -293,26 +292,18 @@ static void describe(const char *arg, int last_one) unsigned long seen_commits = 0; unsigned int unannotated_cnt = 0; - if (debug) - fprintf(stderr, _("describe %s\n"), arg); - - if (get_oid(arg, &oid)) - die(_("Not a valid object name %s"), arg); - cmit = lookup_commit_reference(&oid); - if (!cmit) - die(_("%s is not a valid '%s' object"), arg, commit_type); + cmit = lookup_commit_reference(oid); n = find_commit_name(&cmit->object.oid); if (n && (tags || all || n->prio == 2)) { /* * Exact match to an existing ref. */ - display_name(n); + append_name(n, dst); if (longformat) - show_suffix(0, n->tag ? &n->tag->tagged->oid : &oid); + append_suffix(0, n->tag ? &n->tag->tagged->oid : oid, dst); if (suffix) - printf("%s", suffix); - printf("\n"); + strbuf_addstr(dst, suffix); return; } @@ -386,10 +377,9 @@ static void describe(const char *arg, int last_one) if (!match_cnt) { struct object_id *cmit_oid = &cmit->object.oid; if (always) { - printf("%s", find_unique_abbrev(cmit_oid->hash, abbrev)); + strbuf_addstr(dst, find_unique_abbrev(cmit_oid->hash, abbrev)); if (suffix) - printf("%s", suffix); - printf("\n"); + strbuf_addstr(dst, suffix); return; } if (unannotated_cnt) @@ -437,15 +427,36 @@ static void describe(const char *arg, int last_one) } } - display_name(all_matches[0].name); + append_name(all_matches[0].name, dst); if (abbrev) - show_suffix(all_matches[0].depth, &cmit->object.oid); + append_suffix(all_matches[0].depth, &cmit->object.oid, dst); if (suffix) - printf("%s", suffix); - printf("\n"); + strbuf_addstr(dst, suffix); +} + +static void describe(const char *arg, int last_one) +{ + struct object_id oid; + struct commit *cmit; + struct strbuf sb = STRBUF_INIT; + + if (debug) + fprintf(stderr, _("describe %s\n"), arg); + + if (get_oid(arg, &oid)) + die(_("Not a valid object name %s"), arg); + cmit = lookup_commit_reference(&oid); + if (!cmit) + die(_("%s is not a valid '%s' object"), arg, commit_type); + + describe_commit(&oid, &sb); + + puts(sb.buf); if (!last_one) clear_commit_marks(cmit, -1); + + strbuf_release(&sb); } int cmd_describe(int argc, const char **argv, const char *prefix) -- 2.15.0.7.g980e40477f ^ permalink raw reply related [flat|nested] 110+ messages in thread
* [PATCHv3 7/7] builtin/describe.c: describe a blob 2017-11-02 19:41 ` [PATCHv3 " Stefan Beller ` (5 preceding siblings ...) 2017-11-02 19:41 ` [PATCHv3 6/7] builtin/describe.c: factor out describe_commit Stefan Beller @ 2017-11-02 19:41 ` Stefan Beller 2017-11-14 20:02 ` Jonathan Tan 2017-11-03 0:23 ` [PATCHv3 0/7] git describe blob Jacob Keller 7 siblings, 1 reply; 110+ messages in thread From: Stefan Beller @ 2017-11-02 19:41 UTC (permalink / raw) To: sbeller; +Cc: Johannes.Schindelin, git, jacob.keller, me, schwab Sometimes users are given a hash of an object and they want to identify it further (ex.: Use verify-pack to find the largest blobs, but what are these? or [1]) "This is an interesting endeavor, because describing things is hard." -- me, upon writing this patch. When describing commits, we try to anchor them to tags or refs, as these are conceptually on a higher level than the commit. And if there is no ref or tag that matches exactly, we're out of luck. So we employ a heuristic to make up a name for the commit. These names are ambiguous, there might be different tags or refs to anchor to, and there might be different path in the DAG to travel to arrive at the commit precisely. When describing a blob, we want to describe the blob from a higher layer as well, which is a tuple of (commit, deep/path) as the tree objects involved are rather uninteresting. The same blob can be referenced by multiple commits, so how we decide which commit to use? This patch implements a rather naive approach on this: As there are no back pointers from blobs to commits in which the blob occurs, we'll start walking from any tips available, listing the blobs in-order of the commit and once we found the blob, we'll take the first commit that listed the blob. For source code this is likely not the first commit that introduced the blob, but rather the latest commit that contained the blob. For example: git describe v0.99:Makefile v0.99-5-gab6625e06a:Makefile tells us the latest commit that contained the Makefile as it was in tag v0.99 is commit v0.99-5-gab6625e06a (and at the same path), as the next commit on top v0.99-6-gb1de9de2b9 ([PATCH] Bootstrap "make dist", 2005-07-11) touches the Makefile. Let's see how this description turns out, if it is useful in day-to-day use as I have the intuition that we'd rather want to see the *first* commit that this blob was introduced to the repository (which can be achieved easily by giving the `--reverse` flag in the describe_blob rev walk). [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob Signed-off-by: Stefan Beller <sbeller@google.com> --- Documentation/git-describe.txt | 13 ++++++++- builtin/describe.c | 65 ++++++++++++++++++++++++++++++++++++++---- t/t6120-describe.sh | 15 ++++++++++ 3 files changed, 87 insertions(+), 6 deletions(-) diff --git a/Documentation/git-describe.txt b/Documentation/git-describe.txt index c924c945ba..79ec0be62a 100644 --- a/Documentation/git-describe.txt +++ b/Documentation/git-describe.txt @@ -3,7 +3,7 @@ git-describe(1) NAME ---- -git-describe - Describe a commit using the most recent tag reachable from it +git-describe - Describe a commit or blob using the graph relations SYNOPSIS @@ -11,6 +11,7 @@ SYNOPSIS [verse] 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] [<commit-ish>...] 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] --dirty[=<mark>] +'git describe' [<options>] <blob> DESCRIPTION ----------- @@ -24,6 +25,16 @@ By default (without --all or --tags) `git describe` only shows annotated tags. For more information about creating annotated tags see the -a and -s options to linkgit:git-tag[1]. +If the given object refers to a blob, it will be described +as `<commit-ish>:<path>`, such that the blob can be found +at `<path>` in the `<commit-ish>`. Note, that the commit is likely +not the commit that introduced the blob, but the one that was found +first; to find the commit that introduced the blob, you need to find +the commit that last touched the path, e.g. +`git log <commit-description> -- <path>`. +As blobs do not point at the commits they are contained in, +describing blobs is slow as we have to walk the whole graph. + OPTIONS ------- <commit-ish>...:: diff --git a/builtin/describe.c b/builtin/describe.c index 9e9a5ed5d4..cf08bef344 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -3,6 +3,7 @@ #include "lockfile.h" #include "commit.h" #include "tag.h" +#include "blob.h" #include "refs.h" #include "builtin.h" #include "exec_cmd.h" @@ -11,8 +12,9 @@ #include "hashmap.h" #include "argv-array.h" #include "run-command.h" +#include "revision.h" +#include "list-objects.h" -#define SEEN (1u << 0) #define MAX_TAGS (FLAG_BITS - 1) static const char * const describe_usage[] = { @@ -434,6 +436,56 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst) strbuf_addstr(dst, suffix); } +struct process_commit_data { + struct object_id current_commit; + struct object_id looking_for; + struct strbuf *dst; + struct rev_info *revs; +}; + +static void process_commit(struct commit *commit, void *data) +{ + struct process_commit_data *pcd = data; + pcd->current_commit = commit->object.oid; +} + +static void process_object(struct object *obj, const char *path, void *data) +{ + struct process_commit_data *pcd = data; + + if (!oidcmp(&pcd->looking_for, &obj->oid) && !pcd->dst->len) { + reset_revision_walk(); + describe_commit(&pcd->current_commit, pcd->dst); + strbuf_addf(pcd->dst, ":%s", path); + pcd->revs->max_count = 0; + } +} + +static void describe_blob(struct object_id oid, struct strbuf *dst) +{ + struct rev_info revs; + struct argv_array args = ARGV_ARRAY_INIT; + struct process_commit_data pcd = { null_oid, oid, dst, &revs}; + + argv_array_pushl(&args, "internal: The first arg is not parsed", + "--all", "--reflog", /* as many starting points as possible */ + /* NEEDSWORK: --all is incompatible with worktrees for now: */ + "--single-worktree", + "--objects", + "--in-commit-order", + NULL); + + init_revisions(&revs, NULL); + if (setup_revisions(args.argc, args.argv, &revs, NULL) > 1) + BUG("setup_revisions could not handle all args?"); + + if (prepare_revision_walk(&revs)) + die("revision walk setup failed"); + + traverse_commit_list(&revs, process_commit, process_object, &pcd); + reset_revision_walk(); +} + static void describe(const char *arg, int last_one) { struct object_id oid; @@ -445,11 +497,14 @@ static void describe(const char *arg, int last_one) if (get_oid(arg, &oid)) die(_("Not a valid object name %s"), arg); - cmit = lookup_commit_reference(&oid); - if (!cmit) - die(_("%s is not a valid '%s' object"), arg, commit_type); + cmit = lookup_commit_reference_gently(&oid, 1); - describe_commit(&oid, &sb); + if (cmit) + describe_commit(&oid, &sb); + else if (lookup_blob(&oid)) + describe_blob(oid, &sb); + else + die(_("%s is neither a commit nor blob"), arg); puts(sb.buf); diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh index c8b7ed82d9..aec6ed192d 100755 --- a/t/t6120-describe.sh +++ b/t/t6120-describe.sh @@ -310,6 +310,21 @@ test_expect_success 'describe ignoring a broken submodule' ' grep broken out ' +test_expect_success 'describe a blob at a tag' ' + echo "make it a unique blob" >file && + git add file && git commit -m "content in file" && + git tag -a -m "latest annotated tag" unique-file && + git describe HEAD:file >actual && + echo "unique-file:file" >expect && + test_cmp expect actual +' + +test_expect_success 'describe a blob with commit-ish' ' + git commit --allow-empty -m "empty commit" && + git describe HEAD:file >actual && + grep unique-file-1-g actual +' + test_expect_failure ULIMIT_STACK_SIZE 'name-rev works in a deep repo' ' i=1 && while test $i -lt 8000 -- 2.15.0.7.g980e40477f ^ permalink raw reply related [flat|nested] 110+ messages in thread
* Re: [PATCHv3 7/7] builtin/describe.c: describe a blob 2017-11-02 19:41 ` [PATCHv3 7/7] builtin/describe.c: describe a blob Stefan Beller @ 2017-11-14 20:02 ` Jonathan Tan 2017-11-14 20:40 ` Stefan Beller 0 siblings, 1 reply; 110+ messages in thread From: Jonathan Tan @ 2017-11-14 20:02 UTC (permalink / raw) To: Stefan Beller; +Cc: Johannes.Schindelin, git, jacob.keller, me, schwab On Thu, 2 Nov 2017 12:41:48 -0700 Stefan Beller <sbeller@google.com> wrote: > Sometimes users are given a hash of an object and they want to > identify it further (ex.: Use verify-pack to find the largest blobs, > but what are these? or [1]) > > "This is an interesting endeavor, because describing things is hard." > -- me, upon writing this patch. > > When describing commits, we try to anchor them to tags or refs, as these > are conceptually on a higher level than the commit. And if there is no ref > or tag that matches exactly, we're out of luck. So we employ a heuristic > to make up a name for the commit. These names are ambiguous, there might > be different tags or refs to anchor to, and there might be different > path in the DAG to travel to arrive at the commit precisely. > > When describing a blob, we want to describe the blob from a higher layer > as well, which is a tuple of (commit, deep/path) as the tree objects > involved are rather uninteresting. The same blob can be referenced by > multiple commits, so how we decide which commit to use? This patch > implements a rather naive approach on this: As there are no back pointers > from blobs to commits in which the blob occurs, we'll start walking from > any tips available, listing the blobs in-order of the commit and once we > found the blob, we'll take the first commit that listed the blob. For > source code this is likely not the first commit that introduced the blob, > but rather the latest commit that contained the blob. For example: > > git describe v0.99:Makefile > v0.99-5-gab6625e06a:Makefile > > tells us the latest commit that contained the Makefile as it was in tag > v0.99 is commit v0.99-5-gab6625e06a (and at the same path), as the next > commit on top v0.99-6-gb1de9de2b9 ([PATCH] Bootstrap "make dist", > 2005-07-11) touches the Makefile. > > Let's see how this description turns out, if it is useful in day-to-day > use as I have the intuition that we'd rather want to see the *first* > commit that this blob was introduced to the repository (which can be > achieved easily by giving the `--reverse` flag in the describe_blob rev > walk). The method of your intuition indeed seems better - could we just have this from the start? Alternatively, to me, it seems that listing commits that *introduces* the blob (that is, where it references the blob, but none of its parents do) would be the best way. That would then be independent of traversal order (and we would no longer need to find a tag etc. to tie the blob to). If we do that, it seems to me that there is a future optimization that could get the first commit to the user more quickly - once a commit without the blob and a descendant commit with the blob is found, that interval can be bisected, so that the first commit is found in O(log number of commits) instead of O(commits). But this can be done later. ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv3 7/7] builtin/describe.c: describe a blob 2017-11-14 20:02 ` Jonathan Tan @ 2017-11-14 20:40 ` Stefan Beller 2017-11-14 21:17 ` Jonathan Tan 0 siblings, 1 reply; 110+ messages in thread From: Stefan Beller @ 2017-11-14 20:40 UTC (permalink / raw) To: Jonathan Tan Cc: Johannes Schindelin, git, Jacob Keller, Kevin Daudt, Andreas Schwab On Tue, Nov 14, 2017 at 12:02 PM, Jonathan Tan <jonathantanmy@google.com> wrote: > On Thu, 2 Nov 2017 12:41:48 -0700 > Stefan Beller <sbeller@google.com> wrote: > >> Sometimes users are given a hash of an object and they want to >> identify it further (ex.: Use verify-pack to find the largest blobs, >> but what are these? or [1]) >> >> "This is an interesting endeavor, because describing things is hard." >> -- me, upon writing this patch. >> >> When describing commits, we try to anchor them to tags or refs, as these >> are conceptually on a higher level than the commit. And if there is no ref >> or tag that matches exactly, we're out of luck. So we employ a heuristic >> to make up a name for the commit. These names are ambiguous, there might >> be different tags or refs to anchor to, and there might be different >> path in the DAG to travel to arrive at the commit precisely. >> >> When describing a blob, we want to describe the blob from a higher layer >> as well, which is a tuple of (commit, deep/path) as the tree objects >> involved are rather uninteresting. The same blob can be referenced by >> multiple commits, so how we decide which commit to use? This patch >> implements a rather naive approach on this: As there are no back pointers >> from blobs to commits in which the blob occurs, we'll start walking from >> any tips available, listing the blobs in-order of the commit and once we >> found the blob, we'll take the first commit that listed the blob. For >> source code this is likely not the first commit that introduced the blob, >> but rather the latest commit that contained the blob. For example: >> >> git describe v0.99:Makefile >> v0.99-5-gab6625e06a:Makefile >> >> tells us the latest commit that contained the Makefile as it was in tag >> v0.99 is commit v0.99-5-gab6625e06a (and at the same path), as the next >> commit on top v0.99-6-gb1de9de2b9 ([PATCH] Bootstrap "make dist", >> 2005-07-11) touches the Makefile. >> >> Let's see how this description turns out, if it is useful in day-to-day >> use as I have the intuition that we'd rather want to see the *first* >> commit that this blob was introduced to the repository (which can be >> achieved easily by giving the `--reverse` flag in the describe_blob rev >> walk). > > The method of your intuition indeed seems better - could we just have > this from the start? Thanks for the review! This series was written with the mindset, that a user would only ever want to describe bad blobs. (bad in terms of file size, unwanted content, etc) With the --reverse you only see the *first* introduction of said blob, so finding out if it was re-introduced is still not as easy, whereas "when was this blob last used" which is what the current algorithm does, covers that case better. > Alternatively, to me, it seems that listing commits that *introduces* > the blob (that is, where it references the blob, but none of its parents > do) would be the best way. That would then be independent of traversal > order (and we would no longer need to find a tag etc. to tie the blob > to). What if it is introduced multiple times? (either in multiple competing side branches; or introduced, reverted and re-introduced?) > If we do that, it seems to me that there is a future optimization that > could get the first commit to the user more quickly - once a commit > without the blob and a descendant commit with the blob is found, that > interval can be bisected, so that the first commit is found in O(log > number of commits) instead of O(commits). But this can be done later. bisection assumes that we only have one "event" going from good to bad, which doesn't hold true here, as the blob can be there at different occasions of the history. ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv3 7/7] builtin/describe.c: describe a blob 2017-11-14 20:40 ` Stefan Beller @ 2017-11-14 21:17 ` Jonathan Tan 0 siblings, 0 replies; 110+ messages in thread From: Jonathan Tan @ 2017-11-14 21:17 UTC (permalink / raw) To: Stefan Beller Cc: Johannes Schindelin, git, Jacob Keller, Kevin Daudt, Andreas Schwab On Tue, 14 Nov 2017 12:40:03 -0800 Stefan Beller <sbeller@google.com> wrote: > Thanks for the review! > > This series was written with the mindset, that a user would only ever > want to describe bad blobs. (bad in terms of file size, unwanted content, etc) > > With the --reverse you only see the *first* introduction of said blob, > so finding out if it was re-introduced is still not as easy, whereas "when > was this blob last used" which is what the current algorithm does, covers > that case better. How does "when was this blob last used" cover reintroduction better? If you want to check all introductions, then you'll need something like what I describe (quoted below). > > Alternatively, to me, it seems that listing commits that *introduces* > > the blob (that is, where it references the blob, but none of its parents > > do) would be the best way. That would then be independent of traversal > > order (and we would no longer need to find a tag etc. to tie the blob > > to). > > What if it is introduced multiple times? (either in multiple competing > side branches; or introduced, reverted and re-introduced?) Then all of them should be listed. > > If we do that, it seems to me that there is a future optimization that > > could get the first commit to the user more quickly - once a commit > > without the blob and a descendant commit with the blob is found, that > > interval can be bisected, so that the first commit is found in O(log > > number of commits) instead of O(commits). But this can be done later. > > bisection assumes that we only have one "event" going from good to > bad, which doesn't hold true here, as the blob can be there at different > occasions of the history. Yes, bisection would only be useful to output one commit more quickly. We will still need to exhaustively search for all commits if the user needs more than one. ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv3 0/7] git describe blob 2017-11-02 19:41 ` [PATCHv3 " Stefan Beller ` (6 preceding siblings ...) 2017-11-02 19:41 ` [PATCHv3 7/7] builtin/describe.c: describe a blob Stefan Beller @ 2017-11-03 0:23 ` Jacob Keller 2017-11-03 1:46 ` Junio C Hamano 7 siblings, 1 reply; 110+ messages in thread From: Jacob Keller @ 2017-11-03 0:23 UTC (permalink / raw) To: Stefan Beller Cc: Johannes Schindelin, Git mailing list, Kevin Daudt, Andreas Schwab On Thu, Nov 2, 2017 at 12:41 PM, Stefan Beller <sbeller@google.com> wrote: > Thanks for the discussion on v2[1]. > > Interdiff is below, just fixing minor things. > > We'll keep the original algorithm for now, deferring an improvement on > the algorithm front towards a future developer. > I agree, "something" is better than "nothing", and we can work to improve "something" in the future, especially after we get more real use and see what people think. Only question would be how much do we need to document the current behavior might be open for improvement? > Thanks, > Stefan > > [1] https://public-inbox.org/git/20171031211852.13001-1-sbeller@google.com/ > > Stefan Beller (7): > t6120: fix typo in test name > list-objects.c: factor out traverse_trees_and_blobs > revision.h: introduce blob/tree walking in order of the commits > builtin/describe.c: rename `oid` to avoid variable shadowing > builtin/describe.c: print debug statements earlier > builtin/describe.c: factor out describe_commit > builtin/describe.c: describe a blob > > Documentation/git-describe.txt | 13 +++- > Documentation/rev-list-options.txt | 5 ++ > builtin/describe.c | 125 ++++++++++++++++++++++++++++--------- > list-objects.c | 52 +++++++++------ > revision.c | 2 + > revision.h | 3 +- > t/t6100-rev-list-in-order.sh | 47 ++++++++++++++ > t/t6120-describe.sh | 17 ++++- > 8 files changed, 214 insertions(+), 50 deletions(-) > create mode 100755 t/t6100-rev-list-in-order.sh > diff --git c/Documentation/git-describe.txt w/Documentation/git-describe.txt > index 077c3c2193..79ec0be62a 100644 > --- c/Documentation/git-describe.txt > +++ w/Documentation/git-describe.txt > @@ -11,7 +11,7 @@ SYNOPSIS > [verse] > 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] [<commit-ish>...] > 'git describe' [--all] [--tags] [--contains] [--abbrev=<n>] --dirty[=<mark>] > -'git describe' [<options>] <blob-ish> > +'git describe' [<options>] <blob> > > DESCRIPTION > ----------- > diff --git c/builtin/describe.c w/builtin/describe.c > index 382cbae908..cf08bef344 100644 > --- c/builtin/describe.c > +++ w/builtin/describe.c > @@ -440,6 +440,7 @@ struct process_commit_data { > struct object_id current_commit; > struct object_id looking_for; > struct strbuf *dst; > + struct rev_info *revs; > }; > > static void process_commit(struct commit *commit, void *data) > @@ -456,6 +457,7 @@ static void process_object(struct object *obj, const char *path, void *data) > reset_revision_walk(); > describe_commit(&pcd->current_commit, pcd->dst); > strbuf_addf(pcd->dst, ":%s", path); > + pcd->revs->max_count = 0; > } > } > > @@ -463,7 +465,7 @@ static void describe_blob(struct object_id oid, struct strbuf *dst) > { > struct rev_info revs; > struct argv_array args = ARGV_ARRAY_INIT; > - struct process_commit_data pcd = { null_oid, oid, dst}; > + struct process_commit_data pcd = { null_oid, oid, dst, &revs}; > > argv_array_pushl(&args, "internal: The first arg is not parsed", > "--all", "--reflog", /* as many starting points as possible */ > @@ -497,14 +499,12 @@ static void describe(const char *arg, int last_one) > die(_("Not a valid object name %s"), arg); > cmit = lookup_commit_reference_gently(&oid, 1); > > - if (cmit) { > + if (cmit) > describe_commit(&oid, &sb); > - } else { > - if (lookup_blob(&oid)) > - describe_blob(oid, &sb); > - else > - die(_("%s is neither a commit nor blob"), arg); > - } > + else if (lookup_blob(&oid)) > + describe_blob(oid, &sb); > + else > + die(_("%s is neither a commit nor blob"), arg); > > puts(sb.buf); > > diff --git c/list-objects.c w/list-objects.c > index 03438e5a8b..07a92f35fe 100644 > --- c/list-objects.c > +++ w/list-objects.c > @@ -184,13 +184,13 @@ static void add_pending_tree(struct rev_info *revs, struct tree *tree) > } > > static void traverse_trees_and_blobs(struct rev_info *revs, > - struct strbuf *base_path, > + struct strbuf *base, > show_object_fn show_object, > void *data) > { > int i; > > - assert(base_path->len == 0); > + assert(base->len == 0); > > for (i = 0; i < revs->pending.nr; i++) { > struct object_array_entry *pending = revs->pending.objects + i; > @@ -208,12 +208,12 @@ static void traverse_trees_and_blobs(struct rev_info *revs, > path = ""; > if (obj->type == OBJ_TREE) { > process_tree(revs, (struct tree *)obj, show_object, > - base_path, path, data); > + base, path, data); > continue; > } > if (obj->type == OBJ_BLOB) { > process_blob(revs, (struct blob *)obj, show_object, > - base_path, path, data); > + base, path, data); > continue; > } > die("unknown pending object %s (%s)", > diff --git c/t/t6100-rev-list-in-order.sh w/t/t6100-rev-list-in-order.sh > index 651666979b..d4d539b0da 100755 > --- c/t/t6100-rev-list-in-order.sh > +++ w/t/t6100-rev-list-in-order.sh > @@ -1,21 +1,22 @@ > #!/bin/sh > > -test_description='miscellaneous rev-list tests' > +test_description='rev-list testing in-commit-order' > > . ./test-lib.sh > > - > -test_expect_success 'setup' ' > +test_expect_success 'rev-list --in-commit-order' ' > for x in one two three four > do > echo $x >$x && > git add $x && > - git commit -m "add file $x" > + git commit -m "add file $x" || > + return 1 > done && > for x in four three > do > git rm $x && > - git commit -m "remove $x" > + git commit -m "remove $x" || > + return 1 > done && > git rev-list --in-commit-order --objects HEAD >actual.raw && > cut -c 1-40 >actual <actual.raw && > @@ -32,9 +33,9 @@ test_expect_success 'setup' ' > HEAD~2^{tree} > HEAD~2^{tree}:four > HEAD~3^{commit} > - # HEAD~3^{tree} skipped > + # HEAD~3^{tree} skipped, same as HEAD~1^{tree} > HEAD~4^{commit} > - # HEAD~4^{tree} skipped > + # HEAD~4^{tree} skipped, same as HEAD^{tree} > HEAD~5^{commit} > HEAD~5^{tree} > EOF > diff --git c/t/t6120-describe.sh w/t/t6120-describe.sh > index fd329f173a..aec6ed192d 100755 > --- c/t/t6120-describe.sh > +++ w/t/t6120-describe.sh > @@ -319,7 +319,7 @@ test_expect_success 'describe a blob at a tag' ' > test_cmp expect actual > ' > > -test_expect_success 'describe a surviving blob' ' > +test_expect_success 'describe a blob with commit-ish' ' > git commit --allow-empty -m "empty commit" && > git describe HEAD:file >actual && > grep unique-file-1-g actual ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv3 0/7] git describe blob 2017-11-03 0:23 ` [PATCHv3 0/7] git describe blob Jacob Keller @ 2017-11-03 1:46 ` Junio C Hamano 2017-11-03 2:29 ` Stefan Beller 0 siblings, 1 reply; 110+ messages in thread From: Junio C Hamano @ 2017-11-03 1:46 UTC (permalink / raw) To: Jacob Keller Cc: Stefan Beller, Johannes Schindelin, Git mailing list, Kevin Daudt, Andreas Schwab Jacob Keller <jacob.keller@gmail.com> writes: > I agree, "something" is better than "nothing", and we can work to > improve "something" in the future, especially after we get more real > use and see what people think. Only question would be how much do we > need to document the current behavior might be open for improvement? If - it always digs to the root of the history no matter what; and/or - it almost always yields correct but suboptimal result then that fact must be documented in BUGS section, possibly a brief descrition of why that limitation is there, with a hint to invite people to look into fixing it. We should mark it prominently as experimental and advertise it as such. "It's too slow in real project to be usable" and "Its output bases the answer on an irrelevant commit" are not something we want our users to experience, except for those with inclination to (or ability and time to) help improve the feature. ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCHv3 0/7] git describe blob 2017-11-03 1:46 ` Junio C Hamano @ 2017-11-03 2:29 ` Stefan Beller 0 siblings, 0 replies; 110+ messages in thread From: Stefan Beller @ 2017-11-03 2:29 UTC (permalink / raw) To: Junio C Hamano Cc: Jacob Keller, Johannes Schindelin, Git mailing list, Kevin Daudt, Andreas Schwab On Thu, Nov 2, 2017 at 6:46 PM, Junio C Hamano <gitster@pobox.com> wrote: > Jacob Keller <jacob.keller@gmail.com> writes: > >> I agree, "something" is better than "nothing", and we can work to >> improve "something" in the future, especially after we get more real >> use and see what people think. Only question would be how much do we >> need to document the current behavior might be open for improvement? > > If > > - it always digs to the root of the history no matter what; and/or this is fixed. > - it almost always yields correct but suboptimal result this is not, for the lack of knowing what the optimal result is. > > then that fact must be documented in BUGS section, possibly a brief > descrition of why that limitation is there, with a hint to invite > people to look into fixing it. > > We should mark it prominently as experimental and advertise it as > such. "It's too slow in real project to be usable" I found it quite fast after fixing the history walking, but still. > and "Its output > bases the answer on an irrelevant commit" are not something we want > our users to experience, except for those with inclination to (or > ability and time to) help improve the feature. I think the current documentation states exactly this. Thanks. ^ permalink raw reply [flat|nested] 110+ messages in thread
end of thread, other threads:[~2017-11-20 18:19 UTC | newest] Thread overview: 110+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-10-28 0:44 [RFC PATCH 0/3] git-describe <blob> ? Stefan Beller 2017-10-28 0:45 ` [PATCH 1/3] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller 2017-10-28 0:45 ` [PATCH 2/3] revision.h: introduce blob/tree walking in order of the commits Stefan Beller 2017-10-28 17:20 ` Johannes Schindelin 2017-10-29 3:22 ` Stefan Beller 2017-10-29 3:23 ` Stefan Beller 2017-10-29 3:43 ` Junio C Hamano 2017-10-28 0:45 ` [PATCH 3/3] builtin/describe: describe blobs Stefan Beller 2017-10-28 17:32 ` Johannes Schindelin 2017-10-28 22:47 ` Jacob Keller 2017-10-29 3:28 ` Stefan Beller 2017-10-29 12:02 ` Kevin Daudt 2017-10-29 12:07 ` Johannes Schindelin 2017-10-28 17:15 ` [PATCH 1/3] list-objects.c: factor out traverse_trees_and_blobs Johannes Schindelin 2017-10-29 3:13 ` Stefan Beller 2017-10-28 16:04 ` [RFC PATCH 0/3] git-describe <blob> ? Johannes Schindelin 2017-10-31 0:33 ` [PATCH 0/7] git-describe <blob> Stefan Beller 2017-10-31 0:33 ` [PATCH 1/7] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller 2017-10-31 6:07 ` Junio C Hamano 2017-10-31 0:33 ` [PATCH 2/7] revision.h: introduce blob/tree walking in order of the commits Stefan Beller 2017-10-31 6:57 ` Junio C Hamano 2017-10-31 18:12 ` Stefan Beller 2017-10-31 0:33 ` [PATCH 3/7] builtin/describe.c: rename `oid` to avoid variable shadowing Stefan Beller 2017-10-31 8:15 ` Jacob Keller 2017-10-31 0:33 ` [PATCH 4/7] builtin/describe.c: print debug statements earlier Stefan Beller 2017-10-31 7:03 ` Junio C Hamano 2017-10-31 19:05 ` Stefan Beller 2017-10-31 0:33 ` [PATCH 5/7] builtin/describe.c: factor out describe_commit Stefan Beller 2017-10-31 0:33 ` [PATCH 6/7] builtin/describe.c: describe a blob Stefan Beller 2017-10-31 6:25 ` Junio C Hamano 2017-10-31 19:16 ` Stefan Beller 2017-11-01 3:34 ` Junio C Hamano 2017-11-01 20:58 ` Stefan Beller 2017-11-02 1:53 ` Junio C Hamano 2017-11-02 4:23 ` Junio C Hamano 2017-11-04 21:15 ` Philip Oakley 2017-11-05 6:28 ` Junio C Hamano 2017-11-06 23:50 ` Philip Oakley 2017-11-09 20:30 ` Stefan Beller 2017-11-10 0:25 ` Philip Oakley 2017-11-10 1:24 ` Junio C Hamano 2017-11-10 22:44 ` [PATCH 0/1] describe a blob: with better docs Stefan Beller 2017-11-10 22:44 ` [PATCH] builtin/describe.c: describe a blob Stefan Beller 2017-11-13 1:33 ` Junio C Hamano 2017-11-14 23:37 ` Stefan Beller 2017-11-20 15:22 ` [PATCH 6/7] " Philip Oakley 2017-11-20 18:18 ` Philip Oakley 2017-11-01 3:44 ` Junio C Hamano 2017-10-31 0:33 ` [PATCH 7/7] t6120: fix typo in test name Stefan Beller 2017-11-01 1:21 ` Junio C Hamano 2017-11-01 18:13 ` Stefan Beller 2017-11-02 1:36 ` Junio C Hamano 2017-10-31 21:18 ` [PATCHv2 0/7] git describe blob Stefan Beller 2017-10-31 21:18 ` [PATCHv2 1/7] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller 2017-11-01 3:46 ` Junio C Hamano 2017-10-31 21:18 ` [PATCHv2 2/7] revision.h: introduce blob/tree walking in order of the commits Stefan Beller 2017-11-01 3:50 ` Junio C Hamano 2017-11-01 12:26 ` Johannes Schindelin 2017-11-01 12:37 ` Junio C Hamano 2017-11-01 19:37 ` Stefan Beller 2017-11-01 22:08 ` Johannes Schindelin 2017-11-01 22:19 ` Stefan Beller 2017-11-01 22:39 ` Johannes Schindelin 2017-11-01 22:46 ` Stefan Beller 2017-11-01 21:36 ` Johannes Schindelin 2017-11-01 21:39 ` Jeff King 2017-11-01 22:33 ` Johannes Schindelin 2017-11-02 1:20 ` Junio C Hamano 2017-10-31 21:18 ` [PATCHv2 3/7] builtin/describe.c: rename `oid` to avoid variable shadowing Stefan Beller 2017-10-31 21:18 ` [PATCHv2 4/7] builtin/describe.c: print debug statements earlier Stefan Beller 2017-10-31 21:31 ` Eric Sunshine 2017-10-31 21:18 ` [PATCHv2 5/7] builtin/describe.c: factor out describe_commit Stefan Beller 2017-10-31 21:18 ` [PATCHv2 6/7] builtin/describe.c: describe a blob Stefan Beller 2017-10-31 21:49 ` Eric Sunshine 2017-11-01 19:51 ` Stefan Beller 2017-11-01 4:11 ` Junio C Hamano 2017-11-01 12:32 ` Johannes Schindelin 2017-11-01 17:59 ` Stefan Beller 2017-11-01 21:05 ` Jacob Keller 2017-11-01 22:12 ` Johannes Schindelin 2017-11-01 22:21 ` Stefan Beller 2017-11-01 22:41 ` Johannes Schindelin 2017-11-01 22:53 ` Stefan Beller 2017-11-02 6:05 ` Jacob Keller 2017-11-03 5:18 ` Junio C Hamano 2017-11-03 6:55 ` Jacob Keller 2017-11-03 15:02 ` Junio C Hamano 2017-11-02 7:23 ` Andreas Schwab 2017-11-02 18:18 ` Stefan Beller 2017-11-03 12:05 ` Johannes Schindelin 2017-11-01 21:28 ` Stefan Beller 2017-10-31 21:18 ` [PATCHv2 7/7] t6120: fix typo in test name Stefan Beller 2017-11-01 5:14 ` [PATCHv2 0/7] git describe blob Junio C Hamano 2017-11-02 19:41 ` [PATCHv3 " Stefan Beller 2017-11-02 19:41 ` [PATCHv3 1/7] t6120: fix typo in test name Stefan Beller 2017-11-02 19:41 ` [PATCHv3 2/7] list-objects.c: factor out traverse_trees_and_blobs Stefan Beller 2017-11-02 19:41 ` [PATCHv3 3/7] revision.h: introduce blob/tree walking in order of the commits Stefan Beller 2017-11-14 19:52 ` Jonathan Tan 2017-11-02 19:41 ` [PATCHv3 4/7] builtin/describe.c: rename `oid` to avoid variable shadowing Stefan Beller 2017-11-02 19:41 ` [PATCHv3 5/7] builtin/describe.c: print debug statements earlier Stefan Beller 2017-11-14 19:55 ` Jonathan Tan 2017-11-14 20:00 ` Stefan Beller 2017-11-02 19:41 ` [PATCHv3 6/7] builtin/describe.c: factor out describe_commit Stefan Beller 2017-11-02 19:41 ` [PATCHv3 7/7] builtin/describe.c: describe a blob Stefan Beller 2017-11-14 20:02 ` Jonathan Tan 2017-11-14 20:40 ` Stefan Beller 2017-11-14 21:17 ` Jonathan Tan 2017-11-03 0:23 ` [PATCHv3 0/7] git describe blob Jacob Keller 2017-11-03 1:46 ` Junio C Hamano 2017-11-03 2:29 ` Stefan Beller
Code repositories for project(s) associated with this public inbox https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).