git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v1] load_ref_decorations(): fix decoration with tags
@ 2021-07-12 22:41 Martin Ågren
  2021-07-13  0:06 ` Jeff King
  0 siblings, 1 reply; 12+ messages in thread
From: Martin Ågren @ 2021-07-12 22:41 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Taylor Blau

Commit 88473c8bae ("load_ref_decorations(): avoid parsing non-tag
objects", 2021-06-22) introduced a shortcut to `add_ref_decoration()`:
Rather than calling `parse_object()` directly, call `oid_object_info()`
and then either return early or go on to call `lookup_object_by_type()`
using the type just discovered. As detailed in the commit message, this
provides a significant time saving.

Unfortunately, it also changes the behavior. As an example, on git.git,

  git log --oneline --decorate origin/master | grep '(.*tag:.*)'

returns zero hits after 88473c8bae. Before it, we have around 800 hits.
What happens is, all annotated tags are lost from the decoration.

Let's do a partial revert of 88473c8bae: Keep doing that early
`oid_object_info()`, but after that, go on with the full
`parse_object()`. This restores the pre-88473c8bae behavior. We clearly
have lacking test coverage here, so make sure to add a test. Without
this fix to log-tree.c, the git-log invocation in this new test does
pick up the lightweight tags involved, but misses out on the annotated
tag.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 Cc-ing Peff, the author of 88473c8bae, and Taylor, who made a comment
 regarding the peeling of tags [1], which might be related.

 I'm genuinely unconvinced that my proposed fix is the best possible one.
 Or maybe trying a more surgical fix around annotated tags misses a
 whole bunch of *other* special cases and we really should go for the
 full `parse_object()` to cover all possibilities.

 In my brief testing (time git log -1 --decorate on git.git), the time
 savings from 88473c8bae seem to be gone. So maybe this should be a full
 revert, rather than a partial one. (Plus the test.) Let's hope that
 won't be necessary.

 Also, I'm not sure whether the test really needs to worry about the
 order of the decorations suddenly changing -- maybe it's supposed to be
 stable.

 [1] https://lore.kernel.org/git/YNKgkGkPiMgNubNE@nand.local/

 log-tree.c     |  6 ++----
 t/t4202-log.sh | 14 ++++++++++++++
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/log-tree.c b/log-tree.c
index 4f69ed176d..0b638d2e3c 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -134,7 +134,6 @@ static int add_ref_decoration(const char *refname, const struct object_id *oid,
 			      int flags, void *cb_data)
 {
 	struct object *obj;
-	enum object_type objtype;
 	enum decoration_type deco_type = DECORATION_NONE;
 	struct decoration_filter *filter = (struct decoration_filter *)cb_data;
 
@@ -156,10 +155,9 @@ static int add_ref_decoration(const char *refname, const struct object_id *oid,
 		return 0;
 	}
 
-	objtype = oid_object_info(the_repository, oid, NULL);
-	if (objtype < 0)
+	if (oid_object_info(the_repository, oid, NULL) < 0)
 		return 0;
-	obj = lookup_object_by_type(the_repository, oid, objtype);
+	obj = parse_object(the_repository, oid);
 
 	if (starts_with(refname, "refs/heads/"))
 		deco_type = DECORATION_REF_LOCAL;
diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index 350cfa3593..3aa5451913 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -1905,6 +1905,20 @@ test_expect_success '--exclude-promisor-objects does not BUG-crash' '
 	test_must_fail git log --exclude-promisor-objects source-a
 '
 
+test_expect_success 'log --decorate includes, e.g., all kinds of tags' '
+	git log --oneline --decorate >log &&
+	test_line_count = 2 log &&
+	grep "^1ac6c77 (tag: one) one$" log &&
+	grep HEAD log | sed -e "s/^.*(\(.*\)).*$/\1/" -e "s/, /\n/g" |
+		sort >actual &&
+	cat >expect <<-\EOF &&
+		HEAD -> source-b
+		tag: source-tag
+		tag: three
+	EOF
+	test_cmp expect actual
+'
+
 test_expect_success 'log --end-of-options' '
        git update-ref refs/heads/--source HEAD &&
        git log --end-of-options --source >actual &&
-- 
2.32.0.402.g57bb445576


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v1] load_ref_decorations(): fix decoration with tags
  2021-07-12 22:41 [PATCH v1] load_ref_decorations(): fix decoration with tags Martin Ågren
@ 2021-07-13  0:06 ` Jeff King
  2021-07-13  7:40   ` [PATCH v2] " Martin Ågren
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff King @ 2021-07-13  0:06 UTC (permalink / raw)
  To: Martin Ågren; +Cc: git, Taylor Blau

On Tue, Jul 13, 2021 at 12:41:52AM +0200, Martin Ågren wrote:

> Commit 88473c8bae ("load_ref_decorations(): avoid parsing non-tag
> objects", 2021-06-22) introduced a shortcut to `add_ref_decoration()`:
> Rather than calling `parse_object()` directly, call `oid_object_info()`
> and then either return early or go on to call `lookup_object_by_type()`
> using the type just discovered. As detailed in the commit message, this
> provides a significant time saving.
> 
> Unfortunately, it also changes the behavior. As an example, on git.git,
> 
>   git log --oneline --decorate origin/master | grep '(.*tag:.*)'
> 
> returns zero hits after 88473c8bae. Before it, we have around 800 hits.
> What happens is, all annotated tags are lost from the decoration.

Eek. Thanks for catching this.

I wondered how I could have missed this, but it does work if somebody
else happens to have parsed it. For example:

  $ git log -1 --oneline --decorate v5.11
  f40ddce88593 (tag: v5.11) Linux 5.11

works because we'll already have parsed it as a traversal tip.

> Let's do a partial revert of 88473c8bae: Keep doing that early
> `oid_object_info()`, but after that, go on with the full
> `parse_object()`. This restores the pre-88473c8bae behavior.

Your fix is _almost_ there. There's not much point in doing
oid_object_info() if we're just going to parse unconditionally. But we
would want to parse only tags.

And we even do parse tags recursively in the peeling loop. The trouble
is that we do so after realizing we need to recurse. We just need to
bump it up in the loop, like:

diff --git a/log-tree.c b/log-tree.c
index 4f69ed176d..6dc4412268 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -174,11 +174,11 @@ static int add_ref_decoration(const char *refname, const struct object_id *oid,
 
 	add_name_decoration(deco_type, refname, obj);
 	while (obj->type == OBJ_TAG) {
+		if (!obj->parsed)
+			parse_object(the_repository, &obj->oid);
 		obj = ((struct tag *)obj)->tagged;
 		if (!obj)
 			break;
-		if (!obj->parsed)
-			parse_object(the_repository, &obj->oid);
 		add_name_decoration(DECORATION_REF_TAG, refname, obj);
 	}
 	return 0;

That's the minimum needed to unbreak things. I think we could do even
better, though. There is no need for us to parse a commit object pointed
to by a tag here. We should only be parsing tags we see (whether at the
top-level or recursively).

Do you want to incorporate the fix above, or would you prefer me to pick
it up from here?

> We clearly
> have lacking test coverage here, so make sure to add a test. Without
> this fix to log-tree.c, the git-log invocation in this new test does
> pick up the lightweight tags involved, but misses out on the annotated
> tag.

Yeah, definitely.

>  In my brief testing (time git log -1 --decorate on git.git), the time
>  savings from 88473c8bae seem to be gone. So maybe this should be a full
>  revert, rather than a partial one. (Plus the test.) Let's hope that
>  won't be necessary.

Right. After your patch, the oid_object_info() is pointless, because
we're still parsing everything (the "< 0" error check would only trigger
in a corrupted repo). And it adds some overhead, so it may even be
slightly slower than the original code. :)

The timings I get for "git log -1 --decorate" on git.git are:

  -  before either patch: 27.5ms
  - with my broken patch:  5.9ms
  - with the patch above: 11.3ms

which makes sense. I have a bunch of branches, and now we don't parse
them. We do still have to parse tags. On linux.git, where it's almost
entirely tags, most of the advantage dries up (but it would probably be
helped a bit by the further suggestion I gave above to avoid parsing
tagged commits).

On my big ~220k ref test case (where it's mostly non-tags), the timings
are:

  -  before either patch: 2.945s
  - with my broken patch: 0.707s
  - with the patch above: 0.788s

so the savings are retained.

>  Also, I'm not sure whether the test really needs to worry about the
>  order of the decorations suddenly changing -- maybe it's supposed to be
>  stable.

I think it's probably OK to count on it being stable. We iterate the
refs in a stable order to insert them, and then store the result as a
linked list. If that strategy ever changed, I think we'd end up doing a
manual sort on them to get a stable order anyway.

> diff --git a/t/t4202-log.sh b/t/t4202-log.sh
> index 350cfa3593..3aa5451913 100755
> --- a/t/t4202-log.sh
> +++ b/t/t4202-log.sh
> @@ -1905,6 +1905,20 @@ test_expect_success '--exclude-promisor-objects does not BUG-crash' '
>  	test_must_fail git log --exclude-promisor-objects source-a
>  '
>  
> +test_expect_success 'log --decorate includes, e.g., all kinds of tags' '
> +	git log --oneline --decorate >log &&
> +	test_line_count = 2 log &&
> +	grep "^1ac6c77 (tag: one) one$" log &&

This presumably breaks when the tests are run in sha256 mode. Coupled
with the ordering simplification, maybe:

        cat >expect <<-\EOF &&
        three HEAD -> source-b, tag: three, tag: source-tag
        one tag: one
        EOF
        git log --format="%s %D" >actual &&
        test_cmp expect actual

(or maybe %d is more readable in the output; it doesn't matter much if
we're matching it verbatim).

Thanks again for noticing this.

-Peff

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2] load_ref_decorations(): fix decoration with tags
  2021-07-13  0:06 ` Jeff King
@ 2021-07-13  7:40   ` Martin Ågren
  2021-07-13  7:52     ` Jeff King
  0 siblings, 1 reply; 12+ messages in thread
From: Martin Ågren @ 2021-07-13  7:40 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Taylor Blau

Commit 88473c8bae ("load_ref_decorations(): avoid parsing non-tag
objects", 2021-06-22) introduced a shortcut to `add_ref_decoration()`:
Rather than calling `parse_object()`, we go for `oid_object_info()` and
then `lookup_object_by_type()` using the type just discovered. As
detailed in the commit message, this provides a significant time saving.

Unfortunately, it also changes the behavior: We lose all annotated tags
from the decoration.

The reason this happens is in the loop where we try to peel the tags, we
won't necessarily have parsed that first object. If we haven't, its
`tag` will be NULL, so nothing will be displayed, and its `tagged` will
also be NULL, so we won't peel any further.

Make sure to parse the tag object at the top of the peeling loop. This
effectively restores the pre-88473c8bae parsing -- but only of tags,
allowing us to keep most of the possible speedup from 88473c8bae. Jeff
King reports:

  On my big ~220k ref test case (where it's mostly non-tags), the
  timings [using "git log -1 --decorate"] are:

    - before either patch: 2.945s
    - with my broken patch: 0.707s
    - with [this patch]: 0.788s

Note how this commit could have been done as an optimization before
88473c8bae: When our peeling hits a non-tag, we won't parse that tagged
object only to immediately end the loop.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 On Tue, 13 Jul 2021 at 02:06, Jeff King <peff@peff.net> wrote:
 >
 > Your fix is _almost_ there.

 It's very kind of you to put it like that. I've picked up your
 suggestions and have tried to summarize my understanding of the issue
 and the fix in the commit message.

 > That's the minimum needed to unbreak things. I think we could do even
 > better, though. There is no need for us to parse a commit object pointed
 > to by a tag here. We should only be parsing tags we see (whether at the
 > top-level or recursively).

 Maybe you wrote this before circling back and actually writing that
 "even better" thing? Because it seems to me like that's what you did.
 Maybe I'm still missing something.

 Thank you for your insightful and helpful comments.

 log-tree.c     | 4 ++--
 t/t4202-log.sh | 9 +++++++++
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/log-tree.c b/log-tree.c
index 4f69ed176d..6dc4412268 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -174,11 +174,11 @@ static int add_ref_decoration(const char *refname, const struct object_id *oid,
 
 	add_name_decoration(deco_type, refname, obj);
 	while (obj->type == OBJ_TAG) {
+		if (!obj->parsed)
+			parse_object(the_repository, &obj->oid);
 		obj = ((struct tag *)obj)->tagged;
 		if (!obj)
 			break;
-		if (!obj->parsed)
-			parse_object(the_repository, &obj->oid);
 		add_name_decoration(DECORATION_REF_TAG, refname, obj);
 	}
 	return 0;
diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index 350cfa3593..536b1eef42 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -1905,6 +1905,15 @@ test_expect_success '--exclude-promisor-objects does not BUG-crash' '
 	test_must_fail git log --exclude-promisor-objects source-a
 '
 
+test_expect_success 'log --decorate includes lightweight and annotated tags' '
+	cat >expect <<-\EOF &&
+	three HEAD -> source-b, tag: three, tag: source-tag
+	one tag: one
+	EOF
+	git log --format="%s %D" >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'log --end-of-options' '
        git update-ref refs/heads/--source HEAD &&
        git log --end-of-options --source >actual &&
-- 
2.32.0.402.g57bb445576


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] load_ref_decorations(): fix decoration with tags
  2021-07-13  7:40   ` [PATCH v2] " Martin Ågren
@ 2021-07-13  7:52     ` Jeff King
  2021-07-13  8:47       ` Martin Ågren
  2021-07-13 21:17       ` Junio C Hamano
  0 siblings, 2 replies; 12+ messages in thread
From: Jeff King @ 2021-07-13  7:52 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Junio C Hamano, git, Taylor Blau

[+cc Junio; this patch looks good to me, and should go on top of
	    jk/log-decorate-optim, which is in 'next' and has a pretty
	    ugly regression]

On Tue, Jul 13, 2021 at 09:40:18AM +0200, Martin Ågren wrote:

> Commit 88473c8bae ("load_ref_decorations(): avoid parsing non-tag
> objects", 2021-06-22) introduced a shortcut to `add_ref_decoration()`:
> Rather than calling `parse_object()`, we go for `oid_object_info()` and
> then `lookup_object_by_type()` using the type just discovered. As
> detailed in the commit message, this provides a significant time saving.
> 
> Unfortunately, it also changes the behavior: We lose all annotated tags
> from the decoration.
> 
> The reason this happens is in the loop where we try to peel the tags, we
> won't necessarily have parsed that first object. If we haven't, its
> `tag` will be NULL, so nothing will be displayed, and its `tagged` will
> also be NULL, so we won't peel any further.

Thanks, nicely explained.

> Note how this commit could have been done as an optimization before
> 88473c8bae: When our peeling hits a non-tag, we won't parse that tagged
> object only to immediately end the loop.

Yep, thanks for mentioning this, as it's somewhat subtle.

>  On Tue, 13 Jul 2021 at 02:06, Jeff King <peff@peff.net> wrote:
>  >
>  > Your fix is _almost_ there.
> 
>  It's very kind of you to put it like that. I've picked up your
>  suggestions and have tried to summarize my understanding of the issue
>  and the fix in the commit message.

When I wrote that, I thought the fix would just be:

  if (obj_type == OBJ_TAG)
	parse_object(...);

which really would put it only one line off of your fix. :)

>  > That's the minimum needed to unbreak things. I think we could do even
>  > better, though. There is no need for us to parse a commit object pointed
>  > to by a tag here. We should only be parsing tags we see (whether at the
>  > top-level or recursively).
> 
>  Maybe you wrote this before circling back and actually writing that
>  "even better" thing? Because it seems to me like that's what you did.
>  Maybe I'm still missing something.

Nope, I'm just dumb. I wrote what I sent in the other email (rather than
just adding the "if" as above) because it only involved having a single
parse_object() call in the function. To my credit, I did realize about
an hour after sending the other email that I had in fact done the
"better thing" quite accidentally. But I really like how you explained
it in the commit message here, which I had not quite thought through.

>  log-tree.c     | 4 ++--
>  t/t4202-log.sh | 9 +++++++++

Patch looks good. Thanks for noticing the problem and cleaning up my
mess.

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] load_ref_decorations(): fix decoration with tags
  2021-07-13  7:52     ` Jeff King
@ 2021-07-13  8:47       ` Martin Ågren
  2021-07-13 21:17       ` Junio C Hamano
  1 sibling, 0 replies; 12+ messages in thread
From: Martin Ågren @ 2021-07-13  8:47 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Git Mailing List, Taylor Blau

On Tue, 13 Jul 2021 at 09:52, Jeff King <peff@peff.net> wrote:
>
> [+cc Junio; this patch looks good to me, and should go on top of
>             jk/log-decorate-optim, which is in 'next' and has a pretty
>             ugly regression]

Thanks for calling out the branch name when I failed to. That's indeed
where my copy of the patch sits and where I've tested this.

> >  Maybe you wrote this before circling back and actually writing that
> >  "even better" thing? Because it seems to me like that's what you did.
> >  Maybe I'm still missing something.
>
> Nope, I'm just dumb. I wrote what I sent in the other email (rather than
> just adding the "if" as above) because it only involved having a single
> parse_object() call in the function. To my credit, I did realize about
> an hour after sending the other email that I had in fact done the
> "better thing" quite accidentally. But I really like how you explained
> it in the commit message here, which I had not quite thought through.

That's the best kind of hindsight: the sudden realization that you
actually got it right.

> Patch looks good. Thanks for noticing the problem and cleaning up my
> mess.

Thanks a lot for your comments and all the help. I'm glad you liked my
explanation in the posted patch.

Martin

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] load_ref_decorations(): fix decoration with tags
  2021-07-13  7:52     ` Jeff King
  2021-07-13  8:47       ` Martin Ågren
@ 2021-07-13 21:17       ` Junio C Hamano
  2021-07-13 21:27         ` Jeff King
  1 sibling, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2021-07-13 21:17 UTC (permalink / raw)
  To: Jeff King; +Cc: Martin Ågren, git, Taylor Blau

Jeff King <peff@peff.net> writes:

>> The reason this happens is in the loop where we try to peel the tags, we
>> won't necessarily have parsed that first object. If we haven't, its
>> `tag` will be NULL, so nothing will be displayed, and its `tagged` will
>> also be NULL, so we won't peel any further.
>
> Thanks, nicely explained.

Yup, nicely explained indeed.

>> Note how this commit could have been done as an optimization before
>> 88473c8bae: When our peeling hits a non-tag, we won't parse that tagged
>> object only to immediately end the loop.
>
> Yep, thanks for mentioning this, as it's somewhat subtle.

It is too subtle that I am not sure what the paragraph wants to say.

Before 88473c8b, we had a fully parsed object in obj and entered the
while() loop iff the outermost object is a tag, then we find the
underlying object via obj->tagged.  We parse that underlying object
to find if it is a tag, and break out if it is not.

By "this commit", I assume that the above mean the change in this
fix, i.e. parse 'obj' if it has not been parsed before looking at
its tagged field.  But I am not sure how that would have been an
optimization before 88473c8b that gave a parsed tag object 'obj'
upon entry to the loop.

Puzzled.

In any case, let's talk about this patch in the context to which it
is designed to be applied, i.e. post 88473c8b3c8b.

When we come here, we have done oid_object_info() plus
lookup_object_by_type() to obtain 'obj' and we know its type.
Then we enter the loop.

 	while (obj->type == OBJ_TAG) {
+		if (!obj->parsed)
+			parse_object(the_repository, &obj->oid);

And we parse if it hasn't been parsed.  THat is why we can ...

 		obj = ((struct tag *)obj)->tagged;
 		if (!obj)
 			break;

... look at its tagged member.

-		if (!obj->parsed)
-			parse_object(the_repository, &obj->oid);
 		add_name_decoration(DECORATION_REF_TAG, refname, obj);

And the updated 'obj' (i.e. direct referent of the tag object) is
fed to add_name_decoration().  And then we move to the next
iteration.

Now, do we know what type of object 'obj' is at this point?  We
did parse the outermost tag upon entry to this loop, we replaced
'obj' variable with the referent by following the .tagged member,
but we haven't parsed that object or ran oid_object_info() on it.

Puzzled.

 	}

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] load_ref_decorations(): fix decoration with tags
  2021-07-13 21:17       ` Junio C Hamano
@ 2021-07-13 21:27         ` Jeff King
  2021-07-13 21:40           ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff King @ 2021-07-13 21:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Martin Ågren, git, Taylor Blau

On Tue, Jul 13, 2021 at 02:17:53PM -0700, Junio C Hamano wrote:

> >> Note how this commit could have been done as an optimization before
> >> 88473c8bae: When our peeling hits a non-tag, we won't parse that tagged
> >> object only to immediately end the loop.
> >
> > Yep, thanks for mentioning this, as it's somewhat subtle.
> 
> It is too subtle that I am not sure what the paragraph wants to say.
> 
> Before 88473c8b, we had a fully parsed object in obj and entered the
> while() loop iff the outermost object is a tag, then we find the
> underlying object via obj->tagged.  We parse that underlying object
> to find if it is a tag, and break out if it is not.
> 
> By "this commit", I assume that the above mean the change in this
> fix, i.e. parse 'obj' if it has not been parsed before looking at
> its tagged field.  But I am not sure how that would have been an
> optimization before 88473c8b that gave a parsed tag object 'obj'
> upon entry to the loop.
> 
> Puzzled.

The optimization is that we are parsing tags before looking at their
structs, instead of always parsing the thing that the tag points to.

So in the old loop (pseudo-code for clarity):

  parse_object(obj);
  while (obj->type == OBJ_TAG) {
          obj = obj->tagged;
	  parse_object(obj);
  }

if we had a tag pointing to a commit, we'd parse the commit. But we
don't need to. We just need to know that it exists and is a commit.

In the new code, we parse only when we need to look at obj->tagged:

  while (obj->type == OBJ_TAG) {
          parse_object(obj);
	  obj = obj->tagged;
  }

So we must "somehow" know the type of "obj" in the first place, as well
as the type of every obj->tagged we look at. And that leads into your
question here:

> In any case, let's talk about this patch in the context to which it
> is designed to be applied, i.e. post 88473c8b3c8b.
> 
> When we come here, we have done oid_object_info() plus
> lookup_object_by_type() to obtain 'obj' and we know its type.
> Then we enter the loop.
> 
>  	while (obj->type == OBJ_TAG) {
> +		if (!obj->parsed)
> +			parse_object(the_repository, &obj->oid);
> 
> And we parse if it hasn't been parsed.  THat is why we can ...
> 
>  		obj = ((struct tag *)obj)->tagged;
>  		if (!obj)
>  			break;
> 
> ... look at its tagged member.
> 
> -		if (!obj->parsed)
> -			parse_object(the_repository, &obj->oid);
>  		add_name_decoration(DECORATION_REF_TAG, refname, obj);
> 
> And the updated 'obj' (i.e. direct referent of the tag object) is
> fed to add_name_decoration().  And then we move to the next
> iteration.
> 
> Now, do we know what type of object 'obj' is at this point?  We
> did parse the outermost tag upon entry to this loop, we replaced
> 'obj' variable with the referent by following the .tagged member,
> but we haven't parsed that object or ran oid_object_info() on it.
> 
> Puzzled.

...and the answer is that we don't need to parse it. The tag object
mentions the type of what it points to, and we use lookup_commit(), etc,
to create the object pointed to by its "tagged" field. If we can't do
that (say, because the tag is missing the type field, or we previously
saw the object as a different type, etc), then obj->tagged would be NULL
and we'd break out of the loop.

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] load_ref_decorations(): fix decoration with tags
  2021-07-13 21:27         ` Jeff King
@ 2021-07-13 21:40           ` Junio C Hamano
  2021-07-13 21:52             ` Martin Ågren
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2021-07-13 21:40 UTC (permalink / raw)
  To: Jeff King; +Cc: Martin Ågren, git, Taylor Blau

Jeff King <peff@peff.net> writes:

>> Puzzled.
>
> ...and the answer is that we don't need to parse it. The tag object
> mentions the type of what it points to, and we use lookup_commit(), etc,
> to create the object pointed to by its "tagged" field.

Ahh, parse_object() on the outer tag, when instantiating the in-core
obj, allocated an in-core object and that instance is already given
a type from the tag object and .taggeed member points at that
object, so it is not an "unknown" object (tag.c::parse_tag_buffer()).

Totally forgot about that one; thanks.



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] load_ref_decorations(): fix decoration with tags
  2021-07-13 21:40           ` Junio C Hamano
@ 2021-07-13 21:52             ` Martin Ågren
  2021-07-13 22:22               ` Jeff King
  0 siblings, 1 reply; 12+ messages in thread
From: Martin Ågren @ 2021-07-13 21:52 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Git Mailing List, Taylor Blau

Earlier, Junio C Hamano <gitster@pobox.com> wrote:

>>> Note how this commit could have been done as an optimization before
>>> 88473c8bae: When our peeling hits a non-tag, we won't parse that tagged
>>> object only to immediately end the loop.
>>
>> Yep, thanks for mentioning this, as it's somewhat subtle.
>
> It is too subtle that I am not sure what the paragraph wants to say.

Then:

> Jeff King <peff@peff.net> writes:
>
> >> Puzzled.
> >
> > ...and the answer is that we don't need to parse it. The tag object
> > mentions the type of what it points to, and we use lookup_commit(), etc,
> > to create the object pointed to by its "tagged" field.
>
> Ahh, parse_object() on the outer tag, when instantiating the in-core
> obj, allocated an in-core object and that instance is already given
> a type from the tag object and .taggeed member points at that
> object, so it is not an "unknown" object (tag.c::parse_tag_buffer()).
>
> Totally forgot about that one; thanks.

Do you have any suggestions for how this could be explained better? I
waffled on whether to add that paragraph to the commit message and when
I finally did, it seems it got a little bit too succinct.

I'm about to check out for today. Maybe in the morning I can think of
some clarification.

Martin

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] load_ref_decorations(): fix decoration with tags
  2021-07-13 21:52             ` Martin Ågren
@ 2021-07-13 22:22               ` Jeff King
  2021-07-14  8:13                 ` Martin Ågren
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff King @ 2021-07-13 22:22 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Junio C Hamano, Git Mailing List, Taylor Blau

On Tue, Jul 13, 2021 at 11:52:53PM +0200, Martin Ågren wrote:

> > >> Puzzled.
> > >
> > > ...and the answer is that we don't need to parse it. The tag object
> > > mentions the type of what it points to, and we use lookup_commit(), etc,
> > > to create the object pointed to by its "tagged" field.
> >
> > Ahh, parse_object() on the outer tag, when instantiating the in-core
> > obj, allocated an in-core object and that instance is already given
> > a type from the tag object and .taggeed member points at that
> > object, so it is not an "unknown" object (tag.c::parse_tag_buffer()).
> >
> > Totally forgot about that one; thanks.
> 
> Do you have any suggestions for how this could be explained better? I
> waffled on whether to add that paragraph to the commit message and when
> I finally did, it seems it got a little bit too succinct.
> 
> I'm about to check out for today. Maybe in the morning I can think of
> some clarification.

My attempt is below. Most of the new explanation is near the end, but I
tweaked a few other things.

Your original said:

  The reason this happens is in the loop where we try to peel the tags,
  we won't necessarily have parsed that first object. If we haven't, its
  `tag` will be NULL, so nothing will be displayed, and its `tagged`
  will also be NULL, so we won't peel any further.

and my earlier explanations were not thinking of the "tag" field at all,
which made me worried there was another subtle bug in not parsing the
tag earlier. But I don't think so. We don't look at the "tag" field for
setting the annotation; it always comes from the refname. So the
paragraph above should not mention "tag" at all.

I also beefed up the test a bit. All this talk of parsing made me want
to make sure we were covering tags-of-tags correctly (which I think we
are both before and after the patch). After adding that, the expected
decoration output was getting quite cluttered. So I tweaked the test to
make a new commit, give the tags sensible names, and just look at that
one commit.

Here it is.

-- >8 --
From: Martin Ågren <martin.agren@gmail.com>
Subject: load_ref_decorations(): fix decoration with tags

Commit 88473c8bae ("load_ref_decorations(): avoid parsing non-tag
objects", 2021-06-22) introduced a shortcut to `add_ref_decoration()`:
Rather than calling `parse_object()`, we go for `oid_object_info()` and
then `lookup_object_by_type()` using the type just discovered. As
detailed in the commit message, this provides a significant time saving.

Unfortunately, it also changes the behavior: We lose all annotated tags
from the decoration.

The reason this happens is in the loop where we try to peel the tags, we
won't necessarily have parsed that first object. If we haven't, its
`tagged` field will be NULL, so we won't actually add a decoration for
the pointed-to object.

Make sure to parse the tag object at the top of the peeling loop. This
effectively restores the pre-88473c8bae parsing -- but only of tags,
allowing us to keep most of the possible speedup from 88473c8bae. Jeff
King reports:

  On my big ~220k ref test case (where it's mostly non-tags), the
  timings [using "git log -1 --decorate"] are:

    - before either patch: 2.945s
    - with my broken patch: 0.707s
    - with [this patch]: 0.788s

The simplest way to do this is to just conditionally parse before the
loop:

  if (obj->type == OBJ_TAG)
          parse_object(&obj->oid);

But we can observe that our tag-peeling loop needs to peel already, to
examine recursive tags-of-tags. So instead of introducing a new call to
parse_object(), we can simply move the parsing higher in the loop:
instead of parsing the new object before we loop, parse each tag object
before we look at its "tagged" field.

This has another beneficial side effect: if a tag points at a commit (or
other non-tag type), we do not bother to parse the commit at all now.
And we know it is a commit without calling oid_object_info(), because
parsing the surrounding tag object will have created the correct in-core
object based on the "type" field of the tag.

Our test coverage for --decorate was obviously not good, since we missed
this quite-basic regression. The new tests covers an annotated tag
(showing the fix), but also that we correctly show annotations for
lightweight tags and double-annotated tag-of-tags.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
---
 log-tree.c     |  4 ++--
 t/t4202-log.sh | 14 ++++++++++++++
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/log-tree.c b/log-tree.c
index 4f69ed176d..6dc4412268 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -174,11 +174,11 @@ static int add_ref_decoration(const char *refname, const struct object_id *oid,
 
 	add_name_decoration(deco_type, refname, obj);
 	while (obj->type == OBJ_TAG) {
+		if (!obj->parsed)
+			parse_object(the_repository, &obj->oid);
 		obj = ((struct tag *)obj)->tagged;
 		if (!obj)
 			break;
-		if (!obj->parsed)
-			parse_object(the_repository, &obj->oid);
 		add_name_decoration(DECORATION_REF_TAG, refname, obj);
 	}
 	return 0;
diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index 350cfa3593..fe8f5e2067 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -1905,6 +1905,20 @@ test_expect_success '--exclude-promisor-objects does not BUG-crash' '
 	test_must_fail git log --exclude-promisor-objects source-a
 '
 
+test_expect_success 'log --decorate includes all levels of tag annotated tags' '
+	git checkout -b branch &&
+	git commit --allow-empty -m "new commit" &&
+	git tag lightweight HEAD &&
+	git tag -m annotated annotated HEAD &&
+	git tag -m double-0 double-0 HEAD &&
+	git tag -m double-1 double-1 double-0 &&
+	cat >expect <<-\EOF &&
+	HEAD -> branch, tag: lightweight, tag: double-1, tag: double-0, tag: annotated
+	EOF
+	git log -1 --format="%D" >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'log --end-of-options' '
        git update-ref refs/heads/--source HEAD &&
        git log --end-of-options --source >actual &&
-- 
2.32.0.663.g932e3f012f


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] load_ref_decorations(): fix decoration with tags
  2021-07-13 22:22               ` Jeff King
@ 2021-07-14  8:13                 ` Martin Ågren
  2021-07-14 16:31                   ` [PATCH v3] " Jeff King
  0 siblings, 1 reply; 12+ messages in thread
From: Martin Ågren @ 2021-07-14  8:13 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Git Mailing List, Taylor Blau

On Wed, 14 Jul 2021 at 00:22, Jeff King <peff@peff.net> wrote:
>
> On Tue, Jul 13, 2021 at 11:52:53PM +0200, Martin Ågren wrote:
>
> > > Totally forgot about that one; thanks.
> >
> > Do you have any suggestions for how this could be explained better? I
> > waffled on whether to add that paragraph to the commit message and when
> > I finally did, it seems it got a little bit too succinct.
> >
> > I'm about to check out for today. Maybe in the morning I can think of
> > some clarification.
>
> My attempt is below. Most of the new explanation is near the end, but I
> tweaked a few other things.
>
> Your original said:
>
>   The reason this happens is in the loop where we try to peel the tags,
>   we won't necessarily have parsed that first object. If we haven't, its
>   `tag` will be NULL, so nothing will be displayed, and its `tagged`
>   will also be NULL, so we won't peel any further.
>
> and my earlier explanations were not thinking of the "tag" field at all,
> which made me worried there was another subtle bug in not parsing the
> tag earlier. But I don't think so. We don't look at the "tag" field for
> setting the annotation; it always comes from the refname. So the
> paragraph above should not mention "tag" at all.

Thanks for correcting that. The parsed-ness of "obj" affects whether the
decoration is shown at all. I originally concluded that when the
decorations are eventually displayed, it was something like "if ->tag is
non-NULL, display it". But that's obviously not the case. It's more like
"->tagged is NULL, so I have no idea where to place this decoration".

> I also beefed up the test a bit. All this talk of parsing made me want
> to make sure we were covering tags-of-tags correctly (which I think we
> are both before and after the patch). After adding that, the expected
> decoration output was getting quite cluttered. So I tweaked the test to
> make a new commit, give the tags sensible names, and just look at that
> one commit.

> From: Martin Ågren <martin.agren@gmail.com>

At this point, I think it's fair to say that you've done most of the
authoring here. I wouldn't be at all offended if you took the credit for
this patch. It's your code diff, it's your test, and now it's even your
*updated* test, plus half the commit message. :)

Here's that added half of the message:

> The simplest way to do this is to just conditionally parse before the
> loop:
>
>   if (obj->type == OBJ_TAG)
>           parse_object(&obj->oid);
>
> But we can observe that our tag-peeling loop needs to peel already, to
> examine recursive tags-of-tags. So instead of introducing a new call to
> parse_object(), we can simply move the parsing higher in the loop:
> instead of parsing the new object before we loop, parse each tag object
> before we look at its "tagged" field.
>
> This has another beneficial side effect: if a tag points at a commit (or
> other non-tag type), we do not bother to parse the commit at all now.
> And we know it is a commit without calling oid_object_info(), because
> parsing the surrounding tag object will have created the correct in-core
> object based on the "type" field of the tag.
>
> Our test coverage for --decorate was obviously not good, since we missed
> this quite-basic regression. The new tests covers an annotated tag
> (showing the fix), but also that we correctly show annotations for
> lightweight tags and double-annotated tag-of-tags.

Very well described.

> Helped-by: Jeff King <peff@peff.net>
> Signed-off-by: Martin Ågren <martin.agren@gmail.com>
> Signed-off-by: Jeff King <peff@peff.net>

If you take authorship of this, I think this could be something like

Reported-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Martin Ågren <martin.agren@gmail.com>

Martin

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v3] load_ref_decorations(): fix decoration with tags
  2021-07-14  8:13                 ` Martin Ågren
@ 2021-07-14 16:31                   ` Jeff King
  0 siblings, 0 replies; 12+ messages in thread
From: Jeff King @ 2021-07-14 16:31 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Junio C Hamano, Git Mailing List, Taylor Blau

On Wed, Jul 14, 2021 at 10:13:38AM +0200, Martin Ågren wrote:

> > From: Martin Ågren <martin.agren@gmail.com>
> 
> At this point, I think it's fair to say that you've done most of the
> authoring here. I wouldn't be at all offended if you took the credit for
> this patch. It's your code diff, it's your test, and now it's even your
> *updated* test, plus half the commit message. :)

Yeah, that occurred to me, too. Let's swap it, then. When the fix
inevitably turns out to be wrong, I can take the blame. ;)

Here's an updated patch for Junio's convenience (same as what I posted
before, but with author/trailers tweaked). Thanks again for finding and
working on this!

-- >8 --
Subject: [PATCH] load_ref_decorations(): fix decoration with tags

Commit 88473c8bae ("load_ref_decorations(): avoid parsing non-tag
objects", 2021-06-22) introduced a shortcut to `add_ref_decoration()`:
Rather than calling `parse_object()`, we go for `oid_object_info()` and
then `lookup_object_by_type()` using the type just discovered. As
detailed in the commit message, this provides a significant time saving.

Unfortunately, it also changes the behavior: We lose all annotated tags
from the decoration.

The reason this happens is in the loop where we try to peel the tags, we
won't necessarily have parsed that first object. If we haven't, its
`tagged` field will be NULL, so we won't actually add a decoration for
the pointed-to object.

Make sure to parse the tag object at the top of the peeling loop. This
effectively restores the pre-88473c8bae parsing -- but only of tags,
allowing us to keep most of the possible speedup from 88473c8bae.

On my big ~220k ref test case (where it's mostly non-tags), the
timings [using "git log -1 --decorate"] are:

  - before either patch: 2.945s
  - with my broken patch: 0.707s
  - with [this patch]: 0.788s

The simplest way to do this is to just conditionally parse before the
loop:

  if (obj->type == OBJ_TAG)
          parse_object(&obj->oid);

But we can observe that our tag-peeling loop needs to peel already, to
examine recursive tags-of-tags. So instead of introducing a new call to
parse_object(), we can simply move the parsing higher in the loop:
instead of parsing the new object before we loop, parse each tag object
before we look at its "tagged" field.

This has another beneficial side effect: if a tag points at a commit (or
other non-tag type), we do not bother to parse the commit at all now.
And we know it is a commit without calling oid_object_info(), because
parsing the surrounding tag object will have created the correct in-core
object based on the "type" field of the tag.

Our test coverage for --decorate was obviously not good, since we missed
this quite-basic regression. The new tests covers an annotated tag
(showing the fix), but also that we correctly show annotations for
lightweight tags and double-annotated tag-of-tags.

Reported-by: Martin Ågren <martin.agren@gmail.com>
Helped-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Martin Ågren <martin.agren@gmail.com>
---
 log-tree.c     |  4 ++--
 t/t4202-log.sh | 14 ++++++++++++++
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/log-tree.c b/log-tree.c
index 4f69ed176d..6dc4412268 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -174,11 +174,11 @@ static int add_ref_decoration(const char *refname, const struct object_id *oid,
 
 	add_name_decoration(deco_type, refname, obj);
 	while (obj->type == OBJ_TAG) {
+		if (!obj->parsed)
+			parse_object(the_repository, &obj->oid);
 		obj = ((struct tag *)obj)->tagged;
 		if (!obj)
 			break;
-		if (!obj->parsed)
-			parse_object(the_repository, &obj->oid);
 		add_name_decoration(DECORATION_REF_TAG, refname, obj);
 	}
 	return 0;
diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index 39e746fbcb..9dfead936b 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -1915,6 +1915,20 @@ test_expect_success '--exclude-promisor-objects does not BUG-crash' '
 	test_must_fail git log --exclude-promisor-objects source-a
 '
 
+test_expect_success 'log --decorate includes all levels of tag annotated tags' '
+	git checkout -b branch &&
+	git commit --allow-empty -m "new commit" &&
+	git tag lightweight HEAD &&
+	git tag -m annotated annotated HEAD &&
+	git tag -m double-0 double-0 HEAD &&
+	git tag -m double-1 double-1 double-0 &&
+	cat >expect <<-\EOF &&
+	HEAD -> branch, tag: lightweight, tag: double-1, tag: double-0, tag: annotated
+	EOF
+	git log -1 --format="%D" >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'log --end-of-options' '
        git update-ref refs/heads/--source HEAD &&
        git log --end-of-options --source >actual &&
-- 
2.32.0.689.gbb74d99cdd


^ permalink raw reply related	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2021-07-14 16:31 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-12 22:41 [PATCH v1] load_ref_decorations(): fix decoration with tags Martin Ågren
2021-07-13  0:06 ` Jeff King
2021-07-13  7:40   ` [PATCH v2] " Martin Ågren
2021-07-13  7:52     ` Jeff King
2021-07-13  8:47       ` Martin Ågren
2021-07-13 21:17       ` Junio C Hamano
2021-07-13 21:27         ` Jeff King
2021-07-13 21:40           ` Junio C Hamano
2021-07-13 21:52             ` Martin Ågren
2021-07-13 22:22               ` Jeff King
2021-07-14  8:13                 ` Martin Ågren
2021-07-14 16:31                   ` [PATCH v3] " Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).