git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/2] [GSOC] cat-file: fix --batch report changed-type bug
@ 2021-06-01 14:35 ZheNing Hu via GitGitGadget
  2021-06-01 14:35 ` [PATCH 1/2] " ZheNing Hu via GitGitGadget
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-06-01 14:35 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Jeff King,
	Felipe Contreras, ZheNing Hu

Change from last version:

 1. Use Peff's processing method.
 2. Add tests for atoms %(objectname), %(rest), and modified the test
    method.
 3. Merge two check block into one.

ZheNing Hu (2):
  [GSOC] cat-file: fix --batch report changed-type bug
  [GSOC] cat-file: merge two block into one

 builtin/cat-file.c  | 10 ++++------
 t/t1006-cat-file.sh | 19 +++++++++++++++++++
 2 files changed, 23 insertions(+), 6 deletions(-)


base-commit: 5d5b1473453400224ebb126bf3947e0a3276bdf5
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-967%2Fadlternative%2Fcat-file-batch-bug-fix-v2-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-967/adlternative/cat-file-batch-bug-fix-v2-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/967
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/2] [GSOC] cat-file: fix --batch report changed-type bug
  2021-06-01 14:35 [PATCH 0/2] [GSOC] cat-file: fix --batch report changed-type bug ZheNing Hu via GitGitGadget
@ 2021-06-01 14:35 ` ZheNing Hu via GitGitGadget
  2021-06-01 15:52   ` Jeff King
  2021-06-01 14:35 ` [PATCH 2/2] [GSOC] cat-file: merge two block into one ZheNing Hu via GitGitGadget
  2021-06-03 16:29 ` [PATCH v2 0/2] [GSOC] cat-file: fix --batch report changed-type bug ZheNing Hu via GitGitGadget
  2 siblings, 1 reply; 16+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-06-01 14:35 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Jeff King,
	Felipe Contreras, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

When `--batch` used with `--batch-all-objects`,
with some format atoms like %(objectname), %(rest)
or even no atoms may cause Git exit and report
"object xxx changed type!?".

E.g. `git cat-file --batch="batman" --batch-all-objects`

This is because we did not get the object type through
oid_object_info_extended(), it's composed of two
situations:

1. Since object_info is empty, skip_object_info is
set to true, We skipped collecting the object type.

2. The formatting atom like %(objectname) does not require
oid_object_info_extended() to collect object types.

The correct way to deal with it is to swap the order
of setting skip_object_info and setting typep. This
will ensure that we must get the type of the object
when using --batch.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 builtin/cat-file.c  | 13 +++++++------
 t/t1006-cat-file.sh | 19 +++++++++++++++++++
 2 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 5ebf13359e83..02461bb5ea6f 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -512,12 +512,6 @@ static int batch_objects(struct batch_options *opt)
 	if (opt->cmdmode)
 		data.split_on_whitespace = 1;
 
-	if (opt->all_objects) {
-		struct object_info empty = OBJECT_INFO_INIT;
-		if (!memcmp(&data.info, &empty, sizeof(empty)))
-			data.skip_object_info = 1;
-	}
-
 	/*
 	 * If we are printing out the object, then always fill in the type,
 	 * since we will want to decide whether or not to stream.
@@ -525,6 +519,13 @@ static int batch_objects(struct batch_options *opt)
 	if (opt->print_contents)
 		data.info.typep = &data.type;
 
+	if (opt->all_objects) {
+		struct object_info empty = OBJECT_INFO_INIT;
+
+		if (!memcmp(&data.info, &empty, sizeof(empty)))
+			data.skip_object_info = 1;
+	}
+
 	if (opt->all_objects) {
 		struct object_cb_data cb;
 
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index 5d2dc99b74ad..1502a27142ba 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -586,4 +586,23 @@ test_expect_success 'cat-file --unordered works' '
 	test_cmp expect actual
 '
 
+test_expect_success 'cat-file --batch="%(objectname)" with --batch-all-objects will work' '
+	git -C all-two cat-file --batch-all-objects --batch-check="%(objectname)" >objects &&
+	git -C all-two cat-file --batch="%(objectname)" <objects >expect &&
+	git -C all-two cat-file --batch-all-objects --batch="%(objectname)" >actual &&
+	cmp expect actual
+'
+
+test_expect_success 'cat-file --batch="%(rest)" with --batch-all-objects will work' '
+	git -C all-two cat-file --batch="%(rest)" <objects >expect &&
+	git -C all-two cat-file --batch-all-objects --batch="%(rest)" >actual &&
+	cmp expect actual
+'
+
+test_expect_success 'cat-file --batch="batman" with --batch-all-objects will work' '
+	git -C all-two cat-file --batch="batman" <objects >expect &&
+	git -C all-two cat-file --batch-all-objects --batch="batman" >actual &&
+	cmp expect actual
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/2]  [GSOC] cat-file: merge two block into one
  2021-06-01 14:35 [PATCH 0/2] [GSOC] cat-file: fix --batch report changed-type bug ZheNing Hu via GitGitGadget
  2021-06-01 14:35 ` [PATCH 1/2] " ZheNing Hu via GitGitGadget
@ 2021-06-01 14:35 ` ZheNing Hu via GitGitGadget
  2021-06-01 15:55   ` Jeff King
  2021-06-03 16:29 ` [PATCH v2 0/2] [GSOC] cat-file: fix --batch report changed-type bug ZheNing Hu via GitGitGadget
  2 siblings, 1 reply; 16+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-06-01 14:35 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Jeff King,
	Felipe Contreras, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

 Because the two "if (opt->all_objects)" block
 are redundant, merge them into one, to provide
 better readability.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 builtin/cat-file.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 02461bb5ea6f..243fe6844bc6 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -520,14 +520,11 @@ static int batch_objects(struct batch_options *opt)
 		data.info.typep = &data.type;
 
 	if (opt->all_objects) {
+		struct object_cb_data cb;
 		struct object_info empty = OBJECT_INFO_INIT;
 
 		if (!memcmp(&data.info, &empty, sizeof(empty)))
 			data.skip_object_info = 1;
-	}
-
-	if (opt->all_objects) {
-		struct object_cb_data cb;
 
 		if (has_promisor_remote())
 			warning("This repository uses promisor remotes. Some objects may not be loaded.");
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] [GSOC] cat-file: fix --batch report changed-type bug
  2021-06-01 14:35 ` [PATCH 1/2] " ZheNing Hu via GitGitGadget
@ 2021-06-01 15:52   ` Jeff King
  2021-06-02 13:15     ` ZheNing Hu
  0 siblings, 1 reply; 16+ messages in thread
From: Jeff King @ 2021-06-01 15:52 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Junio C Hamano, Christian Couder, Hariom Verma,
	Felipe Contreras, ZheNing Hu

On Tue, Jun 01, 2021 at 02:35:56PM +0000, ZheNing Hu via GitGitGadget wrote:

> From: ZheNing Hu <adlternative@gmail.com>
> 
> When `--batch` used with `--batch-all-objects`,
> with some format atoms like %(objectname), %(rest)
> or even no atoms may cause Git exit and report
> "object xxx changed type!?".
> 
> E.g. `git cat-file --batch="batman" --batch-all-objects`
> 
> This is because we did not get the object type through
> oid_object_info_extended(), it's composed of two
> situations:
> 
> 1. Since object_info is empty, skip_object_info is
> set to true, We skipped collecting the object type.
> 
> 2. The formatting atom like %(objectname) does not require
> oid_object_info_extended() to collect object types.
> 
> The correct way to deal with it is to swap the order
> of setting skip_object_info and setting typep. This
> will ensure that we must get the type of the object
> when using --batch.

Thanks, this explanation and the patch make sense, and I'd be happy if
we take it as-is. But in the name of GSoC, let me offer a few polishing
tips.

The commit message hints at the root of the problem, but doesn't say it
explicitly. Which is: because setting skip_object_info depends on seeing
that the object_info is empty, we can't add items to it after setting
that flag. And the code path for --batch does that, hence re-ordering
them is the solution.

It might also be worth noting that the bug was present from when the
skip_object_info code was initially added in 845de33a5b (cat-file: avoid
noop calls to sha1_object_info_extended, 2016-05-18).

> diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
> index 5d2dc99b74ad..1502a27142ba 100755
> --- a/t/t1006-cat-file.sh
> +++ b/t/t1006-cat-file.sh
> @@ -586,4 +586,23 @@ test_expect_success 'cat-file --unordered works' '
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'cat-file --batch="%(objectname)" with --batch-all-objects will work' '
> +	git -C all-two cat-file --batch-all-objects --batch-check="%(objectname)" >objects &&
> +	git -C all-two cat-file --batch="%(objectname)" <objects >expect &&
> +	git -C all-two cat-file --batch-all-objects --batch="%(objectname)" >actual &&
> +	cmp expect actual
> +'

OK, we're checking the output of --batch-all-objects versus taking the
object list from the input. We can depend on the ordering being
identical between the two because --batch-all-objects sorts. Good.

> +test_expect_success 'cat-file --batch="%(rest)" with --batch-all-objects will work' '
> +	git -C all-two cat-file --batch="%(rest)" <objects >expect &&
> +	git -C all-two cat-file --batch-all-objects --batch="%(rest)" >actual &&
> +	cmp expect actual
> +'

This one is rather curious. It definitely shows the bug, but I can't
imagine why %(rest) would be useful with --batch-all-objects, since its
purpose is to show the rest of the input line (and there are no input
lines!).

That might be a problem later if we change the behavior (e.g., to say
"%(rest) does not make sense with --batch-all-objects"). But I'm also OK
leaving it for now; somebody later can dig up this commit via git-blame.

> +test_expect_success 'cat-file --batch="batman" with --batch-all-objects will work' '
> +	git -C all-two cat-file --batch="batman" <objects >expect &&
> +	git -C all-two cat-file --batch-all-objects --batch="batman" >actual &&
> +	cmp expect actual
> +'

And this one is a more extreme version of the "%(objectname)" one. It's
useful as a regression test because we might later change the
optimization so that %(objectname) ends up filling in the other object
info.

There's a subtle dependency here on the "objects" file from the earlier
test. In such a case, we'll often split that out as a separate setup
step like:

  test_expect_success 'set up object list for --batch-all-objects tests' '
	git -C all-two cat-file --batch-all-objects --batch-check="%(objectname)" >objects
  '

That makes it more clear that all three of the other tests are doing the
same thing (just with different formats), and can be reordered, removed,
etc, later if we wanted to. So not a correctness thing, but rather just
communicating the structure of the tests to later readers.

-Peff

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2]  [GSOC] cat-file: merge two block into one
  2021-06-01 14:35 ` [PATCH 2/2] [GSOC] cat-file: merge two block into one ZheNing Hu via GitGitGadget
@ 2021-06-01 15:55   ` Jeff King
  2021-06-02 13:27     ` ZheNing Hu
  0 siblings, 1 reply; 16+ messages in thread
From: Jeff King @ 2021-06-01 15:55 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Junio C Hamano, Christian Couder, Hariom Verma,
	Felipe Contreras, ZheNing Hu

On Tue, Jun 01, 2021 at 02:35:57PM +0000, ZheNing Hu via GitGitGadget wrote:

> From: ZheNing Hu <adlternative@gmail.com>
> 
>  Because the two "if (opt->all_objects)" block
>  are redundant, merge them into one, to provide
>  better readability.

Funny indentation of the commit message. :)

I think this is worth doing, and I agree the end-result is easier to
read. Really minor nit, but I probably wouldn't have said "redundant"
here. The conditionals themselves are redundant, but not the blocks.
Maybe: There are two "if (opt->all_objects)" blocks next to each other;
merge them into one to provide better readability.

(not a huge deal, as I think seeing the patch helps explain what is
going on. But again, just trying to offer polishing advice for future
patches)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] [GSOC] cat-file: fix --batch report changed-type bug
  2021-06-01 15:52   ` Jeff King
@ 2021-06-02 13:15     ` ZheNing Hu
  2021-06-02 20:00       ` Jeff King
  0 siblings, 1 reply; 16+ messages in thread
From: ZheNing Hu @ 2021-06-02 13:15 UTC (permalink / raw)
  To: Jeff King
  Cc: ZheNing Hu via GitGitGadget, Git List, Junio C Hamano,
	Christian Couder, Hariom Verma, Felipe Contreras

Jeff King <peff@peff.net> 于2021年6月1日周二 下午11:52写道:
>
> On Tue, Jun 01, 2021 at 02:35:56PM +0000, ZheNing Hu via GitGitGadget wrote:
>
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > When `--batch` used with `--batch-all-objects`,
> > with some format atoms like %(objectname), %(rest)
> > or even no atoms may cause Git exit and report
> > "object xxx changed type!?".
> >
> > E.g. `git cat-file --batch="batman" --batch-all-objects`
> >
> > This is because we did not get the object type through
> > oid_object_info_extended(), it's composed of two
> > situations:
> >
> > 1. Since object_info is empty, skip_object_info is
> > set to true, We skipped collecting the object type.
> >
> > 2. The formatting atom like %(objectname) does not require
> > oid_object_info_extended() to collect object types.
> >
> > The correct way to deal with it is to swap the order
> > of setting skip_object_info and setting typep. This
> > will ensure that we must get the type of the object
> > when using --batch.
>
> Thanks, this explanation and the patch make sense, and I'd be happy if
> we take it as-is. But in the name of GSoC, let me offer a few polishing
> tips.
>

Thanks, as a GSOC student, your reminders will be very helpful to me.

> The commit message hints at the root of the problem, but doesn't say it
> explicitly. Which is: because setting skip_object_info depends on seeing
> that the object_info is empty, we can't add items to it after setting
> that flag. And the code path for --batch does that, hence re-ordering
> them is the solution.
>

Um, let's rewrite the commit message, I don't know if this is accurate:

[GSOC] cat-file: fix --batch report changed-type bug

When `--batch` used with `--batch-all-objects`,
with some format atoms like %(objectname), %(rest)
or even no atoms may cause Git exit and report
"object xxx changed type!?".

E.g. `git cat-file --batch="batman" --batch-all-objects`

The bug was present from when the skip_object_info code
was initially added in 845de33a5b (cat-file: avoid
noop calls to sha1_object_info_extended, 2016-05-18).

This is because we did not get the object type through
oid_object_info_extended(), it's composed of two
situations:

1. all_objects will be set to true when we use
`--batch-all-objects`, seeing that object_info
is empty, skip_object_info will be to true,
`oid_object_info_extended()` will not get the
type of the object.

2. The formatting atom like %(objectname) does
not require oid_object_info_extended() to collect
object types.

print_contents will be set to true when we use
`--batch`, which can make object_info non-empty,
so the solution is to swap the code order of it
and checking if object_info is empty, which will
ensure that we must get the type of the object
when using --batch.

> It might also be worth noting that the bug was present from when the
> skip_object_info code was initially added in 845de33a5b (cat-file: avoid
> noop calls to sha1_object_info_extended, 2016-05-18).
>

OK.

> > +test_expect_success 'cat-file --batch="%(objectname)" with --batch-all-objects will work' '
> > +     git -C all-two cat-file --batch-all-objects --batch-check="%(objectname)" >objects &&
> > +     git -C all-two cat-file --batch="%(objectname)" <objects >expect &&
> > +     git -C all-two cat-file --batch-all-objects --batch="%(objectname)" >actual &&
> > +     cmp expect actual
> > +'
>
> OK, we're checking the output of --batch-all-objects versus taking the
> object list from the input. We can depend on the ordering being
> identical between the two because --batch-all-objects sorts. Good.
>
> > +test_expect_success 'cat-file --batch="%(rest)" with --batch-all-objects will work' '
> > +     git -C all-two cat-file --batch="%(rest)" <objects >expect &&
> > +     git -C all-two cat-file --batch-all-objects --batch="%(rest)" >actual &&
> > +     cmp expect actual
> > +'
>
> This one is rather curious. It definitely shows the bug, but I can't
> imagine why %(rest) would be useful with --batch-all-objects, since its
> purpose is to show the rest of the input line (and there are no input
> lines!).
>

I wanted to argue that print_object_or_die() will use data->rest originally.
But --batch-all-objects and --textconv and --filter are incompatible.
So your idea is reasonable: %(rest) with --batch-all-objects is useless.

> That might be a problem later if we change the behavior (e.g., to say
> "%(rest) does not make sense with --batch-all-objects"). But I'm also OK
> leaving it for now; somebody later can dig up this commit via git-blame.
>

Yes, I think this feature can be completed later.(not in this patch)

> > +test_expect_success 'cat-file --batch="batman" with --batch-all-objects will work' '
> > +     git -C all-two cat-file --batch="batman" <objects >expect &&
> > +     git -C all-two cat-file --batch-all-objects --batch="batman" >actual &&
> > +     cmp expect actual
> > +'
>
> And this one is a more extreme version of the "%(objectname)" one. It's
> useful as a regression test because we might later change the
> optimization so that %(objectname) ends up filling in the other object
> info.
>

Yes, it does not use atoms, so it is the most special.

> There's a subtle dependency here on the "objects" file from the earlier
> test. In such a case, we'll often split that out as a separate setup
> step like:
>
>   test_expect_success 'set up object list for --batch-all-objects tests' '
>         git -C all-two cat-file --batch-all-objects --batch-check="%(objectname)" >objects
>   '
>
> That makes it more clear that all three of the other tests are doing the
> same thing (just with different formats), and can be reordered, removed,
> etc, later if we wanted to. So not a correctness thing, but rather just
> communicating the structure of the tests to later readers.
>

Makes sense.

> -Peff

Thanks.
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] [GSOC] cat-file: merge two block into one
  2021-06-01 15:55   ` Jeff King
@ 2021-06-02 13:27     ` ZheNing Hu
  0 siblings, 0 replies; 16+ messages in thread
From: ZheNing Hu @ 2021-06-02 13:27 UTC (permalink / raw)
  To: Jeff King
  Cc: ZheNing Hu via GitGitGadget, Git List, Junio C Hamano,
	Christian Couder, Hariom Verma, Felipe Contreras

Jeff King <peff@peff.net> 于2021年6月1日周二 下午11:55写道:
>
> On Tue, Jun 01, 2021 at 02:35:57PM +0000, ZheNing Hu via GitGitGadget wrote:
>
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> >  Because the two "if (opt->all_objects)" block
> >  are redundant, merge them into one, to provide
> >  better readability.
>
> Funny indentation of the commit message. :)
>

Yes, there is a small space that I didn’t notice. :)

> I think this is worth doing, and I agree the end-result is easier to
> read. Really minor nit, but I probably wouldn't have said "redundant"
> here. The conditionals themselves are redundant, but not the blocks.
> Maybe: There are two "if (opt->all_objects)" blocks next to each other;
> merge them into one to provide better readability.
>

Yeah, your expression is correct.

> (not a huge deal, as I think seeing the patch helps explain what is
> going on. But again, just trying to offer polishing advice for future
> patches)

Thanks.
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] [GSOC] cat-file: fix --batch report changed-type bug
  2021-06-02 13:15     ` ZheNing Hu
@ 2021-06-02 20:00       ` Jeff King
  2021-06-03  6:28         ` ZheNing Hu
  2021-06-03  7:18         ` ZheNing Hu
  0 siblings, 2 replies; 16+ messages in thread
From: Jeff King @ 2021-06-02 20:00 UTC (permalink / raw)
  To: ZheNing Hu
  Cc: ZheNing Hu via GitGitGadget, Git List, Junio C Hamano,
	Christian Couder, Hariom Verma, Felipe Contreras

On Wed, Jun 02, 2021 at 09:15:45PM +0800, ZheNing Hu wrote:

> > The commit message hints at the root of the problem, but doesn't say it
> > explicitly. Which is: because setting skip_object_info depends on seeing
> > that the object_info is empty, we can't add items to it after setting
> > that flag. And the code path for --batch does that, hence re-ordering
> > them is the solution.
> 
> Um, let's rewrite the commit message, I don't know if this is accurate:
> 
> [GSOC] cat-file: fix --batch report changed-type bug
> 
> When `--batch` used with `--batch-all-objects`,
> with some format atoms like %(objectname), %(rest)
> or even no atoms may cause Git exit and report
> "object xxx changed type!?".
> 
> E.g. `git cat-file --batch="batman" --batch-all-objects`
> 
> The bug was present from when the skip_object_info code
> was initially added in 845de33a5b (cat-file: avoid
> noop calls to sha1_object_info_extended, 2016-05-18).
> 
> This is because we did not get the object type through
> oid_object_info_extended(), it's composed of two
> situations:
> 
> 1. all_objects will be set to true when we use
> `--batch-all-objects`, seeing that object_info
> is empty, skip_object_info will be to true,
> `oid_object_info_extended()` will not get the
> type of the object.
> 
> 2. The formatting atom like %(objectname) does
> not require oid_object_info_extended() to collect
> object types.
> 
> print_contents will be set to true when we use
> `--batch`, which can make object_info non-empty,
> so the solution is to swap the code order of it
> and checking if object_info is empty, which will
> ensure that we must get the type of the object
> when using --batch.

I don't see any inaccuracies there. I do think we could explain it a bit
more succinctly. I'll give my attempt, and then you can pick and choose
which parts to include between ours. :)

  Subject: cat-file: handle trivial --batch format with --batch-all-objects

  The --batch code to print an object assumes we found out the type of
  the object from calling oid_object_info_extended(). This is true for
  the default format, but even in a custom format, we manually modify
  the object_info struct to ask for the type.

  This assumption was broken by 845de33a5b (cat-file: avoid noop calls
  to sha1_object_info_extended, 2016-05-18). That commit skips the call
  to oid_object_info_extended() entirely when --batch-all-objects is in
  use, and the custom format does not include any placeholders that
  require calling it.

  This results in an error when we try to confirm that the type didn't
  change:

    $ git cat-file --batch=batman --batch-all-objects
    batman
    fatal: object 000023961a0c02d6e21dc51ea3484ff71abf1c74 changed type!?

  and also has other subtle effects (e.g., we'd fail to stream a blob,
  since we don't realize it's a blog in the first place).

  We can fix this by flipping the order of the setup. The check for "do
  we need to get the object info" must come _after_ we've decided
  whether we need to look up the type.

-Peff

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] [GSOC] cat-file: fix --batch report changed-type bug
  2021-06-02 20:00       ` Jeff King
@ 2021-06-03  6:28         ` ZheNing Hu
  2021-06-03  7:18         ` ZheNing Hu
  1 sibling, 0 replies; 16+ messages in thread
From: ZheNing Hu @ 2021-06-03  6:28 UTC (permalink / raw)
  To: Jeff King
  Cc: ZheNing Hu via GitGitGadget, Git List, Junio C Hamano,
	Christian Couder, Hariom Verma, Felipe Contreras

Jeff King <peff@peff.net> 于2021年6月3日周四 上午4:01写道:
>
> I don't see any inaccuracies there. I do think we could explain it a bit
> more succinctly. I'll give my attempt, and then you can pick and choose
> which parts to include between ours. :)
>

Yes, you wrote it more concisely. I don't need to write out the details of
those variables. I just need to tell the reader the introduction of
skip_object_info will break the assumption of --batch that we have found
the object type.

>   Subject: cat-file: handle trivial --batch format with --batch-all-objects
>
>   The --batch code to print an object assumes we found out the type of
>   the object from calling oid_object_info_extended(). This is true for
>   the default format, but even in a custom format, we manually modify
>   the object_info struct to ask for the type.
>
>   This assumption was broken by 845de33a5b (cat-file: avoid noop calls
>   to sha1_object_info_extended, 2016-05-18). That commit skips the call
>   to oid_object_info_extended() entirely when --batch-all-objects is in
>   use, and the custom format does not include any placeholders that
>   require calling it.
>
>   This results in an error when we try to confirm that the type didn't
>   change:
>
>     $ git cat-file --batch=batman --batch-all-objects
>     batman
>     fatal: object 000023961a0c02d6e21dc51ea3484ff71abf1c74 changed type!?
>
>   and also has other subtle effects (e.g., we'd fail to stream a blob,
>   since we don't realize it's a blog in the first place).
>

s/blog/blob/

>   We can fix this by flipping the order of the setup. The check for "do
>   we need to get the object info" must come _after_ we've decided
>   whether we need to look up the type.
>
> -Peff

Thanks.
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] [GSOC] cat-file: fix --batch report changed-type bug
  2021-06-02 20:00       ` Jeff King
  2021-06-03  6:28         ` ZheNing Hu
@ 2021-06-03  7:18         ` ZheNing Hu
  2021-06-03 19:28           ` Jeff King
  1 sibling, 1 reply; 16+ messages in thread
From: ZheNing Hu @ 2021-06-03  7:18 UTC (permalink / raw)
  To: Jeff King
  Cc: ZheNing Hu via GitGitGadget, Git List, Junio C Hamano,
	Christian Couder, Hariom Verma, Felipe Contreras

Jeff King <peff@peff.net> 于2021年6月3日周四 上午4:01写道:
>
>   This assumption was broken by 845de33a5b (cat-file: avoid noop calls
>   to sha1_object_info_extended, 2016-05-18). That commit skips the call
>   to oid_object_info_extended() entirely when --batch-all-objects is in
>   use, and the custom format does not include any placeholders that
>   require calling it.
>

Or when the custom format only include placeholders like %(objectname) or
%(rest), oid_object_info_extended() will not get the type of the object.

>   This results in an error when we try to confirm that the type didn't
>   change:
>
>     $ git cat-file --batch=batman --batch-all-objects
>     batman
>     fatal: object 000023961a0c02d6e21dc51ea3484ff71abf1c74 changed type!?
>
>   and also has other subtle effects (e.g., we'd fail to stream a blob,
>   since we don't realize it's a blog in the first place).
>
>   We can fix this by flipping the order of the setup. The check for "do
>   we need to get the object info" must come _after_ we've decided
>   whether we need to look up the type.
>
> -Peff

--
ZheNing Hu

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v2 0/2] [GSOC] cat-file: fix --batch report changed-type bug
  2021-06-01 14:35 [PATCH 0/2] [GSOC] cat-file: fix --batch report changed-type bug ZheNing Hu via GitGitGadget
  2021-06-01 14:35 ` [PATCH 1/2] " ZheNing Hu via GitGitGadget
  2021-06-01 14:35 ` [PATCH 2/2] [GSOC] cat-file: merge two block into one ZheNing Hu via GitGitGadget
@ 2021-06-03 16:29 ` ZheNing Hu via GitGitGadget
  2021-06-03 16:29   ` [PATCH v2 1/2] [GSOC] cat-file: handle trivial --batch format with --batch-all-objects ZheNing Hu via GitGitGadget
                     ` (3 more replies)
  2 siblings, 4 replies; 16+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-06-03 16:29 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Jeff King,
	Felipe Contreras, ZheNing Hu

Change from last version:

 1. Modified the test structure under the recommendation of Peff.
 2. Use clearer and more concise commit message help by Peff.

ZheNing Hu (2):
  [GSOC] cat-file: handle trivial --batch format with
    --batch-all-objects
  [GSOC] cat-file: merge two block into one

 builtin/cat-file.c  | 10 ++++------
 t/t1006-cat-file.sh | 22 ++++++++++++++++++++++
 2 files changed, 26 insertions(+), 6 deletions(-)


base-commit: 5d5b1473453400224ebb126bf3947e0a3276bdf5
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-967%2Fadlternative%2Fcat-file-batch-bug-fix-v2-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-967/adlternative/cat-file-batch-bug-fix-v2-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/967

Range-diff vs v1:

 1:  495cd90dbaf4 ! 1:  4af3b958dd05 [GSOC] cat-file: fix --batch report changed-type bug
     @@ Metadata
      Author: ZheNing Hu <adlternative@gmail.com>
      
       ## Commit message ##
     -    [GSOC] cat-file: fix --batch report changed-type bug
     +    [GSOC] cat-file: handle trivial --batch format with --batch-all-objects
      
     -    When `--batch` used with `--batch-all-objects`,
     -    with some format atoms like %(objectname), %(rest)
     -    or even no atoms may cause Git exit and report
     -    "object xxx changed type!?".
     +    The --batch code to print an object assumes we found out the type of
     +    the object from calling oid_object_info_extended(). This is true for
     +    the default format, but even in a custom format, we manually modify
     +    the object_info struct to ask for the type.
      
     -    E.g. `git cat-file --batch="batman" --batch-all-objects`
     +    This assumption was broken by 845de33a5b (cat-file: avoid noop calls
     +    to sha1_object_info_extended, 2016-05-18). That commit skips the call
     +    to oid_object_info_extended() entirely when --batch-all-objects is in
     +    use, and the custom format does not include any placeholders that
     +    require calling it.
      
     -    This is because we did not get the object type through
     -    oid_object_info_extended(), it's composed of two
     -    situations:
     +    Or when the custom format only include placeholders like %(objectname) or
     +    %(rest), oid_object_info_extended() will not get the type of the object.
      
     -    1. Since object_info is empty, skip_object_info is
     -    set to true, We skipped collecting the object type.
     +    This results in an error when we try to confirm that the type didn't
     +    change:
      
     -    2. The formatting atom like %(objectname) does not require
     -    oid_object_info_extended() to collect object types.
     +    $ git cat-file --batch=batman --batch-all-objects
     +    batman
     +    fatal: object 000023961a0c02d6e21dc51ea3484ff71abf1c74 changed type!?
      
     -    The correct way to deal with it is to swap the order
     -    of setting skip_object_info and setting typep. This
     -    will ensure that we must get the type of the object
     -    when using --batch.
     +    and also has other subtle effects (e.g., we'd fail to stream a blob,
     +    since we don't realize it's a blob in the first place).
     +
     +    We can fix this by flipping the order of the setup. The check for "do
     +    we need to get the object info" must come _after_ we've decided
     +    whether we need to look up the type.
      
          Helped-by: Jeff King <peff@peff.net>
          Signed-off-by: ZheNing Hu <adlternative@gmail.com>
     @@ t/t1006-cat-file.sh: test_expect_success 'cat-file --unordered works' '
       	test_cmp expect actual
       '
       
     ++test_expect_success 'set up object list for --batch-all-objects tests' '
     ++	git -C all-two cat-file --batch-all-objects --batch-check="%(objectname)" >objects
     ++'
     ++
      +test_expect_success 'cat-file --batch="%(objectname)" with --batch-all-objects will work' '
     -+	git -C all-two cat-file --batch-all-objects --batch-check="%(objectname)" >objects &&
      +	git -C all-two cat-file --batch="%(objectname)" <objects >expect &&
      +	git -C all-two cat-file --batch-all-objects --batch="%(objectname)" >actual &&
      +	cmp expect actual
 2:  f02c1144d916 ! 2:  759451e784de  [GSOC] cat-file: merge two block into one
     @@ Metadata
      Author: ZheNing Hu <adlternative@gmail.com>
      
       ## Commit message ##
     -     [GSOC] cat-file: merge two block into one
     +    [GSOC] cat-file: merge two block into one
      
     -     Because the two "if (opt->all_objects)" block
     -     are redundant, merge them into one, to provide
     -     better readability.
     +    There are two "if (opt->all_objects)" blocks next
     +    to each other, merge them into one to provide better
     +    readability.
      
          Helped-by: Jeff King <peff@peff.net>
          Signed-off-by: ZheNing Hu <adlternative@gmail.com>

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v2 1/2] [GSOC] cat-file: handle trivial --batch format with --batch-all-objects
  2021-06-03 16:29 ` [PATCH v2 0/2] [GSOC] cat-file: fix --batch report changed-type bug ZheNing Hu via GitGitGadget
@ 2021-06-03 16:29   ` ZheNing Hu via GitGitGadget
  2021-06-03 16:29   ` [PATCH v2 2/2] [GSOC] cat-file: merge two block into one ZheNing Hu via GitGitGadget
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 16+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-06-03 16:29 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Jeff King,
	Felipe Contreras, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

The --batch code to print an object assumes we found out the type of
the object from calling oid_object_info_extended(). This is true for
the default format, but even in a custom format, we manually modify
the object_info struct to ask for the type.

This assumption was broken by 845de33a5b (cat-file: avoid noop calls
to sha1_object_info_extended, 2016-05-18). That commit skips the call
to oid_object_info_extended() entirely when --batch-all-objects is in
use, and the custom format does not include any placeholders that
require calling it.

Or when the custom format only include placeholders like %(objectname) or
%(rest), oid_object_info_extended() will not get the type of the object.

This results in an error when we try to confirm that the type didn't
change:

$ git cat-file --batch=batman --batch-all-objects
batman
fatal: object 000023961a0c02d6e21dc51ea3484ff71abf1c74 changed type!?

and also has other subtle effects (e.g., we'd fail to stream a blob,
since we don't realize it's a blob in the first place).

We can fix this by flipping the order of the setup. The check for "do
we need to get the object info" must come _after_ we've decided
whether we need to look up the type.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 builtin/cat-file.c  | 13 +++++++------
 t/t1006-cat-file.sh | 22 ++++++++++++++++++++++
 2 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 5ebf13359e83..02461bb5ea6f 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -512,12 +512,6 @@ static int batch_objects(struct batch_options *opt)
 	if (opt->cmdmode)
 		data.split_on_whitespace = 1;
 
-	if (opt->all_objects) {
-		struct object_info empty = OBJECT_INFO_INIT;
-		if (!memcmp(&data.info, &empty, sizeof(empty)))
-			data.skip_object_info = 1;
-	}
-
 	/*
 	 * If we are printing out the object, then always fill in the type,
 	 * since we will want to decide whether or not to stream.
@@ -525,6 +519,13 @@ static int batch_objects(struct batch_options *opt)
 	if (opt->print_contents)
 		data.info.typep = &data.type;
 
+	if (opt->all_objects) {
+		struct object_info empty = OBJECT_INFO_INIT;
+
+		if (!memcmp(&data.info, &empty, sizeof(empty)))
+			data.skip_object_info = 1;
+	}
+
 	if (opt->all_objects) {
 		struct object_cb_data cb;
 
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index 5d2dc99b74ad..18b3779ccb60 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -586,4 +586,26 @@ test_expect_success 'cat-file --unordered works' '
 	test_cmp expect actual
 '
 
+test_expect_success 'set up object list for --batch-all-objects tests' '
+	git -C all-two cat-file --batch-all-objects --batch-check="%(objectname)" >objects
+'
+
+test_expect_success 'cat-file --batch="%(objectname)" with --batch-all-objects will work' '
+	git -C all-two cat-file --batch="%(objectname)" <objects >expect &&
+	git -C all-two cat-file --batch-all-objects --batch="%(objectname)" >actual &&
+	cmp expect actual
+'
+
+test_expect_success 'cat-file --batch="%(rest)" with --batch-all-objects will work' '
+	git -C all-two cat-file --batch="%(rest)" <objects >expect &&
+	git -C all-two cat-file --batch-all-objects --batch="%(rest)" >actual &&
+	cmp expect actual
+'
+
+test_expect_success 'cat-file --batch="batman" with --batch-all-objects will work' '
+	git -C all-two cat-file --batch="batman" <objects >expect &&
+	git -C all-two cat-file --batch-all-objects --batch="batman" >actual &&
+	cmp expect actual
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 2/2] [GSOC] cat-file: merge two block into one
  2021-06-03 16:29 ` [PATCH v2 0/2] [GSOC] cat-file: fix --batch report changed-type bug ZheNing Hu via GitGitGadget
  2021-06-03 16:29   ` [PATCH v2 1/2] [GSOC] cat-file: handle trivial --batch format with --batch-all-objects ZheNing Hu via GitGitGadget
@ 2021-06-03 16:29   ` ZheNing Hu via GitGitGadget
  2021-06-03 19:29   ` [PATCH v2 0/2] [GSOC] cat-file: fix --batch report changed-type bug Jeff King
  2021-06-03 22:51   ` Junio C Hamano
  3 siblings, 0 replies; 16+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-06-03 16:29 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Jeff King,
	Felipe Contreras, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

There are two "if (opt->all_objects)" blocks next
to each other, merge them into one to provide better
readability.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 builtin/cat-file.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 02461bb5ea6f..243fe6844bc6 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -520,14 +520,11 @@ static int batch_objects(struct batch_options *opt)
 		data.info.typep = &data.type;
 
 	if (opt->all_objects) {
+		struct object_cb_data cb;
 		struct object_info empty = OBJECT_INFO_INIT;
 
 		if (!memcmp(&data.info, &empty, sizeof(empty)))
 			data.skip_object_info = 1;
-	}
-
-	if (opt->all_objects) {
-		struct object_cb_data cb;
 
 		if (has_promisor_remote())
 			warning("This repository uses promisor remotes. Some objects may not be loaded.");
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] [GSOC] cat-file: fix --batch report changed-type bug
  2021-06-03  7:18         ` ZheNing Hu
@ 2021-06-03 19:28           ` Jeff King
  0 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2021-06-03 19:28 UTC (permalink / raw)
  To: ZheNing Hu
  Cc: ZheNing Hu via GitGitGadget, Git List, Junio C Hamano,
	Christian Couder, Hariom Verma, Felipe Contreras

On Thu, Jun 03, 2021 at 03:18:43PM +0800, ZheNing Hu wrote:

> Jeff King <peff@peff.net> 于2021年6月3日周四 上午4:01写道:
> >
> >   This assumption was broken by 845de33a5b (cat-file: avoid noop calls
> >   to sha1_object_info_extended, 2016-05-18). That commit skips the call
> >   to oid_object_info_extended() entirely when --batch-all-objects is in
> >   use, and the custom format does not include any placeholders that
> >   require calling it.
> >
> 
> Or when the custom format only include placeholders like %(objectname) or
> %(rest), oid_object_info_extended() will not get the type of the object.

Yeah, that's what I was trying to get at with "placeholders that require
calling it", but I couldn't think of a less awkward way to say that. :)
Spelling it out is probably a good idea.

-Peff

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 0/2] [GSOC] cat-file: fix --batch report changed-type bug
  2021-06-03 16:29 ` [PATCH v2 0/2] [GSOC] cat-file: fix --batch report changed-type bug ZheNing Hu via GitGitGadget
  2021-06-03 16:29   ` [PATCH v2 1/2] [GSOC] cat-file: handle trivial --batch format with --batch-all-objects ZheNing Hu via GitGitGadget
  2021-06-03 16:29   ` [PATCH v2 2/2] [GSOC] cat-file: merge two block into one ZheNing Hu via GitGitGadget
@ 2021-06-03 19:29   ` Jeff King
  2021-06-03 22:51   ` Junio C Hamano
  3 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2021-06-03 19:29 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Junio C Hamano, Christian Couder, Hariom Verma,
	Felipe Contreras, ZheNing Hu

On Thu, Jun 03, 2021 at 04:29:24PM +0000, ZheNing Hu via GitGitGadget wrote:

> Change from last version:
> 
>  1. Modified the test structure under the recommendation of Peff.
>  2. Use clearer and more concise commit message help by Peff.

Thanks, this version looks good to me.

-Peff

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 0/2] [GSOC] cat-file: fix --batch report changed-type bug
  2021-06-03 16:29 ` [PATCH v2 0/2] [GSOC] cat-file: fix --batch report changed-type bug ZheNing Hu via GitGitGadget
                     ` (2 preceding siblings ...)
  2021-06-03 19:29   ` [PATCH v2 0/2] [GSOC] cat-file: fix --batch report changed-type bug Jeff King
@ 2021-06-03 22:51   ` Junio C Hamano
  3 siblings, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2021-06-03 22:51 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Christian Couder, Hariom Verma, Jeff King, Felipe Contreras,
	ZheNing Hu

"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

> Change from last version:
>
>  1. Modified the test structure under the recommendation of Peff.
>  2. Use clearer and more concise commit message help by Peff.
>
> ZheNing Hu (2):
>   [GSOC] cat-file: handle trivial --batch format with
>     --batch-all-objects
>   [GSOC] cat-file: merge two block into one
>
>  builtin/cat-file.c  | 10 ++++------
>  t/t1006-cat-file.sh | 22 ++++++++++++++++++++++
>  2 files changed, 26 insertions(+), 6 deletions(-)
>

Nicely done.  Thanks, both.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2021-06-03 22:51 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-01 14:35 [PATCH 0/2] [GSOC] cat-file: fix --batch report changed-type bug ZheNing Hu via GitGitGadget
2021-06-01 14:35 ` [PATCH 1/2] " ZheNing Hu via GitGitGadget
2021-06-01 15:52   ` Jeff King
2021-06-02 13:15     ` ZheNing Hu
2021-06-02 20:00       ` Jeff King
2021-06-03  6:28         ` ZheNing Hu
2021-06-03  7:18         ` ZheNing Hu
2021-06-03 19:28           ` Jeff King
2021-06-01 14:35 ` [PATCH 2/2] [GSOC] cat-file: merge two block into one ZheNing Hu via GitGitGadget
2021-06-01 15:55   ` Jeff King
2021-06-02 13:27     ` ZheNing Hu
2021-06-03 16:29 ` [PATCH v2 0/2] [GSOC] cat-file: fix --batch report changed-type bug ZheNing Hu via GitGitGadget
2021-06-03 16:29   ` [PATCH v2 1/2] [GSOC] cat-file: handle trivial --batch format with --batch-all-objects ZheNing Hu via GitGitGadget
2021-06-03 16:29   ` [PATCH v2 2/2] [GSOC] cat-file: merge two block into one ZheNing Hu via GitGitGadget
2021-06-03 19:29   ` [PATCH v2 0/2] [GSOC] cat-file: fix --batch report changed-type bug Jeff King
2021-06-03 22:51   ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).