git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] pack-objects: disable pack reuse for object-selection options
@ 2017-05-02  8:43 Jeff King
  2017-05-08  4:56 ` Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Jeff King @ 2017-05-02  8:43 UTC (permalink / raw)
  To: git

If certain options like --honor-pack-keep, --local, or
--incremental are used with pack-objects, then we need to
feed each potential object to want_object_in_pack() to see
if it should be filtered out.  This is totally contrary to
the purpose of the pack-reuse optimization, which tries hard
to avoid doing any per-object work.  Therefore we need to
disable this optimization when these options are in use.

This bug has been present since the inception of the
pack-reuse code, but was unlikely to come up in practice.
These options are generally used for on-disk packing, not
transfer packs (which go to stdout), but we've never allowed
pack reuse for non-stdout packs (until 645c432d6, we did not
even use bitmaps, which the reuse optimization relies on;
after that, we explicitly turned it off when not packing to
stdout).

There are tests in t5310 that check these options with
bitmaps and --stdout, but they didn't catch the bug, and
it's hard to adapt them to do so.

One problem is that they don't use --delta-base-offset;
without that option, we always disable the reuse
optimization entirely. It would be fine to add it in (it
actually makes the test more realistic), but that still
isn't quite enough.

Another problem is that the reuse code is very picky; it
only kicks in when it can reuse most of a pack, starting
from the first byte. So we'd have to start from a fully
repacked and bitmapped state to trigger it. But the tests
for these options use a much more subtle state; they want to
be sure that the want_object_in_pack() code is allowing some
objects but not others. Doing a full repack runs counter to
that.

So this patch adds new tests at the end of the script which
create the fully-packed state and make sure that each option
is not fooled by reusable pack.

Signed-off-by: Jeff King <peff@peff.net>
---
I happened to notice this because I have a series which makes the reuse
code much less picky (it kicks in more often, and can even convert to
non-ofs-delta clients on the fly). And it fails the tests when merged
with 702d1b958 (pack-objects: respect --local/--honor-pack-keep/--incremental
when bitmap is in use, 2016-09-10).

But the bug is much older than that.  So this isn't at all urgent for
v2.13.

 builtin/pack-objects.c  |  6 +++++-
 t/t5310-pack-bitmaps.sh | 38 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 0fe35d1b5..50e01aa80 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2717,7 +2717,11 @@ static void loosen_unused_packed_objects(struct rev_info *revs)
  */
 static int pack_options_allow_reuse(void)
 {
-	return pack_to_stdout && allow_ofs_delta;
+	return pack_to_stdout &&
+	       allow_ofs_delta &&
+	       !ignore_packed_keep &&
+	       (!local || !have_non_local_packs) &&
+	       !incremental;
 }
 
 static int get_object_list_from_bitmap(struct rev_info *revs)
diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
index 424bec7d7..c3ddfa217 100755
--- a/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@@ -289,4 +289,42 @@ test_expect_success 'splitting packs does not generate bogus bitmaps' '
 	git -C no-bitmaps.git fetch .. HEAD
 '
 
+test_expect_success 'set up reusable pack' '
+	rm -f .git/objects/pack/*.keep &&
+	git repack -adb &&
+	reusable_pack () {
+		git for-each-ref --format="%(objectname)" |
+		git pack-objects --delta-base-offset --revs --stdout "$@"
+	}
+'
+
+test_expect_success 'pack reuse respects --honor-pack-keep' '
+	test_when_finished "rm -f .git/objects/pack/*.keep" &&
+	for i in .git/objects/pack/*.pack; do
+		>${i%.pack}.keep
+	done &&
+	reusable_pack --honor-pack-keep >empty.pack &&
+	git index-pack empty.pack &&
+	>expect &&
+	git show-index <empty.idx >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'pack reuse respects --local' '
+	mv .git/objects/pack/* alt.git/objects/pack/ &&
+	test_when_finished "mv alt.git/objects/pack/* .git/objects/pack/" &&
+	reusable_pack --local >empty.pack &&
+	git index-pack empty.pack &&
+	>expect &&
+	git show-index <empty.idx >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'pack reuse respects --incremental' '
+	reusable_pack --incremental >empty.pack &&
+	git index-pack empty.pack &&
+	>expect &&
+	git show-index <empty.idx >actual &&
+	test_cmp expect actual
+'
 test_done
-- 
2.13.0.rc1.437.g927e4246e

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] pack-objects: disable pack reuse for object-selection options
  2017-05-02  8:43 [PATCH] pack-objects: disable pack reuse for object-selection options Jeff King
@ 2017-05-08  4:56 ` Junio C Hamano
  2017-05-08  7:31   ` Jeff King
  0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2017-05-08  4:56 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> If certain options like --honor-pack-keep, --local, or
> --incremental are used with pack-objects, then we need to
> feed each potential object to want_object_in_pack() to see
> if it should be filtered out.  This is totally contrary to
> the purpose of the pack-reuse optimization, which tries hard
> to avoid doing any per-object work.  Therefore we need to
> disable this optimization when these options are in use.

I read this five times, as I wanted to understand what you are
saying, but I am not sure if I got it right.  One of the reasons why
I was confused was that I originally thought this "reuse" was about
delta reuse, but it is not.  It is "sending a slice of the original
packfile straight to the output".  But even after getting myself out
of that confusion, I still do not see why we "need to disable".
Surely, even if we need to exclude some objects from an existing
packfile due to these selection options, we should be able to reuse
the non-excluded part, no?  The end result may involve having to
pick and reuse more and smaller slices from existing packfiles,
which may be much less efficient, but it is no immediately obvious
to me if it leads to "need to disable".  I would understand it if it
were "it becomes much less efficient and we are better off not using
the bitmap code at all", though.

Is the real reason why we want to disable the "reuse" because it is
not easy to update the reuse_partial_packfile_from_bitmap() logic to
take these selection options into account?  If so, I would also
understand why this is a good change.

Puzzled.

> This bug has been present since the inception of the
> pack-reuse code, but was unlikely to come up in practice.


> +test_expect_success 'pack reuse respects --honor-pack-keep' '
> +	test_when_finished "rm -f .git/objects/pack/*.keep" &&
> +	for i in .git/objects/pack/*.pack; do
> +		>${i%.pack}.keep
> +	done &&

Micronit: style.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] pack-objects: disable pack reuse for object-selection options
  2017-05-08  4:56 ` Junio C Hamano
@ 2017-05-08  7:31   ` Jeff King
  2017-05-09  0:44     ` Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Jeff King @ 2017-05-08  7:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Mon, May 08, 2017 at 01:56:27PM +0900, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > If certain options like --honor-pack-keep, --local, or
> > --incremental are used with pack-objects, then we need to
> > feed each potential object to want_object_in_pack() to see
> > if it should be filtered out.  This is totally contrary to
> > the purpose of the pack-reuse optimization, which tries hard
> > to avoid doing any per-object work.  Therefore we need to
> > disable this optimization when these options are in use.
> 
> I read this five times, as I wanted to understand what you are
> saying, but I am not sure if I got it right.  One of the reasons why
> I was confused was that I originally thought this "reuse" was about
> delta reuse, but it is not.  It is "sending a slice of the original
> packfile straight to the output".

Right. The "send a slice" goes under the name "reuse_packfile" in the
code, but I'm not surprised you're not familiar with it. We added it
long ago as part of the bitmap feature, and it's not often talked about
(and in its current incarnation, isn't actually very useful; I have
patches to improve that, but haven't gotten around to upstreaming them).

> But even after getting myself out
> of that confusion, I still do not see why we "need to disable".
> Surely, even if we need to exclude some objects from an existing
> packfile due to these selection options, we should be able to reuse
> the non-excluded part, no?  The end result may involve having to
> pick and reuse more and smaller slices from existing packfiles,
> which may be much less efficient, but it is no immediately obvious
> to me if it leads to "need to disable".  I would understand it if it
> were "it becomes much less efficient and we are better off not using
> the bitmap code at all", though.

Yes, it's this last bit. The main win of the packfile reuse code is that
it builds on the bitmaps to avoid doing as much per-object work as
possible. So the objects don't even get added to the list of "struct
object_entry", and we never consider them for the "should they be in the
result" checks beyond the have/want computation done by the bitmaps.

We could add those checks in, but what's the point? The idea of the
reuse code is to be a fast path for serving vanilla clones. Searching
through all of the packfiles for a .keep entry is the antithesis of
that.

> Is the real reason why we want to disable the "reuse" because it is
> not easy to update the reuse_partial_packfile_from_bitmap() logic to
> take these selection options into account?  If so, I would also
> understand why this is a good change.

This is a side concern for the current form. With the patches I
mentioned above, it's not too hard to omit some objects and still reuse
the other bits verbatim. But again, even evaluating the function for
each pack is too expensive, even if the implementation supported it.

> > +test_expect_success 'pack reuse respects --honor-pack-keep' '
> > +	test_when_finished "rm -f .git/objects/pack/*.keep" &&
> > +	for i in .git/objects/pack/*.pack; do
> > +		>${i%.pack}.keep
> > +	done &&
> 
> Micronit: style.

I assume you mean putting "do" on a separate line. Sorry, the "; do" is
my native tongue and I sometimes slip back into it without thinking.

-Peff

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] pack-objects: disable pack reuse for object-selection options
  2017-05-08  7:31   ` Jeff King
@ 2017-05-09  0:44     ` Junio C Hamano
  2017-05-09  2:00       ` Jeff King
  0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2017-05-09  0:44 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> On Mon, May 08, 2017 at 01:56:27PM +0900, Junio C Hamano wrote:
>
>> Surely, even if we need to exclude some objects from an existing
>> packfile due to these selection options, we should be able to reuse
>> the non-excluded part, no?  The end result may involve having to
>> pick and reuse more and smaller slices from existing packfiles,
>> which may be much less efficient, but it is no immediately obvious
>> to me if it leads to "need to disable".  I would understand it if it
>> were "it becomes much less efficient and we are better off not using
>> the bitmap code at all", though.
>
> Yes, it's this last bit. The main win of the packfile reuse code is that
> it builds on the bitmaps to avoid doing as much per-object work as
> possible. So the objects don't even get added to the list of "struct
> object_entry", and we never consider them for the "should they be in the
> result" checks beyond the have/want computation done by the bitmaps.
>
> We could add those checks in, but what's the point? The idea of the
> reuse code is to be a fast path for serving vanilla clones. Searching
> through all of the packfiles for a .keep entry is the antithesis of
> that.

Ah, OK, and now I understand why you called this a "bug" (which is
older and do not need to be addressed as part of 2.13) in the
original message.  The new tests check requests that ought to
produce an empty packfile as the result actually do, but with the
current code, the reuse code does not work with --local and friends
and ends up including what was requested to be excluded.

Thanks.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] pack-objects: disable pack reuse for object-selection options
  2017-05-09  0:44     ` Junio C Hamano
@ 2017-05-09  2:00       ` Jeff King
  2017-05-09  2:14         ` Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Jeff King @ 2017-05-09  2:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, May 09, 2017 at 09:44:50AM +0900, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > On Mon, May 08, 2017 at 01:56:27PM +0900, Junio C Hamano wrote:
> >
> >> Surely, even if we need to exclude some objects from an existing
> >> packfile due to these selection options, we should be able to reuse
> >> the non-excluded part, no?  The end result may involve having to
> >> pick and reuse more and smaller slices from existing packfiles,
> >> which may be much less efficient, but it is no immediately obvious
> >> to me if it leads to "need to disable".  I would understand it if it
> >> were "it becomes much less efficient and we are better off not using
> >> the bitmap code at all", though.
> >
> > Yes, it's this last bit. The main win of the packfile reuse code is that
> > it builds on the bitmaps to avoid doing as much per-object work as
> > possible. So the objects don't even get added to the list of "struct
> > object_entry", and we never consider them for the "should they be in the
> > result" checks beyond the have/want computation done by the bitmaps.
> >
> > We could add those checks in, but what's the point? The idea of the
> > reuse code is to be a fast path for serving vanilla clones. Searching
> > through all of the packfiles for a .keep entry is the antithesis of
> > that.
> 
> Ah, OK, and now I understand why you called this a "bug" (which is
> older and do not need to be addressed as part of 2.13) in the
> original message.  The new tests check requests that ought to
> produce an empty packfile as the result actually do, but with the
> current code, the reuse code does not work with --local and friends
> and ends up including what was requested to be excluded.

Right. Did you want me to try re-wording the commit message, or does it
make sense now?

-Peff

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] pack-objects: disable pack reuse for object-selection options
  2017-05-09  2:00       ` Jeff King
@ 2017-05-09  2:14         ` Junio C Hamano
  2017-05-09  2:21           ` Jeff King
  0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2017-05-09  2:14 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

>> Ah, OK, and now I understand why you called this a "bug" (which is
>> older and do not need to be addressed as part of 2.13) in the
>> original message.  The new tests check requests that ought to
>> produce an empty packfile as the result actually do, but with the
>> current code, the reuse code does not work with --local and friends
>> and ends up including what was requested to be excluded.
>
> Right. Did you want me to try re-wording the commit message, or does it
> make sense now?

It does make sense to me now, but I do not speak for all future
readers of "git log", so...

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] pack-objects: disable pack reuse for object-selection options
  2017-05-09  2:14         ` Junio C Hamano
@ 2017-05-09  2:21           ` Jeff King
  2017-05-09  2:48             ` Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Jeff King @ 2017-05-09  2:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, May 09, 2017 at 11:14:18AM +0900, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> >> Ah, OK, and now I understand why you called this a "bug" (which is
> >> older and do not need to be addressed as part of 2.13) in the
> >> original message.  The new tests check requests that ought to
> >> produce an empty packfile as the result actually do, but with the
> >> current code, the reuse code does not work with --local and friends
> >> and ends up including what was requested to be excluded.
> >
> > Right. Did you want me to try re-wording the commit message, or does it
> > make sense now?
> 
> It does make sense to me now, but I do not speak for all future
> readers of "git log", so...

I guess what I was asking was: do you still think it was unclear, or do
you think you were just being dense?

I don't feel like I gave any information in the follow-on explanation
that wasn't in the commit message, so I wasn't clear if I worded it
better or if it just sunk in better.

-Peff

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] pack-objects: disable pack reuse for object-selection options
  2017-05-09  2:21           ` Jeff King
@ 2017-05-09  2:48             ` Junio C Hamano
  2017-05-09  2:54               ` Jeff King
  0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2017-05-09  2:48 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> On Tue, May 09, 2017 at 11:14:18AM +0900, Junio C Hamano wrote:
>
>> Jeff King <peff@peff.net> writes:
>> 
>> >> Ah, OK, and now I understand why you called this a "bug" (which is
>> >> older and do not need to be addressed as part of 2.13) in the
>> >> original message.  The new tests check requests that ought to
>> >> produce an empty packfile as the result actually do, but with the
>> >> current code, the reuse code does not work with --local and friends
>> >> and ends up including what was requested to be excluded.
>> >
>> > Right. Did you want me to try re-wording the commit message, or does it
>> > make sense now?
>> 
>> It does make sense to me now, but I do not speak for all future
>> readers of "git log", so...
>
> I guess what I was asking was: do you still think it was unclear, or do
> you think you were just being dense?
>
> I don't feel like I gave any information in the follow-on explanation
> that wasn't in the commit message, so I wasn't clear if I worded it
> better or if it just sunk in better.

At least, "the current code is buggy when --local and friend are
given and includes needless objects in the result" was something I
learned only during the discussion, and would never have guessed by
reading the log message.  The second paragraph does talk about "This
bug has been present since...", but the first paragraph does not say
anything about the current output being broken.

So, I am not sure if this was a case where I was dense.  I think the
first paragraph needs a bit more clarity.

    If certain options like --honor-pack-keep, --local, or
    --incremental are used with pack-objects, then we need to
    feed each potential object to want_object_in_pack() to see
    if it should be filtered out.  This is totally contrary to
    the purpose of the pack-reuse optimization, which tries hard
    to avoid doing any per-object work.  Therefore we need to
    disable this optimization when these options are in use.

Perhaps "... should be filtered out." can be followed by "However,
the current code fails to do so, and we end up including these
unwanted objects in the output.", and then "This is totally..."  can
instead begin with "Besides, having to do per-object filtering is
totally...".  I wouldn't have been confused if it were like so.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] pack-objects: disable pack reuse for object-selection options
  2017-05-09  2:48             ` Junio C Hamano
@ 2017-05-09  2:54               ` Jeff King
  2017-05-09  2:59                 ` Jeff King
  2017-05-09  3:08                 ` Junio C Hamano
  0 siblings, 2 replies; 11+ messages in thread
From: Jeff King @ 2017-05-09  2:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, May 09, 2017 at 11:48:17AM +0900, Junio C Hamano wrote:

> > I guess what I was asking was: do you still think it was unclear, or do
> > you think you were just being dense?
> >
> > I don't feel like I gave any information in the follow-on explanation
> > that wasn't in the commit message, so I wasn't clear if I worded it
> > better or if it just sunk in better.
> 
> At least, "the current code is buggy when --local and friend are
> given and includes needless objects in the result" was something I
> learned only during the discussion, and would never have guessed by
> reading the log message.  The second paragraph does talk about "This
> bug has been present since...", but the first paragraph does not say
> anything about the current output being broken.

While waiting for your response I took a look to see if I could improve
it and came to the same conclusion. The result is below.

> So, I am not sure if this was a case where I was dense.  I think the
> first paragraph needs a bit more clarity.

Yeah, you were not being dense. I just wrote it badly. Sorry
for the confusion.

-- >8 --
Subject: pack-objects: disable pack reuse for object-selection options

If certain options like --honor-pack-keep, --local, or
--incremental are used with pack-objects, then we need to
feed each potential object to want_object_in_pack() to see
if it should be filtered out. But when the bitmap
reuse_packfile optimization is in effect, we do not call
that function at all, and in fact skip adding the objects to
the to_pack list entirely.  This means we have a bug: for
certain requests we will silently ignore those options and
include objects in that pack that should not be there.

The problem has been present since the inception of the
pack-reuse code in 6b8fda2db (pack-objects: use bitmaps when
packing objects, 2013-12-21), but it was unlikely to come up
in practice.  These options are generally used for on-disk
packing, not transfer packs (which go to stdout), but we've
never allowed pack reuse for non-stdout packs (until
645c432d6, we did not even use bitmaps, which the reuse
optimization relies on; after that, we explicitly turned it
off when not packing to stdout).

We can fix this by just disabling the reuse_packfile
optimization when the options are in use. In theory we could
teach the pack-reuse code to satisfy these checks, but it's
not worth the complexity. The purpose of the optimization is
to keep the amount of per-object work we do to a minimum.
But these options inherently require us to search for other
copies of each object, drowning out any benefit of the
pack-reuse optimization. But note that the optimizations
from 56dfeb626 (pack-objects: compute local/ignore_pack_keep
early, 2016-07-29) happen before pack-reuse, meaning that
specifying "--honor-pack-keep" in a repository with no .keep
files can still follow the fast path.

There are tests in t5310 that check these options with
bitmaps and --stdout, but they didn't catch the bug, and
it's hard to adapt them to do so.

One problem is that they don't use --delta-base-offset;
without that option, we always disable the reuse
optimization entirely. It would be fine to add it in (it
actually makes the test more realistic), but that still
isn't quite enough.

The other problem is that the reuse code is very picky; it
only kicks in when it can reuse most of a pack, starting
from the first byte. So we'd have to start from a fully
repacked and bitmapped state to trigger it. But the tests
for these options use a much more subtle state; they want to
be sure that the want_object_in_pack() code is allowing some
objects but not others. Doing a full repack runs counter to
that.

So this patch adds new tests at the end of the script which
create the fully-packed state and make sure that each option
is not fooled by reusable pack.

Signed-off-by: Jeff King <peff@peff.net>
---
Contents are the same. I decided to leave the "; do" as it
matches the rest of the script. If we're going to fix it, we
should do them all.

 builtin/pack-objects.c  |  6 +++++-
 t/t5310-pack-bitmaps.sh | 38 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 0fe35d1b5..50e01aa80 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2717,7 +2717,11 @@ static void loosen_unused_packed_objects(struct rev_info *revs)
  */
 static int pack_options_allow_reuse(void)
 {
-	return pack_to_stdout && allow_ofs_delta;
+	return pack_to_stdout &&
+	       allow_ofs_delta &&
+	       !ignore_packed_keep &&
+	       (!local || !have_non_local_packs) &&
+	       !incremental;
 }
 
 static int get_object_list_from_bitmap(struct rev_info *revs)
diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
index 424bec7d7..c3ddfa217 100755
--- a/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@@ -289,4 +289,42 @@ test_expect_success 'splitting packs does not generate bogus bitmaps' '
 	git -C no-bitmaps.git fetch .. HEAD
 '
 
+test_expect_success 'set up reusable pack' '
+	rm -f .git/objects/pack/*.keep &&
+	git repack -adb &&
+	reusable_pack () {
+		git for-each-ref --format="%(objectname)" |
+		git pack-objects --delta-base-offset --revs --stdout "$@"
+	}
+'
+
+test_expect_success 'pack reuse respects --honor-pack-keep' '
+	test_when_finished "rm -f .git/objects/pack/*.keep" &&
+	for i in .git/objects/pack/*.pack; do
+		>${i%.pack}.keep
+	done &&
+	reusable_pack --honor-pack-keep >empty.pack &&
+	git index-pack empty.pack &&
+	>expect &&
+	git show-index <empty.idx >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'pack reuse respects --local' '
+	mv .git/objects/pack/* alt.git/objects/pack/ &&
+	test_when_finished "mv alt.git/objects/pack/* .git/objects/pack/" &&
+	reusable_pack --local >empty.pack &&
+	git index-pack empty.pack &&
+	>expect &&
+	git show-index <empty.idx >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'pack reuse respects --incremental' '
+	reusable_pack --incremental >empty.pack &&
+	git index-pack empty.pack &&
+	>expect &&
+	git show-index <empty.idx >actual &&
+	test_cmp expect actual
+'
 test_done
-- 
2.13.0.rc2.436.g524cea07c


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] pack-objects: disable pack reuse for object-selection options
  2017-05-09  2:54               ` Jeff King
@ 2017-05-09  2:59                 ` Jeff King
  2017-05-09  3:08                 ` Junio C Hamano
  1 sibling, 0 replies; 11+ messages in thread
From: Jeff King @ 2017-05-09  2:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Mon, May 08, 2017 at 10:54:12PM -0400, Jeff King wrote:

> Contents are the same. I decided to leave the "; do" as it
> matches the rest of the script. If we're going to fix it, we
> should do them all.

Like this, perhaps.

I didn't go on a crusade against "; do" in the other scripts, but
perhaps that is low-hanging fruit that somebody else might want to pick.

-- >8 --
Subject: [PATCH] t5310: fix "; do" style

Our usual shell style is to put the "do" of a loop on its
own line, like:

  while $cond
  do
          something
  done

instead of:

  while $cond; do
          something
  done

We have a bit of both in our code base, but the former is
what's in CodingGuidelines (and outnumbers the latter in t/
by about 6:1).

Signed-off-by: Jeff King <peff@peff.net>
---
 t/t5310-pack-bitmaps.sh | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
index c3ddfa217..20e2473a0 100755
--- a/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@@ -20,11 +20,13 @@ has_any () {
 }
 
 test_expect_success 'setup repo with moderate-sized history' '
-	for i in $(test_seq 1 10); do
+	for i in $(test_seq 1 10)
+	do
 		test_commit $i
 	done &&
 	git checkout -b other HEAD~5 &&
-	for i in $(test_seq 1 10); do
+	for i in $(test_seq 1 10)
+	do
 		test_commit side-$i
 	done &&
 	git checkout master &&
@@ -104,7 +106,8 @@ test_expect_success 'clone from bitmapped repository' '
 '
 
 test_expect_success 'setup further non-bitmapped commits' '
-	for i in $(test_seq 1 10); do
+	for i in $(test_seq 1 10)
+	do
 		test_commit further-$i
 	done
 '
@@ -300,7 +303,8 @@ test_expect_success 'set up reusable pack' '
 
 test_expect_success 'pack reuse respects --honor-pack-keep' '
 	test_when_finished "rm -f .git/objects/pack/*.keep" &&
-	for i in .git/objects/pack/*.pack; do
+	for i in .git/objects/pack/*.pack
+	do
 		>${i%.pack}.keep
 	done &&
 	reusable_pack --honor-pack-keep >empty.pack &&
-- 
2.13.0.rc2.436.g524cea07c


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] pack-objects: disable pack reuse for object-selection options
  2017-05-09  2:54               ` Jeff King
  2017-05-09  2:59                 ` Jeff King
@ 2017-05-09  3:08                 ` Junio C Hamano
  1 sibling, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2017-05-09  3:08 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> On Tue, May 09, 2017 at 11:48:17AM +0900, Junio C Hamano wrote:
>
>> > I guess what I was asking was: do you still think it was unclear, or do
>> > you think you were just being dense?
>> >
>> > I don't feel like I gave any information in the follow-on explanation
>> > that wasn't in the commit message, so I wasn't clear if I worded it
>> > better or if it just sunk in better.
>> 
>> At least, "the current code is buggy when --local and friend are
>> given and includes needless objects in the result" was something I
>> learned only during the discussion, and would never have guessed by
>> reading the log message.  The second paragraph does talk about "This
>> bug has been present since...", but the first paragraph does not say
>> anything about the current output being broken.
>
> While waiting for your response I took a look to see if I could improve
> it and came to the same conclusion. The result is below.

Looks good to me.  I really like how the third-paragraph reasons
about pros and cons and decides to just disable the codepath.

I see this as an example of omitting something you know so well as
"too obvious", and it turns out that it isn't so obvious to others;
I commit the same sin all the time myself.  Catching instances of
these is part of the review process.

Thanks.

> -- >8 --
> Subject: pack-objects: disable pack reuse for object-selection options
>
> If certain options like --honor-pack-keep, --local, or
> --incremental are used with pack-objects, then we need to
> feed each potential object to want_object_in_pack() to see
> if it should be filtered out. But when the bitmap
> reuse_packfile optimization is in effect, we do not call
> that function at all, and in fact skip adding the objects to
> the to_pack list entirely.  This means we have a bug: for
> certain requests we will silently ignore those options and
> include objects in that pack that should not be there.
>
> The problem has been present since the inception of the
> pack-reuse code in 6b8fda2db (pack-objects: use bitmaps when
> packing objects, 2013-12-21), but it was unlikely to come up
> in practice.  These options are generally used for on-disk
> packing, not transfer packs (which go to stdout), but we've
> never allowed pack reuse for non-stdout packs (until
> 645c432d6, we did not even use bitmaps, which the reuse
> optimization relies on; after that, we explicitly turned it
> off when not packing to stdout).
>
> We can fix this by just disabling the reuse_packfile
> optimization when the options are in use. In theory we could
> teach the pack-reuse code to satisfy these checks, but it's
> not worth the complexity. The purpose of the optimization is
> to keep the amount of per-object work we do to a minimum.
> But these options inherently require us to search for other
> copies of each object, drowning out any benefit of the
> pack-reuse optimization. But note that the optimizations
> from 56dfeb626 (pack-objects: compute local/ignore_pack_keep
> early, 2016-07-29) happen before pack-reuse, meaning that
> specifying "--honor-pack-keep" in a repository with no .keep
> files can still follow the fast path.
>
> There are tests in t5310 that check these options with
> bitmaps and --stdout, but they didn't catch the bug, and
> it's hard to adapt them to do so.
>
> One problem is that they don't use --delta-base-offset;
> without that option, we always disable the reuse
> optimization entirely. It would be fine to add it in (it
> actually makes the test more realistic), but that still
> isn't quite enough.
>
> The other problem is that the reuse code is very picky; it
> only kicks in when it can reuse most of a pack, starting
> from the first byte. So we'd have to start from a fully
> repacked and bitmapped state to trigger it. But the tests
> for these options use a much more subtle state; they want to
> be sure that the want_object_in_pack() code is allowing some
> objects but not others. Doing a full repack runs counter to
> that.
>
> So this patch adds new tests at the end of the script which
> create the fully-packed state and make sure that each option
> is not fooled by reusable pack.
>
> Signed-off-by: Jeff King <peff@peff.net>

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2017-05-09  3:08 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-02  8:43 [PATCH] pack-objects: disable pack reuse for object-selection options Jeff King
2017-05-08  4:56 ` Junio C Hamano
2017-05-08  7:31   ` Jeff King
2017-05-09  0:44     ` Junio C Hamano
2017-05-09  2:00       ` Jeff King
2017-05-09  2:14         ` Junio C Hamano
2017-05-09  2:21           ` Jeff King
2017-05-09  2:48             ` Junio C Hamano
2017-05-09  2:54               ` Jeff King
2017-05-09  2:59                 ` Jeff King
2017-05-09  3:08                 ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).