git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / Atom feed
From: Jeff King <peff@peff.net>
To: Derrick Stolee <stolee@gmail.com>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH] t5304: add a test for pruning with bitmaps
Date: Fri, 19 Apr 2019 23:24:35 -0400
Message-ID: <20190420032435.GA3559@sigill.intra.peff.net> (raw)
In-Reply-To: <ba16afb4-25c1-8148-602e-130b0e17fc89@gmail.com>

On Fri, Apr 19, 2019 at 09:01:50PM -0400, Derrick Stolee wrote:

> > +test_expect_success 'trivial prune with bitmaps enabled' '
> > +	git repack -adb &&
> > +	blob=$(echo bitmap-unreachable-blob | git hash-object -w --stdin) &&
> > +	git prune --expire=now &&
> > +	git cat-file -e HEAD &&
> > +	test_must_fail git cat-file -e $blob
> > +'
> > +
> >  test_done
> 
> This test does cover the 'mark_object_seen()' method, which I tested by
> removing the "!" in the "if (!obj) die();" condition.
> 
> However, my first inclination was to remove the line
> 
> 	obj->flags |= SEEN;
> 
> from the method, and I found that it still worked! This was confusing,
> so I looked for places where the SEEN flag was added during bitmap walks,
> and it turns out that the objects are marked as SEEN by prepare_bitmap_walk().

I think this is only _some_ objects. The full bitmap is created by
reading the on-disk bitmaps, and then filling in any necessary gaps by
doing a traditional traversal. We also set it on tip objects to de-dup
the initial revs->pending list.

Try running t5304 with this:

diff --git a/reachable.c b/reachable.c
index eba6f64e01..7ec73ef43f 100644
--- a/reachable.c
+++ b/reachable.c
@@ -191,6 +191,8 @@ static int mark_object_seen(const struct object_id *oid,
 	if (!obj)
 		die("unable to create object '%s'", oid_to_hex(oid));
 
+	if (!(obj->flags & SEEN))
+		die("seen flag not already there");
 	obj->flags |= SEEN;
 	return 0;
 }

which shows that we are indeed freshly setting SEEN on some objects.

Interestingly, I don't _think_ you can cause an object to get pruned
accidentally here, because:

  1. Any object that will miss its SEEN has to be in the bitmapped pack,
     and not found through normal traversal.

  2. git-prune only removes loose objects, not packed ones.

  3. Loose objects cannot be in the bitmapped pack. Therefore, no object
     can hit both cases (1) and (2).

Even if that were the end of the story, I don't think I'd want to rely
on it here. The post-condition of the function should be that SEEN is
set on all reachable objects, whether bitmaps are used or not. And we do
indeed use this elsewhere:

We'll later call prune_shallow(), which uses SEEN to discard unreachable
entries. I'm not sure it even makes sense to have a shallow repo with
bitmaps, though.  Obviously we're lacking the full graph, but I wouldn't
be surprised if the shallow code quietly pretends that all is well and
we generate bogus bitmaps or something. Or maybe it even mostly works as
long as you don't walk over the shallow cut-points.

mark_reachable_objects() is also used for reflog expiration with
--stale-fix. I tried generating a test that would fail with your patch,
but I think it's not quite possible, because --stale-fix will do a
follow-up walk of the graph to see which objects mentioned in the reflog
we have and do not have. So it doesn't actually break things, it just
makes them slower (because we erroneously fail to mark SEEN things that
it's then forced to walk).

So I don't _think_ your patch actually breaks anything user-visible, but
it's a bit like leaving a live grenade in your living room for somebody
to find.

And I think we are indeed covering lookup_object_by_type(), etc in the
t5304 test I added. So AFAICT all of the new code is covered after that?

-Peff

  reply index

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-14  4:31 [PATCH 0/3] some prune optimizations Jeff King
2019-02-14  4:35 ` [PATCH 1/3] prune: lazily perform reachability traversal Jeff King
2019-02-14 10:54   ` Eric Sunshine
2019-02-14 11:07     ` Jeff King
2019-02-14  4:37 ` [PATCH 2/3] prune: use bitmaps for " Jeff King
2019-03-09  2:49   ` bitmaps by default? [was: prune: use bitmaps for reachability traversal] Eric Wong
2019-03-10 23:39     ` Jeff King
2019-03-12  3:13       ` [PATCH] repack: enable bitmaps by default on bare repos Eric Wong
2019-03-12  9:07         ` Ævar Arnfjörð Bjarmason
2019-03-12 10:49         ` Jeff King
2019-03-12 12:05           ` Jeff King
2019-03-13  1:51           ` Eric Wong
2019-03-13 14:54             ` Jeff King
2019-03-14  9:12               ` [PATCH v3] " Eric Wong
2019-03-14 16:02                 ` Jeff King
2019-03-15  6:21                   ` [PATCH 0/2] enable bitmap hash-cache by default Jeff King
2019-03-15  6:22                     ` [PATCH 1/2] t5310: correctly remove bitmaps for jgit test Jeff King
2019-03-15 13:25                       ` SZEDER Gábor
2019-03-15 18:36                         ` Jeff King
2019-03-15  6:25                     ` [PATCH 2/2] pack-objects: default to writing bitmap hash-cache Jeff King
2019-04-09 15:10                 ` [PATCH v3] repack: enable bitmaps by default on bare repos Ævar Arnfjörð Bjarmason
2019-04-10 22:57                   ` Jeff King
2019-04-25  7:16                     ` Junio C Hamano
2019-05-04  1:37                       ` Jeff King
2019-05-04  6:52                         ` Ævar Arnfjörð Bjarmason
2019-05-04 13:23                           ` SZEDER Gábor
2019-05-08 20:17                             ` Ævar Arnfjörð Bjarmason
2019-05-09  4:24                               ` Junio C Hamano
2019-05-07  7:45                           ` Jeff King
2019-05-07  8:12                             ` Ævar Arnfjörð Bjarmason
2019-05-08  7:11                               ` Jeff King
2019-05-08 14:20                                 ` Derrick Stolee
2019-05-08 16:13                                 ` Ævar Arnfjörð Bjarmason
2019-05-08 22:25                                   ` Jeff King
2019-04-15 15:00   ` [PATCH 2/3] prune: use bitmaps for reachability traversal Derrick Stolee
2019-04-18 19:49     ` Jeff King
2019-04-18 20:08       ` [PATCH] t5304: add a test for pruning with bitmaps Jeff King
2019-04-20  1:01         ` Derrick Stolee
2019-04-20  3:24           ` Jeff King [this message]
2019-04-20 21:01             ` Derrick Stolee
2019-02-14  4:38 ` [PATCH 3/3] prune: check SEEN flag for reachability Jeff King

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190420032435.GA3559@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

git@vger.kernel.org list mirror (unofficial, one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.org/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox