git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Jonathan Tan <jonathantanmy@google.com>,
	christian.couder@gmail.com, git@vger.kernel.org,
	chriscool@tuxfamily.org, ramsay@ramsayjones.plus.com
Subject: Re: [RFC PATCH 10/10] pack-objects: improve partial packfile reuse
Date: Sun, 13 Oct 2019 03:38:52 -0400	[thread overview]
Message-ID: <20191013073851.GA7001@sigill.intra.peff.net> (raw)
In-Reply-To: <xmqqsgnyg76d.fsf@gitster-ct.c.googlers.com>

On Sat, Oct 12, 2019 at 09:04:58AM +0900, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > The current code does so by creating a new entry in the reused_chunks
> > array. In the worst case that can grow to have the same number of
> > entries as we have objects. So this code was an attempt to pad the
> > header of a shrunken entry to keep it the same size. I don't remember
> > all the problems we ran into with that, but in the end we found that it
> > didn't actually help much, because in practice we don't end up with a
> > lot of chunks anyway.
> 
> Hmm, I am kind of surprised that the decoding side allowed such a
> padding.

IIRC, the "padding" is just a sequence of 0-length-plus-continuation-bit
varint bytes. And for some reason it worked for the size but not for the
delta offset value. So the decoder wasn't aware of it, but simply hadn't
explicitly enforced that there weren't pointless bytes.

But yeah, it seems like a pretty hacky thing to rely on. I don't think
we ever even ran that code in production, and the if(0) was just
leftover experimental cruft.

> > I think the original code may simply have been buggy and nobody noticed.
> > Here's what I wrote when this line was added in our fork:
> [...]
> Impressed by the careful thinking here.

It's unfortunate that the reasoning there wasn't part of the earlier
submission. I'm not sure how to reconcile that. The patches as
originally written can't be applied now (they were munged over the years
during merges with upstream). And some of them are just "oops, fix this
dumb bug" trash that we wouldn't want to take anyway.

So I think at best they're something for somebody to manually look at
and try to incorporate into the commit messages of the revised patch
series. But I didn't give them to Christian to work with because it's
hard to even figure out which patches are still relevant. I wish I had a
better tooling there. I've been playing with something that looks at a
diff and then tries to blame each of the touched lines. Which is sort of
like the "-L" line-log, I guess, but for a very non-contiguous set of
lines.

-Peff

  reply	other threads:[~2019-10-13  7:38 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-13 13:02 [RFC PATCH 00/10] Rewrite packfile reuse code Christian Couder
2019-09-13 13:02 ` [RFC PATCH 01/10] builtin/pack-objects: report reused packfile objects Christian Couder
2019-09-13 13:02 ` [RFC PATCH 02/10] packfile: expose get_delta_base() Christian Couder
2019-09-13 13:02 ` [RFC PATCH 03/10] ewah/bitmap: introduce bitmap_word_alloc() Christian Couder
2019-09-13 13:02 ` [RFC PATCH 04/10] ewah/bitmap: always allocate 2 more words Christian Couder
2019-10-10 23:40   ` Jonathan Tan
2019-10-11  7:49     ` Christian Couder
2019-10-11 18:05       ` Jeff King
2019-09-13 13:02 ` [RFC PATCH 05/10] pack-bitmap: don't rely on bitmap_git->reuse_objects Christian Couder
2019-10-10 23:44   ` Jonathan Tan
2019-10-11  7:50     ` Christian Couder
2019-09-13 13:02 ` [RFC PATCH 06/10] pack-bitmap: introduce bitmap_walk_contains() Christian Couder
2019-09-13 13:02 ` [RFC PATCH 07/10] csum-file: introduce hashfile_total() Christian Couder
2019-09-13 13:02 ` [RFC PATCH 08/10] pack-objects: introduce pack.allowPackReuse Christian Couder
2019-09-13 21:37   ` Junio C Hamano
2019-09-13 13:02 ` [RFC PATCH 09/10] builtin/pack-objects: introduce obj_is_packed() Christian Couder
2019-09-13 13:02 ` [RFC PATCH 10/10] pack-objects: improve partial packfile reuse Christian Couder
2019-09-13 22:29   ` Junio C Hamano
2019-09-14  2:02     ` Jeff King
2019-09-14  3:06       ` Junio C Hamano
2019-10-02 15:57         ` Jeff King
2019-10-03  2:06           ` Junio C Hamano
2019-10-03  6:55             ` Christian Couder
2019-10-10 23:59   ` Jonathan Tan
2019-10-11  7:39     ` Christian Couder
2019-10-11 18:01     ` Jeff King
2019-10-11 21:04       ` Jonathan Tan
2019-10-12  0:04       ` Junio C Hamano
2019-10-13  7:38         ` Jeff King [this message]
2019-10-17  7:03           ` Junio C Hamano
2019-10-17  7:23             ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191013073851.GA7001@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=chriscool@tuxfamily.org \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jonathantanmy@google.com \
    --cc=ramsay@ramsayjones.plus.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).