git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Michael Haggerty <mhagger@alum.mit.edu>
Subject: Re: [PATCH v3 0/2] pack-objects mru
Date: Thu, 11 Aug 2016 00:48:09 -0400	[thread overview]
Message-ID: <20160811044809.ep22lybdyfzmx4pl@sigill.intra.peff.net> (raw)
In-Reply-To: <xmqqbn107epz.fsf@gitster.mtv.corp.google.com>

On Wed, Aug 10, 2016 at 09:47:52AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> >> This is not new with this change, but I am not quite sure what in
> >> the current code prevents us from busting the delta limit for reused
> >> ones, though.
> >
> > I think in the current code you are limited by the depth you might find
> > in a single existing pack (since we never reuse cross-pack deltas).
> 
> Sorry for going deeper in the tangent, but I vaguely recall raising
> it long time ago as a potential issue that delta reuse out of an
> original pack created with deep delta chain may bust a delta chain
> limit when repacking with shorter delta chain limit; I just do not
> remember where that discussion went (i.e. decided to be a non-issue?
> we added code to avoid it? etc.)

Digging on the list and in the history, I found your e4c9327
(pack-objects: avoid delta chains that are too long., 2006-02-17). That
approach went away with Nico's 898b14c (pack-objects: rework
check_delta_limit usage, 2007-04-16), which I think is where we are at
today.

I found the patches for both on the list, but no interesting discussion.

It looks like 898b14c does the depth check dynamically in find_delta. So
we would not build on a too-long chain, but I do not see anything that
prevents us from creating a too-long chain purely out of reused deltas.

Which means that even without my patches, repacking with a shorter delta
chain does not guarantee you do not have a longer chain. And mine
introduces a potential "packs * max_depth" problem (I also think
check_delta_limit could recurse very deeply if given a pathologically
weird pack that has insane delta limits, but presumably we would just
run out of stack and crash, which seems like an OK outcome for such
maliciousness).

I guess it would not be that hard to break the reused chains as part of
the DFS search I introduced (we are recursing already; just stop
recursing and break when we hit the max-depth).

> > However, I think with cross-pack deltas, you could have a situation
> > like:
> >
> >   pack 1: A -> B -> C
> >   pack 2: C -> D -> E
> >
> > and pick A and B from the first pack, and C, D, and E from the second.
> > Then you end up with:
> >
> >   A -> B -> C -> D -> E
> >
> > in the output. IOW, I think the absolute worst case chain is the
> > max_depth times the number of packs.
> 
> Also if pack1 and pack2 were created with depth limit of 3 and we
> are repacking with depth limit of 2, then we are busting the limit
> already with or without cross-pack chaining, I would think.

Right, though that is at least bounded to "what you packed with before",
which people do not usually change (OTOH, we accept packs from random
strangers via the protocol).

-Peff

      reply	other threads:[~2016-08-11  4:48 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-29  4:04 [PATCH v2 0/7] speed up pack-objects counting with many packs Jeff King
2016-07-29  4:06 ` [PATCH v2 1/7] t/perf: add tests for many-pack scenarios Jeff King
2016-07-29  4:06 ` [PATCH v2 2/7] sha1_file: drop free_pack_by_name Jeff King
2016-07-29  4:06 ` [PATCH v2 3/7] add generic most-recently-used list Jeff King
2016-07-29  4:09 ` [PATCH v2 4/7] find_pack_entry: replace last_found_pack with MRU cache Jeff King
2016-07-29  4:10 ` [PATCH v2 5/7] pack-objects: break out of want_object loop early Jeff King
2016-07-29  4:11 ` [PATCH v2 6/7] pack-objects: compute local/ignore_pack_keep early Jeff King
2016-07-29  4:15 ` [PATCH v2 7/7] pack-objects: use mru list when iterating over packs Jeff King
2016-07-29  5:45   ` Jeff King
2016-07-29 15:02     ` Junio C Hamano
2016-08-08 14:50       ` Jeff King
2016-08-08 16:28         ` Junio C Hamano
2016-08-08 16:51           ` Jeff King
2016-08-08 17:16             ` Junio C Hamano
2016-08-09 14:04               ` Jeff King
2016-08-09 17:45                 ` Jeff King
2016-08-09 18:06                   ` Junio C Hamano
2016-08-09 22:29                 ` Junio C Hamano
2016-08-10 11:52                   ` [PATCH v3 0/2] pack-objects mru Jeff King
2016-08-10 12:02                     ` [PATCH v3 1/2] pack-objects: break delta cycles before delta-search phase Jeff King
2016-08-10 20:17                       ` Junio C Hamano
2016-08-11  5:02                         ` Jeff King
2016-08-11  5:15                           ` [PATCH v4 " Jeff King
2016-08-11  6:57                           ` [PATCH v3 " Jeff King
2016-08-11  9:20                             ` [PATCH v5] pack-objects mru Jeff King
2016-08-11  9:24                               ` [PATCH v5 1/4] provide an initializer for "struct object_info" Jeff King
2016-08-11  9:25                               ` [PATCH v5 2/4] sha1_file: make packed_object_info public Jeff King
2016-08-11  9:26                               ` [PATCH v5 3/4] pack-objects: break delta cycles before delta-search phase Jeff King
2016-08-11  9:26                               ` [PATCH v5 4/4] pack-objects: use mru list when iterating over packs Jeff King
2016-08-11  9:57                               ` [PATCH v5] pack-objects mru Jeff King
2016-08-11 15:11                                 ` Junio C Hamano
2016-08-11 16:19                                   ` Jeff King
2016-08-10 12:03                     ` [PATCH v3 2/2] pack-objects: use mru list when iterating over packs Jeff King
2016-08-10 16:47                     ` [PATCH v3 0/2] pack-objects mru Junio C Hamano
2016-08-11  4:48                       ` Jeff King [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160811044809.ep22lybdyfzmx4pl@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=mhagger@alum.mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).