git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Michael Haggerty <mhagger@alum.mit.edu>
Subject: Re: [PATCH v2 7/7] pack-objects: use mru list when iterating over packs
Date: Mon, 8 Aug 2016 12:51:28 -0400	[thread overview]
Message-ID: <20160808165127.fvhnkcfsj4vif7iu@sigill.intra.peff.net> (raw)
In-Reply-To: <xmqq8tw7gr82.fsf@gitster.mtv.corp.google.com>

On Mon, Aug 08, 2016 at 09:28:29AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > Here's a list of approaches I think we can use to fix this:
> >
> >   1. squelch the warning and ignore it. The downside here, besides not
> >      warning the user about true in-pack cycles, is that we have no
> >      opportunity to actually find a new delta (because we realize the
> >      problem only after the delta-compression phase).
> >
> >      My test repository is a bad packing of all of the forks of
> >      torvalds/linux, with 3600 packs. I'm happy to share it if anybody
> >      wants to see it, but note that it is 11GB.
> >
> >      The space overhead of the resulting pack in this case is ~3.2%
> >      (versus a pack generated by the original code, using the static
> >      pack order).  Which is really not that bad, but I'm sure there are
> >      more pathological cases (i.e., there were on the order of hundreds
> >      or maybe thousands of cycles that needed broken, out of about 15
> >      million total objects; but one could imagine the worst case as
> >      nr_of_objects/2).
> > ...
> >
> >     So I dunno. I really like the MRU approach if we can salvage it.
> 
> I think I share the same feeling.  As long as the chance is small
> enough that the pack reordering creates a new cycle, the resulting
> pack would not become too bloated by the last-ditch cycle breaking
> code and finding a replacement delta instead of inflating it may not
> be worth the trouble.

Let me see how complicated it is to do the cycle-detection earlier. It
really shouldn't be that hard (we don't even need an auxiliary data
structure; we can just smudge the index value like write_object does).
I'm very curious to see the pack resulting pack size.

> It worries me a lot to lose the warning unconditionally, though.
> That's the (only) coal-mine canary that lets us notice a problem
> when we actually start hitting that last-ditch cycle breaking too
> often.

The dedicated cycle-detection will lose that warning, too (it doesn't
touch that code, but it's effectively checking the same thing earlier).

I agree it's unfortunate to lose. On the other hand, it is being lost
because we are correctly handling the cycles, and there is nothing to
warn about. So it ceases to be a problem, and starts being a normal,
acceptable thing.

-Peff

  reply	other threads:[~2016-08-08 16:51 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-29  4:04 [PATCH v2 0/7] speed up pack-objects counting with many packs Jeff King
2016-07-29  4:06 ` [PATCH v2 1/7] t/perf: add tests for many-pack scenarios Jeff King
2016-07-29  4:06 ` [PATCH v2 2/7] sha1_file: drop free_pack_by_name Jeff King
2016-07-29  4:06 ` [PATCH v2 3/7] add generic most-recently-used list Jeff King
2016-07-29  4:09 ` [PATCH v2 4/7] find_pack_entry: replace last_found_pack with MRU cache Jeff King
2016-07-29  4:10 ` [PATCH v2 5/7] pack-objects: break out of want_object loop early Jeff King
2016-07-29  4:11 ` [PATCH v2 6/7] pack-objects: compute local/ignore_pack_keep early Jeff King
2016-07-29  4:15 ` [PATCH v2 7/7] pack-objects: use mru list when iterating over packs Jeff King
2016-07-29  5:45   ` Jeff King
2016-07-29 15:02     ` Junio C Hamano
2016-08-08 14:50       ` Jeff King
2016-08-08 16:28         ` Junio C Hamano
2016-08-08 16:51           ` Jeff King [this message]
2016-08-08 17:16             ` Junio C Hamano
2016-08-09 14:04               ` Jeff King
2016-08-09 17:45                 ` Jeff King
2016-08-09 18:06                   ` Junio C Hamano
2016-08-09 22:29                 ` Junio C Hamano
2016-08-10 11:52                   ` [PATCH v3 0/2] pack-objects mru Jeff King
2016-08-10 12:02                     ` [PATCH v3 1/2] pack-objects: break delta cycles before delta-search phase Jeff King
2016-08-10 20:17                       ` Junio C Hamano
2016-08-11  5:02                         ` Jeff King
2016-08-11  5:15                           ` [PATCH v4 " Jeff King
2016-08-11  6:57                           ` [PATCH v3 " Jeff King
2016-08-11  9:20                             ` [PATCH v5] pack-objects mru Jeff King
2016-08-11  9:24                               ` [PATCH v5 1/4] provide an initializer for "struct object_info" Jeff King
2016-08-11  9:25                               ` [PATCH v5 2/4] sha1_file: make packed_object_info public Jeff King
2016-08-11  9:26                               ` [PATCH v5 3/4] pack-objects: break delta cycles before delta-search phase Jeff King
2016-08-11  9:26                               ` [PATCH v5 4/4] pack-objects: use mru list when iterating over packs Jeff King
2016-08-11  9:57                               ` [PATCH v5] pack-objects mru Jeff King
2016-08-11 15:11                                 ` Junio C Hamano
2016-08-11 16:19                                   ` Jeff King
2016-08-10 12:03                     ` [PATCH v3 2/2] pack-objects: use mru list when iterating over packs Jeff King
2016-08-10 16:47                     ` [PATCH v3 0/2] pack-objects mru Junio C Hamano
2016-08-11  4:48                       ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160808165127.fvhnkcfsj4vif7iu@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=mhagger@alum.mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).