git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Shawn Pearce <spearce@spearce.org>
To: Vicent Marti <vicent@github.com>
Cc: Jeff King <peff@peff.net>, git <git@vger.kernel.org>
Subject: Re: [PATCH 10/19] pack-bitmap: add support for bitmap indexes
Date: Wed, 30 Oct 2013 16:04:06 +0000	[thread overview]
Message-ID: <CAJo=hJvm_sZqobzO0Er-OVp-Xkf4fT_tySQeNWf0mveHH-G-Xg@mail.gmail.com> (raw)
In-Reply-To: <CAFFjANQyMfV4M_wynPORfN2S1=eAdBxScgNYzD_dsRmKekp83Q@mail.gmail.com>

On Wed, Oct 30, 2013 at 3:47 PM, Vicent Marti <vicent@github.com> wrote:
> On Wed, Oct 30, 2013 at 9:10 AM, Jeff King <peff@peff.net> wrote:
>>
>> In fact, I'm not quite sure that even a partial reuse up to an offset is
>> 100% safe. In a newly packed git repo it is, because we always put bases
>> before deltas (and OFS_DELTA objects need this). But if you had a bitmap
>> generated from a fixed thin pack, we would have REF_DELTA objects early
>> on that depend on bases appended to the end of the pack. So I really
>> wonder if we should scrap this partial reuse and either just have full
>> reuse, or go through the regular object_entry construction.
>>
>> Vicent, you've thought about the reuse code a lot more than I have. Any
>> thoughts?
>
> Yes, our pack writing and bitmap code takes enough precautions to
> arrange the objects in the packfile in a way that can be partially
> reused, so for any given bitmap file written from Git, I'd say we're
> safe to always reuse the leader of the pack if this is possible.
>
> For bitmaps generated from JGit, however, we cannot make this
> assumption. I mean, we can right now (from my understanding of the
> current implementation for pack-objects on JGit), but they are free to
> change this in the future.

JGit certainly doesn't promise the ordering behavior, so the fact that
its happening is just luck. The code could change in the future to
invalidate this.

> Obviously I intend to keep the pack reuse on production because the
> CPU savings are noticeable, but we can drop it from the public
> patchset.

I think you should keep it in, its a significant improvement.

> Ideally, we'd have full pack reuse like JGit, but we cannot
> reasonably do that in GitHub because splitting a pack for the network
> root would double our disk usage for all the forks.

I gave a talk the week before about Git bitmaps and why we sometimes
have to slice pack files by object.

Some guy in the audience kept yelling that since its Git its all open
source and `git clone` is "just" a file transfer problem. So maybe for
his GitHub repositories and forks its OK to include the entire fork
network when someone clones?  :-)

  reply	other threads:[~2013-10-30 16:04 UTC|newest]

Thread overview: 87+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-10-24 17:59 [PATCH 0/19] pack bitmaps Jeff King
2013-10-24 17:59 ` [PATCH 01/19] sha1write: make buffer const-correct Jeff King
2013-10-24 18:00 ` [PATCH 02/19] revindex: Export new APIs Jeff King
2013-10-24 18:01 ` [PATCH 03/19] pack-objects: Refactor the packing list Jeff King
2013-10-24 18:01 ` [PATCH 04/19] pack-objects: factor out name_hash Jeff King
2013-10-24 18:01 ` [PATCH 05/19] revision: allow setting custom limiter function Jeff King
2013-10-24 18:01 ` [PATCH 06/19] sha1_file: export `git_open_noatime` Jeff King
2013-10-24 18:01 ` [PATCH 07/19] compat: add endianness helpers Jeff King
2013-10-26  7:55   ` Thomas Rast
2013-10-30  8:25     ` Jeff King
2013-10-30 17:06       ` Vicent Martí
2013-10-24 18:02 ` [PATCH 08/19] ewah: compressed bitmap implementation Jeff King
2013-10-24 23:34   ` Junio C Hamano
2013-10-25  3:15     ` Jeff King
2013-10-26  7:55   ` Thomas Rast
2013-10-24 18:03 ` [PATCH 09/19] documentation: add documentation for the bitmap format Jeff King
2013-10-25  1:16   ` Duy Nguyen
2013-10-25  3:21     ` Jeff King
2013-10-25  3:28       ` Duy Nguyen
2013-10-25 13:47       ` Shawn Pearce
2013-10-30  7:50         ` Jeff King
2013-10-30 10:23           ` Shawn Pearce
2013-10-30 16:11             ` Vicent Marti
2013-10-30 16:14             ` Vicent Marti
2013-10-24 18:03 ` [PATCH 10/19] pack-bitmap: add support for bitmap indexes Jeff King
2013-10-25 13:55   ` Shawn Pearce
2013-10-30  8:10     ` Jeff King
2013-10-30 10:27       ` Shawn Pearce
2013-10-30 15:47       ` Vicent Marti
2013-10-30 16:04         ` Shawn Pearce [this message]
2013-10-30 20:25         ` Jeff King
2013-10-24 18:04 ` [PATCH 11/19] pack-objects: use bitmaps when packing objects Jeff King
2013-10-25 14:14   ` Shawn Pearce
2013-10-30  8:21     ` Jeff King
2013-10-30 10:38       ` Shawn Pearce
2013-10-30 16:01         ` Vicent Marti
2013-10-24 18:06 ` [PATCH 12/19] rev-list: add bitmap mode to speed up object lists Jeff King
2013-10-25 14:00   ` Shawn Pearce
2013-10-30  8:12     ` Jeff King
2013-10-24 18:06 ` [PATCH 13/19] pack-objects: implement bitmap writing Jeff King
2013-10-25  1:21   ` Duy Nguyen
2013-10-25  3:22     ` Jeff King
2013-10-24 18:06 ` [PATCH 14/19] repack: stop using magic number for ARRAY_SIZE(exts) Jeff King
2013-10-24 18:07 ` [PATCH 15/19] repack: turn exts array into array-of-struct Jeff King
2013-10-24 18:07 ` [PATCH 16/19] repack: handle optional files created by pack-objects Jeff King
2013-10-24 18:08 ` [PATCH 17/19] repack: consider bitmaps when performing repacks Jeff King
2013-10-24 18:08 ` [PATCH 18/19] t: add basic bitmap functionality tests Jeff King
2013-10-24 18:08 ` [PATCH 19/19] pack-bitmap: implement optional name_hash cache Jeff King
2013-10-24 20:25 ` [PATCH 0/19] pack bitmaps Junio C Hamano
2013-10-25  3:07 ` Junio C Hamano
2013-10-25  5:55 ` [PATCHv2 " Jeff King
2013-10-25  5:57   ` [PATCH v2 01/19] sha1write: make buffer const-correct Jeff King
2013-10-25  6:02   ` [PATCH 02/19] revindex: Export new APIs Jeff King
2013-10-25  6:03   ` [PATCH v2 03/19] pack-objects: Refactor the packing list Jeff King
2013-10-25  6:03   ` [PATCH v2 04/19] pack-objects: factor out name_hash Jeff King
2013-10-25  6:03   ` [PATCH v2 05/19] revision: allow setting custom limiter function Jeff King
2013-10-25  6:03   ` [PATCH v2 06/19] sha1_file: export `git_open_noatime` Jeff King
2013-10-25  6:03   ` [PATCH v2 07/19] compat: add endianness helpers Jeff King
2013-10-25  6:03   ` [PATCH v2 08/19] ewah: compressed bitmap implementation Jeff King
2013-10-25  6:03   ` [PATCH v2 09/19] documentation: add documentation for the bitmap format Jeff King
2013-10-25  6:03   ` [PATCH v2 10/19] pack-bitmap: add support for bitmap indexes Jeff King
2013-10-25 23:06     ` Junio C Hamano
2013-10-26  0:26       ` Jeff King
2013-10-26  6:25         ` Jeff King
2013-10-28 15:22           ` Junio C Hamano
2013-10-30  7:00             ` Jeff King
2013-10-26 10:14     ` Duy Nguyen
2013-10-30  7:27       ` Jeff King
2013-10-25  6:03   ` [PATCH v2 11/19] pack-objects: use bitmaps when packing objects Jeff King
2013-10-26 10:25     ` Duy Nguyen
2013-10-30  7:36       ` Jeff King
2013-10-30 10:28         ` Duy Nguyen
2013-10-30 20:07           ` Jeff King
2013-10-31 12:03             ` Duy Nguyen
2013-10-25  6:04   ` [PATCH v2 12/19] rev-list: add bitmap mode to speed up object lists Jeff King
2013-10-25  6:04   ` [PATCH v2 13/19] pack-objects: implement bitmap writing Jeff King
2013-10-25  6:04   ` [PATCH v2 14/19] repack: stop using magic number for ARRAY_SIZE(exts) Jeff King
2013-10-25  6:04   ` [PATCH v2 15/19] repack: turn exts array into array-of-struct Jeff King
2013-10-25  6:04   ` [PATCH v2 16/19] repack: handle optional files created by pack-objects Jeff King
2013-10-25  6:04   ` [PATCH v2 17/19] repack: consider bitmaps when performing repacks Jeff King
2013-10-25  6:04   ` [PATCH v2 18/19] t: add basic bitmap functionality tests Jeff King
2013-10-28 22:13     ` SZEDER Gábor
2013-10-30  7:39       ` Jeff King
2013-10-25  6:04   ` [PATCH v2 19/19] pack-bitmap: implement optional name_hash cache Jeff King
2013-10-26 10:19     ` [PATCH 20/19] count-objects: consider .bitmap without .pack/.idx pair garbage Nguyễn Thái Ngọc Duy
2013-10-30  6:59       ` Jeff King
2013-10-30 17:36         ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJo=hJvm_sZqobzO0Er-OVp-Xkf4fT_tySQeNWf0mveHH-G-Xg@mail.gmail.com' \
    --to=spearce@spearce.org \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    --cc=vicent@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).