git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Stefan Beller <sbeller@google.com>
To: Jeff King <peff@peff.net>
Cc: Shawn Pearce <spearce@spearce.org>, git <git@vger.kernel.org>
Subject: Re: reftable: new ref storage format
Date: Thu, 13 Jul 2017 12:56:54 -0700	[thread overview]
Message-ID: <CAGZ79kbY_t=Xtpb7fy0sZ9TWOy-UOUx8X5+_qLx60Dtg48Ok-g@mail.gmail.com> (raw)
In-Reply-To: <20170713193234.fkxf73t6jevj4svg@sigill.intra.peff.net>

On Thu, Jul 13, 2017 at 12:32 PM, Jeff King <peff@peff.net> wrote:
> On Wed, Jul 12, 2017 at 05:17:58PM -0700, Shawn Pearce wrote:
>
>> We've been having scaling problems with insane number of references
>> (>866k), so I started thinking a lot about improving ref storage.
>>
>> I've written a simple approach, and implemented it in JGit.
>> Performance is promising:
>>
>>   - 62M packed-refs compresses to 27M
>>   - 42.3 usec lookup
>
> Exciting. I'd love for us to have a simple-ish on-disk structure that
> scales well and doesn't involve a dependency on a third-party database
> structure.
>
> Let me see what holes I can poke in your proposal, though. :)
>
>> ### Problem statement
>>
>> Some repositories contain a lot of references (e.g.  android at 866k,
>> rails at 31k).  The existing packed-refs format takes up a lot of
>> space (e.g.  62M), and does not scale with additional references.
>> Lookup of a single reference requires linearly scanning the file.
>
> I think the linear scan is actually an implementation short-coming. Even
> though the records aren't fixed-length, the fact that newlines can only
> appear as end-of-record is sufficient to mmap and binary search a
> packed-refs file (you just have to backtrack a little when you land in
> the middle of a record).

Except that a record is a "delta" to the previous record, so it's not
just finding a record, but reconstructing it. Example for records:

    varint( prefix_length )
    varint( (suffix_length << 2) | type )
    suffix
    value?

First record:

 0,
 16 << 2 | 0x01,
  "refs/heads/maint"
  08f9c32463bf9e578acb7ac5f77afd36e803c6bc

next record (refs/heads/master):

  13
  4 << 2 | 0x01
  "ster",
  80145b1e412719c960036c8c62a9e35dd23a5b2d

Now if you found the second one, you cannot reconstruct its
real name (refs/heads/master) without knowing the name
of the first. The name of the first is easy because the prefix_length
is 0. If it also had a prefix length != 0 you'd have to go back more.


>> - Occupy less disk space for large repositories.
>
> Good goal.  Just to play devil's advocate, the simplest way to do that
> with the current code would be to gzip packed-refs (and/or store sha1s
> as binary). That works against the "mmap and binary search" plan,
> though. :)

Given the compression by delta-ing the name to the previous change and
the fact that Gerrit has

  refs/heads/changes/1
  refs/heads/changes/2
  refs/heads/changes/3
  ...

I think this format would trump a "dumb" zip.
(Github having sequentially numbered pull requests would also
benefit here)

>> ## File format
>
> OK, let me try to summarize to see if I understand.

When Shawn presented the proposal, a couple of colleagues here
were as excited as I was, but the daring question is, why Shawn
did not give the whole thing in BNF format from top down:

  initial-block
  content-blocks*
  (index-block)
  footer

> The reftable file is a sequence of blocks, each of which contains a
> finite set of heavily-compressed refs. You have to read each block
> sequentially,

Each block may have restarting points, that allow for intra-block
binary search.

> but since they're a fixed size, that's still a
> constant-time operation (I'm ignoring the "restarts" thing for now). You
> find the right block by reading the index.

or by reading the footer at the end. If the footer and the index differ
in block size (one bit flipped), we can ask the CRC of the footer
for more guidance.

>  So lookup really is more
> like O(block_size * log(n/block_size)), but block_size being a constant,
> it drops out to O(log n).

There is also an index block such that you can binary search across
blocks, so

O( log(block_count) + log(intra_block_restarting_points) + small linear scan)

There are 2 binary searches, and the block size is an interesting
thing to look at when making up trade offs.

  reply	other threads:[~2017-07-13 19:56 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-13  0:17 reftable: new ref storage format Shawn Pearce
2017-07-13 19:32 ` Jeff King
2017-07-13 19:56   ` Stefan Beller [this message]
2017-07-13 20:35     ` Jeff King
2017-07-13 21:51       ` Eric Wong
2017-07-14  0:27       ` Shawn Pearce
2017-07-14 20:10         ` Jeff King
2017-07-14  0:11   ` Shawn Pearce
2017-07-14 14:27     ` Dave Borowitz
2017-07-14 15:31       ` Shawn Pearce
2017-07-14 20:08     ` Jeff King
2017-07-16  6:01       ` Shawn Pearce
2017-07-16 10:01         ` Jeff King
2017-07-16  8:07       ` Johannes Sixt
2017-07-16 10:03         ` Jeff King
2017-07-16 10:10           ` Johannes Sixt
2017-07-16 17:33 ` Michael Haggerty
2017-07-16 19:43   ` Shawn Pearce
2017-07-16 21:12     ` Shawn Pearce
2017-07-16 21:13     ` Dave Borowitz
2017-07-16 21:31       ` Shawn Pearce
2017-07-18  1:43     ` Michael Haggerty
2017-07-18 18:53       ` Junio C Hamano
2017-07-23 22:56       ` Shawn Pearce
2017-07-23 23:03         ` Shawn Pearce

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAGZ79kbY_t=Xtpb7fy0sZ9TWOy-UOUx8X5+_qLx60Dtg48Ok-g@mail.gmail.com' \
    --to=sbeller@google.com \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    --cc=spearce@spearce.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).