From: Jeff King <peff@peff.net>
To: Stefan Beller <sbeller@google.com>
Cc: Shawn Pearce <spearce@spearce.org>, git <git@vger.kernel.org>
Subject: Re: reftable: new ref storage format
Date: Thu, 13 Jul 2017 16:35:33 -0400 [thread overview]
Message-ID: <20170713203533.vcfyf5iei46g4tcf@sigill.intra.peff.net> (raw)
In-Reply-To: <CAGZ79kbY_t=Xtpb7fy0sZ9TWOy-UOUx8X5+_qLx60Dtg48Ok-g@mail.gmail.com>
On Thu, Jul 13, 2017 at 12:56:54PM -0700, Stefan Beller wrote:
> >> ### Problem statement
> >>
> >> Some repositories contain a lot of references (e.g. android at 866k,
> >> rails at 31k). The existing packed-refs format takes up a lot of
> >> space (e.g. 62M), and does not scale with additional references.
> >> Lookup of a single reference requires linearly scanning the file.
> >
> > I think the linear scan is actually an implementation short-coming. Even
> > though the records aren't fixed-length, the fact that newlines can only
> > appear as end-of-record is sufficient to mmap and binary search a
> > packed-refs file (you just have to backtrack a little when you land in
> > the middle of a record).
>
> Except that a record is a "delta" to the previous record, so it's not
> just finding a record, but reconstructing it. Example for records:
I was still talking about the existing packed-refs implementation here.
I agree that a full binary search of a reftable is harder because of the
prefix compression (it may still be possible by scanning backwards, but
I think there are ambiguities when you land in the middle of a record,
since there's no unambiguous end-of-record character). But I don't think
it matters. If you binary-search to a constant-sized block, then a
linear scan of the block is acceptable.
> >> - Occupy less disk space for large repositories.
> >
> > Good goal. Just to play devil's advocate, the simplest way to do that
> > with the current code would be to gzip packed-refs (and/or store sha1s
> > as binary). That works against the "mmap and binary search" plan,
> > though. :)
>
> Given the compression by delta-ing the name to the previous change and
> the fact that Gerrit has
>
> refs/heads/changes/1
> refs/heads/changes/2
> refs/heads/changes/3
> ...
>
> I think this format would trump a "dumb" zip.
> (Github having sequentially numbered pull requests would also
> benefit here)
You may be surprised. Let's imagine that you have a set of 4096 refs in
refs/changes/1, refs/changes/2, etc:
for i in $(seq 1 4096)
do
echo refs/changes/$i
done >input
Now let's do a prefix compression, with a single byte for "how many
characters to reuse from the last entry":
perl -lne '
my $common;
if (defined $last) {
chop $last while !/\Q$last\E/;
$common = length($last);
} else {
$common = 0;
}
print chr($common), substr($_, $common);
$last = $_;
' <input >prefix
And a gzip:
gzip -c -9 <input >zip
And the results:
$ wc -c prefix; wc -c zip
12754 prefix
10116 zip
The secret sauce is most likely that gzip is bit-packing, using only a
few bits per character and not aligning with byte boundaries.
Not that I'm recommending just gzipping the whole packed-refs file. It
ruins the fast-lookup. We _could_ consider gzipping individual blocks of
a reftable (or any structure that allows you to search to a
constant-sized block and do a linear search from there). But given that
they're in the same ballpark, I'm happy with whatever ends up the
simplest to code and debug. ;)
Just for fun, here's the decoding script for the prefix-compression:
perl -e '
while (read(STDIN, $common, 1)) {
$common = ord($common);
$rest = <STDIN>;
if ($common > 0) {
$rest = substr($last, 0, $common) . $rest
}
print $rest;
$last = $rest}' <prefix
'
> > OK, let me try to summarize to see if I understand.
>
> When Shawn presented the proposal, a couple of colleagues here
> were as excited as I was, but the daring question is, why Shawn
> did not give the whole thing in BNF format from top down:
>
> initial-block
> content-blocks*
> (index-block)
> footer
Yeah, I agree it took me a bit to figure out what was going on. A
high-level overview of the format would have been nice.
> > So lookup really is more
> > like O(block_size * log(n/block_size)), but block_size being a constant,
> > it drops out to O(log n).
>
> There is also an index block such that you can binary search across
> blocks, so
>
> O( log(block_count) + log(intra_block_restarting_points) + small linear scan)
>
> There are 2 binary searches, and the block size is an interesting
> thing to look at when making up trade offs.
Right, the cross-block index was what I was trying to account for.
Either way, from a big-O perspective the block size and the number of
restarts are constants with respect to the total number of entries. I'm
happy with log(n), though. It's hard to do better.
-Peff
next prev parent reply other threads:[~2017-07-13 20:35 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-07-13 0:17 reftable: new ref storage format Shawn Pearce
2017-07-13 19:32 ` Jeff King
2017-07-13 19:56 ` Stefan Beller
2017-07-13 20:35 ` Jeff King [this message]
2017-07-13 21:51 ` Eric Wong
2017-07-14 0:27 ` Shawn Pearce
2017-07-14 20:10 ` Jeff King
2017-07-14 0:11 ` Shawn Pearce
2017-07-14 14:27 ` Dave Borowitz
2017-07-14 15:31 ` Shawn Pearce
2017-07-14 20:08 ` Jeff King
2017-07-16 6:01 ` Shawn Pearce
2017-07-16 10:01 ` Jeff King
2017-07-16 8:07 ` Johannes Sixt
2017-07-16 10:03 ` Jeff King
2017-07-16 10:10 ` Johannes Sixt
2017-07-16 17:33 ` Michael Haggerty
2017-07-16 19:43 ` Shawn Pearce
2017-07-16 21:12 ` Shawn Pearce
2017-07-16 21:13 ` Dave Borowitz
2017-07-16 21:31 ` Shawn Pearce
2017-07-18 1:43 ` Michael Haggerty
2017-07-18 18:53 ` Junio C Hamano
2017-07-23 22:56 ` Shawn Pearce
2017-07-23 23:03 ` Shawn Pearce
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170713203533.vcfyf5iei46g4tcf@sigill.intra.peff.net \
--to=peff@peff.net \
--cc=git@vger.kernel.org \
--cc=sbeller@google.com \
--cc=spearce@spearce.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).