git@vger.kernel.org mailing list mirror (one of many)
From: Derrick Stolee <stolee@gmail.com>
To: Johannes Berg <johannes@sipsolutions.net>, git@vger.kernel.org
Subject: Re: [PATCH] pack-format: correct multi-pack-index description
Date: Mon, 10 Feb 2020 09:46:56 -0500
Message-ID: <28b6fd7f-85ea-9ef1-1977-888cdd737c6d@gmail.com>
In-Reply-To: <526a7a3d8d135c9b97890c1c238ca5baaa138c3c.camel@sipsolutions.net>

On 2/10/2020 9:22 AM, Johannes Berg wrote:
> On Mon, 2020-02-10 at 09:18 -0500, Derrick Stolee wrote:
>>
>> Thank you for finding this doc bug. This is a very subtle point,
>> and you have described it very clearly.
> 
> I was going back and forth on the wording a bit, glad I found something
> that you think is a good description :)
> 
> Are you familiar with the multi-pack-index and how it's used, by any
> chance?

Yes. I wrote the first version, and we use it a lot in VFS for Git.

> I came here from bup (https://github.com/bup/bup/) and needed a way to
> store the offset to find objects in "pure bup", today it only stores
> object *presence* and *pack* in its multi-index, but not the offset.
> 
> However, it seems to do a bit better in terms of not requiring a single
> multi-index, but instead storing it in midx-*.midx files and multiple
> can describe the repository state. Why wasn't something like that done
> for git as well? It's a bit annoying to have to recreate the full midx
> every time a pack file is added, and searching in two or three midx
> files wouldn't really be a big deal?

Part of my initial plan was to have this incremental file format.
The commit-graph uses a very similar mechanism. The difference may
be that you likely allow multiple .midx files found by scanning the
pack directory, whereas I would expect something like the
"commit-graph-chain" file, which provides an ordered list of the
incremental files. That ordering can be important for deciding when to
merge layers or delete old files, and it would be critical to the
possibility of converting reachability bitmaps to rely on a stable
object order stored in the multi-pack-index instead of pack order.
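
As a rough illustration of the chain idea described above, here is a
minimal Python sketch. It is not Git's on-disk format or code: the
in-memory layer model (a dict from object id to pack name and offset),
the chain text with one layer name per line, and the function names
parse_chain, lookup, and merge_top_layers are all assumptions made up
for this example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory model: each layer maps object id -> (pack name, offset).
Layer = Dict[str, Tuple[str, int]]


def parse_chain(chain_text: str) -> List[str]:
    """Return layer names in chain order (oldest first), skipping blank lines."""
    return [line.strip() for line in chain_text.splitlines() if line.strip()]


def lookup(oid: str, chain: List[str],
           layers: Dict[str, Layer]) -> Optional[Tuple[str, int]]:
    """Search layers from newest to oldest and return the first match."""
    for name in reversed(chain):
        entry = layers[name].get(oid)
        if entry is not None:
            return entry
    return None


def merge_top_layers(chain: List[str], layers: Dict[str, Layer], count: int,
                     merged_name: str) -> Tuple[List[str], Dict[str, Layer]]:
    """Collapse the newest `count` layers into one; newer entries win."""
    if count < 1 or count > len(chain):
        raise ValueError("count must be between 1 and len(chain)")
    split = len(chain) - count
    keep, top = chain[:split], chain[split:]
    merged: Layer = {}
    for name in top:  # oldest to newest, so later updates overwrite earlier ones
        merged.update(layers[name])
    new_layers = {name: layers[name] for name in keep}
    new_layers[merged_name] = merged
    return keep + [merged_name], new_layers
```

Because the chain is an explicit ordered list rather than whatever a
directory scan happens to return, both the lookup order and the choice
of which layers to merge are well defined, which is the property the
paragraph above relies on.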

The reason the multi-pack-index has not become incremental is that
VFS for Git no longer needs to write it very often. We write the
entire multi-pack-index during a background job that triggers once
per day. If we needed to write it more frequently, then the incremental
format would be more important to us.

That said: if someone wanted to contribute an incremental format,
then I would be happy to review it!

> Anyway, that's just an aside, but during all this investigation I
> stumbled across this small inconsistency - I'm glad the docs exist at
> all! :-)

I'm glad they helped.

Thanks,
-Stolee

