git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Taylor Blau <me@ttaylorr.com>,
	git@vger.kernel.org, dstolee@microsoft.com, jrnieder@gmail.com,
	peff@peff.net
Subject: Re: [PATCH v2 1/8] packfile: prepare for the existence of '*.rev' files
Date: Thu, 14 Jan 2021 13:13:05 -0500	[thread overview]
Message-ID: <YACJsdDToLQ24p/Y@nand.local> (raw)
In-Reply-To: <xmqqsg74x9ks.fsf@gitster.c.googlers.com>

On Wed, Jan 13, 2021 at 11:26:59PM -0800, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > Specify the format of the on-disk reverse index 'pack-*.rev' file, as
> > well as prepare the code for the existence of such files.
>
> We've changed the pack .idx file format once as the file format is
> versioned.  I wonder if you considered placing the reverse index
> information in the same file, with version bump, in the .idx?

Funny enough, I couldn't remember whether I considered this or not.
Peff reminded me off-list that he and I *had* considered this, but
decided against it.

The main benefit to introducing a new '.rev' format is that we can have
packs that can be upgraded to write a reverse index without having to
rewrite their forward index. It would also allow us to avoid teaching
other implementations about a new version of the index file (since they
can ignore it and continue to build their equivalent of the reverse
index file in memory by reading the forward index).

(Peff reminds me that dumb-http does look at remote .idx files, so this
new format would leak across to clients, whether or not that's something
to be concerned about...).

Of course, having the contents of the .rev file be included in the .idx
file nets us one fewer file to manage, but I'm not sure that's a reason
to do things one way or another.

Your response did pique my interest, since I was wondering if we could
improve the cold cache performance if the .rev file's contents were
included in the .idx, but after giving it some thought I don't think we
can. Reasons are:

  - If the reverse index's contents appears at the end of the .idx file,
    then in any .idx file large enough to matter, we'll almost certainly
    still be evicting cache lines back and forth when swapping between
    reading the forward- and reverse-indexes. So, no gains to be had
    there.

  - If, on the other hand, we included the reverse index's contents by
    interleaving it with the forward index's offsets, then we'd be
    worsening the cache performance of the forward index.

So, I'm more in favor of a new .rev file rather than a v3 .idx version.
Apologies for not including more of a rationale "why" in the cover
letter (had I not forgotten that I'd even considered it, I would have).

Thanks,
Taylor

  reply	other threads:[~2021-01-14 18:17 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-08 18:19 [PATCH 0/8] pack-revindex: introduce on-disk '.rev' format Taylor Blau
2021-01-08 18:19 ` [PATCH 1/8] packfile: prepare for the existence of '*.rev' files Taylor Blau
2021-01-08 18:20 ` [PATCH 2/8] pack-write.c: prepare to write 'pack-*.rev' files Taylor Blau
2021-01-08 18:20 ` [PATCH 3/8] builtin/index-pack.c: write reverse indexes Taylor Blau
2021-01-08 18:20 ` [PATCH 4/8] builtin/pack-objects.c: respect 'pack.writeReverseIndex' Taylor Blau
2021-01-08 18:20 ` [PATCH 5/8] Documentation/config/pack.txt: advertise 'pack.writeReverseIndex' Taylor Blau
2021-01-08 18:20 ` [PATCH 6/8] t: prepare for GIT_TEST_WRITE_REV_INDEX Taylor Blau
2021-01-12 17:11   ` Ævar Arnfjörð Bjarmason
2021-01-12 18:40     ` Taylor Blau
2021-01-08 18:20 ` [PATCH 7/8] t: support GIT_TEST_WRITE_REV_INDEX Taylor Blau
2021-01-12 16:49   ` Derrick Stolee
2021-01-12 17:34     ` Taylor Blau
2021-01-12 17:18   ` Ævar Arnfjörð Bjarmason
2021-01-12 17:39     ` Derrick Stolee
2021-01-12 18:17       ` Taylor Blau
2021-01-08 18:20 ` [PATCH 8/8] pack-revindex: ensure that on-disk reverse indexes are given precedence Taylor Blau
2021-01-13 22:28 ` [PATCH v2 0/8] pack-revindex: introduce on-disk '.rev' format Taylor Blau
2021-01-13 22:28   ` [PATCH v2 1/8] packfile: prepare for the existence of '*.rev' files Taylor Blau
2021-01-14  7:22     ` Junio C Hamano
2021-01-14 12:07       ` Derrick Stolee
2021-01-14 19:57         ` Jeff King
2021-01-14 18:28       ` Taylor Blau
2021-01-14  7:26     ` Junio C Hamano
2021-01-14 18:13       ` Taylor Blau [this message]
2021-01-14 20:57         ` Junio C Hamano
2021-01-22 22:54     ` Jeff King
2021-01-25 17:44       ` Taylor Blau
2021-01-25 18:27         ` Jeff King
2021-01-25 19:04         ` Junio C Hamano
2021-01-25 19:23           ` Taylor Blau
2021-01-13 22:28   ` [PATCH v2 2/8] pack-write.c: prepare to write 'pack-*.rev' files Taylor Blau
2021-01-22 23:24     ` Jeff King
2021-01-25 19:15       ` Taylor Blau
2021-01-26 21:43         ` Jeff King
2021-01-13 22:28   ` [PATCH v2 3/8] builtin/index-pack.c: write reverse indexes Taylor Blau
2021-01-22 23:53     ` Jeff King
2021-01-25 20:03       ` Taylor Blau
2021-01-13 22:28   ` [PATCH v2 4/8] builtin/pack-objects.c: respect 'pack.writeReverseIndex' Taylor Blau
2021-01-22 23:57     ` Jeff King
2021-01-23  0:08       ` Jeff King
2021-01-25 20:21         ` Taylor Blau
2021-01-25 20:50           ` Jeff King
2021-01-13 22:28   ` [PATCH v2 5/8] Documentation/config/pack.txt: advertise 'pack.writeReverseIndex' Taylor Blau
2021-01-13 22:28   ` [PATCH v2 6/8] t: prepare for GIT_TEST_WRITE_REV_INDEX Taylor Blau
2021-01-13 22:28   ` [PATCH v2 7/8] t: support GIT_TEST_WRITE_REV_INDEX Taylor Blau
2021-01-13 22:28   ` [PATCH v2 8/8] pack-revindex: ensure that on-disk reverse indexes are given precedence Taylor Blau
2021-01-25 23:37 ` [PATCH v3 00/10] pack-revindex: introduce on-disk '.rev' format Taylor Blau
2021-01-25 23:37   ` [PATCH v3 01/10] packfile: prepare for the existence of '*.rev' files Taylor Blau
2021-01-29  0:27     ` Jeff King
2021-01-29  1:14       ` Taylor Blau
2021-01-30  8:39         ` Jeff King
2021-01-25 23:37   ` [PATCH v3 02/10] pack-write.c: prepare to write 'pack-*.rev' files Taylor Blau
2021-01-25 23:37   ` [PATCH v3 03/10] builtin/index-pack.c: allow stripping arbitrary extensions Taylor Blau
2021-01-29  0:28     ` Jeff King
2021-01-29  1:15       ` Taylor Blau
2021-01-25 23:37   ` [PATCH v3 04/10] builtin/index-pack.c: write reverse indexes Taylor Blau
2021-01-25 23:37   ` [PATCH v3 05/10] builtin/pack-objects.c: respect 'pack.writeReverseIndex' Taylor Blau
2021-01-25 23:37   ` [PATCH v3 06/10] Documentation/config/pack.txt: advertise 'pack.writeReverseIndex' Taylor Blau
2021-01-29  0:30     ` Jeff King
2021-01-29  1:17       ` Taylor Blau
2021-01-30  8:41         ` Jeff King
2021-01-25 23:37   ` [PATCH v3 07/10] t: prepare for GIT_TEST_WRITE_REV_INDEX Taylor Blau
2021-01-29  0:45     ` Jeff King
2021-01-29  1:09       ` Eric Sunshine
2021-01-29  1:21       ` Taylor Blau
2021-01-30  8:43         ` Jeff King
2021-01-29  2:42       ` Junio C Hamano
2021-01-25 23:37   ` [PATCH v3 08/10] t: support GIT_TEST_WRITE_REV_INDEX Taylor Blau
2021-01-29  0:47     ` Jeff King
2021-01-25 23:37   ` [PATCH v3 09/10] pack-revindex: ensure that on-disk reverse indexes are given precedence Taylor Blau
2021-01-29  0:53     ` Jeff King
2021-01-29  1:25       ` Taylor Blau
2021-01-30  8:46         ` Jeff King
2021-01-25 23:37   ` [PATCH v3 10/10] t5325: check both on-disk and in-memory reverse index Taylor Blau
2021-01-29  1:04     ` Jeff King
2021-01-29  1:05     ` Jeff King
2021-01-29  1:32       ` Taylor Blau
2021-01-30  8:47         ` Jeff King
2021-01-26  2:36   ` [PATCH v3 00/10] pack-revindex: introduce on-disk '.rev' format Junio C Hamano
2021-01-26  2:49     ` Taylor Blau
2021-01-29  1:06   ` Jeff King
2021-01-29  1:34     ` Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YACJsdDToLQ24p/Y@nand.local \
    --to=me@ttaylorr.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jrnieder@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).