git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Taylor Blau <me@ttaylorr.com>
Cc: Son Luong Ngoc <sluongng@gmail.com>, git <git@vger.kernel.org>,
	dstolee@microsoft.com
Subject: Re: [PATCH] builtin/repack.c: invalidate MIDX only when necessary
Date: Tue, 25 Aug 2020 12:47:21 -0400	[thread overview]
Message-ID: <20200825164721.GA1414394@coredump.intra.peff.net> (raw)
In-Reply-To: <20200825144515.GB7671@syl.lan>

On Tue, Aug 25, 2020 at 10:45:15AM -0400, Taylor Blau wrote:

> > I wonder if its worth to trigger write_midx_file() to update the midx instead of
> > just removing MIDX?
> 
> There's no reason that we couldn't do this, but I don't think that it's
> a very good idea, especially if the new 'git maintenance' command will
> be able to do something like this by itself.
> 
> I'm hesitant to add yet another option to 'git repack', which I have
> always thought as a plumbing tool. That's important because callers
> (like 'git maintenance' or user scripts) can 'git multi-pack-index write
> ...' after their 'git repack' to generate a new MIDX if they want one.

It may be worth thinking a bit about atomicity here, though. Rather than
separate delete and write steps, would somebody want a sequence like:

  - we have a midx with packs A, B, C

  - we find out that pack C is redundant and want to drop it

  - we create a new midx with A and B in a tempfile

  - we atomically rename the new midx over the old

  - we delete pack C

A simultaneous reader always sees a consistent midx. Whereas in a
delete-then-rewrite scenario there is a moment where there is no midx,
and they'd fall back to reading the individual idx files.

It may not matter all that much, though, for two reasons:

  - reading individual idx files should still give a correct answer.
    It just may be a bit slower.

  - even with an atomic replacement, I think caching on the reader side
    may cause interesting effects. For instance, if a reader process
    opens the old midx, it will generally cache that knowledge in memory
    (including mmap the content). So even after the replacement and
    deletion of C, it may still try to use the old midx that references
    C.

If we do care, though, that implies some cooperation between the
deletion process and the midx code. Perhaps it even argues that repack
should refuse to delete such a single pack at all, since it _isn't_
redundant. It's part of a midx, and the caller should rewrite the midx
first itself, and _then_ look for redundant packs.

-Peff

  parent reply	other threads:[~2020-08-25 16:47 UTC|newest]

Thread overview: 78+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-25  2:01 [PATCH] builtin/repack.c: invalidate MIDX only when necessary Taylor Blau
2020-08-25  2:26 ` Jeff King
2020-08-25  2:37   ` Taylor Blau
2020-08-25 13:14     ` Derrick Stolee
2020-08-25 14:41       ` Taylor Blau
2020-08-25 15:14         ` Derrick Stolee
2020-08-25 15:42           ` Taylor Blau
2020-08-25 16:56             ` Jeff King
2020-08-25 15:58   ` Junio C Hamano
2020-08-25 16:08     ` Taylor Blau
2020-08-25 16:18     ` Derrick Stolee
2020-08-25 17:34       ` Jeff King
2020-08-25 17:22     ` Jeff King
2020-08-25 18:05       ` Junio C Hamano
2020-08-25 18:27         ` Jeff King
2020-08-25 22:45           ` [PATCH] pack-redundant: gauge the usage before proposing its removal Junio C Hamano
2020-08-25 23:09             ` Taylor Blau
2020-08-25 23:22               ` Junio C Hamano
2020-08-26  1:17             ` [PATCH v1 0/3] War on dashed-git Junio C Hamano
2020-08-26  1:17               ` [PATCH v1 1/3] transport-helper: do not run git-remote-ext etc. in dashed form Junio C Hamano
2020-08-26  1:24                 ` Eric Sunshine
2020-08-26  7:55                   ` Johannes Schindelin
2020-08-26 16:27                   ` Junio C Hamano
2020-08-26  1:17               ` [PATCH v1 2/3] cvsexportcommit: do not run git programs " Junio C Hamano
2020-08-26  1:28                 ` Eric Sunshine
2020-08-26  1:42                   ` Junio C Hamano
2020-08-26 16:08                   ` Junio C Hamano
2020-08-26 16:28                     ` Junio C Hamano
2020-08-26  8:02                 ` Johannes Schindelin
2020-08-26  1:17               ` [PATCH v1 3/3] git: catch an attempt to run "git-foo" Junio C Hamano
2020-08-26  1:19                 ` Junio C Hamano
2020-08-26  8:06                 ` Johannes Schindelin
2020-08-26 16:30                   ` Junio C Hamano
2020-08-28  2:13                 ` Johannes Schindelin
2020-08-28 22:03                   ` Junio C Hamano
2020-08-31  9:59                     ` Johannes Schindelin
2020-08-31 17:45                       ` Junio C Hamano
2020-12-20 15:25                   ` Johannes Schindelin
2020-12-21 22:24                     ` Junio C Hamano
2020-12-30  5:30                       ` Johannes Schindelin
2020-08-26  8:09               ` [PATCH v1 0/3] War on dashed-git Johannes Schindelin
2020-08-26 16:45                 ` Junio C Hamano
2020-08-26 19:46                   ` [PATCH v2 0/2] avoid running "git-subcmd" in the dashed form Junio C Hamano
2020-08-26 19:46                     ` [PATCH v2 1/2] transport-helper: do not run git-remote-ext etc. in " Junio C Hamano
2020-08-26 19:46                     ` [PATCH v2 2/2] cvsexportcommit: do not run git programs " Junio C Hamano
2020-08-26 21:37                       ` [PATCH v2 3/2] credential-cache: use child_process.args Junio C Hamano
2020-08-26 22:25                         ` [PATCH] run_command: teach API users to use embedded 'args' more Junio C Hamano
2020-08-27  4:21                           ` Jeff King
2020-08-27  4:30                             ` Junio C Hamano
2020-08-27  4:31                             ` Eric Sunshine
2020-08-27  4:44                               ` Jeff King
2020-08-27  5:03                                 ` Eric Sunshine
2020-08-27  5:25                                   ` [PATCH] worktree: fix leak in check_clean_worktree() Jeff King
2020-08-27  5:56                                     ` Eric Sunshine
2020-08-27 15:31                                       ` Junio C Hamano
2020-08-27  4:13                         ` [PATCH v2 3/2] credential-cache: use child_process.args Jeff King
2020-08-27  4:22                           ` Jeff King
2020-08-27  4:31                           ` Junio C Hamano
2020-08-27  4:14                         ` Jeff King
2020-08-27 15:34                           ` Junio C Hamano
2020-08-31 22:56                         ` Junio C Hamano
2020-09-01  4:49                           ` Jeff King
2020-09-01 16:11                             ` Junio C Hamano
2020-08-27  0:57                     ` [PATCH v2 0/2] avoid running "git-subcmd" in the dashed form Derrick Stolee
2020-08-27  1:22                       ` Junio C Hamano
2020-08-28  9:14             ` [PATCH] pack-redundant: gauge the usage before proposing its removal Jeff King
2020-08-28 22:45               ` Junio C Hamano
2020-08-25  7:55 ` [PATCH] builtin/repack.c: invalidate MIDX only when necessary Son Luong Ngoc
2020-08-25 12:45   ` Derrick Stolee
2020-08-25 14:45   ` Taylor Blau
2020-08-25 16:04     ` [PATCH v2] " Taylor Blau
2020-08-26 20:51       ` Derrick Stolee
2020-08-26 20:54         ` Junio C Hamano
2020-08-25 16:47     ` Jeff King [this message]
2020-08-25 17:10       ` [PATCH] " Derrick Stolee
2020-08-25 17:29         ` Jeff King
2020-08-25 17:34           ` Taylor Blau
2020-08-25 17:42             ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200825164721.GA1414394@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=me@ttaylorr.com \
    --cc=sluongng@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).