From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: ZheNing Hu <adlternative@gmail.com>
Cc: "Jeff King" <peff@peff.net>,
	"Michal Suchánek" <msuchanek@suse.de>,
	"Git List" <git@vger.kernel.org>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Christian Couder" <christian.couder@gmail.com>,
	johncai86@gmail.com, "Taylor Blau" <me@ttaylorr.com>
Subject: Re: Question: How to execute git-gc correctly on the git server
Date: Fri, 09 Dec 2022 14:48:14 +0100
Message-ID: <221209.86bkoc7kgi.gmgdl@evledraar.gmail.com>
In-Reply-To: <CAOLTT8SR6JWX6mRLbyq4keb4JCfJP6Vq07LzHpb_f+e1jMnsZQ@mail.gmail.com>


On Fri, Dec 09 2022, ZheNing Hu wrote:

> Jeff King <peff@peff.net> wrote on Fri, Dec 9, 2022 at 09:37:
>>
>> On Fri, Dec 09, 2022 at 01:49:18AM +0100, Michal Suchánek wrote:
>>
>> > > In this case it's the mtime on the object file (or the pack containing
>> > > it). But yes, it is far from a complete race-free solution.
>> >
>> > So if you are pushing a branch that happens to reuse commits or other
>> > objects from an earlier branch that might have been collected in the
>> > meantime you are basically doomed.
>>
>> Basically yes. We do "freshen" the mtimes on object files when we omit
>> an object write (e.g., your index state ends up at the same tree as an
>> old one). But for a push, there is no freshening. We check the graph at
>> the time of the push and decide if we have everything we need (either
>> newly pushed, or from what we already had in the repo). And that is
>> what's racy; somebody might be deleting as that check is happening.
>>
>> > People deleting a branch and then pushing another variant in which many
>> > objects are the same is a risk.
>> >
>> > People exporting files from somewhere and adding them to the repo which
>> > are bit-identical when independently exported by multiple people and
>> > sometimes deleting branches is a risk.
>>
>> Yes, both of those are risky (along with many other variants).
>>
>
> I'm wondering whether there's an easy way, even a slow one, to do
> gc safely? For example, by taking a file lock on the repository during
> git push and git gc?

We don't have any "easy" way to do it, but we probably should. The root
cause of the race is tricky to fix, and we don't have any "global ref
lock".

But in a client<->server setup where you want to run gc on the
server, a good-enough and easy solution would be, e.g.:

 1. Have a {pre,post}-receive hook logging attempted/finished pushes
 2. Have the pre-receive hook able to reject (or better yet, hang with
    sleep()) incoming deletions
 3. Do a gc with a small wrapper script, which:
    - Flips the "no deletion ops now" (or "delay deletion ops") switch
    - Polls until it's sure there are no relevant in-progress operations
    - Does a full gc
    - Unlocks
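
Step 2 could look roughly like the following. This is only a sketch in
Python, under assumptions that are mine: the "switch" is a lock file
(the name gc-no-deletions.lock is made up), and the helper names
is_deletion(), wait_for_unlock() and pre_receive() are hypothetical.
What it relies on from git itself is that a pre-receive hook gets one
"<old-oid> <new-oid> <refname>" line per ref on stdin, that a deletion
has an all-zero new oid, and that a non-zero exit rejects the push. The
push logging from step 1 is left out.

```python
import os
import sys
import time


def is_deletion(line):
    """True if a pre-receive input line describes a ref deletion."""
    old, new, ref = line.split(" ", 2)
    # A deletion is reported with an all-zero new object id
    # (40 hex digits for SHA-1, 64 for SHA-256).
    return set(new) == {"0"}


def wait_for_unlock(lock_path, timeout=600.0, poll=1.0, sleep=time.sleep):
    """Hang until the (hypothetical) lock file is gone, or time out."""
    deadline = time.monotonic() + timeout
    while os.path.exists(lock_path):
        if time.monotonic() >= deadline:
            return False
        sleep(poll)
    return True


def pre_receive(lines, lock_path, **wait_options):
    """Return the hook's exit status: 0 accepts the push, 1 rejects it."""
    if not any(is_deletion(line) for line in lines):
        return 0  # creations and updates are not held up
    if wait_for_unlock(lock_path, **wait_options):
        return 0
    print("ref deletions are paused for gc, please retry later",
          file=sys.stderr)
    return 1
```

A hook script would call pre_receive() with the lines read from
sys.stdin (trailing newlines stripped) and a lock path under the
repository directory, then pass the result to sys.exit().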

You'd need to be certain that all relevant repo operations go through
git-receive-pack etc.; a local "git branch -d" or the like won't run
the {pre,post}-receive hooks.
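
The wrapper from step 3 could be sketched like this, again in Python and
again under my own assumptions: the switch is a lock file that the
pre-receive hook checks (the name is made up), and pushes_in_progress is
a hypothetical callable that would consult whatever log the
{pre,post}-receive hooks write. The gc itself is a plain "git gc"; pick
whatever options your setup needs.

```python
import os
import subprocess
import time


def gc_with_deletions_paused(repo, lock_path, pushes_in_progress,
                             poll=1.0, run=subprocess.run, sleep=time.sleep):
    """Pause ref deletions, wait for in-flight pushes, gc, then unlock."""
    # Flip the switch. O_EXCL makes this raise FileExistsError if
    # another wrapper already holds the lock.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    os.close(fd)
    try:
        # Poll until no relevant operation is still in progress.
        while pushes_in_progress():
            sleep(poll)
        run(["git", "-C", repo, "gc"], check=True)
    finally:
        # Unlock even if gc failed, so deletions are not blocked forever.
        os.unlink(lock_path)
```

Since the lock is only advisory, this protects nothing that bypasses
the hook, which is the caveat above.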
