From: Junio C Hamano <gitster@pobox.com>
To: Jeff King <peff@peff.net>
Cc: SURA via GitGitGadget <gitgitgadget@gmail.com>,
git@vger.kernel.org, SURA <sura907@hotmail.com>
Subject: Re: [PATCH] builtin/fetch.c: clean tmp pack after receive signal
Date: Wed, 17 Mar 2021 11:15:07 -0700
Message-ID: <xmqqsg4twst0.fsf@gitster.g>
In-Reply-To: <YFEpGGLBgLSdR40V@coredump.intra.peff.net> (Jeff King's message of "Tue, 16 Mar 2021 17:54:32 -0400")

Jeff King <peff@peff.net> writes:
> On Tue, Mar 16, 2021 at 02:53:36AM +0000, SURA via GitGitGadget wrote:
>
>> In Gitee.com, I often use scripts to start a time-limited
>
> Not related to your patch, but I think this name falls afoul of Git's
> trademark policy. See:
>
> https://git-scm.com/trademark
>
> There's also some discussion in this thread:
>
> https://lore.kernel.org/git/20170202022655.2jwvudhvo4hmueaw@sigill.intra.peff.net/

Thanks. On a topic somewhat related to this patch, we also ask
contributors to use their real names so that we do not render the
Signed-off-by: procedure meaningless.

> This isn't quite true. "git gc" will clean up the temporary files, but
> only if the mtime is sufficiently old. The purpose here is to give a
> grace period to avoid deleting a file that is actively being written to.
> However, we use the same grace period that we use for deleting
> unreachable objects, which is absurdly long for this purpose: 2 weeks.
> Probably something like an hour would be more appropriate (since the
> mtime is updated on each write, this would imply a process not making
> forward progress).

I agree that for temporaries the two-week default is way too long,
and I am OK if we decide to shorten the expiration for them
separately from the known-to-be-good-but-unreferenced objects.
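
To make the grace-period idea concrete, here is a minimal standalone
sketch of an mtime-based expiry check. It is not git's implementation:
mtime_is_expired(), is_stale_tempfile() and the two constants are
hypothetical names chosen for illustration, and the two-week and
one-hour values are simply the ones mentioned in the discussion above.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <time.h>

/* Illustrative values from the discussion; hypothetical names, not git's. */
#define UNREACHABLE_GRACE_SECONDS (14 * 24 * 60 * 60) /* two weeks */
#define TEMPFILE_GRACE_SECONDS (60 * 60)              /* one hour */

/* True when `mtime` lies more than `grace_seconds` before `now`. */
bool mtime_is_expired(time_t mtime, time_t now, time_t grace_seconds)
{
	return mtime < now - grace_seconds;
}

/*
 * True when the file at `path` has not been written to within the grace
 * period. A file that cannot be stat()ed is reported as not stale, so a
 * caller would leave it alone.
 */
bool is_stale_tempfile(const char *path, time_t now, time_t grace_seconds)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return false;
	return mtime_is_expired(st.st_mtime, now, grace_seconds);
}
```

With these definitions, a temporary file last written two hours ago
would be considered stale under the one-hour period but kept under the
two-week one.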
> Likewise, we have a tempfile cleanup system already.
>
> I think this hunk:
>
>> @@ -336,6 +339,7 @@ static const char *open_pack_file(const char *pack_name)
>>  		output_fd = odb_mkstemp(&tmp_file,
>>  					"pack/tmp_pack_XXXXXX");
>>  		pack_name = strbuf_detach(&tmp_file, NULL);
>> +		tmp_pack_name = pack_name;
>
> ...can just call register_tempfile(). It should also record the result
> so that we don't try to unlink() it after we've already moved it away
> from its temporary name (though it's fairly unlikely for somebody else
> to have used the name in the interim).
>
> I think you'd want to do the same for the tmp_idx_* files, too. Likewise
> for ".rev" files we create starting in v2.31.
>
> I think it would also make sense in create_tmp_packfile(), which is used
> during repacking (a different problem space, but really the same thing:
> if repacking fails for some reason, we probably shouldn't leave a
> useless gigantic half-finished packfile on disk).
>
> We should possibly also do so for tmp_obj_* files. Those can be written
> for a fetch or push via unpack-objects (as well as normal local
> commands). They're not usually as big as a pack, obviously, but I think
> the same principle applies.
>
>> [...]
>
> It would be nice to see some tests covering this functionality, too.
> Reproducing it with signals is likely to be racy and not worth it. But I
> think that right now index-pack reading a bogus pack (say, one that
> fails fsck checks) will leave the tmp_pack_* on disk. And it would not
> if we cleanup tempfiles (again, this would be on any exit, not just
> signal death, but I think that is what we'd want, and also what
> register_tempfile() will do).

Sounds like a good medium-difficulty leftover bit.
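
For anybody picking this up, the following is a rough standalone sketch
of what a tempfile cleanup system does: remember the temporary path,
unlink it on exit or on a fatal signal, and stop tracking it once it
has been renamed into place. It deliberately does not use git's
tempfile API. track_tempfile(), finalize_tempfile(),
remove_all_tempfiles() and the fixed-size table are hypothetical
simplifications made up for this illustration.

```c
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define MAX_TEMPFILES 16
#define MAX_PATH_LEN 256

/* Hypothetical fixed-size registry; a real system would allocate dynamically. */
static char tempfiles[MAX_TEMPFILES][MAX_PATH_LEN];
static volatile sig_atomic_t active[MAX_TEMPFILES];
static int handlers_installed;

/* Unlink every path that is still tracked; uses only async-signal-safe calls. */
void remove_all_tempfiles(void)
{
	for (int i = 0; i < MAX_TEMPFILES; i++) {
		if (active[i]) {
			unlink(tempfiles[i]);
			active[i] = 0;
		}
	}
}

/* Clean up, then re-raise with the default action so the exit status is right. */
static void cleanup_on_signal(int signo)
{
	remove_all_tempfiles();
	signal(signo, SIG_DFL);
	raise(signo);
}

/* Start tracking `path`; returns a slot number, or -1 if it cannot be tracked. */
int track_tempfile(const char *path)
{
	if (strlen(path) >= MAX_PATH_LEN)
		return -1;
	if (!handlers_installed) {
		atexit(remove_all_tempfiles);
		signal(SIGINT, cleanup_on_signal);
		signal(SIGTERM, cleanup_on_signal);
		handlers_installed = 1;
	}
	for (int i = 0; i < MAX_TEMPFILES; i++) {
		if (!active[i]) {
			strcpy(tempfiles[i], path);
			active[i] = 1;
			return i;
		}
	}
	return -1;
}

/*
 * Move the file to its final name and forget the temporary one, so that a
 * later cleanup does not unlink a path we no longer own.
 */
int finalize_tempfile(int slot, const char *final_path)
{
	if (slot < 0 || slot >= MAX_TEMPFILES || !active[slot])
		return -1;
	if (rename(tempfiles[slot], final_path) != 0)
		return -1;
	active[slot] = 0;
	return 0;
}
```

The point of finalize_tempfile() clearing the slot is the one raised
above: once the file has been moved away from its temporary name, the
cleanup path must no longer try to unlink() that name.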
Thanks.