git@vger.kernel.org mailing list mirror (one of many)
From: Junio C Hamano <gitster@pobox.com>
To: Jeff King <peff@peff.net>
Cc: SURA via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org, SURA <sura907@hotmail.com>
Subject: Re: [PATCH] builtin/fetch.c: clean tmp pack after receive signal
Date: Wed, 17 Mar 2021 11:15:07 -0700
Message-ID: <xmqqsg4twst0.fsf@gitster.g>
In-Reply-To: <YFEpGGLBgLSdR40V@coredump.intra.peff.net> (Jeff King's message of "Tue, 16 Mar 2021 17:54:32 -0400")

Jeff King <peff@peff.net> writes:

> On Tue, Mar 16, 2021 at 02:53:36AM +0000, SURA via GitGitGadget wrote:
>
>> In Gitee.com, I often use scripts to start a time-limited
>
> Not related to your patch, but I think this name falls afoul of Git's
> trademark policy. See:
>
>   https://git-scm.com/trademark
>
> There's also some discussion in this thread:
>
>   https://lore.kernel.org/git/20170202022655.2jwvudhvo4hmueaw@sigill.intra.peff.net/

Thanks.  On a note somewhat related to this patch, we also ask
contributors to use their real names so that we do not render the
Signed-off-by: procedure meaningless.

> This isn't quite true. "git gc" will clean up the temporary files, but
> only if the mtime is sufficiently old. The purpose here is to give a
> grace period to avoid deleting a file that is actively being written to.
> However, we use the same grace period that we use for deleting
> unreachable objects, which is absurdly long for this purpose: 2 weeks.
> Probably something like an hour would be more appropriate (since the
> mtime is updated on each write, this would imply a process not making
> forward progress).

I agree that for temporaries the two-week default is way too long,
and I am OK if we decide to shorten the expiration for them
separately from the known-to-be-good-but-unreferenced objects.
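
To make the mtime-based grace period concrete, here is a minimal
sketch in C.  It is not Git's actual prune code; the names
TMP_GRACE_SECONDS, is_stale() and tmpfile_is_stale() are made up for
illustration, and the one-hour value is only the figure floated above,
not a setting Git has.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical grace period for temporaries: one hour (an assumption). */
#define TMP_GRACE_SECONDS (60 * 60)

/*
 * Pure helper: true if a file last modified at `mtime` is strictly
 * older than `grace` seconds as of `now`.
 */
static bool is_stale(time_t mtime, time_t now, time_t grace)
{
	return mtime < now - grace;
}

/*
 * True if `path` exists and has not been written to within the grace
 * period.  Each write bumps the mtime, so a stale temporary implies a
 * writer that is no longer making forward progress.
 */
static bool tmpfile_is_stale(const char *path, time_t now)
{
	struct stat st;

	if (stat(path, &st) < 0)
		return false; /* missing or unreadable: nothing to prune */
	return is_stale(st.st_mtime, now, TMP_GRACE_SECONDS);
}
```

A caller walking the pack directory would unlink only the tmp_* entries
for which tmpfile_is_stale(path, time(NULL)) returns true, leaving
anything that is still being written alone.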

> Likewise, we have a tempfile cleanup system already.
>
> I think this hunk:
>
>> @@ -336,6 +339,7 @@ static const char *open_pack_file(const char *pack_name)
>>  			output_fd = odb_mkstemp(&tmp_file,
>>  						"pack/tmp_pack_XXXXXX");
>>  			pack_name = strbuf_detach(&tmp_file, NULL);
>> +			tmp_pack_name = pack_name;
>
> ...can just call register_tempfile(). It should also record the result
> so that we don't try to unlink() it after we've already moved it away
> from its temporary name (though it's fairly unlikely for somebody else
> to have used the name in the interim).
>
> I think you'd want to do the same for the tmp_idx_* files, too. Likewise
> for ".rev" files we create starting in v2.31.
>
> I think it would also make sense in create_tmp_packfile(), which is used
> during repacking (a different problem space, but really the same thing:
> if repacking fails for some reason, we probably shouldn't leave a
> useless gigantic half-finished packfile on disk).
>
> We should possibly also do so for tmp_obj_* files. Those can be written
> for a fetch or push via unpack-objects (as well as normal local
> commands). They're not usually as big as a pack, obviously, but I think
> the same principle applies.
>
>> [...]
>
> It would be nice to see some tests covering this functionality, too.
> Reproducing it with signals is likely to be racy and not worth it. But I
> think that right now index-pack reading a bogus pack (say, one that
> fails fsck checks) will leave the tmp_pack_* on disk. And it would not
> if we clean up tempfiles (again, this would be on any exit, not just
> signal death, but I think that is what we'd want, and also what
> register_tempfile() will do).

Sounds like a good medium difficulty leftover bit.
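
For anybody picking this up, the general shape of "register a
temporary so it is removed on any exit, and forget it once it has been
renamed into place" can be sketched as below.  This is a standalone
illustration, not the code in Git's tempfile.c; sketch_register_tmp(),
sketch_deregister_tmp(), remove_registered() and the fixed-size table
are all hypothetical.  A real patch would presumably use the existing
register_tempfile() machinery instead.

```c
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define MAX_TMP 16

/* Hypothetical registry of temporary paths still owned by this process. */
static char *tmp_paths[MAX_TMP];

/*
 * Unlink every registered path.  Only unlink() is called here because
 * this also runs from a signal handler; the strings are deliberately
 * leaked since free() is not async-signal-safe.
 */
static void remove_registered(void)
{
	int i;

	for (i = 0; i < MAX_TMP; i++) {
		if (tmp_paths[i]) {
			unlink(tmp_paths[i]);
			tmp_paths[i] = NULL;
		}
	}
}

/* Clean up, then re-raise with the default action so we still die by signal. */
static void on_signal(int signo)
{
	remove_registered();
	signal(signo, SIG_DFL);
	raise(signo);
}

/* Remember `path` for cleanup; returns a slot number, or -1 on failure. */
static int sketch_register_tmp(const char *path)
{
	static int installed;
	size_t len = strlen(path) + 1;
	int i;

	if (!installed) {
		atexit(remove_registered);   /* normal exit and exit() calls */
		signal(SIGINT, on_signal);   /* signal death */
		signal(SIGTERM, on_signal);
		installed = 1;
	}
	for (i = 0; i < MAX_TMP; i++) {
		if (!tmp_paths[i]) {
			char *copy = malloc(len);
			if (!copy)
				return -1;
			memcpy(copy, path, len);
			tmp_paths[i] = copy;
			return i;
		}
	}
	return -1; /* table full */
}

/*
 * Forget a slot once the file has been renamed to its final name, so
 * that we never unlink a path somebody else may have reused.
 */
static void sketch_deregister_tmp(int slot)
{
	if (slot >= 0 && slot < MAX_TMP) {
		char *p = tmp_paths[slot];
		tmp_paths[slot] = NULL;
		free(p);
	}
}
```

The point of recording the slot (or, in the real API, the returned
handle) is exactly the one raised above: after a successful rename the
temporary name must be dropped from the cleanup list.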

Thanks.

Thread overview: 3+ messages
2021-03-16  2:53 [PATCH] builtin/fetch.c: clean tmp pack after receive signal SURA via GitGitGadget
2021-03-16 21:54 ` Jeff King
2021-03-17 18:15   ` Junio C Hamano [this message]