From: Matt McCutchen <matt@mattmccutchen.net>
To: git <git@vger.kernel.org>
Subject: Protecting old temporary objects being reused from concurrent "git gc"?
Date: Tue, 15 Nov 2016 09:13:14 -0500
Message-ID: <1479219194.2406.73.camel@mattmccutchen.net>

The Braid subproject management tool stores the subproject content in
the main tree and is able to switch to a different upstream revision of
a subproject by doing the equivalent of "git read-tree -m" on the
superproject tree and the two upstream trees.  The tricky part is
preparing temporary trees with the upstream content moved to the path
configured for the superproject.  The usual method is "git read-tree
--prefix", but using what index file?  Braid currently uses the user's
actual worktree, which can leave a mess if it gets interrupted:

https://github.com/cristibalan/braid/blob/7d81da6e86e24de62a74f3ab8d880666cb343b04/lib/braid/commands/update.rb#L98

I want to change this to something that won't leave an inconsistent
state if interrupted.  I've written code for this kind of thing before
that sets GIT_INDEX_FILE and uses a temporary index file and "git
write-tree".  But I realized that if "git gc" runs concurrently, the
generated tree could be deleted before it is used and the tool would
fail.  If I had a need to run "git commit-tree", it seems like I might
even end up with a commit object with a broken reference to a tree.
"git gc" normally doesn't delete objects that were created in the last
2 weeks, but if an identical tree was added to the object database more
than 2 weeks ago by another operation and is unreferenced, it could be
reused without updating its mtime and it could still get deleted.
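
For concreteness, here is a minimal sketch of the temporary-index
approach described above. It is only an illustration: the helper names
(write_prefixed_tree, loose_object_mtime) are hypothetical and not
taken from Braid, and the sketch does nothing to protect the resulting
tree from a concurrent "git gc". The second helper only reports the
mtime of a loose object, which is the timestamp the 2-week expiry is
compared against.

```python
import os
import subprocess
import tempfile


def write_prefixed_tree(repo_dir, upstream_tree, prefix):
    """Return the ID of a tree holding upstream_tree's content under prefix/.

    Hypothetical helper: uses a scratch index file so the user's real
    index and worktree are never modified, even if interrupted.
    """
    with tempfile.TemporaryDirectory() as tmp:
        env = dict(os.environ)
        # A not-yet-existing index file is treated by git as an empty index.
        env["GIT_INDEX_FILE"] = os.path.join(tmp, "index")

        def git(*args):
            result = subprocess.run(
                ["git", *args], cwd=repo_dir, env=env,
                check=True, capture_output=True, text=True)
            return result.stdout.strip()

        # Load the upstream tree into the scratch index under prefix/ ...
        git("read-tree", "--prefix=" + prefix.rstrip("/") + "/", upstream_tree)
        # ... and write that index out as a new (unreferenced) tree object.
        return git("write-tree")


def loose_object_mtime(repo_dir, object_id):
    """Return the mtime of a loose object, or None if it is not loose."""
    git_dir = subprocess.run(
        ["git", "rev-parse", "--git-dir"], cwd=repo_dir,
        check=True, capture_output=True, text=True).stdout.strip()
    # --git-dir may print a path relative to repo_dir; join handles both cases.
    path = os.path.join(repo_dir, git_dir, "objects",
                        object_id[:2], object_id[2:])
    return os.path.getmtime(path) if os.path.exists(path) else None
```

The returned tree is not reachable from any ref, so it is exactly the
kind of object whose lifetime depends on its mtime and the gc expiry
window discussed here.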

Is there a recommended way to avoid this kind of problem in add-on
tools?  (I searched the Git documentation and the web for information
about races with "git gc" and didn't find anything useful.)  If not, it
seems to be a significant design flaw in "git gc", even if the problem
is extremely rare in practice.  I wonder if some of the built-in
commands may have the same problem, though I haven't tried to test
them.  If this is confirmed to be a known problem affecting built-in
commands, then at least I won't feel bad about introducing the
same problem into add-on tools. :/

Thanks,
Matt

Thread overview: 13+ messages
2016-11-15 14:13 Matt McCutchen [this message]
2016-11-15 17:06 ` Protecting old temporary objects being reused from concurrent "git gc"? Jeff King
2016-11-15 17:33   ` Matt McCutchen
2016-11-15 17:40     ` Jeff King
2016-11-15 19:08       ` [PATCH] git-gc.txt: expand discussion of races with other processes Matt McCutchen
2016-11-15 19:12       ` Protecting old temporary objects being reused from concurrent "git gc"? Matt McCutchen
2016-11-15 20:01       ` Junio C Hamano
2016-11-16  8:07         ` Jeff King
2016-11-16 18:18           ` Junio C Hamano
2016-11-16 18:58       ` Junio C Hamano
2016-11-17  1:04         ` Jeff King
2016-11-17  1:35           ` Re* " Junio C Hamano
2016-11-17  1:43             ` Jeff King
