git@vger.kernel.org mailing list mirror (one of many)
* Protecting old temporary objects being reused from concurrent "git gc"?
From: Matt McCutchen @ 2016-11-15 14:13 UTC
  To: git

The Braid subproject management tool stores the subproject content in
the main tree and is able to switch to a different upstream revision of
a subproject by doing the equivalent of "git read-tree -m" on the
superproject tree and the two upstream trees.  The tricky part is
preparing temporary trees with the upstream content moved to the path
configured for the superproject.  The usual method is "git read-tree
--prefix", but using what index file?  Braid currently uses the user's
actual worktree, which can leave a mess if it gets interrupted:

https://github.com/cristibalan/braid/blob/7d81da6e86e24de62a74f3ab8d880666cb343b04/lib/braid/commands/update.rb#L98

I want to change this to something that won't leave an inconsistent
state if interrupted.  I've written code for this kind of thing before
that sets GIT_INDEX_FILE and uses a temporary index file and "git
write-tree".  But I realized that if "git gc" runs concurrently, the
generated tree could be deleted before it is used and the tool would
fail.  If I had a need to run "git commit-tree", it seems like I might
even end up with a commit object with a broken reference to a tree.
"git gc" normally doesn't delete objects that were created in the last
2 weeks, but if an identical tree was added to the object database more
than 2 weeks ago by another operation and is unreferenced, it could be
reused without updating its mtime and it could still get deleted.
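
A minimal sketch of the temporary-index approach described above,
written in Python purely for illustration (Braid itself is Ruby).  The
helper names git, tree_under_prefix and loose_object_is_stale are
hypothetical, as is the choice of a throwaway directory for the index.
The last helper only looks at loose objects and assumes the 2-week
grace period mentioned above; it says nothing about packed objects.

```python
import os
import subprocess
import tempfile
import time


def git(repo, *args, env=None):
    """Run a git command inside `repo` and return its stripped stdout."""
    result = subprocess.run(
        ["git", "-C", repo, *args],
        check=True, capture_output=True, text=True, env=env,
    )
    return result.stdout.strip()


def tree_under_prefix(repo, treeish, prefix):
    """Return the ID of a new tree holding `treeish` moved under `prefix`/.

    Uses a throwaway index via GIT_INDEX_FILE, so the user's real index
    and worktree are never touched, even if we are interrupted.
    """
    with tempfile.TemporaryDirectory() as tmp:
        # The index file does not exist yet; git treats that as empty.
        env = dict(os.environ, GIT_INDEX_FILE=os.path.join(tmp, "index"))
        git(repo, "read-tree", "--prefix=" + prefix.rstrip("/") + "/",
            treeish, env=env)
        return git(repo, "write-tree", env=env)


def loose_object_is_stale(git_dir, oid, grace_days=14):
    """Report whether a loose object's mtime is older than the grace period.

    Returns None when there is no loose file for `oid` (the object may
    be packed or absent), since this sketch cannot judge that case.
    """
    path = os.path.join(git_dir, "objects", oid[:2], oid[2:])
    try:
        mtime = os.stat(path).st_mtime
    except FileNotFoundError:
        return None
    return time.time() - mtime > grace_days * 86400
```

The tree returned by tree_under_prefix is not referenced by anything,
so it is exactly the kind of object the "git gc" concern applies to.
If loose_object_is_stale reports True for it right after creation, the
tree was an older identical object being reused rather than a new one.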

Is there a recommended way to avoid this kind of problem in add-on
tools?  (I searched the Git documentation and the web for information
about races with "git gc" and didn't find anything useful.)  If not, it
seems to be a significant design flaw in "git gc", even if the problem
is extremely rare in practice.  I wonder if some of the built-in
commands may have the same problem, though I haven't tried to test
them.  If this is confirmed to be a known problem affecting built-in
commands, then at least I won't feel bad about introducing the
same problem into add-on tools. :/

Thanks,
Matt
