git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/2] Demonstrate a critical worktree/gc bug
@ 2017-02-08 12:17 Johannes Schindelin
  2017-02-08 12:17 ` [PATCH 1/2] worktree: demonstrate data lost to auto-gc Johannes Schindelin
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Johannes Schindelin @ 2017-02-08 12:17 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

... and a half-working workaround for the auto-gc case.

This patch series really is just another attempt to prod Duy into fixing
this instead of dabbling with shiny new toys ;-)

Changes since "v0":

- split out the test case, both to make it easier for Duy to integrate
  it into the patch series that fixes the bug, as well as to provide a
  little more, very gentle pressure to demonstrate the severity of this
  problem.


Johannes Schindelin (2):
  worktree: demonstrate data lost to auto-gc
  worktree: demonstrate a half-working workaround for auto-gc

 t/t6500-gc.sh | 63 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)


base-commit: 6e3a7b3398559305c7a239a42e447c21a8f39ff8
Published-As: https://github.com/dscho/git/releases/tag/gc-worktree-v1
Fetch-It-Via: git fetch https://github.com/dscho/git gc-worktree-v1

-- 
2.11.1.windows.1


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 1/2] worktree: demonstrate data lost to auto-gc
  2017-02-08 12:17 [PATCH 0/2] Demonstrate a critical worktree/gc bug Johannes Schindelin
@ 2017-02-08 12:17 ` Johannes Schindelin
  2017-02-08 12:18 ` [PATCH 2/2] worktree: demonstrate a half-working workaround for auto-gc Johannes Schindelin
  2017-02-08 12:28 ` [PATCH 0/2] Demonstrate a critical worktree/gc bug Duy Nguyen
  2 siblings, 0 replies; 4+ messages in thread
From: Johannes Schindelin @ 2017-02-08 12:17 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

When gc --auto is called in the presence of worktrees, it fails to take
*all* reflogs into account when trying to retain reachable objects, and as
a consequence data integrity goes pretty much to hell.

This patch provides a test case in the hopes that this bug gets fixed,
after an initial attempt has stalled for eight months already.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t6500-gc.sh | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index 1762dfa6a3..e24a4fb611 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -67,5 +67,44 @@ test_expect_success 'auto gc with too many loose objects does not attempt to cre
 	test_line_count = 2 new # There is one new pack and its .idx
 '
 
+trigger_auto_gc () {
+	# This is unfortunately very, very ugly
+	gdir="$(git rev-parse --git-common-dir)" &&
+	mkdir -p "$gdir"/objects/17 &&
+	touch "$gdir"/objects/17/17171717171717171717171717171717171717 &&
+	touch "$gdir"/objects/17/17171717171717171717171717171717171718 &&
+	git -c gc.auto=1 -c gc.pruneexpire=now -c gc.autodetach=0 gc --auto
+}
+
+test_expect_failure 'gc respects refs/reflogs in all worktrees' '
+	test_commit something &&
+	git worktree add worktree &&
+	(
+		cd worktree &&
+		git checkout --detach &&
+		# avoid implicit tagging of test_commit
+		echo 1 >something.t &&
+		test_tick &&
+		git commit -m worktree-reflog something.t &&
+		git rev-parse --verify HEAD >../commit-reflog &&
+		echo 2 >something.t &&
+		test_tick &&
+		git commit -m worktree-ref something.t &&
+		git rev-parse --verify HEAD >../commit-ref
+	) &&
+	trigger_auto_gc &&
+	git rev-parse --verify $(cat commit-ref)^{commit} &&
+	git rev-parse --verify $(cat commit-reflog)^{commit} &&
+
+	# Now, add a reflog in the top-level dir and verify that `git gc` in
+	# the worktree does not purge that
+	git checkout --detach &&
+	echo 3 >something.t &&
+	test_tick &&
+	git commit -m commondir-reflog something.t &&
+	git rev-parse --verify HEAD >commondir-reflog &&
+	(cd worktree && trigger_auto_gc) &&
+	git rev-parse --verify $(cat commondir-reflog)^{commit}
+'
 
 test_done
-- 
2.11.1.windows.1



^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 2/2] worktree: demonstrate a half-working workaround for auto-gc
  2017-02-08 12:17 [PATCH 0/2] Demonstrate a critical worktree/gc bug Johannes Schindelin
  2017-02-08 12:17 ` [PATCH 1/2] worktree: demonstrate data lost to auto-gc Johannes Schindelin
@ 2017-02-08 12:18 ` Johannes Schindelin
  2017-02-08 12:28 ` [PATCH 0/2] Demonstrate a critical worktree/gc bug Duy Nguyen
  2 siblings, 0 replies; 4+ messages in thread
From: Johannes Schindelin @ 2017-02-08 12:18 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

While worktrees are marked "experimental", there is really no alternative
to using them when trying to work on multiple patch series in parallel.

Sadly, there is still a bug that causes irretrievable data loss caused by
worktree's incomplete handling of reflogs (and of the multiple indices of
the various worktrees), as this developer can testify a dozen times over.

In the --auto case, we can install a hook that runs before-hand,
accumulates all the worktree-specific refs and reflogs and installs them
into a very special reflog in the common refs namespace. The only
purpose of this stunt is to let gc pick up on those refs and reflogs, of
course, and *not* ignore them.

Unfortunately, this still does not address the "git gc" case, but
hopefully a real fix will be here some time soon.

Also, if there are simply too many loose objects for Git's liking, it will
kick off auto-gc (including the hook, which takes a couple of seconds to
run) that subsequently only says that it won't do anything, so the next
Git operation will kick off another auto-gc (including the multi-second
hook run).

Needless to say: this patch is not actually intended for inclusion, but
instead tries to demonstrate the pain and the dear need for a real bug fix
(and not another 8 months of buggy Git).

So here is hoping to a timely fix. Cheers.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t6500-gc.sh | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index e24a4fb611..2f1e52825d 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -67,6 +67,30 @@ test_expect_success 'auto gc with too many loose objects does not attempt to cre
 	test_line_count = 2 new # There is one new pack and its .idx
 '
 
+test_expect_success 'install pre-auto-gc hook for worktrees' '
+	mkdir -p .git/hooks &&
+	write_script .git/hooks/pre-auto-gc <<-\EOF
+	echo "Preserving refs/reflogs of worktrees" >&2 &&
+	dir="$(git rev-parse --git-common-dir)" &&
+	refsdir="$dir/logs/refs" &&
+	rm -f "$refsdir"/preserve &&
+	ident="$(GIT_COMMITTER_DATE= git var GIT_COMMITTER_IDENT)" &&
+	(
+		find "$dir"/logs "$dir"/worktrees/*/logs \
+			-type f -exec cat {} \; |
+		cut -d" " -f1
+		find "$dir"/HEAD "$dir"/worktrees/*/HEAD "$dir"/refs \
+			"$dir"/worktrees/*/refs -type f -exec cat {} \; |
+		grep -v "^ref:"
+	) 2>/dev/null |
+	sort |
+	uniq |
+	sed "s/.*/& & $ident	dummy/" >"$dir"/preserve &&
+	mkdir -p "$refsdir" &&
+	mv "$dir"/preserve "$refsdir"/
+	EOF
+'
+
 trigger_auto_gc () {
 	# This is unfortunately very, very ugly
 	gdir="$(git rev-parse --git-common-dir)" &&
@@ -76,7 +100,7 @@ trigger_auto_gc () {
 	git -c gc.auto=1 -c gc.pruneexpire=now -c gc.autodetach=0 gc --auto
 }
 
-test_expect_failure 'gc respects refs/reflogs in all worktrees' '
+test_expect_success 'gc respects refs/reflogs in all worktrees' '
 	test_commit something &&
 	git worktree add worktree &&
 	(
-- 
2.11.1.windows.1

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH 0/2] Demonstrate a critical worktree/gc bug
  2017-02-08 12:17 [PATCH 0/2] Demonstrate a critical worktree/gc bug Johannes Schindelin
  2017-02-08 12:17 ` [PATCH 1/2] worktree: demonstrate data lost to auto-gc Johannes Schindelin
  2017-02-08 12:18 ` [PATCH 2/2] worktree: demonstrate a half-working workaround for auto-gc Johannes Schindelin
@ 2017-02-08 12:28 ` Duy Nguyen
  2 siblings, 0 replies; 4+ messages in thread
From: Duy Nguyen @ 2017-02-08 12:28 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Git Mailing List, Junio C Hamano

On Wed, Feb 8, 2017 at 7:17 PM, Johannes Schindelin
<johannes.schindelin@gmx.de> wrote:
> ... and a half-working workaround for the auto-gc case.
>
> This patch series really is just another attempt to prod Duy into fixing
> this instead of dabbling with shiny new toys ;-)

FYI work is ongoing [1] [2]. If you want it even faster, go do it yourself.

[1] https://github.com/pclouds/git/commits/prune-in-worktrees-2
[2] https://public-inbox.org/git/20170208113144.8201-1-pclouds@gmail.com/
-- 
Duy

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-02-08 12:35 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-08 12:17 [PATCH 0/2] Demonstrate a critical worktree/gc bug Johannes Schindelin
2017-02-08 12:17 ` [PATCH 1/2] worktree: demonstrate data lost to auto-gc Johannes Schindelin
2017-02-08 12:18 ` [PATCH 2/2] worktree: demonstrate a half-working workaround for auto-gc Johannes Schindelin
2017-02-08 12:28 ` [PATCH 0/2] Demonstrate a critical worktree/gc bug Duy Nguyen

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).