git@vger.kernel.org mailing list mirror (one of many)
From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: Alban Gruin <alban.gruin@gmail.com>
Cc: git@vger.kernel.org, Thomas Gummerer <t.gummerer@gmail.com>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Junio C Hamano <gitster@pobox.com>,
	Son Luong Ngoc <sluongng@gmail.com>
Subject: Re: [RFC PATCH v1 0/6] stash: drop usage of a second index
Date: Mon, 15 Jun 2020 23:50:20 +0200
Message-ID: <20200615215020.GE2898@szeder.dev>
In-Reply-To: <20200615152715.GD2898@szeder.dev>

On Mon, Jun 15, 2020 at 05:27:15PM +0200, SZEDER Gábor wrote:
>       - Should we even allow 'splitIndex.sharedIndexExpire=now'?
>     
>         I believe, though haven't confirmed, that it can cause trouble
>         even without using an alternate index.  Consider the following
>         sequence of events:
>     
>           - Git process A reads '.git/index', finds the 'link' extension,
>             and reads the SHA1 recorded there that determines the filename
>             of its shared index.
>     
>           - The scheduler steps in, and puts process A to sleep.
>     
>           - Git process B updates the index, decides that it's time to
>             write a new shared index, does so, and then because of
>             'splitIndex.sharedIndexExpire=now' it removes all other shared
>             index files.
>     
>           - The scheduler wakes process A, which now tries to open the
>             shared index file it just learned about, but fails because
>             that file has just been removed by process B.

Confirmed.

To help reproduce the issue, this diff adds a strategically placed,
controllable delay between reading '.git/index' and reading its
shared/base index:

diff --git a/read-cache.c b/read-cache.c
index b888c5df44..5a66e9bf4b 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2319,6 +2319,9 @@ int read_index_from(struct index_state *istate, const char *path,
 	else
 		split_index->base = xcalloc(1, sizeof(*split_index->base));
 
+	if (git_env_bool("GIT_TEST_WAIT", 0))
+		sleep(3);
+
 	base_oid_hex = oid_to_hex(&split_index->base_oid);
 	base_path = xstrfmt("%s/sharedindex.%s", gitdir, base_oid_hex);
 	trace2_region_enter_printf("index", "shared/do_read_index",


Then this test creates the sequence of events described above:

test_expect_failure 'splitIndex.sharedIndexExpire=now can be harmful' '
	>file1 &&
	>file2 &&
	git update-index --split-index --add file1 &&

	{
		sleep 1 &&
		# "process B"
		git -c splitIndex.sharedIndexExpire=now \
		    update-index --split-index --add file2 &
	} &&

	# "process A"
	GIT_TEST_WAIT=1 git diff --cached --name-only
'

... and fails reliably with:

  [...]
  + GIT_TEST_WAIT=1 git diff --cached --name-only
  [ ... trace from background commands removed ...]
  fatal: .git/sharedindex.818f65852e7402f236aeaadd32efdbb62291aa75: index file open failed: No such file or directory
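
The same interleaving can also be simulated without git at all.  The
following is only a minimal sketch, assuming plain files stand in for
'.git/index' and its shared index files: the file names, the marker
files 'link-read' and 'writer-done', and the reader/writer helpers are
all hypothetical, and the marker files force the ordering that the
sleep() in the diff above makes likely:

```shell
#!/bin/sh
# Simulates the race with plain files; no git involved.
# All file names here are hypothetical stand-ins.
dir=$(mktemp -d) || exit 1

echo "contents of base-1" >"$dir/sharedindex.base-1"
echo "base-1" >"$dir/index"	# plays the role of the 'link' extension

# "process A": learn the shared index name, get delayed, then open it
reader () {
	base=$(cat "$dir/index") &&
	: >"$dir/link-read" &&
	until test -e "$dir/writer-done"
	do
		sleep 1
	done &&
	cat "$dir/sharedindex.$base"
}

# "process B": write a new shared index, then expire all the others
writer () {
	echo "contents of base-2" >"$dir/sharedindex.base-2" &&
	echo "base-2" >"$dir/index" &&
	for f in "$dir"/sharedindex.*
	do
		test "$f" = "$dir/sharedindex.base-2" || rm -f "$f"
	done
}

reader >"$dir/out" 2>"$dir/err" &
reader_pid=$!

# Force the problematic interleaving: B runs only after A has read the link
until test -e "$dir/link-read"
do
	sleep 1
done
writer
: >"$dir/writer-done"

# Collect the reader's result without tripping 'set -e', if it is in effect
status=0
wait "$reader_pid" || status=$?
out=$(cat "$dir/out")
echo "reader exit status: $status"
cat "$dir/err"
rm -rf "$dir"
```

The reader ends up opening a file name it learned before the writer
ran, so its final 'cat' fails in the same way 'git diff --cached' does
in the test above.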


>         This is similar to the issue we have with 'git gc --prune=now',
>         except that 'git gc's documentation explicitly warns about the
>         risks of using '--prune=now', while the description of
>         'splitIndex.sharedIndexExpire' doesn't have any such warning.
>     
>         I think that 'splitIndex.sharedIndexExpire=now' should be allowed,
>         for those who hopefully know what they are doing, just as we allow
>         'git gc --prune=now', but the documentation should clearly warn
>         against its potential pitfalls.

Thread overview: 44+ messages
2020-05-05 10:48 [RFC PATCH v1 0/6] stash: drop usage of a second index Alban Gruin
2020-05-05 10:48 ` [RFC PATCH v1 1/6] stash: mark `i_tree' in reset_tree() const Alban Gruin
2020-06-13  8:09   ` Christian Couder
2020-05-05 10:48 ` [RFC PATCH v1 2/6] stash: remove the second index in stash_working_tree() Alban Gruin
2020-06-13  8:52   ` Christian Couder
2020-06-13 18:00     ` Alban Gruin
2020-06-15 12:02       ` Christian Couder
2020-05-05 10:48 ` [RFC PATCH v1 3/6] stash: remove the second index in stash_patch() Alban Gruin
2020-06-13  9:38   ` Christian Couder
2020-06-13 10:04     ` Christian Couder
2020-05-05 10:48 ` [RFC PATCH v1 4/6] stash: remove the second index in save_untracked_files() Alban Gruin
2020-06-13 18:51   ` Christian Couder
2020-05-05 10:48 ` [RFC PATCH v1 5/6] stash: remove the second index in restore_untracked() Alban Gruin
2020-06-13 19:41   ` Christian Couder
2020-05-05 10:48 ` [RFC PATCH v1 6/6] stash: remove `stash_index_path' Alban Gruin
2020-06-04 12:07 ` [RFC PATCH v1 0/6] stash: drop usage of a second index Alban Gruin
2020-06-13  7:52 ` Christian Couder
2020-06-25 12:35   ` Alban Gruin
2020-06-15 15:27 ` SZEDER Gábor
2020-06-15 21:50   ` SZEDER Gábor [this message]
2020-06-16  7:06     ` SZEDER Gábor
2020-06-17 20:04       ` Junio C Hamano
2020-06-17 21:31         ` Alban Gruin
2020-06-30 15:15 ` [PATCH v2 " Alban Gruin
2020-06-30 15:15   ` [PATCH v2 1/6] stash: mark `i_tree' in reset_tree() const Alban Gruin
2020-06-30 15:15   ` [PATCH v2 2/6] stash: remove the second index in stash_working_tree() Alban Gruin
2020-06-30 15:15   ` [PATCH v2 3/6] stash: remove the second index in stash_patch() Alban Gruin
2020-06-30 15:15   ` [PATCH v2 4/6] stash: remove the second index in save_untracked_files() Alban Gruin
2020-06-30 15:15   ` [PATCH v2 5/6] stash: remove the second index in restore_untracked() Alban Gruin
2020-07-31 13:45     ` Christian Couder
2020-07-31 16:16       ` Alban Gruin
2020-06-30 15:15   ` [PATCH v2 6/6] stash: remove `stash_index_path' Alban Gruin
2020-07-31 13:53   ` [PATCH v2 0/6] stash: drop usage of a second index Christian Couder
2020-07-31 16:51   ` [PATCH v3 " Alban Gruin
2020-07-31 16:51     ` [PATCH v3 1/6] stash: mark `i_tree' in reset_tree() const Alban Gruin
2020-07-31 18:28       ` Junio C Hamano
2020-07-31 16:51     ` [PATCH v3 2/6] stash: remove the second index in stash_working_tree() Alban Gruin
2020-07-31 18:26       ` Junio C Hamano
2020-08-02  2:20         ` Junio C Hamano
2020-07-31 16:51     ` [PATCH v3 3/6] stash: remove the second index in stash_patch() Alban Gruin
2020-07-31 16:51     ` [PATCH v3 4/6] stash: remove the second index in save_untracked_files() Alban Gruin
2020-07-31 16:51     ` [PATCH v3 5/6] stash: remove the second index in restore_untracked() Alban Gruin
2020-07-31 16:51     ` [PATCH v3 6/6] stash: remove `stash_index_path' Alban Gruin
2020-07-31 17:48     ` [PATCH v3 0/6] stash: drop usage of a second index Junio C Hamano
