git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>,
	phillip.wood@dunelm.org.uk,
	Brian Norris <briannorris@chromium.org>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>
Subject: Re: [PATCH] t3429: try to protect against a potential racy todo file problem
Date: Mon, 25 Nov 2019 04:10:05 +0100	[thread overview]
Message-ID: <20191125031005.GC23183@szeder.dev> (raw)
In-Reply-To: <20191124211021.GB23183@szeder.dev>

On Sun, Nov 24, 2019 at 10:10:21PM +0100, SZEDER Gábor wrote:
> To notice a changed todo file the sequencer stores the file's stat
> data in its 'struct todo_list' instance, and compares it with the
> file's current stat data after 'reword', 'squash' and 'exec'
> instructions.  If the two stat data doesn't match, it re-reads the
> todo file.
> 
> Sounds simple, but there are some subtleties going on here:
> 
>   - The 'struct todo_list' holds the stat data from the time when the
>     todo file was last read.
> 
>   - This stat data in 'struct todo_list' is not updated when the
>     sequencer itself writes the todo file.
> 
>   - Before executing each instruction during an interactive rebase,
>     the sequencer always updates the todo file by removing the
>     just-about-to-be-executed instruction.  This changes the file's
>     size and inode [1].
> 
> Consequently, when the sequencer looks at the stat data after a
> 'reword', 'squash' or 'exec' instruction, it most likely finds that
> they differ, even when the user didn't modify the todo list at all!
> This is not an issue in practice, it just wastes a few cycles on
> re-reading the todo list that matches what the sequencer already has
> in memory anyway.
> 
> However, an unsuspecting Git developer might try to "fix" it by simply
> updating the stat data each time the sequencer writes the todo list
> for an interactive rebase.  On first sight it looks quite sensible and
> straightforward, but we have to be very careful when doing that,
> because potential racy problems lie that way.
> 
> It is possible to overwrite the todo list file without modifying
> either its inode or size, and if such an overwrite were to happen in
> the same second when the file was last read (our stat data has one
> second granularity by default), then the actual stat data on the file
> system would match the stat data that the sequencer has in memory.
> Consequently, such a modification to the todo list file would go
> unnoticed.

> [1] The todo file is written using the lockfile API, i.e. first to the
>     lockfile, which is then moved to place, so the new file can't
>     possibly have the same inode as the file it replaces.  Note,
>     however, that the file system might reuse inodes, so it is
>     possible that the new todo file ends up with the same inode as is
>     recorded in the 'struct todo_list' from the last time the file was
>     read.

Unfortunately, we already do have an issue here when the sequencer can
overlook a modified todo list, but triggering it needs the lucky
"alignment" of inodes as well.  I can trigger it fairly reliably with
the test below, but forcing the inode match makes it kind of gross and
Linux-only.

  https://travis-ci.org/szeder/git/jobs/616460522#L1470


  ---  >8  ---

diff --git a/t/t9999-rebase-racy-todo-reread.sh b/t/t9999-rebase-racy-todo-reread.sh
new file mode 100755
index 0000000000..437ebd55e0
--- /dev/null
+++ b/t/t9999-rebase-racy-todo-reread.sh
@@ -0,0 +1,87 @@
+#!/bin/sh
+
+test_description='racy edit todo reread problem'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	test_commit first_ &&
+	test_commit second &&
+	test_commit third_ &&
+	test_commit fourth &&
+	test_commit fifth_ &&
+	test_commit sixth_ &&
+
+	write_script sequence-editor <<-\EOS &&
+		todo=.git/rebase-merge/git-rebase-todo &&
+		cat >"$todo" <<-EOF
+			reword $(git rev-parse second^0) second
+			reword $(git rev-parse third_^0) third_
+			reword $(git rev-parse fourth^0) fourth
+		EOF
+	EOS
+
+	write_script commit-editor <<-\EOS &&
+		read first_line <"$1" &&
+		echo "$first_line - edited" >"$1" &&
+
+		todo=.git/rebase-merge/git-rebase-todo &&
+
+		if test "$first_line" = second
+		then
+			stat --format=%i "$todo" >expected-ino
+		elif test "$first_line" = third_
+		then
+			ino=$(cat expected-ino) &&
+			file=$(find . -inum $ino) &&
+			if test -n "$file"
+			then
+				echo &&
+				echo "Trying to free inode $ino by moving \"$file\" out of the way" &&
+				cp -av "$file" "$file".tmp &&
+				rm -fv "$file"
+			fi &&
+
+			cat >"$todo".tmp <<-EOF &&
+			reword $(git rev-parse fifth_^0) fifth_
+			reword $(git rev-parse sixth_^0) sixth_
+			EOF
+			mv -v "$todo".tmp "$todo" &&
+
+			if test "$ino" -eq $(stat --format=%i "$todo")
+			then
+				echo "Yay! The todo list did get inode $ino, just what the sequencer is expecting!"
+			fi &&
+
+			if test -n "$file"
+			then
+				mv -v "$file".tmp "$file"
+			fi
+		fi
+	EOS
+
+	cat >expect <<-\EOF
+	sixth_ - edited
+	fifth_ - edited
+	third_ - edited
+	second - edited
+	first_
+	EOF
+'
+
+for trial in 0 1 2 3 4
+do
+	test_expect_success "demonstrate racy todo re-read problem #$trial" '
+		git reset --hard fourth &&
+		>expected-ino && # placeholder
+
+		GIT_SEQUENCE_EDITOR=./sequence-editor \
+		GIT_EDITOR=./commit-editor \
+		git rebase -i HEAD^^^ &&
+
+		git log --format=%s >actual &&
+		test_cmp expect actual
+	'
+done
+
+test_done

  parent reply	other threads:[~2019-11-25  3:10 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-22 23:10 git 2.24: git revert <commit1> <commit2> requires extra '--continue'? Brian Norris
2019-11-23  0:34 ` SZEDER Gábor
2019-11-23  9:53   ` Phillip Wood
2019-11-23 17:20     ` [PATCH] sequencer: don't re-read todo for revert and cherry-pick SZEDER Gábor
2019-11-23 21:14       ` Johannes Schindelin
2019-11-24  4:49       ` Junio C Hamano
2019-11-24 10:44         ` Phillip Wood
2019-11-24 21:10           ` [PATCH] t3429: try to protect against a potential racy todo file problem SZEDER Gábor
2019-11-25  1:28             ` Junio C Hamano
2019-11-25  3:10             ` SZEDER Gábor [this message]
2019-11-25 13:18             ` SZEDER Gábor
2019-11-25 14:43               ` Phillip Wood
2019-11-25 15:15                 ` SZEDER Gábor
2019-11-25 16:40                   ` Phillip Wood
2019-11-25  1:10           ` [PATCH] sequencer: don't re-read todo for revert and cherry-pick Junio C Hamano
2019-11-25 10:47             ` Phillip Wood

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191125031005.GC23183@szeder.dev \
    --to=szeder.dev@gmail.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=briannorris@chromium.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=phillip.wood@dunelm.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).