* [PATCH 0/2] Fix an error-handling path when locking refs @ 2017-10-24 15:16 Michael Haggerty 2017-10-24 15:16 ` [PATCH 1/2] t1404: add a bunch of tests of D/F conflicts Michael Haggerty ` (2 more replies) 0 siblings, 3 replies; 14+ messages in thread From: Michael Haggerty @ 2017-10-24 15:16 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jeff King, git, Michael Haggerty In [1], Peff described a bug that he found that could cause a reference transaction to be partly carried-out and partly not, with a successful return. The scenario that he discussed was the deletion of one reference and creation of another, where the two references D/F conflict with each other. But in fact the problem is more general; it can happen whenever a reference deletion within a transaction is processed successfully, and then another reference update in the same transaction fails in `lock_ref_for_update()`. In such a case the failed update and any subsequent ones could be silently ignored. There is a longer explanation in the second patch's commit message. The tests that I added probe a bunch of D/F update scenarios, which I think should be characteristic of the scenarios that would trigger this bug. It would be nice to have tests that examine other types of failures (e.g., wrong `old_oid` values). Bit since the fix is obviously a strict improvement and can prevent data loss, and since the release process is already pretty far along, I wanted to send this out ASAP. We can follow it up later with additional tests. These patches apply to current master. They are also available from my GitHub fork [2] as branch `ref-locking-fix`. While looking at this code again, I realized that the new code rewrites the `packed-refs` file more often than did the old code. Specifically, after dc39e09942 (files_ref_store: use a transaction to update packed refs, 2017-09-08), the `packed-refs` file is overwritten for any transaction that includes a reference delete. Prior to that commit, `packed-refs` was only overwritten if a deleted reference was actually present in the existing `packed-refs` file. This is even the case if there was previously no `packed-refs` file at all; now any reference deletion causes an empty `packed-refs` file to be created. I think this is a conscionable regression, since deleting references that are purely loose is probably not all that common, and the old code had to read the whole `packed-refs` file anyway. Nevertheless, it is obviously something that I would like to fix (though maybe not for 2.15? I'm open to input about its urgency.) [1] https://public-inbox.org/git/20171024082409.smwsd6pla64jjlua@sigill.intra.peff.net/ [2] https://github.com/mhagger/git Michael Haggerty (2): t1404: add a bunch of tests of D/F conflicts files_transaction_prepare(): fix handling of ref lock failure refs/files-backend.c | 2 +- t/t1404-update-ref-errors.sh | 146 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 147 insertions(+), 1 deletion(-) -- 2.14.1 ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/2] t1404: add a bunch of tests of D/F conflicts 2017-10-24 15:16 [PATCH 0/2] Fix an error-handling path when locking refs Michael Haggerty @ 2017-10-24 15:16 ` Michael Haggerty 2017-10-24 16:19 ` Eric Sunshine 2017-10-24 15:16 ` [PATCH 2/2] files_transaction_prepare(): fix handling of ref lock failure Michael Haggerty 2017-10-24 19:45 ` [PATCH 0/2] Fix an error-handling path when locking refs Jeff King 2 siblings, 1 reply; 14+ messages in thread From: Michael Haggerty @ 2017-10-24 15:16 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jeff King, git, Michael Haggerty It is currently not allowed, in a single transaction, to add one reference and delete another reference if the two reference names D/F conflict with each other (e.g., like `refs/foo/bar` and `refs/foo`). The reason is that the code would need to take locks $GIT_DIR/refs/foo.lock $GIT_DIR/refs/foo/bar.lock But the latter lock couldn't coexist with the loose reference file $GIT_DIR/refs/foo , because `$GIT_DIR/refs/foo` cannot be both a directory and a file at the same time (hence the name "D/F conflict). Add a bunch of tests that we cleanly reject such transactions. In fact, many of the new tests currently fail. They will be fixed in the next commit along with an explanation. Reported-by: Jeff King <peff@peff.net> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> --- t/t1404-update-ref-errors.sh | 146 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 146 insertions(+) diff --git a/t/t1404-update-ref-errors.sh b/t/t1404-update-ref-errors.sh index 100d50e362..8b5e9a83c5 100755 --- a/t/t1404-update-ref-errors.sh +++ b/t/t1404-update-ref-errors.sh @@ -34,6 +34,86 @@ test_update_rejected () { Q="'" +# Test adding and deleting D/F-conflicting references in a single +# transaction. +df_test() { + local prefix="$1" + shift + local pack=: + local symadd=false + local symdel=false + local add_del=false + local addref + local delref + while test $# -gt 0 + do + case "$1" in + --pack) + pack="git pack-refs --all" + shift + ;; + --sym-add) + # Perform the add via a symbolic reference + symadd=true + shift + ;; + --sym-del) + # Perform the del via a symbolic reference + symdel=true + shift + ;; + --del-add) + # Delete first reference then add second + add_del=false + delref="$prefix/r/$2" + addref="$prefix/r/$3" + shift 3 + ;; + --add-del) + # Add first reference then delete second + add_del=true + addref="$prefix/r/$2" + delref="$prefix/r/$3" + shift 3 + ;; + *) + echo 1>&2 "Extra args to df_test: $*" + return 1 + ;; + esac + done + git update-ref "$delref" $C && + if $symadd + then + addname="$prefix/s/symadd" && + git symbolic-ref "$addname" "$addref" + else + addname="$addref" + fi && + if $symdel + then + delname="$prefix/s/symdel" && + git symbolic-ref "$delname" "$delref" + else + delname="$delref" + fi && + cat >expected-err <<-EOF && + fatal: cannot lock ref $Q$addname$Q: $Q$delref$Q exists; cannot create $Q$addref$Q + EOF + $pack && + if $add_del + then + printf "%s\n" "create $addname $D" "delete $delname" + else + printf "%s\n" "delete $delname" "create $addname $D" + fi >commands && + test_must_fail git update-ref --stdin <commands 2>output.err && + test_cmp expected-err output.err && + printf "%s\n" "$C $delref" >expected-refs && + git for-each-ref --format="%(objectname) %(refname)" $prefix/r >actual-refs && + test_cmp expected-refs actual-refs +} + test_expect_success 'setup' ' git commit --allow-empty -m Initial && @@ -188,6 +268,72 @@ test_expect_success 'empty directory should not fool 1-arg delete' ' git update-ref --stdin ' +test_expect_success 'D/F conflict prevents add long + delete short' ' + df_test refs/df-al-ds --add-del foo/bar foo +' + +test_expect_success 'D/F conflict prevents add short + delete long' ' + df_test refs/df-as-dl --add-del foo foo/bar +' + +test_expect_failure 'D/F conflict prevents delete long + add short' ' + df_test refs/df-dl-as --del-add foo/bar foo +' + +test_expect_failure 'D/F conflict prevents delete short + add long' ' + df_test refs/df-ds-al --del-add foo foo/bar +' + +test_expect_success 'D/F conflict prevents add long + delete short packed' ' + df_test refs/df-al-dsp --pack --add-del foo/bar foo +' + +test_expect_success 'D/F conflict prevents add short + delete long packed' ' + df_test refs/df-as-dlp --pack --add-del foo foo/bar +' + +test_expect_failure 'D/F conflict prevents delete long packed + add short' ' + df_test refs/df-dlp-as --pack --del-add foo/bar foo +' + +test_expect_failure 'D/F conflict prevents delete short packed + add long' ' + df_test refs/df-dsp-al --pack --del-add foo foo/bar +' + +# Try some combinations involving symbolic refs... + +test_expect_failure 'D/F conflict prevents indirect add long + delete short' ' + df_test refs/df-ial-ds --sym-add --add-del foo/bar foo +' + +test_expect_success 'D/F conflict prevents indirect add long + indirect delete short' ' + df_test refs/df-ial-ids --sym-add --sym-del --add-del foo/bar foo +' + +test_expect_success 'D/F conflict prevents indirect add short + indirect delete long' ' + df_test refs/df-ias-idl --sym-add --sym-del --add-del foo foo/bar +' + +test_expect_failure 'D/F conflict prevents indirect delete long + indirect add short' ' + df_test refs/df-idl-ias --sym-add --sym-del --del-add foo/bar foo +' + +test_expect_failure 'D/F conflict prevents indirect add long + delete short packed' ' + df_test refs/df-ial-dsp --sym-add --pack --add-del foo/bar foo +' + +test_expect_success 'D/F conflict prevents indirect add long + indirect delete short packed' ' + df_test refs/df-ial-idsp --sym-add --sym-del --pack --add-del foo/bar foo +' + +test_expect_success 'D/F conflict prevents add long + indirect delete short packed' ' + df_test refs/df-al-idsp --sym-del --pack --add-del foo/bar foo +' + +test_expect_failure 'D/F conflict prevents indirect delete long packed + indirect add short' ' + df_test refs/df-idlp-ias --sym-add --sym-del --pack --del-add foo/bar foo +' + # Test various errors when reading the old values of references... test_expect_success 'missing old value blocks update' ' -- 2.14.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] t1404: add a bunch of tests of D/F conflicts 2017-10-24 15:16 ` [PATCH 1/2] t1404: add a bunch of tests of D/F conflicts Michael Haggerty @ 2017-10-24 16:19 ` Eric Sunshine 2017-10-24 19:46 ` Jeff King 0 siblings, 1 reply; 14+ messages in thread From: Eric Sunshine @ 2017-10-24 16:19 UTC (permalink / raw) To: Michael Haggerty; +Cc: Junio C Hamano, Jeff King, Git List On Tue, Oct 24, 2017 at 11:16 AM, Michael Haggerty <mhagger@alum.mit.edu> wrote: > diff --git a/t/t1404-update-ref-errors.sh b/t/t1404-update-ref-errors.sh > @@ -34,6 +34,86 @@ test_update_rejected () { > +# Test adding and deleting D/F-conflicting references in a single > +# transaction. > +df_test() { > + local prefix="$1" > + shift > + local pack=: Isn't "local" a Bash-ism we want to keep out of the test scripts? > + local symadd=false > + local symdel=false > + local add_del=false > + local addref > + local delref > + while test $# -gt 0 > + do ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] t1404: add a bunch of tests of D/F conflicts 2017-10-24 16:19 ` Eric Sunshine @ 2017-10-24 19:46 ` Jeff King 2017-10-25 8:03 ` Michael Haggerty 0 siblings, 1 reply; 14+ messages in thread From: Jeff King @ 2017-10-24 19:46 UTC (permalink / raw) To: Eric Sunshine; +Cc: Michael Haggerty, Junio C Hamano, Git List On Tue, Oct 24, 2017 at 12:19:26PM -0400, Eric Sunshine wrote: > On Tue, Oct 24, 2017 at 11:16 AM, Michael Haggerty <mhagger@alum.mit.edu> wrote: > > diff --git a/t/t1404-update-ref-errors.sh b/t/t1404-update-ref-errors.sh > > @@ -34,6 +34,86 @@ test_update_rejected () { > > +# Test adding and deleting D/F-conflicting references in a single > > +# transaction. > > +df_test() { > > + local prefix="$1" > > + shift > > + local pack=: > > Isn't "local" a Bash-ism we want to keep out of the test scripts? Yeah. It's supported by dash and many other shells, but we do try to avoid it[1]. I think in this case we could just drop it (but keep setting the "local foo" ones to empty with "foo=". -Peff [1] There's some more discussion in the thread at: https://public-inbox.org/git/20160601163747.GA10721@sigill.intra.peff.net/ ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] t1404: add a bunch of tests of D/F conflicts 2017-10-24 19:46 ` Jeff King @ 2017-10-25 8:03 ` Michael Haggerty 2017-10-25 8:23 ` Michael Haggerty 2017-10-26 6:32 ` Jeff King 0 siblings, 2 replies; 14+ messages in thread From: Michael Haggerty @ 2017-10-25 8:03 UTC (permalink / raw) To: Jeff King, Eric Sunshine; +Cc: Junio C Hamano, Git List On 10/24/2017 09:46 PM, Jeff King wrote: > On Tue, Oct 24, 2017 at 12:19:26PM -0400, Eric Sunshine wrote: > >> On Tue, Oct 24, 2017 at 11:16 AM, Michael Haggerty <mhagger@alum.mit.edu> wrote: >>> diff --git a/t/t1404-update-ref-errors.sh b/t/t1404-update-ref-errors.sh >>> @@ -34,6 +34,86 @@ test_update_rejected () { >>> +# Test adding and deleting D/F-conflicting references in a single >>> +# transaction. >>> +df_test() { >>> + local prefix="$1" >>> + shift >>> + local pack=: >> >> Isn't "local" a Bash-ism we want to keep out of the test scripts? > > Yeah. It's supported by dash and many other shells, but we do try to > avoid it[1]. I think in this case we could just drop it (but keep > setting the "local foo" ones to empty with "foo=". I do wish that we could allow "local", as it avoids a lot of headaches and potential breakage. According to [1], > However, every single POSIX-compliant shell I've tested implements the > 'local' keyword, which lets you declare variables that won't be > returned from the current function. So nowadays you can safely count > on it working. He mentions that ksh93 doesn't support "local", but that it differs from POSIX in many other ways, too. Perhaps we could slip in a couple of "local" as a compatibility test to see if anybody complains, like we did with a couple of C features recently. But I agree that this bug fix is not the correct occasion to change policy on something like this, so I'll reroll without "local". Michael [1] http://apenwarr.ca/log/?m=201102#28 ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] t1404: add a bunch of tests of D/F conflicts 2017-10-25 8:03 ` Michael Haggerty @ 2017-10-25 8:23 ` Michael Haggerty 2017-10-26 6:32 ` Jeff King 1 sibling, 0 replies; 14+ messages in thread From: Michael Haggerty @ 2017-10-25 8:23 UTC (permalink / raw) To: Jeff King, Eric Sunshine; +Cc: Junio C Hamano, Git List On 10/25/2017 10:03 AM, Michael Haggerty wrote: > On 10/24/2017 09:46 PM, Jeff King wrote: >> On Tue, Oct 24, 2017 at 12:19:26PM -0400, Eric Sunshine wrote: >> >>> On Tue, Oct 24, 2017 at 11:16 AM, Michael Haggerty <mhagger@alum.mit.edu> wrote: >>>> diff --git a/t/t1404-update-ref-errors.sh b/t/t1404-update-ref-errors.sh >>>> @@ -34,6 +34,86 @@ test_update_rejected () { >>>> +# Test adding and deleting D/F-conflicting references in a single >>>> +# transaction. >>>> +df_test() { >>>> + local prefix="$1" >>>> + shift >>>> + local pack=: >>> >>> Isn't "local" a Bash-ism we want to keep out of the test scripts? >> >> Yeah. It's supported by dash and many other shells, but we do try to >> avoid it[1]. I think in this case we could just drop it (but keep >> setting the "local foo" ones to empty with "foo=". > [...] > > But I agree that this bug fix is not the correct occasion to change > policy on something like this, so I'll reroll without "local". Oh, I see Junio has already made this fix in the version that he picked up, so I won't bother rerolling. Michael ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] t1404: add a bunch of tests of D/F conflicts 2017-10-25 8:03 ` Michael Haggerty 2017-10-25 8:23 ` Michael Haggerty @ 2017-10-26 6:32 ` Jeff King 2017-10-28 16:42 ` brian m. carlson 1 sibling, 1 reply; 14+ messages in thread From: Jeff King @ 2017-10-26 6:32 UTC (permalink / raw) To: Michael Haggerty; +Cc: Eric Sunshine, Junio C Hamano, Git List On Wed, Oct 25, 2017 at 10:03:09AM +0200, Michael Haggerty wrote: > > Yeah. It's supported by dash and many other shells, but we do try to > > avoid it[1]. I think in this case we could just drop it (but keep > > setting the "local foo" ones to empty with "foo=". > > I do wish that we could allow "local", as it avoids a lot of headaches > and potential breakage. According to [1], Agreed. > > However, every single POSIX-compliant shell I've tested implements the > > 'local' keyword, which lets you declare variables that won't be > > returned from the current function. So nowadays you can safely count > > on it working. > > He mentions that ksh93 doesn't support "local", but that it differs from > POSIX in many other ways, too. Yes, the conclusion we came to in the thread I linked earlier is the same: ksh is affected, but that shell is a problem for other reasons. I don't know if anybody tested with "modern" ksh like mksh, though. Should be easy enough: cat >foo.sh <<\EOF fun() { local x=3 echo inside: $x } x=2 fun echo after: $x EOF mksh foo.sh which produces the expected: inside: 3 after: 2 So that's good. > Perhaps we could slip in a couple of "local" as a compatibility test to > see if anybody complains, like we did with a couple of C features recently. That sounds reasonable to me. But from the earlier conversation, beware that: local x ... x=5 is not necessarily enough to notice the problem on broken shells (they may complain that "local" is not a command, and quietly stomp on the global). I think: local x=5 would be enough (though depend on how you use $x, the failure mode might be pretty subtle). Or we could even add an explicit test in t0000 like the example above. > But I agree that this bug fix is not the correct occasion to change > policy on something like this, so I'll reroll without "local". Also agreed. -Peff ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] t1404: add a bunch of tests of D/F conflicts 2017-10-26 6:32 ` Jeff King @ 2017-10-28 16:42 ` brian m. carlson 2017-10-29 0:05 ` Junio C Hamano 0 siblings, 1 reply; 14+ messages in thread From: brian m. carlson @ 2017-10-28 16:42 UTC (permalink / raw) To: Jeff King; +Cc: Michael Haggerty, Eric Sunshine, Junio C Hamano, Git List [-- Attachment #1: Type: text/plain, Size: 3180 bytes --] On Wed, Oct 25, 2017 at 11:32:51PM -0700, Jeff King wrote: > On Wed, Oct 25, 2017 at 10:03:09AM +0200, Michael Haggerty wrote: > > > > Yeah. It's supported by dash and many other shells, but we do try to > > > avoid it[1]. I think in this case we could just drop it (but keep > > > setting the "local foo" ones to empty with "foo=". > > > > I do wish that we could allow "local", as it avoids a lot of headaches > > and potential breakage. According to [1], > > Agreed. This would be useful. Debian requires that all implementations that implement /bin/sh support local and a small number of other features. There is discussion in the Austin Group issue tracker about adding this feature to POSIX, but it's gotten bogged down over lexical versus dynamic scoping. Everyone agrees that it's a desirable feature, though. > > He mentions that ksh93 doesn't support "local", but that it differs from > > POSIX in many other ways, too. > > Yes, the conclusion we came to in the thread I linked earlier is the > same: ksh is affected, but that shell is a problem for other reasons. I > don't know if anybody tested with "modern" ksh like mksh, though. Should > be easy enough: As far as I can tell, bash, dash, posh, mksh, pdksh, zsh, and busybox sh all support local. From my reading of the documentation, so does sh on FreeBSD, NetBSD, and OpenBSD. Not all of these are good choices for a POSIXy sh, though. ksh93 will support local if you alias it to typeset, but only when called from functions defined with "function", not normal shell-style functions. I have a gist[0] that does absurd things to work around that, but I wouldn't recommend that for production use. Solaris 11.1's man page doesn't document local in sh (which is a ksh88 variant) and ksh is ksh93, so it doesn't appear to support it. Solaris 11.3 documents bash, so it's a non-issue there. It's my understanding that using ksh as a POSIXy sh variant is very common on proprietary Unices, so its lack of compatibility may be a dealbreaker. Then again, many of those systems may have bash installed. > > Perhaps we could slip in a couple of "local" as a compatibility test to > > see if anybody complains, like we did with a couple of C features recently. > > That sounds reasonable to me. But from the earlier conversation, beware > that: > > local x > ... > x=5 > > is not necessarily enough to notice the problem on broken shells (they > may complain that "local" is not a command, and quietly stomp on the > global). I think: > > local x=5 > > would be enough (though depend on how you use $x, the failure mode might > be pretty subtle). Or we could even add an explicit test in t0000 like > the example above. I'd recommend an explicit test for this. It's much easier to track down that way than seeing other failure scenarios. People will also usually complain about failing tests. [0] https://gist.github.com/bk2204/9dd24df0e4499a02a300578ebdca4728 -- brian m. carlson / brian with sandals: Houston, Texas, US https://www.crustytoothpaste.net/~bmc | My opinion only OpenPGP: https://keybase.io/bk2204 [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 867 bytes --] ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] t1404: add a bunch of tests of D/F conflicts 2017-10-28 16:42 ` brian m. carlson @ 2017-10-29 0:05 ` Junio C Hamano 2017-10-29 16:12 ` brian m. carlson 0 siblings, 1 reply; 14+ messages in thread From: Junio C Hamano @ 2017-10-29 0:05 UTC (permalink / raw) To: brian m. carlson; +Cc: Jeff King, Michael Haggerty, Eric Sunshine, Git List "brian m. carlson" <sandals@crustytoothpaste.net> writes: > There is discussion in the Austin Group issue tracker about adding this > feature to POSIX, but it's gotten bogged down over lexical versus > dynamic scoping. Everyone agrees that it's a desirable feature, though. > ... In short, unless you are a binary packager on a platform whose native shell is ksh and who refuses to depend on tools that are not default/native on the platform, you'd be OK? > I'd recommend an explicit test for this. It's much easier to track down > that way than seeing other failure scenarios. People will also usually > complain about failing tests. Hopefully. Starting from an explicit test, gradually using more "local" in tests that cover more important parts of the system, and then start using "local" as appropriate in the main tools would be a good way forward. By that time, we might have a lot less scripted Porcelains, though ;-) ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] t1404: add a bunch of tests of D/F conflicts 2017-10-29 0:05 ` Junio C Hamano @ 2017-10-29 16:12 ` brian m. carlson 0 siblings, 0 replies; 14+ messages in thread From: brian m. carlson @ 2017-10-29 16:12 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jeff King, Michael Haggerty, Eric Sunshine, Git List [-- Attachment #1: Type: text/plain, Size: 1279 bytes --] On Sun, Oct 29, 2017 at 09:05:44AM +0900, Junio C Hamano wrote: > In short, unless you are a binary packager on a platform whose > native shell is ksh and who refuses to depend on tools that are not > default/native on the platform, you'd be OK? Yes. > > I'd recommend an explicit test for this. It's much easier to track down > > that way than seeing other failure scenarios. People will also usually > > complain about failing tests. > > Hopefully. > > Starting from an explicit test, gradually using more "local" in > tests that cover more important parts of the system, and then start > using "local" as appropriate in the main tools would be a good way > forward. I completely agree. I just wanted to ensure that if we failed, it was at least obvious to packagers who ran the tests. I'm not sure there's anything we can do if people don't run them. I do think people may run the tests more frequently than you think, though. I always do in my packaging at $DAYJOB, but any failures (usually missing SANITY) tend to already be patched by the time I get off work and write a patch. -- brian m. carlson / brian with sandals: Houston, Texas, US https://www.crustytoothpaste.net/~bmc | My opinion only OpenPGP: https://keybase.io/bk2204 [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 867 bytes --] ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 2/2] files_transaction_prepare(): fix handling of ref lock failure 2017-10-24 15:16 [PATCH 0/2] Fix an error-handling path when locking refs Michael Haggerty 2017-10-24 15:16 ` [PATCH 1/2] t1404: add a bunch of tests of D/F conflicts Michael Haggerty @ 2017-10-24 15:16 ` Michael Haggerty 2017-10-24 19:49 ` Jeff King 2017-10-25 2:29 ` Junio C Hamano 2017-10-24 19:45 ` [PATCH 0/2] Fix an error-handling path when locking refs Jeff King 2 siblings, 2 replies; 14+ messages in thread From: Michael Haggerty @ 2017-10-24 15:16 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jeff King, git, Michael Haggerty Since dc39e09942 (files_ref_store: use a transaction to update packed refs, 2017-09-08), failure to lock a reference has been handled incorrectly by `files_transaction_prepare()`. If `lock_ref_for_update()` fails in the lock-acquisition loop of that function, it sets `ret` then breaks out of that loop. Prior to dc39e09942, that was OK, because the only thing following the loop was the cleanup code. But dc39e09942 added another blurb of code between the loop and the cleanup. That blurb sometimes resets `ret` to zero, making the cleanup code think that the locking was successful. Specifically, whenever * One or more reference deletions have been processed successfully in the lock-acquisition loop. (Processing the first such reference causes a packed-ref transaction to be initialized.) * Then `lock_ref_for_update()` fails for a subsequent reference. Such a failure can happen for a number of reasons, such as the old SHA-1 not being correct, lock contention, etc. This causes a `break` out of the lock-acquisition loop. * The `packed-refs` lock is acquired successfully and `ref_transaction_prepare()` succeeds for the packed-ref transaction. This has the effect of resetting `ret` back to 0, and making the cleanup code think that lock acquisition was successful. In that case, any reference updates that were processed prior to breaking out of the loop would be carried out (loose and packed), but the reference that couldn't be locked and any subsequent references would silently be ignored. This can easily cause data loss if, for example, the user was trying to push a new name for an existing branch while deleting the old name. After the push, the branch could be left unreachable, and could even subsequently be garbage-collected. This problem was noticed in the context of deleting one reference and creating another in a single transaction, when the two references D/F conflict with each other, like git update-ref --stdin <<EOF delete refs/foo create refs/foo/bar HEAD EOF This triggers the above bug because the deletion is processed successfully for `refs/foo`, then the D/F conflict causes `lock_ref_for_update()` to fail when `refs/foo/bar` is processed. In this case the transaction *should* fail, but instead it causes `refs/foo` to be deleted without creating `refs/foo`. This could easily result in data loss. The fix is simple: instead of just breaking out of the loop, jump directly to the cleanup code. This fixes some tests in t1404 that were added in the previous commit. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> --- refs/files-backend.c | 2 +- t/t1404-update-ref-errors.sh | 16 ++++++++-------- 2 files changed, 9 insertions(+), 9 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index 014dabb0bf..8cc1e07fdb 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -2570,7 +2570,7 @@ static int files_transaction_prepare(struct ref_store *ref_store, ret = lock_ref_for_update(refs, update, transaction, head_ref, &affected_refnames, err); if (ret) - break; + goto cleanup; if (update->flags & REF_DELETING && !(update->flags & REF_LOG_ONLY) && diff --git a/t/t1404-update-ref-errors.sh b/t/t1404-update-ref-errors.sh index 8b5e9a83c5..b7838967b8 100755 --- a/t/t1404-update-ref-errors.sh +++ b/t/t1404-update-ref-errors.sh @@ -276,11 +276,11 @@ test_expect_success 'D/F conflict prevents add short + delete long' ' df_test refs/df-as-dl --add-del foo foo/bar ' -test_expect_failure 'D/F conflict prevents delete long + add short' ' +test_expect_success 'D/F conflict prevents delete long + add short' ' df_test refs/df-dl-as --del-add foo/bar foo ' -test_expect_failure 'D/F conflict prevents delete short + add long' ' +test_expect_success 'D/F conflict prevents delete short + add long' ' df_test refs/df-ds-al --del-add foo foo/bar ' @@ -292,17 +292,17 @@ test_expect_success 'D/F conflict prevents add short + delete long packed' ' df_test refs/df-as-dlp --pack --add-del foo foo/bar ' -test_expect_failure 'D/F conflict prevents delete long packed + add short' ' +test_expect_success 'D/F conflict prevents delete long packed + add short' ' df_test refs/df-dlp-as --pack --del-add foo/bar foo ' -test_expect_failure 'D/F conflict prevents delete short packed + add long' ' +test_expect_success 'D/F conflict prevents delete short packed + add long' ' df_test refs/df-dsp-al --pack --del-add foo foo/bar ' # Try some combinations involving symbolic refs... -test_expect_failure 'D/F conflict prevents indirect add long + delete short' ' +test_expect_success 'D/F conflict prevents indirect add long + delete short' ' df_test refs/df-ial-ds --sym-add --add-del foo/bar foo ' @@ -314,11 +314,11 @@ test_expect_success 'D/F conflict prevents indirect add short + indirect delete df_test refs/df-ias-idl --sym-add --sym-del --add-del foo foo/bar ' -test_expect_failure 'D/F conflict prevents indirect delete long + indirect add short' ' +test_expect_success 'D/F conflict prevents indirect delete long + indirect add short' ' df_test refs/df-idl-ias --sym-add --sym-del --del-add foo/bar foo ' -test_expect_failure 'D/F conflict prevents indirect add long + delete short packed' ' +test_expect_success 'D/F conflict prevents indirect add long + delete short packed' ' df_test refs/df-ial-dsp --sym-add --pack --add-del foo/bar foo ' @@ -330,7 +330,7 @@ test_expect_success 'D/F conflict prevents add long + indirect delete short pack df_test refs/df-al-idsp --sym-del --pack --add-del foo/bar foo ' -test_expect_failure 'D/F conflict prevents indirect delete long packed + indirect add short' ' +test_expect_success 'D/F conflict prevents indirect delete long packed + indirect add short' ' df_test refs/df-idlp-ias --sym-add --sym-del --pack --del-add foo/bar foo ' -- 2.14.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 2/2] files_transaction_prepare(): fix handling of ref lock failure 2017-10-24 15:16 ` [PATCH 2/2] files_transaction_prepare(): fix handling of ref lock failure Michael Haggerty @ 2017-10-24 19:49 ` Jeff King 2017-10-25 2:29 ` Junio C Hamano 1 sibling, 0 replies; 14+ messages in thread From: Jeff King @ 2017-10-24 19:49 UTC (permalink / raw) To: Michael Haggerty; +Cc: Junio C Hamano, git On Tue, Oct 24, 2017 at 05:16:25PM +0200, Michael Haggerty wrote: > The fix is simple: instead of just breaking out of the loop, jump > directly to the cleanup code. This fixes some tests in t1404 that were > added in the previous commit. Nicely explained. I think that fix makes sense, and matches how the rest of the function behaved. The other option would be to switch the recently-added code to: if (!ret && packed_transaction) ... but it seems like the "goto" means one less thing for new code to remember if it gets added. -Peff ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/2] files_transaction_prepare(): fix handling of ref lock failure 2017-10-24 15:16 ` [PATCH 2/2] files_transaction_prepare(): fix handling of ref lock failure Michael Haggerty 2017-10-24 19:49 ` Jeff King @ 2017-10-25 2:29 ` Junio C Hamano 1 sibling, 0 replies; 14+ messages in thread From: Junio C Hamano @ 2017-10-25 2:29 UTC (permalink / raw) To: Michael Haggerty; +Cc: Jeff King, git Michael Haggerty <mhagger@alum.mit.edu> writes: > ... But dc39e09942 added another blurb of code between > the loop and the cleanup. That blurb sometimes resets `ret` to zero, > making the cleanup code think that the locking was successful. > ... > The fix is simple: instead of just breaking out of the loop, jump > directly to the cleanup code. This fixes some tests in t1404 that were > added in the previous commit. OK. Now because we do not break and start packed_transaction but instead jump over that if statement, we'll leave packed_transation instance that we got from transaction_begin() that we called add_update() on, but haven't called transaction_prepare() on behind. That instance is pointed by backend_data pointer which is part of transaction, so presumably transaction_cleanup() called on it in the section labelled "cleanup:" should take care of it? Thanks for catching the issue and fixing it quickly. Will queue. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 0/2] Fix an error-handling path when locking refs 2017-10-24 15:16 [PATCH 0/2] Fix an error-handling path when locking refs Michael Haggerty 2017-10-24 15:16 ` [PATCH 1/2] t1404: add a bunch of tests of D/F conflicts Michael Haggerty 2017-10-24 15:16 ` [PATCH 2/2] files_transaction_prepare(): fix handling of ref lock failure Michael Haggerty @ 2017-10-24 19:45 ` Jeff King 2 siblings, 0 replies; 14+ messages in thread From: Jeff King @ 2017-10-24 19:45 UTC (permalink / raw) To: Michael Haggerty; +Cc: Junio C Hamano, git On Tue, Oct 24, 2017 at 05:16:23PM +0200, Michael Haggerty wrote: > But in fact the problem is more general; it can happen whenever a > reference deletion within a transaction is processed successfully, and > then another reference update in the same transaction fails in > `lock_ref_for_update()`. In such a case the failed update and any > subsequent ones could be silently ignored. Thanks for digging this down to the root cause. It sounds like this might not be all that hard to trigger for users of "push --atomic", then (most of the rest of the code is saved by the fact that it still uses one ref per transaction, which I think can't trigger this). > There is a longer explanation in the second patch's commit message. > > The tests that I added probe a bunch of D/F update scenarios, which I > think should be characteristic of the scenarios that would trigger > this bug. It would be nice to have tests that examine other types of > failures (e.g., wrong `old_oid` values). What you added makes sense to me for now. I do admit I was a little surprised that we didn't have better test coverage for partial transaction failures. I think in general our test suite is weak on checking failures, and covering more cases (like old_oid) would be welcome. Those will be valuable especially as we started playing with more storage backends. I also agree that those tests can come post-release. > While looking at this code again, I realized that the new code > rewrites the `packed-refs` file more often than did the old code. > Specifically, after dc39e09942 (files_ref_store: use a transaction to > update packed refs, 2017-09-08), the `packed-refs` file is overwritten > for any transaction that includes a reference delete. Prior to that > commit, `packed-refs` was only overwritten if a deleted reference was > actually present in the existing `packed-refs` file. This is even the > case if there was previously no `packed-refs` file at all; now any > reference deletion causes an empty `packed-refs` file to be created. > > I think this is a conscionable regression, since deleting references > that are purely loose is probably not all that common, and the old > code had to read the whole `packed-refs` file anyway. Nevertheless, it > is obviously something that I would like to fix (though maybe not for > 2.15? I'm open to input about its urgency.) I agree it's not nearly as urgent as the fix you have here. It might show up on a busy system as increased lock contention over packed-refs. Or if you have really gigantic packed-refs files, a possible performance regression. But as you say, it should be a pretty rare case. So I'd be OK with leaving it to post-2.15. OTOH, I suspect it's a pretty small patch. If you happen to produce it quickly, it might be worth seeing before evaluating. -Peff ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2017-10-29 16:12 UTC | newest] Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-10-24 15:16 [PATCH 0/2] Fix an error-handling path when locking refs Michael Haggerty 2017-10-24 15:16 ` [PATCH 1/2] t1404: add a bunch of tests of D/F conflicts Michael Haggerty 2017-10-24 16:19 ` Eric Sunshine 2017-10-24 19:46 ` Jeff King 2017-10-25 8:03 ` Michael Haggerty 2017-10-25 8:23 ` Michael Haggerty 2017-10-26 6:32 ` Jeff King 2017-10-28 16:42 ` brian m. carlson 2017-10-29 0:05 ` Junio C Hamano 2017-10-29 16:12 ` brian m. carlson 2017-10-24 15:16 ` [PATCH 2/2] files_transaction_prepare(): fix handling of ref lock failure Michael Haggerty 2017-10-24 19:49 ` Jeff King 2017-10-25 2:29 ` Junio C Hamano 2017-10-24 19:45 ` [PATCH 0/2] Fix an error-handling path when locking refs Jeff King
Code repositories for project(s) associated with this public inbox https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).