From: Michael Haggerty <mhagger@alum.mit.edu>
To: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Cc: Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>,
Johan Herland <johan@herland.net>,
git@vger.kernel.org
Subject: Re: [PATCH 00/12] Fix some reference-related races
Date: Sun, 16 Jun 2013 07:50:42 +0200 [thread overview]
Message-ID: <51BD5232.20205@alum.mit.edu> (raw)
In-Reply-To: <51BCCAD2.9070106@ramsay1.demon.co.uk>
Thanks for all of the information.
On 06/15/2013 10:13 PM, Ramsay Jones wrote:
> Michael Haggerty wrote:
>> *This patch series must be built on top of mh/reflife.*
>>
>
> [...]
>
>> The other problem was that the for_each_ref() functions will die if
>> the ref cache that they are iterating over is freed out from under
>> them. This problem is solved by using reference counts to avoid
>> freeing the old packed ref cache (even if it is no longer valid) until
>> all users are done with it.
>
> Yes, I found exactly this happened to me on cygwin, earlier this week,
> with the previous version of this code. After seeing this mail, I had
> decided not to describe the failure on the old version, but wait and
> test this version instead.
>
> This version is a great improvement, but it still has some failures on
> cygwin. So, it may be worth (briefly) describing the old failure anyway!
> Note that several tests failed, but I will only mention t3211-peel-ref.sh
> tests #7-8.
>
> $ pwd
> /home/ramsay/git/t/trash directory.t3211-peel-ref
> $
>
> $ ../../bin-wrappers/git show-ref -d
> d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/heads/master
> eb0e854c2cd2511c7571b1a5e3c8b8146193fb30 refs/outside/foo
> d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/outside/foo^{}
> 5 [main] git 3540 _cygtls::handle_exceptions: Error while dumping state (p
> robably corrupted stack)
> Segmentation fault (core dumped)
> $
>
> The stack-trace for the faulting code looks something like:
>
> cmd_show_ref()
> for_each_ref(show_ref, NULL)
> do_for_each_ref(&ref_cache, "", show_ref, 0, 0, NULL)
> do_for_each_entry(&ref_cache, "", do_one_ref, &data)
> do_for_each_entry_in_dirs(packed_dir, loose_dir, do_one_ref, &data)
> *dfeeid() recursive calls*
> do_one_ref(entry, &data)
> show_ref("refs/outside/foo", sha1, NULL) [2nd match]
> peel_ref("refs/outside/foo", sha1)
> peel_entry(entry, 0)
> peel_object(name, sha1)
> deref_tag_noverify(o)
> parse_object(sha1 <eb0e854c2...>)
> lookup_replace_object(sha1)
> do_lookup_replace_object(sha1)
> prepare_replace_object()
>
> [un-indent here!]
> for_each_replace_ref(register_replace_ref, NULL)
> do_for_each_ref(&ref_cache, "refs/replace", fn, 13, 0, NULL)
> do_for_each_entry(&ref_cache, "refs/replace", fn, &data)
> get_packed_refs(&ref_cache)
> clear_packed_ref_cache(&ref_cache) *free_ref_entries etc*
> ** return to show_ref() [2nd match] above **
> ** return to recursive dfeeid() call in original iteration
> ** dir1->entries has been free()-ed and reused => segmentation fault
> [dir1->entries == 0x64633263 => dc2c => part of sha1 for refs/outside/foo]
>
> So, the nested "replace-reference-iteration" causes the ref_cache to be
> freed out from under the initial show-ref iteration, so this works:
>
> $ GIT_NO_REPLACE_OBJECTS=1 ../../bin-wrappers/git show-ref -d
> d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/heads/master
> eb0e854c2cd2511c7571b1a5e3c8b8146193fb30 refs/outside/foo
> d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/outside/foo^{}
> d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/tags/base
> eb0e854c2cd2511c7571b1a5e3c8b8146193fb30 refs/tags/foo
> d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/tags/foo^{}
> $
>
> You may be wondering why clear_packed_ref_cache() is called? Well, that
> is because stat_validity_check() *incorrectly* indicates that the
> packed-refs file has changed. Why does it do that? Well, this is one
> more example of the problems caused by the cygwin schizophrenic stat()
> functions. :( [ARGHHHHHHHHH]
>
> At this point, I tried running 'git show-ref' with core.checkstat set
> on the command line; but that didn't work! I had to fix show-ref and
> re-build git, and then, this works:
>
> $ ../../bin-wrappers/git -c core.checkstat=minimal -c core.trustctime=f
> alse show-ref -d
> d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/heads/master
> eb0e854c2cd2511c7571b1a5e3c8b8146193fb30 refs/outside/foo
> d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/outside/foo^{}
> d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/tags/base
> eb0e854c2cd2511c7571b1a5e3c8b8146193fb30 refs/tags/foo
> d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/tags/foo^{}
> $
So if I understand correctly, all of the above is *without* the
refcounting changes introduced in mh/ref-races? Is so, then it is not
surprising, as this is exactly the sort of problem that the reference
counting is meant to solve.
> Now, turning to the new code, t3211-peel-ref.sh test #7 now works, but
> test #8 still fails...
>
> $ ./t3211-peel-ref.sh -i -v
>
> ...
>
> ok 7 - refs are peeled outside of refs/tags (old packed)
>
> expecting success:
> git pack-refs --all &&
> cp .git/packed-refs fully-peeled &&
> git branch yadda &&
> git pack-refs --all &&
> git branch -d yadda &&
> test_cmp fully-peeled .git/packed-refs
>
> fatal: internal error: packed-ref cache cleared while locked
> not ok 8 - peeled refs survive deletion of packed ref
> #
> # git pack-refs --all &&
> # cp .git/packed-refs fully-peeled &&
> # git branch yadda &&
> # git pack-refs --all &&
> # git branch -d yadda &&
> # test_cmp fully-peeled .git/packed-refs
> #
> $ cd trash\ directory.t3211-peel-ref/
> $ ../../bin-wrappers/git pack-refs --all
> fatal: internal error: packed-ref cache cleared while locked
These "internal error: packed-ref cache cleared while locked" failures
result from an internal consistency check that clear_packed_ref_cache()
is not called while the write lock is held on the packed-refs file. A
call to c_p_r_c() could result from
* a programming error
* a determination based on the packed-refs file stat that the file needs
to be re-read
Judging from what you said about cygwin, I assume that the latter is
happening. It should be impossible, because the current process is
holding packed-refs.lock, and therefore other git processes should
refuse to change the packed-refs file.
But if the stat information is not reliable, then the current process
would *think* that the packed-refs file has been changed even though it
hasn't, then it would call c_p_r_c() even though it holds the lock on
packed-refs, and the internal consistency check would fail.
So apparently in these cases cygwin is reporting that the packed-refs
stat information has changed (in the sense defined by the new
stat_validity_check() function, which does essentially the same checks
as the old ce_match_stat_basic() function).
> $ ls
> actual base.t expect
> $ ls .git
> COMMIT_EDITMSG branches/ description index logs/ packed-refs
> HEAD config hooks-disabled/ info/ objects/ refs/
> $ ls -l .git/packed-refs
> -rw-r--r-- 1 ramsay None 296 Jun 14 20:34 .git/packed-refs
> $ cat .git/packed-refs
> # pack-refs with: peeled
> d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/heads/master
> eb0e854c2cd2511c7571b1a5e3c8b8146193fb30 refs/outside/foo
> d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/tags/base
> eb0e854c2cd2511c7571b1a5e3c8b8146193fb30 refs/tags/foo
> ^d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e
> $
>
> Now, I have a test-stat program which prints the difference between
> the two stat implementations used in cygwin git, thus:
>
> $ ../../test-stat .git/packed-refs
> stat for '.git/packed-refs':
> *dev: -1395862925, 0
> *ino: 166044, 0
> *mode: 100644 -rw-, 100600 -rw-
> nlink: 1, 1
> *uid: 1005, 0
> *gid: 513, 0
> *rdev: -1395862925, 0
> size: 296, 296
> atime: 1371238550, 1371238550 Fri Jun 14 20:35:50 2013
> mtime: 1371238469, 1371238469 Fri Jun 14 20:34:29 2013
> ctime: 1371238469, 1371238469 Fri Jun 14 20:34:29 2013
Yikes! ECYGWINFAIL. I have no doubt they have reason why they cannot
implement this correctly, but this is rather limiting. (I has assumed
that the #ifdef magic that was already in ce_match_stat_basic() would
have papered over the problems in stat, but I guess that is not the case.)
You say that there are two stat references in cygwin. Is there a way to
ensure that the same one is used in both cases? Or is it so hopelessly
broken that there is no point?
Or let me step back and pose a slightly more abstract question:
How, under cygwin, can we implement a quick check of whether a file
might have changed? The check should not have any false negatives
(claiming that a file is unchanged when actually it was rewritten via
the usual lock_file mechanism) nor should it have any "strong" false
positives (claiming that a file has changed even though it has never
been touched), though a "weak" false positive would be OK (claiming that
a file has changed even though it was replaced by a version with
identical contents).
If such a check is possible, then we should build it into the
implementation of match_stat_data(). If not, we have to think of
another way to implement the checks of packed-refs cache up-to-dateness.
(One horrible hack would be: when in doubt, read the packed-refs file
into a temporary ref_dir, then compare *the contents* to the version in
the cache. If they are the same, then discard the newly-read version,
update the stat_validity, and continue to use the old version. If they
are different, *then* verify that the lock file was not held, call
clear_packed_ref_cache(), and start using the new version.)
> $ ../../bin-wrappers/git -c core.checkstat=minimal pack-refs --all
> fatal: internal error: packed-ref cache cleared while locked
> $
>
> Hmmm, that should have worked! Wait, fix 'git pack-refs' to support
> setting config variables on the command line, rebuild and:
>
> $ ../../bin-wrappers/git -c core.checkstat=minimal pack-refs --all
> $ cat .git/packed-refs
> # pack-refs with: peeled fully-peeled
> d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/heads/master
> eb0e854c2cd2511c7571b1a5e3c8b8146193fb30 refs/outside/foo
> ^d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e
> d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/tags/base
> eb0e854c2cd2511c7571b1a5e3c8b8146193fb30 refs/tags/foo
> ^d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e
> $
>
> I haven't checked the remaining test failures to see if they are
> caused by this code (I don't think so, but ...), but this failure
> is clearly a cygwin specific issue.
Thanks again for the testing and analysis,
Michael
--
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/
next prev parent reply other threads:[~2013-06-16 5:51 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-06-11 21:48 [PATCH 00/12] Fix some reference-related races Michael Haggerty
2013-06-11 21:48 ` [PATCH 01/12] repack_without_ref(): split list curation and entry writing Michael Haggerty
2013-06-12 11:38 ` Jeff King
2013-06-12 11:56 ` Michael Haggerty
2013-06-11 21:48 ` [PATCH 02/12] pack_refs(): split creation of packed refs " Michael Haggerty
2013-06-11 21:48 ` [PATCH 03/12] refs: wrap the packed refs cache in a level of indirection Michael Haggerty
2013-06-11 21:48 ` [PATCH 04/12] refs: implement simple transactions for the packed-refs file Michael Haggerty
2013-06-12 12:01 ` Jeff King
2013-06-11 21:48 ` [PATCH 05/12] refs: manage lifetime of packed refs cache via reference counting Michael Haggerty
2013-06-11 21:48 ` [PATCH 06/12] do_for_each_entry(): increment the packed refs cache refcount Michael Haggerty
2013-06-11 21:48 ` [PATCH 07/12] packed_ref_cache: increment refcount when locked Michael Haggerty
2013-06-11 21:48 ` [PATCH 08/12] Extract a struct stat_data from cache_entry Michael Haggerty
2013-06-11 21:48 ` [PATCH 09/12] add a stat_validity struct Michael Haggerty
2013-06-11 21:48 ` [PATCH 10/12] get_packed_ref_cache: reload packed-refs file when it changes Michael Haggerty
2013-06-11 21:48 ` [PATCH 11/12] for_each_ref: load all loose refs before packed refs Michael Haggerty
2013-06-11 21:48 ` [PATCH 12/12] refs: do not invalidate the packed-refs cache unnecessarily Michael Haggerty
2013-06-12 12:39 ` Jeff King
2013-06-12 12:52 ` [PATCH 00/12] Fix some reference-related races Jeff King
2013-06-15 20:13 ` Ramsay Jones
2013-06-16 5:50 ` Michael Haggerty [this message]
2013-06-18 18:13 ` Ramsay Jones
2013-06-19 5:51 ` Michael Haggerty
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=51BD5232.20205@alum.mit.edu \
--to=mhagger@alum.mit.edu \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=johan@herland.net \
--cc=peff@peff.net \
--cc=ramsay@ramsay1.demon.co.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).