* SHA1 collision in production repo?! (probably not) @ 2017-03-31 16:05 Lars Schneider 2017-03-31 17:27 ` Jeff King 2017-03-31 17:35 ` Junio C Hamano 0 siblings, 2 replies; 15+ messages in thread From: Lars Schneider @ 2017-03-31 16:05 UTC (permalink / raw) To: Git List; +Cc: Jeff King Hi, I just got a report with the following output after a "git fetch" operation using Git 2.11.0.windows.3 [1]: remote: Counting objects: 5922, done. remote: Compressing objects: 100% (14/14), done. error: inflate: data stream error (unknown compression method) error: unable to unpack 6acd8f279a8b20311665f41134579b7380970446 header fatal: SHA1 COLLISION FOUND WITH 6acd8f279a8b20311665f41134579b7380970446 ! fatal: index-pack failed I would be really surprised if we discovered a SHA1 collision in a production repo. My guess is that this is somehow triggered by a network issue (see data stream error). Any tips how to debug this? Thanks, Lars [1] Git for Windows build based on Git v2.11.0 ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: SHA1 collision in production repo?! (probably not) 2017-03-31 16:05 SHA1 collision in production repo?! (probably not) Lars Schneider @ 2017-03-31 17:27 ` Jeff King 2017-03-31 17:35 ` Junio C Hamano 1 sibling, 0 replies; 15+ messages in thread From: Jeff King @ 2017-03-31 17:27 UTC (permalink / raw) To: Lars Schneider; +Cc: Git List On Fri, Mar 31, 2017 at 06:05:17PM +0200, Lars Schneider wrote: > I just got a report with the following output after a "git fetch" operation > using Git 2.11.0.windows.3 [1]: > > remote: Counting objects: 5922, done. > remote: Compressing objects: 100% (14/14), done. > error: inflate: data stream error (unknown compression method) > error: unable to unpack 6acd8f279a8b20311665f41134579b7380970446 header > fatal: SHA1 COLLISION FOUND WITH 6acd8f279a8b20311665f41134579b7380970446 ! > fatal: index-pack failed > > I would be really surprised if we discovered a SHA1 collision in a production > repo. My guess is that this is somehow triggered by a network issue (see data > stream error). Any tips how to debug this? I'd be surprised, too. :) I'm not sure that inflate error actually comes from the network pack. The "unable to unpack $sha1 header" message actually comes from sha1_object_info_loose(). Which means we're failing to read our _local_ version of 6acd8f279a8b, which is an object we believe is coming in from the network. And that would explain the false-positive collision. We computed the sha1 on something come from the network. We believe we have an object with that sha1 already (but it's corrupted), and then when we compared the real bytes to the corrupted bytes, they didn't match. We should be able to confirm that by running "git fsck" on the local repo, which I'd expect to complain about the loose object. But what I find disturbing there is that we did not notice the failure when accessing the loose object. It seems to have been lost in the call chain. I think the problem is that sha1_loose_object_info() may report errors in two ways: returning -1 if it did not find the object, or putting OBJ_BAD into the type field if it found a corrupt object. But callers are not aware of the second one. I think it should probably return -1 for a corruption, too, and act as if we don't have the object (because we effectively don't). -Peff ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: SHA1 collision in production repo?! (probably not) 2017-03-31 16:05 SHA1 collision in production repo?! (probably not) Lars Schneider 2017-03-31 17:27 ` Jeff King @ 2017-03-31 17:35 ` Junio C Hamano 2017-03-31 17:45 ` Jeff King 1 sibling, 1 reply; 15+ messages in thread From: Junio C Hamano @ 2017-03-31 17:35 UTC (permalink / raw) To: Lars Schneider; +Cc: Git List, Jeff King Lars Schneider <larsxschneider@gmail.com> writes: > Hi, > > I just got a report with the following output after a "git fetch" operation > using Git 2.11.0.windows.3 [1]: > > remote: Counting objects: 5922, done. > remote: Compressing objects: 100% (14/14), done. > error: inflate: data stream error (unknown compression method) > error: unable to unpack 6acd8f279a8b20311665f41134579b7380970446 header > fatal: SHA1 COLLISION FOUND WITH 6acd8f279a8b20311665f41134579b7380970446 ! > fatal: index-pack failed > > I would be really surprised if we discovered a SHA1 collision in a production > repo. My guess is that this is somehow triggered by a network issue (see data > stream error). Any tips how to debug this? Perhaps the first thing to do is to tweak the messages in builtin/index-pack.c to help you identify which one of identical 5 messages is firing. My guess would be that the code saw an object that came over the wire, hashed it to learn that its object name is 6acd8f279a8b20311665f41134579b7380970446, noticed that it locally already has the object with the same name, and tried to make sure they have identical contents (otherwise, what came over the wire is a successful second preimage attack). But your loose object on disk you already had was corrupt and didn't inflate correctly when builtin/index-pack.c::compare_objects() or check_collision() tried to. The code saw no data, or truncated data, or whatever---something different from what the other data that hashed down to 6acd8..., and reported a collision when there is no collision. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: SHA1 collision in production repo?! (probably not) 2017-03-31 17:35 ` Junio C Hamano @ 2017-03-31 17:45 ` Jeff King 2017-03-31 17:48 ` Jeff King ` (2 more replies) 0 siblings, 3 replies; 15+ messages in thread From: Jeff King @ 2017-03-31 17:45 UTC (permalink / raw) To: Junio C Hamano; +Cc: Lars Schneider, Git List On Fri, Mar 31, 2017 at 10:35:06AM -0700, Junio C Hamano wrote: > Lars Schneider <larsxschneider@gmail.com> writes: > > > Hi, > > > > I just got a report with the following output after a "git fetch" operation > > using Git 2.11.0.windows.3 [1]: > > > > remote: Counting objects: 5922, done. > > remote: Compressing objects: 100% (14/14), done. > > error: inflate: data stream error (unknown compression method) > > error: unable to unpack 6acd8f279a8b20311665f41134579b7380970446 header > > fatal: SHA1 COLLISION FOUND WITH 6acd8f279a8b20311665f41134579b7380970446 ! > > fatal: index-pack failed > > > > I would be really surprised if we discovered a SHA1 collision in a production > > repo. My guess is that this is somehow triggered by a network issue (see data > > stream error). Any tips how to debug this? > > Perhaps the first thing to do is to tweak the messages in builtin/index-pack.c > to help you identify which one of identical 5 messages is firing. > > My guess would be that the code saw an object that came over the > wire, hashed it to learn that its object name is > 6acd8f279a8b20311665f41134579b7380970446, noticed that it locally > already has the object with the same name, and tried to make sure > they have identical contents (otherwise, what came over the wire is > a successful second preimage attack). But your loose object on disk > you already had was corrupt and didn't inflate correctly when > builtin/index-pack.c::compare_objects() or check_collision() tried > to. The code saw no data, or truncated data, or > whatever---something different from what the other data that hashed > down to 6acd8..., and reported a collision when there is no > collision. My guess is the one in compare_objects(). The "unable to unpack" error comes from sha1_loose_object_info(). We'd normally then follow up with read_sha1_file(), which would generate its own set of errors. But if open_istream() got a bogus value for the object size (but didn't correctly report an error), then it would probably quietly return 0 bytes from read_istream() later. I suspect this may improve things, but I haven't dug deeper to see if there are unwanted side effects, or if there are other spots that need similar treatment. diff --git a/sha1_file.c b/sha1_file.c index 43990dec7..38411f90b 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2952,7 +2952,7 @@ static int sha1_loose_object_info(const unsigned char *sha1, if (status && oi->typep) *oi->typep = status; strbuf_release(&hdrbuf); - return 0; + return status; } int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, unsigned flags) -Peff ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: SHA1 collision in production repo?! (probably not) 2017-03-31 17:45 ` Jeff King @ 2017-03-31 17:48 ` Jeff King 2017-03-31 18:19 ` Junio C Hamano 2017-03-31 21:16 ` Junio C Hamano 2017-09-12 16:18 ` SHA1 collision in production repo?! (probably not) Lars Schneider 2 siblings, 1 reply; 15+ messages in thread From: Jeff King @ 2017-03-31 17:48 UTC (permalink / raw) To: Junio C Hamano; +Cc: Lars Schneider, Git List On Fri, Mar 31, 2017 at 01:45:15PM -0400, Jeff King wrote: > I suspect this may improve things, but I haven't dug deeper to see if > there are unwanted side effects, or if there are other spots that need > similar treatment. > > diff --git a/sha1_file.c b/sha1_file.c > index 43990dec7..38411f90b 100644 > --- a/sha1_file.c > +++ b/sha1_file.c > @@ -2952,7 +2952,7 @@ static int sha1_loose_object_info(const unsigned char *sha1, > if (status && oi->typep) > *oi->typep = status; > strbuf_release(&hdrbuf); > - return 0; > + return status; > } > > int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, unsigned flags) Er, no, that's totally wrong. "status' may be holding the type. It should really be: return status < 0 ? status : 0; -Peff ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: SHA1 collision in production repo?! (probably not) 2017-03-31 17:48 ` Jeff King @ 2017-03-31 18:19 ` Junio C Hamano 2017-03-31 18:42 ` Jeff King 0 siblings, 1 reply; 15+ messages in thread From: Junio C Hamano @ 2017-03-31 18:19 UTC (permalink / raw) To: Jeff King; +Cc: Lars Schneider, Git List Jeff King <peff@peff.net> writes: > On Fri, Mar 31, 2017 at 01:45:15PM -0400, Jeff King wrote: > >> I suspect this may improve things, but I haven't dug deeper to see if >> there are unwanted side effects, or if there are other spots that need >> similar treatment. >> >> diff --git a/sha1_file.c b/sha1_file.c >> index 43990dec7..38411f90b 100644 >> --- a/sha1_file.c >> +++ b/sha1_file.c >> @@ -2952,7 +2952,7 @@ static int sha1_loose_object_info(const unsigned char *sha1, >> if (status && oi->typep) >> *oi->typep = status; >> strbuf_release(&hdrbuf); >> - return 0; >> + return status; >> } >> >> int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, unsigned flags) > > Er, no, that's totally wrong. "status' may be holding the type. It > should really be: > > return status < 0 ? status : 0; Sounds more like it. The only caller will say "ah, that object is not available to us---let's try packs again", which is exactly what we want to happen. There is another bug in the codepath: the assignment to *oi->typep in the pre-context must guard against negative status value. By returning an error correctly like you do above, that bug becomes more or less irrelevant, though. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: SHA1 collision in production repo?! (probably not) 2017-03-31 18:19 ` Junio C Hamano @ 2017-03-31 18:42 ` Jeff King 0 siblings, 0 replies; 15+ messages in thread From: Jeff King @ 2017-03-31 18:42 UTC (permalink / raw) To: Junio C Hamano; +Cc: Lars Schneider, Git List On Fri, Mar 31, 2017 at 11:19:54AM -0700, Junio C Hamano wrote: > > Er, no, that's totally wrong. "status' may be holding the type. It > > should really be: > > > > return status < 0 ? status : 0; > > Sounds more like it. The only caller will say "ah, that object is > not available to us---let's try packs again", which is exactly what > we want to happen. Right. Callers cannot distinguish between "did not have it" and "corrupted", but that is no different than the rest of the sha1-file interface. > There is another bug in the codepath: the assignment to *oi->typep > in the pre-context must guard against negative status value. By > returning an error correctly like you do above, that bug becomes > more or less irrelevant, though. I think that is intentional. OBJ_BAD is -1 (though the callers are assuming that error() == OBJ_BAD). And that type gets relayed through sha1_object_info() as the combined type/status return value. -Peff ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: SHA1 collision in production repo?! (probably not) 2017-03-31 17:45 ` Jeff King 2017-03-31 17:48 ` Jeff King @ 2017-03-31 21:16 ` Junio C Hamano 2017-04-01 8:03 ` Jeff King 2017-09-12 16:18 ` SHA1 collision in production repo?! (probably not) Lars Schneider 2 siblings, 1 reply; 15+ messages in thread From: Junio C Hamano @ 2017-03-31 21:16 UTC (permalink / raw) To: Jeff King; +Cc: Lars Schneider, Git List Before we forget... -- >8 -- From: Jeff King <peff@peff.net> When sha1_loose_object_info() finds that a loose object file cannot be stat(2)ed or mmap(2)ed, it returns -1 to signal an error to the caller. However, if it found that the loose object file is corrupt and the object data cannot be used from it, it forgets to return -1. This can confuse the caller, who may be lead to mistakenly think that there is a loose object and possibly gets an incorrect type and size from the function. The SHA-1 collision detection codepath, for example, when it gets an object over the wire and tries to see the data is the same as what is available as a loose object locally, can get confused when the loose object is correupted due to this bug. --- sha1_file.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sha1_file.c b/sha1_file.c index 71063890ff..368c89b5c3 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2952,7 +2952,7 @@ static int sha1_loose_object_info(const unsigned char *sha1, if (status && oi->typep) *oi->typep = status; strbuf_release(&hdrbuf); - return 0; + return (status < 0) ? status : 0; } int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, unsigned flags) ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: SHA1 collision in production repo?! (probably not) 2017-03-31 21:16 ` Junio C Hamano @ 2017-04-01 8:03 ` Jeff King 2017-04-01 8:05 ` [PATCH 1/2] sha1_loose_object_info: return error for corrupted objects Jeff King 2017-04-01 8:09 ` [PATCH 2/2] index-pack: detect local corruption in collision check Jeff King 0 siblings, 2 replies; 15+ messages in thread From: Jeff King @ 2017-04-01 8:03 UTC (permalink / raw) To: Junio C Hamano; +Cc: Lars Schneider, Git List On Fri, Mar 31, 2017 at 02:16:29PM -0700, Junio C Hamano wrote: > Before we forget... > > -- >8 -- > From: Jeff King <peff@peff.net> > > When sha1_loose_object_info() finds that a loose object file cannot > be stat(2)ed or mmap(2)ed, it returns -1 to signal an error to the > caller. However, if it found that the loose object file is corrupt > and the object data cannot be used from it, it forgets to return -1. > > This can confuse the caller, who may be lead to mistakenly think > that there is a loose object and possibly gets an incorrect type and > size from the function. The SHA-1 collision detection codepath, for > example, when it gets an object over the wire and tries to see the > data is the same as what is available as a loose object locally, can > get confused when the loose object is correupted due to this bug. Unfortunately this isn't quite right. I was able to reproduce the problem that Lars saw, and this patch doesn't fix it. So here's a two-patch series. The first fixes the problem described above, along with a simpler test that demonstrates it. The second fixes Lars's problem on top. I know you're heading out for a week, so I'll post it now for review, and then hold and repost when you get back. [1/2]: sha1_loose_object_info: return error for corrupted objects [2/2]: index-pack: detect local corruption in collision check builtin/index-pack.c | 2 ++ sha1_file.c | 2 +- t/t1060-object-corruption.sh | 24 ++++++++++++++++++++++++ 3 files changed, 27 insertions(+), 1 deletion(-) -Peff ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 1/2] sha1_loose_object_info: return error for corrupted objects 2017-04-01 8:03 ` Jeff King @ 2017-04-01 8:05 ` Jeff King 2017-04-01 17:47 ` Junio C Hamano 2017-04-01 8:09 ` [PATCH 2/2] index-pack: detect local corruption in collision check Jeff King 1 sibling, 1 reply; 15+ messages in thread From: Jeff King @ 2017-04-01 8:05 UTC (permalink / raw) To: Junio C Hamano; +Cc: Lars Schneider, Git List When sha1_loose_object_info() finds that a loose object file cannot be stat(2)ed or mmap(2)ed, it returns -1 to signal an error to the caller. However, if it found that the loose object file is corrupt and the object data cannot be used from it, it stuffs OBJ_BAD into "type" field of the object_info, but returns zero (i.e., success), which can confuse callers. This is due to 052fe5eac (sha1_loose_object_info: make type lookup optional, 2013-07-12), which switched the return to a strict success/error, rather than returning the type (but botched the return). Callers of regular sha1_object_info() don't notice the difference, as that function returns the type (which is OBJ_BAD in this case). However, direct callers of sha1_object_info_extended() see the function return success, but without setting any meaningful values in the object_info struct, leading them to access potentially uninitialized memory. The easiest way to see the bug is via "cat-file -s", which will happily ignore the corruption and report whatever value happened to be in the "size" variable. Signed-off-by: Jeff King <peff@peff.net> --- So not only does it not fail, but running with --valgrind complains. sha1_file.c | 2 +- t/t1060-object-corruption.sh | 7 +++++++ 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/sha1_file.c b/sha1_file.c index 71063890f..368c89b5c 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2952,7 +2952,7 @@ static int sha1_loose_object_info(const unsigned char *sha1, if (status && oi->typep) *oi->typep = status; strbuf_release(&hdrbuf); - return 0; + return (status < 0) ? status : 0; } int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, unsigned flags) diff --git a/t/t1060-object-corruption.sh b/t/t1060-object-corruption.sh index 3f8705139..3a88d79c5 100755 --- a/t/t1060-object-corruption.sh +++ b/t/t1060-object-corruption.sh @@ -53,6 +53,13 @@ test_expect_success 'streaming a corrupt blob fails' ' ) ' +test_expect_success 'getting type of a corrupt blob fails' ' + ( + cd bit-error && + test_must_fail git cat-file -s HEAD:content.t + ) +' + test_expect_success 'read-tree -u detects bit-errors in blobs' ' ( cd bit-error && -- 2.12.2.943.g91cb50fd8 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] sha1_loose_object_info: return error for corrupted objects 2017-04-01 8:05 ` [PATCH 1/2] sha1_loose_object_info: return error for corrupted objects Jeff King @ 2017-04-01 17:47 ` Junio C Hamano 0 siblings, 0 replies; 15+ messages in thread From: Junio C Hamano @ 2017-04-01 17:47 UTC (permalink / raw) To: Jeff King; +Cc: Lars Schneider, Git List Jeff King <peff@peff.net> writes: > When sha1_loose_object_info() finds that a loose object file > cannot be stat(2)ed or mmap(2)ed, it returns -1 to signal an > error to the caller. However, if it found that the loose > object file is corrupt and the object data cannot be used > from it, it stuffs OBJ_BAD into "type" field of the > object_info, but returns zero (i.e., success), which can > confuse callers. Thanks for a nice analysis and a fix. ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 2/2] index-pack: detect local corruption in collision check 2017-04-01 8:03 ` Jeff King 2017-04-01 8:05 ` [PATCH 1/2] sha1_loose_object_info: return error for corrupted objects Jeff King @ 2017-04-01 8:09 ` Jeff King 2017-04-01 18:04 ` Junio C Hamano 1 sibling, 1 reply; 15+ messages in thread From: Jeff King @ 2017-04-01 8:09 UTC (permalink / raw) To: Junio C Hamano; +Cc: Lars Schneider, Git List When we notice that we have a local copy of an incoming object, we compare the two objects to make sure we haven't found a collision. Before we get to the actual object bytes, though, we compare the type and size from sha1_object_info(). If our local object is corrupted, then the type will be OBJ_BAD, which obviously will not match the incoming type, and we'll report "SHA1 COLLISION FOUND" (with capital letters and everything). This is confusing, as the problem is not a collision but rather local corruption. We should reoprt that instead (just like we do if reading the rest of the object content fails a few lines later). Note that we _could_ just ignore the error and mark it as a non-collision. That would let you "git fetch" to replace a corrupted object. But it's not a very reliable method for repairing a repository. The earlier want/have negotiation tries to get the other side to omit objects we already have, and it would not realize that we are "missing" this corrupted object. So we're better off complaining loudly when we see corruption, and letting the user take more drastic measures to repair (like making a full clone elsewhere and copying the pack into place). Note that the test sets transfer.unpackLimit in the receiving repository so that we use index-pack (which is what does the collision check). Normally for such a small push we'd use unpack-objects, which would simply try to write the loose object, and discard the new one when we see that there's already an old one. Signed-off-by: Jeff King <peff@peff.net> --- I was surprised to see that unpack-objects doesn't do the same collision check. I think it relies on the loose-object precedence. But that seems like it misses a case: if you have a packed version of an object and receive a small fetch or push, we may unpack a duplicate object to its loose format without doing any kind of collision check. We may want to tighten that up. In the long run, I'd love to see unpack-objects go away in favor of teaching index-pack how to write loose objects. The two had very similar code once upon a time, but index-pack has grown a lot of feature and bugfixes over the years. builtin/index-pack.c | 2 ++ t/t1060-object-corruption.sh | 17 +++++++++++++++++ 2 files changed, 19 insertions(+) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 88d205f85..9f17adb37 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -809,6 +809,8 @@ static void sha1_object(const void *data, struct object_entry *obj_entry, unsigned long has_size; read_lock(); has_type = sha1_object_info(sha1, &has_size); + if (has_type < 0) + die(_("cannot read existing object info %s"), sha1_to_hex(sha1)); if (has_type != type || has_size != size) die(_("SHA1 COLLISION FOUND WITH %s !"), sha1_to_hex(sha1)); has_data = read_sha1_file(sha1, &has_type, &has_size); diff --git a/t/t1060-object-corruption.sh b/t/t1060-object-corruption.sh index 3a88d79c5..ac1f189fd 100755 --- a/t/t1060-object-corruption.sh +++ b/t/t1060-object-corruption.sh @@ -21,6 +21,14 @@ test_expect_success 'setup corrupt repo' ' cd bit-error && test_commit content && corrupt_byte HEAD:content.t 10 + ) && + git init no-bit-error && + ( + # distinct commit from bit-error, but containing a + # non-corrupted version of the same blob + cd no-bit-error && + test_tick && + test_commit content ) ' @@ -108,4 +116,13 @@ test_expect_failure 'clone --local detects misnamed objects' ' test_must_fail git clone --local misnamed misnamed-checkout ' +test_expect_success 'fetch into corrupted repo with index-pack' ' + ( + cd bit-error && + test_must_fail git -c transfer.unpackLimit=1 \ + fetch ../no-bit-error 2>stderr && + test_i18ngrep ! -i collision stderr + ) +' + test_done -- 2.12.2.943.g91cb50fd8 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] index-pack: detect local corruption in collision check 2017-04-01 8:09 ` [PATCH 2/2] index-pack: detect local corruption in collision check Jeff King @ 2017-04-01 18:04 ` Junio C Hamano 0 siblings, 0 replies; 15+ messages in thread From: Junio C Hamano @ 2017-04-01 18:04 UTC (permalink / raw) To: Jeff King; +Cc: Lars Schneider, Git List Jeff King <peff@peff.net> writes: > is not a collision but rather local corruption. We should > reoprt that instead (just like we do if reading the rest of > the object content fails a few lines later). s/reoprt/report/ (locally amended while queuing). > We may want to tighten that up. In the long run, I'd love to see > unpack-objects go away in favor of teaching index-pack how to write > loose objects. The two had very similar code once upon a time, but > index-pack has grown a lot of feature and bugfixes over the years. This sounds like a nice future to aspire to reach. Besides having to maintain two separate executables, another downside of the current arrangement is that we have to make the decision between streaming to a single pack and exploding into loose objects too early and base our decision solely on the object count. If we are moving to a single receiver, it is conceivable to switch to a scheme based on the size of the incoming pack (e.g. spool the first N MB and if we run out we write out loose objects, otherwise create a new pack, and dump the spooled part and stream the rest into it while indexing). Thanks. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: SHA1 collision in production repo?! (probably not) 2017-03-31 17:45 ` Jeff King 2017-03-31 17:48 ` Jeff King 2017-03-31 21:16 ` Junio C Hamano @ 2017-09-12 16:18 ` Lars Schneider 2017-09-12 17:38 ` Jeff King 2 siblings, 1 reply; 15+ messages in thread From: Lars Schneider @ 2017-09-12 16:18 UTC (permalink / raw) To: Jeff King; +Cc: Junio C Hamano, Git List > On 31 Mar 2017, at 19:45, Jeff King <peff@peff.net> wrote: > > On Fri, Mar 31, 2017 at 10:35:06AM -0700, Junio C Hamano wrote: > >> Lars Schneider <larsxschneider@gmail.com> writes: >> >>> Hi, >>> >>> I just got a report with the following output after a "git fetch" operation >>> using Git 2.11.0.windows.3 [1]: >>> >>> remote: Counting objects: 5922, done. >>> remote: Compressing objects: 100% (14/14), done. >>> error: inflate: data stream error (unknown compression method) >>> error: unable to unpack 6acd8f279a8b20311665f41134579b7380970446 header >>> fatal: SHA1 COLLISION FOUND WITH 6acd8f279a8b20311665f41134579b7380970446 ! >>> fatal: index-pack failed >>> >>> I would be really surprised if we discovered a SHA1 collision in a production >>> repo. My guess is that this is somehow triggered by a network issue (see data >>> stream error). Any tips how to debug this? >> >> Perhaps the first thing to do is to tweak the messages in builtin/index-pack.c >> to help you identify which one of identical 5 messages is firing. >> >> My guess would be that the code saw an object that came over the >> wire, hashed it to learn that its object name is >> 6acd8f279a8b20311665f41134579b7380970446, noticed that it locally >> already has the object with the same name, and tried to make sure >> they have identical contents (otherwise, what came over the wire is >> a successful second preimage attack). But your loose object on disk >> you already had was corrupt and didn't inflate correctly when >> builtin/index-pack.c::compare_objects() or check_collision() tried >> to. The code saw no data, or truncated data, or >> whatever---something different from what the other data that hashed >> down to 6acd8..., and reported a collision when there is no >> collision. > > My guess is the one in compare_objects(). The "unable to unpack" error > comes from sha1_loose_object_info(). We'd normally then follow up with > read_sha1_file(), which would generate its own set of errors. > > But if open_istream() got a bogus value for the object size (but didn't > correctly report an error), then it would probably quietly return 0 > bytes from read_istream() later. > > I suspect this may improve things, but I haven't dug deeper to see if > there are unwanted side effects, or if there are other spots that need > similar treatment. > > diff --git a/sha1_file.c b/sha1_file.c > index 43990dec7..38411f90b 100644 > --- a/sha1_file.c > +++ b/sha1_file.c > @@ -2952,7 +2952,7 @@ static int sha1_loose_object_info(const unsigned char *sha1, > if (status && oi->typep) > *oi->typep = status; > strbuf_release(&hdrbuf); > - return 0; > + return status; > } > > int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, unsigned flags) Hi Peff, we are seeing this now in Git 2.14.1: ... error: inflate: data stream error (unknown compression method) error: unable to unpack 7b513f98a66ef9488e516e7abbc246438597c6d5 header error: inflate: data stream error (unknown compression method) error: unable to unpack 7b513f98a66ef9488e516e7abbc246438597c6d5 header fatal: loose object 7b513f98a66ef9488e516e7abbc246438597c6d5 (stored in .git/objects/7b/513f98a66ef9488e516e7abbc246438597c6d5) is corrupt fatal: The remote end hung up unexpectedly I guess this means your fix [1] works properly :-) At some point I will try to explore a retry mechanism for these cases. Cheers, Lars [1] https://github.com/git/git/commit/93cff9a978e1c177ac3e889867004a56773301b2 ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: SHA1 collision in production repo?! (probably not) 2017-09-12 16:18 ` SHA1 collision in production repo?! (probably not) Lars Schneider @ 2017-09-12 17:38 ` Jeff King 0 siblings, 0 replies; 15+ messages in thread From: Jeff King @ 2017-09-12 17:38 UTC (permalink / raw) To: Lars Schneider; +Cc: Junio C Hamano, Git List On Tue, Sep 12, 2017 at 06:18:32PM +0200, Lars Schneider wrote: > we are seeing this now in Git 2.14.1: > > ... > error: inflate: data stream error (unknown compression method) > error: unable to unpack 7b513f98a66ef9488e516e7abbc246438597c6d5 header > error: inflate: data stream error (unknown compression method) > error: unable to unpack 7b513f98a66ef9488e516e7abbc246438597c6d5 header > fatal: loose object 7b513f98a66ef9488e516e7abbc246438597c6d5 (stored in .git/objects/7b/513f98a66ef9488e516e7abbc246438597c6d5) is corrupt > fatal: The remote end hung up unexpectedly > > I guess this means your fix [1] works properly :-) Oh, good. :) > At some point I will try to explore a retry mechanism for these cases. I don't think we can generally retry loose-object failures. We use copies from packs first, and then loose. So a corrupt loose can fallback to another pack or to loose, but not the other way around (because we would never look at the loose if we had a good copy elsewhere). Though in your particular case, if I recall, you're receiving the object over the network and the corrupted copy is in the way. So right now the recovery process is: 1. Notice the commit message. 2. Run git-fsck to notice that we don't really[1] need the object. 3. Run `rm .git/objects/7b/513f...` 4. Re-run the fetch. But in theory we should be able to say "oh, we don't _really_ have that, the collision test isn't necessary" and then overwrite it. I actually thought that's what would happen now (has_sha1_file() would return an error), but I guess for what we need, it just does a stat() and calls it a day, not realizing we ought to be overwriting. -Peff [1] git-fsck will actually complain if reflogs point to the object, and we can always expire those in a corrupted repo. So possibly what you want to know is whether it's reachable from actual refs. Of course this whole check is optional. If the object's corrupted, it's corrupted. But I get nervous calling `rm` on something that _could_ be precious (say it's just a single-bit error that could be recovered). But if you have a known-good copy incoming, that's less of an issue. ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2017-09-12 17:38 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-03-31 16:05 SHA1 collision in production repo?! (probably not) Lars Schneider 2017-03-31 17:27 ` Jeff King 2017-03-31 17:35 ` Junio C Hamano 2017-03-31 17:45 ` Jeff King 2017-03-31 17:48 ` Jeff King 2017-03-31 18:19 ` Junio C Hamano 2017-03-31 18:42 ` Jeff King 2017-03-31 21:16 ` Junio C Hamano 2017-04-01 8:03 ` Jeff King 2017-04-01 8:05 ` [PATCH 1/2] sha1_loose_object_info: return error for corrupted objects Jeff King 2017-04-01 17:47 ` Junio C Hamano 2017-04-01 8:09 ` [PATCH 2/2] index-pack: detect local corruption in collision check Jeff King 2017-04-01 18:04 ` Junio C Hamano 2017-09-12 16:18 ` SHA1 collision in production repo?! (probably not) Lars Schneider 2017-09-12 17:38 ` Jeff King
Code repositories for project(s) associated with this public inbox https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).