From: Philip Oakley <email@example.com>
To: Jason Hatton <firstname.lastname@example.org>,
    Junio C Hamano <email@example.com>
Cc: "René Scharfe" <firstname.lastname@example.org>,
    "email@example.com" <firstname.lastname@example.org>
Subject: Re: [PATCH] Prevent git from rehashing 4GBi files
Date: Tue, 10 May 2022 23:45:14 +0100
Message-ID: <email@example.com>
In-Reply-To: <CY4PR16MB1655F83010A128D4ED67C7EDAFC49@CY4PR16MB1655.namprd16.prod.outlook.com>

On 07/05/2022 03:15, Jason Hatton wrote:
>> Philip Oakley <firstname.lastname@example.org> writes:
>>
>>>> This may treat non-zero multiple of 4GiB as "not racy", but has
>>>> anybody double checked the concern René brought up earlier that a
>>>> 4GiB file that was added and then got rewritten to 2GiB within the
>>>> same second would suddenly start getting treated as not racy?
>>> This is the pre-existing problem, that ~1 in 2^31 size changes might
>>> not get noticed for size change. The 0 byte / 4GiB change is an
>>> identical issue, as is changing from 3 bytes to 4GiB+3 bytes, etc.,
>>> so that's no worse than before (well maybe twice as 'unlikely').
>> OK, it added one more case to 2^32-1 existing cases, I guess.
>>
>>>> The patch (the final version of it anyway) needs to be accompanied
>>>> by a handful of test additions to tickle corner cases like that.
>>> They'd be protected by the EXPENSIVE prerequisite I would assume.
>> Oh, absolutely. Thanks for spelling that out.
> I have been testing out the patch a bit and have good and (mostly) bad
> news.
>
> What works using a munge value of 1.
>
> $ git add
> $ git status
>
> Racy seems to work.
>
> $ touch .git/index 4GiB  # 4GiB is now racy
> $ git status             # Git will rehash the racy file
> $ git status             # Git cached the file. Second status is fast.
>
> What doesn't work.
>
> $ git checkout 4GiB
> $ fatal: packed object is corrupt!
>
> Using a munge value of 1<<31 causes even more problems. The file hash
> in the index for 4GiB files (git ls-files -s --debug) is set to the
> zero file hash.
>
> I looked up and down the code base and couldn't figure out how the
> munged value was leaking out of read-cache.c and breaking things. Most
> of the code I found tends to use stat and then convert that to a
> size_t, not using the munged unsigned int at all.
>
> Maybe someone else will have better luck. This seems over my head :(
>
> Thanks
> --
> Jason
>
Could the problem be that 1<<31, when held in a 32-bit long, is MAX_NEG
rather than MAX_POS? And would the size need to be positive to be an
acceptable file size?

(The code is a bit of a mish-mash on the Windows LLP64 side, where long
is only 32 bits).

Philip

Apologies for the terseness.