git@vger.kernel.org mailing list mirror (one of many)
From: Thomas Braun <thomas.braun@virtuell-zuhause.de>
To: git-for-windows@googlegroups.com, git@vger.kernel.org
Subject: Creates unreadable pack files on platforms with sizeof(unsigned long) != sizeof(uintmax_t)
Date: Sun, 20 Mar 2016 22:20:02 +0100
Message-ID: <56EF1402.4050708@virtuell-zuhause.de>

Hi,

While playing around with some git settings I ran into a problem on Windows x64
using the 64-bit build of git.
The problem is not restricted to that platform: it affects any platform where
sizeof(unsigned long) != sizeof(uintmax_t).

Recipe to break:
mkdir test &&
cd test &&
truncate -s 5g largefile.bin &&
git init &&
git add . &&
git commit -m "changes" &&
git fsck

Result:
Initialized empty Git repository in E:/ttest/.git/
[master (root-commit) d19adaf] changes
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 largefile.bin
Checking object directories: 100% (256/256), done.
error: bad object header
error: unknown object type -1 at offset 12 in
.git/objects/pack/pack-25250ce5c176078ba51a42fee177c2f03f8845ca.pack
error: cannot unpack 0be2be10a4c8764f32c4bf372a98edc731a4b204 from
.git/objects/pack/pack-25250ce5c176078ba51a42fee177c2f03f8845ca.pack at offset
12
Checking objects: 100% (1/1), done.

So I've created a repository which I can no longer use.

The "bad object header" error comes from unpack_object_header_buffer() in sha1_file.c.
On Windows x64 bitsizeof(long) is 32, and while decoding the header of the 5 GB blob
shift reaches that value, so the function gives up.
unpack_object_header_buffer() also reports the size as an unsigned long, which is only
32 bits wide there. [1, 2, 3]
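
To make the failing condition concrete, here is a minimal sketch of the decoding loop. It is a simplified stand-in, not the actual git code: the name decode_header_sketch and the size_bits parameter (which models bitsizeof(unsigned long)) are assumptions made for illustration, and the object type bits are ignored.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified model of a pack object header decoder.
 * size_bits stands in for bitsizeof(unsigned long): 32 on Windows x64,
 * 64 on typical 64-bit Linux. Returns the number of bytes consumed,
 * or 0 when the header cannot be represented ("bad object header").
 */
static size_t decode_header_sketch(const unsigned char *buf, size_t len,
                                   unsigned size_bits, uint64_t *sizep)
{
    size_t used = 0;
    unsigned shift = 4;               /* first byte carries 4 size bits */
    unsigned c = buf[used++];         /* caller guarantees len >= 1 */
    uint64_t size = c & 15;

    while (c & 0x80) {                /* continuation bit set */
        if (len <= used || size_bits <= shift) {
            *sizep = 0;               /* next 7 bits would not fit */
            return 0;
        }
        c = buf[used++];
        size += (uint64_t)(c & 0x7f) << shift;
        shift += 7;                   /* 4, 11, 18, 25, 32, ... */
    }
    *sizep = size;
    return used;
}
```

A 5 GiB blob needs a six-byte header (B0 80 80 80 A0 01 in hex). With size_bits set to 32 the sketch rejects it when shift reaches 32; with size_bits set to 64 it consumes all six bytes and yields 5368709120.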

The question of why this was not detected when creating the pack leads to
encode_in_pack_object_header(),
which uses uintmax_t (64 bits wide) for the size.
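
The writing side can be sketched in the same spirit. This is an illustrative model built around the loop visible in the patch context further down, not a copy of the function in pack-write.c; the name encode_header_sketch is hypothetical and the type validation is left out.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch of a pack object header encoder.
 * The size is a uintmax_t, so values beyond 32 bits are written
 * without complaint. Returns the number of bytes stored in hdr.
 */
static size_t encode_header_sketch(unsigned type, uintmax_t size,
                                   unsigned char *hdr)
{
    size_t n = 1;
    unsigned char c = (unsigned char)((type << 4) | (size & 15));

    size >>= 4;                       /* low 4 bits live in the first byte */
    while (size) {
        *hdr++ = c | 0x80;            /* mark "more bytes follow" */
        c = (unsigned char)(size & 0x7f);
        size >>= 7;                   /* 7 size bits per following byte */
        n++;
    }
    *hdr = c;                         /* last byte has no continuation bit */
    return n;
}
```

For a blob (type 3) of 5368709120 bytes this produces six bytes, whereas every size up to 2^32 - 1 fits in five. A reader limited to 32 bits can handle five bytes but not the sixth, which is why the pack is written successfully and only fails on reading.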

unsigned long is also used for file sizes in other places, e.g. in struct object_entry
in pack-objects.h.

The proper solution would be, I guess, to convert file sizes from unsigned long to a type
which is wider on Windows x64 and, in the best case, the same size on Linux. My git code base fu is
rather weak, and the change looks much more involved than a simple search-and-replace.

A first intermediate solution could be to die on pack creation, e.g. in
encode_in_pack_object_header():

diff --git a/pack-write.c b/pack-write.c
index 33293ce..ebb8b0a 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -313,6 +313,9 @@ int encode_in_pack_object_header(enum object_type type, uintmax_t size, unsigned
        if (type < OBJ_COMMIT || type > OBJ_REF_DELTA)
                die("bad type %d", type);

+       if (bitsizeof(unsigned long) != bitsizeof(uintmax_t) && size > (unsigned long) size)
+               die("Cannot handle files this big");
+
        c = (type << 4) | (size & 15);
        size >>= 4;
        while (size) {

With that patch I get

$ ../git add .
fatal: Cannot handle files this big
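
The test in the patch works because casting to a narrower unsigned type drops the high bits, so the value only compares equal to itself if nothing was lost. A small sketch of that idea, using uint32_t as a stand-in for a 32-bit unsigned long so that it behaves the same on every platform (the helper name is made up for this example):

```c
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 if size survives a round trip through
 * a 32-bit unsigned type, 0 if the narrowing cast would truncate it.
 * uint32_t models unsigned long on Windows x64.
 */
static int fits_in_32bit_ulong(uintmax_t size)
{
    return size == (uintmax_t)(uint32_t)size;
}
```

With this helper, 4294967295 (2^32 - 1) fits, while 4294967296 and the 5 GiB size from the recipe do not.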

I know that usually people don't add big binary files to git. But I do, so I care ;)
If this direction sounds reasonable I can provide a proper patch.

Thanks,
Thomas

[1]: https://msdn.microsoft.com/en-us/library/323b6b3k.aspx
[2]: https://msdn.microsoft.com/en-us/library/s3f49ktz.aspx
[3]: http://stackoverflow.com/questions/7607502/sizeoflong-in-64-bit-c
