From: Eric Sunshine <sunshine@sunshineco.com>
To: "René Scharfe" <l.s.r@web.de>
Cc: Johannes Schauer <josch@debian.org>,
	Git List <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH 1/3] t5004: test ZIP archives with many entries
Date: Sun, 23 Aug 2015 13:45:26 -0400
Message-ID: <CAPig+cSNSfpt7gOLvz7P4oDrNF5fTQ38v1pfncJU3h7a6FjMyQ@mail.gmail.com>
In-Reply-To: <trinity-6e67d416-0a61-4e73-9779-63519dd83fdb-1440322151491@3capp-webde-bs47>

On Sun, Aug 23, 2015 at 5:29 AM, "René Scharfe" <l.s.r@web.de> wrote:
> On 23.08.2015 at 07:54, Eric Sunshine wrote:
>> On Sat, Aug 22, 2015 at 3:06 PM, René Scharfe <l.s.r@web.de> wrote:
>>> +test_lazy_prereq ZIPINFO '
>>> +       n=$("$ZIPINFO" "$TEST_DIRECTORY"/t5004/empty.zip | sed -n "2s/.* //p")
>>> +       test "x$n" = "x0"
>>> +'
>>
>> Unfortunately, this sed expression isn't portable due to dissimilar
>> output of various zipinfo implementations. On Linux, the output of
>> zipinfo is:
>>
>>      $ zipinfo t/t5004/empty.zip
>>      Archive:  t/t5004/empty.zip
>>      Zip file size: 62 bytes, number of entries: 0
>>      Empty zipfile.
>>      $
>>
>> however, on Mac OS X:
>>
>>      $ zipinfo t/t5004/empty.zip
>>      Archive:  t/t5004/empty.zip   62 bytes   0 files
>>      Empty zipfile.
>>      $
>>
>> and on FreeBSD, the zipinfo command seems to have been removed
>> altogether in favor of "unzip -Z" (emulate zipinfo).
>
> I suspected that zipinfo's output might be formatted differently on
> different platforms and tried to guard against it by checking for the
> number zero there. Git's ZIP file creation is platform independent
> (modulo bugs), so having a test run at least somewhere should
> suffice. In theory.
>
> We could add support for the one-line-summary variant on OS X easily,
> though.

Probably, although it's looking like testing on Mac OS X won't be
fruitful (see below).
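
For reference, a minimal sketch of how both summary formats quoted
above could be parsed. The helper name zip_entry_count is hypothetical
(it is not part of the patch), and the patterns are derived only from
the Linux and Mac OS X samples shown in this thread; other zipinfo
builds may format the summary differently.

```shell
# Hypothetical helper, not part of the patch: read zipinfo output on
# stdin and print the entry count from the summary line.
zip_entry_count () {
	sed -n \
		-e 's/^Zip file size: .* number of entries: \([0-9][0-9]*\).*$/\1/p' \
		-e 's/^Archive: .* bytes  *\([0-9][0-9]*\) file.*$/\1/p'
}
```

It would be used as '"$ZIPINFO" some.zip | zip_entry_count'. The first
expression matches the two-line Linux header, the second the one-line
Mac OS X summary; lines matching neither print nothing.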

>> One might hope that "unzip -Z" would be a reasonable replacement for
>> zipinfo, however, it is apparently only partially implemented on
>> FreeBSD, and requires that -1 be passed, as well. Even with "unzip -Z
>> -1", there are issues. The output on Linux and Mac OS X is:
>>
>>      $ unzip -Z -1 t/t5004/empty.zip
>>      Empty zipfile.
>>      $
>>
>> but FreeBSD differs:
>>
>>      $ unzip -Z -1 t/t5004/empty.zip
>>      $
>>
>> With a non-empty zip file, the output is identical on all platforms:
>>
>>      $ unzip -Z -1 twofiles.zip
>>      file1
>>      file2
>>      $
>>
>> So, if you combine that with "wc -l" or test_line_count, you may have
>> a portable and reliable entry counter.
>
> Counting all entries is slow, and more importantly it's not what we
> want. In this test we need the number of entries recorded in the ZIP
> directory, not the actual number of entries found by scanning the
> archive, or the directory.

Ah, right. The commit message did state this clearly enough...

> On Linux "unzip -Z -1 many.zip | wc -l" reports 65792 even before
> adding ZIP64 support; only without -1 we get the interesting numbers
> (specifically with "unzip -Z many.zip | sed -n '2p;$p'"):
>
>     Zip file size: 6841366 bytes, number of entries: 256
>     65792 files, 0 bytes uncompressed, 0 bytes compressed: 0.0%
>
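
The two numbers above are consistent with a 16-bit entry count that
wraps around. A small worked check, assuming the test archive holds
256 directories (00/ through ff/) with 256 files each, as the listing
further down suggests; the variable names are illustrative only:

```shell
# Assumed layout: 256 directories, each containing 256 files.
dirs=256
files=$((256 * 256))
total=$((dirs + files))

# The classic end-of-central-directory record stores the entry count
# in a 16-bit field, so without ZIP64 only the low 16 bits survive.
recorded=$((total % 65536))

echo "actual entries:   $total"
echo "recorded entries: $recorded"
```

This prints 65792 and 256, matching the last line and the second line
of the zipinfo output respectively.
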
>> With these three patches applied, Mac OS X has trouble with 'many.zip':
>>
>>      $ unzip -Z -1 many.zip
>>      warning [many.zip]:  76 extra bytes at beginning or within zipfile
>>        (attempting to process anyway)
>>      error [many.zip]:  reported length of central directory is
>>        -76 bytes too long (Atari STZip zipfile?  J.H.Holm ZIPSPLIT 1.1
>>        zipfile?).  Compensating...
>>      00/
>>      00/00
>>      ...
>>      ff/ff
>>      error: expected central file header signature not found (file
>>        #65793). (please check that you have transferred or created the
>>        zipfile in the appropriate BINARY mode and that you have compiled
>>        UnZip properly)
>>
>> And FreeBSD doesn't like it either:
>>
>>      $ unzip -Z -1 many.zip
>>      unzip: Invalid central directory signature
>>      $
>>
>
> Looks like they don't support ZIP64. Or I got some of the fields wrong
> after all.

A zip file with more than 65536 entries, created on Mac OS X with the
native "zip" command and then given to "unzip" or "zipinfo", produces
exactly the same warnings/errors as above (including the bits about
"76 extra bytes" and "-76 bytes too long"), so it doesn't seem to be a
problem with your implementation.
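
If the test should run only where the unzip tool understands ZIP64,
one possible guard is sketched below. It assumes an Info-ZIP build
whose "unzip -v" output lists ZIP64_SUPPORT among its compile options;
that is an assumption about the tool, not something verified on the
platforms discussed here, and the helper name is hypothetical.

```shell
# Hypothetical helper: read the output of "unzip -v" on stdin and
# succeed only if it advertises ZIP64 support.
has_zip64_support () {
	grep -q ZIP64_SUPPORT
}
```

It would be used as 'unzip -v | has_zip64_support'. Other unzip
implementations may interpret -v differently, in which case the check
simply fails and the test would be skipped.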

> https://en.wikipedia.org/wiki/Zip_%28file_format%29#ZIP64 says: "OS X
> Yosemite does support the creation of ZIP64 archives, but does not
> support unzipping these archives using the shipped unzip command-line
> utility or graphical Archive Utility.[citation needed]".
>
> How does unzip react to a ZIP file with more than 65535 entries that
> was created natively on these platforms? And what does zipinfo (a real
> one, without -1) report at the top for such files?

On Mac OS X, unzip does extract all the files (although it complains
as noted above). zipinfo caps out at 65535 when reporting the number
of files (although it lists them all fine). With the warnings/errors
filtered out for clarity:

    $ zipinfo biggy.zip
    Archive:  biggy.zip   9642874 bytes   65535 files
    ...
