git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "René Scharfe" <l.s.r@web.de>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 1/2] archive-tar: write extended headers for file sizes >= 8GB
Date: Tue, 21 Jun 2016 22:54:50 +0200	[thread overview]
Message-ID: <5769A99A.9040508@web.de> (raw)
In-Reply-To: <20160621155920.GA7549@sigill.intra.peff.net>

Am 21.06.2016 um 17:59 schrieb Jeff King:
> On Tue, Jun 21, 2016 at 12:54:11AM +0200, René Scharfe wrote:
>
>>> Unfortunately, it's quite an expensive test to run. For one
>>> thing, unless your filesystem supports files with holes, it
>>> takes 64GB of disk space (you might think piping straight to
>>> `hash-object --stdin` would be better, but it's not; that
>>> tries to buffer all 64GB in RAM!). Furthermore, hashing and
>>> compressing the object takes several minutes of CPU time.
>>>
>>> We could ship just the resulting compressed object data as a
>>> loose object, but even that takes 64MB. So sadly, this code
>>> path remains untested in the test suite.
>>
>> If we could set the limit to a lower value than 8GB for testing then we
>> could at least check if the extended header is written, e.g. if ustar_size()
>> could be convinced to return 0 every time using a hidden command line
>> parameter or an environment variable or something better.
>
> Yes, we could do that, though I think it loses most of the value of the
> test. We can check that if we hit an arbitrary value we generate the pax
> header, but I think what we _really_ care about is: did we generate an
> output that somebody else's tar implementation can handle.

I agree with the last point, but don't see how that diminishes the
value of such a test.  If we provide file sizes only through extended
headers (the normal header field being set to 0) and we can extract
files with correct sizes then tar must have interpreted those header
as intended, right?

(Or it just guessed the sizes by searching for the next header magic,
but such a fallback won't be accurate for files ending with NUL
characters due to NUL-padding, so we just have to add such a file.)

René


-- >8 --
Subject: archive-tar: test creation of pax extended size headers

---
The value 120 is magic; we need it to pass the tests.  That's
because prepare_header() is used for building extended header
records as well and we don't create extended headers for extended
headers (not sure if that would work anyway), so they simply
vanish when they're over the limit as their size field is set to
zero.

 archive-tar.c       | 7 ++++++-
 t/t5000-tar-tree.sh | 7 +++++++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/archive-tar.c b/archive-tar.c
index f53e61c..fbbc4cc 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -14,6 +14,7 @@ static char block[BLOCKSIZE];
 static unsigned long offset;
 
 static int tar_umask = 002;
+static unsigned long ustar_size_max = 077777777777UL;
 
 static int write_tar_filter_archive(const struct archiver *ar,
 				    struct archiver_args *args);
@@ -179,7 +180,7 @@ static size_t get_path_prefix(const char *path, size_t pathlen, size_t maxlen)
 
 static inline unsigned long ustar_size(uintmax_t size)
 {
-	if (size <= 077777777777UL)
+	if (size <= ustar_size_max)
 		return size;
 	else
 		return 0;
@@ -412,6 +413,10 @@ static int git_tar_config(const char *var, const char *value, void *cb)
 		}
 		return 0;
 	}
+	if (!strcmp(var, "tar.ustarsizemax")) {
+		ustar_size_max = git_config_ulong(var, value);
+		return 0;
+	}
 
 	return tar_filter_config(var, value, cb);
 }
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 4b68bba..03bb4c7 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -102,6 +102,7 @@ test_expect_success \
      echo long filename >a/four$hundred &&
      mkdir a/bin &&
      test-genrandom "frotz" 500000 >a/bin/sh &&
+     printf "\0\0\0" >>a/bin/sh &&
      printf "A\$Format:%s\$O" "$SUBSTFORMAT" >a/substfile1 &&
      printf "A not substituted O" >a/substfile2 &&
      if test_have_prereq SYMLINKS; then
@@ -157,6 +158,12 @@ test_expect_success 'git-archive --prefix=olde-' '
 
 check_tar with_olde-prefix olde-
 
+test_expect_success !TAR_NEEDS_PAX_FALLBACK 'pax extended size headers' '
+	git -c tar.ustarsizemax=120 archive HEAD >extended_size_header.tar
+'
+
+check_tar extended_size_header
+
 test_expect_success 'git archive on large files' '
     test_config core.bigfilethreshold 1 &&
     git archive HEAD >b3.tar &&
-- 
2.9.0


  parent reply	other threads:[~2016-06-21 21:01 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-16  4:35 [PATCH 0/2] friendlier handling of overflows in archive-tar Jeff King
2016-06-16  4:37 ` [PATCH 1/2] archive-tar: write extended headers for file sizes >= 8GB Jeff King
2016-06-20 22:54   ` René Scharfe
2016-06-21 15:59     ` Jeff King
2016-06-21 16:02       ` Jeff King
2016-06-21 20:42       ` René Scharfe
2016-06-21 20:57         ` René Scharfe
2016-06-21 21:04           ` Jeff King
2016-06-22  5:46             ` René Scharfe
2016-06-21 21:02         ` Jeff King
2016-06-22  5:46           ` René Scharfe
2016-06-23 19:21             ` Jeff King
2016-06-21 20:54       ` René Scharfe [this message]
2016-06-21 19:44   ` Robin H. Johnson
2016-06-21 20:57     ` Jeff King
2016-06-16  4:37 ` [PATCH 2/2] archive-tar: write extended headers for far-future mtime Jeff King
2016-06-20 22:54   ` René Scharfe
2016-06-22  5:46     ` René Scharfe
2016-06-23 19:22       ` Jeff King
2016-06-23 21:38         ` René Scharfe
2016-06-23 21:39           ` Jeff King
2016-06-16 17:55 ` [PATCH 0/2] friendlier handling of overflows in archive-tar Junio C Hamano
2016-06-21 16:16 ` Jeff King
2016-06-21 16:16   ` [PATCH v2 1/2] archive-tar: write extended headers for file sizes >= 8GB Jeff King
2016-06-21 16:17   ` [PATCH v2 2/2] archive-tar: write extended headers for far-future mtime Jeff King
2016-06-21 18:43   ` [PATCH 0/2] friendlier handling of overflows in archive-tar Junio C Hamano
2016-06-23 23:15   ` [PATCH v3] " Jeff King
2016-06-23 23:20     ` [PATCH v3 1/4] t5000: test tar files that overflow ustar headers Jeff King
2016-06-23 23:31       ` Jeff King
2016-06-24 16:38       ` Johannes Sixt
2016-06-24 16:46         ` Jeff King
2016-06-24 17:05           ` Johannes Sixt
2016-06-24 19:39             ` [PATCH 0/4] portable signal-checking in tests Jeff King
2016-06-24 19:43               ` [PATCH 1/4] tests: factor portable signal check out of t0005 Jeff King
2016-06-24 20:52                 ` Johannes Sixt
2016-06-24 21:05                   ` Jeff King
2016-06-24 21:32                     ` Johannes Sixt
2016-06-24 19:44               ` [PATCH 2/4] t0005: use test_match_signal as appropriate Jeff King
2016-06-24 19:45               ` [PATCH 3/4] test_must_fail: use test_match_signal Jeff King
2016-06-24 19:45               ` [PATCH 4/4] t/lib-git-daemon: " Jeff King
2016-06-24 19:48               ` [PATCH 0/4] portable signal-checking in tests Jeff King
2016-06-24 18:56       ` [PATCH v3 1/4] t5000: test tar files that overflow ustar headers Junio C Hamano
2016-06-24 19:07         ` Jeff King
2016-06-24 19:44           ` Junio C Hamano
2016-06-24 20:58           ` Jeff King
2016-06-24 22:41             ` Junio C Hamano
2016-06-24 23:22               ` Jeff King
2016-06-24 20:58           ` Eric Sunshine
2016-06-24 21:09             ` Jeff King
2016-06-23 23:21     ` [PATCH v3 2/4] archive-tar: write extended headers for file sizes >= 8GB Jeff King
2016-06-24 19:01       ` Junio C Hamano
2016-06-24 19:10         ` Jeff King
2016-06-24 19:45           ` Junio C Hamano
2016-06-24 19:46             ` Jeff King
2016-06-23 23:21     ` [PATCH v3 3/4] archive-tar: write extended headers for far-future mtime Jeff King
2016-06-24 19:06       ` Junio C Hamano
2016-06-24 19:16         ` Jeff King
2016-06-23 23:21     ` [PATCH v3 4/4] archive-tar: drop return value Jeff King
2016-06-24 11:49       ` Remi Galan Alfonso
2016-06-24 13:13         ` Jeff King
2016-06-24 19:10           ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5769A99A.9040508@web.de \
    --to=l.s.r@web.de \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).