git@vger.kernel.org mailing list mirror (one of many)
From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"René Scharfe" <l.s.r@web.de>,
	"Robin H. Johnson" <robbat2@gentoo.org>
Subject: Re: [PATCH v3 1/4] t5000: test tar files that overflow ustar headers
Date: Thu, 23 Jun 2016 19:31:15 -0400
Message-ID: <20160623233114.GA5605@sigill.intra.peff.net>
In-Reply-To: <20160623232041.GA3668@sigill.intra.peff.net>

On Thu, Jun 23, 2016 at 07:20:44PM -0400, Jeff King wrote:

> I'm still not excited about the 64MB write, just because it's awfully
> heavyweight for such a trivial test. It runs pretty fast on my RAM disk,
> but maybe not on other people's system.
> 
> I considered but didn't explore two other options:
> 
>   1. I couldn't convince zlib to write a smaller file (this is done with
>      core.compression=9). But I'm not sure if that's inherent to the
>      on-disk format, or simply the maximum size of a deflate block.
> 
>      So it's possible that one could hand-roll zlib data that says "I'm
>      64GB" but is only a few bytes long.
> 
>   2. We don't ever want to see the whole 64GB, of course; we want to
>      stream it out and only care about the header (as an aside, this
>      makes a wonderful test that we are hitting the streaming code path,
>      as it's unlikely to work without it :) ).
> 
>      So another option would be to include a truncated file that claims
>      to be 64GB, and has only the first 256kb or something worth of data
>      (which should deflate down to almost nothing).
> 
>      git-fsck wouldn't work, of course, but we don't need to run it.
>      Other bits of git might complain, but our plan is for git to get
>      SIGPIPE before hitting that point anyway.
> 
>      So that seems pretty easy, but it is potentially flaky.

Writing that convinced me that (2) is actually quite a sane way to go.
The patch below seems to work.
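
As a rough sketch of what option (2) amounts to, the fixture can be
produced by keeping only a prefix of the full loose object. The file
names below are hypothetical stand-ins (the real object would live under
.git/objects/19/), and the stand-in input is only 64KB of zeroes so the
snippet runs anywhere. The 2048-byte prefix length is the value this
patch uses.

```shell
# Hypothetical file names; not the paths used by the patch.
full=full-object.tmp
truncated=truncated-object.tmp
keep=2048

# Stand-in for the full loose object: 64KB of zeroes.
dd if=/dev/zero of="$full" bs=1024 count=64 2>/dev/null

# Keep only the first $keep bytes of it.
dd if="$full" of="$truncated" bs="$keep" count=1 2>/dev/null

# Report the size of the truncated copy, then clean up.
size=$(wc -c <"$truncated" | tr -d ' ')
echo "truncated copy is $size bytes"
rm -f "$full" "$truncated"
```

The result is deliberately not a valid object (git-fsck would reject it);
it only has to inflate far enough to cover what the test reads.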

I arbitrarily picked the first 2048 bytes of the loose object. That's
1/32768 of the original. If we assume the compression ratio is stable
through the file (and it should be; the file is all zeroes), that should
generate 2MB of data should we need it (way more than we feed to our
"head -c" invocation).
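
The arithmetic can be checked with a few lines of shell. This is only a
sanity check under the assumptions stated above: the full loose object is
about 64MB, the blob is about 64GB, and the compression ratio is uniform.
The variable names are made up for illustration. Sizes are kept in KiB
where possible so the math fits in 32-bit shell arithmetic.

```shell
loose_bytes=$((64 * 1024 * 1024))   # full loose object, about 64MB
kept_bytes=2048                     # prefix kept in the test fixture
blob_kib=$((64 * 1024 * 1024))      # 64GB blob, expressed in KiB

# Fraction of the loose object we keep: 1/ratio.
ratio=$((loose_bytes / kept_bytes))

# With a uniform compression ratio, the same fraction of the blob inflates.
inflated_kib=$((blob_kib / ratio))

echo "kept 1/$ratio of the loose object"
echo "expect about $inflated_kib KiB of inflated data"
```

That prints a ratio of 32768 and 2048 KiB, i.e. the 2MB mentioned above.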

This patch is on top of the whole series just to illustrate it. Doing it
for real will involve squashing it into the first patch (and adjusting
the commit message), and then handling the minor rebase conflicts. I'll
hold off on a re-roll until I get any comments on v3.

-Peff

---
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 07e0bdc..e542938 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -339,19 +339,12 @@ test_lazy_prereq TAR_HUGE '
 	test_cmp expect actual
 '
 
-# Likewise, we need bunzip for the 64GB git object.
-test_lazy_prereq BUNZIP '
-	bunzip2 --version
-'
-
-test_expect_success BUNZIP 'set up repository with huge blob' '
+test_expect_success 'set up repository with huge blob' '
 	obj_d=19 &&
 	obj_f=f9c8273ec45a8938e6999cb59b3ff66739902a &&
 	obj=${obj_d}${obj_f} &&
 	mkdir -p .git/objects/$obj_d &&
-	bunzip2 -c \
-		<"$TEST_DIRECTORY"/t5000/$obj.bz2 \
-		>.git/objects/$obj_d/$obj_f &&
+	cp "$TEST_DIRECTORY"/t5000/$obj .git/objects/$obj_d/$obj_f &&
 	rm -f .git/index &&
 	git update-index --add --cacheinfo 100644,$obj,huge &&
 	git commit -m huge
@@ -359,7 +352,7 @@ test_expect_success BUNZIP 'set up repository with huge blob' '
 
 # We expect git to die with SIGPIPE here (otherwise we
 # would generate the whole 64GB).
-test_expect_success BUNZIP 'generate tar with huge size' '
+test_expect_success 'generate tar with huge size' '
 	{
 		git archive HEAD
 		echo $? >exit-code
@@ -368,7 +361,7 @@ test_expect_success BUNZIP 'generate tar with huge size' '
 	test_cmp expect exit-code
 '
 
-test_expect_success BUNZIP,TAR_HUGE 'system tar can read our huge size' '
+test_expect_success TAR_HUGE 'system tar can read our huge size' '
 	echo 68719476737 >expect &&
 	tar_info huge.tar | cut -d" " -f1 >actual &&
 	test_cmp expect actual
diff --git a/t/t5000/19f9c8273ec45a8938e6999cb59b3ff66739902a b/t/t5000/19f9c8273ec45a8938e6999cb59b3ff66739902a
new file mode 100644
index 0000000000000000000000000000000000000000..5cbe9ec312bfd7b7e0398ca281e9d42848743704
GIT binary patch
literal 2048
zcmb=p_2!@<FM|RPgTa!7MhCMEn`g89mv_5$t@dj2a|><dbTfugFd71*Aut*OqaiRF
L0;3@?%t8PFWtt9Z

literal 0
HcmV?d00001

diff --git a/t/t5000/19f9c8273ec45a8938e6999cb59b3ff66739902a.bz2 b/t/t5000/19f9c8273ec45a8938e6999cb59b3ff66739902a.bz2
deleted file mode 100644
index 9619fd3c5f6f345a40aba1b91807bb4d937fc51c..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001

literal 578
zcmV-I0=@l0T4*^jL0KkKS$mI#Bmk}@egE{{00qbxVL(9uNQgj6FakgbL{P{8umA%P
z1cU%-zyaKJ3Zno3&;S`U000^T007Vc88iR@8UZy`L7)Mk0iXZ?8a+UzR8!JvwH}iK
zXvkv}!$8#Dpwk&jhdD}y=}Ly#N{6=5N`@-9@zf#~Y}ml6A5xVKIEqv?6sUZ+ic~-0
z3bD}&yn;flel6M|SGh`t%>^nMd9)O$Wu&Noa%<D9r9(fbH_W9&qsPxi@k)lGl@5I+
zDj0prR6BE&sCksAbrh&!tffP{F-nI-prpRlLJ+$#Qc)DvQlXBLp@B+<1u7d*3ZhIX
zg<RA^t&a>SgnP<VG8CwK>ira`Ve7_Hm1V`0kgE@zR6?%0tYir&brh(3dz7ebDNy(^
zL0KkKS?}&rK>-rd{r~jXFbzl$fdD`NU<g16aRNX9Kv2K{umFG%1O@;pzyY`b3;;A}
zXu>jRG-;qPzyn5xj3Xw4Mw$Zz)lx%14F-mTL6atf)Gz?ip`!@NpwXs)%#?y?q!Xf$
zP8vZs=>+{8*(#7u93>!3K5|fsT$F-(VMr#rK{}}f`lJ&zAez-6nY%u7(h0VZPg+4e
zX$0a(CY`i`W~3AAPBemQq!ZGRP7dlpJ?RA1nvhKtf@|4GCaOU@$`MOSl~R?f3Ia)6
zDFpYV6Llb)sRYUuLMe6QNGF9Ln9>Q3G=gtZ37;(>om7H;BohijH-{QkNGI(epUDKq
QkWcY<BvXY60gn}ckTCG&H~;_u

