git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/4] fix pax extended header length calculation
@ 2019-08-17 16:19 René Scharfe
  2019-08-17 16:23 ` [PATCH 1/4] archive-tar: report wrong pax extended header length René Scharfe
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: René Scharfe @ 2019-08-17 16:19 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Junio C Hamano

The code for building pax extended headers has been miscalculating
lengths slightly shorter than powers of 10  since I wrote it in 2006.
That affects entries for paths with a length of 990, 991, 9989, 9990,
9991, 99988 etc. and link targets 4 characters shorter.  Here's a
series for fixing it.

  archive-tar: report wrong pax extended header length
  archive-tar: fix pax extended header length calculation
  archive-tar: use size_t in strbuf_append_ext_header()
  archive-tar: turn length miscalculation warning into BUG

 archive-tar.c                   | 14 ++++++++++----
 t/t5004-archive-corner-cases.sh | 19 +++++++++++++++++++
 2 files changed, 29 insertions(+), 4 deletions(-)

--
2.23.0

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/4] archive-tar: report wrong pax extended header length
  2019-08-17 16:19 [PATCH 0/4] fix pax extended header length calculation René Scharfe
@ 2019-08-17 16:23 ` René Scharfe
  2019-08-17 16:24 ` [PATCH 2/4] archive-tar: fix pax extended header length calculation René Scharfe
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: René Scharfe @ 2019-08-17 16:23 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Junio C Hamano

Extended header entries contain a length value that is a bit tricky to
calculate because it includes its own length (number of decimal digits)
as well.  We get it wrong in corner cases.  Add a check, report wrong
results as a warning and add a test for exercising it.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 archive-tar.c                   |  6 ++++++
 t/t5004-archive-corner-cases.sh | 20 ++++++++++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/archive-tar.c b/archive-tar.c
index 3e53aac1e6..355c8142c6 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -144,6 +144,7 @@ static int stream_blocked(const struct object_id *oid)
 static void strbuf_append_ext_header(struct strbuf *sb, const char *keyword,
 				     const char *value, unsigned int valuelen)
 {
+	size_t orig_len = sb->len;
 	int len, tmp;

 	/* "%u %s=%s\n" */
@@ -155,6 +156,11 @@ static void strbuf_append_ext_header(struct strbuf *sb, const char *keyword,
 	strbuf_addf(sb, "%u %s=", len, keyword);
 	strbuf_add(sb, value, valuelen);
 	strbuf_addch(sb, '\n');
+
+	if (len != sb->len - orig_len)
+		warning("pax extended header length miscalculated as %d"
+			", should be %"PRIuMAX,
+			len, (uintmax_t)(sb->len - orig_len));
 }

 /*
diff --git a/t/t5004-archive-corner-cases.sh b/t/t5004-archive-corner-cases.sh
index 271eb5a1fd..2f15d1899d 100755
--- a/t/t5004-archive-corner-cases.sh
+++ b/t/t5004-archive-corner-cases.sh
@@ -204,4 +204,24 @@ test_expect_success EXPENSIVE,LONG_IS_64BIT,UNZIP,UNZIP_ZIP64_SUPPORT,ZIPINFO \
 	grep $size big.lst
 '

+build_tree() {
+	perl -e '
+		my $hash = $ARGV[0];
+		foreach my $order (2..6) {
+			$first = 10 ** $order;
+			foreach my $i (-13..-9) {
+				my $name = "a" x ($first + $i);
+				print "100644 blob $hash\t$name\n"
+			}
+		}
+	' "$1"
+}
+
+test_expect_failure 'tar archive with long paths' '
+	blob=$(echo foo | git hash-object -w --stdin) &&
+	tree=$(build_tree $blob | git mktree) &&
+	git archive -o long_paths.tar $tree 2>stderr &&
+	test_must_be_empty stderr
+'
+
 test_done
--
2.23.0

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/4] archive-tar: fix pax extended header length calculation
  2019-08-17 16:19 [PATCH 0/4] fix pax extended header length calculation René Scharfe
  2019-08-17 16:23 ` [PATCH 1/4] archive-tar: report wrong pax extended header length René Scharfe
@ 2019-08-17 16:24 ` René Scharfe
  2019-08-17 18:03   ` Eric Sunshine
  2019-08-17 16:24 ` [PATCH 3/4] archive-tar: use size_t in strbuf_append_ext_header() René Scharfe
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 7+ messages in thread
From: René Scharfe @ 2019-08-17 16:24 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Junio C Hamano

A pax extended header records starts with a decimal number.  Its value
is the length of the whole record, including its own length.

The calculation of that number if strbuf_append_ext_header() is off by
one in case the length of the rest is close to a higher order of
magnitude.  This affects paths and link targets a bit shorter than 1000,
10000, 100000 etc. characters -- paths with a length of up to 100 fit
into the tar header and don't need a pax extended header.

The mistake has been present since the function was added by ae64bbc18c
("tar-tree: Introduce write_entry()", 2006-03-25).

Account for digits added to len during the loop and keep incrementing
until we have enough space for len and the rest.  The crucial change is
to check against the current value of len before each iteration, instead
of against its value before the loop.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 archive-tar.c                   | 2 +-
 t/t5004-archive-corner-cases.sh | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index 355c8142c6..4395a29ffb 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -149,7 +149,7 @@ static void strbuf_append_ext_header(struct strbuf *sb, const char *keyword,

 	/* "%u %s=%s\n" */
 	len = 1 + 1 + strlen(keyword) + 1 + valuelen + 1;
-	for (tmp = len; tmp > 9; tmp /= 10)
+	for (tmp = 1; len / 10 >= tmp; tmp *= 10)
 		len++;

 	strbuf_grow(sb, len);
diff --git a/t/t5004-archive-corner-cases.sh b/t/t5004-archive-corner-cases.sh
index 2f15d1899d..4966a74b4d 100755
--- a/t/t5004-archive-corner-cases.sh
+++ b/t/t5004-archive-corner-cases.sh
@@ -217,7 +217,7 @@ build_tree() {
 	' "$1"
 }

-test_expect_failure 'tar archive with long paths' '
+test_expect_success 'tar archive with long paths' '
 	blob=$(echo foo | git hash-object -w --stdin) &&
 	tree=$(build_tree $blob | git mktree) &&
 	git archive -o long_paths.tar $tree 2>stderr &&
--
2.23.0

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 3/4] archive-tar: use size_t in strbuf_append_ext_header()
  2019-08-17 16:19 [PATCH 0/4] fix pax extended header length calculation René Scharfe
  2019-08-17 16:23 ` [PATCH 1/4] archive-tar: report wrong pax extended header length René Scharfe
  2019-08-17 16:24 ` [PATCH 2/4] archive-tar: fix pax extended header length calculation René Scharfe
@ 2019-08-17 16:24 ` René Scharfe
  2019-08-17 16:24 ` [PATCH 4/4] archive-tar: turn length miscalculation warning into BUG René Scharfe
  2019-08-17 16:40 ` [PATCH 0/4] fix pax extended header length calculation brian m. carlson
  4 siblings, 0 replies; 7+ messages in thread
From: René Scharfe @ 2019-08-17 16:24 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Junio C Hamano

One of its callers already passes in a size_t value.  Use it
consistently in this function.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 archive-tar.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index 4395a29ffb..9d09edd547 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -142,10 +142,10 @@ static int stream_blocked(const struct object_id *oid)
  * string and appends it to a struct strbuf.
  */
 static void strbuf_append_ext_header(struct strbuf *sb, const char *keyword,
-				     const char *value, unsigned int valuelen)
+				     const char *value, size_t valuelen)
 {
 	size_t orig_len = sb->len;
-	int len, tmp;
+	size_t len, tmp;

 	/* "%u %s=%s\n" */
 	len = 1 + 1 + strlen(keyword) + 1 + valuelen + 1;
@@ -153,14 +153,14 @@ static void strbuf_append_ext_header(struct strbuf *sb, const char *keyword,
 		len++;

 	strbuf_grow(sb, len);
-	strbuf_addf(sb, "%u %s=", len, keyword);
+	strbuf_addf(sb, "%"PRIuMAX" %s=", (uintmax_t)len, keyword);
 	strbuf_add(sb, value, valuelen);
 	strbuf_addch(sb, '\n');

 	if (len != sb->len - orig_len)
-		warning("pax extended header length miscalculated as %d"
+		warning("pax extended header length miscalculated as %"PRIuMAX
 			", should be %"PRIuMAX,
-			len, (uintmax_t)(sb->len - orig_len));
+			(uintmax_t)len, (uintmax_t)(sb->len - orig_len));
 }

 /*
--
2.23.0

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 4/4] archive-tar: turn length miscalculation warning into BUG
  2019-08-17 16:19 [PATCH 0/4] fix pax extended header length calculation René Scharfe
                   ` (2 preceding siblings ...)
  2019-08-17 16:24 ` [PATCH 3/4] archive-tar: use size_t in strbuf_append_ext_header() René Scharfe
@ 2019-08-17 16:24 ` René Scharfe
  2019-08-17 16:40 ` [PATCH 0/4] fix pax extended header length calculation brian m. carlson
  4 siblings, 0 replies; 7+ messages in thread
From: René Scharfe @ 2019-08-17 16:24 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Junio C Hamano

Now that we're confident our pax extended header calculation is correct,
turn the criticality of the assertion up to the maximum, from warning
right up to BUG.  Simplify the test, as the stderr comparison step would
not be reached in case the BUG message is triggered.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 archive-tar.c                   | 6 +++---
 t/t5004-archive-corner-cases.sh | 3 +--
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index 9d09edd547..e16d3f756d 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -158,9 +158,9 @@ static void strbuf_append_ext_header(struct strbuf *sb, const char *keyword,
 	strbuf_addch(sb, '\n');

 	if (len != sb->len - orig_len)
-		warning("pax extended header length miscalculated as %"PRIuMAX
-			", should be %"PRIuMAX,
-			(uintmax_t)len, (uintmax_t)(sb->len - orig_len));
+		BUG("pax extended header length miscalculated as %"PRIuMAX
+		    ", should be %"PRIuMAX,
+		    (uintmax_t)len, (uintmax_t)(sb->len - orig_len));
 }

 /*
diff --git a/t/t5004-archive-corner-cases.sh b/t/t5004-archive-corner-cases.sh
index 4966a74b4d..3e7b23cb32 100755
--- a/t/t5004-archive-corner-cases.sh
+++ b/t/t5004-archive-corner-cases.sh
@@ -220,8 +220,7 @@ build_tree() {
 test_expect_success 'tar archive with long paths' '
 	blob=$(echo foo | git hash-object -w --stdin) &&
 	tree=$(build_tree $blob | git mktree) &&
-	git archive -o long_paths.tar $tree 2>stderr &&
-	test_must_be_empty stderr
+	git archive -o long_paths.tar $tree
 '

 test_done
--
2.23.0

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/4] fix pax extended header length calculation
  2019-08-17 16:19 [PATCH 0/4] fix pax extended header length calculation René Scharfe
                   ` (3 preceding siblings ...)
  2019-08-17 16:24 ` [PATCH 4/4] archive-tar: turn length miscalculation warning into BUG René Scharfe
@ 2019-08-17 16:40 ` brian m. carlson
  4 siblings, 0 replies; 7+ messages in thread
From: brian m. carlson @ 2019-08-17 16:40 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git Mailing List, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 801 bytes --]

On 2019-08-17 at 16:19:29, René Scharfe wrote:
> The code for building pax extended headers has been miscalculating
> lengths slightly shorter than powers of 10  since I wrote it in 2006.
> That affects entries for paths with a length of 990, 991, 9989, 9990,
> 9991, 99988 etc. and link targets 4 characters shorter.  Here's a
> series for fixing it.

This series looked good to me. I'm don't completely understand our
technique for computing the length, but the additional tests build my
confidence and your explanation makes sense.

As a side note, I have personally found computing the length of pax
headers to be enormously difficult due to this very edge case, so I'm
not surprised this bug crept in.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/4] archive-tar: fix pax extended header length calculation
  2019-08-17 16:24 ` [PATCH 2/4] archive-tar: fix pax extended header length calculation René Scharfe
@ 2019-08-17 18:03   ` Eric Sunshine
  0 siblings, 0 replies; 7+ messages in thread
From: Eric Sunshine @ 2019-08-17 18:03 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git Mailing List, Junio C Hamano

On Sat, Aug 17, 2019 at 12:24 PM René Scharfe <l.s.r@web.de> wrote:
> A pax extended header records starts with a decimal number.  Its value

s/records/record/

> is the length of the whole record, including its own length.
>
> The calculation of that number if strbuf_append_ext_header() is off by

s/if/in/

> one in case the length of the rest is close to a higher order of
> magnitude.  This affects paths and link targets a bit shorter than 1000,
> 10000, 100000 etc. characters -- paths with a length of up to 100 fit
> into the tar header and don't need a pax extended header.
>
> The mistake has been present since the function was added by ae64bbc18c
> ("tar-tree: Introduce write_entry()", 2006-03-25).
>
> Account for digits added to len during the loop and keep incrementing
> until we have enough space for len and the rest.  The crucial change is
> to check against the current value of len before each iteration, instead
> of against its value before the loop.
>
> Signed-off-by: René Scharfe <l.s.r@web.de>

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-08-17 18:05 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-17 16:19 [PATCH 0/4] fix pax extended header length calculation René Scharfe
2019-08-17 16:23 ` [PATCH 1/4] archive-tar: report wrong pax extended header length René Scharfe
2019-08-17 16:24 ` [PATCH 2/4] archive-tar: fix pax extended header length calculation René Scharfe
2019-08-17 18:03   ` Eric Sunshine
2019-08-17 16:24 ` [PATCH 3/4] archive-tar: use size_t in strbuf_append_ext_header() René Scharfe
2019-08-17 16:24 ` [PATCH 4/4] archive-tar: turn length miscalculation warning into BUG René Scharfe
2019-08-17 16:40 ` [PATCH 0/4] fix pax extended header length calculation brian m. carlson

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).