git@vger.kernel.org mailing list mirror (one of many)
From: Pedro Alvarez <pedro.alvarez@codethink.co.uk>
To: git@vger.kernel.org
Cc: gitster@pobox.com, peff@peff.net,
	Pedro Alvarez Piedehierro <palvarez89@gmail.com>,
	Felipe Contreras <felipe.contreras@gmail.com>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>
Subject: [PATCH v2 1/1] import-tars: read overlong names from pax extended header
Date: Wed, 23 May 2018 23:54:17 +0100
Message-ID: <20180523225417.10165-2-pedro.alvarez@codethink.co.uk>
In-Reply-To: <20180523225417.10165-1-pedro.alvarez@codethink.co.uk>

From: Pedro Alvarez Piedehierro <palvarez89@gmail.com>

Importing gcc tarballs [1] with the import-tars script (in contrib) fails
when it hits a pax extended header.

Make sure we always read the extended attributes from pax entries and,
if a 'path' value is found, store it for use as the name of the next
ustar entry.

The code to parse pax extended headers was written by consulting the
Pax Interchange Format documentation [2].

[1] http://ftp.gnu.org/gnu/gcc/gcc-7.3.0/gcc-7.3.0.tar.xz
[2] https://www.freebsd.org/cgi/man.cgi?manpath=FreeBSD+8-current&query=tar&sektion=5

Signed-off-by: Pedro Alvarez <palvarez89@gmail.com>
---
 contrib/fast-import/import-tars.perl | 31 +++++++++++++++++++++++++++++--
 1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/contrib/fast-import/import-tars.perl b/contrib/fast-import/import-tars.perl
index d60b4315ed..e800d9f5c9 100755
--- a/contrib/fast-import/import-tars.perl
+++ b/contrib/fast-import/import-tars.perl
@@ -63,6 +63,8 @@ foreach my $tar_file (@ARGV)
 	my $have_top_dir = 1;
 	my ($top_dir, %files);
 
+	my $next_path = '';
+
 	while (read(I, $_, 512) == 512) {
 		my ($name, $mode, $uid, $gid, $size, $mtime,
 			$chksum, $typeflag, $linkname, $magic,
@@ -70,6 +72,13 @@ foreach my $tar_file (@ARGV)
 			$prefix) = unpack 'Z100 Z8 Z8 Z8 Z12 Z12
 			Z8 Z1 Z100 Z6
 			Z2 Z32 Z32 Z8 Z8 Z*', $_;
+
+		unless ($next_path eq '') {
+			# Recover name from previous extended header
+			$name = $next_path;
+			$next_path = '';
+		}
+
 		last unless length($name);
 		if ($name eq '././@LongLink') {
 			# GNU tar extension
@@ -90,13 +99,31 @@ foreach my $tar_file (@ARGV)
 			Z8 Z1 Z100 Z6
 			Z2 Z32 Z32 Z8 Z8 Z*', $_;
 		}
-		next if $name =~ m{/\z};
 		$mode = oct $mode;
 		$size = oct $size;
 		$mtime = oct $mtime;
 		next if $typeflag == 5; # directory
 
-		if ($typeflag != 1) { # handle hard links later
+		if ($typeflag eq 'x') { # extended header
+			# If extended header, check for path
+			my $pax_header = '';
+			while ($size > 0 && read(I, $_, 512) == 512) {
+				$pax_header = $pax_header . substr($_, 0, $size);
+				$size -= 512;
+			}
+
+			my @lines = split /\n/, $pax_header;
+			foreach my $line (@lines) {
+				my ($len, $entry) = split / /, $line, 2;
+				my ($key, $value) = split /=/, $entry, 2;
+				if ($key eq 'path') {
+					$next_path = $value;
+				}
+			}
+			next;
+		} elsif ($name =~ m{/\z}) { # directory
+			next;
+		} elsif ($typeflag != 1) { # handle hard links later
 			print FI "blob\n", "mark :$next_mark\n";
 			if ($typeflag == 2) { # symbolic link
 				print FI "data ", length($linkname), "\n",
-- 
2.11.0
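
For reference, the pax interchange format stores each extended header
record as "<len> <key>=<value>\n", where <len> is the decimal length of
the whole record, including its own digits, the space and the trailing
newline. Values may therefore contain spaces, '=' or newlines. Below is
a minimal sketch of the idea behind this patch, written in Python for
illustration rather than in the script's Perl: it walks the 512-byte
blocks, parses pax records by their length prefix, and applies a 'path'
record to the next ustar entry. The names parse_pax_records, entry_names
and build_pax_archive are hypothetical helpers, not part of import-tars.
The sketch assumes sizes fit the octal size field and ignores GNU
LongLink entries.

```python
import io
import tarfile

BLOCK = 512


def parse_pax_records(payload: bytes) -> dict:
    # Each record is b"<len> <key>=<value>" plus a trailing newline, and
    # <len> counts the entire record, so we slice by length instead of
    # splitting on separators that may also occur inside the value.
    records = {}
    pos = 0
    while pos < len(payload):
        space = payload.index(b" ", pos)
        length = int(payload[pos:space])
        if length <= space - pos + 1:
            raise ValueError("malformed pax record length")
        body = payload[space + 1 : pos + length - 1]  # drop the newline
        key, _, value = body.partition(b"=")
        records[key.decode("utf-8")] = value.decode("utf-8")
        pos += length
    return records


def entry_names(archive: bytes) -> list:
    # Return entry names, letting a pax 'path' override the next entry.
    stream = io.BytesIO(archive)
    names = []
    next_path = None
    while True:
        header = stream.read(BLOCK)
        if len(header) < BLOCK or header == b"\0" * BLOCK:
            break  # short read or end-of-archive marker
        name = header[0:100].split(b"\0", 1)[0]
        prefix = header[345:500].split(b"\0", 1)[0]
        if prefix:
            name = prefix + b"/" + name
        # Assumption: plain octal size field (no GNU base-256 encoding).
        size = int(header[124:136].rstrip(b" \0") or b"0", 8)
        typeflag = header[156:157]
        # The payload is padded to a whole number of 512-byte blocks.
        padded = (size + BLOCK - 1) // BLOCK * BLOCK
        payload = stream.read(padded)[:size]
        if typeflag == b"x":  # per-entry extended header
            next_path = parse_pax_records(payload).get("path")
            continue
        if typeflag == b"g":  # global extended header, ignored here
            continue
        if next_path is not None:
            names.append(next_path)
        else:
            names.append(name.decode("utf-8", errors="replace"))
        next_path = None
    return names


def build_pax_archive(names) -> bytes:
    # Test helper: an in-memory pax archive with one small file per name.
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w",
                      format=tarfile.PAX_FORMAT) as tar:
        for name in names:
            info = tarfile.TarInfo(name)
            info.size = 5
            tar.addfile(info, io.BytesIO(b"hello"))
    return buf.getvalue()
```

Calling entry_names(build_pax_archive([...])) with a name longer than
100 characters should give back the full name from the pax record
rather than the truncated ustar name field.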


Thread overview: 6+ messages
2018-05-22 10:05 [PATCH] Add initial support for pax extended attributes Pedro Alvarez
2018-05-23  2:34 ` Junio C Hamano
2018-05-23  4:57   ` Jeff King
2018-05-23 23:38     ` Junio C Hamano
2018-05-23 22:54 ` [PATCH v2 0/1] import-tars: read overlong names from pax extended header Pedro Alvarez
2018-05-23 22:54   ` Pedro Alvarez [this message]
