git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] archive: fix archive generation for empty trees
@ 2012-03-08  0:09 Brodie Rao
  2012-03-08  5:55 ` Jeff King
  2012-03-08 17:46 ` René Scharfe
  0 siblings, 2 replies; 9+ messages in thread
From: Brodie Rao @ 2012-03-08  0:09 UTC (permalink / raw)
  To: git; +Cc: Jeff King

Prior to this change, git-archive would try to verify path arguments -
even if none were provided. It used get_pathspec("", pathspec), which
would return a pathspec of "" instead of NULL.

Then it would try to verify if the tree contained any paths matching
"". This is fine in the normal case where the tree contains anything
(every entry would match), but for an empty tree, it wouldn't match,
and you'd get this error:

  fatal: path not found:

Now, instead of "", we use a pathspec prefix of NULL. If no path
arguments were provided, get_pathspec() will return NULL, and we won't
try to verify the existence of any paths in the tree.

Signed-off-by: Brodie Rao <brodie@sf.io>
---
 archive.c           |    2 +-
 t/t5000-tar-tree.sh |   16 ++++++++++++++++
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/archive.c b/archive.c
index 1ee837d..6e23896 100644
--- a/archive.c
+++ b/archive.c
@@ -236,7 +236,7 @@ static int path_exists(struct tree *tree, const char *path)
 static void parse_pathspec_arg(const char **pathspec,
 		struct archiver_args *ar_args)
 {
-	ar_args->pathspec = pathspec = get_pathspec("", pathspec);
+	ar_args->pathspec = pathspec = get_pathspec(NULL, pathspec);
 	if (pathspec) {
 		while (*pathspec) {
 			if (!path_exists(ar_args->tree, *pathspec))
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 527c9e7..404786f 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -360,4 +360,20 @@ test_expect_success GZIP 'remote tar.gz can be disabled' '
 		>remote.tar.gz
 '
 
+test_expect_success \
+    'git archive with an empty tree and a prefix' \
+    'git rm -r . &&
+     git commit -m empty &&
+     git archive --format=tar --prefix=empty/ HEAD > e1.tar &&
+     "$TAR" tf e1.tar'
+
+test_expect_success \
+    'git archive with an empty tree and no prefix' \
+    'git archive --format=tar HEAD > e2.tar &&
+     test_must_fail "$TAR" tf e2.tar'
+
+test_expect_success \
+    'git archive on specific paths with an empty tree' \
+    'test_must_fail git archive --format=tar --prefix=empty/ HEAD foo'
+
 test_done
-- 
1.7.9.2

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] archive: fix archive generation for empty trees
  2012-03-08  0:09 [PATCH] archive: fix archive generation for empty trees Brodie Rao
@ 2012-03-08  5:55 ` Jeff King
  2012-03-08  6:38   ` Junio C Hamano
  2012-03-08 17:46 ` René Scharfe
  1 sibling, 1 reply; 9+ messages in thread
From: Jeff King @ 2012-03-08  5:55 UTC (permalink / raw)
  To: Brodie Rao; +Cc: René Scharfe, Nguyễn Thái Ngọc Duy, git

[+cc René, as this is his code;
 +cc Duy, for pathspec questions]

On Wed, Mar 07, 2012 at 04:09:22PM -0800, Brodie Rao wrote:

> Prior to this change, git-archive would try to verify path arguments -
> even if none were provided. It used get_pathspec("", pathspec), which
> would return a pathspec of "" instead of NULL.
> 
> Then it would try to verify if the tree contained any paths matching
> "". This is fine in the normal case where the tree contains anything
> (every entry would match), but for an empty tree, it wouldn't match,
> and you'd get this error:
> 
>   fatal: path not found:
> 
> Now, instead of "", we use a pathspec prefix of NULL. If no path
> arguments were provided, get_pathspec() will return NULL, and we won't
> try to verify the existence of any paths in the tree.

Yeah, this looks like the right thing to do. The get_pathspec code
treats a NULL prefix specially as "no prefix", and I think that is what
we are trying to say here (i.e., we are interpreting pathspecs from the
root).

Though the main function of get_pathspec seems to be to call
prefix_pathspec on each element of the pathspec. And we have no prefix
here. However, prefix_pathspec does a lot of magic parsing; it's unclear
to me whether this is all in support of properly adding the prefix, or
if its side effects are important. But if it is purely about prefixing,
can't we just get rid of the call to get_pathspec entirely?

There is also a comment above prefix_pathspec regarding moving things
over to the new "struct pathspec" interface. But that leaves me more
confused, since init_pathspec does not handle prefixes at all (so it
looks like you would need to call get_pathspec first to get a
prefixed list, and then feed the result to init_pathspec. Should we be
doing something with "struct pathspec" here? Confused...

-Peff

PS The patch itself is quoted below without further comment for the
   benefit of those who were cc'd.

> ---
>  archive.c           |    2 +-
>  t/t5000-tar-tree.sh |   16 ++++++++++++++++
>  2 files changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/archive.c b/archive.c
> index 1ee837d..6e23896 100644
> --- a/archive.c
> +++ b/archive.c
> @@ -236,7 +236,7 @@ static int path_exists(struct tree *tree, const char *path)
>  static void parse_pathspec_arg(const char **pathspec,
>  		struct archiver_args *ar_args)
>  {
> -	ar_args->pathspec = pathspec = get_pathspec("", pathspec);
> +	ar_args->pathspec = pathspec = get_pathspec(NULL, pathspec);
>  	if (pathspec) {
>  		while (*pathspec) {
>  			if (!path_exists(ar_args->tree, *pathspec))
> diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
> index 527c9e7..404786f 100755
> --- a/t/t5000-tar-tree.sh
> +++ b/t/t5000-tar-tree.sh
> @@ -360,4 +360,20 @@ test_expect_success GZIP 'remote tar.gz can be disabled' '
>  		>remote.tar.gz
>  '
>  
> +test_expect_success \
> +    'git archive with an empty tree and a prefix' \
> +    'git rm -r . &&
> +     git commit -m empty &&
> +     git archive --format=tar --prefix=empty/ HEAD > e1.tar &&
> +     "$TAR" tf e1.tar'
> +
> +test_expect_success \
> +    'git archive with an empty tree and no prefix' \
> +    'git archive --format=tar HEAD > e2.tar &&
> +     test_must_fail "$TAR" tf e2.tar'
> +
> +test_expect_success \
> +    'git archive on specific paths with an empty tree' \
> +    'test_must_fail git archive --format=tar --prefix=empty/ HEAD foo'
> +
>  test_done
> -- 
> 1.7.9.2
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] archive: fix archive generation for empty trees
  2012-03-08  5:55 ` Jeff King
@ 2012-03-08  6:38   ` Junio C Hamano
  2012-03-08  7:15     ` Jeff King
  0 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2012-03-08  6:38 UTC (permalink / raw)
  To: Jeff King
  Cc: Brodie Rao, René Scharfe,
	Nguyễn Thái Ngọc Duy, git

Jeff King <peff@peff.net> writes:

>> Now, instead of "", we use a pathspec prefix of NULL. If no path
>> arguments were provided, get_pathspec() will return NULL, and we won't
>> try to verify the existence of any paths in the tree.
>
> Yeah, this looks like the right thing to do. The get_pathspec code
> treats a NULL prefix specially as "no prefix", and I think that is what
> we are trying to say here (i.e., we are interpreting pathspecs from the
> root).

Yes, that sounds sane.

> ... However, prefix_pathspec does a lot of magic parsing;
> it's unclear to me whether this is all in support of properly
> adding the prefix, or if its side effects are important.

These "magic" are for things like :(root)/path that will explicitly
refuse the prefix when run from a subdirectory.

In the longer term, get_pathspec() should be converted to directly
deal with "struct pathspec", but we are not there yet.  There are
too many code left-over, even after Duy's last pathspec related
topic, that still look at ->raw field of "struct pathspec" and they
all need to be taught to work with the unified pathspec matching
machinery.  This is one of the lots of unfinished loose ends, which
might be a good GSoC project but may be a bit too large for a
student to bite and chew.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] archive: fix archive generation for empty trees
  2012-03-08  6:38   ` Junio C Hamano
@ 2012-03-08  7:15     ` Jeff King
  2012-03-08 17:46       ` René Scharfe
  2012-03-09  0:06       ` Brodie Rao
  0 siblings, 2 replies; 9+ messages in thread
From: Jeff King @ 2012-03-08  7:15 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Brodie Rao, René Scharfe,
	Nguyễn Thái Ngọc Duy, git

On Wed, Mar 07, 2012 at 10:38:07PM -0800, Junio C Hamano wrote:

> > ... However, prefix_pathspec does a lot of magic parsing;
> > it's unclear to me whether this is all in support of properly
> > adding the prefix, or if its side effects are important.
> 
> These "magic" are for things like :(root)/path that will explicitly
> refuse the prefix when run from a subdirectory.

Yeah, that was my impression. In that case, I would think we could get
rid of the get_pathspec call entirely, as it is purely about fixing-up
prefixes, and we know that we have none.

-Peff

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] archive: fix archive generation for empty trees
  2012-03-08  7:15     ` Jeff King
@ 2012-03-08 17:46       ` René Scharfe
  2012-03-09  0:06       ` Brodie Rao
  1 sibling, 0 replies; 9+ messages in thread
From: René Scharfe @ 2012-03-08 17:46 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Brodie Rao, Nguyễn Thái Ngọc Duy,
	git

Am 08.03.2012 08:15, schrieb Jeff King:
> On Wed, Mar 07, 2012 at 10:38:07PM -0800, Junio C Hamano wrote:
>
>>> ... However, prefix_pathspec does a lot of magic parsing;
>>> it's unclear to me whether this is all in support of properly
>>> adding the prefix, or if its side effects are important.
>>
>> These "magic" are for things like :(root)/path that will explicitly
>> refuse the prefix when run from a subdirectory.
>
> Yeah, that was my impression. In that case, I would think we could get
> rid of the get_pathspec call entirely, as it is purely about fixing-up
> prefixes, and we know that we have none.

Yes, I think you're right.  Not sure why I didn't do that in ebfbdb34, 
when that empty prefix was introduced instead.

René

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] archive: fix archive generation for empty trees
  2012-03-08  0:09 [PATCH] archive: fix archive generation for empty trees Brodie Rao
  2012-03-08  5:55 ` Jeff King
@ 2012-03-08 17:46 ` René Scharfe
  2012-03-09  0:08   ` Brodie Rao
  1 sibling, 1 reply; 9+ messages in thread
From: René Scharfe @ 2012-03-08 17:46 UTC (permalink / raw)
  To: Brodie Rao; +Cc: git, Jeff King

Am 08.03.2012 01:09, schrieb Brodie Rao:
> diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
> index 527c9e7..404786f 100755
> --- a/t/t5000-tar-tree.sh
> +++ b/t/t5000-tar-tree.sh
> @@ -360,4 +360,20 @@ test_expect_success GZIP 'remote tar.gz can be disabled' '
>   		>remote.tar.gz
>   '
>
> +test_expect_success \
> +    'git archive with an empty tree and a prefix' \
> +    'git rm -r .&&
> +     git commit -m empty&&
> +     git archive --format=tar --prefix=empty/ HEAD>  e1.tar&&
> +     "$TAR" tf e1.tar'
> +
> +test_expect_success \
> +    'git archive with an empty tree and no prefix' \
> +    'git archive --format=tar HEAD>  e2.tar&&
> +     test_must_fail "$TAR" tf e2.tar'

This test fails for me, i.e. tar does not complain about the empty 
archive (GNU tar 1.25).  Perhaps use git archive -v to generate a list 
of entries?

> +
> +test_expect_success \
> +    'git archive on specific paths with an empty tree' \
> +    'test_must_fail git archive --format=tar --prefix=empty/ HEAD foo'
> +
>   test_done
> -- 1.7.9.2
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] archive: fix archive generation for empty trees
  2012-03-08  7:15     ` Jeff King
  2012-03-08 17:46       ` René Scharfe
@ 2012-03-09  0:06       ` Brodie Rao
  2012-03-09  7:30         ` René Scharfe
  1 sibling, 1 reply; 9+ messages in thread
From: Brodie Rao @ 2012-03-09  0:06 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, René Scharfe,
	Nguyễn Thái Ngọc, git

2012/3/7 Jeff King <peff@peff.net>:
> On Wed, Mar 07, 2012 at 10:38:07PM -0800, Junio C Hamano wrote:
>
>> > ... However, prefix_pathspec does a lot of magic parsing;
>> > it's unclear to me whether this is all in support of properly
>> > adding the prefix, or if its side effects are important.
>>
>> These "magic" are for things like :(root)/path that will explicitly
>> refuse the prefix when run from a subdirectory.
>
> Yeah, that was my impression. In that case, I would think we could get
> rid of the get_pathspec call entirely, as it is purely about fixing-up
> prefixes, and we know that we have none.

Let me see if I've got this right: We're currently passing in ""/NULL
to get_pathspec() because we handle the prefix beforehand in
parse_treeish_args(). Once we get the tree object, every path is
relative to it, so we don't need to continue using a prefix.

Wouldn't it be better to continue using get_pathspec(), passing it the
real prefix, and looking up tree entries relative to the top-level
tree? The way it works now, you get weird behavior like this:

  $ cd xdiff
  $ git archive -v --format=tar HEAD ../t/t5000-tar-tree.sh > /dev/null
  fatal: '../t/t5000-tar-tree.sh' is outside repository
  $ git archive -v --format=tar HEAD .. > /dev/null
  fatal: '..' is outside repository

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] archive: fix archive generation for empty trees
  2012-03-08 17:46 ` René Scharfe
@ 2012-03-09  0:08   ` Brodie Rao
  0 siblings, 0 replies; 9+ messages in thread
From: Brodie Rao @ 2012-03-09  0:08 UTC (permalink / raw)
  To: René Scharfe; +Cc: git, Jeff King

On Thu, Mar 8, 2012 at 9:46 AM, René Scharfe
<rene.scharfe@lsrfire.ath.cx> wrote:
> Am 08.03.2012 01:09, schrieb Brodie Rao:
>
>> diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
>> index 527c9e7..404786f 100755
>> --- a/t/t5000-tar-tree.sh
>> +++ b/t/t5000-tar-tree.sh
>> @@ -360,4 +360,20 @@ test_expect_success GZIP 'remote tar.gz can be
>> disabled' '
>>                >remote.tar.gz
>>  '
>>
>> +test_expect_success \
>> +    'git archive with an empty tree and a prefix' \
>> +    'git rm -r .&&
>> +     git commit -m empty&&
>> +     git archive --format=tar --prefix=empty/ HEAD>  e1.tar&&
>> +     "$TAR" tf e1.tar'
>> +
>> +test_expect_success \
>> +    'git archive with an empty tree and no prefix' \
>> +    'git archive --format=tar HEAD>  e2.tar&&
>> +     test_must_fail "$TAR" tf e2.tar'
>
>
> This test fails for me, i.e. tar does not complain about the empty archive
> (GNU tar 1.25).  Perhaps use git archive -v to generate a list of entries?

Whoops. I only ran the test with BSD tar, which apparently isn't
tolerant of the empty archive git-archive is generating. It'd probably
be better to omit the tar invocation in that test and just confirm
that git-archive doesn't blow up.

>
>
>> +
>> +test_expect_success \
>> +    'git archive on specific paths with an empty tree' \
>> +    'test_must_fail git archive --format=tar --prefix=empty/ HEAD foo'
>> +
>>  test_done
>> -- 1.7.9.2
>>
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] archive: fix archive generation for empty trees
  2012-03-09  0:06       ` Brodie Rao
@ 2012-03-09  7:30         ` René Scharfe
  0 siblings, 0 replies; 9+ messages in thread
From: René Scharfe @ 2012-03-09  7:30 UTC (permalink / raw)
  To: Brodie Rao
  Cc: Jeff King, Junio C Hamano, Nguyễn Thái Ngọc, git

Am 09.03.2012 01:06, schrieb Brodie Rao:
> 2012/3/7 Jeff King<peff@peff.net>:
>> On Wed, Mar 07, 2012 at 10:38:07PM -0800, Junio C Hamano wrote:
>>
>>>> ... However, prefix_pathspec does a lot of magic parsing;
>>>> it's unclear to me whether this is all in support of properly
>>>> adding the prefix, or if its side effects are important.
>>>
>>> These "magic" are for things like :(root)/path that will explicitly
>>> refuse the prefix when run from a subdirectory.
>>
>> Yeah, that was my impression. In that case, I would think we could get
>> rid of the get_pathspec call entirely, as it is purely about fixing-up
>> prefixes, and we know that we have none.
>
> Let me see if I've got this right: We're currently passing in ""/NULL
> to get_pathspec() because we handle the prefix beforehand in
> parse_treeish_args(). Once we get the tree object, every path is
> relative to it, so we don't need to continue using a prefix.
>
> Wouldn't it be better to continue using get_pathspec(), passing it the
> real prefix, and looking up tree entries relative to the top-level
> tree? The way it works now, you get weird behavior like this:
>
>    $ cd xdiff
>    $ git archive -v --format=tar HEAD ../t/t5000-tar-tree.sh>  /dev/null
>    fatal: '../t/t5000-tar-tree.sh' is outside repository
>    $ git archive -v --format=tar HEAD ..>  /dev/null
>    fatal: '..' is outside repository

With get_pathspec() gone you'd get this instead:

	fatal: path not found: ..

The message could be improved by mentioning the subdirectory and perhaps 
the tree, something like this:

	fatal: path not found in subdir 'xdiff' of 'HEAD': ..

However, you seem to expect such an invocation to succeed.  What should 
go into the created archive in that case and which pathes would be recorded?

René

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2012-03-09  7:30 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-08  0:09 [PATCH] archive: fix archive generation for empty trees Brodie Rao
2012-03-08  5:55 ` Jeff King
2012-03-08  6:38   ` Junio C Hamano
2012-03-08  7:15     ` Jeff King
2012-03-08 17:46       ` René Scharfe
2012-03-09  0:06       ` Brodie Rao
2012-03-09  7:30         ` René Scharfe
2012-03-08 17:46 ` René Scharfe
2012-03-09  0:08   ` Brodie Rao

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).