git@vger.kernel.org mailing list mirror (one of many)
From: "René Scharfe" <l.s.r@web.de>
To: Junio C Hamano <gitster@pobox.com>
Cc: Cristian Le <cristian.le@mpsd.mpg.de>, git@vger.kernel.org
Subject: Re: Bug in git archive + .gitattributes + relative path
Date: Mon, 6 Mar 2023 18:51:18 +0100
Message-ID: <a86f05cb-407e-f5c6-ed15-a1fbf1be0584@web.de>
In-Reply-To: <xmqqy1o9byye.fsf@gitster.g>

Am 06.03.23 um 17:56 schrieb Junio C Hamano:
> René Scharfe <l.s.r@web.de> writes:
>
>>    $ git archive --strip-components=1 HEAD sha1dc | tar tf -
>>    .gitattributes
>>    LICENSE.txt
>>    sha1.c
>>    sha1.h
>>    ubc_check.c
>>    ubc_check.h
>
> What should happen to paths that match the given pathspec that do
> not have enough number of components?  E.g. "cache.h" when the
> command is "git archive --strip-components=1 HEAD \*.h"?  Should it
> be documented?

Entries whose full path is stripped away don't make it into the archive.
That behavior is copied from bsdtar along with the option name and most
of its description in git-archive.txt.
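
To illustrate the rule, here is a minimal standalone sketch.  The helper
strip_components() is a made-up name for this mail, not the code from
the patch (which works on base and baselen instead of a full path), and
it only approximates what bsdtar does:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of git: return the part of `path` that
 * remains after dropping `n` leading components, or NULL if the whole
 * path would be stripped away (the entry is then left out).
 */
static const char *strip_components(const char *path, int n)
{
	for (int i = 0; i < n; i++) {
		const char *slash = strchr(path, '/');
		if (!slash)
			return NULL;	/* fewer than n+1 components */
		path = slash + 1;
	}
	/* A directory such as "sha1dc/" leaves nothing behind either. */
	return *path ? path : NULL;
}
```

With n = 1, "sha1dc/sha1.c" becomes "sha1.c", while "cache.h" yields
NULL and would be silently omitted.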

Alternatively we could warn or die.  The latter would be a bit awkward
because we'd either have to check all paths first or risk reporting them
after writing at least some headers.
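
A check-everything-first variant could look roughly like the sketch
below.  Both function names are hypothetical and the flat array of
paths is an assumption made for brevity; the real code walks trees and
never has such a list at hand, which is part of the awkwardness:

```c
#include <stddef.h>
#include <string.h>

/* Count the '/'-separated components of a path; "a/b/c" has three. */
static int count_components(const char *path)
{
	int n = *path ? 1 : 0;

	for (; *path; path++)
		if (*path == '/' && path[1])
			n++;	/* a trailing slash starts no new component */
	return n;
}

/*
 * Hypothetical pre-check, not git code: return the first path that has
 * no component left after stripping `strip` of them, or NULL if every
 * path survives.  A caller could die on a non-NULL result before
 * writing any archive header.
 */
static const char *find_fully_stripped(const char **paths, size_t nr,
				       int strip)
{
	for (size_t i = 0; i < nr; i++)
		if (count_components(paths[i]) <= strip)
			return paths[i];
	return NULL;
}
```

The cost is a full extra pass over all paths before the first byte of
output, which is the trade-off mentioned above.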

No strong preference, but following the precedent set by bsdtar makes
the most sense to me.

>> The new option does not affect the paths of entries added by --add-file
>> and --add-virtual-file because they are handcrafted to their desired
>> values already.  Similarly, the value of --prefix is not subject to
>> component stripping.
>
> Very sensible.
>
>> diff --git a/archive.c b/archive.c
>> index 9aeaf2bd87..8308d4d9c4 100644
>> --- a/archive.c
>> +++ b/archive.c
>> @@ -166,6 +166,18 @@ static int write_archive_entry(const struct object_id *oid, const char *base,
>>  		args->convert = check_attr_export_subst(check);
>>  	}
>
> We probably could save attribute lookup overhead by moving the new
> logic a bit higher in the function?
>
> No, that would invalidate the path_without_prefix variable by using
> strbuf_remove() on &path, and will break the attribute look-up.  The
> variable is used only once before this point and never used later,
> but as an independent future-proofing, we may want to remove the
> variable or narrow the scope.  It's totally out of scope of the
> patch, though.

Would you have noticed that attribute lookup breakage without the
presence of that variable? :)

The sad thing is that we concatenate base and filename here and
then attr.c::collect_some_attrs() goes and splits them again.  It
also uses the concatenated path, but perhaps that can be avoided?

>> +	if (args->strip_components > 0) {
>> +		size_t orig_baselen = baselen;
>> +		for (int i = 0; i < args->strip_components; i++) {
>> +			const char *slash = memchr(base, '/', baselen);
>> +			if (!slash)
>> +				return S_ISDIR(mode) ? READ_TREE_RECURSIVE : 0;
>> +			baselen -= slash - base + 1;
>> +			base = slash + 1;
>> +		}
>> +		strbuf_remove(&path, args->baselen, orig_baselen - baselen);
>> +	}
>
> Nice to see that the core logic of the new feature is surprisingly
> small.
>
>>  	if (args->verbose)
>>  		fprintf(stderr, "%.*s\n", (int)path.len, path.buf);
>
> By having the verbose output after the path stripping, we won't show
> the leading components we stripped, making it similar to what we
> would see when we piped the resulting archive to "| tar tf -".  I
> guess this makes more sense than showing the original path.

Right, printing the path as it appears in the archive makes sense.
bsdtar does the same.

René

Thread overview: 22+ messages
2023-03-03 10:25 Bug in git archive + .gitattributes + relative path Cristian Le
2023-03-03 15:19 ` René Scharfe
2023-03-03 15:38   ` Cristian Le
2023-03-04 13:58     ` René Scharfe
2023-03-04 15:11       ` Cristian Le
2023-03-05  9:32         ` René Scharfe
2023-03-06 16:56       ` Junio C Hamano
2023-03-06 17:51         ` René Scharfe [this message]
2023-03-06 17:27       ` Junio C Hamano
2023-03-06 18:28         ` René Scharfe
2023-03-06 18:59           ` Junio C Hamano
2023-03-06 21:32             ` René Scharfe
2023-03-06 22:34               ` Junio C Hamano
2023-03-11 20:47                 ` René Scharfe
2023-03-12 21:25                   ` Junio C Hamano
2023-03-18 21:30                     ` René Scharfe
2023-03-20 16:16                       ` Junio C Hamano
2023-03-20 20:02                       ` [PATCH] archive: improve support for running in a subdirectory René Scharfe
2023-03-21 22:59                         ` Junio C Hamano
2023-03-24 22:26                           ` René Scharfe
2023-03-24 22:27                         ` [PATCH v2] archive: improve support for running in subdirectory René Scharfe
2023-03-27 16:09                           ` Junio C Hamano
