From: Junio C Hamano <gitster@pobox.com>
To: Nick Townsend <nick.townsend@mac.com>
Cc: "Git List" <git@vger.kernel.org>,
"Eric Sunshine" <sunshine@sunshineco.com>,
"René Scharfe" <l.s.r@web.de>,
"Jens Lehmann" <Jens.Lehmann@web.de>,
"Heiko Voigt" <hvoigt@hvoigt.net>
Subject: Re: [PATCH] Additional git-archive tests
Date: Tue, 10 Dec 2013 10:48:47 -0800 [thread overview]
Message-ID: <xmqq8uvsv8xc.fsf@gitster.dls.corp.google.com> (raw)
In-Reply-To: <3943DF12-8FE8-4664-8901-C243D588BE82@mac.com> (Nick Townsend's message of "Mon, 09 Dec 2013 21:26:42 -0800")
Nick Townsend <nick.townsend@mac.com> writes:
>> In other words, are all these new tests expected to pass?
>
> Sorry about that - the first four tests do pass with 1.8.5, the last
> three tests do not currently pass.
OK.
>> So I do not think it is expected to accept tree object names with
>> the HEAD:<path> style syntax, if the user expects a predictable and
>> consistent result. The third step above attempts to make sure that
>> you name a tree-ish that corresponds to the top-level of the
>> project, i.e. with no <path>.
>>
> That’s not quite right - paths do work from the root directory - see tests.
I know that they work from the very top, but they _only_ happens to
work from the very top. As I do not see the code is designed to
compare the prefix (i.e. the subdirectory you are in) and the path
that is tucked after the name of the top-level tree object with a
colon (e.g. path in "HEAD:path") and adjust these two accordingly, I
do not think it was designed to deal with that at all. The code
just assumes that the tree object corresponds to the top-level, and
the adjustment is done solely by adding the prefix when limiting the
output by paths.
This observation *is* very right. Not designing to deal with that
case may or may not be right, but that is a different issue.
> It was this very useful capability that I sought to add and generalized
> the code to do.
>> What seems to be supported are:
>>
>> cd a && git archive HEAD ;# archives HEAD:a tree
>> cd a && git archive HEAD -- b ;# archives a/b/ part of HEAD:a as b/
>>
>> Specifically, it appears that HEAD:./b, HEAD:b etc. are not designed
>> to work, at least to me.
>> ...
>> I am not saying that these should _not_ work, but it is unclear what
>> it means to "work". For example, what should this do?
>>
>> cd a && git archive HEAD:./b $pathspec
>>
> I think we can clear this up by documenting the cases and choosing
> sensible behaviour _for git-archive_ ie. what people might expect.
I am afraid that this topic is not limited to "git-archive";
"sensible behaviour _for git-archive_" will end up being far from
sufficient. I would not be surprised if "ls-tree" shares the same
issue, for example. If you had "subdir/subsubdir/path" (and other
paths) tracked in your project,
cd subdir
git ls-tree HEAD:./subsubdir path
is likely to have the same issue as
cd subdir
git archive HEAD:./subsubdir path
I would think.
> Here is a suggestion:
>
> a. The pathspec is used as a selector to include things in the archive.
> it is applied after the cwd and treeish selection.
That would mean that doing this in the top-level:
git archive HEAD:subdir/subsubdir path<TAB>
will not auto-complete the pathname. In all other Git commands,
vanilla pathspecs (at least the ones that do not use the magic
escape hatch ":/path", etc.) are relative to your current directory,
and I do not think we want a single command to work in an
inconsistent way with respect to that.
> b. The current working directory (if present) prefixes a path in the object
> if one is present.
That is already done at a lower-level get_sha1_with_context(), I
think.
cd subdir
git archive HEAD:./subsubdir
will try to use tree subdir/subsubdir in the HEAD commit (but due to
the prefix logic that is not preprared to use such a tree object,
you will get "current working directory is untracked" error).
> c. The path in the object (if present) is prefixed by the cwd (if present) and
> used to select items for archiving. However the composite path so created
> *is not present* in the archive - ie. the leading components are stripped.
> (This is very useful behaviour for creating archives without leading paths)
>
> d. As currently the ―prefix option (not the prefix from setup_git_directory)
> is prepended to all entries in the archive.
>
> These correspond to the use cases that I have for git archive - extracting all
> or part of a multiple submodule tree into a tgz file, with or without leading
> directories.
>
>> The extended SHA-1 expression HEAD:./b in the subdirectory a/ is
>> interpreted by get_sha1_with_context_1() to be the name of the tree
>> object at path "a/b" in the commit HEAD. Further, you are giving a
>> pathspec while in a subdirectory a/ to the command. What should
>> that pathspec be relative to?
>>
>> In a normal Git command, the pathspec always is relative to the
>> current subdirectory, so, the way to learn about the tree object
>> a/b/c in the HEAD while in subdirectory a/ would be:
>>
>> cd a && git ls-tree HEAD b/c
>>
>> But what should the command line for archive to grab HEAD:a/b/c be?
>> It feels wrong to say:
>>
>> cd a && git archive HEAD:./b b/c
> It’s clear to me that if you are in a subdirectory, that is an implicit prefix on the
> ./b so you retrieve HEAD:a/b which then archives everything in b without the
> leading a/b. The pathspec is wrong (including the b) and this archive is empty.
>>
>> and it also feels wrong to say
>>
>> cd a && git archive HEAD:./b c
>>
> That looks fine to me given the rules suggested above.
>
> Also git-parse manages
> to return the correct thing in this case, so I’d expect this to work.
>
>> No matter what we would do, we should behave consistently with this
>> case:
>>
>> treeish=$(git rev-parse HEAD:a/b) &&
>> cd a &&
>> git archive $treeish -- $pathspec
>>
> Well this will fail both in the existing case (‘fatal: current
> working directory is untracked’) and with the scheme proposed
> above (‘fatal: invalid object name $treeish:a/‘)
The thing is the "current working directory is untracked" is a
fallout from the current archive code not expecting to see a
tree-ish specifed in HEAD:<path> style to interact with $cwd (aka
prefix). Go back to the description in how parse_treeish_arg()
works in the message you are responding to (or to archive.c). It
says "if we have prefix, grab the tree object that corresponds to
the prefix in the given tree-ish". When "subdir" is your current
directory, if you give HEAD as the tree-ish, this logic will give
you the subtree HEAD:subdir; if you give anything else that does not
correspond to the top-level of the project, it will not be able to
find such a subtree (by definition, any tree-ish that does not
correspond to the top-level cannot contain "subdir" tree object that
corresponds to "subdir" directory of the top-level; it may contain a
random directory that happens to be named "subdir", but that will be
a different directory from the "subdir" you see at the top-level).
I do not think a workable compromise exists, but one that is closest
to be workable, even though I do not think it is such a great idea,
would be to apply the add "prefix" logic _only_ when the given
tree-ish is known to be at the top-level (e.g. when it was
originally given as a commit).
prev parent reply other threads:[~2013-12-10 18:49 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-12-03 0:10 [PATCH] Improvements to git-archive tests and add_submodule_odb() Nick Townsend
2013-12-03 0:14 ` Nick Townsend
2013-12-03 17:52 ` Junio C Hamano
2013-12-03 18:18 ` Heiko Voigt
2013-12-03 18:39 ` Re* " Junio C Hamano
2013-12-03 20:42 ` Heiko Voigt
2013-12-03 0:16 ` Nick Townsend
2013-12-03 9:26 ` Eric Sunshine
2013-12-03 17:54 ` Junio C Hamano
2013-12-05 2:49 ` [PATCH] Additional git-archive tests Nick Townsend
2013-12-05 19:52 ` Junio C Hamano
2013-12-10 5:26 ` Nick Townsend
2013-12-10 18:48 ` Junio C Hamano [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqq8uvsv8xc.fsf@gitster.dls.corp.google.com \
--to=gitster@pobox.com \
--cc=Jens.Lehmann@web.de \
--cc=git@vger.kernel.org \
--cc=hvoigt@hvoigt.net \
--cc=l.s.r@web.de \
--cc=nick.townsend@mac.com \
--cc=sunshine@sunshineco.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).