git@vger.kernel.org mailing list mirror (one of many)
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Elijah Newren <newren@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>,
	Luke Shumaker <lukeshu@lukeshu.com>,
	Git Mailing List <git@vger.kernel.org>,
	Luke Shumaker <lukeshu@datawire.io>, Jeff King <peff@peff.net>
Subject: Re: [RFC PATCH] fast-export, fast-import: Let tags specify an internal name
Date: Thu, 22 Apr 2021 10:41:01 +0200
Message-ID: <874kfy3e5e.fsf@evledraar.gmail.com>
In-Reply-To: <CABPp-BFY65wddHHw2Uhortcux+TzMYBZS1wwfnsasYeishXa-w@mail.gmail.com>


On Wed, Apr 21 2021, Elijah Newren wrote:

> On Wed, Apr 21, 2021 at 1:19 AM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
>>
>> On Tue, Apr 20 2021, Junio C Hamano wrote:
>>
>> > Luke Shumaker <lukeshu@lukeshu.com> writes:
>> >
>> >> That'd work fine if they're lightweight tags, but if they're annotated
>> >> tags, then after the rename the internal name in the tag object
>> >> (`v0.0.1`) is now different than the refname (`gitk/v0.0.1`).  Which
>> >> is still mostly fine, since not too many tools care if the internal
>> >> name and the refname disagree.
>> >>
>> >> But, fast-export/fast-import are tools that do care: it's currently
>> >> impossible to represent these tags in a fast-import stream.
>> >>
>> >> This patch adds an optional "name" sub-command to fast-import's "tag"
>> >> top-level-command, the stream
>> >>
>> >>     tag foo
>> >>     name bar
>> >>     ...
>> >>
>> >> will create a tag at "refs/tags/foo" that says "tag bar" internally.
>> >>
>> >> These tags are things that "shouldn't" happen, so perhaps adding
>> >> support for them to fast-export/fast-import is unwelcome, which is why
>> >> I've marked this as an "RFC".  If this addition is welcome, then it
>> >> still needs tests and documentation.
>> >
>> > I actually think this is a good direction to go in, and it might be
>> > even an acceptable change to fsck to require only the tail match of
>> > tagname and refname so that it becomes perfectly OK for Gitk's
>> > "v0.0.1" tag to be stored at say "refs/tags/gitk/v0.0.1".
>>
>> Do you mean to change fsck to care about this at all? It doesn't care
>> about the refname pointing to a tag, and AFAICT we never did.
>>
>> All we check is that the pseudo-"refname" is valid, i.e. if we were to
>> use the thing we find on the "tag" line as a refname, does it pass
>> check_refname_format()?
>>
>> "git tag -v" doesn't care either:
>>
>>         $ git update-ref refs/tags/a-v-2.31.0 3e90d4b58f3819cfd58ac61cb8668e83d3ea0563
>>         $ git tag -v a-v-2.31.0
>>         object a5828ae6b52137b913b978e16cd2334482eb4c1f
>>         type commit
>>         tag v2.31.0
>>         tagger Junio C Hamano <gitster@pobox.com> 1615834385 -0700
>>         [.. snip same gpg output as for v2.31.0 itself..]
>>
>> I think at this point the right thing to do is to just explicitly
>> document that we ignore it, and that the export/import chain should be
>> as forgiving about it as possible.
>>
>> I.e. we have not cared about this before for validation, and
>> e.g. core.alternateRefsPrefixes and such things will break any "it
>> should be under refs/tags/" assumption.
>>
>> There's also perfectly legitimate in-the-wild use-cases for this,
>> e.g. "archiving" tags to not-refs/tags/* so e.g. the upload-pack logic
>> doesn't consider and follow them. Not being able to export/import those
>> repositories as-is due to an overzelous data check there that's not in
>> fsck.c would suck.
>
> Not would suck, but does suck.  I had to document it as a shortcoming
> of fast-export/fast-import -- see
> https://www.mankier.com/1/git-filter-repo#Internals-Limitations, where
> I wrote, "annotated and signed tags outside of the refs/tags/
> namespace are not supported (their location will be mangled in weird
> ways)".

Indeed, hence the whole point of this thread. I stand corrected.

I'm less familiar with fast-export (obviously), just wanted to chime in
on the "tag" field in the tag object.
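
For anyone wanting to see which tags in a repository have a "tag" field
that disagrees with their refname, here's a rough sketch. It isn't part
of the patch under discussion; the function name mismatched_tags is made
up, and it assumes "<refname> <tag field>" pairs on stdin:

```shell
# Hypothetical helper: read "<refname> <tag-field>" pairs on stdin and
# print the ones where the "tag" field differs from the refname with a
# leading "refs/tags/" removed.
mismatched_tags () {
	while read -r refname tagname
	do
		# Refs that don't point at a tag object have no "tag" field.
		test -n "$tagname" || continue
		test "${refname#refs/tags/}" = "$tagname" ||
		printf '%s says "tag %s"\n' "$refname" "$tagname"
	done
}
```

It can be fed with something like:

    git for-each-ref --format='%(refname) %(tag)' | mismatched_tags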

> The problem is, what's the right backward-compatible way to fix this?
> Do we have to add a flag to both fast-export and fast-import to stop
> assuming a "refs/tags/" prefix and use the full refname, and require
> the user to pass both flags?  How is fast-import supposed to know that
> "refs/alternate-tags/foo" is or isn't
> "refs/tags/refs/alternate-tags/foo"?
>
> And if we need such a flag, should fast-import die if it sees this new
> "name" directive and the flag isn't given?

After looking at it, it seems to me that there are two potential cases:
the simpler one we can nastily hack in, while the more complex one needs
a format change.

This is the simpler case:
	
	test_expect_success 'setup' '
		echo file content >file &&
		git add file &&
		git commit -m"my commit message" &&
		git tag -a -m"my tag message" mytag HEAD &&
	
		git for-each-ref &&
		git fast-export --all >stream.a &&
	
		mkdir .git/refs/mytags &&
		mv .git/refs/tags/mytag .git/refs/mytags/ &&
		git for-each-ref &&
		git fast-export --all >stream.b &&
		test_might_fail git diff --no-index stream.a stream.b
	'
	
	test_expect_success 'minimal' '
		git init --bare import &&
		cat stream.b &&
		git -C import fast-import <stream.b &&
		git -C import for-each-ref
	'
	
	test_done

Right now this "works", but with this difference in the stream:
    
    + git diff --no-index stream.a stream.b
    diff --git a/stream.a b/stream.b
    index 0d7d656..167bc26 100644
    --- a/stream.a
    +++ b/stream.b
    @@ -12,7 +12,7 @@ data 18
     my commit message
     M 100644 :1 file
    
    -tag mytag
    +tag refs/mytags/mytag
     from :2
     tagger C O Mitter <committer@example.com> 1112354055 +0200
     data 15

And in the imported repository, instead of:

    9ecf7742801c36c6b37b068fdf499603702c582a tag    refs/mytags/mytag
    
we end up with:
    
    ed9c5b1dcec27acec5dac510d475869d4d11a6a9 tag    refs/tags/refs/mytags/mytag
    
The only difference in the objects is that the former has "tag mytag",
and the latter "tag refs/mytags/mytag", since we didn't trigger the
special-case of stripping off the "refs/tags/*" prefix.
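
Spelling out that round-trip as a sketch (the function names are made up
for illustration, and this only models the naming seen in stream.a and
stream.b above, it is not the actual fast-export/fast-import code):

```shell
# Name that ends up on the stream's "tag" line for a given ref:
# a leading "refs/tags/" is stripped, anything else is kept whole.
export_tag_name () {
	printf '%s\n' "${1#refs/tags/}"
}

# Ref created on import for a "tag <name>" command:
# the name is always stored below refs/tags/.
import_tag_ref () {
	printf 'refs/tags/%s\n' "$1"
}
```

So import_tag_ref "$(export_tag_name refs/mytags/mytag)" gives us
"refs/tags/refs/mytags/mytag", whereas a ref under refs/tags/ comes back
unchanged.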

So wouldn't this nasty hack work:

    * If we see a tag object
    * whose name is prefixed with refs/*, e.g. "refs/some-name/space/a-name"

then we strip off everything up to the last slash, stick the remaining
"a-name" in the "tag" header, and place the resulting object at the
requested "refs/some-name/space/a-name"?

This rule would be ambiguous for anyone who today has a tag name like
"refs/tags/refs/[...]", but that seems exceedingly unlikely (and we
could guard the behavior with a flag or whatever).
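
As a sketch of that rule (resolve_tag is a made-up name, and this is the
proposed behavior, not what fast-import does today):

```shell
# Input:  the <name> from a "tag <name>" line in the stream.
# Output: "<ref to create> <name for the tag header>".
resolve_tag () {
	case "$1" in
	refs/*)
		# Proposed hack: keep the ref as given, and use only the
		# last path component in the "tag" header.
		printf '%s %s\n' "$1" "${1##*/}"
		;;
	*)
		# Existing behavior: store the tag below refs/tags/.
		printf 'refs/tags/%s %s\n' "$1" "$1"
		;;
	esac
}
```

With this, "resolve_tag refs/mytags/mytag" prints "refs/mytags/mytag
mytag". The ambiguity is visible too: a name like "refs/foo" is stored
at "refs/tags/refs/foo" today, but would be placed at "refs/foo" under
this rule.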

The case we can't seem to support without a format change is if you not
only moved the tag to a new namespace, but also changed its name.

But isn't that a special case of fast-export being unable to support
custom commit/tag object headers (maybe it can, and I've just missed
that)? If so, we could easily support it as a minor special case of
sometimes including the built-in "tag" header as a "custom" header.
