git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* RFC: How should we handle un-deleted remote branches?
@ 2018-04-20 12:14 Ævar Arnfjörð Bjarmason
  2018-04-22  7:41 ` Andreas Heiduk
  0 siblings, 1 reply; 5+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-04-20 12:14 UTC (permalink / raw)
  To: Git Mailing List

I removed a remote and its refs persisted. At first I thought this was a
bug but looking at add_branch_for_removal()'s "don't delete a branch if
another remote also uses it" logic it's intentional.

This goes very wrong if you do e.g.:

    (
        rm -rf /tmp/git &&
        git clone --bare --mirror git@github.com:git/git.git /tmp/git &&
        cd /tmp/git &&
        git remote add avar  git@github.com:avar/git.git &&
        git fetch avar &&
        git remote remove avar &&
        git for-each-ref|grep -c refs/remotes/avar
    )

At this point you can't prune it since no remote config is left:

    $ git remote prune avar
    fatal: 'avar' does not appear to be a git repository
    fatal: Could not read from remote repository.

But this is a possible work-around:

    git init /tmp/empty.git
    git remote add avar file:///tmp/empty.git
    git remote prune avar
    git remote remove avar

Or use update-ref -d to remove them:

    for rr in $(git for-each-ref --format="%(refname)"|grep ^refs/remotes/avar/)
    do
        git update-ref -d $rr
    done

I started to patch this, but I'm not sure what to do here. we could do
some combination of:

 0. Just document the current behavior and leave it.

 1. Dig further down to see what other remotes reference these refs, and
    just ignore any refspecs that don't explicitly reference
    refs/remotes/<our_deleted_remote>/*.

    I.e. isn't the intention here to preserve a case where you have two
    URLs for the same effective remote, not whene you have something
    like a --mirror refspec? Unfortunately I can't ask the original
    author :(

 2. Warn about each ref we didn't delete, or at least warn saying
    there's undeleted refs under refs/remotes/<name>/*.

 3. Make 'git remote remove --force-deletion <name>' (or whatever the
    flag is called) be a thing. But unless we do the next item this
    won't be useful.

 4. Make 'git remote prune <name>' work in cases where we don't have a
    remote called <name> anymore, just falling back to deleting
    refs/remotes/<name>. In this case 'git remote remove
    --force-deletion <name>' would also do the same thing.

In any case, the current behavior where we just leave this crap behind
without any explanation or an obvious way to delete them (can't use
git-branch -d, need update-ref -d) isn't nice.

I encountered this in the wild because GitLab will add an "upstream" ref
if you clone a repo, but if you stop using the remote to update it in
combination with their "geo" remote (which has --mirror refspecs) it'll
leave behind all these dead refs.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: RFC: How should we handle un-deleted remote branches?
  2018-04-20 12:14 RFC: How should we handle un-deleted remote branches? Ævar Arnfjörð Bjarmason
@ 2018-04-22  7:41 ` Andreas Heiduk
  2018-04-22 11:17   ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 5+ messages in thread
From: Andreas Heiduk @ 2018-04-22  7:41 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Git Mailing List

Am 20.04.2018 um 14:14 schrieb Ævar Arnfjörð Bjarmason:
> But this is a possible work-around:
> 
>     git init /tmp/empty.git
>     git remote add avar file:///tmp/empty.git
>     git remote prune avar
>     git remote remove avar

This won't do it also?

	git remote prune origin


> I started to patch this, but I'm not sure what to do here. we could do
> some combination of:
> 
>  0. Just document the current behavior and leave it.
> 
>  1. Dig further down to see what other remotes reference these refs, and
>     just ignore any refspecs that don't explicitly reference
>     refs/remotes/<our_deleted_remote>/*.
> 
>     I.e. isn't the intention here to preserve a case where you have two
>     URLs for the same effective remote, not whene you have something
>     like a --mirror refspec? Unfortunately I can't ask the original
>     author :(
> 
>  2. Warn about each ref we didn't delete, or at least warn saying
>     there's undeleted refs under refs/remotes/<name>/*.
> 
>  3. Make 'git remote remove --force-deletion <name>' (or whatever the
>     flag is called) be a thing. But unless we do the next item this
>     won't be useful.
> 
>  4. Make 'git remote prune <name>' work in cases where we don't have a
>     remote called <name> anymore, just falling back to deleting
>     refs/remotes/<name>. In this case 'git remote remove
>     --force-deletion <name>' would also do the same thing.

Possible 5):

	Don't fix "git remote remove" but "git remote add" to complain that its
ref-namespace is already occupied by some other remote. Add "--force"
for the experts.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: RFC: How should we handle un-deleted remote branches?
  2018-04-22  7:41 ` Andreas Heiduk
@ 2018-04-22 11:17   ` Ævar Arnfjörð Bjarmason
  2018-04-22 14:30     ` Andreas Heiduk
  0 siblings, 1 reply; 5+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-04-22 11:17 UTC (permalink / raw)
  To: Andreas Heiduk; +Cc: Git Mailing List


On Sun, Apr 22 2018, Andreas Heiduk wrote:

> Am 20.04.2018 um 14:14 schrieb Ævar Arnfjörð Bjarmason:
>> But this is a possible work-around:
>>
>>     git init /tmp/empty.git
>>     git remote add avar file:///tmp/empty.git
>>     git remote prune avar
>>     git remote remove avar
>
> This won't do it also?
>
> 	git remote prune origin

Yes, in this particular case, but that's just emergent behavior in how
we handle refspec prunign, and the fact that it "works" is arguably a
bug in "prune". i.e. this:

    (
        rm -rf /tmp/git &&
        git clone --bare --mirror git@github.com:git/git.git /tmp/git &&
        cd /tmp/git &&
        git remote add avar git@github.com:avar/git.git &&
        git remote add peff git@github.com:peff/git.git &&
        git fetch --all &&
        git remote remove avar &&
        git remote prune origin
    )

Will delete all the avar/* and peff/* branches, even though I still have
a "peff" remote.

IOW the guarding logic we have in add_branch_for_removal() for not
deleting the branches of other remotes isn't in the corresponding
"prune" function, and that's a bug.

In the specific example I picked "git remote prune origin" just so
happens to do the right thing since I have no other active remote, and
there *is* an alive remote so I can "prune" against it, but it doesn't
help in the general case. In my case I have a remote URL for a git
server called "upstream" that doesn't exist anymore (but as noted, I can
fake it with an empty repo...)>


>> I started to patch this, but I'm not sure what to do here. we could do
>> some combination of:
>>
>>  0. Just document the current behavior and leave it.
>>
>>  1. Dig further down to see what other remotes reference these refs, and
>>     just ignore any refspecs that don't explicitly reference
>>     refs/remotes/<our_deleted_remote>/*.
>>
>>     I.e. isn't the intention here to preserve a case where you have two
>>     URLs for the same effective remote, not whene you have something
>>     like a --mirror refspec? Unfortunately I can't ask the original
>>     author :(
>>
>>  2. Warn about each ref we didn't delete, or at least warn saying
>>     there's undeleted refs under refs/remotes/<name>/*.
>>
>>  3. Make 'git remote remove --force-deletion <name>' (or whatever the
>>     flag is called) be a thing. But unless we do the next item this
>>     won't be useful.
>>
>>  4. Make 'git remote prune <name>' work in cases where we don't have a
>>     remote called <name> anymore, just falling back to deleting
>>     refs/remotes/<name>. In this case 'git remote remove
>>     --force-deletion <name>' would also do the same thing.
>
> Possible 5):
>
> 	Don't fix "git remote remove" but "git remote add" to complain that its
> ref-namespace is already occupied by some other remote. Add "--force"
> for the experts.

Indeed, that's another bug here, i.e. in the above example:

    git remote remove peff && # won't delete peff/ branches
    git remote add peff git@github.com:peff/git.git

Will happily add the "peff" remote again, even though as you point out
it could be an entirely different remote.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: RFC: How should we handle un-deleted remote branches?
  2018-04-22 11:17   ` Ævar Arnfjörð Bjarmason
@ 2018-04-22 14:30     ` Andreas Heiduk
  2018-04-22 18:37       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 5+ messages in thread
From: Andreas Heiduk @ 2018-04-22 14:30 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Git Mailing List

Am 22.04.2018 um 13:17 schrieb Ævar Arnfjörð Bjarmason:
> 
> On Sun, Apr 22 2018, Andreas Heiduk wrote:
> 
>> Am 20.04.2018 um 14:14 schrieb Ævar Arnfjörð Bjarmason:
>>> But this is a possible work-around:
>>>
>>>     git init /tmp/empty.git
>>>     git remote add avar file:///tmp/empty.git
>>>     git remote prune avar
>>>     git remote remove avar
>>
>> This won't do it also?
>>
>> 	git remote prune origin
> 
> Yes, in this particular case, but that's just emergent behavior in how
> we handle refspec prunign, and the fact that it "works" is arguably a
> bug in "prune". i.e. this:

Its not emergent because "origin" is the other remote responsible for
that ref and cleaning stuff "belonging" to the remote is the job
description (I'm arguing from a user's perspective, not as a git-developer).

> 
>     (
>         rm -rf /tmp/git &&
>         git clone --bare --mirror git@github.com:git/git.git /tmp/git &&
>         cd /tmp/git &&
>         git remote add avar git@github.com:avar/git.git &&
>         git remote add peff git@github.com:peff/git.git &&
>         git fetch --all &&
>         git remote remove avar &&
>         git remote prune origin
>     )
> 
> Will delete all the avar/* and peff/* branches, even though I still have
> a "peff" remote.

Exactly my point: When you are in a the bad situation of "shared
responsibility", then there is no easy and always correct way out,
because there are uncountable possible situations.

To give another, slightly modified example expanding the problem space:

(
    rm -rf /tmp/git &&
    git clone --bare --mirror https://github.com/git/git.git /tmp/git &&
    cd /tmp/git &&
    git remote add github https://github.com/avar/git.git &&
    git fetch github &&
    git fetch origin &&
    # note new fetches for "github/master" using with "(forced update)"
)

For ... reasons the first repo publishes some references like

	github/maint
	github/master
	github/pu

So when this repo is mirrored AND another, suitably named remote is
added then there will be also namespace conflicts. You can call

    git fetch github
    git fetch origin

in a loop and most likely each fetch will update the same refs, always
toggling between two states.

So: not only "remote remove" and "remote prune" are at stake but every
command handling remote references.

How should "git remote remove github" work in both situations? Remove
the refs/remotes/github/master & co? remove them only if the last fetch
was for "github" but not when the last update was for "origin"? Should
"git fetch" use the same logic?

So it seems better to me to avoid that bad situation altogether. Don't
allow overlapping/conflicting refspecs when adding a new remote. Using
*your* last examples both

>         git remote add avar git@github.com:avar/git.git &&
>         git remote add peff git@github.com:peff/git.git &&

should have failed and hence the "prune" problems won't exist. Same for
my example.

>> Possible 5):
>>
>> 	Don't fix "git remote remove" but "git remote add" to complain that its
>> ref-namespace is already occupied by some other remote. Add "--force"
>> for the experts.
> 
> Indeed, that's another bug here, i.e. in the above example:
> 
>     git remote remove peff && # won't delete peff/ branches
>     git remote add peff git@github.com:peff/git.git
> 
> Will happily add the "peff" remote again, even though as you point out
> it could be an entirely different remote.

Ummm. That was not my point. My is: "git clone --mirror" uses a refspec

	fetch = +refs/*:refs/*

and hence "occupies" the complete "refs/*" namespace. So adding another
remote (for the first time or for the second time is as irrelevant as
the url) will use

	fetch = +refs/heads/*:refs/remotes/peff/*

and now the "refs/remotes/peff/*" part is in conflict with "refs/*" from
above. The conflict exists not only for "prune" or "remove" but also for
"fetch", "rename" (didn't check) .

This kind of conflict should not be allowed right from the start - when
the first "git remote add peff..." is executed. Then prune, remove AND
fetch would be OK.



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: RFC: How should we handle un-deleted remote branches?
  2018-04-22 14:30     ` Andreas Heiduk
@ 2018-04-22 18:37       ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 5+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-04-22 18:37 UTC (permalink / raw)
  To: Andreas Heiduk; +Cc: Git Mailing List


On Sun, Apr 22 2018, Andreas Heiduk wrote:

> Am 22.04.2018 um 13:17 schrieb Ævar Arnfjörð Bjarmason:
>>
>> On Sun, Apr 22 2018, Andreas Heiduk wrote:
>>
>>> Am 20.04.2018 um 14:14 schrieb Ævar Arnfjörð Bjarmason:
>>>> But this is a possible work-around:
>>>>
>>>>     git init /tmp/empty.git
>>>>     git remote add avar file:///tmp/empty.git
>>>>     git remote prune avar
>>>>     git remote remove avar
>>>
>>> This won't do it also?
>>>
>>> 	git remote prune origin
>>
>> Yes, in this particular case, but that's just emergent behavior in how
>> we handle refspec prunign, and the fact that it "works" is arguably a
>> bug in "prune". i.e. this:
>
> Its not emergent because "origin" is the other remote responsible for
> that ref and cleaning stuff "belonging" to the remote is the job
> description (I'm arguing from a user's perspective, not as a git-developer).

Right, I should have said something closer to "inconsistent", at least
from the user's perspective. I.e. "remove" doesn't delete your branches
because another ref "owns" them, but "prune" will happily clobber
another remote's refs without warning.

Maybe we're happy to keep that, pruning is a bit of an oddity already,
see the "Git has a default disposition of keeping[...]" docs and the
rest of the "PRUNING" section I added to git-fetch in 2.17.0, but this
is osmething worth keeping in mind.

>>
>>     (
>>         rm -rf /tmp/git &&
>>         git clone --bare --mirror git@github.com:git/git.git /tmp/git &&
>>         cd /tmp/git &&
>>         git remote add avar git@github.com:avar/git.git &&
>>         git remote add peff git@github.com:peff/git.git &&
>>         git fetch --all &&
>>         git remote remove avar &&
>>         git remote prune origin
>>     )
>>
>> Will delete all the avar/* and peff/* branches, even though I still have
>> a "peff" remote.
>
> Exactly my point: When you are in a the bad situation of "shared
> responsibility", then there is no easy and always correct way out,
> because there are uncountable possible situations.
>
> To give another, slightly modified example expanding the problem space:
>
> (
>     rm -rf /tmp/git &&
>     git clone --bare --mirror https://github.com/git/git.git /tmp/git &&
>     cd /tmp/git &&
>     git remote add github https://github.com/avar/git.git &&
>     git fetch github &&
>     git fetch origin &&
>     # note new fetches for "github/master" using with "(forced update)"
> )
>
> For ... reasons the first repo publishes some references like
>
> 	github/maint
> 	github/master
> 	github/pu
>
> So when this repo is mirrored AND another, suitably named remote is
> added then there will be also namespace conflicts. You can call
>
>     git fetch github
>     git fetch origin
>
> in a loop and most likely each fetch will update the same refs, always
> toggling between two states.
>
> So: not only "remote remove" and "remote prune" are at stake but every
> command handling remote references.

Right, there's certainly a lot of insanity you can create with
overlapping refs, but to bring this thread around a bit, the edge cases
I'm interested in addressing are those where "git remote <whatever>"
silently doesn't do its job, or does it too eagerly, resulting in a
hard-to-repair repo state.

> How should "git remote remove github" work in both situations? Remove
> the refs/remotes/github/master & co? remove them only if the last fetch
> was for "github" but not when the last update was for "origin"? Should
> "git fetch" use the same logic?

Until we move to some other ref store implementation that namespaces
things properly, the best we can do is simply to assume that
refs/remotes/<name>/ is owned by the <name> remote for the purposes of
remove/prune etc.

That'll of course leave open this edge case you're pointing out where
you're mirroring another remote into refs/* and it creates a
remote/<name>/ branch, but I think it's sufficient for the purposes of
sane UI to just document that fetching into refs/* creates such caveats.

> So it seems better to me to avoid that bad situation altogether. Don't
> allow overlapping/conflicting refspecs when adding a new remote. Using
> *your* last examples both
>
>>         git remote add avar git@github.com:avar/git.git &&
>>         git remote add peff git@github.com:peff/git.git &&
>
> should have failed and hence the "prune" problems won't exist. Same for
> my example.

I think this is a non-started. There's plenty of legitimate reasons to
have overlapping refspecs, e.g. the GitLab case where they're creating a
"geo" remote which is a mirror of other refs they push to.

Even if that wasn't the case, it would be very fragile to solve these
cases by disallowing adding such remotes, since users can edit the
config file, we'd need to detect it on the fly.

>>> Possible 5):
>>>
>>> 	Don't fix "git remote remove" but "git remote add" to complain that its
>>> ref-namespace is already occupied by some other remote. Add "--force"
>>> for the experts.
>>
>> Indeed, that's another bug here, i.e. in the above example:
>>
>>     git remote remove peff && # won't delete peff/ branches
>>     git remote add peff git@github.com:peff/git.git
>>
>> Will happily add the "peff" remote again, even though as you point out
>> it could be an entirely different remote.
>
> Ummm. That was not my point. My is: "git clone --mirror" uses a refspec
>
> 	fetch = +refs/*:refs/*
>
> and hence "occupies" the complete "refs/*" namespace. So adding another
> remote (for the first time or for the second time is as irrelevant as
> the url) will use
>
> 	fetch = +refs/heads/*:refs/remotes/peff/*
>
> and now the "refs/remotes/peff/*" part is in conflict with "refs/*" from
> above. The conflict exists not only for "prune" or "remove" but also for
> "fetch", "rename" (didn't check) .
>
> This kind of conflict should not be allowed right from the start - when
> the first "git remote add peff..." is executed. Then prune, remove AND
> fetch would be OK.

As noted above we need to deal with this overlap, and it has to be
allowed, but we can still do better than we do now with "remove" /
"prune" et al, i.e. simply inform & ask the user what he'd like to do.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2018-04-22 18:37 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-20 12:14 RFC: How should we handle un-deleted remote branches? Ævar Arnfjörð Bjarmason
2018-04-22  7:41 ` Andreas Heiduk
2018-04-22 11:17   ` Ævar Arnfjörð Bjarmason
2018-04-22 14:30     ` Andreas Heiduk
2018-04-22 18:37       ` Ævar Arnfjörð Bjarmason

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).