git@vger.kernel.org list mirror (unofficial, one of many)
* Push force-with-lease with multi-URL remote
@ 2019-07-27 16:54 Christopher Head
  2019-07-27 17:46 ` Junio C Hamano
  2019-07-29 10:20 ` Jeff King
  0 siblings, 2 replies; 12+ messages in thread
From: Christopher Head @ 2019-07-27 16:54 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 1164 bytes --]

Hi folks,
When a single remote has multiple push URLs, Git’s force-with-lease
logic appears to be:

For each URL:
1. Read refs/heads/mybranch (call this commit X)
2. Read refs/remotes/myremote/mybranch (call this commit Y)
3. Send to the URL an atomic compare-and-swap, replacing Y with X.
4. If step 3 succeeded, change refs/remotes/myremote/mybranch to X.

This means that, assuming both URLs start out identical, the second URL
will always fail because refs/remotes/myremote/mybranch has been updated
from Y to X, and therefore the second compare-and-swap fails. I can’t
imagine any situation in which this behaviour is actually useful.

This is what I would expect:

1. Read refs/heads/mybranch (call this commit X)
2. Read refs/remotes/myremote/mybranch (call this commit Y)
3. For each URL:
3a. Send to the URL an atomic compare-and-swap, replacing Y with X.
4. If any (or maybe all) of the CAS operations in 3a succeeded, change
refs/remotes/myremote/mybranch to X.
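
A minimal sketch of the two orderings above, simulated with plain shell
variables rather than real repositories. The names cas, tracking, url1 and
url2 are illustrative assumptions, not Git internals: tracking stands in for
refs/remotes/myremote/mybranch, and url1/url2 for the branch tip at each push
URL.

```shell
#!/bin/bash
# Simulation only: shell variables stand in for refs; nothing here calls Git.

# cas NAME EXPECTED NEW: set the variable NAME to NEW only if it currently
# holds EXPECTED (a stand-in for the compare-and-swap done per push URL).
cas() {
    eval "cas_current=\$$1"
    [ "$cas_current" = "$2" ] || return 1
    eval "$1=\$3"
}

# Ordering reported as the existing behaviour: the tracking ref is re-read
# for every URL, so the second URL is compared against the updated value.
current_behaviour() (
    x=B; tracking=A; url1=A; url2=A; results=
    for url in url1 url2; do
        y=$tracking                      # step 2: read the tracking ref
        if cas "$url" "$y" "$x"; then    # step 3: compare-and-swap Y -> X
            tracking=$x                  # step 4: update the tracking ref
            results="$results ok"
        else
            results="$results rejected"
        fi
    done
    echo "${results# }"
)

# Proposed ordering: read the tracking ref once, reuse it for every URL, and
# update it afterwards if any compare-and-swap succeeded.
expected_behaviour() (
    x=B; tracking=A; url1=A; url2=A; results=; any_ok=0
    y=$tracking
    for url in url1 url2; do
        if cas "$url" "$y" "$x"; then
            any_ok=1
            results="$results ok"
        else
            results="$results rejected"
        fi
    done
    if [ "$any_ok" -eq 1 ]; then tracking=$x; fi
    echo "${results# }"
)

echo "current:  $(current_behaviour)"
echo "expected: $(expected_behaviour)"
```

In this simulation the first ordering reports "ok rejected" and the second
reports "ok ok", which matches the failure described above for two URLs that
start out identical.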

Thoughts? Does anyone have a use case for the existing behaviour? I
have attached a shell script which constructs some repos and
demonstrates the situation.

Thanks!
-- 
Christopher Head

[-- Attachment #2: test --]
[-- Type: application/octet-stream, Size: 521 bytes --]

#!/bin/bash

set -e

# Create remote1, remote2, local.
git init --bare remote1
git init --bare remote2
git init local
cd local
git remote add origin ../remote1
git remote set-url --push origin ../remote1
git remote set-url --push --add origin ../remote2

# Add commit A and push.
echo 'hello world' > test.txt
git add test.txt
git commit -m 'Commit A'
git push -u origin master

# Amend to commit B.
echo 'goodbye world' > test.txt
git add test.txt
git commit --amend --no-edit

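# Force-push. As described in the message above, the push to ../remote1
# succeeds and the push to ../remote2 is then rejected, because the
# remote-tracking ref for origin was already updated by the first push.
git push --force-with-lease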


* Re: Push force-with-lease with multi-URL remote
  2019-07-27 16:54 Push force-with-lease with multi-URL remote Christopher Head
@ 2019-07-27 17:46 ` Junio C Hamano
  2019-07-27 18:15   ` Christopher Head
  2019-07-29 10:20 ` Jeff King
  1 sibling, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2019-07-27 17:46 UTC (permalink / raw)
  To: Christopher Head; +Cc: git

Christopher Head <bugs@chead.ca> writes:

> Hi folks,
> When a single remote has multiple push URLs, Git’s force-with-lease
> logic appears to be:
>
> For each URL:
> 1. Read refs/heads/mybranch (call this commit X)
> 2. Read refs/remotes/myremote/mybranch (call this commit Y)
> 3....
> from Y to X, and therefore the second compare-and-swap fails. I can’t
> imagine any situation in which this behaviour is actually useful.

Quite honestly, the true culprit of the above story is that you are
letting multiple logically different remote repositories [*1*] use
the same single remote-tracking refs/remotes/myremote/ hierarchy.

If your previous "git push myremote" (with or without lease does not
matter, as this discussion is to illustrate that your setup is
fundamentally wrong) updated X but for some reason failed to update
Y (perhaps the network to Y was unreachable back then), and
refs/remotes/myremote/mybranch got updated to reflect the update to
X, what happens to your next "git push myremote" (or more
specifically, "git push Y")?  The repository on your local side
thinks that the other party has already taken the previous push but
in reality that is the state of X, and Y hasn't seen that previous
push.



* Re: Push force-with-lease with multi-URL remote
  2019-07-27 17:46 ` Junio C Hamano
@ 2019-07-27 18:15   ` Christopher Head
  2019-07-27 20:57     ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Christopher Head @ 2019-07-27 18:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On July 27, 2019 10:46:43 AM PDT, Junio C Hamano <gitster@pobox.com> wrote:
>Christopher Head <bugs@chead.ca> writes:
>
>Quite honestly, the true culprit of the above story is that you are
>letting multiple logically different remote repositories [*1*] use
>the same single remote-tracking refs/remotes/myremote/ hierarchy.

They weren’t supposed to be logically different (if I understand what you mean by that phrase). My intention was for the different URLs to be mirrors of each other, and multiple push URLs seemed to be the easiest way to update all the mirrors without having to mess around with making them trust each other and sending notifications and such.

If not this, then what are multiple push URLs on a single remote meant for?

>If your previous "git push myremote" (with or without lease does not
>matter, as this discussion is to illustrate that your setup is
>fundamentally wrong) updated X but for some reason failed to update
>Y (perhaps the network to Y was unreachable back then), and
>refs/remotes/myremote/mybranch got updated to reflect the update to
>X, what happens to your next "git push myremote" (or more
>specifically, "git push Y")?  The repository on your local side
>thinks that the other party has already taken the previous push but
>in reality that is the state of X, and Y hasn't seen that previous
>push.

Of course it wouldn’t be perfect even with my proposed behaviour. It just seemed more useful than the existing behaviour, which will essentially *never* do anything useful as far as I can tell.

-- 
Christopher Head


* Re: Push force-with-lease with multi-URL remote
  2019-07-27 18:15   ` Christopher Head
@ 2019-07-27 20:57     ` Junio C Hamano
  2019-07-27 21:43       ` Christopher Head
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2019-07-27 20:57 UTC (permalink / raw)
  To: Christopher Head; +Cc: git

Christopher Head <bugs@chead.ca> writes:

> On July 27, 2019 10:46:43 AM PDT, Junio C Hamano <gitster@pobox.com> wrote:
>>Christopher Head <bugs@chead.ca> writes:
>>
>>Quite honestly, the true culprit of the above story is that you are
>>letting multiple logically different remote repositories [*1*] use
>>the same single remote-tracking refs/remotes/myremote/ hierarchy.
>
> They weren’t supposed to be logically different (if I understand
> what you mean by that phrase). My intention was for the different
> URLs to be mirrors of each other,...

What I would call "logically the same" is the set of repositories
that are synchronized at the server side, which may or may not have
multiple URLs to reach it, but behave as if it is just a single one
without your doing anything special.  Your wanting to actively make
them in sync by the above definition makes them a logically different
set of repositories.  But the phrasing does not matter much.

One repository at a hosting service (which may internally be
replicated, but that is not even observable from outside) may be
reached over https:// or ssh://, and the result of pushing to either
one of the URLs would be observable by immediately fetching back
from either one.  Having both URLs and trying to use either one that
works may help under flaky proxy situation, for example.

In the reverse direction, I think "git fetch" supports the notion of
<group> of repositories, so that fetch from multiple remotes can be
initiated with a single command, but I am not sure if we added the
same <group> concept to the pushing side.  I personally want to have
finer control, so when I push my work to multiple repositories, each
of them is treated as a totally different push target, and a script
controls multiple "git push" processes to each of them in parallel,
with retries and all.  I think having multiple pushURL and pushing
to them is sort-of OK, but what is broken in your configuration is
that you have a single remote-tracking branch hierarchy---if you get
rid of it, so that refs/remotes/myremote/ does not exist and does
not get updated, I think things would work fine.
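
A minimal sketch of the "separate push targets, driven by a script" approach
described above, assuming each destination is configured as its own remote
(and so has its own refs/remotes/<name>/ hierarchy). The helper names
push_with_retry and push_all, the retry count and the one-second back-off are
all illustrative assumptions, not anything Git provides.

```shell
#!/bin/bash
# Hypothetical helpers; each destination is assumed to be its own remote with
# its own refs/remotes/<name>/ hierarchy.

# push_with_retry REMOTE REFSPEC [ATTEMPTS]: retry "git push" a few times.
push_with_retry() (
    remote=$1; refspec=$2; attempts=${3:-3}
    n=1
    until git push -q "$remote" "$refspec"; do
        if [ "$n" -ge "$attempts" ]; then
            exit 1                # give up on this destination
        fi
        n=$((n + 1))
        sleep 1                   # crude fixed back-off between attempts
    done
)

# push_all REFSPEC REMOTE...: push to every remote in parallel; the exit
# status is non-zero if any destination still failed after its retries.
push_all() (
    refspec=$1; shift
    pids=
    for remote in "$@"; do
        push_with_retry "$remote" "$refspec" &
        pids="$pids $!"
    done
    status=0
    for pid in $pids; do
        wait "$pid" || status=1
    done
    exit "$status"
)
```

With two remotes named mirror1 and mirror2 (names chosen only for
illustration), this would be invoked as "push_all master mirror1 mirror2".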


* Re: Push force-with-lease with multi-URL remote
  2019-07-27 20:57     ` Junio C Hamano
@ 2019-07-27 21:43       ` Christopher Head
  2019-07-29  5:19         ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Christopher Head @ 2019-07-27 21:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On July 27, 2019 1:57:18 PM PDT, Junio C Hamano <gitster@pobox.com> wrote:
>What I would call "logically the same" is the set of repositories
>that are synchronized at the server side, which may or may not have
>multiple URLs to reach it, but behave as if it is just a single one
>without your doing anything special.  Your wanting to actively make
>them in sync by the above definition makes them a logically different
>set of repositories.  But the phrasing does not matter much.

OK, I would probably have just used one push URL in this scenario.

>One repository at a hosting service (which may internally be
>replicated, but that is not even observable from outside) may be
>reached over https:// or ssh://, and the result of pushing to either
>one of the URLs would be observable by immediately fetching back
>from either one.  Having both URLs and trying to use either one that
>works may help under flaky proxy situation, for example.

That makes sense, I guess, if the unusable URLs can be expected to fail fast.

>In the reverse direction, I think "git fetch" supports the notion of
><group> of repositories, so that fetch from multiple remotes can be
>initiated with a single command, but I am not sure if we added the
>same <group> concept to the pushing side.  I personally want to have
>finer control, so when I push my work to multiple repositories, each
>of them is treated as a totally different push target, and a script
>controls multiple "git push" processes to each of them in parallel,
>with retries and all.  I think having multiple pushURL and pushing
>to them is sort-of OK, but what is broken in your configuration is
>that you have a single remote-tracking branch hierarchy---if you get
>rid of it, so that refs/remotes/myremote/ does not exist and does
>not get updated, I think things would work fine.

Yes, I agree, the presence of only a single remote tracking ref is what makes the use of a single remote with multiple URLs suboptimal here; it was just better than the other options. I tend to have pretty reliable Internet connectivity, and I don’t particularly want to go writing custom scripts. I just want to use the normal push and fetch commands. Using multiple URLs on one remote is OK, though the single remote tracking ref is annoying. Using separate remotes works, but is more annoying due to having to remember to push to all of them. If I understand what remote groups are (separate remotes but you can act on all of them with one command?) then they should be perfect. However, it does not look like they work for pushing. Would it make sense to add that?
-- 
Christopher Head


* Re: Push force-with-lease with multi-URL remote
  2019-07-27 21:43       ` Christopher Head
@ 2019-07-29  5:19         ` Junio C Hamano
  0 siblings, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2019-07-29  5:19 UTC (permalink / raw)
  To: Christopher Head; +Cc: git

Christopher Head <bugs@chead.ca> writes:

> On July 27, 2019 1:57:18 PM PDT, Junio C Hamano <gitster@pobox.com> wrote:
> ...
>>In the reverse direction, I think "git fetch" supports the notion of
>><group> of repositories, so that fetch from multiple remotes can be
>>initiated with a single command, but I am not sure if we added the
>>same <group> concept to the pushing side.
>>...

> If I understand what remote groups are (separate remotes but you
> can act on all of them with one command?) then they should be
> perfect. However it does not look like they work for
> pushing.

Yup, you are confirming what I already said ;-).

I do not offhand think of a fundamental reason why the <group>
concept should only apply to the fetching direction.  I am not sure
about a few design issues if we were to have "push" groups, though,
and somebody who wants to have the feature must think hard about them.

Should the same <group> be usable for both fetching and pushing, or
should there be two separate and independent namespaces, one for
fetch groups and the other for push groups, so that the set of
repositories "git fetch groupA" fetches from could be configured to
be different from the set of repositories "git push groupA" pushes to?  It can
be argued for both ways, but I am unsure about the pros and cons.
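
For reference, fetch-side groups are configured through remotes.<group> and
used as "git fetch <group>". A minimal sketch of the "same group for both
directions" option, with the push side emulated by a hypothetical helper
push_group (not a Git command); the group name mirrors and the remote names
mirror1 and mirror2 are assumptions made for illustration.

```shell
#!/bin/bash
# push_group GROUP [PUSH-ARGS...]: hypothetical helper, not a Git command.
# Runs "git push" once for each remote listed in remotes.GROUP.
push_group() (
    group=$1; shift
    members=$(git config --get-all "remotes.$group") || exit 1
    status=0
    for remote in $members; do
        git push -q "$remote" "$@" || status=1
    done
    exit "$status"
)

# Example use inside a repository with remotes mirror1 and mirror2:
#   git config remotes.mirrors 'mirror1 mirror2'
#   git fetch mirrors            # fetch side: existing group support
#   push_group mirrors master    # push side: emulated with the same group
```

Two independent namespaces would instead need a second configuration key for
the push side; the sketch deliberately does not invent one.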

How should the feature interact with push atomicity?  We obviously
would not want (and probably cannot afford) to arrange two or more
repositories coordinate by participating in two-phase commit etc., so
the best we could do may be to not even initiate push after seeing
a push to one destination fail, but there may be a better definition
people can come up with.
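
A minimal sketch of that "do not initiate further pushes after a failure"
rule, assuming the destinations are separate remotes that are pushed to one
after another. The helper name push_until_failure is hypothetical. Nothing is
rolled back: destinations that were already updated stay updated, so this is
not atomic across repositories.

```shell
#!/bin/bash
# push_until_failure REFSPEC REMOTE...: hypothetical helper. Pushes to each
# destination in order and stops at the first one that fails, leaving the
# remaining destinations untouched.
push_until_failure() (
    refspec=$1; shift
    for remote in "$@"; do
        if ! git push -q "$remote" "$refspec"; then
            echo "push to $remote failed; not contacting the rest" >&2
            exit 1
        fi
    done
)
```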




* Re: Push force-with-lease with multi-URL remote
  2019-07-27 16:54 Push force-with-lease with multi-URL remote Christopher Head
  2019-07-27 17:46 ` Junio C Hamano
@ 2019-07-29 10:20 ` Jeff King
  2019-07-29 13:33   ` Junio C Hamano
  1 sibling, 1 reply; 12+ messages in thread
From: Jeff King @ 2019-07-29 10:20 UTC (permalink / raw)
  To: Christopher Head; +Cc: git

On Sat, Jul 27, 2019 at 09:54:40AM -0700, Christopher Head wrote:

> For each URL:
> 1. Read refs/heads/mybranch (call this commit X)
> 2. Read refs/remotes/myremote/mybranch (call this commit Y)
> 3. Send to the URL an atomic compare-and-swap, replacing Y with X.
> 4. If step 3 succeeded, change refs/remotes/myremote/mybranch to X.
> 
> This means that, assuming both URLs start out identical, the second URL
> will always fail because refs/remotes/myremote/mybranch has been updated
> from Y to X, and therefore the second compare-and-swap fails. I can’t
> imagine any situation in which this behaviour is actually useful.

My general feeling is that having multiple push URLs for a remote is a
poorly designed feature in Git (and I think the discussion elsewhere in
this thread went there, as well).

But since we do have it, and if we are not going to deprecate it[1], it
seems like this case should pick the X value of myremote/mybranch ahead
of time, and then use it consistently for each push. There are questions
of partial push failures, etc, but as you note the current behavior
isn't ever useful. I think it is just a case where two features do not
interact well (and since neither is used all that frequently, nobody has
noticed).
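
With the existing command line, something close to "pick the value ahead of
time" can be approximated by passing an explicit expected value through the
documented --force-with-lease=<refname>:<expect> form, so that every push URL
is compared against the same commit. A minimal sketch, assuming the layout of
the test script from the first message (two local bare repositories as push
URLs, a branch named master); the temporary directory and the placeholder
identity exist only to keep the sketch self-contained.

```shell
#!/bin/bash
set -e

# Throwaway layout mirroring the test script: two bare repos as push URLs.
work=$(mktemp -d)
cd "$work"
git init -q --bare remote1
git init -q --bare remote2
git init -q local
cd local
git symbolic-ref HEAD refs/heads/master   # do not rely on init.defaultBranch
git config user.name 'Example User'       # placeholder identity
git config user.email 'user@example.invalid'
git remote add origin ../remote1
git remote set-url --push origin ../remote1
git remote set-url --push --add origin ../remote2

# Commit A goes to both URLs, then gets amended into commit B.
echo 'hello world' > test.txt
git add test.txt
git commit -q -m 'Commit A'
git push -q -u origin master
echo 'goodbye world' > test.txt
git add test.txt
git commit -q --amend --no-edit

# Read the expected value once, before any URL is contacted, and hand the
# same value to every push via the explicit <refname>:<expect> form.
expected=$(git rev-parse refs/remotes/origin/master)
git push -q --force-with-lease="master:$expected" origin master

new=$(git rev-parse HEAD)
tip1=$(git --git-dir=../remote1 rev-parse refs/heads/master)
tip2=$(git --git-dir=../remote2 rev-parse refs/heads/master)
echo "local=$new remote1=$tip1 remote2=$tip2"
```

This only works around the default mode; it does not change how a bare
--force-with-lease picks its expected value.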

-Peff

[1] I would not be at all sad to see multiple URLs like this get
    deprecated in favor of multiple remotes with convenient grouping
    options. If that happens, then your original problem goes away. ;)


* Re: Push force-with-lease with multi-URL remote
  2019-07-29 10:20 ` Jeff King
@ 2019-07-29 13:33   ` Junio C Hamano
  2019-07-29 13:47     ` Christopher Head
  2019-07-29 19:20     ` Jeff King
  0 siblings, 2 replies; 12+ messages in thread
From: Junio C Hamano @ 2019-07-29 13:33 UTC (permalink / raw)
  To: Jeff King; +Cc: Christopher Head, git

Jeff King <peff@peff.net> writes:

> My general feeling is that having multiple push URLs for a remote is a
> poorly designed feature in Git (and I think the discussion elsewhere in
> this thread went there, as well).

That's being generous.  I do not think it was even designed---at
least, the interaction with remote-tracking is ill thought out,
but I think the updating of remote-tracking by pretending to have
turned around and fetched immediately after it has done its thing
came much later than multiple URLs for push.  A remote with multiple
URLs without any remote-tracking (i.e. "push only remote") behaves
semi-sensibly.

> But since we do have it, and if we are not going to deprecate it[1], it
> seems like this case should pick the X value of myremote/mybranch ahead
> of time, and then use it consistently for each push.

I agree but only if the listed ones are separate ones.  If the URLs
are separate paths to reach the same remote (e.g. https:// and ssh://
going to the same place), the current definition would make more sense.


* Re: Push force-with-lease with multi-URL remote
  2019-07-29 13:33   ` Junio C Hamano
@ 2019-07-29 13:47     ` Christopher Head
  2019-07-29 19:20     ` Jeff King
  1 sibling, 0 replies; 12+ messages in thread
From: Christopher Head @ 2019-07-29 13:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Christopher Head, git

On Mon, 29 Jul 2019 06:33:32 -0700
Junio C Hamano <gitster@pobox.com> wrote:

> I agree but only if the listed ones are separate ones.  If the URLs
> are separate paths to reach the same remote (e.g. https:// and ssh://
> going to the same place), the current definition would make more
> sense.

I realize I’m a bit biased towards my personal use case, but I wonder
if it would make sense to consider how frequently each case occurs?

Case 1: someone wants to keep multiple repos mirrored, by always
pushing to all of them (my use case).

Case 2: someone wants to push to one repo, but vagaries of Internet
connectivity mean that sometimes they can’t use SSH and other times
they can’t use HTTP (or they prefer one protocol but sometimes that
one doesn’t work), therefore they want both URLs so that when one fails
the other may work. I suppose the most common situation in this case is
that you want to use SSH so that you don’t have to type a password, but
sometimes you’re in a site which only allows HTTP connections and
typing a password as a fallback is preferable to failing altogether?

For me, case 1 happens quite frequently but case 2 pretty much never—I
don’t think I’ve ever been somewhere that blocks port 22 outbound, so I
always just use SSH. But I realize other people’s experience varies.
-- 
Christopher Head


* Re: Push force-with-lease with multi-URL remote
  2019-07-29 13:33   ` Junio C Hamano
  2019-07-29 13:47     ` Christopher Head
@ 2019-07-29 19:20     ` Jeff King
  2019-07-29 21:44       ` Junio C Hamano
  1 sibling, 1 reply; 12+ messages in thread
From: Jeff King @ 2019-07-29 19:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Christopher Head, git

On Mon, Jul 29, 2019 at 06:33:32AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > My general feeling is that having multiple push URLs for a remote is a
> > poorly designed feature in Git (and I think the discussion elsewhere in
> > this thread went there, as well).
> 
> That's being generous.  I do not think it was even designed---at
> least, the interaction with remote-tracking is ill thought out,
> but I think the updating of remote-tracking by pretending to have
> turned around and fetched immediately after it has done its thing
> came much later than multiple URLs for push.  A remote with multiple
> URLs without any remote-tracking (i.e. "push only remote") behaves
> semi-sensibly.

Yeah, the auto-update of the tracking refs came later (so I think you
could argue the bad interaction is my fault!).

> > But since we do have it, and if we are not going to deprecate it[1], it
> > seems like this case should pick the X value of myremote/mybranch ahead
> > of time, and then use it consistently for each push.
> 
> I agree but only if the listed ones are separate ones.  If the URLs
> are separate paths to reach the same remote (e.g. https:// and ssh://
> going to the same place), the current definition would make more sense.

Hmm, true. I'd almost argue that --force-with-lease, at least in its
default mode with no explicit lease source specified, should allow an
update from X to Y to be a successful noop if the remote "somehow"
already moved to Y.

This multi-URL push is one such "somehow", but I could imagine a case
where two other independent processes are racing. And we do not care at
all how we get to "Y", only that we get there.
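
A minimal sketch of the relaxed rule floated above, written as a pure
decision function with no Git calls. The function name lease_outcome and the
three result labels are illustrative assumptions; in the notation of this
message, EXPECTED is X and NEW is Y.

```shell
#!/bin/bash
# lease_outcome REMOTE_TIP EXPECTED NEW: classify a force-with-lease update
# under the relaxed rule (hypothetical; not how Git behaves today).
lease_outcome() {
    if [ "$1" = "$2" ]; then
        echo update    # lease holds: the remote moves from EXPECTED to NEW
    elif [ "$1" = "$3" ]; then
        echo noop      # remote "somehow" already has NEW: treat as success
    else
        echo reject    # remote moved somewhere else: stale lease
    fi
}

lease_outcome A A B
lease_outcome B A B
lease_outcome C A B
```

The three calls print update, noop and reject respectively. Under the
existing behaviour as described in this thread, the second case is rejected
as well.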

But I haven't thought it through carefully, and I wonder if some users
would be unhappy not to find out that somebody had moved to "Y" already.

-Peff


* Re: Push force-with-lease with multi-URL remote
  2019-07-29 19:20     ` Jeff King
@ 2019-07-29 21:44       ` Junio C Hamano
  2019-07-29 22:29         ` Jeff King
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2019-07-29 21:44 UTC (permalink / raw)
  To: Jeff King; +Cc: Christopher Head, git

Jeff King <peff@peff.net> writes:

> Yeah, the auto-update of the tracking refs came later (so I think you
> could argue the bad interaction is my fault!).

Heh, I somehow thought it was somebody else.

> Hmm, true. I'd almost argue that --force-with-lease, at least in its
> default mode with no explicit lease source specified, should allow an
> update from X to Y to be a successful noop if the remote "somehow"
> already moved to Y.

I've already written the --force-with-lease that expects what you
have on your remote-tracking branch off as a gross misdesign that
should be deprecated in the longer term; I do not have a strong
opinion on the tweaks to be done to the feature until it gets
dropped ;-)



* Re: Push force-with-lease with multi-URL remote
  2019-07-29 21:44       ` Junio C Hamano
@ 2019-07-29 22:29         ` Jeff King
  0 siblings, 0 replies; 12+ messages in thread
From: Jeff King @ 2019-07-29 22:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Christopher Head, git

On Mon, Jul 29, 2019 at 02:44:00PM -0700, Junio C Hamano wrote:

> > Hmm, true. I'd almost argue that --force-with-lease, at least in its
> > default mode with no explicit lease source specified, should allow an
> > update from X to Y to be a successful noop if the remote "somehow"
> > already moved to Y.
> 
> I've already written the --force-with-lease that expects what you
> have on your remote-tracking branch off as a gross misdesign that
> should be deprecated in the longer term; I do not have a strong
> opinion on the tweaks to be done to the feature until it gets
> dropped ;-)

Well, that part I certainly agree with. ;)

-Peff
