git@vger.kernel.org mailing list mirror (one of many)
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Calvin Wan <calvinwan@google.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] Documentation: clarify multiple pushurls vs urls
Date: Mon, 06 Feb 2023 22:55:11 +0100
Message-ID: <230206.86pmam4exz.gmgdl@evledraar.gmail.com>
In-Reply-To: <CAFySSZCO7M8bm8Cc97x7MpZYHd0qWwRHF_YRDmw1rryF6Q7dnQ@mail.gmail.com>


On Mon, Feb 06 2023, Calvin Wan wrote:

>> > Defining multiple `url` fields can cause confusion for users since
>> > running `git config remote.<remote>.url` returns the last defined url
>> > which doesn't align with the url `git fetch <remote>` uses (the first).
>>
>> I'm certainly confused, I had no idea it worked this way, I'd have
>> thought it was last-set-wins like most things.
>>
>> From a glance fb0cc87ec0f (Allow programs to not depend on remotes
>> having urls, 2009-11-18) mentions it as a known factor, but with:
>>
>>         diff --git a/transport.c b/transport.c
>>         index 77a61a9d7bb..06159c4184e 100644
>>         --- a/transport.c
>>         +++ b/transport.c
>>         @@ -1115,7 +1115,7 @@ struct transport *transport_get(struct remote *remote, const char *url)
>>                 helper = remote->foreign_vcs;
>>
>>                 if (!url && remote->url)
>>         -               url = remote->url[0];
>>         +               url = remote->url[remote->url_nr - 1];
>>                 ret->url = url;
>>
>>                 /* maybe it is a foreign URL? */
>>
>> All tests pass for me, and it's selecting the last URL now. I can't find
>> any other mention of these semantics in the docs (but maybe I didn't
>> look in the right places).
>>
>> So is this just some accident, does anyone rely on it, and would we be
>> better off just "fixing" this, rather than steering people away from
>> "url"?
>
> I should've mentioned running `git remote -v` on a config with multiple urls
> shows the correct fetch url, so functionally everything is working as
> intended -- just needs a doc update somewhere.

Ah, yes, it seems to prefer the first configured one. Whether that's
what anyone intended (and whether we should use the last configured one
instead) is another matter.

But in any case, figuring that out and having a test in-tree that fails
if you pick the first or last of the list (depending on what we go for)
would be most welcome...

>> Surely if there's confusion about the priority of the *.url config
>> variable we should be documenting that explicitly where we discuss "url"
>> itself (e.g. in Documentation/config/remote.txt). Just mentioning it in
>> passing as we document "pushUrl" feels like the wrong place.
>>
>> But I still don't quite see the premise. "git push" has a feature to
>> push to all N urls, whether that's Url or pushUrl.
>>
>> When I configure it to have multiple URLs it pushes to the first
>> configured one first, if the source of the confusion was that it didn't
>> prefer the last configured one first, shouldn't it be doing them in
>> reverse order?
>>
>> I don't think that would make sense, but I also don't see how
>> recommending "pushurl" over "url" un-confuses things.
>>
>> So why is it confusing that "fetch" would use the same order, but due to
>> the semantics of a "fetch" we'd stop after the first one?
>
> I agree with you now that updating the documentation in
> Documentation/config/remote.txt is the ideal way to go about this, but

Aside: I actually think somewhere near where you made the change in
Documentation/urls-remotes.txt is probably better; in remote.txt we just
point to git-fetch.txt or git-pull.txt etc., which in turn include that.

It just seems we'd need a short blurb about how URLs are selected: we
prefer the first one; for fetch it's always "stop at the first", and
for push it's "push to all, from first to last".

I may have gotten that wrong, but that's my current understanding from
looking at it briefly.
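
To make that understanding concrete, here is a minimal Python sketch of
the three behaviours discussed so far. It is a toy model, not git's
actual code: the function names are made up for illustration, the URLs
are placeholders, and it only covers a remote that has "url" entries and
no "pushUrl".

```python
# Toy model of how a remote's multi-valued "url" is consumed.
# All function names are hypothetical; this is not git's implementation.

def config_get(urls):
    """What `git config remote.<remote>.url` prints: the last value."""
    return urls[-1]

def fetch_url(urls):
    """What `git fetch <remote>` uses: the first value, and only that one."""
    return urls[0]

def push_urls(urls):
    """What `git push <remote>` uses without pushUrl: all, first to last."""
    return list(urls)

# Placeholder values standing in for three configured "url" lines.
urls = ["url-1", "url-2", "url-3"]

print("config:", config_get(urls))
print("fetch: ", fetch_url(urls))
print("push:  ", push_urls(urls))
```

The mismatch between the first two functions is the confusion the
original patch was about.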

> I'll mention what my original thought process was:
> If a user wants one url to push/fetch to, then he defines 'url'
> If a user wants to push to multiple urls, then he can either define
> multiple urls or pushurls (one of the pushurls can be the same as the url).
> But if a user has say url #2 and #3 defined, they act as pushurls anyways,
> so defining them as such removes any speculation as to what else they
> could do (and also clears up the confusion when running
> `git config remote.<remote>.url`).

I'm coming away from this with the impression that we should almost
never recommend "pushUrl", not that it's worthwhile to use it to solve
this ambiguity.

Trying this out, given a config like:
	
	[remote "avar"]
	        url = git@github.com:avar/git.git
	        url = git@github.com:avar/git1.git
	        url = git@github.com:avar/git2.git

I'll get:
	
	$ ./git remote -v|grep avar
	avar    git@github.com:avar/git.git (fetch)
	avar    git@github.com:avar/git.git (push)
	avar    git@github.com:avar/git1.git (push)
	avar    git@github.com:avar/git2.git (push)

So, the semantics of "url" are that the first one is always the fetch
URL, there's no such thing as multiple fetch URLs, and the push URLs are
the list of all "url" entries.

But now let's change that to:

	[remote "avar"]
	        url = git@github.com:avar/git.git
	        url = git@github.com:avar/git1.git
	        pushUrl = git@github.com:avar/git2.git

Which gives us:

	$ ./git remote -v|grep avar
	avar    git@github.com:avar/git.git (fetch)
	avar    git@github.com:avar/git2.git (push)

So, by defining a "pushUrl", every "url" after the first has been
shadowed: the first is still used for fetch, but none of them is pushed
to anymore.
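
The two listings above can be reproduced with a small Python model. This
is only a sketch of the observed behaviour, assuming the remote has at
least one "url"; `remote_v` is a made-up helper, not anything in git.

```python
# Sketch of the listing `git remote -v` produced in the two experiments.
# `remote_v` is a hypothetical helper; it assumes at least one "url".

def remote_v(name, urls, pushurls=()):
    # Fetch always uses the first "url".
    lines = [f"{name}\t{urls[0]} (fetch)"]
    # Push uses every "pushUrl" if any exist, otherwise every "url".
    for target in (pushurls or urls):
        lines.append(f"{name}\t{target} (push)")
    return lines

only_urls = remote_v("avar", [
    "git@github.com:avar/git.git",
    "git@github.com:avar/git1.git",
    "git@github.com:avar/git2.git",
])

with_pushurl = remote_v(
    "avar",
    ["git@github.com:avar/git.git", "git@github.com:avar/git1.git"],
    ["git@github.com:avar/git2.git"],
)

print("\n".join(only_urls))
print("\n".join(with_pushurl))
```

In the second case git1.git appears nowhere in the result, which is the
shadowing described above.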
	
Maybe that was considered at the time, but it wouldn't surprise me if
that's a blind spot in the original "pushUrl" patch (203462347fc (Allow
push and fetch urls to be different, 2009-06-09)), i.e. shouldn't we at
least warn on "push" now or somewhere else about the useless second
"url"?
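
As a rough illustration of what such a warning would need to detect,
here is a Python sketch. Both the check and the message wording are
hypothetical; nothing in this thread establishes that git has such a
warning.

```python
# Hypothetical check: which "url" entries are dead once a pushUrl exists?
# Neither the function nor the warning text exists in git.

def shadowed_urls(urls, pushurls):
    """Return the "url" entries used for neither fetch nor push."""
    if not pushurls:
        return []      # without pushUrl every "url" is a push target
    return urls[1:]    # the first is still the fetch URL; the rest are unused

urls = ["git@github.com:avar/git.git", "git@github.com:avar/git1.git"]
pushurls = ["git@github.com:avar/git2.git"]

for url in shadowed_urls(urls, pushurls):
    print(f"warning: remote url '{url}' is unused because pushUrl is set")
```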

So, back to "pushurl". It seems to me that the only real use-case for it
is the case where the URL really must be different for "fetch" and
"push", which I daresay is increasingly rare these days (but used to be
more common when setups like dumb http read + ssh write were
widespread).

Whereas the more common case is just wanting to fetch/push from/to
github.com, and also wanting to mirror to gitlab.com or whatever.

In that case using "pushUrl" requires you to duplicate the "fetch" url,
whereas just having multiple "url" entries without "pushurl" does the
same thing without the duplication.
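
A small Python sketch of that comparison, using the same toy model of
the selection rules as observed above. The repository paths are invented
for the example and `targets` is a hypothetical helper.

```python
# Compare two ways of configuring "fetch from one host, push to two".
# The repository paths below are made up; `targets` is a hypothetical model.

GITHUB = "git@github.com:user/project.git"
GITLAB = "git@gitlab.com:user/project.git"

def targets(urls, pushurls=()):
    """Return (fetch URL, list of push URLs) for a remote."""
    return urls[0], list(pushurls or urls)

# Variant 1: two "url" entries, no "pushUrl".
multi_url = targets(urls=[GITHUB, GITLAB])

# Variant 2: one "url", and "pushUrl" has to repeat the fetch URL.
with_pushurl = targets(urls=[GITHUB], pushurls=[GITHUB, GITLAB])

assert multi_url == with_pushurl
print(multi_url)
```

Both variants end up with the same fetch and push targets; the second
one just has to spell out the github.com URL twice.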

