Re: [RFC/PATCH] pull: set-upstream implementation

git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed

From: Erwan Mathoniere <erwan.mathoniere@grenoble-inp.org>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, tom.russello@grenoble-inp.org,
	samuel.groot@grenoble-inp.org,
	Jordan De Gea <jordan.de-gea@grenoble-inp.org>,
	Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Subject: Re: [RFC/PATCH] pull: set-upstream implementation
Date: Sun, 29 May 2016 22:00:50 +0200	[thread overview]
Message-ID: <5639de7f-5a76-7038-ca13-867aaca3de64@grenoble-inp.org> (raw)
In-Reply-To: <xmqqvb22ypot.fsf@gitster.mtv.corp.google.com>



On 25/05/2016 20:09, Junio C Hamano wrote:
 > Erwan Mathoniere <erwan.mathoniere@grenoble-inp.org> writes:
 >> Difficulties:
 >>     - I can't figure out what remote branch sould be tracked
 >>     in that case: `git pull -u origin :master`
 >
 > What does the command do without "-u"?

After some research, I think it creates a new branch if the ref doesn't 
exist. Otherwise it does nothing. So no branch should be tracked in that 
case.

 >> +-u::
 >> +--set-upstream::
 >> +    For each branch that is successfully pulled, add
 >> +    upstream (tracking) reference, used by argument-less
 >> +    linkgit:git-push[1] and other commands. For more information,
 >> +    see 'branch.<name>.merge' in linkgit:git-config[1].
 >
 > A crucial piece of information that is lacking in the above is where
 > <name> comes from.  The same issue exists in your proposed commit
 > log message.

You're right, documentation is unclear. I'll sort it out.

 > There a few design decisions you must have made that needs to be
 > either described fully or at least hinted here, too, such as (not
 > exhaustive and in random order):
 >
 >  * What should happen when the current branch already has these two
 >    configuration variables defined?  Should the user get a warning
 >    when this changes the setting from what was already there?

Neither `git pull --set-upstream` nor `git branch --set-upstream-to` 
warns the user in such case. So `git pull --set-upstream` should simply 
override the previous configuration the same way these commands do.


 >  * What should happen when the remote is specified as a Git URL, not
 >    as a remote nickname?  You want a nickname for the value to place
 >    in "branch.<name>.remote".
 >
 >     - Should Git find an existing remote that points at the URL and
 >       use it?  If so, and if the value we are about to place in
 >       "branch.<name>.merge" is outside its configured fetch refspec,
 >       should Git tweak the fetch refspec of the existing remote?
 >
 >     - Should Git create a new remote nickname for the URL?  If so,
 >       what should the fetch refspec be for the newly created remote?
 >       Should it fetch only the branch we fetched just now?

Still in comparison with `git push --set-upstream`, when using a Git URL 
as <repo>, "push" just sets "branch.<name>.remote" to the URL given in 
argument (even if there is a configured remote with this URL).
`git pull --set-upstream` should work the same way.

 >  * What should happen when more than one ref is given?
 >    branch_get_upstream() can return only one ref; should Git choose
 >    one of them at random?  Should Git warn and turn this into a
 >    no-op?

It depends whether refspecs given in arguments refers to the same branch 
or not. I discuss this point later in this email.

 >  * What should happen when the ref given to pull is not a branch
 >    (e.g. a tag)?  A tag, meant to be a stable anchoring point, is
 >    not something suitable to be set as branch.<name>.merge, even
 >    though it might be technically possible to configure it as such.
 >    Should Git warn and turn this into a no-op?

`pull --set-upstream` should only set "branch.<name>.merge" to remote 
branches. So yes a check must be done with eventually a no-op.

 >> +    refspecs = parse_fetch_refspec(nr_refspec, refspecs_name);
 >> +
 >> +    for (i = 0; i < nr_refspec; i++) {
 >> +        if (refspecs[i].src != NULL && strlen(refspecs[i].src) > 0) {
 >
 > More importantly, what would you see here in .src when your command
 > line were:
 >
 >     $ git pull -u <REMOTE> "refs/heads/*:refs/remotes/foo/*"
 >
 > Hint: check .pattern to make sure it is false.  Skip them at the top
 > of the loop body, perhaps.

This case wasn't handled at all, thanks for the hint.

 >> +            remote_branch = refspecs[i].src;
 >> +        } else {
 >> +            warning(N_("not yet implemented"));
 >
 > What do you plan to implement here?

Now that I understand `git pull -u origin :master`, there's nothing to 
implement here, except maybe a warning?

 >> +            local_branch = resolve_ref_unsafe("HEAD", 0, sha1, NULL);
 >> +            skip_prefix(local_branch, "refs/heads/", &local_branch);
 >
 > What happens here if your user did this?
 >
 >     $ git checkout HEAD^0 && git pull -u <REMOTE> ...
 >
 > You do not have "local_branch" here.  Check the condition, warn and
 > turn it into no-op (I do not think "pull -u" is important enough to
 > fail the entire command).

"pull -u" should indeed only work for branches, I'll put a check on this.
But I'm not really sure what procedure I should use to fully resolve 
refs. In git code, sometimes "dwim_ref" is used, sometimes it's 
"resolve_ref_unsafe" and haven't found any documentation on this.

 >> +        }
 >> +
 >> +        strbuf_init(&remote_name, strlen(repo) + 
strlen(remote_branch) + 1);
 >> +        strbuf_addf(&remote_name, "%s/%s", repo, remote_branch);
 >
 > What happens here if your user did this?
 >
 >     $ git pull -u git://git.kernel.org/r/e/p/o master
 >
 > Specifically, what is "repo" at this point?
 >
 >> +        create_branch(local_branch, local_branch, remote_name.buf, 
0, 0, 0, opt_verbosity < 0, BRANCH_TRACK_OVERRIDE);
 >
 > I thought you were only setting up configurations for existing
 > branch.  Why do you call create_branch() on the existing
 > local_branch, which is supposed to be the current one?

We took the example of `git branch --set-upstream-to=<remote>/<branch>` 
for this one. Perhaps using directly install_branch_config() is more 
relevant here.
It allows to set a non-configured remote (I mean in the .git/config) as 
the remote for the branch (with the full URL), fixing the example you 
gave me and working like `git push -u git://git.kernel.org/r/e/p/o master`.


 > What does it mean for this loop to execute more than once, flipping
 > the configuration for the current branch number of times?  The last
 > one wins?  Does it even make sense from the end-user's point of
 > view?

The loop is here to handle when the user uses multiple 
'remote_branch:local_branch' notation.
(e.g. `git pull -u <remote> master:master other:other`)

For now, the last one wins. But I agree, it doesn't really make sense.
The simplest solution from the technical point of view would be to no-op 
when multiple refs are passed in arguments.
However `git pull -u <remote> master:master other:other` seems legit.

So another solution would be to warn and no-op if the same branch config 
is modified twice.

 >> +    rm -rf parent &&
 >> +    git init parent &&
 >> +    cd parent &&
 >> +    (
 >> +        echo content >file &&
 >> +        git add file &&
 >> +        git commit -m one &&
 >> +        git checkout -b other &&
 >> +        echo content >file2 &&
 >> +        git add file2 &&
 >> +        git commit -m two &&
 >> +        git checkout -b master2 &&
 >> +        git checkout master
 >> +    ) &&
 >> +    cd ..
 >
 > The tests typically do
 >
 >     ... some stuff ... &&
 >         (
 >         cd elsewhere &&
 >                 ... more stuff ...
 >     ) &&
 >         ... yet more stuff ...
 >
 > to make sure that even after any of "... more stuff ..." fails, the
 > caller of this sequence will find it in an expected and stable
 > place.  As written in your patch, if any of the "... more stuff ..."
 > fails, the caller will be in "parent" directory, not in the original
 > directory, because your "cd .." will not be exectued.
 >
 > I wonder if you are completely missing the point of using a subshell
 > here?

You quoted the same issue later and indeed I didn't use it correctly.

 >> +check_config() {
 >> +    (echo $2; echo $3) >expect.$1
 >
 >     test_write_lines "$2" "$3" >"expect.$1"
 >
 > Unless you want your variable reference is split at $IFS, quote
 > your variables.
 >

 >> +    (git config branch.$1.remote
 >> +     git config branch.$1.merge) >actual.$1
 >> +    test_cmp expect.$1 actual.$1
 >
 > It is not _wrong_ per-se, but I do not think ".$1" suffix is adding
 > any value.  I'd drop it if I were doing this patch.

I took these functions from "t/t5523-push-upstream.sh" and I didn't 
closely look at it. I'll clean it up.

 >> +'
 >> +
 >> +test_expect_success TTY 'quiet pull' '
 >> +    ensure_fresh_upstream &&
 >> +
 >> +    test_terminal git pull -u --quiet upstream master 2>&1 | tee 
output &&
 >> +    test_cmp /dev/null output
 >> +'
 >
 > There is no test that makes sure that the new feature does not kick
 > in when it should not.  E.g. pulling from somewhere that is not a
 > configured remote.  For example,
 >

Yes you're right. I'll reshape tests and create new ones to cover more 
use cases.

Thanks for the time you spent for this huge and very useful feedback.

next prev parent reply	other threads:[~2016-05-29 20:01 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-25 15:25 [RFC/PATCH] pull: set-upstream implementation Erwan Mathoniere
2016-05-25 18:09 ` Junio C Hamano
2016-05-29 20:00   ` Erwan Mathoniere [this message]
2016-06-06  9:34 ` [RFC/PATCH v2] pull: add --set-upstream Erwan Mathoniere
2016-06-06 15:54   ` Matthieu Moy
2016-06-06 19:06     ` Junio C Hamano
2016-06-07  7:06       ` Matthieu Moy
2016-06-07 12:54         ` Erwan Mathoniere
2016-06-07 13:15       ` Erwan Mathoniere
2016-06-07 12:43     ` Erwan Mathoniere
2016-06-06 16:29   ` Philip Oakley
2016-06-07 13:42     ` Erwan Mathoniere

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5639de7f-5a76-7038-ca13-867aaca3de64@grenoble-inp.org \
    --to=erwan.mathoniere@grenoble-inp.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jordan.de-gea@grenoble-inp.org \
    --cc=matthieu.moy@grenoble-inp.fr \
    --cc=samuel.groot@grenoble-inp.org \
    --cc=tom.russello@grenoble-inp.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).