Pull without fetch

git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed

* Pull without fetch
@ 2019-04-06 13:12 Damien Robert
  2019-04-08  1:34 ` Junio C Hamano
  0 siblings, 1 reply; 9+ messages in thread
From: Damien Robert @ 2019-04-06 13:12 UTC (permalink / raw)
  To: git

Dear git developpers,

is there a way to do a git pull without it running git fetch?
Looking at the source in builtin/pull.c does not seem to indicate so.

Here is my use case: I do a `git fetch --all` to look at the incoming
changes, and only afterwards I would like to do a `git pull --no-fetch`. I
could of course call `git merge`/`git rebase` directly, but the advantage
of `git pull` is that it parses the value of branch.<name>.rebase for me
(in `parse_config_rebase`). I could easily write a script for it, but this
would just duplicate this part of git-pull, so we might as well use git
pull directly. Moreover, `git-pull --rebase` used to have extra functionality
where it looked at the reflog to find the merge base. This is now
incorporated directly into git rebase, but this is one more reason I would
like to use git pull directly.

I am missing another way? Would a patch to provide this feature be accepted?

-- 
Damien Robert

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Pull without fetch
  2019-04-06 13:12 Pull without fetch Damien Robert
@ 2019-04-08  1:34 ` Junio C Hamano
  2019-04-08  2:17   ` Duy Nguyen
  2019-04-08 14:53   ` Damien Robert
  0 siblings, 2 replies; 9+ messages in thread
From: Junio C Hamano @ 2019-04-08  1:34 UTC (permalink / raw)
  To: Damien Robert; +Cc: git

Damien Robert <damien.olivier.robert@gmail.com> writes:

> is there a way to do a git pull without it running git fetch?
> Looking at the source in builtin/pull.c does not seem to indicate so.

The reason behind that is because it does not make any sense for
"pull", which is meant as a quick short-cut to say "fetch && merge",
not to run fetch, especially back then when 'git pull' was designed,
the world was much simpler.  There was no "fetch && rebase", our
branches did not know what their @{upstream}s were.  In that simpler
world, what you are trying to do would have been:

	git fetch
	# did I get anything worth integrating?
	git merge FETCH_HEAD

That obviously would not work for those with "pull.rebase", and I do
not think it makes much sense to teach "git rebase" the same trick
to read FETCH_HEAD as "git merge" does in the above sequence.

Others may have a better idea, but I do not immediately see any
solution better than inventing a new option to "git pull".

Another and better option that may be harder to arrange is to make
sure that a no-op "git fetch" incurs very low cost.  If you did so,
"git fetch && git pull" would perform just like your "git fetch &&
git pull --no-fetch", and we won't need a new option at all.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Pull without fetch
  2019-04-08  1:34 ` Junio C Hamano
@ 2019-04-08  2:17   ` Duy Nguyen
  2019-04-08 12:51     ` Ævar Arnfjörð Bjarmason
  2019-04-08 14:53   ` Damien Robert
  1 sibling, 1 reply; 9+ messages in thread
From: Duy Nguyen @ 2019-04-08  2:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Damien Robert, Git Mailing List

On Mon, Apr 8, 2019 at 8:34 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Damien Robert <damien.olivier.robert@gmail.com> writes:
>
> > is there a way to do a git pull without it running git fetch?
> > Looking at the source in builtin/pull.c does not seem to indicate so.
>
> The reason behind that is because it does not make any sense for
> "pull", which is meant as a quick short-cut to say "fetch && merge",
> not to run fetch, especially back then when 'git pull' was designed,
> the world was much simpler.  There was no "fetch && rebase", our
> branches did not know what their @{upstream}s were.  In that simpler
> world, what you are trying to do would have been:
>
>         git fetch
>         # did I get anything worth integrating?
>         git merge FETCH_HEAD
>
> That obviously would not work for those with "pull.rebase", and I do
> not think it makes much sense to teach "git rebase" the same trick
> to read FETCH_HEAD as "git merge" does in the above sequence.
>
> Others may have a better idea, but I do not immediately see any
> solution better than inventing a new option to "git pull".
>
> Another and better option that may be harder to arrange is to make
> sure that a no-op "git fetch" incurs very low cost.  If you did so,

Not exactly related. But I often wish to see the list of branch
updates since the last fetch. There's no easy way (that I know) to do
this unless you copy the last fetch's output somewhere. If this "fetch
at low cost" could simply read FETCH_HEAD and summarizes it like a
normal fetch, that would be great. And it should also be very low cost
because we only replay the last part (making summary) of normal fetch.

> "git fetch && git pull" would perform just like your "git fetch &&
> git pull --no-fetch", and we won't need a new option at all.
-- 
Duy

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Pull without fetch
  2019-04-08  2:17   ` Duy Nguyen
@ 2019-04-08 12:51     ` Ævar Arnfjörð Bjarmason
  2019-04-08 13:18       ` Duy Nguyen
  0 siblings, 1 reply; 9+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-04-08 12:51 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Junio C Hamano, Damien Robert, Git Mailing List, Christian Couder,
	Stefan Beller


On Mon, Apr 08 2019, Duy Nguyen wrote:

> On Mon, Apr 8, 2019 at 8:34 AM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> Damien Robert <damien.olivier.robert@gmail.com> writes:
>>
>> > is there a way to do a git pull without it running git fetch?
>> > Looking at the source in builtin/pull.c does not seem to indicate so.
>>
>> The reason behind that is because it does not make any sense for
>> "pull", which is meant as a quick short-cut to say "fetch && merge",
>> not to run fetch, especially back then when 'git pull' was designed,
>> the world was much simpler.  There was no "fetch && rebase", our
>> branches did not know what their @{upstream}s were.  In that simpler
>> world, what you are trying to do would have been:
>>
>>         git fetch
>>         # did I get anything worth integrating?
>>         git merge FETCH_HEAD
>>
>> That obviously would not work for those with "pull.rebase", and I do
>> not think it makes much sense to teach "git rebase" the same trick
>> to read FETCH_HEAD as "git merge" does in the above sequence.
>>
>> Others may have a better idea, but I do not immediately see any
>> solution better than inventing a new option to "git pull".
>>
>> Another and better option that may be harder to arrange is to make
>> sure that a no-op "git fetch" incurs very low cost.  If you did so,
>
> Not exactly related. But I often wish to see the list of branch
> updates since the last fetch. There's no easy way (that I know) to do
> this unless you copy the last fetch's output somewhere. If this "fetch
> at low cost" could simply read FETCH_HEAD and summarizes it like a
> normal fetch, that would be great. And it should also be very low cost
> because we only replay the last part (making summary) of normal fetch.

The ability to have this is something reftables will provide (from my
memory of a comment by Stefan Beller), which Christian Couder is working
on implementing these days.

>> "git fetch && git pull" would perform just like your "git fetch &&
>> git pull --no-fetch", and we won't need a new option at all.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Pull without fetch
  2019-04-08 12:51     ` Ævar Arnfjörð Bjarmason
@ 2019-04-08 13:18       ` Duy Nguyen
  0 siblings, 0 replies; 9+ messages in thread
From: Duy Nguyen @ 2019-04-08 13:18 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Junio C Hamano, Damien Robert, Git Mailing List, Christian Couder,
	Stefan Beller

On Mon, Apr 8, 2019 at 7:51 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>
>
> On Mon, Apr 08 2019, Duy Nguyen wrote:
>
> > On Mon, Apr 8, 2019 at 8:34 AM Junio C Hamano <gitster@pobox.com> wrote:
> >>
> >> Damien Robert <damien.olivier.robert@gmail.com> writes:
> >>
> >> > is there a way to do a git pull without it running git fetch?
> >> > Looking at the source in builtin/pull.c does not seem to indicate so.
> >>
> >> The reason behind that is because it does not make any sense for
> >> "pull", which is meant as a quick short-cut to say "fetch && merge",
> >> not to run fetch, especially back then when 'git pull' was designed,
> >> the world was much simpler.  There was no "fetch && rebase", our
> >> branches did not know what their @{upstream}s were.  In that simpler
> >> world, what you are trying to do would have been:
> >>
> >>         git fetch
> >>         # did I get anything worth integrating?
> >>         git merge FETCH_HEAD
> >>
> >> That obviously would not work for those with "pull.rebase", and I do
> >> not think it makes much sense to teach "git rebase" the same trick
> >> to read FETCH_HEAD as "git merge" does in the above sequence.
> >>
> >> Others may have a better idea, but I do not immediately see any
> >> solution better than inventing a new option to "git pull".
> >>
> >> Another and better option that may be harder to arrange is to make
> >> sure that a no-op "git fetch" incurs very low cost.  If you did so,
> >
> > Not exactly related. But I often wish to see the list of branch
> > updates since the last fetch. There's no easy way (that I know) to do
> > this unless you copy the last fetch's output somewhere. If this "fetch
> > at low cost" could simply read FETCH_HEAD and summarizes it like a
> > normal fetch, that would be great. And it should also be very low cost
> > because we only replay the last part (making summary) of normal fetch.
>
> The ability to have this is something reftables will provide (from my
> memory of a comment by Stefan Beller), which Christian Couder is working
> on implementing these days.

I don't think we even need reftables to implement it. The list of new
SHA-1 is available in FETCH_HEAD (at least until the next fetch), and
from reflog we know the old SHA-1 (or a new branch/tag, I think we
know too).
-- 
Duy

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Pull without fetch
  2019-04-08  1:34 ` Junio C Hamano
  2019-04-08  2:17   ` Duy Nguyen
@ 2019-04-08 14:53   ` Damien Robert
  2019-04-08 16:11     ` Damien Robert
  1 sibling, 1 reply; 9+ messages in thread
From: Damien Robert @ 2019-04-08 14:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

From Junio C Hamano, Mon 08 Apr 2019 at 10:34:07 (+0900) :
> In that simpler world, what you are trying to do would have been:

> 	git fetch
> 	# did I get anything worth integrating?
> 	git merge FETCH_HEAD

Indeed.

> That obviously would not work for those with "pull.rebase", and I do
> not think it makes much sense to teach "git rebase" the same trick
> to read FETCH_HEAD as "git merge" does in the above sequence.

Yes, it could learn to read the first branch not marked as not-for-merge,
but I agree this would be more confusing since it would introduce another
special handling of FETCH_HEAD, different from `merge` (which handle *all*
branches not marked as not-for-merge) and the other reference parsing
mechanisms (which simply look at the first branch in FETCh_HEAD).

> Others may have a better idea, but I do not immediately see any
> solution better than inventing a new option to "git pull".

Indeed, I was wondering if I was missing something since this is something
I do often (granted in practice it's not too hard to type `git merge` or
`git rebase` after the fetch for a branch; but when handling a lot of
branches at once I prefer to automatize this somewhat, and when I find
myself writing a script that needs to read branch.<name>.rebase values I am
left wondering if this would not be better to be directly supported in `git
pull` directly).

> Another and better option that may be harder to arrange is to make
> sure that a no-op "git fetch" incurs very low cost.  If you did so,
> "git fetch && git pull" would perform just like your "git fetch &&
> git pull --no-fetch", and we won't need a new option at all.

I am not sure I understand what a no-op `git fetch` means exactly.
In the "git fetch; <review changes>; git pull" scenario,
after I do the real `git fetch` and want to merge/rebase the changes, how
would I prevent `git pull` to pull new commits that were pushed in between?

-- 
Damien Robert

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Pull without fetch
  2019-04-08 14:53   ` Damien Robert
@ 2019-04-08 16:11     ` Damien Robert
  2019-04-09  8:03       ` Junio C Hamano
  0 siblings, 1 reply; 9+ messages in thread
From: Damien Robert @ 2019-04-08 16:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

From Damien Robert, Mon 08 Apr 2019 at 16:53:40 (+0200) :
> > Others may have a better idea, but I do not immediately see any
> > solution better than inventing a new option to "git pull".

So here is a RFC patch that implements --no-fetch. (I am not sure about
the wording of the documentation, and what's the best way to test that
fetch was not called).

-- >8 --

From: Damien Robert <damien.olivier.robert+git@gmail.com>
Date: Mon, 8 Apr 2019 17:51:51 +0200
Subject: [PATCH 1/1] pull: add --no-fetch

A common workflow is to do a fetch, review the changes, and then
integrate the changes via an explicit merge or rebase.

In the good old days `git pull` was essentially `git fetch` followed by
`git merge FETCH_HEAD` so it was easy to separate the fetching part from
the integrating part.

But nowadays there are no easy way to do the integrating part of `git
pull`. Indeed, `git pull` learnt to read the config values of
`branch.<current>.rebase` and `pull.rebase`, whose possible values are
growing. At the time of this commit, they can be 'true', 'merges',
'preserve' and 'interactive'.

To remedy this, add a new --no-fetch option to `git pull`.

Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com>
---
 Documentation/git-pull.txt |  9 +++++++++
 builtin/pull.c             |  8 ++++++--
 t/t5520-pull.sh            | 14 ++++++++++++++
 3 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-pull.txt b/Documentation/git-pull.txt
index 118d9d86f7..cec06bf6e3 100644
--- a/Documentation/git-pull.txt
+++ b/Documentation/git-pull.txt
@@ -144,6 +144,15 @@ This option is only valid when "--rebase" is used.
 Options related to fetching
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
+--fetch::
+--no-fetch::
+	Perform a fetch first. This option can be used to override
+	--no-fetch.
++
+With --no-fetch don't fetch before updating the branch via a merge or a
+rebase. This can be used to review changes by doing a fetch first before
+pulling.
+
 include::fetch-options.txt[]
 
 include::pull-fetch-param.txt[]
diff --git a/builtin/pull.c b/builtin/pull.c
index 33db889955..0c14701abe 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -111,6 +111,7 @@ static char *opt_gpg_sign;
 static int opt_allow_unrelated_histories;
 
 /* Options passed to git-fetch */
+static int opt_fetch = 1;
 static char *opt_all;
 static char *opt_append;
 static char *opt_upload_pack;
@@ -195,6 +196,8 @@ static struct option pull_options[] = {
 
 	/* Options passed to git-fetch */
 	OPT_GROUP(N_("Options related to fetching")),
+	OPT_BOOL(0, "fetch", &opt_fetch,
+		N_("fetch before merging / rebasing (default)")),
 	OPT_PASSTHRU(0, "all", &opt_all, NULL,
 		N_("fetch from all remotes"),
 		PARSE_OPT_NOARG),
@@ -912,8 +915,9 @@ int cmd_pull(int argc, const char **argv, const char *prefix)
 			oidclr(&rebase_fork_point);
 	}
 
-	if (run_fetch(repo, refspecs))
-		return 1;
+	if (opt_fetch)
+		if (run_fetch(repo, refspecs))
+			return 1;
 
 	if (opt_dry_run)
 		return 0;
diff --git a/t/t5520-pull.sh b/t/t5520-pull.sh
index cf4cc32fd0..058c85e6a4 100755
--- a/t/t5520-pull.sh
+++ b/t/t5520-pull.sh
@@ -134,6 +134,20 @@ test_expect_success 'the default remote . should not break explicit pull' '
 	test_cmp reflog.expected reflog.fuzzy
 '
 
+test_expect_success 'With --no-fetch will not fetch but still merge pending changes' '
+	git checkout -b nofetch master^ &&
+	echo modified >file &&
+	git commit -a -m modified &&
+	git checkout copy &&
+	git reset --hard HEAD^ &&
+	test "$(cat file)" = file &&
+	stat --format %Y .git/FETCH_HEAD > fetch_head_before &&
+	git pull --no-fetch . nofetch &&
+	stat --format %Y .git/FETCH_HEAD > fetch_head_after &&
+	test "$(cat file)" = modified &&
+	test_cmp fetch_head_before fetch_head_after
+'
+
 test_expect_success 'fail if wildcard spec does not match any refs' '
 	git checkout -b test copy^ &&
 	test_when_finished "git checkout -f copy && git branch -D test" &&
-- 
Patched on top of v2.21.0-196-g041f5ea1cf (git version 2.21.0)


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: Pull without fetch
  2019-04-08 16:11     ` Damien Robert
@ 2019-04-09  8:03       ` Junio C Hamano
  2019-04-11 11:07         ` Damien Robert
  0 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2019-04-09  8:03 UTC (permalink / raw)
  To: Damien Robert; +Cc: git

Damien Robert <damien.olivier.robert@gmail.com> writes:

> In the good old days `git pull` was essentially `git fetch` followed by
> `git merge FETCH_HEAD` so it was easy to separate the fetching part from
> the integrating part.
>
> But nowadays there are no easy way to do the integrating part of `git
> pull`. Indeed, `git pull` learnt to read the config values of
> `branch.<current>.rebase` and `pull.rebase`, whose possible values are
> growing. At the time of this commit, they can be 'true', 'merges',
> 'preserve' and 'interactive'.
>
> To remedy this, add a new --no-fetch option to `git pull`.

It may be better to present the whole remedy, not just this new
piece that is an ingredient of it, i.e.

	Teach a new --no-fetch option to `git pull`, so that the old
	`git fetch && inspect && git merge FETCH_HEAD` sequence can
	be generalized for workflows with newer `pull` features as a
	`git fetch && inspect && git pull --no-fetch` sequence.

or something like that, perhaps?

What worries me about this topic (before even reading the patch
text) revolves around how we guarantee that the "inspect" step in
that sequence does not affect the environment negatively.  In order
for the three-step sequence (no other options to `fetch` or `pull`
steps) to be truly equivalent to `git pull` (no other options) when
inspection is satisfactory, the "inspect" step probably should not
change the currently checked out branch; an unsplit `git pull` would
base its decision to choose which branch(es) from what remote to
fetch from based on what the currently checked-out branch is [*1*],
and it is not too implausible for `git fetch` to learn the same
trick, even though currently a `git fetch` without options seem to
always fetch from the default remote, without taking the current
branch or any other enviornmental conditions into account.  In the
worst case, after "inspect" step, `git pull --no-fetch` may not even
find a useful FETCH_HEAD; this includes a case like `git pull
--no-fetch some_other_remote` is run that is totally unrelated to
the previous `git fetch` that left the FETCH_HEAD.  The code needs
to notice strangeness like that and deal with it (most of the time,
I think erroring out saying "you did something stupid/unexpected
since you ran `git fetch`, so --no-fetch refuses to work" is
sufficient).

	Side note *1* There may be things other than what is
	currently checked out that can serve as an input to this
	decision, and moreover, we need to be prepared that other
	Git developers may want to invent other things later after
	this feature gets in that affects how `fetch` works, and
	leave the door open for them to do so without breaking the
	feature we are introducing here.

Anyway, let's see what you wrote.

> diff --git a/Documentation/git-pull.txt b/Documentation/git-pull.txt
> index 118d9d86f7..cec06bf6e3 100644
> --- a/Documentation/git-pull.txt
> +++ b/Documentation/git-pull.txt
> @@ -144,6 +144,15 @@ This option is only valid when "--rebase" is used.
>  Options related to fetching
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~
>  
> +--fetch::
> +--no-fetch::
> +	Perform a fetch first. This option can be used to override
> +	--no-fetch.
> ++
> +With --no-fetch don't fetch before updating the branch via a merge or a
> +rebase. This can be used to review changes by doing a fetch first before
> +pulling.

Those who were in this discussion thread can guess that this is
referring to

	$ git fetch
          ... inspect ...
	$ git pull --no-fetch         

sequence, that emulates a single "git pull", but those who are
reading this without such a background probably gets lost.  Aren't
there restrictions on the first fetch done and the subsequent "pull"
that does not fetch?  For example, if you have a remote, origin, and
also pull from repositories without a nickname, doing this is not
something you would be willing to support, I presume:

         $ git fetch origin
         $ git pull --no-fetch https://github.com/gitster/git master

The reasons why I suspect you are not supporting the above are (1)
`git fetch origin` won't grab anything from my repository, and nothing
in FETCH_HEAD is useful to help the `pull` step of the sequence, and
(2) `git pull https://github.com/gitster/git master` without a set
nickname would not keep "the last time seen" value in refs/remotes/
hierarchy, so there is nothing to help --no-fetch "pretend" as if it
fetched previously [*2*].

Such a restriction must be spelled out to set the end-user
expectation for this new feature.

	Side note *2* This is about the design decision, but do we
	want this --no-fetch thing useful even for remotes without
	remote-tracking branches?  If we are willing to declare that
	we only support remotes with remote-tracking branches, then
	an alternative and probably a much better design that does
	not cause any of the "a user may do random and stupid things
	while inspecting, affecting the environment" worries I
	mentioned earlier becomes possible.  Instead of handling
	this feature at the "pull" level, teach "git fetch" a new
	option "--no-fetch" that pretends to have fetched from the
	given remote and given refspec, by looking at the tips of
	remote-tracking branches instead of talking to the remote,
	and populate FETCH_HEAD as if we got exactly the same thing
	as we saw when we last talked to the remote.  Then the
	change to "git pull --no-fetch" would be just to pass the
	option through to the underlying "git fetch".  That way, no
	matter what stupid things the user does in the intermediate
	"inspect" step, the condition that would affect `git pull`
	at the last step to decide which branches from what remote
	are used will be the same with or without `--no-fetch`, and
	we do not have to worry about restrictions on the earlier
	`git fetch` step relative to the final `git pull --no-fetch`
	step.  That would make the feature much easier to explain.

Anyway, let's keep reading.

>  /* Options passed to git-fetch */
> +static int opt_fetch = 1;
>  static char *opt_all;
>  static char *opt_append;
>  static char *opt_upload_pack;
> @@ -195,6 +196,8 @@ static struct option pull_options[] = {
>  
>  	/* Options passed to git-fetch */
>  	OPT_GROUP(N_("Options related to fetching")),
> +	OPT_BOOL(0, "fetch", &opt_fetch,
> +		N_("fetch before merging / rebasing (default)")),

Good.  The above two implements a --no-foo that is off by default
correctly.

> @@ -912,8 +915,9 @@ int cmd_pull(int argc, const char **argv, const char *prefix)
>  			oidclr(&rebase_fork_point);
>  	}
>  
> -	if (run_fetch(repo, refspecs))
> -		return 1;
> +	if (opt_fetch)
> +		if (run_fetch(repo, refspecs))
> +			return 1;

I know this is PoC, but the real implementation must have the else
clause to deal with "what if we do not fetch" case.

	if (opt_fetch) {
		if (run_fetch(repo, refspecs))
			return 1;
	} else {
		... make sure what is in FETCH_HEAD is compatible
		... with run_fetch(repo, refspecs) we are skipping
		... would have left, if we didn't skip it.
		... and otherwise barf loudly.
	}


> +test_expect_success 'With --no-fetch will not fetch but still merge pending changes' '
> +	git checkout -b nofetch master^ &&
> +	echo modified >file &&
> +	git commit -a -m modified &&
> +	git checkout copy &&
> +	git reset --hard HEAD^ &&
> +	test "$(cat file)" = file &&
> +	stat --format %Y .git/FETCH_HEAD > fetch_head_before &&

Do we use "stat(1)" elsewhere?  I think we avoid things that are not
portable.

See Documentation/CodingGuidelines for shell scripting style guides.

> +	git pull --no-fetch . nofetch &&
> +	stat --format %Y .git/FETCH_HEAD > fetch_head_after &&

I do not think you want to do this.

In "git fetch && inspect && git pull --no-fetch" sequence, the
file timestamp of FETCH_HEAD is not something you care about.  What
you care about is that what you inspected gets used in the 'pull'
step, and nothing newer or older.

A more to-the-point test arrangement would be something like

	- prepare the source repository
	- git fetch from the source repository
	- go back to the source repository and add a commit
	- run git pull --no-fetch

        - inspect the resulting merge commit.  Is the second parent
          the original commit you grabbed when you fetched?  Or is
          it the new one added to the source repository since you
          fetched?

One obvious advantage of doing so is that it is the exact motivating
use case you gave when you came to the list with this feature.
Also, it allows the "teach `git fetch --no-fetch` to pretend as if
it fetched the same thing, based on the remote-tracking branches"
implementation, by not caring the file timestamp of FETCH_HEAD.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Pull without fetch
  2019-04-09  8:03       ` Junio C Hamano
@ 2019-04-11 11:07         ` Damien Robert
  0 siblings, 0 replies; 9+ messages in thread
From: Damien Robert @ 2019-04-11 11:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

First, let me thank you for your very thorough review!

From Junio C Hamano, Tue 09 Apr 2019 at 17:03:12 (+0900) :
> It may be better to present the whole remedy, not just this new
> piece that is an ingredient of it, i.e.
> 	Teach a new --no-fetch option to `git pull`, so that the old
> 	`git fetch && inspect && git merge FETCH_HEAD` sequence can
> 	be generalized for workflows with newer `pull` features as a
> 	`git fetch && inspect && git pull --no-fetch` sequence.
> or something like that, perhaps?

Indeed.

> What worries me about this topic (before even reading the patch
> text) revolves around how we guarantee that the "inspect" step in
> that sequence does not affect the environment negatively.  In order
> for the three-step sequence (no other options to `fetch` or `pull`
> steps) to be truly equivalent to `git pull` (no other options) when
> inspection is satisfactory, the "inspect" step probably should not
> change the currently checked out branch; an unsplit `git pull` would
> base its decision to choose which branch(es) from what remote to
> fetch from based on what the currently checked-out branch is [*1*],
> and it is not too implausible for `git fetch` to learn the same
> trick, even though currently a `git fetch` without options seem to
> always fetch from the default remote, without taking the current
> branch or any other enviornmental conditions into account.

Well it does take into account the current branch to know what is the
"default remote", and when doing a `git fetch remote refspec`, if 'remote'
is the default remote then it also adds the upstream branch to the refspec
(cf add_merge_config called by get_ref_map in fetch.c).

> In the worst case, after "inspect" step, `git pull --no-fetch` may not
> even find a useful FETCH_HEAD; this includes a case like `git pull
> --no-fetch some_other_remote` is run that is totally unrelated to the
> previous `git fetch` that left the FETCH_HEAD.  The code needs to notice
> strangeness like that and deal with it (most of the time, I think
> erroring out saying "you did something stupid/unexpected since you ran
> `git fetch`, so --no-fetch refuses to work" is sufficient).

Ok, so in my mind `git pull --no-fetch` was meant as the 'modern' version of
`git merge FETCH_HEAD`. If a remote or refspec is specified they are
currently discarded (I agree it would be better to error out). This means
that doing a `git fetch` while on branch master, followed by a checkout to
branch `feature` and a `git pull --no-fetch` will try to merge with
master's upstream rather than feature's upstream; exactly like `git merge
FETCH_HEAD` would have done.

In other words the current philosophy of `git pull --no-fetch` is as
follow: we integrate the previously fetched branches with the current
branch. Integrate means merge or rebase according to the value of
branch.<current>.rebase. We do not care what the current upstream branch
is, how the user did his previous fetch, and if it was not related to the
current branch it is his responsibility.

Now in your sidenote 2 you mention another approach:

> Instead of handling this feature at the "pull" level, teach "git fetch" a
> new option "--no-fetch" that pretends to have fetched from the given
> remote and given refspec, by looking at the tips of remote-tracking
> branches instead of talking to the remote, and populate FETCH_HEAD as if
> we got exactly the same thing as we saw when we last talked to the
> remote.
> [...]
> That would make the feature much easier to explain.

This approach is quite different from `git merge FETCH_HEAD`. Indeed it is
'construct our own FETCH_HEAD and then do the git merge FETCH_HEAD'.
In this case, the remote and refspec provided to `git pull --no-fetch`
actually matters. The only difference with a normal pull/fetch is that we
only use the local remotes branches, we never update them.

So this is probably what you meant in your previous email about doing a
'noop fetch', I had not understood that, sorry!

Now the question is which approach to implement? I think that your
approach is indeed better; with your method a `git pull --no-fetch` behave
much more similarly with a normal `git pull` than with my method, and there
is less room for the user to shoot himself in the foot.

I would even argue that even in the good old days, `git merge FETCH_HEAD`
was just a convenient approximation of a putative `git pull --no-fetch`, so
when implementing `--no-fetch` we should strive for the best end-user
result rather than do the same approximation as I did.

> this without such a background probably gets lost.  Aren't there
> restrictions on the first fetch done and the subsequent "pull" that does
> not fetch?  For example, if you have a remote, origin, and also pull from
> repositories without a nickname, doing this is not something you would be
> willing to support, I presume:
>          $ git fetch origin $ git pull --no-fetch
>          https://github.com/gitster/git master

So currently this would simply discard the arguments and merge with what's
is in FETCH_HEAD.

> The reasons why I suspect you are not supporting the above are (1) `git
> fetch origin` won't grab anything from my repository, and nothing in
> FETCH_HEAD is useful to help the `pull` step of the sequence, and (2)
> `git pull https://github.com/gitster/git master` without a set nickname
> would not keep "the last time seen" value in refs/remotes/ hierarchy, so
> there is nothing to help --no-fetch "pretend" as if it fetched previously

In a 'noop fetch' implementation, this would need to barf that `--no-fetch`
only works for name based remotes, not for url based ones.

> > @@ -912,8 +915,9 @@ int cmd_pull(int argc, const char **argv, const
> > char *prefix) oidclr(&rebase_fork_point); }
> >  
> > -	if (run_fetch(repo, refspecs)) -		return 1; +	if
> > (opt_fetch) +		if (run_fetch(repo, refspecs)) +
> > return 1;
> 
> I know this is PoC, but the real implementation must have the else clause
> to deal with "what if we do not fetch" case.
> 	if (opt_fetch) { if (run_fetch(repo, refspecs)) return 1; } else {
> 	... make sure what is in FETCH_HEAD is compatible ... with
> 	run_fetch(repo, refspecs) we are skipping ... would have left, if
> 	we didn't skip it. ... and otherwise barf loudly. }

In the current implementation, the else part would be
- error out if the user specified remotes or refspecs
- fails with a more informative message than
fatal: could not open '.git/FETCH_HEAD' for reading: No such file or directory
  if FETCH_HEAD does not exists. Something like:
fatal: you need to fetch first before running `git pull --no-fetch`.

> > +	git pull --no-fetch . nofetch && +	stat --format %Y
> > .git/FETCH_HEAD > fetch_head_after &&
> I do not think you want to do this.
> A more to-the-point test arrangement would be something like
> 	- prepare the source repository - git fetch from the source
> 	repository - go back to the source repository and add a commit -
> 	run git pull --no-fetch
>       - inspect the resulting merge commit.  Is the second parent the
>       original commit you grabbed when you fetched?  Or is it the new one
>       added to the source repository since you fetched?
> One obvious advantage of doing so is that it is the exact motivating use
> case you gave when you came to the list with this feature. Also, it
> allows the "teach `git fetch --no-fetch` to pretend as if it fetched the
> same thing, based on the remote-tracking branches" implementation, by not
> caring the file timestamp of FETCH_HEAD.

Yes, I realised after sending the RFC patch that in the test I should
really test the feature I want rather than an hacky approximation of it,
sorry about that.

I admit I was a bit unconfortable with the structure of t5520-pull.sh where
tests depends on the state of the git repositories of previous tests rather
than creating the git repos from scratch (presumably for the sake of
speed), so I was afraid to break things. But I see that I can probably
reuse the repos from the test 'setupt for detecting upstreamed changes'.

So I'll try (when I have time...) to do a RFC implementation of a 'noop
fetch' approach. I first need to understand the source of fetch.c better,
in particular how to do something like store_updated_refs when we are doing
a noop fetch (so without using transports).

-- 
Damien Robert
http://www.normalesup.org/~robert/pro

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-04-11 11:07 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-06 13:12 Pull without fetch Damien Robert
2019-04-08  1:34 ` Junio C Hamano
2019-04-08  2:17   ` Duy Nguyen
2019-04-08 12:51     ` Ævar Arnfjörð Bjarmason
2019-04-08 13:18       ` Duy Nguyen
2019-04-08 14:53   ` Damien Robert
2019-04-08 16:11     ` Damien Robert
2019-04-09  8:03       ` Junio C Hamano
2019-04-11 11:07         ` Damien Robert

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).