git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* bug: --shallow-since misbehaves on old branch heads
@ 2018-05-22 19:48 Andreas Krey
  2018-05-23 16:02 ` Duy Nguyen
  2018-05-26 11:35 ` [PATCH] upload-pack: reject shallow requests that would return nothing Nguyễn Thái Ngọc Duy
  0 siblings, 2 replies; 8+ messages in thread
From: Andreas Krey @ 2018-05-22 19:48 UTC (permalink / raw)
  To: git

Hi everybody,

I think I have discovered a problem with clone/fetch --shallow-since:
When a ref that is fetches has a head that is already older than
the 'since' time, then the entire branch is fetched, instead of
just the current commit.

Repro:

  rm -rf tmp out deep
  git init tmp
  for i in `seq 10 31`; do
      d="2017-05-${i}T12:00"
      GIT_AUTOR_DATE="$d" GIT_COMMITTER_DATE="$d" git -C tmp commit -m nix$i --allow-empty
  done
  git -C tmp checkout -b test HEAD^^^
  for i in `seq 10 14`; do
      d="2017-06-${i}T12:00"
      GIT_AUTHOR_DATE="$d" GIT_COMMITTER_DATE="$d" git -C tmp commit -m nax$i --allow-empty
  done
  for i in `seq 1 4`; do
      git -C tmp commit -m new$i --allow-empty
  done

  echo "** This is fine:"

  git clone --shallow-since '1 month ago' file://`pwd`/tmp out --branch test
  git -C out log --oneline

  echo "** This should show one commit but shows all:"

  git clone --shallow-since '1 month ago' file://`pwd`/tmp deep --branch master
  git -C deep log --oneline

Do I expect the wrong thing?

- Andreas


-- 
"Totally trivial. Famous last words."
From: Linus Torvalds <torvalds@*.org>
Date: Fri, 22 Jan 2010 07:29:21 -0800

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: bug: --shallow-since misbehaves on old branch heads
  2018-05-22 19:48 bug: --shallow-since misbehaves on old branch heads Andreas Krey
@ 2018-05-23 16:02 ` Duy Nguyen
  2018-05-26 11:35 ` [PATCH] upload-pack: reject shallow requests that would return nothing Nguyễn Thái Ngọc Duy
  1 sibling, 0 replies; 8+ messages in thread
From: Duy Nguyen @ 2018-05-23 16:02 UTC (permalink / raw)
  To: Andreas Krey; +Cc: git

I probably can't look into this until the weekend. Just wanted to let
you know that I've seen this mail and, being the one who introduced
--shallow-since, I will look into it soon (unless someone beats me to
it of course).
-- 
Duy

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH] upload-pack: reject shallow requests that would return nothing
  2018-05-22 19:48 bug: --shallow-since misbehaves on old branch heads Andreas Krey
  2018-05-23 16:02 ` Duy Nguyen
@ 2018-05-26 11:35 ` Nguyễn Thái Ngọc Duy
  2018-05-28  5:55   ` Junio C Hamano
  2018-06-04 10:46   ` Junio C Hamano
  1 sibling, 2 replies; 8+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2018-05-26 11:35 UTC (permalink / raw)
  To: git; +Cc: a.krey, Nguyễn Thái Ngọc Duy

Shallow clones with --shallow-since or --shalow-exclude work by
running rev-list to get all reachable commits, then draw a boundary
between reachable and unreachable and send "shallow" requests based on
that.

The code does miss one corner case: if rev-list returns nothing, we'll
have no border and we'll send no shallow requests back to the client
(i.e. no history cuts). This essentially means a full clone (or a full
branch if the client requests just one branch). One example is the
oldest commit is older than what is specified by --shallow-since.

To avoid this, if rev-list returns nothing, we abort the clone/fetch.
The user could adjust their request (e.g. --shallow-since further back
in the past) and retry.

Another possible option for this case is to fall back to a default
depth (like depth 1). But I don't like too much magic that way because
we may return something unexpected to the user. If they request
"history since 2008" and we return a single depth at 2000, that might
break stuff for them. It is better to tell them that something is
wrong and let them take the best course of action.

Note that we need to die() in get_shallow_commits_by_rev_list()
instead of just checking for empty result from its caller
deepen_by_rev_list() and handling the error there. The reason is,
empty result could be a valid case: if you have commits in year 2013
and you request --shallow-since=year.2000 then you should get a full
clone (i.e. empty result).

Reported-by: Andreas Krey <a.krey@gmx.de>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 shallow.c             |  3 +++
 t/t5500-fetch-pack.sh | 11 +++++++++++
 2 files changed, 14 insertions(+)

diff --git a/shallow.c b/shallow.c
index df4d44ea7a..44fdca1ace 100644
--- a/shallow.c
+++ b/shallow.c
@@ -175,6 +175,9 @@ struct commit_list *get_shallow_commits_by_rev_list(int ac, const char **av,
 		die("revision walk setup failed");
 	traverse_commit_list(&revs, show_commit, NULL, &not_shallow_list);
 
+	if (!not_shallow_list)
+		die("no commits selected for shallow requests");
+
 	/* Mark all reachable commits as NOT_SHALLOW */
 	for (p = not_shallow_list; p; p = p->next)
 		p->item->object.flags |= not_shallow_flag;
diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index 0680dec808..c8f2d38a58 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -711,6 +711,17 @@ test_expect_success 'fetch shallow since ...' '
 	test_cmp expected actual
 '
 
+test_expect_success 'clone shallow since selects no commits' '
+	test_create_repo shallow-since-the-future &&
+	(
+	cd shallow-since-the-future &&
+	GIT_COMMITTER_DATE="100000000 +0700" git commit --allow-empty -m one &&
+	GIT_COMMITTER_DATE="200000000 +0700" git commit --allow-empty -m two &&
+	GIT_COMMITTER_DATE="300000000 +0700" git commit --allow-empty -m three &&
+	test_must_fail git clone --shallow-since "900000000 +0700" "file://$(pwd)/." ../shallow111
+	)
+'
+
 test_expect_success 'shallow clone exclude tag two' '
 	test_create_repo shallow-exclude &&
 	(
-- 
2.17.0.705.g3525833791


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] upload-pack: reject shallow requests that would return nothing
  2018-05-26 11:35 ` [PATCH] upload-pack: reject shallow requests that would return nothing Nguyễn Thái Ngọc Duy
@ 2018-05-28  5:55   ` Junio C Hamano
  2018-05-28 18:48     ` Duy Nguyen
  2018-06-04 10:46   ` Junio C Hamano
  1 sibling, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2018-05-28  5:55 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, a.krey

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> To avoid this, if rev-list returns nothing, we abort the clone/fetch.
> The user could adjust their request (e.g. --shallow-since further back
> in the past) and retry.

Yeah, that makes sense.

> Another possible option for this case is to fall back to a default
> depth (like depth 1). But I don't like too much magic that way because
> we may return something unexpected to the user.

I agree that it would be a horrible fallback.  I actually am
wondering if we should just silently return no objects without even
telling the user there is something unexpected happening.  After
all, the user may well be expecting with --shallow-since that is too
recent that the fetch may not result in pulling anything new, and
giving a "die" message, which now needs to be distinguished from
other forms of die's like network connectivity or auth failures, is
not all that helpful.

> Note that we need to die() in get_shallow_commits_by_rev_list()
> instead of just checking for empty result from its caller
> deepen_by_rev_list() and handling the error there. The reason is,
> empty result could be a valid case: if you have commits in year 2013
> and you request --shallow-since=year.2000 then you should get a full
> clone (i.e. empty result).

Yup, that latter example makes me more convinced that it also is a
valid outcome if we end up requesting "no" object when shallow-since
names too recent date, e.g. against a project that is dormant since
2013 with --shallow-since=2018/01/01 or something like that, instead
of dying.

> Reported-by: Andreas Krey <a.krey@gmx.de>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>  shallow.c             |  3 +++
>  t/t5500-fetch-pack.sh | 11 +++++++++++
>  2 files changed, 14 insertions(+)
>
> diff --git a/shallow.c b/shallow.c
> index df4d44ea7a..44fdca1ace 100644
> --- a/shallow.c
> +++ b/shallow.c
> @@ -175,6 +175,9 @@ struct commit_list *get_shallow_commits_by_rev_list(int ac, const char **av,
>  		die("revision walk setup failed");
>  	traverse_commit_list(&revs, show_commit, NULL, &not_shallow_list);
>  
> +	if (!not_shallow_list)
> +		die("no commits selected for shallow requests");
> +
>  	/* Mark all reachable commits as NOT_SHALLOW */
>  	for (p = not_shallow_list; p; p = p->next)
>  		p->item->object.flags |= not_shallow_flag;
> diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
> index 0680dec808..c8f2d38a58 100755
> --- a/t/t5500-fetch-pack.sh
> +++ b/t/t5500-fetch-pack.sh
> @@ -711,6 +711,17 @@ test_expect_success 'fetch shallow since ...' '
>  	test_cmp expected actual
>  '
>  
> +test_expect_success 'clone shallow since selects no commits' '
> +	test_create_repo shallow-since-the-future &&
> +	(
> +	cd shallow-since-the-future &&
> +	GIT_COMMITTER_DATE="100000000 +0700" git commit --allow-empty -m one &&
> +	GIT_COMMITTER_DATE="200000000 +0700" git commit --allow-empty -m two &&
> +	GIT_COMMITTER_DATE="300000000 +0700" git commit --allow-empty -m three &&
> +	test_must_fail git clone --shallow-since "900000000 +0700" "file://$(pwd)/." ../shallow111
> +	)
> +'
> +
>  test_expect_success 'shallow clone exclude tag two' '
>  	test_create_repo shallow-exclude &&
>  	(

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] upload-pack: reject shallow requests that would return nothing
  2018-05-28  5:55   ` Junio C Hamano
@ 2018-05-28 18:48     ` Duy Nguyen
  2018-06-02  6:06       ` Duy Nguyen
  0 siblings, 1 reply; 8+ messages in thread
From: Duy Nguyen @ 2018-05-28 18:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List, Andreas Krey

On Mon, May 28, 2018 at 7:55 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:
>
>> To avoid this, if rev-list returns nothing, we abort the clone/fetch.
>> The user could adjust their request (e.g. --shallow-since further back
>> in the past) and retry.
>
> Yeah, that makes sense.
>
>> Another possible option for this case is to fall back to a default
>> depth (like depth 1). But I don't like too much magic that way because
>> we may return something unexpected to the user.
>
> I agree that it would be a horrible fallback.  I actually am
> wondering if we should just silently return no objects without even
> telling the user there is something unexpected happening.  After
> all, the user may well be expecting with --shallow-since that is too
> recent that the fetch may not result in pulling anything new, and
> giving a "die" message, which now needs to be distinguished from
> other forms of die's like network connectivity or auth failures, is
> not all that helpful.

An empty fetch is probably ok (though I would need to double check if
anything bad would happen or git-fetch would give some helpful
suggestion). git-clone on the other hand should actually clean this up
with a good advice. I'll need to check and come back with v2 later.
-- 
Duy

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] upload-pack: reject shallow requests that would return nothing
  2018-05-28 18:48     ` Duy Nguyen
@ 2018-06-02  6:06       ` Duy Nguyen
  0 siblings, 0 replies; 8+ messages in thread
From: Duy Nguyen @ 2018-06-02  6:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List, Andreas Krey

On Mon, May 28, 2018 at 8:48 PM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Mon, May 28, 2018 at 7:55 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:
>>
>>> To avoid this, if rev-list returns nothing, we abort the clone/fetch.
>>> The user could adjust their request (e.g. --shallow-since further back
>>> in the past) and retry.
>>
>> Yeah, that makes sense.
>>
>>> Another possible option for this case is to fall back to a default
>>> depth (like depth 1). But I don't like too much magic that way because
>>> we may return something unexpected to the user.
>>
>> I agree that it would be a horrible fallback.  I actually am
>> wondering if we should just silently return no objects without even
>> telling the user there is something unexpected happening.  After
>> all, the user may well be expecting with --shallow-since that is too
>> recent that the fetch may not result in pulling anything new, and
>> giving a "die" message, which now needs to be distinguished from
>> other forms of die's like network connectivity or auth failures, is
>> not all that helpful.
>
> An empty fetch is probably ok (though I would need to double check if
> anything bad would happen or git-fetch would give some helpful
> suggestion). git-clone on the other hand should actually clean this up
> with a good advice. I'll need to check and come back with v2 later.

It turns out harder than I thought. Cutting history and want/have
negotiation are separate but must be consistent. If during the
negotiation you said "I'm giving you this ref at this SHA-1" then you
send nothing back, the client will complain at connectivity check. It
expects all the objects that lead to said SHA-1.

Part of the problem is we advertise refs very early, before accepting
shallow requests and it's kinda hard to tell the user "ok with this
set of shallow requests, only these refs are actually valid and could
be sent back to you" before the want/have negotiation starts. At the
current state, the only thing we could do is tell the user "nak you
can't have that ref" even if we advertise the ref. This might confuse
clients and does not sound great.

I think for now die() may be a good enough quick fix. I'm not sure if
it's worth messing with the want/have negotiation just to cover this
rare edge case.
-- 
Duy

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] upload-pack: reject shallow requests that would return nothing
  2018-05-26 11:35 ` [PATCH] upload-pack: reject shallow requests that would return nothing Nguyễn Thái Ngọc Duy
  2018-05-28  5:55   ` Junio C Hamano
@ 2018-06-04 10:46   ` Junio C Hamano
  2018-06-04 14:44     ` Duy Nguyen
  1 sibling, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2018-06-04 10:46 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, a.krey

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> Shallow clones with --shallow-since or --shalow-exclude work by
> running rev-list to get all reachable commits, then draw a boundary
> between reachable and unreachable and send "shallow" requests based on
> that.
>
> The code does miss one corner case: if rev-list returns nothing, we'll
> have no border and we'll send no shallow requests back to the client
> (i.e. no history cuts). This essentially means a full clone (or a full
> branch if the client requests just one branch). One example is the
> oldest commit is older than what is specified by --shallow-since.

"the newest commit is older than", isn't it?  That is, the cutoff
point specified is newer than the existing history.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] upload-pack: reject shallow requests that would return nothing
  2018-06-04 10:46   ` Junio C Hamano
@ 2018-06-04 14:44     ` Duy Nguyen
  0 siblings, 0 replies; 8+ messages in thread
From: Duy Nguyen @ 2018-06-04 14:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List, Andreas Krey

On Mon, Jun 4, 2018 at 12:46 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:
>
>> Shallow clones with --shallow-since or --shalow-exclude work by
>> running rev-list to get all reachable commits, then draw a boundary
>> between reachable and unreachable and send "shallow" requests based on
>> that.
>>
>> The code does miss one corner case: if rev-list returns nothing, we'll
>> have no border and we'll send no shallow requests back to the client
>> (i.e. no history cuts). This essentially means a full clone (or a full
>> branch if the client requests just one branch). One example is the
>> oldest commit is older than what is specified by --shallow-since.
>
> "the newest commit is older than", isn't it?  That is, the cutoff
> point specified is newer than the existing history.

Yes. As a result, the entirely history is cut, including the tip.
--shallow-exclude could also lead to this situation if the user
accidentally excludes everything.
-- 
Duy

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-06-04 14:44 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-22 19:48 bug: --shallow-since misbehaves on old branch heads Andreas Krey
2018-05-23 16:02 ` Duy Nguyen
2018-05-26 11:35 ` [PATCH] upload-pack: reject shallow requests that would return nothing Nguyễn Thái Ngọc Duy
2018-05-28  5:55   ` Junio C Hamano
2018-05-28 18:48     ` Duy Nguyen
2018-06-02  6:06       ` Duy Nguyen
2018-06-04 10:46   ` Junio C Hamano
2018-06-04 14:44     ` Duy Nguyen

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).