git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* bug in git-log with funny merges
@ 2020-03-24 17:29 Uwe Kleine-König
  2020-03-25  5:30 ` Jeff King
  0 siblings, 1 reply; 3+ messages in thread
From: Uwe Kleine-König @ 2020-03-24 17:29 UTC (permalink / raw)
  To: git; +Cc: oystwa, entwicklung

Hello,

TL;DR: how can @~..v5.4.27 be empty, but @..v5.4.27 contain commits?

I hit a situation with git that I don't understand. After asking in #git
I found two others who agree this is strange and I should report here.

You can recreate my history with the following commands:

	$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
	$ cd linux
	$ git remote add -f stable git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
	$ wget https://www.kleine-koenig.org/tmp/funnymerge
	$ git fetch funnymerge HEAD
	$ git checkout FETCH_HEAD

With that in place I have:

	$ git log --oneline --boundary --graph v5.4.27..
	*---.   86fd3e9df543 (HEAD) customers/sample/20200324-1
	|\ \ \  
	| | | * 6bfe59d959c3 iio: dac: ltc2632: remove some unused defines
	| | | * fc25f80fa2d6 iio: dac: ltc2632: add support for LTC2636 family
	| | | * aaf713847341 iio: dac: ltc2632: drop some duplicated data
	| | | * 36f4e0527245 dt-bindings: iio: ltc2632: expand for ltc2636 support
	| | * | dd98c18c4ae4 rtc: pcf2127: implement reading battery status bits
	| | |/  
	| * | 2704eba85eb7 serial: imx: use hrtimers for rs485 delays
	| * | 6fde0395f781 serial: imx: implement rts delaying for rs485
	| * | c288981265b7 serial: imx: support an enable-gpio
	| * | a7198bfcc09f serial: imx: fix a race condition in receive path
	| * | 38777ea60445 tty: serial: imx: use the sg count from dma_map_sg
	| * | 9dabd7937faf serial: imx: adapt rx buffer and dma periods
	* | | d1437a91a10e Release customers/sample/20200324-1
	* | | 803aad81ca69 iio: dac: ltc2632: remove some unused defines
	* | | 8a4579487b00 iio: dac: ltc2632: add support for LTC2636 family
	* | | 25a9309d7ded iio: dac: ltc2632: drop some duplicated data
	* | | e8f40385c55d dt-bindings: iio: ltc2632: expand for ltc2636 support
	* | | b95a62694eae rtc: pcf2127: implement reading battery status bits
	* | | f5af38a6b7a1 serial: imx: use hrtimers for rs485 delays
	* | | 14075cd0ce1b serial: imx: implement rts delaying for rs485
	* | | 338647d2ab5c serial: imx: support an enable-gpio
	* | | b31f3c6fbc18 serial: imx: adapt rx buffer and dma periods
	o | | 585e0cc08069 (tag: v5.4.27, stable/linux-5.4.y) Linux 5.4.27
	 / /  
	o / da0c9ea146cb (tag: v5.4-rc2) Linux 5.4-rc2
	 /  
	o 219d54332a09 (tag: v5.4) Linux 5.4

(This is a tree format we use to archive kernel trees for our customers.
It contains a snapshot of the relevant topic branches and a
linearisation to put into a quilt stack.)

So v5.4.27 is an ancestor of HEAD:

	$ git merge-base --is-ancestor v5.4.27 @ && echo yes
	yes

But!:

	$ git rev-list --count HEAD..v5.4.27
	3868

So there are 3868 patches that are in v5.4.27 but not HEAD? I would have
expected zero. Also

	$ git log HEAD..v5.4.27

lists commits, but

	$ git log HEAD~..v5.4.27

does not. (The latter is expected.)

I tested with git 2.26.0 and also git 2.0.0 and on different machines,
with the same results.

Am I missing something, or is there a (quite old) bug in git somewhere?

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: bug in git-log with funny merges
  2020-03-24 17:29 bug in git-log with funny merges Uwe Kleine-König
@ 2020-03-25  5:30 ` Jeff King
  2020-03-25  9:04   ` git log behaves strange on non-monotonic commit dates [Was: Re: bug in git-log with funny merges] Uwe Kleine-König
  0 siblings, 1 reply; 3+ messages in thread
From: Jeff King @ 2020-03-25  5:30 UTC (permalink / raw)
  To: Uwe Kleine-König; +Cc: Derrick Stolee, git, oystwa, entwicklung

On Tue, Mar 24, 2020 at 06:29:49PM +0100, Uwe Kleine-König wrote:

> So v5.4.27 is an ancestor of HEAD:
> 
> 	$ git merge-base --is-ancestor v5.4.27 @ && echo yes
> 	yes
> 
> But!:
> 
> 	$ git rev-list --count HEAD..v5.4.27
> 	3868

Yeah, this seems impossible. If v5.4.27 is an ancestor of HEAD, then
that rev-list result must be 0 commits. I'd trust merge-base over
rev-list's limiting, as the latter can be confused by clock skew.

And indeed, doing:

  $ git log --graph --format='%h %cd' HEAD

does point to some weirdness:

  [...]
  * | | d1437a91a10e Tue Mar 24 17:53:00 2020 +0100
  * | | 803aad81ca69 Mon Feb 3 21:24:56 2020 +0100
  * | | 8a4579487b00 Thu Jan 30 11:58:13 2020 +0100
  * | | 25a9309d7ded Thu Jan 30 11:54:28 2020 +0100
  * | | e8f40385c55d Thu Jan 30 14:15:47 2020 +0100
  * | | b95a62694eae Wed Sep 30 11:19:29 2015 +0200
  * | | f5af38a6b7a1 Thu Sep 19 17:59:29 2019 +0200
  * | | 14075cd0ce1b Tue Nov 17 21:20:00 2015 +0100
  * | | 338647d2ab5c Wed Jul 13 10:41:40 2016 +0200
  * | | b31f3c6fbc18 Mon Sep 23 15:59:16 2019 +0200
  * | | 585e0cc08069 Sat Mar 21 08:12:00 2020 +0100

The commit dates are very out-of-order: we go back to January (and even
to 2015!) for commits that are built on top of ones from last Saturday.
So while looking for a common ancestor with v5.4.27 (which is from only
a few days ago), I think we'd stop walking if we're left only with
UNINTERESTING commits (which these are, being ancestors of HEAD) that
seem to be from long before the last interesting one.

There's a slop-counter which is supposed to skip over up to 5
out-of-order commits, but that's clearly not enough here. Curiously,
bumping it to 100k commits isn't enough for this case! I had to bump it
to a million (effectively disabling the optimization entirely):

diff --git a/revision.c b/revision.c
index 8136929e23..cf585f375e 100644
--- a/revision.c
+++ b/revision.c
@@ -1085,7 +1085,7 @@ static void cherry_pick_list(struct commit_list *list, struct rev_info *revs)
 }
 
 /* How many extra uninteresting commits we want to see.. */
-#define SLOP 5
+#define SLOP 1000000
 
 static int still_interesting(struct commit_list *src, timestamp_t date, int slop,
 			     struct commit **interesting_cache)

I guess this depth might have to do with some weird topology between
linux-stable and linux.git (or maybe just something with your merge, I
didn't look too closely).

But my takeaways are:

One, whatever is generating this history is wrong. Author dates can be
out of order (and it is perfectly reasonable to do so to represent
cherry-picked or rebased commits). But commit dates should generally be
monotonically increasing. If you're applying old patches, etc, you
should be using the old date for the author timestamp, but the current
time for the committer timestamp.

And two, the clock skew issue is known in Git, and is an accepted
tradeoff for performance. E.g., "rev-list --count HEAD..v5.4.27" goes
from 43ms to 7200ms when I bump the SLOP value.

But the good news is that there's work towards correcting this. Instead
of using commit timestamps as a proxy for graph generation numbers, the
new commit-graph feature actually generates real generation numbers. So:

  $ git commit-graph write --reachable
  $ git rev-list --topo-order --count HEAD..v5.4.27
  0

which is the correct answer (and the rev-list runs even faster than the
SLOP version). The slightly bad news is that I needed --topo-order to
get that result. We're not yet using the generation numbers everywhere;
the issue is that they can actually be slower than commit timestamps
(timestamps, when they're accurate, have more resolution than generation
numbers, so they let us order the breadth-first search better). There
are plans to have the graph code use a "combined" number that stores
both a generation (for accuracy) and a timestamp (for resolution).

-Peff

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* git log behaves strange on non-monotonic commit dates [Was: Re: bug in git-log with funny merges]
  2020-03-25  5:30 ` Jeff King
@ 2020-03-25  9:04   ` Uwe Kleine-König
  0 siblings, 0 replies; 3+ messages in thread
From: Uwe Kleine-König @ 2020-03-25  9:04 UTC (permalink / raw)
  To: Jeff King; +Cc: Derrick Stolee, git, oystwa, entwicklung

Hello Jeff,

I adapted the subject for the new evidences.

On Wed, Mar 25, 2020 at 01:30:39AM -0400, Jeff King wrote:
> On Tue, Mar 24, 2020 at 06:29:49PM +0100, Uwe Kleine-König wrote:
> 
> > So v5.4.27 is an ancestor of HEAD:
> > 
> > 	$ git merge-base --is-ancestor v5.4.27 @ && echo yes
> > 	yes
> > 
> > But!:
> > 
> > 	$ git rev-list --count HEAD..v5.4.27
> > 	3868
> 
> Yeah, this seems impossible. If v5.4.27 is an ancestor of HEAD, then
> that rev-list result must be 0 commits. I'd trust merge-base over
> rev-list's limiting, as the latter can be confused by clock skew.
> 
> And indeed, doing:
> 
>   $ git log --graph --format='%h %cd' HEAD
> 
> does point to some weirdness:
> 
>   [...]
>   * | | d1437a91a10e Tue Mar 24 17:53:00 2020 +0100
>   * | | 803aad81ca69 Mon Feb 3 21:24:56 2020 +0100
>   * | | 8a4579487b00 Thu Jan 30 11:58:13 2020 +0100
>   * | | 25a9309d7ded Thu Jan 30 11:54:28 2020 +0100
>   * | | e8f40385c55d Thu Jan 30 14:15:47 2020 +0100
>   * | | b95a62694eae Wed Sep 30 11:19:29 2015 +0200
>   * | | f5af38a6b7a1 Thu Sep 19 17:59:29 2019 +0200
>   * | | 14075cd0ce1b Tue Nov 17 21:20:00 2015 +0100
>   * | | 338647d2ab5c Wed Jul 13 10:41:40 2016 +0200
>   * | | b31f3c6fbc18 Mon Sep 23 15:59:16 2019 +0200
>   * | | 585e0cc08069 Sat Mar 21 08:12:00 2020 +0100

> The commit dates are very out-of-order: we go back to January (and even
> to 2015!) for commits that are built on top of ones from last Saturday.

That's done because the linearisation of the patch stack (that is
created for consumption by quilt) uses the original committer dates for
reproducibility. hmm ...

> So while looking for a common ancestor with v5.4.27 (which is from only
> a few days ago), I think we'd stop walking if we're left only with
> UNINTERESTING commits (which these are, being ancestors of HEAD) that
> seem to be from long before the last interesting one.
> 
> There's a slop-counter which is supposed to skip over up to 5
> out-of-order commits, but that's clearly not enough here. Curiously,
> bumping it to 100k commits isn't enough for this case! I had to bump it
> to a million (effectively disabling the optimization entirely):
> 
> diff --git a/revision.c b/revision.c
> index 8136929e23..cf585f375e 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -1085,7 +1085,7 @@ static void cherry_pick_list(struct commit_list *list, struct rev_info *revs)
>  }
>  
>  /* How many extra uninteresting commits we want to see.. */
> -#define SLOP 5
> +#define SLOP 1000000
>  
>  static int still_interesting(struct commit_list *src, timestamp_t date, int slop,
>  			     struct commit **interesting_cache)
> 
> I guess this depth might have to do with some weird topology between
> linux-stable and linux.git (or maybe just something with your merge, I
> didn't look too closely).

> But my takeaways are:
> 
> One, whatever is generating this history is wrong. Author dates can be
> out of order (and it is perfectly reasonable to do so to represent
> cherry-picked or rebased commits). But commit dates should generally be
> monotonically increasing. If you're applying old patches, etc, you
> should be using the old date for the author timestamp, but the current
> time for the committer timestamp.

Yeah, will take that to the team to discuss how to improve our tool.

Thanks a lot for your analysis.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-03-25  9:04 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-24 17:29 bug in git-log with funny merges Uwe Kleine-König
2020-03-25  5:30 ` Jeff King
2020-03-25  9:04   ` git log behaves strange on non-monotonic commit dates [Was: Re: bug in git-log with funny merges] Uwe Kleine-König

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).