git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / Atom feed
From: SZEDER Gábor <szeder.dev@gmail.com>
To: Mike Hommey <mh@glandium.org>
Cc: git@vger.kernel.org
Subject: Re: Revision walking, commit dates, slop
Date: Sat, 18 May 2019 03:50:05 +0200
Message-ID: <20190518015005.GA951@szeder.dev> (raw)
In-Reply-To: <20190518005412.n45pj5p2rrtm2bfj@glandium.org>

On Sat, May 18, 2019 at 09:54:12AM +0900, Mike Hommey wrote:
> There are established corner cases, where in a repo where commit dates
> are not monotonically increasing, revision walking can go horribly
> wrong. This was discussed in the past in e.g.
> https://public-inbox.org/git/20150521061553.GA29269@glandium.org/
> 
> The only (simple) workable way, given the current algorithm, to get an
> accurate view off rev-list is to essentially make slop infinite. This
> works fine, at the expense of runtime.
> 
> Now, ignoring any modification for the above, I'm hitting another corner
> case in some other "weird" history, where I have 500k commits all with
> the same date. With such a commit dag, something as trivial as
> `git rev-list HEAD~..HEAD` goes through all commits from the root commit
> to HEAD, which takes multiple seconds, when the (obvious) output is one
> commit.
> 
> It looks like the only way revision walking stops going through all the
> ancestry is through slop, and slop is essentially made infinite by the
> fact all commits have the same date (because of the date check in
> still_interesting(). By extension, this means the workaound for the
> first corner case above, which is to make slop infinite, essentially
> makes all rev walking go through the entire ancestry of the commits
> given on the command line.
> 
> It feels like some cases of everybody_uninteresting should shorcut slop
> entirely, but considering the only way for slop to decrease at all is
> when everybody_uninteresting returns true, that would seem like a wrong
> assumption. But I'm also not sure what slop helps with in the first
> place (but I don't have a clear view of the broader picture of how the
> entire revision walking works).
> 
> Anyways, a rather easy way to witness this happening is to create a
> dummy repo like:
>   git init foo
>   cd foo
>   for i in $(seq 1 50); do
>     echo $i > a;
>     git add a;
>     git commit -a -m $i;
>   done
> 
> The something as simple as `git rev-list HEAD~..HEAD` will go through
> all 50 commits (assuming the script above created commits in the same
> second, which it did on my machine)
> 
> By the time both HEAD~ and HEAD have been processed, the revision
> walking should have enough information to determine that it doesn't need
> to go further, but still does. Even with something like HEAD~2..HEAD,
> after the first round of processing parents it should be able to see
> there's not going to be any more interesting commits.
> 
> I'm willing to dig into this, but if someone familiar with the
> algorithm could give me some hints as to what I might be missing in the
> big picture, that would be helpful.

All the above is without commit-graph, I presume?  If so, then you
should give it a try, as it might bring immediate help in your
pathological repo.  With 5k commit in the same second (enforced via
'export GIT_COMMITTER_DATE=$(date); for i in {1..5000} ...') I get:

  $ best-of-five -q git rev-list HEAD~..HEAD
  0.069
  $ git commit-graph write --reachableComputing commit graph generation
  numbers: 100% (5000/5000), done.
  $ best-of-five -q git rev-list HEAD~..HEAD
  0.004



  reply index

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-18  0:54 Mike Hommey
2019-05-18  1:50 ` SZEDER Gábor [this message]
2019-05-18  3:58   ` Mike Hommey
2019-05-18  4:17     ` Mike Hommey
2019-05-18 12:01       ` SZEDER Gábor
2019-05-19 22:28         ` Jakub Narebski
2019-05-20  1:33       ` Derrick Stolee
2019-05-20 11:02         ` Jakub Narebski
2019-05-20 11:20           ` Derrick Stolee
2019-05-20 13:42             ` Jakub Narebski
2019-05-20 23:27               ` Jakub Narebski
2019-05-21  1:20                 ` Derrick Stolee
2019-05-22 18:29                   ` Jakub Narebski
2019-05-22 19:06                     ` Derrick Stolee
2019-05-23 21:04                       ` Jakub Narebski
2019-06-25  7:51               ` Jakub Narebski
2019-06-25 10:54                 ` Derrick Stolee
2019-05-21 13:14         ` [PATCH] revision: use generation for A..B --topo-order queries Derrick Stolee
2019-05-21 13:59           ` [PATCH 2/2] revision: keep topo-walk free of unintersting commits Derrick Stolee
2019-05-22  2:19             ` Mike Hommey
2019-05-21  2:00   ` Revision walking, commit dates, slop Jonathan Nieder

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190518015005.GA951@szeder.dev \
    --to=szeder.dev@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=mh@glandium.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

git@vger.kernel.org list mirror (unofficial, one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.org/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox