From: Junio C Hamano <gitster@pobox.com>
To: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, me@ttaylorr.com, vdye@github.com,
Jeff King <peff@peff.net>,
Derrick Stolee <derrickstolee@github.com>
Subject: Re: [PATCH v2 0/8] ref-filter: ahead/behind counting, faster --merged option
Date: Fri, 10 Mar 2023 11:16:17 -0800
Message-ID: <xmqqedpw5se6.fsf@gitster.g>
In-Reply-To: pull.1489.v2.git.1678468863.gitgitgadget@gmail.com
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> I was
> initially concerned about the overhead of 'git for-each-ref' and its
> generality and sorting, but I was not able to measure any important
> difference between this implementation and our internal 'git ahead-behind'
> implementation.
That certainly is nice to know.
> However, for our specific uses, we like to batch a list of exact references
> that could be very long. We introduce a new --stdin option here.
>
> To keep things close to the v1 outline, I replaced the existing patches with
> closely-related ones, when possible.
>
> Patch 1 adds the --stdin option to 'git for-each-ref'. (This is similar to
> the boilerplate patch from v1.)
>
> Patch 2 adds a test to explicitly check that 'git for-each-ref' will still
> succeed when all input refs are missing. (This is similar to the
> --ignore-missing patch from v1.)
Sensible.
> Patches 3-5 introduce a new method: ensure_generations_valid(). Patch 3 does
> some refactoring of the existing generation number computations to make it
> more generic, and patch 4 updates the definition of
> commit_graph_generation() slightly, making way for patch 5 to implement the
> method. With an existing commit-graph file, the commits that are not present
> in the file are considered as having generation number "infinity". This is
> useful for most of our reachability queries to this point, since those
> commits are "above" the ones tracked by the commit-graph. When these commits
> are low in number, then there is very little performance cost and zero
> correctness cost. (These patches match v1 exactly.)
>
> However, we will see that the ahead/behind computation requires accurate
> generation numbers to avoid overcounting. Thus, ensure_generations_valid()
> is a way to specify a list of commits that need generation numbers computed
> before continuing. It's a no-op if all of those commits are in the
> commit-graph file. It's expensive if the commit-graph doesn't exist.
Reasonable.
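To make the "infinity" point concrete, here is a minimal Python sketch of the idea, not the code from patches 3-5. It assumes generation numbers are topological levels (one more than the largest parent level, with roots at 1). The commit names, the `parents` map, and the helper name `ensure_generations` are all made up for illustration.

```python
# Toy model only: commit names, the `parents` map, and the helper name
# `ensure_generations` are hypothetical and not taken from the series.
INFINITY = float("inf")

def ensure_generations(parents, known, wanted):
    """Fill in topological levels for `wanted` commits and their ancestors.

    parents: dict mapping commit -> list of parent commits
    known:   generation numbers already stored (stands in for the commit-graph)
    wanted:  commits that must not be left at "infinity"
    """
    gen = dict(known)
    stack = list(wanted)
    while stack:
        c = stack[-1]
        if c in gen:
            stack.pop()
            continue
        missing = [p for p in parents[c] if p not in gen]
        if missing:
            # Resolve parents first; `c` stays on the stack and is revisited.
            stack.extend(missing)
            continue
        # Level is one more than the largest parent level; roots get 1.
        gen[c] = 1 + max((gen[p] for p in parents[c]), default=0)
        stack.pop()
    return gen

parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"], "E": ["C", "D"]}
known = {"A": 1, "B": 2}  # commits present in the stand-in commit-graph

# Without the extra step, anything outside `known` is treated as "infinity".
before = {c: known.get(c, INFINITY) for c in parents}
after = ensure_generations(parents, known, ["E"])
```

When every requested commit is already in `known`, the loop pops each one immediately and the result equals the input, which mirrors the "no-op if all of those commits are in the commit-graph file" case described above.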
> However, '%(ahead-behind:)' computations are likely to be slow no matter
> what without a commit-graph, so assuming an existing commit-graph file is
> reasonable. If we find sufficient desire to have an implementation that does
> not have this requirement, we could create a second implementation and
> toggle to it when generation_numbers_enabled() returns false.
At that point, it might make sense to find a way to keep the cycles
ensure_generations_valid() had to spend from going to waste.
Something like "ah, you do not have a commit-graph at all, so let's
try to create one if you can write into the repository" at the
beginning of the function, perhaps? Just thinking aloud.
> Patch 6 implements the ahead-behind algorithm, but it is not connected to a
> builtin. It's a long commit message, so hopefully it explains the algorithm
> sufficiently. (The difference from v1 is that it no longer integrates with a
> builtin and there are no new tests. It also uses 'unsigned int' and is
> correctly co-authored by Taylor.)
Nice.
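For readers who want the numbers pinned down, the following Python sketch shows only what is being counted, under the usual reading that "ahead" is the number of commits reachable from the tip but not from the base, and "behind" is the reverse. It is deliberately naive (two full walks per pair) and is not the algorithm from patch 6. The toy history and function names are hypothetical.

```python
# Naive definition of ahead/behind on a toy DAG. The history and the
# function names are hypothetical; this is not the algorithm in patch 6.

def reachable(parents, tip):
    """Return the set of commits reachable from `tip`, including `tip`."""
    seen = set()
    stack = [tip]
    while stack:
        c = stack.pop()
        if c in seen:
            continue
        seen.add(c)
        stack.extend(parents[c])
    return seen

def ahead_behind_naive(parents, tip, base):
    """Return (ahead, behind) of `tip` relative to `base`."""
    from_tip = reachable(parents, tip)
    from_base = reachable(parents, base)
    return len(from_tip - from_base), len(from_base - from_tip)

parents = {
    "A": [], "B": ["A"],
    "C": ["B"], "D": ["C"],               # base branch: A-B-C-D
    "E": ["B"], "F": ["E"], "G": ["F"],   # topic branch: B-E-F-G
}
counts = ahead_behind_naive(parents, "G", "D")  # (3, 2)
```

Here the topic tip G is ahead by E, F and G, and behind by C and D. Repeating two full walks for every tip is what makes this approach unattractive for a long list of refs.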
> Patch 7 integrates the ahead-behind algorithm with the ref-filter code,
> including parsing the "ahead-behind" token. This finally adds tests that
> check both ahead_behind() and ensure_generations_valid() via
> t6600-test-reach.sh. (This patch is essentially completely new in v2.)
>
> Patch 8 implements the tips_reachable_from_bases() method, and uses it within
> the ref-filter code to speed up 'git for-each-ref --merged' and 'git branch
> --merged'. (The interface is slightly different than v1, due to the needs of
> the new caller.)
Very nice.
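As an illustration of why a batched reachability check can help "--merged", here is a small Python sketch that answers "which of these tips are reachable from this base?" with a single walk. It is a sketch under assumptions and not the code in patch 8: the function name, the toy history, and the use of a generation-number floor to cut the walk short are all mine.

```python
# One walk from `base` answers the question for every tip at once.
# Hypothetical names and data; not the implementation in patch 8.

def reachable_tips_sketch(parents, gen, base, tips):
    """Return the subset of `tips` that is reachable from `base`."""
    wanted = set(tips)
    # A commit can only reach commits with a strictly smaller generation
    # number, so nothing below the lowest tip needs to be visited.
    floor = min(gen[t] for t in wanted) if wanted else 0
    found, seen, stack = set(), set(), [base]
    while stack and found != wanted:
        c = stack.pop()
        if c in seen or gen[c] < floor:
            continue
        seen.add(c)
        if c in wanted:
            found.add(c)
        stack.extend(parents[c])
    return found

parents = {
    "A": [], "B": ["A"],
    "C": ["B"], "D": ["C"],               # base branch: A-B-C-D
    "E": ["B"], "F": ["E"], "G": ["F"],   # topic branch: B-E-F-G
}
gen = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 3, "F": 4, "G": 5}

merged = sorted(reachable_tips_sketch(parents, gen, "D", ["C", "F", "B"]))
```

With D as the base, B and C come back as merged and F does not. The walk never expands A, because its generation number is below that of every requested tip.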
Having read all the patches, I am very impressed and pleased, but
are we losing anything by having the feature inside for-each-ref
compared to a new command ahead-behind? As far as I can tell, the
new "for-each-ref --stdin" would still want to match refs and work
only on refs, but there shouldn't be any reason for the ahead-behind
computation to be limited to commits that sit at the tip of a ref, so
that may be one downside of this updated design. For the intended use
case of "let's find which branches are stale", that downside does
not matter in practice, but for other use cases people will think
of in the future, the limitation might matter (at which time we can
easily resurrect the other subcommand, using the internal machinery
we have here, so it is not a huge deal, I presume).
Thanks.