git@vger.kernel.org mailing list mirror (one of many)
From: Jeff King <peff@peff.net>
To: Taylor Blau <me@ttaylorr.com>
Cc: git@vger.kernel.org, dstolee@microsoft.com
Subject: Re: [PATCH 3/3] commit-graph: respect 'core.useBloomFilters'
Date: Tue, 30 Jun 2020 15:18:34 -0400
Message-ID: <20200630191834.GC1888406@coredump.intra.peff.net>
In-Reply-To: <4cfa086e503e19763a9d581fcb6a2ef818776dfc.1593536481.git.me@ttaylorr.com>

On Tue, Jun 30, 2020 at 01:17:48PM -0400, Taylor Blau wrote:

> Git uses the 'core.commitGraph' configuration value to control whether
> or not the commit graph is used when parsing commits or performing a
> traversal.

I think this is a good thing to have, and the patch itself makes sense
to me (this is actually my first time reviewing it, despite its intended
use within GitHub :) ).

If I may bikeshed for a moment:

> Introduce 'core.useBloomFilters' to control whether or not Bloom filters
> are read. Note that this configuration is independent from both:
> 
>   - 'core.commitGraph', to allow flexibility in using all parts of a
>     commit-graph _except_ for its Bloom filters.
> 
>   - The '--changed-paths' option for 'git commit-graph write', to allow
>     reading and writing Bloom filters to be controlled independently.

Should we avoid exposing the user to the words "Bloom filter"?

The command-line option for writing them was genericized to
"changed-paths", which I think is good. The use of Bloom filters is an
implementation detail. What the user cares about is whether we can
optimize queries of which paths changed in a commit.

When we introduced reachability bitmaps long ago, we made the mistake of
just calling them "bitmaps". That jargon is well understood by people
who work with that code, but it's confusing outside of that (even within
other parts of Git) because bitmaps are just a generic data structure.
You can have a bitmap of just about anything (and indeed we do use other
bitmaps these days). Consistently calling them "reachability bitmaps",
especially in the user facing bits, would have reduced confusion over
the years.

Similarly, Bloom filters are a generic structure we might use elsewhere.
I don't really care if we use the word "Bloom" internally to refer to
this feature, but we'll be stuck with this config option for all time. I
think it's worth picking something clearer.

It might even be worth considering whether "changed paths" needs more
context (or would if we add new features in the future). On a "git
commit-graph write" command-line it is perfectly clear, but would
core.commitGraphChangedPaths be worth it? It's definitely more specific,
but it's also way more ugly. ;)

> diff --git a/t/helper/test-read-graph.c b/t/helper/test-read-graph.c
> index 6d0c962438..5f585a1725 100644
> --- a/t/helper/test-read-graph.c
> +++ b/t/helper/test-read-graph.c
> @@ -12,11 +12,12 @@ int cmd__read_graph(int argc, const char **argv)
>  	setup_git_directory();
>  	odb = the_repository->objects->odb;
>  
> +	prepare_repo_settings(the_repository);
> +
>  	graph = read_commit_graph_one(the_repository, odb);

I wondered why we would need this prepare_repo_settings() now, when it
should already have been needed to cover core.commitGraph. I strongly
suspect the answer is: "test-tool read-graph" never properly respected
core.commitGraph in the first place.

And now presumably it would. If true, I don't think any tests need to be
adjusted, because the only places we set it are:

  - on a "git -c" command line, which wouldn't run a test-tool helper

  - in config where we always set it to "true", which is the default
    anyway
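
To illustrate the suspicion above, here is a minimal toy model of a
lazily filled settings cache. Every name in it (toy_repo, toy_settings,
toy_prepare_settings, toy_graph_enabled) is hypothetical and made up for
this sketch; it is not Git's actual code, and the fallback-to-default
behavior is an assumption chosen to show the failure mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these names are hypothetical, not Git's real API. */
struct toy_settings {
	bool initialized;	/* has the cache been filled yet? */
	bool commit_graph;	/* cached value of the config knob */
};

struct toy_repo {
	bool config_commit_graph;	/* what the user configured */
	struct toy_settings settings;	/* lazily filled cache */
};

/* Built-in default that applies when the cache was never filled. */
#define TOY_DEFAULT_COMMIT_GRAPH true

/* Copy the configured value into the cache; repeated calls are no-ops. */
static void toy_prepare_settings(struct toy_repo *r)
{
	assert(r != NULL);
	if (r->settings.initialized)
		return;
	r->settings.commit_graph = r->config_commit_graph;
	r->settings.initialized = true;
}

/* A reader that trusts the cache and never fills it on its own. */
static bool toy_graph_enabled(const struct toy_repo *r)
{
	assert(r != NULL);
	if (!r->settings.initialized)
		return TOY_DEFAULT_COMMIT_GRAPH;
	return r->settings.commit_graph;
}
```

In this model, a caller that skips toy_prepare_settings() always sees
the default, so a configured "false" is silently ignored until the
prepare call is added.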

>  	if (!graph)
>  		return 1;
>  
> -
>  	printf("header: %08x %d %d %d %d\n",
>  		ntohl(*(uint32_t*)graph->data),
>  		*(unsigned char*)(graph->data + 4),

Oh good, I happened to be looking at this code earlier today for an
unrelated reason and was bothered by this extra newline. :)

-Peff
