git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Martin Ågren" <martin.agren@gmail.com>
To: Taylor Blau <me@ttaylorr.com>
Cc: Git Mailing List <git@vger.kernel.org>, Jeff King <peff@peff.net>,
	Derrick Stolee <dstolee@microsoft.com>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH 1/3] builtin/commit-graph.c: support '--split[=<strategy>]'
Date: Fri, 31 Jan 2020 20:27:27 +0100	[thread overview]
Message-ID: <CAN0heSo29+sf1352EfGv=qFsir-e=Ompja445bh5z3vpOfkaJA@mail.gmail.com> (raw)
In-Reply-To: <4f5bc19903f8a1f5b153b5665de378e743e12744.1580430057.git.me@ttaylorr.com>

On Fri, 31 Jan 2020 at 01:29, Taylor Blau <me@ttaylorr.com> wrote:
> With '--split', the commit-graph machinery writes new commits in another
> incremental commit-graph which is part of the existing chain, and
> optionally decides to condense the chain into a single commit-graph.
> This is done to ensure that the aysmptotic behavior of looking up a

asymptotic

> commit in an incremental chain is dominated by the number of
> incrementals in that chain. It can be controlled by the '--max-commits'
> and '--size-multiple' options.
>
> On occasion, callers may want to ensure that 'git commit-graph write
> --split' always writes an incremental, and never spends effort
> condensing the incremental chain [1]. Previously, this was possible by
> passing '--size-multiple=0', but this no longer the case following
> 63020f175f (commit-graph: prefer default size_mult when given zero,
> 2020-01-02).
>
> Reintroduce a less-magical variant of the above with a new pair of
> arguments to '--split': '--split=no-merge' and '--split=merge-all'. When
> '--split=no-merge' is given, the commit-graph machinery will never
> condense an existing chain and will always write a new incremental.
> Conversely, if '--split=merge-all' is given, any invocation including it
> will always condense a chain if one exists.  If '--split' is given with
> no arguments, it behaves as before and defers to '--size-multiple', and
> so on.

I understand your motivation for doing this -- it all seems quite sound
to me. Not being too familiar with this commit-graph splitting and
merging, I had a hard time groking this terminology though. From what I
understand, before this patch, `--split` means "write the commit-graph
using the 'split' file-format / in a split fashion". Ok, that makes
sense.

From seeing `--split=no-merge`, I have no idea how to even parse that.
Of course I don't want to merge, I want to split! Well not split, but
write split files.

I think it would help me (and others like me) if we could somehow
separate "I want to use 'split' files" from "and here's how I want you
to decide on the merging". That is, which "strategy" to use. Obviously,
talking about a "merge strategy" would be stupid and "split strategy"
also seems a bit odd. "Coalescing strategy"? "Joining strategy"?

Or can you convince me otherwise? From which angle should I look at
this?

> -With the `--split` option, write the commit-graph as a chain of multiple
> -commit-graph files stored in `<dir>/info/commit-graphs`. The new commits
> -not already in the commit-graph are added in a new "tip" file. This file
> -is merged with the existing file if the following merge conditions are
> -met:
> +With the `--split[=<strategy>]` option, write the commit-graph as a
> +chain of multiple commit-graph files stored in
> +`<dir>/info/commit-graphs`. Commit-graph layers are merged based on the
> +strategy and other splitting options. The new commits not already in the
> +commit-graph are added in a new "tip" file. This file is merged with the
> +existing file if the following merge conditions are met:
> +* If `--split=merge-always` is specified, then a merge is always
> +conducted, and the remaining options are ignored. Conversely, if
> +`--split=no-merge` is specified, a merge is never performed, and the
> +remaining options are ignored. A bare `--split` defers to the remaining
> +options. (Note that merging a chain of commit graphs replaces the
> +existing chain with a length-1 chain where the first and only
> +incremental holds the entire graph).

To better understand the background for this patch, I read the manpage
as it stands today. From the section on `--split`, I got this
impression: Let's say that `--max-commits` is huge, so all that matters
is the `--size-multiple`. Let's say it's two. If the current tip
contains three commits and we're about to write one with two, then 2*2 >
3 so we will merge, i.e., write a tip file with five commits. Unless of
course *that* is more than half the size of the file before. And so on.
We might just merge once, or maybe "many" files in an avalanche effect.
Every now and then, such avalanches will go all the way back to the
first file.

Now this says something different, namely that once we decide to merge,
we do it all the way back, no matter what.

The commit message of 1771be90c8 ("commit-graph: merge commit-graph
chains", 2019-06-18) seems to support my original understanding, at
least for `--size-multiple`, but not `--max-commits`, curiously enough.

Can you clarify?

> -               OPT_BOOL(0, "split", &opts.split,
> -                       N_("allow writing an incremental commit-graph file")),
> +               OPT_CALLBACK_F(0, "split", &split_opts.flags, NULL,
> +                       N_("allow writing an incremental commit-graph file"),

This still sounds very boolean. Cramming in the "strategy" might be hard
-- is this an argument in favor of having two separate options? ;-)

> +enum commit_graph_split_flags {
> +       COMMIT_GRAPH_SPLIT_UNSPECIFIED      = 0,
> +       COMMIT_GRAPH_SPLIT_MERGE_REQUIRED   = 1,
> +       COMMIT_GRAPH_SPLIT_MERGE_PROHIBITED = 2
> +};

I wonder if this should be "MERGE_AUTO" rather than "UNSPECIFIED". This
is related to Stolee's comment, I think.


Martin

  parent reply	other threads:[~2020-01-31 19:27 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-31  0:28 [PATCH 0/3] builtin/commit-graph.c: new split/merge options Taylor Blau
2020-01-31  0:28 ` [PATCH 1/3] builtin/commit-graph.c: support '--split[=<strategy>]' Taylor Blau
2020-01-31 14:19   ` Derrick Stolee
2020-02-04  3:47     ` Taylor Blau
2020-01-31 19:27   ` Martin Ågren [this message]
2020-02-04  4:06     ` Taylor Blau
2020-02-06 19:15       ` Martin Ågren
2020-02-09 23:27         ` Taylor Blau
2020-01-31 23:34   ` SZEDER Gábor
2020-02-01 21:25     ` Johannes Schindelin
2020-02-03 10:47       ` SZEDER Gábor
2020-02-03 11:11         ` Jeff King
2020-02-04  3:58           ` Taylor Blau
2020-02-04 14:14             ` Jeff King
2020-02-04  3:59       ` Taylor Blau
2020-02-04  3:59     ` Taylor Blau
2020-01-31  0:28 ` [PATCH 2/3] builtin/commit-graph.c: introduce '--input=<source>' Taylor Blau
2020-01-31 14:40   ` Derrick Stolee
2020-02-04  4:21     ` Taylor Blau
2020-01-31 19:34   ` Martin Ågren
2020-02-04  4:51     ` Taylor Blau
2020-02-13 11:33       ` SZEDER Gábor
2020-02-13 11:48         ` SZEDER Gábor
2020-02-13 17:56           ` Taylor Blau
2020-01-31  0:28 ` [PATCH 3/3] builtin/commit-graph.c: support '--input=none' Taylor Blau
2020-01-31 14:40   ` Derrick Stolee
2020-01-31 19:45   ` Martin Ågren
2020-02-04  5:01     ` Taylor Blau
2020-01-31  0:32 ` [PATCH 0/3] builtin/commit-graph.c: new split/merge options Taylor Blau
2020-01-31 13:26   ` Derrick Stolee
2020-01-31 14:41 ` Derrick Stolee
2020-02-04 23:44 ` Junio C Hamano
2020-02-05  0:30   ` Taylor Blau
2020-02-05  0:28 ` [PATCH v2 " Taylor Blau
2020-02-05  0:28   ` [PATCH v2 1/3] builtin/commit-graph.c: support '--split[=<strategy>]' Taylor Blau
2020-02-06 19:41     ` Martin Ågren
2020-02-07 15:48       ` Derrick Stolee
2020-02-09 23:32         ` Taylor Blau
2020-02-12  6:03         ` Martin Ågren
2020-02-12 20:50           ` Taylor Blau
2020-02-09 23:30       ` Taylor Blau
2020-02-05  0:28   ` [PATCH v2 2/3] builtin/commit-graph.c: introduce '--input=<source>' Taylor Blau
2020-02-05  0:28   ` [PATCH v2 3/3] builtin/commit-graph.c: support '--input=none' Taylor Blau
2020-02-06 19:50     ` Martin Ågren
2020-02-09 23:32       ` Taylor Blau
2020-02-05 20:07   ` [PATCH v2 0/3] builtin/commit-graph.c: new split/merge options Junio C Hamano
2020-02-12  5:47 ` [PATCH v3 " Taylor Blau
2020-02-12  5:47   ` [PATCH v3 1/3] builtin/commit-graph.c: support '--split[=<strategy>]' Taylor Blau
2020-02-12  5:47   ` [PATCH v3 2/3] builtin/commit-graph.c: introduce '--input=<source>' Taylor Blau
2020-02-12  5:47   ` [PATCH v3 3/3] builtin/commit-graph.c: support '--input=none' Taylor Blau
2020-02-13 11:39     ` SZEDER Gábor
2020-02-13 12:31     ` SZEDER Gábor
2020-02-13 16:08       ` Junio C Hamano
2020-02-13 17:58         ` Taylor Blau
2020-02-13 17:56       ` Taylor Blau
2020-02-12 18:19   ` [PATCH v3 0/3] builtin/commit-graph.c: new split/merge options Junio C Hamano
2020-02-13 17:41     ` Taylor Blau
2020-02-17 18:24   ` Martin Ågren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAN0heSo29+sf1352EfGv=qFsir-e=Ompja445bh5z3vpOfkaJA@mail.gmail.com' \
    --to=martin.agren@gmail.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).