git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Junio C Hamano" <gitster@pobox.com>
Cc: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org, sandals@crustytoothpaste.net,
	Jeff King <peff@peff.net>
Subject: Re: [PATCH v2 0/5] Create commit-graph file format v2
Date: Fri, 26 Apr 2019 08:06:05 -0400	[thread overview]
Message-ID: <0a8fb6ec-d35c-45d7-3442-0e8298efca4b@gmail.com> (raw)
In-Reply-To: <878svxrwsh.fsf@evledraar.gmail.com>

On 4/26/2019 4:33 AM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Fri, Apr 26 2019, Junio C Hamano wrote:
>>
>> Thanks always for your careful review and thoughtful comments, by
>> the way.

I agree that these comments are extremely helpful.

>>> Now as noted in my series we now on 'master' downgrade that to a warning
>>> (along with the rest of the errors):
>>>
>>>     $ ~/g/git/git --exec-path=$PWD status
>>>     error: commit-graph version 2 does not match version 1
>>>     On branch master
>>>     [...]
>>>
>>> ...and this series sets the default version for all new graphs to v2.>>>> The phrasing seems odd.  It is unclear, even to me who is vaguely
>> familiar with the word "commit-graph" and is aware of the fact that
>> the file format is being updated, what
>>
>>     "commit-graph version 2 does not match version 1"
> 
> Yeah it should really say:
> 
>     "commit-graph is of version 2, our maximum supported version is 1"

I agree this phrasing is better. Please see the patch I just submitted [1]
to try and improve these messages.

[1] https://public-inbox.org/git/pull.181.git.gitgitgadget@gmail.com/
 
> Hindsight is 20/20, but more generally I wonder if we should have these
> format versions match that of the git version (unlikely to change it
> twice in the same release...) which would allow us to say things like:
> 
>     "commit-graph needs v2.22.0 or later, we have the version written by v2.18.0..v2.21.0"
> 
> But of course dealing with those larger integers in the code/gaps is
> also messy :)

There are a couple issues with using the version numbers, from my
perspective:

1. We don't do that anywhere else, like the index file.

2. The microsoft/git fork takes certain performance changes faster
   than core git, and frequently ships versions between major version
   updates. Our 2.17 had the commit-graph, for instance. It's also
   possible that we'd take commit-graph v2 earlier than the core Git
   major release.
 
>> wants to say.  Do I have version #2 on disk and the running binary
>> only understands version #1?  Or do I have version #1 on disk and
>> the binary expected version #2?  How would I get out of this
>> situation?  Is it sufficient to do "rm -f .git/info/commit-graph*"
>> and is it safe?
> 
> Yeah. An rm of .git/info/commit-graph is safe, so is "-c
> core.commitGraph=false" as a workaround.

That is true. I'm not sure the error message is the right place to
describe the workaround.

> I'd say "let's improve the error", but that ship has sailed, and we can
> do better than an error here, no matter how it's phrased...
> 
>>> I think this is *way* too aggressive of an upgrade path. If these
>>> patches go into v2.22.0 then git clients on all older versions that grok
>>> the commit graph (IIRC v2.18 and above) will have their git completely
>>> broken if they're in a mixed-git-version environmen.>
> I should note that "all older versions..." here is those that have
> core.commitGraph=true set. More details in 43d3561805 ("commit-graph
> write: don't die if the existing graph is corrupt", 2019-03-25).
> 
>>> Is it really so important to move to v2 right away that we need to risk
>>> those breakages? I think even with my ab/commit-graph-fixes it's still
>>> too annoying (I was mostly trying to fix other stuff...). If only we
>>> could detect "we should make a new graph now" ....
>>
>> True.

You are right, this is too aggressive and I should have known better. I'll
update in the next version to keep a default to v1. Not only do we have this
downgrade risk, there is no actual benefit in this series alone. This only
sets up the ability for other features.
 
> Having slept on my earlier
> https://public-inbox.org/git/87a7gdspo4.fsf@evledraar.gmail.com/ I think
> I see a better way to deal with this than my earlier suggestion that we
> perform some version flip-flop dance on the single "commit-graph" file:
> 
> How about just writing .git/objects/info/commit-graph-v2, and for the
> upcoming plan when where they'll be split have some dir/prefix there
> where we include the version?
> 
> That means that:
> 
>  1. If there's an existing v1 "commit-graph" file we don't write a v2 at
>     that path in v2.22, although we might have some "write v1 (as well
>     as v2?) for old client compat" config where we opt-in to do that.
> 
>  2. By default in v2.22 we read/write a "commit-graph-v2" file,
>     preferring it over the v1 "commit-graph", falling back on earlier
>     versions if it's not there (until gc --auto kicks in on v2.22 and
>     makes a v2 graph).
> 
>  3. If you have concurrent v2.21 and v2.22 clients accessing the repo
>     you might end up generating one commit-graph or the other depending
>     on who happens to trigger "gc --auto".
> 
>     Hopefully that's a non-issue since an out-of-date graph isn't
>     usually a big deal, and client versions mostly march forward. But
>     v2.22 could also learn some "incremental gc" where it says "my v2 is
>     older, v1 client must have refreshed it, I'll refresh mine/both".
> 
>  4. v2.22 and newer versions will have some code in git-gc where we'll
>     eventually readdir() .git/objects/info and remove graphs that are
>     too old per some new config (say
>     "gc.pruneOlderCommitGraphVersions=180 days").
> 
> This means that:
> 
>  A. GOOD: Now and going forward we can fearlessly create new versions of
>     the graph without worrying/testing how older clients deal with it.
> 
>  B. BAD: We are going to eat ~2x the disk space for commit-graphs while
>     such transitions are underway. I think that's fine. They're
>     relatively small compared to .git/objects, and we'll eventually "gc"
>     the old ones.

We could also write 'commit-graph-v2' and delete 'commit-graph' and if
someone downgrades they would just have a performance issue, not a failure.

>  C. BAD: Different versions of git might perform wildly differently (new
>     version slower) since their respective preferred graph versions
>     might have a very different/up-to-date number of commits v.s. what's
>     in the packs.
> 
> I think "A" outweighs "B" && "C" in this case. It's "just" a caching
> data structure, and git works without it. So we can be a lot looser than
> say updating the index/pack format.
> 
> Worst case things slow down but still work, and as noted in #3 above if
> we care it can be mitigated (but I don't, I think we can safely assume
> "client versions march forward").

While I agree that this downgrade path can be a problem, I don't like the
idea of adding a version in the filename. The whole point of having a versioned
file format is so we can make these changes without changing the filename.

Is it sufficient to remove the auto-upgrade path, at least for a few major
versions? And I can learn from past mistakes and change the response to
the other version information in the v2 file (reach index version, hash version,
unused value in 8th byte) and instead make them non-fatal warnings.

Thanks,
-Stolee

  reply	other threads:[~2019-04-26 12:06 UTC|newest]

Thread overview: 89+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-23 21:59 [PATCH 0/6] Create commit-graph file format v2 Derrick Stolee via GitGitGadget
2019-01-23 21:59 ` [PATCH 1/6] commit-graph: return with errors during write Derrick Stolee via GitGitGadget
2019-01-23 21:59 ` [PATCH 2/6] commit-graph: collapse parameters into flags Derrick Stolee via GitGitGadget
2019-01-23 21:59 ` [PATCH 3/6] commit-graph: create new version flags Derrick Stolee via GitGitGadget
2019-01-23 21:59 ` [PATCH 4/6] commit-graph: add --version=<n> option Derrick Stolee via GitGitGadget
2019-01-24  9:31   ` Ævar Arnfjörð Bjarmason
2019-01-23 21:59 ` [PATCH 5/6] commit-graph: implement file format version 2 Derrick Stolee via GitGitGadget
2019-01-23 23:56   ` Jonathan Tan
2019-01-24  9:40   ` Ævar Arnfjörð Bjarmason
2019-01-24 14:34     ` Derrick Stolee
2019-03-21  9:21   ` Ævar Arnfjörð Bjarmason
2019-01-23 21:59 ` [PATCH 6/6] commit-graph: test verifying a corrupt v2 header Derrick Stolee via GitGitGadget
2019-01-23 23:59   ` Jonathan Tan
2019-01-24 23:05 ` [PATCH 0/6] Create commit-graph file format v2 Junio C Hamano
2019-01-24 23:39 ` Junio C Hamano
2019-01-25 13:54   ` Derrick Stolee
2019-04-24 19:58 ` [PATCH v2 0/5] " Derrick Stolee via GitGitGadget
2019-04-24 19:58   ` [PATCH v2 1/5] commit-graph: return with errors during write Derrick Stolee via GitGitGadget
2019-04-24 19:58   ` [PATCH v2 2/5] commit-graph: collapse parameters into flags Derrick Stolee via GitGitGadget
2019-04-25  5:21     ` Junio C Hamano
2019-04-24 19:58   ` [PATCH v2 3/5] commit-graph: create new version flags Derrick Stolee via GitGitGadget
2019-04-25  5:29     ` Junio C Hamano
2019-04-25 11:09       ` Derrick Stolee
2019-04-25 21:31     ` Ævar Arnfjörð Bjarmason
2019-04-26  2:20       ` Junio C Hamano
2019-04-24 19:58   ` [PATCH v2 4/5] commit-graph: add --version=<n> option Derrick Stolee via GitGitGadget
2019-04-24 19:58   ` [PATCH v2 5/5] commit-graph: implement file format version 2 Derrick Stolee via GitGitGadget
2019-04-25 22:09   ` [PATCH v2 0/5] Create commit-graph file format v2 Ævar Arnfjörð Bjarmason
2019-04-26  2:28     ` Junio C Hamano
2019-04-26  8:33       ` Ævar Arnfjörð Bjarmason
2019-04-26 12:06         ` Derrick Stolee [this message]
2019-04-26 13:55           ` Ævar Arnfjörð Bjarmason
2019-04-27 12:57     ` Ævar Arnfjörð Bjarmason
2019-05-01 13:11   ` [PATCH v3 0/6] " Derrick Stolee via GitGitGadget
2019-05-01 13:11     ` [PATCH v3 1/6] commit-graph: return with errors during write Derrick Stolee via GitGitGadget
2019-05-01 14:46       ` Ævar Arnfjörð Bjarmason
2019-05-01 13:11     ` [PATCH v3 2/6] commit-graph: collapse parameters into flags Derrick Stolee via GitGitGadget
2019-05-01 13:11     ` [PATCH v3 3/6] commit-graph: create new version parameter Derrick Stolee via GitGitGadget
2019-05-01 13:11     ` [PATCH v3 4/6] commit-graph: add --version=<n> option Derrick Stolee via GitGitGadget
2019-05-01 13:11     ` [PATCH v3 5/6] commit-graph: implement file format version 2 Derrick Stolee via GitGitGadget
2019-05-01 19:12       ` Ævar Arnfjörð Bjarmason
2019-05-01 19:56         ` Derrick Stolee
2019-05-01 13:11     ` [PATCH v3 6/6] commit-graph: remove Future Work section Derrick Stolee via GitGitGadget
2019-05-01 14:58       ` Ævar Arnfjörð Bjarmason
2019-05-01 19:59         ` Derrick Stolee
2019-05-01 20:25     ` [PATCH v3 0/6] Create commit-graph file format v2 Ævar Arnfjörð Bjarmason
2019-05-02 13:26       ` Derrick Stolee
2019-05-02 18:02         ` Ævar Arnfjörð Bjarmason
2019-05-03 12:47           ` Derrick Stolee
2019-05-03 13:41             ` Ævar Arnfjörð Bjarmason
2019-05-06  8:27               ` Christian Couder
2019-05-06 13:47                 ` Derrick Stolee
2019-05-03 14:16             ` SZEDER Gábor
2019-05-03 15:11               ` Derrick Stolee
2019-05-09 14:22     ` [PATCH v4 00/11] Commit-graph write refactor (was: Create commit-graph file format v2) Derrick Stolee via GitGitGadget
2019-05-09 14:22       ` [PATCH v4 01/11] commit-graph: fix the_repository reference Derrick Stolee via GitGitGadget
2019-05-13  2:56         ` Junio C Hamano
2019-05-09 14:22       ` [PATCH v4 02/11] commit-graph: return with errors during write Derrick Stolee via GitGitGadget
2019-05-13  3:13         ` Junio C Hamano
2019-05-13 11:04           ` Derrick Stolee
2019-05-13 11:22             ` Derrick Stolee
2019-05-09 14:22       ` [PATCH v4 03/11] commit-graph: collapse parameters into flags Derrick Stolee via GitGitGadget
2019-05-13  3:44         ` Junio C Hamano
2019-05-13 11:07           ` Derrick Stolee
2019-05-09 14:22       ` [PATCH v4 04/11] commit-graph: remove Future Work section Derrick Stolee via GitGitGadget
2019-05-09 14:22       ` [PATCH v4 05/11] commit-graph: create write_commit_graph_context Derrick Stolee via GitGitGadget
2019-05-09 14:22       ` [PATCH v4 07/11] commit-graph: extract fill_oids_from_commit_hex() Derrick Stolee via GitGitGadget
2019-05-09 14:22       ` [PATCH v4 06/11] commit-graph: extract fill_oids_from_packs() Derrick Stolee via GitGitGadget
2019-05-13  5:05         ` Junio C Hamano
2019-05-09 14:22       ` [PATCH v4 08/11] commit-graph: extract fill_oids_from_all_packs() Derrick Stolee via GitGitGadget
2019-05-09 14:22       ` [PATCH v4 09/11] commit-graph: extract count_distinct_commits() Derrick Stolee via GitGitGadget
2019-05-09 14:22       ` [PATCH v4 10/11] commit-graph: extract copy_oids_to_commits() Derrick Stolee via GitGitGadget
2019-05-09 14:22       ` [PATCH v4 11/11] commit-graph: extract write_commit_graph_file() Derrick Stolee via GitGitGadget
2019-05-13  5:09         ` Junio C Hamano
2019-05-09 17:58       ` [PATCH v4 00/11] Commit-graph write refactor (was: Create commit-graph file format v2) Josh Steadmon
2019-06-12 13:29       ` [PATCH v5 " Derrick Stolee via GitGitGadget
2019-06-12 13:29         ` [PATCH v5 01/11] commit-graph: fix the_repository reference Derrick Stolee via GitGitGadget
2019-06-12 13:29         ` [PATCH v5 02/11] commit-graph: return with errors during write Derrick Stolee via GitGitGadget
2019-06-29 17:23           ` SZEDER Gábor
2019-07-01 12:19             ` Derrick Stolee
2019-06-12 13:29         ` [PATCH v5 03/11] commit-graph: collapse parameters into flags Derrick Stolee via GitGitGadget
2019-06-12 13:29         ` [PATCH v5 04/11] commit-graph: remove Future Work section Derrick Stolee via GitGitGadget
2019-06-12 13:29         ` [PATCH v5 05/11] commit-graph: create write_commit_graph_context Derrick Stolee via GitGitGadget
2019-06-12 13:29         ` [PATCH v5 06/11] commit-graph: extract fill_oids_from_packs() Derrick Stolee via GitGitGadget
2019-06-12 13:29         ` [PATCH v5 07/11] commit-graph: extract fill_oids_from_commit_hex() Derrick Stolee via GitGitGadget
2019-06-12 13:29         ` [PATCH v5 08/11] commit-graph: extract fill_oids_from_all_packs() Derrick Stolee via GitGitGadget
2019-06-12 13:29         ` [PATCH v5 09/11] commit-graph: extract count_distinct_commits() Derrick Stolee via GitGitGadget
2019-06-12 13:29         ` [PATCH v5 10/11] commit-graph: extract copy_oids_to_commits() Derrick Stolee via GitGitGadget
2019-06-12 13:29         ` [PATCH v5 11/11] commit-graph: extract write_commit_graph_file() Derrick Stolee via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0a8fb6ec-d35c-45d7-3442-0e8298efca4b@gmail.com \
    --to=stolee@gmail.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    --cc=sandals@crustytoothpaste.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).