git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jakub Narebski <jnareb@gmail.com>
To: Derrick Stolee <stolee@gmail.com>
Cc: git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>,
	"Ramsay Jones" <ramsay@ramsayjones.plus.com>,
	"Stefan Beller" <sbeller@google.com>,
	"SZEDER Gábor" <szeder.dev@gmail.com>,
	"Jeff Hostetler" <git@jeffhostetler.com>,
	"Jeff King" <peff@peff.net>,
	"Derrick Stolee" <dstolee@microsoft.com>
Subject: Re: [PATCH v8 04/14] graph: add commit graph design document
Date: Mon, 16 Apr 2018 00:48:15 +0200	[thread overview]
Message-ID: <86vacsjdcg.fsf@gmail.com> (raw)
In-Reply-To: <20180410125608.39443-5-dstolee@microsoft.com> (Derrick Stolee's message of "Tue, 10 Apr 2018 08:55:58 -0400")

Derrick Stolee <stolee@gmail.com> writes:

> +Future Work
> +-----------
> +
> +- The commit graph feature currently does not honor commit grafts. This can
> +  be remedied by duplicating or refactoring the current graft logic.

The problem in my opinion lies in different direction, namely that
commit grafts can change, changing the view of the history.  If we want
commit-graph file to follow user-visible view of the history of the
project, it needs to respect current version of commit grafts - but what
if commit grafts changed since commit-graph file was generated?

Actually, there are currently three ways to affect the view of the
history:

a. legacy commit grafts mechanism; it was first, but it is not safe,
   cannot be transferred on fetch / push, and is now deprecated.

b. shallow clones, which are kind of specialized and limited grafts;
   they used to limit available functionality, but restrictions are
   being lifted (or perhaps even got lifted)

c. git-replace mechanism, where we can create an "overlay" of any
   object, and is intended to be among others replacement for commit
   grafts; safe, transferable, can be turned off with "git
   --no-replace-objects <command>"

All those can change; some more likely than others.  The problem is if
they change between writing commit-graph file (respecting old view of
the history) and reading it (where we expect to see the new view).

a. grafts file can change: lines can be added, removed or changed

b. shallow clones can be deepened or shortened, or even make
   not shallow

c. new replacements can be added, old removed, and existing edited


There are, as far as I can see, two ways of handling the issue of Git
features that can change the view of the project's history, namely:

 * Disable commit-graph reading when any of this features are used, and
   always write full graph info.

   This may not matter much for shallow clones, where commit count
   should be small anyway, but when using git-replace to stitch together
   current repository with historical one, commit-graph would be
   certainly useful.  Also, git-replace does not need to change history.

   On the other hand I think it is the easier solution.

Or

 * Detect somehow that the view of the history changed, and invalidate
   commit-graph (perhaps even automatically regenerate it).

   For shallow clone changes I think one can still use the old
   commit-graph file to generate the new one.  For other cases, the
   metadata is simple to modify, but indices such as generation number
   would need to be at least partially calculated anew.

Happily, you don't need to do this now.  It can be done in later series;
on the other hand this would be required before the switch can be turned
from "default off" to "default on" for commit-graph feature (configured
with core.commitGraph).

So please keep up the good work with sending nicely digestible patch
series.

> +
> +- The 'commit-graph' subcommand does not have a "verify" mode that is
> +  necessary for integration with fsck.

The "read" mode has beginnings of "verify", or at least "fsck", isn't
it?

[...]
> +- The current design uses the 'commit-graph' subcommand to generate the graph.
> +  When this feature stabilizes enough to recommend to most users, we should
> +  add automatic graph writes to common operations that create many commits.
> +  For example, one could compute a graph on 'clone', 'fetch', or 'repack'
> +  commands.

Automatic is good ("git gc --auto").

But that needs handling of view chaning features such as commit grafts,
isn't it?

> +
> +- A server could provide a commit graph file as part of the network protocol
> +  to avoid extra calculations by clients. This feature is only of benefit if
> +  the user is willing to trust the file, because verifying the file is correct
> +  is as hard as computing it from scratch.

Should server send different commit-graph file / info depending on
whether client fetches from refs/replaces/* nameespace?

> +
> +Related Links
> +-------------

Thank you for providing them (together with summary).

> +[0] https://bugs.chromium.org/p/git/issues/detail?id=8
> +    Chromium work item for: Serialized Commit Graph
> +
> +[1] https://public-inbox.org/git/20110713070517.GC18566@sigill.intra.peff.net/
> +    An abandoned patch that introduced generation numbers.
> +
> +[2] https://public-inbox.org/git/20170908033403.q7e6dj7benasrjes@sigill.intra.peff.net/
> +    Discussion about generation numbers on commits and how they interact
> +    with fsck.
> +
> +[3] https://public-inbox.org/git/20170908034739.4op3w4f2ma5s65ku@sigill.intra.peff.net/
> +    More discussion about generation numbers and not storing them inside
> +    commit objects. A valuable quote:
> +
> +    "I think we should be moving more in the direction of keeping
> +     repo-local caches for optimizations. Reachability bitmaps have been
> +     a big performance win. I think we should be doing the same with our
> +     properties of commits. Not just generation numbers, but making it
> +     cheap to access the graph structure without zlib-inflating whole
> +     commit objects (i.e., packv4 or something like the "metapacks" I
> +     proposed a few years ago)."
> +
> +[4] https://public-inbox.org/git/20180108154822.54829-1-git@jeffhostetler.com/T/#u
> +    A patch to remove the ahead-behind calculation from 'status'.

  reply	other threads:[~2018-04-15 22:48 UTC|newest]

Thread overview: 110+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-02-27  2:32 [PATCH v5 00/13] Serialized Git Commit Graph Derrick Stolee
2018-02-27  2:32 ` [PATCH v5 01/13] commit-graph: add format document Derrick Stolee
2018-02-27  2:32 ` [PATCH v5 02/13] graph: add commit graph design document Derrick Stolee
2018-02-27  2:32 ` [PATCH v5 03/13] commit-graph: create git-commit-graph builtin Derrick Stolee
2018-02-27  2:32 ` [PATCH v5 04/13] csum-file: add CSUM_KEEP_OPEN flag Derrick Stolee
2018-03-12 13:55   ` Derrick Stolee
2018-03-13 21:42     ` Junio C Hamano
2018-03-14  2:26       ` Derrick Stolee
2018-03-14 17:00         ` Junio C Hamano
2018-02-27  2:32 ` [PATCH v5 05/13] commit-graph: implement write_commit_graph() Derrick Stolee
2018-02-27  2:33 ` [PATCH v5 06/13] commit-graph: implement 'git-commit-graph write' Derrick Stolee
2018-02-27  2:33 ` [PATCH v5 07/13] commit-graph: implement git commit-graph read Derrick Stolee
2018-02-27  2:33 ` [PATCH v5 08/13] commit-graph: add core.commitGraph setting Derrick Stolee
2018-02-27  2:33 ` [PATCH v5 09/13] commit-graph: close under reachability Derrick Stolee
2018-02-27  2:33 ` [PATCH v5 10/13] commit: integrate commit graph with commit parsing Derrick Stolee
2018-02-27  2:33 ` [PATCH v5 11/13] commit-graph: read only from specific pack-indexes Derrick Stolee
2018-02-27 20:15   ` Stefan Beller
2018-02-27  2:33 ` [PATCH v5 12/13] commit-graph: build graph from starting commits Derrick Stolee
2018-02-27  2:33 ` [PATCH v5 13/13] commit-graph: implement "--additive" option Derrick Stolee
2018-02-27 18:50 ` [PATCH v5 00/13] Serialized Git Commit Graph Stefan Beller
2018-03-14 19:27 ` [PATCH v6 00/14] " Derrick Stolee
2018-03-14 19:27   ` [PATCH v6 01/14] csum-file: rename hashclose() to finalize_hashfile() Derrick Stolee
2018-03-14 19:27   ` [PATCH v6 02/14] csum-file: refactor finalize_hashfile() method Derrick Stolee
2018-03-14 19:27   ` [PATCH v6 03/14] commit-graph: add format document Derrick Stolee
2018-03-14 19:27   ` [PATCH v6 04/14] graph: add commit graph design document Derrick Stolee
2018-03-14 19:27   ` [PATCH v6 05/14] commit-graph: create git-commit-graph builtin Derrick Stolee
2018-03-14 19:27   ` [PATCH v6 06/14] commit-graph: implement write_commit_graph() Derrick Stolee
2018-03-14 19:27   ` [PATCH v6 07/14] commit-graph: implement 'git-commit-graph write' Derrick Stolee
2018-03-18 13:25     ` Ævar Arnfjörð Bjarmason
2018-03-19 13:12       ` Derrick Stolee
2018-03-19 14:36         ` Ævar Arnfjörð Bjarmason
2018-03-19 18:27           ` Derrick Stolee
2018-03-19 18:48             ` Ævar Arnfjörð Bjarmason
2018-03-14 19:27   ` [PATCH v6 08/14] commit-graph: implement git commit-graph read Derrick Stolee
2018-03-14 19:27   ` [PATCH v6 09/14] commit-graph: add core.commitGraph setting Derrick Stolee
2018-03-14 19:27   ` [PATCH v6 10/14] commit-graph: close under reachability Derrick Stolee
2018-03-14 19:27   ` [PATCH v6 11/14] commit: integrate commit graph with commit parsing Derrick Stolee
2018-03-14 19:27   ` [PATCH v6 12/14] commit-graph: read only from specific pack-indexes Derrick Stolee
2018-03-15 22:50     ` SZEDER Gábor
2018-03-19 13:13       ` Derrick Stolee
2018-03-14 19:27   ` [PATCH v6 13/14] commit-graph: build graph from starting commits Derrick Stolee
2018-03-14 19:27   ` [PATCH v6 14/14] commit-graph: implement "--additive" option Derrick Stolee
2018-03-14 20:10   ` [PATCH v6 00/14] Serialized Git Commit Graph Ramsay Jones
2018-03-14 20:43   ` Junio C Hamano
2018-03-15 17:23     ` Johannes Schindelin
2018-03-15 18:41       ` Junio C Hamano
2018-03-15 21:51         ` Ramsay Jones
2018-03-16 11:50         ` Johannes Schindelin
2018-03-16 17:27           ` Junio C Hamano
2018-03-19 11:41             ` Johannes Schindelin
2018-03-16 16:28     ` Lars Schneider
2018-03-19 13:10       ` Derrick Stolee
2018-03-16 15:06   ` Ævar Arnfjörð Bjarmason
2018-03-16 16:38     ` SZEDER Gábor
2018-03-16 18:33       ` Junio C Hamano
2018-03-16 19:48         ` SZEDER Gábor
2018-03-16 20:06           ` Jeff King
2018-03-16 20:19             ` Jeff King
2018-03-19 12:55               ` Derrick Stolee
2018-03-20  1:17                 ` Derrick Stolee
2018-03-16 20:49         ` Jeff King
2018-04-02 20:34   ` [PATCH v7 " Derrick Stolee
2018-04-02 20:34     ` [PATCH v7 01/14] csum-file: rename hashclose() to finalize_hashfile() Derrick Stolee
2018-04-02 20:34     ` [PATCH v7 02/14] csum-file: refactor finalize_hashfile() method Derrick Stolee
2018-04-07 22:59       ` Jakub Narebski
2018-04-02 20:34     ` [PATCH v7 03/14] commit-graph: add format document Derrick Stolee
2018-04-07 23:49       ` Jakub Narebski
2018-04-02 20:34     ` [PATCH v7 04/14] graph: add commit graph design document Derrick Stolee
2018-04-08 11:06       ` Jakub Narebski
2018-04-02 20:34     ` [PATCH v7 05/14] commit-graph: create git-commit-graph builtin Derrick Stolee
2018-04-02 20:34     ` [PATCH v7 06/14] commit-graph: implement write_commit_graph() Derrick Stolee
2018-04-02 20:34     ` [PATCH v7 07/14] commit-graph: implement git-commit-graph write Derrick Stolee
2018-04-08 11:59       ` Jakub Narebski
2018-04-02 20:34     ` [PATCH v7 08/14] commit-graph: implement git commit-graph read Derrick Stolee
2018-04-02 21:33       ` Junio C Hamano
2018-04-03 11:49         ` Derrick Stolee
2018-04-08 12:59       ` Jakub Narebski
2018-04-02 20:34     ` [PATCH v7 09/14] commit-graph: add core.commitGraph setting Derrick Stolee
2018-04-08 13:39       ` Jakub Narebski
2018-04-02 20:34     ` [PATCH v7 10/14] commit-graph: close under reachability Derrick Stolee
2018-04-02 20:34     ` [PATCH v7 11/14] commit: integrate commit graph with commit parsing Derrick Stolee
2018-04-02 20:34     ` [PATCH v7 12/14] commit-graph: read only from specific pack-indexes Derrick Stolee
2018-04-02 20:34     ` [PATCH v7 13/14] commit-graph: build graph from starting commits Derrick Stolee
2018-04-08 13:50       ` Jakub Narebski
2018-04-02 20:34     ` [PATCH v7 14/14] commit-graph: implement "--additive" option Derrick Stolee
2018-04-05  8:27       ` SZEDER Gábor
2018-04-10 12:55     ` [PATCH v8 00/14] Serialized Git Commit Graph Derrick Stolee
2018-04-10 12:55       ` [PATCH v8 01/14] csum-file: rename hashclose() to finalize_hashfile() Derrick Stolee
2018-04-10 12:55       ` [PATCH v8 02/14] csum-file: refactor finalize_hashfile() method Derrick Stolee
2018-04-10 12:55       ` [PATCH v8 03/14] commit-graph: add format document Derrick Stolee
2018-04-10 19:10         ` Stefan Beller
2018-04-10 19:18           ` Derrick Stolee
2018-04-11 20:58         ` Jakub Narebski
2018-04-12 11:28           ` Derrick Stolee
2018-04-13 22:07             ` Jakub Narebski
2018-04-10 12:55       ` [PATCH v8 04/14] graph: add commit graph design document Derrick Stolee
2018-04-15 22:48         ` Jakub Narebski [this message]
2018-04-10 12:55       ` [PATCH v8 05/14] commit-graph: create git-commit-graph builtin Derrick Stolee
2018-04-10 12:56       ` [PATCH v8 06/14] commit-graph: implement write_commit_graph() Derrick Stolee
2018-04-10 12:56       ` [PATCH v8 07/14] commit-graph: implement git-commit-graph write Derrick Stolee
2018-04-10 12:56       ` [PATCH v8 08/14] commit-graph: implement git commit-graph read Derrick Stolee
2018-04-14 22:15         ` Jakub Narebski
2018-04-15  3:26           ` Eric Sunshine
2018-04-10 12:56       ` [PATCH v8 09/14] commit-graph: add core.commitGraph setting Derrick Stolee
2018-04-14 18:33         ` Jakub Narebski
2018-04-10 12:56       ` [PATCH v8 10/14] commit-graph: close under reachability Derrick Stolee
2018-04-10 12:56       ` [PATCH v8 11/14] commit: integrate commit graph with commit parsing Derrick Stolee
2018-04-10 12:56       ` [PATCH v8 12/14] commit-graph: read only from specific pack-indexes Derrick Stolee
2018-04-10 12:56       ` [PATCH v8 13/14] commit-graph: build graph from starting commits Derrick Stolee
2018-04-10 12:56       ` [PATCH v8 14/14] commit-graph: implement "--append" option Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=86vacsjdcg.fsf@gmail.com \
    --to=jnareb@gmail.com \
    --cc=dstolee@microsoft.com \
    --cc=git@jeffhostetler.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    --cc=ramsay@ramsayjones.plus.com \
    --cc=sbeller@google.com \
    --cc=stolee@gmail.com \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).