From: Derrick Stolee <stolee@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, dstolee@microsoft.com,
git@jeffhostetler.com, peff@peff.net, jonathantanmy@google.com,
sbeller@google.com, szeder.dev@gmail.com
Subject: Re: [PATCH v3 07/14] commit-graph: update graph-head during write
Date: Mon, 12 Feb 2018 16:24:55 -0500 [thread overview]
Message-ID: <99543db0-26e4-8daa-a580-b618497e48ba@gmail.com> (raw)
In-Reply-To: <xmqqwozirlky.fsf@gitster-ct.c.googlers.com>
On 2/12/2018 3:37 PM, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
>> Derrick Stolee <stolee@gmail.com> writes:
>>
>>> It is possible to have multiple commit graph files in a pack directory,
>>> but only one is important at a time. Use a 'graph_head' file to point
>>> to the important file. Teach git-commit-graph to write 'graph_head' upon
>>> writing a new commit graph file.
>> Why this design, instead of what "repack -a" would do, iow, if there
>> always is a singleton that is the only one that matters, shouldn't
>> the creation of that latest singleton just clear the older ones
>> before it returns control?
> Note that I am not complaining---I am just curious why we want to
> expose this "there is one relevant one but we keep irrelevant ones
> we usually do not look at and need to be garbage collected" to end
> users, and also expect readers of the series, resulting code and
> docs would have the same puzzled feeling.
>
Aside: I forgot to mention in my cover letter that the experience around
the "--delete-expired" flag for "git commit-graph write" is different
than v2. If specified, we delete all ".graph" files in the pack
directory other than the one referenced by "graph_head" at the beginning
of the process or the one written by the process. If these deletes fail,
then we ignore the failure (assuming that they are being used by another
Git process). In usual cases, we will delete these expired files in the
next instance. I believe this matches similar behavior in gc and repack.
-- Back to discussion about the value of "graph_head" --
The current design of using a pointer file (graph_head) is intended to
have these benefits:
1. We do not need to rely on a directory listing and mtimes to determine
which graph file to use.
2. If we write a new graph file while another git process is reading the
existing graph file, we can update the graph_head pointer without
deleting the file that is currently memory-mapped. (This is why we
cannot just rely on a canonical file name, such as "the_graph", to store
the data.)
3. We can atomically change the 'graph_head' file without interrupting
concurrent git processes. I think this is different from the "repack"
situation because a concurrent process would load all packfiles in the
pack directory and possibly have open handles when the repack is trying
to delete them.
4. We remain open to making the graph file incremental (as the MIDX
feature is designed to do; see [1]). It is less crucial to have an
incremental graph file structure (the graph file for the Windows
repository is currently ~120MB versus a MIDX file of 1.25 GB), but the
graph_head pattern makes this a possibility.
I tried to avoid item 1 due to personal taste, and since I am storing
the files in the objects/pack directory (so that listing may be very
large with a lot of wasted entries). This is less important with our
pending change of moving the graph files to a different directory. This
also satisfies items 2 and 3, as long as we never write graph files so
quickly that we have a collision on mtime.
I cannot think of another design that satisfies item 4.
As for your end user concerns: My philosophy with this feature is that
end users will never interact with the commit-graph builtin. 99% of
users will benefit from a repack or GC automatically computing a commit
graph (when we add that integration point). The other uses for the
builtin are for users who want extreme control over their data, such as
code servers and build agents.
Perhaps someone with experience managing large repositories with git in
a server environment could chime in with some design requirements they
would need.
Thanks,
-Stolee
[1]
https://public-inbox.org/git/20180107181459.222909-2-dstolee@microsoft.com/
[RFC PATCH 01/18] docs: Multi-Pack Index (MIDX) Design Notes
next prev parent reply other threads:[~2018-02-12 21:25 UTC|newest]
Thread overview: 146+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-01-30 21:39 [PATCH v2 00/14] Serialized Git Commit Graph Derrick Stolee
2018-01-30 21:39 ` [PATCH v2 01/14] commit-graph: add format document Derrick Stolee
2018-02-01 21:44 ` Jonathan Tan
2018-01-30 21:39 ` [PATCH v2 02/14] graph: add commit graph design document Derrick Stolee
2018-01-31 2:19 ` Stefan Beller
2018-01-30 21:39 ` [PATCH v2 03/14] commit-graph: create git-commit-graph builtin Derrick Stolee
2018-02-02 0:53 ` SZEDER Gábor
2018-01-30 21:39 ` [PATCH v2 04/14] commit-graph: implement construct_commit_graph() Derrick Stolee
2018-02-01 22:23 ` Jonathan Tan
2018-02-01 23:46 ` SZEDER Gábor
2018-02-02 15:32 ` SZEDER Gábor
2018-02-05 16:06 ` Derrick Stolee
2018-02-07 15:08 ` SZEDER Gábor
2018-02-07 15:10 ` Derrick Stolee
2018-01-30 21:39 ` [PATCH v2 05/14] commit-graph: implement git-commit-graph --write Derrick Stolee
2018-02-01 23:33 ` Jonathan Tan
2018-02-02 18:36 ` Stefan Beller
2018-02-02 22:48 ` Junio C Hamano
2018-02-03 1:58 ` Derrick Stolee
2018-02-03 9:28 ` Jeff King
2018-02-05 18:48 ` Junio C Hamano
2018-02-06 18:55 ` Derrick Stolee
2018-02-01 23:48 ` SZEDER Gábor
2018-02-05 18:07 ` Derrick Stolee
2018-02-02 1:47 ` SZEDER Gábor
2018-01-30 21:39 ` [PATCH v2 06/14] commit-graph: implement git-commit-graph --read Derrick Stolee
2018-01-31 2:22 ` Stefan Beller
2018-02-02 0:02 ` SZEDER Gábor
2018-02-02 0:23 ` Jonathan Tan
2018-02-05 19:29 ` Derrick Stolee
2018-01-30 21:39 ` [PATCH v2 07/14] commit-graph: implement git-commit-graph --update-head Derrick Stolee
2018-02-02 1:35 ` SZEDER Gábor
2018-02-05 21:01 ` Derrick Stolee
2018-02-02 2:45 ` SZEDER Gábor
2018-01-30 21:39 ` [PATCH v2 08/14] commit-graph: implement git-commit-graph --clear Derrick Stolee
2018-02-02 4:01 ` SZEDER Gábor
2018-01-30 21:39 ` [PATCH v2 09/14] commit-graph: teach git-commit-graph --delete-expired Derrick Stolee
2018-02-02 15:04 ` SZEDER Gábor
2018-01-30 21:39 ` [PATCH v2 10/14] commit-graph: add core.commitgraph setting Derrick Stolee
2018-01-31 22:44 ` Igor Djordjevic
2018-02-02 16:01 ` SZEDER Gábor
2018-01-30 21:39 ` [PATCH v2 11/14] commit: integrate commit graph with commit parsing Derrick Stolee
2018-02-02 1:51 ` Jonathan Tan
2018-02-06 14:53 ` Derrick Stolee
2018-01-30 21:39 ` [PATCH v2 12/14] commit-graph: read only from specific pack-indexes Derrick Stolee
2018-01-30 21:39 ` [PATCH v2 13/14] commit-graph: close under reachability Derrick Stolee
2018-01-30 21:39 ` [PATCH v2 14/14] commit-graph: build graph from starting commits Derrick Stolee
2018-01-30 21:47 ` [PATCH v2 00/14] Serialized Git Commit Graph Stefan Beller
2018-02-01 2:34 ` Stefan Beller
2018-02-08 20:37 ` [PATCH v3 " Derrick Stolee
2018-02-08 20:37 ` [PATCH v3 01/14] commit-graph: add format document Derrick Stolee
2018-02-08 21:21 ` Junio C Hamano
2018-02-08 21:33 ` Derrick Stolee
2018-02-08 23:16 ` Junio C Hamano
2018-02-08 20:37 ` [PATCH v3 02/14] graph: add commit graph design document Derrick Stolee
2018-02-08 20:37 ` [PATCH v3 03/14] commit-graph: create git-commit-graph builtin Derrick Stolee
2018-02-08 21:27 ` Junio C Hamano
2018-02-08 21:36 ` Derrick Stolee
2018-02-08 23:21 ` Junio C Hamano
2018-02-08 20:37 ` [PATCH v3 04/14] commit-graph: implement write_commit_graph() Derrick Stolee
2018-02-08 22:14 ` Junio C Hamano
2018-02-15 18:19 ` Junio C Hamano
2018-02-15 18:23 ` Derrick Stolee
2018-02-08 20:37 ` [PATCH v3 05/14] commit-graph: implement 'git-commit-graph write' Derrick Stolee
2018-02-13 21:57 ` Jonathan Tan
2018-02-08 20:37 ` [PATCH v3 06/14] commit-graph: implement 'git-commit-graph read' Derrick Stolee
2018-02-08 23:38 ` Junio C Hamano
2018-02-08 20:37 ` [PATCH v3 07/14] commit-graph: update graph-head during write Derrick Stolee
2018-02-12 18:56 ` Junio C Hamano
2018-02-12 20:37 ` Junio C Hamano
2018-02-12 21:24 ` Derrick Stolee [this message]
2018-02-13 22:38 ` Jonathan Tan
2018-02-08 20:37 ` [PATCH v3 08/14] commit-graph: implement 'git-commit-graph clear' Derrick Stolee
2018-02-13 22:49 ` Jonathan Tan
2018-02-08 20:37 ` [PATCH v3 09/14] commit-graph: implement --delete-expired Derrick Stolee
2018-02-08 20:37 ` [PATCH v3 10/14] commit-graph: add core.commitGraph setting Derrick Stolee
2018-02-08 20:37 ` [PATCH v3 11/14] commit: integrate commit graph with commit parsing Derrick Stolee
2018-02-14 0:12 ` Jonathan Tan
2018-02-14 18:08 ` Derrick Stolee
2018-02-15 18:25 ` Junio C Hamano
2018-02-08 20:37 ` [PATCH v3 12/14] commit-graph: close under reachability Derrick Stolee
2018-02-08 20:37 ` [PATCH v3 13/14] commit-graph: read only from specific pack-indexes Derrick Stolee
2018-02-08 20:37 ` [PATCH v3 14/14] commit-graph: build graph from starting commits Derrick Stolee
2018-02-09 13:02 ` SZEDER Gábor
2018-02-09 13:45 ` Derrick Stolee
2018-02-14 18:15 ` [PATCH v3 00/14] Serialized Git Commit Graph Derrick Stolee
2018-02-14 18:27 ` Stefan Beller
2018-02-14 19:11 ` Derrick Stolee
2018-02-19 18:53 ` [PATCH v4 00/13] " Derrick Stolee
2018-02-19 18:53 ` [PATCH v4 01/13] commit-graph: add format document Derrick Stolee
2018-02-20 20:49 ` Junio C Hamano
2018-02-21 19:23 ` Stefan Beller
2018-02-21 19:45 ` Derrick Stolee
2018-02-21 19:48 ` Stefan Beller
2018-03-30 13:25 ` Jakub Narebski
2018-04-02 13:09 ` Derrick Stolee
2018-04-02 14:09 ` Jakub Narebski
2018-02-19 18:53 ` [PATCH v4 02/13] graph: add commit graph design document Derrick Stolee
2018-02-20 21:42 ` Junio C Hamano
2018-02-23 15:44 ` Derrick Stolee
2018-02-21 19:34 ` Stefan Beller
2018-02-19 18:53 ` [PATCH v4 03/13] commit-graph: create git-commit-graph builtin Derrick Stolee
2018-02-20 21:51 ` Junio C Hamano
2018-02-21 18:58 ` Junio C Hamano
2018-02-23 16:07 ` Derrick Stolee
2018-02-26 16:25 ` SZEDER Gábor
2018-02-26 17:08 ` Derrick Stolee
2018-02-19 18:53 ` [PATCH v4 04/13] commit-graph: implement write_commit_graph() Derrick Stolee
2018-02-20 22:57 ` Junio C Hamano
2018-02-23 17:23 ` Derrick Stolee
2018-02-23 19:30 ` Junio C Hamano
2018-02-23 19:48 ` Junio C Hamano
2018-02-23 20:02 ` Derrick Stolee
2018-02-26 16:10 ` SZEDER Gábor
2018-02-28 18:47 ` Junio C Hamano
2018-02-19 18:53 ` [PATCH v4 05/13] commit-graph: implement 'git-commit-graph write' Derrick Stolee
2018-02-21 19:25 ` Junio C Hamano
2018-02-19 18:53 ` [PATCH v4 06/13] commit-graph: implement git commit-graph read Derrick Stolee
2018-02-21 20:11 ` Junio C Hamano
2018-02-22 18:25 ` Junio C Hamano
2018-02-19 18:53 ` [PATCH v4 07/13] commit-graph: implement --set-latest Derrick Stolee
2018-02-22 18:31 ` Junio C Hamano
2018-02-23 17:53 ` Derrick Stolee
2018-02-19 18:53 ` [PATCH v4 08/13] commit-graph: implement --delete-expired Derrick Stolee
2018-02-21 21:34 ` Stefan Beller
2018-02-23 17:43 ` Derrick Stolee
2018-02-22 18:48 ` Junio C Hamano
2018-02-23 17:59 ` Derrick Stolee
2018-02-23 19:33 ` Junio C Hamano
2018-02-23 19:41 ` Derrick Stolee
2018-02-23 19:51 ` Junio C Hamano
2018-02-19 18:53 ` [PATCH v4 09/13] commit-graph: add core.commitGraph setting Derrick Stolee
2018-02-19 18:53 ` [PATCH v4 10/13] commit-graph: close under reachability Derrick Stolee
2018-02-19 18:53 ` [PATCH v4 11/13] commit: integrate commit graph with commit parsing Derrick Stolee
2018-02-19 18:53 ` [PATCH v4 12/13] commit-graph: read only from specific pack-indexes Derrick Stolee
2018-02-21 22:25 ` Stefan Beller
2018-02-23 19:19 ` Derrick Stolee
2018-02-19 18:53 ` [PATCH v4 13/13] commit-graph: build graph from starting commits Derrick Stolee
2018-03-30 11:10 ` [PATCH v4 00/13] Serialized Git Commit Graph Jakub Narebski
2018-04-02 13:02 ` Derrick Stolee
2018-04-02 14:46 ` Jakub Narebski
2018-04-02 15:02 ` Derrick Stolee
2018-04-02 17:35 ` Stefan Beller
2018-04-02 17:54 ` Derrick Stolee
2018-04-02 18:02 ` Stefan Beller
2018-04-07 22:37 ` Jakub Narebski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=99543db0-26e4-8daa-a580-b618497e48ba@gmail.com \
--to=stolee@gmail.com \
--cc=dstolee@microsoft.com \
--cc=git@jeffhostetler.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jonathantanmy@google.com \
--cc=peff@peff.net \
--cc=sbeller@google.com \
--cc=szeder.dev@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).