git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Derrick Stolee <stolee@gmail.com>
Cc: git@vger.kernel.org, git@jeffhostetler.com, peff@peff.net,
	jonathantanmy@google.com, szeder.dev@gmail.com,
	sbeller@google.com, Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH v4 06/13] commit-graph: implement git commit-graph read
Date: Wed, 21 Feb 2018 12:11:14 -0800	[thread overview]
Message-ID: <xmqq371uyufx.fsf@gitster-ct.c.googlers.com> (raw)
In-Reply-To: <1519066406-81663-7-git-send-email-dstolee@microsoft.com> (Derrick Stolee's message of "Mon, 19 Feb 2018 13:53:19 -0500")

Derrick Stolee <stolee@gmail.com> writes:

> +'read'::
> +
> +Read a graph file given by the graph-head file and output basic
> +details about the graph file.
> ++
> +With `--file=<name>` option, consider the graph stored in the file at
> +the path  <object-dir>/info/<name>.
> +

A sample reader confusion after reading the above twice:

    What is "the graph-head file" and how does the user specify it?  Is
    it given by  the value for the "--file=<name>" command line option?

Another sample reader reaction after reading the above:

    What are the kind of "basic details" we can learn from this
    command is unclear, but perhaps there is an example to help me
    decide if this command is worth studying.

> @@ -44,6 +53,12 @@ EXAMPLES
>  $ git commit-graph write
>  ------------------------------------------------
>  
> +* Read basic information from a graph file.
> ++
> +------------------------------------------------
> +$ git commit-graph read --file=<name>
> +------------------------------------------------
> +

And the sample reader is utterly disappointed at this point.

> +static int graph_read(int argc, const char **argv)
> +{
> +	struct commit_graph *graph = 0;
> +	struct strbuf full_path = STRBUF_INIT;
> +
> +	static struct option builtin_commit_graph_read_options[] = {
> +		{ OPTION_STRING, 'o', "object-dir", &opts.obj_dir,
> +			N_("dir"),
> +			N_("The object directory to store the graph") },
> +		{ OPTION_STRING, 'H', "file", &opts.graph_file,
> +			N_("file"),
> +			N_("The filename for a specific commit graph file in the object directory."),
> +			PARSE_OPT_OPTARG, NULL, (intptr_t) "" },
> +		OPT_END(),
> +	};

The same comment as all the previous ones apply, wrt short options
and non-use of OPT_STRING().

Also, I suspect that these two would want to use OPT_FILENAME
instead, if we anticipate that the command might want to be
sometimes run from a subdirectory.  Otherwise wouldn't

	cd t && git commit-graph read --file=../.git/object/info/$whatever

end up referring to a wrong place because the code that uses the
value obtained from OPTION_STRING does not do the equivalent of
parse-options.c::fix_filename()?  The same applies to object-dir
handling.

> +	argc = parse_options(argc, argv, NULL,
> +			     builtin_commit_graph_read_options,
> +			     builtin_commit_graph_read_usage, 0);
> +
> +	if (!opts.obj_dir)
> +		opts.obj_dir = get_object_directory();
> +
> +	if (!opts.graph_file)
> +		die("no graph hash specified");
> +
> +	strbuf_addf(&full_path, "%s/info/%s", opts.obj_dir, opts.graph_file);

Ahh, I was fooled by a misnamed option.  --file does *not* name the
file.  It is a filename in a fixed place that is determined by other
things.

So it would be a mistake to use OPT_FILENAME() in the parser for
that misnamed "--file" option.  The parser for --object-dir still
would want to be OPT_FILENAME(), but quite honestly, I do not see
the point of having --object-dir option in the first place.  The
graph file is not relative to it but is forced to have /info/ in
between that directory and the filename, so it is not like the user
gets useful flexibility out of being able to specify two different
places using --object-dir= option and $GIT_OBJECT_DIRECTORY
environment (iow, a caller that wants to work on a specific object
directory can use the environment, which is how it would tell any
other Git subcommand which object store it wants to work with).

But stepping back a bit, I think the way --file argument is defined
is halfway off from two possible more useful ways to define it.  If
it were just "path to the file" (iow, what OPT_FILENAME() is suited
for parsing it), then a user could say "I have this graph file that
I created for testing, it is not installed in its usual place in
$GIT_OBJECT_DIRECTORY/info/ at all, but I want you to read it
because I am debugging".  That is one possible useful extreme.  The
other possibility would be to allow *only* the hash part to be
specified, iow, not just forcing /info/ relative to object
directory, you would force the "graph-" prefix and ".graph" suffix.
That would be the other extreme that is useful (less typing and less
error prone).

For a low-level command line this, my gut feeling is that it would
be better to allow paths to the object directory and the graph file
to be totally independently specified.

> +	if (graph_signature != GRAPH_SIGNATURE) {
> +		munmap(graph_map, graph_size);
> +		close(fd);
> +		die("graph signature %X does not match signature %X",
> +			graph_signature, GRAPH_SIGNATURE);
> +	}
> +
> +	graph_version = *(unsigned char*)(data + 4);
> +	if (graph_version != GRAPH_VERSION) {
> +		munmap(graph_map, graph_size);
> +		close(fd);
> +		die("graph version %X does not match version %X",
> +			graph_version, GRAPH_VERSION);
> +	}
> +
> +	hash_version = *(unsigned char*)(data + 5);
> +	if (hash_version != GRAPH_OID_VERSION) {
> +		munmap(graph_map, graph_size);
> +		close(fd);
> +		die("hash version %X does not match version %X",
> +			hash_version, GRAPH_OID_VERSION);

It becomes a bit tiring to see munmap/close/die pattern repreated
over and over again, doesn't it?  Can we make it simpler, perhaps by
letting die() take care of the clean-up?  After all, if the very
next step dies because alloc_commit_graph() got NULL in xmalloc(),
we are letting die() there take care of the clean-up anyway already,
and die() in the chunk parsing look has no such cleanup, either.

Of course, when we later want to libify this part of the code, then
we wouldn't be calling die() from this codepath, but the change
required to do so will not be just s/die/error/; it would be more
like

	if (x_version != X_VERSION) {
		error("X version %X does not match",...);
		goto cleanup_fail;
	}

with munmap/close done at the jumped-to label.

> +	}
> +
> +	graph = alloc_commit_graph();
> +
> +	graph->hash_len = GRAPH_OID_LEN;
> + ...
> +		if (chunk_offset > graph_size - GIT_MAX_RAWSZ)
> +			die("improper chunk offset %08x%08x", (uint32_t)(chunk_offset >> 32),
> +			    (uint32_t)chunk_offset);
> +
> +		switch (chunk_id) {
> +			case GRAPH_CHUNKID_OIDFANOUT:
> +				graph->chunk_oid_fanout = (uint32_t*)(data + chunk_offset);
> +				break;

This is over-indented from our point of view.  In our codebase, case
arms aling with switch, i.e.

		switch (chunk_id) {
		case GRAPH_CHUNKID_OIDFANOUT:
			graph->chunk_oid_fanout = ...;
			break;

When the input file has GRAPH_CHUNKID_OIDFANOUT twice, I think it
should be flagged as a corrupt/malformed input file, causing the
reader to reject it.  It is plausible that you wanted to make it
"the last one wins", but even if that is the case, I think the user
should at least get a warning, as (I'd imagine) it is an unusual
condition.

The same applies to multiple instances of any currently-defined
chunk types.

> +graph_read_expect() {
> +	OPTIONAL=""
> +	NUM_CHUNKS=3
> +	if [ ! -z $2 ]

We use "test" and do not use "[ ... ]" or "[[ ... ]]".

I'll stop here.


  reply	other threads:[~2018-02-21 20:11 UTC|newest]

Thread overview: 146+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-30 21:39 [PATCH v2 00/14] Serialized Git Commit Graph Derrick Stolee
2018-01-30 21:39 ` [PATCH v2 01/14] commit-graph: add format document Derrick Stolee
2018-02-01 21:44   ` Jonathan Tan
2018-01-30 21:39 ` [PATCH v2 02/14] graph: add commit graph design document Derrick Stolee
2018-01-31  2:19   ` Stefan Beller
2018-01-30 21:39 ` [PATCH v2 03/14] commit-graph: create git-commit-graph builtin Derrick Stolee
2018-02-02  0:53   ` SZEDER Gábor
2018-01-30 21:39 ` [PATCH v2 04/14] commit-graph: implement construct_commit_graph() Derrick Stolee
2018-02-01 22:23   ` Jonathan Tan
2018-02-01 23:46   ` SZEDER Gábor
2018-02-02 15:32   ` SZEDER Gábor
2018-02-05 16:06     ` Derrick Stolee
2018-02-07 15:08       ` SZEDER Gábor
2018-02-07 15:10         ` Derrick Stolee
2018-01-30 21:39 ` [PATCH v2 05/14] commit-graph: implement git-commit-graph --write Derrick Stolee
2018-02-01 23:33   ` Jonathan Tan
2018-02-02 18:36     ` Stefan Beller
2018-02-02 22:48       ` Junio C Hamano
2018-02-03  1:58         ` Derrick Stolee
2018-02-03  9:28           ` Jeff King
2018-02-05 18:48             ` Junio C Hamano
2018-02-06 18:55               ` Derrick Stolee
2018-02-01 23:48   ` SZEDER Gábor
2018-02-05 18:07     ` Derrick Stolee
2018-02-02  1:47   ` SZEDER Gábor
2018-01-30 21:39 ` [PATCH v2 06/14] commit-graph: implement git-commit-graph --read Derrick Stolee
2018-01-31  2:22   ` Stefan Beller
2018-02-02  0:02   ` SZEDER Gábor
2018-02-02  0:23   ` Jonathan Tan
2018-02-05 19:29     ` Derrick Stolee
2018-01-30 21:39 ` [PATCH v2 07/14] commit-graph: implement git-commit-graph --update-head Derrick Stolee
2018-02-02  1:35   ` SZEDER Gábor
2018-02-05 21:01     ` Derrick Stolee
2018-02-02  2:45   ` SZEDER Gábor
2018-01-30 21:39 ` [PATCH v2 08/14] commit-graph: implement git-commit-graph --clear Derrick Stolee
2018-02-02  4:01   ` SZEDER Gábor
2018-01-30 21:39 ` [PATCH v2 09/14] commit-graph: teach git-commit-graph --delete-expired Derrick Stolee
2018-02-02 15:04   ` SZEDER Gábor
2018-01-30 21:39 ` [PATCH v2 10/14] commit-graph: add core.commitgraph setting Derrick Stolee
2018-01-31 22:44   ` Igor Djordjevic
2018-02-02 16:01   ` SZEDER Gábor
2018-01-30 21:39 ` [PATCH v2 11/14] commit: integrate commit graph with commit parsing Derrick Stolee
2018-02-02  1:51   ` Jonathan Tan
2018-02-06 14:53     ` Derrick Stolee
2018-01-30 21:39 ` [PATCH v2 12/14] commit-graph: read only from specific pack-indexes Derrick Stolee
2018-01-30 21:39 ` [PATCH v2 13/14] commit-graph: close under reachability Derrick Stolee
2018-01-30 21:39 ` [PATCH v2 14/14] commit-graph: build graph from starting commits Derrick Stolee
2018-01-30 21:47 ` [PATCH v2 00/14] Serialized Git Commit Graph Stefan Beller
2018-02-01  2:34   ` Stefan Beller
2018-02-08 20:37 ` [PATCH v3 " Derrick Stolee
2018-02-08 20:37   ` [PATCH v3 01/14] commit-graph: add format document Derrick Stolee
2018-02-08 21:21     ` Junio C Hamano
2018-02-08 21:33       ` Derrick Stolee
2018-02-08 23:16         ` Junio C Hamano
2018-02-08 20:37   ` [PATCH v3 02/14] graph: add commit graph design document Derrick Stolee
2018-02-08 20:37   ` [PATCH v3 03/14] commit-graph: create git-commit-graph builtin Derrick Stolee
2018-02-08 21:27     ` Junio C Hamano
2018-02-08 21:36       ` Derrick Stolee
2018-02-08 23:21         ` Junio C Hamano
2018-02-08 20:37   ` [PATCH v3 04/14] commit-graph: implement write_commit_graph() Derrick Stolee
2018-02-08 22:14     ` Junio C Hamano
2018-02-15 18:19     ` Junio C Hamano
2018-02-15 18:23       ` Derrick Stolee
2018-02-08 20:37   ` [PATCH v3 05/14] commit-graph: implement 'git-commit-graph write' Derrick Stolee
2018-02-13 21:57     ` Jonathan Tan
2018-02-08 20:37   ` [PATCH v3 06/14] commit-graph: implement 'git-commit-graph read' Derrick Stolee
2018-02-08 23:38     ` Junio C Hamano
2018-02-08 20:37   ` [PATCH v3 07/14] commit-graph: update graph-head during write Derrick Stolee
2018-02-12 18:56     ` Junio C Hamano
2018-02-12 20:37       ` Junio C Hamano
2018-02-12 21:24         ` Derrick Stolee
2018-02-13 22:38     ` Jonathan Tan
2018-02-08 20:37   ` [PATCH v3 08/14] commit-graph: implement 'git-commit-graph clear' Derrick Stolee
2018-02-13 22:49     ` Jonathan Tan
2018-02-08 20:37   ` [PATCH v3 09/14] commit-graph: implement --delete-expired Derrick Stolee
2018-02-08 20:37   ` [PATCH v3 10/14] commit-graph: add core.commitGraph setting Derrick Stolee
2018-02-08 20:37   ` [PATCH v3 11/14] commit: integrate commit graph with commit parsing Derrick Stolee
2018-02-14  0:12     ` Jonathan Tan
2018-02-14 18:08       ` Derrick Stolee
2018-02-15 18:25     ` Junio C Hamano
2018-02-08 20:37   ` [PATCH v3 12/14] commit-graph: close under reachability Derrick Stolee
2018-02-08 20:37   ` [PATCH v3 13/14] commit-graph: read only from specific pack-indexes Derrick Stolee
2018-02-08 20:37   ` [PATCH v3 14/14] commit-graph: build graph from starting commits Derrick Stolee
2018-02-09 13:02     ` SZEDER Gábor
2018-02-09 13:45       ` Derrick Stolee
2018-02-14 18:15   ` [PATCH v3 00/14] Serialized Git Commit Graph Derrick Stolee
2018-02-14 18:27     ` Stefan Beller
2018-02-14 19:11       ` Derrick Stolee
2018-02-19 18:53     ` [PATCH v4 00/13] " Derrick Stolee
2018-02-19 18:53       ` [PATCH v4 01/13] commit-graph: add format document Derrick Stolee
2018-02-20 20:49         ` Junio C Hamano
2018-02-21 19:23         ` Stefan Beller
2018-02-21 19:45           ` Derrick Stolee
2018-02-21 19:48             ` Stefan Beller
2018-03-30 13:25         ` Jakub Narebski
2018-04-02 13:09           ` Derrick Stolee
2018-04-02 14:09             ` Jakub Narebski
2018-02-19 18:53       ` [PATCH v4 02/13] graph: add commit graph design document Derrick Stolee
2018-02-20 21:42         ` Junio C Hamano
2018-02-23 15:44           ` Derrick Stolee
2018-02-21 19:34         ` Stefan Beller
2018-02-19 18:53       ` [PATCH v4 03/13] commit-graph: create git-commit-graph builtin Derrick Stolee
2018-02-20 21:51         ` Junio C Hamano
2018-02-21 18:58           ` Junio C Hamano
2018-02-23 16:07             ` Derrick Stolee
2018-02-26 16:25         ` SZEDER Gábor
2018-02-26 17:08           ` Derrick Stolee
2018-02-19 18:53       ` [PATCH v4 04/13] commit-graph: implement write_commit_graph() Derrick Stolee
2018-02-20 22:57         ` Junio C Hamano
2018-02-23 17:23           ` Derrick Stolee
2018-02-23 19:30             ` Junio C Hamano
2018-02-23 19:48               ` Junio C Hamano
2018-02-23 20:02               ` Derrick Stolee
2018-02-26 16:10         ` SZEDER Gábor
2018-02-28 18:47         ` Junio C Hamano
2018-02-19 18:53       ` [PATCH v4 05/13] commit-graph: implement 'git-commit-graph write' Derrick Stolee
2018-02-21 19:25         ` Junio C Hamano
2018-02-19 18:53       ` [PATCH v4 06/13] commit-graph: implement git commit-graph read Derrick Stolee
2018-02-21 20:11         ` Junio C Hamano [this message]
2018-02-22 18:25           ` Junio C Hamano
2018-02-19 18:53       ` [PATCH v4 07/13] commit-graph: implement --set-latest Derrick Stolee
2018-02-22 18:31         ` Junio C Hamano
2018-02-23 17:53           ` Derrick Stolee
2018-02-19 18:53       ` [PATCH v4 08/13] commit-graph: implement --delete-expired Derrick Stolee
2018-02-21 21:34         ` Stefan Beller
2018-02-23 17:43           ` Derrick Stolee
2018-02-22 18:48         ` Junio C Hamano
2018-02-23 17:59           ` Derrick Stolee
2018-02-23 19:33             ` Junio C Hamano
2018-02-23 19:41               ` Derrick Stolee
2018-02-23 19:51                 ` Junio C Hamano
2018-02-19 18:53       ` [PATCH v4 09/13] commit-graph: add core.commitGraph setting Derrick Stolee
2018-02-19 18:53       ` [PATCH v4 10/13] commit-graph: close under reachability Derrick Stolee
2018-02-19 18:53       ` [PATCH v4 11/13] commit: integrate commit graph with commit parsing Derrick Stolee
2018-02-19 18:53       ` [PATCH v4 12/13] commit-graph: read only from specific pack-indexes Derrick Stolee
2018-02-21 22:25         ` Stefan Beller
2018-02-23 19:19           ` Derrick Stolee
2018-02-19 18:53       ` [PATCH v4 13/13] commit-graph: build graph from starting commits Derrick Stolee
2018-03-30 11:10       ` [PATCH v4 00/13] Serialized Git Commit Graph Jakub Narebski
2018-04-02 13:02         ` Derrick Stolee
2018-04-02 14:46           ` Jakub Narebski
2018-04-02 15:02             ` Derrick Stolee
2018-04-02 17:35               ` Stefan Beller
2018-04-02 17:54                 ` Derrick Stolee
2018-04-02 18:02                   ` Stefan Beller
2018-04-07 22:37               ` Jakub Narebski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqq371uyufx.fsf@gitster-ct.c.googlers.com \
    --to=gitster@pobox.com \
    --cc=dstolee@microsoft.com \
    --cc=git@jeffhostetler.com \
    --cc=git@vger.kernel.org \
    --cc=jonathantanmy@google.com \
    --cc=peff@peff.net \
    --cc=sbeller@google.com \
    --cc=stolee@gmail.com \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).