git@vger.kernel.org mailing list mirror (one of many)
From: Stefan Beller <sbeller@google.com>
To: Derrick Stolee <stolee@gmail.com>
Cc: git <git@vger.kernel.org>, Junio C Hamano <gitster@pobox.com>,
	Jeff King <peff@peff.net>, Jeff Hostetler <git@jeffhostetler.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH 11/14] commit: integrate packed graph with commit parsing
Date: Fri, 26 Jan 2018 11:38:51 -0800
Message-ID: <CAGZ79kbR_GUGRhXBU187LchHKqxFd42Rc=w+jOEfrigmEAgNaw@mail.gmail.com>
In-Reply-To: <20180125140231.65604-12-dstolee@microsoft.com>

On Thu, Jan 25, 2018 at 6:02 AM, Derrick Stolee <stolee@gmail.com> wrote:
> Teach Git to inspect a packed graph to supply the contents of a
> struct commit when calling parse_commit_gently(). This implementation
> satisfies all post-conditions on the struct commit, including loading
> parents, the root tree, and the commit date. The only loosely-expected
> condition is that the commit buffer is loaded into the cache. This
> was checked in log-tree.c:show_log(), but the "return;" on failure
> produced unexpected results (i.e. the message line was never terminated).
> The new behavior of loading the buffer when needed prevents the
> unexpected behavior.
>
> If core.graph is false, then do not load the graph and behave as usual.
>
> In test script t5319-graph.sh, add output-matching conditions on read-
> only graph operations.
>
> By loading commits from the graph instead of parsing commit buffers, we
> save a lot of time on long commit walks. Here are some performance
> results for a copy of the Linux repository where 'master' has 704,766
> reachable commits and is behind 'origin/master' by 19,610 commits.
>
> | Command                          | Before | After  | Rel % |
> |----------------------------------|--------|--------|-------|
> | log --oneline --topo-order -1000 |  5.9s  |  0.7s  | -88%  |
> | branch -vv                       |  0.42s |  0.27s | -35%  |
> | rev-list --all                   |  6.4s  |  1.0s  | -84%  |
> | rev-list --all --objects         | 32.6s  | 27.6s  | -15%  |

This sounds impressive!
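
For what it's worth, the "Rel %" column can be reproduced from the
Before/After timings. The sketch below is only my reading of the table,
not something the patch states: it assumes the column is
(after - before) / before, truncated toward zero rather than rounded
(which is what makes the "branch -vv" row come out as -35 and not -36).
The function name is made up for illustration.

```c
#include <assert.h>

/*
 * Hypothetical helper: relative change from 'before' to 'after' in
 * whole percent, truncated toward zero (assumption, see above).
 */
static int rel_percent_sketch(double before, double after)
{
	assert(before > 0.0);
	return (int)((after - before) / before * 100.0);
}
```

Under that assumption, rel_percent_sketch(5.9, 0.7) gives -88 and
rel_percent_sketch(32.6, 27.6) gives -15, matching the table.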

> @@ -383,19 +384,27 @@ int parse_commit_gently(struct commit *item, int quiet_on_missing)
>
>         if (!item)
>                 return -1;
> +
> +       // If we already parsed, but got it from the graph, then keep going!

Comment style: please use /* ... */ here instead of //.

>         if (item->object.parsed)
>                 return 0;
> +
> +       if (check_packed && parse_packed_commit(item))
> +               return 0;
> +
>         buffer = read_sha1_file(item->object.oid.hash, &type, &size);
>         if (!buffer)
>                 return quiet_on_missing ? -1 :
>                         error("Could not read %s",
> -                            oid_to_hex(&item->object.oid));
> +                       oid_to_hex(&item->object.oid));
>         if (type != OBJ_COMMIT) {
>                 free(buffer);
>                 return error("Object %s not a commit",
> -                            oid_to_hex(&item->object.oid));
> +                       oid_to_hex(&item->object.oid));
>         }
> +
>         ret = parse_commit_buffer(item, buffer, size);
> +

I guess the new blank lines are for readability?
Not sure if this will play out nicely with merges in this area, though.
(I touch this area of the code as well in the not-yet-sent series
adding the repository as an argument all over the place. Not your
problem, just me getting anxious.)
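
Whitespace aside, my understanding of the control flow in this hunk is
roughly the following. This is a stripped-down sketch, not the real
code: every name ending in _sketch is a stand-in I made up, the
"in_graph" flag replaces the actual graph lookup, and the buffer path
is reduced to setting two fields.

```c
#include <stddef.h>

/* Stand-in for struct commit; only the fields this sketch needs. */
struct commit_sketch {
	int parsed;     /* nonzero once the commit has been filled in */
	int from_graph; /* nonzero if the data came from the graph */
};

/* Stand-in for the graph lookup: returns 1 if the graph supplied the data. */
static int graph_lookup_sketch(struct commit_sketch *c, int in_graph)
{
	if (!in_graph)
		return 0;
	c->parsed = 1;
	c->from_graph = 1;
	return 1;
}

/* Stand-in for reading and parsing the commit object's buffer. */
static int parse_from_buffer_sketch(struct commit_sketch *c)
{
	c->parsed = 1;
	c->from_graph = 0;
	return 0;
}

/* Graph first (if enabled), object buffer as the fallback. */
static int parse_commit_sketch(struct commit_sketch *c, int check_packed,
			       int in_graph)
{
	if (!c)
		return -1;
	if (c->parsed)
		return 0;
	if (check_packed && graph_lookup_sketch(c, in_graph))
		return 0;
	return parse_from_buffer_sketch(c);
}
```

So with check_packed off, or for a commit the graph does not know
about, we end up on the old buffer path unchanged.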

> @@ -34,6 +34,8 @@
>  #define GRAPH_CHUNKLOOKUP_SIZE (5 * 12)
>  #define GRAPH_MIN_SIZE (GRAPH_CHUNKLOOKUP_SIZE + GRAPH_FANOUT_SIZE + \
>                         GRAPH_OID_LEN + sizeof(struct packed_graph_header))
> +/* global storage */
> +struct packed_graph *packed_graph = 0;
>
>  struct object_id *get_graph_head_oid(const char *pack_dir, struct object_id *oid)
>  {
> @@ -209,6 +211,225 @@ struct packed_graph *load_packed_graph_one(const char *graph_file, const char *p
>         return graph;
>  }
>
> +static void prepare_packed_graph_one(const char *obj_dir)
> +{
> +       char *graph_file;
> +       struct object_id oid;
> +       struct strbuf pack_dir = STRBUF_INIT;
> +       strbuf_addstr(&pack_dir, obj_dir);
> +       strbuf_add(&pack_dir, "/pack", 5);
> +
> +       if (!get_graph_head_oid(pack_dir.buf, &oid))
> +               return;
> +
> +       graph_file = get_graph_filename_oid(pack_dir.buf, &oid);
> +
> +       packed_graph = load_packed_graph_one(graph_file, pack_dir.buf);
> +       strbuf_release(&pack_dir);
> +}
> +
> +static int prepare_packed_graph_run_once = 0;

Okay. :(
Seeing new globals like these gives me extra motivation to
get the object store series going.
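
To make the concern concrete, here is a rough sketch of the direction I
have in mind: the graph pointer and the run-once flag live in a
per-repository struct that callers pass in, instead of at file scope.
All names below are hypothetical and the loader is a stub; this is not
code from the object store series.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct packed_graph. */
struct packed_graph_sketch {
	int num_commits;
};

/* Hypothetical per-repository state replacing the two globals. */
struct graph_store_sketch {
	struct packed_graph_sketch *graph; /* NULL until loaded */
	int prepared;                      /* nonzero once a load was tried */
	int load_attempts;                 /* bookkeeping for this demo only */
};

/* Stub loader; a real one would read the graph file from disk. */
static struct packed_graph_sketch *load_graph_sketch(void)
{
	static struct packed_graph_sketch the_graph = { 3 };
	return &the_graph;
}

/* Load at most once per store; later calls reuse the cached result. */
static struct packed_graph_sketch *
prepare_graph_sketch(struct graph_store_sketch *store)
{
	assert(store != NULL);
	if (!store->prepared) {
		store->graph = load_graph_sketch();
		store->prepared = 1;
		store->load_attempts++;
	}
	return store->graph;
}
```

The run-once behavior is the same as with the static flag, but two
repositories in one process would each get their own state.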
