git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Elliott Cable <me@ell.io>
Subject: Re: [PATCH v2 4/4] log: --author-date-order
Date: Mon, 10 Jun 2013 14:49:18 -0400	[thread overview]
Message-ID: <20130610184918.GC2084@sigill.intra.peff.net> (raw)
In-Reply-To: <7vobbel8ib.fsf@alter.siamese.dyndns.org>

On Mon, Jun 10, 2013 at 12:39:24AM -0700, Junio C Hamano wrote:

> > I'm not excited about introducing yet another place that parses commit
> > objects (mostly not for correctness, but because we have had
> > inconsistency in how malformed objects are treated). It is at least
> > using split_ident_line which covers the hard bits. I wonder how much
> > slower it would be to simply call format_commit_message to do the
> > parsing.
> 
> The thought certainly crossed my mind, not exactly in that form but
> more about splitting the machinery used in pretty.c into a more
> reusable form.
> 
> The result of my attempt however did not become all that reusable
> (admittedly I didn't spend too much brain cycles on it), so I punted
> ;-).

Yes, I feel like it has been tried before. The problem is that a clean
interface would let you get individual pieces of information with a
single call. But an efficient interface will utilize the same parsing
pass to get multiple items out, and stop parsing when we have gotten all
required items (but leave the parser in a consistent state so that we
can pick it up later).

The format_commit_one parser does that, but the "format_commit_context"
it holds is a bit bulky. I think it might be possible to pull out the
parsing bits into a separate struct, and you could call it something
like:

  struct commit_parser parser;
  unsigned long authordate;
  const char *authorname;
  int authorlen;

  commit_parser_init(&parser, commit);
  authordate = commit_parse_authordate(&parser);
  authorname = commit_parse_authorname(&parser, &authorlen);

where the second parse call is basically "free", because we've already
done (and cached) the hard work in the first call.

So they might look like:

  static void parse_author_ident(struct commit_parser *parser)
  {
          if (!parser->author.name_begin) {
                  if (!parser->authorline.start)
                          parse_commit_header(parser);
                  split_ident_line(&parser->author,
                                   parser->authorline.start,
                                   parser->authorline.len);
          }
  }

  unsigned long commit_parse_authordate(struct commit_parser *parser)
  {
          parse_author_ident(parser);
          /* XXX should check for malformedness here */
          return strtoul(ident.date_begin, NULL, 10);
  }

  const char *commit_parse_authorname(struct commit_parser *parser,
                                      unsigned long *len)
  {
          parse_author_ident(parser);
          *len = parser.author.name_end - parser.author.name_begin;
          return parser.author.name_begin;
  }

and so forth. It would be easy (and have the same efficiency) for
format_commit_message to build on that, and it calling it from regular
code is not too bad.

> But you are right.  The commit->buffer may no longer be there, and
> the --author-date-order option needs to read the object again
> in this codepath.  That would be in line with what --pretty/format
> would do, I guess.
> 
> Or we could extend parse_commit() API to take an optional commit
> info slab to store not just author date but other non-essential
> stuff like people's names, and we arrange that extended API to be
> triggered when we know --author-date-order is in effect?

I like the latter option. It takes a non-trivial amount of time to load
the commits from disk, and now we are potentially doing it 2 or 3 times
for a run (once to parse, once to get the author info for topo-sort, and
possibly later to show it if --pretty is given; though I did not check
and maybe we turn off save_commit_buffer with --pretty). It would be
nice to have an extended parse_object that handled that. I'm not sure of
the interface. Maybe variadic with pairs of type/slab, like:

  parse_commit_extended(commit,
                        PARSE_COMMIT_AUTHORDATE, &authordate_slab,
                        PARSE_COMMIT_DONE);

?

-Peff

  reply	other threads:[~2013-06-10 18:49 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-06-04 18:08 [PATCH/RFC] add --authorship-order flag to git log / rev-list elliottcable
2013-06-04 18:08 ` [PATCH/RFC] rev-list: add --authorship-order alternative ordering elliottcable
2013-06-04 19:14   ` Junio C Hamano
2013-06-04 21:20     ` Junio C Hamano
2013-06-06 19:03       ` Elliott Cable
2013-06-06 19:29         ` Junio C Hamano
2013-06-06 19:32           ` Elliott Cable
2013-06-06 19:40         ` Junio C Hamano
2013-06-06 22:48           ` Junio C Hamano
2013-06-06 23:25             ` [PATCH] toposort: rename "lifo" field Junio C Hamano
2013-06-07  1:25               ` Junio C Hamano
2013-06-07  5:11                 ` [PATCH 0/3] Preparing for --date-order=author Junio C Hamano
2013-06-07  5:11                   ` [PATCH 1/3] toposort: rename "lifo" field Junio C Hamano
2013-06-07  5:18                     ` Eric Sunshine
2013-06-07  5:21                       ` Junio C Hamano
2013-06-07  5:11                   ` [PATCH 2/3] commit-queue: LIFO or priority queue of commits Junio C Hamano
2013-06-07  5:29                     ` Eric Sunshine
2013-06-07  5:11                   ` [PATCH 3/3] sort-in-topological-order: use commit-queue Junio C Hamano
2013-06-09 23:24                   ` [PATCH v2 0/4] log --author-date-order Junio C Hamano
2013-06-09 23:24                     ` [PATCH v2 1/4] toposort: rename "lifo" field Junio C Hamano
2013-06-10  2:12                       ` Eric Sunshine
2013-06-10  5:05                       ` Jeff King
2013-06-09 23:24                     ` [PATCH v2 2/4] commit-queue: LIFO or priority queue of commits Junio C Hamano
2013-06-10  5:25                       ` Jeff King
2013-06-10  7:21                         ` Junio C Hamano
2013-06-10 18:15                           ` Jeff King
2013-06-10 18:56                             ` Junio C Hamano
2013-06-10 18:59                               ` Jeff King
2013-06-10 23:23                                 ` Junio C Hamano
2013-06-11  6:36                                   ` Jeff King
2013-06-11 17:02                                     ` Junio C Hamano
2013-06-11 22:19                                     ` [PATCH v3 0/4] log --author-date-order Junio C Hamano
2013-06-11 22:19                                       ` [PATCH v3 1/4] toposort: rename "lifo" field Junio C Hamano
2013-06-11 22:19                                       ` [PATCH v3 2/4] prio-queue: priority queue of pointers to structs Junio C Hamano
2013-06-11 22:19                                       ` [PATCH v3 3/4] sort-in-topological-order: use prio-queue Junio C Hamano
2013-06-11 22:19                                       ` [PATCH v3 4/4] log: --author-date-order Junio C Hamano
2013-06-09 23:24                     ` [PATCH v2 3/4] sort-in-topological-order: use commit-queue Junio C Hamano
2013-06-09 23:37                       ` Junio C Hamano
2013-06-10  5:31                         ` Jeff King
2013-06-10  7:27                           ` Junio C Hamano
2013-06-10 18:24                             ` Jeff King
2013-06-09 23:24                     ` [PATCH v2 4/4] log: --author-date-order Junio C Hamano
2013-06-10  5:50                       ` Jeff King
2013-06-10  7:39                         ` Junio C Hamano
2013-06-10 18:49                           ` Jeff King [this message]
2013-06-20 19:36                             ` Junio C Hamano
2013-06-20 20:16                               ` Jeff King
2013-06-07  5:09               ` [PATCH] toposort: rename "lifo" field Eric Sunshine
2013-06-04 21:22     ` [PATCH/RFC] rev-list: add --authorship-order alternative ordering Jeff King
2013-06-04 18:53 ` [PATCH/RFC] add --authorship-order flag to git log / rev-list Junio C Hamano
2013-06-06 18:06   ` Elliott Cable

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130610184918.GC2084@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=me@ell.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).