git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: "brian m. carlson" <sandals@crustytoothpaste.net>,
	Ramsay Jones <ramsay@ramsayjones.plus.com>,
	git@vger.kernel.org, Michael Haggerty <mhagger@alum.mit.edu>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v3 02/19] builtin/diff-tree: convert to struct object_id
Date: Fri, 17 Feb 2017 20:42:17 -0500	[thread overview]
Message-ID: <20170218014217.sil4jyukkbqguxfz@sigill.intra.peff.net> (raw)
In-Reply-To: <20170218012607.kdisudmmponvts35@genre.crustytoothpaste.net>

On Sat, Feb 18, 2017 at 01:26:07AM +0000, brian m. carlson wrote:

> > > +	struct object_id oid;
> > >  	struct tree *tree2;
> > > -	if (len != 82 || !isspace(line[40]) || get_sha1_hex(line + 41, sha1))
> > > +	const int chunksz = GIT_SHA1_HEXSZ + 1;
> > > +	if (len != 2 * chunksz || !isspace(line[chunksz-1]) ||
> > > +		get_sha1_hex(line + chunksz, oid.hash))
> > 
> > I'm not sure that this is an improvement. The input expected in 'line'
> > is supposed to look like: '<sha1> + <space> + <sha1> + <\n>'. So your
> > 'chunk' would be a <sha1> plus one 'char' of some sort. Except that the
> > caller of this function has already replaced the newline character with
> > a '\0' char (so strlen(line) would return 81), but still passes the
> > original line length! Also, note that this (and other functions in this
> > file) actually test for 'isspace(char)' rather than for a ' ' char!
> > 
> > Hmm, maybe just:
> > 
> > if (len < (2 * GIT_SHA1_HEXSZ + 1) || line[GIT_SHA1_HEXSZ] != ' ' ||
> >     get_sha1_hex(line + GIT_SHA1_HEXSZ + 1, oid.hash))
> > 
> > (or, perhaps, still call isspace() in this patch ...)
> 
> Well, I think it's strictly an improvement in that we have avoided
> writing hardcoded constants[0].  I did intend it as a "hash plus one"
> chunk, which is actually quite common throughout the code.
> 
> I'm wondering if parse_oid_hex could be useful here as well.

I know I haven't looked at this chunk nearly as carefully as you have,
but it seems somewhat crazy to me that these functions get the original
"line" in the first place. Shouldn't they get line+40 from the caller
(who in turn should be using parse_oid_hex to compute that)?

And then each function should subsequently parse left-to-right with
a mix of isspace() and parse_oid_hex(), and probably doesn't even need
to care about the original "len" at all (yes, you can quit early if you
know your len isn't long enough, but that's the unusual error case
anyway; it's not a big deal to find that out while parsing).

In general, I think this sort of left-to-right incremental pointer
movement is safe and simple. There may be a few cases where it doesn't
apply (i.e., where you need to look at the end of the string to know how
to parse the beginning), but that should be relatively rare.

> [0] If we change the hash size, all of the GIT_SHA1_HEXSZ constants can
> be replaced with a variable that varies based on hash size, and the code
> still works.

I am happy to see fewer magic numbers. But I think with incremental
pointer-movements, we don't even need to use the numeric constants,
either. If one day we can parse "sha256:1234abcd..." as an oid, then the
existing code would "just work".

-Peff

  reply	other threads:[~2017-02-18  1:42 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-18  0:06 [PATCH v3 00/19] object_id part 6 brian m. carlson
2017-02-18  0:06 ` [PATCH v3 01/19] builtin/commit: convert to struct object_id brian m. carlson
2017-02-18  0:06 ` [PATCH v3 02/19] builtin/diff-tree: " brian m. carlson
2017-02-18  1:18   ` Ramsay Jones
2017-02-18  1:26     ` brian m. carlson
2017-02-18  1:42       ` Jeff King [this message]
2017-02-18  3:15         ` Jeff King
2017-02-18 19:12           ` brian m. carlson
2017-02-18 20:01             ` Jeff King
2017-02-18  0:06 ` [PATCH v3 03/19] builtin/describe: " brian m. carlson
2017-02-18  0:06 ` [PATCH v3 04/19] builtin/fast-export: " brian m. carlson
2017-02-18  0:06 ` [PATCH v3 05/19] builtin/fmt-merge-message: " brian m. carlson
2017-02-18  0:06 ` [PATCH v3 06/19] builtin/grep: " brian m. carlson
2017-02-18  0:06 ` [PATCH v3 07/19] builtin/branch: " brian m. carlson
2017-02-18  0:06 ` [PATCH v3 08/19] builtin/clone: " brian m. carlson
2017-02-18  0:06 ` [PATCH v3 09/19] builtin/merge: " brian m. carlson
2017-02-18  0:06 ` [PATCH v3 10/19] Convert remaining callers of resolve_refdup to object_id brian m. carlson
2017-02-18  0:06 ` [PATCH v3 11/19] builtin/replace: convert to struct object_id brian m. carlson
2017-02-18  0:06 ` [PATCH v3 12/19] reflog-walk: convert struct reflog_info " brian m. carlson
2017-02-18  0:06 ` [PATCH v3 13/19] refs: convert each_reflog_ent_fn " brian m. carlson
2017-02-18  0:06 ` [PATCH v3 14/19] hex: introduce parse_oid_hex brian m. carlson
2017-02-18  0:06 ` [PATCH v3 15/19] refs: simplify parsing of reflog entries brian m. carlson
2017-02-18  0:06 ` [PATCH v3 16/19] sha1_file: introduce an nth_packed_object_oid function brian m. carlson
2017-02-18  1:24   ` Ramsay Jones
2017-02-18  1:29     ` brian m. carlson
2017-02-18  0:06 ` [PATCH v3 17/19] Convert object iteration callbacks to struct object_id brian m. carlson
2017-02-18  0:06 ` [PATCH v3 18/19] builtin/merge-base: convert " brian m. carlson
2017-02-18  0:06 ` [PATCH v3 19/19] wt-status: " brian m. carlson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170218014217.sil4jyukkbqguxfz@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=mhagger@alum.mit.edu \
    --cc=ramsay@ramsayjones.plus.com \
    --cc=sandals@crustytoothpaste.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).