git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: git@jeffhostetler.com
Cc: "SZEDER Gábor" <szeder.dev@gmail.com>,
	gitster@pobox.com, peff@peff.net,
	"Jeff Hostetler" <jeffhost@microsoft.com>,
	git@vger.kernel.org
Subject: Re: [PATCH v5 2/4] read-cache: add strcmp_offset function
Date: Thu,  6 Apr 2017 16:19:12 +0200	[thread overview]
Message-ID: <20170406141912.14536-1-szeder.dev@gmail.com> (raw)
In-Reply-To: <20170405173809.3098-3-git@jeffhostetler.com>

> Add strcmp_offset() function to also return the offset of the
> first change.
> 
> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> ---
>  cache.h      |  1 +
>  read-cache.c | 29 +++++++++++++++++++++++++++++
>  2 files changed, 30 insertions(+)
> 
> diff --git a/cache.h b/cache.h
> index 80b6372..4d82490 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -574,6 +574,7 @@ extern int write_locked_index(struct index_state *, struct lock_file *lock, unsi
>  extern int discard_index(struct index_state *);
>  extern int unmerged_index(const struct index_state *);
>  extern int verify_path(const char *path);
> +extern int strcmp_offset(const char *s1_in, const char *s2_in, int *first_change);
>  extern int index_dir_exists(struct index_state *istate, const char *name, int namelen);
>  extern void adjust_dirname_case(struct index_state *istate, char *name);
>  extern struct cache_entry *index_file_exists(struct index_state *istate, const char *name, int namelen, int igncase);
> diff --git a/read-cache.c b/read-cache.c
> index 9054369..b3fc77d 100644
> --- a/read-cache.c
> +++ b/read-cache.c

read-cache.c is probably not the best place for such a general string
utility function, though I'm not sure where its most apropriate place
would be.

> @@ -887,6 +887,35 @@ static int has_file_name(struct index_state *istate,
>  	return retval;
>  }
>  
> +
> +/*
> + * Like strcmp(), but also return the offset of the first change.

This comment doesn't tell us what will happen with the offset
parameter if it is NULL or if the two strings are equal.  I don't
really care about the safety check for NULL: a callsite not interested
in the offset should just call strcmp() instead.  I'm fine either way.
However, setting it to 0 when the strings are equal doesn't seem quite
right, does it.

> + */
> +int strcmp_offset(const char *s1_in, const char *s2_in, int *first_change)
> +{
> +	const unsigned char *s1 = (const unsigned char *)s1_in;
> +	const unsigned char *s2 = (const unsigned char *)s2_in;
> +	int diff = 0;
> +	int k;
> +
> +	*first_change = 0;
> +	for (k=0; s1[k]; k++)
> +		if ((diff = (s1[k] - s2[k])))
> +			goto found_it;
> +	if (!s2[k])
> +		return 0;
> +	diff = -1;
> +
> +found_it:
> +	*first_change = k;
> +	if (diff > 0)
> +		return 1;
> +	else if (diff < 0)
> +		return -1;
> +	else
> +		return 0;
> +}

The implementation seems to me a bit long-winded, with more
conditional statements than necessary.  How about something like this
instead?  Much shorter, no goto and only half the number of
conditionals to reason about, leaves *first_change untouched when the
two strings are equal, and deals with it being NULL.

int strcmp_offset(const char *s1, const char *s2, int *first_change)
{
	int k;

	for (k = 0; s1[k] == s2[k]; k++)
		if (s1[k] == '\0')
			return 0;

	if (first_change)
		*first_change = k;
	return ((unsigned char *)s1)[k] - ((unsigned char *)s2)[k];
}


  reply	other threads:[~2017-04-06 14:19 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-05 17:38 [PATCH v5 0/4] read-cache: speed up add_index_entry git
2017-04-05 17:38 ` [PATCH v5 1/4] p0004-read-tree: perf test to time read-tree git
2017-04-06 20:20   ` René Scharfe
2017-04-06 20:41     ` Jeff Hostetler
2017-04-05 17:38 ` [PATCH v5 2/4] read-cache: add strcmp_offset function git
2017-04-06 14:19   ` SZEDER Gábor [this message]
2017-04-06 15:58     ` Jeff Hostetler
2017-04-05 17:38 ` [PATCH v5 3/4] test-strcmp-offset: created test for strcmp_offset git
2017-04-05 22:47   ` SZEDER Gábor
2017-04-06  8:21     ` Johannes Schindelin
2017-04-06 14:25       ` Jeff Hostetler
2017-04-06 20:20   ` René Scharfe
2017-04-06 20:42     ` Jeff Hostetler
2017-04-05 17:38 ` [PATCH v5 4/4] read-cache: speed up add_index_entry during checkout git
2017-04-05 22:54   ` SZEDER Gábor
2017-04-06 14:05     ` Jeff Hostetler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170406141912.14536-1-szeder.dev@gmail.com \
    --to=szeder.dev@gmail.com \
    --cc=git@jeffhostetler.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jeffhost@microsoft.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).