git@vger.kernel.org mailing list mirror (one of many)
 help / Atom feed
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Stefan Beller <sbeller@google.com>,
	git@vger.kernel.org, orgads@gmail.com
Subject: [PATCH 5/5] diff: handle NULs in get_string_hash()
Date: Thu, 19 Oct 2017 16:31:20 -0400
Message-ID: <20171019203119.d2xbryov2y5t2aig@sigill.intra.peff.net> (raw)
In-Reply-To: <20171019202326.grovyfsragl2d7xx@sigill.intra.peff.net>

For computing moved lines, we feed the characters of each
line into a hash. When we've been asked to ignore
whitespace, then we pick each character using next_byte(),
which returns -1 on end-of-string, which it determines using
the start/end pointers we feed it.

However our check of its return value treats "0" the same as
"-1", meaning we'd quit if the string has an embedded NUL.
This is unlikely to ever come up in practice since our line
boundaries generally come from calling strlen() in the first
place.

But it was a bit surprising to me as a reader of the
next_byte() code. And it's possible that we may one day feed
this function with more exotic input, which otherwise works
with arbitrary ptr/len pairs.

Signed-off-by: Jeff King <peff@peff.net>
---
I noticed that we make an extra copy of each line here, just to feed it
to memihash! I guess "-w" is not a critical-performance code path, but
this could be fixed if we could do memhash() incrementally (e.g., by
putting the FNV state into a struct and letting callers "add" to it
incrementally). Maybe an interesting #leftoverbits, though I'd want to
see timing tests that show it's worth doing.

 diff.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/diff.c b/diff.c
index 09081a207c..c4a669ffa8 100644
--- a/diff.c
+++ b/diff.c
@@ -782,7 +782,7 @@ static unsigned get_string_hash(struct emitted_diff_symbol *es, struct diff_opti
 		strbuf_reset(&sb);
 		while (ae > ap && isspace(ae[-1]))
 			ae--;
-		while ((c = next_byte(&ap, &ae, o)) > 0)
+		while ((c = next_byte(&ap, &ae, o)) >= 0)
 			strbuf_addch(&sb, c);
 
 		return memhash(sb.buf, sb.len);
-- 
2.15.0.rc1.560.g5f0609e481

  parent reply index

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-12 19:53 Out of memory with diff.colormoved enabled Orgad Shaneh
2017-10-12 20:05 ` Jeff King
2017-10-12 22:39   ` Stefan Beller
2017-10-12 23:33   ` [PATCH] diff.c: increment buffer pointer in all code path Stefan Beller
2017-10-13  0:18     ` Jeff King
2017-10-13  0:20       ` Jeff King
2017-10-13  0:24         ` Stefan Beller
2017-10-19  5:04         ` Jeff King
2017-10-19  5:24           ` Jeff King
2017-10-19  5:30             ` Junio C Hamano
2017-10-19  5:32               ` Junio C Hamano
2017-10-19  5:32                 ` Jeff King
2017-10-19  5:42               ` Jeff King
2017-10-19 19:55                 ` Stefan Beller
2017-10-19 20:23                 ` [PATCH 0/5] fix "diff --color-moved --ignore-space-at-eol" Jeff King
2017-10-19 20:24                   ` [PATCH 1/5] t4015: refactor --color-moved whitespace test Jeff King
2017-10-19 20:56                     ` Stefan Beller
2017-10-19 21:10                       ` Jeff King
2017-10-19 20:25                   ` [PATCH 2/5] t4015: check "negative" case for "-w --color-moved" Jeff King
2017-10-19 20:54                     ` Stefan Beller
2017-10-19 20:26                   ` [PATCH 3/5] t4015: test the output of "diff --color-moved -b" Jeff King
2017-10-19 21:03                     ` Stefan Beller
2017-10-19 21:14                       ` Jeff King
2017-10-19 20:29                   ` [PATCH 4/5] diff: fix whitespace-skipping with --color-moved Jeff King
2017-10-19 21:15                     ` Stefan Beller
2017-10-19 21:19                       ` Jeff King
2017-10-20  7:23                     ` Simon Ruderich
2017-10-20 22:37                       ` Jeff King
2017-10-19 20:31                   ` Jeff King [this message]
2017-10-19 21:31                     ` [PATCH 5/5] diff: handle NULs in get_string_hash() Stefan Beller
2017-10-19 21:39                       ` Jeff King
2017-10-19 21:50                         ` Stefan Beller
2017-10-19 19:53             ` [PATCH] diff.c: increment buffer pointer in all code path Stefan Beller
2017-10-19 19:55               ` Jeff King

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171019203119.d2xbryov2y5t2aig@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=orgads@gmail.com \
    --cc=sbeller@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

git@vger.kernel.org mailing list mirror (one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.org/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/
       or Tor2web: https://www.tor2web.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox