git@vger.kernel.org mailing list mirror (one of many)
* [PATCH] hash binary sha1 into patch id
From: Clemens Buchacher @ 2010-08-13  9:40 UTC
  To: git; +Cc: Marat Radchenko, Michael J Gruber, Junio C Hamano



Since commit 2f82f760 (Take binary diffs into account for "git rebase"), binary
files are included in the patch ID computation. However, they are still diffed
using the text diff algorithm, which has a huge impact on performance. The
following command measures performance for a 50000-line file marked as binary
in .gitattributes.

$ git format-patch --stdout --full-index --ignore-if-in-upstream master

real	0m0.367s
user	0m0.354s
sys	0m0.010s

Instead of hashing the diff of binary files, hash the post-image sha1, which is
just as unique. With this change, the same command runs about 20 times faster.

$ git format-patch --stdout --full-index --ignore-if-in-upstream master

real	0m0.016s
user	0m0.015s
sys	0m0.001s

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
---

This may be related to the rebase performance issue discussed in
the following thread.
 
 http://mid.gmane.org/loom.20100713T082913-327@post.gmane.org

I am attaching the script which I used to test performance.
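
For readers without the attachment, here is a minimal sketch of how such a
test could be set up. It is not the attached test-patchid.sh; the file name
data.bin, the branch name topic, the commit messages and the use of seq to
generate content are all assumptions made for illustration. It only builds a
repository in which the format-patch command above has to compute patch IDs
for a large file marked as binary.

```shell
#!/bin/sh
# Hypothetical reproduction setup, NOT the script attached to this mail.
# File name, branch name and contents are illustrative assumptions.
set -e

repo=$(mktemp -d)
out=$(mktemp)
cd "$repo"
git init -q
# Make sure the initial branch is called master, whatever the local default is.
git symbolic-ref HEAD refs/heads/master
git config user.name "Test User"
git config user.email "test@example.com"

# A 50000-line text file that .gitattributes declares to be binary.
echo "data.bin binary" >.gitattributes
seq 1 50000 >data.bin
git add .gitattributes data.bin
git commit -q -m "add 50000-line file marked as binary"

# Keep a branch at this point, then give master one commit of its own,
# so there is at least one upstream patch ID to compare against.
git branch topic
seq 100001 150000 >data.bin
git commit -q -a -m "upstream change to binary file"

# Rewrite the file differently on the topic branch.
git checkout -q topic
seq 50001 100000 >data.bin
git commit -q -a -m "rewrite binary file"

# Same command as in the commit message; output goes to a scratch file.
git format-patch --stdout --full-index --ignore-if-in-upstream master >"$out"
```

To take a measurement, the last command can be prefixed with the shell's time
builtin (for example in bash) and run once with and once without the patch
applied. The numbers will depend on the machine and the git version.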

 diff.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/diff.c b/diff.c
index 17873f3..20fc6db 100644
--- a/diff.c
+++ b/diff.c
@@ -3758,6 +3758,12 @@ static int diff_get_patch_id(struct diff_options *options, unsigned char *sha1)
 					len2, p->two->path);
 		git_SHA1_Update(&ctx, buffer, len1);
 
+		if (diff_filespec_is_binary(p->two)) {
+			len1 = sprintf(buffer, "%s", sha1_to_hex(p->two->sha1));
+			git_SHA1_Update(&ctx, buffer, len1);
+			continue;
+		}
+
 		xpp.flags = 0;
 		xecfg.ctxlen = 3;
 		xecfg.flags = XDL_EMIT_FUNCNAMES;
-- 
1.7.2.1.1.g202c


[-- Attachment #1.2: test-patchid.sh --]
[-- Type: application/x-sh, Size: 1719 bytes --]


Thread overview: 13+ messages
2010-08-13  9:40 [PATCH] hash binary sha1 into patch id Clemens Buchacher
2010-08-13 20:00 ` Jonathan Nieder
2010-08-13 21:23   ` Clemens Buchacher
2010-08-13 21:37     ` Jonathan Nieder
2010-08-13 21:58       ` Clemens Buchacher
2010-08-15  7:20         ` [PATCH v2] " Clemens Buchacher
2010-08-15  7:56           ` Jonathan Nieder
2010-09-10  5:17           ` Marat Radchenko
2010-09-10  8:16             ` Clemens Buchacher
2010-10-13  7:46               ` Marat Radchenko
2010-10-13  9:17                 ` Marat Radchenko
2010-10-13 21:10                   ` Clemens Buchacher
2010-10-14  7:19                     ` Marat Radchenko
