git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Junio C Hamano <gitster@pobox.com>
Cc: "Dmitry V. Levin" <ldv@altlinux.org>,
	"Shawn O. Pearce" <spearce@spearce.org>,
	Git Mailing List <git@vger.kernel.org>, Jeff King <peff@peff.net>
Subject: [PATCH 2/2] Fix the rename detection limit checking
Date: Fri, 14 Sep 2007 10:39:48 -0700 (PDT)	[thread overview]
Message-ID: <alpine.LFD.0.999.0709141017450.16478@woody.linux-foundation.org> (raw)
In-Reply-To: <alpine.LFD.0.999.0709141014130.16478@woody.linux-foundation.org>


This adds more proper rename detection limits. Instead of just checking 
the limit against the number of potential rename destinations, we verify 
that the rename matrix (which is what really matters) doesn't grow 
ridiculously large, and we also make sure that we don't overflow when 
doing the matrix size calculation.

This also changes the default limits from unlimited, to a rename matrix 
that is limited to 100 entries on a side. You can raise it with the config 
entry, or by using the "-l<n>" command line flag, but at least the default 
is now a sane number that avoids spending lots of time (and memory) in 
situations that likely don't merit it.

The choice of default value is of course very debatable. Limiting the 
rename matrix to a 100x100 size will mean that even if you have just one 
obvious rename, but you also create (or delete) 10,000 files, the rename 
matrix will be so big that we disable the heuristics. Sounds reasonable to 
me, but let's see if people hit this (and, perhaps more importantly, 
actually *care*) in real life.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---

Due to the overflow issue (and yes, Dmitry's test-case actually triggered 
an overflow even on 64-bit machines, because the math was done on "int" 
types), I really think this was a real bug.

Now, whether this is necessarily the right way to fix it, I dunno, but it 
also *does* fix the old broken limit handling that was based just on the 
number of target files, and didn't take the number of potential source 
files into account.

But the fix to that logic also means that the meaning of "-l<n>" changes 
subtly. For the better, I think, but changes nonetheless.

I'd also like to apologize to the change in wt-status.c, but that code 
doesn't use the regular diffopt setup logic, so it really is a special 
case and needs to be handled as such. Do we want to teach "git runstatus" 
to actually honor command line flags for diff generation etc? That's a 
separate question, this patch just makes it use the same rename default as 
the normal diffs now do.

 diff.c            |    2 +-
 diffcore-rename.c |   19 +++++++++++++++++--
 wt-status.c       |    1 +
 3 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/diff.c b/diff.c
index 1aca5df..0ee9ea1 100644
--- a/diff.c
+++ b/diff.c
@@ -17,7 +17,7 @@
 #endif
 
 static int diff_detect_rename_default;
-static int diff_rename_limit_default = -1;
+static int diff_rename_limit_default = 100;
 static int diff_use_color_default;
 int diff_auto_refresh_index = 1;
 
diff --git a/diffcore-rename.c b/diffcore-rename.c
index 6bde439..41b35c3 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -298,10 +298,25 @@ void diffcore_rename(struct diff_options *options)
 		else if (detect_rename == DIFF_DETECT_COPY)
 			register_rename_src(p->one, 1, p->score);
 	}
-	if (rename_dst_nr == 0 || rename_src_nr == 0 ||
-	    (0 < rename_limit && rename_limit < rename_dst_nr))
+	if (rename_dst_nr == 0 || rename_src_nr == 0)
 		goto cleanup; /* nothing to do */
 
+	/*
+	 * This basically does a test for the rename matrix not
+	 * growing larger than a "rename_limit" square matrix, ie:
+	 *
+	 *    rename_dst_nr * rename_src_nr > rename_limit * rename_limit
+	 *
+	 * but handles the potential overflow case specially (and we
+	 * assume at least 32-bit integers)
+	 */
+	if (rename_limit <= 0 || rename_limit > 32767)
+		rename_limit = 32767;
+	if (rename_dst_nr > rename_limit && rename_src_nr > rename_limit)
+		goto cleanup;
+	if (rename_dst_nr * rename_src_nr > rename_limit * rename_limit)
+		goto cleanup;
+
 	/* We really want to cull the candidates list early
 	 * with cheap tests in order to avoid doing deltas.
 	 * The first round matches up the up-to-date entries,
diff --git a/wt-status.c b/wt-status.c
index 5205420..10ce6ee 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -227,6 +227,7 @@ static void wt_status_print_updated(struct wt_status *s)
 	rev.diffopt.format_callback = wt_status_print_updated_cb;
 	rev.diffopt.format_callback_data = s;
 	rev.diffopt.detect_rename = 1;
+	rev.diffopt.rename_limit = 100;
 	wt_read_cache(s);
 	run_diff_index(&rev, 1);
 }

  reply	other threads:[~2007-09-14 17:40 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-09-05 23:49 [PATCH 1/2] git-commit: Disallow unchanged tree in non-merge mode Dmitry V. Levin
2007-09-06  2:25 ` Shawn O. Pearce
2007-09-06 10:16   ` Dmitry V. Levin
     [not found]     ` <20070909044648.GH18160@spearce.org>
     [not found]       ` <7vir6fjmuv.fsf@gitster.siamese.dyndns.org>
     [not found]         ` <20070913035137.GM3099@spearce.org>
     [not found]           ` <7vr6l2gxyw.fsf@gitster.siamese.dyndns.org>
     [not found]             ` <20070914000108.GE3619@basalt.office.altlinux.org>
     [not found]               ` <7vr6l2f6k1.fsf@gitster.siamese.dyndns.org>
     [not found]                 ` <alpine.LFD.0.999.0709131850060.16478@woody.linux-foundation.org>
     [not found]                   ` <20070914024303.GH3619@basalt.office.altlinux.org>
     [not found]                     ` <alpine.LFD.0.999.0709132123570.16478@woody.linux-foundation.org>
2007-09-14 17:14                       ` Linus Torvalds
2007-09-14 17:17                         ` [PATCH 1/2] Fix "git diff" setup code Linus Torvalds
2007-09-14 17:39                           ` Linus Torvalds [this message]
2007-09-14 18:44                             ` [PATCH 2/2] Fix the rename detection limit checking Linus Torvalds
2007-09-14 18:49                               ` Linus Torvalds
2007-09-14 18:19                           ` [PATCH 1/2] Fix "git diff" setup code Junio C Hamano
2007-09-14 18:30                             ` Linus Torvalds
2007-09-14 19:11                               ` Junio C Hamano
2007-09-14 19:46                                 ` Linus Torvalds
     [not found]                       ` <20070914050410.GA11402@coredump.intra.peff.net>
     [not found]                         ` <alpine.LFD.0.999.0709132215250.16478@woody.linux-foundation.org>
2007-09-14 18:24                           ` [PATCH 1/2] git-commit: Disallow unchanged tree in non-merge mode Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LFD.0.999.0709141017450.16478@woody.linux-foundation.org \
    --to=torvalds@linux-foundation.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=ldv@altlinux.org \
    --cc=peff@peff.net \
    --cc=spearce@spearce.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).