git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v2 0/4] Fix issues with rename detection limits
@ 2017-11-13 20:15 Elijah Newren
  2017-11-13 20:15 ` [PATCH v2 1/4] sequencer: warn when internal merge may be suboptimal due to renameLimit Elijah Newren
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Elijah Newren @ 2017-11-13 20:15 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

This patchset fixes some issues around rename detection limits.  Changes
since the original submission[1]:
  * Patches 2 and 3 are swapped in order so as to not temporarily introduce
    any bugs (even if only cosmetic)
  * Commit message fixups
  * Using uint64_t instead of double for holding multiplication of integers
  * Corrected format specifier for printing 64-bit ints.

[1] https://public-inbox.org/git/20171110173956.25105-1-newren@gmail.com/

Elijah Newren (4):
  sequencer: warn when internal merge may be suboptimal due to
    renameLimit
  progress: fix progress meters when dealing with lots of work
  Remove silent clamp of renameLimit
  sequencer: show rename progress during cherry picks

 diff.c            |  2 +-
 diffcore-rename.c | 15 ++++++---------
 progress.c        | 22 +++++++++++-----------
 progress.h        |  8 ++++----
 sequencer.c       |  2 ++
 5 files changed, 24 insertions(+), 25 deletions(-)

-- 
2.15.0.44.gf995e411c7


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 1/4] sequencer: warn when internal merge may be suboptimal due to renameLimit
  2017-11-13 20:15 [PATCH v2 0/4] Fix issues with rename detection limits Elijah Newren
@ 2017-11-13 20:15 ` Elijah Newren
  2017-11-13 20:15 ` [PATCH v2 2/4] progress: fix progress meters when dealing with lots of work Elijah Newren
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Elijah Newren @ 2017-11-13 20:15 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

When many files were renamed, the recursive merge strategy stopped
detecting renames and left many paths with delete/modify conflicts,
without any warning about what was going on or providing any hints about
how to tell Git to spend more cycles to detect renames.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 sequencer.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sequencer.c b/sequencer.c
index f2a10cc4f2..2b4cecb617 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -462,6 +462,7 @@ static int do_recursive_merge(struct commit *base, struct commit *next,
 	if (is_rebase_i(opts) && clean <= 0)
 		fputs(o.obuf.buf, stdout);
 	strbuf_release(&o.obuf);
+	diff_warn_rename_limit("merge.renamelimit", o.needed_rename_limit, 0);
 	if (clean < 0)
 		return clean;
 
-- 
2.15.0.44.gf995e411c7


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 2/4] progress: fix progress meters when dealing with lots of work
  2017-11-13 20:15 [PATCH v2 0/4] Fix issues with rename detection limits Elijah Newren
  2017-11-13 20:15 ` [PATCH v2 1/4] sequencer: warn when internal merge may be suboptimal due to renameLimit Elijah Newren
@ 2017-11-13 20:15 ` Elijah Newren
  2017-11-13 20:28   ` Jonathan Tan
  2017-11-13 21:33   ` Jonathan Tan
  2017-11-13 20:15 ` [PATCH v2 3/4] Remove silent clamp of renameLimit Elijah Newren
  2017-11-13 20:16 ` [PATCH v2 4/4] sequencer: show rename progress during cherry picks Elijah Newren
  3 siblings, 2 replies; 8+ messages in thread
From: Elijah Newren @ 2017-11-13 20:15 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

The possibility of setting merge.renameLimit beyond 2^16 raises the
possibility that the values passed to progress can exceed 2^32.
Use uint64_t, because it "ought to be enough for anybody".  :-)

Signed-off-by: Elijah Newren <newren@gmail.com>
---
This does imply 64-bit math for all progress operations.  Possible alternatives
I could think of listed at
https://public-inbox.org/git/CABPp-BH1Cpc9UfYpmBBAHWSqadg=QuD=28qx1oV29ZdvF4NbJw@mail.gmail.com/
Opinions of others on whether 64-bit is okay, or preference for which alternative
is picked?

 diffcore-rename.c |  4 ++--
 progress.c        | 22 +++++++++++-----------
 progress.h        |  8 ++++----
 3 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/diffcore-rename.c b/diffcore-rename.c
index 0d8c3d2ee4..c03f484d53 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -534,7 +534,7 @@ void diffcore_rename(struct diff_options *options)
 	if (options->show_rename_progress) {
 		progress = start_delayed_progress(
 				_("Performing inexact rename detection"),
-				rename_dst_nr * rename_src_nr);
+				(uint64_t)rename_dst_nr * (uint64_t)rename_src_nr);
 	}
 
 	mx = xcalloc(st_mult(NUM_CANDIDATE_PER_DST, num_create), sizeof(*mx));
@@ -571,7 +571,7 @@ void diffcore_rename(struct diff_options *options)
 			diff_free_filespec_blob(two);
 		}
 		dst_cnt++;
-		display_progress(progress, (i+1)*rename_src_nr);
+		display_progress(progress, (uint64_t)(i+1)*(uint64_t)rename_src_nr);
 	}
 	stop_progress(&progress);
 
diff --git a/progress.c b/progress.c
index 289678d43d..837d3a8eb8 100644
--- a/progress.c
+++ b/progress.c
@@ -30,8 +30,8 @@ struct throughput {
 
 struct progress {
 	const char *title;
-	int last_value;
-	unsigned total;
+	uint64_t last_value;
+	uint64_t total;
 	unsigned last_percent;
 	unsigned delay;
 	unsigned delayed_percent_threshold;
@@ -79,7 +79,7 @@ static int is_foreground_fd(int fd)
 	return tpgrp < 0 || tpgrp == getpgid(0);
 }
 
-static int display(struct progress *progress, unsigned n, const char *done)
+static int display(struct progress *progress, uint64_t n, const char *done)
 {
 	const char *eol, *tp;
 
@@ -106,7 +106,7 @@ static int display(struct progress *progress, unsigned n, const char *done)
 		if (percent != progress->last_percent || progress_update) {
 			progress->last_percent = percent;
 			if (is_foreground_fd(fileno(stderr)) || done) {
-				fprintf(stderr, "%s: %3u%% (%u/%u)%s%s",
+				fprintf(stderr, "%s: %3u%% (%"PRIuMAX"/%"PRIuMAX")%s%s",
 					progress->title, percent, n,
 					progress->total, tp, eol);
 				fflush(stderr);
@@ -116,7 +116,7 @@ static int display(struct progress *progress, unsigned n, const char *done)
 		}
 	} else if (progress_update) {
 		if (is_foreground_fd(fileno(stderr)) || done) {
-			fprintf(stderr, "%s: %u%s%s",
+			fprintf(stderr, "%s: %"PRIuMAX"%s%s",
 				progress->title, n, tp, eol);
 			fflush(stderr);
 		}
@@ -127,7 +127,7 @@ static int display(struct progress *progress, unsigned n, const char *done)
 	return 0;
 }
 
-static void throughput_string(struct strbuf *buf, off_t total,
+static void throughput_string(struct strbuf *buf, uint64_t total,
 			      unsigned int rate)
 {
 	strbuf_reset(buf);
@@ -138,7 +138,7 @@ static void throughput_string(struct strbuf *buf, off_t total,
 	strbuf_addstr(buf, "/s");
 }
 
-void display_throughput(struct progress *progress, off_t total)
+void display_throughput(struct progress *progress, uint64_t total)
 {
 	struct throughput *tp;
 	uint64_t now_ns;
@@ -200,12 +200,12 @@ void display_throughput(struct progress *progress, off_t total)
 		display(progress, progress->last_value, NULL);
 }
 
-int display_progress(struct progress *progress, unsigned n)
+int display_progress(struct progress *progress, uint64_t n)
 {
 	return progress ? display(progress, n, NULL) : 0;
 }
 
-static struct progress *start_progress_delay(const char *title, unsigned total,
+static struct progress *start_progress_delay(const char *title, uint64_t total,
 					     unsigned percent_threshold, unsigned delay)
 {
 	struct progress *progress = malloc(sizeof(*progress));
@@ -227,12 +227,12 @@ static struct progress *start_progress_delay(const char *title, unsigned total,
 	return progress;
 }
 
-struct progress *start_delayed_progress(const char *title, unsigned total)
+struct progress *start_delayed_progress(const char *title, uint64_t total)
 {
 	return start_progress_delay(title, total, 0, 2);
 }
 
-struct progress *start_progress(const char *title, unsigned total)
+struct progress *start_progress(const char *title, uint64_t total)
 {
 	return start_progress_delay(title, total, 0, 0);
 }
diff --git a/progress.h b/progress.h
index 6392b63371..70a4d4a0d6 100644
--- a/progress.h
+++ b/progress.h
@@ -3,10 +3,10 @@
 
 struct progress;
 
-void display_throughput(struct progress *progress, off_t total);
-int display_progress(struct progress *progress, unsigned n);
-struct progress *start_progress(const char *title, unsigned total);
-struct progress *start_delayed_progress(const char *title, unsigned total);
+void display_throughput(struct progress *progress, uint64_t total);
+int display_progress(struct progress *progress, uint64_t n);
+struct progress *start_progress(const char *title, uint64_t total);
+struct progress *start_delayed_progress(const char *title, uint64_t total);
 void stop_progress(struct progress **progress);
 void stop_progress_msg(struct progress **progress, const char *msg);
 
-- 
2.15.0.44.gf995e411c7


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 3/4] Remove silent clamp of renameLimit
  2017-11-13 20:15 [PATCH v2 0/4] Fix issues with rename detection limits Elijah Newren
  2017-11-13 20:15 ` [PATCH v2 1/4] sequencer: warn when internal merge may be suboptimal due to renameLimit Elijah Newren
  2017-11-13 20:15 ` [PATCH v2 2/4] progress: fix progress meters when dealing with lots of work Elijah Newren
@ 2017-11-13 20:15 ` Elijah Newren
  2017-11-13 20:16 ` [PATCH v2 4/4] sequencer: show rename progress during cherry picks Elijah Newren
  3 siblings, 0 replies; 8+ messages in thread
From: Elijah Newren @ 2017-11-13 20:15 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

In commit 0024a5492 (Fix the rename detection limit checking; 2007-09-14),
the renameLimit was clamped to 32767.  This appears to have been to simply
avoid integer overflow in the following computation:

   num_create * num_src <= rename_limit * rename_limit

although it also could be viewed as a hardcoded bound on the amount of CPU
time we're willing to allow users to tell git to spend on handling
renames.  An upper bound may make sense, but unfortunately this upper
bound was neither communicated to the users, nor documented anywhere.

Although large limits can make things slow, we have users who would be
ecstatic to have a small five file change be correctly cherry picked even
if they have to manually specify a large limit and wait ten minutes for
the renames to be detected.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 diff.c            |  2 +-
 diffcore-rename.c | 11 ++++-------
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/diff.c b/diff.c
index 6fd288420b..c6597e3231 100644
--- a/diff.c
+++ b/diff.c
@@ -5524,7 +5524,7 @@ void diff_warn_rename_limit(const char *varname, int needed, int degraded_cc)
 		warning(_(rename_limit_warning));
 	else
 		return;
-	if (0 < needed && needed < 32767)
+	if (0 < needed)
 		warning(_(rename_limit_advice), varname, needed);
 }
 
diff --git a/diffcore-rename.c b/diffcore-rename.c
index c03f484d53..32366632f4 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -391,14 +391,10 @@ static int too_many_rename_candidates(int num_create,
 	 * growing larger than a "rename_limit" square matrix, ie:
 	 *
 	 *    num_create * num_src > rename_limit * rename_limit
-	 *
-	 * but handles the potential overflow case specially (and we
-	 * assume at least 32-bit integers)
 	 */
-	if (rename_limit <= 0 || rename_limit > 32767)
-		rename_limit = 32767;
 	if ((num_create <= rename_limit || num_src <= rename_limit) &&
-	    (num_create * num_src <= rename_limit * rename_limit))
+	    ((uint64_t)num_create * (uint64_t)num_src
+	     <= (uint64_t)rename_limit * (uint64_t)rename_limit))
 		return 0;
 
 	options->needed_rename_limit =
@@ -415,7 +411,8 @@ static int too_many_rename_candidates(int num_create,
 		num_src++;
 	}
 	if ((num_create <= rename_limit || num_src <= rename_limit) &&
-	    (num_create * num_src <= rename_limit * rename_limit))
+	    ((uint64_t)num_create * (uint64_t)num_src
+	     <= (uint64_t)rename_limit * (uint64_t)rename_limit))
 		return 2;
 	return 1;
 }
-- 
2.15.0.44.gf995e411c7


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 4/4] sequencer: show rename progress during cherry picks
  2017-11-13 20:15 [PATCH v2 0/4] Fix issues with rename detection limits Elijah Newren
                   ` (2 preceding siblings ...)
  2017-11-13 20:15 ` [PATCH v2 3/4] Remove silent clamp of renameLimit Elijah Newren
@ 2017-11-13 20:16 ` Elijah Newren
  3 siblings, 0 replies; 8+ messages in thread
From: Elijah Newren @ 2017-11-13 20:16 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

When trying to cherry-pick a change that has lots of renames, it is
somewhat unsettling to wait a really long time without any feedback.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 sequencer.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sequencer.c b/sequencer.c
index 2b4cecb617..247d93f363 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -448,6 +448,7 @@ static int do_recursive_merge(struct commit *base, struct commit *next,
 	o.branch2 = next ? next_label : "(empty tree)";
 	if (is_rebase_i(opts))
 		o.buffer_output = 2;
+	o.show_rename_progress = 1;
 
 	head_tree = parse_tree_indirect(head);
 	next_tree = next ? next->tree : empty_tree();
-- 
2.15.0.44.gf995e411c7


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/4] progress: fix progress meters when dealing with lots of work
  2017-11-13 20:15 ` [PATCH v2 2/4] progress: fix progress meters when dealing with lots of work Elijah Newren
@ 2017-11-13 20:28   ` Jonathan Tan
  2017-11-13 21:33   ` Jonathan Tan
  1 sibling, 0 replies; 8+ messages in thread
From: Jonathan Tan @ 2017-11-13 20:28 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

On Mon, 13 Nov 2017 12:15:58 -0800
Elijah Newren <newren@gmail.com> wrote:

> The possibility of setting merge.renameLimit beyond 2^16 raises the
> possibility that the values passed to progress can exceed 2^32.
> Use uint64_t, because it "ought to be enough for anybody".  :-)
> 
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
> This does imply 64-bit math for all progress operations.  Possible alternatives
> I could think of listed at
> https://public-inbox.org/git/CABPp-BH1Cpc9UfYpmBBAHWSqadg=QuD=28qx1oV29ZdvF4NbJw@mail.gmail.com/
> Opinions of others on whether 64-bit is okay, or preference for which alternative
> is picked?

I haven't looked into this in much detail, but another alternative to
consider is to use size_t everywhere. This also allows us to use st_add
and st_mult, which checks overflow for us.

Changing progress to use the size_t of the local machine makes sense to
me.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/4] progress: fix progress meters when dealing with lots of work
  2017-11-13 20:15 ` [PATCH v2 2/4] progress: fix progress meters when dealing with lots of work Elijah Newren
  2017-11-13 20:28   ` Jonathan Tan
@ 2017-11-13 21:33   ` Jonathan Tan
  2017-11-13 22:26     ` Elijah Newren
  1 sibling, 1 reply; 8+ messages in thread
From: Jonathan Tan @ 2017-11-13 21:33 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

On Mon, 13 Nov 2017 12:15:58 -0800
Elijah Newren <newren@gmail.com> wrote:

> -static int display(struct progress *progress, unsigned n, const char *done)
> +static int display(struct progress *progress, uint64_t n, const char *done)
>  {
>  	const char *eol, *tp;
>  
> @@ -106,7 +106,7 @@ static int display(struct progress *progress, unsigned n, const char *done)
>  		if (percent != progress->last_percent || progress_update) {
>  			progress->last_percent = percent;
>  			if (is_foreground_fd(fileno(stderr)) || done) {
> -				fprintf(stderr, "%s: %3u%% (%u/%u)%s%s",
> +				fprintf(stderr, "%s: %3u%% (%"PRIuMAX"/%"PRIuMAX")%s%s",
>  					progress->title, percent, n,
>  					progress->total, tp, eol);

I think it would be better to cast the appropriate arguments to
uintmax_t - searching through the Git code shows that we do that in
several situations. Same for the rest of the diff.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/4] progress: fix progress meters when dealing with lots of work
  2017-11-13 21:33   ` Jonathan Tan
@ 2017-11-13 22:26     ` Elijah Newren
  0 siblings, 0 replies; 8+ messages in thread
From: Elijah Newren @ 2017-11-13 22:26 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: Git Mailing List

Thanks for the reviews!

On Mon, Nov 13, 2017 at 1:33 PM, Jonathan Tan <jonathantanmy@google.com> wrote:
> On Mon, 13 Nov 2017 12:15:58 -0800
> Elijah Newren <newren@gmail.com> wrote:
>
>> -static int display(struct progress *progress, unsigned n, const char *done)
>> +static int display(struct progress *progress, uint64_t n, const char *done)
>>  {
>>       const char *eol, *tp;
>>
>> @@ -106,7 +106,7 @@ static int display(struct progress *progress, unsigned n, const char *done)
>>               if (percent != progress->last_percent || progress_update) {
>>                       progress->last_percent = percent;
>>                       if (is_foreground_fd(fileno(stderr)) || done) {
>> -                             fprintf(stderr, "%s: %3u%% (%u/%u)%s%s",
>> +                             fprintf(stderr, "%s: %3u%% (%"PRIuMAX"/%"PRIuMAX")%s%s",
>>                                       progress->title, percent, n,
>>                                       progress->total, tp, eol);
>
> I think it would be better to cast the appropriate arguments to
> uintmax_t - searching through the Git code shows that we do that in
> several situations. Same for the rest of the diff.

Interesting.  My first inclination was to ask why not just change the
variables to be of type uintmax_t instead of uint64_t (since we're
already changing their types already), and then get rid of the cast.
But I went digging through the source code based on your comment.
Almost all the existing examples in the codebase were off_t and size_t
values; there was only one case with uint64_t...but that one case led
me to commit 5be507fc95 (Use PRIuMAX instead of 'unsigned long long'
in show-index 2007-10-21), and that commit does suggest doing exactly
as you say here.

I'll fix it up.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2017-11-13 22:26 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-13 20:15 [PATCH v2 0/4] Fix issues with rename detection limits Elijah Newren
2017-11-13 20:15 ` [PATCH v2 1/4] sequencer: warn when internal merge may be suboptimal due to renameLimit Elijah Newren
2017-11-13 20:15 ` [PATCH v2 2/4] progress: fix progress meters when dealing with lots of work Elijah Newren
2017-11-13 20:28   ` Jonathan Tan
2017-11-13 21:33   ` Jonathan Tan
2017-11-13 22:26     ` Elijah Newren
2017-11-13 20:15 ` [PATCH v2 3/4] Remove silent clamp of renameLimit Elijah Newren
2017-11-13 20:16 ` [PATCH v2 4/4] sequencer: show rename progress during cherry picks Elijah Newren

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).