git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [WIP/PATCH] merge-recursive: option to specify rename threshold
@ 2010-09-22  6:03 Kevin Ballard
  2010-09-23  0:32 ` [PATCH] " Kevin Ballard
  0 siblings, 1 reply; 18+ messages in thread
From: Kevin Ballard @ 2010-09-22  6:03 UTC (permalink / raw
  To: git; +Cc: Kevin Ballard, Junio C Hamano

The recursive merge strategy turns on rename detection but leaves the
rename score at the default. Add a strategy option to allow the user
to specify a rename score to use.
---
The only thing I'm concerned about in this patch is the duplicated
parse_num() function. I'm inclined to take that function in diff.c,
rename it to something like parse_rename_score(), and declare it
in diff.h.

 merge-recursive.c |   43 +++++++++++++++++++++++++++++++++++++++++++
 merge-recursive.h |    1 +
 2 files changed, 44 insertions(+), 0 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index bf611ae..f8ff30e 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -334,6 +334,7 @@ static struct string_list *get_renames(struct merge_options *o,
 	opts.rename_limit = o->merge_rename_limit >= 0 ? o->merge_rename_limit :
 			    o->diff_rename_limit >= 0 ? o->diff_rename_limit :
 			    500;
+	opts.rename_score = o->rename_score;
 	opts.warn_on_too_large_rename = 1;
 	opts.output_format = DIFF_FORMAT_NO_OUTPUT;
 	if (diff_setup_done(&opts) < 0)
@@ -1552,6 +1553,43 @@ void init_merge_options(struct merge_options *o)
 	o->current_directory_set.strdup_strings = 1;
 }
 
+// XXX: copied from diff.c
+static int parse_num(const char **cp_p)
+{
+	unsigned long num, scale;
+	int ch, dot;
+	const char *cp = *cp_p;
+
+	num = 0;
+	scale = 1;
+	dot = 0;
+	for (;;) {
+		ch = *cp;
+		if ( !dot && ch == '.' ) {
+			scale = 1;
+			dot = 1;
+		} else if ( ch == '%' ) {
+			scale = dot ? scale*100 : 100;
+			cp++;	/* % is always at the end */
+			break;
+		} else if ( ch >= '0' && ch <= '9' ) {
+			if ( scale < 100000 ) {
+				scale *= 10;
+				num = (num*10) + (ch-'0');
+			}
+		} else {
+			break;
+		}
+		cp++;
+	}
+	*cp_p = cp;
+
+	/* user says num divided by scale and we say internally that
+	 * is MAX_SCORE * num / scale.
+	 */
+	return (int)((num >= scale) ? MAX_SCORE : (MAX_SCORE * num / scale));
+}
+
 int parse_merge_opt(struct merge_options *o, const char *s)
 {
 	if (!s || !*s)
@@ -1576,6 +1614,11 @@ int parse_merge_opt(struct merge_options *o, const char *s)
 		o->renormalize = 1;
 	else if (!strcmp(s, "no-renormalize"))
 		o->renormalize = 0;
+	else if (!prefixcmp(s, "rename-score=")) {
+		const char *score = s + strlen("rename-score=");
+		if ((o->rename_score = parse_num(&score)) == -1 || *score != 0)
+			return -1;
+	}
 	else
 		return -1;
 	return 0;
diff --git a/merge-recursive.h b/merge-recursive.h
index 2eb5d1a..c8135b0 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -19,6 +19,7 @@ struct merge_options {
 	int verbosity;
 	int diff_rename_limit;
 	int merge_rename_limit;
+	int rename_score;
 	int call_depth;
 	struct strbuf obuf;
 	struct string_list current_file_set;
-- 
1.7.3.237.ge0a7

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH] merge-recursive: option to specify rename threshold
  2010-09-22  6:03 [WIP/PATCH] merge-recursive: option to specify rename threshold Kevin Ballard
@ 2010-09-23  0:32 ` Kevin Ballard
  2010-09-23  0:38   ` Kevin Ballard
  0 siblings, 1 reply; 18+ messages in thread
From: Kevin Ballard @ 2010-09-23  0:32 UTC (permalink / raw
  To: git; +Cc: Kevin Ballard, Junio C Hamano

The recursive merge strategy turns on rename detection but leaves the
rename score at the default. Add a strategy option to allow the user
to specify a rename score to use.

Signed-off-by: Kevin Ballard <kevin@sb.org>
---
As near as I can tell, there are no tests that deal with rename score.
Given this, I did not attempt to construct my own, as I fear such a test
would be far more complicated than the change itself.

 Documentation/merge-strategies.txt |    4 ++++
 diff.c                             |    6 +++---
 diff.h                             |    2 ++
 merge-recursive.c                  |    6 ++++++
 merge-recursive.h                  |    1 +
 5 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/Documentation/merge-strategies.txt b/Documentation/merge-strategies.txt
index 91faba5..05eb8f8 100644
--- a/Documentation/merge-strategies.txt
+++ b/Documentation/merge-strategies.txt
@@ -74,6 +74,10 @@ no-renormalize;;
 	Disables the `renormalize` option.  This overrides the
 	`merge.renormalize` configuration variable.
 
+rename-score=<n>;;
+	Controls the similarity threshold used for rename detection.
+	See also linkgit:git-diff[1] `-M`.
+
 subtree[=path];;
 	This option is a more advanced form of 'subtree' strategy, where
 	the strategy makes a guess on how two trees must be shifted to
diff --git a/diff.c b/diff.c
index a7d15e5..da88704 100644
--- a/diff.c
+++ b/diff.c
@@ -3323,7 +3323,7 @@ int diff_opt_parse(struct diff_options *options, const char **av, int ac)
 	return 1;
 }
 
-static int parse_num(const char **cp_p)
+int parse_rename_score(const char **cp_p)
 {
 	unsigned long num, scale;
 	int ch, dot;
@@ -3369,7 +3369,7 @@ static int diff_scoreopt_parse(const char *opt)
 	if (cmd != 'M' && cmd != 'C' && cmd != 'B')
 		return -1; /* that is not a -M, -C nor -B option */
 
-	opt1 = parse_num(&opt);
+	opt1 = parse_rename_score(&opt);
 	if (cmd != 'B')
 		opt2 = 0;
 	else {
@@ -3379,7 +3379,7 @@ static int diff_scoreopt_parse(const char *opt)
 			return -1; /* we expect -B80/99 or -B80 */
 		else {
 			opt++;
-			opt2 = parse_num(&opt);
+			opt2 = parse_rename_score(&opt);
 		}
 	}
 	if (*opt != 0)
diff --git a/diff.h b/diff.h
index e17383c..1a263e9 100644
--- a/diff.h
+++ b/diff.h
@@ -332,4 +332,6 @@ extern void emit_line(struct diff_options *o, const char *set, const char *reset
 
 extern char *quote_two(const char *one, const char *two);
 
+extern int parse_rename_score(const char **cp_p);
+
 #endif /* DIFF_H */
diff --git a/merge-recursive.c b/merge-recursive.c
index bf611ae..4d131da 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -334,6 +334,7 @@ static struct string_list *get_renames(struct merge_options *o,
 	opts.rename_limit = o->merge_rename_limit >= 0 ? o->merge_rename_limit :
 			    o->diff_rename_limit >= 0 ? o->diff_rename_limit :
 			    500;
+	opts.rename_score = o->rename_score;
 	opts.warn_on_too_large_rename = 1;
 	opts.output_format = DIFF_FORMAT_NO_OUTPUT;
 	if (diff_setup_done(&opts) < 0)
@@ -1576,6 +1577,11 @@ int parse_merge_opt(struct merge_options *o, const char *s)
 		o->renormalize = 1;
 	else if (!strcmp(s, "no-renormalize"))
 		o->renormalize = 0;
+	else if (!prefixcmp(s, "rename-score=")) {
+		const char *score = s + strlen("rename-score=");
+		if ((o->rename_score = parse_rename_score(&score)) == -1 || *score != 0)
+			return -1;
+	}
 	else
 		return -1;
 	return 0;
diff --git a/merge-recursive.h b/merge-recursive.h
index 2eb5d1a..c8135b0 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -19,6 +19,7 @@ struct merge_options {
 	int verbosity;
 	int diff_rename_limit;
 	int merge_rename_limit;
+	int rename_score;
 	int call_depth;
 	struct strbuf obuf;
 	struct string_list current_file_set;
-- 
1.7.3.237.g22e9

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] merge-recursive: option to specify rename threshold
  2010-09-23  0:32 ` [PATCH] " Kevin Ballard
@ 2010-09-23  0:38   ` Kevin Ballard
  2010-09-23  0:45     ` Kevin Ballard
  0 siblings, 1 reply; 18+ messages in thread
From: Kevin Ballard @ 2010-09-23  0:38 UTC (permalink / raw
  To: Git Mailing List; +Cc: Junio C Hamano

Ignore this patch, I just discovered that I was still operating on pre-reset next and it doesn't apply cleanly on top of the current tip.

-Kevin Ballard

On Sep 22, 2010, at 5:32 PM, Kevin Ballard wrote:

> The recursive merge strategy turns on rename detection but leaves the
> rename score at the default. Add a strategy option to allow the user
> to specify a rename score to use.
> 
> Signed-off-by: Kevin Ballard <kevin@sb.org>
> ---
> As near as I can tell, there are no tests that deal with rename score.
> Given this, I did not attempt to construct my own, as I fear such a test
> would be far more complicated than the change itself.
> 
> Documentation/merge-strategies.txt |    4 ++++
> diff.c                             |    6 +++---
> diff.h                             |    2 ++
> merge-recursive.c                  |    6 ++++++
> merge-recursive.h                  |    1 +
> 5 files changed, 16 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/merge-strategies.txt b/Documentation/merge-strategies.txt
> index 91faba5..05eb8f8 100644
> --- a/Documentation/merge-strategies.txt
> +++ b/Documentation/merge-strategies.txt
> @@ -74,6 +74,10 @@ no-renormalize;;
> 	Disables the `renormalize` option.  This overrides the
> 	`merge.renormalize` configuration variable.
> 
> +rename-score=<n>;;
> +	Controls the similarity threshold used for rename detection.
> +	See also linkgit:git-diff[1] `-M`.
> +
> subtree[=path];;
> 	This option is a more advanced form of 'subtree' strategy, where
> 	the strategy makes a guess on how two trees must be shifted to
> diff --git a/diff.c b/diff.c
> index a7d15e5..da88704 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -3323,7 +3323,7 @@ int diff_opt_parse(struct diff_options *options, const char **av, int ac)
> 	return 1;
> }
> 
> -static int parse_num(const char **cp_p)
> +int parse_rename_score(const char **cp_p)
> {
> 	unsigned long num, scale;
> 	int ch, dot;
> @@ -3369,7 +3369,7 @@ static int diff_scoreopt_parse(const char *opt)
> 	if (cmd != 'M' && cmd != 'C' && cmd != 'B')
> 		return -1; /* that is not a -M, -C nor -B option */
> 
> -	opt1 = parse_num(&opt);
> +	opt1 = parse_rename_score(&opt);
> 	if (cmd != 'B')
> 		opt2 = 0;
> 	else {
> @@ -3379,7 +3379,7 @@ static int diff_scoreopt_parse(const char *opt)
> 			return -1; /* we expect -B80/99 or -B80 */
> 		else {
> 			opt++;
> -			opt2 = parse_num(&opt);
> +			opt2 = parse_rename_score(&opt);
> 		}
> 	}
> 	if (*opt != 0)
> diff --git a/diff.h b/diff.h
> index e17383c..1a263e9 100644
> --- a/diff.h
> +++ b/diff.h
> @@ -332,4 +332,6 @@ extern void emit_line(struct diff_options *o, const char *set, const char *reset
> 
> extern char *quote_two(const char *one, const char *two);
> 
> +extern int parse_rename_score(const char **cp_p);
> +
> #endif /* DIFF_H */
> diff --git a/merge-recursive.c b/merge-recursive.c
> index bf611ae..4d131da 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -334,6 +334,7 @@ static struct string_list *get_renames(struct merge_options *o,
> 	opts.rename_limit = o->merge_rename_limit >= 0 ? o->merge_rename_limit :
> 			    o->diff_rename_limit >= 0 ? o->diff_rename_limit :
> 			    500;
> +	opts.rename_score = o->rename_score;
> 	opts.warn_on_too_large_rename = 1;
> 	opts.output_format = DIFF_FORMAT_NO_OUTPUT;
> 	if (diff_setup_done(&opts) < 0)
> @@ -1576,6 +1577,11 @@ int parse_merge_opt(struct merge_options *o, const char *s)
> 		o->renormalize = 1;
> 	else if (!strcmp(s, "no-renormalize"))
> 		o->renormalize = 0;
> +	else if (!prefixcmp(s, "rename-score=")) {
> +		const char *score = s + strlen("rename-score=");
> +		if ((o->rename_score = parse_rename_score(&score)) == -1 || *score != 0)
> +			return -1;
> +	}
> 	else
> 		return -1;
> 	return 0;
> diff --git a/merge-recursive.h b/merge-recursive.h
> index 2eb5d1a..c8135b0 100644
> --- a/merge-recursive.h
> +++ b/merge-recursive.h
> @@ -19,6 +19,7 @@ struct merge_options {
> 	int verbosity;
> 	int diff_rename_limit;
> 	int merge_rename_limit;
> +	int rename_score;
> 	int call_depth;
> 	struct strbuf obuf;
> 	struct string_list current_file_set;
> -- 
> 1.7.3.237.g22e9
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH] merge-recursive: option to specify rename threshold
  2010-09-23  0:38   ` Kevin Ballard
@ 2010-09-23  0:45     ` Kevin Ballard
  2010-09-27  4:11       ` Junio C Hamano
  0 siblings, 1 reply; 18+ messages in thread
From: Kevin Ballard @ 2010-09-23  0:45 UTC (permalink / raw
  To: git; +Cc: Kevin Ballard, Junio C Hamano

The recursive merge strategy turns on rename detection but leaves the
rename score at the default. Add a strategy option to allow the user
to specify a rename score to use.

Signed-off-by: Kevin Ballard <kevin@sb.org>
---
This patch was generated off of the tip of the next branch.

As with the previous version, there are no tests included here. There seem
to be no tests for the rename score in general, and I was not prepared to
try to create my own.

 Documentation/merge-strategies.txt |    4 ++++
 diff.c                             |    6 +++---
 diff.h                             |    2 ++
 merge-recursive.c                  |    6 ++++++
 merge-recursive.h                  |    1 +
 5 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/Documentation/merge-strategies.txt b/Documentation/merge-strategies.txt
index 91faba5..05eb8f8 100644
--- a/Documentation/merge-strategies.txt
+++ b/Documentation/merge-strategies.txt
@@ -74,6 +74,10 @@ no-renormalize;;
 	Disables the `renormalize` option.  This overrides the
 	`merge.renormalize` configuration variable.
 
+rename-score=<n>;;
+	Controls the similarity threshold used for rename detection.
+	See also linkgit:git-diff[1] `-M`.
+
 subtree[=path];;
 	This option is a more advanced form of 'subtree' strategy, where
 	the strategy makes a guess on how two trees must be shifted to
diff --git a/diff.c b/diff.c
index cc73061..d862234 100644
--- a/diff.c
+++ b/diff.c
@@ -3323,7 +3323,7 @@ int diff_opt_parse(struct diff_options *options, const char **av, int ac)
 	return 1;
 }
 
-static int parse_num(const char **cp_p)
+int parse_rename_score(const char **cp_p)
 {
 	unsigned long num, scale;
 	int ch, dot;
@@ -3369,7 +3369,7 @@ static int diff_scoreopt_parse(const char *opt)
 	if (cmd != 'M' && cmd != 'C' && cmd != 'B')
 		return -1; /* that is not a -M, -C nor -B option */
 
-	opt1 = parse_num(&opt);
+	opt1 = parse_rename_score(&opt);
 	if (cmd != 'B')
 		opt2 = 0;
 	else {
@@ -3379,7 +3379,7 @@ static int diff_scoreopt_parse(const char *opt)
 			return -1; /* we expect -B80/99 or -B80 */
 		else {
 			opt++;
-			opt2 = parse_num(&opt);
+			opt2 = parse_rename_score(&opt);
 		}
 	}
 	if (*opt != 0)
diff --git a/diff.h b/diff.h
index 1fd44f5..0083d92 100644
--- a/diff.h
+++ b/diff.h
@@ -315,4 +315,6 @@ extern size_t fill_textconv(struct userdiff_driver *driver,
 
 extern struct userdiff_driver *get_textconv(struct diff_filespec *one);
 
+extern int parse_rename_score(const char **cp_p);
+
 #endif /* DIFF_H */
diff --git a/merge-recursive.c b/merge-recursive.c
index 325a97b..2e3ef44 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -334,6 +334,7 @@ static struct string_list *get_renames(struct merge_options *o,
 	opts.rename_limit = o->merge_rename_limit >= 0 ? o->merge_rename_limit :
 			    o->diff_rename_limit >= 0 ? o->diff_rename_limit :
 			    500;
+	opts.rename_score = o->rename_score;
 	opts.warn_on_too_large_rename = 1;
 	opts.output_format = DIFF_FORMAT_NO_OUTPUT;
 	if (diff_setup_done(&opts) < 0)
@@ -1576,6 +1577,11 @@ int parse_merge_opt(struct merge_options *o, const char *s)
 		o->renormalize = 1;
 	else if (!strcmp(s, "no-renormalize"))
 		o->renormalize = 0;
+	else if (!prefixcmp(s, "rename-score=")) {
+		const char *score = s + strlen("rename-score=");
+		if ((o->rename_score = parse_rename_score(&score)) == -1 || *score != 0)
+			return -1;
+	}
 	else
 		return -1;
 	return 0;
diff --git a/merge-recursive.h b/merge-recursive.h
index 2eb5d1a..c8135b0 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -19,6 +19,7 @@ struct merge_options {
 	int verbosity;
 	int diff_rename_limit;
 	int merge_rename_limit;
+	int rename_score;
 	int call_depth;
 	struct strbuf obuf;
 	struct string_list current_file_set;
-- 
1.7.3.68.ge6d63

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] merge-recursive: option to specify rename threshold
  2010-09-23  0:45     ` Kevin Ballard
@ 2010-09-27  4:11       ` Junio C Hamano
  2010-09-27  5:04         ` Kevin Ballard
  0 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2010-09-27  4:11 UTC (permalink / raw
  To: Kevin Ballard; +Cc: git

Kevin Ballard <kevin@sb.org> writes:

> The recursive merge strategy turns on rename detection but leaves the
> rename score at the default. Add a strategy option to allow the user
> to specify a rename score to use.

Sounds straightforward, except that Documentation/diff-options.txt seems
to call the number associated with -M "threshold", not "score".  The title
of the patch incidentally says threshold as well ;-)

At the end-user level, this new option to merge-recursive has exactly the
same meaning as existing -M given to "diff" family; people would probably
want to see it made available as a synonym to "diff" family as well, no?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] merge-recursive: option to specify rename threshold
  2010-09-27  4:11       ` Junio C Hamano
@ 2010-09-27  5:04         ` Kevin Ballard
  2010-09-27  5:24           ` Junio C Hamano
  0 siblings, 1 reply; 18+ messages in thread
From: Kevin Ballard @ 2010-09-27  5:04 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git

On Sep 26, 2010, at 9:11 PM, Junio C Hamano wrote:

> Kevin Ballard <kevin@sb.org> writes:
> 
>> The recursive merge strategy turns on rename detection but leaves the
>> rename score at the default. Add a strategy option to allow the user
>> to specify a rename score to use.
> 
> Sounds straightforward, except that Documentation/diff-options.txt seems
> to call the number associated with -M "threshold", not "score".  The title
> of the patch incidentally says threshold as well ;-)

It says "threshold" because that's how the -M switch to git-diff is documented. The merge strategy option is called "rename-score" partially because that's what it's called internally, and partially because it's just an easier name to remember/type. I have no objections to calling it "rename-threshold" if you think that's better.

> At the end-user level, this new option to merge-recursive has exactly the
> same meaning as existing -M given to "diff" family; people would probably
> want to see it made available as a synonym to "diff" family as well, no?

You mean so you can type `git diff --rename-score=50% foo`? A reasonable suggestion, but then what do we do with -B and -C? It doesn't make much sense to give a longer name to only one of the three options. This patch was concerned with simply exposing the functionality to the merge strategy and doesn't attempt to address the problem of providing long names for this trio of options.

-Kevin Ballard

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] merge-recursive: option to specify rename threshold
  2010-09-27  5:04         ` Kevin Ballard
@ 2010-09-27  5:24           ` Junio C Hamano
  2010-09-27  5:34             ` Kevin Ballard
  2010-09-27 22:01             ` Kevin Ballard
  0 siblings, 2 replies; 18+ messages in thread
From: Junio C Hamano @ 2010-09-27  5:24 UTC (permalink / raw
  To: Kevin Ballard; +Cc: git

Kevin Ballard <kevin@sb.org> writes:

>> At the end-user level, this new option to merge-recursive has exactly the
>> same meaning as existing -M given to "diff" family; people would probably
>> want to see it made available as a synonym to "diff" family as well, no?
>
> You mean so you can type `git diff --rename-score=50% foo`? A reasonable
> suggestion, but then what do we do with -B and -C? It doesn't make much
> sense to give a longer name to only one of the three options. This patch
> was concerned with simply exposing the functionality to the merge
> strategy and doesn't attempt to address the problem of providing long
> names for this trio of options.

I would call them --break-threshold and --copy-threshold respectively.

I have been happy without long option names when we originally had only
short names, but some people seem to be able to be more explicit, so...

While we are at it, would it make sense to have "merge-recursive -M20" as
a shorthand as well?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] merge-recursive: option to specify rename threshold
  2010-09-27  5:24           ` Junio C Hamano
@ 2010-09-27  5:34             ` Kevin Ballard
  2010-09-27  6:04               ` Junio C Hamano
  2010-09-27 22:01             ` Kevin Ballard
  1 sibling, 1 reply; 18+ messages in thread
From: Kevin Ballard @ 2010-09-27  5:34 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git

On Sep 26, 2010, at 10:24 PM, Junio C Hamano wrote:

> Kevin Ballard <kevin@sb.org> writes:
> 
>>> At the end-user level, this new option to merge-recursive has exactly the
>>> same meaning as existing -M given to "diff" family; people would probably
>>> want to see it made available as a synonym to "diff" family as well, no?
>> 
>> You mean so you can type `git diff --rename-score=50% foo`? A reasonable
>> suggestion, but then what do we do with -B and -C? It doesn't make much
>> sense to give a longer name to only one of the three options. This patch
>> was concerned with simply exposing the functionality to the merge
>> strategy and doesn't attempt to address the problem of providing long
>> names for this trio of options.
> 
> I would call them --break-threshold and --copy-threshold respectively.
> 
> I have been happy without long option names when we originally had only
> short names, but some people seem to be able to be more explicit, so...

Fair enough. Expect that naming in the next iteration of the patch.

> While we are at it, would it make sense to have "merge-recursive -M20" as
> a shorthand as well?

So it would be invoked like `git merge -s recursive -X M20 foo`? Looks a bit odd to me. I can add that if you think it's worthwhile though.

-Kevin Ballard

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] merge-recursive: option to specify rename threshold
  2010-09-27  5:34             ` Kevin Ballard
@ 2010-09-27  6:04               ` Junio C Hamano
  0 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2010-09-27  6:04 UTC (permalink / raw
  To: Kevin Ballard; +Cc: Junio C Hamano, git

Kevin Ballard <kevin@sb.org> writes:

> On Sep 26, 2010, at 10:24 PM, Junio C Hamano wrote:
>
>> Kevin Ballard <kevin@sb.org> writes:
>> 
>>>> At the end-user level, this new option to merge-recursive has exactly the
>>>> same meaning as existing -M given to "diff" family; people would probably
>>>> want to see it made available as a synonym to "diff" family as well, no?
>>> 
>>> You mean so you can type `git diff --rename-score=50% foo`? A reasonable
>>> suggestion, but then what do we do with -B and -C? It doesn't make much
>>> sense to give a longer name to only one of the three options. This patch
>>> was concerned with simply exposing the functionality to the merge
>>> strategy and doesn't attempt to address the problem of providing long
>>> names for this trio of options.
>> 
>> I would call them --break-threshold and --copy-threshold respectively.
>> 
>> I have been happy without long option names when we originally had only
>> short names, but some people seem to be able to be more explicit, so...
>
> Fair enough. Expect that naming in the next iteration of the patch.
>
>> While we are at it, would it make sense to have "merge-recursive -M20" as
>> a shorthand as well?
>
> So it would be invoked like `git merge -s recursive -X M20 foo`? Looks a bit odd to me. I can add that if you think it's worthwhile though.

I agree it looks odd, too ;-)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] merge-recursive: option to specify rename threshold
  2010-09-27  5:24           ` Junio C Hamano
  2010-09-27  5:34             ` Kevin Ballard
@ 2010-09-27 22:01             ` Kevin Ballard
  2010-09-27 23:53               ` Jonathan Nieder
                                 ` (2 more replies)
  1 sibling, 3 replies; 18+ messages in thread
From: Kevin Ballard @ 2010-09-27 22:01 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git

On Sep 26, 2010, at 10:24 PM, Junio C Hamano wrote:

> Kevin Ballard <kevin@sb.org> writes:
> 
>>> At the end-user level, this new option to merge-recursive has exactly the
>>> same meaning as existing -M given to "diff" family; people would probably
>>> want to see it made available as a synonym to "diff" family as well, no?
>> 
>> You mean so you can type `git diff --rename-score=50% foo`? A reasonable
>> suggestion, but then what do we do with -B and -C? It doesn't make much
>> sense to give a longer name to only one of the three options. This patch
>> was concerned with simply exposing the functionality to the merge
>> strategy and doesn't attempt to address the problem of providing long
>> names for this trio of options.
> 
> I would call them --break-threshold and --copy-threshold respectively.
> 
> I have been happy without long option names when we originally had only
> short names, but some people seem to be able to be more explicit, so...

After taking a look at this, it raises another question. -B, -M, and -C all have optional arguments, but the long-form names don't seem to support that. `git diff --rename-threshold= foo` would work, but looks mighty odd, and if I make it support `git diff --rename-threshold foo` that would also work, but the name doesn't seem appropriate without the argument. Should I go ahead and support `git diff --rename-threshold foo` and just live with it looking weird, or do you have a better suggestion?

-Kevin Ballard

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] merge-recursive: option to specify rename threshold
  2010-09-27 22:01             ` Kevin Ballard
@ 2010-09-27 23:53               ` Jonathan Nieder
  2010-09-28  0:01                 ` Kevin Ballard
  2010-09-27 23:58               ` [PATCHv2 1/2] " Kevin Ballard
  2010-09-27 23:58               ` [PATCHv2 2/2] diff: add synonyms for -M, -C, -B Kevin Ballard
  2 siblings, 1 reply; 18+ messages in thread
From: Jonathan Nieder @ 2010-09-27 23:53 UTC (permalink / raw
  To: Kevin Ballard; +Cc: Junio C Hamano, git

Kevin Ballard wrote:

> After taking a look at this, it raises another question. -B, -M, and
> -C all have optional arguments, but the long-form names don't seem
> to support that.
[...]
>                       if I make it support `git diff
> --rename-threshold foo` that would also work, but the name doesn't
> seem appropriate without the argument.

Right --- with merge-recursive the argument doesn't need to be
optional, but with git diff it does.

How about

	--detect-renames=<threshold>
	--detect-copies=<threshold>
	--detect-rewrites=<threshold>/<threshold>

?

Ciao,
Jonathan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCHv2 1/2] merge-recursive: option to specify rename threshold
  2010-09-27 22:01             ` Kevin Ballard
  2010-09-27 23:53               ` Jonathan Nieder
@ 2010-09-27 23:58               ` Kevin Ballard
  2010-09-27 23:58               ` [PATCHv2 2/2] diff: add synonyms for -M, -C, -B Kevin Ballard
  2 siblings, 0 replies; 18+ messages in thread
From: Kevin Ballard @ 2010-09-27 23:58 UTC (permalink / raw
  To: git; +Cc: Kevin Ballard, Junio C Hamano

The recursive merge strategy turns on rename detection but leaves the
rename threshold at the default. Add a strategy option to allow the user
to specify a rename threshold to use.

Signed-off-by: Kevin Ballard <kevin@sb.org>
---
 Documentation/merge-strategies.txt |    4 ++++
 diff.c                             |    6 +++---
 diff.h                             |    2 ++
 merge-recursive.c                  |    6 ++++++
 merge-recursive.h                  |    1 +
 5 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/Documentation/merge-strategies.txt b/Documentation/merge-strategies.txt
index 91faba5..77f2606 100644
--- a/Documentation/merge-strategies.txt
+++ b/Documentation/merge-strategies.txt
@@ -74,6 +74,10 @@ no-renormalize;;
 	Disables the `renormalize` option.  This overrides the
 	`merge.renormalize` configuration variable.
 
+rename-threshold=<n>;;
+	Controls the similarity threshold used for rename detection.
+	See also linkgit:git-diff[1] `-M`.
+
 subtree[=path];;
 	This option is a more advanced form of 'subtree' strategy, where
 	the strategy makes a guess on how two trees must be shifted to
diff --git a/diff.c b/diff.c
index cc73061..d862234 100644
--- a/diff.c
+++ b/diff.c
@@ -3323,7 +3323,7 @@ int diff_opt_parse(struct diff_options *options, const char **av, int ac)
 	return 1;
 }
 
-static int parse_num(const char **cp_p)
+int parse_rename_score(const char **cp_p)
 {
 	unsigned long num, scale;
 	int ch, dot;
@@ -3369,7 +3369,7 @@ static int diff_scoreopt_parse(const char *opt)
 	if (cmd != 'M' && cmd != 'C' && cmd != 'B')
 		return -1; /* that is not a -M, -C nor -B option */
 
-	opt1 = parse_num(&opt);
+	opt1 = parse_rename_score(&opt);
 	if (cmd != 'B')
 		opt2 = 0;
 	else {
@@ -3379,7 +3379,7 @@ static int diff_scoreopt_parse(const char *opt)
 			return -1; /* we expect -B80/99 or -B80 */
 		else {
 			opt++;
-			opt2 = parse_num(&opt);
+			opt2 = parse_rename_score(&opt);
 		}
 	}
 	if (*opt != 0)
diff --git a/diff.h b/diff.h
index 1fd44f5..0083d92 100644
--- a/diff.h
+++ b/diff.h
@@ -315,4 +315,6 @@ extern size_t fill_textconv(struct userdiff_driver *driver,
 
 extern struct userdiff_driver *get_textconv(struct diff_filespec *one);
 
+extern int parse_rename_score(const char **cp_p);
+
 #endif /* DIFF_H */
diff --git a/merge-recursive.c b/merge-recursive.c
index 325a97b..875859f 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -334,6 +334,7 @@ static struct string_list *get_renames(struct merge_options *o,
 	opts.rename_limit = o->merge_rename_limit >= 0 ? o->merge_rename_limit :
 			    o->diff_rename_limit >= 0 ? o->diff_rename_limit :
 			    500;
+	opts.rename_score = o->rename_score;
 	opts.warn_on_too_large_rename = 1;
 	opts.output_format = DIFF_FORMAT_NO_OUTPUT;
 	if (diff_setup_done(&opts) < 0)
@@ -1576,6 +1577,11 @@ int parse_merge_opt(struct merge_options *o, const char *s)
 		o->renormalize = 1;
 	else if (!strcmp(s, "no-renormalize"))
 		o->renormalize = 0;
+	else if (!prefixcmp(s, "rename-threshold=")) {
+		const char *score = s + strlen("rename-threshold=");
+		if ((o->rename_score = parse_rename_score(&score)) == -1 || *score != 0)
+			return -1;
+	}
 	else
 		return -1;
 	return 0;
diff --git a/merge-recursive.h b/merge-recursive.h
index 2eb5d1a..c8135b0 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -19,6 +19,7 @@ struct merge_options {
 	int verbosity;
 	int diff_rename_limit;
 	int merge_rename_limit;
+	int rename_score;
 	int call_depth;
 	struct strbuf obuf;
 	struct string_list current_file_set;
-- 
1.7.3.72.g8af0.dirty

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCHv2 2/2] diff: add synonyms for -M, -C, -B
  2010-09-27 22:01             ` Kevin Ballard
  2010-09-27 23:53               ` Jonathan Nieder
  2010-09-27 23:58               ` [PATCHv2 1/2] " Kevin Ballard
@ 2010-09-27 23:58               ` Kevin Ballard
  2010-09-28 21:15                 ` Thell Fowler
  2 siblings, 1 reply; 18+ messages in thread
From: Kevin Ballard @ 2010-09-27 23:58 UTC (permalink / raw
  To: git; +Cc: Kevin Ballard, Junio C Hamano

Add new long-form options --detect-renames[=<n>], --detect-copies[=<n>],
and --break-rewrites[=[<n>][/<m>]] as synonyms for the -M, -C, and -B
options (respectively).

Signed-off-by: Kevin Ballard <kevin@sb.org>
---
After thinking about it, I decided that --rename-threshold doesn't make sense
as a long-form option because it doesn't make its meaning obvious when you
don't specify the optional argument. I also figured it doesn't need to match
exactly with the rename-threshold merge strategy option as merge-recursive
already turns on rename detection and the option just controls the threshold.
However, the option to git-diff actually turns on rename detection instead of
just controlling the threshold.

 Documentation/diff-options.txt |    3 +++
 diff.c                         |   25 ++++++++++++++++++++++---
 2 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index f77a0f8..a511529 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -207,6 +207,7 @@ endif::git-format-patch[]
 	digits can be specified with `--abbrev=<n>`.
 
 -B[<n>][/<m>]::
+--break-rewrites[=[<n>][/<m>]]::
 	Break complete rewrite changes into pairs of delete and
 	create. This serves two purposes:
 +
@@ -229,6 +230,7 @@ eligible for being picked up as a possible source of a rename to
 another file.
 
 -M[<n>]::
+--detect-renames[=<n>]::
 ifndef::git-log[]
 	Detect renames.
 endif::git-log[]
@@ -244,6 +246,7 @@ endif::git-log[]
 	hasn't changed.
 
 -C[<n>]::
+--detect-copies[=<n>]::
 	Detect copies as well as renames.  See also `--find-copies-harder`.
 	If `n` is specified, it has the same meaning as for `-M<n>`.
 
diff --git a/diff.c b/diff.c
index d862234..d8fcf06 100644
--- a/diff.c
+++ b/diff.c
@@ -3140,16 +3140,19 @@ int diff_opt_parse(struct diff_options *options, const char **av, int ac)
 		return stat_opt(options, av);
 
 	/* renames options */
-	else if (!prefixcmp(arg, "-B")) {
+	else if (!prefixcmp(arg, "-B") || !prefixcmp(arg, "--break-rewrites=") ||
+		 !strcmp(arg, "--break-rewrites")) {
 		if ((options->break_opt = diff_scoreopt_parse(arg)) == -1)
 			return -1;
 	}
-	else if (!prefixcmp(arg, "-M")) {
+	else if (!prefixcmp(arg, "-M") || !prefixcmp(arg, "--detect-renames=") ||
+		 !strcmp(arg, "--detect-renames")) {
 		if ((options->rename_score = diff_scoreopt_parse(arg)) == -1)
 			return -1;
 		options->detect_rename = DIFF_DETECT_RENAME;
 	}
-	else if (!prefixcmp(arg, "-C")) {
+	else if (!prefixcmp(arg, "-C") || !prefixcmp(arg, "--detect-copies=") ||
+		 !strcmp(arg, "--detect-copies")) {
 		if (options->detect_rename == DIFF_DETECT_COPY)
 			DIFF_OPT_SET(options, FIND_COPIES_HARDER);
 		if ((options->rename_score = diff_scoreopt_parse(arg)) == -1)
@@ -3366,6 +3369,22 @@ static int diff_scoreopt_parse(const char *opt)
 	if (*opt++ != '-')
 		return -1;
 	cmd = *opt++;
+	if (cmd == '-') {
+		/* convert the long-form arguments into short-form versions */
+		if (!prefixcmp(opt, "break-rewrites")) {
+			opt += strlen("break-rewrites");
+			if (*opt == 0 || *opt++ == '=')
+				cmd = 'B';
+		} else if (!prefixcmp(opt, "detect-copies")) {
+			opt += strlen("detect-copies");
+			if (*opt == 0 || *opt++ == '=')
+				cmd = 'C';
+		} else if (!prefixcmp(opt, "detect-renames")) {
+			opt += strlen("detect-renames");
+			if (*opt == 0 || *opt++ == '=')
+				cmd = 'M';
+		}
+	}
 	if (cmd != 'M' && cmd != 'C' && cmd != 'B')
 		return -1; /* that is not a -M, -C nor -B option */
 
-- 
1.7.3.72.g8af0.dirty

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] merge-recursive: option to specify rename threshold
  2010-09-27 23:53               ` Jonathan Nieder
@ 2010-09-28  0:01                 ` Kevin Ballard
  2010-09-28  0:08                   ` Jonathan Nieder
  0 siblings, 1 reply; 18+ messages in thread
From: Kevin Ballard @ 2010-09-28  0:01 UTC (permalink / raw
  To: Jonathan Nieder; +Cc: Junio C Hamano, git

On Sep 27, 2010, at 4:53 PM, Jonathan Nieder wrote:

> Kevin Ballard wrote:
> 
>> After taking a look at this, it raises another question. -B, -M, and
>> -C all have optional arguments, but the long-form names don't seem
>> to support that.
> [...]
>>                      if I make it support `git diff
>> --rename-threshold foo` that would also work, but the name doesn't
>> seem appropriate without the argument.
> 
> Right --- with merge-recursive the argument doesn't need to be
> optional, but with git diff it does.
> 
> How about
> 
> 	--detect-renames=<threshold>
> 	--detect-copies=<threshold>
> 	--detect-rewrites=<threshold>/<threshold>

Good timing, I just sent out a patch that does almost exactly this, though I went with --break-rewrites instead of --detect-rewrites.

-Kevin Ballard

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] merge-recursive: option to specify rename threshold
  2010-09-28  0:01                 ` Kevin Ballard
@ 2010-09-28  0:08                   ` Jonathan Nieder
  2010-09-28  0:14                     ` Kevin Ballard
  0 siblings, 1 reply; 18+ messages in thread
From: Jonathan Nieder @ 2010-09-28  0:08 UTC (permalink / raw
  To: Kevin Ballard; +Cc: Junio C Hamano, git

Kevin Ballard wrote:

> Good timing, I just sent out a patch that does almost exactly this,
> though I went with --break-rewrites instead of --detect-rewrites.

Both new patches look good to me, for what it's worth (though it
would be nicer to have tests, of course :)).

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] merge-recursive: option to specify rename threshold
  2010-09-28  0:08                   ` Jonathan Nieder
@ 2010-09-28  0:14                     ` Kevin Ballard
  2010-09-28  0:24                       ` Jonathan Nieder
  0 siblings, 1 reply; 18+ messages in thread
From: Kevin Ballard @ 2010-09-28  0:14 UTC (permalink / raw
  To: Jonathan Nieder; +Cc: Junio C Hamano, git

On Sep 27, 2010, at 5:08 PM, Jonathan Nieder wrote:

> Kevin Ballard wrote:
> 
>> Good timing, I just sent out a patch that does almost exactly this,
>> though I went with --break-rewrites instead of --detect-rewrites.
> 
> Both new patches look good to me, for what it's worth (though it
> would be nicer to have tests, of course :)).

I considered tests, and I looked at the existing ones. I am unable to find any tests that actually test setting the rename/copy score to anything other than the default, and I was a bit hesitant to add tests that simply checked to make sure the long-form option was parsed correctly, as that would just be duplicating existing tests that use the short-form arguments.

-Kevin Ballard

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] merge-recursive: option to specify rename threshold
  2010-09-28  0:14                     ` Kevin Ballard
@ 2010-09-28  0:24                       ` Jonathan Nieder
  0 siblings, 0 replies; 18+ messages in thread
From: Jonathan Nieder @ 2010-09-28  0:24 UTC (permalink / raw
  To: Kevin Ballard; +Cc: Junio C Hamano, git

Kevin Ballard wrote:

> I looked at the existing ones. I am unable to find any tests that
> actually test setting the rename/copy score to anything other than
> the default

Yep, a quick grep shows there is none.  I think the precise meaning
of the scores is subject to change, but a test for 1% should be
reliable enough. :)  (Or 0%, except that that is a magic number
with the current code.)

I can look into it tomorrow if no one else gets around to it before
then.

'night,
Jonathan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCHv2 2/2] diff: add synonyms for -M, -C, -B
  2010-09-27 23:58               ` [PATCHv2 2/2] diff: add synonyms for -M, -C, -B Kevin Ballard
@ 2010-09-28 21:15                 ` Thell Fowler
  0 siblings, 0 replies; 18+ messages in thread
From: Thell Fowler @ 2010-09-28 21:15 UTC (permalink / raw
  To: Kevin Ballard; +Cc: git, Junio C Hamano

Just wanted to throw support behind this patch as it is already helping 
our project to ease merging of a project that can't/won't modify some 
files that we have extensively altered.

Thanks for doing this Kevin!

Just to note:  We are using this applied to msysgit's devel branch by 
manually removing the diff.h chunk, applying the first patch, manually 
fixing diff.h, hen applying the second patch.  Which so far has worked 
just fine.

-- 
Thell

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2010-09-28 22:23 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-09-22  6:03 [WIP/PATCH] merge-recursive: option to specify rename threshold Kevin Ballard
2010-09-23  0:32 ` [PATCH] " Kevin Ballard
2010-09-23  0:38   ` Kevin Ballard
2010-09-23  0:45     ` Kevin Ballard
2010-09-27  4:11       ` Junio C Hamano
2010-09-27  5:04         ` Kevin Ballard
2010-09-27  5:24           ` Junio C Hamano
2010-09-27  5:34             ` Kevin Ballard
2010-09-27  6:04               ` Junio C Hamano
2010-09-27 22:01             ` Kevin Ballard
2010-09-27 23:53               ` Jonathan Nieder
2010-09-28  0:01                 ` Kevin Ballard
2010-09-28  0:08                   ` Jonathan Nieder
2010-09-28  0:14                     ` Kevin Ballard
2010-09-28  0:24                       ` Jonathan Nieder
2010-09-27 23:58               ` [PATCHv2 1/2] " Kevin Ballard
2010-09-27 23:58               ` [PATCHv2 2/2] diff: add synonyms for -M, -C, -B Kevin Ballard
2010-09-28 21:15                 ` Thell Fowler

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).