git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] config: Introduce --patience config variable
@ 2012-03-06 10:59 Michal Privoznik
  2012-03-06 11:49 ` Jeff King
  0 siblings, 1 reply; 23+ messages in thread
From: Michal Privoznik @ 2012-03-06 10:59 UTC (permalink / raw)
  To: git; +Cc: gitster

Some users like the patience diff more than the bare diff. However,
specifying the '--patience' argument every time diff is to be used
is impractical. Moreover, creating an alias doesn't play with other
tools nice, e.g. git-show; Therefore we need a configuration variable
to turn this on/off across whole git tools.
---
Please keep me CC'ed as I am not signed into list.

 Documentation/diff-config.txt |    3 +++
 diff.c                        |    9 +++++++++
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt
index 1aed79e..b72b2fd 100644
--- a/Documentation/diff-config.txt
+++ b/Documentation/diff-config.txt
@@ -86,6 +86,9 @@ diff.mnemonicprefix::
 diff.noprefix::
 	If set, 'git diff' does not show any source or destination prefix.
 
+diff.patience:
+    If set, 'git diff' will use patience algorithm.
+
 diff.renameLimit::
 	The number of files to consider when performing the copy/rename
 	detection; equivalent to the 'git diff' option '-l'.
diff --git a/diff.c b/diff.c
index a1c06b5..8940d00 100644
--- a/diff.c
+++ b/diff.c
@@ -33,6 +33,7 @@ static int diff_mnemonic_prefix;
 static int diff_no_prefix;
 static int diff_dirstat_permille_default = 30;
 static struct diff_options default_diff_options;
+static int diff_patience = 0;
 
 static char diff_colors[][COLOR_MAXLEN] = {
 	GIT_COLOR_RESET,
@@ -212,6 +213,11 @@ int git_diff_basic_config(const char *var, const char *value, void *cb)
 	if (!prefixcmp(var, "submodule."))
 		return parse_submodule_config_option(var, value);
 
+    if (!strcmp(var, "diff.patience")) {
+		diff_patience = git_config_bool(var, value);
+		return 0;
+    }
+
 	return git_default_config(var, value, cb);
 }
 
@@ -3202,6 +3208,9 @@ int diff_setup_done(struct diff_options *options)
 		DIFF_OPT_SET(options, EXIT_WITH_STATUS);
 	}
 
+    if (diff_patience)
+		DIFF_XDL_SET(options, PATIENCE_DIFF);
+
 	return 0;
 }
 
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH] config: Introduce --patience config variable
  2012-03-06 10:59 [PATCH] config: Introduce --patience config variable Michal Privoznik
@ 2012-03-06 11:49 ` Jeff King
  2012-03-06 13:01   ` Thomas Rast
  2012-03-07  2:57   ` Junio C Hamano
  0 siblings, 2 replies; 23+ messages in thread
From: Jeff King @ 2012-03-06 11:49 UTC (permalink / raw)
  To: Michal Privoznik; +Cc: git, gitster

On Tue, Mar 06, 2012 at 11:59:42AM +0100, Michal Privoznik wrote:

> Some users like the patience diff more than the bare diff. However,
> specifying the '--patience' argument every time diff is to be used
> is impractical. Moreover, creating an alias doesn't play with other
> tools nice, e.g. git-show; Therefore we need a configuration variable
> to turn this on/off across whole git tools.

The idea of turning on patience diff via config makes sense to me, and
it is not a problem for plumbing tools to have patience diff on, since
patience diffs are syntactically identical to non-patience diffs. So I
think the intent is good.

A few comments on the patch itself:

> --- a/Documentation/diff-config.txt
> +++ b/Documentation/diff-config.txt
> @@ -86,6 +86,9 @@ diff.mnemonicprefix::
>  diff.noprefix::
>  	If set, 'git diff' does not show any source or destination prefix.
>  
> +diff.patience:
> +    If set, 'git diff' will use patience algorithm.
> +

Should this be a boolean? Or should we actually have a diff.algorithm
option where you specify the algorithm you want (e.g., "diff.algorithm =
patience")? That would free us up later to more easily add new values.

In particular, I am thinking about --minimal. It is mutually exclusive
with --patience, and is simply ignored if you use patience diff.
we perhaps have "diff.algorithm" which can be one of "myers", "minimal"
(which is really myers + the minimal flag), and "patience".

Or I suppose you could think of it as "--minimal" being orthogonal to
the algorithm chosen, and it is simply that "--minimal" does nothing
(currently) with the patience algorithm.

> @@ -3202,6 +3208,9 @@ int diff_setup_done(struct diff_options *options)
>  		DIFF_OPT_SET(options, EXIT_WITH_STATUS);
>  	}
>  
> +    if (diff_patience)
> +		DIFF_XDL_SET(options, PATIENCE_DIFF);
> +
>  	return 0;

Hmm. Usually for a boolean config we would have "-1" mean "not
specified", and otherwise 0/1 for true/false. But in your case, setting
diff.patience to "false" will be the same as not setting it at all. I
think this is probably OK. "false" is the default, so you would only
want to specify it to override an earlier setting, but there is nothing
earlier than config that you could possibly be overriding.

Speaking of overriding, you may want to actually override the config
option from the command line. You probably should also add a
"--no-patience" option, so that one can turn off "diff.patience = true"
for a particular invocation.

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] config: Introduce --patience config variable
  2012-03-06 11:49 ` Jeff King
@ 2012-03-06 13:01   ` Thomas Rast
  2012-03-06 13:15     ` [PATCH 1/2] perf: compare diff algorithms Thomas Rast
                       ` (2 more replies)
  2012-03-07  2:57   ` Junio C Hamano
  1 sibling, 3 replies; 23+ messages in thread
From: Thomas Rast @ 2012-03-06 13:01 UTC (permalink / raw)
  To: Jeff King; +Cc: Michal Privoznik, git, gitster

Jeff King <peff@peff.net> writes:

> On Tue, Mar 06, 2012 at 11:59:42AM +0100, Michal Privoznik wrote:
>
>> --- a/Documentation/diff-config.txt
>> +++ b/Documentation/diff-config.txt
>> @@ -86,6 +86,9 @@ diff.mnemonicprefix::
>>  diff.noprefix::
>>  	If set, 'git diff' does not show any source or destination prefix.
>>  
>> +diff.patience:
>> +    If set, 'git diff' will use patience algorithm.
>> +
>
> Should this be a boolean? Or should we actually have a diff.algorithm
> option where you specify the algorithm you want (e.g., "diff.algorithm =
> patience")? That would free us up later to more easily add new values.
>
> In particular, I am thinking about --minimal. It is mutually exclusive
> with --patience, and is simply ignored if you use patience diff.
> we perhaps have "diff.algorithm" which can be one of "myers", "minimal"
> (which is really myers + the minimal flag), and "patience".

Don't forget "histogram".  I have no idea why it's not documented
(evidently 8c912eea slipped through the review cracks) but --histogram
is supported since 1.7.7.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 1/2] perf: compare diff algorithms
  2012-03-06 13:01   ` Thomas Rast
@ 2012-03-06 13:15     ` Thomas Rast
  2012-03-06 13:15       ` [PATCH 2/2] Document the --histogram diff option Thomas Rast
                         ` (2 more replies)
  2012-03-06 13:30     ` [PATCH] config: Introduce --patience config variable Jeff King
  2012-03-06 13:32     ` Michal Privoznik
  2 siblings, 3 replies; 23+ messages in thread
From: Thomas Rast @ 2012-03-06 13:15 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Michal Privoznik, Jeff King

8c912ee (teach --histogram to diff, 2011-07-12) claimed histogram diff
was faster than both Myers and patience.

We have since incorporated a performance testing framework, so add a
test that compares the various diff tasks performed in a real 'log -p'
workload.  This does indeed show that histogram diff slightly beats
Myers, while patience is much slower than the others.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
---

The 3000 is pretty arbitrary but makes for a nice test duration.

I'm reluctant to put numbers into the message, since the whole point
of the perf test framework is that you can easily get them too.  But
here's what I'm seeing:

  4000.1: log -3000 (baseline)          0.04(0.02+0.01)                                                     
  4000.2: log --raw -3000 (tree-only)   0.49(0.38+0.09)                                                     
  4000.3: log -p -3000 (Myers)          1.93(1.75+0.17)
  4000.4: log -p -3000 --histogram      1.90(1.74+0.15)
  4000.5: log -p -3000 --patience       2.25(2.07+0.16)

 t/perf/p4000-diff-algorithms.sh |   29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)
 create mode 100755 t/perf/p4000-diff-algorithms.sh

diff --git a/t/perf/p4000-diff-algorithms.sh b/t/perf/p4000-diff-algorithms.sh
new file mode 100755
index 0000000..d6e505c
--- /dev/null
+++ b/t/perf/p4000-diff-algorithms.sh
@@ -0,0 +1,29 @@
+#!/bin/sh
+
+test_description="Tests diff generation performance"
+
+. ./perf-lib.sh
+
+test_perf_default_repo
+
+test_perf 'log -3000 (baseline)' '
+	git log -1000 >/dev/null
+'
+
+test_perf 'log --raw -3000 (tree-only)' '
+	git log --raw -3000 >/dev/null
+'
+
+test_perf 'log -p -3000 (Myers)' '
+	git log -p -3000 >/dev/null
+'
+
+test_perf 'log -p -3000 --histogram' '
+	git log -p -3000 --histogram >/dev/null
+'
+
+test_perf 'log -p -3000 --patience' '
+	git log -p -3000 --patience >/dev/null
+'
+
+test_done
-- 
1.7.9.2.467.g7fee4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 2/2] Document the --histogram diff option
  2012-03-06 13:15     ` [PATCH 1/2] perf: compare diff algorithms Thomas Rast
@ 2012-03-06 13:15       ` Thomas Rast
  2012-03-06 19:57         ` Junio C Hamano
  2012-03-06 19:52       ` [PATCH 1/2] perf: compare diff algorithms Junio C Hamano
  2012-03-10  7:13       ` René Scharfe
  2 siblings, 1 reply; 23+ messages in thread
From: Thomas Rast @ 2012-03-06 13:15 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Michal Privoznik, Jeff King

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
---

This is only the minimal update.  I think in the long run, we should
add a note saying why we support all of them.  BUt off hand I didn't
have any substantial evidence in favour of patience that could be used
as an argument.

 Documentation/diff-options.txt |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 87f0a5f..7d4566f 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -52,6 +52,9 @@ endif::git-format-patch[]
 --patience::
 	Generate a diff using the "patience diff" algorithm.
 
+--histogram::
+	Generate a diff using the "histogram diff" algorithm.
+
 --stat[=<width>[,<name-width>[,<count>]]]::
 	Generate a diffstat. By default, as much space as necessary
 	will be used for the filename part, and the rest for the graph
-- 
1.7.9.2.467.g7fee4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH] config: Introduce --patience config variable
  2012-03-06 13:01   ` Thomas Rast
  2012-03-06 13:15     ` [PATCH 1/2] perf: compare diff algorithms Thomas Rast
@ 2012-03-06 13:30     ` Jeff King
  2012-03-06 13:32     ` Michal Privoznik
  2 siblings, 0 replies; 23+ messages in thread
From: Jeff King @ 2012-03-06 13:30 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Michal Privoznik, git, gitster

On Tue, Mar 06, 2012 at 02:01:42PM +0100, Thomas Rast wrote:

> >> --- a/Documentation/diff-config.txt
> >> +++ b/Documentation/diff-config.txt
> >> @@ -86,6 +86,9 @@ diff.mnemonicprefix::
> >>  diff.noprefix::
> >>  	If set, 'git diff' does not show any source or destination prefix.
> >>  
> >> +diff.patience:
> >> +    If set, 'git diff' will use patience algorithm.
> >> +
> >
> > Should this be a boolean? Or should we actually have a diff.algorithm
> > option where you specify the algorithm you want (e.g., "diff.algorithm =
> > patience")? That would free us up later to more easily add new values.
> >
> > In particular, I am thinking about --minimal. It is mutually exclusive
> > with --patience, and is simply ignored if you use patience diff.
> > we perhaps have "diff.algorithm" which can be one of "myers", "minimal"
> > (which is really myers + the minimal flag), and "patience".
> 
> Don't forget "histogram".  I have no idea why it's not documented
> (evidently 8c912eea slipped through the review cracks) but --histogram
> is supported since 1.7.7.

Ah, thanks. I had the vague feeling that we had a third algorithm
already, but I didn't see it in the docs. So yeah, I really think this
should be diff.algorithm, with a value of "myers", "patience", or
"histogram" (and possibly "minimal", depending how we want to treat
that).

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] config: Introduce --patience config variable
  2012-03-06 13:01   ` Thomas Rast
  2012-03-06 13:15     ` [PATCH 1/2] perf: compare diff algorithms Thomas Rast
  2012-03-06 13:30     ` [PATCH] config: Introduce --patience config variable Jeff King
@ 2012-03-06 13:32     ` Michal Privoznik
  2012-03-06 13:38       ` Matthieu Moy
  2 siblings, 1 reply; 23+ messages in thread
From: Michal Privoznik @ 2012-03-06 13:32 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Jeff King, git, gitster

On 06.03.2012 14:01, Thomas Rast wrote:
> Jeff King <peff@peff.net> writes:
> 
>> On Tue, Mar 06, 2012 at 11:59:42AM +0100, Michal Privoznik wrote:
>>
>>> --- a/Documentation/diff-config.txt
>>> +++ b/Documentation/diff-config.txt
>>> @@ -86,6 +86,9 @@ diff.mnemonicprefix::
>>>  diff.noprefix::
>>>  	If set, 'git diff' does not show any source or destination prefix.
>>>  
>>> +diff.patience:
>>> +    If set, 'git diff' will use patience algorithm.
>>> +
>>
>> Should this be a boolean? Or should we actually have a diff.algorithm
>> option where you specify the algorithm you want (e.g., "diff.algorithm =
>> patience")? That would free us up later to more easily add new values.
>>
>> In particular, I am thinking about --minimal. It is mutually exclusive
>> with --patience, and is simply ignored if you use patience diff.
>> we perhaps have "diff.algorithm" which can be one of "myers", "minimal"
>> (which is really myers + the minimal flag), and "patience".
> 
> Don't forget "histogram".  I have no idea why it's not documented
> (evidently 8c912eea slipped through the review cracks) but --histogram
> is supported since 1.7.7.
> 

Okay guys. I'll got with diff.algorithm = [patience | minimal |
histogram | myers] then. What I am not sure about is how to threat case
when user have say algorithm = patience set in config but want to use
myers. I guess we need --myers option then, don't we?

Michal

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] config: Introduce --patience config variable
  2012-03-06 13:32     ` Michal Privoznik
@ 2012-03-06 13:38       ` Matthieu Moy
  2012-03-06 14:09         ` Jeff King
  0 siblings, 1 reply; 23+ messages in thread
From: Matthieu Moy @ 2012-03-06 13:38 UTC (permalink / raw)
  To: Michal Privoznik; +Cc: Thomas Rast, Jeff King, git, gitster

Michal Privoznik <mprivozn@redhat.com> writes:

> Okay guys. I'll got with diff.algorithm = [patience | minimal |
> histogram | myers] then. What I am not sure about is how to threat case
> when user have say algorithm = patience set in config but want to use
> myers. I guess we need --myers option then, don't we?

At this point, maybe it's time to have
--diff-algorithm=[patience|minimal|histogram|myers], and keep
--patience, --minimal and --histogram just as compatibility aliases.

Having one option per algorithm feels wrong ...

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] config: Introduce --patience config variable
  2012-03-06 13:38       ` Matthieu Moy
@ 2012-03-06 14:09         ` Jeff King
  0 siblings, 0 replies; 23+ messages in thread
From: Jeff King @ 2012-03-06 14:09 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Michal Privoznik, Thomas Rast, git, gitster

On Tue, Mar 06, 2012 at 02:38:16PM +0100, Matthieu Moy wrote:

> Michal Privoznik <mprivozn@redhat.com> writes:
> 
> > Okay guys. I'll got with diff.algorithm = [patience | minimal |
> > histogram | myers] then. What I am not sure about is how to threat case
> > when user have say algorithm = patience set in config but want to use
> > myers. I guess we need --myers option then, don't we?
> 
> At this point, maybe it's time to have
> --diff-algorithm=[patience|minimal|histogram|myers], and keep
> --patience, --minimal and --histogram just as compatibility aliases.
> 
> Having one option per algorithm feels wrong ...

Yeah, that was going to be my suggestion, too. And explaining it that
way in the docs will make it clear that "--histogram" overrides
"--patience" and so forth.

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] perf: compare diff algorithms
  2012-03-06 13:15     ` [PATCH 1/2] perf: compare diff algorithms Thomas Rast
  2012-03-06 13:15       ` [PATCH 2/2] Document the --histogram diff option Thomas Rast
@ 2012-03-06 19:52       ` Junio C Hamano
  2012-03-06 21:00         ` Thomas Rast
  2012-03-10  7:13       ` René Scharfe
  2 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2012-03-06 19:52 UTC (permalink / raw)
  To: Thomas Rast; +Cc: git, Michal Privoznik, Jeff King

Thomas Rast <trast@student.ethz.ch> writes:

> 8c912ee (teach --histogram to diff, 2011-07-12) claimed histogram diff
> was faster than both Myers and patience.
>
> We have since incorporated a performance testing framework, so add a
> test that compares the various diff tasks performed in a real 'log -p'
> workload.  This does indeed show that histogram diff slightly beats
> Myers, while patience is much slower than the others.
>
> Signed-off-by: Thomas Rast <trast@student.ethz.ch>
> ---

Fun.

I am getting this (probably unrelated to this patch), by the way:

$ make perf
make -C t/perf/ all
make[1]: Entering directory `/srv/project/git/git.git/t/perf'
rm -rf test-results
./run
...
perf 4 - grep --cached, expensive regex: 1 2 3 ok
# passed all 4 test(s)
1..4
Can't locate Git.pm in @INC (@INC contains: /etc/perl ...) at ./aggregate.perl line 5.
BEGIN failed--compilation aborted at ./aggregate.perl line 5.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] Document the --histogram diff option
  2012-03-06 13:15       ` [PATCH 2/2] Document the --histogram diff option Thomas Rast
@ 2012-03-06 19:57         ` Junio C Hamano
  2012-03-06 20:42           ` Thomas Rast
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2012-03-06 19:57 UTC (permalink / raw)
  To: Thomas Rast; +Cc: git, Michal Privoznik, Jeff King

Thomas Rast <trast@student.ethz.ch> writes:

> Signed-off-by: Thomas Rast <trast@student.ethz.ch>
> ---
>
> This is only the minimal update.  I think in the long run, we should
> add a note saying why we support all of them.  But off hand I didn't
> have any substantial evidence in favour of patience that could be used
> as an argument.

Isn't the main argument made by proponents of patience diff is more
readable output, and not performance?  That line of argument relies
on a fairly subjective test "which one is easier to read?", so it is
hard to come up with a substantial evidence, unless somebody invests
in A/B test.

Thanks, will queue for 1.7.7.x maintenance track and upwards.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] Document the --histogram diff option
  2012-03-06 19:57         ` Junio C Hamano
@ 2012-03-06 20:42           ` Thomas Rast
  0 siblings, 0 replies; 23+ messages in thread
From: Thomas Rast @ 2012-03-06 20:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Thomas Rast, git, Michal Privoznik, Jeff King

Junio C Hamano <gitster@pobox.com> writes:

> Thomas Rast <trast@student.ethz.ch> writes:
>
>> Signed-off-by: Thomas Rast <trast@student.ethz.ch>
>> ---
>>
>> This is only the minimal update.  I think in the long run, we should
>> add a note saying why we support all of them.  But off hand I didn't
>> have any substantial evidence in favour of patience that could be used
>> as an argument.
>
> Isn't the main argument made by proponents of patience diff is more
> readable output, and not performance?  That line of argument relies
> on a fairly subjective test "which one is easier to read?", so it is
> hard to come up with a substantial evidence, unless somebody invests
> in A/B test.

Well, I was just too lazy to look up what I dimly remembered people had
posted at some point: examples where patience beats Myers for
readability.  E.g.,

  http://article.gmane.org/gmane.comp.version-control.git/104316

I don't think you need a blind test to justify that the patience result
is more readable.  So I think in the long run, the docs should say
something like:

--diff-algorithm={histogram|myers|minimal|patience}::
       Choose a diff algorithm.  The variants are as follows:
+
--
histogram::
        This is the fastest algorithm, and thus the default.

myers::
        The classical Myers diff algorithm.  <state a reason why myers
        would be useful>

minimal::
        Like 'myers', but spend extra time making sure that the diff
        is the shortest possible for the set of changes performed.

patience::
        The patience diff algorithm, which first matches unique lines
        with each other.  This sometimes results in more readable (if
        longer) patches than the other algorithms.
--

Or whatever -- magic is required to have a nested list in asciidoc.
I can't be bothered to twiddle with that right now.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] perf: compare diff algorithms
  2012-03-06 19:52       ` [PATCH 1/2] perf: compare diff algorithms Junio C Hamano
@ 2012-03-06 21:00         ` Thomas Rast
  2012-03-06 21:18           ` Junio C Hamano
  0 siblings, 1 reply; 23+ messages in thread
From: Thomas Rast @ 2012-03-06 21:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Thomas Rast, git, Michal Privoznik, Jeff King

Junio C Hamano <gitster@pobox.com> writes:

>
> I am getting this (probably unrelated to this patch), by the way:
>
> $ make perf
> make -C t/perf/ all
> make[1]: Entering directory `/srv/project/git/git.git/t/perf'
> rm -rf test-results
> ./run
> ...
> perf 4 - grep --cached, expensive regex: 1 2 3 ok
> # passed all 4 test(s)
> 1..4
> Can't locate Git.pm in @INC (@INC contains: /etc/perl ...) at ./aggregate.perl line 5.
> BEGIN failed--compilation aborted at ./aggregate.perl line 5.

It would seem that you are not installing Git.pm as part of your normal
installation?  aggregate.perl uses it, so you'd have to.  Perhaps a Perl
guru can tell us if it's possible to magically pull Git.pm from the
build tree instead of the installed version...

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] perf: compare diff algorithms
  2012-03-06 21:00         ` Thomas Rast
@ 2012-03-06 21:18           ` Junio C Hamano
  2012-03-06 21:41             ` Jakub Narebski
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2012-03-06 21:18 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Thomas Rast, git, Michal Privoznik, Jeff King

Thomas Rast <trast@inf.ethz.ch> writes:

> Junio C Hamano <gitster@pobox.com> writes:
>
>>
>> I am getting this (probably unrelated to this patch), by the way:
>>
>> $ make perf
>> make -C t/perf/ all
>> make[1]: Entering directory `/srv/project/git/git.git/t/perf'
>> rm -rf test-results
>> ./run
>> ...
>> perf 4 - grep --cached, expensive regex: 1 2 3 ok
>> # passed all 4 test(s)
>> 1..4
>> Can't locate Git.pm in @INC (@INC contains: /etc/perl ...) at ./aggregate.perl line 5.
>> BEGIN failed--compilation aborted at ./aggregate.perl line 5.
>
> It would seem that you are not installing Git.pm as part of your normal
> installation?

I actually am installing it in a quite vanilla way.

I think our installation procedure places Git.pm in git specific
perl library path where a simple invocation of "perl" that is
git-unaware will not look into, and we make sure that our scripts
still find the matching version of Git.pm by having "use lib" at the
beginning that points at the right directory.

But of course, this from a command line would not work:

	$ perl -MGit

I do not expect it to, and for the ease of testing new versions, I
prefer it not to work.

In any case, you should be able to do anything under t/ _before_
installing, so relying on having Git.pm in normal @INC is a double
no-no.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] perf: compare diff algorithms
  2012-03-06 21:18           ` Junio C Hamano
@ 2012-03-06 21:41             ` Jakub Narebski
  2012-03-07 12:44               ` Thomas Rast
  0 siblings, 1 reply; 23+ messages in thread
From: Jakub Narebski @ 2012-03-06 21:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Thomas Rast, Thomas Rast, git, Michal Privoznik, Jeff King

Junio C Hamano <gitster@pobox.com> writes:
> Thomas Rast <trast@inf.ethz.ch> writes:
>> Junio C Hamano <gitster@pobox.com> writes:
>>
>>>
>>> I am getting this (probably unrelated to this patch), by the way:
>>>
>>> $ make perf
>>> make -C t/perf/ all
>>> make[1]: Entering directory `/srv/project/git/git.git/t/perf'
>>> rm -rf test-results
>>> ./run
>>> ...
>>> perf 4 - grep --cached, expensive regex: 1 2 3 ok
>>> # passed all 4 test(s)
>>> 1..4
>>> Can't locate Git.pm in @INC (@INC contains: /etc/perl ...) at ./aggregate.perl line 5.
>>> BEGIN failed--compilation aborted at ./aggregate.perl line 5.
>>
>> It would seem that you are not installing Git.pm as part of your normal
>> installation?
> 
> I actually am installing it in a quite vanilla way.
> 
> I think our installation procedure places Git.pm in git specific
> perl library path where a simple invocation of "perl" that is
> git-unaware will not look into, and we make sure that our scripts
> still find the matching version of Git.pm by having "use lib" at the
> beginning that points at the right directory.
> 
> But of course, this from a command line would not work:
> 
> 	$ perl -MGit
> 
> I do not expect it to, and for the ease of testing new versions, I
> prefer it not to work.
> 
> In any case, you should be able to do anything under t/ _before_
> installing, so relying on having Git.pm in normal @INC is a double
> no-no.

Thomas, take a look at how it is solved in 't/t9700/test.pl', used by
't/t9700-perl-git.sh':

  use lib (split(/:/, $ENV{GITPERLLIB}));

-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] config: Introduce --patience config variable
  2012-03-06 11:49 ` Jeff King
  2012-03-06 13:01   ` Thomas Rast
@ 2012-03-07  2:57   ` Junio C Hamano
  2012-03-07 11:47     ` Jeff King
  1 sibling, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2012-03-07  2:57 UTC (permalink / raw)
  To: Jeff King; +Cc: Michal Privoznik, git

Jeff King <peff@peff.net> writes:

> On Tue, Mar 06, 2012 at 11:59:42AM +0100, Michal Privoznik wrote:
>
>> Some users like the patience diff more than the bare diff. However,
>> specifying the '--patience' argument every time diff is to be used
>> is impractical. Moreover, creating an alias doesn't play with other
>> tools nice, e.g. git-show; Therefore we need a configuration variable
>> to turn this on/off across whole git tools.
>
> The idea of turning on patience diff via config makes sense to me, and
> it is not a problem for plumbing tools to have patience diff on, since
> patience diffs are syntactically identical to non-patience diffs.

Even though I do not strongly object to the general conclusion, I
have to point out that the last line above is a dangerous thought.

If you change the default diff algorithm, you will invalidate long
term caches that computed their keys based on the shape of patches
produced in the past.  The prime example being the rerere database,
but I wouldn't be surprised if somebody has a notes tree to record
patch ids for existing commits to speed up "git cherry" (hence "git
rebase").  Also kup tool kernel.org folks use to optimize the
uploading of inter-release diff relies on having a stable output
from "git diff A..B", so that the uploader can run the command to
produce a diff locally, GPG sign it, and upload only the signature
and have the k.org side produce the same diff between two tags,
without having to upload the huge patchfile over the wire.

IOW, "syntactically identical so it is OK" is not the right thought
process.  "It may change the shape of the patch, which is a potential
problem for various tools, but as long as users understand that, it
should be allowed." is OK.

Cached patch-id database for "git cherry" would be a local and does
not affect the correctness, so I would consider breaking it is fine.

About kup use case, the potential problem can be worked around as
long as the receiving end keeps the setting vanilla (and I do not
see any reason it wouldn't) and the uploader remembers to choose the
matching variant when he locally generates the patch, so I think it
would be also OK.

Unnecessarily invalidating rerere database may be more frustrating,
but that again is local and personal, so the end user may suffer
worse than cached patch-id use case, but that hopefully is just one
time pain.

My preference however is to limit this to Porcelains only, though.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] config: Introduce --patience config variable
  2012-03-07  2:57   ` Junio C Hamano
@ 2012-03-07 11:47     ` Jeff King
  2012-03-07 17:24       ` Junio C Hamano
  0 siblings, 1 reply; 23+ messages in thread
From: Jeff King @ 2012-03-07 11:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michal Privoznik, git

On Tue, Mar 06, 2012 at 06:57:42PM -0800, Junio C Hamano wrote:

> > The idea of turning on patience diff via config makes sense to me, and
> > it is not a problem for plumbing tools to have patience diff on, since
> > patience diffs are syntactically identical to non-patience diffs.
> 
> Even though I do not strongly object to the general conclusion, I
> have to point out that the last line above is a dangerous thought.
> 
> If you change the default diff algorithm, you will invalidate long
> term caches that computed their keys based on the shape of patches
> produced in the past.

I see your point, though I don't think I'd use the word "dangerous" to
describe the invalidation of a cache. In a true cache, we must be ready
for cache misses, so the "danger" is causing extra cache misses. That
might be non-optimal, depending on the cost of a miss, but it's not
going to result in data loss.

[This email ended up long; the "tl;dr" is: "I don't think it's that big a
 deal, but I don't really have a problem limiting this feature to
 just porcelain". Michal, the way to do that would be to move the config
 parsing into git_diff_ui_config instead of git_diff_basic_config.]

> The prime example being the rerere database,

Having said that about caches, I don't know that I'd call rerere a
cache. The "miss" means the user has to resolve the merge. So whether it
is a cache or a database of precious information depends on how much you
value the user's input.

Anyway, would this option actually affect rerere? AFAICT, rerere does
not care about diff generation, but rather about the merge parameters.
And it is smart enough to tell the merge subsystem to use standard
parameters when generating rr-cache entries (e.g., setting
merge.conflictstyle does not affect what is put in the rr-cache).

> but I wouldn't be surprised if somebody has a notes tree to record
> patch ids for existing commits to speed up "git cherry" (hence "git
> rebase").

I do that[1]. However, any cache of patch-ids that does not record
the diff parameters used is wrong[2]. You're unlikely to come up with a
false positive due to the length of the patch-id hash, and a few cache
misses here and there are fine. A cache that invalidates all entries
when its cache parameters change[3] would be rendered useless by two
commands flip-flopping between two different sets of diff parameters.
But that is a quality-of-implementation issue for the cache.

[1] Actually, I don't store it in notes, but rather in the mmap'd
    persistent key value store I developed for the generation number
    cache last summer (which we ended up not using).

[2] Mine doesn't actually do this, and is wrong. Which is part of the
    reason I haven't sent it upstream.

[3] The notes_cache code invalidates like this. Since the only user is
    the textconv cache, that strategy works well. For the generation
    number cache, it maintains multiple cache entries.

> Also kup tool kernel.org folks use to optimize the uploading of
> inter-release diff relies on having a stable output from "git diff
> A..B", so that the uploader can run the command to produce a diff
> locally, GPG sign it, and upload only the signature and have the k.org
> side produce the same diff between two tags, without having to upload
> the huge patchfile over the wire.

That's a cute trick, but I think it is going to be inherently flaky.
Does kup run the "git diff" itself (or, I would hope "git diff-tree")?
If not, then they are at the mercy of the uploader adding custom diff
parameters. I don't know about others, but I sometimes tweak the amount
of context in a diff I send if it makes the diff more readable (e.g., if
two hunks are close by and are more easily read as a single hunk, or if
relevant context is one or two lines outside of a hunk).

But even if the kup client runs a sanitized, well-known, plumbing
version of the diff on both sides, they're still not guaranteed to
produce the same patch. Different versions of git may produce different
results. For example, we stopped defaulting to XDF_NEED_MINIMAL in
582aa00. Git v1.7.1 produces a different diff for some commits than
v1.7.1.1.

> IOW, "syntactically identical so it is OK" is not the right thought
> process.  "It may change the shape of the patch, which is a potential
> problem for various tools, but as long as users understand that, it
> should be allowed." is OK.

I would argue that it is those tools which should be the ones to care.
One of the great things about git is that it uses simple, standard
formats like unified diff, and doesn't care if you used git to generate
them. But the actual algorithm used to generate the diff is outside the
scope of git-apply. So in theory the email you are grabbing with "git
am" could be generated by GNU diff, or even some weird proprietary diff,
and we don't care. Applying a patch-id or anything else that makes
assumptions about the diff algorithm is inherently risky.  For internal
cases where one tool is generating both sides of the diff, it should be
sure to specify vanilla parameters.

> My preference however is to limit this to Porcelains only, though.

I'd be OK with that. The "...it should be sure to specify vanilla
parameters" above can be spelled as "should use the plumbing version".
Though even that is not entirely foolproof (due to differing versions,
as I mentioned above, but also I think we respect things like
$GIT_DIFF_OPTS), it works OK in practice. And given that the main
motivation is for porcelain users, I don't think it is hurting anyone to
be conservative at first.

If "kup" relies on the user to run the diff themselves for the feature
you mentioned, then it may cause breakage there (because most people
will run the porcelain "git diff"). But I think it is already broken by
things like diff.mnemonicprefix, as well as break- and rename-detection
parameters.

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] perf: compare diff algorithms
  2012-03-06 21:41             ` Jakub Narebski
@ 2012-03-07 12:44               ` Thomas Rast
  2012-03-07 17:45                 ` Junio C Hamano
  2012-03-07 18:03                 ` Jakub Narebski
  0 siblings, 2 replies; 23+ messages in thread
From: Thomas Rast @ 2012-03-07 12:44 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Junio C Hamano, Thomas Rast, git, Michal Privoznik, Jeff King

Jakub Narebski <jnareb@gmail.com> writes:

> Junio C Hamano <gitster@pobox.com> writes:
>> 
>> But of course, this from a command line would not work:
>> 
>> 	$ perl -MGit
>> 
>> I do not expect it to, and for the ease of testing new versions, I
>> prefer it not to work.
>> 
>> In any case, you should be able to do anything under t/ _before_
>> installing, so relying on having Git.pm in normal @INC is a double
>> no-no.
>
> Thomas, take a look at how it is solved in 't/t9700/test.pl', used by
> 't/t9700-perl-git.sh':
>
>   use lib (split(/:/, $ENV{GITPERLLIB}));

Hum.  The problem is that the user may invoke aggregate.perl manually,
and GITPERLLIB won't be set in that case.

Is there a better solution than duplicating the logic that sets
GITPERLLIB in test-lib.sh within aggregate.perl?

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] config: Introduce --patience config variable
  2012-03-07 11:47     ` Jeff King
@ 2012-03-07 17:24       ` Junio C Hamano
  0 siblings, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2012-03-07 17:24 UTC (permalink / raw)
  To: Jeff King; +Cc: Michal Privoznik, git

Jeff King <peff@peff.net> writes:

> On Tue, Mar 06, 2012 at 06:57:42PM -0800, Junio C Hamano wrote:
>
>> > The idea of turning on patience diff via config makes sense to me, and
>> > it is not a problem for plumbing tools to have patience diff on, since
>> > patience diffs are syntactically identical to non-patience diffs.
>> 
>> Even though I do not strongly object to the general conclusion, I
>> have to point out that the last line above is a dangerous thought.
>>
>> If you change the default diff algorithm, you will invalidate long
>> term caches that computed their keys based on the shape of patches
>> produced in the past.
>
> I see your point, though I don't think I'd use the word "dangerous" to
> describe the invalidation of a cache.

I probably was not writing clearly enough to avoid getting misread.
The "dangerous" does not refer to "invalidation of a cache".  What I
meant was that "The output is syntactically identical, so it is OK"
is a dangerous way to think when assessing the risk of regression,
because applying to a given preimage and producing the same
postimage is not the *only* way the output is used.

I think the executive summary is that we are in agreement; your
analysis of potential regression coming from differences of the
shape of the patch output (not applicability to a given preimage to
produce the same postimage) seems to match what I said in the
previous message.

Especially in the kup case, it is a minority tool used by people who
knows or can be easily taught in a tightly controlled environment,
and it is fine as long as the users have a way to make sure two
diffs run on both ends of the communication produce the same result
(in an earlier discussion on k.org users list where kup was first
discussed, the limitation of users having to use the same version as
k.org has was noted and the users are already aware of it).

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] perf: compare diff algorithms
  2012-03-07 12:44               ` Thomas Rast
@ 2012-03-07 17:45                 ` Junio C Hamano
  2012-03-07 18:03                 ` Jakub Narebski
  1 sibling, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2012-03-07 17:45 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Jakub Narebski, Thomas Rast, git, Michal Privoznik, Jeff King

Thomas Rast <trast@inf.ethz.ch> writes:

> Jakub Narebski <jnareb@gmail.com> writes:
>
>> Thomas, take a look at how it is solved in 't/t9700/test.pl', used by
>> 't/t9700-perl-git.sh':
>>
>>   use lib (split(/:/, $ENV{GITPERLLIB}));
>
> Hum.  The problem is that the user may invoke aggregate.perl manually,
> and GITPERLLIB won't be set in that case.

I would equate "manual invocation" with "the user knows what he is doing",
so if that is the only problem, I think we are good.

> Is there a better solution than duplicating the logic that sets
> GITPERLLIB in test-lib.sh within aggregate.perl?

Perhaps the part to figure out the directory layout can and should
be split out of test-lib.sh into test-env.sh or something and be
dot-sourced at the beginning of test-lib.sh; would that help?

 t/test-env.sh |  112 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 t/test-lib.sh |  118 ++++-----------------------------------------------------
 2 files changed, 120 insertions(+), 110 deletions(-)

diff --git a/t/test-env.sh b/t/test-env.sh
new file mode 100644
index 0000000..1159c5a
--- /dev/null
+++ b/t/test-env.sh
@@ -0,0 +1,112 @@
+# figure out the environment a test-related script is being run
+# taken from test-lib.sh, Copyright (c) 2005 Junio C Hamano
+
+# The directory to find other helper pieces e.g. lib-gpg.sh of the
+# test suite; usually t/ in the git source tree.
+if test -z "$TEST_DIRECTORY"
+then
+	# We allow tests to override this, in case they want to run tests
+	# outside of t/, e.g. for running tests on the test library
+	# itself.
+	TEST_DIRECTORY=$(pwd)
+fi
+GIT_BUILD_DIR="$TEST_DIRECTORY"/..
+GIT_TEMPLATE_DIR="$GIT_BUILD_DIR"/templates/blt
+GITPERLLIB="$GIT_BUILD_DIR"/perl/blib/lib:"$GIT_BUILD_DIR"/perl/blib/arch/auto/Git
+
+if test -n "$valgrind"
+then
+	make_symlink () {
+		test -h "$2" &&
+		test "$1" = "$(readlink "$2")" || {
+			# be super paranoid
+			if mkdir "$2".lock
+			then
+				rm -f "$2" &&
+				ln -s "$1" "$2" &&
+				rm -r "$2".lock
+			else
+				while test -d "$2".lock
+				do
+					say "Waiting for lock on $2."
+					sleep 1
+				done
+			fi
+		}
+	}
+
+	make_valgrind_symlink () {
+		# handle only executables, unless they are shell libraries that
+		# need to be in the exec-path.  We will just use "#!" as a
+		# guess for a shell-script, since we have no idea what the user
+		# may have configured as the shell path.
+		test -x "$1" ||
+		test "#!" = "$(head -c 2 <"$1")" ||
+		return;
+
+		base=$(basename "$1")
+		symlink_target=$GIT_BUILD_DIR/$base
+		# do not override scripts
+		if test -x "$symlink_target" &&
+		    test ! -d "$symlink_target" &&
+		    test "#!" != "$(head -c 2 < "$symlink_target")"
+		then
+			symlink_target=../valgrind.sh
+		fi
+		case "$base" in
+		*.sh|*.perl)
+			symlink_target=../unprocessed-script
+		esac
+		# create the link, or replace it if it is out of date
+		make_symlink "$symlink_target" "$GIT_VALGRIND/bin/$base" || exit
+	}
+
+	# override all git executables in TEST_DIRECTORY/..
+	GIT_VALGRIND=$TEST_DIRECTORY/valgrind
+	mkdir -p "$GIT_VALGRIND"/bin
+	for file in $GIT_BUILD_DIR/git* $GIT_BUILD_DIR/test-*
+	do
+		make_valgrind_symlink $file
+	done
+	# special-case the mergetools loadables
+	make_symlink "$GIT_BUILD_DIR"/mergetools "$GIT_VALGRIND/bin/mergetools"
+	OLDIFS=$IFS
+	IFS=:
+	for path in $PATH
+	do
+		ls "$path"/git-* 2> /dev/null |
+		while read file
+		do
+			make_valgrind_symlink "$file"
+		done
+	done
+	IFS=$OLDIFS
+	PATH=$GIT_VALGRIND/bin:$PATH
+	GIT_EXEC_PATH=$GIT_VALGRIND/bin
+	export GIT_VALGRIND
+elif test -n "$GIT_TEST_INSTALLED" ; then
+	GIT_EXEC_PATH=$($GIT_TEST_INSTALLED/git --exec-path)  ||
+	error "Cannot run git from $GIT_TEST_INSTALLED."
+	PATH=$GIT_TEST_INSTALLED:$GIT_BUILD_DIR:$PATH
+	GIT_EXEC_PATH=${GIT_TEST_EXEC_PATH:-$GIT_EXEC_PATH}
+else # normal case, use ../bin-wrappers only unless $with_dashes:
+	git_bin_dir="$GIT_BUILD_DIR/bin-wrappers"
+	if ! test -x "$git_bin_dir/git" ; then
+		if test -z "$with_dashes" ; then
+			say "$git_bin_dir/git is not executable; using GIT_EXEC_PATH"
+		fi
+		with_dashes=t
+	fi
+	PATH="$git_bin_dir:$PATH"
+	GIT_EXEC_PATH=$GIT_BUILD_DIR
+	if test -n "$with_dashes" ; then
+		PATH="$GIT_BUILD_DIR:$PATH"
+	fi
+fi
+GIT_TEMPLATE_DIR="$GIT_BUILD_DIR"/templates/blt
+
+. "$GIT_BUILD_DIR"/GIT-BUILD-OPTIONS
+
+GITPERLLIB="$GIT_BUILD_DIR"/perl/blib/lib:"$GIT_BUILD_DIR"/perl/blib/arch/auto/Git
+
+export PATH GIT_EXEC_PATH GIT_TEMPLATE_DIR GITPERLLIB
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 30ed4d7..e892368 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -31,6 +31,13 @@ done,*)
 	;;
 esac
 
+# Protect ourselves from common misconfiguration to export
+# CDPATH into the environment
+unset CDPATH
+unset GREP_OPTIONS
+
+. ./test-env.sh
+
 # Keep the original TERM for say_color
 ORIGINAL_TERM=$TERM
 
@@ -72,12 +79,6 @@ export GIT_AUTHOR_EMAIL GIT_AUTHOR_NAME
 export GIT_COMMITTER_EMAIL GIT_COMMITTER_NAME
 export EDITOR
 
-# Protect ourselves from common misconfiguration to export
-# CDPATH into the environment
-unset CDPATH
-
-unset GREP_OPTIONS
-
 case $(echo $GIT_TRACE |tr "[A-Z]" "[a-z]") in
 	1|2|true)
 		echo "* warning: Some tests will not work if GIT_TRACE" \
@@ -380,117 +381,17 @@ test_done () {
 	esac
 }
 
-# Test the binaries we have just built.  The tests are kept in
-# t/ subdirectory and are run in 'trash directory' subdirectory.
-if test -z "$TEST_DIRECTORY"
-then
-	# We allow tests to override this, in case they want to run tests
-	# outside of t/, e.g. for running tests on the test library
-	# itself.
-	TEST_DIRECTORY=$(pwd)
-fi
 if test -z "$TEST_OUTPUT_DIRECTORY"
 then
 	# Similarly, override this to store the test-results subdir
 	# elsewhere
 	TEST_OUTPUT_DIRECTORY=$TEST_DIRECTORY
 fi
-GIT_BUILD_DIR="$TEST_DIRECTORY"/..
 
-if test -n "$valgrind"
-then
-	make_symlink () {
-		test -h "$2" &&
-		test "$1" = "$(readlink "$2")" || {
-			# be super paranoid
-			if mkdir "$2".lock
-			then
-				rm -f "$2" &&
-				ln -s "$1" "$2" &&
-				rm -r "$2".lock
-			else
-				while test -d "$2".lock
-				do
-					say "Waiting for lock on $2."
-					sleep 1
-				done
-			fi
-		}
-	}
-
-	make_valgrind_symlink () {
-		# handle only executables, unless they are shell libraries that
-		# need to be in the exec-path.  We will just use "#!" as a
-		# guess for a shell-script, since we have no idea what the user
-		# may have configured as the shell path.
-		test -x "$1" ||
-		test "#!" = "$(head -c 2 <"$1")" ||
-		return;
-
-		base=$(basename "$1")
-		symlink_target=$GIT_BUILD_DIR/$base
-		# do not override scripts
-		if test -x "$symlink_target" &&
-		    test ! -d "$symlink_target" &&
-		    test "#!" != "$(head -c 2 < "$symlink_target")"
-		then
-			symlink_target=../valgrind.sh
-		fi
-		case "$base" in
-		*.sh|*.perl)
-			symlink_target=../unprocessed-script
-		esac
-		# create the link, or replace it if it is out of date
-		make_symlink "$symlink_target" "$GIT_VALGRIND/bin/$base" || exit
-	}
-
-	# override all git executables in TEST_DIRECTORY/..
-	GIT_VALGRIND=$TEST_DIRECTORY/valgrind
-	mkdir -p "$GIT_VALGRIND"/bin
-	for file in $GIT_BUILD_DIR/git* $GIT_BUILD_DIR/test-*
-	do
-		make_valgrind_symlink $file
-	done
-	# special-case the mergetools loadables
-	make_symlink "$GIT_BUILD_DIR"/mergetools "$GIT_VALGRIND/bin/mergetools"
-	OLDIFS=$IFS
-	IFS=:
-	for path in $PATH
-	do
-		ls "$path"/git-* 2> /dev/null |
-		while read file
-		do
-			make_valgrind_symlink "$file"
-		done
-	done
-	IFS=$OLDIFS
-	PATH=$GIT_VALGRIND/bin:$PATH
-	GIT_EXEC_PATH=$GIT_VALGRIND/bin
-	export GIT_VALGRIND
-elif test -n "$GIT_TEST_INSTALLED" ; then
-	GIT_EXEC_PATH=$($GIT_TEST_INSTALLED/git --exec-path)  ||
-	error "Cannot run git from $GIT_TEST_INSTALLED."
-	PATH=$GIT_TEST_INSTALLED:$GIT_BUILD_DIR:$PATH
-	GIT_EXEC_PATH=${GIT_TEST_EXEC_PATH:-$GIT_EXEC_PATH}
-else # normal case, use ../bin-wrappers only unless $with_dashes:
-	git_bin_dir="$GIT_BUILD_DIR/bin-wrappers"
-	if ! test -x "$git_bin_dir/git" ; then
-		if test -z "$with_dashes" ; then
-			say "$git_bin_dir/git is not executable; using GIT_EXEC_PATH"
-		fi
-		with_dashes=t
-	fi
-	PATH="$git_bin_dir:$PATH"
-	GIT_EXEC_PATH=$GIT_BUILD_DIR
-	if test -n "$with_dashes" ; then
-		PATH="$GIT_BUILD_DIR:$PATH"
-	fi
-fi
-GIT_TEMPLATE_DIR="$GIT_BUILD_DIR"/templates/blt
 unset GIT_CONFIG
 GIT_CONFIG_NOSYSTEM=1
 GIT_ATTR_NOSYSTEM=1
-export PATH GIT_EXEC_PATH GIT_TEMPLATE_DIR GIT_CONFIG_NOSYSTEM GIT_ATTR_NOSYSTEM
+export GIT_CONFIG_NOSYSTEM GIT_ATTR_NOSYSTEM
 
 . "$GIT_BUILD_DIR"/GIT-BUILD-OPTIONS
 
@@ -503,9 +404,6 @@ then
 		GIT_TEST_CMP="$DIFF -u"
 	fi
 fi
-
-GITPERLLIB="$GIT_BUILD_DIR"/perl/blib/lib:"$GIT_BUILD_DIR"/perl/blib/arch/auto/Git
-export GITPERLLIB
 test -d "$GIT_BUILD_DIR"/templates/blt || {
 	error "You haven't built things yet, have you?"
 }

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] perf: compare diff algorithms
  2012-03-07 12:44               ` Thomas Rast
  2012-03-07 17:45                 ` Junio C Hamano
@ 2012-03-07 18:03                 ` Jakub Narebski
  2012-03-07 18:19                   ` Junio C Hamano
  1 sibling, 1 reply; 23+ messages in thread
From: Jakub Narebski @ 2012-03-07 18:03 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Junio C Hamano, Thomas Rast, git, Michal Privoznik, Jeff King

Thomas Rast wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> > Junio C Hamano <gitster@pobox.com> writes:
> >> 
> >> But of course, this from a command line would not work:
> >> 
> >> 	$ perl -MGit
> >> 
> >> I do not expect it to, and for the ease of testing new versions, I
> >> prefer it not to work.
> >> 
> >> In any case, you should be able to do anything under t/ _before_
> >> installing, so relying on having Git.pm in normal @INC is a double
> >> no-no.
> >
> > Thomas, take a look at how it is solved in 't/t9700/test.pl', used by
> > 't/t9700-perl-git.sh':
> >
> >   use lib (split(/:/, $ENV{GITPERLLIB}));
> 
> Hum.  The problem is that the user may invoke aggregate.perl manually,
> and GITPERLLIB won't be set in that case.
> 
> Is there a better solution than duplicating the logic that sets
> GITPERLLIB in test-lib.sh within aggregate.perl?

Beside extracting logic that sets GITPERLLIB into separate file like
in Junio proposal?  You can always assume that it is fixed relative
to perl/Git.pm, and use __DIR__ or $FindBin to make "use lib", e.g.

  use FindBin;
  use lib "$FindBin::Bin/../../perl";

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] perf: compare diff algorithms
  2012-03-07 18:03                 ` Jakub Narebski
@ 2012-03-07 18:19                   ` Junio C Hamano
  0 siblings, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2012-03-07 18:19 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Thomas Rast, Thomas Rast, git, Michal Privoznik, Jeff King

Jakub Narebski <jnareb@gmail.com> writes:

> Beside extracting logic that sets GITPERLLIB into separate file like
> in Junio proposal?

That was not even a proposal.  The part that deals with valgrind is
blatantly wrong (it shouldn't create symlinks in that code, it only
should figure out where things should be).

> You can always assume that it is fixed relative
> to perl/Git.pm, and use __DIR__ or $FindBin to make "use lib", e.g.
>
>   use FindBin;
>   use lib "$FindBin::Bin/../../perl";

Use of FindBin to find the location of the script is OK, but does
using "../../perl" really work?

We have this

  GITPERLLIB="$GIT_BUILD_DIR"/perl/blib/lib:"$GIT_BUILD_DIR"/perl/blib/arch/auto/Git

to look for two places in test-lib.sh

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] perf: compare diff algorithms
  2012-03-06 13:15     ` [PATCH 1/2] perf: compare diff algorithms Thomas Rast
  2012-03-06 13:15       ` [PATCH 2/2] Document the --histogram diff option Thomas Rast
  2012-03-06 19:52       ` [PATCH 1/2] perf: compare diff algorithms Junio C Hamano
@ 2012-03-10  7:13       ` René Scharfe
  2 siblings, 0 replies; 23+ messages in thread
From: René Scharfe @ 2012-03-10  7:13 UTC (permalink / raw)
  To: Thomas Rast; +Cc: git, Junio C Hamano, Michal Privoznik, Jeff King

Am 06.03.2012 14:15, schrieb Thomas Rast:
> 8c912ee (teach --histogram to diff, 2011-07-12) claimed histogram diff
> was faster than both Myers and patience.
>
> We have since incorporated a performance testing framework, so add a
> test that compares the various diff tasks performed in a real 'log -p'
> workload.  This does indeed show that histogram diff slightly beats
> Myers, while patience is much slower than the others.
>
> Signed-off-by: Thomas Rast<trast@student.ethz.ch>
> ---
>
> The 3000 is pretty arbitrary but makes for a nice test duration.
>
> I'm reluctant to put numbers into the message, since the whole point
> of the perf test framework is that you can easily get them too.  But
> here's what I'm seeing:
>
>    4000.1: log -3000 (baseline)          0.04(0.02+0.01)
>    4000.2: log --raw -3000 (tree-only)   0.49(0.38+0.09)
>    4000.3: log -p -3000 (Myers)          1.93(1.75+0.17)
>    4000.4: log -p -3000 --histogram      1.90(1.74+0.15)
>    4000.5: log -p -3000 --patience       2.25(2.07+0.16)

Just a data point: --histogram is slightly slower for me:

    Test                                  this tree
    -----------------------------------------------------
    4000.1: log -3000 (baseline)          0.07(0.07+0.00)
    4000.2: log --raw -3000 (tree-only)   0.35(0.31+0.04)
    4000.3: log -p -3000 (Myers)          1.50(1.40+0.08)
    4000.4: log -p -3000 --histogram      1.54(1.48+0.05)
    4000.5: log -p -3000 --patience       1.79(1.71+0.06)

(baseline with -3000)

René

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2012-03-10  7:14 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-06 10:59 [PATCH] config: Introduce --patience config variable Michal Privoznik
2012-03-06 11:49 ` Jeff King
2012-03-06 13:01   ` Thomas Rast
2012-03-06 13:15     ` [PATCH 1/2] perf: compare diff algorithms Thomas Rast
2012-03-06 13:15       ` [PATCH 2/2] Document the --histogram diff option Thomas Rast
2012-03-06 19:57         ` Junio C Hamano
2012-03-06 20:42           ` Thomas Rast
2012-03-06 19:52       ` [PATCH 1/2] perf: compare diff algorithms Junio C Hamano
2012-03-06 21:00         ` Thomas Rast
2012-03-06 21:18           ` Junio C Hamano
2012-03-06 21:41             ` Jakub Narebski
2012-03-07 12:44               ` Thomas Rast
2012-03-07 17:45                 ` Junio C Hamano
2012-03-07 18:03                 ` Jakub Narebski
2012-03-07 18:19                   ` Junio C Hamano
2012-03-10  7:13       ` René Scharfe
2012-03-06 13:30     ` [PATCH] config: Introduce --patience config variable Jeff King
2012-03-06 13:32     ` Michal Privoznik
2012-03-06 13:38       ` Matthieu Moy
2012-03-06 14:09         ` Jeff King
2012-03-07  2:57   ` Junio C Hamano
2012-03-07 11:47     ` Jeff King
2012-03-07 17:24       ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).