git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/2] Mailmap in log improvements
@ 2012-12-22 16:58 Antoine Pelisse
  2012-12-22 16:58 ` [PATCH 1/2] log: grep author/committer using mailmap Antoine Pelisse
  2012-12-22 16:58 ` [PATCH 2/2] log: add log.mailmap configuration option Antoine Pelisse
  0 siblings, 2 replies; 15+ messages in thread
From: Antoine Pelisse @ 2012-12-22 16:58 UTC (permalink / raw)
  To: git; +Cc: Antoine Pelisse

The goal of these patches are to:

 - allow the use of mailmap when looking for commits
   authored/committed by a mapped name/email.

 - add an option so that --use-mailmap option can be used
   automatically.

Tests are included.

This series is based on ap/log-mailmap.

Antoine Pelisse (2):
  log: grep author/committer using mailmap
  log: add log.mailmap configuration option

 Documentation/config.txt |    4 ++++
 builtin/log.c            |    8 ++++++-
 revision.c               |   53 ++++++++++++++++++++++++++++++++++++++++++++++
 t/t4203-mailmap.sh       |   42 ++++++++++++++++++++++++++++++++++++
 4 files changed, 106 insertions(+), 1 deletion(-)

--
1.7.9.5

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/2] log: grep author/committer using mailmap
  2012-12-22 16:58 [PATCH 0/2] Mailmap in log improvements Antoine Pelisse
@ 2012-12-22 16:58 ` Antoine Pelisse
  2012-12-26 19:27   ` Junio C Hamano
  2012-12-22 16:58 ` [PATCH 2/2] log: add log.mailmap configuration option Antoine Pelisse
  1 sibling, 1 reply; 15+ messages in thread
From: Antoine Pelisse @ 2012-12-22 16:58 UTC (permalink / raw)
  To: git; +Cc: Antoine Pelisse

Currently mailmap can be used to display log authors and committers
but there no way to use mailmap to find commits with mapped values.

This commit allows those commands to work:

    git log --use-mailmap --author mapped_name_or_email
    git log --use-mailmap --committer mapped_name_or_email

Of course it only works if the --use-mailmap option is used.

Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
---
I probably missed something but I didn't find the connection with
commit 2d10c55. Let me know if I went the wrong direction.

 revision.c         |   53 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 t/t4203-mailmap.sh |   18 ++++++++++++++++++
 2 files changed, 71 insertions(+)

diff --git a/revision.c b/revision.c
index 95d21e6..fb9fd41 100644
--- a/revision.c
+++ b/revision.c
@@ -13,6 +13,7 @@
 #include "decorate.h"
 #include "log-tree.h"
 #include "string-list.h"
+#include "mailmap.h"

 volatile show_early_output_fn_t show_early_output;

@@ -2219,6 +2220,50 @@ static int rewrite_parents(struct rev_info *revs, struct commit *commit)
 	return 0;
 }

+static int commit_rewrite_authors(struct strbuf *buf, const char *what, struct string_list *mailmap)
+{
+	char *author, *endp;
+	size_t len;
+	struct strbuf name = STRBUF_INIT;
+	struct strbuf mail = STRBUF_INIT;
+	struct ident_split ident;
+
+	author = strstr(buf->buf, what);
+	if (!author)
+		goto error;
+
+	author += strlen(what);
+	endp = strstr(author, "\n");
+	if (!endp)
+		goto error;
+
+	len = endp - author;
+
+	if (split_ident_line(&ident, author, len)) {
+	error:
+		strbuf_release(&name);
+		strbuf_release(&mail);
+
+		return 1;
+	}
+
+	strbuf_add(&name, ident.name_begin, ident.name_end - ident.name_begin);
+	strbuf_add(&mail, ident.mail_begin, ident.mail_end - ident.mail_begin);
+
+	map_user(mailmap, &mail, &name);
+
+	strbuf_addf(&name, " <%s>", mail.buf);
+
+	strbuf_splice(buf, ident.name_begin - buf->buf,
+		      ident.mail_end - ident.name_begin + 1,
+		      name.buf, name.len);
+
+	strbuf_release(&name);
+	strbuf_release(&mail);
+
+	return 0;
+}
+
 static int commit_match(struct commit *commit, struct rev_info *opt)
 {
 	int retval;
@@ -2237,6 +2282,14 @@ static int commit_match(struct commit *commit, struct rev_info *opt)
 	if (buf.len)
 		strbuf_addstr(&buf, commit->buffer);

+	if (opt->mailmap) {
+		if (!buf.len)
+			strbuf_addstr(&buf, commit->buffer);
+
+		commit_rewrite_authors(&buf, "\nauthor ", opt->mailmap);
+		commit_rewrite_authors(&buf, "\ncommitter ", opt->mailmap);
+	}
+
 	/* Append "fake" message parts as needed */
 	if (opt->show_notes) {
 		if (!buf.len)
diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index db043dc..e16187f 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -248,11 +248,29 @@ Author: Other Author <other@author.xx>
 Author: Some Dude <some@dude.xx>
 Author: A U Thor <author@example.com>
 EOF
+
 test_expect_success 'Log output with --use-mailmap' '
 	git log --use-mailmap | grep Author >actual &&
 	test_cmp expect actual
 '

+cat >expect <<\EOF
+Author: Santa Claus <santa.claus@northpole.xx>
+Author: Santa Claus <santa.claus@northpole.xx>
+EOF
+
+test_expect_success 'Grep author with --use-mailmap' '
+	git log --use-mailmap --author Santa | grep Author >actual &&
+	test_cmp expect actual
+'
+
+>expect
+
+test_expect_success 'Only grep replaced author with --use-mailmap' '
+	git log --use-mailmap --author "<cto@coompany.xx>" >actual &&
+	test_cmp expect actual
+'
+
 # git blame
 cat >expect <<\EOF
 ^OBJI (A U Thor     DATE 1) one
--
1.7.9.5

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/2] log: add log.mailmap configuration option
  2012-12-22 16:58 [PATCH 0/2] Mailmap in log improvements Antoine Pelisse
  2012-12-22 16:58 ` [PATCH 1/2] log: grep author/committer using mailmap Antoine Pelisse
@ 2012-12-22 16:58 ` Antoine Pelisse
  2012-12-23  4:26   ` Junio C Hamano
  1 sibling, 1 reply; 15+ messages in thread
From: Antoine Pelisse @ 2012-12-22 16:58 UTC (permalink / raw)
  To: git; +Cc: Antoine Pelisse

This patch provides a new configuration option 'log.mailmap' to
automatically use the --use-mailmap option from git-show, git-log and
git-whatchanged commands.

Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
---
I'm wondering if it would be needed to add a no-use-mailmap option to
log command so that it can cancel this configuration option.

 Documentation/config.txt |    4 ++++
 builtin/log.c            |    8 +++++++-
 t/t4203-mailmap.sh       |   24 ++++++++++++++++++++++++
 3 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index bf8f911..226362a 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1509,6 +1509,10 @@ log.showroot::
 	Tools like linkgit:git-log[1] or linkgit:git-whatchanged[1], which
 	normally hide the root commit will now show it. True by default.

+log.mailmap::
+	If true, makes linkgit:git-log[1], linkgit:git-show[1], and
+	linkgit:git-whatchanged[1] assume `--use-mailmap`.
+
 mailmap.file::
 	The location of an augmenting mailmap file. The default
 	mailmap, located in the root of the repository, is loaded
diff --git a/builtin/log.c b/builtin/log.c
index d2bd8ce..f6936ff 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -31,6 +31,7 @@ static int default_abbrev_commit;
 static int default_show_root = 1;
 static int decoration_style;
 static int decoration_given;
+static int use_mailmap;
 static const char *fmt_patch_subject_prefix = "PATCH";
 static const char *fmt_pretty;

@@ -138,7 +139,7 @@ static void cmd_log_init_finish(int argc, const char **argv, const char *prefix,
 	if (source)
 		rev->show_source = 1;

-	if (mailmap) {
+	if (mailmap || use_mailmap) {
 		rev->mailmap = xcalloc(1, sizeof(struct string_list));
 		read_mailmap(rev->mailmap, NULL);
 	}
@@ -358,6 +359,11 @@ static int git_log_config(const char *var, const char *value, void *cb)
 	}
 	if (!prefixcmp(var, "color.decorate."))
 		return parse_decorate_color_config(var, 15, value);
+	if (!strcmp(var, "log.mailmap")) {
+		use_mailmap = git_config_bool(var, value);
+		return 0;
+	}
+
 	if (grep_config(var, value, cb) < 0)
 		return -1;
 	return git_diff_ui_config(var, value, cb);
diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index e16187f..7d4d31c 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -255,6 +255,21 @@ test_expect_success 'Log output with --use-mailmap' '
 '

 cat >expect <<\EOF
+Author: CTO <cto@company.xx>
+Author: Santa Claus <santa.claus@northpole.xx>
+Author: Santa Claus <santa.claus@northpole.xx>
+Author: Other Author <other@author.xx>
+Author: Other Author <other@author.xx>
+Author: Some Dude <some@dude.xx>
+Author: A U Thor <author@example.com>
+EOF
+
+test_expect_success 'Log output with log.mailmap' '
+	git -c log.mailmap=True log | grep Author >actual &&
+	test_cmp expect actual
+'
+
+cat >expect <<\EOF
 Author: Santa Claus <santa.claus@northpole.xx>
 Author: Santa Claus <santa.claus@northpole.xx>
 EOF
@@ -263,6 +278,15 @@ test_expect_success 'Grep author with --use-mailmap' '
 	git log --use-mailmap --author Santa | grep Author >actual &&
 	test_cmp expect actual
 '
+cat >expect <<\EOF
+Author: Santa Claus <santa.claus@northpole.xx>
+Author: Santa Claus <santa.claus@northpole.xx>
+EOF
+
+test_expect_success 'Grep author with log.mailmap' '
+	git -c log.mailmap=True log --author Santa | grep Author >actual &&
+	test_cmp expect actual
+'

 >expect

--
1.7.9.5

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] log: add log.mailmap configuration option
  2012-12-22 16:58 ` [PATCH 2/2] log: add log.mailmap configuration option Antoine Pelisse
@ 2012-12-23  4:26   ` Junio C Hamano
  2012-12-26 16:14     ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2012-12-23  4:26 UTC (permalink / raw)
  To: Antoine Pelisse; +Cc: git

Antoine Pelisse <apelisse@gmail.com> writes:

> This patch provides a new configuration option 'log.mailmap' to
> automatically use the --use-mailmap option from git-show, git-log and
> git-whatchanged commands.
>
> Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
> ---
> I'm wondering if it would be needed to add a no-use-mailmap option to
> log command so that it can cancel this configuration option.

The usual way for adding a new feature is to add a --enable-feature
long-option without any configuration variable to let users try it
out in the field, and then add the configuration to let it be
default for users who opt in.  The first step should also allow a
command line option to disable (which should come for free if you
use parse-options API correctly).

>  Documentation/config.txt |    4 ++++
>  builtin/log.c            |    8 +++++++-
>  t/t4203-mailmap.sh       |   24 ++++++++++++++++++++++++
>  3 files changed, 35 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index bf8f911..226362a 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -1509,6 +1509,10 @@ log.showroot::
>  	Tools like linkgit:git-log[1] or linkgit:git-whatchanged[1], which
>  	normally hide the root commit will now show it. True by default.
>
> +log.mailmap::
> +	If true, makes linkgit:git-log[1], linkgit:git-show[1], and
> +	linkgit:git-whatchanged[1] assume `--use-mailmap`.
> +
>  mailmap.file::
>  	The location of an augmenting mailmap file. The default
>  	mailmap, located in the root of the repository, is loaded
> diff --git a/builtin/log.c b/builtin/log.c
> index d2bd8ce..f6936ff 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -31,6 +31,7 @@ static int default_abbrev_commit;
>  static int default_show_root = 1;
>  static int decoration_style;
>  static int decoration_given;
> +static int use_mailmap;
>  static const char *fmt_patch_subject_prefix = "PATCH";
>  static const char *fmt_pretty;
>
> @@ -138,7 +139,7 @@ static void cmd_log_init_finish(int argc, const char **argv, const char *prefix,
>  	if (source)
>  		rev->show_source = 1;
>
> -	if (mailmap) {
> +	if (mailmap || use_mailmap) {
>  		rev->mailmap = xcalloc(1, sizeof(struct string_list));
>  		read_mailmap(rev->mailmap, NULL);
>  	}
> @@ -358,6 +359,11 @@ static int git_log_config(const char *var, const char *value, void *cb)
>  	}
>  	if (!prefixcmp(var, "color.decorate."))
>  		return parse_decorate_color_config(var, 15, value);
> +	if (!strcmp(var, "log.mailmap")) {
> +		use_mailmap = git_config_bool(var, value);
> +		return 0;
> +	}
> +
>  	if (grep_config(var, value, cb) < 0)
>  		return -1;
>  	return git_diff_ui_config(var, value, cb);
> diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
> index e16187f..7d4d31c 100755
> --- a/t/t4203-mailmap.sh
> +++ b/t/t4203-mailmap.sh
> @@ -255,6 +255,21 @@ test_expect_success 'Log output with --use-mailmap' '
>  '
>
>  cat >expect <<\EOF
> +Author: CTO <cto@company.xx>
> +Author: Santa Claus <santa.claus@northpole.xx>
> +Author: Santa Claus <santa.claus@northpole.xx>
> +Author: Other Author <other@author.xx>
> +Author: Other Author <other@author.xx>
> +Author: Some Dude <some@dude.xx>
> +Author: A U Thor <author@example.com>
> +EOF
> +
> +test_expect_success 'Log output with log.mailmap' '
> +	git -c log.mailmap=True log | grep Author >actual &&
> +	test_cmp expect actual
> +'
> +
> +cat >expect <<\EOF
>  Author: Santa Claus <santa.claus@northpole.xx>
>  Author: Santa Claus <santa.claus@northpole.xx>
>  EOF
> @@ -263,6 +278,15 @@ test_expect_success 'Grep author with --use-mailmap' '
>  	git log --use-mailmap --author Santa | grep Author >actual &&
>  	test_cmp expect actual
>  '
> +cat >expect <<\EOF
> +Author: Santa Claus <santa.claus@northpole.xx>
> +Author: Santa Claus <santa.claus@northpole.xx>
> +EOF
> +
> +test_expect_success 'Grep author with log.mailmap' '
> +	git -c log.mailmap=True log --author Santa | grep Author >actual &&
> +	test_cmp expect actual
> +'
>
>  >expect
>
> --
> 1.7.9.5

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] log: add log.mailmap configuration option
  2012-12-23  4:26   ` Junio C Hamano
@ 2012-12-26 16:14     ` Junio C Hamano
  2012-12-26 16:42       ` Antoine Pelisse
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2012-12-26 16:14 UTC (permalink / raw)
  To: Antoine Pelisse; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> Antoine Pelisse <apelisse@gmail.com> writes:
>
>> I'm wondering if it would be needed to add a no-use-mailmap option to
>> log command so that it can cancel this configuration option.
>
> The usual way for adding a new feature is to add a --enable-feature
> long-option without any configuration variable to let users try it
> out in the field, and then add the configuration to let it be
> default for users who opt in.  The first step should also allow a
> command line option to disable (which should come for free if you
> use parse-options API correctly).

It should be sufficient to squash something like this in.  Use the
configured value, if available, to initialize the existing "mailmap"
variable, which is in turn updated from the command line option with
either --use-mailmap or --no-use-mailmap.  What is left in "mailmap"
after the command line parsing returns is what the user told us to
use.

Thanks.

 builtin/log.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index f6936ff..16e6520 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -31,7 +31,7 @@ static int default_abbrev_commit;
 static int default_show_root = 1;
 static int decoration_style;
 static int decoration_given;
-static int use_mailmap;
+static int use_mailmap_config;
 static const char *fmt_patch_subject_prefix = "PATCH";
 static const char *fmt_pretty;
 
@@ -107,6 +107,7 @@ static void cmd_log_init_finish(int argc, const char **argv, const char *prefix,
 		OPT_END()
 	};
 
+	mailmap = use_mailmap_config;
 	argc = parse_options(argc, argv, prefix,
 			     builtin_log_options, builtin_log_usage,
 			     PARSE_OPT_KEEP_ARGV0 | PARSE_OPT_KEEP_UNKNOWN |
@@ -139,7 +140,7 @@ static void cmd_log_init_finish(int argc, const char **argv, const char *prefix,
 	if (source)
 		rev->show_source = 1;
 
-	if (mailmap || use_mailmap) {
+	if (mailmap) {
 		rev->mailmap = xcalloc(1, sizeof(struct string_list));
 		read_mailmap(rev->mailmap, NULL);
 	}
@@ -360,7 +361,7 @@ static int git_log_config(const char *var, const char *value, void *cb)
 	if (!prefixcmp(var, "color.decorate."))
 		return parse_decorate_color_config(var, 15, value);
 	if (!strcmp(var, "log.mailmap")) {
-		use_mailmap = git_config_bool(var, value);
+		use_mailmap_config = git_config_bool(var, value);
 		return 0;
 	}
 

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] log: add log.mailmap configuration option
  2012-12-26 16:14     ` Junio C Hamano
@ 2012-12-26 16:42       ` Antoine Pelisse
  0 siblings, 0 replies; 15+ messages in thread
From: Antoine Pelisse @ 2012-12-26 16:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

I was planning to send you a fix pretty close to that,

Thanks a lot Junio!

On Wed, Dec 26, 2012 at 5:14 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
>> Antoine Pelisse <apelisse@gmail.com> writes:
>>
>>> I'm wondering if it would be needed to add a no-use-mailmap option to
>>> log command so that it can cancel this configuration option.
>>
>> The usual way for adding a new feature is to add a --enable-feature
>> long-option without any configuration variable to let users try it
>> out in the field, and then add the configuration to let it be
>> default for users who opt in.  The first step should also allow a
>> command line option to disable (which should come for free if you
>> use parse-options API correctly).
>
> It should be sufficient to squash something like this in.  Use the
> configured value, if available, to initialize the existing "mailmap"
> variable, which is in turn updated from the command line option with
> either --use-mailmap or --no-use-mailmap.  What is left in "mailmap"
> after the command line parsing returns is what the user told us to
> use.
>
> Thanks.
>
>  builtin/log.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/builtin/log.c b/builtin/log.c
> index f6936ff..16e6520 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -31,7 +31,7 @@ static int default_abbrev_commit;
>  static int default_show_root = 1;
>  static int decoration_style;
>  static int decoration_given;
> -static int use_mailmap;
> +static int use_mailmap_config;
>  static const char *fmt_patch_subject_prefix = "PATCH";
>  static const char *fmt_pretty;
>
> @@ -107,6 +107,7 @@ static void cmd_log_init_finish(int argc, const char **argv, const char *prefix,
>                 OPT_END()
>         };
>
> +       mailmap = use_mailmap_config;
>         argc = parse_options(argc, argv, prefix,
>                              builtin_log_options, builtin_log_usage,
>                              PARSE_OPT_KEEP_ARGV0 | PARSE_OPT_KEEP_UNKNOWN |
> @@ -139,7 +140,7 @@ static void cmd_log_init_finish(int argc, const char **argv, const char *prefix,
>         if (source)
>                 rev->show_source = 1;
>
> -       if (mailmap || use_mailmap) {
> +       if (mailmap) {
>                 rev->mailmap = xcalloc(1, sizeof(struct string_list));
>                 read_mailmap(rev->mailmap, NULL);
>         }
> @@ -360,7 +361,7 @@ static int git_log_config(const char *var, const char *value, void *cb)
>         if (!prefixcmp(var, "color.decorate."))
>                 return parse_decorate_color_config(var, 15, value);
>         if (!strcmp(var, "log.mailmap")) {
> -               use_mailmap = git_config_bool(var, value);
> +               use_mailmap_config = git_config_bool(var, value);
>                 return 0;
>         }
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] log: grep author/committer using mailmap
  2012-12-22 16:58 ` [PATCH 1/2] log: grep author/committer using mailmap Antoine Pelisse
@ 2012-12-26 19:27   ` Junio C Hamano
  2012-12-26 21:12     ` Antoine Pelisse
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2012-12-26 19:27 UTC (permalink / raw)
  To: Antoine Pelisse; +Cc: git

Antoine Pelisse <apelisse@gmail.com> writes:

> Currently mailmap can be used to display log authors and committers
> but there no way to use mailmap to find commits with mapped values.
>
> This commit allows those commands to work:
>
>     git log --use-mailmap --author mapped_name_or_email
>     git log --use-mailmap --committer mapped_name_or_email
>
> Of course it only works if the --use-mailmap option is used.
>
> Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
> ---
> I probably missed something but I didn't find the connection with
> commit 2d10c55. Let me know if I went the wrong direction.
>
>  revision.c         |   53 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  t/t4203-mailmap.sh |   18 ++++++++++++++++++
>  2 files changed, 71 insertions(+)
>
> diff --git a/revision.c b/revision.c
> index 95d21e6..fb9fd41 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -13,6 +13,7 @@
>  #include "decorate.h"
>  #include "log-tree.h"
>  #include "string-list.h"
> +#include "mailmap.h"
>
>  volatile show_early_output_fn_t show_early_output;
>
> @@ -2219,6 +2220,50 @@ static int rewrite_parents(struct rev_info *revs, struct commit *commit)
>  	return 0;
>  }
>
> +static int commit_rewrite_authors(struct strbuf *buf, const char *what, struct string_list *mailmap)
> +{
> +	char *author, *endp;
> +	size_t len;
> +	struct strbuf name = STRBUF_INIT;
> +	struct strbuf mail = STRBUF_INIT;
> +	struct ident_split ident;
> +
> +	author = strstr(buf->buf, what);
> +	if (!author)
> +		goto error;

This does not stop at the end of the header part and would match a
random line in the log message that happens to begin with "author ";
is this something we would worry about, or would we leave it to "fsck"?

> +	author += strlen(what);
> +	endp = strstr(author, "\n");

Using strchr(author, '\n') would feel more natural.  Also rename
"author" to "person" or something, as you would be using this
function for the committer information as well?

> +	if (!endp)
> +		goto error;
> +
> +	len = endp - author;
> +
> +	if (split_ident_line(&ident, author, len)) {
> +	error:
> +		strbuf_release(&name);
> +		strbuf_release(&mail);
> +
> +		return 1;

We usually signal error by returning a negative integer.  It does
not matter too much in this case as no callers seem to check the
return value from this function, though.

> +	}
> +
> +	strbuf_add(&name, ident.name_begin, ident.name_end - ident.name_begin);
> +	strbuf_add(&mail, ident.mail_begin, ident.mail_end - ident.mail_begin);
> +
> +	map_user(mailmap, &mail, &name);
> +
> +	strbuf_addf(&name, " <%s>", mail.buf);
> +
> +	strbuf_splice(buf, ident.name_begin - buf->buf,
> +		      ident.mail_end - ident.name_begin + 1,
> +		      name.buf, name.len);

Would it give us better performance if we splice only when
map_user() tells us that we actually rewrote the ident?

> +	strbuf_release(&name);
> +	strbuf_release(&mail);
> +
> +	return 0;
> +}
> +
>  static int commit_match(struct commit *commit, struct rev_info *opt)
>  {
>  	int retval;
> @@ -2237,6 +2282,14 @@ static int commit_match(struct commit *commit, struct rev_info *opt)
>  	if (buf.len)
>  		strbuf_addstr(&buf, commit->buffer);
>
> +	if (opt->mailmap) {
> +		if (!buf.len)
> +			strbuf_addstr(&buf, commit->buffer);
> +
> +		commit_rewrite_authors(&buf, "\nauthor ", opt->mailmap);
> +		commit_rewrite_authors(&buf, "\ncommitter ", opt->mailmap);
> +	}
> +
>  	/* Append "fake" message parts as needed */
>  	if (opt->show_notes) {
>  		if (!buf.len)
> diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
> index db043dc..e16187f 100755
> --- a/t/t4203-mailmap.sh
> +++ b/t/t4203-mailmap.sh
> @@ -248,11 +248,29 @@ Author: Other Author <other@author.xx>
>  Author: Some Dude <some@dude.xx>
>  Author: A U Thor <author@example.com>
>  EOF
> +
>  test_expect_success 'Log output with --use-mailmap' '
>  	git log --use-mailmap | grep Author >actual &&
>  	test_cmp expect actual
>  '
>
> +cat >expect <<\EOF
> +Author: Santa Claus <santa.claus@northpole.xx>
> +Author: Santa Claus <santa.claus@northpole.xx>
> +EOF
> +
> +test_expect_success 'Grep author with --use-mailmap' '
> +	git log --use-mailmap --author Santa | grep Author >actual &&
> +	test_cmp expect actual
> +'
> +
> +>expect
> +
> +test_expect_success 'Only grep replaced author with --use-mailmap' '
> +	git log --use-mailmap --author "<cto@coompany.xx>" >actual &&
> +	test_cmp expect actual
> +'
> +
>  # git blame
>  cat >expect <<\EOF
>  ^OBJI (A U Thor     DATE 1) one
> --
> 1.7.9.5

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] log: grep author/committer using mailmap
  2012-12-26 19:27   ` Junio C Hamano
@ 2012-12-26 21:12     ` Antoine Pelisse
  2012-12-26 21:37       ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Antoine Pelisse @ 2012-12-26 21:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

>>
>> +static int commit_rewrite_authors(struct strbuf *buf, const char *what, struct string_list *mailmap)
>> +{
>> +     char *author, *endp;
>> +     size_t len;
>> +     struct strbuf name = STRBUF_INIT;
>> +     struct strbuf mail = STRBUF_INIT;
>> +     struct ident_split ident;
>> +
>> +     author = strstr(buf->buf, what);
>> +     if (!author)
>> +             goto error;
>
> This does not stop at the end of the header part and would match a
> random line in the log message that happens to begin with "author ";
> is this something we would worry about, or would we leave it to "fsck"?

The only worrying case would be:
 - commit doesn't have "\nauthor" in the header (can that happen
without corruption?)
 - commit has "\nauthor" in the commit log
 - This line from commit log contains an <email> (split_ident_line works)
Then, I guess it's going to replace the name in the commit log.

Otherwise, it would not replace anything, as there is no author to
replace anyway.

It looks like most mechanisms using mailmap would have the same issue.

>> +     author += strlen(what);
>> +     endp = strstr(author, "\n");
>
> Using strchr(author, '\n') would feel more natural.  Also rename
> "author" to "person" or something, as you would be using this
> function for the committer information as well?

Both fixed

>> +     if (!endp)
>> +             goto error;
>> +
>> +     len = endp - author;
>> +
>> +     if (split_ident_line(&ident, author, len)) {
>> +     error:
>> +             strbuf_release(&name);
>> +             strbuf_release(&mail);
>> +
>> +             return 1;
>
> We usually signal error by returning a negative integer.  It does
> not matter too much in this case as no callers seem to check the
> return value from this function, though.

Fixed, or would you rather see it `void` ?

>> +     }
>> +
>> +     strbuf_add(&name, ident.name_begin, ident.name_end - ident.name_begin);
>> +     strbuf_add(&mail, ident.mail_begin, ident.mail_end - ident.mail_begin);
>> +
>> +     map_user(mailmap, &mail, &name);
>> +
>> +     strbuf_addf(&name, " <%s>", mail.buf);
>> +
>> +     strbuf_splice(buf, ident.name_begin - buf->buf,
>> +                   ident.mail_end - ident.name_begin + 1,
>> +                   name.buf, name.len);
>
> Would it give us better performance if we splice only when
> map_user() tells us that we actually rewrote the ident?

My intuition was that the cost of splice belongs to "memoving", when the
size is different. Yet, Fixed, as it removes two copies.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] log: grep author/committer using mailmap
  2012-12-26 21:12     ` Antoine Pelisse
@ 2012-12-26 21:37       ` Junio C Hamano
  2012-12-27 15:31         ` [PATCH v2] " Antoine Pelisse
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2012-12-26 21:37 UTC (permalink / raw)
  To: Antoine Pelisse; +Cc: git

Antoine Pelisse <apelisse@gmail.com> writes:

>>>
>>> +static int commit_rewrite_authors(struct strbuf *buf, const char *what, struct string_list *mailmap)
>>> +{
>>> +     char *author, *endp;
>>> +     size_t len;
>>> +     struct strbuf name = STRBUF_INIT;
>>> +     struct strbuf mail = STRBUF_INIT;
>>> +     struct ident_split ident;
>>> +
>>> +     author = strstr(buf->buf, what);
>>> +     if (!author)
>>> +             goto error;
>>
>> This does not stop at the end of the header part and would match a
>> random line in the log message that happens to begin with "author ";
>> is this something we would worry about, or would we leave it to "fsck"?
>
> The only worrying case would be:
> ...

Yeah, that pretty much matches what I had in mind (the short answer:
leave it to "git fsck").

>> We usually signal error by returning a negative integer.  It does
>> not matter too much in this case as no callers seem to check the
>> return value from this function, though.
>
> Fixed, or would you rather see it `void` ?

Just like you can take advantage of map_user() that signals the
caller if it did anything to optimize this function, in the longer
run, it may help the (future) callers of this function if it gave "I
did something" vs "I left it intact".  In the particular case of
this function, the "error" cases fall into the latter (it merely
explains why it left it intact, and there is no sensible error
recovery the caller _could_ do in any case) and I think it is not
necessary to differenciate between "Returned as-is because there is
no mapping" and "Returned as-is because I couldn't parse the
commit".

So "return 0 when it didn't do anything, return 1 when it rewrote"
feels good enough, at least to me.

>>> +     }
>>> +
>>> +     strbuf_add(&name, ident.name_begin, ident.name_end - ident.name_begin);
>>> +     strbuf_add(&mail, ident.mail_begin, ident.mail_end - ident.mail_begin);
>>> +
>>> +     map_user(mailmap, &mail, &name);
>>> +
>>> +     strbuf_addf(&name, " <%s>", mail.buf);
>>> +
>>> +     strbuf_splice(buf, ident.name_begin - buf->buf,
>>> +                   ident.mail_end - ident.name_begin + 1,
>>> +                   name.buf, name.len);
>>
>> Would it give us better performance if we splice only when
>> map_user() tells us that we actually rewrote the ident?
>
> My intuition was that the cost of splice belongs to "memoving", when the
> size is different. Yet, Fixed, as it removes two copies.

Thanks.

I wonder if we can further restructure the code so that it first
inspects the existing buffer to see if it even needs to copy the
original commit buffer into a "strbuf only for grepping".  If that
can be easily done, then we will save even more copying, I think.

The reason I alluded to revamping the grep API to get rid of the use
of "header grep" mode in this codepath was exactly that.  We could:

 - change the command line parser for --author= and --committer= so
   that these do not become part of the main "grep" expression.
   Instead we keep them as separate grep expressions (one "author"
   expression that OR'es the --author= options together, the other
   for the --committer= options);

 - in this codepath, inspect the "author" and "committer" in the
   commit object buffer, map them if necessary via the mailmap
   mechanism into temporary buffers (that is different from the
   "buf" in the commit_match() function), then run grep_buffer()
   with the author and committer grep expressions we separated in
   the previous step. Then we combine the results from "author" and
   "committer" grep and the main grep_buffer() result ourselves in
   this function.

That may essentially amount to going in the totally opposite
direction from what 2d10c55 (git log: Unify header_filter and
message_filter into one., 2006-09-20) attempted to do.  We used to
have two grep expressions (one for header, the other one for body)
commit_match() runs grep_buffer() on and combined the results.
2d10c55 merged them into one grep expression by introducing a term
that matches only header elements.  But we would instead split the
"header" expression into "author" and "committer" expressions
(making it three from one) if we go the above route.

That would eliminate the need to copy and rewrite the contents of
the commit object in this codepath, which may be a big win when
names and emails that need to be rewritten are minority cases.

But I suspect that is a much larger change.  If we can reduce the
amount of copies necessary without changing the code structure, that
may be enough to reduce the performance hit from this change.

Thanks.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v2] log: grep author/committer using mailmap
  2012-12-26 21:37       ` Junio C Hamano
@ 2012-12-27 15:31         ` Antoine Pelisse
  2012-12-27 18:45           ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Antoine Pelisse @ 2012-12-27 15:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Antoine Pelisse

Currently you can use mailmap to display log authors and committers
but you can't use the mailmap to find commits with mapped values.

This commit allows you to run:

    git log --use-mailmap --author mapped_name_or_email
    git log --use-mailmap --committer mapped_name_or_email

Of course it only works if the --use-mailmap option is used.

Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
---
 revision.c         |   54 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 t/t4203-mailmap.sh |   18 ++++++++++++++++++
 2 files changed, 72 insertions(+)

diff --git a/revision.c b/revision.c
index 95d21e6..fa16b9d 100644
--- a/revision.c
+++ b/revision.c
@@ -13,6 +13,7 @@
 #include "decorate.h"
 #include "log-tree.h"
 #include "string-list.h"
+#include "mailmap.h"
 
 volatile show_early_output_fn_t show_early_output;
 
@@ -2219,6 +2220,51 @@ static int rewrite_parents(struct rev_info *revs, struct commit *commit)
 	return 0;
 }
 
+static int commit_rewrite_person(struct strbuf *buf, const char *what, struct string_list *mailmap)
+{
+	char *person, *endp;
+	size_t len;
+	struct strbuf name = STRBUF_INIT;
+	struct strbuf mail = STRBUF_INIT;
+	struct ident_split ident;
+
+	person = strstr(buf->buf, what);
+	if (!person)
+		goto left_intact;
+
+	person += strlen(what);
+	endp = strchr(person, '\n');
+	if (!endp)
+		goto left_intact;
+
+	len = endp - person;
+
+	if (split_ident_line(&ident, person, len))
+		goto left_intact;
+
+	strbuf_add(&name, ident.name_begin, ident.name_end - ident.name_begin);
+	strbuf_add(&mail, ident.mail_begin, ident.mail_end - ident.mail_begin);
+
+	if (map_user(mailmap, &mail, &name)) {
+		strbuf_addf(&name, " <%s>", mail.buf);
+
+		strbuf_splice(buf, ident.name_begin - buf->buf,
+			      ident.mail_end - ident.name_begin + 1,
+			      name.buf, name.len);
+
+		strbuf_release(&name);
+		strbuf_release(&mail);
+
+		return 1;
+	}
+
+left_intact:
+	strbuf_release(&name);
+	strbuf_release(&mail);
+
+	return 0;
+}
+
 static int commit_match(struct commit *commit, struct rev_info *opt)
 {
 	int retval;
@@ -2237,6 +2283,14 @@ static int commit_match(struct commit *commit, struct rev_info *opt)
 	if (buf.len)
 		strbuf_addstr(&buf, commit->buffer);
 
+	if (opt->mailmap) {
+		if (!buf.len)
+			strbuf_addstr(&buf, commit->buffer);
+
+		commit_rewrite_person(&buf, "\nauthor ", opt->mailmap);
+		commit_rewrite_person(&buf, "\ncommitter ", opt->mailmap);
+	}
+
 	/* Append "fake" message parts as needed */
 	if (opt->show_notes) {
 		if (!buf.len)
diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index db043dc..e16187f 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -248,11 +248,29 @@ Author: Other Author <other@author.xx>
 Author: Some Dude <some@dude.xx>
 Author: A U Thor <author@example.com>
 EOF
+
 test_expect_success 'Log output with --use-mailmap' '
 	git log --use-mailmap | grep Author >actual &&
 	test_cmp expect actual
 '
 
+cat >expect <<\EOF
+Author: Santa Claus <santa.claus@northpole.xx>
+Author: Santa Claus <santa.claus@northpole.xx>
+EOF
+
+test_expect_success 'Grep author with --use-mailmap' '
+	git log --use-mailmap --author Santa | grep Author >actual &&
+	test_cmp expect actual
+'
+
+>expect
+
+test_expect_success 'Only grep replaced author with --use-mailmap' '
+	git log --use-mailmap --author "<cto@coompany.xx>" >actual &&
+	test_cmp expect actual
+'
+
 # git blame
 cat >expect <<\EOF
 ^OBJI (A U Thor     DATE 1) one
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] log: grep author/committer using mailmap
  2012-12-27 15:31         ` [PATCH v2] " Antoine Pelisse
@ 2012-12-27 18:45           ` Junio C Hamano
  2012-12-27 18:48             ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2012-12-27 18:45 UTC (permalink / raw)
  To: Antoine Pelisse; +Cc: git

Antoine Pelisse <apelisse@gmail.com> writes:

> Currently you can use mailmap to display log authors and committers
> but you can't use the mailmap to find commits with mapped values.
>
> This commit allows you to run:
>
>     git log --use-mailmap --author mapped_name_or_email
>     git log --use-mailmap --committer mapped_name_or_email
>
> Of course it only works if the --use-mailmap option is used.
>
> Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
> ---

Thanks.  I'll queue this on top.

-- >8 --
Subject: [PATCH] log --use-mailmap: optimize for cases without --author/--committer search

When we taught the commit_match() mechanism to pay attention to the
new --use-mailmap option, we started to unconditionally copy the
commit object to a temporary buffer, just in case we need the author
and committer lines updated via the mailmap mechanism.

It turns out that this has a rather unpleasant performance
implications.  In the linux kernel repository, running

  $ git log --author='Junio C Hamano' --pretty=short >/dev/null

under /usr/bin/time, with and without --use-mailmap (the .mailmap
file is 118 entries long, the particular author does not appear in
it), cost (with warm cache):

  [without --use-mailmap]
  5.34user 0.25system 0:05.60elapsed 100%CPU (0avgtext+0avgdata 2004832maxresident)k
  0inputs+0outputs (0major+137600minor)pagefaults 0swaps

  [with --use-mailmap]
  6.87user 0.24system 0:07.11elapsed 99%CPU (0avgtext+0avgdata 2006352maxresident)k
  0inputs+0outputs (0major+137696minor)pagefaults 0swaps

which is with about 29% overhead.  The command is doing extra work,
so the extra cost may be justified.

But it is inexcusable to pay the cost when we do not need
author/committer match.  In the same repository,

  $ git log --grep='fix menuconfig on debian lenny' --pretty=short >/dev/null

shows very similar numbers as the above:

  [without --use-mailmap]
  5.30user 0.24system 0:05.55elapsed 99%CPU (0avgtext+0avgdata 2004896maxresident)k
  0inputs+0outputs (0major+137604minor)pagefaults 0swaps

  [with --use-mailmap]
  6.82user 0.26system 0:07.07elapsed 100%CPU (0avgtext+0avgdata 2006352maxresident)k
  0inputs+0outputs (0major+137697minor)pagefaults 0swaps

The latter case is an unnecessary performance regression.  We may
want to _show_ the result with mailmap applied, but we do not have
to copy and rewrite the author/committer of all commits we try to
match if we do not query for these fields.

Trivially optimize this performace regression by limiting the
rewrites for only when we are matching with author/committer fields.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 revision.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/revision.c b/revision.c
index fa16b9d..56b72f7 100644
--- a/revision.c
+++ b/revision.c
@@ -2283,7 +2283,7 @@ static int commit_match(struct commit *commit, struct rev_info *opt)
 	if (buf.len)
 		strbuf_addstr(&buf, commit->buffer);
 
-	if (opt->mailmap) {
+	if (opt->grep_filter.header_list && opt->mailmap) {
 		if (!buf.len)
 			strbuf_addstr(&buf, commit->buffer);
 
-- 
1.8.1.rc3.221.g8d09d94

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] log: grep author/committer using mailmap
  2012-12-27 18:45           ` Junio C Hamano
@ 2012-12-27 18:48             ` Junio C Hamano
  2012-12-28 18:00               ` Antoine Pelisse
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2012-12-27 18:48 UTC (permalink / raw)
  To: Antoine Pelisse; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> Thanks.  I'll queue this on top.
>
> -- >8 --
> Subject: [PATCH] log --use-mailmap: optimize for cases without --author/--committer search

And this I will *not* queue further on top.

-- >8 --
Subject: [PATCH] [DO NOT USE] log --use-mailmap --author/--committer: further
 optimize identity rewriting

We used to always allocate a temporary buffer to search for
author/committer names even when the author/committer does not
appear in the mailmap.  Update the logic to do the allocation
on demand to reduce needless malloc()/free() calls.

It turns out that this does not affect performance at all; all the
time is spent in map_user() which is a look-up in string_list, so
let's not use this patch in the production, but it illustrates an
interesting point.

In map_identities() function, we already know who the author
recorded in the commit is, either in "author" strbuf, or in buffer
between [a_at..a_len], so we could grep_buffer() the author
regexp(s) specified from the command line right there, and combine
that result with the main grep_buffer() done for the --grep= option
at the end of the commit_match() function.

That may essentially amount to going in the totally opposite
direction from what 2d10c55 (git log: Unify header_filter and
message_filter into one., 2006-09-20) attempted to do.  We used to
have two grep expressions (one for header, the other one for body)
commit_match() runs grep_buffer() on and combined the results.
2d10c55 merged them into one grep expression by introducing a term
that matches only header elements.  But we would instead split the
"header" expression into "author" and "committer" expressions
(making it three from one) if we go that route.

In any case, I *think* the bottleneck is in map_user() so until that
is solved, such an update (or this patch) is not very useful.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 revision.c | 95 +++++++++++++++++++++++++++++++++++++-------------------------
 1 file changed, 57 insertions(+), 38 deletions(-)

diff --git a/revision.c b/revision.c
index 56b72f7..4b66598 100644
--- a/revision.c
+++ b/revision.c
@@ -2220,49 +2220,73 @@ static int rewrite_parents(struct rev_info *revs, struct commit *commit)
 	return 0;
 }
 
-static int commit_rewrite_person(struct strbuf *buf, const char *what, struct string_list *mailmap)
+static void map_person(const char *buf, size_t len, const char *head, int headlen,
+		       struct strbuf *result, struct string_list *mailmap,
+		       int *matchofs, int *matchlen)
 {
-	char *person, *endp;
-	size_t len;
+	struct ident_split ident;
+	const char *endp, *person;
 	struct strbuf name = STRBUF_INIT;
 	struct strbuf mail = STRBUF_INIT;
-	struct ident_split ident;
 
-	person = strstr(buf->buf, what);
+	person = memmem(buf, len, head, headlen);
 	if (!person)
-		goto left_intact;
-
-	person += strlen(what);
+		return;
+	person += headlen;
 	endp = strchr(person, '\n');
 	if (!endp)
-		goto left_intact;
-
-	len = endp - person;
-
-	if (split_ident_line(&ident, person, len))
-		goto left_intact;
-
+		return;
+	*matchofs = person - buf;
+	*matchlen = endp - person;
+	if (split_ident_line(&ident, person, *matchlen))
+		return;
 	strbuf_add(&name, ident.name_begin, ident.name_end - ident.name_begin);
 	strbuf_add(&mail, ident.mail_begin, ident.mail_end - ident.mail_begin);
-
-	if (map_user(mailmap, &mail, &name)) {
-		strbuf_addf(&name, " <%s>", mail.buf);
-
-		strbuf_splice(buf, ident.name_begin - buf->buf,
-			      ident.mail_end - ident.name_begin + 1,
-			      name.buf, name.len);
-
-		strbuf_release(&name);
-		strbuf_release(&mail);
-
-		return 1;
-	}
-
-left_intact:
+	if (map_user(mailmap, &mail, &name))
+		strbuf_addf(result, "%s <%s>", name.buf, mail.buf);
 	strbuf_release(&name);
 	strbuf_release(&mail);
+}
 
-	return 0;
+static void map_identities(struct strbuf *buf, const char *buffer, struct string_list *mailmap)
+{
+	const char *eob;
+	struct strbuf author = STRBUF_INIT;
+	struct strbuf committer = STRBUF_INIT;
+	int a_at = -1, a_len, c_at = -1, c_len;
+
+	eob = strstr(buffer, "\n\n");
+	if (!eob)
+		eob = buffer + strlen(buffer);
+	map_person(buffer, eob - buffer, "\nauthor ", 8, &author, mailmap,
+		   &a_at, &a_len);
+	map_person(buffer, eob - buffer, "\ncommitter ", 11, &committer, mailmap,
+		   &c_at, &c_len);
+	if (!author.len && !committer.len)
+		goto done;
+
+	/* Now, we know we need rewriting */
+	if (!buf->len)
+		strbuf_addstr(buf, buffer);
+
+	if (c_at < 0 || !committer.len) {
+		strbuf_splice(buf, a_at, a_len, author.buf, author.len);
+	} else if (a_at < 0 || !author.len) {
+		strbuf_splice(buf, c_at, c_len, committer.buf, committer.len);
+	} else if (a_at < c_at) {
+		strbuf_splice(buf, c_at, c_len, committer.buf, committer.len);
+		strbuf_splice(buf, a_at, a_len, author.buf, author.len);
+	} else {
+		/*
+		 * The commit records committer before the author, which is malformed,
+		 * which we may want to warn about.
+		 */
+		strbuf_splice(buf, a_at, a_len, author.buf, author.len);
+		strbuf_splice(buf, c_at, c_len, committer.buf, committer.len);
+	}
+ done:
+	strbuf_release(&author);
+	strbuf_release(&committer);
 }
 
 static int commit_match(struct commit *commit, struct rev_info *opt)
@@ -2283,13 +2307,8 @@ static int commit_match(struct commit *commit, struct rev_info *opt)
 	if (buf.len)
 		strbuf_addstr(&buf, commit->buffer);
 
-	if (opt->grep_filter.header_list && opt->mailmap) {
-		if (!buf.len)
-			strbuf_addstr(&buf, commit->buffer);
-
-		commit_rewrite_person(&buf, "\nauthor ", opt->mailmap);
-		commit_rewrite_person(&buf, "\ncommitter ", opt->mailmap);
-	}
+	if (opt->grep_filter.header_list && opt->mailmap)
+		map_identities(&buf, commit->buffer, opt->mailmap);
 
 	/* Append "fake" message parts as needed */
 	if (opt->show_notes) {
-- 
1.8.1.rc3.221.g8d09d94

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] log: grep author/committer using mailmap
  2012-12-27 18:48             ` Junio C Hamano
@ 2012-12-28 18:00               ` Antoine Pelisse
  2012-12-28 18:43                 ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Antoine Pelisse @ 2012-12-28 18:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Actually, gprof seems to be unhappy about the number of call to
strbuf_grow() in map_user() (25% of the time spent in map_user() is
spent in strbuf_grow()).

That probably comes from the repeated call to strbuf_addch() when
lowering the email address.
At this point, we are also copying the '\0' for every char we add,
doubling the copy.
This may not be much of a difference, but it seems to be called 15
millions times when running:
$ git log --author='Junio C Hamano' --use-mailmap

Maybe we should come up with another way to lower this email address afterall.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] log: grep author/committer using mailmap
  2012-12-28 18:00               ` Antoine Pelisse
@ 2012-12-28 18:43                 ` Junio C Hamano
  2012-12-28 20:37                   ` Antoine Pelisse
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2012-12-28 18:43 UTC (permalink / raw)
  To: Antoine Pelisse; +Cc: git

Antoine Pelisse <apelisse@gmail.com> writes:

> Actually, gprof seems to be unhappy about the number of call to
> strbuf_grow() in map_user() (25% of the time spent in map_user() is
> spent in strbuf_grow()).
>
> That probably comes from the repeated call to strbuf_addch() when
> lowering the email address.

This is about your rewritten implementation that hasn't escaped to
the general public but sitting in 'next', right?

Two things that immediately come to mind are:

 - initialization of lowermail can use strbuf_init() instead;
 - downcasing can be done in place, i.e. "lowermail.buf[i] = ...".

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] log: grep author/committer using mailmap
  2012-12-28 18:43                 ` Junio C Hamano
@ 2012-12-28 20:37                   ` Antoine Pelisse
  0 siblings, 0 replies; 15+ messages in thread
From: Antoine Pelisse @ 2012-12-28 20:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

> This is about your rewritten implementation that hasn't escaped to
> the general public but sitting in 'next', right?
>
> Two things that immediately come to mind are:
>
>  - initialization of lowermail can use strbuf_init() instead;
>  - downcasing can be done in place, i.e. "lowermail.buf[i] = ...".

Yep,
I don't think it's merged to 'next', I will squash those changes appropriately.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2012-12-28 20:38 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-12-22 16:58 [PATCH 0/2] Mailmap in log improvements Antoine Pelisse
2012-12-22 16:58 ` [PATCH 1/2] log: grep author/committer using mailmap Antoine Pelisse
2012-12-26 19:27   ` Junio C Hamano
2012-12-26 21:12     ` Antoine Pelisse
2012-12-26 21:37       ` Junio C Hamano
2012-12-27 15:31         ` [PATCH v2] " Antoine Pelisse
2012-12-27 18:45           ` Junio C Hamano
2012-12-27 18:48             ` Junio C Hamano
2012-12-28 18:00               ` Antoine Pelisse
2012-12-28 18:43                 ` Junio C Hamano
2012-12-28 20:37                   ` Antoine Pelisse
2012-12-22 16:58 ` [PATCH 2/2] log: add log.mailmap configuration option Antoine Pelisse
2012-12-23  4:26   ` Junio C Hamano
2012-12-26 16:14     ` Junio C Hamano
2012-12-26 16:42       ` Antoine Pelisse

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).