git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* format-patch subject-prefix gets truncated when using the --numbered flag
@ 2017-02-28 15:59 Adrian Dudau
  2017-02-28 17:42 ` Junio C Hamano
  2017-02-28 18:17 ` Jeff King
  0 siblings, 2 replies; 13+ messages in thread
From: Adrian Dudau @ 2017-02-28 15:59 UTC (permalink / raw)
  To: git@vger.kernel.org

Hello,

I noticed that the --subject-prefix string gets truncated sometimes,
but only when using the --numbered flat. Here's an example:

addu@sestofb11:/data/fb/addu/git$ export longm="very very very very
very very very very very very very very very very long prefix"


addu@sestofb11:/data/fb/addu/git$ git format-patch -1 --subject-
prefix="$longm][PATCH"

As expected, in the generated patch file we have:
Subject: [very very very very very very very very very very very very
very very long prefix][PATCH] 
 First batch after 2.12

But now, if I pass the --numbered flag too:
addu@sestofb11:/data/fb/addu/git$ git format-patch -1 --numbered --
subject-prefix="$longm][PATCH"

In the generated patch file we get this: 
Subject: [very very very very very very very very very very verFirst
batch 
 after 2.12

This is happening on the latest master branch, so I dug through the
code and tracked the issue to this piece of code in log-tree.c:

        if (opt->total > 0) {
                static char buffer[64];
                snprintf(buffer, sizeof(buffer),
                         "Subject: [%s%s%0*d/%d] ",
                         opt->subject_prefix,
                         *opt->subject_prefix ? " " : "",
                         digits_in_number(opt->total),
                         opt->nr, opt->total);
                subject = buffer;
        } else if (opt->total == 0 && opt->subject_prefix && *opt-
>subject_prefix) {
                static char buffer[256];
                snprintf(buffer, sizeof(buffer),
                         "Subject: [%s] ",
                         opt->subject_prefix);
                subject = buffer;
        } else {
                subject = "Subject: ";
        }

Apparently the size of the "buffer" var is different in the two
situations. Anybody knows if this is by design or just an old
oversight?
I can send a patch to fix it but I'm not very familiar with the git
code and I'm afraid some hidden consequence I don't see right now.

Cheers,
--Adrian

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: format-patch subject-prefix gets truncated when using the --numbered flag
  2017-02-28 15:59 format-patch subject-prefix gets truncated when using the --numbered flag Adrian Dudau
@ 2017-02-28 17:42 ` Junio C Hamano
  2017-02-28 19:33   ` Junio C Hamano
  2017-02-28 18:17 ` Jeff King
  1 sibling, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2017-02-28 17:42 UTC (permalink / raw)
  To: Adrian Dudau; +Cc: git@vger.kernel.org

Adrian Dudau <Adrian.Dudau@enea.com> writes:

> I noticed that the --subject-prefix string gets truncated sometimes,
> but only when using the --numbered flat. Here's an example:
>
> addu@sestofb11:/data/fb/addu/git$ export longm="very very very very
> very very very very very very very very very very long prefix"

This looks like "dr, my arm hurts when I twist it this way---don't
do that then" ;-).  Truncation is designed and intended to preserve
space for the real title of commit.

Having said that...

> This is happening on the latest master branch, so I dug through the
> code and tracked the issue to this piece of code in log-tree.c:
>
>         if (opt->total > 0) {
>                 static char buffer[64];
>                 snprintf(buffer, sizeof(buffer),
>                          "Subject: [%s%s%0*d/%d] ",
>                          opt->subject_prefix,
>                          *opt->subject_prefix ? " " : "",
>                          digits_in_number(opt->total),
>                          opt->nr, opt->total);
>                 subject = buffer;
>         } else if (opt->total == 0 && opt->subject_prefix && *opt-
>>subject_prefix) {
>                 static char buffer[256];
>                 snprintf(buffer, sizeof(buffer),
>                          "Subject: [%s] ",
>                          opt->subject_prefix);
>                 subject = buffer;
>         } else {
>                 subject = "Subject: ";
>         }
>
> Apparently the size of the "buffer" var is different in the two
> situations. Anybody knows if this is by design or just an old
> oversight?

I think this is just an old oversight.  The latter should have been
even shorter than the former one (or the former should be longer
than the latter) to account for the width of the counter e.g. 01/27.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: format-patch subject-prefix gets truncated when using the --numbered flag
  2017-02-28 15:59 format-patch subject-prefix gets truncated when using the --numbered flag Adrian Dudau
  2017-02-28 17:42 ` Junio C Hamano
@ 2017-02-28 18:17 ` Jeff King
  2017-03-01 11:36   ` [PATCH 1/2] log-tree: factor out fmt_output_email_subject() René Scharfe
  2017-03-01 11:37   ` [PATCH 2/2] pretty: use fmt_output_email_subject() René Scharfe
  1 sibling, 2 replies; 13+ messages in thread
From: Jeff King @ 2017-02-28 18:17 UTC (permalink / raw)
  To: Adrian Dudau; +Cc: git@vger.kernel.org

On Tue, Feb 28, 2017 at 03:59:16PM +0000, Adrian Dudau wrote:

> This is happening on the latest master branch, so I dug through the
> code and tracked the issue to this piece of code in log-tree.c:
> 
>         if (opt->total > 0) {
>                 static char buffer[64];
>                 snprintf(buffer, sizeof(buffer),
>                          "Subject: [%s%s%0*d/%d] ",
>                          opt->subject_prefix,
>                          *opt->subject_prefix ? " " : "",
>                          digits_in_number(opt->total),
>                          opt->nr, opt->total);
>                 subject = buffer;
>         } else if (opt->total == 0 && opt->subject_prefix && *opt-
> >subject_prefix) {
>                 static char buffer[256];
>                 snprintf(buffer, sizeof(buffer),
>                          "Subject: [%s] ",
>                          opt->subject_prefix);
>                 subject = buffer;
>         } else {
>                 subject = "Subject: ";
>         }
> 
> Apparently the size of the "buffer" var is different in the two
> situations. Anybody knows if this is by design or just an old
> oversight?

I think it's just an old oversight. There are some other fixed-size
buffers later, too, which could similarly truncate.

I think these should all be "static strbuf", and entering the function
they should get strbuf_reset(), followed by a strbuf_addf(). The static
strbufs will be the owners of the allocated heap memory, and it will get
reused from call to call.

That stops the immediate problem. As a function interface, it's pretty
ugly. It would probably make more sense for the caller to pass in a
strbuf rather than have us pass out pointers to static storage. For the
call in make_cover_letter(), that would be fine. For show_log(), it's
less clear. That's called for every commit in "git log", which might be
a little sensitive to allocations.

The only persistent storage it has is via the rev_info. Perhaps it could
hold some scratch buffers for the traversal (though I don't think we
currently do any resource-freeing when done with a rev_info, so it
effectively becomes a leak).

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: format-patch subject-prefix gets truncated when using the --numbered flag
  2017-02-28 17:42 ` Junio C Hamano
@ 2017-02-28 19:33   ` Junio C Hamano
  2017-02-28 22:44     ` Jeff King
  0 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2017-02-28 19:33 UTC (permalink / raw)
  To: Adrian Dudau; +Cc: git@vger.kernel.org

Junio C Hamano <gitster@pobox.com> writes:

> Adrian Dudau <Adrian.Dudau@enea.com> writes:
>
>> I noticed that the --subject-prefix string gets truncated sometimes,
>> but only when using the --numbered flat. Here's an example:
>>
>> addu@sestofb11:/data/fb/addu/git$ export longm="very very very very
>> very very very very very very very very very very long prefix"
>
> This looks like "dr, my arm hurts when I twist it this way---don't
> do that then" ;-).  Truncation is designed and intended to preserve
> space for the real title of commit.
> 
> Having said that...
> ...
> I think this is just an old oversight.  The latter should have been
> even shorter than the former one (or the former should be longer
> than the latter) to account for the width of the counter e.g. 01/27.

And having said all that, if we really want to allow overlong
subject prefix that would end up hiding the real title of the patch,
a modern way to do so would be to use xstrfmt() like the attached
toy-patch does.  Note that this is merely a small first step---you'd
notice that "subject" is kept around as a "static" and only freed
upon entry to this function for the second time, to preserve the
ownership model of the original code.  In a real "fix" (if this
needs to be "fixed", that is), I think the ownership model of the
storage used for *subject_p and *extra_headers_p needs to be updated
so that it will become caller's responsibility to free them
(similarly, the ownership model of opt->diffopt.stat_sep that is
assigned the address of the static buffer[] in the same function
needs to be revisited).

That "buffer" thing I think would need to be a bit more careful even
in the current code, which _does_ use snprintf() correctly to avoid
overflowing the buffer[], by the way.  If you have an overlong
opt->mime_boundary, the resulting "e-mail" looking output can become
structurely broken.  The truncation may happen way before the full
line for Content-Transfer-Encoding: is written, for example.

So this function seems to have a lot more graver problems that need
to be looked at.

 log-tree.c | 24 +++++++++---------------
 1 file changed, 9 insertions(+), 15 deletions(-)

diff --git a/log-tree.c b/log-tree.c
index 8c2415747a..24c98f5a80 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -337,29 +337,23 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 			     const char **extra_headers_p,
 			     int *need_8bit_cte_p)
 {
-	const char *subject = NULL;
+	static const char *subject = NULL;
 	const char *extra_headers = opt->extra_headers;
 	const char *name = oid_to_hex(opt->zero_commit ?
 				      &null_oid : &commit->object.oid);
 
+	free((void *)subject);
 	*need_8bit_cte_p = 0; /* unknown */
 	if (opt->total > 0) {
-		static char buffer[64];
-		snprintf(buffer, sizeof(buffer),
-			 "Subject: [%s%s%0*d/%d] ",
-			 opt->subject_prefix,
-			 *opt->subject_prefix ? " " : "",
-			 digits_in_number(opt->total),
-			 opt->nr, opt->total);
-		subject = buffer;
+		subject = xstrfmt("Subject: [%s%s%0*d/%d] ",
+				  opt->subject_prefix,
+				  *opt->subject_prefix ? " " : "",
+				  digits_in_number(opt->total),
+				  opt->nr, opt->total);
 	} else if (opt->total == 0 && opt->subject_prefix && *opt->subject_prefix) {
-		static char buffer[256];
-		snprintf(buffer, sizeof(buffer),
-			 "Subject: [%s] ",
-			 opt->subject_prefix);
-		subject = buffer;
+		subject = xstrfmt("Subject: [%s] ", opt->subject_prefix);
 	} else {
-		subject = "Subject: ";
+		subject = xstrdup("Subject: ");
 	}
 
 	fprintf(opt->diffopt.file, "From %s Mon Sep 17 00:00:00 2001\n", name);

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: format-patch subject-prefix gets truncated when using the --numbered flag
  2017-02-28 19:33   ` Junio C Hamano
@ 2017-02-28 22:44     ` Jeff King
  0 siblings, 0 replies; 13+ messages in thread
From: Jeff King @ 2017-02-28 22:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Adrian Dudau, git@vger.kernel.org

On Tue, Feb 28, 2017 at 11:33:55AM -0800, Junio C Hamano wrote:

> And having said all that, if we really want to allow overlong
> subject prefix that would end up hiding the real title of the patch,
> a modern way to do so would be to use xstrfmt() like the attached
> toy-patch does.

If you are going to keep the ownership inside this function via statics,
you should use a strbuf. That lets you avoid reallocating on each entry
(and we call this once per commit in a traversal, though see below).

> Note that this is merely a small first step---you'd
> notice that "subject" is kept around as a "static" and only freed
> upon entry to this function for the second time, to preserve the
> ownership model of the original code.  In a real "fix" (if this
> needs to be "fixed", that is), I think the ownership model of the
> storage used for *subject_p and *extra_headers_p needs to be updated
> so that it will become caller's responsibility to free them
> (similarly, the ownership model of opt->diffopt.stat_sep that is
> assigned the address of the static buffer[] in the same function
> needs to be revisited).

I agree the ownership model is ugly, and would be nicer if the caller
passed in a strbuf to write into (or a pointer out-parameter, I guess,
but I think strbufs make the ownership semantics much more obvious).

I would just worry about allocation overhead, but that is probably an
overblown concern for format-patch. It tends to be called on a much
smaller set of commits, and it is generally used with "-p", which
creates a lot of overhead already.

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/2] log-tree: factor out fmt_output_email_subject()
  2017-02-28 18:17 ` Jeff King
@ 2017-03-01 11:36   ` René Scharfe
  2017-03-01 11:37   ` [PATCH 2/2] pretty: use fmt_output_email_subject() René Scharfe
  1 sibling, 0 replies; 13+ messages in thread
From: René Scharfe @ 2017-03-01 11:36 UTC (permalink / raw)
  To: Jeff King, Adrian Dudau; +Cc: git@vger.kernel.org

Use a strbuf to store the subject prefix string and move its
construction into its own function.  This gets rid of two arbitrary
length limits and allows the string to be added by callers directly.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
---
 log-tree.c | 40 ++++++++++++++++++++--------------------
 log-tree.h |  1 +
 2 files changed, 21 insertions(+), 20 deletions(-)

diff --git a/log-tree.c b/log-tree.c
index 8c2415747a..44febb75ab 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -332,35 +332,33 @@ void fmt_output_commit(struct strbuf *filename,
 	strbuf_release(&subject);
 }
 
+void fmt_output_email_subject(struct strbuf *sb, struct rev_info *opt)
+{
+	if (opt->total > 0) {
+		strbuf_addf(sb, "Subject: [%s%s%0*d/%d] ",
+			    opt->subject_prefix,
+			    *opt->subject_prefix ? " " : "",
+			    digits_in_number(opt->total),
+			    opt->nr, opt->total);
+	} else if (opt->total == 0 && opt->subject_prefix && *opt->subject_prefix) {
+		strbuf_addf(sb, "Subject: [%s] ",
+			    opt->subject_prefix);
+	} else {
+		strbuf_addstr(sb, "Subject: ");
+	}
+}
+
 void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 			     const char **subject_p,
 			     const char **extra_headers_p,
 			     int *need_8bit_cte_p)
 {
-	const char *subject = NULL;
+	static struct strbuf subject = STRBUF_INIT;
 	const char *extra_headers = opt->extra_headers;
 	const char *name = oid_to_hex(opt->zero_commit ?
 				      &null_oid : &commit->object.oid);
 
 	*need_8bit_cte_p = 0; /* unknown */
-	if (opt->total > 0) {
-		static char buffer[64];
-		snprintf(buffer, sizeof(buffer),
-			 "Subject: [%s%s%0*d/%d] ",
-			 opt->subject_prefix,
-			 *opt->subject_prefix ? " " : "",
-			 digits_in_number(opt->total),
-			 opt->nr, opt->total);
-		subject = buffer;
-	} else if (opt->total == 0 && opt->subject_prefix && *opt->subject_prefix) {
-		static char buffer[256];
-		snprintf(buffer, sizeof(buffer),
-			 "Subject: [%s] ",
-			 opt->subject_prefix);
-		subject = buffer;
-	} else {
-		subject = "Subject: ";
-	}
 
 	fprintf(opt->diffopt.file, "From %s Mon Sep 17 00:00:00 2001\n", name);
 	graph_show_oneline(opt->graph);
@@ -417,7 +415,9 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 		opt->diffopt.stat_sep = buffer;
 		strbuf_release(&filename);
 	}
-	*subject_p = subject;
+	strbuf_reset(&subject);
+	fmt_output_email_subject(&subject, opt);
+	*subject_p = subject.buf;
 	*extra_headers_p = extra_headers;
 }
 
diff --git a/log-tree.h b/log-tree.h
index c8116e60cd..dd75dd7770 100644
--- a/log-tree.h
+++ b/log-tree.h
@@ -30,5 +30,6 @@ void load_ref_decorations(int flags);
 #define FORMAT_PATCH_NAME_MAX 64
 void fmt_output_commit(struct strbuf *, struct commit *, struct rev_info *);
 void fmt_output_subject(struct strbuf *, const char *subject, struct rev_info *);
+void fmt_output_email_subject(struct strbuf *, struct rev_info *);
 
 #endif
-- 
2.12.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/2] pretty: use fmt_output_email_subject()
  2017-02-28 18:17 ` Jeff King
  2017-03-01 11:36   ` [PATCH 1/2] log-tree: factor out fmt_output_email_subject() René Scharfe
@ 2017-03-01 11:37   ` René Scharfe
  2017-03-01 15:41     ` René Scharfe
                       ` (2 more replies)
  1 sibling, 3 replies; 13+ messages in thread
From: René Scharfe @ 2017-03-01 11:37 UTC (permalink / raw)
  To: Jeff King, Adrian Dudau; +Cc: git@vger.kernel.org

Add the email-style subject prefix (e.g. "Subject: [PATCH] ") directly
when it's needed instead of letting log_write_email_headers() prepare
it in a static buffer in advance.  This simplifies storage ownership and
code flow.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
---
This slows down the last three tests in p4000 by ca. 3% for some reason,
so we may want to only do the first part for now, which is performance
neutral on my machine.

 builtin/log.c      | 5 +++--
 builtin/shortlog.c | 2 +-
 commit.h           | 6 ++++--
 log-tree.c         | 9 +++------
 log-tree.h         | 1 -
 pretty.c           | 7 ++++---
 6 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 55d20cc2d8..281af8c1ec 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -989,8 +989,7 @@ static void make_cover_letter(struct rev_info *rev, int use_stdout,
 	    open_next_file(NULL, rev->numbered_files ? NULL : "cover-letter", rev, quiet))
 		return;
 
-	log_write_email_headers(rev, head, &pp.subject, &pp.after_subject,
-				&need_8bit_cte);
+	log_write_email_headers(rev, head, &pp.after_subject, &need_8bit_cte);
 
 	for (i = 0; !need_8bit_cte && i < nr; i++) {
 		const char *buf = get_commit_buffer(list[i], NULL);
@@ -1005,6 +1004,8 @@ static void make_cover_letter(struct rev_info *rev, int use_stdout,
 	msg = body;
 	pp.fmt = CMIT_FMT_EMAIL;
 	pp.date_mode.type = DATE_RFC2822;
+	pp.rev = rev;
+	pp.print_email_subject = 1;
 	pp_user_info(&pp, NULL, &sb, committer, encoding);
 	pp_title_line(&pp, &msg, &sb, encoding, need_8bit_cte);
 	pp_remainder(&pp, &msg, &sb, 0);
diff --git a/builtin/shortlog.c b/builtin/shortlog.c
index c9585d475d..f78bb4818d 100644
--- a/builtin/shortlog.c
+++ b/builtin/shortlog.c
@@ -148,7 +148,7 @@ void shortlog_add_commit(struct shortlog *log, struct commit *commit)
 
 	ctx.fmt = CMIT_FMT_USERFORMAT;
 	ctx.abbrev = log->abbrev;
-	ctx.subject = "";
+	ctx.print_email_subject = 1;
 	ctx.after_subject = "";
 	ctx.date_mode.type = DATE_NORMAL;
 	ctx.output_encoding = get_log_output_encoding();
diff --git a/commit.h b/commit.h
index 9c12abb911..459daef94a 100644
--- a/commit.h
+++ b/commit.h
@@ -142,21 +142,24 @@ static inline int cmit_fmt_is_mail(enum cmit_fmt fmt)
 	return (fmt == CMIT_FMT_EMAIL || fmt == CMIT_FMT_MBOXRD);
 }
 
+struct rev_info; /* in revision.h, it circularly uses enum cmit_fmt */
+
 struct pretty_print_context {
 	/*
 	 * Callers should tweak these to change the behavior of pp_* functions.
 	 */
 	enum cmit_fmt fmt;
 	int abbrev;
-	const char *subject;
 	const char *after_subject;
 	int preserve_subject;
 	struct date_mode date_mode;
 	unsigned date_mode_explicit:1;
+	unsigned print_email_subject:1;
 	int expand_tabs_in_log;
 	int need_8bit_cte;
 	char *notes_message;
 	struct reflog_walk_info *reflog_info;
+	struct rev_info *rev;
 	const char *output_encoding;
 	struct string_list *mailmap;
 	int color;
@@ -175,7 +178,6 @@ struct userformat_want {
 };
 
 extern int has_non_ascii(const char *text);
-struct rev_info; /* in revision.h, it circularly uses enum cmit_fmt */
 extern const char *logmsg_reencode(const struct commit *commit,
 				   char **commit_encoding,
 				   const char *output_encoding);
diff --git a/log-tree.c b/log-tree.c
index 44febb75ab..4618dd04ca 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -349,11 +349,9 @@ void fmt_output_email_subject(struct strbuf *sb, struct rev_info *opt)
 }
 
 void log_write_email_headers(struct rev_info *opt, struct commit *commit,
-			     const char **subject_p,
 			     const char **extra_headers_p,
 			     int *need_8bit_cte_p)
 {
-	static struct strbuf subject = STRBUF_INIT;
 	const char *extra_headers = opt->extra_headers;
 	const char *name = oid_to_hex(opt->zero_commit ?
 				      &null_oid : &commit->object.oid);
@@ -415,9 +413,6 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 		opt->diffopt.stat_sep = buffer;
 		strbuf_release(&filename);
 	}
-	strbuf_reset(&subject);
-	fmt_output_email_subject(&subject, opt);
-	*subject_p = subject.buf;
 	*extra_headers_p = extra_headers;
 }
 
@@ -602,8 +597,10 @@ void show_log(struct rev_info *opt)
 	 */
 
 	if (cmit_fmt_is_mail(opt->commit_format)) {
-		log_write_email_headers(opt, commit, &ctx.subject, &extra_headers,
+		log_write_email_headers(opt, commit, &extra_headers,
 					&ctx.need_8bit_cte);
+		ctx.rev = opt;
+		ctx.print_email_subject = 1;
 	} else if (opt->commit_format != CMIT_FMT_USERFORMAT) {
 		fputs(diff_get_color_opt(&opt->diffopt, DIFF_COMMIT), opt->diffopt.file);
 		if (opt->commit_format != CMIT_FMT_ONELINE)
diff --git a/log-tree.h b/log-tree.h
index dd75dd7770..48f11fb740 100644
--- a/log-tree.h
+++ b/log-tree.h
@@ -22,7 +22,6 @@ void format_decorations_extended(struct strbuf *sb, const struct commit *commit,
 			     format_decorations_extended((strbuf), (commit), (color), " (", ", ", ")")
 void show_decorations(struct rev_info *opt, struct commit *commit);
 void log_write_email_headers(struct rev_info *opt, struct commit *commit,
-			     const char **subject_p,
 			     const char **extra_headers_p,
 			     int *need_8bit_cte_p);
 void load_ref_decorations(int flags);
diff --git a/pretty.c b/pretty.c
index 5e683830d9..d0f86f5d85 100644
--- a/pretty.c
+++ b/pretty.c
@@ -1607,8 +1607,9 @@ void pp_title_line(struct pretty_print_context *pp,
 				pp->preserve_subject ? "\n" : " ");
 
 	strbuf_grow(sb, title.len + 1024);
-	if (pp->subject) {
-		strbuf_addstr(sb, pp->subject);
+	if (pp->print_email_subject) {
+		if (pp->rev)
+			fmt_output_email_subject(sb, pp->rev);
 		if (needs_rfc2047_encoding(title.buf, title.len, RFC2047_SUBJECT))
 			add_rfc2047(sb, title.buf, title.len,
 						encoding, RFC2047_SUBJECT);
@@ -1818,7 +1819,7 @@ void pretty_print_commit(struct pretty_print_context *pp,
 	}
 
 	pp_header(pp, encoding, commit, &msg, sb);
-	if (pp->fmt != CMIT_FMT_ONELINE && !pp->subject) {
+	if (pp->fmt != CMIT_FMT_ONELINE && !pp->print_email_subject) {
 		strbuf_addch(sb, '\n');
 	}
 
-- 
2.12.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] pretty: use fmt_output_email_subject()
  2017-03-01 11:37   ` [PATCH 2/2] pretty: use fmt_output_email_subject() René Scharfe
@ 2017-03-01 15:41     ` René Scharfe
  2017-03-01 18:08     ` Junio C Hamano
  2017-03-01 19:22     ` Jeff King
  2 siblings, 0 replies; 13+ messages in thread
From: René Scharfe @ 2017-03-01 15:41 UTC (permalink / raw)
  To: Jeff King, Adrian Dudau; +Cc: git@vger.kernel.org

Am 01.03.2017 um 12:37 schrieb René Scharfe:
> Add the email-style subject prefix (e.g. "Subject: [PATCH] ") directly
> when it's needed instead of letting log_write_email_headers() prepare
> it in a static buffer in advance.  This simplifies storage ownership and
> code flow.
>
> Signed-off-by: Rene Scharfe <l.s.r@web.de>
> ---
> This slows down the last three tests in p4000 by ca. 3% for some reason,
> so we may want to only do the first part for now, which is performance
> neutral on my machine.

Update below.

> diff --git a/commit.h b/commit.h
> index 9c12abb911..459daef94a 100644
> --- a/commit.h
> +++ b/commit.h
> @@ -142,21 +142,24 @@ static inline int cmit_fmt_is_mail(enum cmit_fmt fmt)
>  	return (fmt == CMIT_FMT_EMAIL || fmt == CMIT_FMT_MBOXRD);
>  }
>
> +struct rev_info; /* in revision.h, it circularly uses enum cmit_fmt */
> +
>  struct pretty_print_context {
>  	/*
>  	 * Callers should tweak these to change the behavior of pp_* functions.
>  	 */
>  	enum cmit_fmt fmt;
>  	int abbrev;
> -	const char *subject;
>  	const char *after_subject;
>  	int preserve_subject;
>  	struct date_mode date_mode;
>  	unsigned date_mode_explicit:1;
> +	unsigned print_email_subject:1;

Turning this into an int restores performance according to p4000. 
Didn't know that bitfields can be *that* expensive.

René

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] pretty: use fmt_output_email_subject()
  2017-03-01 11:37   ` [PATCH 2/2] pretty: use fmt_output_email_subject() René Scharfe
  2017-03-01 15:41     ` René Scharfe
@ 2017-03-01 18:08     ` Junio C Hamano
  2017-03-01 19:32       ` Jeff King
  2017-03-01 19:22     ` Jeff King
  2 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2017-03-01 18:08 UTC (permalink / raw)
  To: René Scharfe; +Cc: Jeff King, Adrian Dudau, git@vger.kernel.org

René Scharfe <l.s.r@web.de> writes:

> Add the email-style subject prefix (e.g. "Subject: [PATCH] ") directly
> when it's needed instead of letting log_write_email_headers() prepare
> it in a static buffer in advance.  This simplifies storage ownership and
> code flow.

It certainly does.  At first I wondered if there are places other
than log-write-email-headers that sets its own string to pp.subject,
but this patch removes the field from the structure to guarantee
that such a place, if existed, is properly dealt with.  Otherwise,
the resulting code would not compile.

> @@ -1005,6 +1004,8 @@ static void make_cover_letter(struct rev_info *rev, int use_stdout,
>  	msg = body;
>  	pp.fmt = CMIT_FMT_EMAIL;
>  	pp.date_mode.type = DATE_RFC2822;
> +	pp.rev = rev;
> +	pp.print_email_subject = 1;

These two are always set together?  We'll shortly see that it is not
the case below.

>  	pp_user_info(&pp, NULL, &sb, committer, encoding);
>  	pp_title_line(&pp, &msg, &sb, encoding, need_8bit_cte);
>  	pp_remainder(&pp, &msg, &sb, 0);
> diff --git a/builtin/shortlog.c b/builtin/shortlog.c
> index c9585d475d..f78bb4818d 100644
> --- a/builtin/shortlog.c
> +++ b/builtin/shortlog.c
> @@ -148,7 +148,7 @@ void shortlog_add_commit(struct shortlog *log, struct commit *commit)
>  
>  	ctx.fmt = CMIT_FMT_USERFORMAT;
>  	ctx.abbrev = log->abbrev;
> -	ctx.subject = "";
> +	ctx.print_email_subject = 1;

Here we see .rev is left to NULL here, with print_email_subject set
to true.  And in the entire patch this is the only such place.

> diff --git a/pretty.c b/pretty.c
> index 5e683830d9..d0f86f5d85 100644
> --- a/pretty.c
> +++ b/pretty.c
> @@ -1607,8 +1607,9 @@ void pp_title_line(struct pretty_print_context *pp,
>  				pp->preserve_subject ? "\n" : " ");
>  
>  	strbuf_grow(sb, title.len + 1024);
> -	if (pp->subject) {
> -		strbuf_addstr(sb, pp->subject);
> +	if (pp->print_email_subject) {
> +		if (pp->rev)
> +			fmt_output_email_subject(sb, pp->rev);

A hidden assumption this code makes is that anybody who does not
want .rev (aka "doing it as part of format-patch that may want
nr/total etc") does not want _any_ "Subject: ".  It obviously holds
true in today's code (the one in shortlog-add-commit is the only one
and it sets an empty string to .subject).

Does the loss of flexibility to the future callers matter, though?
I cannot tell offhand.

Thanks.  Let's see what others think.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] pretty: use fmt_output_email_subject()
  2017-03-01 11:37   ` [PATCH 2/2] pretty: use fmt_output_email_subject() René Scharfe
  2017-03-01 15:41     ` René Scharfe
  2017-03-01 18:08     ` Junio C Hamano
@ 2017-03-01 19:22     ` Jeff King
  2 siblings, 0 replies; 13+ messages in thread
From: Jeff King @ 2017-03-01 19:22 UTC (permalink / raw)
  To: René Scharfe; +Cc: Adrian Dudau, git@vger.kernel.org

On Wed, Mar 01, 2017 at 12:37:07PM +0100, René Scharfe wrote:

> Add the email-style subject prefix (e.g. "Subject: [PATCH] ") directly
> when it's needed instead of letting log_write_email_headers() prepare
> it in a static buffer in advance.  This simplifies storage ownership and
> code flow.

This looks much cleaner to me.

I suspect we can do the same thing with the mime headers below there.
They end up in extra_headers, which is shown via after_subject. But we
could mostly replace after_subject with a call to fmt_output_email_mime()
or similar.

The only other use of extra_headers seems to be the static to/cc fields
one can add via format-patch. But those _should_ be treated differently,
as they can be allocated once in the rev_info, not per-commit. Which I
think shows off another bug. If you have a large to/cc list, that all
gets lumped into the same 1024-byte buffer, and may cause truncation.

I think the diffopt.stat_sep thing could get similar handling, too. It
appears to be set only in this one spot, and gets looked at in exactly
one. That could be replaced with an on-the-fly function call.

> This slows down the last three tests in p4000 by ca. 3% for some reason,
> so we may want to only do the first part for now, which is performance
> neutral on my machine.

It sounds like the bitfield was the cause, so that should be an easy
fix. The other question is whether it makes "--format=email" any slower.
It shouldn't, as your new approach doesn't do any extra per-commit
allocations (and in fact, it avoids some useless buffer-copying).

I couldn't measure any difference.

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] pretty: use fmt_output_email_subject()
  2017-03-01 18:08     ` Junio C Hamano
@ 2017-03-01 19:32       ` Jeff King
  2017-03-01 19:38         ` Junio C Hamano
  0 siblings, 1 reply; 13+ messages in thread
From: Jeff King @ 2017-03-01 19:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: René Scharfe, Adrian Dudau, git@vger.kernel.org

On Wed, Mar 01, 2017 at 10:08:10AM -0800, Junio C Hamano wrote:

> >  	strbuf_grow(sb, title.len + 1024);
> > -	if (pp->subject) {
> > -		strbuf_addstr(sb, pp->subject);
> > +	if (pp->print_email_subject) {
> > +		if (pp->rev)
> > +			fmt_output_email_subject(sb, pp->rev);
> 
> A hidden assumption this code makes is that anybody who does not
> want .rev (aka "doing it as part of format-patch that may want
> nr/total etc") does not want _any_ "Subject: ".  It obviously holds
> true in today's code (the one in shortlog-add-commit is the only one
> and it sets an empty string to .subject).
> 
> Does the loss of flexibility to the future callers matter, though?
> I cannot tell offhand.
> 
> Thanks.  Let's see what others think.

I would think that future callers would just need to provide a dummy
pp->rev. I guess that logic could be pushed down into
fmt_output_email_subject(), so that it skips looking at
opt->subject_prefix, etc, when "opt" is NULL, and just hits the
"Subject:" case arm.

I don't think it's a big deal, but it would be easy to fix now, like:

diff --git a/log-tree.c b/log-tree.c
index 4618dd04c..c73df6857 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -334,13 +334,13 @@ void fmt_output_commit(struct strbuf *filename,
 
 void fmt_output_email_subject(struct strbuf *sb, struct rev_info *opt)
 {
-	if (opt->total > 0) {
+	if (opt && opt->total > 0) {
 		strbuf_addf(sb, "Subject: [%s%s%0*d/%d] ",
 			    opt->subject_prefix,
 			    *opt->subject_prefix ? " " : "",
 			    digits_in_number(opt->total),
 			    opt->nr, opt->total);
-	} else if (opt->total == 0 && opt->subject_prefix && *opt->subject_prefix) {
+	} else if (opt && opt->total == 0 && opt->subject_prefix && *opt->subject_prefix) {
 		strbuf_addf(sb, "Subject: [%s] ",
 			    opt->subject_prefix);
 	} else {
diff --git a/pretty.c b/pretty.c
index d0f86f5d8..6b321c68c 100644
--- a/pretty.c
+++ b/pretty.c
@@ -1608,8 +1608,7 @@ void pp_title_line(struct pretty_print_context *pp,
 
 	strbuf_grow(sb, title.len + 1024);
 	if (pp->print_email_subject) {
-		if (pp->rev)
-			fmt_output_email_subject(sb, pp->rev);
+		fmt_output_email_subject(sb, pp->rev);
 		if (needs_rfc2047_encoding(title.buf, title.len, RFC2047_SUBJECT))
 			add_rfc2047(sb, title.buf, title.len,
 						encoding, RFC2047_SUBJECT);

-Peff

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] pretty: use fmt_output_email_subject()
  2017-03-01 19:32       ` Jeff King
@ 2017-03-01 19:38         ` Junio C Hamano
  2017-03-01 19:43           ` Jeff King
  0 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2017-03-01 19:38 UTC (permalink / raw)
  To: Jeff King; +Cc: René Scharfe, Adrian Dudau, git@vger.kernel.org

On Wed, Mar 1, 2017 at 11:32 AM, Jeff King <peff@peff.net> wrote:
> I would think that future callers would just need to provide a dummy
> pp->rev. I guess that logic could be pushed down into
> fmt_output_email_subject(), so that it skips looking at
> opt->subject_prefix, etc, when "opt" is NULL, and just hits the
> "Subject:" case arm.

The "flexibility" I was wondering about is that the current .subject can
point at any caller-supplied string, not "Subject:".

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] pretty: use fmt_output_email_subject()
  2017-03-01 19:38         ` Junio C Hamano
@ 2017-03-01 19:43           ` Jeff King
  0 siblings, 0 replies; 13+ messages in thread
From: Jeff King @ 2017-03-01 19:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: René Scharfe, Adrian Dudau, git@vger.kernel.org

On Wed, Mar 01, 2017 at 11:38:57AM -0800, Junio C Hamano wrote:

> On Wed, Mar 1, 2017 at 11:32 AM, Jeff King <peff@peff.net> wrote:
> > I would think that future callers would just need to provide a dummy
> > pp->rev. I guess that logic could be pushed down into
> > fmt_output_email_subject(), so that it skips looking at
> > opt->subject_prefix, etc, when "opt" is NULL, and just hits the
> > "Subject:" case arm.
> 
> The "flexibility" I was wondering about is that the current .subject can
> point at any caller-supplied string, not "Subject:".

Ah, I see. I don't think that is a huge loss, as nobody was using it.
And "Subject:" is already hard-coded in the nr/total counting bits,
which are what you'd want to reuse. I think it is fine to punt to the
future. If somebody really wants it later, the right fix is for them to
provide a string that fmt_output_email_subject() would use in place of
"Subject:" when it adds to the strbuf.

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2017-03-02  1:10 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-28 15:59 format-patch subject-prefix gets truncated when using the --numbered flag Adrian Dudau
2017-02-28 17:42 ` Junio C Hamano
2017-02-28 19:33   ` Junio C Hamano
2017-02-28 22:44     ` Jeff King
2017-02-28 18:17 ` Jeff King
2017-03-01 11:36   ` [PATCH 1/2] log-tree: factor out fmt_output_email_subject() René Scharfe
2017-03-01 11:37   ` [PATCH 2/2] pretty: use fmt_output_email_subject() René Scharfe
2017-03-01 15:41     ` René Scharfe
2017-03-01 18:08     ` Junio C Hamano
2017-03-01 19:32       ` Jeff King
2017-03-01 19:38         ` Junio C Hamano
2017-03-01 19:43           ` Jeff King
2017-03-01 19:22     ` Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).