git@vger.kernel.org mailing list mirror (one of many)
From: "René Scharfe" <l.s.r@web.de>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Matheus Tavares" <matheus.bernardino@usp.br>
Cc: git@vger.kernel.org, gitster@pobox.com
Subject: Re: [PATCH] format-patch: warn if commit msg contains a patch delimiter
Date: Mon, 5 Sep 2022 12:57:45 +0200
Message-ID: <904b784d-a328-011f-c71a-c2092534e0f7@web.de>
In-Reply-To: <220905.864jxmme0a.gmgdl@evledraar.gmail.com>

Am 05.09.22 um 10:01 schrieb Ævar Arnfjörð Bjarmason:
>
> On Sun, Sep 04 2022, Matheus Tavares wrote:
>
>> When applying a patch, `git am` looks for special delimiter strings
>> (such as "---") to know where the message ends and the actual diff
>> starts. If one of these strings appears in the commit message itself,
>> `am` might get confused and fail to apply the patch properly. This has
>> already caused inconveniences in the past [1][2]. To help avoid such
>> problems, let's make `git format-patch` warn on commit messages
>> containing one of these strings.
>>
>> [1]: https://lore.kernel.org/git/20210113085846-mutt-send-email-mst@kernel.org/
>> [2]: https://lore.kernel.org/git/16297305.cDA1TJNmNo@earendil/
>
> I followed this topic with one eye, and have run into this myself in the
> past. I'm not against this warning, but I wonder if we can't fix
> "am/apply" to just be smarter. The cases I've seen are all ones where:
>
>  * We have a copy/pasted git diff, but we could disambiguate based on
>    (at least) the "---" line being a telltale for the "real" patch, and
>    the "X file changed..." diffstat.
>  * We have a not-quite-git-looking patch diff in the commit message
>    (which we'd normally detect and apply), as in your [2].
>
> Couldn't we just be a bit smarter about applying these, and do a
> look-ahead to find what the user meant?

Whatever we use to separate the message from the diff can be included in
that message by an unsuspecting user, and "---" can also be part of a
diff.  An earlier discussion yielded an idea, but no implementation:
https://lore.kernel.org/git/20200204010524-mutt-send-email-mst@kernel.org/
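
To illustrate the ambiguity, here is a minimal standalone sketch (not
git's actual code; has_prefix(), looks_like_separator() and
message_end() are hypothetical names, and only a few of the delimiters
are checked).  It scans the lines of a mail body and reports where the
message would be cut off:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: does the NUL-terminated line start with prefix? */
int has_prefix(const char *line, const char *prefix)
{
	return strncmp(line, prefix, strlen(prefix)) == 0;
}

/*
 * Simplified stand-in for a separator check.  Assumption: lines carry
 * no trailing newline and only three kinds of delimiter are recognized.
 */
int looks_like_separator(const char *line)
{
	return !strcmp(line, "---") ||
	       has_prefix(line, "diff -") ||
	       has_prefix(line, "Index: ");
}

/*
 * Return the index of the first line that would end the message,
 * or nlines if no line looks like a separator.
 */
size_t message_end(const char *const *lines, size_t nlines)
{
	size_t i;

	for (i = 0; i < nlines; i++)
		if (looks_like_separator(lines[i]))
			break;
	return i;
}
```

If the author put a bare "---" line into the message on purpose,
message_end() stops at that line, so everything after it is no longer
treated as part of the message.  The scan has no way to tell that line
apart from the real separator further down.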

> In any case, having such a warning won't "settle" this issue, as we're
> able to deal with this without ambiguity in commit objects and the
> push/fetch protocol. It's just "format-patch/am" as a "wire protocol"
> that has this issue.
>
> But anyway, that's the state of the world now, so warning() about it is
> fair. Even if we had a fix for the "apply" part, we might want to warn
> for a while to note that it's an issue on older gits.
>
>> +		if (pp->check_in_body_patch_breaks) {
>> +			strbuf_reset(&linebuf);
>> +			strbuf_add(&linebuf, line, linelen);
>> +			if (patchbreak(&linebuf) || is_scissors_line(linebuf.buf)) {
>> +				strbuf_strip_suffix(&linebuf, "\n");
>
> Hrm, it's a (small) shame that the patchbreak() function takes a "struct
> strbuf" rather than a char */size_t in this case (seemingly for no good
> reason, as it's "const"?).

A strbuf is NUL-terminated; a length-limited string (char */size_t)
doesn't have to be.  That means the current implementation can use
functions like starts_with(), but a faithful version that promises to
stay within a given length cannot.  So the reason is probably
convenience.  With skip_prefix_mem() it wouldn't be that bad, though:

---
 mailinfo.c | 37 +++++++++++++++++++------------------
 1 file changed, 19 insertions(+), 18 deletions(-)

diff --git a/mailinfo.c b/mailinfo.c
index 9621ba62a3..ae2e70e363 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -646,32 +646,30 @@ static void decode_transfer_encoding(struct mailinfo *mi, struct strbuf *line)
 	free(ret);
 }

-static inline int patchbreak(const struct strbuf *line)
+static int patchbreak(const char *buf, size_t len)
 {
-	size_t i;
-
 	/* Beginning of a "diff -" header? */
-	if (starts_with(line->buf, "diff -"))
+	if (skip_prefix_mem(buf, len, "diff -", &buf, &len))
 		return 1;

 	/* CVS "Index: " line? */
-	if (starts_with(line->buf, "Index: "))
+	if (skip_prefix_mem(buf, len, "Index: ", &buf, &len))
 		return 1;

 	/*
 	 * "--- <filename>" starts patches without headers
 	 * "---<sp>*" is a manual separator
 	 */
-	if (line->len < 4)
+	if (len < 4)
 		return 0;

-	if (starts_with(line->buf, "---")) {
+	if (skip_prefix_mem(buf, len, "---", &buf, &len)) {
 		/* space followed by a filename? */
-		if (line->buf[3] == ' ' && !isspace(line->buf[4]))
+		if (len > 1 && buf[0] == ' ' && !isspace(buf[1]))
 			return 1;
 		/* Just whitespace? */
-		for (i = 3; i < line->len; i++) {
-			unsigned char c = line->buf[i];
+		for (; len; buf++, len--) {
+			unsigned char c = buf[0];
 			if (c == '\n')
 				return 1;
 			if (!isspace(c))
@@ -682,14 +680,14 @@ static inline int patchbreak(const struct strbuf *line)
 	return 0;
 }

-static int is_scissors_line(const char *line)
+static int is_scissors_line(const char *line, size_t len)
 {
 	const char *c;
 	int scissors = 0, gap = 0;
 	const char *first_nonblank = NULL, *last_nonblank = NULL;
 	int visible, perforation = 0, in_perforation = 0;

-	for (c = line; *c; c++) {
+	for (c = line; len; c++, len--) {
 		if (isspace(*c)) {
 			if (in_perforation) {
 				perforation++;
@@ -705,12 +703,14 @@ static int is_scissors_line(const char *line)
 			perforation++;
 			continue;
 		}
-		if (starts_with(c, ">8") || starts_with(c, "8<") ||
-		    starts_with(c, ">%") || starts_with(c, "%<")) {
+		if (skip_prefix_mem(c, len, ">8", &c, &len) ||
+		    skip_prefix_mem(c, len, "8<", &c, &len) ||
+		    skip_prefix_mem(c, len, ">%", &c, &len) ||
+		    skip_prefix_mem(c, len, "%<", &c, &len)) {
 			in_perforation = 1;
 			perforation += 2;
 			scissors += 2;
-			c++;
+			c--, len++;
 			continue;
 		}
 		in_perforation = 0;
@@ -747,7 +747,8 @@ static int check_inbody_header(struct mailinfo *mi, const struct strbuf *line)
 {
 	if (mi->inbody_header_accum.len &&
 	    (line->buf[0] == ' ' || line->buf[0] == '\t')) {
-		if (mi->use_scissors && is_scissors_line(line->buf)) {
+		if (mi->use_scissors &&
+		    is_scissors_line(line->buf, line->len)) {
 			/*
 			 * This is a scissors line; do not consider this line
 			 * as a header continuation line.
@@ -808,7 +809,7 @@ static int handle_commit_msg(struct mailinfo *mi, struct strbuf *line)
 	if (convert_to_utf8(mi, line, mi->charset.buf))
 		return 0; /* mi->input_error already set */

-	if (mi->use_scissors && is_scissors_line(line->buf)) {
+	if (mi->use_scissors && is_scissors_line(line->buf, line->len)) {
 		int i;

 		strbuf_setlen(&mi->log_message, 0);
@@ -826,7 +827,7 @@ static int handle_commit_msg(struct mailinfo *mi, struct strbuf *line)
 		return 0;
 	}

-	if (patchbreak(line)) {
+	if (patchbreak(line->buf, line->len)) {
 		if (mi->message_id)
 			strbuf_addf(&mi->log_message,
 				    "Message-Id: %s\n", mi->message_id);
--
2.37.2
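
The conversion above depends on a prefix check that never reads past
the given length.  As a rough standalone sketch of that idea (not
copied from git; prefix_mem() and is_dash_filename() are hypothetical
names used only for illustration):

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Length-limited prefix check.  On a match, store the remainder in
 * *out / *outlen and return 1; otherwise leave both untouched and
 * return 0.  Reads at most len bytes of buf.
 */
int prefix_mem(const char *buf, size_t len, const char *prefix,
	       const char **out, size_t *outlen)
{
	size_t plen = strlen(prefix);

	if (plen > len || memcmp(buf, prefix, plen))
		return 0;
	*out = buf + plen;
	*outlen = len - plen;
	return 1;
}

/* Do the first len bytes of buf look like "--- <filename>"? */
int is_dash_filename(const char *buf, size_t len)
{
	return prefix_mem(buf, len, "---", &buf, &len) &&
	       len > 1 && buf[0] == ' ' &&
	       !isspace((unsigned char)buf[1]);
}
```

Because the remaining length travels along with the pointer, the
"len > 1" test can replace the implicit reliance on a terminating NUL,
and the function also works on a slice of a larger buffer.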

