git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Siddharth Asthana <siddharthasthana31@gmail.com>
Cc: git@vger.kernel.org, phillip.wood123@gmail.com,
	congdanhqx@gmail.com, christian.couder@gmail.com,
	avarab@gmail.com, gitster@pobox.com, johncai86@gmail.com
Subject: Re: [PATCH v3 1/4] revision: improve commit_rewrite_person()
Date: Tue, 12 Jul 2022 18:29:32 +0200 (CEST)	[thread overview]
Message-ID: <98nq3r16-0p93-21p5-0pn6-r36841320903@tzk.qr> (raw)
In-Reply-To: <20220709154149.165524-2-siddharthasthana31@gmail.com>

Hi Siddarth,

On Sat, 9 Jul 2022, Siddharth Asthana wrote:

> diff --git a/revision.c b/revision.c
> index 211352795c..1939c56c67 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -3792,14 +3791,42 @@ static int commit_rewrite_person(struct strbuf *buf, const char *what, struct st
>  			      ident.mail_end - ident.name_begin + 1,
>  			      namemail.buf, namemail.len);
>
> +		newlen = namemail.len;
> +
>  		strbuf_release(&namemail);
>
> -		return 1;
> +		return newlen - (ident.mail_end - ident.name_begin + 1);
>  	}
>
>  	return 0;
>  }
>
> +static void commit_rewrite_person(struct strbuf *buf, const char **headers, struct string_list *mailmap)
> +{
> +	size_t buf_offset = 0;
> +
> +	if (!mailmap)
> +		return;
> +
> +	for (;;) {
> +		const char *person, *line;
> +		size_t i, linelen;
> +
> +		line = buf->buf + buf_offset;
> +		linelen = strchrnul(line, '\n') - line + 1;
> +
> +		if (linelen <= 1)
> +			/* End of header */
> +			return;

This conditional would probably read much better if it was moved up a few
lines and rewritten like this:

	if (!*line || *line == '\n')
		return; /* End of headers */

or even turning the `for (;;)` into

	while (buf->buf[buf_offset] && buf->buf[buf_offset] != '\n')

> +
> +		buf_offset += linelen;

I would prefer to avoid having `linelen` altogether, and instead move the
`buf_offset` assignment _after_ the loop that handles the current header
line (and _not_ modify `buf_offset` inside):

	buf_offset = strchrnul(buf->buf + buf_offset, '\n');
	if (buf->buf[buf_offset] == '\n')
		buf_offset++;

> +
> +		for (i = 0; headers[i]; i++)
> +			if (skip_prefix(line, headers[i], &person))
> +				buf_offset += rewrite_ident_line(person, buf, mailmap);

At this point, we have handled the header and should _not_ continue the
(inner) `for` loop. This is important because `line` is potentially
invalidated by that `rewrite_ident_line()` call. See below for a patch
(which is on top of `shears/seen`, but you get the idea.

This issue could also be avoided by consistently using `buf->buf +
buf_offset` instead of `line`.

> +	}
> +}
> +
>  static int commit_match(struct commit *commit, struct rev_info *opt)
>  {
>  	int retval;

-- snipsnap --
From 61dd169def195eee9827a9a670f8dab606279cea Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Tue, 12 Jul 2022 15:10:35 +0200
Subject: [PATCH] fixup??? revision: improve commit_rewrite_person()

When the `linux-musl` job failed in t4203.44 with a segmentation fault,
I became suspicious. From
https://github.com/git-for-windows/git/runs/7301741954?check_suite_focus=true#step:5:1775:

  + test_config mailmap.file complex.map
  + config_dir=
  + test mailmap.file '=' -C
  + test_when_finished 'test_unconfig  '"'"'mailmap.file'"'"
  + test 0 '=' 0
  + test_cleanup='{ test_unconfig  '"'"'mailmap.file'"'"'
  		} && (exit "$eval_ret"); eval_ret=$?; :'
  + git config mailmap.file complex.map
  + git log --use-mailmap --author '<cto@coompany.xx>'
  Segmentation fault (core dumped)
  error: last command exited with $?=139

I suspected a memory corruption, and my go-to tool to investigate those
is `valgrind`, so I ran `t4203-*.sh -ivx --valgrind-only=44`. It
reported the following:

-- snip --
[...]
expecting success of 4203.44 'Only grep replaced author with --use-mailmap':
	test_config mailmap.file complex.map &&
	git log --use-mailmap --author "<cto@coompany.xx>" >actual &&
	test_must_be_empty actual

+ test_config mailmap.file complex.map
+ config_dir=
+ test mailmap.file = -C
+ test_when_finished test_unconfig  'mailmap.file'
+ test 0 = 0
+ test_cleanup={ test_unconfig  'mailmap.file'
		} && (exit "$eval_ret"); eval_ret=$?; :
+ git config mailmap.file complex.map
+ git log --use-mailmap --author <cto@coompany.xx>
==14374== Invalid read of size 1
==14374==    at 0x2EE384: skip_prefix (git-compat-util.h:676)
==14374==    by 0x2EF31D: apply_mailmap_to_header (ident.c:417)
==14374==    by 0x3BB045: commit_match (revision.c:3831)
==14374==    by 0x3BB389: get_commit_action (revision.c:3917)
==14374==    by 0x3BB934: simplify_commit (revision.c:4005)
==14374==    by 0x3BBCAD: get_revision_1 (revision.c:4083)
==14374==    by 0x3BBEF0: get_revision_internal (revision.c:4184)
==14374==    by 0x3BC192: get_revision (revision.c:4262)
==14374==    by 0x1A0B05: cmd_log_walk_no_free (log.c:454)
==14374==    by 0x1A0BCD: cmd_log_walk (log.c:496)
==14374==    by 0x1A1E90: cmd_log (log.c:818)
==14374==    by 0x129A04: run_builtin (git.c:466)
==14374==  Address 0x4b76f4e is 94 bytes inside a block of size 210 free'd
==14374==    at 0x483DFAF: realloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==14374==    by 0x42F908: xrealloc (wrapper.c:136)
==14374==    by 0x3E6742: strbuf_grow (strbuf.c:99)
==14374==    by 0x3E6F1F: strbuf_splice (strbuf.c:242)
==14374==    by 0x2EF220: rewrite_ident_line (ident.c:382)
==14374==    by 0x2EF338: apply_mailmap_to_header (ident.c:418)
==14374==    by 0x3BB045: commit_match (revision.c:3831)
==14374==    by 0x3BB389: get_commit_action (revision.c:3917)
==14374==    by 0x3BB934: simplify_commit (revision.c:4005)
==14374==    by 0x3BBCAD: get_revision_1 (revision.c:4083)
==14374==    by 0x3BBEF0: get_revision_internal (revision.c:4184)
==14374==    by 0x3BC192: get_revision (revision.c:4262)
==14374==  Block was alloc'd at
==14374==    at 0x483B723: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==14374==    by 0x483E017: realloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==14374==    by 0x42F908: xrealloc (wrapper.c:136)
==14374==    by 0x3E6742: strbuf_grow (strbuf.c:99)
==14374==    by 0x3E733D: strbuf_add (strbuf.c:298)
==14374==    by 0x3AF420: strbuf_addstr (strbuf.h:305)
==14374==    by 0x3BB027: commit_match (revision.c:3829)
==14374==    by 0x3BB389: get_commit_action (revision.c:3917)
==14374==    by 0x3BB934: simplify_commit (revision.c:4005)
==14374==    by 0x3BBCAD: get_revision_1 (revision.c:4083)
==14374==    by 0x3BBEF0: get_revision_internal (revision.c:4184)
==14374==    by 0x3BC192: get_revision (revision.c:4262)
==14374==
{
   <insert_a_suppression_name_here>
   Memcheck:Addr1
   fun:skip_prefix
   fun:apply_mailmap_to_header
   fun:commit_match
   fun:get_commit_action
   fun:simplify_commit
   fun:get_revision_1
   fun:get_revision_internal
   fun:get_revision
   fun:cmd_log_walk_no_free
   fun:cmd_log_walk
   fun:cmd_log
   fun:run_builtin
}
error: last command exited with $?=126
not ok 44 - Only grep replaced author with --use-mailmap
1..44
-- snap --

And indeed, we see that the `rewrite_ident_line()` function grows the
strbuf, which can (and does, in this instance) move the buffer to a new
address, which invalidates the `line` pointer, which still points at the
old address.

It might actually make sense to rewrite the entire part of the original
patch where it looks for the end of the header line, so that it avoids
working on pointers altogether, and uses offsets instead.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 ident.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/ident.c b/ident.c
index 5f17bd607dd..fbcf7250aab 100644
--- a/ident.c
+++ b/ident.c
@@ -414,8 +414,10 @@ void apply_mailmap_to_header(struct strbuf *buf, const char **headers, struct st
 		buf_offset += linelen;

 		for (i = 0; headers[i]; i++)
-			if (skip_prefix(line, headers[i], &person))
+			if (skip_prefix(line, headers[i], &person)) {
 				buf_offset += rewrite_ident_line(person, buf, mailmap);
+				break;
+			}
 	}
 }

--
2.37.0.rc2.windows.1.7.g45a475aeb84


  reply	other threads:[~2022-07-12 16:29 UTC|newest]

Thread overview: 68+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-30 14:24 [PATCH 0/3] Add support for mailmap in cat-file Siddharth Asthana
2022-06-30 14:24 ` [PATCH 1/3] ident: move commit_rewrite_person() to ident.c Siddharth Asthana
2022-06-30 16:00   ` Đoàn Trần Công Danh
2022-06-30 23:22   ` Junio C Hamano
2022-06-30 14:24 ` [PATCH 2/3] ident: rename commit_rewrite_person() to rewrite_ident_line() Siddharth Asthana
2022-06-30 15:33   ` Phillip Wood
2022-06-30 16:55     ` Christian Couder
2022-06-30 23:31   ` Junio C Hamano
2022-06-30 14:24 ` [PATCH 3/3] cat-file: add mailmap support Siddharth Asthana
2022-06-30 15:50   ` Phillip Wood
2022-06-30 16:36     ` Phillip Wood
2022-06-30 17:07     ` Christian Couder
2022-06-30 21:33       ` Junio C Hamano
2022-07-07  9:15         ` Christian Couder
2022-06-30 23:36   ` Ævar Arnfjörð Bjarmason
2022-06-30 23:53     ` Junio C Hamano
2022-07-07  9:02     ` Christian Couder
2022-06-30 23:41   ` Junio C Hamano
2022-06-30 21:18 ` [PATCH 0/3] Add support for mailmap in cat-file Junio C Hamano
2022-07-07 16:15 ` [PATCH v2 0/4] " Siddharth Asthana
2022-07-07 16:15   ` [PATCH v2 1/4] revision: improve commit_rewrite_person() Siddharth Asthana
2022-07-07 21:52     ` Junio C Hamano
2022-07-08 14:50     ` Đoàn Trần Công Danh
     [not found]       ` <CAP8UFD116xMnp27pxW8WNDf6PRJxnnwWtcy2TNHU_KyV2ZVA1g@mail.gmail.com>
2022-07-09  1:02         ` Đoàn Trần Công Danh
2022-07-09  5:04           ` Christian Couder
2022-07-07 16:15   ` [PATCH v2 2/4] ident: move commit_rewrite_person() to ident.c Siddharth Asthana
2022-07-07 16:15   ` [PATCH v2 3/4] ident: rename commit_rewrite_person() to apply_mailmap_to_header() Siddharth Asthana
2022-07-07 16:15   ` [PATCH v2 4/4] cat-file: add mailmap support Siddharth Asthana
2022-07-07 21:55     ` Junio C Hamano
2022-07-08 11:53     ` Johannes Schindelin
2022-07-07 22:06   ` [PATCH v2 0/4] Add support for mailmap in cat-file Junio C Hamano
2022-07-07 22:58     ` Junio C Hamano
2022-07-09 15:41   ` [PATCH v3 " Siddharth Asthana
2022-07-09 15:41     ` [PATCH v3 1/4] revision: improve commit_rewrite_person() Siddharth Asthana
2022-07-12 16:29       ` Johannes Schindelin [this message]
2022-07-09 15:41     ` [PATCH v3 2/4] ident: move commit_rewrite_person() to ident.c Siddharth Asthana
2022-07-09 15:41     ` [PATCH v3 3/4] ident: rename commit_rewrite_person() to apply_mailmap_to_header() Siddharth Asthana
2022-07-09 15:41     ` [PATCH v3 4/4] cat-file: add mailmap support Siddharth Asthana
2022-07-10  5:34     ` [PATCH v3 0/4] Add support for mailmap in cat-file Junio C Hamano
2022-07-12 12:34       ` Johannes Schindelin
2022-07-12 14:16         ` Junio C Hamano
2022-07-12 16:01           ` Siddharth Asthana
2022-07-12 16:06           ` Junio C Hamano
2022-07-12 16:06     ` [PATCH v4 " Siddharth Asthana
2022-07-12 16:06       ` [PATCH v4 1/4] revision: improve commit_rewrite_person() Siddharth Asthana
2022-07-13  1:25         ` Ævar Arnfjörð Bjarmason
2022-07-13 12:18           ` Christian Couder
2022-07-14 21:02         ` Junio C Hamano
2022-07-12 16:06       ` [PATCH v4 2/4] ident: move commit_rewrite_person() to ident.c Siddharth Asthana
2022-07-12 16:06       ` [PATCH v4 3/4] ident: rename commit_rewrite_person() to apply_mailmap_to_header() Siddharth Asthana
2022-07-13  1:25         ` Ævar Arnfjörð Bjarmason
2022-07-13 13:29           ` Christian Couder
2022-07-12 16:06       ` [PATCH v4 4/4] cat-file: add mailmap support Siddharth Asthana
2022-07-16  7:40       ` [PATCH v5 0/4] Add support for mailmap in cat-file Siddharth Asthana
2022-07-16  7:40         ` [PATCH v5 1/4] revision: improve commit_rewrite_person() Siddharth Asthana
2022-07-17 22:11           ` Junio C Hamano
2022-07-16  7:40         ` [PATCH v5 2/4] ident: move commit_rewrite_person() to ident.c Siddharth Asthana
2022-07-16  7:40         ` [PATCH v5 3/4] ident: rename commit_rewrite_person() to apply_mailmap_to_header() Siddharth Asthana
2022-07-16  7:40         ` [PATCH v5 4/4] cat-file: add mailmap support Siddharth Asthana
2022-07-18 19:50         ` [PATCH v6 0/4] Add support for mailmap in cat-file Siddharth Asthana
2022-07-18 19:50           ` [PATCH v6 1/4] revision: improve commit_rewrite_person() Siddharth Asthana
2022-07-18 19:51           ` [PATCH v6 2/4] ident: move commit_rewrite_person() to ident.c Siddharth Asthana
2022-07-18 19:51           ` [PATCH v6 3/4] ident: rename commit_rewrite_person() to apply_mailmap_to_header() Siddharth Asthana
2022-07-18 19:51           ` [PATCH v6 4/4] cat-file: add mailmap support Siddharth Asthana
2022-07-25 18:58           ` [PATCH v6 0/4] Add support for mailmap in cat-file Junio C Hamano
2022-07-28 19:07             ` Christian Couder
2022-07-28 19:32               ` Junio C Hamano
2022-07-30  7:50                 ` Siddharth Asthana

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=98nq3r16-0p93-21p5-0pn6-r36841320903@tzk.qr \
    --to=johannes.schindelin@gmx.de \
    --cc=avarab@gmail.com \
    --cc=christian.couder@gmail.com \
    --cc=congdanhqx@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johncai86@gmail.com \
    --cc=phillip.wood123@gmail.com \
    --cc=siddharthasthana31@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).