git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Phillip Wood <phillip.wood@talktalk.net>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Phillip Wood <phillip.wood@dunelm.org.uk>
Cc: Git Mailing List <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>,
	Akinori MUSHA <knu@iDaemons.org>
Subject: Re: [RFC PATCH] sequencer: fix quoting in write_author_script
Date: Fri, 27 Jul 2018 11:36:51 +0100	[thread overview]
Message-ID: <dae25f9d-5fde-a9c5-2178-9e06f3150885@talktalk.net> (raw)
In-Reply-To: <nycvar.QRO.7.76.6.1807261332130.71@tvgsbejvaqbjf.bet>

Hi Johannes
On 26/07/18 13:33, Johannes Schindelin wrote:
> Hi Phillip,
> 
> On Wed, 18 Jul 2018, Phillip Wood wrote:
> 
>> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>>
>> Single quotes should be escaped as \' not \\'. Note that this only
>> affects authors that contain a single quote and then only external
>> scripts that read the author script and users whose git is upgraded from
>> the shell version of rebase -i while rebase was stopped. This is because
>> the parsing in read_env_script() expected the broken version and for
>> some reason sq_dequote() called by read_author_ident() seems to handle
>> the broken quoting correctly.
>>
>> Ideally write_author_script() would be rewritten to use
>> split_ident_line() and sq_quote_buf() but this commit just fixes the
>> immediate bug.
>>
>> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
>> ---
> 
> Good catch.
> 
>> This is untested, unfortuantely I don't have really have time to write a test or
>> follow this up at the moment, if someone else want to run with it then please
>> do.
> 
> I modified the test that was added by Akinori. As it was added very early,
> and as there is still a test case *after* Akinori's that compares a
> hard-coded SHA-1, I refrained from using `test_commit` (which would change
> that SHA-1). See below.

Thanks for adding a test, that sounds like sensible approach, however
having thought about it I wonder if we should just be writing a plain
text file (e.g rebase-merge/author-data) and fixing the reader to read
that if it exists and only then fall back to reading the legacy
rebase-merge/author-script with a fix to correctly handle the script
written by the shell version - what do you think? The author-script
really should be just an implementation detail. If anyone really wants
to read it they can still do 'read -r l' and split the lines with
${l%%=*} and ${l#*=}

>> diff --git a/sequencer.c b/sequencer.c
>> index 5354d4d51e..0b78d1f100 100644
>> --- a/sequencer.c
>> +++ b/sequencer.c
>> @@ -638,21 +638,21 @@ static int write_author_script(const char *message)
>>  		else if (*message != '\'')
>>  			strbuf_addch(&buf, *(message++));
>>  		else
>> -			strbuf_addf(&buf, "'\\\\%c'", *(message++));
>> +			strbuf_addf(&buf, "'\\%c'", *(message++));
>>  	strbuf_addstr(&buf, "'\nGIT_AUTHOR_EMAIL='");
>>  	while (*message && *message != '\n' && *message != '\r')
>>  		if (skip_prefix(message, "> ", &message))
>>  			break;
>>  		else if (*message != '\'')
>>  			strbuf_addch(&buf, *(message++));
>>  		else
>> -			strbuf_addf(&buf, "'\\\\%c'", *(message++));
>> +			strbuf_addf(&buf, "'\\%c'", *(message++));
>>  	strbuf_addstr(&buf, "'\nGIT_AUTHOR_DATE='@");
>>  	while (*message && *message != '\n' && *message != '\r')
>>  		if (*message != '\'')
>>  			strbuf_addch(&buf, *(message++));
>>  		else
>> -			strbuf_addf(&buf, "'\\\\%c'", *(message++));
>> +			strbuf_addf(&buf, "'\\%c'", *(message++));
>>  	res = write_message(buf.buf, buf.len, rebase_path_author_script(), 1);
> 
> I resolved the merge conflict with Akinori's patch. FWIW I pushed all of
> this, including the fixup to Junio's fixup to the
> `fix-t3404-author-script-test` branch at https://github.com/dscho/git.
> 
>>  	strbuf_release(&buf);
>>  	return res;
>> @@ -666,13 +666,21 @@ static int read_env_script(struct argv_array *env)
>>  {
>>  	struct strbuf script = STRBUF_INIT;
>>  	int i, count = 0;
>> -	char *p, *p2;
>> +	const char *p2;
>> +	char *p;
>>  
>>  	if (strbuf_read_file(&script, rebase_path_author_script(), 256) <= 0)
>>  		return -1;
>>  
>>  	for (p = script.buf; *p; p++)
>> -		if (skip_prefix(p, "'\\\\''", (const char **)&p2))
>> +		/*
>> +		 * write_author_script() used to escape "'" incorrectly as
>> +		 * "'\\\\''" rather than "'\\''" so we check for the correct
>> +		 * version the incorrect version in case git was upgraded while
>> +		 * rebase was stopped.
>> +		 */
>> +		if (skip_prefix(p, "'\\''", &p2) ||
>> +		    skip_prefix(p, "'\\\\''", &p2))
> 
> I think in this form, it is possibly unsafe because it assumes that the
> new code cannot generate output that would trigger that same code path.
> Although I have to admit that I did not give this a great deal of thought.

Hm, I not sure that it can. If the Author begins \\' then this will be
written as the C string "'\\\\'\\''...". If \\' comes at the end then I
think this will be written as "\\\\'\\'''", in the middle of the name it
will be "\\\\'\\''..."

> In any case, if you have to think long and hard about some fix, it might
> be better to go with something that is easier to reason about. So how
> about this: we already know that the code is buggy, Akinori fixed the bug,
> where the author-script missed its trailing single-quote. We can use this
> as a tell-tale for *this* bug. Assuming that Junio will advance both your
> and Akinori's fix in close proximity.

That sounds like a good approach
> Again, this is pushed to the `fix-t3404-author-script-test` branch at
> https://github.com/dscho/git; My fixup on top of your patch looks like
> this (feel free to drop the sq_bug part and only keep the test part):
> 
> -- snipsnap --
> diff --git a/sequencer.c b/sequencer.c
> index 46c0b3e720f..7abe78dc78e 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -573,13 +573,14 @@ static int write_author_script(const char *message)
>  static int read_env_script(struct argv_array *env)
>  {
>  	struct strbuf script = STRBUF_INIT;
> -	int i, count = 0;
> +	int i, count = 0, sq_bug;
>  	const char *p2;
>  	char *p;
>  
>  	if (strbuf_read_file(&script, rebase_path_author_script(), 256) <= 0)
>  		return -1;
>  
> +	sq_bug = script.len && script.buf[script.len - 1] != '\'';
>  	for (p = script.buf; *p; p++)
>  		/*
>  		 * write_author_script() used to escape "'" incorrectly as
> @@ -587,8 +588,9 @@ static int read_env_script(struct argv_array *env)
>  		 * version the incorrect version in case git was upgraded while
>  		 * rebase was stopped.
>  		 */

We probably want the change the comment slightly to explain sq_bug

> -		if (skip_prefix(p, "'\\''", &p2) ||
> -		    skip_prefix(p, "'\\\\''", &p2))
> +		if (sq_bug && skip_prefix(p, "'\\\\''", &p2))
> +			strbuf_splice(&script, p - script.buf, p2 - p, "'", 1);
> +		else if (skip_prefix(p, "'\\''", &p2))
>  			strbuf_splice(&script, p - script.buf, p2 - p, "'", 1);
>  		else if (*p == '\'')
>  			strbuf_splice(&script, p-- - script.buf, 1, "", 0);
> diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
> index 97f0b4bf881..dd726ff4dc4 100755
> --- a/t/t3404-rebase-interactive.sh
> +++ b/t/t3404-rebase-interactive.sh
> @@ -75,16 +75,18 @@ test_expect_success 'rebase --keep-empty' '
>  	test_line_count = 6 actual
>  '
>  
> +SQ="'"
>  test_expect_success 'rebase -i writes correct author-script' '
>  	test_when_finished "test_might_fail git rebase --abort" &&
> -	git checkout master &&
> +	git checkout -b author-with-sq master &&
> +	GIT_AUTHOR_NAME="Auth O$SQ R" git commit --allow-empty -m with-sq &&
>  	set_fake_editor &&
> -	FAKE_LINES="edit 1" git rebase -i HEAD^ &&
> +	FAKE_LINES="edit 1" git rebase -ki HEAD^ &&
>  	(
>  		sane_unset GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_AUTHOR_DATE &&
>  		. .git/rebase-merge/author-script &&
>  		test "$(git show -s --date=raw --format=%an,%ae,@%ad)" = \
> -			"$GIT_AUTHOR_NAME,$GIT_AUTHOR_EMAIL,$GIT_AUTHOR_DATE"
> +			"Auth O$SQ R,$GIT_AUTHOR_EMAIL,$GIT_AUTHOR_DATE"
>  	)
>  '

Testing all three variables together is nice.

Best Wishes

Phillip

> @@ -1347,7 +1349,6 @@ test_expect_success 'editor saves as CR/LF' '
>  	)
>  '
>  
> -SQ="'"
>  test_expect_success 'rebase -i --gpg-sign=<key-id>' '
>  	set_fake_editor &&
>  	FAKE_LINES="edit 1" git rebase -i --gpg-sign="\"S I Gner\"" HEAD^ \
> 


  reply	other threads:[~2018-07-27 10:36 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-12 11:18 [PATCH] sequencer.c: terminate the last line of author-script properly Akinori MUSHA
2018-07-12 17:22 ` Junio C Hamano
2018-07-18  9:45   ` Phillip Wood
2018-07-18 13:46     ` [PATCH] sequencer.c: terminate the last line of author-scriptproperly Phillip Wood
2018-07-18 15:55       ` [RFC PATCH] sequencer: fix quoting in write_author_script Phillip Wood
2018-07-24 15:31         ` Junio C Hamano
2018-07-26 12:33         ` Johannes Schindelin
2018-07-27 10:36           ` Phillip Wood [this message]
2018-07-27 12:37             ` Johannes Schindelin
2018-07-30  9:35               ` Phillip Wood
2018-07-18 17:24       ` [PATCH] sequencer.c: terminate the last line of author-scriptproperly Junio C Hamano
2018-07-18 17:17     ` [PATCH] sequencer.c: terminate the last line of author-script properly Junio C Hamano
2018-07-19  9:20       ` Phillip Wood
2018-07-26 12:39         ` Johannes Schindelin
2018-07-26 17:53           ` Junio C Hamano
2018-07-12 20:13 ` Junio C Hamano
2018-07-12 20:16   ` Eric Sunshine
2018-07-12 20:23     ` Junio C Hamano
2018-07-17 23:25     ` Junio C Hamano
2018-07-18  6:23       ` Akinori MUSHA
2018-07-26 12:07       ` Johannes Schindelin
2018-07-26 17:44         ` Junio C Hamano
2018-07-27 15:49           ` Johannes Schindelin
2018-07-12 20:49   ` Junio C Hamano
2018-07-18  9:25 ` Phillip Wood
2018-07-18 13:50 ` Phillip Wood
2018-07-18 13:58   ` [PATCH] sequencer.c: terminate the last line of author-scriptproperly Phillip Wood

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dae25f9d-5fde-a9c5-2178-9e06f3150885@talktalk.net \
    --to=phillip.wood@talktalk.net \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=knu@iDaemons.org \
    --cc=phillip.wood@dunelm.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).