git@vger.kernel.org mailing list mirror (one of many)
From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Phillip Wood <phillip.wood@dunelm.org.uk>
Cc: Git Mailing List <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>,
	Akinori MUSHA <knu@iDaemons.org>
Subject: Re: [RFC PATCH] sequencer: fix quoting in write_author_script
Date: Fri, 27 Jul 2018 14:37:10 +0200 (DST)
Message-ID: <nycvar.QRO.7.76.6.1807271415410.10478@tvgsbejvaqbjf.bet>
In-Reply-To: <dae25f9d-5fde-a9c5-2178-9e06f3150885@talktalk.net>

Hi Phillip, Junio and Akinori,

I just noticed that t3404 is broken without my patches (but with Junio's
fixup), on Windows, macOS and Linux. (See log at the end.)

On Fri, 27 Jul 2018, Phillip Wood wrote:

> On 26/07/18 13:33, Johannes Schindelin wrote:
> > 
> > On Wed, 18 Jul 2018, Phillip Wood wrote:
> > 
> >> Single quotes should be escaped as \' not \\'. Note that this only
> >> affects authors that contain a single quote and then only external
> >> scripts that read the author script and users whose git is upgraded from
> >> the shell version of rebase -i while rebase was stopped. This is because
> >> the parsing in read_env_script() expected the broken version and for
> >> some reason sq_dequote() called by read_author_ident() seems to handle
> >> the broken quoting correctly.
> >>
> >> Ideally write_author_script() would be rewritten to use
> >> split_ident_line() and sq_quote_buf() but this commit just fixes the
> >> immediate bug.
> > 
> >> This is untested; unfortunately I don't really have time to write a test or
> >> follow this up at the moment. If someone else wants to run with it then please
> >> do.
> > 
> > I modified the test that was added by Akinori. As it was added very early,
> > and as there is still a test case *after* Akinori's that compares a
> > hard-coded SHA-1, I refrained from using `test_commit` (which would change
> > that SHA-1). See below.
> 
> Thanks for adding a test, that sounds like a sensible approach. However,
> having thought about it, I wonder if we should just be writing a plain
> text file (e.g. rebase-merge/author-data) and fixing the reader to read
> that if it exists and only then fall back to reading the legacy
> rebase-merge/author-script with a fix to correctly handle the script
> written by the shell version - what do you think? The author-script
> really should be just an implementation detail. If anyone really wants
> to read it they can still do 'read -r l' and split the lines with
> ${l%%=*} and ${l#*=}
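
For reference, that splitting would look roughly like the sketch below (the
input line is taken from the t3404 log at the end of this mail). Note that
the value comes back with its single quotes still attached, so a reader
going this route still has to dequote it:

```shell
# Read one line the way it would appear in the author-script.
read -r l <<\EOF
GIT_AUTHOR_NAME='A U Thor'
EOF

key=${l%%=*}    # everything before the first '='
value=${l#*=}   # everything after the first '=', quotes included

printf '%s\n%s\n' "$key" "$value"
```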

In contrast to `git am`, there *is* a use case where power users might
have come to rely on the presence of the .git/rebase-merge/author-script
file *and* its nature as a shell script snippet: we purposefully allow
scripting `rebase -i`.

So I don't think that we can declare the file and its format as
implementation detail, even if the idea is very, very tempting.
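
To illustrate the kind of usage I have in mind (this is only a sketch; the
temporary file stands in for .git/rebase-merge/author-script, and the values
are the ones from the t3404 log below), a script run while `rebase -i` is
stopped could simply source the file:

```shell
# Stand-in for .git/rebase-merge/author-script (hypothetical temp file).
script=$(mktemp) || exit 1
cat >"$script" <<\EOF
GIT_AUTHOR_NAME='A U Thor'
GIT_AUTHOR_EMAIL='author@example.com'
GIT_AUTHOR_DATE='@1112912233 -0700'
EOF

# Source the snippet and export the result, as a user's script might do
# before creating a commit with the original authorship.
. "$script"
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_AUTHOR_DATE
rm -f "$script"

printf '%s <%s> %s\n' "$GIT_AUTHOR_NAME" "$GIT_AUTHOR_EMAIL" "$GIT_AUTHOR_DATE"
```

This only works as long as the file remains valid input for sh(1).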

> >> diff --git a/sequencer.c b/sequencer.c
> >> index 5354d4d51e..0b78d1f100 100644
> >> --- a/sequencer.c
> >> +++ b/sequencer.c
> >> @@ -638,21 +638,21 @@ static int write_author_script(const char *message)
> >>  		else if (*message != '\'')
> >>  			strbuf_addch(&buf, *(message++));
> >>  		else
> >> -			strbuf_addf(&buf, "'\\\\%c'", *(message++));
> >> +			strbuf_addf(&buf, "'\\%c'", *(message++));
> >>  	strbuf_addstr(&buf, "'\nGIT_AUTHOR_EMAIL='");
> >>  	while (*message && *message != '\n' && *message != '\r')
> >>  		if (skip_prefix(message, "> ", &message))
> >>  			break;
> >>  		else if (*message != '\'')
> >>  			strbuf_addch(&buf, *(message++));
> >>  		else
> >> -			strbuf_addf(&buf, "'\\\\%c'", *(message++));
> >> +			strbuf_addf(&buf, "'\\%c'", *(message++));
> >>  	strbuf_addstr(&buf, "'\nGIT_AUTHOR_DATE='@");
> >>  	while (*message && *message != '\n' && *message != '\r')
> >>  		if (*message != '\'')
> >>  			strbuf_addch(&buf, *(message++));
> >>  		else
> >> -			strbuf_addf(&buf, "'\\\\%c'", *(message++));
> >> +			strbuf_addf(&buf, "'\\%c'", *(message++));
> >>  	res = write_message(buf.buf, buf.len, rebase_path_author_script(), 1);
> > 
> > I resolved the merge conflict with Akinori's patch. FWIW I pushed all of
> > this, including the fixup to Junio's fixup to the
> > `fix-t3404-author-script-test` branch at https://github.com/dscho/git.
> > 
> >>  	strbuf_release(&buf);
> >>  	return res;
> >> @@ -666,13 +666,21 @@ static int read_env_script(struct argv_array *env)
> >>  {
> >>  	struct strbuf script = STRBUF_INIT;
> >>  	int i, count = 0;
> >> -	char *p, *p2;
> >> +	const char *p2;
> >> +	char *p;
> >>  
> >>  	if (strbuf_read_file(&script, rebase_path_author_script(), 256) <= 0)
> >>  		return -1;
> >>  
> >>  	for (p = script.buf; *p; p++)
> >> -		if (skip_prefix(p, "'\\\\''", (const char **)&p2))
> >> +		/*
> >> +		 * write_author_script() used to escape "'" incorrectly as
> >> +		 * "'\\\\''" rather than "'\\''" so we check for the correct
> >> +		 * version the incorrect version in case git was upgraded while
> >> +		 * rebase was stopped.
> >> +		 */
> >> +		if (skip_prefix(p, "'\\''", &p2) ||
> >> +		    skip_prefix(p, "'\\\\''", &p2))
> > 
> > I think in this form, it is possibly unsafe because it assumes that the
> > new code cannot generate output that would trigger that same code path.
> > Although I have to admit that I did not give this a great deal of thought.
> 
> Hm, I'm not sure that it can. If the Author begins \\' then this will be
> written as the C string "'\\\\'\\''...". If \\' comes at the end then I
> think this will be written as "\\\\'\\'''", in the middle of the name it
> will be "\\\\'\\''..."

Yes, that matches my gut feeling... but...

> > In any case, if you have to think long and hard about some fix, it might
> > be better to go with something that is easier to reason about. So how
> > about this: we already know that the code is buggy, Akinori fixed the bug,
> > where the author-script missed its trailing single-quote. We can use this
> > as a tell-tale for *this* bug. Assuming that Junio will advance both your
> > and Akinori's fix in close proximity.
> 
> That sounds like a good approach

I am glad that you agree to this.
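
The tell-tale boils down to looking at the end of the file. A rough shell
rendition, purely for illustration (the helper name `has_sq_bug` and the
sample files are made up; the actual check is the one-liner in
read_env_script() in the fixup quoted below):

```shell
# has_sq_bug FILE: succeed if FILE's content does not end in a single
# quote, i.e. it looks like output of the old write_author_script().
# Hypothetical helper, for illustration only.
has_sq_bug () {
	content=$(cat "$1") || return 2
	case "$content" in
	*"'") return 1 ;;
	*) return 0 ;;
	esac
}

old=$(mktemp) && new=$(mktemp) || exit 1
# Shaped like the old output: the last line lacks its closing quote.
printf '%s\n' "GIT_AUTHOR_DATE='@1112912233 -0700" >"$old"
# Shaped like the fixed output: the last line is properly terminated.
printf '%s\n' "GIT_AUTHOR_DATE='@1112912233 -0700'" >"$new"

has_sq_bug "$old" && echo "old: expect the broken quoting"
has_sq_bug "$new" || echo "new: expect the correct quoting"
rm -f "$old" "$new"
```

Note that command substitution drops a trailing newline, so this is not
byte-for-byte the same test as the C version, which inspects the very last
byte of the buffer.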

I am a big fan of this age-old wisdom that goes somewhat like this: some
code is so simple that there are obviously no bugs, and some code is so
complicated that there are no obvious bugs. In this context, I would use
the modified version: some code is so easy to reason about that there is
obviously no flaw, and some other code is so difficult to reason about
that there is no obvious flaw.

Besides, using the sq_bug version can serve as a reminder to "pay down
the technical debt" in the future.
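
For anyone who wants to see what sh(1) makes of the two escapes, here is a
small sketch. It evals one assignment quoted the correct way ('\'') and one
quoted with the extra backslash that write_author_script() used to emit; the
variable and the author name are made up:

```shell
# Correct: close the quote, add an escaped quote, reopen the quote.
read -r good <<\EOF
NAME='O'\''Brien'
EOF
# Old output: one backslash too many.
read -r bad <<\EOF
NAME='O'\\''Brien'
EOF

eval "$good" && printf 'good: %s\n' "$NAME"

# In the broken form, \\ is a literal backslash and '' an empty string, so
# the final quote is left unterminated. Run it in a subshell, because a
# syntax error inside eval may terminate the calling shell.
if (eval "$bad") 2>/dev/null
then
	echo "bad: parsed (unexpected)"
else
	echo "bad: sh refuses to parse it"
fi
```

This is why only external scripts that eval or source the file notice the
difference for authors with a single quote in their name.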

> > Again, this is pushed to the `fix-t3404-author-script-test` branch at
> > https://github.com/dscho/git; my fixup on top of your patch looks like
> > this (feel free to drop the sq_bug part and only keep the test part):
> > 
> > -- snipsnap --
> > diff --git a/sequencer.c b/sequencer.c
> > index 46c0b3e720f..7abe78dc78e 100644
> > --- a/sequencer.c
> > +++ b/sequencer.c
> > @@ -573,13 +573,14 @@ static int write_author_script(const char *message)
> >  static int read_env_script(struct argv_array *env)
> >  {
> >  	struct strbuf script = STRBUF_INIT;
> > -	int i, count = 0;
> > +	int i, count = 0, sq_bug;
> >  	const char *p2;
> >  	char *p;
> >  
> >  	if (strbuf_read_file(&script, rebase_path_author_script(), 256) <= 0)
> >  		return -1;
> >  
> > +	sq_bug = script.len && script.buf[script.len - 1] != '\'';
> >  	for (p = script.buf; *p; p++)
> >  		/*
> >  		 * write_author_script() used to escape "'" incorrectly as
> > @@ -587,8 +588,9 @@ static int read_env_script(struct argv_array *env)
> >  		 * version the incorrect version in case git was upgraded while
> >  		 * rebase was stopped.
> >  		 */
> 
> We probably want to change the comment slightly to explain sq_bug

True. Can you give it a shot? If not, I will try to remember some time mid
next week.

Here is the promised log of t3404 with -i -v -x (on macOS, but Linux and
Windows show equivalent failures; the way I read it, the problem is
simply that the test was introduced in the middle of t3404 and
subsequent test cases' assumptions are no longer met):

-- snipsnap --
2018-07-27T11:53:04.2450890Z ok 2 - rebase --keep-empty
2018-07-27T11:53:04.2464690Z 
2018-07-27T11:53:04.2489730Z expecting success: 
2018-07-27T11:53:04.2513840Z 	test_when_finished "git rebase --abort ||:" &&
2018-07-27T11:53:04.2536980Z 	git checkout master &&
2018-07-27T11:53:04.2563050Z 	set_fake_editor &&
2018-07-27T11:53:04.2590300Z 	FAKE_LINES="edit 1" git rebase -i HEAD^ &&
2018-07-27T11:53:04.2614900Z 	test -f .git/rebase-merge/author-script &&
2018-07-27T11:53:04.2639740Z 	(
2018-07-27T11:53:04.2665320Z 		sane_unset GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_AUTHOR_DATE &&
2018-07-27T11:53:04.2688410Z 		eval "$(cat .git/rebase-merge/author-script)" &&
2018-07-27T11:53:04.2710930Z 		test "$(git show --quiet --pretty=format:%an)" = "$GIT_AUTHOR_NAME" &&
2018-07-27T11:53:04.2734740Z 		test "$(git show --quiet --pretty=format:%ae)" = "$GIT_AUTHOR_EMAIL" &&
2018-07-27T11:53:04.2756700Z 		test "$(git show --quiet --date=raw --pretty=format:@%ad)" = "$GIT_AUTHOR_DATE"
2018-07-27T11:53:04.2778060Z 	)
2018-07-27T11:53:04.2788910Z 
2018-07-27T11:53:04.2810410Z ++ test_when_finished 'git rebase --abort ||:'
2018-07-27T11:53:04.2831430Z ++ test 0 = 0
2018-07-27T11:53:04.2852830Z ++ test_cleanup='{ git rebase --abort ||:
2018-07-27T11:53:04.2874430Z 		} && (exit "$eval_ret"); eval_ret=$?; :'
2018-07-27T11:53:04.2892720Z ++ git checkout master
2018-07-27T11:53:04.2911560Z Switched to branch 'master'
2018-07-27T11:53:04.2929490Z ++ set_fake_editor
2018-07-27T11:53:04.2947690Z ++ write_script fake-editor.sh
2018-07-27T11:53:04.2966040Z ++ echo '#!/bin/sh'
2018-07-27T11:53:04.2983780Z ++ cat
2018-07-27T11:53:04.3001950Z ++ chmod +x fake-editor.sh
2018-07-27T11:53:04.3019770Z +++ pwd
2018-07-27T11:53:04.3038530Z ++ test_set_editor '/Users/vsts/agent/2.138.3/work/1/s/t/trash directory.t3404-rebase-interactive/fake-editor.sh'
2018-07-27T11:53:04.3057950Z ++ FAKE_EDITOR='/Users/vsts/agent/2.138.3/work/1/s/t/trash directory.t3404-rebase-interactive/fake-editor.sh'
2018-07-27T11:53:04.3076510Z ++ export FAKE_EDITOR
2018-07-27T11:53:04.3094750Z ++ EDITOR='"$FAKE_EDITOR"'
2018-07-27T11:53:04.3112610Z ++ export EDITOR
2018-07-27T11:53:04.3130670Z ++ FAKE_LINES='edit 1'
2018-07-27T11:53:04.3153990Z ++ git rebase -i 'HEAD^'
2018-07-27T11:53:04.4674460Z rebase -i script before editing:
2018-07-27T11:53:04.4698420Z pick 8f99a4f E
2018-07-27T11:53:04.4710290Z 
2018-07-27T11:53:04.4792780Z rebase -i script after editing:
2018-07-27T11:53:04.4817630Z edit 8f99a4f E
2018-07-27T11:53:04.5025190Z Rebasing (1/1)
2018-07-27T11:53:04.5046400Z Stopped at 8f99a4f...  E
2018-07-27T11:53:04.5065680Z You can amend the commit now, with
2018-07-27T11:53:04.5074810Z 
2018-07-27T11:53:04.5092850Z   git commit --amend 
2018-07-27T11:53:04.5101800Z 
2018-07-27T11:53:04.5119900Z Once you are satisfied with your changes, run
2018-07-27T11:53:04.5129030Z 
2018-07-27T11:53:04.5147390Z   git rebase --continue
2018-07-27T11:53:04.5165610Z ++ test -f .git/rebase-merge/author-script
2018-07-27T11:53:04.5183820Z ++ sane_unset GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_AUTHOR_DATE
2018-07-27T11:53:04.5201610Z ++ unset GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_AUTHOR_DATE
2018-07-27T11:53:04.5220730Z ++ return 0
2018-07-27T11:53:04.5244590Z +++ cat .git/rebase-merge/author-script
2018-07-27T11:53:04.5269130Z ++ eval 'GIT_AUTHOR_NAME='\''A U Thor'\''
2018-07-27T11:53:04.5294440Z GIT_AUTHOR_EMAIL='\''author@example.com'\''
2018-07-27T11:53:04.5319550Z GIT_AUTHOR_DATE='\''@1112912233 -0700'\'''
2018-07-27T11:53:04.5345580Z +++ GIT_AUTHOR_NAME='A U Thor'
2018-07-27T11:53:04.5365960Z +++ GIT_AUTHOR_EMAIL=author@example.com
2018-07-27T11:53:04.5384150Z +++ GIT_AUTHOR_DATE='@1112912233 -0700'
2018-07-27T11:53:04.5402250Z +++ git show --quiet --pretty=format:%an
2018-07-27T11:53:04.5420400Z ++ test 'A U Thor' = 'A U Thor'
2018-07-27T11:53:04.5438410Z +++ git show --quiet --pretty=format:%ae
2018-07-27T11:53:04.5459570Z ++ test author@example.com = author@example.com
2018-07-27T11:53:04.5477950Z +++ git show --quiet --date=raw --pretty=format:@%ad
2018-07-27T11:53:04.5496180Z ++ test '@1112912233 -0700' = '@1112912233 -0700'
2018-07-27T11:53:04.5514390Z ++ git rebase --abort
2018-07-27T11:53:04.6924920Z ++ exit 0
2018-07-27T11:53:04.6950890Z ++ eval_ret=0
2018-07-27T11:53:04.6973900Z ++ :
2018-07-27T11:53:04.6998480Z ok 3 - rebase -i writes out .git/rebase-merge/author-script in "edit" that sh(1) can parse
2018-07-27T11:53:04.7009400Z 
2018-07-27T11:53:04.7030290Z expecting success: 
2018-07-27T11:53:04.7050880Z 	set_fake_editor &&
2018-07-27T11:53:04.7072520Z 	test_must_fail env FAKE_LINES="1 exec_true" git rebase -i HEAD^ >actual 2>&1 &&
2018-07-27T11:53:04.7092910Z 	test_i18ncmp expect actual
2018-07-27T11:53:04.7102960Z 
2018-07-27T11:53:04.7123150Z ++ set_fake_editor
2018-07-27T11:53:04.7144080Z ++ write_script fake-editor.sh
2018-07-27T11:53:04.7164970Z ++ echo '#!/bin/sh'
2018-07-27T11:53:04.7185360Z ++ cat
2018-07-27T11:53:04.7212950Z ++ chmod +x fake-editor.sh
2018-07-27T11:53:04.7236720Z +++ pwd
2018-07-27T11:53:04.7263500Z ++ test_set_editor '/Users/vsts/agent/2.138.3/work/1/s/t/trash directory.t3404-rebase-interactive/fake-editor.sh'
2018-07-27T11:53:04.7291390Z ++ FAKE_EDITOR='/Users/vsts/agent/2.138.3/work/1/s/t/trash directory.t3404-rebase-interactive/fake-editor.sh'
2018-07-27T11:53:04.7314090Z ++ export FAKE_EDITOR
2018-07-27T11:53:04.7338290Z ++ EDITOR='"$FAKE_EDITOR"'
2018-07-27T11:53:04.7362710Z ++ export EDITOR
2018-07-27T11:53:04.7386370Z ++ test_must_fail env 'FAKE_LINES=1 exec_true' git rebase -i 'HEAD^'
2018-07-27T11:53:04.7412410Z ++ case "$1" in
2018-07-27T11:53:04.7436170Z ++ _test_ok=
2018-07-27T11:53:04.7455340Z ++ env 'FAKE_LINES=1 exec_true' git rebase -i 'HEAD^'
2018-07-27T11:53:04.9560670Z ++ exit_code=0
2018-07-27T11:53:04.9581700Z ++ test 0 -eq 0
2018-07-27T11:53:04.9603200Z ++ list_contains '' success
2018-07-27T11:53:04.9628730Z ++ case ",$1," in
2018-07-27T11:53:04.9652120Z ++ return 1
2018-07-27T11:53:04.9677740Z ++ echo 'test_must_fail: command succeeded: env FAKE_LINES=1 exec_true git rebase -i HEAD^'
2018-07-27T11:53:04.9704590Z test_must_fail: command succeeded: env FAKE_LINES=1 exec_true git rebase -i HEAD^
2018-07-27T11:53:04.9730750Z ++ return 1
2018-07-27T11:53:04.9754580Z error: last command exited with $?=1
2018-07-27T11:53:04.9779100Z not ok 4 - rebase -i with empty HEAD
2018-07-27T11:53:04.9802670Z #	
2018-07-27T11:53:04.9826330Z #		set_fake_editor &&
2018-07-27T11:53:04.9851320Z #		test_must_fail env FAKE_LINES="1 exec_true" git rebase -i HEAD^ >actual 2>&1 &&
2018-07-27T11:53:04.9875680Z #		test_i18ncmp expect actual
2018-07-27T11:53:04.9899170Z #	
2018-07-27T11:53:04.9923310Z make[1]: *** [t3404-rebase-interactive.sh] Error 1
