git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Thomas Gummerer <t.gummerer@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH v3 06/11] rerere: fix crash when conflict goes unresolved
Date: Mon, 30 Jul 2018 21:45:45 +0100	[thread overview]
Message-ID: <20180730204545.GH9955@hank.intra.tgummerer.com> (raw)
In-Reply-To: <xmqqbmaohath.fsf@gitster-ct.c.googlers.com>

On 07/30, Junio C Hamano wrote:
> Thomas Gummerer <t.gummerer@gmail.com> writes:
> 
> > Currently when a user doesn't resolve a conflict in a file, but
> > commits the file with the conflict markers, and later the file ends up
> > in a state in which rerere can't handle it, subsequent rerere
> > operations that are interested in that path, such as 'rerere clear' or
> > 'rerere forget <path>' will fail, or even worse in the case of 'rerere
> > clear' segfault.
> >
> > Such states include nested conflicts, or an extra conflict marker that
> > doesn't have any match.
> >
> > This is because the first 'git rerere' when there was only one
> > conflict in the file leaves an entry in the MERGE_RR file behind.  The
> 
> I find this sentence, especially the "only one conflict in the file"
> part, a bit unclear.  What does the sentence count as one conflict?
> One block of lines enclosed inside "<<<"...">>>" pair?  The command
> behaves differently when there are two such blocks instead?

Yeah as you mentioned below, conflict marker(s) that cannot be parsed
here would make more sense.  Will adjust the commit message.

> > next 'git rerere' will then pick the rerere ID for that file up, and
> > not assign a new ID as it can't successfully calculate one.  It will
> > however still try to do the rerere operation, because of the existing
> > ID.  As the handle_file function fails, it will remove the 'preimage'
> > for the ID in the process, while leaving the ID in the MERGE_RR file.
> >
> > Now when 'rerere clear' for example is run, it will segfault in
> > 'has_rerere_resolution', because status is NULL.
> 
> I think this "status" refers to the collection->status[].  How do we
> get into that state, though?
> 
> new_rerere_id() and new_rerere_id_hex() fills id->collection by
> calling find_rerere_dir(), which either finds an existing rerere_dir
> instance or manufactures one with .status==NULL.  The .status[]
> array is later grown by calling fit_variant as we scan and find the
> pre/post images, but because there is no pre/post image for a file
> with unparseable conflicts, it is left NULL.
> 
> So another possible fix could be to make sure that .status[] is only
> read when .status_nr says there is something worth reading.  I am
> not saying that would be a better fix---I am just thinking out loud
> to make sure I understand the issue correctly.

Yeah what you are writing above matches my understanding, and that
should fix the issue as well.  I haven't actually tried what you're
proposing above, but I think I find it nicer to just remove the entry
we can't do anything with anyway.

> > To fix this, remove the rerere ID from the MERGE_RR file in the case
> > when we can't handle it, and remove the corresponding variant from
> > .git/rr-cache/.  Removing it unconditionally is fine here, because if
> > the user would have resolved the conflict and ran rerere, the entry
> > would no longer be in the MERGE_RR file, so we wouldn't have this
> > problem in the first place, while if the conflict was not resolved,
> > the only thing that's left in the folder is the 'preimage', which by
> > itself will be regenerated by git if necessary, so the user won't
> > loose any work.
> 
> s/loose/lose/
> 
> > Note that other variants that have the same conflict ID will not be
> > touched.
> 
> Nice.  Thanks for a fix.
> 
> >
> > Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
> > ---
> >  rerere.c          | 12 +++++++-----
> >  t/t4200-rerere.sh | 22 ++++++++++++++++++++++
> >  2 files changed, 29 insertions(+), 5 deletions(-)
> >
> > diff --git a/rerere.c b/rerere.c
> > index da1ab54027..895ad80c0c 100644
> > --- a/rerere.c
> > +++ b/rerere.c
> > @@ -823,10 +823,7 @@ static int do_plain_rerere(struct string_list *rr, int fd)
> >  		struct rerere_id *id;
> >  		unsigned char sha1[20];
> >  		const char *path = conflict.items[i].string;
> > -		int ret;
> > -
> > -		if (string_list_has_string(rr, path))
> > -			continue;
> > +		int ret, has_string;
> >  
> >  		/*
> >  		 * Ask handle_file() to scan and assign a
> > @@ -834,7 +831,12 @@ static int do_plain_rerere(struct string_list *rr, int fd)
> >  		 * yet.
> >  		 */
> >  		ret = handle_file(path, sha1, NULL);
> > -		if (ret < 1)
> > +		has_string = string_list_has_string(rr, path);
> > +		if (ret < 0 && has_string) {
> > +			remove_variant(string_list_lookup(rr, path)->util);
> > +			string_list_remove(rr, path, 1);
> > +		}
> > +		if (ret < 1 || has_string)
> >  			continue;
> 
> We used to say "if we know about the path we do not do anything
> here, if we do not see any conflict in the file we do nothing,
> otherwise we assign a new id"; we now say "see if we can parse
> and also see if we have conflict(s); if we know about the path and
> we cannot parse, drop it from the rr database (because otherwise the
> entry will cause us trouble elsewhere later).  Otherwise, if we do
> not have any conflict or we already know about the path, no need to
> do anything. Otherwise, i.e. a newly discovered path with conflicts
> gets a new id".
> 
> Makes sense.  "A known path with unparseable conflict gets dropped"
> is the important change in this hunk.
> 
> > diff --git a/t/t4200-rerere.sh b/t/t4200-rerere.sh
> > index 8417e5a4b1..34f0518a5e 100755
> > --- a/t/t4200-rerere.sh
> > +++ b/t/t4200-rerere.sh
> > @@ -580,4 +580,26 @@ test_expect_success 'multiple identical conflicts' '
> >  	count_pre_post 0 0
> >  '
> >  
> > +test_expect_success 'rerere with unexpected conflict markers does not crash' '
> > +	git reset --hard &&
> > +
> > +	git checkout -b branch-1 master &&
> > +	echo "bar" >test &&
> > +	git add test &&
> > +	git commit -q -m two &&
> > +
> > +	git reset --hard &&
> > +	git checkout -b branch-2 master &&
> > +	echo "foo" >test &&
> > +	git add test &&
> > +	git commit -q -a -m one &&
> > +
> > +	test_must_fail git merge branch-1 &&
> > +	sed "s/bar/>>>>>>> a/" >test.tmp <test &&
> > +	mv test.tmp test &&
> 
> OK, so the "only one conflict" in the log message meant just one
> side of the conflict marker.  More generally, the troublesome is
> to have "conflict marker(s) that cannot be parsed" in the file.
> 
> > +	git rerere &&
> > +
> > +	git rerere clear
> > +'
> > +
> >  test_done

  reply	other threads:[~2018-07-30 20:45 UTC|newest]

Thread overview: 84+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-20 21:12 [RFC/PATCH 0/7] rerere: handle nested conflicts Thomas Gummerer
2018-05-20 21:12 ` [RFC/PATCH 1/7] rerere: unify error message when read_cache fails Thomas Gummerer
2018-05-21 19:00   ` Stefan Beller
2018-05-20 21:12 ` [RFC/PATCH 2/7] rerere: mark strings for translation Thomas Gummerer
2018-05-24  7:20   ` Junio C Hamano
2018-05-20 21:12 ` [RFC/PATCH 3/7] rerere: add some documentation Thomas Gummerer
2018-05-24  9:20   ` Junio C Hamano
2018-06-03 11:41     ` Thomas Gummerer
2018-05-20 21:12 ` [RFC/PATCH 4/7] rerere: fix crash when conflict goes unresolved Thomas Gummerer
2018-05-24  9:47   ` Junio C Hamano
2018-05-24 18:54     ` Thomas Gummerer
2018-05-25  1:20       ` Junio C Hamano
2018-05-20 21:12 ` [RFC/PATCH 5/7] rerere: only return whether a path has conflicts or not Thomas Gummerer
2018-05-24 10:02   ` Junio C Hamano
2018-05-20 21:12 ` [RFC/PATCH 6/7] rerere: factor out handle_conflict function Thomas Gummerer
2018-05-20 21:12 ` [RFC/PATCH 7/7] rerere: teach rerere to handle nested conflicts Thomas Gummerer
2018-05-24 10:21   ` Junio C Hamano
2018-05-24 19:07     ` Thomas Gummerer
2018-06-05 21:52 ` [PATCH v2 00/10] rerere: " Thomas Gummerer
2018-06-05 21:52   ` [PATCH v2 01/10] rerere: unify error messages when read_cache fails Thomas Gummerer
2018-06-05 21:52   ` [PATCH v2 02/10] rerere: lowercase error messages Thomas Gummerer
2018-06-05 21:52   ` [PATCH v2 03/10] rerere: wrap paths in output in sq Thomas Gummerer
2018-06-05 21:52   ` [PATCH v2 04/10] rerere: mark strings for translation Thomas Gummerer
2018-06-05 21:52   ` [PATCH v2 05/10] rerere: add some documentation Thomas Gummerer
2018-06-05 21:52   ` [PATCH v2 06/10] rerere: fix crash when conflict goes unresolved Thomas Gummerer
2018-06-05 21:52   ` [PATCH v2 07/10] rerere: only return whether a path has conflicts or not Thomas Gummerer
2018-06-05 21:52   ` [PATCH v2 08/10] rerere: factor out handle_conflict function Thomas Gummerer
2018-06-05 21:52   ` [PATCH v2 09/10] rerere: teach rerere to handle nested conflicts Thomas Gummerer
2018-06-05 21:52   ` [PATCH v2 10/10] rerere: recalculate conflict ID when unresolved conflict is committed Thomas Gummerer
2018-07-03 21:05   ` [PATCH v2 00/10] rerere: handle nested conflicts Thomas Gummerer
2018-07-06 17:56     ` Junio C Hamano
2018-07-10 21:37       ` Thomas Gummerer
2018-07-14 21:44   ` [PATCH v3 00/11] " Thomas Gummerer
2018-07-14 21:44     ` [PATCH v3 01/11] rerere: unify error messages when read_cache fails Thomas Gummerer
2018-07-14 21:44     ` [PATCH v3 02/11] rerere: lowercase error messages Thomas Gummerer
2018-07-14 21:44     ` [PATCH v3 03/11] rerere: wrap paths in output in sq Thomas Gummerer
2018-07-14 21:44     ` [PATCH v3 04/11] rerere: mark strings for translation Thomas Gummerer
2018-07-15 13:24       ` Simon Ruderich
2018-07-16 20:40         ` Thomas Gummerer
2018-07-14 21:44     ` [PATCH v3 05/11] rerere: add documentation for conflict normalization Thomas Gummerer
2018-07-30 17:50       ` Junio C Hamano
2018-07-30 20:21         ` Thomas Gummerer
2018-07-14 21:44     ` [PATCH v3 06/11] rerere: fix crash when conflict goes unresolved Thomas Gummerer
2018-07-30 17:50       ` Junio C Hamano
2018-07-30 20:45         ` Thomas Gummerer [this message]
2018-07-14 21:44     ` [PATCH v3 07/11] rerere: only return whether a path has conflicts or not Thomas Gummerer
2018-07-30 17:50       ` Junio C Hamano
2018-07-30 20:47         ` Thomas Gummerer
2018-07-14 21:44     ` [PATCH v3 08/11] rerere: factor out handle_conflict function Thomas Gummerer
2018-07-30 17:51       ` Junio C Hamano
2018-07-14 21:44     ` [PATCH v3 09/11] rerere: return strbuf from handle path Thomas Gummerer
2018-07-30 17:51       ` Junio C Hamano
2018-07-14 21:44     ` [PATCH v3 10/11] rerere: teach rerere to handle nested conflicts Thomas Gummerer
2018-07-30 17:45       ` Junio C Hamano
2018-07-30 20:20         ` Thomas Gummerer
2018-07-14 21:44     ` [PATCH v3 11/11] rerere: recalculate conflict ID when unresolved conflict is committed Thomas Gummerer
2018-07-30 17:50     ` [PATCH v3 00/11] rerere: handle nested conflicts Junio C Hamano
2018-07-30 20:49       ` Thomas Gummerer
2018-08-05 17:20     ` [PATCH v4 " Thomas Gummerer
2018-08-05 17:20       ` [PATCH v4 01/11] rerere: unify error messages when read_cache fails Thomas Gummerer
2018-08-05 17:20       ` [PATCH v4 02/11] rerere: lowercase error messages Thomas Gummerer
2018-08-05 17:20       ` [PATCH v4 03/11] rerere: wrap paths in output in sq Thomas Gummerer
2018-08-05 17:20       ` [PATCH v4 04/11] rerere: mark strings for translation Thomas Gummerer
2018-08-05 17:20       ` [PATCH v4 05/11] rerere: add documentation for conflict normalization Thomas Gummerer
2018-08-05 17:20       ` [PATCH v4 06/11] rerere: fix crash with files rerere can't handle Thomas Gummerer
2018-08-05 17:20       ` [PATCH v4 07/11] rerere: only return whether a path has conflicts or not Thomas Gummerer
2018-08-05 17:20       ` [PATCH v4 08/11] rerere: factor out handle_conflict function Thomas Gummerer
2018-08-05 17:20       ` [PATCH v4 09/11] rerere: return strbuf from handle path Thomas Gummerer
2018-08-05 17:20       ` [PATCH v4 10/11] rerere: teach rerere to handle nested conflicts Thomas Gummerer
2018-08-22 11:00         ` Ævar Arnfjörð Bjarmason
2018-08-22 16:06           ` Junio C Hamano
2018-08-22 20:34             ` Thomas Gummerer
2018-08-22 21:07               ` Junio C Hamano
2018-08-24 21:56                 ` Thomas Gummerer
2018-08-24 22:10                   ` [PATCH 1/2] rerere: remove documentation for "nested conflicts" Thomas Gummerer
2018-08-24 22:10                     ` [PATCH 2/2] rerere: add not about files with existing conflict markers Thomas Gummerer
2018-08-28 21:27                     ` [PATCH v2 1/2] rerere: mention caveat about unmatched " Thomas Gummerer
2018-08-28 21:27                       ` [PATCH v2 2/2] rerere: add note about files with existing " Thomas Gummerer
2018-08-29 16:04                       ` [PATCH v2 1/2] rerere: mention caveat about unmatched " Junio C Hamano
2018-09-01  9:00                         ` Thomas Gummerer
2018-08-27 17:33                   ` [PATCH v4 10/11] rerere: teach rerere to handle nested conflicts Junio C Hamano
2018-08-28 22:05                     ` Thomas Gummerer
2018-08-27 19:36                   ` Junio C Hamano
2018-08-05 17:20       ` [PATCH v4 11/11] rerere: recalculate conflict ID when unresolved conflict is committed Thomas Gummerer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180730204545.GH9955@hank.intra.tgummerer.com \
    --to=t.gummerer@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).