git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Elijah Newren <newren@gmail.com>
To: Calvin Wan <calvinwan@google.com>
Cc: Git Mailing List <git@vger.kernel.org>,
	Glen Choo <chooglen@google.com>,
	Junio C Hamano <gitster@pobox.com>,
	Philippe Blain <levraiphilippeblain@gmail.com>
Subject: Re: [PATCH v3] submodule merge: update conflict error message
Date: Thu, 30 Jun 2022 21:27:13 -0700	[thread overview]
Message-ID: <CABPp-BE0qGwUy80dmVszkJQ+tcpfLRW0OZyErymzhZ9+HWY1mw@mail.gmail.com> (raw)
In-Reply-To: <20220630194833.1191949-1-calvinwan@google.com>

On Thu, Jun 30, 2022 at 12:48 PM Calvin Wan <calvinwan@google.com> wrote:
>
> > > Failed to merge submodule <submodule>
> > > CONFLICT (submodule): Merge conflict in <submodule>
> > > Automatic merge failed; recursive merging with submodules is currently
> > > not supported. To manually complete the merge:
> > >  - go to submodule (<submodule>), and merge commit <commit>
> >
> > I'm still a little unsure on this.  The merge_submodule() logic we
> > have may not have detected a merge commit that merges the two branches
> > together, but that doesn't automatically imply a new merge must be
> > created in the submodule to resolve this conflict.  There might be
> > various reasons that other existing commits in the submodule could be
> > used as the "correct" update.
> >
> > Perhaps that is uncommon enough to not be worth mentioning; I'm just a
> > little hesitant to treat the merge_submodule() logic as robust and
> > finding the only choices users might want to use.  If we do keep the
> > wording as-is, it might be nice to at least discuss in the commit
> > message why we chose that and ignored the or-update-submodule option
> > in this case.
>
> You make a good point about merge_submodule() possibly not finding all
> of the right choices -- I'll add back the or-update-submodule option
> back.
>
> > > If this is correct simply add it to the index for example
> > > by using:
> > >
> > >   git update-index --cacheinfo 160000 <commit> "<submodule>"
> > >
> > > which will accept this suggestion.
> >
> > The last few lines above will no longer be part of the output once
> > en/merge-tree is merged; see commit a4040cfa35 (merge-ort: remove
> > command-line-centric submodule message from merge-ort, 2022-06-18).
>
> ack
>
> > > CONFLICT (submodule): Merge conflict in <submodule>
> > > Recursive merging with submodules is currently not supported.
> > > To manually complete the merge:
> > >  - go to submodule (<submodule>), and either update the submodule to a possible commit above or merge commit <commit>
> >
> > Again, I'm hesitant to treat the suggestions from merge_submodule() as
> > the only possibilities.  For example, perhaps the user will want to
> > pick a commit that contains one of the ones found by merge_submodule()
> > in its history -- perhaps because the newer commit they want to select
> > contains an important bugfix for an issue not discovered at the time
> > of the merge in the submodule.
> >
> > Also, I'm worried your sentence may be easy to misunderstand, due to
> > it being ambiguous whether "merge" is a verb or an adjective.  More
> > precisely, your sentence could be read as "update the submodule to a
> > possible commit above, or update the submodule to merge commit
> > <commit>" and then users say, "But <commit> isn't a merge commit; why
> > does this message claim it is?  Do I just update the submodule to that
> > commit anyway?"  Or perhaps users just update it to <commit> without
> > even checking to find out that it's not a merge commit, with the
> > deleterious consequences of missing all the important changes
> > currently found in the submodule.

I probably should have mentioned that listing "merge" first and
"update" second in your instructions at least avoids the ambiguity
because it makes it clear that "merge" is a verb that way:

    - go to submodule (<submodule>), and either merge commit <commit>
or update the submodule to a possible commit above

> I agree this sentence can be misinterpreted. I also think the main
> reason my current message is unclear is because there is not enough
> context for the user to understand why the merge failed. In a standard
> merge, the reason in "CONFLICT (<reason>)" provides context, whereas in
> this case, "CONFLICT (submodule)" only tells the user that the submodule
> is conflicted in some way. The user is unaware that we attempted to
> fast-forward the submodule, failed, and now require the user to either
> update the commit or merge the commit. How does this rewording sound?

Do they need to know what we attempted?  I'm not sure why that
matters, beyond maybe telling them that actual automatic merging of
submodules is currently only done by Git's merging machinery in very
trivial cases.  All that should really matter at this point is that
there was a submodule modified on both sides of history, and we need
them to handle the merge of the submodule.

>  Automatic merge failed; recursive merging with submodules currently
>  only supports fast-forward merges. For each conflicted submodule, if

I'd rather substitute "trivial cases" instead of "fast-forward
merges".  For example, we handle deletions on both sides, it's just
that that's done before it ever gets to merge_submodule().  And if we
add more types of special cases beyond fast-forwards at some point,
it'd be nice to not need to update this text.

>  the current commit does not reflect the desired state, either update or
>  merge the commit. This can be accomplished with the following steps:

Maybe replace the last two sentences with "Please manually handle the
merging of each conflicted submodule; this can be accomplished with
the following steps:"

>  - go to submodule (<submodule>), and either update the commit or merge commit <commit>

What would you think of switching this to

   - go to submodule <submodule>, and either merge commit <commit> or
update to an existing commit which has merged those changes such as
one of the ones listed above

> > >  - come back to superproject, and `git add <submodule>` to record the above merge
> >
> > "...to record the above merge or update"?
>
> will... "update" haha

:-)

> > > If git detects multiple possible merge resolutions, the following is printed:
> > >
> > > --------
> > >
> > > Failed to merge submodule sub, but multiple possible merges exist:
> > >     <commit> Merge branch '<branch1>' into <branch2>
> > >     <commit> Merge branch '<branch1>' into <branch3>
> > >
> > > CONFLICT (submodule): Merge conflict in <submodule>
> > > Recursive merging with submodules is currently not supported.
> > > To manually complete the merge:
> > >  - go to submodule (<submodule>), and either update the submodule to a possible commit above or merge commit <commit>
> >
> > Same issues as I mentioned above for the single-merge-resolution-found case.
> >
>
> ditto
>
> > >  - come back to superproject, and `git add <submodule>` to record the above merge
> >
> > "merge or update", right?
>
> ditto
>
> > > +ret:
> > > +       if (!ret) {
> > > +               if (!csub) {
> > > +                       CALLOC_ARRAY(csub, 1);
> > > +               }
> > > +               csub_item.oid = *result;
> >
> > Shouldn't this be "*b" rather than "*result"?
>
> Yes it should
>
> >
> > Also, are we risking telling the user to "merge commit
> > 0000000000000000000000000000000000000000" from the submodule, given
> > some of the early exits that you redirected to come through here?
>
> You are correct, but I'm not quite sure what should happen in this case. What does it mean for either o, a, or b to be null? Did a submodule get deleted on one side?

o is the version in the merge-base, a is the version from the first
parent (thus the submodule version stored within HEAD), and b is the
version from the second parent (thus the submodule version stored
within the main module's MERGE_HEAD commit).

We can't have all three be null, because that would just mean there
was never a submodule at this path.

We can't have two of the three be null, because that would either be
deleted on both sides, or added on one side, and those cases are
trivially resolvable elsewhere in the merge machinery and there's no
need to call merge_submodule().

If o is null, the submodule didn't exist in the merge base.  So it
must be added on both sides (and the two sides have to have different
submodule commits, or the merge of the submodule would be trivially
handle-able elsewhere).

If a is null, it didn't exist in HEAD, meaning it was deleted on our
side of history.  (And b has to be different than o, or else this
would be trivially resolvable as a deletion.)

If b is null, then it's similar to the above case, but the submodule
was deleted on the other side of history that we are trying to merge
into HEAD rather than on HEAD's side of history.

However, now that I've said this, I took another look through the
code.  I think this actually isn't relevant right now.
merge_submodule() is only called from handle_content_merge(), which in
turn is only called from two places: (1) process_renames(), and (2)
the filemask >= 6 section of process_entry().  The process_renames()
cases, since we can't detect renames involving submodules, can't
involve a case with a null oid for a submodule.  And the filemask >= 6
implies that only o could have a null hash.  That means the checks for
a and b being null oids within merge_submodule will never trigger.
And we don't actually run the risk of telling users to merge the all
null commit.

Any time we have a modify/delete issue with submodules, we'll just end
up at the codepath that reports "CONFLICT (modify/delete): ...", and
which doesn't mention anything about the path being a submodule, but
it really doesn't need to; the text is accurate regardless.


> > > +void print_submodule_conflict_suggestion(struct conflicted_submodule_list *csub) {
> >
> > Maybe just make this function be static?
>
> It should be static now that this won't be called in merge-recursive
>
> > > +                                       find_unique_abbrev(&id, DEFAULT_ABBREV));
> >
> > Shouldn't this be repo_find_unique_abbrev(), and in particular with
> > the submodule repository being passed to it?  Or perhaps using the
> > format_commit() function, since merge_submodule() is already using it
> > for the earlier submodule-related messages?
>
> It should and I will go with the format_commit() option so I don't have to pass subrepo information into the print function.
>
> > >  void merge_switch_to_result(struct merge_options *opt,
> > >                             struct tree *head,
> > >                             struct merge_result *result,
> > > @@ -4324,6 +4367,8 @@ void merge_switch_to_result(struct merge_options *opt,
> > >                 }
> > >                 string_list_clear(&olist, 0);
> > >
> > > +               print_submodule_conflict_suggestion(opt->conflicted_submodules);
> > > +
> > >                 /* Also include needed rename limit adjustment now */
> > >                 diff_warn_rename_limit("merge.renamelimit",
> > >                                        opti->renames.needed_limit, 0);
> > > diff --git a/merge-recursive.c b/merge-recursive.c
> > > index fd1bbde061..311cc37886 100644
> > > --- a/merge-recursive.c
> > > +++ b/merge-recursive.c
> >
> > Is it worth updating merge-recursive.c here?  I'd rather only give it
> > security fix updates until we delete it.
>
> Ah wasn't aware that was the status of merge-recursive. Will delete.
>
> > > diff --git a/merge-recursive.h b/merge-recursive.h
> > > index b88000e3c2..b615955fb7 100644
> > > --- a/merge-recursive.h
> > > +++ b/merge-recursive.h
> > > @@ -9,6 +9,7 @@ struct object_id;
> > >  struct repository;
> > >  struct tree;
> > >
> > > +struct conflicted_submodule_list;
> > >  struct merge_options_internal;
> > >  struct merge_options {
> > >         struct repository *repo;
> > > @@ -51,6 +52,21 @@ struct merge_options {
> > >
> > >         /* internal fields used by the implementation */
> > >         struct merge_options_internal *priv;
> > > +
> > > +       /* field that holds submodule conflict information */
> > > +       struct conflicted_submodule_list *conflicted_submodules;
> >
> > I think this should be added to merge_options_internal rather than to
> > merge_options.  There's no need for an API caller to make use of
> > these.
> >
> > Also, if a clear need arises later for API callers to make use of this
> > information, then it should be part of merge_result for merge-ort.c,
> > not part of merge_options.
>
> ack
>
> > > +};
> > > +
> > > +struct conflicted_submodule_item {
> > > +       struct object_id oid;
> > > +       char *path;
> > > +       int resolution_exists;
> > > +};
> > > +
> > > +struct conflicted_submodule_list {
> > > +       struct conflicted_submodule_item *items;
> > > +       size_t nr;
> > > +       size_t alloc;
> > >  };
> >
> > Similarly, I think these should be defined in merge-ort.c (and maybe
> > also merge-recursive.c) near struct merge_options_internal.
>
> ack
>
> > >  void init_merge_options(struct merge_options *opt, struct repository *repo);
> > > @@ -122,4 +138,6 @@ int merge_recursive_generic(struct merge_options *opt,
> > >                             const struct object_id **merge_bases,
> > >                             struct commit **result);
> > >
> > > +void print_submodule_conflict_suggestion(struct conflicted_submodule_list *csub);
> >
> > and I think there's no reason to put this in the header file; it
> > should just be a static function in merge-ort.c.  (And, if we really
> > want to update merge-recursive.c, then copy the function over there.
> > *Nothing* in merge-ort.c should call a function in merge-recursive.c
> > and similarly nothing in merge-recursive.c should call any function in
> > merge-ort.c.  Yes, that implies some duplication.  There is a fair
> > amount already and we've discussed it, and chosen against creating a
> > shared module as well.  I want merge-recursive to not be broken by
> > merge-ort updates; it should remain stable until the day we finally
> > get to nuke it.  I'm personally of the opinion that merge-recursive
> > should only get security fixes at this point, though I haven't pinged
> > to see if other folks agree with that point of view yet or not.  I'm
> > not wasting my time fixing/improving it, though...)
>
> Good to know going forward I should only update merge-ort (unless for security fixes). I'll keep this in mind refactoring my patch.
>
> > Despite my many nitpicks and whatnot, it looks like your change will
> > make things nicer for the user, and I think your patch is coming along
> > nicely.  There are some things to fix up, but the overall direction
> > seems good.
>
> Thank you for all of the helpful feedback!

  reply	other threads:[~2022-07-01  4:27 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-06 23:54 [PATCH] submodule merge: update conflict error message Calvin Wan
2022-06-07  0:48 ` Junio C Hamano
2022-06-08 17:19   ` Calvin Wan
2022-06-08 17:34     ` Glen Choo
2022-06-08 18:01       ` Calvin Wan
2022-06-08 19:13         ` Junio C Hamano
2022-06-10 23:11 ` [PATCH v2] " Calvin Wan
2022-06-11  4:53   ` Elijah Newren
2022-06-11 17:08     ` Philippe Blain
2022-06-12  6:56       ` Elijah Newren
2022-06-13  2:03       ` Calvin Wan
2022-06-12 23:30     ` Junio C Hamano
2022-06-13  1:54     ` Calvin Wan
2022-06-29 22:40   ` [PATCH v3] " Calvin Wan
2022-06-30  2:40     ` Elijah Newren
2022-06-30 19:48       ` Calvin Wan
2022-07-01  4:27         ` Elijah Newren [this message]
2022-06-30 20:35     ` Glen Choo
2022-06-30 20:45       ` Glen Choo
2022-06-30 21:08       ` Calvin Wan
2022-07-12 23:19     ` [PATCH v4] " Calvin Wan
2022-07-13 18:11       ` Junio C Hamano
2022-07-17  2:46         ` Elijah Newren
2022-07-15 12:57       ` Johannes Schindelin
2022-07-16  6:22         ` Junio C Hamano
2022-07-17  2:44         ` Elijah Newren
2022-07-18 17:03         ` Calvin Wan
2022-07-18 21:43       ` [PATCH v5] " Calvin Wan
2022-07-19  6:39         ` Junio C Hamano
2022-07-19 19:30           ` Calvin Wan
2022-07-19 20:16             ` Junio C Hamano
2022-07-19  7:13         ` Junio C Hamano
2022-07-19 19:07           ` Calvin Wan
2022-07-19 20:30             ` Junio C Hamano
2022-07-25  6:05         ` Ævar Arnfjörð Bjarmason
2022-07-25 12:11           ` Ævar Arnfjörð Bjarmason
2022-07-25 22:03             ` Calvin Wan
2022-07-25 12:31         ` Ævar Arnfjörð Bjarmason
2022-07-25 21:27           ` Calvin Wan
2022-07-26 21:00         ` [PATCH v6] " Calvin Wan
2022-07-27  1:13           ` Elijah Newren
2022-07-27 22:00             ` Calvin Wan
2022-07-28  0:41               ` Elijah Newren
2022-07-28 19:06                 ` Calvin Wan
2022-07-27  9:20           ` Ævar Arnfjörð Bjarmason
2022-07-28 21:12           ` [PATCH v7] " Calvin Wan
2022-07-28 23:22             ` Ævar Arnfjörð Bjarmason
2022-07-29  0:24             ` Elijah Newren
2022-08-01 22:24               ` Calvin Wan
2022-08-01 12:06             ` Ævar Arnfjörð Bjarmason
2022-08-02  0:50             ` Junio C Hamano
2022-08-02 21:03               ` Calvin Wan
2022-08-02 21:11                 ` Junio C Hamano
2022-08-02 21:55                   ` Calvin Wan
2022-08-02 22:22                     ` Junio C Hamano
2022-08-04 19:51             ` [PATCH v8] " Calvin Wan
2022-08-16 15:58               ` Junio C Hamano
2022-08-16 18:58                 ` Junio C Hamano
2022-08-16 19:34                   ` Calvin Wan
2022-08-16 19:39                     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CABPp-BE0qGwUy80dmVszkJQ+tcpfLRW0OZyErymzhZ9+HWY1mw@mail.gmail.com \
    --to=newren@gmail.com \
    --cc=calvinwan@google.com \
    --cc=chooglen@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=levraiphilippeblain@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).