git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Elijah Newren <newren@gmail.com>
To: Stefan Beller <sbeller@google.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: RFC: Merge-related plans
Date: Tue, 29 May 2018 14:03:27 -0700	[thread overview]
Message-ID: <CABPp-BH1BBH7aBpoSaJE5J7okdfrDzpbu5rH-w-FSG2RM0hU1g@mail.gmail.com> (raw)
In-Reply-To: <CAGZ79ka39xbbm_koyFeEizggoDBqKuwgmGwRBZJnBc13aOWZag@mail.gmail.com>

Hi Stefan,

On Tue, May 29, 2018 at 11:19 AM, Stefan Beller <sbeller@google.com> wrote:
> On Mon, May 28, 2018 at 1:48 PM, Elijah Newren <newren@gmail.com> wrote:
>> Currently, I would like to:
>>
>> A) Fix cases where directory rename detection does not work with
>>    rebase/am due to how they call merge-recursive.
>>
>>    Notes: Could just wait for D & E to land before fixing.
>>    Alternatively, email RFC to list now explaining issues and how the
>>    fix has performance implications; poll for opinions on whether to
>>    fix before or after D.
>>
>> B) Implement a remerge-diff ability (compare a merge commit to what an
>>    "automatic merge" would look like)[1].
>>
>>    Notes: Possibly for cherry-picks and reverts too.  Depends on C &
>>    E.
>>
>> C) Modify/extend how path-based and mode-based conflicts are
>>    communicated to the user.
>>
>>    Notes: Particularly important as a mechanism for handling
>>    challenges[2] with implementing the remerge-diff ability.  Need to
>>    send RFC to list with ideas to get feedback.
>>
>> D) Improve merge performance.
>>
>>    Notes: Includes 4-5 specific optimizations[5], some of which I
>>    expect to be somewhat invasive and thus may make more sense to just
>>    make part of the new merge strategy implemented in E.  Biggest
>>    optimization depends on F.
>>
>> E) Write a new merge strategy meant to replace merge-recursive.
>>
>>    Notes: Suggested by Junio[3][4].  Depends on F & G.
>>
>> F) Make file collision conflict types more consistent.
>>
>>    Notes: Specifically, make rename/rename(2to1) & rename/add
>>    conflicts behave more like add/add[6][7].  Depends on part of G.
>>    Would prefer H to be accepted first.
>>
>> G) Improve merge-related portion of testsuite.
>>
>>    Notes: Intended to help test new merge strategy with more
>>    confidence.  Will include approximately a dozen edge and corner
>>    cases where merge-recursive currently falls short.  Started at [8];
>>    see also [9].
>
> Most items forward-reference "Depends on (<later letter>) up to here;
> (H) seems independent, but might be a good first start.
> (G) [8] is queued as origin/en/merge-recursive-tests, or do you mean
> to expand (G) into a mini-framework of merge-testing? i.e. run the
> mini test framework multiple times, each using a different
> merge strategy, similar to submodule tests, e.g. see
> t/lib-submodule-update.sh and one of its users, t1013.

Sorry, I should have been more clear about G.
en/merge-recursive-tests is not G, it's just a preparation topic for
it (or, alternatively, the first of a few topics for G).  In short,
the idea for G was missing coverage of existing functionality for
merge-recursive.

In more detail...

The idea for G was that, if possible, I wanted to avoid a repeat of
the problems directory rename detection caused Junio where he had to
revert it from master and worried about mis-merges.  That could have
been avoided if we had additional tests in the testsuite covering the
already-up-to-date check, making me wonder if there were other
important cases where merge-recursive lacked coverage.  Rather than
finding untested cases which merge-recursive handles correctly, my
investigation mostly yielded additional testcases where
merge-recursive falls down, including:
  * about 3 different rename cases (non-recursive), one of which was
reported previously but isn't in the testsuite
  * about 4 different issues with submodules (non-recursive) that I
don't think are represented in the testsuite (a couple of which I
mentioned in an email to you on the list last year but never created
testcases for)
  * about 6 different issues with special types and/or mode conflicts
(recursive only)
  * a few additional tests, commentary, and an alternate idea for
improving directory/file conflict handling

>> H) Miscellaneous code cleanups irritating me while working on other
>>    changes[10].
>>
>>
>> My current plan was to work roughly in reverse, from H to A.  Some questions:
>>
>>   * Does any of this look objectionable?
>
> Going in order A-H seems slightly out-of-order to me, I'd think (H) and (G)
> would go first;
>
> (B) sounds like an independent feature, which could go in parallel?

B may sound like an independent feature, but it needs a merge
algorithm that doesn't mess with the working tree so it depends pretty
strongly on E.

>>   * Should I post RFC questions on A and C earlier?
>
> I would think so, it is easier to give feedback on code, I would think.

If the idea is to give feedback on *code* rather than just
ideas/tradeoffs/pinpointing-buggy-lines, then it sounds like you're
actually suggesting posting the RFC later rather than earlier?

Also, the bigger question for me wasn't so much "should I ask the list
about these changes?" before making them, but rather: Do folks want me
to bring these things up before I work on D & E -- even if I end up
not getting back to incorporating their answers for months until D & E
are completed and merged?

>>   * Should I split D and G?  (Current plan was to keep D together, but
>>     split G into five short slightly inter-dependent topics)
>
> I would have expected to have tests (G) as a companion of (A) or (C)
> rather than (D), as performance improvements would keep the test suite
> unchanged?

Let me re-phrase: D and G are completely independent series, both of
which happen to be divisible.  Should either of them be split?

More background: D is only a handful of commits, so far; the main
reason to split it is to allow some of it to go first (maybe even
before G or H).  The downside is introducing extra churn and risk in
merge-recursive, when I'm planning to rewrite it soon anyway.  I was
trying to minimize merge-recursive changes, other than trying to make
sure that the new merge strategy and merge-recursive will give
identical results (modulo maybe fixing a few extra corner cases and
running faster).  Basically, I wanted it to be really easy to compare
old and new strategies, but otherwise wanted to leave merge-recursive
mostly alone.  It's not entirely clear how quickly I'll find time to
work on all of this, though, so maybe just-wait-for-the-rewrite is not
the right prioritization.

(And yes, A & C deserve tests, but they're not of the
missing-coverage-for-merge-recursive variety that I was thinking of
for G.)

>>   * E is kind of big.  Are there any specific things folks would like to see
>>     with how that is handled?
>
> How much abstraction can be done ahead of time such that there is an
> interface/API where you just plug in a new merge strategy and do not
> need to duplicate a lot of code/tests?

For avoiding duplicate tests, my plan was to use a variable (like
GIT_TEST_SPLIT_INDEX or GIT_FSMONITOR_TEST), which, when set, would
change the default merge strategy from "recursive" to the new one, and
also replace explicit requests for the "recursive" merge strategy with
the new one.

For avoiding duplicate code...well, Junio's suggestion was "[to
rewrite] without using much from the existing code in
merge-recursive.c at all" because he has "written off that code as
mostly unsalvageable"[4].

There are some functions that I think are worth leaving intact (e.g.
helpers like output(), the directory rename detection stuff, or the
merge_submodule code that you recently moved).  That stuff I was
planning to put in a "/* COPIED FROM merge-recursive.c VERBATIM */"
section.  And eventually, the idea would be to delete the old
merge-recursive and just use the replacement, making the duplication
concerns eventually go away.

...or is your question more about how to abstract things so that
others can also write new merge strategies in the future and allow as
much code and test re-use as possible?


>> [4] https://public-inbox.org/git/xmqqk1ydkbx0.fsf@gitster.mtv.corp.google.com/
>>    [New strategy]


Thanks for taking a look!

Elijah

  reply	other threads:[~2018-05-29 21:03 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-28 20:48 RFC: Merge-related plans Elijah Newren
2018-05-29 18:19 ` Stefan Beller
2018-05-29 21:03   ` Elijah Newren [this message]
2018-05-29 22:12     ` Stefan Beller
2018-05-30  4:21       ` Elijah Newren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CABPp-BH1BBH7aBpoSaJE5J7okdfrDzpbu5rH-w-FSG2RM0hU1g@mail.gmail.com \
    --to=newren@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=sbeller@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).