From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS31976 209.132.180.0/23 X-Spam-Status: No, score=-3.3 required=3.0 tests=AWL,BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI shortcircuit=no autolearn=ham autolearn_force=no version=3.4.0 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by dcvr.yhbt.net (Postfix) with ESMTP id 790041F42D for ; Thu, 12 Apr 2018 22:15:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753074AbeDLWPN (ORCPT ); Thu, 12 Apr 2018 18:15:13 -0400 Received: from mail-wm0-f49.google.com ([74.125.82.49]:53326 "EHLO mail-wm0-f49.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752612AbeDLWPM (ORCPT ); Thu, 12 Apr 2018 18:15:12 -0400 Received: by mail-wm0-f49.google.com with SMTP id 66so1308904wmd.3 for ; Thu, 12 Apr 2018 15:15:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=vCDgupaxF9FrvKMBuBDmthOs0r2lweEnmWURdVv59p0=; b=KffxAzA9fTXco08CKvnsvkAA2blKwhPzBqg/pC5i77kbFik6SxKy3cnAiSL74KJjs9 TGuKAcAmU5HG6aRR36PqTLAYkr83tg35jJIlmH9nvCczEznIloMcE8XVf0KGuorLpivd vk+zCUxlGhPQ57RggzoOX413vcOhO4NE/AX5IRzFeSgzmoT/LCKmH6LnEeNIps1tvXV/ MapM8Z2IQDQPHL28NFl+ETfcYeMJ3i9jlvP9qm/yG03bCXqFCKgQ6y90J/kbypiu4npH wbs/Cy9tScJWk3m6aFSErQctMjw23OKfxqSyu+Ty5bsT5gh5oSlaC5U7/bS8gZhADbVk cFeQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=vCDgupaxF9FrvKMBuBDmthOs0r2lweEnmWURdVv59p0=; b=CLnPhgUwMfQ3MYuEk6brJ72E/MpfC/83kzj9qap11IqUfkugBsJmr5l83tbjGmw+Y+ khz/PLT7J7anccPsYkH5X8SKu/HX45ZLwyN/WyxgmpiMyGI6lqOIhRvh2/KFuoKaGlat ioVs2vUj+xqZYTBBtASrOW5gl+9YNck2nxU+UJhCiSpTnVXrybHj5m5FtHBZ4ahU2iTv Ndlnr9uZK78Yg+4sKH8K6xvLOO/UhJA3AVZvi7LhBtcUaeCwuC1qca/ycY3B0iet2o29 dv4r4E5wEJTP4dR7XNYVcM/YhEsHOG5p2hIEf15Jtg7XJRZZZg4fE/RkQYQviIv4Evd3 4i7Q== X-Gm-Message-State: ALQs6tCDn1f88GK6p8xan7NpIWfTuqOKeKu2F+0ABlak6ycAkRNp6tzC WbMH0CJUn3D7t/iV0EqWE5j55WMu99VmoQ7LOnU= X-Google-Smtp-Source: AIpwx49L/015Qv84G1BCI9skhqCW9LjXtZTJ7DpexXxuGP7Mubgg/tQkuwod3LddgdJeqxrpBth7GGkGsxt33JQBGjQ= X-Received: by 10.80.208.2 with SMTP id j2mr17624034edf.44.1523571310841; Thu, 12 Apr 2018 15:15:10 -0700 (PDT) MIME-Version: 1.0 Received: by 10.80.180.180 with HTTP; Thu, 12 Apr 2018 15:14:50 -0700 (PDT) In-Reply-To: References: <87tvsj6rn2.fsf@javad.com> <871sfm737e.fsf@javad.com> <87h8oh51jm.fsf@javad.com> <87sh810ym4.fsf@javad.com> From: Jacob Keller Date: Thu, 12 Apr 2018 15:14:50 -0700 Message-ID: Subject: Re: [PATCH v6 00/15] rebase -i: offer to recreate commit topology To: Johannes Schindelin Cc: Sergey Organov , Git mailing list , Junio C Hamano , Stefan Beller , Philip Oakley , Eric Sunshine , Phillip Wood , Igor Djordjevic , Johannes Sixt Content-Type: text/plain; charset="UTF-8" Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org On Thu, Apr 12, 2018 at 3:02 PM, Johannes Schindelin wrote: > Hi Jake, > > On Thu, 12 Apr 2018, Jacob Keller wrote: > >> On Wed, Apr 11, 2018 at 10:42 PM, Sergey Organov wrote: >> > >> > Jacob Keller writes: >> >> On Wed, Apr 11, 2018 at 6:13 AM, Sergey Organov wrote: >> >>> It was rather --recreate-merges just a few weeks ago, and I've seen >> >>> nobody actually commented either in favor or against the >> >>> --rebase-merges. >> >>> >> >>> git rebase --rebase-merges >> >> >> >> I'm going to jump in here and say that *I* prefer --rebase-merges, as >> >> it clearly mentions merge commits (which is the thing that changes). >> > >> > OK, thanks, it's fair and the first argument in favor of >> > --rebase-merges I see. >> >> I'd be ok with "--keep-merges" also. > > My main argument against --keep-merges is that there is no good short > option for it: -k and -m are already taken. And I really like my `git > rebase -kir` now... > Right, that's unfortunate. > A minor argument in favor of `--rebase-merges` vs `--keep-merges` is that > we do not really keep the merge commits, we rewrite them. In the version > as per this here patch series, we really create recursive merges from > scratch. I also don't have a strong opinion in regards to --keep vs --rebase.. > > In the later patch series on which I am working, we use a variation of > Phillip's strategy which can be construed as a generalization of the > cherry-pick to include merges: for a cherry-pick, we perform a 3-way merge > between the commit and HEAD, with the commit's parent commit as merge > base. With Phillip's strategy, we perform a 3-way merge between the merge > commit and HEAD (i.e. the rebased first parent), with the merge commit's > first parent as merge base, followed by a 3-way merge with the rebased > 2nd parent (with the original 2nd parent as merge base), etc > > However. This strategy, while it performed well in my initial tests (and > in Buga's initial tests, too), *does* involve more than one 3-way merge, > and therefore it risks something very, very nasty: *nested* merge > conflicts. Yea, it could. Finding an elegant solution around this would be ideal! (By elegant, I mean a solution which produces merge conflicts that users can resolve relatively easily). > > Now, I did see nested merge conflicts in the past, very rarely, but that > can happen, when two developers criss-cross merge each others' `master` > branch and are really happy to perform invasive changes that our merge > does not deal well with, such as indentation changes. > > When rebasing a merge conflict, however, such nested conflicts can happen > relatively easily. Not rare at all. Right. This would be true regardless of what strategy we choose, I think. > > I found out about this by doing what I keep preaching in this thred: > theory is often very nice *right* until the point where it hits reality, > and then frequently turns really useless, real quickly. Theoretical > musings can therefore be an utter waste of time, unless accompanied by > concrete examples. Agreed. > > To start, I built on the example for an "evil merge" that I gave already > in the very beginning of this insanely chatty thread: if one branch > changes the signature of a function, and a second branch adds a caller to > that function, then by necessity a merge between those two branches has to > change the caller to accommodate the signature change. Otherwise it would > end up in a broken state. > > In my `sequencer-shears` branch at https://github.com/dscho/git, I added > this as a test case, where I start out with a main.c containing a single > function called core(). I then create one branch where this function is > renamed to hi(), and another branch where the function caller() is added > that calls core(). Then I merge both, amending the merge commit so that > caller() now calls hi(). So this is the main.c after merging: > > int hi(void) { > printf("Hello, world!\n"); > } > /* caller */ > void caller(void) { > hi(); > } > > To create the kind of problems that are all too common in my daily work > (seemingly every time some stable patch in Git for Windows gets > upstreamed, it changes into an incompatible version, causing merge > conflicts, and sometimes not only that... but I digress...), I then added > an "upstream" where some maintainer decided that core() is better called > greeting(), and also that a placeholder function for an event loop should > be added. So in upstream, main.c looks like this: > > int greeting(void) { > printf("Hello, world!\n"); > } > /* main event loop */ > void event_loop(void) { > /* TODO: place holder for now */ > } > > Keep in mind: while this is a minimal example of disagreeing changes that > may look unrealistic, in practice this is the exact type of problem I am > dealing with on a daily basis, in Git for Windows as well as in GVFS Git > (which adds a thicket of branches on top of Git for Windows) and with the > MSYS2 runtime (where Git for Windows stacks patches on top of MSYs2, which > in turn maintains their set of patches on top of the Cygwin runtime), and > with BusyBox, and probably other projects I forgot spontaneously. This > makes me convinced that this is the exact type of problem that will > challenge whatever --rebase-merges has to deal with, or better put: what > the user of --rebase-merges will have to deal with. > > (If I got a penny for every merge conflict I resolved, where test cases > were appended to files in t/, I'd probably be rich by now. Likewise, the > `const char *` -> `struct object_oid *` conflicts have gotten to a point > where I can resolve them while chatting to somebody.) > > Now, rebasing the original patches above (renaming core() to hi(), and > adding caller()) will obviously conflict with those upstream patches > (renaming core() to greeting(), and adding event_loop()). That cannot be > avoided. In the example above, I decided to override upstream's decision > by insisting on the name hi(), and resolving the other merge conflict by > adding *both* event_loop() and caller(). > > The big trick, now, is to avoid forcing the user to resolve the same > conflicts *again* when the merge commit is rebased. The better we can help > the user here, the more powerful will this mode be. > > But here, Phillip's strategy (as implemented by yours truly) runs this > problem: > > int hi(void) { > printf("Hello, world!\n"); > } > <<<<<<< intermediate merge > <<<<<<< HEAD > /* main event loop */ > void event_loop(void) { > /* TODO: place holder for now */ > ======= > ======= > } > >>>>>>> ... merge head #1 > /* caller */ > void caller(void) { > hi(); > >>>>>>> ... original merge > } > > Now, no matter who I ask, everybody so far agreed with me that this looks > bad. Like, really bad. There are two merge conflicts, obviously, but it is > not even clear which conflict markers belong together! > > It gets a little better when I take a page out of recursive merge's > playbook, which uses different marker sizes for nested merge conflicts > (which I of course implemented and pushed to `sequencer-shears`, currently > still in an unpolished state): > > int hi(void) { > printf("Hello, world!\n"); > } > <<<<<<< intermediate merge > <<<<<<<< HEAD > /* main event loop */ > void event_loop(void) { > /* TODO: place holder for now */ > ======== > ======= > } > >>>>>>> ... merge head #1 > /* caller */ > void caller(void) { > hi(); > >>>>>>>> ... original merge > } > > At least now we understand which conflict markers belong together. But I > still needed to inspect the intermediate states to understand what is > going on: > > After the first 3-way merge (the one between the original merge commit and > HEAD), we have the conflict markers around event_loop() and caller(), > because they had been added into the same spot. > > The second 3-way merge would also want to add the event_loop(), but not > caller(), so ideally it should see that event_loop() is already there and > not add any conflict markers. But that is not the case: event_loop() was > added *with conflict markers*. > > So those conflict markers in the first 3-way merge *cause* the conflicts > in the second 3-way merge! > > And indeed, if we merge the other way round (original merge with 2nd > parent, then with 1st parent), the result looks much better: > > int hi(void) { > printf("Hello, world!\n"); > } > /* main event loop */ > void event_loop(void) { > /* TODO: place holder for now */ > } > <<<<<<<< HEAD > ======== > /* caller */ > void caller(void) { > hi(); > } > >>>>>>>> ... intermediate merge > > So: the order of the 3-way merges does matter. > > I did implement this, too, in the `sequencer_shears` branch: if the first > 3-way merge causes conflicts, attempt the second one, and if that one is > clean, try merging that merge result into HEAD (forgetting about the first > attempted 3-way merge). > > That is still unsatisfying, though, as it is easy to come up with a > main2.c in the above example that requires the *opposite* merge order to > avoid nested conflicts. I agree, this solution won't work reliably because we can show examples which fail the opposite way. > > The only way out I can see is to implement some sort of "W merge" or > "chandelier merge" that can perform an N-way merge between one revision > and N-1 other revisions (each of the N-1 bringing its own merge base). I > call them "W" or "chandelier" because such a merge can be visualized by > the original merge commit being the center of a chandelier, and each arm > representing one of the N-1 merge heads with their own merge bases. > I think this approach sounds reasonable. > Similar to the 3-way merge we have implemented in xdiff/xmerge.c, this > "chandelier merge" would then generate the two diffs between merge base > and both merge heads, except not only one time, but N-1 times. It would > then iterate through all hunks ordered by file name and line range. Any > hunk without conflicting changes would be applied as-is, and the remaining > ones be turned into conflicts (handling those chandelier arms first where > both diffs' hunks look identical). > > Have I missed any simpler alternative? I *think* this would work well if I understand it, but it's difficult to process without examples. > > Ciao, > Johannes