From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS31976 209.132.180.0/23 X-Spam-Status: No, score=-3.4 required=3.0 tests=AWL,BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI shortcircuit=no autolearn=ham autolearn_force=no version=3.4.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by dcvr.yhbt.net (Postfix) with ESMTP id 36A621F517 for ; Sun, 3 Jun 2018 11:40:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751048AbeFCLkw (ORCPT ); Sun, 3 Jun 2018 07:40:52 -0400 Received: from mail-wr0-f196.google.com ([209.85.128.196]:40499 "EHLO mail-wr0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750847AbeFCLkv (ORCPT ); Sun, 3 Jun 2018 07:40:51 -0400 Received: by mail-wr0-f196.google.com with SMTP id l41-v6so40346629wre.7 for ; Sun, 03 Jun 2018 04:40:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=6OpPWqEW/xAWStVURq3IFIdSiBO6OpoRWfiCCORfeok=; b=GUfo0yEDZ89mQWC1CBWl9ivhrYCvhA+NTuRoH9FFZSZ9Y2aOEQrKj4j2urBx57ND+N 686+1OnHv7nSgxIpAWCyEl4N1tM0QKerBV1VQulXupqAIGIcwROhhLPqd9JCTlRnU6VY g18o/IuJKnxxotmbYkjnNKFUXw+D8m0w+H1P9Nj0W9zyKVD8Ea4Tk++zZWi1ExAnrNEY vrslnrC9KNuSbJIzzIDQjStVudAfiPSQlGCrBJ012gX0Fjf68KDYP076yw3tTfcV9NRl BS8EdhUQDSNNxU4H/KooB4LVsFHa5O+gK+dLPaYEVpOPct8SznBU50gsbNoAVmZPmuXY MeSw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=6OpPWqEW/xAWStVURq3IFIdSiBO6OpoRWfiCCORfeok=; b=UiDvQbLceyOkKLQBQuhrSZSTHe/j+7Mjzbd7eFOhjcDGRf0z1HPs4y7pYdfiDX9MR6 fmSL9Cmfo+hjn1QP4cOhx1tIxt8skhhC4I4J5xXTzBRCVeNSsUaogkeEgN0gXojmrEHx 6WYm+iKg/laoWBTuRD61cwndfMgC+Cg3OHTz6D2qRWktWOdGrv5+aCd5cGEMT4BhAdAv 4gToowufGOzCiyLOnmMxBPCAxyop7a4qHap0St1PIDNrYSXsxyvl4rZCHwvKyJndqOuY 0QSqLtE3a0xsu8aLTQVhSFhet7j0Vy+nWVklhRSZSU3PDw8QUaXbae7FooAisEa2kj9+ 1iqw== X-Gm-Message-State: ALKqPwcZFwG8lNWMAr1s8R5mDP9dxewg5+7s07GilYe9V068lVsuCtDX ycytmkqMRYndDdM7TYLn7La8S8Bu X-Google-Smtp-Source: ADUXVKKb2bUiGGnt1hhJe6YxlogvvUer5jTcEqpKS6zBapUBhgUaokWiBegKAAbTjsvURN/IGKppRQ== X-Received: by 2002:adf:824e:: with SMTP id 72-v6mr13864092wrb.127.1528026050085; Sun, 03 Jun 2018 04:40:50 -0700 (PDT) Received: from localhost (cpc73832-dals21-2-0-cust969.20-2.cable.virginm.net. [81.110.231.202]) by smtp.gmail.com with ESMTPSA id r6-v6sm49001784wrg.73.2018.06.03.04.40.48 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sun, 03 Jun 2018 04:40:48 -0700 (PDT) Date: Sun, 3 Jun 2018 12:41:28 +0100 From: Thomas Gummerer To: Junio C Hamano Cc: git@vger.kernel.org Subject: Re: [RFC/PATCH 3/7] rerere: add some documentation Message-ID: <20180603114128.GD18193@hank.intra.tgummerer.com> References: <20180520211210.1248-1-t.gummerer@gmail.com> <20180520211210.1248-4-t.gummerer@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.9.5 (2018-04-13) Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org On 05/24, Junio C Hamano wrote: > Thomas Gummerer writes: > > > +Conflict normalization > > +---------------------- > > + > > +To try and re-do a conflict resolution, even when different merge > > +strategies are used, 'rerere' computes a conflict ID for each > > +conflict in the file. > > + > > +This is done by discarding the common ancestor version in the > > +diff3-style, and re-ordering the two sides of the conflict, in > > +alphabetic order. > > s/discarding.*-style/normalising the conflicted section to 'merge' style/ > > The motivation behind the normalization should probably be given > upfront in the first paragraph. It is to ensure the recorded > resolutions can be looked up from the rerere database for > application, even when branches are merged in different order. I am > not sure what you meant by even when different merge stratagies are > used; I'd drop that if I were writing the paragraph. What I meant was when different conflict styles are used, and when the branches are merged in different orders. But merge strategies is obviously not a good word for that. Will rephrase this. > > +Using this technique a conflict that looks as follows when for example > > +'master' was merged into a topic branch: > > + > > + <<<<<<< HEAD > > + foo > > + ======= > > + bar > > + >>>>>>> master > > + > > +and the opposite way when the topic branch is merged into 'master': > > + > > + <<<<<<< HEAD > > + bar > > + ======= > > + foo > > + >>>>>>> topic > > + > > +can be recognized as the same conflict, and can automatically be > > +re-resolved by 'rerere', as the SHA-1 sum of the two conflicts would > > +be calculated from 'barfoo' in both cases. > > You earlier talked about normalizing and reordering, but did not > talk about "concatenate both with NUL in between and hash", so the > explanation in the last two lines are not quite understandable by > mere mortals, even though I know which part of the code you are > referring to. When you talk about hasing, you may want to make sure > the readers understand that the branch label on <<< and >>> lines > are ignored. > > > +If there are multiple conflicts in one file, they are all appended to > > +one another, both in the 'preimage' file as well as in the conflict > > +ID. > > In case it was not clear (and I do not think it is to those who only > read your description and haven't thought things through > themselves), this concatenation is why the normalization by > reordering is helpful. Imagine that a common ancestor had a file > with a line with string "A" on it (I'll call such a line "line A" > for brevity in the following) in its early part, and line X in its > late part. And then you fork four branches that do these things: > > - AB: changes A to B > - AC: changes A to C > - XY: changes X to Y > - XZ: changes X to Z > > Now, forking a branch ABAC off of branch AB and then merging AC into > it, and forking a branch ACAB off of branch AC and then merging AB > into it, would yield the conflict in a different order. The former > would say "A became B or C, what now?" while the latter would say "A > became C or B, what now?" > > But the act of merging AC into ABAC and resolving the conflict to > leave line D means that you declare: > > After examining what branches AB and AC did, I believe that > making line A into line D is the best thing to do that is > compatible with what AB and AC wanted to do. > > So the conflict we would see when merging AB into ACAB should be > resolved the same way---it is the resolution that is in line with > that declaration. > > Imagine that similarly you had previously forked branch XYXZ from > XY, merged XZ into it, and resolved "X became Y or Z" into "X became > W". > > Now, if you forked a branch ABXY from AB and then merged XY, then > ABXY would have line B in its early part and line Y in its later > part. Such a merge would be quite clean. We can construct > 4 combinations using these four branches ((AB, AC) x (XY, XZ)). > > Merging ABXY and ACXZ would make "an early A became B or C, a late X > became Y or Z" conflict, while merging ACXY and ABXZ would make "an > early A became C or B, a late X became Y or Z". We can see there > are 4 combinations of ("B or C", "C or B") x ("X or Y", "Y or X"). > > By sorting, we can give the conflict its canonical name, namely, "an > early part became B or C, a late part becames X or Y", and whenever > any of these four patterns appear, we can get to the same conflict > and resolution that we saw earlier. Without the sorting, we will > have to somehow find a previous resolution from combinatorial > explosion ;-) Thanks for the in depth explanation! I'll incorporate this into the document. > These days post ec34a8b1 ("Merge branch 'jc/rerere-multi'", > 2016-05-23), the conflict ID can safely collide, i.e. hash > collisions that drops completely different conflicts and their > resolutions into the same .git/rr-cache/$id directory will not > interfere with proper operation of the system, thanks to that > rerere-multi topic that allows us to store multiple preimage > conflicts that happens to share the same conflict ID with their > corresponding postimage resolutions. > > In theory, we *should* be able to stub out the SHA-1 computation and > give every conflict the same ID and rerere should still operate > correctly, even though I haven't tried it yet myself. I gave this a quick try, and the test suite seems to pass with the hash computation giving the same ID to all conflicts.