From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS53758 23.128.96.0/24 X-Spam-Status: No, score=-3.9 required=3.0 tests=AWL,BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_PASS,SPF_PASS shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by dcvr.yhbt.net (Postfix) with ESMTP id 8F61B1F4B4 for ; Wed, 7 Apr 2021 02:26:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348102AbhDGC0E (ORCPT ); Tue, 6 Apr 2021 22:26:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58600 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233994AbhDGC0D (ORCPT ); Tue, 6 Apr 2021 22:26:03 -0400 Received: from mail-ed1-x532.google.com (mail-ed1-x532.google.com [IPv6:2a00:1450:4864:20::532]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 32D15C06174A for ; Tue, 6 Apr 2021 19:25:54 -0700 (PDT) Received: by mail-ed1-x532.google.com with SMTP id dd20so11696425edb.12 for ; Tue, 06 Apr 2021 19:25:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=skydio.com; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=+dMQ1UXu9XBzlGTnwkRACrO25wnqYBB5RlQyEEYwoZY=; b=qhl0JJYLLEig8rQ0M4Ko87z9CYiiQ9WVh9S8Vy1/f5s0cFolvqKgRlIOZOLLoHq1ec 70C3zg3KLDDvYFS2B+TMAR3wgc15B1psM/16yoBrf4tcE9GpnGvLpa3mRBkGSSfacd86 SWuSuBG0xHKNGHMhZDUfTNtTL1XtWdwzPPWS5yTO7VAM0k9DQ2RHRyuw5HlLZEUEjGE0 Ns6wCgvhM1zT22qk8iRGocIIiTDw5riKd6janhb/SQ0xzn3gTlJdzbABm5bhYTPAJbN0 Bfy/RC/U4mKUe7AcQZqkMR3NFcNufs7KsCu0fpUIkGUx54FD+kqAWJeX3D/lMzZox4bE GE9g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=+dMQ1UXu9XBzlGTnwkRACrO25wnqYBB5RlQyEEYwoZY=; b=b5+qxVFWtxT/zz4Xvf8gPtL6EPe9+wv0hAl7gO7o8XCLc+zB3MczhtET+PqQI7Tmnz XyHkh2zwFuokzUjmu5iGX/nnAaW8LD+JG4jtiVNiUhzkVzficlX2ZVAF20SaMzhLxfmX zsLN8vhM6bowmRhOIl9SLOEEaw8VbPRP9nOKHmNoXUdxCa/nBsVLXyYxcjZjlktB2+R1 d5Rv63HtBZp9rW2k7b0bsuV27ZiGNMskHhCX56+pi9m7qivdxLjr2rFKOyxHldaMyu/O 3xHk1+BEDzh2q6KTw3ENXG5tUUoioR3ZlBRj1GXpIGnjdEwrgZDTtPUEm8/FI/Ro/ydC CNrg== X-Gm-Message-State: AOAM533XNLA+zUqz95eLDwPFk/yyO34gc2sqjP2BxvG+2vpVU6FZUZDe txsComHoS3muzaZU/TW0yM3RhbX7u8xbL2zhdE6hHw== X-Google-Smtp-Source: ABdhPJywcsTEQB1xynsDpuswzht3yOdI5LBn2WtJUG9/4NQRNhgNQ6yyNGys7zly5h5/ssnMULI+l2teCEFV1/RKbYQ= X-Received: by 2002:a05:6402:4241:: with SMTP id g1mr1584105edb.331.1617762352859; Tue, 06 Apr 2021 19:25:52 -0700 (PDT) MIME-Version: 1.0 References: <20210403013410.32064-2-jerry@skydio.com> <20210405221902.27998-1-jerry@skydio.com> In-Reply-To: From: Jerry Zhang Date: Tue, 6 Apr 2021 19:25:41 -0700 Message-ID: Subject: Re: [PATCH V2] git-apply: Allow simultaneous --cached and --3way options To: Junio C Hamano Cc: Git Mailing List , Elijah Newren , Ross Yeager , Abraham Bachrach , Brian Kubisiak , Jerry Zhang Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org On Tue, Apr 6, 2021 at 2:56 PM Jerry Zhang wrote: > > On Mon, Apr 5, 2021 at 10:52 PM Junio C Hamano wrote: > > > > Jerry Zhang writes: > > > > > Thanks for the comments! I've updated v3 with the changes. Let me know > > > if you have any > > > more thoughts on whether to block / warn the user before clobbering their cache. > > > > Please do not top-post on this list. > > > > I've already said that I think we should ensure the index is clean > > by default, because, unlike the case where the application is done > > on the working tree files, the use of "--cached" is a sign that the > > next step is likely to write a tree out. As I've already said so in > > earlier reviews, there is nothing more from me to add on that issue. > Understood, but please bear with me to explain the risks a bit more. I'm > having some difficulty coming up with a name and explanation for flags > for this case, because I don't completely understand the safety issue > we are trying to mitigate. > > Let me enumerate some behaviors in 3 different cases where the user > has "file.txt" changes staged in the index, so index differs from HEAD. > > "git apply --cached" would either 1. combine the patch and cached version > and put that in the cache or 2. do nothing (patch failed). In 2 nothing happened > so the user's changes are safe. In 1 the user's changes may be gone, but > since the user was forewarned, this is presumably what they wanted. > > "git apply --3way" would either 1. apply cleanly to working dir or > 2. conflict, in which case user's changes would be moved to stage #2 > in cache. For 1 the user's changes are in the cache, so they can check that out > to restore the original state, since this invocation requires the cache > and working dir to match. For 2, the user's changes are moved to cache > in stage #2. Although the changes are preserved, there doesn't seem to > be any atomic way to move a cache entry from stage #2 to stage #0. > Something like "git restore --staged --ours file.txt" seems like it should > work, but "git restore" doesn't allow combining those flags. > The non atomic way we can do is "git checkout --ours file.txt && > git add file.txt", this is ok in this case since we've required the index > and working tree to match. > > "git apply --3way --cached" would either 1. apply cleanly to the cache or > 2. conflict, and the user's changes are moved to stage #2. In 1, the user's > changes are lost because they're combined with the patch, but this is > the same as the "--cached" case by itself. In 2, the user's changes are > preserved in stage #2 similar to "--3way" by itself. What's somewhat > tricky here is restoring it to stage #0 since we can't use the working > tree, but I think that is more of a limitation in "git restore", since moving > a cache entry from stage #2 to stage #0 is a conceptually possible and > simple operation. Update: I found out that this can be done through "git update-index" Say you start with ``` 100755 adc1032303de8d262868c0a1b85a00fa3b97e9d8 1 file.sh 100755 d0bf2e33594ea7d9d7352df5511db562de819518 2 file.sh 100755 e9bf5dfdea804f4f1bcf24988bbc7c3f99991a09 3 file.sh ``` Pass into "git update-index --index-info the following ``` 100755 d0bf2e33594ea7d9d7352df5511db562de819518 0 file.sh 0 0 1 file.sh 0 0 2 file.sh 0 0 3 file.sh ``` and you will have atomically moved the "ours" stage of file.sh into stage #0. This is sort of what I'd expect "git checkout --ours file.sh" to do, but it doesn't seem to do that. > > In summary it seems to me that merge or no merge, the safety semantics > for "--3way" + "--cached" as it is are pretty similar to the existing semantics > for those options individually. The user could be preparing to write > a tree out in either the "--cached" or the "--cached --3way" operation > so I don't understand why those must differ in safety. In addition, the > both "--3way" and "--3way --cached" perform mergey operations that > changes the stages of a file in cache, so I don't understand why those > must differ in safety either. > > > > > >> Give an order to the codebase to "be like so". Here is my attempt. > > >> > > >> Teach "git apply" to accept "--cached" and "--3way" at the same > > >> time. Only when all changes to all paths involved in the > > >> application auto-resolve cleanly, the result is placed in the > > >> index at stage #0 and the command exits with 0 status. If there > > >> is any path whose conflict cannot be cleanly auto-resolved, the > > >> original contents from common ancestor (stage #1), our version > > >> (stage #2) and the contents from the patch (stage #3) for the > > >> conflicted paths are left at separate stages without any attempt > > >> to resolve the conflict at the content level, and the command > > >> exists with non-zero status, because there is no place (like the > > >> working tree files) to leave a half-resolved conflicted merge > > >> result to ask the end-user to resolve. > > > > I wrote the above as an example to illustrate the tone and the level > > of details expected in our proposed commit log message. The > > behaviour it describes may not necessarily match what you have > > implemented in the patch. > > > > For example, imagine that we are applying a patch for two paths, > > where one auto-resolves cleanly and the other does not. The above > > description expects both paths will leave the higher stages (instead > > of recording the auto-resolved path at stage #0, and leaving the > > other path that cannot be auto-resolved at higher stages) and the > > command exits with non-zero status, which may not be what you > > implemented. As an illustration, I didn't necessarily mean such an > > all-or-none behaviour wrt resolving should be what we implement---I > > do not want to choose, as this is your itch and I want _you_ with > > the itch to think long and hard before deciding what the best design > > for end-users would be, and present it as a proposed solution. An > > obvious alternative is to record auto-resolved paths at stage #0 and > > leave only the paths for which auto-resolution failed in conflicted > > state. > I missed the "all changes to all paths" requirement in that description, > I'll update it to be more consistent with what it actually does. As you say, > the leaving entries at higher orders behavior only happens for conflicting > paths, not for all paths. > > > > Thanks.