git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/1] git-apply: Allow simultaneous --cached and --3way options
@ 2021-04-03  1:34 Jerry Zhang
  2021-04-03  1:34 ` [PATCH 1/1] " Jerry Zhang
                   ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Jerry Zhang @ 2021-04-03  1:34 UTC (permalink / raw)
  To: git; +Cc: ross, abe, brian.kubisiask, Jerry Zhang

I'm creating a script/tool that will be able to cherry-pick 
multiple commits from a single branch, rebase them onto a
base commit, and push those references to a remote. 

Ex. with a branch like "origin/master -> A -> B -> C"
The tool will create "master -> A", "master -> B",
"master -> C" and either make local branches or
push them to a remote. This can be useful since code
review tools like github use branches as the basis
for pull requests.

A key feature here is that the above happens without
any changes to the user's working directory or cache.
This is important since those operations will add
time and generate build churn. We use these steps
for synthesizing a "cherry-pick" of B to master.

1. cp .git/index index.temp
2. set GIT_INDEX_FILE=index.temp
3. git reset master -- . (git read-tree also works here, but is a bit slower)
4. git format-patch --full-index B~..B
5. git apply --cached B.patch
6. git write-tree
7. git commit-tree {output of 6} -p master -m "message"
8. either `git symbolic-ref` to make a branch or `git push` to remote

I'm looking to improve the git apply step in #5. 
Currently we can't use --cached in combination with
--3way, which limits some of the usefulness of this method.
There are many diffs that will block applying a patch
that a 3 way merge can resolve without conflicts. Even
in the case where there are real conflicts, performing
a 3 way merge will allow us to show the user the lines
where the conflict occurred. 

With the above in mind, I've created a small patch that
implements the behavior I'd like. Rather than disallow
the cached and 3way flags to be combined, we allow them,
but write any conflicts directly to the cached file. Since 
we're unable to modify the working directory, it seems
reasonable in this case to not actually present the user
with any options to resolve conflicts. Instead, a script
or tool using this command can diff the temporary cache
to get the source of the conflict.

Happy to address any feedback. After I address any major
changes I will add new tests for this path.

All tests passed locally.

Jerry Zhang (1):
  git-apply: Allow simultaneous --cached and --3way options

 Documentation/git-apply.txt |  4 +++-
 apply.c                     | 13 +++++++------
 2 files changed, 10 insertions(+), 7 deletions(-)

-- 
2.29.0


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/1] git-apply: Allow simultaneous --cached and --3way options
  2021-04-03  1:34 [PATCH 0/1] git-apply: Allow simultaneous --cached and --3way options Jerry Zhang
@ 2021-04-03  1:34 ` Jerry Zhang
  2021-04-03  3:46   ` Elijah Newren
  2021-04-05 22:19   ` [PATCH V2] " Jerry Zhang
  2021-04-03  3:04 ` [PATCH 0/1] git-apply: Allow " Elijah Newren
  2021-04-03  5:24 ` Bagas Sanjaya
  2 siblings, 2 replies; 29+ messages in thread
From: Jerry Zhang @ 2021-04-03  1:34 UTC (permalink / raw)
  To: git; +Cc: ross, abe, brian.kubisiask, Jerry Zhang, Jerry Zhang

Previously, --cached and --3way were not
allowed to be used together, since --3way
wrote conflict markers into the working tree.

These changes change semantics so that if
these flags are given together and there is
a conflict, the conflict markers are added
directly to cache. If there is no conflict,
the patch is applied directly to cache as
expected.

Signed-off-by: Jerry Zhang <jerry@skydio.com>
Signed-off-by: Jerry Zhang <jerryxzha@googlemail.com>
---
 Documentation/git-apply.txt |  4 +++-
 apply.c                     | 13 +++++++------
 2 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-apply.txt b/Documentation/git-apply.txt
index 91d9a8601c..3dc0085066 100644
--- a/Documentation/git-apply.txt
+++ b/Documentation/git-apply.txt
@@ -89,7 +89,9 @@ OPTIONS
 	and we have those blobs available locally, possibly leaving the
 	conflict markers in the files in the working tree for the user to
 	resolve.  This option implies the `--index` option, and is incompatible
-	with the `--reject` and the `--cached` options.
+	with the `--reject` option. When used with the --cached option, any
+	conflict markers are added directly to the cache rather than the
+	working tree.
 
 --build-fake-ancestor=<file>::
 	Newer 'git diff' output has embedded 'index information'
diff --git a/apply.c b/apply.c
index 6695a931e9..fc94ca0e99 100644
--- a/apply.c
+++ b/apply.c
@@ -133,8 +133,6 @@ int check_apply_state(struct apply_state *state, int force_apply)
 
 	if (state->apply_with_reject && state->threeway)
 		return error(_("--reject and --3way cannot be used together."));
-	if (state->cached && state->threeway)
-		return error(_("--cached and --3way cannot be used together."));
 	if (state->threeway) {
 		if (is_not_gitdir)
 			return error(_("--3way outside a repository"));
@@ -4490,13 +4488,16 @@ static int create_file(struct apply_state *state, struct patch *patch)
 
 	if (!mode)
 		mode = S_IFREG | 0644;
-	if (create_one_file(state, path, mode, buf, size))
-		return -1;
+	if (!state->cached) {
+		if (create_one_file(state, path, mode, buf, size))
+			return -1;
+	}
 
-	if (patch->conflicted_threeway)
+	if (patch->conflicted_threeway && !state->cached)
 		return add_conflicted_stages_file(state, patch);
-	else if (state->update_index)
+	else if (state->update_index) {
 		return add_index_file(state, path, mode, buf, size);
+	}
 	return 0;
 }
 
-- 
2.29.0


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/1] git-apply: Allow simultaneous --cached and --3way options
  2021-04-03  1:34 [PATCH 0/1] git-apply: Allow simultaneous --cached and --3way options Jerry Zhang
  2021-04-03  1:34 ` [PATCH 1/1] " Jerry Zhang
@ 2021-04-03  3:04 ` Elijah Newren
  2021-04-05 22:05   ` Jerry Zhang
  2021-04-03  5:24 ` Bagas Sanjaya
  2 siblings, 1 reply; 29+ messages in thread
From: Elijah Newren @ 2021-04-03  3:04 UTC (permalink / raw)
  To: Jerry Zhang; +Cc: Git Mailing List, ross, abe, brian.kubisiask

On Fri, Apr 2, 2021 at 6:36 PM Jerry Zhang <jerry@skydio.com> wrote:
>
> I'm creating a script/tool that will be able to cherry-pick
> multiple commits from a single branch, rebase them onto a
> base commit, and push those references to a remote.
>
> Ex. with a branch like "origin/master -> A -> B -> C"
> The tool will create "master -> A", "master -> B",
> "master -> C" and either make local branches or
> push them to a remote. This can be useful since code
> review tools like github use branches as the basis
> for pull requests.

Not sure I understand the "master -> A", "master -> B" syntax.  What
do you mean here?

> A key feature here is that the above happens without
> any changes to the user's working directory or cache.
> This is important since those operations will add
> time and generate build churn. We use these steps
> for synthesizing a "cherry-pick" of B to master.
>
> 1. cp .git/index index.temp
> 2. set GIT_INDEX_FILE=index.temp
> 3. git reset master -- . (git read-tree also works here, but is a bit slower)
> 4. git format-patch --full-index B~..B
> 5. git apply --cached B.patch
> 6. git write-tree
> 7. git commit-tree {output of 6} -p master -m "message"
> 8. either `git symbolic-ref` to make a branch or `git push` to remote

Yeah, folks have resorted to various variants of this kind of thing in
the past.  It is a clever way to handle some basic cases, but it does
often fall short.  It's unfortunate that cherry-pick and rebase cannot
yet just provide this functionality (more on that below).

It may also interest you that rebase has two different backends, one
built on am (which in turn is built on format-patch + apply), and one
built on the merge machinery (which the am --3way also uses when it
needs to).  We deprecated the format-patch + apply backend in part
because it sometimes results in misapplied patches; see the "Context"
subsection of the "BEHAVIORAL DIFFERENCES" section of the git-rebase
manpage.  However, the am version would at least handle basic renames,
which I believe might cause problems for a direct format-patch + apply
invocation like yours (I'll also discuss this more below).

> I'm looking to improve the git apply step in #5.
> Currently we can't use --cached in combination with
> --3way, which limits some of the usefulness of this method.
> There are many diffs that will block applying a patch
> that a 3 way merge can resolve without conflicts. Even
> in the case where there are real conflicts, performing
> a 3 way merge will allow us to show the user the lines
> where the conflict occurred.
>
> With the above in mind, I've created a small patch that
> implements the behavior I'd like. Rather than disallow
> the cached and 3way flags to be combined, we allow them,
> but write any conflicts directly to the cached file. Since
> we're unable to modify the working directory, it seems
> reasonable in this case to not actually present the user
> with any options to resolve conflicts. Instead, a script
> or tool using this command can diff the temporary cache
> to get the source of the conflict.

Looks like you're focusing on content conflicts.  What about path
conflicts?  For example, apply's --3way just uses a per-file
ll_merge() call, meaning it won't handle renames, so your method would
also often get spurious modify/delete conflicts when renames occur.
How does your plan to just "cache" conflicts work with these
modify/delete files?  Will users just look for conflict markers and
ignore the fact that both modified newfile and modified oldfile are
present?  I'm also curious how e.g. directory/file conflicts would be
handled by your scheme; these seem somewhat problematic to me from
your description.

> Happy to address any feedback. After I address any major
> changes I will add new tests for this path.

Don't know the timeframe you're looking at, but I'm looking to modify
cherry-pick and rebase to be able to operate on branches that are not
checked out, or in bare repositories.  The blocker to that
traditionally has been that the merge machinery required a working
directory.  The good news is that I wrote a new merge backend that
doesn't require a working directory.  The bad news is I'm still trying
to get that new merge backend through the review process, and no
current release of git has a complete version of that new backend yet.
Further, the changes to cherry-pick and rebase have not yet been
started.  There were some decisions to make too, such as how to handle
the case with conflicts -- just report them and tell the user to retry
with a checkout?  Provide some kind of basic information about the
conflicts?  What'd be useful to you?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] git-apply: Allow simultaneous --cached and --3way options
  2021-04-03  1:34 ` [PATCH 1/1] " Jerry Zhang
@ 2021-04-03  3:46   ` Elijah Newren
  2021-04-03  4:26     ` Junio C Hamano
  2021-04-05 22:08     ` Jerry Zhang
  2021-04-05 22:19   ` [PATCH V2] " Jerry Zhang
  1 sibling, 2 replies; 29+ messages in thread
From: Elijah Newren @ 2021-04-03  3:46 UTC (permalink / raw)
  To: Jerry Zhang; +Cc: Git Mailing List, ross, abe, brian.kubisiask, Jerry Zhang

I'm not that familiar with apply.c, but let me attempt to take a look...

On Fri, Apr 2, 2021 at 6:36 PM Jerry Zhang <jerry@skydio.com> wrote:
>
> Previously, --cached and --3way were not
> allowed to be used together, since --3way
> wrote conflict markers into the working tree.
>
> These changes change semantics so that if
> these flags are given together and there is
> a conflict, the conflict markers are added
> directly to cache. If there is no conflict,
> the patch is applied directly to cache as
> expected.
>
> Signed-off-by: Jerry Zhang <jerry@skydio.com>
> Signed-off-by: Jerry Zhang <jerryxzha@googlemail.com>
> ---
>  Documentation/git-apply.txt |  4 +++-
>  apply.c                     | 13 +++++++------
>  2 files changed, 10 insertions(+), 7 deletions(-)
>
> diff --git a/Documentation/git-apply.txt b/Documentation/git-apply.txt
> index 91d9a8601c..3dc0085066 100644
> --- a/Documentation/git-apply.txt
> +++ b/Documentation/git-apply.txt
> @@ -89,7 +89,9 @@ OPTIONS
>         and we have those blobs available locally, possibly leaving the
>         conflict markers in the files in the working tree for the user to
>         resolve.  This option implies the `--index` option, and is incompatible
> -       with the `--reject` and the `--cached` options.
> +       with the `--reject` option. When used with the --cached option, any
> +       conflict markers are added directly to the cache rather than the
> +       working tree.
>
>  --build-fake-ancestor=<file>::
>         Newer 'git diff' output has embedded 'index information'
> diff --git a/apply.c b/apply.c
> index 6695a931e9..fc94ca0e99 100644
> --- a/apply.c
> +++ b/apply.c
> @@ -133,8 +133,6 @@ int check_apply_state(struct apply_state *state, int force_apply)
>
>         if (state->apply_with_reject && state->threeway)
>                 return error(_("--reject and --3way cannot be used together."));
> -       if (state->cached && state->threeway)
> -               return error(_("--cached and --3way cannot be used together."));
>         if (state->threeway) {
>                 if (is_not_gitdir)
>                         return error(_("--3way outside a repository"));
> @@ -4490,13 +4488,16 @@ static int create_file(struct apply_state *state, struct patch *patch)
>
>         if (!mode)
>                 mode = S_IFREG | 0644;
> -       if (create_one_file(state, path, mode, buf, size))
> -               return -1;
> +       if (!state->cached) {

Why add this check?  create_one_file() already has an early return if
state->cached is true.

> +               if (create_one_file(state, path, mode, buf, size))
> +                       return -1;
> +       }
>
> -       if (patch->conflicted_threeway)
> +       if (patch->conflicted_threeway && !state->cached)
>                 return add_conflicted_stages_file(state, patch);
> -       else if (state->update_index)
> +       else if (state->update_index) {
>                 return add_index_file(state, path, mode, buf, size);

So if something had conflicts, you ignore the various conflicted
modes, and just add it to the index as it stands.  What if it was
deleted upstream and modified locally?  Doesn't that just ignore the
conflict, make it invisible to the user, and add the locally modified
version?  Similarly if it was renamed upstream and modified locally,
doesn't that end up in both files being present?  And if there's a
directory/file conflict, due to the lack of ADD_CACHE_SKIP_DFCHECK (or
whatever it's called), the add is just going to fail, but perhaps
that's the most reasonable case as it'd print an error message and
return -1, I think.

Again, I didn't test any of this out and I'm not so familiar with this
code, so I'm guessing at these scenarios.  If I'm wrong about how this
works, the commit message probably deserves an explanation about why
they work, and we'd definitely need a few testcases for these types of
scenarios.  If I'm right, the current implementation is problematic at
least if not the idea of using these options together.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] git-apply: Allow simultaneous --cached and --3way options
  2021-04-03  3:46   ` Elijah Newren
@ 2021-04-03  4:26     ` Junio C Hamano
  2021-04-04  1:02       ` Junio C Hamano
  2021-04-05 22:08     ` Jerry Zhang
  1 sibling, 1 reply; 29+ messages in thread
From: Junio C Hamano @ 2021-04-03  4:26 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Jerry Zhang, Git Mailing List, ross, abe, brian.kubisiask, Jerry Zhang

Elijah Newren <newren@gmail.com> writes:

> I'm not that familiar with apply.c, but let me attempt to take a look...

I am (well, at least I was the one who invented 3way and added the
"index" line to the diff output format) ;-)

> scenarios.  If I'm right, the current implementation is problematic at
> least if not the idea of using these options together.

Yes, the conflicted case cannot sanely be handled _without_ leaving
higher stage entries in the index, and that is exactly why I made it
incompatible with the "--cached" mode.

It might be OK to only allow the combination when everything auto
resolves cleanly and fail the operation without touching either the
index or the working tree.  Pretending there was no delete/modify
conflicts or adding contents with unresolved conflicts as if nothing
bad happened as stage 0 entries would never be acceptable.

Perhaps

 * Error out if the index does not match HEAD.

 * Try applying to the contents in the index.  If there are any
   structural conflicts, leave these paths at higher stage and do
   not touch their contents.

 * For paths without structural conflict but need content merges,
   attempt ll-merge of the contents.  If autoresolves cleanly,
   register the result at stage 0.  Otherwise, discard the failed
   conflicted merge, and leave stages 1, 2 and 3 as they are.

 * Exit with status 0 if and only if everything has resolved
   cleanly.  Otherwise, exit with non-zero status.

would be the minimally-acceptably-safe behaviour.



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/1] git-apply: Allow simultaneous --cached and --3way options
  2021-04-03  1:34 [PATCH 0/1] git-apply: Allow simultaneous --cached and --3way options Jerry Zhang
  2021-04-03  1:34 ` [PATCH 1/1] " Jerry Zhang
  2021-04-03  3:04 ` [PATCH 0/1] git-apply: Allow " Elijah Newren
@ 2021-04-03  5:24 ` Bagas Sanjaya
       [not found]   ` <CAMKO5CtiW84E4XjnPRf-yOPp+ua_u07LsAu=BB0YhmP3+3kYiw@mail.gmail.com>
  2 siblings, 1 reply; 29+ messages in thread
From: Bagas Sanjaya @ 2021-04-03  5:24 UTC (permalink / raw)
  To: Jerry Zhang; +Cc: ross, abe, brian.kubisiask, git

On 03/04/21 08.34, Jerry Zhang wrote:
> I'm creating a script/tool that will be able to cherry-pick
> multiple commits from a single branch, rebase them onto a
> base commit, and push those references to a remote.
> 
> Ex. with a branch like "origin/master -> A -> B -> C"
> The tool will create "master -> A", "master -> B",
> "master -> C" and either make local branches or
> push them to a remote. This can be useful since code
> review tools like github use branches as the basis
> for pull requests.
> 
> A key feature here is that the above happens without
> any changes to the user's working directory or cache.
> This is important since those operations will add
> time and generate build churn. We use these steps
> for synthesizing a "cherry-pick" of B to master.
> 
> 1. cp .git/index index.temp
> 2. set GIT_INDEX_FILE=index.temp
> 3. git reset master -- . (git read-tree also works here, but is a bit slower)
> 4. git format-patch --full-index B~..B
> 5. git apply --cached B.patch
> 6. git write-tree
> 7. git commit-tree {output of 6} -p master -m "message"
> 8. either `git symbolic-ref` to make a branch or `git push` to remote
> 
> I'm looking to improve the git apply step in #5.
> Currently we can't use --cached in combination with
> --3way, which limits some of the usefulness of this method.
> There are many diffs that will block applying a patch
> that a 3 way merge can resolve without conflicts. Even
> in the case where there are real conflicts, performing
> a 3 way merge will allow us to show the user the lines
> where the conflict occurred.
> 
> With the above in mind, I've created a small patch that
> implements the behavior I'd like. Rather than disallow
> the cached and 3way flags to be combined, we allow them,
> but write any conflicts directly to the cached file. Since
> we're unable to modify the working directory, it seems
> reasonable in this case to not actually present the user
> with any options to resolve conflicts. Instead, a script
> or tool using this command can diff the temporary cache
> to get the source of the conflict.
> 
> Happy to address any feedback. After I address any major
> changes I will add new tests for this path.
> 
> All tests passed locally.
> 
> Jerry Zhang (1):
>    git-apply: Allow simultaneous --cached and --3way options
> 
>   Documentation/git-apply.txt |  4 +++-
>   apply.c                     | 13 +++++++------
>   2 files changed, 10 insertions(+), 7 deletions(-)
> 
I know this patch series only have one patch, so why add the
cover letter (PATCH 0/1)? Customarily, single-patch series just
begin with `[PATCH] some title`, then patch message, and actual
diff; all without cover letter.

-- 
An old man doll... just what I always wanted! - Clara

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/1] git-apply: Allow simultaneous --cached and --3way options
       [not found]   ` <CAMKO5CtiW84E4XjnPRf-yOPp+ua_u07LsAu=BB0YhmP3+3kYiw@mail.gmail.com>
@ 2021-04-03  8:05     ` Bagas Sanjaya
  0 siblings, 0 replies; 29+ messages in thread
From: Bagas Sanjaya @ 2021-04-03  8:05 UTC (permalink / raw)
  To: Jerry Zhang; +Cc: Ross Yeager, Abraham Bachrach, brian.kubisiask, git

On 03/04/21 13.57, Jerry Zhang wrote:
> I wanted to provide extra context to explain the background of what I'm
> trying to accomplish,
> but didn't want it all merged as commit text in the tree ;-)
> 
But you can add these contexts to your patch, just put them between
--- and `diff` line. (after patch message and before actual diff).

-- 
An old man doll... just what I always wanted! - Clara

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] git-apply: Allow simultaneous --cached and --3way options
  2021-04-03  4:26     ` Junio C Hamano
@ 2021-04-04  1:02       ` Junio C Hamano
  2021-04-05 22:12         ` Jerry Zhang
  0 siblings, 1 reply; 29+ messages in thread
From: Junio C Hamano @ 2021-04-04  1:02 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Jerry Zhang, Git Mailing List, ross, abe, brian.kubisiask, Jerry Zhang

Junio C Hamano <gitster@pobox.com> writes:

> It might be OK to only allow the combination when everything auto
> resolves cleanly and fail the operation without touching either the
> index or the working tree.  Pretending there was no delete/modify
> conflicts or adding contents with unresolved conflicts as if nothing
> bad happened as stage 0 entries would never be acceptable.
>
> Perhaps
>
>  * Error out if the index does not match HEAD.
>
>  * Try applying to the contents in the index.  If there are any
>    structural conflicts, leave these paths at higher stage and do
>    not touch their contents.
>
>  * For paths without structural conflict but need content merges,
>    attempt ll-merge of the contents.  If autoresolves cleanly,
>    register the result at stage 0.  Otherwise, discard the failed
>    conflicted merge, and leave stages 1, 2 and 3 as they are.
>
>  * Exit with status 0 if and only if everything has resolved
>    cleanly.  Otherwise, exit with non-zero status.
>
> would be the minimally-acceptably-safe behaviour.

Note that, while a lot unsatisfactory than the above, the following
would also be acceptable.

  * Error out if the index does not match HEAD.

  * Try applying to the contents in the index.  If there are any
    structural conflicts, abort without touching the index (or the
    working tree --- but that is best left unsaid as we all know we
    are talking about '--cached').

  * For paths without structural conflict but need content merges,
    attempt ll-merge of the contents.  If ALL SUCh PATHS autoresolve
    cleanly, register their result at stage 0.  Otherwise, abort
    without touching the index (or the working tree).

  * Exit with status 0 if and only if everything has resolved
    cleanly.  Otherwise, exit with non-zero status (and never touch
    the index or the working tree).

The version I earlier gave would give a good starting point to
manually resolve the conflicts in the index and when resolved fully,
it is safely recorded as the result of applying the patch on top of
HEAD, because the non-final results are all in higher stages, and
all the paths at stage 0 are either from the HEAD and unaffected by
the merge, or the ones that cleanly resolved.  The "the index must
match HEAD" upfront is to ensure that.  Otherwise it would make it
very tempting, after spending all that time to resolve the conflicts
only in the higher stages of the index, to commit the index as-is to
make a child commit of HEAD and record that it is the result of
applying the patch.  But if the starting condition had a change
unrelated to the change the patch brings in already in the index,
the resulting commit would be _more_ than what the patch did to the
codebase.

The simplified version would let the user proceed only when the
conflicts can mechanically resolved, but it still has the "make sure
what is recorded is only from the incoming patch" safety.

Of course, if the user is trying to cherry-pick parts of multiple
patches and combine them to create a new single commit, the second
and subsequent applycation of the patches would be thwarted by the
"the index must match HEAD" rule, but it is far safer to make each
step into its own snapshot commit during such a workflow to combine
multiple patch pieces and then squash them together after finishing,
than carrying an intermediate result only in the index and risk
losing work you did in the previous step(s) to incorrect resolution
in later step(s).

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/1] git-apply: Allow simultaneous --cached and --3way options
  2021-04-03  3:04 ` [PATCH 0/1] git-apply: Allow " Elijah Newren
@ 2021-04-05 22:05   ` Jerry Zhang
  0 siblings, 0 replies; 29+ messages in thread
From: Jerry Zhang @ 2021-04-05 22:05 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Git Mailing List, Ross Yeager, Abraham Bachrach, brian.kubisiask

On Fri, Apr 2, 2021 at 8:04 PM Elijah Newren <newren@gmail.com> wrote:
>
> On Fri, Apr 2, 2021 at 6:36 PM Jerry Zhang <jerry@skydio.com> wrote:
> >
> > I'm creating a script/tool that will be able to cherry-pick
> > multiple commits from a single branch, rebase them onto a
> > base commit, and push those references to a remote.
> >
> > Ex. with a branch like "origin/master -> A -> B -> C"
> > The tool will create "master -> A", "master -> B",
> > "master -> C" and either make local branches or
> > push them to a remote. This can be useful since code
> > review tools like github use branches as the basis
> > for pull requests.
>
> Not sure I understand the "master -> A", "master -> B" syntax.  What
> do you mean here?
Ah yeah my syntax wasn't super clear here.
I mean a branch "dev" pointing to commit "C", which is on top of "B",
which is on top of "A", which is on top of "master".
My tool would fake "cherry-pick" each of A, B, and C on top of master.
>
> > A key feature here is that the above happens without
> > any changes to the user's working directory or cache.
> > This is important since those operations will add
> > time and generate build churn. We use these steps
> > for synthesizing a "cherry-pick" of B to master.
> >
> > 1. cp .git/index index.temp
> > 2. set GIT_INDEX_FILE=index.temp
> > 3. git reset master -- . (git read-tree also works here, but is a bit slower)
> > 4. git format-patch --full-index B~..B
> > 5. git apply --cached B.patch
> > 6. git write-tree
> > 7. git commit-tree {output of 6} -p master -m "message"
> > 8. either `git symbolic-ref` to make a branch or `git push` to remote
>
> Yeah, folks have resorted to various variants of this kind of thing in
> the past.  It is a clever way to handle some basic cases, but it does
> often fall short.  It's unfortunate that cherry-pick and rebase cannot
> yet just provide this functionality (more on that below).
>
> It may also interest you that rebase has two different backends, one
> built on am (which in turn is built on format-patch + apply), and one
> built on the merge machinery (which the am --3way also uses when it
> needs to).  We deprecated the format-patch + apply backend in part
> because it sometimes results in misapplied patches; see the "Context"
> subsection of the "BEHAVIORAL DIFFERENCES" section of the git-rebase
> manpage.  However, the am version would at least handle basic renames,
> which I believe might cause problems for a direct format-patch + apply
> invocation like yours (I'll also discuss this more below).
Thanks -- I was able to repro a case where am machinery applied a patch
incorrectly but 3way applied it correctly. This actually brings up
another point,
because am doesn't report errors when applying a patch incorrectly in this
case, we don't end up falling back to 3way. There also is no user flag
to force 3way, so the user can't do anything to ensure the correct
application here. Maybe it would be better for --3way to directly invoke
the 3way merge rather than causing it to fallback? (Junio might also
have some input here).
>
> > I'm looking to improve the git apply step in #5.
> > Currently we can't use --cached in combination with
> > --3way, which limits some of the usefulness of this method.
> > There are many diffs that will block applying a patch
> > that a 3 way merge can resolve without conflicts. Even
> > in the case where there are real conflicts, performing
> > a 3 way merge will allow us to show the user the lines
> > where the conflict occurred.
> >
> > With the above in mind, I've created a small patch that
> > implements the behavior I'd like. Rather than disallow
> > the cached and 3way flags to be combined, we allow them,
> > but write any conflicts directly to the cached file. Since
> > we're unable to modify the working directory, it seems
> > reasonable in this case to not actually present the user
> > with any options to resolve conflicts. Instead, a script
> > or tool using this command can diff the temporary cache
> > to get the source of the conflict.
>
> Looks like you're focusing on content conflicts.  What about path
> conflicts?  For example, apply's --3way just uses a per-file
> ll_merge() call, meaning it won't handle renames, so your method would
> also often get spurious modify/delete conflicts when renames occur.
> How does your plan to just "cache" conflicts work with these
> modify/delete files?  Will users just look for conflict markers and
> ignore the fact that both modified newfile and modified oldfile are
> present?  I'm also curious how e.g. directory/file conflicts would be
> handled by your scheme; these seem somewhat problematic to me from
> your description.
>
> > Happy to address any feedback. After I address any major
> > changes I will add new tests for this path.
>
> Don't know the timeframe you're looking at, but I'm looking to modify
> cherry-pick and rebase to be able to operate on branches that are not
> checked out, or in bare repositories.  The blocker to that
That functionality would be great. I initially did look at what it would
take to modify sequencer to get what I wanted, but I quickly realized
it would be a big refactor.
> traditionally has been that the merge machinery required a working
> directory.  The good news is that I wrote a new merge backend that
> doesn't require a working directory.  The bad news is I'm still trying
> to get that new merge backend through the review process, and no
> current release of git has a complete version of that new backend yet.
> Further, the changes to cherry-pick and rebase have not yet been
> started.  There were some decisions to make too, such as how to handle
> the case with conflicts -- just report them and tell the user to retry
> with a checkout?  Provide some kind of basic information about the
> conflicts?  What'd be useful to you?
After thinking some more I'd generally agree with comments to leave
the conflicts at higher stages rather than check in the conflict markers.
This should result in less issues with path conflicts as well (or at least
be similar to 3way by itself). This is probably ok, because the conflict
markers can always be generated from the higher stage files (git diff or
git checkout -m -- .), but the reverse isn't true.
Overall 90% of the functionality comes from being able to do the 3-way
at all, since it's able to handle more cases correctly. Having *any* output
to tell the user why their operation failed would just be a bonus.

I'd envision flags similar to these, for cherry-pick

--cached : Do not touch the working tree to apply conflict markers.
Instead conflicts are left at a higher order in the cache.
--cached-parent : Checkout the index to the given commit, then
apply the cherry-pick with the given commit as a parent. Print out
the new commit. Warning: index will be left unsynchronized with
HEAD after this operation. Intended to be used with a temporary index
rather than the main one.

In the end I'm not sure how to still accomplish the desired functionality
without using a temporary index -- this would always result in
desyncing the user's index / working dir afterwards. Maybe error or
warn if the user isn't using a temporary index?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] git-apply: Allow simultaneous --cached and --3way options
  2021-04-03  3:46   ` Elijah Newren
  2021-04-03  4:26     ` Junio C Hamano
@ 2021-04-05 22:08     ` Jerry Zhang
  1 sibling, 0 replies; 29+ messages in thread
From: Jerry Zhang @ 2021-04-05 22:08 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Git Mailing List, Ross Yeager, Abraham Bachrach, Jerry Zhang,
	Brian Kubisiak

On Fri, Apr 2, 2021 at 8:46 PM Elijah Newren <newren@gmail.com> wrote:
>
> I'm not that familiar with apply.c, but let me attempt to take a look...
>
> On Fri, Apr 2, 2021 at 6:36 PM Jerry Zhang <jerry@skydio.com> wrote:
> >
> > Previously, --cached and --3way were not
> > allowed to be used together, since --3way
> > wrote conflict markers into the working tree.
> >
> > These changes change semantics so that if
> > these flags are given together and there is
> > a conflict, the conflict markers are added
> > directly to cache. If there is no conflict,
> > the patch is applied directly to cache as
> > expected.
> >
> > Signed-off-by: Jerry Zhang <jerry@skydio.com>
> > Signed-off-by: Jerry Zhang <jerryxzha@googlemail.com>
> > ---
> >  Documentation/git-apply.txt |  4 +++-
> >  apply.c                     | 13 +++++++------
> >  2 files changed, 10 insertions(+), 7 deletions(-)
> >
> > diff --git a/Documentation/git-apply.txt b/Documentation/git-apply.txt
> > index 91d9a8601c..3dc0085066 100644
> > --- a/Documentation/git-apply.txt
> > +++ b/Documentation/git-apply.txt
> > @@ -89,7 +89,9 @@ OPTIONS
> >         and we have those blobs available locally, possibly leaving the
> >         conflict markers in the files in the working tree for the user to
> >         resolve.  This option implies the `--index` option, and is incompatible
> > -       with the `--reject` and the `--cached` options.
> > +       with the `--reject` option. When used with the --cached option, any
> > +       conflict markers are added directly to the cache rather than the
> > +       working tree.
> >
> >  --build-fake-ancestor=<file>::
> >         Newer 'git diff' output has embedded 'index information'
> > diff --git a/apply.c b/apply.c
> > index 6695a931e9..fc94ca0e99 100644
> > --- a/apply.c
> > +++ b/apply.c
> > @@ -133,8 +133,6 @@ int check_apply_state(struct apply_state *state, int force_apply)
> >
> >         if (state->apply_with_reject && state->threeway)
> >                 return error(_("--reject and --3way cannot be used together."));
> > -       if (state->cached && state->threeway)
> > -               return error(_("--cached and --3way cannot be used together."));
> >         if (state->threeway) {
> >                 if (is_not_gitdir)
> >                         return error(_("--3way outside a repository"));
> > @@ -4490,13 +4488,16 @@ static int create_file(struct apply_state *state, struct patch *patch)
> >
> >         if (!mode)
> >                 mode = S_IFREG | 0644;
> > -       if (create_one_file(state, path, mode, buf, size))
> > -               return -1;
> > +       if (!state->cached) {
>
> Why add this check?  create_one_file() already has an early return if
> state->cached is true.
>
removed in latest patch
> > +               if (create_one_file(state, path, mode, buf, size))
> > +                       return -1;
> > +       }
> >
> > -       if (patch->conflicted_threeway)
> > +       if (patch->conflicted_threeway && !state->cached)
> >                 return add_conflicted_stages_file(state, patch);
> > -       else if (state->update_index)
> > +       else if (state->update_index) {
> >                 return add_index_file(state, path, mode, buf, size);
>
> So if something had conflicts, you ignore the various conflicted
> modes, and just add it to the index as it stands.  What if it was
> deleted upstream and modified locally?  Doesn't that just ignore the
> conflict, make it invisible to the user, and add the locally modified
> version?  Similarly if it was renamed upstream and modified locally,
> doesn't that end up in both files being present?  And if there's a
> directory/file conflict, due to the lack of ADD_CACHE_SKIP_DFCHECK (or
> whatever it's called), the add is just going to fail, but perhaps
> that's the most reasonable case as it'd print an error message and
> return -1, I think.
>
> Again, I didn't test any of this out and I'm not so familiar with this
> code, so I'm guessing at these scenarios.  If I'm wrong about how this
> works, the commit message probably deserves an explanation about why
> they work, and we'd definitely need a few testcases for these types of
> scenarios.  If I'm right, the current implementation is problematic at
> least if not the idea of using these options together.
Echoing my other reply, I think I will give up on the conflict markers and
just leave the files at the higher stages. I think that will address any
safety concerns since that's what --3way does without the cached
flag

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] git-apply: Allow simultaneous --cached and --3way options
  2021-04-04  1:02       ` Junio C Hamano
@ 2021-04-05 22:12         ` Jerry Zhang
  2021-04-05 22:23           ` Junio C Hamano
  0 siblings, 1 reply; 29+ messages in thread
From: Jerry Zhang @ 2021-04-05 22:12 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Elijah Newren, Git Mailing List, Ross Yeager, Abraham Bachrach,
	Jerry Zhang, Brian Kubisiak

On Sat, Apr 3, 2021 at 6:02 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Junio C Hamano <gitster@pobox.com> writes:
>
> > It might be OK to only allow the combination when everything auto
> > resolves cleanly and fail the operation without touching either the
> > index or the working tree.  Pretending there was no delete/modify
> > conflicts or adding contents with unresolved conflicts as if nothing
> > bad happened as stage 0 entries would never be acceptable.
Yep I've changed it to just leave the higher stages in the index
> >
> > Perhaps
> >
> >  * Error out if the index does not match HEAD.
> >
> >  * Try applying to the contents in the index.  If there are any
> >    structural conflicts, leave these paths at higher stage and do
> >    not touch their contents.
> >
> >  * For paths without structural conflict but need content merges,
> >    attempt ll-merge of the contents.  If autoresolves cleanly,
> >    register the result at stage 0.  Otherwise, discard the failed
> >    conflicted merge, and leave stages 1, 2 and 3 as they are.
> >
> >  * Exit with status 0 if and only if everything has resolved
> >    cleanly.  Otherwise, exit with non-zero status.
> >
> > would be the minimally-acceptably-safe behaviour.
>
> Note that, while a lot unsatisfactory than the above, the following
> would also be acceptable.
>
>   * Error out if the index does not match HEAD.
>
>   * Try applying to the contents in the index.  If there are any
>     structural conflicts, abort without touching the index (or the
>     working tree --- but that is best left unsaid as we all know we
>     are talking about '--cached').
>
>   * For paths without structural conflict but need content merges,
>     attempt ll-merge of the contents.  If ALL SUCh PATHS autoresolve
>     cleanly, register their result at stage 0.  Otherwise, abort
>     without touching the index (or the working tree).
>
>   * Exit with status 0 if and only if everything has resolved
>     cleanly.  Otherwise, exit with non-zero status (and never touch
>     the index or the working tree).
>
> The version I earlier gave would give a good starting point to
> manually resolve the conflicts in the index and when resolved fully,
> it is safely recorded as the result of applying the patch on top of
> HEAD, because the non-final results are all in higher stages, and
> all the paths at stage 0 are either from the HEAD and unaffected by
> the merge, or the ones that cleanly resolved.  The "the index must
> match HEAD" upfront is to ensure that.  Otherwise it would make it
> very tempting, after spending all that time to resolve the conflicts
> only in the higher stages of the index, to commit the index as-is to
> make a child commit of HEAD and record that it is the result of
> applying the patch.  But if the starting condition had a change
> unrelated to the change the patch brings in already in the index,
> the resulting commit would be _more_ than what the patch did to the
> codebase.
I can see what you mean about the user safety issue. However,
my specific use case (see cover letter) involves an index that does not
match HEAD, and wouldn't be possible at all if we forced the index to
match HEAD. Furthermore git-apply --cached even without --3way
doesn't force the index to match HEAD either, so why force it now?

>
> The simplified version would let the user proceed only when the
> conflicts can mechanically resolved, but it still has the "make sure
> what is recorded is only from the incoming patch" safety.
>
> Of course, if the user is trying to cherry-pick parts of multiple
> patches and combine them to create a new single commit, the second
> and subsequent applycation of the patches would be thwarted by the
> "the index must match HEAD" rule, but it is far safer to make each
> step into its own snapshot commit during such a workflow to combine
> multiple patch pieces and then squash them together after finishing,
> than carrying an intermediate result only in the index and risk
> losing work you did in the previous step(s) to incorrect resolution
> in later step(s).

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH V2] git-apply: Allow simultaneous --cached and --3way options
  2021-04-03  1:34 ` [PATCH 1/1] " Jerry Zhang
  2021-04-03  3:46   ` Elijah Newren
@ 2021-04-05 22:19   ` Jerry Zhang
  2021-04-05 22:46     ` Junio C Hamano
  2021-04-06  2:49     ` [PATCH v3] git-apply: allow " Jerry Zhang
  1 sibling, 2 replies; 29+ messages in thread
From: Jerry Zhang @ 2021-04-05 22:19 UTC (permalink / raw)
  To: git, newren, gitster; +Cc: ross, abe, brian.kubisiak, Jerry Zhang, Jerry Zhang

Previously, --cached and --3way were not
allowed to be used together, since --3way
wrote conflict markers into the working tree.

These changes change semantics so that if
these flags are given together and there is
a conflict, the conflicting objects are left
at a higher order in the cache, and the command
will return non-zero. If there is no conflict,
the patch is applied directly to cache as
expected and the command will return 0.

The user can use `git diff` to view the contents
of the conflict, or `git checkout -m -- .` to
regenerate the conflict markers in the working
directory.

With the combined --3way and --cached flags,
The conflict markers won't be written to the
working directory, so there is no point in
attempting rerere.

Signed-off-by: Jerry Zhang <jerry@skydio.com>
Signed-off-by: Jerry Zhang <jerryxzha@googlemail.com>
---
 Documentation/git-apply.txt | 3 ++-
 apply.c                     | 5 ++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-apply.txt b/Documentation/git-apply.txt
index 91d9a8601c..392882d9a5 100644
--- a/Documentation/git-apply.txt
+++ b/Documentation/git-apply.txt
@@ -89,7 +89,8 @@ OPTIONS
 	and we have those blobs available locally, possibly leaving the
 	conflict markers in the files in the working tree for the user to
 	resolve.  This option implies the `--index` option, and is incompatible
-	with the `--reject` and the `--cached` options.
+	with the `--reject` option. When used with the --cached option, any conflicts
+    are left at higher stages in the cache.
 
 --build-fake-ancestor=<file>::
 	Newer 'git diff' output has embedded 'index information'
diff --git a/apply.c b/apply.c
index 6695a931e9..e59c77a1b7 100644
--- a/apply.c
+++ b/apply.c
@@ -133,8 +133,6 @@ int check_apply_state(struct apply_state *state, int force_apply)
 
 	if (state->apply_with_reject && state->threeway)
 		return error(_("--reject and --3way cannot be used together."));
-	if (state->cached && state->threeway)
-		return error(_("--cached and --3way cannot be used together."));
 	if (state->threeway) {
 		if (is_not_gitdir)
 			return error(_("--3way outside a repository"));
@@ -4646,7 +4644,8 @@ static int write_out_results(struct apply_state *state, struct patch *list)
 		}
 		string_list_clear(&cpath, 0);
 
-		repo_rerere(state->repo, 0);
+		if (!state->cached)
+			repo_rerere(state->repo, 0);
 	}
 
 	return errs;
-- 
2.29.0


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] git-apply: Allow simultaneous --cached and --3way options
  2021-04-05 22:12         ` Jerry Zhang
@ 2021-04-05 22:23           ` Junio C Hamano
  2021-04-05 23:29             ` Jerry Zhang
  0 siblings, 1 reply; 29+ messages in thread
From: Junio C Hamano @ 2021-04-05 22:23 UTC (permalink / raw)
  To: Jerry Zhang
  Cc: Elijah Newren, Git Mailing List, Ross Yeager, Abraham Bachrach,
	Jerry Zhang, Brian Kubisiak

Jerry Zhang <jerry@skydio.com> writes:

> I can see what you mean about the user safety issue. However,
> my specific use case (see cover letter) involves an index that does not
> match HEAD, and wouldn't be possible at all if we forced the index to
> match HEAD. Furthermore git-apply --cached even without --3way
> doesn't force the index to match HEAD either, so why force it now?

Primarily because we tend to be extra careful before mergy operation
than any other operation.  Especially without --3way, apply (with or
without --cached/--index) is extra careful to make itself all-or-none
operation to be safe, so that there is no mixed mess that requires
manual intervention (which would further increase the risk of mistakes).

It is OK to introduce a new option to allow a dirty index, and your
tool can pass that option when it calls "apply --cached --3way", but
it would be safe to require a clean index (it does not matter how
dirty the working tree is ;-) by default.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH V2] git-apply: Allow simultaneous --cached and --3way options
  2021-04-05 22:19   ` [PATCH V2] " Jerry Zhang
@ 2021-04-05 22:46     ` Junio C Hamano
  2021-04-06  2:52       ` Jerry Zhang
  2021-04-06  2:49     ` [PATCH v3] git-apply: allow " Jerry Zhang
  1 sibling, 1 reply; 29+ messages in thread
From: Junio C Hamano @ 2021-04-05 22:46 UTC (permalink / raw)
  To: Jerry Zhang; +Cc: git, newren, ross, abe, brian.kubisiak, Jerry Zhang

Jerry Zhang <jerry@skydio.com> writes:

> Subject: Re: [PATCH V2] git-apply: Allow simultaneous --cached and --3way options

s/Allow/allow/ (cf. "git shortlog --no-merged" output for recent examples)

> Previously, --cached and --3way were not
> allowed to be used together, since --3way
> wrote conflict markers into the working tree.

Hint that you are talking about the "git apply" command by
mentioning the name somewhere.

Drop "previously"; we talk about the status quo in the present tense
in our proposed commit log messages to set the stage, and then describe
what the patch author percieves as a problem, before describing the
proposed solution to the problem.

cf. Documentation/SubmittingPatches[[describe-changes]] (the whole section)

> These changes change semantics so that if
> these flags are given together and there is
> a conflict, the conflicting objects are left
> at a higher order in the cache, and the command
> will return non-zero. If there is no conflict,
> the patch is applied directly to cache as
> expected and the command will return 0.

Give an order to the codebase to "be like so".  Here is my attempt.

    Teach "git apply" to accept "--cached" and "--3way" at the same
    time.  Only when all changes to all paths involved in the
    application auto-resolve cleanly, the result is placed in the
    index at stage #0 and the command exits with 0 status.  If there
    is any path whose conflict cannot be cleanly auto-resolved, the
    original contents from common ancestor (stage #1), our version
    (stage #2) and the contents from the patch (stage #3) for the
    conflicted paths are left at separate stages without any attempt
    to resolve the conflict at the content level, and the command
    exists with non-zero status, because there is no place (like the
    working tree files) to leave a half-resolved conflicted merge
    result to ask the end-user to resolve.

> The user can use `git diff` to view the contents
> of the conflict, or `git checkout -m -- .` to
> regenerate the conflict markers in the working
> directory.

Nice.

> With the combined --3way and --cached flags,
> The conflict markers won't be written to the
> working directory, so there is no point in
> attempting rerere.

I am not sure what this paragraph is trying to convey here.

I agree that when a *new* conflict is encountered in this new mode,
writing out a rerere pre-image, in preparation for accepting the
post-image the end-user gives us after the conflicts are resolved,
does not make sense, because we are not giving the end-user the
conflicted state and asking to help resolve it for us.

But if a rerere database entry records a previous merge result in
which conflicts were resolved by the end user, it would make sense
to try reusing the resolution, I would think.  I offhand do not know
how involved it would be to do so, so punting on that is fine, but
that is "there is no point", but it is "we are not trying".

Perhaps

    When there are conflicts, theoretically, it would be nice to be
    able to replay an existing entry in the rerere database that
    records already resolved conflict that match the current one,
    but that would be too much work, so let's not try it for now.

would be a good explanation why we are not doing (i.e. we made a
trade-off) and recording that is important, as it will allow others
in the future to try building on the change we are proposing here
(it is not like we decided that it is fundamentally wrong to try to
use rerere in this situation).

> Signed-off-by: Jerry Zhang <jerry@skydio.com>
> Signed-off-by: Jerry Zhang <jerryxzha@googlemail.com>

Unless we are interacting with two people with the same name, please
sign-off with the same name/address as the name/address that will be
recorded as the author of this change.  I am guessing that dropping
the latter should be sufficient?

Thanks.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] git-apply: Allow simultaneous --cached and --3way options
  2021-04-05 22:23           ` Junio C Hamano
@ 2021-04-05 23:29             ` Jerry Zhang
  2021-04-06  0:10               ` Junio C Hamano
  0 siblings, 1 reply; 29+ messages in thread
From: Jerry Zhang @ 2021-04-05 23:29 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Elijah Newren, Git Mailing List, Ross Yeager, Abraham Bachrach,
	Jerry Zhang, Brian Kubisiak

On Mon, Apr 5, 2021 at 3:23 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Jerry Zhang <jerry@skydio.com> writes:
>
> > I can see what you mean about the user safety issue. However,
> > my specific use case (see cover letter) involves an index that does not
> > match HEAD, and wouldn't be possible at all if we forced the index to
> > match HEAD. Furthermore git-apply --cached even without --3way
> > doesn't force the index to match HEAD either, so why force it now?
>
> Primarily because we tend to be extra careful before mergy operation
> than any other operation.  Especially without --3way, apply (with or
> without --cached/--index) is extra careful to make itself all-or-none
> operation to be safe, so that there is no mixed mess that requires
> manual intervention (which would further increase the risk of mistakes).
>
> It is OK to introduce a new option to allow a dirty index, and your
> tool can pass that option when it calls "apply --cached --3way", but
> it would be safe to require a clean index (it does not matter how
> dirty the working tree is ;-) by default.
>
Sure adding the staged files will definitely clobber whatever the user
had in the cache at stage 0. This will probably be unexpected. But
the normal invocation of --3way also does this without warning, since
it touches the cache as well. It just seems odd to me to be adding a safety
check on some paths that aren't there on other very similar ones. Maybe
another option would be to add a very stern warning for users of --3way?

Unrelatedly would you have context on why --3way falls back on 3way
rather than trying 3way first then falling back on apply_fragments if
blobs don't exist? I see some cases where the normal patch application
will succeed but apply the patch incorrectly, while 3way will apply the
patch correctly. In these cases it's impossible for the user to force 3way.
Are there downsides to 3way that aren't solved by falling back on
apply_fragments?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] git-apply: Allow simultaneous --cached and --3way options
  2021-04-05 23:29             ` Jerry Zhang
@ 2021-04-06  0:10               ` Junio C Hamano
  0 siblings, 0 replies; 29+ messages in thread
From: Junio C Hamano @ 2021-04-06  0:10 UTC (permalink / raw)
  To: Jerry Zhang
  Cc: Elijah Newren, Git Mailing List, Ross Yeager, Abraham Bachrach,
	Jerry Zhang, Brian Kubisiak

Jerry Zhang <jerry@skydio.com> writes:

> Unrelatedly would you have context on why --3way falls back on 3way
> rather than trying 3way first then falling back on apply_fragments if
> blobs don't exist?

Historical accident, following the order in which these features
were invented, plus applying a single patch straight has been faster
than doing a 3-way.

I tend to agree with what you are hinting at, though, as I do not
offhand think of a situation in which a successful 3way merge would
be less correct than a straight patch application, and if the user
explicitly has told us to do "--3way", that is a sign that it is
acceptable to try 3way first even if it costs more cycles.

Those who has been giving "--3way" from inertia would notice if the
performance difference is large enough and may complain, though.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3] git-apply: allow simultaneous --cached and --3way options
  2021-04-05 22:19   ` [PATCH V2] " Jerry Zhang
  2021-04-05 22:46     ` Junio C Hamano
@ 2021-04-06  2:49     ` Jerry Zhang
  2021-04-07 18:03       ` [PATCH v4] " Jerry Zhang
  1 sibling, 1 reply; 29+ messages in thread
From: Jerry Zhang @ 2021-04-06  2:49 UTC (permalink / raw)
  To: git, newren, gitster; +Cc: ross, abe, brian.kubisiak, Jerry Zhang

"git apply" does not allow "--cached" and
"--3way" to be used together, since "--3way"
writes conflict markers into the working tree.

Allow "git apply" to accept "--cached" and
"--3way" at the same time.  When all changes
auto-resolve cleanly, the result is placed in the
index at stage #0 and the command exits with 0
status.  If there is any path whose conflict
cannot be cleanly auto-resolved, the original
contents from common ancestor (stage #1), our
version (stage #2) and the contents from the
patch (stage #3) are left at separate stages.
No attempt is made to resolve the conflict at
the content level, and the command exists with
 non-zero status, because there is no place
(like the working tree) to leave a half-resolved
 merge for the user to resolve.

The user can use `git diff` to view the contents
of the conflict, or `git checkout -m -- .` to
regenerate the conflict markers in the working
directory.

Since rerere depends on conflict markers written
to file for its database storage and lookup, don't
attempt it in this case. This could be fixable
if the in memory conflict markers from the ll_merge
result could be passed to the rerere api.

Signed-off-by: Jerry Zhang <jerry@skydio.com>
---
 Documentation/git-apply.txt |  6 ++++--
 apply.c                     |  7 +++----
 t/t4108-apply-threeway.sh   | 24 ++++++++++++++++++++++++
 3 files changed, 31 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-apply.txt b/Documentation/git-apply.txt
index 91d9a8601c8c316d4649c405af42e531c39991a8..9c48863c47287208850e8376f43453ecec595444 100644
--- a/Documentation/git-apply.txt
+++ b/Documentation/git-apply.txt
@@ -88,8 +88,10 @@ OPTIONS
 	the patch records the identity of blobs it is supposed to apply to,
 	and we have those blobs available locally, possibly leaving the
 	conflict markers in the files in the working tree for the user to
-	resolve.  This option implies the `--index` option, and is incompatible
-	with the `--reject` and the `--cached` options.
+	resolve.  This option implies the `--index` option unless the
+	`--cached` option is used, and is incompatible with the `--reject` option.
+	When used with the `--cached` option, any conflicts are left at higher stages
+	in the cache.
 
 --build-fake-ancestor=<file>::
 	Newer 'git diff' output has embedded 'index information'
diff --git a/apply.c b/apply.c
index 6695a931e979a968b28af88d425d0c76ba17d0d4..02d13ea6db7f9a4066dec3d33d7ddbe11b616f33 100644
--- a/apply.c
+++ b/apply.c
@@ -133,8 +133,6 @@ int check_apply_state(struct apply_state *state, int force_apply)
 
 	if (state->apply_with_reject && state->threeway)
 		return error(_("--reject and --3way cannot be used together."));
-	if (state->cached && state->threeway)
-		return error(_("--cached and --3way cannot be used together."));
 	if (state->threeway) {
 		if (is_not_gitdir)
 			return error(_("--3way outside a repository"));
@@ -4645,8 +4643,9 @@ static int write_out_results(struct apply_state *state, struct patch *list)
 				fprintf(stderr, "U %s\n", item->string);
 		}
 		string_list_clear(&cpath, 0);
-
-		repo_rerere(state->repo, 0);
+		/* rerere relies on conflict markers which aren't written with --cached */
+		if (!state->cached)
+			repo_rerere(state->repo, 0);
 	}
 
 	return errs;
diff --git a/t/t4108-apply-threeway.sh b/t/t4108-apply-threeway.sh
index d62db3fbe16f35a625a4a14eebb70034f695d3eb..75eb34b13d0024046fd2a510c00c5af1f7bfc52d 100755
--- a/t/t4108-apply-threeway.sh
+++ b/t/t4108-apply-threeway.sh
@@ -160,4 +160,28 @@ test_expect_success 'apply -3 with add/add conflict (dirty working tree)' '
 	test_cmp three.save three
 '
 
+test_expect_success 'apply with --3way --cached' '
+	# Merging side should be similar to applying this patch
+	git diff ...side >P.diff &&
+
+	# The corresponding conflicted merge
+	git reset --hard &&
+	git checkout main^0 &&
+	test_must_fail git merge --no-commit side &&
+	git ls-files -s >expect.ls &&
+
+	# should fail to apply
+	git reset --hard &&
+	git checkout main^0 &&
+	test_must_fail git apply --cached --3way P.diff &&
+	git ls-files -s >actual.ls &&
+	print_sanitized_conflicted_diff >actual.diff &&
+
+	# The cache should resemble the corresponding merge
+	test_cmp expect.ls actual.ls &&
+	# However the working directory should not change
+	>expect.diff &&
+	test_cmp expect.diff actual.diff
+'
+
 test_done
-- 
2.29.0


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH V2] git-apply: Allow simultaneous --cached and --3way options
  2021-04-05 22:46     ` Junio C Hamano
@ 2021-04-06  2:52       ` Jerry Zhang
  2021-04-06  5:52         ` Junio C Hamano
  0 siblings, 1 reply; 29+ messages in thread
From: Jerry Zhang @ 2021-04-06  2:52 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Elijah Newren, Ross Yeager, Abraham Bachrach,
	Brian Kubisiak, Jerry Zhang

Thanks for the comments! I've updated v3 with the changes. Let me know
if you have any
more thoughts on whether to block / warn the user before clobbering their cache.

On Mon, Apr 5, 2021 at 3:46 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Jerry Zhang <jerry@skydio.com> writes:
>
> > Subject: Re: [PATCH V2] git-apply: Allow simultaneous --cached and --3way options
>
> s/Allow/allow/ (cf. "git shortlog --no-merged" output for recent examples)
>
> > Previously, --cached and --3way were not
> > allowed to be used together, since --3way
> > wrote conflict markers into the working tree.
>
> Hint that you are talking about the "git apply" command by
> mentioning the name somewhere.
>
> Drop "previously"; we talk about the status quo in the present tense
> in our proposed commit log messages to set the stage, and then describe
> what the patch author percieves as a problem, before describing the
> proposed solution to the problem.
>
> cf. Documentation/SubmittingPatches[[describe-changes]] (the whole section)
>
> > These changes change semantics so that if
> > these flags are given together and there is
> > a conflict, the conflicting objects are left
> > at a higher order in the cache, and the command
> > will return non-zero. If there is no conflict,
> > the patch is applied directly to cache as
> > expected and the command will return 0.
>
> Give an order to the codebase to "be like so".  Here is my attempt.
>
>     Teach "git apply" to accept "--cached" and "--3way" at the same
>     time.  Only when all changes to all paths involved in the
>     application auto-resolve cleanly, the result is placed in the
>     index at stage #0 and the command exits with 0 status.  If there
>     is any path whose conflict cannot be cleanly auto-resolved, the
>     original contents from common ancestor (stage #1), our version
>     (stage #2) and the contents from the patch (stage #3) for the
>     conflicted paths are left at separate stages without any attempt
>     to resolve the conflict at the content level, and the command
>     exists with non-zero status, because there is no place (like the
>     working tree files) to leave a half-resolved conflicted merge
>     result to ask the end-user to resolve.
>
> > The user can use `git diff` to view the contents
> > of the conflict, or `git checkout -m -- .` to
> > regenerate the conflict markers in the working
> > directory.
>
> Nice.
>
> > With the combined --3way and --cached flags,
> > The conflict markers won't be written to the
> > working directory, so there is no point in
> > attempting rerere.
>
> I am not sure what this paragraph is trying to convey here.
>
> I agree that when a *new* conflict is encountered in this new mode,
> writing out a rerere pre-image, in preparation for accepting the
> post-image the end-user gives us after the conflicts are resolved,
> does not make sense, because we are not giving the end-user the
> conflicted state and asking to help resolve it for us.
>
> But if a rerere database entry records a previous merge result in
> which conflicts were resolved by the end user, it would make sense
> to try reusing the resolution, I would think.  I offhand do not know
> how involved it would be to do so, so punting on that is fine, but
> that is "there is no point", but it is "we are not trying".
>
> Perhaps
>
>     When there are conflicts, theoretically, it would be nice to be
>     able to replay an existing entry in the rerere database that
>     records already resolved conflict that match the current one,
>     but that would be too much work, so let's not try it for now.
>
> would be a good explanation why we are not doing (i.e. we made a
> trade-off) and recording that is important, as it will allow others
> in the future to try building on the change we are proposing here
> (it is not like we decided that it is fundamentally wrong to try to
> use rerere in this situation).
>
> > Signed-off-by: Jerry Zhang <jerry@skydio.com>
> > Signed-off-by: Jerry Zhang <jerryxzha@googlemail.com>
>
> Unless we are interacting with two people with the same name, please
> sign-off with the same name/address as the name/address that will be
> recorded as the author of this change.  I am guessing that dropping
> the latter should be sufficient?
>
> Thanks.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH V2] git-apply: Allow simultaneous --cached and --3way options
  2021-04-06  2:52       ` Jerry Zhang
@ 2021-04-06  5:52         ` Junio C Hamano
  2021-04-06 21:56           ` Jerry Zhang
  0 siblings, 1 reply; 29+ messages in thread
From: Junio C Hamano @ 2021-04-06  5:52 UTC (permalink / raw)
  To: Jerry Zhang
  Cc: Git Mailing List, Elijah Newren, Ross Yeager, Abraham Bachrach,
	Brian Kubisiak, Jerry Zhang

Jerry Zhang <jerry@skydio.com> writes:

> Thanks for the comments! I've updated v3 with the changes. Let me know
> if you have any
> more thoughts on whether to block / warn the user before clobbering their cache.

Please do not top-post on this list.

I've already said that I think we should ensure the index is clean
by default, because, unlike the case where the application is done
on the working tree files, the use of "--cached" is a sign that the
next step is likely to write a tree out.  As I've already said so in
earlier reviews, there is nothing more from me to add on that issue.

>> Give an order to the codebase to "be like so".  Here is my attempt.
>>
>>     Teach "git apply" to accept "--cached" and "--3way" at the same
>>     time.  Only when all changes to all paths involved in the
>>     application auto-resolve cleanly, the result is placed in the
>>     index at stage #0 and the command exits with 0 status.  If there
>>     is any path whose conflict cannot be cleanly auto-resolved, the
>>     original contents from common ancestor (stage #1), our version
>>     (stage #2) and the contents from the patch (stage #3) for the
>>     conflicted paths are left at separate stages without any attempt
>>     to resolve the conflict at the content level, and the command
>>     exists with non-zero status, because there is no place (like the
>>     working tree files) to leave a half-resolved conflicted merge
>>     result to ask the end-user to resolve.

I wrote the above as an example to illustrate the tone and the level
of details expected in our proposed commit log message.  The
behaviour it describes may not necessarily match what you have
implemented in the patch.

For example, imagine that we are applying a patch for two paths,
where one auto-resolves cleanly and the other does not.  The above
description expects both paths will leave the higher stages (instead
of recording the auto-resolved path at stage #0, and leaving the
other path that cannot be auto-resolved at higher stages) and the
command exits with non-zero status, which may not be what you
implemented.  As an illustration, I didn't necessarily mean such an
all-or-none behaviour wrt resolving should be what we implement---I
do not want to choose, as this is your itch and I want _you_ with
the itch to think long and hard before deciding what the best design
for end-users would be, and present it as a proposed solution.  An
obvious alternative is to record auto-resolved paths at stage #0 and
leave only the paths for which auto-resolution failed in conflicted
state.

Thanks.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH V2] git-apply: Allow simultaneous --cached and --3way options
  2021-04-06  5:52         ` Junio C Hamano
@ 2021-04-06 21:56           ` Jerry Zhang
  2021-04-07  2:25             ` Jerry Zhang
  0 siblings, 1 reply; 29+ messages in thread
From: Jerry Zhang @ 2021-04-06 21:56 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Elijah Newren, Ross Yeager, Abraham Bachrach,
	Brian Kubisiak, Jerry Zhang

On Mon, Apr 5, 2021 at 10:52 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Jerry Zhang <jerry@skydio.com> writes:
>
> > Thanks for the comments! I've updated v3 with the changes. Let me know
> > if you have any
> > more thoughts on whether to block / warn the user before clobbering their cache.
>
> Please do not top-post on this list.
>
> I've already said that I think we should ensure the index is clean
> by default, because, unlike the case where the application is done
> on the working tree files, the use of "--cached" is a sign that the
> next step is likely to write a tree out.  As I've already said so in
> earlier reviews, there is nothing more from me to add on that issue.
Understood, but please bear with me to explain the risks a bit more. I'm
having some difficulty coming up with a name and explanation for flags
for this case, because I don't completely understand the safety issue
we are trying to mitigate.

Let me enumerate some behaviors in 3 different cases where the user
has "file.txt" changes staged in the index, so index differs from HEAD.

"git apply --cached" would either 1. combine the patch and cached version
and put that in the cache or 2. do nothing (patch failed). In 2 nothing happened
so the user's changes are safe. In 1 the user's changes may be gone, but
since the user was forewarned, this is presumably what they wanted.

"git apply --3way" would either 1. apply cleanly to working dir or
2. conflict, in which case user's changes would be moved to stage #2
in cache. For 1 the user's changes are in the cache, so they can check that out
to restore the original state, since this invocation requires the cache
and working dir to match. For 2, the user's changes are moved to cache
in stage #2. Although the changes are preserved, there doesn't seem to
be any atomic way to move a cache entry from stage #2 to stage #0.
Something like "git restore --staged --ours file.txt" seems like it should
work, but "git restore" doesn't allow combining those flags.
The non atomic way we can do is "git checkout --ours file.txt &&
git add file.txt", this is ok in this case since we've required the index
and working tree to match.

"git apply --3way --cached" would either 1. apply cleanly to the cache or
2. conflict, and the user's changes are moved to stage #2. In 1, the user's
changes are lost because they're combined with the patch, but this is
the same as the "--cached" case by itself. In 2, the user's changes are
preserved in stage #2 similar to "--3way" by itself. What's somewhat
tricky here is restoring it to stage #0 since we can't use the working
tree, but I think that is more of a limitation in "git restore", since moving
a cache entry from stage #2 to stage #0 is a conceptually possible and
simple operation.

In summary it seems to me that merge or no merge, the safety semantics
for "--3way" + "--cached" as it is are pretty similar to the existing semantics
for those options individually. The user could be preparing to write
a tree out in either the "--cached" or the "--cached --3way" operation
so I don't understand why those must differ in safety. In addition, the
both "--3way" and "--3way --cached" perform mergey operations that
changes the stages of a file in cache, so I don't understand why those
must differ in safety either.

>
> >> Give an order to the codebase to "be like so".  Here is my attempt.
> >>
> >>     Teach "git apply" to accept "--cached" and "--3way" at the same
> >>     time.  Only when all changes to all paths involved in the
> >>     application auto-resolve cleanly, the result is placed in the
> >>     index at stage #0 and the command exits with 0 status.  If there
> >>     is any path whose conflict cannot be cleanly auto-resolved, the
> >>     original contents from common ancestor (stage #1), our version
> >>     (stage #2) and the contents from the patch (stage #3) for the
> >>     conflicted paths are left at separate stages without any attempt
> >>     to resolve the conflict at the content level, and the command
> >>     exists with non-zero status, because there is no place (like the
> >>     working tree files) to leave a half-resolved conflicted merge
> >>     result to ask the end-user to resolve.
>
> I wrote the above as an example to illustrate the tone and the level
> of details expected in our proposed commit log message.  The
> behaviour it describes may not necessarily match what you have
> implemented in the patch.
>
> For example, imagine that we are applying a patch for two paths,
> where one auto-resolves cleanly and the other does not.  The above
> description expects both paths will leave the higher stages (instead
> of recording the auto-resolved path at stage #0, and leaving the
> other path that cannot be auto-resolved at higher stages) and the
> command exits with non-zero status, which may not be what you
> implemented.  As an illustration, I didn't necessarily mean such an
> all-or-none behaviour wrt resolving should be what we implement---I
> do not want to choose, as this is your itch and I want _you_ with
> the itch to think long and hard before deciding what the best design
> for end-users would be, and present it as a proposed solution.  An
> obvious alternative is to record auto-resolved paths at stage #0 and
> leave only the paths for which auto-resolution failed in conflicted
> state.
I missed the "all changes to all paths" requirement in that description,
I'll update it to be more consistent with what it actually does. As you say,
the leaving entries at higher orders behavior only happens for conflicting
paths, not for all paths.
>
> Thanks.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH V2] git-apply: Allow simultaneous --cached and --3way options
  2021-04-06 21:56           ` Jerry Zhang
@ 2021-04-07  2:25             ` Jerry Zhang
  0 siblings, 0 replies; 29+ messages in thread
From: Jerry Zhang @ 2021-04-07  2:25 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Elijah Newren, Ross Yeager, Abraham Bachrach,
	Brian Kubisiak, Jerry Zhang

On Tue, Apr 6, 2021 at 2:56 PM Jerry Zhang <jerry@skydio.com> wrote:
>
> On Mon, Apr 5, 2021 at 10:52 PM Junio C Hamano <gitster@pobox.com> wrote:
> >
> > Jerry Zhang <jerry@skydio.com> writes:
> >
> > > Thanks for the comments! I've updated v3 with the changes. Let me know
> > > if you have any
> > > more thoughts on whether to block / warn the user before clobbering their cache.
> >
> > Please do not top-post on this list.
> >
> > I've already said that I think we should ensure the index is clean
> > by default, because, unlike the case where the application is done
> > on the working tree files, the use of "--cached" is a sign that the
> > next step is likely to write a tree out.  As I've already said so in
> > earlier reviews, there is nothing more from me to add on that issue.
> Understood, but please bear with me to explain the risks a bit more. I'm
> having some difficulty coming up with a name and explanation for flags
> for this case, because I don't completely understand the safety issue
> we are trying to mitigate.
>
> Let me enumerate some behaviors in 3 different cases where the user
> has "file.txt" changes staged in the index, so index differs from HEAD.
>
> "git apply --cached" would either 1. combine the patch and cached version
> and put that in the cache or 2. do nothing (patch failed). In 2 nothing happened
> so the user's changes are safe. In 1 the user's changes may be gone, but
> since the user was forewarned, this is presumably what they wanted.
>
> "git apply --3way" would either 1. apply cleanly to working dir or
> 2. conflict, in which case user's changes would be moved to stage #2
> in cache. For 1 the user's changes are in the cache, so they can check that out
> to restore the original state, since this invocation requires the cache
> and working dir to match. For 2, the user's changes are moved to cache
> in stage #2. Although the changes are preserved, there doesn't seem to
> be any atomic way to move a cache entry from stage #2 to stage #0.
> Something like "git restore --staged --ours file.txt" seems like it should
> work, but "git restore" doesn't allow combining those flags.
> The non atomic way we can do is "git checkout --ours file.txt &&
> git add file.txt", this is ok in this case since we've required the index
> and working tree to match.
>
> "git apply --3way --cached" would either 1. apply cleanly to the cache or
> 2. conflict, and the user's changes are moved to stage #2. In 1, the user's
> changes are lost because they're combined with the patch, but this is
> the same as the "--cached" case by itself. In 2, the user's changes are
> preserved in stage #2 similar to "--3way" by itself. What's somewhat
> tricky here is restoring it to stage #0 since we can't use the working
> tree, but I think that is more of a limitation in "git restore", since moving
> a cache entry from stage #2 to stage #0 is a conceptually possible and
> simple operation.
Update: I found out that this can be done through "git update-index"
Say you start with
```
100755 adc1032303de8d262868c0a1b85a00fa3b97e9d8 1 file.sh
100755 d0bf2e33594ea7d9d7352df5511db562de819518 2 file.sh
100755 e9bf5dfdea804f4f1bcf24988bbc7c3f99991a09 3 file.sh
```
Pass into "git update-index --index-info the following
```
100755 d0bf2e33594ea7d9d7352df5511db562de819518 0 file.sh
0 0 1 file.sh
0 0 2 file.sh
0 0 3 file.sh
```
and you will have atomically moved the "ours" stage of file.sh into
stage #0. This is sort of what I'd expect "git checkout --ours file.sh"
to do, but it doesn't seem to do that.
>
> In summary it seems to me that merge or no merge, the safety semantics
> for "--3way" + "--cached" as it is are pretty similar to the existing semantics
> for those options individually. The user could be preparing to write
> a tree out in either the "--cached" or the "--cached --3way" operation
> so I don't understand why those must differ in safety. In addition, the
> both "--3way" and "--3way --cached" perform mergey operations that
> changes the stages of a file in cache, so I don't understand why those
> must differ in safety either.
>
> >
> > >> Give an order to the codebase to "be like so".  Here is my attempt.
> > >>
> > >>     Teach "git apply" to accept "--cached" and "--3way" at the same
> > >>     time.  Only when all changes to all paths involved in the
> > >>     application auto-resolve cleanly, the result is placed in the
> > >>     index at stage #0 and the command exits with 0 status.  If there
> > >>     is any path whose conflict cannot be cleanly auto-resolved, the
> > >>     original contents from common ancestor (stage #1), our version
> > >>     (stage #2) and the contents from the patch (stage #3) for the
> > >>     conflicted paths are left at separate stages without any attempt
> > >>     to resolve the conflict at the content level, and the command
> > >>     exists with non-zero status, because there is no place (like the
> > >>     working tree files) to leave a half-resolved conflicted merge
> > >>     result to ask the end-user to resolve.
> >
> > I wrote the above as an example to illustrate the tone and the level
> > of details expected in our proposed commit log message.  The
> > behaviour it describes may not necessarily match what you have
> > implemented in the patch.
> >
> > For example, imagine that we are applying a patch for two paths,
> > where one auto-resolves cleanly and the other does not.  The above
> > description expects both paths will leave the higher stages (instead
> > of recording the auto-resolved path at stage #0, and leaving the
> > other path that cannot be auto-resolved at higher stages) and the
> > command exits with non-zero status, which may not be what you
> > implemented.  As an illustration, I didn't necessarily mean such an
> > all-or-none behaviour wrt resolving should be what we implement---I
> > do not want to choose, as this is your itch and I want _you_ with
> > the itch to think long and hard before deciding what the best design
> > for end-users would be, and present it as a proposed solution.  An
> > obvious alternative is to record auto-resolved paths at stage #0 and
> > leave only the paths for which auto-resolution failed in conflicted
> > state.
> I missed the "all changes to all paths" requirement in that description,
> I'll update it to be more consistent with what it actually does. As you say,
> the leaving entries at higher orders behavior only happens for conflicting
> paths, not for all paths.
> >
> > Thanks.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v4] git-apply: allow simultaneous --cached and --3way options
  2021-04-06  2:49     ` [PATCH v3] git-apply: allow " Jerry Zhang
@ 2021-04-07 18:03       ` Jerry Zhang
  2021-04-07 19:00         ` Junio C Hamano
  2021-04-08  2:13         ` [PATCH v5] " Jerry Zhang
  0 siblings, 2 replies; 29+ messages in thread
From: Jerry Zhang @ 2021-04-07 18:03 UTC (permalink / raw)
  To: git, newren, gitster; +Cc: ross, abe, brian.kubisiak, Jerry Zhang

"git apply" does not allow "--cached" and
"--3way" to be used together, since "--3way"
writes conflict markers into the working tree.

Allow "git apply" to accept "--cached" and
"--3way" at the same time.  When a single file
auto-resolves cleanly, the result is placed in the
index at stage #0 and the command exits with 0
status.  For a file that has a conflict which
cannot be cleanly auto-resolved, the original
contents from common ancestor (stage #1), our
version (stage #2) and the contents from the
patch (stage #3) are left at separate stages.
No attempt is made to resolve the conflict at
the content level, and the command exists with
non-zero status, because there is no place
(like the working tree) to leave a half-resolved
merge for the user to resolve.

The user can use `git diff` to view the contents
of the conflict, or `git checkout -m -- .` to
regenerate the conflict markers in the working
directory.

Don't attempt rerere in this case since it depends
on conflict markers written to file for its database
storage and lookup. There would be two main changes
required to get rerere working:
1. Allow the rerere api to accept in memory object
rather than files, which would allow us to pass in
the conflict markers contained in the result from
ll_merge().
2. Rerere can't write to the working directory, so
it would have to apply the result to cache stage #0
directly. A flag would be needed to control this.

Signed-off-by: Jerry Zhang <jerry@skydio.com>
---
 Documentation/git-apply.txt |  6 ++++--
 apply.c                     |  7 +++----
 t/t4108-apply-threeway.sh   | 24 ++++++++++++++++++++++++
 3 files changed, 31 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-apply.txt b/Documentation/git-apply.txt
index 9144575299c264dd299b542b7b5948eef35f211c..aa1ae56a25e0428cabcfa2539900ef2a09abcb7c 100644
--- a/Documentation/git-apply.txt
+++ b/Documentation/git-apply.txt
@@ -87,8 +87,10 @@ OPTIONS
 	Attempt 3-way merge if the patch records the identity of blobs it is supposed
 	to apply to and we have those blobs available locally, possibly leaving the
 	conflict markers in the files in the working tree for the user to
-	resolve.  This option implies the `--index` option, and is incompatible
-	with the `--reject` and the `--cached` options.
+	resolve.  This option implies the `--index` option unless the
+	`--cached` option is used, and is incompatible with the `--reject` option.
+	When used with the `--cached` option, any conflicts are left at higher stages
+	in the cache.
 
 --build-fake-ancestor=<file>::
 	Newer 'git diff' output has embedded 'index information'
diff --git a/apply.c b/apply.c
index 9bd4efcbced842d2c5c030a0f2178ddb36114600..0d1e91c88986433052e9b6e67c0dcbd04e6eb703 100644
--- a/apply.c
+++ b/apply.c
@@ -133,8 +133,6 @@ int check_apply_state(struct apply_state *state, int force_apply)
 
 	if (state->apply_with_reject && state->threeway)
 		return error(_("--reject and --3way cannot be used together."));
-	if (state->cached && state->threeway)
-		return error(_("--cached and --3way cannot be used together."));
 	if (state->threeway) {
 		if (is_not_gitdir)
 			return error(_("--3way outside a repository"));
@@ -4644,8 +4642,9 @@ static int write_out_results(struct apply_state *state, struct patch *list)
 				fprintf(stderr, "U %s\n", item->string);
 		}
 		string_list_clear(&cpath, 0);
-
-		repo_rerere(state->repo, 0);
+		/* rerere relies on conflict markers which aren't written with --cached */
+		if (!state->cached)
+			repo_rerere(state->repo, 0);
 	}
 
 	return errs;
diff --git a/t/t4108-apply-threeway.sh b/t/t4108-apply-threeway.sh
index 9ff313f976422f9c12dc8032d14567b54cfe3765..37ba4f6fa201c49a4bf2882d6b8345c1c2bedf0c 100755
--- a/t/t4108-apply-threeway.sh
+++ b/t/t4108-apply-threeway.sh
@@ -180,4 +180,28 @@ test_expect_success 'apply -3 with ambiguous repeating file' '
 	test_cmp expect one_two_repeat
 '
 
+test_expect_success 'apply with --3way --cached' '
+	# Merging side should be similar to applying this patch
+	git diff ...side >P.diff &&
+
+	# The corresponding conflicted merge
+	git reset --hard &&
+	git checkout main^0 &&
+	test_must_fail git merge --no-commit side &&
+	git ls-files -s >expect.ls &&
+
+	# should fail to apply
+	git reset --hard &&
+	git checkout main^0 &&
+	test_must_fail git apply --cached --3way P.diff &&
+	git ls-files -s >actual.ls &&
+	print_sanitized_conflicted_diff >actual.diff &&
+
+	# The cache should resemble the corresponding merge
+	test_cmp expect.ls actual.ls &&
+	# However the working directory should not change
+	>expect.diff &&
+	test_cmp expect.diff actual.diff
+'
+
 test_done
-- 
2.29.0


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4] git-apply: allow simultaneous --cached and --3way options
  2021-04-07 18:03       ` [PATCH v4] " Jerry Zhang
@ 2021-04-07 19:00         ` Junio C Hamano
  2021-04-08  2:13         ` [PATCH v5] " Jerry Zhang
  1 sibling, 0 replies; 29+ messages in thread
From: Junio C Hamano @ 2021-04-07 19:00 UTC (permalink / raw)
  To: Jerry Zhang; +Cc: git, newren, ross, abe, brian.kubisiak

Jerry Zhang <jerry@skydio.com> writes:

> "git apply" does not allow "--cached" and
> "--3way" to be used together, since "--3way"
> writes conflict markers into the working tree.
>
> Allow "git apply" to accept "--cached" and
> "--3way" at the same time.  When a single file
> auto-resolves cleanly, the result is placed in the
> index at stage #0 and the command exits with 0
> status.  For a file that has a conflict which
> cannot be cleanly auto-resolved, the original
> contents from common ancestor (stage #1), our
> version (stage #2) and the contents from the
> patch (stage #3) are left at separate stages.
> No attempt is made to resolve the conflict at
> the content level, and the command exists with
> non-zero status, because there is no place
> (like the working tree) to leave a half-resolved
> merge for the user to resolve.
>
> The user can use `git diff` to view the contents
> of the conflict, or `git checkout -m -- .` to
> regenerate the conflict markers in the working
> directory.
>
> Don't attempt rerere in this case since it depends
> on conflict markers written to file for its database
> storage and lookup. There would be two main changes
> required to get rerere working:
> 1. Allow the rerere api to accept in memory object
> rather than files, which would allow us to pass in
> the conflict markers contained in the result from
> ll_merge().
> 2. Rerere can't write to the working directory, so
> it would have to apply the result to cache stage #0
> directly. A flag would be needed to control this.
>
> Signed-off-by: Jerry Zhang <jerry@skydio.com>
> ---

For future reference, please summarize what changed between v3 and
v4 in this space immediately after the three-dash line.  This is
especially helpful when sending v4 so soon after v3 that nobody had
a chance to review and respond to v3, as it helps reviewers to
decide if it is safe to skip v3 and jump directly to v4 to start
reading.

>  Documentation/git-apply.txt |  6 ++++--
>  apply.c                     |  7 +++----
>  t/t4108-apply-threeway.sh   | 24 ++++++++++++++++++++++++
>  3 files changed, 31 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/git-apply.txt b/Documentation/git-apply.txt
> index 9144575299c264dd299b542b7b5948eef35f211c..aa1ae56a25e0428cabcfa2539900ef2a09abcb7c 100644
> --- a/Documentation/git-apply.txt
> +++ b/Documentation/git-apply.txt
> @@ -87,8 +87,10 @@ OPTIONS
>  	Attempt 3-way merge if the patch records the identity of blobs it is supposed
>  	to apply to and we have those blobs available locally, possibly leaving the
>  	conflict markers in the files in the working tree for the user to
> -	resolve.  This option implies the `--index` option, and is incompatible
> -	with the `--reject` and the `--cached` options.
> +	resolve.  This option implies the `--index` option unless the
> +	`--cached` option is used, and is incompatible with the `--reject` option.
> +	When used with the `--cached` option, any conflicts are left at higher stages
> +	in the cache.

Also for future reference.

It is clear to me (from the pre-context lines of the above hunk)
that this change wants to depend on the other "3way-first" topic,
because I reviewed the other topic.

But it would not be too much trouble to say "this builds on the
jz/apply-run-3way-first topic 923cd87a (git-apply: try threeway
first when "--3way" is used, 2021-04-06)".  When potential reviewers
are tempted to apply this and try it out while reviewing, such a
note would help them.  And as a patch author, you would want to
increase the chance that your patch gets reviewed, so any help you
give to potential reviewers would help you.

The space between the three-dash line and the diffstat is the place
to write it.

> diff --git a/apply.c b/apply.c
> index 9bd4efcbced842d2c5c030a0f2178ddb36114600..0d1e91c88986433052e9b6e67c0dcbd04e6eb703 100644
> --- a/apply.c
> +++ b/apply.c
> @@ -133,8 +133,6 @@ int check_apply_state(struct apply_state *state, int force_apply)
>  
>  	if (state->apply_with_reject && state->threeway)
>  		return error(_("--reject and --3way cannot be used together."));
> -	if (state->cached && state->threeway)
> -		return error(_("--cached and --3way cannot be used together."));
>  	if (state->threeway) {
>  		if (is_not_gitdir)
>  			return error(_("--3way outside a repository"));
> @@ -4644,8 +4642,9 @@ static int write_out_results(struct apply_state *state, struct patch *list)
>  				fprintf(stderr, "U %s\n", item->string);
>  		}
>  		string_list_clear(&cpath, 0);
> -
> -		repo_rerere(state->repo, 0);
> +		/* rerere relies on conflict markers which aren't written with --cached */

A minor nit.  It is not just "conflict markers" that rerere wants.
It wants a intermediate half-merged result "in the working tree",
because it does not work with in-core copy.  So

                /*
                 * With --cached, we do not write conflicted file to the
                 * working tree, so cannot use rerere to reuse previous
		 * resolution.
                 */

or something, perhaps.

> +		if (!state->cached)
> +			repo_rerere(state->repo, 0);
>  	}
>  
>  	return errs;
> diff --git a/t/t4108-apply-threeway.sh b/t/t4108-apply-threeway.sh
> index 9ff313f976422f9c12dc8032d14567b54cfe3765..37ba4f6fa201c49a4bf2882d6b8345c1c2bedf0c 100755
> --- a/t/t4108-apply-threeway.sh
> +++ b/t/t4108-apply-threeway.sh
> @@ -180,4 +180,28 @@ test_expect_success 'apply -3 with ambiguous repeating file' '
>  	test_cmp expect one_two_repeat
>  '
>  
> +test_expect_success 'apply with --3way --cached' '
> +	# Merging side should be similar to applying this patch
> +	git diff ...side >P.diff &&
> +
> +	# The corresponding conflicted merge
> +	git reset --hard &&
> +	git checkout main^0 &&
> +	test_must_fail git merge --no-commit side &&
> +	git ls-files -s >expect.ls &&
> +
> +	# should fail to apply
> +	git reset --hard &&
> +	git checkout main^0 &&
> +	test_must_fail git apply --cached --3way P.diff &&
> +	git ls-files -s >actual.ls &&
> +	print_sanitized_conflicted_diff >actual.diff &&
> +
> +	# The cache should resemble the corresponding merge
> +	test_cmp expect.ls actual.ls &&
> +	# However the working directory should not change
> +	>expect.diff &&
> +	test_cmp expect.diff actual.diff
> +'

Interesting.  I would have expected "ls-files -u" would be used, but
using "-s" to see the stage #0 entries is more thorough.

The above is only about a failing case, which is of course an
important case to validate, but don't we also want to check a
successful case, and a case where two paths are touched, and one
applies cleanly while the other conflicts?

Thanks.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v5] git-apply: allow simultaneous --cached and --3way options
  2021-04-07 18:03       ` [PATCH v4] " Jerry Zhang
  2021-04-07 19:00         ` Junio C Hamano
@ 2021-04-08  2:13         ` Jerry Zhang
  2021-04-08 13:33           ` Junio C Hamano
  2021-04-12 15:40           ` Elijah Newren
  1 sibling, 2 replies; 29+ messages in thread
From: Jerry Zhang @ 2021-04-08  2:13 UTC (permalink / raw)
  To: git, newren, gitster; +Cc: ross, abe, brian.kubisiak, Jerry Zhang

"git apply" does not allow "--cached" and
"--3way" to be used together, since "--3way"
writes conflict markers into the working tree.

Allow "git apply" to accept "--cached" and
"--3way" at the same time.  When a single file
auto-resolves cleanly, the result is placed in the
index at stage #0 and the command exits with 0
status.  For a file that has a conflict which
cannot be cleanly auto-resolved, the original
contents from common ancestor (stage #1), our
version (stage #2) and the contents from the
patch (stage #3) are left at separate stages.
No attempt is made to resolve the conflict at
the content level, and the command exists with
non-zero status, because there is no place
(like the working tree) to leave a half-resolved
merge for the user to resolve.

The user can use `git diff` to view the contents
of the conflict, or `git checkout -m -- .` to
regenerate the conflict markers in the working
directory.

Don't attempt rerere in this case since it depends
on conflict markers written to file for its database
storage and lookup. There would be two main changes
required to get rerere working:
1. Allow the rerere api to accept in memory object
rather than files, which would allow us to pass in
the conflict markers contained in the result from
ll_merge().
2. Rerere can't write to the working directory, so
it would have to apply the result to cache stage #0
directly. A flag would be needed to control this.

Signed-off-by: Jerry Zhang <jerry@skydio.com>
---
Patch applies on top of
"[PATCH v2] git-apply: try threeway first when "--3way""
Main merge conflict is the addition of multiple
tests at the bottom of the file.

v4->v5:

Updated in file comment about rerere
Added test for cleanly applying patch (should return 0)
Previous test captured case where patch fails to apply
due to 1 conflicting file and 1 cleanly applying file
(returns 1).

 Documentation/git-apply.txt |  6 +++--
 apply.c                     |  9 ++++---
 t/t4108-apply-threeway.sh   | 50 +++++++++++++++++++++++++++++++++++++
 3 files changed, 59 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-apply.txt b/Documentation/git-apply.txt
index 9144575299c264dd299b542b7b5948eef35f211c..aa1ae56a25e0428cabcfa2539900ef2a09abcb7c 100644
--- a/Documentation/git-apply.txt
+++ b/Documentation/git-apply.txt
@@ -87,8 +87,10 @@ OPTIONS
 	Attempt 3-way merge if the patch records the identity of blobs it is supposed
 	to apply to and we have those blobs available locally, possibly leaving the
 	conflict markers in the files in the working tree for the user to
-	resolve.  This option implies the `--index` option, and is incompatible
-	with the `--reject` and the `--cached` options.
+	resolve.  This option implies the `--index` option unless the
+	`--cached` option is used, and is incompatible with the `--reject` option.
+	When used with the `--cached` option, any conflicts are left at higher stages
+	in the cache.
 
 --build-fake-ancestor=<file>::
 	Newer 'git diff' output has embedded 'index information'
diff --git a/apply.c b/apply.c
index 9bd4efcbced842d2c5c030a0f2178ddb36114600..dadab80ec967357b031657d4e3d0ae52fac11411 100644
--- a/apply.c
+++ b/apply.c
@@ -133,8 +133,6 @@ int check_apply_state(struct apply_state *state, int force_apply)
 
 	if (state->apply_with_reject && state->threeway)
 		return error(_("--reject and --3way cannot be used together."));
-	if (state->cached && state->threeway)
-		return error(_("--cached and --3way cannot be used together."));
 	if (state->threeway) {
 		if (is_not_gitdir)
 			return error(_("--3way outside a repository"));
@@ -4644,8 +4642,11 @@ static int write_out_results(struct apply_state *state, struct patch *list)
 				fprintf(stderr, "U %s\n", item->string);
 		}
 		string_list_clear(&cpath, 0);
-
-		repo_rerere(state->repo, 0);
+		/* Rerere relies on the partially merged result being in the working tree
+		 * with conflict markers, but that isn't written with --cached.
+		 */
+		if (!state->cached)
+			repo_rerere(state->repo, 0);
 	}
 
 	return errs;
diff --git a/t/t4108-apply-threeway.sh b/t/t4108-apply-threeway.sh
index 9ff313f976422f9c12dc8032d14567b54cfe3765..65147efdea9a00e30d156e6f4d5d72a3987f230d 100755
--- a/t/t4108-apply-threeway.sh
+++ b/t/t4108-apply-threeway.sh
@@ -180,4 +180,54 @@ test_expect_success 'apply -3 with ambiguous repeating file' '
 	test_cmp expect one_two_repeat
 '
 
+test_expect_success 'apply with --3way --cached clean apply' '
+	# Merging side should be similar to applying this patch
+	git diff ...side >P.diff &&
+
+	# The corresponding cleanly applied merge
+	git reset --hard &&
+	git checkout main~ &&
+	git merge --no-commit side &&
+	git ls-files -s >expect.ls &&
+
+	# should succeed
+	git reset --hard &&
+	git checkout main~ &&
+	git apply --cached --3way P.diff &&
+	git ls-files -s >actual.ls &&
+	print_sanitized_conflicted_diff >actual.diff &&
+
+	# The cache should resemble the corresponding merge
+	# (both files at stage #0)
+	test_cmp expect.ls actual.ls &&
+	# However the working directory should not change
+	>expect.diff &&
+	test_cmp expect.diff actual.diff
+'
+
+test_expect_success 'apply with --3way --cached and conflicts' '
+	# Merging side should be similar to applying this patch
+	git diff ...side >P.diff &&
+
+	# The corresponding conflicted merge
+	git reset --hard &&
+	git checkout main^0 &&
+	test_must_fail git merge --no-commit side &&
+	git ls-files -s >expect.ls &&
+
+	# should fail to apply
+	git reset --hard &&
+	git checkout main^0 &&
+	test_must_fail git apply --cached --3way P.diff &&
+	git ls-files -s >actual.ls &&
+	print_sanitized_conflicted_diff >actual.diff &&
+
+	# The cache should resemble the corresponding merge
+	# (one file at stage #0, one file at stages #1 #2 #3)
+	test_cmp expect.ls actual.ls &&
+	# However the working directory should not change
+	>expect.diff &&
+	test_cmp expect.diff actual.diff
+'
+
 test_done
-- 
2.29.0


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v5] git-apply: allow simultaneous --cached and --3way options
  2021-04-08  2:13         ` [PATCH v5] " Jerry Zhang
@ 2021-04-08 13:33           ` Junio C Hamano
  2021-04-12 15:45             ` Elijah Newren
  2021-04-12 15:40           ` Elijah Newren
  1 sibling, 1 reply; 29+ messages in thread
From: Junio C Hamano @ 2021-04-08 13:33 UTC (permalink / raw)
  To: Jerry Zhang; +Cc: git, newren, ross, abe, brian.kubisiak

Jerry Zhang <jerry@skydio.com> writes:

>  Documentation/git-apply.txt |  6 +++--
>  apply.c                     |  9 ++++---
>  t/t4108-apply-threeway.sh   | 50 +++++++++++++++++++++++++++++++++++++
>  3 files changed, 59 insertions(+), 6 deletions(-)

Nicely done.  Will queue.

Elijah, how does this round look to you?

> diff --git a/Documentation/git-apply.txt b/Documentation/git-apply.txt
> index 9144575299c264dd299b542b7b5948eef35f211c..aa1ae56a25e0428cabcfa2539900ef2a09abcb7c 100644
> --- a/Documentation/git-apply.txt
> +++ b/Documentation/git-apply.txt
> @@ -87,8 +87,10 @@ OPTIONS
>  	Attempt 3-way merge if the patch records the identity of blobs it is supposed
>  	to apply to and we have those blobs available locally, possibly leaving the
>  	conflict markers in the files in the working tree for the user to
> -	resolve.  This option implies the `--index` option, and is incompatible
> -	with the `--reject` and the `--cached` options.
> +	resolve.  This option implies the `--index` option unless the
> +	`--cached` option is used, and is incompatible with the `--reject` option.
> +	When used with the `--cached` option, any conflicts are left at higher stages
> +	in the cache.
>  
>  --build-fake-ancestor=<file>::
>  	Newer 'git diff' output has embedded 'index information'
> diff --git a/apply.c b/apply.c
> index 9bd4efcbced842d2c5c030a0f2178ddb36114600..dadab80ec967357b031657d4e3d0ae52fac11411 100644
> --- a/apply.c
> +++ b/apply.c
> @@ -133,8 +133,6 @@ int check_apply_state(struct apply_state *state, int force_apply)
>  
>  	if (state->apply_with_reject && state->threeway)
>  		return error(_("--reject and --3way cannot be used together."));
> -	if (state->cached && state->threeway)
> -		return error(_("--cached and --3way cannot be used together."));
>  	if (state->threeway) {
>  		if (is_not_gitdir)
>  			return error(_("--3way outside a repository"));
> @@ -4644,8 +4642,11 @@ static int write_out_results(struct apply_state *state, struct patch *list)
>  				fprintf(stderr, "U %s\n", item->string);
>  		}
>  		string_list_clear(&cpath, 0);
> -
> -		repo_rerere(state->repo, 0);
> +		/* Rerere relies on the partially merged result being in the working tree
> +		 * with conflict markers, but that isn't written with --cached.
> +		 */
> +		if (!state->cached)
> +			repo_rerere(state->repo, 0);
>  	}
>  
>  	return errs;
> diff --git a/t/t4108-apply-threeway.sh b/t/t4108-apply-threeway.sh
> index 9ff313f976422f9c12dc8032d14567b54cfe3765..65147efdea9a00e30d156e6f4d5d72a3987f230d 100755
> --- a/t/t4108-apply-threeway.sh
> +++ b/t/t4108-apply-threeway.sh
> @@ -180,4 +180,54 @@ test_expect_success 'apply -3 with ambiguous repeating file' '
>  	test_cmp expect one_two_repeat
>  '
>  
> +test_expect_success 'apply with --3way --cached clean apply' '
> +	# Merging side should be similar to applying this patch
> +	git diff ...side >P.diff &&
> +
> +	# The corresponding cleanly applied merge
> +	git reset --hard &&
> +	git checkout main~ &&
> +	git merge --no-commit side &&
> +	git ls-files -s >expect.ls &&
> +
> +	# should succeed
> +	git reset --hard &&
> +	git checkout main~ &&
> +	git apply --cached --3way P.diff &&
> +	git ls-files -s >actual.ls &&
> +	print_sanitized_conflicted_diff >actual.diff &&
> +
> +	# The cache should resemble the corresponding merge
> +	# (both files at stage #0)
> +	test_cmp expect.ls actual.ls &&
> +	# However the working directory should not change
> +	>expect.diff &&
> +	test_cmp expect.diff actual.diff
> +'
> +
> +test_expect_success 'apply with --3way --cached and conflicts' '
> +	# Merging side should be similar to applying this patch
> +	git diff ...side >P.diff &&
> +
> +	# The corresponding conflicted merge
> +	git reset --hard &&
> +	git checkout main^0 &&
> +	test_must_fail git merge --no-commit side &&
> +	git ls-files -s >expect.ls &&
> +
> +	# should fail to apply
> +	git reset --hard &&
> +	git checkout main^0 &&
> +	test_must_fail git apply --cached --3way P.diff &&
> +	git ls-files -s >actual.ls &&
> +	print_sanitized_conflicted_diff >actual.diff &&
> +
> +	# The cache should resemble the corresponding merge
> +	# (one file at stage #0, one file at stages #1 #2 #3)
> +	test_cmp expect.ls actual.ls &&
> +	# However the working directory should not change
> +	>expect.diff &&
> +	test_cmp expect.diff actual.diff
> +'
> +
>  test_done

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v5] git-apply: allow simultaneous --cached and --3way options
  2021-04-08  2:13         ` [PATCH v5] " Jerry Zhang
  2021-04-08 13:33           ` Junio C Hamano
@ 2021-04-12 15:40           ` Elijah Newren
  2021-04-12 18:27             ` Junio C Hamano
  1 sibling, 1 reply; 29+ messages in thread
From: Elijah Newren @ 2021-04-12 15:40 UTC (permalink / raw)
  To: Jerry Zhang
  Cc: Git Mailing List, Junio C Hamano, ross, Abraham Bachrach, brian.kubisiak

On Wed, Apr 7, 2021 at 7:13 PM Jerry Zhang <jerry@skydio.com> wrote:
>
> "git apply" does not allow "--cached" and
> "--3way" to be used together, since "--3way"
> writes conflict markers into the working tree.
>
> Allow "git apply" to accept "--cached" and
> "--3way" at the same time.  When a single file
> auto-resolves cleanly, the result is placed in the
> index at stage #0 and the command exits with 0
> status.

Should this instead read:
  "...placed in the index at stage #0.  If all files auto-resolve
cleanly, the command exits with 0 status."
or something like that?

>  For a file that has a conflict which
> cannot be cleanly auto-resolved, the original
> contents from common ancestor (stage #1), our
> version (stage #2) and the contents from the
> patch (stage #3) are left at separate stages.
> No attempt is made to resolve the conflict at
> the content level, and the command exists with

s/exists/exits/

> non-zero status, because there is no place
> (like the working tree) to leave a half-resolved
> merge for the user to resolve.
>
> The user can use `git diff` to view the contents
> of the conflict, or `git checkout -m -- .` to
> regenerate the conflict markers in the working
> directory.
>
> Don't attempt rerere in this case since it depends
> on conflict markers written to file for its database
> storage and lookup. There would be two main changes
> required to get rerere working:
> 1. Allow the rerere api to accept in memory object
> rather than files, which would allow us to pass in
> the conflict markers contained in the result from
> ll_merge().
> 2. Rerere can't write to the working directory, so
> it would have to apply the result to cache stage #0
> directly. A flag would be needed to control this.
>
> Signed-off-by: Jerry Zhang <jerry@skydio.com>
> ---
> Patch applies on top of
> "[PATCH v2] git-apply: try threeway first when "--3way""
> Main merge conflict is the addition of multiple
> tests at the bottom of the file.
>
> v4->v5:
>
> Updated in file comment about rerere
> Added test for cleanly applying patch (should return 0)
> Previous test captured case where patch fails to apply
> due to 1 conflicting file and 1 cleanly applying file
> (returns 1).
>
>  Documentation/git-apply.txt |  6 +++--
>  apply.c                     |  9 ++++---
>  t/t4108-apply-threeway.sh   | 50 +++++++++++++++++++++++++++++++++++++
>  3 files changed, 59 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/git-apply.txt b/Documentation/git-apply.txt
> index 9144575299c264dd299b542b7b5948eef35f211c..aa1ae56a25e0428cabcfa2539900ef2a09abcb7c 100644
> --- a/Documentation/git-apply.txt
> +++ b/Documentation/git-apply.txt
> @@ -87,8 +87,10 @@ OPTIONS
>         Attempt 3-way merge if the patch records the identity of blobs it is supposed
>         to apply to and we have those blobs available locally, possibly leaving the
>         conflict markers in the files in the working tree for the user to
> -       resolve.  This option implies the `--index` option, and is incompatible
> -       with the `--reject` and the `--cached` options.
> +       resolve.  This option implies the `--index` option unless the
> +       `--cached` option is used, and is incompatible with the `--reject` option.
> +       When used with the `--cached` option, any conflicts are left at higher stages
> +       in the cache.
>
>  --build-fake-ancestor=<file>::
>         Newer 'git diff' output has embedded 'index information'
> diff --git a/apply.c b/apply.c
> index 9bd4efcbced842d2c5c030a0f2178ddb36114600..dadab80ec967357b031657d4e3d0ae52fac11411 100644
> --- a/apply.c
> +++ b/apply.c
> @@ -133,8 +133,6 @@ int check_apply_state(struct apply_state *state, int force_apply)
>
>         if (state->apply_with_reject && state->threeway)
>                 return error(_("--reject and --3way cannot be used together."));
> -       if (state->cached && state->threeway)
> -               return error(_("--cached and --3way cannot be used together."));
>         if (state->threeway) {
>                 if (is_not_gitdir)
>                         return error(_("--3way outside a repository"));
> @@ -4644,8 +4642,11 @@ static int write_out_results(struct apply_state *state, struct patch *list)
>                                 fprintf(stderr, "U %s\n", item->string);
>                 }
>                 string_list_clear(&cpath, 0);
> -
> -               repo_rerere(state->repo, 0);
> +               /* Rerere relies on the partially merged result being in the working tree
> +                * with conflict markers, but that isn't written with --cached.
> +                */
> +               if (!state->cached)
> +                       repo_rerere(state->repo, 0);
>         }
>
>         return errs;
> diff --git a/t/t4108-apply-threeway.sh b/t/t4108-apply-threeway.sh
> index 9ff313f976422f9c12dc8032d14567b54cfe3765..65147efdea9a00e30d156e6f4d5d72a3987f230d 100755
> --- a/t/t4108-apply-threeway.sh
> +++ b/t/t4108-apply-threeway.sh
> @@ -180,4 +180,54 @@ test_expect_success 'apply -3 with ambiguous repeating file' '
>         test_cmp expect one_two_repeat
>  '
>
> +test_expect_success 'apply with --3way --cached clean apply' '
> +       # Merging side should be similar to applying this patch
> +       git diff ...side >P.diff &&
> +
> +       # The corresponding cleanly applied merge
> +       git reset --hard &&
> +       git checkout main~ &&
> +       git merge --no-commit side &&
> +       git ls-files -s >expect.ls &&
> +
> +       # should succeed
> +       git reset --hard &&
> +       git checkout main~ &&
> +       git apply --cached --3way P.diff &&
> +       git ls-files -s >actual.ls &&
> +       print_sanitized_conflicted_diff >actual.diff &&
> +
> +       # The cache should resemble the corresponding merge
> +       # (both files at stage #0)
> +       test_cmp expect.ls actual.ls &&
> +       # However the working directory should not change
> +       >expect.diff &&
> +       test_cmp expect.diff actual.diff
> +'
> +
> +test_expect_success 'apply with --3way --cached and conflicts' '
> +       # Merging side should be similar to applying this patch
> +       git diff ...side >P.diff &&
> +
> +       # The corresponding conflicted merge
> +       git reset --hard &&
> +       git checkout main^0 &&
> +       test_must_fail git merge --no-commit side &&
> +       git ls-files -s >expect.ls &&
> +
> +       # should fail to apply
> +       git reset --hard &&
> +       git checkout main^0 &&
> +       test_must_fail git apply --cached --3way P.diff &&
> +       git ls-files -s >actual.ls &&
> +       print_sanitized_conflicted_diff >actual.diff &&
> +
> +       # The cache should resemble the corresponding merge
> +       # (one file at stage #0, one file at stages #1 #2 #3)
> +       test_cmp expect.ls actual.ls &&
> +       # However the working directory should not change
> +       >expect.diff &&
> +       test_cmp expect.diff actual.diff
> +'
> +
>  test_done
> --
> 2.29.0

Otherwise, looks good to me.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v5] git-apply: allow simultaneous --cached and --3way options
  2021-04-08 13:33           ` Junio C Hamano
@ 2021-04-12 15:45             ` Elijah Newren
  2021-04-12 18:26               ` Junio C Hamano
  0 siblings, 1 reply; 29+ messages in thread
From: Elijah Newren @ 2021-04-12 15:45 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jerry Zhang, Git Mailing List, ross, Abraham Bachrach, brian.kubisiak

On Thu, Apr 8, 2021 at 6:33 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Jerry Zhang <jerry@skydio.com> writes:
>
> >  Documentation/git-apply.txt |  6 +++--
> >  apply.c                     |  9 ++++---
> >  t/t4108-apply-threeway.sh   | 50 +++++++++++++++++++++++++++++++++++++
> >  3 files changed, 59 insertions(+), 6 deletions(-)
>
> Nicely done.  Will queue.
>
> Elijah, how does this round look to you?

Sorry for the delay; modulo two minor issues with the commit message
it looks good to me.

This change won't allow git-apply to handle upstream renames, and
makes me wonder if we should lift the fall_back_threeway() logic out
of builtin/am.c and use it here.  This change also makes me wonder if
we should change git-am's --3way flag to make it not be treated as a
fallback to be consistent with what we are doing here.  But neither of
those changes need to be part of this patch.

> > diff --git a/Documentation/git-apply.txt b/Documentation/git-apply.txt
> > index 9144575299c264dd299b542b7b5948eef35f211c..aa1ae56a25e0428cabcfa2539900ef2a09abcb7c 100644
> > --- a/Documentation/git-apply.txt
> > +++ b/Documentation/git-apply.txt
> > @@ -87,8 +87,10 @@ OPTIONS
> >       Attempt 3-way merge if the patch records the identity of blobs it is supposed
> >       to apply to and we have those blobs available locally, possibly leaving the
> >       conflict markers in the files in the working tree for the user to
> > -     resolve.  This option implies the `--index` option, and is incompatible
> > -     with the `--reject` and the `--cached` options.
> > +     resolve.  This option implies the `--index` option unless the
> > +     `--cached` option is used, and is incompatible with the `--reject` option.
> > +     When used with the `--cached` option, any conflicts are left at higher stages
> > +     in the cache.
> >
> >  --build-fake-ancestor=<file>::
> >       Newer 'git diff' output has embedded 'index information'
> > diff --git a/apply.c b/apply.c
> > index 9bd4efcbced842d2c5c030a0f2178ddb36114600..dadab80ec967357b031657d4e3d0ae52fac11411 100644
> > --- a/apply.c
> > +++ b/apply.c
> > @@ -133,8 +133,6 @@ int check_apply_state(struct apply_state *state, int force_apply)
> >
> >       if (state->apply_with_reject && state->threeway)
> >               return error(_("--reject and --3way cannot be used together."));
> > -     if (state->cached && state->threeway)
> > -             return error(_("--cached and --3way cannot be used together."));
> >       if (state->threeway) {
> >               if (is_not_gitdir)
> >                       return error(_("--3way outside a repository"));
> > @@ -4644,8 +4642,11 @@ static int write_out_results(struct apply_state *state, struct patch *list)
> >                               fprintf(stderr, "U %s\n", item->string);
> >               }
> >               string_list_clear(&cpath, 0);
> > -
> > -             repo_rerere(state->repo, 0);
> > +             /* Rerere relies on the partially merged result being in the working tree
> > +              * with conflict markers, but that isn't written with --cached.
> > +              */
> > +             if (!state->cached)
> > +                     repo_rerere(state->repo, 0);
> >       }
> >
> >       return errs;
> > diff --git a/t/t4108-apply-threeway.sh b/t/t4108-apply-threeway.sh
> > index 9ff313f976422f9c12dc8032d14567b54cfe3765..65147efdea9a00e30d156e6f4d5d72a3987f230d 100755
> > --- a/t/t4108-apply-threeway.sh
> > +++ b/t/t4108-apply-threeway.sh
> > @@ -180,4 +180,54 @@ test_expect_success 'apply -3 with ambiguous repeating file' '
> >       test_cmp expect one_two_repeat
> >  '
> >
> > +test_expect_success 'apply with --3way --cached clean apply' '
> > +     # Merging side should be similar to applying this patch
> > +     git diff ...side >P.diff &&
> > +
> > +     # The corresponding cleanly applied merge
> > +     git reset --hard &&
> > +     git checkout main~ &&
> > +     git merge --no-commit side &&
> > +     git ls-files -s >expect.ls &&
> > +
> > +     # should succeed
> > +     git reset --hard &&
> > +     git checkout main~ &&
> > +     git apply --cached --3way P.diff &&
> > +     git ls-files -s >actual.ls &&
> > +     print_sanitized_conflicted_diff >actual.diff &&
> > +
> > +     # The cache should resemble the corresponding merge
> > +     # (both files at stage #0)
> > +     test_cmp expect.ls actual.ls &&
> > +     # However the working directory should not change
> > +     >expect.diff &&
> > +     test_cmp expect.diff actual.diff
> > +'
> > +
> > +test_expect_success 'apply with --3way --cached and conflicts' '
> > +     # Merging side should be similar to applying this patch
> > +     git diff ...side >P.diff &&
> > +
> > +     # The corresponding conflicted merge
> > +     git reset --hard &&
> > +     git checkout main^0 &&
> > +     test_must_fail git merge --no-commit side &&
> > +     git ls-files -s >expect.ls &&
> > +
> > +     # should fail to apply
> > +     git reset --hard &&
> > +     git checkout main^0 &&
> > +     test_must_fail git apply --cached --3way P.diff &&
> > +     git ls-files -s >actual.ls &&
> > +     print_sanitized_conflicted_diff >actual.diff &&
> > +
> > +     # The cache should resemble the corresponding merge
> > +     # (one file at stage #0, one file at stages #1 #2 #3)
> > +     test_cmp expect.ls actual.ls &&
> > +     # However the working directory should not change
> > +     >expect.diff &&
> > +     test_cmp expect.diff actual.diff
> > +'
> > +
> >  test_done

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v5] git-apply: allow simultaneous --cached and --3way options
  2021-04-12 15:45             ` Elijah Newren
@ 2021-04-12 18:26               ` Junio C Hamano
  0 siblings, 0 replies; 29+ messages in thread
From: Junio C Hamano @ 2021-04-12 18:26 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Jerry Zhang, Git Mailing List, ross, Abraham Bachrach, brian.kubisiak

Elijah Newren <newren@gmail.com> writes:

> Sorry for the delay; modulo two minor issues with the commit message
> it looks good to me.

Thanks.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v5] git-apply: allow simultaneous --cached and --3way options
  2021-04-12 15:40           ` Elijah Newren
@ 2021-04-12 18:27             ` Junio C Hamano
  0 siblings, 0 replies; 29+ messages in thread
From: Junio C Hamano @ 2021-04-12 18:27 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Jerry Zhang, Git Mailing List, ross, Abraham Bachrach, brian.kubisiak

Elijah Newren <newren@gmail.com> writes:

> On Wed, Apr 7, 2021 at 7:13 PM Jerry Zhang <jerry@skydio.com> wrote:
>>
>> "git apply" does not allow "--cached" and
>> "--3way" to be used together, since "--3way"
>> writes conflict markers into the working tree.
>>
>> Allow "git apply" to accept "--cached" and
>> "--3way" at the same time.  When a single file
>> auto-resolves cleanly, the result is placed in the
>> index at stage #0 and the command exits with 0
>> status.
>
> Should this instead read:
>   "...placed in the index at stage #0.  If all files auto-resolve
> cleanly, the command exits with 0 status."
> or something like that?

Perhaps.

>>  For a file that has a conflict which
>> cannot be cleanly auto-resolved, the original
>> contents from common ancestor (stage #1), our
>> version (stage #2) and the contents from the
>> patch (stage #3) are left at separate stages.
>> No attempt is made to resolve the conflict at
>> the content level, and the command exists with
>
> s/exists/exits/

Will squash it in.

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2021-04-12 18:27 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-03  1:34 [PATCH 0/1] git-apply: Allow simultaneous --cached and --3way options Jerry Zhang
2021-04-03  1:34 ` [PATCH 1/1] " Jerry Zhang
2021-04-03  3:46   ` Elijah Newren
2021-04-03  4:26     ` Junio C Hamano
2021-04-04  1:02       ` Junio C Hamano
2021-04-05 22:12         ` Jerry Zhang
2021-04-05 22:23           ` Junio C Hamano
2021-04-05 23:29             ` Jerry Zhang
2021-04-06  0:10               ` Junio C Hamano
2021-04-05 22:08     ` Jerry Zhang
2021-04-05 22:19   ` [PATCH V2] " Jerry Zhang
2021-04-05 22:46     ` Junio C Hamano
2021-04-06  2:52       ` Jerry Zhang
2021-04-06  5:52         ` Junio C Hamano
2021-04-06 21:56           ` Jerry Zhang
2021-04-07  2:25             ` Jerry Zhang
2021-04-06  2:49     ` [PATCH v3] git-apply: allow " Jerry Zhang
2021-04-07 18:03       ` [PATCH v4] " Jerry Zhang
2021-04-07 19:00         ` Junio C Hamano
2021-04-08  2:13         ` [PATCH v5] " Jerry Zhang
2021-04-08 13:33           ` Junio C Hamano
2021-04-12 15:45             ` Elijah Newren
2021-04-12 18:26               ` Junio C Hamano
2021-04-12 15:40           ` Elijah Newren
2021-04-12 18:27             ` Junio C Hamano
2021-04-03  3:04 ` [PATCH 0/1] git-apply: Allow " Elijah Newren
2021-04-05 22:05   ` Jerry Zhang
2021-04-03  5:24 ` Bagas Sanjaya
     [not found]   ` <CAMKO5CtiW84E4XjnPRf-yOPp+ua_u07LsAu=BB0YhmP3+3kYiw@mail.gmail.com>
2021-04-03  8:05     ` Bagas Sanjaya

Code repositories for project(s) associated with this inbox:

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).