git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Submodule/contents conflict
@ 2017-04-24  8:06 Orgad Shaneh
  2017-04-24 17:40 ` Stefan Beller
  0 siblings, 1 reply; 18+ messages in thread
From: Orgad Shaneh @ 2017-04-24  8:06 UTC (permalink / raw)
  To: git

Hi,

I've noticed a strange behavior with submodule/content conflict. My
current Git version is 2.12.2, but the problem exists since I
remember.

Branch A has a submodule.
In branch B which diverged from A, I replaced the submodule with its contents.

Now, every time I merge A into B, and A had changed the submodule
reference, all the files inside the ex-submodule directory in B are
being "re-added".

Moreover, aborting the merge prints an error, but seems to work
nevertheless, and if I run git reset --hard all the files in that
directory are actually written to the disk, even though they haven't
changed at all.

When the submodule is small, it might be ok. But in my project we have
a huge submodule with ~16K files, and on each merge all the files are
listed, and even mixed reset takes several minutes.

The following script demonstrates this:
#!/bin/sh

rm -rf super sub
mkdir sub
cd sub
git init
touch foo
git add foo
git commit -m 'Initial commit'
mkdir ../super; cd ../super
git init
git submodule add ../sub
touch foo; git add foo sub
git commit -m 'Initial commit'
git checkout -b update-sub
git update-index --cacheinfo 160000,aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa,sub
git commit -m 'Update submodule'
git checkout -b remove-sub HEAD^
git rm sub
mkdir sub
touch sub/foo sub/bar
git add sub
git commit -m 'Replaced submodule with contents'
git checkout -b remove-2 HEAD^
git merge --no-ff remove-sub
git merge update-sub
# Adding sub/foo
# Adding sub/bar
# CONFLICT (modify/delete): sub deleted in HEAD and modified in
update-sub. Version update-sub of sub left in tree at sub~update-sub.
# Automatic merge failed; fix conflicts and then commit the result.
git merge --abort
# error: 'sub' appears as both a file and as a directory
# error: sub: cannot drop to stage #0

- Orgad

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Submodule/contents conflict
  2017-04-24  8:06 Submodule/contents conflict Orgad Shaneh
@ 2017-04-24 17:40 ` Stefan Beller
  2017-04-24 23:33   ` Philip Oakley
  2017-04-25  2:12   ` Junio C Hamano
  0 siblings, 2 replies; 18+ messages in thread
From: Stefan Beller @ 2017-04-24 17:40 UTC (permalink / raw)
  To: Orgad Shaneh, Dakota Hawkins; +Cc: git

On Mon, Apr 24, 2017 at 1:06 AM, Orgad Shaneh <orgads@gmail.com> wrote:
> Hi,
>
> I've noticed a strange behavior with submodule/content conflict. My
> current Git version is 2.12.2, but the problem exists since I
> remember.
>
> Branch A has a submodule.
> In branch B which diverged from A, I replaced the submodule with its contents.
>
> Now, every time I merge A into B, and A had changed the submodule
> reference, all the files inside the ex-submodule directory in B are
> being "re-added".
>
> Moreover, aborting the merge prints an error, but seems to work
> nevertheless, and if I run git reset --hard all the files in that
> directory are actually written to the disk, even though they haven't
> changed at all.
>
> When the submodule is small, it might be ok. But in my project we have
> a huge submodule with ~16K files, and on each merge all the files are
> listed, and even mixed reset takes several minutes.
>

A similar bug report
https://public-inbox.org/git/CAG0BQX=wvpkJ=PQWV-NbmhuPV8yzvd_KYKzJmsfWq9xStZ2bnQ@mail.gmail.com/

"checkout --recurse-submodules" (as mentioned in that report)
made it into Git by now, but this bug goes unfixed, still.

> The following script demonstrates this:
> #!/bin/sh
>
> rm -rf super sub
> mkdir sub
> cd sub
> git init
> touch foo
> git add foo
> git commit -m 'Initial commit'
> mkdir ../super; cd ../super
> git init
> git submodule add ../sub
> touch foo; git add foo sub
> git commit -m 'Initial commit'
> git checkout -b update-sub
> git update-index --cacheinfo 160000,aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa,sub
> git commit -m 'Update submodule'
> git checkout -b remove-sub HEAD^
> git rm sub
> mkdir sub
> touch sub/foo sub/bar
> git add sub
> git commit -m 'Replaced submodule with contents'
> git checkout -b remove-2 HEAD^
> git merge --no-ff remove-sub
> git merge update-sub
> # Adding sub/foo
> # Adding sub/bar
> # CONFLICT (modify/delete): sub deleted in HEAD and modified in
> update-sub. Version update-sub of sub left in tree at sub~update-sub.
> # Automatic merge failed; fix conflicts and then commit the result.
> git merge --abort
> # error: 'sub' appears as both a file and as a directory
> # error: sub: cannot drop to stage #0
>
> - Orgad

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Submodule/contents conflict
  2017-04-24 17:40 ` Stefan Beller
@ 2017-04-24 23:33   ` Philip Oakley
  2017-04-24 23:43     ` Stefan Beller
  2017-04-25  2:12   ` Junio C Hamano
  1 sibling, 1 reply; 18+ messages in thread
From: Philip Oakley @ 2017-04-24 23:33 UTC (permalink / raw)
  To: Stefan Beller, Orgad Shaneh, Dakota Hawkins; +Cc: git

From: "Stefan Beller" <sbeller@google.com>
> On Mon, Apr 24, 2017 at 1:06 AM, Orgad Shaneh <orgads@gmail.com> wrote:
>> Hi,
>>
>> I've noticed a strange behavior with submodule/content conflict. My
>> current Git version is 2.12.2, but the problem exists since I
>> remember.
>>
>> Branch A has a submodule.
>> In branch B which diverged from A, I replaced the submodule with its 
>> contents.
>>
>> Now, every time I merge A into B, and A had changed the submodule
>> reference, all the files inside the ex-submodule directory in B are
>> being "re-added".
>>
>> Moreover, aborting the merge prints an error, but seems to work
>> nevertheless, and if I run git reset --hard all the files in that
>> directory are actually written to the disk, even though they haven't
>> changed at all.
>>

This is almost the same as just reported by 'vvs' [1] 
https://public-inbox.org/git/CAM1zWBtfgHT=pT0pidQo1HD=DfrXLG3gNaUvs0vZKvYfG1BHFw@mail.gmail.com/, 
originally on the 'git user' list 
https://groups.google.com/forum/?hl=en#!topic/git-users/9ziZ6yq-BfU

It also has a similarity to 
https://public-inbox.org/git/1492287435.14812.2.camel@gmail.com/  regarding 
how checkout operates.

It does feel as if there are two slightly different optimisations that could 
be used when the desired file pre-exists in the worktree, but isn't 
immediately known to the index.


>> When the submodule is small, it might be ok. But in my project we have
>> a huge submodule with ~16K files, and on each merge all the files are
>> listed, and even mixed reset takes several minutes.

That sounds like a wait that is not wanted!
>>
>
> A similar bug report
> https://public-inbox.org/git/CAG0BQX=wvpkJ=PQWV-NbmhuPV8yzvd_KYKzJmsfWq9xStZ2bnQ@mail.gmail.com/
>
> "checkout --recurse-submodules" (as mentioned in that report)
> made it into Git by now, but this bug goes unfixed, still.
>
>> The following script demonstrates this:
>> #!/bin/sh
>>
>> rm -rf super sub
>> mkdir sub
>> cd sub
>> git init
>> touch foo
>> git add foo
>> git commit -m 'Initial commit'
>> mkdir ../super; cd ../super
>> git init
>> git submodule add ../sub
>> touch foo; git add foo sub
>> git commit -m 'Initial commit'
>> git checkout -b update-sub
>> git update-index --cacheinfo 
>> 160000,aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa,sub
>> git commit -m 'Update submodule'
>> git checkout -b remove-sub HEAD^
>> git rm sub
>> mkdir sub
>> touch sub/foo sub/bar
>> git add sub
>> git commit -m 'Replaced submodule with contents'
>> git checkout -b remove-2 HEAD^
>> git merge --no-ff remove-sub
>> git merge update-sub
>> # Adding sub/foo
>> # Adding sub/bar
>> # CONFLICT (modify/delete): sub deleted in HEAD and modified in
>> update-sub. Version update-sub of sub left in tree at sub~update-sub.
>> # Automatic merge failed; fix conflicts and then commit the result.
>> git merge --abort
>> # error: 'sub' appears as both a file and as a directory
>> # error: sub: cannot drop to stage #0
>>
>> - Orgad
>
--
Philip 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Submodule/contents conflict
  2017-04-24 23:33   ` Philip Oakley
@ 2017-04-24 23:43     ` Stefan Beller
  2017-04-25  3:22       ` Jeff King
  2017-04-25 11:10       ` Philip Oakley
  0 siblings, 2 replies; 18+ messages in thread
From: Stefan Beller @ 2017-04-24 23:43 UTC (permalink / raw)
  To: Philip Oakley, Jeff King; +Cc: Orgad Shaneh, Dakota Hawkins, git

On Mon, Apr 24, 2017 at 4:33 PM, Philip Oakley <philipoakley@iee.org> wrote:

> This is almost the same as just reported by 'vvs' [1]
> https://public-inbox.org/git/CAM1zWBtfgHT=pT0pidQo1HD=DfrXLG3gNaUvs0vZKvYfG1BHFw@mail.gmail.com/,
> originally on the 'git user' list
> https://groups.google.com/forum/?hl=en#!topic/git-users/9ziZ6yq-BfU

For this issue, +cc Jeff King due to this pointer:

>> On the main list thare is a similar "issue" [1] regarding the expectation for `git checkout`,
>> and importantly (for me) these collected views regarding the "Git Data Protection and
>> Management Principles" is not within the Git documentation.
>
> Yes, that's an interesting point. What concerns me is that the commit
> c5326bd62b7e168ba1339dacb7ee812d0fe98c7c which introduced this
> into checkout isn't consistent with reset. Seems that nobody noticed this before.

It seems as if we'd want to see the code from
c5326bd62b7e168ba1339dacb7ee812d0fe98c7c
to be part of any worktree touching command, specifically reset?

> It also has a similarity to
> https://public-inbox.org/git/1492287435.14812.2.camel@gmail.com/  regarding
> how checkout operates.

This seems to be orthogonal to the original topic (no submodules, nor checkout
bugs, just a doc update?)


> It does feel as if there are two slightly different optimisations that could
> be used when the desired file pre-exists in the worktree, but isn't
> immediately known to the index.
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Submodule/contents conflict
  2017-04-24 17:40 ` Stefan Beller
  2017-04-24 23:33   ` Philip Oakley
@ 2017-04-25  2:12   ` Junio C Hamano
  2017-04-25 15:57     ` Stefan Beller
  1 sibling, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2017-04-25  2:12 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Orgad Shaneh, Dakota Hawkins, git

Stefan Beller <sbeller@google.com> writes:

> A similar bug report
> https://public-inbox.org/git/CAG0BQX=wvpkJ=PQWV-NbmhuPV8yzvd_KYKzJmsfWq9xStZ2bnQ@mail.gmail.com/
>
> "checkout --recurse-submodules" (as mentioned in that report)
> made it into Git by now, but this bug goes unfixed, still.

I do not mind reverting that merge out of 'master'.  Please holler
if you feel these "go recursive" topics are premature.

Thanks.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Submodule/contents conflict
  2017-04-24 23:43     ` Stefan Beller
@ 2017-04-25  3:22       ` Jeff King
  2017-04-25  3:39         ` Jeff King
  2017-04-27 22:52         ` Philip Oakley
  2017-04-25 11:10       ` Philip Oakley
  1 sibling, 2 replies; 18+ messages in thread
From: Jeff King @ 2017-04-25  3:22 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Philip Oakley, Orgad Shaneh, Dakota Hawkins, git

On Mon, Apr 24, 2017 at 04:43:28PM -0700, Stefan Beller wrote:

> >> On the main list thare is a similar "issue" [1] regarding the expectation for `git checkout`,
> >> and importantly (for me) these collected views regarding the "Git Data Protection and
> >> Management Principles" is not within the Git documentation.
> >
> > Yes, that's an interesting point. What concerns me is that the commit
> > c5326bd62b7e168ba1339dacb7ee812d0fe98c7c which introduced this
> > into checkout isn't consistent with reset. Seems that nobody noticed this before.
> 
> It seems as if we'd want to see the code from
> c5326bd62b7e168ba1339dacb7ee812d0fe98c7c
> to be part of any worktree touching command, specifically reset?

Note that that commit is just about "git checkout <commit> -- <paths>".
The matching reset command, "git reset <commit> -- <paths>" does handle
this the same way. Or at least I claimed so here:

  http://public-inbox.org/git/20141107081324.GA19845@peff.net/

and that is what I modeled the code in c5326bd62b after.

But note that none of that should ever affect _what_ gets checked out at
a file or content level. It may only affect the timestamps on the
resulting files. And I think those timestamps are not something Git
makes any promises about.

So if Git can elide a write and keep a timestamp up to date, that is a
great optimization and we should do that. But from an outside viewer's
perspective, we guarantee nothing. We might choose to rewrite a
stat-dirty file (updating its timestamp) rather than read its contents
to see if it might have the same content that we're about to write. And
the file may look stat dirty only because of the racy-git file/index
timestamp problem, even if it wasn't actually modified.

> > It also has a similarity to
> > https://public-inbox.org/git/1492287435.14812.2.camel@gmail.com/  regarding
> > how checkout operates.

I didn't look too deeply into this one, but it really looks like
somebody caring too much about when git needs to write (and I'd suspect
it's impacted by the racy-git thing, too).

-Peff

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Submodule/contents conflict
  2017-04-25  3:22       ` Jeff King
@ 2017-04-25  3:39         ` Jeff King
  2017-04-27 22:52         ` Philip Oakley
  1 sibling, 0 replies; 18+ messages in thread
From: Jeff King @ 2017-04-25  3:39 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Philip Oakley, Orgad Shaneh, Dakota Hawkins, git

On Mon, Apr 24, 2017 at 11:22:42PM -0400, Jeff King wrote:

> > > It also has a similarity to
> > > https://public-inbox.org/git/1492287435.14812.2.camel@gmail.com/  regarding
> > > how checkout operates.
> 
> I didn't look too deeply into this one, but it really looks like
> somebody caring too much about when git needs to write (and I'd suspect
> it's impacted by the racy-git thing, too).

Oh, sorry, I got this confused with:

  https://public-inbox.org/git/CAM1zWBtfgHT=pT0pidQo1HD=DfrXLG3gNaUvs0vZKvYfG1BHFw@mail.gmail.com/

which was mentioned earlier (and which I do think is just about
timestamps).

I don't think any of what I said is related to the doc update in

  https://public-inbox.org/git/1492287435.14812.2.camel@gmail.com/

which seems reasonable to me.

-Peff

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Submodule/contents conflict
  2017-04-24 23:43     ` Stefan Beller
  2017-04-25  3:22       ` Jeff King
@ 2017-04-25 11:10       ` Philip Oakley
  2017-04-26  2:51         ` Junio C Hamano
  1 sibling, 1 reply; 18+ messages in thread
From: Philip Oakley @ 2017-04-25 11:10 UTC (permalink / raw)
  To: Stefan Beller, Jeff King
  Cc: Orgad Shaneh, Dakota Hawkins, git, Christoph Michelbach

From: "Stefan Beller" <sbeller@google.com>
> On Mon, Apr 24, 2017 at 4:33 PM, Philip Oakley <philipoakley@iee.org> 
> wrote:
>
>> This is almost the same as just reported by 'vvs' [1]
>> https://public-inbox.org/git/CAM1zWBtfgHT=pT0pidQo1HD=DfrXLG3gNaUvs0vZKvYfG1BHFw@mail.gmail.com/,
>> originally on the 'git user' list
>> https://groups.google.com/forum/?hl=en#!topic/git-users/9ziZ6yq-BfU
>
> For this issue, +cc Jeff King due to this pointer:
>
>>> On the main list thare is a similar "issue" [1] regarding the 
>>> expectation for `git checkout`,
>>> and importantly (for me) these collected views regarding the "Git Data 
>>> Protection and
>>> Management Principles" is not within the Git documentation.
>>
>> Yes, that's an interesting point. What concerns me is that the commit
>> c5326bd62b7e168ba1339dacb7ee812d0fe98c7c which introduced this
>> into checkout isn't consistent with reset. Seems that nobody noticed this 
>> before.
>
> It seems as if we'd want to see the code from
> c5326bd62b7e168ba1339dacb7ee812d0fe98c7c
> to be part of any worktree touching command, specifically reset?
>
>> It also has a similarity to
>> https://public-inbox.org/git/1492287435.14812.2.camel@gmail.com/ 
>> regarding
>> how checkout operates.
>
> This seems to be orthogonal to the original topic (no submodules, nor 
> checkout
> bugs, just a doc update?)

At the moment that topic has ended up as an attempt at a documentation 
update, but I think it may not succeed ynless there is clarity about the 
(reported) underlying problem.

As I recall Christoph was using checkout to copy a file (e.g. a template 
file) from an older commit/revision into his worktree, and was suprised that 
this (git checkout <tree> <path>) also _staged_ the file, rather than simply 
letting it be in a modified/untracked state.

It in a sense is the same issue of relating the worktree content to the 
index status, though I think that checkout is acting as designed/intended. 
I don't think that there is a command (is ther?) to do what Christoph hoped 
for of not staging the new content (e.g. a template file), but the index 
knowing/remembering its stat details, just in case.

The submodule case appears to be more about deciding if the content of 
submodule's index should be considered for inclusion in the super project 
prior to the checkout of the other branch's tree that is already unchanged 
and in-place.

>
>
>> It does feel as if there are two slightly different optimisations that 
>> could
>> be used when the desired file pre-exists in the worktree, but isn't
>> immediately known to the index.
>>
>
--
Philip 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Submodule/contents conflict
  2017-04-25  2:12   ` Junio C Hamano
@ 2017-04-25 15:57     ` Stefan Beller
  0 siblings, 0 replies; 18+ messages in thread
From: Stefan Beller @ 2017-04-25 15:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Orgad Shaneh, Dakota Hawkins, git

On Mon, Apr 24, 2017 at 7:12 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> A similar bug report
>> https://public-inbox.org/git/CAG0BQX=wvpkJ=PQWV-NbmhuPV8yzvd_KYKzJmsfWq9xStZ2bnQ@mail.gmail.com/
>>
>> "checkout --recurse-submodules" (as mentioned in that report)
>> made it into Git by now, but this bug goes unfixed, still.
>
> I do not mind reverting that merge out of 'master'.  Please holler
> if you feel these "go recursive" topics are premature.

The bugs reported here are not for the recursive checkout,
I was merely noting it as a side effect of time having passed. :)

I did not imply the recursive topics being premature.

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Submodule/contents conflict
  2017-04-25 11:10       ` Philip Oakley
@ 2017-04-26  2:51         ` Junio C Hamano
  2017-04-26 17:41           ` Stefan Beller
  2017-04-27 22:07           ` Philip Oakley
  0 siblings, 2 replies; 18+ messages in thread
From: Junio C Hamano @ 2017-04-26  2:51 UTC (permalink / raw)
  To: Philip Oakley
  Cc: Stefan Beller, Jeff King, Orgad Shaneh, Dakota Hawkins, git,
	Christoph Michelbach

"Philip Oakley" <philipoakley@iee.org> writes:

> As I recall Christoph was using checkout to copy a file (e.g. a
> template file) from an older commit/revision into his worktree, and
> was suprised that this (git checkout <tree> <path>) also _staged_ the
> file, rather than simply letting it be in a modified/untracked state.

This probably is taking it even further than the original topic, but
I raise this weather-balloon to see if anybody is interested.

In the modern day, it might be useful if the "--working-tree-only"
mode added a new file as an intent-to-add entry to the index, but
that is not what "git apply (no other options)" (which is the gold
standard for command that operates on the working tree and/or on the
index) does, so it is not done in this patch.  IOW, if you grab a
path that does not exist in your index out of <tree-ish>, you will
write out an untracked file to the working tree.

-- >8 --
Subject: [PATCH] checkout: add --working-tree-only option

"git checkout <tree-ish> <pathspec>" has always copied the blob from
the tree-ish to the index before checking them out to the working tree.

Some users may want to grab a blob out of a tree-ish directly to the
working tree, without updating the index, so that "git diff" can be
used to assess the damage and adjust the file contents taken from a
different branch to be more appropriate for the current branch.

The new option "--working-tree-only" allows exactly that.

In the hindsight, when a command works on the working tree and/or
the index, the usual convention is:

 - with no other option, the command works only on the working tree;

 - with "--cached" option, the command works only on the index; and

 - with "--index" option, the command works on both the working tree
   and the index.

So we probably should have triggered the default behaviour under the
"--index" option, and triggered this "--working-tree-only" mode of
behaviour when "--index" option is not given.  From the same point
of view, "git checkout --cached <tree-ish> <pathspec>" would have
done the same as "git reset <tree-ish> <pathspec>" would do.  And
that may have made the command set a bit more consistent.

But that is merely a hindsight being 20/20, oh well.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-checkout.txt | 22 +++++++++++++++-------
 builtin/checkout.c             | 10 +++++++++-
 t/t2022-checkout-paths.sh      | 21 +++++++++++++++++++++
 3 files changed, 45 insertions(+), 8 deletions(-)

diff --git a/Documentation/git-checkout.txt b/Documentation/git-checkout.txt
index 8e2c0662dd..201677752e 100644
--- a/Documentation/git-checkout.txt
+++ b/Documentation/git-checkout.txt
@@ -14,6 +14,7 @@ SYNOPSIS
 'git checkout' [-q] [-f] [-m] [[-b|-B|--orphan] <new_branch>] [<start_point>]
 'git checkout' [-f|--ours|--theirs|-m|--conflict=<style>] [<tree-ish>] [--] <paths>...
 'git checkout' [-p|--patch] [<tree-ish>] [--] [<paths>...]
+'git checkout' --working-tree-only <tree-ish> [--] [<paths>...]
 
 DESCRIPTION
 -----------
@@ -81,13 +82,14 @@ Omitting <branch> detaches HEAD at the tip of the current branch.
 'git checkout' [-p|--patch] [<tree-ish>] [--] <pathspec>...::
 
 	When <paths> or `--patch` are given, 'git checkout' does *not*
-	switch branches.  It updates the named paths in the working tree
-	from the index file or from a named <tree-ish> (most often a
-	commit).  In this case, the `-b` and `--track` options are
-	meaningless and giving either of them results in an error.  The
-	<tree-ish> argument can be used to specify a specific tree-ish
-	(i.e.  commit, tag or tree) to update the index for the given
-	paths before updating the working tree.
+	switch branches.  In this case, the `-b` and `--track` options
+	are meaningless and giving either of them results in an error.
++
+The command checks out blobs for paths that match the given
+<pathspec> from the index to the working tree.  When an optional
+<tree-ish> is given, the blobs for paths that match the given
+<pathspec> are copied from the <tree-ish> to the index before
+they are checked out of the index.
 +
 'git checkout' with <paths> or `--patch` is used to restore modified or
 deleted paths to their original contents from the index or replace paths
@@ -101,6 +103,12 @@ specific side of the merge can be checked out of the index by
 using `--ours` or `--theirs`.  With `-m`, changes made to the working tree
 file can be discarded to re-create the original conflicted merge result.
 
+'git checkout' --working-tree-only <tree-ish> [--] <pathspec>...::
+	Similar to `git checkout <tree-ish> [--] <pathspec>`, but
+	the index file is left in the same state as it was before
+	running this command.
+
+
 OPTIONS
 -------
 -q::
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 9b2a5b31d4..d214e99521 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -37,6 +37,7 @@ struct checkout_opts {
 	int overwrite_ignore;
 	int ignore_skipworktree;
 	int ignore_other_worktrees;
+	int no_index;
 	int show_progress;
 
 	const char *new_branch;
@@ -268,6 +269,9 @@ static int checkout_paths(const struct checkout_opts *opts,
 		die(_("Cannot update paths and switch to branch '%s' at the same time."),
 		    opts->new_branch);
 
+	if (opts->no_index && !opts->source_tree)
+		die(_("'--working-tree-only' cannot be used without tree-ish"));
+
 	if (opts->patch_mode)
 		return run_add_interactive(revision, "--patch=checkout",
 					   &opts->pathspec);
@@ -370,7 +374,9 @@ static int checkout_paths(const struct checkout_opts *opts,
 		}
 	}
 
-	if (write_locked_index(&the_index, lock_file, COMMIT_LOCK))
+	if (opts->no_index)
+		; /* discard the in-core index */
+	else if (write_locked_index(&the_index, lock_file, COMMIT_LOCK))
 		die(_("unable to write new index file"));
 
 	read_ref_full("HEAD", 0, rev.hash, NULL);
@@ -1161,6 +1167,8 @@ int cmd_checkout(int argc, const char **argv, const char *prefix)
 		OPT_BOOL(0, "ignore-other-worktrees", &opts.ignore_other_worktrees,
 			 N_("do not check if another worktree is holding the given ref")),
 		OPT_BOOL(0, "progress", &opts.show_progress, N_("force progress reporting")),
+		OPT_BOOL(0, "working-tree-only", &opts.no_index, N_("checkout to working tree only without touching the index")),
+
 		OPT_END(),
 	};
 
diff --git a/t/t2022-checkout-paths.sh b/t/t2022-checkout-paths.sh
index f46d0499bc..8ea2e34c90 100755
--- a/t/t2022-checkout-paths.sh
+++ b/t/t2022-checkout-paths.sh
@@ -78,4 +78,25 @@ test_expect_success 'do not touch files that are already up-to-date' '
 	test_cmp expect actual
 '
 
+test_expect_success 'working-tree-only option leaves checked out files unadded' '
+	git reset --hard &&
+	git checkout -b pu next &&
+	echo another >>file1 &&
+	echo exists >file3 &&
+	git add file3 &&
+	git commit -a -m another &&
+	git checkout next &&
+
+	! grep another file1 &&
+	git checkout --working-tree-only pu file1 file3 &&
+	grep another file1 &&
+	test_must_fail git grep --cached another file1 &&
+
+	grep exists file3 &&
+	git ls-files file3 >actual &&
+	>expect &&
+	test_cmp expect actual
+
+'
+
 test_done
-- 
2.13.0-rc0-309-gb63395ed9e


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: Submodule/contents conflict
  2017-04-26  2:51         ` Junio C Hamano
@ 2017-04-26 17:41           ` Stefan Beller
  2017-04-27  0:25             ` Junio C Hamano
  2017-04-27  0:29             ` Junio C Hamano
  2017-04-27 22:07           ` Philip Oakley
  1 sibling, 2 replies; 18+ messages in thread
From: Stefan Beller @ 2017-04-26 17:41 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Philip Oakley, Jeff King, Orgad Shaneh, Dakota Hawkins, git,
	Christoph Michelbach

On Tue, Apr 25, 2017 at 7:51 PM, Junio C Hamano <gitster@pobox.com> wrote:
> "Philip Oakley" <philipoakley@iee.org> writes:
>
>> As I recall Christoph was using checkout to copy a file (e.g. a
>> template file) from an older commit/revision into his worktree, and
>> was suprised that this (git checkout <tree> <path>) also _staged_ the
>> file, rather than simply letting it be in a modified/untracked state.
>
> This probably is taking it even further than the original topic, but
> I raise this weather-balloon to see if anybody is interested.
>
> In the modern day, it might be useful if the "--working-tree-only"
> mode added a new file as an intent-to-add entry to the index, but
> that is not what "git apply (no other options)" (which is the gold
> standard for command that operates on the working tree and/or on the
> index) does, so it is not done in this patch.  IOW, if you grab a
> path that does not exist in your index out of <tree-ish>, you will
> write out an untracked file to the working tree.
>
> -- >8 --
> Subject: [PATCH] checkout: add --working-tree-only option
>
> "git checkout <tree-ish> <pathspec>" has always copied the blob from
> the tree-ish to the index before checking them out to the working tree.
>
> Some users may want to grab a blob out of a tree-ish directly to the
> working tree, without updating the index, so that "git diff" can be
> used to assess the damage and adjust the file contents taken from a
> different branch to be more appropriate for the current branch.

That makes sense for the in-repo-point-of-view.
I assumed a use case like this:

  A user may want to extract a file from a given tree-ish via
  GIT_WORK_TREE=/tmp/place git checkout <tree> -- <file>
  without modifying the repository (i.e. index) at all. For this
  we'd need an option to modify the working tree only.

> The new option "--working-tree-only" allows exactly that.
>
> In the hindsight, when a command works on the working tree and/or

s/the// ?

> the index, the usual convention is:
>
>  - with no other option, the command works only on the working tree;
>
>  - with "--cached" option, the command works only on the index; and
>
>  - with "--index" option, the command works on both the working tree
>    and the index.

I never realized this as a usual convention explicitly. Thanks for pointing
it out.

> So we probably should have triggered the default behaviour under the
> "--index" option, and triggered this "--working-tree-only" mode of
> behaviour when "--index" option is not given.  From the same point
> of view, "git checkout --cached <tree-ish> <pathspec>" would have
> done the same as "git reset <tree-ish> <pathspec>" would do.  And
> that may have made the command set a bit more consistent.
>
> But that is merely a hindsight being 20/20, oh well.
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  Documentation/git-checkout.txt | 22 +++++++++++++++-------
>  builtin/checkout.c             | 10 +++++++++-
>  t/t2022-checkout-paths.sh      | 21 +++++++++++++++++++++
>  3 files changed, 45 insertions(+), 8 deletions(-)
>
> diff --git a/Documentation/git-checkout.txt b/Documentation/git-checkout.txt
> index 8e2c0662dd..201677752e 100644
> --- a/Documentation/git-checkout.txt
> +++ b/Documentation/git-checkout.txt
> @@ -14,6 +14,7 @@ SYNOPSIS
>  'git checkout' [-q] [-f] [-m] [[-b|-B|--orphan] <new_branch>] [<start_point>]
>  'git checkout' [-f|--ours|--theirs|-m|--conflict=<style>] [<tree-ish>] [--] <paths>...
>  'git checkout' [-p|--patch] [<tree-ish>] [--] [<paths>...]
> +'git checkout' --working-tree-only <tree-ish> [--] [<paths>...]
>
>  DESCRIPTION
>  -----------
> @@ -81,13 +82,14 @@ Omitting <branch> detaches HEAD at the tip of the current branch.
>  'git checkout' [-p|--patch] [<tree-ish>] [--] <pathspec>...::
>
>         When <paths> or `--patch` are given, 'git checkout' does *not*
> -       switch branches.  It updates the named paths in the working tree
> -       from the index file or from a named <tree-ish> (most often a
> -       commit).  In this case, the `-b` and `--track` options are
> -       meaningless and giving either of them results in an error.  The
> -       <tree-ish> argument can be used to specify a specific tree-ish
> -       (i.e.  commit, tag or tree) to update the index for the given
> -       paths before updating the working tree.
> +       switch branches.  In this case, the `-b` and `--track` options
> +       are meaningless and giving either of them results in an error.
> ++
> +The command checks out blobs for paths that match the given
> +<pathspec> from the index to the working tree.  When an optional
> +<tree-ish> is given, the blobs for paths that match the given
> +<pathspec> are copied from the <tree-ish> to the index before
> +they are checked out of the index.
>  +
>  'git checkout' with <paths> or `--patch` is used to restore modified or
>  deleted paths to their original contents from the index or replace paths
> @@ -101,6 +103,12 @@ specific side of the merge can be checked out of the index by
>  using `--ours` or `--theirs`.  With `-m`, changes made to the working tree
>  file can be discarded to re-create the original conflicted merge result.
>
> +'git checkout' --working-tree-only <tree-ish> [--] <pathspec>...::
> +       Similar to `git checkout <tree-ish> [--] <pathspec>`, but
> +       the index file is left in the same state as it was before
> +       running this command.

Adding this as a new mode seems like a "patch after the fact",
whereas the wording hints that this may be included in the prior
part, but I find it hard to come up with a good description there.

> diff --git a/builtin/checkout.c b/builtin/checkout.c
> index 9b2a5b31d4..d214e99521 100644
> --- a/builtin/checkout.c
> +++ b/builtin/checkout.c
> @@ -37,6 +37,7 @@ struct checkout_opts {
>         int overwrite_ignore;
>         int ignore_skipworktree;
>         int ignore_other_worktrees;
> +       int no_index;
>         int show_progress;
>
>         const char *new_branch;
> @@ -268,6 +269,9 @@ static int checkout_paths(const struct checkout_opts *opts,
>                 die(_("Cannot update paths and switch to branch '%s' at the same time."),
>                     opts->new_branch);
>
> +       if (opts->no_index && !opts->source_tree)
> +               die(_("'--working-tree-only' cannot be used without tree-ish"));

double negation, maybe:

  "--working-tree-only requires tree-ish"

> @@ -370,7 +374,9 @@ static int checkout_paths(const struct checkout_opts *opts,
>                 }
>         }
>
> -       if (write_locked_index(&the_index, lock_file, COMMIT_LOCK))
> +       if (opts->no_index)
> +               ; /* discard the in-core index */
> +       else if (write_locked_index(&the_index, lock_file, COMMIT_LOCK))
>                 die(_("unable to write new index file"));
>
>         read_ref_full("HEAD", 0, rev.hash, NULL);
> @@ -1161,6 +1167,8 @@ int cmd_checkout(int argc, const char **argv, const char *prefix)
>                 OPT_BOOL(0, "ignore-other-worktrees", &opts.ignore_other_worktrees,
>                          N_("do not check if another worktree is holding the given ref")),
>                 OPT_BOOL(0, "progress", &opts.show_progress, N_("force progress reporting")),
> +               OPT_BOOL(0, "working-tree-only", &opts.no_index, N_("checkout to working tree only without touching the index")),
> +

nit: no need for extra empty line here.

> +test_expect_success 'working-tree-only option leaves checked out files unadded' '
> +       git reset --hard &&
> +       git checkout -b pu next &&
> +       echo another >>file1 &&
> +       echo exists >file3 &&
> +       git add file3 &&
> +       git commit -a -m another &&
> +       git checkout next &&

Up to here it is all preparation; I started to give an argument
on why using "another" for both the commit message and the file content
was suboptimal, but I was wrong. This seems to be best after some consideration.

The next paragraph checks for
'working-tree-only option populates the working tree, but doesn't touch index'

> +       ! grep another file1 &&
> +       git checkout --working-tree-only pu file1 file3 &&
> +       grep another file1 &&
> +       test_must_fail git grep --cached another file1 &&

but only for file1, whereas the next paragraph checks it for file3.

> +       grep exists file3 &&
> +       git ls-files file3 >actual &&
> +       >expect &&
> +       test_cmp expect actual

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Submodule/contents conflict
  2017-04-26 17:41           ` Stefan Beller
@ 2017-04-27  0:25             ` Junio C Hamano
  2017-04-27  0:29             ` Junio C Hamano
  1 sibling, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2017-04-27  0:25 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Philip Oakley, Jeff King, Orgad Shaneh, Dakota Hawkins, git,
	Christoph Michelbach

Stefan Beller <sbeller@google.com> writes:

> I assumed a use case like this:
>
>   A user may want to extract a file from a given tree-ish via
>   GIT_WORK_TREE=/tmp/place git checkout <tree> -- <file>
>   without modifying the repository (i.e. index) at all. For this
>   we'd need an option to modify the working tree only.

I somehow do not see that as a very useful use case.  It is not just
keeping the index, but it is depositing the file also out of the
working tree, and if you do not want to work with the resulting
<file> (actually, if it is a single file, "cat-file -t blob" is the
most natural thing to use for such an operation, so you should say
<pathspec> that may match multiple files and perhaps a collection of
directories) within the context of your working tree, I would think
that "archive | tar xf -" is a better choice.

The feature is more about in-tree case.  You clobber what you did in
your working tree by extracting a pristine copy out of some known
tree, which may not be HEAD.  And if the tree is not HEAD, you would
want "git diff <pathspec>" would give a useful preview of what would
happen when you do "git add" them.  As "checkout" adds to the index
before updating the working tree (which can be argued as an ancient
design mistake, given that we typically do worktree-only by default
and have "--index/--cached" to work with the index, e.g. "apply",
"grep"), the user needs to do "git diff HEAD <pathspec>" instead in
this case.

> I never realized this as a usual convention explicitly. Thanks for pointing
> it out.

I think it is somewhere in gitcli(7).

>> +       if (opts->no_index && !opts->source_tree)
>> +               die(_("'--working-tree-only' cannot be used without tree-ish"));
>
> double negation, maybe:
>
>   "--working-tree-only requires tree-ish"

Much more concise.  Thanks.

>> +test_expect_success 'working-tree-only option leaves checked out files unadded' '
>> +       git reset --hard &&
>> +       git checkout -b pu next &&
>> +       echo another >>file1 &&
>> +       echo exists >file3 &&
>> +       git add file3 &&
>> +       git commit -a -m another &&
>> +       git checkout next &&
>
> Up to here it is all preparation; I started to give an argument
> on why using "another" for both the commit message and the file content
> was suboptimal, but I was wrong. This seems to be best after some consideration.

It does add to the mental burden.  Will think of a pair of different
words and phrases.

> The next paragraph checks for
> 'working-tree-only option populates the working tree, but doesn't touch index'
>
>> +       ! grep another file1 &&
>> +       git checkout --working-tree-only pu file1 file3 &&
>> +       grep another file1 &&
>> +       test_must_fail git grep --cached another file1 &&
>
> but only for file1, whereas the next paragraph checks it for file3.
>
>> +       grep exists file3 &&
>> +       git ls-files file3 >actual &&
>> +       >expect &&
>> +       test_cmp expect actual

Yes.  This demonstrates that paths only in tree-ish is not added to
the index (i.e. ls-files does not show it, because HEAD and the
original index didn't have it).  If you did the same "grep --cached"
that expects a non-match, you cannot tell if we added the path with
the --intent-to-add or if we didn't add it at all, so it deliberately
checks different thing from the case for file1.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Submodule/contents conflict
  2017-04-26 17:41           ` Stefan Beller
  2017-04-27  0:25             ` Junio C Hamano
@ 2017-04-27  0:29             ` Junio C Hamano
  1 sibling, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2017-04-27  0:29 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Philip Oakley, Jeff King, Orgad Shaneh, Dakota Hawkins, git,
	Christoph Michelbach

Stefan Beller <sbeller@google.com> writes:

>> +'git checkout' --working-tree-only <tree-ish> [--] <pathspec>...::
>> +       Similar to `git checkout <tree-ish> [--] <pathspec>`, but
>> +       the index file is left in the same state as it was before
>> +       running this command.
>
> Adding this as a new mode seems like a "patch after the fact",
> whereas the wording hints that this may be included in the prior
> part, but I find it hard to come up with a good description there.

Yes, having three distinct modes of operation covered in a single
entry makes the description unnecessarily messy, as you have to say
"generally these things happen, but if you use X then additionally
this happens before that, and if you do not use Y then this happens
instead".  We _might_ want to separate the [-p|--patch] mode out of
the main one, making these into three entries, for this reason.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Submodule/contents conflict
  2017-04-26  2:51         ` Junio C Hamano
  2017-04-26 17:41           ` Stefan Beller
@ 2017-04-27 22:07           ` Philip Oakley
  2017-04-28  2:08             ` Junio C Hamano
  1 sibling, 1 reply; 18+ messages in thread
From: Philip Oakley @ 2017-04-27 22:07 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Stefan Beller, Jeff King, Orgad Shaneh, Dakota Hawkins, git,
	Christoph Michelbach

From: "Junio C Hamano" <gitster@pobox.com> Sent: Wednesday, April 26, 2017 
3:51 AM
> "Philip Oakley" <philipoakley@iee.org> writes:
>
>> As I recall Christoph was using checkout to copy a file (e.g. a
>> template file) from an older commit/revision into his worktree, and
>> was suprised that this (git checkout <tree> <path>) also _staged_ the
>> file, rather than simply letting it be in a modified/untracked state.
>
> This probably is taking it even further than the original topic, but
> I raise this weather-balloon to see if anybody is interested.
>
> In the modern day, it might be useful if the "--working-tree-only"
> mode added a new file as an intent-to-add entry to the index, but
> that is not what "git apply (no other options)" (which is the gold

did you mean `git add` ? Or am I missing something.

> standard for command that operates on the working tree and/or on the
> index) does, so it is not done in this patch.  IOW, if you grab a
> path that does not exist in your index out of <tree-ish>, you will
> write out an untracked file to the working tree.

It sound like a good idea, as I wasn't aware of another easy way of doing 
it.

>
> -- >8 --
> Subject: [PATCH] checkout: add --working-tree-only option
>
> "git checkout <tree-ish> <pathspec>" has always copied the blob from
> the tree-ish to the index before checking them out to the working tree.
>
> Some users may want to grab a blob out of a tree-ish directly to the
> working tree, without updating the index, so that "git diff" can be
> used to assess the damage and adjust the file contents taken from a
> different branch to be more appropriate for the current branch.
>
> The new option "--working-tree-only" allows exactly that.
>
> In the hindsight, when a command works on the working tree and/or
> the index, the usual convention is:
>
> - with no other option, the command works only on the working tree;
>
> - with "--cached" option, the command works only on the index; and
>
> - with "--index" option, the command works on both the working tree
>   and the index.
>
> So we probably should have triggered the default behaviour under the
> "--index" option, and triggered this "--working-tree-only" mode of
> behaviour when "--index" option is not given.  From the same point
> of view, "git checkout --cached <tree-ish> <pathspec>" would have
> done the same as "git reset <tree-ish> <pathspec>" would do.  And
> that may have made the command set a bit more consistent.
>
> But that is merely a hindsight being 20/20, oh well.
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
> Documentation/git-checkout.txt | 22 +++++++++++++++-------
> builtin/checkout.c             | 10 +++++++++-
> t/t2022-checkout-paths.sh      | 21 +++++++++++++++++++++
> 3 files changed, 45 insertions(+), 8 deletions(-)
>
> diff --git a/Documentation/git-checkout.txt 
> b/Documentation/git-checkout.txt
> index 8e2c0662dd..201677752e 100644
> --- a/Documentation/git-checkout.txt
> +++ b/Documentation/git-checkout.txt
> @@ -14,6 +14,7 @@ SYNOPSIS
> 'git checkout' [-q] [-f] [-m] [[-b|-B|--orphan] <new_branch>] 
> [<start_point>]
> 'git checkout' [-f|--ours|--theirs|-m|--conflict=<style>] [<tree-ish>] 
> [--] <paths>...
> 'git checkout' [-p|--patch] [<tree-ish>] [--] [<paths>...]
> +'git checkout' --working-tree-only <tree-ish> [--] [<paths>...]
>
> DESCRIPTION
> -----------
> @@ -81,13 +82,14 @@ Omitting <branch> detaches HEAD at the tip of the 
> current branch.
> 'git checkout' [-p|--patch] [<tree-ish>] [--] <pathspec>...::
>
>  When <paths> or `--patch` are given, 'git checkout' does *not*
> - switch branches.  It updates the named paths in the working tree
> - from the index file or from a named <tree-ish> (most often a
> - commit).  In this case, the `-b` and `--track` options are
> - meaningless and giving either of them results in an error.  The
> - <tree-ish> argument can be used to specify a specific tree-ish
> - (i.e.  commit, tag or tree) to update the index for the given
> - paths before updating the working tree.
> + switch branches.  In this case, the `-b` and `--track` options
> + are meaningless and giving either of them results in an error.
> ++
> +The command checks out blobs for paths that match the given
> +<pathspec> from the index to the working tree.  When an optional
> +<tree-ish> is given, the blobs for paths that match the given
> +<pathspec> are copied from the <tree-ish> to the index before
> +they are checked out of the index.
> +
> 'git checkout' with <paths> or `--patch` is used to restore modified or
> deleted paths to their original contents from the index or replace paths
> @@ -101,6 +103,12 @@ specific side of the merge can be checked out of the 
> index by
> using `--ours` or `--theirs`.  With `-m`, changes made to the working tree
> file can be discarded to re-create the original conflicted merge result.
>
> +'git checkout' --working-tree-only <tree-ish> [--] <pathspec>...::
> + Similar to `git checkout <tree-ish> [--] <pathspec>`, but
> + the index file is left in the same state as it was before
> + running this command.

I feel that the docs should also contain a little of the commit message 
highlighting that `This complements the usual convention of "--cached" and 
"--index" options for other commands.`, and would pick up on Stefan's "I 
didn't know that" response - A little education of the reader goes a long 
way, maybe ;-)

> +
> +
> OPTIONS
> -------
> -q::
> diff --git a/builtin/checkout.c b/builtin/checkout.c
> index 9b2a5b31d4..d214e99521 100644
> --- a/builtin/checkout.c
> +++ b/builtin/checkout.c
> @@ -37,6 +37,7 @@ struct checkout_opts {
>  int overwrite_ignore;
>  int ignore_skipworktree;
>  int ignore_other_worktrees;
> + int no_index;
>  int show_progress;
>
>  const char *new_branch;
> @@ -268,6 +269,9 @@ static int checkout_paths(const struct checkout_opts 
> *opts,
>  die(_("Cannot update paths and switch to branch '%s' at the same time."),
>      opts->new_branch);
>
> + if (opts->no_index && !opts->source_tree)
> + die(_("'--working-tree-only' cannot be used without tree-ish"));
> +
>  if (opts->patch_mode)
>  return run_add_interactive(revision, "--patch=checkout",
>     &opts->pathspec);
> @@ -370,7 +374,9 @@ static int checkout_paths(const struct checkout_opts 
> *opts,
>  }
>  }
>
> - if (write_locked_index(&the_index, lock_file, COMMIT_LOCK))
> + if (opts->no_index)
> + ; /* discard the in-core index */
> + else if (write_locked_index(&the_index, lock_file, COMMIT_LOCK))
>  die(_("unable to write new index file"));
>
>  read_ref_full("HEAD", 0, rev.hash, NULL);
> @@ -1161,6 +1167,8 @@ int cmd_checkout(int argc, const char **argv, const 
> char *prefix)
>  OPT_BOOL(0, "ignore-other-worktrees", &opts.ignore_other_worktrees,
>  N_("do not check if another worktree is holding the given ref")),
>  OPT_BOOL(0, "progress", &opts.show_progress, N_("force progress 
> reporting")),
> + OPT_BOOL(0, "working-tree-only", &opts.no_index, N_("checkout to working 
> tree only without touching the index")),
> +
>  OPT_END(),
>  };
>
> diff --git a/t/t2022-checkout-paths.sh b/t/t2022-checkout-paths.sh
> index f46d0499bc..8ea2e34c90 100755
> --- a/t/t2022-checkout-paths.sh
> +++ b/t/t2022-checkout-paths.sh
> @@ -78,4 +78,25 @@ test_expect_success 'do not touch files that are 
> already up-to-date' '
>  test_cmp expect actual
> '
>
> +test_expect_success 'working-tree-only option leaves checked out files 
> unadded' '
> + git reset --hard &&
> + git checkout -b pu next &&
> + echo another >>file1 &&
> + echo exists >file3 &&
> + git add file3 &&
> + git commit -a -m another &&
> + git checkout next &&
> +
> + ! grep another file1 &&
> + git checkout --working-tree-only pu file1 file3 &&
> + grep another file1 &&
> + test_must_fail git grep --cached another file1 &&
> +
> + grep exists file3 &&
> + git ls-files file3 >actual &&
> + >expect &&
> + test_cmp expect actual
> +
> +'
> +
> test_done
> -- 
> 2.13.0-rc0-309-gb63395ed9e
>
--
Thanks
Philip 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Submodule/contents conflict
  2017-04-25  3:22       ` Jeff King
  2017-04-25  3:39         ` Jeff King
@ 2017-04-27 22:52         ` Philip Oakley
  2017-04-28  8:30           ` Jeff King
  1 sibling, 1 reply; 18+ messages in thread
From: Philip Oakley @ 2017-04-27 22:52 UTC (permalink / raw)
  To: Jeff King, Stefan Beller; +Cc: Orgad Shaneh, Dakota Hawkins, git

From: "Jeff King" <peff@peff.net> Sent: Tuesday, April 25, 2017 4:22 AM

Just catching up - sorry it's late..
> On Mon, Apr 24, 2017 at 04:43:28PM -0700, Stefan Beller wrote:
>
>> >> On the main list thare is a similar "issue" [1] regarding the 
>> >> expectation for `git checkout`,
>> >> and importantly (for me) these collected views regarding the "Git Data 
>> >> Protection and
>> >> Management Principles" is not within the Git documentation.
>> >
>> > Yes, that's an interesting point. What concerns me is that the commit
>> > c5326bd62b7e168ba1339dacb7ee812d0fe98c7c which introduced this
>> > into checkout isn't consistent with reset. Seems that nobody noticed 
>> > this before.
>>
>> It seems as if we'd want to see the code from
>> c5326bd62b7e168ba1339dacb7ee812d0fe98c7c
>> to be part of any worktree touching command, specifically reset?
>
> Note that that commit is just about "git checkout <commit> -- <paths>".
> The matching reset command, "git reset <commit> -- <paths>" does handle
> this the same way. Or at least I claimed so here:
>
>  http://public-inbox.org/git/20141107081324.GA19845@peff.net/
>
> and that is what I modeled the code in c5326bd62b after.
>
> But note that none of that should ever affect _what_ gets checked out at
> a file or content level. It may only affect the timestamps on the
> resulting files. And I think those timestamps are not something Git
> makes any promises about.

It's not actually clear to users what Git promises in cases like this which 
confuses user expectations - the make file issue does appear to come up 
quite often.
>
> So if Git can elide a write and keep a timestamp up to date, that is a
> great optimization and we should do that. But from an outside viewer's
> perspective, we guarantee nothing. We might choose to rewrite a
> stat-dirty file (updating its timestamp) rather than read its contents
> to see if it might have the same content that we're about to write. And
> the file may look stat dirty only because of the racy-git file/index
> timestamp problem, even if it wasn't actually modified.

I'm not sure how in this case we would get the stat-dirty state? We should 
be able to determine that the file has existed, as originally checked out, 
for some while (i.e. past the racy FS time) and is unmodified, so that as 
long as the original checkout OID and the required new OID are the same we 
should be able to avoid the file overwrite (or merge).

It would require that the now time be used as a stand-in for the new OID 
file's stat time (given the object store doesn't store date stamps for 
files) to check for the racy-git situation. These negative information 
scenarios can be tricky.
>
>> > It also has a similarity to
>> > https://public-inbox.org/git/1492287435.14812.2.camel@gmail.com/ 
>> > regarding
>> > how checkout operates.
>
> I didn't look too deeply into this one, but it really looks like
> somebody caring too much about when git needs to write (and I'd suspect
> it's impacted by the racy-git thing, too).
>
> -Peff
>
--
Philip 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Submodule/contents conflict
  2017-04-27 22:07           ` Philip Oakley
@ 2017-04-28  2:08             ` Junio C Hamano
  0 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2017-04-28  2:08 UTC (permalink / raw)
  To: Philip Oakley
  Cc: Stefan Beller, Jeff King, Orgad Shaneh, Dakota Hawkins, git,
	Christoph Michelbach

"Philip Oakley" <philipoakley@iee.org> writes:

>> In the modern day, it might be useful if the "--working-tree-only"
>> mode added a new file as an intent-to-add entry to the index, but
>> that is not what "git apply (no other options)" (which is the gold
>
> did you mean `git add` ? Or am I missing something.

I did mean "git apply" that applies the patch (0) only to the
working tree by default, (1) to the index without touching the
working tree with the "--cached" option and (2) both to the working
tree and the index with the "--index" option.  As "git add" is all
about the index, having these three modes wouldn't make much sense
there.

>> +'git checkout' --working-tree-only <tree-ish> [--] <pathspec>...::
>> + Similar to `git checkout <tree-ish> [--] <pathspec>`, but
>> + the index file is left in the same state as it was before
>> + running this command.
>
> I feel that the docs should also contain a little of the commit
> message highlighting that `This complements the usual convention of
> "--cached" and "--index" options for other commands.`, and would pick
> up on Stefan's "I didn't know that" response - A little education of
> the reader goes a long way, maybe ;-)

I do not think "If we were building Git from scratch today without
having any existing user of `git checkout`, we probably would have
made this to be the default mode of operation, and instead required
the user to say `git checkout --index <tree-ish> <pathspec>' if the
user wants checkout to affect both the index and the working tree,
to be more consistent with other parts of the system" would help the
readers of the current system very much, even if we were promising
that such a more consistent system may come in a future (and we are
not, at least not yet, with this addition).

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Submodule/contents conflict
  2017-04-27 22:52         ` Philip Oakley
@ 2017-04-28  8:30           ` Jeff King
  2017-05-01  0:15             ` Junio C Hamano
  0 siblings, 1 reply; 18+ messages in thread
From: Jeff King @ 2017-04-28  8:30 UTC (permalink / raw)
  To: Philip Oakley; +Cc: Stefan Beller, Orgad Shaneh, Dakota Hawkins, git

On Thu, Apr 27, 2017 at 11:52:24PM +0100, Philip Oakley wrote:

> > But note that none of that should ever affect _what_ gets checked out at
> > a file or content level. It may only affect the timestamps on the
> > resulting files. And I think those timestamps are not something Git
> > makes any promises about.
> 
> It's not actually clear to users what Git promises in cases like this which
> confuses user expectations - the make file issue does appear to come up
> quite often.

Yeah, I wouldn't mind a documentation update there. I have no idea where
to put it that people would find it, though.

> > So if Git can elide a write and keep a timestamp up to date, that is a
> > great optimization and we should do that. But from an outside viewer's
> > perspective, we guarantee nothing. We might choose to rewrite a
> > stat-dirty file (updating its timestamp) rather than read its contents
> > to see if it might have the same content that we're about to write. And
> > the file may look stat dirty only because of the racy-git file/index
> > timestamp problem, even if it wasn't actually modified.
> 
> I'm not sure how in this case we would get the stat-dirty state? We should
> be able to determine that the file has existed, as originally checked out,
> for some while (i.e. past the racy FS time) and is unmodified, so that as
> long as the original checkout OID and the required new OID are the same we
> should be able to avoid the file overwrite (or merge).
> 
> It would require that the now time be used as a stand-in for the new OID
> file's stat time (given the object store doesn't store date stamps for
> files) to check for the racy-git situation. These negative information
> scenarios can be tricky.

I don't think the "now" time during the read is relevant for racy git.
The problem is the timestamp of the index versus the timestamp of the
file itself. So updating the index in the same second the file was
touched (like a test script often does):

  echo foo >bar && git add bar

will have a racily-dirty entry, and a subsequent index refresh will
re-read the file. You can verify this with strace:

  $ echo foo >bar && git add bar
  $ strace git update-index --refresh 2>&1 | grep '"bar"'
  lstat("bar", {st_mode=S_IFREG|0644, st_size=4, ...}) = 0
  open("bar", O_RDONLY|O_CLOEXEC)         = 4

  [now update the index so it has a later timestamp]
  $ sleep 1
  $ echo foo >unrelated && git add unrelated
  $ strace  git update-index --refresh 2>&1 | grep '"bar"'
  lstat("bar", {st_mode=S_IFREG|0644, st_size=4, ...}) = 0

Interestingly, I would have thought that the first update-index call
would "de-racify" the entry by rewriting the index. But we don't
actually write, presumably because we eventually realize that there are
no entries to update. But it might actually be worth doing the write,
because it avoids further file-content reads later on (and most
workflows tend to do a lot of reads; every git-status is going to rehash
the file until the next index update).

-Peff

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Submodule/contents conflict
  2017-04-28  8:30           ` Jeff King
@ 2017-05-01  0:15             ` Junio C Hamano
  0 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2017-05-01  0:15 UTC (permalink / raw)
  To: Jeff King; +Cc: Philip Oakley, Stefan Beller, Orgad Shaneh, Dakota Hawkins, git

Jeff King <peff@peff.net> writes:

> Interestingly, I would have thought that the first update-index call
> would "de-racify" the entry by rewriting the index. But we don't
> actually write, presumably because we eventually realize that there are
> no entries to update. But it might actually be worth doing the write,
> because it avoids further file-content reads later on (and most
> workflows tend to do a lot of reads; every git-status is going to rehash
> the file until the next index update).

Yeah, there is a tradeoff of time being spent on writing the index
(which could be large) and having to rehash the content (which could
also have to happen number of times until the next index writeout),
and in hindsight I suspect that I got the tradeoff wrong when we did
the racy-git-avoidance thing.


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2017-05-01  0:15 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-24  8:06 Submodule/contents conflict Orgad Shaneh
2017-04-24 17:40 ` Stefan Beller
2017-04-24 23:33   ` Philip Oakley
2017-04-24 23:43     ` Stefan Beller
2017-04-25  3:22       ` Jeff King
2017-04-25  3:39         ` Jeff King
2017-04-27 22:52         ` Philip Oakley
2017-04-28  8:30           ` Jeff King
2017-05-01  0:15             ` Junio C Hamano
2017-04-25 11:10       ` Philip Oakley
2017-04-26  2:51         ` Junio C Hamano
2017-04-26 17:41           ` Stefan Beller
2017-04-27  0:25             ` Junio C Hamano
2017-04-27  0:29             ` Junio C Hamano
2017-04-27 22:07           ` Philip Oakley
2017-04-28  2:08             ` Junio C Hamano
2017-04-25  2:12   ` Junio C Hamano
2017-04-25 15:57     ` Stefan Beller

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).