git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] reset: unstage empty deleted ita files
@ 2019-07-12 15:02 Varun Naik
  2019-07-26  4:48 ` [PATCH v2] " Varun Naik
  2019-08-01 16:15 ` [PATCH v3] diff-lib.c: handle " Varun Naik
  0 siblings, 2 replies; 11+ messages in thread
From: Varun Naik @ 2019-07-12 15:02 UTC (permalink / raw)
  To: git; +Cc: Varun Naik

It is possible to delete a committed file from the index and then add it
as intent-to-add. After `git reset HEAD`, the file should be identical
in the index and HEAD. This patch provides the desired behavior even
when the file is empty in the index.

Signed-off-by: Varun Naik <vcnaik94@gmail.com>
---
My first patch!

According to an old thread on the mailing list[0], this issue is known,
but it looks like people overlooked the fix that I'm making. Basically,
this one line causes diff-lib.c:do_oneway_diff() to call show_modified()
instead of diff_index_show_file() for ita files.

An alternative to the last line of the test code is running `git status
--porcelain=v2` and then inspecting the output. I can adjust that if
necessary.

Unfortunately, "restore" has a similar problem, but it follows a
different code path into read-cache.c rather than diff-lib.c, and there
does not seem to be a similar one-line fix.

[0]: https://public-inbox.org/git/CACsJy8Bov1asw+_J_fbhKqigM==xNPi8itDGkhibkYVch4pvmQ@mail.gmail.com/

 builtin/reset.c  |  1 +
 t/t7102-reset.sh | 11 +++++++++++
 2 files changed, 12 insertions(+)

diff --git a/builtin/reset.c b/builtin/reset.c
index 26ef9a7bd0..47a088f4b7 100644
--- a/builtin/reset.c
+++ b/builtin/reset.c
@@ -163,6 +163,7 @@ static int read_from_tree(const struct pathspec *pathspec,
 	opt.format_callback_data = &intent_to_add;
 	opt.flags.override_submodule_config = 1;
 	opt.repo = the_repository;
+	opt.ita_invisible_in_index = 1;
 
 	if (do_diff_cache(tree_oid, &opt))
 		return 1;
diff --git a/t/t7102-reset.sh b/t/t7102-reset.sh
index 97be0d968d..250fc1c890 100755
--- a/t/t7102-reset.sh
+++ b/t/t7102-reset.sh
@@ -566,4 +566,15 @@ test_expect_success 'reset --mixed sets up work tree' '
 	test_must_be_empty actual
 '
 
+test_expect_success 'reset --mixed removes deleted intent-to-add file from index' '
+	echo "nonempty" >nonempty &&
+	>empty &&
+	git add nonempty empty &&
+	git commit -m "create files to be deleted" &&
+	git rm --cached nonempty empty &&
+	git add -N nonempty empty &&
+	git reset HEAD nonempty empty &&
+	git diff --staged --exit-code
+'
+
 test_done
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2] reset: unstage empty deleted ita files
  2019-07-12 15:02 [PATCH] reset: unstage empty deleted ita files Varun Naik
@ 2019-07-26  4:48 ` Varun Naik
  2019-07-26 18:19   ` Junio C Hamano
  2019-08-01 16:15 ` [PATCH v3] diff-lib.c: handle " Varun Naik
  1 sibling, 1 reply; 11+ messages in thread
From: Varun Naik @ 2019-07-26  4:48 UTC (permalink / raw)
  To: vcnaik94; +Cc: git, pclouds

It is possible to delete a committed file from the index and then add it
as intent-to-add. After `git reset HEAD`, the file should be identical
in the index and HEAD. This patch provides the desired behavior even
when the file is empty in the index.

Signed-off-by: Varun Naik <vcnaik94@gmail.com>
---
CC Duy because you seem to be the one who is most involved in changes to
ita behavior.

The test description was incorrect, so I changed that now. I also
figured out how to fix the related problem in "restore"; separate patch
coming soon.

 builtin/reset.c  |  1 +
 t/t7102-reset.sh | 11 +++++++++++
 2 files changed, 12 insertions(+)

diff --git a/builtin/reset.c b/builtin/reset.c
index 26ef9a7bd0..47a088f4b7 100644
--- a/builtin/reset.c
+++ b/builtin/reset.c
@@ -163,6 +163,7 @@ static int read_from_tree(const struct pathspec *pathspec,
 	opt.format_callback_data = &intent_to_add;
 	opt.flags.override_submodule_config = 1;
 	opt.repo = the_repository;
+	opt.ita_invisible_in_index = 1;
 
 	if (do_diff_cache(tree_oid, &opt))
 		return 1;
diff --git a/t/t7102-reset.sh b/t/t7102-reset.sh
index 97be0d968d..9f3854e8f0 100755
--- a/t/t7102-reset.sh
+++ b/t/t7102-reset.sh
@@ -566,4 +566,15 @@ test_expect_success 'reset --mixed sets up work tree' '
 	test_must_be_empty actual
 '
 
+test_expect_success 'reset --mixed adds deleted intent-to-add file back to index' '
+	echo "nonempty" >nonempty &&
+	>empty &&
+	git add nonempty empty &&
+	git commit -m "create files to be deleted" &&
+	git rm --cached nonempty empty &&
+	git add -N nonempty empty &&
+	git reset HEAD nonempty empty &&
+	git diff --staged --exit-code
+'
+
 test_done
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] reset: unstage empty deleted ita files
  2019-07-26  4:48 ` [PATCH v2] " Varun Naik
@ 2019-07-26 18:19   ` Junio C Hamano
  2019-07-29  6:52     ` Varun Naik
  0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2019-07-26 18:19 UTC (permalink / raw)
  To: Varun Naik; +Cc: git, pclouds

Varun Naik <vcnaik94@gmail.com> writes:

> It is possible to delete a committed file from the index and then add it
> as intent-to-add. After `git reset HEAD`, the file should be identical
> in the index and HEAD. This patch provides the desired behavior even
> when the file is empty in the index.

The first and the second sentence describes what you want to achieve
concisely and sensibly.  There is a vast leap between them and the
last sentence.  What's missing is:

 - What goes wrong without this one-liner fix and how does the
   command make a wrong decision to leave the path 'empty' (in the
   new test) different from that of the tree of the HEAD?

 - How does the change help the machinery to make a right decision
   instead?

I had to briefly wonder if this change interacts with "reset -N" in
any way.  In that mode, we want to make sure that diff sees the
entries that are missing from the index that exist in the tree of
the HEAD, so that update_index_from_diff() can add them back to the
index as I-T-A entries.

Making I-T-A entries invisible in the index for the purpose of
running that diff would mean that they appear as removed, but it is
OK because we'll add them back as I-T-A entries anyway, so it all
evens out, I think.

> diff --git a/builtin/reset.c b/builtin/reset.c
> index 26ef9a7bd0..47a088f4b7 100644
> --- a/builtin/reset.c
> +++ b/builtin/reset.c
> @@ -163,6 +163,7 @@ static int read_from_tree(const struct pathspec *pathspec,
>  	opt.format_callback_data = &intent_to_add;
>  	opt.flags.override_submodule_config = 1;
>  	opt.repo = the_repository;
> +	opt.ita_invisible_in_index = 1;
>  
>  	if (do_diff_cache(tree_oid, &opt))
>  		return 1;
> diff --git a/t/t7102-reset.sh b/t/t7102-reset.sh
> index 97be0d968d..9f3854e8f0 100755
> --- a/t/t7102-reset.sh
> +++ b/t/t7102-reset.sh
> @@ -566,4 +566,15 @@ test_expect_success 'reset --mixed sets up work tree' '
>  	test_must_be_empty actual
>  '
>  
> +test_expect_success 'reset --mixed adds deleted intent-to-add file back to index' '
> +	echo "nonempty" >nonempty &&
> +	>empty &&
> +	git add nonempty empty &&
> +	git commit -m "create files to be deleted" &&
> +	git rm --cached nonempty empty &&
> +	git add -N nonempty empty &&
> +	git reset HEAD nonempty empty &&
> +	git diff --staged --exit-code

We are not testing if the "diff --staged" synonym (that is not even
in "git diff --help") behaves identically to "diff --cached" here
(if we wanted to do such a test, we should do so somewhere in t4xxx
series, not here), so let's spell it in the canonical form, like so:

	git diff --cached --exit-code HEAD

At this point, we have three paths (empty, nonempty and secondfile)
in the tree of the HEAD, and we just resetted the entries for the
first two paths in the index to match.  The 'secondfile' added, in
previous tests, is still there unchanged, and is not shown in the
diff output.  The 'new-file', added as I-T-A in previous tests, is
also still there unchanged, and is not shown in the diff output.

How does the internal do_diff_cache() behave differently before and
after this patch on 'empty' and 'nonempty'?  How does the difference
affect the final outcome of "git reset" operation?

 - without ita-is-invisible bit set, we end up comparing the mode
   and blob object name; for 'nonempty', HEAD records a blob object
   name for the non-empty content, but the index has an empty blob
   object name (with I-T-A bit set, but that does not participate in
   the diff operation at the level of diff-lib.c::do_oneway_diff())
   and are deemed "modified".  Even though we should say "deleted",
   the end result turns out to be the same---we restore from the
   tree of the HEAD.

 - however, for 'empty', we mistakenly end up saying "both are empty
   blobs, so no difference; nothing to be done", leaving the i-t-a
   entry in the index.

 - with ita-is-invisible bit set, both 'nonempty' and 'empty' are
   immediately marked as "deleted" in do_oneway_diff().  This causes
   both paths to be resurrected from the tree of the HEAD the same
   way.

With the above kind of analysis, a reader can fill in the leap in
the explanation between the second and the third sentence in the
proposed log message.  But we shouldn't force readers to make that
effort to understand how the patch was meant to improve things.

Thanks.

> +'
> +
>  test_done

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] reset: unstage empty deleted ita files
  2019-07-26 18:19   ` Junio C Hamano
@ 2019-07-29  6:52     ` Varun Naik
  2019-07-29 16:07       ` Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Varun Naik @ 2019-07-29  6:52 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, pclouds

On Fri, Jul 26, 2019 at 11:20 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Varun Naik <vcnaik94@gmail.com> writes:
>
> > It is possible to delete a committed file from the index and then add it
> > as intent-to-add. After `git reset HEAD`, the file should be identical
> > in the index and HEAD. This patch provides the desired behavior even
> > when the file is empty in the index.
>
> The first and the second sentence describes what you want to achieve
> concisely and sensibly.  There is a vast leap between them and the
> last sentence.  What's missing is:
>
>  - What goes wrong without this one-liner fix and how does the
>    command make a wrong decision to leave the path 'empty' (in the
>    new test) different from that of the tree of the HEAD?
>
>  - How does the change help the machinery to make a right decision
>    instead?
>
> I had to briefly wonder if this change interacts with "reset -N" in
> any way.  In that mode, we want to make sure that diff sees the
> entries that are missing from the index that exist in the tree of
> the HEAD, so that update_index_from_diff() can add them back to the
> index as I-T-A entries.
>
> Making I-T-A entries invisible in the index for the purpose of
> running that diff would mean that they appear as removed, but it is
> OK because we'll add them back as I-T-A entries anyway, so it all
> evens out, I think.
>
> > diff --git a/builtin/reset.c b/builtin/reset.c
> > index 26ef9a7bd0..47a088f4b7 100644
> > --- a/builtin/reset.c
> > +++ b/builtin/reset.c
> > @@ -163,6 +163,7 @@ static int read_from_tree(const struct pathspec *pathspec,
> >       opt.format_callback_data = &intent_to_add;
> >       opt.flags.override_submodule_config = 1;
> >       opt.repo = the_repository;
> > +     opt.ita_invisible_in_index = 1;
> >
> >       if (do_diff_cache(tree_oid, &opt))
> >               return 1;
> > diff --git a/t/t7102-reset.sh b/t/t7102-reset.sh
> > index 97be0d968d..9f3854e8f0 100755
> > --- a/t/t7102-reset.sh
> > +++ b/t/t7102-reset.sh
> > @@ -566,4 +566,15 @@ test_expect_success 'reset --mixed sets up work tree' '
> >       test_must_be_empty actual
> >  '
> >
> > +test_expect_success 'reset --mixed adds deleted intent-to-add file back to index' '
> > +     echo "nonempty" >nonempty &&
> > +     >empty &&
> > +     git add nonempty empty &&
> > +     git commit -m "create files to be deleted" &&
> > +     git rm --cached nonempty empty &&
> > +     git add -N nonempty empty &&
> > +     git reset HEAD nonempty empty &&
> > +     git diff --staged --exit-code
>
> We are not testing if the "diff --staged" synonym (that is not even
> in "git diff --help") behaves identically to "diff --cached" here
> (if we wanted to do such a test, we should do so somewhere in t4xxx
> series, not here), so let's spell it in the canonical form, like so:
>
>         git diff --cached --exit-code HEAD
>
> At this point, we have three paths (empty, nonempty and secondfile)
> in the tree of the HEAD, and we just resetted the entries for the
> first two paths in the index to match.  The 'secondfile' added, in
> previous tests, is still there unchanged, and is not shown in the
> diff output.  The 'new-file', added as I-T-A in previous tests, is
> also still there unchanged, and is not shown in the diff output.
>

To guard against changes to the test cases in the future, would it be
better if I write something like the following instead?
    git diff --cached --exit-code HEAD nonempty empty

> How does the internal do_diff_cache() behave differently before and
> after this patch on 'empty' and 'nonempty'?  How does the difference
> affect the final outcome of "git reset" operation?
>
>  - without ita-is-invisible bit set, we end up comparing the mode
>    and blob object name; for 'nonempty', HEAD records a blob object
>    name for the non-empty content, but the index has an empty blob
>    object name (with I-T-A bit set, but that does not participate in
>    the diff operation at the level of diff-lib.c::do_oneway_diff())
>    and are deemed "modified".  Even though we should say "deleted",
>    the end result turns out to be the same---we restore from the
>    tree of the HEAD.
>
>  - however, for 'empty', we mistakenly end up saying "both are empty
>    blobs, so no difference; nothing to be done", leaving the i-t-a
>    entry in the index.
>
>  - with ita-is-invisible bit set, both 'nonempty' and 'empty' are
>    immediately marked as "deleted" in do_oneway_diff().  This causes
>    both paths to be resurrected from the tree of the HEAD the same
>    way.
>
> With the above kind of analysis, a reader can fill in the leap in
> the explanation between the second and the third sentence in the
> proposed log message.  But we shouldn't force readers to make that
> effort to understand how the patch was meant to improve things.
>

Thank you for the detailed explanation, this really helped me
understand the internals of do_diff_cache(). While adjusting my commit
message, I realized that my change actually breaks `git reset HEAD`
for ita files (i.e. after `git add -N nonempty` and `git reset HEAD
nonempty`, the file is still marked as intent-to-add). So, instead of
setting the flag ita_invisible_in_index to 1 before calling
do_diff_cache(), we want to handle a specific edge case deep inside
the diff machinery. It looks like I fixed another bug in the process,
so I will write a test case for that and then send out v3.

> Thanks.
>
> > +'
> > +
> >  test_done

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] reset: unstage empty deleted ita files
  2019-07-29  6:52     ` Varun Naik
@ 2019-07-29 16:07       ` Junio C Hamano
  0 siblings, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2019-07-29 16:07 UTC (permalink / raw)
  To: Varun Naik; +Cc: git, pclouds

Varun Naik <vcnaik94@gmail.com> writes:

> To guard against changes to the test cases in the future, would it be
> better if I write something like the following instead?
>     git diff --cached --exit-code HEAD nonempty empty

Hmph, that is probably a good idea, as it matches the kind of
"reset" we just did, which is what we are testing.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v3] diff-lib.c: handle empty deleted ita files
  2019-07-12 15:02 [PATCH] reset: unstage empty deleted ita files Varun Naik
  2019-07-26  4:48 ` [PATCH v2] " Varun Naik
@ 2019-08-01 16:15 ` Varun Naik
  2019-08-15 16:26   ` Varun Naik
  2019-08-15 19:06   ` Junio C Hamano
  1 sibling, 2 replies; 11+ messages in thread
From: Varun Naik @ 2019-08-01 16:15 UTC (permalink / raw)
  To: vcnaik94; +Cc: git, gitster, pclouds

It is possible to delete a committed file from the index and then add it
as intent-to-add. Certain forms of `git diff` should show the file.
After `git reset HEAD`, the file should be identical in the index and
HEAD. The commands already work correctly if the file has contents in
HEAD. This patch provides the desired behavior even when the file is
empty in HEAD.

The affected "diff" commands and the "reset" command call
diff-lib.c:do_oneway_diff() with a cache entry in the index and a cache
entry in HEAD. An ita file is represented in the index by a cache entry
with the same hash as an empty file. For a nonempty deleted ita file,
do_oneway_diff() calls show_modified(), which detects a diff between the
cache entry in the index and the cache entry in HEAD and therefore deems
the file "modified". However, for an empty deleted ita file,
do_oneway_diff() previously detected no such diff between the two cache
entries and therefore deemed the file "not modified". After this fix,
for any deleted ita file, do_oneway_diff() calls diff_index_show_file()
and deems the file "deleted".

`git diff-index --cached HEAD` prints a row of output for both a
"modified" and a "deleted" file, although the output differs slightly.
`git reset HEAD` treats a "modified" and a "deleted" file similarly,
resurrecting the file in the index from HEAD.

This change should not affect newly added ita files. For those, the
"tree" cache entry is NULL, so the changed code is not executed.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Varun Naik <vcnaik94@gmail.com>
---
I tried to limit the "blast radius" of affected commands as much as
possible. To find commands that were affected, I ran the following:
    git diff
    git diff HEAD
    git diff --cached HEAD
    git diff-index HEAD
    git diff-index --cached HEAD
    git diff-files

I also ran each command with the option "--ita-visible-in-index" and the
option "--ita-invisible-in-index". Of these 18 commands, the following
three showed a diff for nonempty deleted ita files, but no diff for
empty deleted ita files. All three commands now work correctly, but I
chose the first one for the test case because the option
"--ita-visible-in-index" is still marked as experimental.
    git diff-index --cached HEAD
    git diff-index --cached --ita-visible-in-index HEAD
    git diff --cached --ita-visible-in-index HEAD

The `git add` at the end of the "diff-index" test case is necessary
because `git reset --hard HEAD` at the beginning of the next test case
_also_ breaks for empty deleted ita files. That command goes into
unpack-trees.c:oneway_merge() rather than diff-lib.c:do_oneway_diff().
I plan to create a separate patch to fix that, after I figure out which
commands are part of its blast radius.

 diff-lib.c            |  5 ++++-
 t/t2203-add-intent.sh | 13 +++++++++++++
 t/t7102-reset.sh      | 11 +++++++++++
 3 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/diff-lib.c b/diff-lib.c
index 61812f48c2..29dba467d5 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -433,8 +433,11 @@ static void do_oneway_diff(struct unpack_trees_options *o,
 
 	/*
 	 * Something removed from the tree?
+	 * Consider a file deleted from the index and added as ita to be "deleted",
+	 * even though it should arguably be "modified", because we want empty
+	 * deleted ita files to appear in the diff.
 	 */
-	if (!idx) {
+	if (!idx || (cached && ce_intent_to_add(idx))) {
 		diff_index_show_file(revs, "-", tree, &tree->oid, 1,
 				     tree->ce_mode, 0);
 		return;
diff --git a/t/t2203-add-intent.sh b/t/t2203-add-intent.sh
index 68e54d5c44..4e4a972921 100755
--- a/t/t2203-add-intent.sh
+++ b/t/t2203-add-intent.sh
@@ -261,6 +261,19 @@ test_expect_success '"diff HEAD" includes ita as new files' '
 	test_cmp expected actual
 '
 
+test_expect_success '"diff-index --cached HEAD" detects diff for deleted intent-to-add file' '
+	git reset --hard &&
+	echo "nonempty" >nonempty &&
+	>empty &&
+	git add nonempty empty &&
+	git commit -m "create files to be deleted" &&
+	git rm --cached nonempty empty &&
+	git add -N nonempty empty &&
+	test_expect_code 1 git diff-index --cached --exit-code HEAD nonempty &&
+	test_expect_code 1 git diff-index --cached --exit-code HEAD empty &&
+	git add nonempty empty
+'
+
 test_expect_success 'apply --intent-to-add' '
 	git reset --hard &&
 	echo new >new-ita &&
diff --git a/t/t7102-reset.sh b/t/t7102-reset.sh
index 97be0d968d..7b79502f7d 100755
--- a/t/t7102-reset.sh
+++ b/t/t7102-reset.sh
@@ -566,4 +566,15 @@ test_expect_success 'reset --mixed sets up work tree' '
 	test_must_be_empty actual
 '
 
+test_expect_success 'reset --mixed removes deleted intent-to-add file from index' '
+	echo "nonempty" >nonempty &&
+	>empty &&
+	git add nonempty empty &&
+	git commit -m "create files to be deleted" &&
+	git rm --cached nonempty empty &&
+	git add -N nonempty empty &&
+	git reset HEAD nonempty empty &&
+	git diff --cached --exit-code nonempty empty
+'
+
 test_done
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] diff-lib.c: handle empty deleted ita files
  2019-08-01 16:15 ` [PATCH v3] diff-lib.c: handle " Varun Naik
@ 2019-08-15 16:26   ` Varun Naik
  2019-08-15 19:06   ` Junio C Hamano
  1 sibling, 0 replies; 11+ messages in thread
From: Varun Naik @ 2019-08-15 16:26 UTC (permalink / raw)
  To: Varun Naik; +Cc: git, Junio C Hamano, pclouds

On Thu, Aug 1, 2019 at 9:16 AM Varun Naik <vcnaik94@gmail.com> wrote:
>
> It is possible to delete a committed file from the index and then add it
> as intent-to-add. Certain forms of `git diff` should show the file.
> After `git reset HEAD`, the file should be identical in the index and
> HEAD. The commands already work correctly if the file has contents in
> HEAD. This patch provides the desired behavior even when the file is
> empty in HEAD.
>
> The affected "diff" commands and the "reset" command call
> diff-lib.c:do_oneway_diff() with a cache entry in the index and a cache
> entry in HEAD. An ita file is represented in the index by a cache entry
> with the same hash as an empty file. For a nonempty deleted ita file,
> do_oneway_diff() calls show_modified(), which detects a diff between the
> cache entry in the index and the cache entry in HEAD and therefore deems
> the file "modified". However, for an empty deleted ita file,
> do_oneway_diff() previously detected no such diff between the two cache
> entries and therefore deemed the file "not modified". After this fix,
> for any deleted ita file, do_oneway_diff() calls diff_index_show_file()
> and deems the file "deleted".
>
> `git diff-index --cached HEAD` prints a row of output for both a
> "modified" and a "deleted" file, although the output differs slightly.
> `git reset HEAD` treats a "modified" and a "deleted" file similarly,
> resurrecting the file in the index from HEAD.
>
> This change should not affect newly added ita files. For those, the
> "tree" cache entry is NULL, so the changed code is not executed.
>
> Helped-by: Junio C Hamano <gitster@pobox.com>
> Signed-off-by: Varun Naik <vcnaik94@gmail.com>
> ---

Bumping this email, since I noticed v2 is still in pu.

Varun

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] diff-lib.c: handle empty deleted ita files
  2019-08-01 16:15 ` [PATCH v3] diff-lib.c: handle " Varun Naik
  2019-08-15 16:26   ` Varun Naik
@ 2019-08-15 19:06   ` Junio C Hamano
  2019-08-19 15:42     ` Varun Naik
  1 sibling, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2019-08-15 19:06 UTC (permalink / raw)
  To: Varun Naik; +Cc: git, pclouds

Varun Naik <vcnaik94@gmail.com> writes:

> diff --git a/diff-lib.c b/diff-lib.c
> index 61812f48c2..29dba467d5 100644
> --- a/diff-lib.c
> +++ b/diff-lib.c
> @@ -433,8 +433,11 @@ static void do_oneway_diff(struct unpack_trees_options *o,
>  
>  	/*
>  	 * Something removed from the tree?
> +	 * Consider a file deleted from the index and added as ita to be "deleted",
> +	 * even though it should arguably be "modified", because we want empty
> +	 * deleted ita files to appear in the diff.
>  	 */
> -	if (!idx) {
> +	if (!idx || (cached && ce_intent_to_add(idx))) {
>  		diff_index_show_file(revs, "-", tree, &tree->oid, 1,
>  				     tree->ce_mode, 0);
>  		return;

There is already half of the same logic near the beginning of this
function, no?

	/*
	 * i-t-a entries do not actually exist in the index (if we're
	 * looking at its content)
	 */
	if (o->index_only &&
	    revs->diffopt.ita_invisible_in_index &&
	    idx && ce_intent_to_add(idx)) {
		idx = NULL;
		if (!tree)
			return;	/* nothing to diff.. */
	}

IOW, when ita-invisible-in-index flag is set, idx is made NULL and
all the rest of the function behaves as if there is no such entry in
the index (e.g. relative to HEAD it looks as if the entry is removed
in the index).

So for example, when ita-invisible-in-index is not set, this piece,
just above the part you touched, kicks in:

	/*
	 * Something added to the tree?
	 */
	if (!tree) {
		show_new_file(revs, idx, cached, match_missing);
		return;
	}

and says "no such entry in the tree, but you have an I-T-A entry
there in the index".

It is unclear why we can unconditionally declare "I-T-A entry does
not exist, the entry was in the tree but not in the index" in the
code you touched, without consulting ita-invisible-in-index flag.
It feels awfully inconsistent to me.

Of course, consistency could go the other way around, and the right
fix to achieve consistency might turn out to be to drop the check
for ita-invisible-in-index flag (and perhaps the flag itself) from
the early part of this function.  I dunno.

Thanks.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] diff-lib.c: handle empty deleted ita files
  2019-08-15 19:06   ` Junio C Hamano
@ 2019-08-19 15:42     ` Varun Naik
  2019-08-19 20:15       ` Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Varun Naik @ 2019-08-19 15:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Nguyễn Thái Ngọc Duy

On Thu, Aug 15, 2019 at 12:06 PM Junio C Hamano <gitster@pobox.com> wrote:
>
>         /*
>          * i-t-a entries do not actually exist in the index (if we're
>          * looking at its content)
>          */
>         if (o->index_only &&
>             revs->diffopt.ita_invisible_in_index &&
>             idx && ce_intent_to_add(idx)) {
>                 idx = NULL;
>                 if (!tree)
>                         return; /* nothing to diff.. */
>         }
>
> IOW, when ita-invisible-in-index flag is set, idx is made NULL and
> all the rest of the function behaves as if there is no such entry in
> the index (e.g. relative to HEAD it looks as if the entry is removed
> in the index).
>

That's right. I was hoping to avoid simply setting the ita-invisible-in-
index flag because then an ita file (i.e. a file marked as ita in the
index with no contents in HEAD) would be considered nonexistent in both
the tree and the index, and so the function would hit the return
statement above because it believes there is "nothing to diff".

> So for example, when ita-invisible-in-index is not set, this piece,
> just above the part you touched, kicks in:
>
>         /*
>          * Something added to the tree?
>          */
>         if (!tree) {
>                 show_new_file(revs, idx, cached, match_missing);
>                 return;
>         }
>
> and says "no such entry in the tree, but you have an I-T-A entry
> there in the index".
>

That sounds right, an ita file is considered to be a "new file".

> It is unclear why we can unconditionally declare "I-T-A entry does
> not exist, the entry was in the tree but not in the index" in the
> code you touched, without consulting ita-invisible-in-index flag.
> It feels awfully inconsistent to me.
>
> Of course, consistency could go the other way around, and the right
> fix to achieve consistency might turn out to be to drop the check
> for ita-invisible-in-index flag (and perhaps the flag itself) from
> the early part of this function.  I dunno.

Actually, I agree that it seems strange to consider a deleted ita file
as a "deleted file", when an ita file that did not previously exist in
the tree is considered a "new file". It felt like a dirty hack when I
was originally writing it. And the longer I look at the logic that I
added towards the end of the function, the more I worry that it
sacrifices readability and maintainability to achieve minimal gain.

Dropping the check at the beginning is probably not right either - this
would break a lot of other places that call into do_oneway_diff().

This function is probably not the place where we want to make changes.
It would be better to change diff-lib.c:show_modified() and
diff.c:diff_change() to consider the intent-to-add bit when performing a
diff. A small change to show_modified() prevents the function from
returning prematurely when "new_entry" has the intent-to-add bit set.
But now, the call to diff_change() hits the error "feeding unmodified %s
to diffcore". We can avoid this by adding a boolean "int ita" field to
"struct diff_filespec" and adjusting the following code:

@@ -5847,7 +5847,7 @@ static void diff_resolve_rename_copy(void)
                        else
                                p->status = DIFF_STATUS_RENAMED;
                }
-               else if (!oideq(&p->one->oid, &p->two->oid) ||
+               else if (!oideq(&p->one->oid, &p->two->oid) || p->two->ita ||
                         p->one->mode != p->two->mode ||
                         p->one->dirty_submodule ||
                         p->two->dirty_submodule ||

Next, we need to pass something like "int new_ita" to diff_change() and
set "two->ita" accordingly. This is where I'm stuck right now. It's easy
enough to adjust the call to diff_change() inside show_modified(), but
the adjustments to other calls to diff_change(), especially the one
inside diff-lib.c:run_diff_files(), might need some other accompanying
changes.

Does changing the diffcore code seem like a reasonable approach? If not,
then I can't think of a better one, and it might be best to leave this
patch unmerged.

Varun

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] diff-lib.c: handle empty deleted ita files
  2019-08-19 15:42     ` Varun Naik
@ 2019-08-19 20:15       ` Junio C Hamano
  2020-02-14 17:12         ` Varun Naik
  0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2019-08-19 20:15 UTC (permalink / raw)
  To: Varun Naik; +Cc: git, Nguyễn Thái Ngọc Duy

Varun Naik <vcnaik94@gmail.com> writes:

> This function is probably not the place where we want to make changes.
> It would be better to change diff-lib.c:show_modified() and
> diff.c:diff_change() to consider the intent-to-add bit when performing a
> diff.

I do agree that diff-lib.c::show_modified(), which is about "git
diff-cache" (and hence "git diff <tree-ish>", i.e. comparison
between a tree-ish and the index), is the right layer for this
change.

I am not offhand sure about the diff.c::diff_change(), though, and
cannot say much without first thinking about it a bit more.

Thanks.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] diff-lib.c: handle empty deleted ita files
  2019-08-19 20:15       ` Junio C Hamano
@ 2020-02-14 17:12         ` Varun Naik
  0 siblings, 0 replies; 11+ messages in thread
From: Varun Naik @ 2020-02-14 17:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Nguyễn Thái Ngọc Duy

On Mon, Aug 19, 2019 at 1:15 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> I do agree that diff-lib.c::show_modified(), which is about "git
> diff-cache" (and hence "git diff <tree-ish>", i.e. comparison
> between a tree-ish and the index), is the right layer for this
> change.
>
> I am not offhand sure about the diff.c::diff_change(), though, and
> cannot say much without first thinking about it a bit more.

I'm back!

I made the changes to "struct diff_filespec" and diff_resolve_rename_copy()
that I mentioned in my previous email on this thread, and it seems to work.
But the patch seems very clunky. Even though it has been in the back of my
mind for six months, I can't think of a more elegant approach. Furthermore,
the few test cases that I could create are extremely unlikely in practice. For
all of these reasons, I would rather avoid this patch altogether.

v2 is still sitting in pu. Could we discard it? I'll send an update to the
related patch [0] in a few minutes.

[0]: https://public-inbox.org/git/20190821032100.73917-1-vcnaik94@gmail.com/

Best,
Varun

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2020-02-14 17:12 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-12 15:02 [PATCH] reset: unstage empty deleted ita files Varun Naik
2019-07-26  4:48 ` [PATCH v2] " Varun Naik
2019-07-26 18:19   ` Junio C Hamano
2019-07-29  6:52     ` Varun Naik
2019-07-29 16:07       ` Junio C Hamano
2019-08-01 16:15 ` [PATCH v3] diff-lib.c: handle " Varun Naik
2019-08-15 16:26   ` Varun Naik
2019-08-15 19:06   ` Junio C Hamano
2019-08-19 15:42     ` Varun Naik
2019-08-19 20:15       ` Junio C Hamano
2020-02-14 17:12         ` Varun Naik

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).