git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Mike Crowe <mac@mcrowe.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: git diff --quiet exits with 1 on clean tree with CRLF conversions
Date: Sat, 25 Feb 2017 15:32:30 +0000	[thread overview]
Message-ID: <20170225153230.GA30565@mcrowe.com> (raw)
In-Reply-To: <xmqqlgt0imhe.fsf@gitster.mtv.corp.google.com>

On Monday 20 February 2017 at 13:25:01 -0800, Junio C Hamano wrote:
> This almost makes me suspect that the place that checks lengths of
> one and two in order to refrain from running more expensive content
> comparison you found earlier need to ask would_convert_to_git()
> before taking the short-cut, something along the lines of this (for
> illustration purposes only, not even compile-tested).  The "almost"
> comes to me because I do not offhand know the performance implications
> of making calls to would_convert_to_git() here.
> 
>  diff.c | 18 ++++++++++++++----
>  1 file changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/diff.c b/diff.c
> index 051761be40..094d5913da 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -4921,9 +4921,10 @@ static int diff_filespec_check_stat_unmatch(struct diff_filepair *p)
>  	 *    differences.
>  	 *
>  	 * 2. At this point, the file is known to be modified,
> -	 *    with the same mode and size, and the object
> -	 *    name of one side is unknown.  Need to inspect
> -	 *    the identical contents.
> +	 *    with the same mode and size, the object
> +	 *    name of one side is unknown, or size comparison
> +	 *    cannot be depended upon.  Need to inspect the 
> +	 *    contents.
>  	 */
>  	if (!DIFF_FILE_VALID(p->one) || /* (1) */
>  	    !DIFF_FILE_VALID(p->two) ||
> @@ -4931,7 +4932,16 @@ static int diff_filespec_check_stat_unmatch(struct diff_filepair *p)
>  	    (p->one->mode != p->two->mode) ||
>  	    diff_populate_filespec(p->one, CHECK_SIZE_ONLY) ||
>  	    diff_populate_filespec(p->two, CHECK_SIZE_ONLY) ||
> -	    (p->one->size != p->two->size) ||
> +
> +	    /* 
> +	     * only if eol and other conversions are not involved,
> +	     * we can say that two contents of different sizes
> +	     * cannot be the same without checking their contents.
> +	     */
> +	    (!would_convert_to_git(p->one->path) &&
> +	     !would_convert_to_git(p->two->path) &&
> +	     (p->one->size != p->two->size)) ||
> +
>  	    !diff_filespec_is_identical(p->one, p->two)) /* (2) */
>  		p->skip_stat_unmatch_result = 1;
>  	return p->skip_stat_unmatch_result;
> 
> 

Thanks for investigating this. I think you are correct that I was misguided
in my previous "fix". However, your change above does fix the problem for
me.

It looks like the main cost of convert_to_git is in convert_attrs which
ends up doing various path operations in attr.c. After that, both
apply_filter and crlf_to_git return straight away if there's nothing to do.

I experimented several times with running "git diff -quiet" after touching
every file in Git's own worktree and any difference in total time was lost
in the noise.

I've further improved my test case. Tests 3 and 4 fail without the above
change but pass with it. Unfortunately I'm still unable to get those tests
to fail without the above fix unless the sleeps are present. I've tried
using the "touch -r .datetime" technique from racy-git.txt but it doesn't
help. It seems that I'm unable to stop Git from using its cache without
sleeping. :(

diff --git a/t/t4063-diff-converted.sh b/t/t4063-diff-converted.sh
new file mode 100755
index 0000000..31a730d
--- /dev/null
+++ b/t/t4063-diff-converted.sh
@@ -0,0 +1,44 @@
+#!/bin/sh
+#
+# Copyright (c) 2017 Mike Crowe
+#
+# These tests ensure that files changing line endings in the presence
+# of .gitattributes to indicate that line endings should be ignored
+# don't cause 'git diff' or 'git diff --quiet' to think that they have
+# been changed.
+#
+# The sleeps are necessary to reproduce the problem for reasons that I
+# don't understand.
+
+test_description='git diff with files that require CRLF conversion'
+
+. ./test-lib.sh
+
+test_expect_success setup '
+	echo "* text=auto" > .gitattributes &&
+	printf "Hello\r\nWorld\r\n" > crlf.txt &&
+	printf "Hello\nWorld\n" > lf.txt &&
+	git add .gitattributes crlf.txt lf.txt &&
+	git commit -m "initial" && echo three
+'
+test_expect_success 'noisy diff works on file that requires CRLF conversion' '
+	git status >/dev/null &&
+	git diff >/dev/null &&
+	sleep 1 &&
+	touch crlf.txt lf.txt &&
+	git diff >/dev/null
+'
+test_expect_success 'quiet diff works on file that requires CRLF conversion with no changes' '
+	git status &&
+	git diff --quiet &&
+	sleep 1 &&
+	touch crlf.txt lf.txt &&
+	git diff --quiet
+'
+
+test_expect_success 'quiet diff works on file with line-ending change that has no effect on repository' '
+	printf "Hello\nWorld\n" > crlf.txt &&
+	printf "Hello\r\nWorld\r\n" > lf.txt &&
+	git diff --quiet
+'
+test_done


Mike.

  reply	other threads:[~2017-02-25 15:32 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-17 21:26 git diff --quiet exits with 1 on clean tree with CRLF conversions Mike Crowe
2017-02-17 22:05 ` Junio C Hamano
2017-02-17 22:19   ` Mike Crowe
2017-02-20 15:33     ` Mike Crowe
2017-02-20 21:25       ` Junio C Hamano
2017-02-25 15:32         ` Mike Crowe [this message]
2017-02-27 20:17           ` Junio C Hamano
2017-02-28 18:06             ` Torsten Bögershausen
2017-02-28 21:50               ` Junio C Hamano
2017-03-01 17:04                 ` [PATCH v1 1/1] " tboegi
2017-03-01 21:14                   ` Junio C Hamano
2017-03-01 21:54                     ` Junio C Hamano
2017-03-02  8:53                       ` Jeff King
2017-03-02 17:52                         ` Junio C Hamano
2017-03-02 19:12                           ` Jeff King
2017-03-02 18:51                         ` [PATCH v2] diff: do not short-cut CHECK_SIZE_ONLY check in diff_populate_filespec() Junio C Hamano
2017-03-02 14:20                       ` [PATCH v1 1/1] git diff --quiet exits with 1 on clean tree with CRLF conversions Mike Crowe
2017-03-02 18:20                         ` Torsten Bögershausen
2017-03-02 18:33                         ` Junio C Hamano
2017-03-02 20:03                           ` Mike Crowe
2017-03-03 17:02                             ` Torsten Bögershausen
2017-03-03 17:47                               ` Junio C Hamano
2017-03-04  6:25                                 ` Torsten Bögershausen
2017-03-04 19:59                                   ` Junio C Hamano
2017-03-01 21:25                   ` Mike Crowe
2017-03-01 23:29                     ` Junio C Hamano
2017-03-02 18:17                     ` Torsten Bögershausen
2017-03-03 17:01                       ` Mike Crowe
2017-03-02 15:38               ` git status reports file modified when only line-endings have changed (was git diff --quiet exits with 1 on clean tree with CRLF conversions) Mike Crowe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170225153230.GA30565@mcrowe.com \
    --to=mac@mcrowe.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).