From: Junio C Hamano <gitster@pobox.com>
To: Timo Funke <timoses@msn.com>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: Weird behaviour of git diff-index in container
Date: Mon, 09 May 2022 16:18:32 -0700
Message-ID: <xmqqy1za9tx3.fsf@gitster.g>
In-Reply-To: <VI1PR0402MB28779C7A41783472B2EF6823BFC69@VI1PR0402MB2877.eurprd04.prod.outlook.com> (Timo Funke's message of "Mon, 9 May 2022 22:42:14 +0000")

Timo Funke <timoses@msn.com> writes:

>> container# git diff-index --quiet HEAD -- ; echo $?
> 1
>> container# git status
> On branch master
> nothing to commit, working tree clean
>> container# git diff-index --quiet HEAD -- ; echo $?
> 0

This is unfortunately very much expected and, doubly unfortunately,
not very well documented.  Patches to update the documentation are
very much welcome, but such a patch cannot be written in a void, so
let's explain what is going on.

To quickly detect paths that have not been modified, Git uses the
mechanism called "cached stat data" in the index.  Among the cached
stat data is the timestamp of the last modification of each file.
When Git checks a file and finds its contents unmodified, it records
that fact together with the file timestamp it observed at the time.
The next time somebody asks "please compute 'git diff'", Git can
notice that the timestamp of the working tree file hasn't changed
and say "no, there is no change" without looking at the contents.

Now, when the file on the filesystem is "touched" in a way that its
timestamp gets updated without changing the contents (so that,
without the above optimization, diff would have said "no change"),
Git will think there is a change in the file.
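
The effect can be reproduced without a container.  What follows is
only a minimal sketch, assuming a POSIX shell with git and mktemp
available; the throwaway repository, the file name f.txt and the
identity given to "git commit" are made up for illustration.

```shell
#!/bin/sh
set -eu

# Throwaway repository; every name below is illustrative.
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q .
echo hello >f.txt
git add f.txt
git -c user.name=Example -c user.email=example@example.com \
    -c commit.gpgsign=false commit -q -m initial

# Index and working tree agree, so the plumbing command sees no change.
before=0
git diff-index --quiet HEAD -- || before=$?

# Update only the timestamp; the contents stay the same.
touch -t 200001010000 f.txt

# The cached stat data no longer matches, so the path is reported.
after=0
git diff-index --quiet HEAD -- || after=$?

echo "before touch: $before, after touch: $after"
```

Run as is, this should print "before touch: 0, after touch: 1",
which is the same pair of exit codes as in the report, in the
opposite order.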

There are two levels of Git subcommands.  Porcelain commands, like
"git diff", are end-user facing and are optimized more for usability
than performance.  "git diff --quiet HEAD --" in the above scenario
WILL notice that there is no change in the contents after all and
exit with 0 (unless diff.autoRefreshIndex configuration is set to
false).  The way they do so is by refreshing the "cached stat data"
automatically before using it, and that operation is called
"refreshing the index" (hence the name of the configuration variable
that disables it).
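
To see the Porcelain side of this, here is a sketch under the same
assumptions (POSIX shell, git, mktemp; repository, file name and
commit identity are all hypothetical).  It runs "git diff --quiet"
on a touched but otherwise unmodified file, first with the automatic
refresh disabled and then with it enabled.

```shell
#!/bin/sh
set -eu

# Throwaway repository; every name below is illustrative.
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q .
echo hello >f.txt
git add f.txt
git -c user.name=Example -c user.email=example@example.com \
    -c commit.gpgsign=false commit -q -m initial

# Make the cached stat data stale without changing the contents.
touch -t 200001010000 f.txt

# Automatic refresh disabled: per the explanation above, the
# Porcelain is not expected to exit with 0 here.
no_refresh=0
git -c diff.autoRefreshIndex=false diff --quiet HEAD -- || no_refresh=$?

# Automatic refresh enabled (the default): the stat-only difference
# is recognised as "no change".
porcelain=0
git -c diff.autoRefreshIndex=true diff --quiet HEAD -- || porcelain=$?

echo "autoRefreshIndex=false: $no_refresh, default: $porcelain"
```

The variable is set explicitly on both invocations only so that the
sketch does not depend on whatever the user's own configuration says.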

On the other hand, plumbing commands, like "git diff-files" and "git
diff-index", are designed to be used in scripts, possibly many times
in a row, and do not want to pay the cost of refreshing the index
every time before working.  The correct way to use them in a
repository whose current state you do not know is to first "refresh
the index" by running the command to do so, e.g. "git update-index
--refresh", before doing anything else.

If you were to run "git diff-files" and "git diff-index HEAD" in a
row in order to compute what "git status" would give you, for
example, you neither need nor want to pay the cost of refreshing
the index twice.  You run "git update-index --refresh" once, and
then run "git diff-files".  Doing so would not change the contents
of the working tree files, so you do not have to refresh the index
again after that, before running "git diff-index HEAD".  That is why
these plumbing commands do not refresh the index themselves.  They
expect you to have refreshed the index before you call them.
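
A script following that recipe could look roughly like the sketch
below.  The helper name worktree_state is hypothetical, as are the
throwaway repository and the commit identity; the only assumptions
are a POSIX shell, git and mktemp.  The "|| true" is merely a
precaution so that "set -e" does not abort on the refresh step; the
two diff commands are what decide the answer.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: refresh the index once, then ask both
# plumbing commands; print "clean" or "dirty".
worktree_state () {
    git update-index -q --refresh || true
    if git diff-files --quiet && git diff-index --quiet HEAD --
    then
        echo clean
    else
        echo dirty
    fi
}

# Throwaway repository; every name below is illustrative.
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q .
echo hello >f.txt
git add f.txt
git -c user.name=Example -c user.email=example@example.com \
    -c commit.gpgsign=false commit -q -m initial

# Timestamp-only change: the single refresh absorbs it.
touch -t 200001010000 f.txt
touched=$(worktree_state)

# Real content change: still reported after the refresh.
echo changed >>f.txt
modified=$(worktree_state)

echo "after touch: $touched, after edit: $modified"
```

The refresh happens exactly once per call to the helper, no matter
how many plumbing commands follow it.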

"git status" is one of the commands (as a Porcelain) that refreshes
the index automatically, so it is very much understandable that the
same "diff-index --quiet" behaves differently once "git status" has
run, and keeps doing so until you touch/smudge the working tree
files again.

