* Weird behaviour of git diff-index in container
@ 2022-05-09 22:42 Timo Funke
  2022-05-09 23:18 ` Junio C Hamano
  2022-05-10  2:42 ` Jonathan Nieder
  0 siblings, 2 replies; 4+ messages in thread
From: Timo Funke @ 2022-05-09 22:42 UTC
To: git@vger.kernel.org

What did you do before the bug happened? (Steps to reproduce your issue)

mkdir test
cd test
git init
touch test
git add test
git commit -m 'init'
podman run --rm -it -v `pwd`:/git:z --entrypoint sh docker.io/alpine
> container# apk add git
> container# cd /git
> container# git diff-index --quiet HEAD -- ; echo $?
1
> container# git diff-index --quiet HEAD -- ; echo $?
1
> container# git status
On branch master
nothing to commit, working tree clean
> container# git diff-index --quiet HEAD -- ; echo $?
0

What did you expect to happen? (Expected behavior)
`git diff-index --quiet HEAD -- ; echo $?` should return `0`
even without executing `git status`.

What happened instead? (Actual behavior)
Without executing `git status` `git diff-index --quiet HEAD -- ; echo $?`
will repeatedly print `1`.

What's different between what you expected and what actually happened?
It is odd that `git diff-index --quiet HEAD -- ; echo $?` prints
different results depending on whether `git status` was executed.

Anything else you want to add:
Perhaps this has to do with running git in a container?

[System Info]
git version:
git version 2.34.2
cpu: x86_64
no commit associated with this build
sizeof-long: 8
sizeof-size_t: 8
shell-path: /bin/sh
uname: Linux 4.18.0-348.20.1.el8_5.x86_64 #1 SMP Thu Mar 10 20:59:28 UTC 2022 x86_64
compiler info: gnuc: 10.3
libc info: no libc information available
$SHELL (typically, interactive shell): <unset>

[Enabled Hooks]
not run from a git repository - no hooks to show

* Re: Weird behaviour of git diff-index in container
From: Junio C Hamano @ 2022-05-09 23:18 UTC
To: Timo Funke; +Cc: git@vger.kernel.org

Timo Funke <timoses@msn.com> writes:

>> container# git diff-index --quiet HEAD -- ; echo $?
> 1
>> container# git status
> On branch master
> nothing to commit, working tree clean
>> container# git diff-index --quiet HEAD -- ; echo $?
> 0

This is unfortunately very much expected and, doubly unfortunately, not
very well documented. Patches to update the documentation are very much
welcome, but such a patch cannot be written in a void, so let's explain
what is going on.

To quickly detect paths that have not been modified, Git uses a
mechanism called "cached stat data" in the index. Among the cached stat
data is the timestamp of the last modification of each file. Git records
the fact that, the last time it checked, the contents of the file on the
filesystem had not been modified, together with the file timestamp
observed at the time of that check. The next time somebody asks "please
compute 'git diff'", Git can notice that the timestamp of the working
tree file hasn't changed and say "no, there is no change" without
looking at the contents.

Now, when the file on the filesystem is "touched" in a way that updates
its timestamp without changing the contents (so that, without the above
optimization, diff would have said "no change"), Git will think there is
a change in the file.

There are two levels of Git subcommands. Porcelain commands, like "git
diff", are end-user facing and are optimized more for usability than
performance. "git diff --quiet HEAD --" in the above scenario WILL
notice that there is no change in the contents after all and exit with 0
(unless the diff.autoRefreshIndex configuration is set to false).
The way they do so is by refreshing the "cached stat data" automatically
before using it, and that operation is called "refreshing the index"
(hence the name of the configuration variable that disables it).

On the other hand, plumbing commands, like "git diff-files" and "git
diff-index", are designed to be used in scripts, many times over, and do
not want to pay the cost of refreshing the index every time before
working. The correct way to use them in a repository whose current state
you do not know is to first "refresh the index" by running the command
that does so, e.g. "git update-index --refresh", before doing anything
else.

If you were to run "git diff-files" and "git diff-index HEAD" in a row
in order to compute what "git status" would give you, for example, you
neither need nor want to pay the cost of refreshing the index twice. You
run "git update-index --refresh" once, and then run "git diff-files".
Doing so would not change the contents of the working tree files, so you
do not have to refresh the index again after that, before running "git
diff-index HEAD".

That is why these plumbing commands do not refresh the index themselves.
They expect you to refresh the index before you call them.

"git status" is one of the commands (as a Porcelain) that refreshes the
index automatically, so it is very much understandable that the same
"diff-index --quiet" behaves differently after running it once and until
you touch/smudge the working tree files.

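The sequence described in this reply can be reproduced outside a
container. What follows is a minimal sketch and not part of the original
thread: it assumes a POSIX shell with git, mktemp and touch available,
and the identity "Demo <demo@example.com>", the file name "file" and the
variables "before" and "after" are made up for illustration. A change of
modification time stands in for whatever made the cached stat data stale
in the reporter's container.

```shell
#!/bin/sh
# Sketch: a timestamp-only change makes the plumbing command report a
# difference until the index is refreshed. All names are illustrative.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
: >file
git add file
git -c user.name=Demo -c user.email=demo@example.com commit -q -m init

# Change only the modification time; the contents stay the same.
touch -t 200001010000 file

git diff-index --quiet HEAD --
before=$?   # cached stat data is stale: reported as changed

# Refresh the cached stat data once, as a script is expected to do.
git update-index -q --refresh

git diff-index --quiet HEAD --
after=$?    # cached stat data is current again: reported as unchanged

echo "before refresh: $before, after refresh: $after"
```

Running "git status" in place of the "git update-index" line should have
the same effect on the second result, since it refreshes the index as a
side effect.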
* Re: Weird behaviour of git diff-index in container
From: Jonathan Nieder @ 2022-05-10 2:42 UTC
To: Timo Funke; +Cc: git@vger.kernel.org

Hi!

Timo Funke wrote:

> podman run --rm -it -v `pwd`:/git:z --entrypoint sh docker.io/alpine
> > container# apk add git
> > container# cd /git
> > container# git diff-index --quiet HEAD -- ; echo $?
> 1
> > container# git diff-index --quiet HEAD -- ; echo $?
> 1
> > container# git status
> On branch master
> nothing to commit, working tree clean
> > container# git diff-index --quiet HEAD -- ; echo $?
> 0
>
>
> What did you expect to happen? (Expected behavior)
> `git diff-index --quiet HEAD -- ; echo $?` should return `0`
> even without executing `git status`.
>
> What happened instead? (Actual behavior)
> Without executing `git status` `git diff-index --quiet HEAD -- ; echo $?`
> will repeatedly print `1`.
>
> What's different between what you expected and what actually happened?
> It is odd that `git diff-index --quiet HEAD -- ; echo $?` prints
> different results depending on whether `git status` was executed.

I love this example. Thanks for writing.

I checked "git help diff-index" to see whether it describes this
pitfall, and I didn't see an explanation. So at the very least you have
uncovered a documentation bug.

The difference between diff-index and status here is a difference
between "porcelain" (user-facing) commands and "plumbing"
(script-facing) commands. In Git's index file there is stat(2)
information for each file; if that stat(2) information matches the
corresponding file in the working directory then we know it hasn't been
modified relative to what is in the index.
If the stat(2) information differs from the working copy, on the other
hand, the behavior depends on whether the command being run is porcelain
or plumbing:

- plumbing commands assume that the script author has run
  "git update-index --refresh -q" first to update the stat(2)
  information if the file hasn't changed. This allows efficient scripts
  to refresh the index once and then run multiple commands that rely on
  the result of that:

	git update-index --refresh -q || :
	for rev in "${revs[@]}"
	do
		if git diff-index --quiet "$rev" --
		then
			... do something ...
		fi
	done

- porcelain commands such as "git status" implicitly refresh the index
  before doing anything else. This allows them to produce the expected
  result even if the repository is a copy made using "cp -a" or has been
  transferred across machines on a USB stick.

Some places I expected to find an explanation of this:

- documentation for the "git diff-index" command ("git help
  diff-index"). It does not mention this behavior.

- documentation for the "git diff" command ("git help diff"). It also
  doesn't mention this. That's particularly surprising because it
  would be a great place to document the diff.autoRefreshIndex setting
  that affects this behavior of the "git diff" command (described in
  Documentation/config/diff.txt).

- the Git user manual (Documentation/user-manual.txt). It describes
  "git update-index --refresh" but very briefly. It doesn't describe
  the above scripting pattern.

- Git's command-line conventions ("git help cli"). No mention.

- overview of plumbing and porcelain commands ("man git"). No mention.

- the Git scripting manual ("git help core-tutorial"). It describes
  "git update-index --refresh" after a "cp -a" but not its use in
  scripts.

- the history of Git's contrib/examples/. This contains many examples
  of the above scripting pattern but is not very discoverable.

So there are many opportunities for someone to document this better.
If you'd be interested in pursuing that, I'd be happy to provide some
pointers.

Thanks,
Jonathan

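The refresh-once pattern also covers the "git diff-files" plus "git
diff-index" combination mentioned earlier in the thread. The following
is a hedged sketch rather than anything posted by the participants: the
function name worktree_is_clean is hypothetical, and the use of
"--cached" for the index-versus-HEAD comparison is a choice made for
this example.

```shell
#!/bin/sh
# Hypothetical helper (the name is made up for this sketch): succeeds
# only when the working tree and the index both match HEAD.
worktree_is_clean() {
	# One refresh serves both checks below. Its exit status is ignored
	# on purpose; the diff commands make the actual decision.
	git update-index -q --refresh || :

	# First compare the working tree with the index, then the index
	# with HEAD. Neither command refreshes the index by itself.
	git diff-files --quiet -- &&
	git diff-index --cached --quiet HEAD --
}
```

A script would call it as "if worktree_is_clean; then ...; fi" from
inside a repository, paying for the refresh only once per call.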
* Re: Weird behaviour of git diff-index in container
From: Junio C Hamano @ 2022-05-10 16:47 UTC
To: Jonathan Nieder; +Cc: Timo Funke, git@vger.kernel.org

Jonathan Nieder <jrnieder@gmail.com> writes:

> I love this example. Thanks for writing.

I guess our mails crossed ;-)

> Some places I expected to find an explanation of this:
>
> - documentation for the "git diff-index" command ("git help
>   diff-index"). It does not mention this behavior.

Yes, diff-index and diff-files should at least have a pointer to
"update-index --refresh". Ideally they should share a write-up based on
what both of us covered in these responses.

> - documentation for the "git diff" command ("git help diff"). It also
>   doesn't mention this. That's particularly surprising because it
>   would be a great place to document the diff.autoRefreshIndex setting
>   that affects this behavior of the "git diff" command (described in
>   Documentation/config/diff.txt).

And the autorefreshindex documentation is a tad stale (it is on by
default these days) and does not say why you would want it. I do not
mind config/diff.txt having it, but that should eventually refer to the
same page that is designed to help the readers of the diff-index and
diff-files documentation.

I do not think the missing info belongs anywhere else, but stepping back
a bit, it may help to have a write-up on scripting using the plumbing
commands in general, not limited to the "diff-*" family of commands.

I actually am torn a bit, as we have long neglected to give matching
improvements to plumbing commands when we add shiny new toys to commands
at the Porcelain level, so Git may have grown much more hostile to
scripters over the years X-<.