* Tracking a merge bug @ 2020-10-26 15:08 Ephrim Khong 2020-10-26 15:50 ` Elijah Newren 0 siblings, 1 reply; 3+ messages in thread From: Ephrim Khong @ 2020-10-26 15:08 UTC (permalink / raw) To: GIT Mailing-list Dear All, I am trying to find the root cause for what I believe might be a strange bug in git merge. I have a feature branch A which branched off master not too long ago, and want to bring it up to date with master: git checkout A git merge master which yields Removing somefile Removing anotherfile error: add_cacheinfo failed to refresh for path 'c/d/e.sh'; merge aborting. the offending file, c/d/e.sh, does not exist in my feature branch but was added to master since branching off. After aborting, the working directory is in an inconsistent state and c/d/e.sh exists with the correct content. Below is a stacktrace - the merger handles the file as a rename (apparently there is a similar / identical file 'c/f/g.sh' that is renamed to 'c/d/e.sh'), but that fails because the file has MODE_CHANGED set. (Which appears strange - at the time where the merge is aborting, the file apparently was already written to the working directory. Is it renaming two different files to the same target file?). Any hint is appreciated, especially where to look: Is the root cause more likely to be at the filesystem level (the stat returns something off), or in the merge logic? What else could be wrong here? The stacktrace looks roughly as follows: -> read-cache.c, ie_modified(): ie_match_stat returned 63, which is MTIME_CHANGED | CTIME_CHANGED | OWNER_CHANGED | MODE_CHANGED | INODE_CHANGED | DATA_CHANGED and is_modified() returns 63 because MODE_CHANGED is set. -> read-cache.c, refresh_cache_ent(): at the call to ie_modified -> read-cache.c, refresh_cache_entry() -> merge-recursive.c, add_cacheinfo(), is in the refresh-path (i.e. make_cache_entry() worked, but refresh_cache_entry() will fail) -> merge-recursive.c, update_file_flags(), after the update_index: label -> merge-recursive.c, update_file() -> merge-recursive.c, handle_content_merge() is in the very last update_file() call, close to the end of the function -> merge-recursive.c, handle_rename_normal() -> merge-recursive.c, process_entry() is in the RENAME_NORMAL / RENAME_ONE_FILE_TO_ONE block -> [...] Thanks - Eph ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: Tracking a merge bug 2020-10-26 15:08 Tracking a merge bug Ephrim Khong @ 2020-10-26 15:50 ` Elijah Newren 2021-03-05 11:07 ` Ephrim Khong 0 siblings, 1 reply; 3+ messages in thread From: Elijah Newren @ 2020-10-26 15:50 UTC (permalink / raw) To: Ephrim Khong; +Cc: GIT Mailing-list On Mon, Oct 26, 2020 at 8:11 AM Ephrim Khong <dr.khong@gmail.com> wrote: > > Dear All, > > I am trying to find the root cause for what I believe might be a strange > bug in git merge. I have a feature branch A which branched off master > not too long ago, and want to bring it up to date with master: > > git checkout A > git merge master > > which yields > > Removing somefile > Removing anotherfile > error: add_cacheinfo failed to refresh for path 'c/d/e.sh'; merge > aborting. "add_cacheinfo failed to refresh"? Wow, that's a new one. Some years back we had a "add_cacheinfo failed for path" corresponding to the other error site within that function, but we fixed that one up long ago. I've never seen anything hit the refresh failure. The fact that you're seeing this error message means you're using git >= 2.18.0. Could you verify the exact version? Also, I suspect "somefile" and "anotherfile" are placeholders. Any chance we could see the real names? > the offending file, c/d/e.sh, does not exist in my feature branch but > was added to master since branching off. After aborting, the working > directory is in an inconsistent state and c/d/e.sh exists with the > correct content. Is c/d a file or submodule or anything in the feature branch? Do you have any untracked files at the beginning of the merge named c/d or c/d/e.sh? I'm trying to guess at what might be weird about the paths here. > Below is a stacktrace - the merger handles the file as a rename > (apparently there is a similar / identical file 'c/f/g.sh' that is > renamed to 'c/d/e.sh'), but that fails because the file has MODE_CHANGED > set. (Which appears strange - at the time where the merge is aborting, > the file apparently was already written to the working directory. Is it > renaming two different files to the same target file?). What's the output of git ls-tree -r master | grep c/d/e.sh git ls-tree -r feature | grep c/d/f.sh git ls-tree -r $(git merge-base master feature) | grep c/d/f.sh ? And, before the merge, what is the output of ls -l c/d/e.sh c/d/f.sh ? > Any hint is appreciated, especially where to look: Is the root cause > more likely to be at the filesystem level (the stat returns something > off), or in the merge logic? What else could be wrong here? Not sure; I've never hit this before. I'm curious if you've got a weird mode recorded for one of the files in your tree or something. We might have to get someone who understands the index data structure and read-cache.c to take a look. > The stacktrace looks roughly as follows: > > -> read-cache.c, ie_modified(): ie_match_stat returned 63, which is > MTIME_CHANGED | CTIME_CHANGED | OWNER_CHANGED | > MODE_CHANGED | INODE_CHANGED | DATA_CHANGED > and is_modified() returns 63 because MODE_CHANGED is set. > > -> read-cache.c, refresh_cache_ent(): at the call to ie_modified > > -> read-cache.c, refresh_cache_entry() > > -> merge-recursive.c, add_cacheinfo(), is in the refresh-path (i.e. > make_cache_entry() worked, but refresh_cache_entry() will fail) > > -> merge-recursive.c, update_file_flags(), after the update_index: label > > -> merge-recursive.c, update_file() > > -> merge-recursive.c, handle_content_merge() is in the very last > update_file() call, close to the end of the function > > -> merge-recursive.c, handle_rename_normal() > > -> merge-recursive.c, process_entry() > is in the RENAME_NORMAL / RENAME_ONE_FILE_TO_ONE block RENAME_NORMAL / RENAME_ONE_FILE_TO_ONE means this is NOT a case of renaming two files to one. From just this information, it could be that both sides renamed c/d/f.sh -> c/d/e.sh -- but you ruled that out when you said c/d/e.sh didn't exist in your branch. So, that basically leaves us with a normal rename; i.e. c/d/f.sh existed in the merge base and in your feature branch, but it was renamed to c/d/e.sh in master. Any chance this repository is available for others to access to try to reproduce the problem? If the repository has sensitive contents, are the files c/d/f.sh and c/d/e.sh sensitive? If the repository overall is sensitive but those two files aren't, we might be able to get a reproduction by cloning the repository and running git filter-repo --path c/d/e.sh --path c/d/f.sh to filter the repository down to just these two files (after first installing git-filter-repo; see https://github.com/newren/git-filter-repo). ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: Tracking a merge bug 2020-10-26 15:50 ` Elijah Newren @ 2021-03-05 11:07 ` Ephrim Khong 0 siblings, 0 replies; 3+ messages in thread From: Ephrim Khong @ 2021-03-05 11:07 UTC (permalink / raw) To: Elijah Newren; +Cc: GIT Mailing-list On 26.10.2020 16:50, Elijah Newren wrote: > On Mon, Oct 26, 2020 at 8:11 AM Ephrim Khong <dr.khong@gmail.com> wrote: >> >> I am trying to find the root cause for what I believe might be a strange >> bug in git merge. I have a feature branch A which branched off master >> not too long ago, and want to bring it up to date with master: >> >> git checkout A >> git merge master >> >> which yields >> >> error: add_cacheinfo failed to refresh for path 'c/d/e.sh'; merge >> aborting. > > "add_cacheinfo failed to refresh"? Wow, that's a new one. Some years > back we had a "add_cacheinfo failed for path" corresponding to the > other error site within that function, but we fixed that one up long > ago. I've never seen anything hit the refresh failure. Thank you for looking into this and sorry for the delay. I ran into this again and did some more testing. The merge works if I copy the complete repository to a different filesystem (in this case a local SSD) with cp -a. It was originally on a network share. An strace on the affected file seems to show that git creates and writes the file, attempting to set the executable bit. But the subsequent lstat reports that no executable bit is set (and that is correct, when looking at the file after the merge errors out): > lstat("tools/ci/nightly/run_benchmarks.sh", 0x7ffc2f148c20) = -1 ENOENT (No such file or directory) > openat(AT_FDCWD, "tools/ci/nightly/run_benchmarks.sh", O_WRONLY|O_CREAT|O_EXCL, 0777) = 4 > write(4, "#!/bin/bash\n#\n# Build and run hb"..., 1973) = 1973 > fstat(4, {st_mode=S_IFREG|0755, st_size=1973, ...}) = 0 > close(4) = 0 > > lstat("tools/ci/nightly/run_benchmarks.sh", {st_mode=S_IFREG|0755, st_size=1973, ...}) = 0 #<-- first lstat, mode is OK > unlink("tools/ci/nightly/run_benchmarks.sh") = 0 > openat(AT_FDCWD, "tools/ci/nightly/run_benchmarks.sh", O_WRONLY|O_CREAT|O_TRUNC, 0777) = 4 > write(4, "#!/bin/bash\n#\n# Build and run hb"..., 1973) = 1973 > close(4) = 0 > lstat("tools/ci/nightly/run_benchmarks.sh", {st_mode=S_IFREG|0640, st_size=1973, ...}) = 0 #<-- second lstat, mode is wrong > > write(2, "error: add_cacheinfo failed to r"..., 120error: add_cacheinfo failed to refresh for path 'tools/ci/nightly/run_benchmarks.sh' (2); merge aborting. note how the last lstat() reports 0640, even though openat() wanted 0777. The same thing when merging on the SSD shows > lstat("tools/ci/nightly/run_benchmarks.sh", 0x7ffd491745c0) = -1 ENOENT (No such file or directory) > openat(AT_FDCWD, "tools/ci/nightly/run_benchmarks.sh", O_WRONLY|O_CREAT|O_EXCL, 0777) = 4 > write(4, "#!/bin/bash\n#\n# Build and run hb"..., 1973) = 1973 > fstat(4, {st_mode=S_IFREG|0755, st_size=1973, ...}) = 0 > close(4) = 0 > > lstat("tools/ci/nightly/run_benchmarks.sh", {st_mode=S_IFREG|0755, st_size=1973, ...}) = 0 > unlink("tools/ci/nightly/run_benchmarks.sh") = 0 > openat(AT_FDCWD, "tools/ci/nightly/run_benchmarks.sh", O_WRONLY|O_CREAT|O_TRUNC, 0777) = 4 > write(4, "#!/bin/bash\n#\n# Build and run hb"..., 1973) = 1973 > close(4) = 0 > lstat("tools/ci/nightly/run_benchmarks.sh", {st_mode=S_IFREG|0755, st_size=1973, ...}) = 0 > > openat(AT_FDCWD, "tools/ci/nightly/run_benchmarks.sh", O_RDONLY|O_CLOEXEC) = 4 > read(4, "#!/bin/bash\n#\n# Build and run hb"..., 1973) = 1973 > close(4) = 0 here, lstat() reports 0755 and the merge continues. This could very well be an issue with the storage system backing my network share, but maybe I overlooked something in that strace that git is doing wrong or could do better. It does, for example, write the file twice with identical content for some reason. I have patched my git for now, and in read-cache.c, refresh_cache_ent() call chmod() and re-run lstat() if the mode is incorrect. That fixes the problem for now. It's not perfect, since chmod() ignores the umask, but at least allows to merge. > Any chance this repository is available for others to access to try to > reproduce the problem? Unfortunately it's not public. I have tried to cut out the offending part, but was unable to reproduce it with a smaller test case yet. Thanks - Eph ^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2021-03-05 11:08 UTC | newest] Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-10-26 15:08 Tracking a merge bug Ephrim Khong 2020-10-26 15:50 ` Elijah Newren 2021-03-05 11:07 ` Ephrim Khong
Code repositories for project(s) associated with this public inbox https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).