From: "Torsten Bögershausen" <tboegi@web.de>
To: Junio C Hamano <gitster@pobox.com>
Cc: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
git@jeffhostetler.com, git@vger.kernel.org, newren@gmail.com,
pawelparuzel95@gmail.com, peff@peff.net,
sandals@crustytoothpaste.net,
"SZEDER Gábor" <szeder.dev@gmail.com>
Subject: Re: [PATCH v4] clone: report duplicate entries on case-insensitive filesystems
Date: Thu, 16 Aug 2018 16:03:12 +0200
Message-ID: <20180816140312.GA6102@tor.lan>
In-Reply-To: <xmqqtvnvh12u.fsf@gitster-ct.c.googlers.com>
On Wed, Aug 15, 2018 at 12:38:49PM -0700, Junio C Hamano wrote:
This should answer Duy's comments as well.
> Torsten Bögershausen <tboegi@web.de> writes:
>
[snip]
> > Should the following be protected by core.checkstat ?
> > if (check_stat) {
>
> I do not think such an if statement is strictly necessary.
>
> Even if check_stat tells us "when checking if a cached stat
> information tells us that the path may have been modified, use minimum
> set of fields from the 'struct stat'", we still capture and update
> the values from the same "full" set of fields when we mark a cache
> entry up-to-date. So it all depends on why you are limiting with
> check_stat. Is it because stdev is unusable? Is it because nsec is
> unusable? Is it because ino is unusable? Only in the last case,
> paying attention to check_stat will reduce the false positive.
>
> But then you made me wonder what value check_stat has on Windows.
> If it is false, perhaps we do not even need the conditional
> compilation, which is a huge plus.
Agreed:
check_stat is 0 on Windows, and inum is always 0 in lstat().
I was thinking about systems which don't have inodes and inum,
and then generate an inum in memory, sometimes at random.
After a reboot or a re-mount of the file system those ino values
change.
However, for the initial clone we are fine in any case.
>
> >> +		if (dup->ce_stat_data.sd_ino == st->st_ino) {
> >> +			dup->ce_flags |= CE_MATCHED;
> >> +			break;
> >> +		}
> >> +	}
> >> +#endif
> >
> > Another thing is that we switch the ASCII case-folding-detection-logic
> > off for Windows users, even if we otherwise rely on icase.
> > I think we can use fspathcmp() as a fallback when inodes fail,
> > because we may be on a network file system.
> >
> > (I don't have a test setup at the moment, but what happens with inodes
> > when a Windows machine exports a share to Linux or Mac ?)
> >
> > Is there a chance to get the fspathcmp() back, like this ?
>
> If fspathcmp() never gives false positives, I do not think we would
> mind using it like your update. False negatives are fine, as that
> is better than just punting the whole thing when there is no usable
> inum. And we do not care all that much if it is more expensive;
> this is an error codepath after all.
>
> And from code structure's point of view, I think it makes sense. It
> would be even better if we can lose the conditional compilation.
The current implementation of fspathcmp() does not give false positives,
and future versions should not either.
All case-insensitive file systems have always treated 'a-z' as equal to 'A-Z'.
In FAT under MS-DOS, file names consisted of uppercase letters only,
and `type file.txt` (the equivalent of `cat file.txt` in *nix)
simply resulted in `type FILE.TXT`.
Later, with VFAT and then with HPFS/NTFS, a file could be stored on
disk as "File.txt".
From then on, `type FILE.TXT` still worked (as did all other upper/lowercase
combinations).
All of this is probably nothing new.
The main point is that fspathcmp() should never return a false positive,
and I think we all agree on that.
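To make the "no false positives" argument concrete, here is a minimal
sketch of a comparison that folds only ASCII 'A'-'Z' onto 'a'-'z'. The
names ascii_fold() and ascii_icase_pathcmp() are made up for this
illustration; this is not the fspathcmp() implementation from Git, only
a model of the property discussed above.

```c
/*
 * Hypothetical helpers for illustration only; not Git's fspathcmp().
 * Only ASCII 'A'-'Z' is folded, every other byte compares as-is.
 */
static int ascii_fold(unsigned char c)
{
	return (c >= 'A' && c <= 'Z') ? c - 'A' + 'a' : c;
}

/* Returns 0 when the two paths differ at most in ASCII letter case. */
static int ascii_icase_pathcmp(const char *a, const char *b)
{
	for (;; a++, b++) {
		int ca = ascii_fold((unsigned char)*a);
		int cb = ascii_fold((unsigned char)*b);

		if (ca != cb)
			return ca < cb ? -1 : 1;
		if (!ca)
			return 0;	/* both strings ended together */
	}
}
```

With this, "File.txt" and "FILE.TXT" compare equal, while two names that
differ only in a non-ASCII letter (for example the UTF-8 encodings of an
upper- and a lowercase umlaut) compare as different. On a file system
that folds such letters this is a false negative, never a false positive.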
Now back to the compiler switch:
Windows always sets inum to 0, and I can't think of a situation where
a file in a working tree gets inum = 0 on other systems, so can we use
the following?
static void mark_colliding_entries(const struct checkout *state,
				   struct cache_entry *ce, struct stat *st)
{
	int i;

	ce->ce_flags |= CE_MATCHED;

	for (i = 0; i < state->istate->cache_nr; i++) {
		struct cache_entry *dup = state->istate->cache[i];
		int folded = 0;

		if (dup == ce)
			break;

		if (dup->ce_flags & (CE_MATCHED | CE_VALID | CE_SKIP_WORKTREE))
			continue;

		/*
		 * Windows sets ino to 0. On other FS ino = 0 will already be
		 * used, so we don't see it for a file in a Git working tree
		 */
		if (st->st_ino && (dup->ce_stat_data.sd_ino == st->st_ino))
			folded = 1;

		/*
		 * Fallback for NTFS and other case-insensitive FS,
		 * which don't use POSIX inums
		 */
		if (!fspathcmp(dup->name, ce->name))
			folded = 1;

		if (folded) {
			dup->ce_flags |= CE_MATCHED;
			break;
		}
	}
}
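
For anyone who wants to play with the logic outside of Git, the same
idea can be modelled in a standalone form. This is only a sketch under
assumptions: struct entry and mark_collision() are hypothetical
stand-ins for struct cache_entry and the function above, and POSIX
strcasecmp() stands in for fspathcmp() on a case-insensitive setup.

```c
#include <stddef.h>
#include <strings.h>	/* strcasecmp(), POSIX */

/* Hypothetical stand-in for the cache_entry fields used in this model. */
struct entry {
	const char *name;
	unsigned long ino;	/* 0 models a platform without usable inums */
	int matched;
};

/*
 * Mark entries[idx] and the first earlier, not yet marked entry that
 * collides with it. st_ino is the inode number lstat() reported for the
 * path being checked out. Returns the index of the colliding earlier
 * entry, or -1 if none was found.
 */
static int mark_collision(struct entry *entries, size_t idx,
			  unsigned long st_ino)
{
	size_t i;

	entries[idx].matched = 1;

	for (i = 0; i < idx; i++) {
		struct entry *dup = &entries[i];

		if (dup->matched)
			continue;

		/* Trust the inode only when it is non-zero, else fall back to names. */
		if ((st_ino && dup->ino == st_ino) ||
		    !strcasecmp(dup->name, entries[idx].name)) {
			dup->matched = 1;
			return (int)i;
		}
	}
	return -1;
}
```

When st_ino is 0 (the Windows case described above), only the name
comparison can find the earlier entry; when it is non-zero, either check
may find it.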
>
> Another thing we maybe want to see is if we can update the caller of
> this function so that we do not overwrite the earlier checkout with
> the data for this path. When two paths collide, we check out one of
> the paths without reporting (because we cannot notice), then attempt
> to check out the other path and report (because we do notice the
> previous one with lstat()). The current code then goes on and overwrites
> the file with the contents from the "other" path.
>
> Even if we had a false negative in this loop, if we leave the contents
> for the earlier path while reporting the "other" path, then the user
> can get curious, inspect what contents the "other" path has on the
> filesystem, and can notice that it belongs to the (unreported--due
> to false negative) earlier path.
>
[snip]