git@vger.kernel.org mailing list mirror (one of many)
From: Elijah Newren <newren@gmail.com>
To: Matheus Tavares Bernardino <matheus.bernardino@usp.br>
Cc: Junio C Hamano <gitster@pobox.com>, git <git@vger.kernel.org>
Subject: Re: [PATCH v3 5/7] refresh_index(): add REFRESH_DONT_MARK_SPARSE_MATCHES flag
Date: Wed, 31 Mar 2021 02:14:29 -0700
Message-ID: <CABPp-BGjuz1ZEriCOhrpakQCQ8AZ12ovir5Vm233nadvywcWcA@mail.gmail.com>
In-Reply-To: <CAHd-oW4kRLjV9Sq3CFt-V1Ot9pYFzJggU1zPp3Hcuw=qWfq7Mg@mail.gmail.com>

Hi,

On Tue, Mar 30, 2021 at 11:51 AM Matheus Tavares Bernardino
<matheus.bernardino@usp.br> wrote:
>
> On Fri, Mar 19, 2021 at 1:05 PM Junio C Hamano <gitster@pobox.com> wrote:
> >
> > Matheus Tavares Bernardino <matheus.bernardino@usp.br> writes:
> >
> > >> In other words, the change makes me wonder why we are not adding a
> > >> flag that says "do we or do we not want to match paths outside the
> > >> sparse checkout cone?", with which the seen[] would automatically
> > >> record the right thing.
> > >
> > > Yeah, makes sense. I didn't want to make the flag skip the sparse
> > > paths unconditionally (i.e. without matching them) because then we
> > > would also skip the ce_stage() checkings below and the later
> > > ce_mark_uptodate(). And I wasn't sure whether this could cause any
> > > unwanted side effects.
> > >
> > > But thinking more carefully about this now, unmerged paths should
> > > never have the SKIP_WORKTREE bit set anyway, right? What about the
> > > CE_UPTODATE mark, would it be safe to skip it? I'm not very familiar
> > > with this code, but I'll try to investigate more later.
>
> Sorry I haven't given any update on this yet. From what I could see so
> far, it seems OK to ignore the skip_worktree entries in
> refresh_index() when it is called from `git add --refresh`. But
> because we would no longer mark the skip_worktree entries that match
> the pathspec with CE_UPTODATE, do_write_index() would start checking
> if they are racy clean (this is only done when `!ce_uptodate(ce)`),
> which could add some lstat() overhead.
>
> However, this made me think what happens today if we do have a racy
> clean entry with the skip_worktree bit set... `git add --refresh` will
> probably update the index without noticing that the entry is racy
> clean (because it won't check CE_UPTODATE entries, and skip_worktree
> entries will have this bit set in refresh_index()). Thus the entries'
> size won't be truncated to zero when writing the index, and the entry
> will appear unchanged even if we later unset the skip_worktree bit.
>
> But can we have a "racy clean skip_worktree entry"? Yes, this seems
> possible e.g. if the following sequence happens fast enough for mtime
> to be the same before and after the update:
>
>   echo x >file
>   git update-index --refresh --skip-worktree file
>   echo y>file
>
> Here is a more complete example which artificially creates a "racy
> clean skip_worktree entry", runs `git add --refresh`, and shows that
> the racy clean entry was not detected:
>
> # Setup
> echo sparse >sparse
> echo dense >dense
> git add .
> git commit -m files
>
> # Emulate a racy clean situation
> touch -d yesterday date
> touch -r date sparse
> git update-index --refresh --skip-worktree sparse
> touch -r date .git/index
> echo xparse >sparse
> touch -r date sparse
>
> # `git add` will now write a new index without checking if
> # `sparse` is racy clean nor truncating its size
> touch -r date dense
> git add --refresh .
>
> git update-index --no-skip-worktree sparse
> git status
> <doesn't show that `sparse` was modified>
>
> This situation feels rather uncommon, but perhaps the same could
> happen with `git sparse-checkout set` instead of `git update-index
> --refresh --skip-worktree`? IDK. This made me think whether
> refresh_index() should really mark skip_worktree entries with
> CE_UPTODATE, in the first place.
>
> Any thoughts?

Wow, that's a weird one.  Nice digging.  I don't understand the racily
clean stuff that well, or the CE_UPTODATE handling...but based on my
meager understanding of these things I'd say that marking
skip_worktree entries with CE_UPTODATE seems wrong and I'd agree with
your hunch that we shouldn't be updating it for those files.  If an
entry is skip_worktree, it's assumed to not be present in the working
tree and that we'll treat it as "unchanged".  So, when the file happens
to be present despite that bit being set, checking the file's stat
information, or pretending to have done so in order to claim it is up to
date, seems wrong to me.  In fact, I'd say the mere recognition that
the file is present in the working tree despite being SKIP_WORKTREE to
me implies it's not "up-to-date" for at least one definition of that
term.

I suspect that if someone flips the skip-worktree bit on and off for a
file without removing it from the working tree, not marking
skip_worktree entries with CE_UPTODATE as you mention above might
force us to incur the cost of an extra stat on said file when we run
"git status" later.  If so, I think that's a totally reasonable cost,
so if that was your worry, I'd say that this is shifting the cost
where it belongs.

But, like I said, I'm not very familiar with the racily clean code or
CE_UPTODATE, so it's possible I said something above that doesn't even
make sense.  Does my reasoning help at all?
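
To make the interaction discussed above concrete, here is a minimal toy
model in C.  It is not Git's code: every name in it (toy_entry,
toy_refresh, toy_write, toy_demo, the TOY_* flags) is hypothetical, and
the "modified" field stands in for an actual comparison against the
working tree.  It only assumes what the thread describes: the refresh
step marks skip_worktree entries up to date without examining them, and
the write step runs the racy-clean check (truncating the recorded size
to zero) only on entries that are not marked up to date.

```c
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-ins for the index entry flags discussed in the thread. */
enum {
	TOY_UPTODATE      = 1 << 0,
	TOY_SKIP_WORKTREE = 1 << 1
};

struct toy_entry {
	unsigned flags;
	time_t mtime;   /* mtime recorded for the entry */
	long size;      /* size recorded for the entry */
	bool modified;  /* stand-in for "worktree content differs" */
};

/*
 * Toy refresh: skip_worktree entries are marked up to date without being
 * examined, unless ignore_skip_worktree asks us to leave them alone.
 * Other entries are marked up to date only when they are unmodified.
 */
static void toy_refresh(struct toy_entry *e, size_t n, bool ignore_skip_worktree)
{
	for (size_t i = 0; i < n; i++) {
		if (e[i].flags & TOY_SKIP_WORKTREE) {
			if (!ignore_skip_worktree)
				e[i].flags |= TOY_UPTODATE;
			continue;
		}
		if (!e[i].modified)
			e[i].flags |= TOY_UPTODATE;
	}
}

/*
 * Toy index write: only entries NOT marked up to date get the racy-clean
 * check.  A racy, modified entry has its recorded size truncated to zero
 * so that a later comparison cannot mistake it for clean.
 * Returns the number of checks performed (the extra lstat-like cost).
 */
static int toy_write(struct toy_entry *e, size_t n, time_t index_mtime)
{
	int checks = 0;

	for (size_t i = 0; i < n; i++) {
		if (e[i].flags & TOY_UPTODATE)
			continue;
		checks++;
		if (e[i].mtime >= index_mtime && e[i].modified)
			e[i].size = 0;
	}
	return checks;
}

/* Runs refresh + write on a "dense" and a racy, modified "sparse" entry. */
static long toy_demo(bool ignore_skip_worktree, int *checks)
{
	struct toy_entry entries[] = {
		{ 0,                 100, 6, false }, /* "dense": clean */
		{ TOY_SKIP_WORKTREE, 100, 7, true  }, /* "sparse": racy and modified */
	};

	toy_refresh(entries, 2, ignore_skip_worktree);
	*checks = toy_write(entries, 2, 100);
	return entries[1].size; /* recorded size of "sparse" after the write */
}
```

In this model, toy_demo(false, &checks) leaves the recorded size of the
"sparse" entry untouched and performs no checks, which mirrors the
undetected racy entry in the quoted example.  toy_demo(true, &checks)
performs one extra check and truncates the size to zero, which is the
cost-for-correctness trade-off argued for above.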


Thread overview: 93+ messages
2020-11-12 21:01 [PATCH] rm: honor sparse checkout patterns Matheus Tavares
2020-11-12 23:54 ` Elijah Newren
2020-11-13 13:47   ` Derrick Stolee
2020-11-15 20:12     ` Matheus Tavares Bernardino
2020-11-15 21:42       ` Johannes Sixt
2020-11-16 12:37         ` Matheus Tavares Bernardino
2020-11-23 13:23           ` Johannes Schindelin
2020-11-24  2:48             ` Matheus Tavares Bernardino
2020-11-16 14:30     ` Jeff Hostetler
2020-11-17  4:53       ` Elijah Newren
2020-11-16 13:58 ` [PATCH v2] " Matheus Tavares
2021-02-17 21:02   ` [RFC PATCH 0/7] add/rm: honor sparse checkout and warn on sparse paths Matheus Tavares
2021-02-17 21:02     ` [RFC PATCH 1/7] add --chmod: don't update index when --dry-run is used Matheus Tavares
2021-02-17 21:45       ` Junio C Hamano
2021-02-18  1:33         ` Matheus Tavares
2021-02-17 21:02     ` [RFC PATCH 2/7] add: include magic part of pathspec on --refresh error Matheus Tavares
2021-02-17 22:20       ` Junio C Hamano
2021-02-17 21:02     ` [RFC PATCH 3/7] t3705: add tests for `git add` in sparse checkouts Matheus Tavares
2021-02-17 23:01       ` Junio C Hamano
2021-02-17 23:22         ` Eric Sunshine
2021-02-17 23:34           ` Junio C Hamano
2021-02-18  3:11           ` Matheus Tavares Bernardino
2021-02-18  3:07         ` Matheus Tavares Bernardino
2021-02-18 14:38           ` Matheus Tavares
2021-02-18 19:05             ` Junio C Hamano
2021-02-18 19:02           ` Junio C Hamano
2021-02-22 18:53         ` Elijah Newren
2021-02-17 21:02     ` [RFC PATCH 4/7] add: make --chmod and --renormalize honor " Matheus Tavares
2021-02-17 21:02     ` [RFC PATCH 5/7] pathspec: allow to ignore SKIP_WORKTREE entries on index matching Matheus Tavares
2021-02-17 21:02     ` [RFC PATCH 6/7] add: warn when pathspec only matches SKIP_WORKTREE entries Matheus Tavares
2021-02-19  0:34       ` Junio C Hamano
2021-02-19 17:11         ` Matheus Tavares Bernardino
2021-02-17 21:02     ` [RFC PATCH 7/7] rm: honor sparse checkout patterns Matheus Tavares
2021-02-22 18:57     ` [RFC PATCH 0/7] add/rm: honor sparse checkout and warn on sparse paths Elijah Newren
2021-02-24  4:05     ` [PATCH v2 " Matheus Tavares
2021-02-24  4:05       ` [PATCH v2 1/7] add: include magic part of pathspec on --refresh error Matheus Tavares
2021-02-24  4:05       ` [PATCH v2 2/7] t3705: add tests for `git add` in sparse checkouts Matheus Tavares
2021-02-24  5:15         ` Elijah Newren
2021-02-24  4:05       ` [PATCH v2 3/7] add: make --chmod and --renormalize honor " Matheus Tavares
2021-02-24  4:05       ` [PATCH v2 4/7] pathspec: allow to ignore SKIP_WORKTREE entries on index matching Matheus Tavares
2021-02-24  5:23         ` Elijah Newren
2021-02-24  4:05       ` [PATCH v2 5/7] refresh_index(): add REFRESH_DONT_MARK_SPARSE_MATCHES flag Matheus Tavares
2021-02-24  4:05       ` [PATCH v2 6/7] add: warn when pathspec only matches SKIP_WORKTREE entries Matheus Tavares
2021-02-24  6:50         ` Elijah Newren
2021-02-24 15:33           ` Matheus Tavares
2021-03-04 15:23           ` Matheus Tavares
2021-03-04 17:21             ` Elijah Newren
2021-03-04 21:03               ` Junio C Hamano
2021-03-04 22:48                 ` Elijah Newren
2021-03-04 21:26               ` Matheus Tavares Bernardino
2021-02-24  4:05       ` [PATCH v2 7/7] rm: honor sparse checkout patterns Matheus Tavares
2021-02-24  6:59         ` Elijah Newren
2021-02-24  7:05       ` [PATCH v2 0/7] add/rm: honor sparse checkout and warn on sparse paths Elijah Newren
2021-03-12 22:47       ` [PATCH v3 " Matheus Tavares
2021-03-12 22:47         ` [PATCH v3 1/7] add: include magic part of pathspec on --refresh error Matheus Tavares
2021-03-12 22:47         ` [PATCH v3 2/7] t3705: add tests for `git add` in sparse checkouts Matheus Tavares
2021-03-23 20:00           ` Derrick Stolee
2021-03-12 22:47         ` [PATCH v3 3/7] add: make --chmod and --renormalize honor " Matheus Tavares
2021-03-12 22:47         ` [PATCH v3 4/7] pathspec: allow to ignore SKIP_WORKTREE entries on index matching Matheus Tavares
2021-03-12 22:48         ` [PATCH v3 5/7] refresh_index(): add REFRESH_DONT_MARK_SPARSE_MATCHES flag Matheus Tavares
2021-03-18 23:45           ` Junio C Hamano
2021-03-19  0:00             ` Junio C Hamano
2021-03-19 12:23               ` Matheus Tavares Bernardino
2021-03-19 16:05                 ` Junio C Hamano
2021-03-30 18:51                   ` Matheus Tavares Bernardino
2021-03-31  9:14                     ` Elijah Newren [this message]
2021-03-12 22:48         ` [PATCH v3 6/7] add: warn when asked to update SKIP_WORKTREE entries Matheus Tavares
2021-03-12 22:48         ` [PATCH v3 7/7] rm: honor sparse checkout patterns Matheus Tavares
2021-03-21 23:03           ` Ævar Arnfjörð Bjarmason
2021-03-22  1:08             ` Matheus Tavares Bernardino
2021-03-23 20:47           ` Derrick Stolee
2021-03-13  7:07         ` [PATCH v3 0/7] add/rm: honor sparse checkout and warn on sparse paths Elijah Newren
2021-04-08 20:41         ` [PATCH v4 " Matheus Tavares
2021-04-08 20:41           ` [PATCH v4 1/7] add: include magic part of pathspec on --refresh error Matheus Tavares
2021-04-08 20:41           ` [PATCH v4 2/7] t3705: add tests for `git add` in sparse checkouts Matheus Tavares
2021-04-14 16:39             ` Derrick Stolee
2021-04-08 20:41           ` [PATCH v4 3/7] add: make --chmod and --renormalize honor " Matheus Tavares
2021-04-08 20:41           ` [PATCH v4 4/7] pathspec: allow to ignore SKIP_WORKTREE entries on index matching Matheus Tavares
2021-04-08 20:41           ` [PATCH v4 5/7] refresh_index(): add flag to ignore SKIP_WORKTREE entries Matheus Tavares
2021-04-08 20:41           ` [PATCH v4 6/7] add: warn when asked to update " Matheus Tavares
2021-04-08 20:41           ` [PATCH v4 7/7] rm: honor sparse checkout patterns Matheus Tavares
2021-04-14 16:36           ` [PATCH v4 0/7] add/rm: honor sparse checkout and warn on sparse paths Elijah Newren
2021-04-14 18:04             ` Matheus Tavares Bernardino
2021-04-16 21:33           ` Junio C Hamano
2021-04-16 23:17             ` Elijah Newren
2020-11-16 20:14 ` [PATCH] rm: honor sparse checkout patterns Junio C Hamano
2020-11-17  5:20   ` Elijah Newren
2020-11-20 17:06     ` Elijah Newren
2020-12-31 20:03       ` sparse-checkout questions and proposals [Was: Re: [PATCH] rm: honor sparse checkout patterns] Elijah Newren
2021-01-04  3:02         ` Derrick Stolee
2021-01-06 19:15           ` Elijah Newren
2021-01-07 12:53             ` Derrick Stolee
2021-01-07 17:36               ` Elijah Newren
