git@vger.kernel.org mailing list mirror (one of many)
From: Junio C Hamano <gitster@pobox.com>
To: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, pclouds@gmail.com,
	Derrick Stolee <derrickstolee@github.com>
Subject: Re: [PATCH] dir: force untracked cache with core.untrackedCache
Date: Mon, 14 Feb 2022 12:16:41 -0800
Message-ID: <xmqqzgmt19w6.fsf@gitster.g>
In-Reply-To: <pull.1058.git.1644860224151.gitgitgadget@gmail.com> (Derrick Stolee via GitGitGadget's message of "Mon, 14 Feb 2022 17:37:04 +0000")

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Derrick Stolee <derrickstolee@github.com>
>
> The GIT_FORCE_UNTRACKED_CACHE environment variable writes the untracked
> cache more frequently than the core.untrackedCache config variable. This
> is due to how read_directory() handles the creation of an untracked
> cache. The old mechanism required using something like 'git update-index
> --untracked-cache' before the index would actually contain an untracked
> cache. This was noted as a performance problem on macOS in the past, and
> this is a resolution for that issue.

"The old mechanism" meaning "core.untrackedCache does not add a new
one; it only updates an existing one"?  What the "this" that was
noted as a problem on macOS refers to is not quite clear; is it
"writing the untracked cache is a performance problem"?  And is the
last "this", the resolution, "not to add untrackedCache merely
because the configuration variable says we are allowed to use it"?

> The decision to not write the untracked cache without an environment
> variable tracks back to fc9ecbeb9 (dir.c: don't flag the index as dirty
> for changes to the untracked cache, 2018-02-05). The motivation of that
> change is that writing the index is expensive, and if the untracked
> cache is the only thing that needs to be written, then it is more
> expensive than the benefit of the cache. However, this also means that
> the untracked cache never gets populated, so the user who enabled it via
> config does not actually get the extension until running 'git
> update-index --untracked-cache' manually or using the environment
> variable.

OK.  The environment variable was invented solely as a test
mechanism, it seems, but at least in the workflow of Microsoft
folks, once we have spent cycles to prepare the UNTR data, it helps
their future use of the index to spend a few more cycles to write
it out instead of discarding it.

I have to wonder if there are workflows that are sufficiently
different from what Microsoft folks use that the write-out cost of
more frequent updates to the untracked cache outweighs the runtime
performance boost of not having to run around and readdir() for
untracked files.

ad0fb659 (repo-settings: parse core.untrackedCache, 2019-08-13)
explains that an unset core.untrackedCache means "keep", and "true"
means the untracked cache is "automatically added", which this
change does not invalidate, so I guess there is no need to update
anything in the documentation for this change.  In fact, we might
be able to sell this change as a bugfix (i.e. "I set the
configuration to 'true' but it wasn't written out when it should
have been").

> diff --git a/dir.c b/dir.c
> index d91295f2bcd..79a5f6918c8 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -2936,7 +2936,9 @@ int read_directory(struct dir_struct *dir, struct index_state *istate,
>  
>  		if (force_untracked_cache < 0)
>  			force_untracked_cache =
> -				git_env_bool("GIT_FORCE_UNTRACKED_CACHE", 0);
> +				git_env_bool("GIT_FORCE_UNTRACKED_CACHE", -1);
> +		if (force_untracked_cache < 0)
> +			force_untracked_cache = (istate->repo->settings.core_untracked_cache == UNTRACKED_CACHE_WRITE);
>  		if (force_untracked_cache &&
>  			dir->untracked == istate->untracked &&
>  		    (dir->untracked->dir_opened ||
>
> base-commit: b80121027d1247a0754b3cc46897fee75c050b44
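
The precedence the hunk above sets up can be sketched in isolation: the
environment variable wins when it is set, and otherwise the config
setting decides.  This is only a minimal model under stated
assumptions, not the code in dir.c.  The names env_tristate(),
should_force_untracked_cache() and enum uc_setting are hypothetical,
and the boolean parsing is a simplified stand-in for git_env_bool().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the repo setting; not Git's real enum. */
enum uc_setting { UC_KEEP, UC_REMOVE, UC_WRITE };

/*
 * Simplified tri-state lookup (assumption: lowercase spellings only).
 * Returns -1 when the variable is unset, 0 for a false-looking value,
 * and 1 for anything else.
 */
int env_tristate(const char *name)
{
	const char *v = getenv(name);

	if (!v)
		return -1;
	if (!*v || !strcmp(v, "0") || !strcmp(v, "false") ||
	    !strcmp(v, "no") || !strcmp(v, "off"))
		return 0;
	return 1;
}

/*
 * Mirror the two-step fallback in the patch: an explicit environment
 * value (0 or 1) is used as-is; only when it is unset (-1) does the
 * config setting decide, and only "write" forces the cache.
 */
int should_force_untracked_cache(int env_value, enum uc_setting setting)
{
	if (env_value >= 0)
		return env_value;
	return setting == UC_WRITE;
}
```

A caller would combine the two as
should_force_untracked_cache(env_tristate("GIT_FORCE_UNTRACKED_CACHE"), setting).
Note that in this model an explicit false value in the environment
still overrides a "write" setting, which follows from the default
passed to git_env_bool() changing from 0 to -1.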


