git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: "Tao Klerks via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Tao Klerks <tao@klerks.biz>
Subject: Re: [PATCH 3/3] Write index when populating empty untracked cache
Date: Mon, 28 Jun 2021 21:42:37 -0700	[thread overview]
Message-ID: <xmqq35t11dtu.fsf@gitster.g> (raw)
In-Reply-To: 627f1952fd8d4864b6b87f5539a9d9b802c5796b.1624559402.git.gitgitgadget@gmail.com

"Tao Klerks via GitGitGadget" <gitgitgadget@gmail.com> writes:

> Subject: Re: [PATCH 3/3] Write index when populating empty untracked cache

Common to all three patches.  

Look at "git shortlog --no-merges -200 master" and observe the
pattern, i.e. our commit title are <area> prefix, a colon and then a
summary.  Try to make your commits blend in.

> From: Tao Klerks <tao@klerks.biz>
>
> It is expected that an empty/unpopulated untracked
> cache structure can be written to the index - by update-
> index, or by a "git status" call that sees the untracked cache
> should be enabled, but is running with options that make
> it non-applicable in that run.

Would an example be helpful?  What requests the untracked cache to
be created, and what options (or perhaps the lack of what options)
prevent the untracked cache to be "non-applicable"?

> Currently, if that happens, then subsequent "git status"
> calls end up populating the untracked cache, but not
> writing the index (not saving their work) - so the
> performance outcome is almost identical to the cache
> being altogether disabled.
>
> This continues until the index gets written with the cache
> populated, for some *other* reason.

Here (and only here), the word "cache" is used instead of "untracked
cache"---but I suspect that "the cache" here is the same "untracked
cache".  If that is the case, being consistent and spelling it out
would reduce the risk of confusing readers.

> In this change, we detect the "existing cache is empty
> and it looks like we are using it" condition, and queues
> an index write when this happens.

"we detect ... and queues" sounds like a grammo.

In this project, we describe the change being proposed as if we are
giving an order to the codebase to "become like so".  So, perhaps

    Detect the condition where an empty untracked cache exists in
    the index and we collect the list of untracked paths, and queue
    an index write under that condition, so that the collected
    untracked paths can be written out to the untracked cache
    extension in the index.

or something along the line.

> This change depends on previous fixes to t7519 for the
> "ignore .git changes when invalidating UNTR" test case to
> pass - before this fix, the test never actually did anything
> as it was not set up correctly.
>
> Signed-off-by: Tao Klerks <tao@klerks.biz>
> ---
>  dir.c | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/dir.c b/dir.c
> index ebe5ec046e0..a326e40e1c1 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -2703,7 +2703,8 @@ void remove_untracked_cache(struct index_state *istate)
>  
>  static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *dir,
>  						      int base_len,
> -						      const struct pathspec *pathspec)
> +						      const struct pathspec *pathspec,
> +						      struct index_state *istate)
>  {
>  	struct untracked_cache_dir *root;
>  	static int untracked_cache_disabled = -1;
> @@ -2767,8 +2768,15 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d
>  		return NULL;
>  	}
>  
> -	if (!dir->untracked->root)
> +	if (!dir->untracked->root) {
>  		FLEX_ALLOC_STR(dir->untracked->root, name, "");
> +		/*
> +		 * If we've had to initialize the root, then what we had was an
> +		 * empty uninitialized untracked cache structure. We will be
> +		 * populating it now, so we should trigger an index write.
> +		 */
> +		istate->cache_changed |= UNTRACKED_CHANGED;
> +	}

The logic sounds fairly straight-forward.  The fact that we came
this far in the helper function means that we know we want to use
(i.e. untracked cache is not disabled, and various other "we
shouldn't be contaminating the cache with the one-shot information"
cases did not return from the helper) untracked cache, and root
being NULL originally means the untracked cache wasn't populated.

>  	/* Validate $GIT_DIR/info/exclude and core.excludesfile */
>  	root = dir->untracked->root;
> @@ -2838,7 +2846,7 @@ int read_directory(struct dir_struct *dir, struct index_state *istate,
>  		return dir->nr;
>  	}
>  
> -	untracked = validate_untracked_cache(dir, len, pathspec);
> +	untracked = validate_untracked_cache(dir, len, pathspec, istate);
>  	if (!untracked)
>  		/*
>  		 * make sure untracked cache code path is disabled,

  reply	other threads:[~2021-06-29  4:42 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-24 18:29 [PATCH 0/3] Empty untracked cache performance issue Tao Klerks via GitGitGadget
2021-06-24 18:29 ` [PATCH 1/3] Add a second's delay to t7519 for untracked cache Tao Klerks via GitGitGadget
2021-06-29  4:22   ` Junio C Hamano
2021-06-24 18:30 ` [PATCH 2/3] In t7519, populate untracked cache before test Tao Klerks via GitGitGadget
2021-06-24 18:30 ` [PATCH 3/3] Write index when populating empty untracked cache Tao Klerks via GitGitGadget
2021-06-29  4:42   ` Junio C Hamano [this message]
2022-02-24 17:52     ` Tao Klerks
2022-02-24 20:35       ` Junio C Hamano
2022-02-25 17:10 ` [PATCH v2 0/3] Empty untracked cache performance issue Tao Klerks via GitGitGadget
2022-02-25 17:10   ` [PATCH v2 1/3] t7519: avoid file to index mtime race for untracked cache Tao Klerks via GitGitGadget
2022-02-25 19:07     ` Junio C Hamano
2022-02-27 22:12       ` Tao Klerks
2022-02-25 17:10   ` [PATCH v2 2/3] t7519: populate untracked cache before test Tao Klerks via GitGitGadget
2022-02-25 17:10   ` [PATCH v2 3/3] untracked-cache: write index when populating empty untracked cache Tao Klerks via GitGitGadget
2022-02-25 19:12     ` Junio C Hamano
2022-02-27 21:56   ` [PATCH v3 0/3] Empty untracked cache performance issue Tao Klerks via GitGitGadget
2022-02-27 21:56     ` [PATCH v3 1/3] t7519: avoid file to index mtime race for untracked cache Tao Klerks via GitGitGadget
2022-02-27 21:57     ` [PATCH v3 2/3] t7519: populate untracked cache before test Tao Klerks via GitGitGadget
2022-02-27 21:57     ` [PATCH v3 3/3] untracked-cache: write index when populating empty untracked cache Tao Klerks via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqq35t11dtu.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=tao@klerks.biz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).