From: "René Scharfe" <l.s.r@web.de>
To: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>,
git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, matheus.bernardino@usp.br,
stolee@gmail.com, Derrick Stolee <derrickstolee@github.com>,
Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH 03/13] dir: select directories correctly
Date: Fri, 24 Sep 2021 09:44:29 +0200
Message-ID: <e51adbad-c633-b030-ff01-3cfc260a0d65@web.de>
In-Reply-To: <d47c7a1cf2a3fa8cfdcfc6be1ac800af123e7efc.1629842085.git.gitgitgadget@gmail.com>
On 24.08.21 at 23:54, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
>
> When matching a path against a list of patterns, the ones that require a
> directory match previously did not work when a filename is specified.
> This was fine when all pattern-matching was done within methods such as
> unpack_trees() that check a directory before recursing into the
> contained files. However, other commands will start matching individual
> files against pattern lists without that recursive approach.
>
> We modify path_matches_dir_pattern() to take a strbuf 'path_parent' that
> is used to store the parent directory of 'pathname' between multiple
> pattern matching tests. This is loaded lazily, only on the first pattern
> it finds that has the PATTERN_FLAG_MUSTBEDIR flag.
>
> If we find that a path has a parent directory, we start by checking to
> see if that parent directory matches the pattern. If so, then we do not
> need to query the index for the type (which can be expensive). If we
> find that the parent does not match, then we still must check the type
> from the index for the given pathname.
>
> Note that this does not affect cone mode pattern matching, but instead
> the more general -- and slower -- full pattern set. Thus, this does not
> affect the sparse index.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
> dir.c | 34 ++++++++++++++++++++++++++++++++--
> 1 file changed, 32 insertions(+), 2 deletions(-)
>
> diff --git a/dir.c b/dir.c
> index 652135df896..fe5ee87bb5f 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -1305,10 +1305,38 @@ int match_pathname(const char *pathname, int pathlen,
>
> static int path_matches_dir_pattern(const char *pathname,
> int pathlen,
> + struct strbuf *path_parent,
> int *dtype,
> struct path_pattern *pattern,
> struct index_state *istate)
> {
> + /*
> + * Use 'alloc' as an indicator that the string has not been
> + * initialized, in case the parent is the root directory.
> + */
This means the caller needs to take care to release the strbuf between
calls for files from different directories. Seems a bit fragile. The
current caller is only ever passing in the same pathname before throwing
away the strbuf, so it's doing the right thing.
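To illustrate the fragility, here is a minimal sketch. It assumes a hypothetical stand-in type (mini_buf) instead of git's strbuf, and plain strrchr() instead of find_last_dir_sep(). It only shows the caching pattern, not the real dir.c code: once 'alloc' is non-zero, a later call with a path from a different directory still sees the old parent until the buffer is released.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for git's strbuf; only the fields used here. */
struct mini_buf {
	size_t alloc;
	size_t len;
	char *buf;
};

/*
 * Lazily store the parent directory of 'pathname' in 'parent',
 * using a non-zero 'alloc' as the "already computed" marker.
 */
static const char *cached_parent(struct mini_buf *parent, const char *pathname)
{
	if (!parent->alloc) {
		const char *slash = strrchr(pathname, '/');
		size_t len = slash ? (size_t)(slash - pathname) : 0;

		parent->buf = malloc(len + 1);
		assert(parent->buf);
		memcpy(parent->buf, pathname, len);
		parent->buf[len] = '\0';
		parent->len = len;
		parent->alloc = len + 1;
	}
	/* Not recomputed when 'pathname' changes between calls. */
	return parent->buf;
}

/* Reset the buffer so the next call computes the parent again. */
static void mini_buf_release(struct mini_buf *b)
{
	free(b->buf);
	b->buf = NULL;
	b->len = 0;
	b->alloc = 0;
}
```

With this sketch, calling cached_parent() for "a/b/file" and then for "x/y/file" without a release in between yields "a/b" both times. That is the situation a future caller would have to avoid.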
> + if (!path_parent->alloc) {
> + char *slash;
> + strbuf_addstr(path_parent, pathname);
> + slash = find_last_dir_sep(path_parent->buf);
The caller has pathname, pathlen and basename. If basename is
guaranteed to be a substring of pathname then the parent directory name
length could be calculated without requiring a string copy or scan.
IIUC if pathname and basename can be pointers to different objects then
just checking if basename is between pathname and pathname + pathlen
would already be undefined behavior.
Using pathname, pathlen and dirlen instead would be safer for such
calculations, as it forces basename to be a substring of pathname. Seems
like this would require a lot of function signature changes, though, as
the call tree is quite deep. :|
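A rough sketch of the calculation, assuming a hypothetical helper (parent_dir_len) and assuming that dirlen is the offset of the basename within pathname, so that it includes the trailing slash. None of this is existing dir.c API. It only shows that no copy or scan is needed once the offset is known.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: 'dirlen' is the offset of the basename inside
 * 'pathname' (0 if there is no directory part). Returns the length of
 * the parent directory name without its trailing '/'.
 */
static size_t parent_dir_len(const char *pathname, size_t pathlen,
			     size_t dirlen)
{
	/* The offset must lie inside the path ... */
	assert(dirlen <= pathlen);
	/* ... and, if non-zero, sit right after a directory separator. */
	assert(!dirlen || pathname[dirlen - 1] == '/');

	return dirlen ? dirlen - 1 : 0;
}
```

For "a/b/file" (pathlen 8, dirlen 4) this gives 3, i.e. "a/b"; for a path without a slash it gives 0. The parent is then just the first parent_dir_len() bytes of pathname.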
> +
> + if (slash)
> + *slash = '\0';
This doesn't update path_parent->len...
> + else
> + strbuf_setlen(path_parent, 0);
> + }
> +
> + /*
> + * If the parent directory matches the pattern, then we do not
> + * need to check for dtype.
> + */
> + if (path_parent->len &&
> + match_pathname(path_parent->buf, path_parent->len,
... so this checks if "<dirname>\0<basename>" matches. Intended?
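To make the difference concrete, here is a small sketch using a hypothetical fixed-size stand-in for strbuf (mini_buf) and strrchr() in place of find_last_dir_sep(). chop_nul_only() mirrors what the quoted hunk does. chop_setlen() also adjusts the length, which is the effect strbuf_setlen(path_parent, slash - path_parent->buf) would have.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for strbuf: a length plus a fixed buffer. */
struct mini_buf {
	size_t len;
	char buf[64];
};

static void mini_buf_set(struct mini_buf *b, const char *s)
{
	b->len = strlen(s);
	assert(b->len < sizeof(b->buf));
	memcpy(b->buf, s, b->len + 1);
}

/* As in the quoted hunk: overwrite the slash, leave 'len' untouched. */
static void chop_nul_only(struct mini_buf *b)
{
	char *slash = strrchr(b->buf, '/');

	if (slash) {
		*slash = '\0';
	} else {
		b->len = 0;
		b->buf[0] = '\0';
	}
}

/* Variant that keeps 'len' in sync with the NUL terminator. */
static void chop_setlen(struct mini_buf *b)
{
	char *slash = strrchr(b->buf, '/');

	b->len = slash ? (size_t)(slash - b->buf) : 0;
	b->buf[b->len] = '\0';
}
```

After chop_nul_only() on "dir/file", the length is still 8 while strlen() reports 3, so any length-based consumer sees the basename bytes behind the NUL. After chop_setlen() both agree on 3.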
> + pattern->base,
> + pattern->baselen ? pattern->baselen - 1 : 0,
> + pattern->pattern, pattern->nowildcardlen,
> + pattern->patternlen, pattern->flags))
> + return 1;
> +
> *dtype = resolve_dtype(*dtype, istate, pathname, pathlen);
> if (*dtype != DT_DIR)
> return 0;
> @@ -1331,6 +1359,7 @@ static struct path_pattern *last_matching_pattern_from_list(const char *pathname
> {
> struct path_pattern *res = NULL; /* undecided */
> int i;
> + struct strbuf path_parent = STRBUF_INIT;
>
> if (!pl->nr)
> return NULL; /* undefined */
> @@ -1340,8 +1369,8 @@ static struct path_pattern *last_matching_pattern_from_list(const char *pathname
> const char *exclude = pattern->pattern;
> int prefix = pattern->nowildcardlen;
>
> - if ((pattern->flags & PATTERN_FLAG_MUSTBEDIR) &&
> - !path_matches_dir_pattern(pathname, pathlen,
> + if (pattern->flags & PATTERN_FLAG_MUSTBEDIR &&
> + !path_matches_dir_pattern(pathname, pathlen, &path_parent,
"a & b && c" is equivalent to "(a & b) && c", but removing the
parentheses here serves no apparent purpose and distracts a bit from
the actual change, i.e. adding a parameter.
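For reference, a tiny sketch of the precedence point. The flag value is made up for the example and is not git's PATTERN_FLAG_MUSTBEDIR definition. Binary '&' binds tighter than '&&', so both spellings evaluate the same.

```c
#include <assert.h>

/* Hypothetical flag bit, chosen only for this example. */
#define EXAMPLE_FLAG_MUSTBEDIR 8u

/* Explicit grouping, as in the original code. */
static int with_parens(unsigned flags, int c)
{
	return (flags & EXAMPLE_FLAG_MUSTBEDIR) && c;
}

/* Same expression relying on '&' binding tighter than '&&'. */
static int without_parens(unsigned flags, int c)
{
	return flags & EXAMPLE_FLAG_MUSTBEDIR && c;
}
```

The parentheses are still a useful habit because the neighbouring case "a & b == c" does not group the same way: it parses as "a & (b == c)".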
> dtype, pattern, istate))
> continue;
>
> @@ -1367,6 +1396,7 @@ static struct path_pattern *last_matching_pattern_from_list(const char *pathname
> break;
> }
> }
> + strbuf_release(&path_parent);
> return res;
> }
>
>