git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, matheus.bernardino@usp.br,
	stolee@gmail.com, vdye@github.com,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Derrick Stolee" <derrickstolee@github.com>,
	"Derrick Stolee" <dstolee@microsoft.com>
Subject: [PATCH v4 03/13] dir: select directories correctly
Date: Fri, 24 Sep 2021 15:39:04 +0000	[thread overview]
Message-ID: <2ebaf8e68c2c46f700f7523d96e7494ebe3fb8cb.1632497954.git.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.1018.v4.git.1632497954.gitgitgadget@gmail.com>

From: Derrick Stolee <dstolee@microsoft.com>

When matching a path against a list of patterns, the ones that require a
directory match previously did not work when a filename is specified.
This was fine when all pattern-matching was done within methods such as
unpack_trees() that check a directory before recursing into the
contained files. However, other commands will start matching individual
files against pattern lists without that recursive approach.

The last_matching_pattern_from_list() logic performs some checks on the
filetype of a path within the index when the PATTERN_FLAG_MUSTBEDIR flag
is set. This works great when setting SKIP_WORKTREE bits within
unpack_trees(), but doesn't work well when passing an arbitrary path
such as a file within a matching directory.

We extract the logic around determining the file type, but attempt to
avoid checking the filesystem if the parent directory already matches
the sparse-checkout patterns. The new path_matches_dir_pattern() method
includes a 'path_parent' parameter that is used to store the parent
directory of 'pathname' between multiple pattern matching tests. This is
loaded lazily, only on the first pattern it finds that has the
PATTERN_FLAG_MUSTBEDIR flag.

If we find that a path has a parent directory, we start by checking to
see if that parent directory matches the pattern. If so, then we do not
need to query the index for the type (which can be expensive). If we
find that the parent does not match, then we still must check the type
from the index for the given pathname.

Note that this does not affect cone mode pattern matching, but instead
the more general -- and slower -- full pattern set. Thus, this does not
affect the sparse index.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 dir.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 49 insertions(+), 5 deletions(-)

diff --git a/dir.c b/dir.c
index 86afa2eae00..9ea6cfe61cb 100644
--- a/dir.c
+++ b/dir.c
@@ -1303,6 +1303,44 @@ int match_pathname(const char *pathname, int pathlen,
 				 WM_PATHNAME) == 0;
 }
 
+static int path_matches_dir_pattern(const char *pathname,
+				    int pathlen,
+				    struct strbuf **path_parent,
+				    int *dtype,
+				    struct path_pattern *pattern,
+				    struct index_state *istate)
+{
+	if (!*path_parent) {
+		char *slash;
+		CALLOC_ARRAY(*path_parent, 1);
+		strbuf_add(*path_parent, pathname, pathlen);
+		slash = find_last_dir_sep((*path_parent)->buf);
+
+		if (slash)
+			strbuf_setlen(*path_parent, slash - (*path_parent)->buf);
+		else
+			strbuf_setlen(*path_parent, 0);
+	}
+
+	/*
+	 * If the parent directory matches the pattern, then we do not
+	 * need to check for dtype.
+	 */
+	if ((*path_parent)->len &&
+	    match_pathname((*path_parent)->buf, (*path_parent)->len,
+			   pattern->base,
+			   pattern->baselen ? pattern->baselen - 1 : 0,
+			   pattern->pattern, pattern->nowildcardlen,
+			   pattern->patternlen, pattern->flags))
+		return 1;
+
+	*dtype = resolve_dtype(*dtype, istate, pathname, pathlen);
+	if (*dtype != DT_DIR)
+		return 0;
+
+	return 1;
+}
+
 /*
  * Scan the given exclude list in reverse to see whether pathname
  * should be ignored.  The first match (i.e. the last on the list), if
@@ -1318,6 +1356,7 @@ static struct path_pattern *last_matching_pattern_from_list(const char *pathname
 {
 	struct path_pattern *res = NULL; /* undecided */
 	int i;
+	struct strbuf *path_parent = NULL;
 
 	if (!pl->nr)
 		return NULL;	/* undefined */
@@ -1327,11 +1366,10 @@ static struct path_pattern *last_matching_pattern_from_list(const char *pathname
 		const char *exclude = pattern->pattern;
 		int prefix = pattern->nowildcardlen;
 
-		if (pattern->flags & PATTERN_FLAG_MUSTBEDIR) {
-			*dtype = resolve_dtype(*dtype, istate, pathname, pathlen);
-			if (*dtype != DT_DIR)
-				continue;
-		}
+		if (pattern->flags & PATTERN_FLAG_MUSTBEDIR &&
+		    !path_matches_dir_pattern(pathname, pathlen, &path_parent,
+					      dtype, pattern, istate))
+			continue;
 
 		if (pattern->flags & PATTERN_FLAG_NODIR) {
 			if (match_basename(basename,
@@ -1355,6 +1393,12 @@ static struct path_pattern *last_matching_pattern_from_list(const char *pathname
 			break;
 		}
 	}
+
+	if (path_parent) {
+		strbuf_release(path_parent);
+		free(path_parent);
+	}
+
 	return res;
 }
 
-- 
gitgitgadget


  parent reply	other threads:[~2021-09-24 15:39 UTC|newest]

Thread overview: 116+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-24 21:54 [PATCH 00/13] [RFC] Sparse-checkout: modify 'git add', 'git rm', and 'git add' behavior Derrick Stolee via GitGitGadget
2021-08-24 21:54 ` [PATCH 01/13] t1092: behavior for adding sparse files Derrick Stolee via GitGitGadget
2021-08-24 21:54 ` [PATCH 02/13] dir: extract directory-matching logic Derrick Stolee via GitGitGadget
2021-08-24 21:54 ` [PATCH 03/13] dir: select directories correctly Derrick Stolee via GitGitGadget
2021-09-24  7:44   ` René Scharfe
2021-08-24 21:54 ` [PATCH 04/13] dir: fix pattern matching on dirs Derrick Stolee via GitGitGadget
2021-08-24 21:54 ` [PATCH 05/13] add: fail when adding an untracked sparse file Derrick Stolee via GitGitGadget
2021-08-27 21:06   ` Matheus Tavares Bernardino
2021-08-27 22:50     ` Matheus Tavares Bernardino
2021-09-08 17:54     ` Derrick Stolee
2021-08-24 21:54 ` [PATCH 06/13] add: skip paths that are outside sparse-checkout cone Derrick Stolee via GitGitGadget
2021-08-27 21:13   ` Matheus Tavares
2021-09-08 19:46     ` Derrick Stolee
2021-09-08 20:02       ` Derrick Stolee
2021-09-08 21:06     ` Derrick Stolee
2021-08-24 21:54 ` [PATCH 07/13] add: implement the --sparse option Derrick Stolee via GitGitGadget
2021-08-27 21:14   ` Matheus Tavares Bernardino
2021-08-24 21:54 ` [PATCH 08/13] add: prevent adding sparse conflict files Derrick Stolee via GitGitGadget
2021-08-27 21:16   ` Matheus Tavares Bernardino
2021-08-24 21:54 ` [PATCH 09/13] rm: add --sparse option Derrick Stolee via GitGitGadget
2021-08-27 21:17   ` Matheus Tavares Bernardino
2021-09-08 18:04     ` Derrick Stolee
2021-08-24 21:54 ` [PATCH 10/13] rm: skip sparse paths with missing SKIP_WORKTREE Derrick Stolee via GitGitGadget
2021-08-27 21:18   ` Matheus Tavares Bernardino
2021-08-24 21:54 ` [PATCH 11/13] mv: refuse to move sparse paths Derrick Stolee via GitGitGadget
2021-08-27 21:20   ` Matheus Tavares Bernardino
2021-08-27 23:44     ` Matheus Tavares Bernardino
2021-09-08 18:41     ` Derrick Stolee
2021-08-24 21:54 ` [PATCH 12/13] mv: add '--sparse' option to ignore sparse-checkout Derrick Stolee via GitGitGadget
2021-08-28 14:18   ` Matheus Tavares Bernardino
2021-08-24 21:54 ` [PATCH 13/13] advice: update message to suggest '--sparse' Derrick Stolee via GitGitGadget
2021-09-12 13:23 ` [PATCH v2 00/14] Sparse-checkout: modify 'git add', 'git rm', and 'git add' behavior Derrick Stolee via GitGitGadget
2021-09-12 13:23   ` [PATCH v2 01/14] t3705: test that 'sparse_entry' is unstaged Derrick Stolee via GitGitGadget
2021-09-15  5:22     ` Elijah Newren
2021-09-15 16:17       ` Derrick Stolee
2021-09-15 16:32     ` Matheus Tavares
2021-09-15 16:42       ` Derrick Stolee
2021-09-12 13:23   ` [PATCH v2 02/14] t1092: behavior for adding sparse files Derrick Stolee via GitGitGadget
2021-09-12 22:17     ` Ævar Arnfjörð Bjarmason
2021-09-13 15:02       ` Derrick Stolee
2021-09-12 13:23   ` [PATCH v2 03/14] dir: extract directory-matching logic Derrick Stolee via GitGitGadget
2021-09-12 13:23   ` [PATCH v2 04/14] dir: select directories correctly Derrick Stolee via GitGitGadget
2021-09-12 22:21     ` Ævar Arnfjörð Bjarmason
2021-09-15 14:41       ` Derrick Stolee
2021-09-15 14:54     ` Elijah Newren
2021-09-15 16:43       ` Derrick Stolee
2021-09-12 13:23   ` [PATCH v2 05/14] dir: fix pattern matching on dirs Derrick Stolee via GitGitGadget
2021-09-12 13:23   ` [PATCH v2 06/14] add: fail when adding an untracked sparse file Derrick Stolee via GitGitGadget
2021-09-12 13:23   ` [PATCH v2 07/14] add: skip tracked paths outside sparse-checkout cone Derrick Stolee via GitGitGadget
2021-09-12 13:23   ` [PATCH v2 08/14] add: implement the --sparse option Derrick Stolee via GitGitGadget
2021-09-15 16:59     ` Elijah Newren
2021-09-20 15:45       ` Derrick Stolee
2021-09-12 13:23   ` [PATCH v2 09/14] add: update --chmod to skip sparse paths Derrick Stolee via GitGitGadget
2021-09-12 13:23   ` [PATCH v2 10/14] add: update --renormalize " Derrick Stolee via GitGitGadget
2021-09-12 13:23   ` [PATCH v2 11/14] rm: add --sparse option Derrick Stolee via GitGitGadget
2021-09-12 13:23   ` [PATCH v2 12/14] rm: skip sparse paths with missing SKIP_WORKTREE Derrick Stolee via GitGitGadget
2021-09-12 13:23   ` [PATCH v2 13/14] mv: refuse to move sparse paths Derrick Stolee via GitGitGadget
2021-09-12 13:23   ` [PATCH v2 14/14] advice: update message to suggest '--sparse' Derrick Stolee via GitGitGadget
2021-09-12 21:58     ` Ævar Arnfjörð Bjarmason
2021-09-15 16:54       ` Derrick Stolee
2021-09-15 20:18   ` [PATCH v2 00/14] Sparse-checkout: modify 'git add', 'git rm', and 'git add' behavior Elijah Newren
2021-09-20 17:45   ` [PATCH v3 " Derrick Stolee via GitGitGadget
2021-09-20 17:45     ` [PATCH v3 01/14] t3705: test that 'sparse_entry' is unstaged Derrick Stolee via GitGitGadget
2021-09-22 22:52       ` Junio C Hamano
2021-09-20 17:45     ` [PATCH v3 02/14] t1092: behavior for adding sparse files Derrick Stolee via GitGitGadget
2021-09-22 23:06       ` Junio C Hamano
2021-09-23 13:37         ` Derrick Stolee
2021-09-20 17:45     ` [PATCH v3 03/14] dir: extract directory-matching logic Derrick Stolee via GitGitGadget
2021-09-22 23:13       ` Junio C Hamano
2021-09-23 13:39         ` Derrick Stolee
2021-09-23 13:42           ` Derrick Stolee
2021-09-23 18:23             ` Junio C Hamano
2021-09-24 13:29               ` Derrick Stolee
2021-09-20 17:45     ` [PATCH v3 04/14] dir: select directories correctly Derrick Stolee via GitGitGadget
2021-09-20 17:45     ` [PATCH v3 05/14] dir: fix pattern matching on dirs Derrick Stolee via GitGitGadget
2021-09-20 17:45     ` [PATCH v3 06/14] add: fail when adding an untracked sparse file Derrick Stolee via GitGitGadget
2021-09-20 17:45     ` [PATCH v3 07/14] add: skip tracked paths outside sparse-checkout cone Derrick Stolee via GitGitGadget
2021-09-20 17:45     ` [PATCH v3 08/14] add: implement the --sparse option Derrick Stolee via GitGitGadget
2021-09-20 17:45     ` [PATCH v3 09/14] add: update --chmod to skip sparse paths Derrick Stolee via GitGitGadget
2021-09-20 17:45     ` [PATCH v3 10/14] add: update --renormalize " Derrick Stolee via GitGitGadget
2021-09-20 17:45     ` [PATCH v3 11/14] rm: add --sparse option Derrick Stolee via GitGitGadget
2021-09-20 17:45     ` [PATCH v3 12/14] rm: skip sparse paths with missing SKIP_WORKTREE Derrick Stolee via GitGitGadget
2021-09-20 17:45     ` [PATCH v3 13/14] mv: refuse to move sparse paths Derrick Stolee via GitGitGadget
2021-09-20 17:45     ` [PATCH v3 14/14] advice: update message to suggest '--sparse' Derrick Stolee via GitGitGadget
2021-09-24  6:08     ` [PATCH v3 00/14] Sparse-checkout: modify 'git add', 'git rm', and 'git add' behavior Elijah Newren
2021-09-24 15:39     ` [PATCH v4 00/13] Sparse-checkout: modify 'git add', 'git rm', and 'git mv' behavior Derrick Stolee via GitGitGadget
2021-09-24 15:39       ` [PATCH v4 01/13] t3705: test that 'sparse_entry' is unstaged Derrick Stolee via GitGitGadget
2021-09-24 15:39       ` [PATCH v4 02/13] t1092: behavior for adding sparse files Derrick Stolee via GitGitGadget
2021-09-24 15:39       ` Derrick Stolee via GitGitGadget [this message]
2021-09-24 15:39       ` [PATCH v4 04/13] dir: fix pattern matching on dirs Derrick Stolee via GitGitGadget
2021-11-02  0:15         ` Glen Choo
2021-11-02  0:34           ` Junio C Hamano
2021-11-02 13:42             ` Derrick Stolee
2021-11-02 14:50               ` Derrick Stolee
2021-11-02 15:33                 ` Ævar Arnfjörð Bjarmason
2021-11-03 14:40                   ` Derrick Stolee
2021-11-03 17:14                     ` Junio C Hamano
2021-09-24 15:39       ` [PATCH v4 05/13] add: fail when adding an untracked sparse file Derrick Stolee via GitGitGadget
2021-09-24 15:39       ` [PATCH v4 06/13] add: skip tracked paths outside sparse-checkout cone Derrick Stolee via GitGitGadget
2021-09-24 15:39       ` [PATCH v4 07/13] add: implement the --sparse option Derrick Stolee via GitGitGadget
2021-09-24 15:39       ` [PATCH v4 08/13] add: update --chmod to skip sparse paths Derrick Stolee via GitGitGadget
2021-09-24 15:39       ` [PATCH v4 09/13] add: update --renormalize " Derrick Stolee via GitGitGadget
2021-09-24 15:39       ` [PATCH v4 10/13] rm: add --sparse option Derrick Stolee via GitGitGadget
2021-09-24 15:39       ` [PATCH v4 11/13] rm: skip sparse paths with missing SKIP_WORKTREE Derrick Stolee via GitGitGadget
2021-09-24 15:39       ` [PATCH v4 12/13] mv: refuse to move sparse paths Derrick Stolee via GitGitGadget
2021-09-24 15:39       ` [PATCH v4 13/13] advice: update message to suggest '--sparse' Derrick Stolee via GitGitGadget
2021-09-27 15:51       ` [PATCH v4 00/13] Sparse-checkout: modify 'git add', 'git rm', and 'git mv' behavior Elijah Newren
2021-09-27 20:51         ` Junio C Hamano
2021-10-18 21:28   ` [PATCH v2 00/14] Sparse-checkout: modify 'git add', 'git rm', and 'git add' behavior Sean Christopherson
2021-10-19 12:29     ` Derrick Stolee
2021-10-19 16:50       ` Sean Christopherson
2021-10-20 13:28         ` Junio C Hamano
2021-10-20 14:28           ` Sean Christopherson
2021-10-22  2:28     ` [RFC PATCH] add|rm|mv: fix bug that prevent the update of non-sparse Matheus Tavares
2021-10-22  4:03       ` Matheus Tavares
2021-10-25 16:40       ` Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2ebaf8e68c2c46f700f7523d96e7494ebe3fb8cb.1632497954.git.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=avarab@gmail.com \
    --cc=derrickstolee@github.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=matheus.bernardino@usp.br \
    --cc=newren@gmail.com \
    --cc=stolee@gmail.com \
    --cc=vdye@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).