* [PATCH v1 0/2] fsexcludes: Add programmatic way to exclude files
@ 2018-04-10 21:04 Ben Peart
  2018-04-10 21:04 ` [PATCH v1 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic Ben Peart
                   ` (4 more replies)
  0 siblings, 5 replies; 17+ messages in thread
From: Ben Peart @ 2018-04-10 21:04 UTC (permalink / raw)
  To: git; +Cc: pclouds, alexmv, blees, gitster, bmwill, avarab, johannes.schindelin, Ben Peart

In git repos with large working directories an external file system monitor
(like fsmonitor or gvfs) can track what files in the working directory have been
modified.  This information can be used to speed up git operations that scale
based on the size of the working directory so that they become O(# of modified
files) vs O(# of files in the working directory).

The fsmonitor patch series added logic to limit what files git had to stat() to
the set of modified files provided by the fsmonitor hook proc.  It also used the
untracked cache (if enabled) to limit the files/folders git had to scan looking
for new/untracked files.  GVFS is another external file system model that
speeds up git's working directory based operations; it has been using a different
mechanism (programmatically generating an excludes file) to enable git to be
O(# of modified files).

This patch series will introduce a new way to limit git's traversal of the
working directory that does not require the untracked cache (fsmonitor) or using
the excludes feature (GVFS).  It does this by enhancing the existing excludes
logic in dir.c to support a new "File System Excludes" or fsexcludes API that is
better tuned to these programmatic applications.

Base Ref: master
Web-Diff: https://github.com/benpeart/git/commit/2ccbcd6360
Checkout: git fetch https://github.com/benpeart/git fsexcludes-v1 && git checkout 2ccbcd6360

Ben Peart (2):
  fsexcludes: add a programmatic way to exclude files from git's working
    directory traversal logic
  fsmonitor: switch to use new fsexcludes logic and remove unused
    untracked cache based logic

 Makefile                    |   1 +
 dir.c                       |  33 ++++--
 dir.h                       |   2 -
 fsexcludes.c                | 210 ++++++++++++++++++++++++++++++++++++
 fsexcludes.h                |  27 +++++
 fsmonitor.c                 |  21 +---
 fsmonitor.h                 |  10 +-
 t/t7519-status-fsmonitor.sh |  14 +--
 8 files changed, 270 insertions(+), 48 deletions(-)
 create mode 100644 fsexcludes.c
 create mode 100644 fsexcludes.h


base-commit: 0b0cc9f86731f894cff8dd25299a9b38c254569e
-- 
2.17.0.windows.1




* [PATCH v1 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic
  2018-04-10 21:04 [PATCH v1 0/2] fsexcludes: Add programmatic way to exclude files Ben Peart
@ 2018-04-10 21:04 ` Ben Peart
  2018-04-10 22:09   ` Martin Ågren
  2018-04-11  6:58   ` Junio C Hamano
  2018-04-10 21:04 ` [PATCH v1 2/2] fsmonitor: switch to use new fsexcludes logic and remove unused untracked cache based logic Ben Peart
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 17+ messages in thread
From: Ben Peart @ 2018-04-10 21:04 UTC (permalink / raw)
  To: git; +Cc: pclouds, alexmv, blees, gitster, bmwill, avarab, johannes.schindelin, Ben Peart

The File System Excludes module is a new programmatic way to exclude files and
folders from git's traversal of the working directory.  fsexcludes_init() should
be called with a string buffer that contains a NUL separated list of path names
of the files and/or directories that should be included.  Any path not listed
will be excluded. The paths should be relative to the root of the working
directory and be separated by a single NUL.

The excludes logic in dir.c has been updated to honor the results of
fsexcludes_is_excluded_from().  If fsexcludes does not exclude the file, the
normal excludes logic is also checked as it could further reduce the set of
files that should be included.
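
The input format described above can be sketched without any git
dependencies.  This is only an illustration: for_each_nul_entry(),
print_entry() and demo_include_list() are hypothetical names, not part of
the patch.  The loop mirrors the scan in initialize_fsexcludes_hashmap(),
which records an entry only when it sees a NUL, so in this sketch the last
path also needs a terminating NUL to be picked up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback type: called once per path found in the buffer. */
typedef void (*entry_fn)(const char *entry, size_t len, void *cb_data);

/*
 * Walk a NUL-separated buffer and call fn() for each entry.
 * Returns the number of entries seen.  An entry is only reported
 * when its terminating NUL is found inside the first len bytes.
 */
static size_t for_each_nul_entry(const char *buf, size_t len,
				 entry_fn fn, void *cb_data)
{
	const char *entry = buf;
	size_t i, count = 0;

	for (i = 0; i < len; i++) {
		if (buf[i] == '\0') {
			fn(entry, (size_t)(buf + i - entry), cb_data);
			entry = buf + i + 1;
			count++;
		}
	}
	return count;
}

static void print_entry(const char *entry, size_t len, void *cb_data)
{
	(void)cb_data;
	printf("include: %.*s\n", (int)len, entry);
}

/* Two files and one directory, relative to the working directory root. */
static size_t demo_include_list(void)
{
	static const char data[] = "dir1/file1.txt\0dir2/\0file3.txt";
	size_t n;

	/* sizeof() counts the literal's implicit trailing NUL, which ends the last entry. */
	n = for_each_nul_entry(data, sizeof(data), print_entry, NULL);
	assert(n == 3);
	return n;
}
```

Any path that does not appear in such a list would then be treated as
excluded by the lookup side.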

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 Makefile     |   1 +
 fsexcludes.c | 210 +++++++++++++++++++++++++++++++++++++++++++++++++++
 fsexcludes.h |  27 +++++++
 3 files changed, 238 insertions(+)
 create mode 100644 fsexcludes.c
 create mode 100644 fsexcludes.h

diff --git a/Makefile b/Makefile
index 96f6138f63..c102d2f75a 100644
--- a/Makefile
+++ b/Makefile
@@ -819,6 +819,7 @@ LIB_OBJS += exec_cmd.o
 LIB_OBJS += fetch-object.o
 LIB_OBJS += fetch-pack.o
 LIB_OBJS += fsck.o
+LIB_OBJS += fsexcludes.o
 LIB_OBJS += fsmonitor.o
 LIB_OBJS += gettext.o
 LIB_OBJS += gpg-interface.o
diff --git a/fsexcludes.c b/fsexcludes.c
new file mode 100644
index 0000000000..07bfe376a0
--- /dev/null
+++ b/fsexcludes.c
@@ -0,0 +1,210 @@
+#include "cache.h"
+#include "fsexcludes.h"
+#include "hashmap.h"
+#include "strbuf.h"
+
+static int fsexcludes_initialized = 0;
+static struct strbuf fsexcludes_data = STRBUF_INIT;
+static struct hashmap fsexcludes_hashmap;
+static struct hashmap parent_directory_hashmap;
+
+struct fsexcludes {
+	struct hashmap_entry ent; /* must be the first member! */
+	const char *pattern;
+	int patternlen;
+};
+
+static unsigned int(*fsexcludeshash)(const void *buf, size_t len);
+static int(*fsexcludescmp)(const char *a, const char *b, size_t len);
+
+static int fsexcludes_hashmap_cmp(const void *unused_cmp_data,
+	const void *a, const void *b, const void *key)
+{
+	const struct fsexcludes *fse1 = a;
+	const struct fsexcludes *fse2 = b;
+
+	return fsexcludescmp(fse1->pattern, fse2->pattern, fse1->patternlen);
+}
+
+static int check_fsexcludes_hashmap(struct hashmap *map, const char *pattern, int patternlen)
+{
+	struct strbuf sb = STRBUF_INIT;
+	struct fsexcludes fse;
+	char *slash;
+
+	/* Check straight mapping */
+	strbuf_reset(&sb);
+	strbuf_add(&sb, pattern, patternlen);
+	fse.pattern = sb.buf;
+	fse.patternlen = sb.len;
+	hashmap_entry_init(&fse, fsexcludeshash(fse.pattern, fse.patternlen));
+	if (hashmap_get(map, &fse, NULL)) {
+		strbuf_release(&sb);
+		return 0;
+	}
+
+	/*
+	 * Check to see if it matches a directory or any path
+	 * underneath it.  In other words, 'a/b/foo.txt' will match
+	 * '/', 'a/', and 'a/b/'.
+	 */
+	slash = strchr(sb.buf, '/');
+	while (slash) {
+		fse.pattern = sb.buf;
+		fse.patternlen = slash - sb.buf + 1;
+		hashmap_entry_init(&fse, fsexcludeshash(fse.pattern, fse.patternlen));
+		if (hashmap_get(map, &fse, NULL)) {
+			strbuf_release(&sb);
+			return 0;
+		}
+		slash = strchr(slash + 1, '/');
+	}
+
+	strbuf_release(&sb);
+	return 1;
+}
+
+static void fsexcludes_hashmap_add(struct hashmap *map, const char *pattern, const int patternlen)
+{
+	struct fsexcludes *fse;
+
+	fse = xmalloc(sizeof(struct fsexcludes));
+	fse->pattern = pattern;
+	fse->patternlen = patternlen;
+	hashmap_entry_init(fse, fsexcludeshash(fse->pattern, fse->patternlen));
+	hashmap_add(map, fse);
+}
+
+static void initialize_fsexcludes_hashmap(struct hashmap *map, struct strbuf *fsexcludes_data)
+{
+	char *buf, *entry;
+	size_t len;
+	int i;
+
+	/*
+	 * Build a hashmap of the fsexcludes data we can use to look
+	 * for cache entry matches quickly
+	 */
+	fsexcludeshash = ignore_case ? memihash : memhash;
+	fsexcludescmp = ignore_case ? strncasecmp : strncmp;
+	hashmap_init(map, fsexcludes_hashmap_cmp, NULL, 0);
+
+	entry = buf = fsexcludes_data->buf;
+	len = fsexcludes_data->len;
+	for (i = 0; i < len; i++) {
+		if (buf[i] == '\0') {
+			fsexcludes_hashmap_add(map, entry, buf + i - entry);
+			entry = buf + i + 1;
+		}
+	}
+}
+
+static void parent_directory_hashmap_add(struct hashmap *map, const char *pattern, const int patternlen)
+{
+	char *slash;
+	struct fsexcludes *fse;
+
+	/*
+	 * Add any directories leading up to the file as the excludes logic
+	 * needs to match directories leading up to the files as well. Detect
+	 * and prevent unnecessary duplicate entries which will be common.
+	 */
+	if (patternlen > 1) {
+		slash = strchr(pattern + 1, '/');
+		while (slash) {
+			fse = xmalloc(sizeof(struct fsexcludes));
+			fse->pattern = pattern;
+			fse->patternlen = slash - pattern + 1;
+			hashmap_entry_init(fse, fsexcludeshash(fse->pattern, fse->patternlen));
+			if (hashmap_get(map, fse, NULL))
+				free(fse);
+			else
+				hashmap_add(map, fse);
+			slash = strchr(slash + 1, '/');
+		}
+	}
+}
+
+static void initialize_parent_directory_hashmap(struct hashmap *map, struct strbuf *vfs_data)
+{
+	char *buf, *entry;
+	size_t len;
+	int i;
+
+	/*
+	 * Build a hashmap of the parent directories contained in the virtual
+	 * file system data we can use to look for matches quickly
+	 */
+	fsexcludeshash = ignore_case ? memihash : memhash;
+	fsexcludescmp = ignore_case ? strncasecmp : strncmp;
+	hashmap_init(map, fsexcludes_hashmap_cmp, NULL, 0);
+
+	entry = buf = vfs_data->buf;
+	len = vfs_data->len;
+	for (i = 0; i < len; i++) {
+		if (buf[i] == '\0') {
+			parent_directory_hashmap_add(map, entry, buf + i - entry);
+			entry = buf + i + 1;
+		}
+	}
+}
+
+static int check_directory_hashmap(struct hashmap *map, const char *pathname, int pathlen)
+{
+	struct strbuf sb = STRBUF_INIT;
+	struct fsexcludes fse;
+
+	/* Check for directory */
+	strbuf_reset(&sb);
+	strbuf_add(&sb, pathname, pathlen);
+	strbuf_addch(&sb, '/');
+	fse.pattern = sb.buf;
+	fse.patternlen = sb.len;
+	hashmap_entry_init(&fse, fsexcludeshash(fse.pattern, fse.patternlen));
+	if (hashmap_get(map, &fse, NULL)) {
+		strbuf_release(&sb);
+		return 0;
+	}
+
+	strbuf_release(&sb);
+	return 1;
+}
+
+/*
+ * Return 1 for exclude, 0 for include and -1 for undecided.
+ */
+int fsexcludes_is_excluded_from(struct index_state *istate,
+	const char *pathname, int pathlen, int dtype)
+{
+	if (!fsexcludes_initialized)
+		return -1;
+
+	if (dtype == DT_REG) {
+		/* lazily init the hashmap */
+		if (!fsexcludes_hashmap.cmpfn_data)
+			initialize_fsexcludes_hashmap(&fsexcludes_hashmap, &fsexcludes_data);
+
+		return check_fsexcludes_hashmap(&fsexcludes_hashmap, pathname, pathlen);
+	}
+
+	if (dtype == DT_DIR || dtype == DT_LNK) {
+		/* lazily init the hashmap */
+		if (!parent_directory_hashmap.cmpfn_data)
+			initialize_parent_directory_hashmap(&parent_directory_hashmap, &fsexcludes_data);
+
+		return check_directory_hashmap(&parent_directory_hashmap, pathname, pathlen);
+	}
+
+	return -1;
+}
+
+void fsexcludes_init(struct strbuf *sb) {
+	fsexcludes_initialized = 1;
+	fsexcludes_data = *sb;
+}
+
+void fsexcludes_free() {
+	strbuf_release(&fsexcludes_data);
+	hashmap_free(&fsexcludes_hashmap, 1);
+	hashmap_free(&parent_directory_hashmap, 1);
+}
diff --git a/fsexcludes.h b/fsexcludes.h
new file mode 100644
index 0000000000..1c4101343c
--- /dev/null
+++ b/fsexcludes.h
@@ -0,0 +1,27 @@
+#ifndef FSEXCLUDES_H
+#define FSEXCLUDES_H
+
+/*
+ * The file system excludes functions provide a way to programmatically limit
+ * where git will scan for untracked files.  This is used to speed up the
+ * scan by avoiding scanning parts of the work directory that do not have
+ * any new files.
+ *
+ */
+
+/*
+ * sb should contain a NUL separated list of path names of the files
+ * and/or directories that should be checked.  Any path not listed will
+ * be excluded from the scan.
+ */
+void fsexcludes_init(struct strbuf *sb);
+void fsexcludes_free();
+
+/*
+ * Return 1 for exclude, 0 for include and -1 for undecided.
+ */
+int fsexcludes_is_excluded_from(struct index_state *istate,
+	const char *pathname, int pathlen, int dtype_p);
+
+
+#endif
-- 
2.17.0.windows.1



* [PATCH v1 2/2] fsmonitor: switch to use new fsexcludes logic and remove unused untracked cache based logic
  2018-04-10 21:04 [PATCH v1 0/2] fsexcludes: Add programmatic way to exclude files Ben Peart
  2018-04-10 21:04 ` [PATCH v1 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic Ben Peart
@ 2018-04-10 21:04 ` Ben Peart
  2018-04-11 20:01 ` [PATCH v2 0/2] fsexcludes: Add programmatic way to exclude files Ben Peart
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 17+ messages in thread
From: Ben Peart @ 2018-04-10 21:04 UTC (permalink / raw)
  To: git; +Cc: pclouds, alexmv, blees, gitster, bmwill, avarab, johannes.schindelin, Ben Peart

Update fsmonitor to utilize the new fsexcludes based logic for excluding paths
that do not need to be scanned for new or modified files.  Remove the old logic
in dir.c that utilized the untracked cache (if enabled) to accomplish the same
goal.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 dir.c                       | 33 ++++++++++++++++++++++++---------
 dir.h                       |  2 --
 fsmonitor.c                 | 21 ++-------------------
 fsmonitor.h                 | 10 +++-------
 t/t7519-status-fsmonitor.sh | 14 +++-----------
 5 files changed, 32 insertions(+), 48 deletions(-)

diff --git a/dir.c b/dir.c
index 63a917be45..28c2c83f76 100644
--- a/dir.c
+++ b/dir.c
@@ -18,7 +18,7 @@
 #include "utf8.h"
 #include "varint.h"
 #include "ewah/ewok.h"
-#include "fsmonitor.h"
+#include "fsexcludes.h"
 
 /*
  * Tells read_directory_recursive how a file or directory should be treated.
@@ -1102,6 +1102,12 @@ int is_excluded_from_list(const char *pathname,
 			  struct exclude_list *el, struct index_state *istate)
 {
 	struct exclude *exclude;
+
+	if (*dtype == DT_UNKNOWN)
+		*dtype = get_dtype(NULL, istate, pathname, pathlen);
+	if (fsexcludes_is_excluded_from(istate, pathname, pathlen, *dtype) > 0)
+		return 1;
+
 	exclude = last_exclude_matching_from_list(pathname, pathlen, basename,
 						  dtype, el, istate);
 	if (exclude)
@@ -1317,8 +1323,15 @@ struct exclude *last_exclude_matching(struct dir_struct *dir,
 int is_excluded(struct dir_struct *dir, struct index_state *istate,
 		const char *pathname, int *dtype_p)
 {
-	struct exclude *exclude =
-		last_exclude_matching(dir, istate, pathname, dtype_p);
+	struct exclude *exclude;
+	int pathlen = strlen(pathname);
+
+	if (*dtype_p == DT_UNKNOWN)
+		*dtype_p = get_dtype(NULL, istate, pathname, pathlen);
+	if (fsexcludes_is_excluded_from(istate, pathname, pathlen, *dtype_p) > 0)
+		return 1;
+
+	exclude = last_exclude_matching(dir, istate, pathname, dtype_p);
 	if (exclude)
 		return exclude->flags & EXC_FLAG_NEGATIVE ? 0 : 1;
 	return 0;
@@ -1671,6 +1684,9 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
 	if (dtype != DT_DIR && has_path_in_index)
 		return path_none;
 
+	if (fsexcludes_is_excluded_from(istate, path->buf, path->len, dtype) > 0)
+		return path_excluded;
+
 	/*
 	 * When we are looking at a directory P in the working tree,
 	 * there are three cases:
@@ -1810,12 +1826,9 @@ static int valid_cached_dir(struct dir_struct *dir,
 	if (!untracked)
 		return 0;
 
-	/*
-	 * With fsmonitor, we can trust the untracked cache's valid field.
-	 */
-	refresh_fsmonitor(istate);
-	if (!(dir->untracked->use_fsmonitor && untracked->valid)) {
-		if (lstat(path->len ? path->buf : ".", &st)) {
+	if (!untracked->valid) {
+		if (stat(path->len ? path->buf : ".", &st)) {
+			invalidate_directory(dir->untracked, untracked);
 			memset(&untracked->stat_data, 0, sizeof(untracked->stat_data));
 			return 0;
 		}
@@ -2011,6 +2024,8 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 		/* add the path to the appropriate result list */
 		switch (state) {
 		case path_excluded:
+			if (fsexcludes_is_excluded_from(istate, path.buf, path.len, DTYPE(cdir.de)) > 0)
+				break;
 			if (dir->flags & DIR_SHOW_IGNORED)
 				dir_add_name(dir, istate, path.buf, path.len);
 			else if ((dir->flags & DIR_SHOW_IGNORED_TOO) ||
diff --git a/dir.h b/dir.h
index b0758b82a2..e67ccfbb29 100644
--- a/dir.h
+++ b/dir.h
@@ -139,8 +139,6 @@ struct untracked_cache {
 	int gitignore_invalidated;
 	int dir_invalidated;
 	int dir_opened;
-	/* fsmonitor invalidation data */
-	unsigned int use_fsmonitor : 1;
 };
 
 struct dir_struct {
diff --git a/fsmonitor.c b/fsmonitor.c
index 6d7bcd5d0e..dd67eef851 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -2,6 +2,7 @@
 #include "config.h"
 #include "dir.h"
 #include "ewah/ewok.h"
+#include "fsexcludes.h"
 #include "fsmonitor.h"
 #include "run-command.h"
 #include "strbuf.h"
@@ -125,12 +126,7 @@ static void fsmonitor_refresh_callback(struct index_state *istate, const char *n
 		ce->ce_flags &= ~CE_FSMONITOR_VALID;
 	}
 
-	/*
-	 * Mark the untracked cache dirty even if it wasn't found in the index
-	 * as it could be a new untracked file.
-	 */
 	trace_printf_key(&trace_fsmonitor, "fsmonitor_refresh_callback '%s'", name);
-	untracked_cache_invalidate_path(istate, name, 0);
 }
 
 void refresh_fsmonitor(struct index_state *istate)
@@ -184,11 +180,8 @@ void refresh_fsmonitor(struct index_state *istate)
 		/* Mark all entries invalid */
 		for (i = 0; i < istate->cache_nr; i++)
 			istate->cache[i]->ce_flags &= ~CE_FSMONITOR_VALID;
-
-		if (istate->untracked)
-			istate->untracked->use_fsmonitor = 0;
 	}
-	strbuf_release(&query_result);
+	fsexcludes_init(&query_result);
 
 	/* Now that we've updated istate, save the last_update time */
 	istate->fsmonitor_last_update = last_update;
@@ -207,12 +200,6 @@ void add_fsmonitor(struct index_state *istate)
 		for (i = 0; i < istate->cache_nr; i++)
 			istate->cache[i]->ce_flags &= ~CE_FSMONITOR_VALID;
 
-		/* reset the untracked cache */
-		if (istate->untracked) {
-			add_untracked_cache(istate);
-			istate->untracked->use_fsmonitor = 1;
-		}
-
 		/* Update the fsmonitor state */
 		refresh_fsmonitor(istate);
 	}
@@ -241,10 +228,6 @@ void tweak_fsmonitor(struct index_state *istate)
 
 			/* Mark all previously saved entries as dirty */
 			ewah_each_bit(istate->fsmonitor_dirty, fsmonitor_ewah_callback, istate);
-
-			/* Now mark the untracked cache for fsmonitor usage */
-			if (istate->untracked)
-				istate->untracked->use_fsmonitor = 1;
 		}
 
 		ewah_free(istate->fsmonitor_dirty);
diff --git a/fsmonitor.h b/fsmonitor.h
index 65f3743636..f7adfc1f7c 100644
--- a/fsmonitor.h
+++ b/fsmonitor.h
@@ -35,8 +35,7 @@ extern void tweak_fsmonitor(struct index_state *istate);
 
 /*
  * Run the configured fsmonitor integration script and clear the
- * CE_FSMONITOR_VALID bit for any files returned as dirty.  Also invalidate
- * any corresponding untracked cache directory structures. Optimized to only
+ * CE_FSMONITOR_VALID bit for any files returned as dirty. Optimized to only
  * run the first time it is called.
  */
 extern void refresh_fsmonitor(struct index_state *istate);
@@ -55,17 +54,14 @@ static inline void mark_fsmonitor_valid(struct cache_entry *ce)
 }
 
 /*
- * Clear the given cache entry's CE_FSMONITOR_VALID bit and invalidate
- * any corresponding untracked cache directory structures. This should
+ * Clear the given cache entry's CE_FSMONITOR_VALID bit. This should
  * be called any time git creates or modifies a file that should
- * trigger an lstat() or invalidate the untracked cache for the
- * corresponding directory
+ * trigger an lstat() for the corresponding directory
  */
 static inline void mark_fsmonitor_invalid(struct index_state *istate, struct cache_entry *ce)
 {
 	if (core_fsmonitor) {
 		ce->ce_flags &= ~CE_FSMONITOR_VALID;
-		untracked_cache_invalidate_path(istate, ce->name, 1);
 		trace_printf_key(&trace_fsmonitor, "mark_fsmonitor_invalid '%s'", ce->name);
 	}
 }
diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh
index 756beb0d8e..d6a1da5a0a 100755
--- a/t/t7519-status-fsmonitor.sh
+++ b/t/t7519-status-fsmonitor.sh
@@ -225,8 +225,7 @@ test_expect_success '*only* files returned by the integration script get flagged
 # Ensure commands that call refresh_index() to move the index back in time
 # properly invalidate the fsmonitor cache
 test_expect_success 'refresh_index() invalidates fsmonitor cache' '
-	write_script .git/hooks/fsmonitor-test<<-\EOF &&
-	EOF
+	write_integration_script &&
 	clean_repo &&
 	dirty_repo &&
 	git add . &&
@@ -275,7 +274,7 @@ do
 		'
 
 		# Make sure it's actually skipping the check for modified and untracked
-		# (if enabled) files unless it is told about them.
+		# files unless it is told about them.
 		test_expect_success "status doesn't detect unreported modifications" '
 			write_script .git/hooks/fsmonitor-test<<-\EOF &&
 			:>marker
@@ -288,14 +287,7 @@ do
 			git status >actual &&
 			test_path_is_file marker &&
 			test_i18ngrep ! "Changes not staged for commit:" actual &&
-			if test $uc_val = true
-			then
-				test_i18ngrep ! "Untracked files:" actual
-			fi &&
-			if test $uc_val = false
-			then
-				test_i18ngrep "Untracked files:" actual
-			fi &&
+			test_i18ngrep ! "Untracked files:" actual &&
 			rm -f marker
 		'
 	done
-- 
2.17.0.windows.1



* Re: [PATCH v1 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic
  2018-04-10 21:04 ` [PATCH v1 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic Ben Peart
@ 2018-04-10 22:09   ` Martin Ågren
  2018-04-11 19:56     ` Ben Peart
  2018-04-11  6:58   ` Junio C Hamano
  1 sibling, 1 reply; 17+ messages in thread
From: Martin Ågren @ 2018-04-10 22:09 UTC (permalink / raw)
  To: Ben Peart; +Cc: git, pclouds, alexmv, blees, gitster, bmwill, avarab, johannes.schindelin

On 10 April 2018 at 23:04, Ben Peart <Ben.Peart@microsoft.com> wrote:
> The File System Excludes module is a new programmatic way to exclude files and
> folders from git's traversal of the working directory.  fsexcludes_init() should
> be called with a string buffer that contains a NUL separated list of path names
> of the files and/or directories that should be included.  Any path not listed
> will be excluded. The paths should be relative to the root of the working
> directory and be separated by a single NUL.
>
> The excludes logic in dir.c has been updated to honor the results of
> fsexcludes_is_excluded_from().  If fsexcludes does not exclude the file, the
> normal excludes logic is also checked as it could further reduce the set of
> files that should be included.

Here you mention a change in dir.c...

>  Makefile     |   1 +
>  fsexcludes.c | 210 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  fsexcludes.h |  27 +++++++
>  3 files changed, 238 insertions(+)

... but this patch does not seem to touch dir.c at all.

> +static int check_fsexcludes_hashmap(struct hashmap *map, const char *pattern, int patternlen)
> +{
> +       struct strbuf sb = STRBUF_INIT;
> +       struct fsexcludes fse;
> +       char *slash;
> +
> +       /* Check straight mapping */
> +       strbuf_reset(&sb);

You could drop this strbuf_reset(). Or did you intend to use a static
struct strbuf?

> +       /*
> +        * Check to see if it matches a directory or any path
> +        * underneath it.  In other words, 'a/b/foo.txt' will match
> +        * '/', 'a/', and 'a/b/'.
> +        */
> +       slash = strchr(sb.buf, '/');
> +       while (slash) {
> +               fse.pattern = sb.buf;
> +               fse.patternlen = slash - sb.buf + 1;
> +               hashmap_entry_init(&fse, fsexcludeshash(fse.pattern, fse.patternlen));
> +               if (hashmap_get(map, &fse, NULL)) {
> +                       strbuf_release(&sb);
> +                       return 0;
> +               }
> +               slash = strchr(slash + 1, '/');
> +       }

Maybe a for-loop would make this slightly more obvious:

for (slash = strchr(sb.buf, '/'); slash; slash = strchr(slash + 1, '/'))

On second thought, maybe not.
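
For reference, here is a standalone sketch of which prefixes that loop
would probe, written in the for-loop form.  count_dir_prefixes() and
demo_prefixes() are hypothetical names for illustration only.  As far as I
can tell, for 'a/b/foo.txt' it probes 'a/' and 'a/b/'; a bare '/' would
only be probed if the path itself started with a slash.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Print and count every directory prefix of path, i.e. every
 * leading substring that ends in '/'.
 */
static int count_dir_prefixes(const char *path)
{
	const char *slash;
	int n = 0;

	for (slash = strchr(path, '/'); slash; slash = strchr(slash + 1, '/')) {
		printf("probe: %.*s\n", (int)(slash - path + 1), path);
		n++;
	}
	return n;
}

static void demo_prefixes(void)
{
	assert(count_dir_prefixes("a/b/foo.txt") == 2); /* 'a/' and 'a/b/' */
	assert(count_dir_prefixes("foo.txt") == 0);     /* no directory part */
}
```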

> +       entry = buf = fsexcludes_data->buf;
> +       len = fsexcludes_data->len;
> +       for (i = 0; i < len; i++) {
> +               if (buf[i] == '\0') {
> +                       fsexcludes_hashmap_add(map, entry, buf + i - entry);
> +                       entry = buf + i + 1;
> +               }
> +       }
> +}

Very minor: I would have found "buf - entry + i" clearer here and later,
but I'm sure you'll find someone of the opposing opinion (e.g.,
yourself). ;-)

> +static int check_directory_hashmap(struct hashmap *map, const char *pathname, int pathlen)
> +{
> +       struct strbuf sb = STRBUF_INIT;
> +       struct fsexcludes fse;
> +
> +       /* Check for directory */
> +       strbuf_reset(&sb);

Same comment as above about this spurious reset.

> +       if (hashmap_get(map, &fse, NULL)) {
> +               strbuf_release(&sb);
> +               return 0;
> +       }
> +
> +       strbuf_release(&sb);
> +       return 1;
> +}
> +
> +/*
> + * Return 1 for exclude, 0 for include and -1 for undecided.
> + */
> +int fsexcludes_is_excluded_from(struct index_state *istate,
> +       const char *pathname, int pathlen, int dtype)
> +{

Will we at some point regret not being able to "return negative on
error"? I guess that would be "-2" or "negative other than -1".

> +void fsexcludes_init(struct strbuf *sb) {
> +       fsexcludes_initialized = 1;
> +       fsexcludes_data = *sb;
> +}

Grabbing the strbuf's members looks a bit odd. Is this
performance-sensitive enough that you do not want to make a copy? If a
caller releases its strbuf, which would normally be a good thing to do,
we may be in big trouble later. (Not only may .buf be stale, .len may
indicate we actually have something to read.)

I can understand that you do not want to pass a pointer+len, and that it
is not enough to pass sb.buf, since the string may contain nuls.

Maybe detach the original strbuf? That way, if a caller releases its
buffer, that is a no-op. A caller which goes on to use its buffer should
fail quickly and obviously. Right now, an incorrect caller would
probably fail more subtly and less reproducibly.

In any case, maybe document this in the .h-file?
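
To make the detach idea concrete, here is a minimal sketch that uses a
made-up 'struct buf' in place of git's struct strbuf.  store_take(),
buf_release() and demo_transfer() are all hypothetical; the real code
would presumably go through strbuf_detach() instead.  The point is only
that after the transfer the caller's buffer is empty, so a later release
by the caller is harmless.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct strbuf. */
struct buf {
	char *data;
	size_t len;
};

static struct buf stored; /* plays the role of fsexcludes_data */

/* Take ownership: move pointer and length, then empty the caller's buf. */
static void store_take(struct buf *src)
{
	free(stored.data);
	stored = *src;
	src->data = NULL;
	src->len = 0;
}

/* Safe on an emptied buf because free(NULL) is a no-op. */
static void buf_release(struct buf *b)
{
	free(b->data);
	b->data = NULL;
	b->len = 0;
}

static int demo_transfer(void)
{
	struct buf caller;
	int ok;

	caller.len = 6;
	caller.data = malloc(caller.len);
	if (!caller.data)
		return 0;
	memcpy(caller.data, "a/\0b/\0", caller.len);

	store_take(&caller);
	buf_release(&caller); /* caller owns nothing any more, so this frees nothing */

	ok = caller.data == NULL && stored.len == 6 &&
	     memcmp(stored.data, "a/\0b/\0", 6) == 0;
	assert(ok);
	buf_release(&stored);
	return ok;
}
```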

Martin


* Re: [PATCH v1 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic
  2018-04-10 21:04 ` [PATCH v1 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic Ben Peart
  2018-04-10 22:09   ` Martin Ågren
@ 2018-04-11  6:58   ` Junio C Hamano
  1 sibling, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2018-04-11  6:58 UTC (permalink / raw)
  To: Ben Peart; +Cc: git, pclouds, alexmv, blees, bmwill, avarab, johannes.schindelin

Ben Peart <Ben.Peart@microsoft.com> writes:

> +void fsexcludes_free() {

Write this line like so:

        void fsexcludes_free(void)
        {

> +void fsexcludes_free();

void fsexcludes_free(void);


* Re: [PATCH v1 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic
  2018-04-10 22:09   ` Martin Ågren
@ 2018-04-11 19:56     ` Ben Peart
  0 siblings, 0 replies; 17+ messages in thread
From: Ben Peart @ 2018-04-11 19:56 UTC (permalink / raw)
  To: Martin Ågren, Ben Peart; +Cc: git, pclouds, alexmv, blees, gitster, bmwill, avarab, johannes.schindelin



On 4/10/2018 6:09 PM, Martin Ågren wrote:
> On 10 April 2018 at 23:04, Ben Peart <Ben.Peart@microsoft.com> wrote:
>> The File System Excludes module is a new programmatic way to exclude files and
>> folders from git's traversal of the working directory.  fsexcludes_init() should
>> be called with a string buffer that contains a NUL separated list of path names
>> of the files and/or directories that should be included.  Any path not listed
>> will be excluded. The paths should be relative to the root of the working
>> directory and be separated by a single NUL.
>>
>> The excludes logic in dir.c has been updated to honor the results of
>> fsexcludes_is_excluded_from().  If fsexcludes does not exclude the file, the
>> normal excludes logic is also checked as it could further reduce the set of
>> files that should be included.
> 
> Here you mention a change in dir.c...
> 
>>   Makefile     |   1 +
>>   fsexcludes.c | 210 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>   fsexcludes.h |  27 +++++++
>>   3 files changed, 238 insertions(+)
> 
> ... but this patch does not seem to touch dir.c at all.
> 

Oops! Fixed in V2.

>> +static int check_fsexcludes_hashmap(struct hashmap *map, const char *pattern, int patternlen)
>> +{
>> +       struct strbuf sb = STRBUF_INIT;
>> +       struct fsexcludes fse;
>> +       char *slash;
>> +
>> +       /* Check straight mapping */
>> +       strbuf_reset(&sb);
> 
> You could drop this strbuf_reset(). Or did you intend to use a static
> struct strbuf?
> 

Good point, fixed in V2.

>> +       /*
>> +        * Check to see if it matches a directory or any path
>> +        * underneath it.  In other words, 'a/b/foo.txt' will match
>> +        * '/', 'a/', and 'a/b/'.
>> +        */
>> +       slash = strchr(sb.buf, '/');
>> +       while (slash) {
>> +               fse.pattern = sb.buf;
>> +               fse.patternlen = slash - sb.buf + 1;
>> +               hashmap_entry_init(&fse, fsexcludeshash(fse.pattern, fse.patternlen));
>> +               if (hashmap_get(map, &fse, NULL)) {
>> +                       strbuf_release(&sb);
>> +                       return 0;
>> +               }
>> +               slash = strchr(slash + 1, '/');
>> +       }
> 
> Maybe a for-loop would make this slightly more obvious:
> 
> for (slash = strchr(sb.buf, '/'); slash; slash = strchr(slash + 1, '/'))
> 
> On second thought, maybe not.
> 
>> +       entry = buf = fsexcludes_data->buf;
>> +       len = fsexcludes_data->len;
>> +       for (i = 0; i < len; i++) {
>> +               if (buf[i] == '\0') {
>> +                       fsexcludes_hashmap_add(map, entry, buf + i - entry);
>> +                       entry = buf + i + 1;
>> +               }
>> +       }
>> +}
> 
> Very minor: I would have found "buf - entry + i" clearer here and later,
> but I'm sure you'll find someone of the opposing opinion (e.g.,
> yourself). ;-)
> 
>> +static int check_directory_hashmap(struct hashmap *map, const char *pathname, int pathlen)
>> +{
>> +       struct strbuf sb = STRBUF_INIT;
>> +       struct fsexcludes fse;
>> +
>> +       /* Check for directory */
>> +       strbuf_reset(&sb);
> 
> Same comment as above about this spurious reset.

Good point, fixed in V2.

> 
>> +       if (hashmap_get(map, &fse, NULL)) {
>> +               strbuf_release(&sb);
>> +               return 0;
>> +       }
>> +
>> +       strbuf_release(&sb);
>> +       return 1;
>> +}
>> +
>> +/*
>> + * Return 1 for exclude, 0 for include and -1 for undecided.
>> + */
>> +int fsexcludes_is_excluded_from(struct index_state *istate,
>> +       const char *pathname, int pathlen, int dtype)
>> +{
> 
> Will we at some point regret not being able to "return negative on
> error"? I guess that would be "-2" or "negative other than -1".
> 

This function is modeled after the other is_excluded_from* functions in 
dir.c so that the return value can be handled the same way.  I don't 
anticipate any need for change but you're right, we could return some 
other "negative other than -1" if it was ever needed.
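
As a rough sketch of how the callers treat the result (the names below
are hypothetical and this is not the dir.c code): only a positive value
short-circuits to "excluded", while 0 and -1 both fall through to the
normal excludes check.  Under that assumption, any future negative code
such as -2 would also fall through unless callers were updated.

```c
#include <assert.h>

/* Illustrative names for the three documented return values. */
enum { FSE_UNDECIDED = -1, FSE_INCLUDED = 0, FSE_EXCLUDED = 1 };

/*
 * Combine the fsexcludes answer with the answer from the normal
 * excludes logic, following the "> 0" test used by the callers.
 */
static int combined_is_excluded(int fs_result, int normal_result)
{
	if (fs_result > 0)
		return 1;
	return normal_result > 0 ? 1 : 0;
}

static void demo_combined(void)
{
	assert(combined_is_excluded(FSE_EXCLUDED, 0) == 1);
	assert(combined_is_excluded(FSE_INCLUDED, 1) == 1);  /* normal excludes still apply */
	assert(combined_is_excluded(FSE_UNDECIDED, 0) == 0);
	assert(combined_is_excluded(-2, 0) == 0);            /* unknown negative falls through */
}
```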

>> +void fsexcludes_init(struct strbuf *sb) {
>> +       fsexcludes_initialized = 1;
>> +       fsexcludes_data = *sb;
>> +}
> 
> Grabbing the strbuf's members looks a bit odd. Is this
> performance-sensitive enough that you do not want to make a copy? If a
> caller releases its strbuf, which would normally be a good thing to do,
> we may be in big trouble later. (Not only may .buf be stale, .len may
> indicate we actually have something to read.)
> 
> I can understand that you do not want to pass a pointer+len, and that it
> is not enough to pass sb.buf, since the string may contain nuls.
> 
> Maybe detach the original strbuf? That way, if a caller releases its
> buffer, that is a no-op. A caller which goes on to use its buffer should
> fail quickly and obviously. Right now, an incorrect caller would
> probably fail more subtly and less reproducibly.
> 
> In any case, maybe document this in the .h-file?

Great suggestion!  I was looking for a better way to ensure the buffer 
ownership transfer was robust.  I'll do both strbuf_detach() and update 
the header file.  Thank you.
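
A minimal sketch of the ownership transfer being agreed on here, written against a hypothetical struct demo_buf rather than git's real strbuf so that it compiles on its own. One simplification to note: git's strbuf resets to a shared empty buffer, while this sketch resets to NULL. Freeing any previously stored buffer is an addition of the sketch, not something the patch does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct strbuf. */
struct demo_buf {
	char *buf;
	size_t len;
};

/* Module-level storage, playing the role of fsexcludes_data. */
static struct demo_buf demo_store;

static void demo_release(struct demo_buf *b)
{
	free(b->buf);	/* free(NULL) is a no-op */
	b->buf = NULL;
	b->len = 0;
}

/*
 * Take over the caller's allocation and reset the caller's view of
 * it, so a later release by the caller cannot free memory that the
 * module still uses.
 */
static void demo_take_ownership(struct demo_buf *src)
{
	assert(src != NULL);
	demo_release(&demo_store);	/* sketch-only: drop old data */
	demo_store = *src;		/* shallow copy of buf and len */
	src->buf = NULL;		/* detach from the caller */
	src->len = 0;
}
```

After demo_take_ownership(&sb), a caller that releases sb does nothing harmful, and a caller that keeps using sb sees an empty buffer right away instead of a stale pointer with a plausible length.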

> 
> Martin
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 0/2] fsexcludes: Add programmatic way to exclude files
  2018-04-10 21:04 [PATCH v1 0/2] fsexcludes: Add programmatic way to exclude files Ben Peart
  2018-04-10 21:04 ` [PATCH v1 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic Ben Peart
  2018-04-10 21:04 ` [PATCH v1 2/2] fsmonitor: switch to use new fsexcludes logic and remove unused untracked cache based logic Ben Peart
@ 2018-04-11 20:01 ` Ben Peart
  2018-04-11 20:01   ` [PATCH v2 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic Ben Peart
  2018-04-11 20:01   ` [PATCH v2 2/2] fsmonitor: switch to use new fsexcludes logic and remove unused untracked cache based logic Ben Peart
  2018-04-13 12:22 ` [PATCH v3 0/2] fsexcludes: Add programmatic way to exclude files Ben Peart
  2018-04-14 15:59 ` [PATCH v1 " Duy Nguyen
  4 siblings, 2 replies; 17+ messages in thread
From: Ben Peart @ 2018-04-11 20:01 UTC (permalink / raw)
  To: git; +Cc: pclouds, alexmv, blees, gitster, bmwill, avarab, johannes.schindelin, martin.agren, Ben Peart

Updated to incorporate feedback from V1.

I'd really like a close review of the changes in dir.c where I added the calls
to fsexcludes_is_excluded_from().  While they work and pass all the git tests
as well as our internal functional tests, I'd like to be sure I haven't missed
anything.

Base Ref: master
Web-Diff: https://github.com/benpeart/git/commit/08442c209d
Checkout: git fetch https://github.com/benpeart/git fsexcludes-v2 && git checkout 08442c209d


### Interdiff (v1..v2):

diff --git a/fsexcludes.c b/fsexcludes.c
index 07bfe376a0..0ef57f107b 100644
--- a/fsexcludes.c
+++ b/fsexcludes.c
@@ -33,7 +33,6 @@ static int check_fsexcludes_hashmap(struct hashmap *map, const char *pattern, in
 	char *slash;
 
 	/* Check straight mapping */
-	strbuf_reset(&sb);
 	strbuf_add(&sb, pattern, patternlen);
 	fse.pattern = sb.buf;
 	fse.patternlen = sb.len;
@@ -155,7 +154,6 @@ static int check_directory_hashmap(struct hashmap *map, const char *pathname, in
 	struct fsexcludes fse;
 
 	/* Check for directory */
-	strbuf_reset(&sb);
 	strbuf_add(&sb, pathname, pathlen);
 	strbuf_addch(&sb, '/');
 	fse.pattern = sb.buf;
@@ -198,13 +196,16 @@ int fsexcludes_is_excluded_from(struct index_state *istate,
 	return -1;
 }
 
-void fsexcludes_init(struct strbuf *sb) {
+void fsexcludes_init(struct strbuf *sb)
+{
 	fsexcludes_initialized = 1;
 	fsexcludes_data = *sb;
+	strbuf_detach(sb, NULL);
 }
 
-void fsexcludes_free() {
+void fsexcludes_free(void) {
 	strbuf_release(&fsexcludes_data);
 	hashmap_free(&fsexcludes_hashmap, 1);
 	hashmap_free(&parent_directory_hashmap, 1);
+	fsexcludes_initialized = 0;
 }
diff --git a/fsexcludes.h b/fsexcludes.h
index 1c4101343c..10246daa02 100644
--- a/fsexcludes.h
+++ b/fsexcludes.h
@@ -6,16 +6,18 @@
  * where git will scan for untracked files.  This is used to speed up the
  * scan by avoiding scanning parts of the work directory that do not have
  * any new files.
- *
  */
 
 /*
  * sb should contain a NUL separated list of path names of the files
  * and/or directories that should be checked.  Any path not listed will
  * be excluded from the scan.
+ *
+ * NOTE: fsexcludes_init() will take ownership of the storage passed in
+ * sb and will reset sb to `STRBUF_INIT`
  */
 void fsexcludes_init(struct strbuf *sb);
-void fsexcludes_free();
+void fsexcludes_free(void);
 
 /*
  * Return 1 for exclude, 0 for include and -1 for undecided.


### Patches

Ben Peart (2):
  fsexcludes: add a programmatic way to exclude files from git's working
    directory traversal logic
  fsmonitor: switch to use new fsexcludes logic and remove unused
    untracked cache based logic

 Makefile                    |   1 +
 dir.c                       |  33 ++++--
 dir.h                       |   2 -
 fsexcludes.c                | 211 ++++++++++++++++++++++++++++++++++++
 fsexcludes.h                |  29 +++++
 fsmonitor.c                 |  21 +---
 fsmonitor.h                 |  10 +-
 t/t7519-status-fsmonitor.sh |  14 +--
 8 files changed, 273 insertions(+), 48 deletions(-)
 create mode 100644 fsexcludes.c
 create mode 100644 fsexcludes.h


base-commit: 0b0cc9f86731f894cff8dd25299a9b38c254569e
-- 
2.17.0.windows.1



^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic
  2018-04-11 20:01 ` [PATCH v2 0/2] fsexcludes: Add programmatic way to exclude files Ben Peart
@ 2018-04-11 20:01   ` Ben Peart
  2018-04-11 23:52     ` Junio C Hamano
  2018-04-11 20:01   ` [PATCH v2 2/2] fsmonitor: switch to use new fsexcludes logic and remove unused untracked cache based logic Ben Peart
  1 sibling, 1 reply; 17+ messages in thread
From: Ben Peart @ 2018-04-11 20:01 UTC (permalink / raw)
  To: git; +Cc: pclouds, alexmv, blees, gitster, bmwill, avarab, johannes.schindelin, martin.agren, Ben Peart

The File System Excludes module is a new programmatic way to exclude files and
folders from git's traversal of the working directory.  fsexcludes_init() should
be called with a string buffer that contains a NUL separated list of path names
of the files and/or directories that should be included.  Any path not listed
will be excluded. The paths should be relative to the root of the working
directory and be separated by a single NUL.

The excludes logic in dir.c has been updated to honor the results of
fsexcludes_is_excluded_from().  If fsexcludes does not exclude the file, the
normal excludes logic is also checked as it could further reduce the set of
files that should be included.
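
As an illustration of the input format and matching rule described above, here is a simplified sketch. It is not the code from this patch: demo_listed() is a hypothetical helper that uses a linear scan instead of the hashmaps in fsexcludes.c, and it is case-sensitive only (the patch also honors ignore_case). A path counts as listed if it equals an entry exactly, or if an entry ending in '/' is a leading directory of it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * list/listlen: NUL separated entries, each terminated by a NUL.
 * Returns 1 if path is listed (directly or through a listed parent
 * directory such as "a/" or "a/b/"), 0 otherwise.
 */
static int demo_listed(const char *list, size_t listlen,
		       const char *path, size_t pathlen)
{
	const char *entry = list;
	const char *end = list + listlen;

	assert(list != NULL && path != NULL);
	while (entry < end) {
		const char *nul = memchr(entry, '\0', (size_t)(end - entry));
		size_t elen;

		if (!nul)
			break;	/* ignore a trailing unterminated entry */
		elen = (size_t)(nul - entry);

		/* exact match */
		if (elen == pathlen && !memcmp(entry, path, elen))
			return 1;
		/* entry is a directory that contains path */
		if (elen > 0 && elen < pathlen && entry[elen - 1] == '/' &&
		    !memcmp(entry, path, elen))
			return 1;
		entry = nul + 1;
	}
	return 0;
}

/* Same polarity as the patch: 1 for exclude, 0 for include. */
static int demo_is_excluded_file(const char *list, size_t listlen,
				 const char *path)
{
	return !demo_listed(list, listlen, path, strlen(path));
}
```

For example, with the 18-byte buffer "a/b/foo.txt\0docs/\0", the paths a/b/foo.txt and docs/readme.md are included, while src/main.c is excluded.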

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 Makefile     |   1 +
 dir.c        |  23 +++++-
 fsexcludes.c | 211 +++++++++++++++++++++++++++++++++++++++++++++++++++
 fsexcludes.h |  29 +++++++
 4 files changed, 262 insertions(+), 2 deletions(-)
 create mode 100644 fsexcludes.c
 create mode 100644 fsexcludes.h

diff --git a/Makefile b/Makefile
index 96f6138f63..c102d2f75a 100644
--- a/Makefile
+++ b/Makefile
@@ -819,6 +819,7 @@ LIB_OBJS += exec_cmd.o
 LIB_OBJS += fetch-object.o
 LIB_OBJS += fetch-pack.o
 LIB_OBJS += fsck.o
+LIB_OBJS += fsexcludes.o
 LIB_OBJS += fsmonitor.o
 LIB_OBJS += gettext.o
 LIB_OBJS += gpg-interface.o
diff --git a/dir.c b/dir.c
index 63a917be45..1aa639b9f4 100644
--- a/dir.c
+++ b/dir.c
@@ -18,6 +18,7 @@
 #include "utf8.h"
 #include "varint.h"
 #include "ewah/ewok.h"
+#include "fsexcludes.h"
 #include "fsmonitor.h"
 
 /*
@@ -1102,6 +1103,12 @@ int is_excluded_from_list(const char *pathname,
 			  struct exclude_list *el, struct index_state *istate)
 {
 	struct exclude *exclude;
+
+	if (*dtype == DT_UNKNOWN)
+		*dtype = get_dtype(NULL, istate, pathname, pathlen);
+	if (fsexcludes_is_excluded_from(istate, pathname, pathlen, *dtype) > 0)
+		return 1;
+
 	exclude = last_exclude_matching_from_list(pathname, pathlen, basename,
 						  dtype, el, istate);
 	if (exclude)
@@ -1317,8 +1324,15 @@ struct exclude *last_exclude_matching(struct dir_struct *dir,
 int is_excluded(struct dir_struct *dir, struct index_state *istate,
 		const char *pathname, int *dtype_p)
 {
-	struct exclude *exclude =
-		last_exclude_matching(dir, istate, pathname, dtype_p);
+	struct exclude *exclude;
+	int pathlen = strlen(pathname);
+
+	if (*dtype_p == DT_UNKNOWN)
+		*dtype_p = get_dtype(NULL, istate, pathname, pathlen);
+	if (fsexcludes_is_excluded_from(istate, pathname, pathlen, *dtype_p) > 0)
+		return 1;
+
+	exclude = last_exclude_matching(dir, istate, pathname, dtype_p);
 	if (exclude)
 		return exclude->flags & EXC_FLAG_NEGATIVE ? 0 : 1;
 	return 0;
@@ -1671,6 +1685,9 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
 	if (dtype != DT_DIR && has_path_in_index)
 		return path_none;
 
+	if (fsexcludes_is_excluded_from(istate, path->buf, path->len, dtype) > 0)
+		return path_excluded;
+
 	/*
 	 * When we are looking at a directory P in the working tree,
 	 * there are three cases:
@@ -2011,6 +2028,8 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 		/* add the path to the appropriate result list */
 		switch (state) {
 		case path_excluded:
+			if (fsexcludes_is_excluded_from(istate, path.buf, path.len, DTYPE(cdir.de)) > 0)
+				break;
 			if (dir->flags & DIR_SHOW_IGNORED)
 				dir_add_name(dir, istate, path.buf, path.len);
 			else if ((dir->flags & DIR_SHOW_IGNORED_TOO) ||
diff --git a/fsexcludes.c b/fsexcludes.c
new file mode 100644
index 0000000000..0ef57f107b
--- /dev/null
+++ b/fsexcludes.c
@@ -0,0 +1,211 @@
+#include "cache.h"
+#include "fsexcludes.h"
+#include "hashmap.h"
+#include "strbuf.h"
+
+static int fsexcludes_initialized = 0;
+static struct strbuf fsexcludes_data = STRBUF_INIT;
+static struct hashmap fsexcludes_hashmap;
+static struct hashmap parent_directory_hashmap;
+
+struct fsexcludes {
+	struct hashmap_entry ent; /* must be the first member! */
+	const char *pattern;
+	int patternlen;
+};
+
+static unsigned int(*fsexcludeshash)(const void *buf, size_t len);
+static int(*fsexcludescmp)(const char *a, const char *b, size_t len);
+
+static int fsexcludes_hashmap_cmp(const void *unused_cmp_data,
+	const void *a, const void *b, const void *key)
+{
+	const struct fsexcludes *fse1 = a;
+	const struct fsexcludes *fse2 = b;
+
+	return fsexcludescmp(fse1->pattern, fse2->pattern, fse1->patternlen);
+}
+
+static int check_fsexcludes_hashmap(struct hashmap *map, const char *pattern, int patternlen)
+{
+	struct strbuf sb = STRBUF_INIT;
+	struct fsexcludes fse;
+	char *slash;
+
+	/* Check straight mapping */
+	strbuf_add(&sb, pattern, patternlen);
+	fse.pattern = sb.buf;
+	fse.patternlen = sb.len;
+	hashmap_entry_init(&fse, fsexcludeshash(fse.pattern, fse.patternlen));
+	if (hashmap_get(map, &fse, NULL)) {
+		strbuf_release(&sb);
+		return 0;
+	}
+
+	/*
+	 * Check to see if it matches a directory or any path
+	 * underneath it.  In other words, 'a/b/foo.txt' will match
+	 * '/', 'a/', and 'a/b/'.
+	 */
+	slash = strchr(sb.buf, '/');
+	while (slash) {
+		fse.pattern = sb.buf;
+		fse.patternlen = slash - sb.buf + 1;
+		hashmap_entry_init(&fse, fsexcludeshash(fse.pattern, fse.patternlen));
+		if (hashmap_get(map, &fse, NULL)) {
+			strbuf_release(&sb);
+			return 0;
+		}
+		slash = strchr(slash + 1, '/');
+	}
+
+	strbuf_release(&sb);
+	return 1;
+}
+
+static void fsexcludes_hashmap_add(struct hashmap *map, const char *pattern, const int patternlen)
+{
+	struct fsexcludes *fse;
+
+	fse = xmalloc(sizeof(struct fsexcludes));
+	fse->pattern = pattern;
+	fse->patternlen = patternlen;
+	hashmap_entry_init(fse, fsexcludeshash(fse->pattern, fse->patternlen));
+	hashmap_add(map, fse);
+}
+
+static void initialize_fsexcludes_hashmap(struct hashmap *map, struct strbuf *fsexcludes_data)
+{
+	char *buf, *entry;
+	size_t len;
+	int i;
+
+	/*
+	 * Build a hashmap of the fsexcludes data we can use to look
+	 * for cache entry matches quickly
+	 */
+	fsexcludeshash = ignore_case ? memihash : memhash;
+	fsexcludescmp = ignore_case ? strncasecmp : strncmp;
+	hashmap_init(map, fsexcludes_hashmap_cmp, NULL, 0);
+
+	entry = buf = fsexcludes_data->buf;
+	len = fsexcludes_data->len;
+	for (i = 0; i < len; i++) {
+		if (buf[i] == '\0') {
+			fsexcludes_hashmap_add(map, entry, buf + i - entry);
+			entry = buf + i + 1;
+		}
+	}
+}
+
+static void parent_directory_hashmap_add(struct hashmap *map, const char *pattern, const int patternlen)
+{
+	char *slash;
+	struct fsexcludes *fse;
+
+	/*
+	 * Add any directories leading up to the file as the excludes logic
+	 * needs to match directories leading up to the files as well. Detect
+	 * and prevent unnecessary duplicate entries which will be common.
+	 */
+	if (patternlen > 1) {
+		slash = strchr(pattern + 1, '/');
+		while (slash) {
+			fse = xmalloc(sizeof(struct fsexcludes));
+			fse->pattern = pattern;
+			fse->patternlen = slash - pattern + 1;
+			hashmap_entry_init(fse, fsexcludeshash(fse->pattern, fse->patternlen));
+			if (hashmap_get(map, fse, NULL))
+				free(fse);
+			else
+				hashmap_add(map, fse);
+			slash = strchr(slash + 1, '/');
+		}
+	}
+}
+
+static void initialize_parent_directory_hashmap(struct hashmap *map, struct strbuf *vfs_data)
+{
+	char *buf, *entry;
+	size_t len;
+	int i;
+
+	/*
+	 * Build a hashmap of the parent directories contained in the virtual
+	 * file system data we can use to look for matches quickly
+	 */
+	fsexcludeshash = ignore_case ? memihash : memhash;
+	fsexcludescmp = ignore_case ? strncasecmp : strncmp;
+	hashmap_init(map, fsexcludes_hashmap_cmp, NULL, 0);
+
+	entry = buf = vfs_data->buf;
+	len = vfs_data->len;
+	for (i = 0; i < len; i++) {
+		if (buf[i] == '\0') {
+			parent_directory_hashmap_add(map, entry, buf + i - entry);
+			entry = buf + i + 1;
+		}
+	}
+}
+
+static int check_directory_hashmap(struct hashmap *map, const char *pathname, int pathlen)
+{
+	struct strbuf sb = STRBUF_INIT;
+	struct fsexcludes fse;
+
+	/* Check for directory */
+	strbuf_add(&sb, pathname, pathlen);
+	strbuf_addch(&sb, '/');
+	fse.pattern = sb.buf;
+	fse.patternlen = sb.len;
+	hashmap_entry_init(&fse, fsexcludeshash(fse.pattern, fse.patternlen));
+	if (hashmap_get(map, &fse, NULL)) {
+		strbuf_release(&sb);
+		return 0;
+	}
+
+	strbuf_release(&sb);
+	return 1;
+}
+
+/*
+ * Return 1 for exclude, 0 for include and -1 for undecided.
+ */
+int fsexcludes_is_excluded_from(struct index_state *istate,
+	const char *pathname, int pathlen, int dtype)
+{
+	if (!fsexcludes_initialized)
+		return -1;
+
+	if (dtype == DT_REG) {
+		/* lazily init the hashmap */
+		if (!fsexcludes_hashmap.cmpfn_data)
+			initialize_fsexcludes_hashmap(&fsexcludes_hashmap, &fsexcludes_data);
+
+		return check_fsexcludes_hashmap(&fsexcludes_hashmap, pathname, pathlen);
+	}
+
+	if (dtype == DT_DIR || dtype == DT_LNK) {
+		/* lazily init the hashmap */
+		if (!parent_directory_hashmap.cmpfn_data)
+			initialize_parent_directory_hashmap(&parent_directory_hashmap, &fsexcludes_data);
+
+		return check_directory_hashmap(&parent_directory_hashmap, pathname, pathlen);
+	}
+
+	return -1;
+}
+
+void fsexcludes_init(struct strbuf *sb)
+{
+	fsexcludes_initialized = 1;
+	fsexcludes_data = *sb;
+	strbuf_detach(sb, NULL);
+}
+
+void fsexcludes_free(void) {
+	strbuf_release(&fsexcludes_data);
+	hashmap_free(&fsexcludes_hashmap, 1);
+	hashmap_free(&parent_directory_hashmap, 1);
+	fsexcludes_initialized = 0;
+}
diff --git a/fsexcludes.h b/fsexcludes.h
new file mode 100644
index 0000000000..10246daa02
--- /dev/null
+++ b/fsexcludes.h
@@ -0,0 +1,29 @@
+#ifndef FSEXCLUDES_H
+#define FSEXCLUDES_H
+
+/*
+ * The file system excludes functions provide a way to programmatically limit
+ * where git will scan for untracked files.  This is used to speed up the
+ * scan by avoiding scanning parts of the work directory that do not have
+ * any new files.
+ */
+
+/*
+ * sb should contain a NUL separated list of path names of the files
+ * and/or directories that should be checked.  Any path not listed will
+ * be excluded from the scan.
+ *
+ * NOTE: fsexcludes_init() will take ownership of the storage passed in
+ * sb and will reset sb to `STRBUF_INIT`
+ */
+void fsexcludes_init(struct strbuf *sb);
+void fsexcludes_free(void);
+
+/*
+ * Return 1 for exclude, 0 for include and -1 for undecided.
+ */
+int fsexcludes_is_excluded_from(struct index_state *istate,
+	const char *pathname, int pathlen, int dtype_p);
+
+
+#endif
-- 
2.17.0.windows.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 2/2] fsmonitor: switch to use new fsexcludes logic and remove unused untracked cache based logic
  2018-04-11 20:01 ` [PATCH v2 0/2] fsexcludes: Add programmatic way to exclude files Ben Peart
  2018-04-11 20:01   ` [PATCH v2 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic Ben Peart
@ 2018-04-11 20:01   ` Ben Peart
  1 sibling, 0 replies; 17+ messages in thread
From: Ben Peart @ 2018-04-11 20:01 UTC (permalink / raw)
  To: git; +Cc: pclouds, alexmv, blees, gitster, bmwill, avarab, johannes.schindelin, martin.agren, Ben Peart

Update fsmonitor to utilize the new fsexcludes based logic for excluding paths
that do not need to be scanned for new or modified files.  Remove the old logic
in dir.c that utilized the untracked cache (if enabled) to accomplish the same
goal.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 dir.c                       | 10 +++-------
 dir.h                       |  2 --
 fsmonitor.c                 | 21 ++-------------------
 fsmonitor.h                 | 10 +++-------
 t/t7519-status-fsmonitor.sh | 14 +++-----------
 5 files changed, 11 insertions(+), 46 deletions(-)

diff --git a/dir.c b/dir.c
index 1aa639b9f4..28c2c83f76 100644
--- a/dir.c
+++ b/dir.c
@@ -19,7 +19,6 @@
 #include "varint.h"
 #include "ewah/ewok.h"
 #include "fsexcludes.h"
-#include "fsmonitor.h"
 
 /*
  * Tells read_directory_recursive how a file or directory should be treated.
@@ -1827,12 +1826,9 @@ static int valid_cached_dir(struct dir_struct *dir,
 	if (!untracked)
 		return 0;
 
-	/*
-	 * With fsmonitor, we can trust the untracked cache's valid field.
-	 */
-	refresh_fsmonitor(istate);
-	if (!(dir->untracked->use_fsmonitor && untracked->valid)) {
-		if (lstat(path->len ? path->buf : ".", &st)) {
+	if (!untracked->valid) {
+		if (stat(path->len ? path->buf : ".", &st)) {
+			invalidate_directory(dir->untracked, untracked);
 			memset(&untracked->stat_data, 0, sizeof(untracked->stat_data));
 			return 0;
 		}
diff --git a/dir.h b/dir.h
index b0758b82a2..e67ccfbb29 100644
--- a/dir.h
+++ b/dir.h
@@ -139,8 +139,6 @@ struct untracked_cache {
 	int gitignore_invalidated;
 	int dir_invalidated;
 	int dir_opened;
-	/* fsmonitor invalidation data */
-	unsigned int use_fsmonitor : 1;
 };
 
 struct dir_struct {
diff --git a/fsmonitor.c b/fsmonitor.c
index 6d7bcd5d0e..dd67eef851 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -2,6 +2,7 @@
 #include "config.h"
 #include "dir.h"
 #include "ewah/ewok.h"
+#include "fsexcludes.h"
 #include "fsmonitor.h"
 #include "run-command.h"
 #include "strbuf.h"
@@ -125,12 +126,7 @@ static void fsmonitor_refresh_callback(struct index_state *istate, const char *n
 		ce->ce_flags &= ~CE_FSMONITOR_VALID;
 	}
 
-	/*
-	 * Mark the untracked cache dirty even if it wasn't found in the index
-	 * as it could be a new untracked file.
-	 */
 	trace_printf_key(&trace_fsmonitor, "fsmonitor_refresh_callback '%s'", name);
-	untracked_cache_invalidate_path(istate, name, 0);
 }
 
 void refresh_fsmonitor(struct index_state *istate)
@@ -184,11 +180,8 @@ void refresh_fsmonitor(struct index_state *istate)
 		/* Mark all entries invalid */
 		for (i = 0; i < istate->cache_nr; i++)
 			istate->cache[i]->ce_flags &= ~CE_FSMONITOR_VALID;
-
-		if (istate->untracked)
-			istate->untracked->use_fsmonitor = 0;
 	}
-	strbuf_release(&query_result);
+	fsexcludes_init(&query_result);
 
 	/* Now that we've updated istate, save the last_update time */
 	istate->fsmonitor_last_update = last_update;
@@ -207,12 +200,6 @@ void add_fsmonitor(struct index_state *istate)
 		for (i = 0; i < istate->cache_nr; i++)
 			istate->cache[i]->ce_flags &= ~CE_FSMONITOR_VALID;
 
-		/* reset the untracked cache */
-		if (istate->untracked) {
-			add_untracked_cache(istate);
-			istate->untracked->use_fsmonitor = 1;
-		}
-
 		/* Update the fsmonitor state */
 		refresh_fsmonitor(istate);
 	}
@@ -241,10 +228,6 @@ void tweak_fsmonitor(struct index_state *istate)
 
 			/* Mark all previously saved entries as dirty */
 			ewah_each_bit(istate->fsmonitor_dirty, fsmonitor_ewah_callback, istate);
-
-			/* Now mark the untracked cache for fsmonitor usage */
-			if (istate->untracked)
-				istate->untracked->use_fsmonitor = 1;
 		}
 
 		ewah_free(istate->fsmonitor_dirty);
diff --git a/fsmonitor.h b/fsmonitor.h
index 65f3743636..f7adfc1f7c 100644
--- a/fsmonitor.h
+++ b/fsmonitor.h
@@ -35,8 +35,7 @@ extern void tweak_fsmonitor(struct index_state *istate);
 
 /*
  * Run the configured fsmonitor integration script and clear the
- * CE_FSMONITOR_VALID bit for any files returned as dirty.  Also invalidate
- * any corresponding untracked cache directory structures. Optimized to only
+ * CE_FSMONITOR_VALID bit for any files returned as dirty. Optimized to only
  * run the first time it is called.
  */
 extern void refresh_fsmonitor(struct index_state *istate);
@@ -55,17 +54,14 @@ static inline void mark_fsmonitor_valid(struct cache_entry *ce)
 }
 
 /*
- * Clear the given cache entry's CE_FSMONITOR_VALID bit and invalidate
- * any corresponding untracked cache directory structures. This should
+ * Clear the given cache entry's CE_FSMONITOR_VALID bit. This should
  * be called any time git creates or modifies a file that should
- * trigger an lstat() or invalidate the untracked cache for the
- * corresponding directory
+ * trigger an lstat() for the corresponding directory
  */
 static inline void mark_fsmonitor_invalid(struct index_state *istate, struct cache_entry *ce)
 {
 	if (core_fsmonitor) {
 		ce->ce_flags &= ~CE_FSMONITOR_VALID;
-		untracked_cache_invalidate_path(istate, ce->name, 1);
 		trace_printf_key(&trace_fsmonitor, "mark_fsmonitor_invalid '%s'", ce->name);
 	}
 }
diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh
index 756beb0d8e..d6a1da5a0a 100755
--- a/t/t7519-status-fsmonitor.sh
+++ b/t/t7519-status-fsmonitor.sh
@@ -225,8 +225,7 @@ test_expect_success '*only* files returned by the integration script get flagged
 # Ensure commands that call refresh_index() to move the index back in time
 # properly invalidate the fsmonitor cache
 test_expect_success 'refresh_index() invalidates fsmonitor cache' '
-	write_script .git/hooks/fsmonitor-test<<-\EOF &&
-	EOF
+	write_integration_script &&
 	clean_repo &&
 	dirty_repo &&
 	git add . &&
@@ -275,7 +274,7 @@ do
 		'
 
 		# Make sure it's actually skipping the check for modified and untracked
-		# (if enabled) files unless it is told about them.
+		# files unless it is told about them.
 		test_expect_success "status doesn't detect unreported modifications" '
 			write_script .git/hooks/fsmonitor-test<<-\EOF &&
 			:>marker
@@ -288,14 +287,7 @@ do
 			git status >actual &&
 			test_path_is_file marker &&
 			test_i18ngrep ! "Changes not staged for commit:" actual &&
-			if test $uc_val = true
-			then
-				test_i18ngrep ! "Untracked files:" actual
-			fi &&
-			if test $uc_val = false
-			then
-				test_i18ngrep "Untracked files:" actual
-			fi &&
+			test_i18ngrep ! "Untracked files:" actual &&
 			rm -f marker
 		'
 	done
-- 
2.17.0.windows.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic
  2018-04-11 20:01   ` [PATCH v2 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic Ben Peart
@ 2018-04-11 23:52     ` Junio C Hamano
  2018-04-13 11:53       ` Ben Peart
  0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2018-04-11 23:52 UTC (permalink / raw)
  To: Ben Peart; +Cc: git, pclouds, alexmv, blees, bmwill, avarab, johannes.schindelin, martin.agren


I haven't studied and thought about the motivation behind these two
patches, but one thing I noticed...

Ben Peart <Ben.Peart@microsoft.com> writes:

> diff --git a/dir.c b/dir.c
> index 63a917be45..1aa639b9f4 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -1102,6 +1103,12 @@ int is_excluded_from_list(const char *pathname,
>  			  struct exclude_list *el, struct index_state *istate)
>  {
>  	struct exclude *exclude;
> +
> +	if (*dtype == DT_UNKNOWN)
> +		*dtype = get_dtype(NULL, istate, pathname, pathlen);
> +	if (fsexcludes_is_excluded_from(istate, pathname, pathlen, *dtype) > 0)
> +		return 1;
> +
>  	exclude = last_exclude_matching_from_list(pathname, pathlen, basename,
>  						  dtype, el, istate);
>  	if (exclude)
> @@ -1317,8 +1324,15 @@ struct exclude *last_exclude_matching(struct dir_struct *dir,
>  int is_excluded(struct dir_struct *dir, struct index_state *istate,
>  		const char *pathname, int *dtype_p)
>  {
> -	struct exclude *exclude =
> -		last_exclude_matching(dir, istate, pathname, dtype_p);
> +	struct exclude *exclude;
> +	int pathlen = strlen(pathname);
> +
> +	if (*dtype_p == DT_UNKNOWN)
> +		*dtype_p = get_dtype(NULL, istate, pathname, pathlen);
> +	if (fsexcludes_is_excluded_from(istate, pathname, pathlen, *dtype_p) > 0)
> +		return 1;
> +
> +	exclude = last_exclude_matching(dir, istate, pathname, dtype_p);
>  	if (exclude)
>  		return exclude->flags & EXC_FLAG_NEGATIVE ? 0 : 1;
>  	return 0;

One impression I am getting from the above two hunks is that
the fsexcludes_is_excluded_from() function requires a real dtype in
its last parameter (i.e. DT_UNKNOWN is not acceptable).

> @@ -1671,6 +1685,9 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
>  	if (dtype != DT_DIR && has_path_in_index)
>  		return path_none;
>  
> +	if (fsexcludes_is_excluded_from(istate, path->buf, path->len, dtype) > 0)
> +		return path_excluded;
> +

And this hunk reinforces that impression (we are comparing dtype
with DT_DIR, so we know we cannot be passing DT_UNKNOWN to it).

> @@ -2011,6 +2028,8 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
>  		/* add the path to the appropriate result list */
>  		switch (state) {
>  		case path_excluded:
> +			if (fsexcludes_is_excluded_from(istate, path.buf, path.len, DTYPE(cdir.de)) > 0)
> +				break;

Then the use of DTYPE() looks a bit odd here.  On
NO_D_TYPE_IN_DIRENT platforms, we would get DT_UNKNOWN out of it and
then end up passing DT_UNKNOWN to the function.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic
  2018-04-11 23:52     ` Junio C Hamano
@ 2018-04-13 11:53       ` Ben Peart
  0 siblings, 0 replies; 17+ messages in thread
From: Ben Peart @ 2018-04-13 11:53 UTC (permalink / raw)
  To: Junio C Hamano, Ben Peart; +Cc: git, pclouds, alexmv, blees, bmwill, avarab, johannes.schindelin, martin.agren



On 4/11/2018 7:52 PM, Junio C Hamano wrote:
>> @@ -2011,6 +2028,8 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
>>   		/* add the path to the appropriate result list */
>>   		switch (state) {
>>   		case path_excluded:
>> +			if (fsexcludes_is_excluded_from(istate, path.buf, path.len, DTYPE(cdir.de)) > 0)
>> +				break;
> 
> Then the use of DTYPE() looks a bit odd here.  On
> NO_D_TYPE_IN_DIRENT platforms, we would get DT_UNKNOWN out of it and
> then end up passing DT_UNKNOWN to the function.
> 

Good catch.  I was trying to optimize this path and didn't realize the 
platform implications of using DTYPE().  I'll update it to match the others.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v3 0/2] fsexcludes: Add programmatic way to exclude files
  2018-04-10 21:04 [PATCH v1 0/2] fsexcludes: Add programmatic way to exclude files Ben Peart
                   ` (2 preceding siblings ...)
  2018-04-11 20:01 ` [PATCH v2 0/2] fsexcludes: Add programmatic way to exclude files Ben Peart
@ 2018-04-13 12:22 ` Ben Peart
  2018-04-13 12:22   ` [PATCH v3 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic Ben Peart
                     ` (2 more replies)
  2018-04-14 15:59 ` [PATCH v1 " Duy Nguyen
  4 siblings, 3 replies; 17+ messages in thread
From: Ben Peart @ 2018-04-13 12:22 UTC (permalink / raw)
  To: git, gitster; +Cc: pclouds, alexmv, blees, bmwill, avarab, johannes.schindelin, martin.agren, Ben Peart

Only minor changes from V2: 

Switched to using get_dtype() instead of DTYPE() for platform independence.
Cleaned up reverting of fsmonitor code in the untracked cache.

Base Ref: master
Web-Diff: https://github.com/benpeart/git/commit/709470f33f
Checkout: git fetch https://github.com/benpeart/git fsexcludes-v3 && git checkout 709470f33f



### Patches

Ben Peart (2):
  fsexcludes: add a programmatic way to exclude files from git's working
    directory traversal logic
  fsmonitor: switch to use new fsexcludes logic and remove unused
    untracked cache based logic

 Makefile                    |   1 +
 dir.c                       |  47 +++++---
 dir.h                       |   2 -
 fsexcludes.c                | 211 ++++++++++++++++++++++++++++++++++++
 fsexcludes.h                |  29 +++++
 fsmonitor.c                 |  21 +---
 fsmonitor.h                 |  10 +-
 t/t7519-status-fsmonitor.sh |  14 +--
 8 files changed, 279 insertions(+), 56 deletions(-)
 create mode 100644 fsexcludes.c
 create mode 100644 fsexcludes.h


base-commit: fe0a9eaf31dd0c349ae4308498c33a5c3794b293
-- 
2.17.0.windows.1



^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v3 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic
  2018-04-13 12:22 ` [PATCH v3 0/2] fsexcludes: Add programmatic way to exclude files Ben Peart
@ 2018-04-13 12:22   ` Ben Peart
  2018-04-13 12:22   ` [PATCH v3 2/2] fsmonitor: switch to use new fsexcludes logic and remove unused untracked cache based logic Ben Peart
  2018-04-18 15:31   ` [PATCH v3 0/2] fsexcludes: Add programmatic way to exclude files Ben Peart
  2 siblings, 0 replies; 17+ messages in thread
From: Ben Peart @ 2018-04-13 12:22 UTC (permalink / raw)
  To: git, gitster; +Cc: pclouds, alexmv, blees, bmwill, avarab, johannes.schindelin, martin.agren, Ben Peart

The File System Excludes module is a new programmatic way to exclude files and
folders from git's traversal of the working directory.  fsexcludes_init() should
be called with a string buffer that contains a NUL separated list of path names
of the files and/or directories that should be included.  Any path not listed
will be excluded. The paths should be relative to the root of the working
directory and be separated by a single NUL.

The excludes logic in dir.c has been updated to honor the results of
fsexcludes_is_excluded_from().  If fsexcludes does not exclude the file, the
normal excludes logic is also checked as it could further reduce the set of
files that should be included.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 Makefile     |   1 +
 dir.c        |  24 +++++-
 fsexcludes.c | 211 +++++++++++++++++++++++++++++++++++++++++++++++++++
 fsexcludes.h |  29 +++++++
 4 files changed, 263 insertions(+), 2 deletions(-)
 create mode 100644 fsexcludes.c
 create mode 100644 fsexcludes.h

diff --git a/Makefile b/Makefile
index f181687250..a4f1471272 100644
--- a/Makefile
+++ b/Makefile
@@ -822,6 +822,7 @@ LIB_OBJS += exec_cmd.o
 LIB_OBJS += fetch-object.o
 LIB_OBJS += fetch-pack.o
 LIB_OBJS += fsck.o
+LIB_OBJS += fsexcludes.o
 LIB_OBJS += fsmonitor.o
 LIB_OBJS += gettext.o
 LIB_OBJS += gpg-interface.o
diff --git a/dir.c b/dir.c
index 63a917be45..47a073efe1 100644
--- a/dir.c
+++ b/dir.c
@@ -18,6 +18,7 @@
 #include "utf8.h"
 #include "varint.h"
 #include "ewah/ewok.h"
+#include "fsexcludes.h"
 #include "fsmonitor.h"
 
 /*
@@ -1102,6 +1103,12 @@ int is_excluded_from_list(const char *pathname,
 			  struct exclude_list *el, struct index_state *istate)
 {
 	struct exclude *exclude;
+
+	if (*dtype == DT_UNKNOWN)
+		*dtype = get_dtype(NULL, istate, pathname, pathlen);
+	if (fsexcludes_is_excluded_from(istate, pathname, pathlen, *dtype) > 0)
+		return 1;
+
 	exclude = last_exclude_matching_from_list(pathname, pathlen, basename,
 						  dtype, el, istate);
 	if (exclude)
@@ -1317,8 +1324,15 @@ struct exclude *last_exclude_matching(struct dir_struct *dir,
 int is_excluded(struct dir_struct *dir, struct index_state *istate,
 		const char *pathname, int *dtype_p)
 {
-	struct exclude *exclude =
-		last_exclude_matching(dir, istate, pathname, dtype_p);
+	struct exclude *exclude;
+	int pathlen = strlen(pathname);
+
+	if (*dtype_p == DT_UNKNOWN)
+		*dtype_p = get_dtype(NULL, istate, pathname, pathlen);
+	if (fsexcludes_is_excluded_from(istate, pathname, pathlen, *dtype_p) > 0)
+		return 1;
+
+	exclude = last_exclude_matching(dir, istate, pathname, dtype_p);
 	if (exclude)
 		return exclude->flags & EXC_FLAG_NEGATIVE ? 0 : 1;
 	return 0;
@@ -1671,6 +1685,9 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
 	if (dtype != DT_DIR && has_path_in_index)
 		return path_none;
 
+	if (fsexcludes_is_excluded_from(istate, path->buf, path->len, dtype) > 0)
+		return path_excluded;
+
 	/*
 	 * When we are looking at a directory P in the working tree,
 	 * there are three cases:
@@ -2011,6 +2028,9 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 		/* add the path to the appropriate result list */
 		switch (state) {
 		case path_excluded:
+			if (fsexcludes_is_excluded_from(istate, path.buf, path.len,
+					get_dtype(cdir.de, istate, path.buf, path.len)) > 0)
+				break;
 			if (dir->flags & DIR_SHOW_IGNORED)
 				dir_add_name(dir, istate, path.buf, path.len);
 			else if ((dir->flags & DIR_SHOW_IGNORED_TOO) ||
diff --git a/fsexcludes.c b/fsexcludes.c
new file mode 100644
index 0000000000..0ef57f107b
--- /dev/null
+++ b/fsexcludes.c
@@ -0,0 +1,211 @@
+#include "cache.h"
+#include "fsexcludes.h"
+#include "hashmap.h"
+#include "strbuf.h"
+
+static int fsexcludes_initialized = 0;
+static struct strbuf fsexcludes_data = STRBUF_INIT;
+static struct hashmap fsexcludes_hashmap;
+static struct hashmap parent_directory_hashmap;
+
+struct fsexcludes {
+	struct hashmap_entry ent; /* must be the first member! */
+	const char *pattern;
+	int patternlen;
+};
+
+static unsigned int(*fsexcludeshash)(const void *buf, size_t len);
+static int(*fsexcludescmp)(const char *a, const char *b, size_t len);
+
+static int fsexcludes_hashmap_cmp(const void *unused_cmp_data,
+	const void *a, const void *b, const void *key)
+{
+	const struct fsexcludes *fse1 = a;
+	const struct fsexcludes *fse2 = b;
+
+	return fsexcludescmp(fse1->pattern, fse2->pattern, fse1->patternlen);
+}
+
+static int check_fsexcludes_hashmap(struct hashmap *map, const char *pattern, int patternlen)
+{
+	struct strbuf sb = STRBUF_INIT;
+	struct fsexcludes fse;
+	char *slash;
+
+	/* Check straight mapping */
+	strbuf_add(&sb, pattern, patternlen);
+	fse.pattern = sb.buf;
+	fse.patternlen = sb.len;
+	hashmap_entry_init(&fse, fsexcludeshash(fse.pattern, fse.patternlen));
+	if (hashmap_get(map, &fse, NULL)) {
+		strbuf_release(&sb);
+		return 0;
+	}
+
+	/*
+	 * Check to see if it matches a directory or any path
+	 * underneath it.  In other words, 'a/b/foo.txt' will match
+	 * '/', 'a/', and 'a/b/'.
+	 */
+	slash = strchr(sb.buf, '/');
+	while (slash) {
+		fse.pattern = sb.buf;
+		fse.patternlen = slash - sb.buf + 1;
+		hashmap_entry_init(&fse, fsexcludeshash(fse.pattern, fse.patternlen));
+		if (hashmap_get(map, &fse, NULL)) {
+			strbuf_release(&sb);
+			return 0;
+		}
+		slash = strchr(slash + 1, '/');
+	}
+
+	strbuf_release(&sb);
+	return 1;
+}
+
+static void fsexcludes_hashmap_add(struct hashmap *map, const char *pattern, const int patternlen)
+{
+	struct fsexcludes *fse;
+
+	fse = xmalloc(sizeof(struct fsexcludes));
+	fse->pattern = pattern;
+	fse->patternlen = patternlen;
+	hashmap_entry_init(fse, fsexcludeshash(fse->pattern, fse->patternlen));
+	hashmap_add(map, fse);
+}
+
+static void initialize_fsexcludes_hashmap(struct hashmap *map, struct strbuf *fsexcludes_data)
+{
+	char *buf, *entry;
+	size_t len;
+	int i;
+
+	/*
+	 * Build a hashmap of the fsexcludes data we can use to look
+	 * for cache entry matches quickly
+	 */
+	fsexcludeshash = ignore_case ? memihash : memhash;
+	fsexcludescmp = ignore_case ? strncasecmp : strncmp;
+	hashmap_init(map, fsexcludes_hashmap_cmp, NULL, 0);
+
+	entry = buf = fsexcludes_data->buf;
+	len = fsexcludes_data->len;
+	for (i = 0; i < len; i++) {
+		if (buf[i] == '\0') {
+			fsexcludes_hashmap_add(map, entry, buf + i - entry);
+			entry = buf + i + 1;
+		}
+	}
+}
+
+static void parent_directory_hashmap_add(struct hashmap *map, const char *pattern, const int patternlen)
+{
+	char *slash;
+	struct fsexcludes *fse;
+
+	/*
+	 * Add any directories leading up to the file as the excludes logic
+	 * needs to match directories leading up to the files as well. Detect
+	 * and prevent unnecessary duplicate entries which will be common.
+	 */
+	if (patternlen > 1) {
+		slash = strchr(pattern + 1, '/');
+		while (slash) {
+			fse = xmalloc(sizeof(struct fsexcludes));
+			fse->pattern = pattern;
+			fse->patternlen = slash - pattern + 1;
+			hashmap_entry_init(fse, fsexcludeshash(fse->pattern, fse->patternlen));
+			if (hashmap_get(map, fse, NULL))
+				free(fse);
+			else
+				hashmap_add(map, fse);
+			slash = strchr(slash + 1, '/');
+		}
+	}
+}
+
+static void initialize_parent_directory_hashmap(struct hashmap *map, struct strbuf *vfs_data)
+{
+	char *buf, *entry;
+	size_t len;
+	int i;
+
+	/*
+	 * Build a hashmap of the parent directories contained in the virtual
+	 * file system data we can use to look for matches quickly
+	 */
+	fsexcludeshash = ignore_case ? memihash : memhash;
+	fsexcludescmp = ignore_case ? strncasecmp : strncmp;
+	hashmap_init(map, fsexcludes_hashmap_cmp, NULL, 0);
+
+	entry = buf = vfs_data->buf;
+	len = vfs_data->len;
+	for (i = 0; i < len; i++) {
+		if (buf[i] == '\0') {
+			parent_directory_hashmap_add(map, entry, buf + i - entry);
+			entry = buf + i + 1;
+		}
+	}
+}
+
+static int check_directory_hashmap(struct hashmap *map, const char *pathname, int pathlen)
+{
+	struct strbuf sb = STRBUF_INIT;
+	struct fsexcludes fse;
+
+	/* Check for directory */
+	strbuf_add(&sb, pathname, pathlen);
+	strbuf_addch(&sb, '/');
+	fse.pattern = sb.buf;
+	fse.patternlen = sb.len;
+	hashmap_entry_init(&fse, fsexcludeshash(fse.pattern, fse.patternlen));
+	if (hashmap_get(map, &fse, NULL)) {
+		strbuf_release(&sb);
+		return 0;
+	}
+
+	strbuf_release(&sb);
+	return 1;
+}
+
+/*
+ * Return 1 for exclude, 0 for include and -1 for undecided.
+ */
+int fsexcludes_is_excluded_from(struct index_state *istate,
+	const char *pathname, int pathlen, int dtype)
+{
+	if (!fsexcludes_initialized)
+		return -1;
+
+	if (dtype == DT_REG) {
+		/* lazily init the hashmap */
+		if (!fsexcludes_hashmap.tablesize)
+			initialize_fsexcludes_hashmap(&fsexcludes_hashmap, &fsexcludes_data);
+
+		return check_fsexcludes_hashmap(&fsexcludes_hashmap, pathname, pathlen);
+	}
+
+	if (dtype == DT_DIR || dtype == DT_LNK) {
+		/* lazily init the hashmap */
+		if (!parent_directory_hashmap.tablesize)
+			initialize_parent_directory_hashmap(&parent_directory_hashmap, &fsexcludes_data);
+
+		return check_directory_hashmap(&parent_directory_hashmap, pathname, pathlen);
+	}
+
+	return -1;
+}
+
+void fsexcludes_init(struct strbuf *sb)
+{
+	fsexcludes_initialized = 1;
+	fsexcludes_data = *sb;
+	strbuf_detach(sb, NULL);
+}
+
+void fsexcludes_free(void) {
+	strbuf_release(&fsexcludes_data);
+	hashmap_free(&fsexcludes_hashmap, 1);
+	hashmap_free(&parent_directory_hashmap, 1);
+	fsexcludes_initialized = 0;
+}
diff --git a/fsexcludes.h b/fsexcludes.h
new file mode 100644
index 0000000000..10246daa02
--- /dev/null
+++ b/fsexcludes.h
@@ -0,0 +1,29 @@
+#ifndef FSEXCLUDES_H
+#define FSEXCLUDES_H
+
+/*
+ * The file system excludes functions provide a way to programmatically limit
+ * where git will scan for untracked files.  This is used to speed up the
+ * scan by avoiding scanning parts of the work directory that do not have
+ * any new files.
+ */
+
+/*
+ * sb should contain a NUL separated list of path names of the files
+ * and/or directories that should be checked.  Any path not listed will
+ * be excluded from the scan.
+ *
+ * NOTE: fsexcludes_init() will take ownership of the storage passed in
+ * sb and will reset sb to `STRBUF_INIT`
+ */
+void fsexcludes_init(struct strbuf *sb);
+void fsexcludes_free(void);
+
+/*
+ * Return 1 for exclude, 0 for include and -1 for undecided.
+ */
+int fsexcludes_is_excluded_from(struct index_state *istate,
+	const char *pathname, int pathlen, int dtype);
+
+
+#endif
-- 
2.17.0.windows.1



* [PATCH v3 2/2] fsmonitor: switch to use new fsexcludes logic and remove unused untracked cache based logic
  2018-04-13 12:22 ` [PATCH v3 0/2] fsexcludes: Add programmatic way to exclude files Ben Peart
  2018-04-13 12:22   ` [PATCH v3 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic Ben Peart
@ 2018-04-13 12:22   ` Ben Peart
  2018-04-18 15:31   ` [PATCH v3 0/2] fsexcludes: Add programmatic way to exclude files Ben Peart
  2 siblings, 0 replies; 17+ messages in thread
From: Ben Peart @ 2018-04-13 12:22 UTC (permalink / raw)
  To: git, gitster; +Cc: pclouds, alexmv, blees, bmwill, avarab, johannes.schindelin, martin.agren, Ben Peart

Update fsmonitor to utilize the new fsexcludes based logic for excluding paths
that do not need to be scanned for new or modified files.  Remove the old logic
in dir.c that utilized the untracked cache (if enabled) to accomplish the same
goal.
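
The hand-off of the query result relies on fsexcludes_init() taking ownership of the buffer it is given.  A minimal standalone sketch of that ownership transfer is below; struct buffer, excludes_take() and excludes_release() are hypothetical stand-ins, not git's strbuf API, and the "empty" state is simplified to a NULL pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for git's struct strbuf */
struct buffer {
	char *data;
	size_t len;
};

/* Module-level owner of the path list; zero-initialized at startup */
static struct buffer excludes_data;

/* Take ownership of src's storage and leave src empty. */
static void excludes_take(struct buffer *src)
{
	assert(src != NULL);

	free(excludes_data.data);	/* drop any previously held list */
	excludes_data = *src;
	src->data = NULL;
	src->len = 0;
}

/* Free the owned storage and return to the empty state. */
static void excludes_release(void)
{
	free(excludes_data.data);
	excludes_data.data = NULL;
	excludes_data.len = 0;
}
```

After excludes_take() the caller must not free or reuse its buffer, which is why the diff below drops the strbuf_release() call instead of keeping both.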

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 dir.c                       | 23 ++++++++---------------
 dir.h                       |  2 --
 fsmonitor.c                 | 21 ++-------------------
 fsmonitor.h                 | 10 +++-------
 t/t7519-status-fsmonitor.sh | 14 +++-----------
 5 files changed, 16 insertions(+), 54 deletions(-)

diff --git a/dir.c b/dir.c
index 47a073efe1..b1859f4311 100644
--- a/dir.c
+++ b/dir.c
@@ -19,7 +19,6 @@
 #include "varint.h"
 #include "ewah/ewok.h"
 #include "fsexcludes.h"
-#include "fsmonitor.h"
 
 /*
  * Tells read_directory_recursive how a file or directory should be treated.
@@ -1827,20 +1826,14 @@ static int valid_cached_dir(struct dir_struct *dir,
 	if (!untracked)
 		return 0;
 
-	/*
-	 * With fsmonitor, we can trust the untracked cache's valid field.
-	 */
-	refresh_fsmonitor(istate);
-	if (!(dir->untracked->use_fsmonitor && untracked->valid)) {
-		if (lstat(path->len ? path->buf : ".", &st)) {
-			memset(&untracked->stat_data, 0, sizeof(untracked->stat_data));
-			return 0;
-		}
-		if (!untracked->valid ||
-			match_stat_data_racy(istate, &untracked->stat_data, &st)) {
-			fill_stat_data(&untracked->stat_data, &st);
-			return 0;
-		}
+	if (stat(path->len ? path->buf : ".", &st)) {
+		memset(&untracked->stat_data, 0, sizeof(untracked->stat_data));
+		return 0;
+	}
+	if (!untracked->valid ||
+	    match_stat_data_racy(istate, &untracked->stat_data, &st)) {
+		fill_stat_data(&untracked->stat_data, &st);
+		return 0;
 	}
 
 	if (untracked->check_only != !!check_only)
diff --git a/dir.h b/dir.h
index b0758b82a2..e67ccfbb29 100644
--- a/dir.h
+++ b/dir.h
@@ -139,8 +139,6 @@ struct untracked_cache {
 	int gitignore_invalidated;
 	int dir_invalidated;
 	int dir_opened;
-	/* fsmonitor invalidation data */
-	unsigned int use_fsmonitor : 1;
 };
 
 struct dir_struct {
diff --git a/fsmonitor.c b/fsmonitor.c
index 6d7bcd5d0e..dd67eef851 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -2,6 +2,7 @@
 #include "config.h"
 #include "dir.h"
 #include "ewah/ewok.h"
+#include "fsexcludes.h"
 #include "fsmonitor.h"
 #include "run-command.h"
 #include "strbuf.h"
@@ -125,12 +126,7 @@ static void fsmonitor_refresh_callback(struct index_state *istate, const char *n
 		ce->ce_flags &= ~CE_FSMONITOR_VALID;
 	}
 
-	/*
-	 * Mark the untracked cache dirty even if it wasn't found in the index
-	 * as it could be a new untracked file.
-	 */
 	trace_printf_key(&trace_fsmonitor, "fsmonitor_refresh_callback '%s'", name);
-	untracked_cache_invalidate_path(istate, name, 0);
 }
 
 void refresh_fsmonitor(struct index_state *istate)
@@ -184,11 +180,8 @@ void refresh_fsmonitor(struct index_state *istate)
 		/* Mark all entries invalid */
 		for (i = 0; i < istate->cache_nr; i++)
 			istate->cache[i]->ce_flags &= ~CE_FSMONITOR_VALID;
-
-		if (istate->untracked)
-			istate->untracked->use_fsmonitor = 0;
 	}
-	strbuf_release(&query_result);
+	fsexcludes_init(&query_result);
 
 	/* Now that we've updated istate, save the last_update time */
 	istate->fsmonitor_last_update = last_update;
@@ -207,12 +200,6 @@ void add_fsmonitor(struct index_state *istate)
 		for (i = 0; i < istate->cache_nr; i++)
 			istate->cache[i]->ce_flags &= ~CE_FSMONITOR_VALID;
 
-		/* reset the untracked cache */
-		if (istate->untracked) {
-			add_untracked_cache(istate);
-			istate->untracked->use_fsmonitor = 1;
-		}
-
 		/* Update the fsmonitor state */
 		refresh_fsmonitor(istate);
 	}
@@ -241,10 +228,6 @@ void tweak_fsmonitor(struct index_state *istate)
 
 			/* Mark all previously saved entries as dirty */
 			ewah_each_bit(istate->fsmonitor_dirty, fsmonitor_ewah_callback, istate);
-
-			/* Now mark the untracked cache for fsmonitor usage */
-			if (istate->untracked)
-				istate->untracked->use_fsmonitor = 1;
 		}
 
 		ewah_free(istate->fsmonitor_dirty);
diff --git a/fsmonitor.h b/fsmonitor.h
index 65f3743636..f7adfc1f7c 100644
--- a/fsmonitor.h
+++ b/fsmonitor.h
@@ -35,8 +35,7 @@ extern void tweak_fsmonitor(struct index_state *istate);
 
 /*
  * Run the configured fsmonitor integration script and clear the
- * CE_FSMONITOR_VALID bit for any files returned as dirty.  Also invalidate
- * any corresponding untracked cache directory structures. Optimized to only
+ * CE_FSMONITOR_VALID bit for any files returned as dirty. Optimized to only
  * run the first time it is called.
  */
 extern void refresh_fsmonitor(struct index_state *istate);
@@ -55,17 +54,14 @@ static inline void mark_fsmonitor_valid(struct cache_entry *ce)
 }
 
 /*
- * Clear the given cache entry's CE_FSMONITOR_VALID bit and invalidate
- * any corresponding untracked cache directory structures. This should
+ * Clear the given cache entry's CE_FSMONITOR_VALID bit. This should
  * be called any time git creates or modifies a file that should
- * trigger an lstat() or invalidate the untracked cache for the
- * corresponding directory
+ * trigger an lstat() for the corresponding directory
  */
 static inline void mark_fsmonitor_invalid(struct index_state *istate, struct cache_entry *ce)
 {
 	if (core_fsmonitor) {
 		ce->ce_flags &= ~CE_FSMONITOR_VALID;
-		untracked_cache_invalidate_path(istate, ce->name, 1);
 		trace_printf_key(&trace_fsmonitor, "mark_fsmonitor_invalid '%s'", ce->name);
 	}
 }
diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh
index 756beb0d8e..d6a1da5a0a 100755
--- a/t/t7519-status-fsmonitor.sh
+++ b/t/t7519-status-fsmonitor.sh
@@ -225,8 +225,7 @@ test_expect_success '*only* files returned by the integration script get flagged
 # Ensure commands that call refresh_index() to move the index back in time
 # properly invalidate the fsmonitor cache
 test_expect_success 'refresh_index() invalidates fsmonitor cache' '
-	write_script .git/hooks/fsmonitor-test<<-\EOF &&
-	EOF
+	write_integration_script &&
 	clean_repo &&
 	dirty_repo &&
 	git add . &&
@@ -275,7 +274,7 @@ do
 		'
 
 		# Make sure it's actually skipping the check for modified and untracked
-		# (if enabled) files unless it is told about them.
+		# files unless it is told about them.
 		test_expect_success "status doesn't detect unreported modifications" '
 			write_script .git/hooks/fsmonitor-test<<-\EOF &&
 			:>marker
@@ -288,14 +287,7 @@ do
 			git status >actual &&
 			test_path_is_file marker &&
 			test_i18ngrep ! "Changes not staged for commit:" actual &&
-			if test $uc_val = true
-			then
-				test_i18ngrep ! "Untracked files:" actual
-			fi &&
-			if test $uc_val = false
-			then
-				test_i18ngrep "Untracked files:" actual
-			fi &&
+			test_i18ngrep ! "Untracked files:" actual &&
 			rm -f marker
 		'
 	done
-- 
2.17.0.windows.1



* Re: [PATCH v1 0/2] fsexcludes: Add programmatic way to exclude files
  2018-04-10 21:04 [PATCH v1 0/2] fsexcludes: Add programmatic way to exclude files Ben Peart
                   ` (3 preceding siblings ...)
  2018-04-13 12:22 ` [PATCH v3 0/2] fsexcludes: Add programmatic way to exclude files Ben Peart
@ 2018-04-14 15:59 ` " Duy Nguyen
  4 siblings, 0 replies; 17+ messages in thread
From: Duy Nguyen @ 2018-04-14 15:59 UTC (permalink / raw)
  To: Ben Peart; +Cc: git, alexmv, blees, gitster, bmwill, avarab, johannes.schindelin

On Tue, Apr 10, 2018 at 11:04 PM, Ben Peart <Ben.Peart@microsoft.com> wrote:
> In git repos with large working directories an external file system monitor
> (like fsmonitor or gvfs) can track what files in the working directory have been
> modified.  This information can be used to speed up git operations that scale
> based on the size of the working directory so that they become O(# of modified
> files) vs O(# of files in the working directory).
>
> The fsmonitor patch series added logic to limit what files git had to stat() to
> the set of modified files provided by the fsmonitor hook proc.  It also used the
> untracked cache (if enabled) to limit the files/folders git had to scan looking
> for new/untracked files.  GVFS is another external file system model that also
> speeds up git working directory based operations that has been using a different
> mechanism (programmatically generating an excludes file) to enable git to be
> O(# of modified files).
>
> This patch series will introduce a new way to limit git's traversal of the
> working directory that does not require the untracked cache (fsmonitor) or using
> the excludes feature (GVFS).  It does this by enhancing the existing excludes
> logic in dir.c to support a new "File System Excludes" or fsexcludes API that is
> better tuned to these programmatic applications.

I have not had a chance to really look at the patches yet but I think
these three paragraphs should somehow be included in the commit
description of 1/2 (or spread out between 1/2 and 2/2). 1/2
description for example briefly talks about how to use the new thing,
but not really tell what it's for, why you need to add it.
-- 
Duy


* Re: [PATCH v3 0/2] fsexcludes: Add programmatic way to exclude files
  2018-04-13 12:22 ` [PATCH v3 0/2] fsexcludes: Add programmatic way to exclude files Ben Peart
  2018-04-13 12:22   ` [PATCH v3 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic Ben Peart
  2018-04-13 12:22   ` [PATCH v3 2/2] fsmonitor: switch to use new fsexcludes logic and remove unused untracked cache based logic Ben Peart
@ 2018-04-18 15:31   ` Ben Peart
  2018-04-18 21:25     ` Junio C Hamano
  2 siblings, 1 reply; 17+ messages in thread
From: Ben Peart @ 2018-04-18 15:31 UTC (permalink / raw)
  To: Ben Peart, git, gitster; +Cc: pclouds, alexmv, blees, bmwill, avarab, johannes.schindelin, martin.agren

I found a bug with how this patch series deals with untracked files. 
I'm going to retract this patch until I have time to create a new test 
case to demonstrate the bug and come up with a good fix.

Ben


* Re: [PATCH v3 0/2] fsexcludes: Add programmatic way to exclude files
  2018-04-18 15:31   ` [PATCH v3 0/2] fsexcludes: Add programmatic way to exclude files Ben Peart
@ 2018-04-18 21:25     ` Junio C Hamano
  0 siblings, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2018-04-18 21:25 UTC (permalink / raw)
  To: Ben Peart; +Cc: Ben Peart, git, pclouds, alexmv, blees, bmwill, avarab, johannes.schindelin, martin.agren

Ben Peart <peartben@gmail.com> writes:

> I found a bug with how this patch series deals with untracked
> files. I'm going to retract this patch until I have time to create a
> new test case to demonstrate the bug and come up with a good fix.
>
> Ben

Thanks.



Thread overview: 17+ messages
2018-04-10 21:04 [PATCH v1 0/2] fsexcludes: Add programmatic way to exclude files Ben Peart
2018-04-10 21:04 ` [PATCH v1 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic Ben Peart
2018-04-10 22:09   ` Martin Ågren
2018-04-11 19:56     ` Ben Peart
2018-04-11  6:58   ` Junio C Hamano
2018-04-10 21:04 ` [PATCH v1 2/2] fsmonitor: switch to use new fsexcludes logic and remove unused untracked cache based logic Ben Peart
2018-04-11 20:01 ` [PATCH v2 0/2] fsexcludes: Add programmatic way to exclude files Ben Peart
2018-04-11 20:01   ` [PATCH v2 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic Ben Peart
2018-04-11 23:52     ` Junio C Hamano
2018-04-13 11:53       ` Ben Peart
2018-04-11 20:01   ` [PATCH v2 2/2] fsmonitor: switch to use new fsexcludes logic and remove unused untracked cache based logic Ben Peart
2018-04-13 12:22 ` [PATCH v3 0/2] fsexcludes: Add programmatic way to exclude files Ben Peart
2018-04-13 12:22   ` [PATCH v3 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic Ben Peart
2018-04-13 12:22   ` [PATCH v3 2/2] fsmonitor: switch to use new fsexcludes logic and remove unused untracked cache based logic Ben Peart
2018-04-18 15:31   ` [PATCH v3 0/2] fsexcludes: Add programmatic way to exclude files Ben Peart
2018-04-18 21:25     ` Junio C Hamano
2018-04-14 15:59 ` [PATCH v1 " Duy Nguyen
