git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v5 0/7] Fast git status via a file system watcher
@ 2017-06-10 13:40 Ben Peart
  2017-06-10 13:40 ` [PATCH v5 1/7] bswap: add 64 bit endianness helper get_be64 Ben Peart
                   ` (8 more replies)
  0 siblings, 9 replies; 137+ messages in thread
From: Ben Peart @ 2017-06-10 13:40 UTC (permalink / raw)
  To: git
  Cc: gitster, benpeart, pclouds, johannes.schindelin, David.Turner,
	peff, christian.couder, avarab

Changes from V4 include:

fsmonitor.c:
	Only flag index dirty if fsmonitor extension is added or removed
	or if new cache or untracked cache entries are marked dirty

dir.c:
	If an untracked cache entry is flagged as fsmonitor_dirty,
	fall back to existing logic to stat the file and check for
	changes instead of assuming it is dirty.

hooks--query-fsmonitor.sample: 
	Optimize query to exclude transitory files (files that were
	created and deleted since the last call).
	
	Update path mangling on Windows to match Watchman beta.

test-drop-caches.c:
	Add perf helper to drop the disk cache on Windows.

p7519-fsmonitor.sh:
	Add perf test for fsmonitor changes

Ben Peart (7):
  bswap: add 64 bit endianness helper get_be64
  dir: make lookup_untracked() available outside of dir.c
  fsmonitor: teach git to optionally utilize a file system monitor to
    speed up detecting new or changed files.
  fsmonitor: add test cases for fsmonitor extension
  fsmonitor: add documentation for the fsmonitor extension.
  fsmonitor: add a sample query-fsmonitor hook script for Watchman
  fsmonitor: add a performance test

 Documentation/config.txt                 |   7 +
 Documentation/githooks.txt               |  23 +++
 Documentation/technical/index-format.txt |  19 +++
 Makefile                                 |   2 +
 builtin/update-index.c                   |   1 +
 cache.h                                  |   5 +
 compat/bswap.h                           |   4 +
 config.c                                 |   4 +
 dir.c                                    |  29 ++--
 dir.h                                    |   5 +
 entry.c                                  |   1 +
 environment.c                            |   1 +
 fsmonitor.c                              | 261 +++++++++++++++++++++++++++++++
 fsmonitor.h                              |   9 ++
 read-cache.c                             |  28 +++-
 t/helper/test-drop-caches.c              | 107 +++++++++++++
 t/perf/p7519-fsmonitor.sh                | 161 +++++++++++++++++++
 t/t7519-status-fsmonitor.sh              | 173 ++++++++++++++++++++
 templates/hooks--query-fsmonitor.sample  |  76 +++++++++
 unpack-trees.c                           |   1 +
 20 files changed, 904 insertions(+), 13 deletions(-)
 create mode 100644 fsmonitor.c
 create mode 100644 fsmonitor.h
 create mode 100644 t/helper/test-drop-caches.c
 create mode 100755 t/perf/p7519-fsmonitor.sh
 create mode 100755 t/t7519-status-fsmonitor.sh
 create mode 100755 templates/hooks--query-fsmonitor.sample

-- 
2.13.0


^ permalink raw reply	[flat|nested] 137+ messages in thread

* [PATCH v5 1/7] bswap: add 64 bit endianness helper get_be64
  2017-06-10 13:40 [PATCH v5 0/7] Fast git status via a file system watcher Ben Peart
@ 2017-06-10 13:40 ` Ben Peart
  2017-06-10 13:40 ` [PATCH v5 2/7] dir: make lookup_untracked() available outside of dir.c Ben Peart
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-06-10 13:40 UTC (permalink / raw)
  To: git
  Cc: gitster, benpeart, pclouds, johannes.schindelin, David.Turner,
	peff, christian.couder, avarab

Add a new get_be64 macro to enable 64 bit endian conversions on memory
that may or may not be aligned.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 compat/bswap.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/compat/bswap.h b/compat/bswap.h
index d47c003544..f89fe7f4b5 100644
--- a/compat/bswap.h
+++ b/compat/bswap.h
@@ -158,6 +158,7 @@ static inline uint64_t git_bswap64(uint64_t x)
 
 #define get_be16(p)	ntohs(*(unsigned short *)(p))
 #define get_be32(p)	ntohl(*(unsigned int *)(p))
+#define get_be64(p)	ntohll(*(uint64_t *)(p))
 #define put_be32(p, v)	do { *(unsigned int *)(p) = htonl(v); } while (0)
 
 #else
@@ -170,6 +171,9 @@ static inline uint64_t git_bswap64(uint64_t x)
 	(*((unsigned char *)(p) + 1) << 16) | \
 	(*((unsigned char *)(p) + 2) <<  8) | \
 	(*((unsigned char *)(p) + 3) <<  0) )
+#define get_be64(p)	( \
+	((uint64_t)get_be32((unsigned char *)(p) + 0) << 32) | \
+	((uint64_t)get_be32((unsigned char *)(p) + 4) <<  0)
 #define put_be32(p, v)	do { \
 	unsigned int __v = (v); \
 	*((unsigned char *)(p) + 0) = __v >> 24; \
-- 
2.13.0


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v5 2/7] dir: make lookup_untracked() available outside of dir.c
  2017-06-10 13:40 [PATCH v5 0/7] Fast git status via a file system watcher Ben Peart
  2017-06-10 13:40 ` [PATCH v5 1/7] bswap: add 64 bit endianness helper get_be64 Ben Peart
@ 2017-06-10 13:40 ` Ben Peart
  2017-06-10 13:40 ` [PATCH v5 3/7] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-06-10 13:40 UTC (permalink / raw)
  To: git
  Cc: gitster, benpeart, pclouds, johannes.schindelin, David.Turner,
	peff, christian.couder, avarab

Remove the static qualifier from lookup_untracked() and make it
available to other modules by exporting it from dir.h.  This will be
used later when we need to find entries to mark 'fsmonitor dirty.'

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 dir.c | 2 +-
 dir.h | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index 9efcf1eab6..5f531d9eed 100644
--- a/dir.c
+++ b/dir.c
@@ -666,7 +666,7 @@ static void trim_trailing_spaces(char *buf)
  *
  * If "name" has the trailing slash, it'll be excluded in the search.
  */
-static struct untracked_cache_dir *lookup_untracked(struct untracked_cache *uc,
+struct untracked_cache_dir *lookup_untracked(struct untracked_cache *uc,
 						    struct untracked_cache_dir *dir,
 						    const char *name, int len)
 {
diff --git a/dir.h b/dir.h
index a89c13e27a..75655ebe35 100644
--- a/dir.h
+++ b/dir.h
@@ -354,4 +354,7 @@ extern void connect_work_tree_and_git_dir(const char *work_tree, const char *git
 extern void relocate_gitdir(const char *path,
 			    const char *old_git_dir,
 			    const char *new_git_dir);
+struct untracked_cache_dir *lookup_untracked(struct untracked_cache *uc,
+					     struct untracked_cache_dir *dir,
+					     const char *name, int len);
 #endif
-- 
2.13.0


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v5 3/7] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  2017-06-10 13:40 [PATCH v5 0/7] Fast git status via a file system watcher Ben Peart
  2017-06-10 13:40 ` [PATCH v5 1/7] bswap: add 64 bit endianness helper get_be64 Ben Peart
  2017-06-10 13:40 ` [PATCH v5 2/7] dir: make lookup_untracked() available outside of dir.c Ben Peart
@ 2017-06-10 13:40 ` Ben Peart
  2017-06-27 15:43   ` Christian Couder
  2017-06-10 13:40 ` [PATCH v5 4/7] fsmonitor: add test cases for fsmonitor extension Ben Peart
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-06-10 13:40 UTC (permalink / raw)
  To: git
  Cc: gitster, benpeart, pclouds, johannes.schindelin, David.Turner,
	peff, christian.couder, avarab

When the index is read from disk, the query-fsmonitor index extension is
used to flag the last known potentially dirty index and untracked cache
entries.

If git finds out some entries are 'fsmonitor-dirty', but are really
unchanged (e.g. the file was changed, then reverted back), then Git will
clear the marking in the extension. If git adds or updates an index
entry, it is marked 'fsmonitor-dirty' to ensure it is checked for
changes in the working directory.

Before the 'fsmonitor-dirty' flags are used to limit the scope of the
files to be checked, the query-fsmonitor hook proc is called with the
time the index was last updated.  The hook proc returns the list of
files changed since that last updated time and the list of
potentially dirty entries is updated to reflect the current state.

refresh_index() and valid_cached_dir() are updated so that any entry not
flagged as potentially dirty is not checked as it cannot have any
changes.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 Makefile               |   1 +
 builtin/update-index.c |   1 +
 cache.h                |   5 +
 config.c               |   4 +
 dir.c                  |  27 +++--
 dir.h                  |   2 +
 entry.c                |   1 +
 environment.c          |   1 +
 fsmonitor.c            | 261 +++++++++++++++++++++++++++++++++++++++++++++++++
 fsmonitor.h            |   9 ++
 read-cache.c           |  28 +++++-
 unpack-trees.c         |   1 +
 12 files changed, 329 insertions(+), 12 deletions(-)
 create mode 100644 fsmonitor.c
 create mode 100644 fsmonitor.h

diff --git a/Makefile b/Makefile
index 7c621f7f76..992dd58801 100644
--- a/Makefile
+++ b/Makefile
@@ -763,6 +763,7 @@ LIB_OBJS += ewah/ewah_rlw.o
 LIB_OBJS += exec_cmd.o
 LIB_OBJS += fetch-pack.o
 LIB_OBJS += fsck.o
+LIB_OBJS += fsmonitor.o
 LIB_OBJS += gettext.o
 LIB_OBJS += gpg-interface.o
 LIB_OBJS += graph.o
diff --git a/builtin/update-index.c b/builtin/update-index.c
index ebfc09faa0..32fd977b43 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -232,6 +232,7 @@ static int mark_ce_flags(const char *path, int flag, int mark)
 		else
 			active_cache[pos]->ce_flags &= ~flag;
 		active_cache[pos]->ce_flags |= CE_UPDATE_IN_BASE;
+		active_cache[pos]->ce_flags |= CE_FSMONITOR_DIRTY;
 		cache_tree_invalidate_path(&the_index, path);
 		active_cache_changed |= CE_ENTRY_CHANGED;
 		return 0;
diff --git a/cache.h b/cache.h
index 4d92aae0e8..220efbfde7 100644
--- a/cache.h
+++ b/cache.h
@@ -201,6 +201,7 @@ struct cache_entry {
 #define CE_ADDED             (1 << 19)
 
 #define CE_HASHED            (1 << 20)
+#define CE_FSMONITOR_DIRTY   (1 << 21)
 #define CE_WT_REMOVE         (1 << 22) /* remove in work directory */
 #define CE_CONFLICTED        (1 << 23)
 
@@ -324,6 +325,7 @@ static inline unsigned int canon_mode(unsigned int mode)
 #define CACHE_TREE_CHANGED	(1 << 5)
 #define SPLIT_INDEX_ORDERED	(1 << 6)
 #define UNTRACKED_CHANGED	(1 << 7)
+#define FSMONITOR_CHANGED	(1 << 8)
 
 struct split_index;
 struct untracked_cache;
@@ -342,6 +344,8 @@ struct index_state {
 	struct hashmap dir_hash;
 	unsigned char sha1[20];
 	struct untracked_cache *untracked;
+	uint64_t fsmonitor_last_update;
+	struct ewah_bitmap *fsmonitor_dirty;
 };
 
 extern struct index_state the_index;
@@ -767,6 +771,7 @@ extern int precomposed_unicode;
 extern int protect_hfs;
 extern int protect_ntfs;
 extern int git_db_env, git_index_env, git_graft_env, git_common_dir_env;
+extern int core_fsmonitor;
 
 /*
  * Include broken refs in all ref iterations, which will
diff --git a/config.c b/config.c
index 146cb3452a..c70f667640 100644
--- a/config.c
+++ b/config.c
@@ -1245,6 +1245,10 @@ static int git_default_core_config(const char *var, const char *value)
 			hide_dotfiles = HIDE_DOTFILES_DOTGITONLY;
 		else
 			hide_dotfiles = git_config_bool(var, value);
+	}
+
+	if (!strcmp(var, "core.fsmonitor")) {
+		core_fsmonitor = git_config_bool(var, value);
 		return 0;
 	}
 
diff --git a/dir.c b/dir.c
index 5f531d9eed..1f4a43dc75 100644
--- a/dir.c
+++ b/dir.c
@@ -17,6 +17,7 @@
 #include "utf8.h"
 #include "varint.h"
 #include "ewah/ewok.h"
+#include "fsmonitor.h"
 
 /*
  * Tells read_directory_recursive how a file or directory should be treated.
@@ -1680,17 +1681,23 @@ static int valid_cached_dir(struct dir_struct *dir,
 	if (!untracked)
 		return 0;
 
-	if (stat(path->len ? path->buf : ".", &st)) {
-		invalidate_directory(dir->untracked, untracked);
-		memset(&untracked->stat_data, 0, sizeof(untracked->stat_data));
-		return 0;
-	}
-	if (!untracked->valid ||
-	    match_stat_data_racy(istate, &untracked->stat_data, &st)) {
-		if (untracked->valid)
+	/*
+	 * With fsmonitor, we can trust the untracked cache's valid field.
+	 */
+	refresh_by_fsmonitor(istate);
+	if (!(dir->untracked->use_fsmonitor && untracked->valid)) {
+		if (stat(path->len ? path->buf : ".", &st)) {
 			invalidate_directory(dir->untracked, untracked);
-		fill_stat_data(&untracked->stat_data, &st);
-		return 0;
+			memset(&untracked->stat_data, 0, sizeof(untracked->stat_data));
+			return 0;
+		}
+		if (!untracked->valid ||
+		    match_stat_data_racy(istate, &untracked->stat_data, &st)) {
+			if (untracked->valid)
+				invalidate_directory(dir->untracked, untracked);
+			fill_stat_data(&untracked->stat_data, &st);
+			return 0;
+		}
 	}
 
 	if (untracked->check_only != !!check_only) {
diff --git a/dir.h b/dir.h
index 75655ebe35..1a67cb4324 100644
--- a/dir.h
+++ b/dir.h
@@ -139,6 +139,8 @@ struct untracked_cache {
 	int gitignore_invalidated;
 	int dir_invalidated;
 	int dir_opened;
+	/* fsmonitor invalidation data */
+	unsigned int use_fsmonitor : 1;
 };
 
 struct dir_struct {
diff --git a/entry.c b/entry.c
index d6b263f78e..667ca5734b 100644
--- a/entry.c
+++ b/entry.c
@@ -222,6 +222,7 @@ static int write_entry(struct cache_entry *ce,
 			lstat(ce->name, &st);
 		fill_stat_cache_info(ce, &st);
 		ce->ce_flags |= CE_UPDATE_IN_BASE;
+		ce->ce_flags |= CE_FSMONITOR_DIRTY;
 		state->istate->cache_changed |= CE_ENTRY_CHANGED;
 	}
 	return 0;
diff --git a/environment.c b/environment.c
index aa478e71de..ea63ee4604 100644
--- a/environment.c
+++ b/environment.c
@@ -64,6 +64,7 @@ int precomposed_unicode = -1; /* see probe_utf8_pathname_composition() */
 unsigned long pack_size_limit_cfg;
 enum hide_dotfiles_type hide_dotfiles = HIDE_DOTFILES_DOTGITONLY;
 enum log_refs_config log_all_ref_updates = LOG_REFS_UNSET;
+int core_fsmonitor;
 
 #ifndef PROTECT_HFS_DEFAULT
 #define PROTECT_HFS_DEFAULT 0
diff --git a/fsmonitor.c b/fsmonitor.c
new file mode 100644
index 0000000000..c1f89abccd
--- /dev/null
+++ b/fsmonitor.c
@@ -0,0 +1,261 @@
+#include "cache.h"
+#include "dir.h"
+#include "ewah/ewok.h"
+#include "run-command.h"
+#include "strbuf.h"
+#include "fsmonitor.h"
+
+#define INDEX_EXTENSION_VERSION	1
+#define HOOK_INTERFACE_VERSION		1
+
+int read_fsmonitor_extension(struct index_state *istate, const void *data,
+	unsigned long sz)
+{
+	const char *index = data;
+	uint32_t hdr_version;
+	uint32_t ewah_size;
+	int ret;
+
+	if (sz < sizeof(uint32_t) + sizeof(uint64_t) + sizeof(uint32_t))
+		return error("corrupt fsmonitor extension (too short)");
+
+	hdr_version = get_be32(index);
+	index += sizeof(uint32_t);
+	if (hdr_version != INDEX_EXTENSION_VERSION)
+		return error("bad fsmonitor version %d", hdr_version);
+
+	istate->fsmonitor_last_update = get_be64(index);
+	index += sizeof(uint64_t);
+
+	ewah_size = get_be32(index);
+	index += sizeof(uint32_t);
+
+	istate->fsmonitor_dirty = ewah_new();
+	ret = ewah_read_mmap(istate->fsmonitor_dirty, index, ewah_size);
+	if (ret != ewah_size) {
+		ewah_free(istate->fsmonitor_dirty);
+		istate->fsmonitor_dirty = NULL;
+		return error("failed to parse ewah bitmap reading fsmonitor index extension");
+	}
+
+	return 0;
+}
+
+void write_fsmonitor_extension(struct strbuf *sb, struct index_state *istate)
+{
+	uint32_t hdr_version;
+	uint64_t tm;
+	struct ewah_bitmap *bitmap;
+	int i;
+	uint32_t ewah_start;
+	uint32_t ewah_size = 0;
+	int fixup = 0;
+
+	hdr_version = htonl(INDEX_EXTENSION_VERSION);
+	strbuf_add(sb, &hdr_version, sizeof(uint32_t));
+
+	tm = htonll((uint64_t)istate->fsmonitor_last_update);
+	strbuf_add(sb, &tm, sizeof(uint64_t));
+	fixup = sb->len;
+	strbuf_add(sb, &ewah_size, sizeof(uint32_t)); /* we'll fix this up later */
+
+	ewah_start = sb->len;
+	bitmap = ewah_new();
+	for (i = 0; i < istate->cache_nr; i++)
+		if (istate->cache[i]->ce_flags & CE_FSMONITOR_DIRTY)
+			ewah_set(bitmap, i);
+	ewah_serialize_strbuf(bitmap, sb);
+	ewah_free(bitmap);
+
+	/* fix up size field */
+	ewah_size = htonl(sb->len - ewah_start);
+	memcpy(sb->buf + fixup, &ewah_size, sizeof(uint32_t));
+}
+
+static struct untracked_cache_dir *find_untracked_cache_dir(
+	struct untracked_cache *uc, struct untracked_cache_dir *ucd,
+	const char *name)
+{
+	const char *end;
+	struct untracked_cache_dir *dir = ucd;
+
+	if (!*name)
+		return dir;
+
+	end = strchr(name, '/');
+	if (end) {
+		dir = lookup_untracked(uc, ucd, name, end - name);
+		if (dir)
+			return find_untracked_cache_dir(uc, dir, end + 1);
+	}
+
+	return dir;
+}
+
+/* This function will be passed to ewah_each_bit() */
+static void mark_fsmonitor_dirty(size_t pos, void *is)
+{
+	struct index_state *istate = is;
+	struct untracked_cache_dir *dir;
+	struct cache_entry *ce = istate->cache[pos];
+
+	assert(pos < istate->cache_nr);
+	ce->ce_flags |= CE_FSMONITOR_DIRTY;
+
+	if (!istate->untracked || !istate->untracked->root)
+		return;
+
+	dir = find_untracked_cache_dir(istate->untracked, istate->untracked->root, ce->name);
+	if (dir)
+		dir->valid = 0;
+}
+
+void tweak_fsmonitor_extension(struct index_state *istate)
+{
+	int val, fsmonitor = 0;
+
+	if (!git_config_get_maybe_bool("core.fsmonitor", &val))
+		fsmonitor = val;
+
+	if (fsmonitor) {
+		if (!istate->fsmonitor_last_update)
+			istate->cache_changed |= FSMONITOR_CHANGED;
+		if (istate->fsmonitor_dirty)
+			ewah_each_bit(istate->fsmonitor_dirty, mark_fsmonitor_dirty, istate);
+	} else {
+		if (istate->fsmonitor_last_update)
+			istate->cache_changed |= FSMONITOR_CHANGED;
+		istate->fsmonitor_last_update = 0;
+	}
+
+	if (istate->fsmonitor_dirty) {
+		ewah_free(istate->fsmonitor_dirty);
+		istate->fsmonitor_dirty = NULL;
+	}
+}
+
+/*
+ * Call the query-fsmonitor hook passing the time of the last saved results.
+ */
+static int query_fsmonitor(int version, uint64_t last_update, struct strbuf *query_result)
+{
+	struct child_process cp = CHILD_PROCESS_INIT;
+	char ver[64];
+	char date[64];
+	const char *argv[4];
+
+	if (!(argv[0] = find_hook("query-fsmonitor")))
+		return -1;
+
+	snprintf(ver, sizeof(version), "%d", version);
+	snprintf(date, sizeof(date), "%" PRIuMAX, (uintmax_t)last_update);
+	argv[1] = ver;
+	argv[2] = date;
+	argv[3] = NULL;
+	cp.argv = argv;
+	cp.out = -1;
+
+	return capture_command(&cp, query_result, 1024);
+}
+
+static void mark_file_dirty(struct index_state *istate, const char *name)
+{
+	struct untracked_cache_dir *dir;
+	int pos;
+
+	/* find it in the index and mark that entry as dirty */
+	pos = index_name_pos(istate, name, strlen(name));
+	if (pos >= 0) {
+		if (!(istate->cache[pos]->ce_flags & CE_FSMONITOR_DIRTY)) {
+			istate->cache[pos]->ce_flags |= CE_FSMONITOR_DIRTY;
+			istate->cache_changed |= FSMONITOR_CHANGED;
+		}
+	}
+
+	/*
+	 * Find the corresponding directory in the untracked cache
+	 * and mark it as invalid
+	 */
+	if (!istate->untracked || !istate->untracked->root)
+		return;
+
+	dir = find_untracked_cache_dir(istate->untracked, istate->untracked->root, name);
+	if (dir) {
+		if (dir->valid) {
+			dir->valid = 0;
+			istate->cache_changed |= FSMONITOR_CHANGED;
+		}
+	}
+}
+
+void refresh_by_fsmonitor(struct index_state *istate)
+{
+	static int has_run_once = 0;
+	struct strbuf query_result = STRBUF_INIT;
+	int query_success = 0;
+	size_t bol = 0; /* beginning of line */
+	uint64_t last_update;
+	char *buf, *entry;
+	int i;
+
+	if (!core_fsmonitor || has_run_once)
+		return;
+	has_run_once = 1;
+
+	/*
+	 * This could be racy so save the date/time now and the hook
+	 * should be inclusive to ensure we don't miss potential changes.
+	 */
+	last_update = getnanotime();
+
+	/*
+	 * If we have a last update time, call query-monitor for the set of
+	 * changes since that time.
+	 */
+	if (istate->fsmonitor_last_update) {
+		query_success = !query_fsmonitor(HOOK_INTERFACE_VERSION,
+			istate->fsmonitor_last_update, &query_result);
+		trace_performance_since(last_update, "query-fsmonitor");
+	}
+
+	if (query_success) {
+		/* Mark all entries returned by the monitor as dirty */
+		buf = entry = query_result.buf;
+		for (i = 0; i < query_result.len; i++) {
+			if (buf[i] != '\0')
+				continue;
+			mark_file_dirty(istate, buf + bol);
+			bol = i + 1;
+		}
+		if (bol < query_result.len)
+			mark_file_dirty(istate, buf + bol);
+
+		/* Mark all clean entries up-to-date */
+		for (i = 0; i < istate->cache_nr; i++) {
+			struct cache_entry *ce = istate->cache[i];
+			if (ce_stage(ce) || (ce->ce_flags & CE_FSMONITOR_DIRTY))
+				continue;
+			ce_mark_uptodate(ce);
+		}
+
+		/*
+		 * Now that we've marked the invalid entries in the
+		 * untracked-cache itself, we can mark the untracked cache for
+		 * fsmonitor usage.
+		 */
+		if (istate->untracked)
+			istate->untracked->use_fsmonitor = 1;
+	} else {
+		/* if we can't update the cache, fall back to checking them all */
+		for (i = 0; i < istate->cache_nr; i++)
+			istate->cache[i]->ce_flags |= CE_FSMONITOR_DIRTY;
+
+		/* mark the untracked cache as unusable for fsmonitor */
+		if (istate->untracked)
+			istate->untracked->use_fsmonitor = 0;
+	}
+	strbuf_release(&query_result);
+
+	/* Now that we've updated istate, save the last_update time */
+	istate->fsmonitor_last_update = last_update;
+}
diff --git a/fsmonitor.h b/fsmonitor.h
new file mode 100644
index 0000000000..9c1e2b480f
--- /dev/null
+++ b/fsmonitor.h
@@ -0,0 +1,9 @@
+#ifndef FSMONITOR_H
+#define FSMONITOR_H
+
+int read_fsmonitor_extension(struct index_state *istate, const void *data, unsigned long sz);
+void write_fsmonitor_extension(struct strbuf *sb, struct index_state *istate);
+void tweak_fsmonitor_extension(struct index_state *istate);
+void refresh_by_fsmonitor(struct index_state *istate);
+
+#endif
diff --git a/read-cache.c b/read-cache.c
index bc156a133e..68a1580c96 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -18,6 +18,7 @@
 #include "varint.h"
 #include "split-index.h"
 #include "utf8.h"
+#include "fsmonitor.h"
 
 /* Mask for the name length in ce_flags in the on-disk index */
 
@@ -37,11 +38,12 @@
 #define CACHE_EXT_RESOLVE_UNDO 0x52455543 /* "REUC" */
 #define CACHE_EXT_LINK 0x6c696e6b	  /* "link" */
 #define CACHE_EXT_UNTRACKED 0x554E5452	  /* "UNTR" */
+#define CACHE_EXT_FSMONITOR 0x46534D4E	  /* "FSMN" */
 
 /* changes that can be kept in $GIT_DIR/index (basically all extensions) */
 #define EXTMASK (RESOLVE_UNDO_CHANGED | CACHE_TREE_CHANGED | \
 		 CE_ENTRY_ADDED | CE_ENTRY_REMOVED | CE_ENTRY_CHANGED | \
-		 SPLIT_INDEX_ORDERED | UNTRACKED_CHANGED)
+		 SPLIT_INDEX_ORDERED | UNTRACKED_CHANGED | FSMONITOR_CHANGED)
 
 struct index_state the_index;
 static const char *alternate_index_output;
@@ -61,6 +63,7 @@ static void replace_index_entry(struct index_state *istate, int nr, struct cache
 	free(old);
 	set_index_entry(istate, nr, ce);
 	ce->ce_flags |= CE_UPDATE_IN_BASE;
+	ce->ce_flags |= CE_FSMONITOR_DIRTY;
 	istate->cache_changed |= CE_ENTRY_CHANGED;
 }
 
@@ -777,6 +780,7 @@ int chmod_index_entry(struct index_state *istate, struct cache_entry *ce,
 	}
 	cache_tree_invalidate_path(istate, ce->name);
 	ce->ce_flags |= CE_UPDATE_IN_BASE;
+	ce->ce_flags |= CE_FSMONITOR_DIRTY;
 	istate->cache_changed |= CE_ENTRY_CHANGED;
 
 	return 0;
@@ -1344,6 +1348,8 @@ int refresh_index(struct index_state *istate, unsigned int flags,
 	const char *added_fmt;
 	const char *unmerged_fmt;
 
+	refresh_by_fsmonitor(istate);
+
 	modified_fmt = (in_porcelain ? "M\t%s\n" : "%s: needs update\n");
 	deleted_fmt = (in_porcelain ? "D\t%s\n" : "%s: needs update\n");
 	typechange_fmt = (in_porcelain ? "T\t%s\n" : "%s needs update\n");
@@ -1380,8 +1386,11 @@ int refresh_index(struct index_state *istate, unsigned int flags,
 			continue;
 
 		new = refresh_cache_ent(istate, ce, options, &cache_errno, &changed);
-		if (new == ce)
+		if (new == ce) {
+			ce->ce_flags &= ~CE_FSMONITOR_DIRTY;
 			continue;
+		}
+
 		if (!new) {
 			const char *fmt;
 
@@ -1391,6 +1400,7 @@ int refresh_index(struct index_state *istate, unsigned int flags,
 				 */
 				ce->ce_flags &= ~CE_VALID;
 				ce->ce_flags |= CE_UPDATE_IN_BASE;
+				ce->ce_flags |= CE_FSMONITOR_DIRTY;
 				istate->cache_changed |= CE_ENTRY_CHANGED;
 			}
 			if (quiet)
@@ -1549,6 +1559,9 @@ static int read_index_extension(struct index_state *istate,
 	case CACHE_EXT_UNTRACKED:
 		istate->untracked = read_untracked_extension(data, sz);
 		break;
+	case CACHE_EXT_FSMONITOR:
+		read_fsmonitor_extension(istate, data, sz);
+		break;
 	default:
 		if (*ext < 'A' || 'Z' < *ext)
 			return error("index uses %.4s extension, which we do not understand",
@@ -1721,6 +1734,7 @@ static void post_read_index_from(struct index_state *istate)
 	check_ce_order(istate);
 	tweak_untracked_cache(istate);
 	tweak_split_index(istate);
+	tweak_fsmonitor_extension(istate);
 }
 
 /* remember to discard_cache() before reading a different cache! */
@@ -2295,6 +2309,16 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 		if (err)
 			return -1;
 	}
+	if (!strip_extensions && istate->fsmonitor_last_update) {
+		struct strbuf sb = STRBUF_INIT;
+
+		write_fsmonitor_extension(&sb, istate);
+		err = write_index_ext_header(&c, newfd, CACHE_EXT_FSMONITOR, sb.len) < 0
+			|| ce_write(&c, newfd, sb.buf, sb.len) < 0;
+		strbuf_release(&sb);
+		if (err)
+			return -1;
+	}
 
 	if (ce_flush(&c, newfd, istate->sha1))
 		return -1;
diff --git a/unpack-trees.c b/unpack-trees.c
index d38c37e38c..0ebc505b6c 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -417,6 +417,7 @@ static int apply_sparse_checkout(struct index_state *istate,
 		ce->ce_flags &= ~CE_SKIP_WORKTREE;
 	if (was_skip_worktree != ce_skip_worktree(ce)) {
 		ce->ce_flags |= CE_UPDATE_IN_BASE;
+		ce->ce_flags |= CE_FSMONITOR_DIRTY;
 		istate->cache_changed |= CE_ENTRY_CHANGED;
 	}
 
-- 
2.13.0


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v5 4/7] fsmonitor: add test cases for fsmonitor extension
  2017-06-10 13:40 [PATCH v5 0/7] Fast git status via a file system watcher Ben Peart
                   ` (2 preceding siblings ...)
  2017-06-10 13:40 ` [PATCH v5 3/7] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
@ 2017-06-10 13:40 ` Ben Peart
  2017-06-27 16:20   ` Christian Couder
  2017-06-10 13:40 ` [PATCH v5 5/7] fsmonitor: add documentation for the " Ben Peart
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-06-10 13:40 UTC (permalink / raw)
  To: git
  Cc: gitster, benpeart, pclouds, johannes.schindelin, David.Turner,
	peff, christian.couder, avarab

Add test cases that ensure status results are correct when using the new
fsmonitor extension.  Test untracked, modified, and new files by
ensuring the results are identical to when not using the extension.

Add a test to ensure updates to the index properly mark corresponding
entries in the index extension as dirty so that the status is correct
after commands that modify the index but don't trigger changes in the
working directory.

Add a test that verifies that if the fsmonitor extension doesn't tell
git about a change, it doesn't discover it on its own.  This ensures
git is honoring the extension and that we get the performance benefits
desired.

All test hooks output a marker file that is used to ensure the hook
was actually used to generate the test results.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 t/t7519-status-fsmonitor.sh | 173 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 173 insertions(+)
 create mode 100755 t/t7519-status-fsmonitor.sh

diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh
new file mode 100755
index 0000000000..458eabe6dc
--- /dev/null
+++ b/t/t7519-status-fsmonitor.sh
@@ -0,0 +1,173 @@
+#!/bin/sh
+
+test_description='git status with file system watcher'
+
+. ./test-lib.sh
+
+clean_repo () {
+	git reset --hard HEAD &&
+	git clean -fd &&
+	rm -f marker
+}
+
+dirty_repo () {
+	: >untracked &&
+	: >dir1/untracked &&
+	: >dir2/untracked &&
+	echo 1 >modified &&
+	echo 2 >dir1/modified &&
+	echo 3 >dir2/modified &&
+	echo 4 >new &&
+	echo 5 >dir1/new &&
+	echo 6 >dir2/new &&
+	git add new &&
+	git add dir1/new &&
+	git add dir2/new
+}
+
+# The test query-fsmonitor hook proc will output a marker file we can use to
+# ensure the hook was actually used to generate the correct results.
+
+# fsmonitor works correctly with or without the untracked cache
+# but if it is available, we'll turn it on to ensure we test that
+# codepath as well.
+
+test_lazy_prereq UNTRACKED_CACHE '
+	{ git update-index --test-untracked-cache; ret=$?; } &&
+	test $ret -ne 1
+'
+
+if test_have_prereq UNTRACKED_CACHE; then
+	git config core.untrackedcache true
+else
+	git config core.untrackedcache false
+fi
+
+test_expect_success 'setup' '
+	mkdir -p .git/hooks &&
+	: >tracked &&
+	: >modified &&
+	mkdir dir1 &&
+	: >dir1/tracked &&
+	: >dir1/modified &&
+	mkdir dir2 &&
+	: >dir2/tracked &&
+	: >dir2/modified &&
+	git add . &&
+	test_tick &&
+	git commit -m initial &&
+	git config core.fsmonitor true &&
+	cat >.gitignore <<-\EOF
+	.gitignore
+	expect*
+	output*
+	marker*
+	EOF
+'
+
+# Ensure commands that call refresh_index() to move the index back in time
+# properly invalidate the fsmonitor cache
+
+test_expect_success 'refresh_index() invalidates fsmonitor cache' '
+	git status &&
+	test_path_is_missing marker &&
+	dirty_repo &&
+	write_script .git/hooks/query-fsmonitor<<-\EOF &&
+	:>marker
+	EOF
+	git add . &&
+	git commit -m "to reset" &&
+	git status &&
+	test_path_is_file marker &&
+	git reset HEAD~1 &&
+	rm -f marker &&
+	git status >output &&
+	test_path_is_file marker &&
+	git -c core.fsmonitor=false status >expect &&
+	test_i18ncmp expect output
+'
+
+# Now make sure it's actually skipping the check for modified and untracked
+# files unless it is told about them.  Note, after "git reset --hard HEAD" no
+# extensions exist other than 'TREE' so do a "git status" to get the extension
+# written before testing the results.
+
+test_expect_success "status doesn't detect unreported modifications" '
+	write_script .git/hooks/query-fsmonitor<<-\EOF &&
+	:>marker
+	EOF
+	clean_repo &&
+	git status &&
+	test_path_is_missing marker &&
+	: >untracked &&
+	echo 2 >dir1/modified &&
+	git status >output &&
+	test_path_is_file marker &&
+	test_i18ngrep ! "Changes not staged for commit:" output &&
+	test_i18ngrep ! "Untracked files:" output &&
+	write_script .git/hooks/query-fsmonitor<<-\EOF &&
+	:>marker
+	printf "untracked\0"
+	printf "dir1/modified\0"
+	EOF
+	rm -f marker &&
+	git status >output &&
+	test_path_is_file marker &&
+	test_i18ngrep "Changes not staged for commit:" output &&
+	test_i18ngrep "Untracked files:" output
+'
+
+# Status is well tested elsewhere so we'll just ensure that the results are
+# the same when using core.fsmonitor. First call after turning on the option
+# does a complete scan so we need to do two calls to ensure we test the new
+# codepath.
+
+test_expect_success 'status with core.untrackedcache false' '
+	git config core.untrackedcache false &&
+	write_script .git/hooks/query-fsmonitor<<-\EOF &&
+	if [ $1 -ne 1 ]
+	then
+		echo -e "Unsupported query-fsmonitor hook version.\n" >&2
+		exit 1;
+	fi
+	: >marker
+	printf "untracked\0"
+	printf "dir1/untracked\0"
+	printf "dir2/untracked\0"
+	printf "modified\0"
+	printf "dir1/modified\0"
+	printf "dir2/modified\0"
+	printf "new\0""
+	printf "dir1/new\0"
+	printf "dir2/new\0"
+	EOF
+	clean_repo &&
+	dirty_repo &&
+	git -c core.fsmonitor=false status >expect &&
+	clean_repo &&
+	git status &&
+	test_path_is_missing marker &&
+	dirty_repo &&
+	git status >output &&
+	test_path_is_file marker &&
+	test_i18ncmp expect output
+'
+
+if ! test_have_prereq UNTRACKED_CACHE; then
+	skip_all='This system does not support untracked cache'
+	test_done
+fi
+
+test_expect_success 'status with core.untrackedcache true' '
+	git config core.untrackedcache true &&
+	git -c core.fsmonitor=false status >expect &&
+	clean_repo &&
+	git status &&
+	test_path_is_missing marker &&
+	dirty_repo &&
+	git status >output &&
+	test_path_is_file marker &&
+	test_i18ncmp expect output
+'
+
+test_done
-- 
2.13.0


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v5 5/7] fsmonitor: add documentation for the fsmonitor extension.
  2017-06-10 13:40 [PATCH v5 0/7] Fast git status via a file system watcher Ben Peart
                   ` (3 preceding siblings ...)
  2017-06-10 13:40 ` [PATCH v5 4/7] fsmonitor: add test cases for fsmonitor extension Ben Peart
@ 2017-06-10 13:40 ` Ben Peart
  2017-06-10 13:40 ` [PATCH v5 6/7] fsmonitor: add a sample query-fsmonitor hook script for Watchman Ben Peart
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-06-10 13:40 UTC (permalink / raw)
  To: git
  Cc: gitster, benpeart, pclouds, johannes.schindelin, David.Turner,
	peff, christian.couder, avarab

This includes the core.fsmonitor setting, the query-fsmonitor hook,
and the fsmonitor index extension.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 Documentation/config.txt                 |  7 +++++++
 Documentation/githooks.txt               | 23 +++++++++++++++++++++++
 Documentation/technical/index-format.txt | 19 +++++++++++++++++++
 3 files changed, 49 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index dd4beec39d..d883e3318c 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -410,6 +410,13 @@ core.protectNTFS::
 	8.3 "short" names.
 	Defaults to `true` on Windows, and `false` elsewhere.
 
+core.fsmonitor::
+	If set to true, call the query-fsmonitor hook proc which will
+	identify all files that may have had changes since the last
+	request. This information is used to speed up operations like
+	'git commit' and 'git status' by limiting what git must scan to
+	detect changes.
+
 core.trustctime::
 	If false, the ctime differences between the index and the
 	working tree are ignored; useful when the inode change time
diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index b2514f4d44..219786b2da 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -456,6 +456,29 @@ non-zero status causes 'git send-email' to abort before sending any
 e-mails.
 
 
+[[query-fsmonitor]]
+query-fsmonitor
+~~~~~~~~~~~~~~~
+
+This hook is invoked when the configuration option core.fsmonitor is
+set and git needs to identify changed or untracked files.  It takes
+two arguments, a version (currently 1) and the time in elapsed
+nanoseconds since midnight, January 1, 1970.
+
+The hook should output to stdout the list of all files in the working
+directory that may have changed since the requested time.  The logic
+should be inclusive so that it does not miss any potential changes.
+The paths should be relative to the root of the working directory
+and be separated by a single NUL.
+
+Git will limit what files it checks for changes as well as which
+directories are checked for untracked files based on the path names
+given.
+
+The exit status determines whether git will use the data from the
+hook to limit its search.  On error, it will fall back to verifying
+all files and folders.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
index ade0b0c445..7aeeea6f94 100644
--- a/Documentation/technical/index-format.txt
+++ b/Documentation/technical/index-format.txt
@@ -295,3 +295,22 @@ The remaining data of each directory block is grouped by type:
     in the previous ewah bitmap.
 
   - One NUL.
+
+== File System Monitor cache
+
+  The file system monitor cache tracks files for which the query-fsmonitor
+  hook has told us about changes.  The signature for this extension is
+  { 'F', 'S', 'M', 'N' }.
+
+  The extension starts with
+
+  - 32-bit version number: the current supported version is 1.
+
+  - 64-bit time: the extension data reflects all changes through the given
+	time which is stored as the nanoseconds elapsed since midnight,
+	January 1, 1970.
+
+  - 32-bit bitmap size: the size of the CE_FSMONITOR_DIRTY bitmap.
+
+  - An ewah bitmap, the n-th bit indicates whether the n-th index entry
+    is CE_FSMONITOR_DIRTY.
-- 
2.13.0


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v5 6/7] fsmonitor: add a sample query-fsmonitor hook script for Watchman
  2017-06-10 13:40 [PATCH v5 0/7] Fast git status via a file system watcher Ben Peart
                   ` (4 preceding siblings ...)
  2017-06-10 13:40 ` [PATCH v5 5/7] fsmonitor: add documentation for the " Ben Peart
@ 2017-06-10 13:40 ` Ben Peart
  2017-06-10 13:40 ` [PATCH v5 7/7] fsmonitor: add a performance test Ben Peart
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-06-10 13:40 UTC (permalink / raw)
  To: git
  Cc: gitster, benpeart, pclouds, johannes.schindelin, David.Turner,
	peff, christian.couder, avarab

This hook script integrates the new fsmonitor capabilities of git with
the cross platform Watchman file watching service. To use the script:

Download and install Watchman from https://facebook.github.io/watchman/
and instruct Watchman to watch your working directory for changes
('watchman watch-project /usr/src/git').

Rename the sample integration hook from query-fsmonitor.sample to
query-fsmonitor.

Configure git to use the extension ('git config core.fsmonitor true')
and optionally turn on the untracked cache for optimal performance
('git config core.untrackedcache true').

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 templates/hooks--query-fsmonitor.sample | 76 +++++++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)
 create mode 100755 templates/hooks--query-fsmonitor.sample

diff --git a/templates/hooks--query-fsmonitor.sample b/templates/hooks--query-fsmonitor.sample
new file mode 100755
index 0000000000..8d05b87a90
--- /dev/null
+++ b/templates/hooks--query-fsmonitor.sample
@@ -0,0 +1,76 @@
+#!/bin/sh
+#
+# An example hook script to integrate Watchman
+# (https://facebook.github.io/watchman/) with git to provide fast
+# git status.
+#
+# The hook is passed a version (currently 1) and a time in nanoseconds
+# formatted as a string and outputs to stdout all files that have been
+# modified since the given time. Paths must be relative to the root of
+# the working tree and separated by a single NUL.
+#
+# To enable this hook, rename this file to "query-fsmonitor"
+
+# check the hook interface version
+if [ "$1" -eq 1 ]
+then
+	# convert nanoseconds to seconds
+	time_t=$(($2/1000000000))
+else
+	echo -e "Unsupported query-fsmonitor hook version.\nFalling back to scanning...\n" >&2
+	exit 1;
+fi
+
+# Convert unix style paths to what Watchman expects
+case "$(uname -s)" in
+MINGW*|MSYS_NT*)
+  GIT_WORK_TREE="$(cygpath -aw "$PWD" | sed 's,\\,/,g')"
+  ;;
+*)
+  GIT_WORK_TREE="$PWD"
+  ;;
+esac
+
+# In the query expression below we're asking for names of files that
+# changed since $time_t but were not transient (ie created after
+# $time_t but no longer exist).
+#
+# To accomplish this, we're using the "since" generator to use the
+# recency index to select candidate nodes and "fields" to limit the
+# output to file names only. Then we're using the "expression" term to
+# further constrain the results.
+#
+# The category of transient files that we want to ignore will have a
+# creation clock (cclock) newer than $time_t value and will also not
+# currently exist.
+
+echo  "[\"query\", \"$GIT_WORK_TREE\", { \
+	\"since\": $time_t, \
+	\"fields\": [\"name\"], \
+	\"expression\": [\"not\", [\"allof\", [\"since\", $time_t, \"cclock\"], [\"not\", \"exists\"]]] \
+	}]" | \
+	watchman -j |
+	perl -0666 -e '
+		use strict;
+		use warnings;
+
+		my $stdin = <>;
+		die "Watchman: command returned no output.\nFalling back to scanning...\n" if $stdin eq "";
+		die "Watchman: command returned invalid output: $stdin\nFalling back to scanning...\n" unless $stdin =~ /^\{/;
+
+		my $json_pkg;
+		eval {
+			require JSON::XS;
+			$json_pkg = "JSON::XS";
+			1;
+		} or do {
+			require JSON::PP;
+			$json_pkg = "JSON::PP";
+		};
+
+		my $o = $json_pkg->new->utf8->decode($stdin);
+		die "Watchman: $o->{error}.\nFalling back to scanning...\n" if $o->{error};
+
+		local $, = "\0";
+		print @{$o->{files}};
+	'
-- 
2.13.0


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v5 7/7] fsmonitor: add a performance test
  2017-06-10 13:40 [PATCH v5 0/7] Fast git status via a file system watcher Ben Peart
                   ` (5 preceding siblings ...)
  2017-06-10 13:40 ` [PATCH v5 6/7] fsmonitor: add a sample query-fsmonitor hook script for Watchman Ben Peart
@ 2017-06-10 13:40 ` Ben Peart
  2017-06-10 14:04   ` Ben Peart
  2017-06-12 22:04   ` Junio C Hamano
  2017-06-28  5:11 ` [PATCH v5 0/7] Fast git status via a file system watcher Christian Couder
  2017-09-15 19:20 ` [PATCH v6 00/12] " Ben Peart
  8 siblings, 2 replies; 137+ messages in thread
From: Ben Peart @ 2017-06-10 13:40 UTC (permalink / raw)
  To: git
  Cc: gitster, benpeart, pclouds, johannes.schindelin, David.Turner,
	peff, christian.couder, avarab

Add a test utility (test-drop-caches) that enables dropping the file
system cache on Windows.

Add a perf test (p7519-fsmonitor.sh) for fsmonitor.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Makefile                    |   1 +
 t/helper/test-drop-caches.c | 107 +++++++++++++++++++++++++++++
 t/perf/p7519-fsmonitor.sh   | 161 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 269 insertions(+)
 create mode 100644 t/helper/test-drop-caches.c
 create mode 100755 t/perf/p7519-fsmonitor.sh

diff --git a/Makefile b/Makefile
index 992dd58801..893947839f 100644
--- a/Makefile
+++ b/Makefile
@@ -648,6 +648,7 @@ TEST_PROGRAMS_NEED_X += test-subprocess
 TEST_PROGRAMS_NEED_X += test-svn-fe
 TEST_PROGRAMS_NEED_X += test-urlmatch-normalization
 TEST_PROGRAMS_NEED_X += test-wildmatch
+TEST_PROGRAMS_NEED_X += test-drop-caches
 
 TEST_PROGRAMS = $(patsubst %,t/helper/%$X,$(TEST_PROGRAMS_NEED_X))
 
diff --git a/t/helper/test-drop-caches.c b/t/helper/test-drop-caches.c
new file mode 100644
index 0000000000..80830d920b
--- /dev/null
+++ b/t/helper/test-drop-caches.c
@@ -0,0 +1,107 @@
+#include "git-compat-util.h"
+#include <stdio.h>
+
+typedef DWORD NTSTATUS;
+
+#ifdef GIT_WINDOWS_NATIVE
+#include <tchar.h>
+
+#define STATUS_SUCCESS			(0x00000000L)
+#define STATUS_PRIVILEGE_NOT_HELD	(0xC0000061L)
+
+typedef enum _SYSTEM_INFORMATION_CLASS {
+	SystemMemoryListInformation = 80, // 80, q: SYSTEM_MEMORY_LIST_INFORMATION; s: SYSTEM_MEMORY_LIST_COMMAND (requires SeProfileSingleProcessPrivilege)
+} SYSTEM_INFORMATION_CLASS;
+
+// private
+typedef enum _SYSTEM_MEMORY_LIST_COMMAND
+{
+	MemoryCaptureAccessedBits,
+	MemoryCaptureAndResetAccessedBits,
+	MemoryEmptyWorkingSets,
+	MemoryFlushModifiedList,
+	MemoryPurgeStandbyList,
+	MemoryPurgeLowPriorityStandbyList,
+	MemoryCommandMax
+} SYSTEM_MEMORY_LIST_COMMAND;
+
+BOOL GetPrivilege(HANDLE TokenHandle, LPCSTR lpName, int flags)
+{
+	BOOL bResult;
+	DWORD dwBufferLength;
+	LUID luid;
+	TOKEN_PRIVILEGES tpPreviousState;
+	TOKEN_PRIVILEGES tpNewState;
+
+	dwBufferLength = 16;
+	bResult = LookupPrivilegeValueA(0, lpName, &luid);
+	if (bResult)
+	{
+		tpNewState.PrivilegeCount = 1;
+		tpNewState.Privileges[0].Luid = luid;
+		tpNewState.Privileges[0].Attributes = 0;
+		bResult = AdjustTokenPrivileges(TokenHandle, 0, &tpNewState, (DWORD)((LPBYTE)&(tpNewState.Privileges[1]) - (LPBYTE)&tpNewState), &tpPreviousState, &dwBufferLength);
+		if (bResult)
+		{
+			tpPreviousState.PrivilegeCount = 1;
+			tpPreviousState.Privileges[0].Luid = luid;
+			tpPreviousState.Privileges[0].Attributes = flags != 0 ? 2 : 0;
+			bResult = AdjustTokenPrivileges(TokenHandle, 0, &tpPreviousState, dwBufferLength, 0, 0);
+		}
+	}
+	return bResult;
+}
+#endif
+
+int cmd_main(int argc, const char **argv)
+{
+	NTSTATUS status = 1;
+#ifdef GIT_WINDOWS_NATIVE
+	HANDLE hProcess = GetCurrentProcess();
+	HANDLE hToken;
+	if (!OpenProcessToken(hProcess, TOKEN_QUERY | TOKEN_ADJUST_PRIVILEGES, &hToken))
+	{
+		_ftprintf(stderr, _T("Can't open current process token\n"));
+		return 1;
+	}
+
+	if (!GetPrivilege(hToken, "SeProfileSingleProcessPrivilege", 1))
+	{
+		_ftprintf(stderr, _T("Can't get SeProfileSingleProcessPrivilege\n"));
+		return 1;
+	}
+
+	CloseHandle(hToken);
+
+	HMODULE ntdll = LoadLibrary(_T("ntdll.dll"));
+	if (!ntdll)
+	{
+		_ftprintf(stderr, _T("Can't load ntdll.dll, wrong Windows version?\n"));
+		return 1;
+	}
+
+	NTSTATUS(WINAPI *NtSetSystemInformation)(INT, PVOID, ULONG) = (NTSTATUS(WINAPI *)(INT, PVOID, ULONG))GetProcAddress(ntdll, "NtSetSystemInformation");
+	if (!NtSetSystemInformation)
+	{
+		_ftprintf(stderr, _T("Can't get function addresses, wrong Windows version?\n"));
+		return 1;
+	}
+
+	SYSTEM_MEMORY_LIST_COMMAND command = MemoryPurgeStandbyList;
+	status = NtSetSystemInformation(
+		SystemMemoryListInformation,
+		&command,
+		sizeof(SYSTEM_MEMORY_LIST_COMMAND)
+	);
+	if (status == STATUS_PRIVILEGE_NOT_HELD)
+	{
+		_ftprintf(stderr, _T("Insufficient privileges to execute the memory list command"));
+	}
+	else if (status != STATUS_SUCCESS)
+	{
+		_ftprintf(stderr, _T("Unable to execute the memory list command %lX"), status);
+	}
+#endif
+
+	return status;
+}
diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
new file mode 100755
index 0000000000..e41905cb9b
--- /dev/null
+++ b/t/perf/p7519-fsmonitor.sh
@@ -0,0 +1,161 @@
+#!/bin/sh
+
+test_description="Test core.fsmonitor"
+
+. ./perf-lib.sh
+
+# This has to be run with GIT_PERF_REPEAT_COUNT=1 to generate valid results.
+# Otherwise the caching that happens for the nth run will negate the validity
+# of the comparisons.
+if [ "$GIT_PERF_REPEAT_COUNT" -ne 1 ]
+then
+	echo "warning: This test must be run with GIT_PERF_REPEAT_COUNT=1 to generate valid results." >&2
+	echo "warning: Setting GIT_PERF_REPEAT_COUNT=1" >&2
+	GIT_PERF_REPEAT_COUNT=1
+fi
+
+test_perf_large_repo
+test_checkout_worktree
+
+# Convert unix style paths to what Watchman expects
+case "$(uname -s)" in
+MINGW*|MSYS_NT*)
+  GIT_WORK_TREE="$(cygpath -aw "$PWD" | sed 's,\\,/,g')"
+  ;;
+*)
+  GIT_WORK_TREE="$PWD"
+  ;;
+esac
+
+# The big win for using fsmonitor is the elimination of the need to scan
+# the working directory looking for changed files and untracked files. If
+# the file information is all cached in RAM, the benefits are reduced.
+
+flush_disk_cache () {
+	case "$(uname -s)" in
+	MINGW*|MSYS_NT*)
+	  sync && test-drop-caches
+	  ;;
+	*)
+	  sudo sync && echo 3 | sudo tee /proc/sys/vm/drop_caches
+	  ;;
+	esac
+
+}
+
+test_lazy_prereq UNTRACKED_CACHE '
+	{ git update-index --test-untracked-cache; ret=$?; } &&
+	test $ret -ne 1
+'
+
+test_expect_success "setup" '
+	# Maybe set untrackedCache & splitIndex depending on the environment
+	if test -n "$GIT_PERF_7519_UNTRACKED_CACHE"
+	then
+		git config core.untrackedCache "$GIT_PERF_7519_UNTRACKED_CACHE"
+	else
+		if test_have_prereq UNTRACKED_CACHE
+		then
+			git config core.untrackedCache true
+		else
+			git config core.untrackedCache false
+		fi
+	fi &&
+
+	if test -n "$GIT_PERF_7519_SPLIT_INDEX"
+	then
+		git config core.splitIndex "$GIT_PERF_7519_SPLIT_INDEX"
+	fi &&
+
+	# Hook scaffolding
+	mkdir .git/hooks &&
+	cp ../../../templates/hooks--query-fsmonitor.sample .git/hooks/query-fsmonitor &&
+
+	# have Watchman monitor the test folder
+	watchman watch "$GIT_WORK_TREE" &&
+	watchman watch-list | grep -q -F "$GIT_WORK_TREE"
+'
+
+# Worst case without fsmonitor
+test_expect_success "clear fs cache" '
+	git config core.fsmonitor false &&
+	flush_disk_cache
+'
+test_perf "status (fsmonitor=false, cold fs cache)" '
+	git status
+'
+
+# Best case without fsmonitor
+test_perf "status (fsmonitor=false, warm fs cache)" '
+	git status
+'
+
+# Let's see if -uno & -uall make any difference
+test_expect_success "clear fs cache" '
+	flush_disk_cache
+'
+test_perf "status -uno (fsmonitor=false, cold fs cache)" '
+	git status -uno
+'
+
+test_expect_success "clear fs cache" '
+	flush_disk_cache
+'
+test_perf "status -uall (fsmonitor=false, cold fs cache)" '
+	git status -uall
+'
+
+# The first run with core.fsmonitor=true has to do a normal scan and write
+# out the index extension.
+test_expect_success "populate extension" '
+	# core.preloadIndex defeats the benefits of core.fsMonitor as it
+	# calls lstat for the index entries. Turn it off as _not_ doing
+	# the work is faster than doing the work across multiple threads.
+	git config core.fsmonitor true &&
+	git config core.preloadIndex false &&
+	git status
+'
+
+# Worst case with fsmonitor
+test_expect_success "shutdown fsmonitor, clear fs cache" '
+	watchman shutdown-server &&
+	flush_disk_cache
+'
+test_perf "status (fsmonitor=true, cold fs cache, cold fsmonitor)" '
+	git status
+'
+
+# Best case with fsmonitor
+test_perf "status (fsmonitor=true, warm fs cache, warm fsmonitor)" '
+	git status
+'
+
+# Best improved with fsmonitor (compare to worst case without fsmonitor)
+test_expect_success "clear fs cache" '
+	flush_disk_cache
+'
+test_perf "status (fsmonitor=true, cold fs cache, warm fsmonitor)" '
+	git status
+'
+
+# Let's see if -uno & -uall make any difference
+test_expect_success "clear fs cache" '
+	flush_disk_cache
+'
+test_perf "status -uno (fsmonitor=true, cold fs cache)" '
+	git status -uno
+'
+
+test_expect_success "clear fs cache" '
+	flush_disk_cache
+'
+test_perf "status -uall (fsmonitor=true, cold fs cache)" '
+	git status -uall
+'
+
+test_expect_success "cleanup" '
+	watchman watch-del "$GIT_WORK_TREE" &&
+	watchman shutdown-server
+'
+
+test_done
-- 
2.13.0


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* Re: [PATCH v5 7/7] fsmonitor: add a performance test
  2017-06-10 13:40 ` [PATCH v5 7/7] fsmonitor: add a performance test Ben Peart
@ 2017-06-10 14:04   ` Ben Peart
  2017-06-12 22:04   ` Junio C Hamano
  1 sibling, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-06-10 14:04 UTC (permalink / raw)
  To: git
  Cc: gitster, benpeart, pclouds, johannes.schindelin, David.Turner,
	peff, christian.couder, avarab

Here are some perf test results for repos of various size generated with 
many-files.sh.

Comparing cold fs cache times on a fast SSD running Windows:

# files		preloadindex	fsmonitor	reduction
=========================================================
10,000		0.69		0.46		33%
100,000		3.75		0.7		81%
1,000,000	35.07		3.24		91%



10,000 files
GIT_PERF_REPEAT_COUNT=1 
GIT_PERF_LARGE_REPO=/c/Repos/gen-many-files-3.10.9.git/ ./run 
p7519-fsmonitor.sh
Test                                                              this tree
---------------------------------------------------------------------------------
7519.3: status (fsmonitor=false, cold fs cache) 
0.69(0.03+0.06)
7519.4: status (fsmonitor=false, warm fs cache) 
0.45(0.01+0.06)
7519.6: status -uno (fsmonitor=false, cold fs cache) 
0.53(0.03+0.04)
7519.8: status -uall (fsmonitor=false, cold fs cache) 
0.57(0.04+0.01)
7519.11: status (fsmonitor=true, cold fs cache, cold fsmonitor) 
2.16(0.01+0.06)
7519.12: status (fsmonitor=true, warm fs cache, warm fsmonitor) 
0.36(0.01+0.07)
7519.14: status (fsmonitor=true, cold fs cache, warm fsmonitor) 
0.46(0.04+0.06)
7519.16: status -uno (fsmonitor=true, cold fs cache) 
0.46(0.03+0.04)
7519.18: status -uall (fsmonitor=true, cold fs cache) 
0.76(0.04+0.04)

100,000 files
GIT_PERF_REPEAT_COUNT=1 
GIT_PERF_LARGE_REPO=/c/Repos/gen-many-files-4.10.9.git/ ./run 
p7519-fsmonitor.sh
Test                                                              this tree
----------------------------------------------------------------------------------
7519.3: status (fsmonitor=false, cold fs cache) 
3.75(0.01+0.04)
7519.4: status (fsmonitor=false, warm fs cache) 
2.74(0.06+0.06)
7519.6: status -uno (fsmonitor=false, cold fs cache) 
2.88(0.01+0.06)
7519.8: status -uall (fsmonitor=false, cold fs cache) 
3.24(0.00+0.06)
7519.11: status (fsmonitor=true, cold fs cache, cold fsmonitor) 
17.90(0.00+0.10)
7519.12: status (fsmonitor=true, warm fs cache, warm fsmonitor) 
0.59(0.00+0.09)
7519.14: status (fsmonitor=true, cold fs cache, warm fsmonitor) 
0.70(0.00+0.07)
7519.16: status -uno (fsmonitor=true, cold fs cache) 
0.69(0.01+0.01)
7519.18: status -uall (fsmonitor=true, cold fs cache) 
4.51(0.00+0.06)

1,000,000 files
GIT_PERF_REPEAT_COUNT=1 
GIT_PERF_LARGE_REPO=/c/Repos/gen-many-files-5.10.9.git/ ./run 
p7519-fsmonitor.sh
Test                                                              this tree
-----------------------------------------------------------------------------------
7519.3: status (fsmonitor=false, cold fs cache) 
35.07(0.01+0.03)
7519.4: status (fsmonitor=false, warm fs cache) 
26.58(0.03+0.07)
7519.6: status -uno (fsmonitor=false, cold fs cache) 
26.46(0.03+0.06)
7519.8: status -uall (fsmonitor=false, cold fs cache) 
31.55(0.01+0.03)
7519.11: status (fsmonitor=true, cold fs cache, cold fsmonitor) 
193.15(0.01+0.04)
7519.12: status (fsmonitor=true, warm fs cache, warm fsmonitor) 
3.03(0.01+0.07)
7519.14: status (fsmonitor=true, cold fs cache, warm fsmonitor) 
3.24(0.01+0.04)
7519.16: status -uno (fsmonitor=true, cold fs cache) 
2.99(0.03+0.03)
7519.18: status -uall (fsmonitor=true, cold fs cache) 
35.07(0.03+0.07)


On 6/10/2017 9:40 AM, Ben Peart wrote:
> Add a test utility (test-drop-caches) that enables dropping the file
> system cache on Windows.
> 
> Add a perf test (p7519-fsmonitor.sh) for fsmonitor.
> 
> Signed-off-by: Ben Peart <benpeart@microsoft.com>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>   Makefile                    |   1 +
>   t/helper/test-drop-caches.c | 107 +++++++++++++++++++++++++++++
>   t/perf/p7519-fsmonitor.sh   | 161 ++++++++++++++++++++++++++++++++++++++++++++
>   3 files changed, 269 insertions(+)
>   create mode 100644 t/helper/test-drop-caches.c
>   create mode 100755 t/perf/p7519-fsmonitor.sh
> 
> diff --git a/Makefile b/Makefile
> index 992dd58801..893947839f 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -648,6 +648,7 @@ TEST_PROGRAMS_NEED_X += test-subprocess
>   TEST_PROGRAMS_NEED_X += test-svn-fe
>   TEST_PROGRAMS_NEED_X += test-urlmatch-normalization
>   TEST_PROGRAMS_NEED_X += test-wildmatch
> +TEST_PROGRAMS_NEED_X += test-drop-caches
>   
>   TEST_PROGRAMS = $(patsubst %,t/helper/%$X,$(TEST_PROGRAMS_NEED_X))
>   
> diff --git a/t/helper/test-drop-caches.c b/t/helper/test-drop-caches.c
> new file mode 100644
> index 0000000000..80830d920b
> --- /dev/null
> +++ b/t/helper/test-drop-caches.c
> @@ -0,0 +1,107 @@
> +#include "git-compat-util.h"
> +#include <stdio.h>
> +
> +typedef DWORD NTSTATUS;
> +
> +#ifdef GIT_WINDOWS_NATIVE
> +#include <tchar.h>
> +
> +#define STATUS_SUCCESS			(0x00000000L)
> +#define STATUS_PRIVILEGE_NOT_HELD	(0xC0000061L)
> +
> +typedef enum _SYSTEM_INFORMATION_CLASS {
> +	SystemMemoryListInformation = 80, // 80, q: SYSTEM_MEMORY_LIST_INFORMATION; s: SYSTEM_MEMORY_LIST_COMMAND (requires SeProfileSingleProcessPrivilege)
> +} SYSTEM_INFORMATION_CLASS;
> +
> +// private
> +typedef enum _SYSTEM_MEMORY_LIST_COMMAND
> +{
> +	MemoryCaptureAccessedBits,
> +	MemoryCaptureAndResetAccessedBits,
> +	MemoryEmptyWorkingSets,
> +	MemoryFlushModifiedList,
> +	MemoryPurgeStandbyList,
> +	MemoryPurgeLowPriorityStandbyList,
> +	MemoryCommandMax
> +} SYSTEM_MEMORY_LIST_COMMAND;
> +
> +BOOL GetPrivilege(HANDLE TokenHandle, LPCSTR lpName, int flags)
> +{
> +	BOOL bResult;
> +	DWORD dwBufferLength;
> +	LUID luid;
> +	TOKEN_PRIVILEGES tpPreviousState;
> +	TOKEN_PRIVILEGES tpNewState;
> +
> +	dwBufferLength = 16;
> +	bResult = LookupPrivilegeValueA(0, lpName, &luid);
> +	if (bResult)
> +	{
> +		tpNewState.PrivilegeCount = 1;
> +		tpNewState.Privileges[0].Luid = luid;
> +		tpNewState.Privileges[0].Attributes = 0;
> +		bResult = AdjustTokenPrivileges(TokenHandle, 0, &tpNewState, (DWORD)((LPBYTE)&(tpNewState.Privileges[1]) - (LPBYTE)&tpNewState), &tpPreviousState, &dwBufferLength);
> +		if (bResult)
> +		{
> +			tpPreviousState.PrivilegeCount = 1;
> +			tpPreviousState.Privileges[0].Luid = luid;
> +			tpPreviousState.Privileges[0].Attributes = flags != 0 ? 2 : 0;
> +			bResult = AdjustTokenPrivileges(TokenHandle, 0, &tpPreviousState, dwBufferLength, 0, 0);
> +		}
> +	}
> +	return bResult;
> +}
> +#endif
> +
> +int cmd_main(int argc, const char **argv)
> +{
> +	NTSTATUS status = 1;
> +#ifdef GIT_WINDOWS_NATIVE
> +	HANDLE hProcess = GetCurrentProcess();
> +	HANDLE hToken;
> +	if (!OpenProcessToken(hProcess, TOKEN_QUERY | TOKEN_ADJUST_PRIVILEGES, &hToken))
> +	{
> +		_ftprintf(stderr, _T("Can't open current process token\n"));
> +		return 1;
> +	}
> +
> +	if (!GetPrivilege(hToken, "SeProfileSingleProcessPrivilege", 1))
> +	{
> +		_ftprintf(stderr, _T("Can't get SeProfileSingleProcessPrivilege\n"));
> +		return 1;
> +	}
> +
> +	CloseHandle(hToken);
> +
> +	HMODULE ntdll = LoadLibrary(_T("ntdll.dll"));
> +	if (!ntdll)
> +	{
> +		_ftprintf(stderr, _T("Can't load ntdll.dll, wrong Windows version?\n"));
> +		return 1;
> +	}
> +
> +	NTSTATUS(WINAPI *NtSetSystemInformation)(INT, PVOID, ULONG) = (NTSTATUS(WINAPI *)(INT, PVOID, ULONG))GetProcAddress(ntdll, "NtSetSystemInformation");
> +	if (!NtSetSystemInformation)
> +	{
> +		_ftprintf(stderr, _T("Can't get function addresses, wrong Windows version?\n"));
> +		return 1;
> +	}
> +
> +	SYSTEM_MEMORY_LIST_COMMAND command = MemoryPurgeStandbyList;
> +	status = NtSetSystemInformation(
> +		SystemMemoryListInformation,
> +		&command,
> +		sizeof(SYSTEM_MEMORY_LIST_COMMAND)
> +	);
> +	if (status == STATUS_PRIVILEGE_NOT_HELD)
> +	{
> +		_ftprintf(stderr, _T("Insufficient privileges to execute the memory list command"));
> +	}
> +	else if (status != STATUS_SUCCESS)
> +	{
> +		_ftprintf(stderr, _T("Unable to execute the memory list command %lX"), status);
> +	}
> +#endif
> +
> +	return status;
> +}
> diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
> new file mode 100755
> index 0000000000..e41905cb9b
> --- /dev/null
> +++ b/t/perf/p7519-fsmonitor.sh
> @@ -0,0 +1,161 @@
> +#!/bin/sh
> +
> +test_description="Test core.fsmonitor"
> +
> +. ./perf-lib.sh
> +
> +# This has to be run with GIT_PERF_REPEAT_COUNT=1 to generate valid results.
> +# Otherwise the caching that happens for the nth run will negate the validity
> +# of the comparisons.
> +if [ "$GIT_PERF_REPEAT_COUNT" -ne 1 ]
> +then
> +	echo "warning: This test must be run with GIT_PERF_REPEAT_COUNT=1 to generate valid results." >&2
> +	echo "warning: Setting GIT_PERF_REPEAT_COUNT=1" >&2
> +	GIT_PERF_REPEAT_COUNT=1
> +fi
> +
> +test_perf_large_repo
> +test_checkout_worktree
> +
> +# Convert unix style paths to what Watchman expects
> +case "$(uname -s)" in
> +MINGW*|MSYS_NT*)
> +  GIT_WORK_TREE="$(cygpath -aw "$PWD" | sed 's,\\,/,g')"
> +  ;;
> +*)
> +  GIT_WORK_TREE="$PWD"
> +  ;;
> +esac
> +
> +# The big win for using fsmonitor is the elimination of the need to scan
> +# the working directory looking for changed files and untracked files. If
> +# the file information is all cached in RAM, the benefits are reduced.
> +
> +flush_disk_cache () {
> +	case "$(uname -s)" in
> +	MINGW*|MSYS_NT*)
> +	  sync && test-drop-caches
> +	  ;;
> +	*)
> +	  sudo sync && echo 3 | sudo tee /proc/sys/vm/drop_caches
> +	  ;;
> +	esac
> +
> +}
> +
> +test_lazy_prereq UNTRACKED_CACHE '
> +	{ git update-index --test-untracked-cache; ret=$?; } &&
> +	test $ret -ne 1
> +'
> +
> +test_expect_success "setup" '
> +	# Maybe set untrackedCache & splitIndex depending on the environment
> +	if test -n "$GIT_PERF_7519_UNTRACKED_CACHE"
> +	then
> +		git config core.untrackedCache "$GIT_PERF_7519_UNTRACKED_CACHE"
> +	else
> +		if test_have_prereq UNTRACKED_CACHE
> +		then
> +			git config core.untrackedCache true
> +		else
> +			git config core.untrackedCache false
> +		fi
> +	fi &&
> +
> +	if test -n "$GIT_PERF_7519_SPLIT_INDEX"
> +	then
> +		git config core.splitIndex "$GIT_PERF_7519_SPLIT_INDEX"
> +	fi &&
> +
> +	# Hook scaffolding
> +	mkdir .git/hooks &&
> +	cp ../../../templates/hooks--query-fsmonitor.sample .git/hooks/query-fsmonitor &&
> +
> +	# have Watchman monitor the test folder
> +	watchman watch "$GIT_WORK_TREE" &&
> +	watchman watch-list | grep -q -F "$GIT_WORK_TREE"
> +'
> +
> +# Worst case without fsmonitor
> +test_expect_success "clear fs cache" '
> +	git config core.fsmonitor false &&
> +	flush_disk_cache
> +'
> +test_perf "status (fsmonitor=false, cold fs cache)" '
> +	git status
> +'
> +
> +# Best case without fsmonitor
> +test_perf "status (fsmonitor=false, warm fs cache)" '
> +	git status
> +'
> +
> +# Let's see if -uno & -uall make any difference
> +test_expect_success "clear fs cache" '
> +	flush_disk_cache
> +'
> +test_perf "status -uno (fsmonitor=false, cold fs cache)" '
> +	git status -uno
> +'
> +
> +test_expect_success "clear fs cache" '
> +	flush_disk_cache
> +'
> +test_perf "status -uall (fsmonitor=false, cold fs cache)" '
> +	git status -uall
> +'
> +
> +# The first run with core.fsmonitor=true has to do a normal scan and write
> +# out the index extension.
> +test_expect_success "populate extension" '
> +	# core.preloadIndex defeats the benefits of core.fsMonitor as it
> +	# calls lstat for the index entries. Turn it off as _not_ doing
> +	# the work is faster than doing the work across multiple threads.
> +	git config core.fsmonitor true &&
> +	git config core.preloadIndex false &&
> +	git status
> +'
> +
> +# Worst case with fsmonitor
> +test_expect_success "shutdown fsmonitor, clear fs cache" '
> +	watchman shutdown-server &&
> +	flush_disk_cache
> +'
> +test_perf "status (fsmonitor=true, cold fs cache, cold fsmonitor)" '
> +	git status
> +'
> +
> +# Best case with fsmonitor
> +test_perf "status (fsmonitor=true, warm fs cache, warm fsmonitor)" '
> +	git status
> +'
> +
> +# Best improved with fsmonitor (compare to worst case without fsmonitor)
> +test_expect_success "clear fs cache" '
> +	flush_disk_cache
> +'
> +test_perf "status (fsmonitor=true, cold fs cache, warm fsmonitor)" '
> +	git status
> +'
> +
> +# Let's see if -uno & -uall make any difference
> +test_expect_success "clear fs cache" '
> +	flush_disk_cache
> +'
> +test_perf "status -uno (fsmonitor=true, cold fs cache)" '
> +	git status -uno
> +'
> +
> +test_expect_success "clear fs cache" '
> +	flush_disk_cache
> +'
> +test_perf "status -uall (fsmonitor=true, cold fs cache)" '
> +	git status -uall
> +'
> +
> +test_expect_success "cleanup" '
> +	watchman watch-del "$GIT_WORK_TREE" &&
> +	watchman shutdown-server
> +'
> +
> +test_done
> 

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v5 7/7] fsmonitor: add a performance test
  2017-06-10 13:40 ` [PATCH v5 7/7] fsmonitor: add a performance test Ben Peart
  2017-06-10 14:04   ` Ben Peart
@ 2017-06-12 22:04   ` Junio C Hamano
  2017-06-14 14:12     ` Ben Peart
  1 sibling, 1 reply; 137+ messages in thread
From: Junio C Hamano @ 2017-06-12 22:04 UTC (permalink / raw)
  To: Ben Peart
  Cc: git, benpeart, pclouds, johannes.schindelin, David.Turner, peff,
	christian.couder, avarab

Ben Peart <peartben@gmail.com> writes:

> diff --git a/t/helper/test-drop-caches.c b/t/helper/test-drop-caches.c
> new file mode 100644
> index 0000000000..80830d920b
> --- /dev/null
> +++ b/t/helper/test-drop-caches.c
> @@ -0,0 +1,107 @@
> +#include "git-compat-util.h"
> +#include <stdio.h>

I thought compat-util should already include stdio?

> +
> +typedef DWORD NTSTATUS;

Is this safe to have outside "#ifdef GIT_WINDOWS_NATIVE"?

> +
> +#ifdef GIT_WINDOWS_NATIVE
> +#include <tchar.h>
> +
> +#define STATUS_SUCCESS			(0x00000000L)
> +#define STATUS_PRIVILEGE_NOT_HELD	(0xC0000061L)
> +
> +typedef enum _SYSTEM_INFORMATION_CLASS {
> +	SystemMemoryListInformation = 80, // 80, q: SYSTEM_MEMORY_LIST_INFORMATION; s: SYSTEM_MEMORY_LIST_COMMAND (requires SeProfileSingleProcessPrivilege)

I would have said "Please avoid // comment in this codebase unless
we know we only use the compilers that grok it".  This particular
one may be OK, as it is inside GIT_WINDOWS_NATIVE and I assume
everybody on that platform uses recent GCC (or VStudio groks it I
guess)?

> +} SYSTEM_INFORMATION_CLASS;
> +
> +// private
> +typedef enum _SYSTEM_MEMORY_LIST_COMMAND
> +{

Style: '{' comes at the end of the previous line, with a single SP
immediately before it, unless it is the beginning of the function body.

What you did for _SYSTEM_INFORMATION_CLASS above is correct.

> +	MemoryCaptureAccessedBits,
> +	MemoryCaptureAndResetAccessedBits,
> +	MemoryEmptyWorkingSets,
> +	MemoryFlushModifiedList,
> +	MemoryPurgeStandbyList,
> +	MemoryPurgeLowPriorityStandbyList,
> +	MemoryCommandMax

Style: avoid CamelCase.

> +} SYSTEM_MEMORY_LIST_COMMAND;
> +
> +BOOL GetPrivilege(HANDLE TokenHandle, LPCSTR lpName, int flags)
> +{
> +	BOOL bResult;
> +	DWORD dwBufferLength;
> +	LUID luid;
> +	TOKEN_PRIVILEGES tpPreviousState;
> +	TOKEN_PRIVILEGES tpNewState;
> +
> +	dwBufferLength = 16;
> +	bResult = LookupPrivilegeValueA(0, lpName, &luid);
> +	if (bResult)
> +	{

Style: '{' comes at the end of the previous line, with a single SP
immediately before it, unless it is the beginning of the function body.

> +		tpNewState.PrivilegeCount = 1;
> +		tpNewState.Privileges[0].Luid = luid;
> +		tpNewState.Privileges[0].Attributes = 0;
> +		bResult = AdjustTokenPrivileges(TokenHandle, 0, &tpNewState, (DWORD)((LPBYTE)&(tpNewState.Privileges[1]) - (LPBYTE)&tpNewState), &tpPreviousState, &dwBufferLength);
> +		if (bResult)
> +		{
> +			tpPreviousState.PrivilegeCount = 1;
> +			tpPreviousState.Privileges[0].Luid = luid;
> +			tpPreviousState.Privileges[0].Attributes = flags != 0 ? 2 : 0;
> +			bResult = AdjustTokenPrivileges(TokenHandle, 0, &tpPreviousState, dwBufferLength, 0, 0);
> +		}
> +	}
> +	return bResult;
> +}
> +#endif
> +
> +int cmd_main(int argc, const char **argv)
> +{
> +	NTSTATUS status = 1;
> +#ifdef GIT_WINDOWS_NATIVE
> +	HANDLE hProcess = GetCurrentProcess();
> +	HANDLE hToken;
> +	if (!OpenProcessToken(hProcess, TOKEN_QUERY | TOKEN_ADJUST_PRIVILEGES, &hToken))
> +	{
> +		_ftprintf(stderr, _T("Can't open current process token\n"));
> +		return 1;
> +	}
> +
> +	if (!GetPrivilege(hToken, "SeProfileSingleProcessPrivilege", 1))
> +	{
> +		_ftprintf(stderr, _T("Can't get SeProfileSingleProcessPrivilege\n"));
> +		return 1;
> +	}
> +
> +	CloseHandle(hToken);
> +
> +	HMODULE ntdll = LoadLibrary(_T("ntdll.dll"));
> +	if (!ntdll)
> +	{
> +		_ftprintf(stderr, _T("Can't load ntdll.dll, wrong Windows version?\n"));
> +		return 1;
> +	}
> +
> +	NTSTATUS(WINAPI *NtSetSystemInformation)(INT, PVOID, ULONG) = (NTSTATUS(WINAPI *)(INT, PVOID, ULONG))GetProcAddress(ntdll, "NtSetSystemInformation");

Is this "decl-after-stmt"?  Avoid it.

> +	if (!NtSetSystemInformation)
> +	{
> +		_ftprintf(stderr, _T("Can't get function addresses, wrong Windows version?\n"));
> +		return 1;
> +	}
> +
> +	SYSTEM_MEMORY_LIST_COMMAND command = MemoryPurgeStandbyList;
> +	status = NtSetSystemInformation(
> +		SystemMemoryListInformation,
> +		&command,
> +		sizeof(SYSTEM_MEMORY_LIST_COMMAND)
> +	);
> +	if (status == STATUS_PRIVILEGE_NOT_HELD)
> +	{
> +		_ftprintf(stderr, _T("Insufficient privileges to execute the memory list command"));
> +	}
> +	else if (status != STATUS_SUCCESS)
> +	{
> +		_ftprintf(stderr, _T("Unable to execute the memory list command %lX"), status);
> +	}
> +#endif
> +
> +	return status;
> +}

So status is initialized to 1 and anybody without GIT_WINDOWS_NATIVE
unconditionally get exit(1)?

Given that 'status' is the return value of this function that
returns 'int', I wonder if we need NTSTATUS type here.

Having said all that, I think you are using this ONLY on windows;
perhaps it is better to drop #ifdef GIT_WINDOWS_NATIVE from all of
the above and arrange Makefile to build test-drop-cache only on that
platform, or something?


> diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
> new file mode 100755
> index 0000000000..e41905cb9b
> --- /dev/null
> +++ b/t/perf/p7519-fsmonitor.sh
> @@ -0,0 +1,161 @@
> +#!/bin/sh
> +
> +test_description="Test core.fsmonitor"
> +
> +. ./perf-lib.sh
> +
> +# This has to be run with GIT_PERF_REPEAT_COUNT=1 to generate valid results.
> +# Otherwise the caching that happens for the nth run will negate the validity
> +# of the comparisons.
> +if [ "$GIT_PERF_REPEAT_COUNT" -ne 1 ]
> +then

Style: 

	if test "$GIT_PERF_REPEAT_COUNT" -ne 1
	then
	
> + ...
> +test_expect_success "setup" '
> +...
> +	# Hook scaffolding
> +	mkdir .git/hooks &&
> +	cp ../../../templates/hooks--query-fsmonitor.sample .git/hooks/query-fsmonitor &&

Does this assume $TRASH_DIRECTORY must be in $TEST_DIRECTORY/perf/?
Shouldn't t/perf/p[0-9][0-9][0-9][0-9]-*.sh scripts be capable of
running with the --root=<ramdisk> option?  Perhaps take the copy
relative to $TEST_DIRECTORY?

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v5 7/7] fsmonitor: add a performance test
  2017-06-12 22:04   ` Junio C Hamano
@ 2017-06-14 14:12     ` Ben Peart
  2017-06-14 18:36       ` Junio C Hamano
  0 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-06-14 14:12 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, benpeart, pclouds, johannes.schindelin, David.Turner, peff,
	christian.couder, avarab



On 6/12/2017 6:04 PM, Junio C Hamano wrote:
> Ben Peart <peartben@gmail.com> writes:
> 
>> diff --git a/t/helper/test-drop-caches.c b/t/helper/test-drop-caches.c
>> new file mode 100644
>> index 0000000000..80830d920b
>> --- /dev/null
>> +++ b/t/helper/test-drop-caches.c
>> @@ -0,0 +1,107 @@
>> +#include "git-compat-util.h"
>> +#include <stdio.h>
> 
> I thought compat-util should already include stdio?
> 
>> +
>> +typedef DWORD NTSTATUS;
> 
> Is this safe to have outside "#ifdef GIT_WINDOWS_NATIVE"?

I think it's safe but it isn't required.  I remove it to be sure.

> 
>> +
>> +#ifdef GIT_WINDOWS_NATIVE
>> +#include <tchar.h>
>> +
>> +#define STATUS_SUCCESS			(0x00000000L)
>> +#define STATUS_PRIVILEGE_NOT_HELD	(0xC0000061L)
>> +
>> +typedef enum _SYSTEM_INFORMATION_CLASS {
>> +	SystemMemoryListInformation = 80, // 80, q: SYSTEM_MEMORY_LIST_INFORMATION; s: SYSTEM_MEMORY_LIST_COMMAND (requires SeProfileSingleProcessPrivilege)
> 
> I would have said "Please avoid // comment in this codebase unless
> we know we only use the compilers that grok it".  This particular
> one may be OK, as it is inside GIT_WINDOWS_NATIVE and I assume
> everybody on that platform uses recent GCC (or VStudio groks it I
> guess)?
> 

I'll remove them to be sure.

>> +} SYSTEM_INFORMATION_CLASS;
>> +
>> +// private
>> +typedef enum _SYSTEM_MEMORY_LIST_COMMAND
>> +{
> 
> Style: '{' comes at the end of the previous line, with a single SP
> immediately before it, unless it is the beginning of the function body.
> 

Old habits. :) Started writing Windows code and adopted their coding 
style without even thinking about it.  I'll 'gitify' the style as much 
as possible.  This should address the various style comments below.

> What you did for _SYSTEM_INFORMATION_CLASS above is correct.
> 
>> +	MemoryCaptureAccessedBits,
>> +	MemoryCaptureAndResetAccessedBits,
>> +	MemoryEmptyWorkingSets,
>> +	MemoryFlushModifiedList,
>> +	MemoryPurgeStandbyList,
>> +	MemoryPurgeLowPriorityStandbyList,
>> +	MemoryCommandMax
> 
> Style: avoid CamelCase.
> 
> 
> So status is initialized to 1 and anybody without GIT_WINDOWS_NATIVE
> unconditionally get exit(1)?
> 
> Given that 'status' is the return value of this function that
> returns 'int', I wonder if we need NTSTATUS type here.
> 
> Having said all that, I think you are using this ONLY on windows;
> perhaps it is better to drop #ifdef GIT_WINDOWS_NATIVE from all of
> the above and arrange Makefile to build test-drop-cache only on that
> platform, or something?
> 

I didn't find any other examples of Windows only tools.  I'll update the 
#ifdef to properly dump the file system cache on Linux as well and only 
error out on other platforms.

> 
>> diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
>> new file mode 100755
>> index 0000000000..e41905cb9b
>> --- /dev/null
>> +++ b/t/perf/p7519-fsmonitor.sh
>> @@ -0,0 +1,161 @@
>> +#!/bin/sh
>> +
>> +test_description="Test core.fsmonitor"
>> +
>> +. ./perf-lib.sh
>> +
>> +# This has to be run with GIT_PERF_REPEAT_COUNT=1 to generate valid results.
>> +# Otherwise the caching that happens for the nth run will negate the validity
>> +# of the comparisons.
>> +if [ "$GIT_PERF_REPEAT_COUNT" -ne 1 ]
>> +then
> 
> Style:
> 
> 	if test "$GIT_PERF_REPEAT_COUNT" -ne 1
> 	then
> 	
>> + ...
>> +test_expect_success "setup" '
>> +...
>> +	# Hook scaffolding
>> +	mkdir .git/hooks &&
>> +	cp ../../../templates/hooks--query-fsmonitor.sample .git/hooks/query-fsmonitor &&
> 
> Does this assume $TRASH_DIRECTORY must be in $TEST_DIRECTORY/perf/?
> Shouldn't t/perf/p[0-9][0-9][0-9][0-9]-*.sh scripts be capable of
> running with the --root=<ramdisk> option?  Perhaps take the copy
> relative to $TEST_DIRECTORY?
> 

I'll copy it relative to $TEST_DIRECTORY in the next iteration.


^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v5 7/7] fsmonitor: add a performance test
  2017-06-14 14:12     ` Ben Peart
@ 2017-06-14 18:36       ` Junio C Hamano
  2017-07-07 18:14         ` Ben Peart
  0 siblings, 1 reply; 137+ messages in thread
From: Junio C Hamano @ 2017-06-14 18:36 UTC (permalink / raw)
  To: Ben Peart
  Cc: git, benpeart, pclouds, johannes.schindelin, David.Turner, peff,
	christian.couder, avarab

Ben Peart <peartben@gmail.com> writes:

>> Having said all that, I think you are using this ONLY on windows;
>> perhaps it is better to drop #ifdef GIT_WINDOWS_NATIVE from all of
>> the above and arrange Makefile to build test-drop-cache only on that
>> platform, or something?
>
> I didn't find any other examples of Windows only tools.  I'll update
> the #ifdef to properly dump the file system cache on Linux as well and
> only error out on other platforms.

If this will become Windows-only, then I have no problem with
platform specfic typedef ;-) I have no problem with CamelCase,
either, as that follows the local convention on the platform
(similar to those in compat/* that are only for Windows).

Having said all that.

Another approach is to build this helper on all platforms, with
sections protected by "#ifdef LINUX", "#ifdef WINDOWS", etc.  That
way, the platform detection and switching between running this
program and echoing something into /proc filesystem performed in
p7519 can be removed (this test-helper program will be responsible
to implement that logic instead).  When p7519 wants to drop the
filesystem cache, regardless of the platform, it can call this
test-helper without having to know how the filesystem cache is
dropped.

I do not have strong preference either way, but I have a slight
suspicion that the "another approach" above may give us a better
result.  For one thing, the test-helper can be reused in new perf
scripts people will write in the future.

Thanks.

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v5 3/7] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  2017-06-10 13:40 ` [PATCH v5 3/7] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
@ 2017-06-27 15:43   ` Christian Couder
  2017-07-03 21:25     ` Ben Peart
  0 siblings, 1 reply; 137+ messages in thread
From: Christian Couder @ 2017-06-27 15:43 UTC (permalink / raw)
  To: Ben Peart
  Cc: git, Junio C Hamano, Ben Peart, Nguyen Thai Ngoc Duy,
	Johannes Schindelin, David Turner, Jeff King,
	Ævar Arnfjörð Bjarmason

On Sat, Jun 10, 2017 at 3:40 PM, Ben Peart <peartben@gmail.com> wrote:

> +int read_fsmonitor_extension(struct index_state *istate, const void *data,
> +       unsigned long sz)
> +{
> +       const char *index = data;
> +       uint32_t hdr_version;
> +       uint32_t ewah_size;
> +       int ret;
> +
> +       if (sz < sizeof(uint32_t) + sizeof(uint64_t) + sizeof(uint32_t))
> +               return error("corrupt fsmonitor extension (too short)");
> +
> +       hdr_version = get_be32(index);

Here we use get_be32(), ...

> +       index += sizeof(uint32_t);
> +       if (hdr_version != INDEX_EXTENSION_VERSION)
> +               return error("bad fsmonitor version %d", hdr_version);
> +
> +       istate->fsmonitor_last_update = get_be64(index);

...get_be64(), ...

> +       index += sizeof(uint64_t);
> +
> +       ewah_size = get_be32(index);

... and get_be32 again, ...

> +       index += sizeof(uint32_t);
> +
> +       istate->fsmonitor_dirty = ewah_new();
> +       ret = ewah_read_mmap(istate->fsmonitor_dirty, index, ewah_size);
> +       if (ret != ewah_size) {
> +               ewah_free(istate->fsmonitor_dirty);
> +               istate->fsmonitor_dirty = NULL;
> +               return error("failed to parse ewah bitmap reading fsmonitor index extension");
> +       }
> +
> +       return 0;
> +}
> +
> +void write_fsmonitor_extension(struct strbuf *sb, struct index_state *istate)
> +{
> +       uint32_t hdr_version;
> +       uint64_t tm;
> +       struct ewah_bitmap *bitmap;
> +       int i;
> +       uint32_t ewah_start;
> +       uint32_t ewah_size = 0;
> +       int fixup = 0;
> +
> +       hdr_version = htonl(INDEX_EXTENSION_VERSION);

... but here we use htonl() instead of put_be32(), ...

> +       strbuf_add(sb, &hdr_version, sizeof(uint32_t));
> +
> +       tm = htonll((uint64_t)istate->fsmonitor_last_update);

... htonll(), ...

> +       strbuf_add(sb, &tm, sizeof(uint64_t));
> +       fixup = sb->len;
> +       strbuf_add(sb, &ewah_size, sizeof(uint32_t)); /* we'll fix this up later */
> +
> +       ewah_start = sb->len;
> +       bitmap = ewah_new();
> +       for (i = 0; i < istate->cache_nr; i++)
> +               if (istate->cache[i]->ce_flags & CE_FSMONITOR_DIRTY)
> +                       ewah_set(bitmap, i);
> +       ewah_serialize_strbuf(bitmap, sb);
> +       ewah_free(bitmap);
> +
> +       /* fix up size field */
> +       ewah_size = htonl(sb->len - ewah_start);

... and htonl() again.

It would be more consistent (and perhaps more correct) to use
put_beXX() functions, instead of the htonl() and htonll() functions.

> +       memcpy(sb->buf + fixup, &ewah_size, sizeof(uint32_t));
> +}

> +/*
> + * Call the query-fsmonitor hook passing the time of the last saved results.
> + */
> +static int query_fsmonitor(int version, uint64_t last_update, struct strbuf *query_result)
> +{
> +       struct child_process cp = CHILD_PROCESS_INIT;
> +       char ver[64];
> +       char date[64];
> +       const char *argv[4];
> +
> +       if (!(argv[0] = find_hook("query-fsmonitor")))
> +               return -1;
> +
> +       snprintf(ver, sizeof(version), "%d", version);
> +       snprintf(date, sizeof(date), "%" PRIuMAX, (uintmax_t)last_update);
> +       argv[1] = ver;
> +       argv[2] = date;
> +       argv[3] = NULL;
> +       cp.argv = argv;

Maybe it would be nicer using the argv_array_pushX() functions.

> +       cp.out = -1;
> +
> +       return capture_command(&cp, query_result, 1024);
> +}
> +
> +static void mark_file_dirty(struct index_state *istate, const char *name)
> +{
> +       struct untracked_cache_dir *dir;
> +       int pos;
> +
> +       /* find it in the index and mark that entry as dirty */
> +       pos = index_name_pos(istate, name, strlen(name));
> +       if (pos >= 0) {
> +               if (!(istate->cache[pos]->ce_flags & CE_FSMONITOR_DIRTY)) {
> +                       istate->cache[pos]->ce_flags |= CE_FSMONITOR_DIRTY;
> +                       istate->cache_changed |= FSMONITOR_CHANGED;
> +               }
> +       }
> +
> +       /*
> +        * Find the corresponding directory in the untracked cache
> +        * and mark it as invalid
> +        */
> +       if (!istate->untracked || !istate->untracked->root)
> +               return;
> +
> +       dir = find_untracked_cache_dir(istate->untracked, istate->untracked->root, name);
> +       if (dir) {
> +               if (dir->valid) {
> +                       dir->valid = 0;
> +                       istate->cache_changed |= FSMONITOR_CHANGED;
> +               }
> +       }

The code above is quite similar as what is in mark_fsmonitor_dirty(),
so I wonder if a refactoring is possible.

> +}
> +
> +void refresh_by_fsmonitor(struct index_state *istate)
> +{
> +       static int has_run_once = 0;
> +       struct strbuf query_result = STRBUF_INIT;
> +       int query_success = 0;
> +       size_t bol = 0; /* beginning of line */
> +       uint64_t last_update;
> +       char *buf, *entry;
> +       int i;
> +
> +       if (!core_fsmonitor || has_run_once)
> +               return;
> +       has_run_once = 1;
> +
> +       /*
> +        * This could be racy so save the date/time now and the hook
> +        * should be inclusive to ensure we don't miss potential changes.
> +        */
> +       last_update = getnanotime();
> +
> +       /*
> +        * If we have a last update time, call query-monitor for the set of
> +        * changes since that time.
> +        */
> +       if (istate->fsmonitor_last_update) {
> +               query_success = !query_fsmonitor(HOOK_INTERFACE_VERSION,
> +                       istate->fsmonitor_last_update, &query_result);
> +               trace_performance_since(last_update, "query-fsmonitor");
> +       }
> +
> +       if (query_success) {
> +               /* Mark all entries returned by the monitor as dirty */
> +               buf = entry = query_result.buf;
> +               for (i = 0; i < query_result.len; i++) {
> +                       if (buf[i] != '\0')
> +                               continue;
> +                       mark_file_dirty(istate, buf + bol);

It looks like bol is always equal to i here...

> +                       bol = i + 1;
> +               }
> +               if (bol < query_result.len)
> +                       mark_file_dirty(istate, buf + bol);

... and here too. As it is not used below, I wonder if you really need
the bol variable.

> +               /* Mark all clean entries up-to-date */
> +               for (i = 0; i < istate->cache_nr; i++) {
> +                       struct cache_entry *ce = istate->cache[i];
> +                       if (ce_stage(ce) || (ce->ce_flags & CE_FSMONITOR_DIRTY))
> +                               continue;
> +                       ce_mark_uptodate(ce);
> +               }

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v5 4/7] fsmonitor: add test cases for fsmonitor extension
  2017-06-10 13:40 ` [PATCH v5 4/7] fsmonitor: add test cases for fsmonitor extension Ben Peart
@ 2017-06-27 16:20   ` Christian Couder
  2017-07-07 18:50     ` Ben Peart
  0 siblings, 1 reply; 137+ messages in thread
From: Christian Couder @ 2017-06-27 16:20 UTC (permalink / raw)
  To: Ben Peart
  Cc: git, Junio C Hamano, Ben Peart, Nguyen Thai Ngoc Duy,
	Johannes Schindelin, David Turner, Jeff King,
	Ævar Arnfjörð Bjarmason

On Sat, Jun 10, 2017 at 3:40 PM, Ben Peart <peartben@gmail.com> wrote:

> +# fsmonitor works correctly with or without the untracked cache
> +# but if it is available, we'll turn it on to ensure we test that
> +# codepath as well.
> +
> +test_lazy_prereq UNTRACKED_CACHE '
> +       { git update-index --test-untracked-cache; ret=$?; } &&
> +       test $ret -ne 1
> +'
> +
> +if test_have_prereq UNTRACKED_CACHE; then
> +       git config core.untrackedcache true
> +else
> +       git config core.untrackedcache false
> +fi

I wonder if it would be better to just do something like:

=====================

test_expect_success 'setup' '
        ....
'

uc_values="false"
test_have_prereq UNTRACKED_CACHE && uc_values="false true"

for uc_val in $uc_values
do

    test_expect_success "setup untracked cache to $uc_val" '
         git config core.untrackedcache $uc_val
    '

    test_expect_success 'refresh_index() invalidates fsmonitor cache' '
          ...
    '

    test_expect_success "status doesn't detect unreported modifications" '
          ...
    '

...

done

=====================

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v5 0/7] Fast git status via a file system watcher
  2017-06-10 13:40 [PATCH v5 0/7] Fast git status via a file system watcher Ben Peart
                   ` (6 preceding siblings ...)
  2017-06-10 13:40 ` [PATCH v5 7/7] fsmonitor: add a performance test Ben Peart
@ 2017-06-28  5:11 ` Christian Couder
  2017-07-10 13:36   ` Ben Peart
  2017-09-15 19:20 ` [PATCH v6 00/12] " Ben Peart
  8 siblings, 1 reply; 137+ messages in thread
From: Christian Couder @ 2017-06-28  5:11 UTC (permalink / raw)
  To: Ben Peart
  Cc: git, Junio C Hamano, Ben Peart, Nguyen Thai Ngoc Duy,
	Johannes Schindelin, David Turner, Jeff King,
	Ævar Arnfjörð Bjarmason

On Sat, Jun 10, 2017 at 3:40 PM, Ben Peart <peartben@gmail.com> wrote:
> Changes from V4 include:
...

I took a look at this patch series except the last patch ([PATCH v5
7/7] fsmonitor: add a performance test) as Junio reviewed it already,
and had only a few comments on patches 3/7 and 4/7.

I am still not convinced by the discussions following v2
(http://public-inbox.org/git/20170518201333.13088-1-benpeart@microsoft.com/)
about using a hook instead of for example a "core.fsmonitorcommand".

I think using a hook is not necessary and might not be a good match
for later optimizations. For example people might want to use a
library or some OS specific system calls to do what the hook does.

AEvar previously reported some not so great performance numbers on
some big Booking.com boxes with a big monorepo and it seems that using
taskset for example to make sure that the hook is run on the same CPU
improves these numbers significantly. So avoiding to run a separate
process can be important in some cases.

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v5 3/7] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  2017-06-27 15:43   ` Christian Couder
@ 2017-07-03 21:25     ` Ben Peart
  0 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-07-03 21:25 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Ben Peart, Nguyen Thai Ngoc Duy,
	Johannes Schindelin, David Turner, Jeff King,
	Ævar Arnfjörð Bjarmason



On 6/27/2017 11:43 AM, Christian Couder wrote:
> On Sat, Jun 10, 2017 at 3:40 PM, Ben Peart <peartben@gmail.com> wrote:
> 
>> +int read_fsmonitor_extension(struct index_state *istate, const void *data,
>> +       unsigned long sz)
>> +{
>> +       const char *index = data;
>> +       uint32_t hdr_version;
>> +       uint32_t ewah_size;
>> +       int ret;
>> +
>> +       if (sz < sizeof(uint32_t) + sizeof(uint64_t) + sizeof(uint32_t))
>> +               return error("corrupt fsmonitor extension (too short)");
>> +
>> +       hdr_version = get_be32(index);
> 
> Here we use get_be32(), ...
> 
>> +       index += sizeof(uint32_t);
>> +       if (hdr_version != INDEX_EXTENSION_VERSION)
>> +               return error("bad fsmonitor version %d", hdr_version);
>> +
>> +       istate->fsmonitor_last_update = get_be64(index);
> 
> ...get_be64(), ...
> 
>> +       index += sizeof(uint64_t);
>> +
>> +       ewah_size = get_be32(index);
> 
> ... and get_be32 again, ...
> 
>> +       index += sizeof(uint32_t);
>> +
>> +       istate->fsmonitor_dirty = ewah_new();
>> +       ret = ewah_read_mmap(istate->fsmonitor_dirty, index, ewah_size);
>> +       if (ret != ewah_size) {
>> +               ewah_free(istate->fsmonitor_dirty);
>> +               istate->fsmonitor_dirty = NULL;
>> +               return error("failed to parse ewah bitmap reading fsmonitor index extension");
>> +       }
>> +
>> +       return 0;
>> +}
>> +
>> +void write_fsmonitor_extension(struct strbuf *sb, struct index_state *istate)
>> +{
>> +       uint32_t hdr_version;
>> +       uint64_t tm;
>> +       struct ewah_bitmap *bitmap;
>> +       int i;
>> +       uint32_t ewah_start;
>> +       uint32_t ewah_size = 0;
>> +       int fixup = 0;
>> +
>> +       hdr_version = htonl(INDEX_EXTENSION_VERSION);
> 
> ... but here we use htonl() instead of put_be32(), ...
> 
>> +       strbuf_add(sb, &hdr_version, sizeof(uint32_t));
>> +
>> +       tm = htonll((uint64_t)istate->fsmonitor_last_update);
> 
> ... htonll(), ...
> 
>> +       strbuf_add(sb, &tm, sizeof(uint64_t));
>> +       fixup = sb->len;
>> +       strbuf_add(sb, &ewah_size, sizeof(uint32_t)); /* we'll fix this up later */
>> +
>> +       ewah_start = sb->len;
>> +       bitmap = ewah_new();
>> +       for (i = 0; i < istate->cache_nr; i++)
>> +               if (istate->cache[i]->ce_flags & CE_FSMONITOR_DIRTY)
>> +                       ewah_set(bitmap, i);
>> +       ewah_serialize_strbuf(bitmap, sb);
>> +       ewah_free(bitmap);
>> +
>> +       /* fix up size field */
>> +       ewah_size = htonl(sb->len - ewah_start);
> 
> ... and htonl() again.
> 
> It would be more consistent (and perhaps more correct) to use
> put_beXX() functions, instead of the htonl() and htonll() functions.

I agree that these are asymmetric.  I followed the pattern used in the 
untracked cache in which write_untracked_extension uses htonl and 
read_untracked_extension uses get_be32. I can change this to be more 
symmetric.

> 
>> +       memcpy(sb->buf + fixup, &ewah_size, sizeof(uint32_t));
>> +}
> 
>> +/*
>> + * Call the query-fsmonitor hook passing the time of the last saved results.
>> + */
>> +static int query_fsmonitor(int version, uint64_t last_update, struct strbuf *query_result)
>> +{
>> +       struct child_process cp = CHILD_PROCESS_INIT;
>> +       char ver[64];
>> +       char date[64];
>> +       const char *argv[4];
>> +
>> +       if (!(argv[0] = find_hook("query-fsmonitor")))
>> +               return -1;
>> +
>> +       snprintf(ver, sizeof(version), "%d", version);
>> +       snprintf(date, sizeof(date), "%" PRIuMAX, (uintmax_t)last_update);
>> +       argv[1] = ver;
>> +       argv[2] = date;
>> +       argv[3] = NULL;
>> +       cp.argv = argv;
> 
> Maybe it would be nicer using the argv_array_pushX() functions.

When the number of arguments is fixed and known, I prefer avoiding the 
dynamic allocations that come along with the argv_array_pushX() functions.

> 
>> +       cp.out = -1;
>> +
>> +       return capture_command(&cp, query_result, 1024);
>> +}
>> +
>> +static void mark_file_dirty(struct index_state *istate, const char *name)
>> +{
>> +       struct untracked_cache_dir *dir;
>> +       int pos;
>> +
>> +       /* find it in the index and mark that entry as dirty */
>> +       pos = index_name_pos(istate, name, strlen(name));
>> +       if (pos >= 0) {
>> +               if (!(istate->cache[pos]->ce_flags & CE_FSMONITOR_DIRTY)) {
>> +                       istate->cache[pos]->ce_flags |= CE_FSMONITOR_DIRTY;
>> +                       istate->cache_changed |= FSMONITOR_CHANGED;
>> +               }
>> +       }
>> +
>> +       /*
>> +        * Find the corresponding directory in the untracked cache
>> +        * and mark it as invalid
>> +        */
>> +       if (!istate->untracked || !istate->untracked->root)
>> +               return;
>> +
>> +       dir = find_untracked_cache_dir(istate->untracked, istate->untracked->root, name);
>> +       if (dir) {
>> +               if (dir->valid) {
>> +                       dir->valid = 0;
>> +                       istate->cache_changed |= FSMONITOR_CHANGED;
>> +               }
>> +       }
> 
> The code above is quite similar as what is in mark_fsmonitor_dirty(),
> so I wonder if a refactoring is possible.

I've felt the same way and looked at how to refactor it better a number 
of times but never came up with a way that made it simpler, clearer and 
easier to maintain.  I'm happy to review a patch if you can figure out 
something better. :)

> 
>> +}
>> +
>> +void refresh_by_fsmonitor(struct index_state *istate)
>> +{
>> +       static int has_run_once = 0;
>> +       struct strbuf query_result = STRBUF_INIT;
>> +       int query_success = 0;
>> +       size_t bol = 0; /* beginning of line */
>> +       uint64_t last_update;
>> +       char *buf, *entry;
>> +       int i;
>> +
>> +       if (!core_fsmonitor || has_run_once)
>> +               return;
>> +       has_run_once = 1;
>> +
>> +       /*
>> +        * This could be racy so save the date/time now and the hook
>> +        * should be inclusive to ensure we don't miss potential changes.
>> +        */
>> +       last_update = getnanotime();
>> +
>> +       /*
>> +        * If we have a last update time, call query-monitor for the set of
>> +        * changes since that time.
>> +        */
>> +       if (istate->fsmonitor_last_update) {
>> +               query_success = !query_fsmonitor(HOOK_INTERFACE_VERSION,
>> +                       istate->fsmonitor_last_update, &query_result);
>> +               trace_performance_since(last_update, "query-fsmonitor");
>> +       }
>> +
>> +       if (query_success) {
>> +               /* Mark all entries returned by the monitor as dirty */
>> +               buf = entry = query_result.buf;
>> +               for (i = 0; i < query_result.len; i++) {
>> +                       if (buf[i] != '\0')
>> +                               continue;
>> +                       mark_file_dirty(istate, buf + bol);
> 
> It looks like bol is always equal to i here...
> 
>> +                       bol = i + 1;
>> +               }
>> +               if (bol < query_result.len)
>> +                       mark_file_dirty(istate, buf + bol);
> 
> ... and here too. As it is not used below, I wonder if you really need
> the bol variable.

bol saves the position of the beginning of the current line while i 
iterates through the remainder of the string looking for the NULL.

However, the entry variable is no longer used so I will remove it.

> 
>> +               /* Mark all clean entries up-to-date */
>> +               for (i = 0; i < istate->cache_nr; i++) {
>> +                       struct cache_entry *ce = istate->cache[i];
>> +                       if (ce_stage(ce) || (ce->ce_flags & CE_FSMONITOR_DIRTY))
>> +                               continue;
>> +                       ce_mark_uptodate(ce);
>> +               }

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v5 7/7] fsmonitor: add a performance test
  2017-06-14 18:36       ` Junio C Hamano
@ 2017-07-07 18:14         ` Ben Peart
  2017-07-07 18:35           ` Junio C Hamano
  0 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-07-07 18:14 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, benpeart, pclouds, johannes.schindelin, David.Turner, peff,
	christian.couder, avarab



On 6/14/2017 2:36 PM, Junio C Hamano wrote:
> Ben Peart <peartben@gmail.com> writes:
> 
>>> Having said all that, I think you are using this ONLY on windows;
>>> perhaps it is better to drop #ifdef GIT_WINDOWS_NATIVE from all of
>>> the above and arrange Makefile to build test-drop-cache only on that
>>> platform, or something?
>>
>> I didn't find any other examples of Windows only tools.  I'll update
>> the #ifdef to properly dump the file system cache on Linux as well and
>> only error out on other platforms.
> 
> If this will become Windows-only, then I have no problem with
> platform specfic typedef ;-) I have no problem with CamelCase,
> either, as that follows the local convention on the platform
> (similar to those in compat/* that are only for Windows).
> 
> Having said all that.
> 
> Another approach is to build this helper on all platforms, with
> sections protected by "#ifdef LINUX", "#ifdef WINDOWS", etc.  That
> way, the platform detection and switching between running this
> program and echoing something into /proc filesystem performed in
> p7519 can be removed (this test-helper program will be responsible
> to implement that logic instead).  When p7519 wants to drop the
> filesystem cache, regardless of the platform, it can call this
> test-helper without having to know how the filesystem cache is
> dropped.
> 

I'll take a cut at doing this but it is obviously very platform 
dependent and I'm unfamiliar with platforms other than Windows.

For everything other than Windows, my implementation will be calling 
"system" for external utilities based on what I can find on the web. 
Oh, and I have no way to test it other than on Windows so could use some 
review/testing from others. :)

> I do not have strong preference either way, but I have a slight
> suspicion that the "another approach" above may give us a better
> result.  For one thing, the test-helper can be reused in new perf
> scripts people will write in the future.
> 
> Thanks.
> 

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v5 7/7] fsmonitor: add a performance test
  2017-07-07 18:14         ` Ben Peart
@ 2017-07-07 18:35           ` Junio C Hamano
  2017-07-07 19:07             ` Ben Peart
                               ` (2 more replies)
  0 siblings, 3 replies; 137+ messages in thread
From: Junio C Hamano @ 2017-07-07 18:35 UTC (permalink / raw)
  To: Ben Peart
  Cc: git, benpeart, pclouds, johannes.schindelin, David.Turner, peff,
	christian.couder, avarab

Ben Peart <peartben@gmail.com> writes:

> On 6/14/2017 2:36 PM, Junio C Hamano wrote:
>> Ben Peart <peartben@gmail.com> writes:
>>
>>>> Having said all that, I think you are using this ONLY on windows;
>>>> perhaps it is better to drop #ifdef GIT_WINDOWS_NATIVE from all of
>>>> the above and arrange Makefile to build test-drop-cache only on that
>>>> platform, or something?
>>>
>>> I didn't find any other examples of Windows only tools.  I'll update
>>> the #ifdef to properly dump the file system cache on Linux as well and
>>> only error out on other platforms.
>>
>> If this will become Windows-only, then I have no problem with
>> platform specfic typedef ;-) I have no problem with CamelCase,
>> either, as that follows the local convention on the platform
>> (similar to those in compat/* that are only for Windows).
>>
>> Having said all that.
>>
>> Another approach is to build this helper on all platforms, ...

... and having said all that, I think it is perfectly fine to do
such a clean-up long after the series gets more exposure to wider
audiences as a follow-up patch.  Let's get the primary part that
affects people's everyday use of Git right and then worry about the
test details later.

A quick show of hands to the list audiences.  How many of you guys
actually tried this series on 'pu' and checked to see its
performance (and correctness ;-) characteristics?  

Do you folks like it?  Rather not have such complexity in the core
part of the system?  A good first step to start adding more
performance improvements?  No opinion?



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v5 4/7] fsmonitor: add test cases for fsmonitor extension
  2017-06-27 16:20   ` Christian Couder
@ 2017-07-07 18:50     ` Ben Peart
  0 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-07-07 18:50 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Ben Peart, Nguyen Thai Ngoc Duy,
	Johannes Schindelin, David Turner, Jeff King,
	Ævar Arnfjörð Bjarmason



On 6/27/2017 12:20 PM, Christian Couder wrote:
> On Sat, Jun 10, 2017 at 3:40 PM, Ben Peart <peartben@gmail.com> wrote:
> 
>> +# fsmonitor works correctly with or without the untracked cache
>> +# but if it is available, we'll turn it on to ensure we test that
>> +# codepath as well.
>> +
>> +test_lazy_prereq UNTRACKED_CACHE '
>> +       { git update-index --test-untracked-cache; ret=$?; } &&
>> +       test $ret -ne 1
>> +'
>> +
>> +if test_have_prereq UNTRACKED_CACHE; then
>> +       git config core.untrackedcache true
>> +else
>> +       git config core.untrackedcache false
>> +fi
> 
> I wonder if it would be better to just do something like:

That is a good idea; I'll add that around the tests that aren't 
explicitly testing interop with and without the untracked cache.

Thanks!

> 
> =====================
> 
> test_expect_success 'setup' '
>          ....
> '
> 
> uc_values="false"
> test_have_prereq UNTRACKED_CACHE && uc_values="false true"
> 
> for uc_val in $uc_values
> do
> 
>      test_expect_success "setup untracked cache to $uc_val" '
>           git config core.untrackedcache $uc_val
>      '
> 
>      test_expect_success 'refresh_index() invalidates fsmonitor cache' '
>            ...
>      '
> 
>      test_expect_success "status doesn't detect unreported modifications" '
>            ...
>      '
> 
> ...
> 
> done
> 
> =====================
> 

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v5 7/7] fsmonitor: add a performance test
  2017-07-07 18:35           ` Junio C Hamano
@ 2017-07-07 19:07             ` Ben Peart
  2017-07-07 19:33             ` David Turner
  2017-07-08  7:19             ` Christian Couder
  2 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-07-07 19:07 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, benpeart, pclouds, johannes.schindelin, David.Turner, peff,
	christian.couder, avarab



On 7/7/2017 2:35 PM, Junio C Hamano wrote:
> Ben Peart <peartben@gmail.com> writes:
> 
>> On 6/14/2017 2:36 PM, Junio C Hamano wrote:
>>> Ben Peart <peartben@gmail.com> writes:
>>>
>>>>> Having said all that, I think you are using this ONLY on windows;
>>>>> perhaps it is better to drop #ifdef GIT_WINDOWS_NATIVE from all of
>>>>> the above and arrange Makefile to build test-drop-cache only on that
>>>>> platform, or something?
>>>>
>>>> I didn't find any other examples of Windows only tools.  I'll update
>>>> the #ifdef to properly dump the file system cache on Linux as well and
>>>> only error out on other platforms.
>>>
>>> If this will become Windows-only, then I have no problem with
>>> platform specfic typedef ;-) I have no problem with CamelCase,
>>> either, as that follows the local convention on the platform
>>> (similar to those in compat/* that are only for Windows).
>>>
>>> Having said all that.
>>>
>>> Another approach is to build this helper on all platforms, ...
> 
> ... and having said all that, I think it is perfectly fine to do
> such a clean-up long after the series gets more exposure to wider
> audiences as a follow-up patch.  Let's get the primary part that
> affects people's everyday use of Git right and then worry about the
> test details later.
> 
> A quick show of hands to the list audiences.  How many of you guys
> actually tried this series on 'pu' and checked to see its
> performance (and correctness ;-) characteristics?

TLDR: the current version isn't correct.

One of the things I did was hack up the test script to enable running 
all the tests with fsmonitor enabled.  I found a number of bugs in 
Watchman on Windows and have been working with Wez to get them fixed.

Just last week Watchman got to the point where I could run the complete 
git test suite.  As a result, I found that fsmonitor is overly 
aggressive in marking things with ce_mark_uptodate.  Submodules are 
currently broken as are commands that pass "--ignore-missing."

I started reworking the logic to fix these bugs and realized I was 
duplicating the set of tests that already exist in preload_thread. I'm 
currently working on integrating the logic into preload_thread so that 
both options can be used in combination and so I can avoid doing the 
(nearly identical) loop twice.

> 
> Do you folks like it?  Rather not have such complexity in the core
> part of the system?  A good first step to start adding more
> performance improvements?  No opinion?
> 
> 

^ permalink raw reply	[flat|nested] 137+ messages in thread

* RE: [PATCH v5 7/7] fsmonitor: add a performance test
  2017-07-07 18:35           ` Junio C Hamano
  2017-07-07 19:07             ` Ben Peart
@ 2017-07-07 19:33             ` David Turner
  2017-07-08  7:19             ` Christian Couder
  2 siblings, 0 replies; 137+ messages in thread
From: David Turner @ 2017-07-07 19:33 UTC (permalink / raw)
  To: 'Junio C Hamano', Ben Peart
  Cc: git@vger.kernel.org, benpeart@microsoft.com, pclouds@gmail.com,
	johannes.schindelin@gmx.de, peff@peff.net,
	christian.couder@gmail.com, avarab@gmail.com

> -----Original Message-----
> From: Junio C Hamano [mailto:jch2355@gmail.com] On Behalf Of Junio C
> Hamano
> Sent: Friday, July 7, 2017 2:35 PM
> To: Ben Peart <peartben@gmail.com>
> Cc: git@vger.kernel.org; benpeart@microsoft.com; pclouds@gmail.com;
> johannes.schindelin@gmx.de; David Turner <David.Turner@twosigma.com>;
> peff@peff.net; christian.couder@gmail.com; avarab@gmail.com
> Subject: Re: [PATCH v5 7/7] fsmonitor: add a performance test
> 
> Ben Peart <peartben@gmail.com> writes:
> 
> > On 6/14/2017 2:36 PM, Junio C Hamano wrote:
> >> Ben Peart <peartben@gmail.com> writes:
> >>
> >>>> Having said all that, I think you are using this ONLY on windows;
> >>>> perhaps it is better to drop #ifdef GIT_WINDOWS_NATIVE from all of
> >>>> the above and arrange Makefile to build test-drop-cache only on
> >>>> that platform, or something?
> >>>
> >>> I didn't find any other examples of Windows only tools.  I'll update
> >>> the #ifdef to properly dump the file system cache on Linux as well
> >>> and only error out on other platforms.
> >>
> >> If this will become Windows-only, then I have no problem with
> >> platform specfic typedef ;-) I have no problem with CamelCase,
> >> either, as that follows the local convention on the platform (similar
> >> to those in compat/* that are only for Windows).
> >>
> >> Having said all that.
> >>
> >> Another approach is to build this helper on all platforms, ...
> 
> ... and having said all that, I think it is perfectly fine to do such a clean-up long
> after the series gets more exposure to wider audiences as a follow-up patch.
> Let's get the primary part that affects people's everyday use of Git right and then
> worry about the test details later.
> 
> A quick show of hands to the list audiences.  How many of you guys actually
> tried this series on 'pu' and checked to see its performance (and correctness ;-)
> characteristics?
> 
> Do you folks like it?  Rather not have such complexity in the core part of the
> system?  A good first step to start adding more performance improvements?  No
> opinion?

I have not had the chance to test the latest version out yet.  The idea, broadly, seems sound, but as Ben notes in a later mail, the details are important.  Since he's going to re-roll with more interesting invalidation logic, I'll wait to try it again until a new version is available.

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v5 7/7] fsmonitor: add a performance test
  2017-07-07 18:35           ` Junio C Hamano
  2017-07-07 19:07             ` Ben Peart
  2017-07-07 19:33             ` David Turner
@ 2017-07-08  7:19             ` Christian Couder
  2 siblings, 0 replies; 137+ messages in thread
From: Christian Couder @ 2017-07-08  7:19 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ben Peart, git, Ben Peart, Nguyen Thai Ngoc Duy,
	Johannes Schindelin, David Turner, Jeff King,
	Ævar Arnfjörð Bjarmason

On Fri, Jul 7, 2017 at 8:35 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Ben Peart <peartben@gmail.com> writes:
>
>> On 6/14/2017 2:36 PM, Junio C Hamano wrote:
>>> Ben Peart <peartben@gmail.com> writes:
>>>
>>>>> Having said all that, I think you are using this ONLY on windows;
>>>>> perhaps it is better to drop #ifdef GIT_WINDOWS_NATIVE from all of
>>>>> the above and arrange Makefile to build test-drop-cache only on that
>>>>> platform, or something?
>>>>
>>>> I didn't find any other examples of Windows only tools.  I'll update
>>>> the #ifdef to properly dump the file system cache on Linux as well and
>>>> only error out on other platforms.
>>>
>>> If this will become Windows-only, then I have no problem with
>>> platform specfic typedef ;-) I have no problem with CamelCase,
>>> either, as that follows the local convention on the platform
>>> (similar to those in compat/* that are only for Windows).
>>>
>>> Having said all that.
>>>
>>> Another approach is to build this helper on all platforms, ...
>
> ... and having said all that, I think it is perfectly fine to do
> such a clean-up long after the series gets more exposure to wider
> audiences as a follow-up patch.  Let's get the primary part that
> affects people's everyday use of Git right and then worry about the
> test details later.
>
> A quick show of hands to the list audiences.  How many of you guys
> actually tried this series on 'pu' and checked to see its
> performance (and correctness ;-) characteristics?

As you can guess from my previous replies to this thread (and the
previous version of this patch series), I lightly tried it and checked
its performance for Booking.com.

> Do you folks like it?  Rather not have such complexity in the core
> part of the system?  A good first step to start adding more
> performance improvements?  No opinion?

I already gave my opinion which I think is shared with Ævar. In short
I don't think it should be a hook, as that limits the performance and
is not necessary, but it is going in the right direction.

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v5 0/7] Fast git status via a file system watcher
  2017-06-28  5:11 ` [PATCH v5 0/7] Fast git status via a file system watcher Christian Couder
@ 2017-07-10 13:36   ` Ben Peart
  2017-07-10 14:40     ` Ben Peart
  0 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-07-10 13:36 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Ben Peart, Nguyen Thai Ngoc Duy,
	Johannes Schindelin, David Turner, Jeff King,
	Ævar Arnfjörð Bjarmason



On 6/28/2017 1:11 AM, Christian Couder wrote:
> On Sat, Jun 10, 2017 at 3:40 PM, Ben Peart <peartben@gmail.com> wrote:
>> Changes from V4 include:
> ...
> 
> I took a look at this patch series except the last patch ([PATCH v5
> 7/7] fsmonitor: add a performance test) as Junio reviewed it already,
> and had only a few comments on patches 3/7 and 4/7.
> 
> I am still not convinced by the discussions following v2
> (http://public-inbox.org/git/20170518201333.13088-1-benpeart@microsoft.com/)
> about using a hook instead of for example a "core.fsmonitorcommand".
> 
> I think using a hook is not necessary and might not be a good match
> for later optimizations. For example people might want to use a
> library or some OS specific system calls to do what the hook does.
> 
> AEvar previously reported some not so great performance numbers on
> some big Booking.com boxes with a big monorepo and it seems that using
> taskset for example to make sure that the hook is run on the same CPU
> improves these numbers significantly. So avoiding to run a separate
> process can be important in some cases.
> 

Using a hook is the only pattern I've seen in git that provides a way to 
enable OS specific calls.  I used a hook so that different file 
monitoring services could be plugged in depending on the OS or tools 
available.  The Watchman integration script was mostly intended as a 
sample that could be used where Watchman is available and works well.

I had not heard about the taskset issues you mention above.  If there is 
something else that can be done to make this work better, please let me 
know the details.

If I'm understanding you correctly, you are suggesting that someone 
should be able to configure a setting (core.fsmonitorcommand) that gives 
a custom command line that would be run instead of running the 
query-fsmonitor hook.

I'm not entirely sure how that should work.  There are command line 
options that need to be passed (currently the interface version as well 
as the current clock in nanoseconds).  How would those passed when using 
the custom command?

Is it OK to just append them to the given command line?  Does there need 
to be some substitution token to indicate where they should be inserted 
(ie "mycustomcommand --custom --options %version% --more-options 
%timestamp%").  Are there any other tokens that should be supported (ie 
PID or processor mask?).

Is there a design pattern already used somewhere in git that I can 
follow or is this all blazing a new trail?

Thanks for continuing to look into this. Feedback is good!



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v5 0/7] Fast git status via a file system watcher
  2017-07-10 13:36   ` Ben Peart
@ 2017-07-10 14:40     ` Ben Peart
  0 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-07-10 14:40 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Ben Peart, Nguyen Thai Ngoc Duy,
	Johannes Schindelin, David Turner, Jeff King,
	Ævar Arnfjörð Bjarmason



On 7/10/2017 9:36 AM, Ben Peart wrote:
> 
> 
> On 6/28/2017 1:11 AM, Christian Couder wrote:
>> On Sat, Jun 10, 2017 at 3:40 PM, Ben Peart <peartben@gmail.com> wrote:
>>> Changes from V4 include:
>> ...
>>
>> I took a look at this patch series except the last patch ([PATCH v5
>> 7/7] fsmonitor: add a performance test) as Junio reviewed it already,
>> and had only a few comments on patches 3/7 and 4/7.
>>
>> I am still not convinced by the discussions following v2
>> (http://public-inbox.org/git/20170518201333.13088-1-benpeart@microsoft.com/) 
>>
>> about using a hook instead of for example a "core.fsmonitorcommand".
>>
>> I think using a hook is not necessary and might not be a good match
>> for later optimizations. For example people might want to use a
>> library or some OS specific system calls to do what the hook does.
>>
>> AEvar previously reported some not so great performance numbers on
>> some big Booking.com boxes with a big monorepo and it seems that using
>> taskset for example to make sure that the hook is run on the same CPU
>> improves these numbers significantly. So avoiding to run a separate
>> process can be important in some cases.
>>
> 
> Using a hook is the only pattern I've seen in git that provides a way to 
> enable OS specific calls.  I used a hook so that different file 
> monitoring services could be plugged in depending on the OS or tools 
> available.  The Watchman integration script was mostly intended as a 
> sample that could be used where Watchman is available and works well.
> 
> I had not heard about the taskset issues you mention above.  If there is 
> something else that can be done to make this work better, please let me 
> know the details.
> 
> If I'm understanding you correctly, you are suggesting that someone 
> should be able to configure a setting (core.fsmonitorcommand) that gives 
> a custom command line that would be run instead of running the 
> query-fsmonitor hook.
> 
> I'm not entirely sure how that should work.  There are command line 
> options that need to be passed (currently the interface version as well 
> as the current clock in nanoseconds).  How would those passed when using 
> the custom command?
> 
> Is it OK to just append them to the given command line?  Does there need 
> to be some substitution token to indicate where they should be inserted 
> (ie "mycustomcommand --custom --options %version% --more-options 
> %timestamp%").  Are there any other tokens that should be supported (ie 
> PID or processor mask?).
> 
> Is there a design pattern already used somewhere in git that I can 
> follow or is this all blazing a new trail?
> 

My co-worker reminded me about git difftool - is this what you had in mind?

> Thanks for continuing to look into this. Feedback is good!
> 
> 

^ permalink raw reply	[flat|nested] 137+ messages in thread

* [PATCH v6 00/12] Fast git status via a file system watcher
  2017-06-10 13:40 [PATCH v5 0/7] Fast git status via a file system watcher Ben Peart
                   ` (7 preceding siblings ...)
  2017-06-28  5:11 ` [PATCH v5 0/7] Fast git status via a file system watcher Christian Couder
@ 2017-09-15 19:20 ` Ben Peart
  2017-09-15 19:20   ` [PATCH v6 01/12] bswap: add 64 bit endianness helper get_be64 Ben Peart
                     ` (12 more replies)
  8 siblings, 13 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-15 19:20 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

This is a fairly significant rewrite since V5. The big changes include:

Multiple functions including preload-index(), ie_match_stat(), and
refresh_cache_ent() have been updated to honor the CE_FSMONITOR_VALID bit
following the same pattern as skip_worktree and CE_VALID.  As a result,
performance improvements apply to all git commands that would otherwise
have had to scan the entire working directory.

core.fsmonitor is now a registered command (instead of a hook) to
provide additional flexibility.  It is called when needed to ensure the
state of the index is up-to-date.

The Watchman integration script is now entirely written in perl to
minimize spawning additional helper commands.  This along with the other
changes have helped reduce the overhead and made the extension applicable
to more (ie smaller) repos.

There are additional opportunities for performance improvements but I
wanted to get this version out there and then build on it as the
foundation.  Some potential examples of future patches include:

 - call the integration script on a background thread so that it can
   execute in parallel.

 - optimize traverse trees by pruning out entire branches that do not
   contain any changes.

Other optimizations likely exist where knowledge that files have not
changed can be used to short circuit some of the normal workflow.

Performance
===========

With the various enhancements, performance has been improved especially
for smaller repos.  The included perf test compares status times without
fsmonitor to those with fsmonitor using the provided Watchman integration
script.

Due to the overhead of calling out to Watchman, on very small repos
(<10K files) the overhead exceeds the savings.  Once repos hit 10K files
the savings kick in and for repos beyond that, the savings are dramatic.

Test with 10,000 files                                           this tree
------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)         0.35(0.03+0.04)
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)    0.37(0.00+0.09)
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)   0.43(0.03+0.06)
7519.6: status (fsmonitor=)                                      0.45(0.00+0.07)
7519.7: status -uno (fsmonitor=)                                 0.40(0.03+0.07)
7519.8: status -uall (fsmonitor=)                                0.44(0.04+0.04)

Test with 100,000 files                                          this tree
------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)         0.33(0.01+0.03)
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)    0.36(0.00+0.06)
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)   0.93(0.00+0.07)
7519.6: status (fsmonitor=)                                      2.66(0.04+0.03)
7519.7: status -uno (fsmonitor=)                                 2.44(0.01+0.06)
7519.8: status -uall (fsmonitor=)                                2.94(0.03+0.07)

Test with 1,000,000 files                                        this tree
---------------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)         1.45(0.00+0.06)
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)    0.88(0.01+0.04)
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)   6.14(0.03+0.04)
7519.6: status (fsmonitor=)                                      25.91(0.04+0.06)
7519.7: status -uno (fsmonitor=)                                 23.96(0.04+0.03)
7519.8: status -uall (fsmonitor=)                                28.81(0.00+0.07)

Note: all numbers above are with a warm disk cache on a fast SSD, real
world performance numbers are often dramatically better as fsmonitor can
eliminate all the file IO to lstat every file and then traverse the
working directory looking for untracked files.  For example, a cold
status without fsmonitor on a HDD with 1M files takes 1m22.774s

$ time git -c core.fsmonitor= status
On branch p0006-ballast

It took 2.09 seconds to enumerate untracked files. 'status -uno'
may speed it up, but you have to be careful not to forget to add
new files yourself (see 'git help status').
nothing to commit, working tree clean

real    1m22.774s
user    0m0.000s
sys     0m0.000s


Ben Peart (12):
  bswap: add 64 bit endianness helper get_be64
  preload-index: add override to enable testing preload-index
  update-index: add a new --force-write-index option
  fsmonitor: teach git to optionally utilize a file system monitor to
    speed up detecting new or changed files.
  fsmonitor: add documentation for the fsmonitor extension.
  ls-files: Add support in ls-files to display the fsmonitor valid bit
  update-index: add fsmonitor support to update-index
  fsmonitor: add a test tool to dump the index extension
  split-index: disable the fsmonitor extension when running the split
    index test
  fsmonitor: add test cases for fsmonitor extension
  fsmonitor: add a sample integration script for Watchman
  fsmonitor: add a performance test

 Documentation/config.txt                   |   6 +
 Documentation/githooks.txt                 |  23 +++
 Documentation/technical/index-format.txt   |  19 +++
 Makefile                                   |   3 +
 apply.c                                    |   2 +-
 builtin/ls-files.c                         |   8 +-
 builtin/update-index.c                     |  26 ++-
 cache.h                                    |  10 +-
 compat/bswap.h                             |  22 +++
 config.c                                   |  14 ++
 config.h                                   |   1 +
 diff-lib.c                                 |   2 +
 dir.c                                      |  27 +--
 dir.h                                      |   2 +
 entry.c                                    |   4 +-
 environment.c                              |   1 +
 fsmonitor.c                                | 253 ++++++++++++++++++++++++++++
 fsmonitor.h                                |  61 +++++++
 preload-index.c                            |   8 +-
 read-cache.c                               |  49 +++++-
 submodule.c                                |   2 +-
 t/helper/.gitignore                        |   1 +
 t/helper/test-drop-caches.c                | 161 ++++++++++++++++++
 t/helper/test-dump-fsmonitor.c             |  21 +++
 t/perf/p7519-fsmonitor.sh                  | 184 ++++++++++++++++++++
 t/t1700-split-index.sh                     |   1 +
 t/t7519-status-fsmonitor.sh                | 259 +++++++++++++++++++++++++++++
 t/t7519/fsmonitor-all                      |  23 +++
 t/t7519/fsmonitor-none                     |  21 +++
 t/t7519/fsmonitor-watchman                 | 128 ++++++++++++++
 templates/hooks--fsmonitor-watchman.sample | 119 +++++++++++++
 unpack-trees.c                             |   8 +-
 32 files changed, 1440 insertions(+), 29 deletions(-)
 create mode 100644 fsmonitor.c
 create mode 100644 fsmonitor.h
 create mode 100644 t/helper/test-drop-caches.c
 create mode 100644 t/helper/test-dump-fsmonitor.c
 create mode 100755 t/perf/p7519-fsmonitor.sh
 create mode 100755 t/t7519-status-fsmonitor.sh
 create mode 100755 t/t7519/fsmonitor-all
 create mode 100755 t/t7519/fsmonitor-none
 create mode 100755 t/t7519/fsmonitor-watchman
 create mode 100755 templates/hooks--fsmonitor-watchman.sample

-- 
2.14.1.548.ge54b1befee.dirty


^ permalink raw reply	[flat|nested] 137+ messages in thread

* [PATCH v6 01/12] bswap: add 64 bit endianness helper get_be64
  2017-09-15 19:20 ` [PATCH v6 00/12] " Ben Peart
@ 2017-09-15 19:20   ` Ben Peart
  2017-09-15 19:20   ` [PATCH v6 02/12] preload-index: add override to enable testing preload-index Ben Peart
                     ` (11 subsequent siblings)
  12 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-15 19:20 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

Add a new get_be64 macro to enable 64 bit endian conversions on memory
that may or may not be aligned.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 compat/bswap.h | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/compat/bswap.h b/compat/bswap.h
index 7d063e9e40..6b22c46214 100644
--- a/compat/bswap.h
+++ b/compat/bswap.h
@@ -158,7 +158,9 @@ static inline uint64_t git_bswap64(uint64_t x)
 
 #define get_be16(p)	ntohs(*(unsigned short *)(p))
 #define get_be32(p)	ntohl(*(unsigned int *)(p))
+#define get_be64(p)	ntohll(*(uint64_t *)(p))
 #define put_be32(p, v)	do { *(unsigned int *)(p) = htonl(v); } while (0)
+#define put_be64(p, v)	do { *(uint64_t *)(p) = htonll(v); } while (0)
 
 #else
 
@@ -178,6 +180,13 @@ static inline uint32_t get_be32(const void *ptr)
 		(uint32_t)p[3] <<  0;
 }
 
+static inline uint64_t get_be64(const void *ptr)
+{
+	const unsigned char *p = ptr;
+	return	(uint64_t)get_be32(p[0]) << 32 |
+		(uint64_t)get_be32(p[4]) <<  0;
+}
+
 static inline void put_be32(void *ptr, uint32_t value)
 {
 	unsigned char *p = ptr;
@@ -187,4 +196,17 @@ static inline void put_be32(void *ptr, uint32_t value)
 	p[3] = value >>  0;
 }
 
+static inline void put_be64(void *ptr, uint64_t value)
+{
+	unsigned char *p = ptr;
+	p[0] = value >> 56;
+	p[1] = value >> 48;
+	p[2] = value >> 40;
+	p[3] = value >> 32;
+	p[4] = value >> 24;
+	p[5] = value >> 16;
+	p[6] = value >>  8;
+	p[7] = value >>  0;
+}
+
 #endif
-- 
2.14.1.548.ge54b1befee.dirty


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v6 02/12] preload-index: add override to enable testing preload-index
  2017-09-15 19:20 ` [PATCH v6 00/12] " Ben Peart
  2017-09-15 19:20   ` [PATCH v6 01/12] bswap: add 64 bit endianness helper get_be64 Ben Peart
@ 2017-09-15 19:20   ` Ben Peart
  2017-09-15 19:20   ` [PATCH v6 03/12] update-index: add a new --force-write-index option Ben Peart
                     ` (10 subsequent siblings)
  12 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-15 19:20 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

Preload index doesn't run unless it has a minimum number of 1000 files.
To enable running tests with fewer files, add an environment variable
(GIT_FORCE_PRELOAD_TEST) which will override that minimum and set it to 2.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 preload-index.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/preload-index.c b/preload-index.c
index 70a4c80878..75564c497a 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -79,6 +79,8 @@ static void preload_index(struct index_state *index,
 		return;
 
 	threads = index->cache_nr / THREAD_COST;
+	if ((index->cache_nr > 1) && (threads < 2) && getenv("GIT_FORCE_PRELOAD_TEST"))
+		threads = 2;
 	if (threads < 2)
 		return;
 	if (threads > MAX_PARALLEL)
-- 
2.14.1.548.ge54b1befee.dirty


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v6 03/12] update-index: add a new --force-write-index option
  2017-09-15 19:20 ` [PATCH v6 00/12] " Ben Peart
  2017-09-15 19:20   ` [PATCH v6 01/12] bswap: add 64 bit endianness helper get_be64 Ben Peart
  2017-09-15 19:20   ` [PATCH v6 02/12] preload-index: add override to enable testing preload-index Ben Peart
@ 2017-09-15 19:20   ` Ben Peart
  2017-09-15 19:20   ` [PATCH v6 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
                     ` (9 subsequent siblings)
  12 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-15 19:20 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

At times, it makes sense to avoid the cost of writing out the index
when the only changes can easily be recomputed on demand. This causes
problems when trying to write test cases to verify that state as they
can't guarantee the state has been persisted to disk.

Add a new option (--force-write-index) to update-index that will
ensure the index is written out even if the cache_changed flag is not
set.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 builtin/update-index.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index d562f2ec69..e1ca0759d5 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -915,6 +915,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 	struct refresh_params refresh_args = {0, &has_errors};
 	int lock_error = 0;
 	int split_index = -1;
+	int force_write = 0;
 	struct lock_file *lock_file;
 	struct parse_opt_ctx_t ctx;
 	strbuf_getline_fn getline_fn;
@@ -1006,6 +1007,8 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 			    N_("test if the filesystem supports untracked cache"), UC_TEST),
 		OPT_SET_INT(0, "force-untracked-cache", &untracked_cache,
 			    N_("enable untracked cache without testing the filesystem"), UC_FORCE),
+		OPT_SET_INT(0, "force-write-index", &force_write,
+			N_("write out the index even if is not flagged as changed"), 1),
 		OPT_END()
 	};
 
@@ -1147,7 +1150,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 		die("BUG: bad untracked_cache value: %d", untracked_cache);
 	}
 
-	if (active_cache_changed) {
+	if (active_cache_changed || force_write) {
 		if (newfd < 0) {
 			if (refresh_args.flags & REFRESH_QUIET)
 				exit(128);
-- 
2.14.1.548.ge54b1befee.dirty


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v6 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  2017-09-15 19:20 ` [PATCH v6 00/12] " Ben Peart
                     ` (2 preceding siblings ...)
  2017-09-15 19:20   ` [PATCH v6 03/12] update-index: add a new --force-write-index option Ben Peart
@ 2017-09-15 19:20   ` Ben Peart
  2017-09-15 21:35     ` David Turner
  2017-09-15 19:20   ` [PATCH v6 05/12] fsmonitor: add documentation for the fsmonitor extension Ben Peart
                     ` (8 subsequent siblings)
  12 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-09-15 19:20 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

When the index is read from disk, the fsmonitor index extension is used
to flag the last known potentially dirty index entries. The registered
core.fsmonitor command is called with the time the index was last
updated and returns the list of files changed since that time. This list
is used to flag any additional dirty cache entries and untracked cache
directories.

We can then use this valid state to speed up preload_index(),
ie_match_stat(), and refresh_cache_ent() as they do not need to lstat()
files to detect potential changes for those entries marked
CE_FSMONITOR_VALID.

In addition, if the untracked cache is turned on valid_cached_dir() can
skip checking directories for new or changed files as fsmonitor will
invalidate the cache only for those directories that have been
identified as having potential changes.

To keep the CE_FSMONITOR_VALID state accurate during git operations;
when git updates a cache entry to match the current state on disk,
it will now set the CE_FSMONITOR_VALID bit.

Inversely, anytime git changes a cache entry, the CE_FSMONITOR_VALID bit
is cleared and the corresponding untracked cache directory is marked
invalid.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 Makefile               |   1 +
 apply.c                |   2 +-
 builtin/update-index.c |   2 +
 cache.h                |  10 +-
 config.c               |  14 +++
 config.h               |   1 +
 diff-lib.c             |   2 +
 dir.c                  |  27 ++++--
 dir.h                  |   2 +
 entry.c                |   4 +-
 environment.c          |   1 +
 fsmonitor.c            | 253 +++++++++++++++++++++++++++++++++++++++++++++++++
 fsmonitor.h            |  61 ++++++++++++
 preload-index.c        |   6 +-
 read-cache.c           |  49 ++++++++--
 submodule.c            |   2 +-
 unpack-trees.c         |   8 +-
 17 files changed, 419 insertions(+), 26 deletions(-)
 create mode 100644 fsmonitor.c
 create mode 100644 fsmonitor.h

diff --git a/Makefile b/Makefile
index f2bb7f2f63..9d6ec9c1e9 100644
--- a/Makefile
+++ b/Makefile
@@ -786,6 +786,7 @@ LIB_OBJS += ewah/ewah_rlw.o
 LIB_OBJS += exec_cmd.o
 LIB_OBJS += fetch-pack.o
 LIB_OBJS += fsck.o
+LIB_OBJS += fsmonitor.o
 LIB_OBJS += gettext.o
 LIB_OBJS += gpg-interface.o
 LIB_OBJS += graph.o
diff --git a/apply.c b/apply.c
index 71cbbd141c..9061cc5f15 100644
--- a/apply.c
+++ b/apply.c
@@ -3399,7 +3399,7 @@ static int verify_index_match(const struct cache_entry *ce, struct stat *st)
 			return -1;
 		return 0;
 	}
-	return ce_match_stat(ce, st, CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE);
+	return ce_match_stat(ce, st, CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_IGNORE_FSMONITOR);
 }
 
 #define SUBMODULE_PATCH_WITHOUT_INDEX 1
diff --git a/builtin/update-index.c b/builtin/update-index.c
index e1ca0759d5..6f39ee9274 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -16,6 +16,7 @@
 #include "pathspec.h"
 #include "dir.h"
 #include "split-index.h"
+#include "fsmonitor.h"
 
 /*
  * Default to not allowing changes to the list of files. The
@@ -233,6 +234,7 @@ static int mark_ce_flags(const char *path, int flag, int mark)
 		else
 			active_cache[pos]->ce_flags &= ~flag;
 		active_cache[pos]->ce_flags |= CE_UPDATE_IN_BASE;
+		mark_fsmonitor_invalid(&the_index, active_cache[pos]);
 		cache_tree_invalidate_path(&the_index, path);
 		active_cache_changed |= CE_ENTRY_CHANGED;
 		return 0;
diff --git a/cache.h b/cache.h
index a916bc79e3..eccab968bd 100644
--- a/cache.h
+++ b/cache.h
@@ -203,6 +203,7 @@ struct cache_entry {
 #define CE_ADDED             (1 << 19)
 
 #define CE_HASHED            (1 << 20)
+#define CE_FSMONITOR_VALID   (1 << 21)
 #define CE_WT_REMOVE         (1 << 22) /* remove in work directory */
 #define CE_CONFLICTED        (1 << 23)
 
@@ -326,6 +327,7 @@ static inline unsigned int canon_mode(unsigned int mode)
 #define CACHE_TREE_CHANGED	(1 << 5)
 #define SPLIT_INDEX_ORDERED	(1 << 6)
 #define UNTRACKED_CHANGED	(1 << 7)
+#define FSMONITOR_CHANGED	(1 << 8)
 
 struct split_index;
 struct untracked_cache;
@@ -344,6 +346,7 @@ struct index_state {
 	struct hashmap dir_hash;
 	unsigned char sha1[20];
 	struct untracked_cache *untracked;
+	uint64_t fsmonitor_last_update;
 };
 
 extern struct index_state the_index;
@@ -679,8 +682,10 @@ extern void *read_blob_data_from_index(const struct index_state *, const char *,
 #define CE_MATCH_IGNORE_MISSING		0x08
 /* enable stat refresh */
 #define CE_MATCH_REFRESH		0x10
-extern int ie_match_stat(const struct index_state *, const struct cache_entry *, struct stat *, unsigned int);
-extern int ie_modified(const struct index_state *, const struct cache_entry *, struct stat *, unsigned int);
+/* do stat comparison even if CE_FSMONITOR_VALID is true */
+#define CE_MATCH_IGNORE_FSMONITOR 0X20
+extern int ie_match_stat(struct index_state *, const struct cache_entry *, struct stat *, unsigned int);
+extern int ie_modified(struct index_state *, const struct cache_entry *, struct stat *, unsigned int);
 
 #define HASH_WRITE_OBJECT 1
 #define HASH_FORMAT_CHECK 2
@@ -773,6 +778,7 @@ extern int core_apply_sparse_checkout;
 extern int precomposed_unicode;
 extern int protect_hfs;
 extern int protect_ntfs;
+extern const char *core_fsmonitor;
 
 /*
  * Include broken refs in all ref iterations, which will
diff --git a/config.c b/config.c
index d0d8ce823a..ddda96e584 100644
--- a/config.c
+++ b/config.c
@@ -2165,6 +2165,20 @@ int git_config_get_max_percent_split_change(void)
 	return -1; /* default value */
 }
 
+int git_config_get_fsmonitor(void)
+{
+	if (git_config_get_pathname("core.fsmonitor", &core_fsmonitor))
+		core_fsmonitor = getenv("GIT_FSMONITOR_TEST");
+
+	if (core_fsmonitor && !*core_fsmonitor)
+		core_fsmonitor = NULL;
+
+	if (core_fsmonitor)
+		return 1;
+
+	return 0;
+}
+
 NORETURN
 void git_die_config_linenr(const char *key, const char *filename, int linenr)
 {
diff --git a/config.h b/config.h
index 97471b8873..c9fcf691ba 100644
--- a/config.h
+++ b/config.h
@@ -211,6 +211,7 @@ extern int git_config_get_pathname(const char *key, const char **dest);
 extern int git_config_get_untracked_cache(void);
 extern int git_config_get_split_index(void);
 extern int git_config_get_max_percent_split_change(void);
+extern int git_config_get_fsmonitor(void);
 
 /* This dies if the configured or default date is in the future */
 extern int git_config_get_expiry(const char *key, const char **output);
diff --git a/diff-lib.c b/diff-lib.c
index 2a52b07954..23c6d03ca9 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -12,6 +12,7 @@
 #include "refs.h"
 #include "submodule.h"
 #include "dir.h"
+#include "fsmonitor.h"
 
 /*
  * diff-files
@@ -228,6 +229,7 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 
 		if (!changed && !dirty_submodule) {
 			ce_mark_uptodate(ce);
+			mark_fsmonitor_valid(ce);
 			if (!DIFF_OPT_TST(&revs->diffopt, FIND_COPIES_HARDER))
 				continue;
 		}
diff --git a/dir.c b/dir.c
index 1c55dc3e36..ac9833daec 100644
--- a/dir.c
+++ b/dir.c
@@ -18,6 +18,7 @@
 #include "utf8.h"
 #include "varint.h"
 #include "ewah/ewok.h"
+#include "fsmonitor.h"
 
 /*
  * Tells read_directory_recursive how a file or directory should be treated.
@@ -1688,17 +1689,23 @@ static int valid_cached_dir(struct dir_struct *dir,
 	if (!untracked)
 		return 0;
 
-	if (stat(path->len ? path->buf : ".", &st)) {
-		invalidate_directory(dir->untracked, untracked);
-		memset(&untracked->stat_data, 0, sizeof(untracked->stat_data));
-		return 0;
-	}
-	if (!untracked->valid ||
-	    match_stat_data_racy(istate, &untracked->stat_data, &st)) {
-		if (untracked->valid)
+	/*
+	 * With fsmonitor, we can trust the untracked cache's valid field.
+	 */
+	refresh_fsmonitor(istate);
+	if (!(dir->untracked->use_fsmonitor && untracked->valid)) {
+		if (stat(path->len ? path->buf : ".", &st)) {
 			invalidate_directory(dir->untracked, untracked);
-		fill_stat_data(&untracked->stat_data, &st);
-		return 0;
+			memset(&untracked->stat_data, 0, sizeof(untracked->stat_data));
+			return 0;
+		}
+		if (!untracked->valid ||
+			match_stat_data_racy(istate, &untracked->stat_data, &st)) {
+			if (untracked->valid)
+				invalidate_directory(dir->untracked, untracked);
+			fill_stat_data(&untracked->stat_data, &st);
+			return 0;
+		}
 	}
 
 	if (untracked->check_only != !!check_only) {
diff --git a/dir.h b/dir.h
index e3717055d1..fab8fc1561 100644
--- a/dir.h
+++ b/dir.h
@@ -139,6 +139,8 @@ struct untracked_cache {
 	int gitignore_invalidated;
 	int dir_invalidated;
 	int dir_opened;
+	/* fsmonitor invalidation data */
+	unsigned int use_fsmonitor : 1;
 };
 
 struct dir_struct {
diff --git a/entry.c b/entry.c
index cb291aa88b..5e6794f9fc 100644
--- a/entry.c
+++ b/entry.c
@@ -4,6 +4,7 @@
 #include "streaming.h"
 #include "submodule.h"
 #include "progress.h"
+#include "fsmonitor.h"
 
 static void create_directories(const char *path, int path_len,
 			       const struct checkout *state)
@@ -357,6 +358,7 @@ static int write_entry(struct cache_entry *ce,
 			lstat(ce->name, &st);
 		fill_stat_cache_info(ce, &st);
 		ce->ce_flags |= CE_UPDATE_IN_BASE;
+		mark_fsmonitor_invalid(state->istate, ce);
 		state->istate->cache_changed |= CE_ENTRY_CHANGED;
 	}
 	return 0;
@@ -402,7 +404,7 @@ int checkout_entry(struct cache_entry *ce,
 
 	if (!check_path(path.buf, path.len, &st, state->base_dir_len)) {
 		const struct submodule *sub;
-		unsigned changed = ce_match_stat(ce, &st, CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE);
+		unsigned changed = ce_match_stat(ce, &st, CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_IGNORE_FSMONITOR);
 		/*
 		 * Needs to be checked before !changed returns early,
 		 * as the possibly empty directory was not changed
diff --git a/environment.c b/environment.c
index 3fd4b10845..d0b9fc64d4 100644
--- a/environment.c
+++ b/environment.c
@@ -76,6 +76,7 @@ int protect_hfs = PROTECT_HFS_DEFAULT;
 #define PROTECT_NTFS_DEFAULT 0
 #endif
 int protect_ntfs = PROTECT_NTFS_DEFAULT;
+const char *core_fsmonitor;
 
 /*
  * The character that begins a commented line in user-editable file
diff --git a/fsmonitor.c b/fsmonitor.c
new file mode 100644
index 0000000000..144294b8df
--- /dev/null
+++ b/fsmonitor.c
@@ -0,0 +1,253 @@
+#include "cache.h"
+#include "config.h"
+#include "dir.h"
+#include "ewah/ewok.h"
+#include "fsmonitor.h"
+#include "run-command.h"
+#include "strbuf.h"
+
+#define INDEX_EXTENSION_VERSION	(1)
+#define HOOK_INTERFACE_VERSION	(1)
+
+struct trace_key trace_fsmonitor = TRACE_KEY_INIT(FSMONITOR);
+
+static void fsmonitor_ewah_callback(size_t pos, void *is)
+{
+	struct index_state *istate = (struct index_state *)is;
+	struct cache_entry *ce = istate->cache[pos];
+
+	ce->ce_flags &= ~CE_FSMONITOR_VALID;
+}
+
+int read_fsmonitor_extension(struct index_state *istate, const void *data,
+	unsigned long sz)
+{
+	const char *index = data;
+	uint32_t hdr_version;
+	uint32_t ewah_size;
+	struct ewah_bitmap *fsmonitor_dirty;
+	int i;
+	int ret;
+
+	if (sz < sizeof(uint32_t) + sizeof(uint64_t) + sizeof(uint32_t))
+		return error("corrupt fsmonitor extension (too short)");
+
+	hdr_version = get_be32(index);
+	index += sizeof(uint32_t);
+	if (hdr_version != INDEX_EXTENSION_VERSION)
+		return error("bad fsmonitor version %d", hdr_version);
+
+	istate->fsmonitor_last_update = get_be64(index);
+	index += sizeof(uint64_t);
+
+	ewah_size = get_be32(index);
+	index += sizeof(uint32_t);
+
+	fsmonitor_dirty = ewah_new();
+	ret = ewah_read_mmap(fsmonitor_dirty, index, ewah_size);
+	if (ret != ewah_size) {
+		ewah_free(fsmonitor_dirty);
+		return error("failed to parse ewah bitmap reading fsmonitor index extension");
+	}
+
+	if (git_config_get_fsmonitor()) {
+		/* Mark all entries valid */
+		for (i = 0; i < istate->cache_nr; i++)
+			istate->cache[i]->ce_flags |= CE_FSMONITOR_VALID;
+
+		/* Mark all previously saved entries as dirty */
+		ewah_each_bit(fsmonitor_dirty, fsmonitor_ewah_callback, istate);
+		ewah_free(fsmonitor_dirty);
+
+		/* Now mark the untracked cache for fsmonitor usage */
+		if (istate->untracked)
+			istate->untracked->use_fsmonitor = 1;
+	}
+
+	trace_printf_key(&trace_fsmonitor, "read fsmonitor extension successful");
+	return 0;
+}
+
+void write_fsmonitor_extension(struct strbuf *sb, struct index_state *istate)
+{
+	uint32_t hdr_version;
+	uint64_t tm;
+	struct ewah_bitmap *bitmap;
+	int i;
+	uint32_t ewah_start;
+	uint32_t ewah_size = 0;
+	int fixup = 0;
+
+	put_be32(&hdr_version, INDEX_EXTENSION_VERSION);
+	strbuf_add(sb, &hdr_version, sizeof(uint32_t));
+
+	put_be64(&tm, istate->fsmonitor_last_update);
+	strbuf_add(sb, &tm, sizeof(uint64_t));
+	fixup = sb->len;
+	strbuf_add(sb, &ewah_size, sizeof(uint32_t)); /* we'll fix this up later */
+
+	ewah_start = sb->len;
+	bitmap = ewah_new();
+	for (i = 0; i < istate->cache_nr; i++)
+		if (!(istate->cache[i]->ce_flags & CE_FSMONITOR_VALID))
+			ewah_set(bitmap, i);
+	ewah_serialize_strbuf(bitmap, sb);
+	ewah_free(bitmap);
+
+	/* fix up size field */
+	put_be32(&ewah_size, sb->len - ewah_start);
+	memcpy(sb->buf + fixup, &ewah_size, sizeof(uint32_t));
+
+	trace_printf_key(&trace_fsmonitor, "write fsmonitor extension successful");
+}
+
+/*
+ * Call the query-fsmonitor hook passing the time of the last saved results.
+ */
+static int query_fsmonitor(int version, uint64_t last_update, struct strbuf *query_result)
+{
+	struct child_process cp = CHILD_PROCESS_INIT;
+	char ver[64];
+	char date[64];
+	const char *argv[4];
+
+	if (!(argv[0] = core_fsmonitor))
+		return -1;
+
+	snprintf(ver, sizeof(version), "%d", version);
+	snprintf(date, sizeof(date), "%" PRIuMAX, (uintmax_t)last_update);
+	argv[1] = ver;
+	argv[2] = date;
+	argv[3] = NULL;
+	cp.argv = argv;
+	cp.use_shell = 1;
+
+	return capture_command(&cp, query_result, 1024);
+}
+
+static void fsmonitor_refresh_callback(struct index_state *istate, const char *name)
+{
+	int pos = index_name_pos(istate, name, strlen(name));
+
+	if (pos >= 0) {
+		struct cache_entry *ce = istate->cache[pos];
+		ce->ce_flags &= ~CE_FSMONITOR_VALID;
+	}
+
+	/*
+	 * Mark the untracked cache dirty even if it wasn't found in the index
+	 * as it could be a new untracked file.
+	 */
+	trace_printf_key(&trace_fsmonitor, "fsmonitor_refresh_callback '%s'", name);
+	untracked_cache_invalidate_path(istate, name);
+}
+
+void refresh_fsmonitor(struct index_state *istate)
+{
+	static int has_run_once = 0;
+	struct strbuf query_result = STRBUF_INIT;
+	int query_success = 0;
+	size_t bol; /* beginning of line */
+	uint64_t last_update;
+	char *buf;
+	int i;
+
+	if (!core_fsmonitor || has_run_once)
+		return;
+	has_run_once = 1;
+
+	trace_printf_key(&trace_fsmonitor, "refresh fsmonitor");
+	/*
+	 * This could be racy so save the date/time now and query_fsmonitor
+	 * should be inclusive to ensure we don't miss potential changes.
+	 */
+	last_update = getnanotime();
+
+	/*
+	 * If we have a last update time, call query_fsmonitor for the set of
+	 * changes since that time, else assume everything is possibly dirty
+	 * and check it all.
+	 */
+	if (istate->fsmonitor_last_update) {
+		query_success = !query_fsmonitor(HOOK_INTERFACE_VERSION,
+			istate->fsmonitor_last_update, &query_result);
+		trace_performance_since(last_update, "fsmonitor process '%s'", core_fsmonitor);
+		trace_printf_key(&trace_fsmonitor, "fsmonitor process '%s' returned %s",
+			core_fsmonitor, query_success ? "success" : "failure");
+	}
+
+	/* a fsmonitor process can return '*' to indicate all entries are invalid */
+	if (query_success && query_result.buf[0] != '*') {
+		/* Mark all entries returned by the monitor as dirty */
+		buf = query_result.buf;
+		bol = 0;
+		for (i = 0; i < query_result.len; i++) {
+			if (buf[i] != '\0')
+				continue;
+			fsmonitor_refresh_callback(istate, buf + bol);
+			bol = i + 1;
+		}
+		if (bol < query_result.len)
+			fsmonitor_refresh_callback(istate, buf + bol);
+	} else {
+		/* Mark all entries invalid */
+		for (i = 0; i < istate->cache_nr; i++)
+			istate->cache[i]->ce_flags &= ~CE_FSMONITOR_VALID;
+
+		if (istate->untracked)
+			istate->untracked->use_fsmonitor = 0;
+	}
+	strbuf_release(&query_result);
+
+	/* Now that we've updated istate, save the last_update time */
+	istate->fsmonitor_last_update = last_update;
+}
+
+void add_fsmonitor(struct index_state *istate)
+{
+	int i;
+
+	if (!istate->fsmonitor_last_update) {
+		trace_printf_key(&trace_fsmonitor, "add fsmonitor");
+		istate->cache_changed |= FSMONITOR_CHANGED;
+		istate->fsmonitor_last_update = getnanotime();
+
+		/* reset the fsmonitor state */
+		for (i = 0; i < istate->cache_nr; i++)
+			istate->cache[i]->ce_flags &= ~CE_FSMONITOR_VALID;
+
+		/* reset the untracked cache */
+		if (istate->untracked) {
+			add_untracked_cache(istate);
+			istate->untracked->use_fsmonitor = 1;
+		}
+
+		/* Update the fsmonitor state */
+		refresh_fsmonitor(istate);
+	}
+}
+
+void remove_fsmonitor(struct index_state *istate)
+{
+	if (istate->fsmonitor_last_update) {
+		trace_printf_key(&trace_fsmonitor, "remove fsmonitor");
+		istate->cache_changed |= FSMONITOR_CHANGED;
+		istate->fsmonitor_last_update = 0;
+	}
+}
+
+void tweak_fsmonitor(struct index_state *istate)
+{
+	switch (git_config_get_fsmonitor()) {
+	case -1: /* keep: do nothing */
+		break;
+	case 0: /* false */
+		remove_fsmonitor(istate);
+		break;
+	case 1: /* true */
+		add_fsmonitor(istate);
+		break;
+	default: /* unknown value: do nothing */
+		break;
+	}
+}
diff --git a/fsmonitor.h b/fsmonitor.h
new file mode 100644
index 0000000000..dadbe90283
--- /dev/null
+++ b/fsmonitor.h
@@ -0,0 +1,61 @@
+#ifndef FSMONITOR_H
+#define FSMONITOR_H
+
+extern struct trace_key trace_fsmonitor;
+
+/*
+ * Read the the fsmonitor index extension and (if configured) restore the
+ * CE_FSMONITOR_VALID state.
+ */
+extern int read_fsmonitor_extension(struct index_state *istate, const void *data, unsigned long sz);
+
+/*
+ * Write the CE_FSMONITOR_VALID state into the fsmonitor index extension.
+ */
+extern void write_fsmonitor_extension(struct strbuf *sb, struct index_state *istate);
+
+/*
+ * Add/remove the fsmonitor index extension
+ */
+extern void add_fsmonitor(struct index_state *istate);
+extern void remove_fsmonitor(struct index_state *istate);
+
+/*
+ * Add/remove the fsmonitor index extension as necessary based on the current
+ * core.fsmonitor setting.
+ */
+extern void tweak_fsmonitor(struct index_state *istate);
+
+/*
+ * Run the configured fsmonitor integration script and clear the
+ * CE_FSMONITOR_VALID bit for any files returned as dirty.  Also invalidate
+ * any corresponding untracked cache directory structures. Optimized to only
+ * run the first time it is called.
+ */
+extern void refresh_fsmonitor(struct index_state *istate);
+
+/*
+ * Set the given cache entries CE_FSMONITOR_VALID bit.
+ */
+static inline void mark_fsmonitor_valid(struct cache_entry *ce)
+{
+	if (core_fsmonitor) {
+		ce->ce_flags |= CE_FSMONITOR_VALID;
+		trace_printf_key(&trace_fsmonitor, "mark_fsmonitor_clean '%s'", ce->name);
+	}
+}
+
+/*
+ * Clear the given cache entries CE_FSMONITOR_VALID bit and invalidate any
+ * corresponding untracked cache directory structures.
+ */
+static inline void mark_fsmonitor_invalid(struct index_state *istate, struct cache_entry *ce)
+{
+	if (core_fsmonitor) {
+		ce->ce_flags &= ~CE_FSMONITOR_VALID;
+		untracked_cache_invalidate_path(istate, ce->name);
+		trace_printf_key(&trace_fsmonitor, "mark_fsmonitor_invalid '%s'", ce->name);
+	}
+}
+
+#endif
diff --git a/preload-index.c b/preload-index.c
index 75564c497a..2a83255e4e 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -4,6 +4,7 @@
 #include "cache.h"
 #include "pathspec.h"
 #include "dir.h"
+#include "fsmonitor.h"
 
 #ifdef NO_PTHREADS
 static void preload_index(struct index_state *index,
@@ -55,15 +56,18 @@ static void *preload_thread(void *_data)
 			continue;
 		if (ce_skip_worktree(ce))
 			continue;
+		if (ce->ce_flags & CE_FSMONITOR_VALID)
+			continue;
 		if (!ce_path_match(ce, &p->pathspec, NULL))
 			continue;
 		if (threaded_has_symlink_leading_path(&cache, ce->name, ce_namelen(ce)))
 			continue;
 		if (lstat(ce->name, &st))
 			continue;
-		if (ie_match_stat(index, ce, &st, CE_MATCH_RACY_IS_DIRTY))
+		if (ie_match_stat(index, ce, &st, CE_MATCH_RACY_IS_DIRTY|CE_MATCH_IGNORE_FSMONITOR))
 			continue;
 		ce_mark_uptodate(ce);
+		mark_fsmonitor_valid(ce);
 	} while (--nr > 0);
 	cache_def_clear(&cache);
 	return NULL;
diff --git a/read-cache.c b/read-cache.c
index 40da87ea71..53093dbebf 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -19,6 +19,7 @@
 #include "varint.h"
 #include "split-index.h"
 #include "utf8.h"
+#include "fsmonitor.h"
 
 /* Mask for the name length in ce_flags in the on-disk index */
 
@@ -38,11 +39,12 @@
 #define CACHE_EXT_RESOLVE_UNDO 0x52455543 /* "REUC" */
 #define CACHE_EXT_LINK 0x6c696e6b	  /* "link" */
 #define CACHE_EXT_UNTRACKED 0x554E5452	  /* "UNTR" */
+#define CACHE_EXT_FSMONITOR 0x46534D4E	  /* "FSMN" */
 
 /* changes that can be kept in $GIT_DIR/index (basically all extensions) */
 #define EXTMASK (RESOLVE_UNDO_CHANGED | CACHE_TREE_CHANGED | \
 		 CE_ENTRY_ADDED | CE_ENTRY_REMOVED | CE_ENTRY_CHANGED | \
-		 SPLIT_INDEX_ORDERED | UNTRACKED_CHANGED)
+		 SPLIT_INDEX_ORDERED | UNTRACKED_CHANGED | FSMONITOR_CHANGED)
 
 struct index_state the_index;
 static const char *alternate_index_output;
@@ -62,6 +64,7 @@ static void replace_index_entry(struct index_state *istate, int nr, struct cache
 	free(old);
 	set_index_entry(istate, nr, ce);
 	ce->ce_flags |= CE_UPDATE_IN_BASE;
+	mark_fsmonitor_invalid(istate, ce);
 	istate->cache_changed |= CE_ENTRY_CHANGED;
 }
 
@@ -150,8 +153,10 @@ void fill_stat_cache_info(struct cache_entry *ce, struct stat *st)
 	if (assume_unchanged)
 		ce->ce_flags |= CE_VALID;
 
-	if (S_ISREG(st->st_mode))
+	if (S_ISREG(st->st_mode)) {
 		ce_mark_uptodate(ce);
+		mark_fsmonitor_valid(ce);
+	}
 }
 
 static int ce_compare_data(const struct cache_entry *ce, struct stat *st)
@@ -300,7 +305,7 @@ int match_stat_data_racy(const struct index_state *istate,
 	return match_stat_data(sd, st);
 }
 
-int ie_match_stat(const struct index_state *istate,
+int ie_match_stat(struct index_state *istate,
 		  const struct cache_entry *ce, struct stat *st,
 		  unsigned int options)
 {
@@ -308,7 +313,10 @@ int ie_match_stat(const struct index_state *istate,
 	int ignore_valid = options & CE_MATCH_IGNORE_VALID;
 	int ignore_skip_worktree = options & CE_MATCH_IGNORE_SKIP_WORKTREE;
 	int assume_racy_is_modified = options & CE_MATCH_RACY_IS_DIRTY;
+	int ignore_fsmonitor = options & CE_MATCH_IGNORE_FSMONITOR;
 
+	if (!ignore_fsmonitor)
+		refresh_fsmonitor(istate);
 	/*
 	 * If it's marked as always valid in the index, it's
 	 * valid whatever the checked-out copy says.
@@ -319,6 +327,8 @@ int ie_match_stat(const struct index_state *istate,
 		return 0;
 	if (!ignore_valid && (ce->ce_flags & CE_VALID))
 		return 0;
+	if (!ignore_fsmonitor && (ce->ce_flags & CE_FSMONITOR_VALID))
+		return 0;
 
 	/*
 	 * Intent-to-add entries have not been added, so the index entry
@@ -356,7 +366,7 @@ int ie_match_stat(const struct index_state *istate,
 	return changed;
 }
 
-int ie_modified(const struct index_state *istate,
+int ie_modified(struct index_state *istate,
 		const struct cache_entry *ce,
 		struct stat *st, unsigned int options)
 {
@@ -631,7 +641,7 @@ int add_to_index(struct index_state *istate, const char *path, struct stat *st,
 	int size, namelen, was_same;
 	mode_t st_mode = st->st_mode;
 	struct cache_entry *ce, *alias;
-	unsigned ce_option = CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_RACY_IS_DIRTY;
+	unsigned ce_option = CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_RACY_IS_DIRTY|CE_MATCH_IGNORE_FSMONITOR;
 	int verbose = flags & (ADD_CACHE_VERBOSE | ADD_CACHE_PRETEND);
 	int pretend = flags & ADD_CACHE_PRETEND;
 	int intent_only = flags & ADD_CACHE_INTENT;
@@ -777,6 +787,7 @@ int chmod_index_entry(struct index_state *istate, struct cache_entry *ce,
 	}
 	cache_tree_invalidate_path(istate, ce->name);
 	ce->ce_flags |= CE_UPDATE_IN_BASE;
+	mark_fsmonitor_invalid(istate, ce);
 	istate->cache_changed |= CE_ENTRY_CHANGED;
 
 	return 0;
@@ -1228,10 +1239,13 @@ static struct cache_entry *refresh_cache_ent(struct index_state *istate,
 	int ignore_valid = options & CE_MATCH_IGNORE_VALID;
 	int ignore_skip_worktree = options & CE_MATCH_IGNORE_SKIP_WORKTREE;
 	int ignore_missing = options & CE_MATCH_IGNORE_MISSING;
+	int ignore_fsmonitor = options & CE_MATCH_IGNORE_FSMONITOR;
 
 	if (!refresh || ce_uptodate(ce))
 		return ce;
 
+	if (!ignore_fsmonitor)
+		refresh_fsmonitor(istate);
 	/*
 	 * CE_VALID or CE_SKIP_WORKTREE means the user promised us
 	 * that the change to the work tree does not matter and told
@@ -1245,6 +1259,10 @@ static struct cache_entry *refresh_cache_ent(struct index_state *istate,
 		ce_mark_uptodate(ce);
 		return ce;
 	}
+	if (!ignore_fsmonitor && (ce->ce_flags & CE_FSMONITOR_VALID)) {
+		ce_mark_uptodate(ce);
+		return ce;
+	}
 
 	if (has_symlink_leading_path(ce->name, ce_namelen(ce))) {
 		if (ignore_missing)
@@ -1282,8 +1300,10 @@ static struct cache_entry *refresh_cache_ent(struct index_state *istate,
 			 * because CE_UPTODATE flag is in-core only;
 			 * we are not going to write this change out.
 			 */
-			if (!S_ISGITLINK(ce->ce_mode))
+			if (!S_ISGITLINK(ce->ce_mode)) {
 				ce_mark_uptodate(ce);
+				mark_fsmonitor_valid(ce);
+			}
 			return ce;
 		}
 	}
@@ -1336,7 +1356,7 @@ int refresh_index(struct index_state *istate, unsigned int flags,
 	int first = 1;
 	int in_porcelain = (flags & REFRESH_IN_PORCELAIN);
 	unsigned int options = (CE_MATCH_REFRESH |
-				(really ? CE_MATCH_IGNORE_VALID : 0) |
+				(really ? CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_FSMONITOR : 0) |
 				(not_new ? CE_MATCH_IGNORE_MISSING : 0));
 	const char *modified_fmt;
 	const char *deleted_fmt;
@@ -1391,6 +1411,7 @@ int refresh_index(struct index_state *istate, unsigned int flags,
 				 */
 				ce->ce_flags &= ~CE_VALID;
 				ce->ce_flags |= CE_UPDATE_IN_BASE;
+				mark_fsmonitor_invalid(istate, ce);
 				istate->cache_changed |= CE_ENTRY_CHANGED;
 			}
 			if (quiet)
@@ -1550,6 +1571,9 @@ static int read_index_extension(struct index_state *istate,
 	case CACHE_EXT_UNTRACKED:
 		istate->untracked = read_untracked_extension(data, sz);
 		break;
+	case CACHE_EXT_FSMONITOR:
+		read_fsmonitor_extension(istate, data, sz);
+		break;
 	default:
 		if (*ext < 'A' || 'Z' < *ext)
 			return error("index uses %.4s extension, which we do not understand",
@@ -1722,6 +1746,7 @@ static void post_read_index_from(struct index_state *istate)
 	check_ce_order(istate);
 	tweak_untracked_cache(istate);
 	tweak_split_index(istate);
+	tweak_fsmonitor(istate);
 }
 
 /* remember to discard_cache() before reading a different cache! */
@@ -2306,6 +2331,16 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 		if (err)
 			return -1;
 	}
+	if (!strip_extensions && istate->fsmonitor_last_update) {
+		struct strbuf sb = STRBUF_INIT;
+
+		write_fsmonitor_extension(&sb, istate);
+		err = write_index_ext_header(&c, newfd, CACHE_EXT_FSMONITOR, sb.len) < 0
+			|| ce_write(&c, newfd, sb.buf, sb.len) < 0;
+		strbuf_release(&sb);
+		if (err)
+			return -1;
+	}
 
 	if (ce_flush(&c, newfd, istate->sha1))
 		return -1;
diff --git a/submodule.c b/submodule.c
index 3cea8221e0..8a931a1aaa 100644
--- a/submodule.c
+++ b/submodule.c
@@ -62,7 +62,7 @@ int is_staging_gitmodules_ok(const struct index_state *istate)
 	if ((pos >= 0) && (pos < istate->cache_nr)) {
 		struct stat st;
 		if (lstat(GITMODULES_FILE, &st) == 0 &&
-		    ce_match_stat(istate->cache[pos], &st, 0) & DATA_CHANGED)
+		    ce_match_stat(istate->cache[pos], &st, CE_MATCH_IGNORE_FSMONITOR) & DATA_CHANGED)
 			return 0;
 	}
 
diff --git a/unpack-trees.c b/unpack-trees.c
index 71b70ccb12..f724a61ac0 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -14,6 +14,7 @@
 #include "dir.h"
 #include "submodule.h"
 #include "submodule-config.h"
+#include "fsmonitor.h"
 
 /*
  * Error messages expected by scripts out of plumbing commands such as
@@ -408,6 +409,7 @@ static int apply_sparse_checkout(struct index_state *istate,
 		ce->ce_flags &= ~CE_SKIP_WORKTREE;
 	if (was_skip_worktree != ce_skip_worktree(ce)) {
 		ce->ce_flags |= CE_UPDATE_IN_BASE;
+		mark_fsmonitor_invalid(istate, ce);
 		istate->cache_changed |= CE_ENTRY_CHANGED;
 	}
 
@@ -1454,7 +1456,7 @@ static int verify_uptodate_1(const struct cache_entry *ce,
 		return 0;
 
 	if (!lstat(ce->name, &st)) {
-		int flags = CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE;
+		int flags = CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_IGNORE_FSMONITOR;
 		unsigned changed = ie_match_stat(o->src_index, ce, &st, flags);
 
 		if (submodule_from_ce(ce)) {
@@ -1610,7 +1612,7 @@ static int icase_exists(struct unpack_trees_options *o, const char *name, int le
 	const struct cache_entry *src;
 
 	src = index_file_exists(o->src_index, name, len, 1);
-	return src && !ie_match_stat(o->src_index, src, st, CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE);
+	return src && !ie_match_stat(o->src_index, src, st, CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_IGNORE_FSMONITOR);
 }
 
 static int check_ok_to_remove(const char *name, int len, int dtype,
@@ -2134,7 +2136,7 @@ int oneway_merge(const struct cache_entry * const *src,
 		if (o->reset && o->update && !ce_uptodate(old) && !ce_skip_worktree(old)) {
 			struct stat st;
 			if (lstat(old->name, &st) ||
-			    ie_match_stat(o->src_index, old, &st, CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE))
+			    ie_match_stat(o->src_index, old, &st, CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_IGNORE_FSMONITOR))
 				update |= CE_UPDATE;
 		}
 		add_entry(o, old, update, 0);
-- 
2.14.1.548.ge54b1befee.dirty


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v6 05/12] fsmonitor: add documentation for the fsmonitor extension.
  2017-09-15 19:20 ` [PATCH v6 00/12] " Ben Peart
                     ` (3 preceding siblings ...)
  2017-09-15 19:20   ` [PATCH v6 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
@ 2017-09-15 19:20   ` Ben Peart
  2017-09-15 19:43     ` David Turner
  2017-09-17  8:03     ` Junio C Hamano
  2017-09-15 19:20   ` [PATCH v6 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit Ben Peart
                     ` (7 subsequent siblings)
  12 siblings, 2 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-15 19:20 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

This includes the core.fsmonitor setting, the query-fsmonitor hook,
and the fsmonitor index extension.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 Documentation/config.txt                 |  6 ++++++
 Documentation/githooks.txt               | 23 +++++++++++++++++++++++
 Documentation/technical/index-format.txt | 19 +++++++++++++++++++
 3 files changed, 48 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index dc4e3f58a2..c196007a27 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -413,6 +413,12 @@ core.protectNTFS::
 	8.3 "short" names.
 	Defaults to `true` on Windows, and `false` elsewhere.
 
+core.fsmonitor::
+	If set, the value of this variable is used as a command which
+	will identify all files that may have changed since the
+	requested date/time. This information is used to speed up git by
+	avoiding unnecessary processing of files that have not changed.
+
 core.trustctime::
 	If false, the ctime differences between the index and the
 	working tree are ignored; useful when the inode change time
diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index 1bb4f92d4d..da82d64b0b 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -456,6 +456,29 @@ non-zero status causes 'git send-email' to abort before sending any
 e-mails.
 
 
+[[fsmonitor-watchman]]
+fsmonitor-watchman
+~~~~~~~~~~~~~~~
+
+This hook is invoked when the configuration option core.fsmonitor is
+set to .git/hooks/fsmonitor-watchman.  It takes two arguments, a version
+(currently 1) and the time in elapsed nanoseconds since midnight,
+January 1, 1970.
+
+The hook should output to stdout the list of all files in the working
+directory that may have changed since the requested time.  The logic
+should be inclusive so that it does not miss any potential changes.
+The paths should be relative to the root of the working directory
+and be separated by a single NUL.
+
+Git will limit what files it checks for changes as well as which
+directories are checked for untracked files based on the path names
+given.
+
+The exit status determines whether git will use the data from the
+hook to limit its search.  On error, it will fall back to verifying
+all files and folders.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
index ade0b0c445..db3572626b 100644
--- a/Documentation/technical/index-format.txt
+++ b/Documentation/technical/index-format.txt
@@ -295,3 +295,22 @@ The remaining data of each directory block is grouped by type:
     in the previous ewah bitmap.
 
   - One NUL.
+
+== File System Monitor cache
+
+  The file system monitor cache tracks files for which the core.fsmonitor
+  hook has told us about changes.  The signature for this extension is
+  { 'F', 'S', 'M', 'N' }.
+
+  The extension starts with
+
+  - 32-bit version number: the current supported version is 1.
+
+  - 64-bit time: the extension data reflects all changes through the given
+	time which is stored as the nanoseconds elapsed since midnight,
+	January 1, 1970.
+
+  - 32-bit bitmap size: the size of the CE_FSMONITOR_VALID bitmap.
+
+  - An ewah bitmap, the n-th bit indicates whether the n-th index entry
+    is not CE_FSMONITOR_VALID.
-- 
2.14.1.548.ge54b1befee.dirty


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v6 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit
  2017-09-15 19:20 ` [PATCH v6 00/12] " Ben Peart
                     ` (4 preceding siblings ...)
  2017-09-15 19:20   ` [PATCH v6 05/12] fsmonitor: add documentation for the fsmonitor extension Ben Peart
@ 2017-09-15 19:20   ` Ben Peart
  2017-09-15 20:34     ` David Turner
  2017-09-15 19:20   ` [PATCH v6 07/12] update-index: add fsmonitor support to update-index Ben Peart
                     ` (6 subsequent siblings)
  12 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-09-15 19:20 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

Add a new command line option (-f) to ls-files to have it use lowercase
letters for 'fsmonitor valid' files

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 builtin/ls-files.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index e1339e6d17..313962a0c1 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -31,6 +31,7 @@ static int show_resolve_undo;
 static int show_modified;
 static int show_killed;
 static int show_valid_bit;
+static int show_fsmonitor_bit;
 static int line_terminator = '\n';
 static int debug_mode;
 static int show_eol;
@@ -86,7 +87,8 @@ static const char *get_tag(const struct cache_entry *ce, const char *tag)
 {
 	static char alttag[4];
 
-	if (tag && *tag && show_valid_bit && (ce->ce_flags & CE_VALID)) {
+	if (tag && *tag && ((show_valid_bit && (ce->ce_flags & CE_VALID)) ||
+		(show_fsmonitor_bit && (ce->ce_flags & CE_FSMONITOR_VALID)))) {
 		memcpy(alttag, tag, 3);
 
 		if (isalpha(tag[0])) {
@@ -515,6 +517,8 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 			N_("identify the file status with tags")),
 		OPT_BOOL('v', NULL, &show_valid_bit,
 			N_("use lowercase letters for 'assume unchanged' files")),
+		OPT_BOOL('f', NULL, &show_fsmonitor_bit,
+			N_("use lowercase letters for 'fsmonitor clean' files")),
 		OPT_BOOL('c', "cached", &show_cached,
 			N_("show cached files in the output (default)")),
 		OPT_BOOL('d', "deleted", &show_deleted,
@@ -584,7 +588,7 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 	for (i = 0; i < exclude_list.nr; i++) {
 		add_exclude(exclude_list.items[i].string, "", 0, el, --exclude_args);
 	}
-	if (show_tag || show_valid_bit) {
+	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
 		tag_cached = "H ";
 		tag_unmerged = "M ";
 		tag_removed = "R ";
-- 
2.14.1.548.ge54b1befee.dirty


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v6 07/12] update-index: add fsmonitor support to update-index
  2017-09-15 19:20 ` [PATCH v6 00/12] " Ben Peart
                     ` (5 preceding siblings ...)
  2017-09-15 19:20   ` [PATCH v6 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit Ben Peart
@ 2017-09-15 19:20   ` Ben Peart
  2017-09-15 19:20   ` [PATCH v6 08/12] fsmonitor: add a test tool to dump the index extension Ben Peart
                     ` (5 subsequent siblings)
  12 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-15 19:20 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

Add support in update-index to manually add/remove the fsmonitor
extension via --fsmonitor/--no-fsmonitor flags

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 builtin/update-index.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 6f39ee9274..b03afc1f3a 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -918,6 +918,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 	int lock_error = 0;
 	int split_index = -1;
 	int force_write = 0;
+	int fsmonitor = -1;
 	struct lock_file *lock_file;
 	struct parse_opt_ctx_t ctx;
 	strbuf_getline_fn getline_fn;
@@ -1011,6 +1012,8 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 			    N_("enable untracked cache without testing the filesystem"), UC_FORCE),
 		OPT_SET_INT(0, "force-write-index", &force_write,
 			N_("write out the index even if is not flagged as changed"), 1),
+		OPT_BOOL(0, "fsmonitor", &fsmonitor,
+			N_("enable or disable file system monitor")),
 		OPT_END()
 	};
 
@@ -1152,6 +1155,22 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 		die("BUG: bad untracked_cache value: %d", untracked_cache);
 	}
 
+	if (fsmonitor > 0) {
+		if (git_config_get_fsmonitor() == 0)
+			warning(_("core.fsmonitor is unset; "
+				"set it if you really want to "
+				"enable fsmonitor"));
+		add_fsmonitor(&the_index);
+		report(_("fsmonitor enabled"));
+	} else if (!fsmonitor) {
+		if (git_config_get_fsmonitor() == 1)
+			warning(_("core.fsmonitor is set; "
+				"remove it if you really want to "
+				"disable fsmonitor"));
+		remove_fsmonitor(&the_index);
+		report(_("fsmonitor disabled"));
+	}
+
 	if (active_cache_changed || force_write) {
 		if (newfd < 0) {
 			if (refresh_args.flags & REFRESH_QUIET)
-- 
2.14.1.548.ge54b1befee.dirty


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v6 08/12] fsmonitor: add a test tool to dump the index extension
  2017-09-15 19:20 ` [PATCH v6 00/12] " Ben Peart
                     ` (6 preceding siblings ...)
  2017-09-15 19:20   ` [PATCH v6 07/12] update-index: add fsmonitor support to update-index Ben Peart
@ 2017-09-15 19:20   ` Ben Peart
  2017-09-17  8:02     ` Junio C Hamano
  2017-09-15 19:20   ` [PATCH v6 09/12] split-index: disable the fsmonitor extension when running the split index test Ben Peart
                     ` (4 subsequent siblings)
  12 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-09-15 19:20 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

Add a test utility (test-dump-fsmonitor) that will dump the fsmonitor
index extension.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 Makefile                       |  1 +
 t/helper/test-dump-fsmonitor.c | 21 +++++++++++++++++++++
 2 files changed, 22 insertions(+)
 create mode 100644 t/helper/test-dump-fsmonitor.c

diff --git a/Makefile b/Makefile
index 9d6ec9c1e9..d970cd00e9 100644
--- a/Makefile
+++ b/Makefile
@@ -639,6 +639,7 @@ TEST_PROGRAMS_NEED_X += test-config
 TEST_PROGRAMS_NEED_X += test-date
 TEST_PROGRAMS_NEED_X += test-delta
 TEST_PROGRAMS_NEED_X += test-dump-cache-tree
+TEST_PROGRAMS_NEED_X += test-dump-fsmonitor
 TEST_PROGRAMS_NEED_X += test-dump-split-index
 TEST_PROGRAMS_NEED_X += test-dump-untracked-cache
 TEST_PROGRAMS_NEED_X += test-fake-ssh
diff --git a/t/helper/test-dump-fsmonitor.c b/t/helper/test-dump-fsmonitor.c
new file mode 100644
index 0000000000..482d749bb9
--- /dev/null
+++ b/t/helper/test-dump-fsmonitor.c
@@ -0,0 +1,21 @@
+#include "cache.h"
+
+int cmd_main(int ac, const char **av)
+{
+	struct index_state *istate = &the_index;
+	int i;
+
+	setup_git_directory();
+	if (do_read_index(istate, get_index_file(), 0) < 0)
+		die("unable to read index file");
+	if (!istate->fsmonitor_last_update) {
+		printf("no fsmonitor\n");
+		return 0;
+	}
+	printf("fsmonitor last update %"PRIuMAX"\n", istate->fsmonitor_last_update);
+
+	for (i = 0; i < istate->cache_nr; i++)
+		printf((istate->cache[i]->ce_flags & CE_FSMONITOR_VALID) ? "+" : "-");
+
+	return 0;
+}
-- 
2.14.1.548.ge54b1befee.dirty


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v6 09/12] split-index: disable the fsmonitor extension when running the split index test
  2017-09-15 19:20 ` [PATCH v6 00/12] " Ben Peart
                     ` (7 preceding siblings ...)
  2017-09-15 19:20   ` [PATCH v6 08/12] fsmonitor: add a test tool to dump the index extension Ben Peart
@ 2017-09-15 19:20   ` Ben Peart
  2017-09-19 20:43     ` Jonathan Nieder
  2017-09-15 19:20   ` [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor extension Ben Peart
                     ` (3 subsequent siblings)
  12 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-09-15 19:20 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

The split index test t1700-split-index.sh has hard coded SHA values for
the index.  Currently it supports index V4 and V3 but assumes there are
no index extensions loaded.

When manually forcing the fsmonitor extension to be turned on when
running the test suite, the SHA values no longer match which causes the
test to fail.

The potential matrix of index extensions and index versions can is quite
large so instead disable the extension before attempting to run the test.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 t/t1700-split-index.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index 22f69a410b..af9b847761 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -6,6 +6,7 @@ test_description='split index mode tests'
 
 # We need total control of index splitting here
 sane_unset GIT_TEST_SPLIT_INDEX
+sane_unset GIT_FSMONITOR_TEST
 
 test_expect_success 'enable split index' '
 	git config splitIndex.maxPercentChange 100 &&
-- 
2.14.1.548.ge54b1befee.dirty


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor extension
  2017-09-15 19:20 ` [PATCH v6 00/12] " Ben Peart
                     ` (8 preceding siblings ...)
  2017-09-15 19:20   ` [PATCH v6 09/12] split-index: disable the fsmonitor extension when running the split index test Ben Peart
@ 2017-09-15 19:20   ` Ben Peart
  2017-09-15 22:00     ` David Turner
                       ` (2 more replies)
  2017-09-15 19:20   ` [PATCH v6 11/12] fsmonitor: add a sample integration script for Watchman Ben Peart
                     ` (2 subsequent siblings)
  12 siblings, 3 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-15 19:20 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

Test the ability to add/remove the fsmonitor index extension via
update-index.

Test that dirty files returned from the integration script are properly
represented in the index extension and verify that ls-files correctly
reports their state.

Test that ensure status results are correct when using the new fsmonitor
extension.  Test untracked, modified, and new files by ensuring the
results are identical to when not using the extension.

Test that if the fsmonitor extension doesn't tell git about a change, it
doesn't discover it on its own.  This ensures git is honoring the
extension and that we get the performance benefits desired.

Three test integration scripts are provided:

fsmonitor-all - marks all files as dirty
fsmonitor-none - marks no files as dirty
fsmonitor-watchman - integrates with Watchman with debug logging

To run tests in the test suite while utilizing fsmonitor:

First copy t/t7519/fsmonitor-all to a location in your path and then set
GIT_FORCE_PRELOAD_TEST=true and GIT_FSMONITOR_TEST=fsmonitor-all and run
your tests.

Note: currently when using the test script fsmonitor-watchman on
Windows, many tests fail due to a reported but not yet fixed bug in
Watchman where it holds on to handles for directories and files which
prevents the test directory from being cleaned up properly.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 t/t7519-status-fsmonitor.sh | 259 ++++++++++++++++++++++++++++++++++++++++++++
 t/t7519/fsmonitor-all       |  23 ++++
 t/t7519/fsmonitor-none      |  21 ++++
 t/t7519/fsmonitor-watchman  | 128 ++++++++++++++++++++++
 4 files changed, 431 insertions(+)
 create mode 100755 t/t7519-status-fsmonitor.sh
 create mode 100755 t/t7519/fsmonitor-all
 create mode 100755 t/t7519/fsmonitor-none
 create mode 100755 t/t7519/fsmonitor-watchman

diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh
new file mode 100755
index 0000000000..6aa1e4e924
--- /dev/null
+++ b/t/t7519-status-fsmonitor.sh
@@ -0,0 +1,259 @@
+#!/bin/sh
+
+test_description='git status with file system watcher'
+
+. ./test-lib.sh
+
+#
+# To run the entire git test suite using fsmonitor:
+#
+# copy t/t7519/fsmonitor-all to a location in your path and then set
+# GIT_FSMONITOR_TEST=fsmonitor-all and run your tests.
+#
+
+# Note, after "git reset --hard HEAD" no extensions exist other than 'TREE'
+# "git update-index --fsmonitor" can be used to get the extension written
+# before testing the results.
+
+clean_repo () {
+	git reset --hard HEAD &&
+	git clean -fd
+}
+
+dirty_repo () {
+	: >untracked &&
+	: >dir1/untracked &&
+	: >dir2/untracked &&
+	echo 1 >modified &&
+	echo 2 >dir1/modified &&
+	echo 3 >dir2/modified &&
+	echo 4 >new &&
+	echo 5 >dir1/new &&
+	echo 6 >dir2/new
+}
+
+write_integration_script() {
+	write_script .git/hooks/fsmonitor-test<<-\EOF
+	if [ "$#" -ne 2 ]; then
+		echo "$0: exactly 2 arguments expected"
+		exit 2
+	fi
+	if [ "$1" != 1 ]; then
+		echo -e "Unsupported core.fsmonitor hook version.\n" >&2
+		exit 1
+	fi
+	printf "untracked\0"
+	printf "dir1/untracked\0"
+	printf "dir2/untracked\0"
+	printf "modified\0"
+	printf "dir1/modified\0"
+	printf "dir2/modified\0"
+	printf "new\0"
+	printf "dir1/new\0"
+	printf "dir2/new\0"
+	EOF
+}
+
+test_lazy_prereq UNTRACKED_CACHE '
+	{ git update-index --test-untracked-cache; ret=$?; } &&
+	test $ret -ne 1
+'
+
+test_expect_success 'setup' '
+	mkdir -p .git/hooks &&
+	: >tracked &&
+	: >modified &&
+	mkdir dir1 &&
+	: >dir1/tracked &&
+	: >dir1/modified &&
+	mkdir dir2 &&
+	: >dir2/tracked &&
+	: >dir2/modified &&
+	git -c core.fsmonitor= add . &&
+	git -c core.fsmonitor= commit -m initial &&
+	git config core.fsmonitor .git/hooks/fsmonitor-test &&
+	cat >.gitignore <<-\EOF
+	.gitignore
+	expect*
+	actual*
+	marker*
+	EOF
+'
+
+# test that the fsmonitor extension is off by default
+test_expect_success 'fsmonitor extension is off by default' '
+	test-dump-fsmonitor >actual &&
+	grep "^no fsmonitor" actual
+'
+
+# test that "update-index --fsmonitor" adds the fsmonitor extension
+test_expect_success 'update-index --fsmonitor" adds the fsmonitor extension' '
+	git update-index --fsmonitor &&
+	test-dump-fsmonitor >actual &&
+	grep "^fsmonitor last update" actual
+'
+
+# test that "update-index --no-fsmonitor" removes the fsmonitor extension
+test_expect_success 'update-index --no-fsmonitor" removes the fsmonitor extension' '
+	git update-index --no-fsmonitor &&
+	test-dump-fsmonitor >actual &&
+	grep "^no fsmonitor" actual
+'
+
+cat >expect <<EOF &&
+H dir1/modified
+H dir1/tracked
+H dir2/modified
+H dir2/tracked
+H modified
+H tracked
+EOF
+
+# test that all files returned by the script get flagged as invalid
+test_expect_success 'all files returned by integration script get flagged as invalid' '
+	write_integration_script &&
+	dirty_repo &&
+	git update-index --fsmonitor &&
+	git ls-files -f >actual &&
+	test_cmp expect actual
+'
+
+cat >expect <<EOF &&
+H dir1/modified
+h dir1/new
+H dir1/tracked
+H dir2/modified
+h dir2/new
+H dir2/tracked
+H modified
+h new
+H tracked
+EOF
+
+# test that newly added files are marked valid
+test_expect_success 'newly added files are marked valid' '
+	git add new &&
+	git add dir1/new &&
+	git add dir2/new &&
+	git ls-files -f >actual &&
+	test_cmp expect actual
+'
+
+cat >expect <<EOF &&
+H dir1/modified
+h dir1/new
+h dir1/tracked
+H dir2/modified
+h dir2/new
+h dir2/tracked
+H modified
+h new
+h tracked
+EOF
+
+# test that all unmodified files get marked valid
+test_expect_success 'all unmodified files get marked valid' '
+	# modified files result in update-index returning 1
+	test_must_fail git update-index --refresh --force-write-index &&
+	git ls-files -f >actual &&
+	test_cmp expect actual
+'
+
+cat >expect <<EOF &&
+H dir1/modified
+h dir1/tracked
+h dir2/modified
+h dir2/tracked
+h modified
+h tracked
+EOF
+
+# test that *only* files returned by the integration script get flagged as invalid
+test_expect_success '*only* files returned by the integration script get flagged as invalid' '
+	write_script .git/hooks/fsmonitor-test<<-\EOF &&
+	printf "dir1/modified\0"
+	EOF
+	clean_repo &&
+	git update-index --refresh --force-write-index &&
+	echo 1 >modified &&
+	echo 2 >dir1/modified &&
+	echo 3 >dir2/modified &&
+	test_must_fail git update-index --refresh --force-write-index &&
+	git ls-files -f >actual &&
+	test_cmp expect actual
+'
+
+# Ensure commands that call refresh_index() to move the index back in time
+# properly invalidate the fsmonitor cache
+test_expect_success 'refresh_index() invalidates fsmonitor cache' '
+	write_script .git/hooks/fsmonitor-test<<-\EOF &&
+	EOF
+	clean_repo &&
+	dirty_repo &&
+	git add . &&
+	git commit -m "to reset" &&
+	git reset HEAD~1 &&
+	git status >actual &&
+	git -c core.fsmonitor= status >expect &&
+	test_i18ncmp expect actual
+'
+
+# test fsmonitor with and without preloadIndex
+preload_values="false true"
+for preload_val in $preload_values
+do
+	test_expect_success "setup preloadIndex to $preload_val" '
+		git config core.preloadIndex $preload_val &&
+		if [ $preload_val -eq true ]
+		then
+			GIT_FORCE_PRELOAD_TEST=$preload_val; export GIT_FORCE_PRELOAD_TEST
+		else
+			unset GIT_FORCE_PRELOAD_TEST
+		fi
+	'
+
+	# test fsmonitor with and without the untracked cache (if available)
+	uc_values="false"
+	test_have_prereq UNTRACKED_CACHE && uc_values="false true"
+	for uc_val in $uc_values
+	do
+		test_expect_success "setup untracked cache to $uc_val" '
+			git config core.untrackedcache $uc_val
+		'
+
+		# Status is well tested elsewhere so we'll just ensure that the results are
+		# the same when using core.fsmonitor.
+		test_expect_success 'compare status with and without fsmonitor' '
+			write_integration_script &&
+			clean_repo &&
+			dirty_repo &&
+			git add new &&
+			git add dir1/new &&
+			git add dir2/new &&
+			git status >actual &&
+			git -c core.fsmonitor= status >expect &&
+			test_i18ncmp expect actual
+		'
+
+		# Make sure it's actually skipping the check for modified and untracked
+		# (if enabled) files unless it is told about them.
+		test_expect_success "status doesn't detect unreported modifications" '
+			write_script .git/hooks/fsmonitor-test<<-\EOF &&
+			:>marker
+			EOF
+			clean_repo &&
+			git status &&
+			test_path_is_file marker &&
+			dirty_repo &&
+			rm -f marker &&
+			git status >actual &&
+			test_path_is_file marker &&
+			test_i18ngrep ! "Changes not staged for commit:" actual &&
+			if [ $uc_val -eq true ]; then test_i18ngrep ! "Untracked files:" actual; fi &&
+			if [ $uc_val -eq false ]; then test_i18ngrep "Untracked files:" actual; fi &&
+			rm -f marker
+		'
+	done
+done
+
+test_done
diff --git a/t/t7519/fsmonitor-all b/t/t7519/fsmonitor-all
new file mode 100755
index 0000000000..a3870e431e
--- /dev/null
+++ b/t/t7519/fsmonitor-all
@@ -0,0 +1,23 @@
+#!/bin/sh
+#
+# An test hook script to integrate with git to test fsmonitor.
+#
+# The hook is passed a version (currently 1) and a time in nanoseconds
+# formatted as a string and outputs to stdout all files that have been
+# modified since the given time. Paths must be relative to the root of
+# the working tree and separated by a single NUL.
+#
+#echo "$0 $*" >&2
+
+if [ "$#" -ne 2 ] ; then
+	echo -e "$0: exactly 2 arguments expected\n" >&2
+	exit 2
+fi
+
+if [ "$1" != 1 ]
+then
+	echo -e "Unsupported core.fsmonitor hook version.\n" >&2
+	exit 1
+fi
+
+echo "*"
\ No newline at end of file
diff --git a/t/t7519/fsmonitor-none b/t/t7519/fsmonitor-none
new file mode 100755
index 0000000000..c500bb0f26
--- /dev/null
+++ b/t/t7519/fsmonitor-none
@@ -0,0 +1,21 @@
+#!/bin/sh
+#
+# An test hook script to integrate with git to test fsmonitor.
+#
+# The hook is passed a version (currently 1) and a time in nanoseconds
+# formatted as a string and outputs to stdout all files that have been
+# modified since the given time. Paths must be relative to the root of
+# the working tree and separated by a single NUL.
+#
+#echo "$0 $*" >&2
+
+if [ "$#" -ne 2 ] ; then
+	echo -e "$0: exactly 2 arguments expected\n" >&2
+	exit 2
+fi
+
+if [ "$1" != 1 ]
+then
+	echo -e "Unsupported core.fsmonitor hook version.\n" >&2
+	exit 1
+fi
diff --git a/t/t7519/fsmonitor-watchman b/t/t7519/fsmonitor-watchman
new file mode 100755
index 0000000000..aaee5d1fe3
--- /dev/null
+++ b/t/t7519/fsmonitor-watchman
@@ -0,0 +1,128 @@
+#!/usr/bin/perl
+
+use strict;
+use warnings;
+use IPC::Open2;
+
+# An example hook script to integrate Watchman
+# (https://facebook.github.io/watchman/) with git to speed up detecting
+# new and modified files.
+#
+# The hook is passed a version (currently 1) and a time in nanoseconds
+# formatted as a string and outputs to stdout all files that have been
+# modified since the given time. Paths must be relative to the root of
+# the working tree and separated by a single NUL.
+#
+# To enable this hook, rename this file to "query-watchman" and set
+# 'git config core.fsmonitor .git/hooks/query-watchman'
+#
+my ($version, $time) = @ARGV;
+print STDERR "$0 $version $time\n";
+
+# Check the hook interface version
+
+if ($version == 1) {
+	# convert nanoseconds to seconds
+	$time = int $time / 1000000000;
+} else {
+	die "Unsupported query-fsmonitor hook version '$version'.\n" .
+	    "Falling back to scanning...\n";
+}
+
+my $git_work_tree = $ENV{'PWD'};
+
+my $retry = 1;
+
+launch_watchman();
+
+sub launch_watchman {
+
+	# Set input record separator
+	local $/ = 0666;
+
+	my $pid = open2(\*CHLD_OUT, \*CHLD_IN, 'watchman -j')
+	    or die "open2() failed: $!\n" .
+	    "Falling back to scanning...\n";
+
+	# In the query expression below we're asking for names of files that
+	# changed since $time but were not transient (ie created after
+	# $time but no longer exist).
+	#
+	# To accomplish this, we're using the "since" generator to use the
+	# recency index to select candidate nodes and "fields" to limit the
+	# output to file names only. Then we're using the "expression" term to
+	# further constrain the results.
+	#
+	# The category of transient files that we want to ignore will have a
+	# creation clock (cclock) newer than $time_t value and will also not
+	# currently exist.
+
+	open (my $fh, ">", ".git/watchman-query.json");
+	print $fh "[\"query\", \"$git_work_tree\", { \
+	\"since\": $time, \
+	\"fields\": [\"name\"], \
+	\"expression\": [\"not\", [\"allof\", [\"since\", $time, \"cclock\"], [\"not\", \"exists\"]]] \
+	}]";
+	close $fh;
+
+	print CHLD_IN "[\"query\", \"$git_work_tree\", { \
+	\"since\": $time, \
+	\"fields\": [\"name\"], \
+	\"expression\": [\"not\", [\"allof\", [\"since\", $time, \"cclock\"], [\"not\", \"exists\"]]] \
+	}]";
+
+	my $response = <CHLD_OUT>;
+
+	open ($fh, ">", ".git/watchman-response.json");
+	print $fh $response;
+	close $fh;
+
+	die "Watchman: command returned no output.\n" .
+	    "Falling back to scanning...\n" if $response eq "";
+	die "Watchman: command returned invalid output: $response\n" .
+	    "Falling back to scanning...\n" unless $response =~ /^\{/;
+
+	my $json_pkg;
+	eval {
+		require JSON::XS;
+		$json_pkg = "JSON::XS";
+		1;
+	} or do {
+		require JSON::PP;
+		$json_pkg = "JSON::PP";
+	};
+
+	my $o = $json_pkg->new->utf8->decode($response);
+
+	if ($retry > 0 and $o->{error} and $o->{error} =~ m/unable to resolve root .* directory (.*) is not watched/) {
+		print STDERR "Adding '$git_work_tree' to watchman's watch list.\n";
+		$retry--;
+		qx/watchman watch "$git_work_tree"/;
+		die "Failed to make watchman watch '$git_work_tree'.\n" .
+		    "Falling back to scanning...\n" if $? != 0;
+		# return fast "everything is dirty" flag"
+		print "*\0";
+		open ($fh, ">", ".git/watchman-output.out");
+		print "*\0";
+		close $fh;
+
+		# Watchman will always return all files on the first query so
+		# return the fast "everything is dirty" flag to git and do the
+		# Watchman query just to get it over with now so we won't pay
+		# the cost in git to look up each individual file.
+		print "*\0";
+		eval { launch_watchman() };
+		exit 0;
+	}
+
+	die "Watchman: $o->{error}.\n" .
+	    "Falling back to scanning...\n" if $o->{error};
+
+	binmode STDOUT, ":utf8";
+	local $, = "\0";
+	print @{$o->{files}};
+
+	open ($fh, ">", ".git/watchman-output.out");
+	print $fh @{$o->{files}};
+	close $fh;
+}
\ No newline at end of file
-- 
2.14.1.548.ge54b1befee.dirty


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v6 11/12] fsmonitor: add a sample integration script for Watchman
  2017-09-15 19:20 ` [PATCH v6 00/12] " Ben Peart
                     ` (9 preceding siblings ...)
  2017-09-15 19:20   ` [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor extension Ben Peart
@ 2017-09-15 19:20   ` Ben Peart
  2017-09-15 19:20   ` [PATCH v6 12/12] fsmonitor: add a performance test Ben Peart
  2017-09-19 19:27   ` [PATCH v7 00/12] Fast git status via a file system watcher Ben Peart
  12 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-15 19:20 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

This script integrates the new fsmonitor capabilities of git with the
cross platform Watchman file watching service. To use the script:

Download and install Watchman from https://facebook.github.io/watchman/.
Rename the sample integration hook from fsmonitor-watchman.sample to
fsmonitor-watchman. Configure git to use the extension:

git config core.fsmonitor .git/hooks/fsmonitor-watchman

Optionally turn on the untracked cache for optimal performance.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Christian Couder <christian.couder@gmail.com>
---
 templates/hooks--fsmonitor-watchman.sample | 119 +++++++++++++++++++++++++++++
 1 file changed, 119 insertions(+)
 create mode 100755 templates/hooks--fsmonitor-watchman.sample

diff --git a/templates/hooks--fsmonitor-watchman.sample b/templates/hooks--fsmonitor-watchman.sample
new file mode 100755
index 0000000000..2779d7edf3
--- /dev/null
+++ b/templates/hooks--fsmonitor-watchman.sample
@@ -0,0 +1,119 @@
+#!/usr/bin/perl
+
+use strict;
+use warnings;
+use IPC::Open2;
+
+# An example hook script to integrate Watchman
+# (https://facebook.github.io/watchman/) with git to speed up detecting
+# new and modified files.
+#
+# The hook is passed a version (currently 1) and a time in nanoseconds
+# formatted as a string and outputs to stdout all files that have been
+# modified since the given time. Paths must be relative to the root of
+# the working tree and separated by a single NUL.
+#
+# To enable this hook, rename this file to "query-watchman" and set
+# 'git config core.fsmonitor .git/hooks/query-watchman'
+#
+my ($version, $time) = @ARGV;
+
+# Check the hook interface version
+
+if ($version == 1) {
+	# convert nanoseconds to seconds
+	$time = int $time / 1000000000;
+} else {
+	die "Unsupported query-fsmonitor hook version '$version'.\n" .
+	    "Falling back to scanning...\n";
+}
+
+# Convert unix style paths to escaped Windows style paths when running
+# in Windows command prompt
+
+my $system = `uname -s`;
+$system =~ s/[\r\n]+//g;
+my $git_work_tree;
+
+if ($system =~ m/^MSYS_NT/) {
+	$git_work_tree = `cygpath -aw "\$PWD"`;
+	$git_work_tree =~ s/[\r\n]+//g;
+	$git_work_tree =~ s,\\,/,g;
+} else {
+	$git_work_tree = $ENV{'PWD'};
+}
+
+my $retry = 1;
+
+launch_watchman();
+
+sub launch_watchman {
+
+	# Set input record separator
+	local $/ = 0666;
+
+	my $pid = open2(\*CHLD_OUT, \*CHLD_IN, 'watchman -j')
+	    or die "open2() failed: $!\n" .
+	    "Falling back to scanning...\n";
+
+	# In the query expression below we're asking for names of files that
+	# changed since $time but were not transient (ie created after
+	# $time but no longer exist).
+	#
+	# To accomplish this, we're using the "since" generator to use the
+	# recency index to select candidate nodes and "fields" to limit the
+	# output to file names only. Then we're using the "expression" term to
+	# further constrain the results.
+	#
+	# The category of transient files that we want to ignore will have a
+	# creation clock (cclock) newer than $time_t value and will also not
+	# currently exist.
+
+	print CHLD_IN "[\"query\", \"$git_work_tree\", { \
+	\"since\": $time, \
+	\"fields\": [\"name\"], \
+	\"expression\": [\"not\", [\"allof\", [\"since\", $time, \"cclock\"], [\"not\", \"exists\"]]] \
+	}]";
+
+	my $response = <CHLD_OUT>;
+
+	die "Watchman: command returned no output.\n" .
+	    "Falling back to scanning...\n" if $response eq "";
+	die "Watchman: command returned invalid output: $response\n" .
+	    "Falling back to scanning...\n" unless $response =~ /^\{/;
+
+	my $json_pkg;
+	eval {
+		require JSON::XS;
+		$json_pkg = "JSON::XS";
+		1;
+	} or do {
+		require JSON::PP;
+		$json_pkg = "JSON::PP";
+	};
+
+	my $o = $json_pkg->new->utf8->decode($response);
+
+	if ($retry > 0 and $o->{error} and $o->{error} =~ m/unable to resolve root .* directory (.*) is not watched/) {
+		print STDERR "Adding '$git_work_tree' to watchman's watch list.\n";
+		$retry--;
+		qx/watchman watch "$git_work_tree"/;
+		die "Failed to make watchman watch '$git_work_tree'.\n" .
+		    "Falling back to scanning...\n" if $? != 0;
+
+		# Watchman will always return all files on the first query so
+		# return the fast "everything is dirty" flag to git and do the
+		# Watchman query just to get it over with now so we won't pay
+		# the cost in git to look up each individual file.
+		print "*\0";
+		eval { launch_watchman() };
+		exit 0;
+	}
+
+	die "Watchman: $o->{error}.\n" .
+	    "Falling back to scanning...\n" if $o->{error};
+
+	binmode STDOUT, ":utf8";
+	local $, = "\0";
+	print @{$o->{files}};
+}
-- 
2.14.1.548.ge54b1befee.dirty


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v6 12/12] fsmonitor: add a performance test
  2017-09-15 19:20 ` [PATCH v6 00/12] " Ben Peart
                     ` (10 preceding siblings ...)
  2017-09-15 19:20   ` [PATCH v6 11/12] fsmonitor: add a sample integration script for Watchman Ben Peart
@ 2017-09-15 19:20   ` Ben Peart
  2017-09-15 21:56     ` David Turner
  2017-09-18 14:24     ` Johannes Schindelin
  2017-09-19 19:27   ` [PATCH v7 00/12] Fast git status via a file system watcher Ben Peart
  12 siblings, 2 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-15 19:20 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

Add a test utility (test-drop-caches) that flushes all changes to disk
then drops file system cache on Windows, Linux, and OSX.

Add a perf test (p7519-fsmonitor.sh) for fsmonitor.

By default, the performance test will utilize the Watchman file system
monitor if it is installed.  If Watchman is not installed, it will use a
dummy integration script that does not report any new or modified files.
The dummy script has very little overhead which provides optimistic results.

The performance test will also use the untracked cache feature if it is
available as fsmonitor uses it to speed up scanning for untracked files.

There are 3 environment variables that can be used to alter the default
behavior of the performance test:

GIT_PERF_7519_UNTRACKED_CACHE: used to configure core.untrackedCache
GIT_PERF_7519_SPLIT_INDEX: used to configure core.splitIndex
GIT_PERF_7519_FSMONITOR: used to configure core.fsMonitor

The big win for using fsmonitor is the elimination of the need to scan the
working directory looking for changed and untracked files. If the file
information is all cached in RAM, the benefits are reduced.

GIT_PERF_7519_DROP_CACHE: if set, the OS caches are dropped between tests

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Makefile                    |   1 +
 t/helper/.gitignore         |   1 +
 t/helper/test-drop-caches.c | 161 ++++++++++++++++++++++++++++++++++++++
 t/perf/p7519-fsmonitor.sh   | 184 ++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 347 insertions(+)
 create mode 100644 t/helper/test-drop-caches.c
 create mode 100755 t/perf/p7519-fsmonitor.sh

diff --git a/Makefile b/Makefile
index d970cd00e9..b2653ee64f 100644
--- a/Makefile
+++ b/Makefile
@@ -638,6 +638,7 @@ TEST_PROGRAMS_NEED_X += test-ctype
 TEST_PROGRAMS_NEED_X += test-config
 TEST_PROGRAMS_NEED_X += test-date
 TEST_PROGRAMS_NEED_X += test-delta
+TEST_PROGRAMS_NEED_X += test-drop-caches
 TEST_PROGRAMS_NEED_X += test-dump-cache-tree
 TEST_PROGRAMS_NEED_X += test-dump-fsmonitor
 TEST_PROGRAMS_NEED_X += test-dump-split-index
diff --git a/t/helper/.gitignore b/t/helper/.gitignore
index 721650256e..f9328eebdd 100644
--- a/t/helper/.gitignore
+++ b/t/helper/.gitignore
@@ -3,6 +3,7 @@
 /test-config
 /test-date
 /test-delta
+/test-drop-caches
 /test-dump-cache-tree
 /test-dump-split-index
 /test-dump-untracked-cache
diff --git a/t/helper/test-drop-caches.c b/t/helper/test-drop-caches.c
new file mode 100644
index 0000000000..717079865c
--- /dev/null
+++ b/t/helper/test-drop-caches.c
@@ -0,0 +1,161 @@
+#include "git-compat-util.h"
+
+#if defined(GIT_WINDOWS_NATIVE)
+
+int cmd_sync(void)
+{
+	char Buffer[MAX_PATH];
+	DWORD dwRet;
+	char szVolumeAccessPath[] = "\\\\.\\X:";
+	HANDLE hVolWrite;
+	int success = 0;
+
+	dwRet = GetCurrentDirectory(MAX_PATH, Buffer);
+	if ((0 == dwRet) || (dwRet > MAX_PATH))
+		return error("Error getting current directory");
+
+	if ((Buffer[0] < 'A') || (Buffer[0] > 'Z'))
+		return error("Invalid drive letter '%c'", Buffer[0]);
+
+	szVolumeAccessPath[4] = Buffer[0];
+	hVolWrite = CreateFile(szVolumeAccessPath, GENERIC_READ | GENERIC_WRITE,
+		FILE_SHARE_READ | FILE_SHARE_WRITE, NULL, OPEN_EXISTING, 0, NULL);
+	if (INVALID_HANDLE_VALUE == hVolWrite)
+		return error("Unable to open volume for writing, need admin access");
+
+	success = FlushFileBuffers(hVolWrite);
+	if (!success)
+		error("Unable to flush volume");
+
+	CloseHandle(hVolWrite);
+
+	return !success;
+}
+
+#define STATUS_SUCCESS			(0x00000000L)
+#define STATUS_PRIVILEGE_NOT_HELD	(0xC0000061L)
+
+typedef enum _SYSTEM_INFORMATION_CLASS {
+	SystemMemoryListInformation = 80,
+} SYSTEM_INFORMATION_CLASS;
+
+typedef enum _SYSTEM_MEMORY_LIST_COMMAND {
+	MemoryCaptureAccessedBits,
+	MemoryCaptureAndResetAccessedBits,
+	MemoryEmptyWorkingSets,
+	MemoryFlushModifiedList,
+	MemoryPurgeStandbyList,
+	MemoryPurgeLowPriorityStandbyList,
+	MemoryCommandMax
+} SYSTEM_MEMORY_LIST_COMMAND;
+
+BOOL GetPrivilege(HANDLE TokenHandle, LPCSTR lpName, int flags)
+{
+	BOOL bResult;
+	DWORD dwBufferLength;
+	LUID luid;
+	TOKEN_PRIVILEGES tpPreviousState;
+	TOKEN_PRIVILEGES tpNewState;
+
+	dwBufferLength = 16;
+	bResult = LookupPrivilegeValueA(0, lpName, &luid);
+	if (bResult) {
+		tpNewState.PrivilegeCount = 1;
+		tpNewState.Privileges[0].Luid = luid;
+		tpNewState.Privileges[0].Attributes = 0;
+		bResult = AdjustTokenPrivileges(TokenHandle, 0, &tpNewState,
+			(DWORD)((LPBYTE)&(tpNewState.Privileges[1]) - (LPBYTE)&tpNewState),
+			&tpPreviousState, &dwBufferLength);
+		if (bResult) {
+			tpPreviousState.PrivilegeCount = 1;
+			tpPreviousState.Privileges[0].Luid = luid;
+			tpPreviousState.Privileges[0].Attributes = flags != 0 ? 2 : 0;
+			bResult = AdjustTokenPrivileges(TokenHandle, 0, &tpPreviousState,
+				dwBufferLength, 0, 0);
+		}
+	}
+	return bResult;
+}
+
+int cmd_dropcaches(void)
+{
+	HANDLE hProcess = GetCurrentProcess();
+	HANDLE hToken;
+	int status;
+
+	if (!OpenProcessToken(hProcess, TOKEN_QUERY | TOKEN_ADJUST_PRIVILEGES, &hToken))
+		return error("Can't open current process token");
+
+	if (!GetPrivilege(hToken, "SeProfileSingleProcessPrivilege", 1))
+		return error("Can't get SeProfileSingleProcessPrivilege");
+
+	CloseHandle(hToken);
+
+	HMODULE ntdll = LoadLibrary("ntdll.dll");
+	if (!ntdll)
+		return error("Can't load ntdll.dll, wrong Windows version?");
+
+	DWORD(WINAPI *NtSetSystemInformation)(INT, PVOID, ULONG) =
+		(DWORD(WINAPI *)(INT, PVOID, ULONG))GetProcAddress(ntdll, "NtSetSystemInformation");
+	if (!NtSetSystemInformation)
+		return error("Can't get function addresses, wrong Windows version?");
+
+	SYSTEM_MEMORY_LIST_COMMAND command = MemoryPurgeStandbyList;
+	status = NtSetSystemInformation(
+		SystemMemoryListInformation,
+		&command,
+		sizeof(SYSTEM_MEMORY_LIST_COMMAND)
+	);
+	if (status == STATUS_PRIVILEGE_NOT_HELD)
+		error("Insufficient privileges to purge the standby list, need admin access");
+	else if (status != STATUS_SUCCESS)
+		error("Unable to execute the memory list command %d", status);
+
+	FreeLibrary(ntdll);
+
+	return status;
+}
+
+#elif defined(__linux__)
+
+int cmd_sync(void)
+{
+	return system("sync");
+}
+
+int cmd_dropcaches(void)
+{
+	return system("echo 3 | sudo tee /proc/sys/vm/drop_caches");
+}
+
+#elif defined(__APPLE__)
+
+int cmd_sync(void)
+{
+	return system("sync");
+}
+
+int cmd_dropcaches(void)
+{
+	return system("sudo purge");
+}
+
+#else
+
+int cmd_sync(void)
+{
+	return 0;
+}
+
+int cmd_dropcaches(void)
+{
+	return error("drop caches not implemented on this platform");
+}
+
+#endif
+
+int cmd_main(int argc, const char **argv)
+{
+	cmd_sync();
+	return cmd_dropcaches();
+}
diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
new file mode 100755
index 0000000000..1c5978d5c8
--- /dev/null
+++ b/t/perf/p7519-fsmonitor.sh
@@ -0,0 +1,184 @@
+#!/bin/sh
+
+test_description="Test core.fsmonitor"
+
+. ./perf-lib.sh
+
+#
+# Performance test for the fsmonitor feature which enables git to talk to a
+# file system change monitor and avoid having to scan the working directory
+# for new or modified files.
+#
+# By default, the performance test will utilize the Watchman file system
+# monitor if it is installed.  If Watchman is not installed, it will use a
+# dummy integration script that does not report any new or modified files.
+# The dummy script has very little overhead which provides optimistic results.
+#
+# The performance test will also use the untracked cache feature if it is
+# available as fsmonitor uses it to speed up scanning for untracked files.
+#
+# There are 3 environment variables that can be used to alter the default
+# behavior of the performance test:
+#
+# GIT_PERF_7519_UNTRACKED_CACHE: used to configure core.untrackedCache
+# GIT_PERF_7519_SPLIT_INDEX: used to configure core.splitIndex
+# GIT_PERF_7519_FSMONITOR: used to configure core.fsMonitor
+#
+# The big win for using fsmonitor is the elimination of the need to scan the
+# working directory looking for changed and untracked files. If the file
+# information is all cached in RAM, the benefits are reduced.
+#
+# GIT_PERF_7519_DROP_CACHE: if set, the OS caches are dropped between tests
+#
+
+test_perf_large_repo
+test_checkout_worktree
+
+test_lazy_prereq UNTRACKED_CACHE '
+	{ git update-index --test-untracked-cache; ret=$?; } &&
+	test $ret -ne 1
+'
+
+test_lazy_prereq WATCHMAN '
+	{ command -v watchman >/dev/null 2>&1; ret=$?; } &&
+	test $ret -ne 1
+'
+
+if test_have_prereq WATCHMAN
+then
+	# Convert unix style paths to escaped Windows style paths for Watchman
+	case "$(uname -s)" in
+	MSYS_NT*)
+	  GIT_WORK_TREE="$(cygpath -aw "$PWD" | sed 's,\\,/,g')"
+	  ;;
+	*)
+	  GIT_WORK_TREE="$PWD"
+	  ;;
+	esac
+fi
+
+if test -n "$GIT_PERF_7519_DROP_CACHE"
+then
+	# When using GIT_PERF_7519_DROP_CACHE, GIT_PERF_REPEAT_COUNT must be 1 to
+	# generate valid results. Otherwise the caching that happens for the nth
+	# run will negate the validity of the comparisons.
+	if test "$GIT_PERF_REPEAT_COUNT" -ne 1
+	then
+		echo "warning: Setting GIT_PERF_REPEAT_COUNT=1" >&2
+		GIT_PERF_REPEAT_COUNT=1
+	fi
+fi
+
+test_expect_success "setup for fsmonitor" '
+	# set untrackedCache depending on the environment
+	if test -n "$GIT_PERF_7519_UNTRACKED_CACHE"
+	then
+		git config core.untrackedCache "$GIT_PERF_7519_UNTRACKED_CACHE"
+	else
+		if test_have_prereq UNTRACKED_CACHE
+		then
+			git config core.untrackedCache true
+		else
+			git config core.untrackedCache false
+		fi
+	fi &&
+
+	# set core.splitindex depending on the environment
+	if test -n "$GIT_PERF_7519_SPLIT_INDEX"
+	then
+		git config core.splitIndex "$GIT_PERF_7519_SPLIT_INDEX"
+	fi &&
+
+	# set INTEGRATION_SCRIPT depending on the environment
+	if test -n "$GIT_PERF_7519_FSMONITOR"
+	then
+		INTEGRATION_SCRIPT="$GIT_PERF_7519_FSMONITOR"
+	else
+		#
+		# Choose integration script based on existance of Watchman.
+		# If Watchman exists, watch the work tree and attempt a query.
+		# If everything succeeds, use Watchman integration script,
+		# else fall back to an empty integration script.
+		#
+		mkdir .git/hooks &&
+		if test_have_prereq WATCHMAN
+		then
+			INTEGRATION_SCRIPT=".git/hooks/fsmonitor-watchman" &&
+			cp "$TEST_DIRECTORY/../templates/hooks--fsmonitor-watchman.sample" "$INTEGRATION_SCRIPT" &&
+			watchman watch "$GIT_WORK_TREE" &&
+			watchman watch-list | grep -q -F "$GIT_WORK_TREE"
+		else
+			INTEGRATION_SCRIPT=".git/hooks/fsmonitor-empty" &&
+			write_script "$INTEGRATION_SCRIPT"<<-\EOF
+			EOF
+		fi
+	fi &&
+
+	git config core.fsmonitor "$INTEGRATION_SCRIPT" &&
+	git update-index --fsmonitor
+'
+
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-drop-caches
+fi
+
+test_perf "status (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git status
+'
+
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-drop-caches
+fi
+
+test_perf "status -uno (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git status -uno
+'
+
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-drop-caches
+fi
+
+test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git status -uall
+'
+
+test_expect_success "setup without fsmonitor" '
+	unset INTEGRATION_SCRIPT &&
+	git config --unset core.fsmonitor &&
+	git update-index --no-fsmonitor
+'
+
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-drop-caches
+fi
+
+test_perf "status (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git status
+'
+
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-drop-caches
+fi
+
+test_perf "status -uno (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git status -uno
+'
+
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-drop-caches
+fi
+
+test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git status -uall
+'
+
+if test_have_prereq WATCHMAN
+then
+	watchman watch-del "$GIT_WORK_TREE" >/dev/null 2>&1 &&
+
+	# Work around Watchman bug on Windows where it holds on to handles
+	# preventing the removal of the trash directory
+	watchman shutdown-server >/dev/null 2>&1
+fi
+
+test_done
-- 
2.14.1.548.ge54b1befee.dirty


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* RE: [PATCH v6 05/12] fsmonitor: add documentation for the fsmonitor extension.
  2017-09-15 19:20   ` [PATCH v6 05/12] fsmonitor: add documentation for the fsmonitor extension Ben Peart
@ 2017-09-15 19:43     ` David Turner
  2017-09-18 13:27       ` Ben Peart
  2017-09-17  8:03     ` Junio C Hamano
  1 sibling, 1 reply; 137+ messages in thread
From: David Turner @ 2017-09-15 19:43 UTC (permalink / raw)
  To: 'Ben Peart'
  Cc: avarab@gmail.com, christian.couder@gmail.com, git@vger.kernel.org,
	gitster@pobox.com, johannes.schindelin@gmx.de, pclouds@gmail.com,
	peff@peff.net



> -----Original Message-----
> From: Ben Peart [mailto:benpeart@microsoft.com]
> Sent: Friday, September 15, 2017 3:21 PM
> To: benpeart@microsoft.com
> Cc: David Turner <David.Turner@twosigma.com>; avarab@gmail.com;
> christian.couder@gmail.com; git@vger.kernel.org; gitster@pobox.com;
> johannes.schindelin@gmx.de; pclouds@gmail.com; peff@peff.net
> Subject: [PATCH v6 05/12] fsmonitor: add documentation for the fsmonitor
> extension.
> 
> This includes the core.fsmonitor setting, the query-fsmonitor hook, and the
> fsmonitor index extension.
> 
> Signed-off-by: Ben Peart <benpeart@microsoft.com>
> ---
>  Documentation/config.txt                 |  6 ++++++
>  Documentation/githooks.txt               | 23 +++++++++++++++++++++++
>  Documentation/technical/index-format.txt | 19 +++++++++++++++++++
>  3 files changed, 48 insertions(+)
> 
> diff --git a/Documentation/config.txt b/Documentation/config.txt index
> dc4e3f58a2..c196007a27 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -413,6 +413,12 @@ core.protectNTFS::
>  	8.3 "short" names.
>  	Defaults to `true` on Windows, and `false` elsewhere.
> 
> +core.fsmonitor::
> +	If set, the value of this variable is used as a command which
> +	will identify all files that may have changed since the
> +	requested date/time. This information is used to speed up git by
> +	avoiding unnecessary processing of files that have not changed.

I'm confused here.  You have a file called "fsmonitor-watchman", which seems to discuss the protocol for core.fsmonitor scripts in general, and you have this documentation, which does not link to that file.  Can you clarify this? 

<snip>

> +The hook should output to stdout the list of all files in the working
> +directory that may have changed since the requested time.  The logic
> +should be inclusive so that it does not miss any potential changes.

+"It is OK to include files which have not actually changed.  Newly-created and deleted files should also be included.  When files are renamed, both the old and the new name should be included."

Also, please discuss case sensitivity issues (e.g. on OS X).  

> +The paths should be relative to the root of the working directory and
> +be separated by a single NUL.

<snip>

> +  - 32-bit version number: the current supported version is 1.
> +
> +  - 64-bit time: the extension data reflects all changes through the given
> +	time which is stored as the nanoseconds elapsed since midnight,
> +	January 1, 1970.

Nit: Please specify signed or unsigned for these.  (I expect to be getting out of 
cryosleep around 2262, and I want to know if my old git repos will keep working...)

> +  - 32-bit bitmap size: the size of the CE_FSMONITOR_VALID bitmap.
> +
> +  - An ewah bitmap, the n-th bit indicates whether the n-th index entry
> +    is not CE_FSMONITOR_VALID.


^ permalink raw reply	[flat|nested] 137+ messages in thread

* RE: [PATCH v6 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit
  2017-09-15 19:20   ` [PATCH v6 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit Ben Peart
@ 2017-09-15 20:34     ` David Turner
  0 siblings, 0 replies; 137+ messages in thread
From: David Turner @ 2017-09-15 20:34 UTC (permalink / raw)
  To: 'Ben Peart'
  Cc: avarab@gmail.com, christian.couder@gmail.com, git@vger.kernel.org,
	gitster@pobox.com, johannes.schindelin@gmx.de, pclouds@gmail.com,
	peff@peff.net

> -----Original Message-----
> From: Ben Peart [mailto:benpeart@microsoft.com]
> Sent: Friday, September 15, 2017 3:21 PM
> To: benpeart@microsoft.com
> Cc: David Turner <David.Turner@twosigma.com>; avarab@gmail.com;
> christian.couder@gmail.com; git@vger.kernel.org; gitster@pobox.com;
> johannes.schindelin@gmx.de; pclouds@gmail.com; peff@peff.net
> Subject: [PATCH v6 06/12] ls-files: Add support in ls-files to display the fsmonitor
> valid bit
> 
> Add a new command line option (-f) to ls-files to have it use lowercase letters
> for 'fsmonitor valid' files

Document in man page, please.

^ permalink raw reply	[flat|nested] 137+ messages in thread

* RE: [PATCH v6 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  2017-09-15 19:20   ` [PATCH v6 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
@ 2017-09-15 21:35     ` David Turner
  2017-09-18 13:07       ` Ben Peart
  0 siblings, 1 reply; 137+ messages in thread
From: David Turner @ 2017-09-15 21:35 UTC (permalink / raw)
  To: 'Ben Peart'
  Cc: avarab@gmail.com, christian.couder@gmail.com, git@vger.kernel.org,
	gitster@pobox.com, johannes.schindelin@gmx.de, pclouds@gmail.com,
	peff@peff.net

> -----Original Message-----
> From: Ben Peart [mailto:benpeart@microsoft.com]
> Sent: Friday, September 15, 2017 3:21 PM
> To: benpeart@microsoft.com
> Cc: David Turner <David.Turner@twosigma.com>; avarab@gmail.com;
> christian.couder@gmail.com; git@vger.kernel.org; gitster@pobox.com;
> johannes.schindelin@gmx.de; pclouds@gmail.com; peff@peff.net
> Subject: [PATCH v6 04/12] fsmonitor: teach git to optionally utilize a file system
> monitor to speed up detecting new or changed files.
 
> +int git_config_get_fsmonitor(void)
> +{
> +	if (git_config_get_pathname("core.fsmonitor", &core_fsmonitor))
> +		core_fsmonitor = getenv("GIT_FSMONITOR_TEST");
> +
> +	if (core_fsmonitor && !*core_fsmonitor)
> +		core_fsmonitor = NULL;
> +
> +	if (core_fsmonitor)
> +		return 1;
> +
> +	return 0;
> +}

This functions return values are backwards relative to the rest of the git_config_* functions.

[snip]

+>	/*
+>	 * With fsmonitor, we can trust the untracked cache's valid field.
+>	 */

[snip]

> +int read_fsmonitor_extension(struct index_state *istate, const void *data,
> +	unsigned long sz)
> +{

If git_config_get_fsmonitor returns 0, fsmonitor_dirty will leak.

[snip]

> +	/* a fsmonitor process can return '*' to indicate all entries are invalid */

That's not documented in your documentation.  Also, I'm not sure I like it: what 
if I have a file whose name starts with '*'?  Yeah, that would be silly, but this indicates the need 
for the protocol to have some sort of signaling mechanism that's out-of-band  Maybe 
have some key\0value\0 pairs and then \0\0 and then the list of files?  Or, if you want to keep
it really simple, allow an entry of '/' (which is an invalid filename) to mean 'all'.

> +void add_fsmonitor(struct index_state *istate) {
> +	int i;
> +
> +	if (!istate->fsmonitor_last_update) {
[snip]
> +		/* reset the untracked cache */

Is this really necessary?  Shouldn't the untracked cache be in a correct state already? 

> +/*
> + * Clear the given cache entries CE_FSMONITOR_VALID bit and invalidate

Nit: "s/entries/entry's/".
 


^ permalink raw reply	[flat|nested] 137+ messages in thread

* RE: [PATCH v6 12/12] fsmonitor: add a performance test
  2017-09-15 19:20   ` [PATCH v6 12/12] fsmonitor: add a performance test Ben Peart
@ 2017-09-15 21:56     ` David Turner
  2017-09-18 14:24     ` Johannes Schindelin
  1 sibling, 0 replies; 137+ messages in thread
From: David Turner @ 2017-09-15 21:56 UTC (permalink / raw)
  To: 'Ben Peart'
  Cc: avarab@gmail.com, christian.couder@gmail.com, git@vger.kernel.org,
	gitster@pobox.com, johannes.schindelin@gmx.de, pclouds@gmail.com,
	peff@peff.net


> -----Original Message-----
> +		# Choose integration script based on existance of Watchman.

Spelling: existence



^ permalink raw reply	[flat|nested] 137+ messages in thread

* RE: [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor extension
  2017-09-15 19:20   ` [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor extension Ben Peart
@ 2017-09-15 22:00     ` David Turner
  2017-09-19 19:32       ` David Turner
  2017-09-16 15:27     ` Torsten Bögershausen
  2017-09-17  4:47     ` Junio C Hamano
  2 siblings, 1 reply; 137+ messages in thread
From: David Turner @ 2017-09-15 22:00 UTC (permalink / raw)
  To: 'Ben Peart'
  Cc: avarab@gmail.com, christian.couder@gmail.com, git@vger.kernel.org,
	gitster@pobox.com, johannes.schindelin@gmx.de, pclouds@gmail.com,
	peff@peff.net

> -----Original Message-----
> +dirty_repo () {
> +	: >untracked &&
> +	: >dir1/untracked &&
> +	: >dir2/untracked &&
> +	echo 1 >modified &&
> +	echo 2 >dir1/modified &&
> +	echo 3 >dir2/modified &&
> +	echo 4 >new &&
> +	echo 5 >dir1/new &&
> +	echo 6 >dir2/new

If I add an untracked file named dir3/untracked to dirty_repo
 (and write_integration_script), then "status doesn't detect 
unreported modifications", below, fails.  Did I do something 
wrong, or does this turn up a bug?

> +	test_expect_success "setup preloadIndex to $preload_val" '
> +		git config core.preloadIndex $preload_val &&
> +		if [ $preload_val -eq true ]

"-eq" is for numeric equality in POSIX shell.  So this works if your 
/bin/sh is bash but not if it's e.g. dash.  This happens twice more 
below.  Use "=" instead.



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor extension
  2017-09-15 19:20   ` [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor extension Ben Peart
  2017-09-15 22:00     ` David Turner
@ 2017-09-16 15:27     ` Torsten Bögershausen
  2017-09-17  5:43       ` [PATCH v1 1/1] test-lint: echo -e (or -E) is not portable tboegi
  2017-09-18 14:06       ` [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor extension Ben Peart
  2017-09-17  4:47     ` Junio C Hamano
  2 siblings, 2 replies; 137+ messages in thread
From: Torsten Bögershausen @ 2017-09-16 15:27 UTC (permalink / raw)
  To: Ben Peart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

On 2017-09-15 21:20, Ben Peart wrote:
> +if [ "$1" != 1 ]
> +then
> +	echo -e "Unsupported core.fsmonitor hook version.\n" >&2
> +	exit 1
> +fi

The echo -e not portable
(It was detected by a tighter version of the lint script,
 which I have here, but not yet send to the list :-(

This will do:
echo  "Unsupported core.fsmonitor hook version." >&2

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor extension
  2017-09-15 19:20   ` [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor extension Ben Peart
  2017-09-15 22:00     ` David Turner
  2017-09-16 15:27     ` Torsten Bögershausen
@ 2017-09-17  4:47     ` Junio C Hamano
  2017-09-18 15:25       ` Ben Peart
  2 siblings, 1 reply; 137+ messages in thread
From: Junio C Hamano @ 2017-09-17  4:47 UTC (permalink / raw)
  To: Ben Peart
  Cc: David.Turner, avarab, christian.couder, git, johannes.schindelin,
	pclouds, peff

Ben Peart <benpeart@microsoft.com> writes:

> +write_integration_script() {
> +	write_script .git/hooks/fsmonitor-test<<-\EOF
> +	if [ "$#" -ne 2 ]; then
> +		echo "$0: exactly 2 arguments expected"
> +		exit 2
> +	fi
> +	if [ "$1" != 1 ]; then
> +		echo -e "Unsupported core.fsmonitor hook version.\n" >&2
> +		exit 1
> +	fi

In addition to "echo -e" thing pointed out earlier, these look
somewhat unusual in our shell scripts, relative to what
Documentation/CodingGuidelines tells us to do:

 - We prefer a space between the function name and the parentheses,
   and no space inside the parentheses. The opening "{" should also
   be on the same line.

	(incorrect)
	my_function(){
		...

	(correct)
	my_function () {
		...

 - We prefer "test" over "[ ... ]".

 - Do not write control structures on a single line with semicolon.
   "then" should be on the next line for if statements, and "do"
   should be on the next line for "while" and "for".

	(incorrect)
	if test -f hello; then
		do this
	fi

	(correct)
	if test -f hello
	then
		do this
	fi

> diff --git a/t/t7519/fsmonitor-watchman b/t/t7519/fsmonitor-watchman
> new file mode 100755
> index 0000000000..aaee5d1fe3
> --- /dev/null
> +++ b/t/t7519/fsmonitor-watchman
> @@ -0,0 +1,128 @@
> +#!/usr/bin/perl
> +
> +use strict;
> +use warnings;
> +use IPC::Open2;
> + ...
> +	open (my $fh, ">", ".git/watchman-query.json");
> +	print $fh "[\"query\", \"$git_work_tree\", { \
> +	\"since\": $time, \
> +	\"fields\": [\"name\"], \
> +	\"expression\": [\"not\", [\"allof\", [\"since\", $time, \"cclock\"], [\"not\", \"exists\"]]] \
> +	}]";
> +	close $fh;
> +
> +	print CHLD_IN "[\"query\", \"$git_work_tree\", { \
> +	\"since\": $time, \
> +	\"fields\": [\"name\"], \
> +	\"expression\": [\"not\", [\"allof\", [\"since\", $time, \"cclock\"], [\"not\", \"exists\"]]] \
> +	}]";

This look painful to read, write and maintain.  IIRC, Perl supports
the <<HERE document syntax quite similar to shell; would it make
these "print" we see above easier?

> +}
> \ No newline at end of file

Oops.

Thanks.

^ permalink raw reply	[flat|nested] 137+ messages in thread

* [PATCH v1 1/1] test-lint: echo -e (or -E) is not portable
  2017-09-16 15:27     ` Torsten Bögershausen
@ 2017-09-17  5:43       ` tboegi
  2017-09-19 20:37         ` Jonathan Nieder
  2017-09-18 14:06       ` [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor extension Ben Peart
  1 sibling, 1 reply; 137+ messages in thread
From: tboegi @ 2017-09-17  5:43 UTC (permalink / raw)
  To: git, benpeart; +Cc: Torsten Bögershausen

From: Torsten Bögershausen <tboegi@web.de>

Some implementations of `echo` support the '-e' option to enable
backslash interpretation of the following string.
As an addition, they support '-E' to turn it off.

However, none of these are portable, POSIX doesn't even mention them,
and many implementations don't support them.

A check for '-n' is already done in check-non-portable-shell.pl,
extend it to cover '-n', '-e' or '-E-'

Signed-off-by: Torsten Bögershausen <tboegi@web.de>
---
 t/check-non-portable-shell.pl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/check-non-portable-shell.pl b/t/check-non-portable-shell.pl
index b170cbc045..03dc9d2852 100755
--- a/t/check-non-portable-shell.pl
+++ b/t/check-non-portable-shell.pl
@@ -17,7 +17,7 @@ sub err {
 while (<>) {
 	chomp;
 	/\bsed\s+-i/ and err 'sed -i is not portable';
-	/\becho\s+-n/ and err 'echo -n is not portable (please use printf)';
+	/\becho\s+-[neE]/ and err 'echo with option is not portable (please use printf)';
 	/^\s*declare\s+/ and err 'arrays/declare not portable';
 	/^\s*[^#]\s*which\s/ and err 'which is not portable (please use type)';
 	/\btest\s+[^=]*==/ and err '"test a == b" is not portable (please use =)';
-- 
2.14.1.145.gb3622a4ee9


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* Re: [PATCH v6 08/12] fsmonitor: add a test tool to dump the index extension
  2017-09-15 19:20   ` [PATCH v6 08/12] fsmonitor: add a test tool to dump the index extension Ben Peart
@ 2017-09-17  8:02     ` Junio C Hamano
  2017-09-18 13:38       ` Ben Peart
  0 siblings, 1 reply; 137+ messages in thread
From: Junio C Hamano @ 2017-09-17  8:02 UTC (permalink / raw)
  To: Ben Peart
  Cc: David.Turner, avarab, christian.couder, git, johannes.schindelin,
	pclouds, peff

Ben Peart <benpeart@microsoft.com> writes:

> diff --git a/t/helper/test-dump-fsmonitor.c b/t/helper/test-dump-fsmonitor.c
> new file mode 100644
> index 0000000000..482d749bb9
> --- /dev/null
> +++ b/t/helper/test-dump-fsmonitor.c
> @@ -0,0 +1,21 @@
> +#include "cache.h"
> +
> +int cmd_main(int ac, const char **av)
> +{
> +	struct index_state *istate = &the_index;
> +	int i;
> +
> +	setup_git_directory();
> +	if (do_read_index(istate, get_index_file(), 0) < 0)
> +		die("unable to read index file");
> +	if (!istate->fsmonitor_last_update) {
> +		printf("no fsmonitor\n");
> +		return 0;
> +	}
> +	printf("fsmonitor last update %"PRIuMAX"\n", istate->fsmonitor_last_update);

After pushing this out and had Travis complain, I queued a squash on
top of this to cast the argument to (uintmax_t), like you did in an
earlier step (I think it was [PATCH 04/12]).


^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v6 05/12] fsmonitor: add documentation for the fsmonitor extension.
  2017-09-15 19:20   ` [PATCH v6 05/12] fsmonitor: add documentation for the fsmonitor extension Ben Peart
  2017-09-15 19:43     ` David Turner
@ 2017-09-17  8:03     ` Junio C Hamano
  2017-09-18 13:29       ` Ben Peart
  1 sibling, 1 reply; 137+ messages in thread
From: Junio C Hamano @ 2017-09-17  8:03 UTC (permalink / raw)
  To: Ben Peart
  Cc: David.Turner, avarab, christian.couder, git, johannes.schindelin,
	pclouds, peff

Ben Peart <benpeart@microsoft.com> writes:

> +[[fsmonitor-watchman]]
> +fsmonitor-watchman
> +~~~~~~~~~~~~~~~

I've queued a mini squash on top to make sure the ~~~~ line aligns
with the length of the string above it by adding three ~'s here.

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v6 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  2017-09-15 21:35     ` David Turner
@ 2017-09-18 13:07       ` Ben Peart
  2017-09-18 13:32         ` David Turner
  0 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-09-18 13:07 UTC (permalink / raw)
  To: David Turner, 'Ben Peart'
  Cc: avarab@gmail.com, christian.couder@gmail.com, git@vger.kernel.org,
	gitster@pobox.com, johannes.schindelin@gmx.de, pclouds@gmail.com,
	peff@peff.net

Thanks for taking the time to review/provide feedback!

On 9/15/2017 5:35 PM, David Turner wrote:
>> -----Original Message-----
>> From: Ben Peart [mailto:benpeart@microsoft.com]
>> Sent: Friday, September 15, 2017 3:21 PM
>> To: benpeart@microsoft.com
>> Cc: David Turner <David.Turner@twosigma.com>; avarab@gmail.com;
>> christian.couder@gmail.com; git@vger.kernel.org; gitster@pobox.com;
>> johannes.schindelin@gmx.de; pclouds@gmail.com; peff@peff.net
>> Subject: [PATCH v6 04/12] fsmonitor: teach git to optionally utilize a file system
>> monitor to speed up detecting new or changed files.
>   
>> +int git_config_get_fsmonitor(void)
>> +{
>> +	if (git_config_get_pathname("core.fsmonitor", &core_fsmonitor))
>> +		core_fsmonitor = getenv("GIT_FSMONITOR_TEST");
>> +
>> +	if (core_fsmonitor && !*core_fsmonitor)
>> +		core_fsmonitor = NULL;
>> +
>> +	if (core_fsmonitor)
>> +		return 1;
>> +
>> +	return 0;
>> +}
> 
> This functions return values are backwards relative to the rest of the git_config_* functions.

I'm confused.  If core.fsmonitor is configured, it returns 1. If it is 
not configured, it returns 0. I don't make use of the -1 /* default 
value */ option as I didn't see any use/value in this case. What is 
backwards?

> 
> [snip]
> 
> +>	/*
> +>	 * With fsmonitor, we can trust the untracked cache's valid field.
> +>	 */
> 

Did you intend to make a comment here?

> [snip]
> 
>> +int read_fsmonitor_extension(struct index_state *istate, const void *data,
>> +	unsigned long sz)
>> +{
> 
> If git_config_get_fsmonitor returns 0, fsmonitor_dirty will leak.
> 

Good catch!  Thank you.

> [snip]
> 
>> +	/* a fsmonitor process can return '*' to indicate all entries are invalid */
> 
> That's not documented in your documentation.  Also, I'm not sure I like it: what
> if I have a file whose name starts with '*'?  Yeah, that would be silly, but this indicates the need
> for the protocol to have some sort of signaling mechanism that's out-of-band  Maybe
> have some key\0value\0 pairs and then \0\0 and then the list of files?  Or, if you want to keep
> it really simple, allow an entry of '/' (which is an invalid filename) to mean 'all'.
> 

Yea, this was an optimization I added late in the game to get around an 
issue in Watchman where it returns the name of every file the first time 
you query it (rather than the set of files that have actually changed 
since the requested time).

I didn't realize the wild card '*' was a valid character for a filename. 
  I'll switch to '/' as you suggest as I don't want to complicate things 
unnecessarily to handle this relatively rare optimization.  I'll also 
get it documented properly.  Thanks!

>> +void add_fsmonitor(struct index_state *istate) {
>> +	int i;
>> +
>> +	if (!istate->fsmonitor_last_update) {
> [snip]
>> +		/* reset the untracked cache */
> 
> Is this really necessary?  Shouldn't the untracked cache be in a correct state already?
> 

When fsmonitor is not turned on, I'm not explicitly invalidating 
untracked cache directory entries as git makes changes to files. While I 
doubt the sequence happens of 1) git making changes to files, *then* 2) 
turning on fsmonitor - I thought it better safe than sorry to assume 
that pattern won't ever happen in the future.  Especially since turning 
on the extension is rare and the cost is low.

>> +/*
>> + * Clear the given cache entries CE_FSMONITOR_VALID bit and invalidate
> 
> Nit: "s/entries/entry's/".
>   
> 

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v6 05/12] fsmonitor: add documentation for the fsmonitor extension.
  2017-09-15 19:43     ` David Turner
@ 2017-09-18 13:27       ` Ben Peart
  0 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-18 13:27 UTC (permalink / raw)
  To: David Turner, 'Ben Peart'
  Cc: avarab@gmail.com, christian.couder@gmail.com, git@vger.kernel.org,
	gitster@pobox.com, johannes.schindelin@gmx.de, pclouds@gmail.com,
	peff@peff.net



On 9/15/2017 3:43 PM, David Turner wrote:
> 
> 
>> -----Original Message-----
>> From: Ben Peart [mailto:benpeart@microsoft.com]
>> Sent: Friday, September 15, 2017 3:21 PM
>> To: benpeart@microsoft.com
>> Cc: David Turner <David.Turner@twosigma.com>; avarab@gmail.com;
>> christian.couder@gmail.com; git@vger.kernel.org; gitster@pobox.com;
>> johannes.schindelin@gmx.de; pclouds@gmail.com; peff@peff.net
>> Subject: [PATCH v6 05/12] fsmonitor: add documentation for the fsmonitor
>> extension.
>>
>> This includes the core.fsmonitor setting, the query-fsmonitor hook, and the
>> fsmonitor index extension.
>>
>> Signed-off-by: Ben Peart <benpeart@microsoft.com>
>> ---
>>   Documentation/config.txt                 |  6 ++++++
>>   Documentation/githooks.txt               | 23 +++++++++++++++++++++++
>>   Documentation/technical/index-format.txt | 19 +++++++++++++++++++
>>   3 files changed, 48 insertions(+)
>>
>> diff --git a/Documentation/config.txt b/Documentation/config.txt index
>> dc4e3f58a2..c196007a27 100644
>> --- a/Documentation/config.txt
>> +++ b/Documentation/config.txt
>> @@ -413,6 +413,12 @@ core.protectNTFS::
>>   	8.3 "short" names.
>>   	Defaults to `true` on Windows, and `false` elsewhere.
>>
>> +core.fsmonitor::
>> +	If set, the value of this variable is used as a command which
>> +	will identify all files that may have changed since the
>> +	requested date/time. This information is used to speed up git by
>> +	avoiding unnecessary processing of files that have not changed.
> 
> I'm confused here.  You have a file called "fsmonitor-watchman", which seems to discuss the protocol for core.fsmonitor scripts in general, and you have this documentation, which does not link to that file.  Can you clarify this?

I'll add the missing link to the documentation in githooks.txt.  The 
documentation should be enough for someone to develop another 
integration script.

The fsmonitor-watchman script allows people to easily use this patch 
series with the existing Watchman monitor but it can certainly also be 
used as a sample for how to integrate with another file system monitor.

> 
> <snip>
> 
>> +The hook should output to stdout the list of all files in the working
>> +directory that may have changed since the requested time.  The logic
>> +should be inclusive so that it does not miss any potential changes.
> 
> +"It is OK to include files which have not actually changed.  Newly-created and deleted files should also be included.  When files are renamed, both the old and the new name should be included."
> 
> Also, please discuss case sensitivity issues (e.g. on OS X).
> 
>> +The paths should be relative to the root of the working directory and
>> +be separated by a single NUL.
> 
> <snip>
> 
>> +  - 32-bit version number: the current supported version is 1.
>> +
>> +  - 64-bit time: the extension data reflects all changes through the given
>> +	time which is stored as the nanoseconds elapsed since midnight,
>> +	January 1, 1970.
> 
> Nit: Please specify signed or unsigned for these.  (I expect to be getting out of
> cryosleep around 2262, and I want to know if my old git repos will keep working...)
> 

While I'm not opposed to specifying unsigned, I did notice that the only 
place signed/unsigned is specified today is in "index entry." Everywhere 
else doesn't specify so I left it off for consistency.  I've not seen 
negative version numbers nor negative time so am not entirely sure it is 
necessary to clarify. :)

>> +  - 32-bit bitmap size: the size of the CE_FSMONITOR_VALID bitmap.
>> +
>> +  - An ewah bitmap, the n-th bit indicates whether the n-th index entry
>> +    is not CE_FSMONITOR_VALID.
> 

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v6 05/12] fsmonitor: add documentation for the fsmonitor extension.
  2017-09-17  8:03     ` Junio C Hamano
@ 2017-09-18 13:29       ` Ben Peart
  0 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-18 13:29 UTC (permalink / raw)
  To: Junio C Hamano, Ben Peart
  Cc: David.Turner, avarab, christian.couder, git, johannes.schindelin,
	pclouds, peff



On 9/17/2017 4:03 AM, Junio C Hamano wrote:
> Ben Peart <benpeart@microsoft.com> writes:
> 
>> +[[fsmonitor-watchman]]
>> +fsmonitor-watchman
>> +~~~~~~~~~~~~~~~
> 
> I've queued a mini squash on top to make sure the ~~~~ line aligns
> with the length of the string above it by adding three ~'s here.
> 

Thanks, I'll do the same assuming there will be another re-roll.

^ permalink raw reply	[flat|nested] 137+ messages in thread

* RE: [PATCH v6 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  2017-09-18 13:07       ` Ben Peart
@ 2017-09-18 13:32         ` David Turner
  2017-09-18 13:49           ` Ben Peart
  0 siblings, 1 reply; 137+ messages in thread
From: David Turner @ 2017-09-18 13:32 UTC (permalink / raw)
  To: 'Ben Peart', 'Ben Peart'
  Cc: avarab@gmail.com, christian.couder@gmail.com, git@vger.kernel.org,
	gitster@pobox.com, johannes.schindelin@gmx.de, pclouds@gmail.com,
	peff@peff.net

> -----Original Message-----
> From: Ben Peart [mailto:peartben@gmail.com]
> Sent: Monday, September 18, 2017 9:07 AM
> To: David Turner <David.Turner@twosigma.com>; 'Ben Peart'
> <benpeart@microsoft.com>
> Cc: avarab@gmail.com; christian.couder@gmail.com; git@vger.kernel.org;
> gitster@pobox.com; johannes.schindelin@gmx.de; pclouds@gmail.com;
> peff@peff.net
> Subject: Re: [PATCH v6 04/12] fsmonitor: teach git to optionally utilize a file
> system monitor to speed up detecting new or changed files.
> 
> Thanks for taking the time to review/provide feedback!
> 
> On 9/15/2017 5:35 PM, David Turner wrote:
> >> -----Original Message-----
> >> From: Ben Peart [mailto:benpeart@microsoft.com]
> >> Sent: Friday, September 15, 2017 3:21 PM
> >> To: benpeart@microsoft.com
> >> Cc: David Turner <David.Turner@twosigma.com>; avarab@gmail.com;
> >> christian.couder@gmail.com; git@vger.kernel.org; gitster@pobox.com;
> >> johannes.schindelin@gmx.de; pclouds@gmail.com; peff@peff.net
> >> Subject: [PATCH v6 04/12] fsmonitor: teach git to optionally utilize
> >> a file system monitor to speed up detecting new or changed files.
> >
> >> +int git_config_get_fsmonitor(void)
> >> +{
> >> +	if (git_config_get_pathname("core.fsmonitor", &core_fsmonitor))
> >> +		core_fsmonitor = getenv("GIT_FSMONITOR_TEST");
> >> +
> >> +	if (core_fsmonitor && !*core_fsmonitor)
> >> +		core_fsmonitor = NULL;
> >> +
> >> +	if (core_fsmonitor)
> >> +		return 1;
> >> +
> >> +	return 0;
> >> +}
> >
> > This functions return values are backwards relative to the rest of the
> git_config_* functions.
> 
> I'm confused.  If core.fsmonitor is configured, it returns 1. If it is not
> configured, it returns 0. I don't make use of the -1 /* default value */ option
> as I didn't see any use/value in this case. What is backwards?

The other git_config_* functions return 1 for error and 0 for success.

> > [snip]
> >
> > +>	/*
> > +>	 * With fsmonitor, we can trust the untracked cache's valid field.
> > +>	 */
> >
> 
> Did you intend to make a comment here?

Sorry.  I was going to make a comment that I didn't see how that could work 
since we weren't touching the untracked cache here, but then I saw the bit 
further down.   I'm still not sure it works (see comment on 10/12), but at
least it could in theory work.
 


^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v6 08/12] fsmonitor: add a test tool to dump the index extension
  2017-09-17  8:02     ` Junio C Hamano
@ 2017-09-18 13:38       ` Ben Peart
  2017-09-18 15:43         ` Torsten Bögershausen
  0 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-09-18 13:38 UTC (permalink / raw)
  To: Junio C Hamano, Ben Peart
  Cc: David.Turner, avarab, christian.couder, git, johannes.schindelin,
	pclouds, peff



On 9/17/2017 4:02 AM, Junio C Hamano wrote:
> Ben Peart <benpeart@microsoft.com> writes:
> 
>> diff --git a/t/helper/test-dump-fsmonitor.c b/t/helper/test-dump-fsmonitor.c
>> new file mode 100644
>> index 0000000000..482d749bb9
>> --- /dev/null
>> +++ b/t/helper/test-dump-fsmonitor.c
>> @@ -0,0 +1,21 @@
>> +#include "cache.h"
>> +
>> +int cmd_main(int ac, const char **av)
>> +{
>> +	struct index_state *istate = &the_index;
>> +	int i;
>> +
>> +	setup_git_directory();
>> +	if (do_read_index(istate, get_index_file(), 0) < 0)
>> +		die("unable to read index file");
>> +	if (!istate->fsmonitor_last_update) {
>> +		printf("no fsmonitor\n");
>> +		return 0;
>> +	}
>> +	printf("fsmonitor last update %"PRIuMAX"\n", istate->fsmonitor_last_update);
> 
> After pushing this out and had Travis complain, I queued a squash on
> top of this to cast the argument to (uintmax_t), like you did in an
> earlier step (I think it was [PATCH 04/12]).
> 

Thanks. I'll update this to cast it as (uint64_t) as that is what 
get/put_be64 use.  As far as I can tell they both map to the same thing 
(unsigned long long) so there isn't functional difference.

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v6 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  2017-09-18 13:32         ` David Turner
@ 2017-09-18 13:49           ` Ben Peart
  0 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-18 13:49 UTC (permalink / raw)
  To: David Turner, 'Ben Peart'
  Cc: avarab@gmail.com, christian.couder@gmail.com, git@vger.kernel.org,
	gitster@pobox.com, johannes.schindelin@gmx.de, pclouds@gmail.com,
	peff@peff.net



On 9/18/2017 9:32 AM, David Turner wrote:
>> -----Original Message-----
>> From: Ben Peart [mailto:peartben@gmail.com]
>> Sent: Monday, September 18, 2017 9:07 AM
>> To: David Turner <David.Turner@twosigma.com>; 'Ben Peart'
>> <benpeart@microsoft.com>
>> Cc: avarab@gmail.com; christian.couder@gmail.com; git@vger.kernel.org;
>> gitster@pobox.com; johannes.schindelin@gmx.de; pclouds@gmail.com;
>> peff@peff.net
>> Subject: Re: [PATCH v6 04/12] fsmonitor: teach git to optionally utilize a file
>> system monitor to speed up detecting new or changed files.
>>
>> Thanks for taking the time to review/provide feedback!
>>
>> On 9/15/2017 5:35 PM, David Turner wrote:
>>>> -----Original Message-----
>>>> From: Ben Peart [mailto:benpeart@microsoft.com]
>>>> Sent: Friday, September 15, 2017 3:21 PM
>>>> To: benpeart@microsoft.com
>>>> Cc: David Turner <David.Turner@twosigma.com>; avarab@gmail.com;
>>>> christian.couder@gmail.com; git@vger.kernel.org; gitster@pobox.com;
>>>> johannes.schindelin@gmx.de; pclouds@gmail.com; peff@peff.net
>>>> Subject: [PATCH v6 04/12] fsmonitor: teach git to optionally utilize
>>>> a file system monitor to speed up detecting new or changed files.
>>>
>>>> +int git_config_get_fsmonitor(void)
>>>> +{
>>>> +	if (git_config_get_pathname("core.fsmonitor", &core_fsmonitor))
>>>> +		core_fsmonitor = getenv("GIT_FSMONITOR_TEST");
>>>> +
>>>> +	if (core_fsmonitor && !*core_fsmonitor)
>>>> +		core_fsmonitor = NULL;
>>>> +
>>>> +	if (core_fsmonitor)
>>>> +		return 1;
>>>> +
>>>> +	return 0;
>>>> +}
>>>
>>> This functions return values are backwards relative to the rest of the
>> git_config_* functions.
>>
>> I'm confused.  If core.fsmonitor is configured, it returns 1. If it is not
>> configured, it returns 0. I don't make use of the -1 /* default value */ option
>> as I didn't see any use/value in this case. What is backwards?
> 
> The other git_config_* functions return 1 for error and 0 for success.

Hmm, I followed the model (ie copy/paste :)) used by the tracked cache. 
If you look at how it uses, the return value, it is 0 == false == remove 
the extension, 1 == true == add the extension.  I'm doing the same with 
fsmonitor.

static void tweak_untracked_cache(struct index_state *istate)
{
	switch (git_config_get_untracked_cache()) {
	case -1: /* keep: do nothing */
		break;
	case 0: /* false */
		remove_untracked_cache(istate);
		break;
	case 1: /* true */
		add_untracked_cache(istate);
		break;
	default: /* unknown value: do nothing */
		break;
	}
}

void tweak_fsmonitor(struct index_state *istate)
{
	switch (git_config_get_fsmonitor()) {
	case -1: /* keep: do nothing */
		break;
	case 0: /* false */
		remove_fsmonitor(istate);
		break;
	case 1: /* true */
		add_fsmonitor(istate);
		break;
	default: /* unknown value: do nothing */
		break;
	}
}

> 
>>> [snip]
>>>
>>> +>	/*
>>> +>	 * With fsmonitor, we can trust the untracked cache's valid field.
>>> +>	 */
>>>
>>
>> Did you intend to make a comment here?
> 
> Sorry.  I was going to make a comment that I didn't see how that could work
> since we weren't touching the untracked cache here, but then I saw the bit
> further down.   I'm still not sure it works (see comment on 10/12), but at
> least it could in theory work.
>   
> 

The logic here assumes that when the index is written out, it is valid 
including the untracked cache and the CE_FSMONITOR_VALID bits. 
Therefore it should still all be valid as of the time the fsmonitor was 
queried and the index saved.

Another way of thinking about this is that the fsmonitor extension is 
simply adding another persisted bit on each index entry.  It just gets 
persisted/restored differently than the other persisted bits.

Obviously, before we can use it assuming it reflects the *current* state 
of the working directory, we have to refresh the bits via the refresh logic.

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor extension
  2017-09-16 15:27     ` Torsten Bögershausen
  2017-09-17  5:43       ` [PATCH v1 1/1] test-lint: echo -e (or -E) is not portable tboegi
@ 2017-09-18 14:06       ` Ben Peart
  1 sibling, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-18 14:06 UTC (permalink / raw)
  To: Torsten Bögershausen, Ben Peart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff



On 9/16/2017 11:27 AM, Torsten Bögershausen wrote:
> On 2017-09-15 21:20, Ben Peart wrote:
>> +if [ "$1" != 1 ]
>> +then
>> +	echo -e "Unsupported core.fsmonitor hook version.\n" >&2
>> +	exit 1
>> +fi
> 
> The echo -e not portable
> (It was detected by a tighter version of the lint script,
>   which I have here, but not yet send to the list :-(
> 
> This will do:
> echo  "Unsupported core.fsmonitor hook version." >&2
> 

Thanks, I'll fix these and the ones in the t/t7519 directory.

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v6 12/12] fsmonitor: add a performance test
  2017-09-15 19:20   ` [PATCH v6 12/12] fsmonitor: add a performance test Ben Peart
  2017-09-15 21:56     ` David Turner
@ 2017-09-18 14:24     ` Johannes Schindelin
  2017-09-18 18:19       ` Ben Peart
  1 sibling, 1 reply; 137+ messages in thread
From: Johannes Schindelin @ 2017-09-18 14:24 UTC (permalink / raw)
  To: Ben Peart
  Cc: David.Turner, avarab, christian.couder, git, gitster, pclouds,
	peff

Hi Ben,

sorry for not catching this earlier:

On Fri, 15 Sep 2017, Ben Peart wrote:

> [...]
> +
> +int cmd_dropcaches(void)
> +{
> +	HANDLE hProcess = GetCurrentProcess();
> +	HANDLE hToken;
> +	int status;
> +
> +	if (!OpenProcessToken(hProcess, TOKEN_QUERY | TOKEN_ADJUST_PRIVILEGES, &hToken))
> +		return error("Can't open current process token");
> +
> +	if (!GetPrivilege(hToken, "SeProfileSingleProcessPrivilege", 1))
> +		return error("Can't get SeProfileSingleProcessPrivilege");
> +
> +	CloseHandle(hToken);
> +
> +	HMODULE ntdll = LoadLibrary("ntdll.dll");

Git's source code still tries to abide by C90, and for simplicity's sake,
this extends to the Windows-specific part. Therefore, the `ntdll` variable
needs to be declared at the beginning of the function (I do agree that it
makes for better code to reduce the scope of variables, but C90 simply did
not allow variables to be declared in the middle of functions).

I wanted to send a patch address this in the obvious way, but then I
encountered these lines:

> +	DWORD(WINAPI *NtSetSystemInformation)(INT, PVOID, ULONG) =
> +		(DWORD(WINAPI *)(INT, PVOID, ULONG))GetProcAddress(ntdll, "NtSetSystemInformation");
> +	if (!NtSetSystemInformation)
> +		return error("Can't get function addresses, wrong Windows version?");

It turns out that we have seen this plenty of times in Git for Windows'
fork, so much so that we came up with a nice helper to make this all a bit
more robust and a bit more obvious, too: the DECLARE_PROC_ADDR and
INIT_PROC_ADDR helpers in compat/win32/lazyload.h.

Maybe this would be the perfect excuse to integrate this patch into
upstream Git? This would be the patch (you can also cherry-pick it from
25c4dc3a73352e72e995594cf1b4afa46e93d040 in https://github.com/dscho/git):

-- snip --
From 25c4dc3a73352e72e995594cf1b4afa46e93d040 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Tue, 10 Jan 2017 23:14:20 +0100
Subject: [PATCH] Win32: simplify loading of DLL functions

Dynamic loading of DLL functions is duplicated in several places in Git
for Windows' source code.

This patch adds a pair of macros to simplify the process: the
DECLARE_PROC_ADDR(<dll>, <return-type>, <function-name>,
...<function-parameter-types>...) macro to be used at the beginning of a
code block, and the INIT_PROC_ADDR(<function-name>) macro to call before
using the declared function. The return value of the INIT_PROC_ADDR()
call has to be checked; If it is NULL, the function was not found in the
specified DLL.

Example:

        DECLARE_PROC_ADDR(kernel32.dll, BOOL, CreateHardLinkW,
                          LPCWSTR, LPCWSTR, LPSECURITY_ATTRIBUTES);

        if (!INIT_PROC_ADDR(CreateHardLinkW))
                return error("Could not find CreateHardLinkW() function";

	if (!CreateHardLinkW(source, target, NULL))
		return error("could not create hardlink from %S to %S",
			     source, target);
	return 0;

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/win32/lazyload.h | 44 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)
 create mode 100644 compat/win32/lazyload.h

diff --git a/compat/win32/lazyload.h b/compat/win32/lazyload.h
new file mode 100644
index 00000000000..91c10dad2fb
--- /dev/null
+++ b/compat/win32/lazyload.h
@@ -0,0 +1,44 @@
+#ifndef LAZYLOAD_H
+#define LAZYLOAD_H
+
+/* simplify loading of DLL functions */
+
+struct proc_addr {
+	const char *const dll;
+	const char *const function;
+	FARPROC pfunction;
+	unsigned initialized : 1;
+};
+
+/* Declares a function to be loaded dynamically from a DLL. */
+#define DECLARE_PROC_ADDR(dll, rettype, function, ...) \
+	static struct proc_addr proc_addr_##function = \
+	{ #dll, #function, NULL, 0 }; \
+	static rettype (WINAPI *function)(__VA_ARGS__)
+
+/*
+ * Loads a function from a DLL (once-only).
+ * Returns non-NULL function pointer on success.
+ * Returns NULL + errno == ENOSYS on failure.
+ */
+#define INIT_PROC_ADDR(function) \
+	(function = get_proc_addr(&proc_addr_##function))
+
+static inline void *get_proc_addr(struct proc_addr *proc)
+{
+	/* only do this once */
+	if (!proc->initialized) {
+		HANDLE hnd;
+		proc->initialized = 1;
+		hnd = LoadLibraryExA(proc->dll, NULL,
+				     LOAD_LIBRARY_SEARCH_SYSTEM32);
+		if (hnd)
+			proc->pfunction = GetProcAddress(hnd, proc->function);
+	}
+	/* set ENOSYS if DLL or function was not found */
+	if (!proc->pfunction)
+		errno = ENOSYS;
+	return proc->pfunction;
+}
+
+#endif
-- snap --

With this patch, this fixup to your patch would make things compile (you
can also cherry-pick d05996fb61027512b8ab31a36c4a7a677dea11bb from my
fork):

-- snipsnap --
From d05996fb61027512b8ab31a36c4a7a677dea11bb Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Mon, 18 Sep 2017 14:56:40 +0200
Subject: [PATCH] fixup! fsmonitor: add a performance test

---
 t/helper/test-drop-caches.c | 16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/t/helper/test-drop-caches.c b/t/helper/test-drop-caches.c
index 717079865cb..b27358528f7 100644
--- a/t/helper/test-drop-caches.c
+++ b/t/helper/test-drop-caches.c
@@ -1,6 +1,7 @@
 #include "git-compat-util.h"
 
 #if defined(GIT_WINDOWS_NATIVE)
+#include "compat/win32/lazyload.h"
 
 int cmd_sync(void)
 {
@@ -82,6 +83,9 @@ int cmd_dropcaches(void)
 	HANDLE hProcess = GetCurrentProcess();
 	HANDLE hToken;
 	int status;
+	SYSTEM_MEMORY_LIST_COMMAND command;
+	DECLARE_PROC_ADDR(ntll,
+			  DWORD, NtSetSystemInformation, INT, PVOID, ULONG);
 
 	if (!OpenProcessToken(hProcess, TOKEN_QUERY | TOKEN_ADJUST_PRIVILEGES, &hToken))
 		return error("Can't open current process token");
@@ -91,16 +95,10 @@ int cmd_dropcaches(void)
 
 	CloseHandle(hToken);
 
-	HMODULE ntdll = LoadLibrary("ntdll.dll");
-	if (!ntdll)
-		return error("Can't load ntdll.dll, wrong Windows
 		version?");
-
-	DWORD(WINAPI *NtSetSystemInformation)(INT, PVOID, ULONG) =
-		(DWORD(WINAPI *)(INT, PVOID, ULONG))GetProcAddress(ntdll,
 		"NtSetSystemInformation");
-	if (!NtSetSystemInformation)
+	if (!INIT_PROC_ADDR(NtSetSystemInformation))
 		return error("Can't get function addresses, wrong Windows version?");
 
-	SYSTEM_MEMORY_LIST_COMMAND command = MemoryPurgeStandbyList;
+	command = MemoryPurgeStandbyList;
 	status = NtSetSystemInformation(
 		SystemMemoryListInformation,
 		&command,
@@ -111,8 +109,6 @@ int cmd_dropcaches(void)
 	else if (status != STATUS_SUCCESS)
 		error("Unable to execute the memory list command %d", status);
 
-	FreeLibrary(ntdll);
-
 	return status;
 }
 
-- 
2.14.1.windows.1.510.g0cb6d35d23

^ permalink raw reply related	[flat|nested] 137+ messages in thread

* Re: [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor extension
  2017-09-17  4:47     ` Junio C Hamano
@ 2017-09-18 15:25       ` Ben Peart
  2017-09-19 20:34         ` Jonathan Nieder
  0 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-09-18 15:25 UTC (permalink / raw)
  To: Junio C Hamano, Ben Peart
  Cc: David.Turner, avarab, christian.couder, git, johannes.schindelin,
	pclouds, peff



On 9/17/2017 12:47 AM, Junio C Hamano wrote:
> Ben Peart <benpeart@microsoft.com> writes:
> 
>> +write_integration_script() {
>> +	write_script .git/hooks/fsmonitor-test<<-\EOF
>> +	if [ "$#" -ne 2 ]; then
>> +		echo "$0: exactly 2 arguments expected"
>> +		exit 2
>> +	fi
>> +	if [ "$1" != 1 ]; then
>> +		echo -e "Unsupported core.fsmonitor hook version.\n" >&2
>> +		exit 1
>> +	fi
> 
> In addition to "echo -e" thing pointed out earlier, these look
> somewhat unusual in our shell scripts, relative to what
> Documentation/CodingGuidelines tells us to do:

I'm happy to make these changes.  I understand the difficulty of 
creating a consistent coding style especially after the fact.

<soapbox>

Copy/paste is usually a developers best friend as it allows you to avoid 
a lot of errors by reusing existing tested code.  One of the times it 
backfires is when that code doesn't match the current desired coding style.

I only point these out to help lend some additional impetus to the 
effort to formalize the coding style and to provide tooling to handle 
what should mostly be a mechanical process. IMO, the goal should be to 
save the maintainer and contributors the cost of having to write up and 
respond to formatting feedback. :)

Some stats on these same coding style errors in the current bash scripts:

298 instances of "[a-z]\(\).*\{" ie "function_name() {" (no space)
140 instances of "if \[ .* \]" (not using the preferred "test")
293 instances of "if .*; then"

Wouldn't it be great not to have to write up style feedback for when 
these all get copy/pasted into new scripts? :)

</soapbox>

> 
>   - We prefer a space between the function name and the parentheses,
>     and no space inside the parentheses. The opening "{" should also
>     be on the same line.
> 
> 	(incorrect)
> 	my_function(){
> 		...
> 
> 	(correct)
> 	my_function () {
> 		...
> 
>   - We prefer "test" over "[ ... ]".
> 
>   - Do not write control structures on a single line with semicolon.
>     "then" should be on the next line for if statements, and "do"
>     should be on the next line for "while" and "for".
> 
> 	(incorrect)
> 	if test -f hello; then
> 		do this
> 	fi
> 
> 	(correct)
> 	if test -f hello
> 	then
> 		do this
> 	fi
> 
>> diff --git a/t/t7519/fsmonitor-watchman b/t/t7519/fsmonitor-watchman
>> new file mode 100755
>> index 0000000000..aaee5d1fe3
>> --- /dev/null
>> +++ b/t/t7519/fsmonitor-watchman
>> @@ -0,0 +1,128 @@
>> +#!/usr/bin/perl
>> +
>> +use strict;
>> +use warnings;
>> +use IPC::Open2;
>> + ...
>> +	open (my $fh, ">", ".git/watchman-query.json");
>> +	print $fh "[\"query\", \"$git_work_tree\", { \
>> +	\"since\": $time, \
>> +	\"fields\": [\"name\"], \
>> +	\"expression\": [\"not\", [\"allof\", [\"since\", $time, \"cclock\"], [\"not\", \"exists\"]]] \
>> +	}]";
>> +	close $fh;
>> +
>> +	print CHLD_IN "[\"query\", \"$git_work_tree\", { \
>> +	\"since\": $time, \
>> +	\"fields\": [\"name\"], \
>> +	\"expression\": [\"not\", [\"allof\", [\"since\", $time, \"cclock\"], [\"not\", \"exists\"]]] \
>> +	}]";
> 
> This look painful to read, write and maintain.  IIRC, Perl supports
> the <<HERE document syntax quite similar to shell; would it make
> these "print" we see above easier?
> 

I agree!  I'm definitely *not* a perl developer so was unaware of this 
construct.  A few minutes with stack overflow and now I can clean this up.

>> +}
>> \ No newline at end of file
> 
> Oops.
> 
> Thanks.
> 

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v6 08/12] fsmonitor: add a test tool to dump the index extension
  2017-09-18 13:38       ` Ben Peart
@ 2017-09-18 15:43         ` Torsten Bögershausen
  2017-09-18 16:28           ` Ben Peart
  0 siblings, 1 reply; 137+ messages in thread
From: Torsten Bögershausen @ 2017-09-18 15:43 UTC (permalink / raw)
  To: Ben Peart, Junio C Hamano, Ben Peart
  Cc: David.Turner, avarab, christian.couder, git, johannes.schindelin,
	pclouds, peff

On 2017-09-18 15:38, Ben Peart wrote:
> 
> 
> On 9/17/2017 4:02 AM, Junio C Hamano wrote:
>> Ben Peart <benpeart@microsoft.com> writes:
>>
>>> diff --git a/t/helper/test-dump-fsmonitor.c b/t/helper/test-dump-fsmonitor.c
>>> new file mode 100644
>>> index 0000000000..482d749bb9
>>> --- /dev/null
>>> +++ b/t/helper/test-dump-fsmonitor.c
>>> @@ -0,0 +1,21 @@
>>> +#include "cache.h"
>>> +
>>> +int cmd_main(int ac, const char **av)
>>> +{
>>> +    struct index_state *istate = &the_index;
>>> +    int i;
>>> +
>>> +    setup_git_directory();
>>> +    if (do_read_index(istate, get_index_file(), 0) < 0)
>>> +        die("unable to read index file");
>>> +    if (!istate->fsmonitor_last_update) {
>>> +        printf("no fsmonitor\n");
>>> +        return 0;
>>> +    }
>>> +    printf("fsmonitor last update %"PRIuMAX"\n",
>>> istate->fsmonitor_last_update);
>>
>> After pushing this out and had Travis complain, I queued a squash on
>> top of this to cast the argument to (uintmax_t), like you did in an
>> earlier step (I think it was [PATCH 04/12]).
>>
> 
> Thanks. I'll update this to cast it as (uint64_t) as that is what get/put_be64
> use.  As far as I can tell they both map to the same thing (unsigned long long)
> so there isn't functional difference.
(Just to double-check): This is the way to print "PRIuMAX" correctly
 (on all platforms):

printf("fsmonitor last update %"PRIuMAX"\n",
 (uintmax_t)istate->fsmonitor_last_update);



^ permalink raw reply	[flat|nested] 137+ messages in thread

* RE: [PATCH v6 08/12] fsmonitor: add a test tool to dump the index extension
  2017-09-18 15:43         ` Torsten Bögershausen
@ 2017-09-18 16:28           ` Ben Peart
  2017-09-19 14:16             ` Torsten Bögershausen
  0 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-09-18 16:28 UTC (permalink / raw)
  To: Torsten Bögershausen, Ben Peart, Junio C Hamano
  Cc: David.Turner@twosigma.com, avarab@gmail.com,
	christian.couder@gmail.com, git@vger.kernel.org,
	johannes.schindelin@gmx.de, pclouds@gmail.com, peff@peff.net

> -----Original Message-----
> From: Torsten Bögershausen [mailto:tboegi@web.de]
> Sent: Monday, September 18, 2017 11:43 AM
> To: Ben Peart <peartben@gmail.com>; Junio C Hamano
> <gitster@pobox.com>; Ben Peart <Ben.Peart@microsoft.com>
> Cc: David.Turner@twosigma.com; avarab@gmail.com;
> christian.couder@gmail.com; git@vger.kernel.org;
> johannes.schindelin@gmx.de; pclouds@gmail.com; peff@peff.net
> Subject: Re: [PATCH v6 08/12] fsmonitor: add a test tool to dump the index
> extension
> 
> On 2017-09-18 15:38, Ben Peart wrote:
> >
> >
> > On 9/17/2017 4:02 AM, Junio C Hamano wrote:
> >> Ben Peart <benpeart@microsoft.com> writes:
> >>
> >>> diff --git a/t/helper/test-dump-fsmonitor.c
> >>> b/t/helper/test-dump-fsmonitor.c new file mode 100644 index
> >>> 0000000000..482d749bb9
> >>> --- /dev/null
> >>> +++ b/t/helper/test-dump-fsmonitor.c
> >>> @@ -0,0 +1,21 @@
> >>> +#include "cache.h"
> >>> +
> >>> +int cmd_main(int ac, const char **av) {
> >>> +    struct index_state *istate = &the_index;
> >>> +    int i;
> >>> +
> >>> +    setup_git_directory();
> >>> +    if (do_read_index(istate, get_index_file(), 0) < 0)
> >>> +        die("unable to read index file");
> >>> +    if (!istate->fsmonitor_last_update) {
> >>> +        printf("no fsmonitor\n");
> >>> +        return 0;
> >>> +    }
> >>> +    printf("fsmonitor last update %"PRIuMAX"\n",
> >>> istate->fsmonitor_last_update);
> >>
> >> After pushing this out and had Travis complain, I queued a squash on
> >> top of this to cast the argument to (uintmax_t), like you did in an
> >> earlier step (I think it was [PATCH 04/12]).
> >>
> >
> > Thanks. I'll update this to cast it as (uint64_t) as that is what
> > get/put_be64 use.  As far as I can tell they both map to the same
> > thing (unsigned long long) so there isn't functional difference.
> (Just to double-check): This is the way to print "PRIuMAX" correctly  (on all
> platforms):
> 
> printf("fsmonitor last update %"PRIuMAX"\n",  (uintmax_t)istate-
> >fsmonitor_last_update);
> 

Should I just make the variable type itself uintmax_t and then just skip
the cast altogether? I went with uint64_t because that is what
getnanotime returned.


^ permalink raw reply	[flat|nested] 137+ messages in thread

* RE: [PATCH v6 12/12] fsmonitor: add a performance test
  2017-09-18 14:24     ` Johannes Schindelin
@ 2017-09-18 18:19       ` Ben Peart
  2017-09-19 15:28         ` Johannes Schindelin
  0 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-09-18 18:19 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: David.Turner@twosigma.com, avarab@gmail.com,
	christian.couder@gmail.com, git@vger.kernel.org,
	gitster@pobox.com, pclouds@gmail.com, peff@peff.net


> -----Original Message-----
> From: Johannes Schindelin [mailto:Johannes.Schindelin@gmx.de]
> Sent: Monday, September 18, 2017 10:25 AM
> To: Ben Peart <Ben.Peart@microsoft.com>
> Cc: David.Turner@twosigma.com; avarab@gmail.com;
> christian.couder@gmail.com; git@vger.kernel.org; gitster@pobox.com;
> pclouds@gmail.com; peff@peff.net
> Subject: Re: [PATCH v6 12/12] fsmonitor: add a performance test
> 
> Hi Ben,
> 
> sorry for not catching this earlier:
> 
> On Fri, 15 Sep 2017, Ben Peart wrote:
> 
> > [...]
> > +
> > +int cmd_dropcaches(void)
> > +{
> > +	HANDLE hProcess = GetCurrentProcess();
> > +	HANDLE hToken;
> > +	int status;
> > +
> > +	if (!OpenProcessToken(hProcess, TOKEN_QUERY |
> TOKEN_ADJUST_PRIVILEGES, &hToken))
> > +		return error("Can't open current process token");
> > +
> > +	if (!GetPrivilege(hToken, "SeProfileSingleProcessPrivilege", 1))
> > +		return error("Can't get SeProfileSingleProcessPrivilege");
> > +
> > +	CloseHandle(hToken);
> > +
> > +	HMODULE ntdll = LoadLibrary("ntdll.dll");
> 

Thanks Johannes, I'll fix that.

> Git's source code still tries to abide by C90, and for simplicity's sake, this
> extends to the Windows-specific part. Therefore, the `ntdll` variable needs to
> be declared at the beginning of the function (I do agree that it makes for
> better code to reduce the scope of variables, but C90 simply did not allow
> variables to be declared in the middle of functions).
> 
> I wanted to send a patch address this in the obvious way, but then I
> encountered these lines:
> 
> > +	DWORD(WINAPI *NtSetSystemInformation)(INT, PVOID, ULONG) =
> > +		(DWORD(WINAPI *)(INT, PVOID,
> ULONG))GetProcAddress(ntdll, "NtSetSystemInformation");
> > +	if (!NtSetSystemInformation)
> > +		return error("Can't get function addresses, wrong Windows
> > +version?");
> 
> It turns out that we have seen this plenty of times in Git for Windows'
> fork, so much so that we came up with a nice helper to make this all a bit
> more robust and a bit more obvious, too: the DECLARE_PROC_ADDR and
> INIT_PROC_ADDR helpers in compat/win32/lazyload.h.
> 
> Maybe this would be the perfect excuse to integrate this patch into upstream
> Git? 

This patch is pretty hefty already.  How about you push this capability
upstream and I take advantage of it in a later patch. :)

This would be the patch (you can also cherry-pick it from
> 25c4dc3a73352e72e995594cf1b4afa46e93d040 in
> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.
> com%2Fdscho%2Fgit&data=02%7C01%7CBen.Peart%40microsoft.com%7C96
> 4027bdc1f34a033c1f08d4fea1056e%7C72f988bf86f141af91ab2d7cd011db47
> %7C1%7C0%7C636413414914282865&sdata=jyvu6G7myRY9UA1XxWx2tDZ%
> 2BWsIWqLTRMT8WfzEGe5g%3D&reserved=0):
> 
> -- snip --
> From 25c4dc3a73352e72e995594cf1b4afa46e93d040 Mon Sep 17 00:00:00
> 2001
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> Date: Tue, 10 Jan 2017 23:14:20 +0100
> Subject: [PATCH] Win32: simplify loading of DLL functions
> 
> Dynamic loading of DLL functions is duplicated in several places in Git for
> Windows' source code.
> 
> This patch adds a pair of macros to simplify the process: the
> DECLARE_PROC_ADDR(<dll>, <return-type>, <function-name>,
> ...<function-parameter-types>...) macro to be used at the beginning of a
> code block, and the INIT_PROC_ADDR(<function-name>) macro to call before
> using the declared function. The return value of the INIT_PROC_ADDR() call
> has to be checked; If it is NULL, the function was not found in the specified
> DLL.
> 
> Example:
> 
>         DECLARE_PROC_ADDR(kernel32.dll, BOOL, CreateHardLinkW,
>                           LPCWSTR, LPCWSTR, LPSECURITY_ATTRIBUTES);
> 
>         if (!INIT_PROC_ADDR(CreateHardLinkW))
>                 return error("Could not find CreateHardLinkW() function";
> 
> 	if (!CreateHardLinkW(source, target, NULL))
> 		return error("could not create hardlink from %S to %S",
> 			     source, target);
> 	return 0;
> 
> Signed-off-by: Karsten Blees <blees@dcon.de>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  compat/win32/lazyload.h | 44
> ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 44 insertions(+)
>  create mode 100644 compat/win32/lazyload.h
> 
> diff --git a/compat/win32/lazyload.h b/compat/win32/lazyload.h new file
> mode 100644 index 00000000000..91c10dad2fb
> --- /dev/null
> +++ b/compat/win32/lazyload.h
> @@ -0,0 +1,44 @@
> +#ifndef LAZYLOAD_H
> +#define LAZYLOAD_H
> +
> +/* simplify loading of DLL functions */
> +
> +struct proc_addr {
> +	const char *const dll;
> +	const char *const function;
> +	FARPROC pfunction;
> +	unsigned initialized : 1;
> +};
> +
> +/* Declares a function to be loaded dynamically from a DLL. */ #define
> +DECLARE_PROC_ADDR(dll, rettype, function, ...) \
> +	static struct proc_addr proc_addr_##function = \
> +	{ #dll, #function, NULL, 0 }; \
> +	static rettype (WINAPI *function)(__VA_ARGS__)
> +
> +/*
> + * Loads a function from a DLL (once-only).
> + * Returns non-NULL function pointer on success.
> + * Returns NULL + errno == ENOSYS on failure.
> + */
> +#define INIT_PROC_ADDR(function) \
> +	(function = get_proc_addr(&proc_addr_##function))
> +
> +static inline void *get_proc_addr(struct proc_addr *proc) {
> +	/* only do this once */
> +	if (!proc->initialized) {
> +		HANDLE hnd;
> +		proc->initialized = 1;
> +		hnd = LoadLibraryExA(proc->dll, NULL,
> +				     LOAD_LIBRARY_SEARCH_SYSTEM32);
> +		if (hnd)
> +			proc->pfunction = GetProcAddress(hnd, proc-
> >function);
> +	}
> +	/* set ENOSYS if DLL or function was not found */
> +	if (!proc->pfunction)
> +		errno = ENOSYS;
> +	return proc->pfunction;
> +}
> +
> +#endif
> -- snap --
> 
> With this patch, this fixup to your patch would make things compile (you can
> also cherry-pick d05996fb61027512b8ab31a36c4a7a677dea11bb from my
> fork):
> 
> -- snipsnap --
> From d05996fb61027512b8ab31a36c4a7a677dea11bb Mon Sep 17 00:00:00
> 2001
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> Date: Mon, 18 Sep 2017 14:56:40 +0200
> Subject: [PATCH] fixup! fsmonitor: add a performance test
> 
> ---
>  t/helper/test-drop-caches.c | 16 ++++++----------
>  1 file changed, 6 insertions(+), 10 deletions(-)
> 
> diff --git a/t/helper/test-drop-caches.c b/t/helper/test-drop-caches.c index
> 717079865cb..b27358528f7 100644
> --- a/t/helper/test-drop-caches.c
> +++ b/t/helper/test-drop-caches.c
> @@ -1,6 +1,7 @@
>  #include "git-compat-util.h"
> 
>  #if defined(GIT_WINDOWS_NATIVE)
> +#include "compat/win32/lazyload.h"
> 
>  int cmd_sync(void)
>  {
> @@ -82,6 +83,9 @@ int cmd_dropcaches(void)
>  	HANDLE hProcess = GetCurrentProcess();
>  	HANDLE hToken;
>  	int status;
> +	SYSTEM_MEMORY_LIST_COMMAND command;
> +	DECLARE_PROC_ADDR(ntll,
> +			  DWORD, NtSetSystemInformation, INT, PVOID,
> ULONG);
> 
>  	if (!OpenProcessToken(hProcess, TOKEN_QUERY |
> TOKEN_ADJUST_PRIVILEGES, &hToken))
>  		return error("Can't open current process token"); @@ -91,16
> +95,10 @@ int cmd_dropcaches(void)
> 
>  	CloseHandle(hToken);
> 
> -	HMODULE ntdll = LoadLibrary("ntdll.dll");
> -	if (!ntdll)
> -		return error("Can't load ntdll.dll, wrong Windows
>  		version?");
> -
> -	DWORD(WINAPI *NtSetSystemInformation)(INT, PVOID, ULONG) =
> -		(DWORD(WINAPI *)(INT, PVOID,
> ULONG))GetProcAddress(ntdll,
>  		"NtSetSystemInformation");
> -	if (!NtSetSystemInformation)
> +	if (!INIT_PROC_ADDR(NtSetSystemInformation))
>  		return error("Can't get function addresses, wrong Windows
> version?");
> 
> -	SYSTEM_MEMORY_LIST_COMMAND command =
> MemoryPurgeStandbyList;
> +	command = MemoryPurgeStandbyList;
>  	status = NtSetSystemInformation(
>  		SystemMemoryListInformation,
>  		&command,
> @@ -111,8 +109,6 @@ int cmd_dropcaches(void)
>  	else if (status != STATUS_SUCCESS)
>  		error("Unable to execute the memory list command %d",
> status);
> 
> -	FreeLibrary(ntdll);
> -
>  	return status;
>  }
> 
> --
> 2.14.1.windows.1.510.g0cb6d35d23

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v6 08/12] fsmonitor: add a test tool to dump the index extension
  2017-09-18 16:28           ` Ben Peart
@ 2017-09-19 14:16             ` Torsten Bögershausen
  2017-09-19 15:36               ` Ben Peart
  0 siblings, 1 reply; 137+ messages in thread
From: Torsten Bögershausen @ 2017-09-19 14:16 UTC (permalink / raw)
  To: Ben Peart
  Cc: Ben Peart, Junio C Hamano, David.Turner@twosigma.com,
	avarab@gmail.com, christian.couder@gmail.com, git@vger.kernel.org,
	johannes.schindelin@gmx.de, pclouds@gmail.com, peff@peff.net

> 
> Should I just make the variable type itself uintmax_t and then just skip
> the cast altogether? I went with uint64_t because that is what
> getnanotime returned.
> 

That is a bit of taste question (or answer)

Typically you declare the variables in the type you need,
and this is uint64_t.

Let's step back a bit:
To print e.g a variable of type uint32_t, you use  PRIu32 in the format
string, like this:

fprintf(stderr, "Total %"PRIu32" (delta %"PRIu32"),",....

In theory (it is in the later specs, and it exists on many platforms),
there is a PRIu64 as well.

We don't seem to use it in Git, probably because uintmax_t is (more)
portable and understood by all platforms which support Git.
(And beside that, on most platforms uintmax_t is 64 bit).

So my suggestion would be to keep uint64_t and cast the variable into uintmax_t
whenever it is printed.



^ permalink raw reply	[flat|nested] 137+ messages in thread

* RE: [PATCH v6 12/12] fsmonitor: add a performance test
  2017-09-18 18:19       ` Ben Peart
@ 2017-09-19 15:28         ` Johannes Schindelin
  0 siblings, 0 replies; 137+ messages in thread
From: Johannes Schindelin @ 2017-09-19 15:28 UTC (permalink / raw)
  To: Ben Peart
  Cc: David.Turner@twosigma.com, avarab@gmail.com,
	christian.couder@gmail.com, git@vger.kernel.org,
	gitster@pobox.com, pclouds@gmail.com, peff@peff.net

Hi Ben,

On Mon, 18 Sep 2017, Ben Peart wrote:

> > From: Johannes Schindelin [mailto:Johannes.Schindelin@gmx.de]
> > On Fri, 15 Sep 2017, Ben Peart wrote:
> > 
> > > +	DWORD(WINAPI *NtSetSystemInformation)(INT, PVOID, ULONG) =
> > > +		(DWORD(WINAPI *)(INT, PVOID,
> > ULONG))GetProcAddress(ntdll, "NtSetSystemInformation");
> > > +	if (!NtSetSystemInformation)
> > > +		return error("Can't get function addresses, wrong Windows
> > > +version?");
> > 
> > It turns out that we have seen this plenty of times in Git for Windows'
> > fork, so much so that we came up with a nice helper to make this all a bit
> > more robust and a bit more obvious, too: the DECLARE_PROC_ADDR and
> > INIT_PROC_ADDR helpers in compat/win32/lazyload.h.
> > 
> > Maybe this would be the perfect excuse to integrate this patch into upstream
> > Git? 
> 
> This patch is pretty hefty already.  How about you push this capability
> upstream and I take advantage of it in a later patch. :)

Deal:
https://public-inbox.org/git/f5a3add27206df3e7f39efeac8a3c3b47f2b79f2.1505834586.git.johannes.schindelin@gmx.de

Ciao,
Johannes

^ permalink raw reply	[flat|nested] 137+ messages in thread

* RE: [PATCH v6 08/12] fsmonitor: add a test tool to dump the index extension
  2017-09-19 14:16             ` Torsten Bögershausen
@ 2017-09-19 15:36               ` Ben Peart
  0 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-19 15:36 UTC (permalink / raw)
  To: Torsten Bögershausen
  Cc: Ben Peart, Junio C Hamano, David.Turner@twosigma.com,
	avarab@gmail.com, christian.couder@gmail.com, git@vger.kernel.org,
	johannes.schindelin@gmx.de, pclouds@gmail.com, peff@peff.net

> -----Original Message-----
> From: Torsten Bögershausen [mailto:tboegi@web.de]
> Sent: Tuesday, September 19, 2017 10:16 AM
> To: Ben Peart <Ben.Peart@microsoft.com>
> Cc: Ben Peart <peartben@gmail.com>; Junio C Hamano
> <gitster@pobox.com>; David.Turner@twosigma.com; avarab@gmail.com;
> christian.couder@gmail.com; git@vger.kernel.org;
> johannes.schindelin@gmx.de; pclouds@gmail.com; peff@peff.net
> Subject: Re: [PATCH v6 08/12] fsmonitor: add a test tool to dump the index
> extension
> 
> >
> > Should I just make the variable type itself uintmax_t and then just
> > skip the cast altogether? I went with uint64_t because that is what
> > getnanotime returned.
> >
> 
> That is a bit of taste question (or answer)
> 
> Typically you declare the variables in the type you need, and this is uint64_t.
> 
> Let's step back a bit:
> To print e.g a variable of type uint32_t, you use  PRIu32 in the format string,
> like this:
> 
> fprintf(stderr, "Total %"PRIu32" (delta %"PRIu32"),",....
> 
> In theory (it is in the later specs, and it exists on many platforms), there is a
> PRIu64 as well.
> 
> We don't seem to use it in Git, probably because uintmax_t is (more)
> portable and understood by all platforms which support Git.
> (And beside that, on most platforms uintmax_t is 64 bit).
> 
> So my suggestion would be to keep uint64_t and cast the variable into
> uintmax_t whenever it is printed.
> 

Great!  That is the way I have it.

^ permalink raw reply	[flat|nested] 137+ messages in thread

* [PATCH v7 00/12] Fast git status via a file system watcher
  2017-09-15 19:20 ` [PATCH v6 00/12] " Ben Peart
                     ` (11 preceding siblings ...)
  2017-09-15 19:20   ` [PATCH v6 12/12] fsmonitor: add a performance test Ben Peart
@ 2017-09-19 19:27   ` Ben Peart
  2017-09-19 19:27     ` [PATCH v7 01/12] bswap: add 64 bit endianness helper get_be64 Ben Peart
                       ` (12 more replies)
  12 siblings, 13 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-19 19:27 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

Subject: Fast git status via a file system watcher

Thanks to everyone who provided feedback.  There are lots of minor style
changes, documentation updates and a fixed leak.

The only functional change is the addition of support to set/clear the
fsmonitor valid bit via 'git update-index --[no-]fsmonitor-valid' with
the associated documentation and tests.

Interdiff between V6 and V7:

diff --git a/Documentation/config.txt b/Documentation/config.txt
index c196007a27..db52645cb4 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -418,6 +418,7 @@ core.fsmonitor::
 	will identify all files that may have changed since the
 	requested date/time. This information is used to speed up git by
 	avoiding unnecessary processing of files that have not changed.
+	See the "fsmonitor-watchman" section of linkgit:githooks[5].
 
 core.trustctime::
 	If false, the ctime differences between the index and the
diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index d153c17e06..3ac3e3a77d 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -9,7 +9,7 @@ git-ls-files - Show information about files in the index and the working tree
 SYNOPSIS
 --------
 [verse]
-'git ls-files' [-z] [-t] [-v]
+'git ls-files' [-z] [-t] [-v] [-f]
 		(--[cached|deleted|others|ignored|stage|unmerged|killed|modified])*
 		(-[c|d|o|i|s|u|k|m])*
 		[--eol]
@@ -133,6 +133,11 @@ a space) at the start of each line:
 	that are marked as 'assume unchanged' (see
 	linkgit:git-update-index[1]).
 
+-f::
+	Similar to `-t`, but use lowercase letters for files
+	that are marked as 'fsmonitor valid' (see
+	linkgit:git-update-index[1]).
+
 --full-name::
 	When run from a subdirectory, the command usually
 	outputs paths relative to the current directory.  This
diff --git a/Documentation/git-update-index.txt b/Documentation/git-update-index.txt
index e19eba62cd..95231dbfcb 100644
--- a/Documentation/git-update-index.txt
+++ b/Documentation/git-update-index.txt
@@ -16,9 +16,11 @@ SYNOPSIS
 	     [--chmod=(+|-)x]
 	     [--[no-]assume-unchanged]
 	     [--[no-]skip-worktree]
+	     [--[no-]fsmonitor-valid]
 	     [--ignore-submodules]
 	     [--[no-]split-index]
 	     [--[no-|test-|force-]untracked-cache]
+	     [--[no-]fsmonitor]
 	     [--really-refresh] [--unresolve] [--again | -g]
 	     [--info-only] [--index-info]
 	     [-z] [--stdin] [--index-version <n>]
@@ -111,6 +113,12 @@ you will need to handle the situation manually.
 	set and unset the "skip-worktree" bit for the paths. See
 	section "Skip-worktree bit" below for more information.
 
+--[no-]fsmonitor-valid::
+	When one of these flags is specified, the object name recorded
+	for the paths are not updated. Instead, these options
+	set and unset the "fsmonitor valid" bit for the paths. See
+	section "File System Monitor" below for more information.
+
 -g::
 --again::
 	Runs 'git update-index' itself on the paths whose index
@@ -201,6 +209,15 @@ will remove the intended effect of the option.
 	`--untracked-cache` used to imply `--test-untracked-cache` but
 	this option would enable the extension unconditionally.
 
+--fsmonitor::
+--no-fsmonitor::
+	Enable or disable files system monitor feature. These options
+	take effect whatever the value of the `core.fsmonitor`
+	configuration variable (see linkgit:git-config[1]). But a warning
+	is emitted when the change goes against the configured value, as
+	the configured value will take effect next time the index is
+	read and this will remove the intended effect of the option.
+
 \--::
 	Do not interpret any more arguments as options.
 
@@ -447,6 +464,34 @@ command reads the index; while when `--[no-|force-]untracked-cache`
 are used, the untracked cache is immediately added to or removed from
 the index.
 
+File System Monitor
+-------------------
+
+This feature is intended to speed up git operations for repos that have
+large working directories.
+
+It enables git to work together with a file system monitor (see the
+"fsmonitor-watchman" section of linkgit:githooks[5]) that can
+inform it as to what files have been modified. This enables git to avoid
+having to lstat() every file to find modified files.
+
+When used in conjunction with the untracked cache, it can further improve
+performance by avoiding the cost of scaning the entire working directory
+looking for new files.
+
+If you want to enable (or disable) this feature, it is easier to use
+the `core.fsmonitor` configuration variable (see
+linkgit:git-config[1]) than using the `--fsmonitor` option to
+`git update-index` in each repository, especially if you want to do so
+across all repositories you use, because you can set the configuration
+variable to `true` (or `false`) in your `$HOME/.gitconfig` just once
+and have it affect all repositories you touch.
+
+When the `core.fsmonitor` configuration variable is changed, the
+file system monitor is added to or removed from the index the next time
+a command reads the index. When `--[no-]fsmonitor` are used, the file
+system monitor is immediately added to or removed from the index.
+
 Configuration
 -------------
 
diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index da82d64b0b..ae60559cd9 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -455,10 +455,8 @@ the name of the file that holds the e-mail to be sent.  Exiting with a
 non-zero status causes 'git send-email' to abort before sending any
 e-mails.
 
-
-[[fsmonitor-watchman]]
 fsmonitor-watchman
-~~~~~~~~~~~~~~~
+~~~~~~~~~~~~~~~~~~
 
 This hook is invoked when the configuration option core.fsmonitor is
 set to .git/hooks/fsmonitor-watchman.  It takes two arguments, a version
@@ -471,10 +469,17 @@ should be inclusive so that it does not miss any potential changes.
 The paths should be relative to the root of the working directory
 and be separated by a single NUL.
 
+It is OK to include files which have not actually changed.  All changes
+including newly-created and deleted files should be included. When
+files are renamed, both the old and the new name should be included.
+
 Git will limit what files it checks for changes as well as which
 directories are checked for untracked files based on the path names
 given.
 
+An optimized way to tell git "all files have changed" is to return
+the filename '/'.
+
 The exit status determines whether git will use the data from the
 hook to limit its search.  On error, it will fall back to verifying
 all files and folders.
diff --git a/builtin/update-index.c b/builtin/update-index.c
index b03afc1f3a..41618db098 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -33,6 +33,7 @@ static int force_remove;
 static int verbose;
 static int mark_valid_only;
 static int mark_skip_worktree_only;
+static int mark_fsmonitor_only;
 #define MARK_FLAG 1
 #define UNMARK_FLAG 2
 static struct strbuf mtime_dir = STRBUF_INIT;
@@ -229,12 +230,12 @@ static int mark_ce_flags(const char *path, int flag, int mark)
 	int namelen = strlen(path);
 	int pos = cache_name_pos(path, namelen);
 	if (0 <= pos) {
+		mark_fsmonitor_invalid(&the_index, active_cache[pos]);
 		if (mark)
 			active_cache[pos]->ce_flags |= flag;
 		else
 			active_cache[pos]->ce_flags &= ~flag;
 		active_cache[pos]->ce_flags |= CE_UPDATE_IN_BASE;
-		mark_fsmonitor_invalid(&the_index, active_cache[pos]);
 		cache_tree_invalidate_path(&the_index, path);
 		active_cache_changed |= CE_ENTRY_CHANGED;
 		return 0;
@@ -460,6 +461,11 @@ static void update_one(const char *path)
 			die("Unable to mark file %s", path);
 		return;
 	}
+	if (mark_fsmonitor_only) {
+		if (mark_ce_flags(path, CE_FSMONITOR_VALID, mark_fsmonitor_only == MARK_FLAG))
+			die("Unable to mark file %s", path);
+		return;
+	}
 
 	if (force_remove) {
 		if (remove_file_from_cache(path))
@@ -1014,6 +1020,12 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 			N_("write out the index even if is not flagged as changed"), 1),
 		OPT_BOOL(0, "fsmonitor", &fsmonitor,
 			N_("enable or disable file system monitor")),
+		{OPTION_SET_INT, 0, "fsmonitor-valid", &mark_fsmonitor_only, NULL,
+			N_("mark files as fsmonitor valid"),
+			PARSE_OPT_NOARG | PARSE_OPT_NONEG, NULL, MARK_FLAG},
+		{OPTION_SET_INT, 0, "no-fsmonitor-valid", &mark_fsmonitor_only, NULL,
+			N_("clear fsmonitor valid bit"),
+			PARSE_OPT_NOARG | PARSE_OPT_NONEG, NULL, UNMARK_FLAG},
 		OPT_END()
 	};
 
diff --git a/fsmonitor.c b/fsmonitor.c
index 144294b8df..b8b2d88fe1 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -57,12 +57,12 @@ int read_fsmonitor_extension(struct index_state *istate, const void *data,
 
 		/* Mark all previously saved entries as dirty */
 		ewah_each_bit(fsmonitor_dirty, fsmonitor_ewah_callback, istate);
-		ewah_free(fsmonitor_dirty);
 
 		/* Now mark the untracked cache for fsmonitor usage */
 		if (istate->untracked)
 			istate->untracked->use_fsmonitor = 1;
 	}
+	ewah_free(fsmonitor_dirty);
 
 	trace_printf_key(&trace_fsmonitor, "read fsmonitor extension successful");
 	return 0;
@@ -177,7 +177,7 @@ void refresh_fsmonitor(struct index_state *istate)
 	}
 
 	/* a fsmonitor process can return '*' to indicate all entries are invalid */
-	if (query_success && query_result.buf[0] != '*') {
+	if (query_success && query_result.buf[0] != '/') {
 		/* Mark all entries returned by the monitor as dirty */
 		buf = query_result.buf;
 		bol = 0;
diff --git a/fsmonitor.h b/fsmonitor.h
index dadbe90283..c2240b811a 100644
--- a/fsmonitor.h
+++ b/fsmonitor.h
@@ -46,7 +46,7 @@ static inline void mark_fsmonitor_valid(struct cache_entry *ce)
 }
 
 /*
- * Clear the given cache entries CE_FSMONITOR_VALID bit and invalidate any
+ * Clear the given cache entry's CE_FSMONITOR_VALID bit and invalidate any
  * corresponding untracked cache directory structures.
  */
 static inline void mark_fsmonitor_invalid(struct index_state *istate, struct cache_entry *ce)
diff --git a/t/helper/test-drop-caches.c b/t/helper/test-drop-caches.c
index 717079865c..4e5ca8f397 100644
--- a/t/helper/test-drop-caches.c
+++ b/t/helper/test-drop-caches.c
@@ -2,7 +2,7 @@
 
 #if defined(GIT_WINDOWS_NATIVE)
 
-int cmd_sync(void)
+static int cmd_sync(void)
 {
 	char Buffer[MAX_PATH];
 	DWORD dwRet;
@@ -49,7 +49,7 @@ typedef enum _SYSTEM_MEMORY_LIST_COMMAND {
 	MemoryCommandMax
 } SYSTEM_MEMORY_LIST_COMMAND;
 
-BOOL GetPrivilege(HANDLE TokenHandle, LPCSTR lpName, int flags)
+static BOOL GetPrivilege(HANDLE TokenHandle, LPCSTR lpName, int flags)
 {
 	BOOL bResult;
 	DWORD dwBufferLength;
@@ -77,10 +77,11 @@ BOOL GetPrivilege(HANDLE TokenHandle, LPCSTR lpName, int flags)
 	return bResult;
 }
 
-int cmd_dropcaches(void)
+static int cmd_dropcaches(void)
 {
 	HANDLE hProcess = GetCurrentProcess();
 	HANDLE hToken;
+	HMODULE ntdll;
 	int status;
 
 	if (!OpenProcessToken(hProcess, TOKEN_QUERY | TOKEN_ADJUST_PRIVILEGES, &hToken))
@@ -91,7 +92,7 @@ int cmd_dropcaches(void)
 
 	CloseHandle(hToken);
 
-	HMODULE ntdll = LoadLibrary("ntdll.dll");
+	ntdll = LoadLibrary("ntdll.dll");
 	if (!ntdll)
 		return error("Can't load ntdll.dll, wrong Windows version?");
 
@@ -118,36 +119,36 @@ int cmd_dropcaches(void)
 
 #elif defined(__linux__)
 
-int cmd_sync(void)
+static int cmd_sync(void)
 {
 	return system("sync");
 }
 
-int cmd_dropcaches(void)
+static int cmd_dropcaches(void)
 {
 	return system("echo 3 | sudo tee /proc/sys/vm/drop_caches");
 }
 
 #elif defined(__APPLE__)
 
-int cmd_sync(void)
+static int cmd_sync(void)
 {
 	return system("sync");
 }
 
-int cmd_dropcaches(void)
+static int cmd_dropcaches(void)
 {
 	return system("sudo purge");
 }
 
 #else
 
-int cmd_sync(void)
+static int cmd_sync(void)
 {
 	return 0;
 }
 
-int cmd_dropcaches(void)
+static int cmd_dropcaches(void)
 {
 	return error("drop caches not implemented on this platform");
 }
diff --git a/t/helper/test-dump-fsmonitor.c b/t/helper/test-dump-fsmonitor.c
index 482d749bb9..ad452707e8 100644
--- a/t/helper/test-dump-fsmonitor.c
+++ b/t/helper/test-dump-fsmonitor.c
@@ -12,7 +12,7 @@ int cmd_main(int ac, const char **av)
 		printf("no fsmonitor\n");
 		return 0;
 	}
-	printf("fsmonitor last update %"PRIuMAX"\n", istate->fsmonitor_last_update);
+	printf("fsmonitor last update %"PRIuMAX"\n", (uintmax_t)istate->fsmonitor_last_update);
 
 	for (i = 0; i < istate->cache_nr; i++)
 		printf((istate->cache[i]->ce_flags & CE_FSMONITOR_VALID) ? "+" : "-");
diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
index 1c5978d5c8..16d1bf72e5 100755
--- a/t/perf/p7519-fsmonitor.sh
+++ b/t/perf/p7519-fsmonitor.sh
@@ -95,7 +95,7 @@ test_expect_success "setup for fsmonitor" '
 		INTEGRATION_SCRIPT="$GIT_PERF_7519_FSMONITOR"
 	else
 		#
-		# Choose integration script based on existance of Watchman.
+		# Choose integration script based on existence of Watchman.
 		# If Watchman exists, watch the work tree and attempt a query.
 		# If everything succeeds, use Watchman integration script,
 		# else fall back to an empty integration script.
diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh
index 6aa1e4e924..c6df85af5e 100755
--- a/t/t7519-status-fsmonitor.sh
+++ b/t/t7519-status-fsmonitor.sh
@@ -32,14 +32,16 @@ dirty_repo () {
 	echo 6 >dir2/new
 }
 
-write_integration_script() {
+write_integration_script () {
 	write_script .git/hooks/fsmonitor-test<<-\EOF
-	if [ "$#" -ne 2 ]; then
+	if test "$#" -ne 2
+	then
 		echo "$0: exactly 2 arguments expected"
 		exit 2
 	fi
-	if [ "$1" != 1 ]; then
-		echo -e "Unsupported core.fsmonitor hook version.\n" >&2
+	if test "$1" != 1
+	then
+		echo "Unsupported core.fsmonitor hook version." >&2
 		exit 1
 	fi
 	printf "untracked\0"
@@ -100,6 +102,43 @@ test_expect_success 'update-index --no-fsmonitor" removes the fsmonitor extensio
 	grep "^no fsmonitor" actual
 '
 
+cat >expect <<EOF &&
+h dir1/modified
+H dir1/tracked
+h dir2/modified
+H dir2/tracked
+h modified
+H tracked
+EOF
+
+# test that "update-index --fsmonitor-valid" sets the fsmonitor valid bit
+test_expect_success 'update-index --fsmonitor-valid" sets the fsmonitor valid bit' '
+	git update-index --fsmonitor &&
+	git update-index --fsmonitor-valid dir1/modified &&
+	git update-index --fsmonitor-valid dir2/modified &&
+	git update-index --fsmonitor-valid modified &&
+	git ls-files -f >actual &&
+	test_cmp expect actual
+'
+
+cat >expect <<EOF &&
+H dir1/modified
+H dir1/tracked
+H dir2/modified
+H dir2/tracked
+H modified
+H tracked
+EOF
+
+# test that "update-index --no-fsmonitor-valid" clears the fsmonitor valid bit
+test_expect_success 'update-index --no-fsmonitor-valid" clears the fsmonitor valid bit' '
+	git update-index --no-fsmonitor-valid dir1/modified &&
+	git update-index --no-fsmonitor-valid dir2/modified &&
+	git update-index --no-fsmonitor-valid modified &&
+	git ls-files -f >actual &&
+	test_cmp expect actual
+'
+
 cat >expect <<EOF &&
 H dir1/modified
 H dir1/tracked
@@ -204,7 +243,7 @@ for preload_val in $preload_values
 do
 	test_expect_success "setup preloadIndex to $preload_val" '
 		git config core.preloadIndex $preload_val &&
-		if [ $preload_val -eq true ]
+		if test $preload_val = true
 		then
 			GIT_FORCE_PRELOAD_TEST=$preload_val; export GIT_FORCE_PRELOAD_TEST
 		else
@@ -249,8 +288,14 @@ do
 			git status >actual &&
 			test_path_is_file marker &&
 			test_i18ngrep ! "Changes not staged for commit:" actual &&
-			if [ $uc_val -eq true ]; then test_i18ngrep ! "Untracked files:" actual; fi &&
-			if [ $uc_val -eq false ]; then test_i18ngrep "Untracked files:" actual; fi &&
+			if test $uc_val = true
+			then
+				test_i18ngrep ! "Untracked files:" actual
+			fi &&
+			if test $uc_val = false
+			then
+				test_i18ngrep "Untracked files:" actual
+			fi &&
 			rm -f marker
 		'
 	done
diff --git a/t/t7519/fsmonitor-all b/t/t7519/fsmonitor-all
index a3870e431e..691bc94dc2 100755
--- a/t/t7519/fsmonitor-all
+++ b/t/t7519/fsmonitor-all
@@ -9,15 +9,16 @@
 #
 #echo "$0 $*" >&2
 
-if [ "$#" -ne 2 ] ; then
-	echo -e "$0: exactly 2 arguments expected\n" >&2
+if test "$#" -ne 2
+then
+	echo "$0: exactly 2 arguments expected" >&2
 	exit 2
 fi
 
-if [ "$1" != 1 ]
+if test "$1" != 1
 then
-	echo -e "Unsupported core.fsmonitor hook version.\n" >&2
+	echo "Unsupported core.fsmonitor hook version." >&2
 	exit 1
 fi
 
-echo "*"
\ No newline at end of file
+echo "/"
diff --git a/t/t7519/fsmonitor-none b/t/t7519/fsmonitor-none
index c500bb0f26..ed9cf5a6a9 100755
--- a/t/t7519/fsmonitor-none
+++ b/t/t7519/fsmonitor-none
@@ -9,13 +9,14 @@
 #
 #echo "$0 $*" >&2
 
-if [ "$#" -ne 2 ] ; then
-	echo -e "$0: exactly 2 arguments expected\n" >&2
+if test "$#" -ne 2
+then
+	echo "$0: exactly 2 arguments expected" >&2
 	exit 2
 fi
 
-if [ "$1" != 1 ]
+if test "$1" != 1
 then
-	echo -e "Unsupported core.fsmonitor hook version.\n" >&2
+	echo "Unsupported core.fsmonitor hook version." >&2
 	exit 1
 fi
diff --git a/t/t7519/fsmonitor-watchman b/t/t7519/fsmonitor-watchman
index aaee5d1fe3..7ceb32dc18 100755
--- a/t/t7519/fsmonitor-watchman
+++ b/t/t7519/fsmonitor-watchman
@@ -17,7 +17,7 @@ use IPC::Open2;
 # 'git config core.fsmonitor .git/hooks/query-watchman'
 #
 my ($version, $time) = @ARGV;
-print STDERR "$0 $version $time\n";
+#print STDERR "$0 $version $time\n";
 
 # Check the hook interface version
 
@@ -29,7 +29,20 @@ if ($version == 1) {
 	    "Falling back to scanning...\n";
 }
 
-my $git_work_tree = $ENV{'PWD'};
+# Convert unix style paths to escaped Windows style paths when running
+# in Windows command prompt
+
+my $system = `uname -s`;
+$system =~ s/[\r\n]+//g;
+my $git_work_tree;
+
+if ($system =~ m/^MSYS_NT/) {
+	$git_work_tree = `cygpath -aw "\$PWD"`;
+	$git_work_tree =~ s/[\r\n]+//g;
+	$git_work_tree =~ s,\\,/,g;
+} else {
+	$git_work_tree = $ENV{'PWD'};
+}
 
 my $retry = 1;
 
@@ -57,20 +70,19 @@ sub launch_watchman {
 	# creation clock (cclock) newer than $time_t value and will also not
 	# currently exist.
 
+	my $query = <<"	END";
+		["query", "$git_work_tree", {
+			"since": $time,
+			"fields": ["name"],
+			"expression": ["not", ["allof", ["since", $time, "cclock"], ["not", "exists"]]]
+		}]
+	END
+	
 	open (my $fh, ">", ".git/watchman-query.json");
-	print $fh "[\"query\", \"$git_work_tree\", { \
-	\"since\": $time, \
-	\"fields\": [\"name\"], \
-	\"expression\": [\"not\", [\"allof\", [\"since\", $time, \"cclock\"], [\"not\", \"exists\"]]] \
-	}]";
+	print $fh $query;
 	close $fh;
 
-	print CHLD_IN "[\"query\", \"$git_work_tree\", { \
-	\"since\": $time, \
-	\"fields\": [\"name\"], \
-	\"expression\": [\"not\", [\"allof\", [\"since\", $time, \"cclock\"], [\"not\", \"exists\"]]] \
-	}]";
-
+	print CHLD_IN $query;
 	my $response = <CHLD_OUT>;
 
 	open ($fh, ">", ".git/watchman-response.json");
@@ -100,17 +112,17 @@ sub launch_watchman {
 		qx/watchman watch "$git_work_tree"/;
 		die "Failed to make watchman watch '$git_work_tree'.\n" .
 		    "Falling back to scanning...\n" if $? != 0;
-		# return fast "everything is dirty" flag"
-		print "*\0";
-		open ($fh, ">", ".git/watchman-output.out");
-		print "*\0";
-		close $fh;
 
 		# Watchman will always return all files on the first query so
 		# return the fast "everything is dirty" flag to git and do the
 		# Watchman query just to get it over with now so we won't pay
 		# the cost in git to look up each individual file.
-		print "*\0";
+
+		open ($fh, ">", ".git/watchman-output.out");
+		print "/\0";
+		close $fh;
+
+		print "/\0";
 		eval { launch_watchman() };
 		exit 0;
 	}
@@ -118,11 +130,11 @@ sub launch_watchman {
 	die "Watchman: $o->{error}.\n" .
 	    "Falling back to scanning...\n" if $o->{error};
 
-	binmode STDOUT, ":utf8";
-	local $, = "\0";
-	print @{$o->{files}};
-
 	open ($fh, ">", ".git/watchman-output.out");
 	print $fh @{$o->{files}};
 	close $fh;
-}
\ No newline at end of file
+
+	binmode STDOUT, ":utf8";
+	local $, = "\0";
+	print @{$o->{files}};
+}
diff --git a/templates/hooks--fsmonitor-watchman.sample b/templates/hooks--fsmonitor-watchman.sample
index 2779d7edf3..870a59d237 100755
--- a/templates/hooks--fsmonitor-watchman.sample
+++ b/templates/hooks--fsmonitor-watchman.sample
@@ -69,12 +69,15 @@ sub launch_watchman {
 	# creation clock (cclock) newer than $time_t value and will also not
 	# currently exist.
 
-	print CHLD_IN "[\"query\", \"$git_work_tree\", { \
-	\"since\": $time, \
-	\"fields\": [\"name\"], \
-	\"expression\": [\"not\", [\"allof\", [\"since\", $time, \"cclock\"], [\"not\", \"exists\"]]] \
-	}]";
-
+	my $query = <<"	END";
+		["query", "$git_work_tree", {
+			"since": $time,
+			"fields": ["name"],
+			"expression": ["not", ["allof", ["since", $time, "cclock"], ["not", "exists"]]]
+		}]
+	END
+
+	print CHLD_IN $query;
 	my $response = <CHLD_OUT>;
 
 	die "Watchman: command returned no output.\n" .
@@ -105,7 +108,7 @@ sub launch_watchman {
 		# return the fast "everything is dirty" flag to git and do the
 		# Watchman query just to get it over with now so we won't pay
 		# the cost in git to look up each individual file.
-		print "*\0";
+		print "/\0";
 		eval { launch_watchman() };
 		exit 0;
 	}

Ben Peart (12):
  bswap: add 64 bit endianness helper get_be64
  preload-index: add override to enable testing preload-index
  update-index: add a new --force-write-index option
  fsmonitor: teach git to optionally utilize a file system monitor to
    speed up detecting new or changed files.
  fsmonitor: add documentation for the fsmonitor extension.
  ls-files: Add support in ls-files to display the fsmonitor valid bit
  update-index: add fsmonitor support to update-index
  fsmonitor: add a test tool to dump the index extension
  split-index: disable the fsmonitor extension when running the split
    index test
  fsmonitor: add test cases for fsmonitor extension
  fsmonitor: add a sample integration script for Watchman
  fsmonitor: add a performance test

 Documentation/config.txt                   |   7 +
 Documentation/git-ls-files.txt             |   7 +-
 Documentation/git-update-index.txt         |  45 +++++
 Documentation/githooks.txt                 |  28 +++
 Documentation/technical/index-format.txt   |  19 ++
 Makefile                                   |   3 +
 apply.c                                    |   2 +-
 builtin/ls-files.c                         |   8 +-
 builtin/update-index.c                     |  38 +++-
 cache.h                                    |  10 +-
 compat/bswap.h                             |  22 +++
 config.c                                   |  14 ++
 config.h                                   |   1 +
 diff-lib.c                                 |   2 +
 dir.c                                      |  27 ++-
 dir.h                                      |   2 +
 entry.c                                    |   4 +-
 environment.c                              |   1 +
 fsmonitor.c                                | 253 ++++++++++++++++++++++++
 fsmonitor.h                                |  61 ++++++
 preload-index.c                            |   8 +-
 read-cache.c                               |  49 ++++-
 submodule.c                                |   2 +-
 t/helper/.gitignore                        |   1 +
 t/helper/test-drop-caches.c                | 162 +++++++++++++++
 t/helper/test-dump-fsmonitor.c             |  21 ++
 t/perf/p7519-fsmonitor.sh                  | 184 +++++++++++++++++
 t/t1700-split-index.sh                     |   1 +
 t/t7519-status-fsmonitor.sh                | 304 +++++++++++++++++++++++++++++
 t/t7519/fsmonitor-all                      |  24 +++
 t/t7519/fsmonitor-none                     |  22 +++
 t/t7519/fsmonitor-watchman                 | 140 +++++++++++++
 templates/hooks--fsmonitor-watchman.sample | 122 ++++++++++++
 unpack-trees.c                             |   8 +-
 34 files changed, 1572 insertions(+), 30 deletions(-)
 create mode 100644 fsmonitor.c
 create mode 100644 fsmonitor.h
 create mode 100644 t/helper/test-drop-caches.c
 create mode 100644 t/helper/test-dump-fsmonitor.c
 create mode 100755 t/perf/p7519-fsmonitor.sh
 create mode 100755 t/t7519-status-fsmonitor.sh
 create mode 100755 t/t7519/fsmonitor-all
 create mode 100755 t/t7519/fsmonitor-none
 create mode 100755 t/t7519/fsmonitor-watchman
 create mode 100755 templates/hooks--fsmonitor-watchman.sample

-- 
2.14.1.windows.1


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v7 01/12] bswap: add 64 bit endianness helper get_be64
  2017-09-19 19:27   ` [PATCH v7 00/12] Fast git status via a file system watcher Ben Peart
@ 2017-09-19 19:27     ` Ben Peart
  2017-09-19 19:27     ` [PATCH v7 02/12] preload-index: add override to enable testing preload-index Ben Peart
                       ` (11 subsequent siblings)
  12 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-19 19:27 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

Add a new get_be64 macro to enable 64 bit endian conversions on memory
that may or may not be aligned.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 compat/bswap.h | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/compat/bswap.h b/compat/bswap.h
index 7d063e9e40..6b22c46214 100644
--- a/compat/bswap.h
+++ b/compat/bswap.h
@@ -158,7 +158,9 @@ static inline uint64_t git_bswap64(uint64_t x)
 
 #define get_be16(p)	ntohs(*(unsigned short *)(p))
 #define get_be32(p)	ntohl(*(unsigned int *)(p))
+#define get_be64(p)	ntohll(*(uint64_t *)(p))
 #define put_be32(p, v)	do { *(unsigned int *)(p) = htonl(v); } while (0)
+#define put_be64(p, v)	do { *(uint64_t *)(p) = htonll(v); } while (0)
 
 #else
 
@@ -178,6 +180,13 @@ static inline uint32_t get_be32(const void *ptr)
 		(uint32_t)p[3] <<  0;
 }
 
+static inline uint64_t get_be64(const void *ptr)
+{
+	const unsigned char *p = ptr;
+	return	(uint64_t)get_be32(p[0]) << 32 |
+		(uint64_t)get_be32(p[4]) <<  0;
+}
+
 static inline void put_be32(void *ptr, uint32_t value)
 {
 	unsigned char *p = ptr;
@@ -187,4 +196,17 @@ static inline void put_be32(void *ptr, uint32_t value)
 	p[3] = value >>  0;
 }
 
+static inline void put_be64(void *ptr, uint64_t value)
+{
+	unsigned char *p = ptr;
+	p[0] = value >> 56;
+	p[1] = value >> 48;
+	p[2] = value >> 40;
+	p[3] = value >> 32;
+	p[4] = value >> 24;
+	p[5] = value >> 16;
+	p[6] = value >>  8;
+	p[7] = value >>  0;
+}
+
 #endif
-- 
2.14.1.windows.1


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v7 02/12] preload-index: add override to enable testing preload-index
  2017-09-19 19:27   ` [PATCH v7 00/12] Fast git status via a file system watcher Ben Peart
  2017-09-19 19:27     ` [PATCH v7 01/12] bswap: add 64 bit endianness helper get_be64 Ben Peart
@ 2017-09-19 19:27     ` Ben Peart
  2017-09-20 22:06       ` Stefan Beller
  2017-09-19 19:27     ` [PATCH v7 03/12] update-index: add a new --force-write-index option Ben Peart
                       ` (10 subsequent siblings)
  12 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-09-19 19:27 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

Preload index doesn't run unless it has a minimum number of 1000 files.
To enable running tests with fewer files, add an environment variable
(GIT_FORCE_PRELOAD_TEST) which will override that minimum and set it to 2.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 preload-index.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/preload-index.c b/preload-index.c
index 70a4c80878..75564c497a 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -79,6 +79,8 @@ static void preload_index(struct index_state *index,
 		return;
 
 	threads = index->cache_nr / THREAD_COST;
+	if ((index->cache_nr > 1) && (threads < 2) && getenv("GIT_FORCE_PRELOAD_TEST"))
+		threads = 2;
 	if (threads < 2)
 		return;
 	if (threads > MAX_PARALLEL)
-- 
2.14.1.windows.1


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v7 03/12] update-index: add a new --force-write-index option
  2017-09-19 19:27   ` [PATCH v7 00/12] Fast git status via a file system watcher Ben Peart
  2017-09-19 19:27     ` [PATCH v7 01/12] bswap: add 64 bit endianness helper get_be64 Ben Peart
  2017-09-19 19:27     ` [PATCH v7 02/12] preload-index: add override to enable testing preload-index Ben Peart
@ 2017-09-19 19:27     ` Ben Peart
  2017-09-20  5:47       ` Junio C Hamano
  2017-09-19 19:27     ` [PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
                       ` (9 subsequent siblings)
  12 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-09-19 19:27 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

At times, it makes sense to avoid the cost of writing out the index
when the only changes can easily be recomputed on demand. This causes
problems when trying to write test cases to verify that state as they
can't guarantee the state has been persisted to disk.

Add a new option (--force-write-index) to update-index that will
ensure the index is written out even if the cache_changed flag is not
set.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 builtin/update-index.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index d562f2ec69..e1ca0759d5 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -915,6 +915,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 	struct refresh_params refresh_args = {0, &has_errors};
 	int lock_error = 0;
 	int split_index = -1;
+	int force_write = 0;
 	struct lock_file *lock_file;
 	struct parse_opt_ctx_t ctx;
 	strbuf_getline_fn getline_fn;
@@ -1006,6 +1007,8 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 			    N_("test if the filesystem supports untracked cache"), UC_TEST),
 		OPT_SET_INT(0, "force-untracked-cache", &untracked_cache,
 			    N_("enable untracked cache without testing the filesystem"), UC_FORCE),
+		OPT_SET_INT(0, "force-write-index", &force_write,
+			N_("write out the index even if is not flagged as changed"), 1),
 		OPT_END()
 	};
 
@@ -1147,7 +1150,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 		die("BUG: bad untracked_cache value: %d", untracked_cache);
 	}
 
-	if (active_cache_changed) {
+	if (active_cache_changed || force_write) {
 		if (newfd < 0) {
 			if (refresh_args.flags & REFRESH_QUIET)
 				exit(128);
-- 
2.14.1.windows.1


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  2017-09-19 19:27   ` [PATCH v7 00/12] Fast git status via a file system watcher Ben Peart
                       ` (2 preceding siblings ...)
  2017-09-19 19:27     ` [PATCH v7 03/12] update-index: add a new --force-write-index option Ben Peart
@ 2017-09-19 19:27     ` Ben Peart
  2017-09-20  2:28       ` Junio C Hamano
  2017-09-20  6:23       ` Junio C Hamano
  2017-09-19 19:27     ` [PATCH v7 05/12] fsmonitor: add documentation for the fsmonitor extension Ben Peart
                       ` (8 subsequent siblings)
  12 siblings, 2 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-19 19:27 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

When the index is read from disk, the fsmonitor index extension is used
to flag the last known potentially dirty index entries. The registered
core.fsmonitor command is called with the time the index was last
updated and returns the list of files changed since that time. This list
is used to flag any additional dirty cache entries and untracked cache
directories.

We can then use this valid state to speed up preload_index(),
ie_match_stat(), and refresh_cache_ent() as they do not need to lstat()
files to detect potential changes for those entries marked
CE_FSMONITOR_VALID.

In addition, if the untracked cache is turned on valid_cached_dir() can
skip checking directories for new or changed files as fsmonitor will
invalidate the cache only for those directories that have been
identified as having potential changes.

To keep the CE_FSMONITOR_VALID state accurate during git operations;
when git updates a cache entry to match the current state on disk,
it will now set the CE_FSMONITOR_VALID bit.

Inversely, anytime git changes a cache entry, the CE_FSMONITOR_VALID bit
is cleared and the corresponding untracked cache directory is marked
invalid.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 Makefile               |   1 +
 apply.c                |   2 +-
 builtin/update-index.c |   2 +
 cache.h                |  10 +-
 config.c               |  14 +++
 config.h               |   1 +
 diff-lib.c             |   2 +
 dir.c                  |  27 ++++--
 dir.h                  |   2 +
 entry.c                |   4 +-
 environment.c          |   1 +
 fsmonitor.c            | 253 +++++++++++++++++++++++++++++++++++++++++++++++++
 fsmonitor.h            |  61 ++++++++++++
 preload-index.c        |   6 +-
 read-cache.c           |  49 ++++++++--
 submodule.c            |   2 +-
 unpack-trees.c         |   8 +-
 17 files changed, 419 insertions(+), 26 deletions(-)
 create mode 100644 fsmonitor.c
 create mode 100644 fsmonitor.h

diff --git a/Makefile b/Makefile
index f2bb7f2f63..9d6ec9c1e9 100644
--- a/Makefile
+++ b/Makefile
@@ -786,6 +786,7 @@ LIB_OBJS += ewah/ewah_rlw.o
 LIB_OBJS += exec_cmd.o
 LIB_OBJS += fetch-pack.o
 LIB_OBJS += fsck.o
+LIB_OBJS += fsmonitor.o
 LIB_OBJS += gettext.o
 LIB_OBJS += gpg-interface.o
 LIB_OBJS += graph.o
diff --git a/apply.c b/apply.c
index 71cbbd141c..9061cc5f15 100644
--- a/apply.c
+++ b/apply.c
@@ -3399,7 +3399,7 @@ static int verify_index_match(const struct cache_entry *ce, struct stat *st)
 			return -1;
 		return 0;
 	}
-	return ce_match_stat(ce, st, CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE);
+	return ce_match_stat(ce, st, CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_IGNORE_FSMONITOR);
 }
 
 #define SUBMODULE_PATCH_WITHOUT_INDEX 1
diff --git a/builtin/update-index.c b/builtin/update-index.c
index e1ca0759d5..6f39ee9274 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -16,6 +16,7 @@
 #include "pathspec.h"
 #include "dir.h"
 #include "split-index.h"
+#include "fsmonitor.h"
 
 /*
  * Default to not allowing changes to the list of files. The
@@ -233,6 +234,7 @@ static int mark_ce_flags(const char *path, int flag, int mark)
 		else
 			active_cache[pos]->ce_flags &= ~flag;
 		active_cache[pos]->ce_flags |= CE_UPDATE_IN_BASE;
+		mark_fsmonitor_invalid(&the_index, active_cache[pos]);
 		cache_tree_invalidate_path(&the_index, path);
 		active_cache_changed |= CE_ENTRY_CHANGED;
 		return 0;
diff --git a/cache.h b/cache.h
index a916bc79e3..eccab968bd 100644
--- a/cache.h
+++ b/cache.h
@@ -203,6 +203,7 @@ struct cache_entry {
 #define CE_ADDED             (1 << 19)
 
 #define CE_HASHED            (1 << 20)
+#define CE_FSMONITOR_VALID   (1 << 21)
 #define CE_WT_REMOVE         (1 << 22) /* remove in work directory */
 #define CE_CONFLICTED        (1 << 23)
 
@@ -326,6 +327,7 @@ static inline unsigned int canon_mode(unsigned int mode)
 #define CACHE_TREE_CHANGED	(1 << 5)
 #define SPLIT_INDEX_ORDERED	(1 << 6)
 #define UNTRACKED_CHANGED	(1 << 7)
+#define FSMONITOR_CHANGED	(1 << 8)
 
 struct split_index;
 struct untracked_cache;
@@ -344,6 +346,7 @@ struct index_state {
 	struct hashmap dir_hash;
 	unsigned char sha1[20];
 	struct untracked_cache *untracked;
+	uint64_t fsmonitor_last_update;
 };
 
 extern struct index_state the_index;
@@ -679,8 +682,10 @@ extern void *read_blob_data_from_index(const struct index_state *, const char *,
 #define CE_MATCH_IGNORE_MISSING		0x08
 /* enable stat refresh */
 #define CE_MATCH_REFRESH		0x10
-extern int ie_match_stat(const struct index_state *, const struct cache_entry *, struct stat *, unsigned int);
-extern int ie_modified(const struct index_state *, const struct cache_entry *, struct stat *, unsigned int);
+/* do stat comparison even if CE_FSMONITOR_VALID is true */
+#define CE_MATCH_IGNORE_FSMONITOR 0X20
+extern int ie_match_stat(struct index_state *, const struct cache_entry *, struct stat *, unsigned int);
+extern int ie_modified(struct index_state *, const struct cache_entry *, struct stat *, unsigned int);
 
 #define HASH_WRITE_OBJECT 1
 #define HASH_FORMAT_CHECK 2
@@ -773,6 +778,7 @@ extern int core_apply_sparse_checkout;
 extern int precomposed_unicode;
 extern int protect_hfs;
 extern int protect_ntfs;
+extern const char *core_fsmonitor;
 
 /*
  * Include broken refs in all ref iterations, which will
diff --git a/config.c b/config.c
index d0d8ce823a..ddda96e584 100644
--- a/config.c
+++ b/config.c
@@ -2165,6 +2165,20 @@ int git_config_get_max_percent_split_change(void)
 	return -1; /* default value */
 }
 
+int git_config_get_fsmonitor(void)
+{
+	if (git_config_get_pathname("core.fsmonitor", &core_fsmonitor))
+		core_fsmonitor = getenv("GIT_FSMONITOR_TEST");
+
+	if (core_fsmonitor && !*core_fsmonitor)
+		core_fsmonitor = NULL;
+
+	if (core_fsmonitor)
+		return 1;
+
+	return 0;
+}
+
 NORETURN
 void git_die_config_linenr(const char *key, const char *filename, int linenr)
 {
diff --git a/config.h b/config.h
index 97471b8873..c9fcf691ba 100644
--- a/config.h
+++ b/config.h
@@ -211,6 +211,7 @@ extern int git_config_get_pathname(const char *key, const char **dest);
 extern int git_config_get_untracked_cache(void);
 extern int git_config_get_split_index(void);
 extern int git_config_get_max_percent_split_change(void);
+extern int git_config_get_fsmonitor(void);
 
 /* This dies if the configured or default date is in the future */
 extern int git_config_get_expiry(const char *key, const char **output);
diff --git a/diff-lib.c b/diff-lib.c
index 2a52b07954..23c6d03ca9 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -12,6 +12,7 @@
 #include "refs.h"
 #include "submodule.h"
 #include "dir.h"
+#include "fsmonitor.h"
 
 /*
  * diff-files
@@ -228,6 +229,7 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 
 		if (!changed && !dirty_submodule) {
 			ce_mark_uptodate(ce);
+			mark_fsmonitor_valid(ce);
 			if (!DIFF_OPT_TST(&revs->diffopt, FIND_COPIES_HARDER))
 				continue;
 		}
diff --git a/dir.c b/dir.c
index 1c55dc3e36..ac9833daec 100644
--- a/dir.c
+++ b/dir.c
@@ -18,6 +18,7 @@
 #include "utf8.h"
 #include "varint.h"
 #include "ewah/ewok.h"
+#include "fsmonitor.h"
 
 /*
  * Tells read_directory_recursive how a file or directory should be treated.
@@ -1688,17 +1689,23 @@ static int valid_cached_dir(struct dir_struct *dir,
 	if (!untracked)
 		return 0;
 
-	if (stat(path->len ? path->buf : ".", &st)) {
-		invalidate_directory(dir->untracked, untracked);
-		memset(&untracked->stat_data, 0, sizeof(untracked->stat_data));
-		return 0;
-	}
-	if (!untracked->valid ||
-	    match_stat_data_racy(istate, &untracked->stat_data, &st)) {
-		if (untracked->valid)
+	/*
+	 * With fsmonitor, we can trust the untracked cache's valid field.
+	 */
+	refresh_fsmonitor(istate);
+	if (!(dir->untracked->use_fsmonitor && untracked->valid)) {
+		if (stat(path->len ? path->buf : ".", &st)) {
 			invalidate_directory(dir->untracked, untracked);
-		fill_stat_data(&untracked->stat_data, &st);
-		return 0;
+			memset(&untracked->stat_data, 0, sizeof(untracked->stat_data));
+			return 0;
+		}
+		if (!untracked->valid ||
+			match_stat_data_racy(istate, &untracked->stat_data, &st)) {
+			if (untracked->valid)
+				invalidate_directory(dir->untracked, untracked);
+			fill_stat_data(&untracked->stat_data, &st);
+			return 0;
+		}
 	}
 
 	if (untracked->check_only != !!check_only) {
diff --git a/dir.h b/dir.h
index e3717055d1..fab8fc1561 100644
--- a/dir.h
+++ b/dir.h
@@ -139,6 +139,8 @@ struct untracked_cache {
 	int gitignore_invalidated;
 	int dir_invalidated;
 	int dir_opened;
+	/* fsmonitor invalidation data */
+	unsigned int use_fsmonitor : 1;
 };
 
 struct dir_struct {
diff --git a/entry.c b/entry.c
index cb291aa88b..5e6794f9fc 100644
--- a/entry.c
+++ b/entry.c
@@ -4,6 +4,7 @@
 #include "streaming.h"
 #include "submodule.h"
 #include "progress.h"
+#include "fsmonitor.h"
 
 static void create_directories(const char *path, int path_len,
 			       const struct checkout *state)
@@ -357,6 +358,7 @@ static int write_entry(struct cache_entry *ce,
 			lstat(ce->name, &st);
 		fill_stat_cache_info(ce, &st);
 		ce->ce_flags |= CE_UPDATE_IN_BASE;
+		mark_fsmonitor_invalid(state->istate, ce);
 		state->istate->cache_changed |= CE_ENTRY_CHANGED;
 	}
 	return 0;
@@ -402,7 +404,7 @@ int checkout_entry(struct cache_entry *ce,
 
 	if (!check_path(path.buf, path.len, &st, state->base_dir_len)) {
 		const struct submodule *sub;
-		unsigned changed = ce_match_stat(ce, &st, CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE);
+		unsigned changed = ce_match_stat(ce, &st, CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_IGNORE_FSMONITOR);
 		/*
 		 * Needs to be checked before !changed returns early,
 		 * as the possibly empty directory was not changed
diff --git a/environment.c b/environment.c
index 3fd4b10845..d0b9fc64d4 100644
--- a/environment.c
+++ b/environment.c
@@ -76,6 +76,7 @@ int protect_hfs = PROTECT_HFS_DEFAULT;
 #define PROTECT_NTFS_DEFAULT 0
 #endif
 int protect_ntfs = PROTECT_NTFS_DEFAULT;
+const char *core_fsmonitor;
 
 /*
  * The character that begins a commented line in user-editable file
diff --git a/fsmonitor.c b/fsmonitor.c
new file mode 100644
index 0000000000..b8b2d88fe1
--- /dev/null
+++ b/fsmonitor.c
@@ -0,0 +1,253 @@
+#include "cache.h"
+#include "config.h"
+#include "dir.h"
+#include "ewah/ewok.h"
+#include "fsmonitor.h"
+#include "run-command.h"
+#include "strbuf.h"
+
+#define INDEX_EXTENSION_VERSION	(1)
+#define HOOK_INTERFACE_VERSION	(1)
+
+struct trace_key trace_fsmonitor = TRACE_KEY_INIT(FSMONITOR);
+
+static void fsmonitor_ewah_callback(size_t pos, void *is)
+{
+	struct index_state *istate = (struct index_state *)is;
+	struct cache_entry *ce = istate->cache[pos];
+
+	ce->ce_flags &= ~CE_FSMONITOR_VALID;
+}
+
+int read_fsmonitor_extension(struct index_state *istate, const void *data,
+	unsigned long sz)
+{
+	const char *index = data;
+	uint32_t hdr_version;
+	uint32_t ewah_size;
+	struct ewah_bitmap *fsmonitor_dirty;
+	int i;
+	int ret;
+
+	if (sz < sizeof(uint32_t) + sizeof(uint64_t) + sizeof(uint32_t))
+		return error("corrupt fsmonitor extension (too short)");
+
+	hdr_version = get_be32(index);
+	index += sizeof(uint32_t);
+	if (hdr_version != INDEX_EXTENSION_VERSION)
+		return error("bad fsmonitor version %d", hdr_version);
+
+	istate->fsmonitor_last_update = get_be64(index);
+	index += sizeof(uint64_t);
+
+	ewah_size = get_be32(index);
+	index += sizeof(uint32_t);
+
+	fsmonitor_dirty = ewah_new();
+	ret = ewah_read_mmap(fsmonitor_dirty, index, ewah_size);
+	if (ret != ewah_size) {
+		ewah_free(fsmonitor_dirty);
+		return error("failed to parse ewah bitmap reading fsmonitor index extension");
+	}
+
+	if (git_config_get_fsmonitor()) {
+		/* Mark all entries valid */
+		for (i = 0; i < istate->cache_nr; i++)
+			istate->cache[i]->ce_flags |= CE_FSMONITOR_VALID;
+
+		/* Mark all previously saved entries as dirty */
+		ewah_each_bit(fsmonitor_dirty, fsmonitor_ewah_callback, istate);
+
+		/* Now mark the untracked cache for fsmonitor usage */
+		if (istate->untracked)
+			istate->untracked->use_fsmonitor = 1;
+	}
+	ewah_free(fsmonitor_dirty);
+
+	trace_printf_key(&trace_fsmonitor, "read fsmonitor extension successful");
+	return 0;
+}
+
+void write_fsmonitor_extension(struct strbuf *sb, struct index_state *istate)
+{
+	uint32_t hdr_version;
+	uint64_t tm;
+	struct ewah_bitmap *bitmap;
+	int i;
+	uint32_t ewah_start;
+	uint32_t ewah_size = 0;
+	int fixup = 0;
+
+	put_be32(&hdr_version, INDEX_EXTENSION_VERSION);
+	strbuf_add(sb, &hdr_version, sizeof(uint32_t));
+
+	put_be64(&tm, istate->fsmonitor_last_update);
+	strbuf_add(sb, &tm, sizeof(uint64_t));
+	fixup = sb->len;
+	strbuf_add(sb, &ewah_size, sizeof(uint32_t)); /* we'll fix this up later */
+
+	ewah_start = sb->len;
+	bitmap = ewah_new();
+	for (i = 0; i < istate->cache_nr; i++)
+		if (!(istate->cache[i]->ce_flags & CE_FSMONITOR_VALID))
+			ewah_set(bitmap, i);
+	ewah_serialize_strbuf(bitmap, sb);
+	ewah_free(bitmap);
+
+	/* fix up size field */
+	put_be32(&ewah_size, sb->len - ewah_start);
+	memcpy(sb->buf + fixup, &ewah_size, sizeof(uint32_t));
+
+	trace_printf_key(&trace_fsmonitor, "write fsmonitor extension successful");
+}
+
+/*
+ * Call the query-fsmonitor hook passing the time of the last saved results.
+ */
+static int query_fsmonitor(int version, uint64_t last_update, struct strbuf *query_result)
+{
+	struct child_process cp = CHILD_PROCESS_INIT;
+	char ver[64];
+	char date[64];
+	const char *argv[4];
+
+	if (!(argv[0] = core_fsmonitor))
+		return -1;
+
+	snprintf(ver, sizeof(version), "%d", version);
+	snprintf(date, sizeof(date), "%" PRIuMAX, (uintmax_t)last_update);
+	argv[1] = ver;
+	argv[2] = date;
+	argv[3] = NULL;
+	cp.argv = argv;
+	cp.use_shell = 1;
+
+	return capture_command(&cp, query_result, 1024);
+}
+
+static void fsmonitor_refresh_callback(struct index_state *istate, const char *name)
+{
+	int pos = index_name_pos(istate, name, strlen(name));
+
+	if (pos >= 0) {
+		struct cache_entry *ce = istate->cache[pos];
+		ce->ce_flags &= ~CE_FSMONITOR_VALID;
+	}
+
+	/*
+	 * Mark the untracked cache dirty even if it wasn't found in the index
+	 * as it could be a new untracked file.
+	 */
+	trace_printf_key(&trace_fsmonitor, "fsmonitor_refresh_callback '%s'", name);
+	untracked_cache_invalidate_path(istate, name);
+}
+
+void refresh_fsmonitor(struct index_state *istate)
+{
+	static int has_run_once = 0;
+	struct strbuf query_result = STRBUF_INIT;
+	int query_success = 0;
+	size_t bol; /* beginning of line */
+	uint64_t last_update;
+	char *buf;
+	int i;
+
+	if (!core_fsmonitor || has_run_once)
+		return;
+	has_run_once = 1;
+
+	trace_printf_key(&trace_fsmonitor, "refresh fsmonitor");
+	/*
+	 * This could be racy so save the date/time now and query_fsmonitor
+	 * should be inclusive to ensure we don't miss potential changes.
+	 */
+	last_update = getnanotime();
+
+	/*
+	 * If we have a last update time, call query_fsmonitor for the set of
+	 * changes since that time, else assume everything is possibly dirty
+	 * and check it all.
+	 */
+	if (istate->fsmonitor_last_update) {
+		query_success = !query_fsmonitor(HOOK_INTERFACE_VERSION,
+			istate->fsmonitor_last_update, &query_result);
+		trace_performance_since(last_update, "fsmonitor process '%s'", core_fsmonitor);
+		trace_printf_key(&trace_fsmonitor, "fsmonitor process '%s' returned %s",
+			core_fsmonitor, query_success ? "success" : "failure");
+	}
+
+	/* a fsmonitor process can return '*' to indicate all entries are invalid */
+	if (query_success && query_result.buf[0] != '/') {
+		/* Mark all entries returned by the monitor as dirty */
+		buf = query_result.buf;
+		bol = 0;
+		for (i = 0; i < query_result.len; i++) {
+			if (buf[i] != '\0')
+				continue;
+			fsmonitor_refresh_callback(istate, buf + bol);
+			bol = i + 1;
+		}
+		if (bol < query_result.len)
+			fsmonitor_refresh_callback(istate, buf + bol);
+	} else {
+		/* Mark all entries invalid */
+		for (i = 0; i < istate->cache_nr; i++)
+			istate->cache[i]->ce_flags &= ~CE_FSMONITOR_VALID;
+
+		if (istate->untracked)
+			istate->untracked->use_fsmonitor = 0;
+	}
+	strbuf_release(&query_result);
+
+	/* Now that we've updated istate, save the last_update time */
+	istate->fsmonitor_last_update = last_update;
+}
+
+void add_fsmonitor(struct index_state *istate)
+{
+	int i;
+
+	if (!istate->fsmonitor_last_update) {
+		trace_printf_key(&trace_fsmonitor, "add fsmonitor");
+		istate->cache_changed |= FSMONITOR_CHANGED;
+		istate->fsmonitor_last_update = getnanotime();
+
+		/* reset the fsmonitor state */
+		for (i = 0; i < istate->cache_nr; i++)
+			istate->cache[i]->ce_flags &= ~CE_FSMONITOR_VALID;
+
+		/* reset the untracked cache */
+		if (istate->untracked) {
+			add_untracked_cache(istate);
+			istate->untracked->use_fsmonitor = 1;
+		}
+
+		/* Update the fsmonitor state */
+		refresh_fsmonitor(istate);
+	}
+}
+
+void remove_fsmonitor(struct index_state *istate)
+{
+	if (istate->fsmonitor_last_update) {
+		trace_printf_key(&trace_fsmonitor, "remove fsmonitor");
+		istate->cache_changed |= FSMONITOR_CHANGED;
+		istate->fsmonitor_last_update = 0;
+	}
+}
+
+void tweak_fsmonitor(struct index_state *istate)
+{
+	switch (git_config_get_fsmonitor()) {
+	case -1: /* keep: do nothing */
+		break;
+	case 0: /* false */
+		remove_fsmonitor(istate);
+		break;
+	case 1: /* true */
+		add_fsmonitor(istate);
+		break;
+	default: /* unknown value: do nothing */
+		break;
+	}
+}
diff --git a/fsmonitor.h b/fsmonitor.h
new file mode 100644
index 0000000000..c2240b811a
--- /dev/null
+++ b/fsmonitor.h
@@ -0,0 +1,61 @@
+#ifndef FSMONITOR_H
+#define FSMONITOR_H
+
+extern struct trace_key trace_fsmonitor;
+
+/*
+ * Read the the fsmonitor index extension and (if configured) restore the
+ * CE_FSMONITOR_VALID state.
+ */
+extern int read_fsmonitor_extension(struct index_state *istate, const void *data, unsigned long sz);
+
+/*
+ * Write the CE_FSMONITOR_VALID state into the fsmonitor index extension.
+ */
+extern void write_fsmonitor_extension(struct strbuf *sb, struct index_state *istate);
+
+/*
+ * Add/remove the fsmonitor index extension
+ */
+extern void add_fsmonitor(struct index_state *istate);
+extern void remove_fsmonitor(struct index_state *istate);
+
+/*
+ * Add/remove the fsmonitor index extension as necessary based on the current
+ * core.fsmonitor setting.
+ */
+extern void tweak_fsmonitor(struct index_state *istate);
+
+/*
+ * Run the configured fsmonitor integration script and clear the
+ * CE_FSMONITOR_VALID bit for any files returned as dirty.  Also invalidate
+ * any corresponding untracked cache directory structures. Optimized to only
+ * run the first time it is called.
+ */
+extern void refresh_fsmonitor(struct index_state *istate);
+
+/*
+ * Set the given cache entries CE_FSMONITOR_VALID bit.
+ */
+static inline void mark_fsmonitor_valid(struct cache_entry *ce)
+{
+	if (core_fsmonitor) {
+		ce->ce_flags |= CE_FSMONITOR_VALID;
+		trace_printf_key(&trace_fsmonitor, "mark_fsmonitor_clean '%s'", ce->name);
+	}
+}
+
+/*
+ * Clear the given cache entry's CE_FSMONITOR_VALID bit and invalidate any
+ * corresponding untracked cache directory structures.
+ */
+static inline void mark_fsmonitor_invalid(struct index_state *istate, struct cache_entry *ce)
+{
+	if (core_fsmonitor) {
+		ce->ce_flags &= ~CE_FSMONITOR_VALID;
+		untracked_cache_invalidate_path(istate, ce->name);
+		trace_printf_key(&trace_fsmonitor, "mark_fsmonitor_invalid '%s'", ce->name);
+	}
+}
+
+#endif
diff --git a/preload-index.c b/preload-index.c
index 75564c497a..2a83255e4e 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -4,6 +4,7 @@
 #include "cache.h"
 #include "pathspec.h"
 #include "dir.h"
+#include "fsmonitor.h"
 
 #ifdef NO_PTHREADS
 static void preload_index(struct index_state *index,
@@ -55,15 +56,18 @@ static void *preload_thread(void *_data)
 			continue;
 		if (ce_skip_worktree(ce))
 			continue;
+		if (ce->ce_flags & CE_FSMONITOR_VALID)
+			continue;
 		if (!ce_path_match(ce, &p->pathspec, NULL))
 			continue;
 		if (threaded_has_symlink_leading_path(&cache, ce->name, ce_namelen(ce)))
 			continue;
 		if (lstat(ce->name, &st))
 			continue;
-		if (ie_match_stat(index, ce, &st, CE_MATCH_RACY_IS_DIRTY))
+		if (ie_match_stat(index, ce, &st, CE_MATCH_RACY_IS_DIRTY|CE_MATCH_IGNORE_FSMONITOR))
 			continue;
 		ce_mark_uptodate(ce);
+		mark_fsmonitor_valid(ce);
 	} while (--nr > 0);
 	cache_def_clear(&cache);
 	return NULL;
diff --git a/read-cache.c b/read-cache.c
index 40da87ea71..53093dbebf 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -19,6 +19,7 @@
 #include "varint.h"
 #include "split-index.h"
 #include "utf8.h"
+#include "fsmonitor.h"
 
 /* Mask for the name length in ce_flags in the on-disk index */
 
@@ -38,11 +39,12 @@
 #define CACHE_EXT_RESOLVE_UNDO 0x52455543 /* "REUC" */
 #define CACHE_EXT_LINK 0x6c696e6b	  /* "link" */
 #define CACHE_EXT_UNTRACKED 0x554E5452	  /* "UNTR" */
+#define CACHE_EXT_FSMONITOR 0x46534D4E	  /* "FSMN" */
 
 /* changes that can be kept in $GIT_DIR/index (basically all extensions) */
 #define EXTMASK (RESOLVE_UNDO_CHANGED | CACHE_TREE_CHANGED | \
 		 CE_ENTRY_ADDED | CE_ENTRY_REMOVED | CE_ENTRY_CHANGED | \
-		 SPLIT_INDEX_ORDERED | UNTRACKED_CHANGED)
+		 SPLIT_INDEX_ORDERED | UNTRACKED_CHANGED | FSMONITOR_CHANGED)
 
 struct index_state the_index;
 static const char *alternate_index_output;
@@ -62,6 +64,7 @@ static void replace_index_entry(struct index_state *istate, int nr, struct cache
 	free(old);
 	set_index_entry(istate, nr, ce);
 	ce->ce_flags |= CE_UPDATE_IN_BASE;
+	mark_fsmonitor_invalid(istate, ce);
 	istate->cache_changed |= CE_ENTRY_CHANGED;
 }
 
@@ -150,8 +153,10 @@ void fill_stat_cache_info(struct cache_entry *ce, struct stat *st)
 	if (assume_unchanged)
 		ce->ce_flags |= CE_VALID;
 
-	if (S_ISREG(st->st_mode))
+	if (S_ISREG(st->st_mode)) {
 		ce_mark_uptodate(ce);
+		mark_fsmonitor_valid(ce);
+	}
 }
 
 static int ce_compare_data(const struct cache_entry *ce, struct stat *st)
@@ -300,7 +305,7 @@ int match_stat_data_racy(const struct index_state *istate,
 	return match_stat_data(sd, st);
 }
 
-int ie_match_stat(const struct index_state *istate,
+int ie_match_stat(struct index_state *istate,
 		  const struct cache_entry *ce, struct stat *st,
 		  unsigned int options)
 {
@@ -308,7 +313,10 @@ int ie_match_stat(const struct index_state *istate,
 	int ignore_valid = options & CE_MATCH_IGNORE_VALID;
 	int ignore_skip_worktree = options & CE_MATCH_IGNORE_SKIP_WORKTREE;
 	int assume_racy_is_modified = options & CE_MATCH_RACY_IS_DIRTY;
+	int ignore_fsmonitor = options & CE_MATCH_IGNORE_FSMONITOR;
 
+	if (!ignore_fsmonitor)
+		refresh_fsmonitor(istate);
 	/*
 	 * If it's marked as always valid in the index, it's
 	 * valid whatever the checked-out copy says.
@@ -319,6 +327,8 @@ int ie_match_stat(const struct index_state *istate,
 		return 0;
 	if (!ignore_valid && (ce->ce_flags & CE_VALID))
 		return 0;
+	if (!ignore_fsmonitor && (ce->ce_flags & CE_FSMONITOR_VALID))
+		return 0;
 
 	/*
 	 * Intent-to-add entries have not been added, so the index entry
@@ -356,7 +366,7 @@ int ie_match_stat(const struct index_state *istate,
 	return changed;
 }
 
-int ie_modified(const struct index_state *istate,
+int ie_modified(struct index_state *istate,
 		const struct cache_entry *ce,
 		struct stat *st, unsigned int options)
 {
@@ -631,7 +641,7 @@ int add_to_index(struct index_state *istate, const char *path, struct stat *st,
 	int size, namelen, was_same;
 	mode_t st_mode = st->st_mode;
 	struct cache_entry *ce, *alias;
-	unsigned ce_option = CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_RACY_IS_DIRTY;
+	unsigned ce_option = CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_RACY_IS_DIRTY|CE_MATCH_IGNORE_FSMONITOR;
 	int verbose = flags & (ADD_CACHE_VERBOSE | ADD_CACHE_PRETEND);
 	int pretend = flags & ADD_CACHE_PRETEND;
 	int intent_only = flags & ADD_CACHE_INTENT;
@@ -777,6 +787,7 @@ int chmod_index_entry(struct index_state *istate, struct cache_entry *ce,
 	}
 	cache_tree_invalidate_path(istate, ce->name);
 	ce->ce_flags |= CE_UPDATE_IN_BASE;
+	mark_fsmonitor_invalid(istate, ce);
 	istate->cache_changed |= CE_ENTRY_CHANGED;
 
 	return 0;
@@ -1228,10 +1239,13 @@ static struct cache_entry *refresh_cache_ent(struct index_state *istate,
 	int ignore_valid = options & CE_MATCH_IGNORE_VALID;
 	int ignore_skip_worktree = options & CE_MATCH_IGNORE_SKIP_WORKTREE;
 	int ignore_missing = options & CE_MATCH_IGNORE_MISSING;
+	int ignore_fsmonitor = options & CE_MATCH_IGNORE_FSMONITOR;
 
 	if (!refresh || ce_uptodate(ce))
 		return ce;
 
+	if (!ignore_fsmonitor)
+		refresh_fsmonitor(istate);
 	/*
 	 * CE_VALID or CE_SKIP_WORKTREE means the user promised us
 	 * that the change to the work tree does not matter and told
@@ -1245,6 +1259,10 @@ static struct cache_entry *refresh_cache_ent(struct index_state *istate,
 		ce_mark_uptodate(ce);
 		return ce;
 	}
+	if (!ignore_fsmonitor && (ce->ce_flags & CE_FSMONITOR_VALID)) {
+		ce_mark_uptodate(ce);
+		return ce;
+	}
 
 	if (has_symlink_leading_path(ce->name, ce_namelen(ce))) {
 		if (ignore_missing)
@@ -1282,8 +1300,10 @@ static struct cache_entry *refresh_cache_ent(struct index_state *istate,
 			 * because CE_UPTODATE flag is in-core only;
 			 * we are not going to write this change out.
 			 */
-			if (!S_ISGITLINK(ce->ce_mode))
+			if (!S_ISGITLINK(ce->ce_mode)) {
 				ce_mark_uptodate(ce);
+				mark_fsmonitor_valid(ce);
+			}
 			return ce;
 		}
 	}
@@ -1336,7 +1356,7 @@ int refresh_index(struct index_state *istate, unsigned int flags,
 	int first = 1;
 	int in_porcelain = (flags & REFRESH_IN_PORCELAIN);
 	unsigned int options = (CE_MATCH_REFRESH |
-				(really ? CE_MATCH_IGNORE_VALID : 0) |
+				(really ? CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_FSMONITOR : 0) |
 				(not_new ? CE_MATCH_IGNORE_MISSING : 0));
 	const char *modified_fmt;
 	const char *deleted_fmt;
@@ -1391,6 +1411,7 @@ int refresh_index(struct index_state *istate, unsigned int flags,
 				 */
 				ce->ce_flags &= ~CE_VALID;
 				ce->ce_flags |= CE_UPDATE_IN_BASE;
+				mark_fsmonitor_invalid(istate, ce);
 				istate->cache_changed |= CE_ENTRY_CHANGED;
 			}
 			if (quiet)
@@ -1550,6 +1571,9 @@ static int read_index_extension(struct index_state *istate,
 	case CACHE_EXT_UNTRACKED:
 		istate->untracked = read_untracked_extension(data, sz);
 		break;
+	case CACHE_EXT_FSMONITOR:
+		read_fsmonitor_extension(istate, data, sz);
+		break;
 	default:
 		if (*ext < 'A' || 'Z' < *ext)
 			return error("index uses %.4s extension, which we do not understand",
@@ -1722,6 +1746,7 @@ static void post_read_index_from(struct index_state *istate)
 	check_ce_order(istate);
 	tweak_untracked_cache(istate);
 	tweak_split_index(istate);
+	tweak_fsmonitor(istate);
 }
 
 /* remember to discard_cache() before reading a different cache! */
@@ -2306,6 +2331,16 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 		if (err)
 			return -1;
 	}
+	if (!strip_extensions && istate->fsmonitor_last_update) {
+		struct strbuf sb = STRBUF_INIT;
+
+		write_fsmonitor_extension(&sb, istate);
+		err = write_index_ext_header(&c, newfd, CACHE_EXT_FSMONITOR, sb.len) < 0
+			|| ce_write(&c, newfd, sb.buf, sb.len) < 0;
+		strbuf_release(&sb);
+		if (err)
+			return -1;
+	}
 
 	if (ce_flush(&c, newfd, istate->sha1))
 		return -1;
diff --git a/submodule.c b/submodule.c
index 3cea8221e0..8a931a1aaa 100644
--- a/submodule.c
+++ b/submodule.c
@@ -62,7 +62,7 @@ int is_staging_gitmodules_ok(const struct index_state *istate)
 	if ((pos >= 0) && (pos < istate->cache_nr)) {
 		struct stat st;
 		if (lstat(GITMODULES_FILE, &st) == 0 &&
-		    ce_match_stat(istate->cache[pos], &st, 0) & DATA_CHANGED)
+		    ce_match_stat(istate->cache[pos], &st, CE_MATCH_IGNORE_FSMONITOR) & DATA_CHANGED)
 			return 0;
 	}
 
diff --git a/unpack-trees.c b/unpack-trees.c
index 71b70ccb12..f724a61ac0 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -14,6 +14,7 @@
 #include "dir.h"
 #include "submodule.h"
 #include "submodule-config.h"
+#include "fsmonitor.h"
 
 /*
  * Error messages expected by scripts out of plumbing commands such as
@@ -408,6 +409,7 @@ static int apply_sparse_checkout(struct index_state *istate,
 		ce->ce_flags &= ~CE_SKIP_WORKTREE;
 	if (was_skip_worktree != ce_skip_worktree(ce)) {
 		ce->ce_flags |= CE_UPDATE_IN_BASE;
+		mark_fsmonitor_invalid(istate, ce);
 		istate->cache_changed |= CE_ENTRY_CHANGED;
 	}
 
@@ -1454,7 +1456,7 @@ static int verify_uptodate_1(const struct cache_entry *ce,
 		return 0;
 
 	if (!lstat(ce->name, &st)) {
-		int flags = CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE;
+		int flags = CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_IGNORE_FSMONITOR;
 		unsigned changed = ie_match_stat(o->src_index, ce, &st, flags);
 
 		if (submodule_from_ce(ce)) {
@@ -1610,7 +1612,7 @@ static int icase_exists(struct unpack_trees_options *o, const char *name, int le
 	const struct cache_entry *src;
 
 	src = index_file_exists(o->src_index, name, len, 1);
-	return src && !ie_match_stat(o->src_index, src, st, CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE);
+	return src && !ie_match_stat(o->src_index, src, st, CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_IGNORE_FSMONITOR);
 }
 
 static int check_ok_to_remove(const char *name, int len, int dtype,
@@ -2134,7 +2136,7 @@ int oneway_merge(const struct cache_entry * const *src,
 		if (o->reset && o->update && !ce_uptodate(old) && !ce_skip_worktree(old)) {
 			struct stat st;
 			if (lstat(old->name, &st) ||
-			    ie_match_stat(o->src_index, old, &st, CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE))
+			    ie_match_stat(o->src_index, old, &st, CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_IGNORE_FSMONITOR))
 				update |= CE_UPDATE;
 		}
 		add_entry(o, old, update, 0);
-- 
2.14.1.windows.1


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v7 05/12] fsmonitor: add documentation for the fsmonitor extension.
  2017-09-19 19:27   ` [PATCH v7 00/12] Fast git status via a file system watcher Ben Peart
                       ` (3 preceding siblings ...)
  2017-09-19 19:27     ` [PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
@ 2017-09-19 19:27     ` Ben Peart
  2017-09-20 10:00       ` Martin Ågren
  2017-09-19 19:27     ` [PATCH v7 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit Ben Peart
                       ` (7 subsequent siblings)
  12 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-09-19 19:27 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

This includes the core.fsmonitor setting, the query-fsmonitor hook,
and the fsmonitor index extension.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 Documentation/config.txt                 |  7 +++++
 Documentation/git-ls-files.txt           |  7 ++++-
 Documentation/git-update-index.txt       | 45 ++++++++++++++++++++++++++++++++
 Documentation/githooks.txt               | 28 ++++++++++++++++++++
 Documentation/technical/index-format.txt | 19 ++++++++++++++
 5 files changed, 105 insertions(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index dc4e3f58a2..db52645cb4 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -413,6 +413,13 @@ core.protectNTFS::
 	8.3 "short" names.
 	Defaults to `true` on Windows, and `false` elsewhere.
 
+core.fsmonitor::
+	If set, the value of this variable is used as a command which
+	will identify all files that may have changed since the
+	requested date/time. This information is used to speed up git by
+	avoiding unnecessary processing of files that have not changed.
+	See the "fsmonitor-watchman" section of linkgit:githooks[5].
+
 core.trustctime::
 	If false, the ctime differences between the index and the
 	working tree are ignored; useful when the inode change time
diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index d153c17e06..3ac3e3a77d 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -9,7 +9,7 @@ git-ls-files - Show information about files in the index and the working tree
 SYNOPSIS
 --------
 [verse]
-'git ls-files' [-z] [-t] [-v]
+'git ls-files' [-z] [-t] [-v] [-f]
 		(--[cached|deleted|others|ignored|stage|unmerged|killed|modified])*
 		(-[c|d|o|i|s|u|k|m])*
 		[--eol]
@@ -133,6 +133,11 @@ a space) at the start of each line:
 	that are marked as 'assume unchanged' (see
 	linkgit:git-update-index[1]).
 
+-f::
+	Similar to `-t`, but use lowercase letters for files
+	that are marked as 'fsmonitor valid' (see
+	linkgit:git-update-index[1]).
+
 --full-name::
 	When run from a subdirectory, the command usually
 	outputs paths relative to the current directory.  This
diff --git a/Documentation/git-update-index.txt b/Documentation/git-update-index.txt
index e19eba62cd..95231dbfcb 100644
--- a/Documentation/git-update-index.txt
+++ b/Documentation/git-update-index.txt
@@ -16,9 +16,11 @@ SYNOPSIS
 	     [--chmod=(+|-)x]
 	     [--[no-]assume-unchanged]
 	     [--[no-]skip-worktree]
+	     [--[no-]fsmonitor-valid]
 	     [--ignore-submodules]
 	     [--[no-]split-index]
 	     [--[no-|test-|force-]untracked-cache]
+	     [--[no-]fsmonitor]
 	     [--really-refresh] [--unresolve] [--again | -g]
 	     [--info-only] [--index-info]
 	     [-z] [--stdin] [--index-version <n>]
@@ -111,6 +113,12 @@ you will need to handle the situation manually.
 	set and unset the "skip-worktree" bit for the paths. See
 	section "Skip-worktree bit" below for more information.
 
+--[no-]fsmonitor-valid::
+	When one of these flags is specified, the object name recorded
+	for the paths are not updated. Instead, these options
+	set and unset the "fsmonitor valid" bit for the paths. See
+	section "File System Monitor" below for more information.
+
 -g::
 --again::
 	Runs 'git update-index' itself on the paths whose index
@@ -201,6 +209,15 @@ will remove the intended effect of the option.
 	`--untracked-cache` used to imply `--test-untracked-cache` but
 	this option would enable the extension unconditionally.
 
+--fsmonitor::
+--no-fsmonitor::
+	Enable or disable files system monitor feature. These options
+	take effect whatever the value of the `core.fsmonitor`
+	configuration variable (see linkgit:git-config[1]). But a warning
+	is emitted when the change goes against the configured value, as
+	the configured value will take effect next time the index is
+	read and this will remove the intended effect of the option.
+
 \--::
 	Do not interpret any more arguments as options.
 
@@ -447,6 +464,34 @@ command reads the index; while when `--[no-|force-]untracked-cache`
 are used, the untracked cache is immediately added to or removed from
 the index.
 
+File System Monitor
+-------------------
+
+This feature is intended to speed up git operations for repos that have
+large working directories.
+
+It enables git to work together with a file system monitor (see the
+"fsmonitor-watchman" section of linkgit:githooks[5]) that can
+inform it as to what files have been modified. This enables git to avoid
+having to lstat() every file to find modified files.
+
+When used in conjunction with the untracked cache, it can further improve
+performance by avoiding the cost of scaning the entire working directory
+looking for new files.
+
+If you want to enable (or disable) this feature, it is easier to use
+the `core.fsmonitor` configuration variable (see
+linkgit:git-config[1]) than using the `--fsmonitor` option to
+`git update-index` in each repository, especially if you want to do so
+across all repositories you use, because you can set the configuration
+variable to `true` (or `false`) in your `$HOME/.gitconfig` just once
+and have it affect all repositories you touch.
+
+When the `core.fsmonitor` configuration variable is changed, the
+file system monitor is added to or removed from the index the next time
+a command reads the index. When `--[no-]fsmonitor` are used, the file
+system monitor is immediately added to or removed from the index.
+
 Configuration
 -------------
 
diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index 1bb4f92d4d..ae60559cd9 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -455,6 +455,34 @@ the name of the file that holds the e-mail to be sent.  Exiting with a
 non-zero status causes 'git send-email' to abort before sending any
 e-mails.
 
+fsmonitor-watchman
+~~~~~~~~~~~~~~~~~~
+
+This hook is invoked when the configuration option core.fsmonitor is
+set to .git/hooks/fsmonitor-watchman.  It takes two arguments, a version
+(currently 1) and the time in elapsed nanoseconds since midnight,
+January 1, 1970.
+
+The hook should output to stdout the list of all files in the working
+directory that may have changed since the requested time.  The logic
+should be inclusive so that it does not miss any potential changes.
+The paths should be relative to the root of the working directory
+and be separated by a single NUL.
+
+It is OK to include files which have not actually changed.  All changes
+including newly-created and deleted files should be included. When
+files are renamed, both the old and the new name should be included.
+
+Git will limit what files it checks for changes as well as which
+directories are checked for untracked files based on the path names
+given.
+
+An optimized way to tell git "all files have changed" is to return
+the filename '/'.
+
+The exit status determines whether git will use the data from the
+hook to limit its search.  On error, it will fall back to verifying
+all files and folders.
 
 GIT
 ---
diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
index ade0b0c445..db3572626b 100644
--- a/Documentation/technical/index-format.txt
+++ b/Documentation/technical/index-format.txt
@@ -295,3 +295,22 @@ The remaining data of each directory block is grouped by type:
     in the previous ewah bitmap.
 
   - One NUL.
+
+== File System Monitor cache
+
+  The file system monitor cache tracks files for which the core.fsmonitor
+  hook has told us about changes.  The signature for this extension is
+  { 'F', 'S', 'M', 'N' }.
+
+  The extension starts with
+
+  - 32-bit version number: the current supported version is 1.
+
+  - 64-bit time: the extension data reflects all changes through the given
+	time which is stored as the nanoseconds elapsed since midnight,
+	January 1, 1970.
+
+  - 32-bit bitmap size: the size of the CE_FSMONITOR_VALID bitmap.
+
+  - An ewah bitmap, the n-th bit indicates whether the n-th index entry
+    is not CE_FSMONITOR_VALID.
-- 
2.14.1.windows.1


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v7 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit
  2017-09-19 19:27   ` [PATCH v7 00/12] Fast git status via a file system watcher Ben Peart
                       ` (4 preceding siblings ...)
  2017-09-19 19:27     ` [PATCH v7 05/12] fsmonitor: add documentation for the fsmonitor extension Ben Peart
@ 2017-09-19 19:27     ` Ben Peart
  2017-09-19 19:46       ` David Turner
  2017-09-19 19:27     ` [PATCH v7 07/12] update-index: add fsmonitor support to update-index Ben Peart
                       ` (6 subsequent siblings)
  12 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-09-19 19:27 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

Add a new command line option (-f) to ls-files to have it use lowercase
letters for 'fsmonitor valid' files

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 builtin/ls-files.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index e1339e6d17..313962a0c1 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -31,6 +31,7 @@ static int show_resolve_undo;
 static int show_modified;
 static int show_killed;
 static int show_valid_bit;
+static int show_fsmonitor_bit;
 static int line_terminator = '\n';
 static int debug_mode;
 static int show_eol;
@@ -86,7 +87,8 @@ static const char *get_tag(const struct cache_entry *ce, const char *tag)
 {
 	static char alttag[4];
 
-	if (tag && *tag && show_valid_bit && (ce->ce_flags & CE_VALID)) {
+	if (tag && *tag && ((show_valid_bit && (ce->ce_flags & CE_VALID)) ||
+		(show_fsmonitor_bit && (ce->ce_flags & CE_FSMONITOR_VALID)))) {
 		memcpy(alttag, tag, 3);
 
 		if (isalpha(tag[0])) {
@@ -515,6 +517,8 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 			N_("identify the file status with tags")),
 		OPT_BOOL('v', NULL, &show_valid_bit,
 			N_("use lowercase letters for 'assume unchanged' files")),
+		OPT_BOOL('f', NULL, &show_fsmonitor_bit,
+			N_("use lowercase letters for 'fsmonitor clean' files")),
 		OPT_BOOL('c', "cached", &show_cached,
 			N_("show cached files in the output (default)")),
 		OPT_BOOL('d', "deleted", &show_deleted,
@@ -584,7 +588,7 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 	for (i = 0; i < exclude_list.nr; i++) {
 		add_exclude(exclude_list.items[i].string, "", 0, el, --exclude_args);
 	}
-	if (show_tag || show_valid_bit) {
+	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
 		tag_cached = "H ";
 		tag_unmerged = "M ";
 		tag_removed = "R ";
-- 
2.14.1.windows.1


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v7 07/12] update-index: add fsmonitor support to update-index
  2017-09-19 19:27   ` [PATCH v7 00/12] Fast git status via a file system watcher Ben Peart
                       ` (5 preceding siblings ...)
  2017-09-19 19:27     ` [PATCH v7 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit Ben Peart
@ 2017-09-19 19:27     ` Ben Peart
  2017-09-19 19:27     ` [PATCH v7 08/12] fsmonitor: add a test tool to dump the index extension Ben Peart
                       ` (5 subsequent siblings)
  12 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-19 19:27 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

Add support in update-index to manually add/remove the fsmonitor
extension via --[no-]fsmonitor flags.

Add support in update-index to manually set/clear the fsmonitor
valid bit via --[no-]fsmonitor-valid flags.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 builtin/update-index.c | 33 ++++++++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 6f39ee9274..41618db098 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -33,6 +33,7 @@ static int force_remove;
 static int verbose;
 static int mark_valid_only;
 static int mark_skip_worktree_only;
+static int mark_fsmonitor_only;
 #define MARK_FLAG 1
 #define UNMARK_FLAG 2
 static struct strbuf mtime_dir = STRBUF_INIT;
@@ -229,12 +230,12 @@ static int mark_ce_flags(const char *path, int flag, int mark)
 	int namelen = strlen(path);
 	int pos = cache_name_pos(path, namelen);
 	if (0 <= pos) {
+		mark_fsmonitor_invalid(&the_index, active_cache[pos]);
 		if (mark)
 			active_cache[pos]->ce_flags |= flag;
 		else
 			active_cache[pos]->ce_flags &= ~flag;
 		active_cache[pos]->ce_flags |= CE_UPDATE_IN_BASE;
-		mark_fsmonitor_invalid(&the_index, active_cache[pos]);
 		cache_tree_invalidate_path(&the_index, path);
 		active_cache_changed |= CE_ENTRY_CHANGED;
 		return 0;
@@ -460,6 +461,11 @@ static void update_one(const char *path)
 			die("Unable to mark file %s", path);
 		return;
 	}
+	if (mark_fsmonitor_only) {
+		if (mark_ce_flags(path, CE_FSMONITOR_VALID, mark_fsmonitor_only == MARK_FLAG))
+			die("Unable to mark file %s", path);
+		return;
+	}
 
 	if (force_remove) {
 		if (remove_file_from_cache(path))
@@ -918,6 +924,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 	int lock_error = 0;
 	int split_index = -1;
 	int force_write = 0;
+	int fsmonitor = -1;
 	struct lock_file *lock_file;
 	struct parse_opt_ctx_t ctx;
 	strbuf_getline_fn getline_fn;
@@ -1011,6 +1018,14 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 			    N_("enable untracked cache without testing the filesystem"), UC_FORCE),
 		OPT_SET_INT(0, "force-write-index", &force_write,
 			N_("write out the index even if is not flagged as changed"), 1),
+		OPT_BOOL(0, "fsmonitor", &fsmonitor,
+			N_("enable or disable file system monitor")),
+		{OPTION_SET_INT, 0, "fsmonitor-valid", &mark_fsmonitor_only, NULL,
+			N_("mark files as fsmonitor valid"),
+			PARSE_OPT_NOARG | PARSE_OPT_NONEG, NULL, MARK_FLAG},
+		{OPTION_SET_INT, 0, "no-fsmonitor-valid", &mark_fsmonitor_only, NULL,
+			N_("clear fsmonitor valid bit"),
+			PARSE_OPT_NOARG | PARSE_OPT_NONEG, NULL, UNMARK_FLAG},
 		OPT_END()
 	};
 
@@ -1152,6 +1167,22 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 		die("BUG: bad untracked_cache value: %d", untracked_cache);
 	}
 
+	if (fsmonitor > 0) {
+		if (git_config_get_fsmonitor() == 0)
+			warning(_("core.fsmonitor is unset; "
+				"set it if you really want to "
+				"enable fsmonitor"));
+		add_fsmonitor(&the_index);
+		report(_("fsmonitor enabled"));
+	} else if (!fsmonitor) {
+		if (git_config_get_fsmonitor() == 1)
+			warning(_("core.fsmonitor is set; "
+				"remove it if you really want to "
+				"disable fsmonitor"));
+		remove_fsmonitor(&the_index);
+		report(_("fsmonitor disabled"));
+	}
+
 	if (active_cache_changed || force_write) {
 		if (newfd < 0) {
 			if (refresh_args.flags & REFRESH_QUIET)
-- 
2.14.1.windows.1


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v7 08/12] fsmonitor: add a test tool to dump the index extension
  2017-09-19 19:27   ` [PATCH v7 00/12] Fast git status via a file system watcher Ben Peart
                       ` (6 preceding siblings ...)
  2017-09-19 19:27     ` [PATCH v7 07/12] update-index: add fsmonitor support to update-index Ben Peart
@ 2017-09-19 19:27     ` Ben Peart
  2017-09-19 19:27     ` [PATCH v7 09/12] split-index: disable the fsmonitor extension when running the split index test Ben Peart
                       ` (4 subsequent siblings)
  12 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-19 19:27 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

Add a test utility (test-dump-fsmonitor) that will dump the fsmonitor
index extension.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 Makefile                       |  1 +
 t/helper/test-dump-fsmonitor.c | 21 +++++++++++++++++++++
 2 files changed, 22 insertions(+)
 create mode 100644 t/helper/test-dump-fsmonitor.c

diff --git a/Makefile b/Makefile
index 9d6ec9c1e9..d970cd00e9 100644
--- a/Makefile
+++ b/Makefile
@@ -639,6 +639,7 @@ TEST_PROGRAMS_NEED_X += test-config
 TEST_PROGRAMS_NEED_X += test-date
 TEST_PROGRAMS_NEED_X += test-delta
 TEST_PROGRAMS_NEED_X += test-dump-cache-tree
+TEST_PROGRAMS_NEED_X += test-dump-fsmonitor
 TEST_PROGRAMS_NEED_X += test-dump-split-index
 TEST_PROGRAMS_NEED_X += test-dump-untracked-cache
 TEST_PROGRAMS_NEED_X += test-fake-ssh
diff --git a/t/helper/test-dump-fsmonitor.c b/t/helper/test-dump-fsmonitor.c
new file mode 100644
index 0000000000..ad452707e8
--- /dev/null
+++ b/t/helper/test-dump-fsmonitor.c
@@ -0,0 +1,21 @@
+#include "cache.h"
+
+int cmd_main(int ac, const char **av)
+{
+	struct index_state *istate = &the_index;
+	int i;
+
+	setup_git_directory();
+	if (do_read_index(istate, get_index_file(), 0) < 0)
+		die("unable to read index file");
+	if (!istate->fsmonitor_last_update) {
+		printf("no fsmonitor\n");
+		return 0;
+	}
+	printf("fsmonitor last update %"PRIuMAX"\n", (uintmax_t)istate->fsmonitor_last_update);
+
+	for (i = 0; i < istate->cache_nr; i++)
+		printf((istate->cache[i]->ce_flags & CE_FSMONITOR_VALID) ? "+" : "-");
+
+	return 0;
+}
-- 
2.14.1.windows.1


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v7 09/12] split-index: disable the fsmonitor extension when running the split index test
  2017-09-19 19:27   ` [PATCH v7 00/12] Fast git status via a file system watcher Ben Peart
                       ` (7 preceding siblings ...)
  2017-09-19 19:27     ` [PATCH v7 08/12] fsmonitor: add a test tool to dump the index extension Ben Peart
@ 2017-09-19 19:27     ` Ben Peart
  2017-09-19 19:27     ` [PATCH v7 10/12] fsmonitor: add test cases for fsmonitor extension Ben Peart
                       ` (3 subsequent siblings)
  12 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-19 19:27 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

The split index test t1700-split-index.sh has hard coded SHA values for
the index.  Currently it supports index V4 and V3 but assumes there are
no index extensions loaded.

When manually forcing the fsmonitor extension to be turned on when
running the test suite, the SHA values no longer match which causes the
test to fail.

The potential matrix of index extensions and index versions can is quite
large so instead disable the extension before attempting to run the test.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 t/t1700-split-index.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index 22f69a410b..af9b847761 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -6,6 +6,7 @@ test_description='split index mode tests'
 
 # We need total control of index splitting here
 sane_unset GIT_TEST_SPLIT_INDEX
+sane_unset GIT_FSMONITOR_TEST
 
 test_expect_success 'enable split index' '
 	git config splitIndex.maxPercentChange 100 &&
-- 
2.14.1.windows.1


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v7 10/12] fsmonitor: add test cases for fsmonitor extension
  2017-09-19 19:27   ` [PATCH v7 00/12] Fast git status via a file system watcher Ben Peart
                       ` (8 preceding siblings ...)
  2017-09-19 19:27     ` [PATCH v7 09/12] split-index: disable the fsmonitor extension when running the split index test Ben Peart
@ 2017-09-19 19:27     ` Ben Peart
  2017-09-19 19:27     ` [PATCH v7 11/12] fsmonitor: add a sample integration script for Watchman Ben Peart
                       ` (2 subsequent siblings)
  12 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-19 19:27 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

Test the ability to add/remove the fsmonitor index extension via
update-index.

Test that dirty files returned from the integration script are properly
represented in the index extension and verify that ls-files correctly
reports their state.

Test that ensure status results are correct when using the new fsmonitor
extension.  Test untracked, modified, and new files by ensuring the
results are identical to when not using the extension.

Test that if the fsmonitor extension doesn't tell git about a change, it
doesn't discover it on its own.  This ensures git is honoring the
extension and that we get the performance benefits desired.

Three test integration scripts are provided:

fsmonitor-all - marks all files as dirty
fsmonitor-none - marks no files as dirty
fsmonitor-watchman - integrates with Watchman with debug logging

To run tests in the test suite while utilizing fsmonitor:

First copy t/t7519/fsmonitor-all to a location in your path and then set
GIT_FORCE_PRELOAD_TEST=true and GIT_FSMONITOR_TEST=fsmonitor-all and run
your tests.

Note: currently when using the test script fsmonitor-watchman on
Windows, many tests fail due to a reported but not yet fixed bug in
Watchman where it holds on to handles for directories and files which
prevents the test directory from being cleaned up properly.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 t/t7519-status-fsmonitor.sh | 304 ++++++++++++++++++++++++++++++++++++++++++++
 t/t7519/fsmonitor-all       |  24 ++++
 t/t7519/fsmonitor-none      |  22 ++++
 t/t7519/fsmonitor-watchman  | 140 ++++++++++++++++++++
 4 files changed, 490 insertions(+)
 create mode 100755 t/t7519-status-fsmonitor.sh
 create mode 100755 t/t7519/fsmonitor-all
 create mode 100755 t/t7519/fsmonitor-none
 create mode 100755 t/t7519/fsmonitor-watchman

diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh
new file mode 100755
index 0000000000..c6df85af5e
--- /dev/null
+++ b/t/t7519-status-fsmonitor.sh
@@ -0,0 +1,304 @@
+#!/bin/sh
+
+test_description='git status with file system watcher'
+
+. ./test-lib.sh
+
+#
+# To run the entire git test suite using fsmonitor:
+#
+# copy t/t7519/fsmonitor-all to a location in your path and then set
+# GIT_FSMONITOR_TEST=fsmonitor-all and run your tests.
+#
+
+# Note, after "git reset --hard HEAD" no extensions exist other than 'TREE'
+# "git update-index --fsmonitor" can be used to get the extension written
+# before testing the results.
+
+clean_repo () {
+	git reset --hard HEAD &&
+	git clean -fd
+}
+
+dirty_repo () {
+	: >untracked &&
+	: >dir1/untracked &&
+	: >dir2/untracked &&
+	echo 1 >modified &&
+	echo 2 >dir1/modified &&
+	echo 3 >dir2/modified &&
+	echo 4 >new &&
+	echo 5 >dir1/new &&
+	echo 6 >dir2/new
+}
+
+write_integration_script () {
+	write_script .git/hooks/fsmonitor-test<<-\EOF
+	if test "$#" -ne 2
+	then
+		echo "$0: exactly 2 arguments expected"
+		exit 2
+	fi
+	if test "$1" != 1
+	then
+		echo "Unsupported core.fsmonitor hook version." >&2
+		exit 1
+	fi
+	printf "untracked\0"
+	printf "dir1/untracked\0"
+	printf "dir2/untracked\0"
+	printf "modified\0"
+	printf "dir1/modified\0"
+	printf "dir2/modified\0"
+	printf "new\0"
+	printf "dir1/new\0"
+	printf "dir2/new\0"
+	EOF
+}
+
+test_lazy_prereq UNTRACKED_CACHE '
+	{ git update-index --test-untracked-cache; ret=$?; } &&
+	test $ret -ne 1
+'
+
+test_expect_success 'setup' '
+	mkdir -p .git/hooks &&
+	: >tracked &&
+	: >modified &&
+	mkdir dir1 &&
+	: >dir1/tracked &&
+	: >dir1/modified &&
+	mkdir dir2 &&
+	: >dir2/tracked &&
+	: >dir2/modified &&
+	git -c core.fsmonitor= add . &&
+	git -c core.fsmonitor= commit -m initial &&
+	git config core.fsmonitor .git/hooks/fsmonitor-test &&
+	cat >.gitignore <<-\EOF
+	.gitignore
+	expect*
+	actual*
+	marker*
+	EOF
+'
+
+# test that the fsmonitor extension is off by default
+test_expect_success 'fsmonitor extension is off by default' '
+	test-dump-fsmonitor >actual &&
+	grep "^no fsmonitor" actual
+'
+
+# test that "update-index --fsmonitor" adds the fsmonitor extension
+test_expect_success 'update-index --fsmonitor" adds the fsmonitor extension' '
+	git update-index --fsmonitor &&
+	test-dump-fsmonitor >actual &&
+	grep "^fsmonitor last update" actual
+'
+
+# test that "update-index --no-fsmonitor" removes the fsmonitor extension
+test_expect_success 'update-index --no-fsmonitor" removes the fsmonitor extension' '
+	git update-index --no-fsmonitor &&
+	test-dump-fsmonitor >actual &&
+	grep "^no fsmonitor" actual
+'
+
+cat >expect <<EOF &&
+h dir1/modified
+H dir1/tracked
+h dir2/modified
+H dir2/tracked
+h modified
+H tracked
+EOF
+
+# test that "update-index --fsmonitor-valid" sets the fsmonitor valid bit
+test_expect_success 'update-index --fsmonitor-valid" sets the fsmonitor valid bit' '
+	git update-index --fsmonitor &&
+	git update-index --fsmonitor-valid dir1/modified &&
+	git update-index --fsmonitor-valid dir2/modified &&
+	git update-index --fsmonitor-valid modified &&
+	git ls-files -f >actual &&
+	test_cmp expect actual
+'
+
+cat >expect <<EOF &&
+H dir1/modified
+H dir1/tracked
+H dir2/modified
+H dir2/tracked
+H modified
+H tracked
+EOF
+
+# test that "update-index --no-fsmonitor-valid" clears the fsmonitor valid bit
+test_expect_success 'update-index --no-fsmonitor-valid" clears the fsmonitor valid bit' '
+	git update-index --no-fsmonitor-valid dir1/modified &&
+	git update-index --no-fsmonitor-valid dir2/modified &&
+	git update-index --no-fsmonitor-valid modified &&
+	git ls-files -f >actual &&
+	test_cmp expect actual
+'
+
+cat >expect <<EOF &&
+H dir1/modified
+H dir1/tracked
+H dir2/modified
+H dir2/tracked
+H modified
+H tracked
+EOF
+
+# test that all files returned by the script get flagged as invalid
+test_expect_success 'all files returned by integration script get flagged as invalid' '
+	write_integration_script &&
+	dirty_repo &&
+	git update-index --fsmonitor &&
+	git ls-files -f >actual &&
+	test_cmp expect actual
+'
+
+cat >expect <<EOF &&
+H dir1/modified
+h dir1/new
+H dir1/tracked
+H dir2/modified
+h dir2/new
+H dir2/tracked
+H modified
+h new
+H tracked
+EOF
+
+# test that newly added files are marked valid
+test_expect_success 'newly added files are marked valid' '
+	git add new &&
+	git add dir1/new &&
+	git add dir2/new &&
+	git ls-files -f >actual &&
+	test_cmp expect actual
+'
+
+cat >expect <<EOF &&
+H dir1/modified
+h dir1/new
+h dir1/tracked
+H dir2/modified
+h dir2/new
+h dir2/tracked
+H modified
+h new
+h tracked
+EOF
+
+# test that all unmodified files get marked valid
+test_expect_success 'all unmodified files get marked valid' '
+	# modified files result in update-index returning 1
+	test_must_fail git update-index --refresh --force-write-index &&
+	git ls-files -f >actual &&
+	test_cmp expect actual
+'
+
+cat >expect <<EOF &&
+H dir1/modified
+h dir1/tracked
+h dir2/modified
+h dir2/tracked
+h modified
+h tracked
+EOF
+
+# test that *only* files returned by the integration script get flagged as invalid
+test_expect_success '*only* files returned by the integration script get flagged as invalid' '
+	write_script .git/hooks/fsmonitor-test<<-\EOF &&
+	printf "dir1/modified\0"
+	EOF
+	clean_repo &&
+	git update-index --refresh --force-write-index &&
+	echo 1 >modified &&
+	echo 2 >dir1/modified &&
+	echo 3 >dir2/modified &&
+	test_must_fail git update-index --refresh --force-write-index &&
+	git ls-files -f >actual &&
+	test_cmp expect actual
+'
+
+# Ensure commands that call refresh_index() to move the index back in time
+# properly invalidate the fsmonitor cache
+test_expect_success 'refresh_index() invalidates fsmonitor cache' '
+	write_script .git/hooks/fsmonitor-test<<-\EOF &&
+	EOF
+	clean_repo &&
+	dirty_repo &&
+	git add . &&
+	git commit -m "to reset" &&
+	git reset HEAD~1 &&
+	git status >actual &&
+	git -c core.fsmonitor= status >expect &&
+	test_i18ncmp expect actual
+'
+
+# test fsmonitor with and without preloadIndex
+preload_values="false true"
+for preload_val in $preload_values
+do
+	test_expect_success "setup preloadIndex to $preload_val" '
+		git config core.preloadIndex $preload_val &&
+		if test $preload_val = true
+		then
+			GIT_FORCE_PRELOAD_TEST=$preload_val; export GIT_FORCE_PRELOAD_TEST
+		else
+			unset GIT_FORCE_PRELOAD_TEST
+		fi
+	'
+
+	# test fsmonitor with and without the untracked cache (if available)
+	uc_values="false"
+	test_have_prereq UNTRACKED_CACHE && uc_values="false true"
+	for uc_val in $uc_values
+	do
+		test_expect_success "setup untracked cache to $uc_val" '
+			git config core.untrackedcache $uc_val
+		'
+
+		# Status is well tested elsewhere so we'll just ensure that the results are
+		# the same when using core.fsmonitor.
+		test_expect_success 'compare status with and without fsmonitor' '
+			write_integration_script &&
+			clean_repo &&
+			dirty_repo &&
+			git add new &&
+			git add dir1/new &&
+			git add dir2/new &&
+			git status >actual &&
+			git -c core.fsmonitor= status >expect &&
+			test_i18ncmp expect actual
+		'
+
+		# Make sure it's actually skipping the check for modified and untracked
+		# (if enabled) files unless it is told about them.
+		test_expect_success "status doesn't detect unreported modifications" '
+			write_script .git/hooks/fsmonitor-test<<-\EOF &&
+			:>marker
+			EOF
+			clean_repo &&
+			git status &&
+			test_path_is_file marker &&
+			dirty_repo &&
+			rm -f marker &&
+			git status >actual &&
+			test_path_is_file marker &&
+			test_i18ngrep ! "Changes not staged for commit:" actual &&
+			if test $uc_val = true
+			then
+				test_i18ngrep ! "Untracked files:" actual
+			fi &&
+			if test $uc_val = false
+			then
+				test_i18ngrep "Untracked files:" actual
+			fi &&
+			rm -f marker
+		'
+	done
+done
+
+test_done
diff --git a/t/t7519/fsmonitor-all b/t/t7519/fsmonitor-all
new file mode 100755
index 0000000000..691bc94dc2
--- /dev/null
+++ b/t/t7519/fsmonitor-all
@@ -0,0 +1,24 @@
+#!/bin/sh
+#
+# An test hook script to integrate with git to test fsmonitor.
+#
+# The hook is passed a version (currently 1) and a time in nanoseconds
+# formatted as a string and outputs to stdout all files that have been
+# modified since the given time. Paths must be relative to the root of
+# the working tree and separated by a single NUL.
+#
+#echo "$0 $*" >&2
+
+if test "$#" -ne 2
+then
+	echo "$0: exactly 2 arguments expected" >&2
+	exit 2
+fi
+
+if test "$1" != 1
+then
+	echo "Unsupported core.fsmonitor hook version." >&2
+	exit 1
+fi
+
+echo "/"
diff --git a/t/t7519/fsmonitor-none b/t/t7519/fsmonitor-none
new file mode 100755
index 0000000000..ed9cf5a6a9
--- /dev/null
+++ b/t/t7519/fsmonitor-none
@@ -0,0 +1,22 @@
+#!/bin/sh
+#
+# An test hook script to integrate with git to test fsmonitor.
+#
+# The hook is passed a version (currently 1) and a time in nanoseconds
+# formatted as a string and outputs to stdout all files that have been
+# modified since the given time. Paths must be relative to the root of
+# the working tree and separated by a single NUL.
+#
+#echo "$0 $*" >&2
+
+if test "$#" -ne 2
+then
+	echo "$0: exactly 2 arguments expected" >&2
+	exit 2
+fi
+
+if test "$1" != 1
+then
+	echo "Unsupported core.fsmonitor hook version." >&2
+	exit 1
+fi
diff --git a/t/t7519/fsmonitor-watchman b/t/t7519/fsmonitor-watchman
new file mode 100755
index 0000000000..7ceb32dc18
--- /dev/null
+++ b/t/t7519/fsmonitor-watchman
@@ -0,0 +1,140 @@
+#!/usr/bin/perl
+
+use strict;
+use warnings;
+use IPC::Open2;
+
+# An example hook script to integrate Watchman
+# (https://facebook.github.io/watchman/) with git to speed up detecting
+# new and modified files.
+#
+# The hook is passed a version (currently 1) and a time in nanoseconds
+# formatted as a string and outputs to stdout all files that have been
+# modified since the given time. Paths must be relative to the root of
+# the working tree and separated by a single NUL.
+#
+# To enable this hook, rename this file to "query-watchman" and set
+# 'git config core.fsmonitor .git/hooks/query-watchman'
+#
+my ($version, $time) = @ARGV;
+#print STDERR "$0 $version $time\n";
+
+# Check the hook interface version
+
+if ($version == 1) {
+	# convert nanoseconds to seconds
+	$time = int $time / 1000000000;
+} else {
+	die "Unsupported query-fsmonitor hook version '$version'.\n" .
+	    "Falling back to scanning...\n";
+}
+
+# Convert unix style paths to escaped Windows style paths when running
+# in Windows command prompt
+
+my $system = `uname -s`;
+$system =~ s/[\r\n]+//g;
+my $git_work_tree;
+
+if ($system =~ m/^MSYS_NT/) {
+	$git_work_tree = `cygpath -aw "\$PWD"`;
+	$git_work_tree =~ s/[\r\n]+//g;
+	$git_work_tree =~ s,\\,/,g;
+} else {
+	$git_work_tree = $ENV{'PWD'};
+}
+
+my $retry = 1;
+
+launch_watchman();
+
+sub launch_watchman {
+
+	# Set input record separator
+	local $/ = 0666;
+
+	my $pid = open2(\*CHLD_OUT, \*CHLD_IN, 'watchman -j')
+	    or die "open2() failed: $!\n" .
+	    "Falling back to scanning...\n";
+
+	# In the query expression below we're asking for names of files that
+	# changed since $time but were not transient (ie created after
+	# $time but no longer exist).
+	#
+	# To accomplish this, we're using the "since" generator to use the
+	# recency index to select candidate nodes and "fields" to limit the
+	# output to file names only. Then we're using the "expression" term to
+	# further constrain the results.
+	#
+	# The category of transient files that we want to ignore will have a
+	# creation clock (cclock) newer than $time_t value and will also not
+	# currently exist.
+
+	my $query = <<"	END";
+		["query", "$git_work_tree", {
+			"since": $time,
+			"fields": ["name"],
+			"expression": ["not", ["allof", ["since", $time, "cclock"], ["not", "exists"]]]
+		}]
+	END
+	
+	open (my $fh, ">", ".git/watchman-query.json");
+	print $fh $query;
+	close $fh;
+
+	print CHLD_IN $query;
+	my $response = <CHLD_OUT>;
+
+	open ($fh, ">", ".git/watchman-response.json");
+	print $fh $response;
+	close $fh;
+
+	die "Watchman: command returned no output.\n" .
+	    "Falling back to scanning...\n" if $response eq "";
+	die "Watchman: command returned invalid output: $response\n" .
+	    "Falling back to scanning...\n" unless $response =~ /^\{/;
+
+	my $json_pkg;
+	eval {
+		require JSON::XS;
+		$json_pkg = "JSON::XS";
+		1;
+	} or do {
+		require JSON::PP;
+		$json_pkg = "JSON::PP";
+	};
+
+	my $o = $json_pkg->new->utf8->decode($response);
+
+	if ($retry > 0 and $o->{error} and $o->{error} =~ m/unable to resolve root .* directory (.*) is not watched/) {
+		print STDERR "Adding '$git_work_tree' to watchman's watch list.\n";
+		$retry--;
+		qx/watchman watch "$git_work_tree"/;
+		die "Failed to make watchman watch '$git_work_tree'.\n" .
+		    "Falling back to scanning...\n" if $? != 0;
+
+		# Watchman will always return all files on the first query so
+		# return the fast "everything is dirty" flag to git and do the
+		# Watchman query just to get it over with now so we won't pay
+		# the cost in git to look up each individual file.
+
+		open ($fh, ">", ".git/watchman-output.out");
+		print "/\0";
+		close $fh;
+
+		print "/\0";
+		eval { launch_watchman() };
+		exit 0;
+	}
+
+	die "Watchman: $o->{error}.\n" .
+	    "Falling back to scanning...\n" if $o->{error};
+
+	open ($fh, ">", ".git/watchman-output.out");
+	print $fh @{$o->{files}};
+	close $fh;
+
+	binmode STDOUT, ":utf8";
+	local $, = "\0";
+	print @{$o->{files}};
+}
-- 
2.14.1.windows.1


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v7 11/12] fsmonitor: add a sample integration script for Watchman
  2017-09-19 19:27   ` [PATCH v7 00/12] Fast git status via a file system watcher Ben Peart
                       ` (9 preceding siblings ...)
  2017-09-19 19:27     ` [PATCH v7 10/12] fsmonitor: add test cases for fsmonitor extension Ben Peart
@ 2017-09-19 19:27     ` Ben Peart
  2017-09-19 19:27     ` [PATCH v7 12/12] fsmonitor: add a performance test Ben Peart
  2017-09-22 16:35     ` [PATCH v8 00/12] Fast git status via a file system watcher Ben Peart
  12 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-19 19:27 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

This script integrates the new fsmonitor capabilities of git with the
cross platform Watchman file watching service. To use the script:

Download and install Watchman from https://facebook.github.io/watchman/.
Rename the sample integration hook from fsmonitor-watchman.sample to
fsmonitor-watchman. Configure git to use the extension:

git config core.fsmonitor .git/hooks/fsmonitor-watchman

Optionally turn on the untracked cache for optimal performance.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Christian Couder <christian.couder@gmail.com>
---
 templates/hooks--fsmonitor-watchman.sample | 122 +++++++++++++++++++++++++++++
 1 file changed, 122 insertions(+)
 create mode 100755 templates/hooks--fsmonitor-watchman.sample

diff --git a/templates/hooks--fsmonitor-watchman.sample b/templates/hooks--fsmonitor-watchman.sample
new file mode 100755
index 0000000000..870a59d237
--- /dev/null
+++ b/templates/hooks--fsmonitor-watchman.sample
@@ -0,0 +1,122 @@
+#!/usr/bin/perl
+
+use strict;
+use warnings;
+use IPC::Open2;
+
+# An example hook script to integrate Watchman
+# (https://facebook.github.io/watchman/) with git to speed up detecting
+# new and modified files.
+#
+# The hook is passed a version (currently 1) and a time in nanoseconds
+# formatted as a string and outputs to stdout all files that have been
+# modified since the given time. Paths must be relative to the root of
+# the working tree and separated by a single NUL.
+#
+# To enable this hook, rename this file to "query-watchman" and set
+# 'git config core.fsmonitor .git/hooks/query-watchman'
+#
+my ($version, $time) = @ARGV;
+
+# Check the hook interface version
+
+if ($version == 1) {
+	# convert nanoseconds to seconds
+	$time = int $time / 1000000000;
+} else {
+	die "Unsupported query-fsmonitor hook version '$version'.\n" .
+	    "Falling back to scanning...\n";
+}
+
+# Convert unix style paths to escaped Windows style paths when running
+# in Windows command prompt
+
+my $system = `uname -s`;
+$system =~ s/[\r\n]+//g;
+my $git_work_tree;
+
+if ($system =~ m/^MSYS_NT/) {
+	$git_work_tree = `cygpath -aw "\$PWD"`;
+	$git_work_tree =~ s/[\r\n]+//g;
+	$git_work_tree =~ s,\\,/,g;
+} else {
+	$git_work_tree = $ENV{'PWD'};
+}
+
+my $retry = 1;
+
+launch_watchman();
+
+sub launch_watchman {
+
+	# Set input record separator
+	local $/ = 0666;
+
+	my $pid = open2(\*CHLD_OUT, \*CHLD_IN, 'watchman -j')
+	    or die "open2() failed: $!\n" .
+	    "Falling back to scanning...\n";
+
+	# In the query expression below we're asking for names of files that
+	# changed since $time but were not transient (ie created after
+	# $time but no longer exist).
+	#
+	# To accomplish this, we're using the "since" generator to use the
+	# recency index to select candidate nodes and "fields" to limit the
+	# output to file names only. Then we're using the "expression" term to
+	# further constrain the results.
+	#
+	# The category of transient files that we want to ignore will have a
+	# creation clock (cclock) newer than $time_t value and will also not
+	# currently exist.
+
+	my $query = <<"	END";
+		["query", "$git_work_tree", {
+			"since": $time,
+			"fields": ["name"],
+			"expression": ["not", ["allof", ["since", $time, "cclock"], ["not", "exists"]]]
+		}]
+	END
+
+	print CHLD_IN $query;
+	my $response = <CHLD_OUT>;
+
+	die "Watchman: command returned no output.\n" .
+	    "Falling back to scanning...\n" if $response eq "";
+	die "Watchman: command returned invalid output: $response\n" .
+	    "Falling back to scanning...\n" unless $response =~ /^\{/;
+
+	my $json_pkg;
+	eval {
+		require JSON::XS;
+		$json_pkg = "JSON::XS";
+		1;
+	} or do {
+		require JSON::PP;
+		$json_pkg = "JSON::PP";
+	};
+
+	my $o = $json_pkg->new->utf8->decode($response);
+
+	if ($retry > 0 and $o->{error} and $o->{error} =~ m/unable to resolve root .* directory (.*) is not watched/) {
+		print STDERR "Adding '$git_work_tree' to watchman's watch list.\n";
+		$retry--;
+		qx/watchman watch "$git_work_tree"/;
+		die "Failed to make watchman watch '$git_work_tree'.\n" .
+		    "Falling back to scanning...\n" if $? != 0;
+
+		# Watchman will always return all files on the first query so
+		# return the fast "everything is dirty" flag to git and do the
+		# Watchman query just to get it over with now so we won't pay
+		# the cost in git to look up each individual file.
+		print "/\0";
+		eval { launch_watchman() };
+		exit 0;
+	}
+
+	die "Watchman: $o->{error}.\n" .
+	    "Falling back to scanning...\n" if $o->{error};
+
+	binmode STDOUT, ":utf8";
+	local $, = "\0";
+	print @{$o->{files}};
+}
-- 
2.14.1.windows.1


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v7 12/12] fsmonitor: add a performance test
  2017-09-19 19:27   ` [PATCH v7 00/12] Fast git status via a file system watcher Ben Peart
                       ` (10 preceding siblings ...)
  2017-09-19 19:27     ` [PATCH v7 11/12] fsmonitor: add a sample integration script for Watchman Ben Peart
@ 2017-09-19 19:27     ` Ben Peart
  2017-09-22 16:35     ` [PATCH v8 00/12] Fast git status via a file system watcher Ben Peart
  12 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-19 19:27 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

Add a test utility (test-drop-caches) that flushes all changes to disk
then drops file system cache on Windows, Linux, and OSX.

Add a perf test (p7519-fsmonitor.sh) for fsmonitor.

By default, the performance test will utilize the Watchman file system
monitor if it is installed.  If Watchman is not installed, it will use a
dummy integration script that does not report any new or modified files.
The dummy script has very little overhead which provides optimistic results.

The performance test will also use the untracked cache feature if it is
available as fsmonitor uses it to speed up scanning for untracked files.

There are 3 environment variables that can be used to alter the default
behavior of the performance test:

GIT_PERF_7519_UNTRACKED_CACHE: used to configure core.untrackedCache
GIT_PERF_7519_SPLIT_INDEX: used to configure core.splitIndex
GIT_PERF_7519_FSMONITOR: used to configure core.fsMonitor

The big win for using fsmonitor is the elimination of the need to scan the
working directory looking for changed and untracked files. If the file
information is all cached in RAM, the benefits are reduced.

GIT_PERF_7519_DROP_CACHE: if set, the OS caches are dropped between tests

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Makefile                    |   1 +
 t/helper/.gitignore         |   1 +
 t/helper/test-drop-caches.c | 162 ++++++++++++++++++++++++++++++++++++++
 t/perf/p7519-fsmonitor.sh   | 184 ++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 348 insertions(+)
 create mode 100644 t/helper/test-drop-caches.c
 create mode 100755 t/perf/p7519-fsmonitor.sh

diff --git a/Makefile b/Makefile
index d970cd00e9..b2653ee64f 100644
--- a/Makefile
+++ b/Makefile
@@ -638,6 +638,7 @@ TEST_PROGRAMS_NEED_X += test-ctype
 TEST_PROGRAMS_NEED_X += test-config
 TEST_PROGRAMS_NEED_X += test-date
 TEST_PROGRAMS_NEED_X += test-delta
+TEST_PROGRAMS_NEED_X += test-drop-caches
 TEST_PROGRAMS_NEED_X += test-dump-cache-tree
 TEST_PROGRAMS_NEED_X += test-dump-fsmonitor
 TEST_PROGRAMS_NEED_X += test-dump-split-index
diff --git a/t/helper/.gitignore b/t/helper/.gitignore
index 721650256e..f9328eebdd 100644
--- a/t/helper/.gitignore
+++ b/t/helper/.gitignore
@@ -3,6 +3,7 @@
 /test-config
 /test-date
 /test-delta
+/test-drop-caches
 /test-dump-cache-tree
 /test-dump-split-index
 /test-dump-untracked-cache
diff --git a/t/helper/test-drop-caches.c b/t/helper/test-drop-caches.c
new file mode 100644
index 0000000000..4e5ca8f397
--- /dev/null
+++ b/t/helper/test-drop-caches.c
@@ -0,0 +1,162 @@
+#include "git-compat-util.h"
+
+#if defined(GIT_WINDOWS_NATIVE)
+
+static int cmd_sync(void)
+{
+	char Buffer[MAX_PATH];
+	DWORD dwRet;
+	char szVolumeAccessPath[] = "\\\\.\\X:";
+	HANDLE hVolWrite;
+	int success = 0;
+
+	dwRet = GetCurrentDirectory(MAX_PATH, Buffer);
+	if ((0 == dwRet) || (dwRet > MAX_PATH))
+		return error("Error getting current directory");
+
+	if ((Buffer[0] < 'A') || (Buffer[0] > 'Z'))
+		return error("Invalid drive letter '%c'", Buffer[0]);
+
+	szVolumeAccessPath[4] = Buffer[0];
+	hVolWrite = CreateFile(szVolumeAccessPath, GENERIC_READ | GENERIC_WRITE,
+		FILE_SHARE_READ | FILE_SHARE_WRITE, NULL, OPEN_EXISTING, 0, NULL);
+	if (INVALID_HANDLE_VALUE == hVolWrite)
+		return error("Unable to open volume for writing, need admin access");
+
+	success = FlushFileBuffers(hVolWrite);
+	if (!success)
+		error("Unable to flush volume");
+
+	CloseHandle(hVolWrite);
+
+	return !success;
+}
+
+#define STATUS_SUCCESS			(0x00000000L)
+#define STATUS_PRIVILEGE_NOT_HELD	(0xC0000061L)
+
+typedef enum _SYSTEM_INFORMATION_CLASS {
+	SystemMemoryListInformation = 80,
+} SYSTEM_INFORMATION_CLASS;
+
+typedef enum _SYSTEM_MEMORY_LIST_COMMAND {
+	MemoryCaptureAccessedBits,
+	MemoryCaptureAndResetAccessedBits,
+	MemoryEmptyWorkingSets,
+	MemoryFlushModifiedList,
+	MemoryPurgeStandbyList,
+	MemoryPurgeLowPriorityStandbyList,
+	MemoryCommandMax
+} SYSTEM_MEMORY_LIST_COMMAND;
+
+static BOOL GetPrivilege(HANDLE TokenHandle, LPCSTR lpName, int flags)
+{
+	BOOL bResult;
+	DWORD dwBufferLength;
+	LUID luid;
+	TOKEN_PRIVILEGES tpPreviousState;
+	TOKEN_PRIVILEGES tpNewState;
+
+	dwBufferLength = 16;
+	bResult = LookupPrivilegeValueA(0, lpName, &luid);
+	if (bResult) {
+		tpNewState.PrivilegeCount = 1;
+		tpNewState.Privileges[0].Luid = luid;
+		tpNewState.Privileges[0].Attributes = 0;
+		bResult = AdjustTokenPrivileges(TokenHandle, 0, &tpNewState,
+			(DWORD)((LPBYTE)&(tpNewState.Privileges[1]) - (LPBYTE)&tpNewState),
+			&tpPreviousState, &dwBufferLength);
+		if (bResult) {
+			tpPreviousState.PrivilegeCount = 1;
+			tpPreviousState.Privileges[0].Luid = luid;
+			tpPreviousState.Privileges[0].Attributes = flags != 0 ? 2 : 0;
+			bResult = AdjustTokenPrivileges(TokenHandle, 0, &tpPreviousState,
+				dwBufferLength, 0, 0);
+		}
+	}
+	return bResult;
+}
+
+static int cmd_dropcaches(void)
+{
+	HANDLE hProcess = GetCurrentProcess();
+	HANDLE hToken;
+	HMODULE ntdll;
+	int status;
+
+	if (!OpenProcessToken(hProcess, TOKEN_QUERY | TOKEN_ADJUST_PRIVILEGES, &hToken))
+		return error("Can't open current process token");
+
+	if (!GetPrivilege(hToken, "SeProfileSingleProcessPrivilege", 1))
+		return error("Can't get SeProfileSingleProcessPrivilege");
+
+	CloseHandle(hToken);
+
+	ntdll = LoadLibrary("ntdll.dll");
+	if (!ntdll)
+		return error("Can't load ntdll.dll, wrong Windows version?");
+
+	DWORD(WINAPI *NtSetSystemInformation)(INT, PVOID, ULONG) =
+		(DWORD(WINAPI *)(INT, PVOID, ULONG))GetProcAddress(ntdll, "NtSetSystemInformation");
+	if (!NtSetSystemInformation)
+		return error("Can't get function addresses, wrong Windows version?");
+
+	SYSTEM_MEMORY_LIST_COMMAND command = MemoryPurgeStandbyList;
+	status = NtSetSystemInformation(
+		SystemMemoryListInformation,
+		&command,
+		sizeof(SYSTEM_MEMORY_LIST_COMMAND)
+	);
+	if (status == STATUS_PRIVILEGE_NOT_HELD)
+		error("Insufficient privileges to purge the standby list, need admin access");
+	else if (status != STATUS_SUCCESS)
+		error("Unable to execute the memory list command %d", status);
+
+	FreeLibrary(ntdll);
+
+	return status;
+}
+
+#elif defined(__linux__)
+
+static int cmd_sync(void)
+{
+	return system("sync");
+}
+
+static int cmd_dropcaches(void)
+{
+	return system("echo 3 | sudo tee /proc/sys/vm/drop_caches");
+}
+
+#elif defined(__APPLE__)
+
+static int cmd_sync(void)
+{
+	return system("sync");
+}
+
+static int cmd_dropcaches(void)
+{
+	return system("sudo purge");
+}
+
+#else
+
+static int cmd_sync(void)
+{
+	return 0;
+}
+
+static int cmd_dropcaches(void)
+{
+	return error("drop caches not implemented on this platform");
+}
+
+#endif
+
+int cmd_main(int argc, const char **argv)
+{
+	cmd_sync();
+	return cmd_dropcaches();
+}
diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
new file mode 100755
index 0000000000..16d1bf72e5
--- /dev/null
+++ b/t/perf/p7519-fsmonitor.sh
@@ -0,0 +1,184 @@
+#!/bin/sh
+
+test_description="Test core.fsmonitor"
+
+. ./perf-lib.sh
+
+#
+# Performance test for the fsmonitor feature which enables git to talk to a
+# file system change monitor and avoid having to scan the working directory
+# for new or modified files.
+#
+# By default, the performance test will utilize the Watchman file system
+# monitor if it is installed.  If Watchman is not installed, it will use a
+# dummy integration script that does not report any new or modified files.
+# The dummy script has very little overhead which provides optimistic results.
+#
+# The performance test will also use the untracked cache feature if it is
+# available as fsmonitor uses it to speed up scanning for untracked files.
+#
+# There are 3 environment variables that can be used to alter the default
+# behavior of the performance test:
+#
+# GIT_PERF_7519_UNTRACKED_CACHE: used to configure core.untrackedCache
+# GIT_PERF_7519_SPLIT_INDEX: used to configure core.splitIndex
+# GIT_PERF_7519_FSMONITOR: used to configure core.fsMonitor
+#
+# The big win for using fsmonitor is the elimination of the need to scan the
+# working directory looking for changed and untracked files. If the file
+# information is all cached in RAM, the benefits are reduced.
+#
+# GIT_PERF_7519_DROP_CACHE: if set, the OS caches are dropped between tests
+#
+
+test_perf_large_repo
+test_checkout_worktree
+
+test_lazy_prereq UNTRACKED_CACHE '
+	{ git update-index --test-untracked-cache; ret=$?; } &&
+	test $ret -ne 1
+'
+
+test_lazy_prereq WATCHMAN '
+	{ command -v watchman >/dev/null 2>&1; ret=$?; } &&
+	test $ret -ne 1
+'
+
+if test_have_prereq WATCHMAN
+then
+	# Convert unix style paths to escaped Windows style paths for Watchman
+	case "$(uname -s)" in
+	MSYS_NT*)
+	  GIT_WORK_TREE="$(cygpath -aw "$PWD" | sed 's,\\,/,g')"
+	  ;;
+	*)
+	  GIT_WORK_TREE="$PWD"
+	  ;;
+	esac
+fi
+
+if test -n "$GIT_PERF_7519_DROP_CACHE"
+then
+	# When using GIT_PERF_7519_DROP_CACHE, GIT_PERF_REPEAT_COUNT must be 1 to
+	# generate valid results. Otherwise the caching that happens for the nth
+	# run will negate the validity of the comparisons.
+	if test "$GIT_PERF_REPEAT_COUNT" -ne 1
+	then
+		echo "warning: Setting GIT_PERF_REPEAT_COUNT=1" >&2
+		GIT_PERF_REPEAT_COUNT=1
+	fi
+fi
+
+test_expect_success "setup for fsmonitor" '
+	# set untrackedCache depending on the environment
+	if test -n "$GIT_PERF_7519_UNTRACKED_CACHE"
+	then
+		git config core.untrackedCache "$GIT_PERF_7519_UNTRACKED_CACHE"
+	else
+		if test_have_prereq UNTRACKED_CACHE
+		then
+			git config core.untrackedCache true
+		else
+			git config core.untrackedCache false
+		fi
+	fi &&
+
+	# set core.splitindex depending on the environment
+	if test -n "$GIT_PERF_7519_SPLIT_INDEX"
+	then
+		git config core.splitIndex "$GIT_PERF_7519_SPLIT_INDEX"
+	fi &&
+
+	# set INTEGRATION_SCRIPT depending on the environment
+	if test -n "$GIT_PERF_7519_FSMONITOR"
+	then
+		INTEGRATION_SCRIPT="$GIT_PERF_7519_FSMONITOR"
+	else
+		#
+		# Choose integration script based on existence of Watchman.
+		# If Watchman exists, watch the work tree and attempt a query.
+		# If everything succeeds, use Watchman integration script,
+		# else fall back to an empty integration script.
+		#
+		mkdir .git/hooks &&
+		if test_have_prereq WATCHMAN
+		then
+			INTEGRATION_SCRIPT=".git/hooks/fsmonitor-watchman" &&
+			cp "$TEST_DIRECTORY/../templates/hooks--fsmonitor-watchman.sample" "$INTEGRATION_SCRIPT" &&
+			watchman watch "$GIT_WORK_TREE" &&
+			watchman watch-list | grep -q -F "$GIT_WORK_TREE"
+		else
+			INTEGRATION_SCRIPT=".git/hooks/fsmonitor-empty" &&
+			write_script "$INTEGRATION_SCRIPT"<<-\EOF
+			EOF
+		fi
+	fi &&
+
+	git config core.fsmonitor "$INTEGRATION_SCRIPT" &&
+	git update-index --fsmonitor
+'
+
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-drop-caches
+fi
+
+test_perf "status (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git status
+'
+
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-drop-caches
+fi
+
+test_perf "status -uno (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git status -uno
+'
+
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-drop-caches
+fi
+
+test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git status -uall
+'
+
+test_expect_success "setup without fsmonitor" '
+	unset INTEGRATION_SCRIPT &&
+	git config --unset core.fsmonitor &&
+	git update-index --no-fsmonitor
+'
+
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-drop-caches
+fi
+
+test_perf "status (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git status
+'
+
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-drop-caches
+fi
+
+test_perf "status -uno (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git status -uno
+'
+
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-drop-caches
+fi
+
+test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git status -uall
+'
+
+if test_have_prereq WATCHMAN
+then
+	watchman watch-del "$GIT_WORK_TREE" >/dev/null 2>&1 &&
+
+	# Work around Watchman bug on Windows where it holds on to handles
+	# preventing the removal of the trash directory
+	watchman shutdown-server >/dev/null 2>&1
+fi
+
+test_done
-- 
2.14.1.windows.1


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* RE: [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor extension
  2017-09-15 22:00     ` David Turner
@ 2017-09-19 19:32       ` David Turner
  2017-09-19 20:30         ` Ben Peart
  0 siblings, 1 reply; 137+ messages in thread
From: David Turner @ 2017-09-19 19:32 UTC (permalink / raw)
  To: 'Ben Peart'
  Cc: 'avarab@gmail.com', 'christian.couder@gmail.com',
	'git@vger.kernel.org', 'gitster@pobox.com',
	'johannes.schindelin@gmx.de', 'pclouds@gmail.com',
	'peff@peff.net'

I think my comment here might have gotten lost, and I don't want it to because it's something I'm really worried about:

> -----Original Message-----
> From: David Turner
> Sent: Friday, September 15, 2017 6:00 PM
> To: 'Ben Peart' <benpeart@microsoft.com>
> Cc: avarab@gmail.com; christian.couder@gmail.com; git@vger.kernel.org;
> gitster@pobox.com; johannes.schindelin@gmx.de; pclouds@gmail.com;
> peff@peff.net
> Subject: RE: [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor
> extension
> 
> > -----Original Message-----
> > +dirty_repo () {
> > +	: >untracked &&
> > +	: >dir1/untracked &&
> > +	: >dir2/untracked &&
> > +	echo 1 >modified &&
> > +	echo 2 >dir1/modified &&
> > +	echo 3 >dir2/modified &&
> > +	echo 4 >new &&
> > +	echo 5 >dir1/new &&
> > +	echo 6 >dir2/new
> 
> If I add an untracked file named dir3/untracked to dirty_repo  (and
> write_integration_script), then "status doesn't detect unreported
> modifications", below, fails.  Did I do something wrong, or does this turn up a
> bug?



^ permalink raw reply	[flat|nested] 137+ messages in thread

* RE: [PATCH v7 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit
  2017-09-19 19:27     ` [PATCH v7 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit Ben Peart
@ 2017-09-19 19:46       ` David Turner
  2017-09-19 20:44         ` Ben Peart
  0 siblings, 1 reply; 137+ messages in thread
From: David Turner @ 2017-09-19 19:46 UTC (permalink / raw)
  To: 'Ben Peart'
  Cc: avarab@gmail.com, christian.couder@gmail.com, git@vger.kernel.org,
	gitster@pobox.com, johannes.schindelin@gmx.de, pclouds@gmail.com,
	peff@peff.net

> -----Original Message-----
> From: Ben Peart [mailto:benpeart@microsoft.com]
> Sent: Tuesday, September 19, 2017 3:28 PM
> To: benpeart@microsoft.com
> Cc: David Turner <David.Turner@twosigma.com>; avarab@gmail.com;
> christian.couder@gmail.com; git@vger.kernel.org; gitster@pobox.com;
> johannes.schindelin@gmx.de; pclouds@gmail.com; peff@peff.net
> Subject: [PATCH v7 06/12] ls-files: Add support in ls-files to display the
> fsmonitor valid bit
> 
> Add a new command line option (-f) to ls-files to have it use lowercase
> letters for 'fsmonitor valid' files
> 
> Signed-off-by: Ben Peart <benpeart@microsoft.com>
> ---
>  builtin/ls-files.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)

This is still missing the corresponding documentation patch.  

I can see from replies that at least some of my messages got through.  In total, I sent messages about:
04/12 (I see replies)
05/12 (I see replies)
06/12 (no reply, issue not fixed)
10/12 (no reply, haven't checked whether same issue but I assume same issue since the new case I mentioned isn't added)
12/12 (no reply, typo fixed -- no reply required)


^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor extension
  2017-09-19 19:32       ` David Turner
@ 2017-09-19 20:30         ` Ben Peart
  0 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-19 20:30 UTC (permalink / raw)
  To: David Turner, 'Ben Peart'
  Cc: 'avarab@gmail.com', 'christian.couder@gmail.com',
	'git@vger.kernel.org', 'gitster@pobox.com',
	'johannes.schindelin@gmx.de', 'pclouds@gmail.com',
	'peff@peff.net'



On 9/19/2017 3:32 PM, David Turner wrote:
> I think my comment here might have gotten lost, and I don't want it to because it's something I'm really worried about:
> 

You have to update the test completely to ensure it passes.  If you run 
the test with the '-v' option you will see the cause of the failure:

t7519-status-fsmonitor.sh: line 27: dir3/untracked: No such file or 
directory

To fix this, you will also need to update the 'setup' test to create the 
directory for the new untracked file to get created into.  Then you will 
need to drop at least one file in it so that the directory is preserved 
through the 'git reset --hard'  Then you have to update the several 'cat 
 >expect' blocks to expect the new file.

In addition, the ability to avoid scanning for untracked files relies on 
the untracked cache.  If you don't have another file that git is aware 
of in that directory then there won't be a cache entry and git will do 
the required scan and detect the new untracked file (by design).

Here is a patch to the test that updates it to meet all these 
requirements.  I hope this helps.


diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh
index c6df85af5e..29ae4e284f 100755
--- a/t/t7519-status-fsmonitor.sh
+++ b/t/t7519-status-fsmonitor.sh
@@ -24,12 +24,14 @@ dirty_repo () {
  	: >untracked &&
  	: >dir1/untracked &&
  	: >dir2/untracked &&
+	: >dir3/untracked &&
  	echo 1 >modified &&
  	echo 2 >dir1/modified &&
  	echo 3 >dir2/modified &&
  	echo 4 >new &&
  	echo 5 >dir1/new &&
  	echo 6 >dir2/new
+	echo 7 >dir3/new
  }

  write_integration_script () {
@@ -47,12 +49,14 @@ write_integration_script () {
  	printf "untracked\0"
  	printf "dir1/untracked\0"
  	printf "dir2/untracked\0"
+	printf "dir3/untracked\0"
  	printf "modified\0"
  	printf "dir1/modified\0"
  	printf "dir2/modified\0"
  	printf "new\0"
  	printf "dir1/new\0"
  	printf "dir2/new\0"
+	printf "dir3/new\0"
  	EOF
  }

@@ -71,6 +75,8 @@ test_expect_success 'setup' '
  	mkdir dir2 &&
  	: >dir2/tracked &&
  	: >dir2/modified &&
+	mkdir dir3 &&
+	: >dir3/tracked &&
  	git -c core.fsmonitor= add . &&
  	git -c core.fsmonitor= commit -m initial &&
  	git config core.fsmonitor .git/hooks/fsmonitor-test &&
@@ -107,6 +113,7 @@ h dir1/modified
  H dir1/tracked
  h dir2/modified
  H dir2/tracked
+H dir3/tracked
  h modified
  H tracked
  EOF
@@ -126,6 +133,7 @@ H dir1/modified
  H dir1/tracked
  H dir2/modified
  H dir2/tracked
+H dir3/tracked
  H modified
  H tracked
  EOF
@@ -144,6 +152,7 @@ H dir1/modified
  H dir1/tracked
  H dir2/modified
  H dir2/tracked
+H dir3/tracked
  H modified
  H tracked
  EOF
@@ -164,6 +173,8 @@ H dir1/tracked
  H dir2/modified
  h dir2/new
  H dir2/tracked
+h dir3/new
+H dir3/tracked
  H modified
  h new
  H tracked
@@ -174,6 +185,7 @@ test_expect_success 'newly added files are marked 
valid' '
  	git add new &&
  	git add dir1/new &&
  	git add dir2/new &&
+	git add dir3/new &&
  	git ls-files -f >actual &&
  	test_cmp expect actual
  '
@@ -185,6 +197,8 @@ h dir1/tracked
  H dir2/modified
  h dir2/new
  h dir2/tracked
+h dir3/new
+h dir3/tracked
  H modified
  h new
  h tracked
@@ -203,6 +217,7 @@ H dir1/modified
  h dir1/tracked
  h dir2/modified
  h dir2/tracked
+h dir3/tracked
  h modified
  h tracked
  EOF
@@ -269,6 +284,7 @@ do
  			git add new &&
  			git add dir1/new &&
  			git add dir2/new &&
+			git add dir3/new &&
  			git status >actual &&
  			git -c core.fsmonitor= status >expect &&
  			test_i18ncmp expect actual



>> -----Original Message-----
>> From: David Turner
>> Sent: Friday, September 15, 2017 6:00 PM
>> To: 'Ben Peart' <benpeart@microsoft.com>
>> Cc: avarab@gmail.com; christian.couder@gmail.com; git@vger.kernel.org;
>> gitster@pobox.com; johannes.schindelin@gmx.de; pclouds@gmail.com;
>> peff@peff.net
>> Subject: RE: [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor
>> extension
>>
>>> -----Original Message-----
>>> +dirty_repo () {
>>> +	: >untracked &&
>>> +	: >dir1/untracked &&
>>> +	: >dir2/untracked &&
>>> +	echo 1 >modified &&
>>> +	echo 2 >dir1/modified &&
>>> +	echo 3 >dir2/modified &&
>>> +	echo 4 >new &&
>>> +	echo 5 >dir1/new &&
>>> +	echo 6 >dir2/new
>>
>> If I add an untracked file named dir3/untracked to dirty_repo  (and
>> write_integration_script), then "status doesn't detect unreported
>> modifications", below, fails.  Did I do something wrong, or does this turn up a
>> bug?
> 
> 

^ permalink raw reply related	[flat|nested] 137+ messages in thread

* Re: [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor extension
  2017-09-18 15:25       ` Ben Peart
@ 2017-09-19 20:34         ` Jonathan Nieder
  0 siblings, 0 replies; 137+ messages in thread
From: Jonathan Nieder @ 2017-09-19 20:34 UTC (permalink / raw)
  To: Ben Peart
  Cc: Junio C Hamano, Ben Peart, David.Turner, avarab, christian.couder,
	git, johannes.schindelin, pclouds, peff

Ben Peart wrote:

> Some stats on these same coding style errors in the current bash scripts:
>
> 298 instances of "[a-z]\(\).*\{" ie "function_name() {" (no space)
> 140 instances of "if \[ .* \]" (not using the preferred "test")
> 293 instances of "if .*; then"
>
> Wouldn't it be great not to have to write up style feedback for when
> these all get copy/pasted into new scripts?

Agreed.  Care to write patches for these? :)  (I think three patches,
one for each issue, would do the trick.)

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v1 1/1] test-lint: echo -e (or -E) is not portable
  2017-09-17  5:43       ` [PATCH v1 1/1] test-lint: echo -e (or -E) is not portable tboegi
@ 2017-09-19 20:37         ` Jonathan Nieder
  2017-09-20 13:49           ` Torsten Bögershausen
  0 siblings, 1 reply; 137+ messages in thread
From: Jonathan Nieder @ 2017-09-19 20:37 UTC (permalink / raw)
  To: tboegi; +Cc: git, benpeart, Junio C Hamano

Torsten Bögershausen <tboegi@web.de> wrote:

> Some implementations of `echo` support the '-e' option to enable
> backslash interpretation of the following string.
> As an addition, they support '-E' to turn it off.

nit: please wrap the commit message to a consistent line width.

> However, none of these are portable, POSIX doesn't even mention them,
> and many implementations don't support them.
>
> A check for '-n' is already done in check-non-portable-shell.pl,
> extend it to cover '-n', '-e' or '-E-'
>
> Signed-off-by: Torsten Bögershausen <tboegi@web.de>
> ---
>  t/check-non-portable-shell.pl | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

An excellent change.  Thanks for noticing and fixing this.

Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v6 09/12] split-index: disable the fsmonitor extension when running the split index test
  2017-09-15 19:20   ` [PATCH v6 09/12] split-index: disable the fsmonitor extension when running the split index test Ben Peart
@ 2017-09-19 20:43     ` Jonathan Nieder
  2017-09-20 17:11       ` Ben Peart
  0 siblings, 1 reply; 137+ messages in thread
From: Jonathan Nieder @ 2017-09-19 20:43 UTC (permalink / raw)
  To: Ben Peart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

Hi,

Ben Peart wrote:

> The split index test t1700-split-index.sh has hard coded SHA values for
> the index.  Currently it supports index V4 and V3 but assumes there are
> no index extensions loaded.
>
> When manually forcing the fsmonitor extension to be turned on when
> running the test suite, the SHA values no longer match which causes the
> test to fail.
>
> The potential matrix of index extensions and index versions can is quite
> large so instead disable the extension before attempting to run the test.

Thanks for finding and diagnosing this problem.

This feels to me like the wrong fix.  Wouldn't it be better for the
test not to depend on the precise object ids?  See the "Tips for
Writing Tests" section in t/README:

	                                                         And
	such drastic changes to the core GIT that even changes these
	otherwise supposedly stable object IDs should be accompanied by
	an update to t0000-basic.sh.

	However, other tests that simply rely on basic parts of the core
	GIT working properly should not have that level of intimate
	knowledge of the core GIT internals.  If all the test scripts
	hardcoded the object IDs like t0000-basic.sh does, that defeats
	the purpose of t0000-basic.sh, which is to isolate that level of
	validation in one place.  Your test also ends up needing
	updating when such a change to the internal happens, so do _not_
	do it and leave the low level of validation to t0000-basic.sh.

Worse, t1700-split-index.sh doesn't explain where the object ids it
uses comes from so it is not even obvious to a casual reader like me
how to fix it.

See t/diff-lib.sh for some examples of one way to avoid depending on
the object id computation.  Another way that is often preferable is to
come up with commands to compute the expected hash values, like
$(git rev-parse HEAD^{tree}), and use those instead of hard-coded
values.

Thanks and hope that helps,
Jonathan

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v7 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit
  2017-09-19 19:46       ` David Turner
@ 2017-09-19 20:44         ` Ben Peart
  2017-09-19 21:27           ` David Turner
  0 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-09-19 20:44 UTC (permalink / raw)
  To: David Turner, 'Ben Peart'
  Cc: avarab@gmail.com, christian.couder@gmail.com, git@vger.kernel.org,
	gitster@pobox.com, johannes.schindelin@gmx.de, pclouds@gmail.com,
	peff@peff.net



On 9/19/2017 3:46 PM, David Turner wrote:
>> -----Original Message-----
>> From: Ben Peart [mailto:benpeart@microsoft.com]
>> Sent: Tuesday, September 19, 2017 3:28 PM
>> To: benpeart@microsoft.com
>> Cc: David Turner <David.Turner@twosigma.com>; avarab@gmail.com;
>> christian.couder@gmail.com; git@vger.kernel.org; gitster@pobox.com;
>> johannes.schindelin@gmx.de; pclouds@gmail.com; peff@peff.net
>> Subject: [PATCH v7 06/12] ls-files: Add support in ls-files to display the
>> fsmonitor valid bit
>>
>> Add a new command line option (-f) to ls-files to have it use lowercase
>> letters for 'fsmonitor valid' files
>>
>> Signed-off-by: Ben Peart <benpeart@microsoft.com>
>> ---
>>   builtin/ls-files.c | 8 ++++++--
>>   1 file changed, 6 insertions(+), 2 deletions(-)
> 
> This is still missing the corresponding documentation patch.

Sorry for the confusion.

The documentation is all in a patch together as they all have links to 
each other.  You can find it here:

https://public-inbox.org/git/20170919192744.19224-6-benpeart@microsoft.com/T/#u

> 
> I can see from replies that at least some of my messages got through.  In total, I sent messages about:
> 04/12 (I see replies)
> 05/12 (I see replies)
> 06/12 (no reply, issue not fixed)

The documentation is all in a patch together as they all have links to 
each other.  You can find it here:

https://public-inbox.org/git/20170919192744.19224-6-benpeart@microsoft.com/T/#u

> 10/12 (no reply, haven't checked whether same issue but I assume same issue since the new case I mentioned isn't added)

It wasn't a bug so I didn't "fix" it.  I just sent an explanation and 
patch demonstrating why. You can find it here:

https://public-inbox.org/git/84981984-02c1-f322-a617-57dfe1d87ad2@gmail.com/T/#u

> 12/12 (no reply, typo fixed -- no reply required)
> 

Hope this helps.

^ permalink raw reply	[flat|nested] 137+ messages in thread

* RE: [PATCH v7 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit
  2017-09-19 20:44         ` Ben Peart
@ 2017-09-19 21:27           ` David Turner
  2017-09-19 22:44             ` Ben Peart
  0 siblings, 1 reply; 137+ messages in thread
From: David Turner @ 2017-09-19 21:27 UTC (permalink / raw)
  To: 'Ben Peart', 'Ben Peart'
  Cc: avarab@gmail.com, christian.couder@gmail.com, git@vger.kernel.org,
	gitster@pobox.com, johannes.schindelin@gmx.de, pclouds@gmail.com,
	peff@peff.net


> -----Original Message-----
> From: Ben Peart [mailto:peartben@gmail.com]
> Sent: Tuesday, September 19, 2017 4:45 PM
> To: David Turner <David.Turner@twosigma.com>; 'Ben Peart'
> <benpeart@microsoft.com>
> Cc: avarab@gmail.com; christian.couder@gmail.com; git@vger.kernel.org;
> gitster@pobox.com; johannes.schindelin@gmx.de; pclouds@gmail.com;
> peff@peff.net
> Subject: Re: [PATCH v7 06/12] ls-files: Add support in ls-files to display the
> fsmonitor valid bit
> 
> 
> 
> On 9/19/2017 3:46 PM, David Turner wrote:
> >> -----Original Message-----
> >> From: Ben Peart [mailto:benpeart@microsoft.com]
> >> Sent: Tuesday, September 19, 2017 3:28 PM
> >> To: benpeart@microsoft.com
> >> Cc: David Turner <David.Turner@twosigma.com>; avarab@gmail.com;
> >> christian.couder@gmail.com; git@vger.kernel.org; gitster@pobox.com;
> >> johannes.schindelin@gmx.de; pclouds@gmail.com; peff@peff.net
> >> Subject: [PATCH v7 06/12] ls-files: Add support in ls-files to
> >> display the fsmonitor valid bit
> >>
> >> Add a new command line option (-f) to ls-files to have it use
> >> lowercase letters for 'fsmonitor valid' files
> >>
> >> Signed-off-by: Ben Peart <benpeart@microsoft.com>
> >> ---
> >>   builtin/ls-files.c | 8 ++++++--
> >>   1 file changed, 6 insertions(+), 2 deletions(-)
> >
> > This is still missing the corresponding documentation patch.
> 
> Sorry for the confusion.

Thanks for following up.

> > 10/12 (no reply, haven't checked whether same issue but I assume same
> > issue since the new case I mentioned isn't added)
> 
> It wasn't a bug so I didn't "fix" it.  I just sent an explanation and patch
> demonstrating why. You can find it here:

I was concerned about the case of an untracked file inside a directory 
that contains no tracked files.  Your patch in this mail treats dir3 just 
like dir1 and dir2.  I think you ought to test the case of a dir with no 
tracked files.

After more careful checking, it looks like this case does work, but it's 
still worth testing.


^ permalink raw reply	[flat|nested] 137+ messages in thread

* RE: [PATCH v7 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit
  2017-09-19 21:27           ` David Turner
@ 2017-09-19 22:44             ` Ben Peart
  0 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-19 22:44 UTC (permalink / raw)
  To: David Turner, 'Ben Peart'
  Cc: avarab@gmail.com, christian.couder@gmail.com, git@vger.kernel.org,
	gitster@pobox.com, johannes.schindelin@gmx.de, pclouds@gmail.com,
	peff@peff.net

> -----Original Message-----
> From: David Turner [mailto:David.Turner@twosigma.com]
> Sent: Tuesday, September 19, 2017 5:27 PM
> To: 'Ben Peart' <peartben@gmail.com>; Ben Peart
> <Ben.Peart@microsoft.com>
> Cc: avarab@gmail.com; christian.couder@gmail.com; git@vger.kernel.org;
> gitster@pobox.com; johannes.schindelin@gmx.de; pclouds@gmail.com;
> peff@peff.net
> Subject: RE: [PATCH v7 06/12] ls-files: Add support in ls-files to display the
> fsmonitor valid bit
> 
> 
> > -----Original Message-----
> > From: Ben Peart [mailto:peartben@gmail.com]
> > Sent: Tuesday, September 19, 2017 4:45 PM
> > To: David Turner <David.Turner@twosigma.com>; 'Ben Peart'
> > <benpeart@microsoft.com>
> > Cc: avarab@gmail.com; christian.couder@gmail.com; git@vger.kernel.org;
> > gitster@pobox.com; johannes.schindelin@gmx.de; pclouds@gmail.com;
> > peff@peff.net
> > Subject: Re: [PATCH v7 06/12] ls-files: Add support in ls-files to
> > display the fsmonitor valid bit
> >
> >
> >
> > On 9/19/2017 3:46 PM, David Turner wrote:
> > >> -----Original Message-----
> > >> From: Ben Peart [mailto:benpeart@microsoft.com]
> > >> Sent: Tuesday, September 19, 2017 3:28 PM
> > >> To: benpeart@microsoft.com
> > >> Cc: David Turner <David.Turner@twosigma.com>; avarab@gmail.com;
> > >> christian.couder@gmail.com; git@vger.kernel.org; gitster@pobox.com;
> > >> johannes.schindelin@gmx.de; pclouds@gmail.com; peff@peff.net
> > >> Subject: [PATCH v7 06/12] ls-files: Add support in ls-files to
> > >> display the fsmonitor valid bit
> > >>
> > >> Add a new command line option (-f) to ls-files to have it use
> > >> lowercase letters for 'fsmonitor valid' files
> > >>
> > >> Signed-off-by: Ben Peart <benpeart@microsoft.com>
> > >> ---
> > >>   builtin/ls-files.c | 8 ++++++--
> > >>   1 file changed, 6 insertions(+), 2 deletions(-)
> > >
> > > This is still missing the corresponding documentation patch.
> >
> > Sorry for the confusion.
> 
> Thanks for following up.
> 
> > > 10/12 (no reply, haven't checked whether same issue but I assume
> > > same issue since the new case I mentioned isn't added)
> >
> > It wasn't a bug so I didn't "fix" it.  I just sent an explanation and
> > patch demonstrating why. You can find it here:
> 
> I was concerned about the case of an untracked file inside a directory that
> contains no tracked files.  Your patch in this mail treats dir3 just like dir1 and
> dir2.  I think you ought to test the case of a dir with no tracked files.
> 

In the case where there is an untracked file inside a directory that contains no tracked files, git will (as shown by the "failing" test) actually find the untracked file.  This is the correct/expected behavior.  The test failure is just indicating that the optimization of not searching that directory for untracked files was not able to occur (because there was no entry in the untracked cache for that directory).

> After more careful checking, it looks like this case does work, but it's still
> worth testing.


^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  2017-09-19 19:27     ` [PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
@ 2017-09-20  2:28       ` Junio C Hamano
  2017-09-20 16:19         ` Ben Peart
  2017-09-20  6:23       ` Junio C Hamano
  1 sibling, 1 reply; 137+ messages in thread
From: Junio C Hamano @ 2017-09-20  2:28 UTC (permalink / raw)
  To: Ben Peart
  Cc: David.Turner, avarab, christian.couder, git, johannes.schindelin,
	pclouds, peff

Ben Peart <benpeart@microsoft.com> writes:

> +/* do stat comparison even if CE_FSMONITOR_VALID is true */
> +#define CE_MATCH_IGNORE_FSMONITOR 0X20

Hmm, when should a programmer use this flag?

> +int git_config_get_fsmonitor(void)
> +{
> +	if (git_config_get_pathname("core.fsmonitor", &core_fsmonitor))
> +		core_fsmonitor = getenv("GIT_FSMONITOR_TEST");

Will the environment be part of the published API, or is it a
remnant of a useful tool for debugging while developing the feature?

If it is the former (and I'd say why not, even though "git -c
core.fsmontor=..." may be easy enough), we might want to rename it
to replace _TEST with _PROGRAM or something and document it.

> diff --git a/diff-lib.c b/diff-lib.c
> index 2a52b07954..23c6d03ca9 100644
> --- a/diff-lib.c
> +++ b/diff-lib.c
> @@ -12,6 +12,7 @@
>  #include "refs.h"
>  #include "submodule.h"
>  #include "dir.h"
> +#include "fsmonitor.h"
>  
>  /*
>   * diff-files
> @@ -228,6 +229,7 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
>  
>  		if (!changed && !dirty_submodule) {
>  			ce_mark_uptodate(ce);
> +			mark_fsmonitor_valid(ce);

I notice all calls to mark_fsmonitor_valid(ce) always follow a call
to ce_mark_uptodate(ce).  Should the call to the former be made as
part of the latter instead?  Are there cases where we want to call
ce_mark_uptodate(ce) to mark the ce up-to-date, yet we do not want
to call mark_fsmonitor_valid(ce)?  How does a programmer tell when
s/he wants to call ce_mark_uptodate(ce) if s/he also should call
mark_fsmonitor_valid(ce)?

Together with "when to pass IGNORE_FSMONITOR" question, is there a
summary that guides new programmers to answer these questions in the
new documentation?

> diff --git a/dir.h b/dir.h
> index e3717055d1..fab8fc1561 100644
> --- a/dir.h
> +++ b/dir.h
> @@ -139,6 +139,8 @@ struct untracked_cache {
>  	int gitignore_invalidated;
>  	int dir_invalidated;
>  	int dir_opened;
> +	/* fsmonitor invalidation data */
> +	unsigned int use_fsmonitor : 1;

This makes it look like we will add a bit more fields in later
patches that are about "invalidation" around fsmonitor, but it
appears that such an addition never happens til the end of the
series.  And use_fsmonitor boolean does not seem to be what the
comment says---it just tells us if fsmonitor is in use in the
operation of the untracked cache, no?

> diff --git a/entry.c b/entry.c
> index cb291aa88b..5e6794f9fc 100644
> --- a/entry.c
> +++ b/entry.c
> @@ -4,6 +4,7 @@
>  #include "streaming.h"
>  #include "submodule.h"
>  #include "progress.h"
> +#include "fsmonitor.h"
>  
>  static void create_directories(const char *path, int path_len,
>  			       const struct checkout *state)
> @@ -357,6 +358,7 @@ static int write_entry(struct cache_entry *ce,
>  			lstat(ce->name, &st);
>  		fill_stat_cache_info(ce, &st);
>  		ce->ce_flags |= CE_UPDATE_IN_BASE;
> +		mark_fsmonitor_invalid(state->istate, ce);
>  		state->istate->cache_changed |= CE_ENTRY_CHANGED;

Similar to "how does mark_fsmonitor_valid() relate to
ce_mark_uptodate()?" question earlier, mark_fsmonitor_invalid()
seems to often appear in pairs with the update to cache_changed.
Are there cases where we need CE_ENTRY_CHANGED bit added to the
index state yet we do not want to call mark_fsmonitor_invalid()?  I
am wondering if there should be a single helper function that lets
callers say "I changed this ce in this istate this way.  Please
update CE_VALID, CE_UPDATE_IN_BASE and CE_FSMONITOR_VALID
accordingly".

That "changed _this way_" is deliberately vague in my question
above, because it is not yet clear to me when mark-valid and
mark-invalid should and should not be called from the series.

> +	/* a fsmonitor process can return '*' to indicate all entries are invalid */
> +	if (query_success && query_result.buf[0] != '/') {
> +		/* Mark all entries returned by the monitor as dirty */

The comment talks about '*' and code checks if it is not '/'?  The
else clause of this if/else handles the "invalidate all" case, so
shouldn't we be comparing with '*' instead here?

Ah, fsmonitor-watchman section of the doc tells us to write the hook
in a way to return slash, so the comment above the code is stale and
the comparison with '/' is what we want, I guess.

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v7 03/12] update-index: add a new --force-write-index option
  2017-09-19 19:27     ` [PATCH v7 03/12] update-index: add a new --force-write-index option Ben Peart
@ 2017-09-20  5:47       ` Junio C Hamano
  2017-09-20 14:58         ` Ben Peart
  0 siblings, 1 reply; 137+ messages in thread
From: Junio C Hamano @ 2017-09-20  5:47 UTC (permalink / raw)
  To: Ben Peart
  Cc: David.Turner, avarab, christian.couder, git, johannes.schindelin,
	pclouds, peff

Ben Peart <benpeart@microsoft.com> writes:

> +		OPT_SET_INT(0, "force-write-index", &force_write,
> +			N_("write out the index even if is not flagged as changed"), 1),

Hmph.  The only time this makes difference is when the code forgets
to mark active_cache_changed even when it actually made a change to
the index, no?  I do understand the wish to be able to observe what
_would_ be written if such a bug did not exist in order to debug the
other aspects of the change in this series, but at the same time I
fear that we may end up sweeping the problem under the rug by
running the tests with this option.

>  		OPT_END()
>  	};
>  
> @@ -1147,7 +1150,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
>  		die("BUG: bad untracked_cache value: %d", untracked_cache);
>  	}
>  
> -	if (active_cache_changed) {
> +	if (active_cache_changed || force_write) {
>  		if (newfd < 0) {
>  			if (refresh_args.flags & REFRESH_QUIET)
>  				exit(128);

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  2017-09-19 19:27     ` [PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
  2017-09-20  2:28       ` Junio C Hamano
@ 2017-09-20  6:23       ` Junio C Hamano
  2017-09-20 16:29         ` Ben Peart
  1 sibling, 1 reply; 137+ messages in thread
From: Junio C Hamano @ 2017-09-20  6:23 UTC (permalink / raw)
  To: Ben Peart
  Cc: David.Turner, avarab, christian.couder, git, johannes.schindelin,
	pclouds, peff

Ben Peart <benpeart@microsoft.com> writes:

> @@ -344,6 +346,7 @@ struct index_state {
>  	struct hashmap dir_hash;
>  	unsigned char sha1[20];
>  	struct untracked_cache *untracked;
> +	uint64_t fsmonitor_last_update;

This field being zero has more significance than just "we haven't
got any update yet", right?  The way I am reading the code is that
setting it 0 is a way to signal that fsmon has been inactivated.  It
also made me wonder if add_fsmonitor() that silently returns without
doing anything when this field is already non-zero is a bug (in
other words, I couldn't tell what the right answer would be to a
question "shouldn't the caller be avoiding duplicate calls?").

> diff --git a/fsmonitor.c b/fsmonitor.c
> new file mode 100644
> index 0000000000..b8b2d88fe1
> --- /dev/null
> +++ b/fsmonitor.c
> ...

This part was a pleasant read.

Thanks.

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v7 05/12] fsmonitor: add documentation for the fsmonitor extension.
  2017-09-19 19:27     ` [PATCH v7 05/12] fsmonitor: add documentation for the fsmonitor extension Ben Peart
@ 2017-09-20 10:00       ` Martin Ågren
  2017-09-20 17:02         ` Ben Peart
  0 siblings, 1 reply; 137+ messages in thread
From: Martin Ågren @ 2017-09-20 10:00 UTC (permalink / raw)
  To: Ben Peart
  Cc: David.Turner, Ævar Arnfjörð Bjarmason,
	Christian Couder, Git Mailing List, Junio C Hamano,
	Johannes Schindelin, Nguyễn Thái Ngọc Duy,
	Jeff King

On 19 September 2017 at 21:27, Ben Peart <benpeart@microsoft.com> wrote:
> diff --git a/Documentation/git-update-index.txt b/Documentation/git-update-index.txt
> index e19eba62cd..95231dbfcb 100644
> --- a/Documentation/git-update-index.txt
> +++ b/Documentation/git-update-index.txt
> @@ -16,9 +16,11 @@ SYNOPSIS
>              [--chmod=(+|-)x]
>              [--[no-]assume-unchanged]
>              [--[no-]skip-worktree]
> +            [--[no-]fsmonitor-valid]
>              [--ignore-submodules]
>              [--[no-]split-index]
>              [--[no-|test-|force-]untracked-cache]
> +            [--[no-]fsmonitor]
>              [--really-refresh] [--unresolve] [--again | -g]
>              [--info-only] [--index-info]
>              [-z] [--stdin] [--index-version <n>]
> @@ -111,6 +113,12 @@ you will need to handle the situation manually.
>         set and unset the "skip-worktree" bit for the paths. See
>         section "Skip-worktree bit" below for more information.
>
> +--[no-]fsmonitor-valid::
> +       When one of these flags is specified, the object name recorded
> +       for the paths are not updated. Instead, these options
> +       set and unset the "fsmonitor valid" bit for the paths. See
> +       section "File System Monitor" below for more information.
> +

So --no-foo does not undo --foo, but there are three values: --foo,
--no-foo and <nothing/default>. I find that unintuitive, but maybe it's
just me. Maybe there are other such options in the codebase already. How
about --fsmonitor-valid=set, --fsmonitor-valid=unset, and
--no-fsmonitor-valid (which would be the default, and which would forget
any earlier --fsmonitor-valid=...)?

>  -g::
>  --again::
>         Runs 'git update-index' itself on the paths whose index
> @@ -201,6 +209,15 @@ will remove the intended effect of the option.
>         `--untracked-cache` used to imply `--test-untracked-cache` but
>         this option would enable the extension unconditionally.
>
> +--fsmonitor::
> +--no-fsmonitor::

Maybe "--[no-]fsmonitor" for symmetry with how you've done it above and
later.

> +When used in conjunction with the untracked cache, it can further improve
> +performance by avoiding the cost of scaning the entire working directory
> +looking for new files.

s/scaning/scanning/

> +If you want to enable (or disable) this feature, it is easier to use
> +the `core.fsmonitor` configuration variable (see
> +linkgit:git-config[1]) than using the `--fsmonitor` option to
> +`git update-index` in each repository, especially if you want to do so
> +across all repositories you use, because you can set the configuration
> +variable to `true` (or `false`) in your `$HOME/.gitconfig` just once
> +and have it affect all repositories you touch.

This is a mouthful. Maybe you could split it a little, perhaps like so:

  If you want to enable (or disable) this feature, you will probably
  want to use the `core.fsmonitor` configuration variable (see
  linkgit:git-config[1]). By setting it to `true` (or `false`) in your
  `$HOME/.gitconfig`, it will affect all repositories you touch. For a
  more fine-grained control, you can set it per repository, or use the
  `--fsmonitor` option with `git update-index` in each repository.

The part about $HOME/.gitconfig vs per-repo config is perhaps generic
enough that it doesn't belong here. So it'd only be about config vs.
option. Where to place the config item and what implications that has is
arguably orthogonal to knowing that the option exists and what it does.

Martin

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v1 1/1] test-lint: echo -e (or -E) is not portable
  2017-09-19 20:37         ` Jonathan Nieder
@ 2017-09-20 13:49           ` Torsten Bögershausen
  2017-09-22  1:04             ` Junio C Hamano
  0 siblings, 1 reply; 137+ messages in thread
From: Torsten Bögershausen @ 2017-09-20 13:49 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, benpeart, Junio C Hamano

On Tue, Sep 19, 2017 at 01:37:14PM -0700, Jonathan Nieder wrote:
> Torsten Bögershausen <tboegi@web.de> wrote:
> 
> > Some implementations of `echo` support the '-e' option to enable
> > backslash interpretation of the following string.
> > As an addition, they support '-E' to turn it off.
> 
> nit: please wrap the commit message to a consistent line width.
> 
> > However, none of these are portable, POSIX doesn't even mention them,
> > and many implementations don't support them.
> >
> > A check for '-n' is already done in check-non-portable-shell.pl,
> > extend it to cover '-n', '-e' or '-E-'
> >
> > Signed-off-by: Torsten Bögershausen <tboegi@web.de>
> > ---
> >  t/check-non-portable-shell.pl | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> An excellent change.  Thanks for noticing and fixing this.
> 
> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>

Thanks for the review.
Junio, if you wouldn't mind to squash that in, 
another fix is needed as well(trailing '-' after '-E') :

s/'-n', '-e' or '-E-'/'-n', '-e' or '-E'
                   ^
                   


^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v7 03/12] update-index: add a new --force-write-index option
  2017-09-20  5:47       ` Junio C Hamano
@ 2017-09-20 14:58         ` Ben Peart
  2017-09-21  1:46           ` Junio C Hamano
  0 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-09-20 14:58 UTC (permalink / raw)
  To: Junio C Hamano, Ben Peart
  Cc: David.Turner, avarab, christian.couder, git, johannes.schindelin,
	pclouds, peff



On 9/20/2017 1:47 AM, Junio C Hamano wrote:
> Ben Peart <benpeart@microsoft.com> writes:
> 
>> +		OPT_SET_INT(0, "force-write-index", &force_write,
>> +			N_("write out the index even if is not flagged as changed"), 1),
> 
> Hmph.  The only time this makes difference is when the code forgets
> to mark active_cache_changed even when it actually made a change to
> the index, no?  I do understand the wish to be able to observe what
> _would_ be written if such a bug did not exist in order to debug the
> other aspects of the change in this series, but at the same time I
> fear that we may end up sweeping the problem under the rug by
> running the tests with this option.
> 

This is to enable a performance optimization I discovered while perf 
testing the patch series.  It enables us to do a lazy index write for 
fsmonitor detected changes but still always generate correct results.

Lets see how my ascii art skills do at describing this:

1) Index marked dirty on every fsmonitor change:
A---x---B---y---C

2) Index *not* marked dirty on fsmonitor changes:
A---x---B---x,y---C

Assume the index is written and up-to-date at point A.

In scenario #1 above, the index is marked fsmonitor dirty every time the 
fsmonitor detects a file that has been modified.  At point B, the 
fsmonitor integration script returns that file 'x' has been modified 
since A, the index is marked dirty and then written to disk with a 
last_update time of B.  At point C, the script returns 'y' as the 
changes since point B, the index is marked dirty and written to disk again.

In scenario #2, the index is *not* marked fsmonitor dirty when changed 
are detected.  At point B, the script returns 'x' but the index is not 
flagged dirty nor written to disk.  At point C, the script will return 
'x' and 'y' (since both have been changed since time 'A') and again the 
index is not marked dirty nor written to disk.

Correct results are generated in both scenarios but in scenario 2, there 
were 2 fewer index writes.  In short, the changed files were accumulated 
as the cost of processing 2 files at point C (vs 1) has no measurable 
difference in perf but the savings of two unnecessary index writes is 
significant (especially when the index gets large).

There is no real concern about accumulating too many changes as 1) the 
processing cost for additional modified files is fairly trivial and 2) 
the index ends up getting written out pretty frequently anyway as files 
are added/removed/staged/etc which updates the fsmonitor_last_update time.

The challenge came when it was time to test that the changes to the 
index were correct.  Since they are lazily written by default, I needed 
a way to force the write so that I could verify the index on disk was 
correct.  Hence, this patch.


>>   		OPT_END()
>>   	};
>>   
>> @@ -1147,7 +1150,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
>>   		die("BUG: bad untracked_cache value: %d", untracked_cache);
>>   	}
>>   
>> -	if (active_cache_changed) {
>> +	if (active_cache_changed || force_write) {
>>   		if (newfd < 0) {
>>   			if (refresh_args.flags & REFRESH_QUIET)
>>   				exit(128);

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  2017-09-20  2:28       ` Junio C Hamano
@ 2017-09-20 16:19         ` Ben Peart
  2017-09-21  2:00           ` Junio C Hamano
  0 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-09-20 16:19 UTC (permalink / raw)
  To: Junio C Hamano, Ben Peart
  Cc: David.Turner, avarab, christian.couder, git, johannes.schindelin,
	pclouds, peff



On 9/19/2017 10:28 PM, Junio C Hamano wrote:
> Ben Peart <benpeart@microsoft.com> writes:
> 
>> +/* do stat comparison even if CE_FSMONITOR_VALID is true */
>> +#define CE_MATCH_IGNORE_FSMONITOR 0X20
> 
> Hmm, when should a programmer use this flag?
> 

Pretty much the same places you would also use CE_MATCH_IGNORE_VALID and 
CE_MATCH_IGNORE_SKIP_WORKTREE which serve the same role for those 
features.  That is generally when you are about to overwrite data so 
want to be *really* sure you have what you think you have.

The other place I used it was in preload_index(). In that case, I didn't 
want to trigger the call to refresh_fsmonitor() as preload_index() is 
about trying to do a fast precompute of state for the bulk of the index 
entries but is not required for correctness as refresh_cache_ent() will 
ensure any "missed" by preload_index() are up-to-date if/when that is 
needed.

>> +int git_config_get_fsmonitor(void)
>> +{
>> +	if (git_config_get_pathname("core.fsmonitor", &core_fsmonitor))
>> +		core_fsmonitor = getenv("GIT_FSMONITOR_TEST");
> 
> Will the environment be part of the published API, or is it a
> remnant of a useful tool for debugging while developing the feature?
> 
> If it is the former (and I'd say why not, even though "git -c
> core.fsmontor=..." may be easy enough), we might want to rename it
> to replace _TEST with _PROGRAM or something and document it.
> 

This was added this to facilitate testing.  That is why it has the magic 
naming of "GIT_***_TEST" which is the only way I found to ensure that 
env variable gets passed to tests.  Its use is discussed in patch 10 
which contains the tests.

>> diff --git a/diff-lib.c b/diff-lib.c
>> index 2a52b07954..23c6d03ca9 100644
>> --- a/diff-lib.c
>> +++ b/diff-lib.c
>> @@ -12,6 +12,7 @@
>>   #include "refs.h"
>>   #include "submodule.h"
>>   #include "dir.h"
>> +#include "fsmonitor.h"
>>   
>>   /*
>>    * diff-files
>> @@ -228,6 +229,7 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
>>   
>>   		if (!changed && !dirty_submodule) {
>>   			ce_mark_uptodate(ce);
>> +			mark_fsmonitor_valid(ce);
> 
> I notice all calls to mark_fsmonitor_valid(ce) always follow a call
> to ce_mark_uptodate(ce).  Should the call to the former be made as
> part of the latter instead?  Are there cases where we want to call
> ce_mark_uptodate(ce) to mark the ce up-to-date, yet we do not want
> to call mark_fsmonitor_valid(ce)?  How does a programmer tell when
> s/he wants to call ce_mark_uptodate(ce) if s/he also should call
> mark_fsmonitor_valid(ce)?

mark_fsmonitor_valid(ce) is the way to indicate that cache entries that 
were once fsmonitor dirty are now properly reflected in the index so can 
come off the "dirty" list.  It can't really be combined with 
ce_mark_uptodate(ce) as that would prevent the CE_MATCH_IGNORE_FSMONITOR 
logic:

	if (!ignore_skip_worktree && ce_skip_worktree(ce)) {
		ce_mark_uptodate(ce);
		return ce;
	}
	if (!ignore_valid && (ce->ce_flags & CE_VALID)) {
		ce_mark_uptodate(ce);
		return ce;
	}
	if (!ignore_fsmonitor && (ce->ce_flags & CE_FSMONITOR_VALID)) {
		ce_mark_uptodate(ce);
		return ce;
	}

In addition, fsmonitor is an optional feature and so the 
mark_fsmonitor_valid(ce) call should only happen when the feature is 
turned on. I tried to keep it as simple as possible by making that test 
and set logic part of the function but still be performant by making the 
function inline.

> 
> Together with "when to pass IGNORE_FSMONITOR" question, is there a
> summary that guides new programmers to answer these questions in the
> new documentation?
> 

Only the discussion in this mail thread.  I could add something to the 
function header in fsmonitor.h if that would help.  How about something 
like:

diff --git a/fsmonitor.h b/fsmonitor.h
index c2240b811a..03bf3efe61 100644
--- a/fsmonitor.h
+++ b/fsmonitor.h
@@ -34,9 +34,11 @@ extern void tweak_fsmonitor(struct index_state *istate);
   */
  extern void refresh_fsmonitor(struct index_state *istate);

-/*
- * Set the given cache entries CE_FSMONITOR_VALID bit.
- */
+/*
+ * Set the given cache entries CE_FSMONITOR_VALID bit. This should be
+ * called any time the cache entry has been updated to reflect the
+ * current state of the file on disk.
+ */
  static inline void mark_fsmonitor_valid(struct cache_entry *ce)
  {
         if (core_fsmonitor) {
@@ -46,8 +48,11 @@ static inline void mark_fsmonitor_valid(struct 
cache_entry *ce)
  }

  /*
- * Clear the given cache entry's CE_FSMONITOR_VALID bit and invalidate any
- * corresponding untracked cache directory structures.
+ * Clear the given cache entry's CE_FSMONITOR_VALID bit and invalidate
+ * any corresponding untracked cache directory structures. This should
+ * be called any time git creates or modifies a file that should
+ * trigger an lstat() or invalidate the untracked cache for the
+ * corresponding directory
   */
  static inline void mark_fsmonitor_invalid(struct index_state *istate, 
struct cache_entry *ce)
  {


>> diff --git a/dir.h b/dir.h
>> index e3717055d1..fab8fc1561 100644
>> --- a/dir.h
>> +++ b/dir.h
>> @@ -139,6 +139,8 @@ struct untracked_cache {
>>   	int gitignore_invalidated;
>>   	int dir_invalidated;
>>   	int dir_opened;
>> +	/* fsmonitor invalidation data */
>> +	unsigned int use_fsmonitor : 1;
> 
> This makes it look like we will add a bit more fields in later
> patches that are about "invalidation" around fsmonitor, but it
> appears that such an addition never happens til the end of the
> series.  And use_fsmonitor boolean does not seem to be what the
> comment says---it just tells us if fsmonitor is in use in the
> operation of the untracked cache, no?

I don't have any more planned bit fields.  I only needed a single bit so 
used a bit field just in case someone else comes by later and needs 
another bit.  If you aren't worried about that, we can just make this an 
int.

Correct.  The bit just indicates whether fsmonitor has been used to 
ensure the cache is current or if it needs to be checked.  I have 
comments to this effect where the flag is used in the code but could 
move/duplicate them into the header if wished. For example:

/* With fsmonitor, we can trust the untracked cache's valid field. */

and

/* Now mark the untracked cache for fsmonitor usage */

> 
>> diff --git a/entry.c b/entry.c
>> index cb291aa88b..5e6794f9fc 100644
>> --- a/entry.c
>> +++ b/entry.c
>> @@ -4,6 +4,7 @@
>>   #include "streaming.h"
>>   #include "submodule.h"
>>   #include "progress.h"
>> +#include "fsmonitor.h"
>>   
>>   static void create_directories(const char *path, int path_len,
>>   			       const struct checkout *state)
>> @@ -357,6 +358,7 @@ static int write_entry(struct cache_entry *ce,
>>   			lstat(ce->name, &st);
>>   		fill_stat_cache_info(ce, &st);
>>   		ce->ce_flags |= CE_UPDATE_IN_BASE;
>> +		mark_fsmonitor_invalid(state->istate, ce);
>>   		state->istate->cache_changed |= CE_ENTRY_CHANGED;
> 
> Similar to "how does mark_fsmonitor_valid() relate to
> ce_mark_uptodate()?" question earlier, mark_fsmonitor_invalid()
> seems to often appear in pairs with the update to cache_changed.
> Are there cases where we need CE_ENTRY_CHANGED bit added to the
> index state yet we do not want to call mark_fsmonitor_invalid()?  I
> am wondering if there should be a single helper function that lets
> callers say "I changed this ce in this istate this way.  Please
> update CE_VALID, CE_UPDATE_IN_BASE and CE_FSMONITOR_VALID
> accordingly".
> 
> That "changed _this way_" is deliberately vague in my question
> above, because it is not yet clear to me when mark-valid and
> mark-invalid should and should not be called from the series.
> 

Please let me know if my comment/patch above does not address this 
concern sufficiently.

>> +	/* a fsmonitor process can return '*' to indicate all entries are invalid */
>> +	if (query_success && query_result.buf[0] != '/') {
>> +		/* Mark all entries returned by the monitor as dirty */
> 
> The comment talks about '*' and code checks if it is not '/'?  The
> else clause of this if/else handles the "invalidate all" case, so
> shouldn't we be comparing with '*' instead here?
> 
> Ah, fsmonitor-watchman section of the doc tells us to write the hook
> in a way to return slash, so the comment above the code is stale and
> the comparison with '/' is what we want, I guess.
> 

Correct.  Sorry about missing that.  Here is a patch that can be 
squashed in.

diff --git a/fsmonitor.c b/fsmonitor.c
index b8b2d88fe1..7c1540c054 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -176,7 +176,7 @@ void refresh_fsmonitor(struct index_state *istate)
                         core_fsmonitor, query_success ? "success" : 
"failure");
         }

-       /* a fsmonitor process can return '*' to indicate all entries 
are invalid */
+       /* a fsmonitor process can return '/' to indicate all entries 
are invalid */
         if (query_success && query_result.buf[0] != '/') {
                 /* Mark all entries returned by the monitor as dirty */
                 buf = query_result.buf;


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* Re: [PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  2017-09-20  6:23       ` Junio C Hamano
@ 2017-09-20 16:29         ` Ben Peart
  0 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-20 16:29 UTC (permalink / raw)
  To: Junio C Hamano, Ben Peart
  Cc: David.Turner, avarab, christian.couder, git, johannes.schindelin,
	pclouds, peff



On 9/20/2017 2:23 AM, Junio C Hamano wrote:
> Ben Peart <benpeart@microsoft.com> writes:
> 
>> @@ -344,6 +346,7 @@ struct index_state {
>>   	struct hashmap dir_hash;
>>   	unsigned char sha1[20];
>>   	struct untracked_cache *untracked;
>> +	uint64_t fsmonitor_last_update;
> 
> This field being zero has more significance than just "we haven't
> got any update yet", right?  The way I am reading the code is that
> setting it 0 is a way to signal that fsmon has been inactivated.  It
> also made me wonder if add_fsmonitor() that silently returns without
> doing anything when this field is already non-zero is a bug (in
> other words, I couldn't tell what the right answer would be to a
> question "shouldn't the caller be avoiding duplicate calls?").
> 

Correct again.  For better (and sometimes for worse) I followed the 
pattern set by the untracked cache.  If you compare them, you will 
notice striking similarities. :)

>> diff --git a/fsmonitor.c b/fsmonitor.c
>> new file mode 100644
>> index 0000000000..b8b2d88fe1
>> --- /dev/null
>> +++ b/fsmonitor.c
>> ...
> 
> This part was a pleasant read.
> > Thanks.
> 

Thank you for your careful review.  I appreciate having another set of 
eyes taking a close look especially as I see this as a big first step 
towards making many git operations O(# changed files) instead of O(# 
size of working directory). Seeing status times drop from 1m22s to 1.45s 
is a huge perf win - but only if it is correct!

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v7 05/12] fsmonitor: add documentation for the fsmonitor extension.
  2017-09-20 10:00       ` Martin Ågren
@ 2017-09-20 17:02         ` Ben Peart
  2017-09-20 17:11           ` Martin Ågren
  0 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-09-20 17:02 UTC (permalink / raw)
  To: Martin Ågren, Ben Peart
  Cc: David.Turner, Ævar Arnfjörð Bjarmason,
	Christian Couder, Git Mailing List, Junio C Hamano,
	Johannes Schindelin, Nguyễn Thái Ngọc Duy,
	Jeff King

Thanks for the review.  I'm not an English major so appreciate the 
feedback on my attempts to document the feature.

On 9/20/2017 6:00 AM, Martin Ågren wrote:
> On 19 September 2017 at 21:27, Ben Peart <benpeart@microsoft.com> wrote:
>> diff --git a/Documentation/git-update-index.txt b/Documentation/git-update-index.txt
>> index e19eba62cd..95231dbfcb 100644
>> --- a/Documentation/git-update-index.txt
>> +++ b/Documentation/git-update-index.txt
>> @@ -16,9 +16,11 @@ SYNOPSIS
>>               [--chmod=(+|-)x]
>>               [--[no-]assume-unchanged]
>>               [--[no-]skip-worktree]
>> +            [--[no-]fsmonitor-valid]
>>               [--ignore-submodules]
>>               [--[no-]split-index]
>>               [--[no-|test-|force-]untracked-cache]
>> +            [--[no-]fsmonitor]
>>               [--really-refresh] [--unresolve] [--again | -g]
>>               [--info-only] [--index-info]
>>               [-z] [--stdin] [--index-version <n>]
>> @@ -111,6 +113,12 @@ you will need to handle the situation manually.
>>          set and unset the "skip-worktree" bit for the paths. See
>>          section "Skip-worktree bit" below for more information.
>>
>> +--[no-]fsmonitor-valid::
>> +       When one of these flags is specified, the object name recorded
>> +       for the paths are not updated. Instead, these options
>> +       set and unset the "fsmonitor valid" bit for the paths. See
>> +       section "File System Monitor" below for more information.
>> +
> 
> So --no-foo does not undo --foo, but there are three values: --foo,
> --no-foo and <nothing/default>. I find that unintuitive, but maybe it's
> just me. Maybe there are other such options in the codebase already. 

I understand the unintuitive comment but the other such options in the 
code base are just above the fsmonitor options as it is modeled on how 
'assume-unchanged' and 'skip-worktree' work.  Consistency is certainly 
helps the intuitiveness as once you have learned the model, it applies 
in other places.

How
> about --fsmonitor-valid=set, --fsmonitor-valid=unset, and
> --no-fsmonitor-valid (which would be the default, and which would forget
> any earlier --fsmonitor-valid=...)?
> 
>>   -g::
>>   --again::
>>          Runs 'git update-index' itself on the paths whose index
>> @@ -201,6 +209,15 @@ will remove the intended effect of the option.
>>          `--untracked-cache` used to imply `--test-untracked-cache` but
>>          this option would enable the extension unconditionally.
>>
>> +--fsmonitor::
>> +--no-fsmonitor::
> 
> Maybe "--[no-]fsmonitor" for symmetry with how you've done it above and
> later.
> 

For better and for worse, I choose to be consistent with how the options 
work (especially the untracked-cache option immediately above).  This is 
one weakness of reviewing patches via email - you don't see the patch in 
context with everything around it.

>> +When used in conjunction with the untracked cache, it can further improve
>> +performance by avoiding the cost of scaning the entire working directory
>> +looking for new files.
> 
> s/scaning/scanning/
> 

Thanks!

diff --git a/Documentation/git-update-index.txt 
b/Documentation/git-update-index.txt
index 95231dbfcb..7c2f880a22 100644
--- a/Documentation/git-update-index.txt
+++ b/Documentation/git-update-index.txt
@@ -476,7 +476,7 @@ inform it as to what files have been modified. This 
enables git to avoid
  having to lstat() every file to find modified files.

  When used in conjunction with the untracked cache, it can further improve
-performance by avoiding the cost of scaning the entire working directory
+performance by avoiding the cost of scanning the entire working directory
  looking for new files.

  If you want to enable (or disable) this feature, it is easier to use


>> +If you want to enable (or disable) this feature, it is easier to use
>> +the `core.fsmonitor` configuration variable (see
>> +linkgit:git-config[1]) than using the `--fsmonitor` option to
>> +`git update-index` in each repository, especially if you want to do so
>> +across all repositories you use, because you can set the configuration
>> +variable to `true` (or `false`) in your `$HOME/.gitconfig` just once
>> +and have it affect all repositories you touch.
> 
> This is a mouthful. Maybe you could split it a little, perhaps like so:
> 
>    If you want to enable (or disable) this feature, you will probably
>    want to use the `core.fsmonitor` configuration variable (see
>    linkgit:git-config[1]). By setting it to `true` (or `false`) in your
>    `$HOME/.gitconfig`, it will affect all repositories you touch. For a
>    more fine-grained control, you can set it per repository, or use the
>    `--fsmonitor` option with `git update-index` in each repository.
> 

I'm going to sound like a broken record here. :) The description favored 
consistency with the untracked cache feature immediate above this entry. 
  It is literally a copy/paste/edit.

This is based on the assumption that the text had already been reviewed 
and found to be acceptable.  This also means if you have figured it out 
for one option, when you read the next, you're understanding can carry 
forward speeding up your comprehension.

> The part about $HOME/.gitconfig vs per-repo config is perhaps generic
> enough that it doesn't belong here. So it'd only be about config vs.
> option. Where to place the config item and what implications that has is
> arguably orthogonal to knowing that the option exists and what it does.
> 
> Martin
> 

^ permalink raw reply related	[flat|nested] 137+ messages in thread

* Re: [PATCH v6 09/12] split-index: disable the fsmonitor extension when running the split index test
  2017-09-19 20:43     ` Jonathan Nieder
@ 2017-09-20 17:11       ` Ben Peart
  2017-09-20 17:46         ` Jonathan Nieder
  0 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-09-20 17:11 UTC (permalink / raw)
  To: Jonathan Nieder, Ben Peart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff



On 9/19/2017 4:43 PM, Jonathan Nieder wrote:
> Hi,
> 
> Ben Peart wrote:
> 
>> The split index test t1700-split-index.sh has hard coded SHA values for
>> the index.  Currently it supports index V4 and V3 but assumes there are
>> no index extensions loaded.
>>
>> When manually forcing the fsmonitor extension to be turned on when
>> running the test suite, the SHA values no longer match which causes the
>> test to fail.
>>
>> The potential matrix of index extensions and index versions can is quite
>> large so instead disable the extension before attempting to run the test.
> 
> Thanks for finding and diagnosing this problem.
> 
> This feels to me like the wrong fix.  Wouldn't it be better for the
> test not to depend on the precise object ids?  See the "Tips for
> Writing Tests" section in t/README:
> 

I completely agree that a better fix would be to rewrite the test to not 
hard code the SHA values.  I'm sure this will come to bite us again as 
we discuss the migration to a different SHA algorithm.

That said, I think fixing this correctly is outside the scope of this 
patch series.  It has been written this way since it was created back in 
2014 (and patched in 2015 to hard code the V4 index SHA).

If desired, this patch can simply be dropped from the series entirely as 
I doubt anyone other than me will attempt to run it with the fsmonitor 
extension turned on.

> 	                                                         And
> 	such drastic changes to the core GIT that even changes these
> 	otherwise supposedly stable object IDs should be accompanied by
> 	an update to t0000-basic.sh.
> 
> 	However, other tests that simply rely on basic parts of the core
> 	GIT working properly should not have that level of intimate
> 	knowledge of the core GIT internals.  If all the test scripts
> 	hardcoded the object IDs like t0000-basic.sh does, that defeats
> 	the purpose of t0000-basic.sh, which is to isolate that level of
> 	validation in one place.  Your test also ends up needing
> 	updating when such a change to the internal happens, so do _not_
> 	do it and leave the low level of validation to t0000-basic.sh.
> 
> Worse, t1700-split-index.sh doesn't explain where the object ids it
> uses comes from so it is not even obvious to a casual reader like me
> how to fix it.
> 
> See t/diff-lib.sh for some examples of one way to avoid depending on
> the object id computation.  Another way that is often preferable is to
> come up with commands to compute the expected hash values, like
> $(git rev-parse HEAD^{tree}), and use those instead of hard-coded
> values.
> 
> Thanks and hope that helps,
> Jonathan
> 

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v7 05/12] fsmonitor: add documentation for the fsmonitor extension.
  2017-09-20 17:02         ` Ben Peart
@ 2017-09-20 17:11           ` Martin Ågren
  0 siblings, 0 replies; 137+ messages in thread
From: Martin Ågren @ 2017-09-20 17:11 UTC (permalink / raw)
  To: Ben Peart
  Cc: Ben Peart, David.Turner, Ævar Arnfjörð Bjarmason,
	Christian Couder, Git Mailing List, Junio C Hamano,
	Johannes Schindelin, Nguyễn Thái Ngọc Duy,
	Jeff King

On 20 September 2017 at 19:02, Ben Peart <peartben@gmail.com> wrote:
>>> +--[no-]fsmonitor-valid::
>>> +       When one of these flags is specified, the object name recorded
>>> +       for the paths are not updated. Instead, these options
>>> +       set and unset the "fsmonitor valid" bit for the paths. See
>>> +       section "File System Monitor" below for more information.
>>> +
>>
>>
>> So --no-foo does not undo --foo, but there are three values: --foo,
>> --no-foo and <nothing/default>. I find that unintuitive, but maybe it's
>> just me. Maybe there are other such options in the codebase already.
>
>
> I understand the unintuitive comment but the other such options in the code
> base are just above the fsmonitor options as it is modeled on how
> 'assume-unchanged' and 'skip-worktree' work.  Consistency is certainly helps
> the intuitiveness as once you have learned the model, it applies in other
> places.
>
[...]
>
> For better and for worse, I choose to be consistent with how the options
> work (especially the untracked-cache option immediately above).  This is one
> weakness of reviewing patches via email - you don't see the patch in context
> with everything around it.
>
[...]
>
> I'm going to sound like a broken record here. :) The description favored
> consistency with the untracked cache feature immediate above this entry.  It
> is literally a copy/paste/edit.

Oh. Well, that's what I get for "reviewing" by e-mail. You are indeed
following the current style very well! Sorry for the noise.

Martin

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v6 09/12] split-index: disable the fsmonitor extension when running the split index test
  2017-09-20 17:11       ` Ben Peart
@ 2017-09-20 17:46         ` Jonathan Nieder
  2017-09-21  0:05           ` Ben Peart
  0 siblings, 1 reply; 137+ messages in thread
From: Jonathan Nieder @ 2017-09-20 17:46 UTC (permalink / raw)
  To: Ben Peart
  Cc: Ben Peart, David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

Hi,

Ben Peart wrote:
> On 9/19/2017 4:43 PM, Jonathan Nieder wrote:

>> This feels to me like the wrong fix.  Wouldn't it be better for the
>> test not to depend on the precise object ids?  See the "Tips for
>> Writing Tests" section in t/README:
>
> I completely agree that a better fix would be to rewrite the test to
> not hard code the SHA values.  I'm sure this will come to bite us
> again as we discuss the migration to a different SHA algorithm.

nit: the kind of change I'm proposing does not entail a full rewrite. :)

The SHA migration aspect is true, but that's actually the least of my
worries.  I intend to introduce a SHA1 test prereq that crazy tests
which want to depend on the hash function can declare a dependency on.

My actual worry is that tests hard-coding object ids are (1) hard to
understand, as illustrated by my having no clue what these particular
object ids refer to and (2) very brittle, since an object id changes
whenever a timestamp or any of the history leading to an object
changes.  They create a trap for anyone wanting to change the test
later.  They are basically change detector tests, which is generally
accepted to be a bad practice.

> That said, I think fixing this correctly is outside the scope of
> this patch series.  It has been written this way since it was
> created back in 2014 (and patched in 2015 to hard code the V4 index
> SHA).

Fair enough.

> If desired, this patch can simply be dropped from the series
> entirely as I doubt anyone other than me will attempt to run it with
> the fsmonitor extension turned on.

*shrug*

My motivations in the context of the review were:

 * now that we noticed the problem, we have an opportunity to fix it!
   (i.e. a fix would not have to be part of this series and would not
   necessarily have to be written by you)

 * if we include this non-fix, the commit message really needs to say
   something about it.  Otherwise people are likely to cargo-cult it
   in other contexts and make the problem worse.

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v7 02/12] preload-index: add override to enable testing preload-index
  2017-09-19 19:27     ` [PATCH v7 02/12] preload-index: add override to enable testing preload-index Ben Peart
@ 2017-09-20 22:06       ` Stefan Beller
  2017-09-21  0:02         ` Ben Peart
  0 siblings, 1 reply; 137+ messages in thread
From: Stefan Beller @ 2017-09-20 22:06 UTC (permalink / raw)
  To: Ben Peart
  Cc: David Turner, Ævar Arnfjörð Bjarmason,
	Christian Couder, git@vger.kernel.org, Junio C Hamano,
	Johannes Schindelin, Duy Nguyen, Jeff King

On Tue, Sep 19, 2017 at 12:27 PM, Ben Peart <benpeart@microsoft.com> wrote:
> Preload index doesn't run unless it has a minimum number of 1000 files.
> To enable running tests with fewer files, add an environment variable
> (GIT_FORCE_PRELOAD_TEST) which will override that minimum and set it to 2.

'it' being the number of threads ('it' was not mentioned before,
so reading the commit message confused me initially)

>
> Signed-off-by: Ben Peart <benpeart@microsoft.com>
> ---
>  preload-index.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/preload-index.c b/preload-index.c
> index 70a4c80878..75564c497a 100644
> --- a/preload-index.c
> +++ b/preload-index.c
> @@ -79,6 +79,8 @@ static void preload_index(struct index_state *index,
>                 return;
>
>         threads = index->cache_nr / THREAD_COST;
> +       if ((index->cache_nr > 1) && (threads < 2) && getenv("GIT_FORCE_PRELOAD_TEST"))
> +               threads = 2;

Adding these lines is just a bandaid to trick

>         if (threads < 2)
>                 return;

to not return early as the commit message does not discuss why
we set it to 2.

Do we need threads at all for these tests, or would a patch like

-    if (threads < 2)
+    if (threads < 2 && !GIT_FORCE_PRELOAD_TEST)
         return;

work as well?

That way tests can use any number of threads, though
they currently have no way of overriding the heuristic, yet.

With this alternative patch, it sounds to me as if the
issues kept more orthogonal. (specifically "Do we use preload?"
and "How many threads?". One could imagine that we later
want to introduce GIT_PRELOAD_THREADS for $reasons
and that would go over well in combination with
GIT_FORCE_PRELOAD_TEST)

It seems to be only an internal test variable, such that we do
not need documentation. (Is that worth mentioning in the
commit message?)

The test to make use of this new variable is found
in another patch I presume?

Stefan

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v7 02/12] preload-index: add override to enable testing preload-index
  2017-09-20 22:06       ` Stefan Beller
@ 2017-09-21  0:02         ` Ben Peart
  2017-09-21  0:44           ` Stefan Beller
  0 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-09-21  0:02 UTC (permalink / raw)
  To: Stefan Beller, Ben Peart
  Cc: David Turner, Ævar Arnfjörð Bjarmason,
	Christian Couder, git@vger.kernel.org, Junio C Hamano,
	Johannes Schindelin, Duy Nguyen, Jeff King



On 9/20/2017 6:06 PM, Stefan Beller wrote:
> On Tue, Sep 19, 2017 at 12:27 PM, Ben Peart <benpeart@microsoft.com> wrote:
>> Preload index doesn't run unless it has a minimum number of 1000 files.
>> To enable running tests with fewer files, add an environment variable
>> (GIT_FORCE_PRELOAD_TEST) which will override that minimum and set it to 2.
> 
> 'it' being the number of threads ('it' was not mentioned before,
> so reading the commit message confused me initially)
> 
>>
>> Signed-off-by: Ben Peart <benpeart@microsoft.com>
>> ---
>>   preload-index.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/preload-index.c b/preload-index.c
>> index 70a4c80878..75564c497a 100644
>> --- a/preload-index.c
>> +++ b/preload-index.c
>> @@ -79,6 +79,8 @@ static void preload_index(struct index_state *index,
>>                  return;
>>
>>          threads = index->cache_nr / THREAD_COST;
>> +       if ((index->cache_nr > 1) && (threads < 2) && getenv("GIT_FORCE_PRELOAD_TEST"))
>> +               threads = 2;
> 
> Adding these lines is just a bandaid to trick
> 
>>          if (threads < 2)
>>                  return;
> 
> to not return early as the commit message does not discuss why
> we set it to 2.
> 

To execute the preload code path, we need a minimum of 2 cache entries 
and 2 threads so that each thread actually has work to do.  Otherwise 
the logic below that divides the work up would need to be updated as 
well.  The additional complexity didn't seem worth it just to enable the 
code path to execute with a single thread on a single cache entry.

> Do we need threads at all for these tests, or would a patch like
> 
> -    if (threads < 2)
> +    if (threads < 2 && !GIT_FORCE_PRELOAD_TEST)
>           return;
> 
> work as well?

That would require a larger patch that would update the work division 
and thread creation logic for little to no gain.

> 
> That way tests can use any number of threads, though
> they currently have no way of overriding the heuristic, yet.
> 
> With this alternative patch, it sounds to me as if the
> issues kept more orthogonal. (specifically "Do we use preload?"
> and "How many threads?". One could imagine that we later
> want to introduce GIT_PRELOAD_THREADS for $reasons
> and that would go over well in combination with
> GIT_FORCE_PRELOAD_TEST)
> 

I'm not sure why someone would want to test varying numbers of threads 
as that isn't a case that can ever actually happen with the existing 
code/feature.

I was just enabling testing of the code path with fewer than 1000 files 
as when I made my changes, I discovered that 1) there were no test cases 
for the core.preloadindex feature at all and 2) there was no way to run 
the existing tests with core.preloadindex as they don't have the minimum 
number of files.  This patch allows you to run the existing test suite 
with preload_index turned on for any test case that has 2 or more files 
(it passes by the way :)).

> It seems to be only an internal test variable, such that we do
> not need documentation. (Is that worth mentioning in the
> commit message?)

Correct.  That is the reason I used the magic GIT_***_TEST naming 
pattern as that is the only way to ensure the environment variable is 
preserved when running the tests.

> 
> The test to make use of this new variable is found
> in another patch I presume?
> 

There are no new tests that take advantage of this new environment 
variable.  Instead you can run the entire git test suite with it by running:

GIT_FORCE_PRELOAD_TEXT=1 prove -j12 --state=all ./t[0-9]*.sh

I can add that to the commit message to make it more obvious if desired.

> Stefan
> 

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v6 09/12] split-index: disable the fsmonitor extension when running the split index test
  2017-09-20 17:46         ` Jonathan Nieder
@ 2017-09-21  0:05           ` Ben Peart
  0 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-21  0:05 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Ben Peart, David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff



On 9/20/2017 1:46 PM, Jonathan Nieder wrote:
> Hi,
> 
> Ben Peart wrote:
>> On 9/19/2017 4:43 PM, Jonathan Nieder wrote:
> 
>>> This feels to me like the wrong fix.  Wouldn't it be better for the
>>> test not to depend on the precise object ids?  See the "Tips for
>>> Writing Tests" section in t/README:
>>
>> I completely agree that a better fix would be to rewrite the test to
>> not hard code the SHA values.  I'm sure this will come to bite us
>> again as we discuss the migration to a different SHA algorithm.
> 
> nit: the kind of change I'm proposing does not entail a full rewrite. :)
> 
> The SHA migration aspect is true, but that's actually the least of my
> worries.  I intend to introduce a SHA1 test prereq that crazy tests
> which want to depend on the hash function can declare a dependency on.
> 
> My actual worry is that tests hard-coding object ids are (1) hard to
> understand, as illustrated by my having no clue what these particular
> object ids refer to and (2) very brittle, since an object id changes
> whenever a timestamp or any of the history leading to an object
> changes.  They create a trap for anyone wanting to change the test
> later.  They are basically change detector tests, which is generally
> accepted to be a bad practice.
> 
>> That said, I think fixing this correctly is outside the scope of
>> this patch series.  It has been written this way since it was
>> created back in 2014 (and patched in 2015 to hard code the V4 index
>> SHA).
> 
> Fair enough.
> 
>> If desired, this patch can simply be dropped from the series
>> entirely as I doubt anyone other than me will attempt to run it with
>> the fsmonitor extension turned on.
> 
> *shrug*
> 
> My motivations in the context of the review were:
> 
>   * now that we noticed the problem, we have an opportunity to fix it!
>     (i.e. a fix would not have to be part of this series and would not
>     necessarily have to be written by you)
> 
>   * if we include this non-fix, the commit message really needs to say
>     something about it.  Otherwise people are likely to cargo-cult it
>     in other contexts and make the problem worse.
> 

I'll update the commit message to indicate this is a temporary 
workaround until the underlying hard coded SHA issue is fixed.

> Thanks,
> Jonathan
> 

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v7 02/12] preload-index: add override to enable testing preload-index
  2017-09-21  0:02         ` Ben Peart
@ 2017-09-21  0:44           ` Stefan Beller
  0 siblings, 0 replies; 137+ messages in thread
From: Stefan Beller @ 2017-09-21  0:44 UTC (permalink / raw)
  To: Ben Peart
  Cc: Ben Peart, David Turner, Ævar Arnfjörð Bjarmason,
	Christian Couder, git@vger.kernel.org, Junio C Hamano,
	Johannes Schindelin, Duy Nguyen, Jeff King

> To execute the preload code path, we need a minimum of 2 cache entries and 2
> threads so that each thread actually has work to do.  Otherwise the logic
> below that divides the work up would need to be updated as well.  The
> additional complexity didn't seem worth it just to enable the code path to
> execute with a single thread on a single cache entry.

[snip]

>
> That would require a larger patch that would update the work division and
> thread creation logic for little to no gain.

Oh I was not aware of the additional needed refactoring. (I assumed that
would *just work*, experience should have told me to have a look first).

>
> There are no new tests that take advantage of this new environment variable.
> Instead you can run the entire git test suite with it by running:
>
> GIT_FORCE_PRELOAD_TEXT=1 prove -j12 --state=all ./t[0-9]*.sh
>
> I can add that to the commit message to make it more obvious if desired.

it would have helped me being less confused as there was an "obvious"
easier and more correct alternative, which you just explained
is neither easier nor more correct as we probably do not desire that
anyway.

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v7 03/12] update-index: add a new --force-write-index option
  2017-09-20 14:58         ` Ben Peart
@ 2017-09-21  1:46           ` Junio C Hamano
  2017-09-21  2:06             ` Ben Peart
  0 siblings, 1 reply; 137+ messages in thread
From: Junio C Hamano @ 2017-09-21  1:46 UTC (permalink / raw)
  To: Ben Peart
  Cc: Ben Peart, David.Turner, avarab, christian.couder, git,
	johannes.schindelin, pclouds, peff

Ben Peart <peartben@gmail.com> writes:

> Lets see how my ascii art skills do at describing this:

Your ascii art is fine.  If you said upfront that the capital
letters signify points in time, lower letters are file-touching
events, and time flows from left to right, it would have been
perfect ;-)

> There is no real concern about accumulating too many changes as 1) the
> processing cost for additional modified files is fairly trivial and 2)
> the index ends up getting written out pretty frequently anyway as
> files are added/removed/staged/etc which updates the
> fsmonitor_last_update time.

I still see some impedance mismatch here.  The optimization
described is valid but the code to do the optimization would avoid
writing the index file out when the in-core index is dirty only
because the status reported by fsmonitor changed--if there were
other changes (e.g. the code added a new index entry), even with the
optimization, you would still want to write the index out, right?

With or without the need for forced flush to help debugging, would
that suggest that you need two bits now, instead of just a single
'active-cache-changed' bit?

By keeping track of that new bit that tells us if we have
fsmonitor-only changes that we _could_ flush, this patch can further
reduce the need for forced flushing (i.e. if we know we didn't get
fsmonitor induced dirtyness, force_write can still be no-op), no?

>
> The challenge came when it was time to test that the changes to the
> index were correct.  Since they are lazily written by default, I
> needed a way to force the write so that I could verify the index on
> disk was correct.  Hence, this patch.
>
>
>>>   		OPT_END()
>>>   	};
>>>   @@ -1147,7 +1150,7 @@ int cmd_update_index(int argc, const char
>>> **argv, const char *prefix)
>>>   		die("BUG: bad untracked_cache value: %d", untracked_cache);
>>>   	}
>>>   -	if (active_cache_changed) {
>>> +	if (active_cache_changed || force_write) {
>>>   		if (newfd < 0) {
>>>   			if (refresh_args.flags & REFRESH_QUIET)
>>>   				exit(128);

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  2017-09-20 16:19         ` Ben Peart
@ 2017-09-21  2:00           ` Junio C Hamano
  2017-09-21  2:24             ` Ben Peart
  0 siblings, 1 reply; 137+ messages in thread
From: Junio C Hamano @ 2017-09-21  2:00 UTC (permalink / raw)
  To: Ben Peart
  Cc: Ben Peart, David.Turner, avarab, christian.couder, git,
	johannes.schindelin, pclouds, peff

Ben Peart <peartben@gmail.com> writes:

> Pretty much the same places you would also use CE_MATCH_IGNORE_VALID
> and CE_MATCH_IGNORE_SKIP_WORKTREE which serve the same role for those
> features.  That is generally when you are about to overwrite data so
> want to be *really* sure you have what you think you have.

Now that makes me worried gravely.  

IGNORE_VALID is ignored in these places because we have been burned
by end-users lying to us.  IGNORE_SKIP_WORKTREE must be ignored
because we know that the working tree state does not match the
"reality" the index wants to have.  The fact that the code treats
the status reported and kept up to date by fsmonitor the same way as
these two implies that it is merely advisory and cannot be trusted?
Is that the reason why we tell the codepath with IGNORE_FSMONITOR to
ignore the state fsmonitor reported and check the state ourselves?

Oh, wait...


> The other place I used it was in preload_index(). In that case, I
> didn't want to trigger the call to refresh_fsmonitor() as
> preload_index() is about trying to do a fast precompute of state for
> the bulk of the index entries but is not required for correctness as
> refresh_cache_ent() will ensure any "missed" by preload_index() are
> up-to-date if/when that is needed.

That is a very valid design decision.  So IGNORE_FSMONITOR is,
unlike IGNORE_VALID and IGNORE_SKIP_WORKTREE, to tell us "do not
bother asking fsmonitor to refresh the state of this entry--it is OK
for us to use a slightly stale information"?  That would make sense
as an optimization, but that does not mesh well with the previous
"we need to be really really sure" usecase.  That one wants "we do
not trust fsmonitor, so do not bother asking to refresh; we will do
so ourselves", which would not help the "we can use slightly stale
one and that is OK" usecase.

Puzzled...

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v7 03/12] update-index: add a new --force-write-index option
  2017-09-21  1:46           ` Junio C Hamano
@ 2017-09-21  2:06             ` Ben Peart
  2017-09-21  2:18               ` Junio C Hamano
  0 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-09-21  2:06 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ben Peart, David.Turner, avarab, christian.couder, git,
	johannes.schindelin, pclouds, peff



On 9/20/2017 9:46 PM, Junio C Hamano wrote:
> Ben Peart <peartben@gmail.com> writes:
> 
>> Lets see how my ascii art skills do at describing this:
> 
> Your ascii art is fine.  If you said upfront that the capital
> letters signify points in time, lower letters are file-touching
> events, and time flows from left to right, it would have been
> perfect ;-)
> 

Rats, so close and yet... ;-)

>> There is no real concern about accumulating too many changes as 1) the
>> processing cost for additional modified files is fairly trivial and 2)
>> the index ends up getting written out pretty frequently anyway as
>> files are added/removed/staged/etc which updates the
>> fsmonitor_last_update time.
> 
> I still see some impedance mismatch here.  The optimization
> described is valid but the code to do the optimization would avoid
> writing the index file out when the in-core index is dirty only
> because the status reported by fsmonitor changed--if there were
> other changes (e.g. the code added a new index entry), even with the
> optimization, you would still want to write the index out, right?
> 

Yes, that is exactly how it works.  FSMONITOR_CHANGED is only set when 
the fsmonitor index extension is added or removed but all the other 
logic to flag the index dirty (CE_ENTRY_CHANGED/REMOVED/ADDED, 
UNTRACKED_CHANGED, etc) still exists and will still trigger the index to 
be written out as it always has.  It's not to say some of these other 
changes could not do the same optimization (I haven't looked) if 
recomputing is cheaper than writing out the index.

> With or without the need for forced flush to help debugging, would
> that suggest that you need two bits now, instead of just a single
> 'active-cache-changed' bit?
> 
> By keeping track of that new bit that tells us if we have
> fsmonitor-only changes that we _could_ flush, this patch can further
> reduce the need for forced flushing (i.e. if we know we didn't get
> fsmonitor induced dirtyness, force_write can still be no-op), no?
> 

Yes, I suppose we _could_ add a 2nd bit (and then add the logic to set 
that bit every time a fsmonitor change was made) but I don't see that it 
really buys us anything useful.  The force write flag in update-index is 
off by default and the only scenario we have that someone would set it 
is for test cases where the perf of writing out the index when it is not 
needed just doesn't matter.

>>
>> The challenge came when it was time to test that the changes to the
>> index were correct.  Since they are lazily written by default, I
>> needed a way to force the write so that I could verify the index on
>> disk was correct.  Hence, this patch.
>>
>>
>>>>    		OPT_END()
>>>>    	};
>>>>    @@ -1147,7 +1150,7 @@ int cmd_update_index(int argc, const char
>>>> **argv, const char *prefix)
>>>>    		die("BUG: bad untracked_cache value: %d", untracked_cache);
>>>>    	}
>>>>    -	if (active_cache_changed) {
>>>> +	if (active_cache_changed || force_write) {
>>>>    		if (newfd < 0) {
>>>>    			if (refresh_args.flags & REFRESH_QUIET)
>>>>    				exit(128);

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v7 03/12] update-index: add a new --force-write-index option
  2017-09-21  2:06             ` Ben Peart
@ 2017-09-21  2:18               ` Junio C Hamano
  2017-09-21  2:32                 ` Junio C Hamano
  0 siblings, 1 reply; 137+ messages in thread
From: Junio C Hamano @ 2017-09-21  2:18 UTC (permalink / raw)
  To: Ben Peart
  Cc: Ben Peart, David.Turner, avarab, christian.couder, git,
	johannes.schindelin, pclouds, peff

Ben Peart <peartben@gmail.com> writes:

> On 9/20/2017 9:46 PM, Junio C Hamano wrote:
>> Ben Peart <peartben@gmail.com> writes:
>>
>>> Lets see how my ascii art skills do at describing this:
>>
>> Your ascii art is fine.  If you said upfront that the capital
>> letters signify points in time, lower letters are file-touching
>> events, and time flows from left to right, it would have been
>> perfect ;-)
>
> Rats, so close and yet... ;-)

Nah, sorry for forgetting to add "... but I could guess that was the
case after reading a few paragraphs, at which point I rewound and
started reading from the beginning, and it was crystal clear."

> Yes, I suppose we _could_ add a 2nd bit (and then add the logic to set
> that bit every time a fsmonitor change was made) but I don't see that
> it really buys us anything useful.  The force write flag in
> update-index is off by default and the only scenario we have that
> someone would set it is for test cases where the perf of writing out
> the index when it is not needed just doesn't matter.

I tend to agree now.  

My reaction primarily came from that I couldn't quite tell what the
IGNORE_* bit was ment to do and assumed that it meant pretty much
the same thing as existing ones like "valid bit is untrustworthy, so
do not pay attention to it".  It turns out that this one has quite a
different meaning, that is not connected to how much we should trust
state maintained by the fsmonitor, which force me off-track.

Thanks.

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  2017-09-21  2:00           ` Junio C Hamano
@ 2017-09-21  2:24             ` Ben Peart
  2017-09-21 14:35               ` Ben Peart
  0 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-09-21  2:24 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ben Peart, David.Turner, avarab, christian.couder, git,
	johannes.schindelin, pclouds, peff



On 9/20/2017 10:00 PM, Junio C Hamano wrote:
> Ben Peart <peartben@gmail.com> writes:
> 
>> Pretty much the same places you would also use CE_MATCH_IGNORE_VALID
>> and CE_MATCH_IGNORE_SKIP_WORKTREE which serve the same role for those
>> features.  That is generally when you are about to overwrite data so
>> want to be *really* sure you have what you think you have.
> 
> Now that makes me worried gravely.
> 
> IGNORE_VALID is ignored in these places because we have been burned
> by end-users lying to us.  IGNORE_SKIP_WORKTREE must be ignored
> because we know that the working tree state does not match the
> "reality" the index wants to have.  The fact that the code treats
> the status reported and kept up to date by fsmonitor the same way as
> these two implies that it is merely advisory and cannot be trusted?
> Is that the reason why we tell the codepath with IGNORE_FSMONITOR to
> ignore the state fsmonitor reported and check the state ourselves?
> 

Sorry for causing unnecessary worry.  The fsmonitor data can be trusted 
(as much as you can trust that Watchman or your file system monitor is 
not buggy).  I wasn't 100% sure *why* these places passed the various 
IGNORE_VALID and IGNORE_SKIP_WORKTREE flags.  When I looked at them, 
that lack of trust seemed to be the reason.

Adding IGNORE_FSMONITOR in those same places was simply an abundance of 
caution on my part.  The only down side of passing the flag for 
fsmonitor is that we will end up calling lstat() on a file where we 
technically didn't need too.  That seemed safer than potentially missing 
a change if I had misunderstood the code.

I'd much rather return correct results (and fall back to the old 
performance) than potentially be incorrect.  I followed that same 
principal in the entire design of fsmonitor - if anything doesn't look 
right, fall back to the old code path just in case...

> Oh, wait...
> 
> 
>> The other place I used it was in preload_index(). In that case, I
>> didn't want to trigger the call to refresh_fsmonitor() as
>> preload_index() is about trying to do a fast precompute of state for
>> the bulk of the index entries but is not required for correctness as
>> refresh_cache_ent() will ensure any "missed" by preload_index() are
>> up-to-date if/when that is needed.
> 
> That is a very valid design decision.  So IGNORE_FSMONITOR is,
> unlike IGNORE_VALID and IGNORE_SKIP_WORKTREE, to tell us "do not
> bother asking fsmonitor to refresh the state of this entry--it is OK
> for us to use a slightly stale information"?  That would make sense
> as an optimization, but that does not mesh well with the previous
> "we need to be really really sure" usecase.  That one wants "we do
> not trust fsmonitor, so do not bother asking to refresh; we will do
> so ourselves", which would not help the "we can use slightly stale
> one and that is OK" usecase.
> 
> Puzzled...
> 

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v7 03/12] update-index: add a new --force-write-index option
  2017-09-21  2:18               ` Junio C Hamano
@ 2017-09-21  2:32                 ` Junio C Hamano
  0 siblings, 0 replies; 137+ messages in thread
From: Junio C Hamano @ 2017-09-21  2:32 UTC (permalink / raw)
  To: Ben Peart
  Cc: Ben Peart, David.Turner, avarab, christian.couder, git,
	johannes.schindelin, pclouds, peff

Junio C Hamano <gitster@pobox.com> writes:

> Ben Peart <peartben@gmail.com> writes:
>
>> On 9/20/2017 9:46 PM, Junio C Hamano wrote:
>>> Ben Peart <peartben@gmail.com> writes:
>>>
>>>> Lets see how my ascii art skills do at describing this:
>>>
>>> Your ascii art is fine.  If you said upfront that the capital
>>> letters signify points in time, lower letters are file-touching
>>> events, and time flows from left to right, it would have been
>>> perfect ;-)
>>
>> Rats, so close and yet... ;-)
>
> Nah, sorry for forgetting to add "... but I could guess that was the
> case after reading a few paragraphs, at which point I rewound and
> started reading from the beginning, and it was crystal clear."
>
>> Yes, I suppose we _could_ add a 2nd bit (and then add the logic to set
>> that bit every time a fsmonitor change was made) but I don't see that
>> it really buys us anything useful.  The force write flag in
>> update-index is off by default and the only scenario we have that
>> someone would set it is for test cases where the perf of writing out
>> the index when it is not needed just doesn't matter.
>
> I tend to agree now.  
>
> My reaction primarily came from ...

oops. please ignore the last paragraph, or transplant it to the
other thread X-<.

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  2017-09-21  2:24             ` Ben Peart
@ 2017-09-21 14:35               ` Ben Peart
  2017-09-22  1:02                 ` Junio C Hamano
  0 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-09-21 14:35 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ben Peart, David.Turner, avarab, christian.couder, git,
	johannes.schindelin, pclouds, peff



On 9/20/2017 10:24 PM, Ben Peart wrote:
> 
> 
> On 9/20/2017 10:00 PM, Junio C Hamano wrote:
>> Ben Peart <peartben@gmail.com> writes:
>>
>>> Pretty much the same places you would also use CE_MATCH_IGNORE_VALID
>>> and CE_MATCH_IGNORE_SKIP_WORKTREE which serve the same role for those
>>> features.  That is generally when you are about to overwrite data so
>>> want to be *really* sure you have what you think you have.
>>
>> Now that makes me worried gravely.
>>
>> IGNORE_VALID is ignored in these places because we have been burned
>> by end-users lying to us.  IGNORE_SKIP_WORKTREE must be ignored
>> because we know that the working tree state does not match the
>> "reality" the index wants to have.  The fact that the code treats
>> the status reported and kept up to date by fsmonitor the same way as
>> these two implies that it is merely advisory and cannot be trusted?
>> Is that the reason why we tell the codepath with IGNORE_FSMONITOR to
>> ignore the state fsmonitor reported and check the state ourselves?
>>
> 
> Sorry for causing unnecessary worry.  The fsmonitor data can be trusted 
> (as much as you can trust that Watchman or your file system monitor is 
> not buggy).  I wasn't 100% sure *why* these places passed the various 
> IGNORE_VALID and IGNORE_SKIP_WORKTREE flags.  When I looked at them, 
> that lack of trust seemed to be the reason.
> 
> Adding IGNORE_FSMONITOR in those same places was simply an abundance of 
> caution on my part.  The only down side of passing the flag for 
> fsmonitor is that we will end up calling lstat() on a file where we 
> technically didn't need too.  That seemed safer than potentially missing 
> a change if I had misunderstood the code.
> 
> I'd much rather return correct results (and fall back to the old 
> performance) than potentially be incorrect.  I followed that same 
> principal in the entire design of fsmonitor - if anything doesn't look 
> right, fall back to the old code path just in case...
> 

I spent some time with git blame/show trying to figure out the *why* for 
all the places CE_MATCH_IGNORE_* are passed without gaining a lot of 
additional understanding.  Based on your description above of why these 
exist, I believe there are very few places we actually need to pass 
CE_MATCH_IGNORE_FSMONITOR and that I was being overly cautious.

Here is a patch that removes the unnecessary CE_MATCH_IGNORE_FSMONITOR 
instances.  While the test suite passes with this change, I'm not 100% 
confident that we actually have test cases that would have detected all 
the places that we needed the CE_MATCH_IGNORE_* flags.

If this seems like a reasonable additional optimization to make, I can 
roll it into the next iteration of the patch series as I have some 
spelling, documentation changes and other tweaks as a result of all the 
feedback.


 From 6ff7ed0467fd736dca73efe62391bb3ee9b4e771 Mon Sep 17 00:00:00 2001
From: Ben Peart <benpeart@microsoft.com>
Date: Thu, 21 Sep 2017 09:09:42 -0400
Subject: [PATCH] fsmonitor: remove unnecessary uses of
  CE_MATCH_IGNORE_FSMONITOR

With a better understanding of *why* the CE_MATCH_IGNORE_* flags are
used, it is now more clear they are not required in most cases where
CE_MATCH_IGNORE_FSMONITOR was being passed out of an abundance of
caution.

Since the fsmonitor data can be trusted and is kept in sync with the
working directory, the only remaining valid uses are those locations
where we don't want to trigger an unneeded refresh_fsmonitor() call.

One is where preload_index() is doing a fast precompute of state for
the bulk of the index entries but is not required for correctness as
refresh_cache_ent() will ensure any "missed" by preload_index() are
up-to-date if/when they are needed.

The second is in is_staging_gitmodules_ok() where we don't want to
trigger a complete refresh just to check the .gitignore file.

The net result of this change will be that there are more cases where
we will be able to use the cached index state and avoid unnecessary
lstat() calls.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
  apply.c        | 2 +-
  entry.c        | 2 +-
  read-cache.c   | 4 ++--
  unpack-trees.c | 6 +++---
  4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/apply.c b/apply.c
index 9061cc5f15..71cbbd141c 100644
--- a/apply.c
+++ b/apply.c
@@ -3399,7 +3399,7 @@ static int verify_index_match(const struct 
cache_entry *ce, struct stat *st)
  			return -1;
  		return 0;
  	}
-	return ce_match_stat(ce, st, 
CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_IGNORE_FSMONITOR);
+	return ce_match_stat(ce, st, 
CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE);
  }

  #define SUBMODULE_PATCH_WITHOUT_INDEX 1
diff --git a/entry.c b/entry.c
index 5e6794f9fc..3a7b667373 100644
--- a/entry.c
+++ b/entry.c
@@ -404,7 +404,7 @@ int checkout_entry(struct cache_entry *ce,

  	if (!check_path(path.buf, path.len, &st, state->base_dir_len)) {
  		const struct submodule *sub;
-		unsigned changed = ce_match_stat(ce, &st, 
CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_IGNORE_FSMONITOR);
+		unsigned changed = ce_match_stat(ce, &st, 
CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE);
  		/*
  		 * Needs to be checked before !changed returns early,
  		 * as the possibly empty directory was not changed
diff --git a/read-cache.c b/read-cache.c
index 53093dbebf..05c0a33fdd 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -641,7 +641,7 @@ int add_to_index(struct index_state *istate, const 
char *path, struct stat *st,
  	int size, namelen, was_same;
  	mode_t st_mode = st->st_mode;
  	struct cache_entry *ce, *alias;
-	unsigned ce_option = 
CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_RACY_IS_DIRTY|CE_MATCH_IGNORE_FSMONITOR;
+	unsigned ce_option = 
CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_RACY_IS_DIRTY;
  	int verbose = flags & (ADD_CACHE_VERBOSE | ADD_CACHE_PRETEND);
  	int pretend = flags & ADD_CACHE_PRETEND;
  	int intent_only = flags & ADD_CACHE_INTENT;
@@ -1356,7 +1356,7 @@ int refresh_index(struct index_state *istate, 
unsigned int flags,
  	int first = 1;
  	int in_porcelain = (flags & REFRESH_IN_PORCELAIN);
  	unsigned int options = (CE_MATCH_REFRESH |
-				(really ? CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_FSMONITOR : 0) |
+				(really ? CE_MATCH_IGNORE_VALID : 0) |
  				(not_new ? CE_MATCH_IGNORE_MISSING : 0));
  	const char *modified_fmt;
  	const char *deleted_fmt;
diff --git a/unpack-trees.c b/unpack-trees.c
index f724a61ac0..1f5d371636 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1456,7 +1456,7 @@ static int verify_uptodate_1(const struct 
cache_entry *ce,
  		return 0;

  	if (!lstat(ce->name, &st)) {
-		int flags = 
CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_IGNORE_FSMONITOR;
+		int flags = CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE;
  		unsigned changed = ie_match_stat(o->src_index, ce, &st, flags);

  		if (submodule_from_ce(ce)) {
@@ -1612,7 +1612,7 @@ static int icase_exists(struct 
unpack_trees_options *o, const char *name, int le
  	const struct cache_entry *src;

  	src = index_file_exists(o->src_index, name, len, 1);
-	return src && !ie_match_stat(o->src_index, src, st, 
CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_IGNORE_FSMONITOR);
+	return src && !ie_match_stat(o->src_index, src, st, 
CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE);
  }

  static int check_ok_to_remove(const char *name, int len, int dtype,
@@ -2136,7 +2136,7 @@ int oneway_merge(const struct cache_entry * const 
*src,
  		if (o->reset && o->update && !ce_uptodate(old) && 
!ce_skip_worktree(old)) {
  			struct stat st;
  			if (lstat(old->name, &st) ||
-			    ie_match_stat(o->src_index, old, &st, 
CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_IGNORE_FSMONITOR))
+			    ie_match_stat(o->src_index, old, &st, 
CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE))
  				update |= CE_UPDATE;
  		}
  		add_entry(o, old, update, 0);
-- 
2.14.1.548.g237ef02b2b.dirty



>> Oh, wait...
>>
>>
>>> The other place I used it was in preload_index(). In that case, I
>>> didn't want to trigger the call to refresh_fsmonitor() as
>>> preload_index() is about trying to do a fast precompute of state for
>>> the bulk of the index entries but is not required for correctness as
>>> refresh_cache_ent() will ensure any "missed" by preload_index() are
>>> up-to-date if/when that is needed.
>>
>> That is a very valid design decision.  So IGNORE_FSMONITOR is,
>> unlike IGNORE_VALID and IGNORE_SKIP_WORKTREE, to tell us "do not
>> bother asking fsmonitor to refresh the state of this entry--it is OK
>> for us to use a slightly stale information"?  That would make sense
>> as an optimization, but that does not mesh well with the previous
>> "we need to be really really sure" usecase.  That one wants "we do
>> not trust fsmonitor, so do not bother asking to refresh; we will do
>> so ourselves", which would not help the "we can use slightly stale
>> one and that is OK" usecase.
>>
>> Puzzled...
>>

^ permalink raw reply related	[flat|nested] 137+ messages in thread

* Re: [PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  2017-09-21 14:35               ` Ben Peart
@ 2017-09-22  1:02                 ` Junio C Hamano
  0 siblings, 0 replies; 137+ messages in thread
From: Junio C Hamano @ 2017-09-22  1:02 UTC (permalink / raw)
  To: Ben Peart
  Cc: Ben Peart, David.Turner, avarab, christian.couder, git,
	johannes.schindelin, pclouds, peff

Ben Peart <peartben@gmail.com> writes:

> Since the fsmonitor data can be trusted and is kept in sync with the
> working directory, the only remaining valid uses are those locations
> where we don't want to trigger an unneeded refresh_fsmonitor() call.

Now that is a lot more assuring ;-)  And the exceptions below also
make sense to me.

Thanks for thinking this through.

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v1 1/1] test-lint: echo -e (or -E) is not portable
  2017-09-20 13:49           ` Torsten Bögershausen
@ 2017-09-22  1:04             ` Junio C Hamano
  0 siblings, 0 replies; 137+ messages in thread
From: Junio C Hamano @ 2017-09-22  1:04 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: Jonathan Nieder, git, benpeart

Torsten Bögershausen <tboegi@web.de> writes:

> Junio, if you wouldn't mind to squash that in, 
> another fix is needed as well(trailing '-' after '-E') :
>
> s/'-n', '-e' or '-E-'/'-n', '-e' or '-E'

Yup.  Thanks all.


^ permalink raw reply	[flat|nested] 137+ messages in thread

* [PATCH v8 00/12] Fast git status via a file system watcher
  2017-09-19 19:27   ` [PATCH v7 00/12] Fast git status via a file system watcher Ben Peart
                       ` (11 preceding siblings ...)
  2017-09-19 19:27     ` [PATCH v7 12/12] fsmonitor: add a performance test Ben Peart
@ 2017-09-22 16:35     ` Ben Peart
  2017-09-22 16:35       ` [PATCH v8 01/12] bswap: add 64 bit endianness helper get_be64 Ben Peart
                         ` (12 more replies)
  12 siblings, 13 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-22 16:35 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

Thanks again to everyone who provided feedback.  There are several
spelling, documentation, and code comment changes.

The only behavioral change from V7 is the removal of unnecessary uses of
CE_MATCH_IGNORE_FSMONITOR.  With a better understanding of *why* the
CE_MATCH_IGNORE_* flags are used, it is now clear they are not required
in most cases where CE_MATCH_IGNORE_FSMONITOR was being passed out of an
abundance of caution.

Since the fsmonitor data can be trusted and is kept in sync with the
working directory, the only remaining valid uses are those locations
where we don't want to trigger an unneeded refresh_fsmonitor() call.

One is where preload_index() is doing a fast precompute of state for
the bulk of the index entries but is not required for correctness as
refresh_cache_ent() will ensure any "missed" by preload_index() are
up-to-date if/when they are needed.

The second is in is_staging_gitmodules_ok() where we don't want to
trigger a complete refresh just to check the .gitignore file.

The net result of this change will be that there are more cases where
we will be able to use the cached index state and avoid unnecessary
lstat() calls.

Interdiff between V7 and V8:

diff --git a/Documentation/git-update-index.txt b/Documentation/git-update-index.txt
index 95231dbfcb..7c2f880a22 100644
--- a/Documentation/git-update-index.txt
+++ b/Documentation/git-update-index.txt
@@ -476,7 +476,7 @@ inform it as to what files have been modified. This enables git to avoid
 having to lstat() every file to find modified files.
 
 When used in conjunction with the untracked cache, it can further improve
-performance by avoiding the cost of scaning the entire working directory
+performance by avoiding the cost of scanning the entire working directory
 looking for new files.
 
 If you want to enable (or disable) this feature, it is easier to use
diff --git a/apply.c b/apply.c
index 9061cc5f15..71cbbd141c 100644
--- a/apply.c
+++ b/apply.c
@@ -3399,7 +3399,7 @@ static int verify_index_match(const struct cache_entry *ce, struct stat *st)
 			return -1;
 		return 0;
 	}
-	return ce_match_stat(ce, st, CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_IGNORE_FSMONITOR);
+	return ce_match_stat(ce, st, CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE);
 }
 
 #define SUBMODULE_PATCH_WITHOUT_INDEX 1
diff --git a/cache.h b/cache.h
index eccab968bd..f1c903e1b6 100644
--- a/cache.h
+++ b/cache.h
@@ -682,7 +682,7 @@ extern void *read_blob_data_from_index(const struct index_state *, const char *,
 #define CE_MATCH_IGNORE_MISSING		0x08
 /* enable stat refresh */
 #define CE_MATCH_REFRESH		0x10
-/* do stat comparison even if CE_FSMONITOR_VALID is true */
+/* don't refresh_fsmonitor state or do stat comparison even if CE_FSMONITOR_VALID is true */
 #define CE_MATCH_IGNORE_FSMONITOR 0X20
 extern int ie_match_stat(struct index_state *, const struct cache_entry *, struct stat *, unsigned int);
 extern int ie_modified(struct index_state *, const struct cache_entry *, struct stat *, unsigned int);
diff --git a/entry.c b/entry.c
index 5e6794f9fc..3a7b667373 100644
--- a/entry.c
+++ b/entry.c
@@ -404,7 +404,7 @@ int checkout_entry(struct cache_entry *ce,
 
 	if (!check_path(path.buf, path.len, &st, state->base_dir_len)) {
 		const struct submodule *sub;
-		unsigned changed = ce_match_stat(ce, &st, CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_IGNORE_FSMONITOR);
+		unsigned changed = ce_match_stat(ce, &st, CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE);
 		/*
 		 * Needs to be checked before !changed returns early,
 		 * as the possibly empty directory was not changed
diff --git a/fsmonitor.c b/fsmonitor.c
index b8b2d88fe1..7c1540c054 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -176,7 +176,7 @@ void refresh_fsmonitor(struct index_state *istate)
 			core_fsmonitor, query_success ? "success" : "failure");
 	}
 
-	/* a fsmonitor process can return '*' to indicate all entries are invalid */
+	/* a fsmonitor process can return '/' to indicate all entries are invalid */
 	if (query_success && query_result.buf[0] != '/') {
 		/* Mark all entries returned by the monitor as dirty */
 		buf = query_result.buf;
diff --git a/fsmonitor.h b/fsmonitor.h
index c2240b811a..8eb6163455 100644
--- a/fsmonitor.h
+++ b/fsmonitor.h
@@ -35,7 +35,9 @@ extern void tweak_fsmonitor(struct index_state *istate);
 extern void refresh_fsmonitor(struct index_state *istate);
 
 /*
- * Set the given cache entries CE_FSMONITOR_VALID bit.
+ * Set the given cache entries CE_FSMONITOR_VALID bit. This should be
+ * called any time the cache entry has been updated to reflect the
+ * current state of the file on disk.
  */
 static inline void mark_fsmonitor_valid(struct cache_entry *ce)
 {
@@ -46,8 +48,11 @@ static inline void mark_fsmonitor_valid(struct cache_entry *ce)
 }
 
 /*
- * Clear the given cache entry's CE_FSMONITOR_VALID bit and invalidate any
- * corresponding untracked cache directory structures.
+ * Clear the given cache entry's CE_FSMONITOR_VALID bit and invalidate
+ * any corresponding untracked cache directory structures. This should
+ * be called any time git creates or modifies a file that should
+ * trigger an lstat() or invalidate the untracked cache for the
+ * corresponding directory
  */
 static inline void mark_fsmonitor_invalid(struct index_state *istate, struct cache_entry *ce)
 {
diff --git a/read-cache.c b/read-cache.c
index 53093dbebf..05c0a33fdd 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -641,7 +641,7 @@ int add_to_index(struct index_state *istate, const char *path, struct stat *st,
 	int size, namelen, was_same;
 	mode_t st_mode = st->st_mode;
 	struct cache_entry *ce, *alias;
-	unsigned ce_option = CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_RACY_IS_DIRTY|CE_MATCH_IGNORE_FSMONITOR;
+	unsigned ce_option = CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_RACY_IS_DIRTY;
 	int verbose = flags & (ADD_CACHE_VERBOSE | ADD_CACHE_PRETEND);
 	int pretend = flags & ADD_CACHE_PRETEND;
 	int intent_only = flags & ADD_CACHE_INTENT;
@@ -1356,7 +1356,7 @@ int refresh_index(struct index_state *istate, unsigned int flags,
 	int first = 1;
 	int in_porcelain = (flags & REFRESH_IN_PORCELAIN);
 	unsigned int options = (CE_MATCH_REFRESH |
-				(really ? CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_FSMONITOR : 0) |
+				(really ? CE_MATCH_IGNORE_VALID : 0) |
 				(not_new ? CE_MATCH_IGNORE_MISSING : 0));
 	const char *modified_fmt;
 	const char *deleted_fmt;
diff --git a/t/helper/test-drop-caches.c b/t/helper/test-drop-caches.c
index 4e5ca8f397..bd1a857d52 100644
--- a/t/helper/test-drop-caches.c
+++ b/t/helper/test-drop-caches.c
@@ -82,6 +82,8 @@ static int cmd_dropcaches(void)
 	HANDLE hProcess = GetCurrentProcess();
 	HANDLE hToken;
 	HMODULE ntdll;
+	DWORD(WINAPI *NtSetSystemInformation)(INT, PVOID, ULONG);
+	SYSTEM_MEMORY_LIST_COMMAND command;
 	int status;
 
 	if (!OpenProcessToken(hProcess, TOKEN_QUERY | TOKEN_ADJUST_PRIVILEGES, &hToken))
@@ -96,12 +98,12 @@ static int cmd_dropcaches(void)
 	if (!ntdll)
 		return error("Can't load ntdll.dll, wrong Windows version?");
 
-	DWORD(WINAPI *NtSetSystemInformation)(INT, PVOID, ULONG) =
+	NtSetSystemInformation =
 		(DWORD(WINAPI *)(INT, PVOID, ULONG))GetProcAddress(ntdll, "NtSetSystemInformation");
 	if (!NtSetSystemInformation)
 		return error("Can't get function addresses, wrong Windows version?");
 
-	SYSTEM_MEMORY_LIST_COMMAND command = MemoryPurgeStandbyList;
+	command = MemoryPurgeStandbyList;
 	status = NtSetSystemInformation(
 		SystemMemoryListInformation,
 		&command,
diff --git a/unpack-trees.c b/unpack-trees.c
index f724a61ac0..1f5d371636 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1456,7 +1456,7 @@ static int verify_uptodate_1(const struct cache_entry *ce,
 		return 0;
 
 	if (!lstat(ce->name, &st)) {
-		int flags = CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_IGNORE_FSMONITOR;
+		int flags = CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE;
 		unsigned changed = ie_match_stat(o->src_index, ce, &st, flags);
 
 		if (submodule_from_ce(ce)) {
@@ -1612,7 +1612,7 @@ static int icase_exists(struct unpack_trees_options *o, const char *name, int le
 	const struct cache_entry *src;
 
 	src = index_file_exists(o->src_index, name, len, 1);
-	return src && !ie_match_stat(o->src_index, src, st, CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_IGNORE_FSMONITOR);
+	return src && !ie_match_stat(o->src_index, src, st, CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE);
 }
 
 static int check_ok_to_remove(const char *name, int len, int dtype,
@@ -2136,7 +2136,7 @@ int oneway_merge(const struct cache_entry * const *src,
 		if (o->reset && o->update && !ce_uptodate(old) && !ce_skip_worktree(old)) {
 			struct stat st;
 			if (lstat(old->name, &st) ||
-			    ie_match_stat(o->src_index, old, &st, CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_IGNORE_FSMONITOR))
+			    ie_match_stat(o->src_index, old, &st, CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE))
 				update |= CE_UPDATE;
 		}
 		add_entry(o, old, update, 0);

Ben Peart (12):
  bswap: add 64 bit endianness helper get_be64
  preload-index: add override to enable testing preload-index
  update-index: add a new --force-write-index option
  fsmonitor: teach git to optionally utilize a file system monitor to
    speed up detecting new or changed files.
  fsmonitor: add documentation for the fsmonitor extension.
  ls-files: Add support in ls-files to display the fsmonitor valid bit
  update-index: add fsmonitor support to update-index
  fsmonitor: add a test tool to dump the index extension
  split-index: disable the fsmonitor extension when running the split
    index test
  fsmonitor: add test cases for fsmonitor extension
  fsmonitor: add a sample integration script for Watchman
  fsmonitor: add a performance test

 Documentation/config.txt                   |   7 +
 Documentation/git-ls-files.txt             |   7 +-
 Documentation/git-update-index.txt         |  45 +++++
 Documentation/githooks.txt                 |  28 +++
 Documentation/technical/index-format.txt   |  19 ++
 Makefile                                   |   3 +
 builtin/ls-files.c                         |   8 +-
 builtin/update-index.c                     |  38 +++-
 cache.h                                    |  10 +-
 compat/bswap.h                             |  22 +++
 config.c                                   |  14 ++
 config.h                                   |   1 +
 diff-lib.c                                 |   2 +
 dir.c                                      |  27 ++-
 dir.h                                      |   2 +
 entry.c                                    |   2 +
 environment.c                              |   1 +
 fsmonitor.c                                | 253 ++++++++++++++++++++++++
 fsmonitor.h                                |  66 +++++++
 preload-index.c                            |   8 +-
 read-cache.c                               |  45 ++++-
 submodule.c                                |   2 +-
 t/helper/.gitignore                        |   1 +
 t/helper/test-drop-caches.c                | 164 ++++++++++++++++
 t/helper/test-dump-fsmonitor.c             |  21 ++
 t/perf/p7519-fsmonitor.sh                  | 184 +++++++++++++++++
 t/t1700-split-index.sh                     |   1 +
 t/t7519-status-fsmonitor.sh                | 304 +++++++++++++++++++++++++++++
 t/t7519/fsmonitor-all                      |  24 +++
 t/t7519/fsmonitor-none                     |  22 +++
 t/t7519/fsmonitor-watchman                 | 140 +++++++++++++
 templates/hooks--fsmonitor-watchman.sample | 122 ++++++++++++
 unpack-trees.c                             |   2 +
 33 files changed, 1572 insertions(+), 23 deletions(-)
 create mode 100644 fsmonitor.c
 create mode 100644 fsmonitor.h
 create mode 100644 t/helper/test-drop-caches.c
 create mode 100644 t/helper/test-dump-fsmonitor.c
 create mode 100755 t/perf/p7519-fsmonitor.sh
 create mode 100755 t/t7519-status-fsmonitor.sh
 create mode 100755 t/t7519/fsmonitor-all
 create mode 100755 t/t7519/fsmonitor-none
 create mode 100755 t/t7519/fsmonitor-watchman
 create mode 100755 templates/hooks--fsmonitor-watchman.sample

-- 
2.14.1.549.g6ff7ed0467


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v8 01/12] bswap: add 64 bit endianness helper get_be64
  2017-09-22 16:35     ` [PATCH v8 00/12] Fast git status via a file system watcher Ben Peart
@ 2017-09-22 16:35       ` Ben Peart
  2017-09-22 23:37         ` Martin Ågren
  2017-09-22 16:35       ` [PATCH v8 02/12] preload-index: add override to enable testing preload-index Ben Peart
                         ` (11 subsequent siblings)
  12 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-09-22 16:35 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

Add a new get_be64 macro to enable 64 bit endian conversions on memory
that may or may not be aligned.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 compat/bswap.h | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/compat/bswap.h b/compat/bswap.h
index 7d063e9e40..6b22c46214 100644
--- a/compat/bswap.h
+++ b/compat/bswap.h
@@ -158,7 +158,9 @@ static inline uint64_t git_bswap64(uint64_t x)
 
 #define get_be16(p)	ntohs(*(unsigned short *)(p))
 #define get_be32(p)	ntohl(*(unsigned int *)(p))
+#define get_be64(p)	ntohll(*(uint64_t *)(p))
 #define put_be32(p, v)	do { *(unsigned int *)(p) = htonl(v); } while (0)
+#define put_be64(p, v)	do { *(uint64_t *)(p) = htonll(v); } while (0)
 
 #else
 
@@ -178,6 +180,13 @@ static inline uint32_t get_be32(const void *ptr)
 		(uint32_t)p[3] <<  0;
 }
 
+static inline uint64_t get_be64(const void *ptr)
+{
+	const unsigned char *p = ptr;
+	return	(uint64_t)get_be32(p[0]) << 32 |
+		(uint64_t)get_be32(p[4]) <<  0;
+}
+
 static inline void put_be32(void *ptr, uint32_t value)
 {
 	unsigned char *p = ptr;
@@ -187,4 +196,17 @@ static inline void put_be32(void *ptr, uint32_t value)
 	p[3] = value >>  0;
 }
 
+static inline void put_be64(void *ptr, uint64_t value)
+{
+	unsigned char *p = ptr;
+	p[0] = value >> 56;
+	p[1] = value >> 48;
+	p[2] = value >> 40;
+	p[3] = value >> 32;
+	p[4] = value >> 24;
+	p[5] = value >> 16;
+	p[6] = value >>  8;
+	p[7] = value >>  0;
+}
+
 #endif
-- 
2.14.1.549.g6ff7ed0467


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v8 02/12] preload-index: add override to enable testing preload-index
  2017-09-22 16:35     ` [PATCH v8 00/12] Fast git status via a file system watcher Ben Peart
  2017-09-22 16:35       ` [PATCH v8 01/12] bswap: add 64 bit endianness helper get_be64 Ben Peart
@ 2017-09-22 16:35       ` Ben Peart
  2017-09-22 16:35       ` [PATCH v8 03/12] update-index: add a new --force-write-index option Ben Peart
                         ` (10 subsequent siblings)
  12 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-22 16:35 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

By default, the preload index code path doesn't run unless there is a
minimum of 1000 files. To enable running the test suite and having it
execute the preload-index path, add an environment variable
(GIT_FORCE_PRELOAD_TEST) which will override that minimum and set it to 2.

This enables you run existing tests and have the core.preloadindex code
path execute as long as the test has at least 2 files by setting
GIT_FORCE_PRELOAD_TEXT=1 before running the test.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 preload-index.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/preload-index.c b/preload-index.c
index 70a4c80878..75564c497a 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -79,6 +79,8 @@ static void preload_index(struct index_state *index,
 		return;
 
 	threads = index->cache_nr / THREAD_COST;
+	if ((index->cache_nr > 1) && (threads < 2) && getenv("GIT_FORCE_PRELOAD_TEST"))
+		threads = 2;
 	if (threads < 2)
 		return;
 	if (threads > MAX_PARALLEL)
-- 
2.14.1.549.g6ff7ed0467


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v8 03/12] update-index: add a new --force-write-index option
  2017-09-22 16:35     ` [PATCH v8 00/12] Fast git status via a file system watcher Ben Peart
  2017-09-22 16:35       ` [PATCH v8 01/12] bswap: add 64 bit endianness helper get_be64 Ben Peart
  2017-09-22 16:35       ` [PATCH v8 02/12] preload-index: add override to enable testing preload-index Ben Peart
@ 2017-09-22 16:35       ` Ben Peart
  2017-09-22 16:35       ` [PATCH v8 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
                         ` (9 subsequent siblings)
  12 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-22 16:35 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

At times, it makes sense to avoid the cost of writing out the index
when the only changes can easily be recomputed on demand. This causes
problems when trying to write test cases to verify that state as they
can't guarantee the state has been persisted to disk.

Add a new option (--force-write-index) to update-index that will
ensure the index is written out even if the cache_changed flag is not
set.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 builtin/update-index.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index d562f2ec69..e1ca0759d5 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -915,6 +915,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 	struct refresh_params refresh_args = {0, &has_errors};
 	int lock_error = 0;
 	int split_index = -1;
+	int force_write = 0;
 	struct lock_file *lock_file;
 	struct parse_opt_ctx_t ctx;
 	strbuf_getline_fn getline_fn;
@@ -1006,6 +1007,8 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 			    N_("test if the filesystem supports untracked cache"), UC_TEST),
 		OPT_SET_INT(0, "force-untracked-cache", &untracked_cache,
 			    N_("enable untracked cache without testing the filesystem"), UC_FORCE),
+		OPT_SET_INT(0, "force-write-index", &force_write,
+			N_("write out the index even if is not flagged as changed"), 1),
 		OPT_END()
 	};
 
@@ -1147,7 +1150,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 		die("BUG: bad untracked_cache value: %d", untracked_cache);
 	}
 
-	if (active_cache_changed) {
+	if (active_cache_changed || force_write) {
 		if (newfd < 0) {
 			if (refresh_args.flags & REFRESH_QUIET)
 				exit(128);
-- 
2.14.1.549.g6ff7ed0467


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v8 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  2017-09-22 16:35     ` [PATCH v8 00/12] Fast git status via a file system watcher Ben Peart
                         ` (2 preceding siblings ...)
  2017-09-22 16:35       ` [PATCH v8 03/12] update-index: add a new --force-write-index option Ben Peart
@ 2017-09-22 16:35       ` Ben Peart
  2017-09-22 16:35       ` [PATCH v8 05/12] fsmonitor: add documentation for the fsmonitor extension Ben Peart
                         ` (8 subsequent siblings)
  12 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-22 16:35 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

When the index is read from disk, the fsmonitor index extension is used
to flag the last known potentially dirty index entries. The registered
core.fsmonitor command is called with the time the index was last
updated and returns the list of files changed since that time. This list
is used to flag any additional dirty cache entries and untracked cache
directories.

We can then use this valid state to speed up preload_index(),
ie_match_stat(), and refresh_cache_ent() as they do not need to lstat()
files to detect potential changes for those entries marked
CE_FSMONITOR_VALID.

In addition, if the untracked cache is turned on valid_cached_dir() can
skip checking directories for new or changed files as fsmonitor will
invalidate the cache only for those directories that have been
identified as having potential changes.

To keep the CE_FSMONITOR_VALID state accurate during git operations;
when git updates a cache entry to match the current state on disk,
it will now set the CE_FSMONITOR_VALID bit.

Inversely, anytime git changes a cache entry, the CE_FSMONITOR_VALID bit
is cleared and the corresponding untracked cache directory is marked
invalid.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 Makefile               |   1 +
 builtin/update-index.c |   2 +
 cache.h                |  10 +-
 config.c               |  14 +++
 config.h               |   1 +
 diff-lib.c             |   2 +
 dir.c                  |  27 ++++--
 dir.h                  |   2 +
 entry.c                |   2 +
 environment.c          |   1 +
 fsmonitor.c            | 253 +++++++++++++++++++++++++++++++++++++++++++++++++
 fsmonitor.h            |  66 +++++++++++++
 preload-index.c        |   6 +-
 read-cache.c           |  45 ++++++++-
 submodule.c            |   2 +-
 unpack-trees.c         |   2 +
 16 files changed, 417 insertions(+), 19 deletions(-)
 create mode 100644 fsmonitor.c
 create mode 100644 fsmonitor.h

diff --git a/Makefile b/Makefile
index f2bb7f2f63..9d6ec9c1e9 100644
--- a/Makefile
+++ b/Makefile
@@ -786,6 +786,7 @@ LIB_OBJS += ewah/ewah_rlw.o
 LIB_OBJS += exec_cmd.o
 LIB_OBJS += fetch-pack.o
 LIB_OBJS += fsck.o
+LIB_OBJS += fsmonitor.o
 LIB_OBJS += gettext.o
 LIB_OBJS += gpg-interface.o
 LIB_OBJS += graph.o
diff --git a/builtin/update-index.c b/builtin/update-index.c
index e1ca0759d5..6f39ee9274 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -16,6 +16,7 @@
 #include "pathspec.h"
 #include "dir.h"
 #include "split-index.h"
+#include "fsmonitor.h"
 
 /*
  * Default to not allowing changes to the list of files. The
@@ -233,6 +234,7 @@ static int mark_ce_flags(const char *path, int flag, int mark)
 		else
 			active_cache[pos]->ce_flags &= ~flag;
 		active_cache[pos]->ce_flags |= CE_UPDATE_IN_BASE;
+		mark_fsmonitor_invalid(&the_index, active_cache[pos]);
 		cache_tree_invalidate_path(&the_index, path);
 		active_cache_changed |= CE_ENTRY_CHANGED;
 		return 0;
diff --git a/cache.h b/cache.h
index a916bc79e3..f1c903e1b6 100644
--- a/cache.h
+++ b/cache.h
@@ -203,6 +203,7 @@ struct cache_entry {
 #define CE_ADDED             (1 << 19)
 
 #define CE_HASHED            (1 << 20)
+#define CE_FSMONITOR_VALID   (1 << 21)
 #define CE_WT_REMOVE         (1 << 22) /* remove in work directory */
 #define CE_CONFLICTED        (1 << 23)
 
@@ -326,6 +327,7 @@ static inline unsigned int canon_mode(unsigned int mode)
 #define CACHE_TREE_CHANGED	(1 << 5)
 #define SPLIT_INDEX_ORDERED	(1 << 6)
 #define UNTRACKED_CHANGED	(1 << 7)
+#define FSMONITOR_CHANGED	(1 << 8)
 
 struct split_index;
 struct untracked_cache;
@@ -344,6 +346,7 @@ struct index_state {
 	struct hashmap dir_hash;
 	unsigned char sha1[20];
 	struct untracked_cache *untracked;
+	uint64_t fsmonitor_last_update;
 };
 
 extern struct index_state the_index;
@@ -679,8 +682,10 @@ extern void *read_blob_data_from_index(const struct index_state *, const char *,
 #define CE_MATCH_IGNORE_MISSING		0x08
 /* enable stat refresh */
 #define CE_MATCH_REFRESH		0x10
-extern int ie_match_stat(const struct index_state *, const struct cache_entry *, struct stat *, unsigned int);
-extern int ie_modified(const struct index_state *, const struct cache_entry *, struct stat *, unsigned int);
+/* don't refresh_fsmonitor state or do stat comparison even if CE_FSMONITOR_VALID is true */
+#define CE_MATCH_IGNORE_FSMONITOR 0X20
+extern int ie_match_stat(struct index_state *, const struct cache_entry *, struct stat *, unsigned int);
+extern int ie_modified(struct index_state *, const struct cache_entry *, struct stat *, unsigned int);
 
 #define HASH_WRITE_OBJECT 1
 #define HASH_FORMAT_CHECK 2
@@ -773,6 +778,7 @@ extern int core_apply_sparse_checkout;
 extern int precomposed_unicode;
 extern int protect_hfs;
 extern int protect_ntfs;
+extern const char *core_fsmonitor;
 
 /*
  * Include broken refs in all ref iterations, which will
diff --git a/config.c b/config.c
index d0d8ce823a..ddda96e584 100644
--- a/config.c
+++ b/config.c
@@ -2165,6 +2165,20 @@ int git_config_get_max_percent_split_change(void)
 	return -1; /* default value */
 }
 
+int git_config_get_fsmonitor(void)
+{
+	if (git_config_get_pathname("core.fsmonitor", &core_fsmonitor))
+		core_fsmonitor = getenv("GIT_FSMONITOR_TEST");
+
+	if (core_fsmonitor && !*core_fsmonitor)
+		core_fsmonitor = NULL;
+
+	if (core_fsmonitor)
+		return 1;
+
+	return 0;
+}
+
 NORETURN
 void git_die_config_linenr(const char *key, const char *filename, int linenr)
 {
diff --git a/config.h b/config.h
index 97471b8873..c9fcf691ba 100644
--- a/config.h
+++ b/config.h
@@ -211,6 +211,7 @@ extern int git_config_get_pathname(const char *key, const char **dest);
 extern int git_config_get_untracked_cache(void);
 extern int git_config_get_split_index(void);
 extern int git_config_get_max_percent_split_change(void);
+extern int git_config_get_fsmonitor(void);
 
 /* This dies if the configured or default date is in the future */
 extern int git_config_get_expiry(const char *key, const char **output);
diff --git a/diff-lib.c b/diff-lib.c
index 2a52b07954..23c6d03ca9 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -12,6 +12,7 @@
 #include "refs.h"
 #include "submodule.h"
 #include "dir.h"
+#include "fsmonitor.h"
 
 /*
  * diff-files
@@ -228,6 +229,7 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 
 		if (!changed && !dirty_submodule) {
 			ce_mark_uptodate(ce);
+			mark_fsmonitor_valid(ce);
 			if (!DIFF_OPT_TST(&revs->diffopt, FIND_COPIES_HARDER))
 				continue;
 		}
diff --git a/dir.c b/dir.c
index 1c55dc3e36..ac9833daec 100644
--- a/dir.c
+++ b/dir.c
@@ -18,6 +18,7 @@
 #include "utf8.h"
 #include "varint.h"
 #include "ewah/ewok.h"
+#include "fsmonitor.h"
 
 /*
  * Tells read_directory_recursive how a file or directory should be treated.
@@ -1688,17 +1689,23 @@ static int valid_cached_dir(struct dir_struct *dir,
 	if (!untracked)
 		return 0;
 
-	if (stat(path->len ? path->buf : ".", &st)) {
-		invalidate_directory(dir->untracked, untracked);
-		memset(&untracked->stat_data, 0, sizeof(untracked->stat_data));
-		return 0;
-	}
-	if (!untracked->valid ||
-	    match_stat_data_racy(istate, &untracked->stat_data, &st)) {
-		if (untracked->valid)
+	/*
+	 * With fsmonitor, we can trust the untracked cache's valid field.
+	 */
+	refresh_fsmonitor(istate);
+	if (!(dir->untracked->use_fsmonitor && untracked->valid)) {
+		if (stat(path->len ? path->buf : ".", &st)) {
 			invalidate_directory(dir->untracked, untracked);
-		fill_stat_data(&untracked->stat_data, &st);
-		return 0;
+			memset(&untracked->stat_data, 0, sizeof(untracked->stat_data));
+			return 0;
+		}
+		if (!untracked->valid ||
+			match_stat_data_racy(istate, &untracked->stat_data, &st)) {
+			if (untracked->valid)
+				invalidate_directory(dir->untracked, untracked);
+			fill_stat_data(&untracked->stat_data, &st);
+			return 0;
+		}
 	}
 
 	if (untracked->check_only != !!check_only) {
diff --git a/dir.h b/dir.h
index e3717055d1..fab8fc1561 100644
--- a/dir.h
+++ b/dir.h
@@ -139,6 +139,8 @@ struct untracked_cache {
 	int gitignore_invalidated;
 	int dir_invalidated;
 	int dir_opened;
+	/* fsmonitor invalidation data */
+	unsigned int use_fsmonitor : 1;
 };
 
 struct dir_struct {
diff --git a/entry.c b/entry.c
index cb291aa88b..3a7b667373 100644
--- a/entry.c
+++ b/entry.c
@@ -4,6 +4,7 @@
 #include "streaming.h"
 #include "submodule.h"
 #include "progress.h"
+#include "fsmonitor.h"
 
 static void create_directories(const char *path, int path_len,
 			       const struct checkout *state)
@@ -357,6 +358,7 @@ static int write_entry(struct cache_entry *ce,
 			lstat(ce->name, &st);
 		fill_stat_cache_info(ce, &st);
 		ce->ce_flags |= CE_UPDATE_IN_BASE;
+		mark_fsmonitor_invalid(state->istate, ce);
 		state->istate->cache_changed |= CE_ENTRY_CHANGED;
 	}
 	return 0;
diff --git a/environment.c b/environment.c
index 3fd4b10845..d0b9fc64d4 100644
--- a/environment.c
+++ b/environment.c
@@ -76,6 +76,7 @@ int protect_hfs = PROTECT_HFS_DEFAULT;
 #define PROTECT_NTFS_DEFAULT 0
 #endif
 int protect_ntfs = PROTECT_NTFS_DEFAULT;
+const char *core_fsmonitor;
 
 /*
  * The character that begins a commented line in user-editable file
diff --git a/fsmonitor.c b/fsmonitor.c
new file mode 100644
index 0000000000..7c1540c054
--- /dev/null
+++ b/fsmonitor.c
@@ -0,0 +1,253 @@
+#include "cache.h"
+#include "config.h"
+#include "dir.h"
+#include "ewah/ewok.h"
+#include "fsmonitor.h"
+#include "run-command.h"
+#include "strbuf.h"
+
+#define INDEX_EXTENSION_VERSION	(1)
+#define HOOK_INTERFACE_VERSION	(1)
+
+struct trace_key trace_fsmonitor = TRACE_KEY_INIT(FSMONITOR);
+
+static void fsmonitor_ewah_callback(size_t pos, void *is)
+{
+	struct index_state *istate = (struct index_state *)is;
+	struct cache_entry *ce = istate->cache[pos];
+
+	ce->ce_flags &= ~CE_FSMONITOR_VALID;
+}
+
+int read_fsmonitor_extension(struct index_state *istate, const void *data,
+	unsigned long sz)
+{
+	const char *index = data;
+	uint32_t hdr_version;
+	uint32_t ewah_size;
+	struct ewah_bitmap *fsmonitor_dirty;
+	int i;
+	int ret;
+
+	if (sz < sizeof(uint32_t) + sizeof(uint64_t) + sizeof(uint32_t))
+		return error("corrupt fsmonitor extension (too short)");
+
+	hdr_version = get_be32(index);
+	index += sizeof(uint32_t);
+	if (hdr_version != INDEX_EXTENSION_VERSION)
+		return error("bad fsmonitor version %d", hdr_version);
+
+	istate->fsmonitor_last_update = get_be64(index);
+	index += sizeof(uint64_t);
+
+	ewah_size = get_be32(index);
+	index += sizeof(uint32_t);
+
+	fsmonitor_dirty = ewah_new();
+	ret = ewah_read_mmap(fsmonitor_dirty, index, ewah_size);
+	if (ret != ewah_size) {
+		ewah_free(fsmonitor_dirty);
+		return error("failed to parse ewah bitmap reading fsmonitor index extension");
+	}
+
+	if (git_config_get_fsmonitor()) {
+		/* Mark all entries valid */
+		for (i = 0; i < istate->cache_nr; i++)
+			istate->cache[i]->ce_flags |= CE_FSMONITOR_VALID;
+
+		/* Mark all previously saved entries as dirty */
+		ewah_each_bit(fsmonitor_dirty, fsmonitor_ewah_callback, istate);
+
+		/* Now mark the untracked cache for fsmonitor usage */
+		if (istate->untracked)
+			istate->untracked->use_fsmonitor = 1;
+	}
+	ewah_free(fsmonitor_dirty);
+
+	trace_printf_key(&trace_fsmonitor, "read fsmonitor extension successful");
+	return 0;
+}
+
+void write_fsmonitor_extension(struct strbuf *sb, struct index_state *istate)
+{
+	uint32_t hdr_version;
+	uint64_t tm;
+	struct ewah_bitmap *bitmap;
+	int i;
+	uint32_t ewah_start;
+	uint32_t ewah_size = 0;
+	int fixup = 0;
+
+	put_be32(&hdr_version, INDEX_EXTENSION_VERSION);
+	strbuf_add(sb, &hdr_version, sizeof(uint32_t));
+
+	put_be64(&tm, istate->fsmonitor_last_update);
+	strbuf_add(sb, &tm, sizeof(uint64_t));
+	fixup = sb->len;
+	strbuf_add(sb, &ewah_size, sizeof(uint32_t)); /* we'll fix this up later */
+
+	ewah_start = sb->len;
+	bitmap = ewah_new();
+	for (i = 0; i < istate->cache_nr; i++)
+		if (!(istate->cache[i]->ce_flags & CE_FSMONITOR_VALID))
+			ewah_set(bitmap, i);
+	ewah_serialize_strbuf(bitmap, sb);
+	ewah_free(bitmap);
+
+	/* fix up size field */
+	put_be32(&ewah_size, sb->len - ewah_start);
+	memcpy(sb->buf + fixup, &ewah_size, sizeof(uint32_t));
+
+	trace_printf_key(&trace_fsmonitor, "write fsmonitor extension successful");
+}
+
+/*
+ * Call the query-fsmonitor hook passing the time of the last saved results.
+ */
+static int query_fsmonitor(int version, uint64_t last_update, struct strbuf *query_result)
+{
+	struct child_process cp = CHILD_PROCESS_INIT;
+	char ver[64];
+	char date[64];
+	const char *argv[4];
+
+	if (!(argv[0] = core_fsmonitor))
+		return -1;
+
+	snprintf(ver, sizeof(version), "%d", version);
+	snprintf(date, sizeof(date), "%" PRIuMAX, (uintmax_t)last_update);
+	argv[1] = ver;
+	argv[2] = date;
+	argv[3] = NULL;
+	cp.argv = argv;
+	cp.use_shell = 1;
+
+	return capture_command(&cp, query_result, 1024);
+}
+
+static void fsmonitor_refresh_callback(struct index_state *istate, const char *name)
+{
+	int pos = index_name_pos(istate, name, strlen(name));
+
+	if (pos >= 0) {
+		struct cache_entry *ce = istate->cache[pos];
+		ce->ce_flags &= ~CE_FSMONITOR_VALID;
+	}
+
+	/*
+	 * Mark the untracked cache dirty even if it wasn't found in the index
+	 * as it could be a new untracked file.
+	 */
+	trace_printf_key(&trace_fsmonitor, "fsmonitor_refresh_callback '%s'", name);
+	untracked_cache_invalidate_path(istate, name);
+}
+
+void refresh_fsmonitor(struct index_state *istate)
+{
+	static int has_run_once = 0;
+	struct strbuf query_result = STRBUF_INIT;
+	int query_success = 0;
+	size_t bol; /* beginning of line */
+	uint64_t last_update;
+	char *buf;
+	int i;
+
+	if (!core_fsmonitor || has_run_once)
+		return;
+	has_run_once = 1;
+
+	trace_printf_key(&trace_fsmonitor, "refresh fsmonitor");
+	/*
+	 * This could be racy so save the date/time now and query_fsmonitor
+	 * should be inclusive to ensure we don't miss potential changes.
+	 */
+	last_update = getnanotime();
+
+	/*
+	 * If we have a last update time, call query_fsmonitor for the set of
+	 * changes since that time, else assume everything is possibly dirty
+	 * and check it all.
+	 */
+	if (istate->fsmonitor_last_update) {
+		query_success = !query_fsmonitor(HOOK_INTERFACE_VERSION,
+			istate->fsmonitor_last_update, &query_result);
+		trace_performance_since(last_update, "fsmonitor process '%s'", core_fsmonitor);
+		trace_printf_key(&trace_fsmonitor, "fsmonitor process '%s' returned %s",
+			core_fsmonitor, query_success ? "success" : "failure");
+	}
+
+	/* a fsmonitor process can return '/' to indicate all entries are invalid */
+	if (query_success && query_result.buf[0] != '/') {
+		/* Mark all entries returned by the monitor as dirty */
+		buf = query_result.buf;
+		bol = 0;
+		for (i = 0; i < query_result.len; i++) {
+			if (buf[i] != '\0')
+				continue;
+			fsmonitor_refresh_callback(istate, buf + bol);
+			bol = i + 1;
+		}
+		if (bol < query_result.len)
+			fsmonitor_refresh_callback(istate, buf + bol);
+	} else {
+		/* Mark all entries invalid */
+		for (i = 0; i < istate->cache_nr; i++)
+			istate->cache[i]->ce_flags &= ~CE_FSMONITOR_VALID;
+
+		if (istate->untracked)
+			istate->untracked->use_fsmonitor = 0;
+	}
+	strbuf_release(&query_result);
+
+	/* Now that we've updated istate, save the last_update time */
+	istate->fsmonitor_last_update = last_update;
+}
+
+void add_fsmonitor(struct index_state *istate)
+{
+	int i;
+
+	if (!istate->fsmonitor_last_update) {
+		trace_printf_key(&trace_fsmonitor, "add fsmonitor");
+		istate->cache_changed |= FSMONITOR_CHANGED;
+		istate->fsmonitor_last_update = getnanotime();
+
+		/* reset the fsmonitor state */
+		for (i = 0; i < istate->cache_nr; i++)
+			istate->cache[i]->ce_flags &= ~CE_FSMONITOR_VALID;
+
+		/* reset the untracked cache */
+		if (istate->untracked) {
+			add_untracked_cache(istate);
+			istate->untracked->use_fsmonitor = 1;
+		}
+
+		/* Update the fsmonitor state */
+		refresh_fsmonitor(istate);
+	}
+}
+
+void remove_fsmonitor(struct index_state *istate)
+{
+	if (istate->fsmonitor_last_update) {
+		trace_printf_key(&trace_fsmonitor, "remove fsmonitor");
+		istate->cache_changed |= FSMONITOR_CHANGED;
+		istate->fsmonitor_last_update = 0;
+	}
+}
+
+void tweak_fsmonitor(struct index_state *istate)
+{
+	switch (git_config_get_fsmonitor()) {
+	case -1: /* keep: do nothing */
+		break;
+	case 0: /* false */
+		remove_fsmonitor(istate);
+		break;
+	case 1: /* true */
+		add_fsmonitor(istate);
+		break;
+	default: /* unknown value: do nothing */
+		break;
+	}
+}
diff --git a/fsmonitor.h b/fsmonitor.h
new file mode 100644
index 0000000000..8eb6163455
--- /dev/null
+++ b/fsmonitor.h
@@ -0,0 +1,66 @@
+#ifndef FSMONITOR_H
+#define FSMONITOR_H
+
+extern struct trace_key trace_fsmonitor;
+
+/*
+ * Read the the fsmonitor index extension and (if configured) restore the
+ * CE_FSMONITOR_VALID state.
+ */
+extern int read_fsmonitor_extension(struct index_state *istate, const void *data, unsigned long sz);
+
+/*
+ * Write the CE_FSMONITOR_VALID state into the fsmonitor index extension.
+ */
+extern void write_fsmonitor_extension(struct strbuf *sb, struct index_state *istate);
+
+/*
+ * Add/remove the fsmonitor index extension
+ */
+extern void add_fsmonitor(struct index_state *istate);
+extern void remove_fsmonitor(struct index_state *istate);
+
+/*
+ * Add/remove the fsmonitor index extension as necessary based on the current
+ * core.fsmonitor setting.
+ */
+extern void tweak_fsmonitor(struct index_state *istate);
+
+/*
+ * Run the configured fsmonitor integration script and clear the
+ * CE_FSMONITOR_VALID bit for any files returned as dirty.  Also invalidate
+ * any corresponding untracked cache directory structures. Optimized to only
+ * run the first time it is called.
+ */
+extern void refresh_fsmonitor(struct index_state *istate);
+
+/*
+ * Set the given cache entries CE_FSMONITOR_VALID bit. This should be
+ * called any time the cache entry has been updated to reflect the
+ * current state of the file on disk.
+ */
+static inline void mark_fsmonitor_valid(struct cache_entry *ce)
+{
+	if (core_fsmonitor) {
+		ce->ce_flags |= CE_FSMONITOR_VALID;
+		trace_printf_key(&trace_fsmonitor, "mark_fsmonitor_clean '%s'", ce->name);
+	}
+}
+
+/*
+ * Clear the given cache entry's CE_FSMONITOR_VALID bit and invalidate
+ * any corresponding untracked cache directory structures. This should
+ * be called any time git creates or modifies a file that should
+ * trigger an lstat() or invalidate the untracked cache for the
+ * corresponding directory
+ */
+static inline void mark_fsmonitor_invalid(struct index_state *istate, struct cache_entry *ce)
+{
+	if (core_fsmonitor) {
+		ce->ce_flags &= ~CE_FSMONITOR_VALID;
+		untracked_cache_invalidate_path(istate, ce->name);
+		trace_printf_key(&trace_fsmonitor, "mark_fsmonitor_invalid '%s'", ce->name);
+	}
+}
+
+#endif
diff --git a/preload-index.c b/preload-index.c
index 75564c497a..2a83255e4e 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -4,6 +4,7 @@
 #include "cache.h"
 #include "pathspec.h"
 #include "dir.h"
+#include "fsmonitor.h"
 
 #ifdef NO_PTHREADS
 static void preload_index(struct index_state *index,
@@ -55,15 +56,18 @@ static void *preload_thread(void *_data)
 			continue;
 		if (ce_skip_worktree(ce))
 			continue;
+		if (ce->ce_flags & CE_FSMONITOR_VALID)
+			continue;
 		if (!ce_path_match(ce, &p->pathspec, NULL))
 			continue;
 		if (threaded_has_symlink_leading_path(&cache, ce->name, ce_namelen(ce)))
 			continue;
 		if (lstat(ce->name, &st))
 			continue;
-		if (ie_match_stat(index, ce, &st, CE_MATCH_RACY_IS_DIRTY))
+		if (ie_match_stat(index, ce, &st, CE_MATCH_RACY_IS_DIRTY|CE_MATCH_IGNORE_FSMONITOR))
 			continue;
 		ce_mark_uptodate(ce);
+		mark_fsmonitor_valid(ce);
 	} while (--nr > 0);
 	cache_def_clear(&cache);
 	return NULL;
diff --git a/read-cache.c b/read-cache.c
index 40da87ea71..05c0a33fdd 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -19,6 +19,7 @@
 #include "varint.h"
 #include "split-index.h"
 #include "utf8.h"
+#include "fsmonitor.h"
 
 /* Mask for the name length in ce_flags in the on-disk index */
 
@@ -38,11 +39,12 @@
 #define CACHE_EXT_RESOLVE_UNDO 0x52455543 /* "REUC" */
 #define CACHE_EXT_LINK 0x6c696e6b	  /* "link" */
 #define CACHE_EXT_UNTRACKED 0x554E5452	  /* "UNTR" */
+#define CACHE_EXT_FSMONITOR 0x46534D4E	  /* "FSMN" */
 
 /* changes that can be kept in $GIT_DIR/index (basically all extensions) */
 #define EXTMASK (RESOLVE_UNDO_CHANGED | CACHE_TREE_CHANGED | \
 		 CE_ENTRY_ADDED | CE_ENTRY_REMOVED | CE_ENTRY_CHANGED | \
-		 SPLIT_INDEX_ORDERED | UNTRACKED_CHANGED)
+		 SPLIT_INDEX_ORDERED | UNTRACKED_CHANGED | FSMONITOR_CHANGED)
 
 struct index_state the_index;
 static const char *alternate_index_output;
@@ -62,6 +64,7 @@ static void replace_index_entry(struct index_state *istate, int nr, struct cache
 	free(old);
 	set_index_entry(istate, nr, ce);
 	ce->ce_flags |= CE_UPDATE_IN_BASE;
+	mark_fsmonitor_invalid(istate, ce);
 	istate->cache_changed |= CE_ENTRY_CHANGED;
 }
 
@@ -150,8 +153,10 @@ void fill_stat_cache_info(struct cache_entry *ce, struct stat *st)
 	if (assume_unchanged)
 		ce->ce_flags |= CE_VALID;
 
-	if (S_ISREG(st->st_mode))
+	if (S_ISREG(st->st_mode)) {
 		ce_mark_uptodate(ce);
+		mark_fsmonitor_valid(ce);
+	}
 }
 
 static int ce_compare_data(const struct cache_entry *ce, struct stat *st)
@@ -300,7 +305,7 @@ int match_stat_data_racy(const struct index_state *istate,
 	return match_stat_data(sd, st);
 }
 
-int ie_match_stat(const struct index_state *istate,
+int ie_match_stat(struct index_state *istate,
 		  const struct cache_entry *ce, struct stat *st,
 		  unsigned int options)
 {
@@ -308,7 +313,10 @@ int ie_match_stat(const struct index_state *istate,
 	int ignore_valid = options & CE_MATCH_IGNORE_VALID;
 	int ignore_skip_worktree = options & CE_MATCH_IGNORE_SKIP_WORKTREE;
 	int assume_racy_is_modified = options & CE_MATCH_RACY_IS_DIRTY;
+	int ignore_fsmonitor = options & CE_MATCH_IGNORE_FSMONITOR;
 
+	if (!ignore_fsmonitor)
+		refresh_fsmonitor(istate);
 	/*
 	 * If it's marked as always valid in the index, it's
 	 * valid whatever the checked-out copy says.
@@ -319,6 +327,8 @@ int ie_match_stat(const struct index_state *istate,
 		return 0;
 	if (!ignore_valid && (ce->ce_flags & CE_VALID))
 		return 0;
+	if (!ignore_fsmonitor && (ce->ce_flags & CE_FSMONITOR_VALID))
+		return 0;
 
 	/*
 	 * Intent-to-add entries have not been added, so the index entry
@@ -356,7 +366,7 @@ int ie_match_stat(const struct index_state *istate,
 	return changed;
 }
 
-int ie_modified(const struct index_state *istate,
+int ie_modified(struct index_state *istate,
 		const struct cache_entry *ce,
 		struct stat *st, unsigned int options)
 {
@@ -777,6 +787,7 @@ int chmod_index_entry(struct index_state *istate, struct cache_entry *ce,
 	}
 	cache_tree_invalidate_path(istate, ce->name);
 	ce->ce_flags |= CE_UPDATE_IN_BASE;
+	mark_fsmonitor_invalid(istate, ce);
 	istate->cache_changed |= CE_ENTRY_CHANGED;
 
 	return 0;
@@ -1228,10 +1239,13 @@ static struct cache_entry *refresh_cache_ent(struct index_state *istate,
 	int ignore_valid = options & CE_MATCH_IGNORE_VALID;
 	int ignore_skip_worktree = options & CE_MATCH_IGNORE_SKIP_WORKTREE;
 	int ignore_missing = options & CE_MATCH_IGNORE_MISSING;
+	int ignore_fsmonitor = options & CE_MATCH_IGNORE_FSMONITOR;
 
 	if (!refresh || ce_uptodate(ce))
 		return ce;
 
+	if (!ignore_fsmonitor)
+		refresh_fsmonitor(istate);
 	/*
 	 * CE_VALID or CE_SKIP_WORKTREE means the user promised us
 	 * that the change to the work tree does not matter and told
@@ -1245,6 +1259,10 @@ static struct cache_entry *refresh_cache_ent(struct index_state *istate,
 		ce_mark_uptodate(ce);
 		return ce;
 	}
+	if (!ignore_fsmonitor && (ce->ce_flags & CE_FSMONITOR_VALID)) {
+		ce_mark_uptodate(ce);
+		return ce;
+	}
 
 	if (has_symlink_leading_path(ce->name, ce_namelen(ce))) {
 		if (ignore_missing)
@@ -1282,8 +1300,10 @@ static struct cache_entry *refresh_cache_ent(struct index_state *istate,
 			 * because CE_UPTODATE flag is in-core only;
 			 * we are not going to write this change out.
 			 */
-			if (!S_ISGITLINK(ce->ce_mode))
+			if (!S_ISGITLINK(ce->ce_mode)) {
 				ce_mark_uptodate(ce);
+				mark_fsmonitor_valid(ce);
+			}
 			return ce;
 		}
 	}
@@ -1391,6 +1411,7 @@ int refresh_index(struct index_state *istate, unsigned int flags,
 				 */
 				ce->ce_flags &= ~CE_VALID;
 				ce->ce_flags |= CE_UPDATE_IN_BASE;
+				mark_fsmonitor_invalid(istate, ce);
 				istate->cache_changed |= CE_ENTRY_CHANGED;
 			}
 			if (quiet)
@@ -1550,6 +1571,9 @@ static int read_index_extension(struct index_state *istate,
 	case CACHE_EXT_UNTRACKED:
 		istate->untracked = read_untracked_extension(data, sz);
 		break;
+	case CACHE_EXT_FSMONITOR:
+		read_fsmonitor_extension(istate, data, sz);
+		break;
 	default:
 		if (*ext < 'A' || 'Z' < *ext)
 			return error("index uses %.4s extension, which we do not understand",
@@ -1722,6 +1746,7 @@ static void post_read_index_from(struct index_state *istate)
 	check_ce_order(istate);
 	tweak_untracked_cache(istate);
 	tweak_split_index(istate);
+	tweak_fsmonitor(istate);
 }
 
 /* remember to discard_cache() before reading a different cache! */
@@ -2306,6 +2331,16 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 		if (err)
 			return -1;
 	}
+	if (!strip_extensions && istate->fsmonitor_last_update) {
+		struct strbuf sb = STRBUF_INIT;
+
+		write_fsmonitor_extension(&sb, istate);
+		err = write_index_ext_header(&c, newfd, CACHE_EXT_FSMONITOR, sb.len) < 0
+			|| ce_write(&c, newfd, sb.buf, sb.len) < 0;
+		strbuf_release(&sb);
+		if (err)
+			return -1;
+	}
 
 	if (ce_flush(&c, newfd, istate->sha1))
 		return -1;
diff --git a/submodule.c b/submodule.c
index 3cea8221e0..8a931a1aaa 100644
--- a/submodule.c
+++ b/submodule.c
@@ -62,7 +62,7 @@ int is_staging_gitmodules_ok(const struct index_state *istate)
 	if ((pos >= 0) && (pos < istate->cache_nr)) {
 		struct stat st;
 		if (lstat(GITMODULES_FILE, &st) == 0 &&
-		    ce_match_stat(istate->cache[pos], &st, 0) & DATA_CHANGED)
+		    ce_match_stat(istate->cache[pos], &st, CE_MATCH_IGNORE_FSMONITOR) & DATA_CHANGED)
 			return 0;
 	}
 
diff --git a/unpack-trees.c b/unpack-trees.c
index 71b70ccb12..1f5d371636 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -14,6 +14,7 @@
 #include "dir.h"
 #include "submodule.h"
 #include "submodule-config.h"
+#include "fsmonitor.h"
 
 /*
  * Error messages expected by scripts out of plumbing commands such as
@@ -408,6 +409,7 @@ static int apply_sparse_checkout(struct index_state *istate,
 		ce->ce_flags &= ~CE_SKIP_WORKTREE;
 	if (was_skip_worktree != ce_skip_worktree(ce)) {
 		ce->ce_flags |= CE_UPDATE_IN_BASE;
+		mark_fsmonitor_invalid(istate, ce);
 		istate->cache_changed |= CE_ENTRY_CHANGED;
 	}
 
-- 
2.14.1.549.g6ff7ed0467


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v8 05/12] fsmonitor: add documentation for the fsmonitor extension.
  2017-09-22 16:35     ` [PATCH v8 00/12] Fast git status via a file system watcher Ben Peart
                         ` (3 preceding siblings ...)
  2017-09-22 16:35       ` [PATCH v8 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
@ 2017-09-22 16:35       ` Ben Peart
  2017-09-22 16:35       ` [PATCH v8 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit Ben Peart
                         ` (7 subsequent siblings)
  12 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-22 16:35 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

This includes the core.fsmonitor setting, the fsmonitor integration hook,
and the fsmonitor index extension.

Also add documentation for the new fsmonitor options to ls-files and
update-index.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 Documentation/config.txt                 |  7 +++++
 Documentation/git-ls-files.txt           |  7 ++++-
 Documentation/git-update-index.txt       | 45 ++++++++++++++++++++++++++++++++
 Documentation/githooks.txt               | 28 ++++++++++++++++++++
 Documentation/technical/index-format.txt | 19 ++++++++++++++
 5 files changed, 105 insertions(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index dc4e3f58a2..db52645cb4 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -413,6 +413,13 @@ core.protectNTFS::
 	8.3 "short" names.
 	Defaults to `true` on Windows, and `false` elsewhere.
 
+core.fsmonitor::
+	If set, the value of this variable is used as a command which
+	will identify all files that may have changed since the
+	requested date/time. This information is used to speed up git by
+	avoiding unnecessary processing of files that have not changed.
+	See the "fsmonitor-watchman" section of linkgit:githooks[5].
+
 core.trustctime::
 	If false, the ctime differences between the index and the
 	working tree are ignored; useful when the inode change time
diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index d153c17e06..3ac3e3a77d 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -9,7 +9,7 @@ git-ls-files - Show information about files in the index and the working tree
 SYNOPSIS
 --------
 [verse]
-'git ls-files' [-z] [-t] [-v]
+'git ls-files' [-z] [-t] [-v] [-f]
 		(--[cached|deleted|others|ignored|stage|unmerged|killed|modified])*
 		(-[c|d|o|i|s|u|k|m])*
 		[--eol]
@@ -133,6 +133,11 @@ a space) at the start of each line:
 	that are marked as 'assume unchanged' (see
 	linkgit:git-update-index[1]).
 
+-f::
+	Similar to `-t`, but use lowercase letters for files
+	that are marked as 'fsmonitor valid' (see
+	linkgit:git-update-index[1]).
+
 --full-name::
 	When run from a subdirectory, the command usually
 	outputs paths relative to the current directory.  This
diff --git a/Documentation/git-update-index.txt b/Documentation/git-update-index.txt
index e19eba62cd..7c2f880a22 100644
--- a/Documentation/git-update-index.txt
+++ b/Documentation/git-update-index.txt
@@ -16,9 +16,11 @@ SYNOPSIS
 	     [--chmod=(+|-)x]
 	     [--[no-]assume-unchanged]
 	     [--[no-]skip-worktree]
+	     [--[no-]fsmonitor-valid]
 	     [--ignore-submodules]
 	     [--[no-]split-index]
 	     [--[no-|test-|force-]untracked-cache]
+	     [--[no-]fsmonitor]
 	     [--really-refresh] [--unresolve] [--again | -g]
 	     [--info-only] [--index-info]
 	     [-z] [--stdin] [--index-version <n>]
@@ -111,6 +113,12 @@ you will need to handle the situation manually.
 	set and unset the "skip-worktree" bit for the paths. See
 	section "Skip-worktree bit" below for more information.
 
+--[no-]fsmonitor-valid::
+	When one of these flags is specified, the object name recorded
+	for the paths are not updated. Instead, these options
+	set and unset the "fsmonitor valid" bit for the paths. See
+	section "File System Monitor" below for more information.
+
 -g::
 --again::
 	Runs 'git update-index' itself on the paths whose index
@@ -201,6 +209,15 @@ will remove the intended effect of the option.
 	`--untracked-cache` used to imply `--test-untracked-cache` but
 	this option would enable the extension unconditionally.
 
+--fsmonitor::
+--no-fsmonitor::
+	Enable or disable files system monitor feature. These options
+	take effect whatever the value of the `core.fsmonitor`
+	configuration variable (see linkgit:git-config[1]). But a warning
+	is emitted when the change goes against the configured value, as
+	the configured value will take effect next time the index is
+	read and this will remove the intended effect of the option.
+
 \--::
 	Do not interpret any more arguments as options.
 
@@ -447,6 +464,34 @@ command reads the index; while when `--[no-|force-]untracked-cache`
 are used, the untracked cache is immediately added to or removed from
 the index.
 
+File System Monitor
+-------------------
+
+This feature is intended to speed up git operations for repos that have
+large working directories.
+
+It enables git to work together with a file system monitor (see the
+"fsmonitor-watchman" section of linkgit:githooks[5]) that can
+inform it as to what files have been modified. This enables git to avoid
+having to lstat() every file to find modified files.
+
+When used in conjunction with the untracked cache, it can further improve
+performance by avoiding the cost of scanning the entire working directory
+looking for new files.
+
+If you want to enable (or disable) this feature, it is easier to use
+the `core.fsmonitor` configuration variable (see
+linkgit:git-config[1]) than using the `--fsmonitor` option to
+`git update-index` in each repository, especially if you want to do so
+across all repositories you use, because you can set the configuration
+variable to `true` (or `false`) in your `$HOME/.gitconfig` just once
+and have it affect all repositories you touch.
+
+When the `core.fsmonitor` configuration variable is changed, the
+file system monitor is added to or removed from the index the next time
+a command reads the index. When `--[no-]fsmonitor` are used, the file
+system monitor is immediately added to or removed from the index.
+
 Configuration
 -------------
 
diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index 1bb4f92d4d..ae60559cd9 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -455,6 +455,34 @@ the name of the file that holds the e-mail to be sent.  Exiting with a
 non-zero status causes 'git send-email' to abort before sending any
 e-mails.
 
+fsmonitor-watchman
+~~~~~~~~~~~~~~~~~~
+
+This hook is invoked when the configuration option core.fsmonitor is
+set to .git/hooks/fsmonitor-watchman.  It takes two arguments, a version
+(currently 1) and the time in elapsed nanoseconds since midnight,
+January 1, 1970.
+
+The hook should output to stdout the list of all files in the working
+directory that may have changed since the requested time.  The logic
+should be inclusive so that it does not miss any potential changes.
+The paths should be relative to the root of the working directory
+and be separated by a single NUL.
+
+It is OK to include files which have not actually changed.  All changes
+including newly-created and deleted files should be included. When
+files are renamed, both the old and the new name should be included.
+
+Git will limit what files it checks for changes as well as which
+directories are checked for untracked files based on the path names
+given.
+
+An optimized way to tell git "all files have changed" is to return
+the filename '/'.
+
+The exit status determines whether git will use the data from the
+hook to limit its search.  On error, it will fall back to verifying
+all files and folders.
 
 GIT
 ---
diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
index ade0b0c445..db3572626b 100644
--- a/Documentation/technical/index-format.txt
+++ b/Documentation/technical/index-format.txt
@@ -295,3 +295,22 @@ The remaining data of each directory block is grouped by type:
     in the previous ewah bitmap.
 
   - One NUL.
+
+== File System Monitor cache
+
+  The file system monitor cache tracks files for which the core.fsmonitor
+  hook has told us about changes.  The signature for this extension is
+  { 'F', 'S', 'M', 'N' }.
+
+  The extension starts with
+
+  - 32-bit version number: the current supported version is 1.
+
+  - 64-bit time: the extension data reflects all changes through the given
+	time which is stored as the nanoseconds elapsed since midnight,
+	January 1, 1970.
+
+  - 32-bit bitmap size: the size of the CE_FSMONITOR_VALID bitmap.
+
+  - An ewah bitmap, the n-th bit indicates whether the n-th index entry
+    is not CE_FSMONITOR_VALID.
-- 
2.14.1.549.g6ff7ed0467


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v8 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit
  2017-09-22 16:35     ` [PATCH v8 00/12] Fast git status via a file system watcher Ben Peart
                         ` (4 preceding siblings ...)
  2017-09-22 16:35       ` [PATCH v8 05/12] fsmonitor: add documentation for the fsmonitor extension Ben Peart
@ 2017-09-22 16:35       ` Ben Peart
  2017-09-22 16:35       ` [PATCH v8 07/12] update-index: add fsmonitor support to update-index Ben Peart
                         ` (6 subsequent siblings)
  12 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-22 16:35 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

Add a new command line option (-f) to ls-files to have it use lowercase
letters for 'fsmonitor valid' files

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 builtin/ls-files.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index e1339e6d17..313962a0c1 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -31,6 +31,7 @@ static int show_resolve_undo;
 static int show_modified;
 static int show_killed;
 static int show_valid_bit;
+static int show_fsmonitor_bit;
 static int line_terminator = '\n';
 static int debug_mode;
 static int show_eol;
@@ -86,7 +87,8 @@ static const char *get_tag(const struct cache_entry *ce, const char *tag)
 {
 	static char alttag[4];
 
-	if (tag && *tag && show_valid_bit && (ce->ce_flags & CE_VALID)) {
+	if (tag && *tag && ((show_valid_bit && (ce->ce_flags & CE_VALID)) ||
+		(show_fsmonitor_bit && (ce->ce_flags & CE_FSMONITOR_VALID)))) {
 		memcpy(alttag, tag, 3);
 
 		if (isalpha(tag[0])) {
@@ -515,6 +517,8 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 			N_("identify the file status with tags")),
 		OPT_BOOL('v', NULL, &show_valid_bit,
 			N_("use lowercase letters for 'assume unchanged' files")),
+		OPT_BOOL('f', NULL, &show_fsmonitor_bit,
+			N_("use lowercase letters for 'fsmonitor clean' files")),
 		OPT_BOOL('c', "cached", &show_cached,
 			N_("show cached files in the output (default)")),
 		OPT_BOOL('d', "deleted", &show_deleted,
@@ -584,7 +588,7 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 	for (i = 0; i < exclude_list.nr; i++) {
 		add_exclude(exclude_list.items[i].string, "", 0, el, --exclude_args);
 	}
-	if (show_tag || show_valid_bit) {
+	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
 		tag_cached = "H ";
 		tag_unmerged = "M ";
 		tag_removed = "R ";
-- 
2.14.1.549.g6ff7ed0467


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v8 07/12] update-index: add fsmonitor support to update-index
  2017-09-22 16:35     ` [PATCH v8 00/12] Fast git status via a file system watcher Ben Peart
                         ` (5 preceding siblings ...)
  2017-09-22 16:35       ` [PATCH v8 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit Ben Peart
@ 2017-09-22 16:35       ` Ben Peart
  2017-09-22 16:35       ` [PATCH v8 08/12] fsmonitor: add a test tool to dump the index extension Ben Peart
                         ` (5 subsequent siblings)
  12 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-22 16:35 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

Add support in update-index to manually add/remove the fsmonitor
extension via --[no-]fsmonitor flags.

Add support in update-index to manually set/clear the fsmonitor
valid bit via --[no-]fsmonitor-valid flags.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 builtin/update-index.c | 33 ++++++++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 6f39ee9274..41618db098 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -33,6 +33,7 @@ static int force_remove;
 static int verbose;
 static int mark_valid_only;
 static int mark_skip_worktree_only;
+static int mark_fsmonitor_only;
 #define MARK_FLAG 1
 #define UNMARK_FLAG 2
 static struct strbuf mtime_dir = STRBUF_INIT;
@@ -229,12 +230,12 @@ static int mark_ce_flags(const char *path, int flag, int mark)
 	int namelen = strlen(path);
 	int pos = cache_name_pos(path, namelen);
 	if (0 <= pos) {
+		mark_fsmonitor_invalid(&the_index, active_cache[pos]);
 		if (mark)
 			active_cache[pos]->ce_flags |= flag;
 		else
 			active_cache[pos]->ce_flags &= ~flag;
 		active_cache[pos]->ce_flags |= CE_UPDATE_IN_BASE;
-		mark_fsmonitor_invalid(&the_index, active_cache[pos]);
 		cache_tree_invalidate_path(&the_index, path);
 		active_cache_changed |= CE_ENTRY_CHANGED;
 		return 0;
@@ -460,6 +461,11 @@ static void update_one(const char *path)
 			die("Unable to mark file %s", path);
 		return;
 	}
+	if (mark_fsmonitor_only) {
+		if (mark_ce_flags(path, CE_FSMONITOR_VALID, mark_fsmonitor_only == MARK_FLAG))
+			die("Unable to mark file %s", path);
+		return;
+	}
 
 	if (force_remove) {
 		if (remove_file_from_cache(path))
@@ -918,6 +924,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 	int lock_error = 0;
 	int split_index = -1;
 	int force_write = 0;
+	int fsmonitor = -1;
 	struct lock_file *lock_file;
 	struct parse_opt_ctx_t ctx;
 	strbuf_getline_fn getline_fn;
@@ -1011,6 +1018,14 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 			    N_("enable untracked cache without testing the filesystem"), UC_FORCE),
 		OPT_SET_INT(0, "force-write-index", &force_write,
 			N_("write out the index even if is not flagged as changed"), 1),
+		OPT_BOOL(0, "fsmonitor", &fsmonitor,
+			N_("enable or disable file system monitor")),
+		{OPTION_SET_INT, 0, "fsmonitor-valid", &mark_fsmonitor_only, NULL,
+			N_("mark files as fsmonitor valid"),
+			PARSE_OPT_NOARG | PARSE_OPT_NONEG, NULL, MARK_FLAG},
+		{OPTION_SET_INT, 0, "no-fsmonitor-valid", &mark_fsmonitor_only, NULL,
+			N_("clear fsmonitor valid bit"),
+			PARSE_OPT_NOARG | PARSE_OPT_NONEG, NULL, UNMARK_FLAG},
 		OPT_END()
 	};
 
@@ -1152,6 +1167,22 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 		die("BUG: bad untracked_cache value: %d", untracked_cache);
 	}
 
+	if (fsmonitor > 0) {
+		if (git_config_get_fsmonitor() == 0)
+			warning(_("core.fsmonitor is unset; "
+				"set it if you really want to "
+				"enable fsmonitor"));
+		add_fsmonitor(&the_index);
+		report(_("fsmonitor enabled"));
+	} else if (!fsmonitor) {
+		if (git_config_get_fsmonitor() == 1)
+			warning(_("core.fsmonitor is set; "
+				"remove it if you really want to "
+				"disable fsmonitor"));
+		remove_fsmonitor(&the_index);
+		report(_("fsmonitor disabled"));
+	}
+
 	if (active_cache_changed || force_write) {
 		if (newfd < 0) {
 			if (refresh_args.flags & REFRESH_QUIET)
-- 
2.14.1.549.g6ff7ed0467


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v8 08/12] fsmonitor: add a test tool to dump the index extension
  2017-09-22 16:35     ` [PATCH v8 00/12] Fast git status via a file system watcher Ben Peart
                         ` (6 preceding siblings ...)
  2017-09-22 16:35       ` [PATCH v8 07/12] update-index: add fsmonitor support to update-index Ben Peart
@ 2017-09-22 16:35       ` Ben Peart
  2017-09-22 23:37         ` Martin Ågren
  2017-09-22 16:35       ` [PATCH v8 09/12] split-index: disable the fsmonitor extension when running the split index test Ben Peart
                         ` (4 subsequent siblings)
  12 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-09-22 16:35 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

Add a test utility (test-dump-fsmonitor) that will dump the fsmonitor
index extension.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 Makefile                       |  1 +
 t/helper/test-dump-fsmonitor.c | 21 +++++++++++++++++++++
 2 files changed, 22 insertions(+)
 create mode 100644 t/helper/test-dump-fsmonitor.c

diff --git a/Makefile b/Makefile
index 9d6ec9c1e9..d970cd00e9 100644
--- a/Makefile
+++ b/Makefile
@@ -639,6 +639,7 @@ TEST_PROGRAMS_NEED_X += test-config
 TEST_PROGRAMS_NEED_X += test-date
 TEST_PROGRAMS_NEED_X += test-delta
 TEST_PROGRAMS_NEED_X += test-dump-cache-tree
+TEST_PROGRAMS_NEED_X += test-dump-fsmonitor
 TEST_PROGRAMS_NEED_X += test-dump-split-index
 TEST_PROGRAMS_NEED_X += test-dump-untracked-cache
 TEST_PROGRAMS_NEED_X += test-fake-ssh
diff --git a/t/helper/test-dump-fsmonitor.c b/t/helper/test-dump-fsmonitor.c
new file mode 100644
index 0000000000..ad452707e8
--- /dev/null
+++ b/t/helper/test-dump-fsmonitor.c
@@ -0,0 +1,21 @@
+#include "cache.h"
+
+int cmd_main(int ac, const char **av)
+{
+	struct index_state *istate = &the_index;
+	int i;
+
+	setup_git_directory();
+	if (do_read_index(istate, get_index_file(), 0) < 0)
+		die("unable to read index file");
+	if (!istate->fsmonitor_last_update) {
+		printf("no fsmonitor\n");
+		return 0;
+	}
+	printf("fsmonitor last update %"PRIuMAX"\n", (uintmax_t)istate->fsmonitor_last_update);
+
+	for (i = 0; i < istate->cache_nr; i++)
+		printf((istate->cache[i]->ce_flags & CE_FSMONITOR_VALID) ? "+" : "-");
+
+	return 0;
+}
-- 
2.14.1.549.g6ff7ed0467


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v8 09/12] split-index: disable the fsmonitor extension when running the split index test
  2017-09-22 16:35     ` [PATCH v8 00/12] Fast git status via a file system watcher Ben Peart
                         ` (7 preceding siblings ...)
  2017-09-22 16:35       ` [PATCH v8 08/12] fsmonitor: add a test tool to dump the index extension Ben Peart
@ 2017-09-22 16:35       ` Ben Peart
  2017-09-22 16:35       ` [PATCH v8 10/12] fsmonitor: add test cases for fsmonitor extension Ben Peart
                         ` (3 subsequent siblings)
  12 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-22 16:35 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

The split index test t1700-split-index.sh has hard coded SHA values for
the index.  Currently it supports index V4 and V3 but assumes there are
no index extensions loaded.

When manually forcing the fsmonitor extension to be turned on when
running the test suite, the SHA values no longer match which causes the
test to fail.

The potential matrix of index extensions and index versions can is quite
large so instead temporarily disable the extension before attempting to
run the test until the underlying problem of hard coded SHA values is fixed.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 t/t1700-split-index.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index 22f69a410b..af9b847761 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -6,6 +6,7 @@ test_description='split index mode tests'
 
 # We need total control of index splitting here
 sane_unset GIT_TEST_SPLIT_INDEX
+sane_unset GIT_FSMONITOR_TEST
 
 test_expect_success 'enable split index' '
 	git config splitIndex.maxPercentChange 100 &&
-- 
2.14.1.549.g6ff7ed0467


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v8 10/12] fsmonitor: add test cases for fsmonitor extension
  2017-09-22 16:35     ` [PATCH v8 00/12] Fast git status via a file system watcher Ben Peart
                         ` (8 preceding siblings ...)
  2017-09-22 16:35       ` [PATCH v8 09/12] split-index: disable the fsmonitor extension when running the split index test Ben Peart
@ 2017-09-22 16:35       ` Ben Peart
  2017-09-22 16:35       ` [PATCH v8 11/12] fsmonitor: add a sample integration script for Watchman Ben Peart
                         ` (2 subsequent siblings)
  12 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-22 16:35 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

Test the ability to add/remove the fsmonitor index extension via
update-index.

Test that dirty files returned from the integration script are properly
represented in the index extension and verify that ls-files correctly
reports their state.

Test that ensure status results are correct when using the new fsmonitor
extension.  Test untracked, modified, and new files by ensuring the
results are identical to when not using the extension.

Test that if the fsmonitor extension doesn't tell git about a change, it
doesn't discover it on its own.  This ensures git is honoring the
extension and that we get the performance benefits desired.

Three test integration scripts are provided:

fsmonitor-all - marks all files as dirty
fsmonitor-none - marks no files as dirty
fsmonitor-watchman - integrates with Watchman with debug logging

To run tests in the test suite while utilizing fsmonitor:

First copy t/t7519/fsmonitor-all to a location in your path and then set
GIT_FORCE_PRELOAD_TEST=true and GIT_FSMONITOR_TEST=fsmonitor-all and run
your tests.

Note: currently when using the test script fsmonitor-watchman on
Windows, many tests fail due to a reported but not yet fixed bug in
Watchman where it holds on to handles for directories and files which
prevents the test directory from being cleaned up properly.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 t/t7519-status-fsmonitor.sh | 304 ++++++++++++++++++++++++++++++++++++++++++++
 t/t7519/fsmonitor-all       |  24 ++++
 t/t7519/fsmonitor-none      |  22 ++++
 t/t7519/fsmonitor-watchman  | 140 ++++++++++++++++++++
 4 files changed, 490 insertions(+)
 create mode 100755 t/t7519-status-fsmonitor.sh
 create mode 100755 t/t7519/fsmonitor-all
 create mode 100755 t/t7519/fsmonitor-none
 create mode 100755 t/t7519/fsmonitor-watchman

diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh
new file mode 100755
index 0000000000..c6df85af5e
--- /dev/null
+++ b/t/t7519-status-fsmonitor.sh
@@ -0,0 +1,304 @@
+#!/bin/sh
+
+test_description='git status with file system watcher'
+
+. ./test-lib.sh
+
+#
+# To run the entire git test suite using fsmonitor:
+#
+# copy t/t7519/fsmonitor-all to a location in your path and then set
+# GIT_FSMONITOR_TEST=fsmonitor-all and run your tests.
+#
+
+# Note, after "git reset --hard HEAD" no extensions exist other than 'TREE'
+# "git update-index --fsmonitor" can be used to get the extension written
+# before testing the results.
+
+clean_repo () {
+	git reset --hard HEAD &&
+	git clean -fd
+}
+
+dirty_repo () {
+	: >untracked &&
+	: >dir1/untracked &&
+	: >dir2/untracked &&
+	echo 1 >modified &&
+	echo 2 >dir1/modified &&
+	echo 3 >dir2/modified &&
+	echo 4 >new &&
+	echo 5 >dir1/new &&
+	echo 6 >dir2/new
+}
+
+write_integration_script () {
+	write_script .git/hooks/fsmonitor-test<<-\EOF
+	if test "$#" -ne 2
+	then
+		echo "$0: exactly 2 arguments expected"
+		exit 2
+	fi
+	if test "$1" != 1
+	then
+		echo "Unsupported core.fsmonitor hook version." >&2
+		exit 1
+	fi
+	printf "untracked\0"
+	printf "dir1/untracked\0"
+	printf "dir2/untracked\0"
+	printf "modified\0"
+	printf "dir1/modified\0"
+	printf "dir2/modified\0"
+	printf "new\0"
+	printf "dir1/new\0"
+	printf "dir2/new\0"
+	EOF
+}
+
+test_lazy_prereq UNTRACKED_CACHE '
+	{ git update-index --test-untracked-cache; ret=$?; } &&
+	test $ret -ne 1
+'
+
+test_expect_success 'setup' '
+	mkdir -p .git/hooks &&
+	: >tracked &&
+	: >modified &&
+	mkdir dir1 &&
+	: >dir1/tracked &&
+	: >dir1/modified &&
+	mkdir dir2 &&
+	: >dir2/tracked &&
+	: >dir2/modified &&
+	git -c core.fsmonitor= add . &&
+	git -c core.fsmonitor= commit -m initial &&
+	git config core.fsmonitor .git/hooks/fsmonitor-test &&
+	cat >.gitignore <<-\EOF
+	.gitignore
+	expect*
+	actual*
+	marker*
+	EOF
+'
+
+# test that the fsmonitor extension is off by default
+test_expect_success 'fsmonitor extension is off by default' '
+	test-dump-fsmonitor >actual &&
+	grep "^no fsmonitor" actual
+'
+
+# test that "update-index --fsmonitor" adds the fsmonitor extension
+test_expect_success 'update-index --fsmonitor" adds the fsmonitor extension' '
+	git update-index --fsmonitor &&
+	test-dump-fsmonitor >actual &&
+	grep "^fsmonitor last update" actual
+'
+
+# test that "update-index --no-fsmonitor" removes the fsmonitor extension
+test_expect_success 'update-index --no-fsmonitor" removes the fsmonitor extension' '
+	git update-index --no-fsmonitor &&
+	test-dump-fsmonitor >actual &&
+	grep "^no fsmonitor" actual
+'
+
+cat >expect <<EOF &&
+h dir1/modified
+H dir1/tracked
+h dir2/modified
+H dir2/tracked
+h modified
+H tracked
+EOF
+
+# test that "update-index --fsmonitor-valid" sets the fsmonitor valid bit
+test_expect_success 'update-index --fsmonitor-valid" sets the fsmonitor valid bit' '
+	git update-index --fsmonitor &&
+	git update-index --fsmonitor-valid dir1/modified &&
+	git update-index --fsmonitor-valid dir2/modified &&
+	git update-index --fsmonitor-valid modified &&
+	git ls-files -f >actual &&
+	test_cmp expect actual
+'
+
+cat >expect <<EOF &&
+H dir1/modified
+H dir1/tracked
+H dir2/modified
+H dir2/tracked
+H modified
+H tracked
+EOF
+
+# test that "update-index --no-fsmonitor-valid" clears the fsmonitor valid bit
+test_expect_success 'update-index --no-fsmonitor-valid" clears the fsmonitor valid bit' '
+	git update-index --no-fsmonitor-valid dir1/modified &&
+	git update-index --no-fsmonitor-valid dir2/modified &&
+	git update-index --no-fsmonitor-valid modified &&
+	git ls-files -f >actual &&
+	test_cmp expect actual
+'
+
+cat >expect <<EOF &&
+H dir1/modified
+H dir1/tracked
+H dir2/modified
+H dir2/tracked
+H modified
+H tracked
+EOF
+
+# test that all files returned by the script get flagged as invalid
+test_expect_success 'all files returned by integration script get flagged as invalid' '
+	write_integration_script &&
+	dirty_repo &&
+	git update-index --fsmonitor &&
+	git ls-files -f >actual &&
+	test_cmp expect actual
+'
+
+cat >expect <<EOF &&
+H dir1/modified
+h dir1/new
+H dir1/tracked
+H dir2/modified
+h dir2/new
+H dir2/tracked
+H modified
+h new
+H tracked
+EOF
+
+# test that newly added files are marked valid
+test_expect_success 'newly added files are marked valid' '
+	git add new &&
+	git add dir1/new &&
+	git add dir2/new &&
+	git ls-files -f >actual &&
+	test_cmp expect actual
+'
+
+cat >expect <<EOF &&
+H dir1/modified
+h dir1/new
+h dir1/tracked
+H dir2/modified
+h dir2/new
+h dir2/tracked
+H modified
+h new
+h tracked
+EOF
+
+# test that all unmodified files get marked valid
+test_expect_success 'all unmodified files get marked valid' '
+	# modified files result in update-index returning 1
+	test_must_fail git update-index --refresh --force-write-index &&
+	git ls-files -f >actual &&
+	test_cmp expect actual
+'
+
+cat >expect <<EOF &&
+H dir1/modified
+h dir1/tracked
+h dir2/modified
+h dir2/tracked
+h modified
+h tracked
+EOF
+
+# test that *only* files returned by the integration script get flagged as invalid
+test_expect_success '*only* files returned by the integration script get flagged as invalid' '
+	write_script .git/hooks/fsmonitor-test<<-\EOF &&
+	printf "dir1/modified\0"
+	EOF
+	clean_repo &&
+	git update-index --refresh --force-write-index &&
+	echo 1 >modified &&
+	echo 2 >dir1/modified &&
+	echo 3 >dir2/modified &&
+	test_must_fail git update-index --refresh --force-write-index &&
+	git ls-files -f >actual &&
+	test_cmp expect actual
+'
+
+# Ensure commands that call refresh_index() to move the index back in time
+# properly invalidate the fsmonitor cache
+test_expect_success 'refresh_index() invalidates fsmonitor cache' '
+	write_script .git/hooks/fsmonitor-test<<-\EOF &&
+	EOF
+	clean_repo &&
+	dirty_repo &&
+	git add . &&
+	git commit -m "to reset" &&
+	git reset HEAD~1 &&
+	git status >actual &&
+	git -c core.fsmonitor= status >expect &&
+	test_i18ncmp expect actual
+'
+
+# test fsmonitor with and without preloadIndex
+preload_values="false true"
+for preload_val in $preload_values
+do
+	test_expect_success "setup preloadIndex to $preload_val" '
+		git config core.preloadIndex $preload_val &&
+		if test $preload_val = true
+		then
+			GIT_FORCE_PRELOAD_TEST=$preload_val; export GIT_FORCE_PRELOAD_TEST
+		else
+			unset GIT_FORCE_PRELOAD_TEST
+		fi
+	'
+
+	# test fsmonitor with and without the untracked cache (if available)
+	uc_values="false"
+	test_have_prereq UNTRACKED_CACHE && uc_values="false true"
+	for uc_val in $uc_values
+	do
+		test_expect_success "setup untracked cache to $uc_val" '
+			git config core.untrackedcache $uc_val
+		'
+
+		# Status is well tested elsewhere so we'll just ensure that the results are
+		# the same when using core.fsmonitor.
+		test_expect_success 'compare status with and without fsmonitor' '
+			write_integration_script &&
+			clean_repo &&
+			dirty_repo &&
+			git add new &&
+			git add dir1/new &&
+			git add dir2/new &&
+			git status >actual &&
+			git -c core.fsmonitor= status >expect &&
+			test_i18ncmp expect actual
+		'
+
+		# Make sure it's actually skipping the check for modified and untracked
+		# (if enabled) files unless it is told about them.
+		test_expect_success "status doesn't detect unreported modifications" '
+			write_script .git/hooks/fsmonitor-test<<-\EOF &&
+			:>marker
+			EOF
+			clean_repo &&
+			git status &&
+			test_path_is_file marker &&
+			dirty_repo &&
+			rm -f marker &&
+			git status >actual &&
+			test_path_is_file marker &&
+			test_i18ngrep ! "Changes not staged for commit:" actual &&
+			if test $uc_val = true
+			then
+				test_i18ngrep ! "Untracked files:" actual
+			fi &&
+			if test $uc_val = false
+			then
+				test_i18ngrep "Untracked files:" actual
+			fi &&
+			rm -f marker
+		'
+	done
+done
+
+test_done
diff --git a/t/t7519/fsmonitor-all b/t/t7519/fsmonitor-all
new file mode 100755
index 0000000000..691bc94dc2
--- /dev/null
+++ b/t/t7519/fsmonitor-all
@@ -0,0 +1,24 @@
+#!/bin/sh
+#
+# An test hook script to integrate with git to test fsmonitor.
+#
+# The hook is passed a version (currently 1) and a time in nanoseconds
+# formatted as a string and outputs to stdout all files that have been
+# modified since the given time. Paths must be relative to the root of
+# the working tree and separated by a single NUL.
+#
+#echo "$0 $*" >&2
+
+if test "$#" -ne 2
+then
+	echo "$0: exactly 2 arguments expected" >&2
+	exit 2
+fi
+
+if test "$1" != 1
+then
+	echo "Unsupported core.fsmonitor hook version." >&2
+	exit 1
+fi
+
+echo "/"
diff --git a/t/t7519/fsmonitor-none b/t/t7519/fsmonitor-none
new file mode 100755
index 0000000000..ed9cf5a6a9
--- /dev/null
+++ b/t/t7519/fsmonitor-none
@@ -0,0 +1,22 @@
+#!/bin/sh
+#
+# An test hook script to integrate with git to test fsmonitor.
+#
+# The hook is passed a version (currently 1) and a time in nanoseconds
+# formatted as a string and outputs to stdout all files that have been
+# modified since the given time. Paths must be relative to the root of
+# the working tree and separated by a single NUL.
+#
+#echo "$0 $*" >&2
+
+if test "$#" -ne 2
+then
+	echo "$0: exactly 2 arguments expected" >&2
+	exit 2
+fi
+
+if test "$1" != 1
+then
+	echo "Unsupported core.fsmonitor hook version." >&2
+	exit 1
+fi
diff --git a/t/t7519/fsmonitor-watchman b/t/t7519/fsmonitor-watchman
new file mode 100755
index 0000000000..7ceb32dc18
--- /dev/null
+++ b/t/t7519/fsmonitor-watchman
@@ -0,0 +1,140 @@
+#!/usr/bin/perl
+
+use strict;
+use warnings;
+use IPC::Open2;
+
+# An example hook script to integrate Watchman
+# (https://facebook.github.io/watchman/) with git to speed up detecting
+# new and modified files.
+#
+# The hook is passed a version (currently 1) and a time in nanoseconds
+# formatted as a string and outputs to stdout all files that have been
+# modified since the given time. Paths must be relative to the root of
+# the working tree and separated by a single NUL.
+#
+# To enable this hook, rename this file to "query-watchman" and set
+# 'git config core.fsmonitor .git/hooks/query-watchman'
+#
+my ($version, $time) = @ARGV;
+#print STDERR "$0 $version $time\n";
+
+# Check the hook interface version
+
+if ($version == 1) {
+	# convert nanoseconds to seconds
+	$time = int $time / 1000000000;
+} else {
+	die "Unsupported query-fsmonitor hook version '$version'.\n" .
+	    "Falling back to scanning...\n";
+}
+
+# Convert unix style paths to escaped Windows style paths when running
+# in Windows command prompt
+
+my $system = `uname -s`;
+$system =~ s/[\r\n]+//g;
+my $git_work_tree;
+
+if ($system =~ m/^MSYS_NT/) {
+	$git_work_tree = `cygpath -aw "\$PWD"`;
+	$git_work_tree =~ s/[\r\n]+//g;
+	$git_work_tree =~ s,\\,/,g;
+} else {
+	$git_work_tree = $ENV{'PWD'};
+}
+
+my $retry = 1;
+
+launch_watchman();
+
+sub launch_watchman {
+
+	# Set input record separator
+	local $/ = 0666;
+
+	my $pid = open2(\*CHLD_OUT, \*CHLD_IN, 'watchman -j')
+	    or die "open2() failed: $!\n" .
+	    "Falling back to scanning...\n";
+
+	# In the query expression below we're asking for names of files that
+	# changed since $time but were not transient (ie created after
+	# $time but no longer exist).
+	#
+	# To accomplish this, we're using the "since" generator to use the
+	# recency index to select candidate nodes and "fields" to limit the
+	# output to file names only. Then we're using the "expression" term to
+	# further constrain the results.
+	#
+	# The category of transient files that we want to ignore will have a
+	# creation clock (cclock) newer than $time_t value and will also not
+	# currently exist.
+
+	my $query = <<"	END";
+		["query", "$git_work_tree", {
+			"since": $time,
+			"fields": ["name"],
+			"expression": ["not", ["allof", ["since", $time, "cclock"], ["not", "exists"]]]
+		}]
+	END
+	
+	open (my $fh, ">", ".git/watchman-query.json");
+	print $fh $query;
+	close $fh;
+
+	print CHLD_IN $query;
+	my $response = <CHLD_OUT>;
+
+	open ($fh, ">", ".git/watchman-response.json");
+	print $fh $response;
+	close $fh;
+
+	die "Watchman: command returned no output.\n" .
+	    "Falling back to scanning...\n" if $response eq "";
+	die "Watchman: command returned invalid output: $response\n" .
+	    "Falling back to scanning...\n" unless $response =~ /^\{/;
+
+	my $json_pkg;
+	eval {
+		require JSON::XS;
+		$json_pkg = "JSON::XS";
+		1;
+	} or do {
+		require JSON::PP;
+		$json_pkg = "JSON::PP";
+	};
+
+	my $o = $json_pkg->new->utf8->decode($response);
+
+	if ($retry > 0 and $o->{error} and $o->{error} =~ m/unable to resolve root .* directory (.*) is not watched/) {
+		print STDERR "Adding '$git_work_tree' to watchman's watch list.\n";
+		$retry--;
+		qx/watchman watch "$git_work_tree"/;
+		die "Failed to make watchman watch '$git_work_tree'.\n" .
+		    "Falling back to scanning...\n" if $? != 0;
+
+		# Watchman will always return all files on the first query so
+		# return the fast "everything is dirty" flag to git and do the
+		# Watchman query just to get it over with now so we won't pay
+		# the cost in git to look up each individual file.
+
+		open ($fh, ">", ".git/watchman-output.out");
+		print "/\0";
+		close $fh;
+
+		print "/\0";
+		eval { launch_watchman() };
+		exit 0;
+	}
+
+	die "Watchman: $o->{error}.\n" .
+	    "Falling back to scanning...\n" if $o->{error};
+
+	open ($fh, ">", ".git/watchman-output.out");
+	print $fh @{$o->{files}};
+	close $fh;
+
+	binmode STDOUT, ":utf8";
+	local $, = "\0";
+	print @{$o->{files}};
+}
-- 
2.14.1.549.g6ff7ed0467


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v8 11/12] fsmonitor: add a sample integration script for Watchman
  2017-09-22 16:35     ` [PATCH v8 00/12] Fast git status via a file system watcher Ben Peart
                         ` (9 preceding siblings ...)
  2017-09-22 16:35       ` [PATCH v8 10/12] fsmonitor: add test cases for fsmonitor extension Ben Peart
@ 2017-09-22 16:35       ` Ben Peart
  2017-09-22 16:35       ` [PATCH v8 12/12] fsmonitor: add a performance test Ben Peart
  2017-09-29  2:20       ` [PATCH v8 00/12] Fast git status via a file system watcher Junio C Hamano
  12 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-22 16:35 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

This script integrates the new fsmonitor capabilities of git with the
cross platform Watchman file watching service. To use the script:

Download and install Watchman from https://facebook.github.io/watchman/.
Rename the sample integration hook from fsmonitor-watchman.sample to
fsmonitor-watchman. Configure git to use the extension:

git config core.fsmonitor .git/hooks/fsmonitor-watchman

Optionally turn on the untracked cache for optimal performance.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Christian Couder <christian.couder@gmail.com>
---
 templates/hooks--fsmonitor-watchman.sample | 122 +++++++++++++++++++++++++++++
 1 file changed, 122 insertions(+)
 create mode 100755 templates/hooks--fsmonitor-watchman.sample

diff --git a/templates/hooks--fsmonitor-watchman.sample b/templates/hooks--fsmonitor-watchman.sample
new file mode 100755
index 0000000000..870a59d237
--- /dev/null
+++ b/templates/hooks--fsmonitor-watchman.sample
@@ -0,0 +1,122 @@
+#!/usr/bin/perl
+
+use strict;
+use warnings;
+use IPC::Open2;
+
+# An example hook script to integrate Watchman
+# (https://facebook.github.io/watchman/) with git to speed up detecting
+# new and modified files.
+#
+# The hook is passed a version (currently 1) and a time in nanoseconds
+# formatted as a string and outputs to stdout all files that have been
+# modified since the given time. Paths must be relative to the root of
+# the working tree and separated by a single NUL.
+#
+# To enable this hook, rename this file to "query-watchman" and set
+# 'git config core.fsmonitor .git/hooks/query-watchman'
+#
+my ($version, $time) = @ARGV;
+
+# Check the hook interface version
+
+if ($version == 1) {
+	# convert nanoseconds to seconds
+	$time = int $time / 1000000000;
+} else {
+	die "Unsupported query-fsmonitor hook version '$version'.\n" .
+	    "Falling back to scanning...\n";
+}
+
+# Convert unix style paths to escaped Windows style paths when running
+# in Windows command prompt
+
+my $system = `uname -s`;
+$system =~ s/[\r\n]+//g;
+my $git_work_tree;
+
+if ($system =~ m/^MSYS_NT/) {
+	$git_work_tree = `cygpath -aw "\$PWD"`;
+	$git_work_tree =~ s/[\r\n]+//g;
+	$git_work_tree =~ s,\\,/,g;
+} else {
+	$git_work_tree = $ENV{'PWD'};
+}
+
+my $retry = 1;
+
+launch_watchman();
+
+sub launch_watchman {
+
+	# Set input record separator
+	local $/ = 0666;
+
+	my $pid = open2(\*CHLD_OUT, \*CHLD_IN, 'watchman -j')
+	    or die "open2() failed: $!\n" .
+	    "Falling back to scanning...\n";
+
+	# In the query expression below we're asking for names of files that
+	# changed since $time but were not transient (ie created after
+	# $time but no longer exist).
+	#
+	# To accomplish this, we're using the "since" generator to use the
+	# recency index to select candidate nodes and "fields" to limit the
+	# output to file names only. Then we're using the "expression" term to
+	# further constrain the results.
+	#
+	# The category of transient files that we want to ignore will have a
+	# creation clock (cclock) newer than $time_t value and will also not
+	# currently exist.
+
+	my $query = <<"	END";
+		["query", "$git_work_tree", {
+			"since": $time,
+			"fields": ["name"],
+			"expression": ["not", ["allof", ["since", $time, "cclock"], ["not", "exists"]]]
+		}]
+	END
+
+	print CHLD_IN $query;
+	my $response = <CHLD_OUT>;
+
+	die "Watchman: command returned no output.\n" .
+	    "Falling back to scanning...\n" if $response eq "";
+	die "Watchman: command returned invalid output: $response\n" .
+	    "Falling back to scanning...\n" unless $response =~ /^\{/;
+
+	my $json_pkg;
+	eval {
+		require JSON::XS;
+		$json_pkg = "JSON::XS";
+		1;
+	} or do {
+		require JSON::PP;
+		$json_pkg = "JSON::PP";
+	};
+
+	my $o = $json_pkg->new->utf8->decode($response);
+
+	if ($retry > 0 and $o->{error} and $o->{error} =~ m/unable to resolve root .* directory (.*) is not watched/) {
+		print STDERR "Adding '$git_work_tree' to watchman's watch list.\n";
+		$retry--;
+		qx/watchman watch "$git_work_tree"/;
+		die "Failed to make watchman watch '$git_work_tree'.\n" .
+		    "Falling back to scanning...\n" if $? != 0;
+
+		# Watchman will always return all files on the first query so
+		# return the fast "everything is dirty" flag to git and do the
+		# Watchman query just to get it over with now so we won't pay
+		# the cost in git to look up each individual file.
+		print "/\0";
+		eval { launch_watchman() };
+		exit 0;
+	}
+
+	die "Watchman: $o->{error}.\n" .
+	    "Falling back to scanning...\n" if $o->{error};
+
+	binmode STDOUT, ":utf8";
+	local $, = "\0";
+	print @{$o->{files}};
+}
-- 
2.14.1.549.g6ff7ed0467


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* [PATCH v8 12/12] fsmonitor: add a performance test
  2017-09-22 16:35     ` [PATCH v8 00/12] Fast git status via a file system watcher Ben Peart
                         ` (10 preceding siblings ...)
  2017-09-22 16:35       ` [PATCH v8 11/12] fsmonitor: add a sample integration script for Watchman Ben Peart
@ 2017-09-22 16:35       ` Ben Peart
  2017-09-29  2:20       ` [PATCH v8 00/12] Fast git status via a file system watcher Junio C Hamano
  12 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-22 16:35 UTC (permalink / raw)
  To: benpeart
  Cc: David.Turner, avarab, christian.couder, git, gitster,
	johannes.schindelin, pclouds, peff

Add a test utility (test-drop-caches) that flushes all changes to disk
then drops file system cache on Windows, Linux, and OSX.

Add a perf test (p7519-fsmonitor.sh) for fsmonitor.

By default, the performance test will utilize the Watchman file system
monitor if it is installed.  If Watchman is not installed, it will use a
dummy integration script that does not report any new or modified files.
The dummy script has very little overhead which provides optimistic results.

The performance test will also use the untracked cache feature if it is
available as fsmonitor uses it to speed up scanning for untracked files.

There are 4 environment variables that can be used to alter the default
behavior of the performance test:

GIT_PERF_7519_UNTRACKED_CACHE: used to configure core.untrackedCache
GIT_PERF_7519_SPLIT_INDEX: used to configure core.splitIndex
GIT_PERF_7519_FSMONITOR: used to configure core.fsmonitor
GIT_PERF_7519_DROP_CACHE: if set, the OS caches are dropped between tests

The big win for using fsmonitor is the elimination of the need to scan the
working directory looking for changed and untracked files. If the file
information is all cached in RAM, the benefits are reduced.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Makefile                    |   1 +
 t/helper/.gitignore         |   1 +
 t/helper/test-drop-caches.c | 164 +++++++++++++++++++++++++++++++++++++++
 t/perf/p7519-fsmonitor.sh   | 184 ++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 350 insertions(+)
 create mode 100644 t/helper/test-drop-caches.c
 create mode 100755 t/perf/p7519-fsmonitor.sh

diff --git a/Makefile b/Makefile
index d970cd00e9..b2653ee64f 100644
--- a/Makefile
+++ b/Makefile
@@ -638,6 +638,7 @@ TEST_PROGRAMS_NEED_X += test-ctype
 TEST_PROGRAMS_NEED_X += test-config
 TEST_PROGRAMS_NEED_X += test-date
 TEST_PROGRAMS_NEED_X += test-delta
+TEST_PROGRAMS_NEED_X += test-drop-caches
 TEST_PROGRAMS_NEED_X += test-dump-cache-tree
 TEST_PROGRAMS_NEED_X += test-dump-fsmonitor
 TEST_PROGRAMS_NEED_X += test-dump-split-index
diff --git a/t/helper/.gitignore b/t/helper/.gitignore
index 721650256e..f9328eebdd 100644
--- a/t/helper/.gitignore
+++ b/t/helper/.gitignore
@@ -3,6 +3,7 @@
 /test-config
 /test-date
 /test-delta
+/test-drop-caches
 /test-dump-cache-tree
 /test-dump-split-index
 /test-dump-untracked-cache
diff --git a/t/helper/test-drop-caches.c b/t/helper/test-drop-caches.c
new file mode 100644
index 0000000000..bd1a857d52
--- /dev/null
+++ b/t/helper/test-drop-caches.c
@@ -0,0 +1,164 @@
+#include "git-compat-util.h"
+
+#if defined(GIT_WINDOWS_NATIVE)
+
+static int cmd_sync(void)
+{
+	char Buffer[MAX_PATH];
+	DWORD dwRet;
+	char szVolumeAccessPath[] = "\\\\.\\X:";
+	HANDLE hVolWrite;
+	int success = 0;
+
+	dwRet = GetCurrentDirectory(MAX_PATH, Buffer);
+	if ((0 == dwRet) || (dwRet > MAX_PATH))
+		return error("Error getting current directory");
+
+	if ((Buffer[0] < 'A') || (Buffer[0] > 'Z'))
+		return error("Invalid drive letter '%c'", Buffer[0]);
+
+	szVolumeAccessPath[4] = Buffer[0];
+	hVolWrite = CreateFile(szVolumeAccessPath, GENERIC_READ | GENERIC_WRITE,
+		FILE_SHARE_READ | FILE_SHARE_WRITE, NULL, OPEN_EXISTING, 0, NULL);
+	if (INVALID_HANDLE_VALUE == hVolWrite)
+		return error("Unable to open volume for writing, need admin access");
+
+	success = FlushFileBuffers(hVolWrite);
+	if (!success)
+		error("Unable to flush volume");
+
+	CloseHandle(hVolWrite);
+
+	return !success;
+}
+
+#define STATUS_SUCCESS			(0x00000000L)
+#define STATUS_PRIVILEGE_NOT_HELD	(0xC0000061L)
+
+typedef enum _SYSTEM_INFORMATION_CLASS {
+	SystemMemoryListInformation = 80,
+} SYSTEM_INFORMATION_CLASS;
+
+typedef enum _SYSTEM_MEMORY_LIST_COMMAND {
+	MemoryCaptureAccessedBits,
+	MemoryCaptureAndResetAccessedBits,
+	MemoryEmptyWorkingSets,
+	MemoryFlushModifiedList,
+	MemoryPurgeStandbyList,
+	MemoryPurgeLowPriorityStandbyList,
+	MemoryCommandMax
+} SYSTEM_MEMORY_LIST_COMMAND;
+
+static BOOL GetPrivilege(HANDLE TokenHandle, LPCSTR lpName, int flags)
+{
+	BOOL bResult;
+	DWORD dwBufferLength;
+	LUID luid;
+	TOKEN_PRIVILEGES tpPreviousState;
+	TOKEN_PRIVILEGES tpNewState;
+
+	dwBufferLength = 16;
+	bResult = LookupPrivilegeValueA(0, lpName, &luid);
+	if (bResult) {
+		tpNewState.PrivilegeCount = 1;
+		tpNewState.Privileges[0].Luid = luid;
+		tpNewState.Privileges[0].Attributes = 0;
+		bResult = AdjustTokenPrivileges(TokenHandle, 0, &tpNewState,
+			(DWORD)((LPBYTE)&(tpNewState.Privileges[1]) - (LPBYTE)&tpNewState),
+			&tpPreviousState, &dwBufferLength);
+		if (bResult) {
+			tpPreviousState.PrivilegeCount = 1;
+			tpPreviousState.Privileges[0].Luid = luid;
+			tpPreviousState.Privileges[0].Attributes = flags != 0 ? 2 : 0;
+			bResult = AdjustTokenPrivileges(TokenHandle, 0, &tpPreviousState,
+				dwBufferLength, 0, 0);
+		}
+	}
+	return bResult;
+}
+
+static int cmd_dropcaches(void)
+{
+	HANDLE hProcess = GetCurrentProcess();
+	HANDLE hToken;
+	HMODULE ntdll;
+	DWORD(WINAPI *NtSetSystemInformation)(INT, PVOID, ULONG);
+	SYSTEM_MEMORY_LIST_COMMAND command;
+	int status;
+
+	if (!OpenProcessToken(hProcess, TOKEN_QUERY | TOKEN_ADJUST_PRIVILEGES, &hToken))
+		return error("Can't open current process token");
+
+	if (!GetPrivilege(hToken, "SeProfileSingleProcessPrivilege", 1))
+		return error("Can't get SeProfileSingleProcessPrivilege");
+
+	CloseHandle(hToken);
+
+	ntdll = LoadLibrary("ntdll.dll");
+	if (!ntdll)
+		return error("Can't load ntdll.dll, wrong Windows version?");
+
+	NtSetSystemInformation =
+		(DWORD(WINAPI *)(INT, PVOID, ULONG))GetProcAddress(ntdll, "NtSetSystemInformation");
+	if (!NtSetSystemInformation)
+		return error("Can't get function addresses, wrong Windows version?");
+
+	command = MemoryPurgeStandbyList;
+	status = NtSetSystemInformation(
+		SystemMemoryListInformation,
+		&command,
+		sizeof(SYSTEM_MEMORY_LIST_COMMAND)
+	);
+	if (status == STATUS_PRIVILEGE_NOT_HELD)
+		error("Insufficient privileges to purge the standby list, need admin access");
+	else if (status != STATUS_SUCCESS)
+		error("Unable to execute the memory list command %d", status);
+
+	FreeLibrary(ntdll);
+
+	return status;
+}
+
+#elif defined(__linux__)
+
+static int cmd_sync(void)
+{
+	return system("sync");
+}
+
+static int cmd_dropcaches(void)
+{
+	return system("echo 3 | sudo tee /proc/sys/vm/drop_caches");
+}
+
+#elif defined(__APPLE__)
+
+static int cmd_sync(void)
+{
+	return system("sync");
+}
+
+static int cmd_dropcaches(void)
+{
+	return system("sudo purge");
+}
+
+#else
+
+static int cmd_sync(void)
+{
+	return 0;
+}
+
+static int cmd_dropcaches(void)
+{
+	return error("drop caches not implemented on this platform");
+}
+
+#endif
+
+int cmd_main(int argc, const char **argv)
+{
+	cmd_sync();
+	return cmd_dropcaches();
+}
diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
new file mode 100755
index 0000000000..16d1bf72e5
--- /dev/null
+++ b/t/perf/p7519-fsmonitor.sh
@@ -0,0 +1,184 @@
+#!/bin/sh
+
+test_description="Test core.fsmonitor"
+
+. ./perf-lib.sh
+
+#
+# Performance test for the fsmonitor feature which enables git to talk to a
+# file system change monitor and avoid having to scan the working directory
+# for new or modified files.
+#
+# By default, the performance test will utilize the Watchman file system
+# monitor if it is installed.  If Watchman is not installed, it will use a
+# dummy integration script that does not report any new or modified files.
+# The dummy script has very little overhead which provides optimistic results.
+#
+# The performance test will also use the untracked cache feature if it is
+# available as fsmonitor uses it to speed up scanning for untracked files.
+#
+# There are 3 environment variables that can be used to alter the default
+# behavior of the performance test:
+#
+# GIT_PERF_7519_UNTRACKED_CACHE: used to configure core.untrackedCache
+# GIT_PERF_7519_SPLIT_INDEX: used to configure core.splitIndex
+# GIT_PERF_7519_FSMONITOR: used to configure core.fsMonitor
+#
+# The big win for using fsmonitor is the elimination of the need to scan the
+# working directory looking for changed and untracked files. If the file
+# information is all cached in RAM, the benefits are reduced.
+#
+# GIT_PERF_7519_DROP_CACHE: if set, the OS caches are dropped between tests
+#
+
+test_perf_large_repo
+test_checkout_worktree
+
+test_lazy_prereq UNTRACKED_CACHE '
+	{ git update-index --test-untracked-cache; ret=$?; } &&
+	test $ret -ne 1
+'
+
+test_lazy_prereq WATCHMAN '
+	{ command -v watchman >/dev/null 2>&1; ret=$?; } &&
+	test $ret -ne 1
+'
+
+if test_have_prereq WATCHMAN
+then
+	# Convert unix style paths to escaped Windows style paths for Watchman
+	case "$(uname -s)" in
+	MSYS_NT*)
+	  GIT_WORK_TREE="$(cygpath -aw "$PWD" | sed 's,\\,/,g')"
+	  ;;
+	*)
+	  GIT_WORK_TREE="$PWD"
+	  ;;
+	esac
+fi
+
+if test -n "$GIT_PERF_7519_DROP_CACHE"
+then
+	# When using GIT_PERF_7519_DROP_CACHE, GIT_PERF_REPEAT_COUNT must be 1 to
+	# generate valid results. Otherwise the caching that happens for the nth
+	# run will negate the validity of the comparisons.
+	if test "$GIT_PERF_REPEAT_COUNT" -ne 1
+	then
+		echo "warning: Setting GIT_PERF_REPEAT_COUNT=1" >&2
+		GIT_PERF_REPEAT_COUNT=1
+	fi
+fi
+
+test_expect_success "setup for fsmonitor" '
+	# set untrackedCache depending on the environment
+	if test -n "$GIT_PERF_7519_UNTRACKED_CACHE"
+	then
+		git config core.untrackedCache "$GIT_PERF_7519_UNTRACKED_CACHE"
+	else
+		if test_have_prereq UNTRACKED_CACHE
+		then
+			git config core.untrackedCache true
+		else
+			git config core.untrackedCache false
+		fi
+	fi &&
+
+	# set core.splitindex depending on the environment
+	if test -n "$GIT_PERF_7519_SPLIT_INDEX"
+	then
+		git config core.splitIndex "$GIT_PERF_7519_SPLIT_INDEX"
+	fi &&
+
+	# set INTEGRATION_SCRIPT depending on the environment
+	if test -n "$GIT_PERF_7519_FSMONITOR"
+	then
+		INTEGRATION_SCRIPT="$GIT_PERF_7519_FSMONITOR"
+	else
+		#
+		# Choose integration script based on existence of Watchman.
+		# If Watchman exists, watch the work tree and attempt a query.
+		# If everything succeeds, use Watchman integration script,
+		# else fall back to an empty integration script.
+		#
+		mkdir .git/hooks &&
+		if test_have_prereq WATCHMAN
+		then
+			INTEGRATION_SCRIPT=".git/hooks/fsmonitor-watchman" &&
+			cp "$TEST_DIRECTORY/../templates/hooks--fsmonitor-watchman.sample" "$INTEGRATION_SCRIPT" &&
+			watchman watch "$GIT_WORK_TREE" &&
+			watchman watch-list | grep -q -F "$GIT_WORK_TREE"
+		else
+			INTEGRATION_SCRIPT=".git/hooks/fsmonitor-empty" &&
+			write_script "$INTEGRATION_SCRIPT"<<-\EOF
+			EOF
+		fi
+	fi &&
+
+	git config core.fsmonitor "$INTEGRATION_SCRIPT" &&
+	git update-index --fsmonitor
+'
+
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-drop-caches
+fi
+
+test_perf "status (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git status
+'
+
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-drop-caches
+fi
+
+test_perf "status -uno (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git status -uno
+'
+
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-drop-caches
+fi
+
+test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git status -uall
+'
+
+test_expect_success "setup without fsmonitor" '
+	unset INTEGRATION_SCRIPT &&
+	git config --unset core.fsmonitor &&
+	git update-index --no-fsmonitor
+'
+
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-drop-caches
+fi
+
+test_perf "status (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git status
+'
+
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-drop-caches
+fi
+
+test_perf "status -uno (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git status -uno
+'
+
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-drop-caches
+fi
+
+test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git status -uall
+'
+
+if test_have_prereq WATCHMAN
+then
+	watchman watch-del "$GIT_WORK_TREE" >/dev/null 2>&1 &&
+
+	# Work around Watchman bug on Windows where it holds on to handles
+	# preventing the removal of the trash directory
+	watchman shutdown-server >/dev/null 2>&1
+fi
+
+test_done
-- 
2.14.1.549.g6ff7ed0467


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* Re: [PATCH v8 01/12] bswap: add 64 bit endianness helper get_be64
  2017-09-22 16:35       ` [PATCH v8 01/12] bswap: add 64 bit endianness helper get_be64 Ben Peart
@ 2017-09-22 23:37         ` Martin Ågren
  2017-09-23 23:31           ` Ben Peart
  0 siblings, 1 reply; 137+ messages in thread
From: Martin Ågren @ 2017-09-22 23:37 UTC (permalink / raw)
  To: Ben Peart
  Cc: David Turner, Ævar Arnfjörð Bjarmason,
	Christian Couder, git, Junio C Hamano, Johannes Schindelin,
	Nguyễn Thái Ngọc Duy, Jeff King

On 22 September 2017 at 18:35, Ben Peart <benpeart@microsoft.com> wrote:
> Add a new get_be64 macro to enable 64 bit endian conversions on memory
> that may or may not be aligned.

I needed this to compile and pass the tests with NO_UNALIGNED_LOADS.

Martin

diff --git a/compat/bswap.h b/compat/bswap.h
index 6b22c4621..9dc79bdf5 100644
--- a/compat/bswap.h
+++ b/compat/bswap.h
@@ -183,8 +183,8 @@ static inline uint32_t get_be32(const void *ptr)
 static inline uint64_t get_be64(const void *ptr)
 {
 	const unsigned char *p = ptr;
-	return	(uint64_t)get_be32(p[0]) << 32 |
-		(uint64_t)get_be32(p[4]) <<  0;
+	return	(uint64_t)get_be32(p + 0) << 32 |
+		(uint64_t)get_be32(p + 4) <<  0;
 }
 
 static inline void put_be32(void *ptr, uint32_t value)
-- 
2.14.1.727.g9ddaf86


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* Re: [PATCH v8 08/12] fsmonitor: add a test tool to dump the index extension
  2017-09-22 16:35       ` [PATCH v8 08/12] fsmonitor: add a test tool to dump the index extension Ben Peart
@ 2017-09-22 23:37         ` Martin Ågren
  2017-09-23 23:33           ` Ben Peart
  0 siblings, 1 reply; 137+ messages in thread
From: Martin Ågren @ 2017-09-22 23:37 UTC (permalink / raw)
  To: Ben Peart
  Cc: David Turner, Ævar Arnfjörð Bjarmason,
	Christian Couder, git, Junio C Hamano, Johannes Schindelin,
	Nguyễn Thái Ngọc Duy, Jeff King

On 22 September 2017 at 18:35, Ben Peart <benpeart@microsoft.com> wrote:
> Add a test utility (test-dump-fsmonitor) that will dump the fsmonitor
> index extension.
>
> Signed-off-by: Ben Peart <benpeart@microsoft.com>
> ---
>  Makefile                       |  1 +
>  t/helper/test-dump-fsmonitor.c | 21 +++++++++++++++++++++
>  2 files changed, 22 insertions(+)
>  create mode 100644 t/helper/test-dump-fsmonitor.c

You forget to add the new binary to .gitignore. (In patch 12/12, you
introduce test-drop-caches, which you _do_ add to .gitignore.)

Martin

diff --git a/t/helper/.gitignore b/t/helper/.gitignore
index 6f07de62d..0fe2e0440 100644
--- a/t/helper/.gitignore
+++ b/t/helper/.gitignore
@@ -6,6 +6,7 @@
 /test-delta
 /test-drop-caches
 /test-dump-cache-tree
+/test-dump-fsmonitor
 /test-dump-split-index
 /test-dump-untracked-cache
 /test-fake-ssh
-- 
2.14.1.727.g9ddaf86


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* RE: [PATCH v8 01/12] bswap: add 64 bit endianness helper get_be64
  2017-09-22 23:37         ` Martin Ågren
@ 2017-09-23 23:31           ` Ben Peart
  2017-09-24  3:51             ` Jeff King
  2017-09-24  3:52             ` Junio C Hamano
  0 siblings, 2 replies; 137+ messages in thread
From: Ben Peart @ 2017-09-23 23:31 UTC (permalink / raw)
  To: Martin Ågren
  Cc: David Turner, Ævar Arnfjörð Bjarmason,
	Christian Couder, git@vger.kernel.org, Junio C Hamano,
	Johannes Schindelin, Nguyễn Thái Ngọc Duy,
	Jeff King



Thanks,

Ben

> -----Original Message-----
> From: Martin Ågren [mailto:martin.agren@gmail.com]
> Sent: Friday, September 22, 2017 7:37 PM
> To: Ben Peart <Ben.Peart@microsoft.com>
> Cc: David Turner <David.Turner@twosigma.com>; Ævar Arnfjörð Bjarmason
> <avarab@gmail.com>; Christian Couder <christian.couder@gmail.com>;
> git@vger.kernel.org; Junio C Hamano <gitster@pobox.com>; Johannes
> Schindelin <johannes.schindelin@gmx.de>; Nguyễn Thái Ngọc Duy
> <pclouds@gmail.com>; Jeff King <peff@peff.net>
> Subject: Re: [PATCH v8 01/12] bswap: add 64 bit endianness helper get_be64
> 
> On 22 September 2017 at 18:35, Ben Peart <benpeart@microsoft.com>
> wrote:
> > Add a new get_be64 macro to enable 64 bit endian conversions on memory
> > that may or may not be aligned.
> 
> I needed this to compile and pass the tests with NO_UNALIGNED_LOADS.
> 
> Martin
> 
> diff --git a/compat/bswap.h b/compat/bswap.h index 6b22c4621..9dc79bdf5
> 100644
> --- a/compat/bswap.h
> +++ b/compat/bswap.h
> @@ -183,8 +183,8 @@ static inline uint32_t get_be32(const void *ptr)  static
> inline uint64_t get_be64(const void *ptr)  {
>  	const unsigned char *p = ptr;
> -	return	(uint64_t)get_be32(p[0]) << 32 |
> -		(uint64_t)get_be32(p[4]) <<  0;
> +	return	(uint64_t)get_be32(p + 0) << 32 |
> +		(uint64_t)get_be32(p + 4) <<  0;

This is surprising.  Every other function in the file uses the p[x] syntax.  Just for
consistency, is there a way to stick to that syntax but still make it work correctly?
Is there a typecast that can make it work?

>  }
> 
>  static inline void put_be32(void *ptr, uint32_t value)
> --
> 2.14.1.727.g9ddaf86


^ permalink raw reply	[flat|nested] 137+ messages in thread

* RE: [PATCH v8 08/12] fsmonitor: add a test tool to dump the index extension
  2017-09-22 23:37         ` Martin Ågren
@ 2017-09-23 23:33           ` Ben Peart
  2017-09-24  3:51             ` Junio C Hamano
  0 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-09-23 23:33 UTC (permalink / raw)
  To: Martin Ågren
  Cc: David Turner, Ævar Arnfjörð Bjarmason,
	Christian Couder, git@vger.kernel.org, Junio C Hamano,
	Johannes Schindelin, Nguyễn Thái Ngọc Duy,
	Jeff King

> -----Original Message-----
> From: Martin Ågren [mailto:martin.agren@gmail.com]
> Sent: Friday, September 22, 2017 7:37 PM
> To: Ben Peart <Ben.Peart@microsoft.com>
> Cc: David Turner <David.Turner@twosigma.com>; Ævar Arnfjörð Bjarmason
> <avarab@gmail.com>; Christian Couder <christian.couder@gmail.com>;
> git@vger.kernel.org; Junio C Hamano <gitster@pobox.com>; Johannes
> Schindelin <johannes.schindelin@gmx.de>; Nguyễn Thái Ngọc Duy
> <pclouds@gmail.com>; Jeff King <peff@peff.net>
> Subject: Re: [PATCH v8 08/12] fsmonitor: add a test tool to dump the index
> extension
> 
> On 22 September 2017 at 18:35, Ben Peart <benpeart@microsoft.com>
> wrote:
> > Add a test utility (test-dump-fsmonitor) that will dump the fsmonitor
> > index extension.
> >
> > Signed-off-by: Ben Peart <benpeart@microsoft.com>
> > ---
> >  Makefile                       |  1 +
> >  t/helper/test-dump-fsmonitor.c | 21 +++++++++++++++++++++
> >  2 files changed, 22 insertions(+)
> >  create mode 100644 t/helper/test-dump-fsmonitor.c
> 
> You forget to add the new binary to .gitignore. (In patch 12/12, you introduce
> test-drop-caches, which you _do_ add to .gitignore.)
> 

Oops.  Thanks!  Hopefully Junio can squash this in...

> Martin
> 
> diff --git a/t/helper/.gitignore b/t/helper/.gitignore index
> 6f07de62d..0fe2e0440 100644
> --- a/t/helper/.gitignore
> +++ b/t/helper/.gitignore
> @@ -6,6 +6,7 @@
>  /test-delta
>  /test-drop-caches
>  /test-dump-cache-tree
> +/test-dump-fsmonitor
>  /test-dump-split-index
>  /test-dump-untracked-cache
>  /test-fake-ssh
> --
> 2.14.1.727.g9ddaf86


^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v8 01/12] bswap: add 64 bit endianness helper get_be64
  2017-09-23 23:31           ` Ben Peart
@ 2017-09-24  3:51             ` Jeff King
  2017-09-24  3:52             ` Junio C Hamano
  1 sibling, 0 replies; 137+ messages in thread
From: Jeff King @ 2017-09-24  3:51 UTC (permalink / raw)
  To: Ben Peart
  Cc: Martin Ågren, David Turner,
	Ævar Arnfjörð Bjarmason, Christian Couder,
	git@vger.kernel.org, Junio C Hamano, Johannes Schindelin,
	Nguyễn Thái Ngọc Duy

On Sat, Sep 23, 2017 at 11:31:50PM +0000, Ben Peart wrote:

> > diff --git a/compat/bswap.h b/compat/bswap.h index 6b22c4621..9dc79bdf5
> > 100644
> > --- a/compat/bswap.h
> > +++ b/compat/bswap.h
> > @@ -183,8 +183,8 @@ static inline uint32_t get_be32(const void *ptr)  static
> > inline uint64_t get_be64(const void *ptr)  {
> >  	const unsigned char *p = ptr;
> > -	return	(uint64_t)get_be32(p[0]) << 32 |
> > -		(uint64_t)get_be32(p[4]) <<  0;
> > +	return	(uint64_t)get_be32(p + 0) << 32 |
> > +		(uint64_t)get_be32(p + 4) <<  0;
> 
> This is surprising.  Every other function in the file uses the p[x] syntax.  Just for
> consistency, is there a way to stick to that syntax but still make it work correctly?
> Is there a typecast that can make it work?

The other ones are accessing the byte values directly, but since you are
building on get_be32 here, you have to pass the pointer.  So:

  return (uint64_t)get_be32(&p[0]) << 32 |
         (uint64_t)get_be32(&p[4]) <<  0;

would work.  Or of course you could just spell it out like the others:

  return (uint64_t)p[0] << 56 |
         (uint64_t)p[1] << 48 |
         (uint64_t)p[2] << 40 |
	 (uint64_t)p[3] << 32 |
         (uint64_t)p[4] << 24 |
         (uint64_t)p[5] << 16 |
         (uint64_t)p[6] <<  8 |
         (uint64_t)p[7] <<  0;

I suspect compilers would end up with the same output either way (on
x86, "gcc -O2" actually turns the whole thing into a bswap instruction).

-Peff

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v8 08/12] fsmonitor: add a test tool to dump the index extension
  2017-09-23 23:33           ` Ben Peart
@ 2017-09-24  3:51             ` Junio C Hamano
  0 siblings, 0 replies; 137+ messages in thread
From: Junio C Hamano @ 2017-09-24  3:51 UTC (permalink / raw)
  To: Ben Peart
  Cc: Martin Ågren, David Turner,
	Ævar Arnfjörð Bjarmason, Christian Couder,
	git@vger.kernel.org, Johannes Schindelin,
	Nguyễn Thái Ngọc Duy, Jeff King

Ben Peart <Ben.Peart@microsoft.com> writes:

>> You forget to add the new binary to .gitignore. (In patch 12/12, you introduce
>> test-drop-caches, which you _do_ add to .gitignore.)
>> 
>
> Oops.  Thanks!  Hopefully Junio can squash this in...

OK, will do.

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v8 01/12] bswap: add 64 bit endianness helper get_be64
  2017-09-23 23:31           ` Ben Peart
  2017-09-24  3:51             ` Jeff King
@ 2017-09-24  3:52             ` Junio C Hamano
  1 sibling, 0 replies; 137+ messages in thread
From: Junio C Hamano @ 2017-09-24  3:52 UTC (permalink / raw)
  To: Ben Peart
  Cc: Martin Ågren, David Turner,
	Ævar Arnfjörð Bjarmason, Christian Couder,
	git@vger.kernel.org, Johannes Schindelin,
	Nguyễn Thái Ngọc Duy, Jeff King

Ben Peart <Ben.Peart@microsoft.com> writes:

>> @@ -183,8 +183,8 @@ static inline uint32_t get_be32(const void *ptr)  static
>> inline uint64_t get_be64(const void *ptr)  {
>>  	const unsigned char *p = ptr;
>> -	return	(uint64_t)get_be32(p[0]) << 32 |
>> -		(uint64_t)get_be32(p[4]) <<  0;
>> +	return	(uint64_t)get_be32(p + 0) << 32 |
>> +		(uint64_t)get_be32(p + 4) <<  0;
>
> This is surprising.  Every other function in the file uses the p[x] syntax.  Just for
> consistency, is there a way to stick to that syntax but still make it work correctly?
> Is there a typecast that can make it work?

I'll do "get_be32(&p[0])" etc. while queueing for now.

Thanks, both of you.

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v8 00/12] Fast git status via a file system watcher
  2017-09-22 16:35     ` [PATCH v8 00/12] Fast git status via a file system watcher Ben Peart
                         ` (11 preceding siblings ...)
  2017-09-22 16:35       ` [PATCH v8 12/12] fsmonitor: add a performance test Ben Peart
@ 2017-09-29  2:20       ` Junio C Hamano
  2017-09-29 12:07         ` Ben Peart
  12 siblings, 1 reply; 137+ messages in thread
From: Junio C Hamano @ 2017-09-29  2:20 UTC (permalink / raw)
  To: Ben Peart
  Cc: David.Turner, avarab, christian.couder, git, johannes.schindelin,
	pclouds, peff

Ben Peart <benpeart@microsoft.com> writes:

> The only behavioral change from V7 is the removal of unnecessary uses of
> CE_MATCH_IGNORE_FSMONITOR.  With a better understanding of *why* the
> CE_MATCH_IGNORE_* flags are used, it is now clear they are not required
> in most cases where CE_MATCH_IGNORE_FSMONITOR was being passed out of an
> abundance of caution.

The reviews and updates after this round was posted were to

 * 01/12 had an obvious pointer-vs-pointee thinko, which I think I
   have locally fixed;

 * 08/12 forgot to add a new test executable to .gitignore file,
   which I think I have locally fixed, too.

Any other review comments and suggestions for improvements?
Otherwise I am tempted to declare victory and merge this to 'next'
soonish.

For reference, here is the interdiff between what was posted as v8
and what I have on 'pu'.

Thanks.

 compat/bswap.h      | 4 ++--
 t/helper/.gitignore | 1 +
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git b/compat/bswap.h a/compat/bswap.h
index 6b22c46214..5078ce5ecc 100644
--- b/compat/bswap.h
+++ a/compat/bswap.h
@@ -183,8 +183,8 @@ static inline uint32_t get_be32(const void *ptr)
 static inline uint64_t get_be64(const void *ptr)
 {
 	const unsigned char *p = ptr;
-	return	(uint64_t)get_be32(p[0]) << 32 |
-		(uint64_t)get_be32(p[4]) <<  0;
+	return	(uint64_t)get_be32(&p[0]) << 32 |
+		(uint64_t)get_be32(&p[4]) <<  0;
 }
 
 static inline void put_be32(void *ptr, uint32_t value)
diff --git b/t/helper/.gitignore a/t/helper/.gitignore
index f9328eebdd..87a648a7cf 100644
--- b/t/helper/.gitignore
+++ a/t/helper/.gitignore
@@ -5,6 +5,7 @@
 /test-delta
 /test-drop-caches
 /test-dump-cache-tree
+/test-dump-fsmonitor
 /test-dump-split-index
 /test-dump-untracked-cache
 /test-fake-ssh

^ permalink raw reply related	[flat|nested] 137+ messages in thread

* RE: [PATCH v8 00/12] Fast git status via a file system watcher
  2017-09-29  2:20       ` [PATCH v8 00/12] Fast git status via a file system watcher Junio C Hamano
@ 2017-09-29 12:07         ` Ben Peart
  2017-10-01  8:24           ` Junio C Hamano
  0 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-09-29 12:07 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: David.Turner@twosigma.com, avarab@gmail.com,
	christian.couder@gmail.com, git@vger.kernel.org,
	johannes.schindelin@gmx.de, pclouds@gmail.com, peff@peff.net

> -----Original Message-----
> From: Junio C Hamano [mailto:gitster@pobox.com]
> Sent: Thursday, September 28, 2017 10:21 PM
> To: Ben Peart <Ben.Peart@microsoft.com>
> Cc: David.Turner@twosigma.com; avarab@gmail.com;
> christian.couder@gmail.com; git@vger.kernel.org;
> johannes.schindelin@gmx.de; pclouds@gmail.com; peff@peff.net
> Subject: Re: [PATCH v8 00/12] Fast git status via a file system watcher
> 
> Ben Peart <benpeart@microsoft.com> writes:
> 
> > The only behavioral change from V7 is the removal of unnecessary uses
> > of CE_MATCH_IGNORE_FSMONITOR.  With a better understanding of
> *why*
> > the
> > CE_MATCH_IGNORE_* flags are used, it is now clear they are not
> > required in most cases where CE_MATCH_IGNORE_FSMONITOR was being
> > passed out of an abundance of caution.
> 
> The reviews and updates after this round was posted were to
> 
>  * 01/12 had an obvious pointer-vs-pointee thinko, which I think I
>    have locally fixed;
> 
>  * 08/12 forgot to add a new test executable to .gitignore file,
>    which I think I have locally fixed, too.
> 
> Any other review comments and suggestions for improvements?
> Otherwise I am tempted to declare victory and merge this to 'next'
> soonish.
> 
> For reference, here is the interdiff between what was posted as v8 and what
> I have on 'pu'.

I had accumulated the same set of changes with one addition of removing
a duplicate "the" from a comment in the fsmonitor.h file:

diff --git a/fsmonitor.h b/fsmonitor.h
index 8eb6163455..0de644e01a 100644
--- a/fsmonitor.h
+++ b/fsmonitor.h
@@ -4,7 +4,7 @@
 extern struct trace_key trace_fsmonitor;
 
 /*
- * Read the the fsmonitor index extension and (if configured) restore the
+ * Read the fsmonitor index extension and (if configured) restore the
  * CE_FSMONITOR_VALID state.
  */
 extern int read_fsmonitor_extension(struct index_state *istate, const void *data, unsigned long sz); 

> 
> Thanks.
> 
>  compat/bswap.h      | 4 ++--
>  t/helper/.gitignore | 1 +
>  2 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git b/compat/bswap.h a/compat/bswap.h index
> 6b22c46214..5078ce5ecc 100644
> --- b/compat/bswap.h
> +++ a/compat/bswap.h
> @@ -183,8 +183,8 @@ static inline uint32_t get_be32(const void *ptr)  static
> inline uint64_t get_be64(const void *ptr)  {
>  	const unsigned char *p = ptr;
> -	return	(uint64_t)get_be32(p[0]) << 32 |
> -		(uint64_t)get_be32(p[4]) <<  0;
> +	return	(uint64_t)get_be32(&p[0]) << 32 |
> +		(uint64_t)get_be32(&p[4]) <<  0;
>  }
> 
>  static inline void put_be32(void *ptr, uint32_t value) diff --git
> b/t/helper/.gitignore a/t/helper/.gitignore index f9328eebdd..87a648a7cf
> 100644
> --- b/t/helper/.gitignore
> +++ a/t/helper/.gitignore
> @@ -5,6 +5,7 @@
>  /test-delta
>  /test-drop-caches
>  /test-dump-cache-tree
> +/test-dump-fsmonitor
>  /test-dump-split-index
>  /test-dump-untracked-cache
>  /test-fake-ssh

^ permalink raw reply related	[flat|nested] 137+ messages in thread

* Re: [PATCH v8 00/12] Fast git status via a file system watcher
  2017-09-29 12:07         ` Ben Peart
@ 2017-10-01  8:24           ` Junio C Hamano
  2017-10-03 19:48             ` Ben Peart
  0 siblings, 1 reply; 137+ messages in thread
From: Junio C Hamano @ 2017-10-01  8:24 UTC (permalink / raw)
  To: Ben Peart
  Cc: David.Turner@twosigma.com, avarab@gmail.com,
	christian.couder@gmail.com, git@vger.kernel.org,
	johannes.schindelin@gmx.de, pclouds@gmail.com, peff@peff.net

Ben Peart <Ben.Peart@microsoft.com> writes:

> I had accumulated the same set of changes with one addition of removing
> a duplicate "the" from a comment in the fsmonitor.h file:
>
> diff --git a/fsmonitor.h b/fsmonitor.h
> index 8eb6163455..0de644e01a 100644
> --- a/fsmonitor.h
> +++ b/fsmonitor.h
> @@ -4,7 +4,7 @@
>  extern struct trace_key trace_fsmonitor;
>  
>  /*
> - * Read the the fsmonitor index extension and (if configured) restore the
> + * Read the fsmonitor index extension and (if configured) restore the
>   * CE_FSMONITOR_VALID state.
>   */
>  extern int read_fsmonitor_extension(struct index_state *istate, const void *data, unsigned long sz); 
>
>> 
>> Thanks.

OK, now my copy has the same, so we are in sync.  Unless there is no
more comment that benefits from a reroll of the series, let's run
with this version for now and merge it to 'next'.  Further updates
can be done incrementally on top.

Thanks.

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v8 00/12] Fast git status via a file system watcher
  2017-10-01  8:24           ` Junio C Hamano
@ 2017-10-03 19:48             ` Ben Peart
  2017-10-04  2:09               ` Junio C Hamano
  0 siblings, 1 reply; 137+ messages in thread
From: Ben Peart @ 2017-10-03 19:48 UTC (permalink / raw)
  To: Junio C Hamano, Ben Peart
  Cc: David.Turner@twosigma.com, avarab@gmail.com,
	christian.couder@gmail.com, git@vger.kernel.org,
	johannes.schindelin@gmx.de, pclouds@gmail.com, peff@peff.net



On 10/1/2017 4:24 AM, Junio C Hamano wrote:
> Ben Peart <Ben.Peart@microsoft.com> writes:
> 
>> I had accumulated the same set of changes with one addition of removing
>> a duplicate "the" from a comment in the fsmonitor.h file:
>>
>> diff --git a/fsmonitor.h b/fsmonitor.h
>> index 8eb6163455..0de644e01a 100644
>> --- a/fsmonitor.h
>> +++ b/fsmonitor.h
>> @@ -4,7 +4,7 @@
>>   extern struct trace_key trace_fsmonitor;
>>   
>>   /*
>> - * Read the the fsmonitor index extension and (if configured) restore the
>> + * Read the fsmonitor index extension and (if configured) restore the
>>    * CE_FSMONITOR_VALID state.
>>    */
>>   extern int read_fsmonitor_extension(struct index_state *istate, const void *data, unsigned long sz);
>>
>>>
>>> Thanks.
> 
> OK, now my copy has the same, so we are in sync.  Unless there is no
> more comment that benefits from a reroll of the series, let's run
> with this version for now and merge it to 'next'.  Further updates
> can be done incrementally on top.
> 
> Thanks.
> 

Well, rats. I found one more issue that applies to two of the commits. 
Can you squash this in as well or do you want it in some other form?


diff --git a/t/t7519/fsmonitor-watchman b/t/t7519/fsmonitor-watchman
index 7ceb32dc18..cca3d71e90 100755
--- a/t/t7519/fsmonitor-watchman
+++ b/t/t7519/fsmonitor-watchman
@@ -36,7 +36,7 @@ my $system = `uname -s`;
  $system =~ s/[\r\n]+//g;
  my $git_work_tree;

-if ($system =~ m/^MSYS_NT/) {
+if ($system =~ m/^MSYS_NT/ || $system =~ m/^MINGW/) {
         $git_work_tree = `cygpath -aw "\$PWD"`;
         $git_work_tree =~ s/[\r\n]+//g;
         $git_work_tree =~ s,\\,/,g;
diff --git a/templates/hooks--fsmonitor-watchman.sample 
b/templates/hooks--fsmonitor-watchman.sample
index 870a59d237..c68038ef00 100755
--- a/templates/hooks--fsmonitor-watchman.sample
+++ b/templates/hooks--fsmonitor-watchman.sample
@@ -35,7 +35,7 @@ my $system = `uname -s`;
  $system =~ s/[\r\n]+//g;
  my $git_work_tree;

-if ($system =~ m/^MSYS_NT/) {
+if ($system =~ m/^MSYS_NT/ || $system =~ m/^MINGW/) {
         $git_work_tree = `cygpath -aw "\$PWD"`;
         $git_work_tree =~ s/[\r\n]+//g;
         $git_work_tree =~ s,\\,/,g;


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* Re: [PATCH v8 00/12] Fast git status via a file system watcher
  2017-10-03 19:48             ` Ben Peart
@ 2017-10-04  2:09               ` Junio C Hamano
  2017-10-04  6:38                 ` Alex Vandiver
  2017-10-04 12:27                 ` Ben Peart
  0 siblings, 2 replies; 137+ messages in thread
From: Junio C Hamano @ 2017-10-04  2:09 UTC (permalink / raw)
  To: Ben Peart
  Cc: Ben Peart, David.Turner@twosigma.com, avarab@gmail.com,
	christian.couder@gmail.com, git@vger.kernel.org,
	johannes.schindelin@gmx.de, pclouds@gmail.com, peff@peff.net

Ben Peart <peartben@gmail.com> writes:

> Well, rats. I found one more issue that applies to two of the
> commits. Can you squash this in as well or do you want it in some
> other form?

Rats indeed.  Let's go incremental as promised, perhaps like this
(but please supply a better description if you have one).

-- >8 --
From: Ben Peart <benpeart@microsoft.com>
Subject: fsmonitor: MINGW support for watchman integration

Instead of just taking $ENV{'PWD'}, use the same logic that converts
PWD to $git_work_tree on MSYS_NT in the watchman integration hook
script also on MINGW.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t7519/fsmonitor-watchman                 | 2 +-
 templates/hooks--fsmonitor-watchman.sample | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/t7519/fsmonitor-watchman b/t/t7519/fsmonitor-watchman
index 7ceb32dc18..cca3d71e90 100755
--- a/t/t7519/fsmonitor-watchman
+++ b/t/t7519/fsmonitor-watchman
@@ -36,7 +36,7 @@ my $system = `uname -s`;
 $system =~ s/[\r\n]+//g;
 my $git_work_tree;
 
-if ($system =~ m/^MSYS_NT/) {
+if ($system =~ m/^MSYS_NT/ || $system =~ m/^MINGW/) {
 	$git_work_tree = `cygpath -aw "\$PWD"`;
 	$git_work_tree =~ s/[\r\n]+//g;
 	$git_work_tree =~ s,\\,/,g;
diff --git a/templates/hooks--fsmonitor-watchman.sample b/templates/hooks--fsmonitor-watchman.sample
index 870a59d237..c68038ef00 100755
--- a/templates/hooks--fsmonitor-watchman.sample
+++ b/templates/hooks--fsmonitor-watchman.sample
@@ -35,7 +35,7 @@ my $system = `uname -s`;
 $system =~ s/[\r\n]+//g;
 my $git_work_tree;
 
-if ($system =~ m/^MSYS_NT/) {
+if ($system =~ m/^MSYS_NT/ || $system =~ m/^MINGW/) {
 	$git_work_tree = `cygpath -aw "\$PWD"`;
 	$git_work_tree =~ s/[\r\n]+//g;
 	$git_work_tree =~ s,\\,/,g;


^ permalink raw reply related	[flat|nested] 137+ messages in thread

* Re: [PATCH v8 00/12] Fast git status via a file system watcher
  2017-10-04  2:09               ` Junio C Hamano
@ 2017-10-04  6:38                 ` Alex Vandiver
  2017-10-04 12:48                   ` Ben Peart
  2017-10-04 12:27                 ` Ben Peart
  1 sibling, 1 reply; 137+ messages in thread
From: Alex Vandiver @ 2017-10-04  6:38 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ben Peart, Ben Peart, David.Turner@twosigma.com, avarab@gmail.com,
	christian.couder@gmail.com, git@vger.kernel.org,
	johannes.schindelin@gmx.de, pclouds@gmail.com, peff@peff.net

On Wed, 4 Oct 2017, Junio C Hamano wrote:
> Rats indeed.  Let's go incremental as promised, perhaps like this
> (but please supply a better description if you have one).

I think you'll also want the following squashed into 5c8cdcfd8 and
def437671:

-- >8 --
From 445d45027bb5b7823338cf111910d2884af6318b Mon Sep 17 00:00:00 2001
From: Alex Vandiver <alexmv@dropbox.com>
Date: Tue, 3 Oct 2017 23:27:46 -0700
Subject: [PATCH] fsmonitor: Read entirety of watchman output

In perl, setting $/ sets the string that is used as the "record
separator," which sets the boundary that the `<>` construct reads to.
Setting `local $/ = 0666;` evaluates the octal, getting 438, and
stringifies it.  Thus, the later read from `<CHLD_OUT>` stops as soon
as it encounters the string "438" in the watchman output, yielding
invalid JSON; repositories containing filenames with SHA1 hashes are
able to trip this easily.

Set `$/` to undefined, thus slurping all output from watchman.  Also
close STDIN which is provided to watchman, to better guarantee that we
cannot deadlock with watchman while both attempting to read.

Signed-off-by: Alex Vandiver <alexmv@dropbox.com>
---
 t/t7519/fsmonitor-watchman                 | 6 ++----
 templates/hooks--fsmonitor-watchman.sample | 6 ++----
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/t/t7519/fsmonitor-watchman b/t/t7519/fsmonitor-watchman
index 7ceb32dc1..7d6aef635 100755
--- a/t/t7519/fsmonitor-watchman
+++ b/t/t7519/fsmonitor-watchman
@@ -50,9 +50,6 @@ launch_watchman();
 
 sub launch_watchman {
 
-	# Set input record separator
-	local $/ = 0666;
-
 	my $pid = open2(\*CHLD_OUT, \*CHLD_IN, 'watchman -j')
 	    or die "open2() failed: $!\n" .
 	    "Falling back to scanning...\n";
@@ -83,7 +80,8 @@ sub launch_watchman {
 	close $fh;
 
 	print CHLD_IN $query;
-	my $response = <CHLD_OUT>;
+	close CHLD_IN;
+	my $response = do {local $/; <CHLD_OUT>};
 
 	open ($fh, ">", ".git/watchman-response.json");
 	print $fh $response;
diff --git a/templates/hooks--fsmonitor-watchman.sample b/templates/hooks--fsmonitor-watchman.sample
index 870a59d23..1b8ed173e 100755
--- a/templates/hooks--fsmonitor-watchman.sample
+++ b/templates/hooks--fsmonitor-watchman.sample
@@ -49,9 +49,6 @@ launch_watchman();
 
 sub launch_watchman {
 
-	# Set input record separator
-	local $/ = 0666;
-
 	my $pid = open2(\*CHLD_OUT, \*CHLD_IN, 'watchman -j')
 	    or die "open2() failed: $!\n" .
 	    "Falling back to scanning...\n";
@@ -78,7 +75,8 @@ sub launch_watchman {
 	END
 
 	print CHLD_IN $query;
-	my $response = <CHLD_OUT>;
+	close CHLD_IN;
+	my $response = do {local $/; <CHLD_OUT>};
 
 	die "Watchman: command returned no output.\n" .
 	    "Falling back to scanning...\n" if $response eq "";
-- 
2.14.2.959.g6663358d3

^ permalink raw reply related	[flat|nested] 137+ messages in thread

* Re: [PATCH v8 00/12] Fast git status via a file system watcher
  2017-10-04  2:09               ` Junio C Hamano
  2017-10-04  6:38                 ` Alex Vandiver
@ 2017-10-04 12:27                 ` Ben Peart
  1 sibling, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-10-04 12:27 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ben Peart, David.Turner@twosigma.com, avarab@gmail.com,
	christian.couder@gmail.com, git@vger.kernel.org,
	johannes.schindelin@gmx.de, pclouds@gmail.com, peff@peff.net



On 10/3/2017 10:09 PM, Junio C Hamano wrote:
> Ben Peart <peartben@gmail.com> writes:
> 
>> Well, rats. I found one more issue that applies to two of the
>> commits. Can you squash this in as well or do you want it in some
>> other form?
> 
> Rats indeed.  Let's go incremental as promised, perhaps like this
> (but please supply a better description if you have one).

Thank you.  Looks good.

> 
> -- >8 --
> From: Ben Peart <benpeart@microsoft.com>
> Subject: fsmonitor: MINGW support for watchman integration
> 
> Instead of just taking $ENV{'PWD'}, use the same logic that converts
> PWD to $git_work_tree on MSYS_NT in the watchman integration hook
> script also on MINGW.
> 
> Signed-off-by: Ben Peart <benpeart@microsoft.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>   t/t7519/fsmonitor-watchman                 | 2 +-
>   templates/hooks--fsmonitor-watchman.sample | 2 +-
>   2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/t/t7519/fsmonitor-watchman b/t/t7519/fsmonitor-watchman
> index 7ceb32dc18..cca3d71e90 100755
> --- a/t/t7519/fsmonitor-watchman
> +++ b/t/t7519/fsmonitor-watchman
> @@ -36,7 +36,7 @@ my $system = `uname -s`;
>   $system =~ s/[\r\n]+//g;
>   my $git_work_tree;
>   
> -if ($system =~ m/^MSYS_NT/) {
> +if ($system =~ m/^MSYS_NT/ || $system =~ m/^MINGW/) {
>   	$git_work_tree = `cygpath -aw "\$PWD"`;
>   	$git_work_tree =~ s/[\r\n]+//g;
>   	$git_work_tree =~ s,\\,/,g;
> diff --git a/templates/hooks--fsmonitor-watchman.sample b/templates/hooks--fsmonitor-watchman.sample
> index 870a59d237..c68038ef00 100755
> --- a/templates/hooks--fsmonitor-watchman.sample
> +++ b/templates/hooks--fsmonitor-watchman.sample
> @@ -35,7 +35,7 @@ my $system = `uname -s`;
>   $system =~ s/[\r\n]+//g;
>   my $git_work_tree;
>   
> -if ($system =~ m/^MSYS_NT/) {
> +if ($system =~ m/^MSYS_NT/ || $system =~ m/^MINGW/) {
>   	$git_work_tree = `cygpath -aw "\$PWD"`;
>   	$git_work_tree =~ s/[\r\n]+//g;
>   	$git_work_tree =~ s,\\,/,g;
> 

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [PATCH v8 00/12] Fast git status via a file system watcher
  2017-10-04  6:38                 ` Alex Vandiver
@ 2017-10-04 12:48                   ` Ben Peart
  0 siblings, 0 replies; 137+ messages in thread
From: Ben Peart @ 2017-10-04 12:48 UTC (permalink / raw)
  To: Alex Vandiver, Junio C Hamano
  Cc: Ben Peart, David.Turner@twosigma.com, avarab@gmail.com,
	christian.couder@gmail.com, git@vger.kernel.org,
	johannes.schindelin@gmx.de, pclouds@gmail.com, peff@peff.net



On 10/4/2017 2:38 AM, Alex Vandiver wrote:
> On Wed, 4 Oct 2017, Junio C Hamano wrote:
>> Rats indeed.  Let's go incremental as promised, perhaps like this
>> (but please supply a better description if you have one).
> 
> I think you'll also want the following squashed into 5c8cdcfd8 and
> def437671:
> 
> -- >8 --
>  From 445d45027bb5b7823338cf111910d2884af6318b Mon Sep 17 00:00:00 2001
> From: Alex Vandiver <alexmv@dropbox.com>
> Date: Tue, 3 Oct 2017 23:27:46 -0700
> Subject: [PATCH] fsmonitor: Read entirety of watchman output
> 
> In perl, setting $/ sets the string that is used as the "record
> separator," which sets the boundary that the `<>` construct reads to.
> Setting `local $/ = 0666;` evaluates the octal, getting 438, and
> stringifies it.  Thus, the later read from `<CHLD_OUT>` stops as soon
> as it encounters the string "438" in the watchman output, yielding
> invalid JSON; repositories containing filenames with SHA1 hashes are
> able to trip this easily.
> 
> Set `$/` to undefined, thus slurping all output from watchman.  Also
> close STDIN which is provided to watchman, to better guarantee that we
> cannot deadlock with watchman while both attempting to read.
> 

Thank you!  I'm a perl neophyte so have to rely on others when it comes 
to these types of perl issues.  I tried out your fixes and they appear 
to work well.

While testing them, I discovered that your fix of `local $/ = 0666;` 
exposed an existing issue in the test version of the integration script. 
  The fix for that is within my perl capabilities and is fixed with the 
following patch:

-- >8 --
 From 3e3b983a4208a62d166c233a7de3bf045322f6c7 Mon Sep 17 00:00:00 2001
From: Ben Peart <benpeart@microsoft.com>
Date: Wed, 4 Oct 2017 08:33:39 -0400
Subject: [PATCH] fsmonitor: preserve utf8 filenames in 
fsmonitor-watchman log

Update the test fsmonitor-watchman integration script to properly
preserve utf8 filenames when outputting the .git/watchman-output.out log
file.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
  t/t7519/fsmonitor-watchman | 1 +
  1 file changed, 1 insertion(+)

diff --git a/t/t7519/fsmonitor-watchman b/t/t7519/fsmonitor-watchman
index 51330f8b3d..a3e30bf54f 100755
--- a/t/t7519/fsmonitor-watchman
+++ b/t/t7519/fsmonitor-watchman
@@ -129,6 +129,7 @@ sub launch_watchman {
  	    "Falling back to scanning...\n" if $o->{error};

  	open ($fh, ">", ".git/watchman-output.out");
+	binmode $fh, ":utf8";
  	print $fh @{$o->{files}};
  	close $fh;

-- 
2.14.1.windows.1.1034.g0776750557


^ permalink raw reply related	[flat|nested] 137+ messages in thread

end of thread, other threads:[~2017-10-04 12:48 UTC | newest]

Thread overview: 137+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-10 13:40 [PATCH v5 0/7] Fast git status via a file system watcher Ben Peart
2017-06-10 13:40 ` [PATCH v5 1/7] bswap: add 64 bit endianness helper get_be64 Ben Peart
2017-06-10 13:40 ` [PATCH v5 2/7] dir: make lookup_untracked() available outside of dir.c Ben Peart
2017-06-10 13:40 ` [PATCH v5 3/7] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
2017-06-27 15:43   ` Christian Couder
2017-07-03 21:25     ` Ben Peart
2017-06-10 13:40 ` [PATCH v5 4/7] fsmonitor: add test cases for fsmonitor extension Ben Peart
2017-06-27 16:20   ` Christian Couder
2017-07-07 18:50     ` Ben Peart
2017-06-10 13:40 ` [PATCH v5 5/7] fsmonitor: add documentation for the " Ben Peart
2017-06-10 13:40 ` [PATCH v5 6/7] fsmonitor: add a sample query-fsmonitor hook script for Watchman Ben Peart
2017-06-10 13:40 ` [PATCH v5 7/7] fsmonitor: add a performance test Ben Peart
2017-06-10 14:04   ` Ben Peart
2017-06-12 22:04   ` Junio C Hamano
2017-06-14 14:12     ` Ben Peart
2017-06-14 18:36       ` Junio C Hamano
2017-07-07 18:14         ` Ben Peart
2017-07-07 18:35           ` Junio C Hamano
2017-07-07 19:07             ` Ben Peart
2017-07-07 19:33             ` David Turner
2017-07-08  7:19             ` Christian Couder
2017-06-28  5:11 ` [PATCH v5 0/7] Fast git status via a file system watcher Christian Couder
2017-07-10 13:36   ` Ben Peart
2017-07-10 14:40     ` Ben Peart
2017-09-15 19:20 ` [PATCH v6 00/12] " Ben Peart
2017-09-15 19:20   ` [PATCH v6 01/12] bswap: add 64 bit endianness helper get_be64 Ben Peart
2017-09-15 19:20   ` [PATCH v6 02/12] preload-index: add override to enable testing preload-index Ben Peart
2017-09-15 19:20   ` [PATCH v6 03/12] update-index: add a new --force-write-index option Ben Peart
2017-09-15 19:20   ` [PATCH v6 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
2017-09-15 21:35     ` David Turner
2017-09-18 13:07       ` Ben Peart
2017-09-18 13:32         ` David Turner
2017-09-18 13:49           ` Ben Peart
2017-09-15 19:20   ` [PATCH v6 05/12] fsmonitor: add documentation for the fsmonitor extension Ben Peart
2017-09-15 19:43     ` David Turner
2017-09-18 13:27       ` Ben Peart
2017-09-17  8:03     ` Junio C Hamano
2017-09-18 13:29       ` Ben Peart
2017-09-15 19:20   ` [PATCH v6 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit Ben Peart
2017-09-15 20:34     ` David Turner
2017-09-15 19:20   ` [PATCH v6 07/12] update-index: add fsmonitor support to update-index Ben Peart
2017-09-15 19:20   ` [PATCH v6 08/12] fsmonitor: add a test tool to dump the index extension Ben Peart
2017-09-17  8:02     ` Junio C Hamano
2017-09-18 13:38       ` Ben Peart
2017-09-18 15:43         ` Torsten Bögershausen
2017-09-18 16:28           ` Ben Peart
2017-09-19 14:16             ` Torsten Bögershausen
2017-09-19 15:36               ` Ben Peart
2017-09-15 19:20   ` [PATCH v6 09/12] split-index: disable the fsmonitor extension when running the split index test Ben Peart
2017-09-19 20:43     ` Jonathan Nieder
2017-09-20 17:11       ` Ben Peart
2017-09-20 17:46         ` Jonathan Nieder
2017-09-21  0:05           ` Ben Peart
2017-09-15 19:20   ` [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor extension Ben Peart
2017-09-15 22:00     ` David Turner
2017-09-19 19:32       ` David Turner
2017-09-19 20:30         ` Ben Peart
2017-09-16 15:27     ` Torsten Bögershausen
2017-09-17  5:43       ` [PATCH v1 1/1] test-lint: echo -e (or -E) is not portable tboegi
2017-09-19 20:37         ` Jonathan Nieder
2017-09-20 13:49           ` Torsten Bögershausen
2017-09-22  1:04             ` Junio C Hamano
2017-09-18 14:06       ` [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor extension Ben Peart
2017-09-17  4:47     ` Junio C Hamano
2017-09-18 15:25       ` Ben Peart
2017-09-19 20:34         ` Jonathan Nieder
2017-09-15 19:20   ` [PATCH v6 11/12] fsmonitor: add a sample integration script for Watchman Ben Peart
2017-09-15 19:20   ` [PATCH v6 12/12] fsmonitor: add a performance test Ben Peart
2017-09-15 21:56     ` David Turner
2017-09-18 14:24     ` Johannes Schindelin
2017-09-18 18:19       ` Ben Peart
2017-09-19 15:28         ` Johannes Schindelin
2017-09-19 19:27   ` [PATCH v7 00/12] Fast git status via a file system watcher Ben Peart
2017-09-19 19:27     ` [PATCH v7 01/12] bswap: add 64 bit endianness helper get_be64 Ben Peart
2017-09-19 19:27     ` [PATCH v7 02/12] preload-index: add override to enable testing preload-index Ben Peart
2017-09-20 22:06       ` Stefan Beller
2017-09-21  0:02         ` Ben Peart
2017-09-21  0:44           ` Stefan Beller
2017-09-19 19:27     ` [PATCH v7 03/12] update-index: add a new --force-write-index option Ben Peart
2017-09-20  5:47       ` Junio C Hamano
2017-09-20 14:58         ` Ben Peart
2017-09-21  1:46           ` Junio C Hamano
2017-09-21  2:06             ` Ben Peart
2017-09-21  2:18               ` Junio C Hamano
2017-09-21  2:32                 ` Junio C Hamano
2017-09-19 19:27     ` [PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
2017-09-20  2:28       ` Junio C Hamano
2017-09-20 16:19         ` Ben Peart
2017-09-21  2:00           ` Junio C Hamano
2017-09-21  2:24             ` Ben Peart
2017-09-21 14:35               ` Ben Peart
2017-09-22  1:02                 ` Junio C Hamano
2017-09-20  6:23       ` Junio C Hamano
2017-09-20 16:29         ` Ben Peart
2017-09-19 19:27     ` [PATCH v7 05/12] fsmonitor: add documentation for the fsmonitor extension Ben Peart
2017-09-20 10:00       ` Martin Ågren
2017-09-20 17:02         ` Ben Peart
2017-09-20 17:11           ` Martin Ågren
2017-09-19 19:27     ` [PATCH v7 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit Ben Peart
2017-09-19 19:46       ` David Turner
2017-09-19 20:44         ` Ben Peart
2017-09-19 21:27           ` David Turner
2017-09-19 22:44             ` Ben Peart
2017-09-19 19:27     ` [PATCH v7 07/12] update-index: add fsmonitor support to update-index Ben Peart
2017-09-19 19:27     ` [PATCH v7 08/12] fsmonitor: add a test tool to dump the index extension Ben Peart
2017-09-19 19:27     ` [PATCH v7 09/12] split-index: disable the fsmonitor extension when running the split index test Ben Peart
2017-09-19 19:27     ` [PATCH v7 10/12] fsmonitor: add test cases for fsmonitor extension Ben Peart
2017-09-19 19:27     ` [PATCH v7 11/12] fsmonitor: add a sample integration script for Watchman Ben Peart
2017-09-19 19:27     ` [PATCH v7 12/12] fsmonitor: add a performance test Ben Peart
2017-09-22 16:35     ` [PATCH v8 00/12] Fast git status via a file system watcher Ben Peart
2017-09-22 16:35       ` [PATCH v8 01/12] bswap: add 64 bit endianness helper get_be64 Ben Peart
2017-09-22 23:37         ` Martin Ågren
2017-09-23 23:31           ` Ben Peart
2017-09-24  3:51             ` Jeff King
2017-09-24  3:52             ` Junio C Hamano
2017-09-22 16:35       ` [PATCH v8 02/12] preload-index: add override to enable testing preload-index Ben Peart
2017-09-22 16:35       ` [PATCH v8 03/12] update-index: add a new --force-write-index option Ben Peart
2017-09-22 16:35       ` [PATCH v8 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
2017-09-22 16:35       ` [PATCH v8 05/12] fsmonitor: add documentation for the fsmonitor extension Ben Peart
2017-09-22 16:35       ` [PATCH v8 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit Ben Peart
2017-09-22 16:35       ` [PATCH v8 07/12] update-index: add fsmonitor support to update-index Ben Peart
2017-09-22 16:35       ` [PATCH v8 08/12] fsmonitor: add a test tool to dump the index extension Ben Peart
2017-09-22 23:37         ` Martin Ågren
2017-09-23 23:33           ` Ben Peart
2017-09-24  3:51             ` Junio C Hamano
2017-09-22 16:35       ` [PATCH v8 09/12] split-index: disable the fsmonitor extension when running the split index test Ben Peart
2017-09-22 16:35       ` [PATCH v8 10/12] fsmonitor: add test cases for fsmonitor extension Ben Peart
2017-09-22 16:35       ` [PATCH v8 11/12] fsmonitor: add a sample integration script for Watchman Ben Peart
2017-09-22 16:35       ` [PATCH v8 12/12] fsmonitor: add a performance test Ben Peart
2017-09-29  2:20       ` [PATCH v8 00/12] Fast git status via a file system watcher Junio C Hamano
2017-09-29 12:07         ` Ben Peart
2017-10-01  8:24           ` Junio C Hamano
2017-10-03 19:48             ` Ben Peart
2017-10-04  2:09               ` Junio C Hamano
2017-10-04  6:38                 ` Alex Vandiver
2017-10-04 12:48                   ` Ben Peart
2017-10-04 12:27                 ` Ben Peart

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).