git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 01/32] ewah: fix constness of ewah_read_mmap
  2014-04-28 10:55 [PATCH 00/32] Split index mode for very large indexes Nguyễn Thái Ngọc Duy
@ 2014-04-28 10:55 ` Nguyễn Thái Ngọc Duy
  0 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-04-28 10:55 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 ewah/ewah_io.c | 4 ++--
 ewah/ewok.h    | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/ewah/ewah_io.c b/ewah/ewah_io.c
index f7f700e..1c2d7af 100644
--- a/ewah/ewah_io.c
+++ b/ewah/ewah_io.c
@@ -110,9 +110,9 @@ int ewah_serialize(struct ewah_bitmap *self, int fd)
 	return ewah_serialize_to(self, write_helper, (void *)(intptr_t)fd);
 }
 
-int ewah_read_mmap(struct ewah_bitmap *self, void *map, size_t len)
+int ewah_read_mmap(struct ewah_bitmap *self, const void *map, size_t len)
 {
-	uint8_t *ptr = map;
+	const uint8_t *ptr = map;
 	size_t i;
 
 	self->bit_size = get_be32(ptr);
diff --git a/ewah/ewok.h b/ewah/ewok.h
index 43adeb5..0556ca5 100644
--- a/ewah/ewok.h
+++ b/ewah/ewok.h
@@ -99,7 +99,7 @@ int ewah_serialize(struct ewah_bitmap *self, int fd);
 int ewah_serialize_native(struct ewah_bitmap *self, int fd);
 
 int ewah_deserialize(struct ewah_bitmap *self, int fd);
-int ewah_read_mmap(struct ewah_bitmap *self, void *map, size_t len);
+int ewah_read_mmap(struct ewah_bitmap *self, const void *map, size_t len);
 int ewah_read_mmap_native(struct ewah_bitmap *self, void *map, size_t len);
 
 uint32_t ewah_checksum(struct ewah_bitmap *self);
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 00/32] Split index resend
@ 2014-06-13 12:19 Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 01/32] ewah: fix constness of ewah_read_mmap Nguyễn Thái Ngọc Duy
                   ` (31 more replies)
  0 siblings, 32 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

This is basically what's in 'pu' with fixup patches squashed in (also
resend is a good way to get people's eyes on it one more time).
There's also another minor change that SOMETHING_CHANGED now has value
1.  This is the usual value of cache_changed before this series. So if
another in-flight series update cache_changed the old way, nothing's
subtly broken.

Nguyễn Thái Ngọc Duy (32):
  ewah: fix constness of ewah_read_mmap
  ewah: delete unused ewah_read_mmap_native declaration
  sequencer: do not update/refresh index if the lock cannot be held
  read-cache: new API write_locked_index instead of write_index/write_cache
  read-cache: relocate and unexport commit_locked_index()
  read-cache: store in-memory flags in the first 12 bits of ce_flags
  read-cache: be strict about "changed" in remove_marked_cache_entries()
  read-cache: be specific what part of the index has changed
  update-index: be specific what part of the index has changed
  resolve-undo: be specific what part of the index has changed
  unpack-trees: be specific what part of the index has changed
  cache-tree: mark istate->cache_changed on cache tree invalidation
  cache-tree: mark istate->cache_changed on cache tree update
  cache-tree: mark istate->cache_changed on prime_cache_tree()
  entry.c: update cache_changed if refresh_cache is set in checkout_entry()
  read-cache: save index SHA-1 after reading
  read-cache: split-index mode
  read-cache: mark new entries for split index
  read-cache: save deleted entries in split index
  read-cache: mark updated entries for split index
  split-index: the writing part
  split-index: the reading part
  split-index: do not invalidate cache-tree at read time
  split-index: strip pathname of on-disk replaced entries
  update-index: new options to enable/disable split index mode
  update-index --split-index: do not split if $GIT_DIR is read only
  rev-parse: add --shared-index-path to get shared index path
  read-tree: force split-index mode off on --index-output
  read-tree: note about dropping split-index mode or index version
  read-cache: force split index mode with GIT_TEST_SPLIT_INDEX
  t2104: make sure split index mode is off for the version test
  t1700: new tests for split-index mode

 .gitignore                               |   1 +
 Documentation/git-rev-parse.txt          |   4 +
 Documentation/git-update-index.txt       |  11 ++
 Documentation/gitrepository-layout.txt   |   4 +
 Documentation/technical/index-format.txt |  35 ++++
 Makefile                                 |   2 +
 builtin/add.c                            |   6 +-
 builtin/apply.c                          |  17 +-
 builtin/blame.c                          |   2 +-
 builtin/checkout-index.c                 |   4 +-
 builtin/checkout.c                       |  12 +-
 builtin/clone.c                          |   7 +-
 builtin/commit.c                         |  33 ++--
 builtin/merge.c                          |  12 +-
 builtin/mv.c                             |   7 +-
 builtin/read-tree.c                      |  18 +-
 builtin/reset.c                          |   7 +-
 builtin/rev-parse.c                      |  10 +
 builtin/rm.c                             |   7 +-
 builtin/update-index.c                   |  33 +++-
 cache-tree.c                             |  52 ++---
 cache-tree.h                             |   6 +-
 cache.h                                  |  28 ++-
 entry.c                                  |   3 +
 ewah/ewah_io.c                           |   4 +-
 ewah/ewok.h                              |   3 +-
 lockfile.c                               |  20 --
 merge-recursive.c                        |  11 +-
 merge.c                                  |   7 +-
 read-cache.c                             | 270 ++++++++++++++++++++++---
 rerere.c                                 |   3 +-
 resolve-undo.c                           |   2 +-
 sequencer.c                              |  16 +-
 split-index.c (new)                      | 328 +++++++++++++++++++++++++++++++
 split-index.h (new)                      |  35 ++++
 t/t1700-split-index.sh (new +x)          | 194 ++++++++++++++++++
 t/t2104-update-index-skip-worktree.sh    |   2 +
 test-dump-cache-tree.c                   |   7 +-
 test-dump-split-index.c (new)            |  34 ++++
 test-scrap-cache-tree.c                  |   5 +-
 unpack-trees.c                           |  18 +-
 41 files changed, 1088 insertions(+), 192 deletions(-)
 create mode 100644 split-index.c
 create mode 100644 split-index.h
 create mode 100755 t/t1700-split-index.sh
 create mode 100644 test-dump-split-index.c

-- 
1.9.1.346.ga2b5940

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 01/32] ewah: fix constness of ewah_read_mmap
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 02/32] ewah: delete unused ewah_read_mmap_native declaration Nguyễn Thái Ngọc Duy
                   ` (30 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 ewah/ewah_io.c | 4 ++--
 ewah/ewok.h    | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/ewah/ewah_io.c b/ewah/ewah_io.c
index f7f700e..1c2d7af 100644
--- a/ewah/ewah_io.c
+++ b/ewah/ewah_io.c
@@ -110,9 +110,9 @@ int ewah_serialize(struct ewah_bitmap *self, int fd)
 	return ewah_serialize_to(self, write_helper, (void *)(intptr_t)fd);
 }
 
-int ewah_read_mmap(struct ewah_bitmap *self, void *map, size_t len)
+int ewah_read_mmap(struct ewah_bitmap *self, const void *map, size_t len)
 {
-	uint8_t *ptr = map;
+	const uint8_t *ptr = map;
 	size_t i;
 
 	self->bit_size = get_be32(ptr);
diff --git a/ewah/ewok.h b/ewah/ewok.h
index 43adeb5..0556ca5 100644
--- a/ewah/ewok.h
+++ b/ewah/ewok.h
@@ -99,7 +99,7 @@ int ewah_serialize(struct ewah_bitmap *self, int fd);
 int ewah_serialize_native(struct ewah_bitmap *self, int fd);
 
 int ewah_deserialize(struct ewah_bitmap *self, int fd);
-int ewah_read_mmap(struct ewah_bitmap *self, void *map, size_t len);
+int ewah_read_mmap(struct ewah_bitmap *self, const void *map, size_t len);
 int ewah_read_mmap_native(struct ewah_bitmap *self, void *map, size_t len);
 
 uint32_t ewah_checksum(struct ewah_bitmap *self);
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 02/32] ewah: delete unused ewah_read_mmap_native declaration
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 01/32] ewah: fix constness of ewah_read_mmap Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 03/32] sequencer: do not update/refresh index if the lock cannot be held Nguyễn Thái Ngọc Duy
                   ` (29 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 ewah/ewok.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/ewah/ewok.h b/ewah/ewok.h
index 0556ca5..f6ad190 100644
--- a/ewah/ewok.h
+++ b/ewah/ewok.h
@@ -100,7 +100,6 @@ int ewah_serialize_native(struct ewah_bitmap *self, int fd);
 
 int ewah_deserialize(struct ewah_bitmap *self, int fd);
 int ewah_read_mmap(struct ewah_bitmap *self, const void *map, size_t len);
-int ewah_read_mmap_native(struct ewah_bitmap *self, void *map, size_t len);
 
 uint32_t ewah_checksum(struct ewah_bitmap *self);
 
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 03/32] sequencer: do not update/refresh index if the lock cannot be held
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 01/32] ewah: fix constness of ewah_read_mmap Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 02/32] ewah: delete unused ewah_read_mmap_native declaration Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 04/32] read-cache: new API write_locked_index instead of write_index/write_cache Nguyễn Thái Ngọc Duy
                   ` (28 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 sequencer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sequencer.c b/sequencer.c
index bde5f04..7b886a6 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -679,7 +679,7 @@ static void read_and_refresh_cache(struct replay_opts *opts)
 	if (read_index_preload(&the_index, NULL) < 0)
 		die(_("git %s: failed to read the index"), action_name(opts));
 	refresh_index(&the_index, REFRESH_QUIET|REFRESH_UNMERGED, NULL, NULL, NULL);
-	if (the_index.cache_changed) {
+	if (the_index.cache_changed && index_fd >= 0) {
 		if (write_index(&the_index, index_fd) ||
 		    commit_locked_index(&index_lock))
 			die(_("git %s: failed to refresh the index"), action_name(opts));
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 04/32] read-cache: new API write_locked_index instead of write_index/write_cache
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (2 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 03/32] sequencer: do not update/refresh index if the lock cannot be held Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 05/32] read-cache: relocate and unexport commit_locked_index() Nguyễn Thái Ngọc Duy
                   ` (27 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/add.c            |  6 ++----
 builtin/apply.c          |  9 ++++-----
 builtin/checkout-index.c |  3 +--
 builtin/checkout.c       | 11 ++++-------
 builtin/clone.c          |  7 +++----
 builtin/commit.c         | 33 ++++++++++++++-------------------
 builtin/merge.c          | 12 ++++--------
 builtin/mv.c             |  7 +++----
 builtin/read-tree.c      |  7 +++----
 builtin/reset.c          |  5 ++---
 builtin/rm.c             |  7 +++----
 builtin/update-index.c   |  3 +--
 cache-tree.c             |  3 +--
 cache.h                  |  6 ++++--
 merge-recursive.c        |  7 +++----
 merge.c                  |  7 +++----
 read-cache.c             | 28 ++++++++++++++++++++++++----
 rerere.c                 |  3 +--
 sequencer.c              | 10 ++++------
 test-scrap-cache-tree.c  |  5 ++---
 20 files changed, 86 insertions(+), 93 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index 459208a..4baf3a5 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -299,7 +299,6 @@ static int add_files(struct dir_struct *dir, int flags)
 int cmd_add(int argc, const char **argv, const char *prefix)
 {
 	int exit_status = 0;
-	int newfd;
 	struct pathspec pathspec;
 	struct dir_struct dir;
 	int flags;
@@ -345,7 +344,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 	add_new_files = !take_worktree_changes && !refresh_only;
 	require_pathspec = !take_worktree_changes;
 
-	newfd = hold_locked_index(&lock_file, 1);
+	hold_locked_index(&lock_file, 1);
 
 	flags = ((verbose ? ADD_CACHE_VERBOSE : 0) |
 		 (show_only ? ADD_CACHE_PRETEND : 0) |
@@ -443,8 +442,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 
 finish:
 	if (active_cache_changed) {
-		if (write_cache(newfd, active_cache, active_nr) ||
-		    commit_locked_index(&lock_file))
+		if (write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
 			die(_("Unable to write new index file"));
 	}
 
diff --git a/builtin/apply.c b/builtin/apply.c
index 87439fa..5e13444 100644
--- a/builtin/apply.c
+++ b/builtin/apply.c
@@ -3644,7 +3644,7 @@ static void build_fake_ancestor(struct patch *list, const char *filename)
 {
 	struct patch *patch;
 	struct index_state result = { NULL };
-	int fd;
+	static struct lock_file lock;
 
 	/* Once we start supporting the reverse patch, it may be
 	 * worth showing the new sha1 prefix, but until then...
@@ -3682,8 +3682,8 @@ static void build_fake_ancestor(struct patch *list, const char *filename)
 			die ("Could not add %s to temporary index", name);
 	}
 
-	fd = open(filename, O_WRONLY | O_CREAT, 0666);
-	if (fd < 0 || write_index(&result, fd) || close(fd))
+	hold_lock_file_for_update(&lock, filename, LOCK_DIE_ON_ERROR);
+	if (write_locked_index(&result, &lock, COMMIT_LOCK))
 		die ("Could not write temporary index to %s", filename);
 
 	discard_index(&result);
@@ -4501,8 +4501,7 @@ int cmd_apply(int argc, const char **argv, const char *prefix_)
 	}
 
 	if (update_index) {
-		if (write_cache(newfd, active_cache, active_nr) ||
-		    commit_locked_index(&lock_file))
+		if (write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
 			die(_("Unable to write new index file"));
 	}
 
diff --git a/builtin/checkout-index.c b/builtin/checkout-index.c
index 61e75eb..9e49bf2 100644
--- a/builtin/checkout-index.c
+++ b/builtin/checkout-index.c
@@ -279,8 +279,7 @@ int cmd_checkout_index(int argc, const char **argv, const char *prefix)
 		checkout_all(prefix, prefix_length);
 
 	if (0 <= newfd &&
-	    (write_cache(newfd, active_cache, active_nr) ||
-	     commit_locked_index(&lock_file)))
+	    write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
 		die("Unable to write new index file");
 	return 0;
 }
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 07cf555..944a634 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -225,7 +225,6 @@ static int checkout_paths(const struct checkout_opts *opts,
 	int flag;
 	struct commit *head;
 	int errs = 0;
-	int newfd;
 	struct lock_file *lock_file;
 
 	if (opts->track != BRANCH_TRACK_UNSPECIFIED)
@@ -256,7 +255,7 @@ static int checkout_paths(const struct checkout_opts *opts,
 
 	lock_file = xcalloc(1, sizeof(struct lock_file));
 
-	newfd = hold_locked_index(lock_file, 1);
+	hold_locked_index(lock_file, 1);
 	if (read_cache_preload(&opts->pathspec) < 0)
 		return error(_("corrupt index file"));
 
@@ -352,8 +351,7 @@ static int checkout_paths(const struct checkout_opts *opts,
 		}
 	}
 
-	if (write_cache(newfd, active_cache, active_nr) ||
-	    commit_locked_index(lock_file))
+	if (write_locked_index(&the_index, lock_file, COMMIT_LOCK))
 		die(_("unable to write new index file"));
 
 	read_ref_full("HEAD", rev, 0, &flag);
@@ -444,8 +442,8 @@ static int merge_working_tree(const struct checkout_opts *opts,
 {
 	int ret;
 	struct lock_file *lock_file = xcalloc(1, sizeof(struct lock_file));
-	int newfd = hold_locked_index(lock_file, 1);
 
+	hold_locked_index(lock_file, 1);
 	if (read_cache_preload(NULL) < 0)
 		return error(_("corrupt index file"));
 
@@ -553,8 +551,7 @@ static int merge_working_tree(const struct checkout_opts *opts,
 		}
 	}
 
-	if (write_cache(newfd, active_cache, active_nr) ||
-	    commit_locked_index(lock_file))
+	if (write_locked_index(&the_index, lock_file, COMMIT_LOCK))
 		die(_("unable to write new index file"));
 
 	if (!opts->force && !opts->quiet)
diff --git a/builtin/clone.c b/builtin/clone.c
index 9b3c04d..48f91f5 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -616,7 +616,7 @@ static int checkout(void)
 	struct unpack_trees_options opts;
 	struct tree *tree;
 	struct tree_desc t;
-	int err = 0, fd;
+	int err = 0;
 
 	if (option_no_checkout)
 		return 0;
@@ -640,7 +640,7 @@ static int checkout(void)
 	setup_work_tree();
 
 	lock_file = xcalloc(1, sizeof(struct lock_file));
-	fd = hold_locked_index(lock_file, 1);
+	hold_locked_index(lock_file, 1);
 
 	memset(&opts, 0, sizeof opts);
 	opts.update = 1;
@@ -656,8 +656,7 @@ static int checkout(void)
 	if (unpack_trees(1, &t, &opts) < 0)
 		die(_("unable to checkout working tree"));
 
-	if (write_cache(fd, active_cache, active_nr) ||
-	    commit_locked_index(lock_file))
+	if (write_locked_index(&the_index, lock_file, COMMIT_LOCK))
 		die(_("unable to write new index file"));
 
 	err |= run_hook_le(NULL, "post-checkout", sha1_to_hex(null_sha1),
diff --git a/builtin/commit.c b/builtin/commit.c
index 9cfef6c..243b0c3 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -305,7 +305,6 @@ static void refresh_cache_or_die(int refresh_flags)
 static char *prepare_index(int argc, const char **argv, const char *prefix,
 			   const struct commit *current_head, int is_status)
 {
-	int fd;
 	struct string_list partial;
 	struct pathspec pathspec;
 	int refresh_flags = REFRESH_QUIET;
@@ -321,12 +320,11 @@ static char *prepare_index(int argc, const char **argv, const char *prefix,
 
 	if (interactive) {
 		char *old_index_env = NULL;
-		fd = hold_locked_index(&index_lock, 1);
+		hold_locked_index(&index_lock, 1);
 
 		refresh_cache_or_die(refresh_flags);
 
-		if (write_cache(fd, active_cache, active_nr) ||
-		    close_lock_file(&index_lock))
+		if (write_locked_index(&the_index, &index_lock, CLOSE_LOCK))
 			die(_("unable to create temporary index"));
 
 		old_index_env = getenv(INDEX_ENVIRONMENT);
@@ -360,12 +358,11 @@ static char *prepare_index(int argc, const char **argv, const char *prefix,
 	 * (B) on failure, rollback the real index.
 	 */
 	if (all || (also && pathspec.nr)) {
-		fd = hold_locked_index(&index_lock, 1);
+		hold_locked_index(&index_lock, 1);
 		add_files_to_cache(also ? prefix : NULL, &pathspec, 0);
 		refresh_cache_or_die(refresh_flags);
 		update_main_cache_tree(WRITE_TREE_SILENT);
-		if (write_cache(fd, active_cache, active_nr) ||
-		    close_lock_file(&index_lock))
+		if (write_locked_index(&the_index, &index_lock, CLOSE_LOCK))
 			die(_("unable to write new_index file"));
 		commit_style = COMMIT_NORMAL;
 		return index_lock.filename;
@@ -381,12 +378,12 @@ static char *prepare_index(int argc, const char **argv, const char *prefix,
 	 * We still need to refresh the index here.
 	 */
 	if (!only && !pathspec.nr) {
-		fd = hold_locked_index(&index_lock, 1);
+		hold_locked_index(&index_lock, 1);
 		refresh_cache_or_die(refresh_flags);
 		if (active_cache_changed) {
 			update_main_cache_tree(WRITE_TREE_SILENT);
-			if (write_cache(fd, active_cache, active_nr) ||
-			    commit_locked_index(&index_lock))
+			if (write_locked_index(&the_index, &index_lock,
+					       COMMIT_LOCK))
 				die(_("unable to write new_index file"));
 		} else {
 			rollback_lock_file(&index_lock);
@@ -432,24 +429,22 @@ static char *prepare_index(int argc, const char **argv, const char *prefix,
 	if (read_cache() < 0)
 		die(_("cannot read the index"));
 
-	fd = hold_locked_index(&index_lock, 1);
+	hold_locked_index(&index_lock, 1);
 	add_remove_files(&partial);
 	refresh_cache(REFRESH_QUIET);
-	if (write_cache(fd, active_cache, active_nr) ||
-	    close_lock_file(&index_lock))
+	if (write_locked_index(&the_index, &index_lock, CLOSE_LOCK))
 		die(_("unable to write new_index file"));
 
-	fd = hold_lock_file_for_update(&false_lock,
-				       git_path("next-index-%"PRIuMAX,
-						(uintmax_t) getpid()),
-				       LOCK_DIE_ON_ERROR);
+	hold_lock_file_for_update(&false_lock,
+				  git_path("next-index-%"PRIuMAX,
+					   (uintmax_t) getpid()),
+				  LOCK_DIE_ON_ERROR);
 
 	create_base_index(current_head);
 	add_remove_files(&partial);
 	refresh_cache(REFRESH_QUIET);
 
-	if (write_cache(fd, active_cache, active_nr) ||
-	    close_lock_file(&false_lock))
+	if (write_locked_index(&the_index, &false_lock, CLOSE_LOCK))
 		die(_("unable to write temporary index file"));
 
 	discard_cache();
diff --git a/builtin/merge.c b/builtin/merge.c
index 66d8843..bf770b6 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -657,14 +657,12 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
 			      struct commit_list *remoteheads,
 			      struct commit *head, const char *head_arg)
 {
-	int index_fd;
 	struct lock_file *lock = xcalloc(1, sizeof(struct lock_file));
 
-	index_fd = hold_locked_index(lock, 1);
+	hold_locked_index(lock, 1);
 	refresh_cache(REFRESH_QUIET);
 	if (active_cache_changed &&
-			(write_cache(index_fd, active_cache, active_nr) ||
-			 commit_locked_index(lock)))
+	    write_locked_index(&the_index, lock, COMMIT_LOCK))
 		return error(_("Unable to write index."));
 	rollback_lock_file(lock);
 
@@ -672,7 +670,6 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
 		int clean, x;
 		struct commit *result;
 		struct lock_file *lock = xcalloc(1, sizeof(struct lock_file));
-		int index_fd;
 		struct commit_list *reversed = NULL;
 		struct merge_options o;
 		struct commit_list *j;
@@ -700,12 +697,11 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
 		for (j = common; j; j = j->next)
 			commit_list_insert(j->item, &reversed);
 
-		index_fd = hold_locked_index(lock, 1);
+		hold_locked_index(lock, 1);
 		clean = merge_recursive(&o, head,
 				remoteheads->item, reversed, &result);
 		if (active_cache_changed &&
-				(write_cache(index_fd, active_cache, active_nr) ||
-				 commit_locked_index(lock)))
+		    write_locked_index(&the_index, lock, COMMIT_LOCK))
 			die (_("unable to write %s"), get_index_file());
 		rollback_lock_file(lock);
 		return clean ? 0 : 1;
diff --git a/builtin/mv.c b/builtin/mv.c
index 2a7243f..db40777 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -63,7 +63,7 @@ static struct lock_file lock_file;
 
 int cmd_mv(int argc, const char **argv, const char *prefix)
 {
-	int i, newfd, gitmodules_modified = 0;
+	int i, gitmodules_modified = 0;
 	int verbose = 0, show_only = 0, force = 0, ignore_errors = 0;
 	struct option builtin_mv_options[] = {
 		OPT__VERBOSE(&verbose, N_("be verbose")),
@@ -85,7 +85,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 	if (--argc < 1)
 		usage_with_options(builtin_mv_usage, builtin_mv_options);
 
-	newfd = hold_locked_index(&lock_file, 1);
+	hold_locked_index(&lock_file, 1);
 	if (read_cache() < 0)
 		die(_("index file corrupt"));
 
@@ -275,8 +275,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		stage_updated_gitmodules();
 
 	if (active_cache_changed) {
-		if (write_cache(newfd, active_cache, active_nr) ||
-		    commit_locked_index(&lock_file))
+		if (write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
 			die(_("Unable to write new index file"));
 	}
 
diff --git a/builtin/read-tree.c b/builtin/read-tree.c
index 0d7ef84..f26d90f 100644
--- a/builtin/read-tree.c
+++ b/builtin/read-tree.c
@@ -99,7 +99,7 @@ static struct lock_file lock_file;
 
 int cmd_read_tree(int argc, const char **argv, const char *unused_prefix)
 {
-	int i, newfd, stage = 0;
+	int i, stage = 0;
 	unsigned char sha1[20];
 	struct tree_desc t[MAX_UNPACK_TREES];
 	struct unpack_trees_options opts;
@@ -149,7 +149,7 @@ int cmd_read_tree(int argc, const char **argv, const char *unused_prefix)
 	argc = parse_options(argc, argv, unused_prefix, read_tree_options,
 			     read_tree_usage, 0);
 
-	newfd = hold_locked_index(&lock_file, 1);
+	hold_locked_index(&lock_file, 1);
 
 	prefix_set = opts.prefix ? 1 : 0;
 	if (1 < opts.merge + opts.reset + prefix_set)
@@ -233,8 +233,7 @@ int cmd_read_tree(int argc, const char **argv, const char *unused_prefix)
 	if (nr_trees == 1 && !opts.prefix)
 		prime_cache_tree(&active_cache_tree, trees[0]);
 
-	if (write_cache(newfd, active_cache, active_nr) ||
-	    commit_locked_index(&lock_file))
+	if (write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
 		die("unable to write new index file");
 	return 0;
 }
diff --git a/builtin/reset.c b/builtin/reset.c
index f4e0875..0c56d28 100644
--- a/builtin/reset.c
+++ b/builtin/reset.c
@@ -351,7 +351,7 @@ int cmd_reset(int argc, const char **argv, const char *prefix)
 
 	if (reset_type != SOFT) {
 		struct lock_file *lock = xcalloc(1, sizeof(*lock));
-		int newfd = hold_locked_index(lock, 1);
+		hold_locked_index(lock, 1);
 		if (reset_type == MIXED) {
 			int flags = quiet ? REFRESH_QUIET : REFRESH_IN_PORCELAIN;
 			if (read_from_tree(&pathspec, sha1, intent_to_add))
@@ -367,8 +367,7 @@ int cmd_reset(int argc, const char **argv, const char *prefix)
 				die(_("Could not reset index file to revision '%s'."), rev);
 		}
 
-		if (write_cache(newfd, active_cache, active_nr) ||
-		    commit_locked_index(lock))
+		if (write_locked_index(&the_index, lock, COMMIT_LOCK))
 			die(_("Could not write new index file."));
 	}
 
diff --git a/builtin/rm.c b/builtin/rm.c
index 960634d..bc6490b 100644
--- a/builtin/rm.c
+++ b/builtin/rm.c
@@ -278,7 +278,7 @@ static struct option builtin_rm_options[] = {
 
 int cmd_rm(int argc, const char **argv, const char *prefix)
 {
-	int i, newfd;
+	int i;
 	struct pathspec pathspec;
 	char *seen;
 
@@ -293,7 +293,7 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 	if (!index_only)
 		setup_work_tree();
 
-	newfd = hold_locked_index(&lock_file, 1);
+	hold_locked_index(&lock_file, 1);
 
 	if (read_cache() < 0)
 		die(_("index file corrupt"));
@@ -427,8 +427,7 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 	}
 
 	if (active_cache_changed) {
-		if (write_cache(newfd, active_cache, active_nr) ||
-		    commit_locked_index(&lock_file))
+		if (write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
 			die(_("Unable to write new index file"));
 	}
 
diff --git a/builtin/update-index.c b/builtin/update-index.c
index ba54e19..42cbe4b 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -921,8 +921,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 				exit(128);
 			unable_to_lock_index_die(get_index_file(), lock_error);
 		}
-		if (write_cache(newfd, active_cache, active_nr) ||
-		    commit_locked_index(lock_file))
+		if (write_locked_index(&the_index, lock_file, COMMIT_LOCK))
 			die("Unable to write new index file");
 	}
 
diff --git a/cache-tree.c b/cache-tree.c
index 7fa524a..52f8692 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -595,8 +595,7 @@ int write_cache_as_tree(unsigned char *sha1, int flags, const char *prefix)
 				      active_nr, flags) < 0)
 			return WRITE_TREE_UNMERGED_INDEX;
 		if (0 <= newfd) {
-			if (!write_cache(newfd, active_cache, active_nr) &&
-			    !commit_lock_file(lock_file))
+			if (!write_locked_index(&the_index, lock_file, COMMIT_LOCK))
 				newfd = -1;
 		}
 		/* Not being able to write is fine -- we are only interested
diff --git a/cache.h b/cache.h
index 107ac61..9cc2b97 100644
--- a/cache.h
+++ b/cache.h
@@ -301,7 +301,6 @@ extern void free_name_hash(struct index_state *istate);
 #define read_cache_preload(pathspec) read_index_preload(&the_index, (pathspec))
 #define is_cache_unborn() is_index_unborn(&the_index)
 #define read_cache_unmerged() read_index_unmerged(&the_index)
-#define write_cache(newfd, cache, entries) write_index(&the_index, (newfd))
 #define discard_cache() discard_index(&the_index)
 #define unmerged_cache() unmerged_index(&the_index)
 #define cache_name_pos(name, namelen) index_name_pos(&the_index,(name),(namelen))
@@ -456,12 +455,15 @@ extern int daemonize(void);
 	} while (0)
 
 /* Initialize and use the cache information */
+struct lock_file;
 extern int read_index(struct index_state *);
 extern int read_index_preload(struct index_state *, const struct pathspec *pathspec);
 extern int read_index_from(struct index_state *, const char *path);
 extern int is_index_unborn(struct index_state *);
 extern int read_index_unmerged(struct index_state *);
-extern int write_index(struct index_state *, int newfd);
+#define COMMIT_LOCK		(1 << 0)
+#define CLOSE_LOCK		(1 << 1)
+extern int write_locked_index(struct index_state *, struct lock_file *lock, unsigned flags);
 extern int discard_index(struct index_state *);
 extern int unmerged_index(const struct index_state *);
 extern int verify_path(const char *path);
diff --git a/merge-recursive.c b/merge-recursive.c
index 4177092..442c1ec 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1986,7 +1986,7 @@ int merge_recursive_generic(struct merge_options *o,
 			    const unsigned char **base_list,
 			    struct commit **result)
 {
-	int clean, index_fd;
+	int clean;
 	struct lock_file *lock = xcalloc(1, sizeof(struct lock_file));
 	struct commit *head_commit = get_ref(head, o->branch1);
 	struct commit *next_commit = get_ref(merge, o->branch2);
@@ -2003,12 +2003,11 @@ int merge_recursive_generic(struct merge_options *o,
 		}
 	}
 
-	index_fd = hold_locked_index(lock, 1);
+	hold_locked_index(lock, 1);
 	clean = merge_recursive(o, head_commit, next_commit, ca,
 			result);
 	if (active_cache_changed &&
-			(write_cache(index_fd, active_cache, active_nr) ||
-			 commit_locked_index(lock)))
+	    write_locked_index(&the_index, lock, COMMIT_LOCK))
 		return error(_("Unable to write index."));
 
 	return clean ? 0 : 1;
diff --git a/merge.c b/merge.c
index 70f1000..610725c 100644
--- a/merge.c
+++ b/merge.c
@@ -66,13 +66,13 @@ int checkout_fast_forward(const unsigned char *head,
 	struct tree *trees[MAX_UNPACK_TREES];
 	struct unpack_trees_options opts;
 	struct tree_desc t[MAX_UNPACK_TREES];
-	int i, fd, nr_trees = 0;
+	int i, nr_trees = 0;
 	struct dir_struct dir;
 	struct lock_file *lock_file = xcalloc(1, sizeof(struct lock_file));
 
 	refresh_cache(REFRESH_QUIET);
 
-	fd = hold_locked_index(lock_file, 1);
+	hold_locked_index(lock_file, 1);
 
 	memset(&trees, 0, sizeof(trees));
 	memset(&opts, 0, sizeof(opts));
@@ -105,8 +105,7 @@ int checkout_fast_forward(const unsigned char *head,
 	}
 	if (unpack_trees(nr_trees, t, &opts))
 		return -1;
-	if (write_cache(fd, active_cache, active_nr) ||
-		commit_locked_index(lock_file))
+	if (write_locked_index(&the_index, lock_file, COMMIT_LOCK))
 		die(_("unable to write new index file"));
 	return 0;
 }
diff --git a/read-cache.c b/read-cache.c
index ba13353..44d4732 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1779,13 +1779,11 @@ static int has_racy_timestamp(struct index_state *istate)
 void update_index_if_able(struct index_state *istate, struct lock_file *lockfile)
 {
 	if ((istate->cache_changed || has_racy_timestamp(istate)) &&
-	    !write_index(istate, lockfile->fd))
-		commit_locked_index(lockfile);
-	else
+	    write_locked_index(istate, lockfile, COMMIT_LOCK))
 		rollback_lock_file(lockfile);
 }
 
-int write_index(struct index_state *istate, int newfd)
+static int do_write_index(struct index_state *istate, int newfd)
 {
 	git_SHA_CTX c;
 	struct cache_header hdr;
@@ -1877,6 +1875,28 @@ int write_index(struct index_state *istate, int newfd)
 	return 0;
 }
 
+static int do_write_locked_index(struct index_state *istate, struct lock_file *lock,
+				 unsigned flags)
+{
+	int ret = do_write_index(istate, lock->fd);
+	if (ret)
+		return ret;
+	assert((flags & (COMMIT_LOCK | CLOSE_LOCK)) !=
+	       (COMMIT_LOCK | CLOSE_LOCK));
+	if (flags & COMMIT_LOCK)
+		return commit_locked_index(lock);
+	else if (flags & CLOSE_LOCK)
+		return close_lock_file(lock);
+	else
+		return ret;
+}
+
+int write_locked_index(struct index_state *istate, struct lock_file *lock,
+		       unsigned flags)
+{
+	return do_write_locked_index(istate, lock, flags);
+}
+
 /*
  * Read the index file that is potentially unmerged into given
  * index_state, dropping any unmerged entries.  Returns true if
diff --git a/rerere.c b/rerere.c
index d55aa8a..ffc6a5b 100644
--- a/rerere.c
+++ b/rerere.c
@@ -492,8 +492,7 @@ static int update_paths(struct string_list *update)
 	}
 
 	if (!status && active_cache_changed) {
-		if (write_cache(fd, active_cache, active_nr) ||
-		    commit_locked_index(&index_lock))
+		if (write_locked_index(&the_index, &index_lock, COMMIT_LOCK))
 			die("Unable to write new index file");
 	} else if (fd >= 0)
 		rollback_lock_file(&index_lock);
diff --git a/sequencer.c b/sequencer.c
index 7b886a6..5ebc461 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -294,11 +294,11 @@ static int do_recursive_merge(struct commit *base, struct commit *next,
 {
 	struct merge_options o;
 	struct tree *result, *next_tree, *base_tree, *head_tree;
-	int clean, index_fd;
+	int clean;
 	const char **xopt;
 	static struct lock_file index_lock;
 
-	index_fd = hold_locked_index(&index_lock, 1);
+	hold_locked_index(&index_lock, 1);
 
 	read_cache();
 
@@ -319,8 +319,7 @@ static int do_recursive_merge(struct commit *base, struct commit *next,
 			    next_tree, base_tree, &result);
 
 	if (active_cache_changed &&
-	    (write_cache(index_fd, active_cache, active_nr) ||
-	     commit_locked_index(&index_lock)))
+	    write_locked_index(&the_index, &index_lock, COMMIT_LOCK))
 		/* TRANSLATORS: %s will be "revert" or "cherry-pick" */
 		die(_("%s: Unable to write new index file"), action_name(opts));
 	rollback_lock_file(&index_lock);
@@ -680,8 +679,7 @@ static void read_and_refresh_cache(struct replay_opts *opts)
 		die(_("git %s: failed to read the index"), action_name(opts));
 	refresh_index(&the_index, REFRESH_QUIET|REFRESH_UNMERGED, NULL, NULL, NULL);
 	if (the_index.cache_changed && index_fd >= 0) {
-		if (write_index(&the_index, index_fd) ||
-		    commit_locked_index(&index_lock))
+		if (write_locked_index(&the_index, &index_lock, COMMIT_LOCK))
 			die(_("git %s: failed to refresh the index"), action_name(opts));
 	}
 	rollback_lock_file(&index_lock);
diff --git a/test-scrap-cache-tree.c b/test-scrap-cache-tree.c
index 4728013..9ebcbca 100644
--- a/test-scrap-cache-tree.c
+++ b/test-scrap-cache-tree.c
@@ -6,12 +6,11 @@ static struct lock_file index_lock;
 
 int main(int ac, char **av)
 {
-	int fd = hold_locked_index(&index_lock, 1);
+	hold_locked_index(&index_lock, 1);
 	if (read_cache() < 0)
 		die("unable to read index file");
 	active_cache_tree = NULL;
-	if (write_cache(fd, active_cache, active_nr)
-	    || commit_lock_file(&index_lock))
+	if (write_locked_index(&the_index, &index_lock, COMMIT_LOCK))
 		die("unable to write index file");
 	return 0;
 }
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 05/32] read-cache: relocate and unexport commit_locked_index()
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (3 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 04/32] read-cache: new API write_locked_index instead of write_index/write_cache Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 06/32] read-cache: store in-memory flags in the first 12 bits of ce_flags Nguyễn Thái Ngọc Duy
                   ` (26 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

This function is now only used by write_locked_index(). Move it to
read-cache.c (because read-cache.c will need to be aware of
alternate_index_output later) and unexport it.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 cache.h      |  1 -
 lockfile.c   | 20 --------------------
 read-cache.c | 20 ++++++++++++++++++++
 3 files changed, 20 insertions(+), 21 deletions(-)

diff --git a/cache.h b/cache.h
index 9cc2b97..e44048c 100644
--- a/cache.h
+++ b/cache.h
@@ -552,7 +552,6 @@ extern int commit_lock_file(struct lock_file *);
 extern void update_index_if_able(struct index_state *, struct lock_file *);
 
 extern int hold_locked_index(struct lock_file *, int);
-extern int commit_locked_index(struct lock_file *);
 extern void set_alternate_index_output(const char *);
 extern int close_lock_file(struct lock_file *);
 extern void rollback_lock_file(struct lock_file *);
diff --git a/lockfile.c b/lockfile.c
index 8fbcb6a..b706614 100644
--- a/lockfile.c
+++ b/lockfile.c
@@ -5,7 +5,6 @@
 #include "sigchain.h"
 
 static struct lock_file *lock_file_list;
-static const char *alternate_index_output;
 
 static void remove_lock_file(void)
 {
@@ -252,25 +251,6 @@ int hold_locked_index(struct lock_file *lk, int die_on_error)
 					 : 0);
 }
 
-void set_alternate_index_output(const char *name)
-{
-	alternate_index_output = name;
-}
-
-int commit_locked_index(struct lock_file *lk)
-{
-	if (alternate_index_output) {
-		if (lk->fd >= 0 && close_lock_file(lk))
-			return -1;
-		if (rename(lk->filename, alternate_index_output))
-			return -1;
-		lk->filename[0] = 0;
-		return 0;
-	}
-	else
-		return commit_lock_file(lk);
-}
-
 void rollback_lock_file(struct lock_file *lk)
 {
 	if (lk->filename[0]) {
diff --git a/read-cache.c b/read-cache.c
index 44d4732..a7b48a9 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -36,6 +36,7 @@ static struct cache_entry *refresh_cache_entry(struct cache_entry *ce,
 #define CACHE_EXT_RESOLVE_UNDO 0x52455543 /* "REUC" */
 
 struct index_state the_index;
+static const char *alternate_index_output;
 
 static void set_index_entry(struct index_state *istate, int nr, struct cache_entry *ce)
 {
@@ -1875,6 +1876,25 @@ static int do_write_index(struct index_state *istate, int newfd)
 	return 0;
 }
 
+void set_alternate_index_output(const char *name)
+{
+	alternate_index_output = name;
+}
+
+static int commit_locked_index(struct lock_file *lk)
+{
+	if (alternate_index_output) {
+		if (lk->fd >= 0 && close_lock_file(lk))
+			return -1;
+		if (rename(lk->filename, alternate_index_output))
+			return -1;
+		lk->filename[0] = 0;
+		return 0;
+	} else {
+		return commit_lock_file(lk);
+	}
+}
+
 static int do_write_locked_index(struct index_state *istate, struct lock_file *lock,
 				 unsigned flags)
 {
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 06/32] read-cache: store in-memory flags in the first 12 bits of ce_flags
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (4 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 05/32] read-cache: relocate and unexport commit_locked_index() Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 07/32] read-cache: be strict about "changed" in remove_marked_cache_entries() Nguyễn Thái Ngọc Duy
                   ` (25 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

We're running out of room for in-memory flags. But since b60e188
(Strip namelen out of ce_flags into a ce_namelen field - 2012-07-11),
we copy the namelen (first 12 bits) to ce_namelen field. So those bits
are free to use. Just make sure we do not accidentally write any
in-memory flags back.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 cache.h      | 2 +-
 read-cache.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/cache.h b/cache.h
index e44048c..57ad318 100644
--- a/cache.h
+++ b/cache.h
@@ -145,7 +145,7 @@ struct cache_entry {
 #define CE_STAGESHIFT 12
 
 /*
- * Range 0xFFFF0000 in ce_flags is divided into
+ * Range 0xFFFF0FFF in ce_flags is divided into
  * two parts: in-memory flags and on-disk ones.
  * Flags in CE_EXTENDED_FLAGS will get saved on-disk
  * if you want to save a new flag, add it in
diff --git a/read-cache.c b/read-cache.c
index a7b48a9..5e8c06c 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1702,7 +1702,7 @@ static char *copy_cache_entry_to_ondisk(struct ondisk_cache_entry *ondisk,
 	ondisk->size = htonl(ce->ce_stat_data.sd_size);
 	hashcpy(ondisk->sha1, ce->sha1);
 
-	flags = ce->ce_flags;
+	flags = ce->ce_flags & ~CE_NAMEMASK;
 	flags |= (ce_namelen(ce) >= CE_NAMEMASK ? CE_NAMEMASK : ce_namelen(ce));
 	ondisk->flags = htons(flags);
 	if (ce->ce_flags & CE_EXTENDED) {
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 07/32] read-cache: be strict about "changed" in remove_marked_cache_entries()
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (5 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 06/32] read-cache: store in-memory flags in the first 12 bits of ce_flags Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 08/32] read-cache: be specific what part of the index has changed Nguyễn Thái Ngọc Duy
                   ` (24 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

remove_marked_cache_entries() deletes entries marked with
CE_REMOVE. But if there is no such entry, do not mark the index as
"changed" because that could trigger an index update unnecessarily.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 read-cache.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index 5e8c06c..c0c2e39 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -510,6 +510,8 @@ void remove_marked_cache_entries(struct index_state *istate)
 		else
 			ce_array[j++] = ce_array[i];
 	}
+	if (j == istate->cache_nr)
+		return;
 	istate->cache_changed = 1;
 	istate->cache_nr = j;
 }
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 08/32] read-cache: be specific what part of the index has changed
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (6 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 07/32] read-cache: be strict about "changed" in remove_marked_cache_entries() Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 09/32] update-index: " Nguyễn Thái Ngọc Duy
                   ` (23 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

cache entry additions, removals and modifications are separated
out. The rest of changes are still in the catch-all flag
SOMETHING_CHANGED, which would be more specific later.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/update-index.c |  6 +++---
 cache.h                |  5 +++++
 read-cache.c           | 11 ++++++-----
 resolve-undo.c         |  2 +-
 4 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 42cbe4b..d2654d6 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -56,7 +56,7 @@ static int mark_ce_flags(const char *path, int flag, int mark)
 		else
 			active_cache[pos]->ce_flags &= ~flag;
 		cache_tree_invalidate_path(active_cache_tree, path);
-		active_cache_changed = 1;
+		active_cache_changed = SOMETHING_CHANGED;
 		return 0;
 	}
 	return -1;
@@ -268,7 +268,7 @@ static void chmod_path(int flip, const char *path)
 		goto fail;
 	}
 	cache_tree_invalidate_path(active_cache_tree, path);
-	active_cache_changed = 1;
+	active_cache_changed = SOMETHING_CHANGED;
 	report("chmod %cx '%s'", flip, path);
 	return;
  fail:
@@ -889,7 +889,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 			    INDEX_FORMAT_LB, INDEX_FORMAT_UB);
 
 		if (the_index.version != preferred_index_format)
-			active_cache_changed = 1;
+			active_cache_changed = SOMETHING_CHANGED;
 		the_index.version = preferred_index_format;
 	}
 
diff --git a/cache.h b/cache.h
index 57ad318..31d4541 100644
--- a/cache.h
+++ b/cache.h
@@ -268,6 +268,11 @@ static inline unsigned int canon_mode(unsigned int mode)
 
 #define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1)
 
+#define SOMETHING_CHANGED	(1 << 0) /* unclassified changes go here */
+#define CE_ENTRY_CHANGED	(1 << 1)
+#define CE_ENTRY_REMOVED	(1 << 2)
+#define CE_ENTRY_ADDED		(1 << 3)
+
 struct index_state {
 	struct cache_entry **cache;
 	unsigned int version;
diff --git a/read-cache.c b/read-cache.c
index c0c2e39..035c72e 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -51,7 +51,7 @@ static void replace_index_entry(struct index_state *istate, int nr, struct cache
 	remove_name_hash(istate, old);
 	free(old);
 	set_index_entry(istate, nr, ce);
-	istate->cache_changed = 1;
+	istate->cache_changed |= CE_ENTRY_CHANGED;
 }
 
 void rename_index_entry_at(struct index_state *istate, int nr, const char *new_name)
@@ -482,7 +482,7 @@ int remove_index_entry_at(struct index_state *istate, int pos)
 	record_resolve_undo(istate, ce);
 	remove_name_hash(istate, ce);
 	free(ce);
-	istate->cache_changed = 1;
+	istate->cache_changed |= CE_ENTRY_REMOVED;
 	istate->cache_nr--;
 	if (pos >= istate->cache_nr)
 		return 0;
@@ -512,7 +512,7 @@ void remove_marked_cache_entries(struct index_state *istate)
 	}
 	if (j == istate->cache_nr)
 		return;
-	istate->cache_changed = 1;
+	istate->cache_changed |= CE_ENTRY_REMOVED;
 	istate->cache_nr = j;
 }
 
@@ -1002,7 +1002,7 @@ int add_index_entry(struct index_state *istate, struct cache_entry *ce, int opti
 			istate->cache + pos,
 			(istate->cache_nr - pos - 1) * sizeof(ce));
 	set_index_entry(istate, pos, ce);
-	istate->cache_changed = 1;
+	istate->cache_changed |= CE_ENTRY_ADDED;
 	return 0;
 }
 
@@ -1101,6 +1101,7 @@ static struct cache_entry *refresh_cache_ent(struct index_state *istate,
 	    !(ce->ce_flags & CE_VALID))
 		updated->ce_flags &= ~CE_VALID;
 
+	/* istate->cache_changed is updated in the caller */
 	return updated;
 }
 
@@ -1182,7 +1183,7 @@ int refresh_index(struct index_state *istate, unsigned int flags,
 				 * means the index is not valid anymore.
 				 */
 				ce->ce_flags &= ~CE_VALID;
-				istate->cache_changed = 1;
+				istate->cache_changed |= CE_ENTRY_CHANGED;
 			}
 			if (quiet)
 				continue;
diff --git a/resolve-undo.c b/resolve-undo.c
index 44c697c..e9dff57 100644
--- a/resolve-undo.c
+++ b/resolve-undo.c
@@ -110,7 +110,7 @@ void resolve_undo_clear_index(struct index_state *istate)
 	string_list_clear(resolve_undo, 1);
 	free(resolve_undo);
 	istate->resolve_undo = NULL;
-	istate->cache_changed = 1;
+	istate->cache_changed = SOMETHING_CHANGED;
 }
 
 int unmerge_index_entry_at(struct index_state *istate, int pos)
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 09/32] update-index: be specific what part of the index has changed
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (7 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 08/32] read-cache: be specific what part of the index has changed Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 10/32] resolve-undo: " Nguyễn Thái Ngọc Duy
                   ` (22 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/update-index.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index d2654d6..e0e881b 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -56,7 +56,7 @@ static int mark_ce_flags(const char *path, int flag, int mark)
 		else
 			active_cache[pos]->ce_flags &= ~flag;
 		cache_tree_invalidate_path(active_cache_tree, path);
-		active_cache_changed = SOMETHING_CHANGED;
+		active_cache_changed |= CE_ENTRY_CHANGED;
 		return 0;
 	}
 	return -1;
@@ -268,7 +268,7 @@ static void chmod_path(int flip, const char *path)
 		goto fail;
 	}
 	cache_tree_invalidate_path(active_cache_tree, path);
-	active_cache_changed = SOMETHING_CHANGED;
+	active_cache_changed |= CE_ENTRY_CHANGED;
 	report("chmod %cx '%s'", flip, path);
 	return;
  fail:
@@ -889,7 +889,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 			    INDEX_FORMAT_LB, INDEX_FORMAT_UB);
 
 		if (the_index.version != preferred_index_format)
-			active_cache_changed = SOMETHING_CHANGED;
+			active_cache_changed |= SOMETHING_CHANGED;
 		the_index.version = preferred_index_format;
 	}
 
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 10/32] resolve-undo: be specific what part of the index has changed
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (8 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 09/32] update-index: " Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 11/32] unpack-trees: " Nguyễn Thái Ngọc Duy
                   ` (21 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 cache.h        | 1 +
 resolve-undo.c | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index 31d4541..976f2e0 100644
--- a/cache.h
+++ b/cache.h
@@ -272,6 +272,7 @@ static inline unsigned int canon_mode(unsigned int mode)
 #define CE_ENTRY_CHANGED	(1 << 1)
 #define CE_ENTRY_REMOVED	(1 << 2)
 #define CE_ENTRY_ADDED		(1 << 3)
+#define RESOLVE_UNDO_CHANGED	(1 << 4)
 
 struct index_state {
 	struct cache_entry **cache;
diff --git a/resolve-undo.c b/resolve-undo.c
index e9dff57..468a2eb 100644
--- a/resolve-undo.c
+++ b/resolve-undo.c
@@ -110,7 +110,7 @@ void resolve_undo_clear_index(struct index_state *istate)
 	string_list_clear(resolve_undo, 1);
 	free(resolve_undo);
 	istate->resolve_undo = NULL;
-	istate->cache_changed = SOMETHING_CHANGED;
+	istate->cache_changed |= RESOLVE_UNDO_CHANGED;
 }
 
 int unmerge_index_entry_at(struct index_state *istate, int pos)
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 11/32] unpack-trees: be specific what part of the index has changed
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (9 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 10/32] resolve-undo: " Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 12/32] cache-tree: mark istate->cache_changed on cache tree invalidation Nguyễn Thái Ngọc Duy
                   ` (20 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 unpack-trees.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 97fc995..a722685 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -246,7 +246,9 @@ static int verify_absent_sparse(const struct cache_entry *ce,
 				enum unpack_trees_error_types,
 				struct unpack_trees_options *o);
 
-static int apply_sparse_checkout(struct cache_entry *ce, struct unpack_trees_options *o)
+static int apply_sparse_checkout(struct index_state *istate,
+				 struct cache_entry *ce,
+				 struct unpack_trees_options *o)
 {
 	int was_skip_worktree = ce_skip_worktree(ce);
 
@@ -254,6 +256,8 @@ static int apply_sparse_checkout(struct cache_entry *ce, struct unpack_trees_opt
 		ce->ce_flags |= CE_SKIP_WORKTREE;
 	else
 		ce->ce_flags &= ~CE_SKIP_WORKTREE;
+	if (was_skip_worktree != ce_skip_worktree(ce))
+		istate->cache_changed |= CE_ENTRY_CHANGED;
 
 	/*
 	 * if (!was_skip_worktree && !ce_skip_worktree()) {
@@ -1131,7 +1135,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 				ret = -1;
 			}
 
-			if (apply_sparse_checkout(ce, o)) {
+			if (apply_sparse_checkout(&o->result, ce, o)) {
 				if (!o->show_all_errors)
 					goto return_failed;
 				ret = -1;
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 12/32] cache-tree: mark istate->cache_changed on cache tree invalidation
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (10 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 11/32] unpack-trees: " Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 13/32] cache-tree: mark istate->cache_changed on cache tree update Nguyễn Thái Ngọc Duy
                   ` (19 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/blame.c        |  2 +-
 builtin/update-index.c |  4 ++--
 cache-tree.c           | 15 +++++++++++----
 cache-tree.h           |  2 +-
 cache.h                |  1 +
 read-cache.c           |  6 +++---
 unpack-trees.c         |  2 +-
 7 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 88cb799..914d919 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -2126,7 +2126,7 @@ static struct commit *fake_working_tree_commit(struct diff_options *opt,
 	 * right now, but someday we might optimize diff-index --cached
 	 * with cache-tree information.
 	 */
-	cache_tree_invalidate_path(active_cache_tree, path);
+	cache_tree_invalidate_path(&the_index, path);
 
 	return commit;
 }
diff --git a/builtin/update-index.c b/builtin/update-index.c
index e0e881b..fa3c441 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -55,7 +55,7 @@ static int mark_ce_flags(const char *path, int flag, int mark)
 			active_cache[pos]->ce_flags |= flag;
 		else
 			active_cache[pos]->ce_flags &= ~flag;
-		cache_tree_invalidate_path(active_cache_tree, path);
+		cache_tree_invalidate_path(&the_index, path);
 		active_cache_changed |= CE_ENTRY_CHANGED;
 		return 0;
 	}
@@ -267,7 +267,7 @@ static void chmod_path(int flip, const char *path)
 	default:
 		goto fail;
 	}
-	cache_tree_invalidate_path(active_cache_tree, path);
+	cache_tree_invalidate_path(&the_index, path);
 	active_cache_changed |= CE_ENTRY_CHANGED;
 	report("chmod %cx '%s'", flip, path);
 	return;
diff --git a/cache-tree.c b/cache-tree.c
index 52f8692..23ddc73 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -98,7 +98,7 @@ struct cache_tree_sub *cache_tree_sub(struct cache_tree *it, const char *path)
 	return find_subtree(it, path, pathlen, 1);
 }
 
-void cache_tree_invalidate_path(struct cache_tree *it, const char *path)
+static int do_invalidate_path(struct cache_tree *it, const char *path)
 {
 	/* a/b/c
 	 * ==> invalidate self
@@ -116,7 +116,7 @@ void cache_tree_invalidate_path(struct cache_tree *it, const char *path)
 #endif
 
 	if (!it)
-		return;
+		return 0;
 	slash = strchrnul(path, '/');
 	namelen = slash - path;
 	it->entry_count = -1;
@@ -137,11 +137,18 @@ void cache_tree_invalidate_path(struct cache_tree *it, const char *path)
 				(it->subtree_nr - pos - 1));
 			it->subtree_nr--;
 		}
-		return;
+		return 1;
 	}
 	down = find_subtree(it, path, namelen, 0);
 	if (down)
-		cache_tree_invalidate_path(down->cache_tree, slash + 1);
+		do_invalidate_path(down->cache_tree, slash + 1);
+	return 1;
+}
+
+void cache_tree_invalidate_path(struct index_state *istate, const char *path)
+{
+	if (do_invalidate_path(istate->cache_tree, path))
+		istate->cache_changed |= CACHE_TREE_CHANGED;
 }
 
 static int verify_cache(const struct cache_entry * const *cache,
diff --git a/cache-tree.h b/cache-tree.h
index f1923ad..dfbcfab 100644
--- a/cache-tree.h
+++ b/cache-tree.h
@@ -23,7 +23,7 @@ struct cache_tree {
 
 struct cache_tree *cache_tree(void);
 void cache_tree_free(struct cache_tree **);
-void cache_tree_invalidate_path(struct cache_tree *, const char *);
+void cache_tree_invalidate_path(struct index_state *, const char *);
 struct cache_tree_sub *cache_tree_sub(struct cache_tree *, const char *);
 
 void cache_tree_write(struct strbuf *, struct cache_tree *root);
diff --git a/cache.h b/cache.h
index 976f2e0..b4edbcd 100644
--- a/cache.h
+++ b/cache.h
@@ -273,6 +273,7 @@ static inline unsigned int canon_mode(unsigned int mode)
 #define CE_ENTRY_REMOVED	(1 << 2)
 #define CE_ENTRY_ADDED		(1 << 3)
 #define RESOLVE_UNDO_CHANGED	(1 << 4)
+#define CACHE_TREE_CHANGED	(1 << 5)
 
 struct index_state {
 	struct cache_entry **cache;
diff --git a/read-cache.c b/read-cache.c
index 035c72e..d1ad227 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -65,7 +65,7 @@ void rename_index_entry_at(struct index_state *istate, int nr, const char *new_n
 	new->ce_namelen = namelen;
 	memcpy(new->name, new_name, namelen + 1);
 
-	cache_tree_invalidate_path(istate->cache_tree, old->name);
+	cache_tree_invalidate_path(istate, old->name);
 	remove_index_entry_at(istate, nr);
 	add_index_entry(istate, new, ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE);
 }
@@ -521,7 +521,7 @@ int remove_file_from_index(struct index_state *istate, const char *path)
 	int pos = index_name_pos(istate, path, strlen(path));
 	if (pos < 0)
 		pos = -pos-1;
-	cache_tree_invalidate_path(istate->cache_tree, path);
+	cache_tree_invalidate_path(istate, path);
 	while (pos < istate->cache_nr && !strcmp(istate->cache[pos]->name, path))
 		remove_index_entry_at(istate, pos);
 	return 0;
@@ -939,7 +939,7 @@ static int add_index_entry_with_check(struct index_state *istate, struct cache_e
 	int skip_df_check = option & ADD_CACHE_SKIP_DFCHECK;
 	int new_only = option & ADD_CACHE_NEW_ONLY;
 
-	cache_tree_invalidate_path(istate->cache_tree, ce->name);
+	cache_tree_invalidate_path(istate, ce->name);
 	pos = index_name_stage_pos(istate, ce->name, ce_namelen(ce), ce_stage(ce));
 
 	/* existing match? Just replace it. */
diff --git a/unpack-trees.c b/unpack-trees.c
index a722685..3beff8a 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1263,7 +1263,7 @@ static void invalidate_ce_path(const struct cache_entry *ce,
 			       struct unpack_trees_options *o)
 {
 	if (ce)
-		cache_tree_invalidate_path(o->src_index->cache_tree, ce->name);
+		cache_tree_invalidate_path(o->src_index, ce->name);
 }
 
 /*
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 13/32] cache-tree: mark istate->cache_changed on cache tree update
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (11 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 12/32] cache-tree: mark istate->cache_changed on cache tree invalidation Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 14/32] cache-tree: mark istate->cache_changed on prime_cache_tree() Nguyễn Thái Ngọc Duy
                   ` (18 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 cache-tree.c           | 25 +++++++++++--------------
 cache-tree.h           |  2 +-
 merge-recursive.c      |  4 +---
 sequencer.c            |  4 +---
 test-dump-cache-tree.c |  7 ++++---
 5 files changed, 18 insertions(+), 24 deletions(-)

diff --git a/cache-tree.c b/cache-tree.c
index 23ddc73..18055f1 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -151,7 +151,7 @@ void cache_tree_invalidate_path(struct index_state *istate, const char *path)
 		istate->cache_changed |= CACHE_TREE_CHANGED;
 }
 
-static int verify_cache(const struct cache_entry * const *cache,
+static int verify_cache(struct cache_entry **cache,
 			int entries, int flags)
 {
 	int i, funny;
@@ -236,7 +236,7 @@ int cache_tree_fully_valid(struct cache_tree *it)
 }
 
 static int update_one(struct cache_tree *it,
-		      const struct cache_entry * const *cache,
+		      struct cache_entry **cache,
 		      int entries,
 		      const char *base,
 		      int baselen,
@@ -398,18 +398,19 @@ static int update_one(struct cache_tree *it,
 	return i;
 }
 
-int cache_tree_update(struct cache_tree *it,
-		      const struct cache_entry * const *cache,
-		      int entries,
-		      int flags)
+int cache_tree_update(struct index_state *istate, int flags)
 {
-	int i, skip;
-	i = verify_cache(cache, entries, flags);
+	struct cache_tree *it = istate->cache_tree;
+	struct cache_entry **cache = istate->cache;
+	int entries = istate->cache_nr;
+	int skip, i = verify_cache(cache, entries, flags);
+
 	if (i)
 		return i;
 	i = update_one(it, cache, entries, "", 0, &skip, flags);
 	if (i < 0)
 		return i;
+	istate->cache_changed |= CACHE_TREE_CHANGED;
 	return 0;
 }
 
@@ -597,9 +598,7 @@ int write_cache_as_tree(unsigned char *sha1, int flags, const char *prefix)
 
 	was_valid = cache_tree_fully_valid(active_cache_tree);
 	if (!was_valid) {
-		if (cache_tree_update(active_cache_tree,
-				      (const struct cache_entry * const *)active_cache,
-				      active_nr, flags) < 0)
+		if (cache_tree_update(&the_index, flags) < 0)
 			return WRITE_TREE_UNMERGED_INDEX;
 		if (0 <= newfd) {
 			if (!write_locked_index(&the_index, lock_file, COMMIT_LOCK))
@@ -698,7 +697,5 @@ int update_main_cache_tree(int flags)
 {
 	if (!the_index.cache_tree)
 		the_index.cache_tree = cache_tree();
-	return cache_tree_update(the_index.cache_tree,
-				 (const struct cache_entry * const *)the_index.cache,
-				 the_index.cache_nr, flags);
+	return cache_tree_update(&the_index, flags);
 }
diff --git a/cache-tree.h b/cache-tree.h
index dfbcfab..154b357 100644
--- a/cache-tree.h
+++ b/cache-tree.h
@@ -30,7 +30,7 @@ void cache_tree_write(struct strbuf *, struct cache_tree *root);
 struct cache_tree *cache_tree_read(const char *buffer, unsigned long size);
 
 int cache_tree_fully_valid(struct cache_tree *);
-int cache_tree_update(struct cache_tree *, const struct cache_entry * const *, int, int);
+int cache_tree_update(struct index_state *, int);
 
 int update_main_cache_tree(int);
 
diff --git a/merge-recursive.c b/merge-recursive.c
index 442c1ec..0b5d34d 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -265,9 +265,7 @@ struct tree *write_tree_from_memory(struct merge_options *o)
 		active_cache_tree = cache_tree();
 
 	if (!cache_tree_fully_valid(active_cache_tree) &&
-	    cache_tree_update(active_cache_tree,
-			      (const struct cache_entry * const *)active_cache,
-			      active_nr, 0) < 0)
+	    cache_tree_update(&the_index, 0) < 0)
 		die(_("error building trees"));
 
 	result = lookup_tree(active_cache_tree->sha1);
diff --git a/sequencer.c b/sequencer.c
index 5ebc461..4b709db 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -371,9 +371,7 @@ static int is_index_unchanged(void)
 		active_cache_tree = cache_tree();
 
 	if (!cache_tree_fully_valid(active_cache_tree))
-		if (cache_tree_update(active_cache_tree,
-				      (const struct cache_entry * const *)active_cache,
-				      active_nr, 0))
+		if (cache_tree_update(&the_index, 0))
 			return error(_("Unable to update cache tree\n"));
 
 	return !hashcmp(active_cache_tree->sha1, head_commit->tree->object.sha1);
diff --git a/test-dump-cache-tree.c b/test-dump-cache-tree.c
index 47eab97..330ba4f 100644
--- a/test-dump-cache-tree.c
+++ b/test-dump-cache-tree.c
@@ -56,11 +56,12 @@ static int dump_cache_tree(struct cache_tree *it,
 
 int main(int ac, char **av)
 {
+	struct index_state istate;
 	struct cache_tree *another = cache_tree();
 	if (read_cache() < 0)
 		die("unable to read index file");
-	cache_tree_update(another,
-			  (const struct cache_entry * const *)active_cache,
-			  active_nr, WRITE_TREE_DRY_RUN);
+	istate = the_index;
+	istate.cache_tree = another;
+	cache_tree_update(&istate, WRITE_TREE_DRY_RUN);
 	return dump_cache_tree(active_cache_tree, another, "");
 }
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 14/32] cache-tree: mark istate->cache_changed on prime_cache_tree()
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (12 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 13/32] cache-tree: mark istate->cache_changed on cache tree update Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 15/32] entry.c: update cache_changed if refresh_cache is set in checkout_entry() Nguyễn Thái Ngọc Duy
                   ` (17 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/read-tree.c | 2 +-
 builtin/reset.c     | 2 +-
 cache-tree.c        | 9 +++++----
 cache-tree.h        | 2 +-
 4 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/builtin/read-tree.c b/builtin/read-tree.c
index f26d90f..3204c62 100644
--- a/builtin/read-tree.c
+++ b/builtin/read-tree.c
@@ -231,7 +231,7 @@ int cmd_read_tree(int argc, const char **argv, const char *unused_prefix)
 	 * what came from the tree.
 	 */
 	if (nr_trees == 1 && !opts.prefix)
-		prime_cache_tree(&active_cache_tree, trees[0]);
+		prime_cache_tree(&the_index, trees[0]);
 
 	if (write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
 		die("unable to write new index file");
diff --git a/builtin/reset.c b/builtin/reset.c
index 0c56d28..234b2eb 100644
--- a/builtin/reset.c
+++ b/builtin/reset.c
@@ -84,7 +84,7 @@ static int reset_index(const unsigned char *sha1, int reset_type, int quiet)
 
 	if (reset_type == MIXED || reset_type == HARD) {
 		tree = parse_tree_indirect(sha1);
-		prime_cache_tree(&active_cache_tree, tree);
+		prime_cache_tree(&the_index, tree);
 	}
 
 	return 0;
diff --git a/cache-tree.c b/cache-tree.c
index 18055f1..c53f7de 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -654,11 +654,12 @@ static void prime_cache_tree_rec(struct cache_tree *it, struct tree *tree)
 	it->entry_count = cnt;
 }
 
-void prime_cache_tree(struct cache_tree **it, struct tree *tree)
+void prime_cache_tree(struct index_state *istate, struct tree *tree)
 {
-	cache_tree_free(it);
-	*it = cache_tree();
-	prime_cache_tree_rec(*it, tree);
+	cache_tree_free(&istate->cache_tree);
+	istate->cache_tree = cache_tree();
+	prime_cache_tree_rec(istate->cache_tree, tree);
+	istate->cache_changed |= CACHE_TREE_CHANGED;
 }
 
 /*
diff --git a/cache-tree.h b/cache-tree.h
index 154b357..b47ccec 100644
--- a/cache-tree.h
+++ b/cache-tree.h
@@ -46,7 +46,7 @@ int update_main_cache_tree(int);
 #define WRITE_TREE_PREFIX_ERROR (-3)
 
 int write_cache_as_tree(unsigned char *sha1, int flags, const char *prefix);
-void prime_cache_tree(struct cache_tree **, struct tree *);
+void prime_cache_tree(struct index_state *, struct tree *);
 
 extern int cache_tree_matches_traversal(struct cache_tree *, struct name_entry *ent, struct traverse_info *info);
 
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 15/32] entry.c: update cache_changed if refresh_cache is set in checkout_entry()
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (13 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 14/32] cache-tree: mark istate->cache_changed on prime_cache_tree() Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 16/32] read-cache: save index SHA-1 after reading Nguyễn Thái Ngọc Duy
                   ` (16 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

Other fill_stat_cache_info() is on new entries, which should set
CE_ENTRY_ADDED in cache_changed, so we're safe.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/apply.c          | 8 +++++---
 builtin/checkout-index.c | 1 +
 builtin/checkout.c       | 1 +
 cache.h                  | 1 +
 entry.c                  | 2 ++
 unpack-trees.c           | 1 +
 6 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/builtin/apply.c b/builtin/apply.c
index 5e13444..adca035 100644
--- a/builtin/apply.c
+++ b/builtin/apply.c
@@ -3084,13 +3084,15 @@ static void prepare_fn_table(struct patch *patch)
 	}
 }
 
-static int checkout_target(struct cache_entry *ce, struct stat *st)
+static int checkout_target(struct index_state *istate,
+			   struct cache_entry *ce, struct stat *st)
 {
 	struct checkout costate;
 
 	memset(&costate, 0, sizeof(costate));
 	costate.base_dir = "";
 	costate.refresh_cache = 1;
+	costate.istate = istate;
 	if (checkout_entry(ce, &costate, NULL) || lstat(ce->name, st))
 		return error(_("cannot checkout %s"), ce->name);
 	return 0;
@@ -3257,7 +3259,7 @@ static int load_current(struct image *image, struct patch *patch)
 	if (lstat(name, &st)) {
 		if (errno != ENOENT)
 			return error(_("%s: %s"), name, strerror(errno));
-		if (checkout_target(ce, &st))
+		if (checkout_target(&the_index, ce, &st))
 			return -1;
 	}
 	if (verify_index_match(ce, &st))
@@ -3411,7 +3413,7 @@ static int check_preimage(struct patch *patch, struct cache_entry **ce, struct s
 		}
 		*ce = active_cache[pos];
 		if (stat_ret < 0) {
-			if (checkout_target(*ce, st))
+			if (checkout_target(&the_index, *ce, st))
 				return -1;
 		}
 		if (!cached && verify_index_match(*ce, st))
diff --git a/builtin/checkout-index.c b/builtin/checkout-index.c
index 9e49bf2..05edd9e 100644
--- a/builtin/checkout-index.c
+++ b/builtin/checkout-index.c
@@ -135,6 +135,7 @@ static int option_parse_u(const struct option *opt,
 	int *newfd = opt->value;
 
 	state.refresh_cache = 1;
+	state.istate = &the_index;
 	if (*newfd < 0)
 		*newfd = hold_locked_index(&lock_file, 1);
 	return 0;
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 944a634..146ab91 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -336,6 +336,7 @@ static int checkout_paths(const struct checkout_opts *opts,
 	memset(&state, 0, sizeof(state));
 	state.force = 1;
 	state.refresh_cache = 1;
+	state.istate = &the_index;
 	for (pos = 0; pos < active_nr; pos++) {
 		struct cache_entry *ce = active_cache[pos];
 		if (ce->ce_flags & CE_MATCHED) {
diff --git a/cache.h b/cache.h
index b4edbcd..bf09d1c 100644
--- a/cache.h
+++ b/cache.h
@@ -1063,6 +1063,7 @@ extern int split_ident_line(struct ident_split *, const char *, int);
 extern int ident_cmp(const struct ident_split *, const struct ident_split *);
 
 struct checkout {
+	struct index_state *istate;
 	const char *base_dir;
 	int base_dir_len;
 	unsigned force:1,
diff --git a/entry.c b/entry.c
index 77c6882..d913c1d 100644
--- a/entry.c
+++ b/entry.c
@@ -210,9 +210,11 @@ static int write_entry(struct cache_entry *ce,
 
 finish:
 	if (state->refresh_cache) {
+		assert(state->istate);
 		if (!fstat_done)
 			lstat(ce->name, &st);
 		fill_stat_cache_info(ce, &st);
+		state->istate->cache_changed |= CE_ENTRY_CHANGED;
 	}
 	return 0;
 }
diff --git a/unpack-trees.c b/unpack-trees.c
index 3beff8a..26f65c7 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1029,6 +1029,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 	state.force = 1;
 	state.quiet = 1;
 	state.refresh_cache = 1;
+	state.istate = &o->result;
 
 	memset(&el, 0, sizeof(el));
 	if (!core_apply_sparse_checkout || !o->update)
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 16/32] read-cache: save index SHA-1 after reading
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (14 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 15/32] entry.c: update cache_changed if refresh_cache is set in checkout_entry() Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 17/32] read-cache: split-index mode Nguyễn Thái Ngọc Duy
                   ` (15 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

Also update SHA-1 after writing. If we do not do that, the second
read_index() will see "initialized" variable already set and not read
.git/index again, which is fine, except istate->sha1 now has a stale
value.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 cache.h        | 1 +
 read-cache.c   | 6 ++++--
 unpack-trees.c | 1 +
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/cache.h b/cache.h
index bf09d1c..41cdcd0 100644
--- a/cache.h
+++ b/cache.h
@@ -286,6 +286,7 @@ struct index_state {
 		 initialized : 1;
 	struct hashmap name_hash;
 	struct hashmap dir_hash;
+	unsigned char sha1[20];
 };
 
 extern struct index_state the_index;
diff --git a/read-cache.c b/read-cache.c
index d1ad227..b4653c0 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1481,6 +1481,7 @@ int read_index_from(struct index_state *istate, const char *path)
 	if (verify_hdr(hdr, mmap_size) < 0)
 		goto unmap;
 
+	hashcpy(istate->sha1, (const unsigned char *)hdr + mmap_size - 20);
 	istate->version = ntohl(hdr->hdr_version);
 	istate->cache_nr = ntohl(hdr->hdr_entries);
 	istate->cache_alloc = alloc_nr(istate->cache_nr);
@@ -1616,7 +1617,7 @@ static int write_index_ext_header(git_SHA_CTX *context, int fd,
 		(ce_write(context, fd, &sz, 4) < 0)) ? -1 : 0;
 }
 
-static int ce_flush(git_SHA_CTX *context, int fd)
+static int ce_flush(git_SHA_CTX *context, int fd, unsigned char *sha1)
 {
 	unsigned int left = write_buffer_len;
 
@@ -1634,6 +1635,7 @@ static int ce_flush(git_SHA_CTX *context, int fd)
 
 	/* Append the SHA1 signature at the end */
 	git_SHA1_Final(write_buffer + left, context);
+	hashcpy(sha1, write_buffer + left);
 	left += 20;
 	return (write_in_full(fd, write_buffer, left) != left) ? -1 : 0;
 }
@@ -1872,7 +1874,7 @@ static int do_write_index(struct index_state *istate, int newfd)
 			return -1;
 	}
 
-	if (ce_flush(&c, newfd) || fstat(newfd, &st))
+	if (ce_flush(&c, newfd, istate->sha1) || fstat(newfd, &st))
 		return -1;
 	istate->timestamp.sec = (unsigned int)st.st_mtime;
 	istate->timestamp.nsec = ST_MTIME_NSEC(st);
diff --git a/unpack-trees.c b/unpack-trees.c
index 26f65c7..f594932 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1046,6 +1046,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 	o->result.timestamp.sec = o->src_index->timestamp.sec;
 	o->result.timestamp.nsec = o->src_index->timestamp.nsec;
 	o->result.version = o->src_index->version;
+	hashcpy(o->result.sha1, o->src_index->sha1);
 	o->merge_size = len;
 	mark_all_ce_unused(o->src_index);
 
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 17/32] read-cache: split-index mode
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (15 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 16/32] read-cache: save index SHA-1 after reading Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 18/32] read-cache: mark new entries for split index Nguyễn Thái Ngọc Duy
                   ` (14 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

This split-index mode is designed to keep write cost proportional to
the number of changes the user has made, not the size of the work
tree. (Read cost is another matter, to be dealt separately.)

This mode stores index info in a pair of $GIT_DIR/index and
$GIT_DIR/sharedindex.<SHA-1>. sharedindex is large and unchanged over
time while "index" is smaller and updated often. Format details are in
index-format.txt, although not everything is implemented in this
patch.

Shared indexes are not automatically removed, because it's unclear if
the shared index is needed by any (even temporary) indexes by just
looking at it. After a while you'll collect stale shared indexes. The
good news is one shared index is useable for long, until
$GIT_DIR/index becomes too big and sluggish that the new shared index
must be created.

The safest way to clean shared indexes is to turn off split index
mode, so shared files are all garbage, delete them all, then turn on
split index mode again.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/gitrepository-layout.txt   |  4 ++
 Documentation/technical/index-format.txt | 35 ++++++++++++
 Makefile                                 |  1 +
 cache.h                                  |  3 +
 read-cache.c                             | 96 ++++++++++++++++++++++++++++++--
 split-index.c (new)                      | 90 ++++++++++++++++++++++++++++++
 split-index.h (new)                      | 25 +++++++++
 unpack-trees.c                           |  4 ++
 8 files changed, 253 insertions(+), 5 deletions(-)
 create mode 100644 split-index.c
 create mode 100644 split-index.h

diff --git a/Documentation/gitrepository-layout.txt b/Documentation/gitrepository-layout.txt
index 17d2ea6..79653f3 100644
--- a/Documentation/gitrepository-layout.txt
+++ b/Documentation/gitrepository-layout.txt
@@ -155,6 +155,10 @@ index::
 	The current index file for the repository.  It is
 	usually not found in a bare repository.
 
+sharedindex.<SHA-1>::
+	The shared index part, to be referenced by $GIT_DIR/index and
+	other temporary index files. Only valid in split index mode.
+
 info::
 	Additional information about the repository is recorded
 	in this directory.
diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
index f352a9b..fe6f316 100644
--- a/Documentation/technical/index-format.txt
+++ b/Documentation/technical/index-format.txt
@@ -129,6 +129,9 @@ Git index format
   (Version 4) In version 4, the padding after the pathname does not
   exist.
 
+  Interpretation of index entries in split index mode is completely
+  different. See below for details.
+
 == Extensions
 
 === Cached tree
@@ -198,3 +201,35 @@ Git index format
   - At most three 160-bit object names of the entry in stages from 1 to 3
     (nothing is written for a missing stage).
 
+=== Split index
+
+  In split index mode, the majority of index entries could be stored
+  in a separate file. This extension records the changes to be made on
+  top of that to produce the final index.
+
+  The signature for this extension is { 'l', 'i, 'n', 'k' }.
+
+  The extension consists of:
+
+  - 160-bit SHA-1 of the shared index file. The shared index file path
+    is $GIT_DIR/sharedindex.<SHA-1>. If all 160 bits are zero, the
+    index does not require a shared index file.
+
+  - An ewah-encoded delete bitmap, each bit represents an entry in the
+    shared index. If a bit is set, its corresponding entry in the
+    shared index will be removed from the final index.  Note, because
+    a delete operation changes index entry positions, but we do need
+    original positions in replace phase, it's best to just mark
+    entries for removal, then do a mass deletion after replacement.
+
+  - An ewah-encoded replace bitmap, each bit represents an entry in
+    the shared index. If a bit is set, its corresponding entry in the
+    shared index will be replaced with an entry in this index
+    file. All replaced entries are stored in sorted order in this
+    index. The first "1" bit in the replace bitmap corresponds to the
+    first index entry, the second "1" bit to the second entry and so
+    on. Replaced entries may have empty path names to save space.
+
+  The remaining index entries after replaced ones will be added to the
+  final index. These added entries are also sorted by entry namme then
+  stage.
diff --git a/Makefile b/Makefile
index 81e8214..94ad3ce 100644
--- a/Makefile
+++ b/Makefile
@@ -887,6 +887,7 @@ LIB_OBJS += sha1_name.o
 LIB_OBJS += shallow.o
 LIB_OBJS += sideband.o
 LIB_OBJS += sigchain.o
+LIB_OBJS += split-index.o
 LIB_OBJS += strbuf.o
 LIB_OBJS += streaming.o
 LIB_OBJS += string-list.o
diff --git a/cache.h b/cache.h
index 41cdcd0..47fe374 100644
--- a/cache.h
+++ b/cache.h
@@ -135,6 +135,7 @@ struct cache_entry {
 	unsigned int ce_mode;
 	unsigned int ce_flags;
 	unsigned int ce_namelen;
+	unsigned int index;	/* for link extension */
 	unsigned char sha1[20];
 	char name[FLEX_ARRAY]; /* more */
 };
@@ -275,12 +276,14 @@ static inline unsigned int canon_mode(unsigned int mode)
 #define RESOLVE_UNDO_CHANGED	(1 << 4)
 #define CACHE_TREE_CHANGED	(1 << 5)
 
+struct split_index;
 struct index_state {
 	struct cache_entry **cache;
 	unsigned int version;
 	unsigned int cache_nr, cache_alloc, cache_changed;
 	struct string_list *resolve_undo;
 	struct cache_tree *cache_tree;
+	struct split_index *split_index;
 	struct cache_time timestamp;
 	unsigned name_hash_initialized : 1,
 		 initialized : 1;
diff --git a/read-cache.c b/read-cache.c
index b4653c0..90a3f09 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -14,6 +14,7 @@
 #include "resolve-undo.h"
 #include "strbuf.h"
 #include "varint.h"
+#include "split-index.h"
 
 static struct cache_entry *refresh_cache_entry(struct cache_entry *ce,
 					       unsigned int options);
@@ -34,6 +35,10 @@ static struct cache_entry *refresh_cache_entry(struct cache_entry *ce,
 #define CACHE_EXT(s) ( (s[0]<<24)|(s[1]<<16)|(s[2]<<8)|(s[3]) )
 #define CACHE_EXT_TREE 0x54524545	/* "TREE" */
 #define CACHE_EXT_RESOLVE_UNDO 0x52455543 /* "REUC" */
+#define CACHE_EXT_LINK 0x6c696e6b	  /* "link" */
+
+/* changes that can be kept in $GIT_DIR/index (basically all extensions) */
+#define EXTMASK (RESOLVE_UNDO_CHANGED | CACHE_TREE_CHANGED)
 
 struct index_state the_index;
 static const char *alternate_index_output;
@@ -63,6 +68,7 @@ void rename_index_entry_at(struct index_state *istate, int nr, const char *new_n
 	copy_cache_entry(new, old);
 	new->ce_flags &= ~CE_HASHED;
 	new->ce_namelen = namelen;
+	new->index = 0;
 	memcpy(new->name, new_name, namelen + 1);
 
 	cache_tree_invalidate_path(istate, old->name);
@@ -1335,6 +1341,10 @@ static int read_index_extension(struct index_state *istate,
 	case CACHE_EXT_RESOLVE_UNDO:
 		istate->resolve_undo = resolve_undo_read(data, sz);
 		break;
+	case CACHE_EXT_LINK:
+		if (read_link_extension(istate, data, sz))
+			return -1;
+		break;
 	default:
 		if (*ext < 'A' || 'Z' < *ext)
 			return error("index uses %.4s extension, which we do not understand",
@@ -1369,6 +1379,7 @@ static struct cache_entry *cache_entry_from_ondisk(struct ondisk_cache_entry *on
 	ce->ce_stat_data.sd_size  = get_be32(&ondisk->size);
 	ce->ce_flags = flags & ~CE_NAMEMASK;
 	ce->ce_namelen = len;
+	ce->index = 0;
 	hashcpy(ce->sha1, ondisk->sha1);
 	memcpy(ce->name, name, len);
 	ce->name[len] = '\0';
@@ -1443,7 +1454,8 @@ static struct cache_entry *create_from_disk(struct ondisk_cache_entry *ondisk,
 }
 
 /* remember to discard_cache() before reading a different cache! */
-int read_index_from(struct index_state *istate, const char *path)
+static int do_read_index(struct index_state *istate, const char *path,
+			 int must_exist)
 {
 	int fd, i;
 	struct stat st;
@@ -1460,9 +1472,9 @@ int read_index_from(struct index_state *istate, const char *path)
 	istate->timestamp.nsec = 0;
 	fd = open(path, O_RDONLY);
 	if (fd < 0) {
-		if (errno == ENOENT)
+		if (!must_exist && errno == ENOENT)
 			return 0;
-		die_errno("index file open failed");
+		die_errno("%s: index file open failed", path);
 	}
 
 	if (fstat(fd, &st))
@@ -1535,6 +1547,42 @@ unmap:
 	die("index file corrupt");
 }
 
+int read_index_from(struct index_state *istate, const char *path)
+{
+	struct split_index *split_index;
+	int ret;
+
+	/* istate->initialized covers both .git/index and .git/sharedindex.xxx */
+	if (istate->initialized)
+		return istate->cache_nr;
+
+	ret = do_read_index(istate, path, 0);
+	split_index = istate->split_index;
+	if (!split_index)
+		return ret;
+
+	if (is_null_sha1(split_index->base_sha1))
+		return ret;
+	if (istate->cache_nr)
+		die("index in split-index mode must contain no entries");
+
+	if (split_index->base)
+		discard_index(split_index->base);
+	else
+		split_index->base = xcalloc(1, sizeof(*split_index->base));
+	ret = do_read_index(split_index->base,
+			    git_path("sharedindex.%s",
+				     sha1_to_hex(split_index->base_sha1)), 1);
+	if (hashcmp(split_index->base_sha1, split_index->base->sha1))
+		die("broken index, expect %s in %s, got %s",
+		    sha1_to_hex(split_index->base_sha1),
+		    git_path("sharedindex.%s",
+				     sha1_to_hex(split_index->base_sha1)),
+		    sha1_to_hex(split_index->base->sha1));
+	merge_base_index(istate);
+	return ret;
+}
+
 int is_index_unborn(struct index_state *istate)
 {
 	return (!istate->cache_nr && !istate->timestamp.sec);
@@ -1544,8 +1592,15 @@ int discard_index(struct index_state *istate)
 {
 	int i;
 
-	for (i = 0; i < istate->cache_nr; i++)
+	for (i = 0; i < istate->cache_nr; i++) {
+		if (istate->cache[i]->index &&
+		    istate->split_index &&
+		    istate->split_index->base &&
+		    istate->cache[i]->index <= istate->split_index->base->cache_nr &&
+		    istate->cache[i] == istate->split_index->base->cache[istate->cache[i]->index - 1])
+			continue;
 		free(istate->cache[i]);
+	}
 	resolve_undo_clear_index(istate);
 	istate->cache_nr = 0;
 	istate->cache_changed = 0;
@@ -1557,6 +1612,7 @@ int discard_index(struct index_state *istate)
 	free(istate->cache);
 	istate->cache = NULL;
 	istate->cache_alloc = 0;
+	discard_split_index(istate);
 	return 0;
 }
 
@@ -1852,6 +1908,17 @@ static int do_write_index(struct index_state *istate, int newfd)
 	strbuf_release(&previous_name_buf);
 
 	/* Write extension data here */
+	if (istate->split_index) {
+		struct strbuf sb = STRBUF_INIT;
+
+		err = write_link_extension(&sb, istate) < 0 ||
+			write_index_ext_header(&c, newfd, CACHE_EXT_LINK,
+					       sb.len) < 0 ||
+			ce_write(&c, newfd, sb.buf, sb.len) < 0;
+		strbuf_release(&sb);
+		if (err)
+			return -1;
+	}
 	if (istate->cache_tree) {
 		struct strbuf sb = STRBUF_INIT;
 
@@ -1916,10 +1983,29 @@ static int do_write_locked_index(struct index_state *istate, struct lock_file *l
 		return ret;
 }
 
+static int write_split_index(struct index_state *istate,
+			     struct lock_file *lock,
+			     unsigned flags)
+{
+	int ret;
+	prepare_to_write_split_index(istate);
+	ret = do_write_locked_index(istate, lock, flags);
+	finish_writing_split_index(istate);
+	return ret;
+}
+
 int write_locked_index(struct index_state *istate, struct lock_file *lock,
 		       unsigned flags)
 {
-	return do_write_locked_index(istate, lock, flags);
+	struct split_index *si = istate->split_index;
+
+	if (!si || (istate->cache_changed & ~EXTMASK)) {
+		if (si)
+			hashclr(si->base_sha1);
+		return do_write_locked_index(istate, lock, flags);
+	}
+
+	return write_split_index(istate, lock, flags);
 }
 
 /*
diff --git a/split-index.c b/split-index.c
new file mode 100644
index 0000000..63b52bb
--- /dev/null
+++ b/split-index.c
@@ -0,0 +1,90 @@
+#include "cache.h"
+#include "split-index.h"
+
+struct split_index *init_split_index(struct index_state *istate)
+{
+	if (!istate->split_index) {
+		istate->split_index = xcalloc(1, sizeof(*istate->split_index));
+		istate->split_index->refcount = 1;
+	}
+	return istate->split_index;
+}
+
+int read_link_extension(struct index_state *istate,
+			 const void *data_, unsigned long sz)
+{
+	const unsigned char *data = data_;
+	struct split_index *si;
+	if (sz < 20)
+		return error("corrupt link extension (too short)");
+	si = init_split_index(istate);
+	hashcpy(si->base_sha1, data);
+	data += 20;
+	sz -= 20;
+	if (sz)
+		return error("garbage at the end of link extension");
+	return 0;
+}
+
+int write_link_extension(struct strbuf *sb,
+			 struct index_state *istate)
+{
+	struct split_index *si = istate->split_index;
+	strbuf_add(sb, si->base_sha1, 20);
+	return 0;
+}
+
+static void mark_base_index_entries(struct index_state *base)
+{
+	int i;
+	/*
+	 * To keep track of the shared entries between
+	 * istate->base->cache[] and istate->cache[], base entry
+	 * position is stored in each base entry. All positions start
+	 * from 1 instead of 0, which is resrved to say "this is a new
+	 * entry".
+	 */
+	for (i = 0; i < base->cache_nr; i++)
+		base->cache[i]->index = i + 1;
+}
+
+void merge_base_index(struct index_state *istate)
+{
+	struct split_index *si = istate->split_index;
+
+	mark_base_index_entries(si->base);
+	istate->cache_nr = si->base->cache_nr;
+	ALLOC_GROW(istate->cache, istate->cache_nr, istate->cache_alloc);
+	memcpy(istate->cache, si->base->cache,
+	       sizeof(*istate->cache) * istate->cache_nr);
+}
+
+void prepare_to_write_split_index(struct index_state *istate)
+{
+	struct split_index *si = init_split_index(istate);
+	/* take cache[] out temporarily */
+	si->saved_cache_nr = istate->cache_nr;
+	istate->cache_nr = 0;
+}
+
+void finish_writing_split_index(struct index_state *istate)
+{
+	struct split_index *si = init_split_index(istate);
+	istate->cache_nr = si->saved_cache_nr;
+}
+
+void discard_split_index(struct index_state *istate)
+{
+	struct split_index *si = istate->split_index;
+	if (!si)
+		return;
+	istate->split_index = NULL;
+	si->refcount--;
+	if (si->refcount)
+		return;
+	if (si->base) {
+		discard_index(si->base);
+		free(si->base);
+	}
+	free(si);
+}
diff --git a/split-index.h b/split-index.h
new file mode 100644
index 0000000..8d74041
--- /dev/null
+++ b/split-index.h
@@ -0,0 +1,25 @@
+#ifndef SPLIT_INDEX_H
+#define SPLIT_INDEX_H
+
+struct index_state;
+struct strbuf;
+
+struct split_index {
+	unsigned char base_sha1[20];
+	struct index_state *base;
+	unsigned int saved_cache_nr;
+	int refcount;
+};
+
+struct split_index *init_split_index(struct index_state *istate);
+int read_link_extension(struct index_state *istate,
+			const void *data, unsigned long sz);
+int write_link_extension(struct strbuf *sb,
+			 struct index_state *istate);
+void move_cache_to_base_index(struct index_state *istate);
+void merge_base_index(struct index_state *istate);
+void prepare_to_write_split_index(struct index_state *istate);
+void finish_writing_split_index(struct index_state *istate);
+void discard_split_index(struct index_state *istate);
+
+#endif
diff --git a/unpack-trees.c b/unpack-trees.c
index f594932..a941f7c 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -8,6 +8,7 @@
 #include "progress.h"
 #include "refs.h"
 #include "attr.h"
+#include "split-index.h"
 
 /*
  * Error messages expected by scripts out of plumbing commands such as
@@ -1046,6 +1047,9 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 	o->result.timestamp.sec = o->src_index->timestamp.sec;
 	o->result.timestamp.nsec = o->src_index->timestamp.nsec;
 	o->result.version = o->src_index->version;
+	o->result.split_index = o->src_index->split_index;
+	if (o->result.split_index)
+		o->result.split_index->refcount++;
 	hashcpy(o->result.sha1, o->src_index->sha1);
 	o->merge_size = len;
 	mark_all_ce_unused(o->src_index);
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 18/32] read-cache: mark new entries for split index
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (16 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 17/32] read-cache: split-index mode Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 19/32] read-cache: save deleted entries in " Nguyễn Thái Ngọc Duy
                   ` (13 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

Make sure entry addition does not lead to unifying the index. We don't
need to explicitly keep track of new entries. If ce->index is zero,
they're new. Otherwise it's unlikely that they are new, but we'll do a
thorough check later at writing time.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 read-cache.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/read-cache.c b/read-cache.c
index 90a3f09..52a27b3 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -38,7 +38,8 @@ static struct cache_entry *refresh_cache_entry(struct cache_entry *ce,
 #define CACHE_EXT_LINK 0x6c696e6b	  /* "link" */
 
 /* changes that can be kept in $GIT_DIR/index (basically all extensions) */
-#define EXTMASK (RESOLVE_UNDO_CHANGED | CACHE_TREE_CHANGED)
+#define EXTMASK (RESOLVE_UNDO_CHANGED | CACHE_TREE_CHANGED | \
+		 CE_ENTRY_ADDED)
 
 struct index_state the_index;
 static const char *alternate_index_output;
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 19/32] read-cache: save deleted entries in split index
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (17 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 18/32] read-cache: mark new entries for split index Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 20/32] read-cache: mark updated entries for " Nguyễn Thái Ngọc Duy
                   ` (12 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

Entries that belong to the base index should not be freed. Mark
CE_REMOVE to track them.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 read-cache.c  | 14 ++++++++------
 split-index.c | 12 ++++++++++++
 split-index.h |  1 +
 3 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 52a27b3..c437585 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -39,7 +39,7 @@ static struct cache_entry *refresh_cache_entry(struct cache_entry *ce,
 
 /* changes that can be kept in $GIT_DIR/index (basically all extensions) */
 #define EXTMASK (RESOLVE_UNDO_CHANGED | CACHE_TREE_CHANGED | \
-		 CE_ENTRY_ADDED)
+		 CE_ENTRY_ADDED | CE_ENTRY_REMOVED)
 
 struct index_state the_index;
 static const char *alternate_index_output;
@@ -488,7 +488,7 @@ int remove_index_entry_at(struct index_state *istate, int pos)
 
 	record_resolve_undo(istate, ce);
 	remove_name_hash(istate, ce);
-	free(ce);
+	save_or_free_index_entry(istate, ce);
 	istate->cache_changed |= CE_ENTRY_REMOVED;
 	istate->cache_nr--;
 	if (pos >= istate->cache_nr)
@@ -512,7 +512,7 @@ void remove_marked_cache_entries(struct index_state *istate)
 	for (i = j = 0; i < istate->cache_nr; i++) {
 		if (ce_array[i]->ce_flags & CE_REMOVE) {
 			remove_name_hash(istate, ce_array[i]);
-			free(ce_array[i]);
+			save_or_free_index_entry(istate, ce_array[i]);
 		}
 		else
 			ce_array[j++] = ce_array[i];
@@ -577,7 +577,9 @@ static int different_name(struct cache_entry *ce, struct cache_entry *alias)
  * So we use the CE_ADDED flag to verify that the alias was an old
  * one before we accept it as
  */
-static struct cache_entry *create_alias_ce(struct cache_entry *ce, struct cache_entry *alias)
+static struct cache_entry *create_alias_ce(struct index_state *istate,
+					   struct cache_entry *ce,
+					   struct cache_entry *alias)
 {
 	int len;
 	struct cache_entry *new;
@@ -590,7 +592,7 @@ static struct cache_entry *create_alias_ce(struct cache_entry *ce, struct cache_
 	new = xcalloc(1, cache_entry_size(len));
 	memcpy(new->name, alias->name, len);
 	copy_cache_entry(new, ce);
-	free(ce);
+	save_or_free_index_entry(istate, ce);
 	return new;
 }
 
@@ -683,7 +685,7 @@ int add_to_index(struct index_state *istate, const char *path, struct stat *st,
 		set_object_name_for_intent_to_add_entry(ce);
 
 	if (ignore_case && alias && different_name(ce, alias))
-		ce = create_alias_ce(ce, alias);
+		ce = create_alias_ce(istate, ce, alias);
 	ce->ce_flags |= CE_ADDED;
 
 	/* It was suspected to be racily clean, but it turns out to be Ok */
diff --git a/split-index.c b/split-index.c
index 63b52bb..2bb5d55 100644
--- a/split-index.c
+++ b/split-index.c
@@ -88,3 +88,15 @@ void discard_split_index(struct index_state *istate)
 	}
 	free(si);
 }
+
+void save_or_free_index_entry(struct index_state *istate, struct cache_entry *ce)
+{
+	if (ce->index &&
+	    istate->split_index &&
+	    istate->split_index->base &&
+	    ce->index <= istate->split_index->base->cache_nr &&
+	    ce == istate->split_index->base->cache[ce->index - 1])
+		ce->ce_flags |= CE_REMOVE;
+	else
+		free(ce);
+}
diff --git a/split-index.h b/split-index.h
index 8d74041..5302118 100644
--- a/split-index.h
+++ b/split-index.h
@@ -12,6 +12,7 @@ struct split_index {
 };
 
 struct split_index *init_split_index(struct index_state *istate);
+void save_or_free_index_entry(struct index_state *istate, struct cache_entry *ce);
 int read_link_extension(struct index_state *istate,
 			const void *data, unsigned long sz);
 int write_link_extension(struct strbuf *sb,
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 20/32] read-cache: mark updated entries for split index
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (18 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 19/32] read-cache: save deleted entries in " Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 21/32] split-index: the writing part Nguyễn Thái Ngọc Duy
                   ` (11 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

The large part of this patch just follows CE_ENTRY_CHANGED
marks. replace_index_entry() is updated to update
split_index->base->cache[] as well so base->cache[] does not reference
to a freed entry.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/update-index.c |  2 ++
 cache.h                |  2 ++
 entry.c                |  1 +
 read-cache.c           |  5 ++++-
 split-index.c          | 15 +++++++++++++++
 split-index.h          |  3 +++
 unpack-trees.c         |  4 +++-
 7 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index fa3c441..f7a19c4 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -55,6 +55,7 @@ static int mark_ce_flags(const char *path, int flag, int mark)
 			active_cache[pos]->ce_flags |= flag;
 		else
 			active_cache[pos]->ce_flags &= ~flag;
+		active_cache[pos]->ce_flags |= CE_UPDATE_IN_BASE;
 		cache_tree_invalidate_path(&the_index, path);
 		active_cache_changed |= CE_ENTRY_CHANGED;
 		return 0;
@@ -268,6 +269,7 @@ static void chmod_path(int flip, const char *path)
 		goto fail;
 	}
 	cache_tree_invalidate_path(&the_index, path);
+	ce->ce_flags |= CE_UPDATE_IN_BASE;
 	active_cache_changed |= CE_ENTRY_CHANGED;
 	report("chmod %cx '%s'", flip, path);
 	return;
diff --git a/cache.h b/cache.h
index 47fe374..295bf9d 100644
--- a/cache.h
+++ b/cache.h
@@ -169,6 +169,8 @@ struct cache_entry {
 /* used to temporarily mark paths matched by pathspecs */
 #define CE_MATCHED           (1 << 26)
 
+#define CE_UPDATE_IN_BASE    (1 << 27)
+
 /*
  * Extended on-disk flags
  */
diff --git a/entry.c b/entry.c
index d913c1d..1eda8e9 100644
--- a/entry.c
+++ b/entry.c
@@ -214,6 +214,7 @@ finish:
 		if (!fstat_done)
 			lstat(ce->name, &st);
 		fill_stat_cache_info(ce, &st);
+		ce->ce_flags |= CE_UPDATE_IN_BASE;
 		state->istate->cache_changed |= CE_ENTRY_CHANGED;
 	}
 	return 0;
diff --git a/read-cache.c b/read-cache.c
index c437585..7966fb7 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -39,7 +39,7 @@ static struct cache_entry *refresh_cache_entry(struct cache_entry *ce,
 
 /* changes that can be kept in $GIT_DIR/index (basically all extensions) */
 #define EXTMASK (RESOLVE_UNDO_CHANGED | CACHE_TREE_CHANGED | \
-		 CE_ENTRY_ADDED | CE_ENTRY_REMOVED)
+		 CE_ENTRY_ADDED | CE_ENTRY_REMOVED | CE_ENTRY_CHANGED)
 
 struct index_state the_index;
 static const char *alternate_index_output;
@@ -54,9 +54,11 @@ static void replace_index_entry(struct index_state *istate, int nr, struct cache
 {
 	struct cache_entry *old = istate->cache[nr];
 
+	replace_index_entry_in_base(istate, old, ce);
 	remove_name_hash(istate, old);
 	free(old);
 	set_index_entry(istate, nr, ce);
+	ce->ce_flags |= CE_UPDATE_IN_BASE;
 	istate->cache_changed |= CE_ENTRY_CHANGED;
 }
 
@@ -1192,6 +1194,7 @@ int refresh_index(struct index_state *istate, unsigned int flags,
 				 * means the index is not valid anymore.
 				 */
 				ce->ce_flags &= ~CE_VALID;
+				ce->ce_flags |= CE_UPDATE_IN_BASE;
 				istate->cache_changed |= CE_ENTRY_CHANGED;
 			}
 			if (quiet)
diff --git a/split-index.c b/split-index.c
index 2bb5d55..b36c73b 100644
--- a/split-index.c
+++ b/split-index.c
@@ -100,3 +100,18 @@ void save_or_free_index_entry(struct index_state *istate, struct cache_entry *ce
 	else
 		free(ce);
 }
+
+void replace_index_entry_in_base(struct index_state *istate,
+				 struct cache_entry *old,
+				 struct cache_entry *new)
+{
+	if (old->index &&
+	    istate->split_index &&
+	    istate->split_index->base &&
+	    old->index <= istate->split_index->base->cache_nr) {
+		new->index = old->index;
+		if (old != istate->split_index->base->cache[new->index - 1])
+			free(istate->split_index->base->cache[new->index - 1]);
+		istate->split_index->base->cache[new->index - 1] = new;
+	}
+}
diff --git a/split-index.h b/split-index.h
index 5302118..812e510 100644
--- a/split-index.h
+++ b/split-index.h
@@ -13,6 +13,9 @@ struct split_index {
 
 struct split_index *init_split_index(struct index_state *istate);
 void save_or_free_index_entry(struct index_state *istate, struct cache_entry *ce);
+void replace_index_entry_in_base(struct index_state *istate,
+				 struct cache_entry *old,
+				 struct cache_entry *new);
 int read_link_extension(struct index_state *istate,
 			const void *data, unsigned long sz);
 int write_link_extension(struct strbuf *sb,
diff --git a/unpack-trees.c b/unpack-trees.c
index a941f7c..4a9cdf2 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -257,8 +257,10 @@ static int apply_sparse_checkout(struct index_state *istate,
 		ce->ce_flags |= CE_SKIP_WORKTREE;
 	else
 		ce->ce_flags &= ~CE_SKIP_WORKTREE;
-	if (was_skip_worktree != ce_skip_worktree(ce))
+	if (was_skip_worktree != ce_skip_worktree(ce)) {
+		ce->ce_flags |= CE_UPDATE_IN_BASE;
 		istate->cache_changed |= CE_ENTRY_CHANGED;
+	}
 
 	/*
 	 * if (!was_skip_worktree && !ce_skip_worktree()) {
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 21/32] split-index: the writing part
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (19 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 20/32] read-cache: mark updated entries for " Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 22/32] split-index: the reading part Nguyễn Thái Ngọc Duy
                   ` (10 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

prepare_to_write_split_index() does the major work, classifying
deleted, updated and added entries. write_link_extension() then just
writes it down.

An observation is, deleting an entry, then adding it back is recorded
as "entry X is deleted, entry X is added", not "entry X is replaced".
This is simpler, with small overhead: a replaced entry is stored
without its path, a new entry is store with its path.

A note about unpack_trees() and the deduplication code inside
prepare_to_write_split_index(). Usually tracking updated/removed
entries via read-cache API is enough. unpack_trees() manipulates the
index in a different way: it throws the entire source index out,
builds up a new one, copying/duplicating entries (using dup_entry)
from the source index over if necessary, then returns the new index.

A naive solution would be marking the entire source index "deleted"
and add their duplicates as new. That could bring $GIT_DIR/index back
to the original size. So we try harder and memcmp() between the
original and the duplicate to see if it needs updating.

We could avoid memcmp() too, by avoiding duplicating the original
entry in dup_entry(). The performance gain this way is within noise
level and it complicates unpack-trees.c. So memcmp() is the preferred
way to deal with deduplication.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 split-index.c | 101 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 split-index.h |   4 +++
 2 files changed, 103 insertions(+), 2 deletions(-)

diff --git a/split-index.c b/split-index.c
index b36c73b..5708807 100644
--- a/split-index.c
+++ b/split-index.c
@@ -1,5 +1,6 @@
 #include "cache.h"
 #include "split-index.h"
+#include "ewah/ewok.h"
 
 struct split_index *init_split_index(struct index_state *istate)
 {
@@ -26,11 +27,22 @@ int read_link_extension(struct index_state *istate,
 	return 0;
 }
 
+static int write_strbuf(void *user_data, const void *data, size_t len)
+{
+	struct strbuf *sb = user_data;
+	strbuf_add(sb, data, len);
+	return len;
+}
+
 int write_link_extension(struct strbuf *sb,
 			 struct index_state *istate)
 {
 	struct split_index *si = istate->split_index;
 	strbuf_add(sb, si->base_sha1, 20);
+	if (!si->delete_bitmap && !si->replace_bitmap)
+		return 0;
+	ewah_serialize_to(si->delete_bitmap, write_strbuf, sb);
+	ewah_serialize_to(si->replace_bitmap, write_strbuf, sb);
 	return 0;
 }
 
@@ -62,14 +74,99 @@ void merge_base_index(struct index_state *istate)
 void prepare_to_write_split_index(struct index_state *istate)
 {
 	struct split_index *si = init_split_index(istate);
-	/* take cache[] out temporarily */
+	struct cache_entry **entries = NULL, *ce;
+	int i, nr_entries = 0, nr_alloc = 0;
+
+	si->delete_bitmap = ewah_new();
+	si->replace_bitmap = ewah_new();
+
+	if (si->base) {
+		/* Go through istate->cache[] and mark CE_MATCHED to
+		 * entry with positive index. We'll go through
+		 * base->cache[] later to delete all entries in base
+		 * that are not marked eith either CE_MATCHED or
+		 * CE_UPDATE_IN_BASE. If istate->cache[i] is a
+		 * duplicate, deduplicate it.
+		 */
+		for (i = 0; i < istate->cache_nr; i++) {
+			struct cache_entry *base;
+			/* namelen is checked separately */
+			const unsigned int ondisk_flags =
+				CE_STAGEMASK | CE_VALID | CE_EXTENDED_FLAGS;
+			unsigned int ce_flags, base_flags, ret;
+			ce = istate->cache[i];
+			if (!ce->index)
+				continue;
+			if (ce->index > si->base->cache_nr) {
+				ce->index = 0;
+				continue;
+			}
+			ce->ce_flags |= CE_MATCHED; /* or "shared" */
+			base = si->base->cache[ce->index - 1];
+			if (ce == base)
+				continue;
+			if (ce->ce_namelen != base->ce_namelen ||
+			    strcmp(ce->name, base->name)) {
+				ce->index = 0;
+				continue;
+			}
+			ce_flags = ce->ce_flags;
+			base_flags = base->ce_flags;
+			/* only on-disk flags matter */
+			ce->ce_flags   &= ondisk_flags;
+			base->ce_flags &= ondisk_flags;
+			ret = memcmp(&ce->ce_stat_data, &base->ce_stat_data,
+				     offsetof(struct cache_entry, name) -
+				     offsetof(struct cache_entry, ce_stat_data));
+			ce->ce_flags = ce_flags;
+			base->ce_flags = base_flags;
+			if (ret)
+				ce->ce_flags |= CE_UPDATE_IN_BASE;
+			free(base);
+			si->base->cache[ce->index - 1] = ce;
+		}
+		for (i = 0; i < si->base->cache_nr; i++) {
+			ce = si->base->cache[i];
+			if ((ce->ce_flags & CE_REMOVE) ||
+			    !(ce->ce_flags & CE_MATCHED))
+				ewah_set(si->delete_bitmap, i);
+			else if (ce->ce_flags & CE_UPDATE_IN_BASE) {
+				ewah_set(si->replace_bitmap, i);
+				ALLOC_GROW(entries, nr_entries+1, nr_alloc);
+				entries[nr_entries++] = ce;
+			}
+		}
+	}
+
+	for (i = 0; i < istate->cache_nr; i++) {
+		ce = istate->cache[i];
+		if ((!si->base || !ce->index) && !(ce->ce_flags & CE_REMOVE)) {
+			ALLOC_GROW(entries, nr_entries+1, nr_alloc);
+			entries[nr_entries++] = ce;
+		}
+		ce->ce_flags &= ~CE_MATCHED;
+	}
+
+	/*
+	 * take cache[] out temporarily, put entries[] in its place
+	 * for writing
+	 */
+	si->saved_cache = istate->cache;
 	si->saved_cache_nr = istate->cache_nr;
-	istate->cache_nr = 0;
+	istate->cache = entries;
+	istate->cache_nr = nr_entries;
 }
 
 void finish_writing_split_index(struct index_state *istate)
 {
 	struct split_index *si = init_split_index(istate);
+
+	ewah_free(si->delete_bitmap);
+	ewah_free(si->replace_bitmap);
+	si->delete_bitmap = NULL;
+	si->replace_bitmap = NULL;
+	free(istate->cache);
+	istate->cache = si->saved_cache;
 	istate->cache_nr = si->saved_cache_nr;
 }
 
diff --git a/split-index.h b/split-index.h
index 812e510..53b778f 100644
--- a/split-index.h
+++ b/split-index.h
@@ -3,10 +3,14 @@
 
 struct index_state;
 struct strbuf;
+struct ewah_bitmap;
 
 struct split_index {
 	unsigned char base_sha1[20];
 	struct index_state *base;
+	struct ewah_bitmap *delete_bitmap;
+	struct ewah_bitmap *replace_bitmap;
+	struct cache_entry **saved_cache;
 	unsigned int saved_cache_nr;
 	int refcount;
 };
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 22/32] split-index: the reading part
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (20 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 21/32] split-index: the writing part Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 23/32] split-index: do not invalidate cache-tree at read time Nguyễn Thái Ngọc Duy
                   ` (9 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

CE_REMOVE'd entries are removed here because only parts of the code
base (unpack_trees in fact) test this bit when they look for the
presence of an entry. Leaving them may confuse the code ignores this
bit and expects to see a real entry.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 read-cache.c  |  2 --
 split-index.c | 84 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 split-index.h |  2 ++
 3 files changed, 84 insertions(+), 4 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 7966fb7..1a7ef7f 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1569,8 +1569,6 @@ int read_index_from(struct index_state *istate, const char *path)
 
 	if (is_null_sha1(split_index->base_sha1))
 		return ret;
-	if (istate->cache_nr)
-		die("index in split-index mode must contain no entries");
 
 	if (split_index->base)
 		discard_index(split_index->base);
diff --git a/split-index.c b/split-index.c
index 5708807..b03a250 100644
--- a/split-index.c
+++ b/split-index.c
@@ -16,13 +16,27 @@ int read_link_extension(struct index_state *istate,
 {
 	const unsigned char *data = data_;
 	struct split_index *si;
+	int ret;
+
 	if (sz < 20)
 		return error("corrupt link extension (too short)");
 	si = init_split_index(istate);
 	hashcpy(si->base_sha1, data);
 	data += 20;
 	sz -= 20;
-	if (sz)
+	if (!sz)
+		return 0;
+	si->delete_bitmap = ewah_new();
+	ret = ewah_read_mmap(si->delete_bitmap, data, sz);
+	if (ret < 0)
+		return error("corrupt delete bitmap in link extension");
+	data += ret;
+	sz -= ret;
+	si->replace_bitmap = ewah_new();
+	ret = ewah_read_mmap(si->replace_bitmap, data, sz);
+	if (ret < 0)
+		return error("corrupt replace bitmap in link extension");
+	if (ret != sz)
 		return error("garbage at the end of link extension");
 	return 0;
 }
@@ -60,15 +74,81 @@ static void mark_base_index_entries(struct index_state *base)
 		base->cache[i]->index = i + 1;
 }
 
+static void mark_entry_for_delete(size_t pos, void *data)
+{
+	struct index_state *istate = data;
+	if (pos >= istate->cache_nr)
+		die("position for delete %d exceeds base index size %d",
+		    (int)pos, istate->cache_nr);
+	istate->cache[pos]->ce_flags |= CE_REMOVE;
+	istate->split_index->nr_deletions = 1;
+}
+
+static void replace_entry(size_t pos, void *data)
+{
+	struct index_state *istate = data;
+	struct split_index *si = istate->split_index;
+	struct cache_entry *dst, *src;
+	if (pos >= istate->cache_nr)
+		die("position for replacement %d exceeds base index size %d",
+		    (int)pos, istate->cache_nr);
+	if (si->nr_replacements >= si->saved_cache_nr)
+		die("too many replacements (%d vs %d)",
+		    si->nr_replacements, si->saved_cache_nr);
+	dst = istate->cache[pos];
+	if (dst->ce_flags & CE_REMOVE)
+		die("entry %d is marked as both replaced and deleted",
+		    (int)pos);
+	src = si->saved_cache[si->nr_replacements];
+	src->index = pos + 1;
+	src->ce_flags |= CE_UPDATE_IN_BASE;
+	free(dst);
+	dst = src;
+	si->nr_replacements++;
+}
+
 void merge_base_index(struct index_state *istate)
 {
 	struct split_index *si = istate->split_index;
+	unsigned int i;
 
 	mark_base_index_entries(si->base);
-	istate->cache_nr = si->base->cache_nr;
+
+	si->saved_cache	    = istate->cache;
+	si->saved_cache_nr  = istate->cache_nr;
+	istate->cache_nr    = si->base->cache_nr;
+	istate->cache	    = NULL;
+	istate->cache_alloc = 0;
 	ALLOC_GROW(istate->cache, istate->cache_nr, istate->cache_alloc);
 	memcpy(istate->cache, si->base->cache,
 	       sizeof(*istate->cache) * istate->cache_nr);
+
+	si->nr_deletions = 0;
+	si->nr_replacements = 0;
+	ewah_each_bit(si->replace_bitmap, replace_entry, istate);
+	ewah_each_bit(si->delete_bitmap, mark_entry_for_delete, istate);
+	if (si->nr_deletions)
+		remove_marked_cache_entries(istate);
+
+	for (i = si->nr_replacements; i < si->saved_cache_nr; i++) {
+		add_index_entry(istate, si->saved_cache[i],
+				ADD_CACHE_OK_TO_ADD |
+				/*
+				 * we may have to replay what
+				 * merge-recursive.c:update_stages()
+				 * does, which has this flag on
+				 */
+				ADD_CACHE_SKIP_DFCHECK);
+		si->saved_cache[i] = NULL;
+	}
+
+	ewah_free(si->delete_bitmap);
+	ewah_free(si->replace_bitmap);
+	free(si->saved_cache);
+	si->delete_bitmap  = NULL;
+	si->replace_bitmap = NULL;
+	si->saved_cache	   = NULL;
+	si->saved_cache_nr = 0;
 }
 
 void prepare_to_write_split_index(struct index_state *istate)
diff --git a/split-index.h b/split-index.h
index 53b778f..c1324f5 100644
--- a/split-index.h
+++ b/split-index.h
@@ -12,6 +12,8 @@ struct split_index {
 	struct ewah_bitmap *replace_bitmap;
 	struct cache_entry **saved_cache;
 	unsigned int saved_cache_nr;
+	unsigned int nr_deletions;
+	unsigned int nr_replacements;
 	int refcount;
 };
 
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 23/32] split-index: do not invalidate cache-tree at read time
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (21 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 22/32] split-index: the reading part Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 24/32] split-index: strip pathname of on-disk replaced entries Nguyễn Thái Ngọc Duy
                   ` (8 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

We are sure that after merge_base_index() is done. cache-tree can
still be used with the final index. So don't destroy cache tree.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 cache.h       | 1 +
 read-cache.c  | 3 ++-
 split-index.c | 1 +
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index 295bf9d..70fd8ed 100644
--- a/cache.h
+++ b/cache.h
@@ -488,6 +488,7 @@ extern int index_name_pos(const struct index_state *, const char *name, int name
 #define ADD_CACHE_SKIP_DFCHECK 4	/* Ok to skip DF conflict checks */
 #define ADD_CACHE_JUST_APPEND 8		/* Append only; tree.c::read_tree() */
 #define ADD_CACHE_NEW_ONLY 16		/* Do not replace existing ones */
+#define ADD_CACHE_KEEP_CACHE_TREE 32	/* Do not invalidate cache-tree */
 extern int add_index_entry(struct index_state *, struct cache_entry *ce, int option);
 extern void rename_index_entry_at(struct index_state *, int pos, const char *new_name);
 extern int remove_index_entry_at(struct index_state *, int pos);
diff --git a/read-cache.c b/read-cache.c
index 1a7ef7f..d5f70a0 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -950,7 +950,8 @@ static int add_index_entry_with_check(struct index_state *istate, struct cache_e
 	int skip_df_check = option & ADD_CACHE_SKIP_DFCHECK;
 	int new_only = option & ADD_CACHE_NEW_ONLY;
 
-	cache_tree_invalidate_path(istate, ce->name);
+	if (!(option & ADD_CACHE_KEEP_CACHE_TREE))
+		cache_tree_invalidate_path(istate, ce->name);
 	pos = index_name_stage_pos(istate, ce->name, ce_namelen(ce), ce_stage(ce));
 
 	/* existing match? Just replace it. */
diff --git a/split-index.c b/split-index.c
index b03a250..33c0c4b 100644
--- a/split-index.c
+++ b/split-index.c
@@ -133,6 +133,7 @@ void merge_base_index(struct index_state *istate)
 	for (i = si->nr_replacements; i < si->saved_cache_nr; i++) {
 		add_index_entry(istate, si->saved_cache[i],
 				ADD_CACHE_OK_TO_ADD |
+				ADD_CACHE_KEEP_CACHE_TREE |
 				/*
 				 * we may have to replay what
 				 * merge-recursive.c:update_stages()
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 24/32] split-index: strip pathname of on-disk replaced entries
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (22 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 23/32] split-index: do not invalidate cache-tree at read time Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 25/32] update-index: new options to enable/disable split index mode Nguyễn Thái Ngọc Duy
                   ` (7 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

We know the positions of replaced entries via the replace bitmap in
"link" extension, so the "name" path does not have to be stored (it's
still in the shared index). With this, we also have a way to
distinguish additions vs replacements at load time and can catch
broken "link" extensions.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 cache.h       |  1 +
 read-cache.c  | 10 ++++++++++
 split-index.c | 14 ++++++++++++--
 3 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/cache.h b/cache.h
index 70fd8ed..ddb7cd2 100644
--- a/cache.h
+++ b/cache.h
@@ -170,6 +170,7 @@ struct cache_entry {
 #define CE_MATCHED           (1 << 26)
 
 #define CE_UPDATE_IN_BASE    (1 << 27)
+#define CE_STRIP_NAME        (1 << 28)
 
 /*
  * Extended on-disk flags
diff --git a/read-cache.c b/read-cache.c
index d5f70a0..af475dc 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1787,9 +1787,15 @@ static int ce_write_entry(git_SHA_CTX *c, int fd, struct cache_entry *ce,
 {
 	int size;
 	struct ondisk_cache_entry *ondisk;
+	int saved_namelen = saved_namelen; /* compiler workaround */
 	char *name;
 	int result;
 
+	if (ce->ce_flags & CE_STRIP_NAME) {
+		saved_namelen = ce_namelen(ce);
+		ce->ce_namelen = 0;
+	}
+
 	if (!previous_name) {
 		size = ondisk_ce_size(ce);
 		ondisk = xcalloc(1, size);
@@ -1821,6 +1827,10 @@ static int ce_write_entry(git_SHA_CTX *c, int fd, struct cache_entry *ce,
 		strbuf_splice(previous_name, common, to_remove,
 			      ce->name + common, ce_namelen(ce) - common);
 	}
+	if (ce->ce_flags & CE_STRIP_NAME) {
+		ce->ce_namelen = saved_namelen;
+		ce->ce_flags &= ~CE_STRIP_NAME;
+	}
 
 	result = ce_write(c, fd, ondisk, size);
 	free(ondisk);
diff --git a/split-index.c b/split-index.c
index 33c0c4b..ee3246f 100644
--- a/split-index.c
+++ b/split-index.c
@@ -89,6 +89,7 @@ static void replace_entry(size_t pos, void *data)
 	struct index_state *istate = data;
 	struct split_index *si = istate->split_index;
 	struct cache_entry *dst, *src;
+
 	if (pos >= istate->cache_nr)
 		die("position for replacement %d exceeds base index size %d",
 		    (int)pos, istate->cache_nr);
@@ -100,10 +101,14 @@ static void replace_entry(size_t pos, void *data)
 		die("entry %d is marked as both replaced and deleted",
 		    (int)pos);
 	src = si->saved_cache[si->nr_replacements];
+	if (ce_namelen(src))
+		die("corrupt link extension, entry %d should have "
+		    "zero length name", (int)pos);
 	src->index = pos + 1;
 	src->ce_flags |= CE_UPDATE_IN_BASE;
-	free(dst);
-	dst = src;
+	src->ce_namelen = dst->ce_namelen;
+	copy_cache_entry(dst, src);
+	free(src);
 	si->nr_replacements++;
 }
 
@@ -131,6 +136,9 @@ void merge_base_index(struct index_state *istate)
 		remove_marked_cache_entries(istate);
 
 	for (i = si->nr_replacements; i < si->saved_cache_nr; i++) {
+		if (!ce_namelen(si->saved_cache[i]))
+			die("corrupt link extension, entry %d should "
+			    "have non-zero length name", i);
 		add_index_entry(istate, si->saved_cache[i],
 				ADD_CACHE_OK_TO_ADD |
 				ADD_CACHE_KEEP_CACHE_TREE |
@@ -213,6 +221,7 @@ void prepare_to_write_split_index(struct index_state *istate)
 				ewah_set(si->delete_bitmap, i);
 			else if (ce->ce_flags & CE_UPDATE_IN_BASE) {
 				ewah_set(si->replace_bitmap, i);
+				ce->ce_flags |= CE_STRIP_NAME;
 				ALLOC_GROW(entries, nr_entries+1, nr_alloc);
 				entries[nr_entries++] = ce;
 			}
@@ -222,6 +231,7 @@ void prepare_to_write_split_index(struct index_state *istate)
 	for (i = 0; i < istate->cache_nr; i++) {
 		ce = istate->cache[i];
 		if ((!si->base || !ce->index) && !(ce->ce_flags & CE_REMOVE)) {
+			assert(!(ce->ce_flags & CE_STRIP_NAME));
 			ALLOC_GROW(entries, nr_entries+1, nr_alloc);
 			entries[nr_entries++] = ce;
 		}
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 25/32] update-index: new options to enable/disable split index mode
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (23 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 24/32] split-index: strip pathname of on-disk replaced entries Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 26/32] update-index --split-index: do not split if $GIT_DIR is read only Nguyễn Thái Ngọc Duy
                   ` (6 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

If you have a large work tree but only make changes in a subset, then
$GIT_DIR/index's size should be stable after a while. If you change
branches that touch something else, $GIT_DIR/index's size may grow
large that it becomes as slow as the unified index. Do --split-index
again occasionally to force all changes back to the shared index and
keep $GIT_DIR/index small.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/git-update-index.txt | 11 +++++++
 builtin/update-index.c             | 18 ++++++++++
 cache.h                            |  1 +
 read-cache.c                       | 67 ++++++++++++++++++++++++++++++++++----
 split-index.c                      | 23 +++++++++++++
 5 files changed, 114 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-update-index.txt b/Documentation/git-update-index.txt
index d6de4a0..dfc09d9 100644
--- a/Documentation/git-update-index.txt
+++ b/Documentation/git-update-index.txt
@@ -161,6 +161,17 @@ may not support it yet.
 	Only meaningful with `--stdin` or `--index-info`; paths are
 	separated with NUL character instead of LF.
 
+--split-index::
+--no-split-index::
+	Enable or disable split index mode. If enabled, the index is
+	split into two files, $GIT_DIR/index and $GIT_DIR/sharedindex.<SHA-1>.
+	Changes are accumulated in $GIT_DIR/index while the shared
+	index file contains all index entries stays unchanged. If
+	split-index mode is already enabled and `--split-index` is
+	given again, all changes in $GIT_DIR/index are pushed back to
+	the shared index file. This mode is designed for very large
+	indexes that take a signficant amount of time to read or write.
+
 \--::
 	Do not interpret any more arguments as options.
 
diff --git a/builtin/update-index.c b/builtin/update-index.c
index f7a19c4..b0503f4 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -13,6 +13,7 @@
 #include "parse-options.h"
 #include "pathspec.h"
 #include "dir.h"
+#include "split-index.h"
 
 /*
  * Default to not allowing changes to the list of files. The
@@ -742,6 +743,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 	char set_executable_bit = 0;
 	struct refresh_params refresh_args = {0, &has_errors};
 	int lock_error = 0;
+	int split_index = -1;
 	struct lock_file *lock_file;
 	struct parse_opt_ctx_t ctx;
 	int parseopt_state = PARSE_OPT_UNKNOWN;
@@ -824,6 +826,8 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 			resolve_undo_clear_callback},
 		OPT_INTEGER(0, "index-version", &preferred_index_format,
 			N_("write index in this format")),
+		OPT_BOOL(0, "split-index", &split_index,
+			N_("enable or disable split index")),
 		OPT_END()
 	};
 
@@ -917,6 +921,20 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 		strbuf_release(&buf);
 	}
 
+	if (split_index > 0) {
+		init_split_index(&the_index);
+		the_index.cache_changed |= SPLIT_INDEX_ORDERED;
+	} else if (!split_index && the_index.split_index) {
+		/*
+		 * can't discard_split_index(&the_index); because that
+		 * will destroy split_index->base->cache[], which may
+		 * be shared with the_index.cache[]. So yeah we're
+		 * leaking a bit here.
+		 */
+		the_index.split_index = NULL;
+		the_index.cache_changed |= SOMETHING_CHANGED;
+	}
+
 	if (active_cache_changed) {
 		if (newfd < 0) {
 			if (refresh_args.flags & REFRESH_QUIET)
diff --git a/cache.h b/cache.h
index ddb7cd2..7523c28 100644
--- a/cache.h
+++ b/cache.h
@@ -278,6 +278,7 @@ static inline unsigned int canon_mode(unsigned int mode)
 #define CE_ENTRY_ADDED		(1 << 3)
 #define RESOLVE_UNDO_CHANGED	(1 << 4)
 #define CACHE_TREE_CHANGED	(1 << 5)
+#define SPLIT_INDEX_ORDERED	(1 << 6)
 
 struct split_index;
 struct index_state {
diff --git a/read-cache.c b/read-cache.c
index af475dc..8ecb959 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -15,6 +15,7 @@
 #include "strbuf.h"
 #include "varint.h"
 #include "split-index.h"
+#include "sigchain.h"
 
 static struct cache_entry *refresh_cache_entry(struct cache_entry *ce,
 					       unsigned int options);
@@ -39,7 +40,8 @@ static struct cache_entry *refresh_cache_entry(struct cache_entry *ce,
 
 /* changes that can be kept in $GIT_DIR/index (basically all extensions) */
 #define EXTMASK (RESOLVE_UNDO_CHANGED | CACHE_TREE_CHANGED | \
-		 CE_ENTRY_ADDED | CE_ENTRY_REMOVED | CE_ENTRY_CHANGED)
+		 CE_ENTRY_ADDED | CE_ENTRY_REMOVED | CE_ENTRY_CHANGED | \
+		 SPLIT_INDEX_ORDERED)
 
 struct index_state the_index;
 static const char *alternate_index_output;
@@ -1860,7 +1862,8 @@ void update_index_if_able(struct index_state *istate, struct lock_file *lockfile
 		rollback_lock_file(lockfile);
 }
 
-static int do_write_index(struct index_state *istate, int newfd)
+static int do_write_index(struct index_state *istate, int newfd,
+			  int strip_extensions)
 {
 	git_SHA_CTX c;
 	struct cache_header hdr;
@@ -1923,7 +1926,7 @@ static int do_write_index(struct index_state *istate, int newfd)
 	strbuf_release(&previous_name_buf);
 
 	/* Write extension data here */
-	if (istate->split_index) {
+	if (!strip_extensions && istate->split_index) {
 		struct strbuf sb = STRBUF_INIT;
 
 		err = write_link_extension(&sb, istate) < 0 ||
@@ -1934,7 +1937,7 @@ static int do_write_index(struct index_state *istate, int newfd)
 		if (err)
 			return -1;
 	}
-	if (istate->cache_tree) {
+	if (!strip_extensions && istate->cache_tree) {
 		struct strbuf sb = STRBUF_INIT;
 
 		cache_tree_write(&sb, istate->cache_tree);
@@ -1944,7 +1947,7 @@ static int do_write_index(struct index_state *istate, int newfd)
 		if (err)
 			return -1;
 	}
-	if (istate->resolve_undo) {
+	if (!strip_extensions && istate->resolve_undo) {
 		struct strbuf sb = STRBUF_INIT;
 
 		resolve_undo_write(&sb, istate->resolve_undo);
@@ -1985,7 +1988,7 @@ static int commit_locked_index(struct lock_file *lk)
 static int do_write_locked_index(struct index_state *istate, struct lock_file *lock,
 				 unsigned flags)
 {
-	int ret = do_write_index(istate, lock->fd);
+	int ret = do_write_index(istate, lock->fd, 0);
 	if (ret)
 		return ret;
 	assert((flags & (COMMIT_LOCK | CLOSE_LOCK)) !=
@@ -2009,6 +2012,52 @@ static int write_split_index(struct index_state *istate,
 	return ret;
 }
 
+static char *temporary_sharedindex;
+
+static void remove_temporary_sharedindex(void)
+{
+	if (temporary_sharedindex) {
+		unlink_or_warn(temporary_sharedindex);
+		free(temporary_sharedindex);
+		temporary_sharedindex = NULL;
+	}
+}
+
+static void remove_temporary_sharedindex_on_signal(int signo)
+{
+	remove_temporary_sharedindex();
+	sigchain_pop(signo);
+	raise(signo);
+}
+
+static int write_shared_index(struct index_state *istate)
+{
+	struct split_index *si = istate->split_index;
+	static int installed_handler;
+	int fd, ret;
+
+	temporary_sharedindex = git_pathdup("sharedindex_XXXXXX");
+	fd = xmkstemp(temporary_sharedindex);
+	if (!installed_handler) {
+		atexit(remove_temporary_sharedindex);
+		sigchain_push_common(remove_temporary_sharedindex_on_signal);
+	}
+	move_cache_to_base_index(istate);
+	ret = do_write_index(si->base, fd, 1);
+	close(fd);
+	if (ret) {
+		remove_temporary_sharedindex();
+		return ret;
+	}
+	ret = rename(temporary_sharedindex,
+		     git_path("sharedindex.%s", sha1_to_hex(si->base->sha1)));
+	free(temporary_sharedindex);
+	temporary_sharedindex = NULL;
+	if (!ret)
+		hashcpy(si->base_sha1, si->base->sha1);
+	return ret;
+}
+
 int write_locked_index(struct index_state *istate, struct lock_file *lock,
 		       unsigned flags)
 {
@@ -2020,6 +2069,12 @@ int write_locked_index(struct index_state *istate, struct lock_file *lock,
 		return do_write_locked_index(istate, lock, flags);
 	}
 
+	if (istate->cache_changed & SPLIT_INDEX_ORDERED) {
+		int ret = write_shared_index(istate);
+		if (ret)
+			return ret;
+	}
+
 	return write_split_index(istate, lock, flags);
 }
 
diff --git a/split-index.c b/split-index.c
index ee3246f..21485e2 100644
--- a/split-index.c
+++ b/split-index.c
@@ -74,6 +74,29 @@ static void mark_base_index_entries(struct index_state *base)
 		base->cache[i]->index = i + 1;
 }
 
+void move_cache_to_base_index(struct index_state *istate)
+{
+	struct split_index *si = istate->split_index;
+	int i;
+
+	/*
+	 * do not delete old si->base, its index entries may be shared
+	 * with istate->cache[]. Accept a bit of leaking here because
+	 * this code is only used by short-lived update-index.
+	 */
+	si->base = xcalloc(1, sizeof(*si->base));
+	si->base->version = istate->version;
+	/* zero timestamp disables racy test in ce_write_index() */
+	si->base->timestamp = istate->timestamp;
+	ALLOC_GROW(si->base->cache, istate->cache_nr, si->base->cache_alloc);
+	si->base->cache_nr = istate->cache_nr;
+	memcpy(si->base->cache, istate->cache,
+	       sizeof(*istate->cache) * istate->cache_nr);
+	mark_base_index_entries(si->base);
+	for (i = 0; i < si->base->cache_nr; i++)
+		si->base->cache[i]->ce_flags &= ~CE_UPDATE_IN_BASE;
+}
+
 static void mark_entry_for_delete(size_t pos, void *data)
 {
 	struct index_state *istate = data;
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 26/32] update-index --split-index: do not split if $GIT_DIR is read only
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (24 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 25/32] update-index: new options to enable/disable split index mode Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 27/32] rev-parse: add --shared-index-path to get shared index path Nguyễn Thái Ngọc Duy
                   ` (5 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

If $GIT_DIR is read only, we can't write $GIT_DIR/sharedindex. This
could happen when $GIT_INDEX_FILE is set to somehwere outside
$GIT_DIR.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 read-cache.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 8ecb959..aa848e1 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2030,14 +2030,21 @@ static void remove_temporary_sharedindex_on_signal(int signo)
 	raise(signo);
 }
 
-static int write_shared_index(struct index_state *istate)
+static int write_shared_index(struct index_state *istate,
+			      struct lock_file *lock, unsigned flags)
 {
 	struct split_index *si = istate->split_index;
 	static int installed_handler;
 	int fd, ret;
 
 	temporary_sharedindex = git_pathdup("sharedindex_XXXXXX");
-	fd = xmkstemp(temporary_sharedindex);
+	fd = mkstemp(temporary_sharedindex);
+	if (fd < 0) {
+		free(temporary_sharedindex);
+		temporary_sharedindex = NULL;
+		hashclr(si->base_sha1);
+		return do_write_locked_index(istate, lock, flags);
+	}
 	if (!installed_handler) {
 		atexit(remove_temporary_sharedindex);
 		sigchain_push_common(remove_temporary_sharedindex_on_signal);
@@ -2070,7 +2077,7 @@ int write_locked_index(struct index_state *istate, struct lock_file *lock,
 	}
 
 	if (istate->cache_changed & SPLIT_INDEX_ORDERED) {
-		int ret = write_shared_index(istate);
+		int ret = write_shared_index(istate, lock, flags);
 		if (ret)
 			return ret;
 	}
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 27/32] rev-parse: add --shared-index-path to get shared index path
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (25 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 26/32] update-index --split-index: do not split if $GIT_DIR is read only Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 28/32] read-tree: force split-index mode off on --index-output Nguyễn Thái Ngọc Duy
                   ` (4 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

Normally scripts do not have to be aware about split indexes because
all shared indexes are in $GIT_DIR. A simple "mv $tmp_index
$GIT_DIR/somewhere" is enough. Scripts that generate temporary indexes
and move them across repos must be aware about split index and copy
the shared file as well. This option enables that.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/git-rev-parse.txt |  4 ++++
 builtin/rev-parse.c             | 10 ++++++++++
 2 files changed, 14 insertions(+)

diff --git a/Documentation/git-rev-parse.txt b/Documentation/git-rev-parse.txt
index 987395d..9bd76a5 100644
--- a/Documentation/git-rev-parse.txt
+++ b/Documentation/git-rev-parse.txt
@@ -245,6 +245,10 @@ print a message to stderr and exit with nonzero status.
 --show-toplevel::
 	Show the absolute path of the top-level directory.
 
+--shared-index-path::
+	Show the path to the shared index file in split index mode, or
+	empty if not in split-index mode.
+
 Other Options
 ~~~~~~~~~~~~~
 
diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index 1a6122d..8102aaa 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -11,6 +11,7 @@
 #include "parse-options.h"
 #include "diff.h"
 #include "revision.h"
+#include "split-index.h"
 
 #define DO_REVS		1
 #define DO_NOREV	2
@@ -775,6 +776,15 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 						: "false");
 				continue;
 			}
+			if (!strcmp(arg, "--shared-index-path")) {
+				if (read_cache() < 0)
+					die(_("Could not read the index"));
+				if (the_index.split_index) {
+					const unsigned char *sha1 = the_index.split_index->base_sha1;
+					puts(git_path("sharedindex.%s", sha1_to_hex(sha1)));
+				}
+				continue;
+			}
 			if (starts_with(arg, "--since=")) {
 				show_datestring("--max-age=", arg+8);
 				continue;
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 28/32] read-tree: force split-index mode off on --index-output
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (26 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 27/32] rev-parse: add --shared-index-path to get shared index path Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 29/32] read-tree: note about dropping split-index mode or index version Nguyễn Thái Ngọc Duy
                   ` (3 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

Just a (paranoid?) safety measure..

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 read-cache.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/read-cache.c b/read-cache.c
index aa848e1..b1027f7 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2070,7 +2070,8 @@ int write_locked_index(struct index_state *istate, struct lock_file *lock,
 {
 	struct split_index *si = istate->split_index;
 
-	if (!si || (istate->cache_changed & ~EXTMASK)) {
+	if (!si || alternate_index_output ||
+	    (istate->cache_changed & ~EXTMASK)) {
 		if (si)
 			hashclr(si->base_sha1);
 		return do_write_locked_index(istate, lock, flags);
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 29/32] read-tree: note about dropping split-index mode or index version
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (27 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 28/32] read-tree: force split-index mode off on --index-output Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 30/32] read-cache: force split index mode with GIT_TEST_SPLIT_INDEX Nguyễn Thái Ngọc Duy
                   ` (2 subsequent siblings)
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/read-tree.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/builtin/read-tree.c b/builtin/read-tree.c
index 3204c62..e7e1c33 100644
--- a/builtin/read-tree.c
+++ b/builtin/read-tree.c
@@ -155,6 +155,15 @@ int cmd_read_tree(int argc, const char **argv, const char *unused_prefix)
 	if (1 < opts.merge + opts.reset + prefix_set)
 		die("Which one? -m, --reset, or --prefix?");
 
+	/*
+	 * NEEDSWORK
+	 *
+	 * The old index should be read anyway even if we're going to
+	 * destroy all index entries because we still need to preserve
+	 * certain information such as index version or split-index
+	 * mode.
+	 */
+
 	if (opts.reset || opts.merge || opts.prefix) {
 		if (read_cache_unmerged() && (opts.prefix || opts.merge))
 			die("You need to resolve your current index first");
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 30/32] read-cache: force split index mode with GIT_TEST_SPLIT_INDEX
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (28 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 29/32] read-tree: note about dropping split-index mode or index version Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 31/32] t2104: make sure split index mode is off for the version test Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 32/32] t1700: new tests for split-index mode Nguyễn Thái Ngọc Duy
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

This could be used to run the whole test suite with split
indexes. Index splitting is carried out at random. "git read-tree"
also resets the index and forces splitting at the next update.

I had a lot of headaches with the test suite, which proves it
exercises split index pretty good.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 read-cache.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/read-cache.c b/read-cache.c
index b1027f7..12335a3 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1885,8 +1885,11 @@ static int do_write_index(struct index_state *istate, int newfd,
 		}
 	}
 
-	if (!istate->version)
+	if (!istate->version) {
 		istate->version = get_index_format_default();
+		if (getenv("GIT_TEST_SPLIT_INDEX"))
+			init_split_index(istate);
+	}
 
 	/* demote version 3 to version 2 when the latter suffices */
 	if (istate->version == 3 || istate->version == 2)
@@ -2077,6 +2080,11 @@ int write_locked_index(struct index_state *istate, struct lock_file *lock,
 		return do_write_locked_index(istate, lock, flags);
 	}
 
+	if (getenv("GIT_TEST_SPLIT_INDEX")) {
+		int v = si->base_sha1[0];
+		if ((v & 15) < 6)
+			istate->cache_changed |= SPLIT_INDEX_ORDERED;
+	}
 	if (istate->cache_changed & SPLIT_INDEX_ORDERED) {
 		int ret = write_shared_index(istate, lock, flags);
 		if (ret)
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 31/32] t2104: make sure split index mode is off for the version test
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (29 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 30/32] read-cache: force split index mode with GIT_TEST_SPLIT_INDEX Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  2014-06-13 12:19 ` [PATCH 32/32] t1700: new tests for split-index mode Nguyễn Thái Ngọc Duy
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

Version tests only make sense when all entries are in the same file,
so we can see if version is downgraded to 2 if 3 is not required.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 t/t2104-update-index-skip-worktree.sh | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/t/t2104-update-index-skip-worktree.sh b/t/t2104-update-index-skip-worktree.sh
index 29c1fb1..cc830da 100755
--- a/t/t2104-update-index-skip-worktree.sh
+++ b/t/t2104-update-index-skip-worktree.sh
@@ -7,6 +7,8 @@ test_description='skip-worktree bit test'
 
 . ./test-lib.sh
 
+sane_unset GIT_TEST_SPLIT_INDEX
+
 test_set_index_version 3
 
 cat >expect.full <<EOF
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 32/32] t1700: new tests for split-index mode
  2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
                   ` (30 preceding siblings ...)
  2014-06-13 12:19 ` [PATCH 31/32] t2104: make sure split index mode is off for the version test Nguyễn Thái Ngọc Duy
@ 2014-06-13 12:19 ` Nguyễn Thái Ngọc Duy
  31 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-06-13 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 .gitignore                      |   1 +
 Makefile                        |   1 +
 cache.h                         |   2 +
 read-cache.c                    |   3 +-
 t/t1700-split-index.sh (new +x) | 194 ++++++++++++++++++++++++++++++++++++++++
 test-dump-split-index.c (new)   |  34 +++++++
 6 files changed, 233 insertions(+), 2 deletions(-)
 create mode 100755 t/t1700-split-index.sh
 create mode 100644 test-dump-split-index.c

diff --git a/.gitignore b/.gitignore
index dc600f9..70992a4 100644
--- a/.gitignore
+++ b/.gitignore
@@ -180,6 +180,7 @@
 /test-date
 /test-delta
 /test-dump-cache-tree
+/test-dump-split-index
 /test-scrap-cache-tree
 /test-genrandom
 /test-hashmap
diff --git a/Makefile b/Makefile
index 94ad3ce..028749b 100644
--- a/Makefile
+++ b/Makefile
@@ -565,6 +565,7 @@ TEST_PROGRAMS_NEED_X += test-ctype
 TEST_PROGRAMS_NEED_X += test-date
 TEST_PROGRAMS_NEED_X += test-delta
 TEST_PROGRAMS_NEED_X += test-dump-cache-tree
+TEST_PROGRAMS_NEED_X += test-dump-split-index
 TEST_PROGRAMS_NEED_X += test-genrandom
 TEST_PROGRAMS_NEED_X += test-hashmap
 TEST_PROGRAMS_NEED_X += test-index-version
diff --git a/cache.h b/cache.h
index 7523c28..c6b7770 100644
--- a/cache.h
+++ b/cache.h
@@ -473,6 +473,8 @@ extern int daemonize(void);
 struct lock_file;
 extern int read_index(struct index_state *);
 extern int read_index_preload(struct index_state *, const struct pathspec *pathspec);
+extern int do_read_index(struct index_state *istate, const char *path,
+			 int must_exist); /* for testting only! */
 extern int read_index_from(struct index_state *, const char *path);
 extern int is_index_unborn(struct index_state *);
 extern int read_index_unmerged(struct index_state *);
diff --git a/read-cache.c b/read-cache.c
index 12335a3..342fe52 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1463,8 +1463,7 @@ static struct cache_entry *create_from_disk(struct ondisk_cache_entry *ondisk,
 }
 
 /* remember to discard_cache() before reading a different cache! */
-static int do_read_index(struct index_state *istate, const char *path,
-			 int must_exist)
+int do_read_index(struct index_state *istate, const char *path, int must_exist)
 {
 	int fd, i;
 	struct stat st;
diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
new file mode 100755
index 0000000..94fb473
--- /dev/null
+++ b/t/t1700-split-index.sh
@@ -0,0 +1,194 @@
+#!/bin/sh
+
+test_description='split index mode tests'
+
+. ./test-lib.sh
+
+# We need total control of index splitting here
+sane_unset GIT_TEST_SPLIT_INDEX
+
+test_expect_success 'enable split index' '
+	git update-index --split-index &&
+	test-dump-split-index .git/index >actual &&
+	cat >expect <<EOF &&
+own 8299b0bcd1ac364e5f1d7768efb62fa2da79a339
+base 39d890139ee5356c7ef572216cebcd27aa41f9df
+replacements:
+deletions:
+EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'add one file' '
+	: >one &&
+	git update-index --add one &&
+	git ls-files --stage >ls-files.actual &&
+	cat >ls-files.expect <<EOF &&
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	one
+EOF
+	test_cmp ls-files.expect ls-files.actual &&
+
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+base 39d890139ee5356c7ef572216cebcd27aa41f9df
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	one
+replacements:
+deletions:
+EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'disable split index' '
+	git update-index --no-split-index &&
+	git ls-files --stage >ls-files.actual &&
+	cat >ls-files.expect <<EOF &&
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	one
+EOF
+	test_cmp ls-files.expect ls-files.actual &&
+
+	BASE=`test-dump-split-index .git/index | grep "^own" | sed "s/own/base/"` &&
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+not a split index
+EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'enable split index again, "one" now belongs to base index"' '
+	git update-index --split-index &&
+	git ls-files --stage >ls-files.actual &&
+	cat >ls-files.expect <<EOF &&
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	one
+EOF
+	test_cmp ls-files.expect ls-files.actual &&
+
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+$BASE
+replacements:
+deletions:
+EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'modify original file, base index untouched' '
+	echo modified >one &&
+	git update-index one &&
+	git ls-files --stage >ls-files.actual &&
+	cat >ls-files.expect <<EOF &&
+100644 2e0996000b7e9019eabcad29391bf0f5c7702f0b 0	one
+EOF
+	test_cmp ls-files.expect ls-files.actual &&
+
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	q_to_tab >expect <<EOF &&
+$BASE
+100644 2e0996000b7e9019eabcad29391bf0f5c7702f0b 0Q
+replacements: 0
+deletions:
+EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'add another file, which stays index' '
+	: >two &&
+	git update-index --add two &&
+	git ls-files --stage >ls-files.actual &&
+	cat >ls-files.expect <<EOF &&
+100644 2e0996000b7e9019eabcad29391bf0f5c7702f0b 0	one
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	two
+EOF
+	test_cmp ls-files.expect ls-files.actual &&
+
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	q_to_tab >expect <<EOF &&
+$BASE
+100644 2e0996000b7e9019eabcad29391bf0f5c7702f0b 0Q
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	two
+replacements: 0
+deletions:
+EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'remove file not in base index' '
+	git update-index --force-remove two &&
+	git ls-files --stage >ls-files.actual &&
+	cat >ls-files.expect <<EOF &&
+100644 2e0996000b7e9019eabcad29391bf0f5c7702f0b 0	one
+EOF
+	test_cmp ls-files.expect ls-files.actual &&
+
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	q_to_tab >expect <<EOF &&
+$BASE
+100644 2e0996000b7e9019eabcad29391bf0f5c7702f0b 0Q
+replacements: 0
+deletions:
+EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'remove file in base index' '
+	git update-index --force-remove one &&
+	git ls-files --stage >ls-files.actual &&
+	cat >ls-files.expect <<EOF &&
+EOF
+	test_cmp ls-files.expect ls-files.actual &&
+
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+$BASE
+replacements:
+deletions: 0
+EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'add original file back' '
+	: >one &&
+	git update-index --add one &&
+	git ls-files --stage >ls-files.actual &&
+	cat >ls-files.expect <<EOF &&
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	one
+EOF
+	test_cmp ls-files.expect ls-files.actual &&
+
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+$BASE
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	one
+replacements:
+deletions: 0
+EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'add new file' '
+	: >two &&
+	git update-index --add two &&
+	git ls-files --stage >actual &&
+	cat >expect <<EOF &&
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	one
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	two
+EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'unify index, two files remain' '
+	git update-index --no-split-index &&
+	git ls-files --stage >ls-files.actual &&
+	cat >ls-files.expect <<EOF &&
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	one
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	two
+EOF
+	test_cmp ls-files.expect ls-files.actual
+
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+not a split index
+EOF
+	test_cmp expect actual
+'
+
+test_done
diff --git a/test-dump-split-index.c b/test-dump-split-index.c
new file mode 100644
index 0000000..9cf3112
--- /dev/null
+++ b/test-dump-split-index.c
@@ -0,0 +1,34 @@
+#include "cache.h"
+#include "split-index.h"
+#include "ewah/ewok.h"
+
+static void show_bit(size_t pos, void *data)
+{
+	printf(" %d", (int)pos);
+}
+
+int main(int ac, char **av)
+{
+	struct split_index *si;
+	int i;
+
+	do_read_index(&the_index, av[1], 1);
+	printf("own %s\n", sha1_to_hex(the_index.sha1));
+	si = the_index.split_index;
+	if (!si) {
+		printf("not a split index\n");
+		return 0;
+	}
+	printf("base %s\n", sha1_to_hex(si->base_sha1));
+	for (i = 0; i < the_index.cache_nr; i++) {
+		struct cache_entry *ce = the_index.cache[i];
+		printf("%06o %s %d\t%s\n", ce->ce_mode,
+		       sha1_to_hex(ce->sha1), ce_stage(ce), ce->name);
+	}
+	printf("replacements:");
+	ewah_each_bit(si->replace_bitmap, show_bit, NULL);
+	printf("\ndeletions:");
+	ewah_each_bit(si->delete_bitmap, show_bit, NULL);
+	printf("\n");
+	return 0;
+}
-- 
1.9.1.346.ga2b5940

^ permalink raw reply related	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2014-06-13 12:23 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-06-13 12:19 [PATCH 00/32] Split index resend Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 01/32] ewah: fix constness of ewah_read_mmap Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 02/32] ewah: delete unused ewah_read_mmap_native declaration Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 03/32] sequencer: do not update/refresh index if the lock cannot be held Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 04/32] read-cache: new API write_locked_index instead of write_index/write_cache Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 05/32] read-cache: relocate and unexport commit_locked_index() Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 06/32] read-cache: store in-memory flags in the first 12 bits of ce_flags Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 07/32] read-cache: be strict about "changed" in remove_marked_cache_entries() Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 08/32] read-cache: be specific what part of the index has changed Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 09/32] update-index: " Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 10/32] resolve-undo: " Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 11/32] unpack-trees: " Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 12/32] cache-tree: mark istate->cache_changed on cache tree invalidation Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 13/32] cache-tree: mark istate->cache_changed on cache tree update Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 14/32] cache-tree: mark istate->cache_changed on prime_cache_tree() Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 15/32] entry.c: update cache_changed if refresh_cache is set in checkout_entry() Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 16/32] read-cache: save index SHA-1 after reading Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 17/32] read-cache: split-index mode Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 18/32] read-cache: mark new entries for split index Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 19/32] read-cache: save deleted entries in " Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 20/32] read-cache: mark updated entries for " Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 21/32] split-index: the writing part Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 22/32] split-index: the reading part Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 23/32] split-index: do not invalidate cache-tree at read time Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 24/32] split-index: strip pathname of on-disk replaced entries Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 25/32] update-index: new options to enable/disable split index mode Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 26/32] update-index --split-index: do not split if $GIT_DIR is read only Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 27/32] rev-parse: add --shared-index-path to get shared index path Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 28/32] read-tree: force split-index mode off on --index-output Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 29/32] read-tree: note about dropping split-index mode or index version Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 30/32] read-cache: force split index mode with GIT_TEST_SPLIT_INDEX Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 31/32] t2104: make sure split index mode is off for the version test Nguyễn Thái Ngọc Duy
2014-06-13 12:19 ` [PATCH 32/32] t1700: new tests for split-index mode Nguyễn Thái Ngọc Duy
  -- strict thread matches above, loose matches on Subject: below --
2014-04-28 10:55 [PATCH 00/32] Split index mode for very large indexes Nguyễn Thái Ngọc Duy
2014-04-28 10:55 ` [PATCH 01/32] ewah: fix constness of ewah_read_mmap Nguyễn Thái Ngọc Duy

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).