git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v3 00/21] Add configuration options for split-index
@ 2016-12-26 10:22 Christian Couder
  2016-12-26 10:22 ` [PATCH v3 01/21] config: mark an error message up for translation Christian Couder
                   ` (21 more replies)
  0 siblings, 22 replies; 56+ messages in thread
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder

Goal
~~~~

We want to make it possible to use the split-index feature
automatically by just setting a new "core.splitIndex" configuration
variable to true.

This can be valuable as split-index can help significantly speed up
`git rebase` especially along with the work to libify `git apply`
that has been merged to master
(see https://github.com/git/git/commit/81358dc238372793b1590efa149cc1581d1fbd98)
and is now in v2.11.

Design
~~~~~~

The design is similar as the previous work that introduced
"core.untrackedCache". 

The new "core.splitIndex" configuration option can be either true,
false or undefined which is the default.

When it is true, the split index is created, if it does not already
exists, when the index is read. When it is false, the split index is
removed if it exists, when the index is read. Otherwise it is left as
is.

Along with this new configuration variable, the two following options
are also introduced:

    - splitIndex.maxPercentChange

    This is to avoid having too many changes accumulating in the split
    index while in split index mode. The git-update-index
    documentation says:

	If split-index mode is already enabled and `--split-index` is
	given again, all changes in $GIT_DIR/index are pushed back to
	the shared index file.

    but it is probably better to not expect the user to think about it
    and to have a mechanism that pushes back all changes to the shared
    index file automatically when some threshold is reached.

    The default threshold is when the number of entries in the split
    index file reaches 20% of the number of entries in the shared
    index file. The new "splitIndex.maxPercentChange" config option
    lets people tweak this value.

    - splitIndex.sharedIndexExpire

    To make sure that old sharedindex files are eventually removed
    when a new one has been created, we "touch" the shared index file
    every time a split index file using the shared index file is
    either created or read from. Then we can delete shared indexes
    with an mtime older than one week (by default), when we create a
    new shared index file. The new "splitIndex.sharedIndexExpire"
    config option lets people tweak this grace period.

    This idea was suggested by Duy in:

    https://public-inbox.org/git/CACsJy8BqMFASHf5kJgUh+bd7XG98CafNydE964VJyPXz-emEvA@mail.gmail.com/

    and after some experiments, I agree that it is much simpler than
    what I thought could be done during our discussion.

    Junio also thinks that we have to do "time-based GC" in:
 
    https://public-inbox.org/git/xmqqeg33ccjj.fsf@gitster.mtv.corp.google.com/

Note that this patch series doesn't address a leak when removing a
split-index, but Duy wrote that he has a patch to fix this leak:

https://public-inbox.org/git/CACsJy8AisF2ZVs7JdnVFp5wdskkbVQQQ=DBq5UzE1MOsCfBMtQ@mail.gmail.com/

Highlevel view of the patches in the series
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Except for patch 1/21, there are 3 big steps, one for each new
configuration variable introduced.

There only a few small differences between this patch series and the
v2 patch series sent a few weeks ago. Only two commits have been
changed a little, as suggested by Duy.

    - Patch 1/21 marks a message for translation. It was a new patch
      in v2 and it can be applied separately. (The patch 1/19 in v1,
      which was a typo fix, has been merged separately into master
      already.)

Step 1 is:

    - Patches 2/21 to 5/21 introduce the functions that are reading
      the "core.splitIndex" configuration variable and tweaking the
      split index depending on its value.

    - Patch 6/21 adds a few tests for the new feature.

    - Patches 7/21 and 8/21 add some documentation for the new
      feature.

    The only change since v2 in this step is that an unnecessary
    variable initialization has been droped in 2/21, as suggested by
    Duy.

Step 2 is:

    - Patches 9/21 and 10/21 introduce the functions that are reading
      the "splitIndex.maxPercentChange" configuration variable and
      regenerating a new shared index file depending on its value.

    - Patch 11/21 adds a few tests for the new feature.

    - Patch 12/21 add some documentation for the new feature.

    There are no changes in this step since v2.

Step 3 is:

    - Patches 13/21 to 16/21 introduce the functions that are reading
      the "splitIndex.sharedIndexExpire" configuration variable and
      expiring old shared index files depending on its value.

    - Patch 17/21 adds a few tests for the new feature.

    - Patches 18/21 and 19/21 are new patches. They update the mtime
      of the shared index file when a split index based on the shared
      index file is read from. 18/21 is a refactoring to make the
      actual change in 19/21 easier.

    - Patches 20/21 and 21/21 add some documentation for the new
      feature.

    The only change since v2 in this step is that the full shared
    index path is passed to the can_delete_shared_index() function in
    16/21 as suggested by Duy.

Links
~~~~~

This patch series is also available here:

  https://github.com/chriscool/git/commits/config-split-index

The previous versions were:

  RFC: https://github.com/chriscool/git/commits/config-split-index7
  v1:  https://github.com/chriscool/git/commits/config-split-index72
  v2:  https://github.com/chriscool/git/commits/config-split-index99

On the mailing list the related patch series and discussions were:

  RFC: https://public-inbox.org/git/20160711172254.13439-1-chriscool@tuxfamily.org/
  v1:  https://public-inbox.org/git/20161023092648.12086-1-chriscool@tuxfamily.org/
  v2:  https://public-inbox.org/git/20161217145547.11748-1-chriscool@tuxfamily.org/

Christian Couder (21):
  config: mark an error message up for translation
  config: add git_config_get_split_index()
  split-index: add {add,remove}_split_index() functions
  read-cache: add and then use tweak_split_index()
  update-index: warn in case of split-index incoherency
  t1700: add tests for core.splitIndex
  Documentation/config: add information for core.splitIndex
  Documentation/git-update-index: talk about core.splitIndex config var
  config: add git_config_get_max_percent_split_change()
  read-cache: regenerate shared index if necessary
  t1700: add tests for splitIndex.maxPercentChange
  Documentation/config: add splitIndex.maxPercentChange
  sha1_file: make check_and_freshen_file() non static
  read-cache: touch shared index files when used
  config: add git_config_get_expiry() from gc.c
  read-cache: unlink old sharedindex files
  t1700: test shared index file expiration
  read-cache: refactor read_index_from()
  read-cache: use freshen_shared_index() in read_index_from()
  Documentation/config: add splitIndex.sharedIndexExpire
  Documentation/git-update-index: explain splitIndex.*

 Documentation/config.txt           |  28 +++++++
 Documentation/git-update-index.txt |  43 +++++++++--
 builtin/gc.c                       |  15 +---
 builtin/update-index.c             |  25 +++---
 cache.h                            |   8 ++
 config.c                           |  42 +++++++++-
 read-cache.c                       | 143 ++++++++++++++++++++++++++++++++--
 sha1_file.c                        |   2 +-
 split-index.c                      |  22 ++++++
 split-index.h                      |   2 +
 t/t1700-split-index.sh             | 154 +++++++++++++++++++++++++++++++++++++
 11 files changed, 442 insertions(+), 42 deletions(-)

-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply	[flat|nested] 56+ messages in thread

* [PATCH v3 01/21] config: mark an error message up for translation
  2016-12-26 10:22 [PATCH v3 00/21] Add configuration options for split-index Christian Couder
@ 2016-12-26 10:22 ` Christian Couder
  2016-12-26 10:22 ` [PATCH v3 02/21] config: add git_config_get_split_index() Christian Couder
                   ` (20 subsequent siblings)
  21 siblings, 0 replies; 56+ messages in thread
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 config.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/config.c b/config.c
index 83fdecb1bc..2eaf8ad77a 100644
--- a/config.c
+++ b/config.c
@@ -1701,8 +1701,8 @@ int git_config_get_untracked_cache(void)
 		if (!strcasecmp(v, "keep"))
 			return -1;
 
-		error("unknown core.untrackedCache value '%s'; "
-		      "using 'keep' default value", v);
+		error(_("unknown core.untrackedCache value '%s'; "
+			"using 'keep' default value"), v);
 		return -1;
 	}
 
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 02/21] config: add git_config_get_split_index()
  2016-12-26 10:22 [PATCH v3 00/21] Add configuration options for split-index Christian Couder
  2016-12-26 10:22 ` [PATCH v3 01/21] config: mark an error message up for translation Christian Couder
@ 2016-12-26 10:22 ` Christian Couder
  2016-12-26 10:22 ` [PATCH v3 03/21] split-index: add {add,remove}_split_index() functions Christian Couder
                   ` (19 subsequent siblings)
  21 siblings, 0 replies; 56+ messages in thread
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder

This new function will be used in a following commit to know
if we want to use the split index feature or not.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 cache.h  |  1 +
 config.c | 10 ++++++++++
 2 files changed, 11 insertions(+)

diff --git a/cache.h b/cache.h
index a50a61a197..c126fe475e 100644
--- a/cache.h
+++ b/cache.h
@@ -1821,6 +1821,7 @@ extern int git_config_get_bool_or_int(const char *key, int *is_bool, int *dest);
 extern int git_config_get_maybe_bool(const char *key, int *dest);
 extern int git_config_get_pathname(const char *key, const char **dest);
 extern int git_config_get_untracked_cache(void);
+extern int git_config_get_split_index(void);
 
 /*
  * This is a hack for test programs like test-dump-untracked-cache to
diff --git a/config.c b/config.c
index 2eaf8ad77a..421e8c9da6 100644
--- a/config.c
+++ b/config.c
@@ -1709,6 +1709,16 @@ int git_config_get_untracked_cache(void)
 	return -1; /* default value */
 }
 
+int git_config_get_split_index(void)
+{
+	int val;
+
+	if (!git_config_get_maybe_bool("core.splitindex", &val))
+		return val;
+
+	return -1; /* default value */
+}
+
 NORETURN
 void git_die_config_linenr(const char *key, const char *filename, int linenr)
 {
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 03/21] split-index: add {add,remove}_split_index() functions
  2016-12-26 10:22 [PATCH v3 00/21] Add configuration options for split-index Christian Couder
  2016-12-26 10:22 ` [PATCH v3 01/21] config: mark an error message up for translation Christian Couder
  2016-12-26 10:22 ` [PATCH v3 02/21] config: add git_config_get_split_index() Christian Couder
@ 2016-12-26 10:22 ` Christian Couder
  2016-12-26 10:22 ` [PATCH v3 04/21] read-cache: add and then use tweak_split_index() Christian Couder
                   ` (18 subsequent siblings)
  21 siblings, 0 replies; 56+ messages in thread
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder

Also use the functions in cmd_update_index() in
builtin/update-index.c.

These functions will be used in a following commit to tweak
our use of the split-index feature depending on the setting
of a configuration variable.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 builtin/update-index.c | 18 ++++++------------
 split-index.c          | 22 ++++++++++++++++++++++
 split-index.h          |  2 ++
 3 files changed, 30 insertions(+), 12 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index d530e89368..24fdadfa4b 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -1099,18 +1099,12 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 	}
 
 	if (split_index > 0) {
-		init_split_index(&the_index);
-		the_index.cache_changed |= SPLIT_INDEX_ORDERED;
-	} else if (!split_index && the_index.split_index) {
-		/*
-		 * can't discard_split_index(&the_index); because that
-		 * will destroy split_index->base->cache[], which may
-		 * be shared with the_index.cache[]. So yeah we're
-		 * leaking a bit here.
-		 */
-		the_index.split_index = NULL;
-		the_index.cache_changed |= SOMETHING_CHANGED;
-	}
+		if (the_index.split_index)
+			the_index.cache_changed |= SPLIT_INDEX_ORDERED;
+		else
+			add_split_index(&the_index);
+	} else if (!split_index)
+		remove_split_index(&the_index);
 
 	switch (untracked_cache) {
 	case UC_UNSPECIFIED:
diff --git a/split-index.c b/split-index.c
index 615f4cac05..f519e60f87 100644
--- a/split-index.c
+++ b/split-index.c
@@ -317,3 +317,25 @@ void replace_index_entry_in_base(struct index_state *istate,
 		istate->split_index->base->cache[new->index - 1] = new;
 	}
 }
+
+void add_split_index(struct index_state *istate)
+{
+	if (!istate->split_index) {
+		init_split_index(istate);
+		istate->cache_changed |= SPLIT_INDEX_ORDERED;
+	}
+}
+
+void remove_split_index(struct index_state *istate)
+{
+	if (istate->split_index) {
+		/*
+		 * can't discard_split_index(&the_index); because that
+		 * will destroy split_index->base->cache[], which may
+		 * be shared with the_index.cache[]. So yeah we're
+		 * leaking a bit here.
+		 */
+		istate->split_index = NULL;
+		istate->cache_changed |= SOMETHING_CHANGED;
+	}
+}
diff --git a/split-index.h b/split-index.h
index c1324f521a..df91c1bda8 100644
--- a/split-index.h
+++ b/split-index.h
@@ -31,5 +31,7 @@ void merge_base_index(struct index_state *istate);
 void prepare_to_write_split_index(struct index_state *istate);
 void finish_writing_split_index(struct index_state *istate);
 void discard_split_index(struct index_state *istate);
+void add_split_index(struct index_state *istate);
+void remove_split_index(struct index_state *istate);
 
 #endif
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 04/21] read-cache: add and then use tweak_split_index()
  2016-12-26 10:22 [PATCH v3 00/21] Add configuration options for split-index Christian Couder
                   ` (2 preceding siblings ...)
  2016-12-26 10:22 ` [PATCH v3 03/21] split-index: add {add,remove}_split_index() functions Christian Couder
@ 2016-12-26 10:22 ` Christian Couder
  2016-12-26 10:22 ` [PATCH v3 05/21] update-index: warn in case of split-index incoherency Christian Couder
                   ` (17 subsequent siblings)
  21 siblings, 0 replies; 56+ messages in thread
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder

This will make us use the split-index feature or not depending
on the value of the "core.splitIndex" config variable.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 read-cache.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index f92a912dcb..732b7fe611 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1566,10 +1566,27 @@ static void tweak_untracked_cache(struct index_state *istate)
 	}
 }
 
+static void tweak_split_index(struct index_state *istate)
+{
+	switch (git_config_get_split_index()) {
+	case -1: /* unset: do nothing */
+		break;
+	case 0: /* false */
+		remove_split_index(istate);
+		break;
+	case 1: /* true */
+		add_split_index(istate);
+		break;
+	default: /* unknown value: do nothing */
+		break;
+	}
+}
+
 static void post_read_index_from(struct index_state *istate)
 {
 	check_ce_order(istate);
 	tweak_untracked_cache(istate);
+	tweak_split_index(istate);
 }
 
 /* remember to discard_cache() before reading a different cache! */
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 05/21] update-index: warn in case of split-index incoherency
  2016-12-26 10:22 [PATCH v3 00/21] Add configuration options for split-index Christian Couder
                   ` (3 preceding siblings ...)
  2016-12-26 10:22 ` [PATCH v3 04/21] read-cache: add and then use tweak_split_index() Christian Couder
@ 2016-12-26 10:22 ` Christian Couder
  2016-12-26 10:22 ` [PATCH v3 06/21] t1700: add tests for core.splitIndex Christian Couder
                   ` (16 subsequent siblings)
  21 siblings, 0 replies; 56+ messages in thread
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder

When users are using `git update-index --(no-)split-index`, they
may expect the split-index feature to be used or not according to
the option they just used, but this might not be the case if the
new "core.splitIndex" config variable has been set. In this case
let's warn about what will happen and why.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 builtin/update-index.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 24fdadfa4b..d74d72cc7f 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -1099,12 +1099,21 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 	}
 
 	if (split_index > 0) {
+		if (git_config_get_split_index() == 0)
+			warning(_("core.splitIndex is set to false; "
+				  "remove or change it, if you really want to "
+				  "enable split index"));
 		if (the_index.split_index)
 			the_index.cache_changed |= SPLIT_INDEX_ORDERED;
 		else
 			add_split_index(&the_index);
-	} else if (!split_index)
+	} else if (!split_index) {
+		if (git_config_get_split_index() == 1)
+			warning(_("core.splitIndex is set to true; "
+				  "remove or change it, if you really want to "
+				  "disable split index"));
 		remove_split_index(&the_index);
+	}
 
 	switch (untracked_cache) {
 	case UC_UNSPECIFIED:
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 06/21] t1700: add tests for core.splitIndex
  2016-12-26 10:22 [PATCH v3 00/21] Add configuration options for split-index Christian Couder
                   ` (4 preceding siblings ...)
  2016-12-26 10:22 ` [PATCH v3 05/21] update-index: warn in case of split-index incoherency Christian Couder
@ 2016-12-26 10:22 ` Christian Couder
  2016-12-27 19:04   ` Junio C Hamano
  2016-12-26 10:22 ` [PATCH v3 07/21] Documentation/config: add information " Christian Couder
                   ` (15 subsequent siblings)
  21 siblings, 1 reply; 56+ messages in thread
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t1700-split-index.sh | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index 292a0720fc..db8c39f446 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -200,4 +200,41 @@ EOF
 	test_cmp expect actual
 '
 
+test_expect_success 'set core.splitIndex config variable to true' '
+	git config core.splitIndex true &&
+	: >three &&
+	git update-index --add three &&
+	git ls-files --stage >ls-files.actual &&
+	cat >ls-files.expect <<EOF &&
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	one
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	three
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	two
+EOF
+	test_cmp ls-files.expect ls-files.actual &&
+	BASE=$(test-dump-split-index .git/index | grep "^base") &&
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+$BASE
+replacements:
+deletions:
+EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'set core.splitIndex config variable to false' '
+	git config core.splitIndex false &&
+	git update-index --force-remove three &&
+	git ls-files --stage >ls-files.actual &&
+	cat >ls-files.expect <<EOF &&
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	one
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	two
+EOF
+	test_cmp ls-files.expect ls-files.actual &&
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+not a split index
+EOF
+	test_cmp expect actual
+'
+
 test_done
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 07/21] Documentation/config: add information for core.splitIndex
  2016-12-26 10:22 [PATCH v3 00/21] Add configuration options for split-index Christian Couder
                   ` (5 preceding siblings ...)
  2016-12-26 10:22 ` [PATCH v3 06/21] t1700: add tests for core.splitIndex Christian Couder
@ 2016-12-26 10:22 ` Christian Couder
  2016-12-26 10:22 ` [PATCH v3 08/21] Documentation/git-update-index: talk about core.splitIndex config var Christian Couder
                   ` (14 subsequent siblings)
  21 siblings, 0 replies; 56+ messages in thread
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/config.txt | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index d51182a060..221c5982c0 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -331,6 +331,10 @@ core.trustctime::
 	crawlers and some backup systems).
 	See linkgit:git-update-index[1]. True by default.
 
+core.splitIndex::
+	If true, the split-index feature of the index will be used.
+	See linkgit:git-update-index[1]. False by default.
+
 core.untrackedCache::
 	Determines what to do about the untracked cache feature of the
 	index. It will be kept, if this variable is unset or set to
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 08/21] Documentation/git-update-index: talk about core.splitIndex config var
  2016-12-26 10:22 [PATCH v3 00/21] Add configuration options for split-index Christian Couder
                   ` (6 preceding siblings ...)
  2016-12-26 10:22 ` [PATCH v3 07/21] Documentation/config: add information " Christian Couder
@ 2016-12-26 10:22 ` Christian Couder
  2016-12-27 19:07   ` Junio C Hamano
  2016-12-26 10:22 ` [PATCH v3 09/21] config: add git_config_get_max_percent_split_change() Christian Couder
                   ` (13 subsequent siblings)
  21 siblings, 1 reply; 56+ messages in thread
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/git-update-index.txt | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/Documentation/git-update-index.txt b/Documentation/git-update-index.txt
index 7386c93162..e091b2a409 100644
--- a/Documentation/git-update-index.txt
+++ b/Documentation/git-update-index.txt
@@ -171,6 +171,12 @@ may not support it yet.
 	given again, all changes in $GIT_DIR/index are pushed back to
 	the shared index file. This mode is designed for very large
 	indexes that take a significant amount of time to read or write.
++
+These options take effect whatever the value of the `core.splitIndex`
+configuration variable (see linkgit:git-config[1]). But a warning is
+emitted when the change goes against the configured value, as the
+configured value will take effect next time the index is read and this
+will remove the intended effect of the option.
 
 --untracked-cache::
 --no-untracked-cache::
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 09/21] config: add git_config_get_max_percent_split_change()
  2016-12-26 10:22 [PATCH v3 00/21] Add configuration options for split-index Christian Couder
                   ` (7 preceding siblings ...)
  2016-12-26 10:22 ` [PATCH v3 08/21] Documentation/git-update-index: talk about core.splitIndex config var Christian Couder
@ 2016-12-26 10:22 ` Christian Couder
  2016-12-26 10:22 ` [PATCH v3 10/21] read-cache: regenerate shared index if necessary Christian Couder
                   ` (12 subsequent siblings)
  21 siblings, 0 replies; 56+ messages in thread
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder

This new function will be used in a following commit to get the
value of the "splitIndex.maxPercentChange" config variable.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 cache.h  |  1 +
 config.c | 15 +++++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/cache.h b/cache.h
index c126fe475e..e15b421b6f 100644
--- a/cache.h
+++ b/cache.h
@@ -1822,6 +1822,7 @@ extern int git_config_get_maybe_bool(const char *key, int *dest);
 extern int git_config_get_pathname(const char *key, const char **dest);
 extern int git_config_get_untracked_cache(void);
 extern int git_config_get_split_index(void);
+extern int git_config_get_max_percent_split_change(void);
 
 /*
  * This is a hack for test programs like test-dump-untracked-cache to
diff --git a/config.c b/config.c
index 421e8c9da6..cf212785bb 100644
--- a/config.c
+++ b/config.c
@@ -1719,6 +1719,21 @@ int git_config_get_split_index(void)
 	return -1; /* default value */
 }
 
+int git_config_get_max_percent_split_change(void)
+{
+	int val = -1;
+
+	if (!git_config_get_int("splitindex.maxpercentchange", &val)) {
+		if (0 <= val && val <= 100)
+			return val;
+
+		return error(_("splitIndex.maxPercentChange value '%d' "
+			       "should be between 0 and 100"), val);
+	}
+
+	return -1; /* default value */
+}
+
 NORETURN
 void git_die_config_linenr(const char *key, const char *filename, int linenr)
 {
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 10/21] read-cache: regenerate shared index if necessary
  2016-12-26 10:22 [PATCH v3 00/21] Add configuration options for split-index Christian Couder
                   ` (8 preceding siblings ...)
  2016-12-26 10:22 ` [PATCH v3 09/21] config: add git_config_get_max_percent_split_change() Christian Couder
@ 2016-12-26 10:22 ` Christian Couder
  2016-12-27 19:08   ` Junio C Hamano
  2016-12-26 10:22 ` [PATCH v3 11/21] t1700: add tests for splitIndex.maxPercentChange Christian Couder
                   ` (11 subsequent siblings)
  21 siblings, 1 reply; 56+ messages in thread
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder

When writing a new split-index and there is a big number of cache
entries in the split-index compared to the shared index, it is a
good idea to regenerate the shared index.

By default when the ratio reaches 20%, we will push back all
the entries from the split-index into a new shared index file
instead of just creating a new split-index file.

The threshold can be configured using the
"splitIndex.maxPercentChange" config variable.

We need to adjust the existing tests in t1700 by setting
"splitIndex.maxPercentChange" to 100 at the beginning of t1700,
as the existing tests are assuming that the shared index is
regenerated only when `git update-index --split-index` is used.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 read-cache.c           | 32 ++++++++++++++++++++++++++++++++
 t/t1700-split-index.sh |  1 +
 2 files changed, 33 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index 732b7fe611..a1aaec5135 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2220,6 +2220,36 @@ static int write_shared_index(struct index_state *istate,
 	return ret;
 }
 
+static const int default_max_percent_split_change = 20;
+
+static int too_many_not_shared_entries(struct index_state *istate)
+{
+	int i, not_shared = 0;
+	int max_split = git_config_get_max_percent_split_change();
+
+	switch (max_split) {
+	case -1:
+		/* not or badly configured: use the default value */
+		max_split = default_max_percent_split_change;
+		break;
+	case 0:
+		return 1; /* 0% means always write a new shared index */
+	case 100:
+		return 0; /* 100% means never write a new shared index */
+	default:
+		; /* do nothing: just use the configured value */
+	}
+
+	/* Count not shared entries */
+	for (i = 0; i < istate->cache_nr; i++) {
+		struct cache_entry *ce = istate->cache[i];
+		if (!ce->index)
+			not_shared++;
+	}
+
+	return istate->cache_nr * max_split < not_shared * 100;
+}
+
 int write_locked_index(struct index_state *istate, struct lock_file *lock,
 		       unsigned flags)
 {
@@ -2237,6 +2267,8 @@ int write_locked_index(struct index_state *istate, struct lock_file *lock,
 		if ((v & 15) < 6)
 			istate->cache_changed |= SPLIT_INDEX_ORDERED;
 	}
+	if (too_many_not_shared_entries(istate))
+		istate->cache_changed |= SPLIT_INDEX_ORDERED;
 	if (istate->cache_changed & SPLIT_INDEX_ORDERED) {
 		int ret = write_shared_index(istate, lock, flags);
 		if (ret)
diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index db8c39f446..507a1dd1ad 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -8,6 +8,7 @@ test_description='split index mode tests'
 sane_unset GIT_TEST_SPLIT_INDEX
 
 test_expect_success 'enable split index' '
+	git config splitIndex.maxPercentChange 100 &&
 	git update-index --split-index &&
 	test-dump-split-index .git/index >actual &&
 	indexversion=$(test-index-version <.git/index) &&
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 11/21] t1700: add tests for splitIndex.maxPercentChange
  2016-12-26 10:22 [PATCH v3 00/21] Add configuration options for split-index Christian Couder
                   ` (9 preceding siblings ...)
  2016-12-26 10:22 ` [PATCH v3 10/21] read-cache: regenerate shared index if necessary Christian Couder
@ 2016-12-26 10:22 ` Christian Couder
  2016-12-26 10:22 ` [PATCH v3 12/21] Documentation/config: add splitIndex.maxPercentChange Christian Couder
                   ` (10 subsequent siblings)
  21 siblings, 0 replies; 56+ messages in thread
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t1700-split-index.sh | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 72 insertions(+)

diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index 507a1dd1ad..f03addf654 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -238,4 +238,76 @@ EOF
 	test_cmp expect actual
 '
 
+test_expect_success 'set core.splitIndex config variable to true' '
+	git config core.splitIndex true &&
+	: >three &&
+	git update-index --add three &&
+	BASE=$(test-dump-split-index .git/index | grep "^base") &&
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+$BASE
+replacements:
+deletions:
+EOF
+	test_cmp expect actual &&
+	: >four &&
+	git update-index --add four &&
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+$BASE
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	four
+replacements:
+deletions:
+EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'check behavior with splitIndex.maxPercentChange unset' '
+	git config --unset splitIndex.maxPercentChange &&
+	: >five &&
+	git update-index --add five &&
+	BASE=$(test-dump-split-index .git/index | grep "^base") &&
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+$BASE
+replacements:
+deletions:
+EOF
+	test_cmp expect actual &&
+	: >six &&
+	git update-index --add six &&
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+$BASE
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	six
+replacements:
+deletions:
+EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'check splitIndex.maxPercentChange set to 0' '
+	git config splitIndex.maxPercentChange 0 &&
+	: >seven &&
+	git update-index --add seven &&
+	BASE=$(test-dump-split-index .git/index | grep "^base") &&
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+$BASE
+replacements:
+deletions:
+EOF
+	test_cmp expect actual &&
+	: >eight &&
+	git update-index --add eight &&
+	BASE=$(test-dump-split-index .git/index | grep "^base") &&
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+$BASE
+replacements:
+deletions:
+EOF
+	test_cmp expect actual
+'
+
 test_done
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 12/21] Documentation/config: add splitIndex.maxPercentChange
  2016-12-26 10:22 [PATCH v3 00/21] Add configuration options for split-index Christian Couder
                   ` (10 preceding siblings ...)
  2016-12-26 10:22 ` [PATCH v3 11/21] t1700: add tests for splitIndex.maxPercentChange Christian Couder
@ 2016-12-26 10:22 ` Christian Couder
  2016-12-27 19:09   ` Junio C Hamano
  2016-12-26 10:22 ` [PATCH v3 13/21] sha1_file: make check_and_freshen_file() non static Christian Couder
                   ` (9 subsequent siblings)
  21 siblings, 1 reply; 56+ messages in thread
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/config.txt | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 221c5982c0..e0f5a77980 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2773,6 +2773,19 @@ showbranch.default::
 	The default set of branches for linkgit:git-show-branch[1].
 	See linkgit:git-show-branch[1].
 
+splitIndex.maxPercentChange::
+	When the split index feature is used, this specifies the
+	percent of entries the split index can contain compared to the
+	whole number of entries in both the split index and the shared
+	index before a new shared index is written.
+	The value should be between 0 and 100. If the value is 0 then
+	a new shared index is always written, if it is 100 a new
+	shared index is never written.
+	By default the value is 20, so a new shared index is written
+	if the number of entries in the split index would be greater
+	than 20 percent of the total number of entries.
+	See linkgit:git-update-index[1].
+
 status.relativePaths::
 	By default, linkgit:git-status[1] shows paths relative to the
 	current directory. Setting this variable to `false` shows paths
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 13/21] sha1_file: make check_and_freshen_file() non static
  2016-12-26 10:22 [PATCH v3 00/21] Add configuration options for split-index Christian Couder
                   ` (11 preceding siblings ...)
  2016-12-26 10:22 ` [PATCH v3 12/21] Documentation/config: add splitIndex.maxPercentChange Christian Couder
@ 2016-12-26 10:22 ` Christian Couder
  2016-12-27 19:09   ` Junio C Hamano
  2016-12-26 10:22 ` [PATCH v3 14/21] read-cache: touch shared index files when used Christian Couder
                   ` (8 subsequent siblings)
  21 siblings, 1 reply; 56+ messages in thread
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder

This function will be used in a commit soon, so let's make
it available globally.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 cache.h     | 3 +++
 sha1_file.c | 2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index e15b421b6f..f442f28189 100644
--- a/cache.h
+++ b/cache.h
@@ -1170,6 +1170,9 @@ extern int has_pack_index(const unsigned char *sha1);
 
 extern void assert_sha1_type(const unsigned char *sha1, enum object_type expect);
 
+/* Helper to check and "touch" a file */
+extern int check_and_freshen_file(const char *fn, int freshen);
+
 extern const signed char hexval_table[256];
 static inline unsigned int hexval(unsigned char c)
 {
diff --git a/sha1_file.c b/sha1_file.c
index 1173071859..f5303d955a 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -601,7 +601,7 @@ static int freshen_file(const char *fn)
  * either does not exist on disk, or has a stale mtime and may be subject to
  * pruning).
  */
-static int check_and_freshen_file(const char *fn, int freshen)
+int check_and_freshen_file(const char *fn, int freshen)
 {
 	if (access(fn, F_OK))
 		return 0;
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 14/21] read-cache: touch shared index files when used
  2016-12-26 10:22 [PATCH v3 00/21] Add configuration options for split-index Christian Couder
                   ` (12 preceding siblings ...)
  2016-12-26 10:22 ` [PATCH v3 13/21] sha1_file: make check_and_freshen_file() non static Christian Couder
@ 2016-12-26 10:22 ` Christian Couder
  2016-12-27 19:10   ` Junio C Hamano
  2016-12-26 10:22 ` [PATCH v3 15/21] config: add git_config_get_expiry() from gc.c Christian Couder
                   ` (7 subsequent siblings)
  21 siblings, 1 reply; 56+ messages in thread
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder

When a split-index file is created, let's update the mtime of the
shared index file that the split-index file is referencing.

In a following commit we will make shared index file expire
depending on their mtime, so updating the mtime makes sure that
the shared index file will not be deleted soon.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 read-cache.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index a1aaec5135..9fbad2044b 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1682,6 +1682,19 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
 	die("index file corrupt");
 }
 
+/*
+ * Signal that the shared index is used by updating its mtime.
+ *
+ * This way, shared index can be removed if they have not been used
+ * for some time. It's ok to fail to update the mtime if we are on a
+ * read only file system.
+ */
+void freshen_shared_index(char *base_sha1_hex)
+{
+	const char *shared_index = git_path("sharedindex.%s", base_sha1_hex);
+	check_and_freshen_file(shared_index, 1);
+}
+
 int read_index_from(struct index_state *istate, const char *path)
 {
 	struct split_index *split_index;
@@ -2273,6 +2286,8 @@ int write_locked_index(struct index_state *istate, struct lock_file *lock,
 		int ret = write_shared_index(istate, lock, flags);
 		if (ret)
 			return ret;
+	} else {
+		freshen_shared_index(sha1_to_hex(si->base_sha1));
 	}
 
 	return write_split_index(istate, lock, flags);
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 15/21] config: add git_config_get_expiry() from gc.c
  2016-12-26 10:22 [PATCH v3 00/21] Add configuration options for split-index Christian Couder
                   ` (13 preceding siblings ...)
  2016-12-26 10:22 ` [PATCH v3 14/21] read-cache: touch shared index files when used Christian Couder
@ 2016-12-26 10:22 ` Christian Couder
  2016-12-26 10:22 ` [PATCH v3 16/21] read-cache: unlink old sharedindex files Christian Couder
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 56+ messages in thread
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder

This function will be used in a following commit to get the expiration
time of the shared index files from the config, and it is generic
enough to be put in "config.c".

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 builtin/gc.c | 15 ++-------------
 cache.h      |  3 +++
 config.c     | 13 +++++++++++++
 3 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 069950d0b4..1e40d45aa2 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -62,17 +62,6 @@ static void report_pack_garbage(unsigned seen_bits, const char *path)
 		string_list_append(&pack_garbage, path);
 }
 
-static void git_config_date_string(const char *key, const char **output)
-{
-	if (git_config_get_string_const(key, output))
-		return;
-	if (strcmp(*output, "now")) {
-		unsigned long now = approxidate("now");
-		if (approxidate(*output) >= now)
-			git_die_config(key, _("Invalid %s: '%s'"), key, *output);
-	}
-}
-
 static void process_log_file(void)
 {
 	struct stat st;
@@ -111,8 +100,8 @@ static void gc_config(void)
 	git_config_get_int("gc.auto", &gc_auto_threshold);
 	git_config_get_int("gc.autopacklimit", &gc_auto_pack_limit);
 	git_config_get_bool("gc.autodetach", &detach_auto);
-	git_config_date_string("gc.pruneexpire", &prune_expire);
-	git_config_date_string("gc.worktreepruneexpire", &prune_worktrees_expire);
+	git_config_get_expiry("gc.pruneexpire", &prune_expire);
+	git_config_get_expiry("gc.worktreepruneexpire", &prune_worktrees_expire);
 	git_config(git_default_config, NULL);
 }
 
diff --git a/cache.h b/cache.h
index f442f28189..279415afbd 100644
--- a/cache.h
+++ b/cache.h
@@ -1827,6 +1827,9 @@ extern int git_config_get_untracked_cache(void);
 extern int git_config_get_split_index(void);
 extern int git_config_get_max_percent_split_change(void);
 
+/* This dies if the configured or default date is in the future */
+extern int git_config_get_expiry(const char *key, const char **output);
+
 /*
  * This is a hack for test programs like test-dump-untracked-cache to
  * ensure that they do not modify the untracked cache when reading it.
diff --git a/config.c b/config.c
index cf212785bb..d6c8f8f3ba 100644
--- a/config.c
+++ b/config.c
@@ -1685,6 +1685,19 @@ int git_config_get_pathname(const char *key, const char **dest)
 	return ret;
 }
 
+int git_config_get_expiry(const char *key, const char **output)
+{
+	int ret = git_config_get_string_const(key, output);
+	if (ret)
+		return ret;
+	if (strcmp(*output, "now")) {
+		unsigned long now = approxidate("now");
+		if (approxidate(*output) >= now)
+			git_die_config(key, _("Invalid %s: '%s'"), key, *output);
+	}
+	return ret;
+}
+
 int git_config_get_untracked_cache(void)
 {
 	int val = -1;
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 16/21] read-cache: unlink old sharedindex files
  2016-12-26 10:22 [PATCH v3 00/21] Add configuration options for split-index Christian Couder
                   ` (14 preceding siblings ...)
  2016-12-26 10:22 ` [PATCH v3 15/21] config: add git_config_get_expiry() from gc.c Christian Couder
@ 2016-12-26 10:22 ` Christian Couder
  2016-12-26 10:22 ` [PATCH v3 17/21] t1700: test shared index file expiration Christian Couder
                   ` (5 subsequent siblings)
  21 siblings, 0 replies; 56+ messages in thread
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder

Everytime split index is turned on, it creates a "sharedindex.XXXX"
file in the git directory. This change makes sure that shared index
files that haven't been used for a long time are removed when a new
shared index file is created.

The new "splitIndex.sharedIndexExpire" config variable is created
to tell the delay after which an unused shared index file can be
deleted. It defaults to "1.week.ago".

A previous commit made sure that each time a split index file is
created the mtime of the shared index file it references is updated.
This makes sure that recently used shared index file will not be
deleted.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 read-cache.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 63 insertions(+), 1 deletion(-)

diff --git a/read-cache.c b/read-cache.c
index 9fbad2044b..e62a6c742d 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2207,6 +2207,65 @@ static int write_split_index(struct index_state *istate,
 	return ret;
 }
 
+static const char *shared_index_expire = "1.week.ago";
+
+static unsigned long get_shared_index_expire_date(void)
+{
+	static unsigned long shared_index_expire_date;
+	static int shared_index_expire_date_prepared;
+
+	if (!shared_index_expire_date_prepared) {
+		git_config_get_expiry("splitindex.sharedindexexpire",
+				      &shared_index_expire);
+		shared_index_expire_date = approxidate(shared_index_expire);
+		shared_index_expire_date_prepared = 1;
+	}
+
+	return shared_index_expire_date;
+}
+
+static int can_delete_shared_index(const char *shared_index_path)
+{
+	struct stat st;
+	unsigned long expiration;
+
+	/* Check timestamp */
+	expiration = get_shared_index_expire_date();
+	if (!expiration)
+		return 0;
+	if (stat(shared_index_path, &st))
+		return error_errno(_("could not stat '%s"), shared_index_path);
+	if (st.st_mtime > expiration)
+		return 0;
+
+	return 1;
+}
+
+static int clean_shared_index_files(const char *current_hex)
+{
+	struct dirent *de;
+	DIR *dir = opendir(get_git_dir());
+
+	if (!dir)
+		return error_errno(_("unable to open git dir: %s"), get_git_dir());
+
+	while ((de = readdir(dir)) != NULL) {
+		const char *sha1_hex;
+		const char *shared_index_path;
+		if (!skip_prefix(de->d_name, "sharedindex.", &sha1_hex))
+			continue;
+		if (!strcmp(sha1_hex, current_hex))
+			continue;
+		shared_index_path = git_path("%s", de->d_name);
+		if (can_delete_shared_index(shared_index_path) > 0 &&
+		    unlink(shared_index_path))
+			error_errno(_("unable to unlink: %s"), shared_index_path);
+	}
+	closedir(dir);
+
+	return 0;
+}
+
 static struct tempfile temporary_sharedindex;
 
 static int write_shared_index(struct index_state *istate,
@@ -2228,8 +2287,11 @@ static int write_shared_index(struct index_state *istate,
 	}
 	ret = rename_tempfile(&temporary_sharedindex,
 			      git_path("sharedindex.%s", sha1_to_hex(si->base->sha1)));
-	if (!ret)
+	if (!ret) {
 		hashcpy(si->base_sha1, si->base->sha1);
+		clean_shared_index_files(sha1_to_hex(si->base->sha1));
+	}
+
 	return ret;
 }
 
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 17/21] t1700: test shared index file expiration
  2016-12-26 10:22 [PATCH v3 00/21] Add configuration options for split-index Christian Couder
                   ` (15 preceding siblings ...)
  2016-12-26 10:22 ` [PATCH v3 16/21] read-cache: unlink old sharedindex files Christian Couder
@ 2016-12-26 10:22 ` Christian Couder
  2016-12-26 10:22 ` [PATCH v3 18/21] read-cache: refactor read_index_from() Christian Couder
                   ` (4 subsequent siblings)
  21 siblings, 0 replies; 56+ messages in thread
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t1700-split-index.sh | 44 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index f03addf654..f448fc13cd 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -310,4 +310,48 @@ EOF
 	test_cmp expect actual
 '
 
+test_expect_success 'shared index files expire after 7 days by default' '
+	: >ten &&
+	git update-index --add ten &&
+	test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+	just_under_7_days_ago=$((1-7*86400)) &&
+	test-chmtime =$just_under_7_days_ago .git/sharedindex.* &&
+	: >eleven &&
+	git update-index --add eleven &&
+	test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+	just_over_7_days_ago=$((-1-7*86400)) &&
+	test-chmtime =$just_over_7_days_ago .git/sharedindex.* &&
+	: >twelve &&
+	git update-index --add twelve &&
+	test $(ls .git/sharedindex.* | wc -l) = 1
+'
+
+test_expect_success 'check splitIndex.sharedIndexExpire set to 8 days' '
+	git config splitIndex.sharedIndexExpire "8.days.ago" &&
+	test-chmtime =$just_over_7_days_ago .git/sharedindex.* &&
+	: >thirteen &&
+	git update-index --add thirteen &&
+	test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+	just_over_8_days_ago=$((-1-8*86400)) &&
+	test-chmtime =$just_over_8_days_ago .git/sharedindex.* &&
+	: >fourteen &&
+	git update-index --add fourteen &&
+	test $(ls .git/sharedindex.* | wc -l) = 1
+'
+
+test_expect_success 'check splitIndex.sharedIndexExpire set to "never" and "now"' '
+	git config splitIndex.sharedIndexExpire never &&
+	just_10_years_ago=$((-365*10*86400)) &&
+	test-chmtime =$just_10_years_ago .git/sharedindex.* &&
+	: >fifteen &&
+	git update-index --add fifteen &&
+	test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+	git config splitIndex.sharedIndexExpire now &&
+	just_1_second_ago=-1 &&
+	test-chmtime =$just_1_second_ago .git/sharedindex.* &&
+	: >sixteen &&
+	git update-index --add sixteen &&
+	test $(ls .git/sharedindex.* | wc -l) = 1
+'
+
 test_done
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 18/21] read-cache: refactor read_index_from()
  2016-12-26 10:22 [PATCH v3 00/21] Add configuration options for split-index Christian Couder
                   ` (16 preceding siblings ...)
  2016-12-26 10:22 ` [PATCH v3 17/21] t1700: test shared index file expiration Christian Couder
@ 2016-12-26 10:22 ` Christian Couder
  2016-12-26 10:22 ` [PATCH v3 19/21] read-cache: use freshen_shared_index() in read_index_from() Christian Couder
                   ` (3 subsequent siblings)
  21 siblings, 0 replies; 56+ messages in thread
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder

It looks better and is simpler to review when we don't compute
the same things many times in the function.

It will also help make the following commit simpler.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 read-cache.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index e62a6c742d..98ef1323d6 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1699,6 +1699,8 @@ int read_index_from(struct index_state *istate, const char *path)
 {
 	struct split_index *split_index;
 	int ret;
+	char *base_sha1_hex;
+	const char *base_path;
 
 	/* istate->initialized covers both .git/index and .git/sharedindex.xxx */
 	if (istate->initialized)
@@ -1716,15 +1718,15 @@ int read_index_from(struct index_state *istate, const char *path)
 		discard_index(split_index->base);
 	else
 		split_index->base = xcalloc(1, sizeof(*split_index->base));
-	ret = do_read_index(split_index->base,
-			    git_path("sharedindex.%s",
-				     sha1_to_hex(split_index->base_sha1)), 1);
+
+	base_sha1_hex = sha1_to_hex(split_index->base_sha1);
+	base_path = git_path("sharedindex.%s", base_sha1_hex);
+	ret = do_read_index(split_index->base, base_path, 1);
 	if (hashcmp(split_index->base_sha1, split_index->base->sha1))
 		die("broken index, expect %s in %s, got %s",
-		    sha1_to_hex(split_index->base_sha1),
-		    git_path("sharedindex.%s",
-			     sha1_to_hex(split_index->base_sha1)),
+		    base_sha1_hex, base_path,
 		    sha1_to_hex(split_index->base->sha1));
+
 	merge_base_index(istate);
 	post_read_index_from(istate);
 	return ret;
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 19/21] read-cache: use freshen_shared_index() in read_index_from()
  2016-12-26 10:22 [PATCH v3 00/21] Add configuration options for split-index Christian Couder
                   ` (17 preceding siblings ...)
  2016-12-26 10:22 ` [PATCH v3 18/21] read-cache: refactor read_index_from() Christian Couder
@ 2016-12-26 10:22 ` Christian Couder
  2016-12-26 10:22 ` [PATCH v3 20/21] Documentation/config: add splitIndex.sharedIndexExpire Christian Couder
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 56+ messages in thread
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder

This way a share index file will not be garbage collected if
we still read from an index it is based from.

As we need to read the current index before creating a new
one, the tests have to be adjusted, so that we don't expect
an old shared index file to be deleted right away when we
create a new one.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 read-cache.c           |  1 +
 t/t1700-split-index.sh | 14 +++++++-------
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 98ef1323d6..bf0ac1ce61 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1727,6 +1727,7 @@ int read_index_from(struct index_state *istate, const char *path)
 		    base_sha1_hex, base_path,
 		    sha1_to_hex(split_index->base->sha1));
 
+	freshen_shared_index(base_sha1_hex);
 	merge_base_index(istate);
 	post_read_index_from(istate);
 	return ret;
diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index f448fc13cd..800f84a593 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -313,17 +313,17 @@ EOF
 test_expect_success 'shared index files expire after 7 days by default' '
 	: >ten &&
 	git update-index --add ten &&
-	test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+	test $(ls .git/sharedindex.* | wc -l) -gt 2 &&
 	just_under_7_days_ago=$((1-7*86400)) &&
 	test-chmtime =$just_under_7_days_ago .git/sharedindex.* &&
 	: >eleven &&
 	git update-index --add eleven &&
-	test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+	test $(ls .git/sharedindex.* | wc -l) -gt 2 &&
 	just_over_7_days_ago=$((-1-7*86400)) &&
 	test-chmtime =$just_over_7_days_ago .git/sharedindex.* &&
 	: >twelve &&
 	git update-index --add twelve &&
-	test $(ls .git/sharedindex.* | wc -l) = 1
+	test $(ls .git/sharedindex.* | wc -l) -le 2
 '
 
 test_expect_success 'check splitIndex.sharedIndexExpire set to 8 days' '
@@ -331,12 +331,12 @@ test_expect_success 'check splitIndex.sharedIndexExpire set to 8 days' '
 	test-chmtime =$just_over_7_days_ago .git/sharedindex.* &&
 	: >thirteen &&
 	git update-index --add thirteen &&
-	test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+	test $(ls .git/sharedindex.* | wc -l) -gt 2 &&
 	just_over_8_days_ago=$((-1-8*86400)) &&
 	test-chmtime =$just_over_8_days_ago .git/sharedindex.* &&
 	: >fourteen &&
 	git update-index --add fourteen &&
-	test $(ls .git/sharedindex.* | wc -l) = 1
+	test $(ls .git/sharedindex.* | wc -l) -le 2
 '
 
 test_expect_success 'check splitIndex.sharedIndexExpire set to "never" and "now"' '
@@ -345,13 +345,13 @@ test_expect_success 'check splitIndex.sharedIndexExpire set to "never" and "now"
 	test-chmtime =$just_10_years_ago .git/sharedindex.* &&
 	: >fifteen &&
 	git update-index --add fifteen &&
-	test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+	test $(ls .git/sharedindex.* | wc -l) -gt 2 &&
 	git config splitIndex.sharedIndexExpire now &&
 	just_1_second_ago=-1 &&
 	test-chmtime =$just_1_second_ago .git/sharedindex.* &&
 	: >sixteen &&
 	git update-index --add sixteen &&
-	test $(ls .git/sharedindex.* | wc -l) = 1
+	test $(ls .git/sharedindex.* | wc -l) -le 2
 '
 
 test_done
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 20/21] Documentation/config: add splitIndex.sharedIndexExpire
  2016-12-26 10:22 [PATCH v3 00/21] Add configuration options for split-index Christian Couder
                   ` (18 preceding siblings ...)
  2016-12-26 10:22 ` [PATCH v3 19/21] read-cache: use freshen_shared_index() in read_index_from() Christian Couder
@ 2016-12-26 10:22 ` Christian Couder
  2016-12-27 19:11   ` Junio C Hamano
  2016-12-26 10:22 ` [PATCH v3 21/21] Documentation/git-update-index: explain splitIndex.* Christian Couder
  2016-12-26 10:29 ` [PATCH v3 00/21] Add configuration options for split-index Christian Couder
  21 siblings, 1 reply; 56+ messages in thread
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/config.txt | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index e0f5a77980..24e771d22e 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2786,6 +2786,17 @@ splitIndex.maxPercentChange::
 	than 20 percent of the total number of entries.
 	See linkgit:git-update-index[1].
 
+splitIndex.sharedIndexExpire::
+	When the split index feature is used, shared index files with
+	a mtime older than this time will be removed when a new shared
+	index file is created. The value "now" expires all entries
+	immediately, and "never" suppresses expiration altogether.
+	The default value is "one.week.ago".
+	Note that each time a new split-index file is created, the
+	mtime of the related shared index file is updated to the
+	current time.
+	See linkgit:git-update-index[1].
+
 status.relativePaths::
 	By default, linkgit:git-status[1] shows paths relative to the
 	current directory. Setting this variable to `false` shows paths
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 21/21] Documentation/git-update-index: explain splitIndex.*
  2016-12-26 10:22 [PATCH v3 00/21] Add configuration options for split-index Christian Couder
                   ` (19 preceding siblings ...)
  2016-12-26 10:22 ` [PATCH v3 20/21] Documentation/config: add splitIndex.sharedIndexExpire Christian Couder
@ 2016-12-26 10:22 ` Christian Couder
  2016-12-27 19:13   ` Junio C Hamano
  2016-12-26 10:29 ` [PATCH v3 00/21] Add configuration options for split-index Christian Couder
  21 siblings, 1 reply; 56+ messages in thread
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/config.txt           |  6 +++---
 Documentation/git-update-index.txt | 37 +++++++++++++++++++++++++++++--------
 2 files changed, 32 insertions(+), 11 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 24e771d22e..87a570cf88 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2792,9 +2792,9 @@ splitIndex.sharedIndexExpire::
 	index file is created. The value "now" expires all entries
 	immediately, and "never" suppresses expiration altogether.
 	The default value is "one.week.ago".
-	Note that each time a new split-index file is created, the
-	mtime of the related shared index file is updated to the
-	current time.
+	Note that each time a split index based on a shared index file
+	is either created or read from, the mtime of the shared index
+	file is updated to the current time.
 	See linkgit:git-update-index[1].
 
 status.relativePaths::
diff --git a/Documentation/git-update-index.txt b/Documentation/git-update-index.txt
index e091b2a409..46c953b2f2 100644
--- a/Documentation/git-update-index.txt
+++ b/Documentation/git-update-index.txt
@@ -163,14 +163,10 @@ may not support it yet.
 
 --split-index::
 --no-split-index::
-	Enable or disable split index mode. If enabled, the index is
-	split into two files, $GIT_DIR/index and $GIT_DIR/sharedindex.<SHA-1>.
-	Changes are accumulated in $GIT_DIR/index while the shared
-	index file contains all index entries stays unchanged. If
-	split-index mode is already enabled and `--split-index` is
-	given again, all changes in $GIT_DIR/index are pushed back to
-	the shared index file. This mode is designed for very large
-	indexes that take a significant amount of time to read or write.
+	Enable or disable split index mode. If split-index mode is
+	already enabled and `--split-index` is given again, all
+	changes in $GIT_DIR/index are pushed back to the shared index
+	file.
 +
 These options take effect whatever the value of the `core.splitIndex`
 configuration variable (see linkgit:git-config[1]). But a warning is
@@ -394,6 +390,31 @@ Although this bit looks similar to assume-unchanged bit, its goal is
 different from assume-unchanged bit's. Skip-worktree also takes
 precedence over assume-unchanged bit when both are set.
 
+Split index
+-----------
+
+This mode is designed for very large indexes that take a significant
+amount of time to read or write.
+
+In this mode, the index is split into two files, $GIT_DIR/index and
+$GIT_DIR/sharedindex.<SHA-1>. Changes are accumulated in
+$GIT_DIR/index, the split index, while the shared index file contains
+all index entries and stays unchanged.
+
+All changes in the split index are pushed back to the shared index
+file when the number of entries in the split index reaches a level
+specified by the splitIndex.maxPercentChange config variable (see
+linkgit:git-config[1]).
+
+Each time a new shared index file is created, the old shared index
+files are deleted if their mtime is older than what is specified by
+the splitIndex.sharedIndexExpire config variable (see
+linkgit:git-config[1]).
+
+To avoid deleting a shared index file that is still used, its mtime is
+updated to the current time everytime a new split index based on the
+shared index file is either created or read from.
+
 Untracked cache
 ---------------
 
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 00/21] Add configuration options for split-index
  2016-12-26 10:22 [PATCH v3 00/21] Add configuration options for split-index Christian Couder
                   ` (20 preceding siblings ...)
  2016-12-26 10:22 ` [PATCH v3 21/21] Documentation/git-update-index: explain splitIndex.* Christian Couder
@ 2016-12-26 10:29 ` Christian Couder
  21 siblings, 0 replies; 56+ messages in thread
From: Christian Couder @ 2016-12-26 10:29 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder

On Mon, Dec 26, 2016 at 11:22 AM, Christian Couder
<christian.couder@gmail.com> wrote:
>
> Highlevel view of the patches in the series
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> Except for patch 1/21, there are 3 big steps, one for each new
> configuration variable introduced.
>
> There only a few small differences between this patch series and the
> v2 patch series sent a few weeks ago. Only two commits have been
> changed a little, as suggested by Duy.

Here is the diff compared to v2:

diff --git a/config.c b/config.c
index 5c52cefd78..d6c8f8f3ba 100644
--- a/config.c
+++ b/config.c
@@ -1724,7 +1724,7 @@ int git_config_get_untracked_cache(void)

 int git_config_get_split_index(void)
 {
-       int val = -1;
+       int val;

        if (!git_config_get_maybe_bool("core.splitindex", &val))
                return val;
diff --git a/read-cache.c b/read-cache.c
index 772343ab25..bf0ac1ce61 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2227,18 +2227,17 @@ static unsigned long get_shared_index_expire_date(void)
        return shared_index_expire_date;
 }

-static int can_delete_shared_index(const char *shared_sha1_hex)
+static int can_delete_shared_index(const char *shared_index_path)
 {
        struct stat st;
        unsigned long expiration;
-       const char *shared_index = git_path("sharedindex.%s", shared_sha1_hex);

        /* Check timestamp */
        expiration = get_shared_index_expire_date();
        if (!expiration)
                return 0;
-       if (stat(shared_index, &st))
-               return error_errno(_("could not stat '%s"), shared_index);
+       if (stat(shared_index_path, &st))
+               return error_errno(_("could not stat '%s"), shared_index_path);
        if (st.st_mtime > expiration)
                return 0;

@@ -2255,13 +2254,15 @@ static int clean_shared_index_files(const char
*current_hex)

        while ((de = readdir(dir)) != NULL) {
                const char *sha1_hex;
+               const char *shared_index_path;
                if (!skip_prefix(de->d_name, "sharedindex.", &sha1_hex))
                        continue;
                if (!strcmp(sha1_hex, current_hex))
                        continue;
-               if (can_delete_shared_index(sha1_hex) > 0 &&
-                   unlink(git_path("%s", de->d_name)))
-                       error_errno(_("unable to unlink: %s"),
git_path("%s", de->d_name));
+               shared_index_path = git_path("%s", de->d_name);
+               if (can_delete_shared_index(shared_index_path) > 0 &&
+                   unlink(shared_index_path))
+                       error_errno(_("unable to unlink: %s"),
shared_index_path);
        }
        closedir(dir);

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 06/21] t1700: add tests for core.splitIndex
  2016-12-26 10:22 ` [PATCH v3 06/21] t1700: add tests for core.splitIndex Christian Couder
@ 2016-12-27 19:04   ` Junio C Hamano
  2017-01-02  8:29     ` Christian Couder
  0 siblings, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2016-12-27 19:04 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Nguyen Thai Ngoc Duy, Ævar Arnfjörð Bjarmason,
	Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

> +test_expect_success 'set core.splitIndex config variable to true' '
> +	git config core.splitIndex true &&
> +	: >three &&
> +	git update-index --add three &&
> +	git ls-files --stage >ls-files.actual &&
> +	cat >ls-files.expect <<EOF &&
> +100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	one
> +100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	three
> +100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	two
> +EOF
> +	test_cmp ls-files.expect ls-files.actual &&

It does not add much value to follow the "existing" outdated style
like this when you are only adding new tests.  Write these like

        cat >ls-files.expect <<-\EOF &&
	100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	one
	100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	three
	100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	two
	EOF

which would give incentive to others (or yourself) to update the
style of the existing mess ;-).

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 08/21] Documentation/git-update-index: talk about core.splitIndex config var
  2016-12-26 10:22 ` [PATCH v3 08/21] Documentation/git-update-index: talk about core.splitIndex config var Christian Couder
@ 2016-12-27 19:07   ` Junio C Hamano
  2017-01-02  9:39     ` Christian Couder
  0 siblings, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2016-12-27 19:07 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Nguyen Thai Ngoc Duy, Ævar Arnfjörð Bjarmason,
	Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

> Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
> ---
>  Documentation/git-update-index.txt | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/Documentation/git-update-index.txt b/Documentation/git-update-index.txt
> index 7386c93162..e091b2a409 100644
> --- a/Documentation/git-update-index.txt
> +++ b/Documentation/git-update-index.txt
> @@ -171,6 +171,12 @@ may not support it yet.
>  	given again, all changes in $GIT_DIR/index are pushed back to
>  	the shared index file. This mode is designed for very large
>  	indexes that take a significant amount of time to read or write.
> ++
> +These options take effect whatever the value of the `core.splitIndex`
> +configuration variable (see linkgit:git-config[1]).

Doesn't the "whatever..." clause in this sentence lack its verb
(presumably, "is", right after "variable")?

> +emitted when the change goes against the configured value, as the
> +configured value will take effect next time the index is read and this
> +will remove the intended effect of the option.

It feels somewhat strange to see a warning in this case. 

If the user configures the system to usually do X, and issues a
command that explicitly does Y (or "not-X"), we do not warn for the
command invocation if it is a single-shot operation.  That's the
usual "command line overrides configuration" an end-user expects.

I think you made this deviate from the usual end-user expectation
because it is unpredictable when the configuration kicks in the next
time to undo the effect of this command line request.  And I agree
that it is a very un-nice thing to do to the users if we did not
warn here, if we are to keep that unpredictableness.

But stepping back a bit, we know the user's intention is that with
"--split-index" the user wants to use the split index, even though
the user may have configuration not to use the split index.  Don't
we want to honor that intention?  In order to honor that intention
without getting confused by the configured variable to tell us not
to split, another way is to destroy that unpredictableness.  For
that, shouldn't we automatically remove the configuration that says
"do not split" (by perhaps set it to "do split")?  Doing so still
may need a warning that says "we disabled your previously configured
setting while at it", but it would be much better than a warning
message that essentially says "we do it once, but we will disobey
you at an unpredictable time", I would think.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 10/21] read-cache: regenerate shared index if necessary
  2016-12-26 10:22 ` [PATCH v3 10/21] read-cache: regenerate shared index if necessary Christian Couder
@ 2016-12-27 19:08   ` Junio C Hamano
  2017-01-02 11:23     ` Christian Couder
  0 siblings, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2016-12-27 19:08 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Nguyen Thai Ngoc Duy, Ævar Arnfjörð Bjarmason,
	Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

> +	case 0:
> +		return 1; /* 0% means always write a new shared index */
> +	case 100:
> +		return 0; /* 100% means never write a new shared index */
> +	default:
> +		; /* do nothing: just use the configured value */
> +	}

Just like you did in 04/21, write "break" to avoid mistakes made in
the future, i.e.

	default:
		break; /* just use the configured value */

> +
> +	/* Count not shared entries */
> +	for (i = 0; i < istate->cache_nr; i++) {
> +		struct cache_entry *ce = istate->cache[i];
> +		if (!ce->index)
> +			not_shared++;
> +	}
> +
> +	return istate->cache_nr * max_split < not_shared * 100;

On a 32-bit arch with 2G int and more than 20 million paths in the
index, multiplying by max_split that can come close to 100 can
theoretically cause integer overflow, but in practice it probably
does not matter.  Or does it?

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 12/21] Documentation/config: add splitIndex.maxPercentChange
  2016-12-26 10:22 ` [PATCH v3 12/21] Documentation/config: add splitIndex.maxPercentChange Christian Couder
@ 2016-12-27 19:09   ` Junio C Hamano
  2017-01-02 13:50     ` Christian Couder
  0 siblings, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2016-12-27 19:09 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Nguyen Thai Ngoc Duy, Ævar Arnfjörð Bjarmason,
	Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

> Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
> ---
>  Documentation/config.txt | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 221c5982c0..e0f5a77980 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -2773,6 +2773,19 @@ showbranch.default::
>  	The default set of branches for linkgit:git-show-branch[1].
>  	See linkgit:git-show-branch[1].
>  
> +splitIndex.maxPercentChange::
> +	When the split index feature is used, this specifies the
> +	percent of entries the split index can contain compared to the
> +	whole number of entries in both the split index and the shared

s/whole/total/ to match the last sentence of this section, perhaps?
"The number of all entries" would also be OK, but "the whole number
of entries" sounds a bit strange.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 13/21] sha1_file: make check_and_freshen_file() non static
  2016-12-26 10:22 ` [PATCH v3 13/21] sha1_file: make check_and_freshen_file() non static Christian Couder
@ 2016-12-27 19:09   ` Junio C Hamano
  0 siblings, 0 replies; 56+ messages in thread
From: Junio C Hamano @ 2016-12-27 19:09 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Nguyen Thai Ngoc Duy, Ævar Arnfjörð Bjarmason,
	Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

> This function will be used in a commit soon, so let's make
> it available globally.

See comment on 14/21; I am not convinced that this is the function
you would want to borrow.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 14/21] read-cache: touch shared index files when used
  2016-12-26 10:22 ` [PATCH v3 14/21] read-cache: touch shared index files when used Christian Couder
@ 2016-12-27 19:10   ` Junio C Hamano
  2017-01-02 14:09     ` Christian Couder
  0 siblings, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2016-12-27 19:10 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Nguyen Thai Ngoc Duy, Ævar Arnfjörð Bjarmason,
	Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

> +/*
> + * Signal that the shared index is used by updating its mtime.
> + *
> + * This way, shared index can be removed if they have not been used
> + * for some time. It's ok to fail to update the mtime if we are on a
> + * read only file system.
> + */
> +void freshen_shared_index(char *base_sha1_hex)
> +{
> +	const char *shared_index = git_path("sharedindex.%s", base_sha1_hex);
> +	check_and_freshen_file(shared_index, 1);

What happens when this call fails?  The function returns 0 if the
file did not even exist.  It also returns 0 if you cannot update its
timestamp.

Shouldn't the series be exposing freshen_file() instead _and_ taking
its error return value seriously?

> +}
> +
>  int read_index_from(struct index_state *istate, const char *path)
>  {
>  	struct split_index *split_index;
> @@ -2273,6 +2286,8 @@ int write_locked_index(struct index_state *istate, struct lock_file *lock,
>  		int ret = write_shared_index(istate, lock, flags);
>  		if (ret)
>  			return ret;
> +	} else {
> +		freshen_shared_index(sha1_to_hex(si->base_sha1));
>  	}
>  
>  	return write_split_index(istate, lock, flags);

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 20/21] Documentation/config: add splitIndex.sharedIndexExpire
  2016-12-26 10:22 ` [PATCH v3 20/21] Documentation/config: add splitIndex.sharedIndexExpire Christian Couder
@ 2016-12-27 19:11   ` Junio C Hamano
  2017-01-02 14:33     ` Christian Couder
  0 siblings, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2016-12-27 19:11 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Nguyen Thai Ngoc Duy, Ævar Arnfjörð Bjarmason,
	Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

> Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
> ---
>  Documentation/config.txt | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index e0f5a77980..24e771d22e 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -2786,6 +2786,17 @@ splitIndex.maxPercentChange::
>  	than 20 percent of the total number of entries.
>  	See linkgit:git-update-index[1].
>  
> +splitIndex.sharedIndexExpire::
> +	When the split index feature is used, shared index files with
> +	a mtime older than this time will be removed when a new shared

As end-user facing documentation, it would be much better if we can
rephrase it for those who do not know what a 'mtime' is, and it
would be even better if we can do so without losing precision.

I think "shared index files that were not modified since the time
this variable specifies will be removed" would be understandable and
correct enough?

> +	index file is created. The value "now" expires all entries
> +	immediately, and "never" suppresses expiration altogether.
> +	The default value is "one.week.ago".
> +	Note that each time a new split-index file is created, the
> +	mtime of the related shared index file is updated to the
> +	current time.

To match the above suggestion, "Note that a shared index file is
considered modified (for the purpose of expiration) each time a new
split-index file is created based on it."?

> +	See linkgit:git-update-index[1].
> +
>  status.relativePaths::
>  	By default, linkgit:git-status[1] shows paths relative to the
>  	current directory. Setting this variable to `false` shows paths

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 21/21] Documentation/git-update-index: explain splitIndex.*
  2016-12-26 10:22 ` [PATCH v3 21/21] Documentation/git-update-index: explain splitIndex.* Christian Couder
@ 2016-12-27 19:13   ` Junio C Hamano
  2017-01-02 16:04     ` Christian Couder
  0 siblings, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2016-12-27 19:13 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Nguyen Thai Ngoc Duy, Ævar Arnfjörð Bjarmason,
	Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

>  --split-index::
>  --no-split-index::
> -	Enable or disable split index mode. If enabled, the index is
> -	split into two files, $GIT_DIR/index and $GIT_DIR/sharedindex.<SHA-1>.
> -	Changes are accumulated in $GIT_DIR/index while the shared
> -	index file contains all index entries stays unchanged. If
> -	split-index mode is already enabled and `--split-index` is
> -	given again, all changes in $GIT_DIR/index are pushed back to
> -	the shared index file. This mode is designed for very large
> -	indexes that take a significant amount of time to read or write.
> +	Enable or disable split index mode. If split-index mode is
> +	already enabled and `--split-index` is given again, all
> +	changes in $GIT_DIR/index are pushed back to the shared index
> +	file.

In the world after this series that introduced the percentage-based
auto consolidation, it smells strange, or even illogical, that index
is un-split after doing this:

    $ git update-index --split-index
    $ git update-index --split-index

Before this series, it may have been a quick and dirty way to force
consolidating the split index without totally disabling the feature,
i.e. it would have looked more like

    $ git update-index --split-index
    ... work work work to accumulate more modifications
    ... consolidate into one --- there was no other way than
    ... disabling it temporarily
    $ git update-index --no-split-index
    ... but the user likes the feature so re-enable it.
    $ git update-index --split-index

which I guess is where this strange behaviour comes from.

It may be something we need to fix to unconfuse the end-users after
this series lands.  Even though "first disable and then re-enable"
takes two commands (as opposed to one), it is far more logical.  And
the percentage-based auto consolidation feature is meant to reduce
the need for manual consolidation, so it probably makes sense to
remove this illogical feature.

> @@ -394,6 +390,31 @@ Although this bit looks similar to assume-unchanged bit, its goal is
>  different from assume-unchanged bit's. Skip-worktree also takes
>  precedence over assume-unchanged bit when both are set.
>  
> +Split index
> +-----------
> +
> +This mode is designed for very large indexes that take a significant
> +amount of time to read or write.

This is not a new problem, but it probably is incorrect to say "to
read or write".  It saves time by not rewriting the whole thing but
instead write out only the updated bits.  You'd still read the whole
thing while populating the in-core index from the disk, and if
anything, you'd probably spend _more_ cycles because you'd essentially
be reading the base and then reading the delta to apply on top.

> +To avoid deleting a shared index file that is still used, its mtime is
> +updated to the current time everytime a new split index based on the
> +shared index file is either created or read from.


The same comment on the mention of "mtime" in another patch applies
here as well.


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 06/21] t1700: add tests for core.splitIndex
  2016-12-27 19:04   ` Junio C Hamano
@ 2017-01-02  8:29     ` Christian Couder
  2017-01-03 12:58       ` Junio C Hamano
  0 siblings, 1 reply; 56+ messages in thread
From: Christian Couder @ 2017-01-02  8:29 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Nguyen Thai Ngoc Duy, Ævar Arnfjörð Bjarmason,
	Christian Couder

On Tue, Dec 27, 2016 at 8:04 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Christian Couder <christian.couder@gmail.com> writes:
>
>> +test_expect_success 'set core.splitIndex config variable to true' '
>> +     git config core.splitIndex true &&
>> +     : >three &&
>> +     git update-index --add three &&
>> +     git ls-files --stage >ls-files.actual &&
>> +     cat >ls-files.expect <<EOF &&
>> +100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0    one
>> +100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0    three
>> +100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0    two
>> +EOF
>> +     test_cmp ls-files.expect ls-files.actual &&
>
> It does not add much value to follow the "existing" outdated style
> like this when you are only adding new tests.  Write these like
>
>         cat >ls-files.expect <<-\EOF &&
>         100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0       one
>         100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0       three
>         100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0       two
>         EOF
>
> which would give incentive to others (or yourself) to update the
> style of the existing mess ;-).

Ok, I will add a patch to update the style of the existing tests at
the beginning of the series and then use the same new style in the
tests I add in later patches.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 08/21] Documentation/git-update-index: talk about core.splitIndex config var
  2016-12-27 19:07   ` Junio C Hamano
@ 2017-01-02  9:39     ` Christian Couder
  2017-01-07 21:38       ` Junio C Hamano
  0 siblings, 1 reply; 56+ messages in thread
From: Christian Couder @ 2017-01-02  9:39 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Nguyen Thai Ngoc Duy, Ævar Arnfjörð Bjarmason,
	Christian Couder

On Tue, Dec 27, 2016 at 8:07 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Christian Couder <christian.couder@gmail.com> writes:
>
>> Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
>> ---
>>  Documentation/git-update-index.txt | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/Documentation/git-update-index.txt b/Documentation/git-update-index.txt
>> index 7386c93162..e091b2a409 100644
>> --- a/Documentation/git-update-index.txt
>> +++ b/Documentation/git-update-index.txt
>> @@ -171,6 +171,12 @@ may not support it yet.
>>       given again, all changes in $GIT_DIR/index are pushed back to
>>       the shared index file. This mode is designed for very large
>>       indexes that take a significant amount of time to read or write.
>> ++
>> +These options take effect whatever the value of the `core.splitIndex`
>> +configuration variable (see linkgit:git-config[1]).
>
> Doesn't the "whatever..." clause in this sentence lack its verb
> (presumably, "is", right after "variable")?

I think that it is ok to have no verb here. My preferred online
dictionary (http://www.wordreference.com/enfr/whatever) gives "We'll
try to free the hostage whatever the cost." as an example with no
verb, though I am not a native speaker.

Also note that I mostly copied what I had already written for
--(no-)untracked-cache:

--untracked-cache::
--no-untracked-cache::
    Enable or disable untracked cache feature. Please use
    `--test-untracked-cache` before enabling it.
+
These options take effect whatever the value of the `core.untrackedCache`
configuration variable (see linkgit:git-config[1]). But a warning is
emitted when the change goes against the configured value, as the
configured value will take effect next time the index is read and this
will remove the intended effect of the option.

>> +emitted when the change goes against the configured value, as the
>> +configured value will take effect next time the index is read and this
>> +will remove the intended effect of the option.
>
> It feels somewhat strange to see a warning in this case.

We already discussed the warning issue for --(no-)untracked-cache, and
I am just following the conclusions of that previous discussion about
--(no-)untracked-cache. See the end of this message where you
suggested the warning:

https://public-inbox.org/git/xmqqlh8cg9y2.fsf@gitster.mtv.corp.google.com/

> If the user configures the system to usually do X, and issues a
> command that explicitly does Y (or "not-X"), we do not warn for the
> command invocation if it is a single-shot operation.  That's the
> usual "command line overrides configuration" an end-user expects.
>
> I think you made this deviate from the usual end-user expectation
> because it is unpredictable when the configuration kicks in the next
> time to undo the effect of this command line request.  And I agree
> that it is a very un-nice thing to do to the users if we did not
> warn here, if we are to keep that unpredictableness.

I also think it would be strange and user unfriendly for
--untracked-cache and --split-index to behave differently, and I am
reluctant at this point to rework the way it works for
--untracked-cache.

> But stepping back a bit, we know the user's intention is that with
> "--split-index" the user wants to use the split index, even though
> the user may have configuration not to use the split index.  Don't
> we want to honor that intention?  In order to honor that intention
> without getting confused by the configured variable to tell us not
> to split, another way is to destroy that unpredictableness.  For
> that, shouldn't we automatically remove the configuration that says
> "do not split" (by perhaps set it to "do split")?  Doing so still
> may need a warning that says "we disabled your previously configured
> setting while at it", but it would be much better than a warning
> message that essentially says "we do it once, but we will disobey
> you at an unpredictable time", I would think.

I wanted to do that for --untracked-cache around one year ago but in
the following message:

https://public-inbox.org/git/xmqqsi3ckadi.fsf@gitster.mtv.corp.google.com/

you wrote the following:

"Why is it a good idea to allow an explicit operation from the
command line to muck with the configuration?  If the user wants to
change the configuration permanently, "git config" has been there
for the purpose for all the configuration variables to do so for
almost forever.  Why is it a good idea to make this variable so
special that the user does not have to use "git config" but disable
or enable it in some other way?"

It feels strange that when I do things one way, you suggest another
way, and the next time in a similar situation when I do things the way
you suggested previously, then you suggest the way I did it initially
the first time...

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 10/21] read-cache: regenerate shared index if necessary
  2016-12-27 19:08   ` Junio C Hamano
@ 2017-01-02 11:23     ` Christian Couder
  0 siblings, 0 replies; 56+ messages in thread
From: Christian Couder @ 2017-01-02 11:23 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Nguyen Thai Ngoc Duy, Ævar Arnfjörð Bjarmason,
	Christian Couder

On Tue, Dec 27, 2016 at 8:08 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Christian Couder <christian.couder@gmail.com> writes:
>
>> +     case 0:
>> +             return 1; /* 0% means always write a new shared index */
>> +     case 100:
>> +             return 0; /* 100% means never write a new shared index */
>> +     default:
>> +             ; /* do nothing: just use the configured value */
>> +     }
>
> Just like you did in 04/21, write "break" to avoid mistakes made in
> the future, i.e.
>
>         default:
>                 break; /* just use the configured value */

Ok, I will do that.

>> +
>> +     /* Count not shared entries */
>> +     for (i = 0; i < istate->cache_nr; i++) {
>> +             struct cache_entry *ce = istate->cache[i];
>> +             if (!ce->index)
>> +                     not_shared++;
>> +     }
>> +
>> +     return istate->cache_nr * max_split < not_shared * 100;
>
> On a 32-bit arch with 2G int and more than 20 million paths in the
> index, multiplying by max_split that can come close to 100 can
> theoretically cause integer overflow, but in practice it probably
> does not matter.  Or does it?

From a cursory look a "struct cache_entry" takes at least 80 bytes
without counting the "char name[FLEX_ARRAY]" on a 32 bit machine, so I
don't think it would be a good idea to work on a repo with 20 million
paths on a 32 bit machine, but maybe theoretically it could be a
problem.

To be safe I think I will use:

return (int64_t)istate->cache_nr * max_split < (int64_t)not_shared * 100;

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 12/21] Documentation/config: add splitIndex.maxPercentChange
  2016-12-27 19:09   ` Junio C Hamano
@ 2017-01-02 13:50     ` Christian Couder
  0 siblings, 0 replies; 56+ messages in thread
From: Christian Couder @ 2017-01-02 13:50 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Nguyen Thai Ngoc Duy, Ævar Arnfjörð Bjarmason,
	Christian Couder

On Tue, Dec 27, 2016 at 8:09 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Christian Couder <christian.couder@gmail.com> writes:
>
>> Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
>> ---
>>  Documentation/config.txt | 13 +++++++++++++
>>  1 file changed, 13 insertions(+)
>>
>> diff --git a/Documentation/config.txt b/Documentation/config.txt
>> index 221c5982c0..e0f5a77980 100644
>> --- a/Documentation/config.txt
>> +++ b/Documentation/config.txt
>> @@ -2773,6 +2773,19 @@ showbranch.default::
>>       The default set of branches for linkgit:git-show-branch[1].
>>       See linkgit:git-show-branch[1].
>>
>> +splitIndex.maxPercentChange::
>> +     When the split index feature is used, this specifies the
>> +     percent of entries the split index can contain compared to the
>> +     whole number of entries in both the split index and the shared
>
> s/whole/total/ to match the last sentence of this section, perhaps?
> "The number of all entries" would also be OK, but "the whole number
> of entries" sounds a bit strange.

Yeah "total" is better than "whole" here. I will use "total".

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 14/21] read-cache: touch shared index files when used
  2016-12-27 19:10   ` Junio C Hamano
@ 2017-01-02 14:09     ` Christian Couder
  2017-01-07 21:46       ` Junio C Hamano
  0 siblings, 1 reply; 56+ messages in thread
From: Christian Couder @ 2017-01-02 14:09 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Nguyen Thai Ngoc Duy, Ævar Arnfjörð Bjarmason,
	Christian Couder

On Tue, Dec 27, 2016 at 8:10 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Christian Couder <christian.couder@gmail.com> writes:
>
>> +/*
>> + * Signal that the shared index is used by updating its mtime.
>> + *
>> + * This way, shared index can be removed if they have not been used
>> + * for some time. It's ok to fail to update the mtime if we are on a
>> + * read only file system.
>> + */
>> +void freshen_shared_index(char *base_sha1_hex)
>> +{
>> +     const char *shared_index = git_path("sharedindex.%s", base_sha1_hex);
>> +     check_and_freshen_file(shared_index, 1);
>
> What happens when this call fails?  The function returns 0 if the
> file did not even exist.  It also returns 0 if you cannot update its
> timestamp.

Yeah and I don't think it's a problem in either case.
If we cannot update its timestamp, it's not a problem, as we could be
on a read-only file system, and you said in a previous iteration that
we should not even warn in this case.
If the file does not exist, it could be because it has just been
deleted for a good reason, and anyway, if it is a problem that the
shared index file has been deleted, it is better addressed when we
will actually need the shared index file to read something into from
it, rather than just to update its mtime.

> Shouldn't the series be exposing freshen_file() instead _and_ taking
> its error return value seriously?

So what should we do if freshen_file() returns 0 which means that the
freshening failed?

>> +}

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 20/21] Documentation/config: add splitIndex.sharedIndexExpire
  2016-12-27 19:11   ` Junio C Hamano
@ 2017-01-02 14:33     ` Christian Couder
  0 siblings, 0 replies; 56+ messages in thread
From: Christian Couder @ 2017-01-02 14:33 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Nguyen Thai Ngoc Duy, Ævar Arnfjörð Bjarmason,
	Christian Couder

On Tue, Dec 27, 2016 at 8:11 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Christian Couder <christian.couder@gmail.com> writes:
>
>> Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
>> ---
>>  Documentation/config.txt | 11 +++++++++++
>>  1 file changed, 11 insertions(+)
>>
>> diff --git a/Documentation/config.txt b/Documentation/config.txt
>> index e0f5a77980..24e771d22e 100644
>> --- a/Documentation/config.txt
>> +++ b/Documentation/config.txt
>> @@ -2786,6 +2786,17 @@ splitIndex.maxPercentChange::
>>       than 20 percent of the total number of entries.
>>       See linkgit:git-update-index[1].
>>
>> +splitIndex.sharedIndexExpire::
>> +     When the split index feature is used, shared index files with
>> +     a mtime older than this time will be removed when a new shared
>
> As end-user facing documentation, it would be much better if we can
> rephrase it for those who do not know what a 'mtime' is, and it
> would be even better if we can do so without losing precision.
>
> I think "shared index files that were not modified since the time
> this variable specifies will be removed" would be understandable and
> correct enough?

Yeah, I agree it is better for end users. I will use what you suggest.

>> +     index file is created. The value "now" expires all entries
>> +     immediately, and "never" suppresses expiration altogether.
>> +     The default value is "one.week.ago".
>> +     Note that each time a new split-index file is created, the
>> +     mtime of the related shared index file is updated to the
>> +     current time.
>
> To match the above suggestion, "Note that a shared index file is
> considered modified (for the purpose of expiration) each time a new
> split-index file is created based on it."?

Yeah, I also agree it is better and will use that.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 21/21] Documentation/git-update-index: explain splitIndex.*
  2016-12-27 19:13   ` Junio C Hamano
@ 2017-01-02 16:04     ` Christian Couder
  0 siblings, 0 replies; 56+ messages in thread
From: Christian Couder @ 2017-01-02 16:04 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Nguyen Thai Ngoc Duy, Ævar Arnfjörð Bjarmason,
	Christian Couder

On Tue, Dec 27, 2016 at 8:13 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Christian Couder <christian.couder@gmail.com> writes:
>
>>  --split-index::
>>  --no-split-index::
>> -     Enable or disable split index mode. If enabled, the index is
>> -     split into two files, $GIT_DIR/index and $GIT_DIR/sharedindex.<SHA-1>.
>> -     Changes are accumulated in $GIT_DIR/index while the shared
>> -     index file contains all index entries stays unchanged. If
>> -     split-index mode is already enabled and `--split-index` is
>> -     given again, all changes in $GIT_DIR/index are pushed back to
>> -     the shared index file. This mode is designed for very large
>> -     indexes that take a significant amount of time to read or write.
>> +     Enable or disable split index mode. If split-index mode is
>> +     already enabled and `--split-index` is given again, all
>> +     changes in $GIT_DIR/index are pushed back to the shared index
>> +     file.
>
> In the world after this series that introduced the percentage-based
> auto consolidation, it smells strange, or even illogical, that index
> is un-split after doing this:
>
>     $ git update-index --split-index
>     $ git update-index --split-index
>
> Before this series, it may have been a quick and dirty way to force
> consolidating the split index without totally disabling the feature,
> i.e. it would have looked more like
>
>     $ git update-index --split-index
>     ... work work work to accumulate more modifications
>     ... consolidate into one --- there was no other way than
>     ... disabling it temporarily
>     $ git update-index --no-split-index
>     ... but the user likes the feature so re-enable it.
>     $ git update-index --split-index
>
> which I guess is where this strange behaviour comes from.
>
> It may be something we need to fix to unconfuse the end-users after
> this series lands.  Even though "first disable and then re-enable"
> takes two commands (as opposed to one), it is far more logical.  And
> the percentage-based auto consolidation feature is meant to reduce
> the need for manual consolidation, so it probably makes sense to
> remove this illogical feature.

Yeah, I tend to agree that this feature could be removed later. Though
before removing it, I'd like to hear Duy's opinion on this as he
created the feature in the first place.

>> @@ -394,6 +390,31 @@ Although this bit looks similar to assume-unchanged bit, its goal is
>>  different from assume-unchanged bit's. Skip-worktree also takes
>>  precedence over assume-unchanged bit when both are set.
>>
>> +Split index
>> +-----------
>> +
>> +This mode is designed for very large indexes that take a significant
>> +amount of time to read or write.
>
> This is not a new problem, but it probably is incorrect to say "to
> read or write".  It saves time by not rewriting the whole thing but
> instead write out only the updated bits.  You'd still read the whole
> thing while populating the in-core index from the disk, and if
> anything, you'd probably spend _more_ cycles because you'd essentially
> be reading the base and then reading the delta to apply on top.

Ok, then what about:

+This mode is designed for repositories with very large indexes, and aims
+at reducing the time it takes to repeatedly write these indexes.

>> +To avoid deleting a shared index file that is still used, its mtime is
>> +updated to the current time everytime a new split index based on the
>> +shared index file is either created or read from.
>
> The same comment on the mention of "mtime" in another patch applies
> here as well.

Ok, I will use "modification time" instead of "mtime".

Thanks.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 06/21] t1700: add tests for core.splitIndex
  2017-01-02  8:29     ` Christian Couder
@ 2017-01-03 12:58       ` Junio C Hamano
  0 siblings, 0 replies; 56+ messages in thread
From: Junio C Hamano @ 2017-01-03 12:58 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Nguyen Thai Ngoc Duy, Ævar Arnfjörð Bjarmason,
	Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

> Ok, I will add a patch to update the style of the existing tests at
> the beginning of the series and then use the same new style in the
> tests I add in later patches.

That's not what I meant---I was expecting and was willing to accept
a corrected patch that leaves existing old-fashioned ones as they
are while making sure that added ones are modern, to reduce the cost
of finishing this series, leaving the style fixes of existing ones
for future clean-up task that can be done by anybody after the dust
from this series settles.

A preparatory clean-up patch before the series like you plan is fine
(eh, rather, "extra nice"), so... thanks.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 08/21] Documentation/git-update-index: talk about core.splitIndex config var
  2017-01-02  9:39     ` Christian Couder
@ 2017-01-07 21:38       ` Junio C Hamano
  2017-01-09 11:18         ` Duy Nguyen
  0 siblings, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2017-01-07 21:38 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Nguyen Thai Ngoc Duy, Ævar Arnfjörð Bjarmason,
	Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

> It feels strange that when I do things one way, you suggest another
> way, and the next time in a similar situation when I do things the way
> you suggested previously, then you suggest the way I did it initially
> the first time...

Perhaps because neither is quite satisfactory and I am being forced
to choose between the two unsatifactory designs?  In any case, I
mostly am and was pointing out the issues and not saying the other
one is the most preferred solution in these threads.

I think there should just be one authoritative source of the truth,
and I have always thought it should be the bit set in the index file
when the command line option is used, because that was the way the
feature was introduced first and I am superstitious about end-user
inertia.  And from that point of view, no matter how you make this
new "config" thing interact with it, it would always give a strange
and unsatifactory end-user experience, at least to me.

Perhaps we should declare that config will be the one and only way
in the future and start deprecating the command line option way.
That will remove the need for two to interact with each other.

I dunno.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 14/21] read-cache: touch shared index files when used
  2017-01-02 14:09     ` Christian Couder
@ 2017-01-07 21:46       ` Junio C Hamano
  2017-01-09 10:55         ` Duy Nguyen
  0 siblings, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2017-01-07 21:46 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Nguyen Thai Ngoc Duy, Ævar Arnfjörð Bjarmason,
	Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

> So what should we do if freshen_file() returns 0 which means that the
> freshening failed?

You tell me ;-)  as you are the one who is proposing this feature.

Isn't a failure to freshen it a grave error?  We are letting a
base/shared index file that is known to be in-use go stale and
eventually subject for garbage collection, and the user should be
notified in some way before the actual GC happens that renders the
index file unusable?

What is the failure mode after such a premature GC happens?  What
does the end-user see?  Can you try to (1) split the index (2)
modify bunch of entries (3) remove the base/shared index with /bin/rm
and then see how various Git commands fail?  Do they fail gracefully?

I am trying to gauge the seriousness of ignoring such an error here.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 14/21] read-cache: touch shared index files when used
  2017-01-07 21:46       ` Junio C Hamano
@ 2017-01-09 10:55         ` Duy Nguyen
  2017-01-09 14:34           ` Junio C Hamano
  0 siblings, 1 reply; 56+ messages in thread
From: Duy Nguyen @ 2017-01-09 10:55 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Christian Couder, git, Ævar Arnfjörð Bjarmason,
	Christian Couder

On Sun, Jan 8, 2017 at 4:46 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Christian Couder <christian.couder@gmail.com> writes:
>
>> So what should we do if freshen_file() returns 0 which means that the
>> freshening failed?
>
> You tell me ;-)  as you are the one who is proposing this feature.

My answer is, we are not worse than freshening loose objects case
(especially since I took the idea from there). In both cases we
silently ignore utime() error (sha1_file.c does retry, but will do
nothing else if the retry fails). And errors in odb are much more
serious than index files. If we are to improve it, I think we should
do it inside check_and_freshen_file(), maybe with an optional flag to
silence it.

> Isn't a failure to freshen it a grave error?  We are letting a
> base/shared index file that is known to be in-use go stale and
> eventually subject for garbage collection, and the user should be
> notified in some way before the actual GC happens that renders the
> index file unusable?
>
> What is the failure mode after such a premature GC happens?  What
> does the end-user see?  Can you try to (1) split the index (2)
> modify bunch of entries (3) remove the base/shared index with /bin/rm
> and then see how various Git commands fail?  Do they fail gracefully?
>
> I am trying to gauge the seriousness of ignoring such an error here.

If we fail to refresh it and the file is old enough and gc happens,
any index file referenced to it are broken. Any commands that read the
index will die(). The best you could do is delete $GIT_DIR/index and
read-tree HEAD.
-- 
Duy

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 08/21] Documentation/git-update-index: talk about core.splitIndex config var
  2017-01-07 21:38       ` Junio C Hamano
@ 2017-01-09 11:18         ` Duy Nguyen
  2017-01-23 15:55           ` Christian Couder
  0 siblings, 1 reply; 56+ messages in thread
From: Duy Nguyen @ 2017-01-09 11:18 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Christian Couder, git, Ævar Arnfjörð Bjarmason,
	Christian Couder

On Sun, Jan 8, 2017 at 4:38 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Christian Couder <christian.couder@gmail.com> writes:
>
>> It feels strange that when I do things one way, you suggest another
>> way, and the next time in a similar situation when I do things the way
>> you suggested previously, then you suggest the way I did it initially
>> the first time...
>
> Perhaps because neither is quite satisfactory and I am being forced
> to choose between the two unsatifactory designs?  In any case, I
> mostly am and was pointing out the issues and not saying the other
> one is the most preferred solution in these threads.
>
> I think there should just be one authoritative source of the truth,

Either that, or we make sure all sources of truth are consistent. In
this case, 'update --split-index' could update core.splitIndex if it
finds that the config tells a different story. As a plumbing though, I
rather leave update-index do simple things, even if it means the user
has to clean up after it (or before it) with "git config -unset
core.splitIndex". Another option is refuse to execute --split-index in
the presence of (conflicting) core.splitIndex. We kind of force the
user to keep all sources of truth consistent this way while leaving a
back door ("git -c core.splitIndex= update-index") for those who need
tools to recover from a bad case.

> and I have always thought it should be the bit set in the index file
> when the command line option is used, because that was the way the
> feature was introduced first and I am superstitious about end-user
> inertia.  And from that point of view, no matter how you make this
> new "config" thing interact with it, it would always give a strange
> and unsatifactory end-user experience, at least to me.
>
> Perhaps we should declare that config will be the one and only way
> in the future and start deprecating the command line option way.
> That will remove the need for two to interact with each other.
>
> I dunno.



-- 
Duy

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 14/21] read-cache: touch shared index files when used
  2017-01-09 10:55         ` Duy Nguyen
@ 2017-01-09 14:34           ` Junio C Hamano
  2017-01-19 12:13             ` Duy Nguyen
  0 siblings, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2017-01-09 14:34 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Christian Couder, git, Ævar Arnfjörð Bjarmason,
	Christian Couder

Duy Nguyen <pclouds@gmail.com> writes:

> On Sun, Jan 8, 2017 at 4:46 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> Christian Couder <christian.couder@gmail.com> writes:
>>
>>> So what should we do if freshen_file() returns 0 which means that the
>>> freshening failed?
>>
>> You tell me ;-)  as you are the one who is proposing this feature.
>
> My answer is, we are not worse than freshening loose objects case
> (especially since I took the idea from there). 

I do not think so, unfortunately.  Loose object files with stale
timestamps are not removed as long as objects are still reachable.
For the base/shared index file, the timestamp is the only thing that
protects them from pruning, unless it is serving as the base file
for the currently active $GIT_DIR/index that is split.

>> What is the failure mode after such a premature GC happens?  What
>> does the end-user see?  Can you try to (1) split the index (2)
>> modify bunch of entries (3) remove the base/shared index with /bin/rm
>> and then see how various Git commands fail?  Do they fail gracefully?
>>
>> I am trying to gauge the seriousness of ignoring such an error here.
>
> If we fail to refresh it and the file is old enough and gc happens,
> any index file referenced to it are broken. Any commands that read the
> index will die(). The best you could do is delete $GIT_DIR/index and
> read-tree HEAD.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 14/21] read-cache: touch shared index files when used
  2017-01-09 14:34           ` Junio C Hamano
@ 2017-01-19 12:13             ` Duy Nguyen
  2017-01-19 19:00               ` Junio C Hamano
  0 siblings, 1 reply; 56+ messages in thread
From: Duy Nguyen @ 2017-01-19 12:13 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Christian Couder, git, Ævar Arnfjörð Bjarmason,
	Christian Couder

On Mon, Jan 9, 2017 at 9:34 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Duy Nguyen <pclouds@gmail.com> writes:
>
>> On Sun, Jan 8, 2017 at 4:46 AM, Junio C Hamano <gitster@pobox.com> wrote:
>>> Christian Couder <christian.couder@gmail.com> writes:
>>>
>>>> So what should we do if freshen_file() returns 0 which means that the
>>>> freshening failed?
>>>
>>> You tell me ;-)  as you are the one who is proposing this feature.
>>
>> My answer is, we are not worse than freshening loose objects case
>> (especially since I took the idea from there).
>
> I do not think so, unfortunately.  Loose object files with stale
> timestamps are not removed as long as objects are still reachable.

But there are plenty of unreachable loose objects, added in index,
then got replaced with new versions. cache-tree can create loose trees
too and it's been run more often, behind user's back, to take
advantage of the shortcut in unpack-trees.

> For the base/shared index file, the timestamp is the only thing that
> protects them from pruning, unless it is serving as the base file
> for the currently active $GIT_DIR/index that is split.
-- 
Duy

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 14/21] read-cache: touch shared index files when used
  2017-01-19 12:13             ` Duy Nguyen
@ 2017-01-19 19:00               ` Junio C Hamano
  2017-01-20 10:44                 ` Duy Nguyen
  2017-01-23 18:14                 ` Christian Couder
  0 siblings, 2 replies; 56+ messages in thread
From: Junio C Hamano @ 2017-01-19 19:00 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Christian Couder, git, Ævar Arnfjörð Bjarmason,
	Christian Couder

Duy Nguyen <pclouds@gmail.com> writes:

> On Mon, Jan 9, 2017 at 9:34 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Duy Nguyen <pclouds@gmail.com> writes:
>>
>>> On Sun, Jan 8, 2017 at 4:46 AM, Junio C Hamano <gitster@pobox.com> wrote:
>>>> Christian Couder <christian.couder@gmail.com> writes:
>>>>
>>>>> So what should we do if freshen_file() returns 0 which means that the
>>>>> freshening failed?
>>>>
>>>> You tell me ;-)  as you are the one who is proposing this feature.
>>>
>>> My answer is, we are not worse than freshening loose objects case
>>> (especially since I took the idea from there).
>>
>> I do not think so, unfortunately.  Loose object files with stale
>> timestamps are not removed as long as objects are still reachable.
>
> But there are plenty of unreachable loose objects, added in index,
> then got replaced with new versions. cache-tree can create loose trees
> too and it's been run more often, behind user's back, to take
> advantage of the shortcut in unpack-trees.

I am not sure if I follow.  Aren't objects reachable from the
cache-tree in the index protected from gc?

Not that I think freshening would actually fail in a repository
where you can actually write into to update the index or its refs to
make a difference (iow, even if we make it die() loudly when shared
index cannot be "touched" because we are paranoid, no real life
usage will trigger that die(), and if a repository does trigger the
die(), I think you would really want to know about it).

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 14/21] read-cache: touch shared index files when used
  2017-01-19 19:00               ` Junio C Hamano
@ 2017-01-20 10:44                 ` Duy Nguyen
  2017-01-23 18:14                 ` Christian Couder
  1 sibling, 0 replies; 56+ messages in thread
From: Duy Nguyen @ 2017-01-20 10:44 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Christian Couder, git, Ævar Arnfjörð Bjarmason,
	Christian Couder

On Fri, Jan 20, 2017 at 2:00 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Duy Nguyen <pclouds@gmail.com> writes:
>
>> On Mon, Jan 9, 2017 at 9:34 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>> Duy Nguyen <pclouds@gmail.com> writes:
>>>
>>>> On Sun, Jan 8, 2017 at 4:46 AM, Junio C Hamano <gitster@pobox.com> wrote:
>>>>> Christian Couder <christian.couder@gmail.com> writes:
>>>>>
>>>>>> So what should we do if freshen_file() returns 0 which means that the
>>>>>> freshening failed?
>>>>>
>>>>> You tell me ;-)  as you are the one who is proposing this feature.
>>>>
>>>> My answer is, we are not worse than freshening loose objects case
>>>> (especially since I took the idea from there).
>>>
>>> I do not think so, unfortunately.  Loose object files with stale
>>> timestamps are not removed as long as objects are still reachable.
>>
>> But there are plenty of unreachable loose objects, added in index,
>> then got replaced with new versions. cache-tree can create loose trees
>> too and it's been run more often, behind user's back, to take
>> advantage of the shortcut in unpack-trees.
>
> I am not sure if I follow.  Aren't objects reachable from the
> cache-tree in the index protected from gc?

I think the problem is loose objects created then gc run just before
they are referenced (e.g. index written down). But I think I may be
following a wrong road. If mtime is in fact to deal with race
conditions, applying the same idea here is wrong because we have a
different problem here.

> Not that I think freshening would actually fail in a repository
> where you can actually write into to update the index or its refs to
> make a difference (iow, even if we make it die() loudly when shared
> index cannot be "touched" because we are paranoid, no real life
> usage will trigger that die(), and if a repository does trigger the
> die(), I think you would really want to know about it).
-- 
Duy

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 08/21] Documentation/git-update-index: talk about core.splitIndex config var
  2017-01-09 11:18         ` Duy Nguyen
@ 2017-01-23 15:55           ` Christian Couder
  2017-01-23 19:59             ` Junio C Hamano
  0 siblings, 1 reply; 56+ messages in thread
From: Christian Couder @ 2017-01-23 15:55 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Junio C Hamano, git, Ævar Arnfjörð Bjarmason,
	Christian Couder

On Mon, Jan 9, 2017 at 12:18 PM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Sun, Jan 8, 2017 at 4:38 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> Christian Couder <christian.couder@gmail.com> writes:
>>
>>> It feels strange that when I do things one way, you suggest another
>>> way, and the next time in a similar situation when I do things the way
>>> you suggested previously, then you suggest the way I did it initially
>>> the first time...
>>
>> Perhaps because neither is quite satisfactory and I am being forced
>> to choose between the two unsatifactory designs?  In any case, I
>> mostly am and was pointing out the issues and not saying the other
>> one is the most preferred solution in these threads.
>>
>> I think there should just be one authoritative source of the truth,
>
> Either that, or we make sure all sources of truth are consistent. In
> this case, 'update --split-index' could update core.splitIndex if it
> finds that the config tells a different story. As a plumbing though, I
> rather leave update-index do simple things, even if it means the user
> has to clean up after it (or before it) with "git config -unset
> core.splitIndex". Another option is refuse to execute --split-index in
> the presence of (conflicting) core.splitIndex. We kind of force the
> user to keep all sources of truth consistent this way while leaving a
> back door ("git -c core.splitIndex= update-index") for those who need
> tools to recover from a bad case.
>
>> and I have always thought it should be the bit set in the index file
>> when the command line option is used, because that was the way the
>> feature was introduced first and I am superstitious about end-user
>> inertia.  And from that point of view, no matter how you make this
>> new "config" thing interact with it, it would always give a strange
>> and unsatifactory end-user experience, at least to me.
>>
>> Perhaps we should declare that config will be the one and only way
>> in the future and start deprecating the command line option way.
>> That will remove the need for two to interact with each other.

That would be my preferred solution as I think it is the simplest in
the end for users.
Also, as Duy wrote above, one can always use something like "git -c
core.splitIndex= ...", which by the way can work for any command, not
just "update-index".

Anyway we would have to introduce core.splitIndex first, and it
wouldn't make sense to have a different behavior between
--[no-]split-index and --[no-]untracked-cache in the meantime before
they are deprecated and eventually removed.

So let's just go with the implementation in this series, which is
similar to --[no-]untracked-cache, and let's plan to deprecate
--[no-]split-index and --[no-]untracked-cache in a later patch series.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 14/21] read-cache: touch shared index files when used
  2017-01-19 19:00               ` Junio C Hamano
  2017-01-20 10:44                 ` Duy Nguyen
@ 2017-01-23 18:14                 ` Christian Couder
  2017-01-23 18:53                   ` Junio C Hamano
  1 sibling, 1 reply; 56+ messages in thread
From: Christian Couder @ 2017-01-23 18:14 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Duy Nguyen, git, Ævar Arnfjörð Bjarmason,
	Christian Couder

On Thu, Jan 19, 2017 at 8:00 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Duy Nguyen <pclouds@gmail.com> writes:
>
>> On Mon, Jan 9, 2017 at 9:34 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>> Duy Nguyen <pclouds@gmail.com> writes:
>>>
>>>> On Sun, Jan 8, 2017 at 4:46 AM, Junio C Hamano <gitster@pobox.com> wrote:
>>>>> Christian Couder <christian.couder@gmail.com> writes:
>>>>>
>>>>>> So what should we do if freshen_file() returns 0 which means that the
>>>>>> freshening failed?
>>>>>
>>>>> You tell me ;-)  as you are the one who is proposing this feature.

Before the above question I already had given my opinion about what we
should do.

There are the following cases:

- Freshening failed because the shared index file does not exist
anymore. In this case it could have been removed for a good reason
(for example maybe the user wants to remove all the shared index
files), or it could be a bug somewhere else. Anyway we cannot know and
the user will get an error if the shared index file that disappeared
is read from afterwards, so I don't think we need to warn or do
anything.

- Freshening failed because the mtime of the shared index cannot be
changed. You already in a previous email wrote that we shoudn't warn
if the file system is read only, and I agree with that, as anyway if
the file system is read only, the shared index file cannot be deleted,
so there is no risk from the current user. Also the split index mode
is useful to speed up index writing at the cost of making index
reading a little slower, so its use in a read only mode should not be
the primary way it is used.

So my opinion is that there are good reasons not to do anything if
freshening fails.

>>>> My answer is, we are not worse than freshening loose objects case
>>>> (especially since I took the idea from there).
>>>
>>> I do not think so, unfortunately.  Loose object files with stale
>>> timestamps are not removed as long as objects are still reachable.

As the current index is read, which freshens its shared index file,
before a new index is created, the number of shared index files
doesn't go below 2. This can be seen in the tests changed in patch
19/21. So the risk of deleting interesting shared index files is quite
low in my opinion.

Also in general the split-index mode is useful when you often write
new indexes, and in this case shared index files that are used will
often be freshened, so the risk of deleting interesting shared index
files should be low.

>> But there are plenty of unreachable loose objects, added in index,
>> then got replaced with new versions. cache-tree can create loose trees
>> too and it's been run more often, behind user's back, to take
>> advantage of the shortcut in unpack-trees.
>
> I am not sure if I follow.  Aren't objects reachable from the
> cache-tree in the index protected from gc?
>
> Not that I think freshening would actually fail in a repository
> where you can actually write into to update the index or its refs to
> make a difference (iow, even if we make it die() loudly when shared
> index cannot be "touched" because we are paranoid, no real life
> usage will trigger that die(), and if a repository does trigger the
> die(), I think you would really want to know about it).

As I wrote above, I think if we can actually write the shared index
file but its freshening fails, it probably means that the shared index
file has been removed behind us, and this case is equivalent as when
loose files have been removed behind us.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 14/21] read-cache: touch shared index files when used
  2017-01-23 18:14                 ` Christian Couder
@ 2017-01-23 18:53                   ` Junio C Hamano
  2017-01-25 14:35                     ` Christian Couder
  0 siblings, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2017-01-23 18:53 UTC (permalink / raw)
  To: Christian Couder
  Cc: Duy Nguyen, git, Ævar Arnfjörð Bjarmason,
	Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

> Also in general the split-index mode is useful when you often write
> new indexes, and in this case shared index files that are used will
> often be freshened, so the risk of deleting interesting shared index
> files should be low.
> ...
>> Not that I think freshening would actually fail in a repository
>> where you can actually write into to update the index or its refs to
>> make a difference (iow, even if we make it die() loudly when shared
>> index cannot be "touched" because we are paranoid, no real life
>> usage will trigger that die(), and if a repository does trigger the
>> die(), I think you would really want to know about it).
>
> As I wrote above, I think if we can actually write the shared index
> file but its freshening fails, it probably means that the shared index
> file has been removed behind us, and this case is equivalent as when
> loose files have been removed behind us.

OK, so it is unlikely to happen, and when it happens it leads to a
catastrophic failure---do we just ignore or do we report an error?

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 08/21] Documentation/git-update-index: talk about core.splitIndex config var
  2017-01-23 15:55           ` Christian Couder
@ 2017-01-23 19:59             ` Junio C Hamano
  0 siblings, 0 replies; 56+ messages in thread
From: Junio C Hamano @ 2017-01-23 19:59 UTC (permalink / raw)
  To: Christian Couder
  Cc: Duy Nguyen, git, Ævar Arnfjörð Bjarmason,
	Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

>>> Perhaps we should declare that config will be the one and only way
>>> in the future and start deprecating the command line option way.
>>> That will remove the need for two to interact with each other.
>
> That would be my preferred solution as I think it is the simplest in
> the end for users.
> Also, as Duy wrote above, one can always use something like "git -c
> core.splitIndex= ...", which by the way can work for any command, not
> just "update-index".
>
> Anyway we would have to introduce core.splitIndex first, and it
> wouldn't make sense to have a different behavior between
> --[no-]split-index and --[no-]untracked-cache in the meantime before
> they are deprecated and eventually removed.
>
> So let's just go with the implementation in this series, which is
> similar to --[no-]untracked-cache, and let's plan to deprecate
> --[no-]split-index and --[no-]untracked-cache in a later patch series.

Sounds like we have a plan.  Thanks.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 14/21] read-cache: touch shared index files when used
  2017-01-23 18:53                   ` Junio C Hamano
@ 2017-01-25 14:35                     ` Christian Couder
  2017-01-25 20:52                       ` Junio C Hamano
  0 siblings, 1 reply; 56+ messages in thread
From: Christian Couder @ 2017-01-25 14:35 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Duy Nguyen, git, Ævar Arnfjörð Bjarmason,
	Christian Couder

On Mon, Jan 23, 2017 at 7:53 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Christian Couder <christian.couder@gmail.com> writes:
>
>> Also in general the split-index mode is useful when you often write
>> new indexes, and in this case shared index files that are used will
>> often be freshened, so the risk of deleting interesting shared index
>> files should be low.
>> ...
>>> Not that I think freshening would actually fail in a repository
>>> where you can actually write into to update the index or its refs to
>>> make a difference (iow, even if we make it die() loudly when shared
>>> index cannot be "touched" because we are paranoid, no real life
>>> usage will trigger that die(), and if a repository does trigger the
>>> die(), I think you would really want to know about it).
>>
>> As I wrote above, I think if we can actually write the shared index
>> file but its freshening fails, it probably means that the shared index
>> file has been removed behind us, and this case is equivalent as when
>> loose files have been removed behind us.
>
> OK, so it is unlikely to happen, and when it happens it leads to a
> catastrophic failure---do we just ignore or do we report an error?

Well, when we cannot freshen a loose file (with
freshen_loose_object()), we don't warn or die, we just write the loose
file. But here we cannot write the shared index file.

And this doesn't lead to a catastrophic failure right away. There
could be a catastrophic failure if the shared index file is needed
again later, but we are not sure that it's going to be needed later.
In fact it may have just been removed because it won't be needed
later.

So I am not sure it's a good idea to anticipate a catastrophic failure
that may not happen. Perhaps we could just warn, but I am not sure it
will help the user. If the catastrophic failure doesn't happen because
the shared index file is not needed, I can't see how the warning could
help. And if there is a catastrophic failure, the following will be
displayed:

fatal: .git/sharedindex.cbfe41352a4623d4d3e9757861fed53f3094e0ac:
index file open failed: No such file or directory

and I don't see how the warning could help on top of that. It could at
most repeat the same thing.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 14/21] read-cache: touch shared index files when used
  2017-01-25 14:35                     ` Christian Couder
@ 2017-01-25 20:52                       ` Junio C Hamano
  2017-01-31 14:06                         ` Christian Couder
  0 siblings, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2017-01-25 20:52 UTC (permalink / raw)
  To: Christian Couder
  Cc: Duy Nguyen, git, Ævar Arnfjörð Bjarmason,
	Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

> Well, when we cannot freshen a loose file (with
> freshen_loose_object()), we don't warn or die, we just write the loose
> file. But here we cannot write the shared index file.

I think that is an excellent point.  Let me make sure I got you
right.  For loose object files, we may attempt to freshen and when
it fails we stay silent and do not talk about the failure.  Instead,
we write the same file again.  That will have two potential outcomes:

 1. write fails and we fail the whole thing.  It is very clear to
    the user that there is something wrong going on.

 2. write succeeds, and because we just wrote it, we know that the
    file is fresh and is protected from gc.

So the "freshen and if fails just write" is sufficient.  It is
absolutely the right thing to do for loose object files.

When we are forking off a new split index file based on an old
shared index file, we may attempt to "touch" the old shared one, but
we cannot write the old shared one again, because other split index
may be based on that, and we do not have enough information to
recreate the old one [*1*].  The fallback safety is not available.

> And this doesn't lead to a catastrophic failure right away. 

Exactly.

> There
> could be a catastrophic failure if the shared index file is needed
> again later, but we are not sure that it's going to be needed later.
> In fact it may have just been removed because it won't be needed
> later.

You are listing only the irrelevant cases.  The shared one may be
used immediately, and the user can keep using it for a while without
"touching".  Perhaps the shared one was chowned to "root" while the
user is looking the other way, and its timestamp is now frozen to
the time of chown.  It is a ticking time-bomb that will go off when
your expiry logic kicks in.

> So I am not sure it's a good idea to anticipate a catastrophic failure
> that may not happen. Perhaps we could just warn, but I am not sure it
> will help the user. If the catastrophic failure doesn't happen because
> the shared index file is not needed, I can't see how the warning could
> help. And if there is a catastrophic failure, the following will be
> displayed:
>
> fatal: .git/sharedindex.cbfe41352a4623d4d3e9757861fed53f3094e0ac:
> index file open failed: No such file or directory

That "fatal" is given _ONLY_ after time passes and our failure to
"touch" the file that is still-in-use left it subject to "gc".  Of
course, when "fatal" is given, it is too late to warn about ticking
time bomb.

At the time we notice a ticking time bomb is the only sensible time
to warn.  Or better yet take a corrective action.

As I expect (and you seem to agree) that a failure to "touch" would
be a very rare event (like, sysadmin chowning it to "root" by
mistake), I do not mind if the "corrective action" were "immediately
unsplit the index, so that at least the current and the latest index
contents will be written out safely to a new single unsplit index
file".  That won't help _other_ split index files that are based on
the same "untouchable" shared index, but I do not think that is a
problem we need to solve---if they are still in use, the code that
use them will notice it, and otherwise they are not in use and can
be aged away safely.


[Footnote]

*1* My understanding is that we lose information on stale entries in
    the shared file that are covered by the split index overlay
    after read_index() returns, so we _might_ be able to write the
    "old" one that is sufficient for _our_ split index, but we do
    not have good enough information to recreate "old" one usable by
    other split index files.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 14/21] read-cache: touch shared index files when used
  2017-01-25 20:52                       ` Junio C Hamano
@ 2017-01-31 14:06                         ` Christian Couder
  2017-01-31 19:22                           ` Junio C Hamano
  0 siblings, 1 reply; 56+ messages in thread
From: Christian Couder @ 2017-01-31 14:06 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Duy Nguyen, git, Ævar Arnfjörð Bjarmason,
	Christian Couder

On Wed, Jan 25, 2017 at 9:52 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Christian Couder <christian.couder@gmail.com> writes:
>
>> Well, when we cannot freshen a loose file (with
>> freshen_loose_object()), we don't warn or die, we just write the loose
>> file. But here we cannot write the shared index file.
>
> I think that is an excellent point.  Let me make sure I got you
> right.  For loose object files, we may attempt to freshen and when
> it fails we stay silent and do not talk about the failure.  Instead,
> we write the same file again.  That will have two potential outcomes:
>
>  1. write fails and we fail the whole thing.  It is very clear to
>     the user that there is something wrong going on.
>
>  2. write succeeds, and because we just wrote it, we know that the
>     file is fresh and is protected from gc.
>
> So the "freshen and if fails just write" is sufficient.  It is
> absolutely the right thing to do for loose object files.
>
> When we are forking off a new split index file based on an old
> shared index file, we may attempt to "touch" the old shared one, but
> we cannot write the old shared one again, because other split index
> may be based on that, and we do not have enough information to
> recreate the old one [*1*].  The fallback safety is not available.

Yeah I agree. But note that I was talking about the case where the
shared index file does not exist anymore.
I was just saying that when the shared index file has been removed
behind us, it is equivalent as when
loose files have been removed behind us, except that we cannot write
the removed file to disk.

>> And this doesn't lead to a catastrophic failure right away.
>
> Exactly.
>
>> There
>> could be a catastrophic failure if the shared index file is needed
>> again later, but we are not sure that it's going to be needed later.
>> In fact it may have just been removed because it won't be needed
>> later.
>
> You are listing only the irrelevant cases.  The shared one may be
> used immediately, and the user can keep using it for a while without
> "touching".

Now you are talking about a case where the shared index file can be
used immediately and the user can keep using it.
This is a different case than the case I was talking about (which is
the case when the shared index file does not exist anymore).

> Perhaps the shared one was chowned to "root" while the
> user is looking the other way, and its timestamp is now frozen to
> the time of chown.  It is a ticking time-bomb that will go off when
> your expiry logic kicks in.

In the case I was talking about, the shared index file doesn't exist
anymore. So there is no time ticking bomb, because what could happen,
if we cannot freshen the shared index file, is that the shared index
file could be removed, which wouldn't make things worse as anyway it
does not exist anymore.

So the case when the shared index file doesn't exist is different than
other cases.

Now if you want to talk about the case when the shared index file has
not been removed, but just chowned to "root", then I wonder how is it
different from the case when the file system is read only.
Perhaps the admin just chowned everything to make sure that the repo
could not be changed anymore by the usual processes and users.

>> So I am not sure it's a good idea to anticipate a catastrophic failure
>> that may not happen. Perhaps we could just warn, but I am not sure it
>> will help the user. If the catastrophic failure doesn't happen because
>> the shared index file is not needed, I can't see how the warning could
>> help. And if there is a catastrophic failure, the following will be
>> displayed:
>>
>> fatal: .git/sharedindex.cbfe41352a4623d4d3e9757861fed53f3094e0ac:
>> index file open failed: No such file or directory
>
> That "fatal" is given _ONLY_ after time passes and our failure to
> "touch" the file that is still-in-use left it subject to "gc".  Of
> course, when "fatal" is given, it is too late to warn about ticking
> time bomb.
>
> At the time we notice a ticking time bomb is the only sensible time
> to warn.  Or better yet take a corrective action.

In a previous email you wrote that if the file system is read only, it
is a bad thing if we warn.
So I wonder why it would be a good thing to warn or try to do
something if the admin chowned everything to prevent usual processes
and users to change anything.

> As I expect (and you seem to agree) that a failure to "touch" would
> be a very rare event (like, sysadmin chowning it to "root" by
> mistake),

Yeah, I agree that a failure to "touch" would be a rare event. But the
thing is we often don't know if admins or users do things by mistake
or not.
If they rm -rf everything forcefully, for example, they might not like
it, if we start to rewrite stuff to try to "correct" things.

> I do not mind if the "corrective action" were "immediately
> unsplit the index, so that at least the current and the latest index
> contents will be written out safely to a new single unsplit index
> file".

If the admins or users chowned a shared index file precisely because
they want to save it from ever being garbage collected, I am not sure
that they would be happy if we decide to "correct" things.
For example they might not like it if we change the current mode to
unsplit index.

So in my opinion, the safest thing, if we really want to do something,
is to just warn, perhaps once, and I am ok to do that.
If we really wanted to do our best, perhaps we could warn once only in
the case when the shared index file still exists (because otherwise
there nothing to lose anymore).

> That won't help _other_ split index files that are based on
> the same "untouchable" shared index, but I do not think that is a
> problem we need to solve---if they are still in use, the code that
> use them will notice it, and otherwise they are not in use and can
> be aged away safely.

I am not sure that I understand you correctly on this. The current
code does not age away the split index files. It only ages away shared
index files.
I think aging away the split index file is a different issue, because
when we are not in split index mode, the index files are not aged
anyway.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 14/21] read-cache: touch shared index files when used
  2017-01-31 14:06                         ` Christian Couder
@ 2017-01-31 19:22                           ` Junio C Hamano
  2017-01-31 21:09                             ` Junio C Hamano
  0 siblings, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2017-01-31 19:22 UTC (permalink / raw)
  To: Christian Couder
  Cc: Duy Nguyen, git, Ævar Arnfjörð Bjarmason,
	Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

>> You are listing only the irrelevant cases.  The shared one may be
>> used immediately, and the user can keep using it for a while without
>> "touching".
>
> Now you are talking about a case where the shared index file can be
> used immediately and the user can keep using it.
> This is a different case than the case I was talking about (which is
> the case when the shared index file does not exist anymore).

Yes, as I said, you are listing irrelevant and uninteresting one.
If the shared index is already gone, reporting failure to touch it
is of no use---an attempt to read and use the split index that
depends on the shared index that is gone will fail anyway.

> In a previous email you wrote that if the file system is read only, it
> is a bad thing if we warn.

Yes, but you need to realize that "it is better not to bother users
with a report of failure to touch in read-only repository" and "we
ignore all failures".

IIUC, you attempt to touch the shared index even when you are
only reading the index, because you want to mark the fact that the
shared index is still being depended upon.  And I tend to agree that
it is OK not to report a failure for that case.  It is very similar
to a situation where you are asked to peek into your coworker's
repository, which you do not have write access to, and run "status".
The command first runs the equivalent of "update-index --refresh"
only in-core, and it attempts to write the updated index because
(1) it paid the cost of doing the refreshing already, and (2) if it
can write into the index, it will help future operations in the
repository.  But it does not report a failure to write the index
exactly because it is merely doing an opportunistic write.

And in the "we read from the split index, and we attempt to touch
the underlying shared one to update its timestamp" case, it is OK
not to report if we failed to touch.

But you also attempt to touch the shared index when you are actually
writing out a new split index file based on it, no?  The "you
created a ticking bomb" situation is where you fail to touch the
shared index for whatever reason, even when you managed to write the
new split index file.

We agreed that read-only operation should not nag, so it won't
trigger when you are peeking somebody else's repository to help him.
As I said, it is an uninteresting and irrelevant case---when your
read-only peeking did not add new information to be preserved, it is
less severe problem that you fail to reset the expiration.  On the
other hand, if you added new information, i.e. wrote the split index
based on it, it is a good indication that the <split,shared> index
pair has information that is more valuable.  We must warn in that
case.

> Now if you want to talk about the case when the shared index file has
> not been removed, but just chowned to "root", then I wonder how is it
> different from the case when the file system is read only.

The difference is that your code has enough information to notice
the case where you know your touch failed, you know that you wanted
to write (and the write succeeded), and yet you do NOT know why your
touch failed.  That is the ticking bomb we need to warn the user
about.


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 14/21] read-cache: touch shared index files when used
  2017-01-31 19:22                           ` Junio C Hamano
@ 2017-01-31 21:09                             ` Junio C Hamano
  0 siblings, 0 replies; 56+ messages in thread
From: Junio C Hamano @ 2017-01-31 21:09 UTC (permalink / raw)
  To: Christian Couder
  Cc: Duy Nguyen, git, Ævar Arnfjörð Bjarmason,
	Christian Couder

Junio C Hamano <gitster@pobox.com> writes:

> Yes, but you need to realize that "it is better not to bother users
> with a report of failure to touch in read-only repository" and "we
> ignore all failures".

Sorry about an unfinished sentence here.  "need to realize that
... and ... are different things."

> ... It is very similar
> to a situation where you ... run "status".
> The command first runs the equivalent of "update-index --refresh"
> only in-core, and it attempts to write the updated index because
> (1) it paid the cost of doing the refreshing already, and (2) if it
> can write into the index, it will help future operations in the
> repository.  But it does not report a failure to write the index
> exactly because it is merely doing an opportunistic write.
>
> And in the "we read from the split index, and we attempt to touch
> the underlying shared one to update its timestamp" case, it is OK
> not to report if we failed to touch.
> ...
> ... On the
> other hand, if you added new information, i.e. wrote the split index
> based on it, it is a good indication that the <split,shared> index
> pair has information that is more valuable.  We must warn in that
> case.

This reminds us of a third case.  What should happen if we are doing
the "opportunistic writing of the index" in "git status", managed to
write the split index, but failed to touch the shared one?

In the ideal world, I think we should do the following sequence:

 - "status" tries to write cache to the file.

 - we try to write and on any error, we return error to the caller,
   who is already prepared to ignore it and stay silent.

    - as the first step of writing the index, we first try to touch
      the shared one.  If it fails, we return an error here without
      writing the split one out.

    - then we try to write the split one out.  If this fails, we
      also return an error.

    - otherwise, both touching of the shared one and writing of the
      split one are successful.  

 - "status" finishes the opportunistic refreshing of the index, by
   either ignoring an error silently (if either touching of shared
   one or writing of split one fails) or writing the refreshed index
   out successfully.

It is OK to swap the order of touching the shared one and writing of
the split one in the above sequence, as long as an error in either
step signals a failure to the opportunistic caller.

I do not offhand know if the split-index code is structured in such
a way to allow the above sequence easily, or it needs refactoring.

If such a restructuring is required, it might not be within the
scope of the series and I am OK if you just left the NEEDSWORK
comment that describes the above (i.e. what we should be doing) as
an in-code comment so that we can pick it up later.  The whole point
of the step 14/21 on the other hand is to make sure that a shared
index that is still in active use will not go stale, and from that
point of view, such a "punting" may not be a good idea---it
deliberately finishes the series knowing that it does not adequately
do what it promises to do.  

So, ... I dunno.


^ permalink raw reply	[flat|nested] 56+ messages in thread

end of thread, other threads:[~2017-01-31 21:39 UTC | newest]

Thread overview: 56+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-26 10:22 [PATCH v3 00/21] Add configuration options for split-index Christian Couder
2016-12-26 10:22 ` [PATCH v3 01/21] config: mark an error message up for translation Christian Couder
2016-12-26 10:22 ` [PATCH v3 02/21] config: add git_config_get_split_index() Christian Couder
2016-12-26 10:22 ` [PATCH v3 03/21] split-index: add {add,remove}_split_index() functions Christian Couder
2016-12-26 10:22 ` [PATCH v3 04/21] read-cache: add and then use tweak_split_index() Christian Couder
2016-12-26 10:22 ` [PATCH v3 05/21] update-index: warn in case of split-index incoherency Christian Couder
2016-12-26 10:22 ` [PATCH v3 06/21] t1700: add tests for core.splitIndex Christian Couder
2016-12-27 19:04   ` Junio C Hamano
2017-01-02  8:29     ` Christian Couder
2017-01-03 12:58       ` Junio C Hamano
2016-12-26 10:22 ` [PATCH v3 07/21] Documentation/config: add information " Christian Couder
2016-12-26 10:22 ` [PATCH v3 08/21] Documentation/git-update-index: talk about core.splitIndex config var Christian Couder
2016-12-27 19:07   ` Junio C Hamano
2017-01-02  9:39     ` Christian Couder
2017-01-07 21:38       ` Junio C Hamano
2017-01-09 11:18         ` Duy Nguyen
2017-01-23 15:55           ` Christian Couder
2017-01-23 19:59             ` Junio C Hamano
2016-12-26 10:22 ` [PATCH v3 09/21] config: add git_config_get_max_percent_split_change() Christian Couder
2016-12-26 10:22 ` [PATCH v3 10/21] read-cache: regenerate shared index if necessary Christian Couder
2016-12-27 19:08   ` Junio C Hamano
2017-01-02 11:23     ` Christian Couder
2016-12-26 10:22 ` [PATCH v3 11/21] t1700: add tests for splitIndex.maxPercentChange Christian Couder
2016-12-26 10:22 ` [PATCH v3 12/21] Documentation/config: add splitIndex.maxPercentChange Christian Couder
2016-12-27 19:09   ` Junio C Hamano
2017-01-02 13:50     ` Christian Couder
2016-12-26 10:22 ` [PATCH v3 13/21] sha1_file: make check_and_freshen_file() non static Christian Couder
2016-12-27 19:09   ` Junio C Hamano
2016-12-26 10:22 ` [PATCH v3 14/21] read-cache: touch shared index files when used Christian Couder
2016-12-27 19:10   ` Junio C Hamano
2017-01-02 14:09     ` Christian Couder
2017-01-07 21:46       ` Junio C Hamano
2017-01-09 10:55         ` Duy Nguyen
2017-01-09 14:34           ` Junio C Hamano
2017-01-19 12:13             ` Duy Nguyen
2017-01-19 19:00               ` Junio C Hamano
2017-01-20 10:44                 ` Duy Nguyen
2017-01-23 18:14                 ` Christian Couder
2017-01-23 18:53                   ` Junio C Hamano
2017-01-25 14:35                     ` Christian Couder
2017-01-25 20:52                       ` Junio C Hamano
2017-01-31 14:06                         ` Christian Couder
2017-01-31 19:22                           ` Junio C Hamano
2017-01-31 21:09                             ` Junio C Hamano
2016-12-26 10:22 ` [PATCH v3 15/21] config: add git_config_get_expiry() from gc.c Christian Couder
2016-12-26 10:22 ` [PATCH v3 16/21] read-cache: unlink old sharedindex files Christian Couder
2016-12-26 10:22 ` [PATCH v3 17/21] t1700: test shared index file expiration Christian Couder
2016-12-26 10:22 ` [PATCH v3 18/21] read-cache: refactor read_index_from() Christian Couder
2016-12-26 10:22 ` [PATCH v3 19/21] read-cache: use freshen_shared_index() in read_index_from() Christian Couder
2016-12-26 10:22 ` [PATCH v3 20/21] Documentation/config: add splitIndex.sharedIndexExpire Christian Couder
2016-12-27 19:11   ` Junio C Hamano
2017-01-02 14:33     ` Christian Couder
2016-12-26 10:22 ` [PATCH v3 21/21] Documentation/git-update-index: explain splitIndex.* Christian Couder
2016-12-27 19:13   ` Junio C Hamano
2017-01-02 16:04     ` Christian Couder
2016-12-26 10:29 ` [PATCH v3 00/21] Add configuration options for split-index Christian Couder

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).