git@vger.kernel.org mailing list mirror (one of many)
 help / Atom feed
* [PATCH 0/3] fixes for split index mode
@ 2017-12-10 21:21 Thomas Gummerer
  2017-12-10 21:22 ` [PATCH 1/3] repository: fix repo_read_index with submodules Thomas Gummerer
                   ` (3 more replies)
  0 siblings, 4 replies; 77+ messages in thread
From: Thomas Gummerer @ 2017-12-10 21:21 UTC (permalink / raw)
  To: git; +Cc: Christian Couder, Lars Schneider, Brandon Williams, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy, Thomas Gummerer

On the current master branch, 95ec6b1b33 ("RelNotes: the eighth
batch", 2017-12-06) , the test suite fails a few tests when
GIT_TEST_SPLIT_INDEX is set:

Test Summary Report
-------------------
t3007-ls-files-recurse-submodules.sh             (Wstat: 256 Tests: 21 Failed: 13)
  Failed tests:  2-14
  Non-zero exit status: 1
t7009-filter-branch-null-sha1.sh                 (Wstat: 256 Tests: 6 Failed: 1)
  Failed test:  4
  Non-zero exit status: 1
t5304-prune.sh                                   (Wstat: 256 Tests: 25 Failed: 3)
  Failed tests:  23-25
  Non-zero exit status: 1
t7814-grep-recurse-submodules.sh                 (Wstat: 256 Tests: 22 Failed: 13)
  Failed tests:  2-3, 5-10, 12-15, 22
  Non-zero exit status: 1

This series fixes these and makes travis run the test suite with
GIT_TEST_SPLIT_INDEX to avoid similar breakages in the future.

Thomas Gummerer (3):
  repository: fix repo_read_index with submodules
  prune: fix pruning with multiple worktrees and split index
  travis: run tests with GIT_TEST_SPLIT_INDEX

 .travis.yml  |  2 +-
 cache.h      |  1 +
 read-cache.c | 19 +++++++++++++++++--
 repository.c | 13 ++++++++++++-
 repository.h |  2 ++
 revision.c   | 13 ++++++++-----
 6 files changed, 41 insertions(+), 9 deletions(-)

-- 
2.15.1.504.g5279b80103


^ permalink raw reply	[flat|nested] 77+ messages in thread

* [PATCH 1/3] repository: fix repo_read_index with submodules
  2017-12-10 21:21 [PATCH 0/3] fixes for split index mode Thomas Gummerer
@ 2017-12-10 21:22 ` Thomas Gummerer
  2017-12-11 18:54   ` Brandon Williams
  2017-12-10 21:22 ` [PATCH 2/3] prune: fix pruning with multiple worktrees and split index Thomas Gummerer
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 77+ messages in thread
From: Thomas Gummerer @ 2017-12-10 21:22 UTC (permalink / raw)
  To: git; +Cc: Christian Couder, Lars Schneider, Brandon Williams, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy, Thomas Gummerer

repo_read_index calls read_index_from, which takes an path argument for
the location of the index file.  For the split index however it relies
on the current working directory to construct the path using git_path.

repo_read_index calls read_index_from with the full path for the index
file, however it doesn't change the cwd, so when split index mode is
turned on, read_index_from can't find the file for the split index.

For example t3007-ls-files-recurse-submodules.sh was broken with
GIT_TEST_SPLIT_INDEX set in 188dce131f ("ls-files: use repository
object", 2017-06-22), and t7814-grep-recurse-submodules.sh was also
broken in a similar manner, probably by introducing struct repository
there, although I didn't track down the exact commit for that.

Fix this by introducing a new read_index_for_repo function, which knows
about the correct paths for the submodules.

The alternative would have been to make the callers pass in the base
path for the split index, however that ended up being more complicated,
and I think we want to converge towards using struct repository for
things like these anyway.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
 cache.h      |  1 +
 read-cache.c | 19 +++++++++++++++++--
 repository.c |  2 +-
 3 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/cache.h b/cache.h
index cb5db7bf83..d42bea1ef7 100644
--- a/cache.h
+++ b/cache.h
@@ -614,6 +614,7 @@ extern int read_index_preload(struct index_state *, const struct pathspec *paths
 extern int do_read_index(struct index_state *istate, const char *path,
 			 int must_exist); /* for testting only! */
 extern int read_index_from(struct index_state *, const char *path);
+extern int read_index_for_repo(const struct repository *);
 extern int is_index_unborn(struct index_state *);
 extern int read_index_unmerged(struct index_state *);
 
diff --git a/read-cache.c b/read-cache.c
index 2eb81a66b9..4d5c4ad79b 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -20,6 +20,7 @@
 #include "split-index.h"
 #include "utf8.h"
 #include "fsmonitor.h"
+#include "repository.h"
 
 /* Mask for the name length in ce_flags in the on-disk index */
 
@@ -1871,7 +1872,8 @@ static void freshen_shared_index(char *base_sha1_hex, int warn)
 	free(shared_index);
 }
 
-int read_index_from(struct index_state *istate, const char *path)
+static int do_read_index_from(struct index_state *istate, const char *path,
+			      const struct repository *repo)
 {
 	struct split_index *split_index;
 	int ret;
@@ -1896,7 +1898,10 @@ int read_index_from(struct index_state *istate, const char *path)
 		split_index->base = xcalloc(1, sizeof(*split_index->base));
 
 	base_sha1_hex = sha1_to_hex(split_index->base_sha1);
-	base_path = git_path("sharedindex.%s", base_sha1_hex);
+	if (repo)
+		base_path = repo_git_path(repo, "sharedindex.%s", base_sha1_hex);
+	else
+		base_path = git_path("sharedindex.%s", base_sha1_hex);
 	ret = do_read_index(split_index->base, base_path, 1);
 	if (hashcmp(split_index->base_sha1, split_index->base->sha1))
 		die("broken index, expect %s in %s, got %s",
@@ -1909,6 +1914,16 @@ int read_index_from(struct index_state *istate, const char *path)
 	return ret;
 }
 
+int read_index_for_repo(const struct repository *repo)
+{
+	return do_read_index_from(repo->index, repo->index_file, repo);
+}
+
+int read_index_from(struct index_state *istate, const char *path)
+{
+	return do_read_index_from(istate, path, NULL);
+}
+
 int is_index_unborn(struct index_state *istate)
 {
 	return (!istate->cache_nr && !istate->timestamp.sec);
diff --git a/repository.c b/repository.c
index bb2fae5446..928b1f553d 100644
--- a/repository.c
+++ b/repository.c
@@ -229,5 +229,5 @@ int repo_read_index(struct repository *repo)
 	if (!repo->index)
 		repo->index = xcalloc(1, sizeof(*repo->index));
 
-	return read_index_from(repo->index, repo->index_file);
+	return read_index_for_repo(repo);
 }
-- 
2.15.1.504.g5279b80103


^ permalink raw reply	[flat|nested] 77+ messages in thread

* [PATCH 2/3] prune: fix pruning with multiple worktrees and split index
  2017-12-10 21:21 [PATCH 0/3] fixes for split index mode Thomas Gummerer
  2017-12-10 21:22 ` [PATCH 1/3] repository: fix repo_read_index with submodules Thomas Gummerer
@ 2017-12-10 21:22 ` Thomas Gummerer
  2017-12-11 19:09   ` Brandon Williams
  2017-12-10 21:22 ` [PATCH 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX Thomas Gummerer
  2017-12-17 22:51 ` [PATCH v2 0/3] fixes for split index mode Thomas Gummerer
  3 siblings, 1 reply; 77+ messages in thread
From: Thomas Gummerer @ 2017-12-10 21:22 UTC (permalink / raw)
  To: git; +Cc: Christian Couder, Lars Schneider, Brandon Williams, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy, Thomas Gummerer

be489d02d2 ("revision.c: --indexed-objects add objects from all
worktrees", 2017-08-23) made sure that pruning takes objects from all
worktrees into account.

It did that by reading the index of every worktree and adding the
necessary index objects to the set of pending objects.  The index is
read by read_index_from.  As mentioned in the previous commit,
read_index_from depends on the CWD for the location of the split index,
and add_index_objects_to_pending doesn't set that before using
read_index_from.

Instead of using read_index_from, use repo_read_index, which is aware of
the proper paths for the worktree.

This fixes t5304-prune when ran with GIT_TEST_SPLIT_INDEX set.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---

This also fixes t7009 when ran with GIT_TEST_SPLIT_INDEX.  I'm not
quite sure why it is fixed by this.  Either way I tracked the failure
down to f767178a5a ("Merge branch 'jk/no-null-sha1-in-cache-tree'",
2017-05-16).  Maybe Peff has an idea why this fixes that test?

 repository.c | 11 +++++++++++
 repository.h |  2 ++
 revision.c   | 13 ++++++++-----
 3 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/repository.c b/repository.c
index 928b1f553d..3c9bfbd1b8 100644
--- a/repository.c
+++ b/repository.c
@@ -2,6 +2,7 @@
 #include "repository.h"
 #include "config.h"
 #include "submodule-config.h"
+#include "worktree.h"
 
 /* The main repository */
 static struct repository the_repo = {
@@ -146,6 +147,16 @@ int repo_init(struct repository *repo, const char *gitdir, const char *worktree)
 	return -1;
 }
 
+/*
+ * Initialize 'repo' based on the provided worktree
+ * Return 0 upon success and a non-zero value upon failure.
+ */
+int repo_worktree_init(struct repository *repo, struct worktree *worktree)
+{
+	return repo_init(repo, get_worktree_git_dir(worktree),
+			 worktree->path);
+}
+
 /*
  * Initialize 'submodule' as the submodule given by 'path' in parent repository
  * 'superproject'.
diff --git a/repository.h b/repository.h
index 7f5e24a0a2..2adeb05bf4 100644
--- a/repository.h
+++ b/repository.h
@@ -4,6 +4,7 @@
 struct config_set;
 struct index_state;
 struct submodule_cache;
+struct worktree;
 
 struct repository {
 	/* Environment */
@@ -87,6 +88,7 @@ extern struct repository *the_repository;
 extern void repo_set_gitdir(struct repository *repo, const char *path);
 extern void repo_set_worktree(struct repository *repo, const char *path);
 extern int repo_init(struct repository *repo, const char *gitdir, const char *worktree);
+extern int repo_worktree_init(struct repository *repo, struct worktree *worktree);
 extern int repo_submodule_init(struct repository *submodule,
 			       struct repository *superproject,
 			       const char *path);
diff --git a/revision.c b/revision.c
index e2e691dd5a..9d8d9b96d1 100644
--- a/revision.c
+++ b/revision.c
@@ -22,6 +22,7 @@
 #include "packfile.h"
 #include "worktree.h"
 #include "argv-array.h"
+#include "repository.h"
 
 volatile show_early_output_fn_t show_early_output;
 
@@ -1346,15 +1347,17 @@ void add_index_objects_to_pending(struct rev_info *revs, unsigned int flags)
 	worktrees = get_worktrees(0);
 	for (p = worktrees; *p; p++) {
 		struct worktree *wt = *p;
-		struct index_state istate = { NULL };
+		struct repository *repo;
 
+		repo = xmalloc(sizeof(struct repository));
 		if (wt->is_current)
 			continue; /* current index already taken care of */
+		if (repo_worktree_init(repo, wt))
+			BUG("couldn't initialize repository object from worktree");
 
-		if (read_index_from(&istate,
-				    worktree_git_path(wt, "index")) > 0)
-			do_add_index_objects_to_pending(revs, &istate);
-		discard_index(&istate);
+		if (repo_read_index(repo) > 0)
+			do_add_index_objects_to_pending(revs, repo->index);
+		discard_index(repo->index);
 	}
 	free_worktrees(worktrees);
 }
-- 
2.15.1.504.g5279b80103


^ permalink raw reply	[flat|nested] 77+ messages in thread

* [PATCH 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX
  2017-12-10 21:21 [PATCH 0/3] fixes for split index mode Thomas Gummerer
  2017-12-10 21:22 ` [PATCH 1/3] repository: fix repo_read_index with submodules Thomas Gummerer
  2017-12-10 21:22 ` [PATCH 2/3] prune: fix pruning with multiple worktrees and split index Thomas Gummerer
@ 2017-12-10 21:22 ` Thomas Gummerer
  2017-12-10 23:37   ` Eric Sunshine
  2017-12-11 21:09   ` SZEDER Gábor
  2017-12-17 22:51 ` [PATCH v2 0/3] fixes for split index mode Thomas Gummerer
  3 siblings, 2 replies; 77+ messages in thread
From: Thomas Gummerer @ 2017-12-10 21:22 UTC (permalink / raw)
  To: git; +Cc: Christian Couder, Lars Schneider, Brandon Williams, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy, Thomas Gummerer

Make sure that split index doesn't get broken, by running it on travis
CI.

Run the test suite with split index enabled in linux 64 bit mode, and
leave split index turned off in 32-bit mode.  The laternative would be
to add an extra target in the matrix, enabling split index mode, but
that would only use additional cycles on travis and would not bring that
much benefit, as we are still running the test suite in the "vanilla"
version in the 32-bit mode.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
 .travis.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.travis.yml b/.travis.yml
index 281f101f31..c83c766dee 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -39,7 +39,7 @@ env:
 
 matrix:
   include:
-    - env: GETTEXT_POISON=YesPlease
+    - env: GETTEXT_POISON=YesPlease GIT_TEST_SPLIT_INDEX=YesPlease
       os: linux
       compiler:
       addons:
-- 
2.15.1.504.g5279b80103


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX
  2017-12-10 21:22 ` [PATCH 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX Thomas Gummerer
@ 2017-12-10 23:37   ` Eric Sunshine
  2017-12-11 21:09   ` SZEDER Gábor
  1 sibling, 0 replies; 77+ messages in thread
From: Eric Sunshine @ 2017-12-10 23:37 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: Git List, Christian Couder, Lars Schneider, Brandon Williams, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy

On Sun, Dec 10, 2017 at 4:22 PM, Thomas Gummerer <t.gummerer@gmail.com> wrote:
> Make sure that split index doesn't get broken, by running it on travis
> CI.
>
> Run the test suite with split index enabled in linux 64 bit mode, and
> leave split index turned off in 32-bit mode.  The laternative would be

s/laternative/alternative/

> to add an extra target in the matrix, enabling split index mode, but
> that would only use additional cycles on travis and would not bring that
> much benefit, as we are still running the test suite in the "vanilla"
> version in the 32-bit mode.
>
> Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 1/3] repository: fix repo_read_index with submodules
  2017-12-10 21:22 ` [PATCH 1/3] repository: fix repo_read_index with submodules Thomas Gummerer
@ 2017-12-11 18:54   ` Brandon Williams
  2017-12-11 20:37     ` Thomas Gummerer
  0 siblings, 1 reply; 77+ messages in thread
From: Brandon Williams @ 2017-12-11 18:54 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: git, Christian Couder, Lars Schneider, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy

On 12/10, Thomas Gummerer wrote:
> repo_read_index calls read_index_from, which takes an path argument for
> the location of the index file.  For the split index however it relies
> on the current working directory to construct the path using git_path.
> 
> repo_read_index calls read_index_from with the full path for the index
> file, however it doesn't change the cwd, so when split index mode is
> turned on, read_index_from can't find the file for the split index.
> 
> For example t3007-ls-files-recurse-submodules.sh was broken with
> GIT_TEST_SPLIT_INDEX set in 188dce131f ("ls-files: use repository
> object", 2017-06-22), and t7814-grep-recurse-submodules.sh was also
> broken in a similar manner, probably by introducing struct repository
> there, although I didn't track down the exact commit for that.
> 
> Fix this by introducing a new read_index_for_repo function, which knows
> about the correct paths for the submodules.
> 
> The alternative would have been to make the callers pass in the base
> path for the split index, however that ended up being more complicated,
> and I think we want to converge towards using struct repository for
> things like these anyway.

Thanks for catching this, I'm not a user of split index myself which is
why I unfortunately overlooked this.  Definitely a good change.  I
really only have one nit below.

> 
> Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
> ---
>  cache.h      |  1 +
>  read-cache.c | 19 +++++++++++++++++--
>  repository.c |  2 +-
>  3 files changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/cache.h b/cache.h
> index cb5db7bf83..d42bea1ef7 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -614,6 +614,7 @@ extern int read_index_preload(struct index_state *, const struct pathspec *paths
>  extern int do_read_index(struct index_state *istate, const char *path,
>  			 int must_exist); /* for testting only! */
>  extern int read_index_from(struct index_state *, const char *path);
> +extern int read_index_for_repo(const struct repository *);
>  extern int is_index_unborn(struct index_state *);
>  extern int read_index_unmerged(struct index_state *);
>  
> diff --git a/read-cache.c b/read-cache.c
> index 2eb81a66b9..4d5c4ad79b 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -20,6 +20,7 @@
>  #include "split-index.h"
>  #include "utf8.h"
>  #include "fsmonitor.h"
> +#include "repository.h"
>  
>  /* Mask for the name length in ce_flags in the on-disk index */
>  
> @@ -1871,7 +1872,8 @@ static void freshen_shared_index(char *base_sha1_hex, int warn)
>  	free(shared_index);
>  }
>  
> -int read_index_from(struct index_state *istate, const char *path)
> +static int do_read_index_from(struct index_state *istate, const char *path,
> +			      const struct repository *repo)
>  {
>  	struct split_index *split_index;
>  	int ret;
> @@ -1896,7 +1898,10 @@ int read_index_from(struct index_state *istate, const char *path)
>  		split_index->base = xcalloc(1, sizeof(*split_index->base));
>  
>  	base_sha1_hex = sha1_to_hex(split_index->base_sha1);
> -	base_path = git_path("sharedindex.%s", base_sha1_hex);
> +	if (repo)
> +		base_path = repo_git_path(repo, "sharedindex.%s", base_sha1_hex);
> +	else
> +		base_path = git_path("sharedindex.%s", base_sha1_hex);
>  	ret = do_read_index(split_index->base, base_path, 1);
>  	if (hashcmp(split_index->base_sha1, split_index->base->sha1))
>  		die("broken index, expect %s in %s, got %s",
> @@ -1909,6 +1914,16 @@ int read_index_from(struct index_state *istate, const char *path)
>  	return ret;
>  }
>  
> +int read_index_for_repo(const struct repository *repo)
> +{
> +	return do_read_index_from(repo->index, repo->index_file, repo);
> +}
> +
> +int read_index_from(struct index_state *istate, const char *path)
> +{
> +	return do_read_index_from(istate, path, NULL);
> +}

Instead of passing NULL and having to special case it in
'do_read_index_from()', how about we pass in 'the_repository'?

> +
>  int is_index_unborn(struct index_state *istate)
>  {
>  	return (!istate->cache_nr && !istate->timestamp.sec);
> diff --git a/repository.c b/repository.c
> index bb2fae5446..928b1f553d 100644
> --- a/repository.c
> +++ b/repository.c
> @@ -229,5 +229,5 @@ int repo_read_index(struct repository *repo)
>  	if (!repo->index)
>  		repo->index = xcalloc(1, sizeof(*repo->index));
>  
> -	return read_index_from(repo->index, repo->index_file);
> +	return read_index_for_repo(repo);
>  }
> -- 
> 2.15.1.504.g5279b80103
> 

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 2/3] prune: fix pruning with multiple worktrees and split index
  2017-12-10 21:22 ` [PATCH 2/3] prune: fix pruning with multiple worktrees and split index Thomas Gummerer
@ 2017-12-11 19:09   ` Brandon Williams
  2017-12-11 21:39     ` Thomas Gummerer
  0 siblings, 1 reply; 77+ messages in thread
From: Brandon Williams @ 2017-12-11 19:09 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: git, Christian Couder, Lars Schneider, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy

On 12/10, Thomas Gummerer wrote:
> be489d02d2 ("revision.c: --indexed-objects add objects from all
> worktrees", 2017-08-23) made sure that pruning takes objects from all
> worktrees into account.
> 
> It did that by reading the index of every worktree and adding the
> necessary index objects to the set of pending objects.  The index is
> read by read_index_from.  As mentioned in the previous commit,
> read_index_from depends on the CWD for the location of the split index,
> and add_index_objects_to_pending doesn't set that before using
> read_index_from.
> 
> Instead of using read_index_from, use repo_read_index, which is aware of
> the proper paths for the worktree.
> 
> This fixes t5304-prune when ran with GIT_TEST_SPLIT_INDEX set.
> 

I'm on the fence about this change.  I understand that this will ensure
that the proper objects aren't pruned when using a split index in the
presence of worktrees but I think the solution needs to be thought
through a bit more.

My big concern right now is the interaction of 'struct worktree's and
'struct repository'.  I'll try to highlight my concerns below.

> Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
> ---
> 
> This also fixes t7009 when ran with GIT_TEST_SPLIT_INDEX.  I'm not
> quite sure why it is fixed by this.  Either way I tracked the failure
> down to f767178a5a ("Merge branch 'jk/no-null-sha1-in-cache-tree'",
> 2017-05-16).  Maybe Peff has an idea why this fixes that test?
> 
>  repository.c | 11 +++++++++++
>  repository.h |  2 ++
>  revision.c   | 13 ++++++++-----
>  3 files changed, 21 insertions(+), 5 deletions(-)
> 
> diff --git a/repository.c b/repository.c
> index 928b1f553d..3c9bfbd1b8 100644
> --- a/repository.c
> +++ b/repository.c
> @@ -2,6 +2,7 @@
>  #include "repository.h"
>  #include "config.h"
>  #include "submodule-config.h"
> +#include "worktree.h"
>  
>  /* The main repository */
>  static struct repository the_repo = {
> @@ -146,6 +147,16 @@ int repo_init(struct repository *repo, const char *gitdir, const char *worktree)
>  	return -1;
>  }
>  
> +/*
> + * Initialize 'repo' based on the provided worktree
> + * Return 0 upon success and a non-zero value upon failure.
> + */
> +int repo_worktree_init(struct repository *repo, struct worktree *worktree)
> +{
> +	return repo_init(repo, get_worktree_git_dir(worktree),
> +			 worktree->path);
> +}

My first concern is the use of 'get_worktree_git_dir()'.  Under the hood
it calls 'get_git_dir()', 'get_git_common_dir()', and
'git_common_path()' which rely on global state as stored in
'the_repository'.  So how does one initialize a repository struct (using
this initializer) using a worktree from a repository other than the
global 'the_repository' struct?  I'm not sure I have an answer right
now, but its an issue that needs to be thought through before we head
down this road.

Just thinking to myself, Does it make sense to have worktree's as a
separate struct or to have them stored in 'struct repository' in some
way?  Shouldn't a repository struct have a way to interact with all of
its worktrees?  How would initializing a repository struct for every
worktree work once we migrate the object store to be stored in 'struct
repoisotry'?  Shouldn't every worktree share the same object store
in-memory like they do on-disk?

> +
>  /*
>   * Initialize 'submodule' as the submodule given by 'path' in parent repository
>   * 'superproject'.
> diff --git a/repository.h b/repository.h
> index 7f5e24a0a2..2adeb05bf4 100644
> --- a/repository.h
> +++ b/repository.h
> @@ -4,6 +4,7 @@
>  struct config_set;
>  struct index_state;
>  struct submodule_cache;
> +struct worktree;
>  
>  struct repository {
>  	/* Environment */
> @@ -87,6 +88,7 @@ extern struct repository *the_repository;
>  extern void repo_set_gitdir(struct repository *repo, const char *path);
>  extern void repo_set_worktree(struct repository *repo, const char *path);
>  extern int repo_init(struct repository *repo, const char *gitdir, const char *worktree);
> +extern int repo_worktree_init(struct repository *repo, struct worktree *worktree);
>  extern int repo_submodule_init(struct repository *submodule,
>  			       struct repository *superproject,
>  			       const char *path);
> diff --git a/revision.c b/revision.c
> index e2e691dd5a..9d8d9b96d1 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -22,6 +22,7 @@
>  #include "packfile.h"
>  #include "worktree.h"
>  #include "argv-array.h"
> +#include "repository.h"
>  
>  volatile show_early_output_fn_t show_early_output;
>  
> @@ -1346,15 +1347,17 @@ void add_index_objects_to_pending(struct rev_info *revs, unsigned int flags)
>  	worktrees = get_worktrees(0);
>  	for (p = worktrees; *p; p++) {
>  		struct worktree *wt = *p;
> -		struct index_state istate = { NULL };
> +		struct repository *repo;
>  
> +		repo = xmalloc(sizeof(struct repository));

This was allocated but never freed, was that intentional?

>  		if (wt->is_current)
>  			continue; /* current index already taken care of */
> +		if (repo_worktree_init(repo, wt))
> +			BUG("couldn't initialize repository object from worktree");
>  
> -		if (read_index_from(&istate,
> -				    worktree_git_path(wt, "index")) > 0)
> -			do_add_index_objects_to_pending(revs, &istate);
> -		discard_index(&istate);
> +		if (repo_read_index(repo) > 0)
> +			do_add_index_objects_to_pending(revs, repo->index);
> +		discard_index(repo->index);

One we have separate object stores per-repository how would we handle
this since this pruning should only work on a single repository's object
store?

>  	}
>  	free_worktrees(worktrees);
>  }
> -- 
> 2.15.1.504.g5279b80103
> 

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 1/3] repository: fix repo_read_index with submodules
  2017-12-11 18:54   ` Brandon Williams
@ 2017-12-11 20:37     ` Thomas Gummerer
  0 siblings, 0 replies; 77+ messages in thread
From: Thomas Gummerer @ 2017-12-11 20:37 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, Christian Couder, Lars Schneider, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy

On 12/11, Brandon Williams wrote:
> On 12/10, Thomas Gummerer wrote:
> > repo_read_index calls read_index_from, which takes an path argument for
> > the location of the index file.  For the split index however it relies
> > on the current working directory to construct the path using git_path.
> > 
> > repo_read_index calls read_index_from with the full path for the index
> > file, however it doesn't change the cwd, so when split index mode is
> > turned on, read_index_from can't find the file for the split index.
> > 
> > For example t3007-ls-files-recurse-submodules.sh was broken with
> > GIT_TEST_SPLIT_INDEX set in 188dce131f ("ls-files: use repository
> > object", 2017-06-22), and t7814-grep-recurse-submodules.sh was also
> > broken in a similar manner, probably by introducing struct repository
> > there, although I didn't track down the exact commit for that.
> > 
> > Fix this by introducing a new read_index_for_repo function, which knows
> > about the correct paths for the submodules.
> > 
> > The alternative would have been to make the callers pass in the base
> > path for the split index, however that ended up being more complicated,
> > and I think we want to converge towards using struct repository for
> > things like these anyway.
> 
> Thanks for catching this, I'm not a user of split index myself which is
> why I unfortunately overlooked this.  Definitely a good change.  I
> really only have one nit below.

Me neither, I just remember to run the split index tests every once in
a while, which is also why it's taken so long to catch this.
Hopefully after we fixed this we can get travis to run the tests,
which should help :)

> > 
> > Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
> > ---
> >  cache.h      |  1 +
> >  read-cache.c | 19 +++++++++++++++++--
> >  repository.c |  2 +-
> >  3 files changed, 19 insertions(+), 3 deletions(-)
> > 
> > diff --git a/cache.h b/cache.h
> > index cb5db7bf83..d42bea1ef7 100644
> > --- a/cache.h
> > +++ b/cache.h
> > @@ -614,6 +614,7 @@ extern int read_index_preload(struct index_state *, const struct pathspec *paths
> >  extern int do_read_index(struct index_state *istate, const char *path,
> >  			 int must_exist); /* for testting only! */
> >  extern int read_index_from(struct index_state *, const char *path);
> > +extern int read_index_for_repo(const struct repository *);
> >  extern int is_index_unborn(struct index_state *);
> >  extern int read_index_unmerged(struct index_state *);
> >  
> > diff --git a/read-cache.c b/read-cache.c
> > index 2eb81a66b9..4d5c4ad79b 100644
> > --- a/read-cache.c
> > +++ b/read-cache.c
> > @@ -20,6 +20,7 @@
> >  #include "split-index.h"
> >  #include "utf8.h"
> >  #include "fsmonitor.h"
> > +#include "repository.h"
> >  
> >  /* Mask for the name length in ce_flags in the on-disk index */
> >  
> > @@ -1871,7 +1872,8 @@ static void freshen_shared_index(char *base_sha1_hex, int warn)
> >  	free(shared_index);
> >  }
> >  
> > -int read_index_from(struct index_state *istate, const char *path)
> > +static int do_read_index_from(struct index_state *istate, const char *path,
> > +			      const struct repository *repo)
> >  {
> >  	struct split_index *split_index;
> >  	int ret;
> > @@ -1896,7 +1898,10 @@ int read_index_from(struct index_state *istate, const char *path)
> >  		split_index->base = xcalloc(1, sizeof(*split_index->base));
> >  
> >  	base_sha1_hex = sha1_to_hex(split_index->base_sha1);
> > -	base_path = git_path("sharedindex.%s", base_sha1_hex);
> > +	if (repo)
> > +		base_path = repo_git_path(repo, "sharedindex.%s", base_sha1_hex);
> > +	else
> > +		base_path = git_path("sharedindex.%s", base_sha1_hex);
> >  	ret = do_read_index(split_index->base, base_path, 1);
> >  	if (hashcmp(split_index->base_sha1, split_index->base->sha1))
> >  		die("broken index, expect %s in %s, got %s",
> > @@ -1909,6 +1914,16 @@ int read_index_from(struct index_state *istate, const char *path)
> >  	return ret;
> >  }
> >  
> > +int read_index_for_repo(const struct repository *repo)
> > +{
> > +	return do_read_index_from(repo->index, repo->index_file, repo);
> > +}
> > +
> > +int read_index_from(struct index_state *istate, const char *path)
> > +{
> > +	return do_read_index_from(istate, path, NULL);
> > +}
> 
> Instead of passing NULL and having to special case it in
> 'do_read_index_from()', how about we pass in 'the_repository'?

I think that makes sense the only way this function used to work with
split index turned on is if it's called from the main repository, so
just passing through 'the_repository' would have the function behave
the exact same way as before.

> > +
> >  int is_index_unborn(struct index_state *istate)
> >  {
> >  	return (!istate->cache_nr && !istate->timestamp.sec);
> > diff --git a/repository.c b/repository.c
> > index bb2fae5446..928b1f553d 100644
> > --- a/repository.c
> > +++ b/repository.c
> > @@ -229,5 +229,5 @@ int repo_read_index(struct repository *repo)
> >  	if (!repo->index)
> >  		repo->index = xcalloc(1, sizeof(*repo->index));
> >  
> > -	return read_index_from(repo->index, repo->index_file);
> > +	return read_index_for_repo(repo);
> >  }
> > -- 
> > 2.15.1.504.g5279b80103
> > 
> 
> -- 
> Brandon Williams

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX
  2017-12-10 21:22 ` [PATCH 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX Thomas Gummerer
  2017-12-10 23:37   ` Eric Sunshine
@ 2017-12-11 21:09   ` SZEDER Gábor
  2017-12-11 21:42     ` Thomas Gummerer
  1 sibling, 1 reply; 77+ messages in thread
From: SZEDER Gábor @ 2017-12-11 21:09 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: SZEDER Gábor, Christian Couder, Lars Schneider, Brandon Williams, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy, git

> Make sure that split index doesn't get broken, by running it on travis
> CI.
> 
> Run the test suite with split index enabled in linux 64 bit mode, and
> leave split index turned off in 32-bit mode.

This doesn't accurately describe what the patch does.
Travis CI runs three 64 bit Linux build jobs for us: one compiled with
Clang, one with GCC, and one with GETTEXT_POISON enabled.  This patch
enables split index only in the latter build job, but not in the Clang
and GCC build jobs.

>  The laternative would be
> to add an extra target in the matrix, enabling split index mode, but
> that would only use additional cycles on travis and would not bring that
> much benefit, as we are still running the test suite in the "vanilla"
> version in the 32-bit mode.
> 
> Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
> ---
>  .travis.yml | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/.travis.yml b/.travis.yml
> index 281f101f31..c83c766dee 100644
> --- a/.travis.yml
> +++ b/.travis.yml
> @@ -39,7 +39,7 @@ env:
>  
>  matrix:
>    include:
> -    - env: GETTEXT_POISON=YesPlease
> +    - env: GETTEXT_POISON=YesPlease GIT_TEST_SPLIT_INDEX=YesPlease
>        os: linux
>        compiler:
>        addons:
> -- 
> 2.15.1.504.g5279b80103

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 2/3] prune: fix pruning with multiple worktrees and split index
  2017-12-11 19:09   ` Brandon Williams
@ 2017-12-11 21:39     ` Thomas Gummerer
  0 siblings, 0 replies; 77+ messages in thread
From: Thomas Gummerer @ 2017-12-11 21:39 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, Christian Couder, Lars Schneider, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy

On 12/11, Brandon Williams wrote:
> On 12/10, Thomas Gummerer wrote:
> > be489d02d2 ("revision.c: --indexed-objects add objects from all
> > worktrees", 2017-08-23) made sure that pruning takes objects from all
> > worktrees into account.
> > 
> > It did that by reading the index of every worktree and adding the
> > necessary index objects to the set of pending objects.  The index is
> > read by read_index_from.  As mentioned in the previous commit,
> > read_index_from depends on the CWD for the location of the split index,
> > and add_index_objects_to_pending doesn't set that before using
> > read_index_from.
> > 
> > Instead of using read_index_from, use repo_read_index, which is aware of
> > the proper paths for the worktree.
> > 
> > This fixes t5304-prune when ran with GIT_TEST_SPLIT_INDEX set.
> > 
> 
> I'm on the fence about this change.  I understand that this will ensure
> that the proper objects aren't pruned when using a split index in the
> presence of worktrees but I think the solution needs to be thought
> through a bit more.
> 
> My big concern right now is the interaction of 'struct worktree's and
> 'struct repository'.  I'll try to highlight my concerns below.

Thanks for the review!  My thoughts on the concerns are below.

> > Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
> > ---
> > 
> > This also fixes t7009 when ran with GIT_TEST_SPLIT_INDEX.  I'm not
> > quite sure why it is fixed by this.  Either way I tracked the failure
> > down to f767178a5a ("Merge branch 'jk/no-null-sha1-in-cache-tree'",
> > 2017-05-16).  Maybe Peff has an idea why this fixes that test?
> > 
> >  repository.c | 11 +++++++++++
> >  repository.h |  2 ++
> >  revision.c   | 13 ++++++++-----
> >  3 files changed, 21 insertions(+), 5 deletions(-)
> > 
> > diff --git a/repository.c b/repository.c
> > index 928b1f553d..3c9bfbd1b8 100644
> > --- a/repository.c
> > +++ b/repository.c
> > @@ -2,6 +2,7 @@
> >  #include "repository.h"
> >  #include "config.h"
> >  #include "submodule-config.h"
> > +#include "worktree.h"
> >  
> >  /* The main repository */
> >  static struct repository the_repo = {
> > @@ -146,6 +147,16 @@ int repo_init(struct repository *repo, const char *gitdir, const char *worktree)
> >  	return -1;
> >  }
> >  
> > +/*
> > + * Initialize 'repo' based on the provided worktree
> > + * Return 0 upon success and a non-zero value upon failure.
> > + */
> > +int repo_worktree_init(struct repository *repo, struct worktree *worktree)
> > +{
> > +	return repo_init(repo, get_worktree_git_dir(worktree),
> > +			 worktree->path);
> > +}
> 
> My first concern is the use of 'get_worktree_git_dir()'.  Under the hood
> it calls 'get_git_dir()', 'get_git_common_dir()', and
> 'git_common_path()' which rely on global state as stored in
> 'the_repository'.  So how does one initialize a repository struct (using
> this initializer) using a worktree from a repository other than the
> global 'the_repository' struct?  I'm not sure I have an answer right
> now, but its an issue that needs to be thought through before we head
> down this road.

The main reason for just using 'get_worktree_git_dir()' was because it
exists and it does exactly what I needed.  If we ever have the need to
initialize a repository struct for a worktree from another repository
struct, I would imagine we introduce a new function
'repo_get_worktree_git_dir()', which takes a struct repository as
input and returns the worktree git dir based on that.

And then we'd have to add a different initializer
'repo_worktree_init_from_repo()' or something like that, where we
could initialize a worktree from a repository struct different from
'the_repository'.  But maybe there are some complications there that I
didn't think about?

> Just thinking to myself, Does it make sense to have worktree's as a
> separate struct or to have them stored in 'struct repository' in some
> way?

I'm not sure I have a good answer for this.  Looking at the libgit2
source, which has a repository struct for (I assume at least, I'm not
very familiar with the libgit2 source :)), they have a repository
struct for each worktree, with a flag specifying whether the struct is
for a worktree, or a "normal" git repository/the main worktree.

Obviously that doesn't mean we should do it the same way, but it
probably also doesn't hurt to look at prior art.

> Shouldn't a repository struct have a way to interact with all of
> its worktrees?

It could go either way I guess depending on how we want to design it.
If we want the repository struct to interact with all worktrees, it
seems to me like we would almost need some kind of 'struct
super_repository', which then contains a list of 'struct repository's
representing each worktree, probably including the main worktree.

But then again in normal operation we probably only want the
repository struct for the current repository we're working with, not
all of the worktrees.  I suspect it's only in some special cases where
we want to have all worktrees in memory at once (such as in the prune
case).

Or how would you haven envisioned this?

> How would initializing a repository struct for every
> worktree work once we migrate the object store to be stored in 'struct
> repoisotry'?  Shouldn't every worktree share the same object store
> in-memory like they do on-disk?

I haven't thought this far.  They probably want to share the same
object store, if they all want the object store to be loaded.  But
we can do that either way, whether we have one 'struct repository'
that has links to the worktrees, or we have multiple 'struct
repository's, which each have a ref-counted pointer to the object
store or something like that.

> > +
> >  /*
> >   * Initialize 'submodule' as the submodule given by 'path' in parent repository
> >   * 'superproject'.
> > diff --git a/repository.h b/repository.h
> > index 7f5e24a0a2..2adeb05bf4 100644
> > --- a/repository.h
> > +++ b/repository.h
> > @@ -4,6 +4,7 @@
> >  struct config_set;
> >  struct index_state;
> >  struct submodule_cache;
> > +struct worktree;
> >  
> >  struct repository {
> >  	/* Environment */
> > @@ -87,6 +88,7 @@ extern struct repository *the_repository;
> >  extern void repo_set_gitdir(struct repository *repo, const char *path);
> >  extern void repo_set_worktree(struct repository *repo, const char *path);
> >  extern int repo_init(struct repository *repo, const char *gitdir, const char *worktree);
> > +extern int repo_worktree_init(struct repository *repo, struct worktree *worktree);
> >  extern int repo_submodule_init(struct repository *submodule,
> >  			       struct repository *superproject,
> >  			       const char *path);
> > diff --git a/revision.c b/revision.c
> > index e2e691dd5a..9d8d9b96d1 100644
> > --- a/revision.c
> > +++ b/revision.c
> > @@ -22,6 +22,7 @@
> >  #include "packfile.h"
> >  #include "worktree.h"
> >  #include "argv-array.h"
> > +#include "repository.h"
> >  
> >  volatile show_early_output_fn_t show_early_output;
> >  
> > @@ -1346,15 +1347,17 @@ void add_index_objects_to_pending(struct rev_info *revs, unsigned int flags)
> >  	worktrees = get_worktrees(0);
> >  	for (p = worktrees; *p; p++) {
> >  		struct worktree *wt = *p;
> > -		struct index_state istate = { NULL };
> > +		struct repository *repo;
> >  
> > +		repo = xmalloc(sizeof(struct repository));
> 
> This was allocated but never freed, was that intentional?

No, that was an oversight, will fix, thanks!

> >  		if (wt->is_current)
> >  			continue; /* current index already taken care of */
> > +		if (repo_worktree_init(repo, wt))
> > +			BUG("couldn't initialize repository object from worktree");
> >  
> > -		if (read_index_from(&istate,
> > -				    worktree_git_path(wt, "index")) > 0)
> > -			do_add_index_objects_to_pending(revs, &istate);
> > -		discard_index(&istate);
> > +		if (repo_read_index(repo) > 0)
> > +			do_add_index_objects_to_pending(revs, repo->index);
> > +		discard_index(repo->index);
> 
> One we have separate object stores per-repository how would we handle
> this since this pruning should only work on a single repository's object
> store?

That depends a lot on how we design the object store and how it's held
by the repository struct.  If we only have one in-memory
representation of it, the answer is obvious.  If we have multiple
in-memory representations of it at the same time, things will get a
bit more hairy.  But until we have at least a design of how that would
look like it's quite hard to tell I'd say.

> >  	}
> >  	free_worktrees(worktrees);
> >  }
> > -- 
> > 2.15.1.504.g5279b80103
> > 
> 
> -- 
> Brandon Williams

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX
  2017-12-11 21:09   ` SZEDER Gábor
@ 2017-12-11 21:42     ` Thomas Gummerer
  2017-12-12 15:54       ` Lars Schneider
  0 siblings, 1 reply; 77+ messages in thread
From: Thomas Gummerer @ 2017-12-11 21:42 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Christian Couder, Lars Schneider, Brandon Williams, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy, git

On 12/11, SZEDER Gábor wrote:
> > Make sure that split index doesn't get broken, by running it on travis
> > CI.
> > 
> > Run the test suite with split index enabled in linux 64 bit mode, and
> > leave split index turned off in 32-bit mode.
> 
> This doesn't accurately describe what the patch does.
> Travis CI runs three 64 bit Linux build jobs for us: one compiled with
> Clang, one with GCC, and one with GETTEXT_POISON enabled.  This patch
> enables split index only in the latter build job, but not in the Clang
> and GCC build jobs.

You're right, it's my first time using travis CI and I got confused
about how the .travis.yml works, thanks for catching that.  Will
re-phrase the commit message.

> >  The laternative would be
> > to add an extra target in the matrix, enabling split index mode, but
> > that would only use additional cycles on travis and would not bring that
> > much benefit, as we are still running the test suite in the "vanilla"
> > version in the 32-bit mode.
> > 
> > Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
> > ---
> >  .travis.yml | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/.travis.yml b/.travis.yml
> > index 281f101f31..c83c766dee 100644
> > --- a/.travis.yml
> > +++ b/.travis.yml
> > @@ -39,7 +39,7 @@ env:
> >  
> >  matrix:
> >    include:
> > -    - env: GETTEXT_POISON=YesPlease
> > +    - env: GETTEXT_POISON=YesPlease GIT_TEST_SPLIT_INDEX=YesPlease
> >        os: linux
> >        compiler:
> >        addons:
> > -- 
> > 2.15.1.504.g5279b80103

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX
  2017-12-11 21:42     ` Thomas Gummerer
@ 2017-12-12 15:54       ` Lars Schneider
  2017-12-12 19:15         ` Junio C Hamano
  0 siblings, 1 reply; 77+ messages in thread
From: Lars Schneider @ 2017-12-12 15:54 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: SZEDER Gábor, Christian Couder, Brandon Williams, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy, git


> On 11 Dec 2017, at 22:42, Thomas Gummerer <t.gummerer@gmail.com> wrote:
> 
> On 12/11, SZEDER Gábor wrote:
>>> Make sure that split index doesn't get broken, by running it on travis
>>> CI.
>>> 
>>> Run the test suite with split index enabled in linux 64 bit mode, and
>>> leave split index turned off in 32-bit mode.
>> 
>> This doesn't accurately describe what the patch does.
>> Travis CI runs three 64 bit Linux build jobs for us: one compiled with
>> Clang, one with GCC, and one with GETTEXT_POISON enabled.  This patch
>> enables split index only in the latter build job, but not in the Clang
>> and GCC build jobs.
> 
> You're right, it's my first time using travis CI and I got confused
> about how the .travis.yml works, thanks for catching that.  Will
> re-phrase the commit message.

Szeder is spot on. If you fix up the message, then this patch looks
perfect! Pragmatic (= very good!) idea to combine GITTEXT_POISON with 
GIT_TEST_SPLIT_INDEX :-)

Thanks,
Lars


> 
>>> The laternative would be
>>> to add an extra target in the matrix, enabling split index mode, but
>>> that would only use additional cycles on travis and would not bring that
>>> much benefit, as we are still running the test suite in the "vanilla"
>>> version in the 32-bit mode.
>>> 
>>> Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
>>> ---
>>> .travis.yml | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>> 
>>> diff --git a/.travis.yml b/.travis.yml
>>> index 281f101f31..c83c766dee 100644
>>> --- a/.travis.yml
>>> +++ b/.travis.yml
>>> @@ -39,7 +39,7 @@ env:
>>> 
>>> matrix:
>>>   include:
>>> -    - env: GETTEXT_POISON=YesPlease
>>> +    - env: GETTEXT_POISON=YesPlease GIT_TEST_SPLIT_INDEX=YesPlease
>>>       os: linux
>>>       compiler:
>>>       addons:
>>> -- 
>>> 2.15.1.504.g5279b80103


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX
  2017-12-12 15:54       ` Lars Schneider
@ 2017-12-12 19:15         ` Junio C Hamano
  2017-12-12 20:15           ` Thomas Gummerer
  2017-12-13 17:21           ` Lars Schneider
  0 siblings, 2 replies; 77+ messages in thread
From: Junio C Hamano @ 2017-12-12 19:15 UTC (permalink / raw)
  To: Lars Schneider; +Cc: Thomas Gummerer, SZEDER Gábor, Christian Couder, Brandon Williams, Jeff King, Nguyễn Thái Ngọc Duy, git

Lars Schneider <larsxschneider@gmail.com> writes:

>> You're right, it's my first time using travis CI and I got confused
>> about how the .travis.yml works, thanks for catching that.  Will
>> re-phrase the commit message.
>
> Szeder is spot on. If you fix up the message, then this patch looks
> perfect! Pragmatic (= very good!) idea to combine GITTEXT_POISON with 
> GIT_TEST_SPLIT_INDEX :-)

I am failing to guess the real intent of the smiley here.

If split-index code is so easy to break, I do not think it is a good
idea to combine it into the poison build.  In fact, the poison test
is useless on a codebase where other/real breakages are expected to
exist, because it is about seeing messages meant for non-humans are
not passed to the _() mechanism by sloppy coding, and the way it
does so is to corrupt all the messages that come through the _()
mechanism.  If we do not even produce a message when a correct code
_should_ produce one, poison test would catch nothing useful.

I wonder if it makes more sense to update ci/run-tests.sh so that
its final step is run twice with different settings, like so?

 ci/run-tests.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/ci/run-tests.sh b/ci/run-tests.sh
index f0c743de94..15a5f5a6cc 100755
--- a/ci/run-tests.sh
+++ b/ci/run-tests.sh
@@ -8,3 +8,4 @@
 mkdir -p $HOME/travis-cache
 ln -s $HOME/travis-cache/.prove t/.prove
 make --quiet test
+GIT_TEST_SPLIT_INDEX=LetsTryIt make --quiet test

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX
  2017-12-12 19:15         ` Junio C Hamano
@ 2017-12-12 20:15           ` Thomas Gummerer
  2017-12-12 20:51             ` Junio C Hamano
  2017-12-13 17:21           ` Lars Schneider
  1 sibling, 1 reply; 77+ messages in thread
From: Thomas Gummerer @ 2017-12-12 20:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Lars Schneider, SZEDER Gábor, Christian Couder, Brandon Williams, Jeff King, Nguyễn Thái Ngọc Duy, git

On 12/12, Junio C Hamano wrote:
> Lars Schneider <larsxschneider@gmail.com> writes:
> 
> >> You're right, it's my first time using travis CI and I got confused
> >> about how the .travis.yml works, thanks for catching that.  Will
> >> re-phrase the commit message.
> >
> > Szeder is spot on. If you fix up the message, then this patch looks
> > perfect! Pragmatic (= very good!) idea to combine GITTEXT_POISON with 
> > GIT_TEST_SPLIT_INDEX :-)
> 
> I am failing to guess the real intent of the smiley here.
> 
> If split-index code is so easy to break, I do not think it is a good
> idea to combine it into the poison build.  In fact, the poison test
> is useless on a codebase where other/real breakages are expected to
> exist, because it is about seeing messages meant for non-humans are
> not passed to the _() mechanism by sloppy coding, and the way it
> does so is to corrupt all the messages that come through the _()
> mechanism.  If we do not even produce a message when a correct code
> _should_ produce one, poison test would catch nothing useful.

The breakages wen the split-index code fails tend to break things in
much more obvious manners than a wrong message, usually git ends up
dying if it gets broken.  Both of the bugs that were fixed here would
have been caught with the change in my patch.

But yeah I can see the argument that it doesn't give us a guarantee
that it catches all things the test suite could catch.

> I wonder if it makes more sense to update ci/run-tests.sh so that
> its final step is run twice with different settings, like so?

I kind of wanted to avoid that, because it ends up running the test
suite twice on travis for every test, which seems a bit overkill.  But
I don't exactly how worried we are about cycles on travis (I don't
check it very often personally, so I don't really know).  If we aren't
worried about cycles what you have below would certainly make sense.

>  ci/run-tests.sh | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/ci/run-tests.sh b/ci/run-tests.sh
> index f0c743de94..15a5f5a6cc 100755
> --- a/ci/run-tests.sh
> +++ b/ci/run-tests.sh
> @@ -8,3 +8,4 @@
>  mkdir -p $HOME/travis-cache
>  ln -s $HOME/travis-cache/.prove t/.prove
>  make --quiet test
> +GIT_TEST_SPLIT_INDEX=LetsTryIt make --quiet test

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX
  2017-12-12 20:15           ` Thomas Gummerer
@ 2017-12-12 20:51             ` Junio C Hamano
  2017-12-13 23:21               ` Thomas Gummerer
  0 siblings, 1 reply; 77+ messages in thread
From: Junio C Hamano @ 2017-12-12 20:51 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: Lars Schneider, SZEDER Gábor, Christian Couder, Brandon Williams, Jeff King, Nguyễn Thái Ngọc Duy, git

Thomas Gummerer <t.gummerer@gmail.com> writes:

>
> The breakages wen the split-index code fails tend to break things in
> much more obvious manners than a wrong message, usually git ends up
> dying if it gets broken.  Both of the bugs that were fixed here would
> have been caught with the change in my patch.
>
> But yeah I can see the argument that it doesn't give us a guarantee
> that it catches all things the test suite could catch.

I think you misunderstood me.  When split index is much easier to
break than poison tests, combining them together would hurt the test
coverage of poison tests.  If you value poison tests much more than
how well split index mode works, that is a worse outcome.

>> I wonder if it makes more sense to update ci/run-tests.sh so that
>> its final step is run twice with different settings, like so?
>
> I kind of wanted to avoid that, because it ends up running the test
> suite twice on travis for every test, which seems a bit overkill.  But
> I don't exactly how worried we are about cycles on travis (I don't
> check it very often personally, so I don't really know).  If we aren't
> worried about cycles what you have below would certainly make sense.

I think 64-bit gcc/clang builds tend to cost us about 10-20 minutes
each, and 32-bit linux builds about 10 minutes or so.  I wonder if
it makes sense to do the "run twice" for only one of 64-bit builds
(perhaps the clang one, as I suspect 32-bit linux one uses gcc) and
the 32-bit linux build, and nowhere else.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX
  2017-12-12 19:15         ` Junio C Hamano
  2017-12-12 20:15           ` Thomas Gummerer
@ 2017-12-13 17:21           ` Lars Schneider
  2017-12-13 17:38             ` Junio C Hamano
  1 sibling, 1 reply; 77+ messages in thread
From: Lars Schneider @ 2017-12-13 17:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Thomas Gummerer, SZEDER Gábor, Christian Couder, Brandon Williams, Jeff King, Nguyễn Thái Ngọc Duy, git


> On 12 Dec 2017, at 20:15, Junio C Hamano <gitster@pobox.com> wrote:
> 
> Lars Schneider <larsxschneider@gmail.com> writes:
> 
>>> You're right, it's my first time using travis CI and I got confused
>>> about how the .travis.yml works, thanks for catching that.  Will
>>> re-phrase the commit message.
>> 
>> Szeder is spot on. If you fix up the message, then this patch looks
>> perfect! Pragmatic (= very good!) idea to combine GITTEXT_POISON with 
>> GIT_TEST_SPLIT_INDEX :-)
> 
> I am failing to guess the real intent of the smiley here.

No real reason. I was just happy to see that Travis CI seems to
be useful for the Git project.


> If split-index code is so easy to break, I do not think it is a good
> idea to combine it into the poison build.  In fact, the poison test
> is useless on a codebase where other/real breakages are expected to
> exist, because it is about seeing messages meant for non-humans are
> not passed to the _() mechanism by sloppy coding, and the way it
> does so is to corrupt all the messages that come through the _()
> mechanism.  If we do not even produce a message when a correct code
> _should_ produce one, poison test would catch nothing useful.
> 
> I wonder if it makes more sense to update ci/run-tests.sh so that
> its final step is run twice with different settings, like so?

Agreed - I didn't think it through. Let's keep it separate then.

I think your solution points into the right direction.
Right now we have the following test matrix:

1. Linux - clang
2. Linux - gcc
3. Mac - clang
4. Mac - gcc
5. Linux - gcc - GET_TEXT_POISION
6. Linux - gcc - 32bit
7. Windows

AFAIK your solution would run the split index test for 
1, 2, 3, and 4. I think that is too much.

1 runs the fastest and I would like to keep it that way
to get a quick "general" result. I think only 2 should be
extended in the way you are suggesting. We could run
the tests with different env variables there. What else
do we have besides GIT_TEST_SPLIT_INDEX?

Would that work for everyone?

- Lars


> ci/run-tests.sh | 1 +
> 1 file changed, 1 insertion(+)
> 
> diff --git a/ci/run-tests.sh b/ci/run-tests.sh
> index f0c743de94..15a5f5a6cc 100755
> --- a/ci/run-tests.sh
> +++ b/ci/run-tests.sh
> @@ -8,3 +8,4 @@
> mkdir -p $HOME/travis-cache
> ln -s $HOME/travis-cache/.prove t/.prove
> make --quiet test
> +GIT_TEST_SPLIT_INDEX=LetsTryIt make --quiet test


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX
  2017-12-13 17:21           ` Lars Schneider
@ 2017-12-13 17:38             ` Junio C Hamano
  2017-12-13 17:46               ` Lars Schneider
  0 siblings, 1 reply; 77+ messages in thread
From: Junio C Hamano @ 2017-12-13 17:38 UTC (permalink / raw)
  To: Lars Schneider; +Cc: Thomas Gummerer, SZEDER Gábor, Christian Couder, Brandon Williams, Jeff King, Nguyễn Thái Ngọc Duy, git

Lars Schneider <larsxschneider@gmail.com> writes:

> I think your solution points into the right direction.
> Right now we have the following test matrix:
>
> 1. Linux - clang
> 2. Linux - gcc
> 3. Mac - clang
> 4. Mac - gcc
> 5. Linux - gcc - GET_TEXT_POISION
> 6. Linux - gcc - 32bit
> 7. Windows
>
> AFAIK your solution would run the split index test for 
> 1, 2, 3, and 4. I think that is too much.

Not that it matters too much, but I meant to add it to 1. and
6. when I said "only one of 64-bit build plus 32-bit one".  

> 1 runs the fastest and I would like to keep it that way
> to get a quick "general" result. I think only 2 should be
> extended in the way you are suggesting. We could run
> the tests with different env variables there. What else
> do we have besides GIT_TEST_SPLIT_INDEX?
>
> Would that work for everyone?

I am OK to make 2. use split-index (which unfortunately would mean
we lose tests without split-index under gcc), or add 2.5 that is a
copy of 2. plus split-index.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX
  2017-12-13 17:38             ` Junio C Hamano
@ 2017-12-13 17:46               ` Lars Schneider
  2017-12-13 23:28                 ` Thomas Gummerer
  0 siblings, 1 reply; 77+ messages in thread
From: Lars Schneider @ 2017-12-13 17:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Thomas Gummerer, SZEDER Gábor, Christian Couder, Brandon Williams, Jeff King, Nguyễn Thái Ngọc Duy, git


> On 13 Dec 2017, at 18:38, Junio C Hamano <gitster@pobox.com> wrote:
> 
> Lars Schneider <larsxschneider@gmail.com> writes:
> 
>> I think your solution points into the right direction.
>> Right now we have the following test matrix:
>> 
>> 1. Linux - clang
>> 2. Linux - gcc
>> 3. Mac - clang
>> 4. Mac - gcc
>> 5. Linux - gcc - GET_TEXT_POISION
>> 6. Linux - gcc - 32bit
>> 7. Windows
>> 
>> AFAIK your solution would run the split index test for 
>> 1, 2, 3, and 4. I think that is too much.
> 
> Not that it matters too much, but I meant to add it to 1. and
> 6. when I said "only one of 64-bit build plus 32-bit one".  

Ah. Sorry, I didn't get that.


>> 1 runs the fastest and I would like to keep it that way
>> to get a quick "general" result. I think only 2 should be
>> extended in the way you are suggesting. We could run
>> the tests with different env variables there. What else
>> do we have besides GIT_TEST_SPLIT_INDEX?
>> 
>> Would that work for everyone?
> 
> I am OK to make 2. use split-index (which unfortunately would mean
> we lose tests without split-index under gcc), or add 2.5 that is a
> copy of 2. plus split-index.


I think I experessed myself poorly. As far as I understand it,
GIT_TEST_SPLIT_INDEX is an environment variable that only needs
to be set at test time and not at compile time. Is this right?

If yes, my idea for 2. is as follows:
- build Git with gcc
- run tests with "make --quiet test"
- run tests with "GIT_TEST_SPLIT_INDEX=YesPlease make --quiet test"

This should be quicker than a new 2.5 target because we don't need to
spin up the machine and build Git. Plus, we could run the tests
a few more times with other test flags.

- Lars

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX
  2017-12-12 20:51             ` Junio C Hamano
@ 2017-12-13 23:21               ` Thomas Gummerer
  0 siblings, 0 replies; 77+ messages in thread
From: Thomas Gummerer @ 2017-12-13 23:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Lars Schneider, SZEDER Gábor, Christian Couder, Brandon Williams, Jeff King, Nguyễn Thái Ngọc Duy, git

On 12/12, Junio C Hamano wrote:
> Thomas Gummerer <t.gummerer@gmail.com> writes:
> 
> >
> > The breakages wen the split-index code fails tend to break things in
> > much more obvious manners than a wrong message, usually git ends up
> > dying if it gets broken.  Both of the bugs that were fixed here would
> > have been caught with the change in my patch.
> >
> > But yeah I can see the argument that it doesn't give us a guarantee
> > that it catches all things the test suite could catch.
> 
> I think you misunderstood me.  When split index is much easier to
> break than poison tests, combining them together would hurt the test
> coverage of poison tests.  If you value poison tests much more than
> how well split index mode works, that is a worse outcome.

Sorry, indeed I misunderstood you.  I do agree with the above.

> >> I wonder if it makes more sense to update ci/run-tests.sh so that
> >> its final step is run twice with different settings, like so?
> >
> > I kind of wanted to avoid that, because it ends up running the test
> > suite twice on travis for every test, which seems a bit overkill.  But
> > I don't exactly how worried we are about cycles on travis (I don't
> > check it very often personally, so I don't really know).  If we aren't
> > worried about cycles what you have below would certainly make sense.
> 
> I think 64-bit gcc/clang builds tend to cost us about 10-20 minutes
> each, and 32-bit linux builds about 10 minutes or so.  I wonder if
> it makes sense to do the "run twice" for only one of 64-bit builds
> (perhaps the clang one, as I suspect 32-bit linux one uses gcc) and
> the 32-bit linux build, and nowhere else.

Yeah I'd be happy with something like that.  And I see there's some
more discussion about it in another part of the thread, will follow up
there.  Thanks!

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX
  2017-12-13 17:46               ` Lars Schneider
@ 2017-12-13 23:28                 ` Thomas Gummerer
  0 siblings, 0 replies; 77+ messages in thread
From: Thomas Gummerer @ 2017-12-13 23:28 UTC (permalink / raw)
  To: Lars Schneider; +Cc: Junio C Hamano, SZEDER Gábor, Christian Couder, Brandon Williams, Jeff King, Nguyễn Thái Ngọc Duy, git

On 12/13, Lars Schneider wrote:
> 
> > On 13 Dec 2017, at 18:38, Junio C Hamano <gitster@pobox.com> wrote:
> > 
> > Lars Schneider <larsxschneider@gmail.com> writes:
> > 
> >> I think your solution points into the right direction.
> >> Right now we have the following test matrix:
> >> 
> >> 1. Linux - clang
> >> 2. Linux - gcc
> >> 3. Mac - clang
> >> 4. Mac - gcc
> >> 5. Linux - gcc - GET_TEXT_POISION
> >> 6. Linux - gcc - 32bit
> >> 7. Windows
> >> 
> >> AFAIK your solution would run the split index test for 
> >> 1, 2, 3, and 4. I think that is too much.
> > 
> > Not that it matters too much, but I meant to add it to 1. and
> > 6. when I said "only one of 64-bit build plus 32-bit one".  
> 
> Ah. Sorry, I didn't get that.
> 
> 
> >> 1 runs the fastest and I would like to keep it that way
> >> to get a quick "general" result. I think only 2 should be
> >> extended in the way you are suggesting. We could run
> >> the tests with different env variables there. What else
> >> do we have besides GIT_TEST_SPLIT_INDEX?
> >> 
> >> Would that work for everyone?
> > 
> > I am OK to make 2. use split-index (which unfortunately would mean
> > we lose tests without split-index under gcc), or add 2.5 that is a
> > copy of 2. plus split-index.
> 
> 
> I think I experessed myself poorly. As far as I understand it,
> GIT_TEST_SPLIT_INDEX is an environment variable that only needs
> to be set at test time and not at compile time. Is this right?

Yes, that's correct.

> If yes, my idea for 2. is as follows:
> - build Git with gcc
> - run tests with "make --quiet test"
> - run tests with "GIT_TEST_SPLIT_INDEX=YesPlease make --quiet test"
> 
> This should be quicker than a new 2.5 target because we don't need to
> spin up the machine and build Git. Plus, we could run the tests
> a few more times with other test flags.

I'd be happy with that.  Thanks for the discussion on this, will
change this in my re-roll.

> - Lars

^ permalink raw reply	[flat|nested] 77+ messages in thread

* [PATCH v2 0/3] fixes for split index mode
  2017-12-10 21:21 [PATCH 0/3] fixes for split index mode Thomas Gummerer
                   ` (2 preceding siblings ...)
  2017-12-10 21:22 ` [PATCH 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX Thomas Gummerer
@ 2017-12-17 22:51 ` Thomas Gummerer
  2017-12-17 22:51   ` [PATCH v2 1/3] repository: fix repo_read_index with submodules Thomas Gummerer
                     ` (3 more replies)
  3 siblings, 4 replies; 77+ messages in thread
From: Thomas Gummerer @ 2017-12-17 22:51 UTC (permalink / raw)
  To: git; +Cc: Lars Schneider, Brandon Williams, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy, SZEDER Gábor, Thomas Gummerer

The previous round was at <20171210212202.28231-1-t.gummerer@gmail.com>.

Thanks Brandon, Eric, Szeder, Lars and Junio for the comments on the
previous round.  

Changes since the previous round:

- pass in the_repository to do_read_index_from, so we don't have to
  special case NULL anymore.
- free struct repository properly after allocating it in revision.c
- run the travis tests in the linux-gcc and linux 32-bit builds, to
  avoid clashes with the GETTEXT_POISON tests.

The travis tests now depend on sg/travis-fixes, as running the split
index tests separately gets simplified with the introduction of the
'$jobname' variable.  The script falls back gracefully to not runing
the split index test under linux with gcc and only running the
"normal" tests there, while the split index tests would still get run
in the linux 32-bit build.

Because it's a "soft" dependency I thought having this dependency
would be okay, but let me know if you it's getting to complicated and
we can use the environment variables set by travis for now, and then
switch to the '$jobname' variable when it's merged to next/master.

I kept the usage of struct repository for worktrees, as I didn't hear
anything back after my last reply in <20171211213948.GC25616@hank>,
and don't see a better way of doing it.

Interdiff below:
diff --git a/.travis.yml b/.travis.yml
index c83c766dee..281f101f31 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -39,7 +39,7 @@ env:
 
 matrix:
   include:
-    - env: GETTEXT_POISON=YesPlease GIT_TEST_SPLIT_INDEX=YesPlease
+    - env: GETTEXT_POISON=YesPlease
       os: linux
       compiler:
       addons:
diff --git a/ci/run-linux32-build.sh b/ci/run-linux32-build.sh
index e30fb2cddc..f173c9cf2a 100755
--- a/ci/run-linux32-build.sh
+++ b/ci/run-linux32-build.sh
@@ -27,4 +27,5 @@ linux32 --32bit i386 su -m -l $CI_USER -c '
     cd /usr/src/git &&
     make --jobs=2 &&
     make --quiet test
+    GIT_TEST_SPLIT_INDEX=YesPlease make --quiet test
 '
diff --git a/ci/run-tests.sh b/ci/run-tests.sh
index f0c743de94..c7aee5b9ff 100755
--- a/ci/run-tests.sh
+++ b/ci/run-tests.sh
@@ -8,3 +8,7 @@
 mkdir -p $HOME/travis-cache
 ln -s $HOME/travis-cache/.prove t/.prove
 make --quiet test
+if test "$jobname" = "linux-gcc"
+then
+	GIT_TEST_SPLIT_INDEX=YesPlease make --quiet test
+fi
diff --git a/read-cache.c b/read-cache.c
index 4d5c4ad79b..70357febdc 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1898,10 +1898,7 @@ static int do_read_index_from(struct index_state *istate, const char *path,
 		split_index->base = xcalloc(1, sizeof(*split_index->base));
 
 	base_sha1_hex = sha1_to_hex(split_index->base_sha1);
-	if (repo)
-		base_path = repo_git_path(repo, "sharedindex.%s", base_sha1_hex);
-	else
-		base_path = git_path("sharedindex.%s", base_sha1_hex);
+	base_path = repo_git_path(repo, "sharedindex.%s", base_sha1_hex);
 	ret = do_read_index(split_index->base, base_path, 1);
 	if (hashcmp(split_index->base_sha1, split_index->base->sha1))
 		die("broken index, expect %s in %s, got %s",
@@ -1921,7 +1918,7 @@ int read_index_for_repo(const struct repository *repo)
 
 int read_index_from(struct index_state *istate, const char *path)
 {
-	return do_read_index_from(istate, path, NULL);
+	return do_read_index_from(istate, path, the_repository);
 }
 
 int is_index_unborn(struct index_state *istate)
diff --git a/revision.c b/revision.c
index 9d8d9b96d1..34e1e4b799 100644
--- a/revision.c
+++ b/revision.c
@@ -1358,6 +1358,7 @@ void add_index_objects_to_pending(struct rev_info *revs, unsigned int flags)
 		if (repo_read_index(repo) > 0)
 			do_add_index_objects_to_pending(revs, repo->index);
 		discard_index(repo->index);
+		free(repo);
 	}
 	free_worktrees(worktrees);
 }

Thomas Gummerer (3):
  repository: fix repo_read_index with submodules
  prune: fix pruning with multiple worktrees and split index
  travis: run tests with GIT_TEST_SPLIT_INDEX

 cache.h                 |  1 +
 ci/run-linux32-build.sh |  1 +
 ci/run-tests.sh         |  1 +
 read-cache.c            | 16 ++++++++++++++--
 repository.c            | 13 ++++++++++++-
 repository.h            |  2 ++
 revision.c              | 14 +++++++++-----
 7 files changed, 40 insertions(+), 8 deletions(-)

-- 
2.15.1.620.gb9897f4670


^ permalink raw reply	[flat|nested] 77+ messages in thread

* [PATCH v2 1/3] repository: fix repo_read_index with submodules
  2017-12-17 22:51 ` [PATCH v2 0/3] fixes for split index mode Thomas Gummerer
@ 2017-12-17 22:51   ` Thomas Gummerer
  2017-12-18 18:01     ` Brandon Williams
  2017-12-17 22:51   ` [PATCH v2 2/3] prune: fix pruning with multiple worktrees and split index Thomas Gummerer
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 77+ messages in thread
From: Thomas Gummerer @ 2017-12-17 22:51 UTC (permalink / raw)
  To: git; +Cc: Lars Schneider, Brandon Williams, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy, SZEDER Gábor, Thomas Gummerer

repo_read_index calls read_index_from, which takes an path argument for
the location of the index file.  For the split index however it relies
on the current working directory to construct the path using git_path.

repo_read_index calls read_index_from with the full path for the index
file, however it doesn't change the cwd, so when split index mode is
turned on, read_index_from can't find the file for the split index.

For example t3007-ls-files-recurse-submodules.sh was broken with
GIT_TEST_SPLIT_INDEX set in 188dce131f ("ls-files: use repository
object", 2017-06-22), and t7814-grep-recurse-submodules.sh was also
broken in a similar manner, probably by introducing struct repository
there, although I didn't track down the exact commit for that.

Fix this by introducing a new read_index_for_repo function, which knows
about the correct paths for the submodules.

The alternative would have been to make the callers pass in the base
path for the split index, however that ended up being more complicated,
and I think we want to converge towards using struct repository for
things like these anyway.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
 cache.h      |  1 +
 read-cache.c | 16 ++++++++++++++--
 repository.c |  2 +-
 3 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/cache.h b/cache.h
index cb5db7bf83..d42bea1ef7 100644
--- a/cache.h
+++ b/cache.h
@@ -614,6 +614,7 @@ extern int read_index_preload(struct index_state *, const struct pathspec *paths
 extern int do_read_index(struct index_state *istate, const char *path,
 			 int must_exist); /* for testting only! */
 extern int read_index_from(struct index_state *, const char *path);
+extern int read_index_for_repo(const struct repository *);
 extern int is_index_unborn(struct index_state *);
 extern int read_index_unmerged(struct index_state *);
 
diff --git a/read-cache.c b/read-cache.c
index 2eb81a66b9..70357febdc 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -20,6 +20,7 @@
 #include "split-index.h"
 #include "utf8.h"
 #include "fsmonitor.h"
+#include "repository.h"
 
 /* Mask for the name length in ce_flags in the on-disk index */
 
@@ -1871,7 +1872,8 @@ static void freshen_shared_index(char *base_sha1_hex, int warn)
 	free(shared_index);
 }
 
-int read_index_from(struct index_state *istate, const char *path)
+static int do_read_index_from(struct index_state *istate, const char *path,
+			      const struct repository *repo)
 {
 	struct split_index *split_index;
 	int ret;
@@ -1896,7 +1898,7 @@ int read_index_from(struct index_state *istate, const char *path)
 		split_index->base = xcalloc(1, sizeof(*split_index->base));
 
 	base_sha1_hex = sha1_to_hex(split_index->base_sha1);
-	base_path = git_path("sharedindex.%s", base_sha1_hex);
+	base_path = repo_git_path(repo, "sharedindex.%s", base_sha1_hex);
 	ret = do_read_index(split_index->base, base_path, 1);
 	if (hashcmp(split_index->base_sha1, split_index->base->sha1))
 		die("broken index, expect %s in %s, got %s",
@@ -1909,6 +1911,16 @@ int read_index_from(struct index_state *istate, const char *path)
 	return ret;
 }
 
+int read_index_for_repo(const struct repository *repo)
+{
+	return do_read_index_from(repo->index, repo->index_file, repo);
+}
+
+int read_index_from(struct index_state *istate, const char *path)
+{
+	return do_read_index_from(istate, path, the_repository);
+}
+
 int is_index_unborn(struct index_state *istate)
 {
 	return (!istate->cache_nr && !istate->timestamp.sec);
diff --git a/repository.c b/repository.c
index bb2fae5446..928b1f553d 100644
--- a/repository.c
+++ b/repository.c
@@ -229,5 +229,5 @@ int repo_read_index(struct repository *repo)
 	if (!repo->index)
 		repo->index = xcalloc(1, sizeof(*repo->index));
 
-	return read_index_from(repo->index, repo->index_file);
+	return read_index_for_repo(repo);
 }
-- 
2.15.1.620.gb9897f4670


^ permalink raw reply	[flat|nested] 77+ messages in thread

* [PATCH v2 2/3] prune: fix pruning with multiple worktrees and split index
  2017-12-17 22:51 ` [PATCH v2 0/3] fixes for split index mode Thomas Gummerer
  2017-12-17 22:51   ` [PATCH v2 1/3] repository: fix repo_read_index with submodules Thomas Gummerer
@ 2017-12-17 22:51   ` Thomas Gummerer
  2017-12-18 18:19     ` Brandon Williams
  2017-12-17 22:51   ` [PATCH v2 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX Thomas Gummerer
  2018-01-07 22:30   ` [PATCH v3 0/3] fixes for split index mode Thomas Gummerer
  3 siblings, 1 reply; 77+ messages in thread
From: Thomas Gummerer @ 2017-12-17 22:51 UTC (permalink / raw)
  To: git; +Cc: Lars Schneider, Brandon Williams, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy, SZEDER Gábor, Thomas Gummerer

be489d02d2 ("revision.c: --indexed-objects add objects from all
worktrees", 2017-08-23) made sure that pruning takes objects from all
worktrees into account.

It did that by reading the index of every worktree and adding the
necessary index objects to the set of pending objects.  The index is
read by read_index_from.  As mentioned in the previous commit,
read_index_from depends on the CWD for the location of the split index,
and add_index_objects_to_pending doesn't set that before using
read_index_from.

Instead of using read_index_from, use repo_read_index, which is aware of
the proper paths for the worktree.

This fixes t5304-prune when ran with GIT_TEST_SPLIT_INDEX set.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
 repository.c | 11 +++++++++++
 repository.h |  2 ++
 revision.c   | 14 +++++++++-----
 3 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/repository.c b/repository.c
index 928b1f553d..3c9bfbd1b8 100644
--- a/repository.c
+++ b/repository.c
@@ -2,6 +2,7 @@
 #include "repository.h"
 #include "config.h"
 #include "submodule-config.h"
+#include "worktree.h"
 
 /* The main repository */
 static struct repository the_repo = {
@@ -146,6 +147,16 @@ int repo_init(struct repository *repo, const char *gitdir, const char *worktree)
 	return -1;
 }
 
+/*
+ * Initialize 'repo' based on the provided worktree
+ * Return 0 upon success and a non-zero value upon failure.
+ */
+int repo_worktree_init(struct repository *repo, struct worktree *worktree)
+{
+	return repo_init(repo, get_worktree_git_dir(worktree),
+			 worktree->path);
+}
+
 /*
  * Initialize 'submodule' as the submodule given by 'path' in parent repository
  * 'superproject'.
diff --git a/repository.h b/repository.h
index 7f5e24a0a2..2adeb05bf4 100644
--- a/repository.h
+++ b/repository.h
@@ -4,6 +4,7 @@
 struct config_set;
 struct index_state;
 struct submodule_cache;
+struct worktree;
 
 struct repository {
 	/* Environment */
@@ -87,6 +88,7 @@ extern struct repository *the_repository;
 extern void repo_set_gitdir(struct repository *repo, const char *path);
 extern void repo_set_worktree(struct repository *repo, const char *path);
 extern int repo_init(struct repository *repo, const char *gitdir, const char *worktree);
+extern int repo_worktree_init(struct repository *repo, struct worktree *worktree);
 extern int repo_submodule_init(struct repository *submodule,
 			       struct repository *superproject,
 			       const char *path);
diff --git a/revision.c b/revision.c
index e2e691dd5a..34e1e4b799 100644
--- a/revision.c
+++ b/revision.c
@@ -22,6 +22,7 @@
 #include "packfile.h"
 #include "worktree.h"
 #include "argv-array.h"
+#include "repository.h"
 
 volatile show_early_output_fn_t show_early_output;
 
@@ -1346,15 +1347,18 @@ void add_index_objects_to_pending(struct rev_info *revs, unsigned int flags)
 	worktrees = get_worktrees(0);
 	for (p = worktrees; *p; p++) {
 		struct worktree *wt = *p;
-		struct index_state istate = { NULL };
+		struct repository *repo;
 
+		repo = xmalloc(sizeof(struct repository));
 		if (wt->is_current)
 			continue; /* current index already taken care of */
+		if (repo_worktree_init(repo, wt))
+			BUG("couldn't initialize repository object from worktree");
 
-		if (read_index_from(&istate,
-				    worktree_git_path(wt, "index")) > 0)
-			do_add_index_objects_to_pending(revs, &istate);
-		discard_index(&istate);
+		if (repo_read_index(repo) > 0)
+			do_add_index_objects_to_pending(revs, repo->index);
+		discard_index(repo->index);
+		free(repo);
 	}
 	free_worktrees(worktrees);
 }
-- 
2.15.1.620.gb9897f4670


^ permalink raw reply	[flat|nested] 77+ messages in thread

* [PATCH v2 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX
  2017-12-17 22:51 ` [PATCH v2 0/3] fixes for split index mode Thomas Gummerer
  2017-12-17 22:51   ` [PATCH v2 1/3] repository: fix repo_read_index with submodules Thomas Gummerer
  2017-12-17 22:51   ` [PATCH v2 2/3] prune: fix pruning with multiple worktrees and split index Thomas Gummerer
@ 2017-12-17 22:51   ` Thomas Gummerer
  2017-12-18 18:16     ` Lars Schneider
  2018-01-07 22:30   ` [PATCH v3 0/3] fixes for split index mode Thomas Gummerer
  3 siblings, 1 reply; 77+ messages in thread
From: Thomas Gummerer @ 2017-12-17 22:51 UTC (permalink / raw)
  To: git; +Cc: Lars Schneider, Brandon Williams, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy, SZEDER Gábor, Thomas Gummerer

Split index mode only has a few dedicated tests, but as the index is
involved in nearly every git operation, this doesn't quite cover all the
ways repositories with split index can break.  To use split index mode
throughout the test suite a GIT_TEST_SPLIT_INDEX environment variable
can be set, which makes git split the index at random and thus
excercises the functionality much more thoroughly.

As this is not turned on by default, it is not executed nearly as often
as the test suite is run, so occationally breakages slip through.  Try
to counteract that by running the test suite with GIT_TEST_SPLIT_INDEX
mode turned on on travis.

To avoid using too many cycles on travis only run split index mode in
the linux-gcc and the linux 32-bit gcc targets.  The Linux builds were
chosen over the Mac OS builds because they tend to be much faster to
complete.

The linux gcc build was chosen over the linux clang build because the
linux clang build is the fastest build, so it can serve as an early
indicator if something is broken and we want to avoid spending the extra
cycles of running the test suite twice for that.

Helped-by: Lars Schneider <larsxschneider@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
 ci/run-linux32-build.sh | 1 +
 ci/run-tests.sh         | 4 ++++
 2 files changed, 5 insertions(+)

diff --git a/ci/run-linux32-build.sh b/ci/run-linux32-build.sh
index e30fb2cddc..f173c9cf2a 100755
--- a/ci/run-linux32-build.sh
+++ b/ci/run-linux32-build.sh
@@ -27,4 +27,5 @@ linux32 --32bit i386 su -m -l $CI_USER -c '
     cd /usr/src/git &&
     make --jobs=2 &&
     make --quiet test
+    GIT_TEST_SPLIT_INDEX=YesPlease make --quiet test
 '
diff --git a/ci/run-tests.sh b/ci/run-tests.sh
index f0c743de94..c7aee5b9ff 100755
--- a/ci/run-tests.sh
+++ b/ci/run-tests.sh
@@ -8,3 +8,7 @@
 mkdir -p $HOME/travis-cache
 ln -s $HOME/travis-cache/.prove t/.prove
 make --quiet test
+if test "$jobname" = "linux-gcc"
+then
+	GIT_TEST_SPLIT_INDEX=YesPlease make --quiet test
+fi
-- 
2.15.1.620.gb9897f4670


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 1/3] repository: fix repo_read_index with submodules
  2017-12-17 22:51   ` [PATCH v2 1/3] repository: fix repo_read_index with submodules Thomas Gummerer
@ 2017-12-18 18:01     ` Brandon Williams
  2017-12-18 23:05       ` Thomas Gummerer
  0 siblings, 1 reply; 77+ messages in thread
From: Brandon Williams @ 2017-12-18 18:01 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: git, Lars Schneider, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy, SZEDER Gábor

On 12/17, Thomas Gummerer wrote:
> repo_read_index calls read_index_from, which takes an path argument for
> the location of the index file.  For the split index however it relies

> on the current working directory to construct the path using git_path.

This line isn't actually true and should probably be fixed.  git_path
doesn't rely on the CWD but rather it relies on the gitdir of the main
repository (the_repository).

> 
> repo_read_index calls read_index_from with the full path for the index
> file, however it doesn't change the cwd, so when split index mode is
> turned on, read_index_from can't find the file for the split index.
> 
> For example t3007-ls-files-recurse-submodules.sh was broken with
> GIT_TEST_SPLIT_INDEX set in 188dce131f ("ls-files: use repository
> object", 2017-06-22), and t7814-grep-recurse-submodules.sh was also
> broken in a similar manner, probably by introducing struct repository
> there, although I didn't track down the exact commit for that.
> 
> Fix this by introducing a new read_index_for_repo function, which knows
> about the correct paths for the submodules.
> 
> The alternative would have been to make the callers pass in the base
> path for the split index, however that ended up being more complicated,
> and I think we want to converge towards using struct repository for
> things like these anyway.
> 
> Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
> ---
>  cache.h      |  1 +
>  read-cache.c | 16 ++++++++++++++--
>  repository.c |  2 +-
>  3 files changed, 16 insertions(+), 3 deletions(-)
> 
> diff --git a/cache.h b/cache.h
> index cb5db7bf83..d42bea1ef7 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -614,6 +614,7 @@ extern int read_index_preload(struct index_state *, const struct pathspec *paths
>  extern int do_read_index(struct index_state *istate, const char *path,
>  			 int must_exist); /* for testting only! */
>  extern int read_index_from(struct index_state *, const char *path);
> +extern int read_index_for_repo(const struct repository *);
>  extern int is_index_unborn(struct index_state *);
>  extern int read_index_unmerged(struct index_state *);
>  
> diff --git a/read-cache.c b/read-cache.c
> index 2eb81a66b9..70357febdc 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -20,6 +20,7 @@
>  #include "split-index.h"
>  #include "utf8.h"
>  #include "fsmonitor.h"
> +#include "repository.h"
>  
>  /* Mask for the name length in ce_flags in the on-disk index */
>  
> @@ -1871,7 +1872,8 @@ static void freshen_shared_index(char *base_sha1_hex, int warn)
>  	free(shared_index);
>  }
>  
> -int read_index_from(struct index_state *istate, const char *path)
> +static int do_read_index_from(struct index_state *istate, const char *path,
> +			      const struct repository *repo)
>  {
>  	struct split_index *split_index;
>  	int ret;
> @@ -1896,7 +1898,7 @@ int read_index_from(struct index_state *istate, const char *path)
>  		split_index->base = xcalloc(1, sizeof(*split_index->base));
>  
>  	base_sha1_hex = sha1_to_hex(split_index->base_sha1);
> -	base_path = git_path("sharedindex.%s", base_sha1_hex);
> +	base_path = repo_git_path(repo, "sharedindex.%s", base_sha1_hex);
>  	ret = do_read_index(split_index->base, base_path, 1);
>  	if (hashcmp(split_index->base_sha1, split_index->base->sha1))
>  		die("broken index, expect %s in %s, got %s",
> @@ -1909,6 +1911,16 @@ int read_index_from(struct index_state *istate, const char *path)
>  	return ret;
>  }
>  
> +int read_index_for_repo(const struct repository *repo)
> +{
> +	return do_read_index_from(repo->index, repo->index_file, repo);
> +}

> +
> +int read_index_from(struct index_state *istate, const char *path)
> +{
> +	return do_read_index_from(istate, path, the_repository);
> +}

This looks fine, though I wonder what the point of passing in the index
file even was since we end just ended up reading the 'sharedindex' file based
on the git path. I'm just curious about how this function evolved.

> +
>  int is_index_unborn(struct index_state *istate)
>  {
>  	return (!istate->cache_nr && !istate->timestamp.sec);
> diff --git a/repository.c b/repository.c
> index bb2fae5446..928b1f553d 100644
> --- a/repository.c
> +++ b/repository.c
> @@ -229,5 +229,5 @@ int repo_read_index(struct repository *repo)
>  	if (!repo->index)
>  		repo->index = xcalloc(1, sizeof(*repo->index));
>  
> -	return read_index_from(repo->index, repo->index_file);
> +	return read_index_for_repo(repo);
>  }
> -- 
> 2.15.1.620.gb9897f4670
> 

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX
  2017-12-17 22:51   ` [PATCH v2 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX Thomas Gummerer
@ 2017-12-18 18:16     ` Lars Schneider
  2018-01-04 20:13       ` Thomas Gummerer
  0 siblings, 1 reply; 77+ messages in thread
From: Lars Schneider @ 2017-12-18 18:16 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: Git List, Brandon Williams, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy, SZEDER Gábor


> On 17 Dec 2017, at 23:51, Thomas Gummerer <t.gummerer@gmail.com> wrote:
> 
> Split index mode only has a few dedicated tests, but as the index is
> involved in nearly every git operation, this doesn't quite cover all the
> ways repositories with split index can break.  To use split index mode
> throughout the test suite a GIT_TEST_SPLIT_INDEX environment variable
> can be set, which makes git split the index at random and thus
> excercises the functionality much more thoroughly.
> 
> As this is not turned on by default, it is not executed nearly as often
> as the test suite is run, so occationally breakages slip through.  Try
> to counteract that by running the test suite with GIT_TEST_SPLIT_INDEX
> mode turned on on travis.
> 
> To avoid using too many cycles on travis only run split index mode in
> the linux-gcc and the linux 32-bit gcc targets.

I am surprised to see the split index mode test for the linux 32-bit
target. Is it likely that a split index bug appears only on 32-bit? 
Wouldn't the linux-gcc target be sufficient to save resources/time?


>  The Linux builds were
> chosen over the Mac OS builds because they tend to be much faster to
> complete.
> 
> The linux gcc build was chosen over the linux clang build because the
> linux clang build is the fastest build, so it can serve as an early
> indicator if something is broken and we want to avoid spending the extra
> cycles of running the test suite twice for that.
> 
> Helped-by: Lars Schneider <larsxschneider@gmail.com>
> Helped-by: Junio C Hamano <gitster@pobox.com>
> Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
> ---
> ci/run-linux32-build.sh | 1 +
> ci/run-tests.sh         | 4 ++++
> 2 files changed, 5 insertions(+)
> 
> diff --git a/ci/run-linux32-build.sh b/ci/run-linux32-build.sh
> index e30fb2cddc..f173c9cf2a 100755
> --- a/ci/run-linux32-build.sh
> +++ b/ci/run-linux32-build.sh
> @@ -27,4 +27,5 @@ linux32 --32bit i386 su -m -l $CI_USER -c '
>     cd /usr/src/git &&
>     make --jobs=2 &&
>     make --quiet test
> +    GIT_TEST_SPLIT_INDEX=YesPlease make --quiet test
> '
> diff --git a/ci/run-tests.sh b/ci/run-tests.sh
> index f0c743de94..c7aee5b9ff 100755
> --- a/ci/run-tests.sh
> +++ b/ci/run-tests.sh
> @@ -8,3 +8,7 @@
> mkdir -p $HOME/travis-cache
> ln -s $HOME/travis-cache/.prove t/.prove
> make --quiet test
> +if test "$jobname" = "linux-gcc"
> +then
> +	GIT_TEST_SPLIT_INDEX=YesPlease make --quiet test
> +fi

For now I think that looks good. Maybe we could define additional test 
configurations with an environment variable. That could be an array variable
defined in the lib-travis.ci "case" statement:
https://github.com/git/git/blob/1229713f78cd2883798e95f33c19c81b523413fd/ci/lib-travisci.sh#L42-L65


- Lars

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 2/3] prune: fix pruning with multiple worktrees and split index
  2017-12-17 22:51   ` [PATCH v2 2/3] prune: fix pruning with multiple worktrees and split index Thomas Gummerer
@ 2017-12-18 18:19     ` Brandon Williams
  2018-01-03 22:18       ` Thomas Gummerer
  0 siblings, 1 reply; 77+ messages in thread
From: Brandon Williams @ 2017-12-18 18:19 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: git, Lars Schneider, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy, SZEDER Gábor

On 12/17, Thomas Gummerer wrote:
> be489d02d2 ("revision.c: --indexed-objects add objects from all
> worktrees", 2017-08-23) made sure that pruning takes objects from all
> worktrees into account.
> 
> It did that by reading the index of every worktree and adding the
> necessary index objects to the set of pending objects.  The index is
> read by read_index_from.  As mentioned in the previous commit,
> read_index_from depends on the CWD for the location of the split index,

As I mentioned before this doesn't actually depend on the CWD but
rather the per-worktree gitdir.

> and add_index_objects_to_pending doesn't set that before using
> read_index_from.
> 
> Instead of using read_index_from, use repo_read_index, which is aware of
> the proper paths for the worktree.
> 
> This fixes t5304-prune when ran with GIT_TEST_SPLIT_INDEX set.
> 
> Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
> ---
>  repository.c | 11 +++++++++++
>  repository.h |  2 ++
>  revision.c   | 14 +++++++++-----
>  3 files changed, 22 insertions(+), 5 deletions(-)
> 
> diff --git a/repository.c b/repository.c
> index 928b1f553d..3c9bfbd1b8 100644
> --- a/repository.c
> +++ b/repository.c
> @@ -2,6 +2,7 @@
>  #include "repository.h"
>  #include "config.h"
>  #include "submodule-config.h"
> +#include "worktree.h"
>  
>  /* The main repository */
>  static struct repository the_repo = {
> @@ -146,6 +147,16 @@ int repo_init(struct repository *repo, const char *gitdir, const char *worktree)
>  	return -1;
>  }
>  
> +/*
> + * Initialize 'repo' based on the provided worktree
> + * Return 0 upon success and a non-zero value upon failure.
> + */
> +int repo_worktree_init(struct repository *repo, struct worktree *worktree)
> +{
> +	return repo_init(repo, get_worktree_git_dir(worktree),
> +			 worktree->path);

I still feel very unsettled about this and don't think its a good idea.
get_worktree_git_dir depends implicitly on the global the_repository
object and I would like to avoid relying on it for an initialization
function like this.

> +}
> +
>  /*
>   * Initialize 'submodule' as the submodule given by 'path' in parent repository
>   * 'superproject'.
> diff --git a/repository.h b/repository.h
> index 7f5e24a0a2..2adeb05bf4 100644
> --- a/repository.h
> +++ b/repository.h
> @@ -4,6 +4,7 @@
>  struct config_set;
>  struct index_state;
>  struct submodule_cache;
> +struct worktree;
>  
>  struct repository {
>  	/* Environment */
> @@ -87,6 +88,7 @@ extern struct repository *the_repository;
>  extern void repo_set_gitdir(struct repository *repo, const char *path);
>  extern void repo_set_worktree(struct repository *repo, const char *path);
>  extern int repo_init(struct repository *repo, const char *gitdir, const char *worktree);
> +extern int repo_worktree_init(struct repository *repo, struct worktree *worktree);
>  extern int repo_submodule_init(struct repository *submodule,
>  			       struct repository *superproject,
>  			       const char *path);
> diff --git a/revision.c b/revision.c
> index e2e691dd5a..34e1e4b799 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -22,6 +22,7 @@
>  #include "packfile.h"
>  #include "worktree.h"
>  #include "argv-array.h"
> +#include "repository.h"
>  
>  volatile show_early_output_fn_t show_early_output;
>  
> @@ -1346,15 +1347,18 @@ void add_index_objects_to_pending(struct rev_info *revs, unsigned int flags)
>  	worktrees = get_worktrees(0);
>  	for (p = worktrees; *p; p++) {
>  		struct worktree *wt = *p;
> -		struct index_state istate = { NULL };
> +		struct repository *repo;
>  
> +		repo = xmalloc(sizeof(struct repository));
>  		if (wt->is_current)
>  			continue; /* current index already taken care of */
> +		if (repo_worktree_init(repo, wt))
> +			BUG("couldn't initialize repository object from worktree");
>  
> -		if (read_index_from(&istate,
> -				    worktree_git_path(wt, "index")) > 0)

Ok, after thinking this through a bit more I think a better approach may
be to restructure the call to read_index_from to take in both an index
file as well as the explicit gitdir to use when constructing a path to
the sharedindex file.  That way you can fix this for worktrees and
submodules without having to pass in a repository object to the logic
which is reading an index file as well as avoiding needing to init a
repository object for every worktree.

So you could rework the first patch to do something like:

diff --git a/read-cache.c b/read-cache.c
index 2eb81a66b..3a04b5567 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1863,20 +1863,19 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
  * This way, shared index can be removed if they have not been used
  * for some time.
  */
-static void freshen_shared_index(char *base_sha1_hex, int warn)
+static void freshen_shared_index(const char *shared_index, int warn)
 {
-	char *shared_index = git_pathdup("sharedindex.%s", base_sha1_hex);
 	if (!check_and_freshen_file(shared_index, 1) && warn)
 		warning("could not freshen shared index '%s'", shared_index);
-	free(shared_index);
 }
 
-int read_index_from(struct index_state *istate, const char *path)
+int read_index_from(struct index_state *istate, const char *path,
+		    const char *gitdir)
 {
 	struct split_index *split_index;
 	int ret;
 	char *base_sha1_hex;
-	const char *base_path;
+	char *base_path;
 
 	/* istate->initialized covers both .git/index and .git/sharedindex.xxx */
 	if (istate->initialized)
@@ -1896,14 +1895,14 @@ int read_index_from(struct index_state *istate, const char *path)
 		split_index->base = xcalloc(1, sizeof(*split_index->base));
 
 	base_sha1_hex = sha1_to_hex(split_index->base_sha1);
-	base_path = git_path("sharedindex.%s", base_sha1_hex);
+	base_path = xstrfmt("%s/sharedindex.%s", gitdir, base_sha1_hex);
 	ret = do_read_index(split_index->base, base_path, 1);
 	if (hashcmp(split_index->base_sha1, split_index->base->sha1))
 		die("broken index, expect %s in %s, got %s",
 		    base_sha1_hex, base_path,
 		    sha1_to_hex(split_index->base->sha1));
 
-	freshen_shared_index(base_sha1_hex, 0);
+	freshen_shared_index(base_path, 0);
 	merge_base_index(istate);
 	post_read_index_from(istate);
+ free(base_path);
 	return ret;


-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 1/3] repository: fix repo_read_index with submodules
  2017-12-18 18:01     ` Brandon Williams
@ 2017-12-18 23:05       ` Thomas Gummerer
  2017-12-18 23:05         ` Brandon Williams
  0 siblings, 1 reply; 77+ messages in thread
From: Thomas Gummerer @ 2017-12-18 23:05 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, Lars Schneider, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy, SZEDER Gábor

On 12/18, Brandon Williams wrote:
> On 12/17, Thomas Gummerer wrote:
> > repo_read_index calls read_index_from, which takes an path argument for
> > the location of the index file.  For the split index however it relies
> 
> > on the current working directory to construct the path using git_path.
> 
> This line isn't actually true and should probably be fixed.  git_path
> doesn't rely on the CWD but rather it relies on the gitdir of the main
> repository (the_repository).

Right, let me fix that.  Thanks!

> > 
> > repo_read_index calls read_index_from with the full path for the index
> > file, however it doesn't change the cwd, so when split index mode is
> > turned on, read_index_from can't find the file for the split index.
> > 
> > For example t3007-ls-files-recurse-submodules.sh was broken with
> > GIT_TEST_SPLIT_INDEX set in 188dce131f ("ls-files: use repository
> > object", 2017-06-22), and t7814-grep-recurse-submodules.sh was also
> > broken in a similar manner, probably by introducing struct repository
> > there, although I didn't track down the exact commit for that.
> > 
> > Fix this by introducing a new read_index_for_repo function, which knows
> > about the correct paths for the submodules.
> > 
> > The alternative would have been to make the callers pass in the base
> > path for the split index, however that ended up being more complicated,
> > and I think we want to converge towards using struct repository for
> > things like these anyway.
> > 
> > Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
> > ---
> >  cache.h      |  1 +
> >  read-cache.c | 16 ++++++++++++++--
> >  repository.c |  2 +-
> >  3 files changed, 16 insertions(+), 3 deletions(-)
> > 
> > diff --git a/cache.h b/cache.h
> > index cb5db7bf83..d42bea1ef7 100644
> > --- a/cache.h
> > +++ b/cache.h
> > @@ -614,6 +614,7 @@ extern int read_index_preload(struct index_state *, const struct pathspec *paths
> >  extern int do_read_index(struct index_state *istate, const char *path,
> >  			 int must_exist); /* for testting only! */
> >  extern int read_index_from(struct index_state *, const char *path);
> > +extern int read_index_for_repo(const struct repository *);
> >  extern int is_index_unborn(struct index_state *);
> >  extern int read_index_unmerged(struct index_state *);
> >  
> > diff --git a/read-cache.c b/read-cache.c
> > index 2eb81a66b9..70357febdc 100644
> > --- a/read-cache.c
> > +++ b/read-cache.c
> > @@ -20,6 +20,7 @@
> >  #include "split-index.h"
> >  #include "utf8.h"
> >  #include "fsmonitor.h"
> > +#include "repository.h"
> >  
> >  /* Mask for the name length in ce_flags in the on-disk index */
> >  
> > @@ -1871,7 +1872,8 @@ static void freshen_shared_index(char *base_sha1_hex, int warn)
> >  	free(shared_index);
> >  }
> >  
> > -int read_index_from(struct index_state *istate, const char *path)
> > +static int do_read_index_from(struct index_state *istate, const char *path,
> > +			      const struct repository *repo)
> >  {
> >  	struct split_index *split_index;
> >  	int ret;
> > @@ -1896,7 +1898,7 @@ int read_index_from(struct index_state *istate, const char *path)
> >  		split_index->base = xcalloc(1, sizeof(*split_index->base));
> >  
> >  	base_sha1_hex = sha1_to_hex(split_index->base_sha1);
> > -	base_path = git_path("sharedindex.%s", base_sha1_hex);
> > +	base_path = repo_git_path(repo, "sharedindex.%s", base_sha1_hex);
> >  	ret = do_read_index(split_index->base, base_path, 1);
> >  	if (hashcmp(split_index->base_sha1, split_index->base->sha1))
> >  		die("broken index, expect %s in %s, got %s",
> > @@ -1909,6 +1911,16 @@ int read_index_from(struct index_state *istate, const char *path)
> >  	return ret;
> >  }
> >  
> > +int read_index_for_repo(const struct repository *repo)
> > +{
> > +	return do_read_index_from(repo->index, repo->index_file, repo);
> > +}
> 
> > +
> > +int read_index_from(struct index_state *istate, const char *path)
> > +{
> > +	return do_read_index_from(istate, path, the_repository);
> > +}
> 
> This looks fine, though I wonder what the point of passing in the index
> file even was since we end just ended up reading the 'sharedindex' file based
> on the git path. I'm just curious about how this function evolved.

There are some callsites that are using an index different form
$gitdir/index, or even GIT_INDEX_FILE.  e.g. see builtin/am.c [*1*],
which uses it's own 'patch-merge-index' in the am state directory for
its internal operations.

The split index mode was later bolted on, and the sharedindex.xxxx
would always go in $gitdir for the repository.  Others probably know
quite a bit more about this, while I'm always interested in index
related things as that's how I got started with the git project, I
couldn't follow all the conversations that were going on there.

*1*: https://github.com/gitster/git/blob/52015aaf9d19c97b52c47c7046058e6d029ff856/builtin/am.c#L1844

> > +
> >  int is_index_unborn(struct index_state *istate)
> >  {
> >  	return (!istate->cache_nr && !istate->timestamp.sec);
> > diff --git a/repository.c b/repository.c
> > index bb2fae5446..928b1f553d 100644
> > --- a/repository.c
> > +++ b/repository.c
> > @@ -229,5 +229,5 @@ int repo_read_index(struct repository *repo)
> >  	if (!repo->index)
> >  		repo->index = xcalloc(1, sizeof(*repo->index));
> >  
> > -	return read_index_from(repo->index, repo->index_file);
> > +	return read_index_for_repo(repo);
> >  }
> > -- 
> > 2.15.1.620.gb9897f4670
> > 
> 
> -- 
> Brandon Williams

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 1/3] repository: fix repo_read_index with submodules
  2017-12-18 23:05       ` Thomas Gummerer
@ 2017-12-18 23:05         ` Brandon Williams
  0 siblings, 0 replies; 77+ messages in thread
From: Brandon Williams @ 2017-12-18 23:05 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: git, Lars Schneider, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy, SZEDER Gábor

On 12/18, Thomas Gummerer wrote:
> On 12/18, Brandon Williams wrote:
> > On 12/17, Thomas Gummerer wrote:
> > > repo_read_index calls read_index_from, which takes an path argument for
> > > the location of the index file.  For the split index however it relies
> > 
> > > on the current working directory to construct the path using git_path.
> > 
> > This line isn't actually true and should probably be fixed.  git_path
> > doesn't rely on the CWD but rather it relies on the gitdir of the main
> > repository (the_repository).
> 
> Right, let me fix that.  Thanks!
> 
> > > 
> > > repo_read_index calls read_index_from with the full path for the index
> > > file, however it doesn't change the cwd, so when split index mode is
> > > turned on, read_index_from can't find the file for the split index.
> > > 
> > > For example t3007-ls-files-recurse-submodules.sh was broken with
> > > GIT_TEST_SPLIT_INDEX set in 188dce131f ("ls-files: use repository
> > > object", 2017-06-22), and t7814-grep-recurse-submodules.sh was also
> > > broken in a similar manner, probably by introducing struct repository
> > > there, although I didn't track down the exact commit for that.
> > > 
> > > Fix this by introducing a new read_index_for_repo function, which knows
> > > about the correct paths for the submodules.
> > > 
> > > The alternative would have been to make the callers pass in the base
> > > path for the split index, however that ended up being more complicated,
> > > and I think we want to converge towards using struct repository for
> > > things like these anyway.
> > > 
> > > Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
> > > ---
> > >  cache.h      |  1 +
> > >  read-cache.c | 16 ++++++++++++++--
> > >  repository.c |  2 +-
> > >  3 files changed, 16 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/cache.h b/cache.h
> > > index cb5db7bf83..d42bea1ef7 100644
> > > --- a/cache.h
> > > +++ b/cache.h
> > > @@ -614,6 +614,7 @@ extern int read_index_preload(struct index_state *, const struct pathspec *paths
> > >  extern int do_read_index(struct index_state *istate, const char *path,
> > >  			 int must_exist); /* for testting only! */
> > >  extern int read_index_from(struct index_state *, const char *path);
> > > +extern int read_index_for_repo(const struct repository *);
> > >  extern int is_index_unborn(struct index_state *);
> > >  extern int read_index_unmerged(struct index_state *);
> > >  
> > > diff --git a/read-cache.c b/read-cache.c
> > > index 2eb81a66b9..70357febdc 100644
> > > --- a/read-cache.c
> > > +++ b/read-cache.c
> > > @@ -20,6 +20,7 @@
> > >  #include "split-index.h"
> > >  #include "utf8.h"
> > >  #include "fsmonitor.h"
> > > +#include "repository.h"
> > >  
> > >  /* Mask for the name length in ce_flags in the on-disk index */
> > >  
> > > @@ -1871,7 +1872,8 @@ static void freshen_shared_index(char *base_sha1_hex, int warn)
> > >  	free(shared_index);
> > >  }
> > >  
> > > -int read_index_from(struct index_state *istate, const char *path)
> > > +static int do_read_index_from(struct index_state *istate, const char *path,
> > > +			      const struct repository *repo)
> > >  {
> > >  	struct split_index *split_index;
> > >  	int ret;
> > > @@ -1896,7 +1898,7 @@ int read_index_from(struct index_state *istate, const char *path)
> > >  		split_index->base = xcalloc(1, sizeof(*split_index->base));
> > >  
> > >  	base_sha1_hex = sha1_to_hex(split_index->base_sha1);
> > > -	base_path = git_path("sharedindex.%s", base_sha1_hex);
> > > +	base_path = repo_git_path(repo, "sharedindex.%s", base_sha1_hex);
> > >  	ret = do_read_index(split_index->base, base_path, 1);
> > >  	if (hashcmp(split_index->base_sha1, split_index->base->sha1))
> > >  		die("broken index, expect %s in %s, got %s",
> > > @@ -1909,6 +1911,16 @@ int read_index_from(struct index_state *istate, const char *path)
> > >  	return ret;
> > >  }
> > >  
> > > +int read_index_for_repo(const struct repository *repo)
> > > +{
> > > +	return do_read_index_from(repo->index, repo->index_file, repo);
> > > +}
> > 
> > > +
> > > +int read_index_from(struct index_state *istate, const char *path)
> > > +{
> > > +	return do_read_index_from(istate, path, the_repository);
> > > +}
> > 
> > This looks fine, though I wonder what the point of passing in the index
> > file even was since we end just ended up reading the 'sharedindex' file based
> > on the git path. I'm just curious about how this function evolved.
> 
> There are some callsites that are using an index different form
> $gitdir/index, or even GIT_INDEX_FILE.  e.g. see builtin/am.c [*1*],
> which uses it's own 'patch-merge-index' in the am state directory for
> its internal operations.
> 
> The split index mode was later bolted on, and the sharedindex.xxxx
> would always go in $gitdir for the repository.  Others probably know
> quite a bit more about this, while I'm always interested in index
> related things as that's how I got started with the git project, I
> couldn't follow all the conversations that were going on there.
> 
> *1*: https://github.com/gitster/git/blob/52015aaf9d19c97b52c47c7046058e6d029ff856/builtin/am.c#L1844

Thanks for the explanation and pointer! :) 

> 
> > > +
> > >  int is_index_unborn(struct index_state *istate)
> > >  {
> > >  	return (!istate->cache_nr && !istate->timestamp.sec);
> > > diff --git a/repository.c b/repository.c
> > > index bb2fae5446..928b1f553d 100644
> > > --- a/repository.c
> > > +++ b/repository.c
> > > @@ -229,5 +229,5 @@ int repo_read_index(struct repository *repo)
> > >  	if (!repo->index)
> > >  		repo->index = xcalloc(1, sizeof(*repo->index));
> > >  
> > > -	return read_index_from(repo->index, repo->index_file);
> > > +	return read_index_for_repo(repo);
> > >  }
> > > -- 
> > > 2.15.1.620.gb9897f4670
> > > 
> > 
> > -- 
> > Brandon Williams

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 2/3] prune: fix pruning with multiple worktrees and split index
  2017-12-18 18:19     ` Brandon Williams
@ 2018-01-03 22:18       ` Thomas Gummerer
  2018-01-04 19:12         ` Junio C Hamano
  0 siblings, 1 reply; 77+ messages in thread
From: Thomas Gummerer @ 2018-01-03 22:18 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, Lars Schneider, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy, SZEDER Gábor

[sorry for the late reply.  I was on Christmas holidays until today
and am still catching up on the mailing list.  It will probably take
me untill the weekend to send a re-roll]

On 12/18, Brandon Williams wrote:
> On 12/17, Thomas Gummerer wrote:
> > be489d02d2 ("revision.c: --indexed-objects add objects from all
> > worktrees", 2017-08-23) made sure that pruning takes objects from all
> > worktrees into account.
> > 
> > It did that by reading the index of every worktree and adding the
> > necessary index objects to the set of pending objects.  The index is
> > read by read_index_from.  As mentioned in the previous commit,
> > read_index_from depends on the CWD for the location of the split index,
> 
> As I mentioned before this doesn't actually depend on the CWD but
> rather the per-worktree gitdir.

Right, will fix.

> > and add_index_objects_to_pending doesn't set that before using
> > read_index_from.
> > 
> > Instead of using read_index_from, use repo_read_index, which is aware of
> > the proper paths for the worktree.
> > 
> > This fixes t5304-prune when ran with GIT_TEST_SPLIT_INDEX set.
> > 
> > Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
> > ---
> >  repository.c | 11 +++++++++++
> >  repository.h |  2 ++
> >  revision.c   | 14 +++++++++-----
> >  3 files changed, 22 insertions(+), 5 deletions(-)
> > 
> > diff --git a/repository.c b/repository.c
> > index 928b1f553d..3c9bfbd1b8 100644
> > --- a/repository.c
> > +++ b/repository.c
> > @@ -2,6 +2,7 @@
> >  #include "repository.h"
> >  #include "config.h"
> >  #include "submodule-config.h"
> > +#include "worktree.h"
> >  
> >  /* The main repository */
> >  static struct repository the_repo = {
> > @@ -146,6 +147,16 @@ int repo_init(struct repository *repo, const char *gitdir, const char *worktree)
> >  	return -1;
> >  }
> >  
> > +/*
> > + * Initialize 'repo' based on the provided worktree
> > + * Return 0 upon success and a non-zero value upon failure.
> > + */
> > +int repo_worktree_init(struct repository *repo, struct worktree *worktree)
> > +{
> > +	return repo_init(repo, get_worktree_git_dir(worktree),
> > +			 worktree->path);
> 
> I still feel very unsettled about this and don't think its a good idea.
> get_worktree_git_dir depends implicitly on the global the_repository
> object and I would like to avoid relying on it for an initialization
> function like this.
> 
> > +}
> > +
> >  /*
> >   * Initialize 'submodule' as the submodule given by 'path' in parent repository
> >   * 'superproject'.
> > diff --git a/repository.h b/repository.h
> > index 7f5e24a0a2..2adeb05bf4 100644
> > --- a/repository.h
> > +++ b/repository.h
> > @@ -4,6 +4,7 @@
> >  struct config_set;
> >  struct index_state;
> >  struct submodule_cache;
> > +struct worktree;
> >  
> >  struct repository {
> >  	/* Environment */
> > @@ -87,6 +88,7 @@ extern struct repository *the_repository;
> >  extern void repo_set_gitdir(struct repository *repo, const char *path);
> >  extern void repo_set_worktree(struct repository *repo, const char *path);
> >  extern int repo_init(struct repository *repo, const char *gitdir, const char *worktree);
> > +extern int repo_worktree_init(struct repository *repo, struct worktree *worktree);
> >  extern int repo_submodule_init(struct repository *submodule,
> >  			       struct repository *superproject,
> >  			       const char *path);
> > diff --git a/revision.c b/revision.c
> > index e2e691dd5a..34e1e4b799 100644
> > --- a/revision.c
> > +++ b/revision.c
> > @@ -22,6 +22,7 @@
> >  #include "packfile.h"
> >  #include "worktree.h"
> >  #include "argv-array.h"
> > +#include "repository.h"
> >  
> >  volatile show_early_output_fn_t show_early_output;
> >  
> > @@ -1346,15 +1347,18 @@ void add_index_objects_to_pending(struct rev_info *revs, unsigned int flags)
> >  	worktrees = get_worktrees(0);
> >  	for (p = worktrees; *p; p++) {
> >  		struct worktree *wt = *p;
> > -		struct index_state istate = { NULL };
> > +		struct repository *repo;
> >  
> > +		repo = xmalloc(sizeof(struct repository));
> >  		if (wt->is_current)
> >  			continue; /* current index already taken care of */
> > +		if (repo_worktree_init(repo, wt))
> > +			BUG("couldn't initialize repository object from worktree");
> >  
> > -		if (read_index_from(&istate,
> > -				    worktree_git_path(wt, "index")) > 0)
> 
> Ok, after thinking this through a bit more I think a better approach may
> be to restructure the call to read_index_from to take in both an index
> file as well as the explicit gitdir to use when constructing a path to
> the sharedindex file.  That way you can fix this for worktrees and
> submodules without having to pass in a repository object to the logic
> which is reading an index file as well as avoiding needing to init a
> repository object for every worktree.

I still think with eventually we probably only want to read the index
through a repository object instead of reading it to a separate struct
index_state.

But we can probably defer that for now, as this series really just
wants to fix the regressions, and we can think a bit more about how
the repository struct interacts with worktrees.

> So you could rework the first patch to do something like:

Thanks for the below, I'll try the below and see how much churn it
causes adjusting all the callsites, and send a re-roll later this
week.

> diff --git a/read-cache.c b/read-cache.c
> index 2eb81a66b..3a04b5567 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -1863,20 +1863,19 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
>   * This way, shared index can be removed if they have not been used
>   * for some time.
>   */
> -static void freshen_shared_index(char *base_sha1_hex, int warn)
> +static void freshen_shared_index(const char *shared_index, int warn)
>  {
> -	char *shared_index = git_pathdup("sharedindex.%s", base_sha1_hex);
>  	if (!check_and_freshen_file(shared_index, 1) && warn)
>  		warning("could not freshen shared index '%s'", shared_index);
> -	free(shared_index);
>  }
>  
> -int read_index_from(struct index_state *istate, const char *path)
> +int read_index_from(struct index_state *istate, const char *path,
> +		    const char *gitdir)
>  {
>  	struct split_index *split_index;
>  	int ret;
>  	char *base_sha1_hex;
> -	const char *base_path;
> +	char *base_path;
>  
>  	/* istate->initialized covers both .git/index and .git/sharedindex.xxx */
>  	if (istate->initialized)
> @@ -1896,14 +1895,14 @@ int read_index_from(struct index_state *istate, const char *path)
>  		split_index->base = xcalloc(1, sizeof(*split_index->base));
>  
>  	base_sha1_hex = sha1_to_hex(split_index->base_sha1);
> -	base_path = git_path("sharedindex.%s", base_sha1_hex);
> +	base_path = xstrfmt("%s/sharedindex.%s", gitdir, base_sha1_hex);
>  	ret = do_read_index(split_index->base, base_path, 1);
>  	if (hashcmp(split_index->base_sha1, split_index->base->sha1))
>  		die("broken index, expect %s in %s, got %s",
>  		    base_sha1_hex, base_path,
>  		    sha1_to_hex(split_index->base->sha1));
>  
> -	freshen_shared_index(base_sha1_hex, 0);
> +	freshen_shared_index(base_path, 0);
>  	merge_base_index(istate);
>  	post_read_index_from(istate);
> + free(base_path);
>  	return ret;
> 
> 
> -- 
> Brandon Williams

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 2/3] prune: fix pruning with multiple worktrees and split index
  2018-01-03 22:18       ` Thomas Gummerer
@ 2018-01-04 19:12         ` Junio C Hamano
  0 siblings, 0 replies; 77+ messages in thread
From: Junio C Hamano @ 2018-01-04 19:12 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: Brandon Williams, git, Lars Schneider, Jeff King, Nguyễn Thái Ngọc Duy, SZEDER Gábor

Thomas Gummerer <t.gummerer@gmail.com> writes:

> [sorry for the late reply.  I was on Christmas holidays until today
> and am still catching up on the mailing list.  It will probably take
> me untill the weekend to send a re-roll]
>
> On 12/18, Brandon Williams wrote:
>> On 12/17, Thomas Gummerer wrote:
>> > be489d02d2 ("revision.c: --indexed-objects add objects from all
>> > worktrees", 2017-08-23) made sure that pruning takes objects from all
>> > worktrees into account.
>> > 
>> > It did that by reading the index of every worktree and adding the
>> > necessary index objects to the set of pending objects.  The index is
>> > read by read_index_from.  As mentioned in the previous commit,
>> > read_index_from depends on the CWD for the location of the split index,
>> 
>> As I mentioned before this doesn't actually depend on the CWD but
>> rather the per-worktree gitdir.
>
> Right, will fix.

Thanks.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX
  2017-12-18 18:16     ` Lars Schneider
@ 2018-01-04 20:13       ` Thomas Gummerer
  2018-01-05 11:03         ` Lars Schneider
  2018-01-07 20:02         ` Thomas Gummerer
  0 siblings, 2 replies; 77+ messages in thread
From: Thomas Gummerer @ 2018-01-04 20:13 UTC (permalink / raw)
  To: Lars Schneider; +Cc: Git List, Brandon Williams, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy, SZEDER Gábor

On 12/18, Lars Schneider wrote:
> 
> > On 17 Dec 2017, at 23:51, Thomas Gummerer <t.gummerer@gmail.com> wrote:
> > 
> > Split index mode only has a few dedicated tests, but as the index is
> > involved in nearly every git operation, this doesn't quite cover all the
> > ways repositories with split index can break.  To use split index mode
> > throughout the test suite a GIT_TEST_SPLIT_INDEX environment variable
> > can be set, which makes git split the index at random and thus
> > excercises the functionality much more thoroughly.
> > 
> > As this is not turned on by default, it is not executed nearly as often
> > as the test suite is run, so occationally breakages slip through.  Try
> > to counteract that by running the test suite with GIT_TEST_SPLIT_INDEX
> > mode turned on on travis.
> > 
> > To avoid using too many cycles on travis only run split index mode in
> > the linux-gcc and the linux 32-bit gcc targets.
> 
> I am surprised to see the split index mode test for the linux 32-bit
> target. Is it likely that a split index bug appears only on 32-bit? 
> Wouldn't the linux-gcc target be sufficient to save resources/time?

I'm not sure it's particularly likely for a bug to appear only on
32-bit builds.  It also doesn't seem to take too long to run, so I
thought I'd add it just in case, but I'm happy running the tests only
in the 64-bit builds if that's preferred.

> >  The Linux builds were
> > chosen over the Mac OS builds because they tend to be much faster to
> > complete.
> > 
> > The linux gcc build was chosen over the linux clang build because the
> > linux clang build is the fastest build, so it can serve as an early
> > indicator if something is broken and we want to avoid spending the extra
> > cycles of running the test suite twice for that.
> > 
> > Helped-by: Lars Schneider <larsxschneider@gmail.com>
> > Helped-by: Junio C Hamano <gitster@pobox.com>
> > Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
> > ---
> > ci/run-linux32-build.sh | 1 +
> > ci/run-tests.sh         | 4 ++++
> > 2 files changed, 5 insertions(+)
> > 
> > diff --git a/ci/run-linux32-build.sh b/ci/run-linux32-build.sh
> > index e30fb2cddc..f173c9cf2a 100755
> > --- a/ci/run-linux32-build.sh
> > +++ b/ci/run-linux32-build.sh
> > @@ -27,4 +27,5 @@ linux32 --32bit i386 su -m -l $CI_USER -c '
> >     cd /usr/src/git &&
> >     make --jobs=2 &&
> >     make --quiet test
> > +    GIT_TEST_SPLIT_INDEX=YesPlease make --quiet test
> > '
> > diff --git a/ci/run-tests.sh b/ci/run-tests.sh
> > index f0c743de94..c7aee5b9ff 100755
> > --- a/ci/run-tests.sh
> > +++ b/ci/run-tests.sh
> > @@ -8,3 +8,7 @@
> > mkdir -p $HOME/travis-cache
> > ln -s $HOME/travis-cache/.prove t/.prove
> > make --quiet test
> > +if test "$jobname" = "linux-gcc"
> > +then
> > +	GIT_TEST_SPLIT_INDEX=YesPlease make --quiet test
> > +fi
> 
> For now I think that looks good. Maybe we could define additional test 
> configurations with an environment variable. That could be an array variable
> defined in the lib-travis.ci "case" statement:
> https://github.com/git/git/blob/1229713f78cd2883798e95f33c19c81b523413fd/ci/lib-travisci.sh#L42-L65

That sounds like a good idea.  I'll try to see if I can come up with
something.

> - Lars

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX
  2018-01-04 20:13       ` Thomas Gummerer
@ 2018-01-05 11:03         ` Lars Schneider
  2018-01-07 20:02         ` Thomas Gummerer
  1 sibling, 0 replies; 77+ messages in thread
From: Lars Schneider @ 2018-01-05 11:03 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: Git List, Brandon Williams, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy, SZEDER Gábor


> On 04 Jan 2018, at 21:13, Thomas Gummerer <t.gummerer@gmail.com> wrote:
> 
> On 12/18, Lars Schneider wrote:
>> 
>>> On 17 Dec 2017, at 23:51, Thomas Gummerer <t.gummerer@gmail.com> wrote:
>>> 
>>> Split index mode only has a few dedicated tests, but as the index is
>>> involved in nearly every git operation, this doesn't quite cover all the
>>> ways repositories with split index can break.  To use split index mode
>>> throughout the test suite a GIT_TEST_SPLIT_INDEX environment variable
>>> can be set, which makes git split the index at random and thus
>>> excercises the functionality much more thoroughly.
>>> 
>>> As this is not turned on by default, it is not executed nearly as often
>>> as the test suite is run, so occationally breakages slip through.  Try
>>> to counteract that by running the test suite with GIT_TEST_SPLIT_INDEX
>>> mode turned on on travis.
>>> 
>>> To avoid using too many cycles on travis only run split index mode in
>>> the linux-gcc and the linux 32-bit gcc targets.
>> 
>> I am surprised to see the split index mode test for the linux 32-bit
>> target. Is it likely that a split index bug appears only on 32-bit? 
>> Wouldn't the linux-gcc target be sufficient to save resources/time?
> 
> I'm not sure it's particularly likely for a bug to appear only on
> 32-bit builds.  It also doesn't seem to take too long to run, so I
> thought I'd add it just in case, but I'm happy running the tests only
> in the 64-bit builds if that's preferred.

CI running times naturally add up over time as more and more stuff is
tested. That's why I would argue testing GIT_TEST_SPLIT_INDEX on 64-bit
is probably sufficient. However, I am not really familiar with 
GIT_TEST_SPLIT_INDEX. 

There is no right or wrong here and either way is fine with me. 
I just wanted to express my "keep an eye on CI time" concern.


Thanks,
Lars

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX
  2018-01-04 20:13       ` Thomas Gummerer
  2018-01-05 11:03         ` Lars Schneider
@ 2018-01-07 20:02         ` Thomas Gummerer
  1 sibling, 0 replies; 77+ messages in thread
From: Thomas Gummerer @ 2018-01-07 20:02 UTC (permalink / raw)
  To: Lars Schneider; +Cc: Git List, Brandon Williams, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy, SZEDER Gábor

On 01/04, Thomas Gummerer wrote:
> On 12/18, Lars Schneider wrote:
> > [snip]
> > For now I think that looks good. Maybe we could define additional test 
> > configurations with an environment variable. That could be an array variable
> > defined in the lib-travis.ci "case" statement:
> > https://github.com/git/git/blob/1229713f78cd2883798e95f33c19c81b523413fd/ci/lib-travisci.sh#L42-L65
> 
> That sounds like a good idea.  I'll try to see if I can come up with
> something.

On second thought I'd prefer to just leave it as is for now, and leave
defining additional test configurations for a future iteration.
Having it configurable makes it a bit uglier, and I'm not even sure
how many configurations we can just test at runtime vs. having to
compile with the flag set.  So I'd like to punt on that for now, and
introduce more configurability once we actually need it :)

> > - Lars

^ permalink raw reply	[flat|nested] 77+ messages in thread

* [PATCH v3 0/3] fixes for split index mode
  2017-12-17 22:51 ` [PATCH v2 0/3] fixes for split index mode Thomas Gummerer
                     ` (2 preceding siblings ...)
  2017-12-17 22:51   ` [PATCH v2 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX Thomas Gummerer
@ 2018-01-07 22:30   ` Thomas Gummerer
  2018-01-07 22:30     ` [PATCH v3 1/3] read-cache: fix reading the shared index for other repos Thomas Gummerer
                       ` (4 more replies)
  3 siblings, 5 replies; 77+ messages in thread
From: Thomas Gummerer @ 2018-01-07 22:30 UTC (permalink / raw)
  To: git; +Cc: Lars Schneider, Brandon Williams, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy, SZEDER Gábor, Thomas Gummerer

Thanks Brandon and Lars for comments on the previous round.

Previous rounds were at <20171210212202.28231-1-t.gummerer@gmail.com>
and <20171217225122.28941-1-t.gummerer@gmail.com>.

Changes since the previous round:

- reworked the patches to no longer try to use struct repository for
  worktrees, but pass gitdir into read_index_from instead (Thanks
  Brandon :)).  So the fixes that were in 1/3 and 2/3 previously are
  now in 1/3
- 2/3 now fixes t7009 properly.  I thought it was fixed before, but it
  probably just passed the tests because of the GIT_TEST_SPLIT_INDEX
  "randomness".
- The travis job is now only running the 64-bit linux build with split
  index mode to save even more cycles.
- As this wasn't picked up anywhere yet, I took the freedom to rebase
  this onto the current master, which includes sg/travis-fixes, which
  made it a bit easier for me to test.  If this makes the life harder
  for anyone reviewing this let me know and I can base it on the same
  commit previous iterations were based on.

Thomas Gummerer (3):
  read-cache: fix reading the shared index for other repos
  split-index: don't write cache tree with null sha1 entries
  travis: run tests with GIT_TEST_SPLIT_INDEX

 cache-tree.c            |  2 +-
 cache.h                 |  8 +++++---
 ci/run-linux32-build.sh |  1 +
 ci/run-tests.sh         |  4 ++++
 read-cache.c            | 25 ++++++++++++++-----------
 repository.c            |  2 +-
 revision.c              |  3 ++-
 split-index.c           |  2 ++
 t/t1700-split-index.sh  | 19 +++++++++++++++++++
 9 files changed, 49 insertions(+), 17 deletions(-)

-- 
2.16.0.rc1.238.g530d649a79


^ permalink raw reply	[flat|nested] 77+ messages in thread

* [PATCH v3 1/3] read-cache: fix reading the shared index for other repos
  2018-01-07 22:30   ` [PATCH v3 0/3] fixes for split index mode Thomas Gummerer
@ 2018-01-07 22:30     ` Thomas Gummerer
  2018-01-08 10:41       ` Duy Nguyen
                         ` (2 more replies)
  2018-01-07 22:30     ` [PATCH v3 2/3] split-index: don't write cache tree with null oid entries Thomas Gummerer
                       ` (3 subsequent siblings)
  4 siblings, 3 replies; 77+ messages in thread
From: Thomas Gummerer @ 2018-01-07 22:30 UTC (permalink / raw)
  To: git; +Cc: Lars Schneider, Brandon Williams, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy, SZEDER Gábor, Thomas Gummerer

read_index_from() takes a path argument for the location of the index
file.  For reading the shared index in split index mode however it just
ignores that path argument, and reads it from the gitdir of the current
repository.

This works as long as an index in the_repository is read.  Once that
changes, such as when we read the index of a submodule, or of a
different working tree than the current one, the gitdir of
the_repository will no longer contain the appropriate shared index,
and git will fail to read it.

For example t3007-ls-files-recurse-submodules.sh was broken with
GIT_TEST_SPLIT_INDEX set in 188dce131f ("ls-files: use repository
object", 2017-06-22), and t7814-grep-recurse-submodules.sh was also
broken in a similar manner, probably by introducing struct repository
there, although I didn't track down the exact commit for that.

be489d02d2 ("revision.c: --indexed-objects add objects from all
worktrees", 2017-08-23) breaks with split index mode in a similar
manner, not erroring out when it can't read the index, but instead
carrying on with pruning, without taking the index of the worktree into
account.

Fix this by passing an additional gitdir parameter to read_index_from,
to indicate where it should look for and read the shared index from.

read_cache_from() defaults to using the gitdir of the_repository.  As it
is mostly a convenience macro, having to pass get_git_dir() for every
call seems overkill, and if necessary users can have more control by
using read_index_from().

Helped-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
 cache-tree.c |  2 +-
 cache.h      |  5 +++--
 read-cache.c | 23 +++++++++++++----------
 repository.c |  2 +-
 revision.c   |  3 ++-
 5 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/cache-tree.c b/cache-tree.c
index e03e72c34a..0dd6292a94 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -608,7 +608,7 @@ int write_index_as_tree(unsigned char *sha1, struct index_state *index_state, co
 
 	hold_lock_file_for_update(&lock_file, index_path, LOCK_DIE_ON_ERROR);
 
-	entries = read_index_from(index_state, index_path);
+	entries = read_index_from(index_state, index_path, get_git_dir());
 	if (entries < 0) {
 		ret = WRITE_TREE_UNREADABLE_INDEX;
 		goto out;
diff --git a/cache.h b/cache.h
index d8b975a571..d5a7d1d7f1 100644
--- a/cache.h
+++ b/cache.h
@@ -371,7 +371,7 @@ extern void free_name_hash(struct index_state *istate);
 #define active_cache_tree (the_index.cache_tree)
 
 #define read_cache() read_index(&the_index)
-#define read_cache_from(path) read_index_from(&the_index, (path))
+#define read_cache_from(path) read_index_from(&the_index, (path), (get_git_dir()))
 #define read_cache_preload(pathspec) read_index_preload(&the_index, (pathspec))
 #define is_cache_unborn() is_index_unborn(&the_index)
 #define read_cache_unmerged() read_index_unmerged(&the_index)
@@ -616,7 +616,8 @@ extern int read_index(struct index_state *);
 extern int read_index_preload(struct index_state *, const struct pathspec *pathspec);
 extern int do_read_index(struct index_state *istate, const char *path,
 			 int must_exist); /* for testting only! */
-extern int read_index_from(struct index_state *, const char *path);
+extern int read_index_from(struct index_state *, const char *path,
+			   const char *gitdir);
 extern int is_index_unborn(struct index_state *);
 extern int read_index_unmerged(struct index_state *);
 
diff --git a/read-cache.c b/read-cache.c
index 2eb81a66b9..3f5b4afa36 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1603,7 +1603,7 @@ int hold_locked_index(struct lock_file *lk, int lock_flags)
 
 int read_index(struct index_state *istate)
 {
-	return read_index_from(istate, get_index_file());
+	return read_index_from(istate, get_index_file(), get_git_dir());
 }
 
 static struct cache_entry *cache_entry_from_ondisk(struct ondisk_cache_entry *ondisk,
@@ -1863,20 +1863,19 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
  * This way, shared index can be removed if they have not been used
  * for some time.
  */
-static void freshen_shared_index(char *base_sha1_hex, int warn)
+static void freshen_shared_index(const char *shared_index, int warn)
 {
-	char *shared_index = git_pathdup("sharedindex.%s", base_sha1_hex);
 	if (!check_and_freshen_file(shared_index, 1) && warn)
 		warning("could not freshen shared index '%s'", shared_index);
-	free(shared_index);
 }
 
-int read_index_from(struct index_state *istate, const char *path)
+int read_index_from(struct index_state *istate, const char *path,
+		    const char *gitdir)
 {
 	struct split_index *split_index;
 	int ret;
 	char *base_sha1_hex;
-	const char *base_path;
+	char *base_path;
 
 	/* istate->initialized covers both .git/index and .git/sharedindex.xxx */
 	if (istate->initialized)
@@ -1896,16 +1895,17 @@ int read_index_from(struct index_state *istate, const char *path)
 		split_index->base = xcalloc(1, sizeof(*split_index->base));
 
 	base_sha1_hex = sha1_to_hex(split_index->base_sha1);
-	base_path = git_path("sharedindex.%s", base_sha1_hex);
+	base_path = xstrfmt("%s/sharedindex.%s", gitdir, base_sha1_hex);
 	ret = do_read_index(split_index->base, base_path, 1);
 	if (hashcmp(split_index->base_sha1, split_index->base->sha1))
 		die("broken index, expect %s in %s, got %s",
 		    base_sha1_hex, base_path,
 		    sha1_to_hex(split_index->base->sha1));
 
-	freshen_shared_index(base_sha1_hex, 0);
+	freshen_shared_index(base_path, 0);
 	merge_base_index(istate);
 	post_read_index_from(istate);
+	free(base_path);
 	return ret;
 }
 
@@ -2573,8 +2573,11 @@ int write_locked_index(struct index_state *istate, struct lock_file *lock,
 	ret = write_split_index(istate, lock, flags);
 
 	/* Freshen the shared index only if the split-index was written */
-	if (!ret && !new_shared_index)
-		freshen_shared_index(sha1_to_hex(si->base_sha1), 1);
+	if (!ret && !new_shared_index) {
+		const char *shared_index = git_path("sharedindex.%s",
+						    sha1_to_hex(si->base_sha1));
+		freshen_shared_index(shared_index, 1);
+	}
 
 out:
 	if (flags & COMMIT_LOCK)
diff --git a/repository.c b/repository.c
index 998413b8bb..0d715f4fdb 100644
--- a/repository.c
+++ b/repository.c
@@ -236,5 +236,5 @@ int repo_read_index(struct repository *repo)
 	if (!repo->index)
 		repo->index = xcalloc(1, sizeof(*repo->index));
 
-	return read_index_from(repo->index, repo->index_file);
+	return read_index_from(repo->index, repo->index_file, repo->gitdir);
 }
diff --git a/revision.c b/revision.c
index 72f2b4572e..76d6d125b3 100644
--- a/revision.c
+++ b/revision.c
@@ -1352,7 +1352,8 @@ void add_index_objects_to_pending(struct rev_info *revs, unsigned int flags)
 			continue; /* current index already taken care of */
 
 		if (read_index_from(&istate,
-				    worktree_git_path(wt, "index")) > 0)
+				    worktree_git_path(wt, "index"),
+				    get_worktree_git_dir(wt)) > 0)
 			do_add_index_objects_to_pending(revs, &istate);
 		discard_index(&istate);
 	}
-- 
2.16.0.rc1.238.g530d649a79


^ permalink raw reply	[flat|nested] 77+ messages in thread

* [PATCH v3 2/3] split-index: don't write cache tree with null oid entries
  2018-01-07 22:30   ` [PATCH v3 0/3] fixes for split index mode Thomas Gummerer
  2018-01-07 22:30     ` [PATCH v3 1/3] read-cache: fix reading the shared index for other repos Thomas Gummerer
@ 2018-01-07 22:30     ` Thomas Gummerer
  2018-01-07 22:30     ` [PATCH v3 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX Thomas Gummerer
                       ` (2 subsequent siblings)
  4 siblings, 0 replies; 77+ messages in thread
From: Thomas Gummerer @ 2018-01-07 22:30 UTC (permalink / raw)
  To: git; +Cc: Lars Schneider, Brandon Williams, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy, SZEDER Gábor, Thomas Gummerer

In a96d3cc3f6 ("cache-tree: reject entries with null sha1", 2017-04-21)
we made sure that broken cache entries do not get propagated to new
trees.  Part of that was making sure not to re-use an existing cache
tree that includes a null oid.

It did so by dropping the cache tree in 'do_write_index()' if one of
the entries contains a null oid.  In split index mode however, there
are two invocations to 'do_write_index()', one for the shared index
and one for the split index.  The cache tree is only written once, to
the split index.

As we only loop through the elements that are effectively being
written by the current invocation, that may not include the entry with
a null oid in the split index (when it is already written to the
shared index), where we write the cache tree.  Therefore in split
index mode we may still end up writing the cache tree, even though
there is an entry with a null oid in the index.

Fix this by checking for null oids in prepare_to_write_split_index,
where we loop the entries of the shared index as well as the entries for
the split index.

This fixes t7009 with GIT_TEST_SPLIT_INDEX.  Also add a new test that's
more specifically showing the problem.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
 cache.h                |  3 ++-
 read-cache.c           |  2 +-
 split-index.c          |  2 ++
 t/t1700-split-index.sh | 19 +++++++++++++++++++
 4 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/cache.h b/cache.h
index d5a7d1d7f1..fd755c32cf 100644
--- a/cache.h
+++ b/cache.h
@@ -345,7 +345,8 @@ struct index_state {
 	struct split_index *split_index;
 	struct cache_time timestamp;
 	unsigned name_hash_initialized : 1,
-		 initialized : 1;
+		 initialized : 1,
+		 drop_cache_tree : 1;
 	struct hashmap name_hash;
 	struct hashmap dir_hash;
 	unsigned char sha1[20];
diff --git a/read-cache.c b/read-cache.c
index 3f5b4afa36..d13ce83794 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2243,7 +2243,7 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 	struct stat st;
 	struct ondisk_cache_entry_extended ondisk;
 	struct strbuf previous_name_buf = STRBUF_INIT, *previous_name;
-	int drop_cache_tree = 0;
+	int drop_cache_tree = istate->drop_cache_tree;
 
 	for (i = removed = extended = 0; i < entries; i++) {
 		if (cache[i]->ce_flags & CE_REMOVE)
diff --git a/split-index.c b/split-index.c
index 83e39ec8d7..284d04d67f 100644
--- a/split-index.c
+++ b/split-index.c
@@ -238,6 +238,8 @@ void prepare_to_write_split_index(struct index_state *istate)
 				ALLOC_GROW(entries, nr_entries+1, nr_alloc);
 				entries[nr_entries++] = ce;
 			}
+			if (is_null_oid(&ce->oid))
+				istate->drop_cache_tree = 1;
 		}
 	}
 
diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index af9b847761..c087b63367 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -401,4 +401,23 @@ done <<\EOF
 0642 -rw-r---w-
 EOF
 
+test_expect_success 'writing split index with null sha1 does not write cache tree' '
+	git config core.splitIndex true &&
+	git config splitIndex.maxPercentChange 0 &&
+	git commit -m "commit" &&
+	{
+		git ls-tree HEAD &&
+		printf "160000 commit $_z40\\tbroken\\n"
+	} >broken-tree &&
+	echo "add broken entry" >msg &&
+
+	tree=$(git mktree <broken-tree) &&
+	test_tick &&
+	commit=$(git commit-tree $tree -p HEAD <msg) &&
+	git update-ref HEAD "$commit" &&
+	GIT_ALLOW_NULL_SHA1=1 git reset --hard &&
+	(test-dump-cache-tree >cache-tree.out || true) &&
+	test_line_count = 0 cache-tree.out
+'
+
 test_done
-- 
2.16.0.rc1.238.g530d649a79


^ permalink raw reply	[flat|nested] 77+ messages in thread

* [PATCH v3 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX
  2018-01-07 22:30   ` [PATCH v3 0/3] fixes for split index mode Thomas Gummerer
  2018-01-07 22:30     ` [PATCH v3 1/3] read-cache: fix reading the shared index for other repos Thomas Gummerer
  2018-01-07 22:30     ` [PATCH v3 2/3] split-index: don't write cache tree with null oid entries Thomas Gummerer
@ 2018-01-07 22:30     ` Thomas Gummerer
  2018-01-13 22:37     ` [PATCH v3 4/3] read-cache: don't try to write index if we can't write shared index Thomas Gummerer
  2018-01-18 21:53     ` [PATCH v3 0/3] fixes for split index mode Thomas Gummerer
  4 siblings, 0 replies; 77+ messages in thread
From: Thomas Gummerer @ 2018-01-07 22:30 UTC (permalink / raw)
  To: git; +Cc: Lars Schneider, Brandon Williams, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy, SZEDER Gábor, Thomas Gummerer

Split index mode only has a few dedicated tests, but as the index is
involved in nearly every git operation, this doesn't quite cover all the
ways repositories with split index can break.  To use split index mode
throughout the test suite a GIT_TEST_SPLIT_INDEX environment variable
can be set, which makes git split the index at random and thus
excercises the functionality much more thoroughly.

As this is not turned on by default, it is not executed nearly as often
as the test suite is run, so occationally breakages slip through.  Try
to counteract that by running the test suite with GIT_TEST_SPLIT_INDEX
mode turned on on travis.

To avoid using too many cycles on travis only run split index mode in
the linux-gcc target only.  The Linux build was chosen over the Mac OS
builds because it tends to be much faster to complete.

The linux gcc build was chosen over the linux clang build because the
linux clang build is the fastest build, so it can serve as an early
indicator if something is broken and we want to avoid spending the extra
cycles of running the test suite twice for that.

Helped-by: Lars Schneider <larsxschneider@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
 ci/run-tests.sh | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/ci/run-tests.sh b/ci/run-tests.sh
index f0c743de94..c7aee5b9ff 100755
--- a/ci/run-tests.sh
+++ b/ci/run-tests.sh
@@ -8,3 +8,7 @@
 mkdir -p $HOME/travis-cache
 ln -s $HOME/travis-cache/.prove t/.prove
 make --quiet test
+if test "$jobname" = "linux-gcc"
+then
+	GIT_TEST_SPLIT_INDEX=YesPlease make --quiet test
+fi
-- 
2.16.0.rc1.238.g530d649a79


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v3 1/3] read-cache: fix reading the shared index for other repos
  2018-01-07 22:30     ` [PATCH v3 1/3] read-cache: fix reading the shared index for other repos Thomas Gummerer
@ 2018-01-08 10:41       ` Duy Nguyen
  2018-01-08 22:41         ` Thomas Gummerer
  2018-01-08 23:38         ` Brandon Williams
  2018-01-16 21:42       ` Brandon Williams
  2018-01-19 21:57       ` Junio C Hamano
  2 siblings, 2 replies; 77+ messages in thread
From: Duy Nguyen @ 2018-01-08 10:41 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: Git Mailing List, Lars Schneider, Brandon Williams, Jeff King, Junio C Hamano, SZEDER Gábor

On Mon, Jan 8, 2018 at 5:30 AM, Thomas Gummerer <t.gummerer@gmail.com> wrote:
> @@ -1896,16 +1895,17 @@ int read_index_from(struct index_state *istate, const char *path)
>                 split_index->base = xcalloc(1, sizeof(*split_index->base));
>
>         base_sha1_hex = sha1_to_hex(split_index->base_sha1);
> -       base_path = git_path("sharedindex.%s", base_sha1_hex);
> +       base_path = xstrfmt("%s/sharedindex.%s", gitdir, base_sha1_hex);

Personally I prefer the repo_git_path() from v2 (sorry I was away and
could not comment anything). The thing is, git_path() and friends
could do some path translation underneath to support multiple
worktrees. Think of the given path here as a "virtual path" that may
be translated to something else, not exactly <git_dir> + "/" +
"sharedindex.%s". But in practice, we're not breaking the relationship
between $GIT_DIR/index and $GIT_DIR/sharedindex.* any time soon, doing
manual path transformation here is fine.

What about the other git_path() in this file? With patch applied I still get

> ~/w/git/temp $ git grep git_path read-cache.c
read-cache.c:           shared_index_path = git_path("%s", de->d_name);
read-cache.c:   temp = mks_tempfile(git_path("sharedindex_XXXXXX"));
read-cache.c:                         git_path("sharedindex.%s",
sha1_to_hex(si->base->sha1)));
read-cache.c:           const char *shared_index = git_path("sharedindex.%s",

I suppose submodule has not triggered any of these code paths yet. Not
sure if we should deal with them now or wait until later.

Perhaps if we add a "struct repository *" pointer inside index_state,
we could retrieve back the_repository (or others) and call
repo_git_path() everywhere without changing index api too much. I
don't know. I like the  'struct repository' concept but couldn't
follow its development so I don't if this is what it should become.
-- 
Duy

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v3 1/3] read-cache: fix reading the shared index for other repos
  2018-01-08 10:41       ` Duy Nguyen
@ 2018-01-08 22:41         ` Thomas Gummerer
  2018-01-13 22:33           ` Thomas Gummerer
  2018-01-08 23:38         ` Brandon Williams
  1 sibling, 1 reply; 77+ messages in thread
From: Thomas Gummerer @ 2018-01-08 22:41 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Git Mailing List, Lars Schneider, Brandon Williams, Jeff King, Junio C Hamano, SZEDER Gábor

On 01/08, Duy Nguyen wrote:
> On Mon, Jan 8, 2018 at 5:30 AM, Thomas Gummerer <t.gummerer@gmail.com> wrote:
> > @@ -1896,16 +1895,17 @@ int read_index_from(struct index_state *istate, const char *path)
> >                 split_index->base = xcalloc(1, sizeof(*split_index->base));
> >
> >         base_sha1_hex = sha1_to_hex(split_index->base_sha1);
> > -       base_path = git_path("sharedindex.%s", base_sha1_hex);
> > +       base_path = xstrfmt("%s/sharedindex.%s", gitdir, base_sha1_hex);
> 
> Personally I prefer the repo_git_path() from v2 (sorry I was away and
> could not comment anything).

It felt slightly nicer to me as well, but it threw up a bunch of
questions about how worktrees will fit together with struct
repository.  As I don't feel very strongly about this either way I
decided to go with what Brandon suggested as an alternative, which
allows us to defer that decision.  I'd be happy to revert this to the
way I had it in v2, but I don't want to drag the discussion on too
long either, as this series does fix some real regressions.

> The thing is, git_path() and friends
> could do some path translation underneath to support multiple
> worktrees. Think of the given path here as a "virtual path" that may
> be translated to something else, not exactly <git_dir> + "/" +
> "sharedindex.%s". But in practice, we're not breaking the relationship
> between $GIT_DIR/index and $GIT_DIR/sharedindex.* any time soon, doing
> manual path transformation here is fine.
>
> What about the other git_path() in this file? With patch applied I still get
> 
> > ~/w/git/temp $ git grep git_path read-cache.c
> read-cache.c:           shared_index_path = git_path("%s", de->d_name);
> read-cache.c:   temp = mks_tempfile(git_path("sharedindex_XXXXXX"));
> read-cache.c:                         git_path("sharedindex.%s",
> sha1_to_hex(si->base->sha1)));
> read-cache.c:           const char *shared_index = git_path("sharedindex.%s",

Good point, I hadn't looked at these, I only looked at the current
test failures.  I'm going to be away for the rest of the week, but
I'll have a look at them when I come back.

> I suppose submodule has not triggered any of these code paths yet. Not
> sure if we should deal with them now or wait until later.
> 
> Perhaps if we add a "struct repository *" pointer inside index_state,
> we could retrieve back the_repository (or others) and call
> repo_git_path() everywhere without changing index api too much. I
> don't know. I like the  'struct repository' concept but couldn't
> follow its development so I don't if this is what it should become.

Interesting.  I didn't follow the development of  struct repository
too closely either, so I'm not sure.  Brandon might have more of an
opinion on that? :)

> -- 
> Duy

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v3 1/3] read-cache: fix reading the shared index for other repos
  2018-01-08 10:41       ` Duy Nguyen
  2018-01-08 22:41         ` Thomas Gummerer
@ 2018-01-08 23:38         ` Brandon Williams
  2018-01-09  1:24           ` Duy Nguyen
  1 sibling, 1 reply; 77+ messages in thread
From: Brandon Williams @ 2018-01-08 23:38 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Thomas Gummerer, Git Mailing List, Lars Schneider, Jeff King, Junio C Hamano, SZEDER Gábor

On 01/08, Duy Nguyen wrote:
> On Mon, Jan 8, 2018 at 5:30 AM, Thomas Gummerer <t.gummerer@gmail.com> wrote:
> > @@ -1896,16 +1895,17 @@ int read_index_from(struct index_state *istate, const char *path)
> >                 split_index->base = xcalloc(1, sizeof(*split_index->base));
> >
> >         base_sha1_hex = sha1_to_hex(split_index->base_sha1);
> > -       base_path = git_path("sharedindex.%s", base_sha1_hex);
> > +       base_path = xstrfmt("%s/sharedindex.%s", gitdir, base_sha1_hex);
> 
> Personally I prefer the repo_git_path() from v2 (sorry I was away and
> could not comment anything). The thing is, git_path() and friends
> could do some path translation underneath to support multiple
> worktrees. Think of the given path here as a "virtual path" that may
> be translated to something else, not exactly <git_dir> + "/" +
> "sharedindex.%s". But in practice, we're not breaking the relationship
> between $GIT_DIR/index and $GIT_DIR/sharedindex.* any time soon, doing
> manual path transformation here is fine.

My biggest complaint about v2 is that we still don't quite know the best
way to integrate worktrees and struct repository yet so I was very
reluctant to start having them interact in the way v2 was using them
together.  I'm very much in favor of this version (v3) as each worktree
can explicitly provide their gitdir to be used to determine where to
read the shared index file without having to replicate a struct
repository for each.

> 
> What about the other git_path() in this file? With patch applied I still get
> 
> > ~/w/git/temp $ git grep git_path read-cache.c
> read-cache.c:           shared_index_path = git_path("%s", de->d_name);
> read-cache.c:   temp = mks_tempfile(git_path("sharedindex_XXXXXX"));
> read-cache.c:                         git_path("sharedindex.%s",
> sha1_to_hex(si->base->sha1)));
> read-cache.c:           const char *shared_index = git_path("sharedindex.%s",
> 
> I suppose submodule has not triggered any of these code paths yet. Not
> sure if we should deal with them now or wait until later.
> 
> Perhaps if we add a "struct repository *" pointer inside index_state,
> we could retrieve back the_repository (or others) and call
> repo_git_path() everywhere without changing index api too much. I
> don't know. I like the  'struct repository' concept but couldn't
> follow its development so I don't if this is what it should become.

I'm not too keen on having an index_state struct contain a back pointer
to a repository struct.  I do think that we may want to have worktree
structs contain a back pointer to the struct repository they correspond
to.  That way there is only one instance of the repository (and
object-store once that gets integrated) yet multiple worktrees.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v3 1/3] read-cache: fix reading the shared index for other repos
  2018-01-08 23:38         ` Brandon Williams
@ 2018-01-09  1:24           ` Duy Nguyen
  0 siblings, 0 replies; 77+ messages in thread
From: Duy Nguyen @ 2018-01-09  1:24 UTC (permalink / raw)
  To: Brandon Williams; +Cc: Thomas Gummerer, Git Mailing List, Lars Schneider, Jeff King, Junio C Hamano, SZEDER Gábor

On Tue, Jan 9, 2018 at 6:38 AM, Brandon Williams <bmwill@google.com> wrote:
> On 01/08, Duy Nguyen wrote:
>> On Mon, Jan 8, 2018 at 5:30 AM, Thomas Gummerer <t.gummerer@gmail.com> wrote:
>> > @@ -1896,16 +1895,17 @@ int read_index_from(struct index_state *istate, const char *path)
>> >                 split_index->base = xcalloc(1, sizeof(*split_index->base));
>> >
>> >         base_sha1_hex = sha1_to_hex(split_index->base_sha1);
>> > -       base_path = git_path("sharedindex.%s", base_sha1_hex);
>> > +       base_path = xstrfmt("%s/sharedindex.%s", gitdir, base_sha1_hex);
>>
>> Personally I prefer the repo_git_path() from v2 (sorry I was away and
>> could not comment anything). The thing is, git_path() and friends
>> could do some path translation underneath to support multiple
>> worktrees. Think of the given path here as a "virtual path" that may
>> be translated to something else, not exactly <git_dir> + "/" +
>> "sharedindex.%s". But in practice, we're not breaking the relationship
>> between $GIT_DIR/index and $GIT_DIR/sharedindex.* any time soon, doing
>> manual path transformation here is fine.
>
> My biggest complaint about v2 is that we still don't quite know the best
> way to integrate worktrees and struct repository yet so I was very
> reluctant to start having them interact in the way v2 was using them
> together.

This will be on my todo list (after I have finished with
nd/worktree-move which has been on 'pu' for too long). I'm ok with v3
then.

> I'm very much in favor of this version (v3) as each worktree
> can explicitly provide their gitdir to be used to determine where to
> read the shared index file without having to replicate a struct
> repository for each.

Isn't that the end goal though (I vaguely recall this discussion when
'struct repository' was an idea), that we will pass struct repository
around [1] rather than relying on global objects.

[1] or in this case struct worktree-something because the index
belongs more to a worktree than the object/refs database.

>> What about the other git_path() in this file? With patch applied I still get
>>
>> > ~/w/git/temp $ git grep git_path read-cache.c
>> read-cache.c:           shared_index_path = git_path("%s", de->d_name);
>> read-cache.c:   temp = mks_tempfile(git_path("sharedindex_XXXXXX"));
>> read-cache.c:                         git_path("sharedindex.%s",
>> sha1_to_hex(si->base->sha1)));
>> read-cache.c:           const char *shared_index = git_path("sharedindex.%s",
>>
>> I suppose submodule has not triggered any of these code paths yet. Not
>> sure if we should deal with them now or wait until later.
>>
>> Perhaps if we add a "struct repository *" pointer inside index_state,
>> we could retrieve back the_repository (or others) and call
>> repo_git_path() everywhere without changing index api too much. I
>> don't know. I like the  'struct repository' concept but couldn't
>> follow its development so I don't if this is what it should become.
>
> I'm not too keen on having an index_state struct contain a back pointer
> to a repository struct.  I do think that we may want to have worktree
> structs contain a back pointer to the struct repository they correspond
> to.  That way there is only one instance of the repository (and
> object-store once that gets integrated) yet multiple worktrees.

Yeah back pointer from worktree struct makes sense.
-- 
Duy

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v3 1/3] read-cache: fix reading the shared index for other repos
  2018-01-08 22:41         ` Thomas Gummerer
@ 2018-01-13 22:33           ` Thomas Gummerer
  0 siblings, 0 replies; 77+ messages in thread
From: Thomas Gummerer @ 2018-01-13 22:33 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Git Mailing List, Lars Schneider, Brandon Williams, Jeff King, Junio C Hamano, SZEDER Gábor

On 01/08, Thomas Gummerer wrote:
> On 01/08, Duy Nguyen wrote:
> > On Mon, Jan 8, 2018 at 5:30 AM, Thomas Gummerer <t.gummerer@gmail.com> wrote:
> > > @@ -1896,16 +1895,17 @@ int read_index_from(struct index_state *istate, const char *path)
> > >                 split_index->base = xcalloc(1, sizeof(*split_index->base));
> > >
> > >         base_sha1_hex = sha1_to_hex(split_index->base_sha1);
> > > -       base_path = git_path("sharedindex.%s", base_sha1_hex);
> > > +       base_path = xstrfmt("%s/sharedindex.%s", gitdir, base_sha1_hex);
> > 
> > Personally I prefer the repo_git_path() from v2 (sorry I was away and
> > could not comment anything).
> 
> It felt slightly nicer to me as well, but it threw up a bunch of
> questions about how worktrees will fit together with struct
> repository.  As I don't feel very strongly about this either way I
> decided to go with what Brandon suggested as an alternative, which
> allows us to defer that decision.  I'd be happy to revert this to the
> way I had it in v2, but I don't want to drag the discussion on too
> long either, as this series does fix some real regressions.
> 
> > The thing is, git_path() and friends
> > could do some path translation underneath to support multiple
> > worktrees. Think of the given path here as a "virtual path" that may
> > be translated to something else, not exactly <git_dir> + "/" +
> > "sharedindex.%s". But in practice, we're not breaking the relationship
> > between $GIT_DIR/index and $GIT_DIR/sharedindex.* any time soon, doing
> > manual path transformation here is fine.
> >
> > What about the other git_path() in this file? With patch applied I still get
> > 
> > > ~/w/git/temp $ git grep git_path read-cache.c
> > read-cache.c:           shared_index_path = git_path("%s", de->d_name);
> > read-cache.c:   temp = mks_tempfile(git_path("sharedindex_XXXXXX"));
> > read-cache.c:                         git_path("sharedindex.%s",
> > sha1_to_hex(si->base->sha1)));
> > read-cache.c:           const char *shared_index = git_path("sharedindex.%s",
> 
> Good point, I hadn't looked at these, I only looked at the current
> test failures.  I'm going to be away for the rest of the week, but
> I'll have a look at them when I come back.
> 
> > I suppose submodule has not triggered any of these code paths yet. Not
> > sure if we should deal with them now or wait until later.

Having had a look at these now, they are all in the write_locked_index
codepath.  We should probably have something like
'repo_write_locked_index()' for this.  But that probably requires a
bit more work/discussion to see what this should look like.  I'd
rather keep this patch series focused on the current breakages, and
deal with that in a separate patch series.

While looking at this, I did find another breakage in the split index
code, which I'll send as 4/3.

> > Perhaps if we add a "struct repository *" pointer inside index_state,
> > we could retrieve back the_repository (or others) and call
> > repo_git_path() everywhere without changing index api too much. I
> > don't know. I like the  'struct repository' concept but couldn't
> > follow its development so I don't if this is what it should become.
> 
> Interesting.  I didn't follow the development of  struct repository
> too closely either, so I'm not sure.  Brandon might have more of an
> opinion on that? :)
> 
> > -- 
> > Duy

^ permalink raw reply	[flat|nested] 77+ messages in thread

* [PATCH v3 4/3] read-cache: don't try to write index if we can't write shared index
  2018-01-07 22:30   ` [PATCH v3 0/3] fixes for split index mode Thomas Gummerer
                       ` (2 preceding siblings ...)
  2018-01-07 22:30     ` [PATCH v3 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX Thomas Gummerer
@ 2018-01-13 22:37     ` Thomas Gummerer
  2018-01-14  9:36       ` Duy Nguyen
  2018-01-18 21:53     ` [PATCH v3 0/3] fixes for split index mode Thomas Gummerer
  4 siblings, 1 reply; 77+ messages in thread
From: Thomas Gummerer @ 2018-01-13 22:37 UTC (permalink / raw)
  To: git; +Cc: Lars Schneider, Brandon Williams, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy, SZEDER Gábor

In a0a967568e ("update-index --split-index: do not split if $GIT_DIR is
read only", 2014-06-13), we tried to make sure we can still write an
index, even if the shared index can not be written.

We did so by just calling 'do_write_locked_index()' from
'write_shared_index()'.  'do_write_locked_index()' always at least
closes the tempfile nowadays, and used to close or commit the lockfile
if COMMIT_LOCK or CLOSE_LOCK were given at the time this feature was
introduced.  COMMIT_LOCK or CLOSE_LOCK is passed in by most callers of
'write_locked_index()'.

After calling 'write_shared_index()', we call 'write_split_index()',
which calls 'do_write_locked_index()' again, which then tries to use the
closed lockfile again, but in fact fails to do so as it's already
closed.

In the current version, git will in fact segfault if it can't create a
new file in $gitdir, and this feature seems to never have worked in the
first place.

Ever since introducing the split index feature, nobody has complained
about this failing, and it really just papers over repositories that
will sooner or later need fixing anyway.

Therefore just make being unable to write the split index a proper
error, and have users fix their repositories instead of trying (but
failing) to paper over the error.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
 read-cache.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index d13ce83794..a9c8facdfd 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2471,18 +2471,15 @@ static int clean_shared_index_files(const char *current_hex)
 	return 0;
 }
 
-static int write_shared_index(struct index_state *istate,
-			      struct lock_file *lock, unsigned flags)
+static int write_shared_index(struct index_state *istate)
 {
 	struct tempfile *temp;
 	struct split_index *si = istate->split_index;
 	int ret;
 
 	temp = mks_tempfile(git_path("sharedindex_XXXXXX"));
-	if (!temp) {
-		hashclr(si->base_sha1);
-		return do_write_locked_index(istate, lock, flags);
-	}
+	if (!temp)
+		return error("cannot create new shared index");
 	move_cache_to_base_index(istate);
 	ret = do_write_index(si->base, temp, 1);
 	if (ret) {
@@ -2565,7 +2562,7 @@ int write_locked_index(struct index_state *istate, struct lock_file *lock,
 	new_shared_index = istate->cache_changed & SPLIT_INDEX_ORDERED;
 
 	if (new_shared_index) {
-		ret = write_shared_index(istate, lock, flags);
+		ret = write_shared_index(istate);
 		if (ret)
 			goto out;
 	}
-- 
2.16.0.rc2.280.g09355b536d


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v3 4/3] read-cache: don't try to write index if we can't write shared index
  2018-01-13 22:37     ` [PATCH v3 4/3] read-cache: don't try to write index if we can't write shared index Thomas Gummerer
@ 2018-01-14  9:36       ` Duy Nguyen
  2018-01-14 10:18         ` [PATCH 1/3] read-cache.c: change type of "temp" in write_shared_index() Nguyễn Thái Ngọc Duy
  2018-01-14 14:29         ` [PATCH v3 4/3] read-cache: don't try to write index " Thomas Gummerer
  0 siblings, 2 replies; 77+ messages in thread
From: Duy Nguyen @ 2018-01-14  9:36 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: Git Mailing List, Lars Schneider, Brandon Williams, Jeff King, Junio C Hamano, SZEDER Gábor

On Sun, Jan 14, 2018 at 5:37 AM, Thomas Gummerer <t.gummerer@gmail.com> wrote:
> In a0a967568e ("update-index --split-index: do not split if $GIT_DIR is
> read only", 2014-06-13), we tried to make sure we can still write an
> index, even if the shared index can not be written.
>
> We did so by just calling 'do_write_locked_index()' from
> 'write_shared_index()'.  'do_write_locked_index()' always at least
> closes the tempfile nowadays, and used to close or commit the lockfile
> if COMMIT_LOCK or CLOSE_LOCK were given at the time this feature was
> introduced.  COMMIT_LOCK or CLOSE_LOCK is passed in by most callers of
> 'write_locked_index()'.
>
> After calling 'write_shared_index()', we call 'write_split_index()',
> which calls 'do_write_locked_index()' again, which then tries to use the
> closed lockfile again, but in fact fails to do so as it's already
> closed.
>
> In the current version, git will in fact segfault if it can't create a
> new file in $gitdir, and this feature seems to never have worked in the
> first place.
>
> Ever since introducing the split index feature, nobody has complained
> about this failing, and it really just papers over repositories that
> will sooner or later need fixing anyway.

Actually there's one valid case for this: you're accessing a read-only
$GIT_DIR (.e.g maybe from a web server cgi script which may be run by
user nobody or something) and creating a temporary index _outside_
$GIT_DIR. I used to do this when I wanted to do "git grep" on some
SHA-1 a couple times. Doing "git grep <SHA-1>" directly (a couple
times) pays full cost for walking trees. If you prepare an index
first, you pay it only once.

> Therefore just make being unable to write the split index a proper
> error, and have users fix their repositories instead of trying (but
> failing) to paper over the error.
>
> Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
> ---
>  read-cache.c | 11 ++++-------
>  1 file changed, 4 insertions(+), 7 deletions(-)
>
> diff --git a/read-cache.c b/read-cache.c
> index d13ce83794..a9c8facdfd 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -2471,18 +2471,15 @@ static int clean_shared_index_files(const char *current_hex)
>         return 0;
>  }
>
> -static int write_shared_index(struct index_state *istate,
> -                             struct lock_file *lock, unsigned flags)
> +static int write_shared_index(struct index_state *istate)
>  {
>         struct tempfile *temp;
>         struct split_index *si = istate->split_index;
>         int ret;
>
>         temp = mks_tempfile(git_path("sharedindex_XXXXXX"));
> -       if (!temp) {
> -               hashclr(si->base_sha1);
> -               return do_write_locked_index(istate, lock, flags);

I think this code tries to do what's done near the beginning of
write_locked_index() where we also bail out early:

-- 8< --
        if (!si || alternate_index_output ||
            (istate->cache_changed & ~EXTMASK)) {
                if (si)
                        hashclr(si->base_sha1);
                ret = do_write_locked_index(istate, lock, flags);
                goto out;
        }
-- 8< --

the only difference is it does not realize that it can't do "goto
out;" like that code unless something goes wrong. I'll try to prepare
a patch that move tempfile creation out of write_shared_index()
instead. Patches coming in a bit..
-- 
Duy

^ permalink raw reply	[flat|nested] 77+ messages in thread

* [PATCH 1/3] read-cache.c: change type of "temp" in write_shared_index()
  2018-01-14  9:36       ` Duy Nguyen
@ 2018-01-14 10:18         ` Nguyễn Thái Ngọc Duy
  2018-01-14 10:18           ` [PATCH 2/3] read-cache.c: move tempfile creation/cleanup out of write_shared_index Nguyễn Thái Ngọc Duy
  2018-01-14 10:18           ` [PATCH 3/3] read-cache: don't write index twice if we can't write shared index Nguyễn Thái Ngọc Duy
  2018-01-14 14:29         ` [PATCH v3 4/3] read-cache: don't try to write index " Thomas Gummerer
  1 sibling, 2 replies; 77+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2018-01-14 10:18 UTC (permalink / raw)
  To: git; +Cc: t.gummerer, larsxschneider, bmwill, peff, Junio C Hamano, szeder.dev, Nguyễn Thái Ngọc Duy

This local variable 'temp' will be passed in from the caller in the next
patch. To reduce patch noise, let's change its type now while it's still
a local variable and get all the trival conversion out of the next patch.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 read-cache.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 2eb81a66b9..536086e1fe 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2474,30 +2474,32 @@ static int clean_shared_index_files(const char *current_hex)
 static int write_shared_index(struct index_state *istate,
 			      struct lock_file *lock, unsigned flags)
 {
-	struct tempfile *temp;
+	struct tempfile *real_temp;
+	struct tempfile **temp = &real_temp;
 	struct split_index *si = istate->split_index;
 	int ret;
 
-	temp = mks_tempfile(git_path("sharedindex_XXXXXX"));
-	if (!temp) {
+	real_temp = mks_tempfile(git_path("sharedindex_XXXXXX"));
+	if (!real_temp) {
 		hashclr(si->base_sha1);
 		return do_write_locked_index(istate, lock, flags);
 	}
+	temp = &real_temp;
 	move_cache_to_base_index(istate);
-	ret = do_write_index(si->base, temp, 1);
+	ret = do_write_index(si->base, *temp, 1);
 	if (ret) {
-		delete_tempfile(&temp);
+		delete_tempfile(temp);
 		return ret;
 	}
-	ret = adjust_shared_perm(get_tempfile_path(temp));
+	ret = adjust_shared_perm(get_tempfile_path(*temp));
 	if (ret) {
 		int save_errno = errno;
-		error("cannot fix permission bits on %s", get_tempfile_path(temp));
-		delete_tempfile(&temp);
+		error("cannot fix permission bits on %s", get_tempfile_path(*temp));
+		delete_tempfile(temp);
 		errno = save_errno;
 		return ret;
 	}
-	ret = rename_tempfile(&temp,
+	ret = rename_tempfile(temp,
 			      git_path("sharedindex.%s", sha1_to_hex(si->base->sha1)));
 	if (!ret) {
 		hashcpy(si->base_sha1, si->base->sha1);
-- 
2.15.1.600.g899a5f85c6


^ permalink raw reply	[flat|nested] 77+ messages in thread

* [PATCH 2/3] read-cache.c: move tempfile creation/cleanup out of write_shared_index
  2018-01-14 10:18         ` [PATCH 1/3] read-cache.c: change type of "temp" in write_shared_index() Nguyễn Thái Ngọc Duy
@ 2018-01-14 10:18           ` Nguyễn Thái Ngọc Duy
  2018-01-14 10:18           ` [PATCH 3/3] read-cache: don't write index twice if we can't write shared index Nguyễn Thái Ngọc Duy
  1 sibling, 0 replies; 77+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2018-01-14 10:18 UTC (permalink / raw)
  To: git; +Cc: t.gummerer, larsxschneider, bmwill, peff, Junio C Hamano, szeder.dev, Nguyễn Thái Ngọc Duy

For one thing, we have more consistent cleanup procedure now and always
keep errno intact.

The real purpose is the ability to break out of write_locked_index()
early when mks_tempfile() fails in the next patch. It's more awkward to
do it if this mks_tempfile() is still inside write_shared_index().

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 read-cache.c | 33 +++++++++++++++++----------------
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 536086e1fe..c568643f55 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2472,31 +2472,18 @@ static int clean_shared_index_files(const char *current_hex)
 }
 
 static int write_shared_index(struct index_state *istate,
-			      struct lock_file *lock, unsigned flags)
+			      struct tempfile **temp)
 {
-	struct tempfile *real_temp;
-	struct tempfile **temp = &real_temp;
 	struct split_index *si = istate->split_index;
 	int ret;
 
-	real_temp = mks_tempfile(git_path("sharedindex_XXXXXX"));
-	if (!real_temp) {
-		hashclr(si->base_sha1);
-		return do_write_locked_index(istate, lock, flags);
-	}
-	temp = &real_temp;
 	move_cache_to_base_index(istate);
 	ret = do_write_index(si->base, *temp, 1);
-	if (ret) {
-		delete_tempfile(temp);
+	if (ret)
 		return ret;
-	}
 	ret = adjust_shared_perm(get_tempfile_path(*temp));
 	if (ret) {
-		int save_errno = errno;
 		error("cannot fix permission bits on %s", get_tempfile_path(*temp));
-		delete_tempfile(temp);
-		errno = save_errno;
 		return ret;
 	}
 	ret = rename_tempfile(temp,
@@ -2567,7 +2554,21 @@ int write_locked_index(struct index_state *istate, struct lock_file *lock,
 	new_shared_index = istate->cache_changed & SPLIT_INDEX_ORDERED;
 
 	if (new_shared_index) {
-		ret = write_shared_index(istate, lock, flags);
+		struct tempfile *temp;
+		int saved_errno;
+
+		temp = mks_tempfile(git_path("sharedindex_XXXXXX"));
+		if (!temp) {
+			hashclr(si->base_sha1);
+			ret = do_write_locked_index(istate, lock, flags);
+		} else
+			ret = write_shared_index(istate, &temp);
+
+		saved_errno = errno;
+		if (is_tempfile_active(temp))
+			delete_tempfile(&temp);
+		errno = saved_errno;
+
 		if (ret)
 			goto out;
 	}
-- 
2.15.1.600.g899a5f85c6


^ permalink raw reply	[flat|nested] 77+ messages in thread

* [PATCH 3/3] read-cache: don't write index twice if we can't write shared index
  2018-01-14 10:18         ` [PATCH 1/3] read-cache.c: change type of "temp" in write_shared_index() Nguyễn Thái Ngọc Duy
  2018-01-14 10:18           ` [PATCH 2/3] read-cache.c: move tempfile creation/cleanup out of write_shared_index Nguyễn Thái Ngọc Duy
@ 2018-01-14 10:18           ` Nguyễn Thái Ngọc Duy
  2018-01-18 11:36             ` SZEDER Gábor
  1 sibling, 1 reply; 77+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2018-01-14 10:18 UTC (permalink / raw)
  To: git; +Cc: t.gummerer, larsxschneider, bmwill, peff, Junio C Hamano, szeder.dev, Nguyễn Thái Ngọc Duy

In a0a967568e ("update-index --split-index: do not split if $GIT_DIR is
read only", 2014-06-13), we tried to make sure we can still write an
index, even if the shared index can not be written.

We did so by just calling 'do_write_locked_index()' just before
'write_shared_index()'.  'do_write_locked_index()' always at least
closes the tempfile nowadays, and used to close or commit the lockfile
if COMMIT_LOCK or CLOSE_LOCK were given at the time this feature was
introduced.  COMMIT_LOCK or CLOSE_LOCK is passed in by most callers of
'write_locked_index()'.

After calling 'write_shared_index()', we call 'write_split_index()',
which calls 'do_write_locked_index()' again, which then tries to use the
closed lockfile again, but in fact fails to do so as it's already
closed. This eventually leads to a segfault.

Make sure to write the main index only once.

[nd: most of the commit message and investigation done by Thomas, I only
tweaked the solution a bit]

Helped-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 read-cache.c           |  5 +++--
 t/t1700-split-index.sh | 19 +++++++++++++++++++
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index c568643f55..c58c0a978a 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2561,8 +2561,9 @@ int write_locked_index(struct index_state *istate, struct lock_file *lock,
 		if (!temp) {
 			hashclr(si->base_sha1);
 			ret = do_write_locked_index(istate, lock, flags);
-		} else
-			ret = write_shared_index(istate, &temp);
+			goto out;
+		}
+		ret = write_shared_index(istate, &temp);
 
 		saved_errno = errno;
 		if (is_tempfile_active(temp))
diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index af9b847761..5494389dbb 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -401,4 +401,23 @@ done <<\EOF
 0642 -rw-r---w-
 EOF
 
+test_expect_success POSIXPERM 'graceful handling splitting index is not allowed' '
+	test_create_repo ro &&
+	(
+		cd ro &&
+		test_commit initial &&
+		git update-index --split-index &&
+		test -f .git/sharedindex.*
+	) &&
+	test_when_finished "chmod -R u+w ro" &&
+	chmod -R u-w ro &&
+	cp ro/.git/index new-index &&
+	GIT_INDEX_FILE="$(pwd)/new-index" git -C ro update-index --split-index &&
+	chmod -R u+w ro &&
+	rm ro/.git/sharedindex.* &&
+	GIT_INDEX_FILE=new-index git ls-files >actual &&
+	echo initial.t >expected &&
+	test_cmp expected actual
+'
+
 test_done
-- 
2.15.1.600.g899a5f85c6


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v3 4/3] read-cache: don't try to write index if we can't write shared index
  2018-01-14  9:36       ` Duy Nguyen
  2018-01-14 10:18         ` [PATCH 1/3] read-cache.c: change type of "temp" in write_shared_index() Nguyễn Thái Ngọc Duy
@ 2018-01-14 14:29         ` " Thomas Gummerer
  1 sibling, 0 replies; 77+ messages in thread
From: Thomas Gummerer @ 2018-01-14 14:29 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Git Mailing List, Lars Schneider, Brandon Williams, Jeff King, Junio C Hamano, SZEDER Gábor

On 01/14, Duy Nguyen wrote:
> On Sun, Jan 14, 2018 at 5:37 AM, Thomas Gummerer <t.gummerer@gmail.com> wrote:
> > In a0a967568e ("update-index --split-index: do not split if $GIT_DIR is
> > read only", 2014-06-13), we tried to make sure we can still write an
> > index, even if the shared index can not be written.
> >
> > We did so by just calling 'do_write_locked_index()' from
> > 'write_shared_index()'.  'do_write_locked_index()' always at least
> > closes the tempfile nowadays, and used to close or commit the lockfile
> > if COMMIT_LOCK or CLOSE_LOCK were given at the time this feature was
> > introduced.  COMMIT_LOCK or CLOSE_LOCK is passed in by most callers of
> > 'write_locked_index()'.
> >
> > After calling 'write_shared_index()', we call 'write_split_index()',
> > which calls 'do_write_locked_index()' again, which then tries to use the
> > closed lockfile again, but in fact fails to do so as it's already
> > closed.
> >
> > In the current version, git will in fact segfault if it can't create a
> > new file in $gitdir, and this feature seems to never have worked in the
> > first place.
> >
> > Ever since introducing the split index feature, nobody has complained
> > about this failing, and it really just papers over repositories that
> > will sooner or later need fixing anyway.
> 
> Actually there's one valid case for this: you're accessing a read-only
> $GIT_DIR (.e.g maybe from a web server cgi script which may be run by
> user nobody or something) and creating a temporary index _outside_
> $GIT_DIR. I used to do this when I wanted to do "git grep" on some
> SHA-1 a couple times. Doing "git grep <SHA-1>" directly (a couple
> times) pays full cost for walking trees. If you prepare an index
> first, you pay it only once.

Makes sense, I didn't realize that usecase, thanks!

> > Therefore just make being unable to write the split index a proper
> > error, and have users fix their repositories instead of trying (but
> > failing) to paper over the error.
> >
> > Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
> > ---
> >  read-cache.c | 11 ++++-------
> >  1 file changed, 4 insertions(+), 7 deletions(-)
> >
> > diff --git a/read-cache.c b/read-cache.c
> > index d13ce83794..a9c8facdfd 100644
> > --- a/read-cache.c
> > +++ b/read-cache.c
> > @@ -2471,18 +2471,15 @@ static int clean_shared_index_files(const char *current_hex)
> >         return 0;
> >  }
> >
> > -static int write_shared_index(struct index_state *istate,
> > -                             struct lock_file *lock, unsigned flags)
> > +static int write_shared_index(struct index_state *istate)
> >  {
> >         struct tempfile *temp;
> >         struct split_index *si = istate->split_index;
> >         int ret;
> >
> >         temp = mks_tempfile(git_path("sharedindex_XXXXXX"));
> > -       if (!temp) {
> > -               hashclr(si->base_sha1);
> > -               return do_write_locked_index(istate, lock, flags);
> 
> I think this code tries to do what's done near the beginning of
> write_locked_index() where we also bail out early:
> 
> -- 8< --
>         if (!si || alternate_index_output ||
>             (istate->cache_changed & ~EXTMASK)) {
>                 if (si)
>                         hashclr(si->base_sha1);
>                 ret = do_write_locked_index(istate, lock, flags);
>                 goto out;
>         }
> -- 8< --
> 
> the only difference is it does not realize that it can't do "goto
> out;" like that code unless something goes wrong. I'll try to prepare
> a patch that move tempfile creation out of write_shared_index()
> instead. Patches coming in a bit..

Thanks for fixing this in a nicer way :)

> -- 
> Duy

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v3 1/3] read-cache: fix reading the shared index for other repos
  2018-01-07 22:30     ` [PATCH v3 1/3] read-cache: fix reading the shared index for other repos Thomas Gummerer
  2018-01-08 10:41       ` Duy Nguyen
@ 2018-01-16 21:42       ` Brandon Williams
  2018-01-17  0:16         ` Duy Nguyen
  2018-01-19 21:57       ` Junio C Hamano
  2 siblings, 1 reply; 77+ messages in thread
From: Brandon Williams @ 2018-01-16 21:42 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: git, Lars Schneider, Jeff King, Junio C Hamano, Nguyễn Thái Ngọc Duy, SZEDER Gábor

On 01/07, Thomas Gummerer wrote:
> read_index_from() takes a path argument for the location of the index
> file.  For reading the shared index in split index mode however it just
> ignores that path argument, and reads it from the gitdir of the current
> repository.
> 
> This works as long as an index in the_repository is read.  Once that
> changes, such as when we read the index of a submodule, or of a
> different working tree than the current one, the gitdir of
> the_repository will no longer contain the appropriate shared index,
> and git will fail to read it.
> 
> For example t3007-ls-files-recurse-submodules.sh was broken with
> GIT_TEST_SPLIT_INDEX set in 188dce131f ("ls-files: use repository
> object", 2017-06-22), and t7814-grep-recurse-submodules.sh was also
> broken in a similar manner, probably by introducing struct repository
> there, although I didn't track down the exact commit for that.
> 
> be489d02d2 ("revision.c: --indexed-objects add objects from all
> worktrees", 2017-08-23) breaks with split index mode in a similar
> manner, not erroring out when it can't read the index, but instead
> carrying on with pruning, without taking the index of the worktree into
> account.
> 
> Fix this by passing an additional gitdir parameter to read_index_from,
> to indicate where it should look for and read the shared index from.
> 
> read_cache_from() defaults to using the gitdir of the_repository.  As it
> is mostly a convenience macro, having to pass get_git_dir() for every
> call seems overkill, and if necessary users can have more control by
> using read_index_from().

I'm not saying we need to change this again but I got to thinking about
what the root cause for this bug is and I think it's a design flaw in
how split index is implemented.  IIUC Split index is an index extension
that can be enabled to limit the size of the index file that is written
when making changes to the index.  It breaks the index into two pieces,
index (which contains only changes) and sharedindex.XXXXX (which
contains unchanged information) where 'XXXXX' is a value found in the
index file.  If we don't do anything fancy then these two files live
next to one another in a repository's git directory at $GIT_DIR/index
and $GIT_DIR/sharedindex.XXXXX.  This seems to work all well and fine
except that this isn't always the case and the read_index_from function
takes this into account by enabling a caller to specify a path to where
the index file is located.  We can do this by specifying the index file
we want to use by setting GIT_INDEX_FILE.

Being able to specify an index file via the environment is a feature
that has existed for a very long time (one that I personally think makes
things difficult because of things like additions like the sharedindex)
and I don't think was taken into account when introducing the split
index extension.  In this case if i were to specify a location of an
index file in my home directory '~/index' and be using the split index
feature then the corresponding sharedindex file would live in my
repository's git directory '~/project/.git/sharedindex.XXXXX'.  So the
sharedindex file is always located relative to the project's git
directory and not the index file itself, which is kind of confusing.
Maybe a better design would be to have the sharedindex file located
relative to the index file.

Anyway, some food for thought.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v3 1/3] read-cache: fix reading the shared index for other repos
  2018-01-16 21:42       ` Brandon Williams
@ 2018-01-17  0:16         ` Duy Nguyen
  2018-01-17  0:32           ` Brandon Williams
  2018-01-17 18:16           ` Jonathan Nieder
  0 siblings, 2 replies; 77+ messages in thread
From: Duy Nguyen @ 2018-01-17  0:16 UTC (permalink / raw)
  To: Brandon Williams; +Cc: Thomas Gummerer, Git Mailing List, Lars Schneider, Jeff King, Junio C Hamano, SZEDER Gábor

On Wed, Jan 17, 2018 at 4:42 AM, Brandon Williams <bmwill@google.com> wrote:
> On 01/07, Thomas Gummerer wrote:
>> read_index_from() takes a path argument for the location of the index
>> file.  For reading the shared index in split index mode however it just
>> ignores that path argument, and reads it from the gitdir of the current
>> repository.
>>
>> This works as long as an index in the_repository is read.  Once that
>> changes, such as when we read the index of a submodule, or of a
>> different working tree than the current one, the gitdir of
>> the_repository will no longer contain the appropriate shared index,
>> and git will fail to read it.
>>
>> For example t3007-ls-files-recurse-submodules.sh was broken with
>> GIT_TEST_SPLIT_INDEX set in 188dce131f ("ls-files: use repository
>> object", 2017-06-22), and t7814-grep-recurse-submodules.sh was also
>> broken in a similar manner, probably by introducing struct repository
>> there, although I didn't track down the exact commit for that.
>>
>> be489d02d2 ("revision.c: --indexed-objects add objects from all
>> worktrees", 2017-08-23) breaks with split index mode in a similar
>> manner, not erroring out when it can't read the index, but instead
>> carrying on with pruning, without taking the index of the worktree into
>> account.
>>
>> Fix this by passing an additional gitdir parameter to read_index_from,
>> to indicate where it should look for and read the shared index from.
>>
>> read_cache_from() defaults to using the gitdir of the_repository.  As it
>> is mostly a convenience macro, having to pass get_git_dir() for every
>> call seems overkill, and if necessary users can have more control by
>> using read_index_from().
>
> I'm not saying we need to change this again but I got to thinking about
> what the root cause for this bug is and I think it's a design flaw in
> how split index is implemented.  IIUC Split index is an index extension
> that can be enabled to limit the size of the index file that is written
> when making changes to the index.  It breaks the index into two pieces,
> index (which contains only changes) and sharedindex.XXXXX (which
> contains unchanged information) where 'XXXXX' is a value found in the
> index file.  If we don't do anything fancy then these two files live
> next to one another in a repository's git directory at $GIT_DIR/index
> and $GIT_DIR/sharedindex.XXXXX.  This seems to work all well and fine
> except that this isn't always the case and the read_index_from function
> takes this into account by enabling a caller to specify a path to where
> the index file is located.  We can do this by specifying the index file
> we want to use by setting GIT_INDEX_FILE.
>
> Being able to specify an index file via the environment is a feature
> that has existed for a very long time (one that I personally think makes
> things difficult because of things like additions like the sharedindex)
> and I don't think was taken into account when introducing the split
> index extension.

It was (partly because I did use GIT_INDEX_FILE on occasions).

> In this case if i were to specify a location of an
> index file in my home directory '~/index' and be using the split index
> feature then the corresponding sharedindex file would live in my
> repository's git directory '~/project/.git/sharedindex.XXXXX'.  So the
> sharedindex file is always located relative to the project's git
> directory and not the index file itself, which is kind of confusing.
> Maybe a better design would be to have the sharedindex file located
> relative to the index file.

That adds more problems. Now when you move the index file around you
have to move the shared index file too (think about atomic rename
which we use in plenty of places, we can't achieve that by moving two
files). A new dependency to $GIT_DIR is not that confusing to me, the
index file is useless anyway if you don't have access to
$GIT_DIR/objects. There was always the option to _not_ split the index
when $GIT_INDEX_FILE is specified, I think I did consider that but I
dropped it because we'd lose the performance gain by splitting.

> Anyway, some food for thought.
>
> --
> Brandon Williams



-- 
Duy

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v3 1/3] read-cache: fix reading the shared index for other repos
  2018-01-17  0:16         ` Duy Nguyen
@ 2018-01-17  0:32           ` Brandon Williams
  2018-01-17 18:16           ` Jonathan Nieder
  1 sibling, 0 replies; 77+ messages in thread
From: Brandon Williams @ 2018-01-17  0:32 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Thomas Gummerer, Git Mailing List, Lars Schneider, Jeff King, Junio C Hamano, SZEDER Gábor

On 01/17, Duy Nguyen wrote:
> On Wed, Jan 17, 2018 at 4:42 AM, Brandon Williams <bmwill@google.com> wrote:
> > On 01/07, Thomas Gummerer wrote:
> >> read_index_from() takes a path argument for the location of the index
> >> file.  For reading the shared index in split index mode however it just
> >> ignores that path argument, and reads it from the gitdir of the current
> >> repository.
> >>
> >> This works as long as an index in the_repository is read.  Once that
> >> changes, such as when we read the index of a submodule, or of a
> >> different working tree than the current one, the gitdir of
> >> the_repository will no longer contain the appropriate shared index,
> >> and git will fail to read it.
> >>
> >> For example t3007-ls-files-recurse-submodules.sh was broken with
> >> GIT_TEST_SPLIT_INDEX set in 188dce131f ("ls-files: use repository
> >> object", 2017-06-22), and t7814-grep-recurse-submodules.sh was also
> >> broken in a similar manner, probably by introducing struct repository
> >> there, although I didn't track down the exact commit for that.
> >>
> >> be489d02d2 ("revision.c: --indexed-objects add objects from all
> >> worktrees", 2017-08-23) breaks with split index mode in a similar
> >> manner, not erroring out when it can't read the index, but instead
> >> carrying on with pruning, without taking the index of the worktree into
> >> account.
> >>
> >> Fix this by passing an additional gitdir parameter to read_index_from,
> >> to indicate where it should look for and read the shared index from.
> >>
> >> read_cache_from() defaults to using the gitdir of the_repository.  As it
> >> is mostly a convenience macro, having to pass get_git_dir() for every
> >> call seems overkill, and if necessary users can have more control by
> >> using read_index_from().
> >
> > I'm not saying we need to change this again but I got to thinking about
> > what the root cause for this bug is and I think it's a design flaw in
> > how split index is implemented.  IIUC Split index is an index extension
> > that can be enabled to limit the size of the index file that is written
> > when making changes to the index.  It breaks the index into two pieces,
> > index (which contains only changes) and sharedindex.XXXXX (which
> > contains unchanged information) where 'XXXXX' is a value found in the
> > index file.  If we don't do anything fancy then these two files live
> > next to one another in a repository's git directory at $GIT_DIR/index
> > and $GIT_DIR/sharedindex.XXXXX.  This seems to work all well and fine
> > except that this isn't always the case and the read_index_from function
> > takes this into account by enabling a caller to specify a path to where
> > the index file is located.  We can do this by specifying the index file
> > we want to use by setting GIT_INDEX_FILE.
> >
> > Being able to specify an index file via the environment is a feature
> > that has existed for a very long time (one that I personally think makes
> > things difficult because of things like additions like the sharedindex)
> > and I don't think was taken into account when introducing the split
> > index extension.
> 
> It was (partly because I did use GIT_INDEX_FILE on occasions).
> 
> > In this case if i were to specify a location of an
> > index file in my home directory '~/index' and be using the split index
> > feature then the corresponding sharedindex file would live in my
> > repository's git directory '~/project/.git/sharedindex.XXXXX'.  So the
> > sharedindex file is always located relative to the project's git
> > directory and not the index file itself, which is kind of confusing.
> > Maybe a better design would be to have the sharedindex file located
> > relative to the index file.
> 
> That adds more problems. Now when you move the index file around you
> have to move the shared index file too (think about atomic rename
> which we use in plenty of places, we can't achieve that by moving two
> files). A new dependency to $GIT_DIR is not that confusing to me, the
> index file is useless anyway if you don't have access to
> $GIT_DIR/objects. There was always the option to _not_ split the index
> when $GIT_INDEX_FILE is specified, I think I did consider that but I
> dropped it because we'd lose the performance gain by splitting.
> 
> > Anyway, some food for thought.
> >
> > --
> > Brandon Williams
> 
> 
> 
> -- 
> Duy

Thanks for giving me some more context :) makes me feel more confident
in the solution that's already been proposed.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v3 1/3] read-cache: fix reading the shared index for other repos
  2018-01-17  0:16         ` Duy Nguyen
  2018-01-17  0:32           ` Brandon Williams
@ 2018-01-17 18:16           ` Jonathan Nieder
  2018-01-18 10:19             ` Duy Nguyen
  1 sibling, 1 reply; 77+ messages in thread
From: Jonathan Nieder @ 2018-01-17 18:16 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Brandon Williams, Thomas Gummerer, Git Mailing List, Lars Schneider, Jeff King, Junio C Hamano, SZEDER Gábor

Hi,

Duy Nguyen wrote:
> On Wed, Jan 17, 2018 at 4:42 AM, Brandon Williams <bmwill@google.com> wrote:

>>                                  IIUC Split index is an index extension
>> that can be enabled to limit the size of the index file that is written
>> when making changes to the index.  It breaks the index into two pieces,
>> index (which contains only changes) and sharedindex.XXXXX (which
>> contains unchanged information) where 'XXXXX' is a value found in the
>> index file.  If we don't do anything fancy then these two files live
>> next to one another in a repository's git directory at $GIT_DIR/index
>> and $GIT_DIR/sharedindex.XXXXX.  This seems to work all well and fine
>> except that this isn't always the case and the read_index_from function
>> takes this into account by enabling a caller to specify a path to where
>> the index file is located.  We can do this by specifying the index file
>> we want to use by setting GIT_INDEX_FILE.
[...]
>> In this case if i were to specify a location of an
>> index file in my home directory '~/index' and be using the split index
>> feature then the corresponding sharedindex file would live in my
>> repository's git directory '~/project/.git/sharedindex.XXXXX'.  So the
>> sharedindex file is always located relative to the project's git
>> directory and not the index file itself, which is kind of confusing.
>> Maybe a better design would be to have the sharedindex file located
>> relative to the index file.
>
> That adds more problems. Now when you move the index file around you
> have to move the shared index file too (think about atomic rename
> which we use in plenty of places, we can't achieve that by moving two
> files). A new dependency to $GIT_DIR is not that confusing to me, the
> index file is useless anyway if you don't have access to
> $GIT_DIR/objects. There was always the option to _not_ split the index
> when $GIT_INDEX_FILE is specified, I think I did consider that but I
> dropped it because we'd lose the performance gain by splitting.

Can you elaborate a little more on this?

At first glance, it seems simpler to say "paths in index extensions
named in the index file are relative to the location of the index
file" and to make moving the index file also require moving the shared
index file, exactly as you say.  So at least from a "principle of
least surprise" perspective I would be tempted to go that way.

It's true that we rely on atomic rename in plenty of places, but only
within a directory.  (Filesystem boundaries, NFS, etc mean that atomic
renames across directories are a lost cause.)

Fortunately index files (including temp index files used by scripts)
tend to only be in $GIT_DIR, for exactly that reason.  So I am
wondering if switching to index-file-relative semantics would be an
invasive move and what the pros and cons of such a move are.

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v3 1/3] read-cache: fix reading the shared index for other repos
  2018-01-17 18:16           ` Jonathan Nieder
@ 2018-01-18 10:19             ` Duy Nguyen
  0 siblings, 0 replies; 77+ messages in thread
From: Duy Nguyen @ 2018-01-18 10:19 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Brandon Williams, Thomas Gummerer, Git Mailing List, Lars Schneider, Jeff King, Junio C Hamano, SZEDER Gábor

On Thu, Jan 18, 2018 at 1:16 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> Hi,
>
> Duy Nguyen wrote:
>> On Wed, Jan 17, 2018 at 4:42 AM, Brandon Williams <bmwill@google.com> wrote:
>
>>>                                  IIUC Split index is an index extension
>>> that can be enabled to limit the size of the index file that is written
>>> when making changes to the index.  It breaks the index into two pieces,
>>> index (which contains only changes) and sharedindex.XXXXX (which
>>> contains unchanged information) where 'XXXXX' is a value found in the
>>> index file.  If we don't do anything fancy then these two files live
>>> next to one another in a repository's git directory at $GIT_DIR/index
>>> and $GIT_DIR/sharedindex.XXXXX.  This seems to work all well and fine
>>> except that this isn't always the case and the read_index_from function
>>> takes this into account by enabling a caller to specify a path to where
>>> the index file is located.  We can do this by specifying the index file
>>> we want to use by setting GIT_INDEX_FILE.
> [...]
>>> In this case if i were to specify a location of an
>>> index file in my home directory '~/index' and be using the split index
>>> feature then the corresponding sharedindex file would live in my
>>> repository's git directory '~/project/.git/sharedindex.XXXXX'.  So the
>>> sharedindex file is always located relative to the project's git
>>> directory and not the index file itself, which is kind of confusing.
>>> Maybe a better design would be to have the sharedindex file located
>>> relative to the index file.
>>
>> That adds more problems. Now when you move the index file around you
>> have to move the shared index file too (think about atomic rename
>> which we use in plenty of places, we can't achieve that by moving two
>> files). A new dependency to $GIT_DIR is not that confusing to me, the
>> index file is useless anyway if you don't have access to
>> $GIT_DIR/objects. There was always the option to _not_ split the index
>> when $GIT_INDEX_FILE is specified, I think I did consider that but I
>> dropped it because we'd lose the performance gain by splitting.
>
> Can you elaborate a little more on this?
>
> At first glance, it seems simpler to say "paths in index extensions
> named in the index file are relative to the location of the index
> file" and to make moving the index file also require moving the shared
> index file, exactly as you say.  So at least from a "principle of
> least surprise" perspective I would be tempted to go that way.
>
> It's true that we rely on atomic rename in plenty of places, but only
> within a directory.  (Filesystem boundaries, NFS, etc mean that atomic
> renames across directories are a lost cause.)
>
> Fortunately index files (including temp index files used by scripts)
> tend to only be in $GIT_DIR, for exactly that reason.  So I am
> wondering if switching to index-file-relative semantics would be an
> invasive move and what the pros and cons of such a move are.

I think it gets messier. Now you have to move two files. If the first
move succeeds but the second one fails, recovery may involve un-move
the first file, but its old content is already gone. We probably can
get around that. But since the shared index is assumed big and heavy,
I just went with "store it in the place it's going to be and never
move it anywhere ever (until nobody uses it then it's deleted)"
-- 
Duy

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 3/3] read-cache: don't write index twice if we can't write shared index
  2018-01-14 10:18           ` [PATCH 3/3] read-cache: don't write index twice if we can't write shared index Nguyễn Thái Ngọc Duy
@ 2018-01-18 11:36             ` SZEDER Gábor
  2018-01-18 12:47               ` Duy Nguyen
  0 siblings, 1 reply; 77+ messages in thread
From: SZEDER Gábor @ 2018-01-18 11:36 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: Git mailing list, Thomas Gummerer, Lars Schneider, Brandon Williams, Jeff King, Junio C Hamano

This series, queued as 'nd/shared-index-fix', makes the 32 bit Linux
build job fail on Travis CI.  Unfortunately, all it can tell us about
the failure is this:

  Test Summary Report
  -------------------
  t1700-split-index.sh                             (Wstat: 256 Tests: 23
  Failed: 1)
    Failed test:  23
    Non-zero exit status: 1
  Files=809, Tests=18491, 401 wallclock secs ( 7.22 usr  1.60 sys + 263.16
  cusr 49.58 csys = 321.56 CPU)
  Result: FAIL

because it can't access the test's verbose log due to lack of
permissions.


  https://travis-ci.org/git/git/jobs/329681826#L2074

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 3/3] read-cache: don't write index twice if we can't write shared index
  2018-01-18 11:36             ` SZEDER Gábor
@ 2018-01-18 12:47               ` Duy Nguyen
  2018-01-18 13:29                 ` Jeff King
  2018-01-22 18:27                 ` [PATCH 3/3] read-cache: don't write index twice if we can't write shared index SZEDER Gábor
  0 siblings, 2 replies; 77+ messages in thread
From: Duy Nguyen @ 2018-01-18 12:47 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Git mailing list, Thomas Gummerer, Lars Schneider, Brandon Williams, Jeff King, Junio C Hamano

On Thu, Jan 18, 2018 at 6:36 PM, SZEDER Gábor <szeder.dev@gmail.com> wrote:
> This series, queued as 'nd/shared-index-fix', makes the 32 bit Linux
> build job fail on Travis CI.  Unfortunately, all it can tell us about
> the failure is this:
>
>   Test Summary Report
>   -------------------
>   t1700-split-index.sh                             (Wstat: 256 Tests: 23
>   Failed: 1)
>     Failed test:  23
>     Non-zero exit status: 1
>   Files=809, Tests=18491, 401 wallclock secs ( 7.22 usr  1.60 sys + 263.16
>   cusr 49.58 csys = 321.56 CPU)
>   Result: FAIL
>
> because it can't access the test's verbose log due to lack of
> permissions.
>
>
>   https://travis-ci.org/git/git/jobs/329681826#L2074

I may need help getting that log (or even better the trash directory
of t1700). I tried -m32 build, then valgrind on amd64 (because it
didn't work with my 32 build), running the tests with dash and even
the linux32 docker version that comes with "ci" directory. Everything
worked for me.
-- 
Duy

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 3/3] read-cache: don't write index twice if we can't write shared index
  2018-01-18 12:47               ` Duy Nguyen
@ 2018-01-18 13:29                 ` Jeff King
  2018-01-18 13:36                   ` Duy Nguyen
  2018-01-22 18:27                 ` [PATCH 3/3] read-cache: don't write index twice if we can't write shared index SZEDER Gábor
  1 sibling, 1 reply; 77+ messages in thread
From: Jeff King @ 2018-01-18 13:29 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: SZEDER Gábor, Git mailing list, Thomas Gummerer, Lars Schneider, Brandon Williams, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 1013 bytes --]

On Thu, Jan 18, 2018 at 07:47:50PM +0700, Duy Nguyen wrote:

> I may need help getting that log (or even better the trash directory
> of t1700). I tried -m32 build, then valgrind on amd64 (because it
> didn't work with my 32 build), running the tests with dash and even
> the linux32 docker version that comes with "ci" directory. Everything
> worked for me.

It fails for me locally if I run it in the linux32 docker environment.
Here's the end of the "-v -x" output:

  + GIT_INDEX_FILE=/usr/src/git/t/trash directory.t1700-split-index/new-index git -C ro update-index --split-index
  + chmod -R u+w ro
  + rm ro/.git/sharedindex.bbdf63f5a51242904dba2ea929ea4f56fcc340b6
  + GIT_INDEX_FILE=new-index git ls-files
  fatal: .git/sharedindex.bbdf63f5a51242904dba2ea929ea4f56fcc340b6: index file open failed: No such file or directory
  error: last command exited with $?=128
  not ok 23 - graceful handling splitting index is not allowed

I don't know if the trash state will be helpful, but it's attached.

-Peff

[-- Attachment #2: t1700-trash.tar.gz --]
[-- Type: application/gzip, Size: 24979 bytes --]

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 3/3] read-cache: don't write index twice if we can't write shared index
  2018-01-18 13:29                 ` Jeff King
@ 2018-01-18 13:36                   ` Duy Nguyen
  2018-01-18 15:00                     ` Duy Nguyen
  0 siblings, 1 reply; 77+ messages in thread
From: Duy Nguyen @ 2018-01-18 13:36 UTC (permalink / raw)
  To: Jeff King; +Cc: SZEDER Gábor, Git mailing list, Thomas Gummerer, Lars Schneider, Brandon Williams, Junio C Hamano

On Thu, Jan 18, 2018 at 8:29 PM, Jeff King <peff@peff.net> wrote:
> On Thu, Jan 18, 2018 at 07:47:50PM +0700, Duy Nguyen wrote:
>
>> I may need help getting that log (or even better the trash directory
>> of t1700). I tried -m32 build, then valgrind on amd64 (because it
>> didn't work with my 32 build), running the tests with dash and even
>> the linux32 docker version that comes with "ci" directory. Everything
>> worked for me.
>
> It fails for me locally if I run it in the linux32 docker environment.
> Here's the end of the "-v -x" output:
>
>   + GIT_INDEX_FILE=/usr/src/git/t/trash directory.t1700-split-index/new-index git -C ro update-index --split-index
>   + chmod -R u+w ro
>   + rm ro/.git/sharedindex.bbdf63f5a51242904dba2ea929ea4f56fcc340b6
>   + GIT_INDEX_FILE=new-index git ls-files
>   fatal: .git/sharedindex.bbdf63f5a51242904dba2ea929ea4f56fcc340b6: index file open failed: No such file or directory
>   error: last command exited with $?=128
>   not ok 23 - graceful handling splitting index is not allowed
>
> I don't know if the trash state will be helpful, but it's attached.

Thanks. Somehow my way of forcing mks_tempfile() to fail, well..
failed in this environment. I see the same output if I remove "chmod
-R u-w ro". I think I have enough to continue from here.
-- 
Duy

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 3/3] read-cache: don't write index twice if we can't write shared index
  2018-01-18 13:36                   ` Duy Nguyen
@ 2018-01-18 15:00                     ` Duy Nguyen
  2018-01-18 21:37                       ` Jeff King
  0 siblings, 1 reply; 77+ messages in thread
From: Duy Nguyen @ 2018-01-18 15:00 UTC (permalink / raw)
  To: Jeff King; +Cc: SZEDER Gábor, Git mailing list, Thomas Gummerer, Lars Schneider, Brandon Williams, Junio C Hamano

On Thu, Jan 18, 2018 at 08:36:32PM +0700, Duy Nguyen wrote:
> On Thu, Jan 18, 2018 at 8:29 PM, Jeff King <peff@peff.net> wrote:
> > On Thu, Jan 18, 2018 at 07:47:50PM +0700, Duy Nguyen wrote:
> >
> >> I may need help getting that log (or even better the trash directory
> >> of t1700). I tried -m32 build, then valgrind on amd64 (because it
> >> didn't work with my 32 build), running the tests with dash and even
> >> the linux32 docker version that comes with "ci" directory. Everything
> >> worked for me.
> >
> > It fails for me locally if I run it in the linux32 docker environment.
> > Here's the end of the "-v -x" output:
> >
> >   + GIT_INDEX_FILE=/usr/src/git/t/trash directory.t1700-split-index/new-index git -C ro update-index --split-index
> >   + chmod -R u+w ro
> >   + rm ro/.git/sharedindex.bbdf63f5a51242904dba2ea929ea4f56fcc340b6
> >   + GIT_INDEX_FILE=new-index git ls-files
> >   fatal: .git/sharedindex.bbdf63f5a51242904dba2ea929ea4f56fcc340b6: index file open failed: No such file or directory
> >   error: last command exited with $?=128
> >   not ok 23 - graceful handling splitting index is not allowed
> >
> > I don't know if the trash state will be helpful, but it's attached.
> 
> Thanks. Somehow my way of forcing mks_tempfile() to fail, well..
> failed in this environment. I see the same output if I remove "chmod
> -R u-w ro". I think I have enough to continue from here.

The test suite was run as root, no wonder why my removing write access
has no effect. I got the test to pass with this, but then it fails
with

    Can't write .prove (Permission denied) at /usr/share/perl/5.22/App/Prove.pm line 542.

Some more chown'ing or chmod'ing is required....

-- 8< --
Subject: [PATCH] ci: don't accidentally run the test suite as root

This script assigns and adds a user named "ci" in a subshell so the
outer CI_USER is not affected. For some reason, CI_USER is actually
empty on Travis linux32 builds. This makes the following "su" useless
and the test suite is run as root.

Some tests may expect file/dir permissions to be respected. But root
rules all and ignores all. That could make tests fail.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 ci/run-linux32-build.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ci/run-linux32-build.sh b/ci/run-linux32-build.sh
index c19c50c1c9..92b488ff27 100755
--- a/ci/run-linux32-build.sh
+++ b/ci/run-linux32-build.sh
@@ -21,8 +21,8 @@ linux32 --32bit i386 sh -c '
 # If a host user id is given, then create a user "ci" with the host user
 # id to make everything accessible to the host user.
 HOST_UID=$1 &&
-CI_USER=$USER &&
-test -z $HOST_UID || (CI_USER="ci" && useradd -u $HOST_UID $CI_USER) &&
+CI_USER=${USER:-ci} &&
+test -z "$HOST_UID" || useradd -u $HOST_UID $CI_USER &&
 
 # Build and test
 linux32 --32bit i386 su -m -l $CI_USER -c '
-- 
2.16.0.47.g5ff498d35b

-- 8< --

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 3/3] read-cache: don't write index twice if we can't write shared index
  2018-01-18 15:00                     ` Duy Nguyen
@ 2018-01-18 21:37                       ` Jeff King
  2018-01-18 22:32                         ` SZEDER Gábor
  0 siblings, 1 reply; 77+ messages in thread
From: Jeff King @ 2018-01-18 21:37 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: SZEDER Gábor, Git mailing list, Thomas Gummerer, Lars Schneider, Brandon Williams, Junio C Hamano

On Thu, Jan 18, 2018 at 10:00:14PM +0700, Duy Nguyen wrote:

> The test suite was run as root, no wonder why my removing write access
> has no effect. I got the test to pass with this, but then it fails
> with
> 
>     Can't write .prove (Permission denied) at /usr/share/perl/5.22/App/Prove.pm line 542.
> 
> Some more chown'ing or chmod'ing is required....

Ah, right. I agree that we probably ought to run the ci as non-root.
However, if the test requires non-root, it probably needs to be marked
with the SANITY prereq.

I also ran into one funny thing: if you run the script with "-i", then
we do not run the test_when_finished block. And therefore the "ro"
directory is left without its write bit, and the next test run fails, as
it cannot "rm -rf" the old trash directory out of the way.

I'm not sure there's a good solution, though. Skipping the
test_when_finished block on a "-i" run is intentional, to let you
inspect the broken state.

> Subject: [PATCH] ci: don't accidentally run the test suite as root
> 
> This script assigns and adds a user named "ci" in a subshell so the
> outer CI_USER is not affected. For some reason, CI_USER is actually
> empty on Travis linux32 builds. This makes the following "su" useless
> and the test suite is run as root.

Are we sure this was the problem on Travis, and it wasn't just an issue
with how I reproduced via docker?

-Peff

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v3 0/3] fixes for split index mode
  2018-01-07 22:30   ` [PATCH v3 0/3] fixes for split index mode Thomas Gummerer
                       ` (3 preceding siblings ...)
  2018-01-13 22:37     ` [PATCH v3 4/3] read-cache: don't try to write index if we can't write shared index Thomas Gummerer
@ 2018-01-18 21:53     ` Thomas Gummerer
  2018-01-19 18:34       ` Junio C Hamano
  4 siblings, 1 reply; 77+ messages in thread
From: Thomas Gummerer @ 2018-01-18 21:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Lars Schneider, Brandon Williams, Jeff King, git, Nguyễn Thái Ngọc Duy, SZEDER Gábor

Friendly ping on this series now that 2.16 is out :) Is there anything
in this series (up to 3/3, 4/3 can be dropped now that Duy fixed it in
a nicer way) that still needs updating?  It fixes a few bugs in split
index mode with submodules/worktrees, so it would be nice to get this
reviewed/merged.

On 01/07, Thomas Gummerer wrote:
> Thanks Brandon and Lars for comments on the previous round.
> 
> Previous rounds were at <20171210212202.28231-1-t.gummerer@gmail.com>
> and <20171217225122.28941-1-t.gummerer@gmail.com>.
> 
> Changes since the previous round:
> 
> - reworked the patches to no longer try to use struct repository for
>   worktrees, but pass gitdir into read_index_from instead (Thanks
>   Brandon :)).  So the fixes that were in 1/3 and 2/3 previously are
>   now in 1/3
> - 2/3 now fixes t7009 properly.  I thought it was fixed before, but it
>   probably just passed the tests because of the GIT_TEST_SPLIT_INDEX
>   "randomness".
> - The travis job is now only running the 64-bit linux build with split
>   index mode to save even more cycles.
> - As this wasn't picked up anywhere yet, I took the freedom to rebase
>   this onto the current master, which includes sg/travis-fixes, which
>   made it a bit easier for me to test.  If this makes the life harder
>   for anyone reviewing this let me know and I can base it on the same
>   commit previous iterations were based on.
> 
> Thomas Gummerer (3):
>   read-cache: fix reading the shared index for other repos
>   split-index: don't write cache tree with null sha1 entries
>   travis: run tests with GIT_TEST_SPLIT_INDEX
> 
>  cache-tree.c            |  2 +-
>  cache.h                 |  8 +++++---
>  ci/run-linux32-build.sh |  1 +
>  ci/run-tests.sh         |  4 ++++
>  read-cache.c            | 25 ++++++++++++++-----------
>  repository.c            |  2 +-
>  revision.c              |  3 ++-
>  split-index.c           |  2 ++
>  t/t1700-split-index.sh  | 19 +++++++++++++++++++
>  9 files changed, 49 insertions(+), 17 deletions(-)
> 
> -- 
> 2.16.0.rc1.238.g530d649a79
> 

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 3/3] read-cache: don't write index twice if we can't write shared index
  2018-01-18 21:37                       ` Jeff King
@ 2018-01-18 22:32                         ` SZEDER Gábor
  2018-01-19  0:30                           ` Duy Nguyen
  2018-01-22 13:32                           ` [PATCH 0/5] Travis CI: don't run the test suite as root in the 32 bit Linux build SZEDER Gábor
  0 siblings, 2 replies; 77+ messages in thread
From: SZEDER Gábor @ 2018-01-18 22:32 UTC (permalink / raw)
  To: Jeff King; +Cc: Duy Nguyen, Git mailing list, Thomas Gummerer, Lars Schneider, Brandon Williams, Junio C Hamano

On Thu, Jan 18, 2018 at 10:37 PM, Jeff King <peff@peff.net> wrote:
> On Thu, Jan 18, 2018 at 10:00:14PM +0700, Duy Nguyen wrote:
>
>> The test suite was run as root, no wonder why my removing write access
>> has no effect. I got the test to pass with this, but then it fails
>> with
>>
>>     Can't write .prove (Permission denied) at /usr/share/perl/5.22/App/Prove.pm line 542.
>>
>> Some more chown'ing or chmod'ing is required....

This is the fallout of running the tests as root in the past.  With your
patch 'prove' is run as a non-root user, but the prove state is loaded
from Travis CI's cache, where it has been written as root the last time
around, so now we don't have permissions to (over)write it.

I have patches in the works to address this as well.

>> Subject: [PATCH] ci: don't accidentally run the test suite as root
>>
>> This script assigns and adds a user named "ci" in a subshell so the
>> outer CI_USER is not affected. For some reason, CI_USER is actually
>> empty on Travis linux32 builds. This makes the following "su" useless
>> and the test suite is run as root.
>
> Are we sure this was the problem on Travis, and it wasn't just an issue
> with how I reproduced via docker?

Yes, we are.


Gábor

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 3/3] read-cache: don't write index twice if we can't write shared index
  2018-01-18 22:32                         ` SZEDER Gábor
@ 2018-01-19  0:30                           ` Duy Nguyen
  2018-01-22 13:32                           ` [PATCH 0/5] Travis CI: don't run the test suite as root in the 32 bit Linux build SZEDER Gábor
  1 sibling, 0 replies; 77+ messages in thread
From: Duy Nguyen @ 2018-01-19  0:30 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Jeff King, Git mailing list, Thomas Gummerer, Lars Schneider, Brandon Williams, Junio C Hamano

On Fri, Jan 19, 2018 at 5:32 AM, SZEDER Gábor <szeder.dev@gmail.com> wrote:
> On Thu, Jan 18, 2018 at 10:37 PM, Jeff King <peff@peff.net> wrote:
>> On Thu, Jan 18, 2018 at 10:00:14PM +0700, Duy Nguyen wrote:
>>
>>> The test suite was run as root, no wonder why my removing write access
>>> has no effect. I got the test to pass with this, but then it fails
>>> with
>>>
>>>     Can't write .prove (Permission denied) at /usr/share/perl/5.22/App/Prove.pm line 542.
>>>
>>> Some more chown'ing or chmod'ing is required....
>
> This is the fallout of running the tests as root in the past.  With your
> patch 'prove' is run as a non-root user, but the prove state is loaded
> from Travis CI's cache, where it has been written as root the last time
> around, so now we don't have permissions to (over)write it.
>
> I have patches in the works to address this as well.

Great. I'll leave it to you then. I will update the test with SANITY
(and a small fix in the test name) and think more about the "-i" Jeff
mentioned.
-- 
Duy

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v3 0/3] fixes for split index mode
  2018-01-18 21:53     ` [PATCH v3 0/3] fixes for split index mode Thomas Gummerer
@ 2018-01-19 18:34       ` Junio C Hamano
  2018-01-19 21:11         ` Thomas Gummerer
  0 siblings, 1 reply; 77+ messages in thread
From: Junio C Hamano @ 2018-01-19 18:34 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: Lars Schneider, Brandon Williams, Jeff King, git, Nguyễn Thái Ngọc Duy, SZEDER Gábor

Thomas Gummerer <t.gummerer@gmail.com> writes:

> Friendly ping on this series now that 2.16 is out :) Is there anything
> in this series (up to 3/3, 4/3 can be dropped now that Duy fixed it in
> a nicer way) that still needs updating?  It fixes a few bugs in split
> index mode with submodules/worktrees, so it would be nice to get this
> reviewed/merged.

I was wondering about the same thing.  Especially it wasn't very
clear to me what Duy's replacement was meant to replace and how well
it was supposed to work with the rest of your series.

Let's drop 4/3 and queue 1-3/3 for now.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v3 0/3] fixes for split index mode
  2018-01-19 18:34       ` Junio C Hamano
@ 2018-01-19 21:11         ` Thomas Gummerer
  0 siblings, 0 replies; 77+ messages in thread
From: Thomas Gummerer @ 2018-01-19 21:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Lars Schneider, Brandon Williams, Jeff King, git, Nguyễn Thái Ngọc Duy, SZEDER Gábor

On 01/19, Junio C Hamano wrote:
> Thomas Gummerer <t.gummerer@gmail.com> writes:
> 
> > Friendly ping on this series now that 2.16 is out :) Is there anything
> > in this series (up to 3/3, 4/3 can be dropped now that Duy fixed it in
> > a nicer way) that still needs updating?  It fixes a few bugs in split
> > index mode with submodules/worktrees, so it would be nice to get this
> > reviewed/merged.
> 
> I was wondering about the same thing.  Especially it wasn't very
> clear to me what Duy's replacement was meant to replace and how well
> it was supposed to work with the rest of your series.

Sorry about the confusion.  4/3 was replaced by Duy's series, but the
rest of this series is independent of those patches.

> Let's drop 4/3 and queue 1-3/3 for now.

Thanks!

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v3 1/3] read-cache: fix reading the shared index for other repos
  2018-01-07 22:30     ` [PATCH v3 1/3] read-cache: fix reading the shared index for other repos Thomas Gummerer
  2018-01-08 10:41       ` Duy Nguyen
  2018-01-16 21:42       ` Brandon Williams
@ 2018-01-19 21:57       ` Junio C Hamano
  2018-01-20 11:58         ` Thomas Gummerer
  2 siblings, 1 reply; 77+ messages in thread
From: Junio C Hamano @ 2018-01-19 21:57 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: git, Lars Schneider, Brandon Williams, Jeff King, Nguyễn Thái Ngọc Duy, SZEDER Gábor

Thomas Gummerer <t.gummerer@gmail.com> writes:

> read_cache_from() defaults to using the gitdir of the_repository.  As it
> is mostly a convenience macro, having to pass get_git_dir() for every
> call seems overkill, and if necessary users can have more control by
> using read_index_from().

This was a bit painful change, given that some changes in flight do
add new callsites to read_index_from() and they got the function
changed under their feet.

Please double check if I made the right git-dir to be passed to them
when I push out 'pu' in a few hours.

Thanks.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v3 1/3] read-cache: fix reading the shared index for other repos
  2018-01-19 21:57       ` Junio C Hamano
@ 2018-01-20 11:58         ` Thomas Gummerer
  2018-01-22  6:14           ` Junio C Hamano
  0 siblings, 1 reply; 77+ messages in thread
From: Thomas Gummerer @ 2018-01-20 11:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Lars Schneider, Brandon Williams, Jeff King, Nguyễn Thái Ngọc Duy, SZEDER Gábor

On 01/19, Junio C Hamano wrote:
> Thomas Gummerer <t.gummerer@gmail.com> writes:
> 
> > read_cache_from() defaults to using the gitdir of the_repository.  As it
> > is mostly a convenience macro, having to pass get_git_dir() for every
> > call seems overkill, and if necessary users can have more control by
> > using read_index_from().
> 
> This was a bit painful change, given that some changes in flight do
> add new callsites to read_index_from() and they got the function
> changed under their feet.

Sorry about that.  Is there any way to make such a change less painful
in the future?

> Please double check if I made the right git-dir to be passed to them
> when I push out 'pu' in a few hours.

I think one conversion was not quite correct, even though all tests
still pass with GIT_TEST_SPLIT_INDEX set.

The following diff fixes that conversation and has a test showing the
breakage:

diff --git a/builtin/worktree.c b/builtin/worktree.c
index 6a49f9e628..4d86a3574f 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -612,7 +612,8 @@ static void validate_no_submodules(const struct worktree *wt)
        struct index_state istate = {0};
        int i, found_submodules = 0;
 
-       if (read_index_from(&istate, worktree_git_path(wt, "index"), get_git_dir()) > 0) {
+       if (read_index_from(&istate, worktree_git_path(wt, "index"),
+                           get_worktree_git_dir(wt)) > 0) {
                for (i = 0; i < istate.cache_nr; i++) {
                        struct cache_entry *ce = istate.cache[i];
 
diff --git a/t/t2028-worktree-move.sh b/t/t2028-worktree-move.sh
index b3105eaaed..8faf61bbf5 100755
--- a/t/t2028-worktree-move.sh
+++ b/t/t2028-worktree-move.sh
@@ -90,6 +90,16 @@ test_expect_success 'move main worktree' '
        test_must_fail git worktree move . def
 '
 
+test_expect_success 'move worktree with split index' '
+       git worktree add test &&
+       (
+               cd test &&
+               test_commit file &&
+               git update-index --split-index
+       ) &&
+       git worktree move test test-destination
+'
+
 test_expect_success 'remove main worktree' '
        test_must_fail git worktree remove .
 '

With this applied what you have in pu looks good to me.  Does the
above help, or should I send this in another for for you to apply?

> Thanks.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v3 1/3] read-cache: fix reading the shared index for other repos
  2018-01-20 11:58         ` Thomas Gummerer
@ 2018-01-22  6:14           ` Junio C Hamano
  0 siblings, 0 replies; 77+ messages in thread
From: Junio C Hamano @ 2018-01-22  6:14 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: git, Lars Schneider, Brandon Williams, Jeff King, Nguyễn Thái Ngọc Duy, SZEDER Gábor

Thomas Gummerer <t.gummerer@gmail.com> writes:

> On 01/19, Junio C Hamano wrote:
>> Thomas Gummerer <t.gummerer@gmail.com> writes:
>> 
>> > read_cache_from() defaults to using the gitdir of the_repository.  As it
>> > is mostly a convenience macro, having to pass get_git_dir() for every
>> > call seems overkill, and if necessary users can have more control by
>> > using read_index_from().
>> 
>> This was a bit painful change, given that some changes in flight do
>> add new callsites to read_index_from() and they got the function
>> changed under their feet.
>
> Sorry about that.  Is there any way to make such a change less painful
> in the future?

One way is to do for read_index_from() what you did for the existing
users of read_cache_from().  Introduce a _new_ helper that will not
be known for any existing topics in flight, and use that to make the
existing API a thin wrapper around it.

I _think_ I got it right with evil merge, so unless this fix needs
to be delayed for extended period of time for whatever reason while
any more new callers of the function appears (which is unlikely), we
should be OK ;-)

Thanks.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* [PATCH 0/5] Travis CI: don't run the test suite as root in the 32 bit Linux build
  2018-01-18 22:32                         ` SZEDER Gábor
  2018-01-19  0:30                           ` Duy Nguyen
@ 2018-01-22 13:32                           ` SZEDER Gábor
  2018-01-22 13:32                             ` [PATCH 1/5] travis-ci: use 'set -x' for the commands under 'su' " SZEDER Gábor
                                               ` (4 more replies)
  1 sibling, 5 replies; 77+ messages in thread
From: SZEDER Gábor @ 2018-01-22 13:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Duy Nguyen, Jeff King, Lars Schneider, Johannes Schindelin, Thomas Gummerer, Brandon Williams, git, SZEDER Gábor


The important one is patch 4, which fixes the issue of running the test
suite as root, with a bit of sugar on top to make sure that a future
"non-root" build job can cope with a root-written cache directory from
the past.

The rest is a collection of small cleanups and improvements to make the
debugging this Docked-based build more convenient.

SZEDER Gábor (5):
  travis-ci: use 'set -x' for the commands under 'su' in the 32 bit
    Linux build
  travis-ci: use 'set -e' in the 32 bit Linux build job
  travis-ci: don't repeat the path of the cache directory
  travis-ci: don't run the test suite as root in the 32 bit Linux build
  travis-ci: don't fail if user already exists on 32 bit Linux build job

 ci/lib-travisci.sh        |  7 +++---
 ci/run-build-and-tests.sh |  2 +-
 ci/run-linux32-build.sh   | 55 +++++++++++++++++++++++++++++++++++------------
 ci/run-linux32-docker.sh  |  7 ++++--
 4 files changed, 51 insertions(+), 20 deletions(-)

-- 
2.16.1.80.gc0eec9753d


^ permalink raw reply	[flat|nested] 77+ messages in thread

* [PATCH 1/5] travis-ci: use 'set -x' for the commands under 'su' in the 32 bit Linux build
  2018-01-22 13:32                           ` [PATCH 0/5] Travis CI: don't run the test suite as root in the 32 bit Linux build SZEDER Gábor
@ 2018-01-22 13:32                             ` " SZEDER Gábor
  2018-01-22 13:32                             ` [PATCH 2/5] travis-ci: use 'set -e' in the 32 bit Linux build job SZEDER Gábor
                                               ` (3 subsequent siblings)
  4 siblings, 0 replies; 77+ messages in thread
From: SZEDER Gábor @ 2018-01-22 13:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Duy Nguyen, Jeff King, Lars Schneider, Johannes Schindelin, Thomas Gummerer, Brandon Williams, git, SZEDER Gábor

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 ci/run-linux32-build.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/ci/run-linux32-build.sh b/ci/run-linux32-build.sh
index c19c50c1c9..5a36a8d7c0 100755
--- a/ci/run-linux32-build.sh
+++ b/ci/run-linux32-build.sh
@@ -26,6 +26,7 @@ test -z $HOST_UID || (CI_USER="ci" && useradd -u $HOST_UID $CI_USER) &&
 
 # Build and test
 linux32 --32bit i386 su -m -l $CI_USER -c '
+    set -x &&
     cd /usr/src/git &&
     ln -s /tmp/travis-cache/.prove t/.prove &&
     make --jobs=2 &&
-- 
2.16.1.80.gc0eec9753d


^ permalink raw reply	[flat|nested] 77+ messages in thread

* [PATCH 2/5] travis-ci: use 'set -e' in the 32 bit Linux build job
  2018-01-22 13:32                           ` [PATCH 0/5] Travis CI: don't run the test suite as root in the 32 bit Linux build SZEDER Gábor
  2018-01-22 13:32                             ` [PATCH 1/5] travis-ci: use 'set -x' for the commands under 'su' " SZEDER Gábor
@ 2018-01-22 13:32                             ` SZEDER Gábor
  2018-01-22 13:32                             ` [PATCH 3/5] travis-ci: don't repeat the path of the cache directory SZEDER Gábor
                                               ` (2 subsequent siblings)
  4 siblings, 0 replies; 77+ messages in thread
From: SZEDER Gábor @ 2018-01-22 13:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Duy Nguyen, Jeff King, Lars Schneider, Johannes Schindelin, Thomas Gummerer, Brandon Williams, git, SZEDER Gábor

All 'ci/*' scripts use 'set -e' to break the build job if a command
fails, except 'ci/run-linux32-build.sh' which relies on the && chain
to do the same.  This inconsistency among the 'ci/*' scripts is asking
for trouble: I forgot about the && chain more than once while working
on this patch series.

Enable 'set -e' for the whole script and for the commands executed
under 'su' as well.

While touching every line in the 'su' command block anyway, change
their indentation to use a tab instead of spaces.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 ci/run-linux32-build.sh | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/ci/run-linux32-build.sh b/ci/run-linux32-build.sh
index 5a36a8d7c0..248183982b 100755
--- a/ci/run-linux32-build.sh
+++ b/ci/run-linux32-build.sh
@@ -6,29 +6,29 @@
 #   run-linux32-build.sh [host-user-id]
 #
 
-set -x
+set -ex
 
 # Update packages to the latest available versions
 linux32 --32bit i386 sh -c '
     apt update >/dev/null &&
     apt install -y build-essential libcurl4-openssl-dev libssl-dev \
 	libexpat-dev gettext python >/dev/null
-' &&
+'
 
 # If this script runs inside a docker container, then all commands are
 # usually executed as root. Consequently, the host user might not be
 # able to access the test output files.
 # If a host user id is given, then create a user "ci" with the host user
 # id to make everything accessible to the host user.
-HOST_UID=$1 &&
-CI_USER=$USER &&
-test -z $HOST_UID || (CI_USER="ci" && useradd -u $HOST_UID $CI_USER) &&
+HOST_UID=$1
+CI_USER=$USER
+test -z $HOST_UID || (CI_USER="ci" && useradd -u $HOST_UID $CI_USER)
 
 # Build and test
 linux32 --32bit i386 su -m -l $CI_USER -c '
-    set -x &&
-    cd /usr/src/git &&
-    ln -s /tmp/travis-cache/.prove t/.prove &&
-    make --jobs=2 &&
-    make --quiet test
+	set -ex
+	cd /usr/src/git
+	ln -s /tmp/travis-cache/.prove t/.prove
+	make --jobs=2
+	make --quiet test
 '
-- 
2.16.1.80.gc0eec9753d


^ permalink raw reply	[flat|nested] 77+ messages in thread

* [PATCH 3/5] travis-ci: don't repeat the path of the cache directory
  2018-01-22 13:32                           ` [PATCH 0/5] Travis CI: don't run the test suite as root in the 32 bit Linux build SZEDER Gábor
  2018-01-22 13:32                             ` [PATCH 1/5] travis-ci: use 'set -x' for the commands under 'su' " SZEDER Gábor
  2018-01-22 13:32                             ` [PATCH 2/5] travis-ci: use 'set -e' in the 32 bit Linux build job SZEDER Gábor
@ 2018-01-22 13:32                             ` SZEDER Gábor
  2018-01-22 13:32                             ` [PATCH 4/5] travis-ci: don't run the test suite as root in the 32 bit Linux build SZEDER Gábor
  2018-01-22 13:32                             ` [PATCH 5/5] travis-ci: don't fail if user already exists on 32 bit Linux build job SZEDER Gábor
  4 siblings, 0 replies; 77+ messages in thread
From: SZEDER Gábor @ 2018-01-22 13:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Duy Nguyen, Jeff King, Lars Schneider, Johannes Schindelin, Thomas Gummerer, Brandon Williams, git, SZEDER Gábor

Some of our 'ci/*' scripts repeat the name or full path of the Travis
CI cache directory, and the following patches will add new places
using that path.

Use a variable to refer to the path of the cache directory instead, so
it's hard-coded only in a single place.

Pay extra attention to the 32 bit Linux build: it runs in a Docker
container, so pass the path of the cache directory from the host to
the container in an environment variable.  Furthermore, use the
variable in the container only if it's set, to prevent errors when
someone is running or debugging the Docker build locally, because in
that case the variable won't be set as there won't be any Travis CI
cache.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 ci/lib-travisci.sh        | 7 ++++---
 ci/run-build-and-tests.sh | 2 +-
 ci/run-linux32-build.sh   | 6 +++---
 ci/run-linux32-docker.sh  | 5 ++++-
 4 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/ci/lib-travisci.sh b/ci/lib-travisci.sh
index 07f27c7270..1efee55ef7 100755
--- a/ci/lib-travisci.sh
+++ b/ci/lib-travisci.sh
@@ -21,8 +21,6 @@ skip_branch_tip_with_tag () {
 	fi
 }
 
-good_trees_file="$HOME/travis-cache/good-trees"
-
 # Save some info about the current commit's tree, so we can skip the build
 # job if we encounter the same tree again and can provide a useful info
 # message.
@@ -83,7 +81,10 @@ check_unignored_build_artifacts ()
 # and installing dependencies.
 set -ex
 
-mkdir -p "$HOME/travis-cache"
+cache_dir="$HOME/travis-cache"
+good_trees_file="$cache_dir/good-trees"
+
+mkdir -p "$cache_dir"
 
 skip_branch_tip_with_tag
 skip_good_tree
diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index d3a094603f..30790e258d 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -5,7 +5,7 @@
 
 . ${0%/*}/lib-travisci.sh
 
-ln -s $HOME/travis-cache/.prove t/.prove
+ln -s "$cache_dir/.prove" t/.prove
 
 make --jobs=2
 make --quiet test
diff --git a/ci/run-linux32-build.sh b/ci/run-linux32-build.sh
index 248183982b..c9476d6598 100755
--- a/ci/run-linux32-build.sh
+++ b/ci/run-linux32-build.sh
@@ -25,10 +25,10 @@ CI_USER=$USER
 test -z $HOST_UID || (CI_USER="ci" && useradd -u $HOST_UID $CI_USER)
 
 # Build and test
-linux32 --32bit i386 su -m -l $CI_USER -c '
+linux32 --32bit i386 su -m -l $CI_USER -c "
 	set -ex
 	cd /usr/src/git
-	ln -s /tmp/travis-cache/.prove t/.prove
+	test -n '$cache_dir' && ln -s '$cache_dir/.prove' t/.prove
 	make --jobs=2
 	make --quiet test
-'
+"
diff --git a/ci/run-linux32-docker.sh b/ci/run-linux32-docker.sh
index 4f191c5bb1..15288ea2cf 100755
--- a/ci/run-linux32-docker.sh
+++ b/ci/run-linux32-docker.sh
@@ -11,6 +11,8 @@ docker pull daald/ubuntu32:xenial
 # $ docker run -itv "${PWD}:/usr/src/git" --entrypoint /bin/bash daald/ubuntu32:xenial
 # root@container:/# /usr/src/git/ci/run-linux32-build.sh
 
+container_cache_dir=/tmp/travis-cache
+
 docker run \
 	--interactive \
 	--env DEVELOPER \
@@ -18,8 +20,9 @@ docker run \
 	--env GIT_PROVE_OPTS \
 	--env GIT_TEST_OPTS \
 	--env GIT_TEST_CLONE_2GB \
+	--env cache_dir="$container_cache_dir" \
 	--volume "${PWD}:/usr/src/git" \
-	--volume "${HOME}/travis-cache:/tmp/travis-cache" \
+	--volume "$cache_dir:$container_cache_dir" \
 	daald/ubuntu32:xenial \
 	/usr/src/git/ci/run-linux32-build.sh $(id -u $USER)
 
-- 
2.16.1.80.gc0eec9753d


^ permalink raw reply	[flat|nested] 77+ messages in thread

* [PATCH 4/5] travis-ci: don't run the test suite as root in the 32 bit Linux build
  2018-01-22 13:32                           ` [PATCH 0/5] Travis CI: don't run the test suite as root in the 32 bit Linux build SZEDER Gábor
                                               ` (2 preceding siblings ...)
  2018-01-22 13:32                             ` [PATCH 3/5] travis-ci: don't repeat the path of the cache directory SZEDER Gábor
@ 2018-01-22 13:32                             ` SZEDER Gábor
  2018-01-22 13:32                             ` [PATCH 5/5] travis-ci: don't fail if user already exists on 32 bit Linux build job SZEDER Gábor
  4 siblings, 0 replies; 77+ messages in thread
From: SZEDER Gábor @ 2018-01-22 13:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Duy Nguyen, Jeff King, Lars Schneider, Johannes Schindelin, Thomas Gummerer, Brandon Williams, git, SZEDER Gábor

Travis CI runs the 32 bit Linux build job in a Docker container, where
all commands are executed as root by default.  Therefore, ever since
we added this build job in 88dedd5e7 (Travis: also test on 32-bit
Linux, 2017-03-05), we have a bit of code to create a user in the
container matching the ID of the host user and then to run the test
suite as this user.  Matching the host user ID is important, because
otherwise the host user would have no access to any files written by
processes running in the container, notably the logs of failed tests
couldn't be included in the build job's trace log.

Alas, this piece of code never worked, because it sets the variable
holding the user name ($CI_USER) in a subshell, meaning it doesn't
have any effect by the time we get to the point to actually use the
variable to switch users with 'su'.  So all this time we were running
the test suite as root.

Reorganize that piece of code in 'ci/run-linux32-build.sh' a bit to
avoid that problematic subshell and to ensure that we switch to the
right user.  Furthermore, make the script's optional host user ID
option mandatory, so running the build accidentally as root will
become harder when debugging locally.  If someone really wants to run
the test suite as root, whatever the reasons might be, it'll still be
possible to do so by explicitly passing '0' as host user ID.

Finally, one last catch: since commit 7e72cfcee (travis-ci: save prove
state for the 32 bit Linux build, 2017-12-27) the 'prove' test harness
has been writing its state to the Travis CI cache directory from
within the Docker container while running as root.  After this patch
'prove' will run as a regular user, so in future build jobs it won't
be able overwrite a previously written, still root-owned state file,
resulting in build job failures.  To resolve this we should manually
delete caches containing such root-owned files, but that would be a
hassle.  Instead, work this around by changing the owner of the whole
contents of the cache directory to the host user ID.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 ci/run-linux32-build.sh  | 30 +++++++++++++++++++++++++-----
 ci/run-linux32-docker.sh |  2 +-
 2 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/ci/run-linux32-build.sh b/ci/run-linux32-build.sh
index c9476d6598..e37e1d2d5f 100755
--- a/ci/run-linux32-build.sh
+++ b/ci/run-linux32-build.sh
@@ -3,11 +3,17 @@
 # Build and test Git in a 32-bit environment
 #
 # Usage:
-#   run-linux32-build.sh [host-user-id]
+#   run-linux32-build.sh <host-user-id>
 #
 
 set -ex
 
+if test $# -ne 1 || test -z "$1"
+then
+	echo >&2 "usage: run-linux32-build.sh <host-user-id>"
+	exit 1
+fi
+
 # Update packages to the latest available versions
 linux32 --32bit i386 sh -c '
     apt update >/dev/null &&
@@ -18,11 +24,25 @@ linux32 --32bit i386 sh -c '
 # If this script runs inside a docker container, then all commands are
 # usually executed as root. Consequently, the host user might not be
 # able to access the test output files.
-# If a host user id is given, then create a user "ci" with the host user
-# id to make everything accessible to the host user.
+# If a non 0 host user id is given, then create a user "ci" with that
+# user id to make everything accessible to the host user.
 HOST_UID=$1
-CI_USER=$USER
-test -z $HOST_UID || (CI_USER="ci" && useradd -u $HOST_UID $CI_USER)
+if test $HOST_UID -eq 0
+then
+	# Just in case someone does want to run the test suite as root.
+	CI_USER=root
+else
+	CI_USER=ci
+	useradd -u $HOST_UID $CI_USER
+	# Due to a bug the test suite was run as root in the past, so
+	# a prove state file created back then is only accessible by
+	# root.  Now that bug is fixed, the test suite is run as a
+	# regular user, but the prove state file coming from Travis
+	# CI's cache might still be owned by root.
+	# Make sure that this user has rights to any cached files,
+	# including an existing prove state file.
+	test -n "$cache_dir" && chown -R $HOST_UID:$HOST_UID "$cache_dir"
+fi
 
 # Build and test
 linux32 --32bit i386 su -m -l $CI_USER -c "
diff --git a/ci/run-linux32-docker.sh b/ci/run-linux32-docker.sh
index 15288ea2cf..21637903ce 100755
--- a/ci/run-linux32-docker.sh
+++ b/ci/run-linux32-docker.sh
@@ -9,7 +9,7 @@ docker pull daald/ubuntu32:xenial
 
 # Use the following command to debug the docker build locally:
 # $ docker run -itv "${PWD}:/usr/src/git" --entrypoint /bin/bash daald/ubuntu32:xenial
-# root@container:/# /usr/src/git/ci/run-linux32-build.sh
+# root@container:/# /usr/src/git/ci/run-linux32-build.sh <host-user-id>
 
 container_cache_dir=/tmp/travis-cache
 
-- 
2.16.1.80.gc0eec9753d


^ permalink raw reply	[flat|nested] 77+ messages in thread

* [PATCH 5/5] travis-ci: don't fail if user already exists on 32 bit Linux build job
  2018-01-22 13:32                           ` [PATCH 0/5] Travis CI: don't run the test suite as root in the 32 bit Linux build SZEDER Gábor
                                               ` (3 preceding siblings ...)
  2018-01-22 13:32                             ` [PATCH 4/5] travis-ci: don't run the test suite as root in the 32 bit Linux build SZEDER Gábor
@ 2018-01-22 13:32                             ` SZEDER Gábor
  4 siblings, 0 replies; 77+ messages in thread
From: SZEDER Gábor @ 2018-01-22 13:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Duy Nguyen, Jeff King, Lars Schneider, Johannes Schindelin, Thomas Gummerer, Brandon Williams, git, SZEDER Gábor

The 32 bit Linux build job runs in a Docker container, which lends
itself to running and debugging locally, too.  Especially during
debugging one usually doesn't want to start with a fresh container
every time, to save time spent on installing a bunch of dependencies.
However, that doesn't work quite smootly, because the script running
in the container always creates a new user, which then must be removed
every time before subsequent executions, or the build script fails.

Make this process more convenient and don't try to create that user if
it already exists and has the right user ID in the container, so
developers don't have to bother with running a 'userdel' each time
before they run the build script.

The build job on Travis CI always starts with a fresh Docker
container, so this change doesn't make a difference there.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 ci/run-linux32-build.sh | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/ci/run-linux32-build.sh b/ci/run-linux32-build.sh
index e37e1d2d5f..13047adde3 100755
--- a/ci/run-linux32-build.sh
+++ b/ci/run-linux32-build.sh
@@ -33,7 +33,13 @@ then
 	CI_USER=root
 else
 	CI_USER=ci
-	useradd -u $HOST_UID $CI_USER
+	if test "$(id -u $CI_USER)" = $HOST_UID
+	then
+		: # user already exists with the right ID
+	else
+		useradd -u $HOST_UID $CI_USER
+	fi
+
 	# Due to a bug the test suite was run as root in the past, so
 	# a prove state file created back then is only accessible by
 	# root.  Now that bug is fixed, the test suite is run as a
-- 
2.16.1.80.gc0eec9753d


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 3/3] read-cache: don't write index twice if we can't write shared index
  2018-01-18 12:47               ` Duy Nguyen
  2018-01-18 13:29                 ` Jeff King
@ 2018-01-22 18:27                 ` SZEDER Gábor
  2018-01-22 19:46                   ` Eric Sunshine
  1 sibling, 1 reply; 77+ messages in thread
From: SZEDER Gábor @ 2018-01-22 18:27 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Jeff King, Lars Schneider, Thomas Gummerer, Brandon Williams, git, SZEDER Gábor


On Thu, Jan 18, 2018 at 1:47 PM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Thu, Jan 18, 2018 at 6:36 PM, SZEDER Gábor <szeder.dev@gmail.com> wrote:
>> This series, queued as 'nd/shared-index-fix', makes the 32 bit Linux
>> build job fail on Travis CI.  Unfortunately, all it can tell us about
>> the failure is this:
>>
>>   Test Summary Report
>>   -------------------
>>   t1700-split-index.sh                             (Wstat: 256 Tests: 23
>>   Failed: 1)
>>     Failed test:  23
>>     Non-zero exit status: 1
>>   Files=809, Tests=18491, 401 wallclock secs ( 7.22 usr  1.60 sys + 263.16
>>   cusr 49.58 csys = 321.56 CPU)
>>   Result: FAIL
>>
>> because it can't access the test's verbose log due to lack of
>> permissions.
>>
>>
>>   https://travis-ci.org/git/git/jobs/329681826#L2074
>
> I may need help getting that log (or even better the trash directory
> of t1700). 

Well, shower thoughts gave me an idea, see the included PoC patch below.
I can't really decide whether it's too clever or too ugly :)

It produces output like this (a previous WIP version without
'--immediate'):

  https://travis-ci.org/szeder/git/jobs/331601009#L2486


  -- >8 --

Subject: [PATCH] travis-ci: include the trash directories of failed tests in
 the trace log

The trash directory of a failed test might contain valuable
information about the cause of the failure, but we have no access to
the trash directories of Travis CI build jobs.  The only feedback we
get from there is the trace log, so...

Modify 'ci/print-test-failures.sh' to create a tar.gz archive of the
test directory of each failed test and encode it with base64, so the
result is a block of ASCII text that can be safely included in the
trace log, along with a hint about how to restore it.  Furthermore,
run tests with '--immediate' to faithfully preserve the failed state.

A few of our tests create a sizeable trash directory, so limit the
size of each included base64-encoded block, let's say, to 1MB.

Note:

  - The logs of Linux build jobs coming from Travis CI have mostly
    CRLF line endings, which makes 'base64 -d' from 'coreutils'
    complain about "invalid input"; it has to be converted to LF
    first.  'openssl base64 -d' can grok it just fine, even without
    conversion.

  - The logs of OSX build jobs have CRCRLF line endings.  However, the
    'base64' util of OSX doesn't wrap its output at 76 columns, i.e.
    prints one single albeit very long line.  This means that while
    'base64' from 'coreutils' still complains, by the time it gets to
    the invalid input it already decoded everything and produced a
    valid .tar.gz.  OTOH, 'openssl base64 -d' doesn't grok it, but
    exits without any error message or even an error code, even after
    converting to CRLF or LF line endings.

    Go figure.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 ci/lib-travisci.sh        |  2 +-
 ci/print-test-failures.sh | 20 ++++++++++++++++++++
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/ci/lib-travisci.sh b/ci/lib-travisci.sh
index 1efee55ef7..981c6e9504 100755
--- a/ci/lib-travisci.sh
+++ b/ci/lib-travisci.sh
@@ -97,7 +97,7 @@ fi
 export DEVELOPER=1
 export DEFAULT_TEST_TARGET=prove
 export GIT_PROVE_OPTS="--timer --jobs 3 --state=failed,slow,save"
-export GIT_TEST_OPTS="--verbose-log"
+export GIT_TEST_OPTS="--verbose-log --immediate"
 export GIT_TEST_CLONE_2GB=YesPlease
 
 case "$jobname" in
diff --git a/ci/print-test-failures.sh b/ci/print-test-failures.sh
index 4f261ddc01..cc6a1247a4 100755
--- a/ci/print-test-failures.sh
+++ b/ci/print-test-failures.sh
@@ -23,5 +23,25 @@ do
 		echo "$(tput setaf 1)${TEST_OUT}...$(tput sgr0)"
 		echo "------------------------------------------------------------------------"
 		cat "${TEST_OUT}"
+
+		TEST_NAME="${TEST_EXIT%.exit}"
+		TEST_NAME="${TEST_NAME##*/}"
+		TRASH_DIR="t/trash directory.$TEST_NAME"
+		TRASH_TGZ_B64="$TEST_NAME.trash.base64"
+		if [ -d "$TRASH_DIR" ]
+		then
+			tar czp "$TRASH_DIR" |base64 >"$TRASH_TGZ_B64"
+
+			if [ 1048576 -lt $(wc -c <"$TRASH_TGZ_B64") ]
+			then
+				# larger than 1MB
+				echo "$(tput setaf 1)Trash directory of '$TEST_NAME' is too big to be included in the trace log$(tput sgr0)"
+			else
+				echo "$(tput setaf 1)Trash directory of '$TEST_NAME':$(tput sgr0)"
+				echo "(Extract it to a file from the raw log, make sure it has Unix line endings, then run 'base64 -d <file> |tar vxzp' to restore it)"
+				cat "$TRASH_TGZ_B64"
+				echo "$(tput setaf 1)End of trash directory of '$TEST_NAME'$(tput sgr0)"
+			fi
+		fi
 	fi
 done
-- 
2.16.1.80.gc0eec9753d


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 3/3] read-cache: don't write index twice if we can't write shared index
  2018-01-22 18:27                 ` [PATCH 3/3] read-cache: don't write index twice if we can't write shared index SZEDER Gábor
@ 2018-01-22 19:46                   ` Eric Sunshine
  2018-01-22 22:10                     ` SZEDER Gábor
  0 siblings, 1 reply; 77+ messages in thread
From: Eric Sunshine @ 2018-01-22 19:46 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Duy Nguyen, Jeff King, Lars Schneider, Thomas Gummerer, Brandon Williams, Git List

On Mon, Jan 22, 2018 at 1:27 PM, SZEDER Gábor <szeder.dev@gmail.com> wrote:
> Subject: [PATCH] travis-ci: include the trash directories of failed tests in
>  the trace log
>
> The trash directory of a failed test might contain valuable
> information about the cause of the failure, but we have no access to
> the trash directories of Travis CI build jobs.  The only feedback we
> get from there is the trace log, so...
>
> Modify 'ci/print-test-failures.sh' to create a tar.gz archive of the
> test directory of each failed test and encode it with base64, so the
> result is a block of ASCII text that can be safely included in the
> trace log, along with a hint about how to restore it.  Furthermore,
> run tests with '--immediate' to faithfully preserve the failed state.
>
> A few of our tests create a sizeable trash directory, so limit the
> size of each included base64-encoded block, let's say, to 1MB.
>
> Note:
>
>   - The logs of Linux build jobs coming from Travis CI have mostly
>     CRLF line endings, which makes 'base64 -d' from 'coreutils'
>     complain about "invalid input"; it has to be converted to LF
>     first.  'openssl base64 -d' can grok it just fine, even without
>     conversion.
>
>   - The logs of OSX build jobs have CRCRLF line endings.  However, the
>     'base64' util of OSX doesn't wrap its output at 76 columns, i.e.
>     prints one single albeit very long line.  This means that while

Perhaps you could pipe the 'base64' output through 'fold' or 'fmt'?

>     'base64' from 'coreutils' still complains, by the time it gets to
>     the invalid input it already decoded everything and produced a
>     valid .tar.gz.  OTOH, 'openssl base64 -d' doesn't grok it, but
>     exits without any error message or even an error code, even after
>     converting to CRLF or LF line endings.
>
>     Go figure.
>
> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 3/3] read-cache: don't write index twice if we can't write shared index
  2018-01-22 19:46                   ` Eric Sunshine
@ 2018-01-22 22:10                     ` SZEDER Gábor
  0 siblings, 0 replies; 77+ messages in thread
From: SZEDER Gábor @ 2018-01-22 22:10 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Duy Nguyen, Jeff King, Lars Schneider, Thomas Gummerer, Brandon Williams, Git List

On Mon, Jan 22, 2018 at 8:46 PM, Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Mon, Jan 22, 2018 at 1:27 PM, SZEDER Gábor <szeder.dev@gmail.com> wrote:

>>   - The logs of OSX build jobs have CRCRLF line endings.  However, the
>>     'base64' util of OSX doesn't wrap its output at 76 columns, i.e.
>>     prints one single albeit very long line.  This means that while
>
> Perhaps you could pipe the 'base64' output through 'fold' or 'fmt'?

No need to, according to its manpage[1], OSX's base64 has an option to
wrap lines after given number of characters.  Of course it uses a
different letter for the option than the coreutils base64... :)

(While long lines are ugly, in this particular case it comes handy: when
using vim to extract the base64-encoded section from the log to a
separate file, it's less keystrokes to yank a single line than to search
for the end of the encoded section.)


[1] - https://developer.apple.com/legacy/library/documentation/Darwin/Reference/ManPages/man1/base64.1.html

^ permalink raw reply	[flat|nested] 77+ messages in thread

end of thread, back to index

Thread overview: 77+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-10 21:21 [PATCH 0/3] fixes for split index mode Thomas Gummerer
2017-12-10 21:22 ` [PATCH 1/3] repository: fix repo_read_index with submodules Thomas Gummerer
2017-12-11 18:54   ` Brandon Williams
2017-12-11 20:37     ` Thomas Gummerer
2017-12-10 21:22 ` [PATCH 2/3] prune: fix pruning with multiple worktrees and split index Thomas Gummerer
2017-12-11 19:09   ` Brandon Williams
2017-12-11 21:39     ` Thomas Gummerer
2017-12-10 21:22 ` [PATCH 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX Thomas Gummerer
2017-12-10 23:37   ` Eric Sunshine
2017-12-11 21:09   ` SZEDER Gábor
2017-12-11 21:42     ` Thomas Gummerer
2017-12-12 15:54       ` Lars Schneider
2017-12-12 19:15         ` Junio C Hamano
2017-12-12 20:15           ` Thomas Gummerer
2017-12-12 20:51             ` Junio C Hamano
2017-12-13 23:21               ` Thomas Gummerer
2017-12-13 17:21           ` Lars Schneider
2017-12-13 17:38             ` Junio C Hamano
2017-12-13 17:46               ` Lars Schneider
2017-12-13 23:28                 ` Thomas Gummerer
2017-12-17 22:51 ` [PATCH v2 0/3] fixes for split index mode Thomas Gummerer
2017-12-17 22:51   ` [PATCH v2 1/3] repository: fix repo_read_index with submodules Thomas Gummerer
2017-12-18 18:01     ` Brandon Williams
2017-12-18 23:05       ` Thomas Gummerer
2017-12-18 23:05         ` Brandon Williams
2017-12-17 22:51   ` [PATCH v2 2/3] prune: fix pruning with multiple worktrees and split index Thomas Gummerer
2017-12-18 18:19     ` Brandon Williams
2018-01-03 22:18       ` Thomas Gummerer
2018-01-04 19:12         ` Junio C Hamano
2017-12-17 22:51   ` [PATCH v2 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX Thomas Gummerer
2017-12-18 18:16     ` Lars Schneider
2018-01-04 20:13       ` Thomas Gummerer
2018-01-05 11:03         ` Lars Schneider
2018-01-07 20:02         ` Thomas Gummerer
2018-01-07 22:30   ` [PATCH v3 0/3] fixes for split index mode Thomas Gummerer
2018-01-07 22:30     ` [PATCH v3 1/3] read-cache: fix reading the shared index for other repos Thomas Gummerer
2018-01-08 10:41       ` Duy Nguyen
2018-01-08 22:41         ` Thomas Gummerer
2018-01-13 22:33           ` Thomas Gummerer
2018-01-08 23:38         ` Brandon Williams
2018-01-09  1:24           ` Duy Nguyen
2018-01-16 21:42       ` Brandon Williams
2018-01-17  0:16         ` Duy Nguyen
2018-01-17  0:32           ` Brandon Williams
2018-01-17 18:16           ` Jonathan Nieder
2018-01-18 10:19             ` Duy Nguyen
2018-01-19 21:57       ` Junio C Hamano
2018-01-20 11:58         ` Thomas Gummerer
2018-01-22  6:14           ` Junio C Hamano
2018-01-07 22:30     ` [PATCH v3 2/3] split-index: don't write cache tree with null oid entries Thomas Gummerer
2018-01-07 22:30     ` [PATCH v3 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX Thomas Gummerer
2018-01-13 22:37     ` [PATCH v3 4/3] read-cache: don't try to write index if we can't write shared index Thomas Gummerer
2018-01-14  9:36       ` Duy Nguyen
2018-01-14 10:18         ` [PATCH 1/3] read-cache.c: change type of "temp" in write_shared_index() Nguyễn Thái Ngọc Duy
2018-01-14 10:18           ` [PATCH 2/3] read-cache.c: move tempfile creation/cleanup out of write_shared_index Nguyễn Thái Ngọc Duy
2018-01-14 10:18           ` [PATCH 3/3] read-cache: don't write index twice if we can't write shared index Nguyễn Thái Ngọc Duy
2018-01-18 11:36             ` SZEDER Gábor
2018-01-18 12:47               ` Duy Nguyen
2018-01-18 13:29                 ` Jeff King
2018-01-18 13:36                   ` Duy Nguyen
2018-01-18 15:00                     ` Duy Nguyen
2018-01-18 21:37                       ` Jeff King
2018-01-18 22:32                         ` SZEDER Gábor
2018-01-19  0:30                           ` Duy Nguyen
2018-01-22 13:32                           ` [PATCH 0/5] Travis CI: don't run the test suite as root in the 32 bit Linux build SZEDER Gábor
2018-01-22 13:32                             ` [PATCH 1/5] travis-ci: use 'set -x' for the commands under 'su' " SZEDER Gábor
2018-01-22 13:32                             ` [PATCH 2/5] travis-ci: use 'set -e' in the 32 bit Linux build job SZEDER Gábor
2018-01-22 13:32                             ` [PATCH 3/5] travis-ci: don't repeat the path of the cache directory SZEDER Gábor
2018-01-22 13:32                             ` [PATCH 4/5] travis-ci: don't run the test suite as root in the 32 bit Linux build SZEDER Gábor
2018-01-22 13:32                             ` [PATCH 5/5] travis-ci: don't fail if user already exists on 32 bit Linux build job SZEDER Gábor
2018-01-22 18:27                 ` [PATCH 3/3] read-cache: don't write index twice if we can't write shared index SZEDER Gábor
2018-01-22 19:46                   ` Eric Sunshine
2018-01-22 22:10                     ` SZEDER Gábor
2018-01-14 14:29         ` [PATCH v3 4/3] read-cache: don't try to write index " Thomas Gummerer
2018-01-18 21:53     ` [PATCH v3 0/3] fixes for split index mode Thomas Gummerer
2018-01-19 18:34       ` Junio C Hamano
2018-01-19 21:11         ` Thomas Gummerer

git@vger.kernel.org mailing list mirror (one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.org/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/
       or Tor2web: https://www.tor2web.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox