git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / Atom feed
* new segfault in master (6a6c0f10a70a6eb1)
@ 2019-05-11 20:57 Eric Wong
  2019-05-11 22:31 ` Jeff King
  0 siblings, 1 reply; 16+ messages in thread
From: Eric Wong @ 2019-05-11 20:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

This test-tool submodule segfault seems new.  Noticed it while
checking dmesg for other things.
There's also "name-rev HEAD~4000" (bottom), which is old, I think...

Core was generated by `$WT/t/helper/test-tool submodule-nested-repo-config submodule sub'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000559b12746174 in get_oid_with_context_1 (
    repo=repo@entry=0x7ffc5de3cf30, 
    name=name@entry=0x559b1280a882 ":.gitmodules", flags=flags@entry=0, 
    prefix=prefix@entry=0x0, oid=oid@entry=0x7ffc5de3ce80, 
    oc=oc@entry=0x7ffc5de3ce20) at sha1-name.c:1840
1840			if (!repo->index->cache)
(gdb) #0  0x0000559b12746174 in get_oid_with_context_1 (
    repo=repo@entry=0x7ffc5de3cf30, 
    name=name@entry=0x559b1280a882 ":.gitmodules", flags=flags@entry=0, 
    prefix=prefix@entry=0x0, oid=oid@entry=0x7ffc5de3ce80, 
    oc=oc@entry=0x7ffc5de3ce20) at sha1-name.c:1840
#1  0x0000559b12746dc3 in get_oid_with_context (oc=0x7ffc5de3ce20, 
    oid=0x7ffc5de3ce80, flags=0, str=str@entry=0x559b1280a882 ":.gitmodules", 
    repo=repo@entry=0x7ffc5de3cf30) at sha1-name.c:1946
#2  repo_get_oid (r=r@entry=0x7ffc5de3cf30, 
    name=name@entry=0x559b1280a882 ":.gitmodules", 
    oid=oid@entry=0x7ffc5de3ce80) at sha1-name.c:1595
#3  0x0000559b12753447 in config_from_gitmodules (
    fn=fn@entry=0x559b127534b0 <config_print_callback>, 
    repo=repo@entry=0x7ffc5de3cf30, data=0x559b145802c0)
    at submodule-config.c:633
#4  0x0000559b12754664 in print_config_from_gitmodules (
    repo=repo@entry=0x7ffc5de3cf30, key=<optimized out>)
    at submodule-config.c:742
#5  0x0000559b126dca4f in cmd__submodule_nested_repo_config (
    argc=<optimized out>, argv=0x7ffc5de3d290)
    at t/helper/test-submodule-nested-repo-config.c:27
#6  0x0000559b126d367f in cmd_main (argc=3, argv=0x7ffc5de3d290)
    at t/helper/test-tool.c:109
#7  0x0000559b126d337a in main (argc=4, argv=0x7ffc5de3d288)
    at common-main.c:50
(gdb) quit


Looks like a stack overflow:

Core was generated by `$WT/git name-rev HEAD~4000'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f4df1896d6a in _int_malloc (
    av=av@entry=0x7f4df1bb7b00 <main_arena>, bytes=bytes@entry=33)
    at malloc.c:3444
(gdb) #0  0x00007f4df1896d6a in _int_malloc (
    av=av@entry=0x7f4df1bb7b00 <main_arena>, bytes=bytes@entry=33)
    at malloc.c:3444
#1  0x00007f4df1897b68 in malloc_check (sz=32, caller=<optimized out>)
    at hooks.c:295
#2  0x00005642d1240f51 in do_xmalloc (size=size@entry=32, 
    gentle=gentle@entry=0) at wrapper.c:60
#3  0x00005642d12410e7 in xmalloc (size=size@entry=32) at wrapper.c:87
#4  0x00005642d10c75b1 in name_rev (commit=0x5642d2357bb0, 
    tip_name=tip_name@entry=0x5642d245e7c0 "master", 
    taggerdate=taggerdate@entry=1000799900, generation=generation@entry=1044, 
    distance=distance@entry=1044, from_tag=from_tag@entry=0, deref=0)
    at builtin/name-rev.c:103
#5  0x00005642d10c74e3 in name_rev (commit=<optimized out>, 
    tip_name=tip_name@entry=0x5642d245e7c0 "master", 
    taggerdate=taggerdate@entry=1000799900, generation=generation@entry=1043, 
    distance=distance@entry=1043, from_tag=from_tag@entry=0, deref=0)
    at builtin/name-rev.c:138

<snip>...

#1047 0x00005642d10c74e3 in name_rev (commit=<optimized out>, 
    tip_name=tip_name@entry=0x5642d245e7c0 "master", 
    taggerdate=taggerdate@entry=1000799900, generation=generation@entry=1, 
    distance=distance@entry=1, from_tag=from_tag@entry=0, deref=0)
    at builtin/name-rev.c:138
#1048 0x00005642d10c74e3 in name_rev (commit=commit@entry=0x5642d22e6420, 
    tip_name=0x5642d245e7c0 "master", taggerdate=taggerdate@entry=1000799900, 
    generation=generation@entry=0, distance=distance@entry=0, 
    from_tag=from_tag@entry=0, deref=0) at builtin/name-rev.c:138
#1049 0x00005642d10c7889 in name_ref (path=<optimized out>, 
    oid=0x5642d2465bd8, flags=<optimized out>, cb_data=<optimized out>)
    at builtin/name-rev.c:276
#1050 0x00005642d11d5074 in do_for_each_repo_ref_iterator (
    r=0x5642d1546de0 <the_repo>, iter=0x5642d245da20, 
    fn=fn@entry=0x5642d11cab10 <do_for_each_ref_helper>, 
    cb_data=cb_data@entry=0x7ffc71fffa90) at refs/iterator.c:418
#1051 0x00005642d11cc7eb in do_for_each_ref (refs=<optimized out>, 
    prefix=prefix@entry=0x5642d12a0450 "", 
    fn=fn@entry=0x5642d10c75f0 <name_ref>, trim=trim@entry=0, 
    flags=flags@entry=0, cb_data=cb_data@entry=0x7ffc71fffb40) at refs.c:1496
#1052 0x00005642d11cd488 in refs_for_each_ref (cb_data=0x7ffc71fffb40, 
    fn=0x5642d10c75f0 <name_ref>, refs=<optimized out>) at refs.c:1502
#1053 for_each_ref (fn=fn@entry=0x5642d10c75f0 <name_ref>, 
    cb_data=cb_data@entry=0x7ffc71fffb40) at refs.c:1507
#1054 0x00005642d10c7ec5 in cmd_name_rev (argc=<optimized out>, 
    argv=0x7ffc71fffb40, prefix=<optimized out>) at builtin/name-rev.c:490
#1055 0x00005642d1070d68 in run_builtin (argv=<optimized out>, 
    argc=<optimized out>, p=<optimized out>) at git.c:444
#1056 handle_builtin (argc=2, argv=0x7ffc72000ac0) at git.c:675
#1057 0x00005642d1071d1e in run_argv (argv=0x7ffc72000840, 
    argcp=0x7ffc7200084c) at git.c:742
#1058 cmd_main (argc=<optimized out>, argv=<optimized out>) at git.c:876
#1059 0x00005642d107094a in main (argc=3, argv=0x7ffc72000ab8)
    at common-main.c:50
(gdb) quit

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: new segfault in master (6a6c0f10a70a6eb1)
  2019-05-11 20:57 new segfault in master (6a6c0f10a70a6eb1) Eric Wong
@ 2019-05-11 22:31 ` Jeff King
  2019-05-11 23:02   ` Jeff King
  0 siblings, 1 reply; 16+ messages in thread
From: Jeff King @ 2019-05-11 22:31 UTC (permalink / raw)
  To: Eric Wong; +Cc: Antonio Ospite, Junio C Hamano, git

On Sat, May 11, 2019 at 08:57:11PM +0000, Eric Wong wrote:

> This test-tool submodule segfault seems new.  Noticed it while
> checking dmesg for other things.

Yeah, I hadn't seen it before. It's almost certainly the expect_failure
added in 2b1257e463 (t/helper: add test-submodule-nested-repo-config,
2018-10-25), since otherwise we'd be complaining of a test failure.

I know we don't expect this to do the right thing yet, but it seems like
there's still a bug, since the test seems to think we should output a
nice message (and it's possible that the segfault can be triggered from
non-test-tool code, too).

+cc the author.

> There's also "name-rev HEAD~4000" (bottom), which is old, I think...
> [...]
> Looks like a stack overflow:

Yeah, this one is old and expected. It's also in an expect_failure. The
CI we run things through at GitHub complains if there are any segfaults,
and I hacked around it with the patch below.

I sort of assumed nobody else cared, since they hadn't mentioned it. But
we could do something similar. Though note in my version the default is
"do not run the test", and we'd maybe want to flip it the other way (and
also break up the setup step so that the succeeding test actually runs).

-- >8 --
Subject: [PATCH] t6120: mark a failing test with SEGFAULT_OK prereq

Upstream recently added a test of name-rev on a deep repository, which
shows that its recursive algorithm can blow out the stack and segfault.
We have several such tests already, but the twist here is that it's
expect_failure. So we _know_ it's going to segfault and Git's test suite
is OK with that, until the problem is fixed.

But our CI is not so forgiving. If it sees any segfault at all, it
interrupts the test run and declares the whole thing a failure.

Let's just skip this test by adding a prerequisite that isn't filled.
It's not telling us anything interesting. And if it ever gets fixed
upstream, that will cause a conflict and we can start running it.

Note that we also have to skip the test after it, which relies on the
state set up by the first one. This isn't a big deal, as it's not
testing code that we're likely to change ourselves.

Signed-off-by: Jeff King <peff@peff.net>
---
 t/t6120-describe.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 1c0e8659d9..9d98b95ba6 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -310,7 +310,7 @@ test_expect_success 'describe ignoring a borken submodule' '
 	grep broken out
 '
 
-test_expect_failure ULIMIT_STACK_SIZE 'name-rev works in a deep repo' '
+test_expect_failure ULIMIT_STACK_SIZE,SEGFAULT_OK 'name-rev works in a deep repo' '
 	i=1 &&
 	while test $i -lt 8000
 	do
@@ -331,7 +331,7 @@ EOF"
 	test_cmp expect actual
 '
 
-test_expect_success ULIMIT_STACK_SIZE 'describe works in a deep repo' '
+test_expect_success ULIMIT_STACK_SIZE,SEGFAULT_OK 'describe works in a deep repo' '
 	git tag -f far-far-away HEAD~7999 &&
 	echo "far-far-away" >expect &&
 	git describe --tags --abbrev=0 HEAD~4000 >actual &&
-- 
2.21.0.1388.g2b1efd806f


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: new segfault in master (6a6c0f10a70a6eb1)
  2019-05-11 22:31 ` Jeff King
@ 2019-05-11 23:02   ` Jeff King
  2019-05-12  4:26     ` Duy Nguyen
  2019-05-14 13:54     ` [PATCH] get_oid: handle NULL repo->index Jeff King
  0 siblings, 2 replies; 16+ messages in thread
From: Jeff King @ 2019-05-11 23:02 UTC (permalink / raw)
  To: Eric Wong
  Cc: Nguyễn Thái Ngọc Duy, Antonio Ospite,
	Junio C Hamano, git

On Sat, May 11, 2019 at 06:31:20PM -0400, Jeff King wrote:

> On Sat, May 11, 2019 at 08:57:11PM +0000, Eric Wong wrote:
> 
> > This test-tool submodule segfault seems new.  Noticed it while
> > checking dmesg for other things.
> 
> Yeah, I hadn't seen it before. It's almost certainly the expect_failure
> added in 2b1257e463 (t/helper: add test-submodule-nested-repo-config,
> 2018-10-25), since otherwise we'd be complaining of a test failure.
> 
> I know we don't expect this to do the right thing yet, but it seems like
> there's still a bug, since the test seems to think we should output a
> nice message (and it's possible that the segfault can be triggered from
> non-test-tool code, too).
> 
> +cc the author.

Actually, the plot thickens. That test _used to_ correctly expect
failure (well, sort of -- it greps for the string with %s, which is
wrong!). But then more recently in d9b8b8f896 (submodule-config.c: use
repo_get_oid for reading .gitmodules, 2019-04-16), it started actually
doing the lookup in the correct repo. And that started the segfault,
because nobody has actually loaded the index for the submodule.

I don't think this can be triggered outside of test-tool. There are
four ways to get to config_from_gitmodules():

  - via repo_read_gitmodules(), which explicitly loads the index

  - via print_config_from_gitmodules(). This is called from
    submodule--helper, but only with the_repository as the argument (and
    I _think_ that the_repository->index is never NULL, because we point
    it at the_index).

  - via fetch_config_from_gitmodules(), which always passes
    the_repository

  - via update_clone_config_from_gitmodules(), likewise

But regardless, I think it makes sense to load the index on demand when
we need it here, which makes Antonio's original test pass (like the
patch below).

The segfault ultimately comes from repo_get_oid(); we feed it
":.gitmodules" and it blindly looks at repo->index. It's probably worth
it being a bit more defensive and just returning "no such entry" if
there's no index to look at (it could also load on demand, I guess, but
it seems like too low a level to be making that kind of decision).

I'm out of time for now, but I'll look into cleaning this up and writing
a real commit message later.

---
diff --git a/submodule-config.c b/submodule-config.c
index 4264ee216f..ad2444bcec 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -630,7 +630,8 @@ static void config_from_gitmodules(config_fn_t fn, struct repository *repo, void
 		file = repo_worktree_path(repo, GITMODULES_FILE);
 		if (file_exists(file)) {
 			config_source.file = file;
-		} else if (repo_get_oid(repo, GITMODULES_INDEX, &oid) >= 0 ||
+		} else if ((repo_read_index(repo) >= 0 &&
+			    repo_get_oid(repo, GITMODULES_INDEX, &oid) >= 0) ||
 			   repo_get_oid(repo, GITMODULES_HEAD, &oid) >= 0) {
 			config_source.blob = oidstr = xstrdup(oid_to_hex(&oid));
 			if (repo != the_repository)
diff --git a/t/t7411-submodule-config.sh b/t/t7411-submodule-config.sh
index fcc0fb82d8..ad28e93880 100755
--- a/t/t7411-submodule-config.sh
+++ b/t/t7411-submodule-config.sh
@@ -243,18 +243,14 @@ test_expect_success 'reading nested submodules config' '
 	)
 '
 
-# When this test eventually passes, before turning it into
-# test_expect_success, remember to replace the test_i18ngrep below with
-# a "test_must_be_empty warning" to be sure that the warning is actually
-# removed from the code.
-test_expect_failure 'reading nested submodules config when .gitmodules is not in the working tree' '
+test_expect_success 'reading nested submodules config when .gitmodules is not in the working tree' '
 	test_when_finished "git -C super/submodule checkout .gitmodules" &&
 	(cd super &&
 		echo "./nested_submodule" >expect &&
 		rm submodule/.gitmodules &&
 		test-tool submodule-nested-repo-config \
 			submodule submodule.nested_submodule.url >actual 2>warning &&
-		test_i18ngrep "nested submodules without %s in the working tree are not supported yet" warning &&
+		test_must_be_empty warning &&
 		test_cmp expect actual
 	)
 '

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: new segfault in master (6a6c0f10a70a6eb1)
  2019-05-11 23:02   ` Jeff King
@ 2019-05-12  4:26     ` Duy Nguyen
  2019-05-14 13:54     ` [PATCH] get_oid: handle NULL repo->index Jeff King
  1 sibling, 0 replies; 16+ messages in thread
From: Duy Nguyen @ 2019-05-12  4:26 UTC (permalink / raw)
  To: Jeff King; +Cc: Eric Wong, Antonio Ospite, Junio C Hamano, Git Mailing List

On Sun, May 12, 2019 at 6:02 AM Jeff King <peff@peff.net> wrote:
>
> On Sat, May 11, 2019 at 06:31:20PM -0400, Jeff King wrote:
>
> > On Sat, May 11, 2019 at 08:57:11PM +0000, Eric Wong wrote:
> >
> > > This test-tool submodule segfault seems new.  Noticed it while
> > > checking dmesg for other things.
> >
> > Yeah, I hadn't seen it before. It's almost certainly the expect_failure
> > added in 2b1257e463 (t/helper: add test-submodule-nested-repo-config,
> > 2018-10-25), since otherwise we'd be complaining of a test failure.
> >
> > I know we don't expect this to do the right thing yet, but it seems like
> > there's still a bug, since the test seems to think we should output a
> > nice message (and it's possible that the segfault can be triggered from
> > non-test-tool code, too).
> >
> > +cc the author.
>
> Actually, the plot thickens. That test _used to_ correctly expect
> failure (well, sort of -- it greps for the string with %s, which is
> wrong!). But then more recently in d9b8b8f896 (submodule-config.c: use
> repo_get_oid for reading .gitmodules, 2019-04-16), it started actually
> doing the lookup in the correct repo. And that started the segfault,
> because nobody has actually loaded the index for the submodule.
>
> I don't think this can be triggered outside of test-tool. There are
> four ways to get to config_from_gitmodules():
>
>   - via repo_read_gitmodules(), which explicitly loads the index
>
>   - via print_config_from_gitmodules(). This is called from
>     submodule--helper, but only with the_repository as the argument (and
>     I _think_ that the_repository->index is never NULL, because we point
>     it at the_index).
>
>   - via fetch_config_from_gitmodules(), which always passes
>     the_repository
>
>   - via update_clone_config_from_gitmodules(), likewise
>
> But regardless, I think it makes sense to load the index on demand when
> we need it here, which makes Antonio's original test pass (like the
> patch below).
>
> The segfault ultimately comes from repo_get_oid(); we feed it
> ":.gitmodules" and it blindly looks at repo->index. It's probably worth
> it being a bit more defensive and just returning "no such entry" if
> there's no index to look at (it could also load on demand, I guess, but
> it seems like too low a level to be making that kind of decision).
>
> I'm out of time for now, but I'll look into cleaning this up and writing
> a real commit message later.
>
> ---
> diff --git a/submodule-config.c b/submodule-config.c
> index 4264ee216f..ad2444bcec 100644
> --- a/submodule-config.c
> +++ b/submodule-config.c
> @@ -630,7 +630,8 @@ static void config_from_gitmodules(config_fn_t fn, struct repository *repo, void
>                 file = repo_worktree_path(repo, GITMODULES_FILE);
>                 if (file_exists(file)) {
>                         config_source.file = file;
> -               } else if (repo_get_oid(repo, GITMODULES_INDEX, &oid) >= 0 ||
> +               } else if ((repo_read_index(repo) >= 0 &&
> +                           repo_get_oid(repo, GITMODULES_INDEX, &oid) >= 0) ||
>                            repo_get_oid(repo, GITMODULES_HEAD, &oid) >= 0) {
>                         config_source.blob = oidstr = xstrdup(oid_to_hex(&oid));
>                         if (repo != the_repository)
> diff --git a/t/t7411-submodule-config.sh b/t/t7411-submodule-config.sh
> index fcc0fb82d8..ad28e93880 100755
> --- a/t/t7411-submodule-config.sh
> +++ b/t/t7411-submodule-config.sh
> @@ -243,18 +243,14 @@ test_expect_success 'reading nested submodules config' '
>         )
>  '
>
> -# When this test eventually passes, before turning it into
> -# test_expect_success, remember to replace the test_i18ngrep below with
> -# a "test_must_be_empty warning" to be sure that the warning is actually
> -# removed from the code.
> -test_expect_failure 'reading nested submodules config when .gitmodules is not in the working tree' '
> +test_expect_success 'reading nested submodules config when .gitmodules is not in the working tree' '

I did miss this test. Yeah your fix makes sense.

>         test_when_finished "git -C super/submodule checkout .gitmodules" &&
>         (cd super &&
>                 echo "./nested_submodule" >expect &&
>                 rm submodule/.gitmodules &&
>                 test-tool submodule-nested-repo-config \
>                         submodule submodule.nested_submodule.url >actual 2>warning &&
> -               test_i18ngrep "nested submodules without %s in the working tree are not supported yet" warning &&
> +               test_must_be_empty warning &&
>                 test_cmp expect actual
>         )
>  '



-- 
Duy

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH] get_oid: handle NULL repo->index
  2019-05-11 23:02   ` Jeff King
  2019-05-12  4:26     ` Duy Nguyen
@ 2019-05-14 13:54     ` Jeff King
  2019-05-14 23:38       ` Eric Wong
  2019-05-15  1:24       ` Duy Nguyen
  1 sibling, 2 replies; 16+ messages in thread
From: Jeff King @ 2019-05-14 13:54 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy
  Cc: Eric Wong, Antonio Ospite, Junio C Hamano, git

On Sat, May 11, 2019 at 07:02:05PM -0400, Jeff King wrote:

> But regardless, I think it makes sense to load the index on demand when
> we need it here, which makes Antonio's original test pass (like the
> patch below).
> 
> The segfault ultimately comes from repo_get_oid(); we feed it
> ":.gitmodules" and it blindly looks at repo->index. It's probably worth
> it being a bit more defensive and just returning "no such entry" if
> there's no index to look at (it could also load on demand, I guess, but
> it seems like too low a level to be making that kind of decision).

This turned out to be even simpler than the patch I posted earlier.
After polishing up my submodule-config fix, I looked at teaching
get_oid() to behave differently. And it turns out that it already tries
to load the index on demand! There's just a silly mistake in the code to
check whether the index is already initialized.

Once we fix that, we don't need to handle this specifically in the
submodule code. So here's a simpler, revised patch:

-- >8 --
Subject: [PATCH] get_oid: handle NULL repo->index

When get_oid() and its helpers see an index name like ":.gitmodules",
they try to load the index on demand, like:

  if (repo->index->cache)
	repo_read_index(repo);

However, that misses the case when "repo->index" itself is NULL; we'll
segfault in the conditional.

This never happens with the_repository; there we always point its index
field to &the_index. But a submodule repository may have a NULL index
field until somebody calls repo_read_index().

This bug is triggered by t7411, but it was hard to notice because it's
in an expect_failure block. That test was added by 2b1257e463 (t/helper:
add test-submodule-nested-repo-config, 2018-10-25). Back then we had no
easy way to access the .gitmodules blob of a submodule repo, so we
expected (and got) an error message to that effect. Later, d9b8b8f896
(submodule-config.c: use repo_get_oid for reading .gitmodules,
2019-04-16) started looking in the correct repo, which is when we
started triggering the segfault.

With this fix, the test starts passing (once we clean it up as its
comment instructs).

Note that as far as I know, this bug could not be triggered outside of
the test suite. It requires resolving an index name in a submodule, and
all of the code paths (aside from test-tool) which do that either load
the index themselves, or always pass the_repository.

Ultimately it comes from 3a7a698e93 (sha1-name.c: remove implicit
dependency on the_index, 2019-01-12), which replaced a check of
"the_index.cache" with "repo->index->cache". So even if there is another
way to trigger it, it wouldn't affect any versions before then.

Signed-off-by: Jeff King <peff@peff.net>
---
Arguably this code should just unconditionally call repo_read_index(),
which should be a noop if the index is already loaded. But I wanted to
do the minimal fix here, without getting into any subtle differences
between what checking index->cache versus index->initialized might mean.
Anybody who wants to dig into that is welcome to make a patch on top. :)

I also wondered if we should simply allocate an empty index whenever we
have a non-toplevel "struct repository", which might be less surprising
to other callers. I don't have a strong opinion either way. I did grep
around for other callers which might have similar problems, but couldn't
find any.

 sha1-name.c                 | 2 +-
 t/t7411-submodule-config.sh | 8 ++------
 2 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/sha1-name.c b/sha1-name.c
index 775a73d8ad..455e9fb1ea 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -1837,7 +1837,7 @@ static enum get_oid_result get_oid_with_context_1(struct repository *repo,
 		if (flags & GET_OID_RECORD_PATH)
 			oc->path = xstrdup(cp);
 
-		if (!repo->index->cache)
+		if (!repo->index || !repo->index->cache)
 			repo_read_index(repo);
 		pos = index_name_pos(repo->index, cp, namelen);
 		if (pos < 0)
diff --git a/t/t7411-submodule-config.sh b/t/t7411-submodule-config.sh
index fcc0fb82d8..ad28e93880 100755
--- a/t/t7411-submodule-config.sh
+++ b/t/t7411-submodule-config.sh
@@ -243,18 +243,14 @@ test_expect_success 'reading nested submodules config' '
 	)
 '
 
-# When this test eventually passes, before turning it into
-# test_expect_success, remember to replace the test_i18ngrep below with
-# a "test_must_be_empty warning" to be sure that the warning is actually
-# removed from the code.
-test_expect_failure 'reading nested submodules config when .gitmodules is not in the working tree' '
+test_expect_success 'reading nested submodules config when .gitmodules is not in the working tree' '
 	test_when_finished "git -C super/submodule checkout .gitmodules" &&
 	(cd super &&
 		echo "./nested_submodule" >expect &&
 		rm submodule/.gitmodules &&
 		test-tool submodule-nested-repo-config \
 			submodule submodule.nested_submodule.url >actual 2>warning &&
-		test_i18ngrep "nested submodules without %s in the working tree are not supported yet" warning &&
+		test_must_be_empty warning &&
 		test_cmp expect actual
 	)
 '
-- 
2.21.0.1388.g2b1efd806f


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] get_oid: handle NULL repo->index
  2019-05-14 13:54     ` [PATCH] get_oid: handle NULL repo->index Jeff King
@ 2019-05-14 23:38       ` Eric Wong
  2019-05-15  1:24       ` Duy Nguyen
  1 sibling, 0 replies; 16+ messages in thread
From: Eric Wong @ 2019-05-14 23:38 UTC (permalink / raw)
  To: Jeff King
  Cc: Nguyễn Thái Ngọc Duy, Antonio Ospite,
	Junio C Hamano, git

Jeff King <peff@peff.net> wrote:
> +++ b/sha1-name.c
> @@ -1837,7 +1837,7 @@ static enum get_oid_result get_oid_with_context_1(struct repository *repo,
>  		if (flags & GET_OID_RECORD_PATH)
>  			oc->path = xstrdup(cp);
>  
> -		if (!repo->index->cache)
> +		if (!repo->index || !repo->index->cache)
>  			repo_read_index(repo);

Awesome, looks obviously correct and can confirm it fixes the
new segfault :>

Now I'm kinda wondering if some static checker coulda/shoulda
spotted that, sooner.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] get_oid: handle NULL repo->index
  2019-05-14 13:54     ` [PATCH] get_oid: handle NULL repo->index Jeff King
  2019-05-14 23:38       ` Eric Wong
@ 2019-05-15  1:24       ` Duy Nguyen
  2019-05-15  1:46         ` Jeff King
  1 sibling, 1 reply; 16+ messages in thread
From: Duy Nguyen @ 2019-05-15  1:24 UTC (permalink / raw)
  To: Jeff King; +Cc: Eric Wong, Antonio Ospite, Junio C Hamano, Git Mailing List

On Tue, May 14, 2019 at 8:54 PM Jeff King <peff@peff.net> wrote:
> diff --git a/sha1-name.c b/sha1-name.c
> index 775a73d8ad..455e9fb1ea 100644
> --- a/sha1-name.c
> +++ b/sha1-name.c
> @@ -1837,7 +1837,7 @@ static enum get_oid_result get_oid_with_context_1(struct repository *repo,
>                 if (flags & GET_OID_RECORD_PATH)
>                         oc->path = xstrdup(cp);
>
> -               if (!repo->index->cache)
> +               if (!repo->index || !repo->index->cache)
>                         repo_read_index(repo);

We could even drop the "if" and call repo_read_index()
unconditionally. If the index is already read, it will be no-op
(forcing a reread has always been discard_index(); read_index();)

Thanks for catching this by the way. I'll need to go through all
the_index conversion to see if I left similar traps like this.
-- 
Duy

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] get_oid: handle NULL repo->index
  2019-05-15  1:24       ` Duy Nguyen
@ 2019-05-15  1:46         ` Jeff King
  2019-05-15  5:16           ` Junio C Hamano
  0 siblings, 1 reply; 16+ messages in thread
From: Jeff King @ 2019-05-15  1:46 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Eric Wong, Antonio Ospite, Junio C Hamano, Git Mailing List

On Wed, May 15, 2019 at 08:24:34AM +0700, Duy Nguyen wrote:

> On Tue, May 14, 2019 at 8:54 PM Jeff King <peff@peff.net> wrote:
> > diff --git a/sha1-name.c b/sha1-name.c
> > index 775a73d8ad..455e9fb1ea 100644
> > --- a/sha1-name.c
> > +++ b/sha1-name.c
> > @@ -1837,7 +1837,7 @@ static enum get_oid_result get_oid_with_context_1(struct repository *repo,
> >                 if (flags & GET_OID_RECORD_PATH)
> >                         oc->path = xstrdup(cp);
> >
> > -               if (!repo->index->cache)
> > +               if (!repo->index || !repo->index->cache)
> >                         repo_read_index(repo);
> 
> We could even drop the "if" and call repo_read_index()
> unconditionally. If the index is already read, it will be no-op
> (forcing a reread has always been discard_index(); read_index();)

I think you missed my bit after the "---":

  Arguably this code should just unconditionally call repo_read_index(),
  which should be a noop if the index is already loaded. But I wanted to
  do the minimal fix here, without getting into any subtle differences
  between what checking index->cache versus index->initialized might
  mean.  Anybody who wants to dig into that is welcome to make a patch
  on top. :)

> Thanks for catching this by the way. I'll need to go through all
> the_index conversion to see if I left similar traps like this.

Yeah. I could not find any others, but it would not hurt to have a
second set of eyes.

Also from my earlier message, if you missed it:

  I also wondered if we should simply allocate an empty index whenever
  we have a non-toplevel "struct repository", which might be less
  surprising to other callers. I don't have a strong opinion either way.
  I did grep around for other callers which might have similar problems,
  but couldn't find any.

-Peff

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] get_oid: handle NULL repo->index
  2019-05-15  1:46         ` Jeff King
@ 2019-05-15  5:16           ` Junio C Hamano
  2019-05-15  9:29             ` Duy Nguyen
  0 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2019-05-15  5:16 UTC (permalink / raw)
  To: Jeff King; +Cc: Duy Nguyen, Eric Wong, Antonio Ospite, Git Mailing List

Jeff King <peff@peff.net> writes:

> Also from my earlier message, if you missed it:
>
>   I also wondered if we should simply allocate an empty index whenever
>   we have a non-toplevel "struct repository", which might be less
>   surprising to other callers. I don't have a strong opinion either way.
>   I did grep around for other callers which might have similar problems,
>   but couldn't find any.

That is an approach to make it harder to make mistakes by accepting
possibly a small wasted resource; but at that point, I think calling
repo_read_index() unconditionally from here and similar places would
be a simpler fix in the same spirit.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] get_oid: handle NULL repo->index
  2019-05-15  5:16           ` Junio C Hamano
@ 2019-05-15  9:29             ` Duy Nguyen
  2019-05-16  1:43               ` Junio C Hamano
  0 siblings, 1 reply; 16+ messages in thread
From: Duy Nguyen @ 2019-05-15  9:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Eric Wong, Antonio Ospite, Git Mailing List

On Wed, May 15, 2019 at 12:16 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Jeff King <peff@peff.net> writes:
>
> > Also from my earlier message, if you missed it:
> >
> >   I also wondered if we should simply allocate an empty index whenever
> >   we have a non-toplevel "struct repository", which might be less
> >   surprising to other callers. I don't have a strong opinion either way.
> >   I did grep around for other callers which might have similar problems,
> >   but couldn't find any.
>
> That is an approach to make it harder to make mistakes by accepting
> possibly a small wasted resource; but at that point, I think calling
> repo_read_index() unconditionally from here and similar places would
> be a simpler fix in the same spirit.

For repo_read_index() case, maybe. But we have a lot of
"r(epo)->index->something". All or most of these references
traditionally are the_index.something, which is always safe to
dereference. Submodule repos with the optionally NULL repo->index
break this assumption.

So either we audit all the code and make sure "repo->index" cannot be
NULL because repo_read_index() has been called before (and may have to
repeat auditing), or we just stick to the old assumption and make sure
repo->index is not NULL from the beginning. This makes me think the
small extra resource is worth it. Much less time will be spent on
similar issues now and in the future.

PS. Sorry Jeff for the noise. I should have waited until I come home
and can read your mail more carefully.
-- 
Duy

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] get_oid: handle NULL repo->index
  2019-05-15  9:29             ` Duy Nguyen
@ 2019-05-16  1:43               ` Junio C Hamano
  2019-05-19  2:56                 ` [PATCH] repository.c: always allocate 'index' at repo init time Nguyễn Thái Ngọc Duy
  0 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2019-05-16  1:43 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Jeff King, Eric Wong, Antonio Ospite, Git Mailing List

Duy Nguyen <pclouds@gmail.com> writes:

>> That is an approach to make it harder to make mistakes by accepting
>> possibly a small wasted resource; but at that point, I think calling
>> repo_read_index() unconditionally from here and similar places would
>> be a simpler fix in the same spirit.
>
> For repo_read_index() case, maybe. But we have a lot of
> "r(epo)->index->something". All or most of these references
> traditionally are the_index.something, which is always safe to
> dereference. Submodule repos with the optionally NULL repo->index
> break this assumption.

Ah, good point.  Thanks for a dose of sanity.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH] repository.c: always allocate 'index' at repo init time
  2019-05-16  1:43               ` Junio C Hamano
@ 2019-05-19  2:56                 ` Nguyễn Thái Ngọc Duy
  2019-05-20 13:17                   ` Jeff King
  0 siblings, 1 reply; 16+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2019-05-19  2:56 UTC (permalink / raw)
  To: gitster; +Cc: ao2, e, git, pclouds, peff

There are two ways a 'struct repository' could be initialized before
using: via initialize_the_repository() and repo_init().

The first way always initializes 'index' field because that's how it is
before the introduction of 'struct repository'. Back then 'the_index' is
always available (even if not loaded). The second way however leaves
'index' NULL and relies on repo_read_index() to allocate it on demand.

The problem with the second way is that, the majority of our code base
was written with 'the_index' (i.e. the first way) in mind, where
dereferencing 'the_index' (or the 'index' field now) is always
safe.

The second way breaks this assumption. The 'index' field can be NULL
until loading from disk, which could lead to segfaults like
581d2fd9f2 (get_oid: handle NULL repo->index, 2019-05-14).

We have two options to handle this: either we audit the entire code
base, adding 'is index NULL' when needed, or we make sure 'index' is
never NULL to begin with.

This patch goes with the second option, making sure that 'index' is
always allocated after initialization. It's less effort than the first
one, and also safer because you could still miss things during the code
audit. The extra allocation cost is not a real concern.

The 'index' field is still freed and reset to NULL in repo_clear(). But
after that call, a lot more is missing in 'repo' and it can never be
used again without going through reinitialization phase. So it should be
fine.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 repository.c | 3 ++-
 repository.h | 4 ++++
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/repository.c b/repository.c
index 682c239fe3..ca58692504 100644
--- a/repository.c
+++ b/repository.c
@@ -160,6 +160,7 @@ int repo_init(struct repository *repo,
 	struct repository_format format = REPOSITORY_FORMAT_INIT;
 	memset(repo, 0, sizeof(*repo));
 
+	repo->index = xcalloc(1, sizeof(*repo->index));
 	repo->objects = raw_object_store_new();
 	repo->parsed_objects = parsed_object_pool_new();
 
@@ -262,7 +263,7 @@ void repo_clear(struct repository *repo)
 int repo_read_index(struct repository *repo)
 {
 	if (!repo->index)
-		repo->index = xcalloc(1, sizeof(*repo->index));
+		BUG("the repo hasn't been setup");
 
 	return read_index_from(repo->index, repo->index_file, repo->gitdir);
 }
diff --git a/repository.h b/repository.h
index 4fb6a5885f..75c4f68b22 100644
--- a/repository.h
+++ b/repository.h
@@ -85,6 +85,7 @@ struct repository {
 
 	/*
 	 * Repository's in-memory index.
+	 * Cannot be NULL after initialization.
 	 * 'repo_read_index()' can be used to populate 'index'.
 	 */
 	struct index_state *index;
@@ -132,6 +133,9 @@ struct submodule;
 int repo_submodule_init(struct repository *subrepo,
 			struct repository *superproject,
 			const struct submodule *sub);
+/*
+ * Release all resources in 'repo'. 'repo' cannot be used again.
+ */
 void repo_clear(struct repository *repo);
 
 /*
-- 
2.22.0.rc0.322.g2b0371e29a


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] repository.c: always allocate 'index' at repo init time
  2019-05-19  2:56                 ` [PATCH] repository.c: always allocate 'index' at repo init time Nguyễn Thái Ngọc Duy
@ 2019-05-20 13:17                   ` Jeff King
  2019-05-21 10:34                     ` Duy Nguyen
  0 siblings, 1 reply; 16+ messages in thread
From: Jeff King @ 2019-05-20 13:17 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: gitster, ao2, e, git

On Sun, May 19, 2019 at 09:56:36AM +0700, Nguyễn Thái Ngọc Duy wrote:

> This patch goes with the second option, making sure that 'index' is
> always allocated after initialization. It's less effort than the first
> one, and also safer because you could still miss things during the code
> audit. The extra allocation cost is not a real concern.

I think this direction makes sense.

The patch looks good, though I wonder if we could simplify even further
by just embedding an index into the repository object. The purpose of
having it as a pointer, I think, is so that the_repository can point to
the_index. But we could possibly hide the latter behind some macro
trickery like:

  #define the_index (the_repository->index)

I spent a few minutes on a proof of concept patch, but it gets a bit
hairy:

  1. There are some circular dependencies in the header files. We'd need
     repository.h to depend on cache.h to get the definition of
     index_state, but the latter includes repository.h. We'd need to
     break the index bits out of cache.h into index.h, which in turn
     requires breaking out some other parts. I did a sloppy job of it in
     the patch below.

  2. There are hundreds of spots that need to swap out "repo->index" for
     "&repo->index". In the patch below I just did enough to compile
     archive-zip.o, to illustrate. :)

So it's definitely non-trivial to go that way. I'm not sure if it's
worth the effort to switch at this point, but even if it is, your patch
seems like a good thing to do in the meantime.

Either way, I think we could probably revert the non-test portion of my
581d2fd9f2 (get_oid: handle NULL repo->index, 2019-05-14) after this.

-Peff

---
diff --git a/archive-zip.c b/archive-zip.c
index 4d66b5be6e..517e203483 100644
--- a/archive-zip.c
+++ b/archive-zip.c
@@ -353,7 +353,7 @@ static int write_zip_entry(struct archiver_args *args,
 				return error(_("cannot read %s"),
 					     oid_to_hex(oid));
 			crc = crc32(crc, buffer, size);
-			is_binary = entry_is_binary(args->repo->index,
+			is_binary = entry_is_binary(&args->repo->index,
 						    path_without_prefix,
 						    buffer, size);
 			out = buffer;
@@ -430,7 +430,7 @@ static int write_zip_entry(struct archiver_args *args,
 				break;
 			crc = crc32(crc, buf, readlen);
 			if (is_binary == -1)
-				is_binary = entry_is_binary(args->repo->index,
+				is_binary = entry_is_binary(&args->repo->index,
 							    path_without_prefix,
 							    buf, readlen);
 			write_or_die(1, buf, readlen);
@@ -463,7 +463,7 @@ static int write_zip_entry(struct archiver_args *args,
 				break;
 			crc = crc32(crc, buf, readlen);
 			if (is_binary == -1)
-				is_binary = entry_is_binary(args->repo->index,
+				is_binary = entry_is_binary(&args->repo->index,
 							    path_without_prefix,
 							    buf, readlen);
 
diff --git a/cache.h b/cache.h
index b4bb2e2c11..d0450025e1 100644
--- a/cache.h
+++ b/cache.h
@@ -17,6 +17,7 @@
 #include "sha1-array.h"
 #include "repository.h"
 #include "mem-pool.h"
+#include "oid.h"
 
 #include <zlib.h>
 typedef struct git_zstream {
@@ -43,28 +44,6 @@ int git_deflate_end_gently(git_zstream *);
 int git_deflate(git_zstream *, int flush);
 unsigned long git_deflate_bound(git_zstream *, unsigned long);
 
-/* The length in bytes and in hex digits of an object name (SHA-1 value). */
-#define GIT_SHA1_RAWSZ 20
-#define GIT_SHA1_HEXSZ (2 * GIT_SHA1_RAWSZ)
-/* The block size of SHA-1. */
-#define GIT_SHA1_BLKSZ 64
-
-/* The length in bytes and in hex digits of an object name (SHA-256 value). */
-#define GIT_SHA256_RAWSZ 32
-#define GIT_SHA256_HEXSZ (2 * GIT_SHA256_RAWSZ)
-/* The block size of SHA-256. */
-#define GIT_SHA256_BLKSZ 64
-
-/* The length in byte and in hex digits of the largest possible hash value. */
-#define GIT_MAX_RAWSZ GIT_SHA256_RAWSZ
-#define GIT_MAX_HEXSZ GIT_SHA256_HEXSZ
-/* The largest possible block size for any supported hash. */
-#define GIT_MAX_BLKSZ GIT_SHA256_BLKSZ
-
-struct object_id {
-	unsigned char hash[GIT_MAX_RAWSZ];
-};
-
 #define the_hash_algo the_repository->hash_algo
 
 #if defined(DT_UNKNOWN) && !defined(NO_D_TYPE_IN_DIRENT)
@@ -143,16 +122,6 @@ struct cache_header {
 #define INDEX_FORMAT_LB 2
 #define INDEX_FORMAT_UB 4
 
-/*
- * The "cache_time" is just the low 32 bits of the
- * time. It doesn't matter if it overflows - we only
- * check it for equality in the 32 bits we save.
- */
-struct cache_time {
-	uint32_t sec;
-	uint32_t nsec;
-};
-
 struct stat_data {
 	struct cache_time sd_ctime;
 	struct cache_time sd_mtime;
@@ -326,32 +295,6 @@ static inline unsigned int canon_mode(unsigned int mode)
 #define UNTRACKED_CHANGED	(1 << 7)
 #define FSMONITOR_CHANGED	(1 << 8)
 
-struct split_index;
-struct untracked_cache;
-
-struct index_state {
-	struct cache_entry **cache;
-	unsigned int version;
-	unsigned int cache_nr, cache_alloc, cache_changed;
-	struct string_list *resolve_undo;
-	struct cache_tree *cache_tree;
-	struct split_index *split_index;
-	struct cache_time timestamp;
-	unsigned name_hash_initialized : 1,
-		 initialized : 1,
-		 drop_cache_tree : 1,
-		 updated_workdir : 1,
-		 updated_skipworktree : 1,
-		 fsmonitor_has_run_once : 1;
-	struct hashmap name_hash;
-	struct hashmap dir_hash;
-	struct object_id oid;
-	struct untracked_cache *untracked;
-	uint64_t fsmonitor_last_update;
-	struct ewah_bitmap *fsmonitor_dirty;
-	struct mem_pool *ce_mem_pool;
-};
-
 /* Name hashing */
 int test_lazy_init_name_hash(struct index_state *istate, int try_threaded);
 void add_name_hash(struct index_state *istate, struct cache_entry *ce);
diff --git a/repository.h b/repository.h
index 4fb6a5885f..3371afceaa 100644
--- a/repository.h
+++ b/repository.h
@@ -1,11 +1,11 @@
 #ifndef REPOSITORY_H
 #define REPOSITORY_H
 
+#include "index.h"
 #include "path.h"
 
 struct config_set;
 struct git_hash_algo;
-struct index_state;
 struct lock_file;
 struct pathspec;
 struct raw_object_store;
@@ -87,7 +87,7 @@ struct repository {
 	 * Repository's in-memory index.
 	 * 'repo_read_index()' can be used to populate 'index'.
 	 */
-	struct index_state *index;
+	struct index_state index;
 
 	/* Repository's current hash algorithm, as serialized on disk. */
 	const struct git_hash_algo *hash_algo;

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] repository.c: always allocate 'index' at repo init time
  2019-05-20 13:17                   ` Jeff King
@ 2019-05-21 10:34                     ` Duy Nguyen
  2019-05-21 20:58                       ` Jeff King
  2019-05-28 16:07                       ` Junio C Hamano
  0 siblings, 2 replies; 16+ messages in thread
From: Duy Nguyen @ 2019-05-21 10:34 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Antonio Ospite, Eric Wong, Git Mailing List

On Mon, May 20, 2019 at 8:17 PM Jeff King <peff@peff.net> wrote:
> The patch looks good, though I wonder if we could simplify even further
> by just embedding an index into the repository object. The purpose of
> having it as a pointer, I think, is so that the_repository can point to
> the_index. But we could possibly hide the latter behind some macro
> trickery like:
>
>   #define the_index (the_repository->index)
>
> I spent a few minutes on a proof of concept patch, but it gets a bit
> hairy:
>
>   1. There are some circular dependencies in the header files. We'd need
>      repository.h to depend on cache.h to get the definition of
>      index_state, but the latter includes repository.h. We'd need to
>      break the index bits out of cache.h into index.h, which in turn
>      requires breaking out some other parts. I did a sloppy job of it in
>      the patch below.
>
>   2. There are hundreds of spots that need to swap out "repo->index" for
>      "&repo->index". In the patch below I just did enough to compile
>      archive-zip.o, to illustrate. :)

You are more thorough than me. I saw #2 first and immediately backed
off (partly for a selfish reason: I have plenty of the_repo conversion
patches in queue and anything touching "repo" may delay those patches
even more).

There's also #3 but this one is minor. So far 'struct repo' is more of
a glue of things. Embedding index_state in it while leaving
object_store, ref_store... pointers feels inconsistent and a bit
weird. It's not a strong reason for making index_state a pointer too,
but if we have to deal with pointers anyway...

> So it's definitely non-trivial to go that way. I'm not sure if it's
> worth the effort to switch at this point, but even if it is, your patch
> seems like a good thing to do in the meantime.
>
> Either way, I think we could probably revert the non-test portion of my
> 581d2fd9f2 (get_oid: handle NULL repo->index, 2019-05-14) after this.

Yeah. I'm thinking of doing that after, scanning for similar lines
too. But it looks like it's the only one. Will fix in v2.
-- 
Duy

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] repository.c: always allocate 'index' at repo init time
  2019-05-21 10:34                     ` Duy Nguyen
@ 2019-05-21 20:58                       ` Jeff King
  2019-05-28 16:07                       ` Junio C Hamano
  1 sibling, 0 replies; 16+ messages in thread
From: Jeff King @ 2019-05-21 20:58 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Junio C Hamano, Antonio Ospite, Eric Wong, Git Mailing List

On Tue, May 21, 2019 at 05:34:02PM +0700, Duy Nguyen wrote:

> >   2. There are hundreds of spots that need to swap out "repo->index" for
> >      "&repo->index". In the patch below I just did enough to compile
> >      archive-zip.o, to illustrate. :)
> 
> You are more thorough than me. I saw #2 first and immediately backed
> off (partly for a selfish reason: I have plenty of the_repo conversion
> patches in queue and anything touching "repo" may delay those patches
> even more).

Yeah, that's true, it would be disruptive.

> There's also #3 but this one is minor. So far 'struct repo' is more of
> a glue of things. Embedding index_state in it while leaving
> object_store, ref_store... pointers feels inconsistent and a bit
> weird. It's not a strong reason for making index_state a pointer too,
> but if we have to deal with pointers anyway...

And yeah, I agree it would nice for it to all be consistent. Let's leave
it at your patch for now, and we can think about refactoring this later.

-Peff

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] repository.c: always allocate 'index' at repo init time
  2019-05-21 10:34                     ` Duy Nguyen
  2019-05-21 20:58                       ` Jeff King
@ 2019-05-28 16:07                       ` Junio C Hamano
  1 sibling, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2019-05-28 16:07 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Jeff King, Antonio Ospite, Eric Wong, Git Mailing List

Duy Nguyen <pclouds@gmail.com> writes:

> On Mon, May 20, 2019 at 8:17 PM Jeff King <peff@peff.net> wrote:
>> The patch looks good, though I wonder if we could simplify even further
>> by just embedding an index into the repository object. The purpose of
>> having it as a pointer, I think, is so that the_repository can point to
>> the_index. But we could possibly hide the latter behind some macro
>> trickery like:
>>
>>   #define the_index (the_repository->index)
> ...
>> So it's definitely non-trivial to go that way. I'm not sure if it's
>> worth the effort to switch at this point, but even if it is, your patch
>> seems like a good thing to do in the meantime.

Yeah, the fact that the_reopsitory->index is not an embedded
instance has bothered me from the very beginning, and I am happy to
see others share the same feeling ;-)

>> Either way, I think we could probably revert the non-test portion of my
>> 581d2fd9f2 (get_oid: handle NULL repo->index, 2019-05-14) after this.
>
> Yeah. I'm thinking of doing that after, scanning for similar lines
> too. But it looks like it's the only one. Will fix in v2.

Thanks.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, back to index

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-11 20:57 new segfault in master (6a6c0f10a70a6eb1) Eric Wong
2019-05-11 22:31 ` Jeff King
2019-05-11 23:02   ` Jeff King
2019-05-12  4:26     ` Duy Nguyen
2019-05-14 13:54     ` [PATCH] get_oid: handle NULL repo->index Jeff King
2019-05-14 23:38       ` Eric Wong
2019-05-15  1:24       ` Duy Nguyen
2019-05-15  1:46         ` Jeff King
2019-05-15  5:16           ` Junio C Hamano
2019-05-15  9:29             ` Duy Nguyen
2019-05-16  1:43               ` Junio C Hamano
2019-05-19  2:56                 ` [PATCH] repository.c: always allocate 'index' at repo init time Nguyễn Thái Ngọc Duy
2019-05-20 13:17                   ` Jeff King
2019-05-21 10:34                     ` Duy Nguyen
2019-05-21 20:58                       ` Jeff King
2019-05-28 16:07                       ` Junio C Hamano

git@vger.kernel.org list mirror (unofficial, one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.org/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox