git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / code / Atom feed
From: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
To: gitster@pobox.com
Cc: ao2@ao2.it, e@80x24.org, git@vger.kernel.org, pclouds@gmail.com,
	peff@peff.net
Subject: [PATCH] repository.c: always allocate 'index' at repo init time
Date: Sun, 19 May 2019 09:56:36 +0700	[thread overview]
Message-ID: <20190519025636.24819-1-pclouds@gmail.com> (raw)
In-Reply-To: <xmqqftpf5g3d.fsf@gitster-ct.c.googlers.com>

There are two ways a 'struct repository' could be initialized before
using: via initialize_the_repository() and repo_init().

The first way always initializes 'index' field because that's how it is
before the introduction of 'struct repository'. Back then 'the_index' is
always available (even if not loaded). The second way however leaves
'index' NULL and relies on repo_read_index() to allocate it on demand.

The problem with the second way is that, the majority of our code base
was written with 'the_index' (i.e. the first way) in mind, where
dereferencing 'the_index' (or the 'index' field now) is always
safe.

The second way breaks this assumption. The 'index' field can be NULL
until loading from disk, which could lead to segfaults like
581d2fd9f2 (get_oid: handle NULL repo->index, 2019-05-14).

We have two options to handle this: either we audit the entire code
base, adding 'is index NULL' when needed, or we make sure 'index' is
never NULL to begin with.

This patch goes with the second option, making sure that 'index' is
always allocated after initialization. It's less effort than the first
one, and also safer because you could still miss things during the code
audit. The extra allocation cost is not a real concern.

The 'index' field is still freed and reset to NULL in repo_clear(). But
after that call, a lot more is missing in 'repo' and it can never be
used again without going through reinitialization phase. So it should be
fine.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 repository.c | 3 ++-
 repository.h | 4 ++++
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/repository.c b/repository.c
index 682c239fe3..ca58692504 100644
--- a/repository.c
+++ b/repository.c
@@ -160,6 +160,7 @@ int repo_init(struct repository *repo,
 	struct repository_format format = REPOSITORY_FORMAT_INIT;
 	memset(repo, 0, sizeof(*repo));
 
+	repo->index = xcalloc(1, sizeof(*repo->index));
 	repo->objects = raw_object_store_new();
 	repo->parsed_objects = parsed_object_pool_new();
 
@@ -262,7 +263,7 @@ void repo_clear(struct repository *repo)
 int repo_read_index(struct repository *repo)
 {
 	if (!repo->index)
-		repo->index = xcalloc(1, sizeof(*repo->index));
+		BUG("the repo hasn't been setup");
 
 	return read_index_from(repo->index, repo->index_file, repo->gitdir);
 }
diff --git a/repository.h b/repository.h
index 4fb6a5885f..75c4f68b22 100644
--- a/repository.h
+++ b/repository.h
@@ -85,6 +85,7 @@ struct repository {
 
 	/*
 	 * Repository's in-memory index.
+	 * Cannot be NULL after initialization.
 	 * 'repo_read_index()' can be used to populate 'index'.
 	 */
 	struct index_state *index;
@@ -132,6 +133,9 @@ struct submodule;
 int repo_submodule_init(struct repository *subrepo,
 			struct repository *superproject,
 			const struct submodule *sub);
+/*
+ * Release all resources in 'repo'. 'repo' cannot be used again.
+ */
 void repo_clear(struct repository *repo);
 
 /*
-- 
2.22.0.rc0.322.g2b0371e29a


  reply	other threads:[~2019-05-19 18:17 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-11 20:57 new segfault in master (6a6c0f10a70a6eb1) Eric Wong
2019-05-11 22:31 ` Jeff King
2019-05-11 23:02   ` Jeff King
2019-05-12  4:26     ` Duy Nguyen
2019-05-14 13:54     ` [PATCH] get_oid: handle NULL repo->index Jeff King
2019-05-14 23:38       ` Eric Wong
2019-05-15  1:24       ` Duy Nguyen
2019-05-15  1:46         ` Jeff King
2019-05-15  5:16           ` Junio C Hamano
2019-05-15  9:29             ` Duy Nguyen
2019-05-16  1:43               ` Junio C Hamano
2019-05-19  2:56                 ` Nguyễn Thái Ngọc Duy [this message]
2019-05-20 13:17                   ` [PATCH] repository.c: always allocate 'index' at repo init time Jeff King
2019-05-21 10:34                     ` Duy Nguyen
2019-05-21 20:58                       ` Jeff King
2019-05-28 16:07                       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190519025636.24819-1-pclouds@gmail.com \
    --to=pclouds@gmail.com \
    --cc=ao2@ao2.it \
    --cc=e@80x24.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    --subject='Re: [PATCH] repository.c: always allocate '\''index'\'' at repo init time' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Code repositories for project(s) associated with this inbox:

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).