git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Han-Wen Nienhuys <hanwen@google.com>
Cc: Han-Wen Nienhuys via GitGitGadget <gitgitgadget@gmail.com>,
	git <git@vger.kernel.org>, Han-Wen Nienhuys <hanwenn@gmail.com>
Subject: Re: [PATCH] init-db: init the_repository->hash_algo early from GIT_DEFAULT_HASH
Date: Thu, 03 Dec 2020 15:25:12 -0800	[thread overview]
Message-ID: <xmqqtut2fp5z.fsf@gitster.c.googlers.com> (raw)
In-Reply-To: <CAFQ2z_M3OO_nR6dhw6zzE0orYxcawP1DaJ_EOL5=+RUiZgCo8w@mail.gmail.com> (Han-Wen Nienhuys's message of "Thu, 3 Dec 2020 21:23:36 +0100")

Han-Wen Nienhuys <hanwen@google.com> writes:

> If the_repository is only half-initialized at this point in init_db(),
> then why are we passing it in refs_init_db() just a couple of lines
> further? At what point the_repository considered initialized?

I would have to say it probably depends on what callees expect.  The
current implementation of refs_init_db() for files backend may not
need anything other than the hash algorithm enum, but many other
fields are missing, and they should ideally be populated, no?  

For example, I see files_ref_store_create() cheats by calling
get_common_dir_noenv() to find out where the commondir is, instead
of ever asking the repository the ref store belongs to. At least,
get_main_ref_store() is told to get the ref store that belongs to
the_repository, and it would be the right place to learn relevant
pieces of information (for that matter, I am not sure why struct
ref_store does not have a pointer to a repository structure; perhaps
we are seeing the result of piecemeal evolution, not a designed
structure?).

> I'm a bit at a loss here; I never learned how to cleanly work with so
> many global variables, so I'm happy to take your suggestion.

I am only interested in giving a clear direction to future
developers where to populate the_repository's members (and nothing
else) if their enhancement needs members other than the hash
algorithm to be populated, as if it were the_repository initialized
in an already working repository (I am not talking about many global
variables, whichever you are referring to).

One way to do so would probably be to do something like the
attached.

The patch that started this thread (or the equivalent one in the
updated reftable series) may want to initialize a bit more members
while at it (looking at how commondir from the_repository is not
used by the refs/files-backend.c::files_init_db() to decide where
the function creates the refs/heads and refs/tags directories, we
probably would need to populate the_repository->commondir before
that codepath can be fixed to look at the member, for example).


 builtin/init-db.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git c/builtin/init-db.c w/builtin/init-db.c
index dcc45bef51..2b5c94596f 100644
--- c/builtin/init-db.c
+++ w/builtin/init-db.c
@@ -438,6 +438,27 @@ int init_db(const char *git_dir, const char *real_git_dir,
 
 	validate_hash_algorithm(&repo_fmt, hash);
 
+	/*
+	 * At this point, the_repository we have in-core does not look
+	 * anything like one that we would see initialized in an already
+	 * working repository after calling setup_git_directory().
+	 *
+	 * Calling repository.c::initialize_the_repository() may have
+	 * prepared the .index .objects and .parsed_objects members, but
+	 * other members like .gitdir, .commondir, etc. have not been
+	 * initialized.
+	 *
+	 * Many API functions assume they are working with the_repository
+	 * that has sensibly been initialized, but because we haven't
+	 * really read from an existing repository, we need to hand-craft
+	 * the necessary members of the structure to get out of this
+	 * chicken-and-egg situation.
+	 *
+	 * For now, we update the hash algorithm member to what the
+	 * validate_hash_algorithm() call decided for us.
+	 */
+	repo_set_hash_algo(the_repository, repo_fmt->hash_algo);
+
 	reinit = create_default_files(template_dir, original_git_dir,
 				      initial_branch, &repo_fmt,
 				      flags & INIT_DB_QUIET);



  reply	other threads:[~2020-12-03 23:28 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-26 19:39 [PATCH] init-db: init the_repository->hash_algo early from GIT_DEFAULT_HASH Han-Wen Nienhuys via GitGitGadget
2020-12-01  4:59 ` Junio C Hamano
2020-12-03 20:23   ` Han-Wen Nienhuys
2020-12-03 23:25     ` Junio C Hamano [this message]
2020-12-07 11:52       ` Han-Wen Nienhuys
2020-12-07 19:31         ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqtut2fp5z.fsf@gitster.c.googlers.com \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=hanwen@google.com \
    --cc=hanwenn@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).