git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: Jeff King <peff@peff.net>, Geert Jansen <gerardu@amazon.com>
Cc: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Junio C Hamano" <gitster@pobox.com>,
	"git@vger.kernel.org" <git@vger.kernel.org>,
	"René Scharfe" <l.s.r@web.de>,
	"Takuto Ikuta" <tikuta@chromium.org>
Subject: Re: [PATCH 6/9] sha1-file: use an object_directory for the main object dir
Date: Mon, 12 Nov 2018 10:48:36 -0500	[thread overview]
Message-ID: <421d3b43-3425-72c9-218e-facd86e28267@gmail.com> (raw)
In-Reply-To: <20181112145039.GF7400@sigill.intra.peff.net>

On 11/12/2018 9:50 AM, Jeff King wrote:
> Our handling of alternate object directories is needlessly different
> from the main object directory. As a result, many places in the code
> basically look like this:
>
>    do_something(r->objects->objdir);
>
>    for (odb = r->objects->alt_odb_list; odb; odb = odb->next)
>          do_something(odb->path);
>
> That gets annoying when do_something() is non-trivial, and we've
> resorted to gross hacks like creating fake alternates (see
> find_short_object_filename()).
>
> Instead, let's give each raw_object_store a unified list of
> object_directory structs. The first will be the main store, and
> everything after is an alternate. Very few callers even care about the
> distinction, and can just loop over the whole list (and those who care
> can just treat the first element differently).
>
> A few observations:
>
>    - we don't need r->objects->objectdir anymore, and can just
>      mechanically convert that to r->objects->odb->path
>
>    - object_directory's path field needs to become a real pointer rather
>      than a FLEX_ARRAY, in order to fill it with expand_base_dir()
>
>    - we'll call prepare_alt_odb() earlier in many functions (i.e.,
>      outside of the loop). This may result in us calling it even when our
>      function would be satisfied looking only at the main odb.
>
>      But this doesn't matter in practice. It's not a very expensive
>      operation in the first place, and in the majority of cases it will
>      be a noop. We call it already (and cache its results) in
>      prepare_packed_git(), and we'll generally check packs before loose
>      objects. So essentially every program is going to call it
>      immediately once per program.
>
>      Arguably we should just prepare_alt_odb() immediately upon setting
>      up the repository's object directory, which would save us sprinkling
>      calls throughout the code base (and forgetting to do so has been a
>      source of subtle bugs in the past). But I've stopped short of that
>      here, since there are already a lot of other moving parts in this
>      patch.
>
>    - Most call sites just get shorter. The check_and_freshen() functions
>      are an exception, because they have entry points to handle local and
>      nonlocal directories separately.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> If the "the first one is the main store, the rest are alternates" bit is
> too subtle, we could mark each "struct object_directory" with a bit for
> "is_local".

This is probably a good thing to do proactively. We have the equivalent 
in the packed_git struct, but that's also because they get out of order. 
At the moment, I can't think of a read-only action that needs to treat 
the local object directory more carefully. The closest I know about is 
'git pack-objects --local', but that also writes a pack-file.

I assume that when we write a pack-file to the "default location" we use 
get_object_directory() instead of referring to the default object_directory?

>
>   builtin/fsck.c |  21 ++-------
>   builtin/grep.c |   2 +-
>   commit-graph.c |   5 +-
>   environment.c  |   4 +-
>   object-store.h |  27 ++++++-----
>   object.c       |  19 ++++----
>   packfile.c     |  10 ++--
>   path.c         |   2 +-
>   repository.c   |   8 +++-
>   sha1-file.c    | 122 ++++++++++++++++++-------------------------------
>   sha1-name.c    |  17 ++-----
>   11 files changed, 90 insertions(+), 147 deletions(-)
>
> diff --git a/builtin/fsck.c b/builtin/fsck.c
> index 55153cf92a..15338bd178 100644
> --- a/builtin/fsck.c
> +++ b/builtin/fsck.c
> @@ -725,13 +725,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
>   		for_each_loose_object(mark_loose_for_connectivity, NULL, 0);
>   		for_each_packed_object(mark_packed_for_connectivity, NULL, 0);
>   	} else {
> -		struct object_directory *alt_odb_list;
> -
> -		fsck_object_dir(get_object_directory());
> -
>   		prepare_alt_odb(the_repository);
> -		alt_odb_list = the_repository->objects->alt_odb_list;
> -		for (odb = alt_odb_list; odb; odb = odb->next)
> +		for (odb = the_repository->objects->odb; odb; odb = odb->next)
>   			fsck_object_dir(odb->path);
>   
>   		if (check_full) {
> @@ -834,13 +829,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
>   		struct child_process commit_graph_verify = CHILD_PROCESS_INIT;
>   		const char *verify_argv[] = { "commit-graph", "verify", NULL, NULL, NULL };
>   
> -		commit_graph_verify.argv = verify_argv;
> -		commit_graph_verify.git_cmd = 1;
> -		if (run_command(&commit_graph_verify))
> -			errors_found |= ERROR_COMMIT_GRAPH;
> -
>   		prepare_alt_odb(the_repository);
> -		for (odb = the_repository->objects->alt_odb_list; odb; odb = odb->next) {
> +		for (odb = the_repository->objects->odb; odb; odb = odb->next) {
>   			child_process_init(&commit_graph_verify);
>   			commit_graph_verify.argv = verify_argv;
>   			commit_graph_verify.git_cmd = 1;
> @@ -855,13 +845,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
>   		struct child_process midx_verify = CHILD_PROCESS_INIT;
>   		const char *midx_argv[] = { "multi-pack-index", "verify", NULL, NULL, NULL };
>   
> -		midx_verify.argv = midx_argv;
> -		midx_verify.git_cmd = 1;
> -		if (run_command(&midx_verify))
> -			errors_found |= ERROR_COMMIT_GRAPH;
> -
>   		prepare_alt_odb(the_repository);
> -		for (odb = the_repository->objects->alt_odb_list; odb; odb = odb->next) {
> +		for (odb = the_repository->objects->odb; odb; odb = odb->next) {
>   			child_process_init(&midx_verify);
>   			midx_verify.argv = midx_argv;
>   			midx_verify.git_cmd = 1;
> diff --git a/builtin/grep.c b/builtin/grep.c
> index d8508ddf79..714c8d91ba 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -441,7 +441,7 @@ static int grep_submodule(struct grep_opt *opt, struct repository *superproject,
>   	 * object.
>   	 */
>   	grep_read_lock();
> -	add_to_alternates_memory(submodule.objects->objectdir);
> +	add_to_alternates_memory(submodule.objects->odb->path);
>   	grep_read_unlock();
>   
>   	if (oid) {
> diff --git a/commit-graph.c b/commit-graph.c
> index 5dd3f5b15c..99163c244b 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -231,7 +231,6 @@ static void prepare_commit_graph_one(struct repository *r, const char *obj_dir)
>   static int prepare_commit_graph(struct repository *r)
>   {
>   	struct object_directory *odb;
> -	char *obj_dir;
>   	int config_value;
>   
>   	if (r->objects->commit_graph_attempted)
> @@ -252,10 +251,8 @@ static int prepare_commit_graph(struct repository *r)
>   	if (!commit_graph_compatible(r))
>   		return 0;
>   
> -	obj_dir = r->objects->objectdir;
> -	prepare_commit_graph_one(r, obj_dir);
>   	prepare_alt_odb(r);
> -	for (odb = r->objects->alt_odb_list;
> +	for (odb = r->objects->odb;
>   	     !r->objects->commit_graph && odb;
>   	     odb = odb->next)
>   		prepare_commit_graph_one(r, odb->path);
> diff --git a/environment.c b/environment.c
> index 3f3c8746c2..441ce56690 100644
> --- a/environment.c
> +++ b/environment.c
> @@ -274,9 +274,9 @@ const char *get_git_work_tree(void)
>   
>   char *get_object_directory(void)
>   {
> -	if (!the_repository->objects->objectdir)
> +	if (!the_repository->objects->odb)
>   		BUG("git environment hasn't been setup");
> -	return the_repository->objects->objectdir;
> +	return the_repository->objects->odb->path;
>   }
>   
>   int odb_mkstemp(struct strbuf *temp_filename, const char *pattern)
> diff --git a/object-store.h b/object-store.h
> index b2fa0d0df0..30faf7b391 100644
> --- a/object-store.h
> +++ b/object-store.h
> @@ -24,19 +24,14 @@ struct object_directory {
>   	 * Path to the alternative object store. If this is a relative path,
>   	 * it is relative to the current working directory.
>   	 */
> -	char path[FLEX_ARRAY];
> +	char *path;
>   };
> +
>   void prepare_alt_odb(struct repository *r);
>   char *compute_alternate_path(const char *path, struct strbuf *err);
>   typedef int alt_odb_fn(struct object_directory *, void *);
>   int foreach_alt_odb(alt_odb_fn, void*);
>   
> -/*
> - * Allocate a "struct alternate_object_database" but do _not_ actually
> - * add it to the list of alternates.
> - */
> -struct object_directory *alloc_alt_odb(const char *dir);
> -
>   /*
>    * Add the directory to the on-disk alternates file; the new entry will also
>    * take effect in the current process.
> @@ -80,17 +75,21 @@ struct multi_pack_index;
>   
>   struct raw_object_store {
>   	/*
> -	 * Path to the repository's object store.
> -	 * Cannot be NULL after initialization.
> +	 * Set of all object directories; the main directory is first (and
> +	 * cannot be NULL after initialization). Subsequent directories are
> +	 * alternates.
>   	 */
> -	char *objectdir;
> +	struct object_directory *odb;
> +	struct object_directory **odb_tail;
> +	int loaded_alternates;
>   
> -	/* Path to extra alternate object database if not NULL */
> +	/*
> +	 * A list of alternate object directories loaded from the environment;
> +	 * this should not generally need to be accessed directly, but will
> +	 * populate the "odb" list when prepare_alt_odb() is run.
> +	 */
>   	char *alternate_db;
>   
> -	struct object_directory *alt_odb_list;
> -	struct object_directory **alt_odb_tail;
> -
>   	/*
>   	 * Objects that should be substituted by other objects
>   	 * (see git-replace(1)).
> diff --git a/object.c b/object.c
> index dd485ac629..79d636091c 100644
> --- a/object.c
> +++ b/object.c
> @@ -482,26 +482,26 @@ struct raw_object_store *raw_object_store_new(void)
>   	return o;
>   }
>   
> -static void free_alt_odb(struct object_directory *odb)
> +static void free_object_directory(struct object_directory *odb)
>   {
> +	free(odb->path);
>   	oid_array_clear(&odb->loose_objects_cache);
>   	free(odb);
>   }
>   
> -static void free_alt_odbs(struct raw_object_store *o)
> +static void free_object_directories(struct raw_object_store *o)
>   {
> -	while (o->alt_odb_list) {
> +	while (o->odb) {
>   		struct object_directory *next;
>   
> -		next = o->alt_odb_list->next;
> -		free_alt_odb(o->alt_odb_list);
> -		o->alt_odb_list = next;
> +		next = o->odb->next;
> +		free_object_directory(o->odb);
> +		o->odb = next;
>   	}
>   }
>   
>   void raw_object_store_clear(struct raw_object_store *o)
>   {
> -	FREE_AND_NULL(o->objectdir);
>   	FREE_AND_NULL(o->alternate_db);
>   
>   	oidmap_free(o->replace_map, 1);
> @@ -511,8 +511,9 @@ void raw_object_store_clear(struct raw_object_store *o)
>   	o->commit_graph = NULL;
>   	o->commit_graph_attempted = 0;
>   
> -	free_alt_odbs(o);
> -	o->alt_odb_tail = NULL;
> +	free_object_directories(o);
> +	o->odb_tail = NULL;
> +	o->loaded_alternates = 0;
>   
>   	INIT_LIST_HEAD(&o->packed_git_mru);
>   	close_all_packs(o);
> diff --git a/packfile.c b/packfile.c
> index d6d511cfd2..1eda33247f 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -970,12 +970,12 @@ static void prepare_packed_git(struct repository *r)
>   
>   	if (r->objects->packed_git_initialized)
>   		return;
> -	prepare_multi_pack_index_one(r, r->objects->objectdir, 1);
> -	prepare_packed_git_one(r, r->objects->objectdir, 1);
> +
>   	prepare_alt_odb(r);
> -	for (odb = r->objects->alt_odb_list; odb; odb = odb->next) {
> -		prepare_multi_pack_index_one(r, odb->path, 0);
> -		prepare_packed_git_one(r, odb->path, 0);
> +	for (odb = r->objects->odb; odb; odb = odb->next) {
> +		int local = (odb == r->objects->odb);

Here seems to be a place where `odb->is_local` would help.

> +		prepare_multi_pack_index_one(r, odb->path, local);
> +		prepare_packed_git_one(r, odb->path, local);
>   	}
>   	rearrange_packed_git(r);
>   

Thanks,
-Stolee

  reply	other threads:[~2018-11-12 15:48 UTC|newest]

Thread overview: 99+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-25 18:38 [RFC PATCH] index-pack: improve performance on NFS Jansen, Geert
2018-10-26  0:21 ` Junio C Hamano
2018-10-26 20:38   ` Ævar Arnfjörð Bjarmason
2018-10-27  7:26     ` Junio C Hamano
2018-10-27  9:33       ` Jeff King
2018-10-27 11:22         ` Ævar Arnfjörð Bjarmason
2018-10-28 22:50           ` [PATCH 0/4] index-pack: optionally turn off SHA-1 collision checking Ævar Arnfjörð Bjarmason
2018-10-30  2:49             ` Geert Jansen
2018-10-30  9:04               ` Junio C Hamano
2018-10-30 18:43             ` [PATCH v2 0/3] index-pack: test updates Ævar Arnfjörð Bjarmason
2018-11-13 20:19               ` [PATCH v3] index-pack: add ability to disable SHA-1 collision check Ævar Arnfjörð Bjarmason
2018-11-14  7:09                 ` Junio C Hamano
2018-11-14 12:40                   ` Ævar Arnfjörð Bjarmason
2018-10-30 18:43             ` [PATCH v2 1/3] pack-objects test: modernize style Ævar Arnfjörð Bjarmason
2018-10-30 18:43             ` [PATCH v2 2/3] pack-objects tests: don't leave test .git corrupt at end Ævar Arnfjörð Bjarmason
2018-10-30 18:43             ` [PATCH v2 3/3] index-pack tests: don't leave test repo dirty " Ævar Arnfjörð Bjarmason
2018-10-28 22:50           ` [PATCH 1/4] pack-objects test: modernize style Ævar Arnfjörð Bjarmason
2018-10-28 22:50           ` [PATCH 2/4] pack-objects tests: don't leave test .git corrupt at end Ævar Arnfjörð Bjarmason
2018-10-28 22:50           ` [PATCH 3/4] index-pack tests: don't leave test repo dirty " Ævar Arnfjörð Bjarmason
2018-10-28 22:50           ` [PATCH 4/4] index-pack: add ability to disable SHA-1 collision check Ævar Arnfjörð Bjarmason
2018-10-29 15:04           ` [RFC PATCH] index-pack: improve performance on NFS Jeff King
2018-10-29 15:09             ` Jeff King
2018-10-29 19:36             ` Ævar Arnfjörð Bjarmason
2018-10-29 23:27               ` Jeff King
2018-11-07 22:55                 ` Geert Jansen
2018-11-08 12:02                   ` Jeff King
2018-11-08 20:58                     ` Geert Jansen
2018-11-08 21:18                       ` Jeff King
2018-11-08 21:55                         ` Geert Jansen
2018-11-08 22:20                     ` Ævar Arnfjörð Bjarmason
2018-11-09 10:11                       ` Ævar Arnfjörð Bjarmason
2018-11-12 14:31                       ` Jeff King
2018-11-12 14:46                     ` [PATCH 0/9] caching loose objects Jeff King
2018-11-12 14:46                       ` [PATCH 1/9] fsck: do not reuse child_process structs Jeff King
2018-11-12 15:26                         ` Derrick Stolee
2018-11-12 14:47                       ` [PATCH 2/9] submodule--helper: prefer strip_suffix() to ends_with() Jeff King
2018-11-12 18:23                         ` Stefan Beller
2018-11-12 14:48                       ` [PATCH 3/9] rename "alternate_object_database" to "object_directory" Jeff King
2018-11-12 15:30                         ` Derrick Stolee
2018-11-12 15:36                           ` Jeff King
2018-11-12 19:41                             ` Ramsay Jones
2018-11-12 14:48                       ` [PATCH 4/9] sha1_file_name(): overwrite buffer instead of appending Jeff King
2018-11-12 15:32                         ` Derrick Stolee
2018-11-12 14:49                       ` [PATCH 5/9] handle alternates paths the same as the main object dir Jeff King
2018-11-12 15:38                         ` Derrick Stolee
2018-11-12 15:46                           ` Jeff King
2018-11-12 15:50                             ` Derrick Stolee
2018-11-12 14:50                       ` [PATCH 6/9] sha1-file: use an object_directory for " Jeff King
2018-11-12 15:48                         ` Derrick Stolee [this message]
2018-11-12 16:09                           ` Jeff King
2018-11-12 19:04                             ` Stefan Beller
2018-11-22 17:42                               ` Jeff King
2018-11-12 18:48                           ` Stefan Beller
2018-11-12 14:50                       ` [PATCH 7/9] object-store: provide helpers for loose_objects_cache Jeff King
2018-11-12 19:24                         ` René Scharfe
2018-11-12 20:16                           ` Jeff King
2018-11-12 14:54                       ` [PATCH 8/9] sha1-file: use loose object cache for quick existence check Jeff King
2018-11-12 16:00                         ` Derrick Stolee
2018-11-12 16:01                         ` Ævar Arnfjörð Bjarmason
2018-11-12 16:21                           ` Jeff King
2018-11-12 22:18                             ` Ævar Arnfjörð Bjarmason
2018-11-12 22:30                               ` Ævar Arnfjörð Bjarmason
2018-11-13 10:02                                 ` Ævar Arnfjörð Bjarmason
2018-11-14 18:21                                   ` René Scharfe
2018-12-02 10:52                                   ` René Scharfe
2018-12-03 22:04                                     ` Jeff King
2018-12-04 21:45                                       ` René Scharfe
2018-12-05  4:46                                         ` Jeff King
2018-12-05  6:02                                           ` René Scharfe
2018-12-05  6:51                                             ` Jeff King
2018-12-05  8:15                                               ` Jeff King
2018-12-05 18:41                                                 ` René Scharfe
2018-12-05 20:17                                                   ` Jeff King
2018-11-12 22:44                             ` Geert Jansen
2018-11-27 20:48                         ` René Scharfe
2018-12-01 19:49                           ` Jeff King
2018-11-12 14:55                       ` [PATCH 9/9] fetch-pack: drop custom loose object cache Jeff King
2018-11-12 19:25                         ` René Scharfe
2018-11-12 19:32                           ` Ævar Arnfjörð Bjarmason
2018-11-12 20:07                             ` Jeff King
2018-11-12 20:13                             ` René Scharfe
2018-11-12 16:02                       ` [PATCH 0/9] caching loose objects Derrick Stolee
2018-11-12 19:10                         ` Stefan Beller
2018-11-09 13:43                   ` [RFC PATCH] index-pack: improve performance on NFS Ævar Arnfjörð Bjarmason
2018-11-09 16:08                     ` Duy Nguyen
2018-11-10 14:04                       ` Ævar Arnfjörð Bjarmason
2018-11-12 14:34                         ` Jeff King
2018-11-12 22:58                     ` Geert Jansen
2018-10-27 14:04         ` Duy Nguyen
2018-10-29 15:18           ` Jeff King
2018-10-29  0:48         ` Junio C Hamano
2018-10-29 15:20           ` Jeff King
2018-10-29 18:43             ` Ævar Arnfjörð Bjarmason
2018-10-29 21:34           ` Geert Jansen
2018-10-29 21:50             ` Jeff King
2018-10-29 22:21               ` Geert Jansen
2018-10-29 22:27             ` Jeff King
2018-10-29 22:35               ` Stefan Beller
2018-10-29 23:29                 ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=421d3b43-3425-72c9-218e-facd86e28267@gmail.com \
    --to=stolee@gmail.com \
    --cc=avarab@gmail.com \
    --cc=gerardu@amazon.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=l.s.r@web.de \
    --cc=peff@peff.net \
    --cc=tikuta@chromium.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).