git@vger.kernel.org mailing list mirror (one of many)
From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, avarab@gmail.com,
	Junio C Hamano <gitster@pobox.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH 3/3] packfile: close_all_packs to close_object_store
Date: Mon, 20 May 2019 12:01:04 +0200 (CEST)
Message-ID: <nycvar.QRO.7.76.6.1905201141000.46@tvgsbejvaqbjf.bet>
In-Reply-To: <0e948f639fb5209f07f8e3eb356b5886c41ff2be.1558118506.git.gitgitgadget@gmail.com>

Hi Stolee,

*really* minor nit: the commit subject probably wants to have a "rename"
after the colon ;-)

The patch looks sensible to me. Since Junio asked for a sanity check
whether all of the call sites of `close_all_packs()` actually want to
close the MIDX and the commit graph, too, I'll do the "speak out loud"
type of patch review here (spoiler: all of them check out):
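To make concrete what "close the MIDX and the commit graph, too" means for
a caller, here is a minimal sketch. It is a toy model, not Git's actual
code: `struct toy_object_store`, `toy_close()` and
`toy_close_object_store()` are hypothetical names, and plain `FILE *`
handles stand in for whatever the real object store keeps open.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for the object store: one handle per kind of file. */
struct toy_object_store {
	FILE *packs[4];      /* open pack files (fixed size keeps the toy small) */
	size_t nr_packs;
	FILE *midx;          /* multi-pack-index, or NULL */
	FILE *commit_graph;  /* commit graph, or NULL */
};

/* Close one handle and reset the pointer so that a second call is a no-op. */
static void toy_close(FILE **fp)
{
	if (*fp) {
		fclose(*fp);
		*fp = NULL;
	}
}

/* Close everything that was opened from the store, not only the packs. */
static void toy_close_object_store(struct toy_object_store *o)
{
	size_t i;

	for (i = 0; i < o->nr_packs; i++)
		toy_close(&o->packs[i]);
	o->nr_packs = 0;
	toy_close(&o->midx);
	toy_close(&o->commit_graph);
}
```

In this model, a caller gets all three kinds of handles released with one
call, which is the property every call site below is checked against.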

On Fri, 17 May 2019, Derrick Stolee via GitGitGadget wrote:

> diff --git a/builtin/am.c b/builtin/am.c
> index 58a2aef28b..9315d32d2a 100644
> --- a/builtin/am.c
> +++ b/builtin/am.c
> @@ -1800,7 +1800,7 @@ static void am_run(struct am_state *state, int resume)
>  	 */
>  	if (!state->rebasing) {
>  		am_destroy(state);
> -		close_all_packs(the_repository->objects);
> +		close_object_store(the_repository->objects);
>  		run_command_v_opt(argv_gc_auto, RUN_GIT_CMD);

Here, we run `git gc --auto`, so we obviously really want to close all
read handles.

Check.

>  	}
>  }
> diff --git a/builtin/clone.c b/builtin/clone.c
> index 50bde99618..82ce682c80 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -1240,7 +1240,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  	transport_disconnect(transport);
>
>  	if (option_dissociate) {
> -		close_all_packs(the_repository->objects);
> +		close_object_store(the_repository->objects);
>  		dissociate_from_references();

Here, we prepare for dissociating from the reference repository specified
via `git clone --reference <directory>`. Obviously, we need to let go of
all the handles we might have open there.

Check.

>  	}
>
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index b620fd54b4..3aec95608f 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -1670,7 +1670,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>
>  	string_list_clear(&list, 0);
>
> -	close_all_packs(the_repository->objects);
> +	close_object_store(the_repository->objects);
>
>  	argv_array_pushl(&argv_gc_auto, "gc", "--auto", NULL);

Again, a `git gc --auto` that needs closing of all read handles to the
files that might be overwritten by the garbage collection.

Check.
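
The ordering that matters at these `git gc --auto` call sites is "release
our read handles first, then let the files be overwritten". A rough sketch
of that ordering follows. It is only an illustration under simplifying
assumptions: `toy_close_then_replace()` is a hypothetical helper, and it
replaces the file in the same process via `rename()`, whereas the call
sites here spawn a separate process to do the work.

```c
#include <stdio.h>

/*
 * Hypothetical helper: release our read handle on `path` first, and only
 * then replace the file with `new_path` via rename(). On platforms that
 * refuse to replace or delete a file that is still open, skipping the
 * fclose() would make the rename() fail.
 *
 * Returns the result of rename(): 0 on success, non-zero on failure.
 */
static int toy_close_then_replace(FILE **reader, const char *new_path,
				  const char *path)
{
	if (*reader) {
		fclose(*reader);
		*reader = NULL;
	}
	return rename(new_path, path);
}
```

The helper resets `*reader` to NULL so that the caller cannot accidentally
keep reading from a file that has just been replaced.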

>  	if (verbosity < 0)
> diff --git a/builtin/gc.c b/builtin/gc.c
> index df2573f124..20c8f1bfe8 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -632,7 +632,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
>  	gc_before_repack();
>
>  	if (!repository_format_precious_objects) {
> -		close_all_packs(the_repository->objects);
> +		close_object_store(the_repository->objects);
>  		if (run_command_v_opt(repack.argv, RUN_GIT_CMD))

Here, we want to repack. AFAICT closing the object store is the only sane
way to invalidate whatever we read from it into memory.

Check.

>  			die(FAILED_RUN, repack.argv[0]);
>
> @@ -660,7 +660,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
>  	report_garbage = report_pack_garbage;
>  	reprepare_packed_git(the_repository);
>  	if (pack_garbage.nr > 0) {
> -		close_all_packs(the_repository->objects);
> +		close_object_store(the_repository->objects);
>  		clean_pack_garbage();

This wants to delete a number of files that are now obsolete, and it makes
sense to make sure that there are no open read handles to those anymore.
It is a bit unclear from just reading the code what types of files are
accumulated into the `pack_garbage` string list, but then, we're in the
last throes of a garbage collection, and *just* about to write a new
commit graph (if `gc.writeCommitGraph=true`), so I think it is quite okay
to close not only the packs here, but everything we opened from the object
store.

So I'd give this a check mark, too.

>  	}
>
> diff --git a/builtin/merge.c b/builtin/merge.c
> index e47d77baee..72d7a7c909 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -449,7 +449,7 @@ static void finish(struct commit *head_commit,
>  			 * We ignore errors in 'gc --auto', since the
>  			 * user should see them.
>  			 */
> -			close_all_packs(the_repository->objects);
> +			close_object_store(the_repository->objects);
>  			run_command_v_opt(argv_gc_auto, RUN_GIT_CMD);

Obviously yet another `git gc --auto`, so yes, we need to close the object
store handles we have.

Check.

>  		}
>  	}
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index 7c7bc13e91..ed30fcd633 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -328,7 +328,7 @@ static int finish_rebase(struct rebase_options *opts)
>
>  	delete_ref(NULL, "REBASE_HEAD", NULL, REF_NO_DEREF);
>  	apply_autostash(opts);
> -	close_all_packs(the_repository->objects);
> +	close_object_store(the_repository->objects);
>  	/*
>  	 * We ignore errors in 'gc --auto', since the
>  	 * user should see them.

Yet another `git gc --auto`.

Check.

> diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
> index d58b7750b6..92cd1f508c 100644
> --- a/builtin/receive-pack.c
> +++ b/builtin/receive-pack.c
> @@ -2032,7 +2032,7 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
>  			proc.git_cmd = 1;
>  			proc.argv = argv_gc_auto;
>
> -			close_all_packs(the_repository->objects);
> +			close_object_store(the_repository->objects);
>  			if (!start_command(&proc)) {

This `proc` refers to another `git gc --auto` (see a couple of lines above,
still within the hunk).

Check.

>  				if (use_sideband)
>  					copy_to_sideband(proc.err, -1, NULL);
> diff --git a/builtin/repack.c b/builtin/repack.c
> index 67f8978043..4de8b6600c 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -419,7 +419,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
>  	if (!names.nr && !po_args.quiet)
>  		printf_ln(_("Nothing new to pack."));
>
> -	close_all_packs(the_repository->objects);
> +	close_object_store(the_repository->objects);
>
>  	/*
>  	 * Ok we have prepared all new packfiles.

Ah, the joys of un-dynamic patch review. What you, dear reader, cannot see
in this hunk is that the code comment at the end continues thusly:

         * First see if there are packs of the same name and if so
         * if we can move them out of the way (this can happen if we
         * repacked immediately after packing fully.
         */

Meaning: we're about to rename some pack files. So the pack file handles
need to be closed, all right, but what about the other object store
handles? There is no mention of the commit graph (more on that below), but
the loop following the code comment contains this:

                        if (!midx_cleared) {
                                clear_midx_file(the_repository);
                                midx_cleared = 1;
                        }

So yes, I would give this a check.
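
For readers who do not have the loop in front of them, the `midx_cleared`
snippet quoted above is a once-only guard inside a loop. Here is a small
stand-alone sketch of that pattern. Everything in it is hypothetical
(`toy_clear_midx()`, `toy_move_packs()`, the `needs_move` array); it only
mirrors the shape of the guard, not the surrounding logic of
`cmd_repack()`.

```c
#include <stddef.h>

/* Hypothetical counter standing in for the side effect of clearing the MIDX. */
static int toy_midx_clear_calls;

static void toy_clear_midx(void)
{
	toy_midx_clear_calls++;
}

/*
 * Walk over `nr` entries; `needs_move[i]` is non-zero when entry i has to
 * be moved out of the way. The MIDX is cleared before the first such move
 * and never again, no matter how many entries follow.
 *
 * Returns the number of entries that needed moving.
 */
static size_t toy_move_packs(const int *needs_move, size_t nr)
{
	int midx_cleared = 0;
	size_t moved = 0, i;

	for (i = 0; i < nr; i++) {
		if (!needs_move[i])
			continue;
		if (!midx_cleared) {
			toy_clear_midx();
			midx_cleared = 1;
		}
		moved++; /* a real implementation would rename the file here */
	}
	return moved;
}
```

Note that in this sketch nothing is cleared at all when no entry needs to
be moved, which is the difference between the guard and an unconditional
call before the loop.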

It does puzzle me, I have to admit, that there is no (opt-in) code block
to re-write the commit graph. After all, the commit graph references the
pack files, right? So if they are repacked, it would at least be
invalidated at this point...

> diff --git a/object.c b/object.c
> index e81d47a79c..cf1a2b7086 100644
> --- a/object.c
> +++ b/object.c
> @@ -517,7 +517,7 @@ void raw_object_store_clear(struct raw_object_store *o)
>  	o->loaded_alternates = 0;
>
>  	INIT_LIST_HEAD(&o->packed_git_mru);
> -	close_all_packs(o);
> +	close_object_store(o);

We're in the middle of a function called `raw_object_store_clear()`. So...

Check.

>  	o->packed_git = NULL;
>  }
>
> diff --git a/packfile.c b/packfile.c
> index ce12bffe3e..017046fcf9 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -337,7 +337,7 @@ void close_pack(struct packed_git *p)
>  	close_pack_index(p);
>  }
>
> -void close_all_packs(struct raw_object_store *o)
> +void close_object_store(struct raw_object_store *o)
>  {
>  	struct packed_git *p;
>
> diff --git a/packfile.h b/packfile.h
> index d70c6d9afb..e95e389eb8 100644
> --- a/packfile.h
> +++ b/packfile.h
> @@ -81,7 +81,7 @@ extern uint32_t get_pack_fanout(struct packed_git *p, uint32_t value);
>  extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *);
>  extern void close_pack_windows(struct packed_git *);
>  extern void close_pack(struct packed_git *);
> -extern void close_all_packs(struct raw_object_store *o);
> +extern void close_object_store(struct raw_object_store *o);
>  extern void unuse_pack(struct pack_window **);
>  extern void clear_delta_base_cache(void);
>  extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
> --
> gitgitgadget

And this concludes my review.

Thank you!
Dscho
