git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Jameson Miller <jamill@microsoft.com>
Cc: "git\@vger.kernel.org" <git@vger.kernel.org>,
	"jonathantanmy\@google.com" <jonathantanmy@google.com>,
	"pclouds\@gmail.com" <pclouds@gmail.com>,
	"peartben\@gmail.com" <peartben@gmail.com>,
	"peff\@peff.net" <peff@peff.net>,
	"sbeller\@google.com" <sbeller@google.com>
Subject: Re: [PATCH v5 3/8] block alloc: add lifecycle APIs for cache_entry structs
Date: Thu, 28 Jun 2018 11:43:06 -0700	[thread overview]
Message-ID: <xmqqsh56wy4l.fsf@gitster-ct.c.googlers.com> (raw)
In-Reply-To: <20180628135932.225288-4-jamill@microsoft.com> (Jameson Miller's message of "Thu, 28 Jun 2018 14:00:09 +0000")

Jameson Miller <jamill@microsoft.com> writes:

> Add an API around managing the lifetime of cache_entry
> structs. Abstracting memory management details behind this API will
> allow for alternative memory management strategies without affecting
> all the call sites.  This commit does not change how memory is
> allocated or freed. A later commit in this series will allocate cache
> entries from memory pools as appropriate.
>
> Motivation:
> It has been observed that the time spent loading an index with a large
> number of entries is partly dominated by malloc() calls. This change
> is in preparation for using memory pools to reduce the number of
> malloc() calls made when loading an index.

Not worth a reroll, but having these four lines at the very
beginning, dropping the line "Motivation:", and then following that
with "Add an API around ..." as the second paragraph, would make the
above easier to read, without stutter-causing "Motivation:" in the
middle.

> diff --git a/apply.c b/apply.c
> index 8ef975a32d..8a4a4439bc 100644
> --- a/apply.c
> +++ b/apply.c
> @@ -4092,12 +4092,12 @@ static int build_fake_ancestor(struct apply_state *state, struct patch *list)
>  			return error(_("sha1 information is lacking or useless "
>  				       "(%s)."), name);
>  
> -		ce = make_cache_entry(patch->old_mode, &oid, name, 0, 0);
> +		ce = make_cache_entry(&result, patch->old_mode, &oid, name, 0, 0);
>  		if (!ce)
>  			return error(_("make_cache_entry failed for path '%s'"),
>  				     name);
>  		if (add_index_entry(&result, ce, ADD_CACHE_OK_TO_ADD)) {
> -			free(ce);
> +			discard_cache_entry(ce);
>  			return error(_("could not add %s to temporary index"),
>  				     name);
>  		}

So..., even though it wasn't clear in the proposed log message, two
large part of the lifecycle management API is (1) make_cache_entry()
knows for which istate it is creating the entry and (2) discarding
the entry may not be just a simple matter of free()ing.  Both of
which makes perfect sense, but if the changes are that easily
enumeratable, it would have been nicer for readers if the commit did
so in the proposed log message.

> @@ -4424,27 +4423,26 @@ static int add_conflicted_stages_file(struct apply_state *state,
>  				       struct patch *patch)
>  {
>  	int stage, namelen;
> -	unsigned ce_size, mode;
> +	unsigned mode;
>  	struct cache_entry *ce;
>  
>  	if (!state->update_index)
>  		return 0;
>  	namelen = strlen(patch->new_name);
> -	ce_size = cache_entry_size(namelen);
>  	mode = patch->new_mode ? patch->new_mode : (S_IFREG | 0644);
>  
>  	remove_file_from_cache(patch->new_name);
>  	for (stage = 1; stage < 4; stage++) {
>  		if (is_null_oid(&patch->threeway_stage[stage - 1]))
>  			continue;
> -		ce = xcalloc(1, ce_size);
> +		ce = make_empty_cache_entry(&the_index, namelen);

... and another one in the enumeration is make_empty_cache_entry()
which is somehow different.  And the readers are forced to read its
implementation to find out how it is different, but the use of the
same discard_cache_entry() suggests that the liftime rule of an
entry allcoated by it may be similar to those created by
make_cache_entry().

> ...
>  		if (add_cache_entry(ce, ADD_CACHE_OK_TO_ADD) < 0) {
> -			free(ce);
> +			discard_cache_entry(ce);
>  			return error(_("unable to add cache entry for %s"),
>  				     patch->new_name);
>  		}

> @@ -230,11 +230,11 @@ static int checkout_merged(int pos, const struct checkout *state)
>  	if (write_object_file(result_buf.ptr, result_buf.size, blob_type, &oid))
>  		die(_("Unable to add merge result for '%s'"), path);
>  	free(result_buf.ptr);
> -	ce = make_cache_entry(mode, &oid, path, 2, 0);
> +	ce = make_transient_cache_entry(mode, &oid, path, 2);

... and then yet another, which is "transient".  An intelligent
reader can guess from the lack of istate parameter (and from the
word "transient") that the resulting one would not belong to any
in-core index.

>  	if (!ce)
>  		die(_("make_cache_entry failed for path '%s'"), path);
>  	status = checkout_entry(ce, state, NULL);
> -	free(ce);
> +	discard_cache_entry(ce);

... but discovers that it is discarded the same way, realizes that
ce knows how it was allocated to allow discard() different way to
discard it, and his/her earlier conjecture about make_empty() does
not hold at all and gets somewhat disappointed.

> diff --git a/cache.h b/cache.h
> index 3fbf24771a..035a627bea 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -339,6 +339,40 @@ extern void remove_name_hash(struct index_state *istate, struct cache_entry *ce)
>  extern void free_name_hash(struct index_state *istate);
>  
>  
> +/* Cache entry creation and cleanup */
> +
> +/*
> + * Create cache_entry intended for use in the specified index. Caller
> + * is responsible for discarding the cache_entry with
> + * `discard_cache_entry`.
> + */
> +struct cache_entry *make_cache_entry(struct index_state *istate,
> +				     unsigned int mode,
> +				     const struct object_id *oid,
> +				     const char *path,
> +				     int stage,
> +				     unsigned int refresh_options);
> +
> +struct cache_entry *make_empty_cache_entry(struct index_state *istate,
> +					   size_t name_len);
> +
> +/*
> + * Create a cache_entry that is not intended to be added to an index.
> + * Caller is responsible for discarding the cache_entry
> + * with `discard_cache_entry`.
> + */
> +struct cache_entry *make_transient_cache_entry(unsigned int mode,
> +					       const struct object_id *oid,
> +					       const char *path,
> +					       int stage);
> +
> +struct cache_entry *make_empty_transient_cache_entry(size_t name_len);

OK, finally it becomes clear that we have per-istate and transient
sets of two (i.e. one that takes the path, stage and mode pfront,
and the other that only takes the length of the name), and ...

> +/*
> + * Discard cache entry.
> + */
> +void discard_cache_entry(struct cache_entry *ce);

... a single function that knows how to discard each kind.  It would
have really helped the reader to talk about them in the proposed log
message, as they are only 5 functions.  Another way to make it easier
for readers would have been to show the diff for cache.h first before
diffs for all the others.

  reply	other threads:[~2018-06-28 18:43 UTC|newest]

Thread overview: 100+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-17 16:34 [PATCH v1 0/5] Allocate cache entries from memory pool Jameson Miller
2018-04-17 16:34 ` Jameson Miller
2018-04-17 16:34 ` [PATCH v1 1/5] read-cache: teach refresh_cache_entry to take istate Jameson Miller
2018-04-17 19:00   ` Ben Peart
2018-04-17 16:34 ` [PATCH v1 2/5] Add an API creating / discarding cache_entry structs Jameson Miller
2018-04-17 23:11   ` Ben Peart
2018-04-17 16:34 ` [PATCH v1 4/5] Allocate cache entries from memory pools Jameson Miller
2018-04-17 16:34 ` [PATCH v1 3/5] mem-pool: fill out functionality Jameson Miller
2018-04-20 23:21   ` Jonathan Tan
2018-04-23 17:27     ` Jameson Miller
2018-04-23 17:49       ` Jonathan Tan
2018-04-23 18:20         ` Jameson Miller
2018-04-17 16:34 ` [PATCH v1 5/5] Add optional memory validations around cache_entry lifecyle Jameson Miller
2018-04-17 18:39 ` [PATCH v1 0/5] Allocate cache entries from memory pool Ben Peart
2018-04-23 14:09   ` Jameson Miller
2018-04-18  4:49 ` Junio C Hamano
2018-04-20 17:49   ` Stefan Beller
2018-04-23 16:44     ` Jameson Miller
2018-04-23 17:18       ` Stefan Beller
2018-04-23 16:19   ` Jameson Miller
2018-04-20 23:34 ` Jonathan Tan
2018-04-23 17:14   ` Jameson Miller
2018-04-30 15:31 ` [PATCH v2 " Jameson Miller
2018-04-30 15:31   ` [PATCH v2 1/5] read-cache: teach refresh_cache_entry() to take istate Jameson Miller
2018-04-30 15:31   ` [PATCH v2 2/5] block alloc: add lifecycle APIs for cache_entry structs Jameson Miller
2018-04-30 15:31   ` [PATCH v2 3/5] mem-pool: fill out functionality Jameson Miller
2018-04-30 21:42     ` Stefan Beller
2018-05-01 15:43       ` Jameson Miller
2018-05-03 16:18     ` Duy Nguyen
2018-04-30 15:31   ` [PATCH v2 4/5] block alloc: allocate cache entries from mem_pool Jameson Miller
2018-04-30 15:31   ` [PATCH v2 5/5] block alloc: add validations around cache_entry lifecyle Jameson Miller
2018-05-03 16:28     ` Duy Nguyen
2018-05-03 16:35   ` [PATCH v2 0/5] Allocate cache entries from memory pool Duy Nguyen
2018-05-03 17:21     ` Stefan Beller
2018-05-03 19:17       ` Duy Nguyen
2018-05-03 20:58         ` Stefan Beller
2018-05-03 21:13           ` Jameson Miller
2018-05-03 22:18             ` [PATCH] alloc.c: replace alloc by mempool Stefan Beller
2018-05-04 16:33               ` Duy Nguyen
2018-05-08  0:37                 ` Junio C Hamano
2018-05-08  0:44                   ` Stefan Beller
2018-05-08  1:07                     ` Junio C Hamano
2018-05-23 14:47 ` [PATCH v3 0/7] allocate cache entries from memory pool Jameson Miller
2018-05-23 14:47   ` [PATCH v3 1/7] read-cache: teach refresh_cache_entry() to take istate Jameson Miller
2018-05-25 22:54     ` Stefan Beller
2018-05-23 14:47   ` [PATCH v3 2/7] block alloc: add lifecycle APIs for cache_entry structs Jameson Miller
2018-05-24  4:52     ` Junio C Hamano
2018-05-24 14:47       ` Jameson Miller
2018-05-23 14:47   ` [PATCH v3 3/7] mem-pool: only search head block for available space Jameson Miller
2018-05-23 14:47   ` [PATCH v3 4/7] mem-pool: add lifecycle management functions Jameson Miller
2018-05-23 14:47   ` [PATCH v3 5/7] mem-pool: fill out functionality Jameson Miller
2018-06-01 19:28     ` Stefan Beller
2018-05-23 14:47   ` [PATCH v3 6/7] block alloc: allocate cache entries from mem_pool Jameson Miller
2018-05-23 14:47   ` [PATCH v3 7/7] block alloc: add validations around cache_entry lifecyle Jameson Miller
2018-05-24  4:55   ` [PATCH v3 0/7] allocate cache entries from memory pool Junio C Hamano
2018-05-24 14:44     ` Jameson Miller
2018-05-25 22:53       ` Stefan Beller
2018-06-20 20:41         ` Jameson Miller
2018-05-25 22:41   ` Stefan Beller
2018-06-20 20:17 ` [PATCH v4 0/8] Allocate cache entries from mem_pool Jameson Miller
2018-06-20 20:17   ` [PATCH v4 1/8] read-cache: teach refresh_cache_entry() to take istate Jameson Miller
2018-06-20 20:17   ` [PATCH v4 2/8] block alloc: add lifecycle APIs for cache_entry structs Jameson Miller
2018-06-21 21:14     ` Stefan Beller
2018-06-28 14:07       ` Jameson Miller
2018-06-20 20:17   ` [PATCH v4 3/8] mem-pool: only search head block for available space Jameson Miller
2018-06-21 21:33     ` Stefan Beller
2018-06-28 14:12       ` Jameson Miller
2018-06-20 20:17   ` [PATCH v4 4/8] mem-pool: tweak math on mp_block allocation size Jameson Miller
2018-06-20 20:17   ` [PATCH v4 5/8] mem-pool: add lifecycle management functions Jameson Miller
2018-06-20 20:17   ` [PATCH v4 6/8] mem-pool: fill out functionality Jameson Miller
2018-06-20 20:17   ` [PATCH v4 7/8] block alloc: allocate cache entries from mem_pool Jameson Miller
2018-06-20 20:17   ` [PATCH v4 8/8] block alloc: add validations around cache_entry lifecyle Jameson Miller
2018-06-28 14:00   ` [PATCH v5 0/8] Allocate cache entries from mem_pool Jameson Miller
2018-06-28 14:00     ` [PATCH v5 1/8] read-cache: teach refresh_cache_entry() to take istate Jameson Miller
2018-06-28 14:00     ` [PATCH v5 2/8] read-cache: make_cache_entry should take object_id struct Jameson Miller
2018-06-28 17:14       ` Junio C Hamano
2018-06-28 22:27       ` SZEDER Gábor
2018-06-28 14:00     ` [PATCH v5 3/8] block alloc: add lifecycle APIs for cache_entry structs Jameson Miller
2018-06-28 18:43       ` Junio C Hamano [this message]
2018-06-28 22:28       ` SZEDER Gábor
2018-06-28 14:00     ` [PATCH v5 4/8] mem-pool: only search head block for available space Jameson Miller
2018-06-28 14:00     ` [PATCH v5 5/8] mem-pool: add life cycle management functions Jameson Miller
2018-06-28 17:15       ` Junio C Hamano
2018-06-28 14:00     ` [PATCH v5 6/8] mem-pool: fill out functionality Jameson Miller
2018-06-28 19:09       ` Junio C Hamano
2018-07-02 18:28         ` Jameson Miller
2018-06-28 14:00     ` [PATCH v5 7/8] block alloc: allocate cache entries from mem-pool Jameson Miller
2018-06-28 14:00     ` [PATCH v5 8/8] block alloc: add validations around cache_entry lifecyle Jameson Miller
2018-07-02 19:49     ` [PATCH v6 0/8] Allocate cache entries from mem_pool Jameson Miller
2018-07-02 19:49       ` [PATCH v6 1/8] read-cache: teach refresh_cache_entry to take istate Jameson Miller
2018-07-02 19:49       ` [PATCH v6 2/8] read-cache: teach make_cache_entry to take object_id Jameson Miller
2018-07-02 21:23         ` Stefan Beller
2018-07-05 15:20           ` Jameson Miller
2018-07-02 19:49       ` [PATCH v6 3/8] block alloc: add lifecycle APIs for cache_entry structs Jameson Miller
2018-07-22  9:23         ` Duy Nguyen
2018-07-02 19:49       ` [PATCH v6 4/8] mem-pool: only search head block for available space Jameson Miller
2018-07-02 19:49       ` [PATCH v6 5/8] mem-pool: add life cycle management functions Jameson Miller
2018-07-02 19:49       ` [PATCH v6 6/8] mem-pool: fill out functionality Jameson Miller
2018-07-02 19:49       ` [PATCH v6 7/8] block alloc: allocate cache entries from mem_pool Jameson Miller
2018-07-02 19:49       ` [PATCH v6 8/8] block alloc: add validations around cache_entry lifecyle Jameson Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqsh56wy4l.fsf@gitster-ct.c.googlers.com \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=jamill@microsoft.com \
    --cc=jonathantanmy@google.com \
    --cc=pclouds@gmail.com \
    --cc=peartben@gmail.com \
    --cc=peff@peff.net \
    --cc=sbeller@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).