From: Ben Peart <peartben@gmail.com>
To: Jameson Miller <jamill@microsoft.com>,
"git@vger.kernel.org" <git@vger.kernel.org>
Cc: "gitster@pobox.com" <gitster@pobox.com>,
"pclouds@gmail.com" <pclouds@gmail.com>,
"jonathantanmy@google.com" <jonathantanmy@google.com>
Subject: Re: [PATCH v1 2/5] Add an API creating / discarding cache_entry structs
Date: Tue, 17 Apr 2018 19:11:59 -0400 [thread overview]
Message-ID: <921a8d5c-b118-c5b1-35dd-459130faeed9@gmail.com> (raw)
In-Reply-To: <20180417163400.3875-4-jamill@microsoft.com>
On 4/17/2018 12:34 PM, Jameson Miller wrote:
> Add an API around managing the lifetime of cache_entry structs. Abstracting
> memory management details behind an API will allow for alternative memory
> management strategies without affecting all the call sites. This commit does
> not change how memory is allocated / freed. A later commit in this series will
> allocate cache entries from memory pools as appropriate.
>
> Motivation:
> It has been observed that the time spent loading an index with a large
> number of entries is partly dominated by malloc() calls. This
> change is in preparation for using memory pools to reduce the number
> of malloc() calls made when loading an index.
>
> This API makes a distinction between cache entries that are intended for use
> with a particular to an index and cache entries that are not.
The wording here is awkward. Did you mean "intended for use with a
particular index?"
This enables us
> to use the knowledge about how a cache entry will be used to make informed
> decisions about how to handle the corresponding memory.
> ---
> apply.c | 26 ++++++------
> blame.c | 5 +--
> builtin/checkout.c | 8 ++--
> builtin/difftool.c | 8 ++--
> builtin/reset.c | 6 +--
> builtin/update-index.c | 26 ++++++------
> cache.h | 29 +++++++++++++-
> merge-recursive.c | 2 +-
> read-cache.c | 105 +++++++++++++++++++++++++++++++++++--------------
> resolve-undo.c | 6 ++-
> split-index.c | 8 ++--
> tree.c | 4 +-
> unpack-trees.c | 27 ++++++++-----
> 13 files changed, 166 insertions(+), 94 deletions(-)
>
> diff --git a/apply.c b/apply.c
> index 7e5792c996..47903f427b 100644
> --- a/apply.c
> +++ b/apply.c
> @@ -4090,12 +4090,12 @@ static int build_fake_ancestor(struct apply_state *state, struct patch *list)
> return error(_("sha1 information is lacking or useless "
> "(%s)."), name);
>
> - ce = make_cache_entry(patch->old_mode, oid.hash, name, 0, 0);
> + ce = make_index_cache_entry(&result, patch->old_mode, oid.hash, name, 0, 0);
> if (!ce)
> - return error(_("make_cache_entry failed for path '%s'"),
> + return error(_("make_index_cache_entry failed for path '%s'"),
> name);
> if (add_index_entry(&result, ce, ADD_CACHE_OK_TO_ADD)) {
> - free(ce);
> + index_cache_entry_discard(ce);
I personally prefer name symmetry. To me, make_index_cache_entry()
should be paired with discard_index_cache_entry().
The rest of this patch looks like a fairly mechanical refactoring with
the biggest exception being the difference between the the
*_index_cache_entry() APIs and the *_transient_cache_entry() APIs.
There are quite a few changes but I didn't see any instances that were
missed or any errors. I see that later patches will put verification
code in place to detect if any were done incorrectly and to prevent
regressions moving forward.
Overall, it looks correct and reasonable.
<snip>
>
> -struct cache_entry *make_cache_entry(unsigned int mode,
> - const unsigned char *sha1, const char *path, int stage,
> - unsigned int refresh_options)
> +struct cache_entry *make_empty_index_cache_entry(struct index_state *istate, size_t len)
> +{
> + return xcalloc(1, cache_entry_size(len));
> +}
> +
> +struct cache_entry *make_empty_transient_cache_entry(size_t len)
> +{
> + return xcalloc(1, cache_entry_size(len));
> +}
> +
> +struct cache_entry *make_index_cache_entry(struct index_state *istate, unsigned int mode,
> + const unsigned char *sha1, const char *path,
> + int stage, unsigned int refresh_options)
> {
> - int size, len;
> struct cache_entry *ce, *ret;
> + int len;
>
> if (!verify_path(path)) {
> error("Invalid path '%s'", path);
> @@ -758,8 +767,7 @@ struct cache_entry *make_cache_entry(unsigned int mode,
> }
>
> len = strlen(path);
> - size = cache_entry_size(len);
> - ce = xcalloc(1, size);
> + ce = make_empty_index_cache_entry(istate, len);
>
> hashcpy(ce->oid.hash, sha1);
> memcpy(ce->name, path, len);
> @@ -769,10 +777,34 @@ struct cache_entry *make_cache_entry(unsigned int mode,
>
> ret = refresh_cache_entry(&the_index, ce, refresh_options);
> if (ret != ce)
> - free(ce);
> + index_cache_entry_discard(ce);
> +
> return ret;
> }
>
> +struct cache_entry *make_transient_cache_entry(unsigned int mode, const unsigned char *sha1,
> + const char *path, int stage)
> +{
> + struct cache_entry *ce;
> + int len;
> +
> + if (!verify_path(path)) {
> + error("Invalid path '%s'", path);
> + return NULL;
> + }
> +
> + len = strlen(path);
> + ce = make_empty_transient_cache_entry(len);
> +
> + hashcpy(ce->oid.hash, sha1);
> + memcpy(ce->name, path, len);
> + ce->ce_flags = create_ce_flags(stage);
> + ce->ce_namelen = len;
> + ce->ce_mode = create_ce_mode(mode);
> +
Nit, feel free to ignore. There isn't a lot of initialization here but I
wonder if it makes sense to have an internal helper function to ensure
these stay the same.
> + return ce;
> +}
> +
> /*
> * Chmod an index entry with either +x or -x.
> *
> @@ -1243,7 +1275,7 @@ static struct cache_entry *refresh_cache_ent(struct index_state *istate,
> {
> struct stat st;
> struct cache_entry *updated;
> - int changed, size;
> + int changed;
> int refresh = options & CE_MATCH_REFRESH;
> int ignore_valid = options & CE_MATCH_IGNORE_VALID;
> int ignore_skip_worktree = options & CE_MATCH_IGNORE_SKIP_WORKTREE;
> @@ -1323,8 +1355,7 @@ static struct cache_entry *refresh_cache_ent(struct index_state *istate,
> return NULL;
> }
>
> - size = ce_size(ce);
> - updated = xmalloc(size);
> + updated = make_empty_index_cache_entry(istate, ce_namelen(ce));
> copy_cache_entry(updated, ce);
> memcpy(updated->name, ce->name, ce->ce_namelen + 1);
> fill_stat_cache_info(updated, &st);
> @@ -1610,12 +1641,13 @@ int read_index(struct index_state *istate)
> return read_index_from(istate, get_index_file(), get_git_dir());
> }
>
> -static struct cache_entry *cache_entry_from_ondisk(struct ondisk_cache_entry *ondisk,
> +static struct cache_entry *cache_entry_from_ondisk(struct index_state *istate,
> + struct ondisk_cache_entry *ondisk,
> unsigned int flags,
> const char *name,
> size_t len)
> {
> - struct cache_entry *ce = xmalloc(cache_entry_size(len));
> + struct cache_entry *ce = make_empty_index_cache_entry(istate, len);
>
> ce->ce_stat_data.sd_ctime.sec = get_be32(&ondisk->ctime.sec);
> ce->ce_stat_data.sd_mtime.sec = get_be32(&ondisk->mtime.sec);
> @@ -1657,7 +1689,8 @@ static unsigned long expand_name_field(struct strbuf *name, const char *cp_)
> return (const char *)ep + 1 - cp_;
> }
>
> -static struct cache_entry *create_from_disk(struct ondisk_cache_entry *ondisk,
> +static struct cache_entry *create_from_disk(struct index_state *istate,
> + struct ondisk_cache_entry *ondisk,
> unsigned long *ent_size,
> struct strbuf *previous_name)
Nit. Just wondering why you pulled refresh_cache_entry() out into an
earlier/separate commit but then did create_from_disk() as part of the
large refactoring focused on adding the create/discard APIs?
<snip>
> diff --git a/unpack-trees.c b/unpack-trees.c
> index e73745051e..232cdecc72 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -192,10 +192,10 @@ static int do_add_entry(struct unpack_trees_options *o, struct cache_entry *ce,
> ADD_CACHE_OK_TO_ADD | ADD_CACHE_OK_TO_REPLACE);
> }
>
> -static struct cache_entry *dup_entry(const struct cache_entry *ce)
> +static struct cache_entry *dup_entry(const struct cache_entry *ce, struct index_state *istate)
Ditto with the dup_entry() and create_ce_entry() functions refactoring.
> -static struct cache_entry *create_ce_entry(const struct traverse_info *info, const struct name_entry *n, int stage)
> +static struct cache_entry *create_ce_entry(const struct traverse_info *info,
> + const struct name_entry *n,
> + int stage,
> + struct index_state *istate,
> + int is_transient)
next prev parent reply other threads:[~2018-04-17 23:12 UTC|newest]
Thread overview: 100+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-04-17 16:34 [PATCH v1 0/5] Allocate cache entries from memory pool Jameson Miller
2018-04-17 16:34 ` Jameson Miller
2018-04-17 16:34 ` [PATCH v1 1/5] read-cache: teach refresh_cache_entry to take istate Jameson Miller
2018-04-17 19:00 ` Ben Peart
2018-04-17 16:34 ` [PATCH v1 2/5] Add an API creating / discarding cache_entry structs Jameson Miller
2018-04-17 23:11 ` Ben Peart [this message]
2018-04-17 16:34 ` [PATCH v1 3/5] mem-pool: fill out functionality Jameson Miller
2018-04-20 23:21 ` Jonathan Tan
2018-04-23 17:27 ` Jameson Miller
2018-04-23 17:49 ` Jonathan Tan
2018-04-23 18:20 ` Jameson Miller
2018-04-17 16:34 ` [PATCH v1 4/5] Allocate cache entries from memory pools Jameson Miller
2018-04-17 16:34 ` [PATCH v1 5/5] Add optional memory validations around cache_entry lifecyle Jameson Miller
2018-04-17 18:39 ` [PATCH v1 0/5] Allocate cache entries from memory pool Ben Peart
2018-04-23 14:09 ` Jameson Miller
2018-04-18 4:49 ` Junio C Hamano
2018-04-20 17:49 ` Stefan Beller
2018-04-23 16:44 ` Jameson Miller
2018-04-23 17:18 ` Stefan Beller
2018-04-23 16:19 ` Jameson Miller
2018-04-20 23:34 ` Jonathan Tan
2018-04-23 17:14 ` Jameson Miller
2018-04-30 15:31 ` [PATCH v2 " Jameson Miller
2018-04-30 15:31 ` [PATCH v2 1/5] read-cache: teach refresh_cache_entry() to take istate Jameson Miller
2018-04-30 15:31 ` [PATCH v2 2/5] block alloc: add lifecycle APIs for cache_entry structs Jameson Miller
2018-04-30 15:31 ` [PATCH v2 3/5] mem-pool: fill out functionality Jameson Miller
2018-04-30 21:42 ` Stefan Beller
2018-05-01 15:43 ` Jameson Miller
2018-05-03 16:18 ` Duy Nguyen
2018-04-30 15:31 ` [PATCH v2 4/5] block alloc: allocate cache entries from mem_pool Jameson Miller
2018-04-30 15:31 ` [PATCH v2 5/5] block alloc: add validations around cache_entry lifecyle Jameson Miller
2018-05-03 16:28 ` Duy Nguyen
2018-05-03 16:35 ` [PATCH v2 0/5] Allocate cache entries from memory pool Duy Nguyen
2018-05-03 17:21 ` Stefan Beller
2018-05-03 19:17 ` Duy Nguyen
2018-05-03 20:58 ` Stefan Beller
2018-05-03 21:13 ` Jameson Miller
2018-05-03 22:18 ` [PATCH] alloc.c: replace alloc by mempool Stefan Beller
2018-05-04 16:33 ` Duy Nguyen
2018-05-08 0:37 ` Junio C Hamano
2018-05-08 0:44 ` Stefan Beller
2018-05-08 1:07 ` Junio C Hamano
2018-05-23 14:47 ` [PATCH v3 0/7] allocate cache entries from memory pool Jameson Miller
2018-05-23 14:47 ` [PATCH v3 1/7] read-cache: teach refresh_cache_entry() to take istate Jameson Miller
2018-05-25 22:54 ` Stefan Beller
2018-05-23 14:47 ` [PATCH v3 2/7] block alloc: add lifecycle APIs for cache_entry structs Jameson Miller
2018-05-24 4:52 ` Junio C Hamano
2018-05-24 14:47 ` Jameson Miller
2018-05-23 14:47 ` [PATCH v3 3/7] mem-pool: only search head block for available space Jameson Miller
2018-05-23 14:47 ` [PATCH v3 4/7] mem-pool: add lifecycle management functions Jameson Miller
2018-05-23 14:47 ` [PATCH v3 5/7] mem-pool: fill out functionality Jameson Miller
2018-06-01 19:28 ` Stefan Beller
2018-05-23 14:47 ` [PATCH v3 6/7] block alloc: allocate cache entries from mem_pool Jameson Miller
2018-05-23 14:47 ` [PATCH v3 7/7] block alloc: add validations around cache_entry lifecyle Jameson Miller
2018-05-24 4:55 ` [PATCH v3 0/7] allocate cache entries from memory pool Junio C Hamano
2018-05-24 14:44 ` Jameson Miller
2018-05-25 22:53 ` Stefan Beller
2018-06-20 20:41 ` Jameson Miller
2018-05-25 22:41 ` Stefan Beller
2018-06-20 20:17 ` [PATCH v4 0/8] Allocate cache entries from mem_pool Jameson Miller
2018-06-20 20:17 ` [PATCH v4 1/8] read-cache: teach refresh_cache_entry() to take istate Jameson Miller
2018-06-20 20:17 ` [PATCH v4 2/8] block alloc: add lifecycle APIs for cache_entry structs Jameson Miller
2018-06-21 21:14 ` Stefan Beller
2018-06-28 14:07 ` Jameson Miller
2018-06-20 20:17 ` [PATCH v4 3/8] mem-pool: only search head block for available space Jameson Miller
2018-06-21 21:33 ` Stefan Beller
2018-06-28 14:12 ` Jameson Miller
2018-06-20 20:17 ` [PATCH v4 4/8] mem-pool: tweak math on mp_block allocation size Jameson Miller
2018-06-20 20:17 ` [PATCH v4 5/8] mem-pool: add lifecycle management functions Jameson Miller
2018-06-20 20:17 ` [PATCH v4 6/8] mem-pool: fill out functionality Jameson Miller
2018-06-20 20:17 ` [PATCH v4 7/8] block alloc: allocate cache entries from mem_pool Jameson Miller
2018-06-20 20:17 ` [PATCH v4 8/8] block alloc: add validations around cache_entry lifecyle Jameson Miller
2018-06-28 14:00 ` [PATCH v5 0/8] Allocate cache entries from mem_pool Jameson Miller
2018-06-28 14:00 ` [PATCH v5 1/8] read-cache: teach refresh_cache_entry() to take istate Jameson Miller
2018-06-28 14:00 ` [PATCH v5 2/8] read-cache: make_cache_entry should take object_id struct Jameson Miller
2018-06-28 17:14 ` Junio C Hamano
2018-06-28 22:27 ` SZEDER Gábor
2018-06-28 14:00 ` [PATCH v5 3/8] block alloc: add lifecycle APIs for cache_entry structs Jameson Miller
2018-06-28 18:43 ` Junio C Hamano
2018-06-28 22:28 ` SZEDER Gábor
2018-06-28 14:00 ` [PATCH v5 4/8] mem-pool: only search head block for available space Jameson Miller
2018-06-28 14:00 ` [PATCH v5 5/8] mem-pool: add life cycle management functions Jameson Miller
2018-06-28 17:15 ` Junio C Hamano
2018-06-28 14:00 ` [PATCH v5 6/8] mem-pool: fill out functionality Jameson Miller
2018-06-28 19:09 ` Junio C Hamano
2018-07-02 18:28 ` Jameson Miller
2018-06-28 14:00 ` [PATCH v5 7/8] block alloc: allocate cache entries from mem-pool Jameson Miller
2018-06-28 14:00 ` [PATCH v5 8/8] block alloc: add validations around cache_entry lifecyle Jameson Miller
2018-07-02 19:49 ` [PATCH v6 0/8] Allocate cache entries from mem_pool Jameson Miller
2018-07-02 19:49 ` [PATCH v6 1/8] read-cache: teach refresh_cache_entry to take istate Jameson Miller
2018-07-02 19:49 ` [PATCH v6 2/8] read-cache: teach make_cache_entry to take object_id Jameson Miller
2018-07-02 21:23 ` Stefan Beller
2018-07-05 15:20 ` Jameson Miller
2018-07-02 19:49 ` [PATCH v6 3/8] block alloc: add lifecycle APIs for cache_entry structs Jameson Miller
2018-07-22 9:23 ` Duy Nguyen
2018-07-02 19:49 ` [PATCH v6 4/8] mem-pool: only search head block for available space Jameson Miller
2018-07-02 19:49 ` [PATCH v6 5/8] mem-pool: add life cycle management functions Jameson Miller
2018-07-02 19:49 ` [PATCH v6 6/8] mem-pool: fill out functionality Jameson Miller
2018-07-02 19:49 ` [PATCH v6 7/8] block alloc: allocate cache entries from mem_pool Jameson Miller
2018-07-02 19:49 ` [PATCH v6 8/8] block alloc: add validations around cache_entry lifecyle Jameson Miller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=921a8d5c-b118-c5b1-35dd-459130faeed9@gmail.com \
--to=peartben@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jamill@microsoft.com \
--cc=jonathantanmy@google.com \
--cc=pclouds@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).