git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: David Turner <dturner@twopensource.com>
Cc: git@vger.kernel.org, Keith McGuigan <kmcguigan@twitter.com>
Subject: Re: [PATCH 0/1] merge: fix cache_entry use-after-free
Date: Thu, 08 Oct 2015 13:00:53 -0700	[thread overview]
Message-ID: <xmqqmvvtqgei.fsf@gitster.mtv.corp.google.com> (raw)
In-Reply-To: <1444330071-8909-2-git-send-email-dturner@twitter.com> (David Turner's message of "Thu, 8 Oct 2015 14:47:51 -0400")

David Turner <dturner@twopensource.com> writes:

> From: Keith McGuigan <kmcguigan@twitter.com>
>
> During merges, we would previously free entries that we no longer need
> in the destination index.  But those entries might also be stored in
> the dir_entry cache, and when a later call to add_to_index found them,
> they would be used after being freed.
>
> To prevent this, add a ref count for struct cache_entry.  Whenever
> a cache entry is added to a data structure, the ref count is incremented;
> when it is removed from the data structure, it is decremented.  When
> it hits zero, the cache_entry is freed.
>
> Signed-off-by: David Turner <dturner@twopensource.com>
> Signed-off-by: Keith McGuigan <kmcguigan@twitter.com>
> ---

Thanks.

> @@ -213,6 +214,32 @@ struct cache_entry {
>  struct pathspec;
>  
>  /*
> + * Increment the cache_entry reference count.  Should be called
> + * whenever a pointer to a cache_entry is retained in a data structure,
> + * thus marking it as alive.
> + */
> +static inline void add_ce_ref(struct cache_entry *ce)
> +{
> +	assert(ce != NULL && ce->ref_count >= 0);
> +	++ce->ref_count;
> +}

We tend to prefer post-increment when the distinction between pre-
or post- does not make any difference, which is the case here.

> diff --git a/name-hash.c b/name-hash.c
> index 702cd05..f12c919 100644
> --- a/name-hash.c
> +++ b/name-hash.c
> @@ -92,7 +93,9 @@ static void remove_dir_entry(struct index_state *istate, struct cache_entry *ce)
>  	struct dir_entry *dir = hash_dir_entry(istate, ce, ce_namelen(ce));
>  	while (dir && !(--dir->nr)) {
>  		struct dir_entry *parent = dir->parent;
> -		hashmap_remove(&istate->dir_hash, dir, NULL);
> +		struct dir_entry *removed = hashmap_remove(&istate->dir_hash, dir, NULL);
> +		assert(removed == dir);
> +		drop_ce_ref(dir->ce);

This is curious.  In remove_name_hash() you do not have the
corresponding assert.  Why is it necessary here (or is it
unnecessary over there)?

> @@ -147,6 +151,7 @@ void remove_name_hash(struct index_state *istate, struct cache_entry *ce)
>  		return;
>  	ce->ce_flags &= ~CE_HASHED;
>  	hashmap_remove(&istate->name_hash, ce, ce);
> +	drop_ce_ref(ce);
>  
>  	if (ignore_case)
>  		remove_dir_entry(istate, ce);
> diff --git a/read-cache.c b/read-cache.c
> index 87204a5..442de34 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -53,6 +53,7 @@ static const char *alternate_index_output;
>  static void set_index_entry(struct index_state *istate, int nr, struct cache_entry *ce)
>  {
>  	istate->cache[nr] = ce;
> +	add_ce_ref(ce);
>  	add_name_hash(istate, ce);
>  }

What happens to the existing entry that used to be istate->cache[nr],
which may or may not be 'ce' that is replacing it?

It turns out that all three calling sites are safe, but it is not
immediately obvious why.  Perhaps some comment in front of the
function is in order, to warn those who may have to add a new caller
or restructure the existing calling chain, that istate->cache[nr] is
expected not to hold a live reference when the function is called,
or something?

> @@ -314,8 +314,10 @@ void replace_index_entry_in_base(struct index_state *istate,
>  	    istate->split_index->base &&
>  	    old->index <= istate->split_index->base->cache_nr) {
>  		new->index = old->index;
> -		if (old != istate->split_index->base->cache[new->index - 1])
> -			free(istate->split_index->base->cache[new->index - 1]);
> +		if (old != istate->split_index->base->cache[new->index - 1]) {
> +			struct cache_entry *ce = istate->split_index->base->cache[new->index - 1];
> +			drop_ce_ref(ce);
> +		}
>  		istate->split_index->base->cache[new->index - 1] = new;

Does 'new' already have the right refcount at this point?

  reply	other threads:[~2015-10-08 20:01 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-08 18:47 [PATCH 0/1] merge: fix cache_entry use-after-free David Turner
2015-10-08 18:47 ` David Turner
2015-10-08 20:00   ` Junio C Hamano [this message]
2015-10-09 22:16     ` David Turner
2015-10-09 22:51       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqmvvtqgei.fsf@gitster.mtv.corp.google.com \
    --to=gitster@pobox.com \
    --cc=dturner@twopensource.com \
    --cc=git@vger.kernel.org \
    --cc=kmcguigan@twitter.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).