git@vger.kernel.org mailing list mirror (one of many)
* [PATCH] [RFC] sparse index: fix use-after-free bug in cache_tree_verify()
@ 2021-10-06  9:29 Phillip Wood via GitGitGadget
  2021-10-06 11:20 ` Derrick Stolee
                   ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Phillip Wood via GitGitGadget @ 2021-10-06  9:29 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, René Scharfe, Elijah Newren,
	Johannes Schindelin, Phillip Wood, Phillip Wood

From: Phillip Wood <phillip.wood@dunelm.org.uk>

In a sparse index it is possible for the tree that is being verified
to be freed while it is being verified. This happens when
index_name_pos() looks up an entry that is missing from the index and
that would be a descendant of a sparse entry. That triggers a call to
ensure_full_index() which frees the cache tree that is being verified.
Carrying on trying to verify the tree after this results in a
use-after-free bug. Instead restart the verification if a sparse index
is converted to a full index. This bug is triggered by a call to
reset_head() in "git rebase --apply". Thanks to René Scharfe for his
help analyzing the problem.

==74345==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000001b20 at pc 0x557cbe82d3a2 bp 0x7ffdfee08090 sp 0x7ffdfee08080
READ of size 4 at 0x606000001b20 thread T0
    #0 0x557cbe82d3a1 in verify_one /home/phil/src/git/cache-tree.c:863
    #1 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #2 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #3 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #4 0x557cbe830a2b in cache_tree_verify /home/phil/src/git/cache-tree.c:910
    #5 0x557cbea53741 in write_locked_index /home/phil/src/git/read-cache.c:3250
    #6 0x557cbeab7fdd in reset_head /home/phil/src/git/reset.c:87
    #7 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074
    #8 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461
    #9 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714
    #10 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781
    #11 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912
    #12 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52
    #13 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)
    #14 0x557cbe5bcb8d in _start (/home/phil/src/git/git+0x1b9b8d)

0x606000001b20 is located 0 bytes inside of 56-byte region [0x606000001b20,0x606000001b58)
freed by thread T0 here:
    #0 0x7fdd4bacff19 in __interceptor_free /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:127
    #1 0x557cbe82af60 in cache_tree_free /home/phil/src/git/cache-tree.c:35
    #2 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31
    #3 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31
    #4 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31
    #5 0x557cbeb2557a in ensure_full_index /home/phil/src/git/sparse-index.c:310
    #6 0x557cbea45c4a in index_name_stage_pos /home/phil/src/git/read-cache.c:588
    #7 0x557cbe82ce37 in verify_one /home/phil/src/git/cache-tree.c:850
    #8 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #9 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #10 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #11 0x557cbe830a2b in cache_tree_verify /home/phil/src/git/cache-tree.c:910
    #12 0x557cbea53741 in write_locked_index /home/phil/src/git/read-cache.c:3250
    #13 0x557cbeab7fdd in reset_head /home/phil/src/git/reset.c:87
    #14 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074
    #15 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461
    #16 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714
    #17 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781
    #18 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912
    #19 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52
    #20 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

previously allocated by thread T0 here:
    #0 0x7fdd4bad0459 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x557cbebc1807 in xcalloc /home/phil/src/git/wrapper.c:140
    #2 0x557cbe82b7d8 in cache_tree /home/phil/src/git/cache-tree.c:17
    #3 0x557cbe82b7d8 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:763
    #4 0x557cbe82b837 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:764
    #5 0x557cbe82b837 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:764
    #6 0x557cbe8304e1 in prime_cache_tree /home/phil/src/git/cache-tree.c:779
    #7 0x557cbeab7fa7 in reset_head /home/phil/src/git/reset.c:85
    #8 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074
    #9 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461
    #10 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714
    #11 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781
    #12 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912
    #13 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52
    #14 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
    [RFC] sparse index: fix use-after-free bug in cache_tree_verify()
    
    In a sparse index it is possible for the tree that is being verified to
    be freed while it is being verified. This is an RFC as I'm not familiar
    with the cache tree code. I'm confused as to why this bug is triggered
    by the sequence
    
    unpack_trees()
    prime_cache_tree()
    write_locked_index()
    
    
    but not
    
    unpack_trees()
    write_locked_index()
    
    
    as unpack_trees() appears to update the cache tree with
    
    if (!cache_tree_fully_valid(o->result.cache_tree))
                cache_tree_update(&o->result,
                          WRITE_TREE_SILENT |
                          WRITE_TREE_REPAIR);
    
    
    and I don't understand why the cache tree from prime_cache_tree()
    results in different behavior. It concerns me that this fix is hiding
    another bug.
    
    Best Wishes
    
    Phillip

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1053%2Fphillipwood%2Fwip%2Fsparse-index-fix-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1053/phillipwood/wip/sparse-index-fix-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1053

 cache-tree.c                             | 29 +++++++++++++++++-------
 t/t1092-sparse-checkout-compatibility.sh |  2 +-
 2 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/cache-tree.c b/cache-tree.c
index 90919f9e345..7bdbbc24268 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -826,10 +826,10 @@ static void verify_one_sparse(struct repository *r,
 		    path->buf);
 }
 
-static void verify_one(struct repository *r,
-		       struct index_state *istate,
-		       struct cache_tree *it,
-		       struct strbuf *path)
+static int verify_one(struct repository *r,
+		      struct index_state *istate,
+		      struct cache_tree *it,
+		      struct strbuf *path)
 {
 	int i, pos, len = path->len;
 	struct strbuf tree_buf = STRBUF_INIT;
@@ -837,21 +837,30 @@ static void verify_one(struct repository *r,
 
 	for (i = 0; i < it->subtree_nr; i++) {
 		strbuf_addf(path, "%s/", it->down[i]->name);
-		verify_one(r, istate, it->down[i]->cache_tree, path);
+		if (verify_one(r, istate, it->down[i]->cache_tree, path))
+			return 1;
 		strbuf_setlen(path, len);
 	}
 
 	if (it->entry_count < 0 ||
 	    /* no verification on tests (t7003) that replace trees */
 	    lookup_replace_object(r, &it->oid) != &it->oid)
-		return;
+		return 0;
 
 	if (path->len) {
+		/*
+		 * If the index is sparse index_name_pos() may trigger
+		 * ensure_full_index() which will free the tree that is being
+		 * verified.
+		 */
+		int is_sparse = istate->sparse_index;
 		pos = index_name_pos(istate, path->buf, path->len);
+		if (is_sparse && !istate->sparse_index)
+			return 1;
 
 		if (pos >= 0) {
 			verify_one_sparse(r, istate, it, path, pos);
-			return;
+			return 0;
 		}
 
 		pos = -pos - 1;
@@ -899,6 +908,7 @@ static void verify_one(struct repository *r,
 		    oid_to_hex(&new_oid), oid_to_hex(&it->oid));
 	strbuf_setlen(path, len);
 	strbuf_release(&tree_buf);
+	return 0;
 }
 
 void cache_tree_verify(struct repository *r, struct index_state *istate)
@@ -907,6 +917,9 @@ void cache_tree_verify(struct repository *r, struct index_state *istate)
 
 	if (!istate->cache_tree)
 		return;
-	verify_one(r, istate, istate->cache_tree, &path);
+	if (verify_one(r, istate, istate->cache_tree, &path)) {
+		strbuf_reset(&path);
+		verify_one(r, istate, istate->cache_tree, &path);
+	}
 	strbuf_release(&path);
 }
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 886e78715fe..85d5279b33c 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -484,7 +484,7 @@ test_expect_success 'checkout and reset (mixed) [sparse]' '
 test_expect_success 'merge, cherry-pick, and rebase' '
 	init_repos &&
 
-	for OPERATION in "merge -m merge" cherry-pick rebase
+	for OPERATION in "merge -m merge" cherry-pick "rebase --apply" "rebase --merge"
 	do
 		test_all_match git checkout -B temp update-deep &&
 		test_all_match git $OPERATION update-folder1 &&

base-commit: cefe983a320c03d7843ac78e73bd513a27806845
-- 
gitgitgadget


* Re: [PATCH] [RFC] sparse index: fix use-after-free bug in cache_tree_verify()
  2021-10-06  9:29 [PATCH] [RFC] sparse index: fix use-after-free bug in cache_tree_verify() Phillip Wood via GitGitGadget
@ 2021-10-06 11:20 ` Derrick Stolee
  2021-10-06 14:01   ` Phillip Wood
  2021-10-06 19:17 ` Junio C Hamano
  2021-10-07  9:50 ` [PATCH v2] " Phillip Wood via GitGitGadget
  2 siblings, 1 reply; 26+ messages in thread
From: Derrick Stolee @ 2021-10-06 11:20 UTC (permalink / raw)
  To: Phillip Wood via GitGitGadget, git
  Cc: Derrick Stolee, René Scharfe, Elijah Newren,
	Johannes Schindelin, Phillip Wood, vdye

On 10/6/2021 5:29 AM, Phillip Wood via GitGitGadget wrote:
> From: Phillip Wood <phillip.wood@dunelm.org.uk>
> 
> In a sparse index it is possible for the tree that is being verified
> to be freed while it is being verified. This happens when
> index_name_pos() looks up an entry that is missing from the index and
> that would be a descendant of a sparse entry. That triggers a call to
> ensure_full_index() which frees the cache tree that is being verified.
> Carrying on trying to verify the tree after this results in a
> use-after-free bug. Instead restart the verification if a sparse index
> is converted to a full index. This bug is triggered by a call to
> reset_head() in "git rebase --apply". Thanks to René Scharfe for his
> help analyzing the problem.

Thank you for identifying an interesting case! I hadn't thought to
change the mode from --merge to --apply.

>     In a sparse index it is possible for the tree that is being verified to
>     be freed while it is being verified. This is an RFC as I'm not familiar
>     with the cache tree code. I'm confused as to why this bug is triggered
>     by the sequence
>     
>     unpack_trees()
>     prime_cache_tree()
>     write_locked_index()
>     
>     but not
>     
>     unpack_trees()
>     write_locked_index()
>     
>     
>     as unpack_trees() appears to update the cache tree with
>     
>     if (!cache_tree_fully_valid(o->result.cache_tree))
>                 cache_tree_update(&o->result,
>                           WRITE_TREE_SILENT |
>                           WRITE_TREE_REPAIR);
>     
>     
>     and I don't understand why the cache tree from prime_cache_tree()
>     results in different behavior. It concerns me that this fix is hiding
>     another bug.

prime_cache_tree() appears to clear the cache tree and start from scratch
from a tree object instead of using the index.

In particular, prime_cache_tree_rec() does not stop at the sparse-checkout
cone, so the cache tree is the full size at that point.

When the verify_one() method reaches these nodes that are outside of the
cone, index_name_pos() triggers the index expansion in a way that the
cache-tree that is restricted to the sparse-checkout cone does not.

Hopefully that helps clear up _why_ this happens.
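
To make the mismatch concrete, here is a minimal sketch in plain C. It is
a toy model and not git's code: every toy_* name is hypothetical, and the
assumption that expansion frees the old tree and builds a fresh one is a
simplification of what ensure_full_index() does. It only shows why a
lookup for a path outside the cone invalidates the node being verified,
and how returning 1 lets the caller start over on the new tree.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for struct cache_tree; not git's real layout. */
struct toy_tree {
	int outside_cone;            /* path lies below a sparse directory entry */
	int nr_children;
	struct toy_tree **children;
};

/* Toy stand-in for struct index_state. */
struct toy_index {
	int sparse;                  /* plays the role of istate->sparse_index */
	int expansions;              /* how often the toy index was expanded */
	struct toy_tree *tree;       /* plays the role of istate->cache_tree */
};

static struct toy_tree *toy_tree_new(int outside_cone, int nr_children)
{
	struct toy_tree *t = calloc(1, sizeof(*t));

	assert(t);
	t->outside_cone = outside_cone;
	t->nr_children = nr_children;
	if (nr_children) {
		t->children = calloc(nr_children, sizeof(*t->children));
		assert(t->children);
	}
	return t;
}

static void toy_tree_free(struct toy_tree *t)
{
	int i;

	for (i = 0; i < t->nr_children; i++)
		toy_tree_free(t->children[i]);
	free(t->children);
	free(t);
}

/* A full tree: one directory inside the cone, one outside it. */
static struct toy_tree *toy_tree_build(void)
{
	struct toy_tree *root = toy_tree_new(0, 2);

	root->children[0] = toy_tree_new(0, 0);
	root->children[1] = toy_tree_new(1, 0);
	return root;
}

/* Like index_name_pos(): a lookup below a sparse entry expands the index. */
static void toy_lookup(struct toy_index *idx, int outside_cone)
{
	if (!idx->sparse || !outside_cone)
		return;
	toy_tree_free(idx->tree);        /* every node of the old tree dies here */
	idx->tree = toy_tree_build();
	idx->sparse = 0;
	idx->expansions++;
}

/* Returns 1 if the tree was replaced under us and the caller must restart. */
static int toy_verify_one(struct toy_index *idx, struct toy_tree *t)
{
	int i, was_sparse;

	for (i = 0; i < t->nr_children; i++)
		if (toy_verify_one(idx, t->children[i]))
			return 1;

	was_sparse = idx->sparse;
	toy_lookup(idx, t->outside_cone);
	if (was_sparse && !idx->sparse)
		return 1;                /* t is dangling now, do not touch it */
	return 0;
}

/* Returns the number of passes that were needed. */
static int toy_verify(struct toy_index *idx)
{
	if (!toy_verify_one(idx, idx->tree))
		return 1;
	if (toy_verify_one(idx, idx->tree))
		abort();                 /* the toy index can only be expanded once */
	return 2;
}
```

Starting from a sparse toy index whose tree came from toy_tree_build(),
toy_verify() needs two passes; a second call on the now-full index needs
only one.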

There is a remaining issue that "git rebase --apply" will be a lot slower
than "git rebase --merge" because of this construction of a cache-tree
that is much larger than necessary.

I will make note of this as a potential improvement for the future.

> -static void verify_one(struct repository *r,
> -		       struct index_state *istate,
> -		       struct cache_tree *it,
> -		       struct strbuf *path)
> +static int verify_one(struct repository *r,
> +		      struct index_state *istate,
> +		      struct cache_tree *it,
> +		      struct strbuf *path)
>  {
>  	int i, pos, len = path->len;
>  	struct strbuf tree_buf = STRBUF_INIT;
> @@ -837,21 +837,30 @@ static void verify_one(struct repository *r,
>  
>  	for (i = 0; i < it->subtree_nr; i++) {
>  		strbuf_addf(path, "%s/", it->down[i]->name);
> -		verify_one(r, istate, it->down[i]->cache_tree, path);
> +		if (verify_one(r, istate, it->down[i]->cache_tree, path))
> +			return 1;
>  		strbuf_setlen(path, len);
>  	}
>  
>  	if (it->entry_count < 0 ||
>  	    /* no verification on tests (t7003) that replace trees */
>  	    lookup_replace_object(r, &it->oid) != &it->oid)
> -		return;
> +		return 0;
>  
>  	if (path->len) {
> +		/*
> +		 * If the index is sparse index_name_pos() may trigger
> +		 * ensure_full_index() which will free the tree that is being
> +		 * verified.
> +		 */
> +		int is_sparse = istate->sparse_index;
>  		pos = index_name_pos(istate, path->buf, path->len);
> +		if (is_sparse && !istate->sparse_index)
> +			return 1;

I think this guard is good to have, even if we fix prime_cache_tree() to
avoid triggering expansion here in most cases.

>  		if (pos >= 0) {
>  			verify_one_sparse(r, istate, it, path, pos);
> -			return;
> +			return 0;
>  		}
>  
>  		pos = -pos - 1;
> @@ -899,6 +908,7 @@ static void verify_one(struct repository *r,
>  		    oid_to_hex(&new_oid), oid_to_hex(&it->oid));
>  	strbuf_setlen(path, len);
>  	strbuf_release(&tree_buf);
> +	return 0;
>  }
>  
>  void cache_tree_verify(struct repository *r, struct index_state *istate)
> @@ -907,6 +917,9 @@ void cache_tree_verify(struct repository *r, struct index_state *istate)
>  
>  	if (!istate->cache_tree)
>  		return;
> -	verify_one(r, istate, istate->cache_tree, &path);
> +	if (verify_one(r, istate, istate->cache_tree, &path)) {
> +		strbuf_reset(&path);
> +		verify_one(r, istate, istate->cache_tree, &path);
> +	}

And this limits us to doing at most two passes. Good.

>  test_expect_success 'merge, cherry-pick, and rebase' '
>  	init_repos &&
>  
> -	for OPERATION in "merge -m merge" cherry-pick rebase
> +	for OPERATION in "merge -m merge" cherry-pick "rebase --apply" "rebase --merge"

Thank you for the additional test!

Thanks,
-Stolee


* Re: [PATCH] [RFC] sparse index: fix use-after-free bug in cache_tree_verify()
  2021-10-06 11:20 ` Derrick Stolee
@ 2021-10-06 14:01   ` Phillip Wood
  2021-10-06 14:19     ` Derrick Stolee
  0 siblings, 1 reply; 26+ messages in thread
From: Phillip Wood @ 2021-10-06 14:01 UTC (permalink / raw)
  To: Derrick Stolee, Phillip Wood via GitGitGadget, git
  Cc: Derrick Stolee, René Scharfe, Elijah Newren,
	Johannes Schindelin, Phillip Wood, vdye

Hi Stolee

On 06/10/2021 12:20, Derrick Stolee wrote:
> On 10/6/2021 5:29 AM, Phillip Wood via GitGitGadget wrote:
>> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>>
>> In a sparse index it is possible for the tree that is being verified
>> to be freed while it is being verified. This happens when
>> index_name_pos() looks up an entry that is missing from the index and
>> that would be a descendant of a sparse entry. That triggers a call to
>> ensure_full_index() which frees the cache tree that is being verified.
>> Carrying on trying to verify the tree after this results in a
>> use-after-free bug. Instead restart the verification if a sparse index
>> is converted to a full index. This bug is triggered by a call to
>> reset_head() in "git rebase --apply". Thanks to René Scharfe for his
>> help analyzing the problem.
> 
> Thank you for identifying an interesting case! I hadn't thought to
> change the mode from --merge to --apply.

Thanks, I can't really take much credit for that though - Junio pointed 
out that my patch converting the merge based rebase to use the same 
checkout code as the apply based rebase broke a test in seen and René 
diagnosed the problem.

>>      In a sparse index it is possible for the tree that is being verified to
>>      be freed while it is being verified. This is an RFC as I'm not familiar
>>      with the cache tree code. I'm confused as to why this bug is triggered
>>      by the sequence
>>      
>>      unpack_trees()
>>      prime_cache_tree()
>>      write_locked_index()
>>      
>>      but not
>>      
>>      unpack_trees()
>>      write_locked_index()
>>      
>>      
>>      as unpack_trees() appears to update the cache tree with
>>      
>>      if (!cache_tree_fully_valid(o->result.cache_tree))
>>                  cache_tree_update(&o->result,
>>                            WRITE_TREE_SILENT |
>>                            WRITE_TREE_REPAIR);
>>      
>>      
>>      and I don't understand why the cache tree from prime_cache_tree()
>>      results in different behavior. It concerns me that this fix is hiding
>>      another bug.
> 
> prime_cache_tree() appears to clear the cache tree and start from scratch
> from a tree object instead of using the index.
> 
> In particular, prime_cache_tree_rec() does not stop at the sparse-checkout
> cone, so the cache tree is the full size at that point.
> 
> When the verify_one() method reaches these nodes that are outside of the
> cone, index_name_pos() triggers the index expansion in a way that the
> cache-tree that is restricted to the sparse-checkout cone does not.
> 
> Hopefully that helps clear up _why_ this happens.

It does, thanks - we end up with a full cache tree but a sparse index.

> There is a remaining issue that "git rebase --apply" will be a lot slower
> than "git rebase --merge" because of this construction of a cache-tree
> that is much larger than necessary.
> 
> I will make note of this as a potential improvement for the future.

I think I'm going to remove the call to prime_cache_tree(). Correct me 
if I'm wrong but as I understand it unpack_trees() updates the cache 
tree so the call to prime_cache_tree() is not needed (I think it was 
copied from builtin/rebase.c which does need to call prime_cache_tree() 
if it has updated a few paths rather than the whole top-level tree). In 
any case I've just noticed that one of Victoria's patches[1] looks like 
it fixes prime_cache_tree() with a sparse index.

[1] 
https://lore.kernel.org/git/78cd85d8dcc790251ce8235e649902cf6adf091a.1633440057.git.gitgitgadget@gmail.com/

>> -static void verify_one(struct repository *r,
>> -		       struct index_state *istate,
>> -		       struct cache_tree *it,
>> -		       struct strbuf *path)
>> +static int verify_one(struct repository *r,
>> +		      struct index_state *istate,
>> +		      struct cache_tree *it,
>> +		      struct strbuf *path)
>>   {
>>   	int i, pos, len = path->len;
>>   	struct strbuf tree_buf = STRBUF_INIT;
>> @@ -837,21 +837,30 @@ static void verify_one(struct repository *r,
>>   
>>   	for (i = 0; i < it->subtree_nr; i++) {
>>   		strbuf_addf(path, "%s/", it->down[i]->name);
>> -		verify_one(r, istate, it->down[i]->cache_tree, path);
>> +		if (verify_one(r, istate, it->down[i]->cache_tree, path))
>> +			return 1;
>>   		strbuf_setlen(path, len);
>>   	}
>>   
>>   	if (it->entry_count < 0 ||
>>   	    /* no verification on tests (t7003) that replace trees */
>>   	    lookup_replace_object(r, &it->oid) != &it->oid)
>> -		return;
>> +		return 0;
>>   
>>   	if (path->len) {
>> +		/*
>> +		 * If the index is sparse index_name_pos() may trigger
>> +		 * ensure_full_index() which will free the tree that is being
>> +		 * verified.
>> +		 */
>> +		int is_sparse = istate->sparse_index;
>>   		pos = index_name_pos(istate, path->buf, path->len);
>> +		if (is_sparse && !istate->sparse_index)
>> +			return 1;
> 
> I think this guard is good to have, even if we fix prime_cache_tree() to
> avoid triggering expansion here in most cases.
> 
>>   		if (pos >= 0) {
>>   			verify_one_sparse(r, istate, it, path, pos);
>> -			return;
>> +			return 0;
>>   		}
>>   
>>   		pos = -pos - 1;
>> @@ -899,6 +908,7 @@ static void verify_one(struct repository *r,
>>   		    oid_to_hex(&new_oid), oid_to_hex(&it->oid));
>>   	strbuf_setlen(path, len);
>>   	strbuf_release(&tree_buf);
>> +	return 0;
>>   }
>>   
>>   void cache_tree_verify(struct repository *r, struct index_state *istate)
>> @@ -907,6 +917,9 @@ void cache_tree_verify(struct repository *r, struct index_state *istate)
>>   
>>   	if (!istate->cache_tree)
>>   		return;
>> -	verify_one(r, istate, istate->cache_tree, &path);
>> +	if (verify_one(r, istate, istate->cache_tree, &path)) {
>> +		strbuf_reset(&path);
>> +		verify_one(r, istate, istate->cache_tree, &path);
>> +	}
> 
> And this limits us to doing at most two passes. Good.

In theory ensure_full_index() will only ever be called once but I wanted 
to make sure we could not get into an infinite loop.

>>   test_expect_success 'merge, cherry-pick, and rebase' '
>>   	init_repos &&
>>   
>> -	for OPERATION in "merge -m merge" cherry-pick rebase
>> +	for OPERATION in "merge -m merge" cherry-pick "rebase --apply" "rebase --merge"
> 
> Thank you for the additional test!

Thanks for your explanation and looking at the patch

Best Wishes

Phillip

> Thanks,
> -Stolee
> 


* Re: [PATCH] [RFC] sparse index: fix use-after-free bug in cache_tree_verify()
  2021-10-06 14:01   ` Phillip Wood
@ 2021-10-06 14:19     ` Derrick Stolee
  0 siblings, 0 replies; 26+ messages in thread
From: Derrick Stolee @ 2021-10-06 14:19 UTC (permalink / raw)
  To: phillip.wood, Phillip Wood via GitGitGadget, git
  Cc: Derrick Stolee, René Scharfe, Elijah Newren,
	Johannes Schindelin, vdye

On 10/6/21 10:01 AM, Phillip Wood wrote:
> Hi Stolee
> 
> On 06/10/2021 12:20, Derrick Stolee wrote:
>> In particular, prime_cache_tree_rec() does not stop at the sparse-checkout
>> cone, so the cache tree is the full size at that point.
>>
>> When the verify_one() method reaches these nodes that are outside of the
>> cone, index_name_pos() triggers the index expansion in a way that the
>> cache-tree that is restricted to the sparse-checkout cone does not.
>>
>> Hopefully that helps clear up _why_ this happens.
> 
> It does, thanks - we end up with a full cache tree but a sparse index.

That's a short-and-sweet way to describe it.

>> There is a remaining issue that "git rebase --apply" will be a lot slower
>> than "git rebase --merge" because of this construction of a cache-tree
>> that is much larger than necessary.
>>
>> I will make note of this as a potential improvement for the future.
> 
> I think I'm going to remove the call to prime_cache_tree(). Correct me
> if I'm wrong but as I understand it unpack_trees() updates the cache
> tree so the call to prime_cache_tree() is not needed (I think it was
> copied from builtin/rebase.c which does need to call prime_cache_tree()
> if it has updated a few paths rather than the whole top-level tree). In
> any case I've just noticed that one of Victoria's patches[1] looks like
> it fixes prime_cache_tree() with a sparse index.
> 
> [1] https://lore.kernel.org/git/78cd85d8dcc790251ce8235e649902cf6adf091a.1633440057.git.gitgitgadget@gmail.com/

Of course it does! I'm losing track of all the ongoing work in
the sparse index as I've been distracted and out of it for a
while. It's in good hands.

Thanks,
-Stolee


* Re: [PATCH] [RFC] sparse index: fix use-after-free bug in cache_tree_verify()
  2021-10-06  9:29 [PATCH] [RFC] sparse index: fix use-after-free bug in cache_tree_verify() Phillip Wood via GitGitGadget
  2021-10-06 11:20 ` Derrick Stolee
@ 2021-10-06 19:17 ` Junio C Hamano
  2021-10-06 20:43   ` Derrick Stolee
  2021-10-07  9:50 ` [PATCH v2] " Phillip Wood via GitGitGadget
  2 siblings, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2021-10-06 19:17 UTC (permalink / raw)
  To: Phillip Wood via GitGitGadget
  Cc: git, Derrick Stolee, René Scharfe, Elijah Newren,
	Johannes Schindelin, Phillip Wood

"Phillip Wood via GitGitGadget" <gitgitgadget@gmail.com> writes:

/*
 * Please document what the values that can be returned from
 * this function are and what they mean, just before this
 * function.  I am guessing that this is "all bets are off and
 * you need to redo the computation again over the full in-core
 * index"?  It is not an error and I think it makes sense to use
 * positive 1 like this patch does instead of -1.
 */
>  
> -static void verify_one(struct repository *r,
> -		       struct index_state *istate,
> -		       struct cache_tree *it,
> -		       struct strbuf *path)
> +static int verify_one(struct repository *r,
> +		      struct index_state *istate,
> +		      struct cache_tree *it,
> +		      struct strbuf *path)
>  {



> @@ -907,6 +917,9 @@ void cache_tree_verify(struct repository *r, struct index_state *istate)
>  
>  	if (!istate->cache_tree)
>  		return;
> -	verify_one(r, istate, istate->cache_tree, &path);
> +	if (verify_one(r, istate, istate->cache_tree, &path)) {
> +		strbuf_reset(&path);
> +		verify_one(r, istate, istate->cache_tree, &path);
> +	}
>  	strbuf_release(&path);
>  }

This is just a style thing, but I would find it easier to follow if
it just recursed into itself, i.e.

-	verify_one(...);
+	if (verify_one(...))
+		cache_tree_verify(r, istate);

or

-	verify_one(...);
+	again:
+	if (verify_one(...)) {
+		strbuf_reset(&path);
+		goto again;
+	}

On the other hand, if the new code wants to say "I would retry at
most once, otherwise there is something wrong in me", then

> -	verify_one(r, istate, istate->cache_tree, &path);
> +	if (verify_one(r, istate, istate->cache_tree, &path)) {
> +		strbuf_reset(&path);
> +		if (verify_one(r, istate, istate->cache_tree, &path))
> +			BUG("...");
> +	}

would be better.
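
To spell out the difference between the two retry policies, here is a
small stand-alone sketch. It is not git code: pass_fn, run_until_done(),
run_at_most_twice() and countdown_pass() are hypothetical names made up
for illustration, and the -1 return merely marks the spot where the real
code would call BUG().

```c
#include <stddef.h>

/* Hypothetical pass callback: returns 1 for "restart needed", 0 for "done". */
typedef int (*pass_fn)(void *state);

/*
 * Policy A: keep restarting for as long as the pass asks for it.
 * Returns the number of passes run. Never terminates if the pass
 * keeps asking for a restart.
 */
static int run_until_done(pass_fn pass, void *state)
{
	int passes = 0;

	do {
		passes++;
	} while (pass(state));
	return passes;
}

/*
 * Policy B: restart at most once. A second restart request means an
 * assumption was violated; report it with -1 (git would call BUG()).
 */
static int run_at_most_twice(pass_fn pass, void *state)
{
	if (!pass(state))
		return 1;
	if (!pass(state))
		return 2;
	return -1;
}

/* Example pass: asks for a restart while *state is still positive. */
static int countdown_pass(void *state)
{
	int *remaining = (int *)state;

	if (*remaining > 0) {
		(*remaining)--;
		return 1;
	}
	return 0;
}
```

With a pass that asks for one restart both policies finish after two
passes; the difference only shows when the pass misbehaves, where policy
B reports the problem instead of looping.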

Other than that, nicely done.

> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index 886e78715fe..85d5279b33c 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -484,7 +484,7 @@ test_expect_success 'checkout and reset (mixed) [sparse]' '
>  test_expect_success 'merge, cherry-pick, and rebase' '
>  	init_repos &&
>  
> -	for OPERATION in "merge -m merge" cherry-pick rebase
> +	for OPERATION in "merge -m merge" cherry-pick "rebase --apply" "rebase --merge"
>  	do
>  		test_all_match git checkout -B temp update-deep &&
>  		test_all_match git $OPERATION update-folder1 &&
>
> base-commit: cefe983a320c03d7843ac78e73bd513a27806845


* Re: [PATCH] [RFC] sparse index: fix use-after-free bug in cache_tree_verify()
  2021-10-06 19:17 ` Junio C Hamano
@ 2021-10-06 20:43   ` Derrick Stolee
  0 siblings, 0 replies; 26+ messages in thread
From: Derrick Stolee @ 2021-10-06 20:43 UTC (permalink / raw)
  To: Junio C Hamano, Phillip Wood via GitGitGadget
  Cc: git, Derrick Stolee, René Scharfe, Elijah Newren,
	Johannes Schindelin, Phillip Wood

On 10/6/2021 3:17 PM, Junio C Hamano wrote:
> "Phillip Wood via GitGitGadget" <gitgitgadget@gmail.com> writes:
>> @@ -907,6 +917,9 @@ void cache_tree_verify(struct repository *r, struct index_state *istate)
>>  
>>  	if (!istate->cache_tree)
>>  		return;
>> -	verify_one(r, istate, istate->cache_tree, &path);
>> +	if (verify_one(r, istate, istate->cache_tree, &path)) {
>> +		strbuf_reset(&path);
>> +		verify_one(r, istate, istate->cache_tree, &path);
>> +	}
>>  	strbuf_release(&path);
>>  }
> 
> This is just a style thing, but I would find it easier to follow if
> it just recursed into itself, i.e.
> 
> -	verify_one(...);
> +	if (verify_one(...))
> +		cache_tree_verify(r, istate);
> 
> or
> 
> -	verify_one(...);
> +	again:
> +	if (verify_one(...)) {
> +		strbuf_reset(&path);
> +		goto again;
> +	}
> 
> On the other hand, if the new code wants to say "I would retry at
> most once, otherwise there is something wrong in me", then
> 
>> -	verify_one(r, istate, istate->cache_tree, &path);
>> +	if (verify_one(r, istate, istate->cache_tree, &path)) {
>> +		strbuf_reset(&path);
>> +		if (verify_one(r, istate, istate->cache_tree, &path))
>> +			BUG("...");
>> +	}
> 
> would be better.

I'm in favor of this second option.

Thanks,
-Stolee


* [PATCH v2] sparse index: fix use-after-free bug in cache_tree_verify()
  2021-10-06  9:29 [PATCH] [RFC] sparse index: fix use-after-free bug in cache_tree_verify() Phillip Wood via GitGitGadget
  2021-10-06 11:20 ` Derrick Stolee
  2021-10-06 19:17 ` Junio C Hamano
@ 2021-10-07  9:50 ` Phillip Wood via GitGitGadget
  2021-10-07 13:35   ` Derrick Stolee
                     ` (2 more replies)
  2 siblings, 3 replies; 26+ messages in thread
From: Phillip Wood via GitGitGadget @ 2021-10-07  9:50 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, René Scharfe, Elijah Newren,
	Johannes Schindelin, Junio C Hamano, Phillip Wood, Phillip Wood

From: Phillip Wood <phillip.wood@dunelm.org.uk>

In a sparse index it is possible for the tree that is being verified
to be freed while it is being verified. This happens when the index is
sparse but the cache tree is not and index_name_pos() looks up a path
from the cache tree that is a descendant of a sparse index entry. That
triggers a call to ensure_full_index() which frees the cache tree that
is being verified.  Carrying on trying to verify the tree after this
results in a use-after-free bug. Instead restart the verification if a
sparse index is converted to a full index. This bug is triggered by a
call to reset_head() in "git rebase --apply". Thanks to René Scharfe
and Derrick Stolee for their help analyzing the problem.

==74345==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000001b20 at pc 0x557cbe82d3a2 bp 0x7ffdfee08090 sp 0x7ffdfee08080
READ of size 4 at 0x606000001b20 thread T0
    #0 0x557cbe82d3a1 in verify_one /home/phil/src/git/cache-tree.c:863
    #1 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #2 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #3 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #4 0x557cbe830a2b in cache_tree_verify /home/phil/src/git/cache-tree.c:910
    #5 0x557cbea53741 in write_locked_index /home/phil/src/git/read-cache.c:3250
    #6 0x557cbeab7fdd in reset_head /home/phil/src/git/reset.c:87
    #7 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074
    #8 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461
    #9 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714
    #10 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781
    #11 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912
    #12 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52
    #13 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)
    #14 0x557cbe5bcb8d in _start (/home/phil/src/git/git+0x1b9b8d)

0x606000001b20 is located 0 bytes inside of 56-byte region [0x606000001b20,0x606000001b58)
freed by thread T0 here:
    #0 0x7fdd4bacff19 in __interceptor_free /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:127
    #1 0x557cbe82af60 in cache_tree_free /home/phil/src/git/cache-tree.c:35
    #2 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31
    #3 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31
    #4 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31
    #5 0x557cbeb2557a in ensure_full_index /home/phil/src/git/sparse-index.c:310
    #6 0x557cbea45c4a in index_name_stage_pos /home/phil/src/git/read-cache.c:588
    #7 0x557cbe82ce37 in verify_one /home/phil/src/git/cache-tree.c:850
    #8 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #9 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #10 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #11 0x557cbe830a2b in cache_tree_verify /home/phil/src/git/cache-tree.c:910
    #12 0x557cbea53741 in write_locked_index /home/phil/src/git/read-cache.c:3250
    #13 0x557cbeab7fdd in reset_head /home/phil/src/git/reset.c:87
    #14 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074
    #15 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461
    #16 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714
    #17 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781
    #18 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912
    #19 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52
    #20 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

previously allocated by thread T0 here:
    #0 0x7fdd4bad0459 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x557cbebc1807 in xcalloc /home/phil/src/git/wrapper.c:140
    #2 0x557cbe82b7d8 in cache_tree /home/phil/src/git/cache-tree.c:17
    #3 0x557cbe82b7d8 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:763
    #4 0x557cbe82b837 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:764
    #5 0x557cbe82b837 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:764
    #6 0x557cbe8304e1 in prime_cache_tree /home/phil/src/git/cache-tree.c:779
    #7 0x557cbeab7fa7 in reset_head /home/phil/src/git/reset.c:85
    #8 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074
    #9 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461
    #10 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714
    #11 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781
    #12 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912
    #13 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52
    #14 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
    [RFC] sparse index: fix use-after-free bug in cache_tree_verify()
    
    Thanks for the feedback and help, here are the changes from the RFC
    
     * Updated commit message and comments to make it clear this is
       triggered by a sparse index with a full cache tree based on Stolee's
       explanation.
     * Added a comment and BUG() suggested by Junio
    
    RFC cover letter:
    
    In a sparse index it is possible for the tree that is
    being verified to be freed while it is being verified. This is an RFC as
    I'm not familiar with the cache tree code. I'm confused as to why this
    bug is triggered by the sequence
    
    unpack_trees()
    prime_cache_tree()
    write_locked_index()
    
    
    but not
    
    unpack_trees()
    write_locked_index()
    
    
    as unpack_trees() appears to update the cache tree with
    
    if (!cache_tree_fully_valid(o->result.cache_tree))
                cache_tree_update(&o->result,
                          WRITE_TREE_SILENT |
                          WRITE_TREE_REPAIR);
    
    
    and I don't understand why the cache tree from prime_cache_tree()
    results in different behavior. It concerns me that this fix is hiding
    another bug.
    
    Best Wishes
    
    Phillip

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1053%2Fphillipwood%2Fwip%2Fsparse-index-fix-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1053/phillipwood/wip/sparse-index-fix-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1053

Range-diff vs v1:

 1:  358b7afb653 ! 1:  4ee972fee2e [RFC] sparse index: fix use-after-free bug in cache_tree_verify()
     @@ Metadata
      Author: Phillip Wood <phillip.wood@dunelm.org.uk>
      
       ## Commit message ##
     -    [RFC] sparse index: fix use-after-free bug in cache_tree_verify()
     +    sparse index: fix use-after-free bug in cache_tree_verify()
      
          In a sparse index it is possible for the tree that is being verified
     -    to be freed while it is being verified. This happens when
     -    index_name_pos() looks up a entry that is missing from the index and
     -    that would be a descendant of a sparse entry. That triggers a call to
     -    ensure_full_index() which frees the cache tree that is being verified.
     -    Carrying on trying to verify the tree after this results in a
     -    use-after-free bug. Instead restart the verification if a sparse index
     -    is converted to a full index. This bug is triggered by a call to
     -    reset_head() in "git rebase --apply". Thanks to René Scharfe for his
     -    help analyzing the problem.
     +    to be freed while it is being verified. This happens when the index is
     +    sparse but the cache tree is not and index_name_pos() looks up a path
     +    from the cache tree that is a descendant of a sparse index entry. That
     +    triggers a call to ensure_full_index() which frees the cache tree that
     +    is being verified.  Carrying on trying to verify the tree after this
     +    results in a use-after-free bug. Instead restart the verification if a
     +    sparse index is converted to a full index. This bug is triggered by a
     +    call to reset_head() in "git rebase --apply". Thanks to René Scharfe
     +    and Derick Stolee for their help analyzing the problem.
      
          ==74345==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000001b20 at pc 0x557cbe82d3a2 bp 0x7ffdfee08090 sp 0x7ffdfee08080
          READ of size 4 at 0x606000001b20 thread T0
     @@ cache-tree.c: static void verify_one_sparse(struct repository *r,
      -		       struct index_state *istate,
      -		       struct cache_tree *it,
      -		       struct strbuf *path)
     ++/*
     ++ * Returns:
     ++ *  0 - Verification completed.
     ++ *  1 - Restart verification - a call to ensure_full_index() freed the cache
     ++ *      tree that is being verified and verification needs to be restarted from
     ++ *      the new toplevel cache tree.
     ++ */
      +static int verify_one(struct repository *r,
      +		      struct index_state *istate,
      +		      struct cache_tree *it,
     @@ cache-tree.c: static void verify_one(struct repository *r,
       
       	if (path->len) {
      +		/*
     -+		 * If the index is sparse index_name_pos() may trigger
     -+		 * ensure_full_index() which will free the tree that is being
     -+		 * verified.
     ++		 * If the index is sparse and the cache tree is not
     ++		 * index_name_pos() may trigger ensure_full_index() which will
     ++		 * free the tree that is being verified.
      +		 */
      +		int is_sparse = istate->sparse_index;
       		pos = index_name_pos(istate, path->buf, path->len);
     @@ cache-tree.c: void cache_tree_verify(struct repository *r, struct index_state *i
      -	verify_one(r, istate, istate->cache_tree, &path);
      +	if (verify_one(r, istate, istate->cache_tree, &path)) {
      +		strbuf_reset(&path);
     -+		verify_one(r, istate, istate->cache_tree, &path);
     ++		if (verify_one(r, istate, istate->cache_tree, &path))
     ++			BUG("ensure_full_index() called twice while verifying cache tree");
      +	}
       	strbuf_release(&path);
       }


 cache-tree.c                             | 37 +++++++++++++++++++-----
 t/t1092-sparse-checkout-compatibility.sh |  2 +-
 2 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/cache-tree.c b/cache-tree.c
index 90919f9e345..8044e21bcf3 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -826,10 +826,17 @@ static void verify_one_sparse(struct repository *r,
 		    path->buf);
 }
 
-static void verify_one(struct repository *r,
-		       struct index_state *istate,
-		       struct cache_tree *it,
-		       struct strbuf *path)
+/*
+ * Returns:
+ *  0 - Verification completed.
+ *  1 - Restart verification - a call to ensure_full_index() freed the cache
+ *      tree that is being verified and verification needs to be restarted from
+ *      the new toplevel cache tree.
+ */
+static int verify_one(struct repository *r,
+		      struct index_state *istate,
+		      struct cache_tree *it,
+		      struct strbuf *path)
 {
 	int i, pos, len = path->len;
 	struct strbuf tree_buf = STRBUF_INIT;
@@ -837,21 +844,30 @@ static void verify_one(struct repository *r,
 
 	for (i = 0; i < it->subtree_nr; i++) {
 		strbuf_addf(path, "%s/", it->down[i]->name);
-		verify_one(r, istate, it->down[i]->cache_tree, path);
+		if (verify_one(r, istate, it->down[i]->cache_tree, path))
+			return 1;
 		strbuf_setlen(path, len);
 	}
 
 	if (it->entry_count < 0 ||
 	    /* no verification on tests (t7003) that replace trees */
 	    lookup_replace_object(r, &it->oid) != &it->oid)
-		return;
+		return 0;
 
 	if (path->len) {
+		/*
+		 * If the index is sparse and the cache tree is not
+		 * index_name_pos() may trigger ensure_full_index() which will
+		 * free the tree that is being verified.
+		 */
+		int is_sparse = istate->sparse_index;
 		pos = index_name_pos(istate, path->buf, path->len);
+		if (is_sparse && !istate->sparse_index)
+			return 1;
 
 		if (pos >= 0) {
 			verify_one_sparse(r, istate, it, path, pos);
-			return;
+			return 0;
 		}
 
 		pos = -pos - 1;
@@ -899,6 +915,7 @@ static void verify_one(struct repository *r,
 		    oid_to_hex(&new_oid), oid_to_hex(&it->oid));
 	strbuf_setlen(path, len);
 	strbuf_release(&tree_buf);
+	return 0;
 }
 
 void cache_tree_verify(struct repository *r, struct index_state *istate)
@@ -907,6 +924,10 @@ void cache_tree_verify(struct repository *r, struct index_state *istate)
 
 	if (!istate->cache_tree)
 		return;
-	verify_one(r, istate, istate->cache_tree, &path);
+	if (verify_one(r, istate, istate->cache_tree, &path)) {
+		strbuf_reset(&path);
+		if (verify_one(r, istate, istate->cache_tree, &path))
+			BUG("ensure_full_index() called twice while verifying cache tree");
+	}
 	strbuf_release(&path);
 }
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 886e78715fe..85d5279b33c 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -484,7 +484,7 @@ test_expect_success 'checkout and reset (mixed) [sparse]' '
 test_expect_success 'merge, cherry-pick, and rebase' '
 	init_repos &&
 
-	for OPERATION in "merge -m merge" cherry-pick rebase
+	for OPERATION in "merge -m merge" cherry-pick "rebase --apply" "rebase --merge"
 	do
 		test_all_match git checkout -B temp update-deep &&
 		test_all_match git $OPERATION update-folder1 &&

base-commit: cefe983a320c03d7843ac78e73bd513a27806845
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] sparse index: fix use-after-free bug in cache_tree_verify()
  2021-10-07  9:50 ` [PATCH v2] " Phillip Wood via GitGitGadget
@ 2021-10-07 13:35   ` Derrick Stolee
  2021-10-07 14:59     ` Phillip Wood
  2021-10-07 13:53   ` Derrick Stolee
  2021-10-07 18:07   ` [PATCH v3] " Phillip Wood via GitGitGadget
  2 siblings, 1 reply; 26+ messages in thread
From: Derrick Stolee @ 2021-10-07 13:35 UTC (permalink / raw)
  To: Phillip Wood via GitGitGadget, git
  Cc: Derrick Stolee, René Scharfe, Elijah Newren,
	Johannes Schindelin, Junio C Hamano, Phillip Wood

On 10/7/2021 5:50 AM, Phillip Wood via GitGitGadget wrote:
> From: Phillip Wood <phillip.wood@dunelm.org.uk>
> 
> In a sparse index it is possible for the tree that is being verified
> to be freed while it is being verified. This happens when the index is
> sparse but the cache tree is not and index_name_pos() looks up a path
> from the cache tree that is a descendant of a sparse index entry. That
> triggers a call to ensure_full_index() which frees the cache tree that
> is being verified.  Carrying on trying to verify the tree after this
> results in a use-after-free bug. Instead restart the verification if a
> sparse index is converted to a full index. This bug is triggered by a
> call to reset_head() in "git rebase --apply". Thanks to René Scharfe
> and Derick Stolee for their help analyzing the problem.

nit: s/Derick/Derrick/

Otherwise, this version looks good to me. Thanks for putting the last
bit of polish on it.

I'm taking this patch into our microsoft/git fork as we speak [1].

[1] https://github.com/microsoft/git/pull/439

Thanks,
-Stolee


* Re: [PATCH v2] sparse index: fix use-after-free bug in cache_tree_verify()
  2021-10-07  9:50 ` [PATCH v2] " Phillip Wood via GitGitGadget
  2021-10-07 13:35   ` Derrick Stolee
@ 2021-10-07 13:53   ` Derrick Stolee
  2021-10-07 15:05     ` Phillip Wood
  2021-10-07 18:07   ` [PATCH v3] " Phillip Wood via GitGitGadget
  2 siblings, 1 reply; 26+ messages in thread
From: Derrick Stolee @ 2021-10-07 13:53 UTC (permalink / raw)
  To: Phillip Wood via GitGitGadget, git
  Cc: Derrick Stolee, René Scharfe, Elijah Newren,
	Johannes Schindelin, Junio C Hamano, Phillip Wood

On 10/7/2021 5:50 AM, Phillip Wood via GitGitGadget wrote:
> From: Phillip Wood <phillip.wood@dunelm.org.uk>
...
> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index 886e78715fe..85d5279b33c 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -484,7 +484,7 @@ test_expect_success 'checkout and reset (mixed) [sparse]' '
>  test_expect_success 'merge, cherry-pick, and rebase' '
>  	init_repos &&
>  
> -	for OPERATION in "merge -m merge" cherry-pick rebase
> +	for OPERATION in "merge -m merge" cherry-pick "rebase --apply" "rebase --merge"

I spoke too soon. On my machine, the 'git rebase --apply' tests fail
because of some verbose output that does not match across the full
and sparse cases. Using "rebase -q --apply" works for me.

Thanks,
-Stolee



* Re: [PATCH v2] sparse index: fix use-after-free bug in cache_tree_verify()
  2021-10-07 13:35   ` Derrick Stolee
@ 2021-10-07 14:59     ` Phillip Wood
  0 siblings, 0 replies; 26+ messages in thread
From: Phillip Wood @ 2021-10-07 14:59 UTC (permalink / raw)
  To: Derrick Stolee, Phillip Wood via GitGitGadget, git
  Cc: Derrick Stolee, René Scharfe, Elijah Newren,
	Johannes Schindelin, Junio C Hamano, Phillip Wood

Hi Stolee

On 07/10/2021 14:35, Derrick Stolee wrote:
> On 10/7/2021 5:50 AM, Phillip Wood via GitGitGadget wrote:
>> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>>
>> In a sparse index it is possible for the tree that is being verified
>> to be freed while it is being verified. This happens when the index is
>> sparse but the cache tree is not and index_name_pos() looks up a path
>> from the cache tree that is a descendant of a sparse index entry. That
>> triggers a call to ensure_full_index() which frees the cache tree that
>> is being verified.  Carrying on trying to verify the tree after this
>> results in a use-after-free bug. Instead restart the verification if a
>> sparse index is converted to a full index. This bug is triggered by a
>> call to reset_head() in "git rebase --apply". Thanks to René Scharfe
>> and Derick Stolee for their help analyzing the problem.
> 
> nit: s/Derick/Derrick/

Sorry. Maybe Junio can tweak that when he applies the patch; if not,
I'll fix it.

> Otherwise, this version looks good to me. Thanks for putting the last
> bit of polish on it.
> 
> I'm taking this patch into our microsoft/git fork as we speak [1].
> 
> [1] https://github.com/microsoft/git/pull/439

That's nice to know, thanks.

Phillip


* Re: [PATCH v2] sparse index: fix use-after-free bug in cache_tree_verify()
  2021-10-07 13:53   ` Derrick Stolee
@ 2021-10-07 15:05     ` Phillip Wood
  2021-10-07 15:44       ` Derrick Stolee
  0 siblings, 1 reply; 26+ messages in thread
From: Phillip Wood @ 2021-10-07 15:05 UTC (permalink / raw)
  To: Derrick Stolee, Phillip Wood via GitGitGadget, git
  Cc: Derrick Stolee, René Scharfe, Elijah Newren,
	Johannes Schindelin, Junio C Hamano, Phillip Wood

Hi Stolee

On 07/10/2021 14:53, Derrick Stolee wrote:
> On 10/7/2021 5:50 AM, Phillip Wood via GitGitGadget wrote:
>> From: Phillip Wood <phillip.wood@dunelm.org.uk>
> ...
>> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
>> index 886e78715fe..85d5279b33c 100755
>> --- a/t/t1092-sparse-checkout-compatibility.sh
>> +++ b/t/t1092-sparse-checkout-compatibility.sh
>> @@ -484,7 +484,7 @@ test_expect_success 'checkout and reset (mixed) [sparse]' '
>>   test_expect_success 'merge, cherry-pick, and rebase' '
>>   	init_repos &&
>>   
>> -	for OPERATION in "merge -m merge" cherry-pick rebase
>> +	for OPERATION in "merge -m merge" cherry-pick "rebase --apply" "rebase --merge"
> 
> I spoke too soon. On my machine, the 'git rebase --apply' tests fail
> because of some verbose output that does not match across the full
> and sparse cases. Using "rebase -q --apply" works for me.

Oh, that's strange, the CI tests pass on gitgitgadget and that script 
passes locally for me. Do you know what the output is that does not match?

Best Wishes

Phillip

> Thanks,
> -Stolee
> 


* Re: [PATCH v2] sparse index: fix use-after-free bug in cache_tree_verify()
  2021-10-07 15:05     ` Phillip Wood
@ 2021-10-07 15:44       ` Derrick Stolee
  2021-10-07 17:59         ` Phillip Wood
  0 siblings, 1 reply; 26+ messages in thread
From: Derrick Stolee @ 2021-10-07 15:44 UTC (permalink / raw)
  To: phillip.wood, Phillip Wood via GitGitGadget, git
  Cc: Derrick Stolee, René Scharfe, Elijah Newren,
	Johannes Schindelin, Junio C Hamano

On 10/7/2021 11:05 AM, Phillip Wood wrote:
> Hi Stolee
> 
> On 07/10/2021 14:53, Derrick Stolee wrote:
>> On 10/7/2021 5:50 AM, Phillip Wood via GitGitGadget wrote:
>>> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>> ...
>>> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
>>> index 886e78715fe..85d5279b33c 100755
>>> --- a/t/t1092-sparse-checkout-compatibility.sh
>>> +++ b/t/t1092-sparse-checkout-compatibility.sh
>>> @@ -484,7 +484,7 @@ test_expect_success 'checkout and reset (mixed) [sparse]' '
>>>   test_expect_success 'merge, cherry-pick, and rebase' '
>>>       init_repos &&
>>>   -    for OPERATION in "merge -m merge" cherry-pick rebase
>>> +    for OPERATION in "merge -m merge" cherry-pick "rebase --apply" "rebase --merge"
>>
>> I spoke too soon. On my machine, the 'git rebase --apply' tests fail
>> because of some verbose output that does not match across the full
>> and sparse cases. Using "rebase -q --apply" works for me.
> 
> Oh, that's strange, the CI tests pass on gitgitgadget and that script passes locally for me. Do you know what the output is that does not match?

It's entirely possible that it's something in git-for-windows/git or
microsoft/git that is causing the difference:

+ diff -u full-checkout-out sparse-checkout-out
--- full-checkout-out	2021-10-07 13:37:00.475394970 +0000
+++ sparse-checkout-out	2021-10-07 13:37:00.531396095 +0000
@@ -1,3 +1,10 @@
 First, rewinding head to replay your work on top of it...
 Applying: update folder1
+Using index info to reconstruct a base tree...
+Falling back to patching base and 3-way merge...
+Merging:
+e1886b3 update folder2
+virtual update folder1
+found 1 common ancestor:
+virtual b4ad7e16921c16e36f1d5d45ea4fa186efa8422a
 Applying: update deep
+ return 1
error: last command exited with $?=1

[1] https://github.com/microsoft/git/runs/3827705316?check_suite_focus=true#step:5:10469

Thanks,
-Stolee


* Re: [PATCH v2] sparse index: fix use-after-free bug in cache_tree_verify()
  2021-10-07 15:44       ` Derrick Stolee
@ 2021-10-07 17:59         ` Phillip Wood
  0 siblings, 0 replies; 26+ messages in thread
From: Phillip Wood @ 2021-10-07 17:59 UTC (permalink / raw)
  To: Derrick Stolee, phillip.wood, Phillip Wood via GitGitGadget, git
  Cc: Derrick Stolee, René Scharfe, Elijah Newren,
	Johannes Schindelin, Junio C Hamano

Hi Stolee

On 07/10/2021 16:44, Derrick Stolee wrote:
> On 10/7/2021 11:05 AM, Phillip Wood wrote:
>> Hi Stolee
>>
>> On 07/10/2021 14:53, Derrick Stolee wrote:
>>> On 10/7/2021 5:50 AM, Phillip Wood via GitGitGadget wrote:
>>>> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>>> ...
>>>> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
>>>> index 886e78715fe..85d5279b33c 100755
>>>> --- a/t/t1092-sparse-checkout-compatibility.sh
>>>> +++ b/t/t1092-sparse-checkout-compatibility.sh
>>>> @@ -484,7 +484,7 @@ test_expect_success 'checkout and reset (mixed) [sparse]' '
>>>>    test_expect_success 'merge, cherry-pick, and rebase' '
>>>>        init_repos &&
>>>>    -    for OPERATION in "merge -m merge" cherry-pick rebase
>>>> +    for OPERATION in "merge -m merge" cherry-pick "rebase --apply" "rebase --merge"
>>>
>>> I spoke too soon. On my machine, the 'git rebase --apply' tests fail
>>> because of some verbose output that does not match across the full
>>> and sparse cases. Using "rebase -q --apply" works for me.
>>
>> Oh, that's strange, the CI tests pass on gitgitgadget and that script passes locally for me. Do you know what the output is that does not match?
> 
> It's entirely possible that it's something in git-for-windows/git or
> microsoft/git that is causing the difference:

Yes, if I apply the hunk below from 044d9fdaeb ("sparse-checkout:
avoid writing entries with the skip-worktree bit", 2017-03-01), which
is in microsoft/vfs-2.33.0, on top of my fix then I see the same
failure. It looks like this change makes the apply backend fall back
to a three-way merge where a simple patch application succeeded
before. Adding "-q" to the test feels like a bit of a hack, but it's
probably the best we can do; at least it still catches any crashes.

Best Wishes

Phillip

diff --git a/apply.c b/apply.c
index 43a0aebf4e..4c1ca6d360 100644
--- a/apply.c
+++ b/apply.c
@@ -3346,6 +3345,24 @@ static int checkout_target(struct index_state *istate,
  {
         struct checkout costate = CHECKOUT_INIT;
  
+       /*
+        * Do not checkout the entry if the skipworktree bit is set
+        *
+        * Both callers of this method (check_preimage and load_current)
+        * check for the existance of the file before calling this
+        * method so we know that the file doesn't exist at this point
+        * and we don't need to perform that check again here.
+        * We just need to check the skip-worktree and return.
+        *
+        * This is to prevent git from creating a file in the
+        * working directory that has the skip-worktree bit on,
+        * then updating the index from the patch and not keeping
+        * the working directory version up to date with what it
+        * changed the index version to be.
+        */
+       if (ce_skip_worktree(ce))
+               return 0;
+
         costate.refresh_cache = 1;
         costate.istate = istate;
         if (checkout_entry(ce, &costate, NULL, NULL) ||


> + diff -u full-checkout-out sparse-checkout-out
> --- full-checkout-out	2021-10-07 13:37:00.475394970 +0000
> +++ sparse-checkout-out	2021-10-07 13:37:00.531396095 +0000
> @@ -1,3 +1,10 @@
>   First, rewinding head to replay your work on top of it...
>   Applying: update folder1
> +Using index info to reconstruct a base tree...
> +Falling back to patching base and 3-way merge...
> +Merging:
> +e1886b3 update folder2
> +virtual update folder1
> +found 1 common ancestor:
> +virtual b4ad7e16921c16e36f1d5d45ea4fa186efa8422a
>   Applying: update deep
> + return 1
> error: last command exited with $?=1
> 
> [1] https://github.com/microsoft/git/runs/3827705316?check_suite_focus=true#step:5:10469
> 
> Thanks,
> -Stolee
> 



* [PATCH v3] sparse index: fix use-after-free bug in cache_tree_verify()
  2021-10-07  9:50 ` [PATCH v2] " Phillip Wood via GitGitGadget
  2021-10-07 13:35   ` Derrick Stolee
  2021-10-07 13:53   ` Derrick Stolee
@ 2021-10-07 18:07   ` Phillip Wood via GitGitGadget
  2021-10-07 21:23     ` Junio C Hamano
                       ` (2 more replies)
  2 siblings, 3 replies; 26+ messages in thread
From: Phillip Wood via GitGitGadget @ 2021-10-07 18:07 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, René Scharfe, Elijah Newren,
	Johannes Schindelin, Junio C Hamano, Derrick Stolee, Phillip Wood,
	Phillip Wood, Phillip Wood

From: Phillip Wood <phillip.wood@dunelm.org.uk>

In a sparse index it is possible for the tree that is being verified
to be freed while it is being verified. This happens when the index is
sparse but the cache tree is not and index_name_pos() looks up a path
from the cache tree that is a descendant of a sparse index entry. That
triggers a call to ensure_full_index() which frees the cache tree that
is being verified.  Carrying on trying to verify the tree after this
results in a use-after-free bug. Instead restart the verification if a
sparse index is converted to a full index. This bug is triggered by a
call to reset_head() in "git rebase --apply". Thanks to René Scharfe
and Derrick Stolee for their help analyzing the problem.

==74345==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000001b20 at pc 0x557cbe82d3a2 bp 0x7ffdfee08090 sp 0x7ffdfee08080
READ of size 4 at 0x606000001b20 thread T0
    #0 0x557cbe82d3a1 in verify_one /home/phil/src/git/cache-tree.c:863
    #1 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #2 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #3 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #4 0x557cbe830a2b in cache_tree_verify /home/phil/src/git/cache-tree.c:910
    #5 0x557cbea53741 in write_locked_index /home/phil/src/git/read-cache.c:3250
    #6 0x557cbeab7fdd in reset_head /home/phil/src/git/reset.c:87
    #7 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074
    #8 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461
    #9 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714
    #10 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781
    #11 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912
    #12 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52
    #13 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)
    #14 0x557cbe5bcb8d in _start (/home/phil/src/git/git+0x1b9b8d)

0x606000001b20 is located 0 bytes inside of 56-byte region [0x606000001b20,0x606000001b58)
freed by thread T0 here:
    #0 0x7fdd4bacff19 in __interceptor_free /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:127
    #1 0x557cbe82af60 in cache_tree_free /home/phil/src/git/cache-tree.c:35
    #2 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31
    #3 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31
    #4 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31
    #5 0x557cbeb2557a in ensure_full_index /home/phil/src/git/sparse-index.c:310
    #6 0x557cbea45c4a in index_name_stage_pos /home/phil/src/git/read-cache.c:588
    #7 0x557cbe82ce37 in verify_one /home/phil/src/git/cache-tree.c:850
    #8 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #9 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #10 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #11 0x557cbe830a2b in cache_tree_verify /home/phil/src/git/cache-tree.c:910
    #12 0x557cbea53741 in write_locked_index /home/phil/src/git/read-cache.c:3250
    #13 0x557cbeab7fdd in reset_head /home/phil/src/git/reset.c:87
    #14 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074
    #15 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461
    #16 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714
    #17 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781
    #18 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912
    #19 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52
    #20 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

previously allocated by thread T0 here:
    #0 0x7fdd4bad0459 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x557cbebc1807 in xcalloc /home/phil/src/git/wrapper.c:140
    #2 0x557cbe82b7d8 in cache_tree /home/phil/src/git/cache-tree.c:17
    #3 0x557cbe82b7d8 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:763
    #4 0x557cbe82b837 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:764
    #5 0x557cbe82b837 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:764
    #6 0x557cbe8304e1 in prime_cache_tree /home/phil/src/git/cache-tree.c:779
    #7 0x557cbeab7fa7 in reset_head /home/phil/src/git/reset.c:85
    #8 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074
    #9 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461
    #10 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714
    #11 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781
    #12 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912
    #13 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52
    #14 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
    [RFC] sparse index: fix use-after-free bug in cache_tree_verify()
    
    Changes since V2
    
     * Fixed the spelling of Stolee's name (sorry Stolee)
     * Added "-q" to the test to prevent a failure on Microsoft's fork[1]
    
    [1]
    https://lore.kernel.org/git/ebbe8616-0863-812b-e112-103680f7298b@gmail.com/
    
    Thanks for the feedback and help, here are the changes from the RFC
    
     * Updated commit message and comments to make it clear this is
       triggered by a sparse index with a full cache tree based on Stolee's
       explanation.
     * Added a comment and BUG() suggested by Junio
    
    RFC cover letter:
    
    In a sparse index it is possible for the tree that is
    being verified to be freed while it is being verified. This is an RFC as
    I'm not familiar with the cache tree code. I'm confused as to why this
    bug is triggered by the sequence
    
    unpack_trees()
    prime_cache_tree()
    write_locked_index()
    
    
    but not
    
    unpack_trees()
    write_locked_index()
    
    
    as unpack_trees() appears to update the cache tree with
    
    if (!cache_tree_fully_valid(o->result.cache_tree))
                cache_tree_update(&o->result,
                          WRITE_TREE_SILENT |
                          WRITE_TREE_REPAIR);
    
    
    and I don't understand why the cache tree from prime_cache_tree()
    results in different behavior. It concerns me that this fix is hiding
    another bug.
    
    Best Wishes
    
    Phillip

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1053%2Fphillipwood%2Fwip%2Fsparse-index-fix-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1053/phillipwood/wip/sparse-index-fix-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1053

Range-diff vs v2:

 1:  4ee972fee2e ! 1:  b3dbe02fcc3 sparse index: fix use-after-free bug in cache_tree_verify()
     @@ Commit message
          results in a use-after-free bug. Instead restart the verification if a
          sparse index is converted to a full index. This bug is triggered by a
          call to reset_head() in "git rebase --apply". Thanks to René Scharfe
     -    and Derick Stolee for their help analyzing the problem.
     +    and Derrick Stolee for their help analyzing the problem.
      
          ==74345==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000001b20 at pc 0x557cbe82d3a2 bp 0x7ffdfee08090 sp 0x7ffdfee08080
          READ of size 4 at 0x606000001b20 thread T0
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'checkout and rese
       	init_repos &&
       
      -	for OPERATION in "merge -m merge" cherry-pick rebase
     -+	for OPERATION in "merge -m merge" cherry-pick "rebase --apply" "rebase --merge"
     ++	for OPERATION in "merge -m merge" cherry-pick "rebase --apply -q" "rebase --merge"
       	do
       		test_all_match git checkout -B temp update-deep &&
       		test_all_match git $OPERATION update-folder1 &&


 cache-tree.c                             | 37 +++++++++++++++++++-----
 t/t1092-sparse-checkout-compatibility.sh |  2 +-
 2 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/cache-tree.c b/cache-tree.c
index 90919f9e345..8044e21bcf3 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -826,10 +826,17 @@ static void verify_one_sparse(struct repository *r,
 		    path->buf);
 }
 
-static void verify_one(struct repository *r,
-		       struct index_state *istate,
-		       struct cache_tree *it,
-		       struct strbuf *path)
+/*
+ * Returns:
+ *  0 - Verification completed.
+ *  1 - Restart verification - a call to ensure_full_index() freed the cache
+ *      tree that is being verified and verification needs to be restarted from
+ *      the new toplevel cache tree.
+ */
+static int verify_one(struct repository *r,
+		      struct index_state *istate,
+		      struct cache_tree *it,
+		      struct strbuf *path)
 {
 	int i, pos, len = path->len;
 	struct strbuf tree_buf = STRBUF_INIT;
@@ -837,21 +844,30 @@ static void verify_one(struct repository *r,
 
 	for (i = 0; i < it->subtree_nr; i++) {
 		strbuf_addf(path, "%s/", it->down[i]->name);
-		verify_one(r, istate, it->down[i]->cache_tree, path);
+		if (verify_one(r, istate, it->down[i]->cache_tree, path))
+			return 1;
 		strbuf_setlen(path, len);
 	}
 
 	if (it->entry_count < 0 ||
 	    /* no verification on tests (t7003) that replace trees */
 	    lookup_replace_object(r, &it->oid) != &it->oid)
-		return;
+		return 0;
 
 	if (path->len) {
+		/*
+		 * If the index is sparse and the cache tree is not
+		 * index_name_pos() may trigger ensure_full_index() which will
+		 * free the tree that is being verified.
+		 */
+		int is_sparse = istate->sparse_index;
 		pos = index_name_pos(istate, path->buf, path->len);
+		if (is_sparse && !istate->sparse_index)
+			return 1;
 
 		if (pos >= 0) {
 			verify_one_sparse(r, istate, it, path, pos);
-			return;
+			return 0;
 		}
 
 		pos = -pos - 1;
@@ -899,6 +915,7 @@ static void verify_one(struct repository *r,
 		    oid_to_hex(&new_oid), oid_to_hex(&it->oid));
 	strbuf_setlen(path, len);
 	strbuf_release(&tree_buf);
+	return 0;
 }
 
 void cache_tree_verify(struct repository *r, struct index_state *istate)
@@ -907,6 +924,10 @@ void cache_tree_verify(struct repository *r, struct index_state *istate)
 
 	if (!istate->cache_tree)
 		return;
-	verify_one(r, istate, istate->cache_tree, &path);
+	if (verify_one(r, istate, istate->cache_tree, &path)) {
+		strbuf_reset(&path);
+		if (verify_one(r, istate, istate->cache_tree, &path))
+			BUG("ensure_full_index() called twice while verifying cache tree");
+	}
 	strbuf_release(&path);
 }
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 886e78715fe..80c77bb432e 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -484,7 +484,7 @@ test_expect_success 'checkout and reset (mixed) [sparse]' '
 test_expect_success 'merge, cherry-pick, and rebase' '
 	init_repos &&
 
-	for OPERATION in "merge -m merge" cherry-pick rebase
+	for OPERATION in "merge -m merge" cherry-pick "rebase --apply -q" "rebase --merge"
 	do
 		test_all_match git checkout -B temp update-deep &&
 		test_all_match git $OPERATION update-folder1 &&

base-commit: cefe983a320c03d7843ac78e73bd513a27806845
-- 
gitgitgadget


* Re: [PATCH v3] sparse index: fix use-after-free bug in cache_tree_verify()
  2021-10-07 18:07   ` [PATCH v3] " Phillip Wood via GitGitGadget
@ 2021-10-07 21:23     ` Junio C Hamano
  2021-10-08  9:09       ` Phillip Wood
  2021-10-08  9:38     ` Bagas Sanjaya
  2021-10-16  9:07     ` [PATCH v4] " Phillip Wood via GitGitGadget
  2 siblings, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2021-10-07 21:23 UTC (permalink / raw)
  To: Phillip Wood via GitGitGadget
  Cc: git, Derrick Stolee, René Scharfe, Elijah Newren,
	Johannes Schindelin, Derrick Stolee, Phillip Wood, Phillip Wood

"Phillip Wood via GitGitGadget" <gitgitgadget@gmail.com> writes:

>      * Fixed the spelling of Stolee's name (sorry Stolee)
>      * Added "-q" to the test to prevent a failure on Microsoft's fork[1]
>     
>     [1]
>     https://lore.kernel.org/git/ebbe8616-0863-812b-e112-103680f7298b@gmail.com/

I've seen the exchange, but ...

> -	for OPERATION in "merge -m merge" cherry-pick rebase
> +	for OPERATION in "merge -m merge" cherry-pick "rebase --apply -q" "rebase --merge"
>  	do

... it looks too strange that only one of them requires a "--quiet"
option.  Is it a possibility to get whoever's fork corrected so that
it behaves sensibly without requiring the "-q" option only for the
particular rebase backend?

In the meantime, I'll queue the patch as-is (I actually queued the
previous round with namefix already).

Thanks.


* Re: [PATCH v3] sparse index: fix use-after-free bug in cache_tree_verify()
  2021-10-07 21:23     ` Junio C Hamano
@ 2021-10-08  9:09       ` Phillip Wood
  2021-10-08 18:53         ` Derrick Stolee
  2021-10-08 19:57         ` Junio C Hamano
  0 siblings, 2 replies; 26+ messages in thread
From: Phillip Wood @ 2021-10-08  9:09 UTC (permalink / raw)
  To: Junio C Hamano, Phillip Wood via GitGitGadget
  Cc: git, Derrick Stolee, René Scharfe, Elijah Newren,
	Johannes Schindelin, Derrick Stolee, Phillip Wood

On 07/10/2021 22:23, Junio C Hamano wrote:
> "Phillip Wood via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>>       * Fixed the spelling of Stolee's name (sorry Stolee)
>>       * Added "-q" to the test to prevent a failure on Microsoft's fork[1]
>>      
>>      [1]
>>      https://lore.kernel.org/git/ebbe8616-0863-812b-e112-103680f7298b@gmail.com/
> 
> I've seen the exchange, but ...
> 
>> -	for OPERATION in "merge -m merge" cherry-pick rebase
>> +	for OPERATION in "merge -m merge" cherry-pick "rebase --apply -q" "rebase --merge"
>>   	do
> 
> ... it looks too strange that only one of them requires a "--quiet"
> option.  Is it a possibility to get whoever's fork corrected so that
> it behaves sensibly without requiring the "-q" option only for the
> particular rebase backend?

The issue is caused by a patch that Microsoft is carrying that stops 
apply from creating paths with the skip-worktree bit set. As they're 
upstreaming their sparse index and checkout work I expect it will show 
up on the list sooner or later. I agree the "-q" is odd and it also 
means the test is weaker but I'm not sure what else we can do.

> In the meantime, I'll queue the patch as-is (I actually queued the
> previous round with namefix already).

Thanks

Phillip

> Thanks.
> 



* Re: [PATCH v3] sparse index: fix use-after-free bug in cache_tree_verify()
  2021-10-07 18:07   ` [PATCH v3] " Phillip Wood via GitGitGadget
  2021-10-07 21:23     ` Junio C Hamano
@ 2021-10-08  9:38     ` Bagas Sanjaya
  2021-10-14  9:40       ` Phillip Wood
  2021-10-16  9:07     ` [PATCH v4] " Phillip Wood via GitGitGadget
  2 siblings, 1 reply; 26+ messages in thread
From: Bagas Sanjaya @ 2021-10-08  9:38 UTC (permalink / raw)
  To: Phillip Wood via GitGitGadget, git
  Cc: Derrick Stolee, René Scharfe, Elijah Newren,
	Johannes Schindelin, Junio C Hamano, Derrick Stolee, Phillip Wood,
	Phillip Wood

On 08/10/21 01.07, Phillip Wood via GitGitGadget wrote:
> -static void verify_one(struct repository *r,
> -		       struct index_state *istate,
> -		       struct cache_tree *it,
> -		       struct strbuf *path)
> +/*
> + * Returns:
> + *  0 - Verification completed.
> + *  1 - Restart verification - a call to ensure_full_index() freed the cache
> + *      tree that is being verified and verification needs to be restarted from
> + *      the new toplevel cache tree.
> + */
> +static int verify_one(struct repository *r,
> +		      struct index_state *istate,
> +		      struct cache_tree *it,
> +		      struct strbuf *path)
>   {
>   	int i, pos, len = path->len;
>   	struct strbuf tree_buf = STRBUF_INIT;

What is verify_one() doing? I think it is worth mentioning in the
comment above.

-- 
An old man doll... just what I always wanted! - Clara


* Re: [PATCH v3] sparse index: fix use-after-free bug in cache_tree_verify()
  2021-10-08  9:09       ` Phillip Wood
@ 2021-10-08 18:53         ` Derrick Stolee
  2021-10-08 19:57         ` Junio C Hamano
  1 sibling, 0 replies; 26+ messages in thread
From: Derrick Stolee @ 2021-10-08 18:53 UTC (permalink / raw)
  To: phillip.wood, Junio C Hamano, Phillip Wood via GitGitGadget
  Cc: git, Derrick Stolee, René Scharfe, Elijah Newren,
	Johannes Schindelin

On 10/8/2021 5:09 AM, Phillip Wood wrote:
> On 07/10/2021 22:23, Junio C Hamano wrote:
>> "Phillip Wood via GitGitGadget" <gitgitgadget@gmail.com> writes:
>>
>>>       * Fixed the spelling of Stolee's name (sorry Stolee)
>>>       * Added "-q" to the test to prevent a failure on Microsoft's fork[1]
>>>           [1]
>>>      https://lore.kernel.org/git/ebbe8616-0863-812b-e112-103680f7298b@gmail.com/
>>
>> I've seen the exchange, but ...
>>
>>> -    for OPERATION in "merge -m merge" cherry-pick rebase
>>> +    for OPERATION in "merge -m merge" cherry-pick "rebase --apply -q" "rebase --merge"
>>>       do
>>
>> ... it looks too strange that only one of them requires a "--quiet"
>> option.  Is it a possibility to get whoever's fork corrected so that
>> it behaves sensibly without requiring the "-q" option only for the
>> particular rebase backend?
> 
> The issue is caused by a patch that Microsoft is carrying that stops apply from creating paths with the skip-worktree bit set. As they're upstreaming their sparse index and checkout work I expect it will show up on the list sooner or later. I agree the "-q" is odd and it also means the test is weaker but I'm not sure what else we can do.

That particular patch is old and is due to some interactions with
how VFS for Git (ab)uses the skip-worktree bit. I'm not sure it will
ever come upstream. It is probably very much like a recent example [1]
that we tried to upstream only to realize that it should be replaced
with something better.

[1] https://lore.kernel.org/git/65905bf4e001118e8b9ced95c1bcecbacb6334ac.1633013461.git.gitgitgadget@gmail.com/

I'm fine to leave the `-q` out of this patch and I can add it myself
when we take this into microsoft/git. That can also motivate me to
rethink that patch.

Thanks,
-Stolee


* Re: [PATCH v3] sparse index: fix use-after-free bug in cache_tree_verify()
  2021-10-08  9:09       ` Phillip Wood
  2021-10-08 18:53         ` Derrick Stolee
@ 2021-10-08 19:57         ` Junio C Hamano
  2021-10-14 13:34           ` Phillip Wood
  1 sibling, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2021-10-08 19:57 UTC (permalink / raw)
  To: Phillip Wood
  Cc: Phillip Wood via GitGitGadget, git, Derrick Stolee,
	René Scharfe, Elijah Newren, Johannes Schindelin,
	Derrick Stolee, Phillip Wood

Phillip Wood <phillip.wood123@gmail.com> writes:

> On 07/10/2021 22:23, Junio C Hamano wrote:
>> "Phillip Wood via GitGitGadget" <gitgitgadget@gmail.com> writes:
>> 
>>>       * Fixed the spelling of Stolee's name (sorry Stolee)
>>>       * Added "-q" to the test to prevent a failure on Microsoft's fork[1]
>>>           [1]
>>>      https://lore.kernel.org/git/ebbe8616-0863-812b-e112-103680f7298b@gmail.com/
>> I've seen the exchange, but ...
>> 
>>> -	for OPERATION in "merge -m merge" cherry-pick rebase
>>> +	for OPERATION in "merge -m merge" cherry-pick "rebase --apply -q" "rebase --merge"
>>>   	do
>> ... it looks too strange that only one of them requires a "--quiet"
>> option.  Is it a possibility to get whoever's fork corrected so that
>> it behaves sensibly without requiring the "-q" option only for the
>> particular rebase backend?
>
> The issue is caused by a patch that Microsoft is carrying that stops
> apply from creating paths with the skip-worktree bit set. As they're 
> upstreaming their sparse index and checkout work I expect it will show
> up on the list sooner or later. I agree the "-q" is odd and it also 
> means the test is weaker but I'm not sure what else we can do.

Perhaps passing "-q" to the other variant of "rebase" would make it
clear that (1) we do not want to worry about traces involved in the
verbose message generation and (2) there is nothing fishy going on
in only one of the "rebase" backends.



* Re: [PATCH v3] sparse index: fix use-after-free bug in cache_tree_verify()
  2021-10-08  9:38     ` Bagas Sanjaya
@ 2021-10-14  9:40       ` Phillip Wood
  0 siblings, 0 replies; 26+ messages in thread
From: Phillip Wood @ 2021-10-14  9:40 UTC (permalink / raw)
  To: Bagas Sanjaya, Phillip Wood via GitGitGadget, git
  Cc: Derrick Stolee, René Scharfe, Elijah Newren,
	Johannes Schindelin, Junio C Hamano, Derrick Stolee, Phillip Wood

Hi Bagas

On 08/10/2021 10:38, Bagas Sanjaya wrote:
> On 08/10/21 01.07, Phillip Wood via GitGitGadget wrote:
>> -static void verify_one(struct repository *r,
>> -               struct index_state *istate,
>> -               struct cache_tree *it,
>> -               struct strbuf *path)
>> +/*
>> + * Returns:
>> + *  0 - Verification completed.
>> + *  1 - Restart verification - a call to ensure_full_index() freed the cache
>> + *      tree that is being verified and verification needs to be restarted from
>> + *      the new toplevel cache tree.
>> + */
>> +static int verify_one(struct repository *r,
>> +              struct index_state *istate,
>> +              struct cache_tree *it,
>> +              struct strbuf *path)
>>   {
>>       int i, pos, len = path->len;
>>       struct strbuf tree_buf = STRBUF_INIT;
> 
> What is verify_one() doing? I think it is worth mentioning in the
> comment above.

I think it's pretty obvious if you read the code rather than my patch. 
It is a common pattern in git that a function with "one" in the name is 
a helper for another similarly named function without the "one". In this 
case verify_one() is a recursive helper for cache_tree_verify().
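
A minimal sketch of that naming pattern, using a hypothetical toy tree
rather than git's cache tree (struct node, sum_one() and tree_sum() are
made-up names for illustration only): the "_one" function does the
recursive work on a single node, and the function without "one" is the
entry point that callers use.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy tree; this is not git's struct cache_tree. */
struct node {
	int value;
	size_t nr;           /* number of children */
	struct node **down;  /* child pointers */
};

/* Recursive helper: sums one node and everything below it. */
static int sum_one(const struct node *n)
{
	int total;
	size_t i;

	assert(n != NULL);
	total = n->value;
	for (i = 0; i < n->nr; i++)
		total += sum_one(n->down[i]);
	return total;
}

/* Entry point: handles the trivial case, then delegates to the helper. */
int tree_sum(const struct node *root)
{
	if (!root)
		return 0;
	return sum_one(root);
}
```

Callers only ever see tree_sum(); the recursion and its bookkeeping stay
private to the helper, which is roughly the split between
cache_tree_verify() and verify_one().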

Best Wishes

Phillip


* Re: [PATCH v3] sparse index: fix use-after-free bug in cache_tree_verify()
  2021-10-08 19:57         ` Junio C Hamano
@ 2021-10-14 13:34           ` Phillip Wood
  2021-10-14 16:42             ` Junio C Hamano
  0 siblings, 1 reply; 26+ messages in thread
From: Phillip Wood @ 2021-10-14 13:34 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Phillip Wood via GitGitGadget, git, Derrick Stolee,
	René Scharfe, Elijah Newren, Johannes Schindelin,
	Derrick Stolee, Phillip Wood

Hi Junio

On 08/10/2021 20:57, Junio C Hamano wrote:
> Phillip Wood <phillip.wood123@gmail.com> writes:
> 
>> On 07/10/2021 22:23, Junio C Hamano wrote:
>>> "Phillip Wood via GitGitGadget" <gitgitgadget@gmail.com> writes:
>>>
>>>>        * Fixed the spelling of Stolee's name (sorry Stolee)
>>>>        * Added "-q" to the test to prevent a failure on Microsoft's fork[1]
>>>>            [1]
>>>>       https://lore.kernel.org/git/ebbe8616-0863-812b-e112-103680f7298b@gmail.com/
>>> I've seen the exchange, but ...
>>>
>>>> -	for OPERATION in "merge -m merge" cherry-pick rebase
>>>> +	for OPERATION in "merge -m merge" cherry-pick "rebase --apply -q" "rebase --merge"
>>>>    	do
>>> ... it looks too strange that only one of them requires a "--quiet"
>>> option.  Is it a possibility to get whoever's fork corrected so that
>>> it behaves sensibly without requiring the "-q" option only for the
>>> particular rebase backend?
>>
>> The issue is caused by a patch that Microsoft is carrying that stops
>> apply from creating paths with the skip-worktree bit set. As they're
>> upstreaming their sparse index and checkout work I expect it will show
>> up on the list sooner or later. I agree the "-q" is odd and it also
>> means the test is weaker but I'm not sure what else we can do.
> 
> Perhaps passing "-q" to the other variant of "rebase" would make it
> clear that (1) we do not want to worry about traces involved in the
> verbose message generation and (2) there is nothing fishy going on
> in only one of the "rebase" backends.

I'm not sure about that. There are really three levels of output from 
rebase - quiet, normal and verbose. I think passing "-q" suppresses 
virtually all the output - there is no indication of which commits have 
been picked. As the test appears to be comparing the output of the command
for the sparse and non-sparse case as a proxy for "it behaves the same
for sparse and non-sparse checkouts/indexes" passing "-q" to rebase 
weakens the test considerably. Stolee indicated [1] that he is happy for 
us to drop the "-q" for the "--apply" case so I'd be inclined to go back 
to your corrected version of V2.

Best Wishes

Phillip

[1] 
https://lore.kernel.org/git/e281c2e2-2044-1a11-e2bc-5ab3ee92c300@gmail.com/


* Re: [PATCH v3] sparse index: fix use-after-free bug in cache_tree_verify()
  2021-10-14 13:34           ` Phillip Wood
@ 2021-10-14 16:42             ` Junio C Hamano
  0 siblings, 0 replies; 26+ messages in thread
From: Junio C Hamano @ 2021-10-14 16:42 UTC (permalink / raw)
  To: Phillip Wood
  Cc: Phillip Wood via GitGitGadget, git, Derrick Stolee,
	René Scharfe, Elijah Newren, Johannes Schindelin,
	Derrick Stolee, Phillip Wood

Phillip Wood <phillip.wood123@gmail.com> writes:

> I'm not sure about that. There are really three levels of output from
> rebase - quiet, normal and verbose. I think passing "-q" suppresses 
> virtually all the output - there is no indication of which commits
> have been picked. As the test appears to be comparing the output of the
> command for the sparse and non-sparse case as a proxy for "it behaves
> the same for sparse and non-sparse checkouts/indexes" passing "-q" to
> rebase weakens the test considerably.

True.  Also, because the behaviour of "rebase" using different
backends is sufficiently different, I no longer consider it a funny
inconsistency that one backend has to use "-q" while the other
doesn't.

> Stolee indicated [1] that he is
> happy for us to drop the "-q" for the "--apply" case so I'd be
> inclined to go back to your corrected version of V2.

OK.  Can we have a v4 that is identical to "corrected" v2, then,
please?  That's easier than having to dig v2 up and remember and
apply the "correction" ;-).

Thanks.


* [PATCH v4] sparse index: fix use-after-free bug in cache_tree_verify()
  2021-10-07 18:07   ` [PATCH v3] " Phillip Wood via GitGitGadget
  2021-10-07 21:23     ` Junio C Hamano
  2021-10-08  9:38     ` Bagas Sanjaya
@ 2021-10-16  9:07     ` Phillip Wood via GitGitGadget
  2021-10-17  5:38       ` Junio C Hamano
  2 siblings, 1 reply; 26+ messages in thread
From: Phillip Wood via GitGitGadget @ 2021-10-16  9:07 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, René Scharfe, Elijah Newren,
	Johannes Schindelin, Junio C Hamano, Derrick Stolee, Phillip Wood,
	Bagas Sanjaya, Phillip Wood, Phillip Wood

From: Phillip Wood <phillip.wood@dunelm.org.uk>

In a sparse index it is possible for the tree that is being verified
to be freed while it is being verified. This happens when the index is
sparse but the cache tree is not and index_name_pos() looks up a path
from the cache tree that is a descendant of a sparse index entry. That
triggers a call to ensure_full_index() which frees the cache tree that
is being verified.  Carrying on trying to verify the tree after this
results in a use-after-free bug. Instead restart the verification if a
sparse index is converted to a full index. This bug is triggered by a
call to reset_head() in "git rebase --apply". Thanks to René Scharfe
and Derrick Stolee for their help analyzing the problem.
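
The restart can be pictured with a toy model. This is only a sketch
under stated assumptions: every toy_* name below is hypothetical, the
tree has a single child per node to keep it short, and none of it is
git's code. A lookup on the deepest node stands in for index_name_pos()
expanding a sparse index and replacing the tree, and the walker reports
1 so that every caller unwinds without touching the freed nodes.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy model; none of these names are git's. */
struct toy_tree {
	int needs_lookup;        /* a lookup on this node expands a sparse index */
	struct toy_tree *child;  /* single child keeps the sketch short */
};

struct toy_index {
	int sparse;
	struct toy_tree *tree;
};

/* Builds a chain of 'depth' nodes; only the deepest one gets the flag. */
static struct toy_tree *toy_tree_new(int depth, int needs_lookup)
{
	struct toy_tree *t = calloc(1, sizeof(*t));

	assert(t);
	if (depth > 1)
		t->child = toy_tree_new(depth - 1, needs_lookup);
	else
		t->needs_lookup = needs_lookup;
	return t;
}

static void toy_tree_free(struct toy_tree *t)
{
	if (!t)
		return;
	toy_tree_free(t->child);
	free(t);
}

/* Stand-in for a lookup that may expand the index and replace its tree. */
static void toy_lookup(struct toy_index *idx, const struct toy_tree *t)
{
	if (idx->sparse && t->needs_lookup) {
		toy_tree_free(idx->tree);        /* 't' dangles from here on */
		idx->tree = toy_tree_new(3, 0);
		idx->sparse = 0;
	}
}

/* Returns 0 when done, 1 when the tree was replaced and the walk must restart. */
static int toy_verify_one(struct toy_index *idx, struct toy_tree *t)
{
	int was_sparse;

	if (t->child && toy_verify_one(idx, t->child))
		return 1;                        /* unwind without touching 't' */

	was_sparse = idx->sparse;
	toy_lookup(idx, t);
	if (was_sparse && !idx->sparse)
		return 1;                        /* 't' has been freed */
	return 0;
}

/* Returns how many passes were needed (1 or 2). */
int toy_verify(struct toy_index *idx)
{
	if (!toy_verify_one(idx, idx->tree))
		return 1;
	if (toy_verify_one(idx, idx->tree))
		abort();                         /* a second expansion cannot happen here */
	return 2;
}
```

The flag is sampled before the lookup and compared afterwards, so the
walker never has to dereference a node to find out whether it is still
valid. The second pass starts from the index's current tree pointer
rather than from anything saved during the first pass.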

==74345==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000001b20 at pc 0x557cbe82d3a2 bp 0x7ffdfee08090 sp 0x7ffdfee08080
READ of size 4 at 0x606000001b20 thread T0
    #0 0x557cbe82d3a1 in verify_one /home/phil/src/git/cache-tree.c:863
    #1 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #2 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #3 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #4 0x557cbe830a2b in cache_tree_verify /home/phil/src/git/cache-tree.c:910
    #5 0x557cbea53741 in write_locked_index /home/phil/src/git/read-cache.c:3250
    #6 0x557cbeab7fdd in reset_head /home/phil/src/git/reset.c:87
    #7 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074
    #8 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461
    #9 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714
    #10 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781
    #11 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912
    #12 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52
    #13 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)
    #14 0x557cbe5bcb8d in _start (/home/phil/src/git/git+0x1b9b8d)

0x606000001b20 is located 0 bytes inside of 56-byte region [0x606000001b20,0x606000001b58)
freed by thread T0 here:
    #0 0x7fdd4bacff19 in __interceptor_free /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:127
    #1 0x557cbe82af60 in cache_tree_free /home/phil/src/git/cache-tree.c:35
    #2 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31
    #3 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31
    #4 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31
    #5 0x557cbeb2557a in ensure_full_index /home/phil/src/git/sparse-index.c:310
    #6 0x557cbea45c4a in index_name_stage_pos /home/phil/src/git/read-cache.c:588
    #7 0x557cbe82ce37 in verify_one /home/phil/src/git/cache-tree.c:850
    #8 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #9 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #10 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #11 0x557cbe830a2b in cache_tree_verify /home/phil/src/git/cache-tree.c:910
    #12 0x557cbea53741 in write_locked_index /home/phil/src/git/read-cache.c:3250
    #13 0x557cbeab7fdd in reset_head /home/phil/src/git/reset.c:87
    #14 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074
    #15 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461
    #16 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714
    #17 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781
    #18 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912
    #19 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52
    #20 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

previously allocated by thread T0 here:
    #0 0x7fdd4bad0459 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x557cbebc1807 in xcalloc /home/phil/src/git/wrapper.c:140
    #2 0x557cbe82b7d8 in cache_tree /home/phil/src/git/cache-tree.c:17
    #3 0x557cbe82b7d8 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:763
    #4 0x557cbe82b837 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:764
    #5 0x557cbe82b837 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:764
    #6 0x557cbe8304e1 in prime_cache_tree /home/phil/src/git/cache-tree.c:779
    #7 0x557cbeab7fa7 in reset_head /home/phil/src/git/reset.c:85
    #8 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074
    #9 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461
    #10 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714
    #11 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781
    #12 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912
    #13 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52
    #14 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
    [RFC] sparse index: fix use-after-free bug in cache_tree_verify()
    
    Changes since V3
    
     * removed "-q" from the test [1]. This is the same as V2 with a typo
       fixed in the commit message
    
    [1] https://lore.kernel.org/git/e281c2e2-2044-1a11-e2bc-5ab3ee92c300@gmail.com/
    
    Changes since V2
    
     * Fixed the spelling of Stolee's name (sorry Stolee)
     * Added "-q" to the test to prevent a failure on Microsoft's fork[1]
    
    [1]
    https://lore.kernel.org/git/ebbe8616-0863-812b-e112-103680f7298b@gmail.com/
    
    Thanks for the feedback and help, here are the changes from the RFC
    
     * Updated commit message and comments to make it clear this is
       triggered by a sparse index with a full cache tree based on Stolee's
       explanation.
     * Added a comment and BUG() suggested by Junio
    
    RFC cover letter: In a sparse index it is possible for the tree that is
    being verified to be freed while it is being verified. This is an RFC as
    I'm not familiar with the cache tree code. I'm confused as to why this
    bug is triggered by the sequence
    
    unpack_trees()
    prime_cache_tree()
    write_locked_index()
    
    
    but not
    
    unpack_trees()
    write_locked_index()
    
    
    as unpack_trees() appears to update the cache tree with
    
    if (!cache_tree_fully_valid(o->result.cache_tree))
                cache_tree_update(&o->result,
                          WRITE_TREE_SILENT |
                          WRITE_TREE_REPAIR);
    
    
    and I don't understand why the cache tree from prime_cache_tree()
    results in different behavior. It concerns me that this fix is hiding
    another bug.
    
    Best Wishes
    
    Phillip

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1053%2Fphillipwood%2Fwip%2Fsparse-index-fix-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1053/phillipwood/wip/sparse-index-fix-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1053

Range-diff vs v3:

 1:  b3dbe02fcc3 ! 1:  ea7e93e1a47 sparse index: fix use-after-free bug in cache_tree_verify()
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'checkout and rese
       	init_repos &&
       
      -	for OPERATION in "merge -m merge" cherry-pick rebase
     -+	for OPERATION in "merge -m merge" cherry-pick "rebase --apply -q" "rebase --merge"
     ++	for OPERATION in "merge -m merge" cherry-pick "rebase --apply" "rebase --merge"
       	do
       		test_all_match git checkout -B temp update-deep &&
       		test_all_match git $OPERATION update-folder1 &&


 cache-tree.c                             | 37 +++++++++++++++++++-----
 t/t1092-sparse-checkout-compatibility.sh |  2 +-
 2 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/cache-tree.c b/cache-tree.c
index 90919f9e345..8044e21bcf3 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -826,10 +826,17 @@ static void verify_one_sparse(struct repository *r,
 		    path->buf);
 }
 
-static void verify_one(struct repository *r,
-		       struct index_state *istate,
-		       struct cache_tree *it,
-		       struct strbuf *path)
+/*
+ * Returns:
+ *  0 - Verification completed.
+ *  1 - Restart verification - a call to ensure_full_index() freed the cache
+ *      tree that is being verified and verification needs to be restarted from
+ *      the new toplevel cache tree.
+ */
+static int verify_one(struct repository *r,
+		      struct index_state *istate,
+		      struct cache_tree *it,
+		      struct strbuf *path)
 {
 	int i, pos, len = path->len;
 	struct strbuf tree_buf = STRBUF_INIT;
@@ -837,21 +844,30 @@ static void verify_one(struct repository *r,
 
 	for (i = 0; i < it->subtree_nr; i++) {
 		strbuf_addf(path, "%s/", it->down[i]->name);
-		verify_one(r, istate, it->down[i]->cache_tree, path);
+		if (verify_one(r, istate, it->down[i]->cache_tree, path))
+			return 1;
 		strbuf_setlen(path, len);
 	}
 
 	if (it->entry_count < 0 ||
 	    /* no verification on tests (t7003) that replace trees */
 	    lookup_replace_object(r, &it->oid) != &it->oid)
-		return;
+		return 0;
 
 	if (path->len) {
+		/*
+		 * If the index is sparse and the cache tree is not
+		 * index_name_pos() may trigger ensure_full_index() which will
+		 * free the tree that is being verified.
+		 */
+		int is_sparse = istate->sparse_index;
 		pos = index_name_pos(istate, path->buf, path->len);
+		if (is_sparse && !istate->sparse_index)
+			return 1;
 
 		if (pos >= 0) {
 			verify_one_sparse(r, istate, it, path, pos);
-			return;
+			return 0;
 		}
 
 		pos = -pos - 1;
@@ -899,6 +915,7 @@ static void verify_one(struct repository *r,
 		    oid_to_hex(&new_oid), oid_to_hex(&it->oid));
 	strbuf_setlen(path, len);
 	strbuf_release(&tree_buf);
+	return 0;
 }
 
 void cache_tree_verify(struct repository *r, struct index_state *istate)
@@ -907,6 +924,10 @@ void cache_tree_verify(struct repository *r, struct index_state *istate)
 
 	if (!istate->cache_tree)
 		return;
-	verify_one(r, istate, istate->cache_tree, &path);
+	if (verify_one(r, istate, istate->cache_tree, &path)) {
+		strbuf_reset(&path);
+		if (verify_one(r, istate, istate->cache_tree, &path))
+			BUG("ensure_full_index() called twice while verifying cache tree");
+	}
 	strbuf_release(&path);
 }
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 886e78715fe..85d5279b33c 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -484,7 +484,7 @@ test_expect_success 'checkout and reset (mixed) [sparse]' '
 test_expect_success 'merge, cherry-pick, and rebase' '
 	init_repos &&
 
-	for OPERATION in "merge -m merge" cherry-pick rebase
+	for OPERATION in "merge -m merge" cherry-pick "rebase --apply" "rebase --merge"
 	do
 		test_all_match git checkout -B temp update-deep &&
 		test_all_match git $OPERATION update-folder1 &&

base-commit: cefe983a320c03d7843ac78e73bd513a27806845
-- 
gitgitgadget


* Re: [PATCH v4] sparse index: fix use-after-free bug in cache_tree_verify()
  2021-10-16  9:07     ` [PATCH v4] " Phillip Wood via GitGitGadget
@ 2021-10-17  5:38       ` Junio C Hamano
  2021-10-17 19:35         ` Derrick Stolee
  2021-10-18  9:37         ` Phillip Wood
  0 siblings, 2 replies; 26+ messages in thread
From: Junio C Hamano @ 2021-10-17  5:38 UTC (permalink / raw)
  To: Phillip Wood via GitGitGadget
  Cc: git, Derrick Stolee, René Scharfe, Elijah Newren,
	Johannes Schindelin, Derrick Stolee, Phillip Wood, Bagas Sanjaya,
	Phillip Wood

"Phillip Wood via GitGitGadget" <gitgitgadget@gmail.com> writes:

>     [RFC] sparse index: fix use-after-free bug in cache_tree_verify()
>     
>     Changes since V3
>     
>      * removed "-q" from the test [1]. This is the same as V2 with a typo
>        fixed in the commit message
>     
>     [1] https://lore.kernel.org/git/e281c2e2-2044-1a11-e2bc-5ab3ee92c300@gmail.com/

Thanks.  Unfortunately I've already merged the previous version on
the 11th, so I took the liberty of turning this round into an
incremental.  How does this look?

----- >8 --------- >8 --------- >8 --------- >8 -----
From: Phillip Wood <phillip.wood@dunelm.org.uk>
Date: Sat, 16 Oct 2021 09:07:09 +0000
Subject: [PATCH] t1092: run "rebase --apply" without "-q" in the test

We run a few Git subcommands and make sure they produce identical
results with and without sparse-index.  To this set of subcommands,
an earlier commit added "rebase --apply", but did so with the "-q"
option, in order to work around a breakge caused by a version used
at Microsoft with some unreleased changes.

Because we would want to make sure the commands produce indentical
results, including reports given to the output that lists which
commits were picked, use of "-q" loses too much interesting
information.  Let's drop "-q" from the command invocation and
revisit the issue when the problematic changes are upstreamed.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t1092-sparse-checkout-compatibility.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 80c77bb432..85d5279b33 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -484,7 +484,7 @@ test_expect_success 'checkout and reset (mixed) [sparse]' '
 test_expect_success 'merge, cherry-pick, and rebase' '
 	init_repos &&
 
-	for OPERATION in "merge -m merge" cherry-pick "rebase --apply -q" "rebase --merge"
+	for OPERATION in "merge -m merge" cherry-pick "rebase --apply" "rebase --merge"
 	do
 		test_all_match git checkout -B temp update-deep &&
 		test_all_match git $OPERATION update-folder1 &&
-- 
2.33.1-877-g9d049ddf90


* Re: [PATCH v4] sparse index: fix use-after-free bug in cache_tree_verify()
  2021-10-17  5:38       ` Junio C Hamano
@ 2021-10-17 19:35         ` Derrick Stolee
  2021-10-18  9:37         ` Phillip Wood
  1 sibling, 0 replies; 26+ messages in thread
From: Derrick Stolee @ 2021-10-17 19:35 UTC (permalink / raw)
  To: Junio C Hamano, Phillip Wood via GitGitGadget
  Cc: git, Derrick Stolee, René Scharfe, Elijah Newren,
	Johannes Schindelin, Phillip Wood, Bagas Sanjaya, Phillip Wood

On 10/17/2021 1:38 AM, Junio C Hamano wrote:
> "Phillip Wood via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>>     [RFC] sparse index: fix use-after-free bug in cache_tree_verify()
>>     
>>     Changes since V3
>>     
>>      * removed "-q" from the test [1]. This is the same as V2 with a typo
>>        fixed in the commit message
>>     
>>     [1] https://lore.kernel.org/git/e281c2e2-2044-1a11-e2bc-5ab3ee92c300@gmail.com/
> 
> Thanks.  Unfortunately I've already merged the previous version on
> the 11th, so I took the liberty of turning this round into an
> incremental.  How does this look?
> 
> ----- >8 --------- >8 --------- >8 --------- >8 -----
> From: Phillip Wood <phillip.wood@dunelm.org.uk>
> Date: Sat, 16 Oct 2021 09:07:09 +0000
> Subject: [PATCH] t1092: run "rebase --apply" without "-q" in the test
> 
> We run a few Git subcommands and make sure they produce identical
> results with and without sparse-index.  To this set of subcommands,
> an earlier commit added "rebase --apply", but did so with the "-q"
> option, in order to work around a breakge caused by a version used

s/breakge/breakage/

> at Microsoft with some unreleased changes.
> 
> Because we would want to make sure the commands produce indentical

s/indentical/identical/

> results, including reports given to the output that lists which
> commits were picked, use of "-q" loses too much interesting
> information.  Let's drop "-q" from the command invocation and
> revisit the issue when the problematic changes are upstreamed.

I think this summarizes the situation quite well. Thanks.

-Stolee

* Re: [PATCH v4] sparse index: fix use-after-free bug in cache_tree_verify()
  2021-10-17  5:38       ` Junio C Hamano
  2021-10-17 19:35         ` Derrick Stolee
@ 2021-10-18  9:37         ` Phillip Wood
  1 sibling, 0 replies; 26+ messages in thread
From: Phillip Wood @ 2021-10-18  9:37 UTC (permalink / raw)
  To: Junio C Hamano, Phillip Wood via GitGitGadget
  Cc: git, Derrick Stolee, René Scharfe, Elijah Newren,
	Johannes Schindelin, Derrick Stolee, Bagas Sanjaya, Phillip Wood

Hi Junio

On 17/10/2021 06:38, Junio C Hamano wrote:
> "Phillip Wood via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>>      [RFC] sparse index: fix use-after-free bug in cache_tree_verify()
>>      
>>      Changes since V3
>>      
>>       * removed "-q" from the test [1]. This is the same as V2 with a typo
>>         fixed in the commit message
>>      
>>      [1] https://lore.kernel.org/git/e281c2e2-2044-1a11-e2bc-5ab3ee92c300@gmail.com/
> 
> Thanks.  Unfortunately I've already merged the previous version on
> the 11th,

Oh sorry, I'd missed that.

> so I took the liberty of turning this round into an
> incremental.  How does this look?

It looks fine to me with Stolee's typo fixes.

Thanks

Phillip

> ----- >8 --------- >8 --------- >8 --------- >8 -----
> From: Phillip Wood <phillip.wood@dunelm.org.uk>
> Date: Sat, 16 Oct 2021 09:07:09 +0000
> Subject: [PATCH] t1092: run "rebase --apply" without "-q" in the test
> 
> We run a few Git subcommands and make sure they produce identical
> results with and without sparse-index.  To this set of subcommands,
> an earlier commit added "rebase --apply", but did so with the "-q"
> option, in order to work around a breakge caused by a version used
> at Microsoft with some unreleased changes.
> 
> Because we would want to make sure the commands produce indentical
> results, including reports given to the output that lists which
> commits were picked, use of "-q" loses too much interesting
> information.  Let's drop "-q" from the command invocation and
> revisit the issue when the problematic changes are upstreamed.
> 
> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
> Helped-by: Derrick Stolee <dstolee@microsoft.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>   t/t1092-sparse-checkout-compatibility.sh | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index 80c77bb432..85d5279b33 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -484,7 +484,7 @@ test_expect_success 'checkout and reset (mixed) [sparse]' '
>   test_expect_success 'merge, cherry-pick, and rebase' '
>   	init_repos &&
>   
> -	for OPERATION in "merge -m merge" cherry-pick "rebase --apply -q" "rebase --merge"
> +	for OPERATION in "merge -m merge" cherry-pick "rebase --apply" "rebase --merge"
>   	do
>   		test_all_match git checkout -B temp update-deep &&
>   		test_all_match git $OPERATION update-folder1 &&
> 

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
2021-10-06  9:29 [PATCH] [RFC] sparse index: fix use-after-free bug in cache_tree_verify() Phillip Wood via GitGitGadget
2021-10-06 11:20 ` Derrick Stolee
2021-10-06 14:01   ` Phillip Wood
2021-10-06 14:19     ` Derrick Stolee
2021-10-06 19:17 ` Junio C Hamano
2021-10-06 20:43   ` Derrick Stolee
2021-10-07  9:50 ` [PATCH v2] " Phillip Wood via GitGitGadget
2021-10-07 13:35   ` Derrick Stolee
2021-10-07 14:59     ` Phillip Wood
2021-10-07 13:53   ` Derrick Stolee
2021-10-07 15:05     ` Phillip Wood
2021-10-07 15:44       ` Derrick Stolee
2021-10-07 17:59         ` Phillip Wood
2021-10-07 18:07   ` [PATCH v3] " Phillip Wood via GitGitGadget
2021-10-07 21:23     ` Junio C Hamano
2021-10-08  9:09       ` Phillip Wood
2021-10-08 18:53         ` Derrick Stolee
2021-10-08 19:57         ` Junio C Hamano
2021-10-14 13:34           ` Phillip Wood
2021-10-14 16:42             ` Junio C Hamano
2021-10-08  9:38     ` Bagas Sanjaya
2021-10-14  9:40       ` Phillip Wood
2021-10-16  9:07     ` [PATCH v4] " Phillip Wood via GitGitGadget
2021-10-17  5:38       ` Junio C Hamano
2021-10-17 19:35         ` Derrick Stolee
2021-10-18  9:37         ` Phillip Wood
