git@vger.kernel.org mailing list mirror (one of many)
From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: stolee@gmail.com, vdye@github.com, gitster@pobox.com,
	newren@gmail.com, Derrick Stolee <derrickstolee@github.com>
Subject: [PATCH v2 0/2] Sparse Index: fix a checkout bug with deep sparse-checkout patterns
Date: Mon, 06 Dec 2021 14:10:35 +0000
Message-ID: <pull.1092.v2.git.1638799837.gitgitgadget@gmail.com>
In-Reply-To: <pull.1092.git.1638586534.gitgitgadget@gmail.com>

This week, we rolled out the sparse index to a large internal monorepo. We
got two very similar bug reports about a strange error involving the same
set of paths. One occurred during git pull (pull was a red herring) and the
other during git checkout. The git checkout case gave enough of a
reproduction to debug deep into unpack-trees.c and find the problem.

This bug dates back to 523506d (unpack-trees: unpack sparse directory
entries, 2021-07-14). We didn't hit this before because it requires the
following:

 1. The sparse-checkout definition needs to have recursive inclusion of deep
    folders (depth 3 or more).
 2. Adjacent to those deep folders, we need a deep sparse directory entry
    that receives changes.
 3. In this particular repo, deep directories are only added to the
    sparse-checkout on rare occasions and those adjacent folders are rarely
    updated. They happened to update this week and hit our sparse index
    dogfooders in surprising ways.

The first patch adds a test that fails without the fix. It requires
modifying our test data to make adjacent, deep sparse directory entries
possible. It's a rather simple test after we have that data change.

The second patch includes the actual fix. The bug came from confusing the
name and traverse_path members of struct traverse_info. name only stores a
single tree entry while traverse_path includes the full path from the root.
The method we are editing also takes an additional struct name_entry that
supplies the tree entry on top of the traverse_path, which explains how this
worked to depth two, but not depth three.
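
To make the depth-two versus depth-three behavior concrete, here is a
minimal sketch of the two comparisons. It is not the code in unpack-trees.c:
struct toy_info, matches_using_name() and matches_using_traverse_path() are
hypothetical, simplified stand-ins that use NUL-terminated strings instead
of the length fields Git tracks, and the paths in the note below are
illustrative only.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical, simplified stand-in for the traversal state.
 * This is NOT Git's struct traverse_info, only an illustration.
 */
struct toy_info {
	const char *name;          /* current tree entry only, e.g. "deeper1" */
	const char *traverse_path; /* full path from root, e.g. "deep/deeper1/" */
};

/*
 * Comparison in the style of the old comment:
 * ce_name against info->name + '/' + entry + '/',
 * or against entry + '/' when info->name is empty.
 */
static int matches_using_name(const char *ce_name,
			      const struct toy_info *info,
			      const char *entry)
{
	char buf[256];

	/* A sparse directory entry name must end in '/'. */
	assert(ce_name[0] && ce_name[strlen(ce_name) - 1] == '/');

	if (info->name[0])
		snprintf(buf, sizeof(buf), "%s/%s/", info->name, entry);
	else
		snprintf(buf, sizeof(buf), "%s/", entry);
	return !strcmp(ce_name, buf);
}

/*
 * Comparison in the style of the new comment:
 * ce_name against info->traverse_path + entry + '/'.
 * An empty traverse_path (the root) reduces to entry + '/'.
 */
static int matches_using_traverse_path(const char *ce_name,
				       const struct toy_info *info,
				       const char *entry)
{
	char buf[256];

	assert(ce_name[0] && ce_name[strlen(ce_name) - 1] == '/');

	snprintf(buf, sizeof(buf), "%s%s/", info->traverse_path, entry);
	return !strcmp(ce_name, buf);
}
```

With a toy_info of { "deep", "deep/" } and entry "deeper2", both helpers
match the sparse directory "deep/deeper2/", because name happens to equal
the full path at that depth. With { "deeper1", "deep/deeper1/" } and entry
"deepest2", only the traverse_path version matches
"deep/deeper1/deepest2/"; the name version builds "deeper1/deepest2/" and
misses the leading "deep/".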


Update in v2
============

 * Fixed the comment describing the sparse_dir_matches_path() method.

Thanks, -Stolee

Derrick Stolee (2):
  t1092: add deeper changes during a checkout
  unpack-trees: use traverse_path instead of name

 t/t1092-sparse-checkout-compatibility.sh | 16 +++++++++++++++-
 unpack-trees.c                           | 14 ++++++++------
 2 files changed, 23 insertions(+), 7 deletions(-)


base-commit: cd3e606211bb1cf8bc57f7d76bab98cc17a150bc
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1092%2Fderrickstolee%2Fsparse-index%2Fcheckout-bug-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1092/derrickstolee/sparse-index/checkout-bug-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1092

Range-diff vs v1:

 1:  ba05d7d4149 = 1:  ba05d7d4149 t1092: add deeper changes during a checkout
 2:  c9142199656 ! 2:  aa37168dcb4 unpack-trees: use traverse_path instead of name
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'add, commit, chec
       	test_sparse_match git sparse-checkout set deep/deeper1/deepest &&
      
       ## unpack-trees.c ##
     +@@ unpack-trees.c: static int find_cache_pos(struct traverse_info *info,
     + 
     + /*
     +  * Given a sparse directory entry 'ce', compare ce->name to
     +- * info->name + '/' + p->path + '/' if info->name is non-empty.
     ++ * info->traverse_path + p->path + '/' if info->traverse_path
     ++ * is non-empty.
     ++ *
     +  * Compare ce->name to p->path + '/' otherwise. Note that
     +  * ce->name must end in a trailing '/' because it is a sparse
     +  * directory entry.
      @@ unpack-trees.c: static int sparse_dir_matches_path(const struct cache_entry *ce,
       	assert(S_ISSPARSEDIR(ce->ce_mode));
       	assert(ce->name[ce->ce_namelen - 1] == '/');

-- 
gitgitgadget
