From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: gitster@pobox.com, vdye@github.com, shaoxuan.yuan02@gmail.com,
	Philip Oakley <philipoakley@iee.email>,
	Josh Steadmon <steadmon@google.com>,
	Derrick Stolee <derrickstolee@github.com>
Subject: [PATCH v2 0/5] Sparse index integration with 'git show'
Date: Tue, 26 Apr 2022 20:43:15 +0000
Message-ID: <pull.1207.v2.git.1651005800.gitgitgadget@gmail.com>
In-Reply-To: <pull.1207.git.1649349442.gitgitgadget@gmail.com>

This continues our sequence of integrating builtins with the sparse index.

'git show' is relatively simple to get working in a way that doesn't fail
when it would previously succeed, but there are some subtleties when the
user passes a directory path using the ":<path>" syntax to get the path out of the
index. If that path happens to be a sparse directory entry, we suddenly
start succeeding and printing the tree information!

Since this behavior can change depending on the sparse checkout definition
and the state of index entries within that directory, this new behavior
would be more likely to confuse users than help them.
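
As a small illustration of the ":" syntax on an ordinary (non-sparse)
repository, here is a sketch that is not part of the series. The scratch
repository and the names dir/ and file.txt are made up for the example:

```shell
#!/bin/sh
# Illustration only: a throwaway repository with hypothetical paths.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir dir
echo "staged content" >dir/file.txt
git add dir/file.txt

# A blob path resolves through the index and prints the staged content.
blob_out=$(git show :dir/file.txt)
echo "blob: $blob_out"

# A directory path has no entry of its own in a full index, so the
# lookup fails and 'git show' exits non-zero.
if git show :dir/ >/dev/null 2>&1
then
	dir_status=resolved
else
	dir_status=rejected
fi
echo "directory: $dir_status"
```

The directory case is the one that matters here: with a full index it is
always rejected, and the series aims to keep it that way when the path
happens to be a sparse directory entry.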

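This ":" syntax is shared by other commands like "git rev-parse". As of v2,
patch 5 adds a minimal 'git rev-parse' integration on top of the updated
parsing code; other users of the syntax are not integrated at this point.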

Some background: as we shipped our sparse index integrations in the
microsoft/git fork and measured performance of real users, we noticed that
'git show' was a frequently-used command that went from ~0.5 seconds to ~4
seconds in monorepo situations. This was unexpected, because we didn't know
about the ":" syntax. Further, it seemed that some third-party tools were
triggering this behavior as a frequent way to check on staged content. That
motivated quickly shipping this integration. Performance improved to ~0.1
seconds because of the reduced index size. While inspecting our rebase of
microsoft/git commits onto the 2.36.0 release candidates, I noticed this
integration would be a simpler example to demonstrate how sparse index
integrations should go when the behavior is not too complicated.

Here is an outline of this series:

 * Patch 1: Add more tests around 'git show :<path>' in t1092. These tests
   only establish the existing differences between the full and
   sparse-checkout cases, since 'git show' is still protected by
   'command_requires_full_index'.

 * Patch 2: Make 'git show' stop expanding the index by default. Make note
   of this behavior change in the tests.

 * Patches 3-4: Make the subtle changes to object-name.c that help us reject
   sparse directories (patch 3) and print the correct error message (patch
   4).

 * Patch 5: Now that the common index-parsing code is updated, do the
   minimum change to 'git rev-parse' to avoid expanding the index to parse
   index entries for a sparse index.

Patches 2-4 could realistically be squashed into a single commit, but I
thought it might be instructive to show these individual steps, especially
as an example for our GSoC project.

I know that Victoria intends to submit her 'git stash' integration soon, and
this provides a way to test whether our split of test changes in t1092 is
easy to merge without conflict. If that is successful, then I will likely
submit my integration with the 'sparse-checkout' builtin after this series is
complete. (UPDATE: we inserted a test in the same location of t1092, but
otherwise there are no textual or semantic conflicts.)


Updates in v2
=============

 * The test comment in patch 2 is updated.
 * A commit message typo in patch 4 is fixed.
 * Patch 4 simplified the behavior, but the previous version didn't clean up
   a test comment about that. It now cleans up the test to be simpler.
 * Patch 5 includes an integration with 'git rev-parse'.
 * The cover letter is expanded with more context.
 * The only conflict with Victoria's new 'git stash' patch series [1] is that
   we both added a test in the same position of t1092. Including both new
   tests is the right way to resolve the conflict. Order does not matter.

[1]
https://lore.kernel.org/git/pull.1171.git.1650908957.gitgitgadget@gmail.com/

Thanks, -Stolee

Derrick Stolee (5):
  t1092: add compatibility tests for 'git show'
  show: integrate with the sparse index
  object-name: reject trees found in the index
  object-name: diagnose trees in index properly
  rev-parse: integrate with sparse index

 builtin/log.c                            |  5 ++++
 builtin/rev-parse.c                      |  3 ++
 object-name.c                            | 25 ++++++++++++++--
 t/t1092-sparse-checkout-compatibility.sh | 36 ++++++++++++++++++++++++
 4 files changed, 66 insertions(+), 3 deletions(-)


base-commit: 07330a41d66a2c9589b585a3a24ecdcf19994f19
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1207%2Fderrickstolee%2Fsparse-index%2Fgit-show-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1207/derrickstolee/sparse-index/git-show-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1207

Range-diff vs v1:

 1:  8c2fdb5a4fc = 1:  8c2fdb5a4fc t1092: add compatibility tests for 'git show'
 2:  27ab853a9b4 ! 2:  2e9d47ab09b show: integrate with the sparse index
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'show (cached blob
      -	# does not work as implemented. The error message is
      -	# different for a full checkout and a sparse checkout
      -	# when the directory is outside of the cone.
     -+	# changes depending on the existence of a sparse index.
     ++	# had different behavior depending on the existence
     ++	# of a sparse index.
       	test_all_match test_must_fail git show :deep/ &&
       	test_must_fail git -C full-checkout show :folder1/ &&
      -	test_sparse_match test_must_fail git show :folder1/
 3:  f5da5327673 = 3:  5a7561637f0 object-name: reject trees found in the index
 4:  99c09ccc240 ! 4:  b730457fccc object-name: diagnose trees in index properly
     @@ Commit message
          checkout. The error message from diagnose_invalid_index_path() reports
          whether the path is on disk or not. The full checkout will have the
          directory on disk, but the path will not be in the index. The sparse
     -    checokut could have the directory not exist, specifically when that
     +    checkout could have the directory not exist, specifically when that
          directory is outside of the sparse-checkout cone.
      
          In the case of a sparse index, we have yet another state: the path can
     @@ object-name.c: static void diagnose_invalid_index_path(struct repository *r,
      
       ## t/t1092-sparse-checkout-compatibility.sh ##
      @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'show (cached blobs/trees)' '
     + 	test_all_match git show :deep/a &&
     + 	test_sparse_match git show :folder1/a &&
     + 
     +-	# Asking "git show" for directories in the index
     +-	# had different behavior depending on the existence
     +-	# of a sparse index.
     ++	# The error message differs depending on whether
     ++	# the directory exists in the worktree.
     + 	test_all_match test_must_fail git show :deep/ &&
       	test_must_fail git -C full-checkout show :folder1/ &&
     - 	test_must_fail git -C sparse-checkout show :folder1/ &&
     +-	test_must_fail git -C sparse-checkout show :folder1/ &&
     ++	test_sparse_match test_must_fail git show :folder1/ &&
       
      -	test_must_fail git -C sparse-index show :folder1/ 2>err &&
      -	grep "is in the index, but not at stage 0" err
     -+	test_sparse_match test_must_fail git show :folder1/ &&
     -+
      +	# Change the sparse cone for an extra case:
      +	run_on_sparse git sparse-checkout set deep/deeper1 &&
      +
 -:  ----------- > 5:  69efe637a18 rev-parse: integrate with sparse index

-- 
gitgitgadget
