git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Elijah Newren <newren@gmail.com>
To: Victoria Dye via GitGitGadget <gitgitgadget@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>,
	Derrick Stolee <derrickstolee@github.com>,
	Junio C Hamano <gitster@pobox.com>,
	Victoria Dye <vdye@github.com>
Subject: Re: [PATCH v2 0/7] Sparse index: integrate with 'git stash'
Date: Fri, 6 May 2022 00:46:49 -0700	[thread overview]
Message-ID: <CABPp-BFANwZn73w8zrVySB7mh0bQQBdGJjBuSJy50UOeyYT6aA@mail.gmail.com> (raw)
In-Reply-To: <pull.1171.v2.git.1651083378.gitgitgadget@gmail.com>

On Wed, Apr 27, 2022 at 11:16 AM Victoria Dye via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> This series, in combination with the sparse index integrations of reset [1],
> update-index [2], checkout-index [2], clean [2], and read-tree [3], allows
> most subcommands of 'git stash' to use the sparse index end-to-end without
> index expansion.

I've read through the series.  Like Stolee, I'm pleased with how
simple some of the code changes are (much due to prior hard work in
the area) and how nicely you describe the changes.

At first when I was reading through, I was again a bit disappointed
that we're just converting the subcommands stash uses instead of
fixing stash to stop forking subprocesses, but...then I realized this
served as a forcing function of sorts to make more of Git sparse-index
compatible, so I think it's actually the better route.  We can always
make the orthogonal fixes to stash's implementation design later.  :-)

However, there is one small issue around the way merging is handled...

> Like the earlier series, this series starts with new tests ensuring
> compatibility of the sparse index with non-sparse index full and sparse
> checkouts [1/7]. Next, sparse index is trivially enabled [2/7].
> Functionally, sparse index-enabled sparse-checkouts remain compatible with
> non-sparse index sparse-checkouts, but there are still some cases where the
> index (or a temporary index) is expanded unnecessarily. These cases are
> fixed in three parts:
>
>  * First, 'git stash -u' is made sparse index-compatible by ensuring the
>    "temporary" index holding the stashed, untracked files is created as a
>    sparse index whenever possible (per repo settings &
>    'is_sparse_index_allowed()'). Patch [3/7] exposes
>    'is_sparse_index_allowed()' to files outside of 'sparse-index.c', then
>    patch [4/7] uses that function to mark the temporary index sparse when
>    appropriate.
>  * Next, 'git stash (apply|pop)' are made sparse index-compatible by
>    changing their internal "merge" function (executed via
>    'merge_recursive_generic()') from 'merge_recursive()' to
>    'merge_ort_recursive()'. This requires first allowing
>    'merge_recursive_generic()' to accept a merge function as an input
>    (rather than hardcoding use of 'merge_recursive()') in patch [5/7], then
>    changing the call in 'stash.c' to specify 'merge_ort_recursive()' in
>    patch [6/7]. See note [4] for possible alternate implementations.
>  * Finally, while patches 5 & 6 avoid index expansion for most cases of 'git
>    stash (apply|pop)', applying a stash that includes untracked files still
>    expands the index. This is a result of an internal 'read-tree' execution
>    (specifically in its 'unpack_trees' call) creating a result index that is
>    never sparse in-core, thus forcing the index to be unnecessarily
>    collapsed and re-expanded in 'do_write_locked_index()'. In patch [7/7],
>    'unpack_trees' is updated to set the default sparsity of the resultant
>    index to "sparse" if allowed by repo settings and
>    'is_sparse_index_allowed()' (similar to the change in patch 4).
>
> Performance results (from the 'p2000' tests):
>
> (git stash &&
>  git stash pop)              master            this series
> ---------------------------------------------------------------------
> full-v3                      4.07(2.42+1.34)   3.98(2.42+1.32) -2.2%
> full-v4                      4.05(2.46+1.31)   4.00(2.49+1.29) -1.2%
> sparse-v3                    7.48(4.81+2.57)   1.53(0.26+1.61) -79.5%
> sparse-v4                    7.35(4.74+2.54)   1.59(0.27+1.63) -78.4%
>
> (echo >>new &&
>  git stash -u &&
>  git stash pop)              master            this series
> ---------------------------------------------------------------------
> full-v3                      4.21(2.62+1.45)   4.11(2.55+1.44) -2.4%
> full-v4                      4.11(2.51+1.41)   4.02(2.49+1.41) -2.2%
> sparse-v3                    7.35(4.64+2.66)   1.70(0.32+1.64) -76.9%
> sparse-v4                    7.74(4.87+2.83)   1.70(0.32+1.66) -78.0%
>
>
>
> Changes since V1
> ================
>
>  * Added quotes to the "$WITHOUT_UNTRACKED_TXT" when testing for it in
>    'ensure_not_expanded' (in 't/t1092-sparse-checkout-compatibility.sh')
>  * Moved the 'stash' test in 't1092' elsewhere in the file, so that it
>    doesn't conflict (even trivially) with the also-in-flight 'git show'
>    integration
>  * Moved the 'ensure_not_expended' tests for 'checkout-index' back to
>    original location
>
> [1]
> https://lore.kernel.org/git/pull.1048.v6.git.1638201164.gitgitgadget@gmail.com/
> [2]
> https://lore.kernel.org/git/pull.1109.v2.git.1641924306.gitgitgadget@gmail.com/
> [3]
> https://lore.kernel.org/git/pull.1157.v3.git.1646166271.gitgitgadget@gmail.com/
> [4] I went with changing 'stash' to always use 'merge-ort' in
> 'merge_recursive_generic()' as a sort of "middle ground" between "replace
> 'merge_recursive()' with 'merge_ort_recursive()' in all of its hardcoded
> internal usage" and "only use 'merge-ort' if using a sparse index in 'git
> stash', otherwise 'merge-recursive'". The former would extend the use of
> 'merge-ort' to 'git am' and 'git merge-recursive', whereas the latter is a
> more cautious/narrowly-focused option. If anyone has any other thoughts, I'm
> interested in hearing them.

I understand that you'd want to modify and use
merge_recursive_generic() in order to make the smallest code change
possible, but merge_recursive_generic() has no equivalent in ort
because during the review of ort, we discovered that porting over
merge_recursive_generic() meant porting bugs.  We need to fix those
bugs.  However, in the case of stash, I think we can just sidestep
those bugs. stash probably only uses merge_recursive_generic() because
it was the easiest transliteration of shell (which originally invoked
git-merge-recursive), as opposed to the best conversion.  The best
conversion at the time would have been to call merge_trees() instead
(stash doesn't need recursiveness, and tends to have commits available
rather than just trees).  I'd like to see us modify stash to call
merge_incore_nonrecursive() directly (and/or merge_trees() if we're
supporting using both merge-ort and merge-recursive backends).

No need to worry about git-am or git-merge-recursive for now; we can
replace/fix/update their calls to merge_recursive_generic() later.

  parent reply	other threads:[~2022-05-06  7:47 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-25 17:49 [PATCH 0/7] Sparse index: integrate with 'git stash' Victoria Dye via GitGitGadget
2022-04-25 17:49 ` [PATCH 1/7] stash: expand sparse-checkout compatibility testing Victoria Dye via GitGitGadget
2022-04-25 17:49 ` [PATCH 2/7] stash: integrate with sparse index Victoria Dye via GitGitGadget
2022-04-25 21:34   ` Junio C Hamano
2022-04-26 12:53   ` Derrick Stolee
2022-04-26 15:26     ` Victoria Dye
2022-04-26 16:21     ` Junio C Hamano
2022-04-25 17:49 ` [PATCH 3/7] sparse-index: expose 'is_sparse_index_allowed()' Victoria Dye via GitGitGadget
2022-04-25 17:49 ` [PATCH 4/7] read-cache: set sparsity when index is new Victoria Dye via GitGitGadget
2022-04-25 21:35   ` Junio C Hamano
2022-04-25 17:49 ` [PATCH 5/7] merge-recursive: add merge function arg to 'merge_recursive_generic' Victoria Dye via GitGitGadget
2022-04-25 21:38   ` Junio C Hamano
2022-04-26 12:57     ` Derrick Stolee
2022-04-25 17:49 ` [PATCH 6/7] stash: merge applied stash with merge-ort Victoria Dye via GitGitGadget
2022-04-26 13:02   ` Derrick Stolee
2022-04-25 17:49 ` [PATCH 7/7] unpack-trees: preserve index sparsity Victoria Dye via GitGitGadget
2022-04-26 12:49 ` [PATCH 0/7] Sparse index: integrate with 'git stash' Derrick Stolee
2022-04-26 13:09   ` Derrick Stolee
2022-04-27 18:16 ` [PATCH v2 " Victoria Dye via GitGitGadget
2022-04-27 18:16   ` [PATCH v2 1/7] stash: expand sparse-checkout compatibility testing Victoria Dye via GitGitGadget
2022-04-27 18:16   ` [PATCH v2 2/7] stash: integrate with sparse index Victoria Dye via GitGitGadget
2022-04-27 18:16   ` [PATCH v2 3/7] sparse-index: expose 'is_sparse_index_allowed()' Victoria Dye via GitGitGadget
2022-04-27 18:16   ` [PATCH v2 4/7] read-cache: set sparsity when index is new Victoria Dye via GitGitGadget
2022-04-27 18:16   ` [PATCH v2 5/7] merge-recursive: add merge function arg to 'merge_recursive_generic' Victoria Dye via GitGitGadget
2022-05-06  7:23     ` Elijah Newren
2022-05-09 19:24       ` Victoria Dye
2022-05-10  7:06         ` Elijah Newren
2022-04-27 18:16   ` [PATCH v2 6/7] stash: merge applied stash with merge-ort Victoria Dye via GitGitGadget
2022-04-27 18:16   ` [PATCH v2 7/7] unpack-trees: preserve index sparsity Victoria Dye via GitGitGadget
2022-05-06  7:46   ` Elijah Newren [this message]
2022-05-10 23:32   ` [PATCH v3 0/6] Sparse index: integrate with 'git stash' Victoria Dye via GitGitGadget
2022-05-10 23:32     ` [PATCH v3 1/6] stash: expand sparse-checkout compatibility testing Victoria Dye via GitGitGadget
2022-05-10 23:32     ` [PATCH v3 2/6] stash: integrate with sparse index Victoria Dye via GitGitGadget
2022-05-10 23:32     ` [PATCH v3 3/6] sparse-index: expose 'is_sparse_index_allowed()' Victoria Dye via GitGitGadget
2022-05-10 23:32     ` [PATCH v3 4/6] read-cache: set sparsity when index is new Victoria Dye via GitGitGadget
2022-05-10 23:32     ` [PATCH v3 5/6] stash: apply stash using 'merge_ort_nonrecursive()' Victoria Dye via GitGitGadget
2022-05-11  0:26       ` Junio C Hamano
2022-05-12  1:01       ` Jonathan Tan
2022-05-12 14:52         ` Elijah Newren
2022-05-12 16:55           ` Jonathan Tan
2022-05-12 14:51       ` Elijah Newren
2022-05-10 23:32     ` [PATCH v3 6/6] unpack-trees: preserve index sparsity Victoria Dye via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CABPp-BFANwZn73w8zrVySB7mh0bQQBdGJjBuSJy50UOeyYT6aA@mail.gmail.com \
    --to=newren@gmail.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=vdye@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).