git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: Elijah Newren via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Cc: Derrick Stolee <dstolee@microsoft.com>, Elijah Newren <newren@gmail.com>
Subject: Re: [PATCH v2 09/18] unpack-trees: add a new update_sparsity() function
Date: Mon, 23 Mar 2020 14:02:22 -0400
Message-ID: <9501e018-fe87-16b8-b0d8-913e4331778d@gmail.com> (raw)
In-Reply-To: <a46439c8536f912ad4a1e1751852cf477d3d7dc7.1584813609.git.gitgitgadget@gmail.com>

On 3/21/2020 2:00 PM, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>
> 
> Previously, the only way to update the SKIP_WORKTREE bits for various
> paths was invoking `git read-tree -mu HEAD` or calling the same code
> that this codepath invoked.  This however had a number of problems if
> the index or working directory were not clean.  First, let's consider
> the case:
> 
>   Flipping SKIP_WORKTREE -> !SKIP_WORKTREE (materializing files)
> 
> If the working tree was clean this was fine, but if there were files or
> directories or symlinks or whatever already present at the given path
> then the operation would abort with an error.  Let's label this case
> for later discussion:
> 
>     A) There is an untracked path in the way
> 
> Now let's consider the opposite case:
> 
>   Flipping !SKIP_WORKTREE -> SKIP_WORKTREE (removing files)
> 
> If the index and working tree was clean this was fine, but if there were
> any unclean paths we would run into problems.  There are three different
> cases to consider:
> 
>     B) The path is unmerged
>     C) The path has unstaged changes
>     D) The path has staged changes (differs from HEAD)
> 
> If any path fell into case B or C, then the whole operation would be
> aborted with an error.  With sparse-checkout, the whole operation would
> be aborted for case D as well, but for its predecessor of using `git
> read-tree -mu HEAD` directly, any paths that fell into case D would be
> removed from the working copy and the index entry for that path would be
> reset to match HEAD -- which looks and feels like data loss to users
> (only a few are even aware to ask whether it can be recovered, and even
> then it requires walking through loose objects trying to match up the
> right ones).
> 
> Refusing to remove files that have unsaved user changes is good, but
> refusing to work on any other paths is very problematic for users.  If
> the user is in the middle of a rebase or has made modifications to files
> that bring in more dependencies, then for their build to work they need
> to update the sparse paths.  This logic has been preventing them from
> doing so.  Sometimes in response, the user will stage the files and
> re-try, to no avail with sparse-checkout or to the horror of losing
> their changes if they are using its predecessor of `git read-tree -mu
> HEAD`.
> 
> Add a new update_sparsity() function which will not error out in any of
> these cases but behaves as follows for the special cases:
>     A) Leave the file in the working copy alone, clear the SKIP_WORKTREE
>        bit, and print a warning (thus leaving the path in a state where
>        status will report the file as modified, which seems logical).
>     B) Do NOT mark this path as SKIP_WORKTREE, and leave it as unmerged.
>     C) Do NOT mark this path as SKIP_WORKTREE and print a warning about
>        the dirty path.
>     D) Mark the path as SKIP_WORKTREE, but do not revert the version
>        stored in the index to match HEAD; leave the contents alone.
> 
> I tried a different behavior for A (leave the SKIP_WORKTREE bit set),
> but found it very surprising and counter-intuitive (e.g. the user sees
> it is present along with all the other files in that directory, tries to
> stage it, but git add ignores it since the SKIP_WORKTREE bit is set).  A
> & C seem like optimal behavior to me.  B may be as well, though I wonder
> if printing a warning would be an improvement.  Some might be slightly
> surprised by D at first, but given that it does the right thing with
> `git commit` and even `git commit -a` (`git add` ignores entries that
> are marked SKIP_WORKTREE and thus doesn't delete them, and `commit -a`
> is similar), it seems logical to me.
> 
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  unpack-trees.c | 78 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  unpack-trees.h |  9 ++++++
>  2 files changed, 87 insertions(+)
> 
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 4733e7eaf89..6abea555929 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -1714,6 +1714,84 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>  	goto done;
>  }
>  
> +/*
> + * Update SKIP_WORKTREE bits according to sparsity patterns, and update
> + * working directory to match.
> + *
> + * CE_NEW_SKIP_WORKTREE is used internally.
> + */
> +enum update_sparsity_result update_sparsity(struct unpack_trees_options *o)
> +{
> +	enum update_sparsity_result ret = UPDATE_SPARSITY_SUCCESS;
> +	struct pattern_list pl;
> +	int i, empty_worktree;
> +	unsigned old_show_all_errors;
> +	int free_pattern_list = 0;
> +
> +	old_show_all_errors = o->show_all_errors;
> +	o->show_all_errors = 1;
> +
> +	/* Sanity checks */
> +	if (!o->update || o->index_only || o->skip_sparse_checkout)
> +		BUG("update_sparsity() is for reflecting sparsity patterns in working directory");
> +	if (o->src_index != o->dst_index || o->fn)
> +		BUG("update_sparsity() called wrong");
> +
> +	trace_performance_enter();

I was about to say "why didn't you use the trace2 regions like in
unpack_trees()?" when I discovered that we haven't sent them [1]
upstream yet. I'll put that on my TODO list.

[1] https://github.com/microsoft/git/commit/9a04644e14fe4aeb556dfc30cb2220b799f53448

> +	/* If we weren't given patterns, use the recorded ones */
> +	if (!o->pl) {
> +		memset(&pl, 0, sizeof(pl));
> +		free_pattern_list = 1;

I notice you are using the same free_pattern_list pattern as your
earlier commit. Good.

> +		populate_from_existing_patterns(o, &pl);
> +		if (o->skip_sparse_checkout)
> +			goto skip_sparse_checkout;
> +	}
> +
> +	/* Set NEW_SKIP_WORKTREE on existing entries. */
> +	mark_all_ce_unused(o->src_index);
> +	mark_new_skip_worktree(o->pl, o->src_index, 0,
> +			       CE_NEW_SKIP_WORKTREE, o->verbose_update);
> +
> +	/* Then loop over entries and update/remove as needed */
> +	ret = UPDATE_SPARSITY_SUCCESS;
> +	empty_worktree = 1;
> +	for (i = 0; i < o->src_index->cache_nr; i++) {
> +		struct cache_entry *ce = o->src_index->cache[i];
> +
> +		if (apply_sparse_checkout(o->src_index, ce, o))
> +			ret = UPDATE_SPARSITY_WARNINGS;
> +
> +		if (!ce_skip_worktree(ce))
> +			empty_worktree = 0;
> +

nit: extra whitespace-only line

> +	}
> +
> +	/*
> +	 * Sparse checkout is meant to narrow down checkout area
> +	 * but it does not make sense to narrow down to empty working
> +	 * tree. This is usually a mistake in sparse checkout rules.
> +	 * Do not allow users to do that.
> +	 */
> +	if (o->src_index->cache_nr && empty_worktree) {
> +		unpack_failed(o, "Sparse checkout leaves no entry on working directory");
> +		ret = UPDATE_SPARSITY_INDEX_UPDATE_FAILURES;
> +		goto done;
> +	}
> +
> +skip_sparse_checkout:
> +	if (check_updates(o, o->src_index))
> +		ret = UPDATE_SPARSITY_WORKTREE_UPDATE_FAILURES;
> +
> +done:
> +	display_error_msgs(o);
> +	o->show_all_errors = old_show_all_errors;
> +	if (free_pattern_list)
> +		clear_pattern_list(&pl);
> +	trace_performance_leave("update_sparsity");
> +	return ret;
> +}
> +
>  /* Here come the merge functions */
>  
>  static int reject_merge(const struct cache_entry *ce,
> diff --git a/unpack-trees.h b/unpack-trees.h
> index d3516267f36..2c5d54cae9f 100644
> --- a/unpack-trees.h
> +++ b/unpack-trees.h
> @@ -28,6 +28,13 @@ enum unpack_trees_error_types {
>  	NB_UNPACK_TREES_ERROR_TYPES
>  };
>  
> +enum update_sparsity_result {
> +	UPDATE_SPARSITY_SUCCESS = 0,
> +	UPDATE_SPARSITY_WARNINGS = 1,
> +	UPDATE_SPARSITY_INDEX_UPDATE_FAILURES = -1,
> +	UPDATE_SPARSITY_WORKTREE_UPDATE_FAILURES = -2
> +};
> +

Is there a reason this isn't located just before
update_sparsity()?

>  /*
>   * Sets the list of user-friendly error messages to be used by the
>   * command "cmd" (either merge or checkout), and show_all_errors to 1.
> @@ -88,6 +95,8 @@ struct unpack_trees_options {
>  int unpack_trees(unsigned n, struct tree_desc *t,
>  		 struct unpack_trees_options *options);
>  
> +int update_sparsity(struct unpack_trees_options *options);
> +

This appears to not use the enum as it should.

Thanks,
-Stolee


  reply index

Thread overview: 69+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-14  7:11 [PATCH 0/7] Sparse checkout improvements -- improved sparsity updating Elijah Newren via GitGitGadget
2020-03-14  7:11 ` [PATCH 1/7] unpack-trees: fix minor typo in comment Elijah Newren via GitGitGadget
2020-03-14  7:11 ` [PATCH 2/7] t1091: make some tests a little more defensive against failures Elijah Newren via GitGitGadget
2020-03-14  7:11 ` [PATCH 3/7] unpack-trees: allow check_updates() to work on a different index Elijah Newren via GitGitGadget
2020-03-14  7:11 ` [PATCH 4/7] unpack-trees: do not mark a dirty path with SKIP_WORKTREE Elijah Newren via GitGitGadget
2020-03-15 14:39   ` Derrick Stolee
2020-03-14  7:11 ` [PATCH 5/7] unpack-trees: add a new update_sparsity() function Elijah Newren via GitGitGadget
2020-03-15 18:17   ` Derrick Stolee
2020-03-16 20:24     ` Elijah Newren
2020-03-14  7:11 ` [PATCH 6/7] sparse-checkout: use " Elijah Newren via GitGitGadget
2020-03-15 16:19   ` Derrick Stolee
2020-03-16 17:02     ` Elijah Newren
2020-03-14  7:11 ` [PATCH 7/7] sparse-checkout: provide a new update subcommand Elijah Newren via GitGitGadget
2020-03-15 16:24   ` Derrick Stolee
2020-03-16 17:05     ` Elijah Newren
2020-03-16 17:18       ` Derrick Stolee
2020-03-16 19:23         ` Elijah Newren
2020-03-15 12:26 ` [PATCH 0/7] Sparse checkout improvements -- improved sparsity updating Derrick Stolee
2020-03-21 17:59 ` [PATCH v2 00/18] " Elijah Newren via GitGitGadget
2020-03-21 17:59   ` [PATCH v2 01/18] unpack-trees: fix minor typo in comment Elijah Newren via GitGitGadget
2020-03-21 17:59   ` [PATCH v2 02/18] unpack-trees: remove unused error type Elijah Newren via GitGitGadget
2020-03-21 17:59   ` [PATCH v2 03/18] unpack-trees: simplify verify_absent_sparse() Elijah Newren via GitGitGadget
2020-03-21 17:59   ` [PATCH v2 04/18] unpack-trees: simplify pattern_list freeing Elijah Newren via GitGitGadget
2020-03-23 15:57     ` Derrick Stolee
2020-03-21 17:59   ` [PATCH v2 05/18] t1091: make some tests a little more defensive against failures Elijah Newren via GitGitGadget
2020-03-21 17:59   ` [PATCH v2 06/18] unpack-trees: allow check_updates() to work on a different index Elijah Newren via GitGitGadget
2020-03-21 17:59   ` [PATCH v2 07/18] unpack-trees: do not mark a dirty path with SKIP_WORKTREE Elijah Newren via GitGitGadget
2020-03-21 17:59   ` [PATCH v2 08/18] unpack-trees: pull sparse-checkout pattern reading into a new function Elijah Newren via GitGitGadget
2020-03-21 18:00   ` [PATCH v2 09/18] unpack-trees: add a new update_sparsity() function Elijah Newren via GitGitGadget
2020-03-23 18:02     ` Derrick Stolee [this message]
2020-03-23 18:10       ` Elijah Newren
2020-03-23 18:21         ` Derrick Stolee
2020-03-23 20:24           ` Junio C Hamano
2020-03-21 18:00   ` [PATCH v2 10/18] sparse-checkout: use " Elijah Newren via GitGitGadget
2020-03-23 18:07     ` Derrick Stolee
2020-03-23 18:14       ` Elijah Newren
2020-03-23 18:22         ` Derrick Stolee
2020-03-21 18:00   ` [PATCH v2 11/18] sparse-checkout: use improved unpack_trees porcelain messages Elijah Newren via GitGitGadget
2020-03-21 18:00   ` [PATCH v2 12/18] unpack-trees: move ERROR_WOULD_LOSE_SUBMODULE earlier Elijah Newren via GitGitGadget
2020-03-21 18:00   ` [PATCH v2 13/18] unpack-trees: rename ERROR_* fields meant for warnings to WARNING_* Elijah Newren via GitGitGadget
2020-03-21 18:00   ` [PATCH v2 14/18] unpack-trees: split display_error_msgs() into two Elijah Newren via GitGitGadget
2020-03-23 18:32     ` Derrick Stolee
2020-03-21 18:00   ` [PATCH v2 15/18] unpack-trees: make sparse path messages sound like warnings Elijah Newren via GitGitGadget
2020-03-21 18:00   ` [PATCH v2 16/18] unpack-trees: provide warnings on sparse updates for unmerged paths too Elijah Newren via GitGitGadget
2020-03-21 18:00   ` [PATCH v2 17/18] unpack-trees: failure to set SKIP_WORKTREE bits always just a warning Elijah Newren via GitGitGadget
2020-03-21 18:00   ` [PATCH v2 18/18] sparse-checkout: provide a new reapply subcommand Elijah Newren via GitGitGadget
2020-03-23 18:40     ` Derrick Stolee
2020-03-23 18:41   ` [PATCH v2 00/18] Sparse checkout improvements -- improved sparsity updating Derrick Stolee
2020-03-23 20:26     ` Junio C Hamano
2020-03-27  0:48   ` [PATCH v3 " Elijah Newren via GitGitGadget
2020-03-27  0:48     ` [PATCH v3 01/18] unpack-trees: fix minor typo in comment Elijah Newren via GitGitGadget
2020-03-27  0:48     ` [PATCH v3 02/18] unpack-trees: remove unused error type Elijah Newren via GitGitGadget
2020-03-27  0:48     ` [PATCH v3 03/18] unpack-trees: simplify verify_absent_sparse() Elijah Newren via GitGitGadget
2020-03-27  0:48     ` [PATCH v3 04/18] unpack-trees: simplify pattern_list freeing Elijah Newren via GitGitGadget
2020-03-27  0:48     ` [PATCH v3 05/18] t1091: make some tests a little more defensive against failures Elijah Newren via GitGitGadget
2020-03-27  0:48     ` [PATCH v3 06/18] unpack-trees: allow check_updates() to work on a different index Elijah Newren via GitGitGadget
2020-03-27  0:48     ` [PATCH v3 07/18] unpack-trees: do not mark a dirty path with SKIP_WORKTREE Elijah Newren via GitGitGadget
2020-03-27  0:48     ` [PATCH v3 08/18] unpack-trees: pull sparse-checkout pattern reading into a new function Elijah Newren via GitGitGadget
2020-03-27  0:48     ` [PATCH v3 09/18] unpack-trees: add a new update_sparsity() function Elijah Newren via GitGitGadget
2020-03-27  0:48     ` [PATCH v3 10/18] sparse-checkout: use " Elijah Newren via GitGitGadget
2020-03-27  0:48     ` [PATCH v3 11/18] sparse-checkout: use improved unpack_trees porcelain messages Elijah Newren via GitGitGadget
2020-03-27  0:48     ` [PATCH v3 12/18] unpack-trees: move ERROR_WOULD_LOSE_SUBMODULE earlier Elijah Newren via GitGitGadget
2020-03-27  0:48     ` [PATCH v3 13/18] unpack-trees: rename ERROR_* fields meant for warnings to WARNING_* Elijah Newren via GitGitGadget
2020-03-27  0:48     ` [PATCH v3 14/18] unpack-trees: split display_error_msgs() into two Elijah Newren via GitGitGadget
2020-03-27  0:48     ` [PATCH v3 15/18] unpack-trees: make sparse path messages sound like warnings Elijah Newren via GitGitGadget
2020-03-27  0:48     ` [PATCH v3 16/18] unpack-trees: provide warnings on sparse updates for unmerged paths too Elijah Newren via GitGitGadget
2020-03-27  0:49     ` [PATCH v3 17/18] unpack-trees: failure to set SKIP_WORKTREE bits always just a warning Elijah Newren via GitGitGadget
2020-03-27  0:49     ` [PATCH v3 18/18] sparse-checkout: provide a new reapply subcommand Elijah Newren via GitGitGadget
2020-03-27 13:22     ` [PATCH v3 00/18] Sparse checkout improvements -- improved sparsity updating Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9501e018-fe87-16b8-b0d8-913e4331778d@gmail.com \
    --to=stolee@gmail.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=newren@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

git@vger.kernel.org list mirror (unofficial, one of many)

Archives are clonable:
	git clone --mirror http://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Example config snippet for mirrors

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.io/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git