From: Elijah Newren <newren@gmail.com>
To: Elijah Newren via GitGitGadget <gitgitgadget@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>,
	Matheus Tavares Bernardino <matheus.bernardino@usp.br>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH] git-sparse-checkout: clarify interactions with submodules
Date: Thu, 11 Jun 2020 00:00:50 -0700
Message-ID: <CABPp-BGVB3XNT=yrfnwX63V9ZbH-UxetDJ0ND3Or6TxBiMfHNw@mail.gmail.com>
In-Reply-To: <pull.805.git.git.1591831009762.gitgitgadget@gmail.com>

On Wed, Jun 10, 2020 at 4:16 PM Elijah Newren via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Elijah Newren <newren@gmail.com>
>
> Ignoring the sparse-checkout feature momentarily, if one has a submodule and
> creates local branches within it with unpushed changes and maybe adds some
> untracked files to it, then we would want to avoid accidentally removing such
> a submodule.  So, for example with git.git, if you run
>    git checkout v2.13.0
> then the sha1collisiondetection/ submodule is NOT removed even though it
> did not exist as a submodule until v2.14.0.  Similarly, if you only had
> v2.13.0 checked out previously and ran
>    git checkout v2.14.0
> the sha1collisiondetection/ submodule would NOT be automatically
> initialized despite being part of v2.14.0.  In both cases, git requires
> submodules to be initialized or deinitialized separately.  Further, we
> also have special handling for submodules in other commands such as
> clean, which requires two --force flags to delete untracked submodules,
> and some commands have a --recurse-submodules flag.
>
> sparse-checkout is very similar to checkout, as evidenced by the similar
> name -- it adds and removes files from the working copy.  However, for
> the same avoid-data-loss reasons we do not want to remove a submodule
> from the working copy with checkout, we do not want to do it with
> sparse-checkout either.  So submodules need to be separately initialized
> or deinitialized; changing sparse-checkout rules should not
> automatically trigger the removal or vivification of submodules.
>
> I believe the previous wording in git-sparse-checkout.txt about
> submodules was only about this particular issue.  Unfortunately, the
> previous wording could be interpreted to imply that submodules should be
> considered active regardless of sparsity patterns.  Update the wording
> to avoid making such an implication.  It may be helpful to consider two
> example situations where the differences in wording become important:
>
> In the future, we want users to be able to run commands like
>    git clone --sparse=moduleA --recurse-submodules $REPO_URL
> and have sparsity paths automatically set up and have submodules *within
> the sparsity paths* be automatically initialized.  We do not want all
> submodules in any path to be automatically initialized with that
> command.
>
> Similarly, we want to be able to do things like
>    git -c sparse.restrictCmds grep --recurse-submodules $REV $PATTERN
> and search through $REV for $PATTERN within the recorded sparsity
> patterns.  We want it to recurse into submodules within those sparsity
> patterns, but do not want to recurse into directories that do not match
> the sparsity patterns in search of a possible submodule.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>     git-sparse-checkout: clarify interactions with submodules
>
>     gitgitgadget is going to treat this like V1, but it's really V2. V1 was
>     an inline scissors patch.
>
>     Changes since V1:

To make the thread easier to follow for anyone reading the archives, V1
is here: https://lore.kernel.org/git/20200522142611.1217757-1-newren@gmail.com/

>      * More wording clarifications in areas pointed out by Stolee, and using
>        some of his suggested wording.
>      * In particular, given that the final sentence from V1 was causing lots
>        of problems, I just stepped back and painted a very broad stroke for
>        end users that I think will make sense to them: we have two reasons
>        tracked files might be missing from the working copy, so there are
>        two things that might limit commands that search through tracked
>        files in the working copy. Greater detail about if or how they are
>        limited can be left to the manpages of individual subcommands.
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-805%2Fnewren%2Fsparse-submodule-interactions-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-805/newren/sparse-submodule-interactions-v1
> Pull-Request: https://github.com/git/git/pull/805
>
>  Documentation/git-sparse-checkout.txt | 30 +++++++++++++++++++++++----
>  1 file changed, 26 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/git-sparse-checkout.txt b/Documentation/git-sparse-checkout.txt
> index 1a3ace60820..c7feaeca110 100644
> --- a/Documentation/git-sparse-checkout.txt
> +++ b/Documentation/git-sparse-checkout.txt
> @@ -200,10 +200,32 @@ directory.
>  SUBMODULES
>  ----------
>
> -If your repository contains one or more submodules, then those submodules will
> -appear based on which you initialized with the `git submodule` command. If
> -your sparse-checkout patterns exclude an initialized submodule, then that
> -submodule will still appear in your working directory.
> +If your repository contains one or more submodules, then submodules
> +are populated based on interactions with the `git submodule` command.
> +Specifically, `git submodule init -- <path>` will ensure the submodule
> +at `<path>` is present, while `git submodule deinit [-f] -- <path>`
> +will remove the files for the submodule at `<path>` (including any
> +untracked files, uncommitted changes, and unpushed history).  Similar
> +to how sparse-checkout removes files from the working tree but still
> +leaves entries in the index, deinitialized submodules are removed from
> +the working directory but still have an entry in the index.
> +
> +Since submodules may have unpushed changes or untracked files,
> +removing them could result in data loss.  Thus, changing sparse
> +inclusion/exclusion rules will not cause an already checked out
> +submodule to be removed from the working copy.  Said another way, just
> +as `checkout` will not cause submodules to be automatically removed or
> +initialized even when switching between branches that remove or add
> +submodules, using `sparse-checkout` to reduce or expand the scope of
> +"interesting" files will not cause submodules to be automatically
> +deinitialized or initialized either.
> +
> +Further, the above facts mean that there are multiple reasons that
> +"tracked" files might not be present in the working copy: sparsity
> +pattern application from sparse-checkout, and submodule initialization
> +state.  Thus, commands like `git grep` that work on tracked files in
> +the working copy may return results that are limited by either or both
> +of these restrictions.
>
>
>  SEE ALSO
>
> base-commit: 87680d32efb6d14f162e54ad3bda4e3d6c908559
> --
> gitgitgadget
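
For anyone reading the archives, the deinit behavior described in the
new SUBMODULES text can be checked with a rough sketch like the one
below. It is only an illustration under stated assumptions: the
repository names (`sub`, `super`), the path `modules/sub`, and the demo
identity are all made up for the example, and `protocol.file.allow` is
set only because the submodule is cloned from a local path here.

```shell
#!/bin/sh
# Sketch only: repository names and paths below are hypothetical.
set -e

tmp=$(mktemp -d)
cd "$tmp"

# Identity for the throwaway commits, so no global config is needed.
GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

# A small repository that will play the role of the submodule.
git init -q sub
echo hello >sub/file.txt
git -C sub add file.txt
git -C sub commit -q -m "initial"

# The superproject. protocol.file.allow is only needed because the
# submodule is cloned from a local path in this demo.
git init -q super
cd super
git -c protocol.file.allow=always submodule --quiet add "$tmp/sub" modules/sub
git commit -q -m "add submodule"

# Deinitialize: the submodule's files leave the working tree...
git submodule --quiet deinit -f -- modules/sub

# ...but the gitlink entry (mode 160000) is still in the index.
index_mode=$(git ls-files -s -- modules/sub | cut -d' ' -f1)
if [ -e modules/sub/file.txt ]; then files_present=yes; else files_present=no; fi
echo "index mode: $index_mode, files present: $files_present"
```

The script reports the index mode recorded for `modules/sub` and
whether the submodule's file is still in the working tree after
`git submodule deinit`, which are the two properties the patch text
compares with how sparse-checkout treats ordinary files.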

Thread overview: 3+ messages
2020-06-10 23:16 [PATCH] git-sparse-checkout: clarify interactions with submodules Elijah Newren via GitGitGadget
2020-06-11  7:00 ` Elijah Newren [this message]
2020-06-11 11:32 ` Derrick Stolee
