git@vger.kernel.org mailing list mirror (one of many)
From: Elijah Newren <newren@gmail.com>
To: Derrick Stolee <stolee@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>,
	Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>,
	Git Mailing List <git@vger.kernel.org>,
	newren@gmaill.com, Jeff King <peff@peff.net>,
	Taylor Blau <me@ttaylorr.com>,
	Jonathan Nieder <jrnieder@gmail.com>,
	Derrick Stolee <dstolee@microsoft.com>,
	Son Luong Ngoc <sluongng@gmail.com>
Subject: Re: [PATCH 04/10] sparse-checkout: allow in-tree definitions
Date: Wed, 17 Jun 2020 16:07:01 -0700
Message-ID: <CABPp-BEzz2MF72jZeJP6=9vJqcoJV15LN-O8umDDfWwQXLjZRA@mail.gmail.com>
In-Reply-To: <CABPp-BH5p1VPXfMOyN_0SLnsFKkRU9R-ZpiAe4k5r=ZUbHeibQ@mail.gmail.com>

On Wed, May 20, 2020 at 10:52 AM Elijah Newren <newren@gmail.com> wrote:
>
> On Fri, May 8, 2020 at 8:42 AM Derrick Stolee <stolee@gmail.com> wrote:
> >
> > On 5/7/2020 6:58 PM, Junio C Hamano wrote:
> > > "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> > >
> > >> One of the difficulties of using the sparse-checkout feature is not
> > >> knowing which directories are absolutely needed for working in a portion
> > >> of the repository. Some of this can be documented in README files or
> > >> included in a bootstrapping tool along with the repository. This is done
> > >> in an ad-hoc way by every project that wants to use it.
> > >>
> > >> Let's make this process easier for users by creating a way to define a
> > >> useful sparse-checkout definition inside the Git tree data. This has
> > >> several benefits. First, the data is available to anyone who has
> > >> a copy of the repository without needing a different data source.
> > >> Second, the needs of the repository can change over time, and Git can
> > >> present a way to automatically update the working directory as these
> > >> sparse-checkout definitions change.
> > >
> > > And two lines of development can merge them together?
> > >
> > > Any time a new "feature" pops up that would eventually affect how
> > > "git clone" and "git checkout" work based on untrusted user data, we
> > > need to make sure there are no negative security implications.
> > >
> > > If it only boils down to "we have files that record a list of
> > > leading directory names, without offering extra 'flexibility'", I
> > > guess there isn't all that much that a malicious sparse definition
> > > can do and we would be safe, though.
> >
> > Yes. I hope that we can be extremely careful with this feature.
> > The RFC status of this series implicitly includes the question
> > "Should we do this at all?" I think the benefits outweigh the
> > risks, but we can minimize those risks with very careful design
> > and implementation.
> >
> > >> To use this feature, add the "--in-tree" option when setting or adding
> > >> directories to the sparse-checkout definition. For example:
> > >>
> > >>   $ git sparse-checkout set --in-tree .sparse/base
> > >>   $ git sparse-checkout add --in-tree .sparse/extra
> > >>
> > >> These commands add values to the multi-valued config setting
> > >> "sparse.inTree". When updating the sparse-checkout definition, these
> > >> values describe paths in the repository to find the sparse-checkout
> > >> data. After the commands listed earlier, we expect to see the following
> > >> in .git/config.worktree:
> > >>
> > >>      [sparse]
> > >>              intree = .sparse/base
> > >>              intree = .sparse/extra
> > >
> > > What does this say in human words?  "These two tracked files specify
> > > which paths should be in the working tree"?  Spelling it out here
> > > would help readers of this commit.
> >
> > You got it. Sounds good.
> >
> > >> When applying the sparse-checkout definitions from this config, the
> > >> blobs at HEAD:.sparse/base and HEAD:.sparse/extra are loaded.
> > >
> > > OK, so end-user edits to the working tree copy, or whatever is added
> > > to the index, do not count and only the committed version gets used.
> > >
> > > That makes it simple---I was wondering how we would operate when
> > > merging a branch with different contents in the .sparse/* files
> > > until the conflicts are resolved.
> >
> > It's worth testing this case so we can be sure what happens.
>
> During a merge or rebase or checkout -m, what happens if .sparse/extra
> has the following working tree content:
>
> [sparse]
>     dir = D
>     dir = X
> <<<<<< HEAD
>     dir = Y
> |||||| MERGE_BASE
> ======
>     inherit = .sparse/tools
> >>>>>>  MERGE_HEAD
>     inherit = .sparse/base
>
> and, of course, three different entries in the index?
>
> Also, do we use the version of the --in-tree file from the latest
> commit, from the index, or from the working tree?  (This is a question
> not only for merge and rebase, but also checkout with dirty changes
> and even checkout -m.)  Which one "wins"?
>
> And what if the user updates and commits an ill-formed version of the
> file -- is it equivalent to getting an empty cone with just the
> toplevel directory, equivalent to getting a complete checkout of
> everything, or something else?

Son pointed out that Mercurial has a 'sparse' extension that offers
some ideas for what we could do here; see
https://lore.kernel.org/git/CABPp-BGLBmWXrmPsTogyBFMgwYbHjN39oWbU=qDWroU1_fJaoQ@mail.gmail.com/
for further discussion.
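
For anyone who wants to poke at the "which one wins" question quoted
above, here is a minimal sketch using only existing Git commands
(nothing from this series). It builds a throwaway repository, produces a
conflicted merge of .sparse/extra, and reads the file from the latest
commit, from each index stage, and from the working tree. The branch
names base, ours, and theirs, and the exact file contents, are
assumptions made up for the demo.

```shell
#!/bin/sh
# Sketch only: the --in-tree feature itself is not used here.
# Branch names (base, ours, theirs) are hypothetical; .sparse/extra and
# its [sparse] dir/inherit keys mirror the example quoted above.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"
# Name the unborn branch explicitly so the demo does not depend on
# the configured default branch name.
git symbolic-ref HEAD refs/heads/base

mkdir .sparse
printf '[sparse]\n    dir = D\n    dir = X\n' > .sparse/extra
git add .sparse/extra
git commit -q -m "base definition"

git checkout -q -b theirs
printf '[sparse]\n    dir = D\n    dir = X\n    inherit = .sparse/tools\n' > .sparse/extra
git commit -q -a -m "theirs: inherit tools"

git checkout -q -b ours base
printf '[sparse]\n    dir = D\n    dir = X\n    dir = Y\n' > .sparse/extra
git commit -q -a -m "ours: add Y"

# Both sides appended a different line at the same spot, so this merge
# is expected to stop with a conflict; keep "set -e" from aborting.
git merge theirs >/dev/null 2>&1 || true

head_version=$(git show HEAD:.sparse/extra)   # latest commit
base_stage=$(git show :1:.sparse/extra)       # index stage 1: merge base
ours_stage=$(git show :2:.sparse/extra)       # index stage 2: HEAD side
theirs_stage=$(git show :3:.sparse/extra)     # index stage 3: MERGE_HEAD side
worktree_version=$(cat .sparse/extra)         # working tree, with conflict markers
unmerged_entries=$(git ls-files -u -- .sparse/extra | wc -l | tr -d ' ')

echo "index entries for .sparse/extra: $unmerged_entries"
echo "--- working tree copy ---"
echo "$worktree_version"
```

None of this says which version the --in-tree code should read; it only
makes it easy to see that the committed, staged, and working tree
copies can all differ in the middle of a merge.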
