git@vger.kernel.org mailing list mirror (one of many)
* sparse-checkout papercuts
@ 2020-06-17  1:37 Elijah Newren
From: Elijah Newren @ 2020-06-17  1:37 UTC (permalink / raw)
  To: Git Mailing List, Derrick Stolee

Hi everyone,

I just sent off a few series to address some small sparse-checkout
issues, but there are a couple of others for which I wasn't sure of a
good initial solution, so I wanted to solicit ideas from everyone else:

A) Users surprised at left-behind directories after doing a sparse checkout

This is similar to my "mention sparsity state in git status" series I
sent off, but instead of being about users coming back to a directory
and having forgotten that they sparsified and then being surprised
about files and directories missing, this issue is about users being
surprised by directories left around immediately after trying to
sparsify.

In particular, after people do some builds in a non-sparse project and
then go to sparsify, the .gitignore'd build artifacts are not removed
when all the sources are.  I recently modified my 'sparsify' wrapper,
so that at the time people go to sparsify they will see a message
like:

    $ ./sparsify --modules $module1 $module2 $module3
    You are now in a sparse checkout with only 21458 of the 56079 files.
    Note: You have leftover .gitignore'd files in 33 directories;
          these likely represent build artifacts.
          To remove, run: ./sparsify --update --remove-old-ignores

Further, that special --remove-old-ignores basically runs `git clean
-fX <directories-outside-sparse-cone>`, where it does some work to
figure out which directories are outside the sparse cone, of course.
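
For illustration only, here is a minimal sketch of how such a cleanup
could be approximated. It is not the actual sparsify implementation.
It assumes cone mode, a run from the top of the working tree, paths
without whitespace or glob characters, and that `git sparse-checkout
list` prints one cone directory per line. The function names
is_outside_cone and list_leftover_ignores are hypothetical, and the
sketch uses `git clean -n` (dry run) instead of `-f` so that nothing is
deleted.

```shell
#!/bin/sh
# Hypothetical helper: succeed if directory $1 is outside every cone path
# given in the remaining arguments. A directory counts as inside when it is
# a cone path, lies beneath one, or is an ancestor of one.
is_outside_cone() {
    candidate=$1
    shift
    for cone in "$@"; do
        case "$candidate/" in
            "$cone"/*) return 1 ;;       # candidate is the cone path or beneath it
        esac
        case "$cone/" in
            "$candidate"/*) return 1 ;;  # candidate is an ancestor of the cone path
        esac
    done
    return 0
}

# Hypothetical driver: dry-run listing of ignored files in tracked
# directories outside the cone. Swap -n for -f to actually delete.
list_leftover_ignores() {
    # Assumption: cone mode, one plain path per line, no whitespace in paths.
    cones=$(git sparse-checkout list)
    git ls-tree -r -d --name-only HEAD |
    while IFS= read -r dir; do
        [ -d "$dir" ] || continue        # already gone from the working tree
        # Unquoted $cones is split into one argument per cone path on purpose.
        if is_outside_cone "$dir" $cones; then
            git clean -n -X -- "$dir"
        fi
    done | sort -u                       # nested directories repeat entries
}
```

Running list_leftover_ignores prints the "Would remove ..." lines that
`git clean` reports for each directory outside the cone; nothing is
removed until -n is replaced with -f.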

Is this something that would make sense to add to sparse-checkout
init/set?  Should it go in git status instead?  Should git clean get a
special flag for looking just at directories outside of the sparse
cone (which would simplify my logic in sparsify so it doesn't have to
handle this)?  Or is this behavior too specific to our use cases?


B) Double updating and unnecessary dirtying of files

`git sparse-checkout` docs say the correct way to get into a sparse
checkout is to run the combination of commands:
   git sparse-checkout init [--cone]
   git sparse-checkout set $DIR1 $DIR2 ...

Unfortunately, the first step removes virtually all files from the
working directory, then the second step (in our case) adds many of
them back, sometimes even a majority.  For wrappers or tools that do
dependency analysis between directories and wrap these commands, the
result looks like:

   $ ./sparsify --modules maindir1 maindir2
   Updating files: 100% (56059/56059), done.
   Updating files: 100% (24853/24853), done.
   You are now in a sparse checkout with only 24873 of the 56079 files.

There are three problems here: (1) The user watches two progress bars,
which seems kind of suboptimal.  (2) This process deletes tens of
thousands of files from the working directory just to immediately add
them back.  (3) Adding the files back to the working directory updates
their modification timestamps, so any subsequent build the user
invokes has to rebuild everything present, regardless of whether the
build outputs were up-to-date before sparsifying.

Any ideas or preferences on whether we should (a) provide a way to
combine the two sparse-checkout commands, (b) add an extra flag to
init so that it doesn't immediately try to apply all the changes to
the working tree, or (c) do something else that'd avoid these problems?
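
As one possible illustration of (c), the sketch below uses the older
manual mechanism: set the sparse-checkout config, write the pattern
file, and update the working tree once with `git read-tree -mu HEAD`.
This is an assumption-laden sketch, not a proposal for how
sparse-checkout itself should work. It handles top-level directories
only, writes to the repository config and not the per-worktree config
that `git sparse-checkout init` uses, and the function names
cone_patterns and sparsify_once are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print cone-style patterns for top-level directories.
cone_patterns() {
    printf '%s\n' '/*' '!/*/'        # keep files at the root, drop all directories
    for dir in "$@"; do
        printf '/%s/\n' "$dir"       # re-include each requested directory
    done
}

# Hypothetical single-pass sparsify: configure first, then update the
# working tree exactly once.
sparsify_once() {
    file=$(git rev-parse --git-path info/sparse-checkout) &&
    mkdir -p "$(dirname "$file")" &&
    git config core.sparseCheckout true &&
    git config core.sparseCheckoutCone true &&
    cone_patterns "$@" > "$file" &&
    git read-tree -mu HEAD           # one pass over the working tree
}
```

With this, `sparsify_once maindir1 maindir2` should only remove files
outside the two directories; files that remain in the sparse set are
never deleted and re-created, so their timestamps should be left alone.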


Thanks,
Elijah
