git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: Elijah Newren <newren@gmail.com>
Cc: git@vger.kernel.org, "Jeff King" <peff@peff.net>,
	"Rafael Ascensão" <rafa.almas@gmail.com>,
	"Samuel Lijin" <sxlijin@gmail.com>
Subject: Re: [RFC PATCH v2 10/12] clean: avoid removing untracked files in a nested git repository
Date: Thu, 5 Sep 2019 23:20:43 +0200	[thread overview]
Message-ID: <20190905212043.GC32087@szeder.dev> (raw)
In-Reply-To: <20190905154735.29784-11-newren@gmail.com>

On Thu, Sep 05, 2019 at 08:47:33AM -0700, Elijah Newren wrote:
> Users expect files in a nested git repository to be left alone unless
> sufficiently forced (with two -f's).  Unfortunately, in certain
> circumstances, git would delete both tracked (and possibly dirty) files
> and untracked files within a nested repository.  To explain how this
> happens, let's contrast a couple cases.  First, take the following
> example setup (which assumes we are already within a git repo):
> 
>    git init nested
>    cd nested
>    >tracked
>    git add tracked
>    git commit -m init
>    >untracked
>    cd ..
> 
> In this setup, everything works as expected; running 'git clean -fd'
> will result in fill_directory() returning the following paths:
>    nested/
>    nested/tracked
>    nested/untracked
> and then correct_untracked_entries() would notice this can be compressed
> to
>    nested/
> and then since "nested/" is a directory, we would call
> remove_dirs("nested/", ...), which would
> check is_nonbare_repository_dir() and then decide to skip it.
> 
> However, if someone also creates an ignored file:
>    >nested/ignored
> then running 'git clean -fd' would result in fill_directory() returning
> the same paths:
>    nested/
>    nested/tracked
>    nested/untracked
> but correct_untracked_entries() will notice that we had ignored entries
> under nested/ and thus simplify this list to
>    nested/tracked
>    nested/untracked
> Since these are not directories, we do not call remove_dirs() which was
> the only place that had the is_nonbare_repository_dir() safety check --
> resulting in us deleting both the untracked file and the tracked (and
> possibly dirty) file.
> 
> One possible fix for this issue would be walking the parent directories
> of each path and checking if they represent nonbare repositories, but
> that would be wasteful.  Even if we added caching of some sort, it's
> still a waste because we should have been able to check that "nested/"
> represented a nonbare repository before even descending into it in the
> first place.  Add a DIR_SKIP_NESTED_GIT flag to dir_struct.flags and use
> it to prevent fill_directory() and friends from descending into nested
> git repos.

> Finally, there is one somewhat related bug which this patch does not
> fix, coming from the opposite angle.  If the user runs
>    git clean -ffd
> to force deletion of untracked nested repositories, and within an
> untracked nested repo the user has ignored files (according to the inner
> OR outer repositories' .gitignore), then not only will those ignored
> files be left alone but the .git/ subdirectory of the nested repo will
> be left alone too.  I am not completely sure if this should be
> considered a bug (though it seems like it since the lack of the
> untracked file would result in the .git/ subdirectory being deleted),
> but in any event it is very minor compared to accidentally deleting user
> data and I did not dive into it.

We briefly mentioned this "ignored file in a nested repo fools 'git
clean -d'" issue in an unrelated thread as well, where Philip
suggested that the gitignore of the outer repository should not have
any effect on the nested repository.  I'm inclined to agree.

  https://public-inbox.org/git/e221aaf8-7d0b-6feb-3f58-1e9e4382939b@iee.email/

Now, 'git clean -X' is supposed to "Remove only files ignored by
Git.".  I'm not entirely sure what 'git clean -ffdX' is supposed to do
(or whether it makes any sense in the first place), but it does delete
files in the nested repository that are ignored only in the outer
repository, both tracked (and possibly dirty) and untracked, even with
this patch series.  Without this series '-fdX' is just as bad, but
with this patch (i.e. by not descending into nested repositories)
'-fdX' becomes sensible and leaves the nested repository alone.

> diff --git a/Documentation/git-clean.txt b/Documentation/git-clean.txt
> index 3ab749b921..ba31d8d166 100644
> --- a/Documentation/git-clean.txt
> +++ b/Documentation/git-clean.txt
> @@ -37,9 +37,9 @@ OPTIONS
>  --force::
>  	If the Git configuration variable clean.requireForce is not set
>  	to false, 'git clean' will refuse to delete files or directories
> -	unless given -f or -i. Git will refuse to delete directories
> -	with .git sub directory or file unless a second -f
> -	is given.
> +	unless given -f or -i.  Git will refuse to modify untracked
> +	nested git repositories (directories with a .git subdirectory)
> +	unless a second -f is given.

I like this wording.


  reply	other threads:[~2019-09-05 21:20 UTC|newest]

Thread overview: 73+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-25 18:59 [PATCH] t7300-clean: demonstrate deleting nested repo with an ignored file breakage SZEDER Gábor
2019-08-25 20:34 ` SZEDER Gábor
2019-08-25 22:32 ` Philip Oakley
2019-08-26  7:48   ` SZEDER Gábor
2019-09-05 15:47 ` [RFC PATCH v2 00/12] Fix some git clean issues Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 01/12] t7300: Add some testcases showing failure to clean specified pathspecs Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 02/12] dir: fix typo in comment Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 03/12] dir: fix off-by-one error in match_pathspec_item Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 04/12] dir: Directories should be checked for matching pathspecs too Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 05/12] dir: Make the DO_MATCH_SUBMODULE code reusable for a non-submodule case Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 06/12] dir: If our pathspec might match files under a dir, recurse into it Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 07/12] dir: add commentary explaining match_pathspec_item's return value Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 08/12] git-clean.txt: do not claim we will delete files with -n/--dry-run Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 09/12] clean: disambiguate the definition of -d Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 10/12] clean: avoid removing untracked files in a nested git repository Elijah Newren
2019-09-05 21:20     ` SZEDER Gábor [this message]
2019-09-05 15:47   ` [RFC PATCH v2 11/12] clean: rewrap overly long line Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 12/12] clean: fix theoretical path corruption Elijah Newren
2019-09-05 19:27     ` SZEDER Gábor
2019-09-07  0:34       ` Elijah Newren
2019-09-05 19:01   ` [RFC PATCH v2 00/12] Fix some git clean issues SZEDER Gábor
2019-09-07  0:33     ` Elijah Newren
2019-09-12 22:12   ` [PATCH v3 " Elijah Newren
2019-09-12 22:12     ` [PATCH v3 01/12] t7300: add testcases showing failure to clean specified pathspecs Elijah Newren
2019-09-13 18:54       ` Junio C Hamano
2019-09-13 19:10         ` Elijah Newren
2019-09-13 20:29           ` Junio C Hamano
2019-09-12 22:12     ` [PATCH v3 02/12] dir: fix typo in comment Elijah Newren
2019-09-12 22:12     ` [PATCH v3 03/12] dir: fix off-by-one error in match_pathspec_item Elijah Newren
2019-09-13 19:05       ` Junio C Hamano
2019-09-12 22:12     ` [PATCH v3 04/12] dir: also check directories for matching pathspecs Elijah Newren
2019-09-12 22:12     ` [PATCH v3 05/12] dir: make the DO_MATCH_SUBMODULE code reusable for a non-submodule case Elijah Newren
2019-09-12 22:12     ` [PATCH v3 06/12] dir: if our pathspec might match files under a dir, recurse into it Elijah Newren
2019-09-13 19:45       ` Junio C Hamano
2019-09-12 22:12     ` [PATCH v3 07/12] dir: add commentary explaining match_pathspec_item's return value Elijah Newren
2019-09-13 20:04       ` Junio C Hamano
2019-09-12 22:12     ` [PATCH v3 08/12] git-clean.txt: do not claim we will delete files with -n/--dry-run Elijah Newren
2019-09-12 22:12     ` [PATCH v3 09/12] clean: disambiguate the definition of -d Elijah Newren
2019-09-12 22:12     ` [PATCH v3 10/12] clean: avoid removing untracked files in a nested git repository Elijah Newren
2019-09-12 22:12     ` [PATCH v3 11/12] clean: rewrap overly long line Elijah Newren
2019-09-12 22:12     ` [PATCH v3 12/12] clean: fix theoretical path corruption Elijah Newren
2019-09-17 16:34     ` [PATCH v4 00/12] Fix some git clean issues Elijah Newren
2019-09-17 16:34       ` [PATCH v4 01/12] t7300: add testcases showing failure to clean specified pathspecs Elijah Newren
2019-09-17 16:34       ` [PATCH v4 02/12] dir: fix typo in comment Elijah Newren
2019-09-17 16:34       ` [PATCH v4 03/12] dir: fix off-by-one error in match_pathspec_item Elijah Newren
2019-09-17 16:34       ` [PATCH v4 04/12] dir: also check directories for matching pathspecs Elijah Newren
2019-09-25 20:39         ` [BUG] git is segfaulting, was " Denton Liu
2019-09-25 21:28           ` Elijah Newren
2019-09-25 21:55             ` Denton Liu
2019-09-26 20:35               ` Denton Liu
2019-09-27  0:12                 ` Elijah Newren
2019-09-27  1:09           ` SZEDER Gábor
2019-09-27  2:17             ` SZEDER Gábor
2019-09-27 17:10               ` Denton Liu
2019-09-30 19:11                 ` [PATCH] dir: special case check for the possibility that pathspec is NULL Elijah Newren
2019-09-30 22:31                   ` Denton Liu
2019-10-01  7:01                     ` Elijah Newren
2019-10-01 18:30                   ` [PATCH v2] " Elijah Newren
2019-10-01 18:40                     ` Denton Liu
2019-10-01 18:54                       ` Elijah Newren
2019-10-01 18:55                       ` [PATCH v3] " Elijah Newren
2019-10-01 19:35                         ` Denton Liu
2019-10-01 19:39                           ` Elijah Newren
2019-10-02 15:51                             ` Elijah Newren
2019-10-07 18:04                         ` SZEDER Gábor
2019-09-17 16:34       ` [PATCH v4 05/12] dir: make the DO_MATCH_SUBMODULE code reusable for a non-submodule case Elijah Newren
2019-09-17 16:34       ` [PATCH v4 06/12] dir: if our pathspec might match files under a dir, recurse into it Elijah Newren
2019-09-17 16:34       ` [PATCH v4 07/12] dir: add commentary explaining match_pathspec_item's return value Elijah Newren
2019-09-17 16:35       ` [PATCH v4 08/12] git-clean.txt: do not claim we will delete files with -n/--dry-run Elijah Newren
2019-09-17 16:35       ` [PATCH v4 09/12] clean: disambiguate the definition of -d Elijah Newren
2019-09-17 16:35       ` [PATCH v4 10/12] clean: avoid removing untracked files in a nested git repository Elijah Newren
2019-09-17 16:35       ` [PATCH v4 11/12] clean: rewrap overly long line Elijah Newren
2019-09-17 16:35       ` [PATCH v4 12/12] clean: fix theoretical path corruption Elijah Newren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190905212043.GC32087@szeder.dev \
    --to=szeder.dev@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=newren@gmail.com \
    --cc=peff@peff.net \
    --cc=rafa.almas@gmail.com \
    --cc=sxlijin@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).