git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / Atom feed
From: Elijah Newren <newren@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>, "Jeff King" <peff@peff.net>,
	"Rafael Ascensão" <rafa.almas@gmail.com>,
	"SZEDER Gábor" <szeder.dev@gmail.com>,
	"Samuel Lijin" <sxlijin@gmail.com>,
	"Elijah Newren" <newren@gmail.com>
Subject: [PATCH v3 00/12] Fix some git clean issues
Date: Thu, 12 Sep 2019 15:12:28 -0700
Message-ID: <20190912221240.18057-1-newren@gmail.com> (raw)
In-Reply-To: <20190905154735.29784-1-newren@gmail.com>

NOTE: This series builds on sg/clean-nested-repo-with-ignored, as it
      (among other things) modifies his testcase from expect_failure
      to expect_success.  Also, Peff is probably the only one who
      remembers v1 (and even he may have forgotten it): v1 was posted
      a year and a half ago.

This patch series fixes a few issues with git-clean:
  * Failure to clean when multiple pathspecs are specified, reported both
    in April 2018[1] and again in May 2019[2].
  * Failure to preserve both tracked and untracked files within a nested
    Git repository reported a few weeks ago by SZEDER[3].

[1] https://public-inbox.org/git/20180405173446.32372-4-newren@gmail.com/
[2] https://public-inbox.org/git/20190531183651.10067-1-rafa.almas@gmail.com/
[3] https://public-inbox.org/git/20190825185918.3909-1-szeder.dev@gmail.com/

Changes since v2:
  * Removed the RFC label
  * Fixed up a few things SZEDER pointed out in v2 -- some commit
    message improvements, and an extra safety precaution for the
    final patch.

Stuff I'd like reviewer to focus on:
  * Read the (long) commit messages for patches 9 & 10; do folks agree
    with my declaration of "correct" behavior and my changes to the
    few regression tests in those patches?

Elijah Newren (12):
  t7300: add testcases showing failure to clean specified pathspecs
  dir: fix typo in comment
  dir: fix off-by-one error in match_pathspec_item
  dir: also check directories for matching pathspecs
  dir: make the DO_MATCH_SUBMODULE code reusable for a non-submodule
    case
  dir: if our pathspec might match files under a dir, recurse into it
  dir: add commentary explaining match_pathspec_item's return value
  git-clean.txt: do not claim we will delete files with -n/--dry-run
  clean: disambiguate the definition of -d
  clean: avoid removing untracked files in a nested git repository
  clean: rewrap overly long line
  clean: fix theoretical path corruption

 Documentation/git-clean.txt | 16 +++++-----
 builtin/clean.c             | 15 +++++++--
 dir.c                       | 63 +++++++++++++++++++++++++++----------
 dir.h                       |  8 +++--
 t/t7300-clean.sh            | 44 +++++++++++++++++++++++---
 5 files changed, 112 insertions(+), 34 deletions(-)

Range-diff:
 1:  82328e2033 !  1:  fe35ab8cc3 t7300: Add some testcases showing failure to clean specified pathspecs
    @@ Metadata
     Author: Elijah Newren <newren@gmail.com>
     
      ## Commit message ##
    -    t7300: Add some testcases showing failure to clean specified pathspecs
    +    t7300: add testcases showing failure to clean specified pathspecs
     
         Someone brought me a testcase where multiple git-clean invocations were
         required to clean out unwanted files:
 2:  5c1f58fd9d =  2:  707d287d79 dir: fix typo in comment
 3:  0e8b415af3 =  3:  bb316e82b2 dir: fix off-by-one error in match_pathspec_item
 4:  30b3ede443 !  4:  56319f934a dir: Directories should be checked for matching pathspecs too
    @@ Metadata
     Author: Elijah Newren <newren@gmail.com>
     
      ## Commit message ##
    -    dir: Directories should be checked for matching pathspecs too
    +    dir: also check directories for matching pathspecs
     
         Even if a directory doesn't match a pathspec, it is possible, depending
         on the precise pathspecs, that some file underneath it might.  So we
 5:  bab01f4cda !  5:  81593a565c dir: Make the DO_MATCH_SUBMODULE code reusable for a non-submodule case
    @@ Metadata
     Author: Elijah Newren <newren@gmail.com>
     
      ## Commit message ##
    -    dir: Make the DO_MATCH_SUBMODULE code reusable for a non-submodule case
    +    dir: make the DO_MATCH_SUBMODULE code reusable for a non-submodule case
     
         The specific checks done in match_pathspec_item for the DO_MATCH_SUBMODULE
         case are useful for other cases which have nothing to do with submodules.
 6:  c619ab4b3e !  6:  9566823a0f dir: If our pathspec might match files under a dir, recurse into it
    @@ Metadata
     Author: Elijah Newren <newren@gmail.com>
     
      ## Commit message ##
    -    dir: If our pathspec might match files under a dir, recurse into it
    +    dir: if our pathspec might match files under a dir, recurse into it
     
         For git clean, if a directory is entirely untracked and the user did not
         specify -d (corresponding to DIR_SHOW_IGNORED_TOO), then we usually do
    @@ Commit message
         the given directory which match one of the pathspecs being added to the
         entries list.
     
    -    Two particular considerations for this patch:
    +    Two notes of potential interest to future readers:
     
    -      * If we want to only recurse into a directory when it is specifically
    +      * If we wanted to only recurse into a directory when it is specifically
             matched rather than matched-via-glob (e.g. '*.c'), then we could do
             so via making the final non-zero return in match_pathspec_item be
             MATCHED_RECURSIVELY instead of MATCHED_RECURSIVELY_LEADING_PATHSPEC.
    -        (See final patch of this RFC series for details; note that the
    -        relative order of MATCHED_RECURSIVELY_LEADING_PATHSPEC and
    -        MATCHED_RECURSIVELY are important for such a change.))
    +        (Note that the relative order of MATCHED_RECURSIVELY_LEADING_PATHSPEC
    +        and MATCHED_RECURSIVELY are important for such a change.)  I was
    +        leaving open that possibility while writing an RFC asking for the
    +        behavior we want, but even though we don't want it, that knowledge
    +        might help you understand the code flow better.
     
           * There is a growing amount of logic in read_directory_recursive() for
    -        deciding whether to recurse into a subdirectory.  However, there is
    -        a comment immediately preceding this logic that says to recurse if
    +        deciding whether to recurse into a subdirectory.  However, there is a
    +        comment immediately preceding this logic that says to recurse if
             instructed by treat_path().   It may be better for the logic in
    -        read_directory_recursive() to be moved to treat_path() (or another
    -        function it calls, such as treat_directory()), but I did not feel
    -        strongly about this and just left the logic where it was while
    -        adding to it.  Do others have strong opinions on this?
    +        read_directory_recursive() to ultimately be moved to treat_path() (or
    +        another function it calls, such as treat_directory()), but I have
    +        left that for someone else to tackle in the future.
     
         Signed-off-by: Elijah Newren <newren@gmail.com>
     
 7:  5e1686a30e =  7:  7821898ba7 dir: add commentary explaining match_pathspec_item's return value
 8:  d92637f961 =  8:  13def5df57 git-clean.txt: do not claim we will delete files with -n/--dry-run
 9:  4550a30df8 =  9:  e6b274abf7 clean: disambiguate the definition of -d
10:  0985be2793 ! 10:  5f4ef14765 clean: avoid removing untracked files in a nested git repository
    @@ Commit message
         Also, clean up some grammar errors describing this functionality in the
         git-clean manpage.
     
    -    Finally, there is one somewhat related bug which this patch does not
    -    fix, coming from the opposite angle.  If the user runs
    -       git clean -ffd
    -    to force deletion of untracked nested repositories, and within an
    -    untracked nested repo the user has ignored files (according to the inner
    -    OR outer repositories' .gitignore), then not only will those ignored
    -    files be left alone but the .git/ subdirectory of the nested repo will
    -    be left alone too.  I am not completely sure if this should be
    -    considered a bug (though it seems like it since the lack of the
    -    untracked file would result in the .git/ subdirectory being deleted),
    -    but in any event it is very minor compared to accidentally deleting user
    -    data and I did not dive into it.
    +    Finally, there are still a couple bugs with -ffd not cleaning out enough
    +    (e.g.  missing the nested .git) and with -ffdX possibly cleaning out the
    +    wrong files (paying attention to outer .gitignore instead of inner).
    +    This patch does not address these cases at all (and does not change the
    +    behavior relative to those flags), it only fixes the handling when given
    +    a single -f.  See
    +    https://public-inbox.org/git/20190905212043.GC32087@szeder.dev/ for more
    +    discussion of the -ffd[X?] bugs.
     
         Signed-off-by: Elijah Newren <newren@gmail.com>
     
11:  2d36e3f7cb = 11:  4e30e62eb1 clean: rewrap overly long line
12:  3c5d1ff16a <  -:  ---------- clean: fix theoretical path corruption
 -:  ---------- > 12:  de2444f7cb clean: fix theoretical path corruption
-- 
2.23.0.173.gad11b3a635.dirty


  parent reply index

Thread overview: 73+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-25 18:59 [PATCH] t7300-clean: demonstrate deleting nested repo with an ignored file breakage SZEDER Gábor
2019-08-25 20:34 ` SZEDER Gábor
2019-08-25 22:32 ` Philip Oakley
2019-08-26  7:48   ` SZEDER Gábor
2019-09-05 15:47 ` [RFC PATCH v2 00/12] Fix some git clean issues Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 01/12] t7300: Add some testcases showing failure to clean specified pathspecs Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 02/12] dir: fix typo in comment Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 03/12] dir: fix off-by-one error in match_pathspec_item Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 04/12] dir: Directories should be checked for matching pathspecs too Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 05/12] dir: Make the DO_MATCH_SUBMODULE code reusable for a non-submodule case Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 06/12] dir: If our pathspec might match files under a dir, recurse into it Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 07/12] dir: add commentary explaining match_pathspec_item's return value Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 08/12] git-clean.txt: do not claim we will delete files with -n/--dry-run Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 09/12] clean: disambiguate the definition of -d Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 10/12] clean: avoid removing untracked files in a nested git repository Elijah Newren
2019-09-05 21:20     ` SZEDER Gábor
2019-09-05 15:47   ` [RFC PATCH v2 11/12] clean: rewrap overly long line Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 12/12] clean: fix theoretical path corruption Elijah Newren
2019-09-05 19:27     ` SZEDER Gábor
2019-09-07  0:34       ` Elijah Newren
2019-09-05 19:01   ` [RFC PATCH v2 00/12] Fix some git clean issues SZEDER Gábor
2019-09-07  0:33     ` Elijah Newren
2019-09-12 22:12   ` Elijah Newren [this message]
2019-09-12 22:12     ` [PATCH v3 01/12] t7300: add testcases showing failure to clean specified pathspecs Elijah Newren
2019-09-13 18:54       ` Junio C Hamano
2019-09-13 19:10         ` Elijah Newren
2019-09-13 20:29           ` Junio C Hamano
2019-09-12 22:12     ` [PATCH v3 02/12] dir: fix typo in comment Elijah Newren
2019-09-12 22:12     ` [PATCH v3 03/12] dir: fix off-by-one error in match_pathspec_item Elijah Newren
2019-09-13 19:05       ` Junio C Hamano
2019-09-12 22:12     ` [PATCH v3 04/12] dir: also check directories for matching pathspecs Elijah Newren
2019-09-12 22:12     ` [PATCH v3 05/12] dir: make the DO_MATCH_SUBMODULE code reusable for a non-submodule case Elijah Newren
2019-09-12 22:12     ` [PATCH v3 06/12] dir: if our pathspec might match files under a dir, recurse into it Elijah Newren
2019-09-13 19:45       ` Junio C Hamano
2019-09-12 22:12     ` [PATCH v3 07/12] dir: add commentary explaining match_pathspec_item's return value Elijah Newren
2019-09-13 20:04       ` Junio C Hamano
2019-09-12 22:12     ` [PATCH v3 08/12] git-clean.txt: do not claim we will delete files with -n/--dry-run Elijah Newren
2019-09-12 22:12     ` [PATCH v3 09/12] clean: disambiguate the definition of -d Elijah Newren
2019-09-12 22:12     ` [PATCH v3 10/12] clean: avoid removing untracked files in a nested git repository Elijah Newren
2019-09-12 22:12     ` [PATCH v3 11/12] clean: rewrap overly long line Elijah Newren
2019-09-12 22:12     ` [PATCH v3 12/12] clean: fix theoretical path corruption Elijah Newren
2019-09-17 16:34     ` [PATCH v4 00/12] Fix some git clean issues Elijah Newren
2019-09-17 16:34       ` [PATCH v4 01/12] t7300: add testcases showing failure to clean specified pathspecs Elijah Newren
2019-09-17 16:34       ` [PATCH v4 02/12] dir: fix typo in comment Elijah Newren
2019-09-17 16:34       ` [PATCH v4 03/12] dir: fix off-by-one error in match_pathspec_item Elijah Newren
2019-09-17 16:34       ` [PATCH v4 04/12] dir: also check directories for matching pathspecs Elijah Newren
2019-09-25 20:39         ` [BUG] git is segfaulting, was " Denton Liu
2019-09-25 21:28           ` Elijah Newren
2019-09-25 21:55             ` Denton Liu
2019-09-26 20:35               ` Denton Liu
2019-09-27  0:12                 ` Elijah Newren
2019-09-27  1:09           ` SZEDER Gábor
2019-09-27  2:17             ` SZEDER Gábor
2019-09-27 17:10               ` Denton Liu
2019-09-30 19:11                 ` [PATCH] dir: special case check for the possibility that pathspec is NULL Elijah Newren
2019-09-30 22:31                   ` Denton Liu
2019-10-01  7:01                     ` Elijah Newren
2019-10-01 18:30                   ` [PATCH v2] " Elijah Newren
2019-10-01 18:40                     ` Denton Liu
2019-10-01 18:54                       ` Elijah Newren
2019-10-01 18:55                       ` [PATCH v3] " Elijah Newren
2019-10-01 19:35                         ` Denton Liu
2019-10-01 19:39                           ` Elijah Newren
2019-10-02 15:51                             ` Elijah Newren
2019-10-07 18:04                         ` SZEDER Gábor
2019-09-17 16:34       ` [PATCH v4 05/12] dir: make the DO_MATCH_SUBMODULE code reusable for a non-submodule case Elijah Newren
2019-09-17 16:34       ` [PATCH v4 06/12] dir: if our pathspec might match files under a dir, recurse into it Elijah Newren
2019-09-17 16:34       ` [PATCH v4 07/12] dir: add commentary explaining match_pathspec_item's return value Elijah Newren
2019-09-17 16:35       ` [PATCH v4 08/12] git-clean.txt: do not claim we will delete files with -n/--dry-run Elijah Newren
2019-09-17 16:35       ` [PATCH v4 09/12] clean: disambiguate the definition of -d Elijah Newren
2019-09-17 16:35       ` [PATCH v4 10/12] clean: avoid removing untracked files in a nested git repository Elijah Newren
2019-09-17 16:35       ` [PATCH v4 11/12] clean: rewrap overly long line Elijah Newren
2019-09-17 16:35       ` [PATCH v4 12/12] clean: fix theoretical path corruption Elijah Newren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190912221240.18057-1-newren@gmail.com \
    --to=newren@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    --cc=rafa.almas@gmail.com \
    --cc=sxlijin@gmail.com \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

git@vger.kernel.org list mirror (unofficial, one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Example config snippet for mirrors

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.io/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git