git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [RFC PATCH 0/7] Fix `git clean` with pathspecs
@ 2018-04-05 17:34 Elijah Newren
  2018-04-05 17:34 ` [RFC PATCH 1/7] dir.c: Fix typo in comment Elijah Newren
                   ` (6 more replies)
  0 siblings, 7 replies; 17+ messages in thread
From: Elijah Newren @ 2018-04-05 17:34 UTC (permalink / raw)
  To: git; +Cc: sxlijin, peff, Elijah Newren

Sometimes, multiple `git clean $ARGS` invocations (with the exact same
flags and parameters for each invocation) are needed to properly clean
out the desired files.  Sometimes, `git clean $PATHS` just refuses to
clean the files it was explicitly told to clean.  This patch series
aims to address these (very old) problems.

I was made aware of the problems when a user brought to me the
following testcase:
    mkdir d{1,2}
    touch d{1,2}/ut
    touch d1/t
    git add d1/t
With this setup, the user would need to run
    git clean -ffd */ut
twice to delete both ut files.  Digging further, I found multiple
interesting variants.

However, I am still slightly unsure of what the correct behavior is
supposed to be for one particular case, namely, if the clean command
were instead:
    git clean -f '*ut'
(note that the glob is quoted to protect from shell expansion, and
that the -d option was removed), should the files still be cleaned?  I
assumed yes and implemented that in patches 5-6, but the commit message
discusses this case, and patch 7 exists to change the implementation
to answer this question with a 'no'.  Patch 7 should NOT should not be
accepted as-is -- it should either be dropped or squashed into earlier
commits, but which depends on the desired behavior.

Patches 1-2 are almost independent one-line fixes that could be
submitted independently.  However, if we decide to keep the changes
from patch 7, then this series does depend on patch 2 for the tests to
pass.

Patch 3 adds four new testcases covering the variants I noticed.

Patch 4 fixes clean with explicit pathspecs and the -d option.

Patches 5-7 fixes clean with explicit pathspecs without the -d option.

Elijah Newren (7):
  dir.c: Fix typo in comment
  dir.c: fix off-by-one error in match_pathspec_item
  t7300: Add some testcases showing failure to clean specified pathspecs
  dir: Directories should be checked for matching pathspecs too
  dir: Make the DO_MATCH_SUBMODULE code reusable for a non-submodule
    case
  dir: If our pathspec might match files under a dir, recurse into it
  If we do not want globs to recurse into subdirs without -d...

 dir.c            | 23 +++++++++++++++--------
 dir.h            |  5 +++--
 t/t7300-clean.sh | 32 ++++++++++++++++++++++++++++++++
 3 files changed, 50 insertions(+), 10 deletions(-)

-- 
2.17.0.7.g0b50f94d69


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC PATCH 1/7] dir.c: Fix typo in comment
  2018-04-05 17:34 [RFC PATCH 0/7] Fix `git clean` with pathspecs Elijah Newren
@ 2018-04-05 17:34 ` Elijah Newren
  2018-04-05 17:34 ` [RFC PATCH 2/7] dir.c: fix off-by-one error in match_pathspec_item Elijah Newren
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Elijah Newren @ 2018-04-05 17:34 UTC (permalink / raw)
  To: git; +Cc: sxlijin, peff, Elijah Newren

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 dir.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index dedbf5d476..19212129f0 100644
--- a/dir.c
+++ b/dir.c
@@ -138,7 +138,7 @@ static size_t common_prefix_len(const struct pathspec *pathspec)
 	 * ":(icase)path" is treated as a pathspec full of
 	 * wildcard. In other words, only prefix is considered common
 	 * prefix. If the pathspec is abc/foo abc/bar, running in
-	 * subdir xyz, the common prefix is still xyz, not xuz/abc as
+	 * subdir xyz, the common prefix is still xyz, not xyz/abc as
 	 * in non-:(icase).
 	 */
 	GUARD_PATHSPEC(pathspec,
-- 
2.17.0.7.g0b50f94d69


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC PATCH 2/7] dir.c: fix off-by-one error in match_pathspec_item
  2018-04-05 17:34 [RFC PATCH 0/7] Fix `git clean` with pathspecs Elijah Newren
  2018-04-05 17:34 ` [RFC PATCH 1/7] dir.c: Fix typo in comment Elijah Newren
@ 2018-04-05 17:34 ` Elijah Newren
  2018-04-05 17:49   ` Jeff King
  2018-04-05 17:34 ` [RFC PATCH 3/7] t7300: Add some testcases showing failure to clean specified pathspecs Elijah Newren
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Elijah Newren @ 2018-04-05 17:34 UTC (permalink / raw)
  To: git; +Cc: sxlijin, peff, Elijah Newren

For a pathspec like 'foo/bar' comparing against a path named "foo/",
namelen will be 4, and match[namelen] will be 'b'.  The correct location
of the directory separator is namelen-1.

The reason the code worked anyway was that the following code immediately
checked whether the first matchlen characters matched (which they do) and
then bailed and return MATCHED_RECURSIVELY anyway since wildmatch doesn't
have the ability to check if "name" can be matched as a directory (or
prefix) against the pathspec.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 dir.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index 19212129f0..c915a69385 100644
--- a/dir.c
+++ b/dir.c
@@ -384,7 +384,7 @@ static int match_pathspec_item(const struct pathspec_item *item, int prefix,
 	if (flags & DO_MATCH_SUBMODULE) {
 		/* name is a literal prefix of the pathspec */
 		if ((namelen < matchlen) &&
-		    (match[namelen] == '/') &&
+		    (match[namelen-1] == '/') &&
 		    !ps_strncmp(item, match, name, namelen))
 			return MATCHED_RECURSIVELY;
 
-- 
2.17.0.7.g0b50f94d69


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC PATCH 3/7] t7300: Add some testcases showing failure to clean specified pathspecs
  2018-04-05 17:34 [RFC PATCH 0/7] Fix `git clean` with pathspecs Elijah Newren
  2018-04-05 17:34 ` [RFC PATCH 1/7] dir.c: Fix typo in comment Elijah Newren
  2018-04-05 17:34 ` [RFC PATCH 2/7] dir.c: fix off-by-one error in match_pathspec_item Elijah Newren
@ 2018-04-05 17:34 ` Elijah Newren
  2018-04-05 17:34 ` [RFC PATCH 4/7] dir: Directories should be checked for matching pathspecs too Elijah Newren
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Elijah Newren @ 2018-04-05 17:34 UTC (permalink / raw)
  To: git; +Cc: sxlijin, peff, Elijah Newren

Someone brought me a testcase where multiple git-clean invocations were
required to clean out unwanted files:
  mkdir d{1,2}
  touch d{1,2}/ut
  touch d1/t && git add d1/t
With this setup, the user would need to run
  git clean -ffd */ut
twice to delete both ut files.

A little testing showed some interesting variants:
  * If only one of those two ut files existed (either one), then only one
    clean command would be necessary.
  * If both directories had tracked files, then only one git clean would
    be necessary to clean both files.
  * If both directories had no tracked files then the clean command above
    would never clean either of the untracked files despite the pathspec
    explicitly calling both of them out.

A bisect showed that the failure to clean out the files started with
commit cf424f5fd89b ("clean: respect pathspecs with "-d"", 2014-03-10).
However, that pointed to a separate issue: while the "-d" flag was used
by the original user who showed me this problem, that flag should have
been irrelevant to this problem.  Testing again without the "-d" flag
showed that the same buggy behavior exists without using that flag, and
has in fact existed since before cf424f5fd89b.

Add testcases showing that multiple untracked files within entirely
untracked directories cannot be cleaned when specifying these files to
git clean via pathspecs.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 t/t7300-clean.sh | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index 7b36954d63..3d260e21ea 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -669,4 +669,36 @@ test_expect_success 'git clean -d skips untracked dirs containing ignored files'
 	test_path_is_missing foo/b/bb
 '
 
+test_expect_failure 'git clean handles being told what to clean' '
+	mkdir -p d1 d2 &&
+	touch d1/ut d2/ut &&
+	git clean -f */ut &&
+	test_path_is_missing d1/ut &&
+	test_path_is_missing d2/ut
+'
+
+test_expect_failure 'git clean handles being told what to clean, with -d' '
+	mkdir -p d1 d2 &&
+	touch d1/ut d2/ut &&
+	git clean -ffd */ut &&
+	test_path_is_missing d1/ut &&
+	test_path_is_missing d2/ut
+'
+
+test_expect_failure 'git clean handles being told a glob to clean' '
+	mkdir -p d1 d2 &&
+	touch d1/ut d2/ut &&
+	git clean -f "*ut" &&
+	test_path_is_missing d1/ut &&
+	test_path_is_missing d2/ut
+'
+
+test_expect_failure 'git clean handles being told a glob to clean with -d' '
+	mkdir -p d1 d2 &&
+	touch d1/ut d2/ut &&
+	git clean -ffd "*ut" &&
+	test_path_is_missing d1/ut &&
+	test_path_is_missing d2/ut
+'
+
 test_done
-- 
2.17.0.7.g0b50f94d69


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC PATCH 4/7] dir: Directories should be checked for matching pathspecs too
  2018-04-05 17:34 [RFC PATCH 0/7] Fix `git clean` with pathspecs Elijah Newren
                   ` (2 preceding siblings ...)
  2018-04-05 17:34 ` [RFC PATCH 3/7] t7300: Add some testcases showing failure to clean specified pathspecs Elijah Newren
@ 2018-04-05 17:34 ` Elijah Newren
  2018-04-05 18:58   ` Jeff King
  2018-04-05 17:34 ` [RFC PATCH 5/7] dir: Make the DO_MATCH_SUBMODULE code reusable for a non-submodule case Elijah Newren
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Elijah Newren @ 2018-04-05 17:34 UTC (permalink / raw)
  To: git; +Cc: sxlijin, peff, Elijah Newren

Even if a directory doesn't match a pathspec, it is possible, depending
on the precise pathspecs, that some file underneath it might.  So we
special case and recurse into the directory for such situations.  However,
we previously always added any untracked directory that we recursed into
to the list of untracked paths, regardless of whether the directory
itself matched the pathspec.

For the case of git-clean and a set of pathspecs of "dir/file" and "more",
this caused a problem because we'd end up with dir entries for both of
  "dir"
  "dir/file"
Then correct_untracked_entries() would try to helpfully prune duplicates
for us by removing "dir/file" since it's under "dir", leaving us with
  "dir"
Since the original pathspec only had "dir/file", the only entry left
doesn't match and leaves nothing to be removed.  (Note that if only one
pathspec was specified, e.g. only "dir/file", then the common_prefix_len
optimizations in fill_directory would cause us to bypass this problem,
making it appear in simple tests that we could correctly remove manually
specified pathspecs.)

Fix this by actually checking whether the directory we are about to add
to the list of dir entries actually matches the pathspec; only do this
matching check after we have already returned from recursing into the
directory.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 dir.c            | 5 +++++
 t/t7300-clean.sh | 4 ++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/dir.c b/dir.c
index c915a69385..e783431948 100644
--- a/dir.c
+++ b/dir.c
@@ -1973,6 +1973,11 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 							 check_only, stop_at_first_file, pathspec);
 			if (subdir_state > dir_state)
 				dir_state = subdir_state;
+
+			if (!match_pathspec(pathspec, path.buf, path.len,
+					    0 /* prefix */, NULL,
+					    0 /* do NOT special case dirs */))
+				state = path_none;
 		}
 
 		if (check_only) {
diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index 3d260e21ea..31f2d0d8ba 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -677,7 +677,7 @@ test_expect_failure 'git clean handles being told what to clean' '
 	test_path_is_missing d2/ut
 '
 
-test_expect_failure 'git clean handles being told what to clean, with -d' '
+test_expect_success 'git clean handles being told what to clean, with -d' '
 	mkdir -p d1 d2 &&
 	touch d1/ut d2/ut &&
 	git clean -ffd */ut &&
@@ -693,7 +693,7 @@ test_expect_failure 'git clean handles being told a glob to clean' '
 	test_path_is_missing d2/ut
 '
 
-test_expect_failure 'git clean handles being told a glob to clean with -d' '
+test_expect_success 'git clean handles being told a glob to clean with -d' '
 	mkdir -p d1 d2 &&
 	touch d1/ut d2/ut &&
 	git clean -ffd "*ut" &&
-- 
2.17.0.7.g0b50f94d69


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC PATCH 5/7] dir: Make the DO_MATCH_SUBMODULE code reusable for a non-submodule case
  2018-04-05 17:34 [RFC PATCH 0/7] Fix `git clean` with pathspecs Elijah Newren
                   ` (3 preceding siblings ...)
  2018-04-05 17:34 ` [RFC PATCH 4/7] dir: Directories should be checked for matching pathspecs too Elijah Newren
@ 2018-04-05 17:34 ` Elijah Newren
  2018-04-05 17:34 ` [RFC PATCH 6/7] dir: If our pathspec might match files under a dir, recurse into it Elijah Newren
  2018-04-05 17:34 ` [RFC PATCH 7/7] If we do not want globs to recurse into subdirs without -d Elijah Newren
  6 siblings, 0 replies; 17+ messages in thread
From: Elijah Newren @ 2018-04-05 17:34 UTC (permalink / raw)
  To: git; +Cc: sxlijin, peff, Elijah Newren

The specific checks done in match_pathspec_item for the DO_MATCH_SUBMODULE
case are useful for other cases which have nothing to do with submodules.
Rename this constant; a subsequent commit will make use of this change.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 dir.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/dir.c b/dir.c
index e783431948..b0bca179fd 100644
--- a/dir.c
+++ b/dir.c
@@ -272,7 +272,7 @@ static int do_read_blob(const struct object_id *oid, struct oid_stat *oid_stat,
 
 #define DO_MATCH_EXCLUDE   (1<<0)
 #define DO_MATCH_DIRECTORY (1<<1)
-#define DO_MATCH_SUBMODULE (1<<2)
+#define DO_MATCH_LEADING_PATHSPEC (1<<2)
 
 static int match_attrs(const char *name, int namelen,
 		       const struct pathspec_item *item)
@@ -381,7 +381,7 @@ static int match_pathspec_item(const struct pathspec_item *item, int prefix,
 		return MATCHED_FNMATCH;
 
 	/* Perform checks to see if "name" is a super set of the pathspec */
-	if (flags & DO_MATCH_SUBMODULE) {
+	if (flags & DO_MATCH_LEADING_PATHSPEC) {
 		/* name is a literal prefix of the pathspec */
 		if ((namelen < matchlen) &&
 		    (match[namelen-1] == '/') &&
@@ -521,7 +521,7 @@ int submodule_path_match(const struct pathspec *ps,
 					strlen(submodule_name),
 					0, seen,
 					DO_MATCH_DIRECTORY |
-					DO_MATCH_SUBMODULE);
+					DO_MATCH_LEADING_PATHSPEC);
 	return matched;
 }
 
-- 
2.17.0.7.g0b50f94d69


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC PATCH 6/7] dir: If our pathspec might match files under a dir, recurse into it
  2018-04-05 17:34 [RFC PATCH 0/7] Fix `git clean` with pathspecs Elijah Newren
                   ` (4 preceding siblings ...)
  2018-04-05 17:34 ` [RFC PATCH 5/7] dir: Make the DO_MATCH_SUBMODULE code reusable for a non-submodule case Elijah Newren
@ 2018-04-05 17:34 ` Elijah Newren
  2018-04-05 17:34 ` [RFC PATCH 7/7] If we do not want globs to recurse into subdirs without -d Elijah Newren
  6 siblings, 0 replies; 17+ messages in thread
From: Elijah Newren @ 2018-04-05 17:34 UTC (permalink / raw)
  To: git; +Cc: sxlijin, peff, Elijah Newren

For git clean, if a directory is entirely untracked and the user did not
specify -d (corresponding to DIR_SHOW_IGNORED_TOO), then we usually do
not want to remove that directory and thus do not recurse into it.
However, if the user manually specified specific (or even globbed) paths
somewhere under that directory to remove, then we need to recurse into
the directory to make sure we remove the relevant paths under that
directory as the user requested.

Note that this does not mean that the recursed-into directory will be
added to dir->entries for later removal; as of a few commits earlier in
this series, there is another more strict match check that is run after
returning from a recursed-into directory before deciding to add it to the
list of entries.  Therefore, this will only result in files underneath
the given directory which match one of the pathspecs being added to the
entries list.

Two particular considerations for this patch:

  * If we want to only recurse into a directory when it is specifically
    matched rather than matched-via-glob (e.g. '*.c'), then we could do
    so via making the final non-zero return in match_pathspec_item be
    MATCHED_RECURSIVELY instead of MATCHED_RECURSIVELY_LEADING_PATHSPEC.
    (See final patch of this RFC series for details; note that the
    relative order of MATCHED_RECURSIVELY_LEADING_PATHSPEC and
    MATCHED_RECURSIVELY are important for such a change.))

  * There is a growing amount of logic in read_directory_recursive() for
    deciding whether to recurse into a subdirectory.  However, there is
    a comment immediately preceding this logic that says to recurse if
    instructed by treat_path().   It may be better for the logic in
    read_directory_recursive() to be moved to treat_path() (or another
    function it calls, such as treat_directory()), but I did not feel
    strongly about this and just left the logic where it was while
    adding to it.  Do others have strong opinions on this?

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 dir.c            | 10 ++++++----
 dir.h            |  5 +++--
 t/t7300-clean.sh |  4 ++--
 3 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/dir.c b/dir.c
index b0bca179fd..f55e24f149 100644
--- a/dir.c
+++ b/dir.c
@@ -386,7 +386,7 @@ static int match_pathspec_item(const struct pathspec_item *item, int prefix,
 		if ((namelen < matchlen) &&
 		    (match[namelen-1] == '/') &&
 		    !ps_strncmp(item, match, name, namelen))
-			return MATCHED_RECURSIVELY;
+			return MATCHED_RECURSIVELY_LEADING_PATHSPEC;
 
 		/* name" doesn't match up to the first wild character */
 		if (item->nowildcard_len < item->len &&
@@ -403,7 +403,7 @@ static int match_pathspec_item(const struct pathspec_item *item, int prefix,
 		 * The submodules themselves will be able to perform more
 		 * accurate matching to determine if the pathspec matches.
 		 */
-		return MATCHED_RECURSIVELY;
+		return MATCHED_RECURSIVELY_LEADING_PATHSPEC;
 	}
 
 	return 0;
@@ -1961,8 +1961,10 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 		/* recurse into subdir if instructed by treat_path */
 		if ((state == path_recurse) ||
 			((state == path_untracked) &&
-			 (dir->flags & DIR_SHOW_IGNORED_TOO) &&
-			 (get_dtype(cdir.de, istate, path.buf, path.len) == DT_DIR))) {
+			 (get_dtype(cdir.de, istate, path.buf, path.len) == DT_DIR) &&
+			 ((dir->flags & DIR_SHOW_IGNORED_TOO) ||
+			  do_match_pathspec(pathspec, path.buf, path.len,
+					    baselen, NULL, DO_MATCH_LEADING_PATHSPEC) == MATCHED_RECURSIVELY_LEADING_PATHSPEC))) {
 			struct untracked_cache_dir *ud;
 			ud = lookup_untracked(dir->untracked, untracked,
 					      path.buf + baselen,
diff --git a/dir.h b/dir.h
index b0758b82a2..0573f99ae0 100644
--- a/dir.h
+++ b/dir.h
@@ -210,8 +210,9 @@ extern int count_slashes(const char *s);
  * when populating the seen[] array.
  */
 #define MATCHED_RECURSIVELY 1
-#define MATCHED_FNMATCH 2
-#define MATCHED_EXACTLY 3
+#define MATCHED_RECURSIVELY_LEADING_PATHSPEC 2
+#define MATCHED_FNMATCH 3
+#define MATCHED_EXACTLY 4
 extern int simple_length(const char *match);
 extern int no_wildcard(const char *string);
 extern char *common_prefix(const struct pathspec *pathspec);
diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index 31f2d0d8ba..889b3401e4 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -669,7 +669,7 @@ test_expect_success 'git clean -d skips untracked dirs containing ignored files'
 	test_path_is_missing foo/b/bb
 '
 
-test_expect_failure 'git clean handles being told what to clean' '
+test_expect_success 'git clean handles being told what to clean' '
 	mkdir -p d1 d2 &&
 	touch d1/ut d2/ut &&
 	git clean -f */ut &&
@@ -685,7 +685,7 @@ test_expect_success 'git clean handles being told what to clean, with -d' '
 	test_path_is_missing d2/ut
 '
 
-test_expect_failure 'git clean handles being told a glob to clean' '
+test_expect_success 'git clean handles being told a glob to clean' '
 	mkdir -p d1 d2 &&
 	touch d1/ut d2/ut &&
 	git clean -f "*ut" &&
-- 
2.17.0.7.g0b50f94d69


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC PATCH 7/7] If we do not want globs to recurse into subdirs without -d...
  2018-04-05 17:34 [RFC PATCH 0/7] Fix `git clean` with pathspecs Elijah Newren
                   ` (5 preceding siblings ...)
  2018-04-05 17:34 ` [RFC PATCH 6/7] dir: If our pathspec might match files under a dir, recurse into it Elijah Newren
@ 2018-04-05 17:34 ` Elijah Newren
  6 siblings, 0 replies; 17+ messages in thread
From: Elijah Newren @ 2018-04-05 17:34 UTC (permalink / raw)
  To: git; +Cc: sxlijin, peff, Elijah Newren

If folks prefer this behavior, I'll squash this patch into the previous.
Otherwise, I'll just drop this patch from the series.
---
 dir.c            | 2 +-
 t/t7300-clean.sh | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/dir.c b/dir.c
index f55e24f149..bad75e9fbd 100644
--- a/dir.c
+++ b/dir.c
@@ -403,7 +403,7 @@ static int match_pathspec_item(const struct pathspec_item *item, int prefix,
 		 * The submodules themselves will be able to perform more
 		 * accurate matching to determine if the pathspec matches.
 		 */
-		return MATCHED_RECURSIVELY_LEADING_PATHSPEC;
+		return MATCHED_RECURSIVELY;
 	}
 
 	return 0;
diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index 889b3401e4..913ea6bda3 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -685,12 +685,12 @@ test_expect_success 'git clean handles being told what to clean, with -d' '
 	test_path_is_missing d2/ut
 '
 
-test_expect_success 'git clean handles being told a glob to clean' '
+test_expect_success 'git clean will not recurse with globs without -d' '
 	mkdir -p d1 d2 &&
 	touch d1/ut d2/ut &&
 	git clean -f "*ut" &&
-	test_path_is_missing d1/ut &&
-	test_path_is_missing d2/ut
+	test_path_is_file d1/ut &&
+	test_path_is_file d2/ut
 '
 
 test_expect_success 'git clean handles being told a glob to clean with -d' '
-- 
2.17.0.7.g0b50f94d69


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH 2/7] dir.c: fix off-by-one error in match_pathspec_item
  2018-04-05 17:34 ` [RFC PATCH 2/7] dir.c: fix off-by-one error in match_pathspec_item Elijah Newren
@ 2018-04-05 17:49   ` Jeff King
  2018-04-05 18:36     ` Elijah Newren
  0 siblings, 1 reply; 17+ messages in thread
From: Jeff King @ 2018-04-05 17:49 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git, sxlijin

On Thu, Apr 05, 2018 at 10:34:41AM -0700, Elijah Newren wrote:

> For a pathspec like 'foo/bar' comparing against a path named "foo/",
> namelen will be 4, and match[namelen] will be 'b'.  The correct location
> of the directory separator is namelen-1.
> 
> The reason the code worked anyway was that the following code immediately
> checked whether the first matchlen characters matched (which they do) and
> then bailed and return MATCHED_RECURSIVELY anyway since wildmatch doesn't
> have the ability to check if "name" can be matched as a directory (or
> prefix) against the pathspec.
> 
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  dir.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/dir.c b/dir.c
> index 19212129f0..c915a69385 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -384,7 +384,7 @@ static int match_pathspec_item(const struct pathspec_item *item, int prefix,
>  	if (flags & DO_MATCH_SUBMODULE) {
>  		/* name is a literal prefix of the pathspec */
>  		if ((namelen < matchlen) &&
> -		    (match[namelen] == '/') &&
> +		    (match[namelen-1] == '/') &&
>  		    !ps_strncmp(item, match, name, namelen))
>  			return MATCHED_RECURSIVELY;

Do we care about matching the name "foo" against the patchspec_item "foo/"?

That matches now, but wouldn't after your patch.

-Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH 2/7] dir.c: fix off-by-one error in match_pathspec_item
  2018-04-05 17:49   ` Jeff King
@ 2018-04-05 18:36     ` Elijah Newren
  2018-04-05 19:04       ` Jeff King
  0 siblings, 1 reply; 17+ messages in thread
From: Elijah Newren @ 2018-04-05 18:36 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List, sxlijin

On Thu, Apr 5, 2018 at 10:49 AM, Jeff King <peff@peff.net> wrote:
>> diff --git a/dir.c b/dir.c
>> index 19212129f0..c915a69385 100644
>> --- a/dir.c
>> +++ b/dir.c
>> @@ -384,7 +384,7 @@ static int match_pathspec_item(const struct pathspec_item *item, int prefix,
>>       if (flags & DO_MATCH_SUBMODULE) {
>>               /* name is a literal prefix of the pathspec */
>>               if ((namelen < matchlen) &&
>> -                 (match[namelen] == '/') &&
>> +                 (match[namelen-1] == '/') &&
>>                   !ps_strncmp(item, match, name, namelen))
>>                       return MATCHED_RECURSIVELY;
>
> Do we care about matching the name "foo" against the patchspec_item "foo/"?
>
> That matches now, but wouldn't after your patch.

Technically, the tests pass anyway due to the fallback behavior
mentioned in the commit message, but this is a really good point.  It
looks like the call to submodule_path_match() from builtin/grep.c is
going to be passing name without the trailing '/', which is contrary
to how read_directory_recursive() in dir.c builds up paths (namely
with the trailing '/'). If we tried to force consistency (either
always omit the trailing slash or always include it), then we'd
probably want to do so for match_pathspec() calls as well, and there
are lots of those throughout the code and auditing it all looks
painful.

So I should probably make the check handle both cases:

@@ -383,8 +383,9 @@ static int match_pathspec_item(const struct
pathspec_item *item, int prefix,
        /* Perform checks to see if "name" is a super set of the pathspec */
        if (flags & DO_MATCH_LEADING_PATHSPEC) {
                /* name is a literal prefix of the pathspec */
+               int offset = name[namelen-1] == '/' ? 1 : 0;
                if ((namelen < matchlen) &&
-                   (match[namelen] == '/') &&
+                   (match[namelen-offset] == '/') &&
                    !ps_strncmp(item, match, name, namelen))
                        return MATCHED_RECURSIVELY_LEADING_PATHSPEC;

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH 4/7] dir: Directories should be checked for matching pathspecs too
  2018-04-05 17:34 ` [RFC PATCH 4/7] dir: Directories should be checked for matching pathspecs too Elijah Newren
@ 2018-04-05 18:58   ` Jeff King
  2018-04-05 19:15     ` Elijah Newren
  0 siblings, 1 reply; 17+ messages in thread
From: Jeff King @ 2018-04-05 18:58 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git, sxlijin

On Thu, Apr 05, 2018 at 10:34:43AM -0700, Elijah Newren wrote:

> Even if a directory doesn't match a pathspec, it is possible, depending
> on the precise pathspecs, that some file underneath it might.  So we
> special case and recurse into the directory for such situations.  However,
> we previously always added any untracked directory that we recursed into
> to the list of untracked paths, regardless of whether the directory
> itself matched the pathspec.
> 
> For the case of git-clean and a set of pathspecs of "dir/file" and "more",
> this caused a problem because we'd end up with dir entries for both of
>   "dir"
>   "dir/file"
> Then correct_untracked_entries() would try to helpfully prune duplicates
> for us by removing "dir/file" since it's under "dir", leaving us with
>   "dir"
> Since the original pathspec only had "dir/file", the only entry left
> doesn't match and leaves nothing to be removed.  (Note that if only one
> pathspec was specified, e.g. only "dir/file", then the common_prefix_len
> optimizations in fill_directory would cause us to bypass this problem,
> making it appear in simple tests that we could correctly remove manually
> specified pathspecs.)

It sounds like correct_untracked_entries() is doing the wrong thing, and
it should be aware of the pathspec-matching when culling entries. In
other words, my understanding was that read_directory() does not
necessarily promise to cull fully (which is what led to cf424f5fd in the
first place), and callers are forced to apply their own pathspecs.

The distinction is academic for this particular bug, but it makes me
wonder if there are other cases where "clean" needs to be more careful
with what comes out of dir.c.

-Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH 2/7] dir.c: fix off-by-one error in match_pathspec_item
  2018-04-05 18:36     ` Elijah Newren
@ 2018-04-05 19:04       ` Jeff King
  2018-04-05 20:06         ` Elijah Newren
  0 siblings, 1 reply; 17+ messages in thread
From: Jeff King @ 2018-04-05 19:04 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Git Mailing List, sxlijin

On Thu, Apr 05, 2018 at 11:36:45AM -0700, Elijah Newren wrote:

> > Do we care about matching the name "foo" against the patchspec_item "foo/"?
> >
> > That matches now, but wouldn't after your patch.
> 
> Technically, the tests pass anyway due to the fallback behavior
> mentioned in the commit message, but this is a really good point.  It
> looks like the call to submodule_path_match() from builtin/grep.c is
> going to be passing name without the trailing '/', which is contrary
> to how read_directory_recursive() in dir.c builds up paths (namely
> with the trailing '/'). If we tried to force consistency (either
> always omit the trailing slash or always include it), then we'd
> probably want to do so for match_pathspec() calls as well, and there
> are lots of those throughout the code and auditing it all looks
> painful.
> 
> So I should probably make the check handle both cases:
> 
> @@ -383,8 +383,9 @@ static int match_pathspec_item(const struct
> pathspec_item *item, int prefix,
>         /* Perform checks to see if "name" is a super set of the pathspec */
>         if (flags & DO_MATCH_LEADING_PATHSPEC) {
>                 /* name is a literal prefix of the pathspec */
> +               int offset = name[namelen-1] == '/' ? 1 : 0;
>                 if ((namelen < matchlen) &&
> -                   (match[namelen] == '/') &&
> +                   (match[namelen-offset] == '/') &&
>                     !ps_strncmp(item, match, name, namelen))
>                         return MATCHED_RECURSIVELY_LEADING_PATHSPEC;

That seems reasonable to me, and your "offset" trick here should prevent
us from getting confused. Can namelen ever be zero here? I guess
probably not (I could see an empty pathspec, but an empty path does not
make sense).

There are other similar trailing-slash matches in that function, but I'm
not sure of all the cases in which they're used. I don't know if any of
those would need similar treatment (sorry for being vague; I expect I'd
need a few hours to dig into how the pathspec code actually works, and I
don't have that today).

-Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH 4/7] dir: Directories should be checked for matching pathspecs too
  2018-04-05 18:58   ` Jeff King
@ 2018-04-05 19:15     ` Elijah Newren
  2018-04-05 19:31       ` Jeff King
  0 siblings, 1 reply; 17+ messages in thread
From: Elijah Newren @ 2018-04-05 19:15 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List, Samuel Lijin

On Thu, Apr 5, 2018 at 11:58 AM, Jeff King <peff@peff.net> wrote:

> It sounds like correct_untracked_entries() is doing the wrong thing, and
> it should be aware of the pathspec-matching when culling entries. In
> other words, my understanding was that read_directory() does not
> necessarily promise to cull fully (which is what led to cf424f5fd in the
> first place), and callers are forced to apply their own pathspecs.
>
> The distinction is academic for this particular bug, but it makes me
> wonder if there are other cases where "clean" needs to be more careful
> with what comes out of dir.c.

Interesting, I read things very differently.  Looking back at commit
6b1db43109ab ("clean: teach clean -d to preserve ignored paths",
2017-05-23), which introduced correct_untracked_entries(), I thought
that correct_untracked_entries() wasn't there to correct pathspec
issues with fill_directory(), but instead to special case the handling
of files which are both untracked and ignored.  Did I mis-read or were
there other commits that changed the semantics?

Also, it would just seem odd to me that fill_directory() requires
pathspecs, and it uses those pathspecs, but it doesn't guarantee that
the files it returns matches them.  That seems like an API ripe for
mis-use, especially since I don't see any comment in the code about
such an assumption.  Is that really the expectation?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH 4/7] dir: Directories should be checked for matching pathspecs too
  2018-04-05 19:15     ` Elijah Newren
@ 2018-04-05 19:31       ` Jeff King
  2018-04-09  2:07         ` Junio C Hamano
  0 siblings, 1 reply; 17+ messages in thread
From: Jeff King @ 2018-04-05 19:31 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Git Mailing List, Samuel Lijin

On Thu, Apr 05, 2018 at 12:15:59PM -0700, Elijah Newren wrote:

> On Thu, Apr 5, 2018 at 11:58 AM, Jeff King <peff@peff.net> wrote:
> 
> > It sounds like correct_untracked_entries() is doing the wrong thing, and
> > it should be aware of the pathspec-matching when culling entries. In
> > other words, my understanding was that read_directory() does not
> > necessarily promise to cull fully (which is what led to cf424f5fd in the
> > first place), and callers are forced to apply their own pathspecs.
> >
> > The distinction is academic for this particular bug, but it makes me
> > wonder if there are other cases where "clean" needs to be more careful
> > with what comes out of dir.c.
> 
> Interesting, I read things very differently.  Looking back at commit
> 6b1db43109ab ("clean: teach clean -d to preserve ignored paths",
> 2017-05-23), which introduced correct_untracked_entries(), I thought
> that correct_untracked_entries() wasn't there to correct pathspec
> issues with fill_directory(), but instead to special case the handling
> of files which are both untracked and ignored.  Did I mis-read or were
> there other commits that changed the semantics?
> 
> Also, it would just seem odd to me that fill_directory() requires
> pathspecs, and it uses those pathspecs, but it doesn't guarantee that
> the files it returns matches them.  That seems like an API ripe for
> mis-use, especially since I don't see any comment in the code about
> such an assumption.  Is that really the expectation?

To be honest, I don't know. Most of dir.c predates me, and I've tried to
avoid looking at it too hard. But I had a vague recollection of it being
"best effort", and this bit from cf424f5fd89b reinforces that:

  However, read_directory does not actually check against our pathspec.
  It uses a simplified version that may turn up false positives. As a
  result, we need to check that any hits match our pathspec.

So I don't know that correct_untracked_entries() is there to fix the
pathspec handling. But I think that anybody who looks at the output of
fill_directory() does need to be aware that they may get more entries
than they expected, and has to apply the pathspecs themselves. And
that's what that extra dir_path_match() call in cmd_clean() is
there for (it used to be match_pathspec before some renaming).

I agree it's an error-prone interface. I don't know all the conditions
under which dir.c might return extra entries, but it seems like it might
be sane for it to do a final pathspec-matching pass so that callers
don't have to. That would mean that correct_untracked_entries() sees the
correctly culled list, and the extra check in cmd_clean() could be
dropped.

Ideally, of course, we'd fix those individual cases, since that would be
more efficient. And your patch may be the right first step in that
direction. But since we don't know what all of them are, it seems ripe
for regressions.

-Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH 2/7] dir.c: fix off-by-one error in match_pathspec_item
  2018-04-05 19:04       ` Jeff King
@ 2018-04-05 20:06         ` Elijah Newren
  2018-04-06 17:53           ` Jeff King
  0 siblings, 1 reply; 17+ messages in thread
From: Elijah Newren @ 2018-04-05 20:06 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List, Samuel Lijin

On Thu, Apr 5, 2018 at 12:04 PM, Jeff King <peff@peff.net> wrote:
> On Thu, Apr 05, 2018 at 11:36:45AM -0700, Elijah Newren wrote:
>
>> > Do we care about matching the name "foo" against the patchspec_item "foo/"?
>> >
>> > That matches now, but wouldn't after your patch.
>>
<snip>
>> So I should probably make the check handle both cases:
>>
>> @@ -383,8 +383,9 @@ static int match_pathspec_item(const struct
>> pathspec_item *item, int prefix,
>>         /* Perform checks to see if "name" is a super set of the pathspec */
>>         if (flags & DO_MATCH_LEADING_PATHSPEC) {
>>                 /* name is a literal prefix of the pathspec */
>> +               int offset = name[namelen-1] == '/' ? 1 : 0;
>>                 if ((namelen < matchlen) &&
>> -                   (match[namelen] == '/') &&
>> +                   (match[namelen-offset] == '/') &&
>>                     !ps_strncmp(item, match, name, namelen))
>>                         return MATCHED_RECURSIVELY_LEADING_PATHSPEC;
>
> That seems reasonable to me, and your "offset" trick here should prevent
> us from getting confused. Can namelen ever be zero here? I guess
> probably not (I could see an empty pathspec, but an empty path does not
> make sense).

Right, I don't see how an empty path would make sense.

> There are other similar trailing-slash matches in that function, but I'm
> not sure of all the cases in which they're used. I don't know if any of
> those would need similar treatment (sorry for being vague; I expect I'd
> need a few hours to dig into how the pathspec code actually works, and I
> don't have that today).

If it'd only take you a few hours, then you're a lot faster than me.
It took me a while to start wrapping my head around it.

The other trailing-slash matches in the function are all correct,
according to the testsuite.  (I'm not sure I like the
DO_MATCH_DIRECTORY stuff, but it is encoded in tests and backward
compatibility is important.)  In particular, changing the earlier code
to have the same offset trick would make it claim that e.g. either
"a/b" or "a/b/" as names match unconditionally against "a/b/c" as a
pathspec.  We need it to be conditional: we only want that to be
considered a match when checking whether we want to recurse into the
directory for other matches, not when checking whether the directory
itself matches the pathspec.  Thus, it should be behind a separate
flag, in a subsequent check, which is what this series does (namely
with DO_MATCH_LEADING_PATHSPEC).

To be more precise, here is how a matrix of pathnames and pathspecs
would be treated by match_pathspec_item(), where I am abbreviating
names like MATCH_RECURSIVELY_LEADING_PATHSPEC to LEADING):

                               Pathspecs
                |    a/b    |    a/b/    |   a/b/c
          ------+-----------+------------+-----------
          a/b   |  EXACT    | RECURSIVE  |  LEADING[3]
  Names   a/b/  |  EXACT[1] |  EXACT     |  LEADING[2]
          a/b/c | RECURSIVE | RECURSIVE  |  EXACT

[1] Only if DO_MATCH_DIRECTORY is passed.  Otherwise,
    this is NOT a match at all.
[2] Only if DO_MATCH_LEADING_PATHSPEC is passed,
    after applying this series.  Otherwise, not a match
    at all.
[3] Without the fix in this thread that you highlighted,
    and assuming we apply patch 7, this would actually
    mistakenly return RECURSIVE.


Now for a separate question: How much of the above would you like
added to the commit message...or even as a comment in the code to make
it clearer to other folks trying to make sense of it?


Elijah

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH 2/7] dir.c: fix off-by-one error in match_pathspec_item
  2018-04-05 20:06         ` Elijah Newren
@ 2018-04-06 17:53           ` Jeff King
  0 siblings, 0 replies; 17+ messages in thread
From: Jeff King @ 2018-04-06 17:53 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Git Mailing List, Samuel Lijin

On Thu, Apr 05, 2018 at 01:06:30PM -0700, Elijah Newren wrote:

> > There are other similar trailing-slash matches in that function, but I'm
> > not sure of all the cases in which they're used. I don't know if any of
> > those would need similar treatment (sorry for being vague; I expect I'd
> > need a few hours to dig into how the pathspec code actually works, and I
> > don't have that today).
> 
> If it'd only take you a few hours, then you're a lot faster than me.
> It took me a while to start wrapping my head around it.

OK, I was being overly optimistic. :)

> The other trailing-slash matches in the function are all correct,
> according to the testsuite.  (I'm not sure I like the
> DO_MATCH_DIRECTORY stuff, but it is encoded in tests and backward
> compatibility is important.)  In particular, changing the earlier code
> to have the same offset trick would make it claim that e.g. either
> "a/b" or "a/b/" as names match unconditionally against "a/b/c" as a
> pathspec.  We need it to be conditional: we only want that to be
> considered a match when checking whether we want to recurse into the
> directory for other matches, not when checking whether the directory
> itself matches the pathspec.  Thus, it should be behind a separate
> flag, in a subsequent check, which is what this series does (namely
> with DO_MATCH_LEADING_PATHSPEC).

OK, that makes some sense to me.

> To be more precise, here is how a matrix of pathnames and pathspecs
> would be treated by match_pathspec_item(), where I am abbreviating
> names like MATCH_RECURSIVELY_LEADING_PATHSPEC to LEADING):
> 
>                                Pathspecs
>                 |    a/b    |    a/b/    |   a/b/c
>           ------+-----------+------------+-----------
>           a/b   |  EXACT    | RECURSIVE  |  LEADING[3]
>   Names   a/b/  |  EXACT[1] |  EXACT     |  LEADING[2]
>           a/b/c | RECURSIVE | RECURSIVE  |  EXACT
> 
> [1] Only if DO_MATCH_DIRECTORY is passed.  Otherwise,
>     this is NOT a match at all.
> [2] Only if DO_MATCH_LEADING_PATHSPEC is passed,
>     after applying this series.  Otherwise, not a match
>     at all.
> [3] Without the fix in this thread that you highlighted,
>     and assuming we apply patch 7, this would actually
>     mistakenly return RECURSIVE.
> 
> 
> Now for a separate question: How much of the above would you like
> added to the commit message...or even as a comment in the code to make
> it clearer to other folks trying to make sense of it?

That table seems quite illuminating to me. It's hard to pick out all the
special-cases from the code, or what they're _supposed_ to be doing. I
think it makes sense as a code comment.

-Peff

PS I'm going to be on a 3-week vacation starting tomorrow, so apologies
   in advance for ignoring any follow-ups.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH 4/7] dir: Directories should be checked for matching pathspecs too
  2018-04-05 19:31       ` Jeff King
@ 2018-04-09  2:07         ` Junio C Hamano
  0 siblings, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2018-04-09  2:07 UTC (permalink / raw)
  To: Jeff King; +Cc: Elijah Newren, Git Mailing List, Samuel Lijin

Jeff King <peff@peff.net> writes:

> To be honest, I don't know. Most of dir.c predates me, and I've tried to
> avoid looking at it too hard. But I had a vague recollection of it being
> "best effort", and this bit from cf424f5fd89b reinforces that:
>
>   However, read_directory does not actually check against our pathspec.
>   It uses a simplified version that may turn up false positives. As a
>   result, we need to check that any hits match our pathspec.

At least the original design of the traversal was "try making use of
pathspec during the traversal when we can cheaply filter out obvious
non-hits and avoid recursing into an entire hierarchy---false negative
is an absolute no-no, but forcing the consumer to post filter is OK".

> ... But I think that anybody who looks at the output of
> fill_directory() does need to be aware that they may get more entries
> than they expected, and has to apply the pathspecs themselves.

That matches with my understanding of how "fill" thing worked from
early days.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2018-04-09  2:07 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-05 17:34 [RFC PATCH 0/7] Fix `git clean` with pathspecs Elijah Newren
2018-04-05 17:34 ` [RFC PATCH 1/7] dir.c: Fix typo in comment Elijah Newren
2018-04-05 17:34 ` [RFC PATCH 2/7] dir.c: fix off-by-one error in match_pathspec_item Elijah Newren
2018-04-05 17:49   ` Jeff King
2018-04-05 18:36     ` Elijah Newren
2018-04-05 19:04       ` Jeff King
2018-04-05 20:06         ` Elijah Newren
2018-04-06 17:53           ` Jeff King
2018-04-05 17:34 ` [RFC PATCH 3/7] t7300: Add some testcases showing failure to clean specified pathspecs Elijah Newren
2018-04-05 17:34 ` [RFC PATCH 4/7] dir: Directories should be checked for matching pathspecs too Elijah Newren
2018-04-05 18:58   ` Jeff King
2018-04-05 19:15     ` Elijah Newren
2018-04-05 19:31       ` Jeff King
2018-04-09  2:07         ` Junio C Hamano
2018-04-05 17:34 ` [RFC PATCH 5/7] dir: Make the DO_MATCH_SUBMODULE code reusable for a non-submodule case Elijah Newren
2018-04-05 17:34 ` [RFC PATCH 6/7] dir: If our pathspec might match files under a dir, recurse into it Elijah Newren
2018-04-05 17:34 ` [RFC PATCH 7/7] If we do not want globs to recurse into subdirs without -d Elijah Newren

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).