git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v4] Add an explicit GIT_DIR to the list of excludes
@ 2014-05-27  3:56 Pasha Bolokhov
  2014-05-28 12:36 ` Duy Nguyen
  0 siblings, 1 reply; 6+ messages in thread
From: Pasha Bolokhov @ 2014-05-27  3:56 UTC (permalink / raw
  To: pclouds; +Cc: jrnieder, git, Pasha Bolokhov

When an explicit '--git-dir' option points to a directory inside
the work tree, git treats it as if it were any other directory.
In particular, 'git status' lists it as untracked, while 'git add -A'
stages the metadata directory entirely

Add GIT_DIR to the list of excludes in setup_standard_excludes(),
while checking that GIT_DIR is not just '.git', in which case it
would be ignored by default, and that GIT_DIR is inside GIT_WORK_TREE

Although an analogous comparison of any given path against '.git'
is done in treat_path(), this does not seem to be the right place
to compare against GIT_DIR. Instead, the excludes provide an
effective mechanism of ignoring a file/directory, and adding GIT_DIR
as an exclude is equivalent of putting it into '.gitignore'. Function
setup_standard_excludes() was chosen because that is the place where
the excludes are initialized by the commands that are concerned about
excludes

Signed-off-by: Pasha Bolokhov <pasha.bolokhov@gmail.com>
---
Improve test tree structure.
Add a check for work_tree==NULL in dir.c:setup_standard_excludes
When work_tree is NULL, there is no concept of whether the
repo is within work tree or not, but we still default to ignoring
GIT_DIR

 Documentation/technical/api-directory-listing.txt |   4 +-
 dir.c                                             |  32 +++++
 t/t2205-add-gitdir.sh                             | 163 ++++++++++++++++++++++
 3 files changed, 197 insertions(+), 2 deletions(-)
 create mode 100755 t/t2205-add-gitdir.sh

diff --git a/Documentation/technical/api-directory-listing.txt b/Documentation/technical/api-directory-listing.txt
index 7f8e78d..fd4a178 100644
--- a/Documentation/technical/api-directory-listing.txt
+++ b/Documentation/technical/api-directory-listing.txt
@@ -90,8 +90,8 @@ marked. If you to exclude files, make sure you have loaded index first.
   `add_exclude()`.
 
 * To add patterns from a file (e.g. `.git/info/exclude`), call
-  `add_excludes_from_file()` , and/or set `dir.exclude_per_dir`.  A
-  short-hand function `setup_standard_excludes()` can be used to set
+  `add_excludes_from_file()` , and/or set `dir.exclude_per_dir`.  The
+  short-hand function `setup_standard_excludes()` must be used to set
   up the standard set of exclude settings.
 
 * Set options described in the Data Structure section above.
diff --git a/dir.c b/dir.c
index 98bb50f..dd76da1 100644
--- a/dir.c
+++ b/dir.c
@@ -1588,6 +1588,38 @@ void setup_standard_excludes(struct dir_struct *dir)
 {
 	const char *path;
 	char *xdg_path;
+	const char *r_git, *gitdir = get_git_dir();
+	char *n_git, *basename;
+	int len, i;
+
+	/*
+	 * Add git directory to the ignores; do this only if
+	 * GIT_DIR does not end with "/.git"
+	 */
+	r_git = real_path(absolute_path(gitdir));
+	n_git = xmalloc(strlen(r_git) + 1 + 1);
+	normalize_path_copy(n_git, r_git);
+
+	len = strlen(n_git); /* real_path() has stripped trailing slash */
+	for (i = len - 1; i > 0 && !is_dir_sep(n_git[i]); i--) ;
+	basename = n_git + i;
+	if (is_dir_sep(*basename))
+		basename++;
+	if (strcmp(basename, ".git")) {
+		const char *worktree = get_git_work_tree();
+
+		if (worktree == NULL ||
+		    dir_inside_of(n_git, get_git_work_tree()) >= 0) {
+			struct exclude_list *el = add_exclude_list(dir, EXC_CMDL,
+							"GIT_DIR setup");
+
+			/* append a trailing slash to exclude directories */
+			n_git[len] = '/';
+			n_git[len + 1] = '\0';
+			add_exclude(basename, "", 0, el, 0);
+		}
+	}
+	free(n_git);
 
 	dir->exclude_per_dir = ".gitignore";
 	path = git_path("info/exclude");
diff --git a/t/t2205-add-gitdir.sh b/t/t2205-add-gitdir.sh
new file mode 100755
index 0000000..4cc26af
--- /dev/null
+++ b/t/t2205-add-gitdir.sh
@@ -0,0 +1,163 @@
+#!/bin/sh
+#
+# Copyright (c) 2014 Pasha Bolokhov
+#
+
+test_description='alternative repository path specified by --git-dir is ignored by add and status'
+
+. ./test-lib.sh
+
+#
+# Create a tree:
+#
+#	repo-inside/  repo-outside/
+#
+#
+# repo-inside:
+# 	a  b  c  d  subdir/ [meta/]
+#
+# repo-inside/subdir:
+# 	e  f  g  h  meta/  ssubdir/
+#
+# repo-inside/subdir/meta:
+# 	aa
+#
+# repo-inside/subdir/ssubdir:
+# 	meta/
+#
+# repo-inside/subdir/ssubdir/meta:
+# 	aaa
+#
+#
+#
+# repo-outside:
+#	external/  tree/
+#
+# repo-outside/external:
+#	[meta/]
+#
+# repo-outside/tree:
+#	n  o  p  q  meta/  sub/
+#
+# repo-outside/tree/meta:
+#	bb
+#
+# repo-outside/tree/sub:
+#	meta/
+#
+# repo-outside/tree/sub/meta:
+#	bbb
+#
+#
+# (both of the above [meta/] denote the actual repositories)
+#
+
+#
+# First set of tests (in "repo-inside/"):
+# ---------------------------------------
+#
+# Name the repository "meta" and see whether or not "git status" includes or
+# ignores directories named "meta". Directory "meta" at the top level of
+# "repo-inside/"is the repository and appears upon the first "git init"
+#
+#
+# Second set of tests (in "repo-outside/"):
+# -----------------------------------------
+#
+# Put the work tree into "tree/" and repository into "external/meta"
+# (the latter directory appears upon the corresponding "git init").
+# The work tree again contains directories named "meta", but those ones are
+# tested not to be ignored now
+#
+
+test_expect_success "setup" '
+	mkdir repo-inside/ &&
+	(
+		cd repo-inside/ &&
+		for f in a b c d
+		do
+			echo "DATA" >"$f" || exit 1
+		done &&
+		mkdir subdir subdir/meta &&
+		mkdir subdir/ssubdir subdir/ssubdir/meta &&
+		for f in e f g h
+		do
+			echo "MORE DATA" >"subdir/$f" || exit 1
+		done &&
+		echo "EVEN more Data" >subdir/meta/aa &&
+		echo "Data and BAIT" >subdir/ssubdir/meta/aaa &&
+		git --git-dir=meta init
+	) &&
+	mkdir repo-outside/ repo-outside/external repo-outside/tree &&
+	(
+		cd repo-outside/tree &&
+		for f in n o p q
+		do
+			echo "Literal Data" >"$f" || exit 1
+		done &&
+		mkdir meta sub sub/meta &&
+		echo "Sample data" >meta/bb &&
+		echo "Stream of data and BAIT" >sub/meta/bbb &&
+		git --git-dir=../external/meta init
+	)
+'
+
+
+#
+# The first set of tests (the repository is inside the work tree)
+#
+test_expect_success "'git status' ignores the repository directory" '
+	(
+		cd repo-inside &&
+		git --git-dir=meta --work-tree=. status --porcelain >status.out &&
+		test_might_fail grep meta status.out >out &&
+		! test -s out
+	)
+'
+
+test_expect_success "'git add -A' ignores the repository directory" '
+	(
+		cd repo-inside &&
+		git --git-dir=meta --work-tree=. add -A &&
+		git --git-dir=meta --work-tree=. status --porcelain >status1.out &&
+		test_might_fail grep meta status1.out >out1 &&
+		! test -s out1
+	)
+'
+
+test_expect_success "'git grep --exclude-standard' ignores the repository directory" '
+	(
+		cd repo-inside &&
+		test_might_fail git --git-dir=meta \
+			grep --no-index --exclude-standard BAIT >out2 &&
+		! test -s out2
+	)
+'
+
+
+#
+# The second set of tests (the repository is outside of the work tree)
+#
+test_expect_success "'git status' acknowledges directories 'meta' \
+if repo is not within work tree" '
+	test_might_fail rm -rf meta/ &&
+	(
+		cd repo-outside/tree &&
+		git --git-dir=../external/meta init &&
+		git --git-dir=../external/meta --work-tree=. status --porcelain >status3.out &&
+		test_might_fail grep meta status3.out >out3 &&
+		test -s out3
+	)
+'
+
+test_expect_success "'git add -A' adds 'meta' if the repo is outside the work tree" '
+	(
+		cd repo-outside/tree &&
+		git --git-dir=../external/meta --work-tree=. add -A &&
+		git --git-dir=../external/meta --work-tree=. status --porcelain >status4.out &&
+		test_might_fail grep meta status4.out >out4 &&
+		test -s out4
+	)
+'
+
+test_done
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v4] Add an explicit GIT_DIR to the list of excludes
  2014-05-27  3:56 [PATCH v4] Add an explicit GIT_DIR to the list of excludes Pasha Bolokhov
@ 2014-05-28 12:36 ` Duy Nguyen
  2014-05-28 22:11   ` Pasha Bolokhov
  0 siblings, 1 reply; 6+ messages in thread
From: Duy Nguyen @ 2014-05-28 12:36 UTC (permalink / raw
  To: Pasha Bolokhov; +Cc: Jonathan Nieder, Git Mailing List

On Tue, May 27, 2014 at 10:56 AM, Pasha Bolokhov
<pasha.bolokhov@gmail.com> wrote:
> @@ -1588,6 +1588,38 @@ void setup_standard_excludes(struct dir_struct *dir)
>  {
>         const char *path;
>         char *xdg_path;
> +       const char *r_git, *gitdir = get_git_dir();
> +       char *n_git, *basename;
> +       int len, i;
> +
> +       /*
> +        * Add git directory to the ignores; do this only if
> +        * GIT_DIR does not end with "/.git"
> +        */
> +       r_git = real_path(absolute_path(gitdir));
> +       n_git = xmalloc(strlen(r_git) + 1 + 1);
> +       normalize_path_copy(n_git, r_git);
> +
> +       len = strlen(n_git); /* real_path() has stripped trailing slash */
> +       for (i = len - 1; i > 0 && !is_dir_sep(n_git[i]); i--) ;
> +       basename = n_git + i;
> +       if (is_dir_sep(*basename))
> +               basename++;
> +       if (strcmp(basename, ".git")) {

I think normalize_path_copy makes sure that dir sep is '/', so this
code may be simplified to "if (strcmp(n_git, .git") && (len == 4 ||
strcmp(n_git + len - 5, "/.git")))"?

> +               const char *worktree = get_git_work_tree();
> +
> +               if (worktree == NULL ||
> +                   dir_inside_of(n_git, get_git_work_tree()) >= 0) {
> +                       struct exclude_list *el = add_exclude_list(dir, EXC_CMDL,
> +                                                       "GIT_DIR setup");
> +
> +                       /* append a trailing slash to exclude directories */
> +                       n_git[len] = '/';
> +                       n_git[len + 1] = '\0';
> +                       add_exclude(basename, "", 0, el, 0);
> +               }
> +       }
> +       free(n_git);

All this add-only code makes me think it may be nice to make it a
separate function. A good function name could replace the comment near
the beginning of the block.
-- 
Duy

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v4] Add an explicit GIT_DIR to the list of excludes
  2014-05-28 12:36 ` Duy Nguyen
@ 2014-05-28 22:11   ` Pasha Bolokhov
  2014-05-29 11:42     ` Duy Nguyen
  0 siblings, 1 reply; 6+ messages in thread
From: Pasha Bolokhov @ 2014-05-28 22:11 UTC (permalink / raw
  To: Duy Nguyen; +Cc: Jonathan Nieder, Git Mailing List

On Wed, May 28, 2014 at 5:36 AM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Tue, May 27, 2014 at 10:56 AM, Pasha Bolokhov
> <pasha.bolokhov@gmail.com> wrote:
>> @@ -1588,6 +1588,38 @@ void setup_standard_excludes(struct dir_struct *dir)
>>  {
>>         const char *path;
>>         char *xdg_path;
>> +       const char *r_git, *gitdir = get_git_dir();
>> +       char *n_git, *basename;
>> +       int len, i;
>> +
>> +       /*
>> +        * Add git directory to the ignores; do this only if
>> +        * GIT_DIR does not end with "/.git"
>> +        */
>> +       r_git = real_path(absolute_path(gitdir));
>> +       n_git = xmalloc(strlen(r_git) + 1 + 1);
>> +       normalize_path_copy(n_git, r_git);
>> +
>> +       len = strlen(n_git); /* real_path() has stripped trailing slash */
>> +       for (i = len - 1; i > 0 && !is_dir_sep(n_git[i]); i--) ;
>> +       basename = n_git + i;
>> +       if (is_dir_sep(*basename))
>> +               basename++;
>> +       if (strcmp(basename, ".git")) {
>
> I think normalize_path_copy makes sure that dir sep is '/', so this
> code may be simplified to "if (strcmp(n_git, .git") && (len == 4 ||
> strcmp(n_git + len - 5, "/.git")))"?

Then if "n_git" is "/ab"  =>  coredump. But I agree there is logic to
this (if we check len >= 4 first). However, we still need the
basename. So I've just shortened it a bit, what do you think: (notice
the condition "i >= 0" btw)

        for (i = len - 1; i >= 0 && n_git[i] != '/'; i--) ;
        basename = n_git + i + 1;
        if (strcmp(basename, ".git)) {

>
>> +               const char *worktree = get_git_work_tree();
>> +
>> +               if (worktree == NULL ||
>> +                   dir_inside_of(n_git, get_git_work_tree()) >= 0) {
>> +                       struct exclude_list *el = add_exclude_list(dir, EXC_CMDL,
>> +                                                       "GIT_DIR setup");
>> +
>> +                       /* append a trailing slash to exclude directories */
>> +                       n_git[len] = '/';
>> +                       n_git[len + 1] = '\0';
>> +                       add_exclude(basename, "", 0, el, 0);
>> +               }
>> +       }
>> +       free(n_git);
>
> All this add-only code makes me think it may be nice to make it a
> separate function. A good function name could replace the comment near
> the beginning of the block.

Reasonable
I'll send the all-updated patch including doc when ready

> --
> Duy

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v4] Add an explicit GIT_DIR to the list of excludes
  2014-05-28 22:11   ` Pasha Bolokhov
@ 2014-05-29 11:42     ` Duy Nguyen
  2014-06-03 20:55       ` Pasha Bolokhov
  0 siblings, 1 reply; 6+ messages in thread
From: Duy Nguyen @ 2014-05-29 11:42 UTC (permalink / raw
  To: Pasha Bolokhov; +Cc: Jonathan Nieder, Git Mailing List

On Thu, May 29, 2014 at 5:11 AM, Pasha Bolokhov
<pasha.bolokhov@gmail.com> wrote:
>>> +       len = strlen(n_git); /* real_path() has stripped trailing slash */
>>> +       for (i = len - 1; i > 0 && !is_dir_sep(n_git[i]); i--) ;
>>> +       basename = n_git + i;
>>> +       if (is_dir_sep(*basename))
>>> +               basename++;
>>> +       if (strcmp(basename, ".git")) {
>>
>> I think normalize_path_copy makes sure that dir sep is '/', so this
>> code may be simplified to "if (strcmp(n_git, .git") && (len == 4 ||
>> strcmp(n_git + len - 5, "/.git")))"?
>
> Then if "n_git" is "/ab"  =>  coredump. But I agree there is logic to
> this (if we check len >= 4 first). However, we still need the
> basename.

Ah I missed this at add_exclude()

> So I've just shortened it a bit, what do you think: (notice
> the condition "i >= 0" btw)
>
>         for (i = len - 1; i >= 0 && n_git[i] != '/'; i--) ;

There's basename() that does this for you. A compat version is
provided for Windows port so no portability worries.

>         basename = n_git + i + 1;
>         if (strcmp(basename, ".git)) {
>
>>
>>> +               const char *worktree = get_git_work_tree();
>>> +
>>> +               if (worktree == NULL ||
>>> +                   dir_inside_of(n_git, get_git_work_tree()) >= 0) {
>>> +                       struct exclude_list *el = add_exclude_list(dir, EXC_CMDL,
>>> +                                                       "GIT_DIR setup");
>>> +
>>> +                       /* append a trailing slash to exclude directories */
>>> +                       n_git[len] = '/';
>>> +                       n_git[len + 1] = '\0';
>>> +                       add_exclude(basename, "", 0, el, 0);

Hmm.. I overlooked this bit before. So if  $GIT_DIR is /something/foo,
we set to ignore "foo/". Because we know n_git must be part of
(normalized) get_git_work_tree() at this point, could we path n_git +
strlen(get_git_work_tree()) to add_exclude() instead of basename? Full
path makes sure we don't accidentally exclude too much.

The case when $GIT_DIR points to a _file_ seems uncovered.
setup_git_directory() will transform the file to the directory
internally and we never know the .git file's path (so we can't exclude
it). So people could accidentally add the .git file in, then remove it
from from work tree and suddenly the work tree becomes repo-less. It's
not as bad as .git _directory_ because we don't lose valuable data. I
don't know if you want to cover this too.
-- 
Duy

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v4] Add an explicit GIT_DIR to the list of excludes
  2014-05-29 11:42     ` Duy Nguyen
@ 2014-06-03 20:55       ` Pasha Bolokhov
  2014-06-04 13:52         ` Duy Nguyen
  0 siblings, 1 reply; 6+ messages in thread
From: Pasha Bolokhov @ 2014-06-03 20:55 UTC (permalink / raw
  To: Duy Nguyen; +Cc: Jonathan Nieder, Git Mailing List

On Thu, May 29, 2014 at 4:42 AM, Duy Nguyen <pclouds@gmail.com> wrote:
>>>> +               const char *worktree = get_git_work_tree();
>>>> +
>>>> +               if (worktree == NULL ||
>>>> +                   dir_inside_of(n_git, get_git_work_tree()) >= 0) {
>>>> +                       struct exclude_list *el = add_exclude_list(dir, EXC_CMDL,
>>>> +                                                       "GIT_DIR setup");
>>>> +
>>>> +                       /* append a trailing slash to exclude directories */
>>>> +                       n_git[len] = '/';
>>>> +                       n_git[len + 1] = '\0';
>>>> +                       add_exclude(basename, "", 0, el, 0);
>
> Hmm.. I overlooked this bit before. So if  $GIT_DIR is /something/foo,
> we set to ignore "foo/". Because we know n_git must be part of
> (normalized) get_git_work_tree() at this point, could we path n_git +
> strlen(get_git_work_tree()) to add_exclude() instead of basename? Full
> path makes sure we don't accidentally exclude too much.

I guess so. In fact, dir_inside_of() already returns the relative
position, can reuse that (however, that function doesn't include the
leading path, making it a relative path; but it's not difficult to
work around). The only uncovered situation is when GIT_DIR=WORK_TREE.
But that's user's fault and I don't think we need to guarantee that
GIT_DIR will be excluded then

>
> The case when $GIT_DIR points to a _file_ seems uncovered.
> setup_git_directory() will transform the file to the directory
> internally and we never know the .git file's path (so we can't exclude
> it). So people could accidentally add the .git file in, then remove it
> from from work tree and suddenly the work tree becomes repo-less. It's
> not as bad as .git _directory_ because we don't lose valuable data. I
> don't know if you want to cover this too.

That's right, there is no way of knowing what the original .git file
was. I guess the only way to work around this problem is to modify
"read_gitfile()" so it saves the name of the original file. Then we
can add both that .git-file and GIT_DIR to the exclude list. Not a big
problem with me, but need to see what you guys think

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v4] Add an explicit GIT_DIR to the list of excludes
  2014-06-03 20:55       ` Pasha Bolokhov
@ 2014-06-04 13:52         ` Duy Nguyen
  0 siblings, 0 replies; 6+ messages in thread
From: Duy Nguyen @ 2014-06-04 13:52 UTC (permalink / raw
  To: Pasha Bolokhov; +Cc: Jonathan Nieder, Git Mailing List

On Wed, Jun 4, 2014 at 3:55 AM, Pasha Bolokhov <pasha.bolokhov@gmail.com> wrote:
>> The case when $GIT_DIR points to a _file_ seems uncovered.
>> setup_git_directory() will transform the file to the directory
>> internally and we never know the .git file's path (so we can't exclude
>> it). So people could accidentally add the .git file in, then remove it
>> from from work tree and suddenly the work tree becomes repo-less. It's
>> not as bad as .git _directory_ because we don't lose valuable data. I
>> don't know if you want to cover this too.
>
> That's right, there is no way of knowing what the original .git file
> was. I guess the only way to work around this problem is to modify
> "read_gitfile()" so it saves the name of the original file. Then we
> can add both that .git-file and GIT_DIR to the exclude list. Not a big
> problem with me, but need to see what you guys think

My view is this non-standard $(basename $GIT_DIR) is a corner case.
Unless people who care about it (e.g. you) do something that affects
the common ".git" case, or really mess up the code, I don't think it's
a problem if you decide to ignore some smaller cases.
-- 
Duy

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2014-06-04 13:53 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-05-27  3:56 [PATCH v4] Add an explicit GIT_DIR to the list of excludes Pasha Bolokhov
2014-05-28 12:36 ` Duy Nguyen
2014-05-28 22:11   ` Pasha Bolokhov
2014-05-29 11:42     ` Duy Nguyen
2014-06-03 20:55       ` Pasha Bolokhov
2014-06-04 13:52         ` Duy Nguyen

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).