git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* git merge, .gitignore, and silently overwriting untracked files
@ 2010-08-17  5:21 Joshua Jensen
  2010-08-17 19:33 ` Junio C Hamano
  2018-11-06 15:12 ` Checkout deleted semi-untracked file Ævar Arnfjörð Bjarmason
  0 siblings, 2 replies; 86+ messages in thread
From: Joshua Jensen @ 2010-08-17  5:21 UTC (permalink / raw)
  To: git@vger.kernel.org

  Git silently overwrote an important untracked file for me today.  I 
have come up with the following transcript to reproduce the issue.

In a nutshell, if an untracked file is in .gitignore, a merge from a 
branch where the file was added will OVERWRITE the untracked file 
without warning.

This doesn't seem right.

What do others think?

Josh

-------------------------------

/s/testgitoverwrite
$ mkdir aaa

/s/testgitoverwrite
$ cd aaa

/s/testgitoverwrite/aaa
$ echo abc > abc.txt

/s/testgitoverwrite/aaa
$ ls -l
total 1
-rw-r--r--    1 Joshua   Administ        4 Aug 16 23:10 abc.txt

/s/testgitoverwrite/aaa
$ git init
Initialized empty Git repository in s:/testgitoverwrite/aaa/.git/

/s/testgitoverwrite/aaa (master)
$ git add abc.txt

/s/testgitoverwrite/aaa (master)
$ git commit -m "Added abc.txt"
[master (root-commit) 0e8bffb] Added abc.txt
  1 files changed, 1 insertions(+), 0 deletions(-)
  create mode 100644 abc.txt

/s/testgitoverwrite/aaa (master)
$ cd ..

/s/testgitoverwrite
$ git clone aaa bbb
Cloning into bbb...
done.

/s/testgitoverwrite
$ cd bbb

/s/testgitoverwrite/bbb (master)
$ ls -l
total 1
-rw-r--r--    1 Joshua   Administ        4 Aug 16 23:10 abc.txt

/s/testgitoverwrite/bbb (master)
$ echo "Important stuff" > important.txt

/s/testgitoverwrite/bbb (master)
$ git add important.txt

/s/testgitoverwrite/bbb (master)
$ git commit -m "This file is SO important"
[master 2fe9ab6] This file is SO important
  1 files changed, 1 insertions(+), 0 deletions(-)
  create mode 100644 important.txt

/s/testgitoverwrite/bbb (master)
$ cd ..

/s/testgitoverwrite
$ cd aaa

/s/testgitoverwrite/aaa (master)
$ echo "Local testing copy of important.txt" > important.txt

/s/testgitoverwrite/aaa (master)
$ git status
# On branch master
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#       important.txt
nothing added to commit but untracked files present (use "git add" to track)

/s/testgitoverwrite/aaa (master)
$ git pull ../bbb
remote: Counting objects: 4, done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 3 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
 From ../bbb
  * branch            HEAD       -> FETCH_HEAD
Updating 0e8bffb..2fe9ab6
error: Untracked working tree file 'important.txt' would be overwritten 
by merge.  Aborting

/s/testgitoverwrite/aaa (master)
$ git pull --rebase ../bbb
 From ../bbb
  * branch            HEAD       -> FETCH_HEAD
First, rewinding head to replay your work on top of it...
error: Untracked working tree file 'important.txt' would be overwritten 
by merge.
could not detach HEAD

/s/testgitoverwrite/aaa (master)
$ echo important.txt > .gitignore

/s/testgitoverwrite/aaa (master)
$ git status
# On branch master
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#       .gitignore
nothing added to commit but untracked files present (use "git add" to track)

/s/testgitoverwrite/aaa (master)
$ cat important.txt
Local testing copy of important.txt

/s/testgitoverwrite/aaa (master)
$ git pull ../bbb
 From ../bbb
  * branch            HEAD       -> FETCH_HEAD
Updating 0e8bffb..2fe9ab6
Fast-forward
  important.txt |    1 +
  1 files changed, 1 insertions(+), 0 deletions(-)
  create mode 100644 important.txt

/s/testgitoverwrite/aaa (master)
$ cat important.txt
Important stuff

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: git merge, .gitignore, and silently overwriting untracked files
  2010-08-17  5:21 git merge, .gitignore, and silently overwriting untracked files Joshua Jensen
@ 2010-08-17 19:33 ` Junio C Hamano
  2010-08-18 23:39   ` [PATCH] optionally disable overwriting of ignored files Clemens Buchacher
  2018-11-06 15:12 ` Checkout deleted semi-untracked file Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 86+ messages in thread
From: Junio C Hamano @ 2010-08-17 19:33 UTC (permalink / raw)
  To: Joshua Jensen; +Cc: git@vger.kernel.org

There are only two kinds of untracked paths in git.  Ones that git is
allowed to remove in order to carry out operations asked by the end user,
i.e. ones that are trashable, are registered to the gitignore mechanism,
and the ones that are precious are not in gitignore; the latter can
interfere with git and prevent it from carrying out certain operations.
E.g. an untracked and unignored file F will prevent you from switching to
a branch that has file F tracked, or that has file F/G (iow, F is a
directory in that branch) tracked.

Ancient git didn't honor gitignore and considered no files trashable,
which caused a lot of trouble to users.  It may be illuminating to check
discussions in the list archive during the period the following commits
were made.  f8a9d42 (read-tree: further loosen "working file will be lost"
check., 2006-12-04), ed93b44 (merge: loosen overcautious "working file
will be lost" check., 2006-10-08), c819353 (Fix switching to a branch with
D/F when current branch has file D., 2007-03-15).

We could enhance the mechanism to categorize untracked paths into three
instead of two: trashable, unknown and precious.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH] optionally disable overwriting of ignored files
  2010-08-17 19:33 ` Junio C Hamano
@ 2010-08-18 23:39   ` Clemens Buchacher
  2010-08-19 10:41     ` Jakub Narebski
                       ` (4 more replies)
  0 siblings, 5 replies; 86+ messages in thread
From: Clemens Buchacher @ 2010-08-18 23:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Joshua Jensen, git@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 6594 bytes --]

By default, checkout and fast-forward merge will overwrite ignored
files. Make this behavior configurable.

If overwriting ignored files is disabled, it can still be enabled
on the command line by passing -i to merge or checkout.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
---

Hi,

On Tue, Aug 17, 2010 at 12:33:25PM -0700, Junio C Hamano wrote:
>
> Ancient git didn't honor gitignore and considered no files trashable,
> which caused a lot of trouble to users.  It may be illuminating to check
> discussions in the list archive during the period the following commits
> were made.  f8a9d42 (read-tree: further loosen "working file will be lost"
> check., 2006-12-04), ed93b44 (merge: loosen overcautious "working file
> will be lost" check., 2006-10-08), c819353 (Fix switching to a branch with
> D/F when current branch has file D., 2007-03-15).

I am not convinced this is something that most users want.

The fact that adding a file to .gitignore also marks it as
trashable to git is quite surprising to me. This is something that
the gitignore manpage should warn about loudly. Instead, I think
this behavior is completely undocumented.

As far as I can tell from the code, only fast-forward merge and
checkout overwrite ignored files by default. A regular merge will
complain about untracked files that would be overwritten (ignored
or not). I am quite familiar with that behavior and I think it's
useful to be reminded of those untracked files rather than having
them overwritten. I thought that by extension, other git commands
would behave in the same way.

Here is a patch to optionally disable overwriting of ignored files.
But I think we should even disable it by default. 

Clemens

 Documentation/config.txt |    6 ++++++
 builtin/checkout.c       |    7 +++++++
 builtin/merge.c          |    9 +++++++++
 cache.h                  |    1 +
 config.c                 |    3 +++
 environment.c            |    1 +
 6 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index f81fb91..eb9bb43 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -450,6 +450,12 @@ core.excludesfile::
 	to the value of `$HOME` and "{tilde}user/" to the specified user's
 	home directory.  See linkgit:gitignore[5].
 
+core.overwriteignored::
+	Untracked files can get in the way of merge or checkout. If
+	overwriteignored is enabled, such files will be overwritten
+	to let fast-forward merges and checkouts succeed. Enabled
+	by default.
+
 core.editor::
 	Commands such as `commit` and `tag` that lets you edit
 	messages by launching an editor uses the value of this
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 5affb6f..121a6a3 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -389,6 +389,11 @@ static int merge_working_tree(struct checkout_opts *opts,
 		topts.gently = opts->merge && old->commit;
 		topts.verbose_update = !opts->quiet;
 		topts.fn = twoway_merge;
+		if (overwrite_ignored) {
+			topts.dir = xcalloc(1, sizeof(*topts.dir));
+			topts.dir->flags |= DIR_SHOW_IGNORED;
+			topts.dir->exclude_per_dir = ".gitignore";
+		}
 		tree = parse_tree_indirect(old->commit ?
 					   old->commit->object.sha1 :
 					   (unsigned char *)EMPTY_TREE_SHA1_BIN);
@@ -664,6 +669,8 @@ int cmd_checkout(int argc, const char **argv, const char *prefix)
 		OPT_SET_INT('3', "theirs", &opts.writeout_stage, "stage",
 			    3),
 		OPT_BOOLEAN('f', "force", &opts.force, "force"),
+		OPT_BOOLEAN('i', "overwrite-ignored", &overwrite_ignored,
+		  "allow explicitly ignored files to be overwritten"),
 		OPT_BOOLEAN('m', "merge", &opts.merge, "merge"),
 		OPT_STRING(0, "conflict", &conflict_style, "style",
 			   "conflict style (merge or diff3)"),
diff --git a/builtin/merge.c b/builtin/merge.c
index 4d31e09..47b12ba 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -185,6 +185,8 @@ static struct option builtin_merge_options[] = {
 		"allow fast-forward (default)"),
 	OPT_BOOLEAN(0, "ff-only", &fast_forward_only,
 		"abort if fast-forward is not possible"),
+	OPT_BOOLEAN('i', "overwrite-ignored", &overwrite_ignored,
+	  "allow explicitly ignored files to be overwritten"),
 	OPT_RERERE_AUTOUPDATE(&allow_rerere_auto),
 	OPT_CALLBACK('s', "strategy", &use_strategies, "strategy",
 		"merge strategy to use", option_parse_strategy),
@@ -682,6 +684,7 @@ int checkout_fast_forward(const unsigned char *head, const unsigned char *remote
 	struct unpack_trees_options opts;
 	struct tree_desc t[MAX_UNPACK_TREES];
 	int i, fd, nr_trees = 0;
+	struct dir_struct dir;
 	struct lock_file *lock_file = xcalloc(1, sizeof(struct lock_file));
 
 	refresh_cache(REFRESH_QUIET);
@@ -691,6 +694,12 @@ int checkout_fast_forward(const unsigned char *head, const unsigned char *remote
 	memset(&trees, 0, sizeof(trees));
 	memset(&opts, 0, sizeof(opts));
 	memset(&t, 0, sizeof(t));
+	memset(&dir, 0, sizeof(dir));
+	if (overwrite_ignored) {
+		dir.flags |= DIR_SHOW_IGNORED;
+		dir.exclude_per_dir = ".gitignore";
+		opts.dir = &dir;
+	}
 
 	opts.head_idx = 1;
 	opts.src_index = &the_index;
diff --git a/cache.h b/cache.h
index c9fa3df..dd1b8f7 100644
--- a/cache.h
+++ b/cache.h
@@ -548,6 +548,7 @@ extern size_t packed_git_window_size;
 extern size_t packed_git_limit;
 extern size_t delta_base_cache_limit;
 extern int read_replace_refs;
+extern int overwrite_ignored;
 extern int fsync_object_files;
 extern int core_preload_index;
 extern int core_apply_sparse_checkout;
diff --git a/config.c b/config.c
index cdcf583..bd7956e 100644
--- a/config.c
+++ b/config.c
@@ -563,6 +563,9 @@ static int git_default_core_config(const char *var, const char *value)
 	if (!strcmp(var, "core.excludesfile"))
 		return git_config_pathname(&excludes_file, var, value);
 
+	if (!strcmp(var, "core.overwriteignored"))
+		overwrite_ignored = git_config_bool(var, value);
+
 	if (!strcmp(var, "core.whitespace")) {
 		if (!value)
 			return config_error_nonbool(var);
diff --git a/environment.c b/environment.c
index 83d38d3..1b92f60 100644
--- a/environment.c
+++ b/environment.c
@@ -30,6 +30,7 @@ const char *apply_default_ignorewhitespace;
 int zlib_compression_level = Z_BEST_SPEED;
 int core_compression_level;
 int core_compression_seen;
+int overwrite_ignored = 1;
 int fsync_object_files;
 size_t packed_git_window_size = DEFAULT_PACKED_GIT_WINDOW_SIZE;
 size_t packed_git_limit = DEFAULT_PACKED_GIT_LIMIT;
-- 
1.7.2.1.1.g202c


[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply related	[flat|nested] 86+ messages in thread

* Re: [PATCH] optionally disable overwriting of ignored files
  2010-08-18 23:39   ` [PATCH] optionally disable overwriting of ignored files Clemens Buchacher
@ 2010-08-19 10:41     ` Jakub Narebski
  2010-08-20 18:48       ` Clemens Buchacher
  2010-08-20 20:35     ` Junio C Hamano
                       ` (3 subsequent siblings)
  4 siblings, 1 reply; 86+ messages in thread
From: Jakub Narebski @ 2010-08-19 10:41 UTC (permalink / raw)
  To: Clemens Buchacher; +Cc: Junio C Hamano, Joshua Jensen, git@vger.kernel.org

Clemens Buchacher <drizzd@aon.at> writes:

>  Documentation/config.txt |    6 ++++++
>  builtin/checkout.c       |    7 +++++++
>  builtin/merge.c          |    9 +++++++++
>  cache.h                  |    1 +
>  config.c                 |    3 +++
>  environment.c            |    1 +
>  6 files changed, 27 insertions(+), 0 deletions(-)

You forgot to update Documentation/git-checkout.txt and
Documentation/git-merge.txt with the new '-i' option.

> diff --git a/builtin/checkout.c b/builtin/checkout.c
> index 5affb6f..121a6a3 100644
> --- a/builtin/checkout.c
> +++ b/builtin/checkout.c

> @@ -664,6 +669,8 @@ int cmd_checkout(int argc, const char **argv, const char *prefix)
>  		OPT_SET_INT('3', "theirs", &opts.writeout_stage, "stage",
>  			    3),
>  		OPT_BOOLEAN('f', "force", &opts.force, "force"),
> +		OPT_BOOLEAN('i', "overwrite-ignored", &overwrite_ignored,
> +		  "allow explicitly ignored files to be overwritten"),
>  		OPT_BOOLEAN('m', "merge", &opts.merge, "merge"),
>  		OPT_STRING(0, "conflict", &conflict_style, "style",
>  			   "conflict style (merge or diff3)"),
> diff --git a/builtin/merge.c b/builtin/merge.c
> index 4d31e09..47b12ba 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -185,6 +185,8 @@ static struct option builtin_merge_options[] = {
>  		"allow fast-forward (default)"),
>  	OPT_BOOLEAN(0, "ff-only", &fast_forward_only,
>  		"abort if fast-forward is not possible"),
> +	OPT_BOOLEAN('i', "overwrite-ignored", &overwrite_ignored,
> +	  "allow explicitly ignored files to be overwritten"),
>  	OPT_RERERE_AUTOUPDATE(&allow_rerere_auto),
>  	OPT_CALLBACK('s', "strategy", &use_strategies, "strategy",
>  		"merge strategy to use", option_parse_strategy),

-- 
Jakub Narebski
Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH] optionally disable overwriting of ignored files
  2010-08-19 10:41     ` Jakub Narebski
@ 2010-08-20 18:48       ` Clemens Buchacher
  2010-08-20 19:01         ` Joshua Jensen
  0 siblings, 1 reply; 86+ messages in thread
From: Clemens Buchacher @ 2010-08-20 18:48 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Junio C Hamano, Joshua Jensen, git@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 329 bytes --]

On Thu, Aug 19, 2010 at 03:41:08AM -0700, Jakub Narebski wrote:
> 
> You forgot to update Documentation/git-checkout.txt and
> Documentation/git-merge.txt with the new '-i' option.

Thanks. I am just waiting for some kind of feedback. I'm not going
to write documentation for something that won't get accepted.

Clemens

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH] optionally disable overwriting of ignored files
  2010-08-20 18:48       ` Clemens Buchacher
@ 2010-08-20 19:01         ` Joshua Jensen
  0 siblings, 0 replies; 86+ messages in thread
From: Joshua Jensen @ 2010-08-20 19:01 UTC (permalink / raw)
  To: Clemens Buchacher; +Cc: Jakub Narebski, Junio C Hamano, git@vger.kernel.org

  ----- Original Message -----
From: Clemens Buchacher
Date: 8/20/2010 12:48 PM
> On Thu, Aug 19, 2010 at 03:41:08AM -0700, Jakub Narebski wrote:
>> You forgot to update Documentation/git-checkout.txt and
>> Documentation/git-merge.txt with the new '-i' option.
> Thanks. I am just waiting for some kind of feedback. I'm not going
> to write documentation for something that won't get accepted.
I like the idea... :)

My vote doesn't count, though...

Josh

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH] optionally disable overwriting of ignored files
  2010-08-18 23:39   ` [PATCH] optionally disable overwriting of ignored files Clemens Buchacher
  2010-08-19 10:41     ` Jakub Narebski
@ 2010-08-20 20:35     ` Junio C Hamano
  2010-08-21  8:05       ` Clemens Buchacher
                         ` (7 more replies)
  2010-08-20 20:46     ` [PATCH] optionally disable overwriting of ignored files Junio C Hamano
                       ` (2 subsequent siblings)
  4 siblings, 8 replies; 86+ messages in thread
From: Junio C Hamano @ 2010-08-20 20:35 UTC (permalink / raw)
  To: Clemens Buchacher; +Cc: Joshua Jensen, git@vger.kernel.org

If (and this may be a big IF) it is reasonable to add paths to .gitignore
that you do not want to lose, then you would want to have three classes of
untracked paths: "precious but ignored", "trashable" (and by definition
ignored), and "unignored" (and by definition is not ignored and is
precious).

As I already pointed out, we don't support the "precious but ignored"
class.  So an obvious alternate solution to your particular case is not to
add such a path to the gitignore mechanism.

I have a suspicion that the approach this patch takes would not help very
much in the real life.  You just traded the lack of "precious but ignored"
with "no file is trashable, even if it is listed in .gitignore".

Granted, as long as it is not default, it won't negatively affect people
who do not enable it, other than that it may add maintenance burden on the
resulting bloated code, but I find it hard to swallow new code that does
not convincingly solve the real problem.

By the way, we seem to have a few longstanding bugs that trashes an
unignored and untracked (hence by definition precious) path during branch
switching, and does not give a correct escape hatch.

Doing this:

    $ git checkout maint
    $ echo foo >t/t2018-checkout-branch.sh
    $ git checkout master

does correctly error out because the unignored file needs to be
overwritten.  But doing this after the above still errors out, which is
probably wrong:

    $ echo t/t2018-checkout-branch.sh >>.git/info/exclude
    $ git checkout master

After doing the above:

    $ ed .git/info/exclude ;# remove the extra entry in info/excludes
    $d
    w
    q
    $ rm t/t2018-checkout-branch.sh
    $ echo foo >po
    $ git checkout pu

should error out, as "po" is a directory that has tracked contents, and we
never said the untracked regular file "po" is trashable, but the above
sequence happily checks the branch out.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH] optionally disable overwriting of ignored files
  2010-08-18 23:39   ` [PATCH] optionally disable overwriting of ignored files Clemens Buchacher
  2010-08-19 10:41     ` Jakub Narebski
  2010-08-20 20:35     ` Junio C Hamano
@ 2010-08-20 20:46     ` Junio C Hamano
  2010-08-21  6:48       ` [PATCH v2] " Clemens Buchacher
  2010-08-23  8:33     ` [PATCH] " Matthieu Moy
  2010-08-23  9:37     ` Matthieu Moy
  4 siblings, 1 reply; 86+ messages in thread
From: Junio C Hamano @ 2010-08-20 20:46 UTC (permalink / raw)
  To: Clemens Buchacher; +Cc: Joshua Jensen, git@vger.kernel.org

Clemens Buchacher <drizzd@aon.at> writes:

> diff --git a/builtin/checkout.c b/builtin/checkout.c
> index 5affb6f..121a6a3 100644
> --- a/builtin/checkout.c
> +++ b/builtin/checkout.c
> @@ -389,6 +389,11 @@ static int merge_working_tree(struct checkout_opts *opts,
>  		topts.gently = opts->merge && old->commit;
>  		topts.verbose_update = !opts->quiet;
>  		topts.fn = twoway_merge;
> +		if (overwrite_ignored) {
> +			topts.dir = xcalloc(1, sizeof(*topts.dir));
> +			topts.dir->flags |= DIR_SHOW_IGNORED;
> +			topts.dir->exclude_per_dir = ".gitignore";
> +		}
>  		tree = parse_tree_indirect(old->commit ?
>  					   old->commit->object.sha1 :
>  					   (unsigned char *)EMPTY_TREE_SHA1_BIN);

What is this patch based on?  I don't see a branch that has these two
lines adjacent to each other as the preimage...

>  		topts.fn = twoway_merge;
>  		tree = parse_tree_indirect(old->commit ?

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH v2] optionally disable overwriting of ignored files
  2010-08-20 20:46     ` [PATCH] optionally disable overwriting of ignored files Junio C Hamano
@ 2010-08-21  6:48       ` Clemens Buchacher
  0 siblings, 0 replies; 86+ messages in thread
From: Clemens Buchacher @ 2010-08-21  6:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Joshua Jensen, git@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 4965 bytes --]


Signed-off-by: Clemens Buchacher <drizzd@aon.at>
---

On Fri, Aug 20, 2010 at 01:46:10PM -0700, Junio C Hamano wrote:
>         
> What is this patch based on?
          
Oops, I forgot to squash an intermediate commit. This is based on
v1.7.2.1.

 Documentation/config.txt |    6 ++++++
 builtin/checkout.c       |   10 +++++++---
 builtin/merge.c          |   10 +++++++---
 cache.h                  |    1 +
 config.c                 |    3 +++
 environment.c            |    1 +
 6 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index f81fb91..eb9bb43 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -450,6 +450,12 @@ core.excludesfile::
 	to the value of `$HOME` and "{tilde}user/" to the specified user's
 	home directory.  See linkgit:gitignore[5].
 
+core.overwriteignored::
+	Untracked files can get in the way of merge or checkout. If
+	overwriteignored is enabled, such files will be overwritten
+	to let fast-forward merges and checkouts succeed. Enabled
+	by default.
+
 core.editor::
 	Commands such as `commit` and `tag` that lets you edit
 	messages by launching an editor uses the value of this
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 1994be9..121a6a3 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -389,9 +389,11 @@ static int merge_working_tree(struct checkout_opts *opts,
 		topts.gently = opts->merge && old->commit;
 		topts.verbose_update = !opts->quiet;
 		topts.fn = twoway_merge;
-		topts.dir = xcalloc(1, sizeof(*topts.dir));
-		topts.dir->flags |= DIR_SHOW_IGNORED;
-		topts.dir->exclude_per_dir = ".gitignore";
+		if (overwrite_ignored) {
+			topts.dir = xcalloc(1, sizeof(*topts.dir));
+			topts.dir->flags |= DIR_SHOW_IGNORED;
+			topts.dir->exclude_per_dir = ".gitignore";
+		}
 		tree = parse_tree_indirect(old->commit ?
 					   old->commit->object.sha1 :
 					   (unsigned char *)EMPTY_TREE_SHA1_BIN);
@@ -667,6 +669,8 @@ int cmd_checkout(int argc, const char **argv, const char *prefix)
 		OPT_SET_INT('3', "theirs", &opts.writeout_stage, "stage",
 			    3),
 		OPT_BOOLEAN('f', "force", &opts.force, "force"),
+		OPT_BOOLEAN('i', "overwrite-ignored", &overwrite_ignored,
+		  "allow explicitly ignored files to be overwritten"),
 		OPT_BOOLEAN('m', "merge", &opts.merge, "merge"),
 		OPT_STRING(0, "conflict", &conflict_style, "style",
 			   "conflict style (merge or diff3)"),
diff --git a/builtin/merge.c b/builtin/merge.c
index 37ce4f5..47b12ba 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -185,6 +185,8 @@ static struct option builtin_merge_options[] = {
 		"allow fast-forward (default)"),
 	OPT_BOOLEAN(0, "ff-only", &fast_forward_only,
 		"abort if fast-forward is not possible"),
+	OPT_BOOLEAN('i', "overwrite-ignored", &overwrite_ignored,
+	  "allow explicitly ignored files to be overwritten"),
 	OPT_RERERE_AUTOUPDATE(&allow_rerere_auto),
 	OPT_CALLBACK('s', "strategy", &use_strategies, "strategy",
 		"merge strategy to use", option_parse_strategy),
@@ -693,9 +695,11 @@ int checkout_fast_forward(const unsigned char *head, const unsigned char *remote
 	memset(&opts, 0, sizeof(opts));
 	memset(&t, 0, sizeof(t));
 	memset(&dir, 0, sizeof(dir));
-	dir.flags |= DIR_SHOW_IGNORED;
-	dir.exclude_per_dir = ".gitignore";
-	opts.dir = &dir;
+	if (overwrite_ignored) {
+		dir.flags |= DIR_SHOW_IGNORED;
+		dir.exclude_per_dir = ".gitignore";
+		opts.dir = &dir;
+	}
 
 	opts.head_idx = 1;
 	opts.src_index = &the_index;
diff --git a/cache.h b/cache.h
index c9fa3df..dd1b8f7 100644
--- a/cache.h
+++ b/cache.h
@@ -548,6 +548,7 @@ extern size_t packed_git_window_size;
 extern size_t packed_git_limit;
 extern size_t delta_base_cache_limit;
 extern int read_replace_refs;
+extern int overwrite_ignored;
 extern int fsync_object_files;
 extern int core_preload_index;
 extern int core_apply_sparse_checkout;
diff --git a/config.c b/config.c
index cdcf583..bd7956e 100644
--- a/config.c
+++ b/config.c
@@ -563,6 +563,9 @@ static int git_default_core_config(const char *var, const char *value)
 	if (!strcmp(var, "core.excludesfile"))
 		return git_config_pathname(&excludes_file, var, value);
 
+	if (!strcmp(var, "core.overwriteignored"))
+		overwrite_ignored = git_config_bool(var, value);
+
 	if (!strcmp(var, "core.whitespace")) {
 		if (!value)
 			return config_error_nonbool(var);
diff --git a/environment.c b/environment.c
index 83d38d3..1b92f60 100644
--- a/environment.c
+++ b/environment.c
@@ -30,6 +30,7 @@ const char *apply_default_ignorewhitespace;
 int zlib_compression_level = Z_BEST_SPEED;
 int core_compression_level;
 int core_compression_seen;
+int overwrite_ignored = 1;
 int fsync_object_files;
 size_t packed_git_window_size = DEFAULT_PACKED_GIT_WINDOW_SIZE;
 size_t packed_git_limit = DEFAULT_PACKED_GIT_LIMIT;
-- 
1.7.2.1.1.g202c


[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply related	[flat|nested] 86+ messages in thread

* Re: [PATCH] optionally disable overwriting of ignored files
  2010-08-20 20:35     ` Junio C Hamano
@ 2010-08-21  8:05       ` Clemens Buchacher
  2010-08-22  7:25         ` Junio C Hamano
  2010-10-09 22:39         ` Kevin Ballard
  2010-08-21 13:23       ` Clemens Buchacher
                         ` (6 subsequent siblings)
  7 siblings, 2 replies; 86+ messages in thread
From: Clemens Buchacher @ 2010-08-21  8:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Joshua Jensen, git@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 3399 bytes --]

On Fri, Aug 20, 2010 at 01:35:32PM -0700, Junio C Hamano wrote:
>
> If (and this may be a big IF) it is reasonable to add paths to .gitignore
> that you do not want to lose, then you would want to have three classes of
> untracked paths: "precious but ignored", "trashable" (and by definition
> ignored), and "unignored" (and by definition is not ignored and is
> precious).

I am not sure anybody would bother categorizing their files into
several classes. On the other hand, .gitignore and
.git/info/exclude may already serve such a purpose.

Files that go into a tracked .gitignore are most likely generated
files, and therefore trashable. Files that go into
.git/info/exclude or into an untracked .gitignore (e.g. echo '*' >
precious-simulation-results/.gitignore), are not always generated
and may not be trashable. At least they would not likely get in the
way of checkout or merge.

(As you noted below, except for the untracked .gitignore part, this
is how git behaves already.)

> As I already pointed out, we don't support the "precious but ignored"
> class.  So an obvious alternate solution to your particular case is not to
> add such a path to the gitignore mechanism.

That would defeat the point of ignoring it.

> I have a suspicion that the approach this patch takes would not help very
> much in the real life.  You just traded the lack of "precious but ignored"
> with "no file is trashable, even if it is listed in .gitignore".
> 
> Granted, as long as it is not default, it won't negatively affect people
> who do not enable it, other than that it may add maintenance burden on the
> resulting bloated code, but I find it hard to swallow new code that does
> not convincingly solve the real problem.

Yes, I am not a big fan of this solution either. But if we do not
find a better solution, I think having it configurable cannot hurt.
The code required is minimal. As far as I am concerned, we can even
remove the -i part.

> By the way, we seem to have a few longstanding bugs that trashes an
> unignored and untracked (hence by definition precious) path during branch
> switching, and does not give a correct escape hatch.
> 
> Doing this:
> 
>     $ git checkout maint
>     $ echo foo >t/t2018-checkout-branch.sh
>     $ git checkout master
> 
> does correctly error out because the unignored file needs to be
> overwritten.  But doing this after the above still errors out, which is
> probably wrong:
> 
>     $ echo t/t2018-checkout-branch.sh >>.git/info/exclude
>     $ git checkout master

I understand your point. But this is actually a great example. I
have a bunch of such tests, which are not in shape for upstream,
but I want to keep them around anyways (and run them). Do you
really think that an untracked test which was added to
.git/info/exclude should be considered trashable? If it were a
generated file, it would have been added to .gitignore.

> After doing the above:
> 
>     $ ed .git/info/exclude ;# remove the extra entry in info/excludes
>     $d
>     w
>     q
>     $ rm t/t2018-checkout-branch.sh
>     $ echo foo >po
>     $ git checkout pu
> 
> should error out, as "po" is a directory that has tracked contents, and we
> never said the untracked regular file "po" is trashable, but the above
> sequence happily checks the branch out.

Ok, that's bad.

Clemens

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH] optionally disable overwriting of ignored files
  2010-08-20 20:35     ` Junio C Hamano
  2010-08-21  8:05       ` Clemens Buchacher
@ 2010-08-21 13:23       ` Clemens Buchacher
  2010-10-09 13:52       ` [PATCH 0/5] do not overwrite untracked files in leading path Clemens Buchacher
                         ` (5 subsequent siblings)
  7 siblings, 0 replies; 86+ messages in thread
From: Clemens Buchacher @ 2010-08-21 13:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Joshua Jensen, git@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 4410 bytes --]

On Fri, Aug 20, 2010 at 01:35:32PM -0700, Junio C Hamano wrote:
>
>     $ echo foo >po
>     $ git checkout pu
> 
> should error out, as "po" is a directory that has tracked contents, and we
> never said the untracked regular file "po" is trashable, but the above
> sequence happily checks the branch out.

Looks like this case is simply overlooked in verify_absent_1(). The
following takes the existing lstat_cache() code and deals with the
FL_ERR case, which is when there is a file in the way of the
leading path.

The patch below fixes the issue and passes the test suite, but it's
lacking in various ways and I am probably breaking something in
lstat_cache(), which I do not completely understand yet.

So it's going to take some more work to fix this properly.

Clemens

 cache.h        |    1 +
 symlinks.c     |   20 ++++++++++++++++++--
 unpack-trees.c |   25 ++++++++++++++++++++++++-
 3 files changed, 43 insertions(+), 3 deletions(-)

diff --git a/cache.h b/cache.h
index dd1b8f7..5a8a99d 100644
--- a/cache.h
+++ b/cache.h
@@ -850,6 +850,7 @@ struct cache_def {
 extern int has_symlink_leading_path(const char *name, int len);
 extern int threaded_has_symlink_leading_path(struct cache_def *, const char *, int);
 extern int has_symlink_or_noent_leading_path(const char *name, int len);
+extern int find_leading_path(const char *name, int len, const char **path, int *path_len);
 extern int has_dirs_only_path(const char *name, int len, int prefix_len);
 extern void schedule_dir_for_removal(const char *name, int len);
 extern void remove_scheduled_dirs(void);
diff --git a/symlinks.c b/symlinks.c
index 8860120..3f78168 100644
--- a/symlinks.c
+++ b/symlinks.c
@@ -152,7 +152,7 @@ static int lstat_cache(struct cache_def *cache, const char *name, int len,
 	 * path types, FL_NOENT, FL_SYMLINK and FL_DIR, can be cached
 	 * for the moment!
 	 */
-	save_flags = ret_flags & track_flags & (FL_NOENT|FL_SYMLINK);
+	save_flags = ret_flags & track_flags & (FL_NOENT|FL_SYMLINK|FL_ERR);
 	if (save_flags && last_slash > 0 && last_slash <= PATH_MAX) {
 		cache->path[last_slash] = '\0';
 		cache->len = last_slash;
@@ -199,7 +199,7 @@ int has_symlink_leading_path(const char *name, int len)
 
 /*
  * Return non-zero if path 'name' has a leading symlink component or
- * if some leading path component does not exists.
+ * if some leading path component does not exist.
  */
 int has_symlink_or_noent_leading_path(const char *name, int len)
 {
@@ -210,6 +210,22 @@ int has_symlink_or_noent_leading_path(const char *name, int len)
 }
 
 /*
+ * Stat for leading path.
+ */
+int find_leading_path(const char *name, int len, const char **path, int *path_len)
+{
+	struct cache_def *cache = &default_cache;	/* FIXME */
+	int flags = lstat_cache(cache, name, len,
+			   FL_SYMLINK|FL_NOENT|FL_DIR|FL_ERR, USE_ONLY_LSTAT);
+	*path = cache->path;
+	*path_len = cache->len;
+	if (flags & FL_ERR)
+		return -1;
+	else
+		return flags & (FL_SYMLINK|FL_NOENT);
+}
+
+/*
  * Return non-zero if all path components of 'name' exists as a
  * directory.  If prefix_len > 0, we will test with the stat()
  * function instead of the lstat() function for a prefix length of
diff --git a/unpack-trees.c b/unpack-trees.c
index 8cf0da3..250ed7c 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1016,12 +1016,35 @@ static int verify_absent_1(struct cache_entry *ce, const char *action,
 				 const char *error_msg)
 {
 	struct stat st;
+	const char *path;
+	int path_len;
+	int ret;
 
 	if (o->index_only || o->reset || !o->update)
 		return 0;
 
-	if (has_symlink_or_noent_leading_path(ce->name, ce_namelen(ce)))
+	ret = find_leading_path(ce->name, ce_namelen(ce), &path, &path_len);
+	if (ret > 0)
 		return 0;
+	else if (ret < 0) {
+		struct cache_entry *result;
+
+		/* FIXME: respect ignores etc. as below */
+
+		/*
+		 * The previous round may already have decided to
+		 * delete this path, which is in a subdirectory that
+		 * is being replaced with a blob.
+		 */
+		result = index_name_exists(&o->result, path, path_len, 0);
+		if (result) {
+			if (result->ce_flags & CE_REMOVE)
+				return 0;
+		}
+
+		return o->gently ? -1 :
+			error(ERRORMSG(o, would_lose_untracked), path, action);
+	}
 
 	if (!lstat(ce->name, &st)) {
 		int dtype = ce_to_dtype(ce);
-- 
1.7.2.1.1.g202c


[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply related	[flat|nested] 86+ messages in thread

* Re: [PATCH] optionally disable overwriting of ignored files
  2010-08-21  8:05       ` Clemens Buchacher
@ 2010-08-22  7:25         ` Junio C Hamano
  2010-08-22  8:20           ` Clemens Buchacher
  2010-10-09 22:39         ` Kevin Ballard
  1 sibling, 1 reply; 86+ messages in thread
From: Junio C Hamano @ 2010-08-22  7:25 UTC (permalink / raw)
  To: Clemens Buchacher; +Cc: Joshua Jensen, git@vger.kernel.org

Clemens Buchacher <drizzd@aon.at> writes:

> I understand your point. But this is actually a great example. I
> have a bunch of such tests, which are not in shape for upstream,
> but I want to keep them around anyways (and run them). Do you
> really think that an untracked test which was added to
> .git/info/exclude should be considered trashable? If it were a
> generated file, it would have been added to .gitignore.

I agree that in such a workflow to keep untracked tests around, they
should not be considered trashable. But more importantly, as I have
already said, adding your untracked tests to exclude is a wrong thing to
do.

Traditionally (think "git status" output without "-s") the way to remind
oneself that some day these paths need to be added when they are ready has
been to keep them untracked but _not_ ignored, so that they will be listed
in the output.

Quite contrary to what you earlier said in another message, adding
such a path that is not trashable does defeat the point of the "ignore"
mechanism.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH] optionally disable overwriting of ignored files
  2010-08-22  7:25         ` Junio C Hamano
@ 2010-08-22  8:20           ` Clemens Buchacher
  0 siblings, 0 replies; 86+ messages in thread
From: Clemens Buchacher @ 2010-08-22  8:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Joshua Jensen, git@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 1351 bytes --]

On Sun, Aug 22, 2010 at 12:25:31AM -0700, Junio C Hamano wrote:
> 
> Traditionally (think "git status" output without "-s") the way to remind
> oneself that some day these paths need to be added when they are ready has
> been to keep them untracked but _not_ ignored, so that they will be listed
> in the output.

But they do not have to be added. I may never want to add them. So
I do not want git status to keep reminding me. That's why I add
them to .git/info/exclude. And with the current behavior, I do not
see anything wrong with that, because git does not consider them
trashable.

> Quite contrary to what you earlier said in another message, adding
> such a path that is not trashable does defeat the point of the "ignore"
> mechanism.

If I understand correctly, the reason for considering ignored paths
trashable is because they are likely to have been tracked in other
versions of the project, and would therefore frequently get in the
way of commands like checkout or merge.

But the same is not true for private ignores in .git/info/exclude.
Such paths most likely never were tracked and never will be. And in
the rare case that an ignored path turns out to be a nuisance, it's
easy enough to remove it from .git/info/exclude.

So I see only advantages if such a path is not considered
trashable.

Clemens

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH] optionally disable overwriting of ignored files
  2010-08-18 23:39   ` [PATCH] optionally disable overwriting of ignored files Clemens Buchacher
                       ` (2 preceding siblings ...)
  2010-08-20 20:46     ` [PATCH] optionally disable overwriting of ignored files Junio C Hamano
@ 2010-08-23  8:33     ` Matthieu Moy
  2010-08-31 18:44       ` Heiko Voigt
  2010-08-23  9:37     ` Matthieu Moy
  4 siblings, 1 reply; 86+ messages in thread
From: Matthieu Moy @ 2010-08-23  8:33 UTC (permalink / raw)
  To: Clemens Buchacher; +Cc: Junio C Hamano, Joshua Jensen, git@vger.kernel.org

Clemens Buchacher <drizzd@aon.at> writes:

> By default, checkout and fast-forward merge will overwrite ignored
> files. Make this behavior configurable.

I'd use this option if it gets into git.git.

I didn't follow the discussions when the feature was added, and I was
basically not aware that Git could trash my ignored files this way.
I've always thought that Git took great care not to touch untracked
files, and I found this good ...

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH] optionally disable overwriting of ignored files
  2010-08-18 23:39   ` [PATCH] optionally disable overwriting of ignored files Clemens Buchacher
                       ` (3 preceding siblings ...)
  2010-08-23  8:33     ` [PATCH] " Matthieu Moy
@ 2010-08-23  9:37     ` Matthieu Moy
  2010-08-23 13:56       ` Holger Hellmuth
  2018-10-16  9:10       ` Ignored files being silently overwritten when switching branches Ævar Arnfjörð Bjarmason
  4 siblings, 2 replies; 86+ messages in thread
From: Matthieu Moy @ 2010-08-23  9:37 UTC (permalink / raw)
  To: Clemens Buchacher; +Cc: Junio C Hamano, Joshua Jensen, git@vger.kernel.org

Clemens Buchacher <drizzd@aon.at> writes:

> By default, checkout and fast-forward merge will overwrite ignored
> files. Make this behavior configurable.

Just mentionning an alternative, or complementary approach:

Instead of overwritting completely a .gitignore-d file, Git could
rename it, and warn the user kindly. For example:

$ git merge
...
Warning: existing ignored file 'foo' renamed to 'foo~'

(in case foo~ already exists, it's possible to use numbered backups
just like "mv --backup=existing" does for example)


That could complement your patch if core.overwriteignored is a
multiple-choice option instead of a Boolean:

- "overwrite" => current behavior
- "refuse" => your proposal
- "rename" => my proposal

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH] optionally disable overwriting of ignored files
  2010-08-23  9:37     ` Matthieu Moy
@ 2010-08-23 13:56       ` Holger Hellmuth
  2010-08-23 15:11         ` Clemens Buchacher
  2018-10-16  9:10       ` Ignored files being silently overwritten when switching branches Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 86+ messages in thread
From: Holger Hellmuth @ 2010-08-23 13:56 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Clemens Buchacher, Junio C Hamano, Joshua Jensen,
	git@vger.kernel.org

Am 23.08.2010 11:37, schrieb Matthieu Moy:
> Clemens Buchacher <drizzd@aon.at> writes:
> 
>> By default, checkout and fast-forward merge will overwrite ignored
>> files. Make this behavior configurable.
> 
> Just mentionning an alternative, or complementary approach:
> 
> Instead of overwritting completely a .gitignore-d file, Git could
> rename it, and warn the user kindly. For example:


You've got my vote. This is the only option that combines safety with
minimal configuration hassle.

I didn't know about this subtle difference between .gitignore and
.git/info/exclude. And while this makes sense I expect a sizable
percentage of users will never even know that .git/info/exclude exists.

And in practice I guess a lot of files will get added to .gitignore even
though they only are relevant to one user (out of carelessness or
ignorance). I know I have added patterns without making a conscious
decision about their relevance to others.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH] optionally disable overwriting of ignored files
  2010-08-23 13:56       ` Holger Hellmuth
@ 2010-08-23 15:11         ` Clemens Buchacher
  2010-08-23 15:57           ` Junio C Hamano
  0 siblings, 1 reply; 86+ messages in thread
From: Clemens Buchacher @ 2010-08-23 15:11 UTC (permalink / raw)
  To: Holger Hellmuth
  Cc: Matthieu Moy, Junio C Hamano, Joshua Jensen, git@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 1881 bytes --]

On Mon, Aug 23, 2010 at 03:56:39PM +0200, Holger Hellmuth wrote:
> 
> You've got my vote. This is the only option that combines safety with
> minimal configuration hassle.
> 
> I didn't know about this subtle difference between .gitignore and
> .git/info/exclude. And while this makes sense I expect a sizable
> percentage of users will never even know that .git/info/exclude exists.

I don't know if this subtle difference was even intentional. But it
makes sense, and we simply need to make it explicit by documenting
it and by making it optional.

And in many cases, that behavior makes sense. Imagine a generated
file is accidentally added to git, later removed from version
control and added to .gitignore instead. A common scenario, I am
sure.

Now if you start merging, rebasing or bisecting with such a
history, you _will_ run into this problem all the time.

Renaming is certainly a possibility, but it does not really solve
the problem. We will end up with a bunch of renamed, possibly
precious files in our work tree, which means we have to clean up
manually anyways.

So I think it is better to make the decision whether or not a file
is precious at the time it is added to .gitignore. In other words,
never add precious files to .gitignore, but add them to
.git/info/exclude instead.

> And in practice I guess a lot of files will get added to
> .gitignore even though they only are relevant to one user (out of
> carelessness or ignorance). I know I have added patterns without
> making a conscious decision about their relevance to others.

Then you have been using it wrong, just like I have.
Ignorant/careless users may still make that mistake, but if git
does not overwrite such files by default, at least they made a
conscious decision at some point to take that risk.

That is if we can make it the default.

Clemens

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH] optionally disable overwriting of ignored files
  2010-08-23 15:11         ` Clemens Buchacher
@ 2010-08-23 15:57           ` Junio C Hamano
  2010-08-24  7:28             ` Clemens Buchacher
  0 siblings, 1 reply; 86+ messages in thread
From: Junio C Hamano @ 2010-08-23 15:57 UTC (permalink / raw)
  To: Clemens Buchacher
  Cc: Holger Hellmuth, Matthieu Moy, Joshua Jensen, git@vger.kernel.org

Clemens Buchacher <drizzd@aon.at> writes:

> On Mon, Aug 23, 2010 at 03:56:39PM +0200, Holger Hellmuth wrote:
>> 
>> You've got my vote. This is the only option that combines safety with
>> minimal configuration hassle.
>> 
>> I didn't know about this subtle difference between .gitignore and
>> .git/info/exclude. And while this makes sense I expect a sizable
>> percentage of users will never even know that .git/info/exclude exists.
>
> I don't know if this subtle difference was even intentional.

I am sorry, what difference are you talking about?

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH] optionally disable overwriting of ignored files
  2010-08-23 15:57           ` Junio C Hamano
@ 2010-08-24  7:28             ` Clemens Buchacher
  2010-08-24 16:19               ` Junio C Hamano
  0 siblings, 1 reply; 86+ messages in thread
From: Clemens Buchacher @ 2010-08-24  7:28 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Holger Hellmuth, Matthieu Moy, Joshua Jensen, git@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 767 bytes --]

On Mon, Aug 23, 2010 at 08:57:38AM -0700, Junio C Hamano wrote:
> Clemens Buchacher <drizzd@aon.at> writes:
> 
> > On Mon, Aug 23, 2010 at 03:56:39PM +0200, Holger Hellmuth wrote:
> >> 
> >> You've got my vote. This is the only option that combines safety with
> >> minimal configuration hassle.
> >> 
> >> I didn't know about this subtle difference between .gitignore and
> >> .git/info/exclude. And while this makes sense I expect a sizable
> >> percentage of users will never even know that .git/info/exclude exists.
> >
> > I don't know if this subtle difference was even intentional.
> 
> I am sorry, what difference are you talking about?

The fact that files in .gitignore are considered trashable, and the
ones in .git/info/exclude are not.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH] optionally disable overwriting of ignored files
  2010-08-24  7:28             ` Clemens Buchacher
@ 2010-08-24 16:19               ` Junio C Hamano
  0 siblings, 0 replies; 86+ messages in thread
From: Junio C Hamano @ 2010-08-24 16:19 UTC (permalink / raw)
  To: Clemens Buchacher
  Cc: Holger Hellmuth, Matthieu Moy, Joshua Jensen, git@vger.kernel.org

Clemens Buchacher <drizzd@aon.at> writes:

> The fact that files in .gitignore are considered trashable, and the
> ones in .git/info/exclude are not.

Is that a fact?

There may be a subtle bug in the ignore handling code.  The entries in
these various places are never meant to produce different behaviours.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: Re: [PATCH] optionally disable overwriting of ignored files
  2010-08-23  8:33     ` [PATCH] " Matthieu Moy
@ 2010-08-31 18:44       ` Heiko Voigt
  0 siblings, 0 replies; 86+ messages in thread
From: Heiko Voigt @ 2010-08-31 18:44 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Clemens Buchacher, Junio C Hamano, Joshua Jensen,
	git@vger.kernel.org

On Mon, Aug 23, 2010 at 10:33:21AM +0200, Matthieu Moy wrote:
> Clemens Buchacher <drizzd@aon.at> writes:
> 
> > By default, checkout and fast-forward merge will overwrite ignored
> > files. Make this behavior configurable.
> 
> I'd use this option if it gets into git.git.
> 
> I didn't follow the discussions when the feature was added, and I was
> basically not aware that Git could trash my ignored files this way.
> I've always thought that Git took great care not to touch untracked
> files, and I found this good ...

I agree here. I would like to have this as an option as well. And I
think adding a "ignored but precious" category does not make sense in
practise. My guess is that a user would only know _after_ git trashed
some precious file that it belongs into that category;) I have told my
users that git is very careful with their files and I would like to keep
this promise.

I was actually never annoyed by a failed checkout because of an
untracked file in the way. (A 'git clean -fx' always fixes this).

Thanks.

Heiko

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH 0/5] do not overwrite untracked files in leading path
  2010-08-20 20:35     ` Junio C Hamano
  2010-08-21  8:05       ` Clemens Buchacher
  2010-08-21 13:23       ` Clemens Buchacher
@ 2010-10-09 13:52       ` Clemens Buchacher
  2010-10-09 13:52       ` [PATCH 1/5] t7607: use test_commit and test_must_fail Clemens Buchacher
                         ` (4 subsequent siblings)
  7 siblings, 0 replies; 86+ messages in thread
From: Clemens Buchacher @ 2010-10-09 13:52 UTC (permalink / raw)
  To: git; +Cc: gitster

Hi,

On Fri, Aug 20, 2010 at 01:35:32PM -0700, Junio C Hamano wrote:
>
>     $ echo foo >po
>     $ git checkout pu
> 
> should error out, as "po" is a directory that has tracked
> contents, and we never said the untracked regular file "po" is
> trashable, but the above sequence happily checks the branch out.

The following patch series is aimed at fixing this bug.

Clemens

[PATCH 1/5] t7607: use test_commit and test_must_fail
[PATCH 2/5] t7607: add leading-path tests
[PATCH 3/5] add function check_ok_to_remove()
[PATCH 4/5] lstat_cache: optionally return match_len
[PATCH 5/5] do not overwrite files in leading path

 cache.h                    |    2 +-
 symlinks.c                 |   64 +++++++++++++++-------
 t/t7607-merge-overwrite.sh |  129 +++++++++++++++++++++++++++++---------------
 unpack-trees.c             |  121 ++++++++++++++++++++++++-----------------
 4 files changed, 200 insertions(+), 116 deletions(-)

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH 1/5] t7607: use test_commit and test_must_fail
  2010-08-20 20:35     ` Junio C Hamano
                         ` (2 preceding siblings ...)
  2010-10-09 13:52       ` [PATCH 0/5] do not overwrite untracked files in leading path Clemens Buchacher
@ 2010-10-09 13:52       ` Clemens Buchacher
  2010-10-10  6:35         ` Jonathan Nieder
  2010-10-09 13:52       ` [PATCH 2/5] t7607: add leading-path tests Clemens Buchacher
                         ` (3 subsequent siblings)
  7 siblings, 1 reply; 86+ messages in thread
From: Clemens Buchacher @ 2010-10-09 13:52 UTC (permalink / raw)
  To: git; +Cc: gitster, Clemens Buchacher

Also make sure that aborted merges do not leave
MERGE_HEAD except in the "will not overwrite
removed file" test, where we cannot do so. See
also the discussion in the following thread.

http://mid.gmane.org/7vskopwxej.fsf@gitster.siamese.dyndns.org
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
---
 t/t7607-merge-overwrite.sh |   78 +++++++++++++++++++------------------------
 1 files changed, 35 insertions(+), 43 deletions(-)

diff --git a/t/t7607-merge-overwrite.sh b/t/t7607-merge-overwrite.sh
index d82349a..6ed40b1 100755
--- a/t/t7607-merge-overwrite.sh
+++ b/t/t7607-merge-overwrite.sh
@@ -7,81 +7,73 @@ Do not overwrite changes.'
 . ./test-lib.sh
 
 test_expect_success 'setup' '
-	echo c0 > c0.c &&
-	git add c0.c &&
-	git commit -m c0 &&
-	git tag c0 &&
-	echo c1 > c1.c &&
-	git add c1.c &&
-	git commit -m c1 &&
-	git tag c1 &&
+	test_commit c0 &&
+	test_commit c1 &&
+	test_commit "c1a" "c1.t" "c1 a" &&
 	git reset --hard c0 &&
-	echo c2 > c2.c &&
-	git add c2.c &&
-	git commit -m c2 &&
-	git tag c2 &&
-	git reset --hard c1 &&
-	echo "c1 a" > c1.c &&
-	git add c1.c &&
-	git commit -m "c1 a" &&
-	git tag c1a &&
+	test_commit c2 &&
 	echo "VERY IMPORTANT CHANGES" > important
 '
 
 test_expect_success 'will not overwrite untracked file' '
 	git reset --hard c1 &&
-	cat important > c2.c &&
+	cp important c2.t &&
 	test_must_fail git merge c2 &&
-	test_cmp important c2.c
+	! test -f .git/MERGE_HEAD &&
+	test_cmp important c2.t
 '
 
 test_expect_success 'will not overwrite new file' '
 	git reset --hard c1 &&
-	cat important > c2.c &&
-	git add c2.c &&
+	cp important c2.t &&
+	git add c2.t &&
 	test_must_fail git merge c2 &&
-	test_cmp important c2.c
+	! test -f .git/MERGE_HEAD &&
+	test_cmp important c2.t
 '
 
 test_expect_success 'will not overwrite staged changes' '
 	git reset --hard c1 &&
-	cat important > c2.c &&
-	git add c2.c &&
-	rm c2.c &&
+	cp important c2.t &&
+	git add c2.t &&
+	rm c2.t &&
 	test_must_fail git merge c2 &&
-	git checkout c2.c &&
-	test_cmp important c2.c
+	! test -f .git/MERGE_HEAD &&
+	git checkout c2.t &&
+	test_cmp important c2.t
 '
 
 test_expect_success 'will not overwrite removed file' '
 	git reset --hard c1 &&
-	git rm c1.c &&
-	git commit -m "rm c1.c" &&
-	cat important > c1.c &&
+	git rm c1.t &&
+	git commit -m "rm c1.t" &&
+	cp important c1.t &&
 	test_must_fail git merge c1a &&
-	test_cmp important c1.c
+	test_cmp important c1.t
 '
 
 test_expect_success 'will not overwrite re-added file' '
 	git reset --hard c1 &&
-	git rm c1.c &&
-	git commit -m "rm c1.c" &&
-	cat important > c1.c &&
-	git add c1.c &&
+	git rm c1.t &&
+	git commit -m "rm c1.t" &&
+	cp important c1.t &&
+	git add c1.t &&
 	test_must_fail git merge c1a &&
-	test_cmp important c1.c
+	! test -f .git/MERGE_HEAD &&
+	test_cmp important c1.t
 '
 
 test_expect_success 'will not overwrite removed file with staged changes' '
 	git reset --hard c1 &&
-	git rm c1.c &&
-	git commit -m "rm c1.c" &&
-	cat important > c1.c &&
-	git add c1.c &&
-	rm c1.c &&
+	git rm c1.t &&
+	git commit -m "rm c1.t" &&
+	cp important c1.t &&
+	git add c1.t &&
+	rm c1.t &&
 	test_must_fail git merge c1a &&
-	git checkout c1.c &&
-	test_cmp important c1.c
+	! test -f .git/MERGE_HEAD &&
+	git checkout c1.t &&
+	test_cmp important c1.t
 '
 
 test_done
-- 
1.7.1.571.gba4d01

^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [PATCH 2/5] t7607: add leading-path tests
  2010-08-20 20:35     ` Junio C Hamano
                         ` (3 preceding siblings ...)
  2010-10-09 13:52       ` [PATCH 1/5] t7607: use test_commit and test_must_fail Clemens Buchacher
@ 2010-10-09 13:52       ` Clemens Buchacher
  2010-10-09 19:14         ` Johannes Sixt
  2010-10-09 13:52       ` [PATCH 3/5] add function check_ok_to_remove() Clemens Buchacher
                         ` (2 subsequent siblings)
  7 siblings, 1 reply; 86+ messages in thread
From: Clemens Buchacher @ 2010-10-09 13:52 UTC (permalink / raw)
  To: git; +Cc: gitster, Clemens Buchacher


Signed-off-by: Clemens Buchacher <drizzd@aon.at>
---

I added a test for symlinks as well, even though from the code it
looks like overwriting untracked symlinks is intentionally allowed.
One could argue that symlinks are valuable just like regular files.
On the other hand, the potential loss of information is relatively
small.

For consistency, I would prefer to treat symlinks just like regular
files, but I do not feel strongly about it.

Clemens

 t/t7607-merge-overwrite.sh |   51 ++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 51 insertions(+), 0 deletions(-)

diff --git a/t/t7607-merge-overwrite.sh b/t/t7607-merge-overwrite.sh
index 6ed40b1..01b070b 100755
--- a/t/t7607-merge-overwrite.sh
+++ b/t/t7607-merge-overwrite.sh
@@ -12,6 +12,12 @@ test_expect_success 'setup' '
 	test_commit "c1a" "c1.t" "c1 a" &&
 	git reset --hard c0 &&
 	test_commit c2 &&
+	git reset --hard c0 &&
+	mkdir sub &&
+	echo "sub/f" > sub/f &&
+	git add sub/f &&
+	git commit -m sub &&
+	git tag sub &&
 	echo "VERY IMPORTANT CHANGES" > important
 '
 
@@ -23,6 +29,14 @@ test_expect_success 'will not overwrite untracked file' '
 	test_cmp important c2.t
 '
 
+test_expect_success 'will overwrite tracked file' '
+	git reset --hard c1 &&
+	cp important c2.t &&
+	git add c2.t &&
+	git commit -m important &&
+	git checkout c2
+'
+
 test_expect_success 'will not overwrite new file' '
 	git reset --hard c1 &&
 	cp important c2.t &&
@@ -76,4 +90,41 @@ test_expect_success 'will not overwrite removed file with staged changes' '
 	test_cmp important c1.t
 '
 
+test_expect_success 'will not overwrite untracked subtree' '
+	git reset --hard c0 &&
+	rm -rf sub &&
+	mkdir -p sub/f &&
+	cp important sub/f/important &&
+	test_must_fail git merge sub &&
+	! test -f .git/MERGE_HEAD &&
+	test_cmp important sub/f/important
+'
+
+test_expect_failure 'will not overwrite untracked file in leading path' '
+	git reset --hard c0 &&
+	rm -rf sub &&
+	cp important sub &&
+	test_must_fail git merge sub &&
+	! test -f .git/MERGE_HEAD &&
+	test_cmp important sub
+'
+
+test_expect_failure 'will not overwrite untracked symlink in leading path' '
+	git reset --hard c0 &&
+	rm -rf sub &&
+	mkdir sub2 &&
+	ln -s sub2 sub &&
+	test_must_fail git merge sub &&
+	! test -f .git/MERGE_HEAD
+'
+
+test_expect_success 'will not be confused by symlink in leading path' '
+	git reset --hard c0 &&
+	rm -rf sub &&
+	ln -s sub2 sub &&
+	git add sub &&
+	git commit -m ln &&
+	git checkout sub
+'
+
 test_done
-- 
1.7.1.571.gba4d01

^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [PATCH 3/5] add function check_ok_to_remove()
  2010-08-20 20:35     ` Junio C Hamano
                         ` (4 preceding siblings ...)
  2010-10-09 13:52       ` [PATCH 2/5] t7607: add leading-path tests Clemens Buchacher
@ 2010-10-09 13:52       ` Clemens Buchacher
  2010-10-13 21:43         ` Junio C Hamano
  2010-10-09 13:52       ` [PATCH 4/5] lstat_cache: optionally return match_len Clemens Buchacher
  2010-10-09 13:53       ` [PATCH 5/5] do not overwrite files in leading path Clemens Buchacher
  7 siblings, 1 reply; 86+ messages in thread
From: Clemens Buchacher @ 2010-10-09 13:52 UTC (permalink / raw)
  To: git; +Cc: gitster, Clemens Buchacher

This wraps some inline code into the function check_ok_to_remove(),
which will later be used for leading path components as well.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
---
 unpack-trees.c |  107 ++++++++++++++++++++++++++++++-------------------------
 1 files changed, 58 insertions(+), 49 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 803445a..df1c920 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1127,14 +1127,65 @@ static int verify_clean_subdirectory(struct cache_entry *ce,
  * See if we can find a case-insensitive match in the index that also
  * matches the stat information, and assume it's that other file!
  */
-static int icase_exists(struct unpack_trees_options *o, struct cache_entry *dst, struct stat *st)
+static int icase_exists(struct unpack_trees_options *o, const char *name, int len, struct stat *st)
 {
 	struct cache_entry *src;
 
-	src = index_name_exists(o->src_index, dst->name, ce_namelen(dst), 1);
+	src = index_name_exists(o->src_index, name, len, 1);
 	return src && !ie_match_stat(o->src_index, src, st, CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE);
 }
 
+static int check_ok_to_remove(const char *name, int len, int dtype,
+			      struct cache_entry *ce, struct stat *st,
+			      enum unpack_trees_error_types error_type,
+			      struct unpack_trees_options *o)
+{
+	struct cache_entry *result;
+
+	/*
+	 * It may be that the 'lstat()' succeeded even though
+	 * target 'ce' was absent, because there is an old
+	 * entry that is different only in case..
+	 *
+	 * Ignore that lstat() if it matches.
+	 */
+	if (ignore_case && icase_exists(o, name, len, st))
+		return 0;
+
+	if (o->dir && excluded(o->dir, name, &dtype))
+		/*
+		 * ce->name is explicitly excluded, so it is Ok to
+		 * overwrite it.
+		 */
+		return 0;
+	if (S_ISDIR(st->st_mode)) {
+		/*
+		 * We are checking out path "foo" and
+		 * found "foo/." in the working tree.
+		 * This is tricky -- if we have modified
+		 * files that are in "foo/" we would lose
+		 * them.
+		 */
+		if (verify_clean_subdirectory(ce, error_type, o) < 0)
+			return -1;
+		return 0;
+	}
+
+	/*
+	 * The previous round may already have decided to
+	 * delete this path, which is in a subdirectory that
+	 * is being replaced with a blob.
+	 */
+	result = index_name_exists(&o->result, name, len, 0);
+	if (result) {
+		if (result->ce_flags & CE_REMOVE)
+			return 0;
+	}
+
+	return o->gently ? -1 :
+		add_rejected_path(o, error_type, name);
+}
+
 /*
  * We do not want to remove or overwrite a working tree file that
  * is not tracked, unless it is ignored.
@@ -1151,55 +1202,13 @@ static int verify_absent_1(struct cache_entry *ce,
 	if (has_symlink_or_noent_leading_path(ce->name, ce_namelen(ce)))
 		return 0;
 
-	if (!lstat(ce->name, &st)) {
-		int dtype = ce_to_dtype(ce);
-		struct cache_entry *result;
-
-		/*
-		 * It may be that the 'lstat()' succeeded even though
-		 * target 'ce' was absent, because there is an old
-		 * entry that is different only in case..
-		 *
-		 * Ignore that lstat() if it matches.
-		 */
-		if (ignore_case && icase_exists(o, ce, &st))
-			return 0;
-
-		if (o->dir && excluded(o->dir, ce->name, &dtype))
-			/*
-			 * ce->name is explicitly excluded, so it is Ok to
-			 * overwrite it.
-			 */
-			return 0;
-		if (S_ISDIR(st.st_mode)) {
-			/*
-			 * We are checking out path "foo" and
-			 * found "foo/." in the working tree.
-			 * This is tricky -- if we have modified
-			 * files that are in "foo/" we would lose
-			 * them.
-			 */
-			if (verify_clean_subdirectory(ce, error_type, o) < 0)
-				return -1;
-			return 0;
-		}
-
-		/*
-		 * The previous round may already have decided to
-		 * delete this path, which is in a subdirectory that
-		 * is being replaced with a blob.
-		 */
-		result = index_name_exists(&o->result, ce->name, ce_namelen(ce), 0);
-		if (result) {
-			if (result->ce_flags & CE_REMOVE)
-				return 0;
-		}
-
-		return o->gently ? -1 :
-			add_rejected_path(o, error_type, ce->name);
-	}
+	if (!lstat(ce->name, &st))
+		return check_ok_to_remove(ce->name, ce_namelen(ce),
+				ce_to_dtype(ce), ce, &st,
+				error_type, o);
 	return 0;
 }
+
 static int verify_absent(struct cache_entry *ce,
 			 enum unpack_trees_error_types error_type,
 			 struct unpack_trees_options *o)
-- 
1.7.1.571.gba4d01

^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [PATCH 4/5] lstat_cache: optionally return match_len
  2010-08-20 20:35     ` Junio C Hamano
                         ` (5 preceding siblings ...)
  2010-10-09 13:52       ` [PATCH 3/5] add function check_ok_to_remove() Clemens Buchacher
@ 2010-10-09 13:52       ` Clemens Buchacher
  2010-10-09 13:53       ` [PATCH 5/5] do not overwrite files in leading path Clemens Buchacher
  7 siblings, 0 replies; 86+ messages in thread
From: Clemens Buchacher @ 2010-10-09 13:52 UTC (permalink / raw)
  To: git; +Cc: gitster, Clemens Buchacher

Return match_len so that the caller can know which leading path
component matched.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
---
 symlinks.c |   43 +++++++++++++++++++++++++++----------------
 1 files changed, 27 insertions(+), 16 deletions(-)

diff --git a/symlinks.c b/symlinks.c
index 8860120..b7343eb 100644
--- a/symlinks.c
+++ b/symlinks.c
@@ -64,11 +64,13 @@ static inline void reset_lstat_cache(struct cache_def *cache)
  * of the prefix, where the cache should use the stat() function
  * instead of the lstat() function to test each path component.
  */
-static int lstat_cache(struct cache_def *cache, const char *name, int len,
-		       int track_flags, int prefix_len_stat_func)
+static int lstat_cache_matchlen(struct cache_def *cache,
+				const char *name, int len,
+				int *ret_flags, int track_flags,
+				int prefix_len_stat_func)
 {
 	int match_len, last_slash, last_slash_dir, previous_slash;
-	int match_flags, ret_flags, save_flags, max_len, ret;
+	int save_flags, max_len, ret;
 	struct stat st;
 
 	if (cache->track_flags != track_flags ||
@@ -90,13 +92,13 @@ static int lstat_cache(struct cache_def *cache, const char *name, int len,
 		match_len = last_slash =
 			longest_path_match(name, len, cache->path, cache->len,
 					   &previous_slash);
-		match_flags = cache->flags & track_flags & (FL_NOENT|FL_SYMLINK);
+		*ret_flags = cache->flags & track_flags & (FL_NOENT|FL_SYMLINK);
 
 		if (!(track_flags & FL_FULLPATH) && match_len == len)
 			match_len = last_slash = previous_slash;
 
-		if (match_flags && match_len == cache->len)
-			return match_flags;
+		if (*ret_flags && match_len == cache->len)
+			return match_len;
 		/*
 		 * If we now have match_len > 0, we would know that
 		 * the matched part will always be a directory.
@@ -105,16 +107,16 @@ static int lstat_cache(struct cache_def *cache, const char *name, int len,
 		 * a substring of the cache on a path component basis,
 		 * we can return immediately.
 		 */
-		match_flags = track_flags & FL_DIR;
-		if (match_flags && len == match_len)
-			return match_flags;
+		*ret_flags = track_flags & FL_DIR;
+		if (*ret_flags && len == match_len)
+			return match_len;
 	}
 
 	/*
 	 * Okay, no match from the cache so far, so now we have to
 	 * check the rest of the path components.
 	 */
-	ret_flags = FL_DIR;
+	*ret_flags = FL_DIR;
 	last_slash_dir = last_slash;
 	max_len = len < PATH_MAX ? len : PATH_MAX;
 	while (match_len < max_len) {
@@ -133,16 +135,16 @@ static int lstat_cache(struct cache_def *cache, const char *name, int len,
 			ret = lstat(cache->path, &st);
 
 		if (ret) {
-			ret_flags = FL_LSTATERR;
+			*ret_flags = FL_LSTATERR;
 			if (errno == ENOENT)
-				ret_flags |= FL_NOENT;
+				*ret_flags |= FL_NOENT;
 		} else if (S_ISDIR(st.st_mode)) {
 			last_slash_dir = last_slash;
 			continue;
 		} else if (S_ISLNK(st.st_mode)) {
-			ret_flags = FL_SYMLINK;
+			*ret_flags = FL_SYMLINK;
 		} else {
-			ret_flags = FL_ERR;
+			*ret_flags = FL_ERR;
 		}
 		break;
 	}
@@ -152,7 +154,7 @@ static int lstat_cache(struct cache_def *cache, const char *name, int len,
 	 * path types, FL_NOENT, FL_SYMLINK and FL_DIR, can be cached
 	 * for the moment!
 	 */
-	save_flags = ret_flags & track_flags & (FL_NOENT|FL_SYMLINK);
+	save_flags = *ret_flags & track_flags & (FL_NOENT|FL_SYMLINK);
 	if (save_flags && last_slash > 0 && last_slash <= PATH_MAX) {
 		cache->path[last_slash] = '\0';
 		cache->len = last_slash;
@@ -176,7 +178,16 @@ static int lstat_cache(struct cache_def *cache, const char *name, int len,
 	} else {
 		reset_lstat_cache(cache);
 	}
-	return ret_flags;
+	return match_len;
+}
+
+static int lstat_cache(struct cache_def *cache, const char *name, int len,
+		       int track_flags, int prefix_len_stat_func)
+{
+	int flags;
+	(void)lstat_cache_matchlen(cache, name, len, &flags, track_flags,
+			prefix_len_stat_func);
+	return flags;
 }
 
 #define USE_ONLY_LSTAT  0
-- 
1.7.1.571.gba4d01

^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [PATCH 5/5] do not overwrite files in leading path
  2010-08-20 20:35     ` Junio C Hamano
                         ` (6 preceding siblings ...)
  2010-10-09 13:52       ` [PATCH 4/5] lstat_cache: optionally return match_len Clemens Buchacher
@ 2010-10-09 13:53       ` Clemens Buchacher
  2010-10-13 21:57         ` Junio C Hamano
  7 siblings, 1 reply; 86+ messages in thread
From: Clemens Buchacher @ 2010-10-09 13:53 UTC (permalink / raw)
  To: git; +Cc: gitster, Clemens Buchacher

If the work tree contains an untracked file x, and
unpack-trees wants to checkout a path x/*, the
file x is removed unconditionally.

Instead, apply the same checks that are normally
used for untracked files, and abort if the file
cannot be removed.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
---

It's not exactly pretty. But it's the best solution I could find
without major refactoring.

Clemens

 cache.h                    |    2 +-
 symlinks.c                 |   21 ++++++++++++++++-----
 t/t7607-merge-overwrite.sh |    2 +-
 unpack-trees.c             |   16 +++++++++++++---
 4 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/cache.h b/cache.h
index 2ef2fa3..f65d6f9 100644
--- a/cache.h
+++ b/cache.h
@@ -852,7 +852,7 @@ struct cache_def {
 
 extern int has_symlink_leading_path(const char *name, int len);
 extern int threaded_has_symlink_leading_path(struct cache_def *, const char *, int);
-extern int has_symlink_or_noent_leading_path(const char *name, int len);
+extern int check_leading_path(const char *name, int len);
 extern int has_dirs_only_path(const char *name, int len, int prefix_len);
 extern void schedule_dir_for_removal(const char *name, int len);
 extern void remove_scheduled_dirs(void);
diff --git a/symlinks.c b/symlinks.c
index b7343eb..3cacebd 100644
--- a/symlinks.c
+++ b/symlinks.c
@@ -209,15 +209,26 @@ int has_symlink_leading_path(const char *name, int len)
 }
 
 /*
- * Return non-zero if path 'name' has a leading symlink component or
+ * Return zero if path 'name' has a leading symlink component or
  * if some leading path component does not exists.
+ *
+ * Return -1 if leading path exists and is a directory.
+ *
+ * Return path length if leading path exists and is neither a
+ * directory nor a symlink.
  */
-int has_symlink_or_noent_leading_path(const char *name, int len)
+int check_leading_path(const char *name, int len)
 {
 	struct cache_def *cache = &default_cache;	/* FIXME */
-	return lstat_cache(cache, name, len,
-			   FL_SYMLINK|FL_NOENT|FL_DIR, USE_ONLY_LSTAT) &
-		(FL_SYMLINK|FL_NOENT);
+	int flags;
+	int match_len = lstat_cache_matchlen(cache, name, len, &flags,
+			   FL_SYMLINK|FL_NOENT|FL_DIR, USE_ONLY_LSTAT);
+	if (flags & (FL_SYMLINK|FL_NOENT))
+		return 0;
+	else if (flags & FL_DIR)
+		return -1;
+	else
+		return match_len;
 }
 
 /*
diff --git a/t/t7607-merge-overwrite.sh b/t/t7607-merge-overwrite.sh
index 01b070b..412db15 100755
--- a/t/t7607-merge-overwrite.sh
+++ b/t/t7607-merge-overwrite.sh
@@ -100,7 +100,7 @@ test_expect_success 'will not overwrite untracked subtree' '
 	test_cmp important sub/f/important
 '
 
-test_expect_failure 'will not overwrite untracked file in leading path' '
+test_expect_success 'will not overwrite untracked file in leading path' '
 	git reset --hard c0 &&
 	rm -rf sub &&
 	cp important sub &&
diff --git a/unpack-trees.c b/unpack-trees.c
index df1c920..6816113 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -182,7 +182,7 @@ static void display_error_msgs(struct unpack_trees_options *o)
  */
 static void unlink_entry(struct cache_entry *ce)
 {
-	if (has_symlink_or_noent_leading_path(ce->name, ce_namelen(ce)))
+	if (!check_leading_path(ce->name, ce_namelen(ce)))
 		return;
 	if (remove_or_warn(ce->ce_mode, ce->name))
 		return;
@@ -1194,18 +1194,28 @@ static int verify_absent_1(struct cache_entry *ce,
 				 enum unpack_trees_error_types error_type,
 				 struct unpack_trees_options *o)
 {
+	int len;
 	struct stat st;
 
 	if (o->index_only || o->reset || !o->update)
 		return 0;
 
-	if (has_symlink_or_noent_leading_path(ce->name, ce_namelen(ce)))
+	len = check_leading_path(ce->name, ce_namelen(ce));
+	if (!len)
 		return 0;
+	else if (len > 0) {
+		char path[PATH_MAX + 1];
+		memcpy(path, ce->name, len);
+		path[len] = 0;
+		lstat(path, &st);
 
-	if (!lstat(ce->name, &st))
+		return check_ok_to_remove(path, len, DT_UNKNOWN, NULL, &st,
+				error_type, o);
+	} else if (!lstat(ce->name, &st))
 		return check_ok_to_remove(ce->name, ce_namelen(ce),
 				ce_to_dtype(ce), ce, &st,
 				error_type, o);
+
 	return 0;
 }
 
-- 
1.7.1.571.gba4d01

^ permalink raw reply related	[flat|nested] 86+ messages in thread

* Re: [PATCH 2/5] t7607: add leading-path tests
  2010-10-09 13:52       ` [PATCH 2/5] t7607: add leading-path tests Clemens Buchacher
@ 2010-10-09 19:14         ` Johannes Sixt
  2010-10-10  8:38           ` [PATCH 2/5 v2] " Clemens Buchacher
  0 siblings, 1 reply; 86+ messages in thread
From: Johannes Sixt @ 2010-10-09 19:14 UTC (permalink / raw)
  To: Clemens Buchacher; +Cc: git, gitster

On Samstag, 9. Oktober 2010, Clemens Buchacher wrote:
> +test_expect_failure 'will not overwrite untracked symlink in leading path'

> +test_expect_success 'will not be confused by symlink in leading path' '

Please guard these two with a SYMLINKS prerequisite.

-- Hannes

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH] optionally disable overwriting of ignored files
  2010-08-21  8:05       ` Clemens Buchacher
  2010-08-22  7:25         ` Junio C Hamano
@ 2010-10-09 22:39         ` Kevin Ballard
  1 sibling, 0 replies; 86+ messages in thread
From: Kevin Ballard @ 2010-10-09 22:39 UTC (permalink / raw)
  To: Clemens Buchacher; +Cc: Junio C Hamano, Joshua Jensen, git@vger.kernel.org

On Aug 21, 2010, at 1:05 AM, Clemens Buchacher wrote:

> Files that go into a tracked .gitignore are most likely generated
> files, and therefore trashable. Files that go into
> .git/info/exclude or into an untracked .gitignore (e.g. echo '*' >
> precious-simulation-results/.gitignore), are not always generated
> and may not be trashable. At least they would not likely get in the
> way of checkout or merge.

I can think of at least one common case that doesn't match this pattern - A web app that has a config file. What I've seen frequently done is the config file is named something like foo.config.template, and that's tracked, and you're encouraged to copy this to foo.config and change the values. This file may not want to be tracked because it contains sensitive information like database passwords, so the repo may stick it in the .gitignore file to ensure that this never gets added. However, this is most definitely a "precious" file. For this case, and others like it, I am strongly in favor of Junio's suggestion to add a 3rd category "precious but untracked".

-Kevin Ballard

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH 1/5] t7607: use test_commit and test_must_fail
  2010-10-09 13:52       ` [PATCH 1/5] t7607: use test_commit and test_must_fail Clemens Buchacher
@ 2010-10-10  6:35         ` Jonathan Nieder
  2010-10-10  8:35           ` [PATCH 1/5 v2] t7607: use test-lib functions and check MERGE_HEAD Clemens Buchacher
  0 siblings, 1 reply; 86+ messages in thread
From: Jonathan Nieder @ 2010-10-10  6:35 UTC (permalink / raw)
  To: Clemens Buchacher; +Cc: git, gitster

Clemens Buchacher wrote:

> [Subject: t7607: use test_commit and test_must_fail]
>
> Also make sure that aborted merges do not leave
> MERGE_HEAD except in the "will not overwrite
> removed file" test, where we cannot do so.

Motivation?

> See
> also the discussion in the following thread.
>
> http://mid.gmane.org/7vskopwxej.fsf@gitster.siamese.dyndns.org

That subthread is about teaching merge-recursive to write its results
to the index first without touching the worktree for robustness.  It
is not immediately obvious what that has to do with not leaving
MERGE_HEAD.  Could you summarize the relevant part for the commit
message so the reader does not have to track it down?

> --- a/t/t7607-merge-overwrite.sh
> +++ b/t/t7607-merge-overwrite.sh
> @@ -7,81 +7,73 @@ Do not overwrite changes.'
>  . ./test-lib.sh
>  
>  test_expect_success 'setup' '
> -	echo c0 > c0.c &&
> -	git add c0.c &&
> -	git commit -m c0 &&
> -	git tag c0 &&
> +	test_commit c0 &&

Behavior changes:

 . filename is c0.t instead of c0.c
 . uses test_tick for timestamp

Sounds good.

> -	echo c1 > c1.c &&
> -	git add c1.c &&
> -	git commit -m c1 &&
> -	git tag c1 &&
> +	test_commit c1 &&

Another almost-noop (good).

> -	git reset --hard c0 &&
> -	echo c2 > c2.c &&
> -	git add c2.c &&
> -	git commit -m c2 &&
> -	git tag c2 &&
> -	git reset --hard c1 &&
> -	echo "c1 a" > c1.c &&
> -	git add c1.c &&
> -	git commit -m "c1 a" &&
> -	git tag c1a &&
> +	test_commit "c1a" "c1.t" "c1 a" &&
> +	git reset --hard c0 &&
> +	test_commit c2 &&

The reader might not remember the c1.t filename.  Maybe
it would be good to be explicit in the 'test_commit c1' line:

	test_commit c1 c1.t

?  Or maybe a comment could help?

	# Commit rewriting the file from c1

>  	echo "VERY IMPORTANT CHANGES" > important
>  '
>  
>  test_expect_success 'will not overwrite untracked file' '
>  	git reset --hard c1 &&
> -	cat important > c2.c &&
> +	cp important c2.t &&
>  	test_must_fail git merge c2 &&
> -	test_cmp important c2.c
> +	! test -f .git/MERGE_HEAD &&
> +	test_cmp important c2.t
>  '

Nit: if backportability is not important, maybe

	test_path_is_missing .git/MERGE_HEAD &&

for better error messages when running tests with -v and
the file is present?

[...]
> -	cat important > c2.c &&
> -	git add c2.c &&
> -	rm c2.c &&
> +	cp important c2.t &&
> +	git add c2.t &&
> +	rm c2.t &&

These changes make for a lot of noise.  Why not specify the
filenames in the setup test to keep the .c extension?

The main change (checking that MERGE_HEAD is not present
for a merge that fails due to pre-merge checks) seems good.
Thanks.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH 1/5 v2] t7607: use test-lib functions and check MERGE_HEAD
  2010-10-10  6:35         ` Jonathan Nieder
@ 2010-10-10  8:35           ` Clemens Buchacher
  2010-10-13 21:33             ` Junio C Hamano
  2010-10-13 21:59             ` Junio C Hamano
  0 siblings, 2 replies; 86+ messages in thread
From: Clemens Buchacher @ 2010-10-10  8:35 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, gitster

[-- Attachment #1: Type: text/plain, Size: 3635 bytes --]

Use the test_commit and test_path_is_missing
functions from the test library.

Also make sure that a merge which fails due to
pre-merge checks aborts properly and does not
leave MERGE_HEAD behind.

The "will not overwrite removed file" test is an
exception to this. It notices the untracked file
at a stage where the merge is already well under
way. Therefore we cannot abort the merge without
major restructuring. See the following thread for
more details.

http://mid.gmane.org/7vskopwxej.fsf@gitster.siamese.dyndns.org

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
---

On Sun, Oct 10, 2010 at 01:35:27AM -0500, Jonathan Nieder wrote:
> 
> The main change (checking that MERGE_HEAD is not present
> for a merge that fails due to pre-merge checks) seems good.

Thanks. This new version incorporates all of your suggestions.

Clemens

 t/t7607-merge-overwrite.sh |   38 +++++++++++++++-----------------------
 1 files changed, 15 insertions(+), 23 deletions(-)

diff --git a/t/t7607-merge-overwrite.sh b/t/t7607-merge-overwrite.sh
index d82349a..b8fab54 100755
--- a/t/t7607-merge-overwrite.sh
+++ b/t/t7607-merge-overwrite.sh
@@ -7,48 +7,38 @@ Do not overwrite changes.'
 . ./test-lib.sh
 
 test_expect_success 'setup' '
-	echo c0 > c0.c &&
-	git add c0.c &&
-	git commit -m c0 &&
-	git tag c0 &&
-	echo c1 > c1.c &&
-	git add c1.c &&
-	git commit -m c1 &&
-	git tag c1 &&
+	test_commit c0 c0.c &&
+	test_commit c1 c1.c &&
+	test_commit c1a c1.c "c1 a" &&
 	git reset --hard c0 &&
-	echo c2 > c2.c &&
-	git add c2.c &&
-	git commit -m c2 &&
-	git tag c2 &&
-	git reset --hard c1 &&
-	echo "c1 a" > c1.c &&
-	git add c1.c &&
-	git commit -m "c1 a" &&
-	git tag c1a &&
+	test_commit c2 c2.c &&
 	echo "VERY IMPORTANT CHANGES" > important
 '
 
 test_expect_success 'will not overwrite untracked file' '
 	git reset --hard c1 &&
-	cat important > c2.c &&
+	cp important c2.c &&
 	test_must_fail git merge c2 &&
+	test_path_is_missing .git/MERGE_HEAD &&
 	test_cmp important c2.c
 '
 
 test_expect_success 'will not overwrite new file' '
 	git reset --hard c1 &&
-	cat important > c2.c &&
+	cp important c2.c &&
 	git add c2.c &&
 	test_must_fail git merge c2 &&
+	test_path_is_missing .git/MERGE_HEAD &&
 	test_cmp important c2.c
 '
 
 test_expect_success 'will not overwrite staged changes' '
 	git reset --hard c1 &&
-	cat important > c2.c &&
+	cp important c2.c &&
 	git add c2.c &&
 	rm c2.c &&
 	test_must_fail git merge c2 &&
+	test_path_is_missing .git/MERGE_HEAD &&
 	git checkout c2.c &&
 	test_cmp important c2.c
 '
@@ -57,7 +47,7 @@ test_expect_success 'will not overwrite removed file' '
 	git reset --hard c1 &&
 	git rm c1.c &&
 	git commit -m "rm c1.c" &&
-	cat important > c1.c &&
+	cp important c1.c &&
 	test_must_fail git merge c1a &&
 	test_cmp important c1.c
 '
@@ -66,9 +56,10 @@ test_expect_success 'will not overwrite re-added file' '
 	git reset --hard c1 &&
 	git rm c1.c &&
 	git commit -m "rm c1.c" &&
-	cat important > c1.c &&
+	cp important c1.c &&
 	git add c1.c &&
 	test_must_fail git merge c1a &&
+	test_path_is_missing .git/MERGE_HEAD &&
 	test_cmp important c1.c
 '
 
@@ -76,10 +67,11 @@ test_expect_success 'will not overwrite removed file with staged changes' '
 	git reset --hard c1 &&
 	git rm c1.c &&
 	git commit -m "rm c1.c" &&
-	cat important > c1.c &&
+	cp important c1.c &&
 	git add c1.c &&
 	rm c1.c &&
 	test_must_fail git merge c1a &&
+	test_path_is_missing .git/MERGE_HEAD &&
 	git checkout c1.c &&
 	test_cmp important c1.c
 '
-- 
1.7.1.571.gba4d01


[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [PATCH 2/5 v2] t7607: add leading-path tests
  2010-10-09 19:14         ` Johannes Sixt
@ 2010-10-10  8:38           ` Clemens Buchacher
  0 siblings, 0 replies; 86+ messages in thread
From: Clemens Buchacher @ 2010-10-10  8:38 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: git, gitster

[-- Attachment #1: Type: text/plain, Size: 2716 bytes --]


Signed-off-by: Clemens Buchacher <drizzd@aon.at>
---

This patch depends on PATCH 1/5 v2. The remaining patches 3 to 5
still apply cleanly.

On Sat, Oct 09, 2010 at 09:14:51PM +0200, Johannes Sixt wrote:
> On Samstag, 9. Oktober 2010, Clemens Buchacher wrote:
> > +test_expect_failure 'will not overwrite untracked symlink in leading path'
> 
> > +test_expect_success 'will not be confused by symlink in leading path' '
> 
> Please guard these two with a SYMLINKS prerequisite.

Done. Thank you.

Clemens

 t/t7607-merge-overwrite.sh |   51 ++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 51 insertions(+), 0 deletions(-)

diff --git a/t/t7607-merge-overwrite.sh b/t/t7607-merge-overwrite.sh
index b8fab54..77fcaa2 100755
--- a/t/t7607-merge-overwrite.sh
+++ b/t/t7607-merge-overwrite.sh
@@ -12,6 +12,12 @@ test_expect_success 'setup' '
 	test_commit c1a c1.c "c1 a" &&
 	git reset --hard c0 &&
 	test_commit c2 c2.c &&
+	git reset --hard c0 &&
+	mkdir sub &&
+	echo "sub/f" > sub/f &&
+	git add sub/f &&
+	git commit -m sub &&
+	git tag sub &&
 	echo "VERY IMPORTANT CHANGES" > important
 '
 
@@ -23,6 +29,14 @@ test_expect_success 'will not overwrite untracked file' '
 	test_cmp important c2.c
 '
 
+test_expect_success 'will overwrite tracked file' '
+	git reset --hard c1 &&
+	cp important c2.c &&
+	git add c2.c &&
+	git commit -m important &&
+	git checkout c2
+'
+
 test_expect_success 'will not overwrite new file' '
 	git reset --hard c1 &&
 	cp important c2.c &&
@@ -76,4 +90,41 @@ test_expect_success 'will not overwrite removed file with staged changes' '
 	test_cmp important c1.c
 '
 
+test_expect_success 'will not overwrite untracked subtree' '
+	git reset --hard c0 &&
+	rm -rf sub &&
+	mkdir -p sub/f &&
+	cp important sub/f/important &&
+	test_must_fail git merge sub &&
+	test_path_is_missing .git/MERGE_HEAD &&
+	test_cmp important sub/f/important
+'
+
+test_expect_failure 'will not overwrite untracked file in leading path' '
+	git reset --hard c0 &&
+	rm -rf sub &&
+	cp important sub &&
+	test_must_fail git merge sub &&
+	test_path_is_missing .git/MERGE_HEAD &&
+	test_cmp important sub
+'
+
+test_expect_failure SYMLINKS 'will not overwrite untracked symlink in leading path' '
+	git reset --hard c0 &&
+	rm -rf sub &&
+	mkdir sub2 &&
+	ln -s sub2 sub &&
+	test_must_fail git merge sub &&
+	test_path_is_missing .git/MERGE_HEAD
+'
+
+test_expect_success SYMLINKS 'will not be confused by symlink in leading path' '
+	git reset --hard c0 &&
+	rm -rf sub &&
+	ln -s sub2 sub &&
+	git add sub &&
+	git commit -m ln &&
+	git checkout sub
+'
+
 test_done
-- 
1.7.1.571.gba4d01


[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply related	[flat|nested] 86+ messages in thread

* Re: [PATCH 1/5 v2] t7607: use test-lib functions and check MERGE_HEAD
  2010-10-10  8:35           ` [PATCH 1/5 v2] t7607: use test-lib functions and check MERGE_HEAD Clemens Buchacher
@ 2010-10-13 21:33             ` Junio C Hamano
  2010-10-13 21:59             ` Junio C Hamano
  1 sibling, 0 replies; 86+ messages in thread
From: Junio C Hamano @ 2010-10-13 21:33 UTC (permalink / raw)
  To: Clemens Buchacher; +Cc: Jonathan Nieder, git

Clemens Buchacher <drizzd@aon.at> writes:

> --ew6BAiZeqk4r7MaW
> Content-Type: text/plain; charset=us-ascii
> Content-Disposition: inline
> Content-Transfer-Encoding: quoted-printable

Please don't do this...

>
> --ew6BAiZeqk4r7MaW
> Content-Type: application/pgp-signature; name="signature.asc"
> Content-Description: Digital signature
> Content-Disposition: inline
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.10 (GNU/Linux)
>
> iQEcBAEBAgAGBQJMsXrfAAoJELKdZexG8uqM/BwH/Ai2w+DFW4L9D1SivLoxDL8/
> Z3YDRrx5oJa35ZhbBwInGJx7xwXG3Pn/mx0avGwoRwQe9cRufp6AO5hHUq3U1LNv
> ZaP4RdlScMqeuKUu8mrbjJs4kumL/sjZ59MRnBZzX1Ovdq/GbKhJqidYTvmHQc6e
> 0sngXx9Jf1WlS7m1sDztYPRJ3z2lF0js+BNHLIVOi6CgSbBJYdQzeLrvO/BVX9V0
> P2F0bGZLWTqqOSLMav5jcYFAgIv8mRqxjre+1IviFuGTuu5hX7yTx++qAiV8CK0f
> Fpe0HNQlYahbytd9qlOoqsbV5fuXkOoFqYHKmT6461u31QHaYPyBwsSWgjkF9IE=
> =3pEL
> -----END PGP SIGNATURE-----

Nor this.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH 3/5] add function check_ok_to_remove()
  2010-10-09 13:52       ` [PATCH 3/5] add function check_ok_to_remove() Clemens Buchacher
@ 2010-10-13 21:43         ` Junio C Hamano
  0 siblings, 0 replies; 86+ messages in thread
From: Junio C Hamano @ 2010-10-13 21:43 UTC (permalink / raw)
  To: Clemens Buchacher; +Cc: git

Clemens Buchacher <drizzd@aon.at> writes:

> This wraps some inline code into the function check_ok_to_remove(),
> which will later be used for leading path components as well.
>
> Signed-off-by: Clemens Buchacher <drizzd@aon.at>

Nice depth reduction.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH 5/5] do not overwrite files in leading path
  2010-10-09 13:53       ` [PATCH 5/5] do not overwrite files in leading path Clemens Buchacher
@ 2010-10-13 21:57         ` Junio C Hamano
  2010-10-13 22:34           ` Clemens Buchacher
  0 siblings, 1 reply; 86+ messages in thread
From: Junio C Hamano @ 2010-10-13 21:57 UTC (permalink / raw)
  To: Clemens Buchacher; +Cc: git

Clemens Buchacher <drizzd@aon.at> writes:

> If the work tree contains an untracked file x, and
> unpack-trees wants to checkout a path x/*, the
> file x is removed unconditionally.
>
> Instead, apply the same checks that are normally
> used for untracked files, and abort if the file
> cannot be removed.
>

Too short a line is also hard to read as too long a line.

> diff --git a/cache.h b/cache.h
> index 2ef2fa3..f65d6f9 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -852,7 +852,7 @@ struct cache_def {
>  
>  extern int has_symlink_leading_path(const char *name, int len);
>  extern int threaded_has_symlink_leading_path(struct cache_def *, const char *, int);
> -extern int has_symlink_or_noent_leading_path(const char *name, int len);
> +extern int check_leading_path(const char *name, int len);

This is an API regression.  "check" in function name "check-blah" does not
tell what you are checking (is it checking if the path is Ok?  is it
checking if the path has symlinked components?  is it checking if some
part of the path overlaps with tracked content?  is it checking if you
have already run lstat(2) on some of the leading path components?), nor
for what purpose the check can be used (it is Ok to overwrite it?, is it
Ok to create a directory there?  is it Ok to create a file there?).

At least the old name told us what it checked, didn't it?

The approach you took looks sound, though.  What did you feel was "not
exactly pretty" about it?

Thanks.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH 1/5 v2] t7607: use test-lib functions and check MERGE_HEAD
  2010-10-10  8:35           ` [PATCH 1/5 v2] t7607: use test-lib functions and check MERGE_HEAD Clemens Buchacher
  2010-10-13 21:33             ` Junio C Hamano
@ 2010-10-13 21:59             ` Junio C Hamano
  1 sibling, 0 replies; 86+ messages in thread
From: Junio C Hamano @ 2010-10-13 21:59 UTC (permalink / raw)
  To: Clemens Buchacher; +Cc: Jonathan Nieder, git, gitster

Clemens Buchacher <drizzd@aon.at> writes:

>  test_expect_success 'will not overwrite untracked file' '
>  	git reset --hard c1 &&
> -	cat important > c2.c &&
> +	cp important c2.c &&
>  	test_must_fail git merge c2 &&
> +	test_path_is_missing .git/MERGE_HEAD &&
>  	test_cmp important c2.c
>  '
>  
>  test_expect_success 'will not overwrite new file' '
>  	git reset --hard c1 &&
> -	cat important > c2.c &&
> +	cp important c2.c &&

Why?

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH 5/5] do not overwrite files in leading path
  2010-10-13 21:57         ` Junio C Hamano
@ 2010-10-13 22:34           ` Clemens Buchacher
  2010-10-15  6:48             ` Clemens Buchacher
  0 siblings, 1 reply; 86+ messages in thread
From: Clemens Buchacher @ 2010-10-13 22:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 3140 bytes --]

On Wed, Oct 13, 2010 at 02:57:59PM -0700, Junio C Hamano wrote:
> Clemens Buchacher <drizzd@aon.at> writes:
> 
> > If the work tree contains an untracked file x, and
> > unpack-trees wants to checkout a path x/*, the
> > file x is removed unconditionally.
> >
> > Instead, apply the same checks that are normally
> > used for untracked files, and abort if the file
> > cannot be removed.
> >
> 
> Too short a line is also hard to read as too long a line.

Ok.

> > diff --git a/cache.h b/cache.h
> > index 2ef2fa3..f65d6f9 100644
> > --- a/cache.h
> > +++ b/cache.h
> > @@ -852,7 +852,7 @@ struct cache_def {
> >  
> >  extern int has_symlink_leading_path(const char *name, int len);
> >  extern int threaded_has_symlink_leading_path(struct cache_def *, const char *, int);
> > -extern int has_symlink_or_noent_leading_path(const char *name, int len);
> > +extern int check_leading_path(const char *name, int len);
> 
> This is an API regression.  "check" in function name "check-blah" does not
> tell what you are checking (is it checking if the path is Ok?  is it
> checking if the path has symlinked components?  is it checking if some
> part of the path overlaps with tracked content?  is it checking if you
> have already run lstat(2) on some of the leading path components?), nor
> for what purpose the check can be used (it is Ok to overwrite it?, is it
> Ok to create a directory there?  is it Ok to create a file there?).
>
> At least the old name told us what it checked, didn't it?

Yes, but that's not what it does any more. I worded it vague on
purpose, because there is really no way to explain all it does in
one function name. This is the full description.

/*
 * Return zero if path 'name' has a leading symlink component or
 * if some leading path component does not exists.
 *
 * Return -1 if leading path exists and is a directory.
 *
 * Return path length if leading path exists and is neither a
 * directory nor a symlink.
 */

But if you prefer, we can call it "verify_clean_leading_path"?

> The approach you took looks sound, though.  What did you feel was "not
> exactly pretty" about it?

First, check_leading_path bothers me. It does too many things and
encodes all of that in one integer return value.

Second, I also have to copy the offending path manually to a local
path variable, even though lstat_cache already has that path stored
in a buffer.

Third, check_ok_to_remove expects a cache_entry. But in case of a
match in check_leading_path, we do not have one and pass NULL to
check_ok_to_rmove. It will not be used anyways, because
check_leading_path should return > 0 only for regular files, and
the cache_entry is only used for directories in
verify_clean_subdirectory.

If lstat returns with an error other than NOENT, or if
check_ok_to_remove is called with anything other than a directory
and cache_entry is NULL, we get a segmentation fault. Before, an
error was simply ignored. I don't know which is worse.

On the other hand, I think the patch fixes a nasty bug, so maybe we
can accept those issues for now.

Clemens

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH 5/5] do not overwrite files in leading path
  2010-10-13 22:34           ` Clemens Buchacher
@ 2010-10-15  6:48             ` Clemens Buchacher
  2010-10-15 18:47               ` Junio C Hamano
  0 siblings, 1 reply; 86+ messages in thread
From: Clemens Buchacher @ 2010-10-15  6:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 1020 bytes --]

On Thu, Oct 14, 2010 at 12:34:25AM +0200, Clemens Buchacher wrote:
> 
> If lstat returns with an error other than NOENT, or if
> check_ok_to_remove is called with anything other than a directory
> and cache_entry is NULL, we get a segmentation fault. Before, an
> error was simply ignored. I don't know which is worse.

I suppose we only need the following additional changes.

- die if lstat returns an error other than ENOENT.

- Rewrite verify_clean_subdirectory to not require a cache_entry.

- Expose lstat result and path cache to the caller of
  lstat_cache_matchlen() in verify_absent_1().

- rewrite check_leading_path (or 'verify_clean_path') to
  check the full path and return zero if
  + the leading path contains a symlink or
  + the leading path exists, but the full path does not
  and returns the path length of the offending entry otherwise.
  
I think that's manageable and much cleaner. I will start to work on
it as soon as possible.

Thank you for your review.

Clemens

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH 5/5] do not overwrite files in leading path
  2010-10-15  6:48             ` Clemens Buchacher
@ 2010-10-15 18:47               ` Junio C Hamano
  0 siblings, 0 replies; 86+ messages in thread
From: Junio C Hamano @ 2010-10-15 18:47 UTC (permalink / raw)
  To: Clemens Buchacher; +Cc: Junio C Hamano, git

Clemens Buchacher <drizzd@aon.at> writes:

> On Thu, Oct 14, 2010 at 12:34:25AM +0200, Clemens Buchacher wrote:
>> 
>> If lstat returns with an error other than NOENT, or if
>> check_ok_to_remove is called with anything other than a directory
>> and cache_entry is NULL, we get a segmentation fault. Before, an
>> error was simply ignored. I don't know which is worse.
>
> I suppose we only need the following additional changes.
>
> - die if lstat returns an error other than ENOENT.
>
> - Rewrite verify_clean_subdirectory to not require a cache_entry.
>
> - Expose lstat result and path cache to the caller of
>   lstat_cache_matchlen() in verify_absent_1().
>
> - rewrite check_leading_path (or 'verify_clean_path') to
>   check the full path and return zero if
>   + the leading path contains a symlink or
>   + the leading path exists, but the full path does not
>   and returns the path length of the offending entry otherwise.
>   
> I think that's manageable and much cleaner. I will start to work on
> it as soon as possible.

Thanks.  I like seeing people thinking things through ;-).

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Ignored files being silently overwritten when switching branches
@ 2018-10-15 13:01 Per Lundberg
  2018-10-16  6:40 ` Jeff King
  0 siblings, 1 reply; 86+ messages in thread
From: Per Lundberg @ 2018-10-15 13:01 UTC (permalink / raw)
  To: git@vger.kernel.org

Hi,

Sorry if this question has been asked before; I skimmed through the list 
archives and the FAQ but couldn't immediately find it - please point me 
in the right direction if it has indeed been discussed before.

We were renaming some previously-included configuration files (foo.conf) 
in one of our repos, instead providing a "default" configuration 
(foo.conf.default) that can easily be copied over to foo.conf by 
individual developers. This all works fine, and the *.conf are now added 
to the .gitignore list.

_However_, when switching back to our previous release branches (which 
includes the foo.conf file in the tree), we have noticed that git 
silently overwrites the locally-modified foo.conf file with the upstream 
foo.conf file from that branch. When switching back to master, the file 
contents is therefore perpetually lost, which is a bit unfortunate.

I did a quick repro case here: https://github.com/perlun/git-test, and 
it seems easy to reproduce this behavior using the following steps (also 
documented in that git repo):

$ git init
$ touch foo.txt
$ nano foo.txt
$ git add foo.txt
$ git commit -m 'Add foo.txt'
[master (root-commit) 8ef05cb] Add foo.txt
  1 file changed, 1 insertion(+)
  create mode 100644 foo.txt
$ git checkout -b dev
Switched to a new branch 'dev'
$ git mv foo.txt foo.bar
$ git commit -m "Rename foo.txt -> foo.bar"
[dev 4c55c9b] Rename foo.txt -> foo.bar
  1 file changed, 0 insertions(+), 0 deletions(-)
  rename foo.txt => foo.bar (100%)
$ echo 'my local foo.txt' > foo.txt
$ echo foo.txt > .gitignore
$ git commit -m "Add .gitignore"
[dev 4c16acb] Add .gitignore
  1 file changed, 2 insertions(+)
  create mode 100644 .gitignore
$ git checkout master # This will silently overwrite the local foo.txt

So my question is: is this by design or should this be considered a bug 
in git? Of course, it depends largely on what .gitignore is being used 
for - if we are talking about files which can easily be regenerated 
(build artifacts, node_modules folders etc.) I can totally understand 
the current behavior, but when dealing with more sensitive & important 
content it's a bit inconvenient.


What I would have expected would be for git to complain, with this message:

error: The following untracked working tree files would be overwritten 
by checkout:
	foo.txt
Please move or remove them before you switch branches.
Aborting

This is normally the message you get when a _non-ignored_ file is being 
overwritten. But apparently not so when an ignored file is being 
overwritten. If this can be tweaked in the local repo settings somehow, 
please let me know.
--
Best regards,
Per

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: Ignored files being silently overwritten when switching branches
  2018-10-15 13:01 Ignored files being silently overwritten when switching branches Per Lundberg
@ 2018-10-16  6:40 ` Jeff King
  0 siblings, 0 replies; 86+ messages in thread
From: Jeff King @ 2018-10-16  6:40 UTC (permalink / raw)
  To: Per Lundberg; +Cc: git@vger.kernel.org

On Mon, Oct 15, 2018 at 01:01:50PM +0000, Per Lundberg wrote:

> Sorry if this question has been asked before; I skimmed through the list 
> archives and the FAQ but couldn't immediately find it - please point me 
> in the right direction if it has indeed been discussed before.

It is a frequently asked question, but it doesn't seem to be in any FAQ
that I could find. The behavior you're seeing is intended. See this
message (and the rest of the thread) for discussion:

  https://public-inbox.org/git/7viq39avay.fsf@alter.siamese.dyndns.org/

> So my question is: is this by design or should this be considered a bug 
> in git? Of course, it depends largely on what .gitignore is being used 
> for - if we are talking about files which can easily be regenerated 
> (build artifacts, node_modules folders etc.) I can totally understand 
> the current behavior, but when dealing with more sensitive & important 
> content it's a bit inconvenient.

Basically: yes. It would be nice to have that "do not track this, but do
not trash it either" state for a file, but Git does not currently
support that.

-Peff

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: Ignored files being silently overwritten when switching branches
  2010-08-23  9:37     ` Matthieu Moy
  2010-08-23 13:56       ` Holger Hellmuth
@ 2018-10-16  9:10       ` Ævar Arnfjörð Bjarmason
  2018-10-16 15:05         ` Duy Nguyen
  1 sibling, 1 reply; 86+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-10-16  9:10 UTC (permalink / raw)
  To: Jeff King
  Cc: Per Lundberg, git@vger.kernel.org, Junio C Hamano, Matthieu Moy,
	Clemens Buchacher, Holger Hellmuth, Kevin Ballard


On Tue, Oct 16 2018, Jeff King wrote:

> On Mon, Oct 15, 2018 at 01:01:50PM +0000, Per Lundberg wrote:
>
>> Sorry if this question has been asked before; I skimmed through the list
>> archives and the FAQ but couldn't immediately find it - please point me
>> in the right direction if it has indeed been discussed before.
>
> It is a frequently asked question, but it doesn't seem to be in any FAQ
> that I could find. The behavior you're seeing is intended. See this
> message (and the rest of the thread) for discussion:
>
>   https://public-inbox.org/git/7viq39avay.fsf@alter.siamese.dyndns.org/
>
>> So my question is: is this by design or should this be considered a bug
>> in git? Of course, it depends largely on what .gitignore is being used
>> for - if we are talking about files which can easily be regenerated
>> (build artifacts, node_modules folders etc.) I can totally understand
>> the current behavior, but when dealing with more sensitive & important
>> content it's a bit inconvenient.
>
> Basically: yes. It would be nice to have that "do not track this, but do
> not trash it either" state for a file, but Git does not currently
> support that.

There's some patches in that thread that could be picked up by someone
interested. I think the approach mentioned by Matthieu Moy here makes
the most sense:
https://public-inbox.org/git/vpqd3t9656k.fsf@bauges.imag.fr/

I don't think the rationale mentioned by Junio in
https://public-inbox.org/git/7v4oepaup7.fsf@alter.siamese.dyndns.org/ is
very convincing.

The question is not whether .gitignore is intended to be used in some
specific way, e.g. only ignoring *.o files, but whether we can
reasonably suspect that users use the combination of the features we
expose in such a way that their precious data gets destroyed. User data
should get the benefit of the doubt.

Off the top of my head, I can imagine many ways in which this'll go
wrong:

 1. Even if you're using .gitignore only for "trashable" as as Junio
    mentions, git not trashing your data depends on everyone who
    modifies .gitignore in your project having enough situational
    awareness not to inadvertently add a glob to the file which
    *accidentally* ignores existing files, and *nothing warns about
    this*.

    Between the caveat noted in "It is not possible to re-include[...]"
    in gitignore(5) and negative pathspecs it can be really easy to get
    this wrong.

    So e.g. in git.git I can add a line with "*" to .gitignore, and
    nothing will complain or look unusual as long as I'm not introducing
    new files, and I'll only find out when some-new-file.c of mine gets
    trashed.

 2. Related, the UI "git add <ignored>" presents is just "Use -f if you
    really want to add them". Users who aren't careful will just think
    "oh, I just need -f in this case" and not alter .gitignore, leaving
    a timebomb for future users.

    Those new users will have no way of knowing that they've cloned a
    repo with a broken overzealous .gitignore, e.g. there's nothing on
    clone that says "you've just cloned a repo with N files, all of
    which are ignored, so git clean etc. will likely wipe out anything
    you have in the checkout".

 3. Since we implictly expose this "you need a one-off action to
    override .gitignore" noted in #2 users can and *do* use this for
    "soft" ignores.

    E.g. in a big work repo there's an ignore for *.png, even though the
    repo has thousands of such files, because it's not considered good
    practice to add them anymore (there's another static repo), and
    someone thought to use .gitignore to enforce that suggestion.

    I have a personal repo where I only want *.gpg files, and due to the
    inability to re-include files recursively (noted in #1) I just
    ignore '*' and use git veeery carefully. I was only worried about
    'git clean' so far, but now I see I need to worry about "checkout"
    as well.

But maybe the use-cases I'm mentioning are highly unusual and the repos
at work have ended up in some bizarre state and nobody else cares about
this.

It would be interesting if someone at a big git hosting providers (hint:
Jeff :) could provide some numbers about how common it is to have a
repository containing tracked files ignored by a .gitignore the
repository itself carries. This wouldn't cover all of #1-3 above, but is
probably a pretty good proxy metric.

I thought this could be done by:

    git ls-tree -r --name-only HEAD  | git check-ignore --no-index --stdin

But I see that e.g. on git.git this goes wrong due to
t/helper/.gitignore. So I don't know how one would answer "does this
repo have .gitignored files tracked?" in a one-liner.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: Ignored files being silently overwritten when switching branches
  2018-10-16  9:10       ` Ignored files being silently overwritten when switching branches Ævar Arnfjörð Bjarmason
@ 2018-10-16 15:05         ` Duy Nguyen
  2018-10-18  1:55           ` Junio C Hamano
  0 siblings, 1 reply; 86+ messages in thread
From: Duy Nguyen @ 2018-10-16 15:05 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Jeff King, per.lundberg, Git Mailing List, Junio C Hamano, git,
	Clemens Buchacher, Holger Hellmuth (IKS), Kevin Ballard

On Tue, Oct 16, 2018 at 11:12 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Tue, Oct 16 2018, Jeff King wrote:
>
> > On Mon, Oct 15, 2018 at 01:01:50PM +0000, Per Lundberg wrote:
> >
> >> Sorry if this question has been asked before; I skimmed through the list
> >> archives and the FAQ but couldn't immediately find it - please point me
> >> in the right direction if it has indeed been discussed before.
> >
> > It is a frequently asked question, but it doesn't seem to be in any FAQ
> > that I could find. The behavior you're seeing is intended. See this
> > message (and the rest of the thread) for discussion:
> >
> >   https://public-inbox.org/git/7viq39avay.fsf@alter.siamese.dyndns.org/
> >
> >> So my question is: is this by design or should this be considered a bug
> >> in git? Of course, it depends largely on what .gitignore is being used
> >> for - if we are talking about files which can easily be regenerated
> >> (build artifacts, node_modules folders etc.) I can totally understand
> >> the current behavior, but when dealing with more sensitive & important
> >> content it's a bit inconvenient.
> >
> > Basically: yes. It would be nice to have that "do not track this, but do
> > not trash it either" state for a file, but Git does not currently
> > support that.
>
> There's some patches in that thread that could be picked up by someone
> interested. I think the approach mentioned by Matthieu Moy here makes
> the most sense:
> https://public-inbox.org/git/vpqd3t9656k.fsf@bauges.imag.fr/
>
> I don't think the rationale mentioned by Junio in
> https://public-inbox.org/git/7v4oepaup7.fsf@alter.siamese.dyndns.org/ is
> very convincing.

Just fyi I also have some wip changes that add the forth "precious"
class in addition to tracked, untracked and ignored [1]. If someone
has time it could be another option to pick up.

[1] https://gitlab.com/pclouds/git/commit/0e7f7afa1879b055369ebd3f1224311c43c8a32b
-- 
Duy

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: Ignored files being silently overwritten when switching branches
  2018-10-16 15:05         ` Duy Nguyen
@ 2018-10-18  1:55           ` Junio C Hamano
  0 siblings, 0 replies; 86+ messages in thread
From: Junio C Hamano @ 2018-10-18  1:55 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Ævar Arnfjörð Bjarmason, Jeff King, per.lundberg,
	Git Mailing List, git, Clemens Buchacher, Holger Hellmuth (IKS),
	Kevin Ballard

Duy Nguyen <pclouds@gmail.com> writes:

> Just fyi I also have some wip changes that add the forth "precious"
> class in addition to tracked, untracked and ignored [1]. If someone
> has time it could be another option to pick up.

It is much more sensible than gaining the ability to express
precious by trading away the ability to express expendable, I would
think.


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Checkout deleted semi-untracked file
@ 2018-11-06 12:41 Steffen Jost
  0 siblings, 0 replies; 86+ messages in thread
From: Steffen Jost @ 2018-11-06 12:41 UTC (permalink / raw)
  To: git

Hello!

A brief discussion on the git user mailing list on Google Groups recommended me to file the following as a bug report.

The problem led to an actual file loss, but I suspect that this might be intended:

1) .gitignore is added to the repository (which then causes problems)
2) A file is added, repeatedly edited and comitted to a remote repository.
3) Later on, the file is added to .gitignore and "git rm --cached file" is executed (since the file now contains information private to each developer).
4) Several commits happen.
5) I checkout an old branch which has not yet seen the change in .gitignore in the master branch. The file is reverted to the state of the branch.
6) I checkout master, and the file with all later changes is irrevocably lost.

I usually advise my students to check-in their .gitignore file into the repository. Apparently this is a bad advice, since it now led to a somewhat painful file loss for me.

So what is the actual advice on this? Google turned up mixed advice there, and the git user mailing list on Google Groups recommended me submitting this as a bug here. However, I think this works as intended. However, I don't know either how to avert this problem, apart from not checking in the .gitignore file (or checking it in under a different name and copying it manually).


Thanks for any advice,
  Steffen Jost.

-- 
+49-89-2180-9139
http://www.tcs.ifi.lmu.de/~jost/

Lehr- und Forschungseinheit für Theoretische Informatik
Ludwig-Maximilians-Universität München
Oettingenstr. 67 (E111)
80538 München
BAVARIA

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: Checkout deleted semi-untracked file
  2010-08-17  5:21 git merge, .gitignore, and silently overwriting untracked files Joshua Jensen
  2010-08-17 19:33 ` Junio C Hamano
@ 2018-11-06 15:12 ` Ævar Arnfjörð Bjarmason
  2018-11-11  9:52   ` [RFC PATCH] Introduce "precious" file concept Nguyễn Thái Ngọc Duy
  2018-11-11 12:33   ` [RFC PATCH] Introduce "precious" file concept Ævar Arnfjörð Bjarmason
  1 sibling, 2 replies; 86+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-11-06 15:12 UTC (permalink / raw)
  To: Steffen Jost; +Cc: git


On Tue, Nov 06 2018, Steffen Jost wrote:

> Hello!
>
> A brief discussion on the git user mailing list on Google Groups recommended me to file the following as a bug report.
>
> The problem led to an actual file loss, but I suspect that this might be intended:
>
> 1) .gitignore is added to the repository (which then causes problems)
> 2) A file is added, repeatedly edited and comitted to a remote repository.
> 3) Later on, the file is added to .gitignore and "git rm --cached file" is executed (since the file now contains information private to each developer).
> 4) Several commits happen.
> 5) I checkout an old branch which has not yet seen the change in .gitignore in the master branch. The file is reverted to the state of the branch.
> 6) I checkout master, and the file with all later changes is irrevocably lost.
>
> I usually advise my students to check-in their .gitignore file into the repository. Apparently this is a bad advice, since it now led to a somewhat painful file loss for me.
>
> So what is the actual advice on this? Google turned up mixed advice there, and the git user mailing list on Google Groups recommended me submitting this as a bug here. However, I think this works as intended. However, I don't know either how to avert this problem, apart from not checking in the .gitignore file (or checking it in under a different name and copying it manually).

This recent thread should be a good starting point:
https://public-inbox.org/git/4C6A1C5B.4030304@workspacewhiz.com/

My reply here I think has an overview of some of the caveats:
https://public-inbox.org/git/871s8qdzph.fsf@evledraar.gmail.com/

tl;dr: Git assumes that a pattern in .gitignore means you don't care
about the contents, since it's meant for *.o and the like. This leads to
data loss in some situations (such as yours).

Various suggestions in that thread of ways forward, but none have
been implemented>

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [RFC PATCH] Introduce "precious" file concept
@ 2018-11-11  9:52   ` Nguyễn Thái Ngọc Duy
  2018-11-11 12:15     ` Bert Wesarg
                       ` (2 more replies)
  0 siblings, 3 replies; 86+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2018-11-11  9:52 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

Since this topic has come up twice recently, I revisited this
"precious" thingy that I started four years ago and tried to see if I
could finally finish it. There are a couple things to be sorted out...

A new attribute "precious" is added to indicate that certain files
have valuable content and should not be easily discarded even if they
are ignored or untracked (*).

So far there are two parts of Git that are made aware of precious
files: "git clean" will leave precious files alone and unpack-trees.c
(i.e. merges and branch switches) will not overwrite
ignored-but-precious files.

Is there any other parts of Git that should be made aware of this
"precious" attribute?

Also while "precious" is a fun name, but it does not sound serious.
Any suggestions? Perhaps "valuable"?

Very lightly tested. The patch is more to have something to discuss
than is bug free and ready to use.

(*) Note that tracked files could be marked "precious" in the future
    too although the exact semantics is not very clear since tracked
    files are by default precious.

    But something like "index log" could use this to record all
    changes to precious files instead of just "git add -p" changes,
    for example. So these files are in a sense more precious than
    other tracked files.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/git-clean.txt     |  3 ++-
 Documentation/gitattributes.txt | 13 +++++++++++++
 attr.c                          |  9 +++++++++
 attr.h                          |  2 ++
 builtin/clean.c                 | 19 ++++++++++++++++---
 unpack-trees.c                  |  3 ++-
 6 files changed, 44 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-clean.txt b/Documentation/git-clean.txt
index 03056dad0d..a9beadfb12 100644
--- a/Documentation/git-clean.txt
+++ b/Documentation/git-clean.txt
@@ -21,7 +21,8 @@ option is specified, ignored files are also removed. This can, for
 example, be useful to remove all build products.
 
 If any optional `<path>...` arguments are given, only those paths
-are affected.
+are affected. Ignored or untracked files with `precious` attributes
+are not removed.
 
 OPTIONS
 -------
diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index b8392fc330..c722479bdc 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -1188,6 +1188,19 @@ If this attribute is not set or has an invalid value, the value of the
 (See linkgit:git-config[1]).
 
 
+Precious files
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+`precious`
+^^^^^^^^^^
+
+This attribute is set on files to indicate that their content is
+valuable. Many commands will behave slightly different on precious
+files. linkgit:git-clean[1] will leave precious files alone. Merging
+and branch switching will not silently overwrite ignored files that
+are marked "precious".
+
+
 USING MACRO ATTRIBUTES
 ----------------------
 
diff --git a/attr.c b/attr.c
index 60d284796d..d06ca0ae4b 100644
--- a/attr.c
+++ b/attr.c
@@ -1186,3 +1186,12 @@ void attr_start(void)
 	pthread_mutex_init(&check_vector.mutex, NULL);
 #endif
 }
+
+int is_precious_file(struct index_state *istate, const char *path)
+{
+	static struct attr_check *check;
+	if (!check)
+		check = attr_check_initl("precious", NULL);
+	git_check_attr(istate, path, check);
+	return check && ATTR_TRUE(check->items[0].value);
+}
diff --git a/attr.h b/attr.h
index b0378bfe5f..b9a9751a66 100644
--- a/attr.h
+++ b/attr.h
@@ -82,4 +82,6 @@ void git_attr_set_direction(enum git_attr_direction new_direction);
 
 void attr_start(void);
 
+int is_precious_file(struct index_state *istate, const char *path);
+
 #endif /* ATTR_H */
diff --git a/builtin/clean.c b/builtin/clean.c
index 8d9a7dc206..9e554448a6 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -17,6 +17,7 @@
 #include "color.h"
 #include "pathspec.h"
 #include "help.h"
+#include "attr.h"
 
 static int force = -1; /* unset */
 static int interactive;
@@ -30,6 +31,8 @@ static const char *const builtin_clean_usage[] = {
 
 static const char *msg_remove = N_("Removing %s\n");
 static const char *msg_would_remove = N_("Would remove %s\n");
+static const char *msg_skip_precious = N_("Skipping precious file %s\n");
+static const char *msg_would_skip_precious = N_("Would skip precious file %s\n");
 static const char *msg_skip_git_dir = N_("Skipping repository %s\n");
 static const char *msg_would_skip_git_dir = N_("Would skip repository %s\n");
 static const char *msg_warn_remove_failed = N_("failed to remove %s");
@@ -152,6 +155,7 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
 	struct dirent *e;
 	int res = 0, ret = 0, gone = 1, original_len = path->len, len;
 	struct string_list dels = STRING_LIST_INIT_DUP;
+	const char *rel_path;
 
 	*dir_gone = 1;
 
@@ -191,9 +195,15 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
 
 		strbuf_setlen(path, len);
 		strbuf_addstr(path, e->d_name);
-		if (lstat(path->buf, &st))
+		if (lstat(path->buf, &st)) {
 			; /* fall thru */
-		else if (S_ISDIR(st.st_mode)) {
+		} else if ((!prefix || skip_prefix(path->buf, prefix, &rel_path)) &&
+			   is_precious_file(&the_index, rel_path)) {
+			quote_path_relative(path->buf, prefix, &quoted);
+			printf(dry_run ? _(msg_would_skip_precious) : _(msg_skip_precious), quoted.buf);
+			*dir_gone = 0;
+			continue;
+		} else if (S_ISDIR(st.st_mode)) {
 			if (remove_dirs(path, prefix, force_flag, dry_run, quiet, &gone))
 				ret = 1;
 			if (gone) {
@@ -1017,7 +1027,10 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 		if (lstat(abs_path.buf, &st))
 			continue;
 
-		if (S_ISDIR(st.st_mode)) {
+		if (is_precious_file(&the_index, item->string)) {
+			qname = quote_path_relative(item->string, NULL, &buf);
+			printf(dry_run ? _(msg_would_skip_precious) : _(msg_skip_precious), qname);
+		} else if (S_ISDIR(st.st_mode)) {
 			if (remove_dirs(&abs_path, prefix, rm_flags, dry_run, quiet, &gone))
 				errors++;
 			if (gone && !quiet) {
diff --git a/unpack-trees.c b/unpack-trees.c
index 7570df481b..d49fe0f77e 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1895,7 +1895,8 @@ static int check_ok_to_remove(const char *name, int len, int dtype,
 		return 0;
 
 	if (o->dir &&
-	    is_excluded(o->dir, o->src_index, name, &dtype))
+	    is_excluded(o->dir, o->src_index, name, &dtype) &&
+	    !is_precious_file(o->src_index, name))
 		/*
 		 * ce->name is explicitly excluded, so it is Ok to
 		 * overwrite it.
-- 
2.19.1.1235.ga92291acdb


^ permalink raw reply related	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-11  9:52   ` [RFC PATCH] Introduce "precious" file concept Nguyễn Thái Ngọc Duy
@ 2018-11-11 12:15     ` Bert Wesarg
  2018-11-11 12:59     ` Junio C Hamano
  2018-11-26 19:38     ` [PATCH v2 0/2] Precios files round two Nguyễn Thái Ngọc Duy
  2 siblings, 0 replies; 86+ messages in thread
From: Bert Wesarg @ 2018-11-11 12:15 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc; +Cc: Git Mailing List

On Sun, Nov 11, 2018 at 10:53 AM Nguyễn Thái Ngọc Duy <pclouds@gmail.com> wrote:
>
> Also while "precious" is a fun name, but it does not sound serious.
> Any suggestions? Perhaps "valuable"?

"precious" is also used by POSIX make:

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/make.html

Bert

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-06 15:12 ` Checkout deleted semi-untracked file Ævar Arnfjörð Bjarmason
  2018-11-11  9:52   ` [RFC PATCH] Introduce "precious" file concept Nguyễn Thái Ngọc Duy
@ 2018-11-11 12:33   ` Ævar Arnfjörð Bjarmason
  2018-11-11 13:06     ` Ævar Arnfjörð Bjarmason
                       ` (2 more replies)
  1 sibling, 3 replies; 86+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-11-11 12:33 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy
  Cc: git, Steffen Jost, Joshua Jensen, Per Lundberg, Junio C Hamano,
	Matthieu Moy, Clemens Buchacher, Holger Hellmuth, Kevin Ballard


[CC-ing some of the people involved in recent threads about this]

On Sun, Nov 11 2018, Nguyễn Thái Ngọc Duy wrote:

> Since this topic has come up twice recently, I revisited this
> "precious" thingy that I started four years ago and tried to see if I
> could finally finish it. There are a couple things to be sorted out...
>
> A new attribute "precious" is added to indicate that certain files
> have valuable content and should not be easily discarded even if they
> are ignored or untracked (*).
>
> So far there are two parts of Git that are made aware of precious
> files: "git clean" will leave precious files alone and unpack-trees.c
> (i.e. merges and branch switches) will not overwrite
> ignored-but-precious files.
>
> Is there any other parts of Git that should be made aware of this
> "precious" attribute?
>
> Also while "precious" is a fun name, but it does not sound serious.
> Any suggestions? Perhaps "valuable"?
>
> Very lightly tested. The patch is more to have something to discuss
> than is bug free and ready to use.
>
> (*) Note that tracked files could be marked "precious" in the future
>     too although the exact semantics is not very clear since tracked
>     files are by default precious.
>
>     But something like "index log" could use this to record all
>     changes to precious files instead of just "git add -p" changes,
>     for example. So these files are in a sense more precious than
>     other tracked files.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>  Documentation/git-clean.txt     |  3 ++-
>  Documentation/gitattributes.txt | 13 +++++++++++++
>  attr.c                          |  9 +++++++++
>  attr.h                          |  2 ++
>  builtin/clean.c                 | 19 ++++++++++++++++---
>  unpack-trees.c                  |  3 ++-
>  6 files changed, 44 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/git-clean.txt b/Documentation/git-clean.txt
> index 03056dad0d..a9beadfb12 100644
> --- a/Documentation/git-clean.txt
> +++ b/Documentation/git-clean.txt
> @@ -21,7 +21,8 @@ option is specified, ignored files are also removed. This can, for
>  example, be useful to remove all build products.
>
>  If any optional `<path>...` arguments are given, only those paths
> -are affected.
> +are affected. Ignored or untracked files with `precious` attributes
> +are not removed.
>
>  OPTIONS
>  -------
> diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
> index b8392fc330..c722479bdc 100644
> --- a/Documentation/gitattributes.txt
> +++ b/Documentation/gitattributes.txt
> @@ -1188,6 +1188,19 @@ If this attribute is not set or has an invalid value, the value of the
>  (See linkgit:git-config[1]).
>
>
> +Precious files
> +~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +`precious`
> +^^^^^^^^^^
> +
> +This attribute is set on files to indicate that their content is
> +valuable. Many commands will behave slightly different on precious
> +files. linkgit:git-clean[1] will leave precious files alone. Merging
> +and branch switching will not silently overwrite ignored files that
> +are marked "precious".
> +
> +
>  USING MACRO ATTRIBUTES
>  ----------------------
>
> diff --git a/attr.c b/attr.c
> index 60d284796d..d06ca0ae4b 100644
> --- a/attr.c
> +++ b/attr.c
> @@ -1186,3 +1186,12 @@ void attr_start(void)
>  	pthread_mutex_init(&check_vector.mutex, NULL);
>  #endif
>  }
> +
> +int is_precious_file(struct index_state *istate, const char *path)
> +{
> +	static struct attr_check *check;
> +	if (!check)
> +		check = attr_check_initl("precious", NULL);
> +	git_check_attr(istate, path, check);
> +	return check && ATTR_TRUE(check->items[0].value);
> +}

If we merge two branches is this using the merged post-image of
.gitattributes as a source?

>  	if (o->dir &&
> -	    is_excluded(o->dir, o->src_index, name, &dtype))
> +	    is_excluded(o->dir, o->src_index, name, &dtype) &&
> +	    !is_precious_file(o->src_index, name))
>  		/*
>  		 * ce->name is explicitly excluded, so it is Ok to
>  		 * overwrite it.

I wonder if instead we should just be reverting c81935348b ("Fix
switching to a branch with D/F when current branch has file D.",
2007-03-15), which these days (haven't dug deeply) would just be this,
right?:

>    diff --git a/unpack-trees.c b/unpack-trees.c
    index 7570df481b..b3efaddd4f 100644
    --- a/unpack-trees.c
    +++ b/unpack-trees.c
    @@ -1894,13 +1894,6 @@ static int check_ok_to_remove(const char *name, int len, int dtype,
     	if (ignore_case && icase_exists(o, name, len, st))
     		return 0;

    -	if (o->dir &&
    -	    is_excluded(o->dir, o->src_index, name, &dtype))
    -		/*
    -		 * ce->name is explicitly excluded, so it is Ok to
    -		 * overwrite it.
    -		 */
    -		return 0;
     	if (S_ISDIR(st->st_mode)) {
     		/*
     		 * We are checking out path "foo" and

Something like the approach you're taking will absolutely work from a
technical standpoint, but I fear that it's going to be useless in
practice.

The users who need protection against git deleting their files the most
are exactly the sort of users who aren't expert-level enough to
understand the nuances of how the semantics of .gitignore and "precious"
are going to interact before git eats their data.

This is pretty apparent from the bug reports we're getting about
this. None of them are:

    "Hey, I 100% understood .gitignore semantics including this one part
    of the docs where you say you'll do this, but just forgot one day
    and deleted my work. Can we get some more safety?"

But rather (with some hyperbole for effect):

    "ZOMG git deleted my file! Is this a bug??"

So I think we should have the inverse of this "precious"
attribute". Just a change to the docs to say that .gitignore doesn't
imply these eager deletion semantics on tree unpacking anymore, and if
users want it back they can define a "garbage" attribute
(s/precious/garbage/).

That will lose no data, and in the very rare cases where a checkout of
tracked files would overwrite an ignored pattern, we can just error out
(as we do with the "Ok to overwrite" branch removed) and tell the user
to delete the files to proceed.

Three tests in our test suite fail with that patch applied, and they're
explicitly testing for exactly the sort of scenario where users are likely to lose data. I.e.:

 1. Open a tracked file in an editor
 2. Save it
 3. Switch to a topic branch, that has different .gitignore semantics
    (e.g. let's say a build/ dir exists there)
 4. Have their work deleted

So actually in writing this out I've become convinced that this
"precious" approach can't work either, because *even if* you're an
expert who manages to perfectly define their .gitignore and "precious"
rules in advance to avoid data deletion, those rules will *also* need to
take into account switching between branches (or even different
histories) where you have other sorts of rules.

So really, if there's ambiguity let's just not delete stuff by default
and ask the user to resolve it.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-11  9:52   ` [RFC PATCH] Introduce "precious" file concept Nguyễn Thái Ngọc Duy
  2018-11-11 12:15     ` Bert Wesarg
@ 2018-11-11 12:59     ` Junio C Hamano
  2018-11-26 19:38     ` [PATCH v2 0/2] Precios files round two Nguyễn Thái Ngọc Duy
  2 siblings, 0 replies; 86+ messages in thread
From: Junio C Hamano @ 2018-11-11 12:59 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> Also while "precious" is a fun name, but it does not sound serious.
> Any suggestions? Perhaps "valuable"?

FWIW, I am reasonably sure that I was the first in Git circle who
used the term "precious" in discussions wrt .gitignore, i.e. "Git
has ignored but not precious category".  Since it was not my
invention but was a borrowed term from tla (aka GNU arch), I'd
suggest to keep using that term, unless there is a strong reason not
to follow longstanding precedent.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-11 12:33   ` [RFC PATCH] Introduce "precious" file concept Ævar Arnfjörð Bjarmason
@ 2018-11-11 13:06     ` Ævar Arnfjörð Bjarmason
  2018-11-12 16:14       ` Duy Nguyen
  2018-11-11 15:41     ` Duy Nguyen
  2018-11-12 23:22     ` brian m. carlson
  2 siblings, 1 reply; 86+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-11-11 13:06 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy
  Cc: git, Steffen Jost, Joshua Jensen, Per Lundberg, Junio C Hamano,
	Matthieu Moy, Clemens Buchacher, Holger Hellmuth, Kevin Ballard


On Sun, Nov 11 2018, Ævar Arnfjörð Bjarmason wrote:

> [CC-ing some of the people involved in recent threads about this]
>
> On Sun, Nov 11 2018, Nguyễn Thái Ngọc Duy wrote:
>
>> Since this topic has come up twice recently, I revisited this
>> "precious" thingy that I started four years ago and tried to see if I
>> could finally finish it. There are a couple things to be sorted out...
>>
>> A new attribute "precious" is added to indicate that certain files
>> have valuable content and should not be easily discarded even if they
>> are ignored or untracked (*).
>>
>> So far there are two parts of Git that are made aware of precious
>> files: "git clean" will leave precious files alone and unpack-trees.c
>> (i.e. merges and branch switches) will not overwrite
>> ignored-but-precious files.
>>
>> Is there any other parts of Git that should be made aware of this
>> "precious" attribute?
>>
>> Also while "precious" is a fun name, but it does not sound serious.
>> Any suggestions? Perhaps "valuable"?
>>
>> Very lightly tested. The patch is more to have something to discuss
>> than is bug free and ready to use.
>>
>> (*) Note that tracked files could be marked "precious" in the future
>>     too although the exact semantics is not very clear since tracked
>>     files are by default precious.
>>
>>     But something like "index log" could use this to record all
>>     changes to precious files instead of just "git add -p" changes,
>>     for example. So these files are in a sense more precious than
>>     other tracked files.
>>
>> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
>> ---
>>  Documentation/git-clean.txt     |  3 ++-
>>  Documentation/gitattributes.txt | 13 +++++++++++++
>>  attr.c                          |  9 +++++++++
>>  attr.h                          |  2 ++
>>  builtin/clean.c                 | 19 ++++++++++++++++---
>>  unpack-trees.c                  |  3 ++-
>>  6 files changed, 44 insertions(+), 5 deletions(-)
>>
>> diff --git a/Documentation/git-clean.txt b/Documentation/git-clean.txt
>> index 03056dad0d..a9beadfb12 100644
>> --- a/Documentation/git-clean.txt
>> +++ b/Documentation/git-clean.txt
>> @@ -21,7 +21,8 @@ option is specified, ignored files are also removed. This can, for
>>  example, be useful to remove all build products.
>>
>>  If any optional `<path>...` arguments are given, only those paths
>> -are affected.
>> +are affected. Ignored or untracked files with `precious` attributes
>> +are not removed.
>>
>>  OPTIONS
>>  -------
>> diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
>> index b8392fc330..c722479bdc 100644
>> --- a/Documentation/gitattributes.txt
>> +++ b/Documentation/gitattributes.txt
>> @@ -1188,6 +1188,19 @@ If this attribute is not set or has an invalid value, the value of the
>>  (See linkgit:git-config[1]).
>>
>>
>> +Precious files
>> +~~~~~~~~~~~~~~~~~~~~~~~~
>> +
>> +`precious`
>> +^^^^^^^^^^
>> +
>> +This attribute is set on files to indicate that their content is
>> +valuable. Many commands will behave slightly different on precious
>> +files. linkgit:git-clean[1] will leave precious files alone. Merging
>> +and branch switching will not silently overwrite ignored files that
>> +are marked "precious".
>> +
>> +
>>  USING MACRO ATTRIBUTES
>>  ----------------------
>>
>> diff --git a/attr.c b/attr.c
>> index 60d284796d..d06ca0ae4b 100644
>> --- a/attr.c
>> +++ b/attr.c
>> @@ -1186,3 +1186,12 @@ void attr_start(void)
>>  	pthread_mutex_init(&check_vector.mutex, NULL);
>>  #endif
>>  }
>> +
>> +int is_precious_file(struct index_state *istate, const char *path)
>> +{
>> +	static struct attr_check *check;
>> +	if (!check)
>> +		check = attr_check_initl("precious", NULL);
>> +	git_check_attr(istate, path, check);
>> +	return check && ATTR_TRUE(check->items[0].value);
>> +}
>
> If we merge two branches is this using the merged post-image of
> .gitattributes as a source?
>
>>  	if (o->dir &&
>> -	    is_excluded(o->dir, o->src_index, name, &dtype))
>> +	    is_excluded(o->dir, o->src_index, name, &dtype) &&
>> +	    !is_precious_file(o->src_index, name))
>>  		/*
>>  		 * ce->name is explicitly excluded, so it is Ok to
>>  		 * overwrite it.
>
> I wonder if instead we should just be reverting c81935348b ("Fix
> switching to a branch with D/F when current branch has file D.",
> 2007-03-15), which these days (haven't dug deeply) would just be this,
> right?:
>
>>    diff --git a/unpack-trees.c b/unpack-trees.c
>     index 7570df481b..b3efaddd4f 100644
>     --- a/unpack-trees.c
>     +++ b/unpack-trees.c
>     @@ -1894,13 +1894,6 @@ static int check_ok_to_remove(const char *name, int len, int dtype,
>      	if (ignore_case && icase_exists(o, name, len, st))
>      		return 0;
>
>     -	if (o->dir &&
>     -	    is_excluded(o->dir, o->src_index, name, &dtype))
>     -		/*
>     -		 * ce->name is explicitly excluded, so it is Ok to
>     -		 * overwrite it.
>     -		 */
>     -		return 0;
>      	if (S_ISDIR(st->st_mode)) {
>      		/*
>      		 * We are checking out path "foo" and
>
> Something like the approach you're taking will absolutely work from a
> technical standpoint, but I fear that it's going to be useless in
> practice.
>
> The users who need protection against git deleting their files the most
> are exactly the sort of users who aren't expert-level enough to
> understand the nuances of how the semantics of .gitignore and "precious"
> are going to interact before git eats their data.
>
> This is pretty apparent from the bug reports we're getting about
> this. None of them are:
>
>     "Hey, I 100% understood .gitignore semantics including this one part
>     of the docs where you say you'll do this, but just forgot one day
>     and deleted my work. Can we get some more safety?"
>
> But rather (with some hyperbole for effect):
>
>     "ZOMG git deleted my file! Is this a bug??"
>
> So I think we should have the inverse of this "precious"
> attribute". Just a change to the docs to say that .gitignore doesn't
> imply these eager deletion semantics on tree unpacking anymore, and if
> users want it back they can define a "garbage" attribute
> (s/precious/garbage/).
>
> That will lose no data, and in the very rare cases where a checkout of
> tracked files would overwrite an ignored pattern, we can just error out
> (as we do with the "Ok to overwrite" branch removed) and tell the user
> to delete the files to proceed.
>
> Three tests in our test suite fail with that patch applied, and they're
> explicitly testing for exactly the sort of scenario where users are likely to lose data. I.e.:
>
>  1. Open a tracked file in an editor
>  2. Save it
>  3. Switch to a topic branch, that has different .gitignore semantics
>     (e.g. let's say a build/ dir exists there)
>  4. Have their work deleted
>
> So actually in writing this out I've become convinced that this
> "precious" approach can't work either, because *even if* you're an
> expert who manages to perfectly define their .gitignore and "precious"
> rules in advance to avoid data deletion, those rules will *also* need to
> take into account switching between branches (or even different
> histories) where you have other sorts of rules.
>
> So really, if there's ambiguity let's just not delete stuff by default
> and ask the user to resolve it.

Here's a patch to implement that (which borrows from some of yours). It
passes all of our tests:

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index b8392fc330..a6cad17899 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -1188,6 +1188,17 @@ If this attribute is not set or has an invalid value, the value of the
 (See linkgit:git-config[1]).


+Trashable files
+~~~~~~~~~~~~~~~
+
+`trashable`
+^^^^^^^^^^
+
+Provides an escape hatch for re-enabling a potentially data destroying
+feature which was enabled by default between Git versions 1.5.2 and
+2.20. See the `NOTES` section of linkgit:gitignore[5] for details.
+
+
 USING MACRO ATTRIBUTES
 ----------------------

diff --git a/Documentation/gitignore.txt b/Documentation/gitignore.txt
index d107daaffd..39c6d5955a 100644
--- a/Documentation/gitignore.txt
+++ b/Documentation/gitignore.txt
@@ -140,6 +140,13 @@ not tracked by Git remain untracked.
 To stop tracking a file that is currently tracked, use
 'git rm --cached'.

+Between Git versions 1.5.2 and 2.20 untracked files or directories
+which were ignored and conflicted with a file about to be checked out
+(e.g. during linkgit:git-checkout[1] or linkgit:git-merge[1]) would be
+deleted. This could lead to loss of user data and is no longer the
+default, See `trashable` in linkgit:gitattributes[5]. for how to
+selectively enable this behavior.
+
 EXAMPLES
 --------

diff --git a/attr.c b/attr.c
index 60d284796d..930af78650 100644
--- a/attr.c
+++ b/attr.c
@@ -1186,3 +1186,12 @@ void attr_start(void)
 	pthread_mutex_init(&check_vector.mutex, NULL);
 #endif
 }
+
+int is_trashable_file(struct index_state *istate, const char *path)
+{
+	static struct attr_check *check;
+	if (!check)
+		check = attr_check_initl("trashable", NULL);
+	git_check_attr(istate, path, check);
+	return check && ATTR_TRUE(check->items[0].value);
+}
diff --git a/attr.h b/attr.h
index b0378bfe5f..ccf4d4e6b5 100644
--- a/attr.h
+++ b/attr.h
@@ -82,4 +82,6 @@ void git_attr_set_direction(enum git_attr_direction new_direction);

 void attr_start(void);

+int is_trashable_file(struct index_state *istate, const char *path);
+
 #endif /* ATTR_H */
diff --git a/t/lib-submodule-update.sh b/t/lib-submodule-update.sh
index 016391723c..d2ceee33d2 100755
--- a/t/lib-submodule-update.sh
+++ b/t/lib-submodule-update.sh
@@ -844,6 +844,8 @@ test_submodule_switch_recursing_with_args () {
 			git branch -t add_sub1 origin/add_sub1 &&
 			: >sub1 &&
 			echo sub1 >.git/info/exclude &&
+			test_must_fail $command add_sub1 &&
+			echo sub1 trashable >.gitattributes &&
 			$command add_sub1 &&
 			test_superproject_content origin/add_sub1 &&
 			test_submodule_content sub1 origin/add_sub1
diff --git a/t/t1004-read-tree-m-u-wf.sh b/t/t1004-read-tree-m-u-wf.sh
index c13578a635..2243cd955e 100755
--- a/t/t1004-read-tree-m-u-wf.sh
+++ b/t/t1004-read-tree-m-u-wf.sh
@@ -63,8 +63,10 @@ test_expect_success 'two-way with incorrect --exclude-per-directory (2)' '
 	fi
 '

-test_expect_success 'two-way clobbering a ignored file' '
+test_expect_success 'two-way keeping a ignored file, trashing a trashable file' '

+	read_tree_u_must_fail -m -u --exclude-per-directory=.gitignore master side &&
+	echo file2 trashable >.gitattributes &&
 	read_tree_u_must_succeed -m -u --exclude-per-directory=.gitignore master side
 '

@@ -106,7 +108,7 @@ test_expect_success 'three-way not clobbering a working tree file' '

 echo >.gitignore file3

-test_expect_success 'three-way not complaining on an untracked file' '
+test_expect_success 'three-way complaining on an untracked file, trashing a trashable file' '

 	git reset --hard &&
 	rm -f file2 subdir/file2 file3 subdir/file3 &&
@@ -114,6 +116,8 @@ test_expect_success 'three-way not complaining on an untracked file' '
 	echo >file3 file three created in master, untracked &&
 	echo >subdir/file3 file three created in master, untracked &&

+	read_tree_u_must_fail -m -u --exclude-per-directory=.gitignore branch-point master side &&
+	echo file3 trashable >.gitattributes &&
 	read_tree_u_must_succeed -m -u --exclude-per-directory=.gitignore branch-point master side
 '

diff --git a/unpack-trees.c b/unpack-trees.c
index 7570df481b..e9a7fb6583 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1895,9 +1895,10 @@ static int check_ok_to_remove(const char *name, int len, int dtype,
 		return 0;

 	if (o->dir &&
-	    is_excluded(o->dir, o->src_index, name, &dtype))
+	    is_excluded(o->dir, o->src_index, name, &dtype) &&
+	    is_trashable_file(o->src_index, name))
 		/*
-		 * ce->name is explicitly excluded, so it is Ok to
+		 * ce->name is explicitly trashable, so it is Ok to
 		 * overwrite it.
 		 */
 		return 0;

^ permalink raw reply related	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-11 12:33   ` [RFC PATCH] Introduce "precious" file concept Ævar Arnfjörð Bjarmason
  2018-11-11 13:06     ` Ævar Arnfjörð Bjarmason
@ 2018-11-11 15:41     ` Duy Nguyen
  2018-11-11 16:55       ` Ævar Arnfjörð Bjarmason
  2018-11-12  7:35       ` Per Lundberg
  2018-11-12 23:22     ` brian m. carlson
  2 siblings, 2 replies; 86+ messages in thread
From: Duy Nguyen @ 2018-11-11 15:41 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git Mailing List, jost, Joshua Jensen, per.lundberg,
	Junio C Hamano, git, Clemens Buchacher, Holger Hellmuth (IKS),
	Kevin Ballard

On Sun, Nov 11, 2018 at 1:33 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> The users who need protection against git deleting their files the most
> are exactly the sort of users who aren't expert-level enough to
> understand the nuances of how the semantics of .gitignore and "precious"
> are going to interact before git eats their data.
>
> This is pretty apparent from the bug reports we're getting about
> this. None of them are:
>
>     "Hey, I 100% understood .gitignore semantics including this one part
>     of the docs where you say you'll do this, but just forgot one day
>     and deleted my work. Can we get some more safety?"
>
> But rather (with some hyperbole for effect):
>
>     "ZOMG git deleted my file! Is this a bug??"
>
> So I think we should have the inverse of this "precious"
> attribute". Just a change to the docs to say that .gitignore doesn't
> imply these eager deletion semantics on tree unpacking anymore, and if
> users want it back they can define a "garbage" attribute
> (s/precious/garbage/).
>
> That will lose no data, and in the very rare cases where a checkout of
> tracked files would overwrite an ignored pattern, we can just error out
> (as we do with the "Ok to overwrite" branch removed) and tell the user
> to delete the files to proceed.

There's also the other side of the coin. If this refuse to overwrite
triggers too often, it can become an annoyance. So far I've seen two
reports of accident overwriting which make me think turning precious
to trashable may be too extreme. Plus ignored files are trashable by
default (or at least by design so far), adding trashable attribute
changes how we handle ignored files quite significantly.
-- 
Duy

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-11 15:41     ` Duy Nguyen
@ 2018-11-11 16:55       ` Ævar Arnfjörð Bjarmason
  2018-11-12  7:35       ` Per Lundberg
  1 sibling, 0 replies; 86+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-11-11 16:55 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Git Mailing List, jost, Joshua Jensen, per.lundberg,
	Junio C Hamano, git, Clemens Buchacher, Holger Hellmuth (IKS),
	Kevin Ballard


On Sun, Nov 11 2018, Duy Nguyen wrote:

> On Sun, Nov 11, 2018 at 1:33 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>> The users who need protection against git deleting their files the most
>> are exactly the sort of users who aren't expert-level enough to
>> understand the nuances of how the semantics of .gitignore and "precious"
>> are going to interact before git eats their data.
>>
>> This is pretty apparent from the bug reports we're getting about
>> this. None of them are:
>>
>>     "Hey, I 100% understood .gitignore semantics including this one part
>>     of the docs where you say you'll do this, but just forgot one day
>>     and deleted my work. Can we get some more safety?"
>>
>> But rather (with some hyperbole for effect):
>>
>>     "ZOMG git deleted my file! Is this a bug??"
>>
>> So I think we should have the inverse of this "precious"
>> attribute". Just a change to the docs to say that .gitignore doesn't
>> imply these eager deletion semantics on tree unpacking anymore, and if
>> users want it back they can define a "garbage" attribute
>> (s/precious/garbage/).
>>
>> That will lose no data, and in the very rare cases where a checkout of
>> tracked files would overwrite an ignored pattern, we can just error out
>> (as we do with the "Ok to overwrite" branch removed) and tell the user
>> to delete the files to proceed.
>
> There's also the other side of the coin. If this refuse to overwrite
> triggers too often, it can become an annoyance. So far I've seen two
> reports of accident overwriting which make me think turning precious
> to trashable may be too extreme. Plus ignored files are trashable by
> default (or at least by design so far), adding trashable attribute
> changes how we handle ignored files quite significantly.

Yeah I'm not trying to make the argument that we should just go with
these user bug reports, clearly that's just going to give us selection
bias and we could continue to flip between the two behaviors with that
approach. Just that an advanced opt-in feature to prevent dataloss will
not prevent it in practice.

Is taking my patch the right thing? I don't know. I'm leaning in that
direction, but more making a devil's advocate argument to see if anyone
finds good cases that'll demonstrate how it's bad. I haven't read/seen
them so far, and the test suite didn't have any.

I did go through the list archives as Junio suggested in
https://public-inbox.org/git/7viq39avay.fsf@alter.siamese.dyndns.org/
and found these two:
https://public-inbox.org/git/?q=d%3A20070301..20070331+verify_absent

It seems to me that the reason we ended up with this behavior is a bug
report from Shawn that was describing a similar but not quite the same
problem:

    "[...]a bug in read-tree -m that prevents him from switching
    branches when the type of a path changes between a directory and a
    file.[...]"

That's not the same as when a now-tracked file clobbers a .gitignored
file. As far as I can tell (but may not have read carefully enough) that
wasn't a problem anyone reported, but was changed while fixing another
bug in c81935348b ("Fix switching to a branch with D/F when current
branch has file D.", 2007-03-15).

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-11 15:41     ` Duy Nguyen
  2018-11-11 16:55       ` Ævar Arnfjörð Bjarmason
@ 2018-11-12  7:35       ` Per Lundberg
  2018-11-12  9:08         ` Matthieu Moy
  1 sibling, 1 reply; 86+ messages in thread
From: Per Lundberg @ 2018-11-12  7:35 UTC (permalink / raw)
  To: Duy Nguyen, Ævar Arnfjörð Bjarmason
  Cc: Git Mailing List, jost@tcs.ifi.lmu.de, Joshua Jensen,
	Junio C Hamano, git@matthieu-moy.fr, Clemens Buchacher,
	Holger Hellmuth (IKS), Kevin Ballard

On 11/11/18 5:41 PM, Duy Nguyen wrote:
> On Sun, Nov 11, 2018 at 1:33 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
 >
>> That will lose no data, and in the very rare cases where a checkout of
>> tracked files would overwrite an ignored pattern, we can just error out
>> (as we do with the "Ok to overwrite" branch removed) and tell the user
>> to delete the files to proceed. 
> There's also the other side of the coin. If this refuse to overwrite
> triggers too often, it can become an annoyance.

Sure, but doing "git checkout -f some_ref" instead of "git checkout 
some_ref" isn't really _that_ annoying, is it? I think, people (because 
of not having read/studied the .gitignore semantics well enough) having 
their files being overwritten _without realizing it_ is a bigger danger. 
But obviously there is a bit of treading a thin line here.

If we feel thrashable is stretching it too far (which I don't think it 
is), we could add a "core.ignore_files_are_trashable" setting that 
brings back the old semantics, for those who have a strong feeling about it.

It's also quite possible to do it the other way around - i.e. set 
"core.ignore_files_are_trashable" to true by default, and let the "new" 
behavior be opt-in. However, this might "miss the mark" in that those 
people who would really benefit from the new semantics might miss this 
setting, just like they could risk missing the "precious" setting.

(I also think "trashable" sounds better and is more clear & precise than 
"precious", for whatever that is worth.)
--
Per Lundberg

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-12  7:35       ` Per Lundberg
@ 2018-11-12  9:08         ` Matthieu Moy
  2018-11-12  9:49           ` Ævar Arnfjörð Bjarmason
  2018-11-12 16:07           ` Duy Nguyen
  0 siblings, 2 replies; 86+ messages in thread
From: Matthieu Moy @ 2018-11-12  9:08 UTC (permalink / raw)
  To: Per Lundberg
  Cc: Duy Nguyen, Ævar Arnfjörð Bjarmason,
	Git Mailing List, jost, Joshua Jensen, Junio C Hamano,
	Clemens Buchacher, Holger Hellmuth (IKS), Kevin Ballard

"Per Lundberg" <per.lundberg@hibox.tv> wrote:

> On 11/11/18 5:41 PM, Duy Nguyen wrote:
> > On Sun, Nov 11, 2018 at 1:33 PM Ævar Arnfjörð Bjarmason
> > <avarab@gmail.com> wrote:
>
> >> That will lose no data, and in the very rare cases where a checkout of
> >> tracked files would overwrite an ignored pattern, we can just error out
> >> (as we do with the "Ok to overwrite" branch removed) and tell the user
> >> to delete the files to proceed.
> > There's also the other side of the coin. If this refuse to overwrite
> > triggers too often, it can become an annoyance.

I may have missed some cases, but to me the cases when checkout may try
to overwrite an ignored file are essentially:

* Someone "git add"ed a file meant to be ignored by mistake (e.g.
  "git add -f *.o").

* A file that was meant to be kept private (e.g. config.mak.dev) ends
  up being tracked. This may happen when we find a way to make per-developer
  settings the same for everyone.

I both cases I'd want at least to be notified that something is going on,
and in the second I'd probably want to keep my local file around.

> If we feel thrashable is stretching it too far (which I don't think it
> is), we could add a "core.ignore_files_are_trashable" setting that
> brings back the old semantics, for those who have a strong feeling about it.

May I remind an idea I sugested in an old thread: add an intermediate level
where ignored files to be overwritten are renamed (eg. foo -> foo~ like Emacs'
backup files):

https://public-inbox.org/git/vpqd3t9656k.fsf@bauges.imag.fr/

One advantage of the "rename" behavior is that it's safer that the current,
but still not very disturbing for people who like the current behavior. This
makes it a good candidate for a default behavior.

This could come in complement with this thread's "precious" concept:

* If you know what you're doing and know that such or such file is precious,
  mark it as such and Git will never overwrite it.

* If you don't know about precious files, just keep the default setting and
  the worse that can happen is to get your file overwritten with a bakup
  of the old version kept around.

This would probably play better with a notion of "precious" files than with
a notion of "trashable" files.

-- 
Matthieu Moy
https://matthieu-moy.fr/

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-12  9:08         ` Matthieu Moy
@ 2018-11-12  9:49           ` Ævar Arnfjörð Bjarmason
  2018-11-12 10:26             ` Junio C Hamano
  2018-11-12 16:07           ` Duy Nguyen
  1 sibling, 1 reply; 86+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-11-12  9:49 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Per Lundberg, Duy Nguyen, Git Mailing List, jost, Joshua Jensen,
	Junio C Hamano, Clemens Buchacher, Holger Hellmuth (IKS),
	Kevin Ballard


On Mon, Nov 12 2018, Matthieu Moy wrote:

> "Per Lundberg" <per.lundberg@hibox.tv> wrote:
>
>> On 11/11/18 5:41 PM, Duy Nguyen wrote:
>> > On Sun, Nov 11, 2018 at 1:33 PM Ævar Arnfjörð Bjarmason
>> > <avarab@gmail.com> wrote:
>>
>> >> That will lose no data, and in the very rare cases where a checkout of
>> >> tracked files would overwrite an ignored pattern, we can just error out
>> >> (as we do with the "Ok to overwrite" branch removed) and tell the user
>> >> to delete the files to proceed.
>> > There's also the other side of the coin. If this refuse to overwrite
>> > triggers too often, it can become an annoyance.
>
> I may have missed some cases, but to me the cases when checkout may try
> to overwrite an ignored file are essentially:
>
> * Someone "git add"ed a file meant to be ignored by mistake (e.g.
>   "git add -f *.o").
>
> * A file that was meant to be kept private (e.g. config.mak.dev) ends
>   up being tracked. This may happen when we find a way to make per-developer
>   settings the same for everyone.

Yes, the cases under discussion here are all cases where a tracked file
clobbers a file matching a pattern in in .gitignore.

What I'd add to your list is:

* Some projects (I've seen this in the wild) add e.g. *.mp3 or whatever
  else usually doesn't belong in the repo as a "soft ignore". This is
  something we've never recommended, but have implicitly supported since
  the only caveats are a) you need a one-off "git add -f" and then
  they're tracked b) them being missing from "status" before being
  tracked c) the issue under discussion here.

* Cases where e.g. filename changes to a directory or globs because of
  that make this more complex.

> I both cases I'd want at least to be notified that something is going on,
> and in the second I'd probably want to keep my local file around.
>> If we feel thrashable is stretching it too far (which I don't think it
>> is), we could add a "core.ignore_files_are_trashable" setting that
>> brings back the old semantics, for those who have a strong feeling about it.
>
> May I remind an idea I sugested in an old thread: add an intermediate level
> where ignored files to be overwritten are renamed (eg. foo -> foo~ like Emacs'
> backup files):
>
> https://public-inbox.org/git/vpqd3t9656k.fsf@bauges.imag.fr/
>
> One advantage of the "rename" behavior is that it's safer that the current,
> but still not very disturbing for people who like the current behavior. This
> makes it a good candidate for a default behavior.
>
> This could come in complement with this thread's "precious" concept:
>
> * If you know what you're doing and know that such or such file is precious,
>   mark it as such and Git will never overwrite it.
>
> * If you don't know about precious files, just keep the default setting and
>   the worse that can happen is to get your file overwritten with a bakup
>   of the old version kept around.
>
> This would probably play better with a notion of "precious" files than with
> a notion of "trashable" files.

I used to think this foo -> foo~ approach made the most sense (and said
as much in
https://public-inbox.org/git/871s8qdzph.fsf@evledraar.gmail.com/) but I
think it's probably best not to do it and just error out, because:

 * We'd still need to handle the cases where "tests" the file collides
   with "tests" the directory. Then where do we move the colliding file?
   ~/.git/lost+found/* ? We could handle the subdir case with another
   special-case though...

 * I think such silent action will just leave users more confused, and
   in many cases (e.g. a merge) whatever message we print out will be
   missed in a deluge of other messaging, and they'll probably miss it.

   I'd like to avoid a case where a bunch of *~ files get committed
   because the user's workflow is (and some beginner git users do this):

       git pull && git add . && git commit && git push

   As the "pull" would now invoke a merge that would do this rename.

 * If I have the "foo" file open in my editor (a plausible way to run
   into this) I switch to another terminal, do the merge, miss the
   message, then re-save "foo". Now I have both "foo" and "foo~"
   on-disk. Another case where we should just refuse until the user
   resolves the situation to avoid the confusion.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-12  9:49           ` Ævar Arnfjörð Bjarmason
@ 2018-11-12 10:26             ` Junio C Hamano
  2018-11-12 12:45               ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 86+ messages in thread
From: Junio C Hamano @ 2018-11-12 10:26 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Matthieu Moy, Per Lundberg, Duy Nguyen, Git Mailing List, jost,
	Joshua Jensen, Clemens Buchacher, Holger Hellmuth (IKS),
	Kevin Ballard

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> What I'd add to your list is:
>
> * Some projects (I've seen this in the wild) add e.g. *.mp3 or whatever
>   else usually doesn't belong in the repo as a "soft ignore". This is
>   something we've never recommended, but have implicitly supported since
>   the only caveats are a) you need a one-off "git add -f" and then
>   they're tracked b) them being missing from "status" before being
>   tracked c) the issue under discussion here.

Or only selected "*.o" (vendor supplied binary blob) kept tracked
while everything else is built from the source.

I do not know who you are referring to "we" in your sentence, but as
far as I am concerned, it has been and still is a BCP recommendation
on this list to deal with a case like that.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-12 10:26             ` Junio C Hamano
@ 2018-11-12 12:45               ` Ævar Arnfjörð Bjarmason
  2018-11-12 13:02                 ` Junio C Hamano
  0 siblings, 1 reply; 86+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-11-12 12:45 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Matthieu Moy, Per Lundberg, Duy Nguyen, Git Mailing List, jost,
	Joshua Jensen, Clemens Buchacher, Holger Hellmuth (IKS),
	Kevin Ballard


On Mon, Nov 12 2018, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> What I'd add to your list is:
>>
>> * Some projects (I've seen this in the wild) add e.g. *.mp3 or whatever
>>   else usually doesn't belong in the repo as a "soft ignore". This is
>>   something we've never recommended, but have implicitly supported since
>>   the only caveats are a) you need a one-off "git add -f" and then
>>   they're tracked b) them being missing from "status" before being
>>   tracked c) the issue under discussion here.
>
> Or only selected "*.o" (vendor supplied binary blob) kept tracked
> while everything else is built from the source.
>
> I do not know who you are referring to "we" in your sentence, but as
> far as I am concerned, it has been and still is a BCP recommendation
> on this list to deal with a case like that.

I mean that this use-case of having a "soft" ignore by carrying it
across the "git add" barrier with a one-off "-f" isn't something
explicitly documented, and apparently not something many
expect. I.e. you / Matthieu have mentioned .gitignore in the past for
only-generated *.o use-case.

But it also does get used for "mostly we don't want this file, but
sometimes we do" use-case, so that's something we need to deal with in
practice. Like many workflows in git it's not something that was forseen
or intended, but does happen in the wild.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-12 12:45               ` Ævar Arnfjörð Bjarmason
@ 2018-11-12 13:02                 ` Junio C Hamano
  0 siblings, 0 replies; 86+ messages in thread
From: Junio C Hamano @ 2018-11-12 13:02 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Matthieu Moy, Per Lundberg, Duy Nguyen, Git Mailing List, jost,
	Joshua Jensen, Clemens Buchacher, Holger Hellmuth (IKS),
	Kevin Ballard

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> Or only selected "*.o" (vendor supplied binary blob) kept tracked
>> while everything else is built from the source.
>> ...
> But it also does get used for "mostly we don't want this file, but
> sometimes we do" use-case, so that's something we need to deal with in
> practice.

Exactly.  "Mostly we don't want *.o as we prefer to build from the
source, but we have only object files for some selected ones" is an
often cited use case where it is the BCP to have *.o in .gitignore
and use "add -f" to add the "selected" ones initially.


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-12  9:08         ` Matthieu Moy
  2018-11-12  9:49           ` Ævar Arnfjörð Bjarmason
@ 2018-11-12 16:07           ` Duy Nguyen
  1 sibling, 0 replies; 86+ messages in thread
From: Duy Nguyen @ 2018-11-12 16:07 UTC (permalink / raw)
  To: git
  Cc: per.lundberg, Ævar Arnfjörð Bjarmason,
	Git Mailing List, jost, Joshua Jensen, Junio C Hamano,
	Clemens Buchacher, Holger Hellmuth (IKS), Kevin Ballard

On Mon, Nov 12, 2018 at 10:09 AM Matthieu Moy <git@matthieu-moy.fr> wrote:
> May I remind an idea I sugested in an old thread: add an intermediate level
> where ignored files to be overwritten are renamed (eg. foo -> foo~ like Emacs'
> backup files):
>
> https://public-inbox.org/git/vpqd3t9656k.fsf@bauges.imag.fr/
>
> One advantage of the "rename" behavior is that it's safer that the current,
> but still not very disturbing for people who like the current behavior. This
> makes it a good candidate for a default behavior.

I have something else in the bag that does something like this. The
idea is that we go ahead and do destructive things but we let the user
undo.

Some more background in [1] but basically we hash "every" change and
store in the object database (in this case we store "foo" content
before overwriting it). We maintain a list of these hashes so that
undo is possible, but of course we don't keep infinite change history,
eventually too old changes will be pruned. [1] talks about index
changes (e.g. "git add -p") but it could apply to worktree changes as
well (and I'm also eyeing $GIT_DIR/config changes).

The upside: a similar undo mechanism that works for more than just
this case and it allows undoing multiple times while foo~ only allow
once. The downside: hashing is definitely heavier than renaming foo to
foo~. So this will feature be opt-in in most cases. But for
"dangerous" overwrite like this case, I think we value the file
content more and make it opt-out.

[1] https://public-inbox.org/git/CACsJy8A3QCYY6QeJQYkbCKYh=7Q7pj=rer_OQHLGoAMqTNomNA@mail.gmail.com/
-- 
Duy

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-11 13:06     ` Ævar Arnfjörð Bjarmason
@ 2018-11-12 16:14       ` Duy Nguyen
  0 siblings, 0 replies; 86+ messages in thread
From: Duy Nguyen @ 2018-11-12 16:14 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git Mailing List, jost, Joshua Jensen, per.lundberg,
	Junio C Hamano, git, Clemens Buchacher, Holger Hellmuth (IKS),
	Kevin Ballard

On Sun, Nov 11, 2018 at 2:06 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> +Trashable files
> +~~~~~~~~~~~~~~~
> +
> +`trashable`
> +^^^^^^^^^^
> +
> +Provides an escape hatch for re-enabling a potentially data destroying
> +feature which was enabled by default between Git versions 1.5.2 and
> +2.20. See the `NOTES` section of linkgit:gitignore[5] for details.

How does this interact with "git clean -x"? Most ignored files will
not have trashable attribute, so we don't remove any of them? Making
"git clean" completely ignore this attribute is also possible, I
guess, if we rename it somehow to avoid confusion.
-- 
Duy

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-11 12:33   ` [RFC PATCH] Introduce "precious" file concept Ævar Arnfjörð Bjarmason
  2018-11-11 13:06     ` Ævar Arnfjörð Bjarmason
  2018-11-11 15:41     ` Duy Nguyen
@ 2018-11-12 23:22     ` brian m. carlson
  2018-11-26  9:30       ` Per Lundberg
  2018-11-26 16:02       ` Eckhard Maaß
  2 siblings, 2 replies; 86+ messages in thread
From: brian m. carlson @ 2018-11-12 23:22 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Nguyễn Thái Ngọc Duy, git, Steffen Jost,
	Joshua Jensen, Per Lundberg, Junio C Hamano, Matthieu Moy,
	Clemens Buchacher, Holger Hellmuth, Kevin Ballard

[-- Attachment #1: Type: text/plain, Size: 1923 bytes --]

On Sun, Nov 11, 2018 at 01:33:44PM +0100, Ævar Arnfjörð Bjarmason wrote:
> The users who need protection against git deleting their files the most
> are exactly the sort of users who aren't expert-level enough to
> understand the nuances of how the semantics of .gitignore and "precious"
> are going to interact before git eats their data.
> 
> This is pretty apparent from the bug reports we're getting about
> this. None of them are:
> 
>     "Hey, I 100% understood .gitignore semantics including this one part
>     of the docs where you say you'll do this, but just forgot one day
>     and deleted my work. Can we get some more safety?"
> 
> But rather (with some hyperbole for effect):
> 
>     "ZOMG git deleted my file! Is this a bug??"
> 
> So I think we should have the inverse of this "precious"
> attribute". Just a change to the docs to say that .gitignore doesn't
> imply these eager deletion semantics on tree unpacking anymore, and if
> users want it back they can define a "garbage" attribute
> (s/precious/garbage/).
> 
> That will lose no data, and in the very rare cases where a checkout of
> tracked files would overwrite an ignored pattern, we can just error out
> (as we do with the "Ok to overwrite" branch removed) and tell the user
> to delete the files to proceed.

This is going to totally hose automation.  My last job had files which
might move from tracked to untracked (a file that had become generated),
and long-running CI and build systems would need to be able to check out
one status and switch to the other.  Your proposed change will prevent
those systems from working, whereas they previously did.

I agree that your proposal would have been a better design originally,
but breaking the way automated systems currently work is probably going
to be a dealbreaker.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-12 23:22     ` brian m. carlson
@ 2018-11-26  9:30       ` Per Lundberg
  2018-11-26 10:28         ` Ævar Arnfjörð Bjarmason
                           ` (2 more replies)
  2018-11-26 16:02       ` Eckhard Maaß
  1 sibling, 3 replies; 86+ messages in thread
From: Per Lundberg @ 2018-11-26  9:30 UTC (permalink / raw)
  To: brian m. carlson, git@vger.kernel.org, Steffen Jost,
	Joshua Jensen, Junio C Hamano, Matthieu Moy, Clemens Buchacher,
	Holger Hellmuth, Kevin Ballard
  Cc: Ævar Arnfjörð Bjarmason,
	Nguyễn Thái Ngọc Duy

On 11/13/18 1:22 AM, brian m. carlson wrote:
> This is going to totally hose automation.  My last job had files which
> might move from tracked to untracked (a file that had become generated),
> and long-running CI and build systems would need to be able to check out
> one status and switch to the other.  Your proposed change will prevent
> those systems from working, whereas they previously did.
> 
> I agree that your proposal would have been a better design originally,
> but breaking the way automated systems currently work is probably going
> to be a dealbreaker.

How about something like this:

1. Introduce a concept with "garbage" files, which git is "permitted to 
delete" without prompting.

2. Retain the current default, i.e. "ignored files are garbage" for now, 
making the new behavior _opt in_ to avoid breaking automated 
systems/existing scripts for anyone. Put the setting for this behind a 
new core.* config flag.

3. In the plan for version 3.0 (a new major version where some breakage 
can be tolerable, according to Semantic Versioning), change the default 
so that "only explicit garbage is garbage". Include very clear notices 
of this in the release notes. The config flag is retained, but its 
default changes from true->false or vice versa. People who dislike the 
new behavior can easily change back to the 2.x semantics.

Would this be a reasonable compromise for everybody?
-- 
Per Lundberg


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-26  9:30       ` Per Lundberg
@ 2018-11-26 10:28         ` Ævar Arnfjörð Bjarmason
  2018-11-26 12:49         ` Junio C Hamano
  2018-11-26 15:26         ` Duy Nguyen
  2 siblings, 0 replies; 86+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-11-26 10:28 UTC (permalink / raw)
  To: Per Lundberg
  Cc: brian m. carlson, git@vger.kernel.org, Steffen Jost,
	Joshua Jensen, Junio C Hamano, Matthieu Moy, Clemens Buchacher,
	Holger Hellmuth, Kevin Ballard,
	Nguyễn Thái Ngọc Duy


On Mon, Nov 26 2018, Per Lundberg wrote:

> On 11/13/18 1:22 AM, brian m. carlson wrote:
>> This is going to totally hose automation.  My last job had files which
>> might move from tracked to untracked (a file that had become generated),
>> and long-running CI and build systems would need to be able to check out
>> one status and switch to the other.  Your proposed change will prevent
>> those systems from working, whereas they previously did.
>>
>> I agree that your proposal would have been a better design originally,
>> but breaking the way automated systems currently work is probably going
>> to be a dealbreaker.
>
> How about something like this:
>
> 1. Introduce a concept with "garbage" files, which git is "permitted to
> delete" without prompting.
>
> 2. Retain the current default, i.e. "ignored files are garbage" for now,
> making the new behavior _opt in_ to avoid breaking automated
> systems/existing scripts for anyone. Put the setting for this behind a
> new core.* config flag.
>
> 3. In the plan for version 3.0 (a new major version where some breakage
> can be tolerable, according to Semantic Versioning), change the default
> so that "only explicit garbage is garbage". Include very clear notices
> of this in the release notes. The config flag is retained, but its
> default changes from true->false or vice versa. People who dislike the
> new behavior can easily change back to the 2.x semantics.
>
> Would this be a reasonable compromise for everybody?

Possibly, but I think there's an earlier step zero there for anyone
interested in pursuing this (and currently I can't make time for it),
which is to submit a patch with tests and documentation showing exactly
the sort of scenarios where we clobber or don't clobber existing files.

As my https://public-inbox.org/git/87zhuf3gs0.fsf@evledraar.gmail.com/
shows we have tests for this, but they're not explicit, and some want to
test some unrelated thing.

I.e. to test the cases where we clobber foo.c because foo.c now
explicitly exists, or cases where dir/foo.c is clobbered because "dir"
is now a tracked text file etc., are those the only two cases? I vaguely
suspect that there were other interesting cases, but at this point the
information has been paged out of the working set of my wetware. The
thread at
https://public-inbox.org/git/87o9au39s7.fsf@evledraar.gmail.com/ has
some notes about this.

Then as noted in
https://public-inbox.org/git/87wopj3661.fsf@evledraar.gmail.com/ the
reason we have this behavior seems to be something that grew organically
from a semi-related bugfix.

So I don't think we're at a point where we're all dug into our trenches
and some people want X and others want Y and in the name of backwards
compatibility we're going to stay with X. It may turn out that we just
want to retain 10% of X, and can get 99% of the safety of Y by doing
that.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-26  9:30       ` Per Lundberg
  2018-11-26 10:28         ` Ævar Arnfjörð Bjarmason
@ 2018-11-26 12:49         ` Junio C Hamano
  2018-11-27 15:08           ` Ævar Arnfjörð Bjarmason
  2018-11-26 15:26         ` Duy Nguyen
  2 siblings, 1 reply; 86+ messages in thread
From: Junio C Hamano @ 2018-11-26 12:49 UTC (permalink / raw)
  To: Per Lundberg
  Cc: brian m. carlson, git@vger.kernel.org, Steffen Jost,
	Joshua Jensen, Matthieu Moy, Clemens Buchacher, Holger Hellmuth,
	Kevin Ballard, Ævar Arnfjörð Bjarmason,
	Nguyễn Thái Ngọc Duy

Per Lundberg <per.lundberg@hibox.tv> writes:

> How about something like this:
> ...
> Would this be a reasonable compromise for everybody?

I do not think you'd need to introduce such a deliberately breaking
change at all.  Just introduce a new "precious" class, perhaps mark
them with the atttribute mechanism, and that would be the endgame.
Early adopters would start marking ignored but not expendable paths
with the "precious" attribute and they won't be clobbered.  As the
mechanism becomes widely known and mature, more and more people use
it.  And even after that happens, early adopters do not have to change
any attribute setting, and late adopters would have plenty of examples
to imitate.  Those who do not need any "precious" class do not have
to do anything and you won't break any existing automation that way.




^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-26  9:30       ` Per Lundberg
  2018-11-26 10:28         ` Ævar Arnfjörð Bjarmason
  2018-11-26 12:49         ` Junio C Hamano
@ 2018-11-26 15:26         ` Duy Nguyen
  2018-11-26 15:34           ` Ævar Arnfjörð Bjarmason
  2 siblings, 1 reply; 86+ messages in thread
From: Duy Nguyen @ 2018-11-26 15:26 UTC (permalink / raw)
  To: per.lundberg
  Cc: brian m. carlson, Git Mailing List, jost, Joshua Jensen,
	Junio C Hamano, git, Clemens Buchacher, Holger Hellmuth (IKS),
	Kevin Ballard, Ævar Arnfjörð Bjarmason

On Mon, Nov 26, 2018 at 10:30 AM Per Lundberg <per.lundberg@hibox.tv> wrote:
>
> On 11/13/18 1:22 AM, brian m. carlson wrote:
> > This is going to totally hose automation.  My last job had files which
> > might move from tracked to untracked (a file that had become generated),
> > and long-running CI and build systems would need to be able to check out
> > one status and switch to the other.  Your proposed change will prevent
> > those systems from working, whereas they previously did.
> >
> > I agree that your proposal would have been a better design originally,
> > but breaking the way automated systems currently work is probably going
> > to be a dealbreaker.
>
> How about something like this:
>
> 1. Introduce a concept with "garbage" files, which git is "permitted to
> delete" without prompting.
>
> 2. Retain the current default, i.e. "ignored files are garbage" for now,
> making the new behavior _opt in_ to avoid breaking automated
> systems/existing scripts for anyone. Put the setting for this behind a
> new core.* config flag.
>
> 3. In the plan for version 3.0 (a new major version where some breakage
> can be tolerable, according to Semantic Versioning), change the default
> so that "only explicit garbage is garbage". Include very clear notices
> of this in the release notes. The config flag is retained, but its
> default changes from true->false or vice versa. People who dislike the
> new behavior can easily change back to the 2.x semantics.

How does this garbage thing interact with "git clean -x"? My
interpretation of this flag/attribute is that at version 3.0 by
default all ignored files are _not_ garbage, so "git clean -x" should
not remove any of them. Which is weird because most of ignored files
are like *.o that should be removed.

I also need to mark "precious" on untracked or even tracked files (*).
Not sure how this "garbage" attribute interacts with that.

(*) I was hoping I could get the idea [1] implemented in somewhat good
shape before presenting here. But I'm a bit slow on that front. So
yeah this "precious" on untracked/tracked thingy may be even
irrelevant if the patch series will be rejected.

[1] https://public-inbox.org/git/CACsJy8C3rOFv0kQeJrWufQQzbnfU4mSxJtphEYBGMmrroFFN-A@mail.gmail.com/

> Would this be a reasonable compromise for everybody?
-- 
Duy

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-26 15:26         ` Duy Nguyen
@ 2018-11-26 15:34           ` Ævar Arnfjörð Bjarmason
  2018-11-26 15:40             ` Duy Nguyen
  0 siblings, 1 reply; 86+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-11-26 15:34 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: per.lundberg, brian m. carlson, Git Mailing List, jost,
	Joshua Jensen, Junio C Hamano, git, Clemens Buchacher,
	Holger Hellmuth (IKS), Kevin Ballard


On Mon, Nov 26 2018, Duy Nguyen wrote:

> On Mon, Nov 26, 2018 at 10:30 AM Per Lundberg <per.lundberg@hibox.tv> wrote:
>>
>> On 11/13/18 1:22 AM, brian m. carlson wrote:
>> > This is going to totally hose automation.  My last job had files which
>> > might move from tracked to untracked (a file that had become generated),
>> > and long-running CI and build systems would need to be able to check out
>> > one status and switch to the other.  Your proposed change will prevent
>> > those systems from working, whereas they previously did.
>> >
>> > I agree that your proposal would have been a better design originally,
>> > but breaking the way automated systems currently work is probably going
>> > to be a dealbreaker.
>>
>> How about something like this:
>>
>> 1. Introduce a concept with "garbage" files, which git is "permitted to
>> delete" without prompting.
>>
>> 2. Retain the current default, i.e. "ignored files are garbage" for now,
>> making the new behavior _opt in_ to avoid breaking automated
>> systems/existing scripts for anyone. Put the setting for this behind a
>> new core.* config flag.
>>
>> 3. In the plan for version 3.0 (a new major version where some breakage
>> can be tolerable, according to Semantic Versioning), change the default
>> so that "only explicit garbage is garbage". Include very clear notices
>> of this in the release notes. The config flag is retained, but its
>> default changes from true->false or vice versa. People who dislike the
>> new behavior can easily change back to the 2.x semantics.
>
> How does this garbage thing interact with "git clean -x"? My
> interpretation of this flag/attribute is that at version 3.0 by
> default all ignored files are _not_ garbage, so "git clean -x" should
> not remove any of them. Which is weird because most of ignored files
> are like *.o that should be removed.
>
> I also need to mark "precious" on untracked or even tracked files (*).
> Not sure how this "garbage" attribute interacts with that.
>
> (*) I was hoping I could get the idea [1] implemented in somewhat good
> shape before presenting here. But I'm a bit slow on that front. So
> yeah this "precious" on untracked/tracked thingy may be even
> irrelevant if the patch series will be rejected.

I think a garbage (or trashable) flag, if implemented, wouldn't need any
special case in git-clean, i.e. -x would remove all untracked files,
whether ignored or garbage/trashable. That's what my patch to implement
it does:
https://public-inbox.org/git/87zhuf3gs0.fsf@evledraar.gmail.com/

I think that makes sense. Users running "git clean" have "--dry-run" and
unlike "checkout a branch" or "merge this commit" where we'll now shred
data implicitly it's obvious that git-clean is going to shred your data.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-26 15:34           ` Ævar Arnfjörð Bjarmason
@ 2018-11-26 15:40             ` Duy Nguyen
  2018-11-26 15:47               ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 86+ messages in thread
From: Duy Nguyen @ 2018-11-26 15:40 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: per.lundberg, brian m. carlson, Git Mailing List, jost,
	Joshua Jensen, Junio C Hamano, git, Clemens Buchacher,
	Holger Hellmuth (IKS), Kevin Ballard

On Mon, Nov 26, 2018 at 4:34 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Mon, Nov 26 2018, Duy Nguyen wrote:
>
> > On Mon, Nov 26, 2018 at 10:30 AM Per Lundberg <per.lundberg@hibox.tv> wrote:
> >>
> >> On 11/13/18 1:22 AM, brian m. carlson wrote:
> >> > This is going to totally hose automation.  My last job had files which
> >> > might move from tracked to untracked (a file that had become generated),
> >> > and long-running CI and build systems would need to be able to check out
> >> > one status and switch to the other.  Your proposed change will prevent
> >> > those systems from working, whereas they previously did.
> >> >
> >> > I agree that your proposal would have been a better design originally,
> >> > but breaking the way automated systems currently work is probably going
> >> > to be a dealbreaker.
> >>
> >> How about something like this:
> >>
> >> 1. Introduce a concept with "garbage" files, which git is "permitted to
> >> delete" without prompting.
> >>
> >> 2. Retain the current default, i.e. "ignored files are garbage" for now,
> >> making the new behavior _opt in_ to avoid breaking automated
> >> systems/existing scripts for anyone. Put the setting for this behind a
> >> new core.* config flag.
> >>
> >> 3. In the plan for version 3.0 (a new major version where some breakage
> >> can be tolerable, according to Semantic Versioning), change the default
> >> so that "only explicit garbage is garbage". Include very clear notices
> >> of this in the release notes. The config flag is retained, but its
> >> default changes from true->false or vice versa. People who dislike the
> >> new behavior can easily change back to the 2.x semantics.
> >
> > How does this garbage thing interact with "git clean -x"? My
> > interpretation of this flag/attribute is that at version 3.0 by
> > default all ignored files are _not_ garbage, so "git clean -x" should
> > not remove any of them. Which is weird because most of ignored files
> > are like *.o that should be removed.
> >
> > I also need to mark "precious" on untracked or even tracked files (*).
> > Not sure how this "garbage" attribute interacts with that.
> >
> > (*) I was hoping I could get the idea [1] implemented in somewhat good
> > shape before presenting here. But I'm a bit slow on that front. So
> > yeah this "precious" on untracked/tracked thingy may be even
> > irrelevant if the patch series will be rejected.
>
> I think a garbage (or trashable) flag, if implemented, wouldn't need any
> special case in git-clean, i.e. -x would remove all untracked files,
> whether ignored or garbage/trashable. That's what my patch to implement
> it does:
> https://public-inbox.org/git/87zhuf3gs0.fsf@evledraar.gmail.com/
>
> I think that makes sense. Users running "git clean" have "--dry-run" and
> unlike "checkout a branch" or "merge this commit" where we'll now shred
> data implicitly it's obvious that git-clean is going to shred your data.

Then that't not what I want. If I'm going to mark to keep "config.mak"
around, I'm not going to carefully move it away before doing "git
clean -fdx" then move it back. No "git clean --dry-run" telling me to
make a backup of config.mak is no good.
-- 
Duy

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-26 15:40             ` Duy Nguyen
@ 2018-11-26 15:47               ` Ævar Arnfjörð Bjarmason
  2018-11-26 15:55                 ` Duy Nguyen
  0 siblings, 1 reply; 86+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-11-26 15:47 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: per.lundberg, brian m. carlson, Git Mailing List, jost,
	Joshua Jensen, Junio C Hamano, git, Clemens Buchacher,
	Holger Hellmuth (IKS), Kevin Ballard


On Mon, Nov 26 2018, Duy Nguyen wrote:

> On Mon, Nov 26, 2018 at 4:34 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
>>
>> On Mon, Nov 26 2018, Duy Nguyen wrote:
>>
>> > On Mon, Nov 26, 2018 at 10:30 AM Per Lundberg <per.lundberg@hibox.tv> wrote:
>> >>
>> >> On 11/13/18 1:22 AM, brian m. carlson wrote:
>> >> > This is going to totally hose automation.  My last job had files which
>> >> > might move from tracked to untracked (a file that had become generated),
>> >> > and long-running CI and build systems would need to be able to check out
>> >> > one status and switch to the other.  Your proposed change will prevent
>> >> > those systems from working, whereas they previously did.
>> >> >
>> >> > I agree that your proposal would have been a better design originally,
>> >> > but breaking the way automated systems currently work is probably going
>> >> > to be a dealbreaker.
>> >>
>> >> How about something like this:
>> >>
>> >> 1. Introduce a concept with "garbage" files, which git is "permitted to
>> >> delete" without prompting.
>> >>
>> >> 2. Retain the current default, i.e. "ignored files are garbage" for now,
>> >> making the new behavior _opt in_ to avoid breaking automated
>> >> systems/existing scripts for anyone. Put the setting for this behind a
>> >> new core.* config flag.
>> >>
>> >> 3. In the plan for version 3.0 (a new major version where some breakage
>> >> can be tolerable, according to Semantic Versioning), change the default
>> >> so that "only explicit garbage is garbage". Include very clear notices
>> >> of this in the release notes. The config flag is retained, but its
>> >> default changes from true->false or vice versa. People who dislike the
>> >> new behavior can easily change back to the 2.x semantics.
>> >
>> > How does this garbage thing interact with "git clean -x"? My
>> > interpretation of this flag/attribute is that at version 3.0 by
>> > default all ignored files are _not_ garbage, so "git clean -x" should
>> > not remove any of them. Which is weird because most of ignored files
>> > are like *.o that should be removed.
>> >
>> > I also need to mark "precious" on untracked or even tracked files (*).
>> > Not sure how this "garbage" attribute interacts with that.
>> >
>> > (*) I was hoping I could get the idea [1] implemented in somewhat good
>> > shape before presenting here. But I'm a bit slow on that front. So
>> > yeah this "precious" on untracked/tracked thingy may be even
>> > irrelevant if the patch series will be rejected.
>>
>> I think a garbage (or trashable) flag, if implemented, wouldn't need any
>> special case in git-clean, i.e. -x would remove all untracked files,
>> whether ignored or garbage/trashable. That's what my patch to implement
>> it does:
>> https://public-inbox.org/git/87zhuf3gs0.fsf@evledraar.gmail.com/
>>
>> I think that makes sense. Users running "git clean" have "--dry-run" and
>> unlike "checkout a branch" or "merge this commit" where we'll now shred
>> data implicitly it's obvious that git-clean is going to shred your data.
>
> Then that't not what I want. If I'm going to mark to keep "config.mak"
> around, I'm not going to carefully move it away before doing "git
> clean -fdx" then move it back. No "git clean --dry-run" telling me to
> make a backup of config.mak is no good.

Understood. I mean this in the context of solving the problem users have
with running otherwise non-data-destroying commands like "checkout" and
"merge" and getting their data destroyed, which is overwhelmingly why
this topic gets resurrected.

Some of the solutions overlap with this thing you want, but I think it's
worth keeping the distinction between the two in mind. I.e. I can
imagine us finding some acceptable solution to the data shredding
problem that doesn't implement this mode for "git-clean", or the other
way around.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-26 15:47               ` Ævar Arnfjörð Bjarmason
@ 2018-11-26 15:55                 ` Duy Nguyen
  2018-11-27  9:43                   ` Per Lundberg
  0 siblings, 1 reply; 86+ messages in thread
From: Duy Nguyen @ 2018-11-26 15:55 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: per.lundberg, brian m. carlson, Git Mailing List, jost,
	Joshua Jensen, Junio C Hamano, git, Clemens Buchacher,
	Holger Hellmuth (IKS), Kevin Ballard

On Mon, Nov 26, 2018 at 4:47 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> >> >> How about something like this:
> >> >>
> >> >> 1. Introduce a concept with "garbage" files, which git is "permitted to
> >> >> delete" without prompting.
> >> >>
> >> >> 2. Retain the current default, i.e. "ignored files are garbage" for now,
> >> >> making the new behavior _opt in_ to avoid breaking automated
> >> >> systems/existing scripts for anyone. Put the setting for this behind a
> >> >> new core.* config flag.
> >> >>
> >> >> 3. In the plan for version 3.0 (a new major version where some breakage
> >> >> can be tolerable, according to Semantic Versioning), change the default
> >> >> so that "only explicit garbage is garbage". Include very clear notices
> >> >> of this in the release notes. The config flag is retained, but its
> >> >> default changes from true->false or vice versa. People who dislike the
> >> >> new behavior can easily change back to the 2.x semantics.
> >> >
> >> > How does this garbage thing interact with "git clean -x"? My
> >> > interpretation of this flag/attribute is that at version 3.0 by
> >> > default all ignored files are _not_ garbage, so "git clean -x" should
> >> > not remove any of them. Which is weird because most of ignored files
> >> > are like *.o that should be removed.
> >> >
> >> > I also need to mark "precious" on untracked or even tracked files (*).
> >> > Not sure how this "garbage" attribute interacts with that.
> >> >
> >> > (*) I was hoping I could get the idea [1] implemented in somewhat good
> >> > shape before presenting here. But I'm a bit slow on that front. So
> >> > yeah this "precious" on untracked/tracked thingy may be even
> >> > irrelevant if the patch series will be rejected.
> >>
> >> I think a garbage (or trashable) flag, if implemented, wouldn't need any
> >> special case in git-clean, i.e. -x would remove all untracked files,
> >> whether ignored or garbage/trashable. That's what my patch to implement
> >> it does:
> >> https://public-inbox.org/git/87zhuf3gs0.fsf@evledraar.gmail.com/
> >>
> >> I think that makes sense. Users running "git clean" have "--dry-run" and
> >> unlike "checkout a branch" or "merge this commit" where we'll now shred
> >> data implicitly it's obvious that git-clean is going to shred your data.
> >
> > Then that't not what I want. If I'm going to mark to keep "config.mak"
> > around, I'm not going to carefully move it away before doing "git
> > clean -fdx" then move it back. No "git clean --dry-run" telling me to
> > make a backup of config.mak is no good.
>
> Understood. I mean this in the context of solving the problem users have
> with running otherwise non-data-destroying commands like "checkout" and
> "merge" and getting their data destroyed, which is overwhelmingly why
> this topic gets resurrected.
>
> Some of the solutions overlap with this thing you want, but I think it's
> worth keeping the distinction between the two in mind.

On the other hand all use cases should be considered. It's going to be
a mess to have "trashable" attribute that applies to some commands
while "precious" to some others (and even worse when they overlap,
imagine having to define both in .gitattributes)

> I.e. I can
> imagine us finding some acceptable solution to the data shredding
> problem that doesn't implement this mode for "git-clean", or the other
> way around.
-- 
Duy

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-12 23:22     ` brian m. carlson
  2018-11-26  9:30       ` Per Lundberg
@ 2018-11-26 16:02       ` Eckhard Maaß
  1 sibling, 0 replies; 86+ messages in thread
From: Eckhard Maaß @ 2018-11-26 16:02 UTC (permalink / raw)
  To: brian m. carlson
  Cc: Ævar Arnfjörð Bjarmason,
	Nguyễn Thái Ngọc Duy, git, Steffen Jost,
	Joshua Jensen, Per Lundberg, Junio C Hamano, Matthieu Moy,
	Clemens Buchacher, Holger Hellmuth, Kevin Ballard

On Mon, Nov 12, 2018 at 11:22:09PM +0000, brian m. carlson wrote:
> This is going to totally hose automation.  My last job had files which
> might move from tracked to untracked (a file that had become generated),
> and long-running CI and build systems would need to be able to check out
> one status and switch to the other.  Your proposed change will prevent
> those systems from working, whereas they previously did.

Wouldn't those systems not use -f right now? And shouldn't Git have the
same semantic for -f to clobber everything in the proposed use case?
Like it does right now for untracked files which are not ignored. So to
be save going back and forth I would expect those systems to use -f
anyway. Have I missed something here?

Regards,
Eckhard

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH v2 0/2] Precios files round two
  2018-11-11  9:52   ` [RFC PATCH] Introduce "precious" file concept Nguyễn Thái Ngọc Duy
  2018-11-11 12:15     ` Bert Wesarg
  2018-11-11 12:59     ` Junio C Hamano
@ 2018-11-26 19:38     ` Nguyễn Thái Ngọc Duy
  2018-11-26 19:38       ` [PATCH v2 1/2] Introduce "precious" file concept Nguyễn Thái Ngọc Duy
  2018-11-26 19:38       ` [PATCH v2 2/2] unpack-trees: support core.allIgnoredFilesArePreciousWhenMerging Nguyễn Thái Ngọc Duy
  2 siblings, 2 replies; 86+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2018-11-26 19:38 UTC (permalink / raw)
  To: pclouds
  Cc: git, Junio C Hamano, bert.wesarg, Ævar Arnfjörð,
	drizzd, git, hellmuth, jjensen, jost, kevin, per.lundberg,
	sandals, eckhard.s.maass

Let's see if I can make everybody almost happy with this.

Patch 1 is not that much different from round one except now with
tests. It is still about "precious" attributes that prevent content
loss with "git clean", "git checkout <ref>" or "git merge".

Patch 2 goes Ævar and Per are pursuing. It adds an opt-in config that
turns all ignored files precious, but only for unpack-trees
operations.

Nguyễn Thái Ngọc Duy (2):
  Introduce "precious" file concept
  unpack-trees: support core.allIgnoredFilesArePreciousWhenMerging

 Documentation/config/core.txt   |  6 ++++++
 Documentation/git-clean.txt     |  3 ++-
 Documentation/gitattributes.txt | 13 +++++++++++++
 Documentation/gitignore.txt     |  5 +++++
 attr.c                          | 12 ++++++++++++
 attr.h                          |  2 ++
 builtin/clean.c                 | 20 +++++++++++++++++---
 t/t1004-read-tree-m-u-wf.sh     |  6 ++++++
 t/t7300-clean.sh                | 29 +++++++++++++++++++++++++++++
 unpack-trees.c                  | 20 ++++++++++++++++++++
 10 files changed, 112 insertions(+), 4 deletions(-)

-- 
2.19.1.1327.g328c130451.dirty


^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH v2 1/2] Introduce "precious" file concept
  2018-11-26 19:38     ` [PATCH v2 0/2] Precios files round two Nguyễn Thái Ngọc Duy
@ 2018-11-26 19:38       ` Nguyễn Thái Ngọc Duy
  2018-11-26 19:38       ` [PATCH v2 2/2] unpack-trees: support core.allIgnoredFilesArePreciousWhenMerging Nguyễn Thái Ngọc Duy
  1 sibling, 0 replies; 86+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2018-11-26 19:38 UTC (permalink / raw)
  To: pclouds
  Cc: git, Junio C Hamano, bert.wesarg, Ævar Arnfjörð,
	drizzd, git, hellmuth, jjensen, jost, kevin, per.lundberg,
	sandals, eckhard.s.maass

A new attribute "precious" is added to indicate that certain files
have valuable content and should not be easily discarded even if they
are ignored or untracked.

So far there are two parts of Git that are made aware of precious
files: "git clean" will leave precious files alone and unpack-trees.c
(i.e. merges and branch switches) will not overwrite
ignored-but-precious files.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/git-clean.txt     |  3 ++-
 Documentation/gitattributes.txt | 13 +++++++++++++
 Documentation/gitignore.txt     |  4 ++++
 attr.c                          | 12 ++++++++++++
 attr.h                          |  2 ++
 builtin/clean.c                 | 20 +++++++++++++++++---
 t/t1004-read-tree-m-u-wf.sh     |  6 ++++++
 t/t7300-clean.sh                | 29 +++++++++++++++++++++++++++++
 unpack-trees.c                  |  1 +
 9 files changed, 86 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-clean.txt b/Documentation/git-clean.txt
index 03056dad0d..a9beadfb12 100644
--- a/Documentation/git-clean.txt
+++ b/Documentation/git-clean.txt
@@ -21,7 +21,8 @@ option is specified, ignored files are also removed. This can, for
 example, be useful to remove all build products.
 
 If any optional `<path>...` arguments are given, only those paths
-are affected.
+are affected. Ignored or untracked files with `precious` attributes
+are not removed.
 
 OPTIONS
 -------
diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index b8392fc330..b027abea4a 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -1188,6 +1188,19 @@ If this attribute is not set or has an invalid value, the value of the
 (See linkgit:git-config[1]).
 
 
+Precious files
+~~~~~~~~~~~~~~
+
+`precious`
+^^^^^^^^^^
+
+This attribute is set on files to indicate that their content is
+valuable. Many commands will behave slightly different on precious
+files. linkgit:git-clean[1] will leave precious files alone. Merging
+and branch switching will not silently overwrite ignored files that
+are marked "precious".
+
+
 USING MACRO ATTRIBUTES
 ----------------------
 
diff --git a/Documentation/gitignore.txt b/Documentation/gitignore.txt
index 1c94f08ff4..0e9614289e 100644
--- a/Documentation/gitignore.txt
+++ b/Documentation/gitignore.txt
@@ -141,6 +141,10 @@ not tracked by Git remain untracked.
 To stop tracking a file that is currently tracked, use
 'git rm --cached'.
 
+Ignored files are generally considered discardable. See `precious`
+attribute in linkgit:gitattributes[5] to change the behavior regarding
+ignored files.
+
 EXAMPLES
 --------
 
diff --git a/attr.c b/attr.c
index eaece6658d..b07d8cd835 100644
--- a/attr.c
+++ b/attr.c
@@ -1172,3 +1172,15 @@ void attr_start(void)
 	pthread_mutex_init(&g_attr_hashmap.mutex, NULL);
 	pthread_mutex_init(&check_vector.mutex, NULL);
 }
+
+int is_precious_file(struct index_state *istate, const char *path)
+{
+	static struct attr_check *check;
+	if (!check)
+		check = attr_check_initl("precious", NULL);
+	if (!check)
+		return 0;
+
+	git_check_attr(istate, path, check);
+	return ATTR_TRUE(check->items[0].value);
+}
diff --git a/attr.h b/attr.h
index b0378bfe5f..b9a9751a66 100644
--- a/attr.h
+++ b/attr.h
@@ -82,4 +82,6 @@ void git_attr_set_direction(enum git_attr_direction new_direction);
 
 void attr_start(void);
 
+int is_precious_file(struct index_state *istate, const char *path);
+
 #endif /* ATTR_H */
diff --git a/builtin/clean.c b/builtin/clean.c
index bbcdeb2d9e..42cd040849 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -17,6 +17,7 @@
 #include "color.h"
 #include "pathspec.h"
 #include "help.h"
+#include "attr.h"
 
 static int force = -1; /* unset */
 static int interactive;
@@ -30,6 +31,8 @@ static const char *const builtin_clean_usage[] = {
 
 static const char *msg_remove = N_("Removing %s\n");
 static const char *msg_would_remove = N_("Would remove %s\n");
+static const char *msg_skip_precious = N_("Skipping precious file %s\n");
+static const char *msg_would_skip_precious = N_("Would skip precious file %s\n");
 static const char *msg_skip_git_dir = N_("Skipping repository %s\n");
 static const char *msg_would_skip_git_dir = N_("Would skip repository %s\n");
 static const char *msg_warn_remove_failed = N_("failed to remove %s");
@@ -153,6 +156,7 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
 	struct dirent *e;
 	int res = 0, ret = 0, gone = 1, original_len = path->len, len;
 	struct string_list dels = STRING_LIST_INIT_DUP;
+	const char *rel_path;
 
 	*dir_gone = 1;
 
@@ -192,9 +196,16 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
 
 		strbuf_setlen(path, len);
 		strbuf_addstr(path, e->d_name);
-		if (lstat(path->buf, &st))
+		if (lstat(path->buf, &st)) {
 			; /* fall thru */
-		else if (S_ISDIR(st.st_mode)) {
+		} else if ((!prefix && is_precious_file(&the_index, path->buf)) ||
+			   (prefix && skip_prefix(path->buf, prefix, &rel_path) &&
+			    is_precious_file(&the_index, rel_path))) {
+			quote_path_relative(path->buf, prefix, &quoted);
+			printf(dry_run ? _(msg_would_skip_precious) : _(msg_skip_precious), quoted.buf);
+			*dir_gone = 0;
+			continue;
+		} else if (S_ISDIR(st.st_mode)) {
 			if (remove_dirs(path, prefix, force_flag, dry_run, quiet, &gone))
 				ret = 1;
 			if (gone) {
@@ -1018,7 +1029,10 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 		if (lstat(abs_path.buf, &st))
 			continue;
 
-		if (S_ISDIR(st.st_mode)) {
+		if (is_precious_file(&the_index, item->string)) {
+			qname = quote_path_relative(item->string, NULL, &buf);
+			printf(dry_run ? _(msg_would_skip_precious) : _(msg_skip_precious), qname);
+		} else if (S_ISDIR(st.st_mode)) {
 			if (remove_dirs(&abs_path, prefix, rm_flags, dry_run, quiet, &gone))
 				errors++;
 			if (gone && !quiet) {
diff --git a/t/t1004-read-tree-m-u-wf.sh b/t/t1004-read-tree-m-u-wf.sh
index c13578a635..17dc626f62 100755
--- a/t/t1004-read-tree-m-u-wf.sh
+++ b/t/t1004-read-tree-m-u-wf.sh
@@ -63,6 +63,12 @@ test_expect_success 'two-way with incorrect --exclude-per-directory (2)' '
 	fi
 '
 
+test_expect_success 'two-way not clobbering a precious ignored file' '
+	test_when_finished rm -f .git/info/attributes &&
+	echo "file2 precious" >.git/info/attributes &&
+	read_tree_u_must_fail -m -u --exclude-per-directory=.gitignore master side
+'
+
 test_expect_success 'two-way clobbering a ignored file' '
 
 	read_tree_u_must_succeed -m -u --exclude-per-directory=.gitignore master side
diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index 7b36954d63..a478a08a27 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -669,4 +669,33 @@ test_expect_success 'git clean -d skips untracked dirs containing ignored files'
 	test_path_is_missing foo/b/bb
 '
 
+test_expect_success 'git clean -xd leaves precious files alone' '
+	git init precious &&
+	(
+		cd precious &&
+		test_commit one &&
+		cat >.gitignore <<-\EOF &&
+		*.o
+		*.mak
+		EOF
+		cat >.gitattributes <<-\EOF &&
+		*.mak precious
+		.gitattributes precious
+		*.precious precious
+		EOF
+		mkdir sub &&
+		touch one.o sub/two.o one.mak sub/two.mak &&
+		touch one.untracked two.precious sub/also.precious &&
+		git clean -fdx &&
+		test_path_is_missing one.o &&
+		test_path_is_missing sub/two.o &&
+		test_path_is_missing one.untracked &&
+		test_path_is_file .gitattributes &&
+		test_path_is_file one.mak &&
+		test_path_is_file sub/two.mak &&
+		test_path_is_file two.precious &&
+		test_path_is_file sub/also.precious
+	)
+'
+
 test_done
diff --git a/unpack-trees.c b/unpack-trees.c
index 7570df481b..9a5aadc084 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1895,6 +1895,7 @@ static int check_ok_to_remove(const char *name, int len, int dtype,
 		return 0;
 
 	if (o->dir &&
+	    !is_precious_file(o->src_index, name) &&
 	    is_excluded(o->dir, o->src_index, name, &dtype))
 		/*
 		 * ce->name is explicitly excluded, so it is Ok to
-- 
2.19.1.1327.g328c130451.dirty


^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [PATCH v2 2/2] unpack-trees: support core.allIgnoredFilesArePreciousWhenMerging
  2018-11-26 19:38     ` [PATCH v2 0/2] Precios files round two Nguyễn Thái Ngọc Duy
  2018-11-26 19:38       ` [PATCH v2 1/2] Introduce "precious" file concept Nguyễn Thái Ngọc Duy
@ 2018-11-26 19:38       ` Nguyễn Thái Ngọc Duy
  1 sibling, 0 replies; 86+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2018-11-26 19:38 UTC (permalink / raw)
  To: pclouds
  Cc: git, Junio C Hamano, bert.wesarg, Ævar Arnfjörð,
	drizzd, git, hellmuth, jjensen, jost, kevin, per.lundberg,
	sandals, eckhard.s.maass

Ignored files can be marked precious to prevent being overwritten
during a merge (or even a branch switch). If you really want to make
sure no ignored files are overwritten, this config variable is for
you.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/config/core.txt |  6 ++++++
 Documentation/gitignore.txt   |  5 +++--
 unpack-trees.c                | 21 ++++++++++++++++++++-
 3 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
index d0e6635fe0..bff5834c13 100644
--- a/Documentation/config/core.txt
+++ b/Documentation/config/core.txt
@@ -1,3 +1,9 @@
+core.allIgnoredFilesArePreciousWhenMerging::
+	During a merge operation, if "precious" attribute is unset,
+	consider it set. You can explicitly remove "precious"
+	attribute if needed. See linkgit:gitattributes[5] for more
+	information.
+
 core.fileMode::
 	Tells Git if the executable bit of files in the working tree
 	is to be honored.
diff --git a/Documentation/gitignore.txt b/Documentation/gitignore.txt
index 0e9614289e..2832df7178 100644
--- a/Documentation/gitignore.txt
+++ b/Documentation/gitignore.txt
@@ -142,8 +142,9 @@ To stop tracking a file that is currently tracked, use
 'git rm --cached'.
 
 Ignored files are generally considered discardable. See `precious`
-attribute in linkgit:gitattributes[5] to change the behavior regarding
-ignored files.
+attribute in linkgit:gitattributes[5] and
+`core.allIgnoredFilesArePreciousWhenMerging` in linkgit:git-config[1]
+to change the behavior regarding ignored files.
 
 EXAMPLES
 --------
diff --git a/unpack-trees.c b/unpack-trees.c
index 9a5aadc084..df3b163e2e 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1877,6 +1877,25 @@ static int icase_exists(struct unpack_trees_options *o, const char *name, int le
 	return src && !ie_match_stat(o->src_index, src, st, CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE);
 }
 
+static int is_precious_ignored_file(struct index_state *istate, const char *path)
+{
+	static struct attr_check *check;
+	int all_precious;
+
+	if (!check)
+		check = attr_check_initl("precious", NULL);
+	if (!check)
+		return 0;
+
+	git_check_attr(istate, path, check);
+	if (ATTR_UNSET(check->items[0].value) &&
+	    !git_config_get_bool("core.allignoredfilesarepreciouswhenmerging",
+				 &all_precious) &&
+	    all_precious)
+		return 1;
+	return ATTR_TRUE(check->items[0].value);
+}
+
 static int check_ok_to_remove(const char *name, int len, int dtype,
 			      const struct cache_entry *ce, struct stat *st,
 			      enum unpack_trees_error_types error_type,
@@ -1895,7 +1914,7 @@ static int check_ok_to_remove(const char *name, int len, int dtype,
 		return 0;
 
 	if (o->dir &&
-	    !is_precious_file(o->src_index, name) &&
+	    !is_precious_ignored_file(o->src_index, name) &&
 	    is_excluded(o->dir, o->src_index, name, &dtype))
 		/*
 		 * ce->name is explicitly excluded, so it is Ok to
-- 
2.19.1.1327.g328c130451.dirty


^ permalink raw reply related	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-26 15:55                 ` Duy Nguyen
@ 2018-11-27  9:43                   ` Per Lundberg
  2018-11-27 12:55                     ` Jacob Keller
  0 siblings, 1 reply; 86+ messages in thread
From: Per Lundberg @ 2018-11-27  9:43 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Ævar Arnfjörð Bjarmason, brian m. carlson,
	Git Mailing List, jost@tcs.ifi.lmu.de, Joshua Jensen,
	Junio C Hamano, git@matthieu-moy.fr, Clemens Buchacher,
	Holger Hellmuth (IKS), Kevin Ballard

On 11/26/18 5:55 PM, Duy Nguyen wrote:
> On Mon, Nov 26, 2018 at 4:47 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>> Some of the solutions overlap with this thing you want, but I think it's
>> worth keeping the distinction between the two in mind.
> 
> On the other hand all use cases should be considered. It's going to be
> a mess to have "trashable" attribute that applies to some commands
> while "precious" to some others (and even worse when they overlap,
> imagine having to define both in .gitattributes)

Agree - I think it would be a very bad idea to have a "mix" of both 
trashable and precious. IMO, we should try to find which one of these 
concepts suits most general use cases best and causes less churn for 
existing scripts/users' existing "semantic expectations", and pick that one.
--
Per Lundberg

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-27  9:43                   ` Per Lundberg
@ 2018-11-27 12:55                     ` Jacob Keller
  2018-11-27 14:50                       ` Per Lundberg
                                         ` (2 more replies)
  0 siblings, 3 replies; 86+ messages in thread
From: Jacob Keller @ 2018-11-27 12:55 UTC (permalink / raw)
  To: per.lundberg
  Cc: Duy Nguyen, Ævar Arnfjörð Bjarmason,
	brian m. carlson, Git mailing list, jost, jjensen, Junio C Hamano,
	Matthieu Moy, drizzd, hellmuth, kevin

On Tue, Nov 27, 2018 at 1:45 AM Per Lundberg <per.lundberg@hibox.tv> wrote:
>
> On 11/26/18 5:55 PM, Duy Nguyen wrote:
> > On Mon, Nov 26, 2018 at 4:47 PM Ævar Arnfjörð Bjarmason
> > <avarab@gmail.com> wrote:
> >> Some of the solutions overlap with this thing you want, but I think it's
> >> worth keeping the distinction between the two in mind.
> >
> > On the other hand all use cases should be considered. It's going to be
> > a mess to have "trashable" attribute that applies to some commands
> > while "precious" to some others (and even worse when they overlap,
> > imagine having to define both in .gitattributes)
>
> Agree - I think it would be a very bad idea to have a "mix" of both
> trashable and precious. IMO, we should try to find which one of these
> concepts suits most general use cases best and causes less churn for
> existing scripts/users' existing "semantic expectations", and pick that one.
> --
> Per Lundberg

Personally, I would rather err on the side which requires the least
interaction from users to avoid silently clobbering an ignored file.

Either Duy's solution with a sort of "untracked" reflog, or the
garbage/trashable notion.

I don't like the idea of precious because it means people have to know
and remember to opt in, and it's quite possible they will not do so
until after they've lost real data.

I'd only have trashable apply in the case where it was implicit. i.e.
git clean -fdx would still delete them, as this is an explicit
operation that (hopefully?) users know will delete data.

Thanks,
Jake

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-27 12:55                     ` Jacob Keller
@ 2018-11-27 14:50                       ` Per Lundberg
  2018-11-28  1:21                         ` brian m. carlson
  2018-11-27 15:19                       ` Duy Nguyen
  2018-12-06 18:39                       ` Duy Nguyen
  2 siblings, 1 reply; 86+ messages in thread
From: Per Lundberg @ 2018-11-27 14:50 UTC (permalink / raw)
  To: Jacob Keller
  Cc: Duy Nguyen, Ævar Arnfjörð Bjarmason,
	brian m. carlson, Git mailing list, jost@tcs.ifi.lmu.de,
	jjensen@workspacewhiz.com, Junio C Hamano, Matthieu Moy,
	drizzd@gmx.net, hellmuth@ira.uka.de, kevin@sb.org

On 11/27/18 2:55 PM, Jacob Keller wrote:

> Personally, I would rather err on the side which requires the least
> interaction from users to avoid silently clobbering an ignored file.
> 
 > [...]
> 
> I don't like the idea of precious because it means people have to know
> and remember to opt in, and it's quite possible they will not do so
> until after they've lost real data.

I agree strongly with this personally; if we must choose between "might 
break automation" and "might delete non-garbage files", I would say the 
former is the lesser evil of the two.

But, if I had 10 000 000 servers set up using automated scripts that 
would break because of this, I might think differently. Quite likely so, 
in fact.

What are these automation scenarios _more specifically_? Junio or Brian, 
would you care to elaborate? Is it for build servers where you want "git 
clean -dfx" to always reset the working copy to a pristine state or are 
we talking about some other scenarios?

> I'd only have trashable apply in the case where it was implicit. i.e.
> git clean -fdx would still delete them, as this is an explicit
> operation that (hopefully?) users know will delete data.

This is one of the tougher calls, unfortunately.

If I was a user (which I am), and I was typing "git clean -dfx", what 
would I expect?

The help text (currently) states "-x   remove ignored files, too".

Would it be safe to assume that people would understand that "ignored 
_does not_ mean trashable when doing "git checkout some-ref" BUT it 
_does_ mean trashable in the "git clean -dfx" context"? I'm not so 
certain. It would be one of those perceived inconsistencies that would 
make people scream in anger because they _presumed_ that with the new 
"trashable" concept, "git clean -dfx" would no longer hit them in the leg.

And the other way around: if we change "git clean -dfx" to _not_ treat 
"ignored == trashable", it is likely to "hose automation" as it has been 
previously stated. People who might be using this syntax and _want_ it 
to remove ignored files would be upset, and rightfully so.

So in my POV, it's a tough decision between two, less-than-optimal 
alternatives.

But I would perhaps be able to live with the current semantics for "git 
clean -dfx" _as long as we update the help text_ so that "-x" indicates 
more clearly that non-trashable files can be deleted. It doesn't make 
things _worse_ than they currently are and if this is what it takes to 
get the trashable concept implemented and accepted by the community, 
it's a compromise I'd be willing to make.
--
Per Lundberg


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-26 12:49         ` Junio C Hamano
@ 2018-11-27 15:08           ` Ævar Arnfjörð Bjarmason
  2018-11-28  3:58             ` Junio C Hamano
  0 siblings, 1 reply; 86+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-11-27 15:08 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Per Lundberg, brian m. carlson, git@vger.kernel.org, Steffen Jost,
	Joshua Jensen, Matthieu Moy, Clemens Buchacher, Holger Hellmuth,
	Kevin Ballard, Nguyễn Thái Ngọc Duy


On Mon, Nov 26 2018, Junio C Hamano wrote:

> Per Lundberg <per.lundberg@hibox.tv> writes:
>
>> How about something like this:
>> ...
>> Would this be a reasonable compromise for everybody?
>
> I do not think you'd need to introduce such a deliberately breaking
> change at all.  Just introduce a new "precious" class, perhaps mark
> them with the atttribute mechanism, and that would be the endgame.
> Early adopters would start marking ignored but not expendable paths
> with the "precious" attribute and they won't be clobbered.  As the
> mechanism becomes widely known and mature, more and more people use
> it.  And even after that happens, early adopters do not have to change
> any attribute setting, and late adopters would have plenty of examples
> to imitate.  Those who do not need any "precious" class do not have
> to do anything and you won't break any existing automation that way.

The patch I submitted in <87zhuf3gs0.fsf@evledraar.gmail.com>[1] changed
the behavior for read-tree & checkout & merge etc.

It was an RFC more in the spirit of showing what in our current tests
had to change to spur some discussion.

But I'm very sympathetic to this line of argument. I.e. in my patch I'm
changing the semantics of read-tree, which is plumbing.

What do you think about some patch like that which retains the plumbing
behavior for things like read-tree, doesn't introduce "precious" or
"trashable", and just makes you specify "[checkout|merge|...] --force"
in cases where we'd have clobbering?

This would give scripts which relied on our stable plumbing consistent
behavior, while helping users who're using our main porcelain not to
lose data. I could then add a --force option to the likes of read-tree
(on by default), so you could get porcelain-like behavior with
--no-force.

1. https://public-inbox.org/git/87zhuf3gs0.fsf@evledraar.gmail.com/

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-27 12:55                     ` Jacob Keller
  2018-11-27 14:50                       ` Per Lundberg
@ 2018-11-27 15:19                       ` Duy Nguyen
  2018-12-06 18:39                       ` Duy Nguyen
  2 siblings, 0 replies; 86+ messages in thread
From: Duy Nguyen @ 2018-11-27 15:19 UTC (permalink / raw)
  To: Jacob Keller
  Cc: per.lundberg, Ævar Arnfjörð Bjarmason,
	brian m. carlson, Git Mailing List, jost, Joshua Jensen,
	Junio C Hamano, git, Clemens Buchacher, Holger Hellmuth (IKS),
	Kevin Ballard

On Tue, Nov 27, 2018 at 1:56 PM Jacob Keller <jacob.keller@gmail.com> wrote:
>
> On Tue, Nov 27, 2018 at 1:45 AM Per Lundberg <per.lundberg@hibox.tv> wrote:
> >
> > On 11/26/18 5:55 PM, Duy Nguyen wrote:
> > > On Mon, Nov 26, 2018 at 4:47 PM Ævar Arnfjörð Bjarmason
> > > <avarab@gmail.com> wrote:
> > >> Some of the solutions overlap with this thing you want, but I think it's
> > >> worth keeping the distinction between the two in mind.
> > >
> > > On the other hand all use cases should be considered. It's going to be
> > > a mess to have "trashable" attribute that applies to some commands
> > > while "precious" to some others (and even worse when they overlap,
> > > imagine having to define both in .gitattributes)
> >
> > Agree - I think it would be a very bad idea to have a "mix" of both
> > trashable and precious. IMO, we should try to find which one of these
> > concepts suits most general use cases best and causes less churn for
> > existing scripts/users' existing "semantic expectations", and pick that one.
> > --
> > Per Lundberg
>
> Personally, I would rather err on the side which requires the least
> interaction from users to avoid silently clobbering an ignored file.
>
> Either Duy's solution with a sort of "untracked" reflog, or the
> garbage/trashable notion.
>
> I don't like the idea of precious because it means people have to know
> and remember to opt in, and it's quite possible they will not do so
> until after they've lost real data.
>
> I'd only have trashable apply in the case where it was implicit. i.e.
> git clean -fdx would still delete them, as this is an explicit
> operation that (hopefully?) users know will delete data.

Yes I know it will delete ignored files. But I don't want it to delete
some files. There is no way I can tell Git to do that.

It's the same with merge/checkout's overwriting problem. Once the
initial surprise is over, I want control over what files I want Git to
just delete and not annoy me, what Git should not delete.
-- 
Duy

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-27 14:50                       ` Per Lundberg
@ 2018-11-28  1:21                         ` brian m. carlson
  2018-11-28  6:54                           ` Per Lundberg
  0 siblings, 1 reply; 86+ messages in thread
From: brian m. carlson @ 2018-11-28  1:21 UTC (permalink / raw)
  To: Per Lundberg
  Cc: Jacob Keller, Duy Nguyen, Ævar Arnfjörð Bjarmason,
	Git mailing list, jost@tcs.ifi.lmu.de, jjensen@workspacewhiz.com,
	Junio C Hamano, Matthieu Moy, drizzd@gmx.net, hellmuth@ira.uka.de,
	kevin@sb.org

[-- Attachment #1: Type: text/plain, Size: 2982 bytes --]

On Tue, Nov 27, 2018 at 02:50:34PM +0000, Per Lundberg wrote:
> I agree strongly with this personally; if we must choose between "might
> break automation" and "might delete non-garbage files", I would say the
> former is the lesser evil of the two.
> 
> But, if I had 10 000 000 servers set up using automated scripts that
> would break because of this, I might think differently. Quite likely so,
> in fact.
> 
> What are these automation scenarios _more specifically_? Junio or Brian,
> would you care to elaborate? Is it for build servers where you want "git
> clean -dfx" to always reset the working copy to a pristine state or are
> we talking about some other scenarios?

We had long-running CI servers, since bootstrapping a new system took an
hour.  These would check out the branch to test and run some process
(essentially, a "make" and "make test").  Then, another branch would be
tested, and so on.  The old branch would likely not be merged at this
point.

The scenario I'm thinking of is when a file (say a CSS file) became
built instead of stored in the repository.  Then the file would be added
to .gitignore in the new commit, and it would be generated as part of
the make step.  It would be important to blow away that file away when
checking out a new commit, because not doing so would mean that the CI
system would simply fail to work and require manual intervention.

Moreover, a CI job might fail, due to either a flaky test or a
legitimate failures, so the job might need to be re-run multiple times.
Requiring human intervention, especially when such jobs might be running
at odd hours, would be undesirable.

Another thing we did was to use a specially named gitignore file in our
build step.  We created a new repository, copied the special gitignore
file in as .gitignore, copied in the source and build products, ran git
add and git commit, and then ran git clean -dfx to remove proprietary
source code, packaging the result.  A change to the behavior of git
clean -dfx would be devastating here.

I point this out to underscore how fundamental this change is.  People
overwhelmingly do not read the release notes, so expecting people to
realize that a change has been made, especially when many people only
upgrade Git because of a security issue, may result in unexpected
consequences.  Just because we don't think of this use of Git as normal
or desirable doesn't mean people don't do it and don't expect it to
keep working.  People do read and rely on our documentation.

I think any change we make here has to be opt-in, at least until Git
3.0.  A config knob would probably be the right way to go.  I realize
that may not provide all the benefits we want out of the box, but it
lets people turn the option on once and forget about it.  It also lets
people who don't desire this new behavior explicitly turn it off.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-27 15:08           ` Ævar Arnfjörð Bjarmason
@ 2018-11-28  3:58             ` Junio C Hamano
  2018-11-28 21:54               ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 86+ messages in thread
From: Junio C Hamano @ 2018-11-28  3:58 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Per Lundberg, brian m. carlson, git@vger.kernel.org, Steffen Jost,
	Joshua Jensen, Matthieu Moy, Clemens Buchacher, Holger Hellmuth,
	Kevin Ballard, Nguyễn Thái Ngọc Duy

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> What do you think about some patch like that which retains the plumbing
> behavior for things like read-tree, doesn't introduce "precious" or
> "trashable", and just makes you specify "[checkout|merge|...] --force"
> in cases where we'd have clobbering?

Whether you like it or not, don't people's automation use tons of
invocations of "git merge", "git checkout", etc.?  You'd be breaking
them by such a change.  Other than that, if we never had Git before
and do not have to worry about existing users, I'd think it would be
a lot closer to the ideal than today's system if "checkout <tree>
foo.o" rejected overwriting "foo.o" that is not tracked in the
current index but matches an ignore pattern, and required a
"--force" option to overwrite it.

A user, during a conflict resolution, may say "I want this 'git
checkout foo/' to ignore conflicted paths in that directory, so I
would give "--force" option to it, but now "--force" also implies
that I am willing to clobber ignored paths, which means I cannot use
it".

I would think that a proper automation needs per-path hint from the
user and/or the project, not just a single-size-fits-all --force
option, and "unlike all the *.o ignored files that are expendable,
this vendor-supplied-object.o is not" is one way to give such a
per-path hint.

> This would give scripts which relied on our stable plumbing consistent
> behavior, while helping users who're using our main porcelain not to
> lose data. I could then add a --force option to the likes of read-tree
> (on by default), so you could get porcelain-like behavior with
> --no-force.

At that low level, I suspect that a single size fits all "--force"
would work even less well.


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-28  1:21                         ` brian m. carlson
@ 2018-11-28  6:54                           ` Per Lundberg
  0 siblings, 0 replies; 86+ messages in thread
From: Per Lundberg @ 2018-11-28  6:54 UTC (permalink / raw)
  To: brian m. carlson
  Cc: Jacob Keller, Duy Nguyen, Ævar Arnfjörð Bjarmason,
	Git mailing list, Junio C Hamano

On 11/28/18 3:21 AM, brian m. carlson wrote:

Thanks for the elaboration, Brian - good to get things down to a 
practical, real-world level.

 > [...]
 >
> I point this out to underscore how fundamental this change is.  People
> overwhelmingly do not read the release notes, so expecting people to
> realize that a change has been made, especially when many people only
> upgrade Git because of a security issue, may result in unexpected
> consequences.

This is one of the more important things of software engineering. _Don't 
mix security fixes with breaking changes_. They are very different 
things and like you say, we can't really expect people to real release 
notes for every little incremental release we do.

That's an important part of the SemVer guarantee: a minor version 
bump/patch level increase means "bug fix" or "added functionality in a 
backwards-compatible way". So: no changing of default behavior or 
semantics, but adding a new behavior which is opt-in is perfectly fine.

> Just because we don't think of this use of Git as normal or desirable > doesn't mean people don't do it and don't expect it to keep working.

In other words, we need to be "bug-by-bug" compatible with previous 
versions. :-) What some people would consider a bug, others would 
consider a feature.

> I think any change we make here has to be opt-in, at least until Git
> 3.0.  A config knob would probably be the right way to go. 

Agree. It's less than optimal but I think it's something that we all 
could live with. Deciding to switching the default (or not) is then 
rightfully postponed to a later time, and we can revisit the pros and 
cons then. The important thing now is to get the functionality 
implemented in a good way, tested and eventually merged.
--
Per Lundberg

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-28  3:58             ` Junio C Hamano
@ 2018-11-28 21:54               ` Ævar Arnfjörð Bjarmason
  2018-11-29  5:04                 ` Junio C Hamano
  2018-12-01  6:21                 ` Duy Nguyen
  0 siblings, 2 replies; 86+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-11-28 21:54 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Per Lundberg, brian m. carlson, git@vger.kernel.org, Steffen Jost,
	Joshua Jensen, Matthieu Moy, Clemens Buchacher, Holger Hellmuth,
	Kevin Ballard, Nguyễn Thái Ngọc Duy


On Wed, Nov 28 2018, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> What do you think about some patch like that which retains the plumbing
>> behavior for things like read-tree, doesn't introduce "precious" or
>> "trashable", and just makes you specify "[checkout|merge|...] --force"
>> in cases where we'd have clobbering?
>
> Whether you like it or not, don't people's automation use tons of
> invocations of "git merge", "git checkout", etc.?  You'd be breaking
> them by such a change.

I'm so sympathetic to this argument that I tried to convince you of
something like this around a year and a half ago:
https://public-inbox.org/git/CACBZZX59KXPOEjiUKtZLN6zjO_xpiWve7Xga6q-53J2LwvfZyw@mail.gmail.com/
:)

I was probing for what your current stance on this sort of thing is,
because discussions like this tend to get bogged down in the irrelevant
distraction of whether something is plumbing or porcelain, which almost
none of our users care about, and we've effectively stopped caring about
ourselves.

But we must have some viable way to repair warts in the tools, and
losing user data is a *big* wart.

I don't think something like the endgame you've described in
https://public-inbox.org/git/xmqqzhtwuhpc.fsf@gitster-ct.c.googlers.com/
is ever going to work. Novice git users (the vast majority) are not
going to diligently update both .gitignore and some .gitattribute
mechanism in lockstep. I'd bet most git users haven't read more than a
few paragraphs of our entire documentation at best.

So what's the way forward? I think ultimately we must move to something
where we effectively version the entire CLI UI similar to stable API
versions. I.e. for things like this that would break some things (or
Duy's new "split checkout") introduce them as flags first, then bundle
up all such flags and cut a major release "Git 3, 4, ...", and
eventually remove old functionality.

> Other than that, if we never had Git before and do not have to worry
> about existing users, I'd think it would be a lot closer to the ideal
> than today's system if "checkout <tree> foo.o" rejected overwriting
> "foo.o" that is not tracked in the current index but matches an ignore
> pattern, and required a "--force" option to overwrite it.
>
> A user, during a conflict resolution, may say "I want this 'git
> checkout foo/' to ignore conflicted paths in that directory, so I
> would give "--force" option to it, but now "--force" also implies
> that I am willing to clobber ignored paths, which means I cannot use
> it".
>
> I would think that a proper automation needs per-path hint from the
> user and/or the project, not just a single-size-fits-all --force
> option, and "unlike all the *.o ignored files that are expendable,
> this vendor-supplied-object.o is not" is one way to give such a
> per-path hint.
>
>> This would give scripts which relied on our stable plumbing consistent
>> behavior, while helping users who're using our main porcelain not to
>> lose data. I could then add a --force option to the likes of read-tree
>> (on by default), so you could get porcelain-like behavior with
>> --no-force.
>
> At that low level, I suspect that a single size fits all "--force"
> would work even less well.

Yeah I don't think the one-size-fits-all way out of this is a single
--force flag.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-28 21:54               ` Ævar Arnfjörð Bjarmason
@ 2018-11-29  5:04                 ` Junio C Hamano
  2018-12-01  6:21                 ` Duy Nguyen
  1 sibling, 0 replies; 86+ messages in thread
From: Junio C Hamano @ 2018-11-29  5:04 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Per Lundberg, brian m. carlson, git@vger.kernel.org, Steffen Jost,
	Joshua Jensen, Matthieu Moy, Clemens Buchacher, Holger Hellmuth,
	Kevin Ballard, Nguyễn Thái Ngọc Duy

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> I don't think something like the endgame you've described in
> https://public-inbox.org/git/xmqqzhtwuhpc.fsf@gitster-ct.c.googlers.com/
> is ever going to work. Novice git users (the vast majority) are not
> going to diligently update both .gitignore and some .gitattribute
> mechanism in lockstep.

That goes both ways, no?  Forcing people to add the same pattern,
e.g. *.o, to .gitignore and .gitattribute to say they are
expendable, when most of the time they are, is not something you
should expect from the normal population.

>> I would think that a proper automation needs per-path hint from the
>> user and/or the project, not just a single-size-fits-all --force
>> option, and "unlike all the *.o ignored files that are expendable,
>> this vendor-supplied-object.o is not" is one way to give such a
>> per-path hint.
>>
>>> This would give scripts which relied on our stable plumbing consistent
>>> behavior, while helping users who're using our main porcelain not to
>>> lose data. I could then add a --force option to the likes of read-tree
>>> (on by default), so you could get porcelain-like behavior with
>>> --no-force.
>>
>> At that low level, I suspect that a single size fits all "--force"
>> would work even less well.
>
> Yeah I don't think the one-size-fits-all way out of this is a single
> --force flag.

Yes, indeed.  That's why I prefer the "precious" bit.  The system
would behave the same way with or without it, but projects (not
individual endusers) can take advantage of the feature if they
wanted to.


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-28 21:54               ` Ævar Arnfjörð Bjarmason
  2018-11-29  5:04                 ` Junio C Hamano
@ 2018-12-01  6:21                 ` Duy Nguyen
  1 sibling, 0 replies; 86+ messages in thread
From: Duy Nguyen @ 2018-12-01  6:21 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Junio C Hamano, per.lundberg, brian m. carlson, Git Mailing List,
	jost, Joshua Jensen, git, Clemens Buchacher,
	Holger Hellmuth (IKS), Kevin Ballard

On Wed, Nov 28, 2018 at 10:54 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> But we must have some viable way to repair warts in the tools, and
> losing user data is a *big* wart.
>
> I don't think something like the endgame you've described in
> https://public-inbox.org/git/xmqqzhtwuhpc.fsf@gitster-ct.c.googlers.com/
> is ever going to work. Novice git users (the vast majority) are not
> going to diligently update both .gitignore and some .gitattribute
> mechanism in lockstep. I'd bet most git users haven't read more than a
> few paragraphs of our entire documentation at best.
>
> So what's the way forward? I think ultimately we must move to something
> where we effectively version the entire CLI UI similar to stable API
> versions. I.e. for things like this that would break some things (or
> Duy's new "split checkout") introduce them as flags first, then bundle
> up all such flags and cut a major release "Git 3, 4, ...", and
> eventually remove old functionality.

Related to Duy's new "split chekckout", I just realized that I added
--overwrite-ignore (enabled by default) [1] years ago to allow to out
out of this behavior. We could turn --no-overwrite-ignore by default
on the new command "git switch-branch" to err on the safe side. Then
the user could switch to --overwrite-ignore once they learn more about
gitignore and gitattributes (or the coming "backup log"). I'm not sure
if I really like this, but at least it's one of the options.

[1] https://public-inbox.org/git/1322388933-6284-2-git-send-email-pclouds@gmail.com/
-- 
Duy

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC PATCH] Introduce "precious" file concept
  2018-11-27 12:55                     ` Jacob Keller
  2018-11-27 14:50                       ` Per Lundberg
  2018-11-27 15:19                       ` Duy Nguyen
@ 2018-12-06 18:39                       ` Duy Nguyen
  2 siblings, 0 replies; 86+ messages in thread
From: Duy Nguyen @ 2018-12-06 18:39 UTC (permalink / raw)
  To: Jacob Keller
  Cc: per.lundberg, Ævar Arnfjörð Bjarmason,
	brian m. carlson, Git Mailing List, jost, Joshua Jensen,
	Junio C Hamano, git, Clemens Buchacher, Holger Hellmuth (IKS),
	Kevin Ballard

On Tue, Nov 27, 2018 at 1:56 PM Jacob Keller <jacob.keller@gmail.com> wrote:
> Personally, I would rather err on the side which requires the least
> interaction from users to avoid silently clobbering an ignored file.
>
> Either Duy's solution with a sort of "untracked" reflog, or the
> garbage/trashable notion.

The "untracked reflog" is partially functional now [1] if you want to
have a look. I'm not going to post the series until post-2.20, but if
you do look, I suggest the first patch that lays out the design and a
plumbing command to manage it. Basically you'll do

    git backup-log --id=worktree log <some-path>

then pick up the version you like with "git backup-log cat" and do
whatever you want with it. High level UI is not there and will be a
topic of discussion.

The precious/trashable/garbage notion can be used to suppress making backups.

[1] https://gitlab.com/pclouds/git/commits/backup-log
-- 
Duy

^ permalink raw reply	[flat|nested] 86+ messages in thread

end of thread, other threads:[~2018-12-06 18:40 UTC | newest]

Thread overview: 86+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-08-17  5:21 git merge, .gitignore, and silently overwriting untracked files Joshua Jensen
2010-08-17 19:33 ` Junio C Hamano
2010-08-18 23:39   ` [PATCH] optionally disable overwriting of ignored files Clemens Buchacher
2010-08-19 10:41     ` Jakub Narebski
2010-08-20 18:48       ` Clemens Buchacher
2010-08-20 19:01         ` Joshua Jensen
2010-08-20 20:35     ` Junio C Hamano
2010-08-21  8:05       ` Clemens Buchacher
2010-08-22  7:25         ` Junio C Hamano
2010-08-22  8:20           ` Clemens Buchacher
2010-10-09 22:39         ` Kevin Ballard
2010-08-21 13:23       ` Clemens Buchacher
2010-10-09 13:52       ` [PATCH 0/5] do not overwrite untracked files in leading path Clemens Buchacher
2010-10-09 13:52       ` [PATCH 1/5] t7607: use test_commit and test_must_fail Clemens Buchacher
2010-10-10  6:35         ` Jonathan Nieder
2010-10-10  8:35           ` [PATCH 1/5 v2] t7607: use test-lib functions and check MERGE_HEAD Clemens Buchacher
2010-10-13 21:33             ` Junio C Hamano
2010-10-13 21:59             ` Junio C Hamano
2010-10-09 13:52       ` [PATCH 2/5] t7607: add leading-path tests Clemens Buchacher
2010-10-09 19:14         ` Johannes Sixt
2010-10-10  8:38           ` [PATCH 2/5 v2] " Clemens Buchacher
2010-10-09 13:52       ` [PATCH 3/5] add function check_ok_to_remove() Clemens Buchacher
2010-10-13 21:43         ` Junio C Hamano
2010-10-09 13:52       ` [PATCH 4/5] lstat_cache: optionally return match_len Clemens Buchacher
2010-10-09 13:53       ` [PATCH 5/5] do not overwrite files in leading path Clemens Buchacher
2010-10-13 21:57         ` Junio C Hamano
2010-10-13 22:34           ` Clemens Buchacher
2010-10-15  6:48             ` Clemens Buchacher
2010-10-15 18:47               ` Junio C Hamano
2010-08-20 20:46     ` [PATCH] optionally disable overwriting of ignored files Junio C Hamano
2010-08-21  6:48       ` [PATCH v2] " Clemens Buchacher
2010-08-23  8:33     ` [PATCH] " Matthieu Moy
2010-08-31 18:44       ` Heiko Voigt
2010-08-23  9:37     ` Matthieu Moy
2010-08-23 13:56       ` Holger Hellmuth
2010-08-23 15:11         ` Clemens Buchacher
2010-08-23 15:57           ` Junio C Hamano
2010-08-24  7:28             ` Clemens Buchacher
2010-08-24 16:19               ` Junio C Hamano
2018-10-16  9:10       ` Ignored files being silently overwritten when switching branches Ævar Arnfjörð Bjarmason
2018-10-16 15:05         ` Duy Nguyen
2018-10-18  1:55           ` Junio C Hamano
2018-11-06 15:12 ` Checkout deleted semi-untracked file Ævar Arnfjörð Bjarmason
2018-11-11  9:52   ` [RFC PATCH] Introduce "precious" file concept Nguyễn Thái Ngọc Duy
2018-11-11 12:15     ` Bert Wesarg
2018-11-11 12:59     ` Junio C Hamano
2018-11-26 19:38     ` [PATCH v2 0/2] Precios files round two Nguyễn Thái Ngọc Duy
2018-11-26 19:38       ` [PATCH v2 1/2] Introduce "precious" file concept Nguyễn Thái Ngọc Duy
2018-11-26 19:38       ` [PATCH v2 2/2] unpack-trees: support core.allIgnoredFilesArePreciousWhenMerging Nguyễn Thái Ngọc Duy
2018-11-11 12:33   ` [RFC PATCH] Introduce "precious" file concept Ævar Arnfjörð Bjarmason
2018-11-11 13:06     ` Ævar Arnfjörð Bjarmason
2018-11-12 16:14       ` Duy Nguyen
2018-11-11 15:41     ` Duy Nguyen
2018-11-11 16:55       ` Ævar Arnfjörð Bjarmason
2018-11-12  7:35       ` Per Lundberg
2018-11-12  9:08         ` Matthieu Moy
2018-11-12  9:49           ` Ævar Arnfjörð Bjarmason
2018-11-12 10:26             ` Junio C Hamano
2018-11-12 12:45               ` Ævar Arnfjörð Bjarmason
2018-11-12 13:02                 ` Junio C Hamano
2018-11-12 16:07           ` Duy Nguyen
2018-11-12 23:22     ` brian m. carlson
2018-11-26  9:30       ` Per Lundberg
2018-11-26 10:28         ` Ævar Arnfjörð Bjarmason
2018-11-26 12:49         ` Junio C Hamano
2018-11-27 15:08           ` Ævar Arnfjörð Bjarmason
2018-11-28  3:58             ` Junio C Hamano
2018-11-28 21:54               ` Ævar Arnfjörð Bjarmason
2018-11-29  5:04                 ` Junio C Hamano
2018-12-01  6:21                 ` Duy Nguyen
2018-11-26 15:26         ` Duy Nguyen
2018-11-26 15:34           ` Ævar Arnfjörð Bjarmason
2018-11-26 15:40             ` Duy Nguyen
2018-11-26 15:47               ` Ævar Arnfjörð Bjarmason
2018-11-26 15:55                 ` Duy Nguyen
2018-11-27  9:43                   ` Per Lundberg
2018-11-27 12:55                     ` Jacob Keller
2018-11-27 14:50                       ` Per Lundberg
2018-11-28  1:21                         ` brian m. carlson
2018-11-28  6:54                           ` Per Lundberg
2018-11-27 15:19                       ` Duy Nguyen
2018-12-06 18:39                       ` Duy Nguyen
2018-11-26 16:02       ` Eckhard Maaß
  -- strict thread matches above, loose matches on Subject: below --
2018-10-15 13:01 Ignored files being silently overwritten when switching branches Per Lundberg
2018-10-16  6:40 ` Jeff King
2018-11-06 12:41 Checkout deleted semi-untracked file Steffen Jost

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).