git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Cc: git@vger.kernel.org, Steffen Jost <jost@tcs.ifi.lmu.de>,
	Joshua Jensen <jjensen@workspacewhiz.com>,
	Per Lundberg <per.lundberg@hibox.tv>,
	Junio C Hamano <gitster@pobox.com>,
	Matthieu Moy <git@matthieu-moy.fr>,
	Clemens Buchacher <drizzd@gmx.net>,
	Holger Hellmuth <hellmuth@ira.uka.de>,
	Kevin Ballard <kevin@sb.org>
Subject: Re: [RFC PATCH] Introduce "precious" file concept
Date: Sun, 11 Nov 2018 14:06:23 +0100	[thread overview]
Message-ID: <87zhuf3gs0.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <871s7r4wuv.fsf@evledraar.gmail.com>


On Sun, Nov 11 2018, Ævar Arnfjörð Bjarmason wrote:

> [CC-ing some of the people involved in recent threads about this]
>
> On Sun, Nov 11 2018, Nguyễn Thái Ngọc Duy wrote:
>
>> Since this topic has come up twice recently, I revisited this
>> "precious" thingy that I started four years ago and tried to see if I
>> could finally finish it. There are a couple things to be sorted out...
>>
>> A new attribute "precious" is added to indicate that certain files
>> have valuable content and should not be easily discarded even if they
>> are ignored or untracked (*).
>>
>> So far there are two parts of Git that are made aware of precious
>> files: "git clean" will leave precious files alone and unpack-trees.c
>> (i.e. merges and branch switches) will not overwrite
>> ignored-but-precious files.
>>
>> Is there any other parts of Git that should be made aware of this
>> "precious" attribute?
>>
>> Also while "precious" is a fun name, but it does not sound serious.
>> Any suggestions? Perhaps "valuable"?
>>
>> Very lightly tested. The patch is more to have something to discuss
>> than is bug free and ready to use.
>>
>> (*) Note that tracked files could be marked "precious" in the future
>>     too although the exact semantics is not very clear since tracked
>>     files are by default precious.
>>
>>     But something like "index log" could use this to record all
>>     changes to precious files instead of just "git add -p" changes,
>>     for example. So these files are in a sense more precious than
>>     other tracked files.
>>
>> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
>> ---
>>  Documentation/git-clean.txt     |  3 ++-
>>  Documentation/gitattributes.txt | 13 +++++++++++++
>>  attr.c                          |  9 +++++++++
>>  attr.h                          |  2 ++
>>  builtin/clean.c                 | 19 ++++++++++++++++---
>>  unpack-trees.c                  |  3 ++-
>>  6 files changed, 44 insertions(+), 5 deletions(-)
>>
>> diff --git a/Documentation/git-clean.txt b/Documentation/git-clean.txt
>> index 03056dad0d..a9beadfb12 100644
>> --- a/Documentation/git-clean.txt
>> +++ b/Documentation/git-clean.txt
>> @@ -21,7 +21,8 @@ option is specified, ignored files are also removed. This can, for
>>  example, be useful to remove all build products.
>>
>>  If any optional `<path>...` arguments are given, only those paths
>> -are affected.
>> +are affected. Ignored or untracked files with `precious` attributes
>> +are not removed.
>>
>>  OPTIONS
>>  -------
>> diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
>> index b8392fc330..c722479bdc 100644
>> --- a/Documentation/gitattributes.txt
>> +++ b/Documentation/gitattributes.txt
>> @@ -1188,6 +1188,19 @@ If this attribute is not set or has an invalid value, the value of the
>>  (See linkgit:git-config[1]).
>>
>>
>> +Precious files
>> +~~~~~~~~~~~~~~~~~~~~~~~~
>> +
>> +`precious`
>> +^^^^^^^^^^
>> +
>> +This attribute is set on files to indicate that their content is
>> +valuable. Many commands will behave slightly different on precious
>> +files. linkgit:git-clean[1] will leave precious files alone. Merging
>> +and branch switching will not silently overwrite ignored files that
>> +are marked "precious".
>> +
>> +
>>  USING MACRO ATTRIBUTES
>>  ----------------------
>>
>> diff --git a/attr.c b/attr.c
>> index 60d284796d..d06ca0ae4b 100644
>> --- a/attr.c
>> +++ b/attr.c
>> @@ -1186,3 +1186,12 @@ void attr_start(void)
>>  	pthread_mutex_init(&check_vector.mutex, NULL);
>>  #endif
>>  }
>> +
>> +int is_precious_file(struct index_state *istate, const char *path)
>> +{
>> +	static struct attr_check *check;
>> +	if (!check)
>> +		check = attr_check_initl("precious", NULL);
>> +	git_check_attr(istate, path, check);
>> +	return check && ATTR_TRUE(check->items[0].value);
>> +}
>
> If we merge two branches is this using the merged post-image of
> .gitattributes as a source?
>
>>  	if (o->dir &&
>> -	    is_excluded(o->dir, o->src_index, name, &dtype))
>> +	    is_excluded(o->dir, o->src_index, name, &dtype) &&
>> +	    !is_precious_file(o->src_index, name))
>>  		/*
>>  		 * ce->name is explicitly excluded, so it is Ok to
>>  		 * overwrite it.
>
> I wonder if instead we should just be reverting c81935348b ("Fix
> switching to a branch with D/F when current branch has file D.",
> 2007-03-15), which these days (haven't dug deeply) would just be this,
> right?:
>
>>    diff --git a/unpack-trees.c b/unpack-trees.c
>     index 7570df481b..b3efaddd4f 100644
>     --- a/unpack-trees.c
>     +++ b/unpack-trees.c
>     @@ -1894,13 +1894,6 @@ static int check_ok_to_remove(const char *name, int len, int dtype,
>      	if (ignore_case && icase_exists(o, name, len, st))
>      		return 0;
>
>     -	if (o->dir &&
>     -	    is_excluded(o->dir, o->src_index, name, &dtype))
>     -		/*
>     -		 * ce->name is explicitly excluded, so it is Ok to
>     -		 * overwrite it.
>     -		 */
>     -		return 0;
>      	if (S_ISDIR(st->st_mode)) {
>      		/*
>      		 * We are checking out path "foo" and
>
> Something like the approach you're taking will absolutely work from a
> technical standpoint, but I fear that it's going to be useless in
> practice.
>
> The users who need protection against git deleting their files the most
> are exactly the sort of users who aren't expert-level enough to
> understand the nuances of how the semantics of .gitignore and "precious"
> are going to interact before git eats their data.
>
> This is pretty apparent from the bug reports we're getting about
> this. None of them are:
>
>     "Hey, I 100% understood .gitignore semantics including this one part
>     of the docs where you say you'll do this, but just forgot one day
>     and deleted my work. Can we get some more safety?"
>
> But rather (with some hyperbole for effect):
>
>     "ZOMG git deleted my file! Is this a bug??"
>
> So I think we should have the inverse of this "precious"
> attribute". Just a change to the docs to say that .gitignore doesn't
> imply these eager deletion semantics on tree unpacking anymore, and if
> users want it back they can define a "garbage" attribute
> (s/precious/garbage/).
>
> That will lose no data, and in the very rare cases where a checkout of
> tracked files would overwrite an ignored pattern, we can just error out
> (as we do with the "Ok to overwrite" branch removed) and tell the user
> to delete the files to proceed.
>
> Three tests in our test suite fail with that patch applied, and they're
> explicitly testing for exactly the sort of scenario where users are likely to lose data. I.e.:
>
>  1. Open a tracked file in an editor
>  2. Save it
>  3. Switch to a topic branch, that has different .gitignore semantics
>     (e.g. let's say a build/ dir exists there)
>  4. Have their work deleted
>
> So actually in writing this out I've become convinced that this
> "precious" approach can't work either, because *even if* you're an
> expert who manages to perfectly define their .gitignore and "precious"
> rules in advance to avoid data deletion, those rules will *also* need to
> take into account switching between branches (or even different
> histories) where you have other sorts of rules.
>
> So really, if there's ambiguity let's just not delete stuff by default
> and ask the user to resolve it.

Here's a patch to implement that (which borrows from some of yours). It
passes all of our tests:

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index b8392fc330..a6cad17899 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -1188,6 +1188,17 @@ If this attribute is not set or has an invalid value, the value of the
 (See linkgit:git-config[1]).


+Trashable files
+~~~~~~~~~~~~~~~
+
+`trashable`
+^^^^^^^^^^
+
+Provides an escape hatch for re-enabling a potentially data destroying
+feature which was enabled by default between Git versions 1.5.2 and
+2.20. See the `NOTES` section of linkgit:gitignore[5] for details.
+
+
 USING MACRO ATTRIBUTES
 ----------------------

diff --git a/Documentation/gitignore.txt b/Documentation/gitignore.txt
index d107daaffd..39c6d5955a 100644
--- a/Documentation/gitignore.txt
+++ b/Documentation/gitignore.txt
@@ -140,6 +140,13 @@ not tracked by Git remain untracked.
 To stop tracking a file that is currently tracked, use
 'git rm --cached'.

+Between Git versions 1.5.2 and 2.20 untracked files or directories
+which were ignored and conflicted with a file about to be checked out
+(e.g. during linkgit:git-checkout[1] or linkgit:git-merge[1]) would be
+deleted. This could lead to loss of user data and is no longer the
+default, See `trashable` in linkgit:gitattributes[5]. for how to
+selectively enable this behavior.
+
 EXAMPLES
 --------

diff --git a/attr.c b/attr.c
index 60d284796d..930af78650 100644
--- a/attr.c
+++ b/attr.c
@@ -1186,3 +1186,12 @@ void attr_start(void)
 	pthread_mutex_init(&check_vector.mutex, NULL);
 #endif
 }
+
+int is_trashable_file(struct index_state *istate, const char *path)
+{
+	static struct attr_check *check;
+	if (!check)
+		check = attr_check_initl("trashable", NULL);
+	git_check_attr(istate, path, check);
+	return check && ATTR_TRUE(check->items[0].value);
+}
diff --git a/attr.h b/attr.h
index b0378bfe5f..ccf4d4e6b5 100644
--- a/attr.h
+++ b/attr.h
@@ -82,4 +82,6 @@ void git_attr_set_direction(enum git_attr_direction new_direction);

 void attr_start(void);

+int is_trashable_file(struct index_state *istate, const char *path);
+
 #endif /* ATTR_H */
diff --git a/t/lib-submodule-update.sh b/t/lib-submodule-update.sh
index 016391723c..d2ceee33d2 100755
--- a/t/lib-submodule-update.sh
+++ b/t/lib-submodule-update.sh
@@ -844,6 +844,8 @@ test_submodule_switch_recursing_with_args () {
 			git branch -t add_sub1 origin/add_sub1 &&
 			: >sub1 &&
 			echo sub1 >.git/info/exclude &&
+			test_must_fail $command add_sub1 &&
+			echo sub1 trashable >.gitattributes &&
 			$command add_sub1 &&
 			test_superproject_content origin/add_sub1 &&
 			test_submodule_content sub1 origin/add_sub1
diff --git a/t/t1004-read-tree-m-u-wf.sh b/t/t1004-read-tree-m-u-wf.sh
index c13578a635..2243cd955e 100755
--- a/t/t1004-read-tree-m-u-wf.sh
+++ b/t/t1004-read-tree-m-u-wf.sh
@@ -63,8 +63,10 @@ test_expect_success 'two-way with incorrect --exclude-per-directory (2)' '
 	fi
 '

-test_expect_success 'two-way clobbering a ignored file' '
+test_expect_success 'two-way keeping a ignored file, trashing a trashable file' '

+	read_tree_u_must_fail -m -u --exclude-per-directory=.gitignore master side &&
+	echo file2 trashable >.gitattributes &&
 	read_tree_u_must_succeed -m -u --exclude-per-directory=.gitignore master side
 '

@@ -106,7 +108,7 @@ test_expect_success 'three-way not clobbering a working tree file' '

 echo >.gitignore file3

-test_expect_success 'three-way not complaining on an untracked file' '
+test_expect_success 'three-way complaining on an untracked file, trashing a trashable file' '

 	git reset --hard &&
 	rm -f file2 subdir/file2 file3 subdir/file3 &&
@@ -114,6 +116,8 @@ test_expect_success 'three-way not complaining on an untracked file' '
 	echo >file3 file three created in master, untracked &&
 	echo >subdir/file3 file three created in master, untracked &&

+	read_tree_u_must_fail -m -u --exclude-per-directory=.gitignore branch-point master side &&
+	echo file3 trashable >.gitattributes &&
 	read_tree_u_must_succeed -m -u --exclude-per-directory=.gitignore branch-point master side
 '

diff --git a/unpack-trees.c b/unpack-trees.c
index 7570df481b..e9a7fb6583 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1895,9 +1895,10 @@ static int check_ok_to_remove(const char *name, int len, int dtype,
 		return 0;

 	if (o->dir &&
-	    is_excluded(o->dir, o->src_index, name, &dtype))
+	    is_excluded(o->dir, o->src_index, name, &dtype) &&
+	    is_trashable_file(o->src_index, name))
 		/*
-		 * ce->name is explicitly excluded, so it is Ok to
+		 * ce->name is explicitly trashable, so it is Ok to
 		 * overwrite it.
 		 */
 		return 0;

  reply	other threads:[~2018-11-11 13:06 UTC|newest]

Thread overview: 86+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-08-17  5:21 git merge, .gitignore, and silently overwriting untracked files Joshua Jensen
2010-08-17 19:33 ` Junio C Hamano
2010-08-18 23:39   ` [PATCH] optionally disable overwriting of ignored files Clemens Buchacher
2010-08-19 10:41     ` Jakub Narebski
2010-08-20 18:48       ` Clemens Buchacher
2010-08-20 19:01         ` Joshua Jensen
2010-08-20 20:35     ` Junio C Hamano
2010-08-21  8:05       ` Clemens Buchacher
2010-08-22  7:25         ` Junio C Hamano
2010-08-22  8:20           ` Clemens Buchacher
2010-10-09 22:39         ` Kevin Ballard
2010-08-21 13:23       ` Clemens Buchacher
2010-10-09 13:52       ` [PATCH 0/5] do not overwrite untracked files in leading path Clemens Buchacher
2010-10-09 13:52       ` [PATCH 1/5] t7607: use test_commit and test_must_fail Clemens Buchacher
2010-10-10  6:35         ` Jonathan Nieder
2010-10-10  8:35           ` [PATCH 1/5 v2] t7607: use test-lib functions and check MERGE_HEAD Clemens Buchacher
2010-10-13 21:33             ` Junio C Hamano
2010-10-13 21:59             ` Junio C Hamano
2010-10-09 13:52       ` [PATCH 2/5] t7607: add leading-path tests Clemens Buchacher
2010-10-09 19:14         ` Johannes Sixt
2010-10-10  8:38           ` [PATCH 2/5 v2] " Clemens Buchacher
2010-10-09 13:52       ` [PATCH 3/5] add function check_ok_to_remove() Clemens Buchacher
2010-10-13 21:43         ` Junio C Hamano
2010-10-09 13:52       ` [PATCH 4/5] lstat_cache: optionally return match_len Clemens Buchacher
2010-10-09 13:53       ` [PATCH 5/5] do not overwrite files in leading path Clemens Buchacher
2010-10-13 21:57         ` Junio C Hamano
2010-10-13 22:34           ` Clemens Buchacher
2010-10-15  6:48             ` Clemens Buchacher
2010-10-15 18:47               ` Junio C Hamano
2010-08-20 20:46     ` [PATCH] optionally disable overwriting of ignored files Junio C Hamano
2010-08-21  6:48       ` [PATCH v2] " Clemens Buchacher
2010-08-23  8:33     ` [PATCH] " Matthieu Moy
2010-08-31 18:44       ` Heiko Voigt
2010-08-23  9:37     ` Matthieu Moy
2010-08-23 13:56       ` Holger Hellmuth
2010-08-23 15:11         ` Clemens Buchacher
2010-08-23 15:57           ` Junio C Hamano
2010-08-24  7:28             ` Clemens Buchacher
2010-08-24 16:19               ` Junio C Hamano
2018-10-16  9:10       ` Ignored files being silently overwritten when switching branches Ævar Arnfjörð Bjarmason
2018-10-16 15:05         ` Duy Nguyen
2018-10-18  1:55           ` Junio C Hamano
2018-11-06 15:12 ` Checkout deleted semi-untracked file Ævar Arnfjörð Bjarmason
2018-11-11  9:52   ` [RFC PATCH] Introduce "precious" file concept Nguyễn Thái Ngọc Duy
2018-11-11 12:15     ` Bert Wesarg
2018-11-11 12:59     ` Junio C Hamano
2018-11-26 19:38     ` [PATCH v2 0/2] Precios files round two Nguyễn Thái Ngọc Duy
2018-11-26 19:38       ` [PATCH v2 1/2] Introduce "precious" file concept Nguyễn Thái Ngọc Duy
2018-11-26 19:38       ` [PATCH v2 2/2] unpack-trees: support core.allIgnoredFilesArePreciousWhenMerging Nguyễn Thái Ngọc Duy
2018-11-11 12:33   ` [RFC PATCH] Introduce "precious" file concept Ævar Arnfjörð Bjarmason
2018-11-11 13:06     ` Ævar Arnfjörð Bjarmason [this message]
2018-11-12 16:14       ` Duy Nguyen
2018-11-11 15:41     ` Duy Nguyen
2018-11-11 16:55       ` Ævar Arnfjörð Bjarmason
2018-11-12  7:35       ` Per Lundberg
2018-11-12  9:08         ` Matthieu Moy
2018-11-12  9:49           ` Ævar Arnfjörð Bjarmason
2018-11-12 10:26             ` Junio C Hamano
2018-11-12 12:45               ` Ævar Arnfjörð Bjarmason
2018-11-12 13:02                 ` Junio C Hamano
2018-11-12 16:07           ` Duy Nguyen
2018-11-12 23:22     ` brian m. carlson
2018-11-26  9:30       ` Per Lundberg
2018-11-26 10:28         ` Ævar Arnfjörð Bjarmason
2018-11-26 12:49         ` Junio C Hamano
2018-11-27 15:08           ` Ævar Arnfjörð Bjarmason
2018-11-28  3:58             ` Junio C Hamano
2018-11-28 21:54               ` Ævar Arnfjörð Bjarmason
2018-11-29  5:04                 ` Junio C Hamano
2018-12-01  6:21                 ` Duy Nguyen
2018-11-26 15:26         ` Duy Nguyen
2018-11-26 15:34           ` Ævar Arnfjörð Bjarmason
2018-11-26 15:40             ` Duy Nguyen
2018-11-26 15:47               ` Ævar Arnfjörð Bjarmason
2018-11-26 15:55                 ` Duy Nguyen
2018-11-27  9:43                   ` Per Lundberg
2018-11-27 12:55                     ` Jacob Keller
2018-11-27 14:50                       ` Per Lundberg
2018-11-28  1:21                         ` brian m. carlson
2018-11-28  6:54                           ` Per Lundberg
2018-11-27 15:19                       ` Duy Nguyen
2018-12-06 18:39                       ` Duy Nguyen
2018-11-26 16:02       ` Eckhard Maaß
  -- strict thread matches above, loose matches on Subject: below --
2018-10-15 13:01 Ignored files being silently overwritten when switching branches Per Lundberg
2018-10-16  6:40 ` Jeff King
2018-11-06 12:41 Checkout deleted semi-untracked file Steffen Jost

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87zhuf3gs0.fsf@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=drizzd@gmx.net \
    --cc=git@matthieu-moy.fr \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=hellmuth@ira.uka.de \
    --cc=jjensen@workspacewhiz.com \
    --cc=jost@tcs.ifi.lmu.de \
    --cc=kevin@sb.org \
    --cc=pclouds@gmail.com \
    --cc=per.lundberg@hibox.tv \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).