From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Duy Nguyen <pclouds@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>,
	Git Mailing List <git@vger.kernel.org>,
	Per Lundberg <per.lundberg@hibox.tv>,
	Steffen Jost <jost@tcs.ifi.lmu.de>,
	Joshua Jensen <jjensen@workspacewhiz.com>,
	Matthieu Moy <git@matthieu-moy.fr>,
	Clemens Buchacher <drizzd@gmx.net>,
	Holger Hellmuth <hellmuth@ira.uka.de>,
	Kevin Ballard <kevin@sb.org>
Subject: Re: [PATCH 1/1] Introduce "precious" file concept
Date: Wed, 20 Feb 2019 11:46:57 +0100
Message-ID: <87ftsi68ke.fsf@evledraar.gmail.com>
In-Reply-To: <CACsJy8B15hORnaOdYW8TNE3Gniv9NBJopyLYmHR5iF0U3beq6g@mail.gmail.com>


On Wed, Feb 20 2019, Duy Nguyen wrote:

> On Wed, Feb 20, 2019 at 4:19 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>> > I personally do not believe in "backup log"; if we can screw up and
>> > can fail to stop an operation that must avoid losing info, then we
>> > can screw up the same way and fail to design and implement "backup"
>> > to save info before an operation loses it.
>>
>> Yes, there could be some unforeseen interaction between git commands
>> where we should have such a backup log, but did not think to implement
>> it. I'd hope such cases would be reported, and we could fix them.
>>
>> But those sorts of cases aren't why we started discussing this, rather
>> we *know* what the data shredding command interaction is, but there
>> wasn't a consensus for just not shredding data by default by making
>> users use "checkout -f" or "merge -f" to proceed. I.e. taking some
>> variant of my "trashable" patch[1].
>>
>> > If we do a good job in
>> > supporting "precious" in various operations, we can rely less on
>> > "backup log" and still be safe ;-)
>>
>> As noted in previous discussions[2] I think that's entirely
>> implausible. I think at best the "precious" facility will be used to
>> mark e.g. *.o files as "don't check in, but don't clean (Makefile handles
>> it)".
>>
>> Most git users are at the level of only knowing very basic
>> add/commit/pull/push command interaction. I feel strongly that we need
>> to make our tools safe to use by default, and not require some
>> relatively advanced "precious"/attribute facility to be carefully
>> configured in advance so we don't throw away uncommitted work on the
>> likes of merge/checkout.
>
> There is a trade off somewhere. "new user first" should not come at
> the cost for more experienced users.
>
> Making "git checkout/merge" abort where it worked before breaks
> scripts. And requiring to mark trashable files manually duplicates a
> lot of ignore patterns. Have a look at any .gitignore file, the
> majority of them is for discardable files because "ignored" class was
> created with those in mind (*.o and friends). So now you would need to
> add more or less the same set of ignore rules in .gitattributes to
> mark them trashable, and gitignore/gitattributes rules are not exactly
> compatible, you can't just blindly copy them over. Every time you add
> one more .gitignore rule, there's a good chance you need to add a
> similar rule for trashable attribute.
>
> Maybe we just add a new "newbie" config knob and turn the safety
> nets on. Leave the knob on by default. And I will turn it off in my
> ~/.gitconfig as soon as it's real.

Oh yes, as noted upthread ("My commentary on this whole thing..."[1])
my position on what we should do at this point is not that we should
definitely go one way or the other, but that more investigation is
needed.

As my "trashable"[2] patch makes clear, we don't even have good tests or
documentation for these cases, which would be a good first step.

The one thing that *is* clear from my digging a few months back is that
the behavior we have now in git is overzealous when we look at the
initial case reported by Shawn way back when it was added.

Specifically, the intention back in 2007 was to fix a case where "git
checkout" ("read-tree -m", but whatever) would barf on a branch switch
where switching needed to replace a *tracked* "smth" with a *tracked*
"smth/file", or the other way around[3].
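
To make that 2007 case concrete, here is a rough sketch (not git's actual
unpack-trees logic) of spotting a name that is a tracked file in one tree
and a tracked directory in the other. The helper names parent_dirs and
df_conflicts are made up for illustration, and each tree is assumed to be
just a set of slash-separated tracked paths:

```python
from typing import Iterable, Set


def parent_dirs(path: str) -> Iterable[str]:
    """Yield every leading directory of a slash-separated path."""
    parts = path.split("/")
    for i in range(1, len(parts)):
        yield "/".join(parts[:i])


def df_conflicts(tree_a: Set[str], tree_b: Set[str]) -> Set[str]:
    """Return names that are a file in one tree and a directory in the other.

    Hypothetical helper: each tree is modelled as a plain set of tracked
    file paths, which is a simplification of what git stores.
    """
    dirs_a = {d for p in tree_a for d in parent_dirs(p)}
    dirs_b = {d for p in tree_b for d in parent_dirs(p)}
    # A path tracked as a file on one side but implied as a directory
    # on the other side is a directory/file conflict.
    return (tree_a & dirs_b) | (tree_b & dirs_a)
```

With tree A tracking "smth" and tree B tracking "smth/file", the only name
reported is "smth", in either switching direction.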

Does that mean we can just back that behavior out? No, because people
might have come to rely on it, but we should start with seeing exactly
what it *does* do, whether all those things are important or intended,
and maybe we can weigh the shredding/backcompat trade-off for some of
those differently than others.

So the obvious thing to try would be to see if we can narrowly keep the
behavior where we end up shredding a file on disk, *but* are switching
between two trees A & B where one has that file and the other doesn't.

Or more generously, try to "git hash-object" arbitrary files we're about
to shred, and check if it's already in the object database. That would
catch the case where e.g. the user is switching from A->B and would shred a
file, but it (or conflicting dir) is known to neither "A" nor "B", but
exists as a checked-in file in unrelated commit "C", which the user
recently had checked out (and e.g. their editor auto saved it as-is or
something...).
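
A minimal sketch of that check, under some loud assumptions: the
repository uses SHA-1 object names, no clean filters or line-ending
conversion apply to the file (which "git hash-object <path>" would
otherwise take into account), and "git" is on PATH. The function names
blob_oid and known_to_repo are invented for this example; the only git
command relied on is "git cat-file -e", which exits zero if the object
exists:

```python
import hashlib
import subprocess


def blob_oid(data: bytes) -> str:
    """Compute the SHA-1 object name git gives a blob with this content."""
    # A blob is hashed as "blob <size>\0" followed by the raw bytes.
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def known_to_repo(path: str, repo_dir: str = ".") -> bool:
    """Return True if the file's content already exists as an object.

    Hypothetical helper: ignores filters, and does not care which commit
    (if any) references the object.
    """
    with open(path, "rb") as fh:
        oid = blob_oid(fh.read())
    result = subprocess.run(
        ["git", "-C", repo_dir, "cat-file", "-e", oid],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

Something like known_to_repo("smth") returning True would mean the
content is recoverable from the object database, so clobbering it loses
nothing; False is the case where we'd really be eating data.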

But it's entirely possible that after all that digging we'll come to the
conclusion that we can't change this at all, and we're just going to
live with all the current caveats.

Even so, what amounts to a power user feature, one that mitigates the
damage only if you know git well enough to anticipate the problem, is
going to help only a small minority of users. So the "dude, where's my
data?" problem will still exist.

Even then there's room to maneuver, e.g.:

 X. Perhaps after investigating it's not acceptable to change the
    default for script use, but could we require --force if we detect
    that we're connected to a terminal?

 Y. Or if even that is considered too much, we could have something like
    how help.autoCorrect works, where if we detect we're about to eat
    data we wait for 10 seconds, and invite the user to Ctrl+C now
    because we're about to clobber file "xyz".

 Z. Perhaps it's for whatever reason still unacceptable to do X or Y (or some
    similar mitigation) for all cases of file shredding, but would be OK
    for some specific sub-cases (e.g. the not known to git-hash-object
    case above), and we have reason to suspect that such a narrow
    mitigation strikes the right trade-off between backwards
    compatibility and preventing the "dude, where's my data?" reports we
    get about this periodically.
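
To illustrate what X and Y might look like, here is a small sketch. It is
not a proposal for how git itself would implement either; the names
may_clobber and countdown_before_clobber, the "interactive" flag and the
injectable "sleep" argument are all made up for the example:

```python
import sys
import time
from typing import Callable, Sequence


def may_clobber(force: bool, interactive: bool) -> bool:
    """Option X: scripts keep the old behavior, terminals need --force."""
    return force or not interactive


def countdown_before_clobber(
    paths: Sequence[str],
    seconds: int = 10,
    sleep: Callable[[float], None] = time.sleep,
    out=sys.stderr,
) -> bool:
    """Option Y: warn, then wait so the user can press Ctrl+C.

    Returns True if the wait ran to completion, False if interrupted.
    """
    names = ", ".join(paths)
    print(
        f"warning: about to clobber {names}; "
        f"press Ctrl+C within {seconds}s to abort",
        file=out,
    )
    try:
        # Sleep one second at a time so Ctrl+C is noticed promptly.
        for _ in range(seconds):
            sleep(1)
    except KeyboardInterrupt:
        return False
    return True
```

A caller could pass interactive=sys.stdin.isatty() to may_clobber, and
only fall back to countdown_before_clobber(["xyz"]) when refusing
outright is considered too disruptive.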

1. https://public-inbox.org/git/87wolzo7a1.fsf@evledraar.gmail.com/
2. https://public-inbox.org/git/87zhuf3gs0.fsf@evledraar.gmail.com/
3. https://public-inbox.org/git/87wopj3661.fsf@evledraar.gmail.com/
