git@vger.kernel.org mailing list mirror (one of many)
* git reset --hard should not irretrievably destroy new files
@ 2016-12-03  5:04 Julian de Bhal
  2016-12-03  7:49 ` Johannes Sixt
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Julian de Bhal @ 2016-12-03  5:04 UTC (permalink / raw)
  To: git

If you `git add new_file; git reset --hard`, new_file is gone forever.

This is totally what git says it will do on the box, but it caught me out.

It might seem a little less stupid if I explain what I was doing: I was
breaking apart a chunk of work into smaller changes:

git commit -a -m 'tmp'           # You feel pretty safe now, right?
git checkout -b backup/my-stuff  # Not necessary, just a convenience
git checkout -
git reset HEAD^                  # mixed
git add new_file
git add -p                       # also not necessary, but distracting
git reset --hard                 # decided copy from backed up diff
# boom. new_file is gone forever


Now, again, this is totally what git says it's going to do, and that was
pretty stupid, but that file is gone for good, and it feels bad.

Everything that was committed is safe, and the other untracked files in
my local directory are also fine, but that particular file is
permanently destroyed. This is the first time I've lost something since I
discovered the reflog a year or two ago.

The behaviour that would make the most sense to me (personally) would be
for a hard reset to unstage new files, but I'd be nearly as happy if a
commit was added to the reflog when the reset happens (I can probably make
that happen with some configuration now that I've been bitten).

If there's support for this idea but no-one is keen to write the code, let
me know and I could have a crack at it.

Cheers,

Julian de Bhál


* Re: git reset --hard should not irretrievably destroy new files
  2016-12-03  5:04 git reset --hard should not irretrievably destroy new files Julian de Bhal
@ 2016-12-03  7:49 ` Johannes Sixt
  2016-12-04  0:14   ` Julian de Bhal
  2016-12-03  8:11 ` Christian Couder
  2016-12-04 19:08 ` Junio C Hamano
  2 siblings, 1 reply; 7+ messages in thread
From: Johannes Sixt @ 2016-12-03  7:49 UTC (permalink / raw)
  To: Julian de Bhal; +Cc: git

On 03.12.2016 at 06:04, Julian de Bhal wrote:
> If you `git add new_file; git reset --hard`, new_file is gone forever.

AFAIC, this is a feature ;-) I occasionally use it to remove a file when 
I already have git-gui in front of me. Then it's often less convenient 
to type the path in a shell, or to pointy-click around in a file browser.

> git add new_file

Because of this ...

> git add -p                       # also not necessary, but distracting
> git reset --hard                 # decided copy from backed up diff
> # boom. new_file is gone forever

... it is not. The file is still among the dangling blobs in the 
repository until you clean it up with 'git gc'. Use 'git fsck --lost-found':

--lost-found

     Write dangling objects into .git/lost-found/commit/ or
     .git/lost-found/other/, depending on type. If the object is a blob,
     the contents are written into the file, rather than its object name.

-- Hannes
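
A minimal sketch of the recovery described above, run in a throwaway
repository so nothing real is touched. The temporary directory, the file
contents and the demo identity are assumptions made for illustration; the
only git features relied on are `git add`, `git reset --hard` and
`git fsck --lost-found`.

```shell
set -eu

# Throwaway repository (hypothetical location chosen by mktemp).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m 'initial'

# Stage a brand-new file: "git add" writes its contents as a blob object.
echo 'precious content' > new_file
git add new_file

# The hard reset drops new_file from the index and the working tree...
git reset -q --hard
test ! -e new_file

# ...but the blob is still in the object database, now unreferenced.
# --lost-found copies the contents of dangling blobs to .git/lost-found/other/
git fsck --lost-found >/dev/null 2>&1

# Only one dangling blob exists in this demo, so the glob matches one file.
recovered=$(cat .git/lost-found/other/*)
echo "$recovered"
```

The recovered file is named after the blob's object name, not `new_file`,
so in a repository with many dangling blobs you would have to inspect the
files to find the right one.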



* Re: git reset --hard should not irretrievably destroy new files
  2016-12-03  5:04 git reset --hard should not irretrievably destroy new files Julian de Bhal
  2016-12-03  7:49 ` Johannes Sixt
@ 2016-12-03  8:11 ` Christian Couder
  2016-12-04  0:57   ` Julian de Bhal
  2016-12-04 19:08 ` Junio C Hamano
  2 siblings, 1 reply; 7+ messages in thread
From: Christian Couder @ 2016-12-03  8:11 UTC (permalink / raw)
  To: Julian de Bhal; +Cc: git

On Sat, Dec 3, 2016 at 6:04 AM, Julian de Bhal <julian.debhal@gmail.com> wrote:
> If you `git add new_file; git reset --hard`, new_file is gone forever.
>
> This is totally what git says it will do on the box, but it caught me out.

Yeah, you are not the first one, and probably not the last
unfortunately, to be caught by it, see for example the last discussion
about it:

https://public-inbox.org/git/loom.20160523T023140-975@post.gmane.org/

which itself refers to this previous discussion:

https://public-inbox.org/git/CANWD=rX-MEiS4cNzDWr2wwkshz2zu8-L31UrKwbZrJSBcJX-nQ@mail.gmail.com/

> It might seem a little less stupid if I explain what I was doing: I was
> breaking apart a chunk of work into smaller changes:
>
> git commit -a -m 'tmp'           # You feel pretty safe now, right?
> git checkout -b backup/my-stuff  # Not necessary, just a convenience
> git checkout -
> git reset HEAD^                  # mixed
> git add new_file
> git add -p                       # also not necessary, but distracting
> git reset --hard                 # decided copy from backed up diff
> # boom. new_file is gone forever
>
>
> Now, again, this is totally what git says it's going to do, and that was
> pretty stupid, but that file is gone for good, and it feels bad.

Yeah, I agree that it feels bad even if there are often ways to get
back your data as you can see from the links in Yotam's email above.

> Everything that was committed is safe, and the other untracked files in
> my local directory are also fine, but that particular file is
> permanently destroyed. This is the first time I've lost something since I
> discovered the reflog a year or two ago.
>
> The behaviour that would make the most sense to me (personally) would be
> for a hard reset to unstage new files,

This has already been proposed last time...

> but I'd be nearly as happy if a
> commit was added to the reflog when the reset happens (I can probably make
> that happen with some configuration now that I've been bitten).

Not sure if this has been proposed. Perhaps it would be simpler to
just output the sha1, and maybe the filenames too, of the blobs that
are no longer referenced from the trees, somewhere (in a bloblog?).

> If there's support for this idea but no-one is keen to write the code, let
> me know and I could have a crack at it.

Not sure if your report and your offer will make us more likely to
agree to do something, but thanks for trying!


* Re: git reset --hard should not irretrievably destroy new files
  2016-12-03  7:49 ` Johannes Sixt
@ 2016-12-04  0:14   ` Julian de Bhal
  0 siblings, 0 replies; 7+ messages in thread
From: Julian de Bhal @ 2016-12-04  0:14 UTC (permalink / raw)
  Cc: git

On Sat, Dec 3, 2016 at 5:49 PM, Johannes Sixt <j6t@kdbg.org> wrote:
> On 03.12.2016 at 06:04, Julian de Bhal wrote:
>>
>> If you `git add new_file; git reset --hard`, new_file is gone forever.
>
> AFAIC, this is a feature ;-) I occasionally use it to remove a file when I
> already have git-gui in front of me. Then it's often less convenient to type
> the path in a shell, or to pointy-click around in a file browser.

Yeah, I'm conscious that it would be a change in behaviour and would
almost certainly break things in the wild.

On the other hand, `rm` deletes perfectly well, but there's no good
way to recover the lost files after the fact. You can take some
precautions after you've been bitten, but git usually means never
saying "you should have".

>> git add new_file
>> [...]
>> git reset --hard                 # decided copy from backed up diff
>> # boom. new_file is gone forever
>
> ... it is not. The file is still among the dangling blobs in the repository
> until you clean it up with 'git gc'. Use 'git fsck --lost-found':

Thank you so much! Super glad to be wrong here.

Cheers,

Jules



* Re: git reset --hard should not irretrievably destroy new files
  2016-12-03  8:11 ` Christian Couder
@ 2016-12-04  0:57   ` Julian de Bhal
  2016-12-04 10:47     ` Christian Couder
  0 siblings, 1 reply; 7+ messages in thread
From: Julian de Bhal @ 2016-12-04  0:57 UTC (permalink / raw)
  To: Christian Couder; +Cc: git

On Sat, Dec 3, 2016 at 6:11 PM, Christian Couder
<christian.couder@gmail.com> wrote:
> On Sat, Dec 3, 2016 at 6:04 AM, Julian de Bhal <julian.debhal@gmail.com> wrote:
>> but I'd be nearly as happy if a
>> commit was added to the reflog when the reset happens (I can probably make
>> that happen with some configuration now that I've been bitten).
>
> Not sure if this has been proposed. Perhaps it would be simpler to
> just output the sha1, and maybe the filenames too, of the blobs that
> are no longer referenced from the trees, somewhere (in a bloblog?).

Yeah, after doing a bit more reading around the issue, this seems like
a smaller part of destroying local changes with a hard reset, and I'm
one of the lucky ones where it is recoverable.

Has anyone discussed having `git reset --hard` create objects for the
current state of anything it's about to destroy, specifically so they
end up in the --lost-found?

I think this is what you're suggesting, only without checking for
references, so that tree & blob objects exist that make any hard reset
reversible.

Cheers

Jules

P.s. Thank you for such a warm welcome while I blunder through
unfamiliar protocols.


* Re: git reset --hard should not irretrievably destroy new files
  2016-12-04  0:57   ` Julian de Bhal
@ 2016-12-04 10:47     ` Christian Couder
  0 siblings, 0 replies; 7+ messages in thread
From: Christian Couder @ 2016-12-04 10:47 UTC (permalink / raw)
  To: Julian de Bhal; +Cc: git

On Sun, Dec 4, 2016 at 1:57 AM, Julian de Bhal <julian.debhal@gmail.com> wrote:
> On Sat, Dec 3, 2016 at 6:11 PM, Christian Couder
> <christian.couder@gmail.com> wrote:
>> On Sat, Dec 3, 2016 at 6:04 AM, Julian de Bhal <julian.debhal@gmail.com> wrote:
>>> but I'd be nearly as happy if a
>>> commit was added to the reflog when the reset happens (I can probably make
>>> that happen with some configuration now that I've been bitten).
>>
>> Not sure if this has been proposed. Perhaps it would be simpler to
>> just output the sha1, and maybe the filenames too, of the blobs that
>> are no longer referenced from the trees, somewhere (in a bloblog?).
>
> Yeah, after doing a bit more reading around the issue, this seems like
> a smaller part of destroying local changes with a hard reset, and I'm
> one of the lucky ones where it is recoverable.

Yeah, but not everyone knows it is recoverable and using fsck to
recover is not nice and easy for the user.
So having a bloblog for example in .git/logs/blobs/, like the reflogs
we already have, but for blobs, could help even if (first) it's just
about writing the filenames and sha1s related to the blobs we stop
referencing.
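
To make the idea concrete, here is a rough sketch of what such logging
could look like if done by hand today. Nothing here is an existing git
feature: `logged_reset` and the `.git/bloblog` file are hypothetical names
invented for this example (a separate file is used rather than anything
under .git/logs/), and the loop assumes simple file names without
newlines or characters that git would quote.

```shell
set -eu

# Hypothetical helper, not part of git: record "<sha1> <path>" for every
# path that is staged but absent from HEAD, then run the hard reset.
logged_reset() {
    log_file="$(git rev-parse --git-dir)/bloblog"   # hypothetical location
    git diff --cached --name-only --diff-filter=A HEAD |
    while IFS= read -r path; do
        # ":<path>" names the blob currently staged for that path
        printf '%s %s\n' "$(git rev-parse ":$path")" "$path"
    done >> "$log_file"
    git reset --hard "$@"
}

# Demo in a throwaway repository.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m 'initial'

echo 'precious content' > new_file
git add new_file
logged_reset >/dev/null

# The log gives both the name and the object id, so no fsck is needed.
sha=$(cut -d' ' -f1 .git/bloblog)
recovered=$(git cat-file -p "$sha")
echo "$recovered"
```

The blob itself is still subject to pruning by 'git gc', so a log like
this only helps while the object is still around.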

> Has anyone discussed having `git reset --hard` create objects for the
> current state of anything it's about to destroy, specifically so they
> end up in the --lost-found?

Well, when we start talking about creating new objects, someone
usually says that this is what "git stash" is for. So the discussion
then often turns to how we can make people more aware of "git stash",
or encourage them to create an alias or a shell function that does a
"git stash" before "git reset --hard ...", or teach them to use "git
reset --keep ..." when it does what they want and is safer...
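
As an illustration of the shell function variant, a sketch could look
like the following. The name `safe_reset` and the stash message are made
up for this example, and it assumes a git new enough to have
`git stash push` (2.13 or later); older versions would need
`git stash save` instead.

```shell
set -eu

# Hypothetical wrapper: stash staged, unstaged and untracked changes,
# then perform the hard reset with whatever arguments were given.
safe_reset() {
    git stash push --quiet --include-untracked \
        -m "before reset --hard $*" || return
    git reset --hard "$@"
}

# Demo in a throwaway repository.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name demo
git config user.email demo@example.com
git commit -q --allow-empty -m 'initial'

echo 'precious content' > new_file
git add new_file
safe_reset >/dev/null
test ! -e new_file

# The staged file survives in the stash and can be checked out from it.
git stash list
git checkout -q 'stash@{0}' -- new_file
restored=$(cat new_file)
echo "$restored"
```

The cost is an extra stash entry per reset, which is one reason this is
usually suggested as an opt-in alias rather than a change to the command
itself.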

> I think this is what you're suggesting, only without checking for
> references, so that tree & blob objects exist that make any hard reset
> reversible.

I suggest we start with just logging blobs that we have already
created (when they have been "git add"ed) but that we are
dereferencing.
If we can agree on that, it will already help and not be very costly
performance-wise. After that we could then start thinking about
creating blobs for all the content we discard, which could be done
only in a beginner mode (at least at first) to make sure it has no
performance impact if people rely on "git reset --hard" being fast.


* Re: git reset --hard should not irretrievably destroy new files
  2016-12-03  5:04 git reset --hard should not irretrievably destroy new files Julian de Bhal
  2016-12-03  7:49 ` Johannes Sixt
  2016-12-03  8:11 ` Christian Couder
@ 2016-12-04 19:08 ` Junio C Hamano
  2 siblings, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2016-12-04 19:08 UTC (permalink / raw)
  To: Julian de Bhal; +Cc: git

Julian de Bhal <julian.debhal@gmail.com> writes:

> The behaviour that would make the most sense to me (personally) would be
> for a hard reset to unstage new files,...

I think _sometimes_ that may be useful.  I haven't thought things
through yet to arrive at a final decision, but one thing that must be
kept in mind by anybody who wants to move this topic forward is that
a path that does not exist in the HEAD commit MUST be removed from
the index and the working tree upon "git reset --hard" when the path
resulted from a mergy operation.  I.e. in this sequence:

    $ git merge other-branch ;# or git cherry-pick one-commit
    ... try to resolve conflicts, make a mess, decide
    ... to try it again from scratch
    $ git reset --hard
    $ git merge other-branch ;# or git cherry-pick one-commit

the "reset --hard" step MUST remove new paths that existed only on
the other-branch (or in one-commit), which by definition would have
auto-resolved cleanly, from the index and the working tree.  There
are other commands (e.g. "git am -3", "git apply -3", "git rebase")
that are "mergy" and their intermediate states must be handled the
same way.
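
The merge case can be reproduced with a small script. This is only a
sketch in a throwaway repository; the file names (`shared`,
`only_on_other`) and the demo identity are invented for illustration.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name demo
git config user.email demo@example.com

echo base > shared
git add shared
git commit -q -m 'base'
main=$(git symbolic-ref --short HEAD)

# other-branch changes 'shared' and adds a path that exists only there.
git checkout -q -b other-branch
echo theirs > shared
echo new > only_on_other
git add shared only_on_other
git commit -q -m 'other'

# The original branch changes 'shared' differently, forcing a conflict.
git checkout -q "$main"
echo ours > shared
git commit -q -am 'ours'

# The merge stops on the conflict, but the new path auto-resolves cleanly
# and is already in the index and the working tree.
git merge other-branch >/dev/null 2>&1 || true
staged_after_merge=$(git ls-files only_on_other)

# Abandoning the attempt must remove that path again.
git reset -q --hard
left_over=$(git status --porcelain)
echo "staged during merge: $staged_after_merge"
echo "left over after reset: ${left_over:-nothing}"
```

At the moment of the reset, `only_on_other` is in the index and the
working tree but not in HEAD, just like a file the user created and
staged by hand.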

If a very simple to explain and understand rule can be used to tell
if a new path (i.e. a path that exists in the index and in the
working tree, but is not in the HEAD commit) is what was created
manually by the end user without any other copy (i.e. "create
newfile && edit newfile && git add newfile") and is not a result of
a mergy operation being abandoned, then I think it is OK to allow
"reset --hard" to leave the working tree files untracked, but if the
rule becomes anything complex to understand for the users, I think
it would make the behaviour of "reset --hard" hard to explain,
understand AND anticipate---and at that point, we would be better
off keeping the "You said 'hard', we clean 'hard' to match HEAD"
behaviour of "reset --hard" and EDUCATE users not to say 'hard' too
casually. There may be room for a new option that unconditionally
leaves the new working tree files untracked so that users can choose
between the two, if we end up going that route.




