From: Clemens Buchacher <drizzd@gmx.net>
To: Duy Nguyen <pclouds@gmail.com>, Junio C Hamano <gitster@pobox.com>
Cc: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Git Mailing List" <git@vger.kernel.org>,
	"Per Lundberg" <per.lundberg@hibox.tv>,
	"Steffen Jost" <jost@tcs.ifi.lmu.de>,
	"Joshua Jensen" <jjensen@workspacewhiz.com>,
	"Matthieu Moy" <git@matthieu-moy.fr>,
	"Holger Hellmuth" <hellmuth@ira.uka.de>,
	"Kevin Ballard" <kevin@sb.org>
Subject: Re: [PATCH 1/1] Introduce "precious" file concept
Date: Wed, 20 Feb 2019 09:31:36 +0100
Message-ID: <49F0F61F-9874-4027-8430-E313AA46C83D@gmx.net>
In-Reply-To: <CACsJy8Dq9_uFofs40XwjLkmiBNWXCpic96W1MK_tjLQyaF0+BA@mail.gmail.com>



On February 20, 2019 2:35:41 AM GMT+01:00, Duy Nguyen <pclouds@gmail.com> wrote:
>On Wed, Feb 20, 2019 at 1:08 AM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> Duy Nguyen <pclouds@gmail.com> writes:
>>
>> > On Sun, Feb 17, 2019 at 2:36 AM Ævar Arnfjörð Bjarmason
>> > <avarab@gmail.com> wrote:
>> >>
>> >>
>> >> On Sat, Feb 16 2019, Nguyễn Thái Ngọc Duy wrote:
>> >>
>> >> [Re-CC some people involved the last time around]
>> >>
>> >> > A new attribute "precious" is added to indicate that certain files
>> >> > have valuable content and should not be easily discarded even if they
>> >> > are ignored or untracked.
>> >> >
>> >> > So far there are one part of Git that are made aware of precious
>> >> > files: "git clean" will leave precious files alone.
>> >>
>> >> Thanks for bringing this up again. There were also some patches recently
>> >> to save away clobbered files, do you/anyone else have any end goal in
>> >> mind here that combines this & that, or some other thing I may not have
>> >> kept up with?
>> >
>> > I assume you mean the clobbering untracked files by merge/checkout.
>> > Those files will be backed up [1] if backup-log is implemented. Even
>> > files deleted by "git clean" could be saved but that might go a little
>> > too far.
>>
>> I agree with Ævar that it is a very good idea to ask what the
>> endgame should look like.  I would have expected that, with an
>> introduction of new "ignored but unexpendable" class of file
>> (i.e. "precious" here), operations such as merge and checkout will
>> be updated to keep them in situations where we would remove "ignored
>> and expendable" files (i.e. "ignored").  And it is perfectly OK if
>> the very first introduction of the "precious" support begins only
>> with a single operation, such as "clean", as long as the end-goal is
>> clear.
>
>I think the sticking point is how to deal with the surprise factor and
>"precious" will not help at all in this aspect. In my mind there are
>three classes
>
> - total expectation, i know i want git to not touch some files, i
>tell git so (e.g. with "precious")
>
> - surprises sometimes, but in known classes. This is the main use
>case of backup log, where I may accidentally do "git commit
>-amsomething" after carefully preparing the index. Saving overwritten
>files by merge/checkout could be done here as an alternative to
>"garbage" attribute.
>
>> I personally do not believe in "backup log"; if we can screw up and
>> can fail to stop an operation that must avoid losing info, then we
>> can screw up the same way and fail to design and implement "backup"
>> to save info before an operation loses it.  If we do a good job in
>> supporting "precious" in various operations, we can rely less on
>> "backup log" and still be safe ;-)
>
>and this is the third class, something completely unexpected. Yes
>backup-log can't help here, but I don't think "precious" can either.
>And I have no good proposal for this case.

Sorry for going off on a tangent here, but I have had this on my mind for a long time. For cases where merge can lead to loss of a non-ignored untracked file (t7607-merge-overwrite.sh), I have the following proposal:

1. Merge the ORIG_HEAD and MERGE_HEAD commits without touching the index or the work tree. This is where we do rename detection, recursive merge, and content (line-by-line) merge. The result is CHECKOUT_HEAD, a tree with possible merge conflicts. For the switch branch operation CHECKOUT_HEAD is the tree to switch to. The remaining steps are the same for merge and switch branch operations.
2. Merge CHECKOUT_HEAD and the index with ORIG_HEAD as the merge base. The result is the CHECKOUT_INDEX. Do this in order to keep staged changes which are not affected by the merge. Do not do rename detection or content merge. In case of conflict, roll back and error out.
3. Merge CHECKOUT_INDEX with the work tree, using the original index as the merge base. Do this to simulate the work tree update. Do not do rename detection or content merge. A conflict means that the checkout operation would touch untracked files or files with unstaged changes. In case of such a conflict, roll back and error out.
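
To make steps 2 and 3 concrete, here is a rough sketch in Python. It is not Git code. It assumes ORIG_HEAD, CHECKOUT_HEAD, the index and the work tree can each be modelled as a flat mapping from path to a content id, and the names trivial_merge, checkout and CheckoutConflict are made up for illustration. Step 1 is not shown; CHECKOUT_HEAD is taken as given.

```python
class CheckoutConflict(Exception):
    """Raised when a path changed differently on both sides."""


def trivial_merge(base, ours, theirs):
    """Per-path three-way merge: no rename detection, no content merge.

    Each argument maps path -> content id; a missing path means "absent".
    """
    merged = {}
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:        # both sides agree (or neither has the path)
            result = o
        elif o == b:      # only "theirs" changed the path
            result = t
        elif t == b:      # only "ours" changed the path
            result = o
        else:             # both changed it differently
            raise CheckoutConflict(path)
        if result is not None:
            merged[path] = result
    return merged


def checkout(orig_head, checkout_head, index, worktree):
    """Steps 2 and 3 of the proposal, on hypothetical in-memory maps.

    Nothing is mutated, so "roll back" is simply not using the result.
    """
    # Step 2: keep staged changes that the merge does not affect.
    checkout_index = trivial_merge(orig_head, checkout_head, index)
    # Step 3: simulate the work tree update against the original index.
    new_worktree = trivial_merge(index, checkout_index, worktree)
    return checkout_index, new_worktree
```

In this model an untracked file is a path present in the work tree but absent from the original index. If CHECKOUT_INDEX wants to create the same path with different content, step 3 raises CheckoutConflict instead of overwriting it.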

I believe this algorithm would behave much like the current implementation. But it separates the rename/history/content aspects of the merge algorithm from the checkout operation. It greatly simplifies the implementation of the checkout operation and there are no special cases where we lose files.

Implementing step 1 is the tricky part. But it may still be worthwhile because the merge algorithm does not have to worry about staged changes or unstaged changes. The merge algorithm could work on the hierarchical tree structure instead of the flattened index. This makes it trivial to detect directory/file conflicts (no need to do a lookahead when iterating index files). This is also a better fit for detecting directory renames. Maybe this will allow us to focus more on rename detection, such as directory renames or moved functions [*1*].
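
As an illustration of the directory/file point, here is a small sketch. It assumes trees are modelled as nested Python dicts, where a dict is a directory and a string is a blob id, and df_conflicts is a made-up name. It only reports paths where one side has a directory and the other a file. A real merge would also consult the merge base before calling that a conflict.

```python
def df_conflicts(ours, theirs, prefix=""):
    """Walk two hierarchical trees in parallel and list D/F mismatches.

    A directory is a dict, a file is a string (its content id).
    """
    conflicts = []
    for name in sorted(set(ours) & set(theirs)):
        a, b = ours[name], theirs[name]
        path = prefix + name
        if isinstance(a, dict) != isinstance(b, dict):
            # Same name, but a directory on one side and a file on the other.
            conflicts.append(path)
        elif isinstance(a, dict):
            # Both are directories: descend.
            conflicts.extend(df_conflicts(a, b, path + "/"))
    return conflicts
```

Because both entries for a name are visited together, the mismatch is visible at that level of the walk, with no lookahead over a sorted flat path list.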

[*1*] Also: a moved file whose original is replaced with a wrapper for the moved file always fools rename detection, because we don't detect renames for files which were not removed.
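
The wrapper case can be shown with a deliberately simplified sketch. It assumes exact-content matching on flat path-to-content-id maps, and detect_exact_renames is a made-up name. Git's real rename detection is similarity-based, so this only models the "sources must be removed paths" restriction.

```python
def detect_exact_renames(old, new):
    """Pair removed paths with added paths that have identical content."""
    # Only paths that disappeared are considered as rename sources.
    removed = {p: c for p, c in old.items() if p not in new}
    added = {p: c for p, c in new.items() if p not in old}
    renames = {}
    for src, content in sorted(removed.items()):
        for dst, dst_content in sorted(added.items()):
            if dst_content == content and dst not in renames.values():
                renames[src] = dst
                break
    return renames


# Plain move: the old path is gone, so it is a rename candidate.
plain = detect_exact_renames({"util.c": "BODY"}, {"lib/util.c": "BODY"})

# Move plus wrapper: the old path still exists, so nothing is detected.
wrapped = detect_exact_renames(
    {"util.c": "BODY"},
    {"util.c": "WRAPPER", "lib/util.c": "BODY"},
)
```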
