git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Elijah Newren <newren@gmail.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Junio C Hamano <gitster@pobox.com>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: Optimizing writes to unchanged files during merges?
Date: Fri, 13 Apr 2018 00:02:36 -0700	[thread overview]
Message-ID: <CABPp-BHQsOSCJiPU9Ku5b67QTkAjnEBrhx04mTXf2QdPBriHmw@mail.gmail.com> (raw)
In-Reply-To: <CA+55aFwwVZDetd-SobOzzLQW4_GEwm3krxEGR+cpqzkzK-yiwQ@mail.gmail.com>

On Thu, Apr 12, 2018 at 5:01 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> [ Still talking to myself. Very soothing. ]
>
> On Thu, Apr 12, 2018 at 4:55 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>> [ Talking to myself ]
>>
>> Did it perhaps mean to say
>>
>>                 path_renamed_outside_HEAD = path2 && !strcmp(path, path2);
>>
>> instead?
>
> Probably not correct, but making that change makes my test-case work.
>
> It probably breaks something important, though. I didn't look at why
> that code happened in the first place.
>
> But it's nice to know I'm at least looking at the right code while I'm
> talking to myself.

I hope you don't mind me barging into your conversation, but I figured
out some interesting details.

What we want here is that if there are no content conflicts and the
contents match the version of the file from HEAD, and there are no
mode conflicts and the final mode matches what was in HEAD, and there
are no path conflicts (e.g. a rename/rename(1to2) issue or a D/F
conflict putting a directory in the way) and the final path matches
what was already in HEAD, then we can skip the update.

What does this look like in code?  Well, most of it was already there:

if (mfi.clean && !df_conflict_remains &&
    oid_eq(&mfi.oid, a_oid) && mfi.mode == a_mode) {

That covers everything except "the final path matches what was already
in HEAD".  How do we check for that?  The current code is just making
an approximation; your modification improves the approximation.
However, it turns out we have this awesome function called
"was_tracked(const char *path)" that was intended for answering this
exact question.  So, assuming was_tracked() isn't buggy, the correct
patch for this problem would look like:

-               path_renamed_outside_HEAD = !path2 || !strcmp(path, path2);
-               if (!path_renamed_outside_HEAD) {
+               if (was_tracked(path)) {

Turns out that patch was already submitted as c5b761fb
("merge-recursive: ensure we write updates for directory-renamed
file", 2018-02-14).  While that patch was for a different bug, it is
identical to what I would have proposed to fix this bug.  A big series
including that patch was merged to master two days ago, but
unfortunately that exact patch was the one that caused some
impressively awful fireworks[1].  So it, along with the rest of the
series it was part of, was reverted just a few hours later.  As it
turns out, the cause of the problem is that was_tracked() can lie when
renames are involved.  It just hadn't ever bit us yet.

I have a fix for was_tracked(), and 10 additional testcases covering
interesting should-be-skipped and
should-not-be-skipped-but-is-close-to-a-should-be-skipped scenarios,
verifying the skipping actually happens if and only if it should
happen.  That should fix your bug, and the bug with my series.  Rough
WIP can be seen at the testme branch in github.com/newren/git for the
curious, but I need to sleep and have a bunch of other things on my
plate for the next few days.  But I thought I'd at least mention what
I've found so far.

Elijah

[1] https://public-inbox.org/git/xmqqefjm3w1h.fsf@gitster-ct.c.googlers.com/

  reply	other threads:[~2018-04-13  7:02 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-12 21:14 Optimizing writes to unchanged files during merges? Linus Torvalds
2018-04-12 21:46 ` Junio C Hamano
2018-04-12 23:17   ` Junio C Hamano
2018-04-12 23:35     ` Linus Torvalds
2018-04-12 23:41       ` Linus Torvalds
2018-04-12 23:55         ` Linus Torvalds
2018-04-13  0:01           ` Linus Torvalds
2018-04-13  7:02             ` Elijah Newren [this message]
2018-04-13 17:14               ` Linus Torvalds
2018-04-13 17:39                 ` Stefan Beller
2018-04-13 17:53                   ` Linus Torvalds
2018-04-13 20:04                 ` Elijah Newren
2018-04-13 22:27                   ` Junio C Hamano
2018-04-16  1:44                 ` Junio C Hamano
2018-04-16  2:03                   ` Linus Torvalds
2018-04-16 16:07                     ` Lars Schneider
2018-04-16 17:04                       ` Ævar Arnfjörð Bjarmason
2018-04-17 17:23                         ` Lars Schneider
2018-04-16 17:43                       ` Jacob Keller
2018-04-16 17:45                         ` Jacob Keller
2018-04-16 22:34                           ` Junio C Hamano
2018-04-17 17:27                           ` Lars Schneider
2018-04-17 17:43                             ` Jacob Keller
2018-04-16 17:47                       ` Phillip Wood
2018-04-16 20:09                       ` Stefan Haller
2018-04-16 22:55                     ` Elijah Newren
2018-04-16 23:03                   ` Elijah Newren
2018-04-12 23:18   ` Linus Torvalds
2018-04-13  0:01 ` Elijah Newren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CABPp-BHQsOSCJiPU9Ku5b67QTkAjnEBrhx04mTXf2QdPBriHmw@mail.gmail.com \
    --to=newren@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).