git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Lars Schneider <larsxschneider@gmail.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Junio C Hamano <gitster@pobox.com>,
	Elijah Newren <newren@gmail.com>,
	Git Mailing List <git@vger.kernel.org>,
	mgorny@gentoo.org, rtc@helen.PLASMA.Xg8.DE,
	winserver.support@winserver.com, tytso@mit.edu
Subject: Re: Optimizing writes to unchanged files during merges?
Date: Tue, 17 Apr 2018 19:23:16 +0200	[thread overview]
Message-ID: <976C3B07-6396-4762-B6FF-4092B5D5D8C6@gmail.com> (raw)
In-Reply-To: <8736zvf5gc.fsf@evledraar.gmail.com>


> On 16 Apr 2018, at 19:04, Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> 
> 
> On Mon, Apr 16 2018, Lars Schneider wrote:
> 
>>> On 16 Apr 2018, at 04:03, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>>> 
>>> On Sun, Apr 15, 2018 at 6:44 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>>> 
>>>> I think Elijah's corrected was_tracked() also does not care "has
>>>> this been renamed".
>>> 
>>> I'm perfectly happy with the slightly smarter patches. My patch was
>>> really just an RFC and because I had tried it out.
>>> 
>>>> One thing that makes me curious is what happens (and what we want to
>>>> happen) when such a "we already have the changes the side branch
>>>> tries to bring in" path has local (i.e. not yet in the index)
>>>> changes.  For a dirty file that trivially merges (e.g. a path we
>>>> modified since our histories forked, while the other side didn't do
>>>> anything, has local changes in the working tree), we try hard to
>>>> make the merge succeed while keeping the local changes, and we
>>>> should be able to do the same in this case, too.
>>> 
>>> I think it might be nice, but probably not really worth it.
>>> 
>>> I find the "you can merge even if some files are dirty" to be really
>>> convenient, because I often keep stupid test patches in my tree that I
>>> may not even intend to commit, and I then use the same tree for
>>> merging.
>>> 
>>> For example, I sometimes end up editing the Makefile for the release
>>> version early, but I won't *commit* that until I actually cut the
>>> release. But if I pull some branch that has also changed the Makefile,
>>> it's not worth any complexity to try to be nice about the dirty state.
>>> 
>>> If it's a file that actually *has* been changed in the branch I'm
>>> merging, and I'm more than happy to just stage the patch (or throw it
>>> away - I think it's about 50:50 for me).
>>> 
>>> So I don't think it's a big deal, and I'd rather have the merge fail
>>> very early with "that file has seen changes in the branch you are
>>> merging" than add any real complexity to the merge logic.
>> 
>> I am happy to see this discussion and the patches, because long rebuilds
>> are a constant annoyance for us. We might have been bitten by the exact
>> case discussed here, but more often, we have a slightly different
>> situation:
>> 
>> An engineer works on a task branch and runs incremental builds — all
>> is good. The engineer switches to another branch to review another
>> engineer's work. This other branch changes a low-level header file,
>> but no rebuild is triggered. The engineer switches back to the previous
>> task branch. At this point, the incremental build will rebuild
>> everything, as the compiler thinks that the low-level header file has
>> been changed (because the mtime is different).
>> 
>> Of course, this problem can be solved with a separate worktree. However,
>> our engineers forget about that sometimes, and then, they are annoyed by
>> a 4h rebuild.
>> 
>> Is this situation a problem for others too?
>> If yes, what do you think about the following approach:
>> 
>> What if Git kept a LRU list that contains file path, content hash, and
>> mtime of any file that is removed or modified during a checkout. If a
>> file is checked out later with the exact same path and content hash,
>> then Git could set the mtime to the previous value. This way the
>> compiler would not think that the content has been changed since the
>> last rebuild.
>> 
>> I think that would fix the problem that our engineers run into and also
>> the problem that Linus experienced during the merge, wouldn't it?
> 
> Could what you're describing be prototyped as a post-checkout hook that
> looks at the reflog? It sounds to me like it could, but perhaps I've
> missed some subtlety.

Yeah, probably. You don't even need the reflog I think. I just wanted
to get a sense if other people run into this problem too.


> Not re-writing out a file that hasn't changed is one thing, but I think
> for more complex behaviors (such as the "I want everything to have the
> same mtime" mentioned in another thread on-list), and this, it makes
> sense to provide some hook mechanism than have git itself do all the
> work.
> 
> I also don't see how what you're describing could be generalized, or
> even be made to work reliably in the case you're describing. If the
> engineer runs "make" on this branch he's testing out that might produce
> an object file that'll get used as-is once he switches back, since
> you've set the mtime in the past for that file because you re-checked it
> out.

Ohh... you're right. I thought Visual Studio looks *just* at ctime/mtime 
of the files. But this seems not to be true [1]:
 
   "MSBuild to build it quickly checks if any of a project’s input files 
    are modified later than any of the project’s outputs"

In that case my idea outlined above is garbage.

Thanks,
Lars


[1] https://blogs.msdn.microsoft.com/kirillosenkov/2014/08/04/how-to-investigate-rebuilding-in-visual-studio-when-nothing-has-changed/

  reply	other threads:[~2018-04-17 17:23 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-12 21:14 Optimizing writes to unchanged files during merges? Linus Torvalds
2018-04-12 21:46 ` Junio C Hamano
2018-04-12 23:17   ` Junio C Hamano
2018-04-12 23:35     ` Linus Torvalds
2018-04-12 23:41       ` Linus Torvalds
2018-04-12 23:55         ` Linus Torvalds
2018-04-13  0:01           ` Linus Torvalds
2018-04-13  7:02             ` Elijah Newren
2018-04-13 17:14               ` Linus Torvalds
2018-04-13 17:39                 ` Stefan Beller
2018-04-13 17:53                   ` Linus Torvalds
2018-04-13 20:04                 ` Elijah Newren
2018-04-13 22:27                   ` Junio C Hamano
2018-04-16  1:44                 ` Junio C Hamano
2018-04-16  2:03                   ` Linus Torvalds
2018-04-16 16:07                     ` Lars Schneider
2018-04-16 17:04                       ` Ævar Arnfjörð Bjarmason
2018-04-17 17:23                         ` Lars Schneider [this message]
2018-04-16 17:43                       ` Jacob Keller
2018-04-16 17:45                         ` Jacob Keller
2018-04-16 22:34                           ` Junio C Hamano
2018-04-17 17:27                           ` Lars Schneider
2018-04-17 17:43                             ` Jacob Keller
2018-04-16 17:47                       ` Phillip Wood
2018-04-16 20:09                       ` Stefan Haller
2018-04-16 22:55                     ` Elijah Newren
2018-04-16 23:03                   ` Elijah Newren
2018-04-12 23:18   ` Linus Torvalds
2018-04-13  0:01 ` Elijah Newren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=976C3B07-6396-4762-B6FF-4092B5D5D8C6@gmail.com \
    --to=larsxschneider@gmail.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=mgorny@gentoo.org \
    --cc=newren@gmail.com \
    --cc=rtc@helen.PLASMA.Xg8.DE \
    --cc=torvalds@linux-foundation.org \
    --cc=tytso@mit.edu \
    --cc=winserver.support@winserver.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).