From: Linus Torvalds <torvalds@linux-foundation.org>
To: Junio C Hamano <gitster@pobox.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Optimizing writes to unchanged files during merges?
Date: Thu, 12 Apr 2018 14:14:24 -0700
Message-ID: <CA+55aFzLZ3UkG5svqZwSnhNk75=fXJRkvU1m_RHBG54NOoaZPA@mail.gmail.com>
So I just had an interesting experience that has happened before too,
but this time I decided to try to figure out *why* it happened.
I'm obviously in the latter part of the kernel merge window, and
things are slowly calming down. I do the second XFS merge during this
window, and it brings in updates just to the fs/xfs/ subdirectory, so
I expect that my test build for the full kernel configuration should
be about a minute.
Instead it recompiles pretty much *everything*, and the test build
takes 22 minutes.
This happens occasionally, and I blame gremlins. But this time I
decided to look at what the gremlins actually *are*.
The diff that git shows for the pull was very clear: only fs/xfs/
changes. But doing
    ls -tr $(git ls-files '*.[chS]') | tail -10
gave the real reason: in between all the fs/xfs/xyz files was this:
    include/linux/mm.h
and yeah, that rather core header file causes pretty much everything
to be re-built.
Now, the reason it was marked as changed is that the xfs branch _had_
in fact changed it, but the changes were already upstream and got
merged away. But the file still got written out (with the same
contents it had before the merge), and 'make' obviously only looks at
modification time, so make rebuilt everything.
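To make the mtime point concrete, here is a minimal Python sketch (an illustration, not part of the original message). It approximates the rule 'make' applies, where a target is out of date if any prerequisite has a newer modification time, and shows that writing a file back out with identical bytes is enough to trip it. The names is_stale, rewrite_same_contents and demo are hypothetical, and mm.h and foo.o are stand-in files in a temporary directory.

```python
import os
import tempfile


def is_stale(target, prerequisites):
    """Approximate make's rule: rebuild if the target is missing or any
    prerequisite has a newer mtime. File contents are never compared."""
    if not os.path.exists(target):
        return True
    target_mtime = os.stat(target).st_mtime_ns
    return any(os.stat(p).st_mtime_ns > target_mtime for p in prerequisites)


def rewrite_same_contents(path):
    """Write a file back out with the bytes it already has."""
    with open(path, "rb") as f:
        data = f.read()
    with open(path, "wb") as f:
        f.write(data)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        header = os.path.join(tmp, "mm.h")  # stand-in for include/linux/mm.h
        obj = os.path.join(tmp, "foo.o")    # stand-in for a built object file
        for path in (header, obj):
            with open(path, "wb") as f:
                f.write(b"contents\n")

        # Pin both timestamps in the past: the object is 50s newer than
        # the header, so it counts as up to date.
        t0 = 1_000_000_000 * 10**9  # nanoseconds since the epoch
        os.utime(header, ns=(t0, t0))
        os.utime(obj, ns=(t0 + 50 * 10**9, t0 + 50 * 10**9))
        before = is_stale(obj, [header])

        # Same bytes go back to disk, but the mtime jumps to "now".
        rewrite_same_contents(header)
        after = is_stale(obj, [header])
        return before, after
```

Calling demo() shows the object going from up to date to stale even though the header's contents never changed.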
Now, because it's still the merge window, I don't have much time to
look into this, but I was hoping somebody in git land would like to
give it a quick look. I'm sure I'm not the only one to have ever been
hit by this, and I don't think the kernel is the only project to hit
it either.
Because it would be lovely if the merging logic would just notice "oh,
that file doesn't change", and not even write out the end result.
For testing, the merge that triggered this git introspection is kernel
commit 80aa76bcd364 ("Merge tag 'xfs-4.17-merge-4' of
git://git.kernel.org/pub/scm/fs/xfs/xfs-linux"), which can act as a
test-case. It's a clean merge, so no kernel knowledge necessary: just
re-merge the parents and see if the modification time of
include/linux/mm.h changes.
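The re-merge test described above could be scripted roughly as follows. This is a sketch under assumptions rather than a vetted reproducer: it expects a clean kernel checkout that contains the merge commit, it leaves the repository on a detached HEAD, and the function names and the repository path in the usage note are made up for illustration. The mtime is sampled after checking out the first parent, so that only the merge step itself is measured.

```python
import os
import subprocess


def remerge_commands(merge_commit):
    """git invocations that redo a two-parent merge from scratch."""
    return [
        ["git", "checkout", "--detach", merge_commit + "^1"],
        ["git", "merge", "--no-edit", merge_commit + "^2"],
    ]


def mtime_changes_on_remerge(repo, merge_commit, path):
    """Return True if redoing the merge rewrites `path` in the working tree."""
    checkout, merge = remerge_commands(merge_commit)
    # Start from the first parent, with a working tree that matches it.
    subprocess.run(checkout, cwd=repo, check=True)
    before = os.stat(os.path.join(repo, path)).st_mtime_ns
    # Merge in the second parent, as the original pull did.
    subprocess.run(merge, cwd=repo, check=True)
    after = os.stat(os.path.join(repo, path)).st_mtime_ns
    return after != before
```

For the case in this message the call would be something like mtime_changes_on_remerge("/path/to/linux", "80aa76bcd364", "include/linux/mm.h"), where the repository path is a placeholder.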
I'm guessing some hack in update_file_flags() to say "if the new
object matches the old one, don't do anything" might work. Although I
didn't look if we perhaps end up writing to the working tree copy
earlier already.
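The "if the new object matches the old one, don't do anything" idea can be sketched outside of git like this. It is only an illustration in Python, not git's update_file_flags(): the helper names are hypothetical, and it ignores file modes, symlinks and any content conversion git may apply on checkout. It also rehashes the working tree file, whereas git itself would more likely compare object IDs it already knows. The blob ID computation follows git's blob object format for SHA-1 repositories (a "blob <size>" header, a NUL byte, then the contents).

```python
import hashlib
import os


def git_blob_id(data: bytes) -> str:
    """Object ID git assigns to a blob with these contents (SHA-1 repos)."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def write_if_changed(path: str, new_data: bytes) -> bool:
    """Write new_data to path unless the file already holds the same blob.

    Returns True if the file was written, False if it was left untouched.
    """
    try:
        with open(path, "rb") as f:
            old_id = git_blob_id(f.read())
    except FileNotFoundError:
        old_id = None  # nothing on disk yet, so we must write
    if old_id == git_blob_id(new_data):
        return False  # same object: leave the file and its mtime alone
    with open(path, "wb") as f:
        f.write(new_data)
    return True
```

With a check like this in the write path, a merge result that is byte-for-byte what is already in the working tree would keep its old modification time, and 'make' would have no reason to rebuild its dependents.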
Looking at the blame output of that function, most of it is really
old, so this "problem" goes back a long long time.
Just to clarify: this is absolutely not a correctness issue. It's not
even a *git* performance issue. It's literally just a "not updating
the working tree when things don't change results in better behavior
for other tools".
So if somebody gets interested in this problem, that would be great.
And if not, I'll hopefully remember to come back to this next week
when the merge window is over, and take a second look.
Linus