From: Taylor Blau <ttaylorr@github.com>
To: Sun Chao <16657101987@163.com>
Cc: "Taylor Blau" <me@ttaylorr.com>,
"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
"Sun Chao via GitGitGadget" <gitgitgadget@gmail.com>,
git@vger.kernel.org
Subject: Re: [PATCH v2] packfile: freshen the mtime of packfile by configuration
Date: Wed, 14 Jul 2021 13:04:03 -0400
Message-ID: <YO8XrOChAtxhpuxS@nand.local>
In-Reply-To: <ACE7ECBE-0D7A-4FB8-B4F9-F9E32BE2234C@163.com>
On Thu, Jul 15, 2021 at 12:46:47AM +0800, Sun Chao wrote:
> > Stepping back, I'm not sure I understand why freshening a pack is so
> > slow for you. freshen_file() just calls utime(2), and any sync back to
> > the disk shouldn't need to update the pack itself, just a couple of
> > fields in its inode. Maybe you could help explain further.
> >
> > [ ... ]
>
> The reason we want to avoid freshening the mtime of ".pack" files is to
> improve the read speed of our Git servers.
>
> We have some large repositories on our Git servers (some are bigger than 10GB),
> and we created '.keep' files for the large ".pack" files. We want the big files
> left unchanged to speed up git upload-pack, because our understanding is that
> the file system cache reduces disk IO if a file has not changed.
>
> However, we find that the mtime of ".pack" files changes over time, which makes
> the file system always reload the big files. That takes a lot of IO time and
> results in slower git upload-pack, and eventually the disk IOPS is exhausted.

That's surprising behavior to me. Are you saying that calling utime(2)
causes the *page* cache to be invalidated, so that most reads are
cache misses that eat into your available IOPS?

If so, then I am quite surprised ;). The only state that should be
dirtied by calling utime(2) is the inode itself, so the blocks referred
to by the inode corresponding to a pack should be left intact.
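
To make that claim concrete, here is a minimal sketch that bumps a file's mtime and checks that its contents are untouched. It assumes Linux with GNU coreutils (for `stat -c`), and `demo.pack` is a hypothetical stand-in, not a real packfile. The `fincore` step is optional and only runs if util-linux's fincore(1) happens to be installed.

```shell
#!/bin/sh
# Sketch only: "demo.pack" is a hypothetical stand-in for a packfile.
# Assumes Linux with GNU coreutils (stat -c %Y).
set -e

tmp=$(mktemp -d)
f="$tmp/demo.pack"
head -c 1048576 /dev/urandom >"$f"    # 1 MiB of arbitrary data

touch -m -t 200101010000 "$f"         # backdate mtime so the bump is visible
sum_before=$(cksum <"$f")
mtime_old=$(stat -c %Y "$f")

touch -m "$f"                         # "freshen": set mtime to the current time

sum_after=$(cksum <"$f")
mtime_new=$(stat -c %Y "$f")

echo "mtime: $mtime_old -> $mtime_new"
echo "cksum: $sum_before -> $sum_after"

# Optional: report how much of the file is resident in the page cache
# after the mtime bump, if fincore(1) is available.
if command -v fincore >/dev/null 2>&1; then
    fincore "$f" || true
fi

rm -rf "$tmp"
```

If the checksum matches and only the mtime moves, the bump changed inode metadata alone. Comparing fincore output before and after a bump on a real pack would be the more direct check of page-cache residency.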

If you're on Linux, you can try observing the effect of evicting
inodes, blocks, or both from the disk cache by changing "2" in the
following:

    hyperfine 'git pack-objects --all --stdout --delta-base-offset >/dev/null' \
      --prepare='sync; echo 2 | sudo tee /proc/sys/vm/drop_caches'

where "1" drops the page cache, "2" drops the inodes (and dentries),
and "3" evicts both.

I wonder if you could share the results of running the above with each
of "1", "2", and "3", as well as with `--prepare` swapped for
`--warmup=3` to warm your caches (and give us an idea of what your
expected performance is probably like).

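The four runs requested above could be scripted roughly as follows. This is a sketch under assumptions: the `run` and `bench_matrix` helpers and the `DRY_RUN` switch are illustrative names, not part of hyperfine or Git, and only the `--prepare` and `--warmup` options are taken from the command above. By default it just prints the commands; set `DRY_RUN=0` and run it from inside the repository to actually benchmark.

```shell
#!/bin/sh
# Sketch: run the benchmark once per drop_caches mode, then once warm.
# DRY_RUN defaults to 1, which only prints each command.
cmd='git pack-objects --all --stdout --delta-base-offset >/dev/null'

run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"            # show the command instead of running it
    else
        "$@"
    fi
}

bench_matrix() {
    # Cold runs: 1 = page cache, 2 = inodes and dentries, 3 = both.
    for mode in 1 2 3; do
        run hyperfine "$cmd" \
            --prepare="sync; echo $mode | sudo tee /proc/sys/vm/drop_caches"
    done
    # Warm run: no cache dropping, three warmup iterations first.
    run hyperfine "$cmd" --warmup=3
}

bench_matrix
```

Comparing the three cold runs against the warm run should show whether dropping inodes alone (mode 2) costs anywhere near as much as dropping the page cache (modes 1 and 3).
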
Thanks,
Taylor