From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Jeff King <peff@peff.net>
Cc: "Junio C Hamano" <gitster@pobox.com>,
	git@vger.kernel.org, "René Scharfe" <l.s.r@web.de>
Subject: Re: What's cooking in git.git (Nov 2018, #06; Wed, 21)
Date: Thu, 22 Nov 2018 19:36:54 +0100
Message-ID: <87efbd0xix.fsf@evledraar.gmail.com>
In-Reply-To: <20181122175259.GC22123@sigill.intra.peff.net>


On Thu, Nov 22 2018, Jeff King wrote:

> On Wed, Nov 21, 2018 at 11:48:14AM +0100, Ævar Arnfjörð Bjarmason wrote:
>
>>
>> On Wed, Nov 21 2018, Junio C Hamano wrote:
>>
>> > * jk/loose-object-cache (2018-11-13) 9 commits
>> >   (merged to 'next' on 2018-11-18 at 276691a21b)
>> >  + fetch-pack: drop custom loose object cache
>> >  + sha1-file: use loose object cache for quick existence check
>> >  + object-store: provide helpers for loose_objects_cache
>> >  + sha1-file: use an object_directory for the main object dir
>> >  + handle alternates paths the same as the main object dir
>> >  + sha1_file_name(): overwrite buffer instead of appending
>> >  + rename "alternate_object_database" to "object_directory"
>> >  + submodule--helper: prefer strip_suffix() to ends_with()
>> >  + fsck: do not reuse child_process structs
>> >
>> >  Code clean-up with optimization for the codepath that checks
>> >  (non-)existence of loose objects.
>> >
>> >  Will cook in 'next'.
>>
>> I think as noted in
>> https://public-inbox.org/git/e5148b8c-9a3a-5d2e-ac8c-3e536c0f2358@web.de/
>> that we should hold off the [89]/9 of this series due to the performance
>> regressions this introduces in some cases (while fixing other cases).
>>
>> I hadn't had time to follow up on that, and figured it could wait until
>> post-2.20 for a re-roll.
>
> Yeah, my intent had been to circle back around to this, but I just
> hadn't gotten to it. I'm still pondering a config option or similar,
> though I remain unconvinced that the cases in which you've shown it
> being slow are actually realistic or worth worrying about

FWIW those "used to be 2ms are now 20-40ms" pushes on ext4 are
representative of the actual prod setup I'm mainly targeting. Now, I
don't run on ext4 myself, and this patch helps on my setup, but it seems
plausible that the slowdown matters to someone who's counting on that
performance.

But yeah, it's certainly obscure. I don't blame you if you don't want to
hack on it, and not ejecting this before 2.20 isn't going to break
anything for me. But do you mind if I make it configurable as part of my
post-2.20 "disable collisions" series?

>  (and certainly having an obscure config option is not enough to help
> most people). If we could have it kick in heuristically, that would be
> better.

Aside from this specific scenario, I'd really prefer that we avoid
heuristic performance optimizations at all costs.

Database servers tend to do that sort of thing with their query planner,
and it results in cases where your entire I/O profile changes overnight
because you're now on the wrong side of some if/else heuristic about
whether to use some index or not.

> However, note that the cache-load for finding abbreviations _must_ have
> the complete list. And has been loading it for some time. So if you run
> "git-fetch", for example, you've already been running this code for
> months (and using the cache in more places is now a free speedup).

This is reminding me that I need to get around to re-submitting my
core.validateAbbrev series, which addresses this part of the problem:
https://public-inbox.org/git/20180608224136.20220-21-avarab@gmail.com/
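
For reference, the trade-off behind the cache in the quoted series can be
sketched roughly as follows. This is a minimal illustration, not the code
in sha1-file.c: `struct subdir_cache`, `cache_load()` and `cache_has()`
are hypothetical names, and the fixed-size array with a linear scan is an
assumption made to keep the sketch short. The first lookup pays for one
pass over the whole directory; later lookups touch no filesystem at all,
at the price of not seeing files created after the load.

```c
#include <assert.h>
#include <dirent.h>
#include <string.h>

#define MAX_NAMES 256
#define NAME_LEN 64

/* Hypothetical per-directory cache; not git's actual data structure. */
struct subdir_cache {
	int loaded;                      /* set once the directory was read */
	size_t nr;                       /* number of cached names */
	char names[MAX_NAMES][NAME_LEN]; /* cached file names */
};

/* Read the directory once; a missing directory yields an empty cache. */
static void cache_load(struct subdir_cache *c, const char *dir_path)
{
	DIR *dir;
	struct dirent *de;

	assert(c != NULL && dir_path != NULL);
	if (c->loaded)
		return;
	c->loaded = 1;

	dir = opendir(dir_path);
	if (!dir)
		return;
	while (c->nr < MAX_NAMES && (de = readdir(dir)) != NULL) {
		size_t len = strlen(de->d_name);

		/* Skip dot entries and names too long for this sketch. */
		if (de->d_name[0] == '.' || len >= NAME_LEN)
			continue;
		memcpy(c->names[c->nr++], de->d_name, len + 1);
	}
	closedir(dir);
}

/* Existence check answered from memory after the first load. */
static int cache_has(struct subdir_cache *c, const char *dir_path,
		     const char *name)
{
	size_t i;

	cache_load(c, dir_path);
	for (i = 0; i < c->nr; i++)
		if (!strcmp(c->names[i], name))
			return 1;
	return 0;
}
```

With a layout like this, a single "does this one object exist?" query on
a large directory costs a full directory read, while a thousand queries
against the same directory cost that same single read.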

> At the very least, we'd want this patch on top, too. I also think René's
> suggestion to use access() is worth pursuing (though to some degree is
> orthogonal to the cache).

I haven't had time to test that, and wasn't prioritizing it since I
figured this was post-2.20. My hunch is that it doesn't matter much, if
at all, on NFS. The round-trip time is what matters; whether that round
trip is an fstat() or an access() probably doesn't.
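
A minimal sketch of the two styles of existence check being compared,
assuming a plain path-based lookup. The function names are hypothetical,
and stat() is used here only as a stand-in for whatever call the current
code makes. Either way it is one path-based request per object, which is
why the round trip would dominate if the hunch above holds.

```c
#include <assert.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Existence check via stat(): one path lookup, and the kernel also
 * fills in metadata (size, mtime, ...) that the caller throws away.
 */
static int exists_via_stat(const char *path)
{
	struct stat st;

	assert(path != NULL);
	return stat(path, &st) == 0;
}

/*
 * Existence check via access(F_OK): one path lookup, no metadata
 * handed back to the caller.
 */
static int exists_via_access(const char *path)
{
	assert(path != NULL);
	return access(path, F_OK) == 0;
}
```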

> -- >8 --
> Subject: [PATCH] odb_load_loose_cache: fix strbuf leak
>
> Commit 66f04152be (object-store: provide helpers for
> loose_objects_cache, 2018-11-12) moved the cache-loading code from
> find_short_object_filename(), but forgot the line that releases the path
> strbuf.
>
> Reported-by: René Scharfe <l.s.r@web.de>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  sha1-file.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/sha1-file.c b/sha1-file.c
> index 5894e48ea4..5a272f70de 100644
> --- a/sha1-file.c
> +++ b/sha1-file.c
> @@ -2169,6 +2169,7 @@ void odb_load_loose_cache(struct object_directory *odb, int subdir_nr)
>  				    NULL, NULL,
>  				    &odb->loose_objects_cache);
>  	odb->loose_objects_subdir_seen[subdir_nr] = 1;
> +	strbuf_release(&buf);
>  }
>
>  static int check_stream_sha1(git_zstream *stream,
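
The kind of leak the quoted patch fixes can be sketched with a toy
growable buffer. This is an illustration under stated assumptions, not
git's strbuf API: `struct sketch_buf` and its helpers are hypothetical,
as is `subdir_path_len()`. The point is only that a heap-backed buffer
built inside a function has to be released before the function returns.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical heap-backed string buffer (not git's strbuf). */
struct sketch_buf {
	char *data;
	size_t len;
};

/* Append a NUL-terminated string, growing the allocation as needed. */
static void sketch_buf_addstr(struct sketch_buf *b, const char *s)
{
	size_t n = strlen(s);
	char *p = realloc(b->data, b->len + n + 1);

	if (!p)
		abort();
	memcpy(p + b->len, s, n + 1);
	b->data = p;
	b->len += n;
}

/* Free the allocation and reset the buffer to its empty state. */
static void sketch_buf_release(struct sketch_buf *b)
{
	free(b->data);
	b->data = NULL;
	b->len = 0;
}

/* Build "<objdir>/<xx>" for a subdirectory number and return its length. */
static size_t subdir_path_len(const char *objdir, int subdir_nr)
{
	struct sketch_buf buf = { NULL, 0 };
	char hex[3];
	size_t len;

	assert(subdir_nr >= 0 && subdir_nr < 256);
	snprintf(hex, sizeof(hex), "%02x", (unsigned)subdir_nr);

	sketch_buf_addstr(&buf, objdir);
	sketch_buf_addstr(&buf, "/");
	sketch_buf_addstr(&buf, hex);
	len = buf.len;

	/* Without this call, every invocation would leak buf.data. */
	sketch_buf_release(&buf);
	return len;
}
```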
