git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Jacob Vosmaer <jacob@gitlab.com>
Cc: Junio C Hamano <gitster@pobox.com>, Taylor Blau <me@ttaylorr.com>,
	Git Mailing List <git@vger.kernel.org>,
	ps@pks.im
Subject: Re: [PATCH v3 0/2] send_ref buffering
Date: Tue, 31 Aug 2021 20:15:30 -0400	[thread overview]
Message-ID: <YS7GItELaEM8lq2z@coredump.intra.peff.net> (raw)
In-Reply-To: <CADMWQoOfPr47a6bJtApt=wSE9BrgXQKJRfJuNAj0zbtio2BRHw@mail.gmail.com> <CADMWQoPerffQcTfKh3bfesjgHaBqXGzW2805knyRW3R_q4V-YA@mail.gmail.com>

On Tue, Aug 31, 2021 at 03:08:25PM +0200, Jacob Vosmaer wrote:

> > Does the 64k buffer actually improve things? Here are the timings I get
> > on a repo with ~1M refs (it's linux.git with one ref per commit).
> Thanks for challenging that. I have a repeatable benchmark where it
> matters, because each write syscall wakes up a chain of proxies
> between the user and git-upload-pack. Larger buffers means fewer
> wake-ups. But then I tried to simplify my example by having sshd as
> the only intermediary, and in that experiment 64K buffers were not
> better than 4K buffers. I think that goes to show that picking a good
> buffer size is hard, and we'd be better off picking one specifically
> for Gitaly (and GitLab) that works with our stack.

Thanks for explaining. Yeah, I think leaving it as a custom thing makes
sense, then.

> >   Summary
> >     'GIT_REF_PARANOIA=1 git.compile upload-pack .' ran
> >       2.17 ± 0.02 times faster than 'git.compile upload-pack .'
> >
> > It's not exactly the intended use of that environment variable, but its
> > side effect is that we do not call has_object_file() on each ref tip.
> That is nice to know, but as a user of Git I don't know when it is or
> is not safe to skip those has_object_file() calls. If it's safe to
> skip them then Git should skip them always. If not, then I will err on
> the side of caution and keep the checks.

Yeah, the use of REF_PARANOIA there was just an easy illustration. IMHO
it would be reasonable for upload-pack to just assume that the refs
files are valid. If they aren't, then either:

  - the receiver is uninterested in those objects or already has them,
    so won't ask for them. They're happy either way.

  - the receiver _will_ ask for them, at which point we'd barf later in
    pack-objects when we try to access them.

There are some thoughts in this old thread which introduce
GIT_REF_PARANOIA:

  https://lore.kernel.org/git/20150317073730.GA25267@peff.net/

I think I was mostly too cowardly to make the change back then. And I
hadn't considered that the performance implications would be all that
big (though I will say this million-ref repo is at the edge of what I'd
consider reasonable).

> > > I do think it would be nice to take the packet_writer
> > > interface further (letting it replace the static buf, and use stdio
> > > handles, and using it throughout upload-pack).
> > I would like that too, for the sake of neatness and general
> > performance, but I don't have the time to take on a larger project
> > like that at the moment.
> I gave solving the problem with packet_writer a couple of hours today.
> The diff gets too big, and I have too little confidence I'm not
> introducing deadlocks. This really is more work than I can chew off
> right now. Sorry!

Thanks for taking a look! I think we can proceed with your series for
now, then.

-Peff

  parent reply	other threads:[~2021-09-01  0:15 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-24 14:02 [PATCH 1/1] upload-pack: buffer ref advertisement writes Jacob Vosmaer
2021-08-24 21:07 ` Taylor Blau
2021-08-24 21:42   ` Junio C Hamano
2021-08-25  0:44     ` Jeff King
2021-08-26 10:02       ` Jacob Vosmaer
2021-08-26 10:06         ` [PATCH 1/2] pkt-line: add packet_fwrite Jacob Vosmaer
2021-08-26 10:06           ` [PATCH 2/2] upload-pack: use stdio in send_ref callbacks Jacob Vosmaer
2021-08-26 16:33             ` Junio C Hamano
2021-08-26 20:21               ` Junio C Hamano
2021-08-26 22:35               ` Taylor Blau
2021-08-26 23:24               ` Jeff King
2021-08-27 16:15                 ` Junio C Hamano
2021-08-31  9:34                   ` [PATCH v3 0/2] send_ref buffering Jacob Vosmaer
2021-08-31  9:34                     ` [PATCH v3 1/2] pkt-line: add stdio packet write functions Jacob Vosmaer
2021-08-31 10:37                       ` Jeff King
2021-08-31 18:13                       ` Junio C Hamano
2021-09-01 12:54                         ` [PATCH v4 0/2] send_ref buffering Jacob Vosmaer
2021-09-01 12:54                           ` [PATCH v4 1/2] pkt-line: add stdio packet write functions Jacob Vosmaer
2021-09-01 12:54                           ` [PATCH v4 2/2] upload-pack: use stdio in send_ref callbacks Jacob Vosmaer
2021-09-02  9:18                           ` [PATCH v4 0/2] send_ref buffering Jeff King
2021-08-31  9:34                     ` [PATCH v3 2/2] upload-pack: use stdio in send_ref callbacks Jacob Vosmaer
2021-08-31 10:25                     ` [PATCH v3 0/2] send_ref buffering Jeff King
2021-08-31 13:08                       ` Jacob Vosmaer
2021-08-31 17:44                         ` Jacob Vosmaer
2021-09-01  0:15                         ` Jeff King [this message]
2021-08-26 23:32             ` [PATCH 2/2] upload-pack: use stdio in send_ref callbacks Jeff King
2021-08-26 16:33           ` [PATCH 1/2] pkt-line: add packet_fwrite Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YS7GItELaEM8lq2z@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jacob@gitlab.com \
    --cc=me@ttaylorr.com \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).