git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Shawn Pearce <spearce@spearce.org>
Cc: Stefan Beller <sbeller@google.com>,
	Junio C Hamano <gitster@pobox.com>, git <git@vger.kernel.org>,
	e@80x24.org, dwwang@google.com,
	Dennis Kaarsemaker <dennis@kaarsemaker.net>,
	Jonathan Nieder <jrnieder@gmail.com>
Subject: Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
Date: Tue, 12 Jul 2016 01:24:30 -0400	[thread overview]
Message-ID: <20160712052430.GA23897@sigill.intra.peff.net> (raw)
In-Reply-To: <CAJo=hJtUyF=-iZeA1qBi42KBCP0pE6KsK4_MDP4JZEOf-K0waQ@mail.gmail.com>

On Sun, Jul 10, 2016 at 10:06:12AM -0700, Shawn Pearce wrote:

> On Fri, Jul 8, 2016 at 5:31 PM, Stefan Beller <sbeller@google.com> wrote:
> > +
> > +       /* NEEDSWORK: expose the limitations to be configurable. */
> > +       int max_options = 32;
> > +
> > +       /*
> > +        * NEEDSWORK: expose the limitations to be configurable;
> > +        * Once the limit can be lifted, include a way for payloads
> > +        * larger than one pkt, e.g allow a payload of up to
> > +        * LARGE_PACKET_MAX - 1 only, and reserve the last byte
> > +        * to indicate whether the next pkt continues with this
> > +        * push option.
> > +        */
> > +       int max_size = 1024;
> 
> Instead of this, what about a new config variable
> receive.maxCommandBytes[1] that places a limit on the number of bytes
> of pkt-line data the client can supply in both the command list ("old
> new ref"), push signature framing, and option list?

Hmm, I hadn't thought about that. Just for RAM usage, I don't really see
a huge need for individual limits Sometimes meta-limits like this can be
confusing to set ("I have a huge repo with 50,000 refs that I want to
mirror-push; what do I set the byte-limit to?"), but for the most part
people shouldn't need to be touching it.  And I wonder if this could
simply be done at the pkt-line level.

OTOH, if the true goal is just to limit memory usage, I wonder if git
should be involved at all. Sadly, things like ulimit or cgroups are not
greated for limiting heap memory. ulimit doesn't count non-brk() memory
on most operating systems, and cgroups is too eager to count shared
packfile mmaps. You can try hooking xmalloc/free to keep stats, but you
have to know the size of each allocation in free(), which means either
being intimate with the underlying malloc implementation, or doing dirty
tricks (e.g., increasing the allocation by a few bytes and shoving the
size there). So it feels like there should be a way to ask the malloc
implementation not to allocate more than N bytes at a time, but AFAIK
glibc malloc does not have such a feature. Maybe other allocators do.

I also think there are some reasons besides RAM usage to limit things.

For instance, there is definitely some per-ref work that receive-pack
does (and hooks may do, as well). That remains proportional to the
amount of data sent, but an annoying client can min-max the parameters
by sending a bunch of short ref names, maximizing the size of the "# of
refs" parameter.

Similarly, I'm as much interested in total RAM usage as I am in making
sure we don't hit weird pathological cases. For instance, places where
tree entry names are so long that they overflow "int" and cause bogus
size computations and access memory outside of the array. That probably
wouldn't be an issue here, if the total data size is on the order of
megabytes (so your worst-case is somebody minimizes all parameters but
one, and then that maximizes the other one; but if it still caps out at
a few megabytes, that's not going to overflow anything).

> I studied a lot of repositories[2] and most use ref names under 200
> bytes in length. A 3 MiB default for receive.maxCommandBytes gives
> users something like 11,115 references in a single git push invocation
> if they used all 200 bytes in every name. Most users don't have ref
> names this long. Unlike a cap on each ref, it allows users to use the
> full 65449 bytes in a reference name available in pkt-line, but you
> can only send 48 such references. Likewise for options. :)

My usual "worst case" for refs is:

  https://github.com/JetBrains/intellij-community

A mirror push of that needs more like 4MiB. I'd probably say something
like 10MiB is a reasonable default.

-Peff

  parent reply	other threads:[~2016-07-12  5:24 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-09  0:31 [PATCHv4 0/4] Push options Stefan Beller
2016-07-09  0:31 ` [PATCH 1/4] push options: {pre,post}-receive hook learns about push options Stefan Beller
2016-07-09  0:31 ` [PATCH 2/4] receive-pack: implement advertising and receiving " Stefan Beller
2016-07-10 17:06   ` Shawn Pearce
2016-07-10 18:05     ` Stefan Beller
2016-07-12  4:53       ` Shawn Pearce
2016-07-12  5:24     ` Jeff King [this message]
2016-07-09  0:31 ` [PATCH 3/4] push: accept " Stefan Beller
2016-07-09  0:31 ` [PATCH 4/4] add a test for " Stefan Beller
  -- strict thread matches above, loose matches on Subject: below --
2016-07-14 21:49 [PATCHv7 0/4] Push options Stefan Beller
2016-07-14 21:49 ` [PATCH 2/4] receive-pack: implement advertising and receiving push options Stefan Beller
2016-07-14 17:39 [PATCHv5 0/4] Push options Stefan Beller
2016-07-14 17:39 ` [PATCH 2/4] receive-pack: implement advertising and receiving push options Stefan Beller
2016-07-14 18:38   ` Junio C Hamano
2016-07-14 19:00     ` Stefan Beller
2016-07-14 19:07       ` Junio C Hamano
2016-07-14 19:45         ` Jeff King
2016-07-14 20:07           ` Junio C Hamano
2016-07-07  1:12 [PATCHv3 0/4] Push options in C Git Stefan Beller
2016-07-07  1:12 ` [PATCH 2/4] receive-pack: implement advertising and receiving push options Stefan Beller
2016-07-07 20:37   ` Junio C Hamano
2016-07-07 21:41     ` Stefan Beller
2016-07-07 21:56       ` Jeff King
2016-07-07 22:06         ` Stefan Beller
2016-07-07 22:09           ` Jeff King
2016-07-07 22:06       ` Junio C Hamano
2016-07-08 17:58         ` Jonathan Nieder
2016-07-08 18:39           ` Junio C Hamano
2016-07-08 18:57             ` Stefan Beller
2016-07-08 21:46               ` Jeff King
2016-07-08 22:17                 ` Stefan Beller
2016-07-08 22:21                   ` Jeff King
2016-07-08 22:29                     ` Stefan Beller
2016-07-08 22:35                       ` Jeff King
2016-07-08 22:43                         ` Stefan Beller
2016-07-08 22:46                           ` Jeff King
2016-07-08 22:51                             ` Stefan Beller
2016-06-30  0:59 [RFC PATCHv1 0/4] Push options in C Git Stefan Beller
2016-06-30  0:59 ` [PATCH 2/4] receive-pack: implement advertising and receiving push options Stefan Beller
2016-07-01 17:11   ` Junio C Hamano
2016-07-01 17:24     ` Stefan Beller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160712052430.GA23897@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=dennis@kaarsemaker.net \
    --cc=dwwang@google.com \
    --cc=e@80x24.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jrnieder@gmail.com \
    --cc=sbeller@google.com \
    --cc=spearce@spearce.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).