git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / code / Atom feed
From: Josh Steadmon <steadmon@google.com>
To: Jeff Hostetler <git@jeffhostetler.com>
Cc: git@vger.kernel.org
Subject: Re: [RFC PATCH] trace2: don't overload target directories
Date: Tue, 30 Jul 2019 15:08:01 -0700	[thread overview]
Message-ID: <20190730220801.GN43313@google.com> (raw)
In-Reply-To: <9ccf15ac-0e53-7044-e7b0-03bb229824d8@jeffhostetler.com>

On 2019.07.30 14:00, Jeff Hostetler wrote:
> 
> 
> On 7/29/2019 6:20 PM, Josh Steadmon wrote:
> > trace2 can write files into a target directory. With heavy usage, this
> > directory can fill up with files, causing difficulty for
> > trace-processing systems.
> 
> I'm routing data in my org to a daemon via a Named Pipe or UD Socket,
> so I'm not seeing the thousands of files problems that you're seeing.
> However, we were being overwhelmed with lots of "uninteresting" commands
> and so I added some whitelisting to my post-processing daemon.  For
> example, I want to know about checkout and push times -- I really don't
> care about rev-parse or config times or other such minor commands.
> 
> I went one step further and allow either "(cmd_name)" or
> the pair "(cmd_name, cmd_mode)".  This lets me select all checkouts
> and limit checkouts to branch-changing ones, for example.  I drop
> any events in my post-processor that does not match any of my whitelist
> patterns.
> 
> Perhaps you could run a quick histogram and see if something would
> be useful to pre-filter the data.  That is, if we had whitelisting
> within git.exe itself, would you still have too much data and/or
> would you still need the overload feature that you've proposed in
> this RFC?

Our problem is not so much with the volume of data as the fact that a
few special users have a ton of git invocations in rapid succession,
each of which creates a new trace file. If we had a more synchronous
collection system like yours it would probably not be so much of a
challenge. But the massive number of files created in a short timeframe
revealed some inefficiencies in our collection system (which are
thankfully being addressed by the owners of that code). But we probably
still want some sort of overload prevention feature as long as we're
using the target directory approach.

  reply	other threads:[~2019-07-30 22:08 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-29 22:20 Josh Steadmon
2019-07-30 13:29 ` Derrick Stolee
2019-07-30 21:52   ` Josh Steadmon
2019-07-30 16:46 ` Jeff Hostetler
2019-07-30 22:01   ` Josh Steadmon
2019-07-30 22:02   ` Josh Steadmon
2019-07-30 18:00 ` Jeff Hostetler
2019-07-30 22:08   ` Josh Steadmon [this message]
2019-08-02 22:02 ` [RFC PATCH v2 0/2] " Josh Steadmon
2019-08-02 22:02   ` [RFC PATCH v2 1/2] docs: mention trace2 target-dir mode in git-config Josh Steadmon
2019-08-02 22:02   ` [RFC PATCH v2 2/2] trace2: don't overload target directories Josh Steadmon
2019-08-05 15:34     ` Jeff Hostetler
2019-08-05 18:17       ` Josh Steadmon
2019-08-05 18:01     ` SZEDER Gábor
2019-08-05 18:09       ` Josh Steadmon
2019-09-14  0:25 ` [RFC PATCH v3 0/3] " Josh Steadmon
2019-09-14  0:25   ` [RFC PATCH v3 1/3] docs: mention trace2 target-dir mode in git-config Josh Steadmon
2019-09-14  0:25   ` [RFC PATCH v3 2/3] trace2: don't overload target directories Josh Steadmon
2019-09-14  0:26   ` [RFC PATCH v3 3/3] trace2: write overload message to sentinel files Josh Steadmon
2019-09-16 12:07     ` Derrick Stolee
2019-09-16 14:11       ` Jeff Hostetler
2019-09-16 18:20         ` Josh Steadmon
2019-09-19 18:23           ` Jeff Hostetler
2019-09-19 22:47             ` Josh Steadmon
2019-09-20 15:59               ` Jeff Hostetler
2019-09-16 18:07       ` Josh Steadmon
2019-10-03 23:32 ` [PATCH v4 0/4] trace2: don't overload target directories Josh Steadmon
2019-10-03 23:32   ` [PATCH v4 1/4] docs: mention trace2 target-dir mode in git-config Josh Steadmon
2019-10-03 23:32   ` [PATCH v4 2/4] docs: clarify trace2 version invariants Josh Steadmon
2019-10-03 23:32   ` [PATCH v4 3/4] trace2: don't overload target directories Josh Steadmon
2019-10-04  0:25     ` Junio C Hamano
2019-10-04 21:57       ` Josh Steadmon
2019-10-04  9:12     ` Johannes Schindelin
2019-10-04 22:05       ` Josh Steadmon
2019-10-03 23:32   ` [PATCH v4 4/4] trace2: write overload message to sentinel files Josh Steadmon
2019-10-04 22:08 ` [PATCH v5 0/4] trace2: discard new traces if the target directory contains too many files Josh Steadmon
2019-10-04 22:08   ` [PATCH v5 1/4] docs: mention trace2 target-dir mode in git-config Josh Steadmon
2019-10-04 22:08   ` [PATCH v5 2/4] docs: clarify trace2 version invariants Josh Steadmon
2019-10-04 22:08   ` [PATCH v5 3/4] trace2: discard new traces if target directory has too many files Josh Steadmon
2019-10-04 22:08   ` [PATCH v5 4/4] trace2: write discard message to sentinel files Josh Steadmon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190730220801.GN43313@google.com \
    --to=steadmon@google.com \
    --cc=git@jeffhostetler.com \
    --cc=git@vger.kernel.org \
    --subject='Re: [RFC PATCH] trace2: don'\''t overload target directories' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

git@vger.kernel.org list mirror (unofficial, one of many)

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V1 git git/ https://public-inbox.org/git \
		git@vger.kernel.org
	public-inbox-index git

Example config snippet for mirrors.
Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://7fh6tueqddpjyxjmgtdiueylzoqt6pt7hec3pukyptlmohoowvhde4yd.onion/inbox.comp.version-control.git
	nntp://ie5yzdi7fg72h7s4sdcztq5evakq23rdt33mfyfcddc5u3ndnw24ogqd.onion/inbox.comp.version-control.git
	nntp://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/inbox.comp.version-control.git
	nntp://news.gmane.io/gmane.comp.version-control.git
 note: .onion URLs require Tor: https://www.torproject.org/

code repositories for project(s) associated with this inbox:

	https://80x24.org/mirrors/git.git

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git