git@vger.kernel.org mailing list mirror (one of many)
From: Jeff King <peff@peff.net>
To: Derrick Stolee <stolee@gmail.com>
Cc: "brian m. carlson" <sandals@crustytoothpaste.net>,
	"Jonathan Nieder" <jrnieder@gmail.com>,
	"Paul Smith" <paul@mad-scientist.net>,
	git@vger.kernel.org, "Duy Nguyen" <pclouds@gmail.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: Re: [ANNOUNCE] Git v2.19.0-rc0
Date: Fri, 24 Aug 2018 02:56:25 -0400
Message-ID: <20180824065625.GA10556@sigill.intra.peff.net>
In-Reply-To: <4b0168ee-4826-1f14-fc83-04c4cec18687@gmail.com>

On Thu, Aug 23, 2018 at 04:59:27PM -0400, Derrick Stolee wrote:

> Using git/git:
> 
> Test      v2.18.0           v2.19.0-rc0             HEAD
> -------------------------------------------------------------------------
> 0001.2:   3.10(3.02+0.08)   3.27(3.17+0.09) +5.5% 3.14(3.02+0.11) +1.3%
> 
> 
> Using torvalds/linux:
> 
> Test     v2.18.0             v2.19.0-rc0               HEAD
> ------------------------------------------------------------------------------
> 0001.2:  56.08(45.91+1.50)   56.60(46.62+1.50) +0.9% 54.61(45.47+1.46) -2.6%

Interesting that these timings aren't as dramatic as the ones you got
the other day (mine seemed to shift, too; for whatever reason it seems
like under load the difference is larger).

> Now here is where I get on my soapbox (and create a TODO for myself later).
> I ran the above with GIT_PERF_REPEAT_COUNT=10, which intuitively suggests
> that the results should be _more_ accurate than the default of 3. However, I
> then remember that we only report the *minimum* time from all the runs,
> which is likely to select an outlier from the distribution. To test this, I
> ran a few tests manually and found the variation between runs to be larger
> than 3%.

Yes, I agree it's not a great system. The whole "best of 3" thing is
OK for throwing out cold-cache warmups, but it's really bad for teasing
out the significance of small changes, or even understanding how much
run-to-run noise there is.

> When I choose my own metrics for performance tests, I like to run at least
> 10 runs, remove the largest AND smallest runs from the samples, and then
> take the average. I did this manually for 'git rev-list --all --objects' on
> git/git and got the following results:

I agree that technique is better. I wonder if there's something even
more statistically rigorous we could do. E.g., to compute the variance
and throw away outliers based on standard deviations. And also to report
the variance to give a sense of the significance of any changes.
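
To make those two ideas concrete, here is a minimal Python sketch. It is
not what t/perf does today; the function names and the 2-sigma cutoff are
illustrative assumptions. trimmed_mean() is the "drop the largest and
smallest, then average" approach from the quoted text, and sigma_clipped()
throws away samples that sit more than k standard deviations from the mean
and reports the mean and standard deviation of what is left.

```python
import statistics


def trimmed_mean(samples):
    """Drop the single largest and single smallest sample, average the rest."""
    if len(samples) < 3:
        raise ValueError("need at least 3 samples to trim both ends")
    kept = sorted(samples)[1:-1]
    return statistics.mean(kept)


def sigma_clipped(samples, k=2.0):
    """Discard samples more than k standard deviations from the mean.

    Returns (mean, stdev, kept), all computed over the surviving samples.
    The default k=2.0 is an arbitrary, illustrative cutoff.
    """
    if len(samples) < 2:
        raise ValueError("need at least 2 samples")
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)
    kept = [s for s in samples if abs(s - mean) <= k * stdev]
    if len(kept) < 2:
        # Cutoff was too aggressive to leave a usable sample; keep everything.
        kept = list(samples)
    return statistics.mean(kept), statistics.stdev(kept), kept


if __name__ == "__main__":
    # Made-up timings in seconds (not measurements from this thread).
    runs = [3.10, 3.12, 3.11, 3.13, 3.12, 3.50, 3.11, 3.12, 3.10, 3.13]
    mean, stdev, kept = sigma_clipped(runs)
    print(f"trimmed mean:  {trimmed_mean(runs):.4f}")
    print(f"sigma-clipped: {mean:.4f} +/- {stdev:.4f} ({len(kept)} of {len(runs)} runs kept)")
```

Reporting the standard deviation next to the average is what lets a reader
judge whether a 1% difference between two versions is signal or noise.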

Obviously more runs give greater confidence in the results, but 10
sounds like a lot. Many of these tests take minutes to run. Letting it
go overnight is OK if you're doing a once-per-release mega-run, but it's
pretty painful if you just want to generate some numbers to show off
your commit.

> v2.18.0    v2.19.0-rc0   HEAD
> --------------------------------
> 3.126 s    3.308 s       3.170 s

So that's a roughly 5.8% worsening in 2.19, and we reclaim all but 1.4% of it. Those
numbers match what I expect (and what I was seeing in some of my earlier
timings).
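
For reference, the percentages follow directly from the quoted averages,
taking v2.18.0 as the baseline. A small Python sketch of the arithmetic
(the helper name percent_change is just for illustration):

```python
def percent_change(baseline, measured):
    """Relative change of `measured` versus `baseline`, in percent."""
    return (measured - baseline) / baseline * 100.0


# Averages from the quoted table, in seconds.
v2_18_0 = 3.126
v2_19_0_rc0 = 3.308
head = 3.170

print(f"v2.19.0-rc0 vs v2.18.0: {percent_change(v2_18_0, v2_19_0_rc0):+.1f}%")
print(f"HEAD        vs v2.18.0: {percent_change(v2_18_0, head):+.1f}%")
```

That works out to about +5.8% for v2.19.0-rc0 and +1.4% for HEAD.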

> I just kicked off a script that will run this test on the Linux repo while I
> drive home. I'll be able to report a similar table of data easily.

Thanks, I'd expect it to come up with similar percentages. So we'll see
if that holds true. :)

> My TODO is to consider aggregating the data this way (or with a median)
> instead of reporting the minimum.

Yes, I think that would be a great improvement for t/perf.
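
As a rough illustration of the difference (the numbers are made up and the
summarize() helper is hypothetical, not part of t/perf), here is how the
minimum and the median react to a single unusually fast run:

```python
import statistics


def summarize(samples):
    """Return both the current-style minimum and a median for comparison."""
    return {"min": min(samples), "median": statistics.median(samples)}


if __name__ == "__main__":
    # Made-up timings in seconds; one run (2.95) is an unusually fast outlier.
    runs = [3.21, 3.18, 2.95, 3.20, 3.19]
    print(summarize(runs))
```

The minimum reports the outlier itself, while the median stays with the
bulk of the runs.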

-Peff
