From: Derrick Stolee <firstname.lastname@example.org>
To: Jeff King <email@example.com>,
Thomas Braun <firstname.lastname@example.org>
Cc: Derrick Stolee <email@example.com>, firstname.lastname@example.org
Subject: Re: t7900's new expensive test
Date: Tue, 1 Dec 2020 15:55:00 -0500
Message-ID: <email@example.com>

On 12/1/2020 6:39 AM, Jeff King wrote:
> On Tue, Dec 01, 2020 at 06:23:28AM -0500, Jeff King wrote:
>> I'm not sure if EXPENSIVE is the right ballpark, or if we'd want a
>> VERY_EXPENSIVE. On my machine, the whole test suite for v2.29.0 takes 64
>> seconds to run, and setting GIT_TEST_LONG=1 bumps that to 103s. It got a
>> bit worse since then, as t7900 adds an EXPENSIVE test that takes ~200s
>> (it's not strictly additive, since we can work in parallel on other
>> tests for the first bit, but still, yuck).
> Since Stolee is on the cc and has already seen me complaining about his
> test, I guess I should expand a bit. ;)

Ha. I apologize for causing pain here. My thought was that GIT_TEST_LONG=1
was only used by someone really willing to wait, or someone specifically
trying to investigate a problem that only triggers on very large cases.
In that sense, it's not so much intended as a frequently-run regression
test, but a "run this if you are messing with this area" kind of thing.

Perhaps there is a different pattern to use here?

> There are some small wins possible (e.g., using "commit --quiet" seems
> to shave off ~8s when we don't even think about writing a diff), but
> fundamentally the issue is that it just takes a long time to "git add"
> the 5.2GB worth of random data. I almost wonder if it would be worth it
> to hard-code the known sha1 and sha256 names of the blobs, and write
> them straight into the appropriate loose object file. I guess that is
> tricky, though, because it actually needs to be a zlib stream, not just
> the output of "test-tool genrandom".
> Though speaking of which, another easy win might be setting
> core.compression to "0". We know the random data won't compress anyway,
> so there's no point in spending cycles on zlib.

The intention is mostly to expand the data beyond two gigabytes, so
dropping compression to get there seems like a good idea. If we are
not compressing at all, then perhaps we can reliably cut ourselves
closer to the 2GB limit instead of overshooting as a precaution.

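To put a number on "closer to the 2GB limit", here is a rough sketch of the
arithmetic. It assumes the limit in question is 2 GiB (2^31 bytes) and that
with core.compression set to 0 the stored size tracks the raw size closely
(zlib framing and pack headers add a little on top). The variable names are
made up for illustration and are not taken from t7900.

```shell
# Size of one chunk appended by "test-tool genrandom" in the quoted test.
chunk=$((512 * 1024 * 1024 + 1))

# Assumed limit: 2 GiB, i.e. 2^31 bytes.
limit=$((2 * 1024 * 1024 * 1024))

# Smallest number of chunks whose combined size strictly exceeds the limit.
chunks_needed=$((limit / chunk + 1))

echo "chunk size:    $chunk bytes"
echo "chunks needed: $chunks_needed"
echo "total written: $((chunks_needed * chunk)) bytes"
echo "overshoot:     $((chunks_needed * chunk - limit)) bytes"
```

Under those assumptions, four chunks of 512 MiB + 1 byte would already cross
the line, versus the five that the loop writes. Whether that margin is enough
in practice would need to be checked against the actual test.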
> Doing this:
> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> index d9e68bb2bf..849c6d1361 100755
> --- a/t/t7900-maintenance.sh
> +++ b/t/t7900-maintenance.sh
> @@ -239,6 +239,8 @@ test_expect_success 'incremental-repack task' '
>  '
>  
>  test_expect_success EXPENSIVE 'incremental-repack 2g limit' '
> +	test_config core.compression 0 &&
> +
>  	for i in $(test_seq 1 5)
>  	do
>  		test-tool genrandom foo$i $((512 * 1024 * 1024 + 1)) >>big ||
> @@ -257,7 +259,7 @@ test_expect_success EXPENSIVE 'incremental-repack 2g limit' '
>  		return 1
>  	done &&
>  	git add big &&
> -	git commit -m "Add big file (2)" &&
> +	git commit -qm "Add big file (2)" &&
>  
>  	# ensure any possible loose objects are in a pack-file
>  	git maintenance run --task=loose-objects &&
> seems to shave off ~140s from the test. I think we could get a little
> more by cleaning up the enormous objects, too (they end up causing the
> subsequent test to run slower, too, though perhaps it was intentional to
> impact downstream tests).

Cutting out 70% seems like a great idea. I don't think it was super
intentional to slow down those tests.