From: Derrick Stolee <firstname.lastname@example.org>
To: Jeff King <email@example.com>, Thomas Braun <firstname.lastname@example.org>
Cc: Derrick Stolee <email@example.com>, firstname.lastname@example.org
Subject: Re: t7900's new expensive test
Date: Tue, 1 Dec 2020 15:55:00 -0500	[thread overview]
Message-ID: <email@example.com> (raw)
In-Reply-To: <X8YrbDpC9/EjRr95@coredump.intra.peff.net>

On 12/1/2020 6:39 AM, Jeff King wrote:
> On Tue, Dec 01, 2020 at 06:23:28AM -0500, Jeff King wrote:
>
>> I'm not sure if EXPENSIVE is the right ballpark, or if we'd want a
>> VERY_EXPENSIVE. On my machine, the whole test suite for v2.29.0 takes 64
>> seconds to run, and setting GIT_TEST_LONG=1 bumps that to 103s. It got a
>> bit worse since then, as t7900 adds an EXPENSIVE test that takes ~200s
>> (it's not strictly additive, since we can work in parallel on other
>> tests for the first bit, but still, yuck).
>
> Since Stolee is on the cc and has already seen me complaining about his
> test, I guess I should expand a bit. ;)

Ha. I apologize for causing pain here. My thought was that GIT_TEST_LONG=1
was only used by someone really willing to wait, or someone specifically
trying to investigate a problem that only triggers on very large cases.
In that sense, it's not so much intended as a frequently-run regression
test, but a "run this if you are messing with this area" kind of thing.
Perhaps there is a different pattern to use here?

> There are some small wins possible (e.g., using "commit --quiet" seems
> to shave off ~8s when we don't even think about writing a diff), but
> fundamentally the issue is that it just takes a long time to "git add"
> the 5.2GB worth of random data. I almost wonder if it would be worth it
> to hard-code the known sha1 and sha256 names of the blobs, and write
> them straight into the appropriate loose object files. I guess that is
> tricky, though, because it actually needs to be a zlib stream, not just
> the output of "test-tool genrandom".
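[For context on why writing blobs "straight into the appropriate loose object file" is tricky: a loose object is the zlib-compressed bytes of "<type> <size>\0<payload>", and the object name is the hash of those uncompressed bytes. A minimal Python sketch of that format (an editorial illustration, not how the test would actually be implemented):]

```python
import hashlib
import zlib

def loose_object(data: bytes, obj_type: str = "blob"):
    """Build the on-disk form of a loose object: zlib(header + payload)."""
    store = b"%s %d\x00" % (obj_type.encode(), len(data)) + data
    # SHA-256 repos hash the same bytes with hashlib.sha256 instead.
    oid = hashlib.sha1(store).hexdigest()
    # git would store the compressed bytes at .git/objects/<oid[:2]>/<oid[2:]>
    return oid, zlib.compress(store)

oid, compressed = loose_object(b"hello\n")
print(oid)  # ce013625030ba8dba906f756967f9e9ca394464a, same as `git hash-object`
```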
> Though speaking of which, another easy win might be setting
> core.compression to "0". We know the random data won't compress anyway,
> so there's no point in spending cycles on zlib.

The intention is mostly to expand the data beyond two gigabytes, so
dropping compression to get there seems like a good idea. If we are not
compressing at all, then perhaps we can reliably cut ourselves closer to
the 2GB limit instead of overshooting as a precaution.

> Doing this:
>
> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> index d9e68bb2bf..849c6d1361 100755
> --- a/t/t7900-maintenance.sh
> +++ b/t/t7900-maintenance.sh
> @@ -239,6 +239,8 @@ test_expect_success 'incremental-repack task' '
>  '
>  
>  test_expect_success EXPENSIVE 'incremental-repack 2g limit' '
> +	test_config core.compression 0 &&
> +
>  	for i in $(test_seq 1 5)
>  	do
>  		test-tool genrandom foo$i $((512 * 1024 * 1024 + 1)) >>big ||
> @@ -257,7 +259,7 @@ test_expect_success EXPENSIVE 'incremental-repack 2g limit' '
>  			return 1
>  	done &&
>  	git add big &&
> -	git commit -m "Add big file (2)" &&
> +	git commit -qm "Add big file (2)" &&
>  
>  	# ensure any possible loose objects are in a pack-file
>  	git maintenance run --task=loose-objects &&
>
> seems to shave off ~140s from the test. I think we could get a little
> more by cleaning up the enormous objects, too (they end up causing the
> subsequent test to run slower, too, though perhaps it was intentional to
> impact downstream tests).

Cutting 70% out seems like a great idea. I don't think it was super
intentional to slow down those tests.

Thanks,
-Stolee
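[Peff's point about core.compression=0 is easy to sanity-check outside of git: zlib at level 0 emits "stored" blocks, paying only a few bytes of framing overhead and no deflate work, while the default level spends real CPU on random data without shrinking it at all. A small editorial sketch, not part of the proposed patch; the 1 MiB buffer just stands in for test-tool genrandom output:]

```python
import os
import zlib

# Incompressible random bytes, standing in for test-tool genrandom output.
data = os.urandom(1 << 20)

stored = zlib.compress(data, 0)    # core.compression=0: "stored" blocks, no deflate work
deflated = zlib.compress(data, 6)  # zlib default: real CPU spent for no size win

print(len(stored) - len(data))     # small constant framing overhead (tens of bytes)
print(len(deflated) >= len(data))  # True: random data does not shrink
```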
next prev parent reply	other threads:[~2020-12-01 20:56 UTC|newest]

Thread overview: 19+ messages in thread
2020-11-13  5:06 [PATCH 0/5] handling 4GB .idx files Jeff King
2020-11-13  5:06 ` [PATCH 1/5] compute pack .idx byte offsets using size_t Jeff King
2020-11-13  5:07 ` [PATCH 2/5] use size_t to store pack .idx byte offsets Jeff King
2020-11-13  5:07 ` [PATCH 3/5] fsck: correctly compute checksums on idx files larger than 4GB Jeff King
2020-11-13  5:07 ` [PATCH 4/5] block-sha1: take a size_t length parameter Jeff King
2020-11-13  5:07 ` [PATCH 5/5] packfile: detect overflow in .idx file size checks Jeff King
2020-11-13 11:02   ` Johannes Schindelin
2020-11-15 14:43 ` [PATCH 0/5] handling 4GB .idx files Thomas Braun
2020-11-16  4:10   ` Jeff King
2020-11-16 13:30     ` Derrick Stolee
2020-11-16 23:49       ` Jeff King
2020-11-30 22:57   ` Thomas Braun
2020-12-01 11:23     ` Jeff King
2020-12-01 11:39       ` t7900's new expensive test Jeff King
2020-12-01 20:55         ` Derrick Stolee [this message]
2020-12-02  2:47           ` [PATCH] t7900: speed up " Jeff King
2020-12-03 15:23             ` Derrick Stolee
2020-12-01 18:27 ` [PATCH 0/5] handling 4GB .idx files Taylor Blau
2020-12-02 13:12   ` Jeff King