From: Jeff King <firstname.lastname@example.org>
To: Derrick Stolee <email@example.com>
Cc: firstname.lastname@example.org, Junio C Hamano <email@example.com>
Subject: Re: [PATCH v2 13/15] pack-bitmap: implement BLOB_NONE filtering
Date: Tue, 18 Feb 2020 15:24:17 -0500
Message-ID: <20200218202417.GE21774@coredump.intra.peff.net>
In-Reply-To: <firstname.lastname@example.org>

On Tue, Feb 18, 2020 at 02:26:53PM -0500, Derrick Stolee wrote:

> On 2/14/2020 1:22 PM, Jeff King wrote:
> > We can easily support BLOB_NONE filters with bitmaps. Since we know the
> > types of all of the objects, we just need to clear the result bits of
> > any blobs.
> >
> > Note two subtleties in the implementation (which I also called out in
> > comments):
> >
> >   - we have to include any blobs that were specifically asked for (and
> >     not reached through graph traversal) to match the non-bitmap version
>
> I have a concern here, but maybe I'm worrying about nothing. When a
> partial clone asks for a pack of missing blobs, will your code create
> an empty bitmap and then add bits to that bitmap one-by-one instead
> of appending to a simple object list?

Yes. They'd all be listed in revs->pending, so we'd add them to our
array of "wants", and then create a bitmap. There's no traversal cost,
but we'd pay the cost to open the bitmap file. But...

> In the typical case where we ask for specific commits and trees, we
> expect a very small number of blobs to add to the resulting bitmap.
> When no commits or trees are included in the wants, then we don't
> need the bitmap at all. IIRC an EWAH bitmap is relatively expensive
> to update bits one at a time, so this is not incredibly efficient.

There's no ewah bitmap at play here at all. The internal bitmap we
create in memory for the walk is a real uncompressed bitmap, so setting
and retrieving bits is pretty cheap.
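
To make the "uncompressed bitmap" point concrete, here is a minimal
sketch of the idea, not the actual pack-bitmap code. All names in it
(plain_bitmap, plain_bitmap_set(), filter_blob_none(), and so on) are
hypothetical and chosen only for illustration. It assumes one bit per
object position, a "blobs" bitmap marking which positions are blobs, and
a "wanted" bitmap marking objects that were asked for explicitly.
Setting, clearing, and reading a bit are each a single word operation,
and a BLOB_NONE-style filter reduces to a word-by-word mask that keeps
the explicitly requested blobs:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical uncompressed bitmap: one bit per object position. */
struct plain_bitmap {
	uint64_t *words;
	size_t nr_words;
};

/* Allocate a zeroed bitmap able to hold nr_bits bits; NULL on failure. */
static inline struct plain_bitmap *plain_bitmap_new(size_t nr_bits)
{
	struct plain_bitmap *b = malloc(sizeof(*b));
	if (!b)
		return NULL;
	b->nr_words = (nr_bits + 63) / 64;
	b->words = calloc(b->nr_words ? b->nr_words : 1, sizeof(uint64_t));
	if (!b->words) {
		free(b);
		return NULL;
	}
	return b;
}

static inline void plain_bitmap_free(struct plain_bitmap *b)
{
	if (b) {
		free(b->words);
		free(b);
	}
}

/* Each of these touches exactly one 64-bit word. */
static inline void plain_bitmap_set(struct plain_bitmap *b, size_t pos)
{
	assert(pos / 64 < b->nr_words);
	b->words[pos / 64] |= (uint64_t)1 << (pos % 64);
}

static inline void plain_bitmap_unset(struct plain_bitmap *b, size_t pos)
{
	assert(pos / 64 < b->nr_words);
	b->words[pos / 64] &= ~((uint64_t)1 << (pos % 64));
}

static inline int plain_bitmap_get(const struct plain_bitmap *b, size_t pos)
{
	assert(pos / 64 < b->nr_words);
	return (int)((b->words[pos / 64] >> (pos % 64)) & 1);
}

/*
 * BLOB_NONE-style filtering (illustrative only): drop every blob from
 * "result", except blobs that were explicitly asked for in "wanted".
 * All three bitmaps must cover the same number of bits.
 */
static inline void filter_blob_none(struct plain_bitmap *result,
				    const struct plain_bitmap *blobs,
				    const struct plain_bitmap *wanted)
{
	assert(result->nr_words == blobs->nr_words);
	assert(result->nr_words == wanted->nr_words);
	for (size_t i = 0; i < result->nr_words; i++)
		result->words[i] &= ~blobs->words[i] | wanted->words[i];
}
```

The sketch works a whole word at a time for the filter; a real
implementation might instead walk a compressed type index and unset
bits one by one, which is still cheap on an uncompressed result.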

I'd worry much more about the fact that we had to parse the whole bitmap
file (which for historical format reasons involves actually running over
all of the bytes in the file).

I think that's somewhat orthogonal to this patch, though. The same
pessimality would be true of anybody fetching a couple blobs, whether
they use filters or not (and really, there's no reason that the
follow-up fetch for blobs in a filtered repository would need to use
filters, but for some reason it does).

It would be an easy optimization to say "we have only blobs, don't
bother opening the bitmap file", but I think that should come on top.

However, I have a suspicion that we actually call parse_object() on each
blob before we even get to the bitmap code (via get_reference()). If so,
that's a much larger low-hanging fruit.

> > --- a/t/perf/p5310-pack-bitmaps.sh
> [...]
>
> I wondered why you chose to extend these tests instead of using
> p5600-partial-clone.sh, but I guess this script definitely creates
> the bitmap for the test. When I tested p5600-partial-clone.sh below,
> I manually repacked the Linux repo to have a bitmap:

Right, when the two features combine, we have to either pick a test
script that hits one or the other, or create a new one. Especially since
this one covers the full and partial bitmap states, I think it's nice to
reuse that work.

> Test                          v2.25.0               HEAD
> ----------------------------------------------------------------------------
> 5600.2: clone without blobs   79.81(111.34+11.35)   36.00(69.37+7.30) -54.9%
> 5600.3: checkout of result    45.56(114.59+4.81)    46.43(80.50+5.41) +1.9%
>
> Perhaps results for these tests would also be appropriate for your
> commit messages?

I think your p5600.2 is basically the same as what I ended up with in
p5310 for the final commit (when pack-objects finally learns to use
these filters, too). But we want to make sure we are testing with
bitmaps, so I don't think we'd want to count on the sample repo having
bitmaps already.

(I think this speaks to the general problems with the perf suite, in
that we probably ought to be testing combinations of repos and tests,
rather than manually creating situations in each script).

> Note the +1.9% for the checkout. It's unlikely that this is actually
> something meaningful, but it _could_ be related to my concerns above
> about building a blob list from an empty bitmap.

In my experience with the perf suite, 2% is most likely noise
(especially for a filesystem-heavy operation like checkout). Note that
the difference in system time covers almost all of the wall-clock time.

It is curious that the user time in your second test dropped so
drastically, but didn't create a wall-clock improvement. I've often seen
weirdness there with the CPU clock changing due to external factors.

-Peff