git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: git@vger.kernel.org
Cc: Justin Tobler <jltobler@gmail.com>
Subject: [PATCH v4 7/7] reftable/block: avoid decoding keys when searching restart points
Date: Wed, 3 Apr 2024 08:04:29 +0200	[thread overview]
Message-ID: <632f725dde7b7f2eda83d285af90c7d989f294dc.1712123093.git.ps@pks.im> (raw)
In-Reply-To: <cover.1712123093.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 2061 bytes --]

When searching over restart points in a block we decode the key of each
of the records, which results in a memory allocation. This is quite
pointless though given that records it restart points will never use
prefix compression and thus store their keys verbatim in the block.

Refactor the code so that we can avoid decoding the keys, which saves us
some allocations.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/block.c | 29 +++++++++++++++++++----------
 1 file changed, 19 insertions(+), 10 deletions(-)

diff --git a/reftable/block.c b/reftable/block.c
index ca217636ae..298e8c56b9 100644
--- a/reftable/block.c
+++ b/reftable/block.c
@@ -287,23 +287,32 @@ static int restart_needle_less(size_t idx, void *_args)
 		.buf = args->reader->block.data + off,
 		.len = args->reader->block_len - off,
 	};
-	struct strbuf kth_restart_key = STRBUF_INIT;
-	uint8_t unused_extra;
-	int result, n;
+	uint64_t prefix_len, suffix_len;
+	uint8_t extra;
+	int n;
 
 	/*
-	 * TODO: The restart key is verbatim in the block, so we can in theory
-	 * avoid decoding the key and thus save some allocations.
+	 * Records at restart points are stored without prefix compression, so
+	 * there is no need to fully decode the record key here. This removes
+	 * the need for allocating memory.
 	 */
-	n = reftable_decode_key(&kth_restart_key, &unused_extra, in);
-	if (n < 0) {
+	n = reftable_decode_keylen(in, &prefix_len, &suffix_len, &extra);
+	if (n < 0 || prefix_len) {
 		args->error = 1;
 		return -1;
 	}
 
-	result = strbuf_cmp(&args->needle, &kth_restart_key);
-	strbuf_release(&kth_restart_key);
-	return result < 0;
+	string_view_consume(&in, n);
+	if (suffix_len > in.len) {
+		args->error = 1;
+		return -1;
+	}
+
+	n = memcmp(args->needle.buf, in.buf,
+		   args->needle.len < suffix_len ? args->needle.len : suffix_len);
+	if (n)
+		return n < 0;
+	return args->needle.len < suffix_len;
 }
 
 void block_iter_copy_from(struct block_iter *dest, struct block_iter *src)
-- 
2.44.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

      parent reply	other threads:[~2024-04-03  6:04 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-22 12:22 [PATCH 0/7] reftable: improvements for the `binsearch()` mechanism Patrick Steinhardt
2024-03-22 12:22 ` [PATCH 1/7] reftable/basics: fix return type of `binsearch()` to be `size_t` Patrick Steinhardt
2024-03-22 12:22 ` [PATCH 2/7] reftable/basics: improve `binsearch()` test Patrick Steinhardt
2024-03-22 18:46   ` Justin Tobler
2024-03-25 10:07     ` Patrick Steinhardt
2024-03-22 12:22 ` [PATCH 3/7] reftable/refname: refactor binary search over refnames Patrick Steinhardt
2024-03-22 18:55   ` Justin Tobler
2024-03-25 10:07     ` Patrick Steinhardt
2024-03-22 12:22 ` [PATCH 4/7] reftable/block: refactor binary search over restart points Patrick Steinhardt
2024-03-22 12:22 ` [PATCH 5/7] reftable/block: fix error handling when searching " Patrick Steinhardt
2024-03-22 12:22 ` [PATCH 6/7] reftable/record: extract function to decode key lengths Patrick Steinhardt
2024-03-22 12:22 ` [PATCH 7/7] reftable/block: avoid decoding keys when searching restart points Patrick Steinhardt
2024-03-25 10:10 ` [PATCH v2 0/7] reftable: improvements for the `binsearch()` mechanism Patrick Steinhardt
2024-03-25 10:10   ` [PATCH v2 1/7] reftable/basics: fix return type of `binsearch()` to be `size_t` Patrick Steinhardt
2024-03-25 10:10   ` [PATCH v2 2/7] reftable/basics: improve `binsearch()` test Patrick Steinhardt
2024-03-25 10:10   ` [PATCH v2 3/7] reftable/refname: refactor binary search over refnames Patrick Steinhardt
2024-04-02 16:27     ` Justin Tobler
2024-04-02 17:15       ` Patrick Steinhardt
2024-03-25 10:10   ` [PATCH v2 4/7] reftable/block: refactor binary search over restart points Patrick Steinhardt
2024-04-02 16:42     ` Justin Tobler
2024-04-02 17:15       ` Patrick Steinhardt
2024-04-02 17:46         ` Justin Tobler
2024-04-03  6:01           ` Patrick Steinhardt
2024-03-25 10:10   ` [PATCH v2 5/7] reftable/block: fix error handling when searching " Patrick Steinhardt
2024-03-25 10:10   ` [PATCH v2 6/7] reftable/record: extract function to decode key lengths Patrick Steinhardt
2024-03-25 10:11   ` [PATCH v2 7/7] reftable/block: avoid decoding keys when searching restart points Patrick Steinhardt
2024-04-02 16:47     ` Justin Tobler
2024-04-02 17:15       ` Patrick Steinhardt
2024-04-02 17:24 ` [PATCH v3 0/7] reftable: improvements for the `binsearch()` mechanism Patrick Steinhardt
2024-04-02 17:24   ` [PATCH v3 1/7] reftable/basics: fix return type of `binsearch()` to be `size_t` Patrick Steinhardt
2024-04-02 17:24   ` [PATCH v3 2/7] reftable/basics: improve `binsearch()` test Patrick Steinhardt
2024-04-02 17:24   ` [PATCH v3 3/7] reftable/refname: refactor binary search over refnames Patrick Steinhardt
2024-04-02 17:24   ` [PATCH v3 4/7] reftable/block: refactor binary search over restart points Patrick Steinhardt
2024-04-02 17:24   ` [PATCH v3 5/7] reftable/block: fix error handling when searching " Patrick Steinhardt
2024-04-02 17:25   ` [PATCH v3 6/7] reftable/record: extract function to decode key lengths Patrick Steinhardt
2024-04-02 17:25   ` [PATCH v3 7/7] reftable/block: avoid decoding keys when searching restart points Patrick Steinhardt
2024-04-02 17:49   ` [PATCH v3 0/7] reftable: improvements for the `binsearch()` mechanism Justin Tobler
2024-04-03  6:03 ` [PATCH v4 " Patrick Steinhardt
2024-04-03  6:03   ` [PATCH v4 1/7] reftable/basics: fix return type of `binsearch()` to be `size_t` Patrick Steinhardt
2024-04-03  6:04   ` [PATCH v4 2/7] reftable/basics: improve `binsearch()` test Patrick Steinhardt
2024-04-03  6:04   ` [PATCH v4 3/7] reftable/refname: refactor binary search over refnames Patrick Steinhardt
2024-04-03  6:04   ` [PATCH v4 4/7] reftable/block: refactor binary search over restart points Patrick Steinhardt
2024-04-03  6:04   ` [PATCH v4 5/7] reftable/block: fix error handling when searching " Patrick Steinhardt
2024-04-03  6:04   ` [PATCH v4 6/7] reftable/record: extract function to decode key lengths Patrick Steinhardt
2024-04-03  6:04   ` Patrick Steinhardt [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=632f725dde7b7f2eda83d285af90c7d989f294dc.1712123093.git.ps@pks.im \
    --to=ps@pks.im \
    --cc=git@vger.kernel.org \
    --cc=jltobler@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).