* [PATCH 1/3] searchidx: use regexp as first arg for `split' op
2022-06-20 19:27 5% [PATCH 0/3] search indexing improvements Eric Wong
@ 2022-06-20 19:27 7% ` Eric Wong
0 siblings, 0 replies; 2+ results
From: Eric Wong @ 2022-06-20 19:27 UTC
To: meta
Current implementations of Perl5 don't have optimizations for
single-character field separators (unlike another non-Perl5 VM
I'm familiar with).
---
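A minimal sketch of the two call forms touched by this patch, for readers comparing them. The sample path is hypothetical and only illustrates that both forms return the same fields; it makes no claim about their relative speed. With single-quote delimiters, m'/' is a pattern in which no variable interpolation is performed.

```perl
use strict;
use warnings;

# Hypothetical sample path, not taken from the patch.
my $path = 'lib/PublicInbox/SearchIdx.pm';

my @by_string = split('/', $path);   # string as first argument (old form)
my @by_regexp = split(m'/', $path);  # regexp as first argument (new form)

# Both calls yield the same list of path components.
print join(',', @by_string), "\n";
print join(',', @by_regexp), "\n";
```

Both print statements should emit the same comma-separated list of components.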
lib/PublicInbox/SearchIdx.pm | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/lib/PublicInbox/SearchIdx.pm b/lib/PublicInbox/SearchIdx.pm
index 85fae4ad..50e26050 100644
--- a/lib/PublicInbox/SearchIdx.pm
+++ b/lib/PublicInbox/SearchIdx.pm
@@ -236,8 +236,8 @@ sub index_old_diff_fn {
# no renames or space support for traditional diffs,
# find the number of leading common paths to strip:
- my @fa = split('/', $fa);
- my @fb = split('/', $fb);
+ my @fa = split(m'/', $fa);
+ my @fb = split(m'/', $fb);
while (scalar(@fa) && scalar(@fb)) {
$fa = join('/', @fa);
$fb = join('/', @fb);
@@ -278,12 +278,12 @@ sub index_diff ($$$) {
$xnq);
} elsif (m!^--- ("?[^/]+/.+)!) {
my $fn = $1;
- $fn = (split('/', git_unquote($fn), 2))[1];
+ $fn = (split(m'/', git_unquote($fn), 2))[1];
$seen{$fn}++ or index_diff_inc($self, $fn, 'XDFN', $xnq);
$in_diff = 1;
} elsif (m!^\+\+\+ ("?[^/]+/.+)!) {
my $fn = $1;
- $fn = (split('/', git_unquote($fn), 2))[1];
+ $fn = (split(m'/', git_unquote($fn), 2))[1];
$seen{$fn}++ or index_diff_inc($self, $fn, 'XDFN', $xnq);
$in_diff = 1;
} elsif (/^--- (\S+)/) {
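The second hunk relies on the LIMIT argument of `split' to drop only the first path component of a diff filename. A rough sketch of that idiom follows; strip_first_component is a hypothetical helper name, the sample filename is made up, and the git_unquote step from the real code is left out:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PublicInbox::SearchIdx):
# drop the leading "a/" or "b/" component from a diff filename.
sub strip_first_component {
    my ($fn) = @_;
    # A LIMIT of 2 yields at most two fields: the first component
    # and everything after the first slash, so later slashes survive.
    return (split(m'/', $fn, 2))[1];
}

print strip_first_component('a/lib/PublicInbox/SearchIdx.pm'), "\n";
# prints "lib/PublicInbox/SearchIdx.pm"
```

Without the LIMIT, index [1] would hold only the second component ("lib") instead of the rest of the path.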
* [PATCH 0/3] search indexing improvements
@ 2022-06-20 19:27 5% Eric Wong
2022-06-20 19:27 7% ` [PATCH 1/3] searchidx: use regexp as first arg for `split' op Eric Wong
0 siblings, 1 reply; 2+ results
From: Eric Wong @ 2022-06-20 19:27 UTC
To: meta
Still stuck on POP3 account manglement, but here are some easy-ish
indexing changes for public-inbox-* tools. These require a full
reindex with either public-inbox-index or public-inbox-extindex,
but old and new indexes should be fully compatible, and the
reindex should be doable hot:
public-inbox-index --no-fsync --reindex /path/to/v1-or-v2
public-inbox-extindex --no-fsync --reindex --all /path/to/eidx
Will probably take 2 days or so on my own machine.
Note: lei doesn't support reindexing yet, but will soon...
Eric Wong (3):
searchidx: use regexp as first arg for `split' op
search: support "patchid:" prefix (git patch-id --stable)
search: do not index base-85 binary patches
MANIFEST | 1 +
TODO | 5 ---
lib/PublicInbox/Search.pm | 5 ++-
lib/PublicInbox/SearchIdx.pm | 75 ++++++++++++++++++++++++++----------
t/data/binary.patch | 20 ++++++++++
t/extsearch.t | 7 +++-
t/search.t | 15 ++++++++
t/v2mda.t | 10 ++++-
8 files changed, 108 insertions(+), 30 deletions(-)
create mode 100644 t/data/binary.patch
Code repositories for project(s) associated with this public inbox
https://80x24.org/public-inbox.git