* [PATCH 1/3] eml: relax warn_ignore regexps for current Email::Address::XS
2021-07-06 12:42 5% [PATCH 0/3] extindex: dedupe support, + gc fix Eric Wong
@ 2021-07-06 12:42 7% ` Eric Wong
0 siblings, 0 replies; 2+ results
From: Eric Wong @ 2021-07-06 12:42 UTC
To: meta
These seem needed with the data I'm currently working on, but I
haven't changed my version of Email::Address::XS since my last
Debian stable upgrade (to buster).
---
lib/PublicInbox/Eml.pm | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/lib/PublicInbox/Eml.pm b/lib/PublicInbox/Eml.pm
index 46c273ce..955d6a96 100644
--- a/lib/PublicInbox/Eml.pm
+++ b/lib/PublicInbox/Eml.pm
@@ -484,8 +484,8 @@ sub crlf { $_[0]->{crlf} // "\n" }
 sub warn_ignore {
 	my $s = "@_";
 	# Email::Address::XS warnings
-	$s =~ /^Argument contains empty address at /
-	|| $s =~ /^Element at index [0-9]+ contains /
+	$s =~ /^Argument contains empty /
+	|| $s =~ /^Element at index [0-9]+.*? contains /
 	# PublicInbox::MsgTime
 	|| $s =~ /^bogus TZ offset: .+?, ignoring and assuming \+0000/
 	|| $s =~ /^bad Date: .+? in /
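
For illustration only, here is a minimal sketch of how the relaxed patterns differ from the old ones. The two regexp pairs are copied from the hunk above; the sample warning strings and the matches_any helper are hypothetical and are not taken from Email::Address::XS output or from public-inbox itself.

```perl
use strict;
use warnings;

# Patterns copied from the diff: @old is the removed pair, @new the added pair.
my @old = (qr/^Argument contains empty address at /,
	qr/^Element at index [0-9]+ contains /);
my @new = (qr/^Argument contains empty /,
	qr/^Element at index [0-9]+.*? contains /);

# Hypothetical helper: true if $s matches any of the given regexps.
sub matches_any {
	my ($s, @res) = @_;
	for my $re (@res) {
		return 1 if $s =~ $re;
	}
	0;
}

# Made-up warning strings, chosen only to exercise the patterns;
# they are NOT actual Email::Address::XS messages.
my @samples = (
	'Argument contains empty address at foo.pl line 1.',
	'Argument contains empty local part at foo.pl line 1.',
	'Element at index 0 contains bad data at foo.pl line 1.',
	'Element at index 0 of address list contains bad data at foo.pl line 1.',
	'Unrelated warning at foo.pl line 1.',
);

for my $s (@samples) {
	printf "%-3s %-3s %s\n",
		matches_any($s, @old) ? 'old' : '-',
		matches_any($s, @new) ? 'new' : '-',
		$s;
}
```

The relaxed patterns accept everything the old ones did, plus messages with extra words after "empty" or between the index and "contains"; unrelated warnings are still passed through.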
* [PATCH 0/3] extindex: dedupe support, + gc fix
@ 2021-07-06 12:42 5% Eric Wong
2021-07-06 12:42 7% ` [PATCH 1/3] eml: relax warn_ignore regexps for current Email::Address::XS Eric Wong
0 siblings, 1 reply; 2+ results
From: Eric Wong @ 2021-07-06 12:42 UTC
To: meta
I'm still not sure how the duplicates got into my extindices,
but the problem doesn't seem reproducible at the moment, so
maybe the original bug was fixed.
Since there are already dedupe failures from past indexing, the
--dedupe switch here should help us get rid of them. It's only
lightly tested, but it seems to be working.
There's also a minor fix for --gc.
Eric Wong (3):
eml: relax warn_ignore regexps for current Email::Address::XS
extindex: implement --dedupe to fix old extindices
extindex: --gc: avoid SQLite lock conflict on shard cleanup
lib/PublicInbox/Eml.pm | 4 +-
lib/PublicInbox/ExtSearchIdx.pm | 96 +++++++++++++++++++++++++++++++
lib/PublicInbox/OverIdx.pm | 20 +++++++
lib/PublicInbox/SearchIdxShard.pm | 5 +-
script/public-inbox-extindex | 13 ++++-
t/extsearch.t | 11 ++++
6 files changed, 142 insertions(+), 7 deletions(-)
Code repositories for project(s) associated with this public inbox
https://80x24.org/public-inbox.git