From: Eric Wong <e@yhbt.net>
To: meta@public-inbox.org
Subject: Re: [WIP 1/?] v2writable: index Message-IDs w/ spaces properly
Date: Wed, 1 Apr 2020 00:05:28 +0000
Message-ID: <20200401000528.GA32055@dcvr>
In-Reply-To: <20200331084936.GA26977@dcvr>
Eric Wong <e@yhbt.net> wrote:
> Message-IDs can apparently contain spaces and other weird
> characters. Ensure we pass those properly to shard subprocesses
> when importing messages in parallel mode.
>
> Our NNTP parser does not deal with spaces in the Message-ID,
> yet, and I don't expect most NNTP clients to, either.
Nor does Net::NNTP on the client side...

But regardless of what happens with Message-IDs on the NNTP
side, this patch will remain correct and fixes an indexing
problem when Message-IDs contain spaces.

This bug was exacerbated by the changes to pass date and
timestamps from the git commit into the shard when mirroring,
but has always been with us when using multi-process indexing.
> diff --git a/t/v2writable.t b/t/v2writable.t
> index cdcfe4d0..8167e4de 100644
> --- a/t/v2writable.t
> +++ b/t/v2writable.t
> @@ -175,8 +180,12 @@ EOF
> is($uniq{$mid}++, 0, "MID for $num is unique in XOVER");
> is_deeply($n->xhdr('Message-ID', $num),
> { $num => $mid }, "XHDR lookup OK on num $num");
> +
> + # FIXME NNTP.pm doesn't handle spaces in Message-ID
> + next if $mid =~ / /;
> +
Pushed with the following squashed in:
diff --git a/t/v2writable.t b/t/v2writable.t
index 8167e4de..66d5663e 100644
--- a/t/v2writable.t
+++ b/t/v2writable.t
@@ -181,7 +181,8 @@ EOF
is_deeply($n->xhdr('Message-ID', $num),
{ $num => $mid }, "XHDR lookup OK on num $num");
- # FIXME NNTP.pm doesn't handle spaces in Message-ID
+ # FIXME PublicInbox::NNTP (server) doesn't handle spaces in
+ # Message-ID, but neither does Net::NNTP (client)
next if $mid =~ / /;
is_deeply($n->xhdr('Message-ID', $mid),
Thread overview: 3+ messages
2020-03-31 8:32 how to gracefully handle spaces in Message-IDs? Eric Wong
2020-03-31 8:49 ` [WIP 1/?] v2writable: index Message-IDs w/ spaces properly Eric Wong
2020-04-01 0:05 ` Eric Wong [this message]