From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,AWL,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id 723FE1F487;
	Wed, 1 Apr 2020 00:05:28 +0000 (UTC)
Date: Wed, 1 Apr 2020 00:05:28 +0000
From: Eric Wong
To: meta@public-inbox.org
Subject: Re: [WIP 1/?] v2writable: index Message-IDs w/ spaces properly
Message-ID: <20200401000528.GA32055@dcvr>
References: <20200331083250.GA27164@dcvr> <20200331084936.GA26977@dcvr>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20200331084936.GA26977@dcvr>
List-Id:

Eric Wong wrote:
> Message-IDs can apparently contain spaces and other weird
> characters. Ensure we pass those properly to shard subprocesses
> when importing messages in parallel mode.
>
> Our NNTP parser does not deal with spaces in the Message-ID,
> yet, and I don't expect most NNTP clients to, either.

Nor does Net::NNTP on the client side...

But regardless of what happens with Message-IDs on the NNTP side,
this patch will remain correct and fixes an indexing problem when
Message-IDs contain spaces.

This bug was exacerbated by the changes to pass date and timestamps
from the git commit into the shard when mirroring, but has always
been with us when using multi-process indexing.
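
To illustrate the class of bug (a minimal sketch only, not the actual
shard protocol: the space-delimited record layout, the field order and
the example Message-ID below are all assumptions made up for this
illustration), a naive split on spaces truncates a Message-ID that
itself contains a space, while limiting the number of fields keeps it
intact as long as the Message-ID is the last field:

```perl
use strict;
use warnings;

# Hypothetical record layout: "<bytes> <artnum> <oid> <mid>\n".
# Illustration only; not public-inbox's real wire format.
my $mid = 'a b@example.com'; # made-up Message-ID containing a space
my $line = join(' ', 123, 7, 'deadbeef', $mid) . "\n";
chomp(my $rec = $line);

# Naive split: the Message-ID is cut at its first space, and the
# remainder spills into an unexpected fifth field.
my @naive = split(/ /, $rec);

# Limiting the split to 4 fields leaves the rest of the line alone,
# so the last field may contain spaces.
my (undef, undef, undef, $got) = split(/ /, $rec, 4);

print "naive: $naive[3]\n"; # prints "naive: a"
print "limit: $got\n";      # prints "limit: a b@example.com"
```

This only works because the free-form field comes last; a record with
more than one field that may contain spaces would need escaping or a
length prefix instead.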

> diff --git a/t/v2writable.t b/t/v2writable.t
> index cdcfe4d0..8167e4de 100644
> --- a/t/v2writable.t
> +++ b/t/v2writable.t
> @@ -175,8 +180,12 @@ EOF
>  		is($uniq{$mid}++, 0, "MID for $num is unique in XOVER");
>  		is_deeply($n->xhdr('Message-ID', $num),
>  			 { $num => $mid }, "XHDR lookup OK on num $num");
> +
> +		# FIXME NNTP.pm doesn't handle spaces in Message-ID
> +		next if $mid =~ / /;
> +

Pushed with the following squashed in:

diff --git a/t/v2writable.t b/t/v2writable.t
index 8167e4de..66d5663e 100644
--- a/t/v2writable.t
+++ b/t/v2writable.t
@@ -181,7 +181,8 @@ EOF
 		is_deeply($n->xhdr('Message-ID', $num),
 			 { $num => $mid }, "XHDR lookup OK on num $num");
 
-		# FIXME NNTP.pm doesn't handle spaces in Message-ID
+		# FIXME PublicInbox::NNTP (server) doesn't handle spaces in
+		# Message-ID, but neither does Net::NNTP (client)
 		next if $mid =~ / /;
 
 		is_deeply($n->xhdr('Message-ID', $mid),