From: Eric Wong
To: meta@public-inbox.org
Subject: [PATCH 00/36] another round of lei stuff
Date: Thu, 31 Dec 2020 13:51:18 +0000
Message-Id: <20201231135154.6070-1-e@80x24.org>

This is against lei branch @ commit
0c8106d44f317175e122744b43407bf067183175 in
https://public-inbox.org/public-inbox.git

Infrastructure for reading + writing local Maildirs and a bunch of
mbox formats is done (including gz/bz2/xz support) and its usage
should be familiar to mairix(1) users.

Infrastructure for deduplication + augmenting search results is in
place and tested.

Going to skip MH and MMDF for now.  IMAP/JMAP might happen sooner,
but deduplication needs low latency.

"extinbox" is renamed to "external".

Basic infrastructure like PublicInbox::IPC and SharedKV should've
been done and in use ages ago...  I look forward to using them,
at least.

Some DS safety fixes, since lei will use it in stranger ways than
current code does.

Bad enough we have messages with duplicate Message-IDs; lei will
need to deal with Unsent/Drafts messages w/o Message-IDs at all!
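To illustrate that last point, here is a rough Python sketch of a
dedupe key that falls back to a content hash when a message has no
Message-ID.  It is not code from this series (which is Perl) and does
not claim to match any of the strategies the patches implement; the
names dedupe_key and dedupe, the choice of headers, and the use of
SHA-256 are all assumptions made for the example.

```python
import hashlib
from email import message_from_bytes


def dedupe_key(raw: bytes) -> str:
    """Return a key identifying a message for deduplication (hypothetical helper)."""
    msg = message_from_bytes(raw)
    # str() guards against Header objects returned for 8-bit header data
    mid = str(msg.get("Message-ID") or "").strip()
    if mid:
        return "mid:" + mid
    # No Message-ID (e.g. an unsent draft): hash a few headers plus the body.
    # The header selection here is an assumption, not taken from the patches.
    h = hashlib.sha256()
    for name in ("From", "To", "Subject", "Date"):
        value = str(msg.get(name) or "")
        h.update(name.encode() + b":" + value.encode("utf-8", "replace") + b"\n")
    # Assumes LF line endings; the body is everything after the first blank line
    _, _, body = raw.partition(b"\n\n")
    h.update(body)
    return "sha256:" + h.hexdigest()


def dedupe(messages):
    """Keep the first message seen for each key, preserving input order."""
    seen = set()
    kept = []
    for raw in messages:
        key = dedupe_key(raw)
        if key in seen:
            continue
        seen.add(key)
        kept.append(raw)
    return kept
```

Keying on Message-ID alone drops a second message that reuses an ID
even when its content differs, which is exactly the "duplicate
Message-IDs" problem above; a real strategy may want to compare
content in that case as well.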
Eric Wong (36):
  import: respect init.defaultBranch
  lei_store: use per-machine refname as git HEAD
  revert "lei_store: use per-machine refname as git HEAD"
  lei_to_mail: initial implementation for writing mbox formats
  sharedkv: fork()-friendly key-value store
  sharedkv: split out index_values
  lei_to_mail: start atomic and compressed mbox writing
  mboxreader: new class for reading various mbox formats
  lei_to_mail: start --augment, dedupe, bz2 and xz
  lei: implement various deduplication strategies
  lei_to_mail: lazy-require LeiDedupe
  lei_to_mail: support for non-seekable outputs
  lei_to_mail: support Maildir, fix+test --augment
  ipc: generic IPC dispatch based on Storable
  ipc: support Sereal
  lei_store: add ->set_eml, ->add_eml can return smsg
  lei: rename "extinbox" => "external"
  mid: use defined-or with `push' for uniqueness check
  mid: hoist out mids_in sub
  lei_store: handle messages without Message-ID at all
  ipc: use shutdown(2), base atfork* callback
  lei_to_mail: unlink mboxes if not augmenting
  lei: add --mfolder as an option
  spawn: move run_die here from PublicInbox::Import
  init: remove embedded UnlinkMe package
  t/run.perl: avoid uninitialized var on incomplete test
  gcf2client: reap process on DESTROY
  lei_to_mail: open FIFOs O_WRONLY so we block
  searchidxshard: call DS->Reset at worker start
  t/ipc.t: test for references via `die'
  use PublicInbox::DS for dwaitpid
  syscall: SFD_NONBLOCK can be a constant, again
  lei: avoid Spawn package when starting daemon
  avoid calling waitpid from children in DESTROY
  ds: clobber $in_loop first at reset
  on_destroy: support PID owner guard

 MANIFEST                               |  12 +-
 lib/PublicInbox/DS.pm                  |  42 +-
 lib/PublicInbox/DSKQXS.pm              |   4 +-
 lib/PublicInbox/Daemon.pm              |   4 +-
 lib/PublicInbox/Gcf2Client.pm          |  18 +-
 lib/PublicInbox/Git.pm                 |   7 +-
 lib/PublicInbox/IPC.pm                 | 165 ++++++++
 lib/PublicInbox/Import.pm              |  36 +-
 lib/PublicInbox/LEI.pm                 |  44 +--
 lib/PublicInbox/LeiDedupe.pm           | 100 +++++
 .../{LeiExtinbox.pm => LeiExternal.pm} |  18 +-
 lib/PublicInbox/LeiStore.pm            |  32 +-
 lib/PublicInbox/LeiToMail.pm           | 361 ++++++++++++++++++
 lib/PublicInbox/LeiXSearch.pm          |   2 +-
 lib/PublicInbox/Lock.pm                |  17 +-
 lib/PublicInbox/MID.pm                 |  15 +-
 lib/PublicInbox/MboxReader.pm          | 127 ++++++
 lib/PublicInbox/OnDestroy.pm           |   5 +
 lib/PublicInbox/OverIdx.pm             |   2 +
 lib/PublicInbox/ProcessPipe.pm         |  34 +-
 lib/PublicInbox/Qspawn.pm              |  43 +--
 lib/PublicInbox/SearchIdxShard.pm      |   1 +
 lib/PublicInbox/SharedKV.pm            | 148 +++++++
 lib/PublicInbox/Sigfd.pm               |   4 +-
 lib/PublicInbox/Smsg.pm                |   6 +-
 lib/PublicInbox/Spawn.pm               |   9 +-
 lib/PublicInbox/Syscall.pm             |   4 +-
 lib/PublicInbox/TestCommon.pm          |  25 +-
 lib/PublicInbox/V2Writable.pm          |  10 +-
 script/lei                             |  17 +-
 script/public-inbox-init               |  32 +-
 script/public-inbox-watch              |   4 +-
 t/convert-compact.t                    |   4 +-
 t/index-git-times.t                    |   3 +-
 t/ipc.t                                |  80 ++++
 t/lei.t                                |  22 +-
 t/lei_dedupe.t                         |  59 +++
 t/lei_store.t                          |  47 ++-
 t/lei_to_mail.t                        | 246 ++++++++++++
 t/lei_xsearch.t                        |   2 +-
 t/mbox_reader.t                        |  75 ++++
 t/on_destroy.t                         |   9 +
 t/plack.t                              |   4 +-
 t/run.perl                             |   3 +-
 t/shared_kv.t                          |  58 +++
 t/sigfd.t                              |   6 +-
 46 files changed, 1755 insertions(+), 211 deletions(-)
 create mode 100644 lib/PublicInbox/IPC.pm
 create mode 100644 lib/PublicInbox/LeiDedupe.pm
 rename lib/PublicInbox/{LeiExtinbox.pm => LeiExternal.pm} (75%)
 create mode 100644 lib/PublicInbox/LeiToMail.pm
 create mode 100644 lib/PublicInbox/MboxReader.pm
 create mode 100644 lib/PublicInbox/SharedKV.pm
 create mode 100644 t/ipc.t
 create mode 100644 t/lei_dedupe.t
 create mode 100644 t/lei_to_mail.t
 create mode 100644 t/mbox_reader.t
 create mode 100644 t/shared_kv.t