From: Ralf Ramsauer
Cc: Lukas Bulwahn
Subject: Usage of public-inbox with maildirs
Date: Wed, 20 Mar 2019 23:28:30 +0100
Message-ID: <745d6a8e-7e7c-8c61-336b-105cf9570ab7@oth-regensburg.de>

Hi,

we want to archive a fair number of mailing lists (~160 lists) with public-inbox. Therefore, we subscribed to all of those lists with a single email address. Mails are periodically fetched via IMAP and stored in a local maildir. Mails are currently not pre-filtered or sorted; all of them are bunched together in a single maildir.
So every [publicinbox] config entry has the same 'watch' entry for the maildir, but each has its own watchheader so that it matches a different list.

Is this the intended way to use public-inbox, or should we rather place mails from different lists in different maildirs before processing them with public-inbox?

Secondly, I wrote a script that automatically creates the public-inbox config together with empty, bare git repositories for every list. A config entry looks like:

    [publicinbox "listid"]
        address = post@listid.org
        mainrepo = /path/to/repo
        watch = maildir:/path/to/maildir
        watchheader = List-Id:

Our maildir currently contains ~120k mails for the initial import, and this raised some new questions:

1. It appears that the initial import with public-inbox-watch is very slow. After stracing the perl script, it looks like public-inbox-watch lstats every single mail. After an hour in which no mail was inserted into any repo, I canceled the process and restarted it on a smaller initial subset. This works better, but is still slow (~4k mails in 10 minutes, and it feels like it is constantly getting slower).

   If public-inbox-watch is restarted for some reason (e.g., system reboot), will it stat every single mail again on startup? IOW, should old mails be removed from the maildir, and will they otherwise hurt performance? Is there a way to automatically delete processed mails?

2. public-inbox-watch seems to fill the repositories with the 'old' v1 layout, and I don't know how to switch to v2. Is there a config parameter for that? I found the v1-v2 convert script, but I'd like to initialise the repositories directly with the newer version, if possible.

3. On the initial import, public-inbox-watch seems to insert mails into the repositories in random order. In the end, coverage matters more than hierarchy, but is there a way to do the initial import sorted by date?

Thanks a lot!
  Ralf
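
For illustration only, here is a minimal sketch of how one config stanza per list could be generated, all pointing at the same maildir. It is not the actual script mentioned above. The function names, the `<name>.git` repository naming, the example paths, and the `List-Id:<...>` value format for watchheader are assumptions made for this sketch.

```python
from pathlib import PurePosixPath


def render_entry(name, address, repo_root, maildir, list_id):
    """Render one [publicinbox] stanza for a single list.

    The key names (address, mainrepo, watch, watchheader) are the ones
    shown in the mail; the "<name>.git" layout and the List-Id value
    format are assumptions of this sketch.
    """
    repo = PurePosixPath(repo_root) / f"{name}.git"
    lines = [
        f'[publicinbox "{name}"]',
        f"\taddress = {address}",
        f"\tmainrepo = {repo}",
        f"\twatch = maildir:{maildir}",          # same maildir for every list
        f"\twatchheader = List-Id:<{list_id}>",  # per-list header match
    ]
    return "\n".join(lines) + "\n"


def render_config(lists, repo_root, maildir):
    """Join the stanzas for all (name, address, list_id) tuples."""
    return "\n".join(
        render_entry(name, address, repo_root, maildir, list_id)
        for name, address, list_id in lists
    )


if __name__ == "__main__":
    # Hypothetical example data, not taken from the real setup.
    example = [
        ("foo", "foo@example.org", "foo.example.org"),
        ("bar", "bar@example.org", "bar.example.org"),
    ]
    print(render_config(example, "/srv/repos", "/srv/mail/all"))
```

The sketch only produces the config text; creating the bare git repositories would be a separate step.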