Date Commit message
2022-04-23 public-inbox 1.8.0 v1.8.0
2022-04-22 doc: update 1.8 WIP release notes
2022-04-22 lei: commit store on interrupted partial imports
This change prevents lingering shard and git-fast-import processes from remaining after interrupted "lei import" (and similar). It also reduces the likelihood of data loss in case of subsequent abnormal termination of the daemon. I think this is the least surprising way to handle users prematurely aborting imports or other similar operations which write to lei/store and will result in reduced bandwidth waste for users with intermittent connections. This is because the lei/store processes may be shared by parallel "lei import" callers, and commits done by any "lei import" caller will inevitably trigger writes for all of them.
2022-04-18 syscall: golf + more idiomatic buffer initialization
While `vec' is useful for user-supplied buffers to avoid excess memory traffic, it provides no benefit when we need to allocate our own buffers as we do in nodatacow_fh, since Perl can't elide memset(ptr, 0, len). So just use the idiomatic `"\0" x $LEN' here.
2022-04-18 lei: wire up pure Perl sendmsg/recvmsg for Linux users
This enables lei-daemon to work without either Inline::C or Socket::MsgHdr installed. Prior to this, only the `lei' client was using the pure Perl implementation. Either C implementation is still marginally faster, however.
2022-04-18 syscall: more idiomatic cmsghdr space allocation
Since we know the space required under Linux, we can use the same initialization as the Inline::C version instead of hard-coding 256 as we do for Socket::MsgHdr.
2022-04-18 lei: clobber recvmsg buffer on errors
This will be necessary when we drop the Inline::C requirement in favor of the pure Perl Linux syscall recvmsg implementation. This likely would've caused errors for Socket::MsgHdr users without Inline::C, but I haven't tested it since it's a rare configuration.
2022-04-18 lei_mail_sync: explicit bind for old SQL_VARCHAR compat
This avoids repeated work for incremental "lei import" runs when users upgrade from 1.7 to current public-inbox.git (and eventually 1.8). We need the explicit bind_param for fallback calls because previous bind_param calls are "sticky" for a given statement handle. The DBI(3pm) manpage states: The data type is 'sticky' in that bind values passed to execute() are bound with the data type specified by earlier bind_param() calls, if any. Portable applications should not rely on being able to change the data type after the first "bind_param" call.
2022-04-05 lei: always open mail_sync.sqlite3 R/W
This will make transparently upgrading from 1.7.0 -> 1.8.x easier. Only a single user has access to mail_sync.sqlite3, and R/W at the kernel-level is required for WAL, anyways.
2022-04-02 view: remove unused $end variable
Noticed while looking at something else completely unrelated...
2022-04-02 examples/unsubscribe.milter: RFC 8058 (List-Unsubscribe=One-Click)
This allows unambiguous signaling to some MUAs and webmail clients that the List-Unsubscribe header contains an instantaneous unsubscribe option.
2022-04-02 examples/unsubscribe.milter: use IO::Socket, again
Sendmail::PMilter requires an IO::Socket object, not a GLOB. Fixes: e901a56b3b30b22f (treewide: favor open(..., '+<&=', $fd), 2021-05-21)
2022-04-02 lei_mail_sync: store OIDs and Maildir filenames as blobs
DBD::SQLite doesn't seem to use SQL_BLOB automatically, which can lead to ambiguity in some cases (especially interoperating with other tools). Downgrading to lei 1.7.0 will cause problems, but upgrading appears transparent after weeks of tests.
2022-04-02 lei_mail_sync: ensure URLs and folder names are stored as binary
Apparently leaving {sqlite_unicode} unset isn't enough, and there are subtle differences where BLOBs are stored differently than TEXT when dealing with binary data. We also want to avoid odd cases where SQLite will attempt to treat a number-like value as an integer. This should avoid problems in case non-UTF-8 URLs and pathnames are used. They'll automatically be upgraded if not, but downgrades to older lei would cause duplicates to appear.
2022-04-01 TODO: add item for auto-detecting TLS files in daemons
I forgot to restart my -imapd and -nntpd instances on public-inbox.org after the cert expired :x
2022-04-01 doc: add WIP release notes for 1.8
1.8 will be a minor release, soon (I initially expected to release it in December, but was side-tracked). Major features will be for 1.9.
2022-04-01 viewdiff: use defined checks in more places
It's less cognitive overhead for future readers since I just looked at it again and thought it was possible for "0" to be returned (it isn't).
2022-03-24 syscall: add sendmsg+recvmsg for remaining arches
aarch64, ppc64le, sparc64, loongarch64, and mips (32-bit userspace) are all tested via machines from the GCC Farm Project <https://cfarm.tetaneutral.net/> Remaining syscall numbers are from musl <https://musl.libc.org/>
2022-03-23 syscall: implement sendmsg+recvmsg in pure Perl
Socket::MsgHdr is only packaged for Debian and derivatives at the moment, and Inline::C pulling in gcc/clang is a huge amount of disk space and bandwidth for some users. This enables disk space and/or bandwidth-limited users to use lei. Only Linux guarantees a stable ABI and syscall numbers, but that's the majority of our userbase. FreeBSD users will still have to use Inline::C (or get Socket::MsgHdr packaged). x86, x32, and x86-64 are all currently supported, more to be added.
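For readers unfamiliar with why sendmsg+recvmsg matter here: they are the calls that let one process hand open file descriptors to another over a Unix socket via SCM_RIGHTS ancillary data. The following is a minimal sketch of that mechanism in Python rather than the project's Perl; the function names `send_fds` and `recv_fds` are made up for illustration and are not public-inbox APIs.

```python
import array
import os
import socket


def send_fds(sock, payload, fds):
    """Send payload plus open file descriptors as SCM_RIGHTS ancillary data."""
    packed = array.array("i", fds)
    return sock.sendmsg([payload], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, packed)])


def recv_fds(sock, bufsize, maxfds):
    """Receive a payload and up to maxfds descriptors passed via SCM_RIGHTS."""
    fds = array.array("i")
    # CMSG_SPACE computes the ancillary buffer size instead of a hard-coded guess
    ancbufsize = socket.CMSG_SPACE(maxfds * fds.itemsize)
    payload, ancdata, _flags, _addr = sock.recvmsg(bufsize, ancbufsize)
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # drop any trailing partial int before unpacking
            fds.frombytes(data[: len(data) - (len(data) % fds.itemsize)])
    return payload, list(fds)
```

The receiver gets new descriptor numbers that refer to the same open files as the sender's, so writing to a received pipe end is visible on the sender's read end.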
2022-03-23 recv_cmd: do not undef recvmsg buffer arg on errors
It's a waste of ops and cycles, and inconsistent with perl sysread() behavior which doesn't touch the supplied buffer on errors.
2022-03-23 syscall: drop unused EEXIST import
We've never used it, actually.
2022-03-22 www: loosen deep-linking prevention
Apparently some browsers can set a Referer: header which fails to match. I'm not certain why, but making "$schema://$HOST_PORT" matches case-insensitive seems more correct regardless. In case that doesn't work, we'll also allow bypassing deep-link prevention via a POST form button. Reported-by: Vlastimil Babka <vbabka@suse.cz> Link: https://public-inbox.org/meta/93ebfbd1-9924-481c-4edc-9b232d1e995c@suse.cz/
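A case-insensitive Referer check along these lines might look like the sketch below, written in Python rather than the project's Perl. The function name `referer_allowed` and its parameters are hypothetical, and the boundary check after the prefix is an extra assumption of this sketch, not something the commit message describes.

```python
def referer_allowed(referer, scheme, host_port):
    """Return True if referer starts with scheme://host_port, ignoring case."""
    if not referer:
        return False
    prefix = f"{scheme}://{host_port}".lower()
    ref = referer.lower()
    if not ref.startswith(prefix):
        return False
    # assumption of this sketch: require a boundary after the prefix so that
    # "example.org.evil.test" or "example.org:8080" do not match "example.org"
    rest = ref[len(prefix):]
    return rest == "" or rest[0] in "/?#"
```

Lowercasing both sides is safe for the scheme and host, which are case-insensitive in URLs; the path is not compared at all.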
2022-03-14 t/lei-sigpipe.t: ensure SIGPIPE is not ignored instead of not blocked
Ignoring a signal is different than blocking a signal, and the "IgnoreSIGPIPE" option of systemd ignores. [ew: note systemd behavior] Acked-by: Eric Wong <e@80x24.org>
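The difference between the two states can be shown with a short sketch, here in Python rather than the test suite's Perl. A blocked signal lives in the thread's signal mask, while an ignored signal is a disposition (SIG_IGN); each has to be undone separately. The helper names are invented for illustration.

```python
import signal


def sigpipe_state():
    """Return (blocked, ignored) for SIGPIPE in the calling thread/process."""
    # pthread_sigmask(SIG_BLOCK, []) changes nothing and returns the current mask
    blocked = signal.SIGPIPE in signal.pthread_sigmask(signal.SIG_BLOCK, [])
    ignored = signal.getsignal(signal.SIGPIPE) == signal.SIG_IGN
    return blocked, ignored


def restore_default_sigpipe():
    """Undo both conditions: unblock SIGPIPE and reset its default action."""
    signal.pthread_sigmask(signal.SIG_UNBLOCK, [signal.SIGPIPE])
    signal.signal(signal.SIGPIPE, signal.SIG_DFL)
```

Unblocking alone would not help a process started with SIGPIPE ignored, which is the case the commit above addresses.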
2022-03-08 index|extindex: support --dangerous flag
This enables Xapian::DB_DANGEROUS to support in-place updates. This can speed up the initial index and reduce I/O at the cost of preventing concurrent readers and being unsafe in the face of any abnormal terminations. This is more dangerous than --no-fsync. --no-fsync is only unsafe in the event of a power loss or kernel crash; --dangerous is unsafe even on SIGKILL.
2022-03-01 t/lei-sigpipe: ensure SIGPIPE is unblocked for this test
Tests run under systemd (and similar) have SIGPIPE blocked by default. This was causing this SIGPIPE test to get stuck when run by automated builders used by Nix. Thanks to Julien Moutinho and Dominique Martinet for tracking down this failure. Reported-by: Julien Moutinho <julm+public-inbox@sourcephile.fr> Reported-by: Dominique Martinet <asmadeus@codewreck.org> Link: https://public-inbox.org/meta/20220227080422.gyqowrxomzu6gyin@sourcephile.fr/
2022-02-18 t/lei-sigpipe: attempt to improve diagnostics for stuck test
This may help diagnose a difficult-to-reproduce test failure on NixOS. Link: https://public-inbox/meta/20211209013743.okzgim7bbrpahks7@sourcephile.fr/
2022-02-17 git: do not dereference undef as ARRAY ref
When aborting git processes, we must account for the lack of inflight requests.
2022-02-14 sharedkv: avoid ambiguity for numeric-like string keys
While we only store URLs and binary SHA-1/SHA-256 values in skv at the moment, we may store potentially ambiguous keys/values in the future. It's possible to store "02" and have it treated as `2' unless explicitly binding parameters as SQL_BLOB. This behavior was independent of the sqlite_unicode parameter as evidenced by the new tests. I only noticed this bug while hacking on another project using DBD::SQLite, and not while hacking on public-inbox itself.
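The "02" versus `2' ambiguity can be reproduced with any SQLite binding, though the trigger differs by driver. The sketch below uses Python's sqlite3 module and a NUMERIC-affinity column to provoke the conversion, which is an assumption of this example and not necessarily how DBD::SQLite reached the same result. Binding the value as a BLOB sidesteps the conversion; `store_and_fetch` is a hypothetical helper.

```python
import sqlite3


def store_and_fetch(value):
    """Insert value into a NUMERIC column; return (value, typeof) as read back."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE kv (v NUMERIC)")
    db.execute("INSERT INTO kv (v) VALUES (?)", (value,))
    row = db.execute("SELECT v, typeof(v) FROM kv").fetchone()
    db.close()
    return row


text_row = store_and_fetch("02")   # str is bound as TEXT and gets converted
blob_row = store_and_fetch(b"02")  # bytes is bound as BLOB and stays untouched
```

Here `text_row` comes back as an integer with the leading zero lost, while `blob_row` round-trips byte for byte.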
2022-02-14 sharedkv: remove unused subs
Some features didn't get used, and they're just getting in the way of upcoming bugfixes.
2022-02-14 t/lei-*watch: disable flaky tests by default for now
Properly fixing these tests is too difficult for me at the moment, so just disable these tests for now. A proper fix and fleshing out support for inotify will hopefully happen at some point.
2022-02-11 view: remove all CR before LF
While we've rendered CR-LF as LF-only in HTML for many years, some messages end up as CR-CR-LF. So strip all CR bytes preceding LF bytes, while preserving odd CR in the middle of lines. Reported-by: Thomas Weißschuh <thomas@t-8ch.de> Link: https://public-inbox.org/meta/8d13668f-cac7-4984-bb4e-ad90502dc46d@t-8ch.de/
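The rule described above (drop every CR in a run that directly precedes LF, keep a CR elsewhere) fits in one regular expression. This is a sketch in Python, not the project's Perl code, and `strip_cr_before_lf` is a name chosen for illustration.

```python
import re

# one or more CR bytes immediately followed by LF
_CRS_BEFORE_LF = re.compile(rb"\r+\n")


def strip_cr_before_lf(raw):
    """Collapse CR-LF, CR-CR-LF, etc. to LF; leave mid-line CR bytes alone."""
    return _CRS_BEFORE_LF.sub(b"\n", raw)
```

For example, `strip_cr_before_lf(b"a\r\r\nb\rc\r\n")` yields `b"a\nb\rc\n"`: the CR between "b" and "c" survives because no LF follows it.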
2022-02-03 test_lei: use consistent locale for error messages
git-config(1) error messages are locale-dependent, so follow the lead taken by git's own test suite and set LC_ALL=C and LANG=C to ensure error messages we check against are not localized. Reported-by: Julien Moutinho <julm+public-inbox@sourcephile.fr>
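Pinning the locale for a child process amounts to overriding two environment variables before spawning it. A minimal sketch in Python (the project's test suite is Perl; `run_in_c_locale` is a hypothetical helper):

```python
import os
import subprocess


def run_in_c_locale(cmd):
    """Run cmd with LC_ALL=C and LANG=C so its messages are not localized."""
    env = dict(os.environ, LC_ALL="C", LANG="C")
    return subprocess.run(cmd, env=env, capture_output=True, text=True)
```

A test can then compare `result.stderr` against a fixed English string regardless of the locale of the machine running it.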
2022-02-01 syscall: FS_IOC_*FLAGS: define on per-architecture basis
It turns out these Linux ioctls are unfortunately architecture-dependent, and not endian-dependent. Fixup some warning messages while we're at it, too. Fixes: 14fa0abdcc7b6513 ("rewrite Linux nodatacow use in pure Perl w/o system") Link: https://public-inbox.org/meta/YfdYqLhDVQRQ9NGT@codewreck.org/ Noticed-by: Dominique Martinet <asmadeus@codewreck.org>
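To see why the numbers differ per architecture: an ioctl request number packs a direction, an argument size, a type character and a sequence number into 32 bits, and some architectures use a different layout for the direction bits. The sketch below, in Python, reflects my reading of the kernel's ioctl headers (asm-generic versus the mips/powerpc/sparc/alpha variant) and is not taken from the commit; `ioc` and `fs_ioc_flags` are illustrative names.

```python
def ioc(direction, type_char, nr, size, dirshift):
    """Pack an ioctl request number from its fields."""
    return (direction << dirshift) | (size << 16) | (ord(type_char) << 8) | nr


def fs_ioc_flags(long_size, alt_encoding=False):
    """Return (FS_IOC_GETFLAGS, FS_IOC_SETFLAGS) for a given ABI.

    long_size is sizeof(long) in bytes; alt_encoding selects the layout
    with 13 size bits and 3 direction bits.
    """
    if alt_encoding:
        read, write, dirshift = 2, 4, 29
    else:
        # asm-generic layout: 14 size bits, 2 direction bits
        read, write, dirshift = 2, 1, 30
    # FS_IOC_GETFLAGS is _IOR('f', 1, long); FS_IOC_SETFLAGS is _IOW('f', 2, long)
    return (ioc(read, "f", 1, long_size, dirshift),
            ioc(write, "f", 2, long_size, dirshift))
```

With an 8-byte long the generic layout gives 0x80086601 and 0x40086602, while the alternate layout gives 0x40086601 and 0x80086602, so the values vary with both word size and architecture family rather than with endianness.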
2022-02-01 syscall: fallback to rename on renameat2 EINVAL
ZFS appears to incorrectly return EINVAL on renameat2 when the operation is not supported: renameat2(AT_FDCWD, "...", AT_FDCWD, "...", RENAME_NOREPLACE) = -1 EINVAL Fall back to the racy rename in this case as well:
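The fallback logic can be sketched as follows, in Python rather than Perl. The `renameat2` parameter is a hypothetical callable standing in for a RENAME_NOREPLACE wrapper; treating ENOSYS the same way and doing an existence check before the plain rename are assumptions of this sketch, not details confirmed by the commit message.

```python
import errno
import os


def rename_noreplace(src, dst, renameat2):
    """Try an atomic no-replace rename; fall back to a racy check + rename.

    renameat2 is a hypothetical callable(src, dst) wrapping the Linux
    syscall with RENAME_NOREPLACE; it must raise OSError on failure.
    """
    try:
        renameat2(src, dst)
        return
    except OSError as e:
        # assumed: ENOSYS means no syscall; EINVAL means the fs rejects the flag
        if e.errno not in (errno.ENOSYS, errno.EINVAL):
            raise
    # racy: another process may create dst between this check and the rename
    if os.path.lexists(dst):
        raise FileExistsError(errno.EEXIST, os.strerror(errno.EEXIST), dst)
    os.rename(src, dst)
```

Any other error, including a genuine EEXIST from the atomic path, is re-raised unchanged.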
2022-01-31 rewrite Linux nodatacow use in pure Perl w/o system
btrfs is Linux-only at the moment (and likely to remain that way for practical purposes). So rely on Linux ABI stability and use the `syscall' and `ioctl' perlops rather than relying on Inline::C. Inline::C (and gcc||clang) are monstrous dependencies which we can't expect users to have. This makes supporting new architectures more difficult, but new architectures come along rarely and this reduces the burden for the majority of Linux users on popular architectures (while still avoiding the distribution of pre-built binaries). Link: https://public-inbox.org/meta/YbCPWGaJEkV6eWfo@codewreck.org/
2022-01-31 http: don't send chunk finalizer on HEAD responses
AFAIK this doesn't affect Varnish or nginx users, but those should eventually become optional dependencies.
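For context, the "chunk finalizer" is the zero-length chunk (`0\r\n\r\n`) that ends a chunked HTTP/1.1 body. A HEAD response carries no body at all, so nothing, including the finalizer, should follow the headers. A minimal sketch in Python (not the project's Perl; `chunked_body` is an illustrative name):

```python
def chunked_body(method, chunks):
    """Encode chunks with HTTP/1.1 chunked transfer coding; HEAD gets no body."""
    if method == "HEAD":
        return b""
    out = []
    for chunk in chunks:
        if chunk:  # a zero-length chunk would be read as the finalizer
            out.append(b"%x\r\n%s\r\n" % (len(chunk), chunk))
    out.append(b"0\r\n\r\n")  # the chunk finalizer
    return b"".join(out)
```

A GET for a five-byte body `hello` produces `5\r\nhello\r\n0\r\n\r\n`, whereas a HEAD for the same resource produces an empty byte string.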
2022-01-23 t/eml.t: ignore newer Email::MIME behavior
Once again, our message parser class matches the more tolerant behavior of older Email::MIME releases in order to handle ancient messages. This fixes <https://bugs.debian.org/1002219>, but dropping Email::MIME entirely from the test suite may be prudent in the future.
2021-12-08 Makefile.PL: fix useless use of push
2021-11-30 eliminate some unused subs
->newsgroup_matches was never used, and ->shard_over_check was dropped in 89193578d21f (extindex: --gc checkpoints, 2021-10-06).
2021-11-22 lei: always use 3-arg open perlop
Future-proofing in case future versions of Perl warn on this, since 2-arg forms of open may be subject to injection vulnerabilities with non-literal args.
2021-11-22 spawn: avoid C++ keyword `try'
This is future-proofing in case we build against Xapian directly in the future, which would require a C++ compiler.
2021-11-22 searchidx: avoid modification of read-only `$_'
This fixes the "Modification of a read-only value attempted at ..." error in an initial run of t/reindex-time-range.t. It was reproducible by running `rm -rf t/data-gen/reindex-time-range.v*' before `make && prove -bvw t/reindex-time-range.t'. Thanks to Jörg Rödel for providing the backtrace which helped find this. Debugged-by: Jörg Rödel <joro@8bytes.org> Link: https://public-inbox.org/meta/YZuZEY+WSnm4wlrS@8bytes.org/
2021-11-22 t/lei-mirror: skip lei comparisons if lei missing
We can't compare created_at times with lei if lei tests are skipped due to Inline::C or Socket::MsgHdr unavailability. Reported-by: Jörg Rödel <joro@8bytes.org> Link: https://public-inbox.org/meta/YZebmAxlFJy4lqAw@8bytes.org/
2021-11-15 lei forget-search: add help for --prune
This enables tab-completion, since I'm using --prune quite a bit and my fingers are about to fall off :<
2021-11-10 t/lei-watch: test with higher sleep
0.1s may not be enough for a task switch and inotify wakeup, so try doubling it and see if it fixes test reliability, for now. A future change may be to implement a watcher/tracer for inotify -> lei/store events. Link: https://public-inbox.org/meta/20211104134327.zrf5jijfz7dsvb7l@meerkat.local/
2021-11-10 lei q: make HTTP(S) query strings even less ugly
Following commit 57fed2e4b78ed394 (lei: normalize whitespace in remote queries, 2021-09-11), leaving the trailing `\n' from stdin queries to be normalized to ` ' (SP) causes it to appear as `+' in URLs, which Xapian ignores.
2021-11-10 lei q: disallow "\n" in argv[] elements
I don't expect this to be hit in real-world use via normal interactive shells. However, somebody could accidentally add "\n" in languages (e.g. Perl, C) where it's easy to pass "\n" in argv[].
2021-11-10 lei up: infer rawstr from old searches via trailing "\n"
For --stdin searches created prior to commit 666dde69a3f6 (lei q|up: fix saved searches for single-phrase search, 2021-11-08) we still want to be able to run "lei up" on them without regressions. So assume nobody manages to enter "\n" as an argv[] element and consider the presence of "\n" as a previous --stdin use. This fixes errors from "lei up" such as: lei_xsearch 2 wq_worker: Exception: Key too long: length was 840 bytes, maximum length of a key is 255 bytes at ../PublicInbox/IPC.pm line 250. Fixes: 666dde69a3f6 ("lei q|up: fix saved searches for single-phrase search")
2021-11-10 ipc: note failing sub name
Hopefully problems can get diagnosed more quickly with the sub name in the error message.
2021-11-10 solver: support sha256 coderepos
Tested manually on a newish project I'm working on.