path: root/lib/PublicInbox
Date Commit message
2021-10-25 www: $MSGID/raw: set charset in HTTP response
By using the charset specified in the message, web browsers are more likely to display the raw text properly for human readers. Inspired by a patch by Thomas Weißschuh: https://public-inbox.org/meta/20211024214337.161779-3-thomas@t-8ch.de/ Cc: Thomas Weißschuh <thomas@t-8ch.de>
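A minimal sketch in Python (the project itself is Perl) of the idea described above: read the charset declared in the message's own Content-Type header and reuse it in the HTTP response header. The function name `raw_content_type`, the `text/plain` base type, and the fallback of sending no charset at all are assumptions for illustration, not public-inbox code.

```python
import re
from email import message_from_bytes

def raw_content_type(raw: bytes) -> str:
    """Build a Content-Type value for serving a raw message (hypothetical helper)."""
    msg = message_from_bytes(raw)
    charset = msg.get_content_charset()  # lowercased charset, or None if absent
    # Only pass through simple token-like charsets to avoid header injection.
    if charset is None or not re.fullmatch(r"[a-z0-9._:+-]+", charset):
        return "text/plain"
    return f"text/plain; charset={charset}"
```

With a charset in the response header, a browser no longer has to guess the encoding of the raw text.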
2021-10-25 gzip_filter: delay async wcb call
This will let us modify the response header later to set a proper charset for Content-Type when displaying raw messages. Cc: Thomas Weißschuh <thomas@t-8ch.de>
2021-10-24 viewvcs: die on tmpfile() errors
Just let Plack::Util::run_app catch the error and generate a 500 response for it.
2021-10-24 git: avoid Perl5 internal scratchpad target cache
Creating a scalar ref directly off substr() seemed to be causing the underlying non-ref scalar to end up in Perl's scratchpad. Assigning the substr result to a local variable seems sufficient to prevent multi-megabyte SVs from lingering indefinitely when a read-only daemon serves rare, oversized blobs.
2021-10-24 thread: avoid Perl5 internal scratchpad target cache
The use of array-returning built-ins such as `grep' inside arrayref declarations appears to result in permanently allocated scratchpad space for caching according to my malloc inspector. Thread skeletons get discarded every response, but multiple skeletons can exist in memory at once, so do what we can to prevent long-lived allocations from being made, here. In other words, replacing constructs such as:
    my $foo = [ grep(...) ];
with:
    my @foo = grep(...);
seems to ensure the mortality of the underlying array.
2021-10-24 listener: emit warnings on EPERM
In retrospect, warnings for EPERM on accept4(2) failure may help detect misconfigured firewalls, so start emitting warnings for EPERM. Fwiw, I've never known EPERM warnings to be excessively noisy in other TCP services I've run over the years.
2021-10-24 http: use a larger buffer for ->getline responses
64K matches the Linux pipe default, and matches what we use in httpd/async and qspawn. This should reduce syscalls used for serving git packs via dumb HTTP and any ->getline code paths used by other PSGI code. This appears to speed up HTML rendering by w3m when serving giant HTML responses from the Devel::Mwrap::PSGI memory debugger.
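To illustrate why a larger buffer reduces syscalls, here is a small Python sketch (not the project's Perl code) that reads a body in 64K blocks. The names `BUF_SIZE` and `iter_chunks` are hypothetical.

```python
BUF_SIZE = 65536  # 64K, the default Linux pipe capacity

def iter_chunks(fh, size=BUF_SIZE):
    """Yield successive blocks of up to `size` bytes until EOF."""
    while True:
        chunk = fh.read(size)
        if not chunk:
            return
        yield chunk
```

Each iteration corresponds to roughly one read, so a 150000-byte body is delivered in three blocks, where a smaller buffer would need proportionally more reads and writes.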
2021-10-24 shared_kv: remove cache_size attribute support
We're not using it, anywhere.
2021-10-24 lei export-kw: skip read-only IMAP folders
Since we want to store IMAP flags asynchronously and not wait for results, we can't check for IMAP errors this way and end up wasting bandwidth on public-inbox-imapd. Now, we just check PERMANENTFLAGS up front to ensure a folder can handle IMAP flag storage before proceeding.
2021-10-24 lei: always pass $lei to LeiAuth->op_merge
This will make future developments easier.
2021-10-23 cmd_ipc4: retry sendmsg on ENOBUFS/ENOMEM/ETOOMANYREFS
I'm seeing ENOBUFS on a RAM-starved system, and slowing the sender down enough for the receiver to drain the buffers seems to work. ENOMEM and ETOOMANYREFS could be in the same boat as ENOBUFS. Watching for POLLOUT events via select/poll/epoll_wait doesn't seem to work, since the kernel can already sleep (or return EAGAIN) for cases where POLLOUT would work.
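The retry approach described above can be sketched in Python as follows. This is an illustration only, not the project's Perl implementation; the helper name `sendmsg_retry` and the `tries` and `delay` defaults are assumptions.

```python
import errno
import time

# ETOOMANYREFS is not defined on every platform, so look it up defensively.
RETRY_ERRNOS = {errno.ENOBUFS, errno.ENOMEM}
if hasattr(errno, "ETOOMANYREFS"):
    RETRY_ERRNOS.add(errno.ETOOMANYREFS)

def sendmsg_retry(sock, buffers, ancdata=(), tries=50, delay=0.1):
    """Call sock.sendmsg, sleeping and retrying on transient memory errors."""
    for attempt in range(tries):
        try:
            return sock.sendmsg(buffers, ancdata)
        except OSError as e:
            # Re-raise unrelated errors, and give up after the last attempt.
            if e.errno not in RETRY_ERRNOS or attempt == tries - 1:
                raise
            time.sleep(delay)  # give the receiver time to drain its buffers
```

Sleeping instead of polling for POLLOUT follows the reasoning in the commit message: the goal is simply to slow the sender down.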
2021-10-23 www: respect coderepo.*.url during cgit init
This is necessary for showing "found $OID in $CODEREPO_URL" in solver-generated pages ($INBOX_URL/$OID/s/).
2021-10-23 config: remove *_url_format support for cgit
We're not using them, anywhere.
2021-10-23 git: simplify local_nick, avoid "foo.git.git"
We need to use a non-greedy regexp to avoid capturing the ".git" suffix in the pathname before blindly appending our own.
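A short Python sketch of the greedy versus non-greedy distinction. The pattern below is an assumed stand-in for illustration, not the regexp used in the project's Perl code.

```python
import re

def local_nick(git_dir: str) -> str:
    """Return "NAME.git" for a repository path, without doubling the suffix."""
    # The lazy `+?` lets the optional `\.git` group consume an existing suffix;
    # a greedy `+` would swallow ".git" into the captured name.
    m = re.search(r"([^/]+?)(?:\.git)?/*$", git_dir)
    if m is None:
        raise ValueError(f"no repository name in {git_dir!r}")
    return m.group(1) + ".git"
```

With a greedy capture, `/srv/git/foo.git` would capture `foo.git` and yield `foo.git.git` once the suffix is appended.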
2021-10-23 searchidx: v1: raise on msgmap init failure
Indexing any inboxes requires SQLite and msgmap, so don't hide exceptions if it fails.
2021-10-22 lei forget-search: support --prune=<local|remote>
Instead of:
    lei forget-search $OUTPUT && rm -r $OUTPUT
we'll also allow a user to do:
    rm -r $OUTPUT && lei forget-search --prune
This gives users flexibility to choose whatever flow is most natural to them.
2021-10-22 lei export-kw: completion returns all Maildir+IMAP
It's theoretically possible an AUTH=ANONYMOUS login could be writable and allowed to store flags for various people (e.g. within a private network).
2021-10-22 lei export-kw: don't recreate deleted IMAP folders
In case an IMAP folder is deleted, just set an error and ignore it rather than creating an empty folder which we attempt to export keywords to for non-existent messages.
2021-10-22 wwwatomstream: call gmtime with scalar
When the gmtime() calls were moved from feed_entry() and atom_header() into feed_updated() in c447bbbd, @_ rather than a scalar was passed to gmtime(). As a result, feed <updated> values end up as "1970-01-01T00:00:00Z". Switch back to using a scalar argument to restore the correct timestamps. Fixes: c447bbbddb4ac8e1 ("wwwatomstream: simplify feed_update callers")
2021-10-22 lei: use RENAME_NOREPLACE on Linux 3.15+
One syscall is better than two for atomicity in Maildirs. This means there's no window where another process can see both the old and new file at the same time (link && unlink), nor a window where we might inadvertently clobber an existing file if we were to do `stat && rename'.
2021-10-22 lei_mail_sync: mv_src: use transaction, check UNIQUE
We need a transaction across two SQL statements so readers (which don't use flock) will see the result as atomic. This may help against some occasional test failures I'm seeing from t/lei-auto-watch.t and t/lei-watch.t, or make the problem more apparent.
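As a sketch of why two statements need a single transaction, the following Python example uses the standard `sqlite3` module. The table name `blob2name` and its columns are hypothetical here, and the code is not the project's Perl implementation.

```python
import sqlite3

def mv_src(db, folder_id, oid, old_name, new_name):
    """Replace one (folder, oid, name) row with another atomically."""
    # Used as a context manager, the connection commits on success and rolls
    # back on error; sqlite3 begins the transaction implicitly before DELETE.
    with db:
        db.execute(
            "DELETE FROM blob2name WHERE fid = ? AND oid = ? AND name = ?",
            (folder_id, oid, old_name))
        db.execute(
            "INSERT INTO blob2name (fid, oid, name) VALUES (?, ?, ?)",
            (folder_id, oid, new_name))
```

If the INSERT violates a UNIQUE constraint, the DELETE is rolled back too, so a reader sees either the old row or the new one and never neither.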
2021-10-22 lei: no Perl FileHandle for `undef' w/ ECONNRESET
Error reporting for recv_cmd4 methods is a bit wonky.
2021-10-22 dir_idle: treat IN_MOVED_FROM as a gone event
Whether an MUA uses rename(2) or link(2)+unlink(2) combination should not matter to us. We should be able to handle both cases.
2021-10-22 lei note-event: clear_src on ENOENT
When a file goes away, try to make sure we don't waste time trying to access or store it.
2021-10-22 watch: remove redundant signal mask manipulation
The top-level daemon process already blocks all signals, so there's no reason to block them around fork() calls.
2021-10-22 watch: check for {quit} before IDLE
This may make it less likely for watch-dependent tests to get stuck. Unfortunately, due to the synchronous API of Mail::IMAPClient, ->idle is still susceptible to missing signals.
2021-10-22 lei_search: try harder to associate "lei index"-ed messages
Allow checking for keyword changes if we have a known OID, even if the blob isn't currently reachable.
2021-10-22 lei note-event: wq_io_do => wq_do
No need to pass extra arrayref args, here.
2021-10-22 lei note-event: drop unnecessary eval guard
We don't want to lose the failure message in case note-event fails.
2021-10-22 lei/store: check for any unexpected process death
The lei/store process should only exit from EOF on the socket, so make sure we note any unintended signals.
2021-10-20 httpd: reject requests with spaces in header names
Malicious clients may attempt HTTP request smuggling this way. This doesn't affect our current code as we only look for exact matches, but it could affect other servers behind a to-be-implemented reverse proxy built around our -httpd. This doesn't affect users behind varnish at all, nor the HTTPS/HTTP reverse proxy I use (I don't know about nginx), but could be passed through by other reverse proxies. This change is only needed for HTTP::Parser::XS which most users probably use. Users of the pure Perl parser (via PLACK_HTTP_PARSER_PP=1) already hit 400 errors in this case, so this makes the common XS case consistent with the pure Perl case. cf. https://www.mozilla.org/en-US/security/advisories/mfsa2006-33/
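The check described above can be sketched in Python as follows. The function name `parse_header_line` and its error handling are assumptions for illustration; the project's actual change is in Perl and works around HTTP::Parser::XS behavior.

```python
def parse_header_line(line: bytes):
    """Split "Name: value"; raise ValueError for names containing whitespace."""
    name, sep, value = line.partition(b":")
    if not sep or not name:
        raise ValueError("malformed header line")
    # A name such as b"Content-Length " might be normalized differently by
    # another server in the chain, which is what request smuggling exploits.
    if any(c in b" \t" for c in name):
        raise ValueError("whitespace in header name")  # caller would answer 400
    return name.decode("latin-1"), value.strip().decode("latin-1")
```

Rejecting such requests outright keeps the front end and any server behind it from disagreeing about where a request ends.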
2021-10-19 lei_mail_sync: show non-matching SHA
It could prove useful for diagnosing bugs (either on our end or an MUA's), or storage device failures.
2021-10-19 lei inspect: show ISO8601 {rt} and {dt}, too
While inspect is intended for debugging, the Unix epoch in seconds requires extra steps for human consumption; just steal what we used for "lei q -f json" output.
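For illustration, converting Unix epoch seconds to an ISO 8601 UTC string takes one call in Python. The helper name `iso8601` is hypothetical, and the exact format used by "lei q -f json" is assumed here rather than confirmed.

```python
from datetime import datetime, timezone

def iso8601(epoch: int) -> str:
    """Format Unix epoch seconds as an ISO 8601 UTC timestamp."""
    dt = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")
```

For example, `iso8601(1634601600)` gives `2021-10-19T00:00:00Z`, which is easier to read than the raw integer.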
2021-10-19 lei inspect: add atfork hook
This is necessary in case an inspect command is run in parallel with other commands.
2021-10-19 lei: remove unused ->busy time arg
Our graceful shutdown doesn't time out clients.
2021-10-19 lei up: support --exclude=, --no-(external|remote|local)
These can be used to temporarily disable using certain externals in case of temporary network failure or mount point unavailability.
2021-10-19 lei: conditionally add "\n" to error messages
Some error messages already include "\n" (w/ file+line info), so don't add another one. (`warn' will automatically add its caller location unless there's a final "\n").
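The rule amounts to a one-line check, sketched here in Python with a hypothetical helper name (the project's code is Perl, where `warn' has the caller-location behavior mentioned above):

```python
def with_newline(msg: str) -> str:
    """Return msg unchanged if it ends with a newline, else append one."""
    return msg if msg.endswith("\n") else msg + "\n"
```

Messages that already carry file and line information keep their single trailing newline instead of gaining a blank line.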
2021-10-19 lei up: propagate redispatch_all failure via exit code
We can still continue with some local externals, maybe; but the error needs to be propagated to the calling process for scripting purposes.
2021-10-19 lei: use die for external and query handling
This allows "lei up" to continue processing unrelated externals if one output fails.
2021-10-19 lei up: prefix `remote' and `local' with `o_'
This will help distinguish between mail outputs and external public-inboxes.
2021-10-19 test_common: lazy-require AutoReap
This might speed up non-daemon-using tests.
2021-10-18 v2: mirrors don't clobber msgs w/ reused Message-IDs
For odd messages with reused Message-IDs, the second message showing up in a mirror (via git-fetch + -index) should never clobber an entry with a different blob in over. This is noticeable only if the messages arrive in-between indexing runs. Fixes: 4441a38481ed ("v2: index forwards (via `git log --reverse')")
2021-10-18 extindex: show mismatches for messages deleted from inbox
There seems to be a bug in v2 inbox reindexing somewhere...
2021-10-17 extindex: better locations for {quit} checks
Check for graceful termination at every message since it's a fairly inexpensive check.
2021-10-17 extindex: guard against false mismatch unrefs
I'm not sure if this is a bug or not (or it could be an old bug in the v2 indexing code).
2021-10-17 extindex: retry sync_inbox before reindex
Ensure the num highwater mark of the target inbox is stable before using it. Otherwise we may end up repeating work done to index a message.
2021-10-17 extindex: use localtime to display lock time
Since this is intended for use on the command-line, include TZ offset in time and try to shorten the message a bit so it wraps less on a terminal.
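A compact local-time rendering with a numeric TZ offset might look like the following Python sketch. The helper name `lock_time` and the exact format string are assumptions, not what the project's Perl code emits.

```python
import time

def lock_time(epoch: float) -> str:
    """Format epoch seconds in local time with a numeric UTC offset."""
    return time.strftime("%Y-%m-%d %H:%M:%S %z", time.localtime(epoch))
```

The numeric offset (such as `+0000`) keeps the message short while still telling the reader which zone the time is in.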
2021-10-17 msgmap: do not cache num_highwater
Caching the value doesn't seem necessary from a performance perspective, and it adds a caveat for read-only users which may lead to bugs in future code.
2021-10-16 eml: fix leak workaround
Our previous workaround didn't actually work around the leak in <https://rt.cpan.org/Public/Bug/Display.html?id=139622> since croak()-via-Perl was still invoked before the SV reference count could be decremented. Put in a proper workaround which saves warnings onto a temporary variable and only croaks after ->decode or ->encode returns, not inside those methods.
2021-10-16 lei sockets: favor level-triggered epoll for fairness
Sigfd->event_step needs priority over script/lei clients, LeiSelfSocket, and everything else.