public-inbox.git - an "archives first" approach to mailing lists

Date	Commit message
2019-11-08	t/*.t: disable nntpd/httpd worker processes in most tests
	And explicitly test for respawning in t/httpd-corner.t. There's no need to have extra entries in the process table for most tests we run, since that's not what we're testing.
2019-11-08	t/hl_mod.t: remove IPC::Run (and File::Temp) dependency
	We already load PublicInbox::Spawn for which(), so using spawn() isn't unreasonable. And rely on "skip" to log the omitted test if w3m is missing, which means we need to update the "&&" escaping test to be self-referential on the same line. File::Temp was totally unused, there; and we can use "open ...,undef" in Perl to easily create anonymous temporary files for use with spawn().
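	The anonymous temporary file idea above can be sketched outside Perl as well. The following is a minimal Python analogy, not the project's code: `run_capture_to_tmpfile` is a hypothetical helper, and `tempfile.TemporaryFile` stands in for Perl's "open ...,undef".

```python
import subprocess
import sys
import tempfile


def run_capture_to_tmpfile(argv):
    """Run argv with stdout sent to an anonymous temp file.

    Returns (exit code, captured bytes). Hypothetical helper for illustration.
    """
    # On POSIX systems TemporaryFile() has no visible name: it is unlinked
    # right after creation, so nothing is left behind if we crash.
    with tempfile.TemporaryFile() as out:
        proc = subprocess.run(argv, stdout=out)
        # The child wrote through the inherited descriptor, so the shared
        # file offset is at the end; rewind before reading.
        out.seek(0)
        return proc.returncode, out.read()


if __name__ == "__main__":
    code, data = run_capture_to_tmpfile(
        [sys.executable, "-c", "print('hello')"])
    print(code, data.strip())
```

	Because the file never has a name, there is no cleanup step and no pathname to escape.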
2019-11-08	t/httpd-corner.t: get rid of IPC::Run for running curl
	We already load PublicInbox::Spawn, so there's no need to add another dependency to make life difficult for potential contributors.
2019-11-08	t/httpd-corner.t: drop unnecessary bytes:: for length()
	We don't need to force byte semantics for a buffer we clearly create (via ->read) with byte semantics. Since we didn't "use bytes" in t/httpd-corner.t, it was inadvertently made available by IPC::Run (which goes away, next).
2019-11-08	t/*.t: remove IPC::Run dependency for git commands
	One small step towards making tests easier-to-run. We can rely on "local $ENV{GIT_DIR}" for potentially shell-unsafe path names, and the rest of our path names are relative and don't contain characters which require escaping.
2019-11-08	edit: check for write errors writing "From_" line
	We need to check every print to a regular file for errors, because storage devices inevitably fail.
2019-11-08	edit: propagate correct editor exit code
	exit($?) is never correct, since ($? >> 8) is needed to extract the correct exit code, as other information (e.g. the terminating signal) is encoded in $? in addition to the exit code.
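	The status layout described above is the traditional 16-bit wait(2) encoding. As a rough illustration in Python (an analogy only; `decode_wait_status` is a hypothetical name, and the bit positions are the conventional Unix ones rather than anything specific to this project):

```python
def decode_wait_status(status):
    """Split a 16-bit wait(2)-style status into its parts.

    Returns (exit_code, signal_number, core_dumped).
    """
    exit_code = (status >> 8) & 0xFF   # high byte: value passed to exit()
    signal_number = status & 0x7F      # low 7 bits: terminating signal, 0 if none
    core_dumped = bool(status & 0x80)  # bit 7: core dump flag
    return exit_code, signal_number, core_dumped


if __name__ == "__main__":
    # A child that called exit(3) is reported as 3 << 8 == 768.
    print(decode_wait_status(768))
    # Passing 768 straight to exit() keeps only the low 8 bits, i.e. 0,
    # so the failure would be reported as success.
    print(768 & 0xFF)
```

	This is why forwarding the raw status loses the failure: the interesting byte is shifted out of the range an exit code can carry.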
2019-11-08	doc: actually document publicinboxwatch.watchspam
	Instead of copy-pasting the documentation for `spamcheck'.
2019-11-04	t/edit: use PublicInbox::Git::qx for pathname safety
	Another case where spaces can be in TMPDIR and cause shell expansion with `command` to fail.
2019-11-04	tests: rely on PublicInbox::Git for pathname safety
	It's possible (but unlikely) a user will put spaces in TMPDIR and cause File::Temp::tempdir() to return a temporary directory with spaces in the filename, making it unsafe for shell expansion. PublicInbox::Git didn't exist when t/mda.t was written, and I just forgot about PublicInbox::Git->qx for t/plack.t :x
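	The hazard described above goes away when the path is handed to the child as a single argv element instead of being interpolated into a shell command line. A minimal Python sketch of that idea (an analogy to what PublicInbox::Git->qx achieves, not its implementation; `child_sees` is a hypothetical helper):

```python
import subprocess
import sys
import tempfile


def child_sees(path):
    """Pass `path` as one argv element (no shell) and return what the child got."""
    child = "import sys; sys.stdout.write(sys.argv[1])"
    res = subprocess.run([sys.executable, "-c", child, path],
                         stdout=subprocess.PIPE, check=True)
    return res.stdout.decode()


if __name__ == "__main__":
    # Simulate a TMPDIR-style path that contains spaces.
    with tempfile.TemporaryDirectory(prefix="has space ") as tmpdir:
        print(child_sees(tmpdir) == tmpdir)
        # Naive word splitting, as an unquoted shell expansion would do,
        # breaks the same path into several words.
        print(len(tmpdir.split()) > 1)
```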
2019-11-04	t/httpd-corner.t: check for curl(1) errors in big async test
	curl(1) can fail and we need to invalidate the test in the rare case it fails.
2019-11-04	index: "git log" failures are fatal
	While I've never seen "git log" fail on its own, it could happen one day and we should be prepared to abort indexing when it happens. Beef up tests for t/spawn.t to ensure close() behaves on popen_rd the way we expect it to.
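	The general pattern, reading a child's output through a pipe and treating a non-zero status at close time as fatal, might look like this in Python. This is a sketch under assumptions: `read_all_or_raise` is a hypothetical name and the error handling policy is illustrative, not a description of how popen_rd works.

```python
import subprocess
import sys


def read_all_or_raise(argv):
    """Read all stdout from argv; raise if the child did not exit cleanly."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    data = proc.stdout.read()
    proc.stdout.close()
    # wait() reaps the child; output alone cannot tell us whether it
    # succeeded, so the return code must be checked as well.
    returncode = proc.wait()
    if returncode != 0:
        raise RuntimeError(f"{argv[0]} failed with return code {returncode}")
    return data


if __name__ == "__main__":
    print(read_all_or_raise([sys.executable, "-c", "print('ok')"]).strip())
```

	Note that a failing child may still have produced partial output, which is exactly why the status check cannot be skipped.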
2019-11-03	searchidxshard: reuse $SIG{__WARN__} callback from Admin
	We don't want to define $SIG{__WARN__} in the worker to call an existing non-default callback. Instead update ->{current_info} the same way the V2Writable master process does. I noticed this while reindexing with a large XAPIAN_FLUSH_THRESHOLD and seeing the wrong epoch on my terminal from a shard because the shard worker was spawned while reindexing a higher-numbered epoch.
2019-11-03	public-inbox v1.2.0 v1.2.0

2019-11-03	build: add "git-dist" target for making gzipped tarballs
	Since MANIFEST is tied to files tracked by git, adding generated files such as NEWS to that is more effort than it's worth (esp. when I'm wondering if MakeMaker is useful compared to only using GNU make). I also have trouble reading CamelCase, so lower-case names are nicer and more consistent with previous releases (which were all generated with "git archive"); but did not include NEWS.
2019-11-03	doc: mknews: force plain-text NEWS to UTF-8
	We'll have non-7-bit ASCII in the 1.2.0 release notes.
2019-11-03	TODO: update item for multiple Date: headers
	That's the only head-scratcher of the bunch remaining, since that relies on ranges.
2019-11-03	doc: add public-inbox.cgi(1) manpage
	Yet another case of documenting things which should NOT be used :>
2019-11-02	doc: add public-inbox-purge(1) manpage
	Tools intended for end users need manpages, and doubly so to convince potential users NOT to use them :)
2019-10-31	hval: replace "&apos;" with "&#39;" for compatibility
	While testing 216light.css changes, I managed to hit some cases where dillo failed to render &apos; correctly, but I also can't reproduce it reliably. Anyways, it's definitely a problem with some old browsers and newer versions of highlight already work around it, but Debian 10.x has 3.41, so use "&#39;" to maximize compatibility.
2019-10-31	contrib/css/216light: improve contrast a bit
	"#ff0" foreground on a "#fff" background is just too difficult to distinguish, among other things. So choose slightly darker colors when using a (painful) "#fff" background.
2019-10-31	qspawn: psgi_qx: delay callback until waitpid returns
	We need to detect "git apply" failures reliably when patches fail. This is necessary for solving for blob 81c1164ae5 in https://public-inbox.org/git/ when at least two messages can solve for it (and one of them fails): 1. https://public-inbox.org/git/b9fb52b8-8168-6bf0-9a72-1e6c44a281a5@oracle.com/ 2. https://public-inbox.org/git/56664222-6c29-09dc-ef78-7b380b113c4a@oracle.com/
2019-10-31	solvergit: deal with false-positive dfpost: results
	When solving for blob 81c1164ae5 in https://public-inbox.org/git/, at least two messages get indexed with the dfpost result for that blob (after fixing MsgIter to decode all text/* parts): 1. https://public-inbox.org/git/b9fb52b8-8168-6bf0-9a72-1e6c44a281a5@oracle.com/ 2. https://public-inbox.org/git/56664222-6c29-09dc-ef78-7b380b113c4a@oracle.com/ However, only the first message contains a usable patch. So we must adjust SolverGit to account for multiple messages hitting the same "dfpost:" search result and attempt "git apply" on all results, not just the first. In the future, changes to SearchIdx.pm may rid us of invalid search results and speed up performance (at the expense of developer/indexing time); but we need to account for old search indices, first.
2019-10-31	msgiter: attempt to decode all text/* bodies
	We want to index text/x-patch and text/x-diff, at least, since "git format-patch" can generate a patch series as attachments using --attach.
2019-10-31	msgiter: do not assume UTF-8 if Email::MIME->body_str succeeds
	ISO-2022-JP and other non-UTF-8 messages need to be displayed correctly. Fixes: 7d82a8bc04ce ('handle "multipart/mixed" messages which are not multipart')
2019-10-30	search: add note about SCHEMA_VERSION 15
	--reindex has gotten better over the years, and having parallel Xapian DB directories would exceed all available disk space for some users with giant inboxes.
2019-10-30	wwwlisting: fix spelling and clarify sub location
	Spell "Schwartzian" correctly, and clarify the location of "modified" since we have multiple subs named "modified"
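	For readers unfamiliar with the term, a Schwartzian transform is the decorate-sort-undecorate idiom: compute an expensive sort key once per item, sort on the cached keys, then strip them off. A minimal Python sketch, with hypothetical inbox names and modification times standing in for whatever "modified" actually returns:

```python
def sort_by_cached_key(items, key_func):
    """Decorate-sort-undecorate: key_func runs exactly once per item."""
    # Decorate: pair each item with its key; the index breaks ties so the
    # items themselves are never compared.
    decorated = [(key_func(item), idx, item) for idx, item in enumerate(items)]
    decorated.sort()
    # Undecorate: drop the key and index again.
    return [item for _key, _idx, item in decorated]


if __name__ == "__main__":
    # Hypothetical data: inbox name -> last-modified timestamp.
    mtimes = {"meta": 300, "git": 100, "test": 200}
    print(sort_by_cached_key(list(mtimes), lambda name: mtimes[name]))
```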
2019-10-30	Merge branch 'learn'
	* learn:
	  doc: add public-inbox-learn(1) manpage
	  mda: support multiple List-ID matches
	  mda: prepare for multiple destinations
	  inboxwritable: add assert_usable_dir sub
	  mda: skip MIME parsing if spam
	  mda: hoist out mda_filter_adjust
	  filter/base: remove MAX_MID_SIZE constant
	  mda: hoist out List-ID handling and reuse in -learn
	  learn: hoist out remove_or_add subroutine
	  learn: GIT_COMMITTER_<NAME|EMAIL> may be "" or "0"
	  learn: update usage statement
	  learn: only map recipient list on "ham" or "rm"
	  learn: support multiple To/Cc headers
2019-10-30	doc: add public-inbox-learn(1) manpage
	Tools intended for end users need manpages.
2019-10-30	mda: support multiple List-ID matches
	While it's not RFC2919-conformant, mail software can theoretically set multiple List-ID headers. Deliver to all inboxes which match a given List-ID since that's likely the intent. Cc: Eric W. Biederman <ebiederm@xmission.com> Link: https://public-inbox.org/meta/87pniltscf.fsf@x220.int.ebiederm.org/
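	The delivery rule above (every List-ID header gets a chance to match, and every matching inbox receives the message) can be sketched with Python's standard email package. Everything here is illustrative: `matching_inboxes`, the mapping, and the example addresses are hypothetical, and lower-casing the identifier is an assumption of this sketch.

```python
import email
import re

# The list identifier is the part between angle brackets.
LIST_ID_RE = re.compile(r"<([^<>]+)>")


def matching_inboxes(raw_message, inboxes_by_list_id):
    """Return every configured inbox matched by any List-ID header, in order."""
    msg = email.message_from_string(raw_message)
    found = []
    # get_all() returns all values of a repeated header, not just the first.
    for value in msg.get_all("List-Id", []):
        m = LIST_ID_RE.search(value)
        if not m:
            continue
        inbox = inboxes_by_list_id.get(m.group(1).lower())
        if inbox is not None and inbox not in found:
            found.append(inbox)
    return found


if __name__ == "__main__":
    raw = ("From: a@example.com\n"
           "List-Id: First list <one.example.com>\n"
           "List-Id: <two.example.com>\n"
           "Subject: hi\n\nbody\n")
    config = {"one.example.com": "/srv/one", "two.example.com": "/srv/two"}
    print(matching_inboxes(raw, config))
```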
2019-10-30	mda: prepare for multiple destinations
	Multiple List-ID headers will be supported in the next commit
2019-10-30	inboxwritable: add assert_usable_dir sub
	And use it for mda, since "0" could be a usable directory if somebody insists on using relative paths...
2019-10-30	mda: skip MIME parsing if spam
	We don't want to waste cycles parsing the message for MIME bits if it's spam.
2019-10-30	mda: hoist out mda_filter_adjust
	It makes it easier to document that the default -mda behavior is stricter than normal, including "public-inbox-learn ham"
2019-10-30	filter/base: remove MAX_MID_SIZE constant
	We don't need it in the filter, here, since we have one in the MDA package.
2019-10-30	mda: hoist out List-ID handling and reuse in -learn
	It's now possible to inject false-positive ham into an inbox the same way -mda does via List-ID.
2019-10-30	learn: hoist out remove_or_add subroutine
	We'll be reusing it for List-ID processing in the next commit.
2019-10-30	learn: GIT_COMMITTER_<NAME|EMAIL> may be "" or "0"
	Users may be zeroes or blanks.
2019-10-30	learn: update usage statement
	Use <foo|bar> since that seems to be the favored notation for required command args (taking a hint from git(1) manpage). While we're at it, remove the space after '<' for the redirect to match git.git coding style.
2019-10-30	learn: only map recipient list on "ham" or "rm"
	It's assumed that "spam" can end up anywhere due to Bcc:, so we need to scan every single inbox. However, "rm" is usually more targeted and "ham" obviously only belongs in some inboxes.
2019-10-30	learn: support multiple To/Cc headers
	It's possible to specify these headers multiple times, and PublicInbox::MDA->precheck takes that into account, so -learn should, too.
2019-10-30	Merge remote-tracking branch 'origin/multi-mid'
	* origin/multi-mid:
	  view: show X-Alt-Message-ID in permalink view, too
	  index: allow search/lookups on X-Alt-Message-ID
	  linkify: support adding "(raw)" link for Message-IDs
	  view: improve warning for multiple Message-IDs
	  view: move '<' and '>' outside <a>
	  view: display redundant headers in permalink
	  search: support multiple From/To/Cc/Subject headers
2019-10-30	HACKING: add a note about avoiding recursion
	Bad things happen when user data can control our stack size.
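	One common way to follow this advice is to replace recursive descent with a loop and an explicit stack, so that nesting depth consumes heap memory rather than call frames. A small Python sketch of the idea (nested lists stand in for any user-controlled tree, such as nested MIME parts; `iter_leaves` is a hypothetical name):

```python
def iter_leaves(root):
    """Yield the leaves of arbitrarily nested lists without recursion."""
    stack = [root]
    while stack:
        node = stack.pop()
        if isinstance(node, list):
            # Push children in reverse so they are popped in original order.
            stack.extend(reversed(node))
        else:
            yield node


if __name__ == "__main__":
    print(list(iter_leaves([1, [2, [3]], 4])))
    # Nesting far deeper than the interpreter's recursion limit is fine,
    # because depth only grows the explicit stack.
    deep = "leaf"
    for _ in range(100000):
        deep = [deep]
    print(list(iter_leaves(deep)))
```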
2019-10-28	view: show X-Alt-Message-ID in permalink view, too
	Since we index X-Alt-Message-ID (because we need to placate some NNTP clients), we now display it as well, since that Message-ID could be the X-Alt-Message-ID that the reader is actually interested in.
2019-10-28	index: allow search/lookups on X-Alt-Message-ID
	Since we replace extra Message-ID headers with X-Alt-Message-ID to placate NNTP clients, we should allow searching and indexing on X-Alt-Message-ID just like we do with Message-ID.
2019-10-28	linkify: support adding "(raw)" link for Message-IDs
	And use it for the per-message permalink display.
2019-10-28	view: improve warning for multiple Message-IDs
	"refer" is not the correct term, here; since that would mean multiple messages have the current message in the "References:" header, and that's a normal occurrence. Instead, we need to warn the reader that the given message itself has multiple Message-IDs.
2019-10-28	view: move '<' and '>' outside <a>
	Browsers may underline '<' and '>' in links, which may be confused with '≤' and '≥'. So have the Message-ID header display follow what we do with In-Reply-To headers and move the "<" and ">" outside of <a> in the HTML.
2019-10-28	view: display redundant headers in permalink
	Mail headers can contain multiple headers of any type, so ensure we don't hide any information we're getting in the per-message permalink views. This means it's possible to have multiple From, Date, To, Cc, Subject, and In-Reply-To headers displayed. The thread indices are a special case, I guess, since we run out of space on the line if the headers are too long and tools like mutt only show the first one.
2019-10-28	search: support multiple From/To/Cc/Subject headers
	We can easily support searching on messages with multiple From/To/Cc/Subject headers just like we do with multiple Message-ID headers. This matches the normal mutt pager display behavior.