public-inbox.git - an "archives first" approach to mailing lists

Date	Commit message
2019-04-23	view: avoid "1+ messages" in per-message footer of /t/ and /T/
	Try to appear grammatically correct and state: "only message in thread" when there's only one known (to us) message in the thread.
2019-04-18	view: show "(no subject)" consistently in HTML
	Empty subjects ("") and undefined Subjects: are both displayed as "(no subject)" for now.
2019-04-16	cleanup: use '$ibx' consistently when referring to Inbox refs
	'$inbox' is more human-readable, so that is used for the human-readable name in most cases. Making our variable naming more consistent should make the code easier to review and harder to screw up.
2019-02-13	ensure bytes::length is available to callers
	We were relying on Danga::Socket using the "bytes" pragma, previously. Nowadays, the "bytes" pragma is not recommended in general, but bytes::length remains acceptable for getting the byte-size of a scalar.
2019-02-01	viewdiff: support renames and long paths in diffstat anchors
	This is best-effort, but works well-enough in practice for projects which use shell-friendly filenames as well as the long path names for some Linux kernel selftests.
2019-02-01	view: simplify quote splitting
	Perl "split" can capture and group in the regexp itself, so rely on that to shorten our code.
	Comparing the /T/ HTML output of a thread from hell (on LKML with 1356 messages) reveals no difference in the rendered result. Only the HTML source differs in newline placement before/after the closing </span>.
	This allows a minor speedup on my X32 Thinkpad @ 1.6GHz with the aforementioned LKML thread from hell:
	before: 3.67s
	after: 3.55s
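	The capture-in-split idea carries over to other regexp engines. A minimal sketch in Python (not the project's Perl code): `QUOTE_RUN` and `split_quotes` are hypothetical names, and the pattern is only an assumption about what counts as a quoted line.

```python
import re

# Hypothetical pattern: one or more consecutive lines starting with '>'.
# The outer parentheses form a capturing group.
QUOTE_RUN = re.compile(r'((?:^>[^\n]*\n)+)', re.MULTILINE)

def split_quotes(body):
    """Split a message body into alternating unquoted and quoted chunks."""
    # With a capturing group in the pattern, re.split() also returns the
    # matched separators (the quote runs), so no second pass is needed.
    return [chunk for chunk in QUOTE_RUN.split(body) if chunk]

body = "Hi,\n> old line 1\n> old line 2\nMy reply.\n> more\nBye\n"
chunks = split_quotes(body)
for chunk in chunks:
    kind = "quoted" if chunk.startswith(">") else "text"
    print(kind, repr(chunk))
```

	Joining the chunks back together reproduces the original body, so nothing is lost by the split.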
2019-02-01	view: fix broken hunk header hrefs in Atom feeds
	We use absolute URLs in the Atom feeds (to ease syndication/mirroring), so hunk headers need to point to the solver URLs.
2019-02-01	view: diffstat anchors for multi-message/attachment views
	diffstat <-> ^diff anchors work within the same attachment or message while in HTML views which display multiple messages.
2019-01-30	Merge remote-tracking branch 'origin/viewvcs' into master
	* origin/viewvcs: (66 commits)
	solvergit: deal with alternative diff prefixes
	solvergit: extract mode from diff headers properly
	solvergit: avoid "Wide character" warnings
	solvergit: do not show full path names to "git apply"
	css/216dark: add comments and tweak highlight colors
	viewvcs: avoid segfault with highlight.pm at shutdown
	solvergit: do not solve blobs twice
	t/check-www-inbox: disable history
	t/check-www-inbox: don't follow mboxes
	t/check-www-inbox: replace IPC::Run with PublicInbox::Spawn
	hval: add src_escape for highlight post-processing
	viewvcs: wire up syntax-highlighting for blobs
	hlmod: disable enclosing <pre> tag
	t/hl_mod: extra check to ensure we escape HTML
	wwwhighlight: read_in_full returns undef on errors
	solver: crank up max patches to 9999
	viewvcs: do not show final error message twice
	qspawn: decode $? for user-friendliness
	solver: reduce "git apply" invocations
	solver: hold patches in temporary directory
	...
2019-01-30	view: remove unused _msg_date sub
	Not needed since commit 956abe9ad5f13a0d1755262be412d6a54fda72e9 ("view: depend on SearchMsg for Message-ID")
2019-01-26	view: swap CRLF for LF in HTML output
	It makes no difference to browsers aside from saving a few bytes; and this means we won't have to worry about extra '%0D' showing up in links to solver.
2019-01-20	viewdiff: support diff-highlighting w/o coderepo
	Having diff highlighting alone is still useful, even if blob-resolution/recreation is too expensive or unfeasible.
2019-01-19	view: wire up diff and vcs viewers with solver

2019-01-19	view: disable bold in topic display
	It seems pointless due to the indentation, and interacts badly with some CSS colouring.
2019-01-08	view: more culling for search threads
	{mapping} overhead is now down to ~1.3M at the end of a giant thread from hell.
2019-01-08	view: fix wrong date for non-Xapian/SQLite v1 users
	We need to parse the MIME object in order to get the datestamp for those sites. Fixes: 7d02b9e64455 ("view: stop storing all MIME objects on large threads")
2019-01-08	view: stop storing all MIME objects on large threads
	While we try to discard the $smsg (SearchMsg) objects quickly, they remain referenced via $node (SearchThread::Msg) objects, which are stored forever in $ctx->{mapping} to cull redundant words out of subjects in the thread skeleton. This significantly cuts memory bloat with large search results with '&x=t'. Now, the search results overhead of SearchThread::Msg and linked objects are stable at around 350K instead of ~7M per response in a rough test (there's more savings to be had in the same areas). Several hundred kilobytes is still huge and a large per-client cost; but it's far better than MEGABYTES per-client.
2018-12-30	handle "multipart/mixed" messages which are not multipart
	I've found two examples on https://lore.kernel.org/lkml/ where the messages declared themselves to be "multipart/mixed" but were actually plain text:
	<87llgalspt.fsf@free.fr>
	<200308111450.h7BEoOu20077@mail.osdl.org>
	With the mboxrd downloaded, mutt is able to view them without difficulty. Note: this change would require reindexing of Xapian to pick up the changes. But it's only two ancient messages: the first was resent by the original sender and the second is too old to be relevant.
2018-12-28	reply: allow ":none=$REASON" in "replyto" config
	This can be useful for configuring archives of lists which are no longer active.
2018-08-05	view: distinguish strict and loose thread matches
	The "loose" (Subject:-based) thread matching yields too many hits for some common subjects (e.g. "[GIT] Networking" on LKML) and causes thread skeletons to not show the current messages. Favor strict matches in the query and only add loose matches if there's space. While working on this, I noticed the backwards --reindex walk breaks `tid' on v1 repositories, at least. That bug was hidden by the Subject: match logic and not discovered until now. It will be fixed separately. Reported-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
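	The "strict first, loose only if there's space" policy can be sketched as follows. This is an illustration in Python under stated assumptions, not the actual query code: `merge_matches`, its arguments and the `limit` parameter are all hypothetical.

```python
def merge_matches(strict, loose, limit):
    """Return up to `limit` message IDs, preferring strict matches.

    strict: IDs matched by thread ID (hypothetical input)
    loose:  IDs matched only by Subject: (hypothetical input)
    """
    picked = list(strict[:limit])
    seen = set(picked)
    for mid in loose:
        if len(picked) >= limit:
            break  # no space left; drop the remaining loose matches
        if mid not in seen:
            picked.append(mid)
            seen.add(mid)
    return picked

# Strict matches always survive; loose ones only fill the remainder.
print(merge_matches(["a", "b"], ["b", "c", "d"], 3))
```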
2018-04-23	view: drop redundant References: display code
	We no longer need to parse and dedupe References: ourselves, PublicInbox::MID::references does it for us.
2018-04-23	view: wrap To: and Cc: headers in HTML display
	It is common to have large numbers of addresses Cc:-ed in large mailing lists like LKML. Make them more readable by wrapping after addresses. Unfortunately, line breaks inserted by the MUA get lost when using the public Email::MIME API. Subject and body lines remain unwrapped, as it's the author's fault to have such long lines :P
2018-04-23	view: untangle loop when showing message headers
	The old loop did not help with code clarity with the various conditional statements. It also hid a bug where we forgot to (optionally) obfuscate email addresses in Subject: lines if search was enabled.
2018-04-18	feed: respect feedmax, again
	Gigantic feeds probably make some clients unhappy, so clamp the feed size to what it was in the past. Fixes: b9534449ecce2c59 ("view: avoid offset during pagination")
2018-04-06	www: favor reading more from SQLite, and less from Xapian
	Favor simpler internal APIs this time around; this cuts a fair amount of code out and takes another step towards removing Xapian as a dependency for v2 repos.
2018-04-03	view: avoid offset during pagination
	OFFSET in SQLite gets painful to deal with. Instead, rely on timestamps (from Received:) for pagination. This also sets us up for more precise Date searching in case we want it.
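	Paginating by timestamp instead of OFFSET is often called keyset pagination. A minimal sketch using Python's sqlite3 module: the `msgs` table, its columns and `page_before` are hypothetical and do not reflect the real overview schema.

```python
import sqlite3

def page_before(db, before_ts, limit):
    """Fetch up to `limit` rows strictly older than `before_ts`, newest first."""
    # The WHERE clause lets SQLite seek directly to the start of the page;
    # OFFSET would have to step over every skipped row first.
    cur = db.execute(
        "SELECT num, ts FROM msgs WHERE ts < ? ORDER BY ts DESC LIMIT ?",
        (before_ts, limit))
    return cur.fetchall()

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE msgs (num INTEGER PRIMARY KEY, ts INTEGER)")
db.execute("CREATE INDEX idx_ts ON msgs (ts)")
db.executemany("INSERT INTO msgs VALUES (?, ?)",
               [(1, 100), (2, 200), (3, 300), (4, 400), (5, 500)])

page1 = page_before(db, 10**12, 2)        # newest page
page2 = page_before(db, page1[-1][1], 2)  # continue from last timestamp seen
print(page1, page2)
```

	Rows sharing the exact timestamp of a page boundary would be skipped by this simple version; a real implementation needs a tie-breaker.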
2018-04-02	www: rework query responses to avoid COUNT in SQLite
	In many cases, we do not care about the total number of messages. It's a rather expensive operation in SQLite (Xapian only provides an estimate). For LKML, this brings top-level /$INBOX/ loading time from ~375ms to around 60ms on my system. Days ago, this operation was taking 800-900ms(!) for me before introducing the SQLite overview DB.
2018-03-30	feed: optimize query for feeds, too
	This is a smaller improvement than the landing /$INBOX/ page because full message bodies are shown; but still saves around 100ms for my system with LKML.
2018-03-30	view: drop load_results
	It's no longer necessary to have this since load_expand now populates $smsg->mid with the "preferred" Message-ID. This saves around 10ms on the homepage for me.
2018-03-30	view: speed up homepage loading time with date clamp
	This saves over 400ms on my system with the full LKML with over 2.8 million messages.
2018-03-29	view: get rid of some unnecessary imports
	We no longer need some of these old subroutines which assumed a single Message-ID for each message.
2018-03-29	search: get rid of most lookup_* subroutines
	Too many similar functions doing the same basic thing was redundant and misleading, especially since Message-ID is no longer treated as a truly unique identifier. For displaying threads in the HTML, this makes it clear that we favor the primary Message-ID mapped to an NNTP article number if a message cannot be found.
2018-03-29	www: fix attachment downloads for conflicted Message-IDs
	By using the "primary" Message-ID in WwwAttach, we can avoid conflicts in the links we use for downloading attachments.
2018-03-29	www: remove unnecessary ghost checks
	We do not need to care about ghosts at multiple call sites; they cannot have a {blob} field and we've stored the blob field in Xapian since SCHEMA_VERSION=13.
2018-03-27	view: depend on SearchMsg for Message-ID
	Since we need to handle messages with multiple and duplicate Message-ID headers, our thread skeleton display must account for that. Since we have a "preferred" Message-ID in case of conflicts, use it as the UUID in an Atom feed so readers do not get confused by conflicts.
2018-03-27	view: permalink (per-message) view shows multiple messages
	This needs tests and further refinement, but current tests pass.
2018-03-22	use both Date: and Received: times
	We want to rely on Date: to sort messages within individual threads since it keeps messages from git-send-email(1) sorted. However, since developers occasionally have the clock set wrong on their machines, sort overall messages by the newest date in a Received: header so the landing page isn't forever polluted by messages from the future. This also gives us determinism for commit times in most cases, as we'll use the Received: timestamp there, as well.
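	To illustrate the two sort keys, here is a rough Python sketch using the standard email package. `msg_times` is a hypothetical helper, and it assumes each Received: header ends with "; <date>", which real-world mail does not always honor.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

def msg_times(raw):
    """Return (date_hdr, newest_received) for a raw message string."""
    msg = message_from_string(raw)
    date = parsedate_to_datetime(msg["Date"])
    # Assumption: the timestamp follows the last ';' of each Received: header.
    received = [parsedate_to_datetime(h.rsplit(";", 1)[1].strip())
                for h in msg.get_all("Received", [])]
    return date, (max(received) if received else date)

# Sender clock is wrong: Date: claims the year 2038.
future = ("Received: from a by mx; 22 Mar 2018 10:00:00 +0000\n"
          "Date: 1 Jan 2038 00:00:00 +0000\nSubject: a\n\nbody\n")
sane = ("Received: from b by mx; 22 Mar 2018 11:00:00 +0000\n"
        "Date: 22 Mar 2018 09:00:00 +0000\nSubject: b\n\nbody\n")

msgs = {"future": msg_times(future), "sane": msg_times(sane)}
newest_by_date = max(msgs, key=lambda k: msgs[k][0])
newest_by_received = max(msgs, key=lambda k: msgs[k][1])
print(newest_by_date, newest_by_received)
```

	Sorting the landing page by the Received: key keeps the message with the bogus 2038 date from sitting on top.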
2018-03-06	favor Received: date over Date: header globally
	The first Received: header is believable since it typically hits the user's mail server and can be treated as relatively trustworthy. We still show the Date: in per-message (permalink) views, which may expose users who have incorrect Date: headers, but all the ISO YYYY-MM-DD dates we display will match what we see.
2018-02-28	view: remove X-PI-TS reference
	We haven't needed this since we integrated threading and dropped Email::Abstract and Mail::Thread usage.
2018-02-07	update copyrights for 2018
	Using update-copyrights from gnulib. While we're at it, use the SPDX identifier for AGPL-3.0+ to ease mechanical processing.
2018-02-03	view: allow expanding directly to "nested" view
	Sometimes, it can be desirable to jump directly to the "nested" view when viewing a thread skeleton. This makes it possible. While we're at it, shorten some of the text to ensure it still fits in 80 columns.
2018-01-30	view: close <pre> in reply instructions
	We leave the mailto: link out when obfuscating addresses, so do not stuff the "</pre>" closing tag into it. Instead, keep the closing tag in the same context as the opening one, making it easier to keep track of.
2018-01-29	view: adjust wording for reply-to-list configs
	This makes the wording less confusing when showing archives for lists where the convention is reply-to-list. I still hate reply-to-list, but it's still better than no archives or list at all.
2017-12-21	view: avoid deduping a single word in subject skeletons
	It is usually pointless to replace a single word with a '"' character.
2017-11-29	view: avoid warning from negative repeat counts
	Perl 5.22 started warning about this.
2017-10-18	view: s/threaded/nested/ in view
	We always do threading, so perhaps it's not a good name. "Nested" is probably more appropriate and closer to what people are used to seeing.
2017-10-03	search: try to fill in ghosts when generating thread skeleton
	Since we attempt to fill in threads by Subject, our thread skeletons can cross actual thread IDs, leading to the possibility of false ghosts showing up in the skeleton. Try to fill in the ghosts as well as possible by performing a message lookup.
2017-10-03	threading: deal with improperly-terminated References headers
	We should not blindly join References and In-Reply-To headers as a single string, because some messages can have an open angle brace '<' in References: without a corresponding '>'.
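	A small Python sketch of why joining the headers is risky. The `NAIVE` pattern is a hypothetical example of a permissive regexp, not the one public-inbox used, and `references` here is an illustrative helper rather than PublicInbox::MID::references.

```python
import re

NAIVE = re.compile(r'<([^>]+)>')      # hypothetical permissive pattern
STRICT = re.compile(r'<([^<>\s]+)>')  # rejects '<' and whitespace inside an ID

def references(header_values):
    """Extract unique Message-IDs, parsing each header value on its own."""
    out, seen = [], set()
    for value in header_values:
        for mid in STRICT.findall(value):
            if mid not in seen:
                seen.add(mid)
                out.append(mid)
    return out

refs = "<a@x> <b@x"     # References: with an unterminated '<'
in_reply_to = "<c@x>"   # In-Reply-To:

# Blindly joining lets the unterminated ID swallow the next one.
joined = NAIVE.findall(refs + " " + in_reply_to)
separate = references([refs, in_reply_to])
print(joined, separate)
```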
2017-06-29	view: cull redundant phrases in subjects
	There is no need to show the same phrases over and over again in thread skeletons; it adds to visual noise and makes things more difficult to read.
2017-06-23	allow admins to configure non-obfuscated addresses/domains
	We will also treat all known list addresses as non-obfuscated. By setting publicinbox.noObfuscate in ~/.public-inbox/config, this will allow users to disable address obfuscation on a per-domain or per-address basis.