public-inbox.git - an "archives first" approach to mailing lists

Date	Commit message
2024-02-09	view: decode In-Reply-To comments added by some MUAs
	Emacs-based MUAs (e.g. Gnus and rmail) can add comments to In-Reply-To headers, and maybe some others do, too. I noticed it in <https://yhbt.net/lore/git/xmqqr0ho9oi9.fsf@gitster.g/> while scanning for something else.
2024-01-24	view: /$INBOX/ links to topics_{new,active}.html
	This makes the new endpoints easier-to-find. The navigation is still at the bottom of the page since I figured having it at the top is too cluttered for users on small terminals.
2024-01-10	address: avoid [ undef, undef ] address pairs
	For totally bogus things in address fields, we'll fall back to showing the original entry in the name column when using Email::Address::XS. The pure Perl version differs here, but we'll just let them be different when it comes to handling bogus data.
2024-01-10	www: linkify inbox addresses in To/Cc headers
	This makes it easier to discover contemporary messages crossposted to other groups within the same WWW instance. The internal cache is necessary for giant threads, and the expiry mechanism is necessary to prevent attackers from trivially OOM-ing.
2024-01-02	view: always show strict|loose note w/ multi-roots
	For thread skeletons with multiple roots, it makes sense to note the strict|loose delineation even when the first message matches the desired Message-ID.
2023-11-29	www: load and use cindex join data
	This is a major step in solving the problem of having to manually associate hundreds/thousands of coderepos with hundreds/thousands of public-inboxes to power solver (and more).
2023-02-04	www: sort all /$INBOX/ topics by Received: timestamp
	Our previous pinning prevention only worked to prevent older (non-most-recent) topics from being pinned to the landing page, but not the most recent window of messages. We still sort messages within threads by Date: because that makes git-send-email patchsets display more nicely, but we don't want recent topics pinned due to future Date: headers. I nearly switched sort_ds() back to sorting by Received: until I looked back on commit 8e52e5fdea416d6fda0b8d301144af0c043a5a76 (use both Date: and Received: times, 2018-03-21) and was reminded git-send-email relies on Date: for large series, so I added a note about it for sort_ds(). Reported-by: Kyle Meyer <kyle@kyleam.com> Tested-by: Kyle Meyer <kyle@kyleam.com> Link: https://public-inbox.org/meta/87edr5gx63.fsf@kyleam.com/
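The two sort orders described above can be sketched roughly as follows. This is only an illustration, not the project's sort_ds() code: the records are hypothetical, and the field names `ds` (Date: time) and `ts` (Received: time) are assumed here for the example.

```perl
use strict;
use warnings;

# Hypothetical topic records: {ds} is the Date: header time and {ts}
# the Received: time, both as epoch seconds (assumed field names).
my @topics = (
	{ subj => 'future-dated', ds => 4_000_000_000, ts => 1_675_000_000 },
	{ subj => 'newest arrival', ds => 1_675_500_000, ts => 1_675_500_100 },
	{ subj => 'older', ds => 1_674_000_000, ts => 1_674_000_050 },
);

# Landing page order: newest Received: time first, so a bogus future
# Date: header cannot pin a topic to the top.
my @landing = sort { $b->{ts} <=> $a->{ts} } @topics;

# Within a thread: Date: order, which keeps a patch series in sequence
# even if the messages arrive out of order.
my @patches = (
	{ subj => '[PATCH 2/2]', ds => 101, ts => 200 },
	{ subj => '[PATCH 1/2]', ds => 100, ts => 201 },
);
my @in_thread = sort { $a->{ds} <=> $b->{ds} } @patches;

print join("\n", map { $_->{subj} } @landing, @in_thread), "\n";
```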
2023-01-11	www: /$INBOX/$MSGID/d/ to diff reused Message-IDs
	To ensure users aren't abusing the ability to reuse Message-IDs, provide a convenient front-end to `lei mail-diff' from WWW. Most of the time it's just list-appended signatures, so I expect this to be useful for /all/ users.
2022-09-29	www: remove "1\n" lines in $MSGID/t/ view
	Fixes: ab9c03ff4aa3 "www: use PerlIO::scalar (zfh) for buffering"
2022-09-11	view: fix solver links with multiple messages
	For redundant messages sharing Message-IDs, the link to solver (/$INBOX/$OID/s/) was going up too many levels for /$INBOX/$MSGID/ when there were multiple messages sharing the same $MSGID. Unfortunately, redundant messages are common with /all/ due to signature trailers. So dynamically assigning {-spfx} is tricky and error prone from counting `/'. So simplify the code a bit by setting {-spfx} once per HTTP request, instead of every single message.
2022-09-10	www: use PerlIO::scalar (zfh) for buffering
	Calling Compress::Raw::Zlib::deflate is fairly expensive. Relying on the `.=' (concat) operator inside the ->zadd method is faster, but the method dispatch overhead is noticeable compared to the original code where we had bare `.=' littered throughout. Fortunately, `print' and `say' with the PerlIO::scalar IO layer appear to offer better performance without high method dispatch overhead. This doesn't allow us to save as much memory as I originally hoped, but does allow us to rely less on concat operators in other places and just pass a list of args to `print' and `say' as appropriate. This does reduce scratchpad use, however, allowing for large memory savings, and we still ->deflate every single $eml.
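As a minimal sketch of the buffering technique named above (not the project's actual code), a filehandle can be opened on an in-memory scalar so that `print' and `say' take a list of args in place of explicit concatenation. The variable names are made up for illustration.

```perl
use strict;
use warnings;
use feature 'say';

# Open a write handle backed by an in-memory scalar; Perl uses the
# PerlIO::scalar layer for this.
my $buf = '';
open(my $fh, '>', \$buf) or die "open: $!";

# A list of args goes straight to print/say, no `.=' or join needed.
print $fh '<b>', 'subject', '</b>', "\n";
say $fh 'body line';
close($fh) or die "close: $!";

# $buf now holds everything written through $fh.
print $buf;
```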
2022-09-10	www: switch to zadd for the majority of buffering
	This allows us to focus string concatenations in one place to allow Perl internal scratchpad optimizations to reuse memory. Calling Compress::Raw::Zlib::deflate repeatedly proves too expensive in terms of CPU cycles.
2022-09-10	www: drop {obuf} use entirely, for now
	This may help us identify hot spots and reduce pad space as needed.
2022-09-10	view: switch a few things to ctx->zmore
	Unfortunately, this is actually slower. However, this hopefully makes it easier to improve the internals and make performance improvements down the line.
2022-09-10	view: html_footer: avoid escaping " in a few places
	qq() is a nice alternative to "" when there's embedded " characters in HTML entities.
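A small illustration of the difference, using a made-up link target:

```perl
use strict;
use warnings;

my $href = '../new.html'; # hypothetical link target

# With "" every embedded double quote needs a backslash.
my $escaped = "<a href=\"$href\">latest</a>";

# qq() interpolates the same way without the escaping.
my $plain = qq(<a href="$href">latest</a>);

print "identical\n" if $escaped eq $plain;
```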
2022-09-10	view: html_footer: remove obuf dependency
	Another step towards giving us more options for speedups and memory reductions.
2022-09-10	view: html_footer: golf out a few lines
	We can build `$u' in one line, and drop an unnecessary empty line to reduce the amount of scrolling required to read this sub.
2022-09-10	view: reduce ascii_html calls and {obuf} use
	We can rely on {-html_tip} for some things at the top of the page, and reduce ascii_html and obfuscate_addrs calls by working on the whole buffer at once.
2022-09-10	view: _th_index_lite: use `//' defined-or op
	Just something I noticed while evaluating this subroutine for the buffering overhaul.
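For readers unfamiliar with the operator, here is a short sketch of how `//' differs from `||'. The helper subs are hypothetical and unrelated to _th_index_lite itself; `//' requires Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical helpers: the argument may be undef, 0, or a positive number.
sub level_or_default {
	my ($level) = @_;
	$level // -1; # falls back only when $level is undef
}

sub level_or_default_loose {
	my ($level) = @_;
	$level || -1; # also falls back on 0 and '', which may not be wanted
}

print level_or_default(0), ' ', level_or_default_loose(0), "\n";
```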
2022-09-10	view: _th_index_lite: avoid one s///, improve symmetry
	We can replace an expensive `s///' substitution with a simpler `chop'. Furthermore, we can delay the "</b>\n" replacement to ensure it's on the same line of Perl code as the `<b>' opening tag for readability.
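The general trade-off can be sketched like this; the string is a made-up example, not the buffer the subroutine actually handles:

```perl
use strict;
use warnings;

my $line = "<b>subject</b>\n"; # hypothetical input

# Regex-based removal of the final newline.
(my $via_subst = $line) =~ s/\n\z//;

# chop removes the last character unconditionally, so it is only safe
# when the caller already knows which character ends the string.
my $via_chop = $line;
chop $via_chop;

print "same result\n" if $via_subst eq $via_chop;
```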
2022-09-10	view: attach_link: reduce obuf manipulation
	This is another step towards reducing the maximum size of an obuf by eventually doing compression earlier while we render messages as HTML. And do some golfing while we're at it...
2022-09-10	view: reduce subroutine calls for submsg_hdr
	Favor fewer, yet more expensive operations than many smaller ones. While we're still directly manipulating ctx->{obuf} after this, this change makes it easier for us to avoid doing so in the future.
2022-09-10	view: remove multipart_text_as_html
	It seems like a pointless wrapper function that's not saving us a whole lot. Drop some direct {obuf} manipulation while we're at it.
2022-09-10	view: eml_entry: reduce manipulation of ctx->{obuf}
	This is another step towards avoiding unnecessary copies and wasted pad space.
2022-09-10	view: simplify _parent_headers
	Having References but lacking In-Reply-To is an uncommon case with email, nowadays. So just rely on ->linkify_mids to handle linkification and HTML escaping. Furthermore, headers are short enough to return as-is (and rely on CoW improvements in Perl 5.1x) since linkify_mids needs to operate on an independent string, anyways.
2022-09-10	viewvcs: use shorter and simpler ctx->html_done
	We only return 200s for any response large enough to warrant ->html_done, so we can just assume it. ViewVCS can also take advantage of it with some tweaking to avoid an extra method dispatch.
2022-09-10	www_stream: aresponse assumes 200, too
	There's no reason to be streaming large amounts of HTML for anything other than a 200 response.
2022-09-10	view: rework single message page to compress earlier
	We can rely on deflate to compress large thread skeletons on single message pages. Subsequent commits will compress bodies, as well.
2022-09-08	view: drop unnecessary comma in date range note
	I'm not sure how it got there, but it seems out-of-place in retrospect.
2022-09-02	www: omit [thread overview] link for unindexed v1
	Unindexed v1 inboxes do not have the thread overview skeleton at the bottom of /$MSGID/ pages, so do not link to it. And for rare messages without a Date: header (or any headers!), this also ensures the [thread overview] is shown regardless.
2022-09-02	www: fix top nav bar for unindexed v1 inboxes
	For /$INBOX/$MSGID/ pages, we need to point all nav bar links ../ regardless of whether ->over exists. I've also verified this doesn't affect /$INBOX/new.html at all.
2022-09-02	www: always show subject for root of thread skeleton
	For users with short attention spans, the root message of a thread should have the Subject, since <title> is often truncated in browsers.
2022-08-29	www: provide text/help/#search anchor
	This allows jumping to the appropriate section of the "help" from under the dfblob textarea search.
2022-08-29	www: atom: fix "changed" href to nowhere
	The HTML generated for the Atom feed doesn't have the footer of /T/ and /t/ HTML-only views, so just make "changed" in the diffstat go directly to the permalink #related anchor. Fixes: 66512e177390 ("view: generate query in single-message and commit views")
2022-08-29	view: cleanups and reuse for {obuf} preparation
	{obuf} will eventually go away and we'll write directly to {zbuf}, but as an intermediate step we'll make some changes to rely less on return values. While we're in the area, reuse Linkify objects in more places where possible to save some allocations.
2022-08-29	view: /$INBOX/: show "messages from $old to $new"
	With the ViewVCS commit view using /$INBOX/?t=YYYYMMDDhhmmss- links, the use of `t=' may not be immediately obvious to a reader and confuse them into thinking the inbox hasn't been updated in a while. So add a header to the top of the page whenever the `t=' query parameter is used. And kill a couple of redundant variable assignments while we're at it.
2022-08-29	treewide: ditch inbox->recent method
	It's a needless wrapper, nowadays. Originally, ->over was added on an experimental basis to optimize for /$INBOX/ where Xapian ->search is slower on gigantic (LKML-sized) inboxes. Nowadays with extindex, ->over is here to stay given NNTP and IMAP both benefit from it. So reduce the interpreter stack overhead and just access ->over directly. lxs->recent was never used outside of tests, anyways. And while we're in the area, avoid needlessly bumping the refcount of $ctx->{ibx} in View::paginate_recent.
2022-08-29	view: speed up /$INBOX/ landing page by 0.5-1.0%
	Array lookups and extra arithmetic in Perl are slower than bumping the internal array offset inside the interpreter. Fwiw, using: my ($level, $subj) = splice(@extra, 0, 2) did not result in a performance improvement.
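A rough sketch of the two access patterns being compared, assuming the change swapped indexed access for `shift' (the commit message does not spell this out). The data and sub names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical flat list of (level, subject) pairs.
my @extra = (0, 'root', 1, 'Re: root', 2, 'Re: Re: root');

# Index arithmetic: two lookups plus an addition per pair.
sub by_index {
	my @e = @_;
	my @out;
	for (my $i = 0; $i < @e; $i += 2) {
		push @out, "$e[$i]:$e[$i + 1]";
	}
	\@out;
}

# Consuming shift: each call moves the array's internal start offset.
sub by_shift {
	my @e = @_;
	my @out;
	while (@e) {
		my $level = shift @e;
		my $subj = shift @e;
		push @out, "$level:$subj";
	}
	\@out;
}

print "same output\n" if "@{by_index(@extra)}" eq "@{by_shift(@extra)}";
```

Timing the two subs with the core Benchmark module is one way to check whether the difference is measurable on a given Perl build.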
2022-08-29	www: allow html_oneshot to take an array arg
	Another step towards making our internal APIs more writev-like and reducing the copies needed for `join' or `.=' concatenation.
2022-08-26	view: add "this message" link above dfblob: textarea
	When jumping to #related from /T/ or /t/ views, it could be disconcerting to not have the current message as context. So add a "this message" link back up to #t as we have always done with the reply instructions.
2022-08-23	view: generate query in single-message and commit views
	The dfblob: search prefix is probably under-utilized, but is extremely powerful IMHO. To make it easier-to-use, add a search textarea with it prefilled with values for the existing patch message. This allows users to easily run a query for all patches which alter or result in either pre or post-image blobs in the current patch. Behavior changes are as follows: "changed" in the diffstat jumps to the bottom of the message. For /T/ and /t/, it goes to the "related" anchor which is just above the reply instructions in the single-message view. For the single message view, it'll jump to the textarea search form. I initially wanted to use a normal `<a href=' link, but figured the textarea is advantageous for two reasons: 1) users should be able to edit the query before submitting; 2) crawlers are less likely to waste CPU/disk on forms. It's probably too noisy to add this directly to the /T/ and /t/ views, but the spot above the reply instructions in the single message view seems like a good place for it. Note that the queries used by the /$COMMIT_OID/s/ view are subtly different from those of the /$MSGID/ view since git will lengthen its abbreviations over time, while emails are immutable. I tried adding dfn: (filename) and s: (subject) support, but couldn't come up with cases where it really made sense for /$MSGID/. /$COMMIT_OID/s/ may benefit from it, since patchid: could be flaky due to non-standard diff generation options.
2022-08-20	view: do not show pagination footer for small inboxes
	For new public inboxes with few messages, the dead pagination footer is a worthless and confusing waste of space: "page: \n"; without `next' or `prev' links for users to follow.
2022-08-04	view: avoid intermediate array when streaming thread
	We can rely on auto-vivification to avoid an intermediate array for the map result.
2022-07-28	www: drop --subject from "git send-email" instructions
	Apparently, --subject doesn't work[1] with "git send-email" in this context. So drop the CLI arg and add a note to tell the user to set a "Subject:" line in their response body, instead. [1] I'm not sure if --subject ever worked as I thought it would, or if it's a regression. In either case, there are current versions of git where it doesn't, so just tell users to use the currently supported method. Link: https://80x24.org/lore/git/CAC4O8c-Tf11CpwuRudyrpXv5bGshuyEenV9kKrs0zRWER-+yHA@mail.gmail.com/
2022-04-02	view: remove unused $end variable
	Noticed while looking at something else completely unrelated...
2022-02-11	view: remove all CR before LF
	While we've rendered CR-LF as LF-only in HTML for many years, some messages end up as CR-CR-LF. So strip ALL CR bytes preceding LF bytes, while preserving odd CR in the middle of lines. Reported-by: Thomas Weißschuh <thomas@t-8ch.de> Link: https://public-inbox.org/meta/8d13668f-cac7-4984-bb4e-ad90502dc46d@t-8ch.de/
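One way to express the rule above is a single substitution. This is a sketch with a hypothetical helper name, not necessarily how the project implements it:

```perl
use strict;
use warnings;

# Hypothetical helper: collapse any run of CR bytes immediately before
# an LF into a bare LF.
sub strip_cr_before_lf {
	my ($s) = @_;
	$s =~ s/\r+\n/\n/g; # CR-LF and CR-CR-LF both become LF
	$s; # a lone CR in the middle of a line is left untouched
}

my $out = strip_cr_before_lf("a\r\r\nb\rc\r\n");
print $out eq "a\nb\rc\n" ? "ok\n" : "not ok\n";
```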
2021-10-24	thread: avoid Perl5 internal scratchpad target cache
	The use of array-returning built-ins such as `grep' inside arrayref declarations appears to result in permanently allocated scratchpad space for caching according to my malloc inspector. Thread skeletons get discarded every response, but multiple skeletons can exist in memory at once, so do what we can to prevent long-lived allocations from being made, here. In other words, replacing constructs such as: my $foo = [ grep(...) ]; with: my @foo = grep(...); seems to ensure the mortality of the underlying array.
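The two constructs look like this side by side. The example only shows that they hold the same elements; the allocation behavior described above is the commit author's observation and is not demonstrated here. The data and filter are made up.

```perl
use strict;
use warnings;

my @msgs = (1..10); # hypothetical stand-in for message numbers

# Construct the commit moves away from: an anonymous arrayref wrapping
# an array-returning built-in.
my $even_ref = [ grep { $_ % 2 == 0 } @msgs ];

# Replacement construct: assign the list to a lexical array and take a
# reference only where one is needed.
my @even = grep { $_ % 2 == 0 } @msgs;

print "same contents\n" if "@$even_ref" eq "@even";
```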
2021-10-09	view: save memory by dropping smsg->{from_name} on use
	We'll also save a few LoC when generating it. $smsg objects can linger a while when rendering large threads, so saving a few bytes here can add up to several hundred KB saved. I noticed this while chasing the ref cycle leak in commit b28e74c9dc0a (www: fix ref cycle from threading w/ extindex, 2021-10-03). While there's no longer a leak, releasing memory earlier can allow it to be reused sooner and reduce both memory traffic and memory pressure.
2021-10-09	view: discard Eml->{bdy} when done using
	We can release the raw body buffer once we've obtained a copy of the decoded buffer. This reduces memory pressure ahead of some expensive diff processing.
2021-10-06	msg_iter: split_quotes adds trailing "\n"
	The regexp in split_quotes relies on the presence of a final "\n", so add it wherever we need to instead of making it the responsibility of every caller. This probably doesn't matter in practice since every email seems to have a "\n" as the final byte (due to the way SMTP works), but maybe there are some odd ones that'll get imported via lei.