Date | Commit message |
|
Štěpán Němec <stepnem@smrk.net> wrote:
> Eric Wong wrote:
> > Subject: [PATCH] view: decode In-Reply-To comments added by Gnus
> Or just "some MUAs"? Who knows who else...
Yeah, I wouldn't be surprised if there were more...
---8<---
Subject: [PATCH] view: decode In-Reply-To comments added by some MUAs
Emacs-based MUAs (e.g. Gnus and rmail) can do it, and maybe
some others, too. I noticed it in
<https://yhbt.net/lore/git/xmqqr0ho9oi9.fsf@gitster.g/>
while scanning for something else.
|
|
This makes the new endpoints easier to find. The navigation is
still at the bottom of the page since I figured having it at the
top is too cluttered for users on small terminals.
|
|
For totally bogus things in address fields, we'll fall back to
showing the original entry in the name column when using
Email::Address::XS.
The pure Perl version differs here, but we'll just let them be
different when it comes to handling bogus data.
|
|
This makes it easier to discover contemporary messages
crossposted to other groups within the same WWW instance.
The internal cache is necessary for giant threads, and the
expiry mechanism is necessary to prevent attackers from
trivially OOM-ing.
|
|
For thread skeletons with multiple roots, it makes sense to
note the strict|loose delineation even when the first message
matches the desired Message-ID.
|
|
This is a major step in solving the problem of having to
manually associate hundreds/thousands of coderepos with
hundreds/thousands of public-inboxes to power solver
(and more).
|
|
Our previous pinning prevention only worked to prevent older
(non-most-recent) topics from being pinned to the landing page,
but not the most recent window of messages.
We still sort messages within threads by Date: because that
makes git-send-email patchsets display more nicely, but we
don't want recent topics pinned due to future Date: headers.
I nearly switched sort_ds() back to sorting by Received: until
I looked back on commit 8e52e5fdea416d6fda0b8d301144af0c043a5a76
(use both Date: and Received: times, 2018-03-21) and was reminded
git-send-email relies on Date: for large series, so I added a
note about it for sort_ds().
Reported-by: Kyle Meyer <kyle@kyleam.com>
Tested-by: Kyle Meyer <kyle@kyleam.com>
Link: https://public-inbox.org/meta/87edr5gx63.fsf@kyleam.com/
|
|
To ensure users aren't abusing the ability to reuse Message-IDs,
provide a convenient front-end to `lei mail-diff' from WWW.
Most of the time it's just list-appended signatures, so I expect
this to be useful for /all/ users.
|
|
Fixes: ab9c03ff4aa3 "www: use PerlIO::scalar (zfh) for buffering"
|
|
For redundant messages sharing Message-IDs, the link to solver
(/$INBOX/$OID/s/) was going up too many levels for /$INBOX/$MSGID/
when there were multiple messages sharing the same $MSGID.
Unfortunately, redundant messages are common with /all/
due to signature trailers. So dynamically assigning {-spfx}
by counting `/' is tricky and error-prone.
So simplify the code a bit by setting {-spfx} once per HTTP
request, instead of every single message.
|
|
Calling Compress::Raw::Zlib::deflate is fairly expensive.
Relying on the `.=' (concat) operator inside ->zadd is
faster, but the method dispatch overhead is noticeable compared
to the original code where we had bare `.=' littered throughout.
Fortunately, `print' and `say' with the PerlIO::scalar IO layer
appear to offer better performance without high method dispatch
overhead. This doesn't allow us to save as much memory as I
originally hoped, but does allow us to rely less on concat
operators in other places and just pass a list of args to
`print' and `say' as appropriate.
This does reduce scratchpad use, however, allowing for large
memory savings, and we still ->deflate every single $eml.
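A minimal sketch of the in-memory filehandle technique, not the
actual WWW code: `$zfh' and `$buf' are illustrative names here, and
the ->deflate step is left out entirely.

```perl
use strict;
use warnings;
use feature 'say';

# Open a write-only filehandle backed by a scalar (PerlIO::scalar).
my $buf = '';
open(my $zfh, '>', \$buf) or die "open: $!";

# Pass a list of args to `print' and `say' instead of building one
# big string with `.=' first.  The values below are placeholders.
my ($subj, $from) = ('hello world', 'user@example.com');
print $zfh '<b>', $subj, '</b>', "\n";
say $zfh 'From: ', $from;
close($zfh) or die "close: $!";
```

After close(), `$buf' holds everything written through `$zfh'.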
|
|
This allows us to focus string concatenations in one place to
allow Perl internal scratchpad optimizations to reuse memory.
Calling Compress::Raw::Zlib::deflate repeatedly proves too
expensive in terms of CPU cycles.
|
|
This may help us identify hot spots and reduce pad space
as needed.
|
|
Unfortunately, this is actually slower. However, this
hopefully makes it easier to improve the internals and
make performance improvements down the line.
|
|
qq() is a nice alternative to "" when there are embedded "
characters in HTML entities.
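For illustration only (the URL and markup below are placeholders,
not taken from the code being changed), both forms interpolate the
same way, but qq() needs no backslashes:

```perl
use strict;
use warnings;

my $href = 'https://example.com/'; # placeholder URL
# With "", every embedded double-quote needs a backslash:
my $escaped = "<a href=\"$href\" title=\"&quot;quoted&quot;\">link</a>";
# With qq(), interpolation still works and `"' is written as-is:
my $plain = qq(<a href="$href" title="&quot;quoted&quot;">link</a>);
```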
|
|
Another step towards giving us more options for speedups and
memory reductions.
|
|
We can build `$u' in one line, and drop an unnecessary empty
line to reduce the amount of scrolling required to read this
sub.
|
|
We can rely on {-html_tip} for some things at the top of the
page, and reduce ascii_html and obfuscate_addrs calls by
working on the whole buffer at once.
|
|
Just something I noticed while evaluating this subroutine
for the buffering overhaul.
|
|
We can replace an expensive `s///' substitution with a simpler
`chop'. Furthermore, we can delay the "</b>\n" replacement
to ensure it's on the same line of Perl code as the `<b>'
opening tag for readability.
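A rough sketch of the trade-off, assuming the string is known to
end with the one character being dropped; the substitution shown is
a stand-in, not the one this change actually removed:

```perl
use strict;
use warnings;

my $line = "Subject line\n"; # illustrative input
# Regexp approach: works, but invokes the regexp engine.
(my $via_subst = $line) =~ s/\n\z//;
# `chop' unconditionally removes the final character, so it is only
# safe when the string is known to end with the character to drop.
my $via_chop = $line;
chop($via_chop);
```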
|
|
This is another step towards reducing the maximum size of
an obuf by eventually doing compression earlier while we
render messages as HTML.
And do some golfing while we're at it...
|
|
Favor fewer, yet more expensive operations over many smaller
ones. While we're still directly manipulating ctx->{obuf} after
this, this change makes it easier for us to avoid doing so in
the future.
|
|
It seems like a pointless wrapper function that's not saving us
a whole lot. Drop some direct {obuf} manipulation while we're
at it.
|
|
This is another step towards avoiding unnecessary copies
and pad space waste.
|
|
Having References but lacking In-Reply-To is an uncommon case
with email, nowadays. So just rely on ->linkify_mids to handle
linkification and HTML escaping. Furthermore, headers are short
enough to return as-is (and rely on CoW improvements in Perl
5.1x) since linkify_mids needs to operate on an independent
string, anyways.
|
|
We only return 200s for any response large enough to warrant
->html_done, so we can just assume it. ViewVCS can also take
advantage of it with some tweaking to avoid an extra method
dispatch.
|
|
There's no reason to be streaming large amounts of HTML for
anything other than a 200 response.
|
|
We can rely on deflate to compress large thread skeletons on
single message pages. Subsequent commits will compress bodies,
as well.
|
|
I'm not sure how it got there, but it seems out-of-place in
retrospect.
|
|
Unindexed v1 inboxes do not have the thread overview skeleton
at the bottom of /$MSGID/ pages, so do not link to it.
And for rare messages without a Date: header (or any headers!),
this also ensures the [thread overview] is shown regardless.
|
|
For /$INBOX/$MSGID/ pages, we need to point all nav bar links
../ regardless of whether ->over exists. I've also verified
this doesn't affect /$INBOX/new.html at all.
|
|
For users with short attention spans, the root message should
have the Subject, since <title> is often truncated in most browsers.
|
|
This allows jumping to the appropriate section of the "help"
from under the dfblob textarea search.
|
|
The HTML generated for the Atom feed doesn't have the footer
of /T/ and /t/ HTML-only views, so just make "changed" in
the diffstat go directly to the permalink #related anchor.
Fixes: 66512e177390 ("view: generate query in single-message and commit views")
|
|
{obuf} will eventually go away and we'll write directly to
{zbuf}, but as an intermediate step we'll make some changes
to rely less on return values.
While we're in the area, reuse Linkify objects in more places
where possible to save some allocations.
|
|
With the ViewVCS commit view using /$INBOX/?t=YYYYMMDDhhmmss-
links, the use of `t=' may not be immediately obvious to a
reader and confuse them into thinking the inbox hasn't been
updated in a while.
So add a header to the top of the page whenever the `t=' query
parameter is used.
And kill a couple of redundant variable assignments while we're
at it.
|
|
It's a needless wrapper, nowadays. Originally, ->over was added
on an experimental basis to optimize for /$INBOX/ where Xapian
->search is slower on gigantic (LKML-sized) inboxes.
Nowadays with extindex, ->over is here to stay given NNTP and
IMAP both benefit from it. So reduce the interpreter stack
overhead and just access ->over directly.
lxs->recent was never used outside of tests, anyways.
And while we're in the area, avoid needlessly bumping the
refcount of $ctx->{ibx} in View::paginate_recent.
|
|
Array lookups and extra arithmetic in Perl are slower than
bumping the internal array offset inside the interpreter.
Fwiw, using: my ($level, $subj) = splice(@extra, 0, 2)
did not result in a performance improvement.
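A sketch of the shift-based pattern, assuming `@extra' is a flat
list of (level, subject) pairs; the data and the `@out' array are
made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical flat list of (level, subject) pairs.
my @extra = (0, 'root', 1, 'reply', 2, 'nested reply');
my @out;
while (@extra) {
	# `shift' bumps the array's internal offset; no index math needed.
	my $level = shift @extra;
	my $subj = shift @extra;
	push @out, ('  ' x $level) . $subj;
}
```

Note the loop consumes `@extra', so this only fits callers which no
longer need the array afterwards.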
|
|
Another step towards making our internal APIs more writev-like
and reducing the copies needed for `join' or `.=' concatenation.
|
|
When jumping to #related from /T/ or /t/ views, it could be
disconcerting to not have the current message as context.
So add a "this message" link back up to #t as we have always
done with the reply instructions.
|
|
The dfblob: search prefix is probably under-utilized, but is
extremely powerful IMHO. To make it easier to use, add a search
textarea with it prefilled with values for the existing patch
message. This allows users to easily run a query for all
patches which alter or result in either pre or post-image
blobs in the current patch.
Behavior changes are as follows: "changed" in the diffstat
jumps to the bottom of the message. For /T/ and /t/, it
goes to the "related" anchor which is just above the reply
instructions in the single-message view. For the single
message view, it'll jump to the textarea search form.
I initially wanted to use a normal `<a href=' link, but
figured the textarea is advantageous for two reasons:
1) users should be able to edit the query before submitting
2) crawlers are less likely to waste CPU/disk on forms
It's probably too noisy to add this directly to the /T/ and /t/
views, but above the reply instructions in the single message
view seems like a good place for it.
Note that the queries used by the /$COMMIT_OID/s/ view are
subtly different from the /$MSGID/ view since git will lengthen
its abbreviations over time, while emails are immutable.
I tried adding dfn: (filename) and s: (subject) support, but
couldn't come up with cases where it really made sense for
/$MSGID/. /$COMMIT_OID/s/ may benefit from it, since patchid:
could be flaky due to non-standard diff generation options.
|
|
For new public inboxes with few messages, the dead pagination
footer is a worthless and confusing waste of space: "page: \n";
without `next' or `prev' links for users to follow.
|
|
We can rely on auto-vivification to avoid an intermediate
array for the map result.
|
|
Apparently, --subject doesn't work[1] with "git send-email" in
this context. So drop the CLI arg and add a note to tell the
user to set a "Subject:" line in their response body, instead.
[1] I'm not sure if --subject ever worked as I thought it would,
or if it's a regression. In either case, there are current
versions of git where it doesn't, so just tell users to use
the currently supported method.
Link: https://80x24.org/lore/git/CAC4O8c-Tf11CpwuRudyrpXv5bGshuyEenV9kKrs0zRWER-+yHA@mail.gmail.com/
|
|
Noticed while looking at something else completely unrelated...
|
|
While we've rendered CR-LF as LF-only in HTML for many years,
some messages end up as CR-CR-LF. So strip ALL CR bytes
preceding LF bytes, while preserving odd CR in the middle of
lines.
Reported-by: Thomas Weißschuh <thomas@t-8ch.de>
Link: https://public-inbox.org/meta/8d13668f-cac7-4984-bb4e-ad90502dc46d@t-8ch.de/
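One way to express that rule as a single substitution (a sketch with
a made-up sample body; the actual change may be written differently):

```perl
use strict;
use warnings;

my $body = "one\r\ntwo\r\r\nmid\rdle\r\n"; # illustrative input
# Drop every run of CR bytes that directly precedes an LF, leaving
# a lone CR in the middle of a line untouched.
$body =~ s/\r+\n/\n/g;
```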
|
|
The use of array-returning built-ins such as `grep' inside
arrayref declarations appears to result in permanently allocated
scratchpad space for caching according to my malloc inspector.
Thread skeletons get discarded every response, but multiple
skeletons can exist in memory at once, so do what we can to
prevent long-lived allocations from being made, here.
In other words, replacing constructs such as:
my $foo = [ grep(...) ];
with:
my @foo = grep(...);
seems to ensure the mortality of the underlying array.
|
|
We'll also save a few LoC when generating it. $smsg objects can
linger a while when rendering large threads, so saving a few
bytes here can add up to several hundred KB saved.
I noticed this while chasing the ref cycle leak in commit
b28e74c9dc0a (www: fix ref cycle from threading w/ extindex, 2021-10-03).
While there's no longer a leak, releasing memory earlier can
allow it to be reused sooner and reduce both memory traffic and
memory pressure.
|
|
We can release the raw body buffer once we've obtained a copy of
the decoded buffer. This reduces memory pressure ahead of some
expensive diff processing.
|
|
The regexp in split_quotes relies on the presence of a
final "\n", so add it wherever we need to instead of
making it the responsibility of every caller.
This probably doesn't matter in practice since every
email seems to have a "\n" as the final byte (due to
the way SMTP works), but maybe there are some odd ones
that'll get imported via lei.
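A sketch of the idea using a hypothetical helper, `ensure_final_lf',
which is not part of the code base and only shows where the check
would live:

```perl
use strict;
use warnings;

# Hypothetical helper: guarantee a final "\n" before handing the
# text to a regexp that expects every line to be terminated.
sub ensure_final_lf {
	my ($txt) = @_;
	$txt .= "\n" unless $txt =~ /\n\z/;
	$txt;
}
```

Text already ending in "\n" passes through unchanged, so callers
don't need to check first.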
|