Date | Commit message |
|
We no longer vivify the intermediate $ibx->{-hide} hashref,
instead we use $ibx->{-hide_$KEY} directly. This avoids
an intermediate hashref and extra hash table lookups.
|
|
This is a convenient (and slightly memory-saving) alternative to
specifying a `publicinbox.*.url' entry for every single inbox
when using publicinbox.wwwListing.
|
|
The sort options and mbox downloads only apply to individual
inbox search endpoints, and they make no sense for the listing
of inboxes themselves.
|
|
Again, ->deflate (and thus ->zmore) calls are relatively
expensive compared to `print' ops using PerlIO::scalar
behind-the-scenes. While I can likely optimize the `join' away
here, too, that will happen in a future commit.
|
|
We need to branch for non-empty `q=' parameters anyways, but
`q=' is usually empty/unset. While we're in the area, `chomp'
reads `$/' while `chop' is simpler. Furthermore, we can shave
a few bytes off the form HTML by omitting spaces before `/>'
and placing `\n' to wrap long lines before attribute names.
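A minimal sketch of the `chomp' vs `chop' difference mentioned above.
strip_chomp and strip_chop are hypothetical helper names used only for
illustration; they are not code from this commit:

```perl
use strict;
use warnings;

# Hypothetical helpers, only to contrast the two builtins.
sub strip_chomp {
	my ($s) = @_;
	chomp($s); # removes a trailing $/ (default "\n") only if present
	$s;
}

sub strip_chop {
	my ($s) = @_;
	chop($s); # unconditionally removes the last character
	$s;
}

print strip_chomp("a=b\n"), '|', strip_chop("a=b\n"), "\n";
```

`chop' never consults `$/', so it is only safe when the last character
is known to be the one we want gone.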
|
|
`.' concatenation is still faster for small strings, but
passing an array to ->zmore is more efficient for large
search results and full listings.
|
|
This allows users to search /all/ from the top-level WwwListing
without extra manual steps, although extra network round trips
are still incurred.
No vertical whitespace is added, and there are no clumsy radio
buttons or menus to deal with. Users only have to use a
different <input type=submit /> button. I forgot how to do this
until I realized we already do something similar with multiple
submit buttons for threaded vs non-threaded mboxrd.gz downloads.
Link: https://public-inbox.org/meta/20210827120845.29682-1-e@80x24.org/
|
|
Perhaps this can be expanded to include grokmirror information
in the future. For now, just give a hint about the "mirror"
link for each inbox.
|
|
This makes the mirroring and code retrieval instructions less
obstructive. Relying on WwwText means we only use our Linkify
module to make hrefs of full URLs, so relative and shortened
hrefs are off-limits; hopefully this isn't too much of a problem.
coderepo information remains duplicated on every page since
(IMHO) coderepos are an important feature; but nobody besides me
has ever bothered to configure coderepos, so I suppose it's
fine...
Suggested-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Link: https://public-inbox.org/meta/20210826132747.6gxuwnhftyf7c6hp@nitro.local/
|
|
This may fix problems with the "all" link disappearing.
Link: https://public-inbox.org/meta/CAMwyc-Tw=v5yT1U1U66GSwwTK8OJXv8_YDu-=oXbZO3tHSnYWw@mail.gmail.com/
|
|
Searching inboxes with an empty query no longer gives 500 errors
due to Xapian. Also, improve the error message when no inboxes
match, since saying no inboxes exist yet is wrong.
|
|
It's a special case and we can show it in the HTML display
without affecting manifest.js.gz generation.
|
|
The only place where we could return wide characters with -httpd
was the raw $INBOX_DIR/description text, which is now converted
to octets.
All daemon (HTTP/NNTP/IMAP) sockets are opened in binary mode,
so length() and bytes::length() are equivalent on reads. For
socket writes, any non-octet data would warn about wide characters
and we are strict in warnings with test_httpd.
All gzipped buffers are also octets, as is PublicInbox::Eml->body,
and anything from PerlIO objects ("git cat-file --batch" output,
filesystems), so bytes::length was unnecessary in all those places.
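A minimal sketch of why the two agree on octets, assuming a string
holding one non-ASCII character. The lengths helper is hypothetical and
only for illustration:

```perl
use strict;
use warnings;
use bytes (); # load bytes::length without enabling byte semantics
use Encode qw(encode);

# Hypothetical helper: returns (character length, byte length).
sub lengths {
	my ($s) = @_;
	(length($s), bytes::length($s));
}

my $wide = "\x{263a}"; # one wide character
my $octets = encode('UTF-8', $wide); # the same text as UTF-8 octets

printf "wide: %d vs %d\n", lengths($wide);
printf "octets: %d vs %d\n", lengths($octets);
```

The two values differ only for the wide string. Once data is octets, as
it is for anything read from or written to a binary-mode socket, plain
length() gives the byte count.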
|
|
Since CSS can be overridden by a static webserver on a per-inbox
basis, we need a similar pattern to deal with the instance-wide
WwwListing HTML. "/+/" probably won't conflict with any current
or future public inbox names.
I don't think it'll cause problems with common linkifiers or URL
extractors, either (and it's unlikely anybody would want to
share URLs of just the CSS in a plain text(-like) format).
|
|
ManifestJsGz->response was not invoking the new "url_filter"
method properly. Furthermore, fix url_filter for returning 404
responses.
Reported-by: Kyle Meyer <kyle@kyleam.com>
Link: https://public-inbox.org/meta/87fsx3128a.fsf@kyleam.com/
Fixes: 520be116e8a686cb ("www_listing: start updating for pagination + search")
|
|
WwwListing and ManifestJsGz may be too different nowadays to
be worth the code sharing between them.
Update some comments and note we still need better tests :x
Fixes: 520be116e8a686cb ("www_listing: start updating for pagination + search")
|
|
When dealing with thousands of inboxes, displaying all of
them on a single page isn't going to work. So steal some
pagination and search results code from the message search
to generate some basic HTML output that looks good in w3m.
|
|
This prevents the following problem logged to the webserver's error log:
E: Undefined subroutine &PublicInbox::WwwStream::code_footer called at /usr/share/perl5/PublicInbox/WwwListing.pm line 102.
in PublicInbox::ConfigIter=ARRAY(0x557aea68b1a8)::each_section at /usr/share/perl5/PublicInbox/ConfigIter.pm line 37.
Fixes: 7a3946ef122e ("www: support listing of inboxes")
|
|
Using "make update-copyrights" after setting GNULIB_PATH in my
config.mak
|
|
As with ExtSearch, MiscSearch lacks a janky cleanup timer of
PublicInbox::Inbox objects, leading to info about
inboxes/newsgroups going stale. Fortunately, we don't use
MiscSearch very heavily, yet.
In the future, we may be able to detect new inboxes without
having to SIGHUP or restart daemons using MiscSearch.
|
|
{pi_config} may be confused with the documented `PI_CONFIG'
environment variable, and we'll favor vowel-removal to be
consistent with our usage of object references.
The `pi_' prefix may stay in some places, for now; since a
separate namespace may come into this codebase for local/private
client-tooling.
For InboxIdle, we'll also remove an invalid comment about
holding a reference to the PublicInbox::Config object, too.
|
|
By using the just-introduced ConfigIter class.
And make ManifestJsGz a subclass of it to reduce duplication.
|
|
In Perl, we can simplify callers by passing a single array
all the way down the stack instead of a single array ref which
needs to be expanded every call.
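A minimal sketch contrasting the two calling styles. The sub names are
hypothetical and do not correspond to the real call chain in this
commit:

```perl
use strict;
use warnings;

# Hypothetical call chain, only to contrast the two calling styles.
sub leaf { my ($x, $y) = @_; "$x:$y" }

# array ref style: the ref must be expanded with @$args before the
# values can be used as ordinary arguments
sub mid_ref { my ($args) = @_; leaf(@$args) }
sub top_ref { my ($args) = @_; mid_ref($args) }

# flat style: @_ is forwarded as-is, nothing to expand
sub mid_flat { leaf(@_) }
sub top_flat { mid_flat(@_) }

print top_ref([ 'inbox', 1 ]), ' ', top_flat('inbox', 1), "\n";
```

Both chains hand the same values to leaf; the flat style just avoids
the dereference in the middle.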
|
|
It's still as slow as before with hundreds/thousands of inboxes,
but at least it's fair. Future changes will allow it to be
cached and memoized with persistent HTTP servers.
|
|
"*foo" is ambiguous in that it may refer to a bareword file handle;
so we'll use it where we can without triggering warnings.
PublicInbox::TestCommon::run_script_exit required dropping the
prototype, however. We'll also future-proof by dropping "use
warnings" in Cgit.pm and use the less-ambiguous "//=" in Inbox.pm
while we're in the area.
|
|
Sometimes it's useful to quickly get to threads and messages
which are contemporaries of the current thread/message being
focused on. This hopefully improves navigation by:
a) making the top line (where $INBOX_DIR/description is shown)
   a link to the latest topics in search results and
   per-thread/per-message views
b) providing a link to contemporaries ("~YYYY-MM-DD") around
   the thread overview skeleton area for per-thread and
   per-message views
|
|
The grep call in list_match_domain_i returns true for all inboxes,
even ones without a URL that matches the regular expression, because
the qr value passed to grep is not surrounded by slashes. Add them.
Fixes: 1988d730c0088e8b (config: support multi-value inbox.*.*url)
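A minimal sketch of the bug class described above, assuming a simple
URL list. match_buggy and match_fixed are hypothetical names, not the
actual list_match_domain_i code:

```perl
use strict;
use warnings;

# Hypothetical helpers illustrating grep with and without slashes.
sub match_buggy {
	my ($re, @urls) = @_;
	# BUG: $re is evaluated as a plain expression; a qr// object is
	# always true, so every element is returned
	grep($re, @urls);
}

sub match_fixed {
	my ($re, @urls) = @_;
	# /$re/ matches the pattern against each element via $_
	grep(/$re/, @urls);
}

my $re = qr!//example\.com/!;
my @urls = ('https://example.com/a', 'https://other.example/b');
printf "buggy: %d, fixed: %d\n",
	scalar(match_buggy($re, @urls)), scalar(match_fixed($re, @urls));
```

In scalar context grep returns the number of matching elements, which
makes the difference between the two forms easy to see.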
|
|
Virtually all of our responses are going to be gzipped, anyways.
This will allow us to utilize zlib as a buffering layer and
share common code for async blob retrieval responses.
To streamline this and allow GzipFilter to be a parent class,
we'll replace the NoopFilter with a similar CompressNoop class
which emulates the two Compress::Raw::Zlib::Deflate methods we
use.
This drops a bunch of redundant code and will hopefully make
upcoming WwwStream changes easier to reason about.
|
|
This simplifies callers, as witnessed by the change to
WwwListing. It adds overhead to NoopFilter, but NoopFilter
should see little use as nearly all HTTP clients request gzip.
|
|
The changes to GzipFilter here may be beneficial for building
HTML and XML responses in other places, too.
|
|
Assisted by commit a73957b5b05f2a00f7a85353b1658b6d8cde05ae
("testcommon: speed up wait_for_tail() on GNU/Linux")
Fixes: 846161e3d1207d59 ("treat $INBOX_DIR/description and gitweb.owner as UTF-8")
|
|
gitweb does the same with $GIT_DIR/description and gitweb.owner.
Allowing UTF-8 description should not cause problems when used
in responses to the NNTP "LIST NEWSGROUPS" request, either,
since RFC 3977 section 7.6.6 recommends the description be UTF-8
(but does not require it).
Link: https://public-inbox.org/meta/20200528151216.l7vmnmrs4ojw372g@sourcephile.fr/
|
|
This allows us to simplify some of our existing code and make
future changes easier.
I doubt anybody goes through the trouble to have a Perl
installation without zlib support. The zlib source code is even
bundled with Perl since 5.9.3 for systems without existing zlib
development headers and libraries.
Of course, zlib is also a requirement of git, too; and we're not
going to stop using git :)
[squashed: "wwwaltid: use gzipfilter up front"]
|
|
And not the last...
I only noticed this since JSON::PP::Boolean was spewing
redefinition warnings via overload.pm
Fixes: 8fb8fc52420ef669 ("wwwlisting: avoid lazy loading JSON module")
|
|
We already lazy-load WwwListing for the CGI script, and
hiding another layer of lazy-loading makes it difficult
to do WWW->preload.
We want long-lived processes to do all long-lived allocations up
front to avoid fragmentation in the allocator, but we'll still
support short-lived processes by lazy-loading individual modules
in the PublicInbox::* namespace.
Mixing up allocation lifetimes (e.g. doing immortal allocations
while a large amount of space is taken by short-lived objects)
will cause fragmentation in any allocator which favors large
contiguous regions for performance reasons. This includes any
malloc implementation which relies on sbrk() for the primary
heap, including glibc malloc.
|
|
"use" is also evaluated earlier than "require", so it is
favorable for compile-only checking.
|
|
I didn't wait until September to do it, this year!
|
|
We use the same idiom in many places for doing two-step
linkification and HTML escaping. Get rid of an outdated
comment in flush_quote while we're at it.
|
|
Most spawn and popen_rd callers die on failure to spawn,
anyways, and some are missing checks entirely. This saves
us a bunch of verbose error-checking code in callers.
This also makes popen_rd more consistent, since it already
dies on pipe creation failures.
|
|
This allows us to do some compile-time checking and fills in a
missing "use" in PublicInbox::NewsWWW, allowing it to be used
standalone and independently of PublicInbox::WWW
|
|
Since the beginning of this project, we've implicitly supported
inboxes with multiple URLs by relying on the Host: header sent
by the client ($env->{HTTP_HOST}).
We now offer the option to explicitly configure multiple URLs for
every inbox along with the ability to do a best-effort match for
matching hostnames.
|
|
git's config file keys lack underscores, but my mind is wired
for underscores :x. Fix the whitespace around the info URL
while we're at it, so that it shows up right under the inbox
description.
|
|
Another place where we can replace anonymous subs with named
subs by passing a user-supplied arg.
|
|
ProcessPipe::CLOSE won't reliably set $? inside the event loop
if waitpid(..., WNOHANG) isn't successful. So use a blocking
waitpid() call, here, and hope "git show-ref" exits promptly
since we've already drained its stdout.
|
|
We can use "use" to get the namespace into the "BEGIN" phase of
the interpreter. While we're at it, use \&coderef syntax
explicitly instead of globbing everything.
|
|
Spell "Schwartzian" correctly, and clarify the location of
"modified" since we have multiple subs named "modified"
|
|
"mainrepo" was a bad name and an artifact from the early days when I
intended for there to be a "spamrepo" (now just the
ENV{PI_EMERGENCY} Maildir). With v2, "mainrepo" can be
especially confusing, since v2 needs at least two git
repositories (epoch + all.git) to function and we shouldn't
confuse users by having them point to a git repository for v2.
Much of our documentation already references "INBOX_DIR" for
command-line arguments, so use "inboxdir" as the
git-config(1)-friendly variant for that.
"mainrepo" remains supported indefinitely for compatibility.
Users may need to revert to old versions, or may be referring
to old documentation and must not be forced to change config
files to account for this change.
So if you're using "mainrepo" today, I do NOT recommend changing
it right away because other bugs can lurk.
Link: https://public-inbox.org/meta/874l0ice8v.fsf@alyssa.is/
|
|
The default $GIT_DIR/description (provided by git.git templates)
isn't very useful for v2 epochs, so use the inbox description
and suffix it with the epoch number if it's otherwise unnamed.
Requested-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Link: https://public-inbox.org/meta/20190620190017.GA27175@chatter.i7.local/
|
|
Try to remain consistent with our own documentation regarding
v2 git "epochs", first.
|
|
And use it in manifest.js.
To ease maintaining mirrors with grokmirror(1), we can accept
a "git/" directory prefix before the epoch, and ".git" suffix
after the epoch number.
We maintain compatibility with "$INBOX/$EPOCH" cloning, of
course, and it's still easier-to-type on the command-line.
|