Date | Commit message |
|
Most of these test cases are already in t/plack.t, which
runs much faster. Just ensure the slashy corner case and search
stuff work. While we're at it, avoid using the
public-inbox-index command and just use the internal API to
index.
|
|
No point in implementing these slowly with the CGI wrapper
when PSGI is sufficient for testing.
|
|
No need to test this via CGI; .cgi is a wrapper around
PSGI, and PSGI tests are way faster.
|
|
It is redundant with what is in t/plack.t
|
|
t/plack.t already has the same test.
|
|
More of this test will be, we use PSGI nowadays; and
most of these tests can be ported over to use PSGI and
not fork+exec as much.
|
|
No need to write our own loop when an assignment will do.
|
|
In PSGI, PATH_INFO contains URI-decoded paths, which cause
problems when Message-IDs contain ambiguous characters used
for routing. Instead, extract the undecoded path from
REQUEST_URI and use that.
Reported-by: Leah Neukirchen <leah@vuxu.org>
https://public-inbox.org/meta/8736xsb5s5.fsf@vuxu.org/
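
A minimal sketch of the idea in Python, using a WSGI-style environ dict
as a stand-in for the PSGI env. The function name `undecoded_path` and
the SCRIPT_NAME handling are illustrative assumptions, not the code this
commit actually added:

```python
def undecoded_path(env):
    """Return the request path as the client sent it, without URI-decoding.

    Falls back to PATH_INFO when the server does not provide REQUEST_URI.
    """
    uri = env.get("REQUEST_URI")
    if uri is None:
        return env.get("PATH_INFO", "")
    # Drop the query string; only the path is used for routing.
    path = uri.split("?", 1)[0]
    # Strip the mount point (if any) so routes stay relative to the app.
    script = env.get("SCRIPT_NAME", "")
    if script and path.startswith(script):
        path = path[len(script):]
    return path


# A Message-ID containing an escaped slash: PATH_INFO has already been
# decoded, so the "/" inside the Message-ID looks like a path separator.
env = {
    "SCRIPT_NAME": "",
    "PATH_INFO": "/test/a/b@example.com/",
    "REQUEST_URI": "/test/a%2Fb@example.com/?x=1",
}
print(undecoded_path(env))
```

With the undecoded path, the router can split on "/" first and unescape
each component afterwards, so an escaped slash never changes the route.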
|
|
"LIKE" in SQLite (and other SQL implementations I've seen) is
expensive with nearly 3 million messages in the archives.
This caused some partial Message-ID lookups to take over 600ms
on my workstation (~300ms on a faster Xeon). Cut that to
under 30ms on average on my workstation by relying exclusively
on Xapian for partial Message-ID lookups as we have in the past.
Unlike in the past when we tried using Xapian to match partial
Message-IDs, we now optimize our indexing of Message-IDs to
break apart "words" in Message-IDs for searching, yielding
(hopefully) "good enough" accuracy for folks who get long URLs
broken across lines when copy+pasting.
We'll also drop the (in retrospect) pointless stripping of
"/[tTf]" suffixes for the partial match, since anybody who
hits that codepath would be hitting an invalid message ID.
Finally, limit wildcard expansion to prevent easy DoS vectors
on short terms.
And blame Pine and alpine for generating Message-IDs with
low-entropy prefixes :P
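
A rough illustration of breaking a Message-ID into searchable "words",
written in Python. The splitting rule here (runs of ASCII letters and
digits) and the helper name `mid_words` are assumptions for illustration
only; the real indexing is done through Xapian and may tokenize
differently:

```python
import re


def mid_words(mid):
    """Split a Message-ID into lowercase alphanumeric "words"."""
    return [w.lower() for w in re.findall(r"[A-Za-z0-9]+", mid)]


# A URL broken across lines may only carry the first part of the
# Message-ID; the leading words still match the indexed ones.
full = mid_words("8736xsb5s5.fsf@vuxu.org")
partial = mid_words("8736xsb5s5.fsf@vu")
print(full)
print(partial)
```

Only the last word of a truncated Message-ID is incomplete, which is
where a (bounded) wildcard match would come in.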
|
|
Using update-copyright from gnulib.
While we're at it, use the SPDX identifier for AGPL-3.0+ to
ease mechanical processing.
|
|
PSGI specs already require PATH_INFO to be unescaped;
so our tests were wrong, too.
|
|
Based on reading RFC 3986, it seems '@', ':', '!', '$', '&',
"'", '(', ')', '*', '+', ',', ';', '=' are all allowed
in path-absolute where we have the Message-ID.
In any case, it seems '@' is fairly common in path components
nowadays and too common in Message-IDs.
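
As a sketch of what this means for escaping, here is a Python version
using the standard `urllib.parse.quote`. The helper name `mid_escape`
and the constant `PATH_SAFE` are hypothetical; the project itself is
written in Perl and its escaping routine may differ:

```python
from urllib.parse import quote, unquote

# Characters RFC 3986 allows unescaped in a path segment (pchar),
# beyond the unreserved set: the sub-delims plus ':' and '@'.
PATH_SAFE = "@:!$&'()*+,;="


def mid_escape(mid):
    """Percent-encode a Message-ID for use as a single path component."""
    # '/' is deliberately NOT in PATH_SAFE, so it becomes %2F.
    return quote(mid, safe=PATH_SAFE)


print(mid_escape("a/b+c@example.com"))
```

The '@' and '+' survive as-is, while '/' (and '%') are still escaped so
the component cannot be confused with a path separator.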
|
|
We now generate all of our HTML using WwwStream which
forces us to have consistent headers and footers in
the HTML itself.
This also makes the search-capable vs search-less installs
go to the new.html endpoint to maintain consistency
(in case an admin decides to enable Xapian).
|
|
We no longer depend on it for the core code, and tests
are optional for users. Hopefully this makes this
easier to install.
|
|
A public-inbox is NOT necessarily a mailing list, but it
could serve as an input point for zero, one, or infinite
mailing lists :D
|
|
Process startup times are atrocious for fast tests and there's far
too much setup involved. Rely on git-fast-import instead; but
more work is needed in this area.
|
|
No need to maintain per-block environment state when we can
localize it to per-command. We've had --git-dir= in git
since 1.4.2 (2006-08-12) and already use it all over the
place.
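
The same idea sketched in Python: pass the repository location on each
command line instead of setting GIT_DIR around a block of calls. The
helper name `git_cmd` is an assumption for illustration; the tests
themselves are Perl:

```python
def git_cmd(git_dir, *args):
    """Build a git argv that targets git_dir without touching os.environ."""
    return ["git", f"--git-dir={git_dir}", *args]


# Each command carries its own repository; nothing leaks to later commands.
cmd = git_cmd("/tmp/main.git", "log", "-1", "--pretty=oneline")
print(cmd)
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`
when git is installed.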
|
|
Quote-folding was a major design mistake pre-1.0. Since this
project is still in its infancy and unlikely to be in wide
use at the moment, redirect the /f/ endpoints back to the
plain message.
|
|
This should make identifying leftover directories
due to SIGKILL-ed tests easier.
|
|
This should not be dependent on what is in the users'
$HOME config, oops.
|
|
This is enabled by default, for now.
Smart HTTP cloning support will be added later, but it will
be optional since it can be highly CPU and memory intensive.
|
|
In the future, it should be possible to use this:
git ls-files | UPDATE_COPYRIGHT_HOLDER='all contributors' \
UPDATE_COPYRIGHT_USE_INTERVALS=2 \
xargs /path/to/gnulib/build-aux/update-copyright
|
|
This is the correct Content-Type for Atom feeds, especially
since we updated to use ".atom" as the suffix.
|
|
Since cross-posting is inevitable, we shall link to external
message archives for interoperability.
|
|
This allows common /m/ links to be used without a prefix,
saving 2 precious bytes for permalinks and raw messages.
Old URLs continue to redirect.
|
|
This allows users to subscribe to only a single thread
with their feed reader without subscribing to the rest of
the list.
Update our endpoint notes while we're at it.
|
|
These URLs are preferable in case somebody decides to get cute and
use a suffix we would've used to prevent others from linking to
their message. The common /m/$MESSAGE_ID/ URLs are now 4 characters
shorter so may fit better on terminals.
|
|
We will prefer URLs without suffixes for now to avoid ambiguity
in case a Message-ID ends with ".html", ".txt", ".mbox.gz" or
any other suffix we may use.
Static file compatibility is preserved by using a trailing slash
as most servers can/will fall back to an index.html file in this
case.
For raw text files, we will follow gmane's lead with "/raw".
|
|
Mboxes may be huge, so only support downloading gzipped mboxes
to save bandwidth and to get free checksumming.
Streaming output means we should not be wasting too much memory
on this unless the chosen server sucks.
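
A minimal sketch of streaming gzip output in Python, assuming the
messages arrive one at a time as bytes. The generator name `gzip_stream`
is hypothetical and this is not the project's implementation:

```python
import gzip
import zlib


def gzip_stream(messages):
    """Yield gzip-compressed chunks for an iterable of mbox messages.

    Only one message is held in memory at a time; the gzip trailer
    (CRC-32 and length) provides the "free" checksum.
    """
    # wbits=31 selects the gzip container instead of a raw zlib stream.
    z = zlib.compressobj(wbits=31)
    for msg in messages:
        chunk = z.compress(msg)
        if chunk:  # the compressor may buffer small inputs
            yield chunk
    yield z.flush()


msgs = [b"From a@example.com\n\nhello\n\n", b"From b@example.com\n\nworld\n\n"]
body = b"".join(gzip_stream(msgs))
roundtrip = gzip.decompress(body)
```

A server that supports streaming responses can write each chunk as it
is produced rather than joining them first.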
|
|
Some folks may not want to download and install Perl code like
ssoma, so allow downloading an mbox containing the entire
thread.
|
|
This fixes a minor test failure in t/cgi.t
Tested with perl 5.18.2-2ubuntu1 on Ubuntu 14.04.3 LTS
|
|
Do not repeat ourselves, just use the same description file
gitweb uses to avoid surprising users.
|
|
This is not a blog. All posts, whether replies or not,
carry equal weight.
|
|
It should be common for a single user to be subscribed to multiple
addresses/lists, so we must use the address before alias expansion.
This partially reverts commit b949afc9edf89dd494cac6255c78b124d58e11a5
|
|
The emergency destination may be Maildir. A Maildir emergency
destination is better for volatile data which is written to
and deleted-from frequently.
|
|
This allows WWW readers to slowly page through the entire history
of the mailing list.
|
|
CGI mounts should probably handle this internally. We're reverting
this since it adds too much potential for abuse with fake/extra
prefixes in the URL. We also need to reorder our redirect handling
as a result.
This reverts commit c394de9f2c91c2c5ed1f7832a5a7cc0206120b7f.
|
|
We do not have all messages in the top-level index
(and we need to adjust the test while we're at it).
|
|
It is common to type upper-level URLs without the slash;
redirect users to the correct page for usability.
|
|
Not sure what I was thinking...
|
|
MIDs may have strange characters in them, so we need to handle
escaping/unescaping properly to avoid broken links or worse.
|
|
We may have something like /foo.cgi/m/$MID.html in there.
|
|
This makes it easier to configure for systems which
determine a script is a CGI script based on suffix.
|
|
These need better tests and verification, but it's something
for now.
|
|
Code should be consistent with the design docs
(and we will need better tests).
|
|
This prevents ambiguity when switching URLs between static
file servers and CGI.
The /$LISTNAME/index.html URL appearing in the wild is inevitable
because of our static file server support. Worse yet, there's
no easy/consistent way to get all installations to detect and 301
them to the shorter /$LISTNAME/. So we make the CGI support
/$LISTNAME/index.html.
The downside of this is the potential duplicate entry in all caches.
|
|
This is essential when telling people to use something like:
curl $URL | git am
|
|
Remove the specified /all.html while we're at it; we only have
/all.atom.xml because it's convenient for feed readers.
|
|
We should be able to wire up the rest, soon.
|