Date | Commit message |
|
We will prefer URLs without suffixes for now to avoid ambiguity
in case a Message-ID ends with ".html", ".txt", ".mbox.gz" or
any other suffix we may use.
Static file compatibility is preserved by using a trailing slash
as most servers can/will fall back to an index.html file in this
case.
For raw text files, we will follow gmane's lead with "/raw"
|
|
Less scrolling is more efficient.
|
|
Less code is easier to manage, although we make a few extra
hash insertions now.
|
|
This doesn't seem needed for actual server use, but Plack tests
complain about it
|
|
This fixes a regression introduced in
commit 72c0f7c71ff28de9755dc4aee8b6ce6f0e4f2ed7
(feed: merge subjects regardless of "[PATCH vN]")
|
|
part_type still contains the filename, unfortunately, so
PGP signatures were truly stripped. Oh well, nobody cares
to verify PGP signatures anyways.
|
|
This normalizes rerolled patches with identical topics,
but does not normalize different patches even if they are
in the same thread (for now).
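
A minimal sketch of this kind of normalization, written in Python for
illustration and not in the project's Perl. The helper name
normalize_subject and the exact prefix rules are assumptions, not the
code from this commit: it drops leading "Re:" and "[PATCH ...]" tags so
that rerolls of one topic compare equal, while different patches keep
different keys.

```python
import re

# Hypothetical helper (not taken from the project): strip any run of
# leading "Re:" prefixes and "[PATCH ...]" tags from a subject line.
_PREFIX = re.compile(r'^\s*(?:re:\s*|\[PATCH[^\]]*\]\s*)+', re.IGNORECASE)

def normalize_subject(subject):
    """Return the subject with reply and patch-reroll prefixes removed."""
    return _PREFIX.sub('', subject).strip()

if __name__ == '__main__':
    # Rerolls of the same topic collapse to one key.
    print(normalize_subject('[PATCH v2] feed: merge subjects'))
    print(normalize_subject('Re: [PATCH] feed: merge subjects'))
    # A different patch in the same thread keeps its own key.
    print(normalize_subject('[PATCH 2/2] view: fix links'))
```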
|
|
We ought to summarize subjects to avoid exploding
line lengths in the web interface.
|
|
This is necessary since Xapian may not be installed and
we may hide a lot of errors this way.
|
|
Consistently name mid_* functions as verbs.
|
|
Many of our internal search queries do not care about relevance,
but are used for proper thread displays.
|
|
Using hash means we no longer have to document and remember what
every field does. The original array form was insane premature
optimization and crazy. Who wrote that? Oh wait, I was on
drugs :<
|
|
Relying on Email::MIME means encoding is handled transparently
for us.
|
|
The root message-ID may be too long to compare. Instead,
check fields based on the consistency of our DB.
|
|
This is to match what Mail::Thread and our own search
rely on. However, we will be more lenient on spaces.
|
|
Dereference header_obj only once when performance may be
critical, or simplify our code by calling "header" directly on
the Email::{Simple,MIME} object if not.
|
|
We must preserve the umask for the entirety of the indexing
operation, as Xapian transactions replace entire files
atomically instead of writing them in place.
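
The idea can be sketched as a scope that holds the umask for a whole
operation and restores it afterwards. This is an illustration in Python,
not the project's Perl; with_umask is a hypothetical name. Because files
may be created at any point up to the final commit, the mask has to stay
in place until the operation is completely finished.

```python
import os
from contextlib import contextmanager

@contextmanager
def with_umask(mask):
    """Apply `mask` for the whole block, then restore the previous umask."""
    old = os.umask(mask)  # os.umask() sets the new mask, returns the old one
    try:
        yield
    finally:
        os.umask(old)     # restore even if the block raises

if __name__ == '__main__':
    with with_umask(0o027):
        # Every file created here, including replacements written at
        # commit time, is created under the same mask.
        pass
```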
|
|
Email::Address::name never fails assuming it was able to parse
anything.
|
|
Extend the purpose of core.sharedRepository to apply to
the $GIT_DIR/public-inbox/xapian* directory.
|
|
public-inbox git repositories require a "HEAD" ref to
function correctly anyways.
|
|
There is no need to perform string appends when the
"read" and "sysread" functions take an offset argument
to append to the given buffer.
This avoids needless string creation.
|
|
Commenting it in the From: line seems appropriate and
reduces compatibility problems in case a MUA cannot handle
trailing comments after the timestamp.
|
|
This redundantly quotes ">From " lines to prevent losing information,
as described by qmail.
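
A minimal sketch of this reversible quoting (commonly called "mboxrd"),
in Python for illustration and not in the project's Perl. The function
names are hypothetical. Any body line that starts with zero or more ">"
followed by "From " gains one more ">", so unquoting can strip exactly
one ">" and recover the original text.

```python
import re

# Matches "From ", ">From ", ">>From ", ... at the start of any line.
_FROM_LINE = re.compile(rb'^(>*From )', re.MULTILINE)
_QUOTED_FROM_LINE = re.compile(rb'^>(>*From )', re.MULTILINE)

def mboxrd_quote(body):
    """Prefix every (possibly already quoted) "From " line with one ">"."""
    return _FROM_LINE.sub(rb'>\1', body)

def mboxrd_unquote(body):
    """Remove exactly one ">" from every quoted "From " line."""
    return _QUOTED_FROM_LINE.sub(rb'\1', body)

if __name__ == '__main__':
    raw = b'From the start\n>From a reply\nnot From here\n'
    quoted = mboxrd_quote(raw)
    print(quoted.decode())
    print(mboxrd_unquote(quoted) == raw)
```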
|
|
This improves compatibility and allows individual messages
to be concatenated into an existing mbox without further
modifications. "git format-patch" does something similar
(but does not do "From " line escaping(!))
|
|
To reduce clutter, we will not link to uncompressed versions.
Users should be able to download entire threads for offline
reading; enable this feature for them.
|
|
Some folks may want to view the mbox inline as a string of raw text,
when guessing URLs. Let them do this...
|
|
Most of our special query functions require exact matches, so none
of the flags we normally use are necessary for query parsing.
|
|
In case there are huge threads, readers should know about them
even though we currently lack the navigation to display them.
|
|
Less code should be easier to read.
|
|
This makes organization easier and reduces the amount of code
loaded for a PSGI, mod_perl or CGI instance.
|
|
Perl seems to warn incorrectly about this; work around it.
|
|
We will attempt to generate Atom feeds "by hand" as the
XML::Atom::SimpleFeed API does not support streaming output.
Since email is large and servers are small, this should prevent
wasting memory when we generate larger feeds.
Of course, we hope clients use SAX parsers capable of handling
large streams without slurping.
|
|
Hopefully this saves us some memory with CoW on *nix.
|
|
Otherwise folks won't get downloadable mboxes
|
|
We may not support this after all; CGI.pm is already
legacy-enough and far more powerful.
|
|
This should allow progressive rendering on the client and reduce
memory usage on the server. Unfortunately XML::Atom::SimpleFeed
does not yet support streaming, so we may not use it in the
future.
|
|
This is hopefully less ambiguous, as the word "count" confused
me, too.
|
|
These are not necessary anymore.
|
|
Mboxes may be huge, so only support downloading gzipped mboxes
to save bandwidth and to get free checksumming.
Streaming output means we should not be wasting too much memory
on this unless the chosen server sucks.
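
A minimal sketch of streaming gzip output, in Python for illustration
and not in the project's Perl. gzip_stream is a hypothetical name. Each
message is compressed as it arrives, so memory use stays bounded no
matter how large the mbox is, and the gzip trailer carries the CRC32
that provides the "free checksumming".

```python
import zlib

def gzip_stream(chunks, level=6):
    """Yield gzip-compressed output for an iterable of byte strings."""
    # wbits=31 (16 + 15) selects the gzip container: header plus
    # CRC32 and length trailer.
    z = zlib.compressobj(level, zlib.DEFLATED, 31)
    for chunk in chunks:
        out = z.compress(chunk)
        if out:            # the compressor may buffer and return nothing
            yield out
    yield z.flush()        # emit remaining data and the trailer

if __name__ == '__main__':
    msgs = [b'From a@example.com\n', b'Subject: hi\n\nhello\n']
    data = b''.join(gzip_stream(msgs))
    print(len(data), 'compressed bytes')
```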
|
|
Since mbox is usually downloaded, support fetching infinitely large
responses via streaming.
|
|
Some folks may not want to download and install Perl code like
ssoma, so allow downloading an mbox containing the entire
thread.
|
|
It's a bit disconcerting to jump to the authorship line.
|
|
This also avoids incorrectly incrementing $part_nr when
we skip a part due to bad Content-Type.
|
|
Oops!
|
|
We need proper ordering of References to thread messages
correctly. We would lose this order if we load the terms
from the database, so set it directly in the document data.
Do not bother with a separate In-Reply-To, since Mail::Thread
just merges the IRT into References. This bumps our schema
version once again.
|
|
This is for consistency with ssoma. I doubt it makes
a difference in practice, but in case somebody decides
any of the Message-ID-containing headers should have
strange characters, we'll decode and attempt to thread
them. This isn't an attack vector, just a way to
make messages thread improperly which is pointless...
|
|
|
|
We may not be using subject_path after all.
|
|
Oops
|
|
Threading in Xapian is mostly supported by now, so start
documenting things.
|