Date | Commit message (Collapse) |
|
We can't compare created_at times with lei if lei tests are
skipped due to Inline::C or Socket::MsgHdr unavailability.
Reported-by: Jörg Rödel <joro@8bytes.org>
Link: https://public-inbox.org/meta/YZebmAxlFJy4lqAw@8bytes.org/
|
|
0.1s may not be enough for a task switch and inotify wakeup,
so try doubling it and see if it fixes test reliability, for
now. A future change may be to implement a watcher/tracer
for inotify -> lei/store events.
Link: https://public-inbox.org/meta/20211104134327.zrf5jijfz7dsvb7l@meerkat.local/
|
|
I don't expect this to be hit in real-world use via normal
interactive shells. However, somebody could accidentally add
"\n" in languages (e.g. Perl, C) where it's easy to pass "\n"
in argv[].
|
|
`"' (double-quote) needs to be quoted for stdin searches.
We also need to differentiate between "lei q --stdin" usage
when calling "lei up", do it by setting an internal "rawstr"
knob to ensure we can parse the config properly regardless
of whether the initial search used --stdin or not.
|
|
And improve reliability while we're at it. It seems closing a
TCP listen socket on FreeBSD 12.2 doesn't cause connect()-ing
clients to fail. This happens regardless of whether a socket is
IPv4 or IPv6
This non-failure was causing tests to timeout slowly on the
client side instead of failing immediately. We now fork a new
process which does nothing but accept() + shutdown() to emulate
a dead server.
Reliability improves on all OSes since there's never a point in
time when another process can bind the socket.
|
|
I noticed a description for a new inbox had st_mode=0600.
|
|
Xapian boolean terms rely on upper-case prefixes, so the terms
themselves need to be all lowercase.
|
|
SIGPIPE and SIGTERM are common and user-induced, so they're
not worth warning on. Add the value of "$?", though, since
it can help users notice other errors (e.g. SIGSEGV).
|
|
It's not much of a savings, right now, but maybe it can be in the
future. I wanted to eliminate the "lei convert" one, too, but
convert needs to preserve keywords which isn't possible with the
generic fallback, so new tests were written for convert, instead.
|
|
I just got a difficult-to-reproduce failure, here; so there's
still some issues with the up-to-dateness of the inotify watcher.
|
|
This easily allows us to treat "git diff" output as header-less
"messages" for commands such as "lei p2q".
|
|
|
|
By using the charset specified in the message, web browsers are
more likely to display the raw text properly for human readers.
Inspired by a patch by Thomas Weißschuh:
https://public-inbox.org/meta/20211024214337.161779-3-thomas@t-8ch.de/
Cc: Thomas Weißschuh <thomas@t-8ch.de>
|
|
|
|
|
|
The use of array-returning built-ins such as `grep' inside
arrayref declarations appears to result in permanently allocated
scratchpad space for caching according to my malloc inspector.
Thread skeletons get discarded every response, but multiple
skeletons can exist in memory at once, so do what we can to
prevent long-lived allocations from being made, here.
In other words, replacing constructs such as:
my $foo = [ grep(...) ];
with:
my @foo = grep(...);
Seems to ensure the mortality of the underlying array.
|
|
Since we want to store IMAP flags asynchronously and not wait
for results, we can't check for IMAP errors this way and end up
wasting bandwidth on public-inbox-imapd. Now, we just check
PERMANENTFLAGS up front to ensure a folder can handle IMAP flag
storage before proceeding.
|
|
We need to use a non-greedy regexp to avoid capturing the
".git" suffix in the pathname before blindly appending our
own.
|
|
Otherwise things can get noisy if bad entries exist in that
file, because they do.
|
|
Instead of:
lei forget-search $OUTPUT && rm -r $OUTPUT
we'll also allow a user to do:
rm -r $OUTPUT && lei forget-search --prune
This gives users flexibility to choose whatever flow
is most natural to them.
|
|
When the gmtime() calls were moved from feed_entry() and atom_header()
into feed_updated() in c447bbbd, @_ rather than a scalar was passed to
gmtime(). As a result, feed <updated> values end up as
"1970-01-01T00:00:00Z".
Switch back to using a scalar argument to restore the correct
timestamps.
Fixes: c447bbbddb4ac8e1 ("wwwatomstream: simplify feed_update callers")
|
|
One syscall is better than two for atomicity in Maildirs. This
means there's no window where another process can see both the
old and new file at the same time (link && unlink), nor a window
where we might inadvertantly clobber an existing file if we were
to do `stat && rename'.
|
|
I got one mysterious test failure here, once, and can't seem
to reproduce it...
|
|
While it doesn't matter to us, the Maildir spec specifies
characters are to be sorted in alphabetical order.
|
|
Maybe these will help track down some failures and make
diagnosing bugs easier. "lei export-kw" should also become
optional, even, so allow disabling it easily in the test.
|
|
Malicious clients may attempt HTTP request smuggling this way.
This doesn't affect our current code as we only look for exact
matches, but it could affect other servers behind a
to-be-implemented reverse proxy built around our -httpd.
This doesn't affect users behind varnish at all, nor the
HTTPS/HTTP reverse proxy I use (I don't know about nginx), but
could be passed through by other reverse proxies.
This change is only needed for HTTP::Parser::XS which most users
probably use. Users of the pure Perl parser (via
PLACK_HTTP_PARSER_PP=1) already hit 400 errors in this case,
so this makes the common XS case consistent with the pure Perl
case.
cf. https://www.mozilla.org/en-US/security/advisories/mfsa2006-33/
|
|
For odd messages with reused Message-IDs, the second message
showing up in a mirror (via git-fetch + -index) should never
clobber an entry with a different blob in over.
This is noticeable only if the messages arrive in-between
indexing runs.
Fixes: 4441a38481ed ("v2: index forwards (via `git log --reverse')")
|
|
Running tests over a non-interactive ssh session fails,
otherwise.
|
|
There's no savings in having two ways to add watches to an
inotify nor kqueue descriptor.
|
|
If lei up and edit-search work on something, so should forget-search.
|
|
When importing several sources in parallel via http(s) mboxrd,
we need to be able to get keywords of uncommitted documents
directly from shard workers. Otherwise, Xapian DocNotFound
errors happen because the read-only LeiSearch won't see
documents from uncomitted transactions. Keep in mind that it's
possible the keywords can be changed on-the-fly even for
uncommitted documents because of inotify watches from LeiNoteEvent.
|
|
Inbox->xdb does not exist, but this code path was apparently
never tested :x I noticed this on basic v2 inbox, but it could
happen with any v1/v2 inbox. Move ->num2docid into Search
so it's less awkward to use.
|
|
This test wasn't finished when I initially wrote it :x
|
|
No point in testing use_ok when we have no outside dependencies
nor exports in this case.
|
|
Oops, we shouldn't attempt to read a users' actual HOME when
running -index, since mine has a bunch of invalid entries in
there.
|
|
grokmirror 2.x seems to idle in several places for 5s at-a-time,
causing t/www_listing.t to take longer than "make check-run" on
a 4-core system when run without grokmirror. So make it
optional but add some test knobs to allow tailing the log
output so I can see what's going on.
|
|
This allows IMAP mirrors to keep UIDVALIDITY synchronized (and
"LIST ACTIVE.TIMES" in NNTP). "lei add-external --mirror" will
automatically set it, as will the combination of
public-inbox-clone + public-inbox-index.
This avoids the need for extra endpoints or config entries,
at least...
|
|
The original Msgmap->new API was v1-specific and not necessary.
The ->new_file API now supports an $ibx object being passed to
it, simplify -no_fsync use. It will also make an upcoming
change easier...
|
|
The cost of opening a Xapian DB (even with shards) isn't high,
so save some FDs and just close it. We hit Xapian far less than
over.sqlite3 and we discard the MSet ASAP even when streaming
large responses.
This simplifies our code a bit and hopefully helps reduce
fragmentation by increasing mortality of late allocations.
|
|
Xapian::QueryParser is attached to the Xapian::Database,
so holding onto the QueryParser was preventing us from
releasing DB handles if a query was performed.
|
|
When unref-ing a blob from xref3, make sure the "preferred"
smsg->{blob} doesn't point to the blob we just unrefed. This
is necessary because we periodically checkpoint our extindex
process to allow -watch and -mda processes to run.
This also gets rid of a lot of redundant code for ->remove_xref3,
since it's all handled in ExtSearchIdx, now.
|
|
This mode only checks history for missed/stale messages
and doesn't attempt to reindex messages which are already
indexed.
|
|
This should help us catch BUG: errors (and then some) in
-extindex and other read-write code paths. Only read-only
daemons should warn on async callback failures, since those
aren't capable of causing data loss.
|
|
Some code paths may use maximum size checks, so ensure
any checks are waited on, too.
|
|
This lets administrators reindex specific time ranges
according to git "approxidate" formats. These arguments
are passed directly to underlying git-log(1) invocations
and may still reach into old epochs.
Since these options rely on git committer dates (which we infer
from the most recent Received: header), they are not guaranteed
to be strictly tied to git history and it's possible to
over/under-reindex some messages. It's probably not a major
problem in practice, though; reindexing a few extra messages
is generally harmless aside from some extra device wear.
Since this currently relies on git-log, these options do not
affect -extindex, yet.
|
|
Making them immortal doesn't seem worth it, since doing immortal
allocations after process startup leads to fragmentation. While
the allocations made by highlight are small, those small
allocations can break up contiguous regions and prevent
consolidation by the malloc implementation.
Since instantiating code generators doesn't seem too expensive,
just use and delete them ASAP.
|
|
Unlike v1 inboxes (which don't accept duplicate Message-IDs at
all), and v2 inboxes (which generate a new Message-ID for
duplicates), extindex must accept duplicate Message-IDs as-is.
This was fine for storage, but prevented the reference-cycle
mechanism of our message threading display algorithm from working
reliably. It could no longer delete the ->{parent} field from
clobbered entries in the %id_table.
So we now take into account reused Message-IDs and never clobber
entries in %id_table. Instead, we mark reused Message-IDs as
"imposters" and special-case them by injecting them as children
after all other threading is complete.
This cycle was noticed using a pre-release of Devel::Mwrap::PSGI:
https://80x24.org/mwrap-perl.git
|
|
We only use it if Mail::Thread is available, and often it's not.
|
|
This fixes inspect for uninitialized instances, and adds Xapian
("xdoc") output if available.
Reported-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Message-ID: <20211001204943.l4yl6xvc45c5eapz@meerkat.local>
|
|
Since signalfd is often combined with our event loop, give it a
convenient API and reduce the code duplication required to use it.
EventLoop is replaced with ::event_loop to allow consistent
parameter passing and avoid needlessly passing the package name
on stack.
We also avoid exporting SFD_NONBLOCK since it's the only flag we
support. There's no sense in having the memory overhead of a
constant function when it's in cold code.
|