Date | Commit message (Collapse) |
|
One syscall is better than two for atomicity in Maildirs. This
means there's no window where another process can see both the
old and new file at the same time (link && unlink), nor a window
where we might inadvertently clobber an existing file if we were
to do `stat && rename'.
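
The two windows described above can be reproduced with a minimal Python sketch. This is not the project's Perl code, and the helper names are hypothetical. `move_link_unlink` never clobbers but briefly exposes both names, while `move_stat_rename` can clobber if another process creates the destination between the check and the rename.

```python
import os
import tempfile

def move_link_unlink(src, dst):
    """Two syscalls: link(2) fails with EEXIST rather than clobbering,
    but both names are visible until unlink(2) completes."""
    os.link(src, dst)    # raises FileExistsError if dst exists
    os.unlink(src)

def move_stat_rename(src, dst):
    """Two syscalls: another process may create dst after the check,
    and rename(2) will then silently replace it."""
    if os.path.exists(dst):
        raise FileExistsError(dst)
    os.rename(src, dst)

def demo():
    with tempfile.TemporaryDirectory() as d:
        a, b = os.path.join(d, "a"), os.path.join(d, "b")
        for path, data in ((a, "new"), (b, "old")):
            with open(path, "w") as fh:
                fh.write(data)
        try:
            move_link_unlink(a, b)
            refused = False
        except FileExistsError:
            refused = True
        os.rename(a, b)  # plain rename(2) clobbers without complaint
        with open(b) as fh:
            return refused, fh.read()
```

On Linux, renameat2(2) with RENAME_NOREPLACE is one single-syscall way to close both windows.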
|
|
We don't need to flood the terminal with "W: $oid is (!= blob)\n"
messages when somebody nukes a git cat-file process from under
us.
|
|
Simplify our APIs and force dwaitpid() to work in async mode for
all lei workers. This avoids having lingering zombies for
parallel searches if one worker finishes soon before another.
The old distinction between "old" and "new" workers was
needlessly complex, error-prone, and embarrassingly bad.
We also never handled v2:// writers properly before on
Ctrl-C/Ctrl-Z (SIGINT/SIGTSTP), so add them to @WQ_KEYS
to ensure they get handled by $lei when appropriate.
|
|
By relying more on pgroups for remaining processes,
this lets us pause all curl+tail subprocesses with a single
kill(2) to avoid cluttering stderr.
We won't bother pausing the pigz/gzip/bzip2/xz compressor
process nor cat-file processes, though, since those don't write
to the terminal and they idle soon after the workers react to
SIGSTOP.
AutoReap is hoisted out from TestCommon.pm. CLONE_SKIP
is gone since we won't be using Perl threads any time
soon (they're discouraged by the maintainers of Perl).
|
|
Just in case it fails when there are many parallel invocations.
|
|
Since switching to SOCK_SEQPACKET, we no longer have to use
fixed-width records to guarantee atomic reads. Thus we can
maintain more human-readable/searchable PktOp opcodes.
Furthermore, we can infer the subroutine name in many cases
to avoid repeating ourselves by specifying a command-name
twice (e.g. $ops->{CMD} => [ \&CMD, $obj ]; can now simply be
written as: $ops->{CMD} => [ $obj ] if CMD is a method of
$obj).
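
Both points can be sketched in Python, which is not the project's Perl; the `Worker` class, the `dispatch` helper and the `progress` opcode are hypothetical. A SOCK_SEQPACKET socket delivers each send as one record, so variable-length opcodes need no padding, and a dispatch entry without an explicit callable falls back to the method named after the opcode. AF_UNIX SOCK_SEQPACKET support is assumed (e.g. Linux).

```python
import socket

class Worker:
    def __init__(self):
        self.seen = []

    def progress(self, *args):
        self.seen.append(args)

def dispatch(ops, record):
    """Look up the opcode; when no callable is given, use the method
    of the same name on the object (the shorthand form)."""
    cmd, *args = record.decode().split(" ")
    entry = ops[cmd]
    if callable(entry[0]):
        func, *bound = entry
        return func(*bound, *args)
    obj, *bound = entry
    return getattr(obj, cmd)(*bound, *args)

def demo():
    # SOCK_SEQPACKET on AF_UNIX is assumed available (e.g. Linux)
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET)
    with a, b:
        w = Worker()
        ops = {"progress": [w]}           # shorthand: method inferred
        a.send(b"progress 1")
        a.send(b"progress 22 of 333")     # different length, same opcode
        for _ in range(2):
            dispatch(ops, b.recv(4096))   # one recv returns one record
        return w.seen
```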
|
|
I'm not sure what caused it, but $err was undef and caused print
to fail, leading to an event loop error. Guard the timer with
an eval and assume warn() can't trigger an event loop failure.
|
|
Overwriting existing destinations is safe (but slow) by default,
so show a progress message noting what we're doing while
a user waits.
|
|
This has several advantages:
* no need to use ipc.lock to protect a pipe for non-atomic writes
* ability to pass FDs. In another commit, this will let us
simplify lei->sto_done_request and pass newly-created
sockets to lei/store directly.
disadvantages:
- an extra pipe is required for rare messages over several
hundred KB; this is probably a non-issue, though
The performance delta is unknown, but I expect shards
(which remain pipes) to be the primary bottleneck IPC-wise
for lei/store.
|
|
Since we can't use maxuid for remote externals, automatically
maintaining the last time we got results and appending a dt:
range to the query will prevent HTTP(S) responses from getting
too big.
We could be using "rt:", but no stable release of public-inbox
supports it, yet, so we'll use dt:, instead.
By default, there's a two day fudge factor to account for MTA
downtime and delays, which is hopefully enough. The fudge
factor may be changed per-invocation with the
--remote-fudge-factor=INTERVAL option.
Since different externals can have different message transport
routes, "lastresult" entries are stored on a per-external basis.
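
The arithmetic can be sketched in Python (not the project's Perl). The function names, the `dt:YYYYMMDDhhmmss..` open-ended form and the `(query) AND ...` wrapping are assumptions for illustration; only the two-day default and the per-external storage come from the text above.

```python
import time

TWO_DAYS = 2 * 24 * 60 * 60  # default fudge factor from the text

def dt_range(lastresult, fudge=TWO_DAYS):
    """Return a "dt:" range starting `fudge` seconds before the
    last time results were received (epoch seconds, UTC)."""
    start = time.gmtime(max(lastresult - fudge, 0))
    return "dt:" + time.strftime("%Y%m%d%H%M%S", start) + ".."

def augment_query(query, lastresult_by_external, external):
    """Append the range only if this external has a stored entry."""
    last = lastresult_by_external.get(external)
    if last is None:
        return query
    # wrapping with AND is an assumed query syntax
    return f"({query}) AND {dt_range(last)}"
```

Keeping one entry per external means a slow mirror does not cause results from a faster one to be skipped.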
|
|
Since 44917fdd24a8bec1 ("lei_mail_sync: do not use transactions"),
relying on lei/store to serialize access was a pointless endeavor.
Rely on flock(2) to serialize multiple writers since (in my
experience) it's the easiest way to deal with parallel writers
when using SQLite. This allows us to simplify existing callers
while speeding up 'lei refresh-mail-sync --all=local' by 5% or
so.
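
A minimal Python sketch of the flock(2) pattern follows; it is not the project's Perl, and the file names and the `kv` table are hypothetical. Each writer takes an exclusive lock on a separate lock file for the duration of its write, so parallel writers queue up in the kernel instead of contending inside SQLite.

```python
import fcntl
import os
import sqlite3
import tempfile

def locked_insert(db_path, lock_path, value):
    """Hold an exclusive flock(2) while writing to the database."""
    with open(lock_path, "a") as lock_fh:
        fcntl.flock(lock_fh, fcntl.LOCK_EX)   # blocks until available
        try:
            con = sqlite3.connect(db_path)
            try:
                con.execute("CREATE TABLE IF NOT EXISTS kv (v TEXT)")
                con.execute("INSERT INTO kv (v) VALUES (?)", (value,))
                con.commit()
            finally:
                con.close()
        finally:
            fcntl.flock(lock_fh, fcntl.LOCK_UN)

def demo():
    with tempfile.TemporaryDirectory() as d:
        db = os.path.join(d, "sync.sqlite3")
        lock = os.path.join(d, "sync.lock")
        for v in ("a", "b"):
            locked_insert(db, lock, v)
        con = sqlite3.connect(db)
        try:
            return con.execute("SELECT COUNT(*) FROM kv").fetchone()[0]
        finally:
            con.close()
```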
|
|
When composing replies in "git format-patch" cover letters,
I'd been relying on "lei q -f text ...", but that still requires
several steps to make it suitable for composing a reply:
* s/^/> / to quote the body
* drop existing In-Reply-To+References
* s/^Message-ID:/In-Reply-To:/;
* add an attribution line
...
"lei q -f reply" takes care of most of that and users will
only have to trim "From " lines, unnecessary results and
over-quoted text (and trimming is likely less error-prone
than doing all the steps above manually).
This should also be a good replacement for
"git format-patch --in-reply-to=...", since copying long
Message-IDs can be error-prone (and this lets you include
quoted text in replies).
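
The manual steps listed above can be sketched in Python for a single message. This is an illustration only, not the actual "-f reply" output format; the `to_reply` name and the wording of the attribution line are assumptions.

```python
from email import message_from_string

def to_reply(raw):
    """Turn one raw message into a reply skeleton following the
    steps listed above."""
    msg = message_from_string(raw)
    lines = []
    mid = msg["Message-ID"]
    if mid:
        lines.append(f"In-Reply-To: {mid}")   # Message-ID becomes In-Reply-To
    # the original In-Reply-To and References are simply not copied
    lines.append("")
    lines.append(f"{msg['From']} wrote:")      # attribution line
    body = msg.get_payload()
    lines.extend("> " + ln for ln in body.splitlines())  # s/^/> /
    return "\n".join(lines) + "\n"
```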
|
|
It's a bit confusing to see "0 written to ..." when we actually
wrote something.
|
|
Since "lei up" is expected to be a heavily-used command,
better support for IMAP seems like a reasonable idea.
This is inefficient because we waste an IMAP(S) TCP connection:
it dies when an auth-only LeiUp worker process dies, but
it's better than not working at all, right now.
|
|
There's no need to alias net_merge_all in each WQ class
which uses LeiAuth, `$obj->$sub' works even when `$sub'
is a fully-qualified subroutine name with `::' in it.
perlobj(1) documents it under "Method Call Variations".
|
|
We may be handling invalid mboxes, so just return no objects in
that case. While "lei q" on HTTP(S) externals expects a gzipped
mboxrd, there's always a chance something else gzipped can be
sent to us.
There are also changes to lei_to_mail to better handle emails
which lack a body and/or headers (e.g. t/solve/bare.patch).
Link: https://public-inbox.org/meta/20210903151500.h72mzcpqixgtytjs@meerkat.local/
|
|
xt/net_writer-imap.t was completely broken in recent months and
I completely forgot this test. net->add_url still only accepts
bare scalars (and not scalar refs), so we must set that up
properly. Furthermore, our changes to do FLAGS-only
synchronization in lei of old messages were causing us to not
handle FLAGS properly for the test.
|
|
Since "lei up" is more often useful than not and incurs negligible
overhead, enable --save by default and allow --no-save to work.
This also fixes a long-standing bug when overwriting --output
destinations with saved searches: dedupe data from previous
searches is reset and no longer influences the new (changed)
search, so results no longer go missing if two sequential
invocations of "lei q --save" point to the same --output.
|
|
SQLite COUNT() is a slow operation that does a full table scan
with no conditions. There's no need for it, since lei dedupe
only needs to know if it's empty or not to decide between
new/ and cur/ for Maildir outputs.
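
The difference can be sketched with Python's sqlite3 module (not the project's Perl); the `dedupe` table and the `is_empty` helper are hypothetical. Asking for a single row lets SQLite stop at the first hit instead of visiting every row.

```python
import sqlite3

def is_empty(con, table):
    """True if `table` has no rows; stops at the first row found
    instead of counting every row."""
    # table is a trusted identifier here, not user input
    row = con.execute(f"SELECT 1 FROM {table} LIMIT 1").fetchone()
    return row is None

def demo():
    con = sqlite3.connect(":memory:")
    try:
        con.execute("CREATE TABLE dedupe (num INTEGER PRIMARY KEY)")
        before = is_empty(con, "dedupe")
        con.execute("INSERT INTO dedupe (num) VALUES (1)")
        return before, is_empty(con, "dedupe")
    finally:
        con.close()
```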
|
|
This will make it easier to use for internal use such as
managing Maildir and IMAP IDLE watches.
|
|
Saved searches rely on (reverse) docid ordering for efficient
incremental results, and sorting any other way prevents that.
Update comment description in LeiQuery while we're at it:
"ls-query" and "rm-query" are "ls-search" and "forget-search",
respectively, and "mv-query" is implicit with "edit-search"
|
|
lcat can now dump the memoized contents of entire IMAP folders,
not just a single UID. It's now parallelized and pipelined for
multiple lei2mail workers.
Furthermore, various forms of JSON output work consistently
with blob-only output, now.
While working on this, I noticed NetReader was passing UID URLs
to imap_each callbacks, which was causing mail_sync.sqlite3 to
store UIDs in `folders', which was clearly wrong, so it's now fixed.
|
|
lei->rel2abs doesn't resolve symlinks, which could cause
synchronization problems with export-kw or other commands.
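
The distinction can be shown with a short Python sketch, assuming lei->rel2abs is a purely lexical conversion as the message describes. Python's `os.path.abspath` plays that role here and `os.path.realpath` is the symlink-resolving counterpart; the directory names are hypothetical.

```python
import os
import tempfile

def demo():
    base = os.path.realpath(tempfile.mkdtemp())
    real = os.path.join(base, "Maildir")
    link = os.path.join(base, "mail-link")
    os.mkdir(real)
    os.symlink(real, link)
    try:
        # abspath only normalizes the string; realpath follows symlinks
        return (os.path.abspath(link) == real,
                os.path.realpath(link) == real)
    finally:
        os.unlink(link)
        os.rmdir(real)
        os.rmdir(base)
```

Two spellings of the same folder would otherwise be recorded as two different folders.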
|
|
This allows "lei-managed pseudo mailing lists" as described
by Konstantin.
Alternates use is optional and can be enabled via --shared.
This doesn't manage or edit ~/.public-inbox/config; presumably
there'll need to be some tweaking of search parameters before
finalizing and making the inbox publicly accessible via HTTP/NNTP.
Link: https://public-inbox.org/meta/20210426164454.5zd5kgugfhfwfkpo@nitro.local/T/
|
|
"lei import" can now import a single IMAP message via
<imaps://example.com/MAILBOX/;UID=$UID>
Likewise, "lei inspect" can show the blob information for UID
URLs and "lei lcat" can display the blob without network access
if imported.
"lei lcat" also gets rid of some unused code and supports
"blob:$OIDHEX" syntax as described in the comments (and used by
our "text" output format).
v2: enforce UID in URL, fail without
v3: fix error reporting (s/fail/child_error/)
|
|
Despite JMAP not supporting the equivalent of the IMAP \Recent
flag, it is useful for "lei q --augment" and "lei up" users to
be able to distinguish new results from old-but-unread messages
in an mbox or Maildir.
For mbox family messages, we'll drop the "O" status flag when
appending to mboxes, and we'll write to the "new" subdirectory
of Maildirs.
Behavior when writing to initially empty Maildirs and mboxes
remains unchanged since there's no need to distinguish between
new and old results in the initial case. Having users wait
for a rename(2) storm or complete mbox rewrite hurts UX.
With IMAP mailboxes, \Recent is already enforced by the IMAP
server and IMAP clients have no way of changing it(*)
(*) mutt uses the "Old" IMAP flag which isn't part of RFC 3501,
other MUAs may do similar things.
|
|
We support writing to IMAP stores in other places (just like
Maildir), and it's actually less complex for us to write to
IMAP. Neither usability nor performance is ideal, but usability
will be addressed in the next commit to relax CLI argument
checking.
Performance is poor due to the synchronous Mail::IMAPClient
API and will need to be addressed with pipelining sometime
further in the future.
|
|
This will give us more flexibility in the future w.r.t.
dealing with UIDVALIDITY and AUTH= info with IMAP. The LoC
reduction is welcome, too.
|
|
IMAP will eventually be supported.
|
|
This will make some of our tests faster and allow users to try
more features of lei without high storage requirements.
|
|
Since completely purging blobs from git is slow, users may wish
to index messages in Maildirs (and eventually other local
storage) without storing data in git.
Much code from LeiImport and LeiInput is reused, and a new dummy
FakeImport class supplies a non-storing $im->add to minimize
changes to LeiStore.
The tricky part of this command is to support "lei import"
after a message has gone through "lei index". Relying on
$smsg->{bytes} == 0 (as we do for external-only vmd storage)
does not work here, since it would break searching for "z:"
byte-ranges when not using externals.
This eventually required PublicInbox::Import::add to use a
SharedKV to keep track of imported blobs and prevent
duplication.
|
|
LeiToMail Maildir and IMAP write callbacks need to account for
the caller-supplied smsg. We'll also make better use of the
user-supplied smsg object by ensuring blob deduplication happens
ASAP.
Fixes: e76683309ca4f254 ("lei <q|up>: distinguish between mset and l2m counts")
|
|
This will allow keyword updates from other folders to propagate
to folders where search results may be duplicated.
|
|
We use the "done" term elsewhere for similar things, and
my easily-confused mind equates "complete" with shell
completion.
|
|
The number of messages we write to --output is usually different
than the mset count due to deduplication from combining multiple
sources.
This change makes the stderr output of "lei up --all=local" way
more useful IMHO.
|
|
Mail::IMAPClient provides the ability to pass a pre-connected
Socket to it. We can rely on this functionality to use
IO::Socket::Socks in place of whatever socket class
Mail::IMAPClient chooses to use.
The --proxy=s is shared with curl(1), though we only support
socks5h:// at the moment. Is there any need for SOCKS4 or SOCKS5
without name resolution? Tor .onions require socks5h:// for
name resolution and to prevent data leakage.
|
|
This is mainly for "lei lcat" where it's the default,
but I find it useful anyways compared to the JSON view.
Colors are loaded from ~/.config/lei/config, and fall back
to using diff colors from a normal git config
(e.g. ~/.gitconfig).
|
|
Since we don't have *at() syscalls readily available to us,
lei-daemon may call ->poke_dst in the wrong relative directory.
Despite not having *at() syscalls, we can still capture the
"$MAILDIR/cur" directory handle at pre_augment time so we can
reliably call futimes(2) on it using the `utime' perlop.
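
The same idea can be sketched in Python (not the project's Perl): open the directory once while the path is known to be correct, then update its timestamps through the descriptor regardless of the current working directory. Passing a descriptor to `os.utime` is assumed to be supported on the platform (see `os.supports_fd`); the timestamps are arbitrary.

```python
import os
import tempfile

def demo():
    start = os.getcwd()
    with tempfile.TemporaryDirectory() as d, \
            tempfile.TemporaryDirectory() as other:
        cur = os.path.join(d, "cur")
        os.mkdir(cur)
        os.utime(cur, (0, 0))             # make the mtime obviously stale
        fd = os.open(cur, os.O_RDONLY)    # capture the handle up front
        try:
            os.chdir(other)               # relative paths would now be wrong
            os.utime(fd, (1_000_000, 1_000_000))  # operates on the handle
            return int(os.stat(cur).st_mtime)
        finally:
            os.close(fd)
            os.chdir(start)
```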
|
|
Maildir and IMAP can both handle `forwarded'. Ensure we don't
lose `forwarded' when reading from stores which do not support
it, but ensure we can set it when reading from IMAP and Maildir
stores.
|
|
This makes "lei q --save" as safe as "lei q" to protect against
accidental data loss when clobbering an existing output.
|
|
NetReader->add_url supports URI-like objects, now. We'll be
relying on the canonicalization for LeiSavedSearch.
|
|
The command isn't finalized, yet, but it's intended to update
an existing saved search.
|
|
This will have an over.sqlite3 for content-based deduplication.
It may exhibit ibxish methods, so serving a read-only (or even
R/W) IMAP instance or displaying HTML isn't outside the realm
of possibility.
|
|
LeiSavedSearch will use a LeiDedupe-like internal API,
so we won't have to make as many changes to callsites
between saved and unsaved searches.
|
|
IMAP authentication info is only shared amongst lei2mail workers,
so we must ensure all IMAP writes go through lei2mail workers
even if we don't have to access the mail through git.
This allows us to decouple the latency of the remote mboxrd from
the latency of the IMAP --output at the expense of extra IPC
overhead within our own processes.
|
|
We don't need to waste LoC on corner cases, single-use internal
subs, or restoring SIG{__WARN__} when a process exits. All that
extra code contributes to memory use and startup time, especially
for users who can't use FD passing.
|
|
We'll eventually want lei_input users like "lei import" and
"lei tag" to support parallel reads.
|
|
We don't need to import so many things. None of the Errno
constants are in common paths so unlikely to benefit from
constant folding.
|
|
mbox and IMAP seem to have no way of describing this keyword,
but Maildir does with the "P" flag (for "passed").
|
|
We can tweak lse->kw_changed to return docids and reduce IPC
traffic and reduce work the lei/store worker needs to do.
|