Date | Commit message |
|
While `$argv[-1]' is `undef' on an empty @argv, using `$argv[-1]'
as a subroutine argument would fail incorrectly with:
Modification of non-creatable array value attempted, subscript -1 at ...
...even though we'd never attempt to modify @_ itself in the
subroutines being called. Work around the bug (tested on
5.16.3) by passing `undef' explicitly when `$argv[-1]' is
already `undef'.
Reported-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Link: https://public-inbox.org/meta/20210927124056.kj5okiefvs4ztk27@meerkat.local/
|
|
The "-w" perlop always succeeds as root, so we need to check
st_mode for writability bits to detect directories we shouldn't
write to.
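A minimal sketch of the st_mode check described above, in Python rather
than the project's Perl. The helper name `has_write_bits' is
hypothetical, and testing all three write bits (owner, group, other) is
an assumption; the message only says the writability bits are checked.

```python
import os
import stat

# Any of the owner/group/other write bits; which bits the real code
# inspects is an assumption made for this sketch.
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH

def has_write_bits(path):
    """Return True if st_mode has at least one write bit set.

    Unlike an access(2)-style check, this gives the same answer for
    root and for unprivileged users, since it only reads the mode.
    """
    return bool(os.stat(path).st_mode & WRITE_BITS)
```

A directory that has had "chmod a-w" applied is reported as
non-writable here even when the caller is root.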
Reported-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Link: https://public-inbox.org/meta/20210927124056.kj5okiefvs4ztk27@meerkat.local/
|
|
Instead of passing the prefix section and key separately, pass
them together as is commonly done with git-config(1) usage as
well as our ->get_all API. This inconsistency in the get_1 API
is a needless footgun and confused me a bit while working on
"lei up" the other week.
|
|
More switches which can be useful for users who pipe from text
editors. --drq can be helpful while writing patch review email
replies, and perhaps --dequote-only, too.
|
|
lei rediff is expected to see partial patch fragments and such,
so silence warnings when something isn't exactly a valid email
message.
|
|
Only the ->message_string method of Mail::IMAPClient uses it,
and we have no intention of using ->message_string outside
of tests.
|
|
Only the top-level lei-daemon will do inotify/kevent.
|
|
This saves us some memory for the hash slot in the common case
where the `cloneurl' file doesn't exist.
|
|
When combining lines from To: and Cc: headers, ", " needs to be
used to separate them.
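As an illustration only (Python, not the project's Perl), joining the
header values with ", " keeps the combined line a valid address list.
The function name `combine_recipients' and the example addresses are
hypothetical.

```python
from email.utils import getaddresses

def combine_recipients(*header_values):
    """Join non-empty To:/Cc: header values into one address list."""
    parts = [v.strip() for v in header_values if v and v.strip()]
    return ", ".join(parts)

def count_addresses(combined):
    """Count the addresses a standard parser sees in the combined value."""
    return len(getaddresses([combined]))
```

For example, combining "a@example.com, b@example.com" (To:) with
"c@example.com" (Cc:) gives a single line that parses as three
addresses.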
|
|
This allows users to search /all/ from the top-level WwwListing
without extra manual steps, although there's still extra network
roundtrips incurred.
No vertical whitespace is added, and there's no clumsy radio
buttons nor menus to deal with. Users only have to use a
different <input type=submit /> button. I forgot how to do this
until I realized we already do something similar with multiple
submit buttons for threaded vs non-threaded mboxrd.gz downloads.
Link: https://public-inbox.org/meta/20210827120845.29682-1-e@80x24.org/
|
|
The note-event worker may see changes before a Xapian shard
commit happens, meaning keyword lookups fail as a result.
Just emit the request to the lei/store worker since it's a
fairly cheap operation at this point.
We'll try harder to look for kw changes, too, since
deduplication changes may lead to multiple docids being
resolved for a single message.
|
|
`undef' entries still take up a slot in the hash table, and
cause the `exists' check to false-positive in ->cleanup_shards.
This should fully fix the (innocuous) messages introduced in
commit 63d7b8ce (daemons: revamp periodic cleanup task, 2021-09-23)
|
|
This allows us to avoid creating ibx->{search}->{xdb} at this
spot by using an `undef' value. This is a step towards
eliminating the innocuous "/path/to/inboxdir/xap15 has no shards"
messages introduced in commit 63d7b8ce (daemons: revamp
periodic cleanup task, 2021-09-23)
|
|
This was written before we had auto-loading and is rarely used.
|
|
This was also written before we had auto-loading and is rarely used.
|
|
This was written before we had auto-loading, and forget-external
should be a rarely-used command that's not worth loading at
startup. Do some golfing while we're in the area, too.
|
|
Since switching to SOCK_SEQPACKET, we no longer have to use
fixed-width records to guarantee atomic reads. Thus we can
maintain more human-readable/searchable PktOp opcodes.
Furthermore, we can infer the subroutine name in many cases
to avoid repeating ourselves by specifying a command-name
twice (e.g. $ops->{CMD} => [ \&CMD, $obj ]; can now simply be
written as: $ops->{CMD} => [ $obj ] if CMD is a method of
$obj).
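A rough sketch of both ideas, in Python rather than Perl: a
sequenced-packet socket hands back one whole record per read, so
opcodes can vary in length, and a handler entry without an explicit
callback can fall back to a method named after the opcode. The
space-delimited record layout, the `Worker' class, and the `dispatch'
and `demo' helpers are all assumptions made for illustration, and
AF_UNIX SOCK_SEQPACKET is assumed to be available (as it is on Linux).

```python
import socket

class Worker:
    """Hypothetical receiver; its method name doubles as the opcode."""
    def __init__(self):
        self.seen = []

    def progress(self, *args):
        self.seen.append(("progress", args))

def dispatch(ops, record):
    """Split a record into opcode and args, then call its handler."""
    opcode, *args = record.split(" ")
    entry = ops[opcode]
    if callable(entry[0]):
        func, *bound = entry      # explicit form: [func, obj, ...]
        return func(*bound, *args)
    obj, *bound = entry           # short form: [obj, ...]
    return getattr(obj, opcode)(*bound, *args)

def demo():
    parent, child = socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET)
    worker = Worker()
    ops = {"progress": [worker]}  # no callback given; inferred from opcode
    with parent, child:
        child.send(b"progress 10")
        child.send(b"progress 2000")  # different length, separate record
        for _ in range(2):
            # Each recv() returns exactly one record, never a partial one.
            dispatch(ops, parent.recv(4096).decode())
    return worker.seen
```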
|
|
I'm not sure what caused it, but $err was undef and caused print
to fail, leading to an event loop error. Guard the timer with
an eval and assume warn() can't trigger an event loop failure.
|
|
If the event loop fails, we want blocking waitpid (wait4) calls
to be interruptible with SIGTERM via "kill $PID" rather than
SIGKILL. Though a failing event loop is something we should
avoid...
|
|
Sometimes a user (e.g. me) isn't really sure what timezone
they're in...
|
|
There may still be pre-manifest.js.gz versions of
PublicInbox::WWW running and serving v2 inboxes.
While -clone and "add-external --mirror" were working, -fetch
was failing due to a 301 redirect to $INBOX_URL/manifest.js.gz/
and not the expected 404. Update the code to deal with a JSON
decode error (from the 301) and ensure v2 epochs detection is
correct (and not using a shadowed variable).
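A hedged sketch of treating a decode failure as "no manifest", in
Python rather than Perl. The name `decode_manifest' is hypothetical,
and the assumption that the body is gzip-compressed JSON is inferred
from the file name, not stated above.

```python
import gzip
import json
import zlib

def decode_manifest(body):
    """Return the decoded manifest, or None if the body isn't one.

    An HTML page served after a 301 redirect fails either the gzip or
    the JSON step; both are treated the same as a missing manifest so
    the caller can fall back to another detection method.
    """
    try:
        return json.loads(gzip.decompress(body).decode("utf-8"))
    except (OSError, EOFError, ValueError, zlib.error):
        return None
```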
|
|
This makes it easier for users to enable fetching on a
previously read-only epoch. Prior to this change, users were
required to delete manifest.js.gz in addition to adding the
writable bit. Now, they just have to "chmod +w $EPOCH_DIR".
|
|
There may still be pre-manifest.js.gz versions of PublicInbox::WWW
running and serving v2 inboxes.
Since $INBOX_URL/manifest.js.gz was not understood, it was
assumed to be a Message-ID and 301-ed to
"$INBOX_URL/manifest.js.gz/" with a trailing slash, so our 404
checks were invalid. Update our fallbacks to deal with 301
by catching JSON decoding errors to trigger HTML scraping.
For HTML parsing, be sure to not be fooled by potential
user-generated content and only scan the part after the last
<hr>.
We also need to avoid propagating $? from curl unnecessarily
when we can continue safely.
Finally, update v2mirror.t with tests to use PublicInbox::WWW
from our "v1.1.0-pre1" tag to ensure these code paths get tested.
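The "only scan the part after the last <hr>" rule can be sketched as
follows, in Python rather than Perl. The helper name
`hrefs_after_last_hr' and the idea of collecting href attributes are
assumptions; the message does not say what exactly is extracted.

```python
import re

HREF_RE = re.compile(r'href="([^"]+)"')

def hrefs_after_last_hr(html):
    """Return href values found only after the final <hr> in a page.

    Anything before the last <hr> may be user-generated content
    (message bodies), so it is ignored entirely.
    """
    _, sep, tail = html.rpartition("<hr>")
    if not sep:
        return []  # no <hr> at all: trust nothing
    return HREF_RE.findall(tail)
```

A link planted in a message body ahead of the footer is never
returned, since only the tail of the page is searched.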
|
|
We need to check every epoch for writability, so don't
break out of the loop when we find a URL.
|
|
Partial (v2) clones should be a useful addition for users wanting
to conserve storage while having fast access to recent messages.
Continuing work started in 876e74283ff3 (fetch: ignore
non-writable epoch dirs, 2021-09-17), this creates bare,
read-only epoch git repos. These git repos have their remotes
pre-configured, but no objects are fetched.
The goal is to allow users to set the writable bit on a
previously-skipped epoch and start fetching it.
Shell completion support may not be necessary given how short
the epoch ranges are, here.
Cc: Luis Chamberlain <mcgrof@kernel.org>
Link: https://public-inbox.org/meta/20210917002204.GA13112@dcvr/T/#u
|
|
It's probably least confusing for user-facing messages to
display times in the user's configured timezone. I considered
appending "UTC" to the message and sticking with gmtime(), too,
but this output isn't intended to be web-cache friendly, nor do
we expect users across multiple timezones to view the same
output.
|
|
It helps to be consistent and reduce the learning curve, here.
|
|
It's possible for the rename() sequence to cause read-only
daemons using ->xdb_shards_flat to load an incomplete set of
contiguous shards and get invalid docids for search results.
With this change, we favor the case where search is momentarily
unavailable rather than giving wrong results during the small
window where Xapcmd->commit_changes runs.
|
|
"Correct" meaning the permissions match that of the parent
xap15 or ei15 directory.
|
|
public-inbox-init sets umask for git <2.1.0, so our fork+exec
replacement needs to restore the original umask of the "parent".
|
|
Neither Inboxes nor ExtSearch objects were retrying correctly
when there are live git processes, but the inboxes were getting
rescanned for search or other reasons. Ensure the scan retries
eventually if there's live processes.
We also need to update the cleanup task to detect Xapian shard
count changes, since Xapian ->reopen is enough to detect any
other Xapian changes. Otherwise, we just issue an inexpensive
->reopen call and let Xapian check whether there's anything
worth reopening.
This also lets us eliminate the Devel::Peek dependency.
|
|
Check for unlinked mmap-ed files via /proc/$PID/maps every 60s
or so.
ExtSearch (extindex) is compatible-enough with Inbox objects to
be wired into the old per-inbox code, but the startup cost is
projected to be much higher down the line when there's >30K
inboxes, so we scan /proc/$PID/maps for deleted files before
unlinking. With old Inbox objects, it was (and is) simpler to
just kill processes w/o checking due to the low startup cost
(and non-portability of checking).
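A minimal sketch of the /proc/$PID/maps scan, in Python rather than
Perl. It relies on the Linux convention of appending " (deleted)" to
the pathname of a mapping whose file has been unlinked; the function
name `deleted_mappings' is hypothetical.

```python
DELETED_SUFFIX = " (deleted)"

def deleted_mappings(maps_text):
    """Return pathnames of file-backed mappings marked as deleted.

    Each line of /proc/PID/maps has the fields:
    address perms offset dev inode [pathname]
    """
    found = []
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) == 6 and fields[5].endswith(DELETED_SUFFIX):
            found.append(fields[5][:-len(DELETED_SUFFIX)])
    return found

def self_has_deleted_mappings():
    """Check the current process; only meaningful where /proc exists."""
    with open("/proc/self/maps") as fh:
        return bool(deleted_mappings(fh.read()))
```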
Reported-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Link: https://public-inbox.org/meta/20210921144754.gulkneuulzo27qbw@meerkat.local/
|
|
Redundant code is noise and therefore confusing :<
|
|
We shouldn't dispatch all outputs right away since they
can be expensive CPU-wise. Instead, rely on DESTROY to
trigger further redispatches.
This also fixes a circular reference bug for the single-output
case that could lead to a leftover script/lei after MUA exit.
I'm not sure how --jobs/-j should work when the actual xsearch
and lei2mail have their own parallelism ("--jobs=$X,$M"), but
it's better than having thousands of subtasks running.
Fixes: b34a267efff7b831 ("lei up: fix --mua with single output")
|
|
A few dozen bytes saved here can add up when we have thousands
of inboxes. It also makes Data::Dumper debug output a bit cleaner.
|
|
The bit about reap_compress is no longer true since
LeiXSearch->query_done triggers it, instead. I only noticed
this while working on "lei up".
|
|
This fixes the occasional t/lei-sigpipe.t infinite loop
under "make check-run".
Link: http://nntp.perl.org/group/perl.perl5.porters/258784
<CAHhgV8hPbcmkzWizp6Vijw921M5BOXixj4+zTh3nRS9vRBYk8w@mail.gmail.com>
Followup-to: b552bb9150775fe4 ("daemon+watch: fix localization of %SIG for non-signalfd users")
|
|
It's needless noise and misleads users reading "ps" into
thinking there's more workers when there's only one.
|
|
There's a chance some sensitive information (e.g. folder names)
can end up in errors.log, though $XDG_RUNTIME_DIR or
/tmp/lei-$UID/ will have 0700 permissions, anyways.
|
|
Sometimes it's useful to pause an expensive query or
refresh-mail-sync to do something else. While lei-daemon and
lei/store can't be paused since they're shared across clients,
per-invocation WQ workers can be paused safely using the
unblockable SIGSTOP.
While we're at it, drop the ETOOMANYREFS hint since it
hasn't been a problem since we drastically reduced FD passing
early in development.
|
|
Avoid slurping gigantic (e.g. 100000) result sets into a single
response if a giant limit is specified, and instead use 10000
as a window for the mset with a given offset. We'll also warn
and hint towards the --limit= switch when the estimated
result set is larger than the default limit.
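The windowing arithmetic can be sketched like so, in Python rather
than Perl. The generator name `mset_windows' is hypothetical; a real
caller would presumably also stop early once a window comes back with
fewer results than requested.

```python
def mset_windows(limit, offset=0, window=10000):
    """Yield (offset, size) pairs that cover `limit` results in windows."""
    remaining = limit
    while remaining > 0:
        size = min(window, remaining)
        yield offset, size
        offset += size
        remaining -= size
```

With a limit of 25000 this yields three requests: two full windows of
10000 and a final one of 5000.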
|
|
I wanted to try --dedupe=none for something, but it failed
since I forgot --no-save :x So hint users towards --no-save
if necessary.
|
|
It's needless noise in syslogs for daemons and unnecessarily
alarming to users on the command-line.
|
|
Overwriting existing destinations is safe (but slow) by default,
so show a progress message noting what we're doing while
a user waits.
|
|
"lei export-kw" no longer completes for anonymous sources.
More commands use "lei refresh-mail-sync" as a basis for their
completion work, as well.
";AUTH=ANONYMOUS@" is stripped from completions since it was
preventing bash completion from working on AUTH=ANONYMOUS IMAP
URLs. I'm not sure if there's a better way, but all of our code
works fine without specifying AUTH=ANONYMOUS as a command-line
arg.
Finally, we fall back to using more candidates if none can
be found, allowing multiple URLs to be completed.
|
|
NNTP URLs are probably more prevalent in public message archives
than IMAP URLs.
|
|
If lcat-ing multiple argument types (blobs vs folders),
maintain the original order of the arguments instead of
dumping all blobs before folder contents.
|
|
We can set opt->{quiet} for (internal) 'note-event' command
to quiet ->qerr, since we use ->qerr everywhere else. And
we'll just die() instead of setting a ->{fail} message, since
eval + die are more in line with the rest of our Perl code.
|
|
NNTP servers, IMAP servers, and various MUAs may recycle
"unique" identifiers due to software bugs or careless BOFHs.
Warn about them, but always be prepared to account for them.
|
|
No reason not to support them, since there's more
public-inbox-nntpd instances than -imapd instances,
currently.
|