Date | Commit message |
|
We'll always be transferring stdin, stdout, and stderr together
for lei. Perhaps I lack imagination or foresight, but I can't
think of a reason to send more or fewer FDs.
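
For illustration only, passing exactly three descriptors in one
message could be sketched with SCM_RIGHTS over a Unix socket as
below. The names `send_3fds', `recv_3fds' and `fd3_cmsg' are
hypothetical and the code is an assumption about the general
technique, not the implementation from this commit.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* control-message buffer sized for exactly three descriptors */
union fd3_cmsg {
	struct cmsghdr hdr; /* forces correct alignment of buf */
	char buf[CMSG_SPACE(sizeof(int) * 3)];
};

/* Send three FDs in a single SCM_RIGHTS message; returns 0 or -1. */
int send_3fds(int sock, int fd0, int fd1, int fd2)
{
	int fds[3] = { fd0, fd1, fd2 };
	char byte = 0; /* one byte of ordinary data accompanies the FDs */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd3_cmsg cmsg;
	struct msghdr msg;
	struct cmsghdr *c;

	memset(&cmsg, 0, sizeof(cmsg));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsg.buf;
	msg.msg_controllen = sizeof(cmsg.buf);

	c = CMSG_FIRSTHDR(&msg);
	c->cmsg_level = SOL_SOCKET;
	c->cmsg_type = SCM_RIGHTS;
	c->cmsg_len = CMSG_LEN(sizeof(fds));
	memcpy(CMSG_DATA(c), fds, sizeof(fds));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive exactly three FDs into fds[]; returns 0 or -1. */
int recv_3fds(int sock, int fds[3])
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd3_cmsg cmsg;
	struct msghdr msg;
	struct cmsghdr *c;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsg.buf;
	msg.msg_controllen = sizeof(cmsg.buf);

	if (recvmsg(sock, &msg, 0) != 1 || (msg.msg_flags & MSG_CTRUNC))
		return -1;
	c = CMSG_FIRSTHDR(&msg);
	if (!c || c->cmsg_level != SOL_SOCKET ||
	    c->cmsg_type != SCM_RIGHTS ||
	    c->cmsg_len != CMSG_LEN(sizeof(int) * 3))
		return -1; /* anything other than three FDs is rejected */
	memcpy(fds, CMSG_DATA(c), sizeof(int) * 3);
	return 0;
}
```

Fixing the count at three keeps the control buffer a
compile-time size and lets the receiver reject anything else.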
|
|
IO::FDPass may be an extra installation burden I don't want to
impose on users. We only support Linux and *BSDs, however.
|
|
To get rid of the ugly $PublicInbox::DS::in_loop localization
in MboxReader, we'll distinguish between ->CLOSE and ->DESTROY
with ProcessPipe.
If we end up closing via ->DESTROY, we'll assume the caller will
want to deal with $? asynchronously via the event loop (or not
even care about $?).
If we hit ->CLOSE directly, we'll assume the caller called
close() and wants to check $? synchronously.
Note: wantarray doesn't seem to propagate into tied methods,
otherwise I'd be relying on that.
|
|
Using "make update-copyrights" after setting GNULIB_PATH in my
config.mak
|
|
It seems like a more logical place for it, but we'll favor the
newly-added xsys_e() in tests for BAIL_OUT use.
|
|
fileno(DIRHANDLE) only works on Perl 5.22+, so we need to use
dirfd(3) ourselves from Inline::C (or rely on chattr(1) being
installed).
While we're at it, rename `set_nodatacow' to `nodatacow_fd'
for consistency with `nodatacow_dir'.
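
A Linux-only sketch of the dirfd(3) approach follows. Only the
names `nodatacow_fd' and `nodatacow_dir' come from this message;
the bodies, the btrfs check and the return value are my
assumptions about one reasonable way to do it, not necessarily
what the Inline::C code does.

```c
#include <dirent.h>
#include <sys/ioctl.h>
#include <sys/vfs.h>
#include <linux/fs.h>
#include <linux/magic.h>

/* Best-effort: set No_COW on an open FD, but only if it is on btrfs. */
void nodatacow_fd(int fd)
{
	struct statfs sfs;
	int attr = 0;

	if (fstatfs(fd, &sfs) != 0 || sfs.f_type != BTRFS_SUPER_MAGIC)
		return; /* other filesystems are left alone */
	if (ioctl(fd, FS_IOC_GETFLAGS, &attr) != 0)
		return;
	attr |= FS_NOCOW_FL;
	(void)ioctl(fd, FS_IOC_SETFLAGS, &attr); /* failure is ignored */
}

/* Returns 0 if the directory could be opened, -1 (errno set) if not. */
int nodatacow_dir(const char *path)
{
	DIR *dh = opendir(path);

	if (!dh)
		return -1;
	nodatacow_fd(dirfd(dh)); /* dirfd(3) borrows the FD from dh */
	closedir(dh); /* closes the borrowed FD, too */
	return 0;
}
```

Setting the attribute on a directory only affects files created
in it afterwards, so this has to run before the files are written.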
|
|
v?fork failures seem to be the cause of locks not getting
released in -watch. Ensure lock release doesn't get skipped
in ->done for both v1 and v2 inboxes. We also need to do
everything we can to ensure DB handles, pipes and processes
get released even in the face of failure.
While we're at it, make failures around `git update-server-info'
non-fatal, since smart HTTP seems more popular anyways.
v2 changes:
- spawn: show failing command
- ensure waitpid is synchronous for inotify events
- teardown all fast-import processes on exception,
not just the failing one
- beef up lock_release error handling
- release lock on fast-import spawn failure
|
|
SQLite and Xapian files are written randomly, thus they become
fragmented under btrfs with copy-on-write. This leads to
noticeable performance problems (and probably ENOSPC) as these
files get big.
lore/git (v2, <1GB) indexes around 20% faster with this on an
ancient SSD. lore/lkml seems to be taking forever and I'll
probably cancel it to save wear on my SSD.
Unfortunately, disabling CoW also means disabling checksumming
(and compression), so we'll be careful to only set the No_COW
attribute on regeneratable data. We want to keep CoW (and
checksums+compression) on git storage because current ref
storage is neither checksummed nor compressed, and git streams
pack output.
|
|
We no longer use writev(2) in pi_fork_exec to emit errors.
|
|
parent.pm is smaller than base.pm, and we'll also move
towards relying on `-w' (or not) to toggle process-wide
warnings during development.
|
|
Making the RLIMITS list a function doesn't allow constant
folding, so just make it an array accessible to other modules.
|
|
Subprocesses we spawn may want to use SIGCHLD for themselves.
This also ensures we restore default signal handlers
in the pure Perl version.
|
|
Older versions of Inline (e.g. 0.53 in CentOS 7) did not accept
the `directory' parameter, so use conditional assignment to set
a default value on $ENV{PERL_INLINE_DIRECTORY}, instead.
|
|
Despite several memory reductions and pure Perl performance
improvements, Inline::C spawn() still gives us a noticeable
performance boost.
More user-oriented command-line programs are likely coming,
setting PERL_INLINE_DIRECTORY is annoying to users, and so is
poor performance. So allow users to opt in to using our
Inline::C code once by creating a `~/.cache/public-inbox/inline-c'
directory.
XDG_CACHE_HOME is respected to override the location of ~/.cache
independent of HOME, according to
https://specifications.freedesktop.org/basedir-spec/0.6/ar01s03.html
v2: use "/nonexistent" if HOME is undefined, since that's
the home of the "nobody" user on both FreeBSD and Debian.
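
The lookup order described above can be sketched as follows. The
helper name `inline_c_dir' is hypothetical, and treating an empty
XDG_CACHE_HOME like an unset one is taken from the basedir spec
rather than from this commit.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: write the opt-in directory path into buf.
 * Returns 0 on success, -1 if buf is too small.
 */
int inline_c_dir(char *buf, size_t len)
{
	const char *xdg = getenv("XDG_CACHE_HOME");
	const char *home = getenv("HOME");
	int n;

	if (xdg && *xdg) {
		/* XDG_CACHE_HOME replaces ~/.cache, independent of HOME */
		n = snprintf(buf, len, "%s/public-inbox/inline-c", xdg);
	} else {
		if (!home)
			home = "/nonexistent"; /* home of "nobody" */
		n = snprintf(buf, len, "%s/.cache/public-inbox/inline-c",
			     home);
	}
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The caller would then only enable Inline::C if that directory
already exists, so nothing is created without the user opting in.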
|
|
Both the C and pure Perl implementations of `pi_fork_exec'
return `-1' on error, not `undef'.
|
|
I didn't wait until September to do it, this year!
|
|
Commit 9f5a583694396f84 ("spawn (and thus popen_rd) die on failure")
was incomplete in that it only removed error checking for spawn
failures for non-(vfork|fork) calls, but the actual (vfork|fork)
PID result could still be undef.
Fixes: 9f5a583694396f84 ("spawn (and thus popen_rd) die on failure")
|
|
Most spawn and popen_rd callers die on failure to spawn,
anyways, and some are missing checks entirely. This saves
us a bunch of verbose error-checking code in callers.
This also makes popen_rd more consistent, since it already
dies on pipe creation failures.
|
|
There's a bunch of leftover "require" and "use" statements we no
longer need and can get rid of, along with some excessive
imports via "use".
IO::Handle usage isn't always obvious, so add comments
describing why a package loads it. Along the same lines,
document the tmpdir support as the reason we depend on
File::Temp 0.19, even though every Perl 5.10.1+ user has it.
While we're at it, favor "use" over "require", since it gives
us extra compile-time checking.
|
|
Since vfork always shares memory between the child and parent,
we can propagate errors to the parent errno using shared memory
instead of just dumping to stderr and hoping somebody sees it.
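
A minimal sketch of the idea, assuming Linux or *BSD vfork
semantics where the child runs in the parent's address space
until it execs or exits (POSIX itself leaves writes in the child
undefined). The name `spawn_vfork' is hypothetical and this is
not the actual `pi_fork_exec' code.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the child PID, or -1 with errno set if vfork or execve fails. */
pid_t spawn_vfork(const char *path, char *const argv[], char *const envp[])
{
	volatile int cerrnum = 0; /* written by child, read by parent */
	pid_t pid = vfork();

	if (pid == 0) {
		execve(path, argv, envp);
		cerrnum = errno; /* only reached when execve failed */
		_exit(1);
	}
	if (pid > 0 && cerrnum) {
		int err = cerrnum;

		waitpid(pid, NULL, 0); /* reap the child that failed to exec */
		errno = err;
		return -1;
	}
	return pid; /* on vfork failure this is -1 with errno from vfork */
}
```

The parent is suspended until the child execs or exits, so by
the time it reads `cerrnum' the child has already stored its errno.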
|
|
This simplifies our admin module a bit and allows solver to be
used with v1 inboxes using git versions prior to v1.8.5 (but
still >= git v1.8.0).
|
|
We can save callers the trouble of {-hold} and {-dev_null}
refs as well as the trouble of calling fileno().
|
|
We can use "use" to get the namespace into the "BEGIN" phase of
the interpreter. While we're at it, use \&coderef syntax
explicitly instead of globbing everything.
|
|
It's unnecessary code which I'm not sure we ever used. In
retrospect, completely clearing the environment doesn't make
sense for the processes we spawn. We don't need to clobber
individual environment variables in our code, either
(and if we did for tests, we can use 'local').
|
|
This makes the subroutine behave more like the which(1) command
and will make using spawn() in tests easier.
|
|
Instead, the O_NONBLOCK flag is set by PublicInbox::HTTPD::Async;
and we won't be setting it elsewhere.
|
|
Noticed while testing on FreeBSD 11.2 amd64 with the optional
Inline::C extension using clang 6.0.0. The end result on
FreeBSD was that spawning processes failed badly and things were
immediately unusable with this enabled.
av_len is a misleading API, and I failed to read the API
comments in perl:/av.c which state:
> Note that, unlike what the name implies, it returns
> the highest index in the array, so to get the size of
> the array you need to use "av_len(av) + 1".
> This is unlike "sv_len", which returns what you would expect.
If this bug affected anybody, it would've only affected users
who both use the optional Inline::C module AND set the
PERL_INLINE_DIRECTORY environment variable.
That said, I've never seen any evidence of it on Debian
GNU/Linux + gcc on any x86 variant. That includes full 64-bit
systems, a full 32-bit system, a 64-bit system with 32-bit
userspace, across multiple gcc versions since 2016.
|
|
Our high-level config already treats single limits as a
soft==hard limit for limiters; so stop handling that redundantly
in the low-level spawn() sub.
|
|
This allows users to configure RLIMIT_{CORE,CPU,DATA} using
our "limiter" config directive when spawning external processes.
|
|
cgit (and most other CGI executables) is not typically installed
for use via $PATH, so we'll need to support absolute paths to
run it.
|
|
We'll be spawning cgit and git-diff, which can take gigantic
amounts of CPU time and/or heap given the right (ermm... wrong)
input. Limit the damage that large/expensive diffs can cause.
|
|
Using update-copyrights from gnulib.
While we're at it, use the SPDX identifier for AGPL-3.0+ to
ease mechanical processing.
|
|
fork failures are unfortunately common when Xapian has
gigabytes and gigabytes mmapped.
|
|
While we only want to stop our daemons and gracefully destroy
subprocesses, it is common for 'Ctrl-C' from a terminal to kill
the entire pgroup.
Killing an entire pgroup nukes subprocesses like git-upload-pack
and breaks graceful shutdown on long clones. Make a best effort to
ensure git-upload-pack processes are not broken when somebody
signals an entire process group.
Followup-to: commit 37bf2db81bbbe114d7fc5a00e30d3d5a6fa74de5
("doc: systemd examples should only kill one process")
|
|
We can't rely on absolute paths when installed on other
systems.
Unfortunately, mlmmj-* requires them, but none of the core
code will use it.
|
|
We cannot afford to fire Perl-level signal handlers in the
vforked child process since they're not designed to run in
the child like that.
Thus we need to block all signals before calling vfork, reset
signal dispositions in the child, and restore the signal mask in
the parent.
ref: https://ewontfix.com/7
|
|
This makes for better compile-time checking and also helps
document which calls are private for HTTP and NNTP.
While we're at it, use IO::Handle::* functions procedurally,
too, since we know we're working with native glob handles.
|
|
We can rely on timely auto-destruction based on reference
counting, reducing the chance of redundant close(2) calls
which may hit the wrong FD.
We do care about certain close calls (e.g. writing to a buffered
IO handle) if we require error-checking for write-integrity. In
other cases, let things go out-of-scope so they can be freed
automatically after use.
|
|
This is necessary since we want to be able to do arbitrary redirects
via the popen interface. Oh well, we'll be a little slower for now
for users without vfork. vfork users will get all the performance
benefits.
|
|
We must stash the error correctly when nesting evals, oops :x
|
|
This should reduce overhead of spawning git processes
from our long-running httpd and nntpd servers.
|
|
Under Linux, vfork maintains constant performance as
parent process size increases. fork needs to prepare pages
for copy-on-write, requiring a linear scan of the address
space.
|