Date | Commit message (Collapse) |
|
While the {inflight} array should be tied to the IO object even
more tightly, that's not an easy task with our current code. So
take some small steps by introducing a gcf_inflight helper to
validate the ownership of the process and to drain the inflight
array via the awaitpid callback.
This hopefully fix problems with t/lei-q-save.t (still) hanging
occasionally on v2 outputs since git->cleanup/->DESTROY was getting
called in v2 shard workers.
|
|
The long-term plan is to share non-blocking read buffering logic
with HTTP/NNTP/IMAP/POP3 and also XapClient.
|
|
Ensure we can ->fail properly from other subs we can within
Gcf2Client. This doesn't fix the test failures on CentOS 7.x,
but tries to make it easier to fix underlying problems and
report OOM errors and other things which the test suite doesn't
touch on.
|
|
This fixes two major problems with the use of tie for filehandles:
* no way to do fcntl, stat, etc. calls directly on the tied handle,
forcing callers to use the `tied' perlop to access the underlying
IO::Handle
* needing separate classes to handle blocking and non-blocking I/O
As a result, Git->cleanup_if_unlinked, InputPipe->consume,
and Qspawn->_yield_start have fewer bizzare bits and we
can call `$io->blocking(0)' directly instead of
`(tied *$io)->{fh}->blocking(0)'
Having a PublicInbox::IO class will also allow us to support
custom read buffering which allows inspecting the current state.
|
|
Since we deal with pipes (of either direction) and bidirectional
stream sockets for this class, it's better to remove the `Pipe'
from the name and replace it with `IO' to communicate that it
works for any form of IO::Handle-like object tied to a process.
|
|
While forked processes inherit from the parent, exec-ed
processes need the `-w' flag passed to them. To determine
whether or not we should pass them, we must check the `$^W'
global perlvar, first.
We'll also favor `perl -e' over `perl -E' in places where
we don't rely on the latest features, since `-E' incurs
slightly more startup time overhead from loading feature.pm
(while `perl -Mv5.12' does not).
|
|
Asking callers to pass a scalar reference is awkward and
doesn't benefit modern Perl with CoW support. Unlike
some constant error messages, it can't save any allocations
at all since there's no constant strings being passed to
libgit2.
|
|
Instead of using ->requeue to emulate level-triggered wakeups in
userspace, just use level-triggered wakeups in the kernel to
save some user time at the expense of system (kernel) time. Of
course, the ready list implementation in the kernel via C is
faster than a Perl one on our end.
We must still use requeue if we've got buffered data, however.
Followup-to: 1181a7e6a853 (listener: switch to level-triggered epoll)
|
|
The benefit of 1MB potential pipe buffer size (on Linux) doesn't
seem noticeable when reading from git (unlike when writing to v2
shards), so Unix stream sockets seem fine, here.
This allows us to simplify our process management by using the
same socket FD for reads and writes and enables us to use our
ProcessPipe class for reaping (as we can do with Gcf2Client).
Gcf2Client no longer relies on PublicInbox::DS for write
buffering, and instead just waits for requests to complete
once the number of inflight requests hits the MAX_INFLIGHT
threshold as we do with PublicInbox::Git.
We reuse the existing MAX_INFLIGHT limit (18) that was
determined by the minimum allowed PIPE_BUF (512). (AFAIK) Unix
stream sockets have no analogy to PIPE_BUF, but all *BSDs and
Linux I've checked have default SO_RCVBUF and SO_SNDBUF values
larger than the previously-required PIPE_BUF size of 512 bytes.
|
|
This is a trivial change compared to Qspawn in the previous
commit.
|
|
We need to abort both check-only and cat requests when
aborting, since we'll be aborting more aggressively
in in read-write paths.
|
|
Since signalfd is often combined with our event loop, give it a
convenient API and reduce the code duplication required to use it.
EventLoop is replaced with ::event_loop to allow consistent
parameter passing and avoid needlessly passing the package name
on stack.
We also avoid exporting SFD_NONBLOCK since it's the only flag we
support. There's no sense in having the memory overhead of a
constant function when it's in cold code.
|
|
While Gcf2Client is designed to mimic what git-cat-file writes
to stdout, its request format is different to support requests
with a git repository path included.
We'll highlight the distinction and make the GitAsyncCat support
code easier-to-follow as a result.
Since Gcf2Client relies on DS, we can rely on DS-specific code
here, too, and use a single Unix socket instead of separate
input and output pipes, reducing memory overhead in both users
and kernel space. Due to the interactive nature of requests and
responses, the buffer size limitations of Unix sockets on Linux
seems inconsequential here (just like it is for existing "git
cat-file --batch" use).
|
|
Using "make update-copyrights" after setting GNULIB_PATH in my
config.mak
|
|
Objects with DESTROY callbacks get propagated to children, so we
must be careful to not invoke waitpid from children on their
sibling processes. Only parents (and their parents...) can reap
child processes.
|
|
This simplifies our code and provides a more consistent API for
error handling. PublicInbox::DS can be loaded nowadays on all
*BSDs and Linux distros easily without extra packages to
install.
The downside is possibly increased startup time, but it's
probably not as a big problem with lei being a daemon
(and -mda possibly following suite).
|
|
We don't want to leave Xapcmd waitpid(-1, ...) call to hit it.
|
|
For historical reasons, both Danga::Socket::write and
PublicInbox::DS::write will return 0 when data is buffered;
so Gcf2Client must not call ->fail when DS::write returns 0.
We'll also improve robustness by recreating the entire
Gcf2Client object if it does die for other reasons, instead of
risking mismatched fields due to deferred close.
We also need to ensure we only get one EPOLLERR wakeup and
issue EPOLL_CTL_DEL if ->event_step is triggered by a dying
Gcf2 process, so always register the FD with EPOLLONESHOT.
|
|
It seems easiest to have a singleton Gcf2Client client object
per daemon worker for all inboxes to use. This reduces overall
FD usage from pipes.
The `public-inbox-gcf2' command + manpage are gone and a `$^X'
one-liner is used, instead. This saves inodes for internal
commands and hopefully makes it easier to avoid mismatched
PERL5LIB include paths (as noticed during development :x).
We'll also make the existing cat-file process management
infrastructure more resilient to BOFHs on process killing
sprees (or in case our libgit2-based code fails on us).
(Rare) PublicInbox::WWW PSGI users NOT using public-inbox-httpd
won't automatically benefit from this change, and extra
configuration will be required (to be documented later).
|
|
This amortizes the cost of recreating PublicInbox::Gcf2 objects
when alternates change in v2 all.git.
|
|
Hopefully this allows others to more quickly figure out what's
going on.
|
|
Since we only get OIDs from trusted local data sources
(over.sqlite3), we can safely retry within the -gcf2 process
without worry about clients spamming us with requests for
invalid OIDs and triggering reopens.
|
|
This should be able to replace multiple `git cat-file' for blob
retrieval, but adjustments may be needed.
|