path: root/lib/PublicInbox/SolverGit.pm
Date Commit message
2019-04-04 support publicinbox.cgitrc directive
We can save admins the trouble of declaring [coderepo "..."] sections in the public-inbox config by parsing the cgitrc directly. Macro expansion (e.g. $HTTP_HOST) is not supported yet, but may be in the future.
2019-04-04 viewvcs: preliminary support for showing non-blobs
Eventually, we'll have special displays for various git objects (commit, tree, tag). But for now, we'll just use git-show to spew whatever comes from git.
2019-02-05 solvergit: include the $oid_want tmpdir name
This can help admins diagnose problems with SolverGit, since qspawn logs the failed "git apply" command-line in stderr. (or it can waste admins' time because sometimes there's crap mail clients which mangle patches)
2019-01-31 solvergit: allow shorter-than-necessary OIDs from user
We can rely on git to disambiguate, here; because sometimes shorter OIDs can be unambiguous even if we only resolved the longer one.
2019-01-31 solvergit: allow searching on longer-than-needed OIDs
public-inbox can only index the abbreviated object_ids in emails, not the full or even longer-than-necessary object_ids. So retry failed object_ids if they're longer than 7 hex characters.
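The retry described above can be sketched as follows. This is a minimal illustration in Python, not the Perl in SolverGit.pm; `find_with_retry`, `lookup`, and the `OID_MIN` constant are hypothetical names, and dropping one hex character per retry is an assumption about how the shortening might proceed.

```python
OID_MIN = 7  # assumption: shortest abbreviation worth searching for


def find_with_retry(oid, lookup):
    """Search for `oid`, dropping trailing hex digits on each miss.

    `lookup` is a hypothetical callable that returns a result for an
    indexed abbreviation, or None when nothing matches.
    Returns (matched_abbreviation, result), or None if every retry fails.
    """
    while True:
        hit = lookup(oid)
        if hit is not None:
            return oid, hit
        if len(oid) <= OID_MIN:
            return None  # nothing shorter is worth trying
        oid = oid[:-1]  # retry with one fewer hex character


# Usage: only the 7-character abbreviation is "indexed" here.
index = {"abcdef1": "patch for abcdef1"}
result = find_with_retry("abcdef123456", index.get)
```

A full-length object ID from the user thus still reaches the abbreviated form that appeared in the email.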
2019-01-30 solvergit: don't confuse Xapian with ".." in filenames
Xapian will interpret ".." as ranges, even in quoted phrases. So break up words on ".." since punctuation (AFAIK) is not searchable, anyways.
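A minimal sketch of the word-splitting idea, written in Python for illustration only. The function name `filename_terms` is hypothetical, and how the resulting pieces are combined into a Xapian query is left out, since the log does not describe it.

```python
import re


def filename_terms(path):
    """Split a filename on runs of two or more dots.

    The pieces can then be searched as separate words, so the query
    parser never sees a ".." that it could read as a range operator.
    Empty pieces (from a leading or trailing "..") are dropped.
    """
    return [piece for piece in re.split(r"\.{2,}", path) if piece]


terms = filename_terms("a..b/c.txt")
```

Single dots, such as the one in a file extension, are left alone; only the range-like ".." is treated as a separator.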
2019-01-30 git: use "git rev-parse --git-path"
Using git worktrees was causing t/solver_git.t to fail on me.
2019-01-30 solvergit: deal with alternative diff prefixes
At least, without extra directory levels, since git-diff supports --src-prefix and --dst-prefix, and /git/6aa8857a11/s/ uses it...
2019-01-30 solvergit: extract mode from diff headers properly
grep() won't set $1, so use "=~", instead.
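For readers unfamiliar with where the mode lives in a diff header, here is a rough Python sketch of capturing it with an explicit regex match. It does not reproduce the Perl fix itself; `extract_mode` and both patterns are assumptions based on the usual git diff header lines ("new file mode", "new mode", and the trailing mode on an "index" line).

```python
import re

# Assumed header shapes, e.g. "new file mode 100644" or "new mode 100755".
MODE_RE = re.compile(r"^(?:new file mode|new mode) ([0-7]{6})$", re.M)
# Assumed shape of an index line that carries a mode: "index aaa..bbb 100644".
INDEX_RE = re.compile(r"^index [0-9a-f]+\.\.[0-9a-f]+ ([0-7]{6})$", re.M)


def extract_mode(header):
    """Return the file mode found in a diff header, or None if absent."""
    match = MODE_RE.search(header) or INDEX_RE.search(header)
    return match.group(1) if match else None


mode = extract_mode(
    "diff --git a/x b/x\nindex abc1234..def5678 100644\n--- a/x\n+++ b/x\n"
)
```

The point mirrored from the commit is that the capture group must come from an actual match operation on the line, not from a filtering construct that discards it.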
2019-01-30 solvergit: avoid "Wide character" warnings
Just quiet Perl down, since we don't know or care about the encoding of the patch we hand off to git-apply.
2019-01-30 solvergit: do not show full path names to "git apply"
"git apply" will warn about whitespace with the full path of the patch, which will expose the $TMPDIR environment to users over HTTP(S). This change breaks compatibility with git pre-1.8.5, again; but that was released in late-2013; so hopefully everybody is on newer versions.
2019-01-29 solvergit: do not solve blobs twice
In some cases, a file may ping-pong between blob IDs in the same message when reverts occur. So break out of this early. This doesn't account for different abbreviations, but the limited variations of abbreviations should alleviate the problem.
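The early exit can be pictured as a simple "seen" set. The sketch below is Python for illustration, with the hypothetical name `unique_wanted`; it compares abbreviations as exact strings, so it shares the limitation noted above that differing abbreviations of the same blob are not recognised as duplicates.

```python
def unique_wanted(oids):
    """Return blob IDs in first-seen order, skipping exact repeats."""
    seen = set()
    todo = []
    for oid in oids:
        if oid in seen:
            continue  # already queued; a revert brought us back here
        seen.add(oid)
        todo.append(oid)
    return todo


# A file that ping-pongs between two blobs is queued only once per blob.
todo = unique_wanted(["abc1234", "def5678", "abc1234"])
```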
2019-01-27 solver: crank up max patches to 9999
Might as well, since the only constraint is filesystem space for temporary files for public-inbox-httpd users. -httpd can fairly share work across clients with our use of psgi_qx; and there's a recent patch series in git@vger with 64 patches in sequence.
2019-01-27 solver: reduce "git apply" invocations
"git apply" is capable of applying multiple patches in one invocation, so give it multiple patches on the command-line now that we no longer rely on anonymous file handles to hold patches. This cuts down a 64-patch series on git@vger from ~1s to ~800ms with vfork spawn enabled using Inline::C.
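As a rough sketch of batching, the Python helper below builds one "git apply" command line for several patch files. The helper name `build_apply_argv` is hypothetical, and the choice of `--cached` and `--ignore-whitespace` is an assumption drawn from other entries in this log (operating on the index, and tolerating CRLF), not a statement of the exact flags SolverGit passes. Patch names are kept relative, on the assumption that the command runs with the temporary directory as its working directory.

```python
def build_apply_argv(patch_names):
    """Build a single `git apply` argv covering every patch, in order.

    Only bare file names are accepted, so no temporary directory path
    ends up on the command line.
    """
    if not patch_names:
        raise ValueError("no patches to apply")
    for name in patch_names:
        if "/" in name:
            raise ValueError("expected a bare file name: %r" % name)
    return ["git", "apply", "--cached", "--ignore-whitespace", *patch_names]


argv = build_apply_argv(["0001", "0002", "0003"])
```

One process start then covers the whole series instead of one per patch.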
2019-01-27 solver: hold patches in temporary directory
We can avoid bumping up RLIMIT_NOFILE too much by storing patches in a temporary directory. And we can share this top-level directory with our temporary git repository. Since we no longer rely on a working-tree for git, we are free to rearrange the layout and avoid relying on the ".git" convention and relying on "git -C" for chdir. This may also ease porting public-inbox to older systems where git does not support "-C" for chdir.
2019-01-26 solver: rewrite to use Qspawn->psgi_qx and pi-httpd.async
The psgi_qx routine in the now-abandoned "repobrowse" branch allows us to break down blob-solving at each process execution point. It reuses the Qspawn facility for git-http-backend(1), allowing us to limit parallel subprocesses independently of Perl worker count. This is actually 2-3% slower than a fully-synchronous execution; but it is fair to other clients as it won't monopolize the server for hundreds of milliseconds (or even seconds) at a time.
2019-01-20 solver: remove extra "^index $OID..$OID" line
It was harmless, besides wasting space and memory.
2019-01-20 solver: force quoted-printable bodies to LF
..if the Email::MIME ->crlf is LF. Email::MIME::Encodings forces everything to CRLF on quoted-printable messages for RFC-compliance; and git-apply --ignore-whitespace seems to miss a context line which is just "\r\n" (w/o leading space).
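A minimal Python sketch of the normalisation, for illustration only. `normalize_eol` and its `want_crlf` parameter are hypothetical stand-ins for the message's own line-ending preference; the real change works on Email::MIME objects in Perl.

```python
def normalize_eol(body, want_crlf=False):
    """Convert CRLF line endings to LF unless CRLF is actually wanted.

    This keeps a blank context line from reaching the patch step as a
    bare "\r\n".
    """
    if want_crlf:
        return body
    return body.replace("\r\n", "\n")


decoded = "context\r\n\r\n+added line\r\n"
cleaned = normalize_eol(decoded)
```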
2019-01-20 solver: restore diagnostics and deal with CRLF
Apparently Email::MIME returns quoted-printable text with CRLF. So use --ignore-whitespace with git-apply(1) and ensure we don't capture '\r' in pathnames from those emails. And restore "$@" dumping when we die while solving.
2019-01-20 solver: add a TODO note about making this fully evented
Applying a 100+ patch series can be a pain and lead to a wayward client monopolizing the connection. On the other hand, we'll also need to be careful and limit the number of in-flight file descriptors and parallel git-apply processes when we move to an evented model, here.
2019-01-20 solver: note the synchronous nature of index preparation
It's not likely to be worth our time to support a callback-driven model for something which happens once per patch series.
2019-01-20 solver: break @todo loop into a callback
This will allow each patch search via Xapian to "yield" the current client in favor of another client in the PSGI web interface for fairness.
2019-01-20 solver: simplify control flow for initial loop
We'll be breaking this up into several steps, too, since searching inboxes for patch blobs can take tens of milliseconds for me.
2019-01-20 solver: switch patch application to use a callback
A bit messy at the moment, but we need to break this up into smaller steps for fairness with other clients, as applying dozens of patches can take several hundred milliseconds.
2019-01-20 solver: break up patch application steps
We want more fine-grained scheduling for PSGI use, as the patch application step can take hundreds of milliseconds on my modest hardware.
2019-01-20 solver: more verbose blob resolution
Help users find out where each step of the resolution came from. Also, we must cleanly abort the process if we have missing blobs. And refine the output to avoid unnecessary braces, too.
2019-01-19 solver: operate directly on git index
No need to incur extra I/O traffic with a working-tree and uncompressed files on the filesystem. git can handle patch application in memory and we rely on exact blob matching anyways, so no need for 3way patch application.
2019-01-19 solver: various bugfixes and cleanups
Remove the make_path dependency and call mkdir directly. Capture mode on new files, avoid referencing non-existent functions and enhance the debug output for users to read.
2019-01-19 solver: initial Perl implementation
This will look up git blobs from associated git source code repositories. If the blobs can't be found, an attempt to "solve" them via patch application will be performed. Eventually, this may become the basis of a type-agnostic frontend similar to "git show".