path: root/t/git.t
Date Commit message
2023-10-01 t/git: show git_version in diag output
This helps confirm we're testing properly with git <= 2.35, so we don't break --batch-check support for those users.
2023-09-29 git: calculate MAX_INFLIGHT properly in Perl
Unlike C, Perl automatically converts quotients to double-precision floating point even with UV/IV numerators and denominators. So force the intermediate quotient to be an integer before multiplying it by the size of each inflight array element. This bug was inconsequential for all platforms since d4ba8828ab23f278 (git: fix asynchronous batching for deep pipelines, 2023-01-04) and inconsequential on most (or all?) Linux even before that due to the larger 4096-byte PIPE_BUF on Linux.
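The truncation issue described above can be sketched as follows. The buffer size, request length and slots-per-entry values are hypothetical and chosen only for illustration; they are not taken from the commit.

```perl
use strict;
use warnings;

# Hypothetical values for illustration only (not taken from the commit):
my $pipe_buf  = 512; # smallest PIPE_BUF that POSIX allows
my $req_len   = 65;  # e.g. 64 hex digits plus "\n"
my $per_entry = 3;   # array slots stored per in-flight request

# Perl's "/" yields a floating-point quotient, unlike C integer division.
my $naive = $pipe_buf / $req_len * $per_entry;

# Truncating the quotient first gives the C-like result.
my $fixed = int($pipe_buf / $req_len) * $per_entry;

printf "naive=%.2f fixed=%d\n", $naive, $fixed;
```

With these numbers the naive expression is no longer a multiple of the per-entry size, which is the kind of mismatch the commit guards against.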
2023-01-31 tests: make require_git and require_cmd easier-to-use
We'll rely on defined(wantarray) to implicitly skip subtests, and memoize these to reduce syscalls, since tests should be short-lived enough to not be affected by new installations or removals of git/xapian-compact/curl/etc...
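A minimal sketch of the two ideas in this message, context detection via defined(wantarray) and memoized lookups. The names `which_cached` and `need_cmd` are hypothetical, and `die` stands in for whatever skip mechanism the real test helpers use.

```perl
use strict;
use warnings;
use File::Spec;

my %cmd_cache; # memoized lookups: command name => path, or '' if absent

# Hypothetical helper: search PATH only once per command name.
sub which_cached {
    my ($cmd) = @_;
    $cmd_cache{$cmd} //= do {
        my ($found) = grep { -f $_ && -x _ }
            map { File::Spec->catfile($_, $cmd) } File::Spec->path;
        $found // '';
    };
}

# Hypothetical require_cmd-like sub: the caller's context picks the behavior.
sub need_cmd {
    my ($cmd) = @_;
    my $path = which_cached($cmd);
    return $path if $path ne '';
    # Scalar or list context: the caller inspects the result itself.
    return if defined(wantarray);
    # Void context: nobody checks the result, so bail out for the caller.
    die "SKIP: `$cmd' missing\n";
}
```

A caller that writes `my $curl = need_cmd('curl') or return;` handles absence itself, while a bare `need_cmd('curl');` skips implicitly.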
2022-10-24 another step towards git SHA-256 support
While SHA-256 isn't supported for inboxes yet, xt/git-http-backend.t now runs properly against a SHA-256 code repository.
2021-10-24 t/git: support non-master default branch
2021-10-23 git: simplify local_nick, avoid "foo.git.git"
We need to use a non-greedy regexp to avoid capturing the ".git" suffix in the pathname before blindly appending our own.
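The difference between the greedy and non-greedy capture can be sketched as below. Both subs are hypothetical stand-ins written for this illustration, not the actual local_nick code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for local_nick: derive "name.git" from a path.
sub nick_sketch {
    my ($path) = @_;
    # "+?" is non-greedy, so an existing ".git" suffix is left to the
    # optional group instead of being swallowed by the capture.
    $path =~ m!([^/]+?)(?:\.git)?\z! or return;
    "$1.git";
}

# The greedy variant captures "foo.git" and yields "foo.git.git".
sub nick_greedy {
    my ($path) = @_;
    $path =~ m!([^/]+)(?:\.git)?\z! or return;
    "$1.git";
}

print nick_sketch('/srv/git/foo.git'), "\n"; # foo.git
print nick_greedy('/srv/git/foo.git'), "\n"; # foo.git.git
```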
2021-10-13 t/git: avoid "once" warning for async_warn
No point in testing use_ok when we have no outside dependencies nor exports in this case.
2021-10-08 git: fatalize async callback errors by default
This should help us catch BUG: errors (and then some) in -extindex and other read-write code paths. Only read-only daemons should warn on async callback failures, since those aren't capable of causing data loss.
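One way to picture "fatal by default, warn only when asked" is the sketch below. The `run_async_cb` name and the `warn_only` option are hypothetical and are not the project's actual interface.

```perl
use strict;
use warnings;

# Hypothetical dispatcher: callback errors are fatal unless the caller
# opted into warnings (e.g. a read-only daemon).
sub run_async_cb {
    my ($opt, $cb, @args) = @_;
    return 1 if eval { $cb->(@args); 1 };
    my $err = $@ || 'unknown error';
    if ($opt->{warn_only}) {
        warn "async callback failed: $err";
        return 0;
    }
    die "async callback failed: $err";
}
```

A read-write code path would call `run_async_cb({}, $cb, @args)` and let the exception propagate, so a failed callback cannot be silently ignored.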
2021-10-08 git: use async_wait_all everywhere
Some code paths may use maximum size checks, so ensure any checks are waited on, too.
2021-09-10 t/git.t: quiet intentional git-rev-parse failure
It can get confusing, especially when running non-parallel "make test".
Link: https://public-inbox.org/meta/20210909210138.ssiv5tri65mf4l4o@meerkat.local/
2021-05-09 git: fix numerous bugs in git_quote and git_unquote
git always quotes with leading zeros to ensure the octal representation is 3 characters long. We enforce that to match low ASCII characters (e.g. [\x01-\x06]) that don't need the range provided by 3 characters. git_unquote now does a single pass so it won't get fooled by decoded backslashes into parsing a digit as an octal character. git_unquote is also capped to "\377" so we don't overflow a byte.
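The three properties named above (3-digit octal, single-pass decoding, a cap at "\377") can be sketched as follows. `quote_sketch` and `unquote_sketch` are hypothetical re-implementations that assume byte strings; they are not the project's git_quote and git_unquote.

```perl
use strict;
use warnings;

# Escapes shared by both directions (bytes, not characters, are assumed).
my %UNESC = (
    a => "\a", b => "\b", f => "\f", n => "\n", r => "\r",
    t => "\t", v => "\013", '"' => '"', '\\' => '\\',
);
my %ESC = reverse %UNESC; # byte => escape letter

# Hypothetical quoting sketch: always emit 3-digit octal for other bytes.
sub quote_sketch {
    my ($s) = @_;
    my $n = ($s =~ s{([\\"]|[^\x20-\x7e])}{
        exists $ESC{$1} ? "\\$ESC{$1}" : sprintf('\\%03o', ord($1))
    }ge);
    $n ? qq("$s") : $s;
}

# Hypothetical unquoting sketch: one pass over the string, so a decoded
# backslash is never re-read as the start of another escape.
sub unquote_sketch {
    my ($s) = @_;
    return $s unless $s =~ /\A"(.*)"\z/s;
    $s = $1;
    # [0-3][0-7]{2} caps octal escapes at \377 (one byte).
    $s =~ s{\\(?:([\\"abfnrtv])|([0-3][0-7]{2}))}{
        defined($1) ? $UNESC{$1} : chr(oct($2))
    }ge;
    $s;
}
```

The single substitution matters for input such as `"\\101"`: a two-pass decoder would first produce `\101` and then turn it into `A`, whereas one pass leaves a literal backslash followed by `101`.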
2021-02-08 search: use one git-rev-parse process for all dates
This is necessary to avoid slowdowns with pathological cases with many dates in the query, since each rev-parse invocation takes ~5ms. This is immeasurably slower with one open-ended range, but already faster with any closed range featuring two dates which require parsing via git.
2021-02-08 git: implement date_parse method
Users are expected to be familiar with git's "approxidate" functionality for parsing dates, so we'll expose that in our UIs. Xapian itself has limited date parsing functionality and I can't expect users to learn it. This takes around 4-5ms on my aging workstation, so it'll probably be made acceptable for the WWW UI, even. libgit2 has a git__date_parse function which I expect to have less overhead, but it's only for internal use at the moment.
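A sketch of how approxidate strings could be handed to git, relying on the documented behavior that `git rev-parse --since=DATE` prints a `--max-age=EPOCH` line for each date. The sub names are hypothetical, and passing several dates to one process is shown only as an illustration of the batching idea.

```perl
use strict;
use warnings;

# Build one rev-parse command line for any number of date strings.
sub date_cmd_sketch {
    my ($git_dir, @dates) = @_;
    ('git', "--git-dir=$git_dir", 'rev-parse', map { "--since=$_" } @dates);
}

# rev-parse answers each --since=DATE with a "--max-age=EPOCH" line.
sub max_ages_sketch {
    my (@lines) = @_;
    map { /\A--max-age=(\d+)\s*\z/ ? $1 : undef } @lines;
}

# Hypothetical wrapper: one process regardless of how many dates are given.
sub date_parse_sketch {
    my ($git_dir, @dates) = @_;
    my @cmd = date_cmd_sketch($git_dir, @dates);
    open(my $fh, '-|', @cmd) or die "failed to spawn git: $!";
    my @lines = <$fh>;
    close($fh) or die "git rev-parse failed: \$?=$?\n";
    max_ages_sketch(@lines);
}
```

Given a repository path, `date_parse_sketch($git_dir, 'yesterday', '2 weeks ago')` would return two epoch values from a single git invocation.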
2021-01-30 git: synchronous cat_file may return type and OID
Instead of forcing callers to set a variable to write into, we'll just rely on wantarray.
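A sketch of a wantarray-based return, using a hypothetical in-memory store in place of git. The order of the list-context return values is an assumption made for this example.

```perl
use strict;
use warnings;

# Hypothetical in-memory object store standing in for git.
my %store = (
    'deadbeef' => [ 'blob', "hello\n" ],
);

# Scalar context: only a reference to the contents.
# List context: contents reference plus OID, type and size.
sub cat_file_sketch {
    my ($oid) = @_;
    my $obj = $store{$oid} or return;
    my ($type, $data) = @$obj;
    wantarray ? (\$data, $oid, $type, length($data)) : \$data;
}

my $bref = cat_file_sketch('deadbeef');
my (undef, $oid, $type, $size) = cat_file_sketch('deadbeef');
print "$$bref$oid $type $size\n";
```

Existing scalar-context callers keep working unchanged, and callers that want metadata simply assign to a list.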
2021-01-01 update copyrights for 2021
Using "make update-copyrights" after setting GNULIB_PATH in my config.mak
2020-12-28 import: check for git->qx errors, clearer return values
Those git commands can fail and git->qx will set $? when it fails. There's no need for the extra indirection of the @ret array, either. Improve git->qx coverage to check for $? while we're at it.
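A sketch of a qx-like helper that leaves the child's wait status in `$?` for the caller to check. `qx_sketch` is a hypothetical name, and `$^X` (the running perl) stands in for git so the example does not depend on git being installed.

```perl
use strict;
use warnings;

# Hypothetical qx-like helper: run a command without a shell and leave the
# child's wait status in $? for the caller, as backticks do.
sub qx_sketch {
    my (@cmd) = @_;
    open(my $fh, '-|', @cmd) or die "spawn failed: $!";
    my @out = <$fh>;
    close($fh); # sets $? to the child's wait status
    wantarray ? @out : join('', @out);
}

# $^X (the running perl) stands in for git so the sketch runs anywhere.
my $out = qx_sketch($^X, '-e', 'print "partial\n"; exit 3');
if ($?) {
    warn 'command failed, exit code ', $? >> 8, "\n";
}
```

Output can be non-empty even when the command failed, which is why the caller checks `$?` and not just the returned text.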
2020-12-28 git: qx: avoid extra "local" for scalar context case
We can use the ternary operator to avoid an early return, here.
2020-06-21 tests: require git 2.6+ in more places
We also need to check for git 2.6 earlier in each test case, before any other TAP output is emitted to avoid confusing the TAP consumers.
2020-06-13 git: async: automatic retry on alternates change
This matches the behavior of the existing synchronous ->cat_file method. In fact, ->cat_file now becomes a small wrapper around the ->cat_async method.
2020-06-13 git: cat_async: provide requested OID + "missing" on missing blobs
This will make it easier to implement the retries on alternates_changed() of the synchronous ->cat_file API.
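For context, `git cat-file --batch` answers each request with either an `<oid> <type> <size>` header or `<object> missing`. The sketch below parses such a header and passes the requested name plus "missing" to a callback; the sub name and the callback's argument order are assumptions made for this example.

```perl
use strict;
use warnings;

# Parse one response header from "git cat-file --batch".
# Returns (oid, type, size); for a missing object, type is "missing"
# and size is undef so the caller still learns which OID was requested.
sub parse_batch_header {
    my ($line) = @_;
    chomp $line;
    return ($1, $2, $3 + 0) if $line =~ /\A(\S+) (\S+) ([0-9]+)\z/;
    return ($1, 'missing', undef) if $line =~ /\A(.+) missing\z/;
    die "unexpected cat-file header: $line\n";
}

# Hypothetical callback in the style the commit describes.
my $cb = sub {
    my ($bref, $oid, $type, $size) = @_;
    my $msg = $type eq 'missing' ? "$oid is missing\n"
                                 : "$oid: $type, $size bytes\n";
    print $msg;
};
$cb->(undef, parse_batch_header("0123abcd missing\n"));
```

Because the callback receives the requested name even on failure, a retry after an alternates change can reissue exactly the same request.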
2020-06-13 imap: use git-cat-file asynchronously
This ought to improve overall performance with multiple clients. Single client performance suffers a tiny bit due to extra syscall overhead from epoll. This also makes the existing async interface easier-to-use, since calling cat_async_begin is no longer required.
2020-04-20 testcommon: spawn-aware system() and qx[] workalikes
Barely noticeable on Linux, but this gives a 1-2% speedup on a FreeBSD 11.3 VM and lets us use built-in redirects rather than relying on /bin/sh.
2020-04-20 import: init_bare: allow use as method, use in tests
Allowing ->init_bare to be used as a method saves some keystrokes, and we can save a little bit of time on systems with our vfork(2)-enabled spawn(). This also sets us up for future improvements where we can avoid spawning a process at all.
2020-02-06 treewide: run update-copyrights from gnulib for 2019
I didn't wait until September to do it, this year!
2019-12-30 spawn: allow passing GLOB handles for redirects
We can save callers the trouble of {-hold} and {-dev_null} refs as well as the trouble of calling fileno().
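A sketch of how a spawn-style redirect map could accept handles as well as raw descriptor numbers. `redirect_fds` and the shape of the map are hypothetical; the real spawn() interface may differ.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical normalizer for a spawn()-style redirect map
# ({ child_fd => parent fd number or handle }).
sub redirect_fds {
    my ($rdr) = @_;
    my %fds;
    for my $child_fd (keys %$rdr) {
        my $v = $rdr->{$child_fd};
        # Accept GLOBs, GLOB refs and IO objects; plain integers pass through.
        my $fd = (ref($v) || ref(\$v) eq 'GLOB') ? fileno($v) : $v;
        defined($fd) or die "redirect for fd $child_fd is not open\n";
        $fds{$child_fd} = $fd;
    }
    \%fds;
}

open(my $null, '<', File::Spec->devnull) or die "open devnull: $!";
my $fds = redirect_fds({ 0 => $null, 1 => \*STDERR, 2 => 2 });
```

Doing the fileno() call inside the helper means callers can pass the handle they already hold and keep it alive themselves.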
2019-12-26 git: allow async_cat to pass arg to callback
This allows callers to avoid allocating several KB for every call to ->async_cat.
2019-12-19 tests: move t/common.perl to PublicInbox::TestCommon
We want to be able to use run_script with *.t files, so t/common.perl putting subs into the top-level "main" namespace won't work. Instead, make it a module which uses Exporter like other libraries.
2019-11-24 tests: use File::Temp->newdir instead of tempdir()
We'll also introduce a tmpdir() API to give tempdirs consistent names.
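A sketch of a tmpdir()-style helper built on File::Temp->newdir. The naming template and the return values are assumptions made for this example, not the project's actual helper.

```perl
use strict;
use warnings;
use File::Temp ();
use File::Basename qw(basename);

# Hypothetical tmpdir() helper: name the directory after the running script
# and PID so leftovers are easy to trace back to a test.
sub tmpdir_sketch {
    my ($base) = @_;
    $base //= basename($0);
    $base =~ tr!A-Za-z0-9_.-!_!c; # keep the template filesystem-friendly
    my $tmpdir = File::Temp->newdir("pi-$base-$$-XXXX", TMPDIR => 1);
    # The object removes the directory when it goes out of scope,
    # so callers must hold on to it for as long as they need the path.
    ($tmpdir, $tmpdir->dirname);
}

my ($obj, $path) = tmpdir_sketch();
print "working in $path\n";
```

Unlike a bare tempdir() call, the returned object ties cleanup to scope, so a test that exits a block early still removes its directory.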
2019-11-08 t/*.t: remove IPC::Run dependency for git commands
One small step towards making tests easier-to-run. We can rely on "local $ENV{GIT_DIR}" for potentially shell-unsafe path names, and the rest of our path names are relative and don't contain characters which require escaping.
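The "local $ENV{GIT_DIR}" approach can be sketched as below. `run_with_git_dir` is a hypothetical helper, and `$^X` (the running perl) stands in for git so the example runs without it.

```perl
use strict;
use warnings;

# Pass a path to a child through the environment instead of a shell command
# line, so no quoting or escaping is needed.
sub run_with_git_dir {
    my ($git_dir, @cmd) = @_;
    local $ENV{GIT_DIR} = $git_dir; # restored when this sub returns
    open(my $fh, '-|', @cmd) or die "spawn failed: $!";
    local $/; # slurp everything the child prints
    my $out = <$fh>;
    close($fh) or die "command failed: \$?=$?\n";
    $out;
}

# $^X (the running perl) stands in for git so the sketch runs anywhere.
my $dir = q{/tmp/it's a "weird" $dir; name.git};
print run_with_git_dir($dir, $^X, '-e', 'print $ENV{GIT_DIR}'), "\n";
```

The child sees the path byte-for-byte, whatever quotes, spaces or dollar signs it contains, and the parent's environment is untouched afterwards.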
2019-09-09 run update-copyrights from gnulib for 2019
2019-06-14 git: remove cat_file sub callback interface
We weren't using it, and in retrospect, it makes no sense to use the cat_file API for giant responses which can't be read quickly with minimal context-switching (or sanely fit into memory for Email::Simple/Email::MIME). For giant blobs which we don't want slurped in memory, we'll spawn a short-lived git-cat-file process like we do in ViewVCS. Otherwise, monopolizing a git-cat-file process for a giant blob is harmful to other PSGI/NNTP users. A better interface is coming which will be more suitable for batch processing of "small" objects such as commits and email blobs.
2019-06-01 git: unconditional expiry
A constant stream of traffic to either httpd/nntpd would mean git-cat-file processes never expire. Things can go bad after a full repack, as a full repack will unlink old pack indices and git-cat-file does not currently detect unlinked files. We could do something complicated by recursively stat-ing objects/pack of every git directory and alternate; but that's probably not worth the trouble compared to occasionally restarting the cat-file process. So simplify the code and let httpd/nntpd expire them periodically, since spawning a "git-cat-file --batch" process isn't too expensive. We already spawn for every request which hits git-http-backend, cgit, and git-apply. In the future, we may optionally support the Git::Raw module to avoid IPC; but we must remain careful to not leave lingering FDs open to unlinked files after repack.
2019-05-14 tests: get rid of unnecessary Cwd module use
We only need it for tests that chdir, and maybe for ENV{PATH} portability (dash seems fine, not sure about others).
v2: revert change to solver_git.t for FreeBSD 11.2 and document
2019-04-18 git: calculate modified time of repository
This will be used for generating an HTML listing for v1 inboxes, at least. The logic for this follows that of grokmirror, and we may dynamically generate manifest.js.gz natively...
2019-01-31 inbox: perform cleanup of Git objects for coderepos
Otherwise, long-running but idle git processes may keep unlinked packs around indefinitely and waste disk space.
2019-01-19 git: add git_quote
It'll be helpful for displaying progress in SolverGit output.
2019-01-18 t/git.t: do not pass "-b" to git-repack(1)
Allows t/git.t to run on older versions of git without "-b" and avoids incurring extra I/O traffic for bitmaps.
2019-01-18 git: git_unquote handles double-quote and backslash
We need to work with 0x22 (double-quote) and 0x5c (backslash); even if they're oddball characters in filenames which wouldn't be used by projects I'd want to work on.
2019-01-18 t/git.t: avoid passing read-only value to git_unquote
Older versions of Perl (tested 5.14.2 on Debian wheezy(*), reported by Konstantin on Perl 5.16.3) considered the result of concatenating two string literals to be a constant value.
(*) not that other stuff works on wheezy, but t/git.t should.
Reported-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
2019-01-15 git_unquote: perform modifications in-place
This function doesn't have a lot of callers at the moment so none of them are affected by this change. But the plan is to use this in our WWW code for things, so do it now before we call it in more places.

Results from a Thinkpad X200 with a Core2Duo P8600 @ 2.4GHz:

    Benchmark: timing 10 iterations of cp, ip...
    cp: 12.868 wallclock secs (12.86 usr + 0.00 sys = 12.86 CPU) @ 0.78/s (n=10)
    ip: 10.9137 wallclock secs (10.91 usr + 0.00 sys = 10.91 CPU) @ 0.92/s (n=10)

Note: I mainly care about unquoted performance because that's the common case for the target audience of public-inbox.

Script used to get benchmark results against the Linux source tree:

    ==> bench_unquote.perl <==
    use strict;
    use warnings;
    use Benchmark ':hireswallclock';
    my $nr = 50;
    my %GIT_ESC = (
        a => "\a",
        b => "\b",
        f => "\f",
        n => "\n",
        r => "\r",
        t => "\t",
        v => "\013",
    );

    sub git_unquote_ip ($) {
        return $_[0] unless ($_[0] =~ /\A"(.*)"\z/);
        $_[0] = $1;
        $_[0] =~ s/\\([abfnrtv])/$GIT_ESC{$1}/g;
        $_[0] =~ s/\\([0-7]{1,3})/chr(oct($1))/ge;
        $_[0];
    }

    sub git_unquote_cp ($) {
        my ($s) = @_;
        return $s unless ($s =~ /\A"(.*)"\z/);
        $s = $1;
        $s =~ s/\\([abfnrtv])/$GIT_ESC{$1}/g;
        $s =~ s/\\([0-7]{1,3})/chr(oct($1))/ge;
        $s;
    }

    chomp(my @files = `git -C ~/linux ls-tree --name-only -r v4.19.13`);

    timethese(10, {
        cp => sub { for (0..$nr) { git_unquote_cp($_) for @files } },
        ip => sub { for (0..$nr) { git_unquote_ip($_) for @files } },
    });
2018-12-29 t/git.t: reorder IPC::Run check
We can't skip tests after "use_ok".
2018-12-29 tests: consolidate process spawning code.
IPC::Run provides a nice simplification in several places; and we already use it (optionally) on a lot of tests. For the non-test code, we still rely on our vfork-capable Inline::C stuff since real-world server processes can get large enough to where vfork is an advantage. Maybe Perl5 can use CLONE_VFORK somehow, one day: https://rt.perl.org/Ticket/Display.html?id=128227
Ohg V'q engure cbeg choyvp-vaobk gb Ehol :C
2018-02-19 git: reload alternates file on missing blob
Since we'll be adding new repositories to the `alternates' file in git, we must restart the `git cat-file --batch' process as git currently does not detect changes to the alternates file in long-running cat-file processes. Don't bother with the `--batch-check' process since we won't be using it with v2.
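One way to notice that the alternates file changed is to remember a stat() signature between lookups, as sketched below. The sub name, the `alt_sig` field and the choice of size plus ctime are assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical change detector: remember a size+ctime signature of the
# alternates file and report whether it differs from the last call.
sub alternates_changed_sketch {
    my ($self) = @_;
    my @st = stat("$self->{git_dir}/objects/info/alternates") or return 0;
    my $sig = "$st[7] $st[10]"; # size, ctime
    return 0 if ($self->{alt_sig} // '') eq $sig;
    $self->{alt_sig} = $sig;
    1; # caller should restart its "git cat-file --batch" process and retry
}
```

A caller would consult this only after a blob lookup comes back missing, so the common case of a successful lookup costs no extra stat() call.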
2018-02-19 v2writable: initial cut for repo-rotation
Wrap the old Import package to enable creating new repos based on size thresholds. This is better than relying on time-based rotation as LKML traffic seems to be increasing.
2018-02-07 update copyrights for 2018
Using update-copyrights from gnulib.
While we're at it, use the SPDX identifier for AGPL-3.0+ to ease mechanical processing.
2016-04-11 git: add support for qx wrapper
This lets us run one-line git commands as easily as with backticks (``), but without having to remember --git-dir or escape arguments.
2016-03-03 t/*.t: use identifiable tempdir names
This should make identifying leftover directories due to SIGKILL-ed tests easier.
2015-12-22 rename 'GitCatFile' package to 'Git'
We'll be using it for more than just cat-file. Adding a `popen' API for internal use allows us to save a bunch of code in other places.
2015-12-22 git: cat-file wrapper enhancements
The "cat_file" sub now allows a block to be passed for partial processing. Additionally, a new "check" method is added to retrieve only object metadata: (SHA-1 identifier, type, size).