Date | Commit message |
|
Noticed while adding wildcard support to WwwCoderepo...
|
|
Start lowercasing newsgroup names automatically since uppercase
names are incompatible with IMAP and POP3 and also cause
problems with both -extindex and -cindex.
We'll also warn on eidx_key and newsgroup conflicts to avoid
sometimes subtle breakage when using -extindex and -cindex.
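A minimal sketch of the lowercasing and conflict warning described
above. The function name, the `reserved_keys' parameter (standing in
for eidx_key values) and the warning texts are hypothetical and are
not taken from the public-inbox code:

```python
import warnings


def normalize_newsgroups(inboxes, reserved_keys=()):
    """Lowercase newsgroup names and warn about collisions.

    `inboxes` maps an inbox name to its configured newsgroup.
    `reserved_keys` is an assumed set of keys already used elsewhere
    (e.g. by an extindex). Returns {lowercased newsgroup: inbox name},
    keeping the first inbox that claimed each newsgroup.
    """
    reserved = set(reserved_keys)
    by_newsgroup = {}
    for name, newsgroup in inboxes.items():
        lowered = newsgroup.lower()
        if lowered != newsgroup:
            warnings.warn(f"newsgroup {newsgroup!r} lowercased to {lowered!r}")
        if lowered in reserved:
            warnings.warn(f"newsgroup {lowered!r} conflicts with a reserved key")
        owner = by_newsgroup.setdefault(lowered, name)
        if owner != name:
            warnings.warn(f"newsgroup {lowered!r} ({name}) conflicts with {owner}")
    return by_newsgroup
```

Warning rather than dying keeps existing configurations loadable
while still making the conflict visible.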
|
|
It's a needless branch to maintain exclusively for our tests.
The `git config -l' output isn't pleasant to write in tests,
anyways. So just use heredocs to write git configs in their
native format rather than emulate the output of `git config -l'.
This does make the test suite do more work with temporary files
and process invocations, but it doesn't seem very measurable
when testing on tmpfs (TMPDIR=/dev/shm).
We'll make a minor improvement to TestCommon::tmpdir by allowing
it to return a single value (which I suspect we can rely on in
more places since File::Temp::Dir overloads stringification).
|
|
It's how git-config works, so our `git config --list' parser
must be able to handle it. Fortunately this doesn't seem to
incur a measurable overhead when parsing a config with 50k
inboxes.
|
|
This should match behavior documented in gitglossary(7)
|
|
It seems suitable for the config class since globs are a
config/option thing.
|
|
We'll rely on defined(wantarray) to implicitly skip subtests,
and memoize these to reduce syscalls, since tests should
be short-lived enough to not be affected by new installations or
removals of git/xapian-compact/curl/etc...
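The memoization idea can be sketched as follows. Python has no
equivalent of Perl's wantarray, so an explicit `optional' flag stands
in for the calling context here; both function names are hypothetical:

```python
import functools
import shutil
import unittest


@functools.lru_cache(maxsize=None)
def find_cmd(name):
    """Look `name` up on PATH once; later calls reuse the cached answer."""
    return shutil.which(name)


def require_cmd(name, optional=False):
    """Return the path of `name`, or skip the current test if it is missing.

    With optional=True the caller wants the result back (possibly None)
    and decides for itself what to do.
    """
    path = find_cmd(name)
    if path is None and not optional:
        raise unittest.SkipTest(f"{name} missing")
    return path
```

Caching is only reasonable because a test run is short-lived; a
long-running daemon would have to re-check PATH.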
|
|
I configured this for public-inbox.org, but wasn't 100% sure it
worked. This test ensures it stays working :>
|
|
We no longer waste a precious hash slot for a per-Inbox
{nntpserver} if it's only configured globally for all inboxes.
|
|
Thanks to git-describe, we can generate and update this
file to aid users in bug reporting.
|
|
Extsearch objects are duck-types of Inbox objects, and
are capable of supporting code repos all the same.
|
|
We'll try to share a bit more configuration with
extindex entries for WWW PSGI usage.
|
|
Using "make update-copyrights" after setting GNULIB_PATH in my
config.mak.
|
|
While git 1.8.5 learned --get-urlmatch, git did not learn to
match URLs against wildcards until 2.26. So only depend on
1.8.5 for this test since 2.26 is too new.
Reported-by: Ali Alnubani <alialnu@nvidia.com>
Link: https://public-inbox.org/meta/DM6PR12MB49106F8E3BD697B63B943A22DADB0@DM6PR12MB4910.namprd12.prod.outlook.com/
Tested-by: Ali Alnubani <alialnu@nvidia.com>
|
|
There's no need to have extra code in the Inbox package for this
or to waste dozens of bytes for every Inbox object which uses
the default value.
This makes our code more flexible w.r.t Inbox-like ExtSearch
objects and fixes uninitialized value warnings with ->ALL.
|
|
This gives better page cache utilization for Xapian indexing on
slow storage by improving locality for random I/O activity on
the Xapian DB.
Instead of doing a single pass to index both SQLite and Xapian,
this indexes them separately. The first pass is identical to
indexlevel=basic: it indexes both over.sqlite3 and msgmap.sqlite3.
Subsequent passes only operate on a single Xapian shard for
documents belonging to that shard. Given enough shards, each
individual shard can be made small enough to fit into the kernel
page cache and avoid HDD seeks for read activity.
Doing rough tests with a busy system with a 7200 RPM HDD with ext4,
full indexing of LKML (9 epochs) goes from ~80 hours (-j0) to
~30 hours (-j8) with 16GB RAM with 7 shards configured and fsync(2)
disabled (--no-sync) and `--batch-size=10m'.
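The visiting order of the multi-pass approach can be illustrated with
a small sketch. Assigning a document to a shard by `num % nshards' is
an assumption made for illustration, and the function name and pass
labels are hypothetical:

```python
def index_in_passes(messages, nshards):
    """Return the (pass_name, num) order a multi-pass indexer would follow.

    `messages` is a list of (num, body) pairs. The first pass touches
    every message once (the SQLite-only "basic" pass); each later pass
    touches only the messages belonging to one shard, so a single
    shard stays hot in the page cache while it is being written.
    """
    order = [("sqlite", num) for num, _body in messages]
    for shard in range(nshards):
        for num, _body in messages:
            if num % nshards == shard:  # assumed shard assignment
                order.append((f"shard{shard}", num))
    return order
```

The cost is reading every message more than once; the gain is that
random I/O is confined to one shard at a time.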
|
|
"\n" and other characters requiring quoting and/or escaping in
$GIT_DIR/objects/info/alternates were not supported in git 2.11
and earlier; nor do they seem supported at all in libgit2.
This will allow us to support sharing git-cat-file or similar
endpoints across multiple inboxes via alternates.
This breaks an existing use case for anybody wacky
enough to put `\n' in the `inboxdir' pathname; but I doubt
this affects anybody.
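Because an alternates file holds one object directory per line, a
pathname containing a newline cannot be written unquoted. A minimal
sketch of such a check, with a hypothetical function name:

```python
def alternates_text(object_dirs):
    """Build the contents of an objects/info/alternates file.

    Each entry goes on its own line, so a path containing a newline is
    rejected rather than quoted, since older git versions and other
    readers may not understand quoting.
    """
    lines = []
    for path in object_dirs:
        if "\n" in path:
            raise ValueError(f"newline not allowed in path: {path!r}")
        lines.append(path + "\n")
    return "".join(lines)
```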
|
|
Since we have IMAP client support in -watch, make sure per-URL
settings are familiar to git users by taking advantage of git's
URL matching abilities.
This requires git 1.8.5+, which most users ought to have
(though base CentOS 7 is on 1.8.3).
|
|
We'll use xqx() to avoid losing too much performance
compared to normal `backtick` (qx) when testing using
"make check-run" + Inline::C.
|
|
Barely noticeable on Linux, but this gives a 1-2% speedup
on a FreeBSD 11.3 VM and lets us use built-in redirects
rather than relying on /bin/sh.
|
|
Allowing ->init_bare to be used as a method saves some
keystrokes, and we can save a little bit of time on systems with
our vfork(2)-enabled spawn().
This also sets us up for future improvements where we can
avoid spawning a process at all.
|
|
I didn't wait until September to do it, this year!
|
|
Since the beginning of this project, we've implicitly supported
inboxes with multiple URLs by relying on the Host: header sent
by the client ($env->{HTTP_HOST}).
We now offer the option to explicitly configure multiple URLs for
every inbox along with the ability to do a best-effort match for
matching hostnames.
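One way a best-effort hostname match could work is sketched below.
The function name and the fallback to the first configured URL are
assumptions, not a description of the actual matching code:

```python
from urllib.parse import urlsplit


def pick_url(urls, http_host):
    """Pick the configured URL whose host matches the Host: header.

    Falls back to the first configured URL when nothing matches, so a
    usable link is always produced.
    """
    want = http_host.lower()
    for url in urls:
        if urlsplit(url).netloc.lower() == want:
            return url
    return urls[0]
```

For example, an inbox configured with both an HTTPS URL and an onion
URL would link back to whichever host the client used to reach it.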
|
|
We want to be able to use run_script with *.t files, so
t/common.perl putting subs into the top-level "main" namespace
won't work. Instead, make it a module which uses Exporter
like other libraries.
|
|
We'll also introduce a tmpdir() API to give tempdirs
consistent names.
|
|
"mainrepo" was a bad name and an artifact from the early days when I
intended for there to be a "spamrepo" (now just the
ENV{PI_EMERGENCY} Maildir). With v2, "mainrepo" can be
especially confusing, since v2 needs at least two git
repositories (epoch + all.git) to function and we shouldn't
confuse users by having them point to a git repository for v2.
Much of our documentation already references "INBOX_DIR" for
command-line arguments, so use "inboxdir" as the
git-config(1)-friendly variant for that.
"mainrepo" remains supported indefinitely for compatibility.
Users may need to revert to old versions, or may be referring
to old documentation and must not be forced to change config
files to account for this change.
So if you're using "mainrepo" today, I do NOT recommend changing
it right away because other bugs can lurk.
Link: https://public-inbox.org/meta/874l0ice8v.fsf@alyssa.is/
|
|
Rewrite a bunch of tests to use ordered input (emulating
"git config -l" output) so we can always walk sections in
the order they were given in the config file.
|
|
This allows us to deal with newlines in config values,
since git-config(1) acquired "-z" support in git v1.5.3.
I'm not sure if it's actually useful in our case, but
maybe some multi-line texts could be added. And newlines
in path names are super useful!
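With -z, git-config ends each entry with a NUL byte and separates the
key from its value with the first newline, so embedded newlines in
values survive. A minimal parser sketch (the function name is
hypothetical) could look like this:

```python
def parse_config_z(data):
    """Parse the text output of `git config -z -l`.

    Returns a list of (key, value) pairs in file order. A key without
    any value (e.g. a bare "[core] bare" line) yields value None.
    """
    entries = []
    for record in data.split("\0"):
        if not record:
            continue  # nothing follows the final NUL terminator
        key, sep, value = record.partition("\n")
        entries.append((key, value if sep else None))
    return entries
```

Keeping a list rather than a dict preserves both the order of the
config file and repeated keys.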
|
|
We need to handle arbitrary integers and case-insensitive
variations of human words to match git-config(1) behavior,
since that's what users would expect given we use config
files parseable by git-config(1).
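A sketch of boolean parsing covering only the cases named above
(human words in any case, plus plain integers). The function name is
hypothetical, and other forms git-config(1) may accept are not
handled here:

```python
import re

_TRUE_WORDS = {"true", "yes", "on"}
_FALSE_WORDS = {"false", "no", "off"}
_INTEGER = re.compile(r"[+-]?[0-9]+")


def git_bool(value):
    """Return True/False for a git-style boolean, or None if unrecognized."""
    word = value.strip().lower()
    if word in _TRUE_WORDS:
        return True
    if word in _FALSE_WORDS:
        return False
    if _INTEGER.fullmatch(word):
        return int(word) != 0  # any non-zero integer counts as true
    return None
```

Returning None for unknown input lets the caller decide whether to
warn or fall back to a default.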
|
|
|
|
CentOS-7 needs the perl-Data-Dumper package, and the
test is small enough to roll our own escaping, here.
|
|
We need to ensure we don't introduce unnecessary processes
and memory usage for mapping multiple inboxes to the same
code repos.
|
|
For cross-inbox Message-ID resolution, having some sort of
stable ordering makes the most sense. Relying on the
order of the config file seems most natural and allows us
to avoid introducing yet another configuration knob.
|
|
Actually, it turns out git.git/remote.c::valid_remote_nick
rules alone are insufficient. More checking is performed on the
refname in git.git/refs.c::check_refname_component.
I also considered rejecting URL-unfriendly inbox names entirely,
but realized some users may intentionally configure names not
handled by our WWW endpoint for archives they don't want
accessible over HTTP.
|
|
Using update-copyrights from gnulib.
While we're at it, use the SPDX identifier for AGPL-3.0+ to
ease mechanical processing.
|
|
We will also treat all known list addresses as non-obfuscated.
By setting publicinbox.noObfuscate in ~/.public-inbox/config,
this will allow users to disable address obfuscation on a
per-domain or per-address basis.
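A minimal sketch of a per-domain or per-address exemption list. The
value syntax assumed here (entries separated by commas or whitespace)
and both function names are hypothetical:

```python
def build_no_obfuscate(values):
    """Split configured entries into (domains, addresses), lowercased.

    `values` is a list of config values; an entry containing "@" is
    treated as a full address, anything else as a domain.
    """
    domains, addresses = set(), set()
    for value in values:
        for item in value.replace(",", " ").split():
            item = item.lower()
            (addresses if "@" in item else domains).add(item)
    return domains, addresses


def should_obfuscate(addr, domains, addresses):
    """Return False when `addr` or its domain is on the exemption list."""
    addr = addr.lower()
    domain = addr.rpartition("@")[2]
    return addr not in addresses and domain not in domains
```

Known list addresses could be added to `addresses' at load time so
they are never obfuscated.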
|
|
This should simplify the rest of our code for handling
the do-not-obfuscate list.
|
|
This allows certain inboxes to override the global nntpserver
(perhaps under a different domain).
|
|
We can do a better job initializing the data structure
so we no longer need to rely on weak references to cleanup
when we ditch the config on reload.
|
|
Oops :x
|
|
This allows users to customize by using smaller or larger Atom
feeds than the default value of 25 entries.
|
|
Oops. We will inevitably need to support multiple altids for a
public-inbox one day.
|
|
Currently only for git-http-backend use, this allows limiting
the number of spawned processes per-inbox or by group, if there
are multiple large inboxes amidst a sea of small ones.
For example, a "big" limiter could be used for big inboxes
and shared between multiple repos:
	[limiter "big"]
		max = 4
	[publicinbox "git"]
		address = git@vger.kernel.org
		mainrepo = /path/to/git.git
		; shared limiter with giant:
		httpbackendmax = big
	[publicinbox "giant"]
		address = giant@project.org
		mainrepo = /path/to/giant.git
		; shared limiter with git:
		httpbackendmax = big
	; This is a tiny inbox, use the default limiter with 32 slots:
	[publicinbox "meta"]
		address = meta@public-inbox.org
		mainrepo = /path/to/meta.git
|
|
Most of its functionality is in the PublicInbox::Inbox class.
While we're at it, we no longer auto-create newsgroup names
based on the inbox name, since newsgroup names probably deserve
some thought when it comes to hierarchy.
|
|
Followup-to: commit 24e0219f364ed402f9136227756e0f196dc651aa
("remove GIT_DIR env usage in favor of --git-dir")
|
|
From the beginning, we've avoided objects here in favor
of faster startup time; but it may not be worth it
since a persistent httpd/nntpd is faster and -mda
isn't hit as often.
|
|
A public-inbox is NOT necessarily a mailing list, but it
could serve as an input point for zero, one, or infinite
mailing lists :D
|
|
This should make identifying leftover directories
due to SIGKILL-ed tests easier.
|
|
In the future, it should be possible to use this:
	git ls-files | UPDATE_COPYRIGHT_HOLDER='all contributors' \
		UPDATE_COPYRIGHT_USE_INTERVALS=2 \
		xargs /path/to/gnulib/build-aux/update-copyright
|
|
Do not repeat ourselves, just use the same description file
gitweb uses to avoid surprising users.
|