* [PATCH 48/95] clone: flesh out --objstore behavior and document
From: Eric Wong @ 2022-11-28 5:31 UTC (permalink / raw)
To: meta
We can support absolute paths to avoid surprising behaviors,
but relative paths are preferred since the goal is to be
accessible over the "dumb" HTTP git transport (the dumb
transport uses less memory and CPU on the server).
---
Documentation/public-inbox-clone.pod | 12 ++++++++++++
lib/PublicInbox/LeiMirror.pm | 3 ++-
2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/Documentation/public-inbox-clone.pod b/Documentation/public-inbox-clone.pod
index 1c31fbb3..cee9f76e 100644
--- a/Documentation/public-inbox-clone.pod
+++ b/Documentation/public-inbox-clone.pod
@@ -6,6 +6,8 @@ public-inbox-clone - "git clone --mirror" wrapper
public-inbox-clone INBOX_URL [INBOX_DIR]
+public-inbox-clone ROOT_URL [DESTINATION]
+
=head1 DESCRIPTION
public-inbox-clone is a wrapper around C<git clone --mirror> for
@@ -82,6 +84,16 @@ Force a remote public-inbox version (must be C<1> or C<2>).
This is auto-detected by default, and this option exists mainly
for testing.
+=item --objstore[=DIR]
+
+Enables space savings when the remote C<manifest.js.gz>
+includes C<forkgroup> entries as generated by grokmirror 2.x.
+
+If C<DIR> is not an absolute path, it is relative to the
+C<DESTINATION> directory. If only C<--objstore> is specified
+without C<DIR>, then C<objstore> (C<$DESTINATION/objstore>)
+is the implied value of C<DIR>.
+
=item -n
=item --dry-run
diff --git a/lib/PublicInbox/LeiMirror.pm b/lib/PublicInbox/LeiMirror.pm
index 6efe23fa..2f96058a 100644
--- a/lib/PublicInbox/LeiMirror.pm
+++ b/lib/PublicInbox/LeiMirror.pm
@@ -882,7 +882,8 @@ sub do_mirror { # via wq_io_do or public-inbox-clone
if (defined(my $os = $lei->{opt}->{objstore})) {
$os = 'objstore' if $os eq ''; # --objstore w/o args
- $self->{-objstore} = "$self->{dst}/$os";
+ $os = "$self->{dst}/$os" if $os !~ m!\A/!;
+ $self->{-objstore} = $os;
}
local $LIVE;
my $iv = $lei->{opt}->{'inbox-version'} //
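The path handling in the LeiMirror.pm hunk above can be sketched in isolation as follows. This is only an illustration of the behavior as of this patch: `resolve_objstore` is a hypothetical helper name that does not exist in public-inbox, and the directories are made-up examples.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the patch): resolve the --objstore
# value the same way the patched do_mirror hunk does.
#   $dst - destination directory
#   $os  - value of --objstore: undef if the option was not given,
#          '' if it was given without a DIR
sub resolve_objstore {
    my ($dst, $os) = @_;
    return undef unless defined $os;    # option not used at all
    $os = 'objstore' if $os eq '';      # --objstore w/o args
    $os = "$dst/$os" if $os !~ m!\A/!;  # relative => under $dst
    return $os;
}

# Example destinations are assumptions for illustration only.
print resolve_objstore('/srv/mirror', ''), "\n";
print resolve_objstore('/srv/mirror', 'shared'), "\n";
print resolve_objstore('/srv/mirror', '/var/objstore'), "\n";
```

Only a value starting with `/` is left untouched; everything else ends up below the destination directory, which is what keeps the layout relocatable.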
* [PATCH 00/95] clone: multi-inbox/repo support...
From: Eric Wong @ 2022-11-28 5:30 UTC (permalink / raw)
To: meta
A large patchset, and not done, yet :P It's only tested live,
but it seems to work reasonably well against live hosts...
Behavior changes to public-inbox-clone are NOT final; but
public-inbox-fetch|PublicInbox::Fetch will probably become
thin wrappers around LeiMirror.
--include=/--exclude= support now exists with glob support
--keep-going and --dry-run support added, too, since it's
make(1) influenced (more below)
It supports coderepos, too, using --inbox-config=never (default: always);
--project-list=, --manifest=, --objstore=, and --prune.
Key differences from grok-pull (grokmirror) for coderepos:
* uses relative paths on the FS (dumb HTTP untested, but dumb
HTTP is a goal for memory-constrained hosts). This means
I can relocate coderepos freely within my FS or do sneakernet
transfers across machines without having to `perl -i -pe s/x/y/'
on hundreds of info/alternates and config files.
* CLI-only, no extra config files (may generate a Makefile, like
individual inbox clones)
* objstore repos fetch from remotes directly
(does not need, use, nor benefit from hardlinks at all)
* no sleep states
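To illustrate the relative-path point above: git accepts paths in objects/info/alternates that are relative to the object directory, so the entry can be computed rather than hard-coded. The following is a minimal sketch under that assumption, not necessarily how LeiMirror does it; `relative_alternate` and the directory names are hypothetical.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: compute the line to write into
# "$repo/objects/info/alternates" so that it points at
# "$objstore_repo/objects" via a relative path.  git resolves
# relative alternates against the object directory itself.
sub relative_alternate {
    my ($repo, $objstore_repo) = @_;
    return File::Spec->abs2rel("$objstore_repo/objects", "$repo/objects");
}

# Both paths are made-up examples.
print relative_alternate('/srv/mirror/foo.git',
                         '/srv/mirror/objstore/abc.git'), "\n";
```

Since the entry only depends on how the two repos sit relative to each other, moving the whole tree elsewhere leaves it valid.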
It is not a full replacement for grokmirror:
* reliant on default `git gc' behavior for repack. This is OK
since it's only one-way relationships between objstore and
non-objstore repos.
* no fsck support (probably will be in generated Makefile)
* doesn't generate forkgroups nor manifest.js.gz
(I may do this for coderepo Xapian indexing)
It relies on parallel git-fetch for objstores, so `-j $NUM'
calculations may end up being ($NUM * $NUM) in the worst case.
Not sure how to best approach this...
Maybe `-j $M,$N' similar to `lei q -j$M,$N` is a solution...
Design note:
This is an exercise in building make(1)-like parallelism using
->DESTROY callbacks for prerequisites; so it's a newish paradigm
for me. It forced me to fix a reference cycle, already.
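The DESTROY-based idea can be sketched roughly as below: every prerequisite holds a reference to the target object, and the target's callback only fires once the last reference is gone. This is a toy illustration under stated assumptions; `Sketch::OnDestroy` is a hypothetical class written for this example and is not the PublicInbox::OnDestroy API, and the jobs run synchronously here rather than as child processes.

```perl
use strict;
use warnings;

# Hypothetical minimal class: runs a callback when the last
# reference to the object goes away.
package Sketch::OnDestroy;
sub new {
    my ($class, $cb, @args) = @_;
    return bless [ $cb, @args ], $class;
}
sub DESTROY {
    my ($cb, @args) = @{$_[0]};
    $cb->(@args);
}

package main;

my @log;
{
    # The "target" (e.g. indexing) runs once all prerequisites are done.
    my $target = Sketch::OnDestroy->new(sub { push @log, 'index' });

    # Each prerequisite job keeps its own reference to the target and
    # drops it on completion.
    my @jobs = map {
        my $name = $_;
        my $keep = $target;
        sub { push @log, "fetch $name"; undef $keep };
    } qw(epoch0 epoch1);

    undef $target;    # from here on, only the jobs keep the target alive
    $_->() for @jobs; # real jobs would complete asynchronously
}
print join(', ', @log), "\n";
```

A reference cycle (the target indirectly holding one of its own prerequisites) would keep the count from ever reaching zero, so the callback would never run.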
TODO: repo|symlink pruning, --exit-code, retry/refetch, manpage updates
Eric Wong (95):
clone: support multi-inbox clone
clone: support --include and --exclude with multi-clone
clone: parallelize v2 epoch clones
lei_mirror: async config retrieval for v2 w/ manifest
lei_mirror: rely on DESTROY to index v2 inbox
lei_mirror: rely on global process reaper
clone: support parallel v1 clones
lei_mirror: default to single job by default
lei_mirror: move directory creation to v2-only path
lei_mirror: retrieve description text asynchronously, too
switch inotify/kevent stuff to v5.12
manifest: update module blurb + v5.12
lei_mirror: simplify _get_txt_start callers
lei_mirror: elide description retrieval for v1|coderepo
lei_mirror: add a hint for skipped epoch permissions
lei_mirror: consolidate clone process management
lei_mirror: load File::Path unconditionally
lei_mirror: load most modules up-front
lei_mirror: set gitweb.owner from manifest
clone: support --dry-run / -n flag
lei_mirror: initialize placeholders with "head" from manifest
lei_mirror: support {reference} for v1 manifest clones
lei_mirror: reduce noise on interrupted clones
clone: support --inbox-config option
lei_mirror: retrieve v2 description properly
lei_mirror: reduce scope of v2 lock
lei_mirror: allow --epoch on mixed v1/v2 clones
lei_mirror: fix infinite loop in dependency resolution
lei_mirror: defend against infinite loops
lei_mirror: do not fetch descriptions if using manifest
lei_mirror: require PublicInbox::Lock at use
lei_mirror: fix glob semantics to match end-of-path
lei_mirror: differentiate -entv vs -ent
lei_mirror: support manifest {references} for v2 epochs
lei_mirror: simplify v2 code paths
clone: support --inbox-version
lei_mirror: require Perl v5.12+
lei_mirror: ensure curl exits 22 on HTTP 404 responses
lei_mirror: cleanup File::Temp OO usage
lei_mirror: add `index' target to generated Makefile
lei_mirror: do not write Makefile for --inbox-config=never
lei_mirror: hoist out dump_manifest sub
lei_mirror: avoid convoluted lazy_cb usage
lei_mirror: simplify clone_v2_prep
lei_mirror: support --objstore and forkgroups
lei_mirror: cleanup process reaping logic
lei_mirror: ensure git <1.8.5 fallback can use torsocks
clone: flesh out --objstore behavior and document
lei_mirror: always pack refs for coderepos
lei_mirror: set description for non-inboxes, too
lei_mirror: force --no-tags when fetching forkgroups
lei_mirror: preserve permissions of existing alternates file
lei_mirror: do not show ref updates w/o --verbose
lei_mirror: drop git <1.8.5 support
lei_mirror: make basename more descriptive
lei_mirror: fix --dry-run for forkgroups
lei_mirror: forkgroups use `git fetch --multiple'
clone: move --dry-run handling to lei_mirror
clone: drop unnecessary requires
clone: use v5.12
clone: require `--objstore=' for default location
lei_mirror: shorten remote names
fetch: use v5.12
fetch: eliminate File::Temp->filename var
lei_mirror: properly pack-refs in non-forkgroup repos
lei_mirror: show child error error code
on_destroy: support ->cancel callback
lei_mirror: support resuming multi-repo clones
lei_mirror: check fingerprints before fetching
clone: support loading manifest.js.gz from destination
lei_mirror: delay configuring forkgroups
clone: canonicalize destination path from CLI
clone|fetch: support passing --prune(-tags) to `git fetch'
lei_mirror: avoid needless FD passing
clone: support --keep-going/-k like make(1)
lei_mirror: don't warn on missing manifest on initial clone
lei_mirror: respect `./' and `../' prefixes for CLI args
lei_mirror: --manifest= affects destination, too
lei_mirror: update fingerprints when writing local manifest.js.gz
lei_mirror: remove janky mirror.done stamp file
lei_mirror: simplify most process spawning
lei_mirror: run v1_done earlier on forkgroup done
lei_mirror: simplify forkgroup-related subs
lei_mirror: shorten scope mirror objects
lei_mirror: set {head} from manifest
lei_mirror: support {symlinks} from manifest
lei_mirror: eliminate circular references
lei_mirror: use curl -z/--timecond if manifest exists
lei_mirror: avoid redundant curl `-f' use
lei_mirror: omit trailing slash for git remote.*.url
lei_mirror: set info/web/last-modified from manifest
lei_mirror: don't clobber inbox.config.example if it exists
lei_mirror: break out of fgrp fetch iteration early
clone: support --project-list= for cgit
lei_mirror: handle forkgroup changes
Documentation/lei-add-external.pod | 4 +-
Documentation/public-inbox-clone.pod | 76 ++
Documentation/public-inbox-fetch.pod | 6 +
lib/PublicInbox/DSKQXS.pm | 5 +-
lib/PublicInbox/DirIdle.pm | 4 +-
lib/PublicInbox/FakeInotify.pm | 13 +-
lib/PublicInbox/Fetch.pm | 50 +-
lib/PublicInbox/In2Tie.pm | 4 +-
lib/PublicInbox/InboxIdle.pm | 2 +-
lib/PublicInbox/KQNotify.pm | 12 +-
lib/PublicInbox/LEI.pm | 5 +-
lib/PublicInbox/LeiMirror.pm | 1104 +++++++++++++++++++++-----
lib/PublicInbox/ManifestJsGz.pm | 8 +-
lib/PublicInbox/OnDestroy.pm | 5 +-
lib/PublicInbox/TestCommon.pm | 1 +
script/public-inbox-clone | 23 +-
script/public-inbox-fetch | 4 +-
t/on_destroy.t | 8 +-
t/www_listing.t | 71 +-
19 files changed, 1148 insertions(+), 257 deletions(-)
Code repositories for project(s) associated with this public inbox
https://80x24.org/public-inbox.git