Diffstat (limited to 'Documentation/public-inbox-clone.pod')
-rw-r--r-- | Documentation/public-inbox-clone.pod | 287 |
1 files changed, 287 insertions, 0 deletions
diff --git a/Documentation/public-inbox-clone.pod b/Documentation/public-inbox-clone.pod
new file mode 100644
index 00000000..64ee3138
--- /dev/null
+++ b/Documentation/public-inbox-clone.pod
@@ -0,0 +1,287 @@
+=head1 NAME
+
+public-inbox-clone - "git clone --mirror" wrapper
+
+=head1 SYNOPSIS
+
+public-inbox-clone [OPTIONS] INBOX_URL [INBOX_DIR]
+
+public-inbox-clone [OPTIONS] ROOT_URL [DESTINATION] # public-inbox 2.0+
+
+=head1 DESCRIPTION
+
+public-inbox-clone is a wrapper around C<git clone --mirror> for
+making the initial clone of a remote HTTP(S) public-inbox. It
+allows cloning multi-epoch v2 inboxes with a single command and
+zero configuration.
+
+In public-inbox 2.0+, public-inbox-clone can create and maintain
+a mirror of multiple inboxes or code repositories using manifest.js.gz
+files like L<grok-pull(1)> from grokmirror. L<public-inbox-fetch(1)> is
+NOT required when using this mode.
+
+It does not run L<public-inbox-init(1)> nor
+L<public-inbox-index(1)>. Those commands must be run separately
+if serving/searching the mirror is required. As-is,
+public-inbox-clone is suitable for creating a git-only backup
+without Xapian and SQLite indices.
+
+When cloning a single inbox, public-inbox-clone creates a Makefile
+with handy targets to update the inbox once indexed.
+This Makefile may be edited by the user; it will
+not be rewritten by L<public-inbox-fetch(1)> unless it is removed
+completely.
+
+public-inbox-clone does not use nor require any extra
+configuration files (not even C<~/.public-inbox/config>),
+but it can download snippets suitable for adding to any
+L<public-inbox-config(5)> file.
+
+L<public-inbox-fetch(1)> may be used to keep a single C<INBOX_DIR>
+up-to-date.
+
+For v2 inboxes, it will create a C<$INBOX_DIR/manifest.js.gz>
+file to speed up subsequent L<public-inbox-fetch(1)>.
+
+=head1 OPTIONS
+
+=over
+
+=item --epoch=RANGE
+
+Restrict clones of L<public-inbox-v2-format(5)> inboxes to the
+given range of epochs.
+The range may be a single non-negative integer or a (possibly
+open-ended) C<LOW..HIGH> range of non-negative integers. C<~> may be
+prefixed to either (or both) integer values to represent the offset
+from the maximum possible value.
+
+For example, C<--epoch=~0> alone clones only the latest epoch, and
+C<--epoch=~2..> clones the three latest epochs.
+
+Default: C<0..~0> or C<0..> or C<..~0>
+(all epochs; all three forms are equivalent)
+
+=item -I PATTERN
+
+=item --include=PATTERN
+
+When cloning a top-level with multiple inboxes via manifest,
+only clone inboxes and repositories matching a given wildcard pattern
+(the C<*?> and C<[]> wildcards are supported).
+
+This is a new option in public-inbox 2.0+
+
+=item --exclude=PATTERN
+
+When cloning a top-level with multiple inboxes via manifest,
+ignore inboxes and repositories matching the given wildcard pattern.
+Supports the same wildcards as L</--include>.
+
+This is a new option in public-inbox 2.0+
+
+=item --inbox-config=always|v2|v1|never
+
+Whether or not to retrieve the C<$INBOX/_/text/config/raw> HTTP(S)
+endpoint when cloning.
+
+Since we can't deduce v1 inboxes from code repositories, setting this
+to C<v2> or C<never> can allow faster clones of code repositories if
+no v1 inboxes are present.
+
+Default: C<always>
+
+This is a new option in public-inbox 2.0+
+
+=item --inbox-version=NUM
+
+Force a remote public-inbox version (must be C<1> or C<2>).
+This is auto-detected by default, and this option exists mainly
+for testing.
+
+This is a new option in public-inbox 2.0+
+
+=item --objstore=DIR
+
+Enables space savings when the remote C<manifest.js.gz>
+includes C<forkgroup> entries as generated by grokmirror 2.x.
+
+If C<DIR> does not start with C</>, C<./>, or C<../>, it is treated
+as relative to the C<DESTINATION> directory. If only C<--objstore=>
+is specified where C<DIR> is an empty string (C<"">), then C<objstore>
+(C<$DESTINATION/objstore>) is the implied value of C<DIR>.
+
+This is a new option in public-inbox 2.0+
+
+=item --manifest=FILE
+
+When incrementally updating an existing mirror, load the given
+manifest (typically C<manifest.js.gz>) to speed up updates.
+
+By default, public-inbox writes the retrieved manifest to
+C<$DESTINATION/manifest.js.gz>; this directive also
+changes the destination to the specified C<FILE>.
+
+If C<FILE> does not start with C</>, C<./>, or C<../>, it is treated
+as relative to the C<DESTINATION> directory. If only C<--manifest=>
+is specified where C<FILE> is an empty string (C<"">), then C<manifest.js.gz>
+(C<$DESTINATION/manifest.js.gz>) is the implied value of C<FILE>.
+
+When updating manifests with many forks using the same objstore,
+git 2.41+ is highly recommended for performance as we automatically
+use the C<fetch.hideRefs> feature to speed up negotiation.
+
+C<--manifest=> is a new option in public-inbox 2.0+
+
+=item --remote-manifest=URL|RELATIVE_PATH
+
+Use an alternate location for the remote manifest.js.gz file.
+This may be specified as a full absolute URL (e.g.
+C<--remote-manifest=https://80x24.org/lore/pub/manifest.js.gz>),
+or a pathname relative to the ROOT_URL (e.g.
+C<--remote-manifest=pub/manifest.js.gz> when ROOT_URL is
+C<https://80x24.org/lore/>).
+
+By default, C<ROOT_URL/manifest.js.gz> is used.
+
+This is a new option in public-inbox 2.0+
+
+=item --project-list=FILE
+
+When cloning code repos from a manifest, generate a cgit-compatible
+project list.
+
+If C<FILE> does not start with C</>, C<./>, or C<../>, it is treated
+as relative to the C<DESTINATION> directory. If only C<--project-list=>
+is specified where C<FILE> is an empty string (C<"">), then C<projects.list>
+(C<$DESTINATION/projects.list>) is the implied value of C<FILE>.
+
+This is a new option in public-inbox 2.0+
+
+=item --post-update-hook=COMMAND
+
+Hooks to run after a repository is cloned or updated. C<COMMAND> will
+have the bare git repository destination given as its first and only
+argument.
+
+For v2 inboxes, this operates on a per-epoch basis.
+
+May be specified multiple times to run multiple commands in the
+order specified on the command-line.
+
+This is a new option in public-inbox 2.0+
+
+=item -p
+
+=item --prune
+
+Pass the C<--prune> and C<--prune-tags> flags to L<git-fetch(1)>
+calls on incremental clones.
+
+This is a new option in public-inbox 2.0+
+
+=item --purge
+
+Deletes entire repos which no longer exist in the remote manifest,
+or are filtered out by C<--include=> or C<--exclude=>.
+
+This is only useful when using C<--manifest>.
+
+This is a new option in public-inbox 2.0+
+
+=item --exit-code
+
+Exit with C<127> if no updates are done when relying on a manifest.
+Updates include fingerprint mismatches in the manifest, new symlinks,
+new repositories, and removed repositories from the L</--project-list>.
+
+This is a new option in public-inbox 2.0+
+
+=item -k
+
+=item --keep-going
+
+Continue as much as possible after an error.
+
+This is a new option in public-inbox 2.0+
+
+=item -n
+
+=item --dry-run
+
+Show what would be done, without making any changes.
+
+This is a new option in public-inbox 2.0+
+
+=item -q
+
+=item --quiet
+
+Quiets down progress messages, also passed to L<git-fetch(1)>.
+
+=item -v
+
+=item --verbose
+
+Increases verbosity, also passed to L<git-fetch(1)>.
+
+=item --torsocks=auto|no|yes
+
+=item --no-torsocks
+
+Whether to wrap L<git(1)> and L<curl(1)> commands with L<torsocks(1)>.
+
+Default: C<auto>
+
+=item -j JOBS
+
+=item --jobs=JOBS
+
+The number of parallel processes to spawn at once for various network
+operations using L<git(1)> and/or L<curl(1)>.
+
+=back
+
+=head1 EXAMPLES
+
+=for comment
+Sticking to smaller projects in examples to minimize load on servers
+
+=over
+
+=item To mirror the most recent epochs of dwarves and LTTng inboxes:
+
+  public-inbox-clone --epoch=~0 \
+    --include='*lttng*' --include='*dwarves' \
+    https://80x24.org/lore/ /path/to/inbox-mirror
+
+C<https://lore.kernel.org/> may be used instead of C<https://80x24.org/lore/>
+
+=item To mirror all code repos of the sparse project:
+
+  public-inbox-clone --objstore= --project-list= --prune \
+    --include='*sparse*' --inbox-config=never \
+    --remote-manifest=https://80x24.org/lore/pub/manifest.js.gz \
+    https://80x24.org/lore/ /path/to/code-mirror
+
+C<https://git.kernel.org/> may be used instead of C<https://80x24.org/lore/>
+and the C<--remote-manifest> option can be omitted.
+
+=back
+
+=head1 CONTACT
+
+Feedback welcome via plain-text mail to L<mailto:meta@public-inbox.org>
+
+The mail archives are hosted at L<https://public-inbox.org/meta/> and
+L<http://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/meta/>
+
+=head1 COPYRIGHT
+
+Copyright all contributors L<mailto:meta@public-inbox.org>
+
+License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>
+
+=head1 SEE ALSO
+
+L<public-inbox-fetch(1)>, L<public-inbox-init(1)>, L<public-inbox-index(1)>
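
Note on the --epoch=RANGE semantics documented in this patch: the rules (a single integer, a possibly open-ended LOW..HIGH range, and a ~ prefix meaning "offset from the maximum") can be sketched as below. This is an illustrative Python sketch only, not code from public-inbox. The names resolve_epochs, _bound, and max_epoch are hypothetical, and clamping out-of-range bounds as well as accepting a bare ".." are assumptions the manpage does not confirm.

```python
import re

# Hypothetical helper, not taken from public-inbox: resolves an
# --epoch=RANGE string against the highest epoch number on the remote.
_RANGE_RE = re.compile(r"(~?[0-9]+)?(?:(\.\.)(~?[0-9]+)?)?")


def _bound(token: str, max_epoch: int) -> int:
    """Turn "N" into N and "~N" into max_epoch - N."""
    if token.startswith("~"):
        return max_epoch - int(token[1:])
    return int(token)


def resolve_epochs(spec: str, max_epoch: int) -> list[int]:
    """Return the epoch numbers selected by spec, in ascending order."""
    m = _RANGE_RE.fullmatch(spec)
    if not spec or m is None:
        raise ValueError(f"invalid epoch range: {spec!r}")
    low_tok, dots, high_tok = m.groups()
    if dots is None:
        # Single value: low and high are the same epoch.
        low = high = _bound(low_tok, max_epoch)
    else:
        # Open ends default to the first (0) and latest (max_epoch) epoch.
        # Assumption: a bare ".." is accepted and selects every epoch.
        low = _bound(low_tok, max_epoch) if low_tok else 0
        high = _bound(high_tok, max_epoch) if high_tok else max_epoch
    # Assumption: out-of-range bounds are clamped rather than rejected.
    low, high = max(low, 0), min(high, max_epoch)
    return list(range(low, high + 1))
```

With a remote whose latest epoch is 5, resolve_epochs("~0", 5) selects only epoch 5 and resolve_epochs("~2..", 5) selects epochs 3, 4, and 5, matching the two examples given for --epoch; "0..~0", "0..", and "..~0" all select epochs 0 through 5.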