about summary refs log tree commit homepage
path: root/Documentation/public-inbox-clone.pod
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/public-inbox-clone.pod')
-rw-r--r--Documentation/public-inbox-clone.pod287
1 files changed, 287 insertions, 0 deletions
diff --git a/Documentation/public-inbox-clone.pod b/Documentation/public-inbox-clone.pod
new file mode 100644
index 00000000..64ee3138
--- /dev/null
+++ b/Documentation/public-inbox-clone.pod
@@ -0,0 +1,287 @@
+=head1 NAME
+
+public-inbox-clone - "git clone --mirror" wrapper
+
+=head1 SYNOPSIS
+
+public-inbox-clone [OPTIONS] INBOX_URL [INBOX_DIR]
+
+public-inbox-clone [OPTIONS] ROOT_URL [DESTINATION] # public-inbox 2.0+
+
+=head1 DESCRIPTION
+
+public-inbox-clone is a wrapper around C<git clone --mirror> for
+making the initial clone of a remote HTTP(S) public-inbox.  It
+allows cloning multi-epoch v2 inboxes with a single command and
+zero configuration.
+
+In public-inbox 2.0+, public-inbox-clone can create and maintain
+a mirror of multiple inboxes or code repositories using manifest.js.gz
+files like L<grok-pull(1)> from grokmirror.  L<public-inbox-fetch(1)> is
+NOT required when using this mode.
+
+It does not run L<public-inbox-init(1)> nor
+L<public-inbox-index(1)>.  Those commands must be run separately
+if serving/searching the mirror is required.  As-is,
+public-inbox-clone is suitable for creating a git-only backup
+without Xapian and SQLite indices.
+
+When cloning a single inbox, public-inbox-clone creates a Makefile
+with handy targets to update the inbox once indexed.
+This Makefile may be edited by the user; it will
+not be rewritten by L<public-inbox-fetch(1)> unless it is removed
+completely.
+
+public-inbox-clone does not use nor require any extra
+configuration files (not even C<~/.public-inbox/config>),
+but it can download snippets suitable for adding to any
+L<public-inbox-config(5)> file.
+
+L<public-inbox-fetch(1)> may be used to keep a single C<INBOX_DIR>
+up-to-date.
+
+For v2 inboxes, it will create a C<$INBOX_DIR/manifest.js.gz>
+file to speed up subsequent L<public-inbox-fetch(1)>.
+
+=head1 OPTIONS
+
+=over
+
+=item --epoch=RANGE
+
+Restrict clones of L<public-inbox-v2-format(5)> inboxes to the
+given range of epochs.  The range may be a single non-negative
+integer or a (possibly open-ended) C<LOW..HIGH> range of
+non-negative integers.  C<~> may be prefixed to either (or both)
+integer values to represent the offset from the maximum possible
+value.
+
+For example, C<--epoch=~0> alone clones only the latest epoch,
+C<--epoch=~2..> clones the three latest epochs.
+
+Default: C<0..~0> or C<0..> or C<..~0>
+(all epochs, all three examples are equivalent)
+
+=item -I PATTERN
+
+=item --include=PATTERN
+
+When cloning a top-level with multiple inboxes via manifest,
+only clone inboxes and repositories matching a given wildcard pattern
+(using C<*?> and C<[]> is supported).
+
+This is a new option in public-inbox 2.0+
+
+=item --exclude=PATTERN
+
+When cloning a top-level with multiple inboxes via manifest,
+ignore inboxes and repositories matching the given wildcard pattern.
+Supports the same wildcards as L</--include>
+
+This is a new option in public-inbox 2.0+
+
+=item --inbox-config=always|v2|v1|never
+
+Whether or not to retrieve the C<$INBOX/_/text/config/raw> HTTP(S)
+endpoint when cloning.
+
+Since we can't deduce v1 inboxes from code repositories, setting this
+to C<v2> or C<never> can allow faster clones of code repositories if
+no v1 inboxes are present.
+
+Default: C<always>
+
+This is a new option in public-inbox 2.0+
+
+=item --inbox-version=NUM
+
+Force a remote public-inbox version (must be C<1> or C<2>).
+This is auto-detected by default, and this option exists mainly
+for testing.
+
+This is a new option in public-inbox 2.0+
+
+=item --objstore=DIR
+
+Enables space savings when the remote C<manifest.js.gz>
+includes C<forkgroup> entries as generated by grokmirror 2.x.
+
+If C<DIR> does not start with C</>, C<./>, or C<../>, it is treated
+as relative to the C<DESTINATION> directory.  If only C<--objstore=>
+is specified where C<DIR> is an empty string (C<"">), then C<objstore>
+(C<$DESTINATION/objstore>) is the implied value of C<DIR>.
+
+This is a new option in public-inbox 2.0+
+
+=item --manifest=FILE
+
+When incrementally updating an existing mirror, load the given
+manifest (typically C<manifest.js.gz>) to speed up updates.
+
+By default, public-inbox writes the retrieved manifest to
+C<$DESTINATION/manifest.js.gz>, this directive also
+changes the destination to the specified C<FILE>
+
+If C<FILE> does not start with C</>, C<./>, or C<../>, it is treated
+as relative to the C<DESTINATION> directory.  If only C<--manifest=>
+is specified where C<FILE> is an empty string (C<"">), then C<manifest.js.gz>
+(C<$DESTINATION/manifest.js.gz>) is the implied value of C<FILE>.
+
+When updating manifests with many forks using the same objstore,
+git 2.41+ is highly recommended for performance as we automatically
+use the C<fetch.hideRefs> feature to speed up negotiation.
+
+C<--manifest=> is a new option in public-inbox 2.0+
+
+=item --remote-manifest=URL|RELATIVE_PATH
+
+Use an alternate location for the remote manifest.js.gz file.
+This may be specified as a full absolute URL (e.g
+C<--remote-manifest=https://80x24.org/lore/pub/manifest.js.gz>),
+or a pathname relative to the ROOT_URL (e.g
+C<--remote-manifest=pub/manifest.js.gz> when ROOT_URL is
+C<https://80x24.org/lore/>
+
+By default, C<ROOT_URL/manifest.js.gz> is used.
+
+This is a new option in public-inbox 2.0+
+
+=item --project-list=FILE
+
+When cloning code repos from a manifest, generate a cgit-compatible
+project list.
+
+If C<FILE> does not start with C</>, C<./>, or C<../>, it is treated
+as relative to the C<DESTINATION> directory.  If only C<--project-list=>
+is specified where C<FILE> is an empty string (C<"">), then C<projects.list>
+(C<$DESTINATION/projects.list>) is the implied value of C<FILE>.
+
+This is a new option in public-inbox 2.0+
+
+=item --post-update-hook=COMMAND
+
+Hooks to run after a repository is cloned or updated, C<COMMAND> will
+have the bare git repository destination given as its first and only
+argument.
+
+For v2 inboxes, this operates on a per-epoch basis.
+
+May be specified multiple times to run multiple commands in the
+order specified on the command-line.
+
+This is a new option in public-inbox 2.0+
+
+=item -p
+
+=item --prune
+
+Pass the C<--prune> and C<--prune-tags> flags to L<git-fetch(1)>
+calls on incremental clones.
+
+This is a new option in public-inbox 2.0+
+
+=item --purge
+
+Deletes entire repos which no longer exist in the remote manifest,
+or are filtered out by C<--include=> or C<--exclude=>.
+
+This is only useful when using C<--manifest>
+
+This is a new option in public-inbox 2.0+
+
+=item --exit-code
+
+Exit with C<127> if no updates are done when relying on a manifest.
+Updates include fingerprint mismatches in the manifest, new symlinks,
+new repositories, and removed repositories from the L<--project-list>
+
+This is a new option in public-inbox 2.0+
+
+=item -k
+
+=item --keep-going
+
+Continue as much as possible after an error.
+
+This is a new option in public-inbox 2.0+
+
+=item -n
+
+=item --dry-run
+
+Show what would be done, without making any changes.
+
+This is a new option in public-inbox 2.0+
+
+=item -q
+
+=item --quiet
+
+Quiets down progress messages, also passed to L<git-fetch(1)>.
+
+=item -v
+
+=item --verbose
+
+Increases verbosity, also passed to L<git-fetch(1)>.
+
+=item --torsocks=auto|no|yes
+
+=item --no-torsocks
+
+Whether to wrap L<git(1)> and L<curl(1)> commands with L<torsocks(1)>.
+
+Default: C<auto>
+
+=item -j JOBS
+
+=item --jobs=JOBS
+
+The number of parallel processes to spawn at once for various network
+operations using L<git(1)> and/or L<curl(1)>.
+
+=back
+
+=head1 EXAMPLES
+
+=for comment
+Sticking to smaller projects in examples to minimize load on servers
+
+=over
+
+=item To mirror the most recent epochs of dwarves and LTTng inboxes:
+
+  public-inbox-clone --epoch=~0 \
+        --include='*lttng*' --include='*dwarves' \
+        https://80x24.org/lore/ /path/to/inbox-mirror
+
+C<https://lore.kernel.org/> may be used instead of C<https://80x24.org/lore/>
+
+=item To mirror all code repos of the sparse project:
+
+  public-inbox-clone --objstore= --project-list= --prune \
+        --include='*sparse*' --inbox-config=never \
+        --remote-manifest=https://80x24.org/lore/pub/manifest.js.gz \
+        https://80x24.org/lore/ /path/to/code-mirror
+
+C<https://git.kernel.org/> may be used instead of C<https://80x24.org/lore/>
+and the C<--remote-manifest> option can be omitted.
+
+=back
+
+=head1 CONTACT
+
+Feedback welcome via plain-text mail to L<mailto:meta@public-inbox.org>
+
+The mail archives are hosted at L<https://public-inbox.org/meta/> and
+L<http://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/meta/>
+
+=head1 COPYRIGHT
+
+Copyright all contributors L<mailto:meta@public-inbox.org>
+
+License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>
+
+=head1 SEE ALSO
+
+L<public-inbox-fetch(1)>, L<public-inbox-init(1)>, L<public-inbox-index(1)>