Diffstat (limited to 'Documentation/public-inbox-clone.pod')
1 files changed, 287 insertions, 0 deletions
diff --git a/Documentation/public-inbox-clone.pod b/Documentation/public-inbox-clone.pod
new file mode 100644
index 00000000..64ee3138
--- /dev/null
+++ b/Documentation/public-inbox-clone.pod
@@ -0,0 +1,287 @@
+=head1 NAME
+public-inbox-clone - "git clone --mirror" wrapper
+=head1 SYNOPSIS
+public-inbox-clone [OPTIONS] INBOX_URL [INBOX_DIR]
+public-inbox-clone [OPTIONS] ROOT_URL [DESTINATION] # public-inbox 2.0+
+=head1 DESCRIPTION
+public-inbox-clone is a wrapper around C<git clone --mirror> for
+making the initial clone of a remote HTTP(S) public-inbox.  It
+allows cloning multi-epoch v2 inboxes with a single command and
+zero configuration.
+In public-inbox 2.0+, public-inbox-clone can create and maintain
+a mirror of multiple inboxes or code repositories using manifest.js.gz
+files like L<grok-pull(1)> from grokmirror.  L<public-inbox-fetch(1)> is
+NOT required when using this mode.
+It does not run L<public-inbox-init(1)> nor
+L<public-inbox-index(1)>.  Those commands must be run separately
+if serving/searching the mirror is required.  As-is,
+public-inbox-clone is suitable for creating a git-only backup
+without Xapian and SQLite indices.
+When cloning a single inbox, public-inbox-clone creates a Makefile
+with handy targets to update the inbox once indexed.
+This Makefile may be edited by the user; it will
+not be rewritten by L<public-inbox-fetch(1)> unless it is removed.
+public-inbox-clone does not use nor require any extra
+configuration files (not even C<~/.public-inbox/config>),
+but it can download snippets suitable for adding to any
+L<public-inbox-config(5)> file.
+L<public-inbox-fetch(1)> may be used to keep a single C<INBOX_DIR>
+up to date.
+For v2 inboxes, public-inbox-clone will create a
+C<$INBOX_DIR/manifest.js.gz> file to speed up subsequent
+L<public-inbox-fetch(1)> runs.
+=head1 OPTIONS
+=item --epoch=RANGE
+Restrict clones of L<public-inbox-v2-format(5)> inboxes to the
+given range of epochs.  The range may be a single non-negative
+integer or a (possibly open-ended) C<LOW..HIGH> range of
+non-negative integers.  C<~> may be prefixed to either (or both)
+integer values to represent the offset from the maximum possible
+value.
+For example, C<--epoch=~0> alone clones only the latest epoch, and
+C<--epoch=~2..> clones the three latest epochs.
+Default: C<0..~0> or C<0..> or C<..~0>
+(all epochs, all three examples are equivalent)
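The range rules above can be sketched as a small shell function. This is only an illustration, not how public-inbox parses the option: the name epoch_range and its MAX argument (the highest epoch number on the remote) are hypothetical, and reading "~N" as "MAX minus N" is an assumption drawn from the two examples in the text.

```shell
# Hypothetical helper (not part of public-inbox): print "LOW HIGH" for a
# --epoch RANGE, given MAX, the highest epoch number on the remote.
epoch_range() {
	range=$1 max=$2
	case $range in
	*..*) lo=${range%%..*} hi=${range#*..} ;;  # LOW..HIGH, either side may be empty
	*) lo=$range hi=$range ;;                  # a single epoch
	esac
	[ -n "$lo" ] || lo=0       # an open low end starts at epoch 0
	[ -n "$hi" ] || hi='~0'    # an open high end stops at the latest epoch
	# "~N" is read here as "MAX minus N" (an assumption)
	case $lo in \~*) lo=$((max - ${lo#?})) ;; esac
	case $hi in \~*) hi=$((max - ${hi#?})) ;; esac
	echo "$lo $hi"
}
```

With a remote whose latest epoch is 5, epoch_range '~2..' 5 prints "3 5", and the three default spellings (0..~0, 0.., ..~0) all print "0 5".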
+=item -I PATTERN
+=item --include=PATTERN
+When cloning a top-level with multiple inboxes via manifest,
+only clone inboxes and repositories matching a given wildcard pattern
+(the C<*>, C<?>, and C<[]> wildcards are supported).
+This is a new option in public-inbox 2.0+
+=item --exclude=PATTERN
+When cloning a top-level with multiple inboxes via manifest,
+ignore inboxes and repositories matching the given wildcard pattern.
+Supports the same wildcards as L</--include>.
+This is a new option in public-inbox 2.0+
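As a rough illustration of how --include and --exclude wildcards might combine, the sketch below filters a name with shell glob matching. Several things here are assumptions rather than documented behavior: the function name selected, the use of plain names such as "dwarves" (the text does not say which part of a manifest entry a pattern is matched against), and the choice to let an exclude match win over an include match.

```shell
# Hypothetical filter: succeed if NAME is selected by the given patterns.
# INCLUDES and EXCLUDES are space-separated wildcard lists (illustrative only).
selected() {
	name=$1 includes=$2 excludes=$3
	set -f    # keep patterns from expanding against local files
	for pat in $excludes; do
		# assumption: an exclude match always wins
		case $name in $pat) set +f; return 1 ;; esac
	done
	# assumption: with no include patterns, everything else is kept
	[ -n "$includes" ] || { set +f; return 0; }
	for pat in $includes; do
		case $name in $pat) set +f; return 0 ;; esac
	done
	set +f
	return 1
}
```

For instance, selected lttng-dev '*lttng* *dwarves' '' succeeds, while selected sparse '*lttng* *dwarves' '' fails.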
+=item --inbox-config=always|v2|v1|never
+Whether or not to retrieve the C<$INBOX/_/text/config/raw> HTTP(S)
+endpoint when cloning.
+Since v1 inboxes cannot be distinguished from code repositories,
+setting this to C<v2> or C<never> can allow faster clones of code
+repositories if no v1 inboxes are present.
+Default: C<always>
+This is a new option in public-inbox 2.0+
+=item --inbox-version=NUM
+Force a remote public-inbox version (must be C<1> or C<2>).
+This is auto-detected by default, and this option exists mainly
+for testing.
+This is a new option in public-inbox 2.0+
+=item --objstore=DIR
+Enables space savings when the remote C<manifest.js.gz>
+includes C<forkgroup> entries as generated by grokmirror 2.x.
+If C<DIR> does not start with C</>, C<./>, or C<../>, it is treated
+as relative to the C<DESTINATION> directory.  If only C<--objstore=>
+is specified where C<DIR> is an empty string (C<"">), then C<objstore>
+(C<$DESTINATION/objstore>) is the implied value of C<DIR>.
+This is a new option in public-inbox 2.0+
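The path rule stated for --objstore can be written out as a short shell function. This is a sketch of the rule as worded above, not public-inbox's own code; resolve_opt_path and its three arguments are hypothetical names.

```shell
# Hypothetical helper mirroring the rule above: resolve VALUE against
# DESTINATION, falling back to DEFAULT_NAME when VALUE is empty.
resolve_opt_path() {
	dest=$1 value=$2 default_name=$3
	case $value in
	'') echo "$dest/$default_name" ;;   # e.g. a bare --objstore=
	/*|./*|../*) echo "$value" ;;       # used as given
	*) echo "$dest/$value" ;;           # relative to DESTINATION
	esac
}
```

With DESTINATION set to /srv/mirror, an empty value resolves to /srv/mirror/objstore, "shared" resolves to /srv/mirror/shared, and "./shared" is left untouched.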
+=item --manifest=FILE
+When incrementally updating an existing mirror, load the given
+manifest (typically C<manifest.js.gz>) to speed up updates.
+By default, public-inbox writes the retrieved manifest to
+C<$DESTINATION/manifest.js.gz>; this option also
+changes that destination to the specified C<FILE>.
+If C<FILE> does not start with C</>, C<./>, or C<../>, it is treated
+as relative to the C<DESTINATION> directory.  If only C<--manifest=>
+is specified where C<FILE> is an empty string (C<"">), then C<manifest.js.gz>
+(C<$DESTINATION/manifest.js.gz>) is the implied value of C<FILE>.
+When updating manifests with many forks using the same objstore,
+git 2.41+ is highly recommended for performance as we automatically
+use the C<fetch.hideRefs> feature to speed up negotiation.
+C<--manifest=> is a new option in public-inbox 2.0+
+=item --remote-manifest=URL|RELATIVE_PATH
+Use an alternate location for the remote manifest.js.gz file.
+This may be specified as a full absolute URL (e.g
+or a pathname relative to the ROOT_URL (e.g
+C<--remote-manifest=pub/manifest.js.gz> when ROOT_URL is
+By default, C<ROOT_URL/manifest.js.gz> is used.
+This is a new option in public-inbox 2.0+
+=item --project-list=FILE
+When cloning code repos from a manifest, generate a cgit-compatible
+project list.
+If C<FILE> does not start with C</>, C<./>, or C<../>, it is treated
+as relative to the C<DESTINATION> directory.  If only C<--project-list=>
+is specified where C<FILE> is an empty string (C<"">), then C<projects.list>
+(C<$DESTINATION/projects.list>) is the implied value of C<FILE>.
+This is a new option in public-inbox 2.0+
+=item --post-update-hook=COMMAND
+Hooks to run after a repository is cloned or updated.  C<COMMAND> will
+have the bare git repository destination given as its first and only
+argument.
+For v2 inboxes, this operates on a per-epoch basis.
+May be specified multiple times to run multiple commands in the
+order specified on the command-line.
+This is a new option in public-inbox 2.0+
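A hook of this kind could be as small as the sketch below, which assumes the bare repository path arrives as the sole argument. The function name log_update and the HOOK_LOG variable are hypothetical; nothing here is defined by public-inbox.

```shell
# Hypothetical hook body: append the updated repository path to a log file.
# HOOK_LOG is an illustrative variable, not something public-inbox defines.
log_update() {
	git_dir=$1
	echo "updated $git_dir" >>"${HOOK_LOG:-$HOME/clone-updates.log}"
}
```

To use it as a COMMAND, the function would go into an executable script that ends with log_update "$1", and the path of that script would be given to --post-update-hook.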
+=item -p
+=item --prune
+Pass the C<--prune> and C<--prune-tags> flags to L<git-fetch(1)>
+calls on incremental clones.
+This is a new option in public-inbox 2.0+
+=item --purge
+Deletes entire repos which no longer exist in the remote manifest,
+or are filtered out by C<--include=> or C<--exclude=>.
+This is only useful when using C<--manifest>.
+This is a new option in public-inbox 2.0+
+=item --exit-code
+Exit with C<127> if no updates are done when relying on a manifest.
+Updates include fingerprint mismatches in the manifest, new symlinks,
+new repositories, and removed repositories from the L</--project-list>.
+This is a new option in public-inbox 2.0+
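A wrapper script can branch on this status to decide whether follow-up work is needed. In the sketch below only the meaning of 127 comes from the text above; treating 0 as "something changed" and every other status as a failure are assumptions, and next_step is a hypothetical name.

```shell
# Hypothetical wrapper logic: map an exit status to a next step.
# Only the meaning of 127 comes from the text; the other cases are assumptions.
next_step() {
	case $1 in
	0) echo reindex ;;   # assumed: something changed
	127) echo skip ;;    # no updates were done
	*) echo error ;;     # assumed: any other status is a failure
	esac
}
```

A cron job might run public-inbox-clone with --exit-code and --manifest=, then call next_step $? to decide whether to run an indexing step.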
+=item -k
+=item --keep-going
+Continue as much as possible after an error.
+This is a new option in public-inbox 2.0+
+=item -n
+=item --dry-run
+Show what would be done, without making any changes.
+This is a new option in public-inbox 2.0+
+=item -q
+=item --quiet
+Quiets down progress messages, also passed to L<git-fetch(1)>.
+=item -v
+=item --verbose
+Increases verbosity, also passed to L<git-fetch(1)>.
+=item --torsocks=auto|no|yes
+=item --no-torsocks
+Whether to wrap L<git(1)> and L<curl(1)> commands with L<torsocks(1)>.
+Default: C<auto>
+=item -j JOBS
+=item --jobs=JOBS
+The number of parallel processes to spawn at once for various network
+operations using L<git(1)> and/or L<curl(1)>.
+=head1 EXAMPLES
+=for comment
+Sticking to smaller projects in examples to minimize load on servers
+=item To mirror the most recent epochs of dwarves and LTTng inboxes:
+  public-inbox-clone --epoch=~0 \
+        --include='*lttng*' --include='*dwarves' \
+        https://80x24.org/lore/ /path/to/inbox-mirror
+C<https://lore.kernel.org/> may be used instead of C<https://80x24.org/lore/>.
+=item To mirror all code repos of the sparse project:
+  public-inbox-clone --objstore= --project-list= --prune \
+        --include='*sparse*' --inbox-config=never \
+        --remote-manifest=https://80x24.org/lore/pub/manifest.js.gz \
+        https://80x24.org/lore/ /path/to/code-mirror
+C<https://git.kernel.org/> may be used instead of C<https://80x24.org/lore/>
+and the C<--remote-manifest> option can be omitted.
+=head1 CONTACT
+Feedback welcome via plain-text mail to L<mailto:meta@public-inbox.org>
+The mail archives are hosted at L<https://public-inbox.org/meta/> and
+=head1 COPYRIGHT
+Copyright all contributors L<mailto:meta@public-inbox.org>
+License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>
+=head1 SEE ALSO
+L<public-inbox-fetch(1)>, L<public-inbox-init(1)>, L<public-inbox-index(1)>