From: Christian Couder <christian.couder@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>, "Jeff King" <peff@peff.net>,
"Ben Peart" <Ben.Peart@microsoft.com>,
"Jonathan Tan" <jonathantanmy@google.com>,
"Jonathan Nieder" <jrnieder@gmail.com>,
"Nguyen Thai Ngoc Duy" <pclouds@gmail.com>,
"Mike Hommey" <mh@glandium.org>,
"Lars Schneider" <larsxschneider@gmail.com>,
"Eric Wong" <e@80x24.org>,
"Christian Couder" <chriscool@tuxfamily.org>,
"Jeff Hostetler" <jeffhost@microsoft.com>,
"Eric Sunshine" <sunshine@sunshineco.com>,
"Beat Bolli" <dev+git@drbeat.li>,
"SZEDER Gábor" <szeder.dev@gmail.com>,
"Ramsay Jones" <ramsay@ramsayjones.plus.com>
Subject: [PATCH v5 12/16] partial-clone: add multiple remotes in the doc
Date: Tue, 9 Apr 2019 18:11:12 +0200
Message-ID: <20190409161116.30256-13-chriscool@tuxfamily.org>
In-Reply-To: <20190409161116.30256-1-chriscool@tuxfamily.org>
While at it, let's remove a reference to the ODB effort, as that
effort has been replaced by directly enhancing the partial clone
and promisor remote features.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
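Note, not part of the commit message: the selection and ordering rules
documented in the new "Using many promisor remotes" section below can
be sketched as follows. This is a minimal Python illustration based
only on the text of this patch, not on the C code in promisor-remote.c.
The names promisor_try_order(), fetch_missing(), the fetch_from
callback and the (key, value) representation of the config are all
hypothetical and exist only for this example.

```python
from typing import Callable, Iterable, List, Optional, Set, Tuple

TRUE_VALUES = {"true", "yes", "on", "1"}  # spellings Git accepts for "true"


def promisor_try_order(config: Iterable[Tuple[str, str]]) -> List[str]:
    """Return promisor remote names in the order they would be tried.

    `config` is an assumed list of (key, value) pairs in config-file order.
    """
    order: List[str] = []
    last: Optional[str] = None
    for key, value in config:
        if key.lower() == "extensions.partialclone":
            last = value  # only one remote can be named this way
            continue
        parts = key.split(".")
        if len(parts) < 3 or parts[0].lower() != "remote":
            continue
        name = ".".join(parts[1:-1])  # remote names may contain dots
        var = parts[-1].lower()
        is_promisor = (
            (var == "promisor" and value.lower() in TRUE_VALUES)
            or var == "partialclonefilter"
        )
        if is_promisor and name not in order:
            order.append(name)  # keep config-file order
    if last is not None:
        # The remote from extensions.partialClone is tried last.
        if last in order:
            order.remove(last)
        order.append(last)
    return order


def fetch_missing(
    wanted: Set[str],
    order: List[str],
    fetch_from: Callable[[str, Set[str]], Set[str]],
) -> Set[str]:
    """Try remotes one after the other; return the objects still missing.

    `fetch_from(name, objects)` is a hypothetical callback that returns
    the subset of `objects` it obtained from remote `name`.
    """
    missing = set(wanted)
    for name in order:
        if not missing:
            break  # everything has been fetched, stop early
        missing -= fetch_from(name, missing)
    return missing


config = [
    ("extensions.partialClone", "origin"),
    ("remote.origin.promisor", "true"),
    ("remote.cache1.promisor", "true"),
    ("remote.cache2.partialCloneFilter", "blob:none"),
]
order = promisor_try_order(config)  # ['cache1', 'cache2', 'origin']
```

With this config, the two cache remotes are tried in the order they
appear and origin is tried last, because it is the remote named by
extensions.partialClone.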
Documentation/technical/partial-clone.txt | 117 ++++++++++++++++------
1 file changed, 84 insertions(+), 33 deletions(-)
diff --git a/Documentation/technical/partial-clone.txt b/Documentation/technical/partial-clone.txt
index 896c7b3878..210373e258 100644
--- a/Documentation/technical/partial-clone.txt
+++ b/Documentation/technical/partial-clone.txt
@@ -30,12 +30,20 @@ advance* during clone and fetch operations and thereby reduce download
times and disk usage. Missing objects can later be "demand fetched"
if/when needed.
+A remote that can later provide the missing objects is called a
+promisor remote, as it promises to send the objects when
+requested. Initially, Git supported only one promisor remote, the origin
+remote from which the user cloned and that was configured in the
+"extensions.partialClone" config option. Later, support for more than
+one promisor remote was implemented.
+
Use of partial clone requires that the user be online and the origin
-remote be available for on-demand fetching of missing objects. This may
-or may not be problematic for the user. For example, if the user can
-stay within the pre-selected subset of the source tree, they may not
-encounter any missing objects. Alternatively, the user could try to
-pre-fetch various objects if they know that they are going offline.
+remote or other promisor remotes be available for on-demand fetching
+of missing objects. This may or may not be problematic for the user.
+For example, if the user can stay within the pre-selected subset of
+the source tree, they may not encounter any missing objects.
+Alternatively, the user could try to pre-fetch various objects if they
+know that they are going offline.
Non-Goals
@@ -100,18 +108,18 @@ or commits that reference missing trees.
Handling Missing Objects
------------------------
-- An object may be missing due to a partial clone or fetch, or missing due
- to repository corruption. To differentiate these cases, the local
- repository specially indicates such filtered packfiles obtained from the
- promisor remote as "promisor packfiles".
+- An object may be missing due to a partial clone or fetch, or missing
+ due to repository corruption. To differentiate these cases, the
+ local repository specially indicates such filtered packfiles
+ obtained from promisor remotes as "promisor packfiles".
+
These promisor packfiles consist of a "<name>.promisor" file with
arbitrary contents (like the "<name>.keep" files), in addition to
their "<name>.pack" and "<name>.idx" files.
- The local repository considers a "promisor object" to be an object that
- it knows (to the best of its ability) that the promisor remote has promised
- that it has, either because the local repository has that object in one of
+ it knows (to the best of its ability) that promisor remotes have promised
+ that they have, either because the local repository has that object in one of
its promisor packfiles, or because another promisor object refers to it.
+
When Git encounters a missing object, Git can see if it is a promisor object
@@ -123,12 +131,12 @@ expensive-to-modify list of missing objects.[a]
- Since almost all Git code currently expects any referenced object to be
present locally and because we do not want to force every command to do
a dry-run first, a fallback mechanism is added to allow Git to attempt
- to dynamically fetch missing objects from the promisor remote.
+ to dynamically fetch missing objects from promisor remotes.
+
When the normal object lookup fails to find an object, Git invokes
-fetch-object to try to get the object from the server and then retry
-the object lookup. This allows objects to be "faulted in" without
-complicated prediction algorithms.
+promisor_remote_get_direct() to try to get the object from a promisor
+remote and then retry the object lookup. This allows objects to be
+"faulted in" without complicated prediction algorithms.
+
For efficiency reasons, no check as to whether the missing object is
actually a promisor object is performed.
@@ -157,8 +165,7 @@ and prefetch those objects in bulk.
+
We are not happy with this global variable and would like to remove it,
but that requires significant refactoring of the object code to pass an
-additional flag. We hope that concurrent efforts to add an ODB API can
-encompass this.
+additional flag.
Fetching Missing Objects
@@ -182,21 +189,63 @@ has been updated to not use any object flags when the corresponding argument
though they are not necessary.
+Using many promisor remotes
+---------------------------
+
+Many promisor remotes can be configured and used.
+
+This allows a user, for example, to have multiple geographically close
+cache servers for fetching missing blobs while continuing to do
+filtered `git-fetch` commands from the central server.
+
+When fetching objects, promisor remotes are tried one after the other
+until all the objects have been fetched.
+
+Remotes that are considered "promisor" remotes are those specified by
+the following configuration variables:
+
+- `extensions.partialClone = <name>`
+
+- `remote.<name>.promisor = true`
+
+- `remote.<name>.partialCloneFilter = ...`
+
+Only one promisor remote can be configured using the
+`extensions.partialClone` config variable. This promisor remote will
+be the last one tried when fetching objects.
+
+We decided to make it the last one we try, because it is likely that
+someone using many promisor remotes is doing so because the other
+promisor remotes are better for some reason (maybe they are closer or
+faster for some kind of objects) than the origin, and the origin is
+likely to be the remote specified by extensions.partialClone.
+
+This justification is not very strong, but a choice had to be made,
+and in any case the long-term plan should be to make the order
+fully configurable.
+
+For now, though, the other promisor remotes will be tried in the order
+they appear in the config file.
+
Current Limitations
-------------------
-- The remote used for a partial clone (or the first partial fetch
- following a regular clone) is marked as the "promisor remote".
+- It is not possible to specify the order in which the promisor
+  remotes are tried, other than through the order in which they appear
+  in the config file.
+
-We are currently limited to a single promisor remote and only that
-remote may be used for subsequent partial fetches.
+It is also not possible to specify an order to be used when fetching
+from one remote and a different order when fetching from another
+remote.
+
+- It is not possible to push only specific objects to a promisor
+ remote.
+
-We accept this limitation because we believe initial users of this
-feature will be using it on repositories with a strong single central
-server.
+It is not possible to push at the same time to multiple promisor
+remotes in a specific order.
-- Dynamic object fetching will only ask the promisor remote for missing
- objects. We assume that the promisor remote has a complete view of the
+- Dynamic object fetching will only ask promisor remotes for missing
+ objects. We assume that promisor remotes have a complete view of the
repository and can satisfy all such requests.
- Repack essentially treats promisor and non-promisor packfiles as 2
@@ -218,15 +267,17 @@ server.
Future Work
-----------
-- Allow more than one promisor remote and define a strategy for fetching
- missing objects from specific promisor remotes or of iterating over the
- set of promisor remotes until a missing object is found.
+- Improve the way to specify the order in which promisor remotes are
+ tried.
+
-A user might want to have multiple geographically-close cache servers
-for fetching missing blobs while continuing to do filtered `git-fetch`
-commands from the central server, for example.
+For example, this could allow specifying explicitly something like:
+"When fetching from this remote, I want to use these promisor remotes
+in this order, though, when pushing or fetching to that remote, I want
+to use those promisor remotes in that order."
+
+- Allow pushing to promisor remotes.
+
-Or the user might want to work in a triangular work flow with multiple
+The user might want to work in a triangular work flow with multiple
promisor remotes that each have an incomplete view of the repository.
- Allow repack to work on promisor packfiles (while keeping them distinct
--
2.21.0.750.g68c8ebb2ac