From: Christian Couder <christian.couder@gmail.com>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>,
	Ben Peart <Ben.Peart@microsoft.com>,
	Jonathan Tan <jonathantanmy@google.com>,
	Nguyen Thai Ngoc Duy <pclouds@gmail.com>,
	Mike Hommey <mh@glandium.org>,
	Lars Schneider <larsxschneider@gmail.com>,
	Eric Wong <e@80x24.org>,
	Christian Couder <chriscool@tuxfamily.org>
Subject: [PATCH v6 40/40] Doc/external-odb: explain transfering objects and metadata
Date: Sat, 16 Sep 2017 10:07:31 +0200
Message-ID: <20170916080731.13925-41-chriscool@tuxfamily.org>
In-Reply-To: <20170916080731.13925-1-chriscool@tuxfamily.org>

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/technical/external-odb.txt | 105 +++++++++++++++++++++++++++++++
 1 file changed, 105 insertions(+)

diff --git a/Documentation/technical/external-odb.txt b/Documentation/technical/external-odb.txt
index 58ec8a8145..76dd1e2e6c 100644
--- a/Documentation/technical/external-odb.txt
+++ b/Documentation/technical/external-odb.txt
@@ -340,3 +340,108 @@ can that contains:
 *.jpg           odb=magic
 ------------------------
 
+Transferring objects
+====================
+
+When an external odb helper is configured, the objects managed by the
+external odb are not put in the pack file that is sent (when pushing
+or answering clone and fetch requests). The receiver must therefore
+also have an external odb helper configured that can get the missing
+objects; otherwise Git will error out, complaining about missing
+objects.
+
+This has some drawbacks of course, but at least it makes sure that
+users' and admins' repositories are both properly configured to use a
+common external ODB before they can talk to each other.
+
+Transferring meta information and restartable clone
+===================================================
+
+There are different ways to make it possible for the external odb
+helpers to know which services they should get the objects from (or
+put them into). For example, the information could be hardcoded into
+the helpers, or it could be computed from configuration information
+such as the URL of the "origin" remote.
+
+The external odb mechanism itself doesn't really take care of this, so
+helpers are free to do whatever they want.
+
+One interesting possibility though is to have this information as part
+of the repository in special refs, for example refs/odbs/magic/*, where
+"magic" is the external odb name.
+
+This would especially make it possible to implement a restartable
+clone using Git bundles (and an external odb helper) like this:
+
+	1) At the very start of the clone, Git would fetch the refs
+	that contain "meta information", for example refs/odbs/magic/*
+	(where "magic" is the odb name). These refs would point to
+	some blobs that contain lists of the bundles that are
+	available for fetching by the helper, along with enough
+	information for the helper to fetch them (for example HTTP
+	URLs of the bundles).
+
+	2) After this first fetch of the refs/odbs/magic/* refs, the
+	helper would be sent the 'init' instruction. At that time it
+	can read all the blobs pointed to by these refs and download
+	the bundles listed in the blobs.
+
+	If something goes wrong when the helper "fetches" a bundle,
+	the helper could force the clone to error out (possibly after
+	retrying), and when the user (or the helper itself) tries
+	again to clone, the helper would restart its bundle "fetch"
+	(using a restartable protocol, for example HTTP).
+
+	When this "fetch" eventually succeeds, then the helper will
+	unbundle what it received, and then give back control to the
+	second regular part of the clone.
+
+	3) This regular part of the clone will then try to fetch the
+	usual refs, but as the unbundling has already updated the
+	content of the usual refs as well as the object stores, this
+	fetch will find that everything is up-to-date.
+
+	Or, if not everything is up-to-date and there are still
+	things to fetch, another, hopefully much smaller, regular
+	fetch will happen.
+
+As this is an interesting use of the external odb mechanism, the
+`--initial-refspec` option has been implemented in `git clone`. This
+makes it possible to perform all the above steps using a single clone
+command like:
+
+------------------------
+$ git clone -c odb.magic.scriptCommand="$HELPER" \
+  --initial-refspec "refs/odbs/magic/*:refs/odbs/magic/*" "$URL"
+------------------------
+
+But note that the above could also be performed using:
+
+------------------------
+$ git init
+$ git remote add origin "$URL"
+$ git fetch origin "refs/odbs/magic/*:refs/odbs/magic/*"
+$ git config odb.magic.scriptCommand "$HELPER"
+$ git fetch origin
+------------------------
+
+So the `--initial-refspec` option can be seen as just a shortcut that
+simplifies external-odb-assisted clones for users.
+
+Also note that this `--initial-refspec` approach could be slower than
+a regular clone, so it is mostly interesting if one wants to fetch a
+large number of objects or many large objects, as in an initial clone
+of a big repo. In this use case a relatively small amount of time
+spent in the initial fetch is an acceptable trade-off if the clone is
+restartable.
+
+In some cases, though, because an `--initial-refspec` clone could
+reduce resource usage on the Git server, it could even be faster than
+a regular clone.
+
+So admins and users should not blindly use the `--initial-refspec`
+option all the time when an external odb is configured. But using an
+external odb in the first place means that they have specific
+requirements for handling objects, which suggests that the regular way
+to clone might not be very good for their use cases and for the
+objects that are stored in their external ODBs.
-- 
2.14.1.576.g3f707d88cd
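
For readers who want to try the three-step flow described in the patch, here
is a minimal sketch that simulates it using only stock Git commands. It is not
the helper from this series. Several things are assumptions made purely for
illustration: the ref name refs/odbs/magic/bundles, the blob format (one
bundle location per line), and the use of a local file path where a real
helper would use an HTTP URL. The work a helper would do on the 'init'
instruction is simulated inline by a loop.

```shell
#!/bin/sh
# Sketch only: simulates the restartable-clone flow with local files.
set -e

work=$(mktemp -d)
cd "$work"

# "Server" side: a repository with one commit, plus a bundle of its history.
git init -q server
git -C server -c user.name=demo -c user.email=demo@example.com \
	commit -q --allow-empty -m "initial commit"
branch=$(git -C server symbolic-ref --short HEAD)
git -C server bundle create "$work/repo.bundle" "$branch" >/dev/null 2>&1

# Publish the bundle list as a blob under a meta-information ref.
# Assumed format: one bundle location per line.
blob=$(echo "$work/repo.bundle" | git -C server hash-object -w --stdin)
git -C server update-ref refs/odbs/magic/bundles "$blob"

# "Client" side.
git init -q client
cd client
git remote add origin "$work/server"

# Step 1: fetch only the refs that carry meta information.
git fetch -q origin "refs/odbs/magic/*:refs/odbs/magic/*"

# Step 2: what a helper might do on 'init': read each blob and pull in
# every listed bundle. A real helper would first download the bundle
# over a resumable protocol (for example with "curl -C -").
git for-each-ref --format='%(refname)' refs/odbs/magic/ |
while read -r ref
do
	git cat-file blob "$ref" |
	while read -r bundle
	do
		git fetch -q "$bundle" "refs/heads/*:refs/remotes/origin/*"
	done
done

# Step 3: the regular fetch, which should find little or nothing to do.
git fetch -q origin

server_tip=$(git -C "$work/server" rev-parse HEAD)
client_tip=$(git rev-parse "refs/remotes/origin/$branch")
echo "server: $server_tip"
echo "client: $client_tip"
```

If the two printed object names match, the bundle fetch in step 2 already
brought the client up to date before the regular fetch in step 3 ran.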

