git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff Hostetler <git@jeffhostetler.com>
To: git@vger.kernel.org
Cc: gitster@pobox.com, peff@peff.net, jonathantanmy@google.com,
	Jeff Hostetler <jeffhost@microsoft.com>
Subject: [PATCH 09/13] extension.partialclone: introduce partial clone extension
Date: Tue, 24 Oct 2017 18:53:28 +0000	[thread overview]
Message-ID: <20171024185332.57261-10-git@jeffhostetler.com> (raw)
In-Reply-To: <20171024185332.57261-1-git@jeffhostetler.com>

From: Jeff Hostetler <jeffhost@microsoft.com>

Introduce the ability to have missing objects in a repo.  This
functionality is guarded by new repository extension options:
    `extensions.partialcloneremote` and
    `extensions.partialclonefilter`.

See the update to Documentation/technical/repository-version.txt
in this patch for more information.

This patch is part of a patch originally authored by:
Jonathan Tan <jonathantanmy@google.com>

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 Documentation/technical/repository-version.txt | 22 ++++++
 Makefile                                       |  1 +
 cache.h                                        |  4 ++
 config.h                                       |  3 +
 environment.c                                  |  2 +
 partial-clone-utils.c                          | 99 ++++++++++++++++++++++++++
 partial-clone-utils.h                          | 34 +++++++++
 setup.c                                        | 15 ++++
 8 files changed, 180 insertions(+)
 create mode 100644 partial-clone-utils.c
 create mode 100644 partial-clone-utils.h

diff --git a/Documentation/technical/repository-version.txt b/Documentation/technical/repository-version.txt
index 00ad379..9d488db 100644
--- a/Documentation/technical/repository-version.txt
+++ b/Documentation/technical/repository-version.txt
@@ -86,3 +86,25 @@ for testing format-1 compatibility.
 When the config key `extensions.preciousObjects` is set to `true`,
 objects in the repository MUST NOT be deleted (e.g., by `git-prune` or
 `git repack -d`).
+
+`partialcloneremote`
+~~~~~~~~~~~~~~~~~~~~
+
+When the config key `extensions.partialcloneremote` is set, it indicates
+that the repo was created with a partial clone (or later performed
+a partial fetch) and that the remote may have omitted sending
+certain unwanted objects.  Such a remote is called a "promisor remote"
+and it promises that all such omitted objects can be fetched from it
+in the future.
+
+The value of this key is the name of the promisor remote.
+
+`partialclonefilter`
+~~~~~~~~~~~~~~~~~~~~
+
+When the config key `extensions.partialclonefilter` is set, it gives
+the initial filter expression used to create the partial clone.
+This value becomed the default filter expression for subsequent
+fetches (called "partial fetches") from the promisor remote.  This
+value may also be set by the first explicit partial fetch following a
+normal clone.
diff --git a/Makefile b/Makefile
index b9ff0b4..38632fb 100644
--- a/Makefile
+++ b/Makefile
@@ -841,6 +841,7 @@ LIB_OBJS += pack-write.o
 LIB_OBJS += pager.o
 LIB_OBJS += parse-options.o
 LIB_OBJS += parse-options-cb.o
+LIB_OBJS += partial-clone-utils.o
 LIB_OBJS += patch-delta.o
 LIB_OBJS += patch-ids.o
 LIB_OBJS += path.o
diff --git a/cache.h b/cache.h
index 6440e2b..4b785c0 100644
--- a/cache.h
+++ b/cache.h
@@ -860,12 +860,16 @@ extern int grafts_replace_parents;
 #define GIT_REPO_VERSION 0
 #define GIT_REPO_VERSION_READ 1
 extern int repository_format_precious_objects;
+extern char *repository_format_partial_clone_remote;
+extern char *repository_format_partial_clone_filter;
 
 struct repository_format {
 	int version;
 	int precious_objects;
 	int is_bare;
 	char *work_tree;
+	char *partial_clone_remote; /* value of extensions.partialcloneremote */
+	char *partial_clone_filter; /* value of extensions.partialclonefilter */
 	struct string_list unknown_extensions;
 };
 
diff --git a/config.h b/config.h
index a49d264..90544ef 100644
--- a/config.h
+++ b/config.h
@@ -34,6 +34,9 @@ struct config_options {
 	const char *git_dir;
 };
 
+#define KEY_PARTIALCLONEREMOTE "partialcloneremote"
+#define KEY_PARTIALCLONEFILTER "partialclonefilter"
+
 typedef int (*config_fn_t)(const char *, const char *, void *);
 extern int git_default_config(const char *, const char *, void *);
 extern int git_config_from_file(config_fn_t fn, const char *, void *);
diff --git a/environment.c b/environment.c
index 8289c25..2fcf9bb 100644
--- a/environment.c
+++ b/environment.c
@@ -27,6 +27,8 @@ int warn_ambiguous_refs = 1;
 int warn_on_object_refname_ambiguity = 1;
 int ref_paranoia = -1;
 int repository_format_precious_objects;
+char *repository_format_partial_clone_remote;
+char *repository_format_partial_clone_filter;
 const char *git_commit_encoding;
 const char *git_log_output_encoding;
 const char *apply_default_whitespace;
diff --git a/partial-clone-utils.c b/partial-clone-utils.c
new file mode 100644
index 0000000..8c925ae
--- /dev/null
+++ b/partial-clone-utils.c
@@ -0,0 +1,99 @@
+#include "cache.h"
+#include "config.h"
+#include "partial-clone-utils.h"
+
+int is_partial_clone_registered(void)
+{
+	if (repository_format_partial_clone_remote ||
+	    repository_format_partial_clone_filter)
+		return 1;
+
+	return 0;
+}
+
+void partial_clone_utils_register(
+	const struct list_objects_filter_options *filter_options,
+	const char *remote,
+	const char *cmd_name)
+{
+	struct strbuf buf = STRBUF_INIT;
+
+	if (is_partial_clone_registered()) {
+		/*
+		 * The original partial-clone or a previous partial-fetch
+		 * already registered the partial-clone settings.
+		 * If we get here, we are in a subsequent partial-* command
+		 * (with explicit filter args on the command line).
+		 *
+		 * For now, we restrict subsequent commands to one
+		 * consistent with the original request.  We may relax
+		 * this later after we get more experience with the
+		 * partial-clone feature.
+		 *
+		 * [] Restrict to same remote because our dynamic
+		 *    object loading only knows how to fetch objects
+		 *    from 1 remote.
+		 */
+		assert(filter_options && filter_options->choice);
+		assert(remote && *remote);
+
+		if (strcmp(remote, repository_format_partial_clone_remote))
+			die("%s --%s currently limited to remote '%s'",
+			    cmd_name, CL_ARG__FILTER,
+			    repository_format_partial_clone_remote);
+
+		/*
+		 * Treat the (possibly new) filter-spec as transient;
+		 * use it for the current command, but do not overwrite
+		 * the default.
+		 */
+		return;
+	}
+
+	repository_format_partial_clone_remote = xstrdup(remote);
+	repository_format_partial_clone_filter = xstrdup(filter_options->raw_value);
+
+	/*
+	 * Force repo version > 0 to enable extensions namespace.
+	 */
+	git_config_set("core.repositoryformatversion", "1");
+
+	/*
+	 * Use the "extensions" namespace in the config to record
+	 * the name of the remote used in the partial clone.
+	 * This will help us return to that server when we need
+	 * to backfill missing objects.
+	 *
+	 * It is also used to indicate that there *MAY* be
+	 * missing objects so that subsequent commands don't
+	 * immediately die if they hit one.
+	 *
+	 * Also remember the initial filter settings used by
+	 * clone as a default for future fetches.
+	 */
+	git_config_set("extensions." KEY_PARTIALCLONEREMOTE,
+		       repository_format_partial_clone_remote);
+	git_config_set("extensions." KEY_PARTIALCLONEFILTER,
+		       repository_format_partial_clone_filter);
+
+	/*
+	 * TODO Do we need to record both partial-clone
+	 * parameters in the extensions namespace and in the
+	 * section for the remote?
+	 *
+	 * Or should we just remember 1 in each, as in:
+	 *     "extension.partialcloneremote=<remote>"
+	 *     "remote.<remote>.filter=<filter-spec>"
+	 * The issue is when can we set both of the
+	 * repository_format_partial_clone_* globals
+	 * durint subsequent startups.
+	 * See setup.c:check_repo_format().
+	 */
+	strbuf_addf(&buf, "remote.%s.%s", remote, KEY_PARTIALCLONEREMOTE);
+	git_config_set(buf.buf, repository_format_partial_clone_remote);
+
+	strbuf_addf(&buf, "remote.%s.%s", remote, KEY_PARTIALCLONEFILTER);
+	git_config_set(buf.buf, repository_format_partial_clone_filter);
+
+	strbuf_release(&buf);
+}
diff --git a/partial-clone-utils.h b/partial-clone-utils.h
new file mode 100644
index 0000000..b527570
--- /dev/null
+++ b/partial-clone-utils.h
@@ -0,0 +1,34 @@
+#ifndef PARTIAL_CLONE_UTILS_H
+#define PARTIAL_CLONE_UTILS_H
+
+#include "list-objects-filter-options.h"
+
+/*
+ * Register that partial-clone was used to create the repo and
+ * update the config on disk.
+ *
+ * If nothing else, this indicates that the ODB may have missing
+ * objects and that various commands should handle that gracefully.
+ *
+ * Record the remote used for the clone so that we know where
+ * to get missing objects in the future.
+ *
+ * Also record the filter expression so that we know something
+ * about the missing objects (e.g., size-limit vs sparse).
+ *
+ * May also be used by a partial-fetch following a normal clone
+ * to turn on the above tracking.
+ */ 
+extern void partial_clone_utils_register(
+	const struct list_objects_filter_options *filter_options,
+	const char *remote,
+	const char *cmd_name);
+
+/*
+ * Return 1 if partial-clone was used to create the repo
+ * or a subsequent partial-fetch was used.  This is an
+ * indicator that there may be missing objects.
+ */
+extern int is_partial_clone_registered(void);
+
+#endif /* PARTIAL_CLONE_UTILS_H */
diff --git a/setup.c b/setup.c
index 03f51e0..bc4133d 100644
--- a/setup.c
+++ b/setup.c
@@ -420,6 +420,19 @@ static int check_repo_format(const char *var, const char *value, void *vdata)
 			;
 		else if (!strcmp(ext, "preciousobjects"))
 			data->precious_objects = git_config_bool(var, value);
+
+		else if (!strcmp(ext, KEY_PARTIALCLONEREMOTE))
+			if (!value)
+				return config_error_nonbool(var);
+			else
+				data->partial_clone_remote = xstrdup(value);
+
+		else if (!strcmp(ext, KEY_PARTIALCLONEFILTER))
+			if (!value)
+				return config_error_nonbool(var);
+			else
+				data->partial_clone_filter = xstrdup(value);
+
 		else
 			string_list_append(&data->unknown_extensions, ext);
 	} else if (strcmp(var, "core.bare") == 0) {
@@ -463,6 +476,8 @@ static int check_repository_format_gently(const char *gitdir, int *nongit_ok)
 	}
 
 	repository_format_precious_objects = candidate.precious_objects;
+	repository_format_partial_clone_remote = candidate.partial_clone_remote;
+	repository_format_partial_clone_filter = candidate.partial_clone_filter;
 	string_list_clear(&candidate.unknown_extensions, 0);
 	if (!has_common) {
 		if (candidate.is_bare != -1) {
-- 
2.9.3


  parent reply	other threads:[~2017-10-24 18:54 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-24 18:53 [PATCH 00/13] WIP Partial clone part 1: object filtering Jeff Hostetler
2017-10-24 18:53 ` [PATCH 01/13] dir: allow exclusions from blob in addition to file Jeff Hostetler
2017-10-25  4:05   ` Eric Sunshine
2017-10-25  6:43   ` Junio C Hamano
2017-10-25 14:54     ` Jeff Hostetler
2017-10-26  3:47       ` Junio C Hamano
2017-10-26 18:11         ` Jeff Hostetler
2017-10-24 18:53 ` [PATCH 02/13] list-objects-filter-map: extend oidmap to collect omitted objects Jeff Hostetler
2017-10-25  7:10   ` Junio C Hamano
2017-10-25 19:22     ` Jeff Hostetler
2017-10-26  4:12       ` Junio C Hamano
2017-10-24 18:53 ` [PATCH 03/13] list-objects: filter objects in traverse_commit_list Jeff Hostetler
2017-10-25  4:05   ` Jonathan Tan
2017-10-25 19:25     ` Jeff Hostetler
2017-10-24 18:53 ` [PATCH 04/13] list-objects-filter-blobs-none: add filter to omit all blobs Jeff Hostetler
2017-10-24 18:53 ` [PATCH 05/13] list-objects-filter-blobs-limit: add large blob filtering Jeff Hostetler
2017-10-24 18:53 ` [PATCH 06/13] list-objects-filter-sparse: add sparse filter Jeff Hostetler
2017-10-24 18:53 ` [PATCH 07/13] list-objects-filter-options: common argument parsing Jeff Hostetler
2017-10-25  4:14   ` Jonathan Tan
2017-10-25 19:28     ` Jeff Hostetler
2017-10-24 18:53 ` [PATCH 08/13] list-objects: add traverse_commit_list_filtered method Jeff Hostetler
2017-10-25  4:24   ` Jonathan Tan
2017-10-25 19:29     ` Jeff Hostetler
2017-10-24 18:53 ` Jeff Hostetler [this message]
2017-10-24 18:53 ` [PATCH 10/13] rev-list: add list-objects filtering support Jeff Hostetler
2017-10-25  4:41   ` Jonathan Tan
2017-10-25 19:37     ` Jeff Hostetler
2017-10-24 18:53 ` [PATCH 11/13] t6112: rev-list object filtering test Jeff Hostetler
2017-10-24 18:53 ` [PATCH 12/13] pack-objects: add list-objects filtering Jeff Hostetler
2017-10-24 18:53 ` [PATCH 13/13] t5317: pack-objects object filtering test Jeff Hostetler
2017-10-25  4:57 ` [PATCH 00/13] WIP Partial clone part 1: object filtering Jonathan Tan
2017-10-25  5:00 ` Junio C Hamano
2017-10-25  6:46   ` Jonathan Tan
2017-10-25 15:39     ` Jeff Hostetler
2017-10-26  2:09       ` Junio C Hamano
2017-10-26  2:01     ` Junio C Hamano
2017-10-30 22:27     ` Jonathan Nieder

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171024185332.57261-10-git@jeffhostetler.com \
    --to=git@jeffhostetler.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jeffhost@microsoft.com \
    --cc=jonathantanmy@google.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).