git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v3 00/11] Many promisor remotes
@ 2019-03-12 13:29 Christian Couder
  2019-03-12 13:29 ` [PATCH v3 01/11] fetch-object: make functions return an error code Christian Couder
                   ` (10 more replies)
  0 siblings, 11 replies; 22+ messages in thread
From: Christian Couder @ 2019-03-12 13:29 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Stefan Beller, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli

This path series is a follow up from the "remote odb" patch series
that I sent last year, which were a follow up from previous
series. See the links section for more information.

The goal of this patch series is to make it possible to have and to
fetch missing objects from multiple remotes instead of only one.

For now the fetch order is the order of the remotes in the config,
except for the remote specified by extensions.partialclone config
option which comes last in the fetch order.

I selected the name "Promisor remote" over "Partial clone remote"
because it is shorter and because it is not only about cloning.

The existing extensions.partialclone config option is respected, but
it is not written in the config when a partial clone or fetch is
made. Instead remote.<name>.promisor is set to "true". This may create
a compatibility issue, but it makes it possible to start using many
promisor remotes by just cloning and fetching from different remotes
with partial clone filters. The compatibility issue could be resolved
in a future iteration by just setting extensions.partialclone instead
of remote.<name>.promisor the first time a promisor remote is used.

In general I have tried to change as few things as possible.

High level changes since the V2
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  - remove the 2 first patches from V2 that have been sent and merged
    separately

  - improved commit message in patch 8/11 documentation patches
  
  - slightly improved cover letter

As I got no comment on V2, I am not sure what to improve at this
point.

High level overview of this patch series
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  - Patch 1/11:

This patch, makes functions in fetch-object.c return an error code,
which is necessary to later tell that they failed and try another
promisor remote when there is more than one. This could also just be
seen as a fix to these functions.

  - Patch 2/11:

This introduces the minimum infrastructure for promisor remotes.

  - Patch 3/11, 4/11 and 5/11:

This add a few missing bits in the promisor remote infrastructure that
will be needed in the following patches.

  - Patch 6/11:

This replaces the previous interface that used only one promisor
remote defined in extensions.partialclone with the new interface
created by the previous patches.

  - Patch 7/11:

This replaces the way a partial clone filter was handled by a new way
based on the previous patches that support more than one partial clone
filter.

  - Patch 8/11:

This patch removes the limitation in builtin/fetch.c to have only one
promisor remote.

  - Patch 9/11:

This adds test cases that shows that now more than one promisor remote
can be used and that remote.<name>.promisor is set to "true" when
fetching from a new promisor remote.

  - Patch 10/11 and 11/11:

These are new documentation patches, to explain how things can work
with more than one promisor remote.

Links
~~~~~

This patch series on GitHub:

V3: https://github.com/chriscool/git/commits/many-promisor-remotes
V2: https://github.com/chriscool/git/commits/many-promisor-remotes35
V1: https://github.com/chriscool/git/commits/many-promisor-remotes17

On the mailing list:

V2: https://public-inbox.org/git/20190122144212.15119-1-chriscool@tuxfamily.org/
V1: https://public-inbox.org/git/20181211052746.16218-1-chriscool@tuxfamily.org/

This patch series is a follow up from the discussions related to
the remote odb V4 patch series:

https://public-inbox.org/git/20180802061505.2983-1-chriscool@tuxfamily.org/

Especially in:

https://public-inbox.org/git/CAP8UFD3nrhjANwNDqTwx5ZtnZNcnbAFqUN=u=LrvzuH4+3wQQA@mail.gmail.com/

I said that I would like to work on things in the following order:

  1) Teaching partial clone to attempt to fetch missing objects from
multiple remotes instead of only one using the order in the config.

  2) Simplifying the protocol for fetching missing objects so that it
can be satisfied by a lighter weight object storage system than a full
Git server.

  3) Making it possible to explicitly define an order in which the
remotes are accessed.

  4) Making the criteria for what objects can be missing more
aggressive, so that I can "git add" a large file and work with it
using Git without even having a second copy of that object in my local
object store.

And this patch series is about the 1).

The previous remote odb patch series on GitHub:

V5: https://github.com/chriscool/git/commits/remote-odb
V4: https://github.com/chriscool/git/commits/remote-odb5
V3: https://github.com/chriscool/git/commits/remote-odb3
V2: https://github.com/chriscool/git/commits/remote-odb2
V1: https://github.com/chriscool/git/commits/remote-odb1

Discussions related to previous versions of the odb patch series:

V4: https://public-inbox.org/git/20180802061505.2983-1-chriscool@tuxfamily.org/
V3: https://public-inbox.org/git/20180713174959.16748-1-chriscool@tuxfamily.org/
V2: https://public-inbox.org/git/20180630083542.20347-1-chriscool@tuxfamily.org/
V1: https://public-inbox.org/git/20180623121846.19750-1-chriscool@tuxfamily.org/


Christian Couder (11):
  fetch-object: make functions return an error code
  Add initial support for many promisor remotes
  promisor-remote: implement promisor_remote_get_direct()
  promisor-remote: add promisor_remote_reinit()
  promisor-remote: use repository_format_partial_clone
  Use promisor_remote_get_direct() and has_promisor_remote()
  promisor-remote: parse remote.*.partialclonefilter
  builtin/fetch: remove unique promisor remote limitation
  t0410: test fetching from many promisor remotes
  partial-clone: add multiple remotes in the doc
  remote: add promisor and partial clone config to the doc

 Documentation/config/remote.txt           |   8 ++
 Documentation/technical/partial-clone.txt |  83 ++++++++----
 Makefile                                  |   1 +
 builtin/cat-file.c                        |   5 +-
 builtin/fetch.c                           |  29 ++---
 builtin/gc.c                              |   3 +-
 builtin/repack.c                          |   3 +-
 cache-tree.c                              |   3 +-
 connected.c                               |   3 +-
 fetch-object.c                            |  13 +-
 fetch-object.h                            |   4 +-
 list-objects-filter-options.c             |  51 ++++----
 list-objects-filter-options.h             |   3 +-
 packfile.c                                |   3 +-
 promisor-remote.c                         | 148 ++++++++++++++++++++++
 promisor-remote.h                         |  22 ++++
 sha1-file.c                               |  14 +-
 t/t0410-partial-clone.sh                  |  49 ++++++-
 t/t5601-clone.sh                          |   3 +-
 t/t5616-partial-clone.sh                  |   4 +-
 unpack-trees.c                            |   6 +-
 21 files changed, 362 insertions(+), 96 deletions(-)
 create mode 100644 promisor-remote.c
 create mode 100644 promisor-remote.h

-- 
2.21.0.166.gb5e4dbcfd3


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v3 01/11] fetch-object: make functions return an error code
  2019-03-12 13:29 [PATCH v3 00/11] Many promisor remotes Christian Couder
@ 2019-03-12 13:29 ` Christian Couder
  2019-03-12 13:29 ` [PATCH v3 02/11] Add initial support for many promisor remotes Christian Couder
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Christian Couder @ 2019-03-12 13:29 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Stefan Beller, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, Christian Couder

From: Christian Couder <christian.couder@gmail.com>

The callers of the fetch_object() and fetch_objects() might
be interested in knowing if these functions succeeded or not.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 fetch-object.c | 13 ++++++++-----
 fetch-object.h |  4 ++--
 sha1-file.c    |  4 ++--
 3 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/fetch-object.c b/fetch-object.c
index 4266548800..eac4d448ef 100644
--- a/fetch-object.c
+++ b/fetch-object.c
@@ -5,11 +5,12 @@
 #include "transport.h"
 #include "fetch-object.h"
 
-static void fetch_refs(const char *remote_name, struct ref *ref)
+static int fetch_refs(const char *remote_name, struct ref *ref)
 {
 	struct remote *remote;
 	struct transport *transport;
 	int original_fetch_if_missing = fetch_if_missing;
+	int res;
 
 	fetch_if_missing = 0;
 	remote = remote_get(remote_name);
@@ -19,12 +20,14 @@ static void fetch_refs(const char *remote_name, struct ref *ref)
 
 	transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
 	transport_set_option(transport, TRANS_OPT_NO_DEPENDENTS, "1");
-	transport_fetch_refs(transport, ref);
+	res = transport_fetch_refs(transport, ref);
 	fetch_if_missing = original_fetch_if_missing;
+
+	return res;
 }
 
-void fetch_objects(const char *remote_name, const struct object_id *oids,
-		   int oid_nr)
+int fetch_objects(const char *remote_name, const struct object_id *oids,
+		  int oid_nr)
 {
 	struct ref *ref = NULL;
 	int i;
@@ -36,5 +39,5 @@ void fetch_objects(const char *remote_name, const struct object_id *oids,
 		new_ref->next = ref;
 		ref = new_ref;
 	}
-	fetch_refs(remote_name, ref);
+	return fetch_refs(remote_name, ref);
 }
diff --git a/fetch-object.h b/fetch-object.h
index d6444caa5a..7bcc7cadb0 100644
--- a/fetch-object.h
+++ b/fetch-object.h
@@ -3,7 +3,7 @@
 
 struct object_id;
 
-void fetch_objects(const char *remote_name, const struct object_id *oids,
-		   int oid_nr);
+int fetch_objects(const char *remote_name, const struct object_id *oids,
+		  int oid_nr);
 
 #endif
diff --git a/sha1-file.c b/sha1-file.c
index 494606f771..01cc0590f4 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -1372,8 +1372,8 @@ int oid_object_info_extended(struct repository *r, const struct object_id *oid,
 		if (fetch_if_missing && repository_format_partial_clone &&
 		    !already_retried && r == the_repository) {
 			/*
-			 * TODO Investigate having fetch_object() return
-			 * TODO error/success and stopping the music here.
+			 * TODO Investigate checking fetch_object() return
+			 * TODO value and stopping on error here.
 			 * TODO Pass a repository struct through fetch_object,
 			 * such that arbitrary repositories work.
 			 */
-- 
2.21.0.166.gb5e4dbcfd3


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 02/11] Add initial support for many promisor remotes
  2019-03-12 13:29 [PATCH v3 00/11] Many promisor remotes Christian Couder
  2019-03-12 13:29 ` [PATCH v3 01/11] fetch-object: make functions return an error code Christian Couder
@ 2019-03-12 13:29 ` Christian Couder
  2019-03-13  4:09   ` Junio C Hamano
  2019-03-12 13:29 ` [PATCH v3 03/11] promisor-remote: implement promisor_remote_get_direct() Christian Couder
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 22+ messages in thread
From: Christian Couder @ 2019-03-12 13:29 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Stefan Beller, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, Christian Couder

From: Christian Couder <christian.couder@gmail.com>

The promisor-remote.{c,h} files will contain functions to
manage many promisor remotes.

We expect that there will not be a lot of promisor remotes,
so it is ok to use a simple linked list to manage them.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Makefile          |   1 +
 promisor-remote.c | 100 ++++++++++++++++++++++++++++++++++++++++++++++
 promisor-remote.h |  17 ++++++++
 3 files changed, 118 insertions(+)
 create mode 100644 promisor-remote.c
 create mode 100644 promisor-remote.h

diff --git a/Makefile b/Makefile
index 537493822b..4f24ccb3dc 100644
--- a/Makefile
+++ b/Makefile
@@ -972,6 +972,7 @@ LIB_OBJS += preload-index.o
 LIB_OBJS += pretty.o
 LIB_OBJS += prio-queue.o
 LIB_OBJS += progress.o
+LIB_OBJS += promisor-remote.o
 LIB_OBJS += prompt.o
 LIB_OBJS += protocol.o
 LIB_OBJS += quote.o
diff --git a/promisor-remote.c b/promisor-remote.c
new file mode 100644
index 0000000000..d2f574651e
--- /dev/null
+++ b/promisor-remote.c
@@ -0,0 +1,100 @@
+#include "cache.h"
+#include "promisor-remote.h"
+#include "config.h"
+
+static struct promisor_remote *promisors;
+static struct promisor_remote **promisors_tail = &promisors;
+
+struct promisor_remote *promisor_remote_new(const char *remote_name)
+{
+	struct promisor_remote *o;
+
+	o = xcalloc(1, sizeof(*o));
+	o->remote_name = xstrdup(remote_name);
+
+	*promisors_tail = o;
+	promisors_tail = &o->next;
+
+	return o;
+}
+
+static struct promisor_remote *promisor_remote_look_up(const char *remote_name,
+						       struct promisor_remote **previous)
+{
+	struct promisor_remote *o, *p;
+
+	for (p = NULL, o = promisors; o; p = o, o = o->next)
+		if (o->remote_name && !strcmp(o->remote_name, remote_name)) {
+			if (previous)
+				*previous = p;
+			return o;
+		}
+
+	return NULL;
+}
+
+static void promisor_remote_move_to_tail(struct promisor_remote *o,
+					 struct promisor_remote *previous)
+{
+	if (previous)
+		previous->next = o->next;
+	else
+		promisors = o->next ? o->next : o;
+	o->next = NULL;
+	*promisors_tail = o;
+	promisors_tail = &o->next;
+}
+
+static int promisor_remote_config(const char *var, const char *value, void *data)
+{
+	struct promisor_remote *o;
+	const char *name;
+	int namelen;
+	const char *subkey;
+
+	if (parse_config_key(var, "remote", &name, &namelen, &subkey) < 0)
+		return 0;
+
+	if (!strcmp(subkey, "promisor")) {
+		char *remote_name;
+
+		if (!git_config_bool(var, value))
+			return 0;
+
+		remote_name = xmemdupz(name, namelen);
+
+		if (!promisor_remote_look_up(remote_name, NULL))
+			promisor_remote_new(remote_name);
+
+		free(remote_name);
+		return 0;
+	}
+
+	return 0;
+}
+
+static void promisor_remote_init(void)
+{
+	static int initialized;
+
+	if (initialized)
+		return;
+	initialized = 1;
+
+	git_config(promisor_remote_config, NULL);
+}
+
+struct promisor_remote *promisor_remote_find(const char *remote_name)
+{
+	promisor_remote_init();
+
+	if (!remote_name)
+		return promisors;
+
+	return promisor_remote_look_up(remote_name, NULL);
+}
+
+int has_promisor_remote(void)
+{
+	return !!promisor_remote_find(NULL);
+}
diff --git a/promisor-remote.h b/promisor-remote.h
new file mode 100644
index 0000000000..bfbf7c0f21
--- /dev/null
+++ b/promisor-remote.h
@@ -0,0 +1,17 @@
+#ifndef PROMISOR_REMOTE_H
+#define PROMISOR_REMOTE_H
+
+/*
+ * Promisor remote linked list
+ * Its information come from remote.XXX config entries.
+ */
+struct promisor_remote {
+	const char *remote_name;
+	struct promisor_remote *next;
+};
+
+extern struct promisor_remote *promisor_remote_new(const char *remote_name);
+extern struct promisor_remote *promisor_remote_find(const char *remote_name);
+extern int has_promisor_remote(void);
+
+#endif /* PROMISOR_REMOTE_H */
-- 
2.21.0.166.gb5e4dbcfd3


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 03/11] promisor-remote: implement promisor_remote_get_direct()
  2019-03-12 13:29 [PATCH v3 00/11] Many promisor remotes Christian Couder
  2019-03-12 13:29 ` [PATCH v3 01/11] fetch-object: make functions return an error code Christian Couder
  2019-03-12 13:29 ` [PATCH v3 02/11] Add initial support for many promisor remotes Christian Couder
@ 2019-03-12 13:29 ` Christian Couder
  2019-03-13  4:23   ` Junio C Hamano
  2019-03-12 13:29 ` [PATCH v3 04/11] promisor-remote: add promisor_remote_reinit() Christian Couder
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 22+ messages in thread
From: Christian Couder @ 2019-03-12 13:29 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Stefan Beller, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, Christian Couder

From: Christian Couder <christian.couder@gmail.com>

This is implemented for now by calling fetch_objects(). It fetches
from all the promisor remotes.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 promisor-remote.c | 17 +++++++++++++++++
 promisor-remote.h |  1 +
 2 files changed, 18 insertions(+)

diff --git a/promisor-remote.c b/promisor-remote.c
index d2f574651e..f86f9d0b84 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "promisor-remote.h"
 #include "config.h"
+#include "fetch-object.h"
 
 static struct promisor_remote *promisors;
 static struct promisor_remote **promisors_tail = &promisors;
@@ -98,3 +99,19 @@ int has_promisor_remote(void)
 {
 	return !!promisor_remote_find(NULL);
 }
+
+int promisor_remote_get_direct(const struct object_id *oids, int oid_nr)
+{
+	struct promisor_remote *o;
+
+	promisor_remote_init();
+
+	for (o = promisors; o; o = o->next) {
+		if (fetch_objects(o->remote_name, oids, oid_nr) < 0)
+			continue;
+		return 0;
+	}
+
+	return -1;
+}
+
diff --git a/promisor-remote.h b/promisor-remote.h
index bfbf7c0f21..f9f5825417 100644
--- a/promisor-remote.h
+++ b/promisor-remote.h
@@ -13,5 +13,6 @@ struct promisor_remote {
 extern struct promisor_remote *promisor_remote_new(const char *remote_name);
 extern struct promisor_remote *promisor_remote_find(const char *remote_name);
 extern int has_promisor_remote(void);
+extern int promisor_remote_get_direct(const struct object_id *oids, int oid_nr);
 
 #endif /* PROMISOR_REMOTE_H */
-- 
2.21.0.166.gb5e4dbcfd3


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 04/11] promisor-remote: add promisor_remote_reinit()
  2019-03-12 13:29 [PATCH v3 00/11] Many promisor remotes Christian Couder
                   ` (2 preceding siblings ...)
  2019-03-12 13:29 ` [PATCH v3 03/11] promisor-remote: implement promisor_remote_get_direct() Christian Couder
@ 2019-03-12 13:29 ` Christian Couder
  2019-03-13  4:28   ` Junio C Hamano
  2019-03-12 13:29 ` [PATCH v3 05/11] promisor-remote: use repository_format_partial_clone Christian Couder
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 22+ messages in thread
From: Christian Couder @ 2019-03-12 13:29 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Stefan Beller, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, Christian Couder

From: Christian Couder <christian.couder@gmail.com>

We will need to reinitialize the promisor remote configuration
as we will make some changes to it in a later commit.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 promisor-remote.c | 14 ++++++++++++--
 promisor-remote.h |  1 +
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/promisor-remote.c b/promisor-remote.c
index f86f9d0b84..ea74f6d8a8 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -74,17 +74,27 @@ static int promisor_remote_config(const char *var, const char *value, void *data
 	return 0;
 }
 
-static void promisor_remote_init(void)
+static void promisor_remote_do_init(int force)
 {
 	static int initialized;
 
-	if (initialized)
+	if (!force && initialized)
 		return;
 	initialized = 1;
 
 	git_config(promisor_remote_config, NULL);
 }
 
+static inline void promisor_remote_init(void)
+{
+	promisor_remote_do_init(0);
+}
+
+void promisor_remote_reinit(void)
+{
+	promisor_remote_do_init(1);
+}
+
 struct promisor_remote *promisor_remote_find(const char *remote_name)
 {
 	promisor_remote_init();
diff --git a/promisor-remote.h b/promisor-remote.h
index f9f5825417..f96722bc66 100644
--- a/promisor-remote.h
+++ b/promisor-remote.h
@@ -10,6 +10,7 @@ struct promisor_remote {
 	struct promisor_remote *next;
 };
 
+extern void promisor_remote_reinit(void);
 extern struct promisor_remote *promisor_remote_new(const char *remote_name);
 extern struct promisor_remote *promisor_remote_find(const char *remote_name);
 extern int has_promisor_remote(void);
-- 
2.21.0.166.gb5e4dbcfd3


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 05/11] promisor-remote: use repository_format_partial_clone
  2019-03-12 13:29 [PATCH v3 00/11] Many promisor remotes Christian Couder
                   ` (3 preceding siblings ...)
  2019-03-12 13:29 ` [PATCH v3 04/11] promisor-remote: add promisor_remote_reinit() Christian Couder
@ 2019-03-12 13:29 ` Christian Couder
  2019-03-13  4:31   ` Junio C Hamano
  2019-03-12 13:29 ` [PATCH v3 06/11] Use promisor_remote_get_direct() and has_promisor_remote() Christian Couder
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 22+ messages in thread
From: Christian Couder @ 2019-03-12 13:29 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Stefan Beller, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli

A remote specified using the extensions.partialClone config
option should be considered a promisor remote too.

This remote should be at the end of the promisor remote list,
so that it is used only if objects have not been found in other
remotes.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 promisor-remote.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/promisor-remote.c b/promisor-remote.c
index ea74f6d8a8..dcf6ef6521 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -83,6 +83,17 @@ static void promisor_remote_do_init(int force)
 	initialized = 1;
 
 	git_config(promisor_remote_config, NULL);
+
+	if (repository_format_partial_clone) {
+		struct promisor_remote *o, *previous;
+
+		o = promisor_remote_look_up(repository_format_partial_clone,
+					    &previous);
+		if (o)
+			promisor_remote_move_to_tail(o, previous);
+		else
+			promisor_remote_new(repository_format_partial_clone);
+	}
 }
 
 static inline void promisor_remote_init(void)
-- 
2.21.0.166.gb5e4dbcfd3


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 06/11] Use promisor_remote_get_direct() and has_promisor_remote()
  2019-03-12 13:29 [PATCH v3 00/11] Many promisor remotes Christian Couder
                   ` (4 preceding siblings ...)
  2019-03-12 13:29 ` [PATCH v3 05/11] promisor-remote: use repository_format_partial_clone Christian Couder
@ 2019-03-12 13:29 ` Christian Couder
  2019-03-12 13:29 ` [PATCH v3 07/11] promisor-remote: parse remote.*.partialclonefilter Christian Couder
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Christian Couder @ 2019-03-12 13:29 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Stefan Beller, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli

Instead of using the repository_format_partial_clone global
and fetch_objects() directly, let's use has_promisor_remote()
and promisor_remote_get_direct().

This way all the configured promisor remotes will be taken
into account, not only the one specified by
extensions.partialClone.

Also when cloning or fetching using a partial clone filter,
remote.origin.promisor will be set to "true" instead of
setting extensions.partialClone to "origin". This makes it
possible to use many promisor remote just by fetching from
them.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 builtin/cat-file.c            |  5 +++--
 builtin/fetch.c               | 11 ++++++-----
 builtin/gc.c                  |  3 ++-
 builtin/repack.c              |  3 ++-
 cache-tree.c                  |  3 ++-
 connected.c                   |  3 ++-
 list-objects-filter-options.c | 28 +++++++++++++++-------------
 packfile.c                    |  3 ++-
 sha1-file.c                   | 14 ++++++++------
 t/t5601-clone.sh              |  2 +-
 t/t5616-partial-clone.sh      |  2 +-
 unpack-trees.c                |  6 +++---
 12 files changed, 47 insertions(+), 36 deletions(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 0f092382e1..85ae10bf0b 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -15,6 +15,7 @@
 #include "sha1-array.h"
 #include "packfile.h"
 #include "object-store.h"
+#include "promisor-remote.h"
 
 struct batch_options {
 	int enabled;
@@ -523,8 +524,8 @@ static int batch_objects(struct batch_options *opt)
 	if (opt->all_objects) {
 		struct object_cb_data cb;
 
-		if (repository_format_partial_clone)
-			warning("This repository has extensions.partialClone set. Some objects may not be loaded.");
+		if (has_promisor_remote())
+			warning("This repository uses promisor remotes. Some objects may not be loaded.");
 
 		cb.opt = opt;
 		cb.expand = &data;
diff --git a/builtin/fetch.c b/builtin/fetch.c
index b620fd54b4..7edf70ae6c 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -23,6 +23,7 @@
 #include "packfile.h"
 #include "list-objects-filter-options.h"
 #include "commit-reach.h"
+#include "promisor-remote.h"
 
 static const char * const builtin_fetch_usage[] = {
 	N_("git fetch [<options>] [<repository> [<refspec>...]]"),
@@ -1460,7 +1461,7 @@ static inline void fetch_one_setup_partial(struct remote *remote)
 	 * If no prior partial clone/fetch and the current fetch DID NOT
 	 * request a partial-fetch, do a normal fetch.
 	 */
-	if (!repository_format_partial_clone && !filter_options.choice)
+	if (!has_promisor_remote() && !filter_options.choice)
 		return;
 
 	/*
@@ -1468,7 +1469,7 @@ static inline void fetch_one_setup_partial(struct remote *remote)
 	 * on this repo and remember the given filter-spec as the default
 	 * for subsequent fetches to this remote.
 	 */
-	if (!repository_format_partial_clone && filter_options.choice) {
+	if (!has_promisor_remote() && filter_options.choice) {
 		partial_clone_register(remote->name, &filter_options);
 		return;
 	}
@@ -1477,7 +1478,7 @@ static inline void fetch_one_setup_partial(struct remote *remote)
 	 * We are currently limited to only ONE promisor remote and only
 	 * allow partial-fetches from the promisor remote.
 	 */
-	if (strcmp(remote->name, repository_format_partial_clone)) {
+	if (!promisor_remote_find(remote->name)) {
 		if (filter_options.choice)
 			die(_("--filter can only be used with the remote "
 			      "configured in extensions.partialClone"));
@@ -1609,7 +1610,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 	if (depth || deepen_since || deepen_not.nr)
 		deepen = 1;
 
-	if (filter_options.choice && !repository_format_partial_clone)
+	if (filter_options.choice && !has_promisor_remote())
 		die("--filter can only be used when extensions.partialClone is set");
 
 	if (all) {
@@ -1643,7 +1644,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 	}
 
 	if (remote) {
-		if (filter_options.choice || repository_format_partial_clone)
+		if (filter_options.choice || has_promisor_remote())
 			fetch_one_setup_partial(remote);
 		result = fetch_one(remote, argc, argv, prune_tags_ok);
 	} else {
diff --git a/builtin/gc.c b/builtin/gc.c
index 020f725acc..0bec41b25f 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -27,6 +27,7 @@
 #include "pack-objects.h"
 #include "blob.h"
 #include "tree.h"
+#include "promisor-remote.h"
 
 #define FAILED_RUN "failed to run %s"
 
@@ -640,7 +641,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 			argv_array_push(&prune, prune_expire);
 			if (quiet)
 				argv_array_push(&prune, "--no-progress");
-			if (repository_format_partial_clone)
+			if (has_promisor_remote())
 				argv_array_push(&prune,
 						"--exclude-promisor-objects");
 			if (run_command_v_opt(prune.argv, RUN_GIT_CMD))
diff --git a/builtin/repack.c b/builtin/repack.c
index 67f8978043..3b935690c8 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -11,6 +11,7 @@
 #include "midx.h"
 #include "packfile.h"
 #include "object-store.h"
+#include "promisor-remote.h"
 
 static int delta_base_offset = 1;
 static int pack_kept_objects = -1;
@@ -366,7 +367,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	argv_array_push(&cmd.args, "--all");
 	argv_array_push(&cmd.args, "--reflog");
 	argv_array_push(&cmd.args, "--indexed-objects");
-	if (repository_format_partial_clone)
+	if (has_promisor_remote())
 		argv_array_push(&cmd.args, "--exclude-promisor-objects");
 	if (write_bitmaps)
 		argv_array_push(&cmd.args, "--write-bitmap-index");
diff --git a/cache-tree.c b/cache-tree.c
index b13bfaf71e..64c285a746 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -5,6 +5,7 @@
 #include "cache-tree.h"
 #include "object-store.h"
 #include "replace-object.h"
+#include "promisor-remote.h"
 
 #ifndef DEBUG
 #define DEBUG 0
@@ -357,7 +358,7 @@ static int update_one(struct cache_tree *it,
 		}
 
 		ce_missing_ok = mode == S_IFGITLINK || missing_ok ||
-			(repository_format_partial_clone &&
+			(has_promisor_remote() &&
 			 ce_skip_worktree(ce));
 		if (is_null_oid(oid) ||
 		    (!ce_missing_ok && !has_object_file(oid))) {
diff --git a/connected.c b/connected.c
index 1bba888eff..0eaaedee6a 100644
--- a/connected.c
+++ b/connected.c
@@ -4,6 +4,7 @@
 #include "connected.h"
 #include "transport.h"
 #include "packfile.h"
+#include "promisor-remote.h"
 
 /*
  * If we feed all the commits we want to verify to this command
@@ -56,7 +57,7 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
 	argv_array_push(&rev_list.args,"rev-list");
 	argv_array_push(&rev_list.args, "--objects");
 	argv_array_push(&rev_list.args, "--stdin");
-	if (repository_format_partial_clone)
+	if (has_promisor_remote())
 		argv_array_push(&rev_list.args, "--exclude-promisor-objects");
 	if (!opt->is_deepening_fetch) {
 		argv_array_push(&rev_list.args, "--not");
diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index c0036f7378..f41a831fce 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -6,6 +6,7 @@
 #include "list-objects.h"
 #include "list-objects-filter.h"
 #include "list-objects-filter-options.h"
+#include "promisor-remote.h"
 
 /*
  * Parse value of the argument to the "filter" keyword.
@@ -144,30 +145,31 @@ void partial_clone_register(
 	const char *remote,
 	const struct list_objects_filter_options *filter_options)
 {
-	/*
-	 * Record the name of the partial clone remote in the
-	 * config and in the global variable -- the latter is
-	 * used throughout to indicate that partial clone is
-	 * enabled and to expect missing objects.
-	 */
-	if (repository_format_partial_clone &&
-	    *repository_format_partial_clone &&
-	    strcmp(remote, repository_format_partial_clone))
-		die(_("cannot change partial clone promisor remote"));
+	char *cfg_name;
 
-	git_config_set("core.repositoryformatversion", "1");
-	git_config_set("extensions.partialclone", remote);
+	/* Check if it is already registered */
+	if (!promisor_remote_find(remote)) {
+		git_config_set("core.repositoryformatversion", "1");
 
-	repository_format_partial_clone = xstrdup(remote);
+		/* Add promisor config for the remote */
+		cfg_name = xstrfmt("remote.%s.promisor", remote);
+		git_config_set(cfg_name, "true");
+		free(cfg_name);
+	}
 
 	/*
 	 * Record the initial filter-spec in the config as
 	 * the default for subsequent fetches from this remote.
+	 *
+	 * TODO: record it into remote.<name>.partialclonefilter
 	 */
 	core_partial_clone_filter_default =
 		xstrdup(filter_options->filter_spec);
 	git_config_set("core.partialclonefilter",
 		       core_partial_clone_filter_default);
+
+	/* Make sure the config info are reset */
+	promisor_remote_reinit();
 }
 
 void partial_clone_get_default_filter_spec(
diff --git a/packfile.c b/packfile.c
index 16bcb75262..0f95cdc1c9 100644
--- a/packfile.c
+++ b/packfile.c
@@ -16,6 +16,7 @@
 #include "tree.h"
 #include "object-store.h"
 #include "midx.h"
+#include "promisor-remote.h"
 
 char *odb_pack_name(struct strbuf *buf,
 		    const unsigned char *sha1,
@@ -2119,7 +2120,7 @@ int is_promisor_object(const struct object_id *oid)
 	static int promisor_objects_prepared;
 
 	if (!promisor_objects_prepared) {
-		if (repository_format_partial_clone) {
+		if (has_promisor_remote()) {
 			for_each_packed_object(add_promisor_object,
 					       &promisor_objects,
 					       FOR_EACH_OBJECT_PROMISOR_ONLY);
diff --git a/sha1-file.c b/sha1-file.c
index 01cc0590f4..715a2b882a 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -32,6 +32,7 @@
 #include "packfile.h"
 #include "fetch-object.h"
 #include "object-store.h"
+#include "promisor-remote.h"
 
 /* The maximum size for an object header. */
 #define MAX_HEADER_LEN 32
@@ -1369,15 +1370,16 @@ int oid_object_info_extended(struct repository *r, const struct object_id *oid,
 		}
 
 		/* Check if it is a missing object */
-		if (fetch_if_missing && repository_format_partial_clone &&
+		if (fetch_if_missing && has_promisor_remote() &&
 		    !already_retried && r == the_repository) {
 			/*
-			 * TODO Investigate checking fetch_object() return
-			 * TODO value and stopping on error here.
-			 * TODO Pass a repository struct through fetch_object,
-			 * such that arbitrary repositories work.
+			 * TODO Investigate checking promisor_remote_get_direct()
+			 * TODO return value and stopping on error here.
+			 * TODO Pass a repository struct through
+			 * promisor_remote_get_direct(), such that arbitrary
+			 * repositories work.
 			 */
-			fetch_objects(repository_format_partial_clone, real, 1);
+			promisor_remote_get_direct(real, 1);
 			already_retried = 1;
 			continue;
 		}
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index d6948cbdab..dd658f8b32 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -653,7 +653,7 @@ partial_clone () {
 	git -C client fsck &&
 
 	# Ensure that unneeded blobs are not inadvertently fetched.
-	test_config -C client extensions.partialclone "not a remote" &&
+	test_config -C client remote.origin.promisor "false" &&
 	test_must_fail git -C client cat-file -e "$HASH1" &&
 
 	# But this blob was fetched, because clone performs an initial checkout
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 9643acb161..e8ca825ab7 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -42,7 +42,7 @@ test_expect_success 'do partial clone 1' '
 
 	test_cmp expect_1.oids observed.oids &&
 	test "$(git -C pc1 config --local core.repositoryformatversion)" = "1" &&
-	test "$(git -C pc1 config --local extensions.partialclone)" = "origin" &&
+	test "$(git -C pc1 config --local remote.origin.promisor)" = "true" &&
 	test "$(git -C pc1 config --local core.partialclonefilter)" = "blob:none"
 '
 
diff --git a/unpack-trees.c b/unpack-trees.c
index 22c41a3ba8..47353d85c3 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -17,6 +17,7 @@
 #include "fsmonitor.h"
 #include "object-store.h"
 #include "fetch-object.h"
+#include "promisor-remote.h"
 
 /*
  * Error messages expected by scripts out of plumbing commands such as
@@ -398,7 +399,7 @@ static int check_updates(struct unpack_trees_options *o)
 		load_gitmodules_file(index, &state);
 
 	enable_delayed_checkout(&state);
-	if (repository_format_partial_clone && o->update && !o->dry_run) {
+	if (has_promisor_remote() && o->update && !o->dry_run) {
 		/*
 		 * Prefetch the objects that are to be checked out in the loop
 		 * below.
@@ -415,8 +416,7 @@ static int check_updates(struct unpack_trees_options *o)
 			}
 		}
 		if (to_fetch.nr)
-			fetch_objects(repository_format_partial_clone,
-				      to_fetch.oid, to_fetch.nr);
+			promisor_remote_get_direct(to_fetch.oid, to_fetch.nr);
 		fetch_if_missing = fetch_if_missing_store;
 		oid_array_clear(&to_fetch);
 	}
-- 
2.21.0.166.gb5e4dbcfd3


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 07/11] promisor-remote: parse remote.*.partialclonefilter
  2019-03-12 13:29 [PATCH v3 00/11] Many promisor remotes Christian Couder
                   ` (5 preceding siblings ...)
  2019-03-12 13:29 ` [PATCH v3 06/11] Use promisor_remote_get_direct() and has_promisor_remote() Christian Couder
@ 2019-03-12 13:29 ` Christian Couder
  2019-03-12 13:29 ` [PATCH v3 08/11] builtin/fetch: remove unique promisor remote limitation Christian Couder
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Christian Couder @ 2019-03-12 13:29 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Stefan Beller, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli

This makes it possible to specify a different partial clone
filter for each promisor remote.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 builtin/fetch.c               |  2 +-
 list-objects-filter-options.c | 27 +++++++++++++++------------
 list-objects-filter-options.h |  3 ++-
 promisor-remote.c             | 10 ++++++++++
 promisor-remote.h             |  5 ++++-
 t/t0410-partial-clone.sh      |  2 +-
 t/t5601-clone.sh              |  1 +
 t/t5616-partial-clone.sh      |  2 +-
 8 files changed, 35 insertions(+), 17 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 7edf70ae6c..9d436bd727 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1491,7 +1491,7 @@ static inline void fetch_one_setup_partial(struct remote *remote)
 	 * the config.
 	 */
 	if (!filter_options.choice)
-		partial_clone_get_default_filter_spec(&filter_options);
+		partial_clone_get_default_filter_spec(&filter_options, remote->name);
 	return;
 }
 
diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index f41a831fce..02f48b7c40 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -30,6 +30,9 @@ static int gently_parse_list_objects_filter(
 {
 	const char *v0;
 
+	if (!arg)
+		return 0;
+
 	if (filter_options->choice) {
 		if (errbuf) {
 			strbuf_addstr(
@@ -146,6 +149,7 @@ void partial_clone_register(
 	const struct list_objects_filter_options *filter_options)
 {
 	char *cfg_name;
+	char *filter_name;
 
 	/* Check if it is already registered */
 	if (!promisor_remote_find(remote)) {
@@ -160,27 +164,26 @@ void partial_clone_register(
 	/*
 	 * Record the initial filter-spec in the config as
 	 * the default for subsequent fetches from this remote.
-	 *
-	 * TODO: record it into remote.<name>.partialclonefilter
 	 */
-	core_partial_clone_filter_default =
-		xstrdup(filter_options->filter_spec);
-	git_config_set("core.partialclonefilter",
-		       core_partial_clone_filter_default);
+	filter_name = xstrfmt("remote.%s.partialclonefilter", remote);
+	git_config_set(filter_name, filter_options->filter_spec);
+	free(filter_name);
 
 	/* Make sure the config info are reset */
 	promisor_remote_reinit();
 }
 
 void partial_clone_get_default_filter_spec(
-	struct list_objects_filter_options *filter_options)
+	struct list_objects_filter_options *filter_options,
+	const char *remote)
 {
+	struct promisor_remote *promisor = promisor_remote_find(remote);
+
 	/*
 	 * Parse default value, but silently ignore it if it is invalid.
 	 */
-	if (!core_partial_clone_filter_default)
-		return;
-	gently_parse_list_objects_filter(filter_options,
-					 core_partial_clone_filter_default,
-					 NULL);
+	if (promisor)
+		gently_parse_list_objects_filter(filter_options,
+						 promisor->partial_clone_filter,
+						 NULL);
 }
diff --git a/list-objects-filter-options.h b/list-objects-filter-options.h
index e3adc78ebf..70d27f44ef 100644
--- a/list-objects-filter-options.h
+++ b/list-objects-filter-options.h
@@ -89,6 +89,7 @@ void partial_clone_register(
 	const char *remote,
 	const struct list_objects_filter_options *filter_options);
 void partial_clone_get_default_filter_spec(
-	struct list_objects_filter_options *filter_options);
+	struct list_objects_filter_options *filter_options,
+	const char *remote);
 
 #endif /* LIST_OBJECTS_FILTER_OPTIONS_H */
diff --git a/promisor-remote.c b/promisor-remote.c
index dcf6ef6521..ed33dfa86c 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -70,6 +70,16 @@ static int promisor_remote_config(const char *var, const char *value, void *data
 		free(remote_name);
 		return 0;
 	}
+	if (!strcmp(subkey, "partialclonefilter")) {
+		char *remote_name = xmemdupz(name, namelen);
+
+		o = promisor_remote_look_up(remote_name, NULL);
+		if (!o)
+			o = promisor_remote_new(remote_name);
+
+		free(remote_name);
+		return git_config_string(&o->partial_clone_filter, var, value);
+	}
 
 	return 0;
 }
diff --git a/promisor-remote.h b/promisor-remote.h
index f96722bc66..f0d609a3f5 100644
--- a/promisor-remote.h
+++ b/promisor-remote.h
@@ -3,10 +3,13 @@
 
 /*
  * Promisor remote linked list
- * Its information come from remote.XXX config entries.
+ *
+ * Information in its fields come from remote.XXX config entries or
+ * from extensions.partialclone or core.partialclonefilter.
  */
 struct promisor_remote {
 	const char *remote_name;
+	const char *partial_clone_filter;
 	struct promisor_remote *next;
 };
 
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index bce02788e6..9266037714 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -26,7 +26,7 @@ promise_and_delete () {
 test_expect_success 'extensions.partialclone without filter' '
 	test_create_repo server &&
 	git clone --filter="blob:none" "file://$(pwd)/server" client &&
-	git -C client config --unset core.partialclonefilter &&
+	git -C client config --unset remote.origin.partialclonefilter &&
 	git -C client fetch origin
 '
 
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index dd658f8b32..2e82c70f33 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -654,6 +654,7 @@ partial_clone () {
 
 	# Ensure that unneeded blobs are not inadvertently fetched.
 	test_config -C client remote.origin.promisor "false" &&
+	git -C client config --unset remote.origin.partialclonefilter &&
 	test_must_fail git -C client cat-file -e "$HASH1" &&
 
 	# But this blob was fetched, because clone performs an initial checkout
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index e8ca825ab7..ba0489b41a 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -43,7 +43,7 @@ test_expect_success 'do partial clone 1' '
 	test_cmp expect_1.oids observed.oids &&
 	test "$(git -C pc1 config --local core.repositoryformatversion)" = "1" &&
 	test "$(git -C pc1 config --local remote.origin.promisor)" = "true" &&
-	test "$(git -C pc1 config --local core.partialclonefilter)" = "blob:none"
+	test "$(git -C pc1 config --local remote.origin.partialclonefilter)" = "blob:none"
 '
 
 # checkout master to force dynamic object fetch of blobs at HEAD.
-- 
2.21.0.166.gb5e4dbcfd3


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 08/11] builtin/fetch: remove unique promisor remote limitation
  2019-03-12 13:29 [PATCH v3 00/11] Many promisor remotes Christian Couder
                   ` (6 preceding siblings ...)
  2019-03-12 13:29 ` [PATCH v3 07/11] promisor-remote: parse remote.*.partialclonefilter Christian Couder
@ 2019-03-12 13:29 ` Christian Couder
  2019-03-12 13:29 ` [PATCH v3 09/11] t0410: test fetching from many promisor remotes Christian Couder
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Christian Couder @ 2019-03-12 13:29 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Stefan Beller, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli

As the infrastructure for more than one promisor remote
has been introduced in previous patches, we can remove
code that forbids the registration of more than one
promisor remote.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 builtin/fetch.c | 20 +++++---------------
 1 file changed, 5 insertions(+), 15 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 9d436bd727..56c6827165 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1465,26 +1465,16 @@ static inline void fetch_one_setup_partial(struct remote *remote)
 		return;
 
 	/*
-	 * If this is the FIRST partial-fetch request, we enable partial
-	 * on this repo and remember the given filter-spec as the default
-	 * for subsequent fetches to this remote.
+	 * If this is a partial-fetch request, we enable partial on
+	 * this repo if not already enabled and remember the given
+	 * filter-spec as the default for subsequent fetches to this
+	 * remote.
 	 */
-	if (!has_promisor_remote() && filter_options.choice) {
+	if (filter_options.choice) {
 		partial_clone_register(remote->name, &filter_options);
 		return;
 	}
 
-	/*
-	 * We are currently limited to only ONE promisor remote and only
-	 * allow partial-fetches from the promisor remote.
-	 */
-	if (!promisor_remote_find(remote->name)) {
-		if (filter_options.choice)
-			die(_("--filter can only be used with the remote "
-			      "configured in extensions.partialClone"));
-		return;
-	}
-
 	/*
 	 * Do a partial-fetch from the promisor remote using either the
 	 * explicitly given filter-spec or inherit the filter-spec from
-- 
2.21.0.166.gb5e4dbcfd3


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 09/11] t0410: test fetching from many promisor remotes
  2019-03-12 13:29 [PATCH v3 00/11] Many promisor remotes Christian Couder
                   ` (7 preceding siblings ...)
  2019-03-12 13:29 ` [PATCH v3 08/11] builtin/fetch: remove unique promisor remote limitation Christian Couder
@ 2019-03-12 13:29 ` Christian Couder
  2019-03-12 13:29 ` [PATCH v3 10/11] partial-clone: add multiple remotes in the doc Christian Couder
  2019-03-12 13:29 ` [PATCH v3 11/11] remote: add promisor and partial clone config to " Christian Couder
  10 siblings, 0 replies; 22+ messages in thread
From: Christian Couder @ 2019-03-12 13:29 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Stefan Beller, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, Christian Couder

From: Christian Couder <christian.couder@gmail.com>

This shows that it is now possible to fetch objects from more
than one promisor remote, and that fetching from a new
promisor remote can configure it as one.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t0410-partial-clone.sh | 47 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 46 insertions(+), 1 deletion(-)

diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index 9266037714..146b0a1e03 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -182,8 +182,53 @@ test_expect_success 'fetching of missing objects works with ref-in-want enabled'
 	grep "git< fetch=.*ref-in-want" trace
 '
 
+test_expect_success 'fetching of missing objects from another promisor remote' '
+	git clone "file://$(pwd)/server" server2 &&
+	test_commit -C server2 bar &&
+	git -C server2 repack -a -d --write-bitmap-index &&
+	HASH2=$(git -C server2 rev-parse bar) &&
+
+	git -C repo remote add server2 "file://$(pwd)/server2" &&
+	git -C repo config remote.server2.promisor true &&
+	git -C repo cat-file -p "$HASH2" &&
+
+	git -C repo fetch server2 &&
+	rm -rf repo/.git/objects/* &&
+	git -C repo cat-file -p "$HASH2" &&
+
+	# Ensure that the .promisor file is written, and check that its
+	# associated packfile contains the object
+	ls repo/.git/objects/pack/pack-*.promisor >promisorlist &&
+	test_line_count = 1 promisorlist &&
+	IDX=$(cat promisorlist | sed "s/promisor$/idx/") &&
+	git verify-pack --verbose "$IDX" | grep "$HASH2"
+'
+
+test_expect_success 'fetching of missing objects configures a promisor remote' '
+	git clone "file://$(pwd)/server" server3 &&
+	test_commit -C server3 baz &&
+	git -C server3 repack -a -d --write-bitmap-index &&
+	HASH3=$(git -C server3 rev-parse baz) &&
+	git -C server3 config uploadpack.allowfilter 1 &&
+
+	rm repo/.git/objects/pack/pack-*.promisor &&
+
+	git -C repo remote add server3 "file://$(pwd)/server3" &&
+	git -C repo fetch --filter="blob:none" server3 $HASH3 &&
+
+	test "$(git -C repo config remote.server3.promisor)" = "true" &&
+
+	# Ensure that the .promisor file is written, and check that its
+	# associated packfile contains the object
+	ls repo/.git/objects/pack/pack-*.promisor >promisorlist &&
+	test_line_count = 1 promisorlist &&
+	IDX=$(cat promisorlist | sed "s/promisor$/idx/") &&
+	git verify-pack --verbose "$IDX" | grep "$HASH3"
+'
+
 test_expect_success 'fetching of missing blobs works' '
-	rm -rf server repo &&
+	rm -rf server server2 repo &&
+	rm -rf server server3 repo &&
 	test_create_repo server &&
 	test_commit -C server foo &&
 	git -C server repack -a -d --write-bitmap-index &&
-- 
2.21.0.166.gb5e4dbcfd3


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 10/11] partial-clone: add multiple remotes in the doc
  2019-03-12 13:29 [PATCH v3 00/11] Many promisor remotes Christian Couder
                   ` (8 preceding siblings ...)
  2019-03-12 13:29 ` [PATCH v3 09/11] t0410: test fetching from many promisor remotes Christian Couder
@ 2019-03-12 13:29 ` Christian Couder
  2019-03-12 13:29 ` [PATCH v3 11/11] remote: add promisor and partial clone config to " Christian Couder
  10 siblings, 0 replies; 22+ messages in thread
From: Christian Couder @ 2019-03-12 13:29 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Stefan Beller, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli

While at it, let's remove a reference to ODB effort as the ODB
effort has been replaced by directly enhancing partial clone
and promisor remote features.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/technical/partial-clone.txt | 83 ++++++++++++++++-------
 1 file changed, 58 insertions(+), 25 deletions(-)

diff --git a/Documentation/technical/partial-clone.txt b/Documentation/technical/partial-clone.txt
index 896c7b3878..58adcc5ce1 100644
--- a/Documentation/technical/partial-clone.txt
+++ b/Documentation/technical/partial-clone.txt
@@ -100,18 +100,18 @@ or commits that reference missing trees.
 Handling Missing Objects
 ------------------------
 
-- An object may be missing due to a partial clone or fetch, or missing due
-  to repository corruption.  To differentiate these cases, the local
-  repository specially indicates such filtered packfiles obtained from the
-  promisor remote as "promisor packfiles".
+- An object may be missing due to a partial clone or fetch, or missing
+  due to repository corruption.  To differentiate these cases, the
+  local repository specially indicates such filtered packfiles
+  obtained from promisor remotes as "promisor packfiles".
 +
 These promisor packfiles consist of a "<name>.promisor" file with
 arbitrary contents (like the "<name>.keep" files), in addition to
 their "<name>.pack" and "<name>.idx" files.
 
 - The local repository considers a "promisor object" to be an object that
-  it knows (to the best of its ability) that the promisor remote has promised
-  that it has, either because the local repository has that object in one of
+  it knows (to the best of its ability) that promisor remotes have promised
+  that they have, either because the local repository has that object in one of
   its promisor packfiles, or because another promisor object refers to it.
 +
 When Git encounters a missing object, Git can see if it is a promisor object
@@ -123,7 +123,7 @@ expensive-to-modify list of missing objects.[a]
 - Since almost all Git code currently expects any referenced object to be
   present locally and because we do not want to force every command to do
   a dry-run first, a fallback mechanism is added to allow Git to attempt
-  to dynamically fetch missing objects from the promisor remote.
+  to dynamically fetch missing objects from promisor remotes.
 +
 When the normal object lookup fails to find an object, Git invokes
 fetch-object to try to get the object from the server and then retry
@@ -157,8 +157,7 @@ and prefetch those objects in bulk.
 +
 We are not happy with this global variable and would like to remove it,
 but that requires significant refactoring of the object code to pass an
-additional flag.  We hope that concurrent efforts to add an ODB API can
-encompass this.
+additional flag.
 
 
 Fetching Missing Objects
@@ -182,21 +181,53 @@ has been updated to not use any object flags when the corresponding argument
   though they are not necessary.
 
 
+Using many promisor remotes
+---------------------------
+
+Many promisor remotes can be configured and used.
+
+This allows for example a user to have multiple geographically-close
+cache servers for fetching missing blobs while continuing to do
+filtered `git-fetch` commands from the central server.
+
+When fetching objects, promisor remotes are tried one after the other
+until all the objects have been fetched.
+
+Remotes that are considered "promisor" remotes are those specified by
+the following configuration variables:
+
+- `extensions.partialClone = <name>`
+
+- `remote.<name>.promisor = true`
+
+- `remote.<name>.partialCloneFilter = ...`
+
+Only one promisor remote can be configured using the
+`extensions.partialClone` config variable. This promisor remote will
+be the last one tried when fetching objects.
+
+The other promisor remotes will be tried in the order they appear in
+the config file.
+
 Current Limitations
 -------------------
 
-- The remote used for a partial clone (or the first partial fetch
-  following a regular clone) is marked as the "promisor remote".
+- It is not possible to specify the order in which the promisor
+  remotes are tried in other ways than the order in which they appear
+  in the config file.
 +
-We are currently limited to a single promisor remote and only that
-remote may be used for subsequent partial fetches.
+It is also not possible to specify an order to be used when fetching
+from one remote and a different order when fetching from another
+remote.
+
+- It is not possible to push only specific objects to a promisor
+  remote.
 +
-We accept this limitation because we believe initial users of this
-feature will be using it on repositories with a strong single central
-server.
+It is not possible to push at the same time to multiple promisor
+remote in a specific order.
 
-- Dynamic object fetching will only ask the promisor remote for missing
-  objects.  We assume that the promisor remote has a complete view of the
+- Dynamic object fetching will only ask promisor remotes for missing
+  objects.  We assume that promisor remotes have a complete view of the
   repository and can satisfy all such requests.
 
 - Repack essentially treats promisor and non-promisor packfiles as 2
@@ -218,15 +249,17 @@ server.
 Future Work
 -----------
 
-- Allow more than one promisor remote and define a strategy for fetching
-  missing objects from specific promisor remotes or of iterating over the
-  set of promisor remotes until a missing object is found.
+- Improve the way to specify the order in which promisor remotes are
+  tried.
 +
-A user might want to have multiple geographically-close cache servers
-for fetching missing blobs while continuing to do filtered `git-fetch`
-commands from the central server, for example.
+For example this could allow to specify explicitly something like:
+"When fetching from this remote, I want to use these promisor remotes
+in this order, though, when pushing or fetching to that remote, I want
+to use those promisor remotes in that order."
+
+- Allow pushing to promisor remotes.
 +
-Or the user might want to work in a triangular work flow with multiple
+The user might want to work in a triangular work flow with multiple
 promisor remotes that each have an incomplete view of the repository.
 
 - Allow repack to work on promisor packfiles (while keeping them distinct
-- 
2.21.0.166.gb5e4dbcfd3


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 11/11] remote: add promisor and partial clone config to the doc
  2019-03-12 13:29 [PATCH v3 00/11] Many promisor remotes Christian Couder
                   ` (9 preceding siblings ...)
  2019-03-12 13:29 ` [PATCH v3 10/11] partial-clone: add multiple remotes in the doc Christian Couder
@ 2019-03-12 13:29 ` Christian Couder
  10 siblings, 0 replies; 22+ messages in thread
From: Christian Couder @ 2019-03-12 13:29 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Stefan Beller, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/config/remote.txt | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/Documentation/config/remote.txt b/Documentation/config/remote.txt
index 6c4cad83a2..a8e6437a90 100644
--- a/Documentation/config/remote.txt
+++ b/Documentation/config/remote.txt
@@ -76,3 +76,11 @@ remote.<name>.pruneTags::
 +
 See also `remote.<name>.prune` and the PRUNING section of
 linkgit:git-fetch[1].
+
+remote.<name>.promisor::
+	When set to true, this remote will be used to fetch promisor
+	objects.
+
+remote.<name>.partialclonefilter::
+	The filter that will be applied when fetching from this
+	promisor remote.
-- 
2.21.0.166.gb5e4dbcfd3


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 02/11] Add initial support for many promisor remotes
  2019-03-12 13:29 ` [PATCH v3 02/11] Add initial support for many promisor remotes Christian Couder
@ 2019-03-13  4:09   ` Junio C Hamano
  2019-03-13  4:34     ` Junio C Hamano
  2019-04-01 16:41     ` Christian Couder
  0 siblings, 2 replies; 22+ messages in thread
From: Junio C Hamano @ 2019-03-13  4:09 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Jeff King, Ben Peart, Jonathan Tan, Jonathan Nieder,
	Stefan Beller, Nguyen Thai Ngoc Duy, Mike Hommey, Lars Schneider,
	Eric Wong, Christian Couder, Jeff Hostetler, Eric Sunshine,
	Beat Bolli

Christian Couder <christian.couder@gmail.com> writes:

> +struct promisor_remote *promisor_remote_new(const char *remote_name)
> +{

Shouldn't this be static?  The config callback that calls this
function is inside this file.

> +	struct promisor_remote *o;
> +
> +	o = xcalloc(1, sizeof(*o));
> +	o->remote_name = xstrdup(remote_name);

A comment on this later...

> +static struct promisor_remote *promisor_remote_look_up(const char *remote_name,
> +						       struct promisor_remote **previous)

In our codebase, this operation is far more often called "lookup",
one word, according to "git grep -e look_up \*.h".

> +{
> +	struct promisor_remote *o, *p;
> +
> +	for (p = NULL, o = promisors; o; p = o, o = o->next)
> +		if (o->remote_name && !strcmp(o->remote_name, remote_name)) {
> +			if (previous)
> +				*previous = p;

I think the "previous" thing is for the callers to learn what
pointer points at the found entry, allowing e.g. an element to be
inserted just before the found element.  If so, would it make more
sense to use the more familiar pattern to use

	*previous = &promisors;

here?  That would remove the need to switch on NULL-ness of previous
in the caller.

> diff --git a/promisor-remote.h b/promisor-remote.h
> new file mode 100644
> index 0000000000..bfbf7c0f21
> --- /dev/null
> +++ b/promisor-remote.h
> @@ -0,0 +1,17 @@
> +#ifndef PROMISOR_REMOTE_H
> +#define PROMISOR_REMOTE_H
> +
> +/*
> + * Promisor remote linked list
> + * Its information come from remote.XXX config entries.
> + */
> +struct promisor_remote {
> +	const char *remote_name;
> +	struct promisor_remote *next;
> +};

Would it make the management of storage easier to make it

	struct promisor_remote {
		struct promisor_remote *next;
		const char name[FLEX_ARRAY];
	};

that will allow allocation with

	struct promisor_remote *r;
	FLEX_ALLOC_STR(r, name, remote_name);

Or if the remote_name field must be a pointer, perhaps use
FLEXPTR_ALLOC_STR().

What is the rule for these promisor names?  If these entries were on
the configuration file, then:

	[remote "origin"]
		url = ...
		promisor = frotz
		promisor = nitfol

	[remote "mirror"}
		url = ...
		promisor = frotz
		promisor = Frotz
		promisor = nit fol

would the two "frotz" for the two remotes refer to the same thing,
or are "promisor" values scoped to each remote?

Can the name of promisor be any string?  If they end up getting used
as part of a path on the filesystem, we'd need to worry about case
sensitivity and UTF-8 normalization issues as well.

In a large enough project where multi-promisor makes sense, what is
the expected number of promisors a repository would define?  10s?
1000s?  Would a linked list still make sense when deployed in the
real world, or would we be forced to move to something like hashmap
later?

You do not have to have the answers to all these questions, and even
the ones with concrete answers, you do not necessarily have to act
on them right now (e.g. you may anticipate the eventual need to move
to hashmap, but prototyping with linked list is perfectly fine;
being aware of the possibility alone would force us to be careful to
make sure that the implementation detail does not leak through too
much and confined within _lookup(), _find(), etc. functions, and
that awareness is good enough at this point).

Thanks.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 03/11] promisor-remote: implement promisor_remote_get_direct()
  2019-03-12 13:29 ` [PATCH v3 03/11] promisor-remote: implement promisor_remote_get_direct() Christian Couder
@ 2019-03-13  4:23   ` Junio C Hamano
  2019-04-01 16:41     ` Christian Couder
  0 siblings, 1 reply; 22+ messages in thread
From: Junio C Hamano @ 2019-03-13  4:23 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Jeff King, Ben Peart, Jonathan Tan, Jonathan Nieder,
	Stefan Beller, Nguyen Thai Ngoc Duy, Mike Hommey, Lars Schneider,
	Eric Wong, Christian Couder, Jeff Hostetler, Eric Sunshine,
	Beat Bolli

Christian Couder <christian.couder@gmail.com> writes:

> @@ -98,3 +99,19 @@ int has_promisor_remote(void)
>  {
>  	return !!promisor_remote_find(NULL);
>  }
> +
> +int promisor_remote_get_direct(const struct object_id *oids, int oid_nr)
> +{
> +	struct promisor_remote *o;
> +
> +	promisor_remote_init();
> +
> +	for (o = promisors; o; o = o->next) {
> +		if (fetch_objects(o->remote_name, oids, oid_nr) < 0)
> +			continue;
> +		return 0;

Suppose the caller asks to fetch 3 objects, A, B and C, from two
promisors.  The first promisor can give you A and B but cannot give
you C.  The second promisor only can give you C.

Does fetch_objects() return failure after attempting to fetch A, B
and C from the first promisor?  Then we go on to the second promisor
but do we ask all three?  That would mean the second promisor will
also fail because it cannot give you A and B, and then the whole
thing would fail.  It would be nicer if the mechanism would allow us
to fetch what is still missing from later promisor, perhaps.

As the original "fetch" protocol only allows you to fetch a pack
with everything you asked for in it, instead of feeding you a pack
with best effort, I think the answer to the above is "it is very
hard to improve over what we have here", but people may have
interesting ideas ;-)

> +	}
> +
> +	return -1;
> +}
> +

Adding trailing blank line at the end?

> diff --git a/promisor-remote.h b/promisor-remote.h
> index bfbf7c0f21..f9f5825417 100644
> --- a/promisor-remote.h
> +++ b/promisor-remote.h
> @@ -13,5 +13,6 @@ struct promisor_remote {
>  extern struct promisor_remote *promisor_remote_new(const char *remote_name);
>  extern struct promisor_remote *promisor_remote_find(const char *remote_name);
>  extern int has_promisor_remote(void);
> +extern int promisor_remote_get_direct(const struct object_id *oids, int oid_nr);
>  
>  #endif /* PROMISOR_REMOTE_H */

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 04/11] promisor-remote: add promisor_remote_reinit()
  2019-03-12 13:29 ` [PATCH v3 04/11] promisor-remote: add promisor_remote_reinit() Christian Couder
@ 2019-03-13  4:28   ` Junio C Hamano
  2019-04-01 16:41     ` Christian Couder
  0 siblings, 1 reply; 22+ messages in thread
From: Junio C Hamano @ 2019-03-13  4:28 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Jeff King, Ben Peart, Jonathan Tan, Jonathan Nieder,
	Stefan Beller, Nguyen Thai Ngoc Duy, Mike Hommey, Lars Schneider,
	Eric Wong, Christian Couder, Jeff Hostetler, Eric Sunshine,
	Beat Bolli

Christian Couder <christian.couder@gmail.com> writes:

> From: Christian Couder <christian.couder@gmail.com>
>
> We will need to reinitialize the promisor remote configuration
> as we will make some changes to it in a later commit.
>
> Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
> ---

At this point, turning "initialized" into a file-scope static, and 
building reinit as

	void promisor_remote_reinit(void)
	{
		initialized = 0;
		... clear existing "promisor" entries ...
		promisor_remote_init();
	}

may make more sense.

> -static void promisor_remote_init(void)
> +static void promisor_remote_do_init(int force)
>  {
>  	static int initialized;
>  
> -	if (initialized)
> +	if (!force && initialized)
>  		return;
>  	initialized = 1;
>  
>  	git_config(promisor_remote_config, NULL);

The promisors and promisors_tail would need to be reinitiazlied and
existing elements on the list purged.  Otherwise, after removing an
entry from the configuration file, the entry will still stay around.

>  }
>  
> +static inline void promisor_remote_init(void)
> +{
> +	promisor_remote_do_init(0);
> +}
> +
> +void promisor_remote_reinit(void)
> +{
> +	promisor_remote_do_init(1);
> +}
> +
>  struct promisor_remote *promisor_remote_find(const char *remote_name)
>  {
>  	promisor_remote_init();
> diff --git a/promisor-remote.h b/promisor-remote.h
> index f9f5825417..f96722bc66 100644
> --- a/promisor-remote.h
> +++ b/promisor-remote.h
> @@ -10,6 +10,7 @@ struct promisor_remote {
>  	struct promisor_remote *next;
>  };
>  
> +extern void promisor_remote_reinit(void);
>  extern struct promisor_remote *promisor_remote_new(const char *remote_name);
>  extern struct promisor_remote *promisor_remote_find(const char *remote_name);
>  extern int has_promisor_remote(void);

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 05/11] promisor-remote: use repository_format_partial_clone
  2019-03-12 13:29 ` [PATCH v3 05/11] promisor-remote: use repository_format_partial_clone Christian Couder
@ 2019-03-13  4:31   ` Junio C Hamano
  2019-04-01 16:42     ` Christian Couder
  0 siblings, 1 reply; 22+ messages in thread
From: Junio C Hamano @ 2019-03-13  4:31 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Jeff King, Ben Peart, Jonathan Tan, Jonathan Nieder,
	Stefan Beller, Nguyen Thai Ngoc Duy, Mike Hommey, Lars Schneider,
	Eric Wong, Christian Couder, Jeff Hostetler, Eric Sunshine,
	Beat Bolli

Christian Couder <christian.couder@gmail.com> writes:

> A remote specified using the extensions.partialClone config
> option should be considered a promisor remote too.
>
> This remote should be at the end of the promisor remote list,
> so that it is used only if objects have not been found in other
> remotes.

That's a declaration, not a rationale, and does not answer "Why
should the origin be only used as the last resort?".

> Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
> ---
>  promisor-remote.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/promisor-remote.c b/promisor-remote.c
> index ea74f6d8a8..dcf6ef6521 100644
> --- a/promisor-remote.c
> +++ b/promisor-remote.c
> @@ -83,6 +83,17 @@ static void promisor_remote_do_init(int force)
>  	initialized = 1;
>  
>  	git_config(promisor_remote_config, NULL);
> +
> +	if (repository_format_partial_clone) {
> +		struct promisor_remote *o, *previous;
> +
> +		o = promisor_remote_look_up(repository_format_partial_clone,
> +					    &previous);
> +		if (o)
> +			promisor_remote_move_to_tail(o, previous);
> +		else
> +			promisor_remote_new(repository_format_partial_clone);
> +	}
>  }
>  
>  static inline void promisor_remote_init(void)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 02/11] Add initial support for many promisor remotes
  2019-03-13  4:09   ` Junio C Hamano
@ 2019-03-13  4:34     ` Junio C Hamano
  2019-04-01 16:41     ` Christian Couder
  1 sibling, 0 replies; 22+ messages in thread
From: Junio C Hamano @ 2019-03-13  4:34 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Jeff King, Ben Peart, Jonathan Tan, Jonathan Nieder,
	Stefan Beller, Nguyen Thai Ngoc Duy, Mike Hommey, Lars Schneider,
	Eric Wong, Christian Couder, Jeff Hostetler, Eric Sunshine,
	Beat Bolli

Junio C Hamano <gitster@pobox.com> writes:

> What is the rule for these promisor names?

Disregard this part (and only this part).  The values are not names,
but just "is this thing a promisor" boolean.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 02/11] Add initial support for many promisor remotes
  2019-03-13  4:09   ` Junio C Hamano
  2019-03-13  4:34     ` Junio C Hamano
@ 2019-04-01 16:41     ` Christian Couder
  1 sibling, 0 replies; 22+ messages in thread
From: Christian Couder @ 2019-04-01 16:41 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Jeff King, Ben Peart, Jonathan Tan, Jonathan Nieder,
	Nguyen Thai Ngoc Duy, Mike Hommey, Lars Schneider, Eric Wong,
	Christian Couder, Jeff Hostetler, Eric Sunshine, Beat Bolli

On Wed, Mar 13, 2019 at 5:09 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Christian Couder <christian.couder@gmail.com> writes:
>
> > +struct promisor_remote *promisor_remote_new(const char *remote_name)
> > +{
>
> Shouldn't this be static?  The config callback that calls this
> function is inside this file.

Yeah, I made it static.

> > +     struct promisor_remote *o;
> > +
> > +     o = xcalloc(1, sizeof(*o));
> > +     o->remote_name = xstrdup(remote_name);
>
> A comment on this later...
>
> > +static struct promisor_remote *promisor_remote_look_up(const char *remote_name,
> > +                                                    struct promisor_remote **previous)
>
> In our codebase, this operation is far more often called "lookup",
> one word, according to "git grep -e look_up \*.h".

Ok, I changed it to "lookup".

> > +{
> > +     struct promisor_remote *o, *p;
> > +
> > +     for (p = NULL, o = promisors; o; p = o, o = o->next)
> > +             if (o->remote_name && !strcmp(o->remote_name, remote_name)) {
> > +                     if (previous)
> > +                             *previous = p;
>
> I think the "previous" thing is for the callers to learn what
> pointer points at the found entry, allowing e.g. an element to be
> inserted just before the found element.

Actually it's to make it easy to move the found element.

> If so, would it make more
> sense to use the more familiar pattern to use
>
>         *previous = &promisors;
>
> here?

If I do that I get an "error: assignment from incompatible pointer
type" as "*previous" is of type "struct promisor_remote *" while
"&promisors" is of type "struct promisor_remote **".

Maybe you mean:

         *previous = promisors;

but I fail to see how that would correctly pass the previous element
when the found one is not the first one.

> That would remove the need to switch on NULL-ness of previous
> in the caller.

In the only caller that passes a non NULL previous, we call
promisor_remote_move_to_tail() which does:

    if (previous)
        previous->next = o->next;
    else
        promisors = o->next ? o->next : o;

So yeah we check the NULL-ness of previous, but if previous has been
set to promisors, then previous->next = o->next will not set promisors
correctly.

I guess we are not here in the case were the familiar pattern you are
thinking about can be applied. Or is there an example, maybe in the
Git source code, that I could learn from?

Another possibility is to just use hashmap as you suggest below or
list.h. It might be a bit wasteful, but the code simplification might
be worth it.

> > diff --git a/promisor-remote.h b/promisor-remote.h
> > new file mode 100644
> > index 0000000000..bfbf7c0f21
> > --- /dev/null
> > +++ b/promisor-remote.h
> > @@ -0,0 +1,17 @@
> > +#ifndef PROMISOR_REMOTE_H
> > +#define PROMISOR_REMOTE_H
> > +
> > +/*
> > + * Promisor remote linked list
> > + * Its information come from remote.XXX config entries.
> > + */
> > +struct promisor_remote {
> > +     const char *remote_name;
> > +     struct promisor_remote *next;
> > +};
>
> Would it make the management of storage easier to make it
>
>         struct promisor_remote {
>                 struct promisor_remote *next;
>                 const char name[FLEX_ARRAY];
>         };
>
> that will allow allocation with
>
>         struct promisor_remote *r;
>         FLEX_ALLOC_STR(r, name, remote_name);

Ok to use a flex array. If we ever use arrays or hashmaps of promisor
remotes, we might have to go back to not using one.

> Or if the remote_name field must be a pointer, perhaps use
> FLEXPTR_ALLOC_STR().

[...]

> Can the name of promisor be any string?  If they end up getting used
> as part of a path on the filesystem, we'd need to worry about case
> sensitivity and UTF-8 normalization issues as well.

It looks like for regular remotes we only check if they start with /.
So I don't think we need to do more than that for promisor remotes. I
added the check.

> In a large enough project where multi-promisor makes sense, what is
> the expected number of promisors a repository would define?  10s?
> 1000s?  Would a linked list still make sense when deployed in the
> real world, or would we be forced to move to something like hashmap
> later?

I am ok to use hashmap to make it similar with regular remotes.

For now I don't expect large projects to use more than 10s promisors
though. They are defined in the config file and I don't think people
will be happy if they have to manage more than 10s promisors in their
config file. If people really start to use more than that, they are
likely to ask us for a new mechanism to manage them (and to
automatically have them configured from servers). So maybe we can
change that if/when we have to work on such mechanism.




> You do not have to have the answers to all these questions, and even
> the ones with concrete answers, you do not necessarily have to act
> on them right now (e.g. you may anticipate the eventual need to move
> to hashmap, but prototyping with linked list is perfectly fine;
> being aware of the possibility alone would force us to be careful to
> make sure that the implementation detail does not leak through too
> much and confined within _lookup(), _find(), etc. functions, and
> that awareness is good enough at this point).
>
> Thanks.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 03/11] promisor-remote: implement promisor_remote_get_direct()
  2019-03-13  4:23   ` Junio C Hamano
@ 2019-04-01 16:41     ` Christian Couder
  0 siblings, 0 replies; 22+ messages in thread
From: Christian Couder @ 2019-04-01 16:41 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Jeff King, Ben Peart, Jonathan Tan, Jonathan Nieder,
	Nguyen Thai Ngoc Duy, Mike Hommey, Lars Schneider, Eric Wong,
	Christian Couder, Jeff Hostetler, Eric Sunshine, Beat Bolli

On Wed, Mar 13, 2019 at 5:24 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Christian Couder <christian.couder@gmail.com> writes:

> > +int promisor_remote_get_direct(const struct object_id *oids, int oid_nr)
> > +{
> > +     struct promisor_remote *o;
> > +
> > +     promisor_remote_init();
> > +
> > +     for (o = promisors; o; o = o->next) {
> > +             if (fetch_objects(o->remote_name, oids, oid_nr) < 0)
> > +                     continue;
> > +             return 0;
>
> Suppose the caller asks to fetch 3 objects, A, B and C, from two
> promisors.  The first promisor can give you A and B but cannot give
> you C.  The second promisor only can give you C.
>
> Does fetch_objects() return failure after attempting to fetch A, B
> and C from the first promisor?

Yes, I think it should still send the objects it has and then fail.

Maybe it doesn't work like this right now (I haven't checked), or
maybe the failure should be different than a regular failure. In those
cases I can maybe improve on that in a later iteration.

> Then we go on to the second promisor
> but do we ask all three?  That would mean the second promisor will
> also fail because it cannot give you A and B, and then the whole
> thing would fail.  It would be nicer if the mechanism would allow us
> to fetch what is still missing from later promisor, perhaps.

Yeah, I implemented fetching only what is still missing in v4.

> As the original "fetch" protocol only allows you to fetch a pack
> with everything you asked for in it, instead of feeding you a pack
> with best effort, I think the answer to the above is "it is very
> hard to improve over what we have here", but people may have
> interesting ideas ;-)

I think we should make it best effort when acting as a promisor
remote. I will take a look at that.

> > +     }
> > +
> > +     return -1;
> > +}
> > +
>
> Adding trailing blank line at the end?

Removed.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 04/11] promisor-remote: add promisor_remote_reinit()
  2019-03-13  4:28   ` Junio C Hamano
@ 2019-04-01 16:41     ` Christian Couder
  0 siblings, 0 replies; 22+ messages in thread
From: Christian Couder @ 2019-04-01 16:41 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Jeff King, Ben Peart, Jonathan Tan, Jonathan Nieder,
	Nguyen Thai Ngoc Duy, Mike Hommey, Lars Schneider, Eric Wong,
	Christian Couder, Jeff Hostetler, Eric Sunshine, Beat Bolli

On Wed, Mar 13, 2019 at 5:28 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Christian Couder <christian.couder@gmail.com> writes:
>
> > From: Christian Couder <christian.couder@gmail.com>
> >
> > We will need to reinitialize the promisor remote configuration
> > as we will make some changes to it in a later commit.
> >
> > Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
> > ---
>
> At this point, turning "initialized" into a file-scope static, and
> building reinit as
>
>         void promisor_remote_reinit(void)
>         {
>                 initialized = 0;
>                 ... clear existing "promisor" entries ...
>                 promisor_remote_init();
>         }
>
> may make more sense.

Ok, I implemented that.

> > -static void promisor_remote_init(void)
> > +static void promisor_remote_do_init(int force)
> >  {
> >       static int initialized;
> >
> > -     if (initialized)
> > +     if (!force && initialized)
> >               return;
> >       initialized = 1;
> >
> >       git_config(promisor_remote_config, NULL);
>
> The promisors and promisors_tail would need to be reinitiazlied and
> existing elements on the list purged.  Otherwise, after removing an
> entry from the configuration file, the entry will still stay around.

Yeah, that's in the new version too.

> >  }

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 05/11] promisor-remote: use repository_format_partial_clone
  2019-03-13  4:31   ` Junio C Hamano
@ 2019-04-01 16:42     ` Christian Couder
  2019-04-01 17:25       ` Junio C Hamano
  0 siblings, 1 reply; 22+ messages in thread
From: Christian Couder @ 2019-04-01 16:42 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Jeff King, Ben Peart, Jonathan Tan, Jonathan Nieder,
	Nguyen Thai Ngoc Duy, Mike Hommey, Lars Schneider, Eric Wong,
	Christian Couder, Jeff Hostetler, Eric Sunshine, Beat Bolli

On Wed, Mar 13, 2019 at 5:31 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Christian Couder <christian.couder@gmail.com> writes:
>
> > A remote specified using the extensions.partialClone config
> > option should be considered a promisor remote too.
> >
> > This remote should be at the end of the promisor remote list,
> > so that it is used only if objects have not been found in other
> > remotes.
>
> That's a declaration, not a rationale, and does not answer "Why
> should the origin be only used as the last resort?".

I have added the following explanation:

    If we are using promisor remotes other than the origin, it is
    because these other promisor remotes are likely to be better
    for some reason (maybe they are closer or faster for some kind
    of objects) than the origin, so it's logical that we try to use
    them first.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 05/11] promisor-remote: use repository_format_partial_clone
  2019-04-01 16:42     ` Christian Couder
@ 2019-04-01 17:25       ` Junio C Hamano
  0 siblings, 0 replies; 22+ messages in thread
From: Junio C Hamano @ 2019-04-01 17:25 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Jeff King, Ben Peart, Jonathan Tan, Jonathan Nieder,
	Nguyen Thai Ngoc Duy, Mike Hommey, Lars Schneider, Eric Wong,
	Christian Couder, Jeff Hostetler, Eric Sunshine, Beat Bolli

Christian Couder <christian.couder@gmail.com> writes:

> On Wed, Mar 13, 2019 at 5:31 AM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> Christian Couder <christian.couder@gmail.com> writes:
>>
>> > A remote specified using the extensions.partialClone config
>> > option should be considered a promisor remote too.
>> >
>> > This remote should be at the end of the promisor remote list,
>> > so that it is used only if objects have not been found in other
>> > remotes.
>>
>> That's a declaration, not a rationale, and does not answer "Why
>> should the origin be only used as the last resort?".
>
> I have added the following explanation:
>
>     If we are using promisor remotes other than the origin, it is
>     because these other promisor remotes are likely to be better
>     for some reason (maybe they are closer or faster for some kind
>     of objects) than the origin, so it's logical that we try to use
>     them first.

Alternatively, perhaps you cloned from the origin repository which
is a mirror that is not as reliable as the real thing but is faster
to access when it _is_ up, and you added the true source that
'origin' mirrors from as the fallback, as it should only be used
when 'origin' is down.

So what you wrote is not a convincing explanation why we should
treat the origin as the fallback.

Writing in the end-user documentation something like "because the
'origin' remote is treated as the last resort fallback repository,
after consulting other promisor remotes first and failing to obtain
needed objects, you may want to use repositories with better
availability listed as 'other promisor remotes' and set the worst
connected remote as 'origin'" may be a valid thing to do, though.

I do not like to see the tool making such a policy decision for
users very much, but at least by honestly documenting, we explain
what we unilaterally decide to do without having a strong
justification (i.e. we could decide the other way and the decision
would be equally valid, and because there is no single best choice,
we arbitrarily picked one) to the users, and with that, they can
adjust their use case to what the tool does to take advantage of the
tool's design decision.

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2019-04-01 17:53 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-12 13:29 [PATCH v3 00/11] Many promisor remotes Christian Couder
2019-03-12 13:29 ` [PATCH v3 01/11] fetch-object: make functions return an error code Christian Couder
2019-03-12 13:29 ` [PATCH v3 02/11] Add initial support for many promisor remotes Christian Couder
2019-03-13  4:09   ` Junio C Hamano
2019-03-13  4:34     ` Junio C Hamano
2019-04-01 16:41     ` Christian Couder
2019-03-12 13:29 ` [PATCH v3 03/11] promisor-remote: implement promisor_remote_get_direct() Christian Couder
2019-03-13  4:23   ` Junio C Hamano
2019-04-01 16:41     ` Christian Couder
2019-03-12 13:29 ` [PATCH v3 04/11] promisor-remote: add promisor_remote_reinit() Christian Couder
2019-03-13  4:28   ` Junio C Hamano
2019-04-01 16:41     ` Christian Couder
2019-03-12 13:29 ` [PATCH v3 05/11] promisor-remote: use repository_format_partial_clone Christian Couder
2019-03-13  4:31   ` Junio C Hamano
2019-04-01 16:42     ` Christian Couder
2019-04-01 17:25       ` Junio C Hamano
2019-03-12 13:29 ` [PATCH v3 06/11] Use promisor_remote_get_direct() and has_promisor_remote() Christian Couder
2019-03-12 13:29 ` [PATCH v3 07/11] promisor-remote: parse remote.*.partialclonefilter Christian Couder
2019-03-12 13:29 ` [PATCH v3 08/11] builtin/fetch: remove unique promisor remote limitation Christian Couder
2019-03-12 13:29 ` [PATCH v3 09/11] t0410: test fetching from many promisor remotes Christian Couder
2019-03-12 13:29 ` [PATCH v3 10/11] partial-clone: add multiple remotes in the doc Christian Couder
2019-03-12 13:29 ` [PATCH v3 11/11] remote: add promisor and partial clone config to " Christian Couder

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).