git@vger.kernel.org list mirror (unofficial, one of many)
* [PATCH v5 00/16] Many promisor remotes
@ 2019-04-09 16:11 Christian Couder
  2019-04-09 16:11 ` [PATCH v5 01/16] t0410: remove pipes after git commands Christian Couder
                   ` (16 more replies)
  0 siblings, 17 replies; 28+ messages in thread
From: Christian Couder @ 2019-04-09 16:11 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, SZEDER Gábor, Ramsay Jones

This patch series is based on:

763fb763b8 (Merge branch 'jt/batch-fetch-blobs-in-diff' into jch, 2019-04-08)

to avoid issues with jt/batch-fetch-blobs-in-diff.

Introduction
~~~~~~~~~~~~

This patch series is a follow-up to the "remote odb" patch series
that I sent last year, which was itself a follow-up to previous
series. See the links section for more information.

The goal of this patch series is to make it possible to have and to
fetch missing objects from multiple remotes instead of only one.

For now the fetch order is the order of the remotes in the config,
except for the remote specified by extensions.partialclone config
option which comes last in the fetch order.
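
As an illustration only (this is not code from the series), the
ordering rule above can be sketched in C. The function name
order_promisors() and the array-of-names representation are
assumptions made for the example; the series itself keeps the remotes
in a linked list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: "names" holds remote names in config order.
 * If "last" (standing in for the remote named by
 * extensions.partialclone) is present, shift it to the end while
 * keeping the relative order of the other remotes.
 */
static void order_promisors(const char **names, size_t nr, const char *last)
{
	size_t i;

	assert(names || !nr);
	if (!last)
		return;

	for (i = 0; i < nr; i++) {
		if (!strcmp(names[i], last)) {
			const char *found = names[i];

			/* Close the gap, then put the match at the end. */
			memmove(&names[i], &names[i + 1],
				(nr - i - 1) * sizeof(*names));
			names[nr - 1] = found;
			return;
		}
	}
}
```

With the names "origin", "fast", "near" in config order and "origin"
as the extensions.partialclone remote, the resulting fetch order in
this sketch is "fast", "near", "origin".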

I selected the name "Promisor remote" over "Partial clone remote"
because it is shorter and because it is not only about cloning.

The existing extensions.partialclone config option is respected, but
it is not written in the config when a partial clone or fetch is
made. Instead remote.<name>.promisor is set to "true". This may create
a compatibility issue, but it makes it possible to start using many
promisor remotes by just cloning and fetching from different remotes
with partial clone filters. The compatibility issue could be resolved
in a future iteration by just setting extensions.partialclone instead
of remote.<name>.promisor the first time a promisor remote is used.

The code might not work with many promisor remotes that don't all have
all the promised objects, as that would require the fetch protocol to
send packs with best effort, as described by Junio in:

https://public-inbox.org/git/xmqqpnqve71d.fsf@gitster-ct.c.googlers.com/

In general I have tried to change as few things as possible in the
first patches of the series, though the last patches try to hide the
old features that only made sense for the general code to use when
there was only one promisor remote.

High level view of new patches since the V4
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There are 5 new patches in this series compared to V4. They are:

  - Patch 1/16 (t0410: remove pipes after git commands) 

This is a preparatory cleanup patch to improve t0410 as suggested by
Szeder Gábor.

  - Patch 8/16 (diff: use promisor-remote.h instead of fetch-object.h)

This patch applies to diff.c the same changes that are made in patch
7/16. They are in a separate patch because the changes need to be made
only if the jt/batch-fetch-blobs-in-diff series is merged. So
depending on what happens to jt/batch-fetch-blobs-in-diff, this patch
can either be squashed into 7/16, be squashed into
jt/batch-fetch-blobs-in-diff, or be dropped.

  - Patch 14/16 (Remove fetch-object.{c,h} in favor of promisor-remote.{c,h})
  - Patch 15/16 (Move repository_format_partial_clone to promisor-remote.c)
  - Patch 16/16 (Move core_partial_clone_filter_default to promisor-remote.c)

These patches try to hide the old features (fetch_objects(),
repository_format_partial_clone and core_partial_clone_filter_default)
that only made sense for the general code to use when there was only
one promisor remote. This ensures that there will be compilation
errors rather than bugs or test failures if the old features are used
in the old-fashioned way.

High level view of other changes since the V4
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The title and purpose of the old patches didn't change, but the
following changes were made:

  - Patch 3/16 (Add initial support for many promisor remotes)

Remove check for "r->name" as "name" became a flex array in struct
promisor_remote as suggested by Szeder Gábor and Junio.

  - Patch 4/16 (promisor-remote: implement promisor_remote_get_direct())

Added call to free() "missing" array.

Added "struct object_id;" declaration in "promisor-remote.h".

  - Patch 6/16 (promisor-remote: use repository_format_partial_clone)

Improved explanation about why the remote specified using the
extensions.partialClone config option is tried last.

  - Patch 7/16 (Use promisor_remote_get_direct() and has_promisor_remote())

Remove now useless includes of "fetch-object.h" as suggested by Junio.

  - Patch 11/16 (t0410: test fetching from many promisor remotes)

Improve tests as suggested by Szeder Gábor.

  - Patch 12/16 (partial-clone: add multiple remotes in the doc)

Add explanation in the doc about why the remote specified using the
extensions.partialClone config option is tried last as suggested by
Junio.

Other doc improvements.

High level overview of old patches in this patch series
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  - Patch 2/16 (fetch-object: make functions return an error code)

This patch makes functions in fetch-object.c return an error code,
which is necessary to later tell that they failed and try another
promisor remote when there is more than one. This could also just be
seen as a fix to these functions.

  - Patch 3/16 (Add initial support for many promisor remotes)

This introduces the minimum infrastructure for promisor remotes.

  - Patch 4/16 (promisor-remote: implement promisor_remote_get_direct())  
  - Patch 5/16 (promisor-remote: add promisor_remote_reinit())
  - Patch 6/16 (promisor-remote: use repository_format_partial_clone)

These patches add a few missing bits in the promisor remote
infrastructure that will be needed in the following patches.

  - Patch 7/16 (Use promisor_remote_get_direct() and has_promisor_remote())

This replaces the previous interface that used only one promisor
remote defined in extensions.partialclone with the new interface
created by the previous patches.

  - Patch 9/16 (promisor-remote: parse remote.*.partialclonefilter)

This replaces the way a partial clone filter was handled with a new
way, based on the previous patches, that supports more than one
partial clone filter.

  - Patch 10/16 (builtin/fetch: remove unique promisor remote limitation)

This patch removes the limitation in builtin/fetch.c that allowed
only one promisor remote.

  - Patch 11/16 (t0410: test fetching from many promisor remotes)

This adds test cases that show that more than one promisor remote
can now be used and that remote.<name>.promisor is set to "true" when
fetching from a new promisor remote.

  - Patch 12/16 (partial-clone: add multiple remotes in the doc)
  - Patch 13/16 (remote: add promisor and partial clone config to the doc)

These documentation patches explain how things can work with more than
one promisor remote.

Links
~~~~~

This patch series on GitHub:

V5: https://github.com/chriscool/git/commits/many-promisor-remotes
V4: https://github.com/chriscool/git/commits/many-promisor-remotes58
V3: https://github.com/chriscool/git/commits/many-promisor-remotes40
V2: https://github.com/chriscool/git/commits/many-promisor-remotes35
V1: https://github.com/chriscool/git/commits/many-promisor-remotes17

On the mailing list:

V4: https://public-inbox.org/git/20190401164045.17328-1-chriscool@tuxfamily.org/
V3: https://public-inbox.org/git/20190312132959.11764-1-chriscool@tuxfamily.org/
V2: https://public-inbox.org/git/20190122144212.15119-1-chriscool@tuxfamily.org/
V1: https://public-inbox.org/git/20181211052746.16218-1-chriscool@tuxfamily.org/

This patch series is a follow up from the discussions related to
the remote odb V4 patch series:

https://public-inbox.org/git/20180802061505.2983-1-chriscool@tuxfamily.org/

Especially in:

https://public-inbox.org/git/CAP8UFD3nrhjANwNDqTwx5ZtnZNcnbAFqUN=u=LrvzuH4+3wQQA@mail.gmail.com/

I said that I would like to work on things in the following order:

  1) Teaching partial clone to attempt to fetch missing objects from
multiple remotes instead of only one using the order in the config.

  2) Simplifying the protocol for fetching missing objects so that it
can be satisfied by a lighter weight object storage system than a full
Git server.

  3) Making it possible to explicitly define an order in which the
remotes are accessed.

  4) Making the criteria for what objects can be missing more
aggressive, so that I can "git add" a large file and work with it
using Git without even having a second copy of that object in my local
object store.

This patch series is about item 1).

The previous remote odb patch series on GitHub:

V5: https://github.com/chriscool/git/commits/remote-odb
V4: https://github.com/chriscool/git/commits/remote-odb5
V3: https://github.com/chriscool/git/commits/remote-odb3
V2: https://github.com/chriscool/git/commits/remote-odb2
V1: https://github.com/chriscool/git/commits/remote-odb1

Discussions related to previous versions of the odb patch series:

V4: https://public-inbox.org/git/20180802061505.2983-1-chriscool@tuxfamily.org/
V3: https://public-inbox.org/git/20180713174959.16748-1-chriscool@tuxfamily.org/
V2: https://public-inbox.org/git/20180630083542.20347-1-chriscool@tuxfamily.org/
V1: https://public-inbox.org/git/20180623121846.19750-1-chriscool@tuxfamily.org/


Christian Couder (16):
  t0410: remove pipes after git commands
  fetch-object: make functions return an error code
  Add initial support for many promisor remotes
  promisor-remote: implement promisor_remote_get_direct()
  promisor-remote: add promisor_remote_reinit()
  promisor-remote: use repository_format_partial_clone
  Use promisor_remote_get_direct() and has_promisor_remote()
  diff: use promisor-remote.h instead of fetch-object.h
  promisor-remote: parse remote.*.partialclonefilter
  builtin/fetch: remove unique promisor remote limitation
  t0410: test fetching from many promisor remotes
  partial-clone: add multiple remotes in the doc
  remote: add promisor and partial clone config to the doc
  Remove fetch-object.{c,h} in favor of promisor-remote.{c,h}
  Move repository_format_partial_clone to promisor-remote.c
  Move core_partial_clone_filter_default to promisor-remote.c

 Documentation/config/remote.txt           |   8 +
 Documentation/technical/partial-clone.txt | 117 +++++++---
 Makefile                                  |   2 +-
 builtin/cat-file.c                        |   5 +-
 builtin/fetch.c                           |  29 +--
 builtin/gc.c                              |   3 +-
 builtin/repack.c                          |   3 +-
 cache-tree.c                              |   3 +-
 cache.h                                   |   2 -
 config.c                                  |   5 -
 connected.c                               |   3 +-
 diff.c                                    |   8 +-
 environment.c                             |   2 -
 fetch-object.c                            |  40 ----
 fetch-object.h                            |   9 -
 list-objects-filter-options.c             |  51 +++--
 list-objects-filter-options.h             |   3 +-
 packfile.c                                |   3 +-
 promisor-remote.c                         | 264 ++++++++++++++++++++++
 promisor-remote.h                         |  29 +++
 setup.c                                   |   3 +-
 sha1-file.c                               |  15 +-
 t/t0410-partial-clone.sh                  |  61 ++++-
 t/t5601-clone.sh                          |   3 +-
 t/t5616-partial-clone.sh                  |   4 +-
 unpack-trees.c                            |   7 +-
 26 files changed, 514 insertions(+), 168 deletions(-)
 delete mode 100644 fetch-object.c
 delete mode 100644 fetch-object.h
 create mode 100644 promisor-remote.c
 create mode 100644 promisor-remote.h

-- 
2.21.0.750.g68c8ebb2ac



* [PATCH v5 01/16] t0410: remove pipes after git commands
  2019-04-09 16:11 [PATCH v5 00/16] Many promisor remotes Christian Couder
@ 2019-04-09 16:11 ` Christian Couder
  2019-04-09 16:11 ` [PATCH v5 02/16] fetch-object: make functions return an error code Christian Couder
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 28+ messages in thread
From: Christian Couder @ 2019-04-09 16:11 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, SZEDER Gábor, Ramsay Jones

Let's not run a git command, especially one with "verify" in its
name, upstream of a pipe, because the pipe will hide the git
command's exit code.

While at it, let's also avoid a useless `cat` command piping
into `sed`.

Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t0410-partial-clone.sh | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index 5bd892f2f7..3559313bd0 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -166,8 +166,9 @@ test_expect_success 'fetching of missing objects' '
 	# associated packfile contains the object
 	ls repo/.git/objects/pack/pack-*.promisor >promisorlist &&
 	test_line_count = 1 promisorlist &&
-	IDX=$(cat promisorlist | sed "s/promisor$/idx/") &&
-	git verify-pack --verbose "$IDX" | grep "$HASH"
+	IDX=$(sed "s/promisor$/idx/" promisorlist) &&
+	git verify-pack --verbose "$IDX" >out &&
+	grep "$HASH" out
 '
 
 test_expect_success 'fetching of missing objects works with ref-in-want enabled' '
@@ -514,8 +515,9 @@ test_expect_success 'fetching of missing objects from an HTTP server' '
 	# associated packfile contains the object
 	ls repo/.git/objects/pack/pack-*.promisor >promisorlist &&
 	test_line_count = 1 promisorlist &&
-	IDX=$(cat promisorlist | sed "s/promisor$/idx/") &&
-	git verify-pack --verbose "$IDX" | grep "$HASH"
+	IDX=$(sed "s/promisor$/idx/" promisorlist) &&
+	git verify-pack --verbose "$IDX" >out &&
+	grep "$HASH" out
 '
 
 test_done
-- 
2.21.0.750.g68c8ebb2ac



* [PATCH v5 02/16] fetch-object: make functions return an error code
  2019-04-09 16:11 [PATCH v5 00/16] Many promisor remotes Christian Couder
  2019-04-09 16:11 ` [PATCH v5 01/16] t0410: remove pipes after git commands Christian Couder
@ 2019-04-09 16:11 ` Christian Couder
  2019-04-09 16:11 ` [PATCH v5 03/16] Add initial support for many promisor remotes Christian Couder
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 28+ messages in thread
From: Christian Couder @ 2019-04-09 16:11 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, SZEDER Gábor, Ramsay Jones,
	Christian Couder

From: Christian Couder <christian.couder@gmail.com>

The callers of the fetch_object() and fetch_objects() might
be interested in knowing if these functions succeeded or not.
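
As a rough sketch of why a return code is useful (not code from this
patch), a caller that knows whether a fetch failed can fall back to
another remote. The names fetch_fn, fake_fetch() and fetch_from_any()
are hypothetical, and fake_fetch() only stands in for a real fetch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fetch callback: returns 0 on success, negative on error. */
typedef int (*fetch_fn)(const char *remote_name);

/* Stand-in for a real fetch: only the remote called "backup" succeeds. */
static int fake_fetch(const char *remote_name)
{
	return strcmp(remote_name, "backup") ? -1 : 0;
}

/*
 * Try each remote in turn and stop at the first success.
 * Returns 0 if some remote succeeded, -1 otherwise.
 */
static int fetch_from_any(const char **remotes, size_t nr, fetch_fn fetch)
{
	size_t i;

	assert(fetch);
	for (i = 0; i < nr; i++)
		if (fetch(remotes[i]) == 0)
			return 0;
	return -1;
}
```

With a void-returning fetch function, the loop above could not tell
a failed attempt from a successful one.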

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 fetch-object.c | 13 ++++++++-----
 fetch-object.h |  4 ++--
 sha1-file.c    |  4 ++--
 3 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/fetch-object.c b/fetch-object.c
index 4266548800..eac4d448ef 100644
--- a/fetch-object.c
+++ b/fetch-object.c
@@ -5,11 +5,12 @@
 #include "transport.h"
 #include "fetch-object.h"
 
-static void fetch_refs(const char *remote_name, struct ref *ref)
+static int fetch_refs(const char *remote_name, struct ref *ref)
 {
 	struct remote *remote;
 	struct transport *transport;
 	int original_fetch_if_missing = fetch_if_missing;
+	int res;
 
 	fetch_if_missing = 0;
 	remote = remote_get(remote_name);
@@ -19,12 +20,14 @@ static void fetch_refs(const char *remote_name, struct ref *ref)
 
 	transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
 	transport_set_option(transport, TRANS_OPT_NO_DEPENDENTS, "1");
-	transport_fetch_refs(transport, ref);
+	res = transport_fetch_refs(transport, ref);
 	fetch_if_missing = original_fetch_if_missing;
+
+	return res;
 }
 
-void fetch_objects(const char *remote_name, const struct object_id *oids,
-		   int oid_nr)
+int fetch_objects(const char *remote_name, const struct object_id *oids,
+		  int oid_nr)
 {
 	struct ref *ref = NULL;
 	int i;
@@ -36,5 +39,5 @@ void fetch_objects(const char *remote_name, const struct object_id *oids,
 		new_ref->next = ref;
 		ref = new_ref;
 	}
-	fetch_refs(remote_name, ref);
+	return fetch_refs(remote_name, ref);
 }
diff --git a/fetch-object.h b/fetch-object.h
index d6444caa5a..7bcc7cadb0 100644
--- a/fetch-object.h
+++ b/fetch-object.h
@@ -3,7 +3,7 @@
 
 struct object_id;
 
-void fetch_objects(const char *remote_name, const struct object_id *oids,
-		   int oid_nr);
+int fetch_objects(const char *remote_name, const struct object_id *oids,
+		  int oid_nr);
 
 #endif
diff --git a/sha1-file.c b/sha1-file.c
index ad02649124..83358737f3 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -1373,8 +1373,8 @@ int oid_object_info_extended(struct repository *r, const struct object_id *oid,
 		    !already_retried && r == the_repository &&
 		    !(flags & OBJECT_INFO_FOR_PREFETCH)) {
 			/*
-			 * TODO Investigate having fetch_object() return
-			 * TODO error/success and stopping the music here.
+			 * TODO Investigate checking fetch_object() return
+			 * TODO value and stopping on error here.
 			 * TODO Pass a repository struct through fetch_object,
 			 * such that arbitrary repositories work.
 			 */
-- 
2.21.0.750.g68c8ebb2ac



* [PATCH v5 03/16] Add initial support for many promisor remotes
  2019-04-09 16:11 [PATCH v5 00/16] Many promisor remotes Christian Couder
  2019-04-09 16:11 ` [PATCH v5 01/16] t0410: remove pipes after git commands Christian Couder
  2019-04-09 16:11 ` [PATCH v5 02/16] fetch-object: make functions return an error code Christian Couder
@ 2019-04-09 16:11 ` Christian Couder
  2019-04-09 16:11 ` [PATCH v5 04/16] promisor-remote: implement promisor_remote_get_direct() Christian Couder
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 28+ messages in thread
From: Christian Couder @ 2019-04-09 16:11 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, SZEDER Gábor, Ramsay Jones,
	Christian Couder

From: Christian Couder <christian.couder@gmail.com>

The promisor-remote.{c,h} files will contain functions to
manage many promisor remotes.

We expect that there will not be a lot of promisor remotes,
so it is ok to use a simple linked list to manage them.
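
For readers unfamiliar with the idiom, here is a minimal standalone
sketch of a singly linked list with a tail pointer, which gives O(1)
appends and O(n) lookups. It uses plain malloc() and a C99 flexible
array member rather than Git's FLEX_ALLOC_STR(), and the names
struct node, append() and lookup() are chosen for the example only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct node {
	struct node *next;
	char name[];	/* C99 flexible array member */
};

static struct node *head;
static struct node **tail = &head;	/* points at the link to fill next */

/* Append a copy of "name"; returns the new node, or NULL on allocation failure. */
static struct node *append(const char *name)
{
	size_t len;
	struct node *n;

	assert(name);
	len = strlen(name);
	n = malloc(sizeof(*n) + len + 1);
	if (!n)
		return NULL;
	n->next = NULL;
	memcpy(n->name, name, len + 1);

	*tail = n;		/* sets head when empty, else last->next */
	tail = &n->next;
	return n;
}

/* Linear search; fine as long as the list stays short. */
static struct node *lookup(const char *name)
{
	struct node *n;

	for (n = head; n; n = n->next)
		if (!strcmp(n->name, name))
			return n;
	return NULL;
}
```

Because the tail pointer always addresses the link that the next node
must be stored into, the empty-list case needs no special handling.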

Helped-by: Jeff King <peff@peff.net>
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Makefile          |  1 +
 promisor-remote.c | 92 +++++++++++++++++++++++++++++++++++++++++++++++
 promisor-remote.h | 16 +++++++++
 3 files changed, 109 insertions(+)
 create mode 100644 promisor-remote.c
 create mode 100644 promisor-remote.h

diff --git a/Makefile b/Makefile
index 34fa9da39a..3523ae0517 100644
--- a/Makefile
+++ b/Makefile
@@ -954,6 +954,7 @@ LIB_OBJS += preload-index.o
 LIB_OBJS += pretty.o
 LIB_OBJS += prio-queue.o
 LIB_OBJS += progress.o
+LIB_OBJS += promisor-remote.o
 LIB_OBJS += prompt.o
 LIB_OBJS += protocol.o
 LIB_OBJS += quote.o
diff --git a/promisor-remote.c b/promisor-remote.c
new file mode 100644
index 0000000000..c249b80e02
--- /dev/null
+++ b/promisor-remote.c
@@ -0,0 +1,92 @@
+#include "cache.h"
+#include "promisor-remote.h"
+#include "config.h"
+
+static struct promisor_remote *promisors;
+static struct promisor_remote **promisors_tail = &promisors;
+
+static struct promisor_remote *promisor_remote_new(const char *remote_name)
+{
+	struct promisor_remote *r;
+
+	if (*remote_name == '/') {
+		warning(_("promisor remote name cannot begin with '/': %s"),
+			remote_name);
+		return NULL;
+	}
+
+	FLEX_ALLOC_STR(r, name, remote_name);
+
+	*promisors_tail = r;
+	promisors_tail = &r->next;
+
+	return r;
+}
+
+static struct promisor_remote *promisor_remote_lookup(const char *remote_name,
+						      struct promisor_remote **previous)
+{
+	struct promisor_remote *r, *p;
+
+	for (p = NULL, r = promisors; r; p = r, r = r->next)
+		if (!strcmp(r->name, remote_name)) {
+			if (previous)
+				*previous = p;
+			return r;
+		}
+
+	return NULL;
+}
+
+static int promisor_remote_config(const char *var, const char *value, void *data)
+{
+	const char *name;
+	int namelen;
+	const char *subkey;
+
+	if (parse_config_key(var, "remote", &name, &namelen, &subkey) < 0)
+		return 0;
+
+	if (!strcmp(subkey, "promisor")) {
+		char *remote_name;
+
+		if (!git_config_bool(var, value))
+			return 0;
+
+		remote_name = xmemdupz(name, namelen);
+
+		if (!promisor_remote_lookup(remote_name, NULL))
+			promisor_remote_new(remote_name);
+
+		free(remote_name);
+		return 0;
+	}
+
+	return 0;
+}
+
+static void promisor_remote_init(void)
+{
+	static int initialized;
+
+	if (initialized)
+		return;
+	initialized = 1;
+
+	git_config(promisor_remote_config, NULL);
+}
+
+struct promisor_remote *promisor_remote_find(const char *remote_name)
+{
+	promisor_remote_init();
+
+	if (!remote_name)
+		return promisors;
+
+	return promisor_remote_lookup(remote_name, NULL);
+}
+
+int has_promisor_remote(void)
+{
+	return !!promisor_remote_find(NULL);
+}
diff --git a/promisor-remote.h b/promisor-remote.h
new file mode 100644
index 0000000000..01dcdf4dc7
--- /dev/null
+++ b/promisor-remote.h
@@ -0,0 +1,16 @@
+#ifndef PROMISOR_REMOTE_H
+#define PROMISOR_REMOTE_H
+
+/*
+ * Promisor remote linked list
+ * Its information come from remote.XXX config entries.
+ */
+struct promisor_remote {
+	struct promisor_remote *next;
+	const char name[FLEX_ARRAY];
+};
+
+extern struct promisor_remote *promisor_remote_find(const char *remote_name);
+extern int has_promisor_remote(void);
+
+#endif /* PROMISOR_REMOTE_H */
-- 
2.21.0.750.g68c8ebb2ac



* [PATCH v5 04/16] promisor-remote: implement promisor_remote_get_direct()
  2019-04-09 16:11 [PATCH v5 00/16] Many promisor remotes Christian Couder
                   ` (2 preceding siblings ...)
  2019-04-09 16:11 ` [PATCH v5 03/16] Add initial support for many promisor remotes Christian Couder
@ 2019-04-09 16:11 ` Christian Couder
  2019-05-30 17:21   ` Derrick Stolee
  2019-04-09 16:11 ` [PATCH v5 05/16] promisor-remote: add promisor_remote_reinit() Christian Couder
                   ` (12 subsequent siblings)
  16 siblings, 1 reply; 28+ messages in thread
From: Christian Couder @ 2019-04-09 16:11 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, SZEDER Gábor, Ramsay Jones,
	Christian Couder

From: Christian Couder <christian.couder@gmail.com>

This is implemented for now by calling fetch_objects(). It fetches
from all the promisor remotes.
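
The general shape of such a loop can be sketched with a toy model
(this is not the code in the diff below). Objects are small integers,
have[] models the local object store, and toy_fetch() is a
hypothetical stand-in for fetch_objects(). One loud assumption: the
toy fetch keeps partial results when a remote has only some of the
requested objects, which the real fetch protocol may not do.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OBJ 8

/* Toy model: an "object" is a small integer id; have[id] means it is local. */
static int have[MAX_OBJ];

/* A toy remote provides the objects flagged in provides[]. */
struct toy_remote {
	int provides[MAX_OBJ];
};

/*
 * Fetch what the remote can provide. Returns 0 only if every requested
 * object was obtained, -1 otherwise (partial results are kept).
 */
static int toy_fetch(const struct toy_remote *r, const int *ids, size_t nr)
{
	int res = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		assert(ids[i] >= 0 && ids[i] < MAX_OBJ);
		if (r->provides[ids[i]])
			have[ids[i]] = 1;
		else
			res = -1;
	}
	return res;
}

/* Compact ids[] in place so that it lists only still-missing objects. */
static size_t keep_missing(int *ids, size_t nr)
{
	size_t i, j = 0;

	for (i = 0; i < nr; i++)
		if (!have[ids[i]])
			ids[j++] = ids[i];
	return j;
}

/* Ask each remote in order, narrowing the request after each failure. */
static int get_direct(const struct toy_remote *remotes, size_t remote_nr,
		      int *ids, size_t nr)
{
	size_t k;

	for (k = 0; k < remote_nr && nr; k++) {
		if (!toy_fetch(&remotes[k], ids, nr))
			return 0;
		nr = keep_missing(ids, nr);
	}
	return nr ? -1 : 0;
}
```

Narrowing the request between attempts avoids asking a later remote
for objects that an earlier one already delivered.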

Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 promisor-remote.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++
 promisor-remote.h |  3 +++
 2 files changed, 69 insertions(+)

diff --git a/promisor-remote.c b/promisor-remote.c
index c249b80e02..289f1dd074 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -1,6 +1,8 @@
 #include "cache.h"
+#include "object-store.h"
 #include "promisor-remote.h"
 #include "config.h"
+#include "fetch-object.h"
 
 static struct promisor_remote *promisors;
 static struct promisor_remote **promisors_tail = &promisors;
@@ -90,3 +92,67 @@ int has_promisor_remote(void)
 {
 	return !!promisor_remote_find(NULL);
 }
+
+static int remove_fetched_oids(struct object_id **oids, int oid_nr, int to_free)
+{
+	int i, missing_nr = 0;
+	int *missing = xcalloc(oid_nr, sizeof(*missing));
+	struct object_id *old_oids = *oids;
+	struct object_id *new_oids;
+	int old_fetch_if_missing = fetch_if_missing;
+
+	fetch_if_missing = 0;
+
+	for (i = 0; i < oid_nr; i++)
+		if (oid_object_info_extended(the_repository, &old_oids[i], NULL, 0)) {
+			missing[i] = 1;
+			missing_nr++;
+		}
+
+	fetch_if_missing = old_fetch_if_missing;
+
+	if (missing_nr) {
+		int j = 0;
+		new_oids = xcalloc(missing_nr, sizeof(*new_oids));
+		for (i = 0; i < oid_nr; i++)
+			if (missing[i])
+				oidcpy(&new_oids[j++], &old_oids[i]);
+		*oids = new_oids;
+		if (to_free)
+			free(old_oids);
+	}
+
+	free(missing);
+
+	return missing_nr;
+}
+
+int promisor_remote_get_direct(const struct object_id *oids, int oid_nr)
+{
+	struct promisor_remote *r;
+	struct object_id *missing_oids = (struct object_id *)oids;
+	int missing_nr = oid_nr;
+	int to_free = 0;
+	int res = -1;
+
+	promisor_remote_init();
+
+	for (r = promisors; r; r = r->next) {
+		if (fetch_objects(r->name, missing_oids, missing_nr) < 0) {
+			if (missing_nr == 1)
+				continue;
+			missing_nr = remove_fetched_oids(&missing_oids, missing_nr, to_free);
+			if (missing_nr) {
+				to_free = 1;
+				continue;
+			}
+		}
+		res = 0;
+		break;
+	}
+
+	if (to_free)
+		free(missing_oids);
+
+	return res;
+}
diff --git a/promisor-remote.h b/promisor-remote.h
index 01dcdf4dc7..92650cfd4c 100644
--- a/promisor-remote.h
+++ b/promisor-remote.h
@@ -1,6 +1,8 @@
 #ifndef PROMISOR_REMOTE_H
 #define PROMISOR_REMOTE_H
 
+struct object_id;
+
 /*
  * Promisor remote linked list
  * Its information come from remote.XXX config entries.
@@ -12,5 +14,6 @@ struct promisor_remote {
 
 extern struct promisor_remote *promisor_remote_find(const char *remote_name);
 extern int has_promisor_remote(void);
+extern int promisor_remote_get_direct(const struct object_id *oids, int oid_nr);
 
 #endif /* PROMISOR_REMOTE_H */
-- 
2.21.0.750.g68c8ebb2ac



* [PATCH v5 05/16] promisor-remote: add promisor_remote_reinit()
  2019-04-09 16:11 [PATCH v5 00/16] Many promisor remotes Christian Couder
                   ` (3 preceding siblings ...)
  2019-04-09 16:11 ` [PATCH v5 04/16] promisor-remote: implement promisor_remote_get_direct() Christian Couder
@ 2019-04-09 16:11 ` Christian Couder
  2019-04-09 16:11 ` [PATCH v5 06/16] promisor-remote: use repository_format_partial_clone Christian Couder
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 28+ messages in thread
From: Christian Couder @ 2019-04-09 16:11 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, SZEDER Gábor, Ramsay Jones,
	Christian Couder

From: Christian Couder <christian.couder@gmail.com>

We will need to reinitialize the promisor remote configuration
as we will make some changes to it in a later commit.
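
The pattern is a lazily initialized cache that can be dropped and
rebuilt. A minimal standalone sketch follows, with an int array
standing in for the configuration; the names source, entries and
cache_*() are hypothetical and not part of this patch.

```c
#include <assert.h>
#include <stdlib.h>

struct entry {
	struct entry *next;
	int value;
};

static struct entry *entries;
static struct entry **entries_tail = &entries;
static int initialized;

/* Hypothetical stand-in for the configuration being read. */
static const int *source;
static size_t source_nr;

/* Build the list from "source" once; later calls are no-ops. */
static void cache_init(void)
{
	size_t i;

	if (initialized)
		return;
	initialized = 1;

	assert(source || !source_nr);
	for (i = 0; i < source_nr; i++) {
		struct entry *e = calloc(1, sizeof(*e));

		if (!e)
			abort();
		e->value = source[i];
		*entries_tail = e;
		entries_tail = &e->next;
	}
}

/* Free every entry and reset the tail pointer to the empty state. */
static void cache_clear(void)
{
	while (entries) {
		struct entry *e = entries;

		entries = e->next;
		free(e);
	}
	entries_tail = &entries;
}

/* Forget the cached list and rebuild it from the current "source". */
static void cache_reinit(void)
{
	initialized = 0;
	cache_clear();
	cache_init();
}

static size_t cache_count(void)
{
	size_t n = 0;
	struct entry *e;

	cache_init();
	for (e = entries; e; e = e->next)
		n++;
	return n;
}
```

In this sketch a change to the source is not visible through
cache_count() until cache_reinit() is called, which is the situation
a reinit function is meant to address.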

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 promisor-remote.c | 22 ++++++++++++++++++++--
 promisor-remote.h |  1 +
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/promisor-remote.c b/promisor-remote.c
index 289f1dd074..46271eb3e3 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -67,10 +67,10 @@ static int promisor_remote_config(const char *var, const char *value, void *data
 	return 0;
 }
 
+static int initialized;
+
 static void promisor_remote_init(void)
 {
-	static int initialized;
-
 	if (initialized)
 		return;
 	initialized = 1;
@@ -78,6 +78,24 @@ static void promisor_remote_init(void)
 	git_config(promisor_remote_config, NULL);
 }
 
+static void promisor_remote_clear(void)
+{
+	while (promisors) {
+		struct promisor_remote *r = promisors;
+		promisors = promisors->next;
+		free(r);
+	}
+
+	promisors_tail = &promisors;
+}
+
+void promisor_remote_reinit(void)
+{
+	initialized = 0;
+	promisor_remote_clear();
+	promisor_remote_init();
+}
+
 struct promisor_remote *promisor_remote_find(const char *remote_name)
 {
 	promisor_remote_init();
diff --git a/promisor-remote.h b/promisor-remote.h
index 92650cfd4c..ff69963907 100644
--- a/promisor-remote.h
+++ b/promisor-remote.h
@@ -12,6 +12,7 @@ struct promisor_remote {
 	const char name[FLEX_ARRAY];
 };
 
+extern void promisor_remote_reinit(void);
 extern struct promisor_remote *promisor_remote_find(const char *remote_name);
 extern int has_promisor_remote(void);
 extern int promisor_remote_get_direct(const struct object_id *oids, int oid_nr);
-- 
2.21.0.750.g68c8ebb2ac



* [PATCH v5 06/16] promisor-remote: use repository_format_partial_clone
  2019-04-09 16:11 [PATCH v5 00/16] Many promisor remotes Christian Couder
                   ` (4 preceding siblings ...)
  2019-04-09 16:11 ` [PATCH v5 05/16] promisor-remote: add promisor_remote_reinit() Christian Couder
@ 2019-04-09 16:11 ` Christian Couder
  2019-04-09 16:11 ` [PATCH v5 07/16] Use promisor_remote_get_direct() and has_promisor_remote() Christian Couder
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 28+ messages in thread
From: Christian Couder @ 2019-04-09 16:11 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, SZEDER Gábor, Ramsay Jones

A remote specified using the extensions.partialClone config
option should be considered a promisor remote too.

For simplicity and to make things predictable, this promisor
remote should be either always the last one we try to get
objects from, or the first one. So it should always be either
at the end of the promisor remote list, or at its start.

We decided to make it the last one we try, because it is
likely that someone using many promisor remotes is doing so
because the other promisor remotes are better for some reason
(maybe they are closer or faster for some kind of objects)
than the origin, and the origin is likely to be the remote
specified by extensions.partialClone.

This justification is not very strong, but one choice had to
be made, and anyway the long term plan should be to make the
order somehow fully configurable.
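
Moving an entry to the end of a singly linked list that has a tail
pointer can be sketched as follows. This is a standalone illustration
with hypothetical names, not the function added by this patch. The
sketch returns early when the node is already last, because
re-appending such a node through the tail pointer would link it to
itself.

```c
#include <assert.h>
#include <stddef.h>

struct node {
	struct node *next;
	int id;
};

static struct node *head;
static struct node **tail = &head;

static void append(struct node *n)
{
	n->next = NULL;
	*tail = n;
	tail = &n->next;
}

/* "previous" is the node before n, or NULL when n is the head. */
static void move_to_tail(struct node *n, struct node *previous)
{
	assert(n);
	if (!n->next)
		return;		/* already last, nothing to do */

	/* Unlink n from its current position. */
	if (previous)
		previous->next = n->next;
	else
		head = n->next;

	/* Re-append it through the tail pointer. */
	n->next = NULL;
	*tail = n;
	tail = &n->next;
}
```

Starting from a list 1, 2, 3, moving the head to the tail in this
sketch yields 2, 3, 1.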

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 promisor-remote.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/promisor-remote.c b/promisor-remote.c
index 46271eb3e3..737689d044 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -40,6 +40,18 @@ static struct promisor_remote *promisor_remote_lookup(const char *remote_name,
 	return NULL;
 }
 
+static void promisor_remote_move_to_tail(struct promisor_remote *r,
+					 struct promisor_remote *previous)
+{
+	if (previous)
+		previous->next = r->next;
+	else
+		promisors = r->next ? r->next : r;
+	r->next = NULL;
+	*promisors_tail = r;
+	promisors_tail = &r->next;
+}
+
 static int promisor_remote_config(const char *var, const char *value, void *data)
 {
 	const char *name;
@@ -76,6 +88,17 @@ static void promisor_remote_init(void)
 	initialized = 1;
 
 	git_config(promisor_remote_config, NULL);
+
+	if (repository_format_partial_clone) {
+		struct promisor_remote *o, *previous;
+
+		o = promisor_remote_lookup(repository_format_partial_clone,
+					   &previous);
+		if (o)
+			promisor_remote_move_to_tail(o, previous);
+		else
+			promisor_remote_new(repository_format_partial_clone);
+	}
 }
 
 static void promisor_remote_clear(void)
-- 
2.21.0.750.g68c8ebb2ac


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v5 07/16] Use promisor_remote_get_direct() and has_promisor_remote()
  2019-04-09 16:11 [PATCH v5 00/16] Many promisor remotes Christian Couder
                   ` (5 preceding siblings ...)
  2019-04-09 16:11 ` [PATCH v5 06/16] promisor-remote: use repository_format_partial_clone Christian Couder
@ 2019-04-09 16:11 ` Christian Couder
  2019-04-09 16:11 ` [PATCH v5 08/16] diff: use promisor-remote.h instead of fetch-object.h Christian Couder
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 28+ messages in thread
From: Christian Couder @ 2019-04-09 16:11 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, SZEDER Gábor, Ramsay Jones

Instead of using the repository_format_partial_clone global
and fetch_objects() directly, let's use has_promisor_remote()
and promisor_remote_get_direct().

This way all the configured promisor remotes will be taken
into account, not only the one specified by
extensions.partialClone.

Also when cloning or fetching using a partial clone filter,
remote.origin.promisor will be set to "true" instead of
setting extensions.partialClone to "origin". This makes it
possible to use many promisor remotes just by fetching from
them.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
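
A rough sketch of what "taking all the configured promisor remotes
into account" can look like: walk the list in order and stop at the
first remote that provides the objects. This is an illustration under
stated assumptions, not the body of promisor_remote_get_direct(); the
names struct remote_entry, fetch_fn, try_remotes_in_order() and
fetch_only_from() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified list entry; not the real struct promisor_remote. */
struct remote_entry {
	struct remote_entry *next;
	const char *name;
};

/* Hypothetical fetch callback: returns 0 on success, nonzero on failure. */
typedef int (*fetch_fn)(const char *remote_name, void *data);

/*
 * Try each remote in list order and stop at the first success.
 * Returns 0 if some remote succeeded, -1 otherwise (also for an
 * empty list).
 */
static int try_remotes_in_order(const struct remote_entry *head,
				fetch_fn fetch, void *data)
{
	const struct remote_entry *r;

	assert(fetch);

	for (r = head; r; r = r->next)
		if (!fetch(r->name, data))
			return 0;
	return -1;
}

/* Example callback: succeeds only for the remote named by "data". */
static int fetch_only_from(const char *remote_name, void *data)
{
	return strcmp(remote_name, (const char *)data) ? -1 : 0;
}
```

Because the walk follows list order, a remote placed last in the list
is only asked when every remote before it has failed.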
 builtin/cat-file.c            |  5 +++--
 builtin/fetch.c               | 11 ++++++-----
 builtin/gc.c                  |  3 ++-
 builtin/repack.c              |  3 ++-
 cache-tree.c                  |  3 ++-
 connected.c                   |  3 ++-
 list-objects-filter-options.c | 28 +++++++++++++++-------------
 packfile.c                    |  3 ++-
 sha1-file.c                   | 15 ++++++++-------
 t/t5601-clone.sh              |  2 +-
 t/t5616-partial-clone.sh      |  2 +-
 unpack-trees.c                |  7 +++----
 12 files changed, 47 insertions(+), 38 deletions(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 0f092382e1..85ae10bf0b 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -15,6 +15,7 @@
 #include "sha1-array.h"
 #include "packfile.h"
 #include "object-store.h"
+#include "promisor-remote.h"
 
 struct batch_options {
 	int enabled;
@@ -523,8 +524,8 @@ static int batch_objects(struct batch_options *opt)
 	if (opt->all_objects) {
 		struct object_cb_data cb;
 
-		if (repository_format_partial_clone)
-			warning("This repository has extensions.partialClone set. Some objects may not be loaded.");
+		if (has_promisor_remote())
+			warning("This repository uses promisor remotes. Some objects may not be loaded.");
 
 		cb.opt = opt;
 		cb.expand = &data;
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 4ba63d5ac6..f74bd78144 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -23,6 +23,7 @@
 #include "packfile.h"
 #include "list-objects-filter-options.h"
 #include "commit-reach.h"
+#include "promisor-remote.h"
 
 static const char * const builtin_fetch_usage[] = {
 	N_("git fetch [<options>] [<repository> [<refspec>...]]"),
@@ -1460,7 +1461,7 @@ static inline void fetch_one_setup_partial(struct remote *remote)
 	 * If no prior partial clone/fetch and the current fetch DID NOT
 	 * request a partial-fetch, do a normal fetch.
 	 */
-	if (!repository_format_partial_clone && !filter_options.choice)
+	if (!has_promisor_remote() && !filter_options.choice)
 		return;
 
 	/*
@@ -1468,7 +1469,7 @@ static inline void fetch_one_setup_partial(struct remote *remote)
 	 * on this repo and remember the given filter-spec as the default
 	 * for subsequent fetches to this remote.
 	 */
-	if (!repository_format_partial_clone && filter_options.choice) {
+	if (!has_promisor_remote() && filter_options.choice) {
 		partial_clone_register(remote->name, &filter_options);
 		return;
 	}
@@ -1477,7 +1478,7 @@ static inline void fetch_one_setup_partial(struct remote *remote)
 	 * We are currently limited to only ONE promisor remote and only
 	 * allow partial-fetches from the promisor remote.
 	 */
-	if (strcmp(remote->name, repository_format_partial_clone)) {
+	if (!promisor_remote_find(remote->name)) {
 		if (filter_options.choice)
 			die(_("--filter can only be used with the remote "
 			      "configured in extensions.partialClone"));
@@ -1611,7 +1612,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 	if (depth || deepen_since || deepen_not.nr)
 		deepen = 1;
 
-	if (filter_options.choice && !repository_format_partial_clone)
+	if (filter_options.choice && !has_promisor_remote())
 		die("--filter can only be used when extensions.partialClone is set");
 
 	if (all) {
@@ -1645,7 +1646,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 	}
 
 	if (remote) {
-		if (filter_options.choice || repository_format_partial_clone)
+		if (filter_options.choice || has_promisor_remote())
 			fetch_one_setup_partial(remote);
 		result = fetch_one(remote, argc, argv, prune_tags_ok);
 	} else {
diff --git a/builtin/gc.c b/builtin/gc.c
index 8943bcc300..824a8832b5 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -27,6 +27,7 @@
 #include "pack-objects.h"
 #include "blob.h"
 #include "tree.h"
+#include "promisor-remote.h"
 
 #define FAILED_RUN "failed to run %s"
 
@@ -661,7 +662,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 			argv_array_push(&prune, prune_expire);
 			if (quiet)
 				argv_array_push(&prune, "--no-progress");
-			if (repository_format_partial_clone)
+			if (has_promisor_remote())
 				argv_array_push(&prune,
 						"--exclude-promisor-objects");
 			if (run_command_v_opt(prune.argv, RUN_GIT_CMD))
diff --git a/builtin/repack.c b/builtin/repack.c
index 3ea0583d02..57568ce9a8 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -11,6 +11,7 @@
 #include "midx.h"
 #include "packfile.h"
 #include "object-store.h"
+#include "promisor-remote.h"
 
 static int delta_base_offset = 1;
 static int pack_kept_objects = -1;
@@ -359,7 +360,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	argv_array_push(&cmd.args, "--all");
 	argv_array_push(&cmd.args, "--reflog");
 	argv_array_push(&cmd.args, "--indexed-objects");
-	if (repository_format_partial_clone)
+	if (has_promisor_remote())
 		argv_array_push(&cmd.args, "--exclude-promisor-objects");
 	if (write_bitmaps)
 		argv_array_push(&cmd.args, "--write-bitmap-index");
diff --git a/cache-tree.c b/cache-tree.c
index b13bfaf71e..64c285a746 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -5,6 +5,7 @@
 #include "cache-tree.h"
 #include "object-store.h"
 #include "replace-object.h"
+#include "promisor-remote.h"
 
 #ifndef DEBUG
 #define DEBUG 0
@@ -357,7 +358,7 @@ static int update_one(struct cache_tree *it,
 		}
 
 		ce_missing_ok = mode == S_IFGITLINK || missing_ok ||
-			(repository_format_partial_clone &&
+			(has_promisor_remote() &&
 			 ce_skip_worktree(ce));
 		if (is_null_oid(oid) ||
 		    (!ce_missing_ok && !has_object_file(oid))) {
diff --git a/connected.c b/connected.c
index 1bba888eff..0eaaedee6a 100644
--- a/connected.c
+++ b/connected.c
@@ -4,6 +4,7 @@
 #include "connected.h"
 #include "transport.h"
 #include "packfile.h"
+#include "promisor-remote.h"
 
 /*
  * If we feed all the commits we want to verify to this command
@@ -56,7 +57,7 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
 	argv_array_push(&rev_list.args,"rev-list");
 	argv_array_push(&rev_list.args, "--objects");
 	argv_array_push(&rev_list.args, "--stdin");
-	if (repository_format_partial_clone)
+	if (has_promisor_remote())
 		argv_array_push(&rev_list.args, "--exclude-promisor-objects");
 	if (!opt->is_deepening_fetch) {
 		argv_array_push(&rev_list.args, "--not");
diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index c0036f7378..f41a831fce 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -6,6 +6,7 @@
 #include "list-objects.h"
 #include "list-objects-filter.h"
 #include "list-objects-filter-options.h"
+#include "promisor-remote.h"
 
 /*
  * Parse value of the argument to the "filter" keyword.
@@ -144,30 +145,31 @@ void partial_clone_register(
 	const char *remote,
 	const struct list_objects_filter_options *filter_options)
 {
-	/*
-	 * Record the name of the partial clone remote in the
-	 * config and in the global variable -- the latter is
-	 * used throughout to indicate that partial clone is
-	 * enabled and to expect missing objects.
-	 */
-	if (repository_format_partial_clone &&
-	    *repository_format_partial_clone &&
-	    strcmp(remote, repository_format_partial_clone))
-		die(_("cannot change partial clone promisor remote"));
+	char *cfg_name;
 
-	git_config_set("core.repositoryformatversion", "1");
-	git_config_set("extensions.partialclone", remote);
+	/* Check if it is already registered */
+	if (!promisor_remote_find(remote)) {
+		git_config_set("core.repositoryformatversion", "1");
 
-	repository_format_partial_clone = xstrdup(remote);
+		/* Add promisor config for the remote */
+		cfg_name = xstrfmt("remote.%s.promisor", remote);
+		git_config_set(cfg_name, "true");
+		free(cfg_name);
+	}
 
 	/*
 	 * Record the initial filter-spec in the config as
 	 * the default for subsequent fetches from this remote.
+	 *
+	 * TODO: record it into remote.<name>.partialclonefilter
 	 */
 	core_partial_clone_filter_default =
 		xstrdup(filter_options->filter_spec);
 	git_config_set("core.partialclonefilter",
 		       core_partial_clone_filter_default);
+
+	/* Make sure the config info are reset */
+	promisor_remote_reinit();
 }
 
 void partial_clone_get_default_filter_spec(
diff --git a/packfile.c b/packfile.c
index 498498b1e3..fa5a9bb136 100644
--- a/packfile.c
+++ b/packfile.c
@@ -16,6 +16,7 @@
 #include "tree.h"
 #include "object-store.h"
 #include "midx.h"
+#include "promisor-remote.h"
 
 char *odb_pack_name(struct strbuf *buf,
 		    const unsigned char *sha1,
@@ -2147,7 +2148,7 @@ int is_promisor_object(const struct object_id *oid)
 	static int promisor_objects_prepared;
 
 	if (!promisor_objects_prepared) {
-		if (repository_format_partial_clone) {
+		if (has_promisor_remote()) {
 			for_each_packed_object(add_promisor_object,
 					       &promisor_objects,
 					       FOR_EACH_OBJECT_PROMISOR_ONLY);
diff --git a/sha1-file.c b/sha1-file.c
index 83358737f3..e7e037b6ce 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -30,8 +30,8 @@
 #include "mergesort.h"
 #include "quote.h"
 #include "packfile.h"
-#include "fetch-object.h"
 #include "object-store.h"
+#include "promisor-remote.h"
 
 /* The maximum size for an object header. */
 #define MAX_HEADER_LEN 32
@@ -1369,16 +1369,17 @@ int oid_object_info_extended(struct repository *r, const struct object_id *oid,
 		}
 
 		/* Check if it is a missing object */
-		if (fetch_if_missing && repository_format_partial_clone &&
+		if (fetch_if_missing && has_promisor_remote() &&
 		    !already_retried && r == the_repository &&
 		    !(flags & OBJECT_INFO_FOR_PREFETCH)) {
 			/*
-			 * TODO Investigate checking fetch_object() return
-			 * TODO value and stopping on error here.
-			 * TODO Pass a repository struct through fetch_object,
-			 * such that arbitrary repositories work.
+			 * TODO Investigate checking promisor_remote_get_direct()
+			 * TODO return value and stopping on error here.
+			 * TODO Pass a repository struct through
+			 * promisor_remote_get_direct(), such that arbitrary
+			 * repositories work.
 			 */
-			fetch_objects(repository_format_partial_clone, real, 1);
+			promisor_remote_get_direct(real, 1);
 			already_retried = 1;
 			continue;
 		}
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 23854cab26..15720847b0 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -653,7 +653,7 @@ partial_clone () {
 	git -C client fsck &&
 
 	# Ensure that unneeded blobs are not inadvertently fetched.
-	test_config -C client extensions.partialclone "not a remote" &&
+	test_config -C client remote.origin.promisor "false" &&
 	test_must_fail git -C client cat-file -e "$HASH1" &&
 
 	# But this blob was fetched, because clone performs an initial checkout
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 9a8f9886b3..c9e5f14165 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -42,7 +42,7 @@ test_expect_success 'do partial clone 1' '
 
 	test_cmp expect_1.oids observed.oids &&
 	test "$(git -C pc1 config --local core.repositoryformatversion)" = "1" &&
-	test "$(git -C pc1 config --local extensions.partialclone)" = "origin" &&
+	test "$(git -C pc1 config --local remote.origin.promisor)" = "true" &&
 	test "$(git -C pc1 config --local core.partialclonefilter)" = "blob:none"
 '
 
diff --git a/unpack-trees.c b/unpack-trees.c
index 22985aa4a3..d86cd922e0 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -16,7 +16,7 @@
 #include "submodule-config.h"
 #include "fsmonitor.h"
 #include "object-store.h"
-#include "fetch-object.h"
+#include "promisor-remote.h"
 
 /*
  * Error messages expected by scripts out of plumbing commands such as
@@ -398,7 +398,7 @@ static int check_updates(struct unpack_trees_options *o)
 		load_gitmodules_file(index, &state);
 
 	enable_delayed_checkout(&state);
-	if (repository_format_partial_clone && o->update && !o->dry_run) {
+	if (has_promisor_remote() && o->update && !o->dry_run) {
 		/*
 		 * Prefetch the objects that are to be checked out in the loop
 		 * below.
@@ -417,8 +417,7 @@ static int check_updates(struct unpack_trees_options *o)
 			oid_array_append(&to_fetch, &ce->oid);
 		}
 		if (to_fetch.nr)
-			fetch_objects(repository_format_partial_clone,
-				      to_fetch.oid, to_fetch.nr);
+			promisor_remote_get_direct(to_fetch.oid, to_fetch.nr);
 		oid_array_clear(&to_fetch);
 	}
 	for (i = 0; i < index->cache_nr; i++) {
-- 
2.21.0.750.g68c8ebb2ac


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v5 08/16] diff: use promisor-remote.h instead of fetch-object.h
  2019-04-09 16:11 [PATCH v5 00/16] Many promisor remotes Christian Couder
                   ` (6 preceding siblings ...)
  2019-04-09 16:11 ` [PATCH v5 07/16] Use promisor_remote_get_direct() and has_promisor_remote() Christian Couder
@ 2019-04-09 16:11 ` Christian Couder
  2019-04-09 16:11 ` [PATCH v5 09/16] promisor-remote: parse remote.*.partialclonefilter Christian Couder
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 28+ messages in thread
From: Christian Couder @ 2019-04-09 16:11 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, SZEDER Gábor, Ramsay Jones

The repository_format_partial_clone global and fetch_objects()
should not be used anymore when there can be more than one
promisor remote. Instead let's use has_promisor_remote()
and promisor_remote_get_direct() from "promisor-remote.h".

This way all the configured promisor remotes will be taken
into account, not only the one specified by
extensions.partialClone.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 diff.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/diff.c b/diff.c
index 4d3cf83a27..02864171fd 100644
--- a/diff.c
+++ b/diff.c
@@ -25,7 +25,7 @@
 #include "packfile.h"
 #include "parse-options.h"
 #include "help.h"
-#include "fetch-object.h"
+#include "promisor-remote.h"
 
 #ifdef NO_FAST_WORKING_DIRECTORY
 #define FAST_WORKING_DIRECTORY 0
@@ -6490,8 +6490,7 @@ static void add_if_missing(struct repository *r,
 
 void diffcore_std(struct diff_options *options)
 {
-	if (options->repo == the_repository &&
-	    repository_format_partial_clone) {
+	if (options->repo == the_repository && has_promisor_remote()) {
 		/*
 		 * Prefetch the diff pairs that are about to be flushed.
 		 */
@@ -6508,8 +6507,7 @@ void diffcore_std(struct diff_options *options)
 			/*
 			 * NEEDSWORK: Consider deduplicating the OIDs sent.
 			 */
-			fetch_objects(repository_format_partial_clone,
-				      to_fetch.oid, to_fetch.nr);
+			promisor_remote_get_direct(to_fetch.oid, to_fetch.nr);
 		oid_array_clear(&to_fetch);
 	}
 
-- 
2.21.0.750.g68c8ebb2ac


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v5 09/16] promisor-remote: parse remote.*.partialclonefilter
  2019-04-09 16:11 [PATCH v5 00/16] Many promisor remotes Christian Couder
                   ` (7 preceding siblings ...)
  2019-04-09 16:11 ` [PATCH v5 08/16] diff: use promisor-remote.h instead of fetch-object.h Christian Couder
@ 2019-04-09 16:11 ` Christian Couder
  2019-04-09 16:11 ` [PATCH v5 10/16] builtin/fetch: remove unique promisor remote limitation Christian Couder
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 28+ messages in thread
From: Christian Couder @ 2019-04-09 16:11 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, SZEDER Gábor, Ramsay Jones

This makes it possible to specify a different partial clone
filter for each promisor remote.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
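
A small sketch of the per-remote lookup this patch relies on: find the
list entry for a remote name and use its own filter, if it has one. The
names struct remote_entry, remote_find() and default_filter_for() are
hypothetical and simplified; the real code goes through
promisor_remote_find() and gently_parse_list_objects_filter().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified mirror of struct promisor_remote. */
struct remote_entry {
	struct remote_entry *next;
	const char *partial_clone_filter; /* NULL when none is configured */
	const char *name;
};

/* Return the entry called "name", or NULL if there is none. */
static struct remote_entry *remote_find(struct remote_entry *head,
					const char *name)
{
	struct remote_entry *r;

	assert(name);

	for (r = head; r; r = r->next)
		if (!strcmp(r->name, name))
			return r;
	return NULL;
}

/*
 * Return the default filter spec recorded for "name", or NULL when the
 * remote is unknown or has no filter of its own.
 */
static const char *default_filter_for(struct remote_entry *head,
				      const char *name)
{
	struct remote_entry *r = remote_find(head, name);

	return r ? r->partial_clone_filter : NULL;
}
```

A NULL result corresponds to the case where no default filter is
applied, which is why a NULL argument has to be tolerated by the
filter parser.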
 builtin/fetch.c               |  2 +-
 list-objects-filter-options.c | 27 +++++++++++++++------------
 list-objects-filter-options.h |  3 ++-
 promisor-remote.c             | 15 +++++++++++++++
 promisor-remote.h             |  5 ++++-
 t/t0410-partial-clone.sh      |  2 +-
 t/t5601-clone.sh              |  1 +
 t/t5616-partial-clone.sh      |  2 +-
 8 files changed, 40 insertions(+), 17 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index f74bd78144..13d8133130 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1491,7 +1491,7 @@ static inline void fetch_one_setup_partial(struct remote *remote)
 	 * the config.
 	 */
 	if (!filter_options.choice)
-		partial_clone_get_default_filter_spec(&filter_options);
+		partial_clone_get_default_filter_spec(&filter_options, remote->name);
 	return;
 }
 
diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index f41a831fce..02f48b7c40 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -30,6 +30,9 @@ static int gently_parse_list_objects_filter(
 {
 	const char *v0;
 
+	if (!arg)
+		return 0;
+
 	if (filter_options->choice) {
 		if (errbuf) {
 			strbuf_addstr(
@@ -146,6 +149,7 @@ void partial_clone_register(
 	const struct list_objects_filter_options *filter_options)
 {
 	char *cfg_name;
+	char *filter_name;
 
 	/* Check if it is already registered */
 	if (!promisor_remote_find(remote)) {
@@ -160,27 +164,26 @@ void partial_clone_register(
 	/*
 	 * Record the initial filter-spec in the config as
 	 * the default for subsequent fetches from this remote.
-	 *
-	 * TODO: record it into remote.<name>.partialclonefilter
 	 */
-	core_partial_clone_filter_default =
-		xstrdup(filter_options->filter_spec);
-	git_config_set("core.partialclonefilter",
-		       core_partial_clone_filter_default);
+	filter_name = xstrfmt("remote.%s.partialclonefilter", remote);
+	git_config_set(filter_name, filter_options->filter_spec);
+	free(filter_name);
 
 	/* Make sure the config info are reset */
 	promisor_remote_reinit();
 }
 
 void partial_clone_get_default_filter_spec(
-	struct list_objects_filter_options *filter_options)
+	struct list_objects_filter_options *filter_options,
+	const char *remote)
 {
+	struct promisor_remote *promisor = promisor_remote_find(remote);
+
 	/*
 	 * Parse default value, but silently ignore it if it is invalid.
 	 */
-	if (!core_partial_clone_filter_default)
-		return;
-	gently_parse_list_objects_filter(filter_options,
-					 core_partial_clone_filter_default,
-					 NULL);
+	if (promisor)
+		gently_parse_list_objects_filter(filter_options,
+						 promisor->partial_clone_filter,
+						 NULL);
 }
diff --git a/list-objects-filter-options.h b/list-objects-filter-options.h
index e3adc78ebf..70d27f44ef 100644
--- a/list-objects-filter-options.h
+++ b/list-objects-filter-options.h
@@ -89,6 +89,7 @@ void partial_clone_register(
 	const char *remote,
 	const struct list_objects_filter_options *filter_options);
 void partial_clone_get_default_filter_spec(
-	struct list_objects_filter_options *filter_options);
+	struct list_objects_filter_options *filter_options,
+	const char *remote);
 
 #endif /* LIST_OBJECTS_FILTER_OPTIONS_H */
diff --git a/promisor-remote.c b/promisor-remote.c
index 737689d044..707a8005c5 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -75,6 +75,21 @@ static int promisor_remote_config(const char *var, const char *value, void *data
 		free(remote_name);
 		return 0;
 	}
+	if (!strcmp(subkey, "partialclonefilter")) {
+		struct promisor_remote *r;
+		char *remote_name = xmemdupz(name, namelen);
+
+		r = promisor_remote_lookup(remote_name, NULL);
+		if (!r)
+			r = promisor_remote_new(remote_name);
+
+		free(remote_name);
+
+		if (!r)
+			return 0;
+
+		return git_config_string(&r->partial_clone_filter, var, value);
+	}
 
 	return 0;
 }
diff --git a/promisor-remote.h b/promisor-remote.h
index ff69963907..562c7ad8a4 100644
--- a/promisor-remote.h
+++ b/promisor-remote.h
@@ -5,10 +5,13 @@ struct object_id;
 
 /*
  * Promisor remote linked list
- * Its information come from remote.XXX config entries.
+ *
+ * Information in its fields comes from remote.XXX config entries or
+ * from extensions.partialclone or core.partialclonefilter.
  */
 struct promisor_remote {
 	struct promisor_remote *next;
+	const char *partial_clone_filter;
 	const char name[FLEX_ARRAY];
 };
 
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index 3559313bd0..3082eff2bf 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -26,7 +26,7 @@ promise_and_delete () {
 test_expect_success 'extensions.partialclone without filter' '
 	test_create_repo server &&
 	git clone --filter="blob:none" "file://$(pwd)/server" client &&
-	git -C client config --unset core.partialclonefilter &&
+	git -C client config --unset remote.origin.partialclonefilter &&
 	git -C client fetch origin
 '
 
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 15720847b0..8842b029bd 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -654,6 +654,7 @@ partial_clone () {
 
 	# Ensure that unneeded blobs are not inadvertently fetched.
 	test_config -C client remote.origin.promisor "false" &&
+	git -C client config --unset remote.origin.partialclonefilter &&
 	test_must_fail git -C client cat-file -e "$HASH1" &&
 
 	# But this blob was fetched, because clone performs an initial checkout
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index c9e5f14165..107b247336 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -43,7 +43,7 @@ test_expect_success 'do partial clone 1' '
 	test_cmp expect_1.oids observed.oids &&
 	test "$(git -C pc1 config --local core.repositoryformatversion)" = "1" &&
 	test "$(git -C pc1 config --local remote.origin.promisor)" = "true" &&
-	test "$(git -C pc1 config --local core.partialclonefilter)" = "blob:none"
+	test "$(git -C pc1 config --local remote.origin.partialclonefilter)" = "blob:none"
 '
 
 # checkout master to force dynamic object fetch of blobs at HEAD.
-- 
2.21.0.750.g68c8ebb2ac


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v5 10/16] builtin/fetch: remove unique promisor remote limitation
  2019-04-09 16:11 [PATCH v5 00/16] Many promisor remotes Christian Couder
                   ` (8 preceding siblings ...)
  2019-04-09 16:11 ` [PATCH v5 09/16] promisor-remote: parse remote.*.partialclonefilter Christian Couder
@ 2019-04-09 16:11 ` Christian Couder
  2019-04-09 16:11 ` [PATCH v5 11/16] t0410: test fetching from many promisor remotes Christian Couder
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 28+ messages in thread
From: Christian Couder @ 2019-04-09 16:11 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, SZEDER Gábor, Ramsay Jones

As the infrastructure for more than one promisor remote
has been introduced in previous patches, we can remove
code that forbids the registration of more than one
promisor remote.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 builtin/fetch.c | 20 +++++---------------
 1 file changed, 5 insertions(+), 15 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 13d8133130..5657d054ec 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1465,26 +1465,16 @@ static inline void fetch_one_setup_partial(struct remote *remote)
 		return;
 
 	/*
-	 * If this is the FIRST partial-fetch request, we enable partial
-	 * on this repo and remember the given filter-spec as the default
-	 * for subsequent fetches to this remote.
+	 * If this is a partial-fetch request, we enable partial on
+	 * this repo if not already enabled and remember the given
+	 * filter-spec as the default for subsequent fetches to this
+	 * remote.
 	 */
-	if (!has_promisor_remote() && filter_options.choice) {
+	if (filter_options.choice) {
 		partial_clone_register(remote->name, &filter_options);
 		return;
 	}
 
-	/*
-	 * We are currently limited to only ONE promisor remote and only
-	 * allow partial-fetches from the promisor remote.
-	 */
-	if (!promisor_remote_find(remote->name)) {
-		if (filter_options.choice)
-			die(_("--filter can only be used with the remote "
-			      "configured in extensions.partialClone"));
-		return;
-	}
-
 	/*
 	 * Do a partial-fetch from the promisor remote using either the
 	 * explicitly given filter-spec or inherit the filter-spec from
-- 
2.21.0.750.g68c8ebb2ac


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v5 11/16] t0410: test fetching from many promisor remotes
  2019-04-09 16:11 [PATCH v5 00/16] Many promisor remotes Christian Couder
                   ` (9 preceding siblings ...)
  2019-04-09 16:11 ` [PATCH v5 10/16] builtin/fetch: remove unique promisor remote limitation Christian Couder
@ 2019-04-09 16:11 ` Christian Couder
  2019-04-09 16:11 ` [PATCH v5 12/16] partial-clone: add multiple remotes in the doc Christian Couder
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 28+ messages in thread
From: Christian Couder @ 2019-04-09 16:11 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, SZEDER Gábor, Ramsay Jones,
	Christian Couder

From: Christian Couder <christian.couder@gmail.com>

This shows that it is now possible to fetch objects from more
than one promisor remote, and that fetching from a new
promisor remote can configure it as one.

Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t0410-partial-clone.sh | 49 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 48 insertions(+), 1 deletion(-)

diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index 3082eff2bf..2498e72a34 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -183,8 +183,55 @@ test_expect_success 'fetching of missing objects works with ref-in-want enabled'
 	grep "git< fetch=.*ref-in-want" trace
 '
 
+test_expect_success 'fetching of missing objects from another promisor remote' '
+	git clone "file://$(pwd)/server" server2 &&
+	test_commit -C server2 bar &&
+	git -C server2 repack -a -d --write-bitmap-index &&
+	HASH2=$(git -C server2 rev-parse bar) &&
+
+	git -C repo remote add server2 "file://$(pwd)/server2" &&
+	git -C repo config remote.server2.promisor true &&
+	git -C repo cat-file -p "$HASH2" &&
+
+	git -C repo fetch server2 &&
+	rm -rf repo/.git/objects/* &&
+	git -C repo cat-file -p "$HASH2" &&
+
+	# Ensure that the .promisor file is written, and check that its
+	# associated packfile contains the object
+	ls repo/.git/objects/pack/pack-*.promisor >promisorlist &&
+	test_line_count = 1 promisorlist &&
+	IDX=$(sed "s/promisor$/idx/" promisorlist) &&
+	git verify-pack --verbose "$IDX" >out &&
+	grep "$HASH2" out
+'
+
+test_expect_success 'fetching of missing objects configures a promisor remote' '
+	git clone "file://$(pwd)/server" server3 &&
+	test_commit -C server3 baz &&
+	git -C server3 repack -a -d --write-bitmap-index &&
+	HASH3=$(git -C server3 rev-parse baz) &&
+	git -C server3 config uploadpack.allowfilter 1 &&
+
+	rm repo/.git/objects/pack/pack-*.promisor &&
+
+	git -C repo remote add server3 "file://$(pwd)/server3" &&
+	git -C repo fetch --filter="blob:none" server3 $HASH3 &&
+
+	test_cmp_config -C repo true remote.server3.promisor &&
+
+	# Ensure that the .promisor file is written, and check that its
+	# associated packfile contains the object
+	ls repo/.git/objects/pack/pack-*.promisor >promisorlist &&
+	test_line_count = 1 promisorlist &&
+	IDX=$(sed "s/promisor$/idx/" promisorlist) &&
+	git verify-pack --verbose "$IDX" >out &&
+	grep "$HASH3" out
+'
+
 test_expect_success 'fetching of missing blobs works' '
-	rm -rf server repo &&
+	rm -rf server server2 repo &&
+	rm -rf server server3 repo &&
 	test_create_repo server &&
 	test_commit -C server foo &&
 	git -C server repack -a -d --write-bitmap-index &&
-- 
2.21.0.750.g68c8ebb2ac


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v5 12/16] partial-clone: add multiple remotes in the doc
  2019-04-09 16:11 [PATCH v5 00/16] Many promisor remotes Christian Couder
                   ` (10 preceding siblings ...)
  2019-04-09 16:11 ` [PATCH v5 11/16] t0410: test fetching from many promisor remotes Christian Couder
@ 2019-04-09 16:11 ` Christian Couder
  2019-04-09 16:11 ` [PATCH v5 13/16] remote: add promisor and partial clone config to " Christian Couder
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 28+ messages in thread
From: Christian Couder @ 2019-04-09 16:11 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, SZEDER Gábor, Ramsay Jones

While at it, let's remove a reference to the ODB effort, as
that effort has been replaced by directly enhancing the partial
clone and promisor remote features.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/technical/partial-clone.txt | 117 ++++++++++++++++------
 1 file changed, 84 insertions(+), 33 deletions(-)

diff --git a/Documentation/technical/partial-clone.txt b/Documentation/technical/partial-clone.txt
index 896c7b3878..210373e258 100644
--- a/Documentation/technical/partial-clone.txt
+++ b/Documentation/technical/partial-clone.txt
@@ -30,12 +30,20 @@ advance* during clone and fetch operations and thereby reduce download
 times and disk usage.  Missing objects can later be "demand fetched"
 if/when needed.
 
+A remote that can later provide the missing objects is called a
+promisor remote, as it promises to send the objects when
+requested. Initially Git supported only one promisor remote, the origin
+remote from which the user cloned and that was configured in the
+"extensions.partialClone" config option. Later support for more than
+one promisor remote has been implemented.
+
 Use of partial clone requires that the user be online and the origin
-remote be available for on-demand fetching of missing objects.  This may
-or may not be problematic for the user.  For example, if the user can
-stay within the pre-selected subset of the source tree, they may not
-encounter any missing objects.  Alternatively, the user could try to
-pre-fetch various objects if they know that they are going offline.
+remote or other promisor remotes be available for on-demand fetching
+of missing objects.  This may or may not be problematic for the user.
+For example, if the user can stay within the pre-selected subset of
+the source tree, they may not encounter any missing objects.
+Alternatively, the user could try to pre-fetch various objects if they
+know that they are going offline.
 
 
 Non-Goals
@@ -100,18 +108,18 @@ or commits that reference missing trees.
 Handling Missing Objects
 ------------------------
 
-- An object may be missing due to a partial clone or fetch, or missing due
-  to repository corruption.  To differentiate these cases, the local
-  repository specially indicates such filtered packfiles obtained from the
-  promisor remote as "promisor packfiles".
+- An object may be missing due to a partial clone or fetch, or missing
+  due to repository corruption.  To differentiate these cases, the
+  local repository specially indicates such filtered packfiles
+  obtained from promisor remotes as "promisor packfiles".
 +
 These promisor packfiles consist of a "<name>.promisor" file with
 arbitrary contents (like the "<name>.keep" files), in addition to
 their "<name>.pack" and "<name>.idx" files.
 
 - The local repository considers a "promisor object" to be an object that
-  it knows (to the best of its ability) that the promisor remote has promised
-  that it has, either because the local repository has that object in one of
+  it knows (to the best of its ability) that promisor remotes have promised
+  that they have, either because the local repository has that object in one of
   its promisor packfiles, or because another promisor object refers to it.
 +
 When Git encounters a missing object, Git can see if it is a promisor object
@@ -123,12 +131,12 @@ expensive-to-modify list of missing objects.[a]
 - Since almost all Git code currently expects any referenced object to be
   present locally and because we do not want to force every command to do
   a dry-run first, a fallback mechanism is added to allow Git to attempt
-  to dynamically fetch missing objects from the promisor remote.
+  to dynamically fetch missing objects from promisor remotes.
 +
 When the normal object lookup fails to find an object, Git invokes
-fetch-object to try to get the object from the server and then retry
-the object lookup.  This allows objects to be "faulted in" without
-complicated prediction algorithms.
+promisor_remote_get_direct() to try to get the object from a promisor
+remote and then retry the object lookup.  This allows objects to be
+"faulted in" without complicated prediction algorithms.
 +
 For efficiency reasons, no check as to whether the missing object is
 actually a promisor object is performed.
@@ -157,8 +165,7 @@ and prefetch those objects in bulk.
 +
 We are not happy with this global variable and would like to remove it,
 but that requires significant refactoring of the object code to pass an
-additional flag.  We hope that concurrent efforts to add an ODB API can
-encompass this.
+additional flag.
 
 
 Fetching Missing Objects
@@ -182,21 +189,63 @@ has been updated to not use any object flags when the corresponding argument
   though they are not necessary.
 
 
+Using many promisor remotes
+---------------------------
+
+Many promisor remotes can be configured and used.
+
+This allows for example a user to have multiple geographically-close
+cache servers for fetching missing blobs while continuing to do
+filtered `git-fetch` commands from the central server.
+
+When fetching objects, promisor remotes are tried one after the other
+until all the objects have been fetched.
+
+Remotes that are considered "promisor" remotes are those specified by
+the following configuration variables:
+
+- `extensions.partialClone = <name>`
+
+- `remote.<name>.promisor = true`
+
+- `remote.<name>.partialCloneFilter = ...`
+
+Only one promisor remote can be configured using the
+`extensions.partialClone` config variable. This promisor remote will
+be the last one tried when fetching objects.
+
+We decided to make it the last one we try, because it is likely that
+someone using many promisor remotes is doing so because the other
+promisor remotes are better for some reason (maybe they are closer or
+faster for some kind of objects) than the origin, and the origin is
+likely to be the remote specified by extensions.partialClone.
+
+This justification is not very strong, but one choice had to be made,
+and anyway the long term plan should be to make the order somehow
+fully configurable.
+
+For now though the other promisor remotes will be tried in the order
+they appear in the config file.
+
 Current Limitations
 -------------------
 
-- The remote used for a partial clone (or the first partial fetch
-  following a regular clone) is marked as the "promisor remote".
+- It is not possible to specify the order in which the promisor
+  remotes are tried in other ways than the order in which they appear
+  in the config file.
 +
-We are currently limited to a single promisor remote and only that
-remote may be used for subsequent partial fetches.
+It is also not possible to specify an order to be used when fetching
+from one remote and a different order when fetching from another
+remote.
+
+- It is not possible to push only specific objects to a promisor
+  remote.
 +
-We accept this limitation because we believe initial users of this
-feature will be using it on repositories with a strong single central
-server.
+It is not possible to push at the same time to multiple promisor
+remotes in a specific order.
 
-- Dynamic object fetching will only ask the promisor remote for missing
-  objects.  We assume that the promisor remote has a complete view of the
+- Dynamic object fetching will only ask promisor remotes for missing
+  objects.  We assume that promisor remotes have a complete view of the
   repository and can satisfy all such requests.
 
 - Repack essentially treats promisor and non-promisor packfiles as 2
@@ -218,15 +267,17 @@ server.
 Future Work
 -----------
 
-- Allow more than one promisor remote and define a strategy for fetching
-  missing objects from specific promisor remotes or of iterating over the
-  set of promisor remotes until a missing object is found.
+- Improve the way to specify the order in which promisor remotes are
+  tried.
 +
-A user might want to have multiple geographically-close cache servers
-for fetching missing blobs while continuing to do filtered `git-fetch`
-commands from the central server, for example.
+For example, this could allow specifying explicitly something like:
+"When fetching from this remote, I want to use these promisor remotes
+in this order, though, when pushing or fetching to that remote, I want
+to use those promisor remotes in that order."
+
+- Allow pushing to promisor remotes.
 +
-Or the user might want to work in a triangular work flow with multiple
+The user might want to work in a triangular work flow with multiple
 promisor remotes that each have an incomplete view of the repository.
 
 - Allow repack to work on promisor packfiles (while keeping them distinct
-- 
2.21.0.750.g68c8ebb2ac
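
The fetch order documented in the patch above (promisor remotes in
config-file order, with the remote named by extensions.partialClone
tried last) can be illustrated with a small standalone sketch. This is
not code from the series: promisor_try_order() and the remote names
used with it are hypothetical, and every name passed in is assumed to
be a configured promisor remote.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of Git.
 *
 * config_order:         promisor remote names, in config-file order
 * nr:                   number of entries in config_order
 * partial_clone_remote: value of extensions.partialClone, or NULL
 * out:                  receives the names in the order they would be
 *                       tried; must have room for nr + 1 entries
 *
 * Returns the number of names written to out.
 */
static int promisor_try_order(const char **config_order, int nr,
			      const char *partial_clone_remote,
			      const char **out)
{
	int i, n = 0;

	assert(config_order && out);

	/* Keep config-file order, skipping the extensions.partialClone remote. */
	for (i = 0; i < nr; i++) {
		if (partial_clone_remote &&
		    !strcmp(config_order[i], partial_clone_remote))
			continue;
		out[n++] = config_order[i];
	}

	/* The extensions.partialClone remote, if any, is tried last. */
	if (partial_clone_remote)
		out[n++] = partial_clone_remote;

	return n;
}
```

With the config order { "origin", "cache-eu", "cache-us" } and
extensions.partialClone set to "origin", the helper yields cache-eu,
cache-us, origin. With no extensions.partialClone value, the config
order is returned unchanged.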


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v5 13/16] remote: add promisor and partial clone config to the doc
  2019-04-09 16:11 [PATCH v5 00/16] Many promisor remotes Christian Couder
                   ` (11 preceding siblings ...)
  2019-04-09 16:11 ` [PATCH v5 12/16] partial-clone: add multiple remotes in the doc Christian Couder
@ 2019-04-09 16:11 ` " Christian Couder
  2019-04-09 16:11 ` [PATCH v5 14/16] Remove fetch-object.{c,h} in favor of promisor-remote.{c,h} Christian Couder
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 28+ messages in thread
From: Christian Couder @ 2019-04-09 16:11 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, SZEDER Gábor, Ramsay Jones

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/config/remote.txt | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/Documentation/config/remote.txt b/Documentation/config/remote.txt
index 6c4cad83a2..a8e6437a90 100644
--- a/Documentation/config/remote.txt
+++ b/Documentation/config/remote.txt
@@ -76,3 +76,11 @@ remote.<name>.pruneTags::
 +
 See also `remote.<name>.prune` and the PRUNING section of
 linkgit:git-fetch[1].
+
+remote.<name>.promisor::
+	When set to true, this remote will be used to fetch promisor
+	objects.
+
+remote.<name>.partialclonefilter::
+	The filter that will be applied when fetching from this
+	promisor remote.
-- 
2.21.0.750.g68c8ebb2ac


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v5 14/16] Remove fetch-object.{c,h} in favor of promisor-remote.{c,h}
  2019-04-09 16:11 [PATCH v5 00/16] Many promisor remotes Christian Couder
                   ` (12 preceding siblings ...)
  2019-04-09 16:11 ` [PATCH v5 13/16] remote: add promisor and partial clone config to " Christian Couder
@ 2019-04-09 16:11 ` Christian Couder
  2019-04-09 16:11 ` [PATCH v5 15/16] Move repository_format_partial_clone to promisor-remote.c Christian Couder
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 28+ messages in thread
From: Christian Couder @ 2019-04-09 16:11 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, SZEDER Gábor, Ramsay Jones

As fetch_objects() is now used only in promisor-remote.c
and shouldn't be used outside it, let's move it into
promisor-remote.c, make it static there, and remove
fetch-object.{c,h}.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Makefile          |  1 -
 fetch-object.c    | 43 -------------------------------------------
 fetch-object.h    |  9 ---------
 promisor-remote.c | 40 +++++++++++++++++++++++++++++++++++++++-
 4 files changed, 39 insertions(+), 54 deletions(-)
 delete mode 100644 fetch-object.c
 delete mode 100644 fetch-object.h

diff --git a/Makefile b/Makefile
index 3523ae0517..0996fb6646 100644
--- a/Makefile
+++ b/Makefile
@@ -890,7 +890,6 @@ LIB_OBJS += ewah/ewah_io.o
 LIB_OBJS += ewah/ewah_rlw.o
 LIB_OBJS += exec-cmd.o
 LIB_OBJS += fetch-negotiator.o
-LIB_OBJS += fetch-object.o
 LIB_OBJS += fetch-pack.o
 LIB_OBJS += fsck.o
 LIB_OBJS += fsmonitor.o
diff --git a/fetch-object.c b/fetch-object.c
deleted file mode 100644
index eac4d448ef..0000000000
--- a/fetch-object.c
+++ /dev/null
@@ -1,43 +0,0 @@
-#include "cache.h"
-#include "packfile.h"
-#include "pkt-line.h"
-#include "strbuf.h"
-#include "transport.h"
-#include "fetch-object.h"
-
-static int fetch_refs(const char *remote_name, struct ref *ref)
-{
-	struct remote *remote;
-	struct transport *transport;
-	int original_fetch_if_missing = fetch_if_missing;
-	int res;
-
-	fetch_if_missing = 0;
-	remote = remote_get(remote_name);
-	if (!remote->url[0])
-		die(_("Remote with no URL"));
-	transport = transport_get(remote, remote->url[0]);
-
-	transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
-	transport_set_option(transport, TRANS_OPT_NO_DEPENDENTS, "1");
-	res = transport_fetch_refs(transport, ref);
-	fetch_if_missing = original_fetch_if_missing;
-
-	return res;
-}
-
-int fetch_objects(const char *remote_name, const struct object_id *oids,
-		  int oid_nr)
-{
-	struct ref *ref = NULL;
-	int i;
-
-	for (i = 0; i < oid_nr; i++) {
-		struct ref *new_ref = alloc_ref(oid_to_hex(&oids[i]));
-		oidcpy(&new_ref->old_oid, &oids[i]);
-		new_ref->exact_oid = 1;
-		new_ref->next = ref;
-		ref = new_ref;
-	}
-	return fetch_refs(remote_name, ref);
-}
diff --git a/fetch-object.h b/fetch-object.h
deleted file mode 100644
index 7bcc7cadb0..0000000000
--- a/fetch-object.h
+++ /dev/null
@@ -1,9 +0,0 @@
-#ifndef FETCH_OBJECT_H
-#define FETCH_OBJECT_H
-
-struct object_id;
-
-int fetch_objects(const char *remote_name, const struct object_id *oids,
-		  int oid_nr);
-
-#endif
diff --git a/promisor-remote.c b/promisor-remote.c
index 707a8005c5..066489b637 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -2,7 +2,45 @@
 #include "object-store.h"
 #include "promisor-remote.h"
 #include "config.h"
-#include "fetch-object.h"
+#include "transport.h"
+
+static int fetch_refs(const char *remote_name, struct ref *ref)
+{
+	struct remote *remote;
+	struct transport *transport;
+	int original_fetch_if_missing = fetch_if_missing;
+	int res;
+
+	fetch_if_missing = 0;
+	remote = remote_get(remote_name);
+	if (!remote->url[0])
+		die(_("Remote with no URL"));
+	transport = transport_get(remote, remote->url[0]);
+
+	transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
+	transport_set_option(transport, TRANS_OPT_NO_DEPENDENTS, "1");
+	res = transport_fetch_refs(transport, ref);
+	fetch_if_missing = original_fetch_if_missing;
+
+	return res;
+}
+
+static int fetch_objects(const char *remote_name,
+			 const struct object_id *oids,
+			 int oid_nr)
+{
+	struct ref *ref = NULL;
+	int i;
+
+	for (i = 0; i < oid_nr; i++) {
+		struct ref *new_ref = alloc_ref(oid_to_hex(&oids[i]));
+		oidcpy(&new_ref->old_oid, &oids[i]);
+		new_ref->exact_oid = 1;
+		new_ref->next = ref;
+		ref = new_ref;
+	}
+	return fetch_refs(remote_name, ref);
+}
 
 static struct promisor_remote *promisors;
 static struct promisor_remote **promisors_tail = &promisors;
-- 
2.21.0.750.g68c8ebb2ac


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v5 15/16] Move repository_format_partial_clone to promisor-remote.c
  2019-04-09 16:11 [PATCH v5 00/16] Many promisor remotes Christian Couder
                   ` (13 preceding siblings ...)
  2019-04-09 16:11 ` [PATCH v5 14/16] Remove fetch-object.{c,h} in favor of promisor-remote.{c,h} Christian Couder
@ 2019-04-09 16:11 ` Christian Couder
  2019-04-09 16:11 ` [PATCH v5 16/16] Move core_partial_clone_filter_default " Christian Couder
  2019-04-15  9:27 ` [PATCH v5 00/16] Many promisor remotes Junio C Hamano
  16 siblings, 0 replies; 28+ messages in thread
From: Christian Couder @ 2019-04-09 16:11 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, SZEDER Gábor, Ramsay Jones

Now that we have has_promisor_remote() and can use many
promisor remotes, let's hide repository_format_partial_clone
as a static in promisor-remote.c to avoid it being used
for anything other than managing backward compatibility.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 cache.h           | 1 -
 environment.c     | 1 -
 promisor-remote.c | 7 +++++++
 promisor-remote.h | 6 ++++++
 setup.c           | 3 ++-
 5 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/cache.h b/cache.h
index e928fe9d3b..217e434d5d 100644
--- a/cache.h
+++ b/cache.h
@@ -960,7 +960,6 @@ extern int grafts_replace_parents;
 #define GIT_REPO_VERSION 0
 #define GIT_REPO_VERSION_READ 1
 extern int repository_format_precious_objects;
-extern char *repository_format_partial_clone;
 extern const char *core_partial_clone_filter_default;
 extern int repository_format_worktree_config;
 
diff --git a/environment.c b/environment.c
index 89af47cb85..8855d2fc11 100644
--- a/environment.c
+++ b/environment.c
@@ -31,7 +31,6 @@ int warn_ambiguous_refs = 1;
 int warn_on_object_refname_ambiguity = 1;
 int ref_paranoia = -1;
 int repository_format_precious_objects;
-char *repository_format_partial_clone;
 const char *core_partial_clone_filter_default;
 int repository_format_worktree_config;
 const char *git_commit_encoding;
diff --git a/promisor-remote.c b/promisor-remote.c
index 066489b637..371c78385f 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -4,6 +4,13 @@
 #include "config.h"
 #include "transport.h"
 
+static char *repository_format_partial_clone;
+
+void set_repository_format_partial_clone(char *partial_clone)
+{
+	repository_format_partial_clone = xstrdup_or_null(partial_clone);
+}
+
 static int fetch_refs(const char *remote_name, struct ref *ref)
 {
 	struct remote *remote;
diff --git a/promisor-remote.h b/promisor-remote.h
index 562c7ad8a4..c3c07c2a23 100644
--- a/promisor-remote.h
+++ b/promisor-remote.h
@@ -20,4 +20,10 @@ extern struct promisor_remote *promisor_remote_find(const char *remote_name);
 extern int has_promisor_remote(void);
 extern int promisor_remote_get_direct(const struct object_id *oids, int oid_nr);
 
+/*
+ * This should be used only once from setup.c to set the value we got
+ * from the extensions.partialclone config option.
+ */
+extern void set_repository_format_partial_clone(char *partial_clone);
+
 #endif /* PROMISOR_REMOTE_H */
diff --git a/setup.c b/setup.c
index d0c958c3b2..434628706b 100644
--- a/setup.c
+++ b/setup.c
@@ -4,6 +4,7 @@
 #include "dir.h"
 #include "string-list.h"
 #include "chdir-notify.h"
+#include "promisor-remote.h"
 
 static int inside_git_dir = -1;
 static int inside_work_tree = -1;
@@ -477,7 +478,7 @@ static int check_repository_format_gently(const char *gitdir, struct repository_
 	}
 
 	repository_format_precious_objects = candidate->precious_objects;
-	repository_format_partial_clone = xstrdup_or_null(candidate->partial_clone);
+	set_repository_format_partial_clone(candidate->partial_clone);
 	repository_format_worktree_config = candidate->worktree_config;
 	string_list_clear(&candidate->unknown_extensions, 0);
 
-- 
2.21.0.750.g68c8ebb2ac


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v5 16/16] Move core_partial_clone_filter_default to promisor-remote.c
  2019-04-09 16:11 [PATCH v5 00/16] Many promisor remotes Christian Couder
                   ` (14 preceding siblings ...)
  2019-04-09 16:11 ` [PATCH v5 15/16] Move repository_format_partial_clone to promisor-remote.c Christian Couder
@ 2019-04-09 16:11 ` " Christian Couder
  2019-04-15  9:27 ` [PATCH v5 00/16] Many promisor remotes Junio C Hamano
  16 siblings, 0 replies; 28+ messages in thread
From: Christian Couder @ 2019-04-09 16:11 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, SZEDER Gábor, Ramsay Jones

Now that we can have a different default partial clone filter for
each promisor remote, let's hide core_partial_clone_filter_default
as a static in promisor-remote.c to avoid it being used for
anything other than managing backward compatibility.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 cache.h           | 1 -
 config.c          | 5 -----
 environment.c     | 1 -
 promisor-remote.c | 5 +++++
 4 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/cache.h b/cache.h
index 217e434d5d..7ed867b119 100644
--- a/cache.h
+++ b/cache.h
@@ -960,7 +960,6 @@ extern int grafts_replace_parents;
 #define GIT_REPO_VERSION 0
 #define GIT_REPO_VERSION_READ 1
 extern int repository_format_precious_objects;
-extern const char *core_partial_clone_filter_default;
 extern int repository_format_worktree_config;
 
 /*
diff --git a/config.c b/config.c
index c2846df3f1..dd95b659b4 100644
--- a/config.c
+++ b/config.c
@@ -1344,11 +1344,6 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
-	if (!strcmp(var, "core.partialclonefilter")) {
-		return git_config_string(&core_partial_clone_filter_default,
-					 var, value);
-	}
-
 	if (!strcmp(var, "core.usereplacerefs")) {
 		read_replace_refs = git_config_bool(var, value);
 		return 0;
diff --git a/environment.c b/environment.c
index 8855d2fc11..efa072680a 100644
--- a/environment.c
+++ b/environment.c
@@ -31,7 +31,6 @@ int warn_ambiguous_refs = 1;
 int warn_on_object_refname_ambiguity = 1;
 int ref_paranoia = -1;
 int repository_format_precious_objects;
-const char *core_partial_clone_filter_default;
 int repository_format_worktree_config;
 const char *git_commit_encoding;
 const char *git_log_output_encoding;
diff --git a/promisor-remote.c b/promisor-remote.c
index 371c78385f..f77d0e3b50 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -5,6 +5,7 @@
 #include "transport.h"
 
 static char *repository_format_partial_clone;
+static const char *core_partial_clone_filter_default;
 
 void set_repository_format_partial_clone(char *partial_clone)
 {
@@ -103,6 +104,10 @@ static int promisor_remote_config(const char *var, const char *value, void *data
 	int namelen;
 	const char *subkey;
 
+	if (!strcmp(var, "core.partialclonefilter"))
+		return git_config_string(&core_partial_clone_filter_default,
+					 var, value);
+
 	if (parse_config_key(var, "remote", &name, &namelen, &subkey) < 0)
 		return 0;
 
-- 
2.21.0.750.g68c8ebb2ac


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 00/16] Many promisor remotes
  2019-04-09 16:11 [PATCH v5 00/16] Many promisor remotes Christian Couder
                   ` (15 preceding siblings ...)
  2019-04-09 16:11 ` [PATCH v5 16/16] Move core_partial_clone_filter_default " Christian Couder
@ 2019-04-15  9:27 ` Junio C Hamano
  2019-04-15 10:30   ` Junio C Hamano
  2019-04-15 10:37   ` Christian Couder
  16 siblings, 2 replies; 28+ messages in thread
From: Junio C Hamano @ 2019-04-15  9:27 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Jeff King, Ben Peart, Jonathan Tan, Jonathan Nieder,
	Nguyen Thai Ngoc Duy, Mike Hommey, Lars Schneider, Eric Wong,
	Christian Couder, Jeff Hostetler, Eric Sunshine, Beat Bolli,
	SZEDER Gábor, Ramsay Jones

Christian Couder <christian.couder@gmail.com> writes:

> This patch series is based on:
>
> 763fb763b8 (Merge branch 'jt/batch-fetch-blobs-in-diff' into jch, 2019-04-08)
>
> to avoid issues with jt/batch-fetch-blobs-in-diff.

Yuck.  As an experienced contributor, you should know better than
to do that by now.  Merges into jch/pu are rebuilt at least
once and often three times a day, and are in no way a good solid base
to build on top of.

If you really need to depend on another topic or two, please base
your work on a merge between 'master' (or some well known ancestor
of it) and the tips of the topics instead.

Having said that, I thought that the semantic conflict has been
corrected and the machinery to rebuild 'pu' has been replaying the
correct resolution ever since, so there was no need for such a
rebase?  Isn't it the case and do we still have the breakage due to
semantic conflict with JTan's topic in 'pu'?

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 00/16] Many promisor remotes
  2019-04-15  9:27 ` [PATCH v5 00/16] Many promisor remotes Junio C Hamano
@ 2019-04-15 10:30   ` Junio C Hamano
  2019-04-15 10:39     ` Christian Couder
  2019-04-15 10:37   ` Christian Couder
  1 sibling, 1 reply; 28+ messages in thread
From: Junio C Hamano @ 2019-04-15 10:30 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Jeff King, Ben Peart, Jonathan Tan, Jonathan Nieder,
	Nguyen Thai Ngoc Duy, Mike Hommey, Lars Schneider, Eric Wong,
	Christian Couder, Jeff Hostetler, Eric Sunshine, Beat Bolli,
	SZEDER Gábor, Ramsay Jones

Junio C Hamano <gitster@pobox.com> writes:

> Christian Couder <christian.couder@gmail.com> writes:
>
>> This patch series is based on:
>>
>> 763fb763b8 (Merge branch 'jt/batch-fetch-blobs-in-diff' into jch, 2019-04-08)
>>
>> to avoid issues with jt/batch-fetch-blobs-in-diff.
> ...
> If you really need to depend on another topic or two, please base
> your work on a merge between 'master' (or some well known ancestor
> of it) and the tips of the topics instead.

Well, I've done this myself by first queuing these on 763fb763b8,
then making a merge between jt/batch-fetch-blobs-in-diff and master,
and applying these patches on top of the result.  You should be able
to see the resulting topic replacing the old one in 'pu' in today's
pushout.

Now I can lose the semantic conflict resolution the rebuilding
machinery was keeping, which makes things a bit simpler ;-)


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 00/16] Many promisor remotes
  2019-04-15  9:27 ` [PATCH v5 00/16] Many promisor remotes Junio C Hamano
  2019-04-15 10:30   ` Junio C Hamano
@ 2019-04-15 10:37   ` Christian Couder
  1 sibling, 0 replies; 28+ messages in thread
From: Christian Couder @ 2019-04-15 10:37 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Jeff King, Ben Peart, Jonathan Tan, Jonathan Nieder,
	Nguyen Thai Ngoc Duy, Mike Hommey, Lars Schneider, Eric Wong,
	Christian Couder, Jeff Hostetler, Eric Sunshine, Beat Bolli,
	SZEDER Gábor, Ramsay Jones

On Mon, Apr 15, 2019 at 11:27 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Christian Couder <christian.couder@gmail.com> writes:
>
> > This patch series is based on:
> >
> > 763fb763b8 (Merge branch 'jt/batch-fetch-blobs-in-diff' into jch, 2019-04-08)
> >
> > to avoid issues with jt/batch-fetch-blobs-in-diff.
>
> Yuck.  As an experienced contributor, you should know better than
> to do that by now.  Merges into jch/pu are rebuilt at least
> once and often three times a day, and are in no way a good solid base
> to build on top of.

Sorry if it creates problems.

> If you really need to depend on another topic or two, please base
> your work on a merge between 'master' (or some well known ancestor
> of it) and the tips of the topics instead.

Ok I will do that then.

> Having said that, I thought that the semantic conflict has been
> corrected and the machinery to rebuild 'pu' has been replaying the
> correct resolution ever since, so there was no need for such a
> rebase?  Isn't it the case and do we still have the breakage due to
> semantic conflict with JTan's topic in 'pu'?

There is one patch in the series, Patch 8/16 (diff: use
promisor-remote.h instead of fetch-object.h), that fixes the breakage.
If the series is applied on top of jt/batch-fetch-blobs-in-diff, it
will apply correctly; if it is not, that patch can just be dropped and
everything else will apply correctly. I thought that it might be
better to make the fix explicit than to rely on the rebuild machinery.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 00/16] Many promisor remotes
  2019-04-15 10:30   ` Junio C Hamano
@ 2019-04-15 10:39     ` Christian Couder
  0 siblings, 0 replies; 28+ messages in thread
From: Christian Couder @ 2019-04-15 10:39 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Jeff King, Ben Peart, Jonathan Tan, Jonathan Nieder,
	Nguyen Thai Ngoc Duy, Mike Hommey, Lars Schneider, Eric Wong,
	Christian Couder, Jeff Hostetler, Eric Sunshine, Beat Bolli,
	SZEDER Gábor, Ramsay Jones

On Mon, Apr 15, 2019 at 12:30 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Junio C Hamano <gitster@pobox.com> writes:
>
> > Christian Couder <christian.couder@gmail.com> writes:
> >
> >> This patch series is based on:
> >>
> >> 763fb763b8 (Merge branch 'jt/batch-fetch-blobs-in-diff' into jch, 2019-04-08)
> >>
> >> to avoid issues with jt/batch-fetch-blobs-in-diff.
> > ...
> > If you really need to depend on another topic or two, please base
> > your work on a merge between 'master' (or some well known ancestor
> > of it) and the tips of the topics instead.
>
> Well, I've done this myself by first queuing these on 763fb763b8,
> then making a merge between jt/batch-fetch-blobs-in-diff and master,
> and applying these patches on top of the result.  You should be able
> to see the resulting topic replacing the old one in 'pu' in today's
> pushout.

Thanks, I will take a look.

> Now I can lose the semantic conflict resolution the rebuilding
> machinery was keeping, which makes things a bit simpler ;-)

Yeah, nice!

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 04/16] promisor-remote: implement promisor_remote_get_direct()
  2019-04-09 16:11 ` [PATCH v5 04/16] promisor-remote: implement promisor_remote_get_direct() Christian Couder
@ 2019-05-30 17:21   ` Derrick Stolee
  2019-05-30 20:46     ` Johannes Schindelin
  2019-05-31  5:10     ` Christian Couder
  0 siblings, 2 replies; 28+ messages in thread
From: Derrick Stolee @ 2019-05-30 17:21 UTC (permalink / raw)
  To: Christian Couder, git
  Cc: Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, SZEDER Gábor, Ramsay Jones

On 4/9/2019 12:11 PM, Christian Couder wrote:
> From: Christian Couder <christian.couder@gmail.com>
> 
> This is implemented for now by calling fetch_objects(). It fetches
> from all the promisor remotes.

Hi Christian,

Sorry for jumping on the thread late, but I noticed some peculiarities
when looking at the test coverage report.

> +static int remove_fetched_oids(struct object_id **oids, int oid_nr, int to_free)

This method does not seem to be covered by the test suite at all.
Is this scenario difficult to set up for a test?

> +{
> +	int i, missing_nr = 0;
> +	int *missing = xcalloc(oid_nr, sizeof(*missing));
> +	struct object_id *old_oids = *oids;
> +	struct object_id *new_oids;
> +	int old_fetch_if_missing = fetch_if_missing;
> +
> +	fetch_if_missing = 0;

This global 'fetch_if_missing' swap seems very fragile. I'm guessing you are using
it to prevent a loop when calling oid_object_info_extended() below. Can you instead
pass a flag to the method that disables the fetch_if_missing behavior?

> +
> +	for (i = 0; i < oid_nr; i++)
> +		if (oid_object_info_extended(the_repository, &old_oids[i], NULL, 0)) {

A use of "the_repository" this deep in new code is asking for a refactor later to remove it.
Please try to pass a "struct repository *r" through your methods so we minimize references
to the_repository (and the amount of work required to remove them later).

> +			missing[i] = 1;
> +			missing_nr++;
> +		}
> +
> +	fetch_if_missing = old_fetch_if_missing;
> +
> +	if (missing_nr) {
> +		int j = 0;
> +		new_oids = xcalloc(missing_nr, sizeof(*new_oids));
> +		for (i = 0; i < oid_nr; i++)
> +			if (missing[i])
> +				oidcpy(&new_oids[j++], &old_oids[i]);
> +		*oids = new_oids;
> +		if (to_free)
> +			free(old_oids);
> +	}
> +
> +	free(missing);
> +
> +	return missing_nr;
> +}
> +
> +int promisor_remote_get_direct(const struct object_id *oids, int oid_nr)
> +{
> +	struct promisor_remote *r;
> +	struct object_id *missing_oids = (struct object_id *)oids;
> +	int missing_nr = oid_nr;

Note that for this method, "missing_nr" actually means "number of oids still in the list".

> +	int to_free = 0;
> +	int res = -1;
> +
> +	promisor_remote_init();
> +
> +	for (r = promisors; r; r = r->next) {
> +		if (fetch_objects(r->name, missing_oids, missing_nr) < 0) {

This block hits if we have any missing objects. This is not currently hit by the test
suite.

> +			if (missing_nr == 1)
> +				continue;

But we skip the call below if there is exactly one object in the list, as it must be the one
missing object. So, to be interesting we need to try fetching multiple objects.

> +			missing_nr = remove_fetched_oids(&missing_oids, missing_nr, to_free);

Here is the one call, and after this assignment "missing_nr" does mean the number of missing objects.
However, I do think this could be clarified by using remaining_nr and remaining_oids.

> +			if (missing_nr) {
> +				to_free = 1;
> +				continue;
> +			}

Now this block took a bit to grok. You use to_free in the if(to_free) free(missing_oids); below.
But it also changes the behavior of remove_fetched_oids(). This means that the first time
remove_fetched_oids() will preserve the list (because it is the input list) but all later
calls will free the newly-created intermediate list. This checks out.

What is confusing to me: is there any reason that missing_nr would be zero in this situation?
I guess if the fetch_objects() failed to find some objects, but we ended up having them locally
in a new call to oid_object_info_extended(). That's a fringe case that is worth guarding against
but I wouldn't worry about testing.
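
A minimal sketch of the ownership rule described above, using plain ints in place of struct object_id. The names have_locally() and drop_present_ids() are hypothetical and not part of the patch; the "even ids are present" rule is only an assumption so that the example has something to filter.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a local object lookup: even ids are "present". */
static int have_locally(int id)
{
	return id % 2 == 0;
}

/*
 * Point *ids at a freshly allocated list holding only the ids that are
 * still missing, and return how many there are. The previous list is
 * freed only when to_free is set, i.e. when it was allocated by an
 * earlier call rather than handed in by the caller. If nothing is
 * missing, *ids is left untouched.
 */
static int drop_present_ids(int **ids, int nr, int to_free)
{
	int *old_ids = *ids;
	int *new_ids;
	int i, j = 0, missing_nr = 0;

	for (i = 0; i < nr; i++)
		if (!have_locally(old_ids[i]))
			missing_nr++;

	if (missing_nr) {
		new_ids = calloc(missing_nr, sizeof(*new_ids));
		assert(new_ids);
		for (i = 0; i < nr; i++)
			if (!have_locally(old_ids[i]))
				new_ids[j++] = old_ids[i];
		*ids = new_ids;
		if (to_free)
			free(old_ids);
	}
	return missing_nr;
}
```

The first call would pass to_free = 0 so the caller's array survives; once a call has returned a non-zero count, the list belongs to the loop, so later calls pass to_free = 1 and the final list is freed after the loop.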

> +		}
> +		res = 0;
> +		break;
> +	}
> +
> +	if (to_free)
> +		free(missing_oids);
> +
> +	return res;
> +}

While the test coverage report brought this patch to my attention, it does seem correct.
I still think a test exposing this method would be good, especially one that requires
a fetch_objects() call to multiple remotes to really exercise the details of remove_fetched_oids().

Thanks,
-Stolee


* Re: [PATCH v5 04/16] promisor-remote: implement promisor_remote_get_direct()
  2019-05-30 17:21   ` Derrick Stolee
@ 2019-05-30 20:46     ` Johannes Schindelin
  2019-05-30 20:54       ` Derrick Stolee
  2019-05-31  5:10     ` Christian Couder
  1 sibling, 1 reply; 28+ messages in thread
From: Johannes Schindelin @ 2019-05-30 20:46 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Christian Couder, git, Junio C Hamano, Jeff King, Ben Peart,
	Jonathan Tan, Jonathan Nieder, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, SZEDER Gábor, Ramsay Jones

Hi,

On Thu, 30 May 2019, Derrick Stolee wrote:

> On 4/9/2019 12:11 PM, Christian Couder wrote:
> > From: Christian Couder <christian.couder@gmail.com>
> >
> > +{
> > +	int i, missing_nr = 0;
> > +	int *missing = xcalloc(oid_nr, sizeof(*missing));
> > +	struct object_id *old_oids = *oids;
> > +	struct object_id *new_oids;
> > +	int old_fetch_if_missing = fetch_if_missing;
> > +
> > +	fetch_if_missing = 0;
>
> This global 'fetch_if_missing' swap seems very fragile. I'm guessing you
> are using it to prevent a loop when calling oid_object_info_extended()
> below. Can you instead pass a flag to the method that disables the
> fetch_if_missing behavior?

FWIW I mentioned the very same concern here:
https://public-inbox.org/git/nycvar.QRO.7.76.6.1903272300020.41@tvgsbejvaqbjf.bet/

The situation is *pretty* bad by now. I see `fetch_if_missing` mentioned
25 times in `master`, and all but one are in .c files or in cache.h.

The flag is actually used only in `oid_object_info_extended()`, and that
function accepts an `unsigned flags`, so one might think that it could be
extended to accept also a `OBJECT_INFO_LOOKUP_FETCH_IF_MISSING`. But then,
there are many callers of that function, some of them also pretty low in
the food chain. For example, `oid_object_info()` (does not accept `flags`)
or `read_object()` (does not accept flags either).

So it looks as if the idea to pass this flag down the call chain entailed
a pretty serious avalanche effect.

An alternative that strikes me as inelegant, still, but nevertheless
better would be to move `fetch_if_missing` into `struct repository`.

Ciao,
Dscho

* Re: [PATCH v5 04/16] promisor-remote: implement promisor_remote_get_direct()
  2019-05-30 20:46     ` Johannes Schindelin
@ 2019-05-30 20:54       ` Derrick Stolee
  2019-05-31 11:35         ` Johannes Schindelin
  2019-05-31 16:14         ` Junio C Hamano
  0 siblings, 2 replies; 28+ messages in thread
From: Derrick Stolee @ 2019-05-30 20:54 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Christian Couder, git, Junio C Hamano, Jeff King, Ben Peart,
	Jonathan Tan, Jonathan Nieder, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, SZEDER Gábor, Ramsay Jones

On 5/30/2019 4:46 PM, Johannes Schindelin wrote:
> Hi,
> 
> On Thu, 30 May 2019, Derrick Stolee wrote:
> 
>> On 4/9/2019 12:11 PM, Christian Couder wrote:
>>> From: Christian Couder <christian.couder@gmail.com>
>>>
>>> +{
>>> +	int i, missing_nr = 0;
>>> +	int *missing = xcalloc(oid_nr, sizeof(*missing));
>>> +	struct object_id *old_oids = *oids;
>>> +	struct object_id *new_oids;
>>> +	int old_fetch_if_missing = fetch_if_missing;
>>> +
>>> +	fetch_if_missing = 0;
>>
>> This global 'fetch_if_missing' swap seems very fragile. I'm guessing you
>> are using it to prevent a loop when calling oid_object_info_extended()
>> below. Can you instead pass a flag to the method that disables the
>> fetch_if_missing behavior?
> 
> FWIW I mentioned the very same concern here:
> https://public-inbox.org/git/nycvar.QRO.7.76.6.1903272300020.41@tvgsbejvaqbjf.bet/
> 
> The situation is *pretty* bad by now. I see `fetch_if_missing` mentioned
> 25 times in `master`, and all but one are in .c files or in cache.h.
> 
> The flag is actually used only in `oid_object_info_extended()`, and that
> function accepts an `unsigned flags`, so one might think that it could be
> extended to accept also a `OBJECT_INFO_LOOKUP_FETCH_IF_MISSING`. But then,
> there are many callers of that function, some of them also pretty low in
> the food chain. For example, `oid_object_info()` (does not accept `flags`)
> or `read_object()` (does not accept flags either).
>
> So it looks as if the idea to pass this flag down the call chain entailed
> a pretty serious avalanche effect.

It could be approached in small bits.

First, add an OBJECT_INFO_NEVER_FETCH_IF_MISSING flag that overrides fetch_if_missing,
and then use the flag in small places like this one. Then, build up to the other
methods as appropriate.
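
A minimal sketch of the first step, assuming a per-call flag that overrides the global default. LOOKUP_NEVER_FETCH_IF_MISSING, lookup_object(), in_local_store() and fetch_calls are hypothetical names for illustration only; the global here merely models fetch_if_missing, and the fetch is simulated by a counter.

```c
#include <assert.h>

/* Hypothetical flag bit; a real name and value would be decided in the patch. */
#define LOOKUP_NEVER_FETCH_IF_MISSING (1u << 0)

/* Models the process-wide default discussed in the thread. */
static int fetch_if_missing = 1;

/* Counts simulated network fetches so the effect is observable. */
static int fetch_calls;

/* Hypothetical local store: only id 7 is present. */
static int in_local_store(int id)
{
	return id == 7;
}

/* Returns 0 if the object is available, -1 otherwise. */
static int lookup_object(int id, unsigned flags)
{
	assert(id >= 0);
	if (in_local_store(id))
		return 0;
	/* The per-call flag wins over the global default. */
	if (fetch_if_missing && !(flags & LOOKUP_NEVER_FETCH_IF_MISSING)) {
		fetch_calls++;
		return 0; /* pretend the lazy fetch succeeded */
	}
	return -1;
}
```

A caller that only wants to know whether an object is present locally passes the flag and never touches the global, so there is nothing to save and restore.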

> An alternative that strikes me as inelegant, still, but nevertheless
> better would be to move `fetch_if_missing` into `struct repository`.

This is literally the _least_ we should do to reduce our dependence on
globals. Maybe this happens first, then the flag idea could be done bits
at a time.

Thanks,
-Stolee


* Re: [PATCH v5 04/16] promisor-remote: implement promisor_remote_get_direct()
  2019-05-30 17:21   ` Derrick Stolee
  2019-05-30 20:46     ` Johannes Schindelin
@ 2019-05-31  5:10     ` Christian Couder
  2019-06-25 13:50       ` Christian Couder
  1 sibling, 1 reply; 28+ messages in thread
From: Christian Couder @ 2019-05-31  5:10 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: git, Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, SZEDER Gábor, Ramsay Jones

Hi Stolee,

On Thu, May 30, 2019 at 7:21 PM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 4/9/2019 12:11 PM, Christian Couder wrote:
> > From: Christian Couder <christian.couder@gmail.com>
> >
> > This is implemented for now by calling fetch_objects(). It fetches
> > from all the promisor remotes.
>
> Sorry for jumping on the thread late, but I noticed some peculiarities
> when looking at the test coverage report.

You are welcome. It needs review according to Junio, so it's
definitely a good thing that you take a look at it.

> > +static int remove_fetched_oids(struct object_id **oids, int oid_nr, int to_free)
>
> This method does not seem to be covered by the test suite at all.
> Is this scenario difficult to set up for a test?

I think so. If I remember correctly, I added this following a review
by Junio because it could be possible that a promisor/partial clone
remote only sends part of the promisor objects it is asked for. In this
case the objects that have been fetched should be removed from the
list of objects we try to fetch from the next promisor/partial clone
remote.

The issue is that now if a promisor/partial clone remote can send only
part of the promisor objects it is asked for, it should fail, as far as
I understand, which means that we will not actually get the objects it
should send. That's why I think it's not easy, or perhaps even not
possible, to test this.

> > +{
> > +     int i, missing_nr = 0;
> > +     int *missing = xcalloc(oid_nr, sizeof(*missing));
> > +     struct object_id *old_oids = *oids;
> > +     struct object_id *new_oids;
> > +     int old_fetch_if_missing = fetch_if_missing;
> > +
> > +     fetch_if_missing = 0;
>
> This global 'fetch_if_missing' swap seems very fragile. I'm guessing you are using
> it to prevent a loop when calling oid_object_info_extended() below. Can you instead
> pass a flag to the method that disables the fetch_if_missing behavior?

If such a flag existed when I wrote the function I would certainly
have used it, as I also dislike this kind of messing with a global
(and globals in general).

I will see if I can do something about it according to what you
suggest later in this thread.

> > +     for (i = 0; i < oid_nr; i++)
> > +             if (oid_object_info_extended(the_repository, &old_oids[i], NULL, 0)) {
>
> A use of "the_repository" this deep in new code is asking for a refactor later to remove it.
> Please try to pass a "struct repository *r" through your methods so we minimize references
> to the_repository (and the amount of work required to remove them later).

Ok, I will take a look at that.

> > +int promisor_remote_get_direct(const struct object_id *oids, int oid_nr)
> > +{
> > +     struct promisor_remote *r;
> > +     struct object_id *missing_oids = (struct object_id *)oids;
> > +     int missing_nr = oid_nr;
>
> Note that for this method, "missing_nr" actually means "number of oids still in the list".
>
> > +     int to_free = 0;
> > +     int res = -1;
> > +
> > +     promisor_remote_init();
> > +
> > +     for (r = promisors; r; r = r->next) {
> > +             if (fetch_objects(r->name, missing_oids, missing_nr) < 0) {
>
> This block hits if we have any missing objects. This is not currently hit by the test
> suite.
>
> > +                     if (missing_nr == 1)
> > +                             continue;
>
> But we skip the call below if there is exactly one object in the list, as it must be the one
> missing object. So, to be interesting we need to try fetching multiple objects.
>
> > +                     missing_nr = remove_fetched_oids(&missing_oids, missing_nr, to_free);
>
> Here is the one call, and after this assignment "missing_nr" does mean the number of missing objects.
> However, I do think this could be clarified by using remaining_nr and remaining_oids.

Ok, I will take a look at using "remaining_nr" and "remaining_oids".

> > +                     if (missing_nr) {
> > +                             to_free = 1;
> > +                             continue;
> > +                     }
>
> Now this block took a bit to grok. You use to_free in the if(to_free) free(missing_oids); below.
> But it also changes the behavior of remove_fetched_oids(). This means that the first time
> remove_fetched_oids() will preserve the list (because it is the input list) but all later
> calls will free the newly-created intermediate list. This checks out.
>
> What is confusing to me: is there any reason that missing_nr would be zero in this situation?

I don't think so but I will check again, and maybe add a comment.

> I guess if the fetch_objects() failed to find some objects, but we ended up having them locally
> in a new call to oid_object_info_extended(). That's a fringe case that is worth guarding against
> but I wouldn't worry about testing.
>
> > +             }
> > +             res = 0;
> > +             break;
> > +     }
> > +
> > +     if (to_free)
> > +             free(missing_oids);
> > +
> > +     return res;
> > +}
>
> While the test coverage report brought this patch to my attention, it does seem correct.
> I still think a test exposing this method would be good, especially one that requires
> a fetch_objects() call to multiple remotes to really exercise the details of remove_fetched_oids().

Yeah, I would like to actually test it. I will take another look at
what can be done to test this. Perhaps I will look at what can be done
to still get some objects when fetching from a promisor/partial clone
remote even when it doesn't have all of the objects we request.

Thanks for the review,
Christian.

* Re: [PATCH v5 04/16] promisor-remote: implement promisor_remote_get_direct()
  2019-05-30 20:54       ` Derrick Stolee
@ 2019-05-31 11:35         ` Johannes Schindelin
  2019-05-31 16:14         ` Junio C Hamano
  1 sibling, 0 replies; 28+ messages in thread
From: Johannes Schindelin @ 2019-05-31 11:35 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Christian Couder, git, Junio C Hamano, Jeff King, Ben Peart,
	Jonathan Tan, Jonathan Nieder, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, SZEDER Gábor, Ramsay Jones

Hi Stolee,

On Thu, 30 May 2019, Derrick Stolee wrote:

> On 5/30/2019 4:46 PM, Johannes Schindelin wrote:
> >
> > On Thu, 30 May 2019, Derrick Stolee wrote:
> >
> >> On 4/9/2019 12:11 PM, Christian Couder wrote:
> >>> From: Christian Couder <christian.couder@gmail.com>
> >>>
> >>> +{
> >>> +	int i, missing_nr = 0;
> >>> +	int *missing = xcalloc(oid_nr, sizeof(*missing));
> >>> +	struct object_id *old_oids = *oids;
> >>> +	struct object_id *new_oids;
> >>> +	int old_fetch_if_missing = fetch_if_missing;
> >>> +
> >>> +	fetch_if_missing = 0;
> >>
> >> This global 'fetch_if_missing' swap seems very fragile. I'm guessing you
> >> are using it to prevent a loop when calling oid_object_info_extended()
> >> below. Can you instead pass a flag to the method that disables the
> >> fetch_if_missing behavior?
> >
> > FWIW I mentioned the very same concern here:
> > https://public-inbox.org/git/nycvar.QRO.7.76.6.1903272300020.41@tvgsbejvaqbjf.bet/
> >
> > The situation is *pretty* bad by now. I see `fetch_if_missing` mentioned
> > 25 times in `master`, and all but one are in .c files or in cache.h.
> >
> > The flag is actually used only in `oid_object_info_extended()`, and that
> > function accepts an `unsigned flags`, so one might think that it could be
> > extended to accept also a `OBJECT_INFO_LOOKUP_FETCH_IF_MISSING`. But then,
> > there are many callers of that function, some of them also pretty low in
> > the food chain. For example, `oid_object_info()` (does not accept `flags`)
> > or `read_object()` (does not accept flags either).
> >
> > So it looks as if the idea to pass this flag down the call chain entailed
> > a pretty serious avalanche effect.
>
> It could be approached in small bits.
>
> First, add an OBJECT_INFO_NEVER_FETCH_IF_MISSING flag that overrides
> fetch_if_missing, and then use the flag in small places like this one.
> Then, build up to the other methods as appropriate.

That is a good idea. I fear that it will still take a Herculean effort to
get there, as some of the call paths strike me as rather deep...

> > An alternative that strikes me as inelegant, still, but nevertheless
> > better would be to move `fetch_if_missing` into `struct repository`.
>
> This is literally the _least_ we should do to reduce our dependence on
> globals. Maybe this happens first, then the flag idea could be done bits
> at a time.

Okay, then, I added https://github.com/gitgitgadget/git/issues/251 so we
won't forget.

BTW I am rather happy about the way the GitGitGadget issues turn out: I
added a couple of left-over bits, and could already close two tickets
after other developers pointed out that they had already been addressed,
something an unsuspecting GSoC student, for example, could not otherwise
have found out very easily (or for that matter, I myself...).

Ciao,
Dscho

* Re: [PATCH v5 04/16] promisor-remote: implement promisor_remote_get_direct()
  2019-05-30 20:54       ` Derrick Stolee
  2019-05-31 11:35         ` Johannes Schindelin
@ 2019-05-31 16:14         ` Junio C Hamano
  1 sibling, 0 replies; 28+ messages in thread
From: Junio C Hamano @ 2019-05-31 16:14 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Johannes Schindelin, Christian Couder, git, Jeff King, Ben Peart,
	Jonathan Tan, Jonathan Nieder, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, SZEDER Gábor, Ramsay Jones

Derrick Stolee <stolee@gmail.com> writes:

>>> This global 'fetch_if_missing' swap seems very fragile. I'm guessing you
>>> are using it to prevent a loop when calling oid_object_info_extended()
>>> below. Can you instead pass a flag to the method that disables the
>>> fetch_if_missing behavior?
>>  ...
>> The flag is actually used only in `oid_object_info_extended()`, and that
>> function accepts an `unsigned flags`, so one might think that it could be
>> extended to accept also a `OBJECT_INFO_LOOKUP_FETCH_IF_MISSING`. But then,
>> there are many callers of that function, some of them also pretty low in
>> the food chain. For example, `oid_object_info()` (does not accept `flags`)
>> or `read_object()` (does not accept flags either).
>>
>> So it looks as if the idea to pass this flag down the call chain entailed
>> a pretty serious avalanche effect.
>
> It could be approached in small bits.
>
> First, add an OBJECT_INFO_NEVER_FETCH_IF_MISSING flag that overrides fetch_if_missing,
> and then use the flag in small places like this one. Then, build up to the other
> methods as appropriate.
>
>> An alternative that strikes me as inelegant, still, but nevertheless
>> better would be to move `fetch_if_missing` into `struct repository`.
>
> This is literally the _least_ we should do to reduce our dependence on
> globals. Maybe this happens first, then the flag idea could be done bits
> at a time.

The bit is not an attribute of a repository instance, and I agree it
is an ugly hack to take advantage of an unrelated fact that a repo
is getting passed throughout the callchain.  It is better than
nothing if we stop there and will not do anything more to the topic,
but in the longer term, it is not that much better than a global, I am
afraid.  We may not be doing the save-flip-and-restore-the-bit dance
on the global anymore, but instead would be doing the same for the
field in the repository object, no?
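
A minimal sketch of that concern, assuming the bit simply moves into a pared-down repository object. struct repo, repo_has_object() and count_missing_without_fetching() are hypothetical names, and the "even ids exist locally" rule is an assumption for illustration; the point is only that the caller still has to save, flip and restore.

```c
#include <assert.h>

/* Hypothetical, pared-down repository object carrying the bit. */
struct repo {
	int fetch_if_missing;
	int fetch_calls; /* simulated lazy fetches, for observation only */
};

/* Lookup that consults the per-repository field instead of a global. */
static int repo_has_object(struct repo *r, int id)
{
	assert(r);
	if (id % 2 == 0) /* assumption: even ids exist locally */
		return 1;
	if (r->fetch_if_missing)
		r->fetch_calls++; /* would go to the network here */
	return 0;
}

/* A caller that must not trigger fetches still has to flip the bit. */
static int count_missing_without_fetching(struct repo *r, const int *ids, int nr)
{
	int saved = r->fetch_if_missing; /* save */
	int i, missing = 0;

	r->fetch_if_missing = 0; /* flip */
	for (i = 0; i < nr; i++)
		if (!repo_has_object(r, ids[i]))
			missing++;
	r->fetch_if_missing = saved; /* restore */
	return missing;
}
```

The state is now scoped to one repository object rather than the whole process, but the shape of the caller is unchanged, which is the argument for a per-call flag.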

In any case, thanks for taking a look at the topic; what it wants to
achieve is worthwhile, but its execution does look like it needs
quite a lot more polishing, which is helped by review comments like
these.



* Re: [PATCH v5 04/16] promisor-remote: implement promisor_remote_get_direct()
  2019-05-31  5:10     ` Christian Couder
@ 2019-06-25 13:50       ` Christian Couder
  0 siblings, 0 replies; 28+ messages in thread
From: Christian Couder @ 2019-06-25 13:50 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: git, Junio C Hamano, Jeff King, Ben Peart, Jonathan Tan,
	Jonathan Nieder, Nguyen Thai Ngoc Duy, Mike Hommey,
	Lars Schneider, Eric Wong, Christian Couder, Jeff Hostetler,
	Eric Sunshine, Beat Bolli, SZEDER Gábor, Ramsay Jones

On Fri, May 31, 2019 at 7:10 AM Christian Couder
<christian.couder@gmail.com> wrote:

> On Thu, May 30, 2019 at 7:21 PM Derrick Stolee <stolee@gmail.com> wrote:
> >
> > On 4/9/2019 12:11 PM, Christian Couder wrote:

> > > +{
> > > +     int i, missing_nr = 0;
> > > +     int *missing = xcalloc(oid_nr, sizeof(*missing));
> > > +     struct object_id *old_oids = *oids;
> > > +     struct object_id *new_oids;
> > > +     int old_fetch_if_missing = fetch_if_missing;
> > > +
> > > +     fetch_if_missing = 0;
> >
> > This global 'fetch_if_missing' swap seems very fragile. I'm guessing you are using
> > it to prevent a loop when calling oid_object_info_extended() below. Can you instead
> > pass a flag to the method that disables the fetch_if_missing behavior?
>
> If such a flag existed when I wrote the function I would certainly
> have used it, as I also dislike this kind of messing with a global
> (and globals in general).
>
> I will see if I can do something about it according to what you
> suggest later in this thread.

In the V6 patch series I just sent, the new
OBJECT_INFO_SKIP_FETCH_OBJECT flag that you introduced is used.

> > > +     for (i = 0; i < oid_nr; i++)
> > > +             if (oid_object_info_extended(the_repository, &old_oids[i], NULL, 0)) {
> >
> > A use of "the_repository" this deep in new code is asking for a refactor later to remove it.
> > Please try to pass a "struct repository *r" through your methods so we minimize references
> > to the_repository (and the amount of work required to remove them later).
>
> Ok, I will take a look at that.

A "struct repository *r" is passed in V6. I forgot to mention that in
the cover letter.

> > > +                     missing_nr = remove_fetched_oids(&missing_oids, missing_nr, to_free);
> >
> > Here is the one call, and after this assignment "missing_nr" does mean the number of missing objects.
> > However, I do think this could be clarified by using remaining_nr and remaining_oids.
>
> Ok, I will take a look at using "remaining_nr" and "remaining_oids".

Done in V6 too.

> > > +                     if (missing_nr) {
> > > +                             to_free = 1;
> > > +                             continue;
> > > +                     }
> >
> > Now this block took a bit to grok. You use to_free in the if(to_free) free(missing_oids); below.
> > But it also changes the behavior of remove_fetched_oids(). This means that the first time
> > remove_fetched_oids() will preserve the list (because it is the input list) but all later
> > calls will free the newly-created intermediate list. This checks out.
> >
> > What is confusing to me: is there any reason that missing_nr would be zero in this situation?
>
> I don't think so but I will check again, and maybe add a comment.

Actually missing_nr, or now remaining_nr, would be 0 if all the
promised objects have been fetched.
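
A minimal sketch of how the remaining count can reach zero across several remotes. It assumes, hypothetically, that a failing fetch may still store the objects the remote does have, which the thread leaves open. All names (fake_remote, fake_fetch(), keep_remaining(), get_from_remotes()) are illustrative, and the list is copied up front and compacted in place, which is a simplification and not how the patch manages ownership; the patch's shortcut for a single remaining object is also left out.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_ID 32

/* Hypothetical object store: have[id] is non-zero once id is present. */
static int have[MAX_ID];

/* Hypothetical remote: the set of ids it can serve. */
struct fake_remote {
	const int *ids;
	int nr;
};

static int remote_serves(const struct fake_remote *rm, int id)
{
	int i;

	for (i = 0; i < rm->nr; i++)
		if (rm->ids[i] == id)
			return 1;
	return 0;
}

/*
 * Assumed behavior: store whatever the remote can serve, and report
 * failure (-1) if anything requested was not served.
 */
static int fake_fetch(const struct fake_remote *rm, const int *want, int nr)
{
	int i, res = 0;

	for (i = 0; i < nr; i++) {
		assert(want[i] >= 0 && want[i] < MAX_ID);
		if (remote_serves(rm, want[i]))
			have[want[i]] = 1;
		else
			res = -1;
	}
	return res;
}

/* Compact the list in place to the ids still absent; return the new count. */
static int keep_remaining(int *ids, int nr)
{
	int i, j = 0;

	for (i = 0; i < nr; i++)
		if (!have[ids[i]])
			ids[j++] = ids[i];
	return j;
}

/* Try each remote in order; 0 once nothing remains, -1 otherwise. */
static int get_from_remotes(const struct fake_remote *remotes, int remote_nr,
			    const int *want, int nr)
{
	int *remaining;
	int remaining_nr = nr, i, res = -1;

	assert(nr > 0);
	remaining = malloc(nr * sizeof(*remaining));
	assert(remaining);
	for (i = 0; i < nr; i++)
		remaining[i] = want[i];

	for (i = 0; i < remote_nr; i++) {
		if (fake_fetch(&remotes[i], remaining, remaining_nr) < 0) {
			remaining_nr = keep_remaining(remaining, remaining_nr);
			if (remaining_nr)
				continue; /* ask the next remote for the rest */
		}
		res = 0;
		break;
	}
	free(remaining);
	return res;
}
```

With two remotes that each hold part of the request, the first fetch fails but shrinks the list, and the second one completes it; if some object is held by no remote, the loop runs out and the result stays -1.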

> > > +             }
> > > +             res = 0;
> > > +             break;
> > > +     }
> > > +
> > > +     if (to_free)
> > > +             free(missing_oids);
> > > +
> > > +     return res;
> > > +}
> >
> > While the test coverage report brought this patch to my attention, it does seem correct.
> > I still think a test exposing this method would be good, especially one that requires
> > a fetch_objects() call to multiple remotes to really exercise the details of remove_fetched_oids().
>
> Yeah, I would like to actually test it. I will take another look at
> what can be done to test this. Perhaps I will look at what can be done
> to still get some objects when fetching from a promisor/partial clone
> remote even when it doesn't have all of the objects we request.

I haven't improved test coverage or looked at how we could better
handle a partial fetch. I plan to look at that soon.

Thanks,
Christian.

Thread overview: 28+ messages
2019-04-09 16:11 [PATCH v5 00/16] Many promisor remotes Christian Couder
2019-04-09 16:11 ` [PATCH v5 01/16] t0410: remove pipes after git commands Christian Couder
2019-04-09 16:11 ` [PATCH v5 02/16] fetch-object: make functions return an error code Christian Couder
2019-04-09 16:11 ` [PATCH v5 03/16] Add initial support for many promisor remotes Christian Couder
2019-04-09 16:11 ` [PATCH v5 04/16] promisor-remote: implement promisor_remote_get_direct() Christian Couder
2019-05-30 17:21   ` Derrick Stolee
2019-05-30 20:46     ` Johannes Schindelin
2019-05-30 20:54       ` Derrick Stolee
2019-05-31 11:35         ` Johannes Schindelin
2019-05-31 16:14         ` Junio C Hamano
2019-05-31  5:10     ` Christian Couder
2019-06-25 13:50       ` Christian Couder
2019-04-09 16:11 ` [PATCH v5 05/16] promisor-remote: add promisor_remote_reinit() Christian Couder
2019-04-09 16:11 ` [PATCH v5 06/16] promisor-remote: use repository_format_partial_clone Christian Couder
2019-04-09 16:11 ` [PATCH v5 07/16] Use promisor_remote_get_direct() and has_promisor_remote() Christian Couder
2019-04-09 16:11 ` [PATCH v5 08/16] diff: use promisor-remote.h instead of fetch-object.h Christian Couder
2019-04-09 16:11 ` [PATCH v5 09/16] promisor-remote: parse remote.*.partialclonefilter Christian Couder
2019-04-09 16:11 ` [PATCH v5 10/16] builtin/fetch: remove unique promisor remote limitation Christian Couder
2019-04-09 16:11 ` [PATCH v5 11/16] t0410: test fetching from many promisor remotes Christian Couder
2019-04-09 16:11 ` [PATCH v5 12/16] partial-clone: add multiple remotes in the doc Christian Couder
2019-04-09 16:11 ` [PATCH v5 13/16] remote: add promisor and partial clone config to " Christian Couder
2019-04-09 16:11 ` [PATCH v5 14/16] Remove fetch-object.{c,h} in favor of promisor-remote.{c,h} Christian Couder
2019-04-09 16:11 ` [PATCH v5 15/16] Move repository_format_partial_clone to promisor-remote.c Christian Couder
2019-04-09 16:11 ` [PATCH v5 16/16] Move core_partial_clone_filter_default " Christian Couder
2019-04-15  9:27 ` [PATCH v5 00/16] Many promisor remotes Junio C Hamano
2019-04-15 10:30   ` Junio C Hamano
2019-04-15 10:39     ` Christian Couder
2019-04-15 10:37   ` Christian Couder
