git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [RFC PATCH] Modify fetch-pack to no longer die on error?
@ 2020-07-24 22:38 Jonathan Tan
  2020-07-24 23:07 ` Junio C Hamano
                   ` (4 more replies)
  0 siblings, 5 replies; 57+ messages in thread
From: Jonathan Tan @ 2020-07-24 22:38 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

We've had a few instances where a lazy fetch in a partial clone fails,
leading to a fatal error, when the calling code could have easily
recovered - in other words, the severity of the bug should have just a
wasted fetch instead of stopping the whole command. Part of the issue
(and possibly the whole issue - I haven't looked at this beyond
fetch-pack yet) is that fetch-pack dies whenever it encounters an
error, so I took a look at fixing that.

(Note that fetch-pack is sometimes run through a remote helper, meaning
that we could leave the die() invocations in and just make sure that we
handle failure in the separate process correctly. But when the promisor
remote is HTTP protocol v2 or SSH protocol v0/v2, this is not true -
fetch_pack() is run in-process.)

I think the best way for easy authorship and review is to convert each
possibly-dying function into a function that either returns a
possibly-null error string or returns success/failure and writes the
error string into an "out" parameter. In this way, the change is rather
mechanical and should be easy to review. In the patch below I chose the
former approach, and I modified 2 functions (one that returns no value
and one that returns a value) to demonstrate what it would look like.

We could also take this further and have a "struct error" for type
safety and macros - e.g. THROW() to return a "struct error", TRY() to
execute what's inside the parentheses and return the error if there is
one, and OR_DIE() to execute what's inside the parentheses and die if
there is an error.

Any opinions before I continue working on this?

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 fetch-pack.c | 78 +++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 53 insertions(+), 25 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index 80fb3bd899..20a7e05ea8 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -24,6 +24,8 @@
 #include "fsck.h"
 #include "shallow.h"
 
+typedef char * error_string;
+
 static int transfer_unpack_limit = -1;
 static int fetch_unpack_limit = -1;
 static int unpack_limit = 100;
@@ -136,8 +138,8 @@ enum ack_type {
 	ACK_ready
 };
 
-static void consume_shallow_list(struct fetch_pack_args *args,
-				 struct packet_reader *reader)
+static error_string consume_shallow_list(struct fetch_pack_args *args,
+					 struct packet_reader *reader)
 {
 	if (args->stateless_rpc && args->deepen) {
 		/* If we sent a depth we will get back "duplicate"
@@ -149,41 +151,54 @@ static void consume_shallow_list(struct fetch_pack_args *args,
 				continue;
 			if (starts_with(reader->line, "unshallow "))
 				continue;
-			die(_("git fetch-pack: expected shallow list"));
+			return xstrdup(_("git fetch-pack: expected shallow list"));
 		}
 		if (reader->status != PACKET_READ_FLUSH)
-			die(_("git fetch-pack: expected a flush packet after shallow list"));
+			return xstrdup(_("git fetch-pack: expected a flush packet after shallow list"));
 	}
+	return NULL;
 }
 
-static enum ack_type get_ack(struct packet_reader *reader,
-			     struct object_id *result_oid)
+static error_string get_ack(struct packet_reader *reader,
+			    enum ack_type *result_ack,
+			    struct object_id *result_oid)
 {
 	int len;
 	const char *arg;
 
 	if (packet_reader_read(reader) != PACKET_READ_NORMAL)
-		die(_("git fetch-pack: expected ACK/NAK, got a flush packet"));
+		return xstrdup(_("git fetch-pack: expected ACK/NAK, got a flush packet"));
 	len = reader->pktlen;
 
-	if (!strcmp(reader->line, "NAK"))
-		return NAK;
+	if (!strcmp(reader->line, "NAK")) {
+		*result_ack = NAK;
+		return NULL;
+	}
 	if (skip_prefix(reader->line, "ACK ", &arg)) {
 		const char *p;
 		if (!parse_oid_hex(arg, result_oid, &p)) {
 			len -= p - reader->line;
-			if (len < 1)
-				return ACK;
-			if (strstr(p, "continue"))
-				return ACK_continue;
-			if (strstr(p, "common"))
-				return ACK_common;
-			if (strstr(p, "ready"))
-				return ACK_ready;
-			return ACK;
+			if (len < 1) {
+				*result_ack = ACK;
+				return NULL;
+			}
+			if (strstr(p, "continue")) {
+				*result_ack = ACK_continue;
+				return NULL;
+			}
+			if (strstr(p, "common")) {
+				*result_ack = ACK_common;
+				return NULL;
+			}
+			if (strstr(p, "ready")) {
+				*result_ack = ACK_ready;
+				return NULL;
+			}
+			*result_ack = ACK;
+			return NULL;
 		}
 	}
-	die(_("git fetch-pack: expected ACK/NAK, got '%s'"), reader->line);
+	return xstrfmt(_("git fetch-pack: expected ACK/NAK, got '%s'"), reader->line);
 }
 
 static void send_request(struct fetch_pack_args *args,
@@ -394,7 +409,8 @@ static int find_common(struct fetch_negotiator *negotiator,
 		print_verbose(args, "have %s", oid_to_hex(oid));
 		in_vain++;
 		if (flush_at <= ++count) {
-			int ack;
+			enum ack_type ack;
+			error_string err;
 
 			packet_buf_flush(&req_buf);
 			send_request(args, fd[1], &req_buf);
@@ -409,9 +425,11 @@ static int find_common(struct fetch_negotiator *negotiator,
 			if (!args->stateless_rpc && count == INITIAL_FLUSH)
 				continue;
 
-			consume_shallow_list(args, &reader);
+			if ((err = consume_shallow_list(args, &reader)))
+				die("%s", err);
 			do {
-				ack = get_ack(&reader, result_oid);
+				if ((err = get_ack(&reader, &ack, result_oid)))
+					die("%s", err);
 				if (ack)
 					print_verbose(args, _("got %s %d %s"), "ack",
 						      ack, oid_to_hex(result_oid));
@@ -457,6 +475,9 @@ static int find_common(struct fetch_negotiator *negotiator,
 						got_ready = 1;
 					break;
 					}
+				case NAK:
+					/* nothing */
+					break;
 				}
 			} while (ack);
 			flushes--;
@@ -481,10 +502,17 @@ static int find_common(struct fetch_negotiator *negotiator,
 	}
 	strbuf_release(&req_buf);
 
-	if (!got_ready || !no_done)
-		consume_shallow_list(args, &reader);
+	if (!got_ready || !no_done) {
+		error_string err;
+		if ((err = consume_shallow_list(args, &reader)))
+			die("%s", err);
+	}
 	while (flushes || multi_ack) {
-		int ack = get_ack(&reader, result_oid);
+		error_string err;
+		enum ack_type ack;
+
+		if ((err = get_ack(&reader, &ack, result_oid)))
+			die("%s", err);
 		if (ack) {
 			print_verbose(args, _("got %s (%d) %s"), "ack",
 				      ack, oid_to_hex(result_oid));
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH] Modify fetch-pack to no longer die on error?
  2020-07-24 22:38 [RFC PATCH] Modify fetch-pack to no longer die on error? Jonathan Tan
@ 2020-07-24 23:07 ` Junio C Hamano
  2020-07-24 23:11 ` Junio C Hamano
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 57+ messages in thread
From: Junio C Hamano @ 2020-07-24 23:07 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

> We've had a few instances where a lazy fetch in a partial clone fails,
> leading to a fatal error, when the calling code could have easily
> recovered - in other words, the severity of the bug should have just a
> wasted fetch instead of stopping the whole command. Part of the issue
> (and possibly the whole issue - I haven't looked at this beyond
> fetch-pack yet) is that fetch-pack dies whenever it encounters an
> error, so I took a look at fixing that.
>
> (Note that fetch-pack is sometimes run through a remote helper, meaning
> that we could leave the die() invocations in and just make sure that we
> handle failure in the separate process correctly. But when the promisor
> remote is HTTP protocol v2 or SSH protocol v0/v2, this is not true -
> fetch_pack() is run in-process.)

That is easily fixable ;-)  After all, we are talking about network
latency once we "lazily fetch" during a main operation, so why not
take advantage of the isolation that the operating system gives us?

> I think the best way for easy authorship and review is to convert each
> possibly-dying function into a function that either returns a
> possibly-null error string or returns success/failure and writes the
> error string into an "out" parameter.

Ugly.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH] Modify fetch-pack to no longer die on error?
  2020-07-24 22:38 [RFC PATCH] Modify fetch-pack to no longer die on error? Jonathan Tan
  2020-07-24 23:07 ` Junio C Hamano
@ 2020-07-24 23:11 ` Junio C Hamano
  2020-07-25 21:41   ` Jeff King
  2020-08-05  1:20 ` [RFC PATCH 0/7] Lazy fetch with subprocess Jonathan Tan
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 57+ messages in thread
From: Junio C Hamano @ 2020-07-24 23:11 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

> I think the best way for easy authorship and review is to convert each
> possibly-dying function into a function that either returns a
> possibly-null error string or returns success/failure and writes the
> error string into an "out" parameter.

Ugly.

Rather, if we were to pass an extra parameter through the entire
callchain, I'd rather see that parameter to be what to do when we
see an error.  Depending on that, the callee can die(), return
error(), or silently return -1, or do something else.

I really do no think it is a good idea to pass text string back to
the caller, which invites sentence lego and untranslatable mess.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH] Modify fetch-pack to no longer die on error?
  2020-07-24 23:11 ` Junio C Hamano
@ 2020-07-25 21:41   ` Jeff King
  2020-07-25 23:01     ` Junio C Hamano
  0 siblings, 1 reply; 57+ messages in thread
From: Jeff King @ 2020-07-25 21:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Tan, git

On Fri, Jul 24, 2020 at 04:11:31PM -0700, Junio C Hamano wrote:

> Jonathan Tan <jonathantanmy@google.com> writes:
> 
> > I think the best way for easy authorship and review is to convert each
> > possibly-dying function into a function that either returns a
> > possibly-null error string or returns success/failure and writes the
> > error string into an "out" parameter.
> 
> Ugly.
> 
> Rather, if we were to pass an extra parameter through the entire
> callchain, I'd rather see that parameter to be what to do when we
> see an error.  Depending on that, the callee can die(), return
> error(), or silently return -1, or do something else.

Agreed. I had a proposal a while ago to pass a "struct error_context"
which was basically a function pointer and any accompanying data. That
lets the caller instruct us to die(), call error(), stuff the error in a
strbuf owned by the caller, and so on.

I think it was this one:

  https://lore.kernel.org/git/20160927191955.mympqgylrxhkp24n@sigill.intra.peff.net/

I still think it's a good idea, but the hard part is lib-ifying all of
the functions that call die() to instead return an error up the stack
(and handle cleanup, etc). Which is why I never really pushed it
further. But if we're going to be lib-ifying such calls anyway, I think
it's nice to do this flexible thing (from their perspective it's no more
work to trigger the callback than it is to call error() and return).

That said, I do think for this particular case, your "just run it in a
sub-process" suggestion may be simpler and more robust.

-Peff

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH] Modify fetch-pack to no longer die on error?
  2020-07-25 21:41   ` Jeff King
@ 2020-07-25 23:01     ` Junio C Hamano
  2020-07-27 17:11       ` Jeff King
  2020-07-28 19:23       ` Jonathan Tan
  0 siblings, 2 replies; 57+ messages in thread
From: Junio C Hamano @ 2020-07-25 23:01 UTC (permalink / raw)
  To: Jeff King; +Cc: Jonathan Tan, git

Jeff King <peff@peff.net> writes:

> I think it was this one:
>
>   https://lore.kernel.org/git/20160927191955.mympqgylrxhkp24n@sigill.intra.peff.net/
>
> I still think it's a good idea, but the hard part is lib-ifying all of
> the functions that call die() to instead return an error up the stack
> (and handle cleanup, etc). Which is why I never really pushed it
> further. But if we're going to be lib-ifying such calls anyway, I think
> it's nice to do this flexible thing (from their perspective it's no more
> work to trigger the callback than it is to call error() and return).

Yeah, I almost 80%+ agree.

The remainder of 20% is that I am not so convinced that (fmt_string
+ va_list) that isn't pretty much usable for anything but generating
human readable error messages is enough.  It is certainly a good
enough replacement for (and much better than) the approach to
prepare an error string in a "struct strbuf err" that was passeed by
the caller, but I am not sure if that is a good bar to aim to pass
to begin with ;-).

> That said, I do think for this particular case, your "just run it in a
> sub-process" suggestion may be simpler and more robust.

For this particular case, with the performance and all, I agree that
the stupid and robust approach would be the best.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH] Modify fetch-pack to no longer die on error?
  2020-07-25 23:01     ` Junio C Hamano
@ 2020-07-27 17:11       ` Jeff King
  2020-07-28 19:23       ` Jonathan Tan
  1 sibling, 0 replies; 57+ messages in thread
From: Jeff King @ 2020-07-27 17:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Tan, git

On Sat, Jul 25, 2020 at 04:01:25PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > I think it was this one:
> >
> >   https://lore.kernel.org/git/20160927191955.mympqgylrxhkp24n@sigill.intra.peff.net/
> >
> > I still think it's a good idea, but the hard part is lib-ifying all of
> > the functions that call die() to instead return an error up the stack
> > (and handle cleanup, etc). Which is why I never really pushed it
> > further. But if we're going to be lib-ifying such calls anyway, I think
> > it's nice to do this flexible thing (from their perspective it's no more
> > work to trigger the callback than it is to call error() and return).
> 
> Yeah, I almost 80%+ agree.
> 
> The remainder of 20% is that I am not so convinced that (fmt_string
> + va_list) that isn't pretty much usable for anything but generating
> human readable error messages is enough.  It is certainly a good
> enough replacement for (and much better than) the approach to
> prepare an error string in a "struct strbuf err" that was passeed by
> the caller, but I am not sure if that is a good bar to aim to pass
> to begin with ;-).

Yeah, I certainly agree that just passing around strings does not
provide as much information as some callers would like. But I think
introducing a system of machine-readable error tokens would be
cumbersome for a lot of cases, to the point that it may tip this from
"seems like an easy enough change" to something that nobody wants to
take on.

One in-between step that would be pretty easy is to record errno when
available (i.e., if error_errno() becomes error_ctx_errno(), then we
just stuff that errno value into the error_context struct). That would
let us reliably propagate errno up the stack, or even give intermediate
callers the opportunity to munge it (e.g., if a ref lookup failed with
EISDIR while opening a file, it might convert that to ENOENT as it
propagates up the stack).

-Peff

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH] Modify fetch-pack to no longer die on error?
  2020-07-25 23:01     ` Junio C Hamano
  2020-07-27 17:11       ` Jeff King
@ 2020-07-28 19:23       ` Jonathan Tan
  2020-07-28 20:08         ` Jeff King
  1 sibling, 1 reply; 57+ messages in thread
From: Jonathan Tan @ 2020-07-28 19:23 UTC (permalink / raw)
  To: gitster; +Cc: peff, jonathantanmy, git

> Jeff King <peff@peff.net> writes:
> 
> > I think it was this one:
> >
> >   https://lore.kernel.org/git/20160927191955.mympqgylrxhkp24n@sigill.intra.peff.net/
> >
> > I still think it's a good idea, but the hard part is lib-ifying all of
> > the functions that call die() to instead return an error up the stack
> > (and handle cleanup, etc). Which is why I never really pushed it
> > further. But if we're going to be lib-ifying such calls anyway, I think
> > it's nice to do this flexible thing (from their perspective it's no more
> > work to trigger the callback than it is to call error() and return).
> 
> Yeah, I almost 80%+ agree.

Thanks Peff for the pointer. That looks like a good idea, and more
comprehensive than my suggested approach.

> The remainder of 20% is that I am not so convinced that (fmt_string
> + va_list) that isn't pretty much usable for anything but generating
> human readable error messages is enough.  It is certainly a good
> enough replacement for (and much better than) the approach to
> prepare an error string in a "struct strbuf err" that was passeed by
> the caller, but I am not sure if that is a good bar to aim to pass
> to begin with ;-).

I think that functions that inform their callers about different errors
already do so through return values, but in any case I think that Peff's
idea can be extended once we need it (e.g. by adding error reporting
functions that can fallback to the string function if the special
function is not present).

> > That said, I do think for this particular case, your "just run it in a
> > sub-process" suggestion may be simpler and more robust.
> 
> For this particular case, with the performance and all, I agree that
> the stupid and robust approach would be the best.

I'm concerned that we will be painting ourselves into a corner here - I
have been appreciating the richer interface that a C call provides us,
compared to sub-processes where we have to communicate through a single
input stream and a single output stream. For example, "wanted-refs" and
packfile URIs (a.k.a. CDN offloading) were made much simpler because it
was quite easy to communicate back the updated hashes for refs (for
"wanted-refs") and more than one lockfile (for packfile URIs). If we
were doing it with "fetch" in remote helpers (like in HTTP protocol v0),
we would have to augment the textual protocol to communicate back the
updated hashes and the list of lockfiles - doable, but more cumbersome.
(That is also why I only implemented those for protocol v2, because
protocol v2 had "stateless-connect" for HTTP.)

Being limited to sub-processes also stopped me from implementing an
improvement to how fetch-pack calls index-pack - when I added debugging
information (list of refs and hashes) when writing the .promisor file,
what I wanted to do was to pass a callback (function pointer + data
pointer) to index-pack to tell it what to write to .promisor, but I
couldn't do that. I also couldn't write the list to the stdin of
index-pack, because the stdin was already used to communicate the
packfile. I could write the list to a temporary file and pass that name
to index-pack, but by that stage I thought it would be simpler to just
have fetch-pack write the .promisor file itself, even though I think
that this should be an index-pack concern, not a fetch-pack one.

(Also, I think that debugging within a process is easier than debugging
across processes, but that might not be a concern that other people
share.)

In this particular case, when doing the lazy fetch, we pass a special
flag that skips some steps - among other things, negotiation. Admittedly
we could add this flag to "fetch" (or whatever command we're going to
invoke to implement this sub-process part), or add the equivalent
general-purpose flags/configs (e.g. fetch.negotiationAlgorithm=null)
(but I haven't looked thoroughly at what flags would be needed). These
seem surmountable and I can't think of anything unsurmountable, but I'm
still worried that we're painting ourselves into a corner.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH] Modify fetch-pack to no longer die on error?
  2020-07-28 19:23       ` Jonathan Tan
@ 2020-07-28 20:08         ` Jeff King
  2020-07-29 18:53           ` Jonathan Tan
  2020-07-29 19:02           ` Junio C Hamano
  0 siblings, 2 replies; 57+ messages in thread
From: Jeff King @ 2020-07-28 20:08 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: gitster, git

On Tue, Jul 28, 2020 at 12:23:50PM -0700, Jonathan Tan wrote:

> > For this particular case, with the performance and all, I agree that
> > the stupid and robust approach would be the best.
> 
> I'm concerned that we will be painting ourselves into a corner here - I
> have been appreciating the richer interface that a C call provides us,
> compared to sub-processes where we have to communicate through a single
> input stream and a single output stream. For example, "wanted-refs" and
> [...]

Yeah, that's a compelling reason. I'd have thought for this use case you
could just say "hey, make sure these objects exist", which doesn't
require a lot of communication. But often when I think things like that
and start coding, it turns out to be much more complicated. So I am
perfectly willing to believe you that it doesn't apply here. And even
if it did, you're right that we may run into other spots that do need to
pass back more information, but benefit from more lib-ified code that
doesn't die().

Just to play devil's advocate for a moment...

> (Also, I think that debugging within a process is easier than debugging
> across processes, but that might not be a concern that other people
> share.)

This is definitely true sometimes, but I think is sometimes the
opposite. When we push things out to a sub-process, then the interface
between the two processes has to be well-defined (e.g., writing results
to a file with a particular format). And that can make debugging easier,
because you can pick up from that intermediate state (munging it in the
middle, or even generating it from scratch for testing).

Likewise, that can result in a more flexible and robust system from the
perspective of users. If we had invented "git log" first, we probably
wouldn't have "git rev-list | git diff-tree --stdin" at all. But having
that as two separate tools is sometimes useful for people doing things
_besides_ log, since it gives different entry points to the code.

That said, I think I could buy the argument that "fetch" works pretty
well as a basic building block for users. It's pretty rare to actually
use fetch-pack as a distinct operation. This is all a monolith vs module
tradeoff question, and the tradeoff around modularity for fetch isn't
that compelling.

-Peff

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH] Modify fetch-pack to no longer die on error?
  2020-07-28 20:08         ` Jeff King
@ 2020-07-29 18:53           ` Jonathan Tan
  2020-07-29 19:29             ` Jeff King
  2020-07-29 19:02           ` Junio C Hamano
  1 sibling, 1 reply; 57+ messages in thread
From: Jonathan Tan @ 2020-07-29 18:53 UTC (permalink / raw)
  To: peff; +Cc: jonathantanmy, gitster, git

> On Tue, Jul 28, 2020 at 12:23:50PM -0700, Jonathan Tan wrote:
> 
> > > For this particular case, with the performance and all, I agree that
> > > the stupid and robust approach would be the best.
> > 
> > I'm concerned that we will be painting ourselves into a corner here - I
> > have been appreciating the richer interface that a C call provides us,
> > compared to sub-processes where we have to communicate through a single
> > input stream and a single output stream. For example, "wanted-refs" and
> > [...]
> 
> Yeah, that's a compelling reason. I'd have thought for this use case you
> could just say "hey, make sure these objects exist", which doesn't
> require a lot of communication. But often when I think things like that
> and start coding, it turns out to be much more complicated. So I am
> perfectly willing to believe you that it doesn't apply here. And even
> if it did, you're right that we may run into other spots that do need to
> pass back more information, but benefit from more lib-ified code that
> doesn't die().

Thanks. Just to be clear, I can't think of any hard stops to
implementing lazy fetch in a sub-process right now, but (as you said)
while programming I could discover something, or we could need something
in the future.

> Just to play devil's advocate for a moment...
> 
> > (Also, I think that debugging within a process is easier than debugging
> > across processes, but that might not be a concern that other people
> > share.)
> 
> This is definitely true sometimes, but I think is sometimes the
> opposite. When we push things out to a sub-process, then the interface
> between the two processes has to be well-defined (e.g., writing results
> to a file with a particular format). And that can make debugging easier,
> because you can pick up from that intermediate state (munging it in the
> middle, or even generating it from scratch for testing).

Well, unless there is some sort of interactivity like in remote helpers
:-)

> Likewise, that can result in a more flexible and robust system from the
> perspective of users. If we had invented "git log" first, we probably
> wouldn't have "git rev-list | git diff-tree --stdin" at all. But having
> that as two separate tools is sometimes useful for people doing things
> _besides_ log, since it gives different entry points to the code.

That's true. And the lazy fetch might be one of them - after looking at
the code, I think that I can get to where I want just by implementing a
null negotiator, which will be useful for end users. (In particular,
simulating a lazy fetch might be useful sometimes, and re-fetching a
packfile could be a crude way of repairing object corruption.)

> That said, I think I could buy the argument that "fetch" works pretty
> well as a basic building block for users. It's pretty rare to actually
> use fetch-pack as a distinct operation. This is all a monolith vs module
> tradeoff question, and the tradeoff around modularity for fetch isn't
> that compelling.

If we are going the sub-process route, I was planning to use "fetch" as
the sub-process, actually, not "fetch-pack" - among other things,
"fetch" allows us to specify a fetch negotiator. So it seems like you
are saying that if we had to use "fetch-pack", you have no problem with
libifying it and calling it in the same process, but if we can use
"fetch", we should use it as a sub-process?

In any case, I'll look into using "fetch" as a sub-process and see if it
works.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH] Modify fetch-pack to no longer die on error?
  2020-07-28 20:08         ` Jeff King
  2020-07-29 18:53           ` Jonathan Tan
@ 2020-07-29 19:02           ` Junio C Hamano
  2020-07-29 22:55             ` Jonathan Tan
  1 sibling, 1 reply; 57+ messages in thread
From: Junio C Hamano @ 2020-07-29 19:02 UTC (permalink / raw)
  To: Jeff King; +Cc: Jonathan Tan, git

Jeff King <peff@peff.net> writes:

> Just to play devil's advocate for a moment...
>
>> (Also, I think that debugging within a process is easier than debugging
>> across processes, but that might not be a concern that other people
>> share.)
>
> This is definitely true sometimes, but I think is sometimes the
> opposite. When we push things out to a sub-process, then the interface
> between the two processes has to be well-defined (e.g., writing results
> to a file with a particular format).

I agree that it forces us to be more disciplined to separate them
into two processes and clearly define the way they communicate with
each other.  When done poorly, it tends to make a bigger mess.

> That said, I think I could buy the argument that "fetch" works pretty
> well as a basic building block for users. It's pretty rare to actually
> use fetch-pack as a distinct operation. This is all a monolith vs module
> tradeoff question, and the tradeoff around modularity for fetch isn't
> that compelling.

Yup.  

The "more naive and stupid way to run them as subprocesses, just
like we do for cURL based transport" was primarily because I thought
it would be easier to get right (as the transport API and interface
has long been established) than hunting down all the callers of
die() and touch all the functions involved in the callchain.  I
won't complain or block if a harder-but-more-correct approach is
taken ;-)

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH] Modify fetch-pack to no longer die on error?
  2020-07-29 18:53           ` Jonathan Tan
@ 2020-07-29 19:29             ` Jeff King
  0 siblings, 0 replies; 57+ messages in thread
From: Jeff King @ 2020-07-29 19:29 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: gitster, git

On Wed, Jul 29, 2020 at 11:53:47AM -0700, Jonathan Tan wrote:

> > This is definitely true sometimes, but I think is sometimes the
> > opposite. When we push things out to a sub-process, then the interface
> > between the two processes has to be well-defined (e.g., writing results
> > to a file with a particular format). And that can make debugging easier,
> > because you can pick up from that intermediate state (munging it in the
> > middle, or even generating it from scratch for testing).
> 
> Well, unless there is some sort of interactivity like in remote helpers
> :-)

Yes, debugging remote helpers is its own special hell. :)

> > That said, I think I could buy the argument that "fetch" works pretty
> > well as a basic building block for users. It's pretty rare to actually
> > use fetch-pack as a distinct operation. This is all a monolith vs module
> > tradeoff question, and the tradeoff around modularity for fetch isn't
> > that compelling.
> 
> If we are going the sub-process route, I was planning to use "fetch" as
> the sub-process, actually, not "fetch-pack" - among other things,
> "fetch" allows us to specify a fetch negotiator. So it seems like you
> are saying that if we had to use "fetch-pack", you have no problem with
> libifying it and calling it in the same process, but if we can use
> "fetch", we should use it as a sub-process?

No, I just meant that in general fetching is a monolithic operation from
the users perspective, and we don't often need to break it down further.
So if you have to build more options into "fetch" (that _could_ have
been broken out into separate sub-commands) I don't think anybody will
be sad.

I guess that is kind of orthogonal to your original problem, though,
which is "should fetch be triggered in-process by other processes". So
one could still make the argument that whatever features are needed by
that calling code (e.g., "use a different negotiation scheme") might
also be needed by other (script) callers of git-fetch, so there may be
some benefit to users in making the cross-process interface more rich
(of course in the case of negotiation schemes, it is not "make it more
rich now" but "earlier efforts to make it rich are now paying off").

-Peff

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH] Modify fetch-pack to no longer die on error?
  2020-07-29 19:02           ` Junio C Hamano
@ 2020-07-29 22:55             ` Jonathan Tan
  0 siblings, 0 replies; 57+ messages in thread
From: Jonathan Tan @ 2020-07-29 22:55 UTC (permalink / raw)
  To: gitster; +Cc: peff, jonathantanmy, git

> The "more naive and stupid way to run them as subprocesses, just
> like we do for cURL based transport" was primarily because I thought
> it would be easier to get right (as the transport API and interface
> has long been established) than hunting down all the callers of
> die() and touch all the functions involved in the callchain.  I
> won't complain or block if a harder-but-more-correct approach is
> taken ;-)

Well, I was prepared to do the harder approach (as you can see from my
first email) but it will be harder not just for the author but also for
the reviewer and future readers of the code; now I think that I should
try the subprocess way first, especially since there is another part of
Git that will do that too (the maintenance prefetch [1]). I'll try that
and write back about how it goes.

[1] https://lore.kernel.org/git/3165b8916d2d80bf72dac6596a42c871ccd4cbe6.1595527000.git.gitgitgadget@gmail.com/

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [RFC PATCH 0/7] Lazy fetch with subprocess
  2020-07-24 22:38 [RFC PATCH] Modify fetch-pack to no longer die on error? Jonathan Tan
  2020-07-24 23:07 ` Junio C Hamano
  2020-07-24 23:11 ` Junio C Hamano
@ 2020-08-05  1:20 ` Jonathan Tan
  2020-08-05  1:20   ` [RFC PATCH 1/7] fetch-pack: allow NULL negotiator->add_tip Jonathan Tan
                     ` (6 more replies)
  2020-08-11 22:52 ` [PATCH v2 0/7] Lazy fetch with subprocess Jonathan Tan
  2020-08-18  4:01 ` [PATCH v3 " Jonathan Tan
  4 siblings, 7 replies; 57+ messages in thread
From: Jonathan Tan @ 2020-08-05  1:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, peff, gitster

Here's some code that performs the lazy fetch in a subprocess. It was
more involved than I thought it would be, but here it is.

My main concern is the user-facing interface for the null fetch
negotiator. It not only does not negotiate at all but affects the
everything_local() check, so I wonder if it should be exposed
differently. More information is in the commit message for the 2nd
patch.

I know that the following still needs to be done, but I thought I'd send
out the patches for early feedback first since the main design and code
is already done:

 - Commit messages
 - User-facing documentation
 - A way to prevent a promisor-remote fetch from invoking another
   promisor-remote fetch (use a file as a lock?)
 - Remove no_dependents code (fetch-pack, transport)

Jonathan Tan (7):
  fetch-pack: allow NULL negotiator->add_tip
  fetch-pack: allow NULL negotiator->known_common
  negotiator/null: add null fetch negotiator
  fetch: --stdin
  fetch: submodule config
  fetch: only populate existing_refs if needed
  promisor-remote: use subprocess to fetch

 Documentation/config/fetch.txt   |  5 +++-
 Makefile                         |  1 +
 builtin/fetch.c                  | 42 +++++++++++++++++++++++------
 fetch-negotiator.c               |  5 ++++
 fetch-pack.c                     | 23 +++++++++++-----
 negotiator/null.c                | 34 +++++++++++++++++++++++
 negotiator/null.h                |  8 ++++++
 promisor-remote.c                | 46 +++++++++++++++-----------------
 repo-settings.c                  |  2 ++
 repository.h                     |  1 +
 submodule-config.c               |  5 ++--
 t/t0410-partial-clone.sh         |  2 +-
 t/t4067-diff-partial-clone.sh    |  8 +++---
 t/t5554-null-fetch-negotiator.sh | 22 +++++++++++++++
 t/t5601-clone.sh                 |  2 +-
 15 files changed, 157 insertions(+), 49 deletions(-)
 create mode 100644 negotiator/null.c
 create mode 100644 negotiator/null.h
 create mode 100755 t/t5554-null-fetch-negotiator.sh

-- 
2.28.0.163.g6104cc2f0b6-goog


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [RFC PATCH 1/7] fetch-pack: allow NULL negotiator->add_tip
  2020-08-05  1:20 ` [RFC PATCH 0/7] Lazy fetch with subprocess Jonathan Tan
@ 2020-08-05  1:20   ` Jonathan Tan
  2020-08-05 19:53     ` Junio C Hamano
  2020-08-05  1:20   ` [RFC PATCH 2/7] fetch-pack: allow NULL negotiator->known_common Jonathan Tan
                     ` (5 subsequent siblings)
  6 siblings, 1 reply; 57+ messages in thread
From: Jonathan Tan @ 2020-08-05  1:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, peff, gitster

In a subsequent patch, a null fetch negotiator will be introduced. This
negotiator, among other things, will not need any tip information and
will have a NULL add_tip.

Teach fetch-pack to allow this, and to take advantage of this by
avoiding a ref and alternate traversal when it can. To do this, this
patch combines all invocations of functions that invoke add_tip into one
function, and that function first checks whether add_tip is NULL.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 fetch-pack.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index 80fb3bd899..6c786f5970 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -240,6 +240,16 @@ static void mark_tips(struct fetch_negotiator *negotiator,
 	return;
 }
 
+static void add_tips_and_alternates(struct fetch_negotiator *negotiator,
+				    const struct oid_array *negotiation_tips)
+{
+	if (!negotiator->add_tip)
+		return;
+
+	mark_tips(negotiator, negotiation_tips);
+	for_each_cached_alternate(negotiator, insert_one_alternate_object);
+}
+
 static int find_common(struct fetch_negotiator *negotiator,
 		       struct fetch_pack_args *args,
 		       int fd[2], struct object_id *result_oid,
@@ -262,10 +272,8 @@ static int find_common(struct fetch_negotiator *negotiator,
 			   PACKET_READ_CHOMP_NEWLINE |
 			   PACKET_READ_DIE_ON_ERR_PACKET);
 
-	if (!args->no_dependents) {
-		mark_tips(negotiator, args->negotiation_tips);
-		for_each_cached_alternate(negotiator, insert_one_alternate_object);
-	}
+	if (!args->no_dependents)
+		add_tips_and_alternates(negotiator, args->negotiation_tips);
 
 	fetching = 0;
 	for ( ; refs ; refs = refs->next) {
@@ -1575,9 +1583,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 				else
 					state = FETCH_SEND_REQUEST;
 
-				mark_tips(negotiator, args->negotiation_tips);
-				for_each_cached_alternate(negotiator,
-							  insert_one_alternate_object);
+				add_tips_and_alternates(negotiator, args->negotiation_tips);
 			} else {
 				filter_refs(args, &ref, sought, nr_sought);
 				state = FETCH_SEND_REQUEST;
-- 
2.28.0.163.g6104cc2f0b6-goog


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [RFC PATCH 2/7] fetch-pack: allow NULL negotiator->known_common
  2020-08-05  1:20 ` [RFC PATCH 0/7] Lazy fetch with subprocess Jonathan Tan
  2020-08-05  1:20   ` [RFC PATCH 1/7] fetch-pack: allow NULL negotiator->add_tip Jonathan Tan
@ 2020-08-05  1:20   ` Jonathan Tan
  2020-08-05 20:08     ` Junio C Hamano
  2020-08-05  1:20   ` [RFC PATCH 3/7] negotiator/null: add null fetch negotiator Jonathan Tan
                     ` (4 subsequent siblings)
  6 siblings, 1 reply; 57+ messages in thread
From: Jonathan Tan @ 2020-08-05  1:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, peff, gitster

In a subsequent patch, a null fetch negotiator will be introduced. This
negotiator, among other things, will not need any information about
common objects and will have a NULL known_common. Teach fetch-pack to
allow this.

[NEEDSWORK]
Optimizing out the ref iteration also affects the execution
of everything_local(), which relies on COMPLETE being set. (Having said
that, the typical use case - lazy fetching - would be fine with
everything_local() always returning that not everything is local.)

This optimization is needed so that in the future, fetch_pack() can be
used to lazily fetch in a partial clone (without the no_dependents
flag). This means that fetch_pack() needs a way to execute without
relying on any targets of refs being present, and thus it cannot use the
ref iterator (because it checks and lazy-fetches any missing targets).
(Git currently does not have this problem because we use the
no_dependents flag, but lazy-fetching will in a subsequent patch be
changed to use the user-facing fetch command, which does not use this
flag.)
[/NEEDSWORK]

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 fetch-pack.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fetch-pack.c b/fetch-pack.c
index 6c786f5970..5f5474dbed 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -677,6 +677,9 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
 	int old_save_commit_buffer = save_commit_buffer;
 	timestamp_t cutoff = 0;
 
+	if (!negotiator->known_common)
+		return;
+
 	save_commit_buffer = 0;
 
 	trace2_region_enter("fetch-pack", "parse_remote_refs_and_find_cutoff", NULL);
-- 
2.28.0.163.g6104cc2f0b6-goog


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [RFC PATCH 3/7] negotiator/null: add null fetch negotiator
  2020-08-05  1:20 ` [RFC PATCH 0/7] Lazy fetch with subprocess Jonathan Tan
  2020-08-05  1:20   ` [RFC PATCH 1/7] fetch-pack: allow NULL negotiator->add_tip Jonathan Tan
  2020-08-05  1:20   ` [RFC PATCH 2/7] fetch-pack: allow NULL negotiator->known_common Jonathan Tan
@ 2020-08-05  1:20   ` Jonathan Tan
  2020-08-05  1:20   ` [RFC PATCH 4/7] fetch: --stdin Jonathan Tan
                     ` (3 subsequent siblings)
  6 siblings, 0 replies; 57+ messages in thread
From: Jonathan Tan @ 2020-08-05  1:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, peff, gitster

Add a null fetch negotiator. This is introduced to allow partial clones
to perform lazy fetches (which do not need negotiation at all) by
spawning a subprocess (to be done in a subsequent patch), but can be
useful for end users, e.g. as a blunt fix for object corruption.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 Documentation/config/fetch.txt   |  5 ++++-
 Makefile                         |  1 +
 fetch-negotiator.c               |  5 +++++
 negotiator/null.c                | 34 ++++++++++++++++++++++++++++++++
 negotiator/null.h                |  8 ++++++++
 repo-settings.c                  |  2 ++
 repository.h                     |  1 +
 t/t5554-null-fetch-negotiator.sh | 22 +++++++++++++++++++++
 8 files changed, 77 insertions(+), 1 deletion(-)
 create mode 100644 negotiator/null.c
 create mode 100644 negotiator/null.h
 create mode 100755 t/t5554-null-fetch-negotiator.sh

diff --git a/Documentation/config/fetch.txt b/Documentation/config/fetch.txt
index 0aaa05e8c0..09aff404be 100644
--- a/Documentation/config/fetch.txt
+++ b/Documentation/config/fetch.txt
@@ -60,7 +60,10 @@ fetch.negotiationAlgorithm::
 	sent when negotiating the contents of the packfile to be sent by the
 	server. Set to "skipping" to use an algorithm that skips commits in an
 	effort to converge faster, but may result in a larger-than-necessary
-	packfile; The default is "default" which instructs Git to use the default algorithm
+	packfile; or set to "null" to not send any information at all, which
+	will almost certainly result in a larger-than-necessary packfile, but
+	will skip the negotiation step.
+	The default is "default" which instructs Git to use the default algorithm
 	that never skips commits (unless the server has acknowledged it or one
 	of its descendants). If `feature.experimental` is enabled, then this
 	setting defaults to "skipping".
diff --git a/Makefile b/Makefile
index 372139f1f2..297ea5e517 100644
--- a/Makefile
+++ b/Makefile
@@ -917,6 +917,7 @@ LIB_OBJS += mergesort.o
 LIB_OBJS += midx.o
 LIB_OBJS += name-hash.o
 LIB_OBJS += negotiator/default.o
+LIB_OBJS += negotiator/null.o
 LIB_OBJS += negotiator/skipping.o
 LIB_OBJS += notes-cache.o
 LIB_OBJS += notes-merge.o
diff --git a/fetch-negotiator.c b/fetch-negotiator.c
index 0a1357dc9d..e2ecbe4367 100644
--- a/fetch-negotiator.c
+++ b/fetch-negotiator.c
@@ -2,6 +2,7 @@
 #include "fetch-negotiator.h"
 #include "negotiator/default.h"
 #include "negotiator/skipping.h"
+#include "negotiator/null.h"
 #include "repository.h"
 
 void fetch_negotiator_init(struct repository *r,
@@ -13,6 +14,10 @@ void fetch_negotiator_init(struct repository *r,
 		skipping_negotiator_init(negotiator);
 		return;
 
+	case FETCH_NEGOTIATION_NULL:
+		null_negotiator_init(negotiator);
+		return;
+
 	case FETCH_NEGOTIATION_DEFAULT:
 	default:
 		default_negotiator_init(negotiator);
diff --git a/negotiator/null.c b/negotiator/null.c
new file mode 100644
index 0000000000..db63ce66c2
--- /dev/null
+++ b/negotiator/null.c
@@ -0,0 +1,34 @@
+#include "cache.h"
+#include "null.h"
+#include "../commit.h"
+#include "../fetch-negotiator.h"
+
+static const struct object_id *next(struct fetch_negotiator *n)
+{
+	return NULL;
+}
+
+static int ack(struct fetch_negotiator *n, struct commit *c)
+{
+	/*
+	 * This negotiator does not emit any commits, so there is no commit to
+	 * be acknowledged. If there is any ack, there is a bug.
+	 */
+	BUG("ack with null negotiator, which does not emit any commits");
+	return 0;
+}
+
+static void release(struct fetch_negotiator *n)
+{
+	/* nothing to release */
+}
+
+void null_negotiator_init(struct fetch_negotiator *negotiator)
+{
+	negotiator->known_common = NULL;
+	negotiator->add_tip = NULL;
+	negotiator->next = next;
+	negotiator->ack = ack;
+	negotiator->release = release;
+	negotiator->data = NULL;
+}
diff --git a/negotiator/null.h b/negotiator/null.h
new file mode 100644
index 0000000000..96713f280e
--- /dev/null
+++ b/negotiator/null.h
@@ -0,0 +1,8 @@
+#ifndef NEGOTIATOR_NULL_H
+#define NEGOTIATOR_NULL_H
+
+struct fetch_negotiator;
+
+void null_negotiator_init(struct fetch_negotiator *negotiator);
+
+#endif
diff --git a/repo-settings.c b/repo-settings.c
index 0918408b34..a8c7e1edd7 100644
--- a/repo-settings.c
+++ b/repo-settings.c
@@ -39,6 +39,8 @@ void prepare_repo_settings(struct repository *r)
 	if (!repo_config_get_string(r, "fetch.negotiationalgorithm", &strval)) {
 		if (!strcasecmp(strval, "skipping"))
 			r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_SKIPPING;
+		else if (!strcasecmp(strval, "null"))
+			r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_NULL;
 		else
 			r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_DEFAULT;
 	}
diff --git a/repository.h b/repository.h
index 3c1f7d54bd..f72911a185 100644
--- a/repository.h
+++ b/repository.h
@@ -23,6 +23,7 @@ enum fetch_negotiation_setting {
 	FETCH_NEGOTIATION_NONE = 0,
 	FETCH_NEGOTIATION_DEFAULT = 1,
 	FETCH_NEGOTIATION_SKIPPING = 2,
+	FETCH_NEGOTIATION_NULL = 3,
 };
 
 struct repo_settings {
diff --git a/t/t5554-null-fetch-negotiator.sh b/t/t5554-null-fetch-negotiator.sh
new file mode 100755
index 0000000000..09a8f0d608
--- /dev/null
+++ b/t/t5554-null-fetch-negotiator.sh
@@ -0,0 +1,22 @@
+#!/bin/sh
+
+test_description='test null fetch negotiator'
+. ./test-lib.sh
+
+test_expect_success 'null negotiator does not emit any "have"' '
+	rm -f trace &&
+
+	test_create_repo server &&
+	test_commit -C server to_fetch &&
+
+	test_create_repo client &&
+	test_commit -C client we_have &&
+
+	test_config -C client fetch.negotiationalgorithm null &&
+	GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch "$(pwd)/server" &&
+
+	! grep "fetch> have" trace &&
+	grep "fetch> done" trace
+'
+
+test_done
-- 
2.28.0.163.g6104cc2f0b6-goog


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [RFC PATCH 4/7] fetch: --stdin
  2020-08-05  1:20 ` [RFC PATCH 0/7] Lazy fetch with subprocess Jonathan Tan
                     ` (2 preceding siblings ...)
  2020-08-05  1:20   ` [RFC PATCH 3/7] negotiator/null: add null fetch negotiator Jonathan Tan
@ 2020-08-05  1:20   ` Jonathan Tan
  2020-08-05 20:33     ` Junio C Hamano
  2020-08-05  1:20   ` [RFC PATCH 5/7] fetch: submodule config Jonathan Tan
                     ` (2 subsequent siblings)
  6 siblings, 1 reply; 57+ messages in thread
From: Jonathan Tan @ 2020-08-05  1:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, peff, gitster

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 builtin/fetch.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 3ccf69753f..a5498646bf 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -80,6 +80,7 @@ static struct list_objects_filter_options filter_options;
 static struct string_list server_options = STRING_LIST_INIT_DUP;
 static struct string_list negotiation_tip = STRING_LIST_INIT_NODUP;
 static int fetch_write_commit_graph = -1;
+static int stdin_refspecs = 0;
 
 static int git_fetch_config(const char *k, const char *v, void *cb)
 {
@@ -209,6 +210,8 @@ static struct option builtin_fetch_options[] = {
 		 N_("check for forced-updates on all updated branches")),
 	OPT_BOOL(0, "write-commit-graph", &fetch_write_commit_graph,
 		 N_("write the commit-graph after fetching")),
+	OPT_BOOL(0, "stdin", &stdin_refspecs,
+		 N_("accept refspecs from stdin")),
 	OPT_END()
 };
 
@@ -1684,7 +1687,8 @@ static inline void fetch_one_setup_partial(struct remote *remote)
 	return;
 }
 
-static int fetch_one(struct remote *remote, int argc, const char **argv, int prune_tags_ok)
+static int fetch_one(struct remote *remote, int argc, const char **argv,
+		     int prune_tags_ok, int use_stdin_refspecs)
 {
 	struct refspec rs = REFSPEC_INIT_FETCH;
 	int i;
@@ -1741,6 +1745,13 @@ static int fetch_one(struct remote *remote, int argc, const char **argv, int pru
 		}
 	}
 
+	if (use_stdin_refspecs) {
+		struct strbuf line = STRBUF_INIT;
+		while (strbuf_getline_lf(&line, stdin) != EOF)
+			refspec_append(&rs, line.buf);
+		strbuf_release(&line);
+	}
+
 	if (server_options.nr)
 		gtransport->server_options = &server_options;
 
@@ -1841,7 +1852,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 	if (remote) {
 		if (filter_options.choice || has_promisor_remote())
 			fetch_one_setup_partial(remote);
-		result = fetch_one(remote, argc, argv, prune_tags_ok);
+		result = fetch_one(remote, argc, argv, prune_tags_ok, stdin_refspecs);
 	} else {
 		int max_children = max_jobs;
 
@@ -1849,6 +1860,10 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 			die(_("--filter can only be used with the remote "
 			      "configured in extensions.partialclone"));
 
+		if (stdin_refspecs)
+			die(_("--stdin can only be used when fetching "
+			      "from one remote"));
+
 		if (max_children < 0)
 			max_children = fetch_parallel_config;
 
-- 
2.28.0.163.g6104cc2f0b6-goog


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [RFC PATCH 5/7] fetch: submodule config
  2020-08-05  1:20 ` [RFC PATCH 0/7] Lazy fetch with subprocess Jonathan Tan
                     ` (3 preceding siblings ...)
  2020-08-05  1:20   ` [RFC PATCH 4/7] fetch: --stdin Jonathan Tan
@ 2020-08-05  1:20   ` Jonathan Tan
  2020-08-05  1:20   ` [RFC PATCH 6/7] fetch: only populate existing_refs if needed Jonathan Tan
  2020-08-05  1:20   ` [RFC PATCH 7/7] promisor-remote: use subprocess to fetch Jonathan Tan
  6 siblings, 0 replies; 57+ messages in thread
From: Jonathan Tan @ 2020-08-05  1:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, peff, gitster

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 builtin/fetch.c    | 10 ++++++++--
 submodule-config.c |  5 +++--
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index a5498646bf..29db219c68 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1786,12 +1786,18 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 		free(anon);
 	}
 
-	fetch_config_from_gitmodules(&submodule_fetch_jobs_config,
-				     &recurse_submodules);
 	git_config(git_fetch_config, NULL);
 
 	argc = parse_options(argc, argv, prefix,
 			     builtin_fetch_options, builtin_fetch_usage, 0);
+	if (recurse_submodules != RECURSE_SUBMODULES_OFF) {
+		int *sfjc = submodule_fetch_jobs_config == -1
+			    ? &submodule_fetch_jobs_config : NULL;
+		int *rs = recurse_submodules == RECURSE_SUBMODULES_DEFAULT
+			  ? &recurse_submodules : NULL;
+
+		fetch_config_from_gitmodules(sfjc, rs);
+	}
 
 	if (deepen_relative) {
 		if (deepen_relative < 0)
diff --git a/submodule-config.c b/submodule-config.c
index e175dfbc38..8d65273ed2 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -776,10 +776,11 @@ struct fetch_config {
 static int gitmodules_fetch_config(const char *var, const char *value, void *cb)
 {
 	struct fetch_config *config = cb;
-	if (!strcmp(var, "submodule.fetchjobs")) {
+	if (!strcmp(var, "submodule.fetchjobs") && config->max_children) {
 		*(config->max_children) = parse_submodule_fetchjobs(var, value);
 		return 0;
-	} else if (!strcmp(var, "fetch.recursesubmodules")) {
+	} else if (!strcmp(var, "fetch.recursesubmodules") &&
+		   config->recurse_submodules) {
 		*(config->recurse_submodules) = parse_fetch_recurse_submodules_arg(var, value);
 		return 0;
 	}
-- 
2.28.0.163.g6104cc2f0b6-goog


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [RFC PATCH 6/7] fetch: only populate existing_refs if needed
  2020-08-05  1:20 ` [RFC PATCH 0/7] Lazy fetch with subprocess Jonathan Tan
                     ` (4 preceding siblings ...)
  2020-08-05  1:20   ` [RFC PATCH 5/7] fetch: submodule config Jonathan Tan
@ 2020-08-05  1:20   ` Jonathan Tan
  2020-08-05  1:20   ` [RFC PATCH 7/7] promisor-remote: use subprocess to fetch Jonathan Tan
  6 siblings, 0 replies; 57+ messages in thread
From: Jonathan Tan @ 2020-08-05  1:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, peff, gitster

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 builtin/fetch.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 29db219c68..6460ce3f4e 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -449,6 +449,7 @@ static struct ref *get_ref_map(struct remote *remote,
 	struct ref *orefs = NULL, **oref_tail = &orefs;
 
 	struct hashmap existing_refs;
+	int existing_refs_populated = 0;
 
 	if (rs->nr) {
 		struct refspec *fetch_refspec;
@@ -542,15 +543,18 @@ static struct ref *get_ref_map(struct remote *remote,
 
 	ref_map = ref_remove_duplicates(ref_map);
 
-	refname_hash_init(&existing_refs);
-	for_each_ref(add_one_refname, &existing_refs);
-
 	for (rm = ref_map; rm; rm = rm->next) {
 		if (rm->peer_ref) {
 			const char *refname = rm->peer_ref->name;
 			struct refname_hash_entry *peer_item;
 			unsigned int hash = strhash(refname);
 
+			if (!existing_refs_populated) {
+				refname_hash_init(&existing_refs);
+				for_each_ref(add_one_refname, &existing_refs);
+				existing_refs_populated = 1;
+			}
+
 			peer_item = hashmap_get_entry_from_hash(&existing_refs,
 						hash, refname,
 						struct refname_hash_entry, ent);
@@ -560,7 +564,8 @@ static struct ref *get_ref_map(struct remote *remote,
 			}
 		}
 	}
-	hashmap_free_entries(&existing_refs, struct refname_hash_entry, ent);
+	if (existing_refs_populated)
+		hashmap_free_entries(&existing_refs, struct refname_hash_entry, ent);
 
 	return ref_map;
 }
-- 
2.28.0.163.g6104cc2f0b6-goog


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [RFC PATCH 7/7] promisor-remote: use subprocess to fetch
  2020-08-05  1:20 ` [RFC PATCH 0/7] Lazy fetch with subprocess Jonathan Tan
                     ` (5 preceding siblings ...)
  2020-08-05  1:20   ` [RFC PATCH 6/7] fetch: only populate existing_refs if needed Jonathan Tan
@ 2020-08-05  1:20   ` Jonathan Tan
  6 siblings, 0 replies; 57+ messages in thread
From: Jonathan Tan @ 2020-08-05  1:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, peff, gitster

---
 promisor-remote.c             | 46 ++++++++++++++++-------------------
 t/t0410-partial-clone.sh      |  2 +-
 t/t4067-diff-partial-clone.sh |  8 +++---
 t/t5601-clone.sh              |  2 +-
 4 files changed, 27 insertions(+), 31 deletions(-)

diff --git a/promisor-remote.c b/promisor-remote.c
index baaea12fd6..6e647610e9 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -3,6 +3,7 @@
 #include "promisor-remote.h"
 #include "config.h"
 #include "transport.h"
+#include "argv-array.h"
 
 static char *repository_format_partial_clone;
 static const char *core_partial_clone_filter_default;
@@ -12,39 +13,34 @@ void set_repository_format_partial_clone(char *partial_clone)
 	repository_format_partial_clone = xstrdup_or_null(partial_clone);
 }
 
-static int fetch_refs(const char *remote_name, struct ref *ref)
-{
-	struct remote *remote;
-	struct transport *transport;
-	int res;
-
-	remote = remote_get(remote_name);
-	if (!remote->url[0])
-		die(_("Remote with no URL"));
-	transport = transport_get(remote, remote->url[0]);
-
-	transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
-	transport_set_option(transport, TRANS_OPT_NO_DEPENDENTS, "1");
-	res = transport_fetch_refs(transport, ref);
-
-	return res;
-}
-
 static int fetch_objects(const char *remote_name,
 			 const struct object_id *oids,
 			 int oid_nr)
 {
-	struct ref *ref = NULL;
+	struct child_process child = CHILD_PROCESS_INIT;
 	int i;
+	FILE *child_in;
+
+	child.git_cmd = 1;
+	child.in = -1;
+	argv_array_pushl(&child.args, "-c", "fetch.negotiationAlgorithm=null",
+			 "fetch", remote_name, "--no-tags",
+			 "--no-write-fetch-head", "--recurse-submodules=no",
+			 "--filter=blob:none", "--stdin", NULL);
+	if (start_command(&child))
+		die(_("promisor-remote: unable to fork off fetch subprocess"));
+	child_in = xfdopen(child.in, "w");
 
 	for (i = 0; i < oid_nr; i++) {
-		struct ref *new_ref = alloc_ref(oid_to_hex(&oids[i]));
-		oidcpy(&new_ref->old_oid, &oids[i]);
-		new_ref->exact_oid = 1;
-		new_ref->next = ref;
-		ref = new_ref;
+		if (fputs(oid_to_hex(&oids[i]), child_in) < 0)
+			die_errno(_("promisor-remote: could not write to fetch subprocess"));
+		if (fputc('\n', child_in) < 0)
+			die_errno(_("promisor-remote: could not write to fetch subprocess"));
 	}
-	return fetch_refs(remote_name, ref);
+
+	if (fclose(child_in) < 0)
+		die_errno(_("promisor-remote: could not close stdin to fetch subprocess"));
+	return finish_command(&child) ? -1 : 0;
 }
 
 static struct promisor_remote *promisors;
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index 463dc3a8be..3e454f934e 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -203,7 +203,7 @@ test_expect_success 'fetching of missing objects works with ref-in-want enabled'
 	rm -rf repo/.git/objects/* &&
 	rm -f trace &&
 	GIT_TRACE_PACKET="$(pwd)/trace" git -C repo cat-file -p "$HASH" &&
-	grep "git< fetch=.*ref-in-want" trace
+	grep "fetch< fetch=.*ref-in-want" trace
 '
 
 test_expect_success 'fetching of missing objects from another promisor remote' '
diff --git a/t/t4067-diff-partial-clone.sh b/t/t4067-diff-partial-clone.sh
index ef8e0e9cb0..804f2a82e8 100755
--- a/t/t4067-diff-partial-clone.sh
+++ b/t/t4067-diff-partial-clone.sh
@@ -20,7 +20,7 @@ test_expect_success 'git show batches blobs' '
 	# Ensure that there is exactly 1 negotiation by checking that there is
 	# only 1 "done" line sent. ("done" marks the end of negotiation.)
 	GIT_TRACE_PACKET="$(pwd)/trace" git -C client show HEAD &&
-	grep "git> done" trace >done_lines &&
+	grep "fetch> done" trace >done_lines &&
 	test_line_count = 1 done_lines
 '
 
@@ -44,7 +44,7 @@ test_expect_success 'diff batches blobs' '
 	# Ensure that there is exactly 1 negotiation by checking that there is
 	# only 1 "done" line sent. ("done" marks the end of negotiation.)
 	GIT_TRACE_PACKET="$(pwd)/trace" git -C client diff HEAD^ HEAD &&
-	grep "git> done" trace >done_lines &&
+	grep "fetch> done" trace >done_lines &&
 	test_line_count = 1 done_lines
 '
 
@@ -127,7 +127,7 @@ test_expect_success 'diff with rename detection batches blobs' '
 	# only 1 "done" line sent. ("done" marks the end of negotiation.)
 	GIT_TRACE_PACKET="$(pwd)/trace" git -C client diff --raw -M HEAD^ HEAD >out &&
 	grep ":100644 100644.*R[0-9][0-9][0-9].*b.*c" out &&
-	grep "git> done" trace >done_lines &&
+	grep "fetch> done" trace >done_lines &&
 	test_line_count = 1 done_lines
 '
 
@@ -175,7 +175,7 @@ test_expect_success 'diff --break-rewrites fetches only if necessary, and batche
 	# by checking that there is only 1 "done" line sent. ("done" marks the
 	# end of negotiation.)
 	GIT_TRACE_PACKET="$(pwd)/trace" git -C client diff --break-rewrites --raw -M HEAD^ HEAD &&
-	grep "git> done" trace >done_lines &&
+	grep "fetch> done" trace >done_lines &&
 	test_line_count = 1 done_lines
 '
 
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 84ea2a3eb7..f82b0dbb5a 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -702,7 +702,7 @@ test_expect_success 'batch missing blob request during checkout' '
 	# Ensure that there is only one negotiation by checking that there is
 	# only "done" line sent. ("done" marks the end of negotiation.)
 	GIT_TRACE_PACKET="$(pwd)/trace" git -C client checkout HEAD^ &&
-	grep "git> done" trace >done_lines &&
+	grep "fetch> done" trace >done_lines &&
 	test_line_count = 1 done_lines
 '
 
-- 
2.28.0.163.g6104cc2f0b6-goog


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 1/7] fetch-pack: allow NULL negotiator->add_tip
  2020-08-05  1:20   ` [RFC PATCH 1/7] fetch-pack: allow NULL negotiator->add_tip Jonathan Tan
@ 2020-08-05 19:53     ` Junio C Hamano
  2020-08-07 20:53       ` Jonathan Tan
  0 siblings, 1 reply; 57+ messages in thread
From: Junio C Hamano @ 2020-08-05 19:53 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, peff

Jonathan Tan <jonathantanmy@google.com> writes:

> In a subsequent patch, a null fetch negotiator will be introduced. This
> negotiator, among other things, will not need any tip information and
> will have a NULL add_tip.

After reading the original code that the patch touches, I know what
"a NULL add_tip" is, but it would have been nicer to readers if you
said "callback" or "method" or "function".

With or without the "if add_tip method is set to NULL, return
without doing anything" optimization, it does make sense to refactor
two existing places that do identical things.  

If all the current negotiators never have NULL in the .add_tip
field, however, I'd suspect that it would make it easier to
understand if this step is explained as refactoring duplicated code
into a single helper _without_ the early-return for optimization,
and then the step that introduces the negotiator that does want to
lack add_tip method to add the early return.  After all, if the null
negotiator added a pointer to a no-op function to its .add_tip field,
things would still work correctly without the early-return in the
new helper function, no?

> Teach fetch-pack to allow this, and to take advantage of this by
> avoiding a ref and alternate traversal when it can. To do this, this
> patch combines all invocations of functions that invoke add_tip into one
> function, and that function first checks whether add_tip is NULL.
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
>  fetch-pack.c | 20 +++++++++++++-------
>  1 file changed, 13 insertions(+), 7 deletions(-)



>
> diff --git a/fetch-pack.c b/fetch-pack.c
> index 80fb3bd899..6c786f5970 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -240,6 +240,16 @@ static void mark_tips(struct fetch_negotiator *negotiator,
>  	return;
>  }
>  
> +static void add_tips_and_alternates(struct fetch_negotiator *negotiator,
> +				    const struct oid_array *negotiation_tips)
> +{
> +	if (!negotiator->add_tip)
> +		return;
> +
> +	mark_tips(negotiator, negotiation_tips);
> +	for_each_cached_alternate(negotiator, insert_one_alternate_object);
> +}
> +
>  static int find_common(struct fetch_negotiator *negotiator,
>  		       struct fetch_pack_args *args,
>  		       int fd[2], struct object_id *result_oid,
> @@ -262,10 +272,8 @@ static int find_common(struct fetch_negotiator *negotiator,
>  			   PACKET_READ_CHOMP_NEWLINE |
>  			   PACKET_READ_DIE_ON_ERR_PACKET);
>  
> -	if (!args->no_dependents) {
> -		mark_tips(negotiator, args->negotiation_tips);
> -		for_each_cached_alternate(negotiator, insert_one_alternate_object);
> -	}
> +	if (!args->no_dependents)
> +		add_tips_and_alternates(negotiator, args->negotiation_tips);
>  
>  	fetching = 0;
>  	for ( ; refs ; refs = refs->next) {
> @@ -1575,9 +1583,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
>  				else
>  					state = FETCH_SEND_REQUEST;
>  
> -				mark_tips(negotiator, args->negotiation_tips);
> -				for_each_cached_alternate(negotiator,
> -							  insert_one_alternate_object);
> +				add_tips_and_alternates(negotiator, args->negotiation_tips);
>  			} else {
>  				filter_refs(args, &ref, sought, nr_sought);
>  				state = FETCH_SEND_REQUEST;

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 2/7] fetch-pack: allow NULL negotiator->known_common
  2020-08-05  1:20   ` [RFC PATCH 2/7] fetch-pack: allow NULL negotiator->known_common Jonathan Tan
@ 2020-08-05 20:08     ` Junio C Hamano
  2020-08-05 22:11       ` Junio C Hamano
  0 siblings, 1 reply; 57+ messages in thread
From: Junio C Hamano @ 2020-08-05 20:08 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, peff

Jonathan Tan <jonathantanmy@google.com> writes:

> In a subsequent patch, a null fetch negotiator will be introduced. This
> negotiator, among other things, will not need any information about
> common objects and will have a NULL known_common. Teach fetch-pack to
> allow this.

Hmph, both the default and the skipping negotiator seem to put NULL
in known_common and add_tip when its next() method is called.  Also
they clear known_common to NULL after add_tip is called even once.

So, how have we survived so far without this patch to "allow this
(i.e.  known_common method to be NULL)"?  Is there something that
makes sure a negotiator never gets called from this function after
its .next or .add_tip method is called?

Puzzled.  Or is this merely an optimization?  If so, it's not like
the change "allows this", but it starts to take advantage of it in
some way.

	... goes and looks at mark_complete_and_common_ref()

The function seems to have an unconditional call to ->known_common(),
so anybody passing a negotiator whose known_common is NULL would
already be segfaulting, so this does not appear to be an optimization
but necessary to keep the code from crashing.  I cannot quite tell
if it is avoiding unnecessary work, or sweeping crashes under the
rug, though.  

Is the untold assumption that mark_complete_and_common_ref() will
never be called after either mark_tips() or find_common() have been
called?

Thanks.

> [NEEDSWORK]
> Optimizing out the ref iteration also affects the execution
> of everything_local(), which relies on COMPLETE being set. (Having said
> that, the typical use case - lazy fetching - would be fine with
> everything_local() always returning that not everything is local.)
>
> This optimization is needed so that in the future, fetch_pack() can be
> used to lazily fetch in a partial clone (without the no_dependents
> flag). This means that fetch_pack() needs a way to execute without
> relying on any targets of refs being present, and thus it cannot use the
> ref iterator (because it checks and lazy-fetches any missing targets).
> (Git currently does not have this problem because we use the
> no_dependents flag, but lazy-fetching will in a subsequent patch be
> changed to use the user-facing fetch command, which does not use this
> flag.)
> [/NEEDSWORK]
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
>  fetch-pack.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/fetch-pack.c b/fetch-pack.c
> index 6c786f5970..5f5474dbed 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -677,6 +677,9 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
>  	int old_save_commit_buffer = save_commit_buffer;
>  	timestamp_t cutoff = 0;
>  
> +	if (!negotiator->known_common)
> +		return;
> +
>  	save_commit_buffer = 0;
>  
>  	trace2_region_enter("fetch-pack", "parse_remote_refs_and_find_cutoff", NULL);

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 4/7] fetch: --stdin
  2020-08-05  1:20   ` [RFC PATCH 4/7] fetch: --stdin Jonathan Tan
@ 2020-08-05 20:33     ` Junio C Hamano
  2020-08-07 21:10       ` Jonathan Tan
  2020-08-07 21:10       ` Jonathan Tan
  0 siblings, 2 replies; 57+ messages in thread
From: Junio C Hamano @ 2020-08-05 20:33 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, peff

Jonathan Tan <jonathantanmy@google.com> writes:

> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
>  builtin/fetch.c | 19 +++++++++++++++++--
>  1 file changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 3ccf69753f..a5498646bf 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -80,6 +80,7 @@ static struct list_objects_filter_options filter_options;
>  static struct string_list server_options = STRING_LIST_INIT_DUP;
>  static struct string_list negotiation_tip = STRING_LIST_INIT_NODUP;
>  static int fetch_write_commit_graph = -1;
> +static int stdin_refspecs = 0;

Don't initialize statics to 0 (leave that to BSS).

> @@ -209,6 +210,8 @@ static struct option builtin_fetch_options[] = {
>  		 N_("check for forced-updates on all updated branches")),
>  	OPT_BOOL(0, "write-commit-graph", &fetch_write_commit_graph,
>  		 N_("write the commit-graph after fetching")),
> +	OPT_BOOL(0, "stdin", &stdin_refspecs,
> +		 N_("accept refspecs from stdin")),
>  	OPT_END()
>  };
>  
> @@ -1684,7 +1687,8 @@ static inline void fetch_one_setup_partial(struct remote *remote)
>  	return;
>  }
>  
> -static int fetch_one(struct remote *remote, int argc, const char **argv, int prune_tags_ok)
> +static int fetch_one(struct remote *remote, int argc, const char **argv,
> +		     int prune_tags_ok, int use_stdin_refspecs)
>  {
>  	struct refspec rs = REFSPEC_INIT_FETCH;
>  	int i;
> @@ -1741,6 +1745,13 @@ static int fetch_one(struct remote *remote, int argc, const char **argv, int pru
>  		}
>  	}
>  
> +	if (use_stdin_refspecs) {
> +		struct strbuf line = STRBUF_INIT;
> +		while (strbuf_getline_lf(&line, stdin) != EOF)
> +			refspec_append(&rs, line.buf);
> +		strbuf_release(&line);
> +	}

This will use refspecs both from the command line and the standard
input by appending?  IOW, these refspecs that came from the standard
input are treated otherwise identically to those that came from the
command line?

I do not particularly care whether it is "append to command line" or
"replace command line", as I do not think it makes much difference
in usability.  Just wanted to be sure you coded the behaviour you
wanted.

> @@ -1849,6 +1860,10 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>  			die(_("--filter can only be used with the remote "
>  			      "configured in extensions.partialclone"));
>  
> +		if (stdin_refspecs)
> +			die(_("--stdin can only be used when fetching "
> +			      "from one remote"));

Is that only because you happened to have implemented the reading in
fetch_one() that is designed to be called once per remote?  

You could read them here to a refspec for everybody, and then pass a
pointer to that refspec as the extra parameter to fetch_one(), and
fetch_one() can use that by duplicating and appending to its "rs",
if we wanted to, no?  I do not know how important to support such a
use case, though.  It just feels a bit of shame if this restriction
is purely imposed by the implementation, when lifting the refstiction
does not seem too involved.

Thanks.


>  		if (max_children < 0)
>  			max_children = fetch_parallel_config;

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 2/7] fetch-pack: allow NULL negotiator->known_common
  2020-08-05 20:08     ` Junio C Hamano
@ 2020-08-05 22:11       ` Junio C Hamano
  2020-08-07 20:59         ` Jonathan Tan
  0 siblings, 1 reply; 57+ messages in thread
From: Junio C Hamano @ 2020-08-05 22:11 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, peff

Junio C Hamano <gitster@pobox.com> writes:

> Jonathan Tan <jonathantanmy@google.com> writes:
>
>> In a subsequent patch, a null fetch negotiator will be introduced. This
>> negotiator, among other things, will not need any information about
>> common objects and will have a NULL known_common. Teach fetch-pack to
>> allow this.
>
> Hmph, both the default and the skipping negotiator seem to put NULL
> in known_common and add_tip when its next() method is called.  Also
> they clear known_common to NULL after add_tip is called even once.
>
> So, how have we survived so far without this patch to "allow this
> (i.e.  known_common method to be NULL)"?  Is there something that
> makes sure a negotiator never gets called from this function after
> its .next or .add_tip method is called?
>
> Puzzled.  Or is this merely an optimization?  If so, it's not like
> the change "allows this", but it starts to take advantage of it in
> some way.
>
> 	... goes and looks at mark_complete_and_common_ref()
>
> The function seems to have an unconditional call to ->known_common(),
> so anybody passing a negotiator whose known_common is NULL would
> already be segfaulting, so this does not appear to be an optimization
> but necessary to keep the code from crashing.  I cannot quite tell
> if it is avoiding unnecessary work, or sweeping crashes under the
> rug, though.  
>
> Is the untold assumption that mark_complete_and_common_ref() will
> never be called after either mark_tips() or find_common() have been
> called?

Shot in the dark.  Perhaps clearing of .add_tip and .known_common in
the .next method was done to catch a wrong calling sequence where
mark_complete_and_common_ref() gets called after mark_tips() and/or
find_common() have by forcing the code to segfault?  If so, this
patch removes the safety and we may want to add an equivalent safety
logic.  Perhaps by adding a state field in the negotiator instance
to record that mark_tips() and/or find_common() have been used and
call a BUG() if mark_complete_and_common_ref() gets called after that,
if enforcing such an invariant was the original reason why these
fields were cleared.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 1/7] fetch-pack: allow NULL negotiator->add_tip
  2020-08-05 19:53     ` Junio C Hamano
@ 2020-08-07 20:53       ` Jonathan Tan
  0 siblings, 0 replies; 57+ messages in thread
From: Jonathan Tan @ 2020-08-07 20:53 UTC (permalink / raw)
  To: gitster; +Cc: jonathantanmy, git, peff

> Jonathan Tan <jonathantanmy@google.com> writes:
> 
> > In a subsequent patch, a null fetch negotiator will be introduced. This
> > negotiator, among other things, will not need any tip information and
> > will have a NULL add_tip.
> 
> After reading the original code that the patch touches, I know what
> "a NULL add_tip" is, but it would have been nicer to readers if you
> said "callback" or "method" or "function".

OK.

> With or without the "if add_tip method is set to NULL, return
> without doing anything" optimization, it does make sense to refactor
> two existing places that do identical things.  
> 
> If all the current negotiators never have NULL in the .add_tip
> field, however, I'd suspect that it would make it easier to
> understand if this step is explained as refactoring duplicated code
> into a single helper _without_ the early-return for optimization,
> and then the step that introduces the negotiator that does want to
> lack add_tip method to add the early return.

This makes sense.

> After all, if the null
> negotiator added a pointer to a no-op function to its .add_tip field,
> things would still work correctly without the early-return in the
> new helper function, no?

Right now no, because in a partial clone, I also want to
avoid any iterations over refs since whenever refs are iterated, their
targets are checked for existence (see the call to
ref_resolves_to_object() in packed_ref_iterator_advance() in
packed-refs.c). However after looking at the code again, I see that
for_each_rawref() exists, which might do what I want (coupled with our
own way of deref-fing tags in fetch-pack). I'll take another look and if
this works, I'll just avoid the whole null method thing.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 2/7] fetch-pack: allow NULL negotiator->known_common
  2020-08-05 22:11       ` Junio C Hamano
@ 2020-08-07 20:59         ` Jonathan Tan
  0 siblings, 0 replies; 57+ messages in thread
From: Jonathan Tan @ 2020-08-07 20:59 UTC (permalink / raw)
  To: gitster; +Cc: jonathantanmy, git, peff

> > Hmph, both the default and the skipping negotiator seem to put NULL
> > in known_common and add_tip when its next() method is called.  Also
> > they clear known_common to NULL after add_tip is called even once.
> >
> > So, how have we survived so far without this patch to "allow this
> > (i.e.  known_common method to be NULL)"?  Is there something that
> > makes sure a negotiator never gets called from this function after
> > its .next or .add_tip method is called?

[snip]

> > Is the untold assumption that mark_complete_and_common_ref() will
> > never be called after either mark_tips() or find_common() have been
> > called?
> 
> Shot in the dark.  Perhaps clearing of .add_tip and .known_common in
> the .next method was done to catch a wrong calling sequence where
> mark_complete_and_common_ref() gets called after mark_tips() and/or
> find_common() have by forcing the code to segfault?

Ah...yes, if I remember correctly, that was my original intention when I
set them to NULL.

> If so, this
> patch removes the safety and we may want to add an equivalent safety
> logic.  Perhaps by adding a state field in the negotiator instance
> to record that mark_tips() and/or find_common() have been used and
> call a BUG() if mark_complete_and_common_ref() gets called after that,
> if enforcing such an invariant was the original reason why these
> fields were cleared.

Sounds good. As I said in my reply to your query on patch 1, we might
not need to set NULL anymore, but if we do, I'll do this.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 4/7] fetch: --stdin
  2020-08-05 20:33     ` Junio C Hamano
@ 2020-08-07 21:10       ` Jonathan Tan
  2020-08-07 21:58         ` Junio C Hamano
  2020-08-07 21:10       ` Jonathan Tan
  1 sibling, 1 reply; 57+ messages in thread
From: Jonathan Tan @ 2020-08-07 21:10 UTC (permalink / raw)
  To: gitster; +Cc: jonathantanmy, git, peff

> > @@ -1741,6 +1745,13 @@ static int fetch_one(struct remote *remote, int argc, const char **argv, int pru
> >  		}
> >  	}
> >  
> > +	if (use_stdin_refspecs) {
> > +		struct strbuf line = STRBUF_INIT;
> > +		while (strbuf_getline_lf(&line, stdin) != EOF)
> > +			refspec_append(&rs, line.buf);
> > +		strbuf_release(&line);
> > +	}
> 
> This will use refspecs both from the command line and the standard
> input by appending?  IOW, these refspecs that came from the standard
> input are treated otherwise identically to those that came from the
> command line?
> 
> I do not particularly care whether it is "append to command line" or
> "replace command line", as I do not think it makes much difference
> in usability.  Just wanted to be sure you coded the behaviour you
> wanted.

Yes, except that I didn't plan to support the "tag foo" format. (My aim
with this is just to allow "git fetch" to take large numbers of
refspecs, because when we lazy fetch, the number of objects we fetch
might be large.)

> > @@ -1849,6 +1860,10 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
> >  			die(_("--filter can only be used with the remote "
> >  			      "configured in extensions.partialclone"));
> >  
> > +		if (stdin_refspecs)
> > +			die(_("--stdin can only be used when fetching "
> > +			      "from one remote"));
> 
> Is that only because you happened to have implemented the reading in
> fetch_one() that is designed to be called once per remote?  
> 
> You could read them here to a refspec for everybody, and then pass a
> pointer to that refspec as the extra parameter to fetch_one(), and
> fetch_one() can use that by duplicating and appending to its "rs",
> if we wanted to, no?  I do not know how important to support such a
> use case, though.  It just feels a bit of shame if this restriction
> is purely imposed by the implementation, when lifting the refstiction
> does not seem too involved.

Yes, and I only implemented the reading in fetch_one() because
fetch_multiple() does not read additional refspecs from the command-line
(it does not take "argv"). Looking at the code, this seems to be on
purpose - there is the error message "fetch --all does not make sense
with refspecs", and when --multiple is set, all args are assumed to be
remotes.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 4/7] fetch: --stdin
  2020-08-05 20:33     ` Junio C Hamano
  2020-08-07 21:10       ` Jonathan Tan
@ 2020-08-07 21:10       ` Jonathan Tan
  1 sibling, 0 replies; 57+ messages in thread
From: Jonathan Tan @ 2020-08-07 21:10 UTC (permalink / raw)
  To: gitster; +Cc: jonathantanmy, git, peff

> > @@ -1741,6 +1745,13 @@ static int fetch_one(struct remote *remote, int argc, const char **argv, int pru
> >  		}
> >  	}
> >  
> > +	if (use_stdin_refspecs) {
> > +		struct strbuf line = STRBUF_INIT;
> > +		while (strbuf_getline_lf(&line, stdin) != EOF)
> > +			refspec_append(&rs, line.buf);
> > +		strbuf_release(&line);
> > +	}
> 
> This will use refspecs both from the command line and the standard
> input by appending?  IOW, these refspecs that came from the standard
> input are treated otherwise identically to those that came from the
> command line?
> 
> I do not particularly care whether it is "append to command line" or
> "replace command line", as I do not think it makes much difference
> in usability.  Just wanted to be sure you coded the behaviour you
> wanted.

Yes, except that I didn't plan to support the "tag foo" format. (My aim
with this is just to allow "git fetch" to take large numbers of
refspecs, because when we lazy fetch, the number of objects we fetch
might be large.)

> > @@ -1849,6 +1860,10 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
> >  			die(_("--filter can only be used with the remote "
> >  			      "configured in extensions.partialclone"));
> >  
> > +		if (stdin_refspecs)
> > +			die(_("--stdin can only be used when fetching "
> > +			      "from one remote"));
> 
> Is that only because you happened to have implemented the reading in
> fetch_one() that is designed to be called once per remote?  
> 
> You could read them here to a refspec for everybody, and then pass a
> pointer to that refspec as the extra parameter to fetch_one(), and
> fetch_one() can use that by duplicating and appending to its "rs",
> if we wanted to, no?  I do not know how important to support such a
> use case, though.  It just feels a bit of shame if this restriction
> is purely imposed by the implementation, when lifting the refstiction
> does not seem too involved.

Yes, and I only implemented the reading in fetch_one() because
fetch_multiple() does not read additional refspecs from the command-line
(it does not take "argv"). Looking at the code, this seems to be on
purpose - there is the error message "fetch --all does not make sense
with refspecs", and when --multiple is set, all args are assumed to be
remotes.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [RFC PATCH 4/7] fetch: --stdin
  2020-08-07 21:10       ` Jonathan Tan
@ 2020-08-07 21:58         ` Junio C Hamano
  0 siblings, 0 replies; 57+ messages in thread
From: Junio C Hamano @ 2020-08-07 21:58 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, peff

Jonathan Tan <jonathantanmy@google.com> writes:

> Yes, and I only implemented the reading in fetch_one() because
> fetch_multiple() does not read additional refspecs from the command-line
> (it does not take "argv"). Looking at the code, this seems to be on
> purpose - there is the error message "fetch --all does not make sense
> with refspecs", and when --multiple is set, all args are assumed to be
> remotes.

OK, that does make sense.  so the multiple one is just a short-hand
for 

	for remote in ...
	do
		git fetch $remote
	done


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v2 0/7] Lazy fetch with subprocess
  2020-07-24 22:38 [RFC PATCH] Modify fetch-pack to no longer die on error? Jonathan Tan
                   ` (2 preceding siblings ...)
  2020-08-05  1:20 ` [RFC PATCH 0/7] Lazy fetch with subprocess Jonathan Tan
@ 2020-08-11 22:52 ` Jonathan Tan
  2020-08-11 22:52   ` [PATCH v2 1/7] negotiator/null: add null fetch negotiator Jonathan Tan
                     ` (7 more replies)
  2020-08-18  4:01 ` [PATCH v3 " Jonathan Tan
  4 siblings, 8 replies; 57+ messages in thread
From: Jonathan Tan @ 2020-08-11 22:52 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

These patches are based on jc/no-update-fetch-head.

After the email exchange with Junio, I went back and took a look to see
if I could accomplish what I needed without using NULL methods in the
fetch negotiators. And I could do it - I just had to be more careful
when iterating over refs and dereferencing them. You can see it in patch
5.

In version 1, I wrote that these needed to be done:

 - Commit messages
 - User-facing documentation
 - A way to prevent a promisor-remote fetch from invoking another
   promisor-remote fetch (use a file as a lock?)
 - Remove no_dependents code (fetch-pack, transport)

1, 2, and 4 are already done. As for 3, we currently don't have this
safeguard before this patch set, so I did not include one in this patch
set either.

Jonathan Tan (7):
  negotiator/null: add null fetch negotiator
  fetch: allow refspecs specified through stdin
  fetch: avoid reading submodule config until needed
  fetch: only populate existing_refs if needed
  fetch-pack: do not lazy-fetch during ref iteration
  promisor-remote: lazy-fetch objects in subprocess
  fetch-pack: remove no_dependents code

 Documentation/config/fetch.txt            |   5 +-
 Documentation/git-fetch.txt               |   4 +
 Documentation/technical/partial-clone.txt |  13 +-
 Makefile                                  |   1 +
 builtin/fetch-pack.c                      |   4 -
 builtin/fetch.c                           |  42 ++++-
 fetch-negotiator.c                        |   5 +
 fetch-pack.c                              | 189 +++++++++-------------
 fetch-pack.h                              |  14 --
 negotiator/null.c                         |  44 +++++
 negotiator/null.h                         |   8 +
 promisor-remote.c                         |  46 +++---
 remote-curl.c                             |   6 -
 repo-settings.c                           |   2 +
 repository.h                              |   1 +
 submodule-config.c                        |   5 +-
 t/t0410-partial-clone.sh                  |   2 +-
 t/t4067-diff-partial-clone.sh             |   8 +-
 t/t5554-null-fetch-negotiator.sh          |  22 +++
 t/t5601-clone.sh                          |   2 +-
 t/t5616-partial-clone.sh                  |  20 +++
 transport.c                               |   4 -
 transport.h                               |   7 -
 23 files changed, 252 insertions(+), 202 deletions(-)
 create mode 100644 negotiator/null.c
 create mode 100644 negotiator/null.h
 create mode 100755 t/t5554-null-fetch-negotiator.sh

-- 
2.28.0.236.gb10cc79966-goog


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v2 1/7] negotiator/null: add null fetch negotiator
  2020-08-11 22:52 ` [PATCH v2 0/7] Lazy fetch with subprocess Jonathan Tan
@ 2020-08-11 22:52   ` Jonathan Tan
  2020-08-12 12:55     ` Derrick Stolee
  2020-08-11 22:52   ` [PATCH v2 2/7] fetch: allow refspecs specified through stdin Jonathan Tan
                     ` (6 subsequent siblings)
  7 siblings, 1 reply; 57+ messages in thread
From: Jonathan Tan @ 2020-08-11 22:52 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Add a null fetch negotiator. This is introduced to allow partial clones
to skip the unneeded negotiation step when fetching missing objects
using a "git fetch" subprocess. (The implementation of spawning a "git
fetch" subprocess will be done in a subsequent patch.) But this can also
be useful for end users, e.g. as a blunt fix for object corruption.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 Documentation/config/fetch.txt   |  5 +++-
 Makefile                         |  1 +
 fetch-negotiator.c               |  5 ++++
 negotiator/null.c                | 44 ++++++++++++++++++++++++++++++++
 negotiator/null.h                |  8 ++++++
 repo-settings.c                  |  2 ++
 repository.h                     |  1 +
 t/t5554-null-fetch-negotiator.sh | 22 ++++++++++++++++
 8 files changed, 87 insertions(+), 1 deletion(-)
 create mode 100644 negotiator/null.c
 create mode 100644 negotiator/null.h
 create mode 100755 t/t5554-null-fetch-negotiator.sh

diff --git a/Documentation/config/fetch.txt b/Documentation/config/fetch.txt
index 0aaa05e8c0..09aff404be 100644
--- a/Documentation/config/fetch.txt
+++ b/Documentation/config/fetch.txt
@@ -60,7 +60,10 @@ fetch.negotiationAlgorithm::
 	sent when negotiating the contents of the packfile to be sent by the
 	server. Set to "skipping" to use an algorithm that skips commits in an
 	effort to converge faster, but may result in a larger-than-necessary
-	packfile; The default is "default" which instructs Git to use the default algorithm
+	packfile; or set to "null" to not send any information at all, which
+	will almost certainly result in a larger-than-necessary packfile, but
+	will skip the negotiation step.
+	The default is "default" which instructs Git to use the default algorithm
 	that never skips commits (unless the server has acknowledged it or one
 	of its descendants). If `feature.experimental` is enabled, then this
 	setting defaults to "skipping".
diff --git a/Makefile b/Makefile
index 372139f1f2..297ea5e517 100644
--- a/Makefile
+++ b/Makefile
@@ -917,6 +917,7 @@ LIB_OBJS += mergesort.o
 LIB_OBJS += midx.o
 LIB_OBJS += name-hash.o
 LIB_OBJS += negotiator/default.o
+LIB_OBJS += negotiator/null.o
 LIB_OBJS += negotiator/skipping.o
 LIB_OBJS += notes-cache.o
 LIB_OBJS += notes-merge.o
diff --git a/fetch-negotiator.c b/fetch-negotiator.c
index 0a1357dc9d..e2ecbe4367 100644
--- a/fetch-negotiator.c
+++ b/fetch-negotiator.c
@@ -2,6 +2,7 @@
 #include "fetch-negotiator.h"
 #include "negotiator/default.h"
 #include "negotiator/skipping.h"
+#include "negotiator/null.h"
 #include "repository.h"
 
 void fetch_negotiator_init(struct repository *r,
@@ -13,6 +14,10 @@ void fetch_negotiator_init(struct repository *r,
 		skipping_negotiator_init(negotiator);
 		return;
 
+	case FETCH_NEGOTIATION_NULL:
+		null_negotiator_init(negotiator);
+		return;
+
 	case FETCH_NEGOTIATION_DEFAULT:
 	default:
 		default_negotiator_init(negotiator);
diff --git a/negotiator/null.c b/negotiator/null.c
new file mode 100644
index 0000000000..1bd834f121
--- /dev/null
+++ b/negotiator/null.c
@@ -0,0 +1,44 @@
+#include "cache.h"
+#include "null.h"
+#include "../commit.h"
+#include "../fetch-negotiator.h"
+
+static void known_common(struct fetch_negotiator *n, struct commit *c)
+{
+	/* do nothing */
+}
+
+static void add_tip(struct fetch_negotiator *n, struct commit *c)
+{
+	/* do nothing */
+}
+
+static const struct object_id *next(struct fetch_negotiator *n)
+{
+	return NULL;
+}
+
+static int ack(struct fetch_negotiator *n, struct commit *c)
+{
+	/*
+	 * This negotiator does not emit any commits, so there is no commit to
+	 * be acknowledged. If there is any ack, there is a bug.
+	 */
+	BUG("ack with null negotiator, which does not emit any commits");
+	return 0;
+}
+
+static void release(struct fetch_negotiator *n)
+{
+	/* nothing to release */
+}
+
+void null_negotiator_init(struct fetch_negotiator *negotiator)
+{
+	negotiator->known_common = known_common;
+	negotiator->add_tip = add_tip;
+	negotiator->next = next;
+	negotiator->ack = ack;
+	negotiator->release = release;
+	negotiator->data = NULL;
+}
diff --git a/negotiator/null.h b/negotiator/null.h
new file mode 100644
index 0000000000..96713f280e
--- /dev/null
+++ b/negotiator/null.h
@@ -0,0 +1,8 @@
+#ifndef NEGOTIATOR_NULL_H
+#define NEGOTIATOR_NULL_H
+
+struct fetch_negotiator;
+
+void null_negotiator_init(struct fetch_negotiator *negotiator);
+
+#endif
diff --git a/repo-settings.c b/repo-settings.c
index 0918408b34..a8c7e1edd7 100644
--- a/repo-settings.c
+++ b/repo-settings.c
@@ -39,6 +39,8 @@ void prepare_repo_settings(struct repository *r)
 	if (!repo_config_get_string(r, "fetch.negotiationalgorithm", &strval)) {
 		if (!strcasecmp(strval, "skipping"))
 			r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_SKIPPING;
+		else if (!strcasecmp(strval, "null"))
+			r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_NULL;
 		else
 			r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_DEFAULT;
 	}
diff --git a/repository.h b/repository.h
index 3c1f7d54bd..f72911a185 100644
--- a/repository.h
+++ b/repository.h
@@ -23,6 +23,7 @@ enum fetch_negotiation_setting {
 	FETCH_NEGOTIATION_NONE = 0,
 	FETCH_NEGOTIATION_DEFAULT = 1,
 	FETCH_NEGOTIATION_SKIPPING = 2,
+	FETCH_NEGOTIATION_NULL = 3,
 };
 
 struct repo_settings {
diff --git a/t/t5554-null-fetch-negotiator.sh b/t/t5554-null-fetch-negotiator.sh
new file mode 100755
index 0000000000..09a8f0d608
--- /dev/null
+++ b/t/t5554-null-fetch-negotiator.sh
@@ -0,0 +1,22 @@
+#!/bin/sh
+
+test_description='test null fetch negotiator'
+. ./test-lib.sh
+
+test_expect_success 'null negotiator does not emit any "have"' '
+	rm -f trace &&
+
+	test_create_repo server &&
+	test_commit -C server to_fetch &&
+
+	test_create_repo client &&
+	test_commit -C client we_have &&
+
+	test_config -C client fetch.negotiationalgorithm null &&
+	GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch "$(pwd)/server" &&
+
+	! grep "fetch> have" trace &&
+	grep "fetch> done" trace
+'
+
+test_done
-- 
2.28.0.236.gb10cc79966-goog


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v2 2/7] fetch: allow refspecs specified through stdin
  2020-08-11 22:52 ` [PATCH v2 0/7] Lazy fetch with subprocess Jonathan Tan
  2020-08-11 22:52   ` [PATCH v2 1/7] negotiator/null: add null fetch negotiator Jonathan Tan
@ 2020-08-11 22:52   ` Jonathan Tan
  2020-08-11 22:52   ` [PATCH v2 3/7] fetch: avoid reading submodule config until needed Jonathan Tan
                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 57+ messages in thread
From: Jonathan Tan @ 2020-08-11 22:52 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

In a subsequent patch, partial clones will be taught to fetch missing
objects using a "git fetch" subprocess. Because the number of objects
fetched may be too numerous to fit on the command line, teach "fetch" to
accept refspecs passed through stdin.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 Documentation/git-fetch.txt |  4 ++++
 builtin/fetch.c             | 19 +++++++++++++++++--
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-fetch.txt b/Documentation/git-fetch.txt
index 45b6d8e633..9067c2079e 100644
--- a/Documentation/git-fetch.txt
+++ b/Documentation/git-fetch.txt
@@ -48,6 +48,10 @@ include::fetch-options.txt[]
 
 include::pull-fetch-param.txt[]
 
+--stdin::
+	Read refspecs, one per line, from stdin in addition to those provided
+	as arguments. The "tag <name>" format is not supported.
+
 include::urls-remotes.txt[]
 
 
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 3ccf69753f..a5498646bf 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -80,6 +80,7 @@ static struct list_objects_filter_options filter_options;
 static struct string_list server_options = STRING_LIST_INIT_DUP;
 static struct string_list negotiation_tip = STRING_LIST_INIT_NODUP;
 static int fetch_write_commit_graph = -1;
+static int stdin_refspecs = 0;
 
 static int git_fetch_config(const char *k, const char *v, void *cb)
 {
@@ -209,6 +210,8 @@ static struct option builtin_fetch_options[] = {
 		 N_("check for forced-updates on all updated branches")),
 	OPT_BOOL(0, "write-commit-graph", &fetch_write_commit_graph,
 		 N_("write the commit-graph after fetching")),
+	OPT_BOOL(0, "stdin", &stdin_refspecs,
+		 N_("accept refspecs from stdin")),
 	OPT_END()
 };
 
@@ -1684,7 +1687,8 @@ static inline void fetch_one_setup_partial(struct remote *remote)
 	return;
 }
 
-static int fetch_one(struct remote *remote, int argc, const char **argv, int prune_tags_ok)
+static int fetch_one(struct remote *remote, int argc, const char **argv,
+		     int prune_tags_ok, int use_stdin_refspecs)
 {
 	struct refspec rs = REFSPEC_INIT_FETCH;
 	int i;
@@ -1741,6 +1745,13 @@ static int fetch_one(struct remote *remote, int argc, const char **argv, int pru
 		}
 	}
 
+	if (use_stdin_refspecs) {
+		struct strbuf line = STRBUF_INIT;
+		while (strbuf_getline_lf(&line, stdin) != EOF)
+			refspec_append(&rs, line.buf);
+		strbuf_release(&line);
+	}
+
 	if (server_options.nr)
 		gtransport->server_options = &server_options;
 
@@ -1841,7 +1852,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 	if (remote) {
 		if (filter_options.choice || has_promisor_remote())
 			fetch_one_setup_partial(remote);
-		result = fetch_one(remote, argc, argv, prune_tags_ok);
+		result = fetch_one(remote, argc, argv, prune_tags_ok, stdin_refspecs);
 	} else {
 		int max_children = max_jobs;
 
@@ -1849,6 +1860,10 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 			die(_("--filter can only be used with the remote "
 			      "configured in extensions.partialclone"));
 
+		if (stdin_refspecs)
+			die(_("--stdin can only be used when fetching "
+			      "from one remote"));
+
 		if (max_children < 0)
 			max_children = fetch_parallel_config;
 
-- 
2.28.0.236.gb10cc79966-goog


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v2 3/7] fetch: avoid reading submodule config until needed
  2020-08-11 22:52 ` [PATCH v2 0/7] Lazy fetch with subprocess Jonathan Tan
  2020-08-11 22:52   ` [PATCH v2 1/7] negotiator/null: add null fetch negotiator Jonathan Tan
  2020-08-11 22:52   ` [PATCH v2 2/7] fetch: allow refspecs specified through stdin Jonathan Tan
@ 2020-08-11 22:52   ` Jonathan Tan
  2020-08-12 17:34     ` Junio C Hamano
  2020-08-11 22:52   ` [PATCH v2 4/7] fetch: only populate existing_refs if needed Jonathan Tan
                     ` (4 subsequent siblings)
  7 siblings, 1 reply; 57+ messages in thread
From: Jonathan Tan @ 2020-08-11 22:52 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Teach "git fetch" to avoid reading the submodule config until necessary.
This allows users to avoid the lazy-fetching of this potentially missing
config file by specifying the --recurse-submodules=no command line
option.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 builtin/fetch.c    | 10 ++++++++--
 submodule-config.c |  5 +++--
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index a5498646bf..29db219c68 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1786,12 +1786,18 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 		free(anon);
 	}
 
-	fetch_config_from_gitmodules(&submodule_fetch_jobs_config,
-				     &recurse_submodules);
 	git_config(git_fetch_config, NULL);
 
 	argc = parse_options(argc, argv, prefix,
 			     builtin_fetch_options, builtin_fetch_usage, 0);
+	if (recurse_submodules != RECURSE_SUBMODULES_OFF) {
+		int *sfjc = submodule_fetch_jobs_config == -1
+			    ? &submodule_fetch_jobs_config : NULL;
+		int *rs = recurse_submodules == RECURSE_SUBMODULES_DEFAULT
+			  ? &recurse_submodules : NULL;
+
+		fetch_config_from_gitmodules(sfjc, rs);
+	}
 
 	if (deepen_relative) {
 		if (deepen_relative < 0)
diff --git a/submodule-config.c b/submodule-config.c
index e175dfbc38..8d65273ed2 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -776,10 +776,11 @@ struct fetch_config {
 static int gitmodules_fetch_config(const char *var, const char *value, void *cb)
 {
 	struct fetch_config *config = cb;
-	if (!strcmp(var, "submodule.fetchjobs")) {
+	if (!strcmp(var, "submodule.fetchjobs") && config->max_children) {
 		*(config->max_children) = parse_submodule_fetchjobs(var, value);
 		return 0;
-	} else if (!strcmp(var, "fetch.recursesubmodules")) {
+	} else if (!strcmp(var, "fetch.recursesubmodules") &&
+		   config->recurse_submodules) {
 		*(config->recurse_submodules) = parse_fetch_recurse_submodules_arg(var, value);
 		return 0;
 	}
-- 
2.28.0.236.gb10cc79966-goog


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v2 4/7] fetch: only populate existing_refs if needed
  2020-08-11 22:52 ` [PATCH v2 0/7] Lazy fetch with subprocess Jonathan Tan
                     ` (2 preceding siblings ...)
  2020-08-11 22:52   ` [PATCH v2 3/7] fetch: avoid reading submodule config until needed Jonathan Tan
@ 2020-08-11 22:52   ` Jonathan Tan
  2020-08-12 18:06     ` Junio C Hamano
  2020-08-11 22:52   ` [PATCH v2 5/7] fetch-pack: do not lazy-fetch during ref iteration Jonathan Tan
                     ` (3 subsequent siblings)
  7 siblings, 1 reply; 57+ messages in thread
From: Jonathan Tan @ 2020-08-11 22:52 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

When fetching tags, Git only writes tags that do not already exist in
the client repository. This necessitates an iteration over all the refs,
but fetch performs this iteration even if no tags are fetched.

This issue is more severe in a partial clone because the iteration over
refs also checks that the targets of those refs are present,
necessitating a lazy fetch if the target is missing.

Therefore, iterate over the refs only when necessary. The user can avoid
this iteration by refraining from fetching tags, for example, by passing
--no-tags as an argument. A subsequent patch will also teach Git to use
"git fetch" to lazy-fetch missing objects in a partial clone, thus also
making use of this change.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 builtin/fetch.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 29db219c68..6460ce3f4e 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -449,6 +449,7 @@ static struct ref *get_ref_map(struct remote *remote,
 	struct ref *orefs = NULL, **oref_tail = &orefs;
 
 	struct hashmap existing_refs;
+	int existing_refs_populated = 0;
 
 	if (rs->nr) {
 		struct refspec *fetch_refspec;
@@ -542,15 +543,18 @@ static struct ref *get_ref_map(struct remote *remote,
 
 	ref_map = ref_remove_duplicates(ref_map);
 
-	refname_hash_init(&existing_refs);
-	for_each_ref(add_one_refname, &existing_refs);
-
 	for (rm = ref_map; rm; rm = rm->next) {
 		if (rm->peer_ref) {
 			const char *refname = rm->peer_ref->name;
 			struct refname_hash_entry *peer_item;
 			unsigned int hash = strhash(refname);
 
+			if (!existing_refs_populated) {
+				refname_hash_init(&existing_refs);
+				for_each_ref(add_one_refname, &existing_refs);
+				existing_refs_populated = 1;
+			}
+
 			peer_item = hashmap_get_entry_from_hash(&existing_refs,
 						hash, refname,
 						struct refname_hash_entry, ent);
@@ -560,7 +564,8 @@ static struct ref *get_ref_map(struct remote *remote,
 			}
 		}
 	}
-	hashmap_free_entries(&existing_refs, struct refname_hash_entry, ent);
+	if (existing_refs_populated)
+		hashmap_free_entries(&existing_refs, struct refname_hash_entry, ent);
 
 	return ref_map;
 }
-- 
2.28.0.236.gb10cc79966-goog


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v2 5/7] fetch-pack: do not lazy-fetch during ref iteration
  2020-08-11 22:52 ` [PATCH v2 0/7] Lazy fetch with subprocess Jonathan Tan
                     ` (3 preceding siblings ...)
  2020-08-11 22:52   ` [PATCH v2 4/7] fetch: only populate existing_refs if needed Jonathan Tan
@ 2020-08-11 22:52   ` Jonathan Tan
  2020-08-12 18:25     ` Junio C Hamano
  2020-08-11 22:52   ` [PATCH v2 6/7] promisor-remote: lazy-fetch objects in subprocess Jonathan Tan
                     ` (2 subsequent siblings)
  7 siblings, 1 reply; 57+ messages in thread
From: Jonathan Tan @ 2020-08-11 22:52 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Teach "fetch-pack" not to lazy fetch whenever iterating over refs. This
is done by using the raw form of ref iteration and by dereferencing tags
ourselves.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 fetch-pack.c             | 79 ++++++++++++++++++++++------------------
 t/t5616-partial-clone.sh | 20 ++++++++++
 2 files changed, 64 insertions(+), 35 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index 80fb3bd899..707bbc31fd 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -108,24 +108,48 @@ static void for_each_cached_alternate(struct fetch_negotiator *negotiator,
 		cb(negotiator, cache.items[i]);
 }
 
+static struct commit *deref_without_lazy_fetch(const struct object_id *oid,
+					       int mark_tags_complete)
+{
+	enum object_type type;
+	struct object_info info = { .typep = &type };
+
+	while (1) {
+		if (oid_object_info_extended(the_repository, oid, &info,
+					     OBJECT_INFO_SKIP_FETCH_OBJECT | OBJECT_INFO_QUICK))
+			return NULL;
+		if (type == OBJ_TAG) {
+			struct tag *tag = (struct tag *)
+				parse_object(the_repository, oid);
+
+			if (!tag->tagged)
+				return NULL;
+			if (mark_tags_complete)
+				tag->object.flags |= COMPLETE;
+			oid = &tag->tagged->oid;
+		} else {
+			break;
+		}
+	}
+	if (type == OBJ_COMMIT)
+		return (struct commit *) parse_object(the_repository, oid);
+	return NULL;
+}
+
 static int rev_list_insert_ref(struct fetch_negotiator *negotiator,
-			       const char *refname,
 			       const struct object_id *oid)
 {
-	struct object *o = deref_tag(the_repository,
-				     parse_object(the_repository, oid),
-				     refname, 0);
-
-	if (o && o->type == OBJ_COMMIT)
-		negotiator->add_tip(negotiator, (struct commit *)o);
+	struct commit *c = deref_without_lazy_fetch(oid, 0);
 
+	if (c)
+		negotiator->add_tip(negotiator, c);
 	return 0;
 }
 
 static int rev_list_insert_ref_oid(const char *refname, const struct object_id *oid,
 				   int flag, void *cb_data)
 {
-	return rev_list_insert_ref(cb_data, refname, oid);
+	return rev_list_insert_ref(cb_data, oid);
 }
 
 enum ack_type {
@@ -201,7 +225,7 @@ static void send_request(struct fetch_pack_args *args,
 static void insert_one_alternate_object(struct fetch_negotiator *negotiator,
 					struct object *obj)
 {
-	rev_list_insert_ref(negotiator, NULL, &obj->oid);
+	rev_list_insert_ref(negotiator, &obj->oid);
 }
 
 #define INITIAL_FLUSH 16
@@ -230,13 +254,12 @@ static void mark_tips(struct fetch_negotiator *negotiator,
 	int i;
 
 	if (!negotiation_tips) {
-		for_each_ref(rev_list_insert_ref_oid, negotiator);
+		for_each_rawref(rev_list_insert_ref_oid, negotiator);
 		return;
 	}
 
 	for (i = 0; i < negotiation_tips->nr; i++)
-		rev_list_insert_ref(negotiator, NULL,
-				    &negotiation_tips->oid[i]);
+		rev_list_insert_ref(negotiator, &negotiation_tips->oid[i]);
 	return;
 }
 
@@ -503,21 +526,11 @@ static struct commit_list *complete;
 
 static int mark_complete(const struct object_id *oid)
 {
-	struct object *o = parse_object(the_repository, oid);
-
-	while (o && o->type == OBJ_TAG) {
-		struct tag *t = (struct tag *) o;
-		if (!t->tagged)
-			break; /* broken repository */
-		o->flags |= COMPLETE;
-		o = parse_object(the_repository, &t->tagged->oid);
-	}
-	if (o && o->type == OBJ_COMMIT) {
-		struct commit *commit = (struct commit *)o;
-		if (!(commit->object.flags & COMPLETE)) {
-			commit->object.flags |= COMPLETE;
-			commit_list_insert(commit, &complete);
-		}
+	struct commit *commit = deref_without_lazy_fetch(oid, 1);
+
+	if (commit && !(commit->object.flags & COMPLETE)) {
+		commit->object.flags |= COMPLETE;
+		commit_list_insert(commit, &complete);
 	}
 	return 0;
 }
@@ -702,7 +715,7 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
 	 */
 	trace2_region_enter("fetch-pack", "mark_complete_local_refs", NULL);
 	if (!args->deepen) {
-		for_each_ref(mark_complete_oid, NULL);
+		for_each_rawref(mark_complete_oid, NULL);
 		for_each_cached_alternate(NULL, mark_alternate_complete);
 		commit_list_sort_by_date(&complete);
 		if (cutoff)
@@ -716,16 +729,12 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
 	 */
 	trace2_region_enter("fetch-pack", "mark_common_remote_refs", NULL);
 	for (ref = *refs; ref; ref = ref->next) {
-		struct object *o = deref_tag(the_repository,
-					     lookup_object(the_repository,
-					     &ref->old_oid),
-					     NULL, 0);
+		struct commit *c = deref_without_lazy_fetch(&ref->old_oid, 0);
 
-		if (!o || o->type != OBJ_COMMIT || !(o->flags & COMPLETE))
+		if (!c || !(c->object.flags & COMPLETE))
 			continue;
 
-		negotiator->known_common(negotiator,
-					 (struct commit *)o);
+		negotiator->known_common(negotiator, c);
 	}
 	trace2_region_leave("fetch-pack", "mark_common_remote_refs", NULL);
 
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 8a27452a51..e53492d595 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -384,6 +384,26 @@ test_expect_success 'fetch lazy-fetches only to resolve deltas, protocol v2' '
 	grep "want $(cat hash)" trace
 '
 
+test_expect_success 'fetch does not lazy-fetch missing targets of its refs' '
+	rm -rf server client trace &&
+
+	test_create_repo server &&
+	test_config -C server uploadpack.allowfilter 1 &&
+	test_config -C server uploadpack.allowanysha1inwant 1 &&
+	test_commit -C server foo &&
+
+	git clone --filter=blob:none "file://$(pwd)/server" client &&
+	# Make all refs point to nothing by deleting all objects.
+	rm client/.git/objects/pack/* &&
+
+	test_commit -C server bar &&
+	GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch \
+		--no-tags --recurse-submodules=no \
+		origin refs/tags/bar &&
+	FOO_HASH=$(git -C server rev-parse foo) &&
+	! grep "want $FOO_HASH" trace
+'
+
 # The following two tests must be in this order. It is important that
 # the srv.bare repository did not have tags during clone, but has tags
 # in the fetch.
-- 
2.28.0.236.gb10cc79966-goog


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v2 6/7] promisor-remote: lazy-fetch objects in subprocess
  2020-08-11 22:52 ` [PATCH v2 0/7] Lazy fetch with subprocess Jonathan Tan
                     ` (4 preceding siblings ...)
  2020-08-11 22:52   ` [PATCH v2 5/7] fetch-pack: do not lazy-fetch during ref iteration Jonathan Tan
@ 2020-08-11 22:52   ` Jonathan Tan
  2020-08-12 18:28     ` Junio C Hamano
  2020-08-11 22:52   ` [PATCH v2 7/7] fetch-pack: remove no_dependents code Jonathan Tan
  2020-08-12 12:51   ` [PATCH v2 0/7] Lazy fetch with subprocess Derrick Stolee
  7 siblings, 1 reply; 57+ messages in thread
From: Jonathan Tan @ 2020-08-11 22:52 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Teach Git to lazy-fetch missing objects in a subprocess instead of doing
it in-process. This allows any fatal errors that occur during the fetch
to be isolated and converted into an error return value, instead of
causing the current command being run to terminate.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 Documentation/technical/partial-clone.txt | 13 ++-----
 promisor-remote.c                         | 46 +++++++++++------------
 t/t0410-partial-clone.sh                  |  2 +-
 t/t4067-diff-partial-clone.sh             |  8 ++--
 t/t5601-clone.sh                          |  2 +-
 5 files changed, 30 insertions(+), 41 deletions(-)

diff --git a/Documentation/technical/partial-clone.txt b/Documentation/technical/partial-clone.txt
index b9e17e7a28..0780d30cac 100644
--- a/Documentation/technical/partial-clone.txt
+++ b/Documentation/technical/partial-clone.txt
@@ -171,20 +171,13 @@ additional flag.
 Fetching Missing Objects
 ------------------------
 
-- Fetching of objects is done using the existing transport mechanism using
-  transport_fetch_refs(), setting a new transport option
-  TRANS_OPT_NO_DEPENDENTS to indicate that only the objects themselves are
-  desired, not any object that they refer to.
-+
-Because some transports invoke fetch_pack() in the same process, fetch_pack()
-has been updated to not use any object flags when the corresponding argument
-(no_dependents) is set.
+- Fetching of objects is done by invoking a "git fetch" subprocess.
 
 - The local repository sends a request with the hashes of all requested
-  objects as "want" lines, and does not perform any packfile negotiation.
+  objects, and does not perform any packfile negotiation.
   It then receives a packfile.
 
-- Because we are reusing the existing fetch-pack mechanism, fetching
+- Because we are reusing the existing fetch mechanism, fetching
   currently fetches all objects referred to by the requested objects, even
   though they are not necessary.
 
diff --git a/promisor-remote.c b/promisor-remote.c
index baaea12fd6..6e647610e9 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -3,6 +3,7 @@
 #include "promisor-remote.h"
 #include "config.h"
 #include "transport.h"
+#include "argv-array.h"
 
 static char *repository_format_partial_clone;
 static const char *core_partial_clone_filter_default;
@@ -12,39 +13,34 @@ void set_repository_format_partial_clone(char *partial_clone)
 	repository_format_partial_clone = xstrdup_or_null(partial_clone);
 }
 
-static int fetch_refs(const char *remote_name, struct ref *ref)
-{
-	struct remote *remote;
-	struct transport *transport;
-	int res;
-
-	remote = remote_get(remote_name);
-	if (!remote->url[0])
-		die(_("Remote with no URL"));
-	transport = transport_get(remote, remote->url[0]);
-
-	transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
-	transport_set_option(transport, TRANS_OPT_NO_DEPENDENTS, "1");
-	res = transport_fetch_refs(transport, ref);
-
-	return res;
-}
-
 static int fetch_objects(const char *remote_name,
 			 const struct object_id *oids,
 			 int oid_nr)
 {
-	struct ref *ref = NULL;
+	struct child_process child = CHILD_PROCESS_INIT;
 	int i;
+	FILE *child_in;
+
+	child.git_cmd = 1;
+	child.in = -1;
+	argv_array_pushl(&child.args, "-c", "fetch.negotiationAlgorithm=null",
+			 "fetch", remote_name, "--no-tags",
+			 "--no-write-fetch-head", "--recurse-submodules=no",
+			 "--filter=blob:none", "--stdin", NULL);
+	if (start_command(&child))
+		die(_("promisor-remote: unable to fork off fetch subprocess"));
+	child_in = xfdopen(child.in, "w");
 
 	for (i = 0; i < oid_nr; i++) {
-		struct ref *new_ref = alloc_ref(oid_to_hex(&oids[i]));
-		oidcpy(&new_ref->old_oid, &oids[i]);
-		new_ref->exact_oid = 1;
-		new_ref->next = ref;
-		ref = new_ref;
+		if (fputs(oid_to_hex(&oids[i]), child_in) < 0)
+			die_errno(_("promisor-remote: could not write to fetch subprocess"));
+		if (fputc('\n', child_in) < 0)
+			die_errno(_("promisor-remote: could not write to fetch subprocess"));
 	}
-	return fetch_refs(remote_name, ref);
+
+	if (fclose(child_in) < 0)
+		die_errno(_("promisor-remote: could not close stdin to fetch subprocess"));
+	return finish_command(&child) ? -1 : 0;
 }
 
 static struct promisor_remote *promisors;
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index 463dc3a8be..3e454f934e 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -203,7 +203,7 @@ test_expect_success 'fetching of missing objects works with ref-in-want enabled'
 	rm -rf repo/.git/objects/* &&
 	rm -f trace &&
 	GIT_TRACE_PACKET="$(pwd)/trace" git -C repo cat-file -p "$HASH" &&
-	grep "git< fetch=.*ref-in-want" trace
+	grep "fetch< fetch=.*ref-in-want" trace
 '
 
 test_expect_success 'fetching of missing objects from another promisor remote' '
diff --git a/t/t4067-diff-partial-clone.sh b/t/t4067-diff-partial-clone.sh
index ef8e0e9cb0..804f2a82e8 100755
--- a/t/t4067-diff-partial-clone.sh
+++ b/t/t4067-diff-partial-clone.sh
@@ -20,7 +20,7 @@ test_expect_success 'git show batches blobs' '
 	# Ensure that there is exactly 1 negotiation by checking that there is
 	# only 1 "done" line sent. ("done" marks the end of negotiation.)
 	GIT_TRACE_PACKET="$(pwd)/trace" git -C client show HEAD &&
-	grep "git> done" trace >done_lines &&
+	grep "fetch> done" trace >done_lines &&
 	test_line_count = 1 done_lines
 '
 
@@ -44,7 +44,7 @@ test_expect_success 'diff batches blobs' '
 	# Ensure that there is exactly 1 negotiation by checking that there is
 	# only 1 "done" line sent. ("done" marks the end of negotiation.)
 	GIT_TRACE_PACKET="$(pwd)/trace" git -C client diff HEAD^ HEAD &&
-	grep "git> done" trace >done_lines &&
+	grep "fetch> done" trace >done_lines &&
 	test_line_count = 1 done_lines
 '
 
@@ -127,7 +127,7 @@ test_expect_success 'diff with rename detection batches blobs' '
 	# only 1 "done" line sent. ("done" marks the end of negotiation.)
 	GIT_TRACE_PACKET="$(pwd)/trace" git -C client diff --raw -M HEAD^ HEAD >out &&
 	grep ":100644 100644.*R[0-9][0-9][0-9].*b.*c" out &&
-	grep "git> done" trace >done_lines &&
+	grep "fetch> done" trace >done_lines &&
 	test_line_count = 1 done_lines
 '
 
@@ -175,7 +175,7 @@ test_expect_success 'diff --break-rewrites fetches only if necessary, and batche
 	# by checking that there is only 1 "done" line sent. ("done" marks the
 	# end of negotiation.)
 	GIT_TRACE_PACKET="$(pwd)/trace" git -C client diff --break-rewrites --raw -M HEAD^ HEAD &&
-	grep "git> done" trace >done_lines &&
+	grep "fetch> done" trace >done_lines &&
 	test_line_count = 1 done_lines
 '
 
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 84ea2a3eb7..f82b0dbb5a 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -702,7 +702,7 @@ test_expect_success 'batch missing blob request during checkout' '
 	# Ensure that there is only one negotiation by checking that there is
 	# only "done" line sent. ("done" marks the end of negotiation.)
 	GIT_TRACE_PACKET="$(pwd)/trace" git -C client checkout HEAD^ &&
-	grep "git> done" trace >done_lines &&
+	grep "fetch> done" trace >done_lines &&
 	test_line_count = 1 done_lines
 '
 
-- 
2.28.0.236.gb10cc79966-goog


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v2 7/7] fetch-pack: remove no_dependents code
  2020-08-11 22:52 ` [PATCH v2 0/7] Lazy fetch with subprocess Jonathan Tan
                     ` (5 preceding siblings ...)
  2020-08-11 22:52   ` [PATCH v2 6/7] promisor-remote: lazy-fetch objects in subprocess Jonathan Tan
@ 2020-08-11 22:52   ` Jonathan Tan
  2020-08-12 12:51   ` [PATCH v2 0/7] Lazy fetch with subprocess Derrick Stolee
  7 siblings, 0 replies; 57+ messages in thread
From: Jonathan Tan @ 2020-08-11 22:52 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Now that Git has switched to using a subprocess to lazy-fetch missing
objects, remove the no_dependents code as it is no longer used.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 builtin/fetch-pack.c |   4 --
 fetch-pack.c         | 110 ++++++++++++-------------------------------
 fetch-pack.h         |  14 ------
 remote-curl.c        |   6 ---
 transport.c          |   4 --
 transport.h          |   7 ---
 6 files changed, 30 insertions(+), 115 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index bbb5c96167..58b7c1fbdc 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -153,10 +153,6 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 			args.from_promisor = 1;
 			continue;
 		}
-		if (!strcmp("--no-dependents", arg)) {
-			args.no_dependents = 1;
-			continue;
-		}
 		if (skip_prefix(arg, ("--" CL_ARG__FILTER "="), &arg)) {
 			parse_list_objects_filter(&args.filter_options, arg);
 			continue;
diff --git a/fetch-pack.c b/fetch-pack.c
index 707bbc31fd..3212957dae 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -285,10 +285,8 @@ static int find_common(struct fetch_negotiator *negotiator,
 			   PACKET_READ_CHOMP_NEWLINE |
 			   PACKET_READ_DIE_ON_ERR_PACKET);
 
-	if (!args->no_dependents) {
-		mark_tips(negotiator, args->negotiation_tips);
-		for_each_cached_alternate(negotiator, insert_one_alternate_object);
-	}
+	mark_tips(negotiator, args->negotiation_tips);
+	for_each_cached_alternate(negotiator, insert_one_alternate_object);
 
 	fetching = 0;
 	for ( ; refs ; refs = refs->next) {
@@ -305,12 +303,8 @@ static int find_common(struct fetch_negotiator *negotiator,
 		 * We use lookup_object here because we are only
 		 * interested in the case we *know* the object is
 		 * reachable and we have already scanned it.
-		 *
-		 * Do this only if args->no_dependents is false (if it is true,
-		 * we cannot trust the object flags).
 		 */
-		if (!args->no_dependents &&
-		    ((o = lookup_object(the_repository, remote)) != NULL) &&
+		if (((o = lookup_object(the_repository, remote)) != NULL) &&
 				(o->flags & COMPLETE)) {
 			continue;
 		}
@@ -410,8 +404,6 @@ static int find_common(struct fetch_negotiator *negotiator,
 	trace2_region_enter("fetch-pack", "negotiation_v0_v1", the_repository);
 	flushes = 0;
 	retval = -1;
-	if (args->no_dependents)
-		goto done;
 	while ((oid = negotiator->next(negotiator))) {
 		packet_buf_write(&req_buf, "have %s\n", oid_to_hex(oid));
 		print_verbose(args, "have %s", oid_to_hex(oid));
@@ -666,9 +658,7 @@ struct loose_object_iter {
 
 /*
  * Mark recent commits available locally and reachable from a local ref as
- * COMPLETE. If args->no_dependents is false, also mark COMPLETE remote refs as
- * COMMON_REF (otherwise, we are not planning to participate in negotiation, and
- * thus do not need COMMON_REF marks).
+ * COMPLETE.
  *
  * The cutoff time for recency is determined by this heuristic: it is the
  * earliest commit time of the objects in refs that are commits and that we know
@@ -969,12 +959,8 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args,
 	struct fetch_negotiator negotiator_alloc;
 	struct fetch_negotiator *negotiator;
 
-	if (args->no_dependents) {
-		negotiator = NULL;
-	} else {
-		negotiator = &negotiator_alloc;
-		fetch_negotiator_init(r, negotiator);
-	}
+	negotiator = &negotiator_alloc;
+	fetch_negotiator_init(r, negotiator);
 
 	sort_ref_list(&ref, ref_compare_name);
 	QSORT(sought, nr_sought, cmp_ref_by_name);
@@ -1062,15 +1048,11 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args,
 	if (!server_supports_hash(the_hash_algo->name, NULL))
 		die(_("Server does not support this repository's object format"));
 
-	if (!args->no_dependents) {
-		mark_complete_and_common_ref(negotiator, args, &ref);
-		filter_refs(args, &ref, sought, nr_sought);
-		if (everything_local(args, &ref)) {
-			packet_flush(fd[1]);
-			goto all_done;
-		}
-	} else {
-		filter_refs(args, &ref, sought, nr_sought);
+	mark_complete_and_common_ref(negotiator, args, &ref);
+	filter_refs(args, &ref, sought, nr_sought);
+	if (everything_local(args, &ref)) {
+		packet_flush(fd[1]);
+		goto all_done;
 	}
 	if (find_common(negotiator, args, fd, &oid, ref) < 0)
 		if (!args->keep_pack)
@@ -1119,7 +1101,7 @@ static void add_shallow_requests(struct strbuf *req_buf,
 		packet_buf_write(req_buf, "deepen-relative\n");
 }
 
-static void add_wants(int no_dependents, const struct ref *wants, struct strbuf *req_buf)
+static void add_wants(const struct ref *wants, struct strbuf *req_buf)
 {
 	int use_ref_in_want = server_supports_feature("fetch", "ref-in-want", 0);
 
@@ -1136,12 +1118,8 @@ static void add_wants(int no_dependents, const struct ref *wants, struct strbuf
 		 * We use lookup_object here because we are only
 		 * interested in the case we *know* the object is
 		 * reachable and we have already scanned it.
-		 *
-		 * Do this only if args->no_dependents is false (if it is true,
-		 * we cannot trust the object flags).
 		 */
-		if (!no_dependents &&
-		    ((o = lookup_object(the_repository, remote)) != NULL) &&
+		if (((o = lookup_object(the_repository, remote)) != NULL) &&
 		    (o->flags & COMPLETE)) {
 			continue;
 		}
@@ -1275,19 +1253,14 @@ static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
 	}
 
 	/* add wants */
-	add_wants(args->no_dependents, wants, &req_buf);
+	add_wants(wants, &req_buf);
 
-	if (args->no_dependents) {
-		packet_buf_write(&req_buf, "done");
-		ret = 1;
-	} else {
-		/* Add all of the common commits we've found in previous rounds */
-		add_common(&req_buf, common);
+	/* Add all of the common commits we've found in previous rounds */
+	add_common(&req_buf, common);
 
-		/* Add initial haves */
-		ret = add_haves(negotiator, seen_ack, &req_buf,
-				haves_to_send, in_vain);
-	}
+	/* Add initial haves */
+	ret = add_haves(negotiator, seen_ack, &req_buf,
+			haves_to_send, in_vain);
 
 	/* Send request */
 	packet_buf_flush(&req_buf);
@@ -1547,12 +1520,8 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 	struct string_list packfile_uris = STRING_LIST_INIT_DUP;
 	int i;
 
-	if (args->no_dependents) {
-		negotiator = NULL;
-	} else {
-		negotiator = &negotiator_alloc;
-		fetch_negotiator_init(r, negotiator);
-	}
+	negotiator = &negotiator_alloc;
+	fetch_negotiator_init(r, negotiator);
 
 	packet_reader_init(&reader, fd[0], NULL, 0,
 			   PACKET_READ_CHOMP_NEWLINE |
@@ -1576,21 +1545,16 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 				args->deepen = 1;
 
 			/* Filter 'ref' by 'sought' and those that aren't local */
-			if (!args->no_dependents) {
-				mark_complete_and_common_ref(negotiator, args, &ref);
-				filter_refs(args, &ref, sought, nr_sought);
-				if (everything_local(args, &ref))
-					state = FETCH_DONE;
-				else
-					state = FETCH_SEND_REQUEST;
-
-				mark_tips(negotiator, args->negotiation_tips);
-				for_each_cached_alternate(negotiator,
-							  insert_one_alternate_object);
-			} else {
-				filter_refs(args, &ref, sought, nr_sought);
+			mark_complete_and_common_ref(negotiator, args, &ref);
+			filter_refs(args, &ref, sought, nr_sought);
+			if (everything_local(args, &ref))
+				state = FETCH_DONE;
+			else
 				state = FETCH_SEND_REQUEST;
-			}
+
+			mark_tips(negotiator, args->negotiation_tips);
+			for_each_cached_alternate(negotiator,
+						  insert_one_alternate_object);
 			break;
 		case FETCH_SEND_REQUEST:
 			if (!negotiation_started) {
@@ -1911,20 +1875,6 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
 	if (nr_sought)
 		nr_sought = remove_duplicates_in_refs(sought, nr_sought);
 
-	if (args->no_dependents && !args->filter_options.choice) {
-		/*
-		 * The protocol does not support requesting that only the
-		 * wanted objects be sent, so approximate this by setting a
-		 * "blob:none" filter if no filter is already set. This works
-		 * for all object types: note that wanted blobs will still be
-		 * sent because they are directly specified as a "want".
-		 *
-		 * NEEDSWORK: Add an option in the protocol to request that
-		 * only the wanted objects be sent, and implement it.
-		 */
-		parse_list_objects_filter(&args->filter_options, "blob:none");
-	}
-
 	if (version != protocol_v2 && !ref) {
 		packet_flush(fd[1]);
 		die(_("no matching remote head"));
diff --git a/fetch-pack.h b/fetch-pack.h
index 85d1e39fe7..bbe2938059 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -42,20 +42,6 @@ struct fetch_pack_args {
 	unsigned deepen:1;
 	unsigned from_promisor:1;
 
-	/*
-	 * Attempt to fetch only the wanted objects, and not any objects
-	 * referred to by them. Due to protocol limitations, extraneous
-	 * objects may still be included. (When fetching non-blob
-	 * objects, only blobs are excluded; when fetching a blob, the
-	 * blob itself will still be sent. The client does not need to
-	 * know whether a wanted object is a blob or not.)
-	 *
-	 * If 1, fetch_pack() will also not modify any object flags.
-	 * This allows fetch_pack() to safely be called by any function,
-	 * regardless of which object flags it uses (if any).
-	 */
-	unsigned no_dependents:1;
-
 	/*
 	 * Because fetch_pack() overwrites the shallow file upon a
 	 * successful deepening non-clone fetch, if this struct
diff --git a/remote-curl.c b/remote-curl.c
index 5cbc6e5002..a0c81a64bc 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -40,7 +40,6 @@ struct options {
 		push_cert : 2,
 		deepen_relative : 1,
 		from_promisor : 1,
-		no_dependents : 1,
 		atomic : 1,
 		object_format : 1;
 	const struct git_hash_algo *hash_algo;
@@ -186,9 +185,6 @@ static int set_option(const char *name, const char *value)
 	} else if (!strcmp(name, "from-promisor")) {
 		options.from_promisor = 1;
 		return 0;
-	} else if (!strcmp(name, "no-dependents")) {
-		options.no_dependents = 1;
-		return 0;
 	} else if (!strcmp(name, "filter")) {
 		options.filter = xstrdup(value);
 		return 0;
@@ -1171,8 +1167,6 @@ static int fetch_git(struct discovery *heads,
 		argv_array_push(&args, "--deepen-relative");
 	if (options.from_promisor)
 		argv_array_push(&args, "--from-promisor");
-	if (options.no_dependents)
-		argv_array_push(&args, "--no-dependents");
 	if (options.filter)
 		argv_array_pushf(&args, "--filter=%s", options.filter);
 	argv_array_push(&args, url.buf);
diff --git a/transport.c b/transport.c
index b41386eccb..32e1f21f0c 100644
--- a/transport.c
+++ b/transport.c
@@ -232,9 +232,6 @@ static int set_git_option(struct git_transport_options *opts,
 	} else if (!strcmp(name, TRANS_OPT_FROM_PROMISOR)) {
 		opts->from_promisor = !!value;
 		return 0;
-	} else if (!strcmp(name, TRANS_OPT_NO_DEPENDENTS)) {
-		opts->no_dependents = !!value;
-		return 0;
 	} else if (!strcmp(name, TRANS_OPT_LIST_OBJECTS_FILTER)) {
 		list_objects_filter_die_if_populated(&opts->filter_options);
 		parse_list_objects_filter(&opts->filter_options, value);
@@ -359,7 +356,6 @@ static int fetch_refs_via_pack(struct transport *transport,
 	args.cloning = transport->cloning;
 	args.update_shallow = data->options.update_shallow;
 	args.from_promisor = data->options.from_promisor;
-	args.no_dependents = data->options.no_dependents;
 	args.filter_options = data->options.filter_options;
 	args.stateless_rpc = transport->stateless_rpc;
 	args.server_options = transport->server_options;
diff --git a/transport.h b/transport.h
index b3c30133ea..7aa1f33145 100644
--- a/transport.h
+++ b/transport.h
@@ -16,7 +16,6 @@ struct git_transport_options {
 	unsigned update_shallow : 1;
 	unsigned deepen_relative : 1;
 	unsigned from_promisor : 1;
-	unsigned no_dependents : 1;
 
 	/*
 	 * If this transport supports connect or stateless-connect,
@@ -201,12 +200,6 @@ void transport_check_allowed(const char *type);
 /* Indicate that these objects are being fetched by a promisor */
 #define TRANS_OPT_FROM_PROMISOR "from-promisor"
 
-/*
- * Indicate that only the objects wanted need to be fetched, not their
- * dependents
- */
-#define TRANS_OPT_NO_DEPENDENTS "no-dependents"
-
 /* Filter objects for partial clone and fetch */
 #define TRANS_OPT_LIST_OBJECTS_FILTER "filter"
 
-- 
2.28.0.236.gb10cc79966-goog


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 0/7] Lazy fetch with subprocess
  2020-08-11 22:52 ` [PATCH v2 0/7] Lazy fetch with subprocess Jonathan Tan
                     ` (6 preceding siblings ...)
  2020-08-11 22:52   ` [PATCH v2 7/7] fetch-pack: remove no_dependents code Jonathan Tan
@ 2020-08-12 12:51   ` Derrick Stolee
  7 siblings, 0 replies; 57+ messages in thread
From: Derrick Stolee @ 2020-08-12 12:51 UTC (permalink / raw)
  To: Jonathan Tan, git; +Cc: gitster

On 8/11/2020 6:52 PM, Jonathan Tan wrote:
>  - A way to prevent a promisor-remote fetch from invoking another
>    promisor-remote fetch (use a file as a lock?)

In the VFS for Git world, the C# layer frequently needs to ask
Git for some information, but doesn't want to trigger a download
of a missing object. We use a config option when running our
command that disables the GVFS protocol for that command.

It's possible that a config option _or_ environment variable would
suffice instead of adding another file-based lock.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 1/7] negotiator/null: add null fetch negotiator
  2020-08-11 22:52   ` [PATCH v2 1/7] negotiator/null: add null fetch negotiator Jonathan Tan
@ 2020-08-12 12:55     ` Derrick Stolee
  2020-08-12 16:44       ` Junio C Hamano
  0 siblings, 1 reply; 57+ messages in thread
From: Derrick Stolee @ 2020-08-12 12:55 UTC (permalink / raw)
  To: Jonathan Tan, git; +Cc: gitster

On 8/11/2020 6:52 PM, Jonathan Tan wrote:
> Add a null fetch negotiator. 

I understand the value of this negotiator. I'm concerned about using
"null" as the name, since it has a clear relationship to zero-valued
pointers and that's not what is happening. (My gut feeling starting the
patch was that some function pointers would be NULL or something.)

Instead, might I recommend "noop" or "no_op" in place of "null" here?

Thanks,
-Stolee


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 1/7] negotiator/null: add null fetch negotiator
  2020-08-12 12:55     ` Derrick Stolee
@ 2020-08-12 16:44       ` Junio C Hamano
  2020-08-12 17:29         ` Jonathan Tan
  0 siblings, 1 reply; 57+ messages in thread
From: Junio C Hamano @ 2020-08-12 16:44 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Jonathan Tan, git

Derrick Stolee <stolee@gmail.com> writes:

> On 8/11/2020 6:52 PM, Jonathan Tan wrote:
>> Add a null fetch negotiator. 
>
> I understand the value of this negotiator. I'm concerned about using
> "null" as the name, since it has a clear relationship to zero-valued
> pointers and that's not what is happening. (My gut feeling starting the
> patch was that some function pointers would be NULL or something.)
>
> Instead, might I recommend "noop" or "no_op" in place of "null" here?

Personally I am OK with null [*], but noop is also fine.

	Side note.  I actually would find it good to establish the
	pattern that something that does not use NULL pointer as its
	implementation detail can be called null if "null-ness" of
	its behaviour is its defining characteristics.

Thanks.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 1/7] negotiator/null: add null fetch negotiator
  2020-08-12 16:44       ` Junio C Hamano
@ 2020-08-12 17:29         ` Jonathan Tan
  0 siblings, 0 replies; 57+ messages in thread
From: Jonathan Tan @ 2020-08-12 17:29 UTC (permalink / raw)
  To: gitster; +Cc: stolee, jonathantanmy, git

> Derrick Stolee <stolee@gmail.com> writes:
> 
> > On 8/11/2020 6:52 PM, Jonathan Tan wrote:
> >> Add a null fetch negotiator. 
> >
> > I understand the value of this negotiator. I'm concerned about using
> > "null" as the name, since it has a clear relationship to zero-valued
> > pointers and that's not what is happening. (My gut feeling starting the
> > patch was that some function pointers would be NULL or something.)
> >
> > Instead, might I recommend "noop" or "no_op" in place of "null" here?
> 
> Personally I am OK with null [*], but noop is also fine.
> 
> 	Side note.  I actually would find it good to establish the
> 	pattern that something that does not use NULL pointer as its
> 	implementation detail can be called null if "null-ness" of
> 	its behaviour is its defining characteristics.
> 
> Thanks.

OK, in a future version I'll go with "noop".

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 3/7] fetch: avoid reading submodule config until needed
  2020-08-11 22:52   ` [PATCH v2 3/7] fetch: avoid reading submodule config until needed Jonathan Tan
@ 2020-08-12 17:34     ` Junio C Hamano
  0 siblings, 0 replies; 57+ messages in thread
From: Junio C Hamano @ 2020-08-12 17:34 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

> Teach "git fetch" to avoid reading the submodule config until necessary.
> This allows users to avoid the lazy-fetching of this potentially missing
> config file by specifying the --recurse-submodules=no command line
> option.
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
>  builtin/fetch.c    | 10 ++++++++--
>  submodule-config.c |  5 +++--
>  2 files changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index a5498646bf..29db219c68 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -1786,12 +1786,18 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>  		free(anon);
>  	}
>  
> -	fetch_config_from_gitmodules(&submodule_fetch_jobs_config,
> -				     &recurse_submodules);
>  	git_config(git_fetch_config, NULL);
>  
>  	argc = parse_options(argc, argv, prefix,
>  			     builtin_fetch_options, builtin_fetch_usage, 0);
> +	if (recurse_submodules != RECURSE_SUBMODULES_OFF) {
> +		int *sfjc = submodule_fetch_jobs_config == -1
> +			    ? &submodule_fetch_jobs_config : NULL;
> +		int *rs = recurse_submodules == RECURSE_SUBMODULES_DEFAULT
> +			  ? &recurse_submodules : NULL;
> +
> +		fetch_config_from_gitmodules(sfjc, rs);
> +	}

Hmph.  Instead of calling fetch_config_from... upfront, first
reading the fetch configuration and also command line options to see
if we are told to recurse, and only enter the new "if()" statement
does make sense---we'll avoid calling fetch_config_from... when we
are told not to recurse into submodules.

But it is not quite clear why the two parameters to the function can
now be conditional, and what benefit we gain by doing so, in the
body of the new "if()" statement.  Don't you need to explain to your
future peer developers what is going on here in the log message?

>  	if (deepen_relative) {
>  		if (deepen_relative < 0)
> diff --git a/submodule-config.c b/submodule-config.c
> index e175dfbc38..8d65273ed2 100644
> --- a/submodule-config.c
> +++ b/submodule-config.c
> @@ -776,10 +776,11 @@ struct fetch_config {
>  static int gitmodules_fetch_config(const char *var, const char *value, void *cb)
>  {
>  	struct fetch_config *config = cb;
> -	if (!strcmp(var, "submodule.fetchjobs")) {
> +	if (!strcmp(var, "submodule.fetchjobs") && config->max_children) {
>  		*(config->max_children) = parse_submodule_fetchjobs(var, value);
>  		return 0;

Because the new "if()" statement in cmd_fetch can pass NULL to
either of the two parameters to fetch_config_from_gitmodules(), it
now becomes possible for this function to get an instance of "struct
fetch_config" with .max_children field set to NULL.

The above justifies why assignment to *(config->max_children) must
be skipped when .max_children is NULL, but it does not justify the
new code, where .max_children==NULL does not stop the if/else if/
cascade.

	if (!strcmp(var, "submodule.fetchjobs")) {
		if (config->max_children)
			*(config->max_children) = parse...
		return 0;
	}

Why sfjc or rs is allowed to be NULL in certain cases, and why doing
so is a good idea, should be explained nevertheless, though.

> -	} else if (!strcmp(var, "fetch.recursesubmodules")) {
> +	} else if (!strcmp(var, "fetch.recursesubmodules") &&
> +		   config->recurse_submodules) {
>  		*(config->recurse_submodules) = parse_fetch_recurse_submodules_arg(var, value);
>  		return 0;

Likewise.

Thanks.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 4/7] fetch: only populate existing_refs if needed
  2020-08-11 22:52   ` [PATCH v2 4/7] fetch: only populate existing_refs if needed Jonathan Tan
@ 2020-08-12 18:06     ` Junio C Hamano
  0 siblings, 0 replies; 57+ messages in thread
From: Junio C Hamano @ 2020-08-12 18:06 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

> When fetching tags, Git only writes tags that do not already exist in
> the client repository. This necessitates an iteration over all the refs,
> but fetch performs this iteration even if no tags are fetched.

Hmph, the change itself mechanically makes sense (i.e. if the
ref_map has no entry with peer_ref defined, existing_refs hashmap is
never looked up, so there is no reason to populate it), but the
explanation of log message, especially the part that places strees
on tags, does not.  Wouldn't a plain vanilla "git fetch" with no
argument that uses the remote.origin.fetch populated with standard
refspecs to maintain remote-tracking branches use this hash?

I think the readers need to be somehow made aware of the fact that
the author of this commit message is concentrated primarily on fetch
used as a mechanism to grab specific objects, often used by lazy
fetch, without touching any remote-tracking refs.  Because tag
following is on by default (which is a convenient default for the
plain vanilla fetches that updates remote-tracking branches),
however, such a "these specific objects only" fetch still tries to
trigger the auto following of tags, and necessitates "--no-tags" to
take advantage of the optimization proposed by this patch.

So nothing written in the proposed log message is incorrect per-se,
but it is not very friendly to readers without clarivoyance.

> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 29db219c68..6460ce3f4e 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -449,6 +449,7 @@ static struct ref *get_ref_map(struct remote *remote,
>  	struct ref *orefs = NULL, **oref_tail = &orefs;
>  
>  	struct hashmap existing_refs;
> +	int existing_refs_populated = 0;
>  
>  	if (rs->nr) {
>  		struct refspec *fetch_refspec;
> @@ -542,15 +543,18 @@ static struct ref *get_ref_map(struct remote *remote,
>  
>  	ref_map = ref_remove_duplicates(ref_map);
>  
> -	refname_hash_init(&existing_refs);
> -	for_each_ref(add_one_refname, &existing_refs);
> -
>  	for (rm = ref_map; rm; rm = rm->next) {
>  		if (rm->peer_ref) {
>  			const char *refname = rm->peer_ref->name;
>  			struct refname_hash_entry *peer_item;
>  			unsigned int hash = strhash(refname);
>  
> +			if (!existing_refs_populated) {
> +				refname_hash_init(&existing_refs);
> +				for_each_ref(add_one_refname, &existing_refs);
> +				existing_refs_populated = 1;
> +			}
> +
>  			peer_item = hashmap_get_entry_from_hash(&existing_refs,
>  						hash, refname,
>  						struct refname_hash_entry, ent);
> @@ -560,7 +564,8 @@ static struct ref *get_ref_map(struct remote *remote,
>  			}
>  		}
>  	}
> -	hashmap_free_entries(&existing_refs, struct refname_hash_entry, ent);
> +	if (existing_refs_populated)
> +		hashmap_free_entries(&existing_refs, struct refname_hash_entry, ent);
>  
>  	return ref_map;
>  }

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 5/7] fetch-pack: do not lazy-fetch during ref iteration
  2020-08-11 22:52   ` [PATCH v2 5/7] fetch-pack: do not lazy-fetch during ref iteration Jonathan Tan
@ 2020-08-12 18:25     ` Junio C Hamano
  0 siblings, 0 replies; 57+ messages in thread
From: Junio C Hamano @ 2020-08-12 18:25 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

> Teach "fetch-pack" not to lazy fetch whenever iterating over
> refs. 

Don't readers need the reason why is it a good thing to do explained
here?

> This
> is done by using the raw form of ref iteration and by dereferencing tags
> ourselves.

Hmph.  The way this patch implements deref_without_lazy_fetch()
makes it hard to reuse in other contexts, as it mixes what is done
by mark_complete(), which is very much "git fetch" specific.

Would "git fetch" be the only user potentially benefit from being
able to dereference a tag that already exists locally and ignore
missing ones?  I wonder if teaching deref_tag() a new flag would be
a better alternative to keep the separation of concerns at different
layers.

> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
>  fetch-pack.c             | 79 ++++++++++++++++++++++------------------
>  t/t5616-partial-clone.sh | 20 ++++++++++
>  2 files changed, 64 insertions(+), 35 deletions(-)
>
> diff --git a/fetch-pack.c b/fetch-pack.c
> index 80fb3bd899..707bbc31fd 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -108,24 +108,48 @@ static void for_each_cached_alternate(struct fetch_negotiator *negotiator,
>  		cb(negotiator, cache.items[i]);
>  }
>  
> +static struct commit *deref_without_lazy_fetch(const struct object_id *oid,
> +					       int mark_tags_complete)
> +{
> +	enum object_type type;
> +	struct object_info info = { .typep = &type };
> +
> +	while (1) {
> +		if (oid_object_info_extended(the_repository, oid, &info,
> +					     OBJECT_INFO_SKIP_FETCH_OBJECT | OBJECT_INFO_QUICK))
> +			return NULL;
> +		if (type == OBJ_TAG) {
> +			struct tag *tag = (struct tag *)
> +				parse_object(the_repository, oid);
> +
> +			if (!tag->tagged)
> +				return NULL;
> +			if (mark_tags_complete)
> +				tag->object.flags |= COMPLETE;
> +			oid = &tag->tagged->oid;
> +		} else {
> +			break;
> +		}
> +	}
> +	if (type == OBJ_COMMIT)
> +		return (struct commit *) parse_object(the_repository, oid);
> +	return NULL;
> +}
> +
>  static int rev_list_insert_ref(struct fetch_negotiator *negotiator,
> -			       const char *refname,
>  			       const struct object_id *oid)
>  {
> -	struct object *o = deref_tag(the_repository,
> -				     parse_object(the_repository, oid),
> -				     refname, 0);
> -
> -	if (o && o->type == OBJ_COMMIT)
> -		negotiator->add_tip(negotiator, (struct commit *)o);
> +	struct commit *c = deref_without_lazy_fetch(oid, 0);
>  
> +	if (c)
> +		negotiator->add_tip(negotiator, c);
>  	return 0;
>  }
>  
>  static int rev_list_insert_ref_oid(const char *refname, const struct object_id *oid,
>  				   int flag, void *cb_data)
>  {
> -	return rev_list_insert_ref(cb_data, refname, oid);
> +	return rev_list_insert_ref(cb_data, oid);
>  }
>  
>  enum ack_type {
> @@ -201,7 +225,7 @@ static void send_request(struct fetch_pack_args *args,
>  static void insert_one_alternate_object(struct fetch_negotiator *negotiator,
>  					struct object *obj)
>  {
> -	rev_list_insert_ref(negotiator, NULL, &obj->oid);
> +	rev_list_insert_ref(negotiator, &obj->oid);
>  }
>  
>  #define INITIAL_FLUSH 16
> @@ -230,13 +254,12 @@ static void mark_tips(struct fetch_negotiator *negotiator,
>  	int i;
>  
>  	if (!negotiation_tips) {
> -		for_each_ref(rev_list_insert_ref_oid, negotiator);
> +		for_each_rawref(rev_list_insert_ref_oid, negotiator);
>  		return;
>  	}
>  
>  	for (i = 0; i < negotiation_tips->nr; i++)
> -		rev_list_insert_ref(negotiator, NULL,
> -				    &negotiation_tips->oid[i]);
> +		rev_list_insert_ref(negotiator, &negotiation_tips->oid[i]);
>  	return;
>  }
>  
> @@ -503,21 +526,11 @@ static struct commit_list *complete;
>  
>  static int mark_complete(const struct object_id *oid)
>  {
> -	struct object *o = parse_object(the_repository, oid);
> -
> -	while (o && o->type == OBJ_TAG) {
> -		struct tag *t = (struct tag *) o;
> -		if (!t->tagged)
> -			break; /* broken repository */
> -		o->flags |= COMPLETE;
> -		o = parse_object(the_repository, &t->tagged->oid);
> -	}
> -	if (o && o->type == OBJ_COMMIT) {
> -		struct commit *commit = (struct commit *)o;
> -		if (!(commit->object.flags & COMPLETE)) {
> -			commit->object.flags |= COMPLETE;
> -			commit_list_insert(commit, &complete);
> -		}
> +	struct commit *commit = deref_without_lazy_fetch(oid, 1);
> +
> +	if (commit && !(commit->object.flags & COMPLETE)) {
> +		commit->object.flags |= COMPLETE;
> +		commit_list_insert(commit, &complete);
>  	}
>  	return 0;
>  }
> @@ -702,7 +715,7 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
>  	 */
>  	trace2_region_enter("fetch-pack", "mark_complete_local_refs", NULL);
>  	if (!args->deepen) {
> -		for_each_ref(mark_complete_oid, NULL);
> +		for_each_rawref(mark_complete_oid, NULL);
>  		for_each_cached_alternate(NULL, mark_alternate_complete);
>  		commit_list_sort_by_date(&complete);
>  		if (cutoff)
> @@ -716,16 +729,12 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
>  	 */
>  	trace2_region_enter("fetch-pack", "mark_common_remote_refs", NULL);
>  	for (ref = *refs; ref; ref = ref->next) {
> -		struct object *o = deref_tag(the_repository,
> -					     lookup_object(the_repository,
> -					     &ref->old_oid),
> -					     NULL, 0);
> +		struct commit *c = deref_without_lazy_fetch(&ref->old_oid, 0);
>  
> -		if (!o || o->type != OBJ_COMMIT || !(o->flags & COMPLETE))
> +		if (!c || !(c->object.flags & COMPLETE))
>  			continue;
>  
> -		negotiator->known_common(negotiator,
> -					 (struct commit *)o);
> +		negotiator->known_common(negotiator, c);
>  	}
>  	trace2_region_leave("fetch-pack", "mark_common_remote_refs", NULL);
>  
> diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
> index 8a27452a51..e53492d595 100755
> --- a/t/t5616-partial-clone.sh
> +++ b/t/t5616-partial-clone.sh
> @@ -384,6 +384,26 @@ test_expect_success 'fetch lazy-fetches only to resolve deltas, protocol v2' '
>  	grep "want $(cat hash)" trace
>  '
>  
> +test_expect_success 'fetch does not lazy-fetch missing targets of its refs' '
> +	rm -rf server client trace &&
> +
> +	test_create_repo server &&
> +	test_config -C server uploadpack.allowfilter 1 &&
> +	test_config -C server uploadpack.allowanysha1inwant 1 &&
> +	test_commit -C server foo &&
> +
> +	git clone --filter=blob:none "file://$(pwd)/server" client &&
> +	# Make all refs point to nothing by deleting all objects.
> +	rm client/.git/objects/pack/* &&
> +
> +	test_commit -C server bar &&
> +	GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch \
> +		--no-tags --recurse-submodules=no \
> +		origin refs/tags/bar &&
> +	FOO_HASH=$(git -C server rev-parse foo) &&
> +	! grep "want $FOO_HASH" trace
> +'
> +
>  # The following two tests must be in this order. It is important that
>  # the srv.bare repository did not have tags during clone, but has tags
>  # in the fetch.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 6/7] promisor-remote: lazy-fetch objects in subprocess
  2020-08-11 22:52   ` [PATCH v2 6/7] promisor-remote: lazy-fetch objects in subprocess Jonathan Tan
@ 2020-08-12 18:28     ` Junio C Hamano
  0 siblings, 0 replies; 57+ messages in thread
From: Junio C Hamano @ 2020-08-12 18:28 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

> Teach Git to lazy-fetch missing objects in a subprocess instead of doing
> it in-process. This allows any fatal errors that occur during the fetch
> to be isolated and converted into an error return value, instead of
> causing the current command being run to terminate.
>
> ...
>  static int fetch_objects(const char *remote_name,
>  			 const struct object_id *oids,
>  			 int oid_nr)
>  {
> -	struct ref *ref = NULL;
> +	struct child_process child = CHILD_PROCESS_INIT;
>  	int i;
> +	FILE *child_in;
> +
> +	child.git_cmd = 1;
> +	child.in = -1;
> +	argv_array_pushl(&child.args, "-c", "fetch.negotiationAlgorithm=null",
> +			 "fetch", remote_name, "--no-tags",
> +			 "--no-write-fetch-head", "--recurse-submodules=no",
> +			 "--filter=blob:none", "--stdin", NULL);

Finally we get to use the new negotiator introduced earlier.  Nice.

> +	if (start_command(&child))
> +		die(_("promisor-remote: unable to fork off fetch subprocess"));
> +	child_in = xfdopen(child.in, "w");

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v3 0/7] Lazy fetch with subprocess
  2020-07-24 22:38 [RFC PATCH] Modify fetch-pack to no longer die on error? Jonathan Tan
                   ` (3 preceding siblings ...)
  2020-08-11 22:52 ` [PATCH v2 0/7] Lazy fetch with subprocess Jonathan Tan
@ 2020-08-18  4:01 ` Jonathan Tan
  2020-08-18  4:01   ` [PATCH v3 1/7] negotiator/noop: add noop fetch negotiator Jonathan Tan
                     ` (7 more replies)
  4 siblings, 8 replies; 57+ messages in thread
From: Jonathan Tan @ 2020-08-18  4:01 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, stolee, gitster

These patches are based on jc/no-update-fetch-head, once again.

I've addressed most reviewer comments by expanding commit messages and
fixing the cascading "if" issue in patch 3.

> Would "git fetch" be the only user potentially benefit from being
> able to dereference a tag that already exists locally and ignore
> missing ones?  I wonder if teaching deref_tag() a new flag would be
> a better alternative to keep the separation of concerns at different
> layers.

Right now, I think so - "fetch" is a special case of having to avoid
infinite loops when lazy fetching, and of being able to tolerate missing
ref targets. I don't think most other commands are like that.

Jonathan Tan (7):
  negotiator/noop: add noop fetch negotiator
  fetch: allow refspecs specified through stdin
  fetch: avoid reading submodule config until needed
  fetch: only populate existing_refs if needed
  fetch-pack: do not lazy-fetch during ref iteration
  promisor-remote: lazy-fetch objects in subprocess
  fetch-pack: remove no_dependents code

 Documentation/config/fetch.txt            |   5 +-
 Documentation/git-fetch.txt               |   4 +
 Documentation/technical/partial-clone.txt |  13 +-
 Makefile                                  |   1 +
 builtin/fetch-pack.c                      |   4 -
 builtin/fetch.c                           |  42 ++++-
 fetch-negotiator.c                        |   5 +
 fetch-pack.c                              | 189 +++++++++-------------
 fetch-pack.h                              |  14 --
 negotiator/noop.c                         |  44 +++++
 negotiator/noop.h                         |   8 +
 promisor-remote.c                         |  46 +++---
 remote-curl.c                             |   6 -
 repo-settings.c                           |   2 +
 repository.h                              |   1 +
 submodule-config.c                        |   8 +-
 t/t0410-partial-clone.sh                  |   2 +-
 t/t4067-diff-partial-clone.sh             |   8 +-
 t/t5554-noop-fetch-negotiator.sh          |  22 +++
 t/t5601-clone.sh                          |   2 +-
 t/t5616-partial-clone.sh                  |  20 +++
 transport.c                               |   4 -
 transport.h                               |   7 -
 23 files changed, 255 insertions(+), 202 deletions(-)
 create mode 100644 negotiator/noop.c
 create mode 100644 negotiator/noop.h
 create mode 100755 t/t5554-noop-fetch-negotiator.sh

Range-diff against v2:
1:  35bdd372ae ! 1:  5b3b49d123 negotiator/null: add null fetch negotiator
    @@ Metadata
     Author: Jonathan Tan <jonathantanmy@google.com>
     
      ## Commit message ##
    -    negotiator/null: add null fetch negotiator
    +    negotiator/noop: add noop fetch negotiator
     
    -    Add a null fetch negotiator. This is introduced to allow partial clones
    +    Add a noop fetch negotiator. This is introduced to allow partial clones
         to skip the unneeded negotiation step when fetching missing objects
         using a "git fetch" subprocess. (The implementation of spawning a "git
         fetch" subprocess will be done in a subsequent patch.) But this can also
    @@ Documentation/config/fetch.txt: fetch.negotiationAlgorithm::
      	server. Set to "skipping" to use an algorithm that skips commits in an
      	effort to converge faster, but may result in a larger-than-necessary
     -	packfile; The default is "default" which instructs Git to use the default algorithm
    -+	packfile; or set to "null" to not send any information at all, which
    ++	packfile; or set to "noop" to not send any information at all, which
     +	will almost certainly result in a larger-than-necessary packfile, but
     +	will skip the negotiation step.
     +	The default is "default" which instructs Git to use the default algorithm
    @@ Makefile: LIB_OBJS += mergesort.o
      LIB_OBJS += midx.o
      LIB_OBJS += name-hash.o
      LIB_OBJS += negotiator/default.o
    -+LIB_OBJS += negotiator/null.o
    ++LIB_OBJS += negotiator/noop.o
      LIB_OBJS += negotiator/skipping.o
      LIB_OBJS += notes-cache.o
      LIB_OBJS += notes-merge.o
    @@ fetch-negotiator.c
      #include "fetch-negotiator.h"
      #include "negotiator/default.h"
      #include "negotiator/skipping.h"
    -+#include "negotiator/null.h"
    ++#include "negotiator/noop.h"
      #include "repository.h"
      
      void fetch_negotiator_init(struct repository *r,
    @@ fetch-negotiator.c: void fetch_negotiator_init(struct repository *r,
      		skipping_negotiator_init(negotiator);
      		return;
      
    -+	case FETCH_NEGOTIATION_NULL:
    -+		null_negotiator_init(negotiator);
    ++	case FETCH_NEGOTIATION_NOOP:
    ++		noop_negotiator_init(negotiator);
     +		return;
     +
      	case FETCH_NEGOTIATION_DEFAULT:
      	default:
      		default_negotiator_init(negotiator);
     
    - ## negotiator/null.c (new) ##
    + ## negotiator/noop.c (new) ##
     @@
     +#include "cache.h"
    -+#include "null.h"
    ++#include "noop.h"
     +#include "../commit.h"
     +#include "../fetch-negotiator.h"
     +
    @@ negotiator/null.c (new)
     +	 * This negotiator does not emit any commits, so there is no commit to
     +	 * be acknowledged. If there is any ack, there is a bug.
     +	 */
    -+	BUG("ack with null negotiator, which does not emit any commits");
    ++	BUG("ack with noop negotiator, which does not emit any commits");
     +	return 0;
     +}
     +
    @@ negotiator/null.c (new)
     +	/* nothing to release */
     +}
     +
    -+void null_negotiator_init(struct fetch_negotiator *negotiator)
    ++void noop_negotiator_init(struct fetch_negotiator *negotiator)
     +{
     +	negotiator->known_common = known_common;
     +	negotiator->add_tip = add_tip;
    @@ negotiator/null.c (new)
     +	negotiator->data = NULL;
     +}
     
    - ## negotiator/null.h (new) ##
    + ## negotiator/noop.h (new) ##
     @@
    -+#ifndef NEGOTIATOR_NULL_H
    -+#define NEGOTIATOR_NULL_H
    ++#ifndef NEGOTIATOR_NOOP_H
    ++#define NEGOTIATOR_NOOP_H
     +
     +struct fetch_negotiator;
     +
    -+void null_negotiator_init(struct fetch_negotiator *negotiator);
    ++void noop_negotiator_init(struct fetch_negotiator *negotiator);
     +
     +#endif
     
    @@ repo-settings.c: void prepare_repo_settings(struct repository *r)
      	if (!repo_config_get_string(r, "fetch.negotiationalgorithm", &strval)) {
      		if (!strcasecmp(strval, "skipping"))
      			r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_SKIPPING;
    -+		else if (!strcasecmp(strval, "null"))
    -+			r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_NULL;
    ++		else if (!strcasecmp(strval, "noop"))
    ++			r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_NOOP;
      		else
      			r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_DEFAULT;
      	}
    @@ repository.h: enum fetch_negotiation_setting {
      	FETCH_NEGOTIATION_NONE = 0,
      	FETCH_NEGOTIATION_DEFAULT = 1,
      	FETCH_NEGOTIATION_SKIPPING = 2,
    -+	FETCH_NEGOTIATION_NULL = 3,
    ++	FETCH_NEGOTIATION_NOOP = 3,
      };
      
      struct repo_settings {
     
    - ## t/t5554-null-fetch-negotiator.sh (new) ##
    + ## t/t5554-noop-fetch-negotiator.sh (new) ##
     @@
     +#!/bin/sh
     +
    -+test_description='test null fetch negotiator'
    ++test_description='test noop fetch negotiator'
     +. ./test-lib.sh
     +
    -+test_expect_success 'null negotiator does not emit any "have"' '
    ++test_expect_success 'noop negotiator does not emit any "have"' '
     +	rm -f trace &&
     +
     +	test_create_repo server &&
    @@ t/t5554-null-fetch-negotiator.sh (new)
     +	test_create_repo client &&
     +	test_commit -C client we_have &&
     +
    -+	test_config -C client fetch.negotiationalgorithm null &&
    ++	test_config -C client fetch.negotiationalgorithm noop &&
     +	GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch "$(pwd)/server" &&
     +
     +	! grep "fetch> have" trace &&
2:  00ad7dd875 = 2:  9f277f1631 fetch: allow refspecs specified through stdin
3:  8b4a522a13 ! 3:  fda9f834f6 fetch: avoid reading submodule config until needed
    @@ Metadata
      ## Commit message ##
         fetch: avoid reading submodule config until needed
     
    -    Teach "git fetch" to avoid reading the submodule config until necessary.
    -    This allows users to avoid the lazy-fetching of this potentially missing
    -    config file by specifying the --recurse-submodules=no command line
    -    option.
    +    In "fetch", there are two parameters submodule_fetch_jobs_config and
    +    recurse_submodules that can be set in a variety of ways: through
    +    .gitmodules, through .git/config, and through the command line.
    +    Currently "fetch" handles this by first reading .gitmodules, then
    +    reading .git/config (allowing it to overwrite existing values), then
    +    reading the command line (allowing it to overwrite existing values).
    +
    +    Notice that we can avoid reading .gitmodules if .git/config and/or the
    +    command line already provides us with what we need. In addition, if
    +    recurse_submodules is found to be "no", we do not need the value of
    +    submodule_fetch_jobs_config.
    +
    +    Avoiding reading .gitmodules is especially important when we use "git
    +    fetch" to perform lazy fetches in a partial clone because the
    +    .gitmodules file itself might need to be lazy fetched (and otherwise
    +    causing an infinite loop).
    +
    +    In light of all this, avoid reading .gitmodules until necessary. When
    +    reading it, we may only need one of the two parameters it provides, so
    +    teach fetch_config_from_gitmodules() to support NULL arguments. With
    +    this patch, users (including Git itself when invoking "git fetch" to
    +    lazy-fetch) will be able to guarantee avoiding reading .gitmodules by
    +    passing --recurse-submodules=no.
     
         Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
     
    @@ builtin/fetch.c: int cmd_fetch(int argc, const char **argv, const char *prefix)
      		if (deepen_relative < 0)
     
      ## submodule-config.c ##
    -@@ submodule-config.c: struct fetch_config {
    - static int gitmodules_fetch_config(const char *var, const char *value, void *cb)
    +@@ submodule-config.c: static int gitmodules_fetch_config(const char *var, const char *value, void *cb)
      {
      	struct fetch_config *config = cb;
    --	if (!strcmp(var, "submodule.fetchjobs")) {
    -+	if (!strcmp(var, "submodule.fetchjobs") && config->max_children) {
    - 		*(config->max_children) = parse_submodule_fetchjobs(var, value);
    + 	if (!strcmp(var, "submodule.fetchjobs")) {
    +-		*(config->max_children) = parse_submodule_fetchjobs(var, value);
    ++		if (config->max_children)
    ++			*(config->max_children) =
    ++				parse_submodule_fetchjobs(var, value);
      		return 0;
    --	} else if (!strcmp(var, "fetch.recursesubmodules")) {
    -+	} else if (!strcmp(var, "fetch.recursesubmodules") &&
    -+		   config->recurse_submodules) {
    - 		*(config->recurse_submodules) = parse_fetch_recurse_submodules_arg(var, value);
    + 	} else if (!strcmp(var, "fetch.recursesubmodules")) {
    +-		*(config->recurse_submodules) = parse_fetch_recurse_submodules_arg(var, value);
    ++		if (config->recurse_submodules)
    ++			*(config->recurse_submodules) =
    ++				parse_fetch_recurse_submodules_arg(var, value);
      		return 0;
      	}
    + 
4:  77bc83e7f2 ! 4:  a5554cd27f fetch: only populate existing_refs if needed
    @@ Metadata
      ## Commit message ##
         fetch: only populate existing_refs if needed
     
    -    When fetching tags, Git only writes tags that do not already exist in
    -    the client repository. This necessitates an iteration over all the refs,
    -    but fetch performs this iteration even if no tags are fetched.
    +    In "fetch", get_ref_map() iterates over all refs to populate
    +    "existing_refs" in order to populate peer_ref->old_oid in the returned
    +    refmap, even if the refmap has no peer_ref set - which is the case when
    +    only literal hashes (i.e. no refs by name) are fetched.
     
    -    This issue is more severe in a partial clone because the iteration over
    -    refs also checks that the targets of those refs are present,
    -    necessitating a lazy fetch if the target is missing.
    +    Iterating over refs causes the targets of those refs to be checked for
    +    existence. Avoiding this is especially important when we use "git fetch"
    +    to perform lazy fetches in a partial clone because a target of such a
    +    ref may need to be itself lazy-fetched (and otherwise causing an
    +    infinite loop).
     
    -    Therefore, iterate over the refs only when necessary. The user can avoid
    -    this iteration by refraining from fetching tags, for example, by passing
    -    --no-tags as an argument. A subsequent patch will also teach Git to use
    -    "git fetch" to lazy-fetch missing objects in a partial clone, thus also
    -    making use of this change.
    +    Therefore, avoid populating "existing_refs" until necessary. With this
    +    patch, because Git lazy-fetches objects by literal hashes (to be done in
    +    a subsequent commit), it will then be able to guarantee avoiding reading
    +    targets of refs.
     
         Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
     
5:  d42e98ff07 ! 5:  f56c24dd1b fetch-pack: do not lazy-fetch during ref iteration
    @@ Metadata
      ## Commit message ##
         fetch-pack: do not lazy-fetch during ref iteration
     
    -    Teach "fetch-pack" not to lazy fetch whenever iterating over refs. This
    -    is done by using the raw form of ref iteration and by dereferencing tags
    -    ourselves.
    +    In order to determine negotiation tips, "fetch-pack" iterates over all
    +    refs and dereferences all annotated tags found. This causes the
    +    existence of targets of refs and annotated tags to be checked. Avoiding
    +    this is especially important when we use "git fetch" (which invokes
    +    "fetch-pack") to perform lazy fetches in a partial clone because a
    +    target of such a ref or annotated tag may need to be itself lazy-fetched
    +    (and otherwise causing an infinite loop).
    +
    +    Therefore, teach "fetch-pack" not to lazy fetch whenever iterating over
    +    refs. This is done by using the raw form of ref iteration and by
    +    dereferencing tags ourselves.
     
         Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
     
6:  55d2e5a4cc = 6:  4e72ca6258 promisor-remote: lazy-fetch objects in subprocess
7:  e8f16d6908 = 7:  3ff9d034e9 fetch-pack: remove no_dependents code
-- 
2.28.0.220.ged08abb693-goog


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v3 1/7] negotiator/noop: add noop fetch negotiator
  2020-08-18  4:01 ` [PATCH v3 " Jonathan Tan
@ 2020-08-18  4:01   ` Jonathan Tan
  2020-08-18  4:01   ` [PATCH v3 2/7] fetch: allow refspecs specified through stdin Jonathan Tan
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 57+ messages in thread
From: Jonathan Tan @ 2020-08-18  4:01 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, stolee, gitster

Add a noop fetch negotiator. This is introduced to allow partial clones
to skip the unneeded negotiation step when fetching missing objects
using a "git fetch" subprocess. (The implementation of spawning a "git
fetch" subprocess will be done in a subsequent patch.) But this can also
be useful for end users, e.g. as a blunt fix for object corruption.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 Documentation/config/fetch.txt   |  5 +++-
 Makefile                         |  1 +
 fetch-negotiator.c               |  5 ++++
 negotiator/noop.c                | 44 ++++++++++++++++++++++++++++++++
 negotiator/noop.h                |  8 ++++++
 repo-settings.c                  |  2 ++
 repository.h                     |  1 +
 t/t5554-noop-fetch-negotiator.sh | 22 ++++++++++++++++
 8 files changed, 87 insertions(+), 1 deletion(-)
 create mode 100644 negotiator/noop.c
 create mode 100644 negotiator/noop.h
 create mode 100755 t/t5554-noop-fetch-negotiator.sh

diff --git a/Documentation/config/fetch.txt b/Documentation/config/fetch.txt
index 0aaa05e8c0..19383ddf7b 100644
--- a/Documentation/config/fetch.txt
+++ b/Documentation/config/fetch.txt
@@ -60,7 +60,10 @@ fetch.negotiationAlgorithm::
 	sent when negotiating the contents of the packfile to be sent by the
 	server. Set to "skipping" to use an algorithm that skips commits in an
 	effort to converge faster, but may result in a larger-than-necessary
-	packfile; The default is "default" which instructs Git to use the default algorithm
+	packfile; or set to "noop" to not send any information at all, which
+	will almost certainly result in a larger-than-necessary packfile, but
+	will skip the negotiation step.
+	The default is "default" which instructs Git to use the default algorithm
 	that never skips commits (unless the server has acknowledged it or one
 	of its descendants). If `feature.experimental` is enabled, then this
 	setting defaults to "skipping".
diff --git a/Makefile b/Makefile
index 372139f1f2..94a8fb54d1 100644
--- a/Makefile
+++ b/Makefile
@@ -917,6 +917,7 @@ LIB_OBJS += mergesort.o
 LIB_OBJS += midx.o
 LIB_OBJS += name-hash.o
 LIB_OBJS += negotiator/default.o
+LIB_OBJS += negotiator/noop.o
 LIB_OBJS += negotiator/skipping.o
 LIB_OBJS += notes-cache.o
 LIB_OBJS += notes-merge.o
diff --git a/fetch-negotiator.c b/fetch-negotiator.c
index 0a1357dc9d..57ed5784e1 100644
--- a/fetch-negotiator.c
+++ b/fetch-negotiator.c
@@ -2,6 +2,7 @@
 #include "fetch-negotiator.h"
 #include "negotiator/default.h"
 #include "negotiator/skipping.h"
+#include "negotiator/noop.h"
 #include "repository.h"
 
 void fetch_negotiator_init(struct repository *r,
@@ -13,6 +14,10 @@ void fetch_negotiator_init(struct repository *r,
 		skipping_negotiator_init(negotiator);
 		return;
 
+	case FETCH_NEGOTIATION_NOOP:
+		noop_negotiator_init(negotiator);
+		return;
+
 	case FETCH_NEGOTIATION_DEFAULT:
 	default:
 		default_negotiator_init(negotiator);
diff --git a/negotiator/noop.c b/negotiator/noop.c
new file mode 100644
index 0000000000..60569b8350
--- /dev/null
+++ b/negotiator/noop.c
@@ -0,0 +1,44 @@
+#include "cache.h"
+#include "noop.h"
+#include "../commit.h"
+#include "../fetch-negotiator.h"
+
+static void known_common(struct fetch_negotiator *n, struct commit *c)
+{
+	/* do nothing */
+}
+
+static void add_tip(struct fetch_negotiator *n, struct commit *c)
+{
+	/* do nothing */
+}
+
+static const struct object_id *next(struct fetch_negotiator *n)
+{
+	return NULL;
+}
+
+static int ack(struct fetch_negotiator *n, struct commit *c)
+{
+	/*
+	 * This negotiator does not emit any commits, so there is no commit to
+	 * be acknowledged. If there is any ack, there is a bug.
+	 */
+	BUG("ack with noop negotiator, which does not emit any commits");
+	return 0;
+}
+
+static void release(struct fetch_negotiator *n)
+{
+	/* nothing to release */
+}
+
+void noop_negotiator_init(struct fetch_negotiator *negotiator)
+{
+	negotiator->known_common = known_common;
+	negotiator->add_tip = add_tip;
+	negotiator->next = next;
+	negotiator->ack = ack;
+	negotiator->release = release;
+	negotiator->data = NULL;
+}
diff --git a/negotiator/noop.h b/negotiator/noop.h
new file mode 100644
index 0000000000..2b4ec5d07a
--- /dev/null
+++ b/negotiator/noop.h
@@ -0,0 +1,8 @@
+#ifndef NEGOTIATOR_NOOP_H
+#define NEGOTIATOR_NOOP_H
+
+struct fetch_negotiator;
+
+void noop_negotiator_init(struct fetch_negotiator *negotiator);
+
+#endif
diff --git a/repo-settings.c b/repo-settings.c
index 0918408b34..aa61a35338 100644
--- a/repo-settings.c
+++ b/repo-settings.c
@@ -39,6 +39,8 @@ void prepare_repo_settings(struct repository *r)
 	if (!repo_config_get_string(r, "fetch.negotiationalgorithm", &strval)) {
 		if (!strcasecmp(strval, "skipping"))
 			r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_SKIPPING;
+		else if (!strcasecmp(strval, "noop"))
+			r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_NOOP;
 		else
 			r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_DEFAULT;
 	}
diff --git a/repository.h b/repository.h
index 3c1f7d54bd..628c834367 100644
--- a/repository.h
+++ b/repository.h
@@ -23,6 +23,7 @@ enum fetch_negotiation_setting {
 	FETCH_NEGOTIATION_NONE = 0,
 	FETCH_NEGOTIATION_DEFAULT = 1,
 	FETCH_NEGOTIATION_SKIPPING = 2,
+	FETCH_NEGOTIATION_NOOP = 3,
 };
 
 struct repo_settings {
diff --git a/t/t5554-noop-fetch-negotiator.sh b/t/t5554-noop-fetch-negotiator.sh
new file mode 100755
index 0000000000..2ac7b5859e
--- /dev/null
+++ b/t/t5554-noop-fetch-negotiator.sh
@@ -0,0 +1,22 @@
+#!/bin/sh
+
+test_description='test noop fetch negotiator'
+. ./test-lib.sh
+
+test_expect_success 'noop negotiator does not emit any "have"' '
+	rm -f trace &&
+
+	test_create_repo server &&
+	test_commit -C server to_fetch &&
+
+	test_create_repo client &&
+	test_commit -C client we_have &&
+
+	test_config -C client fetch.negotiationalgorithm noop &&
+	GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch "$(pwd)/server" &&
+
+	! grep "fetch> have" trace &&
+	grep "fetch> done" trace
+'
+
+test_done
-- 
2.28.0.220.ged08abb693-goog


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v3 2/7] fetch: allow refspecs specified through stdin
  2020-08-18  4:01 ` [PATCH v3 " Jonathan Tan
  2020-08-18  4:01   ` [PATCH v3 1/7] negotiator/noop: add noop fetch negotiator Jonathan Tan
@ 2020-08-18  4:01   ` Jonathan Tan
  2020-08-18  4:01   ` [PATCH v3 3/7] fetch: avoid reading submodule config until needed Jonathan Tan
                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 57+ messages in thread
From: Jonathan Tan @ 2020-08-18  4:01 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, stolee, gitster

In a subsequent patch, partial clones will be taught to fetch missing
objects using a "git fetch" subprocess. Because the number of objects
fetched may be too numerous to fit on the command line, teach "fetch" to
accept refspecs passed through stdin.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 Documentation/git-fetch.txt |  4 ++++
 builtin/fetch.c             | 19 +++++++++++++++++--
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-fetch.txt b/Documentation/git-fetch.txt
index 45b6d8e633..9067c2079e 100644
--- a/Documentation/git-fetch.txt
+++ b/Documentation/git-fetch.txt
@@ -48,6 +48,10 @@ include::fetch-options.txt[]
 
 include::pull-fetch-param.txt[]
 
+--stdin::
+	Read refspecs, one per line, from stdin in addition to those provided
+	as arguments. The "tag <name>" format is not supported.
+
 include::urls-remotes.txt[]
 
 
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 3ccf69753f..a5498646bf 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -80,6 +80,7 @@ static struct list_objects_filter_options filter_options;
 static struct string_list server_options = STRING_LIST_INIT_DUP;
 static struct string_list negotiation_tip = STRING_LIST_INIT_NODUP;
 static int fetch_write_commit_graph = -1;
+static int stdin_refspecs = 0;
 
 static int git_fetch_config(const char *k, const char *v, void *cb)
 {
@@ -209,6 +210,8 @@ static struct option builtin_fetch_options[] = {
 		 N_("check for forced-updates on all updated branches")),
 	OPT_BOOL(0, "write-commit-graph", &fetch_write_commit_graph,
 		 N_("write the commit-graph after fetching")),
+	OPT_BOOL(0, "stdin", &stdin_refspecs,
+		 N_("accept refspecs from stdin")),
 	OPT_END()
 };
 
@@ -1684,7 +1687,8 @@ static inline void fetch_one_setup_partial(struct remote *remote)
 	return;
 }
 
-static int fetch_one(struct remote *remote, int argc, const char **argv, int prune_tags_ok)
+static int fetch_one(struct remote *remote, int argc, const char **argv,
+		     int prune_tags_ok, int use_stdin_refspecs)
 {
 	struct refspec rs = REFSPEC_INIT_FETCH;
 	int i;
@@ -1741,6 +1745,13 @@ static int fetch_one(struct remote *remote, int argc, const char **argv, int pru
 		}
 	}
 
+	if (use_stdin_refspecs) {
+		struct strbuf line = STRBUF_INIT;
+		while (strbuf_getline_lf(&line, stdin) != EOF)
+			refspec_append(&rs, line.buf);
+		strbuf_release(&line);
+	}
+
 	if (server_options.nr)
 		gtransport->server_options = &server_options;
 
@@ -1841,7 +1852,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 	if (remote) {
 		if (filter_options.choice || has_promisor_remote())
 			fetch_one_setup_partial(remote);
-		result = fetch_one(remote, argc, argv, prune_tags_ok);
+		result = fetch_one(remote, argc, argv, prune_tags_ok, stdin_refspecs);
 	} else {
 		int max_children = max_jobs;
 
@@ -1849,6 +1860,10 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 			die(_("--filter can only be used with the remote "
 			      "configured in extensions.partialclone"));
 
+		if (stdin_refspecs)
+			die(_("--stdin can only be used when fetching "
+			      "from one remote"));
+
 		if (max_children < 0)
 			max_children = fetch_parallel_config;
 
-- 
2.28.0.220.ged08abb693-goog


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v3 3/7] fetch: avoid reading submodule config until needed
  2020-08-18  4:01 ` [PATCH v3 " Jonathan Tan
  2020-08-18  4:01   ` [PATCH v3 1/7] negotiator/noop: add noop fetch negotiator Jonathan Tan
  2020-08-18  4:01   ` [PATCH v3 2/7] fetch: allow refspecs specified through stdin Jonathan Tan
@ 2020-08-18  4:01   ` Jonathan Tan
  2020-08-18  4:01   ` [PATCH v3 4/7] fetch: only populate existing_refs if needed Jonathan Tan
                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 57+ messages in thread
From: Jonathan Tan @ 2020-08-18  4:01 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, stolee, gitster

In "fetch", there are two parameters submodule_fetch_jobs_config and
recurse_submodules that can be set in a variety of ways: through
.gitmodules, through .git/config, and through the command line.
Currently "fetch" handles this by first reading .gitmodules, then
reading .git/config (allowing it to overwrite existing values), then
reading the command line (allowing it to overwrite existing values).

Notice that we can avoid reading .gitmodules if .git/config and/or the
command line already provides us with what we need. In addition, if
recurse_submodules is found to be "no", we do not need the value of
submodule_fetch_jobs_config.

Avoiding reading .gitmodules is especially important when we use "git
fetch" to perform lazy fetches in a partial clone because the
.gitmodules file itself might need to be lazy fetched (and otherwise
causing an infinite loop).

In light of all this, avoid reading .gitmodules until necessary. When
reading it, we may only need one of the two parameters it provides, so
teach fetch_config_from_gitmodules() to support NULL arguments. With
this patch, users (including Git itself when invoking "git fetch" to
lazy-fetch) will be able to guarantee avoiding reading .gitmodules by
passing --recurse-submodules=no.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 builtin/fetch.c    | 10 ++++++++--
 submodule-config.c |  8 ++++++--
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index a5498646bf..29db219c68 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1786,12 +1786,18 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 		free(anon);
 	}
 
-	fetch_config_from_gitmodules(&submodule_fetch_jobs_config,
-				     &recurse_submodules);
 	git_config(git_fetch_config, NULL);
 
 	argc = parse_options(argc, argv, prefix,
 			     builtin_fetch_options, builtin_fetch_usage, 0);
+	if (recurse_submodules != RECURSE_SUBMODULES_OFF) {
+		int *sfjc = submodule_fetch_jobs_config == -1
+			    ? &submodule_fetch_jobs_config : NULL;
+		int *rs = recurse_submodules == RECURSE_SUBMODULES_DEFAULT
+			  ? &recurse_submodules : NULL;
+
+		fetch_config_from_gitmodules(sfjc, rs);
+	}
 
 	if (deepen_relative) {
 		if (deepen_relative < 0)
diff --git a/submodule-config.c b/submodule-config.c
index e175dfbc38..c569e22aa3 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -777,10 +777,14 @@ static int gitmodules_fetch_config(const char *var, const char *value, void *cb)
 {
 	struct fetch_config *config = cb;
 	if (!strcmp(var, "submodule.fetchjobs")) {
-		*(config->max_children) = parse_submodule_fetchjobs(var, value);
+		if (config->max_children)
+			*(config->max_children) =
+				parse_submodule_fetchjobs(var, value);
 		return 0;
 	} else if (!strcmp(var, "fetch.recursesubmodules")) {
-		*(config->recurse_submodules) = parse_fetch_recurse_submodules_arg(var, value);
+		if (config->recurse_submodules)
+			*(config->recurse_submodules) =
+				parse_fetch_recurse_submodules_arg(var, value);
 		return 0;
 	}
 
-- 
2.28.0.220.ged08abb693-goog


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v3 4/7] fetch: only populate existing_refs if needed
  2020-08-18  4:01 ` [PATCH v3 " Jonathan Tan
                     ` (2 preceding siblings ...)
  2020-08-18  4:01   ` [PATCH v3 3/7] fetch: avoid reading submodule config until needed Jonathan Tan
@ 2020-08-18  4:01   ` Jonathan Tan
  2020-08-18  4:01   ` [PATCH v3 5/7] fetch-pack: do not lazy-fetch during ref iteration Jonathan Tan
                     ` (3 subsequent siblings)
  7 siblings, 0 replies; 57+ messages in thread
From: Jonathan Tan @ 2020-08-18  4:01 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, stolee, gitster

In "fetch", get_ref_map() iterates over all refs to populate
"existing_refs" in order to populate peer_ref->old_oid in the returned
refmap, even if the refmap has no peer_ref set - which is the case when
only literal hashes (i.e. no refs by name) are fetched.

Iterating over refs causes the targets of those refs to be checked for
existence. Avoiding this is especially important when we use "git fetch"
to perform lazy fetches in a partial clone because a target of such a
ref may need to be itself lazy-fetched (and otherwise causing an
infinite loop).

Therefore, avoid populating "existing_refs" until necessary. With this
patch, because Git lazy-fetches objects by literal hashes (to be done in
a subsequent commit), it will then be able to guarantee avoiding reading
targets of refs.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 builtin/fetch.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 29db219c68..6460ce3f4e 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -449,6 +449,7 @@ static struct ref *get_ref_map(struct remote *remote,
 	struct ref *orefs = NULL, **oref_tail = &orefs;
 
 	struct hashmap existing_refs;
+	int existing_refs_populated = 0;
 
 	if (rs->nr) {
 		struct refspec *fetch_refspec;
@@ -542,15 +543,18 @@ static struct ref *get_ref_map(struct remote *remote,
 
 	ref_map = ref_remove_duplicates(ref_map);
 
-	refname_hash_init(&existing_refs);
-	for_each_ref(add_one_refname, &existing_refs);
-
 	for (rm = ref_map; rm; rm = rm->next) {
 		if (rm->peer_ref) {
 			const char *refname = rm->peer_ref->name;
 			struct refname_hash_entry *peer_item;
 			unsigned int hash = strhash(refname);
 
+			if (!existing_refs_populated) {
+				refname_hash_init(&existing_refs);
+				for_each_ref(add_one_refname, &existing_refs);
+				existing_refs_populated = 1;
+			}
+
 			peer_item = hashmap_get_entry_from_hash(&existing_refs,
 						hash, refname,
 						struct refname_hash_entry, ent);
@@ -560,7 +564,8 @@ static struct ref *get_ref_map(struct remote *remote,
 			}
 		}
 	}
-	hashmap_free_entries(&existing_refs, struct refname_hash_entry, ent);
+	if (existing_refs_populated)
+		hashmap_free_entries(&existing_refs, struct refname_hash_entry, ent);
 
 	return ref_map;
 }
-- 
2.28.0.220.ged08abb693-goog


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v3 5/7] fetch-pack: do not lazy-fetch during ref iteration
  2020-08-18  4:01 ` [PATCH v3 " Jonathan Tan
                     ` (3 preceding siblings ...)
  2020-08-18  4:01   ` [PATCH v3 4/7] fetch: only populate existing_refs if needed Jonathan Tan
@ 2020-08-18  4:01   ` Jonathan Tan
  2020-08-18  4:01   ` [PATCH v3 6/7] promisor-remote: lazy-fetch objects in subprocess Jonathan Tan
                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 57+ messages in thread
From: Jonathan Tan @ 2020-08-18  4:01 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, stolee, gitster

In order to determine negotiation tips, "fetch-pack" iterates over all
refs and dereferences all annotated tags found. This causes the
existence of targets of refs and annotated tags to be checked. Avoiding
this is especially important when we use "git fetch" (which invokes
"fetch-pack") to perform lazy fetches in a partial clone because a
target of such a ref or annotated tag may need to be itself lazy-fetched
(and otherwise causing an infinite loop).

Therefore, teach "fetch-pack" not to lazy fetch whenever iterating over
refs. This is done by using the raw form of ref iteration and by
dereferencing tags ourselves.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 fetch-pack.c             | 79 ++++++++++++++++++++++------------------
 t/t5616-partial-clone.sh | 20 ++++++++++
 2 files changed, 64 insertions(+), 35 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index 80fb3bd899..707bbc31fd 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -108,24 +108,48 @@ static void for_each_cached_alternate(struct fetch_negotiator *negotiator,
 		cb(negotiator, cache.items[i]);
 }
 
+static struct commit *deref_without_lazy_fetch(const struct object_id *oid,
+					       int mark_tags_complete)
+{
+	enum object_type type;
+	struct object_info info = { .typep = &type };
+
+	while (1) {
+		if (oid_object_info_extended(the_repository, oid, &info,
+					     OBJECT_INFO_SKIP_FETCH_OBJECT | OBJECT_INFO_QUICK))
+			return NULL;
+		if (type == OBJ_TAG) {
+			struct tag *tag = (struct tag *)
+				parse_object(the_repository, oid);
+
+			if (!tag->tagged)
+				return NULL;
+			if (mark_tags_complete)
+				tag->object.flags |= COMPLETE;
+			oid = &tag->tagged->oid;
+		} else {
+			break;
+		}
+	}
+	if (type == OBJ_COMMIT)
+		return (struct commit *) parse_object(the_repository, oid);
+	return NULL;
+}
+
 static int rev_list_insert_ref(struct fetch_negotiator *negotiator,
-			       const char *refname,
 			       const struct object_id *oid)
 {
-	struct object *o = deref_tag(the_repository,
-				     parse_object(the_repository, oid),
-				     refname, 0);
-
-	if (o && o->type == OBJ_COMMIT)
-		negotiator->add_tip(negotiator, (struct commit *)o);
+	struct commit *c = deref_without_lazy_fetch(oid, 0);
 
+	if (c)
+		negotiator->add_tip(negotiator, c);
 	return 0;
 }
 
 static int rev_list_insert_ref_oid(const char *refname, const struct object_id *oid,
 				   int flag, void *cb_data)
 {
-	return rev_list_insert_ref(cb_data, refname, oid);
+	return rev_list_insert_ref(cb_data, oid);
 }
 
 enum ack_type {
@@ -201,7 +225,7 @@ static void send_request(struct fetch_pack_args *args,
 static void insert_one_alternate_object(struct fetch_negotiator *negotiator,
 					struct object *obj)
 {
-	rev_list_insert_ref(negotiator, NULL, &obj->oid);
+	rev_list_insert_ref(negotiator, &obj->oid);
 }
 
 #define INITIAL_FLUSH 16
@@ -230,13 +254,12 @@ static void mark_tips(struct fetch_negotiator *negotiator,
 	int i;
 
 	if (!negotiation_tips) {
-		for_each_ref(rev_list_insert_ref_oid, negotiator);
+		for_each_rawref(rev_list_insert_ref_oid, negotiator);
 		return;
 	}
 
 	for (i = 0; i < negotiation_tips->nr; i++)
-		rev_list_insert_ref(negotiator, NULL,
-				    &negotiation_tips->oid[i]);
+		rev_list_insert_ref(negotiator, &negotiation_tips->oid[i]);
 	return;
 }
 
@@ -503,21 +526,11 @@ static struct commit_list *complete;
 
 static int mark_complete(const struct object_id *oid)
 {
-	struct object *o = parse_object(the_repository, oid);
-
-	while (o && o->type == OBJ_TAG) {
-		struct tag *t = (struct tag *) o;
-		if (!t->tagged)
-			break; /* broken repository */
-		o->flags |= COMPLETE;
-		o = parse_object(the_repository, &t->tagged->oid);
-	}
-	if (o && o->type == OBJ_COMMIT) {
-		struct commit *commit = (struct commit *)o;
-		if (!(commit->object.flags & COMPLETE)) {
-			commit->object.flags |= COMPLETE;
-			commit_list_insert(commit, &complete);
-		}
+	struct commit *commit = deref_without_lazy_fetch(oid, 1);
+
+	if (commit && !(commit->object.flags & COMPLETE)) {
+		commit->object.flags |= COMPLETE;
+		commit_list_insert(commit, &complete);
 	}
 	return 0;
 }
@@ -702,7 +715,7 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
 	 */
 	trace2_region_enter("fetch-pack", "mark_complete_local_refs", NULL);
 	if (!args->deepen) {
-		for_each_ref(mark_complete_oid, NULL);
+		for_each_rawref(mark_complete_oid, NULL);
 		for_each_cached_alternate(NULL, mark_alternate_complete);
 		commit_list_sort_by_date(&complete);
 		if (cutoff)
@@ -716,16 +729,12 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
 	 */
 	trace2_region_enter("fetch-pack", "mark_common_remote_refs", NULL);
 	for (ref = *refs; ref; ref = ref->next) {
-		struct object *o = deref_tag(the_repository,
-					     lookup_object(the_repository,
-					     &ref->old_oid),
-					     NULL, 0);
+		struct commit *c = deref_without_lazy_fetch(&ref->old_oid, 0);
 
-		if (!o || o->type != OBJ_COMMIT || !(o->flags & COMPLETE))
+		if (!c || !(c->object.flags & COMPLETE))
 			continue;
 
-		negotiator->known_common(negotiator,
-					 (struct commit *)o);
+		negotiator->known_common(negotiator, c);
 	}
 	trace2_region_leave("fetch-pack", "mark_common_remote_refs", NULL);
 
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 8a27452a51..e53492d595 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -384,6 +384,26 @@ test_expect_success 'fetch lazy-fetches only to resolve deltas, protocol v2' '
 	grep "want $(cat hash)" trace
 '
 
+test_expect_success 'fetch does not lazy-fetch missing targets of its refs' '
+	rm -rf server client trace &&
+
+	test_create_repo server &&
+	test_config -C server uploadpack.allowfilter 1 &&
+	test_config -C server uploadpack.allowanysha1inwant 1 &&
+	test_commit -C server foo &&
+
+	git clone --filter=blob:none "file://$(pwd)/server" client &&
+	# Make all refs point to nothing by deleting all objects.
+	rm client/.git/objects/pack/* &&
+
+	test_commit -C server bar &&
+	GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch \
+		--no-tags --recurse-submodules=no \
+		origin refs/tags/bar &&
+	FOO_HASH=$(git -C server rev-parse foo) &&
+	! grep "want $FOO_HASH" trace
+'
+
 # The following two tests must be in this order. It is important that
 # the srv.bare repository did not have tags during clone, but has tags
 # in the fetch.
-- 
2.28.0.220.ged08abb693-goog


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v3 6/7] promisor-remote: lazy-fetch objects in subprocess
  2020-08-18  4:01 ` [PATCH v3 " Jonathan Tan
                     ` (4 preceding siblings ...)
  2020-08-18  4:01   ` [PATCH v3 5/7] fetch-pack: do not lazy-fetch during ref iteration Jonathan Tan
@ 2020-08-18  4:01   ` Jonathan Tan
  2020-08-18  4:01   ` [PATCH v3 7/7] fetch-pack: remove no_dependents code Jonathan Tan
  2020-08-18 19:56   ` [PATCH v3 0/7] Lazy fetch with subprocess Junio C Hamano
  7 siblings, 0 replies; 57+ messages in thread
From: Jonathan Tan @ 2020-08-18  4:01 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, stolee, gitster

Teach Git to lazy-fetch missing objects in a subprocess instead of doing
it in-process. This allows any fatal errors that occur during the fetch
to be isolated and converted into an error return value, instead of
causing the current command being run to terminate.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 Documentation/technical/partial-clone.txt | 13 ++-----
 promisor-remote.c                         | 46 +++++++++++------------
 t/t0410-partial-clone.sh                  |  2 +-
 t/t4067-diff-partial-clone.sh             |  8 ++--
 t/t5601-clone.sh                          |  2 +-
 5 files changed, 30 insertions(+), 41 deletions(-)

diff --git a/Documentation/technical/partial-clone.txt b/Documentation/technical/partial-clone.txt
index b9e17e7a28..0780d30cac 100644
--- a/Documentation/technical/partial-clone.txt
+++ b/Documentation/technical/partial-clone.txt
@@ -171,20 +171,13 @@ additional flag.
 Fetching Missing Objects
 ------------------------
 
-- Fetching of objects is done using the existing transport mechanism using
-  transport_fetch_refs(), setting a new transport option
-  TRANS_OPT_NO_DEPENDENTS to indicate that only the objects themselves are
-  desired, not any object that they refer to.
-+
-Because some transports invoke fetch_pack() in the same process, fetch_pack()
-has been updated to not use any object flags when the corresponding argument
-(no_dependents) is set.
+- Fetching of objects is done by invoking a "git fetch" subprocess.
 
 - The local repository sends a request with the hashes of all requested
-  objects as "want" lines, and does not perform any packfile negotiation.
+  objects, and does not perform any packfile negotiation.
   It then receives a packfile.
 
-- Because we are reusing the existing fetch-pack mechanism, fetching
+- Because we are reusing the existing fetch mechanism, fetching
   currently fetches all objects referred to by the requested objects, even
   though they are not necessary.
 
diff --git a/promisor-remote.c b/promisor-remote.c
index baaea12fd6..6e647610e9 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -3,6 +3,7 @@
 #include "promisor-remote.h"
 #include "config.h"
 #include "transport.h"
+#include "argv-array.h"
 
 static char *repository_format_partial_clone;
 static const char *core_partial_clone_filter_default;
@@ -12,39 +13,34 @@ void set_repository_format_partial_clone(char *partial_clone)
 	repository_format_partial_clone = xstrdup_or_null(partial_clone);
 }
 
-static int fetch_refs(const char *remote_name, struct ref *ref)
-{
-	struct remote *remote;
-	struct transport *transport;
-	int res;
-
-	remote = remote_get(remote_name);
-	if (!remote->url[0])
-		die(_("Remote with no URL"));
-	transport = transport_get(remote, remote->url[0]);
-
-	transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
-	transport_set_option(transport, TRANS_OPT_NO_DEPENDENTS, "1");
-	res = transport_fetch_refs(transport, ref);
-
-	return res;
-}
-
 static int fetch_objects(const char *remote_name,
 			 const struct object_id *oids,
 			 int oid_nr)
 {
-	struct ref *ref = NULL;
+	struct child_process child = CHILD_PROCESS_INIT;
 	int i;
+	FILE *child_in;
+
+	child.git_cmd = 1;
+	child.in = -1;
+	argv_array_pushl(&child.args, "-c", "fetch.negotiationAlgorithm=null",
+			 "fetch", remote_name, "--no-tags",
+			 "--no-write-fetch-head", "--recurse-submodules=no",
+			 "--filter=blob:none", "--stdin", NULL);
+	if (start_command(&child))
+		die(_("promisor-remote: unable to fork off fetch subprocess"));
+	child_in = xfdopen(child.in, "w");
 
 	for (i = 0; i < oid_nr; i++) {
-		struct ref *new_ref = alloc_ref(oid_to_hex(&oids[i]));
-		oidcpy(&new_ref->old_oid, &oids[i]);
-		new_ref->exact_oid = 1;
-		new_ref->next = ref;
-		ref = new_ref;
+		if (fputs(oid_to_hex(&oids[i]), child_in) < 0)
+			die_errno(_("promisor-remote: could not write to fetch subprocess"));
+		if (fputc('\n', child_in) < 0)
+			die_errno(_("promisor-remote: could not write to fetch subprocess"));
 	}
-	return fetch_refs(remote_name, ref);
+
+	if (fclose(child_in) < 0)
+		die_errno(_("promisor-remote: could not close stdin to fetch subprocess"));
+	return finish_command(&child) ? -1 : 0;
 }
 
 static struct promisor_remote *promisors;
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index 463dc3a8be..3e454f934e 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -203,7 +203,7 @@ test_expect_success 'fetching of missing objects works with ref-in-want enabled'
 	rm -rf repo/.git/objects/* &&
 	rm -f trace &&
 	GIT_TRACE_PACKET="$(pwd)/trace" git -C repo cat-file -p "$HASH" &&
-	grep "git< fetch=.*ref-in-want" trace
+	grep "fetch< fetch=.*ref-in-want" trace
 '
 
 test_expect_success 'fetching of missing objects from another promisor remote' '
diff --git a/t/t4067-diff-partial-clone.sh b/t/t4067-diff-partial-clone.sh
index ef8e0e9cb0..804f2a82e8 100755
--- a/t/t4067-diff-partial-clone.sh
+++ b/t/t4067-diff-partial-clone.sh
@@ -20,7 +20,7 @@ test_expect_success 'git show batches blobs' '
 	# Ensure that there is exactly 1 negotiation by checking that there is
 	# only 1 "done" line sent. ("done" marks the end of negotiation.)
 	GIT_TRACE_PACKET="$(pwd)/trace" git -C client show HEAD &&
-	grep "git> done" trace >done_lines &&
+	grep "fetch> done" trace >done_lines &&
 	test_line_count = 1 done_lines
 '
 
@@ -44,7 +44,7 @@ test_expect_success 'diff batches blobs' '
 	# Ensure that there is exactly 1 negotiation by checking that there is
 	# only 1 "done" line sent. ("done" marks the end of negotiation.)
 	GIT_TRACE_PACKET="$(pwd)/trace" git -C client diff HEAD^ HEAD &&
-	grep "git> done" trace >done_lines &&
+	grep "fetch> done" trace >done_lines &&
 	test_line_count = 1 done_lines
 '
 
@@ -127,7 +127,7 @@ test_expect_success 'diff with rename detection batches blobs' '
 	# only 1 "done" line sent. ("done" marks the end of negotiation.)
 	GIT_TRACE_PACKET="$(pwd)/trace" git -C client diff --raw -M HEAD^ HEAD >out &&
 	grep ":100644 100644.*R[0-9][0-9][0-9].*b.*c" out &&
-	grep "git> done" trace >done_lines &&
+	grep "fetch> done" trace >done_lines &&
 	test_line_count = 1 done_lines
 '
 
@@ -175,7 +175,7 @@ test_expect_success 'diff --break-rewrites fetches only if necessary, and batche
 	# by checking that there is only 1 "done" line sent. ("done" marks the
 	# end of negotiation.)
 	GIT_TRACE_PACKET="$(pwd)/trace" git -C client diff --break-rewrites --raw -M HEAD^ HEAD &&
-	grep "git> done" trace >done_lines &&
+	grep "fetch> done" trace >done_lines &&
 	test_line_count = 1 done_lines
 '
 
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 84ea2a3eb7..f82b0dbb5a 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -702,7 +702,7 @@ test_expect_success 'batch missing blob request during checkout' '
 	# Ensure that there is only one negotiation by checking that there is
 	# only "done" line sent. ("done" marks the end of negotiation.)
 	GIT_TRACE_PACKET="$(pwd)/trace" git -C client checkout HEAD^ &&
-	grep "git> done" trace >done_lines &&
+	grep "fetch> done" trace >done_lines &&
 	test_line_count = 1 done_lines
 '
 
-- 
2.28.0.220.ged08abb693-goog


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v3 7/7] fetch-pack: remove no_dependents code
  2020-08-18  4:01 ` [PATCH v3 " Jonathan Tan
                     ` (5 preceding siblings ...)
  2020-08-18  4:01   ` [PATCH v3 6/7] promisor-remote: lazy-fetch objects in subprocess Jonathan Tan
@ 2020-08-18  4:01   ` Jonathan Tan
  2020-08-18 19:56   ` [PATCH v3 0/7] Lazy fetch with subprocess Junio C Hamano
  7 siblings, 0 replies; 57+ messages in thread
From: Jonathan Tan @ 2020-08-18  4:01 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, stolee, gitster

Now that Git has switched to using a subprocess to lazy-fetch missing
objects, remove the no_dependents code as it is no longer used.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 builtin/fetch-pack.c |   4 --
 fetch-pack.c         | 110 ++++++++++++-------------------------------
 fetch-pack.h         |  14 ------
 remote-curl.c        |   6 ---
 transport.c          |   4 --
 transport.h          |   7 ---
 6 files changed, 30 insertions(+), 115 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index bbb5c96167..58b7c1fbdc 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -153,10 +153,6 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 			args.from_promisor = 1;
 			continue;
 		}
-		if (!strcmp("--no-dependents", arg)) {
-			args.no_dependents = 1;
-			continue;
-		}
 		if (skip_prefix(arg, ("--" CL_ARG__FILTER "="), &arg)) {
 			parse_list_objects_filter(&args.filter_options, arg);
 			continue;
diff --git a/fetch-pack.c b/fetch-pack.c
index 707bbc31fd..3212957dae 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -285,10 +285,8 @@ static int find_common(struct fetch_negotiator *negotiator,
 			   PACKET_READ_CHOMP_NEWLINE |
 			   PACKET_READ_DIE_ON_ERR_PACKET);
 
-	if (!args->no_dependents) {
-		mark_tips(negotiator, args->negotiation_tips);
-		for_each_cached_alternate(negotiator, insert_one_alternate_object);
-	}
+	mark_tips(negotiator, args->negotiation_tips);
+	for_each_cached_alternate(negotiator, insert_one_alternate_object);
 
 	fetching = 0;
 	for ( ; refs ; refs = refs->next) {
@@ -305,12 +303,8 @@ static int find_common(struct fetch_negotiator *negotiator,
 		 * We use lookup_object here because we are only
 		 * interested in the case we *know* the object is
 		 * reachable and we have already scanned it.
-		 *
-		 * Do this only if args->no_dependents is false (if it is true,
-		 * we cannot trust the object flags).
 		 */
-		if (!args->no_dependents &&
-		    ((o = lookup_object(the_repository, remote)) != NULL) &&
+		if (((o = lookup_object(the_repository, remote)) != NULL) &&
 				(o->flags & COMPLETE)) {
 			continue;
 		}
@@ -410,8 +404,6 @@ static int find_common(struct fetch_negotiator *negotiator,
 	trace2_region_enter("fetch-pack", "negotiation_v0_v1", the_repository);
 	flushes = 0;
 	retval = -1;
-	if (args->no_dependents)
-		goto done;
 	while ((oid = negotiator->next(negotiator))) {
 		packet_buf_write(&req_buf, "have %s\n", oid_to_hex(oid));
 		print_verbose(args, "have %s", oid_to_hex(oid));
@@ -666,9 +658,7 @@ struct loose_object_iter {
 
 /*
  * Mark recent commits available locally and reachable from a local ref as
- * COMPLETE. If args->no_dependents is false, also mark COMPLETE remote refs as
- * COMMON_REF (otherwise, we are not planning to participate in negotiation, and
- * thus do not need COMMON_REF marks).
+ * COMPLETE.
  *
  * The cutoff time for recency is determined by this heuristic: it is the
  * earliest commit time of the objects in refs that are commits and that we know
@@ -969,12 +959,8 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args,
 	struct fetch_negotiator negotiator_alloc;
 	struct fetch_negotiator *negotiator;
 
-	if (args->no_dependents) {
-		negotiator = NULL;
-	} else {
-		negotiator = &negotiator_alloc;
-		fetch_negotiator_init(r, negotiator);
-	}
+	negotiator = &negotiator_alloc;
+	fetch_negotiator_init(r, negotiator);
 
 	sort_ref_list(&ref, ref_compare_name);
 	QSORT(sought, nr_sought, cmp_ref_by_name);
@@ -1062,15 +1048,11 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args,
 	if (!server_supports_hash(the_hash_algo->name, NULL))
 		die(_("Server does not support this repository's object format"));
 
-	if (!args->no_dependents) {
-		mark_complete_and_common_ref(negotiator, args, &ref);
-		filter_refs(args, &ref, sought, nr_sought);
-		if (everything_local(args, &ref)) {
-			packet_flush(fd[1]);
-			goto all_done;
-		}
-	} else {
-		filter_refs(args, &ref, sought, nr_sought);
+	mark_complete_and_common_ref(negotiator, args, &ref);
+	filter_refs(args, &ref, sought, nr_sought);
+	if (everything_local(args, &ref)) {
+		packet_flush(fd[1]);
+		goto all_done;
 	}
 	if (find_common(negotiator, args, fd, &oid, ref) < 0)
 		if (!args->keep_pack)
@@ -1119,7 +1101,7 @@ static void add_shallow_requests(struct strbuf *req_buf,
 		packet_buf_write(req_buf, "deepen-relative\n");
 }
 
-static void add_wants(int no_dependents, const struct ref *wants, struct strbuf *req_buf)
+static void add_wants(const struct ref *wants, struct strbuf *req_buf)
 {
 	int use_ref_in_want = server_supports_feature("fetch", "ref-in-want", 0);
 
@@ -1136,12 +1118,8 @@ static void add_wants(int no_dependents, const struct ref *wants, struct strbuf
 		 * We use lookup_object here because we are only
 		 * interested in the case we *know* the object is
 		 * reachable and we have already scanned it.
-		 *
-		 * Do this only if args->no_dependents is false (if it is true,
-		 * we cannot trust the object flags).
 		 */
-		if (!no_dependents &&
-		    ((o = lookup_object(the_repository, remote)) != NULL) &&
+		if (((o = lookup_object(the_repository, remote)) != NULL) &&
 		    (o->flags & COMPLETE)) {
 			continue;
 		}
@@ -1275,19 +1253,14 @@ static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
 	}
 
 	/* add wants */
-	add_wants(args->no_dependents, wants, &req_buf);
+	add_wants(wants, &req_buf);
 
-	if (args->no_dependents) {
-		packet_buf_write(&req_buf, "done");
-		ret = 1;
-	} else {
-		/* Add all of the common commits we've found in previous rounds */
-		add_common(&req_buf, common);
+	/* Add all of the common commits we've found in previous rounds */
+	add_common(&req_buf, common);
 
-		/* Add initial haves */
-		ret = add_haves(negotiator, seen_ack, &req_buf,
-				haves_to_send, in_vain);
-	}
+	/* Add initial haves */
+	ret = add_haves(negotiator, seen_ack, &req_buf,
+			haves_to_send, in_vain);
 
 	/* Send request */
 	packet_buf_flush(&req_buf);
@@ -1547,12 +1520,8 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 	struct string_list packfile_uris = STRING_LIST_INIT_DUP;
 	int i;
 
-	if (args->no_dependents) {
-		negotiator = NULL;
-	} else {
-		negotiator = &negotiator_alloc;
-		fetch_negotiator_init(r, negotiator);
-	}
+	negotiator = &negotiator_alloc;
+	fetch_negotiator_init(r, negotiator);
 
 	packet_reader_init(&reader, fd[0], NULL, 0,
 			   PACKET_READ_CHOMP_NEWLINE |
@@ -1576,21 +1545,16 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 				args->deepen = 1;
 
 			/* Filter 'ref' by 'sought' and those that aren't local */
-			if (!args->no_dependents) {
-				mark_complete_and_common_ref(negotiator, args, &ref);
-				filter_refs(args, &ref, sought, nr_sought);
-				if (everything_local(args, &ref))
-					state = FETCH_DONE;
-				else
-					state = FETCH_SEND_REQUEST;
-
-				mark_tips(negotiator, args->negotiation_tips);
-				for_each_cached_alternate(negotiator,
-							  insert_one_alternate_object);
-			} else {
-				filter_refs(args, &ref, sought, nr_sought);
+			mark_complete_and_common_ref(negotiator, args, &ref);
+			filter_refs(args, &ref, sought, nr_sought);
+			if (everything_local(args, &ref))
+				state = FETCH_DONE;
+			else
 				state = FETCH_SEND_REQUEST;
-			}
+
+			mark_tips(negotiator, args->negotiation_tips);
+			for_each_cached_alternate(negotiator,
+						  insert_one_alternate_object);
 			break;
 		case FETCH_SEND_REQUEST:
 			if (!negotiation_started) {
@@ -1911,20 +1875,6 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
 	if (nr_sought)
 		nr_sought = remove_duplicates_in_refs(sought, nr_sought);
 
-	if (args->no_dependents && !args->filter_options.choice) {
-		/*
-		 * The protocol does not support requesting that only the
-		 * wanted objects be sent, so approximate this by setting a
-		 * "blob:none" filter if no filter is already set. This works
-		 * for all object types: note that wanted blobs will still be
-		 * sent because they are directly specified as a "want".
-		 *
-		 * NEEDSWORK: Add an option in the protocol to request that
-		 * only the wanted objects be sent, and implement it.
-		 */
-		parse_list_objects_filter(&args->filter_options, "blob:none");
-	}
-
 	if (version != protocol_v2 && !ref) {
 		packet_flush(fd[1]);
 		die(_("no matching remote head"));
diff --git a/fetch-pack.h b/fetch-pack.h
index 85d1e39fe7..bbe2938059 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -42,20 +42,6 @@ struct fetch_pack_args {
 	unsigned deepen:1;
 	unsigned from_promisor:1;
 
-	/*
-	 * Attempt to fetch only the wanted objects, and not any objects
-	 * referred to by them. Due to protocol limitations, extraneous
-	 * objects may still be included. (When fetching non-blob
-	 * objects, only blobs are excluded; when fetching a blob, the
-	 * blob itself will still be sent. The client does not need to
-	 * know whether a wanted object is a blob or not.)
-	 *
-	 * If 1, fetch_pack() will also not modify any object flags.
-	 * This allows fetch_pack() to safely be called by any function,
-	 * regardless of which object flags it uses (if any).
-	 */
-	unsigned no_dependents:1;
-
 	/*
 	 * Because fetch_pack() overwrites the shallow file upon a
 	 * successful deepening non-clone fetch, if this struct
diff --git a/remote-curl.c b/remote-curl.c
index 5cbc6e5002..a0c81a64bc 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -40,7 +40,6 @@ struct options {
 		push_cert : 2,
 		deepen_relative : 1,
 		from_promisor : 1,
-		no_dependents : 1,
 		atomic : 1,
 		object_format : 1;
 	const struct git_hash_algo *hash_algo;
@@ -186,9 +185,6 @@ static int set_option(const char *name, const char *value)
 	} else if (!strcmp(name, "from-promisor")) {
 		options.from_promisor = 1;
 		return 0;
-	} else if (!strcmp(name, "no-dependents")) {
-		options.no_dependents = 1;
-		return 0;
 	} else if (!strcmp(name, "filter")) {
 		options.filter = xstrdup(value);
 		return 0;
@@ -1171,8 +1167,6 @@ static int fetch_git(struct discovery *heads,
 		argv_array_push(&args, "--deepen-relative");
 	if (options.from_promisor)
 		argv_array_push(&args, "--from-promisor");
-	if (options.no_dependents)
-		argv_array_push(&args, "--no-dependents");
 	if (options.filter)
 		argv_array_pushf(&args, "--filter=%s", options.filter);
 	argv_array_push(&args, url.buf);
diff --git a/transport.c b/transport.c
index b41386eccb..32e1f21f0c 100644
--- a/transport.c
+++ b/transport.c
@@ -232,9 +232,6 @@ static int set_git_option(struct git_transport_options *opts,
 	} else if (!strcmp(name, TRANS_OPT_FROM_PROMISOR)) {
 		opts->from_promisor = !!value;
 		return 0;
-	} else if (!strcmp(name, TRANS_OPT_NO_DEPENDENTS)) {
-		opts->no_dependents = !!value;
-		return 0;
 	} else if (!strcmp(name, TRANS_OPT_LIST_OBJECTS_FILTER)) {
 		list_objects_filter_die_if_populated(&opts->filter_options);
 		parse_list_objects_filter(&opts->filter_options, value);
@@ -359,7 +356,6 @@ static int fetch_refs_via_pack(struct transport *transport,
 	args.cloning = transport->cloning;
 	args.update_shallow = data->options.update_shallow;
 	args.from_promisor = data->options.from_promisor;
-	args.no_dependents = data->options.no_dependents;
 	args.filter_options = data->options.filter_options;
 	args.stateless_rpc = transport->stateless_rpc;
 	args.server_options = transport->server_options;
diff --git a/transport.h b/transport.h
index b3c30133ea..7aa1f33145 100644
--- a/transport.h
+++ b/transport.h
@@ -16,7 +16,6 @@ struct git_transport_options {
 	unsigned update_shallow : 1;
 	unsigned deepen_relative : 1;
 	unsigned from_promisor : 1;
-	unsigned no_dependents : 1;
 
 	/*
 	 * If this transport supports connect or stateless-connect,
@@ -201,12 +200,6 @@ void transport_check_allowed(const char *type);
 /* Indicate that these objects are being fetched by a promisor */
 #define TRANS_OPT_FROM_PROMISOR "from-promisor"
 
-/*
- * Indicate that only the objects wanted need to be fetched, not their
- * dependents
- */
-#define TRANS_OPT_NO_DEPENDENTS "no-dependents"
-
 /* Filter objects for partial clone and fetch */
 #define TRANS_OPT_LIST_OBJECTS_FILTER "filter"
 
-- 
2.28.0.220.ged08abb693-goog


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 0/7] Lazy fetch with subprocess
  2020-08-18  4:01 ` [PATCH v3 " Jonathan Tan
                     ` (6 preceding siblings ...)
  2020-08-18  4:01   ` [PATCH v3 7/7] fetch-pack: remove no_dependents code Jonathan Tan
@ 2020-08-18 19:56   ` Junio C Hamano
  2020-08-18 22:32     ` Junio C Hamano
  7 siblings, 1 reply; 57+ messages in thread
From: Junio C Hamano @ 2020-08-18 19:56 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, stolee

Jonathan Tan <jonathantanmy@google.com> writes:

> These patches are based on jc/no-update-fetch-head, once again.

FWIW, that topic no longer exists (it has been swallowed by
Derrick's maintenance topics).  As that topic is not near 'next'
yet, we might want to kick the patch out of it into a separate
jc/no-update-fetch-head topic again, base this series and also
maintenance topics on top of it.

> 2:  00ad7dd875 = 2:  9f277f1631 fetch: allow refspecs specified through stdin
> 3:  8b4a522a13 ! 3:  fda9f834f6 fetch: avoid reading submodule config until needed
>     @@ Metadata
>       ## Commit message ##
>          fetch: avoid reading submodule config until needed
>      
>     -    Teach "git fetch" to avoid reading the submodule config until necessary.
>     -    This allows users to avoid the lazy-fetching of this potentially missing
>     -    config file by specifying the --recurse-submodules=no command line
>     -    option.
>     +    In "fetch", there are two parameters submodule_fetch_jobs_config and
>     +    recurse_submodules that can be set in a variety of ways: through
>     +    .gitmodules, through .git/config, and through the command line.
>     +    Currently "fetch" handles this by first reading .gitmodules, then
>     +    reading .git/config (allowing it to overwrite existing values), then
>     +    reading the command line (allowing it to overwrite existing values).
>     +
>     +    Notice that we can avoid reading .gitmodules if .git/config and/or the
>     +    command line already provides us with what we need. In addition, if
>     +    recurse_submodules is found to be "no", we do not need the value of
>     +    submodule_fetch_jobs_config.
>     +
>     +    Avoiding reading .gitmodules is especially important when we use "git
>     +    fetch" to perform lazy fetches in a partial clone because the
>     +    .gitmodules file itself might need to be lazy fetched (and otherwise
>     +    causing an infinite loop).
>     +
>     +    In light of all this, avoid reading .gitmodules until necessary. When
>     +    reading it, we may only need one of the two parameters it provides, so
>     +    teach fetch_config_from_gitmodules() to support NULL arguments. With
>     +    this patch, users (including Git itself when invoking "git fetch" to
>     +    lazy-fetch) will be able to guarantee avoiding reading .gitmodules by
>     +    passing --recurse-submodules=no.

Quite sensible.

> 4:  77bc83e7f2 ! 4:  a5554cd27f fetch: only populate existing_refs if needed
>     @@ Metadata
>       ## Commit message ##
>          fetch: only populate existing_refs if needed
>      
>     -    When fetching tags, Git only writes tags that do not already exist in
>     -    the client repository. This necessitates an iteration over all the refs,
>     -    but fetch performs this iteration even if no tags are fetched.
>     +    In "fetch", get_ref_map() iterates over all refs to populate
>     +    "existing_refs" in order to populate peer_ref->old_oid in the returned
>     +    refmap, even if the refmap has no peer_ref set - which is the case when
>     +    only literal hashes (i.e. no refs by name) are fetched.

Much better---the previous round gave us a wrong impression that
the change is about the behaviour when fetching tags, but the
updated explanation makes it clear that the primary use case is to
avoid tag-following while directly fetching objects by names, not
via refs.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 0/7] Lazy fetch with subprocess
  2020-08-18 19:56   ` [PATCH v3 0/7] Lazy fetch with subprocess Junio C Hamano
@ 2020-08-18 22:32     ` Junio C Hamano
  2020-08-18 23:36       ` [PATCH] fixup! promisor-remote: lazy-fetch objects in subprocess Jonathan Tan
  0 siblings, 1 reply; 57+ messages in thread
From: Junio C Hamano @ 2020-08-18 22:32 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, stolee

Junio C Hamano <gitster@pobox.com> writes:

> Jonathan Tan <jonathantanmy@google.com> writes:
>
>> These patches are based on jc/no-update-fetch-head, once again.
>
> FWIW, that topic no longer exists (it has been swallowed by
> Derrick's maintenance topics).  As that topic is not near 'next'
> yet, we might want to kick the patch out of it into a separate
> jc/no-update-fetch-head topic again, base this series and also
> maintenance topics on top of it.

So, I've recreated jc/no-update-fetch-head on top of 'master' and
then rebased these patches on top (which required some adjustment
due to argv-array -> strvec API rename).

t5300 stops passing at "[6/7] promisor-remote: lazy-fetch", which
I haven't looked into details, but the topic is queued near the tip
of 'seen' for today's integration cycle.

Thanks.



^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH] fixup! promisor-remote: lazy-fetch objects in subprocess
  2020-08-18 22:32     ` Junio C Hamano
@ 2020-08-18 23:36       ` Jonathan Tan
  2020-08-18 23:57         ` Junio C Hamano
  0 siblings, 1 reply; 57+ messages in thread
From: Jonathan Tan @ 2020-08-18 23:36 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

---
Ah, sorry about that - here's a fixup. I forgot to switch the name of
the negotiator in the patch (and also because of the change to a "fetch"
subprocess, the trace is reported as "fetch>", not "git>").

 promisor-remote.c      | 2 +-
 t/t5300-pack-object.sh | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/promisor-remote.c b/promisor-remote.c
index 56d6d4d821..6530e26f98 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -23,7 +23,7 @@ static int fetch_objects(const char *remote_name,
 
 	child.git_cmd = 1;
 	child.in = -1;
-	strvec_pushl(&child.args, "-c", "fetch.negotiationAlgorithm=null",
+	strvec_pushl(&child.args, "-c", "fetch.negotiationAlgorithm=noop",
 		     "fetch", remote_name, "--no-tags",
 		     "--no-write-fetch-head", "--recurse-submodules=no",
 		     "--filter=blob:none", "--stdin", NULL);
diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
index 3d6a93343a..392201cabd 100755
--- a/t/t5300-pack-object.sh
+++ b/t/t5300-pack-object.sh
@@ -528,7 +528,7 @@ test_expect_success 'prefetch objects' '
 	TWO=$(git -C server rev-parse three_branch^) &&
 	git -C client fetch --filter=blob:none origin "$TWO" &&
 	GIT_TRACE_PACKET=$(pwd)/trace git -C client push origin "$TWO":refs/heads/two_branch &&
-	grep "git> done" trace >donelines &&
+	grep "fetch> done" trace >donelines &&
 	test_line_count = 1 donelines
 '
 
-- 
2.28.0.220.ged08abb693-goog


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [PATCH] fixup! promisor-remote: lazy-fetch objects in subprocess
  2020-08-18 23:36       ` [PATCH] fixup! promisor-remote: lazy-fetch objects in subprocess Jonathan Tan
@ 2020-08-18 23:57         ` Junio C Hamano
  0 siblings, 0 replies; 57+ messages in thread
From: Junio C Hamano @ 2020-08-18 23:57 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

> ---
> Ah, sorry about that - here's a fixup. I forgot to switch the name of
> the negotiator in the patch (and also because of the change to a "fetch"
> subprocess, the trace is reported as "fetch>", not "git>").

Excellent.  Thanks for a quick update.

>
>  promisor-remote.c      | 2 +-
>  t/t5300-pack-object.sh | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/promisor-remote.c b/promisor-remote.c
> index 56d6d4d821..6530e26f98 100644
> --- a/promisor-remote.c
> +++ b/promisor-remote.c
> @@ -23,7 +23,7 @@ static int fetch_objects(const char *remote_name,
>  
>  	child.git_cmd = 1;
>  	child.in = -1;
> -	strvec_pushl(&child.args, "-c", "fetch.negotiationAlgorithm=null",
> +	strvec_pushl(&child.args, "-c", "fetch.negotiationAlgorithm=noop",
>  		     "fetch", remote_name, "--no-tags",
>  		     "--no-write-fetch-head", "--recurse-submodules=no",
>  		     "--filter=blob:none", "--stdin", NULL);
> diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
> index 3d6a93343a..392201cabd 100755
> --- a/t/t5300-pack-object.sh
> +++ b/t/t5300-pack-object.sh
> @@ -528,7 +528,7 @@ test_expect_success 'prefetch objects' '
>  	TWO=$(git -C server rev-parse three_branch^) &&
>  	git -C client fetch --filter=blob:none origin "$TWO" &&
>  	GIT_TRACE_PACKET=$(pwd)/trace git -C client push origin "$TWO":refs/heads/two_branch &&
> -	grep "git> done" trace >donelines &&
> +	grep "fetch> done" trace >donelines &&
>  	test_line_count = 1 donelines
>  '

^ permalink raw reply	[flat|nested] 57+ messages in thread

end of thread, other threads:[~2020-08-18 23:57 UTC | newest]

Thread overview: 57+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-24 22:38 [RFC PATCH] Modify fetch-pack to no longer die on error? Jonathan Tan
2020-07-24 23:07 ` Junio C Hamano
2020-07-24 23:11 ` Junio C Hamano
2020-07-25 21:41   ` Jeff King
2020-07-25 23:01     ` Junio C Hamano
2020-07-27 17:11       ` Jeff King
2020-07-28 19:23       ` Jonathan Tan
2020-07-28 20:08         ` Jeff King
2020-07-29 18:53           ` Jonathan Tan
2020-07-29 19:29             ` Jeff King
2020-07-29 19:02           ` Junio C Hamano
2020-07-29 22:55             ` Jonathan Tan
2020-08-05  1:20 ` [RFC PATCH 0/7] Lazy fetch with subprocess Jonathan Tan
2020-08-05  1:20   ` [RFC PATCH 1/7] fetch-pack: allow NULL negotiator->add_tip Jonathan Tan
2020-08-05 19:53     ` Junio C Hamano
2020-08-07 20:53       ` Jonathan Tan
2020-08-05  1:20   ` [RFC PATCH 2/7] fetch-pack: allow NULL negotiator->known_common Jonathan Tan
2020-08-05 20:08     ` Junio C Hamano
2020-08-05 22:11       ` Junio C Hamano
2020-08-07 20:59         ` Jonathan Tan
2020-08-05  1:20   ` [RFC PATCH 3/7] negotiator/null: add null fetch negotiator Jonathan Tan
2020-08-05  1:20   ` [RFC PATCH 4/7] fetch: --stdin Jonathan Tan
2020-08-05 20:33     ` Junio C Hamano
2020-08-07 21:10       ` Jonathan Tan
2020-08-07 21:58         ` Junio C Hamano
2020-08-07 21:10       ` Jonathan Tan
2020-08-05  1:20   ` [RFC PATCH 5/7] fetch: submodule config Jonathan Tan
2020-08-05  1:20   ` [RFC PATCH 6/7] fetch: only populate existing_refs if needed Jonathan Tan
2020-08-05  1:20   ` [RFC PATCH 7/7] promisor-remote: use subprocess to fetch Jonathan Tan
2020-08-11 22:52 ` [PATCH v2 0/7] Lazy fetch with subprocess Jonathan Tan
2020-08-11 22:52   ` [PATCH v2 1/7] negotiator/null: add null fetch negotiator Jonathan Tan
2020-08-12 12:55     ` Derrick Stolee
2020-08-12 16:44       ` Junio C Hamano
2020-08-12 17:29         ` Jonathan Tan
2020-08-11 22:52   ` [PATCH v2 2/7] fetch: allow refspecs specified through stdin Jonathan Tan
2020-08-11 22:52   ` [PATCH v2 3/7] fetch: avoid reading submodule config until needed Jonathan Tan
2020-08-12 17:34     ` Junio C Hamano
2020-08-11 22:52   ` [PATCH v2 4/7] fetch: only populate existing_refs if needed Jonathan Tan
2020-08-12 18:06     ` Junio C Hamano
2020-08-11 22:52   ` [PATCH v2 5/7] fetch-pack: do not lazy-fetch during ref iteration Jonathan Tan
2020-08-12 18:25     ` Junio C Hamano
2020-08-11 22:52   ` [PATCH v2 6/7] promisor-remote: lazy-fetch objects in subprocess Jonathan Tan
2020-08-12 18:28     ` Junio C Hamano
2020-08-11 22:52   ` [PATCH v2 7/7] fetch-pack: remove no_dependents code Jonathan Tan
2020-08-12 12:51   ` [PATCH v2 0/7] Lazy fetch with subprocess Derrick Stolee
2020-08-18  4:01 ` [PATCH v3 " Jonathan Tan
2020-08-18  4:01   ` [PATCH v3 1/7] negotiator/noop: add noop fetch negotiator Jonathan Tan
2020-08-18  4:01   ` [PATCH v3 2/7] fetch: allow refspecs specified through stdin Jonathan Tan
2020-08-18  4:01   ` [PATCH v3 3/7] fetch: avoid reading submodule config until needed Jonathan Tan
2020-08-18  4:01   ` [PATCH v3 4/7] fetch: only populate existing_refs if needed Jonathan Tan
2020-08-18  4:01   ` [PATCH v3 5/7] fetch-pack: do not lazy-fetch during ref iteration Jonathan Tan
2020-08-18  4:01   ` [PATCH v3 6/7] promisor-remote: lazy-fetch objects in subprocess Jonathan Tan
2020-08-18  4:01   ` [PATCH v3 7/7] fetch-pack: remove no_dependents code Jonathan Tan
2020-08-18 19:56   ` [PATCH v3 0/7] Lazy fetch with subprocess Junio C Hamano
2020-08-18 22:32     ` Junio C Hamano
2020-08-18 23:36       ` [PATCH] fixup! promisor-remote: lazy-fetch objects in subprocess Jonathan Tan
2020-08-18 23:57         ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).