git@vger.kernel.org mailing list mirror (one of many)
* [PATCH] do not reset in_vain on non-novel acks
@ 2016-09-22 18:36 Jonathan Tan
  2016-09-22 18:36 ` [PATCH] fetch-pack: " Jonathan Tan
  2016-09-22 19:20 ` [PATCH] " Junio C Hamano
  0 siblings, 2 replies; 6+ messages in thread
From: Jonathan Tan @ 2016-09-22 18:36 UTC
  To: git; +Cc: Jonathan Tan

This is regarding the packfile negotiation in fetch-pack. If there is a
concern that MAX_IN_VAIN would be hit too early (as a consequence of the
patch below), I'm currently investigating the possibility of improving
the negotiation ability of the client side further (for example, by
prioritizing refs or heads instead of merely prioritizing by date in the
priority queue of objects), but I thought I'd send the patch out first
anyway to see what others think.

Jonathan Tan (1):
  fetch-pack: do not reset in_vain on non-novel acks

 fetch-pack.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

-- 
2.8.0.rc3.226.g39d4020



* [PATCH] fetch-pack: do not reset in_vain on non-novel acks
  2016-09-22 18:36 [PATCH] do not reset in_vain on non-novel acks Jonathan Tan
@ 2016-09-22 18:36 ` Jonathan Tan
  2016-09-22 20:05   ` Junio C Hamano
  2016-09-22 19:20 ` [PATCH] " Junio C Hamano
  1 sibling, 1 reply; 6+ messages in thread
From: Jonathan Tan @ 2016-09-22 18:36 UTC
  To: git; +Cc: Jonathan Tan

The MAX_IN_VAIN mechanism was introduced in commit f061e5f ("fetch-pack:
give up after getting too many "ack continue"", 2006-05-24) to stop ref
negotiation if a number of consecutive "have"s have been sent with no
corresponding new acks. A use case (as described in that commit) is the
scenario in which the local repository has more roots than the remote
repository.

However, during a negotiation in which stateless RPCs are used,
MAX_IN_VAIN will (almost) never trigger (in the more-roots scenario
above and others) because in each new request, the client has to inform
the server of objects it already has and knows the server has (to remind
the server of the state), which the server then acks.

Make fetch-pack only consider novel acks (acks for objects for which the
client has never received an ack before in this session) as new acks for
the purpose of MAX_IN_VAIN.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 fetch-pack.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index 85e77af..1141e3c 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -428,10 +428,18 @@ static int find_common(struct fetch_pack_args *args,
 						const char *hex = sha1_to_hex(result_sha1);
 						packet_buf_write(&req_buf, "have %s\n", hex);
 						state_len = req_buf.len;
-					}
+						/*
+						 * Reset in_vain because this
+						 * ack is a novel ack (that is,
+						 * an ack for this commit has
+						 * not been seen).
+						 */
+						in_vain = 0;
+					} else if (!args->stateless_rpc
+						   || ack != ACK_common)
+						in_vain = 0;
 					mark_common(commit, 0, 1);
 					retval = 0;
-					in_vain = 0;
 					got_continue = 1;
 					if (ack == ACK_ready) {
 						clear_prio_queue(&rev_list);
-- 
2.8.0.rc3.226.g39d4020



* Re: [PATCH] do not reset in_vain on non-novel acks
  2016-09-22 18:36 [PATCH] do not reset in_vain on non-novel acks Jonathan Tan
  2016-09-22 18:36 ` [PATCH] fetch-pack: " Jonathan Tan
@ 2016-09-22 19:20 ` Junio C Hamano
  1 sibling, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2016-09-22 19:20 UTC
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

> This is regarding the packfile negotiation in fetch-pack. If there is a
> concern that MAX_IN_VAIN would be hit too early (as a consequence of the
> patch below), I'm currently investigating the possibility of improving
> the negotiation ability of the client side further (for example, by
> prioritizing refs or heads instead of merely prioritizing by date in the
> priority queue of objects), but I thought I'd send the patch out first
> anyway to see what others think.
>
> Jonathan Tan (1):
>   fetch-pack: do not reset in_vain on non-novel acks
>
>  fetch-pack.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)

Just a hint, because you are relatively new to the project.  It
usually is not very productive to have a cover letter for a single
patch.  Your cover letter either ends up being useless, or ends up
costing you time by making you repeat what you write for the patch
anyway (and making others read it twice).

Below the "---" line of the single patch is often a better place to
tell a backstory of the patch if you need to.


* Re: [PATCH] fetch-pack: do not reset in_vain on non-novel acks
  2016-09-22 18:36 ` [PATCH] fetch-pack: " Jonathan Tan
@ 2016-09-22 20:05   ` Junio C Hamano
  2016-09-23 17:41     ` [PATCH v2] " Jonathan Tan
  0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2016-09-22 20:05 UTC
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

> The MAX_IN_VAIN mechanism was introduced in commit f061e5f ("fetch-pack:
> give up after getting too many "ack continue"", 2006-05-24) to stop ref
> negotiation if a number of consecutive "have"s have been sent with no
> corresponding new acks. A use case (as described in that commit) is the
> scenario in which the local repository has more roots than the remote
> repository.

To those who know what the mechanism is about, the above is
sufficient to refresh their memory, but to others, a brief
explanation of _why_ it is a good idea to stop is needed to
understand what you are trying to achieve with this change.

It may help to add something like "This stops the client from digging
too deep in an irrelevant side branch in vain without ever finding a
common ancestor." before "A use case is ...", perhaps?

By the way, you made me run "git show -W f061e5f" and then compare
it with "less fetch-pack.c"; I am kind of surprised to see that
find_common() has grown quite a bit over the years.

> However, during a negotiation in which stateless RPCs are used,
> MAX_IN_VAIN will (almost) never trigger (in the more-roots scenario
> above and others) because in each new request, the client has to inform
> the server of objects it already has and knows the server has (to remind
> the server of the state), which the server then acks.

Hmph.  So the problem you are trying to solve is that the current
code sees that the other side said 'yeah, that is a common commit'
by giving us ACK common, and resets the in_vain counter, when in
fact we haven't made _any_ progress at that point.
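
For illustration only (this is not part of the patch): a toy model of
the counter, assuming a fixed hypothetical batch of "have" lines per
round, none of which the server knows, and assuming that one common
commit was found earlier so that the stateless client replays it in
every request.  The function name and its parameters are made up for
this sketch; only MAX_IN_VAIN and in_vain correspond to names in
fetch-pack.c.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_IN_VAIN 256 /* same value as in fetch-pack.c */

/*
 * Toy model, not the real find_common().  Each round the client sends
 * `batch` new "have" lines and none of them is acked.  With stateless
 * RPC the server additionally re-acks the replayed common commit.
 * Returns the round in which the client gives up, or -1 if it never
 * does within `max_rounds`.
 */
int rounds_until_give_up(bool stateless_rpc, bool count_new_acks_only,
			 int batch, int max_rounds)
{
	int in_vain = 0;

	assert(batch > 0);
	for (int round = 1; round <= max_rounds; round++) {
		in_vain += batch; /* haves sent this round */

		/*
		 * Old behaviour: the ack for the replayed commit
		 * resets the counter although nothing new was learnt.
		 */
		if (stateless_rpc && !count_new_acks_only)
			in_vain = 0;

		if (MAX_IN_VAIN < in_vain)
			return round; /* "giving up" */
	}
	return -1;
}
```

Under these assumptions, with a batch of 32 the stateless client never
gives up while repeated acks reset the counter, and gives up in round 9
(288 haves in vain) once only new acks are counted, which matches the
non-stateless case.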

> Make fetch-pack only consider novel acks (acks for objects for which the
> client has never received an ack before in this session) as new acks for
> the purpose of MAX_IN_VAIN.

Makes sense.

Just a hint, because you are relatively new to the project.
Whenever you are tempted to say "In other words...", "That
means...", or further elaborate in parentheses, it pays to stop and
think if you can do without whatever you said before that.  In the
above paragraph and in the comment in the patch, a newly invented
term "novel ack" is used exactly once, and because it is newly
invented, you have to explain what you want it to mean; there is no
need to introduce it in the first place.  "Make fetch-pack only
consider acks for objects for which no earlier acks have been seen
..." is equally readable and does not burden the readers with "Ah,
the author introduced a new term 'novel ack', so I need to remember
that this is the definition of the word when I see it mentioned next
time".

> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
>  fetch-pack.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/fetch-pack.c b/fetch-pack.c
> index 85e77af..1141e3c 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -428,10 +428,18 @@ static int find_common(struct fetch_pack_args *args,
>  						const char *hex = sha1_to_hex(result_sha1);
>  						packet_buf_write(&req_buf, "have %s\n", hex);
>  						state_len = req_buf.len;
> -					}
> +						/*
> +						 * Reset in_vain because this
> +						 * ack is a novel ack (that is,
> +						 * an ack for this commit has
> +						 * not been seen).
> +						 */

Side note.  Having to wrap the multi-line comment like this is a
sign that the loop got a bit too big to fit in one's brain.  We may
want to see if there is a way to reduce the complexity by introducing
a helper function or something.

> +						in_vain = 0;
> +					} else if (!args->stateless_rpc
> +						   || ack != ACK_common)
> +						in_vain = 0;

It is a bit hard to read this hunk without pre-context.  The
original reads like so:


	...
	case ACK_common:
	case ACK_ready:
	case ACK_continue: {
		struct commit *commit =
			lookup_commit(result_sha1);
		if (!commit)
			die("invalid commit %s", sha1_to_hex(result_sha1));
		if (args->stateless_rpc
		 && ack == ACK_common
		 && !(commit->object.flags & COMMON)) {

Here, they told us that this is a common ancestor by giving us "ACK
common", and this is not a response to our attempt to prime a new
incarnation of the stateless server.  It is curious that only
ACK_common is checked, but it is OK because --stateless-rpc requires
multi_ack_detailed, under which ACK_continue is not used.

			/* We need to replay the have for this object
			 * on the next RPC request so the peer knows
			 * it is in common with us.
			 */
			const char *hex = sha1_to_hex(result_sha1);
			packet_buf_write(&req_buf, "have %s\n", hex);
			state_len = req_buf.len;

And we store it away so that the next round will start with these
objects as "have" to remind the other side where we were.

> +			in_vain = 0;

And at this point, you reset in_vain counter with your change.
Which makes sense.  This is a newly discovered common one, i.e. we
are making progress.

-		}
> +		} else if (!args->stateless_rpc
> +			   || ack != ACK_common)
> +			in_vain = 0;

And you add an else clause here to reset the in_vain counter, which
we used to do unconditionally, when stateless is not in use, or when
we are doing stateless and got something other than "ACK common".
The latter is to make sure that an "ACK common" for a commit we
already know is common does not count as making progress.

Makes perfect sense to me.
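
For illustration only: the two branches can be collapsed into a single
predicate.  This is a sketch, not code from fetch-pack.c; the function
resets_in_vain() and its already_common parameter (standing in for
"commit->object.flags & COMMON") are hypothetical, and the enum merely
reuses the ack names that appear in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Ack names as used in the thread; the ordering is an assumption. */
enum ack_type { NAK = 0, ACK, ACK_continue, ACK_common, ACK_ready };

/*
 * Hypothetical predicate: should in_vain be reset after this ack?
 * Only meaningful for the three ack kinds handled by the case block.
 */
bool resets_in_vain(bool stateless_rpc, enum ack_type ack,
		    bool already_common)
{
	assert(ack == ACK_common || ack == ACK_ready ||
	       ack == ACK_continue);

	/* First branch: a newly discovered common commit. */
	if (stateless_rpc && ack == ACK_common && !already_common)
		return true;

	/* Else branch added by the patch. */
	return !stateless_rpc || ack != ACK_common;
}
```

Read this way, the only combination that no longer resets the counter
is a stateless "ACK common" for a commit that is already marked as
common.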

		mark_common(commit, 0, 1);
		retval = 0;
-		in_vain = 0;

And you remove the unconditional reset.

		got_continue = 1;
		if (ack == ACK_ready) {
			clear_prio_queue(&rev_list);
			got_ready = 1;
		}
		break;
		}
	}

Everything looks good to me and well thought-out.

Thanks.


* [PATCH v2] fetch-pack: do not reset in_vain on non-novel acks
  2016-09-22 20:05   ` Junio C Hamano
@ 2016-09-23 17:41     ` Jonathan Tan
  2016-09-23 19:40       ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Jonathan Tan @ 2016-09-23 17:41 UTC
  To: git; +Cc: Jonathan Tan, gitster

The MAX_IN_VAIN mechanism was introduced in commit f061e5f ("fetch-pack:
give up after getting too many "ack continue"", 2006-05-24) to stop ref
negotiation if a number of consecutive "have"s have been sent with no
corresponding new acks. This is to stop the client from digging too deep
in an irrelevant side branch in vain without ever finding a common
ancestor. A use case (as described in that commit) is the scenario in
which the local repository has more roots than the remote repository.

However, during a negotiation in which stateless RPCs are used,
MAX_IN_VAIN will (almost) never trigger (in the more-roots scenario
above and others) because in each new request, the client has to inform
the server of objects it already has and knows the server has (to remind
the server of the state), which the server then acks.

Make fetch-pack count an ack as new for the purpose of MAX_IN_VAIN
only if the client has not received an ack for the same object earlier
in this session.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---

Thanks for your comments - I really appreciate them.

Update from original:
o removed redundant text from commit message and comment in patch
o mentioned stopping the client from digging too deep in the commit
  message

I tried looking at creating a helper function to reduce both the size
and the nesting level of the loop, but it seems to me that a helper
function can't be extracted so easily because the logic is quite
intertwined with the rest of the function. For example, the "if
(args->stateless_rpc..." block uses 6 variables from the outer scope:
args, ack, commit, result_sha1, req_buf, and state_len (and in_vain, but
this can be the return value of the function). Expanding it wider would
allow us to make some of those 6 local, but also introduce new ones from
the outer scope.

 fetch-pack.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index 85e77af..413937e 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -428,10 +428,17 @@ static int find_common(struct fetch_pack_args *args,
 						const char *hex = sha1_to_hex(result_sha1);
 						packet_buf_write(&req_buf, "have %s\n", hex);
 						state_len = req_buf.len;
-					}
+						/*
+						 * Reset in_vain because an ack
+						 * for this commit has not been
+						 * seen.
+						 */
+						in_vain = 0;
+					} else if (!args->stateless_rpc
+						   || ack != ACK_common)
+						in_vain = 0;
 					mark_common(commit, 0, 1);
 					retval = 0;
-					in_vain = 0;
 					got_continue = 1;
 					if (ack == ACK_ready) {
 						clear_prio_queue(&rev_list);
-- 
2.8.0.rc3.226.g39d4020



* Re: [PATCH v2] fetch-pack: do not reset in_vain on non-novel acks
  2016-09-23 17:41     ` [PATCH v2] " Jonathan Tan
@ 2016-09-23 19:40       ` Junio C Hamano
  0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2016-09-23 19:40 UTC
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

> I tried looking at creating a helper function to reduce both the size
> and the nesting level of the loop, but it seems to me that a helper
> function can't be extracted so easily because the logic is quite
> intertwined with the rest of the function. For example, the "if
> (args->stateless_rpc..." block uses 6 variables from the outer scope:
> args, ack, commit, result_sha1, req_buf, and state_len (and in_vain, but
> this can be the return value of the function). Expanding it wider would
> allow us to make some of those 6 local, but also introduce new ones from
> the outer scope.

Yup, I suspected that much when I wrote the message you are
responding to, but was sort-of hoping that you might come up with a
more clever way to restructure the code.  It is OK to leave it
as-is, and let others try making it cleaner ;-).
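
For illustration only, one possible shape of such a helper, assuming
the outer-scope variables are bundled into a context struct.  Nothing
here exists in git: struct ack_ctx, struct toy_buf, struct toy_commit
and handle_common_ack() are hypothetical stand-ins (for req_buf,
state_len, struct commit and the block in question), and the toy
buffer writes a plain line rather than the pkt-line that
packet_buf_write() produces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Ack names as used in the thread; the ordering is an assumption. */
enum ack_type { NAK = 0, ACK, ACK_continue, ACK_common, ACK_ready };

/* Hypothetical stand-ins for git's strbuf and commit structures. */
struct toy_buf { char data[256]; size_t len; };
struct toy_commit { const char *hex; bool common; };

/* Hypothetical bundle of the outer-scope state the block touches. */
struct ack_ctx {
	bool stateless_rpc;     /* args->stateless_rpc */
	struct toy_buf req_buf; /* req_buf */
	size_t state_len;       /* state_len */
};

/*
 * Handle one ACK_common/ACK_ready/ACK_continue for `commit`.
 * Returns true when the caller should reset in_vain.
 */
bool handle_common_ack(struct ack_ctx *ctx, enum ack_type ack,
		       struct toy_commit *commit)
{
	bool reset;

	if (ctx->stateless_rpc && ack == ACK_common && !commit->common) {
		/* Replay this "have" on the next RPC request. */
		size_t room = sizeof(ctx->req_buf.data) - ctx->req_buf.len;
		int n = snprintf(ctx->req_buf.data + ctx->req_buf.len,
				 room, "have %s\n", commit->hex);

		assert(n > 0 && (size_t)n < room);
		ctx->req_buf.len += (size_t)n;
		ctx->state_len = ctx->req_buf.len;
		reset = true; /* no ack seen for this commit before */
	} else {
		reset = !ctx->stateless_rpc || ack != ACK_common;
	}

	commit->common = true; /* stands in for mark_common() */
	return reset;
}
```

The caller would then do "if (handle_common_ack(...)) in_vain = 0;".
Whether this is actually cleaner than the inline block is debatable,
since retval, got_continue and the ACK_ready handling would still have
to stay in the loop.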

Thanks.

>
>  fetch-pack.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/fetch-pack.c b/fetch-pack.c
> index 85e77af..413937e 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -428,10 +428,17 @@ static int find_common(struct fetch_pack_args *args,
>  						const char *hex = sha1_to_hex(result_sha1);
>  						packet_buf_write(&req_buf, "have %s\n", hex);
>  						state_len = req_buf.len;
> -					}
> +						/*
> +						 * Reset in_vain because an ack
> +						 * for this commit has not been
> +						 * seen.
> +						 */
> +						in_vain = 0;
> +					} else if (!args->stateless_rpc
> +						   || ack != ACK_common)
> +						in_vain = 0;
>  					mark_common(commit, 0, 1);
>  					retval = 0;
> -					in_vain = 0;
>  					got_continue = 1;
>  					if (ack == ACK_ready) {
>  						clear_prio_queue(&rev_list);


Thread overview: 6+ messages
2016-09-22 18:36 [PATCH] do not reset in_vain on non-novel acks Jonathan Tan
2016-09-22 18:36 ` [PATCH] fetch-pack: " Jonathan Tan
2016-09-22 20:05   ` Junio C Hamano
2016-09-23 17:41     ` [PATCH v2] " Jonathan Tan
2016-09-23 19:40       ` Junio C Hamano
2016-09-22 19:20 ` [PATCH] " Junio C Hamano
