git@vger.kernel.org mailing list mirror (one of many)
From: Junio C Hamano <gitster@pobox.com>
To: Jonathan Tan <jonathantanmy@google.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] fetch-pack: do not reset in_vain on non-novel acks
Date: Thu, 22 Sep 2016 13:05:32 -0700
Message-ID: <xmqqfuor4s4z.fsf@gitster.mtv.corp.google.com>
In-Reply-To: <5a258c5dbed0683760e2ffb1bd6a1749ea66b2d5.1474568670.git.jonathantanmy@google.com> (Jonathan Tan's message of "Thu, 22 Sep 2016 11:36:55 -0700")

Jonathan Tan <jonathantanmy@google.com> writes:

> The MAX_IN_VAIN mechanism was introduced in commit f061e5f ("fetch-pack:
> give up after getting too many "ack continue"", 2006-05-24) to stop ref
> negotiation if a number of consecutive "have"s have been sent with no
> corresponding new acks. A use case (as described in that commit) is the
> scenario in which the local repository has more roots than the remote
> repository.

To those who know what the mechanism is about, the above is
sufficient to refresh their memory, but to others, a brief
explanation of _why_ it is a good idea to stop is needed to
understand what you are trying to achieve with this change.

It may help to add something like "This will stop the client from
digging too deep in an irrelevant side branch in vain without ever
finding a common ancestor." before "A use case is ...", perhaps?

By the way, you made me run "git show -W f061e5f" and then compare
it with "less fetch-pack.c"; I am kind of surprised to see that
find_common() has grown quite a bit over the years.

> However, during a negotiation in which stateless RPCs are used,
> MAX_IN_VAIN will (almost) never trigger (in the more-roots scenario
> above and others) because in each new request, the client has to inform
> the server of objects it already has and knows the server has (to remind
> the server of the state), which the server then acks.

Hmph.  So the problem you are trying to solve is that the current
code sees that the other side said 'yeah, that is a common commit'
by giving us ACK common, and resets the in_vain counter, when in
fact we haven't made _any_ progress at that point.
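
To make the failure mode concrete, here is a toy model of the counter,
not the real find_common() loop.  The function name, its parameters,
and the idea of a fixed number of "have"s per round are assumptions
made for illustration; MAX_IN_VAIN is taken to be 256, which I believe
matches fetch-pack.c but should be checked against the source.

```c
#include <stdbool.h>

#define MAX_IN_VAIN 256	/* assumed to match fetch-pack.c */

/*
 * Hypothetical model: every round sends haves_per_round "have" lines
 * that find nothing new.  With reset_on_repeated_ack set, the round
 * also replays an already-known common commit, the server acks it
 * again, and the counter is cleared (the behaviour being fixed).
 *
 * Returns the 1-based round in which the client gives up, or 0 if it
 * never gives up within max_rounds.
 */
static int rounds_until_give_up(bool reset_on_repeated_ack,
				int haves_per_round, int max_rounds)
{
	int in_vain = 0;
	int round;

	for (round = 1; round <= max_rounds; round++) {
		in_vain += haves_per_round;	/* no progress this round */
		if (reset_on_repeated_ack)
			in_vain = 0;		/* re-ack of a replayed "have" */
		if (MAX_IN_VAIN < in_vain)
			return round;
	}
	return 0;
}
```

With 32 "have"s per round, the model gives up in round 9 when repeated
acks are ignored, and never gives up when they reset the counter.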

> Make fetch-pack only consider novel acks (acks for objects for which the
> client has never received an ack before in this session) as new acks for
> the purpose of MAX_IN_VAIN.

Makes sense.

Just a hint, because you are relatively new to the project.
Whenever you are tempted to say "In other words...", "That
means...", or further elaborate in parentheses, it pays to stop and
think if you can do without whatever you said before that.  In the
above paragraph and in the comment in the patch, a newly invented
term "novel ack" is used exactly once, and because it is a newly
invented word, you need to explain what you want it to mean.  There
is no need for the new term at all.  "Make fetch-pack only consider acks for
objects for which no earlier acks have been seen ..." is equally
readable and does not burden the readers with "Ah, the author
introduced a new term 'novel ack', so I need to remember that this
is the definition of the word when I see it mentioned next time".

> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
>  fetch-pack.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/fetch-pack.c b/fetch-pack.c
> index 85e77af..1141e3c 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -428,10 +428,18 @@ static int find_common(struct fetch_pack_args *args,
>  						const char *hex = sha1_to_hex(result_sha1);
>  						packet_buf_write(&req_buf, "have %s\n", hex);
>  						state_len = req_buf.len;
> -					}
> +						/*
> +						 * Reset in_vain because this
> +						 * ack is a novel ack (that is,
> +						 * an ack for this commit has
> +						 * not been seen).
> +						 */

Side note.  Having to wrap the multi-line comment like this is a
sign that the loop got a bit too big to fit in brain.  We may want
to see if there is a way to reduce the complexity by introducing a
helper function or something.

> +						in_vain = 0;
> +					} else if (!args->stateless_rpc
> +						   || ack != ACK_common)
> +						in_vain = 0;

It is a bit hard to read this hunk without pre-context.  The
original reads like so:


	...
	case ACK_common:
	case ACK_ready:
	case ACK_continue: {
		struct commit *commit =
			lookup_commit(result_sha1);
		if (!commit)
			die("invalid commit %s", sha1_to_hex(result_sha1));
		if (args->stateless_rpc
		 && ack == ACK_common
		 && !(commit->object.flags & COMMON)) {

Here, they told us that this is a common ancestor by giving us "ACK
common", and this is not a response to our attempt to prime a new
incarnation of stateless server.  It is curious that only ACK_common
is checked, but it is OK because --stateless requires multi-ack and
ACK_continue is not used.

			/* We need to replay the have for this object
			 * on the next RPC request so the peer knows
			 * it is in common with us.
			 */
			const char *hex = sha1_to_hex(result_sha1);
			packet_buf_write(&req_buf, "have %s\n", hex);
			state_len = req_buf.len;

And we store it away so that the next round will start with these
objects as "have" to remind the other side where we were.

> +			in_vain = 0;

And at this point, you reset in_vain counter with your change.
Which makes sense.  This is a newly discovered common one, i.e. we
are making progress.

-		}
> +		} else if (!args->stateless_rpc
> +			   || ack != ACK_common)
> +			in_vain = 0;

And you add an else clause here to reset the in_vain counter, which we
used to do unconditionally, when stateless is not in use, or when we
are doing stateless and got something other than "ACK common".  The
latter is to make sure that an "ACK common" for a commit we already
know to be common does not count as making progress.

Makes perfect sense to me.
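
The two branches discussed above can be folded into one predicate, as
a sketch.  The helper name ack_resets_in_vain() and the already_common
parameter (standing in for "commit->object.flags & COMMON") are
hypothetical, and the enum is a simplified stand-in for the one in
fetch-pack.c.

```c
#include <stdbool.h>

/* Simplified stand-in for the ack types used in fetch-pack.c. */
enum ack_type { ACK = 1, ACK_continue, ACK_common, ACK_ready };

/*
 * Hypothetical helper: should this ack clear the in_vain counter?
 * already_common says whether the acked commit was already flagged
 * COMMON before this ack arrived.
 */
static bool ack_resets_in_vain(bool stateless_rpc, enum ack_type ack,
			       bool already_common)
{
	if (stateless_rpc && ack == ACK_common && !already_common)
		return true;	/* newly discovered common commit */
	/*
	 * Otherwise reset as before, except for a stateless
	 * "ACK common" on a commit we already knew about.
	 */
	return !stateless_rpc || ack != ACK_common;
}
```

The only combination that does not reset the counter is stateless RPC,
"ACK common", and a commit already marked COMMON.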

		mark_common(commit, 0, 1);
		retval = 0;
-		in_vain = 0;

And you remove the unconditional reset.

		got_continue = 1;
		if (ack == ACK_ready) {
			clear_prio_queue(&rev_list);
			got_ready = 1;
		}
		break;
		}
	}

Everything looks good to me and well thought-out.

Thanks.

Thread overview: 6+ messages
2016-09-22 18:36 [PATCH] do not reset in_vain on non-novel acks Jonathan Tan
2016-09-22 18:36 ` [PATCH] fetch-pack: " Jonathan Tan
2016-09-22 20:05   ` Junio C Hamano [this message]
2016-09-23 17:41     ` [PATCH v2] " Jonathan Tan
2016-09-23 19:40       ` Junio C Hamano
2016-09-22 19:20 ` [PATCH] " Junio C Hamano
