git@vger.kernel.org mailing list mirror (one of many)
From: Jeff King <peff@peff.net>
To: Jonathan Tan <jonathantanmy@google.com>
Cc: Junio C Hamano <gitster@pobox.com>,
	lkundrak@v3.sk, jrnieder@gmail.com, git@vger.kernel.org
Subject: Re: Git 2.26 fetches many times more objects than it should, wasting gigabytes
Date: Fri, 24 Apr 2020 01:32:04 -0400
Message-ID: <20200424053204.GD1648190@coredump.intra.peff.net>
In-Reply-To: <20200423213735.242662-1-jonathantanmy@google.com>

On Thu, Apr 23, 2020 at 02:37:35PM -0700, Jonathan Tan wrote:

> Thanks for the reproduction recipe (in [1]) and your analysis. I took a
> look, and it's because the check for in_vain is done differently. In v0:
> 
>   if (got_continue && MAX_IN_VAIN < in_vain) {
> 
> reflecting the documentation in pack-protocol.txt:
> 
>   However, the 256 limit *only* turns on in the canonical client
>   implementation if we have received at least one "ACK %s continue"
>   during a prior round.  This helps to ensure that at least one common
>   ancestor is found before we give up entirely.

Ah, thanks for that; I hadn't thought to look in that file for more
clues.

> When debugging, I noticed that in_vain was increasing far in excess of
> MAX_IN_VAIN, but because got_continue was false, the client did not give
> up.
> 
> But in v2:
> 
>   if (!haves_added || *in_vain >= MAX_IN_VAIN) {
> 
> ("haves_added" is irrelevant to this discussion. It is another
> termination condition - when we have run out of "have"s to send.)
> 
> So there is no check that "continue" was sent. We probably should change
> v2 to match v0. I can start writing a patch unless someone else would
> like to take a further look at it.

Yeah, this fills in the final pieces of the puzzle I was chasing in:

 https://lore.kernel.org/git/20200422193324.GB558336@coredump.intra.peff.net/

And the patch you suggest sounds like the best solution.
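
To make the difference concrete, here is a minimal sketch of the two
termination checks quoted above, plus one possible shape for a v2 check
that matches v0. This is not the actual fetch-pack.c code: the function
names and the seen_ack parameter are hypothetical, in_vain is passed by
value rather than through a pointer, and the third function is only an
assumption about what such a fix could look like.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_IN_VAIN 256

/*
 * v0-style check (from the quoted condition): only give up once the
 * server has sent at least one "ACK %s continue" and we have since sent
 * more than MAX_IN_VAIN haves without progress.
 */
static bool v0_should_give_up(bool got_continue, int in_vain)
{
	assert(in_vain >= 0);
	return got_continue && MAX_IN_VAIN < in_vain;
}

/*
 * v2-style check (from the quoted condition): give up when there are no
 * more haves to send, or as soon as in_vain reaches MAX_IN_VAIN, whether
 * or not any common commit has been acknowledged.
 */
static bool v2_should_give_up(int haves_added, int in_vain)
{
	assert(in_vain >= 0);
	return !haves_added || in_vain >= MAX_IN_VAIN;
}

/*
 * Hypothetical v2 variant that matches v0: the in_vain limit applies
 * only after an ACK has been seen. "seen_ack" is an illustrative name,
 * not a field from the real code.
 */
static bool v2_with_continue_check(int haves_added, bool seen_ack, int in_vain)
{
	assert(in_vain >= 0);
	return !haves_added || (seen_ack && in_vain >= MAX_IN_VAIN);
}
```

With no ACK received and in_vain far above 256, v0_should_give_up()
returns false while v2_should_give_up() returns true, which is the
early cutoff described in this thread.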

I think there's some room for discussion about what the optimal
strategies are (e.g., v0 does send a lot more haves than v2 in this
instance, and it wouldn't always be helpful). But it makes sense to me
to put v2 and v0 on the same footing for now, especially given the
regressions people have mentioned, and then we can explore new options
at our convenience (like switching on the skipping negotiation
algorithm).

-Peff
