git@vger.kernel.org mailing list mirror (one of many)
From: Stefan Beller <sbeller@google.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Ram Rachum <ram@rachum.com>, "git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: Make `git fetch --all` parallel?
Date: Tue, 11 Oct 2016 15:50:36 -0700
Message-ID: <CAGZ79kZNvTvk4uZa8xhxZABKtzS9A5HoumJ37AacuZnHaZ4+Xw@mail.gmail.com>
In-Reply-To: <xmqqa8ea7bsh.fsf@gitster.mtv.corp.google.com>

On Tue, Oct 11, 2016 at 3:37 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> So I do think it would be much faster, but I also think patches for this would
>> require some thought and a lot of refactoring of the fetch code.
>> ...
>> During the negotiation phase a client would have to be able to change its
>> mind (add more "haves", or in case of the parallel fetching these become
>> "will-have-soons", although the remote figured out the client did not have it
>> earlier.)
>
> Even though a fancy optimization as you outlined might be ideal, I
> suspect that users would be happier if the network bandwidth is
> utilized to talk to multiple remotes at the same time even if they
> end up receiving the same recent objects from more than one place in
> the end.

I agree. Though even for implementing the "dumb" case of fetching
objects twice we'd have to take care of some racing issues, I would assume.

Why did you put a "sleep 2" below?
* a slow start to better spread load locally? (keep the workstation responsive?)
* a slow start to have different fetches in a different phase of the
fetch protocol?
* avoiding some subtle race?

At the very least we would need something similar to what Jeff recently sent
for the push case, with objects quarantined and then made available in one go?

>
> Is the order in which "git fetch --all" iterates over "all remotes"
> predictable and documented?

It is predictable, as it is just the same order as printed by
$ grep "\[remote " .git/config
i.e. the order in the file, which in my case turns out to be sorted by
importance/history quite naturally.
But reordering my config file would not be a big deal.

I dunno if it is documented, though.
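
To check the order on a given repository, something like the following could be used. It is only a sketch: the function name list_remotes_in_config_order is made up for illustration, and it assumes every remote has exactly one url entry in the repository-local config (a remote with several url lines would be printed once per line).

```shell
# Hypothetical helper (name is an assumption, not part of git):
# print remote names in the order their "url" entries appear in .git/config.
list_remotes_in_config_order() {
    # --local restricts the lookup to the repository's own config file.
    # Output lines look like: remote.<name>.url <value>
    git config --local --get-regexp '^remote\..*\.url$' |
        # Strip the "remote." prefix and everything from ".url " onwards,
        # which keeps remote names that themselves contain dots intact.
        sed -e 's/^remote\.//' -e 's/\.url .*$//'
}
```

Running it inside a repository prints one remote name per line, first config entry first.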

> If so, listing the remotes from more
> powerful and well connected place to slower ones and then doing an
> equivalent of stupid
>
>         for remote in $list_of_remotes_ordered_in_such_a_way

list_of_remotes_ordered_in_such_a_way is roughly:
$(git config --get-regexp '^remote\..*\.url$' | sed -e 's/^remote\.//' -e 's/\.url .*$//')

>         do
>                 git fetch "$remote" &
>                 sleep 2
>         done
>
> might be fairly easy thing to bring happiness.

I would love to see the implementation though, as over time I accumulate
a lot of remotes. (Someone published patches on the mailing list and also made
them available on some hosting site? Grabbing them from their hosting site
is easier for me than applying patches, so I'd rather fetch them... and so I
have some remotes now.)
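
For reference, the loop quoted above could be fleshed out roughly as below. This is a sketch under assumptions, not a proposed patch: the function name fetch_all_parallel and its optional delay argument are made up, the remote list comes from the repository-local config in file order, and it does nothing about the races or duplicate object transfers discussed earlier in this mail. The only additions over the quoted loop are that it waits for the background fetches and reports whether any of them failed.

```shell
#!/bin/sh
# Hypothetical wrapper (name and argument are assumptions):
# start one "git fetch" per remote in the background, in config-file order,
# sleeping between starts, then wait for all of them.
fetch_all_parallel() {
    delay=${1:-2}   # seconds between starts; 2 mirrors the quoted loop
    pids=
    for remote in $(git config --local --get-regexp '^remote\..*\.url$' |
                    sed -e 's/^remote\.//' -e 's/\.url .*$//')
    do
        git fetch "$remote" &
        pids="$pids $!"
        sleep "$delay"
    done

    # Collect the exit status of every background fetch.
    status=0
    for pid in $pids
    do
        wait "$pid" || status=1
    done
    return "$status"
}
```

Calling fetch_all_parallel with no argument keeps the two-second stagger; passing a number changes it, e.g. fetch_all_parallel 5.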
