From: Stefan Beller <sbeller@google.com>
To: Jeff King <peff@peff.net>
Cc: Junio C Hamano <gitster@pobox.com>, Ram Rachum <ram@rachum.com>,
"git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: Make `git fetch --all` parallel?
Date: Tue, 11 Oct 2016 16:18:15 -0700
Message-ID: <CAGZ79kaKOiy-HJboaujXXc66P6CLupteDw4JyPOGetREfz_q_Q@mail.gmail.com>
In-Reply-To: <20161011225942.tvqbbzxglvu7lldi@sigill.intra.peff.net>

On Tue, Oct 11, 2016 at 3:59 PM, Jeff King <peff@peff.net> wrote:
> On Tue, Oct 11, 2016 at 03:50:36PM -0700, Stefan Beller wrote:
>
>> I agree. Though even for implementing the "dumb" case of fetching
>> objects twice we'd have to take care of some racing issues, I would assume.
>>
>> Why did you put a "sleep 2" below?
>> * a slow start to better spread load locally? (keep the workstation responsive?)
>> * a slow start to have different fetches in a different phase of the
>> fetch protocol?
>> * avoiding some subtle race?
>>
>> At the very least we would need a similar thing as Jeff recently sent for the
>> push case with objects quarantined and then made available in one go?
>
> I don't think so. The object database is perfectly happy with multiple
> simultaneous writers, and nothing impacts the have/wants until actual
> refs are written. Quarantining objects before the refs are written is an
> orthogonal concept.

If a remote advertises its tips, we'd need to look these up (client-side) to
decide if we have them, and I do not think we'd do that via a reachability
check, but via direct lookup in the object database? So I do not quite
understand what we gain from the atomic ref writes in e.g.
refs/remotes/origin/.

> I'm not altogether convinced that parallel fetch would be that much
> faster, though.

Ok, time to present data... Let's assume a degenerate case first:
"up-to-date with all remotes" because that is easy to reproduce.

I have 14 remotes currently:

    $ time git fetch --all
    real 0m18.016s
    user 0m2.027s
    sys 0m1.235s

    $ time git config --get-regexp remote.*.url |awk '{print $2}' |xargs -P 14 -I % git fetch %
    real 0m5.168s
    user 0m2.312s
    sys 0m1.167s

A factor of >3 in wall-clock time (18.0s vs. 5.2s), so I suspect there is
improvement ;)
Well, just as Ævar pointed out, there is some improvement.
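
For reference, the xargs experiment can be sketched as a small shell helper.
This variant fetches by remote name (from "git remote") instead of by URL, on
the understanding that fetching a bare URL only records FETCH_HEAD while
fetching a named remote also updates its remote-tracking refs. The function
name fetch_all_parallel and the default of 4 jobs are assumptions made up for
illustration, not part of git, and "xargs -P" is a GNU/BSD extension rather
than POSIX. It is a rough sketch, not a benchmark-grade script:

```shell
# Hypothetical helper: fetch all configured remotes concurrently.
# $1 is the maximum number of simultaneous fetches (assumed default: 4).
fetch_all_parallel() {
    jobs="${1:-4}"
    # "git remote" prints one remote name per line; xargs starts up to
    # $jobs "git fetch <name>" processes at a time and returns non-zero
    # if any of them fails.
    git remote | xargs -P "$jobs" -I % git fetch --quiet %
}
```

Run inside a repository as e.g. "fetch_all_parallel 14" to match the number of
remotes in the timing above.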
>
> I usually just do a one-off fetch of their URL in such a case, exactly
> because I _don't_ want to end up with a bunch of remotes. You can also
> mark them with skipDefaultUpdate if you only care about them
> occasionally (so you can "git fetch sbeller" when you care about it, but
> it doesn't slow down your daily "git fetch").
And I assume you don't want the remotes because it takes time to fetch and not
because your disk space is expensive. ;)
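
The skipDefaultUpdate suggestion quoted above could look roughly like this.
The helper name mark_remote_occasional is made up for illustration, and
"sbeller" in the usage note is just the remote name from the quoted example:

```shell
# Hypothetical helper: exclude remote $1 from "git fetch --all" and
# "git remote update", while still allowing an explicit "git fetch $1".
mark_remote_occasional() {
    git config "remote.$1.skipDefaultUpdate" true
}
```

After "mark_remote_occasional sbeller", a "git fetch --all" should skip that
remote, and "git fetch sbeller" still fetches it on demand.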
>
> -Peff