git@vger.kernel.org mailing list mirror (one of many)
* RFC: Would a config fetch.retryCount make sense?
@ 2017-06-01 12:48 Lars Schneider
  2017-06-01 13:33 ` Ben Peart
  2017-06-01 17:59 ` Stefan Beller
  0 siblings, 2 replies; 5+ messages in thread
From: Lars Schneider @ 2017-06-01 12:48 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Junio C Hamano

Hi,

we occasionally see "The remote end hung up unexpectedly" (pkt-line.c:265) 
on our `git fetch` calls (most noticeably in our automations). I expect 
random network glitches to be the cause.

In some places we added a basic retry mechanism and I was wondering
if this could be a useful feature for Git itself.
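
A minimal sketch of what such a basic retry looks like (the retry count
and delay below are arbitrary, purely for illustration):

    #!/bin/sh
    # Retry `git fetch` a few times before giving up, sleeping between
    # attempts so a transient network glitch has time to clear.
    max=3
    n=0
    until git fetch "$@"; do
        n=$((n + 1))
        if [ "$n" -ge "$max" ]; then
            echo "git fetch failed after $max attempts" >&2
            exit 1
        fi
        sleep 10
    done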

E.g. a Git config such as "fetch.retryCount" or something.
Or is there something like this in Git already and I missed it?

Thanks,
Lars


* Re: RFC: Would a config fetch.retryCount make sense?
  2017-06-01 12:48 RFC: Would a config fetch.retryCount make sense? Lars Schneider
@ 2017-06-01 13:33 ` Ben Peart
  2017-06-05 12:04   ` Lars Schneider
  2017-06-01 17:59 ` Stefan Beller
  1 sibling, 1 reply; 5+ messages in thread
From: Ben Peart @ 2017-06-01 13:33 UTC (permalink / raw)
  To: Lars Schneider, Git Mailing List; +Cc: Junio C Hamano



On 6/1/2017 8:48 AM, Lars Schneider wrote:
> Hi,
> 
> we occasionally see "The remote end hung up unexpectedly" (pkt-line.c:265)
> on our `git fetch` calls (most noticeably in our automations). I expect
> random network glitches to be the cause.
> 
> In some places we added a basic retry mechanism and I was wondering
> if this could be a useful feature for Git itself.
> 

Having a configurable retry mechanism makes sense, especially if it
allows continuing an in-progress download rather than aborting and
starting over.  I would make it off by default so that any existing
higher-level retry mechanism doesn't trigger a retry storm if the
problem isn't a transient network glitch.
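
If such a knob existed, opting in would then be an explicit step in the
automation's setup, for example (fetch.retryCount is just the name Lars
suggested, and fetch.retryDelay is hypothetical):

    # hypothetical configuration; retries stay off unless explicitly set
    git config fetch.retryCount 3
    git config fetch.retryDelay 10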

Internally we use a tool 
(https://github.com/Microsoft/GVFS/tree/master/GVFS/FastFetch) to 
perform fetch for our build machines.  It has several advantages 
including retries when downloading pack files.

Its biggest advantage is that it uses multiple threads to parallelize
the entire fetch and checkout operation from end to end (i.e. the
download itself is parallelized, and checkout happens in parallel with
the download), which makes it take a fraction of the overall time.

When time permits, I hope to bring some of these enhancements over into 
git itself.

> E.g. a Git config such as "fetch.retryCount" or something.
> Or is there something like this in Git already and I missed it?
> 
> Thanks,
> Lars
> 


* Re: RFC: Would a config fetch.retryCount make sense?
  2017-06-01 12:48 RFC: Would a config fetch.retryCount make sense? Lars Schneider
  2017-06-01 13:33 ` Ben Peart
@ 2017-06-01 17:59 ` Stefan Beller
  1 sibling, 0 replies; 5+ messages in thread
From: Stefan Beller @ 2017-06-01 17:59 UTC (permalink / raw)
  To: Lars Schneider; +Cc: Git Mailing List, Junio C Hamano

On Thu, Jun 1, 2017 at 5:48 AM, Lars Schneider <larsxschneider@gmail.com> wrote:
> Hi,
>
> we occasionally see "The remote end hung up unexpectedly" (pkt-line.c:265)
> on our `git fetch` calls (most noticeably in our automations). I expect
> random network glitches to be the cause.

There is 665b35eccd (submodule--helper: initial clone learns retry
logic, 2016-06-09), but that is for submodules and only for the
initial clone.

I tried searching the mailing list archive to see whether this was
discussed for fetch before (I am sure it was), but could not find a
good thread to link to.

IIRC one major concern was:
* When a human operates git-fetch, they want fast feedback.
  The failure may be non-transient, for example when I forgot to bring
  up the wifi connection. Then the human can inspect and fix the root
  cause. (Assumption in the human workflow: these non-transient errors
  happen more often than the occasional fetch error due to a network
  glitch.)

For automation I would expect the retry logic to be genuinely beneficial,
so you would want command line options such as
"git fetch --retries=5 --delay-between-retries=10s".

>
> In some places we added a basic retry mechanism and I was wondering
> if this could be a useful feature for Git itself.

There are already retries in other places. :) Cf. f4ab4f3ab1
(lock_packed_refs(): allow retries when acquiring the packed-refs lock,
2015-05-11), which addresses a need GitHub has on the server side when
a very active repo has multiple people pushing to it at the same time
(to different branches; I believe forks are internally handled as the
same repo, just with different namespaces, so if there are 1000 forks
of linux.git you see a lot of pushes to the "same" repo).

>
> E.g. a Git config such as "fetch.retryCount" or something.
> Or is there something like this in Git already and I missed it?

I like it.

Thanks,
Stefan


* Re: RFC: Would a config fetch.retryCount make sense?
  2017-06-01 13:33 ` Ben Peart
@ 2017-06-05 12:04   ` Lars Schneider
  2017-06-05 14:08     ` Ben Peart
  0 siblings, 1 reply; 5+ messages in thread
From: Lars Schneider @ 2017-06-05 12:04 UTC (permalink / raw)
  To: Ben Peart; +Cc: Git Mailing List, Junio C Hamano


> On 01 Jun 2017, at 15:33, Ben Peart <peartben@gmail.com> wrote:
> 
> 
> 
> On 6/1/2017 8:48 AM, Lars Schneider wrote:
>> Hi,
>> we occasionally see "The remote end hung up unexpectedly" (pkt-line.c:265)
>> on our `git fetch` calls (most noticeably in our automations). I expect
>> random network glitches to be the cause.
>> In some places we added a basic retry mechanism and I was wondering
>> if this could be a useful feature for Git itself.
> 
> Having a configurable retry mechanism makes sense, especially if it allows continuing an in-progress download rather than aborting and starting over.  I would make it off by default so that any existing higher-level retry mechanism doesn't trigger a retry storm if the problem isn't a transient network glitch.

Agreed.


> Internally we use a tool (https://github.com/Microsoft/GVFS/tree/master/GVFS/FastFetch) to perform fetch for our build machines.  It has several advantages including retries when downloading pack files.

That's a "drop-in" replacement for "git fetch"?! I looked a bit through the 
"git fetch" code and retry (especially with continuing in-progress downloads) 
looks like a bigger change than I expected because of the current "die() 
in case of error" implementation.


> Its biggest advantage is that it uses multiple threads to parallelize the entire fetch and checkout operation from end to end (i.e. the download itself is parallelized, and checkout happens in parallel with the download), which makes it take a fraction of the overall time.

Interesting. Do you observe noticeable speed improvements with fetch delta updates,
too? This is usually fast enough for us.

The people I work with usually complain that the "clone operation" is slow. The
reason is that they clone over and over again to get a "clean checkout". In that
case I try to explain to them that every machine should clone only once and
that there are way more efficient ways to get a clean checkout.
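
For example, something along these lines (branch name assumed, and the
clean flags depend on whether ignored build output should survive) is
usually far cheaper than a fresh clone:

    # reuse the existing clone instead of re-cloning
    git fetch origin
    # force the work tree onto the commit we want to build
    git checkout -f origin/master
    # drop untracked and ignored files for a pristine tree
    git clean -ffdx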


> When time permits, I hope to bring some of these enhancements over into git itself.

That would be great!


- Lars


* Re: RFC: Would a config fetch.retryCount make sense?
  2017-06-05 12:04   ` Lars Schneider
@ 2017-06-05 14:08     ` Ben Peart
  0 siblings, 0 replies; 5+ messages in thread
From: Ben Peart @ 2017-06-05 14:08 UTC (permalink / raw)
  To: Lars Schneider; +Cc: Git Mailing List, Junio C Hamano



On 6/5/2017 8:04 AM, Lars Schneider wrote:
> 
>> On 01 Jun 2017, at 15:33, Ben Peart <peartben@gmail.com> wrote:
>>
>>
>>
>> On 6/1/2017 8:48 AM, Lars Schneider wrote:
>>> Hi,
>>> we occasionally see "The remote end hung up unexpectedly" (pkt-line.c:265)
>>> on our `git fetch` calls (most noticeably in our automations). I expect
>>> random network glitches to be the cause.
>>> In some places we added a basic retry mechanism and I was wondering
>>> if this could be a useful feature for Git itself.
>>
>> Having a configurable retry mechanism makes sense, especially if it allows continuing an in-progress download rather than aborting and starting over.  I would make it off by default so that any existing higher-level retry mechanism doesn't trigger a retry storm if the problem isn't a transient network glitch.
> 
> Agreed.
> 
> 
>> Internally we use a tool (https://github.com/Microsoft/GVFS/tree/master/GVFS/FastFetch) to perform fetch for our build machines.  It has several advantages including retries when downloading pack files.
> 
> That's a "drop-in" replacement for "git fetch"?! I looked a bit through the
> "git fetch" code and retry (especially with continuing in-progress downloads)
> looks like a bigger change than I expected because of the current "die()
> in case of error" implementation.
> 

No, not a drop-in replacement.  We only use this on build machines,
which don't need history, so it only pulls down the tip commit on the
initial clone.  This is a big win on large repos with a lot of history
but not so great for developer machines where history may be desired.
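
With stock git, a rough approximation of that "tip only" behaviour for
a build machine is a shallow clone (repo URL and branch below are just
placeholders):

    # history-less clone: only the tip commit of the chosen branch
    git clone --depth=1 --branch master https://example.com/repo.git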

> 
>> Its biggest advantage is that it uses multiple threads to parallelize the entire fetch and checkout operation from end to end (i.e. the download itself is parallelized, and checkout happens in parallel with the download), which makes it take a fraction of the overall time.
> 
> Interesting. Do you observe noticeable speed improvements with fetch delta updates,
> too? This is usually fast enough for us.

Since we have our build machines set up to use it for the clone, we kept
using it for delta updates.  When deltas get large (and with thousands
of developers pushing, that can happen pretty quickly) it is still a
nice perf win.

> 
> The people I work with usually complain that the "clone operation" is slow. The
> reason is that they clone over and over again to get a "clean checkout". In that
> case I try to explain to them that every machine should clone only once and
> that there are way more efficient ways to get a clean checkout.
> 
> 
>> When time permits, I hope to bring some of these enhancements over into git itself.
> 
> That would be great!
> 
> 
> - Lars
> 

