On Tue, Jun 29, 2021 at 09:31:59PM -0400, Jeff King wrote: > On Mon, Jun 28, 2021 at 07:33:11AM +0200, Patrick Steinhardt wrote: > > > Fix this by not doing a connectivity check in case there were no pushed > > objects. Given that git-rev-walk(1) with only negative references will > > not do any graph walk, no performance improvements are to be expected. > > Conceptionally, it is still the right thing to do though. > > Even though it's not producing any exciting results, I agree this is > still a reasonable thing to do. > > I'm actually surprised it didn't help in your many-ref cases, just > because I think the traversal machinery is pretty eager to parse tags > and commits which are fed as tips. > > If I run "git rev-list --not --all --stdin takes about 35ms. But if I make a ton of refs, like: > > git rev-list HEAD | > perl -lpe 's{.*}{update refs/foo/commit$. $&}' | > git update-ref --stdin > git pack-refs --all --prune > > then it takes about 2000ms (if you don't pack it's even worse, as you > might expect). > > So how come we don't see that improvement in your "extrarefs" cases? > Looking at patch 1, they also seem to make one ref for every commit. > > I think the answer may be below... > > > + /* > > + * If received commands only consist of deletions, then the client MUST > > + * NOT send a packfile because there cannot be any new objects in the > > + * first place. As a result, we do not set up a quarantine environment > > + * because we know no new objects will be received. And that in turn > > + * means that we can skip connectivity checks here. > > + */ > > + if (tmp_objdir) { > > I think this will work, but we're now making assumptions about how > tmp_objdir will be initialized by the rest of the code. > > Could we make a more direct check of: skip calling rev-list if we have > no positive tips to feed? If we call iterate_receive_command() and it > returns end-of-list on the first call, then we know there is nothing to > feed (and as a bonus, this catches some more noop cases around shallow > repos; see iterate_receive_command). > > But that made me think that check_connected() could be doing this > itself. And indeed, it seems to already. At the top of that function is: > > if (fn(cb_data, &oid)) { > if (opt->err_fd) > close(opt->err_fd); > return err; > } > > So before we even spawn rev-list, if there's nothing to feed it, we'll > return right away ("err" will be "0" at this point). And I think that > has been there since the beginning of the function (it is even in the > old versions in builtin/fetch.c before it was factored out). > > And that explains why you didn't see any performance improvement. We're > already doing this optimization. :) > > -Peff Hah, thanks for solving this riddle. I always kind of wondered what this was supposed to do, where my assumption simply was "It's in case the callback returns an error". But your explanation is obviously correct and neatly explains what's going on. So yes, I'll drop this patch. Patrick