git@vger.kernel.org mailing list mirror (one of many)
From: Junio C Hamano <gitster@pobox.com>
To: Daniel Koverman <dkoverman@predictiveTechnologies.com>
Cc: "git\@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: Why does send-pack call pack-objects for all remote refs?
Date: Mon, 07 Dec 2015 14:41:00 -0800
Message-ID: <xmqqvb89lw5f.fsf@gitster.mtv.corp.google.com>
In-Reply-To: <4766c8518c2a46afb88fc0a2dd9a1688@EXCHANGE1U.uunet.arlington.PredictiveTechnologies.com> (Daniel Koverman's message of "Mon, 7 Dec 2015 21:02:22 +0000")

Daniel Koverman <dkoverman@predictiveTechnologies.com> writes:

> I have a repository which has ~2000 branches on the remote, and it
> takes ~8 seconds to push a change to one ref. The majority of this
> time is spent in pack-objects. I wrote a hack so that only the ref
> being updated would be packed (the normal behavior is to pack for
> every ref on the remote).

I am having a hard time understanding what you are trying to say, as
nobody's pack-objects "packs for a ref" or "packs a ref", so my
response has to be based on my best guess---I think you are talking
about feeding the object names of the tips of all remote refs as
the bottoms of the revision range to pack-objects.

When you are pushing your 'topic' branch to update the 'topic'
branch at the remote, it is true that we compute

	git rev-list --objects $your_topic --not $all_of_the_remote_refs

to produce a packfile.  And by tweaking this to

	git rev-list --objects $your_topic --not $their_topic

you will cut down the processing time of 'rev-list', especially if
you have an insane number of refs at the remote end.
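
To make the two forms concrete, here is a minimal sketch in a
throwaway repository. It is only an illustration, not what send-pack
literally runs: the refs under refs/remotes/origin/ are assumed
stand-ins for what the remote advertises, and the branch names and
the count of 50 are made up.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is made up for illustration.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

git commit -q --allow-empty -m "base"
base=$(git rev-parse HEAD)

# Stand-ins for the refs the remote would advertise.
i=1
while [ "$i" -le 50 ]; do
	git update-ref "refs/remotes/origin/branch-$i" "$base"
	i=$((i + 1))
done
git update-ref refs/remotes/origin/topic "$base"

# One new commit on our side that the remote does not have yet.
echo hello >file.txt
git add file.txt
git commit -q -m "topic work"

# Every remote ref as a bottom of the range...
all=$(git rev-list --objects HEAD --not --remotes=origin | wc -l | tr -d ' ')
# ...versus only the ref being updated.
one=$(git rev-list --objects HEAD --not refs/remotes/origin/topic | wc -l | tr -d ' ')

echo "all remote refs as bottoms: $all objects"
echo "only their topic as bottom:  $one objects"
```

In this layout both commands list the same objects (the new commit,
its tree, and the blob), because no other remote ref contributes
anything; the only difference is how many bottoms rev-list has to
process.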

There is a price you would pay for doing so, though.  An obvious one
is the case where the 'topic' branch does not exist yet at the remote.
Without the "--not ..." part, you would end up sending the entire
history behind $your_topic, and the way you prevent that from
happening is to give the commits that are known to exist at the
remote end.
Even when there already is 'topic' at the remote, the contents at
the paths that differ between your 'topic' and the remote's 'topic'
may already exist on some other branches at the remote.  For example,
you may have merged some branches that are common between your
repository and the remote, in which case the only objects missing
from the remote that your repository has to send may be a merge
commit and the top-level tree object.  Limiting the bottoms of the
revision range to only "--not $their_topic" would rob you of this
obvious optimization opportunity.
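
The merge case can be sketched in a throwaway repository as well.
Everything here is an assumed setup for illustration: 'other' is a
hypothetical branch the remote already has, and the refs under
refs/remotes/origin/ stand in for what the remote advertises.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"
git symbolic-ref HEAD refs/heads/topic

# Hypothetical helper: add one small file and commit it.
commit_file () {
	echo "$1" >"$1.txt"
	git add "$1.txt"
	git commit -q -m "add $1"
}

commit_file base

# 'other' forks from base and is already at the remote.
git checkout -q -b other
commit_file b
git update-ref refs/remotes/origin/other HEAD

# 'topic' has its own commit, also already at the remote.
git checkout -q topic
commit_file a
git update-ref refs/remotes/origin/topic HEAD

# Locally, merge 'other' into 'topic'; this merge is all that is new.
git merge -q -m "merge other into topic" other

all=$(git rev-list --objects topic --not --remotes=origin | wc -l | tr -d ' ')
one=$(git rev-list --objects topic --not refs/remotes/origin/topic | wc -l | tr -d ' ')

echo "all remote refs as bottoms: $all objects"
echo "only their topic as bottom:  $one objects"
```

With all remote refs as bottoms, only the merge commit and its
top-level tree are listed.  With "--not $their_topic" alone, the
commit, tree, and blob from 'other' are listed too, even though the
remote already has them.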

There has to be some way to limit the list of remote-refs that are
used as bottoms of the revision range.  For example, if you know
that the remote has all the tags, and that everything in the v1.0
tag is contained in the v2.0 tag, then a single "--not v2.0" should
give the same result as "--not v1.0 v2.0" that lists both.  But the
computation that is needed to figure out which tags and branches are
not worth listing as bottoms would need to look at all of them at
least once anyway, so I suspect a naive implementation of such a
filter would end up spending the same cycles.
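
As a rough illustration of the v1.0/v2.0 example, the sketch below
checks that the two ranges give the same result and uses
"git merge-base --independent" to reduce a list of bottoms.  This is
only a way to see the idea from the command line, not a claim about
how send-pack does or should pick its bottoms; the tags and commits
are made up.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

# Linear history: v1.0 is an ancestor of v2.0, plus one newer commit.
git commit -q --allow-empty -m "one"
git tag v1.0
git commit -q --allow-empty -m "two"
git tag v2.0
git commit -q --allow-empty -m "three"

both=$(git rev-list --objects HEAD --not v1.0 v2.0)
only=$(git rev-list --objects HEAD --not v2.0)
if [ "$both" = "$only" ]; then
	echo "--not v2.0 gives the same result as --not v1.0 v2.0"
fi

# Drop every commit that is reachable from another one in the list.
# Note that this still has to look at each candidate.
reduced=$(git merge-base --independent v1.0 v2.0)
if [ "$reduced" = "$(git rev-parse v2.0)" ]; then
	echo "v2.0 alone is enough as a bottom"
fi
```

The reduction itself still walks history from every ref it is given,
which is the cost concern raised above.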

It was also unclear whether you are working with a shallow repository.
The performance trade-off made between the pack size and the cycles
is somewhat different between a normal and a shallow repository,
e.g. 2dacf26d (pack-objects: use --objects-edge-aggressive for
shallow repos, 2014-12-24) might be a good starting point to think
about this issue.
