From: Junio C Hamano <gitster@pobox.com> To: Jeff King <peff@peff.net> Cc: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>, "Git Mailing List" <git@vger.kernel.org> Subject: Re: [PATCH 3/4] peel_ref: check object type before loading Date: Thu, 04 Oct 2012 12:06:16 -0700 [thread overview] Message-ID: <7vbogiys47.fsf@alter.siamese.dyndns.org> (raw) In-Reply-To: <20121004080253.GC31325@sigill.intra.peff.net> (Jeff King's message of "Thu, 4 Oct 2012 04:02:53 -0400") Jeff King <peff@peff.net> writes: > The point of peel_ref is to dereference tags; if the base > object is not a tag, then we can return early without even > loading the object into memory. > > This patch accomplishes that by checking sha1_object_info > for the type. For a packed object, we can get away with just > looking in the pack index. For a loose object, we only need > to inflate the first couple of header bytes. We look at the pack index and have to follow its delta chain down to the base to find its type; if the object is deeply deltified, this certainly is an overall loss. The only case sha1_object_info() could work well for an object near the tip of a deep delta chain is to find its size, as the diff-delta encodes the size of the base and the size of the result of applying the delta to the base, so you do not have to follow the chain when you are only interested in the final size. But alas nobody calls sha1_object_info() for only size but not type (there are some callers that are interested in only type but not size). > This is a bit of a gamble; if we do find a tag object, then > we will end up loading the content anyway, and the extra > lookup will have been wasteful. However, if it is not a tag > object, then we save loading the object entirely. Depending > on the ratio of non-tags to tags in the input, this can be a > minor win or minor loss. > > However, it does give us one potential major win: if a ref > points to a large blob (e.g., via an unannotated tag), then > we can avoid looking at it entirely. > > Signed-off-by: Jeff King <peff@peff.net> > --- > This optimization is the one that gave me the most pause. While > upload-pack does call peel_ref on everything, the other callers all > constrain themselves to refs/tags/. So for many projects, we will be > calling it mostly on annotated tags, and it may be a very small net > loss. But in practice, it will not matter for most projects with a sane > number of normal tags, and saving even one accidental giant blob load > can have a huge impact. I may be missing something, but the above description is not convincing to me. When was the last time you pointed a blob directly with a ref, whether large or small, and whether within refs/tags or outside? > > refs.c | 11 +++++++++-- > 1 file changed, 9 insertions(+), 2 deletions(-) > > diff --git a/refs.c b/refs.c > index f672ad9..02e47b1 100644 > --- a/refs.c > +++ b/refs.c > @@ -1225,8 +1225,15 @@ fallback: > } > > fallback: > - o = parse_object(base); > - if (o && o->type == OBJ_TAG) { > + o = lookup_unknown_object(base); > + if (o->type == OBJ_NONE) { > + int type = sha1_object_info(base, NULL); > + if (type < 0) > + return -1; > + o->type = type; > + } > + > + if (o->type == OBJ_TAG) { > o = deref_tag_noverify(o); > if (o) { > hashcpy(sha1, o->sha1);
next prev parent reply other threads:[~2012-10-04 22:14 UTC|newest] Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top 2012-10-03 12:36 upload-pack is slow with lots of refs Ævar Arnfjörð Bjarmason 2012-10-03 13:06 ` Nguyen Thai Ngoc Duy 2012-10-03 18:03 ` Jeff King 2012-10-03 18:53 ` Junio C Hamano 2012-10-03 18:55 ` Jeff King 2012-10-03 19:41 ` Shawn Pearce 2012-10-03 20:13 ` Jeff King 2012-10-04 21:52 ` Sascha Cunz 2012-10-05 0:20 ` Jeff King 2012-10-05 6:24 ` Johannes Sixt 2012-10-05 16:57 ` Shawn Pearce 2012-10-08 15:05 ` Johannes Sixt 2012-10-09 6:46 ` Shawn Pearce 2012-10-09 20:30 ` Johannes Sixt 2012-10-09 20:46 ` Johannes Sixt 2012-10-03 20:16 ` Ævar Arnfjörð Bjarmason 2012-10-03 21:20 ` Jeff King 2012-10-03 22:15 ` Ævar Arnfjörð Bjarmason 2012-10-03 23:15 ` Jeff King 2012-10-03 23:54 ` Ævar Arnfjörð Bjarmason 2012-10-04 7:56 ` [PATCH 0/4] optimizing upload-pack ref peeling Jeff King 2012-10-04 7:58 ` [PATCH 1/4] peel_ref: use faster deref_tag_noverify Jeff King 2012-10-04 18:24 ` Junio C Hamano 2012-10-04 8:00 ` [PATCH 2/4] peel_ref: do not return a null sha1 Jeff King 2012-10-04 18:32 ` Junio C Hamano 2012-10-04 8:02 ` [PATCH 3/4] peel_ref: check object type before loading Jeff King 2012-10-04 19:06 ` Junio C Hamano [this message] 2012-10-04 19:41 ` Jeff King 2012-10-04 20:41 ` Junio C Hamano 2012-10-04 21:59 ` Jeff King 2012-10-04 8:03 ` [PATCH 4/4] upload-pack: use peel_ref for ref advertisements Jeff King 2012-10-04 8:04 ` [PATCH 0/4] optimizing upload-pack ref peeling Jeff King 2012-10-04 9:01 ` Ævar Arnfjörð Bjarmason 2012-10-04 12:14 ` Nazri Ramliy 2012-10-03 22:32 ` upload-pack is slow with lots of refs Ævar Arnfjörð Bjarmason 2012-10-03 23:21 ` Jeff King 2012-10-03 23:47 ` Ævar Arnfjörð Bjarmason 2012-10-03 19:13 ` Junio C Hamano
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style List information: http://vger.kernel.org/majordomo-info.html * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=7vbogiys47.fsf@alter.siamese.dyndns.org \ --to=gitster@pobox.com \ --cc=avarab@gmail.com \ --cc=git@vger.kernel.org \ --cc=peff@peff.net \ --subject='Re: [PATCH 3/4] peel_ref: check object type before loading' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
Code repositories for project(s) associated with this inbox: https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).