git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Stefan Beller <sbeller@google.com>
To: Jeff King <peff@peff.net>
Cc: git <git@vger.kernel.org>
Subject: Re: [PATCH 6/6] pack-objects: reuse on-disk deltas for thin "have" objects
Date: Fri, 17 Aug 2018 15:57:18 -0700	[thread overview]
Message-ID: <CAGZ79kY9o_38WRX2TXC1JMA+9Y_ihf_jK8-f2pcYA+cYPF6B4A@mail.gmail.com> (raw)
In-Reply-To: <20180817210604.GF20088@sigill.intra.peff.net>

On Fri, Aug 17, 2018 at 2:06 PM Jeff King <peff@peff.net> wrote:
>
> When we serve a fetch, we pass the "wants" and "haves" from
> the fetch negotiation to pack-objects. That tells us not
> only which objects we need to send, but we also use the
> boundary commits as "preferred bases": their trees and blobs
> are candidates for delta bases, both for reusing on-disk
> deltas and for finding new ones.
>
> However, this misses some opportunities. Modulo some special
> cases like shallow or partial clones, we know that every
> object reachable from the "haves" could be a preferred base.
> We don't use them all for two reasons:

s/all/at all/ ?

> The first is that check_object() needs access to the
> relevant information (the thin flag and bitmap result). We
> can do this by pushing these into program-lifetime globals.

I discussed internally if extending the fetch protocol to include
submodule packs would be a good idea, as then you can get all
the superproject+submodule updates via one connection. This
gives some benefits, such as a more consistent view from the
superproject as well as already knowing the have/wants for
the submodule.

With this background story, moving things into globals
makes me sad, but I guess we can flip this decision once
we actually move towards "submodule packs in the
main connection".

>
> The second is that the rest of the code assumes that any
> reused delta will point to another "struct object_entry" as
> its base. But by definition, we don't have such an entry!

I got lost here by the definition (which def?).

  The delta that we look up from the bitmap, doesn't may
  not be in the pack, but it could be based off of an object
  the client already has in its object store and for that
  there is no struct object_entry in memory.

Is that correct?

> So taking all of those options into account, what I ended up
> with is a separate list of "external bases" that are not
> part of the main packing list. Each delta entry that points
> to an external base has a single-bit flag to do so; we have a
> little breathing room in the bitfield section of
> object_entry.
>
> This lets us limit the change primarily to the oe_delta()
> and oe_set_delta_ext() functions. And as a bonus, most of
> the rest of the code does not consider these dummy entries
> at all, saving both runtime CPU and code complexity.

Thanks,
Stefan

  reply	other threads:[~2018-08-17 22:59 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-17 20:54 [PATCH 0/6] reuse on-disk deltas for fetches with bitmaps Jeff King
2018-08-17 20:55 ` [PATCH 1/6] t/perf: factor boilerplate out of test_perf Jeff King
2018-08-17 20:55 ` [PATCH 2/6] t/perf: factor out percent calculations Jeff King
2018-08-17 20:56 ` [PATCH 3/6] t/perf: add infrastructure for measuring sizes Jeff King
2018-08-17 20:57 ` [PATCH 4/6] t/perf: add perf tests for fetches from a bitmapped server Jeff King
2018-08-17 20:59 ` [PATCH 5/6] pack-bitmap: save "have" bitmap from walk Jeff King
2018-08-17 22:39   ` Stefan Beller
2018-08-17 22:45     ` Jeff King
2018-08-17 21:06 ` [PATCH 6/6] pack-objects: reuse on-disk deltas for thin "have" objects Jeff King
2018-08-17 22:57   ` Stefan Beller [this message]
2018-08-17 23:32     ` Jeff King
2018-08-20 21:03   ` Junio C Hamano
2018-08-20 21:42     ` Jeff King
2018-08-21 19:06 ` [PATCH v2 0/6] reuse on-disk deltas for fetches with bitmaps Jeff King
2018-08-21 19:08   ` Jeff King
2018-08-21 19:34   ` Junio C Hamano
2018-08-21 19:51     ` Jeff King
  -- strict thread matches above, loose matches on Subject: below --
2018-08-21 18:41 [PATCH] test-tool.h: include git-compat-util.h Jeff King
2018-08-21 19:07 ` [PATCH 6/6] pack-objects: reuse on-disk deltas for thin "have" objects Jeff King
2018-08-21 19:43   ` Junio C Hamano
2018-08-21 19:50     ` Junio C Hamano
2018-08-21 20:07       ` Jeff King
2018-08-21 20:14         ` Jeff King
2018-08-21 20:52           ` Junio C Hamano
2018-08-21 21:30             ` Jeff King
2018-08-21 20:57         ` Junio C Hamano
2018-08-21 21:32           ` Jeff King
2018-08-21 20:00     ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAGZ79kY9o_38WRX2TXC1JMA+9Y_ihf_jK8-f2pcYA+cYPF6B4A@mail.gmail.com \
    --to=sbeller@google.com \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).