git@vger.kernel.org mailing list mirror (one of many)
From: Junio C Hamano <gitster@pobox.com>
To: Jeff King <peff@peff.net>
Cc: Bagas Sanjaya <bagasdotme@gmail.com>, Git Users <git@vger.kernel.org>
Subject: Re: "Unpacking objects" question
Date: Mon, 03 May 2021 10:22:07 +0900
Message-ID: <xmqq5z00bohs.fsf@gitster.g>
In-Reply-To: <YI7Zl+odFFRIZ7aL@coredump.intra.peff.net> (Jeff King's message of "Sun, 2 May 2021 12:55:51 -0400")

Jeff King <peff@peff.net> writes:

> I don't know if the documentation discusses this tradeoff anywhere, but
> off the top of my head:
>
>   - storing packs can be more efficient in disk space (because of deltas
>     within the pack, but also fewer inodes for small objects). This
>     effect is bigger the more objects you have.
>
>   - storing packs can be less efficient, because thin packs will be
>     completed with duplicates of already-stored objects. The overhead is
>     bigger the fewer objects you have.

Another original motivation was to avoid ending up with too many
small packs: in the pre-midx world, looking up an object could take
time on the order of the number of packfiles in the repository.
After many small fetches, gc would be able to pack them all into a
single pack.
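
To make the lookup-cost point concrete, here is a toy model in Python.
It is only a sketch, not Git's code: the names build_index, in_index
and find_object are made up for illustration, each "pack index" is
just a sorted list of fake object IDs, and Git's real lookup has
further optimizations that this ignores.  It only shows that, without
a combined index, a miss in one pack means probing the next one.

```python
from bisect import bisect_left


def build_index(object_ids):
    """Return a sorted list standing in for one pack's index (hypothetical)."""
    return sorted(object_ids)


def in_index(index, oid):
    """Binary-search a single pack index for an object ID."""
    i = bisect_left(index, oid)
    return i < len(index) and index[i] == oid


def find_object(indexes, oid):
    """Probe each pack index in turn.

    Returns (pack_number, probes), or (None, probes) if the object
    is not in any pack.
    """
    probes = 0
    for n, index in enumerate(indexes):
        probes += 1
        if in_index(index, oid):
            return n, probes
    return None, probes


# 50 small "fetches", each kept as its own pack of 10 objects.
many_packs = [
    build_index(f"{p:02d}{o:02d}" for o in range(10)) for p in range(50)
]
# The same 500 objects after being repacked into a single pack.
one_pack = [build_index(oid for index in many_packs for oid in index)]

target = "4907"  # lives in the last of the small packs
print(find_object(many_packs, target))
print(find_object(one_pack, target))
```

In this model the number of probes grows with the number of packs
kept around, while the single repacked pack needs one probe no matter
how many fetches produced its objects.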

> There are some other subtle effects, too:
>
>   - storing packs from the wire may make git-gc more efficient (you can
>     often reuse deltas sent by the other side)

 - packs that came from the wire, stored and used as-is, may not
   have as good locality among objects, especially when the other
   side was a server that is optimized to reduce outbound network
   bandwidth (read: size) and its own processing cycles (read:
   object reuse from its packs).  Local packing has a dedicated
   phase to reorder the objects to pack related ones close to each
   other, but the "server" side has no incentive to optimize for that.

>   - storing packs from the wire may produce a worse outcome after
>     git-gc, because you are reusing deltas produced by the client for
>     their push (who might not have spent as much CPU looking for them as
>     you would)
