git@vger.kernel.org mailing list mirror (one of many)
From: Jeff King <peff@peff.net>
To: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: Git Users <git@vger.kernel.org>
Subject: Re: "Unpacking objects" question
Date: Sun, 2 May 2021 12:55:51 -0400
Message-ID: <YI7Zl+odFFRIZ7aL@coredump.intra.peff.net>
In-Reply-To: <bdd50fcc-02c3-dc24-9d49-773db881b65d@gmail.com>

On Sun, May 02, 2021 at 06:06:57PM +0700, Bagas Sanjaya wrote:

> Recently I stumbled upon git unpack-objects documentation, which says:
> 
> > Read a packed archive (.pack) from the standard input, expanding the objects contained within and writing them into the repository in "loose" (one object per file) format.
> 
> However, I have some questions:
> 
> 1. When I do git fetch, what is the threshold/limit for "Unpacking objects",
>    in other words what is the minimum number of objects for invoking
>    "Resolving deltas" instead of "Unpacking objects"?
> 2. Can the threshold between unpacking objects and resolving deltas be
>    configurable?

See the fetch.unpackLimit config. The default is 100 objects.
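
As a rough sketch of how that might look (the values are arbitrary
illustrations, not recommendations), the limit can be set with
git-config. If fetch.unpackLimit is unset, transfer.unpackLimit is
consulted instead:

```shell
# Run inside a repository: keep every fetched pack as a pack, even
# tiny ones, since no fetch has fewer than 1 object (illustrative value).
git config fetch.unpackLimit 1

# Or, for all repositories of this user: explode fetches of fewer than
# 1000 objects into loose objects (illustrative value).
git config --global fetch.unpackLimit 1000

# Show the configured value; prints nothing if it is unset, in which
# case transfer.unpackLimit or the built-in default of 100 applies.
git config --get fetch.unpackLimit
```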

> 3. Why does Git in some cases unpack objects when resolving deltas
>    could be done instead?

I don't know if the documentation discusses this tradeoff anywhere, but
off the top of my head:

  - storing packs can be more efficient in disk space (because of deltas
    within the pack, but also fewer inodes for small objects). This
    effect is bigger the more objects you have.

  - storing packs can be less efficient, because thin packs will be
    completed with duplicates of already-stored objects. The overhead is
    bigger the fewer objects you have.

I suspect this tradeoff is the main logic behind the object-count
threshold (I didn't dig into the history or the archive, though; you
might find more discussion there). AFAIK the number 100 doesn't have any
real scientific basis.
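
To make the threshold concrete, here is a minimal sketch of the
comparison as the git-config documentation describes it: fewer objects
than the limit get unpacked into loose objects, otherwise the pack is
kept and indexed. The function name choose_pack_handler is hypothetical,
made up for illustration; the real check lives inside git's fetch code
and this only models the comparison.

```shell
# Hypothetical helper (not part of git): given the number of objects in
# an incoming pack and an optional limit (default 100), print which
# plumbing command would handle the pack.
choose_pack_handler() {
  count=$1
  limit=${2:-100}
  if [ "$count" -lt "$limit" ]; then
    # Below the limit: explode into loose objects ("Unpacking objects").
    echo "unpack-objects"
  else
    # At or above the limit: keep the pack ("Resolving deltas").
    echo "index-pack"
  fi
}

choose_pack_handler 99    # prints "unpack-objects"
choose_pack_handler 100   # prints "index-pack"
```

Note that the boundary case (exactly 100 objects with the default limit)
keeps the pack.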

There are some other subtle effects, too:

  - storing packs from the wire may make git-gc more efficient (you can
    often reuse deltas sent by the other side)

  - storing packs from the wire may produce a worse outcome after
    git-gc, because you are reusing deltas produced by the client for
    their push (who might not have spent as much CPU looking for them as
    you would)

-Peff
