* "Unpacking objects" question
@ 2021-05-02 11:06 Bagas Sanjaya
2021-05-02 16:55 ` Jeff King
From: Bagas Sanjaya @ 2021-05-02 11:06 UTC
To: Git Users
Hi,
Recently I stumbled upon the git unpack-objects documentation, which says:
> Read a packed archive (.pack) from the standard input, expanding the objects contained within and writing them into the repository in "loose" (one object per file) format.
However, I have some questions:
1. When I do git fetch, what is the threshold/limit for "Unpacking objects",
in other words what is the minimum number of objects for invoking
"Resolving deltas" instead of "Unpacking objects"?
2. Can the threshold between unpacking objects and resolving deltas be
configured?
3. Why does Git in some cases unpack objects where resolving deltas
could be done instead?
Thanks.
--
An old man doll... just what I always wanted! - Clara
* Re: "Unpacking objects" question
From: Jeff King @ 2021-05-02 16:55 UTC
To: Bagas Sanjaya; +Cc: Git Users
On Sun, May 02, 2021 at 06:06:57PM +0700, Bagas Sanjaya wrote:
> Recently I stumbled upon git unpack-objects documentation, which says:
>
> > Read a packed archive (.pack) from the standard input, expanding the objects contained within and writing them into the repository in "loose" (one object per file) format.
>
> However, I have some questions:
>
> 1. When I do git fetch, what is the threshold/limit for "Unpacking objects",
> in other words what is the minimum number of objects for invoking
> "Resolving deltas" instead of "Unpacking objects"?
> 2. Can the threshold between unpacking objects and resolving deltas be
> configurable?
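See the fetch.unpackLimit config (it falls back to transfer.unpackLimit
when unset). The default is 100 objects: a fetch that receives fewer
objects than that is unpacked into loose objects, and anything at or
above it is stored as a pack.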
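
The threshold can be sketched roughly as below. This is a simplified
model and not Git's actual code: the function names are hypothetical,
and it only encodes what the git-config documentation states, namely
that fetch.unpackLimit falls back to transfer.unpackLimit, that the
default is 100, and that a fetch below the limit is unpacked into loose
objects while one at or above it is kept as a pack.

```python
from typing import Optional

# Default documented for transfer.unpackLimit in git-config.
DEFAULT_UNPACK_LIMIT = 100


def effective_unpack_limit(fetch_limit: Optional[int] = None,
                           transfer_limit: Optional[int] = None) -> int:
    """Pick the limit the way the config docs describe the fallback.

    Hypothetical helper: fetch.unpackLimit wins if set, then
    transfer.unpackLimit, then the built-in default.
    """
    if fetch_limit is not None:
        return fetch_limit
    if transfer_limit is not None:
        return transfer_limit
    return DEFAULT_UNPACK_LIMIT


def storage_for_fetch(object_count: int, limit: int) -> str:
    """Return "loose" if the fetched objects would be unpacked,
    or "pack" if the received pack would be kept (simplified model)."""
    return "loose" if object_count < limit else "pack"


if __name__ == "__main__":
    limit = effective_unpack_limit()
    for count in (5, 99, 100, 5000):
        print(count, storage_for_fetch(count, limit))
```

In this model, raising the limit makes more fetches end up as loose
objects, and lowering it makes more fetches keep their pack.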
> 3. Why in some cases Git do unpacking objects where resolving deltas
> can be done?
I don't know if the documentation discusses this tradeoff anywhere, but
off the top of my head:
- storing packs can be more efficient in disk space (because of deltas
within the pack, but also fewer inodes for small objects). This
effect is bigger the more objects you have.
- storing packs can be less efficient, because thin packs will be
completed with duplicates of already-stored objects. The overhead is
bigger the fewer objects you have.
I suspect that tradeoff is the main logic behind basing the decision on
the object count (I didn't dig in the history or the archive, though;
you might find more discussion there). AFAIK the number 100 doesn't have
any real scientific basis.
There are some other subtle effects, too:
- storing packs from the wire may make git-gc more efficient (you can
often reuse deltas sent by the other side)
- storing packs from the wire may produce a worse outcome after
git-gc, because you are reusing deltas produced by the client for
their push (who might not have spent as much CPU looking for them as
you would)
-Peff
* Re: "Unpacking objects" question
From: Junio C Hamano @ 2021-05-03 1:22 UTC
To: Jeff King; +Cc: Bagas Sanjaya, Git Users
Jeff King <peff@peff.net> writes:
> I don't know if the documentation discusses this tradeoff anywhere, but
> off the top of my head:
>
> - storing packs can be more efficient in disk space (because of deltas
> within the pack, but also fewer inodes for small objects). This
> effect is bigger the more objects you have.
>
> - storing packs can be less efficient, because thin packs will be
> completed with duplicates of already-stored objects. The overhead is
> bigger the fewer objects you have.
Another original motivation was to avoid ending up with too many
small packs: in the pre-midx world, looking up an object could take
time on the order of the number of packfiles in the repository.
After many small fetches, gc would be able to pack the resulting
objects all into a single pack.
> There are some other subtle effects, too:
>
> - storing packs from the wire may make git-gc more efficient (you can
> often reuse deltas sent by the other side)
- storing and using packs that came from the wire may not have as
good locality among objects, especially when the other side was a
server that is optimized to reduce outbound network bandwidth
(read: size) and their own processing cycles (read: object reuse
from their packs). Local packing has a dedicated phase to
reorder the objects to pack related ones close to each other, but
the "server" side has no incentive to optimize for that.
> - storing packs from the wire may produce a worse outcome after
> git-gc, because you are reusing deltas produced by the client for
> their push (who might not have spent as much CPU looking for them as
> you would)