From: Farhan Khan <khanzf@gmail.com>
To: Duy Nguyen <pclouds@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: How DELTA objects values work and are calculated
Date: Sat, 5 Jan 2019 17:32:46 -0500
Message-ID: <581076b0-95c5-9af9-dec5-5a9bccfe2634@gmail.com>
In-Reply-To: <CACsJy8Da7+sNfxvTRz1DRn27TjvBXNAipKB=eumA6q+sVsVjcA@mail.gmail.com>



On 1/4/19 11:46 PM, Duy Nguyen wrote:
> On Sat, Jan 5, 2019 at 9:49 AM Farhan Khan <khanzf@gmail.com> wrote:
>>
>> Hi all,
>>
>> I'm having trouble understanding how OBJ_OFS_DELTA and OBJ_REF_DELTA
>> (deltas) work in git. Where does git calculate the sha1 hash values
>> when doing "git index-pack" in builtin/index-pack.c? I think my lack
>> of understanding of the code is compounded by the fact that I do not
>> understand what the two object types are.
>>
>> From tracing the code starting from index-pack, all non-delta object
>> type hashes are calculated in index-pack.c:1131 (parse_pack_objects).
>> However, when the function ends, the delta objects' hash values are
>> still set to all 0's.
> 
> Delta objects depend on other objects (which may themselves be deltas).
> To calculate a delta object's sha1 value we may need to recursively
> calculate the sha1 values of its base objects. This is why we do it in
> a separate phase: the calculation is more complicated than for
> non-delta objects.
> 
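The dependency chain described above can be sketched as follows. This is a toy model, not git's code: the `ENTRIES` table, the `resolve` helper, and the "append" style delta are all hypothetical and exist only to show why a delta's content (and therefore its hash) is unknown until its base has been resolved.

```python
# Hypothetical toy "pack": each entry is either full content or a delta that
# names its base. The delta here only appends bytes; real git deltas are
# copy/insert instruction streams.
ENTRIES = {
    "a": {"kind": "full", "data": b"hello\n"},
    "b": {"kind": "delta", "base": "a", "append": b"world\n"},
    "c": {"kind": "delta", "base": "b", "append": b"!\n"},
}


def resolve(name, cache):
    """Return the full content of an entry, resolving its base chain first."""
    if name in cache:
        return cache[name]
    entry = ENTRIES[name]
    if entry["kind"] == "full":
        data = entry["data"]
    else:
        # A delta cannot be reconstructed until its base content is known,
        # and the base may itself be a delta, hence the recursion.
        data = resolve(entry["base"], cache) + entry["append"]
    cache[name] = data
    return data


cache = {}
content = resolve("c", cache)
# Resolving "c" forced "b" and "a" to be resolved along the way.
print(sorted(cache), content)
```

The cache stands in for the idea that an already-resolved base does not need to be rebuilt for every delta that depends on it.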
>> My questions are:
>> A) How do Delta objects work?
> 
> A delta object consists of a reference to the base object (either a
> sha1 value, or the offset to where the object is) and a "delta" to be
> applied on top of it (it's basically a binary diff).
> 
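A minimal sketch of reading one pack entry, assuming the layout given in Documentation/technical/pack-format.txt: a variable-length header carrying the type and the uncompressed size, then the base reference for the two delta types, then zlib-compressed data. The `parse_entry` function and the dictionary it returns are illustrative names, not part of git.

```python
import zlib

OBJ_OFS_DELTA = 6
OBJ_REF_DELTA = 7


def parse_entry(buf, pos=0):
    """Parse one pack entry starting at buf[pos]; return a dict of fields."""
    byte = buf[pos]
    pos += 1
    obj_type = (byte >> 4) & 0x7      # bits 6-4 of the first byte
    size = byte & 0x0F                # low 4 bits of the size
    shift = 4
    while byte & 0x80:                # MSB set: more size bytes follow
        byte = buf[pos]
        pos += 1
        size |= (byte & 0x7F) << shift
        shift += 7
    entry = {"type": obj_type, "size": size}

    if obj_type == OBJ_REF_DELTA:
        # The base is named by its 20-byte sha1, stored uncompressed.
        entry["base_sha1"] = buf[pos:pos + 20].hex()
        pos += 20
    elif obj_type == OBJ_OFS_DELTA:
        # The base is named by a backwards offset from this entry.
        byte = buf[pos]
        pos += 1
        offset = byte & 0x7F
        while byte & 0x80:
            byte = buf[pos]
            pos += 1
            offset = ((offset + 1) << 7) | (byte & 0x7F)
        entry["base_offset"] = offset

    # Only the payload is compressed. For a delta entry the payload is the
    # delta instruction stream, and "size" is its uncompressed length.
    inflater = zlib.decompressobj()
    entry["data"] = inflater.decompress(buf[pos:])
    entry["end"] = len(buf) - len(inflater.unused_data)
    return entry
```

Under these assumptions, the header and the base reference can be read without decompressing anything; only the delta payload itself needs to be inflated.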
>> B) Where and how are the sha1 values calculated?
> 
> Start at threaded_second_pass() in index-pack.c; we go through all
> delta objects here and try to calculate their sha1 values. Eventually
> you'll hit resolve_delta(), where the delta is actually applied to the
> base object in the patch_delta() call, and the sha1 value is calculated
> in the following hash_object_file() call.
> 
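The two steps named above, applying the delta and then hashing the result, can be sketched like this. It follows the delta encoding described in Documentation/technical/pack-format.txt and is not a transcription of patch_delta() or hash_object_file(); `read_varint`, `apply_delta`, and `object_id` are hypothetical helper names.

```python
import hashlib


def read_varint(delta, pos):
    """Read a little-endian base-128 size; return (value, new_pos)."""
    value = shift = 0
    while True:
        byte = delta[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, pos


def apply_delta(base, delta):
    """Rebuild the target object from its base and an inflated delta."""
    base_size, pos = read_varint(delta, 0)
    result_size, pos = read_varint(delta, pos)
    if base_size != len(base):
        raise ValueError("delta does not match base size")
    out = bytearray()
    while pos < len(delta):
        cmd = delta[pos]
        pos += 1
        if cmd & 0x80:
            # Copy instruction: low bits say which offset/size bytes follow.
            offset = size = 0
            for bit in range(4):
                if cmd & (1 << bit):
                    offset |= delta[pos] << (8 * bit)
                    pos += 1
            for bit in range(3):
                if cmd & (0x10 << bit):
                    size |= delta[pos] << (8 * bit)
                    pos += 1
            if size == 0:
                size = 0x10000
            out += base[offset:offset + size]
        elif cmd:
            # Insert instruction: the next `cmd` bytes are literal data.
            out += delta[pos:pos + cmd]
            pos += cmd
        else:
            raise ValueError("reserved delta opcode 0")
    if len(out) != result_size:
        raise ValueError("unexpected result size")
    return bytes(out)


def object_id(obj_type, data):
    """SHA-1 over "<type> <size>\\0" followed by the object content."""
    header = b"%s %d\0" % (obj_type, len(data))
    return hashlib.sha1(header + data).hexdigest()


base = b"hello\n"
# Sizes 6 and 12, copy 6 bytes from offset 0, then insert b"world\n".
delta = bytes([6, 12, 0x90, 6, 6]) + b"world\n"
target = apply_delta(base, delta)
print(target, object_id(b"blob", target))
```

The resolved object takes the type of its base, which is why the example hashes the result as a blob. Until the delta has been applied there is no content to hash, which matches the all-zero hash values seen after the first pass.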
>>
>> I have read Documentation/technical/pack-format.txt, but am still not clear.
>>
>> Thank you!
>> --
>> Farhan Khan
>> PGP Fingerprint: B28D 2726 E2BC A97E 3854 5ABE 9A9F 00BC D525 16EE
> 
> 
> 


Hi Duy,

Thanks for explaining the Delta objects.

What does an OBJ_REF_DELTA object itself consist of? Do you have to 
uncompress it to parse its values? How do you get its size?

I read through resolve_deltas(), which leads to threaded_second_pass(), 
where you suggested I start, but I do not understand what is happening 
at a high level and get confused while reading the code.

From threaded_second_pass(), execution goes into a for-loop that runs 
resolve_base(), which runs find_unresolved_deltas(). Is this finding 
the unresolved deltas of the current object (the current OBJ_REF_DELTA 
we are going through)? Execution then continues into 
find_unresolved_deltas() and, shortly afterwards, 
find_unresolved_deltas_1(). It seems that find_unresolved_deltas_1() is 
applying deltas, but I am not certain.

I do not understand what is happening in any of these functions. There 
are some comments at builtin/index-pack.c:883-904.

Overall, I do not understand this entire process, what values to capture 
along the way, and how they are consumed. Please provide some guidance 
on how this process works.

Thank you!
Farhan
