git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "brian m. carlson" <sandals@crustytoothpaste.net>
To: <git@vger.kernel.org>
Cc: "Martin Ågren" <martin.agren@gmail.com>
Subject: [PATCH 1/2] docs: document SHA-256 pack and indices
Date: Thu, 13 Aug 2020 22:49:00 +0000	[thread overview]
Message-ID: <20200813224901.2652387-2-sandals@crustytoothpaste.net> (raw)
In-Reply-To: <20200813224901.2652387-1-sandals@crustytoothpaste.net>

Now that we have SHA-256 support for packs and indices, let's document
that in SHA-256 repositories, we use SHA-256 instead of SHA-1 for object
names and checksums.  Instead of duplicating this information throughout
the document, let's just document that in SHA-1 repositories, we use
SHA-1 for these purposes, and in SHA-256 repositories, we use SHA-256.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 Documentation/technical/pack-format.txt | 36 ++++++++++++++-----------
 1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/Documentation/technical/pack-format.txt b/Documentation/technical/pack-format.txt
index d3a142c652..f4c8d94f73 100644
--- a/Documentation/technical/pack-format.txt
+++ b/Documentation/technical/pack-format.txt
@@ -1,6 +1,12 @@
 Git pack format
 ===============
 
+== Checksums and object IDs
+
+In a repository using the traditional SHA-1, pack checksums, index checksums,
+and object IDs (object names) mentioned below are all computed using SHA-1.
+Similarly, in SHA-256 repositories, these values are computed using SHA-256.
+
 == pack-*.pack files have the following format:
 
    - A header appears at the beginning and consists of the following:
@@ -26,7 +32,7 @@ Git pack format
 
      (deltified representation)
      n-byte type and length (3-bit type, (n-1)*7+4-bit length)
-     20-byte base object name if OBJ_REF_DELTA or a negative relative
+     base object name if OBJ_REF_DELTA or a negative relative
 	 offset from the delta object's position in the pack if this
 	 is an OBJ_OFS_DELTA object
      compressed delta data
@@ -34,7 +40,7 @@ Git pack format
      Observation: length of each object is encoded in a variable
      length format and is not constrained to 32-bit or anything.
 
-  - The trailer records 20-byte SHA-1 checksum of all of the above.
+  - The trailer records a pack checksum of all of the above.
 
 === Object types
 
@@ -58,8 +64,8 @@ ofs-delta and ref-delta, which is only valid in a pack file.
 
 Both ofs-delta and ref-delta store the "delta" to be applied to
 another object (called 'base object') to reconstruct the object. The
-difference between them is, ref-delta directly encodes 20-byte base
-object name. If the base object is in the same pack, ofs-delta encodes
+difference between them is, ref-delta directly encodes base object
+name. If the base object is in the same pack, ofs-delta encodes
 the offset of the base object in the pack instead.
 
 The base object could also be deltified if it's in the same pack.
@@ -143,14 +149,14 @@ This is the instruction reserved for future expansion.
     object is stored in the packfile as the offset from the
     beginning.
 
-    20-byte object name.
+    one object name of the appropriate size.
 
   - The file is concluded with a trailer:
 
-    A copy of the 20-byte SHA-1 checksum at the end of
-    corresponding packfile.
+    A copy of the pack checksum at the end of the corresponding
+    packfile.
 
-    20-byte SHA-1-checksum of all of the above.
+    Index checksum of all of the above.
 
 Pack Idx file:
 
@@ -198,7 +204,7 @@ Pack file entry: <+
         If it is not DELTA, then deflated bytes (the size above
 		is the size before compression).
 	If it is REF_DELTA, then
-	  20-byte base object name SHA-1 (the size above is the
+	  base object name (the size above is the
 		size of the delta data that follows).
           delta data, deflated.
 	If it is OFS_DELTA, then
@@ -227,9 +233,9 @@ Pack file entry: <+
 
   - A 256-entry fan-out table just like v1.
 
-  - A table of sorted 20-byte SHA-1 object names.  These are
-    packed together without offset values to reduce the cache
-    footprint of the binary search for a specific object name.
+  - A table of sorted object names.  These are packed together
+    without offset values to reduce the cache footprint of the
+    binary search for a specific object name.
 
   - A table of 4-byte CRC32 values of the packed object data.
     This is new in v2 so compressed data can be copied directly
@@ -248,10 +254,10 @@ Pack file entry: <+
 
   - The same trailer as a v1 pack file:
 
-    A copy of the 20-byte SHA-1 checksum at the end of
+    A copy of the pack checksum at the end of
     corresponding packfile.
 
-    20-byte SHA-1-checksum of all of the above.
+    Index checksum of all of the above.
 
 == multi-pack-index (MIDX) files have the following format:
 
@@ -329,4 +335,4 @@ CHUNK DATA:
 
 TRAILER:
 
-	20-byte SHA1-checksum of the above contents.
+	Index checksum of the above contents.

  reply	other threads:[~2020-08-13 22:49 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-13 22:48 [PATCH 0/2] Documentation updates for SHA-256 brian m. carlson
2020-08-13 22:49 ` brian m. carlson [this message]
2020-08-14  2:26   ` [PATCH 1/2] docs: document SHA-256 pack and indices Derrick Stolee
2020-08-13 22:49 ` [PATCH 2/2] docs: fix step in transition plan brian m. carlson
2020-08-14 12:21   ` Martin Ågren
2020-08-14  2:33 ` [PATCH 0/2] Documentation updates for SHA-256 Derrick Stolee
2020-08-14  4:47   ` Junio C Hamano
2020-08-14 20:20     ` brian m. carlson
2020-08-14 20:25       ` Junio C Hamano
2020-08-14 12:21 ` [PATCH 0/5] more SHA-256 documentation Martin Ågren
2020-08-14 12:21   ` [PATCH 1/5] http-protocol.txt: document SHA-256 "want"/"have" format Martin Ågren
2020-08-14 17:28     ` Junio C Hamano
2020-08-14 20:23       ` brian m. carlson
2020-08-14 20:32         ` Martin Ågren
2020-08-14 20:39         ` Junio C Hamano
2020-08-14 20:47           ` Junio C Hamano
2020-08-14 12:21   ` [PATCH 2/5] index-format.txt: document SHA-256 index format Martin Ågren
2020-08-14 12:28     ` Derrick Stolee
2020-08-14 14:05       ` Martin Ågren
2020-08-14 12:21   ` [PATCH 3/5] protocol-capabilities.txt: clarify "allow-x-sha1-in-want" re SHA-256 Martin Ågren
2020-08-14 12:31     ` Derrick Stolee
2020-08-14 14:05       ` Martin Ågren
2020-08-14 17:33     ` Junio C Hamano
2020-08-14 20:35       ` Martin Ågren
2020-08-14 20:43         ` Junio C Hamano
2020-08-14 12:21   ` [PATCH 4/5] shallow.txt: document SHA-256 shallow format Martin Ågren
2020-08-14 12:21   ` [PATCH 5/5] commit-graph-format.txt: fix "Hash Version" description Martin Ågren
2020-08-14 12:37     ` Derrick Stolee
2020-08-14 14:10       ` Martin Ågren
2020-08-14 20:28   ` [PATCH 0/5] more SHA-256 documentation brian m. carlson
2020-08-15 16:05   ` [PATCH v2 0/4] " Martin Ågren
2020-08-15 16:05     ` [PATCH v2 1/4] http-protocol.txt: document SHA-256 "want"/"have" format Martin Ågren
2020-08-15 16:06     ` [PATCH v2 2/4] index-format.txt: document SHA-256 index format Martin Ågren
2020-08-15 16:06     ` [PATCH v2 3/4] protocol-capabilities.txt: clarify "allow-x-sha1-in-want" re SHA-256 Martin Ågren
2020-08-15 16:06     ` [PATCH v2 4/4] shallow.txt: document SHA-256 shallow format Martin Ågren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200813224901.2652387-2-sandals@crustytoothpaste.net \
    --to=sandals@crustytoothpaste.net \
    --cc=git@vger.kernel.org \
    --cc=martin.agren@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).