git@vger.kernel.org mailing list mirror (one of many)
* [PATCH 0/2] bitmap-format.txt: fix some formatting issues and include checksum info
@ 2022-06-02 13:52 Abhradeep Chakraborty via GitGitGadget
  2022-06-02 13:52 ` [PATCH 1/2] bitmap-format.txt: fix some formatting issues Abhradeep Chakraborty via GitGitGadget
                   ` (2 more replies)
  0 siblings, 3 replies; 37+ messages in thread
From: Abhradeep Chakraborty via GitGitGadget @ 2022-06-02 13:52 UTC (permalink / raw)
  To: git; +Cc: Abhradeep Chakraborty

There are some issues in the bitmap-format html page. For example, some
nested lists are rendered as top-level lists (e.g. in [1],
BITMAP_OPT_FULL_DAG (0x1) and BITMAP_OPT_HASH_CACHE (0x4) are shown as
top-level list items).

The first commit fixes those.

The second commit adds information about the trailing checksum to the
bitmap-format documentation.

[1] https://git-scm.com/docs/bitmap-format#_on_disk_format

Abhradeep Chakraborty (2):
  bitmap-format.txt: fix some formatting issues
  bitmap-format.txt: add information for trailing checksum

 Documentation/technical/bitmap-format.txt | 100 +++++++++++-----------
 1 file changed, 49 insertions(+), 51 deletions(-)


base-commit: 2668e3608e47494f2f10ef2b6e69f08a84816bcb
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1246%2FAbhra303%2Ffix-doc-formatting-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1246/Abhra303/fix-doc-formatting-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1246
-- 
gitgitgadget


* [PATCH 1/2] bitmap-format.txt: fix some formatting issues
  2022-06-02 13:52 [PATCH 0/2] bitmap-format.txt: fix some formatting issues and include checksum info Abhradeep Chakraborty via GitGitGadget
@ 2022-06-02 13:52 ` Abhradeep Chakraborty via GitGitGadget
  2022-06-06 15:55   ` Junio C Hamano
  2022-06-02 13:52 ` [PATCH 2/2] bitmap-format.txt: add information for trailing checksum Abhradeep Chakraborty via GitGitGadget
  2022-06-07 17:43 ` [PATCH v2 0/3] bitmap-format.txt: fix some formatting issues and include checksum info Abhradeep Chakraborty via GitGitGadget
  2 siblings, 1 reply; 37+ messages in thread
From: Abhradeep Chakraborty via GitGitGadget @ 2022-06-02 13:52 UTC (permalink / raw)
  To: git; +Cc: Abhradeep Chakraborty, Abhradeep Chakraborty

From: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>

The asciidoc-generated html for
`Documentation/technical/bitmap-format.txt` is broken. This is mainly
because `-` is used for nested lists (which is not allowed in asciidoc)
instead of `*`.

Fix these and also reformat the file (e.g. by removing some blank lines)
for better readability of the html page.

Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
---
 Documentation/technical/bitmap-format.txt | 96 +++++++++++------------
 1 file changed, 45 insertions(+), 51 deletions(-)

diff --git a/Documentation/technical/bitmap-format.txt b/Documentation/technical/bitmap-format.txt
index 04b3ec21785..110d7ddf8ed 100644
--- a/Documentation/technical/bitmap-format.txt
+++ b/Documentation/technical/bitmap-format.txt
@@ -39,7 +39,7 @@ MIDXs, both the bit-cache and rev-cache extensions are required.
 
 == On-disk format
 
-	- A header appears at the beginning:
+	* A header appears at the beginning:
 
 		4-byte signature: {'B', 'I', 'T', 'M'}
 
@@ -48,35 +48,30 @@ MIDXs, both the bit-cache and rev-cache extensions are required.
 			of the bitmap index (the same one as JGit).
 
 		2-byte flags (network byte order)
-
 			The following flags are supported:
-
-			- BITMAP_OPT_FULL_DAG (0x1) REQUIRED
-			This flag must always be present. It implies that the
-			bitmap index has been generated for a packfile or
-			multi-pack index (MIDX) with full closure (i.e. where
-			every single object in the packfile/MIDX can find its
-			parent links inside the same packfile/MIDX). This is a
-			requirement for the bitmap index format, also present in
-			JGit, that greatly reduces the complexity of the
-			implementation.
-
-			- BITMAP_OPT_HASH_CACHE (0x4)
-			If present, the end of the bitmap file contains
-			`N` 32-bit name-hash values, one per object in the
-			pack/MIDX. The format and meaning of the name-hash is
-			described below.
+				- BITMAP_OPT_FULL_DAG (0x1) REQUIRED
+				This flag must always be present. It implies that the
+				bitmap index has been generated for a packfile or
+				multi-pack index (MIDX) with full closure (i.e. where
+				every single object in the packfile/MIDX can find its
+				parent links inside the same packfile/MIDX). This is a
+				requirement for the bitmap index format, also present in
+				JGit, that greatly reduces the complexity of the
+				implementation.
+				- BITMAP_OPT_HASH_CACHE (0x4)
+				If present, the end of the bitmap file contains
+				`N` 32-bit name-hash values, one per object in the
+				pack/MIDX. The format and meaning of the name-hash is
+				described below.
 
 		4-byte entry count (network byte order)
-
 			The total count of entries (bitmapped commits) in this bitmap index.
 
 		20-byte checksum
-
 			The SHA1 checksum of the pack/MIDX this bitmap index
 			belongs to.
 
-	- 4 EWAH bitmaps that act as type indexes
+	* 4 EWAH bitmaps that act as type indexes
 
 		Type indexes are serialized after the hash cache in the shape
 		of four EWAH bitmaps stored consecutively (see Appendix A for
@@ -84,7 +79,6 @@ MIDXs, both the bit-cache and rev-cache extensions are required.
 
 		There is a bitmap for each Git object type, stored in the following
 		order:
-
 			- Commits
 			- Trees
 			- Blobs
@@ -97,39 +91,39 @@ MIDXs, both the bit-cache and rev-cache extensions are required.
 		in a full set (all bits set), and the AND of all 4 bitmaps will
 		result in an empty bitmap (no bits set).
 
-	- N entries with compressed bitmaps, one for each indexed commit
+	* N entries with compressed bitmaps, one for each indexed commit
 
 		Where `N` is the total amount of entries in this bitmap index.
 		Each entry contains the following:
 
-		- 4-byte object position (network byte order)
-			The position **in the index for the packfile or
-			multi-pack index** where the bitmap for this commit is
-			found.
-
-		- 1-byte XOR-offset
-			The xor offset used to compress this bitmap. For an entry
-			in position `x`, a XOR offset of `y` means that the actual
-			bitmap representing this commit is composed by XORing the
-			bitmap for this entry with the bitmap in entry `x-y` (i.e.
-			the bitmap `y` entries before this one).
-
-			Note that this compression can be recursive. In order to
-			XOR this entry with a previous one, the previous entry needs
-			to be decompressed first, and so on.
-
-			The hard-limit for this offset is 160 (an entry can only be
-			xor'ed against one of the 160 entries preceding it). This
-			number is always positive, and hence entries are always xor'ed
-			with **previous** bitmaps, not bitmaps that will come afterwards
-			in the index.
-
-		- 1-byte flags for this bitmap
-			At the moment the only available flag is `0x1`, which hints
-			that this bitmap can be re-used when rebuilding bitmap indexes
-			for the repository.
-
-		- The compressed bitmap itself, see Appendix A.
+			** 4-byte object position (network byte order)
+				The position **in the index for the packfile or
+				multi-pack index** where the bitmap for this commit is
+				found.
+
+			** 1-byte XOR-offset
+				The xor offset used to compress this bitmap. For an entry
+				in position `x`, a XOR offset of `y` means that the actual
+				bitmap representing this commit is composed by XORing the
+				bitmap for this entry with the bitmap in entry `x-y` (i.e.
+				the bitmap `y` entries before this one).
+
+				Note that this compression can be recursive. In order to
+				XOR this entry with a previous one, the previous entry needs
+				to be decompressed first, and so on.
+
+				The hard-limit for this offset is 160 (an entry can only be
+				xor'ed against one of the 160 entries preceding it). This
+				number is always positive, and hence entries are always xor'ed
+				with **previous** bitmaps, not bitmaps that will come afterwards
+				in the index.
+
+			** 1-byte flags for this bitmap
+				At the moment the only available flag is `0x1`, which hints
+				that this bitmap can be re-used when rebuilding bitmap indexes
+				for the repository.
+
+			** The compressed bitmap itself, see Appendix A.
 
 == Appendix A: Serialization format for an EWAH bitmap
 
-- 
gitgitgadget
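
The header layout touched by this patch can be sketched as a small
parser. This is only an illustration, not the code git uses: the names
`BitmapHeader` and `parse_bitmap_header` are hypothetical, and the sketch
assumes a SHA-1 repository (20-byte checksum) plus the 2-byte version
field that sits between the signature and the flags in the full document.

```python
import struct
from typing import NamedTuple

BITMAP_OPT_FULL_DAG = 0x1    # required flag
BITMAP_OPT_HASH_CACHE = 0x4  # name-hash cache present at the end of the file

# Network byte order: 4-byte signature, 2-byte version, 2-byte flags,
# 4-byte entry count, 20-byte pack/MIDX checksum (32 bytes in total).
HEADER = struct.Struct(">4sHHI20s")


class BitmapHeader(NamedTuple):  # hypothetical name, for illustration only
    version: int
    flags: int
    entry_count: int
    checksum: bytes


def parse_bitmap_header(data: bytes) -> BitmapHeader:
    """Parse and sanity-check the fixed-size header of a bitmap file."""
    if len(data) < HEADER.size:
        raise ValueError("truncated bitmap header")
    signature, version, flags, entry_count, checksum = HEADER.unpack_from(data, 0)
    if signature != b"BITM":
        raise ValueError("bad signature")
    if version != 1:
        raise ValueError("unsupported bitmap index version")
    if not flags & BITMAP_OPT_FULL_DAG:
        raise ValueError("BITMAP_OPT_FULL_DAG must always be present")
    return BitmapHeader(version, flags, entry_count, checksum)
```

Checking `flags & BITMAP_OPT_HASH_CACHE` on the result tells a reader
whether to expect the name-hash values at the end of the file.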



* [PATCH 2/2] bitmap-format.txt: add information for trailing checksum
  2022-06-02 13:52 [PATCH 0/2] bitmap-format.txt: fix some formatting issues and include checksum info Abhradeep Chakraborty via GitGitGadget
  2022-06-02 13:52 ` [PATCH 1/2] bitmap-format.txt: fix some formatting issues Abhradeep Chakraborty via GitGitGadget
@ 2022-06-02 13:52 ` Abhradeep Chakraborty via GitGitGadget
  2022-06-07 17:43 ` [PATCH v2 0/3] bitmap-format.txt: fix some formatting issues and include checksum info Abhradeep Chakraborty via GitGitGadget
  2 siblings, 0 replies; 37+ messages in thread
From: Abhradeep Chakraborty via GitGitGadget @ 2022-06-02 13:52 UTC (permalink / raw)
  To: git; +Cc: Abhradeep Chakraborty, Abhradeep Chakraborty

From: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>

A bitmap file has a trailing checksum at its end. However, the
bitmap-format documentation does not mention it.

Add a trailer section describing the trailing checksum to the
`Documentation/technical/bitmap-format.txt` file.

Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
---
 Documentation/technical/bitmap-format.txt | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/Documentation/technical/bitmap-format.txt b/Documentation/technical/bitmap-format.txt
index 110d7ddf8ed..6846e7221a7 100644
--- a/Documentation/technical/bitmap-format.txt
+++ b/Documentation/technical/bitmap-format.txt
@@ -125,6 +125,10 @@ MIDXs, both the bit-cache and rev-cache extensions are required.
 
 			** The compressed bitmap itself, see Appendix A.
 
+	* TRAILER:
+
+		Index checksum of the above contents.
+
 == Appendix A: Serialization format for an EWAH bitmap
 
 Ewah bitmaps are serialized in the same protocol as the JAVAEWAH
-- 
gitgitgadget
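
A minimal sketch of what "index checksum of the above contents" means for
a reader of the file. It assumes the trailer is a 20-byte SHA-1 computed
over every byte that precedes it; `verify_bitmap_trailer` is a
hypothetical helper name, not a function in git.

```python
import hashlib

SHA1_LEN = 20  # assumption: SHA-1 repository, so a 20-byte trailer


def verify_bitmap_trailer(data: bytes) -> bool:
    """Return True if the last 20 bytes are the SHA-1 of everything before them."""
    if len(data) < SHA1_LEN:
        return False
    body, trailer = data[:-SHA1_LEN], data[-SHA1_LEN:]
    return hashlib.sha1(body).digest() == trailer
```

Note that this checksum is distinct from the 20-byte checksum in the
header, which identifies the pack/MIDX the bitmap index belongs to.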


* Re: [PATCH 1/2] bitmap-format.txt: fix some formatting issues
  2022-06-02 13:52 ` [PATCH 1/2] bitmap-format.txt: fix some formatting issues Abhradeep Chakraborty via GitGitGadget
@ 2022-06-06 15:55   ` Junio C Hamano
  2022-06-07 10:25     ` Abhradeep Chakraborty
  0 siblings, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2022-06-06 15:55 UTC (permalink / raw)
  To: Abhradeep Chakraborty via GitGitGadget
  Cc: git, Abhradeep Chakraborty, Vicent Marti, Taylor Blau

"Abhradeep Chakraborty via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
> Cc: git@vger.kernel.org,  Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>

Identify those who may have input with "git log --no-merges" and add
them here, perhaps?

> The asciidoc generated html for `Documentation/technical/bitmap-
> format.txt` is broken. This is mainly because `-` is used for nested
> lists (which is not allowed in asciidoc) instead of `*`.

Are we missing another step that must come much earlier than this
patch?  It seems to me that Documentation/Makefile does not even
consider that we should feed this file to AsciiDoc.

> Fix these and also reformat it (e.g. removing some blank lines) for
> better readability of the html page.

Do these blank lines hurt very badly how the end-result is formatted
in HTML?  Does the extra indentation between the line with "The
following flags are supported" on it and the two bullet items in the
header make the output better in significant way?

These changes make the input text much harder to read, and are not
very welcome, so unless they are part of "fixing generated HTML is
broken", please omit them.  As evidenced by the lack of HTML output
in the build system, a lot more folks read this document in text than
in HTML, and readability of the source matters.

Thanks.

> Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
> ---
>  Documentation/technical/bitmap-format.txt | 96 +++++++++++------------
>  1 file changed, 45 insertions(+), 51 deletions(-)
>
> diff --git a/Documentation/technical/bitmap-format.txt b/Documentation/technical/bitmap-format.txt
> index 04b3ec21785..110d7ddf8ed 100644
> --- a/Documentation/technical/bitmap-format.txt
> +++ b/Documentation/technical/bitmap-format.txt
> @@ -39,7 +39,7 @@ MIDXs, both the bit-cache and rev-cache extensions are required.
>  
>  == On-disk format
>  
> -	- A header appears at the beginning:
> +	* A header appears at the beginning:
>  
>  		4-byte signature: {'B', 'I', 'T', 'M'}
>  
> @@ -48,35 +48,30 @@ MIDXs, both the bit-cache and rev-cache extensions are required.
>  			of the bitmap index (the same one as JGit).
>  
>  		2-byte flags (network byte order)
> -
>  			The following flags are supported:
> -
> -			- BITMAP_OPT_FULL_DAG (0x1) REQUIRED
> -			This flag must always be present. It implies that the
> -			bitmap index has been generated for a packfile or
> -			multi-pack index (MIDX) with full closure (i.e. where
> -			every single object in the packfile/MIDX can find its
> -			parent links inside the same packfile/MIDX). This is a
> -			requirement for the bitmap index format, also present in
> -			JGit, that greatly reduces the complexity of the
> -			implementation.
> -
> -			- BITMAP_OPT_HASH_CACHE (0x4)
> -			If present, the end of the bitmap file contains
> -			`N` 32-bit name-hash values, one per object in the
> -			pack/MIDX. The format and meaning of the name-hash is
> -			described below.
> +				- BITMAP_OPT_FULL_DAG (0x1) REQUIRED
> +				This flag must always be present. It implies that the
> +				bitmap index has been generated for a packfile or
> +				multi-pack index (MIDX) with full closure (i.e. where
> +				every single object in the packfile/MIDX can find its
> +				parent links inside the same packfile/MIDX). This is a
> +				requirement for the bitmap index format, also present in
> +				JGit, that greatly reduces the complexity of the
> +				implementation.
> +				- BITMAP_OPT_HASH_CACHE (0x4)
> +				If present, the end of the bitmap file contains
> +				`N` 32-bit name-hash values, one per object in the
> +				pack/MIDX. The format and meaning of the name-hash is
> +				described below.
>  
>  		4-byte entry count (network byte order)
> -
>  			The total count of entries (bitmapped commits) in this bitmap index.
>  
>  		20-byte checksum
> -
>  			The SHA1 checksum of the pack/MIDX this bitmap index
>  			belongs to.
>  
> -	- 4 EWAH bitmaps that act as type indexes
> +	* 4 EWAH bitmaps that act as type indexes
>  
>  		Type indexes are serialized after the hash cache in the shape
>  		of four EWAH bitmaps stored consecutively (see Appendix A for
> @@ -84,7 +79,6 @@ MIDXs, both the bit-cache and rev-cache extensions are required.
>  
>  		There is a bitmap for each Git object type, stored in the following
>  		order:
> -
>  			- Commits
>  			- Trees
>  			- Blobs
> @@ -97,39 +91,39 @@ MIDXs, both the bit-cache and rev-cache extensions are required.
>  		in a full set (all bits set), and the AND of all 4 bitmaps will
>  		result in an empty bitmap (no bits set).
>  
> -	- N entries with compressed bitmaps, one for each indexed commit
> +	* N entries with compressed bitmaps, one for each indexed commit
>  
>  		Where `N` is the total amount of entries in this bitmap index.
>  		Each entry contains the following:
>  
> -		- 4-byte object position (network byte order)
> -			The position **in the index for the packfile or
> -			multi-pack index** where the bitmap for this commit is
> -			found.
> -
> -		- 1-byte XOR-offset
> -			The xor offset used to compress this bitmap. For an entry
> -			in position `x`, a XOR offset of `y` means that the actual
> -			bitmap representing this commit is composed by XORing the
> -			bitmap for this entry with the bitmap in entry `x-y` (i.e.
> -			the bitmap `y` entries before this one).
> -
> -			Note that this compression can be recursive. In order to
> -			XOR this entry with a previous one, the previous entry needs
> -			to be decompressed first, and so on.
> -
> -			The hard-limit for this offset is 160 (an entry can only be
> -			xor'ed against one of the 160 entries preceding it). This
> -			number is always positive, and hence entries are always xor'ed
> -			with **previous** bitmaps, not bitmaps that will come afterwards
> -			in the index.
> -
> -		- 1-byte flags for this bitmap
> -			At the moment the only available flag is `0x1`, which hints
> -			that this bitmap can be re-used when rebuilding bitmap indexes
> -			for the repository.
> -
> -		- The compressed bitmap itself, see Appendix A.
> +			** 4-byte object position (network byte order)
> +				The position **in the index for the packfile or
> +				multi-pack index** where the bitmap for this commit is
> +				found.
> +
> +			** 1-byte XOR-offset
> +				The xor offset used to compress this bitmap. For an entry
> +				in position `x`, a XOR offset of `y` means that the actual
> +				bitmap representing this commit is composed by XORing the
> +				bitmap for this entry with the bitmap in entry `x-y` (i.e.
> +				the bitmap `y` entries before this one).
> +
> +				Note that this compression can be recursive. In order to
> +				XOR this entry with a previous one, the previous entry needs
> +				to be decompressed first, and so on.
> +
> +				The hard-limit for this offset is 160 (an entry can only be
> +				xor'ed against one of the 160 entries preceding it). This
> +				number is always positive, and hence entries are always xor'ed
> +				with **previous** bitmaps, not bitmaps that will come afterwards
> +				in the index.
> +
> +			** 1-byte flags for this bitmap
> +				At the moment the only available flag is `0x1`, which hints
> +				that this bitmap can be re-used when rebuilding bitmap indexes
> +				for the repository.
> +
> +			** The compressed bitmap itself, see Appendix A.
>  
>  == Appendix A: Serialization format for an EWAH bitmap


* Re: [PATCH 1/2] bitmap-format.txt: fix some formatting issues
  2022-06-06 15:55   ` Junio C Hamano
@ 2022-06-07 10:25     ` Abhradeep Chakraborty
  0 siblings, 0 replies; 37+ messages in thread
From: Abhradeep Chakraborty @ 2022-06-07 10:25 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Abhradeep Chakraborty, Git, Vicent Marti, Taylor Blau,
	Kaartic Sivaraam, Derrick Stolee

Junio C Hamano <gitster@pobox.com> wrote:

> Identify those who may have input with "git log --no-merges" and add
> them here, perhaps?

Thanks. I have hopefully cc'd everyone who can give some input about the
patch, except Peff. I learned that he is taking a break, so I decided not
to cc him (I will do so if you ask). I would love to hear from other
people who have knowledge of asciidoc.

I previously informed Taylor and Kaartic about the patch but forgot to
cc them :P

Another thing to note is that the checksum that I included in the last
commit was suggested by Taylor himself. I was having trouble
understanding some portion of `load_bitmap_header()` (because I wasn't
aware of the trailing checksum) when he cleared up my doubt by saying
that a trailer checksum exists, and also suggested making a PR
addressing that -

> I'm glad that it was helpful! If you think others may be confused by the same, feel free to write a patch modifying Documentation/technical/bitmap-format.txt to point out the trailing checksum.

Junio wrote -

> Are we missing another step that must come much earlier than this
> patch?  It seems to me that Documentation/Makefile does not even
> consider that we should feed this file to AsciiDoc.

I think the same. At first, I thought this was intentional. When
I ran `make doc` (to test the resulting html file), it didn't generate
any html file for bitmap-format.txt. But thankfully there is an online
asciidoc editor[1] where you can check the resulting html file. You can
also check the resulting html by copy-pasting the content[2] of the
bitmap-format file from my github branch into that editor.

Will write a patch for it.

The current broken page can be found at - https://git-scm.com/docs/bitmap-format

> Do these blank lines hurt very badly how the end-result is formatted
> in HTML?  Does the extra indentation between the line with "The
> following flags are supported" on it and the two bullet items in the
> header make the output better in significant way?

To answer the first question - yes, those are necessary to improve
the html readability (you can verify that by adding and removing the
blank lines in the editor and observing the changes). This ensures that
all the related paragraphs are contained in the same block.

The extra indentations are not necessary. I added those because I thought
that they would be visually better for html page readers. If you think
they do the opposite, I can remove them.

I tried to use double bullets as little as possible (in most cases, nested
lists came under <pre> blocks, so I didn't have to use double bullets).
But in one case, I had to use them for nested lists (try the editor to
see the rendered output).

> These changes make the input text much harder to read, and are not
> very welcome, so unless they are part of "fixing generated HTML is
> broken", please omit them.  As evidenced by the lack of HTML output
> in the build system, a lot more folks read this document in text than
> in HTML, and readability of the source matters.

Okay, I will then remove those extra indentations. But apart from that,
all the changes are necessary.

I admit that readability of the source matters, but I think html pages
are also important (even more important) for people who don't have the
source code and want to learn about the git internals.

Thanks :)

[1] https://asciidoclive.com/edit/scratch/1
[2] https://github.com/Abhra303/git/blob/fix-doc-formatting/Documentation/technical/bitmap-format.txt


* [PATCH v2 0/3] bitmap-format.txt: fix some formatting issues and include checksum info
  2022-06-02 13:52 [PATCH 0/2] bitmap-format.txt: fix some formatting issues and include checksum info Abhradeep Chakraborty via GitGitGadget
  2022-06-02 13:52 ` [PATCH 1/2] bitmap-format.txt: fix some formatting issues Abhradeep Chakraborty via GitGitGadget
  2022-06-02 13:52 ` [PATCH 2/2] bitmap-format.txt: add information for trailing checksum Abhradeep Chakraborty via GitGitGadget
@ 2022-06-07 17:43 ` Abhradeep Chakraborty via GitGitGadget
  2022-06-07 17:43   ` [PATCH v2 1/3] bitmap-format.txt: feed the file to asciidoc to generate html Abhradeep Chakraborty via GitGitGadget
                     ` (4 more replies)
  2 siblings, 5 replies; 37+ messages in thread
From: Abhradeep Chakraborty via GitGitGadget @ 2022-06-07 17:43 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Vicent Marti, Kaartic Sivaraam, Derrick Stolee,
	Junio C Hamano, Abhradeep Chakraborty

There are some issues in the bitmap-format html page. For example, some
nested lists are rendered as top-level lists (e.g. in [1],
BITMAP_OPT_FULL_DAG (0x1) and BITMAP_OPT_HASH_CACHE (0x4) are shown as
top-level list items). The docs also need information about the trailing
checksum.

Changes since v1:

 * add a new commit addressing bitmap-format.txt html page generation
 * remove extra indentation from the previous change
 * elaborate more on the trailing checksum (as suggested by Kaartic)

initial version:

 * first commit fixes some formatting issues
 * information about trailing checksum in the bitmap file is added in the
   bitmap-format doc.

[1] https://git-scm.com/docs/bitmap-format#_on_disk_format

Abhradeep Chakraborty (3):
  bitmap-format.txt: feed the file to asciidoc to generate html
  bitmap-format.txt: fix some formatting issues
  bitmap-format.txt: add information for trailing checksum

 Documentation/Makefile                    |  1 +
 Documentation/technical/bitmap-format.txt | 24 +++++++++++------------
 2 files changed, 12 insertions(+), 13 deletions(-)


base-commit: 2668e3608e47494f2f10ef2b6e69f08a84816bcb
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1246%2FAbhra303%2Ffix-doc-formatting-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1246/Abhra303/fix-doc-formatting-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1246

Range-diff vs v1:

 -:  ----------- > 1:  a1b9bd9af90 bitmap-format.txt: feed the file to asciidoc to generate html
 1:  976361e624a ! 2:  cb919513c14 bitmap-format.txt: fix some formatting issues
     @@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cac
      -
       			The following flags are supported:
      -
     --			- BITMAP_OPT_FULL_DAG (0x1) REQUIRED
     --			This flag must always be present. It implies that the
     --			bitmap index has been generated for a packfile or
     --			multi-pack index (MIDX) with full closure (i.e. where
     --			every single object in the packfile/MIDX can find its
     --			parent links inside the same packfile/MIDX). This is a
     --			requirement for the bitmap index format, also present in
     --			JGit, that greatly reduces the complexity of the
     --			implementation.
     + 			- BITMAP_OPT_FULL_DAG (0x1) REQUIRED
     + 			This flag must always be present. It implies that the
     + 			bitmap index has been generated for a packfile or
     +@@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cache extensions are required.
     + 			requirement for the bitmap index format, also present in
     + 			JGit, that greatly reduces the complexity of the
     + 			implementation.
      -
     --			- BITMAP_OPT_HASH_CACHE (0x4)
     --			If present, the end of the bitmap file contains
     --			`N` 32-bit name-hash values, one per object in the
     --			pack/MIDX. The format and meaning of the name-hash is
     --			described below.
     -+				- BITMAP_OPT_FULL_DAG (0x1) REQUIRED
     -+				This flag must always be present. It implies that the
     -+				bitmap index has been generated for a packfile or
     -+				multi-pack index (MIDX) with full closure (i.e. where
     -+				every single object in the packfile/MIDX can find its
     -+				parent links inside the same packfile/MIDX). This is a
     -+				requirement for the bitmap index format, also present in
     -+				JGit, that greatly reduces the complexity of the
     -+				implementation.
     -+				- BITMAP_OPT_HASH_CACHE (0x4)
     -+				If present, the end of the bitmap file contains
     -+				`N` 32-bit name-hash values, one per object in the
     -+				pack/MIDX. The format and meaning of the name-hash is
     -+				described below.
     + 			- BITMAP_OPT_HASH_CACHE (0x4)
     + 			If present, the end of the bitmap file contains
     + 			`N` 32-bit name-hash values, one per object in the
     +@@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cache extensions are required.
     + 			described below.
       
       		4-byte entry count (network byte order)
      -
     @@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cac
       		Each entry contains the following:
       
      -		- 4-byte object position (network byte order)
     --			The position **in the index for the packfile or
     --			multi-pack index** where the bitmap for this commit is
     --			found.
     --
     ++		** 4-byte object position (network byte order)
     + 			The position **in the index for the packfile or
     + 			multi-pack index** where the bitmap for this commit is
     + 			found.
     + 
      -		- 1-byte XOR-offset
     --			The xor offset used to compress this bitmap. For an entry
     --			in position `x`, a XOR offset of `y` means that the actual
     --			bitmap representing this commit is composed by XORing the
     --			bitmap for this entry with the bitmap in entry `x-y` (i.e.
     --			the bitmap `y` entries before this one).
     --
     --			Note that this compression can be recursive. In order to
     --			XOR this entry with a previous one, the previous entry needs
     --			to be decompressed first, and so on.
     --
     --			The hard-limit for this offset is 160 (an entry can only be
     --			xor'ed against one of the 160 entries preceding it). This
     --			number is always positive, and hence entries are always xor'ed
     --			with **previous** bitmaps, not bitmaps that will come afterwards
     --			in the index.
     --
     ++		** 1-byte XOR-offset
     + 			The xor offset used to compress this bitmap. For an entry
     + 			in position `x`, a XOR offset of `y` means that the actual
     + 			bitmap representing this commit is composed by XORing the
     +@@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cache extensions are required.
     + 			with **previous** bitmaps, not bitmaps that will come afterwards
     + 			in the index.
     + 
      -		- 1-byte flags for this bitmap
     --			At the moment the only available flag is `0x1`, which hints
     --			that this bitmap can be re-used when rebuilding bitmap indexes
     --			for the repository.
     --
     ++		** 1-byte flags for this bitmap
     + 			At the moment the only available flag is `0x1`, which hints
     + 			that this bitmap can be re-used when rebuilding bitmap indexes
     + 			for the repository.
     + 
      -		- The compressed bitmap itself, see Appendix A.
     -+			** 4-byte object position (network byte order)
     -+				The position **in the index for the packfile or
     -+				multi-pack index** where the bitmap for this commit is
     -+				found.
     -+
     -+			** 1-byte XOR-offset
     -+				The xor offset used to compress this bitmap. For an entry
     -+				in position `x`, a XOR offset of `y` means that the actual
     -+				bitmap representing this commit is composed by XORing the
     -+				bitmap for this entry with the bitmap in entry `x-y` (i.e.
     -+				the bitmap `y` entries before this one).
     -+
     -+				Note that this compression can be recursive. In order to
     -+				XOR this entry with a previous one, the previous entry needs
     -+				to be decompressed first, and so on.
     -+
     -+				The hard-limit for this offset is 160 (an entry can only be
     -+				xor'ed against one of the 160 entries preceding it). This
     -+				number is always positive, and hence entries are always xor'ed
     -+				with **previous** bitmaps, not bitmaps that will come afterwards
     -+				in the index.
     -+
     -+			** 1-byte flags for this bitmap
     -+				At the moment the only available flag is `0x1`, which hints
     -+				that this bitmap can be re-used when rebuilding bitmap indexes
     -+				for the repository.
     -+
     -+			** The compressed bitmap itself, see Appendix A.
     ++		** The compressed bitmap itself, see Appendix A.
       
       == Appendix A: Serialization format for an EWAH bitmap
       
 2:  ba534b5d486 ! 3:  2171d31fb2b bitmap-format.txt: add information for trailing checksum
     @@ Commit message
       ## Documentation/technical/bitmap-format.txt ##
      @@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cache extensions are required.
       
     - 			** The compressed bitmap itself, see Appendix A.
     + 		** The compressed bitmap itself, see Appendix A.
       
      +	* TRAILER:
      +
     -+		Index checksum of the above contents.
     ++		Index checksum of the above contents. It is a 20-byte SHA1 checksum.
      +
       == Appendix A: Serialization format for an EWAH bitmap
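
The XOR-offset rule quoted in the range-diff above can be sketched as follows. This is only an illustration, not code from Git: bitmaps are modelled as plain Python integers rather than EWAH-compressed data, and the function name `resolve_bitmaps` and the `(xor_offset, stored_bits)` tuple layout are assumptions made for the example. Walking the entries in index order means every referenced earlier entry is already decoded, which covers the "recursive" case described in the text.

```python
MAX_XOR_OFFSET = 160  # hard limit stated in the format description


def resolve_bitmaps(entries):
    """Decode XOR-compressed entries (hypothetical helper, not Git's API).

    entries: list of (xor_offset, stored_bits) tuples in index order,
             where stored_bits is an int standing in for a bitmap.
    Returns the list of fully decoded bitmaps, one per entry.
    """
    resolved = []
    for position, (xor_offset, stored_bits) in enumerate(entries):
        if not 0 <= xor_offset <= MAX_XOR_OFFSET:
            raise ValueError(f"entry {position}: XOR offset out of range")
        if xor_offset > position:
            raise ValueError(f"entry {position}: offset points before entry 0")
        if xor_offset == 0:
            # No compression: the stored bitmap is the actual bitmap.
            resolved.append(stored_bits)
        else:
            # XOR against the already decoded bitmap `xor_offset` entries back.
            resolved.append(stored_bits ^ resolved[position - xor_offset])
    return resolved
```

Because an offset only ever points backwards, a single forward pass is enough; no explicit recursion is needed.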
       

-- 
gitgitgadget

* [PATCH v2 1/3] bitmap-format.txt: feed the file to asciidoc to generate html
  2022-06-07 17:43 ` [PATCH v2 0/3] bitmap-format.txt: fix some formatting issues and include checksum info Abhradeep Chakraborty via GitGitGadget
@ 2022-06-07 17:43   ` Abhradeep Chakraborty via GitGitGadget
  2022-06-07 18:39     ` Junio C Hamano
  2022-06-07 20:21     ` Taylor Blau
  2022-06-07 17:43   ` [PATCH v2 2/3] bitmap-format.txt: fix some formatting issues Abhradeep Chakraborty via GitGitGadget
                     ` (3 subsequent siblings)
  4 siblings, 2 replies; 37+ messages in thread
From: Abhradeep Chakraborty via GitGitGadget @ 2022-06-07 17:43 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Vicent Marti, Kaartic Sivaraam, Derrick Stolee,
	Junio C Hamano, Abhradeep Chakraborty, Abhradeep Chakraborty

From: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>

Documentation/Makefile does not include bitmap-format.txt to generate
a html page using asciidoc.

Teach Documentation/Makefile to also generate a html page for
Documentation/technical/bitmap-format.txt file.

Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
---
 Documentation/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/Makefile b/Documentation/Makefile
index d3f043f50d2..8d405a14330 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -94,6 +94,7 @@ TECH_DOCS += MyFirstContribution
 TECH_DOCS += MyFirstObjectWalk
 TECH_DOCS += SubmittingPatches
 TECH_DOCS += ToolsForGit
+TECH_DOCS += technical/bitmap-format
 TECH_DOCS += technical/bundle-format
 TECH_DOCS += technical/hash-function-transition
 TECH_DOCS += technical/http-protocol
-- 
gitgitgadget


* [PATCH v2 2/3] bitmap-format.txt: fix some formatting issues
  2022-06-07 17:43 ` [PATCH v2 0/3] bitmap-format.txt: fix some formatting issues and include checksum info Abhradeep Chakraborty via GitGitGadget
  2022-06-07 17:43   ` [PATCH v2 1/3] bitmap-format.txt: feed the file to asciidoc to generate html Abhradeep Chakraborty via GitGitGadget
@ 2022-06-07 17:43   ` Abhradeep Chakraborty via GitGitGadget
  2022-06-07 20:51     ` Taylor Blau
  2022-06-07 17:43   ` [PATCH v2 3/3] bitmap-format.txt: add information for trailing checksum Abhradeep Chakraborty via GitGitGadget
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 37+ messages in thread
From: Abhradeep Chakraborty via GitGitGadget @ 2022-06-07 17:43 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Vicent Marti, Kaartic Sivaraam, Derrick Stolee,
	Junio C Hamano, Abhradeep Chakraborty, Abhradeep Chakraborty

From: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>

The asciidoc generated html for `Documentation/technical/bitmap-
format.txt` is broken. This is mainly because `-` is used for nested
lists (which is not allowed in asciidoc) instead of `*`.

Fix these and also reformat it (e.g. removing some blank lines) for
better readability of the html page.

Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
---
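Note for readers cross-checking the header layout touched by the hunks below: a minimal sketch in Python of how the fixed-size header could be decoded. It is not taken from Git's sources. The helper name `parse_bitmap_header` is hypothetical, the 2-byte version field comes from the unchanged part of the document, and the 20-byte checksum assumes a SHA-1 repository.

```python
import struct

# Flag values as listed in the "2-byte flags" section.
BITMAP_OPT_FULL_DAG = 0x1
BITMAP_OPT_HASH_CACHE = 0x4

# signature (4) + version (2) + flags (2) + entry count (4) + checksum (20);
# ">" selects network byte order with no padding.
HEADER = struct.Struct(">4sHHI20s")


def parse_bitmap_header(data: bytes) -> dict:
    """Decode the header of a bitmap index (hypothetical helper)."""
    if len(data) < HEADER.size:
        raise ValueError("truncated bitmap header")
    signature, version, flags, entry_count, checksum = HEADER.unpack_from(data)
    if signature != b"BITM":
        raise ValueError("bad signature")
    if not flags & BITMAP_OPT_FULL_DAG:
        raise ValueError("BITMAP_OPT_FULL_DAG must always be present")
    return {
        "version": version,
        "has_hash_cache": bool(flags & BITMAP_OPT_HASH_CACHE),
        "entry_count": entry_count,
        "pack_checksum": checksum,
    }
```
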
 Documentation/technical/bitmap-format.txt | 20 +++++++-------------
 1 file changed, 7 insertions(+), 13 deletions(-)

diff --git a/Documentation/technical/bitmap-format.txt b/Documentation/technical/bitmap-format.txt
index 04b3ec21785..f22669b5916 100644
--- a/Documentation/technical/bitmap-format.txt
+++ b/Documentation/technical/bitmap-format.txt
@@ -39,7 +39,7 @@ MIDXs, both the bit-cache and rev-cache extensions are required.
 
 == On-disk format
 
-	- A header appears at the beginning:
+	* A header appears at the beginning:
 
 		4-byte signature: {'B', 'I', 'T', 'M'}
 
@@ -48,9 +48,7 @@ MIDXs, both the bit-cache and rev-cache extensions are required.
 			of the bitmap index (the same one as JGit).
 
 		2-byte flags (network byte order)
-
 			The following flags are supported:
-
 			- BITMAP_OPT_FULL_DAG (0x1) REQUIRED
 			This flag must always be present. It implies that the
 			bitmap index has been generated for a packfile or
@@ -60,7 +58,6 @@ MIDXs, both the bit-cache and rev-cache extensions are required.
 			requirement for the bitmap index format, also present in
 			JGit, that greatly reduces the complexity of the
 			implementation.
-
 			- BITMAP_OPT_HASH_CACHE (0x4)
 			If present, the end of the bitmap file contains
 			`N` 32-bit name-hash values, one per object in the
@@ -68,15 +65,13 @@ MIDXs, both the bit-cache and rev-cache extensions are required.
 			described below.
 
 		4-byte entry count (network byte order)
-
 			The total count of entries (bitmapped commits) in this bitmap index.
 
 		20-byte checksum
-
 			The SHA1 checksum of the pack/MIDX this bitmap index
 			belongs to.
 
-	- 4 EWAH bitmaps that act as type indexes
+	* 4 EWAH bitmaps that act as type indexes
 
 		Type indexes are serialized after the hash cache in the shape
 		of four EWAH bitmaps stored consecutively (see Appendix A for
@@ -84,7 +79,6 @@ MIDXs, both the bit-cache and rev-cache extensions are required.
 
 		There is a bitmap for each Git object type, stored in the following
 		order:
-
 			- Commits
 			- Trees
 			- Blobs
@@ -97,17 +91,17 @@ MIDXs, both the bit-cache and rev-cache extensions are required.
 		in a full set (all bits set), and the AND of all 4 bitmaps will
 		result in an empty bitmap (no bits set).
 
-	- N entries with compressed bitmaps, one for each indexed commit
+	* N entries with compressed bitmaps, one for each indexed commit
 
 		Where `N` is the total amount of entries in this bitmap index.
 		Each entry contains the following:
 
-		- 4-byte object position (network byte order)
+		** 4-byte object position (network byte order)
 			The position **in the index for the packfile or
 			multi-pack index** where the bitmap for this commit is
 			found.
 
-		- 1-byte XOR-offset
+		** 1-byte XOR-offset
 			The xor offset used to compress this bitmap. For an entry
 			in position `x`, a XOR offset of `y` means that the actual
 			bitmap representing this commit is composed by XORing the
@@ -124,12 +118,12 @@ MIDXs, both the bit-cache and rev-cache extensions are required.
 			with **previous** bitmaps, not bitmaps that will come afterwards
 			in the index.
 
-		- 1-byte flags for this bitmap
+		** 1-byte flags for this bitmap
 			At the moment the only available flag is `0x1`, which hints
 			that this bitmap can be re-used when rebuilding bitmap indexes
 			for the repository.
 
-		- The compressed bitmap itself, see Appendix A.
+		** The compressed bitmap itself, see Appendix A.
 
 == Appendix A: Serialization format for an EWAH bitmap
 
-- 
gitgitgadget


* [PATCH v2 3/3] bitmap-format.txt: add information for trailing checksum
  2022-06-07 17:43 ` [PATCH v2 0/3] bitmap-format.txt: fix some formatting issues and include checksum info Abhradeep Chakraborty via GitGitGadget
  2022-06-07 17:43   ` [PATCH v2 1/3] bitmap-format.txt: feed the file to asciidoc to generate html Abhradeep Chakraborty via GitGitGadget
  2022-06-07 17:43   ` [PATCH v2 2/3] bitmap-format.txt: fix some formatting issues Abhradeep Chakraborty via GitGitGadget
@ 2022-06-07 17:43   ` Abhradeep Chakraborty via GitGitGadget
  2022-06-07 20:56     ` Taylor Blau
  2022-06-07 18:28   ` [PATCH v2 0/3] bitmap-format.txt: fix some formatting issues and include checksum info Junio C Hamano
  2022-06-10 10:54   ` [PATCH v3 " Abhradeep Chakraborty via GitGitGadget
  4 siblings, 1 reply; 37+ messages in thread
From: Abhradeep Chakraborty via GitGitGadget @ 2022-06-07 17:43 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Vicent Marti, Kaartic Sivaraam, Derrick Stolee,
	Junio C Hamano, Abhradeep Chakraborty, Abhradeep Chakraborty

From: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>

Bitmap file has a trailing checksum at the end of the file. However
there is no information in the bitmap-format documentation about it.

Add a trailer section to include the trailing checksum info in the
`Documentation/technical/bitmap-format.txt` file.

Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
---
 Documentation/technical/bitmap-format.txt | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/Documentation/technical/bitmap-format.txt b/Documentation/technical/bitmap-format.txt
index f22669b5916..a43d2fe2bbf 100644
--- a/Documentation/technical/bitmap-format.txt
+++ b/Documentation/technical/bitmap-format.txt
@@ -125,6 +125,10 @@ MIDXs, both the bit-cache and rev-cache extensions are required.
 
 		** The compressed bitmap itself, see Appendix A.
 
+	* TRAILER:
+
+		Index checksum of the above contents. It is a 20-byte SHA1 checksum.
+
 == Appendix A: Serialization format for an EWAH bitmap
 
 Ewah bitmaps are serialized in the same protocol as the JAVAEWAH
-- 
gitgitgadget

* Re: [PATCH v2 0/3] bitmap-format.txt: fix some formatting issues and include checksum info
  2022-06-07 17:43 ` [PATCH v2 0/3] bitmap-format.txt: fix some formatting issues and include checksum info Abhradeep Chakraborty via GitGitGadget
                     ` (2 preceding siblings ...)
  2022-06-07 17:43   ` [PATCH v2 3/3] bitmap-format.txt: add information for trailing checksum Abhradeep Chakraborty via GitGitGadget
@ 2022-06-07 18:28   ` Junio C Hamano
  2022-06-07 20:58     ` Taylor Blau
  2022-06-07 21:00     ` Junio C Hamano
  2022-06-10 10:54   ` [PATCH v3 " Abhradeep Chakraborty via GitGitGadget
  4 siblings, 2 replies; 37+ messages in thread
From: Junio C Hamano @ 2022-06-07 18:28 UTC (permalink / raw)
  To: Abhradeep Chakraborty via GitGitGadget
  Cc: git, Taylor Blau, Vicent Marti, Kaartic Sivaraam, Derrick Stolee,
	Abhradeep Chakraborty

"Abhradeep Chakraborty via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> There are some issues in the bitmap-format html page.

"First, it does not even exist!" before anything else ;-)

> For example, some
> nested lists are shown as top-level lists (e.g. [1]- Here
> BITMAP_OPT_FULL_DAG (0x1) and BITMAP_OPT_HASH_CACHE (0x4) are shown as
> top-level list). There is also a need of adding info about trailing checksum
> in the docs.
>
> Changes since v1:
>
>  * a new commit addressing bitmap-format.txt html page generation is added

Good.

>  * Remove extra indentation from the previous change

Good.

>  * elaborate more about the trailing checksum (as suggested by Kaartic)

Good.

Will take a look (and audiences are requested to do so, too).

Thanks.

* Re: [PATCH v2 1/3] bitmap-format.txt: feed the file to asciidoc to generate html
  2022-06-07 17:43   ` [PATCH v2 1/3] bitmap-format.txt: feed the file to asciidoc to generate html Abhradeep Chakraborty via GitGitGadget
@ 2022-06-07 18:39     ` Junio C Hamano
  2022-06-08 15:02       ` Abhradeep Chakraborty
  2022-06-07 20:21     ` Taylor Blau
  1 sibling, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2022-06-07 18:39 UTC (permalink / raw)
  To: Abhradeep Chakraborty via GitGitGadget
  Cc: git, Taylor Blau, Vicent Marti, Kaartic Sivaraam, Derrick Stolee,
	Abhradeep Chakraborty

"Abhradeep Chakraborty via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
>
> Documentation/Makefile does not include bitmap-format.txt to generate
> a html page using asciidoc.
>
> Teach Documentation/Makefile to also generate a html page for
> Documentation/technical/bitmap-format.txt file.
>
> Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
> ---
>  Documentation/Makefile | 1 +
>  1 file changed, 1 insertion(+)

The change itself is obviously correct (assuming that it is worth
passing the document to AsciiDoc, instead of reading it in text,
that is).

> diff --git a/Documentation/Makefile b/Documentation/Makefile
> index d3f043f50d2..8d405a14330 100644
> --- a/Documentation/Makefile
> +++ b/Documentation/Makefile
> @@ -94,6 +94,7 @@ TECH_DOCS += MyFirstContribution
>  TECH_DOCS += MyFirstObjectWalk
>  TECH_DOCS += SubmittingPatches
>  TECH_DOCS += ToolsForGit
> +TECH_DOCS += technical/bitmap-format
>  TECH_DOCS += technical/bundle-format
>  TECH_DOCS += technical/hash-function-transition
>  TECH_DOCS += technical/http-protocol

Is bitmap-format the only one that is not fed to AsciiDoc, by the
way?  Are there other 'text-only' documents that are worth converting
to AsciiDoc?

It is outside the scope of this series, of course, to actually
adjust them, but since you are already doing the homework, I
thought you might already know the answer, which may become a source
of inspiration for others to find something to work on.

Thanks.




* Re: [PATCH v2 1/3] bitmap-format.txt: feed the file to asciidoc to generate html
  2022-06-07 17:43   ` [PATCH v2 1/3] bitmap-format.txt: feed the file to asciidoc to generate html Abhradeep Chakraborty via GitGitGadget
  2022-06-07 18:39     ` Junio C Hamano
@ 2022-06-07 20:21     ` Taylor Blau
  1 sibling, 0 replies; 37+ messages in thread
From: Taylor Blau @ 2022-06-07 20:21 UTC (permalink / raw)
  To: Abhradeep Chakraborty via GitGitGadget
  Cc: git, Vicent Marti, Kaartic Sivaraam, Derrick Stolee,
	Junio C Hamano, Abhradeep Chakraborty

On Tue, Jun 07, 2022 at 05:43:32PM +0000, Abhradeep Chakraborty via GitGitGadget wrote:
> From: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
>
> Documentation/Makefile does not include bitmap-format.txt to generate
> a html page using asciidoc.
>
> Teach Documentation/Makefile to also generate a html page for
> Documentation/technical/bitmap-format.txt file.

I am glad to see us finally getting around to this ;). I proposed this
back in:

    https://lore.kernel.org/git/b0bb2e8051f19ec47140fda6500e092e37c6bea8.1624314293.git.me@ttaylorr.com/

but I dropped it from later versions of that series, due in large part
to some of the formatting issues that your series here fixes.

Thanks,
Taylor

* Re: [PATCH v2 2/3] bitmap-format.txt: fix some formatting issues
  2022-06-07 17:43   ` [PATCH v2 2/3] bitmap-format.txt: fix some formatting issues Abhradeep Chakraborty via GitGitGadget
@ 2022-06-07 20:51     ` Taylor Blau
  2022-06-07 22:02       ` Junio C Hamano
  2022-06-08 15:40       ` Abhradeep Chakraborty
  0 siblings, 2 replies; 37+ messages in thread
From: Taylor Blau @ 2022-06-07 20:51 UTC (permalink / raw)
  To: Abhradeep Chakraborty via GitGitGadget
  Cc: git, Vicent Marti, Kaartic Sivaraam, Derrick Stolee,
	Junio C Hamano, Abhradeep Chakraborty

On Tue, Jun 07, 2022 at 05:43:33PM +0000, Abhradeep Chakraborty via GitGitGadget wrote:
> From: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
>
> The asciidoc generated html for `Documentation/technical/bitmap-
> format.txt` is broken. This is mainly because `-` is used for nested
> lists (which is not allowed in asciidoc) instead of `*`.
>
> Fix these and also reformat it (e.g. removing some blank lines) for
> better readability of the html page.

Hmm. When I render the HTML for this page and view it in my browser, the
removed blank lines make the contents of the section "2-byte flags
(network byte order)" run together, which I think hurts readability.

Is there a way to keep those line breaks without significantly
reformatting the source of this file?

> Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
> ---
>  Documentation/technical/bitmap-format.txt | 20 +++++++-------------
>  1 file changed, 7 insertions(+), 13 deletions(-)
>
> diff --git a/Documentation/technical/bitmap-format.txt b/Documentation/technical/bitmap-format.txt
> index 04b3ec21785..f22669b5916 100644
> --- a/Documentation/technical/bitmap-format.txt
> +++ b/Documentation/technical/bitmap-format.txt
> @@ -39,7 +39,7 @@ MIDXs, both the bit-cache and rev-cache extensions are required.
>
>  == On-disk format
>
> -	- A header appears at the beginning:
> +	* A header appears at the beginning:
>
>  		4-byte signature: {'B', 'I', 'T', 'M'}

Similarly, everything below the "A header appears at the beginning"
list item appears in a <pre> element, so the rendered HTML looks more
like plaintext to me.

This isn't new from your patch, but I wonder if now is a good
opportunity to make some light use of the formatting options that
AsciiDoc gives us to make the page read a little bit more easily when
rendered as HTML.

I don't want to compromise too much on the readability of the .txt file,
though, so if there isn't a good way to strike this balance, then I
trust you and think we should leave it as you have modified things here.

Thanks,
Taylor

* Re: [PATCH v2 3/3] bitmap-format.txt: add information for trailing checksum
  2022-06-07 17:43   ` [PATCH v2 3/3] bitmap-format.txt: add information for trailing checksum Abhradeep Chakraborty via GitGitGadget
@ 2022-06-07 20:56     ` Taylor Blau
  2022-06-08 16:15       ` Abhradeep Chakraborty
  0 siblings, 1 reply; 37+ messages in thread
From: Taylor Blau @ 2022-06-07 20:56 UTC (permalink / raw)
  To: Abhradeep Chakraborty via GitGitGadget
  Cc: git, Vicent Marti, Kaartic Sivaraam, Derrick Stolee,
	Junio C Hamano, Abhradeep Chakraborty

On Tue, Jun 07, 2022 at 05:43:34PM +0000, Abhradeep Chakraborty via GitGitGadget wrote:
> From: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
>
> Bitmap file has a trailing checksum at the end of the file. However
> there is no information in the bitmap-format documentation about it.
>
> Add a trailer section to include the trailing checksum info in the
> `Documentation/technical/bitmap-format.txt` file.
>
> Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
> ---
>  Documentation/technical/bitmap-format.txt | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/Documentation/technical/bitmap-format.txt b/Documentation/technical/bitmap-format.txt
> index f22669b5916..a43d2fe2bbf 100644
> --- a/Documentation/technical/bitmap-format.txt
> +++ b/Documentation/technical/bitmap-format.txt
> @@ -125,6 +125,10 @@ MIDXs, both the bit-cache and rev-cache extensions are required.
>
>  		** The compressed bitmap itself, see Appendix A.
>
> +	* TRAILER:
> +
> +		Index checksum of the above contents. It is a 20-byte SHA1 checksum.
> +

I assume by "Index checksum" you are referring to a checksum of the
bitmap _index_'s contents. That term is used a little throughout
pack-format.txt, but it's foreign to me. Assuming that's how you meant
it, a more conventional term (I think) would be just "trailing
checksum".

It is also not guaranteed to be a SHA-1 checksum, if the repository
which wrote the bitmap is in SHA-256 mode. So I would suggest that this
addition just read:

    * TRAILER:

      Trailing checksum of the preceding contents.
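
To illustrate why the wording should not hard-code a length, here is a minimal sketch in Python of checking such a trailer. It is not Git's implementation: the helper name `verify_trailing_checksum` is hypothetical, and it assumes the trailer is the hash of everything that precedes it, computed with the repository's hash algorithm (20 bytes for SHA-1, 32 bytes for SHA-256).

```python
import hashlib


def verify_trailing_checksum(data: bytes, algo: str = "sha1") -> bool:
    """Check that the last digest-sized bytes hash the preceding contents.

    Hypothetical helper; `algo` is "sha1" or "sha256" depending on the
    repository's object format.
    """
    digest_size = hashlib.new(algo).digest_size
    if len(data) < digest_size:
        return False
    body, trailer = data[:-digest_size], data[-digest_size:]
    return hashlib.new(algo, body).digest() == trailer
```

The same function covers both hash modes because the trailer length is derived from the algorithm rather than fixed at 20 bytes.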

Thanks,
Taylor

* Re: [PATCH v2 0/3] bitmap-format.txt: fix some formatting issues and include checksum info
  2022-06-07 18:28   ` [PATCH v2 0/3] bitmap-format.txt: fix some formatting issues and include checksum info Junio C Hamano
@ 2022-06-07 20:58     ` Taylor Blau
  2022-06-07 21:00     ` Junio C Hamano
  1 sibling, 0 replies; 37+ messages in thread
From: Taylor Blau @ 2022-06-07 20:58 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Abhradeep Chakraborty via GitGitGadget, git, Vicent Marti,
	Kaartic Sivaraam, Derrick Stolee, Abhradeep Chakraborty

On Tue, Jun 07, 2022 at 11:28:17AM -0700, Junio C Hamano wrote:
> Will take a look (and audiences are requested to do so, too).

I think this is on a good track. The rendered HTML still has much of its
content inside of <pre> elements, but that may be an acceptable
trade-off to maintain readability of the source material.

If there's a way to make the rendered page more appealing without
compromising on the readability of the source, I'd be in favor of that.
But I trust Abhradeep's judgement here, so if there isn't, I'd be happy
with the series (mostly) as-is.

I left a textual suggestion on the third patch, which I'd like to adopt
before picking this up (this will also give Abhradeep a chance to
investigate the formatting improvements on patch 2/3).

In the meantime, it's probably safe to drop Vicent Martí from the CC
list, since he is no longer working on Git (though I miss him very
much!).

Thanks,
Taylor

* Re: [PATCH v2 0/3] bitmap-format.txt: fix some formatting issues and include checksum info
  2022-06-07 18:28   ` [PATCH v2 0/3] bitmap-format.txt: fix some formatting issues and include checksum info Junio C Hamano
  2022-06-07 20:58     ` Taylor Blau
@ 2022-06-07 21:00     ` Junio C Hamano
  2022-06-08 17:12       ` Abhradeep Chakraborty
  1 sibling, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2022-06-07 21:00 UTC (permalink / raw)
  To: Abhradeep Chakraborty via GitGitGadget
  Cc: git, Taylor Blau, Vicent Marti, Kaartic Sivaraam, Derrick Stolee,
	Abhradeep Chakraborty

Junio C Hamano <gitster@pobox.com> writes:

> "Abhradeep Chakraborty via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
>> There are some issues in the bitmap-format html page.
>
> "First, it does not even exist!" before anything else ;-)
>
>> For example, some
>> nested lists are shown as top-level lists (e.g. [1]- Here
>> BITMAP_OPT_FULL_DAG (0x1) and BITMAP_OPT_HASH_CACHE (0x4) are shown as
>> top-level list). There is also a need of adding info about trailing checksum
>> in the docs.
> ...

No, this is not quite ready for production.

Almost all the "indented" material is shown in fixed-width
typewriter format in the resulting HTML output.

Look how ugly the output from it is.  Not your fault; it is mostly
because when the original text was written, it was not even meant to
be given to AsciiDoc.

  https://twitter.com/jch2355/status/1534276427607986178/photo/1
  https://pbs.twimg.com/media/FUrYP2nakAAnRaH?format=png

And as I already said, removal of the blank lines made it harder to
see what is going on in the source, and because the output is pretty
much a straight copy of the source in a fixed-width font, just like
reading the source in the terminal, the output here is equally hard
to read.

  https://twitter.com/jch2355/status/1534277664441511937/photo/1
  https://pbs.twimg.com/media/FUrZZXUUsAEmEeT?format=png

If we really want to give it to AsciiDoc, we'd need to reformat it
more extensively, not just tweak it on the surface and make an
equivalent of <pre>...</pre> slightly easier to read, which is what
this patch does.

Thanks.

* Re: [PATCH v2 2/3] bitmap-format.txt: fix some formatting issues
  2022-06-07 20:51     ` Taylor Blau
@ 2022-06-07 22:02       ` Junio C Hamano
  2022-06-08 16:06         ` Abhradeep Chakraborty
  2022-06-08 15:40       ` Abhradeep Chakraborty
  1 sibling, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2022-06-07 22:02 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Abhradeep Chakraborty via GitGitGadget, git, Vicent Marti,
	Kaartic Sivaraam, Derrick Stolee, Abhradeep Chakraborty

Taylor Blau <me@ttaylorr.com> writes:

> Similarly, everything below the "A header appears at the beginning"
> list item appears in a <pre> element, so the rendered HTML looks more
> like plaintext to me.

True.  Unless we are going to revamp the text in some major way so
that we produce "true" HTML, not just the text source enclosed in a
<pre></pre> pair, I would think we are better off keeping it not
passed to AsciiDoc and leaving it in text format.  After all, modern
browsers, which I presume those who want HTML output files would
read them with, can display plain text files just fine, can't they?

> This isn't new from your patch, but I wonder if now is a good
> opportunity to make some light use of the formatting options that
> AsciiDoc gives us to make the page read a little bit more easily when
> rendered as HTML.

There was some talk about asking those who are adept at website
engineering to work on git-scm.com; it may be a good starting point
to look at these text files that weren't originally written to be
given to AsciiDoc and convert them to be true AsciiDoc sources.

Thanks.

* Re: [PATCH v2 1/3] bitmap-format.txt: feed the file to asciidoc to generate html
  2022-06-07 18:39     ` Junio C Hamano
@ 2022-06-08 15:02       ` Abhradeep Chakraborty
  0 siblings, 0 replies; 37+ messages in thread
From: Abhradeep Chakraborty @ 2022-06-08 15:02 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Abhradeep Chakraborty, Git, Taylor Blau, Kaartic Sivaraam,
	Derrick Stolee

Junio C Hamano <gitster@pobox.com> wrote:

> Is bitmap-format the only one that is not fed to AsciiDoc, by the
> way?  Are there other 'text-only' documents that are worth converting
> to AsciiDoc?
>
> It is outside the scope of this series, of course, to actually
> adjust them, but since you are already doing the homework, I
> thought you might already know the answer, which may become a source
> of inspiration for others to find something to work on.

No, bitmap-format is not the only one. There are more text-only files.
Some that I have found so far are technical/chunk-format.txt and
technical/commit-graph.txt. There are others, but I don't know whether
they actually need HTML conversion. I think the two I mentioned are
worth rendering as HTML.

I was thinking of adding those in my commit, but then I thought it would
divert the patch series.

Thanks :)

* Re: [PATCH v2 2/3] bitmap-format.txt: fix some formatting issues
  2022-06-07 20:51     ` Taylor Blau
  2022-06-07 22:02       ` Junio C Hamano
@ 2022-06-08 15:40       ` Abhradeep Chakraborty
  1 sibling, 0 replies; 37+ messages in thread
From: Abhradeep Chakraborty @ 2022-06-08 15:40 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Abhradeep Chakraborty, Git, Junio C Hamano, Kaartic Sivaraam,
	Derrick Stolee

Taylor Blau <me@ttaylorr.com> wrote:

> Hmm. When I render the HTML for this page and view it in my browser, the
> removed blank lines make the contents of the section "2-byte flags
> (network byte order)" run together, which I think hurts readability.

Honestly, I agree with you. I felt the same, but I thought it was
still better than the currently broken page.

> Is there a way to keep those line breaks without significantly
> reformatting the source of this file?

I have limited knowledge of AsciiDoc. I removed those blank lines
only because they generate weird HTML output. I didn't find any other
way to fix that (with minimal changes to the source).

> This isn't new from your patch, but I wonder if now is a good
> opportunity to make some light use of the formatting options that
> AsciiDoc gives us to make the page read a little bit more easily when
> rendered as HTML.

Yeah, quite sensible. I will look for a better way.

> I don't want to compromise too much on the readability of the .txt file,
> though, so if there isn't a good way to strike this balance, then I
> trust you and think we should leave it as you have modified things here.

This is one of the main reasons why I removed those blank lines and other
stuff: it is the minimum change that fixes the HTML doc. But I will look
into it more.

Thanks :)

* Re: [PATCH v2 2/3] bitmap-format.txt: fix some formatting issues
  2022-06-07 22:02       ` Junio C Hamano
@ 2022-06-08 16:06         ` Abhradeep Chakraborty
  0 siblings, 0 replies; 37+ messages in thread
From: Abhradeep Chakraborty @ 2022-06-08 16:06 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Abhradeep Chakraborty, Git, Taylor Blau, Kaartic Sivaraam,
	Derrick Stolee

Junio C Hamano <gitster@pobox.com> wrote:

> True.  Unless we are going to revamp the text in some major way so
> that we produce "true" HTML, not just the text source enclosed in a
> <pre></pre> pair, I would think we are better off keeping it not
> passed to AsciiDoc and leaving it in text format.  After all, modern
> browsers, which I presume those who want HTML output files would
> read them with, can display plain text files just fine, can't they?

I am not sure whether that's a good idea. Coming from a web dev
background, I know that people get bored when they have to read
a long plain-text article. SEO also calls for well-designed
articles, so that people can spend more time reading with visual
ease.

Of course, Git doesn't need any SEO, as it is already very
popular. But readers want some visual satisfaction while reading
docs. That's why some people complain about GNU sites (Git's site is
beautiful, by the way).

Obviously, by "people" I mean non-Git developers who are
curious about Git internals.

Thanks :)

* Re: [PATCH v2 3/3] bitmap-format.txt: add information for trailing checksum
  2022-06-07 20:56     ` Taylor Blau
@ 2022-06-08 16:15       ` Abhradeep Chakraborty
  0 siblings, 0 replies; 37+ messages in thread
From: Abhradeep Chakraborty @ 2022-06-08 16:15 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Abhradeep Chakraborty, Git, Junio C Hamano, Kaartic Sivaraam,
	Derrick Stolee

Taylor Blau <me@ttaylorr.com> wrote:

> I assume by "Index checksum" you are referring to a checksum of the
> bitmap _index_'s contents. 

Yeah, I meant a checksum of the bitmap file's content.

> That term is used a little throughout
> pack-format.txt, but it's foreign to me. Assuming that's how you meant
> it, a more conventional term (I think) would be just "trailing
> checksum".

Actually, I copy-pasted it from the pack-format.txt file ;). I will
follow your suggestion.

> It is also not guaranteed to be a SHA-1 checksum, if the repository
> which wrote the bitmap is in SHA-256 mode. So I would suggest that this
> addition just read:
>
>     * TRAILER:
>
>       Trailing checksum of the preceding contents.

Got it. Thanks !


* Re: [PATCH v2 0/3] bitmap-format.txt: fix some formatting issues and include checksum info
  2022-06-07 21:00     ` Junio C Hamano
@ 2022-06-08 17:12       ` Abhradeep Chakraborty
  0 siblings, 0 replies; 37+ messages in thread
From: Abhradeep Chakraborty @ 2022-06-08 17:12 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Abhradeep Chakraborty, Git, Taylor Blau, Kaartic Sivaraam,
	Derrick Stolee

Junio C Hamano <gitster@pobox.com> wrote:

> No, this is not quite ready for production.
>
> Almost all the "indented" material are shown in fixed-width
> typewriter format in the resulting HTML output.
>
> Look how ugly the output from it is.  Not your fault; it is mostly
> because when the original text was written, it was not even meant to
> be given to AsciiDoc.

Actually, I am wondering how git-scm.com is able to produce an HTML page
for bitmap-format.txt (if it is not passed to AsciiDoc). The design of
the AsciiDoc-generated HTML pages from `make docs` is not the same as the
design of the production HTML pages. Probably, production uses some extra
CSS to beautify the AsciiDoc-generated HTML files.

So, the generated HTML file (production version) is not as bad as the
locally built one. I need to understand how git-scm.com works, though,
to verify this.

If you look at other locally built HTML pages, they look similar to
the bitmap-format HTML page. But in production, they are beautiful enough.

By the way, I forgot to mention that https://git-scm.com/docs/pack-format#_original_version_1_pack_idx_files_have_the_following_format also has
some weird formatting issues. See the <pre> block after the pack-idx structure
drawing. There are other issues as well, such as unnecessary
indentation (e.g. here[1], the second block under "The header
is followed by number of object entries....").

Thanks :)

[1] https://git-scm.com/docs/pack-format#_pack_pack_files_have_the_following_format

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v3 0/3] bitmap-format.txt: fix some formatting issues and include checksum info
  2022-06-07 17:43 ` [PATCH v2 0/3] bitmap-format.txt: fix some formatting issues and include checksum info Abhradeep Chakraborty via GitGitGadget
                     ` (3 preceding siblings ...)
  2022-06-07 18:28   ` [PATCH v2 0/3] bitmap-format.txt: fix some formatting issues and include checksum info Junio C Hamano
@ 2022-06-10 10:54   ` Abhradeep Chakraborty via GitGitGadget
  2022-06-10 10:54     ` [PATCH v3 1/3] bitmap-format.txt: feed the file to asciidoc to generate html Abhradeep Chakraborty via GitGitGadget
                       ` (4 more replies)
  4 siblings, 5 replies; 37+ messages in thread
From: Abhradeep Chakraborty via GitGitGadget @ 2022-06-10 10:54 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Kaartic Sivaraam, Derrick Stolee, Junio C Hamano,
	Abhradeep Chakraborty

There are some issues in the bitmap-format html page. For example, some
nested lists are shown as top-level lists (e.g. in [1],
BITMAP_OPT_FULL_DAG (0x1) and BITMAP_OPT_HASH_CACHE (0x4) are shown as
top-level list items). The docs also need information about the trailing
checksum.

Changes since v2: the last two commits are updated to address the
suggestions. The changes are:

 * Previously omitted blank lines are re-added. In the updated commit,
   fewer <pre> blocks are used. Description lists and + are used instead
   to put more than one paragraph under a list item. Readability of the
   source text might decrease due to the use of +, but other documentation
   files (e.g. git-add.txt) also use it to connect two paragraphs, so I
   hope this is acceptable.

 * Information about the trailing checksum is updated (as suggested by Taylor)

Changes since v1:

 * a new commit addressing bitmap-format.txt html page generation is added
 * Remove extra indentation from the previous change
 * elaborate more about the trailing checksum (as suggested by Kaartic)

initial version:

 * first commit fixes some formatting issues
 * information about trailing checksum in the bitmap file is added in the
   bitmap-format doc.

[1] https://git-scm.com/docs/bitmap-format#_on_disk_format

Abhradeep Chakraborty (3):
  bitmap-format.txt: feed the file to asciidoc to generate html
  bitmap-format.txt: fix some formatting issues
  bitmap-format.txt: add information for trailing checksum

 Documentation/Makefile                    |   1 +
 Documentation/technical/bitmap-format.txt | 113 ++++++++++++----------
 2 files changed, 63 insertions(+), 51 deletions(-)


base-commit: 2668e3608e47494f2f10ef2b6e69f08a84816bcb
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1246%2FAbhra303%2Ffix-doc-formatting-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1246/Abhra303/fix-doc-formatting-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1246

Range-diff vs v2:

 1:  a1b9bd9af90 = 1:  a1b9bd9af90 bitmap-format.txt: feed the file to asciidoc to generate html
 2:  cb919513c14 ! 2:  c74b9a52c2a bitmap-format.txt: fix some formatting issues
     @@ Commit message
          format.txt` is broken. This is mainly because `-` is used for nested
          lists (which is not allowed in asciidoc) instead of `*`.
      
     -    Fix these and also reformat it (e.g. removing some blank lines) for
     -    better readability of the html page.
     +    Fix these and also reformat it for better readability of the html page.
      
          Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
      
     @@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cac
      -	- A header appears at the beginning:
      +	* A header appears at the beginning:
       
     - 		4-byte signature: {'B', 'I', 'T', 'M'}
     +-		4-byte signature: {'B', 'I', 'T', 'M'}
     ++		4-byte signature: :: {'B', 'I', 'T', 'M'}
     ++
     ++		2-byte version number (network byte order): ::
       
     -@@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cache extensions are required.
     +-		2-byte version number (network byte order)
     + 			The current implementation only supports version 1
       			of the bitmap index (the same one as JGit).
       
     - 		2-byte flags (network byte order)
     --
     +-		2-byte flags (network byte order)
     ++		2-byte flags (network byte order): ::
     + 
       			The following flags are supported:
     --
     - 			- BITMAP_OPT_FULL_DAG (0x1) REQUIRED
     + 
     +-			- BITMAP_OPT_FULL_DAG (0x1) REQUIRED
     ++			** {empty}
     ++			BITMAP_OPT_FULL_DAG (0x1) REQUIRED: :::
     ++
       			This flag must always be present. It implies that the
       			bitmap index has been generated for a packfile or
     + 			multi-pack index (MIDX) with full closure (i.e. where
      @@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cache extensions are required.
     - 			requirement for the bitmap index format, also present in
       			JGit, that greatly reduces the complexity of the
       			implementation.
     --
     - 			- BITMAP_OPT_HASH_CACHE (0x4)
     + 
     +-			- BITMAP_OPT_HASH_CACHE (0x4)
     ++			** {empty}
     ++			BITMAP_OPT_HASH_CACHE (0x4): :::
     ++
       			If present, the end of the bitmap file contains
       			`N` 32-bit name-hash values, one per object in the
     -@@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cache extensions are required.
     + 			pack/MIDX. The format and meaning of the name-hash is
       			described below.
       
     - 		4-byte entry count (network byte order)
     +-		4-byte entry count (network byte order)
      -
     ++		4-byte entry count (network byte order): ::
       			The total count of entries (bitmapped commits) in this bitmap index.
       
     - 		20-byte checksum
     +-		20-byte checksum
      -
     ++		20-byte checksum: ::
       			The SHA1 checksum of the pack/MIDX this bitmap index
       			belongs to.
       
      -	- 4 EWAH bitmaps that act as type indexes
     -+	* 4 EWAH bitmaps that act as type indexes
     - 
     - 		Type indexes are serialized after the hash cache in the shape
     - 		of four EWAH bitmaps stored consecutively (see Appendix A for
     -@@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cache extensions are required.
     - 
     - 		There is a bitmap for each Git object type, stored in the following
     - 		order:
      -
     - 			- Commits
     - 			- Trees
     - 			- Blobs
     -@@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cache extensions are required.
     - 		in a full set (all bits set), and the AND of all 4 bitmaps will
     - 		result in an empty bitmap (no bits set).
     - 
     +-		Type indexes are serialized after the hash cache in the shape
     +-		of four EWAH bitmaps stored consecutively (see Appendix A for
     +-		the serialization format of an EWAH bitmap).
     +-
     +-		There is a bitmap for each Git object type, stored in the following
     +-		order:
     +-
     +-			- Commits
     +-			- Trees
     +-			- Blobs
     +-			- Tags
     +-
     +-		In each bitmap, the `n`th bit is set to true if the `n`th object
     +-		in the packfile or multi-pack index is of that type.
     +-
     +-		The obvious consequence is that the OR of all 4 bitmaps will result
     +-		in a full set (all bits set), and the AND of all 4 bitmaps will
     +-		result in an empty bitmap (no bits set).
     +-
      -	- N entries with compressed bitmaps, one for each indexed commit
     -+	* N entries with compressed bitmaps, one for each indexed commit
     - 
     - 		Where `N` is the total amount of entries in this bitmap index.
     - 		Each entry contains the following:
     - 
     +-
     +-		Where `N` is the total amount of entries in this bitmap index.
     +-		Each entry contains the following:
     +-
      -		- 4-byte object position (network byte order)
     -+		** 4-byte object position (network byte order)
     ++	* 4 EWAH bitmaps that act as type indexes
     +++
     ++Type indexes are serialized after the hash cache in the shape
     ++of four EWAH bitmaps stored consecutively (see Appendix A for
     ++the serialization format of an EWAH bitmap).
     +++
     ++There is a bitmap for each Git object type, stored in the following
     ++order:
     +++
     ++	- Commits
     ++	- Trees
     ++	- Blobs
     ++	- Tags
     ++
     +++
     ++In each bitmap, the `n`th bit is set to true if the `n`th object
     ++in the packfile or multi-pack index is of that type.
     ++
     ++    The obvious consequence is that the OR of all 4 bitmaps will result
     ++    in a full set (all bits set), and the AND of all 4 bitmaps will
     ++    result in an empty bitmap (no bits set).
     ++
     ++	* N entries with compressed bitmaps, one for each indexed commit
     +++
     ++Where `N` is the total amount of entries in this bitmap index.
     ++Each entry contains the following:
     ++
     ++		** {empty}
     ++		4-byte object position (network byte order): ::
       			The position **in the index for the packfile or
       			multi-pack index** where the bitmap for this commit is
       			found.
       
      -		- 1-byte XOR-offset
     -+		** 1-byte XOR-offset
     ++		** {empty}
     ++		1-byte XOR-offset: ::
       			The xor offset used to compress this bitmap. For an entry
       			in position `x`, a XOR offset of `y` means that the actual
       			bitmap representing this commit is composed by XORing the
     -@@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cache extensions are required.
     - 			with **previous** bitmaps, not bitmaps that will come afterwards
     - 			in the index.
     - 
     + 			bitmap for this entry with the bitmap in entry `x-y` (i.e.
     + 			the bitmap `y` entries before this one).
     +-
     +-			Note that this compression can be recursive. In order to
     +-			XOR this entry with a previous one, the previous entry needs
     +-			to be decompressed first, and so on.
     +-
     +-			The hard-limit for this offset is 160 (an entry can only be
     +-			xor'ed against one of the 160 entries preceding it). This
     +-			number is always positive, and hence entries are always xor'ed
     +-			with **previous** bitmaps, not bitmaps that will come afterwards
     +-			in the index.
     +-
      -		- 1-byte flags for this bitmap
     -+		** 1-byte flags for this bitmap
     +++
     ++NOTE: This compression can be recursive. In order to
     ++XOR this entry with a previous one, the previous entry needs
     ++to be decompressed first, and so on.
     +++
     ++The hard-limit for this offset is 160 (an entry can only be
     ++xor'ed against one of the 160 entries preceding it). This
     ++number is always positive, and hence entries are always xor'ed
     ++with **previous** bitmaps, not bitmaps that will come afterwards
     ++in the index.
     ++
     ++		** {empty}
     ++		1-byte flags for this bitmap: ::
       			At the moment the only available flag is `0x1`, which hints
       			that this bitmap can be re-used when rebuilding bitmap indexes
       			for the repository.
 3:  2171d31fb2b ! 3:  b971558e1cb bitmap-format.txt: add information for trailing checksum
     @@ Commit message
          Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
      
       ## Documentation/technical/bitmap-format.txt ##
     -@@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cache extensions are required.
     +@@ Documentation/technical/bitmap-format.txt: in the index.
       
       		** The compressed bitmap itself, see Appendix A.
       
     -+	* TRAILER:
     -+
     -+		Index checksum of the above contents. It is a 20-byte SHA1 checksum.
     ++	* {empty}
     ++	TRAILER: ::
     ++		Trailing checksum of the preceding contents.
      +
       == Appendix A: Serialization format for an EWAH bitmap
       

-- 
gitgitgadget
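
The XOR-offset scheme described in the range-diff above can be sketched in
a few lines. This is only an illustration, not code from Git: bitmaps are
modelled as plain Python integers rather than EWAH-compressed data,
`resolve_bitmaps` is a hypothetical helper name, and treating an offset of
0 as "stored as-is" is an assumption.

```python
MAX_XOR_OFFSET = 160  # hard limit quoted in the documentation text


def resolve_bitmaps(entries):
    """Turn stored (possibly XOR-compressed) bitmaps into actual bitmaps.

    `entries` is a list of (xor_offset, stored_bits) pairs in file order,
    with stored_bits modelled as a Python int (an assumption made for
    brevity; real entries hold EWAH bitmaps).
    """
    resolved = []
    for x, (y, stored) in enumerate(entries):
        if not 0 <= y <= MAX_XOR_OFFSET:
            raise ValueError(f"entry {x}: XOR offset {y} out of range")
        if y > x:
            raise ValueError(f"entry {x}: offset {y} points before entry 0")
        if y == 0:
            # Assumption: offset 0 means the bitmap is stored uncompressed.
            resolved.append(stored)
        else:
            # Entry x-y is already resolved because entries are processed
            # in order, which takes care of the recursive case.
            resolved.append(stored ^ resolved[x - y])
    return resolved
```

Walking the entries front to back means an entry that was itself XORed
against an earlier one is already decompressed by the time a later entry
refers to it.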

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v3 1/3] bitmap-format.txt: feed the file to asciidoc to generate html
  2022-06-10 10:54   ` [PATCH v3 " Abhradeep Chakraborty via GitGitGadget
@ 2022-06-10 10:54     ` Abhradeep Chakraborty via GitGitGadget
  2022-06-10 10:54     ` [PATCH v3 2/3] bitmap-format.txt: fix some formatting issues Abhradeep Chakraborty via GitGitGadget
                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 37+ messages in thread
From: Abhradeep Chakraborty via GitGitGadget @ 2022-06-10 10:54 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Kaartic Sivaraam, Derrick Stolee, Junio C Hamano,
	Abhradeep Chakraborty, Abhradeep Chakraborty

From: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>

Documentation/Makefile does not include bitmap-format.txt to generate
a html page using asciidoc.

Teach Documentation/Makefile to also generate a html page for
Documentation/technical/bitmap-format.txt file.

Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
---
 Documentation/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/Makefile b/Documentation/Makefile
index d3f043f50d2..8d405a14330 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -94,6 +94,7 @@ TECH_DOCS += MyFirstContribution
 TECH_DOCS += MyFirstObjectWalk
 TECH_DOCS += SubmittingPatches
 TECH_DOCS += ToolsForGit
+TECH_DOCS += technical/bitmap-format
 TECH_DOCS += technical/bundle-format
 TECH_DOCS += technical/hash-function-transition
 TECH_DOCS += technical/http-protocol
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v3 2/3] bitmap-format.txt: fix some formatting issues
  2022-06-10 10:54   ` [PATCH v3 " Abhradeep Chakraborty via GitGitGadget
  2022-06-10 10:54     ` [PATCH v3 1/3] bitmap-format.txt: feed the file to asciidoc to generate html Abhradeep Chakraborty via GitGitGadget
@ 2022-06-10 10:54     ` Abhradeep Chakraborty via GitGitGadget
  2022-06-15  2:27       ` Taylor Blau
  2022-06-10 10:54     ` [PATCH v3 3/3] bitmap-format.txt: add information for trailing checksum Abhradeep Chakraborty via GitGitGadget
                       ` (2 subsequent siblings)
  4 siblings, 1 reply; 37+ messages in thread
From: Abhradeep Chakraborty via GitGitGadget @ 2022-06-10 10:54 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Kaartic Sivaraam, Derrick Stolee, Junio C Hamano,
	Abhradeep Chakraborty, Abhradeep Chakraborty

From: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>

The asciidoc generated html for `Documentation/technical/bitmap-
format.txt` is broken. This is mainly because `-` is used for nested
lists (which is not allowed in asciidoc) instead of `*`.

Fix these and also reformat it for better readability of the html page.

Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
---
 Documentation/technical/bitmap-format.txt | 109 ++++++++++++----------
 1 file changed, 58 insertions(+), 51 deletions(-)

diff --git a/Documentation/technical/bitmap-format.txt b/Documentation/technical/bitmap-format.txt
index 04b3ec21785..cd621379f42 100644
--- a/Documentation/technical/bitmap-format.txt
+++ b/Documentation/technical/bitmap-format.txt
@@ -39,19 +39,22 @@ MIDXs, both the bit-cache and rev-cache extensions are required.
 
 == On-disk format
 
-	- A header appears at the beginning:
+	* A header appears at the beginning:
 
-		4-byte signature: {'B', 'I', 'T', 'M'}
+		4-byte signature: :: {'B', 'I', 'T', 'M'}
+
+		2-byte version number (network byte order): ::
 
-		2-byte version number (network byte order)
 			The current implementation only supports version 1
 			of the bitmap index (the same one as JGit).
 
-		2-byte flags (network byte order)
+		2-byte flags (network byte order): ::
 
 			The following flags are supported:
 
-			- BITMAP_OPT_FULL_DAG (0x1) REQUIRED
+			** {empty}
+			BITMAP_OPT_FULL_DAG (0x1) REQUIRED: :::
+
 			This flag must always be present. It implies that the
 			bitmap index has been generated for a packfile or
 			multi-pack index (MIDX) with full closure (i.e. where
@@ -61,75 +64,79 @@ MIDXs, both the bit-cache and rev-cache extensions are required.
 			JGit, that greatly reduces the complexity of the
 			implementation.
 
-			- BITMAP_OPT_HASH_CACHE (0x4)
+			** {empty}
+			BITMAP_OPT_HASH_CACHE (0x4): :::
+
 			If present, the end of the bitmap file contains
 			`N` 32-bit name-hash values, one per object in the
 			pack/MIDX. The format and meaning of the name-hash is
 			described below.
 
-		4-byte entry count (network byte order)
-
+		4-byte entry count (network byte order): ::
 			The total count of entries (bitmapped commits) in this bitmap index.
 
-		20-byte checksum
-
+		20-byte checksum: ::
 			The SHA1 checksum of the pack/MIDX this bitmap index
 			belongs to.
 
-	- 4 EWAH bitmaps that act as type indexes
-
-		Type indexes are serialized after the hash cache in the shape
-		of four EWAH bitmaps stored consecutively (see Appendix A for
-		the serialization format of an EWAH bitmap).
-
-		There is a bitmap for each Git object type, stored in the following
-		order:
-
-			- Commits
-			- Trees
-			- Blobs
-			- Tags
-
-		In each bitmap, the `n`th bit is set to true if the `n`th object
-		in the packfile or multi-pack index is of that type.
-
-		The obvious consequence is that the OR of all 4 bitmaps will result
-		in a full set (all bits set), and the AND of all 4 bitmaps will
-		result in an empty bitmap (no bits set).
-
-	- N entries with compressed bitmaps, one for each indexed commit
-
-		Where `N` is the total amount of entries in this bitmap index.
-		Each entry contains the following:
-
-		- 4-byte object position (network byte order)
+	* 4 EWAH bitmaps that act as type indexes
++
+Type indexes are serialized after the hash cache in the shape
+of four EWAH bitmaps stored consecutively (see Appendix A for
+the serialization format of an EWAH bitmap).
++
+There is a bitmap for each Git object type, stored in the following
+order:
++
+	- Commits
+	- Trees
+	- Blobs
+	- Tags
+
++
+In each bitmap, the `n`th bit is set to true if the `n`th object
+in the packfile or multi-pack index is of that type.
+
+    The obvious consequence is that the OR of all 4 bitmaps will result
+    in a full set (all bits set), and the AND of all 4 bitmaps will
+    result in an empty bitmap (no bits set).
+
+	* N entries with compressed bitmaps, one for each indexed commit
++
+Where `N` is the total amount of entries in this bitmap index.
+Each entry contains the following:
+
+		** {empty}
+		4-byte object position (network byte order): ::
 			The position **in the index for the packfile or
 			multi-pack index** where the bitmap for this commit is
 			found.
 
-		- 1-byte XOR-offset
+		** {empty}
+		1-byte XOR-offset: ::
 			The xor offset used to compress this bitmap. For an entry
 			in position `x`, a XOR offset of `y` means that the actual
 			bitmap representing this commit is composed by XORing the
 			bitmap for this entry with the bitmap in entry `x-y` (i.e.
 			the bitmap `y` entries before this one).
-
-			Note that this compression can be recursive. In order to
-			XOR this entry with a previous one, the previous entry needs
-			to be decompressed first, and so on.
-
-			The hard-limit for this offset is 160 (an entry can only be
-			xor'ed against one of the 160 entries preceding it). This
-			number is always positive, and hence entries are always xor'ed
-			with **previous** bitmaps, not bitmaps that will come afterwards
-			in the index.
-
-		- 1-byte flags for this bitmap
++
+NOTE: This compression can be recursive. In order to
+XOR this entry with a previous one, the previous entry needs
+to be decompressed first, and so on.
++
+The hard-limit for this offset is 160 (an entry can only be
+xor'ed against one of the 160 entries preceding it). This
+number is always positive, and hence entries are always xor'ed
+with **previous** bitmaps, not bitmaps that will come afterwards
+in the index.
+
+		** {empty}
+		1-byte flags for this bitmap: ::
 			At the moment the only available flag is `0x1`, which hints
 			that this bitmap can be re-used when rebuilding bitmap indexes
 			for the repository.
 
-		- The compressed bitmap itself, see Appendix A.
+		** The compressed bitmap itself, see Appendix A.
 
 == Appendix A: Serialization format for an EWAH bitmap
 
-- 
gitgitgadget
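
For readers following the header layout in the patch above, here is a
minimal sketch of how the fixed-size header could be decoded. It is not
Git's implementation: `parse_bitmap_header` and the returned dictionary
keys are hypothetical names, and the `hash_len` parameter is an assumption
(the text describes a 20-byte checksum).

```python
import struct

# Flag values as listed in the documentation text.
BITMAP_OPT_FULL_DAG = 0x1
BITMAP_OPT_HASH_CACHE = 0x4

# 4-byte signature, 2-byte version, 2-byte flags, 4-byte entry count,
# all in network (big-endian) byte order.
_FIXED_HEADER = struct.Struct(">4sHHI")


def parse_bitmap_header(data, hash_len=20):
    """Decode the header at the start of `data` (hypothetical helper)."""
    if len(data) < _FIXED_HEADER.size + hash_len:
        raise ValueError("truncated bitmap header")
    signature, version, flags, entry_count = _FIXED_HEADER.unpack_from(data, 0)
    if signature != b"BITM":
        raise ValueError("bad signature")
    if version != 1:
        raise ValueError(f"unsupported version {version}")
    if not flags & BITMAP_OPT_FULL_DAG:
        raise ValueError("BITMAP_OPT_FULL_DAG is required")
    start = _FIXED_HEADER.size
    return {
        "version": version,
        "flags": flags,
        "has_hash_cache": bool(flags & BITMAP_OPT_HASH_CACHE),
        "entry_count": entry_count,
        # Checksum of the pack/MIDX this bitmap index belongs to.
        "pack_checksum": data[start:start + hash_len],
    }
```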


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v3 3/3] bitmap-format.txt: add information for trailing checksum
  2022-06-10 10:54   ` [PATCH v3 " Abhradeep Chakraborty via GitGitGadget
  2022-06-10 10:54     ` [PATCH v3 1/3] bitmap-format.txt: feed the file to asciidoc to generate html Abhradeep Chakraborty via GitGitGadget
  2022-06-10 10:54     ` [PATCH v3 2/3] bitmap-format.txt: fix some formatting issues Abhradeep Chakraborty via GitGitGadget
@ 2022-06-10 10:54     ` Abhradeep Chakraborty via GitGitGadget
  2022-06-10 17:01     ` [PATCH v3 0/3] bitmap-format.txt: fix some formatting issues and include checksum info Junio C Hamano
  2022-06-16  5:03     ` [PATCH v4 " Abhradeep Chakraborty via GitGitGadget
  4 siblings, 0 replies; 37+ messages in thread
From: Abhradeep Chakraborty via GitGitGadget @ 2022-06-10 10:54 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Kaartic Sivaraam, Derrick Stolee, Junio C Hamano,
	Abhradeep Chakraborty, Abhradeep Chakraborty

From: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>

Bitmap file has a trailing checksum at the end of the file. However
there is no information in the bitmap-format documentation about it.

Add a trailer section to include the trailing checksum info in the
`Documentation/technical/bitmap-format.txt` file.

Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
---
 Documentation/technical/bitmap-format.txt | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/Documentation/technical/bitmap-format.txt b/Documentation/technical/bitmap-format.txt
index cd621379f42..3f8cdd0ed91 100644
--- a/Documentation/technical/bitmap-format.txt
+++ b/Documentation/technical/bitmap-format.txt
@@ -138,6 +138,10 @@ in the index.
 
 		** The compressed bitmap itself, see Appendix A.
 
+	* {empty}
+	TRAILER: ::
+		Trailing checksum of the preceding contents.
+
 == Appendix A: Serialization format for an EWAH bitmap
 
 Ewah bitmaps are serialized in the same protocol as the JAVAEWAH
-- 
gitgitgadget
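
As a rough illustration of what "trailing checksum of the preceding
contents" means, the check below hashes everything except the trailer and
compares the result with the trailer bytes. This is a sketch, not Git's
code: `verify_trailer` is a hypothetical helper, and the caller is assumed
to know which hash the repository uses (SHA-1 or SHA-256, as raised in the
review).

```python
import hashlib


def verify_trailer(data, hash_name="sha1"):
    """Return True if the last digest-sized bytes of `data` are the hash
    of everything that precedes them (hypothetical helper)."""
    digest_len = hashlib.new(hash_name).digest_size
    if len(data) < digest_len:
        raise ValueError("file is shorter than its trailing checksum")
    body, trailer = data[:-digest_len], data[-digest_len:]
    return hashlib.new(hash_name, body).digest() == trailer
```

Passing the hash name explicitly keeps the sketch independent of how the
repository's object format is discovered.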

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH v3 0/3] bitmap-format.txt: fix some formatting issues and include checksum info
  2022-06-10 10:54   ` [PATCH v3 " Abhradeep Chakraborty via GitGitGadget
                       ` (2 preceding siblings ...)
  2022-06-10 10:54     ` [PATCH v3 3/3] bitmap-format.txt: add information for trailing checksum Abhradeep Chakraborty via GitGitGadget
@ 2022-06-10 17:01     ` Junio C Hamano
  2022-06-15  2:28       ` Taylor Blau
  2022-06-16  5:03     ` [PATCH v4 " Abhradeep Chakraborty via GitGitGadget
  4 siblings, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2022-06-10 17:01 UTC (permalink / raw)
  To: Abhradeep Chakraborty via GitGitGadget
  Cc: git, Taylor Blau, Kaartic Sivaraam, Derrick Stolee,
	Abhradeep Chakraborty

"Abhradeep Chakraborty via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> There are some issues in the bitmap-format html page. For example, some
> nested lists are shown as top-level lists (e.g. in [1],
> BITMAP_OPT_FULL_DAG (0x1) and BITMAP_OPT_HASH_CACHE (0x4) are shown as
> top-level list items). The docs also need information about the trailing
> checksum.

Quite honestly, I am not sure if a piecemeal "let's make
<pre>...</pre> a bit prettier" is worth our time.  Especially
relative to the importance of adding missing information to the
documentation.

So, if this round (I haven't looked at the formatting changes at all
yet) turns out to be still not doing the HTML properly, I'd suggest
shuffling the patches around, add missing information so that readers
can get the corrections in text regardless of the rest of HTMLify
effort.  We'll see.

Thanks.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v3 2/3] bitmap-format.txt: fix some formatting issues
  2022-06-10 10:54     ` [PATCH v3 2/3] bitmap-format.txt: fix some formatting issues Abhradeep Chakraborty via GitGitGadget
@ 2022-06-15  2:27       ` Taylor Blau
  2022-06-15 14:28         ` Abhradeep Chakraborty
  0 siblings, 1 reply; 37+ messages in thread
From: Taylor Blau @ 2022-06-15  2:27 UTC (permalink / raw)
  To: Abhradeep Chakraborty via GitGitGadget
  Cc: git, Kaartic Sivaraam, Derrick Stolee, Junio C Hamano,
	Abhradeep Chakraborty

Hi Abhradeep,

On Fri, Jun 10, 2022 at 10:54:40AM +0000, Abhradeep Chakraborty via GitGitGadget wrote:
> ++
> +In each bitmap, the `n`th bit is set to true if the `n`th object
> +in the packfile or multi-pack index is of that type.
> +
> +    The obvious consequence is that the OR of all 4 bitmaps will result
> +    in a full set (all bits set), and the AND of all 4 bitmaps will
> +    result in an empty bitmap (no bits set).
> +
> +	* N entries with compressed bitmaps, one for each indexed commit
> ++
> +Where `N` is the total amount of entries in this bitmap index.
> +Each entry contains the following:

The new formatting looks terrific; it's much easier to read this in my
browser after generating the HTML version of these docs. Two questions:

- Are the hard-tabs added in this file required for ASCIIDoc to treat it
  correctly? They are a slight impediment to reading the source in my
  editor, but it's not a huge deal. It would just be nice if we could
  replace "\t" characters with two or four spaces or something.

- The above hunk is the only one which rendered slightly oddly to me; it
  looks like the paragraph beginning with "The obvious consequence ..."
  is surrounded by a <pre> element, when it should be a continuation of
  the above paragraph ("In each bitmap ...").

Otherwise, this series is looking great. Let me know what you think!

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v3 0/3] bitmap-format.txt: fix some formatting issues and include checksum info
  2022-06-10 17:01     ` [PATCH v3 0/3] bitmap-format.txt: fix some formatting issues and include checksum info Junio C Hamano
@ 2022-06-15  2:28       ` Taylor Blau
  2022-06-15 22:41         ` Junio C Hamano
  0 siblings, 1 reply; 37+ messages in thread
From: Taylor Blau @ 2022-06-15  2:28 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Abhradeep Chakraborty via GitGitGadget, git, Kaartic Sivaraam,
	Derrick Stolee, Abhradeep Chakraborty

On Fri, Jun 10, 2022 at 10:01:02AM -0700, Junio C Hamano wrote:
> "Abhradeep Chakraborty via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > There are some issues in the bitmap-format html page. For example, some
> > nested lists are shown as top-level lists (e.g. in [1],
> > BITMAP_OPT_FULL_DAG (0x1) and BITMAP_OPT_HASH_CACHE (0x4) are shown as
> > top-level list items). The docs also need information about the trailing
> > checksum.
>
> Quite honestly, I am not sure if a piecemeal "let's make
> <pre>...</pre> a bit prettier" is worth our time.  Especially
> relative to the importance of adding missing information to the
> documentation.
>
> So, if this round (I haven't looked at the formatting changes at all
> yet) turns out to be still not doing the HTML properly, I'd suggest
> shuffling the patches around, add missing information so that readers
> can get the corrections in text regardless of the rest of HTMLify
> effort.  We'll see.

This version of the series significantly improves the readability of the
generated HTML, and I only had a minor comment or two.

So I think that the improvement is worthwhile, though if others disagree
strongly, the third patch should get picked up regardless, since it
addresses a legitimate gap in our documentation.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v3 2/3] bitmap-format.txt: fix some formatting issues
  2022-06-15  2:27       ` Taylor Blau
@ 2022-06-15 14:28         ` Abhradeep Chakraborty
  0 siblings, 0 replies; 37+ messages in thread
From: Abhradeep Chakraborty @ 2022-06-15 14:28 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Abhradeep Chakraborty, Git, Junio C Hamano, Kaartic Sivaraam,
	Derrick Stolee

Taylor Blau <me@ttaylorr.com> wrote:

> - Are the hard-tabs added in this file required for ASCIIDoc to treat it
>   correctly? They are a slight impediment to reading the source in my
>   editor, but it's not a huge deal. It would just be nice if we could
>   replace "\t" characters with two or four spaces or something.


No, they are not required for AsciiDoc. But `git diff --check` was
complaining when I used spaces. I don't know whether that is related to my
Git configuration. Moreover, other parts of the file didn't seem to use
spaces. For these reasons, I used tabs, but I can replace them if you prefer.

> - The above hunk is the only one which rendered slightly oddly to me; it
>   looks like the paragraph beginning with "The obvious consequence ..."
>   is surrounded by a <pre> element, when it should be a continuation of
>   the above paragraph ("In each bitmap ...").

Thanks for pointing that out. I don't know how I missed it. I will correct it.

Thanks :)

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v3 0/3] bitmap-format.txt: fix some formatting issues and include checksum info
  2022-06-15  2:28       ` Taylor Blau
@ 2022-06-15 22:41         ` Junio C Hamano
  0 siblings, 0 replies; 37+ messages in thread
From: Junio C Hamano @ 2022-06-15 22:41 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Abhradeep Chakraborty via GitGitGadget, git, Kaartic Sivaraam,
	Derrick Stolee, Abhradeep Chakraborty

Taylor Blau <me@ttaylorr.com> writes:

> This version of the series significantly improves the readability of the
> generated HTML, and I only had a minor comment or two.

Yeah, I looked at the output and it is improved so much to the point
that the remaining paragraph or two that are still typeset in the fixed
font incorrectly start to look even irritating ;-)

I've tentatively queued it in my tree.  I doubt that the topic is
ultra-urgent so if the remaining mark-up issues can be fixed before
the topic hits 'next', that would be great.

Thanks, both.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v4 0/3] bitmap-format.txt: fix some formatting issues and include checksum info
  2022-06-10 10:54   ` [PATCH v3 " Abhradeep Chakraborty via GitGitGadget
                       ` (3 preceding siblings ...)
  2022-06-10 17:01     ` [PATCH v3 0/3] bitmap-format.txt: fix some formatting issues and include checksum info Junio C Hamano
@ 2022-06-16  5:03     ` Abhradeep Chakraborty via GitGitGadget
  2022-06-16  5:03       ` [PATCH v4 1/3] bitmap-format.txt: feed the file to asciidoc to generate html Abhradeep Chakraborty via GitGitGadget
                         ` (3 more replies)
  4 siblings, 4 replies; 37+ messages in thread
From: Abhradeep Chakraborty via GitGitGadget @ 2022-06-16  5:03 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Kaartic Sivaraam, Derrick Stolee, Junio C Hamano,
	Abhradeep Chakraborty

There are some issues in the bitmap-format html page. For example, some
nested lists are shown as top-level lists (e.g. in [1],
BITMAP_OPT_FULL_DAG (0x1) and BITMAP_OPT_HASH_CACHE (0x4) are shown as
top-level list items). The docs also need information about the trailing
checksum.

Changes since v3:

 * spaces are used instead of tabs
 * fixed remaining <pre> blocks

Changes since v2: the last two commits are updated to address the
review suggestions:

 * Previously omitted blank lines are re-added. The updated commit uses
   fewer <pre> blocks; description lists and `+` continuation lines are
   used instead to attach more than one paragraph to a list item. The
   `+` may make the source text harder to read, but other documentation
   files (e.g. git-add.txt) also use it to connect two paragraphs, so I
   hope this is acceptable.

 * Information about the trailing checksum is updated (as suggested by
   Taylor).

Changes since v1:

 * a new commit that makes asciidoc generate an HTML page for
   bitmap-format.txt is added
 * extra indentation from the previous change is removed
 * the trailing checksum is described in more detail (as suggested by
   Kaartic)

initial version:

 * first commit fixes some formatting issues
 * information about trailing checksum in the bitmap file is added in the
   bitmap-format doc.

[1] https://git-scm.com/docs/bitmap-format#_on_disk_format

Abhradeep Chakraborty (3):
  bitmap-format.txt: feed the file to asciidoc to generate html
  bitmap-format.txt: fix some formatting issues
  bitmap-format.txt: add information for trailing checksum

 Documentation/Makefile                    |   1 +
 Documentation/technical/bitmap-format.txt | 203 ++++++++++++----------
 2 files changed, 108 insertions(+), 96 deletions(-)


base-commit: 5699ec1b0aec51b9e9ba5a2785f65970c5a95d84
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1246%2FAbhra303%2Ffix-doc-formatting-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1246/Abhra303/fix-doc-formatting-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1246

Range-diff vs v3:

 1:  a1b9bd9af90 ! 1:  494c1c1bd52 bitmap-format.txt: feed the file to asciidoc to generate html
     @@ Documentation/Makefile: TECH_DOCS += MyFirstContribution
       TECH_DOCS += ToolsForGit
      +TECH_DOCS += technical/bitmap-format
       TECH_DOCS += technical/bundle-format
     + TECH_DOCS += technical/cruft-packs
       TECH_DOCS += technical/hash-function-transition
     - TECH_DOCS += technical/http-protocol
 2:  c74b9a52c2a ! 2:  25512aa9c5b bitmap-format.txt: fix some formatting issues
     @@ Commit message
          Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
      
       ## Documentation/technical/bitmap-format.txt ##
     +@@ Documentation/technical/bitmap-format.txt: An object is uniquely described by its bit position within a bitmap:
     + 	is defined as follows:
     + 
     + 		o1 <= o2 <==> pack(o1) <= pack(o2) /\ offset(o1) <= offset(o2)
     +-
     +-	The ordering between packs is done according to the MIDX's .rev file.
     +-	Notably, the preferred pack sorts ahead of all other packs.
     +++
     ++The ordering between packs is done according to the MIDX's .rev file.
     ++Notably, the preferred pack sorts ahead of all other packs.
     + 
     + The on-disk representation (described below) of a bitmap is the same regardless
     + of whether or not that bitmap belongs to a packfile or a MIDX. The only
      @@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cache extensions are required.
       
       == On-disk format
       
      -	- A header appears at the beginning:
     -+	* A header appears at the beginning:
     - 
     +-
      -		4-byte signature: {'B', 'I', 'T', 'M'}
     -+		4-byte signature: :: {'B', 'I', 'T', 'M'}
     -+
     -+		2-byte version number (network byte order): ::
     - 
     +-
      -		2-byte version number (network byte order)
     - 			The current implementation only supports version 1
     - 			of the bitmap index (the same one as JGit).
     - 
     +-			The current implementation only supports version 1
     +-			of the bitmap index (the same one as JGit).
     +-
      -		2-byte flags (network byte order)
     -+		2-byte flags (network byte order): ::
     - 
     - 			The following flags are supported:
     - 
     +-
     +-			The following flags are supported:
     +-
      -			- BITMAP_OPT_FULL_DAG (0x1) REQUIRED
     -+			** {empty}
     -+			BITMAP_OPT_FULL_DAG (0x1) REQUIRED: :::
     -+
     - 			This flag must always be present. It implies that the
     - 			bitmap index has been generated for a packfile or
     - 			multi-pack index (MIDX) with full closure (i.e. where
     -@@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cache extensions are required.
     - 			JGit, that greatly reduces the complexity of the
     - 			implementation.
     - 
     +-			This flag must always be present. It implies that the
     +-			bitmap index has been generated for a packfile or
     +-			multi-pack index (MIDX) with full closure (i.e. where
     +-			every single object in the packfile/MIDX can find its
     +-			parent links inside the same packfile/MIDX). This is a
     +-			requirement for the bitmap index format, also present in
     +-			JGit, that greatly reduces the complexity of the
     +-			implementation.
     +-
      -			- BITMAP_OPT_HASH_CACHE (0x4)
     -+			** {empty}
     -+			BITMAP_OPT_HASH_CACHE (0x4): :::
     -+
     - 			If present, the end of the bitmap file contains
     - 			`N` 32-bit name-hash values, one per object in the
     - 			pack/MIDX. The format and meaning of the name-hash is
     - 			described below.
     - 
     +-			If present, the end of the bitmap file contains
     +-			`N` 32-bit name-hash values, one per object in the
     +-			pack/MIDX. The format and meaning of the name-hash is
     +-			described below.
     +-
      -		4-byte entry count (network byte order)
      -
     -+		4-byte entry count (network byte order): ::
     - 			The total count of entries (bitmapped commits) in this bitmap index.
     - 
     +-			The total count of entries (bitmapped commits) in this bitmap index.
     +-
      -		20-byte checksum
      -
     -+		20-byte checksum: ::
     - 			The SHA1 checksum of the pack/MIDX this bitmap index
     - 			belongs to.
     - 
     +-			The SHA1 checksum of the pack/MIDX this bitmap index
     +-			belongs to.
     +-
      -	- 4 EWAH bitmaps that act as type indexes
      -
      -		Type indexes are serialized after the hash cache in the shape
     @@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cac
      -		Each entry contains the following:
      -
      -		- 4-byte object position (network byte order)
     -+	* 4 EWAH bitmaps that act as type indexes
     +-			The position **in the index for the packfile or
     +-			multi-pack index** where the bitmap for this commit is
     +-			found.
     +-
     +-		- 1-byte XOR-offset
     +-			The xor offset used to compress this bitmap. For an entry
     +-			in position `x`, a XOR offset of `y` means that the actual
     +-			bitmap representing this commit is composed by XORing the
     +-			bitmap for this entry with the bitmap in entry `x-y` (i.e.
     +-			the bitmap `y` entries before this one).
     +-
     +-			Note that this compression can be recursive. In order to
     +-			XOR this entry with a previous one, the previous entry needs
     +-			to be decompressed first, and so on.
     +-
     +-			The hard-limit for this offset is 160 (an entry can only be
     +-			xor'ed against one of the 160 entries preceding it). This
     +-			number is always positive, and hence entries are always xor'ed
     +-			with **previous** bitmaps, not bitmaps that will come afterwards
     +-			in the index.
     +-
     +-		- 1-byte flags for this bitmap
     +-			At the moment the only available flag is `0x1`, which hints
     +-			that this bitmap can be re-used when rebuilding bitmap indexes
     +-			for the repository.
     +-
     +-		- The compressed bitmap itself, see Appendix A.
     ++    * A header appears at the beginning:
     ++
     ++        4-byte signature: :: {'B', 'I', 'T', 'M'}
     ++
     ++        2-byte version number (network byte order): ::
     ++
     ++            The current implementation only supports version 1
     ++            of the bitmap index (the same one as JGit).
     ++
     ++        2-byte flags (network byte order): ::
     ++
     ++            The following flags are supported:
     ++
     ++            ** {empty}
     ++            BITMAP_OPT_FULL_DAG (0x1) REQUIRED: :::
     ++
     ++            This flag must always be present. It implies that the
     ++            bitmap index has been generated for a packfile or
     ++            multi-pack index (MIDX) with full closure (i.e. where
     ++            every single object in the packfile/MIDX can find its
     ++            parent links inside the same packfile/MIDX). This is a
     ++            requirement for the bitmap index format, also present in
     ++            JGit, that greatly reduces the complexity of the
     ++            implementation.
     ++
     ++            ** {empty}
     ++            BITMAP_OPT_HASH_CACHE (0x4): :::
     ++
     ++            If present, the end of the bitmap file contains
     ++            `N` 32-bit name-hash values, one per object in the
     ++            pack/MIDX. The format and meaning of the name-hash is
     ++            described below.
     ++
     ++        4-byte entry count (network byte order): ::
     ++            The total count of entries (bitmapped commits) in this bitmap index.
     ++
     ++        20-byte checksum: ::
     ++            The SHA1 checksum of the pack/MIDX this bitmap index
     ++            belongs to.
     ++
     ++    * 4 EWAH bitmaps that act as type indexes
      ++
      +Type indexes are serialized after the hash cache in the shape
      +of four EWAH bitmaps stored consecutively (see Appendix A for
     @@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cac
      +There is a bitmap for each Git object type, stored in the following
      +order:
      ++
     -+	- Commits
     -+	- Trees
     -+	- Blobs
     -+	- Tags
     ++    - Commits
     ++    - Trees
     ++    - Blobs
     ++    - Tags
      +
      ++
      +In each bitmap, the `n`th bit is set to true if the `n`th object
      +in the packfile or multi-pack index is of that type.
     +++
     ++The obvious consequence is that the OR of all 4 bitmaps will result
     ++in a full set (all bits set), and the AND of all 4 bitmaps will
     ++result in an empty bitmap (no bits set).
      +
     -+    The obvious consequence is that the OR of all 4 bitmaps will result
     -+    in a full set (all bits set), and the AND of all 4 bitmaps will
     -+    result in an empty bitmap (no bits set).
     -+
     -+	* N entries with compressed bitmaps, one for each indexed commit
     ++    * N entries with compressed bitmaps, one for each indexed commit
      ++
      +Where `N` is the total amount of entries in this bitmap index.
      +Each entry contains the following:
      +
     -+		** {empty}
     -+		4-byte object position (network byte order): ::
     - 			The position **in the index for the packfile or
     - 			multi-pack index** where the bitmap for this commit is
     - 			found.
     - 
     --		- 1-byte XOR-offset
     -+		** {empty}
     -+		1-byte XOR-offset: ::
     - 			The xor offset used to compress this bitmap. For an entry
     - 			in position `x`, a XOR offset of `y` means that the actual
     - 			bitmap representing this commit is composed by XORing the
     - 			bitmap for this entry with the bitmap in entry `x-y` (i.e.
     - 			the bitmap `y` entries before this one).
     --
     --			Note that this compression can be recursive. In order to
     --			XOR this entry with a previous one, the previous entry needs
     --			to be decompressed first, and so on.
     --
     --			The hard-limit for this offset is 160 (an entry can only be
     --			xor'ed against one of the 160 entries preceding it). This
     --			number is always positive, and hence entries are always xor'ed
     --			with **previous** bitmaps, not bitmaps that will come afterwards
     --			in the index.
     --
     --		- 1-byte flags for this bitmap
     ++        ** {empty}
     ++        4-byte object position (network byte order): ::
     ++            The position **in the index for the packfile or
     ++            multi-pack index** where the bitmap for this commit is
     ++            found.
     ++
     ++        ** {empty}
     ++        1-byte XOR-offset: ::
     ++            The xor offset used to compress this bitmap. For an entry
     ++            in position `x`, a XOR offset of `y` means that the actual
     ++            bitmap representing this commit is composed by XORing the
     ++            bitmap for this entry with the bitmap in entry `x-y` (i.e.
     ++            the bitmap `y` entries before this one).
      ++
      +NOTE: This compression can be recursive. In order to
      +XOR this entry with a previous one, the previous entry needs
     @@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cac
      +with **previous** bitmaps, not bitmaps that will come afterwards
      +in the index.
      +
     -+		** {empty}
     -+		1-byte flags for this bitmap: ::
     - 			At the moment the only available flag is `0x1`, which hints
     - 			that this bitmap can be re-used when rebuilding bitmap indexes
     - 			for the repository.
     - 
     --		- The compressed bitmap itself, see Appendix A.
     -+		** The compressed bitmap itself, see Appendix A.
     ++        ** {empty}
     ++        1-byte flags for this bitmap: ::
     ++            At the moment the only available flag is `0x1`, which hints
     ++            that this bitmap can be re-used when rebuilding bitmap indexes
     ++            for the repository.
     ++
     ++        ** The compressed bitmap itself, see Appendix A.
       
       == Appendix A: Serialization format for an EWAH bitmap
       
     +@@ Documentation/technical/bitmap-format.txt: implementation:
     + 	- 4-byte number of words of the COMPRESSED bitmap, when stored
     + 
     + 	- N x 8-byte words, as specified by the previous field
     +-
     +-		This is the actual content of the compressed bitmap.
     +++
     ++This is the actual content of the compressed bitmap.
     + 
     + 	- 4-byte position of the current RLW for the compressed
     + 		bitmap
 3:  b971558e1cb ! 3:  dbb86dca205 bitmap-format.txt: add information for trailing checksum
     @@ Commit message
       ## Documentation/technical/bitmap-format.txt ##
      @@ Documentation/technical/bitmap-format.txt: in the index.
       
     - 		** The compressed bitmap itself, see Appendix A.
     +         ** The compressed bitmap itself, see Appendix A.
       
      +	* {empty}
      +	TRAILER: ::

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v4 1/3] bitmap-format.txt: feed the file to asciidoc to generate html
  2022-06-16  5:03     ` [PATCH v4 " Abhradeep Chakraborty via GitGitGadget
@ 2022-06-16  5:03       ` Abhradeep Chakraborty via GitGitGadget
  2022-06-16  5:03       ` [PATCH v4 2/3] bitmap-format.txt: fix some formatting issues Abhradeep Chakraborty via GitGitGadget
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 37+ messages in thread
From: Abhradeep Chakraborty via GitGitGadget @ 2022-06-16  5:03 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Kaartic Sivaraam, Derrick Stolee, Junio C Hamano,
	Abhradeep Chakraborty, Abhradeep Chakraborty

From: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>

Documentation/Makefile does not list bitmap-format.txt, so asciidoc
does not generate an HTML page for it.

Teach Documentation/Makefile to also generate an HTML page for
Documentation/technical/bitmap-format.txt.

Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
---
 Documentation/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/Makefile b/Documentation/Makefile
index f2e7fc1daa5..4f801f4e4c9 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -94,6 +94,7 @@ TECH_DOCS += MyFirstContribution
 TECH_DOCS += MyFirstObjectWalk
 TECH_DOCS += SubmittingPatches
 TECH_DOCS += ToolsForGit
+TECH_DOCS += technical/bitmap-format
 TECH_DOCS += technical/bundle-format
 TECH_DOCS += technical/cruft-packs
 TECH_DOCS += technical/hash-function-transition
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v4 2/3] bitmap-format.txt: fix some formatting issues
  2022-06-16  5:03     ` [PATCH v4 " Abhradeep Chakraborty via GitGitGadget
  2022-06-16  5:03       ` [PATCH v4 1/3] bitmap-format.txt: feed the file to asciidoc to generate html Abhradeep Chakraborty via GitGitGadget
@ 2022-06-16  5:03       ` Abhradeep Chakraborty via GitGitGadget
  2022-06-16  5:03       ` [PATCH v4 3/3] bitmap-format.txt: add information for trailing checksum Abhradeep Chakraborty via GitGitGadget
  2022-06-16 18:53       ` [PATCH v4 0/3] bitmap-format.txt: fix some formatting issues and include checksum info Junio C Hamano
  3 siblings, 0 replies; 37+ messages in thread
From: Abhradeep Chakraborty via GitGitGadget @ 2022-06-16  5:03 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Kaartic Sivaraam, Derrick Stolee, Junio C Hamano,
	Abhradeep Chakraborty, Abhradeep Chakraborty

From: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>

The asciidoc-generated HTML for
`Documentation/technical/bitmap-format.txt` is broken. This is mainly
because `-` is used for nested lists (which is not allowed in asciidoc)
instead of `*`.

Fix these and also reformat the file for better readability of the HTML
page.

Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
---
 Documentation/technical/bitmap-format.txt | 199 +++++++++++-----------
 1 file changed, 103 insertions(+), 96 deletions(-)

diff --git a/Documentation/technical/bitmap-format.txt b/Documentation/technical/bitmap-format.txt
index 04b3ec21785..49c8e819804 100644
--- a/Documentation/technical/bitmap-format.txt
+++ b/Documentation/technical/bitmap-format.txt
@@ -25,9 +25,9 @@ An object is uniquely described by its bit position within a bitmap:
 	is defined as follows:
 
 		o1 <= o2 <==> pack(o1) <= pack(o2) /\ offset(o1) <= offset(o2)
-
-	The ordering between packs is done according to the MIDX's .rev file.
-	Notably, the preferred pack sorts ahead of all other packs.
++
+The ordering between packs is done according to the MIDX's .rev file.
+Notably, the preferred pack sorts ahead of all other packs.
 
 The on-disk representation (described below) of a bitmap is the same regardless
 of whether or not that bitmap belongs to a packfile or a MIDX. The only
@@ -39,97 +39,104 @@ MIDXs, both the bit-cache and rev-cache extensions are required.
 
 == On-disk format
 
-	- A header appears at the beginning:
-
-		4-byte signature: {'B', 'I', 'T', 'M'}
-
-		2-byte version number (network byte order)
-			The current implementation only supports version 1
-			of the bitmap index (the same one as JGit).
-
-		2-byte flags (network byte order)
-
-			The following flags are supported:
-
-			- BITMAP_OPT_FULL_DAG (0x1) REQUIRED
-			This flag must always be present. It implies that the
-			bitmap index has been generated for a packfile or
-			multi-pack index (MIDX) with full closure (i.e. where
-			every single object in the packfile/MIDX can find its
-			parent links inside the same packfile/MIDX). This is a
-			requirement for the bitmap index format, also present in
-			JGit, that greatly reduces the complexity of the
-			implementation.
-
-			- BITMAP_OPT_HASH_CACHE (0x4)
-			If present, the end of the bitmap file contains
-			`N` 32-bit name-hash values, one per object in the
-			pack/MIDX. The format and meaning of the name-hash is
-			described below.
-
-		4-byte entry count (network byte order)
-
-			The total count of entries (bitmapped commits) in this bitmap index.
-
-		20-byte checksum
-
-			The SHA1 checksum of the pack/MIDX this bitmap index
-			belongs to.
-
-	- 4 EWAH bitmaps that act as type indexes
-
-		Type indexes are serialized after the hash cache in the shape
-		of four EWAH bitmaps stored consecutively (see Appendix A for
-		the serialization format of an EWAH bitmap).
-
-		There is a bitmap for each Git object type, stored in the following
-		order:
-
-			- Commits
-			- Trees
-			- Blobs
-			- Tags
-
-		In each bitmap, the `n`th bit is set to true if the `n`th object
-		in the packfile or multi-pack index is of that type.
-
-		The obvious consequence is that the OR of all 4 bitmaps will result
-		in a full set (all bits set), and the AND of all 4 bitmaps will
-		result in an empty bitmap (no bits set).
-
-	- N entries with compressed bitmaps, one for each indexed commit
-
-		Where `N` is the total amount of entries in this bitmap index.
-		Each entry contains the following:
-
-		- 4-byte object position (network byte order)
-			The position **in the index for the packfile or
-			multi-pack index** where the bitmap for this commit is
-			found.
-
-		- 1-byte XOR-offset
-			The xor offset used to compress this bitmap. For an entry
-			in position `x`, a XOR offset of `y` means that the actual
-			bitmap representing this commit is composed by XORing the
-			bitmap for this entry with the bitmap in entry `x-y` (i.e.
-			the bitmap `y` entries before this one).
-
-			Note that this compression can be recursive. In order to
-			XOR this entry with a previous one, the previous entry needs
-			to be decompressed first, and so on.
-
-			The hard-limit for this offset is 160 (an entry can only be
-			xor'ed against one of the 160 entries preceding it). This
-			number is always positive, and hence entries are always xor'ed
-			with **previous** bitmaps, not bitmaps that will come afterwards
-			in the index.
-
-		- 1-byte flags for this bitmap
-			At the moment the only available flag is `0x1`, which hints
-			that this bitmap can be re-used when rebuilding bitmap indexes
-			for the repository.
-
-		- The compressed bitmap itself, see Appendix A.
+    * A header appears at the beginning:
+
+        4-byte signature: :: {'B', 'I', 'T', 'M'}
+
+        2-byte version number (network byte order): ::
+
+            The current implementation only supports version 1
+            of the bitmap index (the same one as JGit).
+
+        2-byte flags (network byte order): ::
+
+            The following flags are supported:
+
+            ** {empty}
+            BITMAP_OPT_FULL_DAG (0x1) REQUIRED: :::
+
+            This flag must always be present. It implies that the
+            bitmap index has been generated for a packfile or
+            multi-pack index (MIDX) with full closure (i.e. where
+            every single object in the packfile/MIDX can find its
+            parent links inside the same packfile/MIDX). This is a
+            requirement for the bitmap index format, also present in
+            JGit, that greatly reduces the complexity of the
+            implementation.
+
+            ** {empty}
+            BITMAP_OPT_HASH_CACHE (0x4): :::
+
+            If present, the end of the bitmap file contains
+            `N` 32-bit name-hash values, one per object in the
+            pack/MIDX. The format and meaning of the name-hash is
+            described below.
+
+        4-byte entry count (network byte order): ::
+            The total count of entries (bitmapped commits) in this bitmap index.
+
+        20-byte checksum: ::
+            The SHA1 checksum of the pack/MIDX this bitmap index
+            belongs to.
+
+    * 4 EWAH bitmaps that act as type indexes
++
+Type indexes are serialized after the hash cache in the shape
+of four EWAH bitmaps stored consecutively (see Appendix A for
+the serialization format of an EWAH bitmap).
++
+There is a bitmap for each Git object type, stored in the following
+order:
++
+    - Commits
+    - Trees
+    - Blobs
+    - Tags
+
++
+In each bitmap, the `n`th bit is set to true if the `n`th object
+in the packfile or multi-pack index is of that type.
++
+The obvious consequence is that the OR of all 4 bitmaps will result
+in a full set (all bits set), and the AND of all 4 bitmaps will
+result in an empty bitmap (no bits set).
+
+    * N entries with compressed bitmaps, one for each indexed commit
++
+Where `N` is the total amount of entries in this bitmap index.
+Each entry contains the following:
+
+        ** {empty}
+        4-byte object position (network byte order): ::
+            The position **in the index for the packfile or
+            multi-pack index** where the bitmap for this commit is
+            found.
+
+        ** {empty}
+        1-byte XOR-offset: ::
+            The xor offset used to compress this bitmap. For an entry
+            in position `x`, a XOR offset of `y` means that the actual
+            bitmap representing this commit is composed by XORing the
+            bitmap for this entry with the bitmap in entry `x-y` (i.e.
+            the bitmap `y` entries before this one).
++
+NOTE: This compression can be recursive. In order to
+XOR this entry with a previous one, the previous entry needs
+to be decompressed first, and so on.
++
+The hard-limit for this offset is 160 (an entry can only be
+xor'ed against one of the 160 entries preceding it). This
+number is always positive, and hence entries are always xor'ed
+with **previous** bitmaps, not bitmaps that will come afterwards
+in the index.
+
+        ** {empty}
+        1-byte flags for this bitmap: ::
+            At the moment the only available flag is `0x1`, which hints
+            that this bitmap can be re-used when rebuilding bitmap indexes
+            for the repository.
+
+        ** The compressed bitmap itself, see Appendix A.
 
 == Appendix A: Serialization format for an EWAH bitmap
 
@@ -142,8 +149,8 @@ implementation:
 	- 4-byte number of words of the COMPRESSED bitmap, when stored
 
 	- N x 8-byte words, as specified by the previous field
-
-		This is the actual content of the compressed bitmap.
++
+This is the actual content of the compressed bitmap.
 
 	- 4-byte position of the current RLW for the compressed
 		bitmap
-- 
gitgitgadget
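
The header layout documented in the patch above (4-byte signature,
2-byte version, 2-byte flags, 4-byte entry count, 20-byte checksum, all
multi-byte integers in network byte order) can be sketched as a small
parser. This is only an illustration under stated assumptions, not Git's
implementation: the function name `parse_bitmap_header` and the returned
dictionary keys are hypothetical, and the 20-byte checksum length
assumes a SHA-1 pack/MIDX as described in the text.

```python
import struct

# Flag values as named in the documentation text.
BITMAP_OPT_FULL_DAG = 0x1
BITMAP_OPT_HASH_CACHE = 0x4

# ">" selects network (big-endian) byte order without padding:
# signature (4) + version (2) + flags (2) + entry count (4) + checksum (20)
HEADER = struct.Struct(">4sHHI20s")


def parse_bitmap_header(data: bytes) -> dict:
    """Decode the fixed-size header at the start of a bitmap file.

    Hypothetical helper for illustration; the name is not from Git.
    """
    if len(data) < HEADER.size:
        raise ValueError("truncated bitmap header")
    signature, version, flags, entry_count, checksum = HEADER.unpack_from(data)
    if signature != b"BITM":
        raise ValueError("bad signature: %r" % signature)
    if version != 1:
        # The text says only version 1 is supported.
        raise ValueError("unsupported bitmap version: %d" % version)
    if not flags & BITMAP_OPT_FULL_DAG:
        # The text marks this flag as REQUIRED.
        raise ValueError("BITMAP_OPT_FULL_DAG must be set")
    return {
        "version": version,
        "has_hash_cache": bool(flags & BITMAP_OPT_HASH_CACHE),
        "entry_count": entry_count,
        "pack_checksum": checksum,
    }
```

The header is 32 bytes under these assumptions, so the type-index
bitmaps would start at offset `HEADER.size`.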


^ permalink raw reply related	[flat|nested] 37+ messages in thread
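
The XOR-offset rule described in patch 2/3 (an entry at position `x`
with offset `y` is XORed with the already reconstructed bitmap of entry
`x-y`, with a hard limit of 160) can be illustrated with the sketch
below. It is an assumption-laden simplification: each bitmap is
modelled as a Python integer used as a bitset, EWAH decompression is
assumed to have happened already, and the names `resolve_bitmaps` and
`MAX_XOR_OFFSET` are hypothetical.

```python
MAX_XOR_OFFSET = 160  # hard limit stated in the documentation text


def resolve_bitmaps(entries):
    """Return the actual bitmap for every entry.

    `entries` is a list of (xor_offset, bits) pairs in on-disk order,
    where `bits` is an int acting as a bitset. Hypothetical helper.
    """
    resolved = []
    for position, (xor_offset, bits) in enumerate(entries):
        if xor_offset > MAX_XOR_OFFSET:
            raise ValueError("XOR offset exceeds the limit of 160")
        if xor_offset > position:
            raise ValueError("XOR offset points before the first entry")
        if xor_offset:
            # The referenced entry comes earlier in the list, so it has
            # already been reconstructed; this handles the recursive
            # case mentioned in the text without explicit recursion.
            bits ^= resolved[position - xor_offset]
        resolved.append(bits)
    return resolved
```

Processing entries in order means every entry an offset can refer to is
already resolved, because offsets only point backwards.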

* [PATCH v4 3/3] bitmap-format.txt: add information for trailing checksum
  2022-06-16  5:03     ` [PATCH v4 " Abhradeep Chakraborty via GitGitGadget
  2022-06-16  5:03       ` [PATCH v4 1/3] bitmap-format.txt: feed the file to asciidoc to generate html Abhradeep Chakraborty via GitGitGadget
  2022-06-16  5:03       ` [PATCH v4 2/3] bitmap-format.txt: fix some formatting issues Abhradeep Chakraborty via GitGitGadget
@ 2022-06-16  5:03       ` Abhradeep Chakraborty via GitGitGadget
  2022-06-16 18:53       ` [PATCH v4 0/3] bitmap-format.txt: fix some formatting issues and include checksum info Junio C Hamano
  3 siblings, 0 replies; 37+ messages in thread
From: Abhradeep Chakraborty via GitGitGadget @ 2022-06-16  5:03 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Kaartic Sivaraam, Derrick Stolee, Junio C Hamano,
	Abhradeep Chakraborty, Abhradeep Chakraborty

From: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>

A bitmap file ends with a trailing checksum, but the bitmap-format
documentation says nothing about it.

Add a trailer section that describes the trailing checksum to the
`Documentation/technical/bitmap-format.txt` file.

Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
---
 Documentation/technical/bitmap-format.txt | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/Documentation/technical/bitmap-format.txt b/Documentation/technical/bitmap-format.txt
index 49c8e819804..7be5f2318ba 100644
--- a/Documentation/technical/bitmap-format.txt
+++ b/Documentation/technical/bitmap-format.txt
@@ -138,6 +138,10 @@ in the index.
 
         ** The compressed bitmap itself, see Appendix A.
 
+	* {empty}
+	TRAILER: ::
+		Trailing checksum of the preceding contents.
+
 == Appendix A: Serialization format for an EWAH bitmap
 
 Ewah bitmaps are serialized in the same protocol as the JAVAEWAH
-- 
gitgitgadget
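
The trailer added by this patch is described only as a "trailing
checksum of the preceding contents". A reader could check such a
trailer roughly as follows. This is a sketch, not Git's code: it
assumes the checksum is a SHA-1 digest occupying the last 20 bytes of
the file (the patch text does not name the hash), and `verify_trailer`
is a hypothetical name.

```python
import hashlib


def verify_trailer(data: bytes, hash_name: str = "sha1") -> bool:
    """Check that the final digest-sized bytes hash everything before them.

    `hash_name` defaults to SHA-1, which is an assumption here.
    """
    digest_size = hashlib.new(hash_name).digest_size
    if len(data) < digest_size:
        return False
    body, trailer = data[:-digest_size], data[-digest_size:]
    return hashlib.new(hash_name, body).digest() == trailer
```

Passing a different `hash_name` (for example "sha256") adjusts the
trailer length automatically, should the repository use another hash.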

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH v4 0/3] bitmap-format.txt: fix some formatting issues and include checksum info
  2022-06-16  5:03     ` [PATCH v4 " Abhradeep Chakraborty via GitGitGadget
                         ` (2 preceding siblings ...)
  2022-06-16  5:03       ` [PATCH v4 3/3] bitmap-format.txt: add information for trailing checksum Abhradeep Chakraborty via GitGitGadget
@ 2022-06-16 18:53       ` Junio C Hamano
  2022-06-16 21:18         ` Taylor Blau
  3 siblings, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2022-06-16 18:53 UTC (permalink / raw)
  To: Abhradeep Chakraborty via GitGitGadget
  Cc: git, Taylor Blau, Kaartic Sivaraam, Derrick Stolee,
	Abhradeep Chakraborty

This version looks good and seems to format well.  Well done.

Thanks.  Will queue.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v4 0/3] bitmap-format.txt: fix some formatting issues and include checksum info
  2022-06-16 18:53       ` [PATCH v4 0/3] bitmap-format.txt: fix some formatting issues and include checksum info Junio C Hamano
@ 2022-06-16 21:18         ` Taylor Blau
  0 siblings, 0 replies; 37+ messages in thread
From: Taylor Blau @ 2022-06-16 21:18 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Abhradeep Chakraborty via GitGitGadget, git, Taylor Blau,
	Kaartic Sivaraam, Derrick Stolee, Abhradeep Chakraborty

On Thu, Jun 16, 2022 at 11:53:27AM -0700, Junio C Hamano wrote:
> This version looks good and seems to format well.  Well done.

Agreed. Nice work, Abhradeep!

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2022-06-16 21:19 UTC | newest]

Thread overview: 37+ messages
2022-06-02 13:52 [PATCH 0/2] bitmap-format.txt: fix some formatting issues and include checksum info Abhradeep Chakraborty via GitGitGadget
2022-06-02 13:52 ` [PATCH 1/2] bitmap-format.txt: fix some formatting issues Abhradeep Chakraborty via GitGitGadget
2022-06-06 15:55   ` Junio C Hamano
2022-06-07 10:25     ` Abhradeep Chakraborty
2022-06-02 13:52 ` [PATCH 2/2] bitmap-format.txt: add information for trailing checksum Abhradeep Chakraborty via GitGitGadget
2022-06-07 17:43 ` [PATCH v2 0/3] bitmap-format.txt: fix some formatting issues and include checksum info Abhradeep Chakraborty via GitGitGadget
2022-06-07 17:43   ` [PATCH v2 1/3] bitmap-format.txt: feed the file to asciidoc to generate html Abhradeep Chakraborty via GitGitGadget
2022-06-07 18:39     ` Junio C Hamano
2022-06-08 15:02       ` Abhradeep Chakraborty
2022-06-07 20:21     ` Taylor Blau
2022-06-07 17:43   ` [PATCH v2 2/3] bitmap-format.txt: fix some formatting issues Abhradeep Chakraborty via GitGitGadget
2022-06-07 20:51     ` Taylor Blau
2022-06-07 22:02       ` Junio C Hamano
2022-06-08 16:06         ` Abhradeep Chakraborty
2022-06-08 15:40       ` Abhradeep Chakraborty
2022-06-07 17:43   ` [PATCH v2 3/3] bitmap-format.txt: add information for trailing checksum Abhradeep Chakraborty via GitGitGadget
2022-06-07 20:56     ` Taylor Blau
2022-06-08 16:15       ` Abhradeep Chakraborty
2022-06-07 18:28   ` [PATCH v2 0/3] bitmap-format.txt: fix some formatting issues and include checksum info Junio C Hamano
2022-06-07 20:58     ` Taylor Blau
2022-06-07 21:00     ` Junio C Hamano
2022-06-08 17:12       ` Abhradeep Chakraborty
2022-06-10 10:54   ` [PATCH v3 " Abhradeep Chakraborty via GitGitGadget
2022-06-10 10:54     ` [PATCH v3 1/3] bitmap-format.txt: feed the file to asciidoc to generate html Abhradeep Chakraborty via GitGitGadget
2022-06-10 10:54     ` [PATCH v3 2/3] bitmap-format.txt: fix some formatting issues Abhradeep Chakraborty via GitGitGadget
2022-06-15  2:27       ` Taylor Blau
2022-06-15 14:28         ` Abhradeep Chakraborty
2022-06-10 10:54     ` [PATCH v3 3/3] bitmap-format.txt: add information for trailing checksum Abhradeep Chakraborty via GitGitGadget
2022-06-10 17:01     ` [PATCH v3 0/3] bitmap-format.txt: fix some formatting issues and include checksum info Junio C Hamano
2022-06-15  2:28       ` Taylor Blau
2022-06-15 22:41         ` Junio C Hamano
2022-06-16  5:03     ` [PATCH v4 " Abhradeep Chakraborty via GitGitGadget
2022-06-16  5:03       ` [PATCH v4 1/3] bitmap-format.txt: feed the file to asciidoc to generate html Abhradeep Chakraborty via GitGitGadget
2022-06-16  5:03       ` [PATCH v4 2/3] bitmap-format.txt: fix some formatting issues Abhradeep Chakraborty via GitGitGadget
2022-06-16  5:03       ` [PATCH v4 3/3] bitmap-format.txt: add information for trailing checksum Abhradeep Chakraborty via GitGitGadget
2022-06-16 18:53       ` [PATCH v4 0/3] bitmap-format.txt: fix some formatting issues and include checksum info Junio C Hamano
2022-06-16 21:18         ` Taylor Blau
