From: Junio C Hamano <gitster@pobox.com>
To: "Abhradeep Chakraborty via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org,
	Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>,
	Vicent Marti <tanoku@gmail.com>, Taylor Blau <me@ttaylorr.com>
Subject: Re: [PATCH 1/2] bitmap-format.txt: fix some formatting issues
Date: Mon, 06 Jun 2022 08:55:54 -0700
Message-ID: <xmqqsfohbxbp.fsf@gitster.g>
In-Reply-To: <976361e624a3dd58c8f291358d42f4e4c66eb266.1654177966.git.gitgitgadget@gmail.com> (Abhradeep Chakraborty via GitGitGadget's message of "Thu, 02 Jun 2022 13:52:45 +0000")

"Abhradeep Chakraborty via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
> Cc: git@vger.kernel.org,  Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>

Perhaps identify those who may have input by running "git log
--no-merges" on this file, and add them to the Cc: line here?

> The asciidoc generated html for `Documentation/technical/bitmap-
> format.txt` is broken. This is mainly because `-` is used for nested
> lists (which is not allowed in asciidoc) instead of `*`.

Are we missing another step that must come much earlier than this
patch?  It seems to me that Documentation/Makefile does not even
consider that we should feed this file to AsciiDoc.

> Fix these and also reformat it (e.g. removing some blank lines) for
> better readability of the html page.

Do these blank lines badly hurt how the end result is formatted in
HTML?  Does the extra indentation between the line with "The
following flags are supported" on it and the two bullet items in the
header make the output better in a significant way?

These changes make the input text much harder to read, and are not
very welcome, so unless they are needed to fix the broken HTML
output, please omit them.  As evidenced by the lack of HTML output
in the build system, a lot more folks read this document as text than
as HTML, and readability of the source matters.

Thanks.

> Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
> ---
>  Documentation/technical/bitmap-format.txt | 96 +++++++++++------------
>  1 file changed, 45 insertions(+), 51 deletions(-)
>
> diff --git a/Documentation/technical/bitmap-format.txt b/Documentation/technical/bitmap-format.txt
> index 04b3ec21785..110d7ddf8ed 100644
> --- a/Documentation/technical/bitmap-format.txt
> +++ b/Documentation/technical/bitmap-format.txt
> @@ -39,7 +39,7 @@ MIDXs, both the bit-cache and rev-cache extensions are required.
>  
>  == On-disk format
>  
> -	- A header appears at the beginning:
> +	* A header appears at the beginning:
>  
>  		4-byte signature: {'B', 'I', 'T', 'M'}
>  
> @@ -48,35 +48,30 @@ MIDXs, both the bit-cache and rev-cache extensions are required.
>  			of the bitmap index (the same one as JGit).
>  
>  		2-byte flags (network byte order)
> -
>  			The following flags are supported:
> -
> -			- BITMAP_OPT_FULL_DAG (0x1) REQUIRED
> -			This flag must always be present. It implies that the
> -			bitmap index has been generated for a packfile or
> -			multi-pack index (MIDX) with full closure (i.e. where
> -			every single object in the packfile/MIDX can find its
> -			parent links inside the same packfile/MIDX). This is a
> -			requirement for the bitmap index format, also present in
> -			JGit, that greatly reduces the complexity of the
> -			implementation.
> -
> -			- BITMAP_OPT_HASH_CACHE (0x4)
> -			If present, the end of the bitmap file contains
> -			`N` 32-bit name-hash values, one per object in the
> -			pack/MIDX. The format and meaning of the name-hash is
> -			described below.
> +				- BITMAP_OPT_FULL_DAG (0x1) REQUIRED
> +				This flag must always be present. It implies that the
> +				bitmap index has been generated for a packfile or
> +				multi-pack index (MIDX) with full closure (i.e. where
> +				every single object in the packfile/MIDX can find its
> +				parent links inside the same packfile/MIDX). This is a
> +				requirement for the bitmap index format, also present in
> +				JGit, that greatly reduces the complexity of the
> +				implementation.
> +				- BITMAP_OPT_HASH_CACHE (0x4)
> +				If present, the end of the bitmap file contains
> +				`N` 32-bit name-hash values, one per object in the
> +				pack/MIDX. The format and meaning of the name-hash is
> +				described below.
>  
>  		4-byte entry count (network byte order)
> -
>  			The total count of entries (bitmapped commits) in this bitmap index.
>  
>  		20-byte checksum
> -
>  			The SHA1 checksum of the pack/MIDX this bitmap index
>  			belongs to.
>  
> -	- 4 EWAH bitmaps that act as type indexes
> +	* 4 EWAH bitmaps that act as type indexes
>  
>  		Type indexes are serialized after the hash cache in the shape
>  		of four EWAH bitmaps stored consecutively (see Appendix A for
> @@ -84,7 +79,6 @@ MIDXs, both the bit-cache and rev-cache extensions are required.
>  
>  		There is a bitmap for each Git object type, stored in the following
>  		order:
> -
>  			- Commits
>  			- Trees
>  			- Blobs
> @@ -97,39 +91,39 @@ MIDXs, both the bit-cache and rev-cache extensions are required.
>  		in a full set (all bits set), and the AND of all 4 bitmaps will
>  		result in an empty bitmap (no bits set).
>  
> -	- N entries with compressed bitmaps, one for each indexed commit
> +	* N entries with compressed bitmaps, one for each indexed commit
>  
>  		Where `N` is the total amount of entries in this bitmap index.
>  		Each entry contains the following:
>  
> -		- 4-byte object position (network byte order)
> -			The position **in the index for the packfile or
> -			multi-pack index** where the bitmap for this commit is
> -			found.
> -
> -		- 1-byte XOR-offset
> -			The xor offset used to compress this bitmap. For an entry
> -			in position `x`, a XOR offset of `y` means that the actual
> -			bitmap representing this commit is composed by XORing the
> -			bitmap for this entry with the bitmap in entry `x-y` (i.e.
> -			the bitmap `y` entries before this one).
> -
> -			Note that this compression can be recursive. In order to
> -			XOR this entry with a previous one, the previous entry needs
> -			to be decompressed first, and so on.
> -
> -			The hard-limit for this offset is 160 (an entry can only be
> -			xor'ed against one of the 160 entries preceding it). This
> -			number is always positive, and hence entries are always xor'ed
> -			with **previous** bitmaps, not bitmaps that will come afterwards
> -			in the index.
> -
> -		- 1-byte flags for this bitmap
> -			At the moment the only available flag is `0x1`, which hints
> -			that this bitmap can be re-used when rebuilding bitmap indexes
> -			for the repository.
> -
> -		- The compressed bitmap itself, see Appendix A.
> +			** 4-byte object position (network byte order)
> +				The position **in the index for the packfile or
> +				multi-pack index** where the bitmap for this commit is
> +				found.
> +
> +			** 1-byte XOR-offset
> +				The xor offset used to compress this bitmap. For an entry
> +				in position `x`, a XOR offset of `y` means that the actual
> +				bitmap representing this commit is composed by XORing the
> +				bitmap for this entry with the bitmap in entry `x-y` (i.e.
> +				the bitmap `y` entries before this one).
> +
> +				Note that this compression can be recursive. In order to
> +				XOR this entry with a previous one, the previous entry needs
> +				to be decompressed first, and so on.
> +
> +				The hard-limit for this offset is 160 (an entry can only be
> +				xor'ed against one of the 160 entries preceding it). This
> +				number is always positive, and hence entries are always xor'ed
> +				with **previous** bitmaps, not bitmaps that will come afterwards
> +				in the index.
> +
> +			** 1-byte flags for this bitmap
> +				At the moment the only available flag is `0x1`, which hints
> +				that this bitmap can be re-used when rebuilding bitmap indexes
> +				for the repository.
> +
> +			** The compressed bitmap itself, see Appendix A.
>  
>  == Appendix A: Serialization format for an EWAH bitmap
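
For readers following the quoted format description, the header laid
out in the first hunk could be read roughly as below.  This is only a
sketch in Python, not Git's implementation.  The names HEADER,
BitmapHeader and parse_bitmap_header are made up for illustration.
The 2-byte version field sits in a part of the document that the hunk
context elides, and the 20-byte checksum assumes a SHA-1 repository.

```python
import struct
from typing import NamedTuple

# Flag values as listed in the quoted document.
BITMAP_OPT_FULL_DAG = 0x1
BITMAP_OPT_HASH_CACHE = 0x4

# Network byte order: 4-byte signature, 2-byte version, 2-byte flags,
# 4-byte entry count, 20-byte checksum (SHA-1 assumed).
HEADER = struct.Struct(">4sHHI20s")


class BitmapHeader(NamedTuple):
    """Hypothetical container for the decoded header fields."""
    version: int
    flags: int
    entry_count: int
    checksum: bytes


def parse_bitmap_header(data: bytes) -> BitmapHeader:
    """Decode the fixed-size header at the start of a bitmap index."""
    if len(data) < HEADER.size:
        raise ValueError("truncated bitmap header")
    signature, version, flags, entry_count, checksum = HEADER.unpack_from(data)
    if signature != b"BITM":
        raise ValueError("bad signature: %r" % signature)
    # The document marks BITMAP_OPT_FULL_DAG as REQUIRED.
    if not flags & BITMAP_OPT_FULL_DAG:
        raise ValueError("BITMAP_OPT_FULL_DAG flag is missing")
    return BitmapHeader(version, flags, entry_count, checksum)


# Build a synthetic header and decode it again.
sample = HEADER.pack(
    b"BITM", 1, BITMAP_OPT_FULL_DAG | BITMAP_OPT_HASH_CACHE, 3, bytes(20)
)
header = parse_bitmap_header(sample)
has_hash_cache = bool(header.flags & BITMAP_OPT_HASH_CACHE)
```

Checking the REQUIRED flag up front mirrors the wording of the
document; a real reader would also have to validate the version and
compare the checksum against the pack/MIDX.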

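The XOR-offset rule in the last hunk (entry `x` with offset `y` is
XORed with entry `x-y`, possibly recursively) can be illustrated the
same way.  In this sketch bitmaps are plain Python integers that stand
in for already-decoded EWAH bitmaps, resolve_bitmaps is a hypothetical
helper, and treating an offset of 0 as "stored without XOR" is an
assumption that the quoted text does not spell out.

```python
from typing import List, Tuple

# Hard limit stated in the quoted document.
MAX_XOR_OFFSET = 160


def resolve_bitmaps(entries: List[Tuple[int, int]]) -> List[int]:
    """Turn (xor_offset, stored_bits) pairs, in file order, into actual bitmaps.

    Entries are processed front to back, so the entry an offset points
    at has already been resolved. That covers the recursive case
    without explicit recursion.
    """
    resolved: List[int] = []
    for x, (offset, stored) in enumerate(entries):
        if not 0 <= offset <= MAX_XOR_OFFSET:
            raise ValueError("XOR offset %d out of range" % offset)
        if offset > x:
            raise ValueError("entry %d refers before the first entry" % x)
        if offset == 0:
            # Assumption: offset 0 means the bitmap is stored as-is.
            resolved.append(stored)
        else:
            # Actual bitmap = stored bitmap XOR the bitmap `offset` entries back.
            resolved.append(stored ^ resolved[x - offset])
    return resolved


# Entry 1 is XORed against entry 0; entry 2 is XORed against entry 0 as well.
entries = [(0, 0b1010), (1, 0b0110), (2, 0b0001)]
bitmaps = resolve_bitmaps(entries)
```

Because offsets only ever point backwards, a single forward pass is
enough.  A reader that loads entries lazily would instead have to
follow the chain on demand.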