git@vger.kernel.org mailing list mirror (one of many)
* [PATCH 0/4] Add some Glossary terms, and extra renormalize information.
@ 2022-07-09 16:56 Philip Oakley via GitGitGadget
  2022-07-09 16:56 ` [PATCH 1/4] glossary: add Object DataBase (ODB) abbreviation Philip Oakley via GitGitGadget
                   ` (5 more replies)
  0 siblings, 6 replies; 45+ messages in thread
From: Philip Oakley via GitGitGadget @ 2022-07-09 16:56 UTC (permalink / raw)
  To: git; +Cc: Philip Oakley

This short series looks to add the basics of the reachability bitmap and
commit graph phrases to the glossary of terms. While these techniques are
well known to their developers, for some, they are just magic phrases.

The first patch [1/4] is to show ODB as an abbreviation, to avoid a UNA [0].

Patch [2/4] provides a basic statement for the Commit-Graph's purpose.

Patch [3/4] provides a similar statement for the reachability bitmaps.

These two patches may miss out on some linking information as to the
benefits these have and the basics of their heuristic.

Patch [4/4] follows up on a bug report about the lack of idempotence for the
`--renormalize` option. See the commit message for details.

[0] UNA Un-Named Abbreviation.

Signed-off-by: Philip Oakley <philipoakley@iee.email>

Philip Oakley (4):
  glossary: add Object DataBase (ODB) abbreviation
  glossary: add commit graph description
  glossary: add reachability bitmap description
  doc add: renormalize is not idempotent for CRCRLF

 Documentation/git-add.txt          |  3 ++-
 Documentation/glossary-content.txt | 15 ++++++++++++++-
 2 files changed, 16 insertions(+), 2 deletions(-)


base-commit: 30cc8d0f147546d4dd77bf497f4dec51e7265bd8
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1282%2FPhilipOakley%2FGlossary_terms-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1282/PhilipOakley/Glossary_terms-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1282
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 1/4] glossary: add Object DataBase (ODB) abbreviation
  2022-07-09 16:56 [PATCH 0/4] Add some Glossary terms, and extra renormalize information Philip Oakley via GitGitGadget
@ 2022-07-09 16:56 ` Philip Oakley via GitGitGadget
  2022-07-09 16:56 ` [PATCH 2/4] glossary: add commit graph description Philip Oakley via GitGitGadget
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 45+ messages in thread
From: Philip Oakley via GitGitGadget @ 2022-07-09 16:56 UTC (permalink / raw)
  To: git; +Cc: Philip Oakley, Philip Oakley

From: Philip Oakley <philipoakley@iee.email>

ODB abbreviation is used in the technical section without expansion.
Show the abbreviation in the Glossary.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
---
 Documentation/glossary-content.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index aa2f41f5e70..f3342a5ab69 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -257,7 +257,7 @@ This commit is referred to as a "merge commit", or sometimes just a
 	<<def_SHA1,SHA-1>> of its contents. Consequently, an
 	object cannot be changed.
 
-[[def_object_database]]object database::
+[[def_object_database]]object database (ODB)::
 	Stores a set of "objects", and an individual <<def_object,object>> is
 	identified by its <<def_object_name,object name>>. The objects usually
 	live in `$GIT_DIR/objects/`.
-- 
gitgitgadget



* [PATCH 2/4] glossary: add commit graph description
  2022-07-09 16:56 [PATCH 0/4] Add some Glossary terms, and extra renormalize information Philip Oakley via GitGitGadget
  2022-07-09 16:56 ` [PATCH 1/4] glossary: add Object DataBase (ODB) abbreviation Philip Oakley via GitGitGadget
@ 2022-07-09 16:56 ` Philip Oakley via GitGitGadget
  2022-07-09 21:20   ` Junio C Hamano
  2022-07-09 16:56 ` [PATCH 3/4] glossary: add reachability bitmap description Philip Oakley via GitGitGadget
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 45+ messages in thread
From: Philip Oakley via GitGitGadget @ 2022-07-09 16:56 UTC (permalink / raw)
  To: git; +Cc: Philip Oakley, Philip Oakley

From: Philip Oakley <philipoakley@iee.email>

Signed-off-by: Philip Oakley <philipoakley@iee.email>
---
 Documentation/glossary-content.txt | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index f3342a5ab69..a9e69949a4e 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -75,6 +75,13 @@ state in the Git history, by creating a new commit representing the current
 state of the <<def_index,index>> and advancing <<def_HEAD,HEAD>>
 to point at the new commit.
 
+[[def_commit_graph]]commit graph::
+	The commit-graph file is a supplemental data structure that
+	accelerates commit graph walks. The existing Object Data Base (ODB)
+	is the definitive commit graph. The "commit-graph" file is stored
+	either in the .git/objects/info directory or in the info directory
+	of an alternate object database.
+
 [[def_commit_object]]commit object::
 	An <<def_object,object>> which contains the information about a
 	particular <<def_revision,revision>>, such as <<def_parent,parents>>, committer,
-- 
gitgitgadget



* [PATCH 3/4] glossary: add reachability bitmap description
  2022-07-09 16:56 [PATCH 0/4] Add some Glossary terms, and extra renormalize information Philip Oakley via GitGitGadget
  2022-07-09 16:56 ` [PATCH 1/4] glossary: add Object DataBase (ODB) abbreviation Philip Oakley via GitGitGadget
  2022-07-09 16:56 ` [PATCH 2/4] glossary: add commit graph description Philip Oakley via GitGitGadget
@ 2022-07-09 16:56 ` Philip Oakley via GitGitGadget
  2022-07-09 16:56 ` [PATCH 4/4] doc add: renormalize is not idempotent for CRCRLF Philip Oakley via GitGitGadget
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 45+ messages in thread
From: Philip Oakley via GitGitGadget @ 2022-07-09 16:56 UTC (permalink / raw)
  To: git; +Cc: Philip Oakley, Philip Oakley

From: Philip Oakley <philipoakley@iee.email>

Signed-off-by: Philip Oakley <philipoakley@iee.email>
---
 Documentation/glossary-content.txt | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index a9e69949a4e..6302df90563 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -500,6 +500,12 @@ exclude;;
 	<<def_tree_object,trees>> to the trees or <<def_blob_object,blobs>>
 	that they contain.
 
+[[def_reachability_bitmap]]reachability bitmaps::
+	Reachability bitmaps store information about the set of objects in
+	a packfile, or a multi-pack index (MIDX). A repository may have at
+	most one bitmap. The bitmap may belong to either one pack, or the
+	repository's multi-pack index (if it exists).
+
 [[def_rebase]]rebase::
 	To reapply a series of changes from a <<def_branch,branch>> to a
 	different base, and reset the <<def_head,head>> of that branch
-- 
gitgitgadget
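
As an illustration of what the glossary entry in the patch above describes,
here is a minimal Python sketch of a reachability bitmap. It assumes one bit
per object, indexed by the object's position in the pack, with the bit set
when the object is reachable from a given commit. The object ids, the `REFS`
table and the helper names are hypothetical; this is not Git's on-disk
format.

```python
# Hypothetical object ids, listed in pack order: bit i stands for PACK_ORDER[i].
PACK_ORDER = ["c2", "c1", "t2", "t1", "b2", "b1"]
POSITION = {oid: i for i, oid in enumerate(PACK_ORDER)}

# Hypothetical direct references: commit -> parent and tree, tree -> blobs.
REFS = {
    "c2": ["c1", "t2"],
    "c1": ["t1"],
    "t2": ["b2", "b1"],
    "t1": ["b1"],
    "b2": [],
    "b1": [],
}


def bitmap_for(commit: str) -> int:
    """Walk everything reachable from `commit` and set one bit per object."""
    bits = 0
    stack = [commit]
    while stack:
        oid = stack.pop()
        mask = 1 << POSITION[oid]
        if bits & mask:
            continue  # already visited
        bits |= mask
        stack.extend(REFS[oid])
    return bits


def objects_in(bits: int) -> list:
    """Translate a bitmap back into object ids, in pack order."""
    return [oid for i, oid in enumerate(PACK_ORDER) if (bits >> i) & 1]


have = bitmap_for("c1")
want = bitmap_for("c2")
# Objects reachable from c2 but not from c1, found with plain bit operations.
missing = objects_in(want & ~have)
print(missing)
```

Once the bitmaps exist, "what is reachable from c2 but not from c1" becomes
an AND/NOT on two integers instead of a second object walk.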



* [PATCH 4/4] doc add: renormalize is not idempotent for CRCRLF
  2022-07-09 16:56 [PATCH 0/4] Add some Glossary terms, and extra renormalize information Philip Oakley via GitGitGadget
                   ` (2 preceding siblings ...)
  2022-07-09 16:56 ` [PATCH 3/4] glossary: add reachability bitmap description Philip Oakley via GitGitGadget
@ 2022-07-09 16:56 ` Philip Oakley via GitGitGadget
  2022-07-09 21:06   ` Junio C Hamano
  2022-07-10  7:48   ` Torsten Bögershausen
  2022-07-09 21:34 ` [PATCH 0/4] Add some Glossary terms, and extra renormalize information Junio C Hamano
  2022-10-22 22:25 ` [PATCH v2 0/3] Add some Glossary of terms information Philip Oakley
  5 siblings, 2 replies; 45+ messages in thread
From: Philip Oakley via GitGitGadget @ 2022-07-09 16:56 UTC (permalink / raw)
  To: git; +Cc: Philip Oakley, Philip Oakley

From: Philip Oakley <philipoakley@iee.email>

Bug report
 https://lore.kernel.org/git/AM0PR02MB56357CC96B702244F3271014E8DC9@AM0PR02MB5635.eurprd02.prod.outlook.com/
noted that a file containing /r/r/n needed renormalising twice.

This is by design. Lone CR characters, not paired with an LF, are left
unchanged. Note the lack of idempotentness of the "clean" filter in the
documentation.

Renormalize was introduced at 9472935d81e (add: introduce "--renormalize",
Torsten Bögershausen, 2017-11-16)

Signed-off-by: Philip Oakley <philipoakley@iee.email>
---
 Documentation/git-add.txt | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
index 11eb70f16c7..c4a5ad11a6b 100644
--- a/Documentation/git-add.txt
+++ b/Documentation/git-add.txt
@@ -188,7 +188,8 @@ for "git add --no-all <pathspec>...", i.e. ignored removed files.
 	forcibly add them again to the index.  This is useful after
 	changing `core.autocrlf` configuration or the `text` attribute
 	in order to correct files added with wrong CRLF/LF line endings.
-	This option implies `-u`.
+	This option implies `-u`. Lone CR characters are untouched, so
+	cleaning not idempotent. A CRCRLF sequence cleans to CRLF.
 
 --chmod=(+|-)x::
 	Override the executable bit of the added files.  The executable
-- 
gitgitgadget


* Re: [PATCH 4/4] doc add: renormalize is not idempotent for CRCRLF
  2022-07-09 16:56 ` [PATCH 4/4] doc add: renormalize is not idempotent for CRCRLF Philip Oakley via GitGitGadget
@ 2022-07-09 21:06   ` Junio C Hamano
  2022-07-10 21:52     ` Philip Oakley
  2022-07-10  7:48   ` Torsten Bögershausen
  1 sibling, 1 reply; 45+ messages in thread
From: Junio C Hamano @ 2022-07-09 21:06 UTC (permalink / raw)
  To: Philip Oakley via GitGitGadget; +Cc: git, Philip Oakley

"Philip Oakley via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Philip Oakley <philipoakley@iee.email>
>
> Bug report
>  https://lore.kernel.org/git/AM0PR02MB56357CC96B702244F3271014E8DC9@AM0PR02MB5635.eurprd02.prod.outlook.com/
> noted that a file containing /r/r/n needed renormalising twice.

Did you mean backslash, not forward?

> This is by design. Lone CR characters, not paired with an LF, are left
> unchanged. Note the lack of idempotentness of the "clean" filter in the
> documentation.

OK.


> Renormalize was introduced at 9472935d81e (add: introduce "--renormalize",
> Torsten Bögershausen, 2017-11-16)

Does this need to be said "HERE", rather than leaving it to run "git
blame" for those who became curious?

> Signed-off-by: Philip Oakley <philipoakley@iee.email>
> ---
>  Documentation/git-add.txt | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
> index 11eb70f16c7..c4a5ad11a6b 100644
> --- a/Documentation/git-add.txt
> +++ b/Documentation/git-add.txt
> @@ -188,7 +188,8 @@ for "git add --no-all <pathspec>...", i.e. ignored removed files.
>  	forcibly add them again to the index.  This is useful after
>  	changing `core.autocrlf` configuration or the `text` attribute
>  	in order to correct files added with wrong CRLF/LF line endings.
> -	This option implies `-u`.
> +	This option implies `-u`. Lone CR characters are untouched, so
> +	cleaning not idempotent. A CRCRLF sequence cleans to CRLF.

Lack of verb BE somewhere.

Do we expect all our readers to understand the math-y word?  It is not
too hard to explain it to the math-uninitiated, e.g.

    This option implies `-u`.  Note that running renormalize again
    on the result of running renormalize may make it even "more
    normal".  A CR-CR-LF sequence would first renormalize to CR-LF
    (the first CR, a lone CR, is left intact, and CR-LF that follows
    normalizes to LF).  If you run renormalize again, the resulting
    CR-LF will normalize down to LF.
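
The two passes described in the suggested wording can be sketched in Python.
This is only a model: it assumes the clean step behaves like dropping the CR
from every CR-LF pair and leaving any other CR alone, and `clean_crlf` is a
hypothetical helper, not a Git function.

```python
def clean_crlf(data: bytes) -> bytes:
    """Model of the CRLF-to-LF clean step (an assumption, not Git's code):
    a CR directly followed by LF is dropped, any other CR is kept."""
    return data.replace(b"\r\n", b"\n")


original = b"text\r\r\n"       # a line ending in CR-CR-LF
first = clean_crlf(original)   # the lone CR stays, the CR-LF pair becomes LF
second = clean_crlf(first)     # the CR-LF left over from the first pass becomes LF

print(first)
print(second)
```

Under this model the first pass leaves a CR-LF behind, and only the second
pass reaches a plain LF.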



* Re: [PATCH 2/4] glossary: add commit graph description
  2022-07-09 16:56 ` [PATCH 2/4] glossary: add commit graph description Philip Oakley via GitGitGadget
@ 2022-07-09 21:20   ` Junio C Hamano
  2022-07-10 21:37     ` Philip Oakley
  0 siblings, 1 reply; 45+ messages in thread
From: Junio C Hamano @ 2022-07-09 21:20 UTC (permalink / raw)
  To: Philip Oakley via GitGitGadget; +Cc: git, Philip Oakley

"Philip Oakley via GitGitGadget" <gitgitgadget@gmail.com> writes:

> +[[def_commit_graph]]commit graph::
> +	The commit-graph file is a supplemental data structure that
> +	accelerates commit graph walks. The existing Object Data Base (ODB)
> +	is the definitive commit graph. The "commit-graph" file is stored
> +	either in the .git/objects/info directory or in the info directory
> +	of an alternate object database.

While it says nothing technically incorrect, I suspect "The existing
object data base is the definitive commit graph" may invite unneeded
confusion.

I think you wanted to say that the DAG formed by traversing the
pointers recorded in the objects is the authoritative source of
truth and the commit-graph file is merely a precomputed cache and
can be safely lost, but I am not sure the above description conveys
that to anybody who does not already know it.

    The commits in the object data base form a directed acyclic
    graph (DAG) by commits referring to their parent commits.
    Pieces of information from individual commit objects that are
    needed to traverse the DAG are pre-computed in the commit-graph
    file and stored in ...

is my attempt---I am not very happy or proud about it, but it may be
easier to follow.
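
The relationship between the DAG and a precomputed file can be sketched in
Python. This is a toy model under stated assumptions: `ODB` is a hypothetical
dictionary standing in for the object database, commit ids are made up, and
the "cache" is a plain dictionary rather than the real commit-graph format.

```python
# Hypothetical object database: commit id -> raw commit text.
ODB = {
    "c3": "tree t3\nparent c2\nparent c1\n\nmerge",
    "c2": "tree t2\nparent c0\n\nsecond",
    "c1": "tree t1\nparent c0\n\nfirst",
    "c0": "tree t0\n\nroot",
}


def parse_parents(commit_id: str) -> list:
    """Authoritative source: read the parent ids from the commit header."""
    header = ODB[commit_id].split("\n\n", 1)[0]
    return [line.split(" ", 1)[1]
            for line in header.split("\n")
            if line.startswith("parent ")]


def build_cache() -> dict:
    """Precompute the parent lists once, as a stand-in for the commit-graph file."""
    return {cid: parse_parents(cid) for cid in ODB}


def reachable(tip: str, cache: dict = None) -> set:
    """Walk the DAG from `tip`, using the cache when it has an entry."""
    seen = set()
    stack = [tip]
    while stack:
        cid = stack.pop()
        if cid in seen:
            continue
        seen.add(cid)
        if cache is not None and cid in cache:
            parents = cache[cid]           # fast path: precomputed
        else:
            parents = parse_parents(cid)   # fall back to parsing the object
        stack.extend(parents)
    return seen


with_cache = reachable("c3", build_cache())
without_cache = reachable("c3")
print(sorted(with_cache))
```

Both walks visit the same commits; dropping the cache changes only how much
parsing the walk has to do, not its result.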

Thanks.




* Re: [PATCH 0/4] Add some Glossary terms, and extra renormalize information.
  2022-07-09 16:56 [PATCH 0/4] Add some Glossary terms, and extra renormalize information Philip Oakley via GitGitGadget
                   ` (3 preceding siblings ...)
  2022-07-09 16:56 ` [PATCH 4/4] doc add: renormalize is not idempotent for CRCRLF Philip Oakley via GitGitGadget
@ 2022-07-09 21:34 ` Junio C Hamano
  2022-07-10 15:20   ` Philip Oakley
  2022-10-22 22:25 ` [PATCH v2 0/3] Add some Glossary of terms information Philip Oakley
  5 siblings, 1 reply; 45+ messages in thread
From: Junio C Hamano @ 2022-07-09 21:34 UTC (permalink / raw)
  To: Philip Oakley via GitGitGadget; +Cc: git, Philip Oakley

"Philip Oakley via GitGitGadget" <gitgitgadget@gmail.com> writes:

> This short series looks to add the basics of the reachability bitmap and
> commit graph phrases to the glossary of terms. While these techniques are
> well known to their developers, for some, they are just magic phrases.
>
> The first patch [1/4] is to show OBD as an abbreviation to avoid a UNA [0]

Avoiding unnecessary TLAs is even better than avoiding UNAs.  As I didn't
see in the other three patches that we need to use the ODB acronym,
perhaps we can omit this step?

> Patch [2/4] provides a basic statement for the Commit-Graph's purpose.
>
> Patch [3/4] provides a similar statement for the reachability bitmaps.
>
> These two patches maybe misses out on some linking information as to the
> benefits these have and the basics of their heuristic.
>
> Patch [4/4] follows up on a bug report about the lack of idempotence for the
> `--renormalise' command. See commit message for details.
>
> [0] UNA Un-Named Abbreviation.
>
> Signed-off-by: Philip Oakley philipoakley@iee.email
>
> Philip Oakley (4):
>   glossary: add Object DataBase (ODB) abbreviation
>   glossary: add commit graph description
>   glossary: add reachability bitmap description
>   doc add: renormalize is not idempotent for CRCRLF
>
>  Documentation/git-add.txt          |  3 ++-
>  Documentation/glossary-content.txt | 15 ++++++++++++++-
>  2 files changed, 16 insertions(+), 2 deletions(-)
>
>
> base-commit: 30cc8d0f147546d4dd77bf497f4dec51e7265bd8
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1282%2FPhilipOakley%2FGlossary_terms-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1282/PhilipOakley/Glossary_terms-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1282


* Re: [PATCH 4/4] doc add: renormalize is not idempotent for CRCRLF
  2022-07-09 16:56 ` [PATCH 4/4] doc add: renormalize is not idempotent for CRCRLF Philip Oakley via GitGitGadget
  2022-07-09 21:06   ` Junio C Hamano
@ 2022-07-10  7:48   ` Torsten Bögershausen
  2022-07-10 22:09     ` Philip Oakley
  1 sibling, 1 reply; 45+ messages in thread
From: Torsten Bögershausen @ 2022-07-10  7:48 UTC (permalink / raw)
  To: Philip Oakley via GitGitGadget; +Cc: git, Philip Oakley

On Sat, Jul 09, 2022 at 04:56:21PM +0000, Philip Oakley via GitGitGadget wrote:
> From: Philip Oakley <philipoakley@iee.email>
>
> Bug report
>  https://lore.kernel.org/git/AM0PR02MB56357CC96B702244F3271014E8DC9@AM0PR02MB5635.eurprd02.prod.outlook.com/
> noted that a file containing /r/r/n needed renormalising twice.
>
> This is by design. Lone CR characters, not paired with an LF, are left
> unchanged.

This is all fine.

> Note the lack of idempotentness of the "clean" filter in the
> documentation.

The clean filter is idempotent, I would claim, see below.
You can run it, and re-run, and re-run, there will be no other changes.
CRLF in the worktree will become LF in the repo,
'lone CR' stay as they are.
In that sense, CRCRLF in the worktree will become CRLF in the repo.
You can then renormalize again and again.

The "trick" is that the user has to decide what CRCRLF means and what
should happen in the repo:
a) CRCRLF in the worktree becomes one line ending (one LF in the repo)
or
b) CRCRLF in the worktree becomes two line endings (LFLF in the repo)

For a) you can use dos2unix twice.
Or run `git add --renormalize` followed by
`rm git.bdf`
`git restore .`

The thing is that we used a combination of different commands
$ git add --renormalize .
$ git commit -m "Renormalize bdf.txt"
$ rm git.bdf
$ git restore .
$ git add --renormalize .
$ git commit -m "Renormalize a second time bdf.txt"

... to clean up this very situation.

And if CRCRLF should have become LFLF instead?
Probably a Python script is needed to fix this.
(or some other script/program in the language of your choice)
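
A minimal sketch of such a script, in Python, could look like the following.
The function name and the `line_endings` parameter are hypothetical, and it
only touches CR-CR-LF sequences; ordinary CR-LF pairs are left for Git's own
conversion.

```python
def rewrite_crcrlf(data: bytes, line_endings: int = 1) -> bytes:
    """Replace each CR-CR-LF with one LF (line_endings=1) or with LF-LF
    (line_endings=2). Everything else is copied through unchanged."""
    if line_endings not in (1, 2):
        raise ValueError("line_endings must be 1 or 2")
    return data.replace(b"\r\r\n", b"\n" * line_endings)


sample = b"one\r\r\ntwo\r\nthree\n"
as_single = rewrite_crcrlf(sample)                  # CR-CR-LF -> LF
as_double = rewrite_crcrlf(sample, line_endings=2)  # CR-CR-LF -> LF-LF

print(as_single)
print(as_double)
```

To apply it to a file, read the file in binary mode, pass the bytes through
`rewrite_crcrlf`, and write the result back in binary mode, so that Python
does not do its own newline translation.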

We could argue that
`git add --renormalize` is idempotent, but a series of carefully crafted
commands is not.
In short, what is missing is documentation of how CRCRLF is handled by
Git.
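
The difference between re-running the add and running the whole command
sequence can be modelled in a few lines of Python. This is a sketch under
two assumptions: the clean step drops the CR from each CR-LF pair, and
restoring a blob that already contains CR-LF pairs writes it to the worktree
unchanged. All function names are hypothetical.

```python
def clean_crlf(data: bytes) -> bytes:
    """Assumed clean step: CR-LF -> LF, any other CR is kept."""
    return data.replace(b"\r\n", b"\n")


def add_renormalize(worktree: bytes) -> bytes:
    """Model of `git add --renormalize`: returns the blob stored in the index."""
    return clean_crlf(worktree)


def restore(blob: bytes) -> bytes:
    """Model of `rm` followed by `git restore`. Assumption: a blob that
    already contains CR-LF pairs reaches the worktree byte for byte."""
    return blob


worktree = b"text\r\r\n"

# Re-running the add on an untouched worktree always gives the same blob.
blob_first = add_renormalize(worktree)
blob_again = add_renormalize(worktree)

# Add, restore, add: the restored worktree now holds CR-LF, which cleans to LF.
blob_after_restore = add_renormalize(restore(blob_first))

print(blob_first, blob_again, blob_after_restore)
```

In this model the single command is repeatable on the same input, while the
add/restore/add sequence feeds the command its own output and so produces a
different blob.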

>
> Renormalize was introduced at 9472935d81e (add: introduce "--renormalize",
> Torsten Bögershausen, 2017-11-16)
>
> Signed-off-by: Philip Oakley <philipoakley@iee.email>
> ---
>  Documentation/git-add.txt | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
> index 11eb70f16c7..c4a5ad11a6b 100644
> --- a/Documentation/git-add.txt
> +++ b/Documentation/git-add.txt
> @@ -188,7 +188,8 @@ for "git add --no-all <pathspec>...", i.e. ignored removed files.
>  	forcibly add them again to the index.  This is useful after
>  	changing `core.autocrlf` configuration or the `text` attribute
>  	in order to correct files added with wrong CRLF/LF line endings.
> -	This option implies `-u`.

> +	This option implies `-u`. Lone CR characters are untouched, so
> +	cleaning not idempotent. A CRCRLF sequence cleans to CRLF.

How about this:

This option implies `-u`. Lone CR characters are untouched. CRCRLF cleans to CRLF.



* Re: [PATCH 0/4] Add some Glossary terms, and extra renormalize information.
  2022-07-09 21:34 ` [PATCH 0/4] Add some Glossary terms, and extra renormalize information Junio C Hamano
@ 2022-07-10 15:20   ` Philip Oakley
  0 siblings, 0 replies; 45+ messages in thread
From: Philip Oakley @ 2022-07-10 15:20 UTC (permalink / raw)
  To: Junio C Hamano, Philip Oakley via GitGitGadget; +Cc: git

Hi Junio,

On 09/07/2022 22:34, Junio C Hamano wrote:
>> The first patch [1/4] is to show OBD as an abbreviation to avoid a UNA [0]
> Avoiding unnecessary TLA is even better than avoiding.  As I didn't
> see in the other three patches that we need to use the OBD acronym,
> perhaps we can omit this step?
>
This came from seeing `ODB` in a couple of tech docs (commit-graph and
parallel-checkout) and an 'odb' option in pack-redundant, which I should
have noted in the commit message.

I'll update the commit message to clarify that we are using that TLA,
though we aren't always consistent in our distinctions between concepts
and implementation in many places (e.g. Object Store vs Repository;
Staging area..; etc.)

TLAs, UNAs, etc. have been a bug-bear of mine from doing large
engineering collaborations.

--
Philip

(for completeness;-)
ODB Object Data Base.
TLA Three Letter Abbreviation.
UNA Un-Named Abbreviation.


* Re: [PATCH 2/4] glossary: add commit graph description
  2022-07-09 21:20   ` Junio C Hamano
@ 2022-07-10 21:37     ` Philip Oakley
  2022-08-30 14:33       ` Philip Oakley
  0 siblings, 1 reply; 45+ messages in thread
From: Philip Oakley @ 2022-07-10 21:37 UTC (permalink / raw)
  To: Junio C Hamano, Philip Oakley via GitGitGadget; +Cc: git

Hi Junio,

On 09/07/2022 22:20, Junio C Hamano wrote:
> "Philip Oakley via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> +[[def_commit_graph]]commit graph::
>> +	The commit-graph file is a supplemental data structure that
>> +	accelerates commit graph walks. The existing Object Data Base (ODB)
>> +	is the definitive commit graph. The "commit-graph" file is stored
>> +	either in the .git/objects/info directory or in the info directory
>> +	of an alternate object database.
> While it says nothing technically incorrect, I suspect "The existing
> object data base is the definitive commit graph" may invite unneeded
> confusion.

I probably over-shortened the original text I was summarising
(technical/commit-graph.txt intro).
>
> I think you wanted to say that the DAG formed by traversing the
> pointers recorded in the objects is the authoritative source of
> truth and the commit-graph file is merely a precomputed cache
.. of that graph. *nod*
>  and
> can be safely lost, 

I wasn't particularly thinking of that aspect .. Perhaps more that it
accelerates commit graph walks..
> but I am not sure the above description conveys
> that to anybody who does not already know it.
>
>     The commits in the object data base form a directed acyclic
>     graph (DAG) by commits referring to their parent commits.
>     Pieces of information from individual commit objects that are
>     needed to traverse the DAG are pre-computed in the commit-graph
>     file and stored in ...
>
> is my attempt---I am not very happy or proud about it, but it may be
> easier to follow.

I wanted to keep the graph file definition separate from the rather
fuzzy relationship between the overall ODB (staging area, and loads of
other stuff) and the way the DAG is generated, which also needs the
selected refs to start the traversal..

In a wider context, it's not clear to me just how the commit graph file
content is chosen relative to the full depth DAG from all local refs.
The reachability bit maps have a similar info gap.

--
Philip

[sorry for erratic responses - currently isolating with covid]



* Re: [PATCH 4/4] doc add: renormalize is not idempotent for CRCRLF
  2022-07-09 21:06   ` Junio C Hamano
@ 2022-07-10 21:52     ` Philip Oakley
  2022-07-10 22:04       ` Junio C Hamano
  0 siblings, 1 reply; 45+ messages in thread
From: Philip Oakley @ 2022-07-10 21:52 UTC (permalink / raw)
  To: Junio C Hamano, Philip Oakley via GitGitGadget; +Cc: git

On 09/07/2022 22:06, Junio C Hamano wrote:
> "Philip Oakley via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> From: Philip Oakley <philipoakley@iee.email>
>>
>> Bug report
>>  https://lore.kernel.org/git/AM0PR02MB56357CC96B702244F3271014E8DC9@AM0PR02MB5635.eurprd02.prod.outlook.com/
>> noted that a file containing /r/r/n needed renormalising twice.
> Did you mean backslash, not forward?

Correct. Too many years of Windows.
>
>> This is by design. Lone CR characters, not paired with an LF, are left
>> unchanged. Note the lack of idempotentness of the "clean" filter in the
>> documentation.
> OK.
>
>
>> Renormalize was introduced at 9472935d81e (add: introduce "--renormalize",
>> Torsten Bögershausen, 2017-11-16)
> Does this need to be said "HERE", rather than leaving it to run "git
> blame" for those who became curious?

It was a misguided reminder to cc Torsten about his recollection of the
CRCRLF issue. I'll remove it. I see Torsten has also commented.
>
>> Signed-off-by: Philip Oakley <philipoakley@iee.email>
>> ---
>>  Documentation/git-add.txt | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
>> index 11eb70f16c7..c4a5ad11a6b 100644
>> --- a/Documentation/git-add.txt
>> +++ b/Documentation/git-add.txt
>> @@ -188,7 +188,8 @@ for "git add --no-all <pathspec>...", i.e. ignored removed files.
>>  	forcibly add them again to the index.  This is useful after
>>  	changing `core.autocrlf` configuration or the `text` attribute
>>  	in order to correct files added with wrong CRLF/LF line endings.
>> -	This option implies `-u`.
>> +	This option implies `-u`. Lone CR characters are untouched, so
>> +	cleaning *^* not idempotent. A CRCRLF sequence cleans to CRLF.
> Lack of verb BE somewhere. 
'^' It took me three re-reads to see my mistyping as my head knew what
I'd meant to write, I've marked above as a note to self.
Aside: Are there any guides / suggestions / how-to's for on-line
reviewing that you can recommend o
> Do we expect our readers all understand the math-y word? 
Ok. It's mainly used in the test directory, and fsmonitor.h, but not in
the user docs.

>  It is not
> too hard to explain it to math-uninitiated, e.g.
>
>     This option implies `-u`.  Note that running renormalize again
>     on the result of running renormalize may make it even "more
>     normal".  A CR-CR-LF sequence would first renormalize to CR-LF
>     (the first CR, a lone CR, is left intact, and CR-LF that follows
>     normalizes to LF).  If you run renormalize again, the resulting
>     CR-LF will normalize down to LF.
>
Torsten had a shorter suggestion I'll also look at.

Philip


* Re: [PATCH 4/4] doc add: renormalize is not idempotent for CRCRLF
  2022-07-10 21:52     ` Philip Oakley
@ 2022-07-10 22:04       ` Junio C Hamano
  2022-07-10 22:25         ` Philip Oakley
  0 siblings, 1 reply; 45+ messages in thread
From: Junio C Hamano @ 2022-07-10 22:04 UTC (permalink / raw)
  To: Philip Oakley; +Cc: Philip Oakley via GitGitGadget, git

Philip Oakley <philipoakley@iee.email> writes:

>>> +	This option implies `-u`. Lone CR characters are untouched, so
>>> +	cleaning *^* not idempotent. A CRCRLF sequence cleans to CRLF.
>> Lack of verb BE somewhere. 
> '^' It took me three re-reads to see my mistyping as my head knew what
> I'd meant to write, I've marked above as a note to self.
> Aside: Are there any guides / suggestions / how-to's for on-line
> reviewing that you can recommend o

Sorry, but I do not know of any good "trick" to fight against our
common tendency to easily miss trivial typos and thinkos in what we
ourselves wrote.  We can be surprisingly blind to what a colleague
can spot immediately, and that is why it helps to have a thorough
read-through by a reviewer with fresh eyes.  When I was a more
prolific contributor, I sometimes tried to read aloud what I wrote
to myself, both docs and code, and caught silly mistakes before
sending them out to the list, but I do not recommend it to others.


* Re: [PATCH 4/4] doc add: renormalize is not idempotent for CRCRLF
  2022-07-10  7:48   ` Torsten Bögershausen
@ 2022-07-10 22:09     ` Philip Oakley
  2022-08-05 22:26       ` Junio C Hamano
  0 siblings, 1 reply; 45+ messages in thread
From: Philip Oakley @ 2022-07-10 22:09 UTC (permalink / raw)
  To: Torsten Bögershausen, Philip Oakley via GitGitGadget; +Cc: git

Hi Torsten,
Thanks for the reply and comments.

On 10/07/2022 08:48, Torsten Bögershausen wrote:
> On Sat, Jul 09, 2022 at 04:56:21PM +0000, Philip Oakley via GitGitGadget wrote:
>> From: Philip Oakley <philipoakley@iee.email>
>>
>> Bug report
>>  https://lore.kernel.org/git/AM0PR02MB56357CC96B702244F3271014E8DC9@AM0PR02MB5635.eurprd02.prod.outlook.com/
>> noted that a file containing /r/r/n needed renormalising twice.
>>
>> This is by design. Lone CR characters, not paired with an LF, are left
>> unchanged.
> This is all fine.
>
>> Note the lack of idempotentness of the "clean" filter in the
>> documentation.
> The clean filter is idempotent, I would claim, see below.

I'd disagree, on the basis that any second 'idempotent' cleaning should
not change the file content at all. The need for a second clean was the
surprise the user had.
> You can run it, and re-run, and re-run, there will no other changes.
> CRLF in the worktree will become LF in the repo,
> 'lone CR' stay as they are.
> In that sense, CRCRLF in the worktree will become CRLF in the repo.
So  the output isn't normalised, and warning messages ensue (if enabled,
etc)
> You can the renormalize again and again.
>
> The "trick" is that the user has to decide what CRCRLF mean and what
> should happen in the repo
.. for which they should be forewarned of the issue.
In this case it was a large repository transfer of legacy data, with
little knowledge of how the double CRs occurred, but it was a real issue
for them.


> :
> CRCRLF in the worktree becomes one line ending (one LF in the repo)
> or
> CRCRLF in the worktree becomes two line endings ( LFLF in the repo)
>
> For a) you can use dos2unix twice.
> Or run `git add --renormalize` followed by
> `rm git.bdf`
> `git restore .`
>
> The thing is that we used a combination of different commands
> $ git add --renormalize .
> $ git commit -m "Renormalize bdf.txt"
> $ rm git.bdf
> $ git restore .
> $ git add --renormalize .
> $ git commit -m "Renormalize a second time bdf.txt"
>
> ... to clean up this very situation.
>
> And, if CRCRLF should have become LFLF instead ?
> Probably a python script is needed to fix this.
> (or some other script/program in the language of your choice)
>
> We could argue that
> `git add --renormalize` is idempotent, but a series of carefully crafted
> commands is not.

> In short, what is missing is the documentation how CRCRLF is handled by
> Git.
*nod*
>
>> Renormalize was introduced at 9472935d81e (add: introduce "--renormalize",
>> Torsten Bögershausen, 2017-11-16)
>>
>> Signed-off-by: Philip Oakley <philipoakley@iee.email>
>> ---
>>  Documentation/git-add.txt | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
>> index 11eb70f16c7..c4a5ad11a6b 100644
>> --- a/Documentation/git-add.txt
>> +++ b/Documentation/git-add.txt
>> @@ -188,7 +188,8 @@ for "git add --no-all <pathspec>...", i.e. ignored removed files.
>>  	forcibly add them again to the index.  This is useful after
>>  	changing `core.autocrlf` configuration or the `text` attribute
>>  	in order to correct files added with wrong CRLF/LF line endings.
>> -	This option implies `-u`.
>> +	This option implies `-u`. Lone CR characters are untouched, so
>> +	cleaning not idempotent. A CRCRLF sequence cleans to CRLF.
> How about this:
>
> This option implies `-u`. Lone CR characters are untouched. CRCRLF cleans to CRLF.
That is probably sufficient. It drops the awkward 'idempotent'. And
indicates this edge case, though doesn't highlight that the resultant
CRLF still leaves the file only partially renormalised.

I'll reword.
>


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 4/4] doc add: renormalize is not idempotent for CRCRLF
  2022-07-10 22:04       ` Junio C Hamano
@ 2022-07-10 22:25         ` Philip Oakley
  0 siblings, 0 replies; 45+ messages in thread
From: Philip Oakley @ 2022-07-10 22:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Philip Oakley via GitGitGadget, git

On 10/07/2022 23:04, Junio C Hamano wrote:
> Philip Oakley <philipoakley@iee.email> writes:
>
>>>> +	This option implies `-u`. Lone CR characters are untouched, so
>>>> +	cleaning *^* not idempotent. A CRCRLF sequence cleans to CRLF.
>>> Lack of verb BE somewhere. 
>> '^' It took me three re-reads to see my mistyping as my head knew what
>> I'd meant to write, I've marked above as a note to self.
>> Aside: Are there any guides / suggestions / how-to's for on-line
>> reviewing that you can recommend o
> Sorry, but I do not know of any good "trick" to fight against our
> common tendency to easily miss trivial typoes and thinkos in what we
> ourselves wrote.  We can be surprisingly blind to what a colleague
> can spot immediately, and that is why it helps to have a thorough
> read-through by a reviewer with fresh eyes.  When I was a more
> prolific contributor, I sometimes tried to read aloud what I wrote
> to myself, both docs and code, and caught silly mistakes before
> sending them out to the list, but I do not recommend it to others.

Thanks. There does appear to be a lack of literature or articles in this
area of on-list reviewing.

I've not even seen a list of snippets collated from email advice,
other than the email etiquette 'starter for ten' of "don't top post" ;-)

--
Philip

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 4/4] doc add: renormalize is not idempotent for CRCRLF
  2022-07-10 22:09     ` Philip Oakley
@ 2022-08-05 22:26       ` Junio C Hamano
  2022-08-06 19:22         ` Torsten Bögershausen
                           ` (2 more replies)
  0 siblings, 3 replies; 45+ messages in thread
From: Junio C Hamano @ 2022-08-05 22:26 UTC (permalink / raw)
  To: Philip Oakley
  Cc: Torsten Bögershausen, Philip Oakley via GitGitGadget, git

Philip Oakley <philipoakley@iee.email> writes:

>> How about this:
>>
>> This option implies `-u`. Lone CR characters are untouched. CRCRLF cleans to CRLF.
> That is probably sufficient. It drops the awkward 'idempotent'. And
> indicates this edge case, though doesn't highlight that the resultant
> CRLF still leaves the file only partially renormalised.
>
> I'll reword.

It's been a few weeks since the last activity on this topic.
Anything you guys need unblocked to move forward?

Thanks.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 4/4] doc add: renormalize is not idempotent for CRCRLF
  2022-08-05 22:26       ` Junio C Hamano
@ 2022-08-06 19:22         ` Torsten Bögershausen
  2022-08-08 14:32         ` Philip Oakley
  2022-08-10 14:44         ` [PATCH v2 0/1] .. Add extra renormalize information Philip Oakley
  2 siblings, 0 replies; 45+ messages in thread
From: Torsten Bögershausen @ 2022-08-06 19:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Philip Oakley, Philip Oakley via GitGitGadget, git

On Fri, Aug 05, 2022 at 03:26:10PM -0700, Junio C Hamano wrote:
> Philip Oakley <philipoakley@iee.email> writes:
>
> >> How about this:
> >>
> >> This option implies `-u`. Lone CR characters are untouched. CRCRLF cleans to CRLF.
> > That is probably sufficient. It drops the awkward 'idempotent'. And
> > indicates this edge case, though doesn't highlight that the resultant
> > CRLF still leaves the file only partially renormalised.
> >
> > I'll reword.
>
> It's been a few weeks since the last activity on this topic.
> Anything you guys need unblocked to move forward?
>
> Thanks.
>

Not from my point of view. My understanding is that the short version is OK for
everybody:

This option implies `-u`. Lone CR characters are untouched. CRCRLF cleans to CRLF.

Is it OK to ask you for a local amend to push this further?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 4/4] doc add: renormalize is not idempotent for CRCRLF
  2022-08-05 22:26       ` Junio C Hamano
  2022-08-06 19:22         ` Torsten Bögershausen
@ 2022-08-08 14:32         ` Philip Oakley
  2022-08-08 16:21           ` Junio C Hamano
  2022-08-09 18:44           ` Torsten Bögershausen
  2022-08-10 14:44         ` [PATCH v2 0/1] .. Add extra renormalize information Philip Oakley
  2 siblings, 2 replies; 45+ messages in thread
From: Philip Oakley @ 2022-08-08 14:32 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Torsten Bögershausen, Philip Oakley via GitGitGadget, git

I've unfortunately had some family issue which prevented me doing any work.

If I haven't managed anything by the end of the week, I'd be happy if
others took it forward.

On 05/08/2022 23:26, Junio C Hamano wrote:
> Philip Oakley <philipoakley@iee.email> writes:
>
>>> How about this:
>>>
>>> This option implies `-u`. Lone CR characters are untouched. CRCRLF cleans to CRLF.
>> That is probably sufficient. It drops the awkward 'idempotent'. And
>> indicates this edge case, though doesn't highlight that the resultant
>> CRLF still leaves the file only partially renormalised.
>>
>> I'll reword.
> It's been a few weeks since the last activity on this topic.
> Anything you guys need unblocked to move forward?
>
> Thanks.
>


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 4/4] doc add: renormalize is not idempotent for CRCRLF
  2022-08-08 14:32         ` Philip Oakley
@ 2022-08-08 16:21           ` Junio C Hamano
  2022-08-09 18:44           ` Torsten Bögershausen
  1 sibling, 0 replies; 45+ messages in thread
From: Junio C Hamano @ 2022-08-08 16:21 UTC (permalink / raw)
  To: Philip Oakley
  Cc: Torsten Bögershausen, Philip Oakley via GitGitGadget, git

Philip Oakley <philipoakley@iee.email> writes:

> I've unfortunately had some family issue which prevented me doing any work.

I hope everything will be well on your side.

> If I haven't managed anything by the end of the week, I'd be happy if
> others took it forward.

Thanks for letting us know.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 4/4] doc add: renormalize is not idempotent for CRCRLF
  2022-08-08 14:32         ` Philip Oakley
  2022-08-08 16:21           ` Junio C Hamano
@ 2022-08-09 18:44           ` Torsten Bögershausen
  1 sibling, 0 replies; 45+ messages in thread
From: Torsten Bögershausen @ 2022-08-09 18:44 UTC (permalink / raw)
  To: Philip Oakley; +Cc: Junio C Hamano, Philip Oakley via GitGitGadget, git

On Mon, Aug 08, 2022 at 03:32:35PM +0100, Philip Oakley wrote:
> I've unfortunately had some family issue which prevented me doing any work.

That is sad to hear. I hope that things are getting better in one way or another.

>
> If I haven't managed anything by the end of the week, I'd be happy if
> others took it forward.

I can certainly have a look, after the weekend, and continue your work.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v2 0/1] .. Add extra renormalize information.
  2022-08-05 22:26       ` Junio C Hamano
  2022-08-06 19:22         ` Torsten Bögershausen
  2022-08-08 14:32         ` Philip Oakley
@ 2022-08-10 14:44         ` Philip Oakley
  2022-08-10 14:44           ` [PATCH v2 1/1] doc add: renormalize is not idempotent for CRCRLF Philip Oakley
  2 siblings, 1 reply; 45+ messages in thread
From: Philip Oakley @ 2022-08-10 14:44 UTC (permalink / raw)
  To: gitster; +Cc: git, gitgitgadget, philipoakley, tboegi

This was [PATCH 4/4] 'doc add: renormalize is not idempotent for CRCRLF'
of a GitGitGadget series, which was split off into its own branch
po/doc-add-renormalize

Since V1, remove the use of 'idempotent' which is unknown to many. Instead
clarify the special case.

Philip Oakley (1):
  doc add: renormalize is not idempotent for CRCRLF

 Documentation/git-add.txt | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

-- 
2.37.1.windows.1


^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v2 1/1] doc add: renormalize is not idempotent for CRCRLF
  2022-08-10 14:44         ` [PATCH v2 0/1] .. Add extra renormalize information Philip Oakley
@ 2022-08-10 14:44           ` Philip Oakley
  2022-08-10 17:11             ` Torsten Bögershausen
  2022-08-10 17:42             ` Junio C Hamano
  0 siblings, 2 replies; 45+ messages in thread
From: Philip Oakley @ 2022-08-10 14:44 UTC (permalink / raw)
  To: gitster; +Cc: git, gitgitgadget, philipoakley, tboegi

Bug report
 https://lore.kernel.org/git/AM0PR02MB56357CC96B702244F3271014E8DC9@AM0PR02MB5635.eurprd02.prod.outlook.com/
noted that a file containing \r\r\n needed renormalising twice.

This is by design. Lone CR characters, not paired with an LF, are left
unchanged. Note this limitation of the "clean" filter in the documentation.

Renormalize was introduced at 9472935d81e (add: introduce "--renormalize",
Torsten Bögershausen, 2017-11-16)

Signed-off-by: Philip Oakley <philipoakley@iee.email>
---
This is V2 of po/doc-add-renormalize, based on commit dc8c8deaa6
(Prepare for 2.36.2, 2022-06-07).
It was [PATCH 4/4] doc add: renormalize is not idempotent for CRCRLF.

git send-email \
    --in-reply-to=xmqq5yj6z5rx.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=philipoakley@iee.email \
    --cc=tboegi@web.de \
    v2-00*
---
 Documentation/git-add.txt | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
index 11eb70f16c..9b37f35654 100644
--- a/Documentation/git-add.txt
+++ b/Documentation/git-add.txt
@@ -188,7 +188,9 @@ for "git add --no-all <pathspec>...", i.e. ignored removed files.
 	forcibly add them again to the index.  This is useful after
 	changing `core.autocrlf` configuration or the `text` attribute
 	in order to correct files added with wrong CRLF/LF line endings.
-	This option implies `-u`.
+	This option implies `-u`. Lone CR characters are untouched, thus
+	while a CRLF cleans to LF, a CRCRLF sequence is only partially
+	cleaned to CRLF.
 
 --chmod=(+|-)x::
 	Override the executable bit of the added files.  The executable
-- 
2.37.1.windows.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 1/1] doc add: renormalize is not idempotent for CRCRLF
  2022-08-10 14:44           ` [PATCH v2 1/1] doc add: renormalize is not idempotent for CRCRLF Philip Oakley
@ 2022-08-10 17:11             ` Torsten Bögershausen
  2022-08-10 17:42             ` Junio C Hamano
  1 sibling, 0 replies; 45+ messages in thread
From: Torsten Bögershausen @ 2022-08-10 17:11 UTC (permalink / raw)
  To: Philip Oakley; +Cc: gitster, git, gitgitgadget

[]

> diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
> index 11eb70f16c..9b37f35654 100644
> --- a/Documentation/git-add.txt
> +++ b/Documentation/git-add.txt
> @@ -188,7 +188,9 @@ for "git add --no-all <pathspec>...", i.e. ignored removed files.
>  	forcibly add them again to the index.  This is useful after
>  	changing `core.autocrlf` configuration or the `text` attribute
>  	in order to correct files added with wrong CRLF/LF line endings.
> -	This option implies `-u`.
> +	This option implies `-u`. Lone CR characters are untouched, thus
> +	while a CRLF cleans to LF, a CRCRLF sequence is only partially
> +	cleaned to CRLF.

Thanks, I think this one looks good to me.
Reviewed-by: Torsten Bögershausen <tboegi@web.de>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 1/1] doc add: renormalize is not idempotent for CRCRLF
  2022-08-10 14:44           ` [PATCH v2 1/1] doc add: renormalize is not idempotent for CRCRLF Philip Oakley
  2022-08-10 17:11             ` Torsten Bögershausen
@ 2022-08-10 17:42             ` Junio C Hamano
  1 sibling, 0 replies; 45+ messages in thread
From: Junio C Hamano @ 2022-08-10 17:42 UTC (permalink / raw)
  To: Philip Oakley; +Cc: git, gitgitgadget, tboegi

Philip Oakley <philipoakley@iee.email> writes:

> diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
> index 11eb70f16c..9b37f35654 100644
> --- a/Documentation/git-add.txt
> +++ b/Documentation/git-add.txt
> @@ -188,7 +188,9 @@ for "git add --no-all <pathspec>...", i.e. ignored removed files.
>  	forcibly add them again to the index.  This is useful after
>  	changing `core.autocrlf` configuration or the `text` attribute
>  	in order to correct files added with wrong CRLF/LF line endings.
> -	This option implies `-u`.
> +	This option implies `-u`. Lone CR characters are untouched, thus
> +	while a CRLF cleans to LF, a CRCRLF sequence is only partially
> +	cleaned to CRLF.

Looks perfectly readable and understandable to me.

Thanks, will replace.  Let's plan to merge it down soonish.
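
For readers following the CRCRLF discussion, the behaviour described by the
new wording can be modelled in a few lines of Python. This is only a
simplified sketch, not Git's actual conversion code: `clean_crlf` is a
hypothetical helper that turns each CRLF pair into LF and leaves any CR not
directly followed by LF untouched.

```python
def clean_crlf(data: bytes) -> bytes:
    """Simplified model of a CRLF-to-LF "clean" step (hypothetical helper).

    Each CRLF pair becomes LF; a CR that is not immediately followed
    by LF is left as it is.
    """
    return data.replace(b"\r\n", b"\n")


# A worktree file with one CRCRLF line ending and one plain CRLF.
worktree = b"line one\r\r\nline two\r\n"

once = clean_crlf(worktree)   # CRCRLF -> CRLF, CRLF -> LF
twice = clean_crlf(once)      # the leftover CRLF -> LF

print(once)
print(twice)
```

Under this model the first pass leaves a CRLF behind on the CRCRLF line, and
only a second pass reaches plain LF, which is why a single pass is described
as a partial clean.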



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/4] glossary: add commit graph description
  2022-07-10 21:37     ` Philip Oakley
@ 2022-08-30 14:33       ` Philip Oakley
  0 siblings, 0 replies; 45+ messages in thread
From: Philip Oakley @ 2022-08-30 14:33 UTC (permalink / raw)
  To: Junio C Hamano, Philip Oakley via GitGitGadget; +Cc: git

On 10/07/2022 22:37, Philip Oakley wrote:
> Hi Junio,
>
> On 09/07/2022 22:20, Junio C Hamano wrote:
>> "Philip Oakley via GitGitGadget" <gitgitgadget@gmail.com> writes:
>>
>>> +[[def_commit_graph]]commit graph::
>>> +	The commit-graph file is a supplemental data structure that
>>> +	accelerates commit graph walks. The existing Object Data Base (ODB)
>>> +	is the definitive commit graph. The "commit-graph" file is stored
>>> +	either in the .git/objects/info directory or in the info directory
>>> +	of an alternate object database.
>> While it says nothing technically incorrect, I suspect "The existing
>> object data base is the definitive commit graph" may invite unneeded
>> confusion.
> I probably over-shortened the original text I was summarising
> (technical/commit-graph.txt intro).

I was looking to outline the concept and how, therefore, it is different
from the reachability bitmaps, and also from the 'canonical' DAG of
Git's commit objects.

I hope to have another go in a couple of weeks time.

>> I think you wanted to say that the DAG formed by traversing the
>> pointers recorded in the objects is the authoritative source of
>> truth and the commit-graph file is merely a precomputed cache
> .. of that graph. *nod*
>>  and
>> can be safely lost, 
> I wasn't particularly thinking of that aspect .. Perhaps more that it
> accelerates commit graph walks..
>> but I am not sure the above description conveys
>> that to anybody who does not already know it.
>>
>>     The commits in the object data base form a directed acyclic
>>     graph (DAG) by commits referring to their parent commits.
>>     Pieces of information from individual commit objects that are
>>     needed to traverse the DAG are pre-computed in the commit-graph
>>     file and stored in ...
>>
>> is my attempt---I am not very happy or proud about it, but it may be
>> easier to follow.
> I wanted to keep separate from the graph file definition, the rather
> fuzzy relationship between the overall ODB (staging area, and loads of
> other stuff), and the way the DAG is generated, which also needs the
> selected refs to start the traverse..
>
> In a wider context, it's not clear to me just how the commit graph file
> content is chosen relative to the full depth DAG from all local refs.
> The reachability bit maps have a similar info gap.
>
> --
> Philip
>
> [sorry for erratic responses - currently isolating with covid]
>
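
The distinction discussed in this message, between the authoritative DAG
formed by commits pointing at their parents and a precomputed cache of the
data needed to walk it, can be sketched in Python. Everything here is a toy
assumption for illustration: the `odb` dictionary, the `Commit` class, and
the `build_cache` and `is_ancestor` helpers are hypothetical names and do not
reflect the on-disk commit-graph format. The cache stores parents plus a
generation number (1 for a root commit, otherwise one more than the highest
parent generation).

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    oid: str
    parents: tuple[str, ...]


# Toy "object database": the authoritative source of the commit DAG.
odb = {
    "a": Commit("a", ()),
    "b": Commit("b", ("a",)),
    "c": Commit("c", ("a",)),
    "d": Commit("d", ("b", "c")),  # merge of b and c
}


def reachable(odb, tip):
    """Walk parent pointers in the object database, starting from a tip."""
    seen, stack = set(), [tip]
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        stack.extend(odb[oid].parents)
    return seen


def build_cache(odb):
    """Precompute (parents, generation) per commit, derived only from odb.

    The cache holds nothing that cannot be recomputed, so it can be
    deleted and rebuilt at any time.
    """
    cache = {}

    def generation(oid):
        if oid not in cache:
            parents = odb[oid].parents
            gen = 1 + max((generation(p) for p in parents), default=0)
            cache[oid] = (parents, gen)
        return cache[oid][1]

    for oid in odb:
        generation(oid)
    return cache


def is_ancestor(cache, ancestor, tip):
    """Answer an ancestry query from the cache alone.

    A commit whose generation is not above the target's cannot have the
    target as an ancestor, so that part of the walk is skipped.
    """
    target_gen = cache[ancestor][1]
    seen, stack = set(), [tip]
    while stack:
        oid = stack.pop()
        if oid == ancestor:
            return True
        if oid in seen or cache[oid][1] <= target_gen:
            continue
        seen.add(oid)
        stack.extend(cache[oid][0])
    return False


cache = build_cache(odb)
print(sorted(reachable(odb, "d")))
print(is_ancestor(cache, "a", "d"), is_ancestor(cache, "b", "c"))
```

The point of the sketch is that the cache only speeds up walks; the object
database remains the definitive graph, and rebuilding the cache from it gives
the same result.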


^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v2 0/3] Add some Glossary of terms information
  2022-07-09 16:56 [PATCH 0/4] Add some Glossary terms, and extra renormalize information Philip Oakley via GitGitGadget
                   ` (4 preceding siblings ...)
  2022-07-09 21:34 ` [PATCH 0/4] Add some Glossary terms, and extra renormalize information Junio C Hamano
@ 2022-10-22 22:25 ` Philip Oakley
  2022-10-22 22:25   ` [PATCH v2 1/3] doc: use 'object database' not ODB or abbreviation Philip Oakley
                     ` (4 more replies)
  5 siblings, 5 replies; 45+ messages in thread
From: Philip Oakley @ 2022-10-22 22:25 UTC (permalink / raw)
  To: GitList; +Cc: Self, Junio C Hamano, Derrick Stolee, Abhradeep Chakraborty

was GitGitGadget #1282,
(in reply to <pull.1282.git.1657385781.gitgitgadget@gmail.com>)

This short series looks to add the basics of the reachability bitmap
and commit graph phrases to the glossary of terms. While these
techniques are well known to their developers, for some, they are
just magic phrases.

[V2] .. since V1
Patch 4/4 has been taken upstream independently, and hence dropped
here so we're now just [n/3].

Patch 1/3 Dropped the glossary addition in favour of changing the
locations that used ODB (Junio's suggestion). Kept the git
pack-redundant's `--alt-odb` but spelt out 'object database' in full
in the man page. The only remaining `odb`s are within `goodbye` ;-).

While here, add the (oid) abbreviation to its adjacent entry.

Patch 2/3 Split the 'commit-graph' explanation into two parts to
distinguish the speed-up option, from Git's core graph concept of
object traversal. Included links to existing terms.

Patch 3/3 Added links to existing terms. Statement for the
reachability bitmaps.

added cc: for Stolee (commit-graph) and Abhradeep Chakraborty
(Bitmaps) review.


[V1] [GGG PR #1282] 
https://lore.kernel.org/git/pull.1282.git.1657385781.gitgitgadget@gmail.com/

The first patch [1/4] is to show ODB as an abbreviation to avoid a UNA [0]

Patch [2/4] provides a basic statement for the Commit-Graph's purpose.

Patch [3/4] provides a similar statement for the reachability bitmaps.

These two patches may miss out on some linking information as to
the benefits these have and the basics of their heuristic.

Patch [4/4] follows up on a bug report about the lack of idempotence
for the `--renormalize` option. See commit message for details.

[0] UNA Un-Named Abbreviation.

Signed-off-by: Philip Oakley philipoakley@iee.email
cc: Philip Oakley philipoakley@iee.email


Philip Oakley (3):
  doc: use 'object database' not ODB or abbreviation
  glossary: add "commit graph" description
  glossary: add reachability bitmap description

 Documentation/git-pack-redundant.txt          |  2 +-
 Documentation/glossary-content.txt            | 27 +++++++++++++++++--
 Documentation/technical/commit-graph.txt      |  2 +-
 Documentation/technical/parallel-checkout.txt |  2 +-
 4 files changed, 28 insertions(+), 5 deletions(-)

Range-diff against v1:
1:  51b55828d5 ! 1:  dc0d934b00 glossary: add Object DataBase (ODB) abbreviation
    @@ Metadata
     Author: Philip Oakley <philipoakley@iee.email>
     
      ## Commit message ##
    -    glossary: add Object DataBase (ODB) abbreviation
    +    doc: use 'object database' not ODB or abbreviation
     
    -    ODB abbreviation is used in the technical section without expansion.
    -    Show the abbreviation in the Glossary.
    +    The abbreviation 'ODB' is used in the technical documentation
    +    sections for commit-graph and parallel-checkout, along with an
    +    'odb' option in `git-pack-redundant`, without expansion.
    +
    +    Use 'object database' in full, in those entries. The text has not
    +    been reflowed to keep the changes minimal.
    +
    +    While in the glossary for `object` terms, add the common `oid`
    +    abbreviation to its entry.
     
         Signed-off-by: Philip Oakley <philipoakley@iee.email>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    +
    + ## Documentation/git-pack-redundant.txt ##
    +@@ Documentation/git-pack-redundant.txt: OPTIONS
    + 
    + --alt-odb::
    + 	Don't require objects present in packs from alternate object
    +-	directories to be present in local packs.
    ++	database (odb) directories to be present in local packs.
    + 
    + --verbose::
    + 	Outputs some statistics to stderr. Has a small performance penalty.
     
      ## Documentation/glossary-content.txt ##
     @@ Documentation/glossary-content.txt: This commit is referred to as a "merge commit", or sometimes just a
    - 	<<def_SHA1,SHA-1>> of its contents. Consequently, an
    - 	object cannot be changed.
    - 
    --[[def_object_database]]object database::
    -+[[def_object_database]]object database (ODB)::
    - 	Stores a set of "objects", and an individual <<def_object,object>> is
      	identified by its <<def_object_name,object name>>. The objects usually
      	live in `$GIT_DIR/objects/`.
    + 
    +-[[def_object_identifier]]object identifier::
    ++[[def_object_identifier]]object identifier (oid)::
    + 	Synonym for <<def_object_name,object name>>.
    + 
    + [[def_object_name]]object name::
    +
    + ## Documentation/technical/commit-graph.txt ##
    +@@ Documentation/technical/commit-graph.txt: There are two main costs here:
    + 
    + The commit-graph file is a supplemental data structure that accelerates
    + commit graph walks. If a user downgrades or disables the 'core.commitGraph'
    +-config setting, then the existing ODB is sufficient. The file is stored
    ++config setting, then the existing object database is sufficient. The file is stored
    + as "commit-graph" either in the .git/objects/info directory or in the info
    + directory of an alternate.
    + 
    +
    + ## Documentation/technical/parallel-checkout.txt ##
    +@@ Documentation/technical/parallel-checkout.txt: Rejected Multi-Threaded Solution
    + 
    + The most "straightforward" implementation would be to spread the set of
    + to-be-updated cache entries across multiple threads. But due to the
    +-thread-unsafe functions in the ODB code, we would have to use locks to
    ++thread-unsafe functions in the object database code, we would have to use locks to
    + coordinate the parallel operation. An early prototype of this solution
    + showed that the multi-threaded checkout would bring performance
    + improvements over the sequential code, but there was still too much lock
2:  6a88bdb7ed ! 2:  77fbf889a5 glossary: add commit graph description
    @@ Metadata
     Author: Philip Oakley <philipoakley@iee.email>
     
      ## Commit message ##
    -    glossary: add commit graph description
    +    glossary: add "commit graph" description
    +
    +    Git has an additional "commit graph" capability that supplements the
    +    normal commit object's directed acyclic graph (DAG). The supplemental
    +    commit graph file is designed for speed of access.
    +
    +    Describe the commit graph both from the normative DAG view point and
    +    from the commit graph file perspective.
    +
    +    Also, clarify the link between the branch ref and branch tip
    +    by linking to the `ref` glossary entry, matching this commit graph
    +    entry.
     
         Signed-off-by: Philip Oakley <philipoakley@iee.email>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
      ## Documentation/glossary-content.txt ##
    +@@
    + [[def_branch]]branch::
    + 	A "branch" is a line of development.  The most recent
    + 	<<def_commit,commit>> on a branch is referred to as the tip of
    +-	that branch.  The tip of the branch is referenced by a branch
    ++	that branch.  The tip of the branch is <<def_ref,referenced>> by a branch
    + 	<<def_head,head>>, which moves forward as additional development
    + 	is done on the branch.  A single Git
    + 	<<def_repository,repository>> can track an arbitrary number of
     @@ Documentation/glossary-content.txt: state in the Git history, by creating a new commit representing the current
      state of the <<def_index,index>> and advancing <<def_HEAD,HEAD>>
      to point at the new commit.
      
    -+[[def_commit_graph]]commit graph::
    -+	The commit-graph file is a supplemental data structure that
    -+	accelerates commit graph walks. The existing Object Data Base (ODB)
    -+	is the definitive commit graph. The "commit-graph" file is stored
    ++[[def_commit_graph_general]]commit graph concept, representations and usage::
    ++	A synonym for the <<def_DAG,DAG>> structure formed by
    ++	the commits in the object database, <<def_ref,referenced>> by branch tips,
    ++	using their <<def_chain,chain>> of linked commits.
    ++	This structure is the definitive commit graph. The
    ++	graph can be represented in other ways, e.g. the
    ++	<<def_commit_graph_file,commit graph file>>.
    ++
    ++[[def_commit_graph_file]]commit graph file::
    ++	The commit-graph file is a supplemental representation of
    ++	the <<def_commit_graph_general,commit graph>> which accelerates
    ++	commit graph walks. The "commit-graph" file is stored
     +	either in the .git/objects/info directory or in the info directory
     +	of an alternate object database.
     +
3:  564de4c68f ! 3:  fde2c58153 glossary: add reachability bitmap description
    @@ Metadata
      ## Commit message ##
         glossary: add reachability bitmap description
     
    +    Describe the purpose of the reachability bitmap.
    +
         Signed-off-by: Philip Oakley <philipoakley@iee.email>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
      ## Documentation/glossary-content.txt ##
     @@ Documentation/glossary-content.txt: exclude;;
    @@ Documentation/glossary-content.txt: exclude;;
      	that they contain.
      
     +[[def_reachability_bitmap]]reachability bitmaps::
    -+	Reachability bitmaps store information about the set of objects in
    -+	a packfile, or a multi-pack index (MIDX). A repository may have at
    ++	Reachability bitmaps store information about the
    ++	<<def_reachable,reachability>> of a selected set of objects in
    ++	a packfile, or a multi-pack index (MIDX) to speed up object search.
    ++	A repository may have at
     +	most one bitmap. The bitmap may belong to either one pack, or the
     +	repository's multi-pack index (if it exists).
     +

-- 
2.38.1.windows.1


^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v2 1/3] doc: use 'object database' not ODB or abbreviation
  2022-10-22 22:25 ` [PATCH v2 0/3] Add some Glossary of terms information Philip Oakley
@ 2022-10-22 22:25   ` Philip Oakley
  2022-10-22 22:25   ` [PATCH v2 2/3] glossary: add "commit graph" description Philip Oakley
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 45+ messages in thread
From: Philip Oakley @ 2022-10-22 22:25 UTC (permalink / raw)
  To: GitList; +Cc: Self, Junio C Hamano, Derrick Stolee, Abhradeep Chakraborty

The abbreviation 'ODB' is used in the technical documentation
sections for commit-graph and parallel-checkout, along with an
'odb' option in `git-pack-redundant`, without expansion.

Use 'object database' in full, in those entries. The text has not
been reflowed to keep the changes minimal.

While in the glossary for `object` terms, add the common `oid`
abbreviation to its entry.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
---
 Documentation/git-pack-redundant.txt          | 2 +-
 Documentation/glossary-content.txt            | 2 +-
 Documentation/technical/commit-graph.txt      | 2 +-
 Documentation/technical/parallel-checkout.txt | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-pack-redundant.txt b/Documentation/git-pack-redundant.txt
index ee7034b5e5..1132c73956 100644
--- a/Documentation/git-pack-redundant.txt
+++ b/Documentation/git-pack-redundant.txt
@@ -34,7 +34,7 @@ OPTIONS
 
 --alt-odb::
 	Don't require objects present in packs from alternate object
-	directories to be present in local packs.
+	database (odb) directories to be present in local packs.
 
 --verbose::
 	Outputs some statistics to stderr. Has a small performance penalty.
diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index aa2f41f5e7..947ac49606 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -262,7 +262,7 @@ This commit is referred to as a "merge commit", or sometimes just a
 	identified by its <<def_object_name,object name>>. The objects usually
 	live in `$GIT_DIR/objects/`.
 
-[[def_object_identifier]]object identifier::
+[[def_object_identifier]]object identifier (oid)::
 	Synonym for <<def_object_name,object name>>.
 
 [[def_object_name]]object name::
diff --git a/Documentation/technical/commit-graph.txt b/Documentation/technical/commit-graph.txt
index f05e7bda1a..5a4e1eba8b 100644
--- a/Documentation/technical/commit-graph.txt
+++ b/Documentation/technical/commit-graph.txt
@@ -17,7 +17,7 @@ There are two main costs here:
 
 The commit-graph file is a supplemental data structure that accelerates
 commit graph walks. If a user downgrades or disables the 'core.commitGraph'
-config setting, then the existing ODB is sufficient. The file is stored
+config setting, then the existing object database is sufficient. The file is stored
 as "commit-graph" either in the .git/objects/info directory or in the info
 directory of an alternate.
 
diff --git a/Documentation/technical/parallel-checkout.txt b/Documentation/technical/parallel-checkout.txt
index e790258a1a..47c9b6183c 100644
--- a/Documentation/technical/parallel-checkout.txt
+++ b/Documentation/technical/parallel-checkout.txt
@@ -56,7 +56,7 @@ Rejected Multi-Threaded Solution
 
 The most "straightforward" implementation would be to spread the set of
 to-be-updated cache entries across multiple threads. But due to the
-thread-unsafe functions in the ODB code, we would have to use locks to
+thread-unsafe functions in the object database code, we would have to use locks to
 coordinate the parallel operation. An early prototype of this solution
 showed that the multi-threaded checkout would bring performance
 improvements over the sequential code, but there was still too much lock
-- 
2.38.1.windows.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 2/3] glossary: add "commit graph" description
  2022-10-22 22:25 ` [PATCH v2 0/3] Add some Glossary of terms information Philip Oakley
  2022-10-22 22:25   ` [PATCH v2 1/3] doc: use 'object database' not ODB or abbreviation Philip Oakley
@ 2022-10-22 22:25   ` Philip Oakley
  2022-10-25 12:31     ` Derrick Stolee
  2022-10-22 22:25   ` [PATCH v2 3/3] glossary: add reachability bitmap description Philip Oakley
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 45+ messages in thread
From: Philip Oakley @ 2022-10-22 22:25 UTC (permalink / raw)
  To: GitList; +Cc: Self, Junio C Hamano, Derrick Stolee, Abhradeep Chakraborty

Git has an additional "commit graph" capability that supplements the
normal commit object's directed acyclic graph (DAG). The supplemental
commit graph file is designed for speed of access.

Describe the commit graph both from the normative DAG view point and
from the commit graph file perspective.

Also, clarify the link between the branch ref and branch tip
by linking to the `ref` glossary entry, matching this commit graph
entry.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
---
 Documentation/glossary-content.txt | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index 947ac49606..97050826e5 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -20,7 +20,7 @@
 [[def_branch]]branch::
 	A "branch" is a line of development.  The most recent
 	<<def_commit,commit>> on a branch is referred to as the tip of
-	that branch.  The tip of the branch is referenced by a branch
+	that branch.  The tip of the branch is <<def_ref,referenced>> by a branch
 	<<def_head,head>>, which moves forward as additional development
 	is done on the branch.  A single Git
 	<<def_repository,repository>> can track an arbitrary number of
@@ -75,6 +75,21 @@ state in the Git history, by creating a new commit representing the current
 state of the <<def_index,index>> and advancing <<def_HEAD,HEAD>>
 to point at the new commit.
 
+[[def_commit_graph_general]]commit graph concept, representations and usage::
+	A synonym for the <<def_DAG,DAG>> structure formed by
+	the commits in the object database, <<def_ref,referenced>> by branch tips,
+	using their <<def_chain,chain>> of linked commits.
+	This structure is the definitive commit graph. The
+	graph can be represented in other ways, e.g. the
+	<<def_commit_graph_file,commit graph file>>.
+
+[[def_commit_graph_file]]commit graph file::
+	The commit-graph file is a supplemental representation of
+	the <<def_commit_graph_general,commit graph>> which accelerates
+	commit graph walks. The "commit-graph" file is stored
+	either in the .git/objects/info directory or in the info directory
+	of an alternate object database.
+
 [[def_commit_object]]commit object::
 	An <<def_object,object>> which contains the information about a
 	particular <<def_revision,revision>>, such as <<def_parent,parents>>, committer,
-- 
2.38.1.windows.1



* [PATCH v2 3/3] glossary: add reachability bitmap description
  2022-10-22 22:25 ` [PATCH v2 0/3] Add some Glossary of terms information Philip Oakley
  2022-10-22 22:25   ` [PATCH v2 1/3] doc: use 'object database' not ODB or abbreviation Philip Oakley
  2022-10-22 22:25   ` [PATCH v2 2/3] glossary: add "commit graph" description Philip Oakley
@ 2022-10-22 22:25   ` Philip Oakley
  2022-10-24  7:43     ` Abhradeep Chakraborty
  2022-10-23  1:49   ` [PATCH v2 0/3] Add some Glossary of terms information Junio C Hamano
  2022-10-29 16:41   ` [PATCH v3 0/4] " Philip Oakley
  4 siblings, 1 reply; 45+ messages in thread
From: Philip Oakley @ 2022-10-22 22:25 UTC (permalink / raw)
  To: GitList; +Cc: Self, Junio C Hamano, Derrick Stolee, Abhradeep Chakraborty

Describe the purpose of the reachability bitmap.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
---
 Documentation/glossary-content.txt | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index 97050826e5..3d67b452aa 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -508,6 +508,14 @@ exclude;;
 	<<def_tree_object,trees>> to the trees or <<def_blob_object,blobs>>
 	that they contain.
 
+[[def_reachability_bitmap]]reachability bitmaps::
+	Reachability bitmaps store information about the
+	<<def_reachable,reachability>> of a selected set of objects in
+	a packfile, or a multi-pack index (MIDX) to speed up object search.
+	A repository may have at
+	most one bitmap. The bitmap may belong to either one pack, or the
+	repository's multi-pack index (if it exists).
+
 [[def_rebase]]rebase::
 	To reapply a series of changes from a <<def_branch,branch>> to a
 	different base, and reset the <<def_head,head>> of that branch
-- 
2.38.1.windows.1



* Re: [PATCH v2 0/3] Add some Glossary of terms information
  2022-10-22 22:25 ` [PATCH v2 0/3] Add some Glossary of terms information Philip Oakley
                     ` (2 preceding siblings ...)
  2022-10-22 22:25   ` [PATCH v2 3/3] glossary: add reachability bitmap description Philip Oakley
@ 2022-10-23  1:49   ` Junio C Hamano
  2022-10-29 16:41   ` [PATCH v3 0/4] " Philip Oakley
  4 siblings, 0 replies; 45+ messages in thread
From: Junio C Hamano @ 2022-10-23  1:49 UTC (permalink / raw)
  To: Philip Oakley; +Cc: GitList, Derrick Stolee, Abhradeep Chakraborty

Philip Oakley <philipoakley@iee.email> writes:

> was GitGitGadget #1282,
> (in reply to <pull.1282.git.1657385781.gitgitgadget@gmail.com>)
>
> This short series looks to add the basics of the reachability bitmap
> and commit graph phrases to the glossary of terms. While these
> techniques are well known to their developers, for some, they are
> just magic phrases.

They all looked reasonable to me, but as you sensibly Cc'ed folks
who worked on the areas the concepts explained in these patches are
the most relevant, these patches would benefit from their inputs, so
let's hear them first and then advance the patches to 'next'.

Thanks.


* Re: [PATCH v2 3/3] glossary: add reachability bitmap description
  2022-10-22 22:25   ` [PATCH v2 3/3] glossary: add reachability bitmap description Philip Oakley
@ 2022-10-24  7:43     ` Abhradeep Chakraborty
  2022-10-24 16:39       ` Junio C Hamano
  0 siblings, 1 reply; 45+ messages in thread
From: Abhradeep Chakraborty @ 2022-10-24  7:43 UTC (permalink / raw)
  To: Philip Oakley; +Cc: GitList, Junio C Hamano, Derrick Stolee

Hey Philip,
Glad that you're working on this :)

On Sun, Oct 23, 2022 at 3:55 AM Philip Oakley <philipoakley@iee.email> wrote:
>
> Describe the purpose of the reachability bitmap.
>
> Signed-off-by: Philip Oakley <philipoakley@iee.email>
> ---
>  Documentation/glossary-content.txt | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
> index 97050826e5..3d67b452aa 100644
> --- a/Documentation/glossary-content.txt
> +++ b/Documentation/glossary-content.txt
> @@ -508,6 +508,14 @@ exclude;;
>         <<def_tree_object,trees>> to the trees or <<def_blob_object,blobs>>
>         that they contain.
>
> +[[def_reachability_bitmap]]reachability bitmaps::
> +       Reachability bitmaps store information about the
> +       <<def_reachable,reachability>> of a selected set of objects in
> +       a packfile, or a multi-pack index (MIDX) to speed up object search.

Looks good to me. Initially I thought that we could explain it more
but as you already linked the "reachability" here, we don't need to.

> +       A repository may have at
> +       most one bitmap. The bitmap may belong to either one pack, or the
> +       repository's multi-pack index (if it exists).
> +

Small correction here - A repository may have multiple bitmaps (one
for each selected commit from the preferred packfile or a
multi-pack-index) but it can have only one ".bitmap" file (as of now).
Bitmaps for the selected commits are stored in that ".bitmap" file.
So I think the below lines (or similar) will work  -

    The bitmaps are stored in a ".bitmap" file. A repository may have
    at most one ".bitmap" file. The file may belong to either one pack, or the
    repository's multi-pack-index (if it exists).

Feel free to rephrase it accordingly.

Thanks :)
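
To make the "several bitmaps, one file" point concrete, here is a minimal
sketch. It is a toy model, not Git's implementation: real ".bitmap" files
hold compressed (EWAH) bitmaps, whereas this sketch uses a plain Python
integer per selected commit. The object names, the edge table and the
helper functions are all hypothetical.

```python
# Illustrative sketch only: a Python int stands in for each bitmap, and the
# object names below are made up for the example.

def reachable_objects(start, edges):
    """Return the set of objects reachable from `start` by following edges."""
    seen, stack = set(), [start]
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue
        seen.add(obj)
        stack.extend(edges.get(obj, ()))
    return seen


def build_bitmaps(selected_commits, edges, object_order):
    """Build one bitmap per selected commit.

    Bit i of a bitmap is set when object_order[i] is reachable from
    that commit.
    """
    position = {obj: i for i, obj in enumerate(object_order)}
    bitmaps = {}
    for commit in selected_commits:
        bits = 0
        for obj in reachable_objects(commit, edges):
            bits |= 1 << position[obj]
        bitmaps[commit] = bits
    return bitmaps


# Hypothetical toy history: commit c2 has parent c1; each commit has a tree.
edges = {
    "c2": ["c1", "t2"],
    "c1": ["t1"],
    "t2": ["b1", "b2"],
    "t1": ["b1"],
}
object_order = ["c1", "c2", "t1", "t2", "b1", "b2"]

# Several bitmaps (one per selected commit), kept together in one container,
# much as a single ".bitmap" file stores multiple bitmaps.
bitmaps = build_bitmaps(["c1", "c2"], edges, object_order)

# Objects needed to go from c1 to c2: a bitwise AND-NOT replaces a graph walk.
new_bits = bitmaps["c2"] & ~bitmaps["c1"]
new_objects = [obj for i, obj in enumerate(object_order) if (new_bits >> i) & 1]
print(new_objects)
```

The speed-up comes from the last step: once the bitmaps exist, "what does
c2 have that c1 lacks" is answered with bit operations rather than by
walking trees and blobs again.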


* Re: [PATCH v2 3/3] glossary: add reachability bitmap description
  2022-10-24  7:43     ` Abhradeep Chakraborty
@ 2022-10-24 16:39       ` Junio C Hamano
  2022-10-24 21:23         ` Philip Oakley
  0 siblings, 1 reply; 45+ messages in thread
From: Junio C Hamano @ 2022-10-24 16:39 UTC (permalink / raw)
  To: Abhradeep Chakraborty; +Cc: Philip Oakley, GitList, Derrick Stolee

Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com> writes:

> Small correction here - A repository may have multiple bitmaps (one
> for each selected commit from the preferred packfile or a
> multi-pack-index) but it can have only one ".bitmap" file (as of now).
> Bitmaps for the selected commits are stored in that ".bitmap" file.
> So I think the below lines (or similar) will work  -
>
>     The bitmaps are stored in a ".bitmap" file. A repository may have
>     at most one ".bitmap" file. The file may belong to either one pack, or the
>     repository's multi-pack-index (if it exists).
>
> Feel free to rephrase it accordingly.

Sounds good to me.  Or Philip's original can be tweaked minimally to
say "... may have at most one bitmap file (which stores multiple
bitmaps)".

Thanks.


* Re: [PATCH v2 3/3] glossary: add reachability bitmap description
  2022-10-24 16:39       ` Junio C Hamano
@ 2022-10-24 21:23         ` Philip Oakley
  2022-10-25 12:34           ` Derrick Stolee
  0 siblings, 1 reply; 45+ messages in thread
From: Philip Oakley @ 2022-10-24 21:23 UTC (permalink / raw)
  To: Junio C Hamano, Abhradeep Chakraborty; +Cc: GitList, Derrick Stolee

On 24/10/2022 17:39, Junio C Hamano wrote:
> Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com> writes:
>
>> Small correction here - A repository may have multiple bitmaps (one
>> for each selected commit from the preferred packfile or a
>> multi-pack-index) but it can have only one ".bitmap" file (as of now).
>> Bitmaps for the selected commits are stored in that ".bitmap" file.
>> So I think the below lines (or similar) will work  -
>>
>>     The bitmaps are stored in a ".bitmap" file. A repository may have
>>     at most one ".bitmap" file. The file may belong to either one pack, or the
>>     repository's multi-pack-index (if it exists).
>>
>> Feel free to rephrase it accordingly.
> Sounds good to me.  Or Philip's original can be tweaked minimally to
> say "... may have at most one bitmap file (which stores multiple
> bitmaps)".
>
Thanks both. I'll tweak the description in a day or so to allow Stolee
to comment if required.
P.


* Re: [PATCH v2 2/3] glossary: add "commit graph" description
  2022-10-22 22:25   ` [PATCH v2 2/3] glossary: add "commit graph" description Philip Oakley
@ 2022-10-25 12:31     ` Derrick Stolee
  2022-10-29 16:32       ` Philip Oakley
  0 siblings, 1 reply; 45+ messages in thread
From: Derrick Stolee @ 2022-10-25 12:31 UTC (permalink / raw)
  To: Philip Oakley, GitList
  Cc: Junio C Hamano, Derrick Stolee, Abhradeep Chakraborty

On 10/22/2022 6:25 PM, Philip Oakley wrote:
> Git has an additional "commit graph" capability that supplements the
> normal commit object's directed acylic graph (DAG). The supplemental
> commit graph file is designed for speed of access.
> 
> Describe the commit graph both from the normative DAG view point and
> from the commit graph file perspective.

One way to help keep the general term and the file separate is to use
different notation. "commit graph" (with a space, no formatting) is the
DAG. "`commit-graph`" (with a dash, code formatting) is the file (and
its format).

> +[[def_commit_graph_general]]commit graph concept, representations and usage::
> +	A synonym for the <<def_DAG,DAG>> structure formed by
> +	the commits in the object database, <<def_ref,referenced>> by branch tips,
> +	using their <<def_chain,chain>> of linked commits.
> +	This structure is the definitive commit graph. The
> +	graph can be represented in other ways, e.g. the
> +	<<def_commit_graph_file,commit graph file>>.
> +
> +[[def_commit_graph_file]]commit graph file::
> +	The commit-graph file is a supplemental representation of
> +	the <<def_commit_graph_general,commit graph>> which accelerates
> +	commit graph walks. The "commit-graph" file is stored
> +	either in the .git/objects/info directory or in the info directory
> +	of an alternate object database.
> +

So this would become:

[[def_commit_graph_file]]`commit-graph` file::
	The `commit-graph` file is a supplemental representation of
	the <<def_commit_graph_general,commit graph>> which accelerates
	commit graph walks. The `commit-graph` file is stored either in
	the `.git/objects/info` directory or in the `info` directory of
	an alternate object database.

(I did some extra style and word-wrapping changes, too.)

Other than these nits, I find this to be a clear description.

Thanks,
-Stolee
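
The distinction between the commit graph (the DAG) and the `commit-graph`
file (a supplemental, precomputed aid) can be sketched as follows. This is
a toy model under stated assumptions: the commit names are made up, the
`parents` table stands in for the commits in the object database, and the
`gen` table stands in for one kind of data the file caches (generation
numbers). It does not reflect the on-disk format.

```python
# Toy model only: `parents` plays the role of the definitive commit graph,
# `gen` plays the role of supplemental data that accelerates walks.

parents = {
    "A": [],          # root commit
    "B": ["A"],
    "C": ["A"],
    "D": ["B", "C"],  # merge commit
}


def walk(tips, parents):
    """Walk the DAG from the given tips and return every reachable commit."""
    seen, stack = set(), list(tips)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents[commit])
    return seen


def generation_numbers(parents):
    """Precompute a number per commit: 1 for roots, else 1 + max over parents."""
    gen = {}

    def visit(commit):
        if commit not in gen:
            gen[commit] = 1 + max((visit(p) for p in parents[commit]), default=0)
        return gen[commit]

    for commit in parents:
        visit(commit)
    return gen


def can_reach(src, dst, parents, gen):
    """Answer reachability, using the precomputed numbers to skip hopeless walks."""
    if src != dst and gen[src] <= gen[dst]:
        return False  # an ancestor always has a strictly smaller number
    return dst in walk([src], parents)


gen = generation_numbers(parents)
print(gen)
print(can_reach("D", "A", parents, gen))
print(can_reach("B", "C", parents, gen))
```

Deleting `gen` would change nothing about the answers, only how much
walking is needed to get them, which is the sense in which the file is
supplemental to the DAG.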


* Re: [PATCH v2 3/3] glossary: add reachability bitmap description
  2022-10-24 21:23         ` Philip Oakley
@ 2022-10-25 12:34           ` Derrick Stolee
  2022-10-25 15:53             ` Junio C Hamano
  2022-10-29 16:36             ` Philip Oakley
  0 siblings, 2 replies; 45+ messages in thread
From: Derrick Stolee @ 2022-10-25 12:34 UTC (permalink / raw)
  To: Philip Oakley, Junio C Hamano, Abhradeep Chakraborty, Taylor Blau
  Cc: GitList, Derrick Stolee

On 10/24/2022 5:23 PM, Philip Oakley wrote:
> On 24/10/2022 17:39, Junio C Hamano wrote:
>> Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com> writes:
>>
>>> Small correction here - A repository may have multiple bitmaps (one
>>> for each selected commit from the preferred packfile or a
>>> multi-pack-index) but it can have only one ".bitmap" file (as of now).
>>> Bitmaps for the selected commits are stored in that ".bitmap" file.
>>> So I think the below lines (or similar) will work  -
>>>
>>>     The bitmaps are stored in a ".bitmap" file. A repository may have
>>>     at most one ".bitmap" file. The file may belong to either one pack, or the
>>>     repository's multi-pack-index (if it exists).
>>>
>>> Feel free to rephrase it accordingly.
>> Sounds good to me.  Or Philip's original can be tweaked minimally to
>> say "... may have at most one bitmap file (which stores multiple
>> bitmaps)".
>>
> Thanks both. I'll tweak the description in a day or so to allow Stolee
> to comment if required.

I added my comments about the commit-graph file, and agree with
Abhradeep's suggestions here.

Adding Taylor as a possible reviewer, too.

The one thing I will say is that there can be multiple .bitmap
files, but Git will only use one of them. Not sure if that is
worth being pedantic about here, though.

We'll need to keep this glossary section in mind in case things
change (such as "at most one bitmap file").

Thanks,
-Stolee


* Re: [PATCH v2 3/3] glossary: add reachability bitmap description
  2022-10-25 12:34           ` Derrick Stolee
@ 2022-10-25 15:53             ` Junio C Hamano
  2022-10-29 16:36             ` Philip Oakley
  1 sibling, 0 replies; 45+ messages in thread
From: Junio C Hamano @ 2022-10-25 15:53 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Philip Oakley, Abhradeep Chakraborty, Taylor Blau, GitList,
	Derrick Stolee

Derrick Stolee <derrickstolee@github.com> writes:

> The one thing I will say is that there can be multiple .bitmap
> files, but Git will only use one of them. Not sure if that is
> worth being pedantic about here, though.

That matches my understanding, but "can be" is less of the norm
these days, no?  "repack -b" would refuse without "-a" so we may
have more than one by accident, or am I missing a common scenario
that we do perfectly normal things and still end up with multiple?

I agree with you that it probably is a good idea to say there can
be, so that the readers do not have to be alarmed.

      Only one '.bitmap' file (which stores multiple reachability
      bitmaps) is used in a repository (note: it is not wrong to
      have more than one).  The bitmap file may belong to either
      one pack, or the repository's multi-pack index (if it exists).

But then the readers who do have more than one would next think "how
do I get rid of the ones that are not used? they are wasting my
precious disk space".  So I also am not sure if it helps to write
more.  "It is generally true that.." white lie may be better than
technical correctness in this case.

> We'll need to keep this glossary section in mind in case things
> change (such as "at most one bitmap file").

True.

Thanks.


* Re: [PATCH v2 2/3] glossary: add "commit graph" description
  2022-10-25 12:31     ` Derrick Stolee
@ 2022-10-29 16:32       ` Philip Oakley
  0 siblings, 0 replies; 45+ messages in thread
From: Philip Oakley @ 2022-10-29 16:32 UTC (permalink / raw)
  To: Derrick Stolee, GitList
  Cc: Junio C Hamano, Derrick Stolee, Abhradeep Chakraborty

On 25/10/2022 13:31, Derrick Stolee wrote:
> On 10/22/2022 6:25 PM, Philip Oakley wrote:
>> Git has an additional "commit graph" capability that supplements the
>> normal commit object's directed acylic graph (DAG). The supplemental
>> commit graph file is designed for speed of access.
>>
>> Describe the commit graph both from the normative DAG view point and
>> from the commit graph file perspective.
> One way to help keep the general term and the file separate is to use
> different notation. "commit graph" (with a space, no formatting) is the
> DAG. "`commit-graph`" (with a dash, code formatting) is the file (and
> its format).
I did want to have separate entries to make clear the distinction at
this level.

The use of the hyphenation is good, and there are only a few places
where that isn't followed, so I'll specifically call out the use of
hyphenation, and add a patch to update the few places that used the
generic term inappropriately.
Using the code formatting for commit-graph would have been extensive.

>> +[[def_commit_graph_general]]commit graph concept, representations and usage::
>> +	A synonym for the <<def_DAG,DAG>> structure formed by
>> +	the commits in the object database, <<def_ref,referenced>> by branch tips,
>> +	using their <<def_chain,chain>> of linked commits.
>> +	This structure is the definitive commit graph. The
>> +	graph can be represented in other ways, e.g. the
>> +	<<def_commit_graph_file,commit graph file>>.
>> +
>> +[[def_commit_graph_file]]commit graph file::
>> +	The commit-graph file is a supplemental representation of
>> +	the <<def_commit_graph_general,commit graph>> which accelerates
>> +	commit graph walks. The "commit-graph" file is stored
>> +	either in the .git/objects/info directory or in the info directory
>> +	of an alternate object database.
>> +
> So this would become:
>
> [[def_commit_graph_file]]`commit-graph` file::
> 	The `commit-graph` file is a supplemental representation of
> 	the <<def_commit_graph_general,commit graph>> which accelerates
> 	commit graph walks. The `commit-graph` file is stored either in
> 	the `.git/objects/info` directory or in the `info` directory of
> 	an alternate object database.
>
> (I did some extra style and word-wrapping changes, too.)

I've used some of that. Thanks.

Philip
>
> Other than these nits, I find this to be a clear description.
>
> Thanks,
> -Stolee
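
The two locations named in the glossary text can be listed with a short
sketch. Assumptions: the function name is hypothetical, alternates are
read from `objects/info/alternates` with one object directory per line,
blank lines and `#` lines are skipped, and relative entries are resolved
against the `objects` directory. Git's own lookup does more than this.

```python
from pathlib import Path


def commit_graph_candidates(git_dir):
    """List where a "commit-graph" file could live for this repository.

    Candidates are <objects>/info/commit-graph for the local object
    database and for each alternate object database.
    """
    objects = Path(git_dir) / "objects"
    object_dirs = [objects]

    alternates = objects / "info" / "alternates"
    if alternates.is_file():
        for line in alternates.read_text().splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # assumption: skip blanks and comment lines
            alt = Path(line)
            # Assumption: relative entries are relative to the objects dir.
            object_dirs.append(alt if alt.is_absolute() else objects / alt)

    return [d / "info" / "commit-graph" for d in object_dirs]


# Usage: print the candidate paths for a repository in the current directory.
for candidate in commit_graph_candidates(".git"):
    print(candidate, "(exists)" if candidate.is_file() else "(absent)")
```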



* Re: [PATCH v2 3/3] glossary: add reachability bitmap description
  2022-10-25 12:34           ` Derrick Stolee
  2022-10-25 15:53             ` Junio C Hamano
@ 2022-10-29 16:36             ` Philip Oakley
  1 sibling, 0 replies; 45+ messages in thread
From: Philip Oakley @ 2022-10-29 16:36 UTC (permalink / raw)
  To: Derrick Stolee, Junio C Hamano, Abhradeep Chakraborty,
	Taylor Blau
  Cc: GitList, Derrick Stolee

On 25/10/2022 13:34, Derrick Stolee wrote:
> On 10/24/2022 5:23 PM, Philip Oakley wrote:
>> On 24/10/2022 17:39, Junio C Hamano wrote:
>>> Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com> writes:
>>>
>>>> Small correction here - A repository may have multiple bitmaps (one
>>>> for each selected commit from the preferred packfile or a
>>>> multi-pack-index) but it can have only one ".bitmap" file (as of now).
>>>> Bitmaps for the selected commits are stored in that ".bitmap" file.
>>>> So I think the below lines (or similar) will work  -
>>>>
>>>>     The bitmaps are stored in a ".bitmap" file. A repository may have
>>>>     at most one ".bitmap" file. The file may belong to either one pack, or the
>>>>     repository's multi-pack-index (if it exists).
>>>>
>>>> Feel free to rephrase it accordingly.
>>> Sounds good to me.  Or Philip's original can be tweaked minimally to
>>> say "... may have at most one bitmap file (which stores multiple
>>> bitmaps)".
>>>
>> Thanks both. I'll tweak the description in a day or so to allow Stolee
>> to comment if required.
> I added my comments about the commit-graph file, and agree with
> Abhradeep's suggestions here.
>
> Adding Taylor as a possible reviewer, too.
>
> The one thing I will say is that there can be multiple .bitmap
> files, but Git will only use one of them. Not sure if that is
> worth being pedantic about here, though.
>
> We'll need to keep this glossary section in mind in case things
> change (such as "at most one bitmap file").
>
> Thanks,
> -Stolee
I've gone with the phrase "at most one bitmap file in use." here.

The updated series should be sent shortly.

Philip.


* [PATCH v3 0/4] Add some Glossary of terms information
  2022-10-22 22:25 ` [PATCH v2 0/3] Add some Glossary of terms information Philip Oakley
                     ` (3 preceding siblings ...)
  2022-10-23  1:49   ` [PATCH v2 0/3] Add some Glossary of terms information Junio C Hamano
@ 2022-10-29 16:41   ` Philip Oakley
  2022-10-29 16:41     ` [PATCH v3 1/4] doc: use 'object database' not ODB or abbreviation Philip Oakley
                       ` (4 more replies)
  4 siblings, 5 replies; 45+ messages in thread
From: Philip Oakley @ 2022-10-29 16:41 UTC (permalink / raw)
  To: GitList
  Cc: Self, Junio C Hamano, Derrick Stolee, Abhradeep Chakraborty,
	Taylor Blau

(in reply to <20221022222539.2333-1-philipoakley@iee.email>)

This short series looks to add the basics of the reachability bitmap
and commit graph phrases to the glossary of terms. While these
techniques are well known to their developers, for some, they are
just magic phrases.

[V3] .. since V2

1/4 Unchanged.
2/4 The distinction between the generic 'commit graph' of the DAG and
the implementation specifics of 'commit-graph' file has been retained
for the glossary. 
However the deliberate hyphenation has been included, and a fourth patch
added to maintain the consistency of 'commit-graph' in other documents.

3/4 Tweaks & links applied to Reachability patch.
4/4 New - maintain the consistency of 'commit-graph' in other documents.

[V2] .. since V1
was GitGitGadget #1282,
(in reply to <pull.1282.git.1657385781.gitgitgadget@gmail.com>)
Patch 4/4 has been taken upstream independently, and hence dropped
here so we're now just [n/3].

Patch 1/3 Dropped the glossary addition in favour of changing the
locations that used ODB (Junio's suggestion). Kept the git
pack-redundant's `--alt-odb` but spelt out 'object database' in full
in the man page. The only remaining `odb`s are within `goodbye` ;-).

While here, add the (oid) abbreviation to its adjacent entry.

Patch 2/3 Split the 'commit-graph' explanation into two parts to
distinguish the speed-up option, from Git's core graph concept of
object traversal. Included links to existing terms.

Patch 3/3 Added links to existing terms. Statement for the
reachability bitmaps.

added cc: for Stolee (commit-graph) and Abhradeep Chakraborty
(Bitmaps) review.


[V1] [GGG PR #1282] 
https://lore.kernel.org/git/pull.1282.git.1657385781.gitgitgadget@gmail.com/

The first patch [1/4] is to show ODB as an abbreviation to avoid a UNA [0]

Patch [2/4] provides a basic statement for the Commit-Graph's purpose.

Patch [3/4] provides a similar statement for the reachability bitmaps.

These two patches may miss out on some linking information as to
the benefits these have and the basics of their heuristic.

Patch [4/4] follows up on a bug report about the lack of idempotence
for the `--renormalize` option. See commit message for details.

[0] UNA Un-Named Abbreviation.

Signed-off-by: Philip Oakley philipoakley@iee.email
cc: Philip Oakley philipoakley@iee.email


Philip Oakley (4):
  doc: use 'object database' not ODB or abbreviation
  glossary: add "commit graph" description
  glossary: add reachability bitmap description
  doc: use "commit-graph" hyphenation consistently

 Documentation/config/core.txt                 |  2 +-
 Documentation/git-pack-redundant.txt          |  2 +-
 Documentation/gitformat-commit-graph.txt      |  6 ++---
 Documentation/glossary-content.txt            | 27 +++++++++++++++++--
 Documentation/technical/commit-graph.txt      |  8 +++---
 Documentation/technical/parallel-checkout.txt |  2 +-
 6 files changed, 35 insertions(+), 12 deletions(-)

Range-diff against remotes/gitster/po/glossary-around-traversal (v2?):
1:  de164ab78b ! 1:  748b15345e doc: use 'object database' not ODB or abbreviation
    @@ Commit message
         abbreviation to its entry.
     
         Signed-off-by: Philip Oakley <philipoakley@iee.email>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
      ## Documentation/git-pack-redundant.txt ##
     @@ Documentation/git-pack-redundant.txt: OPTIONS
2:  f677d57699 ! 2:  052a9568e7 glossary: add "commit graph" description
    @@ Commit message
         glossary: add "commit graph" description
     
         Git has an additional "commit graph" capability that supplements the
    -    normal commit object's directed acylic graph (DAG). The supplemental
    +    normal commit object's directed acyclic graph (DAG). The supplemental
         commit graph file is designed for speed of access.
     
         Describe the commit graph both from the normative DAG view point and
    @@ Commit message
         by linking to the `ref` glossary entry, matching this commit graph
         entry.
     
    +    The commit-graph file is also distinguished by its hyphenation.
    +
    +    Subsequent commit catches the few cases where the hyphenation of
    +    commit-graph was missing.
    +
         Signed-off-by: Philip Oakley <philipoakley@iee.email>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
      ## Documentation/glossary-content.txt ##
     @@
    @@ Documentation/glossary-content.txt: state in the Git history, by creating a new
      to point at the new commit.
      
     +[[def_commit_graph_general]]commit graph concept, representations and usage::
    -+	A synonym for the <<def_DAG,DAG>> structure formed by
    -+	the commits in the object database, <<def_ref,referenced>> by branch tips,
    ++	A synonym for the <<def_DAG,DAG>> structure formed by the commits
    ++	in the object database, <<def_ref,referenced>> by branch tips,
     +	using their <<def_chain,chain>> of linked commits.
     +	This structure is the definitive commit graph. The
     +	graph can be represented in other ways, e.g. the
    -+	<<def_commit_graph_file,commit graph file>>.
    ++	<<def_commit_graph_file,"commit-graph" file>>.
     +
    -+[[def_commit_graph_file]]commit graph file::
    -+	The commit-graph file is a supplemental representation of
    -+	the <<def_commit_graph_general,commit graph>> which accelerates
    -+	commit graph walks. The "commit-graph" file is stored
    -+	either in the .git/objects/info directory or in the info directory
    -+	of an alternate object database.
    ++[[def_commit_graph_file]]commit-graph file::
    ++	The "commit-graph" (normally hyphenated) file is a supplemental
    ++	representation of the <<def_commit_graph_general,commit graph>>
    ++	which accelerates commit graph walks. The "commit-graph" file is
    ++	stored either in the .git/objects/info directory or in the info
    ++	directory of an alternate object database.
     +
      [[def_commit_object]]commit object::
      	An <<def_object,object>> which contains the information about a
3:  39e9a282fc ! 3:  d56234b70c glossary: add reachability bitmap description
    @@ Commit message
         Describe the purpose of the reachability bitmap.
     
         Signed-off-by: Philip Oakley <philipoakley@iee.email>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
      ## Documentation/glossary-content.txt ##
     @@ Documentation/glossary-content.txt: exclude;;
    @@ Documentation/glossary-content.txt: exclude;;
      
     +[[def_reachability_bitmap]]reachability bitmaps::
     +	Reachability bitmaps store information about the
    -+	<<def_reachable,reachability>> of a selected set of objects in
    -+	a packfile, or a multi-pack index (MIDX) to speed up object search.
    -+	A repository may have at
    -+	most one bitmap. The bitmap may belong to either one pack, or the
    -+	repository's multi-pack index (if it exists).
    ++	<<def_reachable,reachability>> of a selected set of commits in
    ++	a packfile, or a multi-pack index (MIDX), to speed up object search.
    ++	The bitmaps are stored in a ".bitmap" file. A repository may have at
    ++	most one bitmap file in use. The bitmap file may belong to either one
    ++	pack, or the repository's multi-pack index (if it exists).
     +
      [[def_rebase]]rebase::
      	To reapply a series of changes from a <<def_branch,branch>> to a
-:  ---------- > 4:  87686e63f9 doc: use "commit-graph" hyphenation consistently
-- 
2.38.1.windows.1



* [PATCH v3 1/4] doc: use 'object database' not ODB or abbreviation
  2022-10-29 16:41   ` [PATCH v3 0/4] " Philip Oakley
@ 2022-10-29 16:41     ` Philip Oakley
  2022-10-29 16:41     ` [PATCH v3 2/4] glossary: add "commit graph" description Philip Oakley
                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 45+ messages in thread
From: Philip Oakley @ 2022-10-29 16:41 UTC (permalink / raw)
  To: GitList
  Cc: Self, Junio C Hamano, Derrick Stolee, Abhradeep Chakraborty,
	Taylor Blau

The abbreviation 'ODB' is used in the technical documentation
sections for commit-graph and parallel-checkout, along with an
'odb' option in `git-pack-redundant`, without expansion.

Use 'object database' in full, in those entries. The text has not
been reflowed to keep the changes minimal.

While in the glossary for `object` terms, add the common `oid`
abbreviation to its entry.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
---
 Documentation/git-pack-redundant.txt          | 2 +-
 Documentation/glossary-content.txt            | 2 +-
 Documentation/technical/commit-graph.txt      | 2 +-
 Documentation/technical/parallel-checkout.txt | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-pack-redundant.txt b/Documentation/git-pack-redundant.txt
index ee7034b5e5..1132c73956 100644
--- a/Documentation/git-pack-redundant.txt
+++ b/Documentation/git-pack-redundant.txt
@@ -34,7 +34,7 @@ OPTIONS
 
 --alt-odb::
 	Don't require objects present in packs from alternate object
-	directories to be present in local packs.
+	database (odb) directories to be present in local packs.
 
 --verbose::
 	Outputs some statistics to stderr. Has a small performance penalty.
diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index aa2f41f5e7..947ac49606 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -262,7 +262,7 @@ This commit is referred to as a "merge commit", or sometimes just a
 	identified by its <<def_object_name,object name>>. The objects usually
 	live in `$GIT_DIR/objects/`.
 
-[[def_object_identifier]]object identifier::
+[[def_object_identifier]]object identifier (oid)::
 	Synonym for <<def_object_name,object name>>.
 
 [[def_object_name]]object name::
diff --git a/Documentation/technical/commit-graph.txt b/Documentation/technical/commit-graph.txt
index 90c9760c23..d2a6a13650 100644
--- a/Documentation/technical/commit-graph.txt
+++ b/Documentation/technical/commit-graph.txt
@@ -17,7 +17,7 @@ There are two main costs here:
 
 The commit-graph file is a supplemental data structure that accelerates
 commit graph walks. If a user downgrades or disables the 'core.commitGraph'
-config setting, then the existing ODB is sufficient. The file is stored
+config setting, then the existing object database is sufficient. The file is stored
 as "commit-graph" either in the .git/objects/info directory or in the info
 directory of an alternate.
 
diff --git a/Documentation/technical/parallel-checkout.txt b/Documentation/technical/parallel-checkout.txt
index e790258a1a..47c9b6183c 100644
--- a/Documentation/technical/parallel-checkout.txt
+++ b/Documentation/technical/parallel-checkout.txt
@@ -56,7 +56,7 @@ Rejected Multi-Threaded Solution
 
 The most "straightforward" implementation would be to spread the set of
 to-be-updated cache entries across multiple threads. But due to the
-thread-unsafe functions in the ODB code, we would have to use locks to
+thread-unsafe functions in the object database code, we would have to use locks to
 coordinate the parallel operation. An early prototype of this solution
 showed that the multi-threaded checkout would bring performance
 improvements over the sequential code, but there was still too much lock
-- 
2.38.1.windows.1


* [PATCH v3 2/4] glossary: add "commit graph" description
  2022-10-29 16:41   ` [PATCH v3 0/4] " Philip Oakley
  2022-10-29 16:41     ` [PATCH v3 1/4] doc: use 'object database' not ODB or abbreviation Philip Oakley
@ 2022-10-29 16:41     ` Philip Oakley
  2022-10-29 16:41     ` [PATCH v3 3/4] glossary: add reachability bitmap description Philip Oakley
                       ` (2 subsequent siblings)
  4 siblings, 0 replies; 45+ messages in thread
From: Philip Oakley @ 2022-10-29 16:41 UTC (permalink / raw)
  To: GitList
  Cc: Self, Junio C Hamano, Derrick Stolee, Abhradeep Chakraborty,
	Taylor Blau

Git has an additional "commit graph" capability that supplements the
normal commit object's directed acyclic graph (DAG). The supplemental
commit graph file is designed for speed of access.

Describe the commit graph both from the normative DAG view point and
from the commit graph file perspective.

Also, clarify the link between the branch ref and branch tip
by linking to the `ref` glossary entry, matching this commit graph
entry.

The commit-graph file is also distinguished by its hyphenation.

A subsequent commit catches the few cases where the hyphenation of
commit-graph was missing.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
---
 Documentation/glossary-content.txt | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index 947ac49606..a526710278 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -20,7 +20,7 @@
 [[def_branch]]branch::
 	A "branch" is a line of development.  The most recent
 	<<def_commit,commit>> on a branch is referred to as the tip of
-	that branch.  The tip of the branch is referenced by a branch
+	that branch.  The tip of the branch is <<def_ref,referenced>> by a branch
 	<<def_head,head>>, which moves forward as additional development
 	is done on the branch.  A single Git
 	<<def_repository,repository>> can track an arbitrary number of
@@ -75,6 +75,21 @@ state in the Git history, by creating a new commit representing the current
 state of the <<def_index,index>> and advancing <<def_HEAD,HEAD>>
 to point at the new commit.
 
+[[def_commit_graph_general]]commit graph concept, representations and usage::
+	A synonym for the <<def_DAG,DAG>> structure formed by the commits
+	in the object database, <<def_ref,referenced>> by branch tips,
+	using their <<def_chain,chain>> of linked commits.
+	This structure is the definitive commit graph. The
+	graph can be represented in other ways, e.g. the
+	<<def_commit_graph_file,"commit-graph" file>>.
+
+[[def_commit_graph_file]]commit-graph file::
+	The "commit-graph" (normally hyphenated) file is a supplemental
+	representation of the <<def_commit_graph_general,commit graph>>
+	which accelerates commit graph walks. The "commit-graph" file is
+	stored either in the .git/objects/info directory or in the info
+	directory of an alternate object database.
+
 [[def_commit_object]]commit object::
 	An <<def_object,object>> which contains the information about a
 	particular <<def_revision,revision>>, such as <<def_parent,parents>>, committer,
-- 
2.38.1.windows.1
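
Editorial sketch, not part of the patch: the two glossary entries above
separate the definitive commit graph (the DAG of commits in the object
database) from the supplemental "commit-graph" file that speeds up walks.
The toy below illustrates that split under stated assumptions. The
history in PARENTS and every function name are hypothetical, and the
generation number is taken as 1 for a root commit and otherwise one more
than the largest generation among its parents. It models only the idea
of precomputed metadata pruning a walk, not Git's file format or code.

```python
from functools import lru_cache

# Hypothetical toy history: commit name -> list of parent names.
# This dict plays the role of the definitive commit graph (the DAG).
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["A"],
    "D": ["B", "C"],  # merge commit with two parents
}


@lru_cache(maxsize=None)
def generation(commit):
    """Return 1 for a root commit, else 1 + the max generation of its parents."""
    parents = PARENTS[commit]
    if not parents:
        return 1
    return 1 + max(generation(p) for p in parents)


# Precomputed once, standing in for the supplemental data kept alongside the DAG.
GENERATIONS = {commit: generation(commit) for commit in PARENTS}


def can_reach(start, target):
    """Walk parents from start, skipping commits that cannot be above target."""
    floor = GENERATIONS[target]
    stack, seen = [start], set()
    while stack:
        commit = stack.pop()
        if commit == target:
            return True
        # Ancestors always have a strictly smaller generation, so a commit
        # at or below the target's generation cannot lead to the target.
        if commit in seen or GENERATIONS[commit] <= floor:
            continue
        seen.add(commit)
        stack.extend(PARENTS[commit])
    return False
```

The answers come from the DAG alone; the precomputed numbers only let the
walk stop early, which is why the glossary calls the file supplemental.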


* [PATCH v3 3/4] glossary: add reachability bitmap description
  2022-10-29 16:41   ` [PATCH v3 0/4] " Philip Oakley
  2022-10-29 16:41     ` [PATCH v3 1/4] doc: use 'object database' not ODB or abbreviation Philip Oakley
  2022-10-29 16:41     ` [PATCH v3 2/4] glossary: add "commit graph" description Philip Oakley
@ 2022-10-29 16:41     ` Philip Oakley
  2022-10-29 16:41     ` [PATCH v3 4/4] doc: use "commit-graph" hyphenation consistently Philip Oakley
  2022-10-29 17:24     ` [PATCH v3 0/4] Add some Glossary of terms information Taylor Blau
  4 siblings, 0 replies; 45+ messages in thread
From: Philip Oakley @ 2022-10-29 16:41 UTC (permalink / raw)
  To: GitList
  Cc: Self, Junio C Hamano, Derrick Stolee, Abhradeep Chakraborty,
	Taylor Blau

Describe the purpose of the reachability bitmap.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
---
 Documentation/glossary-content.txt | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index a526710278..5a537268e2 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -508,6 +508,14 @@ exclude;;
 	<<def_tree_object,trees>> to the trees or <<def_blob_object,blobs>>
 	that they contain.
 
+[[def_reachability_bitmap]]reachability bitmaps::
+	Reachability bitmaps store information about the
+	<<def_reachable,reachability>> of a selected set of commits in
+	a packfile, or a multi-pack index (MIDX), to speed up object search.
+	The bitmaps are stored in a ".bitmap" file. A repository may have at
+	most one bitmap file in use. The bitmap file may belong to either one
+	pack, or the repository's multi-pack index (if it exists).
+
 [[def_rebase]]rebase::
 	To reapply a series of changes from a <<def_branch,branch>> to a
 	different base, and reset the <<def_head,head>> of that branch
-- 
2.38.1.windows.1
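
Editorial sketch, not part of the patch: the glossary entry above says
reachability bitmaps record what is reachable from selected commits in
order to speed up object search. The toy below shows the general idea
with plain Python integers used as bit sets. The history, the bit
positions and all names are hypothetical, and only commits are modelled;
it makes no attempt to reproduce the on-disk ".bitmap" format.

```python
# Hypothetical toy history: commit name -> list of parent names.
PARENTS = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}

# Give every object an arbitrary but fixed bit position.
POSITION = {name: index for index, name in enumerate(sorted(PARENTS))}


def build_bitmap(commit, cache):
    """Return an int whose set bits mark everything reachable from commit."""
    if commit in cache:
        return cache[commit]
    bits = 1 << POSITION[commit]
    for parent in PARENTS[commit]:
        bits |= build_bitmap(parent, cache)  # union with each parent's set
    cache[commit] = bits
    return bits


def names(bits):
    """Translate a bitmap back into a sorted list of object names."""
    return sorted(name for name, index in POSITION.items() if bits >> index & 1)


cache = {}
have = build_bitmap("B", cache)  # what one side already has
want = build_bitmap("D", cache)  # what the other side asks for
missing = names(want & ~have)    # set difference as a single bitwise operation
```

Once the bitmaps exist, "what is reachable from D but not from B" needs
no graph walk at all, which is the kind of saving the entry refers to.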


* [PATCH v3 4/4] doc: use "commit-graph" hyphenation consistently
  2022-10-29 16:41   ` [PATCH v3 0/4] " Philip Oakley
                       ` (2 preceding siblings ...)
  2022-10-29 16:41     ` [PATCH v3 3/4] glossary: add reachability bitmap description Philip Oakley
@ 2022-10-29 16:41     ` Philip Oakley
  2022-10-29 17:24     ` [PATCH v3 0/4] Add some Glossary of terms information Taylor Blau
  4 siblings, 0 replies; 45+ messages in thread
From: Philip Oakley @ 2022-10-29 16:41 UTC (permalink / raw)
  To: GitList
  Cc: Self, Junio C Hamano, Derrick Stolee, Abhradeep Chakraborty,
	Taylor Blau

Note, historical release notes have not been updated.

Signed-off-by: Philip Oakley <philipoakley@iee.email>

# Conflicts:
#	Documentation/gitformat-commit-graph.txt
---
 Documentation/config/core.txt            | 2 +-
 Documentation/gitformat-commit-graph.txt | 6 +++---
 Documentation/technical/commit-graph.txt | 6 +++---
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
index 37afbaf5a4..dfbdaf00b8 100644
--- a/Documentation/config/core.txt
+++ b/Documentation/config/core.txt
@@ -618,7 +618,7 @@ but risks losing recent work in the event of an unclean system shutdown.
 * `loose-object` hardens objects added to the repo in loose-object form.
 * `pack` hardens objects added to the repo in packfile form.
 * `pack-metadata` hardens packfile bitmaps and indexes.
-* `commit-graph` hardens the commit graph file.
+* `commit-graph` hardens the commit-graph file.
 * `index` hardens the index when it is modified.
 * `objects` is an aggregate option that is equivalent to
   `loose-object,pack`.
diff --git a/Documentation/gitformat-commit-graph.txt b/Documentation/gitformat-commit-graph.txt
index 7324665716..31cad585e2 100644
--- a/Documentation/gitformat-commit-graph.txt
+++ b/Documentation/gitformat-commit-graph.txt
@@ -3,7 +3,7 @@ gitformat-commit-graph(5)
 
 NAME
 ----
-gitformat-commit-graph - Git commit graph format
+gitformat-commit-graph - Git commit-graph format
 
 SYNOPSIS
 --------
@@ -14,7 +14,7 @@ $GIT_DIR/objects/info/commit-graphs/*
 DESCRIPTION
 -----------
 
-The Git commit graph stores a list of commit OIDs and some associated
+The Git commit-graph stores a list of commit OIDs and some associated
 metadata, including:
 
 - The generation number of the commit.
@@ -34,7 +34,7 @@ corresponding to the array position within the list of commit OIDs. Due
 to some special constants we use to track parents, we can store at most
 (1 << 30) + (1 << 29) + (1 << 28) - 1 (around 1.8 billion) commits.
 
-== Commit graph files have the following format:
+== Commit-graph files have the following format:
 
 In order to allow extensions that add extra data to the graph, we organize
 the body into "chunks" and provide a binary lookup table at the beginning
diff --git a/Documentation/technical/commit-graph.txt b/Documentation/technical/commit-graph.txt
index d2a6a13650..86fed0de0f 100644
--- a/Documentation/technical/commit-graph.txt
+++ b/Documentation/technical/commit-graph.txt
@@ -1,4 +1,4 @@
-Git Commit Graph Design Notes
+Git Commit-Graph Design Notes
 =============================
 
 Git walks the commit graph for many reasons, including:
@@ -95,7 +95,7 @@ with default order), but is not used when the topological order is
 required (such as merge base calculations, "git log --graph").
 
 In practice, we expect some commits to be created recently and not stored
-in the commit graph. We can treat these commits as having "infinite"
+in the commit-graph. We can treat these commits as having "infinite"
 generation number and walk until reaching commits with known generation
 number.
 
@@ -149,7 +149,7 @@ Design Details
   helpful for these clones, anyway. The commit-graph will not be read or
   written when shallow commits are present.
 
-Commit Graphs Chains
+Commit-Graphs Chains
 --------------------
 
 Typically, repos grow with near-constant velocity (commits per day). Over time,
-- 
2.38.1.windows.1
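
Editorial sketch, not part of the patch: the gitformat-commit-graph
context quoted in the diff above gives the commit limit as
(1 << 30) + (1 << 29) + (1 << 28) - 1 and describes it as around 1.8
billion. The arithmetic can be checked directly; the variable name is
hypothetical.

```python
# Evaluate the limit exactly as written in the quoted documentation.
max_commits = (1 << 30) + (1 << 29) + (1 << 28) - 1

# Express it in billions to compare with the "around 1.8 billion" wording.
in_billions = max_commits / 1_000_000_000

print(max_commits, round(in_billions, 2))
```

The exact value is 1,879,048,191, a little under 1.88 billion.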


* Re: [PATCH v3 0/4] Add some Glossary of terms information
  2022-10-29 16:41   ` [PATCH v3 0/4] " Philip Oakley
                       ` (3 preceding siblings ...)
  2022-10-29 16:41     ` [PATCH v3 4/4] doc: use "commit-graph" hyphenation consistently Philip Oakley
@ 2022-10-29 17:24     ` Taylor Blau
  2022-10-29 17:34       ` Philip Oakley
  4 siblings, 1 reply; 45+ messages in thread
From: Taylor Blau @ 2022-10-29 17:24 UTC (permalink / raw)
  To: Philip Oakley
  Cc: GitList, Junio C Hamano, Derrick Stolee, Abhradeep Chakraborty

On Sat, Oct 29, 2022 at 05:41:08PM +0100, Philip Oakley wrote:
> (in reply to <20221022222539.2333-1-philipoakley@iee.email>)
>
> This short series looks to add the basics of the reachability bitmap
> and commit graph phrases to the glossary of terms. While these
> techniques are well known to their developers, for some, they are
> just magic phrases.

Thanks, the updated round looks good to me. I applied these on top of
the tip of master instead of the existing merge base (which was
dc8c8deaa6b (Prepare for 2.36.2, 2022-06-07)).

Will queue.

Thanks,
Taylor

* Re: [PATCH v3 0/4] Add some Glossary of terms information
  2022-10-29 17:24     ` [PATCH v3 0/4] Add some Glossary of terms information Taylor Blau
@ 2022-10-29 17:34       ` Philip Oakley
  0 siblings, 0 replies; 45+ messages in thread
From: Philip Oakley @ 2022-10-29 17:34 UTC (permalink / raw)
  To: Taylor Blau
  Cc: GitList, Junio C Hamano, Derrick Stolee, Abhradeep Chakraborty

On 29/10/2022 18:24, Taylor Blau wrote:
> On Sat, Oct 29, 2022 at 05:41:08PM +0100, Philip Oakley wrote:
>> (in reply to <20221022222539.2333-1-philipoakley@iee.email>)
>>
>> This short series looks to add the basics of the reachability bitmap
>> and commit graph phrases to the glossary of terms. While these
>> techniques are well known to their developers, for some, they are
>> just magic phrases.
> Thanks, the updated round looks good to me. I applied these on top of
> the tip of master instead of the existing merge base (which was
> dc8c8deaa6b (Prepare for 2.36.2, 2022-06-07)).
>
> Will queue.
>
> Thanks,
> Taylor
Thanks!
Philip

Thread overview: 45+ messages
2022-07-09 16:56 [PATCH 0/4] Add some Glossary terms, and extra renormalize information Philip Oakley via GitGitGadget
2022-07-09 16:56 ` [PATCH 1/4] glossary: add Object DataBase (ODB) abbreviation Philip Oakley via GitGitGadget
2022-07-09 16:56 ` [PATCH 2/4] glossary: add commit graph description Philip Oakley via GitGitGadget
2022-07-09 21:20   ` Junio C Hamano
2022-07-10 21:37     ` Philip Oakley
2022-08-30 14:33       ` Philip Oakley
2022-07-09 16:56 ` [PATCH 3/4] glossary: add reachability bitmap description Philip Oakley via GitGitGadget
2022-07-09 16:56 ` [PATCH 4/4] doc add: renormalize is not idempotent for CRCRLF Philip Oakley via GitGitGadget
2022-07-09 21:06   ` Junio C Hamano
2022-07-10 21:52     ` Philip Oakley
2022-07-10 22:04       ` Junio C Hamano
2022-07-10 22:25         ` Philip Oakley
2022-07-10  7:48   ` Torsten Bögershausen
2022-07-10 22:09     ` Philip Oakley
2022-08-05 22:26       ` Junio C Hamano
2022-08-06 19:22         ` Torsten Bögershausen
2022-08-08 14:32         ` Philip Oakley
2022-08-08 16:21           ` Junio C Hamano
2022-08-09 18:44           ` Torsten Bögershausen
2022-08-10 14:44         ` [PATCH v2 0/1] .. Add extra renormalize information Philip Oakley
2022-08-10 14:44           ` [PATCH v2 1/1] doc add: renormalize is not idempotent for CRCRLF Philip Oakley
2022-08-10 17:11             ` Torsten Bögershausen
2022-08-10 17:42             ` Junio C Hamano
2022-07-09 21:34 ` [PATCH 0/4] Add some Glossary terms, and extra renormalize information Junio C Hamano
2022-07-10 15:20   ` Philip Oakley
2022-10-22 22:25 ` [PATCH v2 0/3] Add some Glossary of terms information Philip Oakley
2022-10-22 22:25   ` [PATCH v2 1/3] doc: use 'object database' not ODB or abbreviation Philip Oakley
2022-10-22 22:25   ` [PATCH v2 2/3] glossary: add "commit graph" description Philip Oakley
2022-10-25 12:31     ` Derrick Stolee
2022-10-29 16:32       ` Philip Oakley
2022-10-22 22:25   ` [PATCH v2 3/3] glossary: add reachability bitmap description Philip Oakley
2022-10-24  7:43     ` Abhradeep Chakraborty
2022-10-24 16:39       ` Junio C Hamano
2022-10-24 21:23         ` Philip Oakley
2022-10-25 12:34           ` Derrick Stolee
2022-10-25 15:53             ` Junio C Hamano
2022-10-29 16:36             ` Philip Oakley
2022-10-23  1:49   ` [PATCH v2 0/3] Add some Glossary of terms information Junio C Hamano
2022-10-29 16:41   ` [PATCH v3 0/4] " Philip Oakley
2022-10-29 16:41     ` [PATCH v3 1/4] doc: use 'object database' not ODB or abbreviation Philip Oakley
2022-10-29 16:41     ` [PATCH v3 2/4] glossary: add "commit graph" description Philip Oakley
2022-10-29 16:41     ` [PATCH v3 3/4] glossary: add reachability bitmap description Philip Oakley
2022-10-29 16:41     ` [PATCH v3 4/4] doc: use "commit-graph" hyphenation consistently Philip Oakley
2022-10-29 17:24     ` [PATCH v3 0/4] Add some Glossary of terms information Taylor Blau
2022-10-29 17:34       ` Philip Oakley
