git@vger.kernel.org mailing list mirror (one of many)
* [PATCH] rev-list-options: fix off-by-one in '--filter=blob:limit=<n>' explainer
@ 2024-01-14 19:50 Nikolay Edigaryev via GitGitGadget
  2024-01-16 16:54 ` Junio C Hamano
  0 siblings, 1 reply; 2+ messages in thread
From: Nikolay Edigaryev via GitGitGadget @ 2024-01-14 19:50 UTC
  To: git; +Cc: Nikolay Edigaryev, Nikolay Edigaryev

From: Nikolay Edigaryev <edigaryev@gmail.com>

'--filter=blob:limit=<n>' was introduced in 25ec7bcac0 (list-objects:
filter objects in traverse_commit_list, 2017-11-21) and later expanded
to bitmaps in 84243da129 (pack-bitmap: implement BLOB_LIMIT filtering,
2020-02-14).

The logic that was introduced in these commits (and that still persists
to this day) omits blobs larger than _or equal_ to n bytes or units.

However, the documentation (Documentation/rev-list-options.txt) states:

> The form '--filter=blob:limit=<n>[kmg]' omits blobs larger than n
> bytes or units. n may be zero.

Moreover, the t6113-rev-list-bitmap-filters.sh tests for exactly this
logic, so it seems it is the documentation that needs fixing, not the
code.

This changes the explanation to be similar to
Documentation/git-clone.txt, which is correct.

Signed-off-by: Nikolay Edigaryev <edigaryev@gmail.com>
---

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1645%2Fedigaryev%2Ffix-blob-limit-off-by-one-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1645/edigaryev/fix-blob-limit-off-by-one-v1
Pull-Request: https://github.com/git/git/pull/1645

 Documentation/rev-list-options.txt | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 2bf239ff030..a583b52c612 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -947,10 +947,10 @@ ifdef::git-rev-list[]
 +
 The form '--filter=blob:none' omits all blobs.
 +
-The form '--filter=blob:limit=<n>[kmg]' omits blobs larger than n bytes
-or units.  n may be zero.  The suffixes k, m, and g can be used to name
-units in KiB, MiB, or GiB.  For example, 'blob:limit=1k' is the same
-as 'blob:limit=1024'.
+The form '--filter=blob:limit=<n>[kmg]' omits blobs of size at least n
+bytes or units.  n may be zero.  The suffixes k, m, and g can be used
+to name units in KiB, MiB, or GiB.  For example, 'blob:limit=1k'
+is the same as 'blob:limit=1024'.
 +
 The form '--filter=object:type=(tag|commit|tree|blob)' omits all objects
 which are not of the requested type.

base-commit: 564d0252ca632e0264ed670534a51d18a689ef5d
-- 
gitgitgadget



* Re: [PATCH] rev-list-options: fix off-by-one in '--filter=blob:limit=<n>' explainer
  2024-01-14 19:50 [PATCH] rev-list-options: fix off-by-one in '--filter=blob:limit=<n>' explainer Nikolay Edigaryev via GitGitGadget
@ 2024-01-16 16:54 ` Junio C Hamano
  0 siblings, 0 replies; 2+ messages in thread
From: Junio C Hamano @ 2024-01-16 16:54 UTC
  To: Nikolay Edigaryev via GitGitGadget; +Cc: git, Nikolay Edigaryev

"Nikolay Edigaryev via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Nikolay Edigaryev <edigaryev@gmail.com>
>
> '--filter=blob:limit=<n>' was introduced in 25ec7bcac0 (list-objects:
> filter objects in traverse_commit_list, 2017-11-21) and later expanded
> to bitmaps in 84243da129 (pack-bitmap: implement BLOB_LIMIT filtering,
> 2020-02-14)
>
> The logic that was introduced in these commits (and that still persists
> to this day) omits blobs larger than _or equal_ to n bytes or units.

Good eyes.  The former does this

		if (object_length < filter_data->max_bytes)
			goto include_it;

and the latter does this

		if (!bitmap_get(tips, pos) &&
		    get_size_by_pos(bitmap_git, pos) >= limit)
			bitmap_unset(to_filter, pos);
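
To spell out the boundary behavior, here is a minimal standalone
sketch.  It is not code from git: blob_is_omitted() is a hypothetical
helper that only mirrors the comparisons in the two snippets above
(include when size < limit, filter out when size >= limit).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of git.  Returns true when a blob of
 * `size` bytes would be omitted by '--filter=blob:limit=<limit>',
 * assuming the ">= limit" comparison quoted above.
 */
static bool blob_is_omitted(uint64_t size, uint64_t limit)
{
	return size >= limit;
}
```

Under this sketch a blob of exactly 1024 bytes is omitted with a limit
of 1024 while a 1023-byte blob is kept, and a limit of zero omits every
blob, including empty ones.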

> However, the documentation (Documentation/rev-list-options.txt) states:
>
> > The form '--filter=blob:limit=<n>[kmg]' omits blobs larger than n
> > bytes or units. n may be zero.
>
> Moreover, the t6113-rev-list-bitmap-filters.sh tests for exactly this
> logic, so it seems it is the documentation that needs fixing, not the
> code.

Yup.  The mechanism is used for things like "we do not want a large
blob, like 100MB", and a byte on the boundary does not matter all
that much in such a context, but it does not hurt to be more
correct ;-)

>  The form '--filter=blob:none' omits all blobs.
>  +
> -The form '--filter=blob:limit=<n>[kmg]' omits blobs larger than n bytes
> -or units.  n may be zero.  The suffixes k, m, and g can be used to name
> -units in KiB, MiB, or GiB.  For example, 'blob:limit=1k' is the same
> -as 'blob:limit=1024'.
> +The form '--filter=blob:limit=<n>[kmg]' omits blobs of size at least n
> +bytes or units.  n may be zero.  The suffixes k, m, and g can be used
> +to name units in KiB, MiB, or GiB.  For example, 'blob:limit=1k'
> +is the same as 'blob:limit=1024'.
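
As an illustration of the unit suffixes named in that paragraph, the
sketch below expands "<n>[kmg]" into a byte count.  parse_blob_limit()
is a hypothetical helper written for this example, not git's own
parser, and it assumes only the lowercase suffixes that the
documentation mentions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of git.  Parses "<n>", "<n>k", "<n>m"
 * or "<n>g" into a number of bytes (KiB, MiB, GiB multipliers).
 * Returns false on malformed input or overflow.
 */
static bool parse_blob_limit(const char *s, uint64_t *out)
{
	char *end;
	unsigned long long n;
	uint64_t factor = 1;

	/* Require a leading digit so signs and whitespace are rejected. */
	if (*s < '0' || *s > '9')
		return false;

	errno = 0;
	n = strtoull(s, &end, 10);
	if (errno)
		return false;

	switch (*end) {
	case 'k':
		factor = 1024ULL;
		end++;
		break;
	case 'm':
		factor = 1024ULL * 1024;
		end++;
		break;
	case 'g':
		factor = 1024ULL * 1024 * 1024;
		end++;
		break;
	case '\0':
		break;
	default:
		return false;
	}

	/* Nothing may follow the optional suffix. */
	if (*end != '\0')
		return false;
	if (n > UINT64_MAX / factor)
		return false;

	*out = n * factor;
	return true;
}
```

With this sketch "1k" and "1024" both yield 1024 bytes, matching the
example in the documentation text.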

With unnecessary paragraph wrapping, it is a bit hard to compare the
preimage and the postimage, but I manually checked that this only
does

	"larger than" -> "of size at least"

and nothing else, which is expected and in line with what the
proposed commit message claimed to do.  Good job.

Will queue.  Thanks.

>  +
>  The form '--filter=object:type=(tag|commit|tree|blob)' omits all objects
>  which are not of the requested type.
>
> base-commit: 564d0252ca632e0264ed670534a51d18a689ef5d

