From: "Li Linchao via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: "Jeff King" <peff@peff.net>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Li Linchao" <lilinchao@oschina.cn>,
	"Li Linchao" <lilinchao@oschina.cn>
Subject: [PATCH v2] rev-list: support human-readable output for `--disk-usage`
Date: Mon, 08 Aug 2022 08:35:21 +0000
Message-ID: <pull.1313.v2.git.1659947722132.gitgitgadget@gmail.com>
In-Reply-To: <pull.1313.git.1659686097163.gitgitgadget@gmail.com>

From: Li Linchao <lilinchao@oschina.cn>

The '--disk-usage' option for git-rev-list was introduced in 16950f8384
(rev-list: add --disk-usage option for calculating disk usage, 2021-02-09).
It is very useful for people inspecting the disk usage of their
repository's objects, but the resulting number is quite hard for a human
to read.

Teach git rev-list to print the result in a human-readable format when
given '--disk-usage=human'.

Signed-off-by: Li Linchao <lilinchao@oschina.cn>
---
    rev-list: support human-readable output for disk-usage
    
    The '--disk-usage' option for git-rev-list was introduced in 16950f8384
    (rev-list: add --disk-usage option for calculating disk usage,
    2021-02-09). It is very useful for people inspecting the disk usage of
    their repository's objects, but the resulting number is quite hard for
    a human to read.
    
    Teach git rev-list to print the result in a human-readable format when
    given '--disk-usage=human'.
    
    Signed-off-by: Li Linchao <lilinchao@oschina.cn>

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1313%2FCactusinhand%2Fllc%2Fadd-human-readable-option-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1313/Cactusinhand/llc/add-human-readable-option-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1313
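
For readers unfamiliar with what "human-readable" means here, the
following is a minimal standalone sketch of the idea, not Git's own
strbuf_humanise_bytes(). The function name humanise_bytes_sketch() is
hypothetical, the unit thresholds are an assumption, and the fraction is
truncated rather than rounded, so Git's real output may differ in the
last digit. Small values are printed as "N bytes", which is the form the
tests in this patch grep for.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of Git): format a byte count into
 * `out` using binary units. Values below 1024 are printed as plain
 * bytes; larger values get two truncated decimal places.
 */
static void humanise_bytes_sketch(char *out, size_t outlen, uintmax_t bytes)
{
	static const char *units[] = { "KiB", "MiB", "GiB" };
	int shift = 10;
	int i = 0;
	uintmax_t whole, frac;

	if (bytes < 1024) {
		snprintf(out, outlen, "%" PRIuMAX " %s", bytes,
			 bytes == 1 ? "byte" : "bytes");
		return;
	}

	/* Pick the largest unit (up to GiB) that keeps the value >= 1. */
	while (i < 2 && (bytes >> (shift + 10)) > 0) {
		shift += 10;
		i++;
	}

	whole = bytes >> shift;
	/* Hundredths of the unit, truncated toward zero. */
	frac = ((bytes & (((uintmax_t)1 << shift) - 1)) * 100) >> shift;

	snprintf(out, outlen, "%" PRIuMAX ".%02" PRIuMAX " %s",
		 whole, frac, units[i]);
}
```

With this sketch, 446 is formatted as "446 bytes" and 1536 as
"1.50 KiB". Because values under 1024 stay in the "N bytes" form, a
test can compare the number against the byte-exact size obtained from
git cat-file, as long as the test repository stays small.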

Range-diff vs v1:

 1:  7f8a7d3f61d ! 1:  7e34d16efe4 rev-list: support `--human-readable` option when applied `disk-usage`
     @@ Metadata
      Author: Li Linchao <lilinchao@oschina.cn>
      
       ## Commit message ##
     -    rev-list: support `--human-readable` option when applied `disk-usage`
     +    rev-list: support human-readable output for `--disk-usage`
      
          The '--disk-usage' option for git-rev-list was introduced in 16950f8384
          (rev-list: add --disk-usage option for calculating disk usage, 2021-02-09).
          This is very useful for people inspect their git repo's objects usage
     -    infomation, but the result number is quit hard for human to read.
     +    infomation, but the resulting number is quit hard for a human to read.
      
     -    Teach git rev-list to output more human readable result when using
     -    '--disk-usage' to calculate objects disk usage.
     +    Teach git rev-list to output a human readable result when using
     +    '--disk-usage'.
      
          Signed-off-by: Li Linchao <lilinchao@oschina.cn>
      
       ## Documentation/rev-list-options.txt ##
      @@ Documentation/rev-list-options.txt: ifdef::git-rev-list[]
     - 	faster (especially with `--use-bitmap-index`). See the `CAVEATS`
     - 	section in linkgit:git-cat-file[1] for the limitations of what
     - 	"on-disk storage" means.
     -+
     -+-H::
     -+--human-readable::
     -+	Print on-disk objects size in human readable format. This option
     -+	must be combined with `--disk-usage` together.
     - endif::git-rev-list[]
     + 	to `/dev/null` as the output does not have to be formatted.
       
     - --cherry-mark::
     + --disk-usage::
     ++--disk-usage=human::
     + 	Suppress normal output; instead, print the sum of the bytes used
     +-	for on-disk storage by the selected commits or objects. This is
     ++	for on-disk storage by the selected commits or objects.
     ++	When it accepts a value `human`, like: `--disk-usage=human`, this
     ++	means to print objects size in human readable format. This is
     + 	equivalent to piping the output into `git cat-file
     + 	--batch-check='%(objectsize:disk)'`, except that it runs much
     + 	faster (especially with `--use-bitmap-index`). See the `CAVEATS`
      
       ## builtin/rev-list.c ##
     +@@ builtin/rev-list.c: static const char rev_list_usage[] =
     + "    --parents\n"
     + "    --children\n"
     + "    --objects | --objects-edge\n"
     ++"    --disk-usage | --disk-usage=human\n"
     + "    --unpacked\n"
     + "    --header | --pretty\n"
     + "    --[no-]object-names\n"
      @@ builtin/rev-list.c: static int arg_show_object_names = 1;
       
       static int show_disk_usage;
     @@ builtin/rev-list.c: static int try_bitmap_disk_usage(struct rev_info *revs,
       				 int filter_provided_objects)
       {
       	struct bitmap_index *bitmap_git;
     -+	struct strbuf bitmap_size_buf = STRBUF_INIT;
     ++	struct strbuf disk_buf = STRBUF_INIT;
      +	off_t size_from_bitmap;
       
       	if (!show_disk_usage)
     @@ builtin/rev-list.c: static int try_bitmap_disk_usage(struct rev_info *revs,
      -	printf("%"PRIuMAX"\n",
      -	       (uintmax_t)get_disk_usage_from_bitmap(bitmap_git, revs));
      +	size_from_bitmap = get_disk_usage_from_bitmap(bitmap_git, revs);
     -+	if (human_readable) {
     -+		strbuf_humanise_bytes(&bitmap_size_buf, size_from_bitmap);
     -+		printf("%s\n", bitmap_size_buf.buf);
     -+	} else
     -+		printf("%"PRIuMAX"\n", (uintmax_t)size_from_bitmap);
     -+	strbuf_release(&bitmap_size_buf);
     ++	if (human_readable)
     ++		strbuf_humanise_bytes(&disk_buf, size_from_bitmap);
     ++	else
     ++		strbuf_addf(&disk_buf, "%"PRIuMAX"", (uintmax_t)size_from_bitmap);
     ++	puts(disk_buf.buf);
     ++	strbuf_release(&disk_buf);
       	return 0;
       }
       
     -@@ builtin/rev-list.c: int cmd_rev_list(int argc, const char **argv, const char *prefix)
     - {
     - 	struct rev_info revs;
     - 	struct rev_list_info info;
     -+	struct strbuf disk_buf = STRBUF_INIT;
     - 	struct setup_revision_opt s_r_opt = {
     - 		.allow_exclude_promisor_objects = 1,
     - 	};
      @@ builtin/rev-list.c: int cmd_rev_list(int argc, const char **argv, const char *prefix)
       			continue;
       		}
       
     -+		if (!strcmp(arg, "--human-readable") || !strcmp(arg, "-H")) {
     -+			human_readable = 1;
     -+			continue;
     -+		}
     -+
     - 		usage(rev_list_usage);
     +-		if (!strcmp(arg, "--disk-usage")) {
     +-			show_disk_usage = 1;
     +-			info.flags |= REV_LIST_QUIET;
     +-			continue;
     ++		if (skip_prefix(arg, "--disk-usage", &arg)) {
     ++			if (*arg == '=') {
     ++				if (!strcmp(++arg, "human")) {
     ++					human_readable = 1;
     ++					show_disk_usage = 1;
     ++					info.flags |= REV_LIST_QUIET;
     ++					continue;
     ++				} else
     ++					die(_("invalid value for '%s': '%s', try --disk-usage=human"), "--disk-usage", arg);
     ++			} else {
     ++				show_disk_usage = 1;
     ++				info.flags |= REV_LIST_QUIET;
     ++				continue;
     ++			}
     + 		}
       
     - 	}
     -+
     -+	if (!show_disk_usage && human_readable)
     -+		die(_("option '%s' should be used with '%s' together"), "--human-readable/-H", "--disk-usage");
     - 	if (revs.commit_format != CMIT_FMT_USERFORMAT)
     - 		revs.include_header = 1;
     - 	if (revs.commit_format != CMIT_FMT_UNSPECIFIED) {
     + 		usage(rev_list_usage);
      @@ builtin/rev-list.c: int cmd_rev_list(int argc, const char **argv, const char *prefix)
       			printf("%d\n", revs.count_left + revs.count_right);
       	}
     @@ builtin/rev-list.c: int cmd_rev_list(int argc, const char **argv, const char *pr
      -	if (show_disk_usage)
      -		printf("%"PRIuMAX"\n", (uintmax_t)total_disk_usage);
      +	if (show_disk_usage) {
     -+		if (human_readable) {
     ++		struct strbuf disk_buf = STRBUF_INIT;
     ++		if (human_readable)
      +			strbuf_humanise_bytes(&disk_buf, total_disk_usage);
     -+			printf("%s\n", disk_buf.buf);
     -+		} else
     -+			printf("%"PRIuMAX"\n", (uintmax_t)total_disk_usage);
     ++		else
     ++			strbuf_addf(&disk_buf, "%"PRIuMAX"", (uintmax_t)total_disk_usage);
     ++		puts(disk_buf.buf);
     ++		strbuf_release(&disk_buf);
      +	}
       
       cleanup:
       	release_revisions(&revs);
     -+	strbuf_release(&disk_buf);
     - 	return ret;
     - }
      
       ## t/t6115-rev-list-du.sh ##
      @@ t/t6115-rev-list-du.sh: check_du HEAD
       check_du --objects HEAD
       check_du --objects HEAD^..HEAD
       
     -+
     -+test_expect_success 'rev-list --disk-usage with --human-readable' '
     -+	git rev-list --objects HEAD --disk-usage --human-readable >actual &&
     -+	test_i18ngrep -e "446 bytes" actual
     ++# As mentioned above, don't use hardcode sizes as actual size, but use the
     ++# output from git cat-file.
     ++test_expect_success 'rev-list --disk-usage=human' '
     ++	git rev-list --objects HEAD --disk-usage=human >actual &&
     ++	disk_usage_slow --objects HEAD >actual_size &&
     ++	grep "$(cat actual_size) bytes" actual
      +'
      +
     -+test_expect_success 'rev-list --disk-usage with bitmap and --human-readable' '
     -+	git rev-list --objects HEAD --use-bitmap-index --disk-usage -H >actual &&
     -+	test_i18ngrep -e "446 bytes" actual
     ++test_expect_success 'rev-list --disk-usage=human with bitmaps' '
     ++	git rev-list --objects HEAD --use-bitmap-index --disk-usage=human >actual &&
     ++	disk_usage_slow --objects HEAD >actual_size &&
     ++	grep "$(cat actual_size) bytes" actual
      +'
      +
     -+test_expect_success 'rev-list use --human-readable without --disk-usage' '
     -+	test_must_fail git rev-list --objects HEAD --human-readable 2> err &&
     -+	echo "fatal: option '\''--human-readable/-H'\'' should be used with" \
     -+	"'\''--disk-usage'\'' together" >expect &&
     ++test_expect_success 'rev-list use --disk-usage unproperly' '
     ++	test_must_fail git rev-list --objects HEAD --disk-usage=typo 2>err &&
     ++	cat >expect <<-\EOF &&
     ++	fatal: invalid value for '\''--disk-usage'\'': '\''typo'\'', try --disk-usage=human
     ++	EOF
      +	test_cmp err expect
      +'
      +

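The range-diff above replaces the separate `--human-readable`/`-H`
switch with an optional value on `--disk-usage`. As posted, the new
parsing accepts any argument that merely starts with `--disk-usage` and
is not followed by `=` (for example `--disk-usagefoo`) as if it were the
plain option. The following standalone sketch shows one stricter way to
classify the argument. It is an illustration only: the enum,
skip_prefix_sketch() and parse_disk_usage_arg() are hypothetical names,
and skip_prefix_sketch() merely imitates the shape of Git's internal
skip_prefix() helper.

```c
#include <string.h>

/* Hypothetical result codes for this sketch. */
enum du_parse {
	DU_NOT_MATCHED = 0,	/* some other option */
	DU_PLAIN,		/* exactly --disk-usage */
	DU_HUMAN,		/* --disk-usage=human */
	DU_BAD_VALUE		/* --disk-usage=<anything else> */
};

/* If `str` starts with `prefix`, point `*out` past it and return 1. */
static int skip_prefix_sketch(const char *str, const char *prefix,
			      const char **out)
{
	size_t len = strlen(prefix);

	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}

static enum du_parse parse_disk_usage_arg(const char *arg)
{
	const char *rest;

	if (!skip_prefix_sketch(arg, "--disk-usage", &rest))
		return DU_NOT_MATCHED;
	if (!*rest)
		return DU_PLAIN;
	if (*rest != '=')
		return DU_NOT_MATCHED;	/* e.g. --disk-usagefoo */
	if (!strcmp(rest + 1, "human"))
		return DU_HUMAN;
	return DU_BAD_VALUE;
}
```

A caller could set `show_disk_usage` for both DU_PLAIN and DU_HUMAN,
set `human_readable` only for DU_HUMAN, die on DU_BAD_VALUE, and fall
through to the usage message on DU_NOT_MATCHED. Keeping the remainder in
a separate `rest` pointer also leaves `arg` intact for that fall-through.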

 Documentation/rev-list-options.txt |  5 +++-
 builtin/rev-list.c                 | 42 ++++++++++++++++++++++++------
 t/t6115-rev-list-du.sh             | 22 ++++++++++++++++
 3 files changed, 60 insertions(+), 9 deletions(-)

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 195e74eec63..9966ce4ef91 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -242,8 +242,11 @@ ifdef::git-rev-list[]
 	to `/dev/null` as the output does not have to be formatted.
 
 --disk-usage::
+--disk-usage=human::
 	Suppress normal output; instead, print the sum of the bytes used
-	for on-disk storage by the selected commits or objects. This is
+	for on-disk storage by the selected commits or objects.
+	When it accepts a value `human`, like: `--disk-usage=human`, this
+	means to print objects size in human readable format. This is
 	equivalent to piping the output into `git cat-file
 	--batch-check='%(objectsize:disk)'`, except that it runs much
 	faster (especially with `--use-bitmap-index`). See the `CAVEATS`
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 30fd8e83eaf..05ef232dcbe 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -46,6 +46,7 @@ static const char rev_list_usage[] =
 "    --parents\n"
 "    --children\n"
 "    --objects | --objects-edge\n"
+"    --disk-usage | --disk-usage=human\n"
 "    --unpacked\n"
 "    --header | --pretty\n"
 "    --[no-]object-names\n"
@@ -81,6 +82,7 @@ static int arg_show_object_names = 1;
 
 static int show_disk_usage;
 static off_t total_disk_usage;
+static int human_readable;
 
 static off_t get_object_disk_usage(struct object *obj)
 {
@@ -473,6 +475,8 @@ static int try_bitmap_disk_usage(struct rev_info *revs,
 				 int filter_provided_objects)
 {
 	struct bitmap_index *bitmap_git;
+	struct strbuf disk_buf = STRBUF_INIT;
+	off_t size_from_bitmap;
 
 	if (!show_disk_usage)
 		return -1;
@@ -481,8 +485,13 @@ static int try_bitmap_disk_usage(struct rev_info *revs,
 	if (!bitmap_git)
 		return -1;
 
-	printf("%"PRIuMAX"\n",
-	       (uintmax_t)get_disk_usage_from_bitmap(bitmap_git, revs));
+	size_from_bitmap = get_disk_usage_from_bitmap(bitmap_git, revs);
+	if (human_readable)
+		strbuf_humanise_bytes(&disk_buf, size_from_bitmap);
+	else
+		strbuf_addf(&disk_buf, "%"PRIuMAX"", (uintmax_t)size_from_bitmap);
+	puts(disk_buf.buf);
+	strbuf_release(&disk_buf);
 	return 0;
 }
 
@@ -624,10 +633,20 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 			continue;
 		}
 
-		if (!strcmp(arg, "--disk-usage")) {
-			show_disk_usage = 1;
-			info.flags |= REV_LIST_QUIET;
-			continue;
+		if (skip_prefix(arg, "--disk-usage", &arg)) {
+			if (*arg == '=') {
+				if (!strcmp(++arg, "human")) {
+					human_readable = 1;
+					show_disk_usage = 1;
+					info.flags |= REV_LIST_QUIET;
+					continue;
+				} else
+					die(_("invalid value for '%s': '%s', try --disk-usage=human"), "--disk-usage", arg);
+			} else {
+				show_disk_usage = 1;
+				info.flags |= REV_LIST_QUIET;
+				continue;
+			}
 		}
 
 		usage(rev_list_usage);
@@ -752,8 +771,15 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 			printf("%d\n", revs.count_left + revs.count_right);
 	}
 
-	if (show_disk_usage)
-		printf("%"PRIuMAX"\n", (uintmax_t)total_disk_usage);
+	if (show_disk_usage) {
+		struct strbuf disk_buf = STRBUF_INIT;
+		if (human_readable)
+			strbuf_humanise_bytes(&disk_buf, total_disk_usage);
+		else
+			strbuf_addf(&disk_buf, "%"PRIuMAX"", (uintmax_t)total_disk_usage);
+		puts(disk_buf.buf);
+		strbuf_release(&disk_buf);
+	}
 
 cleanup:
 	release_revisions(&revs);
diff --git a/t/t6115-rev-list-du.sh b/t/t6115-rev-list-du.sh
index b4aef32b713..b34841a4ba8 100755
--- a/t/t6115-rev-list-du.sh
+++ b/t/t6115-rev-list-du.sh
@@ -48,4 +48,26 @@ check_du HEAD
 check_du --objects HEAD
 check_du --objects HEAD^..HEAD
 
+# As mentioned above, don't use hardcode sizes as actual size, but use the
+# output from git cat-file.
+test_expect_success 'rev-list --disk-usage=human' '
+	git rev-list --objects HEAD --disk-usage=human >actual &&
+	disk_usage_slow --objects HEAD >actual_size &&
+	grep "$(cat actual_size) bytes" actual
+'
+
+test_expect_success 'rev-list --disk-usage=human with bitmaps' '
+	git rev-list --objects HEAD --use-bitmap-index --disk-usage=human >actual &&
+	disk_usage_slow --objects HEAD >actual_size &&
+	grep "$(cat actual_size) bytes" actual
+'
+
+test_expect_success 'rev-list use --disk-usage unproperly' '
+	test_must_fail git rev-list --objects HEAD --disk-usage=typo 2>err &&
+	cat >expect <<-\EOF &&
+	fatal: invalid value for '\''--disk-usage'\'': '\''typo'\'', try --disk-usage=human
+	EOF
+	test_cmp err expect
+'
+
 test_done

base-commit: 679aad9e82d0dfd8ef3d1f98fa4629665496cec9
-- 
gitgitgadget

Thread overview: 17+ messages
2022-08-05  7:54 [PATCH] rev-list: support `--human-readable` option when applied `disk-usage` Li Linchao via GitGitGadget
2022-08-05 10:03 ` Ævar Arnfjörð Bjarmason
2022-08-05 11:01   ` lilinchao
2022-08-08  8:35 ` Li Linchao via GitGitGadget [this message]
2022-08-08  9:37   ` [PATCH v2] rev-list: support human-readable output for `--disk-usage` lilinchao
2022-08-09 13:22   ` Jeff King
2022-08-09 16:46     ` lilinchao
2022-08-10  6:01   ` [PATCH v3] " Li Linchao via GitGitGadget
2022-08-10  7:18     ` Johannes Sixt
2022-08-10 11:14     ` [PATCH v4] " Li Linchao via GitGitGadget
2022-08-10 17:34       ` Junio C Hamano
2022-08-10 21:20         ` Jeff King
2022-08-10 21:25           ` Junio C Hamano
2022-08-11  5:20           ` Junio C Hamano
2022-08-11  8:38             ` Jeff King
2022-08-11  4:47       ` [PATCH v5] " Li Linchao via GitGitGadget
2022-08-11 20:49         ` Junio C Hamano
