git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: "Li Linchao via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, "Jeff King" <peff@peff.net>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Johannes Sixt" <j6t@kdbg.org>,
	"Li Linchao" <lilinchao@oschina.cn>
Subject: Re: [PATCH v4] rev-list: support human-readable output for `--disk-usage`
Date: Wed, 10 Aug 2022 10:34:32 -0700	[thread overview]
Message-ID: <xmqqlerwm28n.fsf@gitster.g> (raw)
In-Reply-To: <pull.1313.v4.git.1660130072657.gitgitgadget@gmail.com> (Li Linchao via GitGitGadget's message of "Wed, 10 Aug 2022 11:14:32 +0000")

"Li Linchao via GitGitGadget" <gitgitgadget@gmail.com> writes:

> diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
> index 195e74eec63..5d3880874fc 100644
> --- a/Documentation/rev-list-options.txt
> +++ b/Documentation/rev-list-options.txt
> @@ -242,6 +242,7 @@ ifdef::git-rev-list[]
>  	to `/dev/null` as the output does not have to be formatted.
>  
>  --disk-usage::
> +--disk-usage=human::
>  	Suppress normal output; instead, print the sum of the bytes used
>  	for on-disk storage by the selected commits or objects. This is
>  	equivalent to piping the output into `git cat-file
> @@ -249,6 +250,8 @@ ifdef::git-rev-list[]
>  	faster (especially with `--use-bitmap-index`). See the `CAVEATS`
>  	section in linkgit:git-cat-file[1] for the limitations of what
>  	"on-disk storage" means.
> +	When it accepts a value `human`, like: `--disk-usage=human`, this
> +	means to print objects size in human readable format.

"When it accepts" sounds wrong, because it implies there are two
cases, i.e. the user gives "--disk-usage=human" and the command
somehow decides to accept it, or to reject it, which is not what is
going on.  How about phrasing it more like ...

    With the optional value `human`, on-disk storage size is shown
    in human-readable string (e.g. 12.23 KiB, 3.50 MiB).

perhaps?

> @@ -46,6 +46,7 @@ static const char rev_list_usage[] =
>  "    --parents\n"
>  "    --children\n"
>  "    --objects | --objects-edge\n"
> +"    --disk-usage | --disk-usage=human\n"

Writing it like

	"--disk-usage[=human]\n"

would be more in line with ...

>  "    --[no-]object-names\n"

... this one.

> @@ -368,6 +370,17 @@ static int show_object_fast(
>  	return 1;
>  }
>  
> +static void print_disk_usage(off_t size)
> +{
> +	struct strbuf sb = STRBUF_INIT;
> +	if (human_readable)
> +		strbuf_humanise_bytes(&sb, size);
> +	else
> +		strbuf_addf(&sb, "%"PRIuMAX, (uintmax_t)size);
> +	puts(sb.buf);
> +	strbuf_release(&sb);
> +}

Hmph, I am not sure if we want to make it a helper like this.  The
normal case does not need to prepare the string into a strbuf but
just can send the output to the standard output stream.

It is probably easy to fix, like so:

	if (!human_readable) {
		printf("%" PRIuMAX "\n", disk_usage);
	} else {
		strbuf sb = STRBUF_INIT;
		strbuf_humanise_bytes(&sb, disk_usage);
		puts(sb.buf);
		strbuf_release(&sb);
	}

>  static inline int parse_missing_action_value(const char *value)
>  {
>  	if (!strcmp(value, "error")) {
> @@ -473,6 +486,7 @@ static int try_bitmap_disk_usage(struct rev_info *revs,
>  				 int filter_provided_objects)
>  {
>  	struct bitmap_index *bitmap_git;
> +	off_t size_from_bitmap;

Questionably named variable.

>  	if (!show_disk_usage)
>  		return -1;
> @@ -481,8 +495,8 @@ static int try_bitmap_disk_usage(struct rev_info *revs,
>  	if (!bitmap_git)
>  		return -1;
>  
> -	printf("%"PRIuMAX"\n",
> -	       (uintmax_t)get_disk_usage_from_bitmap(bitmap_git, revs));
> +	size_from_bitmap = get_disk_usage_from_bitmap(bitmap_git, revs);
> +	print_disk_usage(size_from_bitmap);

It makes sense to make the function declare how it gets disk usage
in its name, but once we call the function to get what we want,
there is no need to keep saying we got it from bitmap.  If we ever
gained another function that obtains the disk usage from other
means, then this part of the code would become

	if (use_bitmap)
		variable = get_disk_usage_from_bitmap(...);
	else
		variable = get_disk_usage_from_other_means(...);

	print_disk_usage(variable);

Now, what should that variable be named?  It is clear that it
shouldn't be named "from_bitmap" at all, because it may very well
have come from somewhere else.

You call it simply 'size' in print_disk_usage(), and that would
probably be a good name to use here.  Or even better, "disk_usage".

However, I think all of the above badness stems from the horrible
way the existing code deals with "use_bitmap_index", and it is not
in the scope of this patch to fix it, let's keep the above code
as-is.  This is only called from the "if (use_bitmap_index)" so
the size can only be from bitmap.

> @@ -624,7 +638,20 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
>  			continue;
>  		}
>  
> -		if (!strcmp(arg, "--disk-usage")) {
> +		if (skip_prefix(arg, "--disk-usage", &arg)) {
> +			if (*arg == '=') {
> +				if (!strcmp(++arg, "human")) {
> +					human_readable = 1;
> +				} else
> +					die(_("invalid value for '%s': '%s', try --disk-usage=human"), "--disk-usage", arg);

This makes the hardcoded "--disk-usage=human" subject to
translation, which it should not.  Perhaps like this:

	die(_("invalid value for '%s': '%s', the only allowed format is "%s"),
	    "--disk-usage=<format>", arg, "human");

The main point is to ensure that "human" that we are not allowing
the user to type in their language is left out of the
_("translatable string").

> +			} else if (*arg) {
> +				/*
> +				* Arguably should goto a label to continue chain of ifs?
> +				* Doesn't matter unless we try to add --disk-usage-foo
> +				* afterwards
> +				*/

Wrong indentation for a long comment?

> +				usage(rev_list_usage);
> +			}

  reply	other threads:[~2022-08-10 17:34 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-05  7:54 [PATCH] rev-list: support `--human-readable` option when applied `disk-usage` Li Linchao via GitGitGadget
2022-08-05 10:03 ` Ævar Arnfjörð Bjarmason
2022-08-05 11:01   ` lilinchao
2022-08-08  8:35 ` [PATCH v2] rev-list: support human-readable output for `--disk-usage` Li Linchao via GitGitGadget
2022-08-08  9:37   ` lilinchao
2022-08-09 13:22   ` Jeff King
2022-08-09 16:46     ` lilinchao
2022-08-10  6:01   ` [PATCH v3] " Li Linchao via GitGitGadget
2022-08-10  7:18     ` Johannes Sixt
2022-08-10 11:14     ` [PATCH v4] " Li Linchao via GitGitGadget
2022-08-10 17:34       ` Junio C Hamano [this message]
2022-08-10 21:20         ` Jeff King
2022-08-10 21:25           ` Junio C Hamano
2022-08-11  5:20           ` Junio C Hamano
2022-08-11  8:38             ` Jeff King
2022-08-11  4:47       ` [PATCH v5] " Li Linchao via GitGitGadget
2022-08-11 20:49         ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqlerwm28n.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=j6t@kdbg.org \
    --cc=lilinchao@oschina.cn \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).