git@vger.kernel.org list mirror (unofficial, one of many)
* [PATCH 0/3] support `--oid-only` in `ls-tree`
@ 2021-11-15 11:51 Teng Long
  2021-11-15 11:51 ` [PATCH 1/3] ls-tree.c: support `--oid-only` option for "git-ls-tree" Teng Long
                   ` (5 more replies)
  0 siblings, 6 replies; 224+ messages in thread
From: Teng Long @ 2021-11-15 11:51 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Teng Long

Sometimes we only want the object IDs from the output of `ls-tree`.
In practice, commands like `sed` or `cut` are usually used to filter
the original output to achieve this.

This series contains three commits:

    1. Implementation of the option.
    2. New tests in "t3104".
    3. Documentation modifications.

I would appreciate it if someone could help review the patches.

Thanks.

Teng Long (3):
  ls-tree.c: support `--oid-only` option for "git-ls-tree"
  t3104: add related tests for `--oid-only` option
  git-ls-tree.txt: description of the 'oid-only' option

 Documentation/git-ls-tree.txt |  8 +++--
 builtin/ls-tree.c             | 11 +++++++
 t/t3104-ls-tree-oid.sh        | 55 +++++++++++++++++++++++++++++++++++
 3 files changed, 72 insertions(+), 2 deletions(-)
 create mode 100755 t/t3104-ls-tree-oid.sh

-- 
2.33.1.9.g5fbd2fc599.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH 1/3] ls-tree.c: support `--oid-only` option for "git-ls-tree"
  2021-11-15 11:51 [PATCH 0/3] support `--oid-only` in `ls-tree` Teng Long
@ 2021-11-15 11:51 ` Teng Long
  2021-11-15 15:12   ` Ævar Arnfjörð Bjarmason
  2021-11-15 19:16   ` Jeff King
  2021-11-15 11:51 ` [PATCH 2/3] t3104: add related tests for `--oid-only` option Teng Long
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 224+ messages in thread
From: Teng Long @ 2021-11-15 11:51 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Teng Long

This commit adds an option named `--oid-only` that makes `git ls-tree`
print only the OID of each object. `--oid-only` and `--name-only` are
mutually exclusive.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
 builtin/ls-tree.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3a442631c7..1f82229649 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -20,6 +20,7 @@ static int line_termination = '\n';
 #define LS_SHOW_TREES 4
 #define LS_NAME_ONLY 8
 #define LS_SHOW_SIZE 16
+#define LS_OID_ONLY 32
 static int abbrev;
 static int ls_options;
 static struct pathspec pathspec;
@@ -90,6 +91,14 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	else if (ls_options & LS_TREE_ONLY)
 		return 0;
 
+	if ((ls_options & LS_NAME_ONLY) && (ls_options & LS_OID_ONLY))
+		die(_("cannot specify --oid-only and --name-only at the same time"));
+
+	if (ls_options & LS_OID_ONLY) {
+		printf("%s\n", find_unique_abbrev(oid, abbrev));
+		return 0;
+	}
+
 	if (!(ls_options & LS_NAME_ONLY)) {
 		if (ls_options & LS_SHOW_SIZE) {
 			char size_text[24];
@@ -139,6 +148,8 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 			LS_NAME_ONLY),
 		OPT_BIT(0, "name-status", &ls_options, N_("list only filenames"),
 			LS_NAME_ONLY),
+		OPT_BIT(0, "oid-only", &ls_options, N_("list only oids"),
+			LS_OID_ONLY),
 		OPT_SET_INT(0, "full-name", &chomp_prefix,
 			    N_("use full path names"), 0),
 		OPT_BOOL(0, "full-tree", &full_tree,
-- 
2.33.1.9.g5fbd2fc599.dirty


* [PATCH 2/3] t3104: add related tests for `--oid-only` option
  2021-11-15 11:51 [PATCH 0/3] support `--oid-only` in `ls-tree` Teng Long
  2021-11-15 11:51 ` [PATCH 1/3] ls-tree.c: support `--oid-only` option for "git-ls-tree" Teng Long
@ 2021-11-15 11:51 ` Teng Long
  2021-11-15 15:54   ` Đoàn Trần Công Danh
  2021-11-15 11:51 ` [PATCH 3/3] git-ls-tree.txt: description of the 'oid-only' option Teng Long
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 224+ messages in thread
From: Teng Long @ 2021-11-15 11:51 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Teng Long

Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
 t/t3104-ls-tree-oid.sh | 55 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 55 insertions(+)
 create mode 100755 t/t3104-ls-tree-oid.sh

diff --git a/t/t3104-ls-tree-oid.sh b/t/t3104-ls-tree-oid.sh
new file mode 100755
index 0000000000..78ab9127c7
--- /dev/null
+++ b/t/t3104-ls-tree-oid.sh
@@ -0,0 +1,55 @@
+#!/bin/sh
+
+test_description='git ls-tree oids handling.'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	echo 111 >1.txt &&
+	echo 222 >2.txt &&
+	mkdir -p path0/a/b/c &&
+	echo 333 >path0/a/b/c/3.txt &&
+	find *.txt path* \( -type f -o -type l \) -print |
+	xargs git update-index --add &&
+	tree=$(git write-tree) &&
+	echo $tree
+'
+
+
+test_expect_success 'specify with --oid-only' '
+	git ls-tree --oid-only $tree >current &&
+	cat >expected <<\EOF &&
+58c9bdf9d017fcd178dc8c073cbfcbb7ff240d6c
+c200906efd24ec5e783bee7f23b5d7c941b0c12c
+4e3849a078083863912298a25db30997cb8ca6d6
+EOF
+	test_cmp current expected
+'
+
+test_expect_success 'specify with --oid-only and -r' '
+	git ls-tree --oid-only -r $tree >current &&
+	cat >expected <<\EOF &&
+58c9bdf9d017fcd178dc8c073cbfcbb7ff240d6c
+c200906efd24ec5e783bee7f23b5d7c941b0c12c
+55bd0ac4c42e46cd751eb7405e12a35e61425550
+EOF
+	test_cmp current expected
+'
+
+test_expect_success 'specify with --oid-only and --abbrev' '
+	git ls-tree --oid-only --abbrev=6 $tree >current &&
+	cat >expected <<\EOF &&
+58c9bd
+c20090
+4e3849
+EOF
+	test_cmp current expected
+'
+
+test_expect_success 'cannot specify --name-only and --oid-only as the same time' '
+	test_must_fail git ls-tree --oid-only --name-only $tree >current 2>&1 >/dev/null &&
+	echo "fatal: cannot specify --oid-only and --name-only at the same time" > expected &&
+	test_cmp current expected
+'
+
+test_done
-- 
2.33.1.9.g5fbd2fc599.dirty


* [PATCH 3/3] git-ls-tree.txt: description of the 'oid-only' option
  2021-11-15 11:51 [PATCH 0/3] support `--oid-only` in `ls-tree` Teng Long
  2021-11-15 11:51 ` [PATCH 1/3] ls-tree.c: support `--oid-only` option for "git-ls-tree" Teng Long
  2021-11-15 11:51 ` [PATCH 2/3] t3104: add related tests for `--oid-only` option Teng Long
@ 2021-11-15 11:51 ` Teng Long
  2021-11-15 15:13 ` [PATCH 0/3] support `--oid-only` in `ls-tree` Ævar Arnfjörð Bjarmason
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2021-11-15 11:51 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Teng Long

Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
 Documentation/git-ls-tree.txt | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-ls-tree.txt b/Documentation/git-ls-tree.txt
index db02d6d79a..bc711dc00a 100644
--- a/Documentation/git-ls-tree.txt
+++ b/Documentation/git-ls-tree.txt
@@ -10,7 +10,8 @@ SYNOPSIS
 --------
 [verse]
 'git ls-tree' [-d] [-r] [-t] [-l] [-z]
-	    [--name-only] [--name-status] [--full-name] [--full-tree] [--abbrev[=<n>]]
+	    [--name-only] [--name-status] [--oid-only]
+	    [--full-name] [--full-tree] [--abbrev[=<n>]]
 	    <tree-ish> [<path>...]
 
 DESCRIPTION
@@ -59,7 +60,10 @@ OPTIONS
 --name-only::
 --name-status::
 	List only filenames (instead of the "long" output), one per line.
-
+	Cannot be used with `--oid-only` together.
+--oid-only::
+	List only OIDs of the objects, one per line. Cannot be used with
+	`--name-only` or `--name-status` together.
 --abbrev[=<n>]::
 	Instead of showing the full 40-byte hexadecimal object
 	lines, show the shortest prefix that is at least '<n>'
-- 
2.33.1.9.g5fbd2fc599.dirty


* Re: [PATCH 1/3] ls-tree.c: support `--oid-only` option for "git-ls-tree"
  2021-11-15 11:51 ` [PATCH 1/3] ls-tree.c: support `--oid-only` option for "git-ls-tree" Teng Long
@ 2021-11-15 15:12   ` Ævar Arnfjörð Bjarmason
  2021-11-18  9:28     ` Teng Long
  2021-11-15 19:16   ` Jeff King
  1 sibling, 1 reply; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-15 15:12 UTC (permalink / raw)
  To: Teng Long; +Cc: git, gitster, peff


On Mon, Nov 15 2021, Teng Long wrote:

> This commit adds an option named `--oid-only` that makes `git ls-tree`
> print only the OID of each object. `--oid-only` and `--name-only` are
> mutually exclusive.
>
> Signed-off-by: Teng Long <dyroneteng@gmail.com>
> ---
>  builtin/ls-tree.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
> index 3a442631c7..1f82229649 100644
> --- a/builtin/ls-tree.c
> +++ b/builtin/ls-tree.c
> @@ -20,6 +20,7 @@ static int line_termination = '\n';
>  #define LS_SHOW_TREES 4
>  #define LS_NAME_ONLY 8
>  #define LS_SHOW_SIZE 16
> +#define LS_OID_ONLY 32
>  static int abbrev;
>  static int ls_options;
>  static struct pathspec pathspec;
> @@ -90,6 +91,14 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
>  	else if (ls_options & LS_TREE_ONLY)
>  		return 0;
>  
> +	if ((ls_options & LS_NAME_ONLY) && (ls_options & LS_OID_ONLY))
> +		die(_("cannot specify --oid-only and --name-only at the same time"));

If you make these an OPT_CMDMODE you get this behavior for free. See
e.g. my
https://lore.kernel.org/git/patch-v2-06.10-d945fc94774-20211112T221506Z-avarab@gmail.com/

* Re: [PATCH 0/3] support `--oid-only` in `ls-tree`
  2021-11-15 11:51 [PATCH 0/3] support `--oid-only` in `ls-tree` Teng Long
                   ` (2 preceding siblings ...)
  2021-11-15 11:51 ` [PATCH 3/3] git-ls-tree.txt: description of the 'oid-only' option Teng Long
@ 2021-11-15 15:13 ` Ævar Arnfjörð Bjarmason
  2021-11-15 19:09   ` Jeff King
  2021-11-15 19:23 ` Jeff King
  2021-11-19 12:09 ` [PATCH v2 0/1] " Teng Long
  5 siblings, 1 reply; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-15 15:13 UTC (permalink / raw)
  To: Teng Long; +Cc: git, gitster, peff


On Mon, Nov 15 2021, Teng Long wrote:

> Sometimes we only want the object IDs from the output of `ls-tree`.
> In practice, commands like `sed` or `cut` are usually used to filter
> the original output to achieve this.
>
> This series contains three commits:
>
>     1. Implementation of the option.
>     2. New tests in "t3104".
>     3. Documentation modifications.
>
> I would appreciate it if someone could help review the patches.

I've looked them over; they look mostly correct, though the test code
in 2/3 looks a bit too complex (using find?).

But I'd much rather see this be done with adding strbuf_expand() to
ls-tree. I.e. its docs say that it can emit:

    <mode> SP <type> SP <object> TAB <file>

Or, with -l:

    <mode> SP <type> SP <object> SP <object size> TAB <file>

If you use strbuf_expand() you can just define a default format of:

    %(objectmode) SP %(objecttype) SP %(objectname) TAB %(path)

Then make the existing -l option a shorthand for tweaking that to:

    %(objectmode) SP %(objecttype) SP %(objectname) SP %(objectsize) TAB %(path)

Then you can get what you want out of this with a simple:

    git ls-tree --format="%(objectname)"

See e.g. git-cat-file for an existing use of strbuf_expand().
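
As a rough illustration of the --format idea described above, here is a
minimal, self-contained sketch of "%(name)" placeholder expansion. It is
not git's strbuf_expand(): the struct entry type and the expand_entry()
helper are hypothetical names invented for this example, and it writes
into a fixed-size buffer instead of a strbuf. The placeholder names are
the ones proposed in this message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-entry data; not git's real structures. */
struct entry {
	unsigned mode;
	const char *type;
	const char *oid_hex;
	const char *path;
};

/*
 * Expand "%(name)" placeholders from fmt into out (capacity outsz).
 * Returns 0 on success, or -1 on an unknown or unterminated
 * placeholder, or when the result does not fit.
 */
static int expand_entry(char *out, size_t outsz, const char *fmt,
			const struct entry *e)
{
	size_t used = 0;

	assert(out && fmt && e);
	if (!outsz)
		return -1;

	while (*fmt) {
		const char *name, *end;
		size_t len;
		int n;

		if (fmt[0] != '%' || fmt[1] != '(') {
			/* Literal character: copy it, keeping room for NUL. */
			if (used + 1 >= outsz)
				return -1;
			out[used++] = *fmt++;
			continue;
		}

		name = fmt + 2;
		end = strchr(name, ')');
		if (!end)
			return -1;
		len = (size_t)(end - name);

		if (len == 10 && !strncmp(name, "objectmode", len))
			n = snprintf(out + used, outsz - used, "%06o", e->mode);
		else if (len == 10 && !strncmp(name, "objecttype", len))
			n = snprintf(out + used, outsz - used, "%s", e->type);
		else if (len == 10 && !strncmp(name, "objectname", len))
			n = snprintf(out + used, outsz - used, "%s", e->oid_hex);
		else if (len == 4 && !strncmp(name, "path", len))
			n = snprintf(out + used, outsz - used, "%s", e->path);
		else
			return -1;

		/* snprintf reports the length it wanted; reject truncation. */
		if (n < 0 || (size_t)n >= outsz - used)
			return -1;
		used += (size_t)n;
		fmt = end + 1;
	}
	out[used] = '\0';
	return 0;
}
```

With a helper like this, the default output, -l and --name-only would
differ only in the format string passed in, and an --oid-only mode would
simply pass "%(objectname)".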

* Re: [PATCH 2/3] t3104: add related tests for `--oid-only` option
  2021-11-15 11:51 ` [PATCH 2/3] t3104: add related tests for `--oid-only` option Teng Long
@ 2021-11-15 15:54   ` Đoàn Trần Công Danh
  2021-11-18  8:45     ` Teng Long
  0 siblings, 1 reply; 224+ messages in thread
From: Đoàn Trần Công Danh @ 2021-11-15 15:54 UTC (permalink / raw)
  To: Teng Long; +Cc: git, gitster, peff

On 2021-11-15 19:51:52+0800, Teng Long <dyroneteng@gmail.com> wrote:
> Signed-off-by: Teng Long <dyroneteng@gmail.com>
> ---
>  t/t3104-ls-tree-oid.sh | 55 ++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 55 insertions(+)
>  create mode 100755 t/t3104-ls-tree-oid.sh
> 
> diff --git a/t/t3104-ls-tree-oid.sh b/t/t3104-ls-tree-oid.sh
> new file mode 100755
> index 0000000000..78ab9127c7
> --- /dev/null
> +++ b/t/t3104-ls-tree-oid.sh
> @@ -0,0 +1,55 @@
> +#!/bin/sh
> +
> +test_description='git ls-tree oids handling.'
> +
> +. ./test-lib.sh
> +
> +test_expect_success 'setup' '
> +	echo 111 >1.txt &&
> +	echo 222 >2.txt &&
> +	mkdir -p path0/a/b/c &&
> +	echo 333 >path0/a/b/c/3.txt &&
> +	find *.txt path* \( -type f -o -type l \) -print |
> +	xargs git update-index --add &&
> +	tree=$(git write-tree) &&
> +	echo $tree
> +'
> +
> +
> +test_expect_success 'specify with --oid-only' '
> +	git ls-tree --oid-only $tree >current &&
> +	cat >expected <<\EOF &&
> +58c9bdf9d017fcd178dc8c073cbfcbb7ff240d6c
> +c200906efd24ec5e783bee7f23b5d7c941b0c12c
> +4e3849a078083863912298a25db30997cb8ca6d6
> +EOF

Failed with:

	GIT_TEST_DEFAULT_HASH=sha256 ./t3104-ls-tree-oid.sh

I think we can use:

	git ls-tree $tree | awk '{print $3}' >expected

> +	test_cmp current expected
> +'
> +
> +test_expect_success 'specify with --oid-only and -r' '
> +	git ls-tree --oid-only -r $tree >current &&
> +	cat >expected <<\EOF &&
> +58c9bdf9d017fcd178dc8c073cbfcbb7ff240d6c
> +c200906efd24ec5e783bee7f23b5d7c941b0c12c
> +55bd0ac4c42e46cd751eb7405e12a35e61425550
> +EOF
> +	test_cmp current expected
> +'

Ditto for this test and the tests below.

> +
> +test_expect_success 'specify with --oid-only and --abbrev' '
> +	git ls-tree --oid-only --abbrev=6 $tree >current &&
> +	cat >expected <<\EOF &&
> +58c9bd
> +c20090
> +4e3849
> +EOF
> +	test_cmp current expected
> +'
> +
> +test_expect_success 'cannot specify --name-only and --oid-only as the same time' '
> +	test_must_fail git ls-tree --oid-only --name-only $tree >current 2>&1 >/dev/null &&

The last redirection '>/dev/null' does nothing, me think.

> +	echo "fatal: cannot specify --oid-only and --name-only at the same time" > expected &&

Style nit:

	use '>expected' instead of '> expected'

> +	test_cmp current expected
> +'
> +
> +test_done
> -- 
> 2.33.1.9.g5fbd2fc599.dirty
> 

-- 
Danh

* Re: [PATCH 0/3] support `--oid-only` in `ls-tree`
  2021-11-15 15:13 ` [PATCH 0/3] support `--oid-only` in `ls-tree` Ævar Arnfjörð Bjarmason
@ 2021-11-15 19:09   ` Jeff King
  2021-11-15 21:50     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 224+ messages in thread
From: Jeff King @ 2021-11-15 19:09 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Teng Long, git, gitster

On Mon, Nov 15, 2021 at 04:13:24PM +0100, Ævar Arnfjörð Bjarmason wrote:

> But I'd much rather see this be done with adding strbuf_expand() to
> ls-tree. I.e. its docs say that it can emit:

I had a similar thought, but that's a much bigger task. I think it would
be reasonable to add --oid-only to match the existing --name-only, etc.
If we later add a custom --format option, then it can easily be folded
in and explained as "this is an alias for --format=%(objectname)", just
like --name-only would become "this is an alias for --format=%(path)".

-Peff

* Re: [PATCH 1/3] ls-tree.c: support `--oid-only` option for "git-ls-tree"
  2021-11-15 11:51 ` [PATCH 1/3] ls-tree.c: support `--oid-only` option for "git-ls-tree" Teng Long
  2021-11-15 15:12   ` Ævar Arnfjörð Bjarmason
@ 2021-11-15 19:16   ` Jeff King
  2021-11-15 19:25     ` Jeff King
  2021-11-18 11:23     ` Teng Long
  1 sibling, 2 replies; 224+ messages in thread
From: Jeff King @ 2021-11-15 19:16 UTC (permalink / raw)
  To: Teng Long; +Cc: git, gitster

On Mon, Nov 15, 2021 at 07:51:51PM +0800, Teng Long wrote:

> diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
> index 3a442631c7..1f82229649 100644
> --- a/builtin/ls-tree.c
> +++ b/builtin/ls-tree.c
> @@ -20,6 +20,7 @@ static int line_termination = '\n';
>  #define LS_SHOW_TREES 4
>  #define LS_NAME_ONLY 8
>  #define LS_SHOW_SIZE 16
> +#define LS_OID_ONLY 32
>  static int abbrev;
>  static int ls_options;
>  static struct pathspec pathspec;
> @@ -90,6 +91,14 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
>  	else if (ls_options & LS_TREE_ONLY)
>  		return 0;
>  
> +	if ((ls_options & LS_NAME_ONLY) && (ls_options & LS_OID_ONLY))
> +		die(_("cannot specify --oid-only and --name-only at the same time"));

This seems reasonable to me. Letting them overwrite each other (i.e.,
"last one wins") would also be fine, but we can always loosen to that
behavior later if we choose.

This is a somewhat funny place to put the check, though. It will be run
for every entry in the tree (so is a tiny bit less efficient, but also
would not trigger for an empty tree). It probably should go in
cmd_ls_tree(), perhaps here:

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 1f82229649..3c9ea00489 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -91,9 +91,6 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	else if (ls_options & LS_TREE_ONLY)
 		return 0;
 
-	if ((ls_options & LS_NAME_ONLY) && (ls_options & LS_OID_ONLY))
-		die(_("cannot specify --oid-only and --name-only at the same time"));
-
 	if (ls_options & LS_OID_ONLY) {
 		printf("%s\n", find_unique_abbrev(oid, abbrev));
 		return 0;
@@ -175,6 +172,9 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	    ((LS_TREE_ONLY|LS_RECURSIVE) & ls_options))
 		ls_options |= LS_SHOW_TREES;
 
+	if ((ls_options & LS_NAME_ONLY) && (ls_options & LS_OID_ONLY))
+		die(_("cannot specify --oid-only and --name-only at the same time"));
+
 	if (argc < 1)
 		usage_with_options(ls_tree_usage, ls_tree_options);
 	if (get_oid(argv[0], &oid))

Ævar also mentioned using OPT_CMDMODE(), which I think would naturally
move the logic in a similar way.

-Peff
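
The point about where the check lives can be shown with a small,
self-contained sketch. The two list_*() functions below are hypothetical
stand-ins for the tree walk, not git code; only the LS_NAME_ONLY and
LS_OID_ONLY values are taken from the patch. The sketch assumes a walk
that calls the per-entry logic once per tree entry, so a check placed
there never runs for an empty tree.

```c
#include <assert.h>

#define LS_NAME_ONLY 8
#define LS_OID_ONLY 32

/* Nonzero when both mutually exclusive bits are set. */
static int options_conflict(int ls_options)
{
	return (ls_options & LS_NAME_ONLY) && (ls_options & LS_OID_ONLY);
}

/*
 * Variant 1: validate inside the per-entry path.
 * Returns -1 on conflict, otherwise the number of entries shown.
 * The check is repeated for every entry and is skipped entirely
 * when there are no entries.
 */
static int list_check_per_entry(int ls_options, int nr_entries)
{
	int i, shown = 0;

	assert(nr_entries >= 0);
	for (i = 0; i < nr_entries; i++) {
		if (options_conflict(ls_options))
			return -1;
		shown++;
	}
	return shown;
}

/*
 * Variant 2: validate once, right after option parsing.
 * The conflict is reported regardless of the tree's contents.
 */
static int list_check_up_front(int ls_options, int nr_entries)
{
	assert(nr_entries >= 0);
	if (options_conflict(ls_options))
		return -1;
	return nr_entries;
}
```

For an empty tree, the first variant accepts the conflicting options
silently, while the second rejects them before any entry is visited.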

* Re: [PATCH 0/3] support `--oid-only` in `ls-tree`
  2021-11-15 11:51 [PATCH 0/3] support `--oid-only` in `ls-tree` Teng Long
                   ` (3 preceding siblings ...)
  2021-11-15 15:13 ` [PATCH 0/3] support `--oid-only` in `ls-tree` Ævar Arnfjörð Bjarmason
@ 2021-11-15 19:23 ` Jeff King
  2021-11-19 12:09 ` [PATCH v2 0/1] " Teng Long
  5 siblings, 0 replies; 224+ messages in thread
From: Jeff King @ 2021-11-15 19:23 UTC (permalink / raw)
  To: Teng Long; +Cc: git, gitster

On Mon, Nov 15, 2021 at 07:51:50PM +0800, Teng Long wrote:

> Sometimes we only want the object IDs from the output of `ls-tree`.
> In practice, commands like `sed` or `cut` are usually used to filter
> the original output to achieve this.
> 
> This series contains three commits:
> 
>     1. Implementation of the option.
>     2. New tests in "t3104".
>     3. Documentation modifications.
> 
> I would appreciate it if someone could help review the patches.

This seems like a good feature to have. I think it would make sense to
squash the three patches into a single one. The documentation and test
patches do not stand on their own, which is why there was nothing useful
to say in their commit messages.

The implementation looks generally sensible (modulo the comments already
given). I was surprised that there was not an existing ls-tree script
that these would fit into. But there really isn't; t3101 covers
--name-only and other output, but is really focused on the pathnames
(though I think it would be OK to refactor it to cover output more
generally).

-Peff

* Re: [PATCH 1/3] ls-tree.c: support `--oid-only` option for "git-ls-tree"
  2021-11-15 19:16   ` Jeff King
@ 2021-11-15 19:25     ` Jeff King
  2021-11-18 11:23     ` Teng Long
  1 sibling, 0 replies; 224+ messages in thread
From: Jeff King @ 2021-11-15 19:25 UTC (permalink / raw)
  To: Teng Long; +Cc: git, gitster

On Mon, Nov 15, 2021 at 02:16:27PM -0500, Jeff King wrote:

> On Mon, Nov 15, 2021 at 07:51:51PM +0800, Teng Long wrote:
> 
> > diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
> > index 3a442631c7..1f82229649 100644
> > --- a/builtin/ls-tree.c
> > +++ b/builtin/ls-tree.c
> > @@ -20,6 +20,7 @@ static int line_termination = '\n';
> >  #define LS_SHOW_TREES 4
> >  #define LS_NAME_ONLY 8
> >  #define LS_SHOW_SIZE 16
> > +#define LS_OID_ONLY 32
> >  static int abbrev;
> >  static int ls_options;
> >  static struct pathspec pathspec;
> > @@ -90,6 +91,14 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
> >  	else if (ls_options & LS_TREE_ONLY)
> >  		return 0;
> >  
> > +	if ((ls_options & LS_NAME_ONLY) && (ls_options & LS_OID_ONLY))
> > +		die(_("cannot specify --oid-only and --name-only at the same time"));
> 
> This seems reasonable to me. Letting them overwrite each other (i.e.,
> "last one wins") would also be fine, but we can always loosen to that
> behavior later if we choose.

Oh, and whichever direction we go, it would probably make sense for
--long to be handled in the same way. I.e.:

  git ls-tree --long --oid-only

does not really make sense. Though we currently just ignore --long for:

  git ls-tree --long --name-only

which is arguably a bug.

-Peff

* Re: [PATCH 0/3] support `--oid-only` in `ls-tree`
  2021-11-15 19:09   ` Jeff King
@ 2021-11-15 21:50     ` Ævar Arnfjörð Bjarmason
  2021-11-19  2:57       ` Teng Long
  0 siblings, 1 reply; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-15 21:50 UTC (permalink / raw)
  To: Jeff King; +Cc: Teng Long, git, gitster


On Mon, Nov 15 2021, Jeff King wrote:

> On Mon, Nov 15, 2021 at 04:13:24PM +0100, Ævar Arnfjörð Bjarmason wrote:
>
>> But I'd much rather see this be done with adding strbuf_expand() to
>> ls-tree. I.e. its docs say that it can emit:
>
> I had a similar thought, but that's a much bigger task. I think it would
> be reasonable to add --oid-only to match the existing --name-only, etc.
> If we later add a custom --format option, then it can easily be folded
> in and explained as "this is an alias for --format=%(objectname)", just
> like --name-only would become "this is an alias for --format=%(path)".

A quick patch to do it below, seems to work, passes all tests, but I
don't know how much I'd trust it. It's also quite an odd use of
strbuf_expand(). We print to stdout directly since
write_name_quoted_relative() really wants to write to stdout, and not
give you a buffer. But I guess it makes sense in a way.

The hardcoded %7s for %(objectsize) is a bit nasty, but I don't know if
we've got anything existing that handles format specifiers with
strbuf_expand() that we could steal.

I really wouldn't trust this code much. While writing it I found that
our tests for ls-tree are really lacking, e.g. we may not have a single
test for "-l" anywhere (or maybe I didn't look enough; I was just
running t/*ls*tree* while hacking on it).

I do think that we should consider just going with --format in either
case if we agree that this is a good direction. I.e. could just support
3-4 hardcoded formats now and die if anything else is specified.

Then we'd be future-proof with the same interface expanding later, and
wouldn't need to support options that we're only carrying because we
didn't implement the more generic format support.

(Assume my Signed-off-by, if there's any interest...)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3a442631c71..e89daad4229 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -31,6 +31,20 @@ static const  char * const ls_tree_usage[] = {
 	NULL
 };
 
+static const char *ls_tree_format_d = "%(objectmode) %(objecttype) %(objectname)	%(path)";
+static const char *ls_tree_format_l = "%(objectmode) %(objecttype) %(objectname) %(objectsize)	%(path)";
+static const char *ls_tree_format_n = "%(path)";
+
+struct expand_ls_tree_data {
+	const char *format;
+	unsigned mode;
+	const char *type;
+	const struct object_id *oid;
+	int abbrev;
+	const char *pathname;
+	const char *basebuf;
+};
+
 static int show_recursive(const char *base, int baselen, const char *pathname)
 {
 	int i;
@@ -61,9 +75,69 @@ static int show_recursive(const char *base, int baselen, const char *pathname)
 	return 0;
 }
 
+static size_t expand_show_tree(struct strbuf *sb,
+			       const char *start,
+			       void *context)
+{
+	struct expand_ls_tree_data *data = context;
+	const char *end;
+	const char *p;
+	size_t len;
+	const char *type = blob_type;
+
+	if (sb->len) {
+		fputs(sb->buf, stdout);
+		strbuf_reset(sb);
+	}
+
+	if (*start != '(')
+		die(_("bad format as of '%s'"), start);
+	end = strchr(start + 1, ')');
+	if (!end)
+		die(_("ls-tree format element '%s' does not end in ')'"),
+		    start);
+	len = end - start + 1;
+
+	if (skip_prefix(start, "(objectmode)", &p)) {
+		printf("%06o", data->mode);
+	} else if (skip_prefix(start, "(objecttype)", &p)) {
+		fputs(data->type, stdout);
+	} else if (skip_prefix(start, "(objectsize)", &p)) {
+		char size_text[24];
+		const struct object_id *oid = data->oid;
+
+		if (!strcmp(type, blob_type)) {
+			unsigned long size;
+			if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
+				xsnprintf(size_text, sizeof(size_text),
+					  "BAD");
+			else
+				xsnprintf(size_text, sizeof(size_text),
+					  "%"PRIuMAX, (uintmax_t)size);
+		} else {
+			xsnprintf(size_text, sizeof(size_text), "-");
+		}
+		printf("%7s", size_text);
+	} else if (skip_prefix(start, "(objectname)", &p)) {
+		fputs(find_unique_abbrev(data->oid, data->abbrev), stdout);
+	} else if (skip_prefix(start, "(path)", &p)) {
+		write_name_quoted_relative(data->basebuf,
+					   chomp_prefix ? ls_tree_prefix : NULL,
+					   stdout, line_termination);
+
+	} else {
+		unsigned int errlen = (unsigned long)len;
+		die(_("bad ls-tree format specifier %%%.*s"), errlen, start);
+	}
+
+	return len;
+}
+
 static int show_tree(const struct object_id *oid, struct strbuf *base,
 		const char *pathname, unsigned mode, void *context)
 {
+	struct expand_ls_tree_data *data = context;
+	struct strbuf sb = STRBUF_INIT;
 	int retval = 0;
 	int baselen;
 	const char *type = blob_type;
@@ -90,31 +164,18 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	else if (ls_options & LS_TREE_ONLY)
 		return 0;
 
-	if (!(ls_options & LS_NAME_ONLY)) {
-		if (ls_options & LS_SHOW_SIZE) {
-			char size_text[24];
-			if (!strcmp(type, blob_type)) {
-				unsigned long size;
-				if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
-					xsnprintf(size_text, sizeof(size_text),
-						  "BAD");
-				else
-					xsnprintf(size_text, sizeof(size_text),
-						  "%"PRIuMAX, (uintmax_t)size);
-			} else
-				xsnprintf(size_text, sizeof(size_text), "-");
-			printf("%06o %s %s %7s\t", mode, type,
-			       find_unique_abbrev(oid, abbrev),
-			       size_text);
-		} else
-			printf("%06o %s %s\t", mode, type,
-			       find_unique_abbrev(oid, abbrev));
-	}
 	baselen = base->len;
 	strbuf_addstr(base, pathname);
-	write_name_quoted_relative(base->buf,
-				   chomp_prefix ? ls_tree_prefix : NULL,
-				   stdout, line_termination);
+
+	strbuf_reset(&sb);
+	data->mode = mode;
+	data->type = type;
+	data->oid = oid;
+	data->abbrev = abbrev;
+	data->pathname = pathname;
+	data->basebuf = base->buf;
+	strbuf_expand(&sb, data->format, expand_show_tree, data);
+
 	strbuf_setlen(base, baselen);
 	return retval;
 }
@@ -147,6 +208,9 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 		OPT__ABBREV(&abbrev),
 		OPT_END()
 	};
+	struct expand_ls_tree_data ls_tree_cb_data = {
+		.format = ls_tree_format_d,
+	};
 
 	git_config(git_default_config, NULL);
 	ls_tree_prefix = prefix;
@@ -161,8 +225,14 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	}
 	/* -d -r should imply -t, but -d by itself should not have to. */
 	if ( (LS_TREE_ONLY|LS_RECURSIVE) ==
-	    ((LS_TREE_ONLY|LS_RECURSIVE) & ls_options))
+	    ((LS_TREE_ONLY|LS_RECURSIVE) & ls_options)) {
 		ls_options |= LS_SHOW_TREES;
+	}
+	if (ls_options & LS_NAME_ONLY)
+		ls_tree_cb_data.format = ls_tree_format_n;
+
+	if (ls_options & LS_SHOW_SIZE)
+		ls_tree_cb_data.format = ls_tree_format_l;
 
 	if (argc < 1)
 		usage_with_options(ls_tree_usage, ls_tree_options);
@@ -185,6 +255,7 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	tree = parse_tree_indirect(&oid);
 	if (!tree)
 		die("not a tree object");
+
 	return !!read_tree(the_repository, tree,
-			   &pathspec, show_tree, NULL);
+			   &pathspec, show_tree, &ls_tree_cb_data);
 }

* Re: [PATCH 2/3] t3104: add related tests for `--oid-only` option
  2021-11-15 15:54   ` Đoàn Trần Công Danh
@ 2021-11-18  8:45     ` Teng Long
  0 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2021-11-18  8:45 UTC (permalink / raw)
  To: congdanhqx; +Cc: dyroneteng, git, gitster, peff


On Mon, 15 Nov 2021 22:54:00 +0700, Đoàn Trần Công Danh wrote:

Thanks for helping to review this patch.

> Failed with:
>
>         GIT_TEST_DEFAULT_HASH=sha256 ./t3104-ls-tree-oid.sh

You're totally right; the tests should pass with both sha1 and sha256.

> I think we can use:
>
>         git ls-tree $tree | awk '{print $3}' >expected
> ...
> ...
>
> Ditto for this test and below tests.

Yes, that is correct and better.

But the dollar character also needs to be escaped for it to work.

> The last redirection '>/dev/null' does nothing, me think.
> Style nit:
>
>	use '>expected' instead of '> expected'

Yeah, that's my bad; I will fix it.

* Re: [PATCH 1/3] ls-tree.c: support `--oid-only` option for "git-ls-tree"
  2021-11-15 15:12   ` Ævar Arnfjörð Bjarmason
@ 2021-11-18  9:28     ` Teng Long
  2021-11-18 11:00       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 224+ messages in thread
From: Teng Long @ 2021-11-18  9:28 UTC (permalink / raw)
  To: avarab; +Cc: dyroneteng, git, gitster, peff


> If you make these an OPT_CMDMODE you get this behavior for free. See
> e.g. my
> https://lore.kernel.org/git/patch-v2-06.10-d945fc94774-20211112T221506Z-avarab@gmail.com/	   

Thank you very much for providing this input.

So I read the patch you mentioned; let me try to restate the idea as I understand it.

First, OPT_CMDMODE() is useful for:

       1. Easily checking combined command options, such as "mutually exclusive" conditions.

       2. Dying with a consistent error message when incompatible options are found.

       3. Better extensibility, with no need to change a lot of if/else branches.

So you suggest considering OPT_CMDMODE instead of the current implementation.

Did I understand your suggestion correctly and completely?

* Re: [PATCH 1/3] ls-tree.c: support `--oid-only` option for "git-ls-tree"
  2021-11-18  9:28     ` Teng Long
@ 2021-11-18 11:00       ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-18 11:00 UTC (permalink / raw)
  To: Teng Long; +Cc: git, gitster, peff


On Thu, Nov 18 2021, Teng Long wrote:

>> If you make these an OPT_CMDMODE you get this behavior for free. See
>> e.g. my
>> https://lore.kernel.org/git/patch-v2-06.10-d945fc94774-20211112T221506Z-avarab@gmail.com/	   
>
> Thank you very much for providing this input.
>
> So I read the patch you mentioned; let me try to restate the idea as I understand it.
>
> First, OPT_CMDMODE() is useful for:
>
>        1. Easily checking combined command options, such as "mutually exclusive" conditions.
>
>        2. Dying with a consistent error message when incompatible options are found.
>
>        3. Better extensibility, with no need to change a lot of if/else branches.
>
> So you suggest considering OPT_CMDMODE instead of the current implementation.
>
> Did I understand your suggestion correctly and completely?

Yes, all of that is correct.

It's a way of defining N options, --foo, --bar, --baz, where combining
any of them is an error.

We usually use it for a "command mode" (hence the name), but it can be
used when the command has flags that are mutually exclusive.

I think (but am not sure, and didn't check) that you can even use it for
--foo AND --bar that are exclusive, and --other --flags that are also
mutually exclusive (but could be combined with one of --foo or --bar),
you just need to provide another variable for it to set.

But I haven't tested that or used it like that, maybe it doesn't work
for some reason I'm forgetting...
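
To make the "combining any of them is an error" semantics concrete, here
is a minimal shell sketch. It is not git's parse-options code;
`set_cmdmode` and the `cmdmode` variable are hypothetical names used only
for illustration, and accepting the same mode twice is an assumption of
this sketch.

```shell
#!/bin/sh
# Hypothetical sketch of "command mode" semantics: several flags share one
# variable, and choosing a second, different mode is an error.
cmdmode=

set_cmdmode () {
	# $1 is the mode requested by the flag being parsed.
	if test -n "$cmdmode" && test "$cmdmode" != "$1"
	then
		echo "error: --$1 cannot be combined with --$cmdmode" >&2
		return 1
	fi
	cmdmode=$1
}

set_cmdmode name-only          # first mode: accepted
set_cmdmode name-only          # same mode again: still accepted
if set_cmdmode oid-only 2>/dev/null
then
	conflict=no
else
	conflict=yes           # a different mode was rejected
fi
echo "mode=$cmdmode conflict=$conflict"
```

Because every flag goes through the same helper, adding a new mode needs
no extra pairwise checks.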

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH 1/3] ls-tree.c: support `--oid-only` option for "git-ls-tree"
  2021-11-15 19:16   ` Jeff King
  2021-11-15 19:25     ` Jeff King
@ 2021-11-18 11:23     ` Teng Long
  1 sibling, 0 replies; 224+ messages in thread
From: Teng Long @ 2021-11-18 11:23 UTC (permalink / raw)
  To: peff; +Cc: dyroneteng, git, gitster

On Mon, 15 Nov 2021 14:16:27 -0500, Jeff King wrote:

> This is a somewhat funny place to put the check, though. It will be run
> for every entry in the tree (so is a tiny bit less efficient, but also
> would not trigger for an empty tree). It probably should go in
> cmd_ls_tree(), perhaps here:

Yes, it's better here as a fail-fast case.

Following that suggestion about the location, why not move the check even
earlier, right after parse_options() returns, like this:

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 1f82229649..003a9ade54 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -166,6 +166,10 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 
        argc = parse_options(argc, argv, prefix, ls_tree_options,
                             ls_tree_usage, 0);
+
+       if ((ls_options & LS_NAME_ONLY) && (ls_options & LS_OID_ONLY))
+               die(_("cannot specify --oid-only and --name-only at the same time"));
+
        if (full_tree) {
                ls_tree_prefix = prefix = NULL;
                chomp_prefix = 0;

Would this introduce any new problems?

Thank you.

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH 0/3] support `--oid-only` in `ls-tree`
  2021-11-15 21:50     ` Ævar Arnfjörð Bjarmason
@ 2021-11-19  2:57       ` Teng Long
  0 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2021-11-19  2:57 UTC (permalink / raw)
  To: avarab; +Cc: dyroneteng, git, gitster, peff

> A quick patch to do it below, seems to work, passes all tests, but I
> don't know how much I'd trust it. It's also quite an add use of
> strbuf_expa(). We print to stdout directly since
> write_name_quoted_relative() really wants to write to stdout, and not
> give you a buffer. But I guess it makes sense in a way.

Thanks for the patch and the inputs about "strbuf_expa()".

> Then we'd be future-proof with the same interface expanding later, and
> wouldn't need to support options that we're only carrying because we
> didn't implement the more generic format support.

I agree, but as Peff said, it may be a bigger, separate task. I think I will
first solve the existing problems in the next patch.

I will consider the generic format support, but I'm not sure whether
it will be part of this patchset.

> (Assume my Signed-off-by, if there's any interest...)

Of course I will.

Thank you very much for your advice and guidance again.

^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v2 0/1] support `--oid-only` in `ls-tree`
  2021-11-15 11:51 [PATCH 0/3] support `--oid-only` in `ls-tree` Teng Long
                   ` (4 preceding siblings ...)
  2021-11-15 19:23 ` Jeff King
@ 2021-11-19 12:09 ` Teng Long
  2021-11-19 12:09   ` [PATCH v2 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree" Teng Long
  2021-11-22  8:07   ` [PATCH v3 0/1] ls-tree.c: support `--oid-only` option Teng Long
  5 siblings, 2 replies; 224+ messages in thread
From: Teng Long @ 2021-11-19 12:09 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, avarab, congdanhqx, Teng Long


This patch series adds support for outputting only the objects' OIDs
with a new option named `--oid-only`.
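
For context, the post-processing that such an option replaces looks
roughly like the sketch below. It works on one hard-coded sample line in
the `<mode> SP <type> SP <object> TAB <file>` layout of the default
`ls-tree` output instead of calling git; the sample OID and file name are
made up.

```shell
#!/bin/sh
# One made-up line in ls-tree's default layout:
# <mode> SP <type> SP <object> TAB <file>
sample=$(printf '100644 blob 0123456789abcdef0123456789abcdef01234567\tREADME.md\n')

# awk: the OID is the third whitespace-separated field.
oid_awk=$(printf '%s\n' "$sample" | awk '{print $3}')

# cut: drop the file name after the TAB, then take the third
# space-separated field.
oid_cut=$(printf '%s\n' "$sample" | cut -f1 | cut -d' ' -f3)

echo "$oid_awk"
echo "$oid_cut"
```

Both pipelines print the same OID; the option discussed in this series
is meant to make that extra step unnecessary.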

Changes since the first version:

        1. The three commits are squashed into one (Peff's advice)
        2. Fixed the test issues (Đoàn Trần Công Danh's advice)
        3. Use `OPT_CMDMODE()` for mutually exclusive control
           (Ævar Arnfjörð Bjarmason's advice)

Some points from the discussion are not addressed in v2:

        1. `git ls-tree --long --name-only` and
           `git ls-tree --long --oid-only` which is arguably a bug
           (Peff's advice)
        2. Support `--format` for `git-ls-tree`
           (Ævar Arnfjörð Bjarmason's advice)

The reason these two points are not included is that I'm not sure whether
I should continue with the current patchset or start a new one. As for the
second, I think the current implementation is clear and simple to use, and
meets the needs of the moment. I may add support for a `--format` option
later, but before that I would appreciate more suggestions.

Thanks.

Teng Long (1):
  ls-tree.c: support `--oid-only` option for "git-ls-tree"

 Documentation/git-ls-tree.txt |  8 +++++--
 builtin/ls-tree.c             | 27 ++++++++++++++++-------
 t/t3104-ls-tree-oid.sh        | 40 +++++++++++++++++++++++++++++++++++
 3 files changed, 65 insertions(+), 10 deletions(-)
 create mode 100755 t/t3104-ls-tree-oid.sh

Range-diff against v1:
1:  c4479178d7 < -:  ---------- ls-tree.c: support `--oid-only` option for "git-ls-tree"
2:  853ebbcf88 < -:  ---------- t3104: add related tests for `--oid-only` option
3:  33c68c1f11 < -:  ---------- git-ls-tree.txt: description of the 'oid-only' option
-:  ---------- > 1:  8b68568d6c ls-tree.c: support `--oid-only` option for "git-ls-tree"
-- 
2.33.1.10.g1f74a882e4


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v2 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree"
  2021-11-19 12:09 ` [PATCH v2 0/1] " Teng Long
@ 2021-11-19 12:09   ` Teng Long
  2021-11-19 13:30     ` Ævar Arnfjörð Bjarmason
  2021-11-22  8:07   ` [PATCH v3 0/1] ls-tree.c: support `--oid-only` option Teng Long
  1 sibling, 1 reply; 224+ messages in thread
From: Teng Long @ 2021-11-19 12:09 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, avarab, congdanhqx, Teng Long

Sometimes we only want the object IDs from the output of `ls-tree`,
and in practice commands like `sed` or `cut` are usually used to
post-process the original output for this purpose.

This commit adds an option named `--oid-only` to let `git ls-tree`
only print out the OID of the object. `--oid-only` and `--name-only`
are mutually exclusive in use.

Reviewed-by: Jeff King <peff@peff.net>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
 Documentation/git-ls-tree.txt |  8 +++++--
 builtin/ls-tree.c             | 27 ++++++++++++++++-------
 t/t3104-ls-tree-oid.sh        | 40 +++++++++++++++++++++++++++++++++++
 3 files changed, 65 insertions(+), 10 deletions(-)
 create mode 100755 t/t3104-ls-tree-oid.sh

diff --git a/Documentation/git-ls-tree.txt b/Documentation/git-ls-tree.txt
index db02d6d79a..bc711dc00a 100644
--- a/Documentation/git-ls-tree.txt
+++ b/Documentation/git-ls-tree.txt
@@ -10,7 +10,8 @@ SYNOPSIS
 --------
 [verse]
 'git ls-tree' [-d] [-r] [-t] [-l] [-z]
-	    [--name-only] [--name-status] [--full-name] [--full-tree] [--abbrev[=<n>]]
+	    [--name-only] [--name-status] [--oid-only]
+	    [--full-name] [--full-tree] [--abbrev[=<n>]]
 	    <tree-ish> [<path>...]
 
 DESCRIPTION
@@ -59,7 +60,10 @@ OPTIONS
 --name-only::
 --name-status::
 	List only filenames (instead of the "long" output), one per line.
-
+	Cannot be used with `--oid-only` together.
+--oid-only::
+	List only OIDs of the objects, one per line. Cannot be used with
+	`--name-only` or `--name-status` together.
 --abbrev[=<n>]::
 	Instead of showing the full 40-byte hexadecimal object
 	lines, show the shortest prefix that is at least '<n>'
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3a442631c7..1e4a82e669 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -18,19 +18,26 @@ static int line_termination = '\n';
 #define LS_RECURSIVE 1
 #define LS_TREE_ONLY 2
 #define LS_SHOW_TREES 4
-#define LS_NAME_ONLY 8
-#define LS_SHOW_SIZE 16
+#define LS_SHOW_SIZE 8
 static int abbrev;
 static int ls_options;
 static struct pathspec pathspec;
 static int chomp_prefix;
 static const char *ls_tree_prefix;
 
-static const  char * const ls_tree_usage[] = {
+static const char * const ls_tree_usage[] = {
 	N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
 	NULL
 };
 
+enum {
+	MODE_UNSPECIFIED = 0,
+	MODE_NAME_ONLY,
+	MODE_OID_ONLY
+};
+
+static int cmdmode = MODE_UNSPECIFIED;
+
 static int show_recursive(const char *base, int baselen, const char *pathname)
 {
 	int i;
@@ -90,7 +97,12 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	else if (ls_options & LS_TREE_ONLY)
 		return 0;
 
-	if (!(ls_options & LS_NAME_ONLY)) {
+	if (cmdmode == 2) {
+		printf("%s\n", find_unique_abbrev(oid, abbrev));
+		return 0;
+	}
+
+	if (cmdmode == 0) {
 		if (ls_options & LS_SHOW_SIZE) {
 			char size_text[24];
 			if (!strcmp(type, blob_type)) {
@@ -135,10 +147,9 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 			    N_("terminate entries with NUL byte"), 0),
 		OPT_BIT('l', "long", &ls_options, N_("include object size"),
 			LS_SHOW_SIZE),
-		OPT_BIT(0, "name-only", &ls_options, N_("list only filenames"),
-			LS_NAME_ONLY),
-		OPT_BIT(0, "name-status", &ls_options, N_("list only filenames"),
-			LS_NAME_ONLY),
+		OPT_CMDMODE('n', "name-only", &cmdmode, N_("list only filenames"), MODE_NAME_ONLY),
+		OPT_CMDMODE('s', "name-status", &cmdmode, N_("list only filenames"), MODE_NAME_ONLY),
+		OPT_CMDMODE('o', "oid-only", &cmdmode, N_("list only oids"), MODE_OID_ONLY),
 		OPT_SET_INT(0, "full-name", &chomp_prefix,
 			    N_("use full path names"), 0),
 		OPT_BOOL(0, "full-tree", &full_tree,
diff --git a/t/t3104-ls-tree-oid.sh b/t/t3104-ls-tree-oid.sh
new file mode 100755
index 0000000000..4c02cdd3c3
--- /dev/null
+++ b/t/t3104-ls-tree-oid.sh
@@ -0,0 +1,40 @@
+#!/bin/sh
+
+test_description='git ls-tree oids handling.'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	echo 111 >1.txt &&
+	echo 222 >2.txt &&
+	mkdir -p path0/a/b/c &&
+	echo 333 >path0/a/b/c/3.txt &&
+	find *.txt path* \( -type f -o -type l \) -print |
+	xargs git update-index --add &&
+	tree=$(git write-tree) &&
+	echo $tree
+'
+
+test_expect_success 'usage: --oid-only' '
+	git ls-tree --oid-only $tree >current &&
+	git ls-tree $tree | awk "{print \$3}" >expected &&
+	test_cmp current expected
+'
+
+test_expect_success 'usage: --oid-only with -r' '
+	git ls-tree --oid-only -r $tree >current &&
+	git ls-tree -r $tree | awk "{print \$3}" >expected &&
+	test_cmp current expected
+'
+
+test_expect_success 'usage: --oid-only with --abbrev' '
+	git ls-tree --oid-only --abbrev=6 $tree >current &&
+	git ls-tree --abbrev=6 $tree | awk "{print \$3}" > expected &&
+	test_cmp current expected
+'
+
+test_expect_failure 'usage: incompatible options: --name-only with --oid-only' '
+	test_incompatible_usage git ls-tree --oid-only --name-only
+'
+
+test_done
-- 
2.33.1.10.g1f74a882e4


^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v2 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree"
  2021-11-19 12:09   ` [PATCH v2 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree" Teng Long
@ 2021-11-19 13:30     ` Ævar Arnfjörð Bjarmason
  2021-11-19 17:32       ` Junio C Hamano
  2021-11-22  7:45       ` Teng Long
  0 siblings, 2 replies; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-19 13:30 UTC (permalink / raw)
  To: Teng Long; +Cc: git, gitster, peff, congdanhqx


On Fri, Nov 19 2021, Teng Long wrote:

> Reviewed-by: Jeff King <peff@peff.net>
> Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> Reviewed-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>

Please don't add the Reviewed-by headers yourself, either Junio
accumulates them, or if someone explicitly mentions that you can add it
with their name it's OK.

It doesn't just mean this person reviewed this series in some ML thread,
but "this person is 100% OK with this in its current form".

>  	List only filenames (instead of the "long" output), one per line.
> -
> +	Cannot be used with `--oid-only` together.

Better: "Cannot be combined with OPT."

> +--oid-only::
> +	List only OIDs of the objects, one per line. Cannot be used with
> +	`--name-only` or `--name-status` together.

Better: "Cannot be combined with OPT or OPT2."

> +enum {
> +	MODE_UNSPECIFIED = 0,
> +	MODE_NAME_ONLY,
> +	MODE_OID_ONLY
> +};
> +
> +static int cmdmode = MODE_UNSPECIFIED;

Better:

static enum {
	MODE_NAME_ONLY = 1,
        ...
} cmdmode = MODE_NAME_ONLY;

I.e. no need for the MODE_UNSPECIFIED just to skip past "0".

> @@ -135,10 +147,9 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
>  			    N_("terminate entries with NUL byte"), 0),
>  		OPT_BIT('l', "long", &ls_options, N_("include object size"),
>  			LS_SHOW_SIZE),
> -		OPT_BIT(0, "name-only", &ls_options, N_("list only filenames"),
> -			LS_NAME_ONLY),
> -		OPT_BIT(0, "name-status", &ls_options, N_("list only filenames"),
> -			LS_NAME_ONLY),
> +		OPT_CMDMODE('n', "name-only", &cmdmode, N_("list only filenames"), MODE_NAME_ONLY),
> +		OPT_CMDMODE('s', "name-status", &cmdmode, N_("list only filenames"), MODE_NAME_ONLY),
> +		OPT_CMDMODE('o', "oid-only", &cmdmode, N_("list only oids"), MODE_OID_ONLY),

Better to preserve the wrapping here, to stay within 79 columns.

> +test_expect_success 'setup' '
> +	echo 111 >1.txt &&
> +	echo 222 >2.txt &&

Just use:

    test_commit A &&
    test_commit B

etc?

> +	mkdir -p path0/a/b/c &&
> +	echo 333 >path0/a/b/c/3.txt &&
> +	find *.txt path* \( -type f -o -type l \) -print |
> +	xargs git update-index --add &&
> +	tree=$(git write-tree) &&
> +	echo $tree

Stray echo? Unclear why this test setup is so complex, shouldn't this just be (continued from above):

    mkdir -p C &&
    test_commit C/D.txt

To test nested dirs?

> +'
> +
> +test_expect_success 'usage: --oid-only' '
> +	git ls-tree --oid-only $tree >current &&
> +	git ls-tree $tree | awk "{print \$3}" >expected &&


just cut -f1 instead of awk? Also don't put "git" on the LHS of a pipe,
it might hide segfaults. Also applies to the below.

> +	test_cmp current expected
> +'
> +
> +test_expect_success 'usage: --oid-only with -r' '
> +	git ls-tree --oid-only -r $tree >current &&
> +	git ls-tree -r $tree | awk "{print \$3}" >expected &&
> +	test_cmp current expected
> +'
> +
> +test_expect_success 'usage: --oid-only with --abbrev' '
> +	git ls-tree --oid-only --abbrev=6 $tree >current &&
> +	git ls-tree --abbrev=6 $tree | awk "{print \$3}" > expected &&
> +	test_cmp current expected
> +'
> +
> +test_expect_failure 'usage: incompatible options: --name-only with --oid-only' '
> +	test_incompatible_usage git ls-tree --oid-only --name-only
> +'

Hrm, did you copy this use of test_incompatible_usage from
t1006-cat-file.sh without providing the function?

More data for:
https://lore.kernel.org/git/87tuhmk19c.fsf@evledraar.gmail.com/ :)

Better to use:

    test_expect_code 128 ... # (or was it 129?)
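
The idea behind checking an explicit exit code can be sketched outside
the test suite as below. `expect_code` is a hypothetical stand-in written
for this example, not the test library's `test_expect_code`, and the
`sh -c 'exit 129'` command only simulates a program that fails with that
status.

```shell
#!/bin/sh
# Hypothetical helper: run a command and succeed only if it exits with
# exactly the expected status.
expect_code () {
	want=$1
	shift
	"$@"
	got=$?
	if test "$got" -ne "$want"
	then
		echo "expected exit code $want, got $got: $*" >&2
		return 1
	fi
}

# Simulated failing command; a real test would run the git command here.
if expect_code 129 sh -c 'exit 129'
then
	exact=ok
else
	exact=mismatch
fi

# A different failure status is not accepted.
if expect_code 129 sh -c 'exit 1' 2>/dev/null
then
	other=ok
else
	other=mismatch
fi
echo "exact=$exact other=$other"
```

Unlike a test marked with test_expect_failure, which records a known
breakage, a check like this fails loudly when the command exits for any
other reason, such as a missing helper function.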

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v2 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree"
  2021-11-19 13:30     ` Ævar Arnfjörð Bjarmason
@ 2021-11-19 17:32       ` Junio C Hamano
  2021-11-22  7:45       ` Teng Long
  1 sibling, 0 replies; 224+ messages in thread
From: Junio C Hamano @ 2021-11-19 17:32 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Teng Long, git, peff, congdanhqx

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

I think that many points you raised in your message are valid, but
there is one thing that is not.

>> +enum {
>> +	MODE_UNSPECIFIED = 0,
>> +	MODE_NAME_ONLY,
>> +	MODE_OID_ONLY
>> +};
>> +
>> +static int cmdmode = MODE_UNSPECIFIED;
>
> Better:
>
> static enum {
> 	MODE_NAME_ONLY = 1,
>         ...
> } cmdmode = MODE_NAME_ONLY;
>
> I.e. no need for the MODE_UNSPECIFIED just to skip past "0".

If the original wanted to make the default to be "unspecified", your
suggestion changes the semantics.

"enum" is not necessarily an "int", and because the pointer of
"cmdmode" is given to OPT_CMDMODE(), which expects a pointer to
"int", your suggestion breaks the code there, too.

I wonder if cmdmode cannot be an on-stack variable in cmd_ls_tree()
that is passed as the context pointer to show_tree() via
read_tree(), though.  The enum definition still needs to be visible
throughout the file, but such a structure would let us lose a
"global" variable.

Thanks.


^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v2 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree"
  2021-11-19 13:30     ` Ævar Arnfjörð Bjarmason
  2021-11-19 17:32       ` Junio C Hamano
@ 2021-11-22  7:45       ` Teng Long
  2021-11-22 11:14         ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 224+ messages in thread
From: Teng Long @ 2021-11-22  7:45 UTC (permalink / raw)
  To: avarab; +Cc: congdanhqx, dyroneteng, git, gitster, peff


On Fri, 19 Nov 2021 14:30:52 +0100, Ævar Arnfjörð Bjarmason wrote

> Please don't add the Reviewed-by headers yourself, either Junio
> accumulates them, or if someone explicitly mentions that you can add it
> with their name it's OK.

I think I misunderstood the meaning of the header before.
Thanks for the important tip.

> Better: "Cannot be combined with OPT."
> Better: "Cannot be combined with OPT or OPT2."
> ...
> Better to preserve the wrapping here, to stay within 79 columns.

Will apply.

> Just use:
>
>     test_commit A &&
>     test_commit B
>
> etc?
> ...
> Stray echo? Unclear why this test setup is so complex, shouldn't this just be (continued from above):
>
>     mkdir -p C &&
>     test_commit C/D.txt

> To test nested dirs?

Will apply.


> just cut -f1 instead of awk? Also don't put "git" on the LHS of a pipe,
> it might hide segfaults. Also applies to the below.
>

Will apply. Could you please describe the problem in more detail?
(I would appreciate an executable example.)

Thank you.

^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v3 0/1] ls-tree.c: support `--oid-only` option
  2021-11-19 12:09 ` [PATCH v2 0/1] " Teng Long
  2021-11-19 12:09   ` [PATCH v2 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree" Teng Long
@ 2021-11-22  8:07   ` Teng Long
  2021-11-22  8:07     ` [PATCH v3 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree" Teng Long
  2021-11-23  4:58     ` [PATCH v4 0/1] ls-tree.c: support `--oid-only` option Teng Long
  1 sibling, 2 replies; 224+ messages in thread
From: Teng Long @ 2021-11-22  8:07 UTC (permalink / raw)
  To: git; +Cc: avarab, congdanhqx, gitster, peff, Teng Long

Changes since the previous version:

      1. Remove "Reviewed-by" headers in commit message.
      2. Optimize option descriptions in Doc.
         (Ævar Arnfjörð Bjarmason's advice)
      3. Optimize and bugfix in "t3104".
         (Ævar Arnfjörð Bjarmason's advice)
      4. Fix the line-wrapping problems (lines over 79 columns).

All of this advice came from Ævar Arnfjörð Bjarmason and Junio C Hamano;
thank you very much.

Although some of the advice is applied in this patch, some questions
remain; they are in link [1].

[1] https://public-inbox.org/git/20211122074538.87255-1-dyroneteng@gmail.com/

Teng Long (1):
  ls-tree.c: support `--oid-only` option for "git-ls-tree"

 Documentation/git-ls-tree.txt |  8 +++++--
 builtin/ls-tree.c             | 27 ++++++++++++++++-------
 t/t3104-ls-tree-oid.sh        | 40 +++++++++++++++++++++++++++++++++++
 3 files changed, 65 insertions(+), 10 deletions(-)
 create mode 100755 t/t3104-ls-tree-oid.sh

Range-diff against v2:
1:  8b68568d6c ! 1:  6c15b4c176 ls-tree.c: support `--oid-only` option for "git-ls-tree"
    @@ Commit message
         only print out the OID of the object. `--oid-only` and `--name-only`
         are mutually exclusive in use.
     
    -    Reviewed-by: Jeff King <peff@peff.net>
    -    Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    -    Reviewed-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
         Signed-off-by: Teng Long <dyroneteng@gmail.com>
     
      ## Documentation/git-ls-tree.txt ##
-- 
2.33.1.10.g438dd9044d.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v3 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree"
  2021-11-22  8:07   ` [PATCH v3 0/1] ls-tree.c: support `--oid-only` option Teng Long
@ 2021-11-22  8:07     ` Teng Long
  2021-11-22 18:11       ` Peter Baumann
                         ` (2 more replies)
  2021-11-23  4:58     ` [PATCH v4 0/1] ls-tree.c: support `--oid-only` option Teng Long
  1 sibling, 3 replies; 224+ messages in thread
From: Teng Long @ 2021-11-22  8:07 UTC (permalink / raw)
  To: git; +Cc: avarab, congdanhqx, gitster, peff, Teng Long

Sometimes we only want the object IDs from the output of `ls-tree`,
and in practice commands like `sed` or `cut` are usually used to
post-process the original output for this purpose.

This commit adds an option named `--oid-only` to let `git ls-tree`
only print out the OID of the object. `--oid-only` and `--name-only`
are mutually exclusive in use.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
 Documentation/git-ls-tree.txt |  8 +++++--
 builtin/ls-tree.c             | 27 ++++++++++++++++-------
 t/t3104-ls-tree-oid.sh        | 40 +++++++++++++++++++++++++++++++++++
 3 files changed, 65 insertions(+), 10 deletions(-)
 create mode 100755 t/t3104-ls-tree-oid.sh

diff --git a/Documentation/git-ls-tree.txt b/Documentation/git-ls-tree.txt
index db02d6d79a..bc711dc00a 100644
--- a/Documentation/git-ls-tree.txt
+++ b/Documentation/git-ls-tree.txt
@@ -10,7 +10,8 @@ SYNOPSIS
 --------
 [verse]
 'git ls-tree' [-d] [-r] [-t] [-l] [-z]
-	    [--name-only] [--name-status] [--full-name] [--full-tree] [--abbrev[=<n>]]
+	    [--name-only] [--name-status] [--oid-only]
+	    [--full-name] [--full-tree] [--abbrev[=<n>]]
 	    <tree-ish> [<path>...]
 
 DESCRIPTION
@@ -59,7 +60,10 @@ OPTIONS
 --name-only::
 --name-status::
 	List only filenames (instead of the "long" output), one per line.
-
+	Cannot be used with `--oid-only` together.
+--oid-only::
+	List only OIDs of the objects, one per line. Cannot be used with
+	`--name-only` or `--name-status` together.
 --abbrev[=<n>]::
 	Instead of showing the full 40-byte hexadecimal object
 	lines, show the shortest prefix that is at least '<n>'
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3a442631c7..1e4a82e669 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -18,19 +18,26 @@ static int line_termination = '\n';
 #define LS_RECURSIVE 1
 #define LS_TREE_ONLY 2
 #define LS_SHOW_TREES 4
-#define LS_NAME_ONLY 8
-#define LS_SHOW_SIZE 16
+#define LS_SHOW_SIZE 8
 static int abbrev;
 static int ls_options;
 static struct pathspec pathspec;
 static int chomp_prefix;
 static const char *ls_tree_prefix;
 
-static const  char * const ls_tree_usage[] = {
+static const char * const ls_tree_usage[] = {
 	N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
 	NULL
 };
 
+enum {
+	MODE_UNSPECIFIED = 0,
+	MODE_NAME_ONLY,
+	MODE_OID_ONLY
+};
+
+static int cmdmode = MODE_UNSPECIFIED;
+
 static int show_recursive(const char *base, int baselen, const char *pathname)
 {
 	int i;
@@ -90,7 +97,12 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	else if (ls_options & LS_TREE_ONLY)
 		return 0;
 
-	if (!(ls_options & LS_NAME_ONLY)) {
+	if (cmdmode == 2) {
+		printf("%s\n", find_unique_abbrev(oid, abbrev));
+		return 0;
+	}
+
+	if (cmdmode == 0) {
 		if (ls_options & LS_SHOW_SIZE) {
 			char size_text[24];
 			if (!strcmp(type, blob_type)) {
@@ -135,10 +147,9 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 			    N_("terminate entries with NUL byte"), 0),
 		OPT_BIT('l', "long", &ls_options, N_("include object size"),
 			LS_SHOW_SIZE),
-		OPT_BIT(0, "name-only", &ls_options, N_("list only filenames"),
-			LS_NAME_ONLY),
-		OPT_BIT(0, "name-status", &ls_options, N_("list only filenames"),
-			LS_NAME_ONLY),
+		OPT_CMDMODE('n', "name-only", &cmdmode, N_("list only filenames"), MODE_NAME_ONLY),
+		OPT_CMDMODE('s', "name-status", &cmdmode, N_("list only filenames"), MODE_NAME_ONLY),
+		OPT_CMDMODE('o', "oid-only", &cmdmode, N_("list only oids"), MODE_OID_ONLY),
 		OPT_SET_INT(0, "full-name", &chomp_prefix,
 			    N_("use full path names"), 0),
 		OPT_BOOL(0, "full-tree", &full_tree,
diff --git a/t/t3104-ls-tree-oid.sh b/t/t3104-ls-tree-oid.sh
new file mode 100755
index 0000000000..4c02cdd3c3
--- /dev/null
+++ b/t/t3104-ls-tree-oid.sh
@@ -0,0 +1,40 @@
+#!/bin/sh
+
+test_description='git ls-tree oids handling.'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	echo 111 >1.txt &&
+	echo 222 >2.txt &&
+	mkdir -p path0/a/b/c &&
+	echo 333 >path0/a/b/c/3.txt &&
+	find *.txt path* \( -type f -o -type l \) -print |
+	xargs git update-index --add &&
+	tree=$(git write-tree) &&
+	echo $tree
+'
+
+test_expect_success 'usage: --oid-only' '
+	git ls-tree --oid-only $tree >current &&
+	git ls-tree $tree | awk "{print \$3}" >expected &&
+	test_cmp current expected
+'
+
+test_expect_success 'usage: --oid-only with -r' '
+	git ls-tree --oid-only -r $tree >current &&
+	git ls-tree -r $tree | awk "{print \$3}" >expected &&
+	test_cmp current expected
+'
+
+test_expect_success 'usage: --oid-only with --abbrev' '
+	git ls-tree --oid-only --abbrev=6 $tree >current &&
+	git ls-tree --abbrev=6 $tree | awk "{print \$3}" > expected &&
+	test_cmp current expected
+'
+
+test_expect_failure 'usage: incompatible options: --name-only with --oid-only' '
+	test_incompatible_usage git ls-tree --oid-only --name-only
+'
+
+test_done
-- 
2.33.1.10.g438dd9044d.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v2 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree"
  2021-11-22  7:45       ` Teng Long
@ 2021-11-22 11:14         ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-22 11:14 UTC (permalink / raw)
  To: Teng Long; +Cc: congdanhqx, git, gitster, peff


On Mon, Nov 22 2021, Teng Long wrote:

> On Fri, 19 Nov 2021 14:30:52 +0100, Ævar Arnfjörð Bjarmason wrote

>> just cut -f1 instead of awk? Also don't put "git" on the LHS of a pipe,
>> it might hide segfaults. Also applies to the below.
>>
>
> Will apply, and could you please describe the problem with more details?
> (appreciate if there is an executable example)

Run this in a terminal:

    git stawtus | cat; echo $?;

The LHS of the pipe fails, but the exit code of that command is
hidden. So we prefer:

    git stawtus >out && # fails
    [...]
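
A self-contained variant of the same demonstration, using `false` as a
stand-in for a failing git command so that it runs anywhere (the variable
names are only for this sketch):

```shell
#!/bin/sh
# In a pipeline, the shell reports the exit status of the last command,
# so the failure of the left-hand side is lost.
false | cat
pipe_status=$?

# Redirecting to a file keeps the command's own exit status visible.
redirect_status=0
false >out || redirect_status=$?
rm -f out

echo "pipe=$pipe_status redirect=$redirect_status"
```

The pipeline reports success because `cat` succeeded, while the
redirected form reports the non-zero status of the failing command.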




^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v3 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree"
  2021-11-22  8:07     ` [PATCH v3 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree" Teng Long
@ 2021-11-22 18:11       ` Peter Baumann
  2021-11-22 18:54       ` Junio C Hamano
  2021-11-23  0:14       ` Đoàn Trần Công Danh
  2 siblings, 0 replies; 224+ messages in thread
From: Peter Baumann @ 2021-11-22 18:11 UTC (permalink / raw)
  To: Teng Long
  Cc: git, Ævar Arnfjörð Bjarmason, congdanhqx,
	Junio C Hamano, Jeff King

[ Sorry if you receive this mail twice, it seems like it didn't get
through the first time. ]

On Mon, Nov 22, 2021 at 9:50 AM Teng Long <dyroneteng@gmail.com> wrote:
>
> Sometimes, we only want to get the objects from output of `ls-tree`
> and commands like `sed` or `cut` is usually used to intercept the
> origin output to achieve this purpose in practical.
>
> This commit supply an option names `--oid-only` to let `git ls-tree`
> only print out the OID of the object. `--oid-only` and `--name-only`
> are mutually exclusive in use.
>
> Signed-off-by: Teng Long <dyroneteng@gmail.com>
> ---
>  Documentation/git-ls-tree.txt |  8 +++++--
>  builtin/ls-tree.c             | 27 ++++++++++++++++-------
>  t/t3104-ls-tree-oid.sh        | 40 +++++++++++++++++++++++++++++++++++
>  3 files changed, 65 insertions(+), 10 deletions(-)
>  create mode 100755 t/t3104-ls-tree-oid.sh
>
> diff --git a/Documentation/git-ls-tree.txt b/Documentation/git-ls-tree.txt
> index db02d6d79a..bc711dc00a 100644
> --- a/Documentation/git-ls-tree.txt
> +++ b/Documentation/git-ls-tree.txt
> @@ -10,7 +10,8 @@ SYNOPSIS
>  --------
>  [verse]
>  'git ls-tree' [-d] [-r] [-t] [-l] [-z]
> -           [--name-only] [--name-status] [--full-name] [--full-tree] [--abbrev[=<n>]]
> +           [--name-only] [--name-status] [--oid-only]
> +           [--full-name] [--full-tree] [--abbrev[=<n>]]
>             <tree-ish> [<path>...]
>

Shouldn't the synopsis also indicate that the options are exclusive, e.g.
  [--name-only | --oid-only]  ?

Besides adding the new --oid-only mode, you also add one letter acronyms for
  [-n | --name-only]
  [-s | --name-status ]
and one letter abbreviation
  [-o | --oid-only ]
which are all undocumented in the help page. If we want the short one letter
version, they should be documented. For me, it is at least questionable why we
introduce them and more so in a commit adding --oid-only.


>  DESCRIPTION
> @@ -59,7 +60,10 @@ OPTIONS
>  --name-only::
>  --name-status::
>         List only filenames (instead of the "long" output), one per line.
> -
> +       Cannot be used with `--oid-only` together.
> +--oid-only::
> +       List only OIDs of the objects, one per line. Cannot be used with
> +       `--name-only` or `--name-status` together.
>  --abbrev[=<n>]::
>         Instead of showing the full 40-byte hexadecimal object
>         lines, show the shortest prefix that is at least '<n>'
> diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
> index 3a442631c7..1e4a82e669 100644
> --- a/builtin/ls-tree.c
> +++ b/builtin/ls-tree.c
> @@ -18,19 +18,26 @@ static int line_termination = '\n';
>  #define LS_RECURSIVE 1
>  #define LS_TREE_ONLY 2
>  #define LS_SHOW_TREES 4
> -#define LS_NAME_ONLY 8
> -#define LS_SHOW_SIZE 16
> +#define LS_SHOW_SIZE 8
>  static int abbrev;
>  static int ls_options;
>  static struct pathspec pathspec;
>  static int chomp_prefix;
>  static const char *ls_tree_prefix;
>
> -static const  char * const ls_tree_usage[] = {
> +static const char * const ls_tree_usage[] = {
>         N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
>         NULL
>  };
>
> +enum {
> +       MODE_UNSPECIFIED = 0,
> +       MODE_NAME_ONLY,
> +       MODE_OID_ONLY
> +};
> +
> +static int cmdmode = MODE_UNSPECIFIED;
> +
>  static int show_recursive(const char *base, int baselen, const char *pathname)
>  {
>         int i;
> @@ -90,7 +97,12 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
>         else if (ls_options & LS_TREE_ONLY)
>                 return 0;
>
> -       if (!(ls_options & LS_NAME_ONLY)) {
> +       if (cmdmode == 2) {
> +               printf("%s\n", find_unique_abbrev(oid, abbrev));
> +               return 0;
> +       }
> +
> +       if (cmdmode == 0) {
>                 if (ls_options & LS_SHOW_SIZE) {
>                         char size_text[24];
>                         if (!strcmp(type, blob_type)) {
> @@ -135,10 +147,9 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
>                             N_("terminate entries with NUL byte"), 0),
>                 OPT_BIT('l', "long", &ls_options, N_("include object size"),
>                         LS_SHOW_SIZE),
> -               OPT_BIT(0, "name-only", &ls_options, N_("list only filenames"),
> -                       LS_NAME_ONLY),
> -               OPT_BIT(0, "name-status", &ls_options, N_("list only filenames"),
> -                       LS_NAME_ONLY),
> +               OPT_CMDMODE('n', "name-only", &cmdmode, N_("list only filenames"), MODE_NAME_ONLY),
> +               OPT_CMDMODE('s', "name-status", &cmdmode, N_("list only filenames"), MODE_NAME_ONLY),
> +               OPT_CMDMODE('o', "oid-only", &cmdmode, N_("list only oids"), MODE_OID_ONLY),
>                 OPT_SET_INT(0, "full-name", &chomp_prefix,
>                             N_("use full path names"), 0),
>                 OPT_BOOL(0, "full-tree", &full_tree,
> diff --git a/t/t3104-ls-tree-oid.sh b/t/t3104-ls-tree-oid.sh
> new file mode 100755
> index 0000000000..4c02cdd3c3
> --- /dev/null
> +++ b/t/t3104-ls-tree-oid.sh
> @@ -0,0 +1,40 @@
> +#!/bin/sh
> +
> +test_description='git ls-tree oids handling.'
> +
> +. ./test-lib.sh
> +
> +test_expect_success 'setup' '
> +       echo 111 >1.txt &&
> +       echo 222 >2.txt &&
> +       mkdir -p path0/a/b/c &&
> +       echo 333 >path0/a/b/c/3.txt &&
> +       find *.txt path* \( -type f -o -type l \) -print |

I don't see the test using any symbolic links. Why are we searching for
symbolic links with "-type l" here?

-Peter

> +       xargs git update-index --add &&
> +       tree=$(git write-tree) &&
> +       echo $tree
> +'

> +
> +test_expect_success 'usage: --oid-only' '
> +       git ls-tree --oid-only $tree >current &&
> +       git ls-tree $tree | awk "{print \$3}" >expected &&
> +       test_cmp current expected
> +'
> +
> +test_expect_success 'usage: --oid-only with -r' '
> +       git ls-tree --oid-only -r $tree >current &&
> +       git ls-tree -r $tree | awk "{print \$3}" >expected &&
> +       test_cmp current expected
> +'
> +
> +test_expect_success 'usage: --oid-only with --abbrev' '
> +       git ls-tree --oid-only --abbrev=6 $tree >current &&
> +       git ls-tree --abbrev=6 $tree | awk "{print \$3}" > expected &&
> +       test_cmp current expected
> +'
> +
> +test_expect_failure 'usage: incompatible options: --name-only with --oid-only' '
> +       test_incompatible_usage git ls-tree --oid-only --name-only
> +'
> +
> +test_done
> --
> 2.33.1.10.g438dd9044d.dirty
>

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v3 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree"
  2021-11-22  8:07     ` [PATCH v3 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree" Teng Long
  2021-11-22 18:11       ` Peter Baumann
@ 2021-11-22 18:54       ` Junio C Hamano
  2021-11-23  1:09         ` Ævar Arnfjörð Bjarmason
  2021-11-23  0:14       ` Đoàn Trần Công Danh
  2 siblings, 1 reply; 224+ messages in thread
From: Junio C Hamano @ 2021-11-22 18:54 UTC (permalink / raw)
  To: Teng Long; +Cc: git, avarab, congdanhqx, peff

Teng Long <dyroneteng@gmail.com> writes:

> Sometimes, we only want to get the objects from output of `ls-tree`
> and commands like `sed` or `cut` is usually used to intercept the
> origin output to achieve this purpose in practical.

"in practical" -> "in practice".

That's true and that is exactly how this plumbing command was designed
to be used.

> This commit supply an option names `--oid-only` to let `git ls-tree`
> only print out the OID of the object. `--oid-only` and `--name-only`
> are mutually exclusive in use.

    Teach the "--oid-only" option to tell the command to only show
    the object name, just like "--name-only" option tells the
    command to only show the path component, for each entry.  These
    two options are mutually exclusive.

perhaps?

The above leaves "mode-only" and "type-only".  I wonder if it is a
better design to add just one new option, --hide-fields, and make
the existing --name-only into a synonym to

    git ls-tree --hide-fields=mode,type,object $T

which would mean we do not need to end up with four mutually
exclusive options, and anybody who wants to only see object names
can do

    git ls-tree --hide-fields=mode,type,file $T

Note: the above uses the terminology in the OUTPUT FORMAT section;
if we want to use "name" instead of "file", I am perfectly OK with
it, but then we should update the documentation to match.

Come to think of it, I think "--show-fields" may work even better
than "--hide-fields".  We can use it to get rid of the "--long"
option:

    git ls-tree --show-fields=mode,type,object,size,file $T

would be equivalent to

    git ls-tree --long $T

The field order may need to be thought through, especially when "-z"
output is not being used.  We may need a rule to require "file" to
be at the end, if it exists, or an even simpler rule "you can choose which
fields are shown but the order they come out is not affected" (i.e.
"--show-fields=mode,type" and "--show-fields=type,mode" give the
same output).

I am OK if we started with "only a single field allowed" and extend
it to support multiple fields later (until that happens, we cannot
emulate the "--long" output, though).  Then we do not have to answer
two tricky questions, what to do with the output order, and what
field separators are used in the output.
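
A rough illustration of the "you can choose which fields are shown but
the order they come out is not affected" rule, written as a shell filter
over the default `git ls-tree` output rather than as a change to
ls-tree itself.  The function name `show_fields` and the field names
`mode`, `type`, `object` and `file` are assumptions made for this
sketch; `--show-fields` is only a proposal in this thread, not an
existing option.

```shell
#!/bin/sh
# Hypothetical helper: emulate the proposed "--show-fields" rule on
# default ls-tree output read from stdin.
# Input lines look like: "<mode> SP <type> SP <object> TAB <file>".
# Fields always come out in the canonical order mode,type,object,file,
# no matter how the caller orders the list.
show_fields () {
	awk -v want="$1" '
	BEGIN {
		n = split(want, list, ",")
		for (i = 1; i <= n; i++)
			show[list[i]] = 1
	}
	{
		# Split at the first TAB: metadata on the left, path on the right.
		tab = index($0, "\t")
		meta = substr($0, 1, tab - 1)
		file = substr($0, tab + 1)
		split(meta, m, " ")
		out = ""
		sep = ""
		if (show["mode"])   { out = out sep m[1]; sep = " " }
		if (show["type"])   { out = out sep m[2]; sep = " " }
		if (show["object"]) { out = out sep m[3]; sep = " " }
		# Keep the TAB before the path, as the default output does.
		if (show["file"])   { out = out (sep == "" ? "" : "\t") file }
		print out
	}'
}

# "type,mode" and "mode,type" select the same columns in the same order.
printf '100644 blob abc123\tdir/a file.txt\n' | show_fields type,mode
# prints "100644 blob"
```

Starting with a single field, as suggested above, would correspond to
calling the helper with one name only (e.g. `show_fields object`), which
sidesteps both the ordering and the separator questions.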

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v3 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree"
  2021-11-22  8:07     ` [PATCH v3 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree" Teng Long
  2021-11-22 18:11       ` Peter Baumann
  2021-11-22 18:54       ` Junio C Hamano
@ 2021-11-23  0:14       ` Đoàn Trần Công Danh
  2 siblings, 0 replies; 224+ messages in thread
From: Đoàn Trần Công Danh @ 2021-11-23  0:14 UTC (permalink / raw)
  To: Teng Long; +Cc: git, avarab, gitster, peff

On 2021-11-22 16:07:28+0800, Teng Long <dyroneteng@gmail.com> wrote:
> Sometimes, we only want to get the objects from output of `ls-tree`
> and commands like `sed` or `cut` is usually used to intercept the
> origin output to achieve this purpose in practical.
> 
> This commit supply an option names `--oid-only` to let `git ls-tree`
> only print out the OID of the object. `--oid-only` and `--name-only`
> are mutually exclusive in use.
> 
> Signed-off-by: Teng Long <dyroneteng@gmail.com>
> ---
>  Documentation/git-ls-tree.txt |  8 +++++--
>  builtin/ls-tree.c             | 27 ++++++++++++++++-------
>  t/t3104-ls-tree-oid.sh        | 40 +++++++++++++++++++++++++++++++++++
>  3 files changed, 65 insertions(+), 10 deletions(-)
>  create mode 100755 t/t3104-ls-tree-oid.sh
> 
> diff --git a/Documentation/git-ls-tree.txt b/Documentation/git-ls-tree.txt
> index db02d6d79a..bc711dc00a 100644
> --- a/Documentation/git-ls-tree.txt
> +++ b/Documentation/git-ls-tree.txt
> @@ -10,7 +10,8 @@ SYNOPSIS
>  --------
>  [verse]
>  'git ls-tree' [-d] [-r] [-t] [-l] [-z]
> -	    [--name-only] [--name-status] [--full-name] [--full-tree] [--abbrev[=<n>]]
> +	    [--name-only] [--name-status] [--oid-only]

Please indicate those options are incompatible (as someone else said):

	[--name-only | --name-status | --oid-only]

> +	    [--full-name] [--full-tree] [--abbrev[=<n>]]
>  	    <tree-ish> [<path>...]
>  
>  DESCRIPTION
> @@ -59,7 +60,10 @@ OPTIONS
>  --name-only::
>  --name-status::
>  	List only filenames (instead of the "long" output), one per line.
> -
> +	Cannot be used with `--oid-only` together.
> +--oid-only::
> +	List only OIDs of the objects, one per line. Cannot be used with
> +	`--name-only` or `--name-status` together.
>  --abbrev[=<n>]::
>  	Instead of showing the full 40-byte hexadecimal object
>  	lines, show the shortest prefix that is at least '<n>'
> diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
> index 3a442631c7..1e4a82e669 100644
> --- a/builtin/ls-tree.c
> +++ b/builtin/ls-tree.c
> @@ -18,19 +18,26 @@ static int line_termination = '\n';
>  #define LS_RECURSIVE 1
>  #define LS_TREE_ONLY 2
>  #define LS_SHOW_TREES 4
> -#define LS_NAME_ONLY 8
> -#define LS_SHOW_SIZE 16
> +#define LS_SHOW_SIZE 8
>  static int abbrev;
>  static int ls_options;
>  static struct pathspec pathspec;
>  static int chomp_prefix;
>  static const char *ls_tree_prefix;
>  
> -static const  char * const ls_tree_usage[] = {
> +static const char * const ls_tree_usage[] = {
>  	N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
>  	NULL
>  };
>  
> +enum {
> +	MODE_UNSPECIFIED = 0,
> +	MODE_NAME_ONLY,
> +	MODE_OID_ONLY
> +};
> +
> +static int cmdmode = MODE_UNSPECIFIED;
> +
>  static int show_recursive(const char *base, int baselen, const char *pathname)
>  {
>  	int i;
> @@ -90,7 +97,12 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
>  	else if (ls_options & LS_TREE_ONLY)
>  		return 0;
>  
> -	if (!(ls_options & LS_NAME_ONLY)) {
> +	if (cmdmode == 2) {

I think it's better to use the enum name:

	if (cmdmode == MODE_OID_ONLY) {

> +		printf("%s\n", find_unique_abbrev(oid, abbrev));
> +		return 0;
> +	}
> +
> +	if (cmdmode == 0) {

Ditto:

	if (cmdmode == MODE_UNSPECIFIED) {

Speaking about this, where will MODE_NAME_ONLY be used?

>  		if (ls_options & LS_SHOW_SIZE) {
>  			char size_text[24];
>  			if (!strcmp(type, blob_type)) {
> @@ -135,10 +147,9 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
>  			    N_("terminate entries with NUL byte"), 0),
>  		OPT_BIT('l', "long", &ls_options, N_("include object size"),
>  			LS_SHOW_SIZE),
> -		OPT_BIT(0, "name-only", &ls_options, N_("list only filenames"),
> -			LS_NAME_ONLY),
> -		OPT_BIT(0, "name-status", &ls_options, N_("list only filenames"),
> -			LS_NAME_ONLY),
> +		OPT_CMDMODE('n', "name-only", &cmdmode, N_("list only filenames"), MODE_NAME_ONLY),
> +		OPT_CMDMODE('s', "name-status", &cmdmode, N_("list only filenames"), MODE_NAME_ONLY),
> +		OPT_CMDMODE('o', "oid-only", &cmdmode, N_("list only oids"), MODE_OID_ONLY),
>  		OPT_SET_INT(0, "full-name", &chomp_prefix,
>  			    N_("use full path names"), 0),
>  		OPT_BOOL(0, "full-tree", &full_tree,
> diff --git a/t/t3104-ls-tree-oid.sh b/t/t3104-ls-tree-oid.sh
> new file mode 100755
> index 0000000000..4c02cdd3c3
> --- /dev/null
> +++ b/t/t3104-ls-tree-oid.sh
> @@ -0,0 +1,40 @@
> +#!/bin/sh
> +
> +test_description='git ls-tree oids handling.'
> +
> +. ./test-lib.sh
> +
> +test_expect_success 'setup' '
> +	echo 111 >1.txt &&
> +	echo 222 >2.txt &&
> +	mkdir -p path0/a/b/c &&
> +	echo 333 >path0/a/b/c/3.txt &&
> +	find *.txt path* \( -type f -o -type l \) -print |
> +	xargs git update-index --add &&
> +	tree=$(git write-tree) &&
> +	echo $tree
> +'
> +
> +test_expect_success 'usage: --oid-only' '
> +	git ls-tree --oid-only $tree >current &&
> +	git ls-tree $tree | awk "{print \$3}" >expected &&
> +	test_cmp current expected
> +'
> +
> +test_expect_success 'usage: --oid-only with -r' '
> +	git ls-tree --oid-only -r $tree >current &&
> +	git ls-tree -r $tree | awk "{print \$3}" >expected &&
> +	test_cmp current expected
> +'
> +
> +test_expect_success 'usage: --oid-only with --abbrev' '
> +	git ls-tree --oid-only --abbrev=6 $tree >current &&
> +	git ls-tree --abbrev=6 $tree | awk "{print \$3}" > expected &&
> +	test_cmp current expected
> +'
> +
> +test_expect_failure 'usage: incompatible options: --name-only with --oid-only' '
> +	test_incompatible_usage git ls-tree --oid-only --name-only
> +'
> +
> +test_done

It seems like you haven't updated the test code from v2

-- 
Danh

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v3 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree"
  2021-11-22 18:54       ` Junio C Hamano
@ 2021-11-23  1:09         ` Ævar Arnfjörð Bjarmason
  2021-11-23  1:26           ` Junio C Hamano
  0 siblings, 1 reply; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-23  1:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Teng Long, git, congdanhqx, peff


On Mon, Nov 22 2021, Junio C Hamano wrote:

> Teng Long <dyroneteng@gmail.com> writes:
>
>> Sometimes, we only want to get the objects from output of `ls-tree`
>> and commands like `sed` or `cut` is usually used to intercept the
>> origin output to achieve this purpose in practical.
>
> "in practical" -> "in practice".
>
> That's true and that is exactly how this plumbing command was designed
> to be used.
>
>> This commit supply an option names `--oid-only` to let `git ls-tree`
>> only print out the OID of the object. `--oid-only` and `--name-only`
>> are mutually exclusive in use.
>
>     Teach the "--oid-only" option to tell the command to only show
>     the object name, just like "--name-only" option tells the
>     command to only show the path component, for each entry.  These
>     two options are mutually exclusive.
>
> perhaps?
>
> The above leaves "mode-only" and "type-only".  I wonder if it is a
> better design to add just one new option, --hide-fields, and make
> the existing --name-only into a synonym to
>
>     git ls-tree --hide-fields=mode,type,object $T
>
> which would mean we do not need to end up with four mutually
> exclusive options, and anybody who wants to only see object names
> can do
>
>     git ls-tree --hide-fields=mode,type,file $T
>
> Note: the above uses the terminology in the OUTPUT FORMAT section;
> if we want to use "name" instead of "file", I am perfectly OK with
> it, but then we should update the documentation to match.
>
> Come to think of it, I think "--show-fields" may work even better
> than "--hide-fields".  We can use it to get rid of the "--long"
> option:
>
>     git ls-tree --show-fields=mode,type,object,size,file $T
>
> would be equivalent to
>
>     git ls-tree --long $T
>
> The field order may need to be thought through, especially when "-z"
> output is not being used.  We may need a rule to require "file" to
> be at the end, if exists, or even simpler rule "you can choose which
> fields are shown but the order they come out is not affected" (i.e.
> "--show-fields=mode,type" and "--show-fields=type,mode" give the
> same output).
>
> I am OK if we started with "only a single field allowed" and extend
> it to support multiple fields later (until that happens, we cannot
> emulate the "--long" output, though).  Then we do not have to answer
> two tricky questions, what to do with the output order, and what
> field separators are used in the output.

All of which (and more) would also be addressed in an obvious way by
just supporting --format as I suggested in
https://lore.kernel.org/git/211115.86o86lqe3c.gmgdl@evledraar.gmail.com/;
don't you think that's a better approach?

As noted in
https://lore.kernel.org/git/211115.86mtm5saz7.gmgdl@evledraar.gmail.com/
we could start by simply dying if the format is not on a small list of
formats we handle, i.e. not implement the strbuf_expand() change to
start with.

A --show-fields and --hide-fields seems like a rather elaborate
interface in lieu of just having a --format.

You'd presumably then want a --field-separator and
--name-field-separator (we use SP and TAB for the two, currently), so
we've got N options now just to emulate what a --format would do for us.
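
For reference, the two separators matter as soon as a path contains a
space: splitting on any whitespace still finds the object name in the
third column, but a whitespace-splitting tool such as awk would see only
"my" as the fourth field of the entry below, whereas cutting at the TAB
first keeps the path whole.  A small sketch with a made-up entry; the
helper names `oid_of` and `path_of` are hypothetical, and it assumes the
path itself contains no TAB.

```shell
#!/bin/sh
# Default ls-tree entry: "<mode> SP <type> SP <object> TAB <path>".
# oid_of and path_of are illustrative helpers, not part of git.

# Keep the part before the TAB, then take the third SP-separated field.
oid_of () {
	cut -f1 | cut -d " " -f3
}

# Keep everything after the first TAB, so spaces in the path survive.
path_of () {
	cut -f2-
}

# A made-up entry whose path contains a space.
entry=$(printf '100644 blob %s\t%s' \
	0123456789abcdef0123456789abcdef01234567 "my notes.txt")

printf '%s\n' "$entry" | oid_of    # prints the 40-hex object name
printf '%s\n' "$entry" | path_of   # prints "my notes.txt"
```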

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v3 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree"
  2021-11-23  1:09         ` Ævar Arnfjörð Bjarmason
@ 2021-11-23  1:26           ` Junio C Hamano
  2021-11-23  2:28             ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 224+ messages in thread
From: Junio C Hamano @ 2021-11-23  1:26 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Teng Long, git, congdanhqx, peff

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> All of which (and more) would also be addressed in an obvious way by
> just supporting --format as I suggested in
> https://lore.kernel.org/git/211115.86o86lqe3c.gmgdl@evledraar.gmail.com/;
> don't you think that's a better approach?

That is what I would call over-engineering that I would rather not
have in low-level plumbing.

I am all for making _parsing_ the output from the tool easier by
scripts; I am not interested in eliminating the _output_ by scripts.
They should capture and format the pieces we output in any way they
want.

So, no, I do not think it is a better approach at all.

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v3 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree"
  2021-11-23  1:26           ` Junio C Hamano
@ 2021-11-23  2:28             ` Ævar Arnfjörð Bjarmason
  2021-11-23  2:55               ` Junio C Hamano
  0 siblings, 1 reply; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-23  2:28 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Teng Long, git, congdanhqx, peff


On Mon, Nov 22 2021, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> All of which (and more) would also be addressed in an obvious way by
>> just supporting --format as I suggested in
>> https://lore.kernel.org/git/211115.86o86lqe3c.gmgdl@evledraar.gmail.com/;
>> don't you think that's a better approach?
>
> That is what I would call over-engineering that I would rather not
> have in low-level plumbing.
>
> I am all for making _parsing_ the output from the tool easier by
> scripts; I am not interested in eliminating the _output_ by scripts.
> They should capture and format the pieces we output in any way they
> want.
>
> So, no, I do not think it is a better approach at all.

We've got --format for for-each-ref and family (also branch etc.), and
for the "log" family.

I'm not sure I understand what you're saying, do you think if we could
go back and change it that the "FIELD NAMES" in git-for-each-ref (which
is plumbing) would have been better done as
--field-name=refname,objecttype,... etc?

Having used it extensively, it's been very handy to have the flexibility
of formatting, e.g. to specify arbitrary delimiters.

It also leaves the door open to teaching ls-tree etc. the %(if) syntax
in the ref-filter, e.g. if you'd like to only print out certain data for
certain object types or whatever.

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v3 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree"
  2021-11-23  2:28             ` Ævar Arnfjörð Bjarmason
@ 2021-11-23  2:55               ` Junio C Hamano
  2021-11-23  3:35                 ` Junio C Hamano
  0 siblings, 1 reply; 224+ messages in thread
From: Junio C Hamano @ 2021-11-23  2:55 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Teng Long, git, congdanhqx, peff

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> That is what I would call over-engineering that I would rather not
>> have in low-level plumbing.
>> ...
> We've got --format for for-each-ref and family (also branch etc.), and
> for the "log" family.

But I didn't comment on them. ls-tree is a lot lower-level plumbing
where --format does not belong in my mind.




^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v3 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree"
  2021-11-23  2:55               ` Junio C Hamano
@ 2021-11-23  3:35                 ` Junio C Hamano
  2021-11-23 11:04                   ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 224+ messages in thread
From: Junio C Hamano @ 2021-11-23  3:35 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Teng Long, git, congdanhqx, peff

Junio C Hamano <gitster@pobox.com> writes:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>>> That is what I would call over-engineering that I would rather not
>>> have in low-level plumbing.
>>> ...
>> We've got --format for for-each-ref and family (also branch etc.), and
>> for the "log" family.
>
> But I didn't comment on them. ls-tree is a lot lower-level plumbing
> where --format does not belong in my mind.

There is a much more practical reason why I'd prefer a less flexible
and good enough interface.

I can see, without coding it myself but from mere memory of what the
code looked like, how such a "we allow you to choose which fields to
include, but you do not get to choose the order of fields or any
other string in the output" scheme can be done with minimum disruption
to the existing code and without introducing a bug.  On the other hand,
I am fairly certain that anything more flexible than that will risk
new bugs involved in any large shuffling of the code, which I am
getting tired of.

So there.

^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v4 0/1] ls-tree.c: support `--oid-only` option
  2021-11-22  8:07   ` [PATCH v3 0/1] ls-tree.c: support `--oid-only` option Teng Long
  2021-11-22  8:07     ` [PATCH v3 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree" Teng Long
@ 2021-11-23  4:58     ` Teng Long
  2021-11-23  4:58       ` [PATCH v4 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree" Teng Long
  2021-12-08  2:08       ` [PATCH v5 0/1] support `--object-only` " Teng Long
  1 sibling, 2 replies; 224+ messages in thread
From: Teng Long @ 2021-11-23  4:58 UTC (permalink / raw)
  To: git; +Cc: avarab, congdanhqx, gitster, peff, Teng Long

Thanks for the discussion on v3 (even though I sent a patch with the
wrong contents under the right cover letter). I have read the comments
and think I should send a corrected patch first, so this version includes:

    1. Commit message modifications (Junio C Hamano's advice)
    2. Documentation modifications (Peter Baumann's advice)
    3. To use the MODE enum name instead (Đoàn Trần Công Danh's advice)

I will reply to the other discussions today.

Teng Long (1):
  ls-tree.c: support `--oid-only` option for "git-ls-tree"

 Documentation/git-ls-tree.txt | 18 ++++++++++++---
 builtin/ls-tree.c             | 30 +++++++++++++++++-------
 t/t3104-ls-tree-oid.sh        | 43 +++++++++++++++++++++++++++++++++++
 3 files changed, 80 insertions(+), 11 deletions(-)
 create mode 100755 t/t3104-ls-tree-oid.sh

Range-diff against v3:
1:  8b68568d6c ! 1:  63876dbeb7 ls-tree.c: support `--oid-only` option for "git-ls-tree"
    @@ Commit message
     
         Sometimes, we only want to get the objects from output of `ls-tree`
         and commands like `sed` or `cut` is usually used to intercept the
    -    origin output to achieve this purpose in practical.
    +    origin output to achieve this purpose in practice.
     
    -    This commit supply an option names `--oid-only` to let `git ls-tree`
    -    only print out the OID of the object. `--oid-only` and `--name-only`
    -    are mutually exclusive in use.
    +    This commit teach the "--oid-only" option to tell the command to
    +    only show the object name, just like "--name-only" option tells the
    +    command to only show the path component, for each entry. These two
    +    options are mutually exclusive.
     
    -    Reviewed-by: Jeff King <peff@peff.net>
    -    Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    -    Reviewed-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
         Signed-off-by: Teng Long <dyroneteng@gmail.com>
     
      ## Documentation/git-ls-tree.txt ##
    -@@ Documentation/git-ls-tree.txt: SYNOPSIS
    +@@ Documentation/git-ls-tree.txt: git-ls-tree - List the contents of a tree object
    + SYNOPSIS
      --------
      [verse]
    - 'git ls-tree' [-d] [-r] [-t] [-l] [-z]
    +-'git ls-tree' [-d] [-r] [-t] [-l] [-z]
     -	    [--name-only] [--name-status] [--full-name] [--full-tree] [--abbrev[=<n>]]
    -+	    [--name-only] [--name-status] [--oid-only]
    ++'git ls-tree' [-d] [-r] [-t] [-l] [-z] [-n] [-s] [-o]
    ++	    [--name-only | --oid-only]
    ++	    [--name-status | --oid-only]
     +	    [--full-name] [--full-tree] [--abbrev[=<n>]]
      	    <tree-ish> [<path>...]
      
      DESCRIPTION
     @@ Documentation/git-ls-tree.txt: OPTIONS
    + 	\0 line termination on output and do not quote filenames.
    + 	See OUTPUT FORMAT below for more information.
    + 
    ++-n::
      --name-only::
    - --name-status::
    +---name-status::
      	List only filenames (instead of the "long" output), one per line.
    --
    -+	Cannot be used with `--oid-only` together.
    ++	Cannot be combined with `--oid-only`.
    ++
    ++-s::
    ++--name-status::
    ++	Consistent behavior with `--name-only`.
    ++
    ++-o::
     +--oid-only::
    -+	List only OIDs of the objects, one per line. Cannot be used with
    -+	`--name-only` or `--name-status` together.
    ++	List only names of the objects, one per line. Cannot be combined
    ++	with `--name-only` or `--name-status`.
    + 
      --abbrev[=<n>]::
      	Instead of showing the full 40-byte hexadecimal object
    - 	lines, show the shortest prefix that is at least '<n>'
     
      ## builtin/ls-tree.c ##
     @@ builtin/ls-tree.c: static int line_termination = '\n';
    @@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strb
      		return 0;
      
     -	if (!(ls_options & LS_NAME_ONLY)) {
    -+	if (cmdmode == 2) {
    ++	if (cmdmode == MODE_OID_ONLY) {
     +		printf("%s\n", find_unique_abbrev(oid, abbrev));
     +		return 0;
     +	}
     +
    -+	if (cmdmode == 0) {
    ++	if (cmdmode == MODE_UNSPECIFIED) {
      		if (ls_options & LS_SHOW_SIZE) {
      			char size_text[24];
      			if (!strcmp(type, blob_type)) {
    @@ builtin/ls-tree.c: int cmd_ls_tree(int argc, const char **argv, const char *pref
     -			LS_NAME_ONLY),
     -		OPT_BIT(0, "name-status", &ls_options, N_("list only filenames"),
     -			LS_NAME_ONLY),
    -+		OPT_CMDMODE('n', "name-only", &cmdmode, N_("list only filenames"), MODE_NAME_ONLY),
    -+		OPT_CMDMODE('s', "name-status", &cmdmode, N_("list only filenames"), MODE_NAME_ONLY),
    -+		OPT_CMDMODE('o', "oid-only", &cmdmode, N_("list only oids"), MODE_OID_ONLY),
    ++		OPT_CMDMODE('n', "name-only", &cmdmode,
    ++			    N_("list only filenames"), MODE_NAME_ONLY),
    ++		OPT_CMDMODE('s', "name-status", &cmdmode,
    ++			    N_("list only filenames"), MODE_NAME_ONLY),
    ++		OPT_CMDMODE('o', "oid-only", &cmdmode,
    ++			    N_("list only oids"), MODE_OID_ONLY),
      		OPT_SET_INT(0, "full-name", &chomp_prefix,
      			    N_("use full path names"), 0),
      		OPT_BOOL(0, "full-tree", &full_tree,
    @@ t/t3104-ls-tree-oid.sh (new)
     +. ./test-lib.sh
     +
     +test_expect_success 'setup' '
    -+	echo 111 >1.txt &&
    -+	echo 222 >2.txt &&
    -+	mkdir -p path0/a/b/c &&
    -+	echo 333 >path0/a/b/c/3.txt &&
    ++	test_commit A &&
    ++	test_commit B &&
    ++	mkdir -p C &&
    ++	test_commit C/D.txt &&
     +	find *.txt path* \( -type f -o -type l \) -print |
     +	xargs git update-index --add &&
     +	tree=$(git write-tree) &&
    @@ t/t3104-ls-tree-oid.sh (new)
     +
     +test_expect_success 'usage: --oid-only' '
     +	git ls-tree --oid-only $tree >current &&
    -+	git ls-tree $tree | awk "{print \$3}" >expected &&
    ++	git ls-tree $tree >result &&
    ++	cut -f1 result | cut -d " " -f3 >expected &&
     +	test_cmp current expected
     +'
     +
     +test_expect_success 'usage: --oid-only with -r' '
     +	git ls-tree --oid-only -r $tree >current &&
    -+	git ls-tree -r $tree | awk "{print \$3}" >expected &&
    ++	git ls-tree -r $tree >result &&
    ++	cut -f1 result | cut -d " " -f3 >expected &&
     +	test_cmp current expected
     +'
     +
     +test_expect_success 'usage: --oid-only with --abbrev' '
     +	git ls-tree --oid-only --abbrev=6 $tree >current &&
    -+	git ls-tree --abbrev=6 $tree | awk "{print \$3}" > expected &&
    ++	git ls-tree --abbrev=6 $tree >result &&
    ++	cut -f1 result | cut -d " " -f3 >expected &&
     +	test_cmp current expected
     +'
     +
    -+test_expect_failure 'usage: incompatible options: --name-only with --oid-only' '
    -+	test_incompatible_usage git ls-tree --oid-only --name-only
    ++test_expect_success 'usage: incompatible options: --name-only with --oid-only' '
    ++	test_expect_code 129 git ls-tree --oid-only --name-only
     +'
     +
     +test_done
-- 
2.33.1.10.g75523f744f.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v4 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree"
  2021-11-23  4:58     ` [PATCH v4 0/1] ls-tree.c: support `--oid-only` option Teng Long
@ 2021-11-23  4:58       ` Teng Long
  2021-11-23 22:32         ` Junio C Hamano
  2021-12-08  2:08       ` [PATCH v5 0/1] support `--object-only` " Teng Long
  1 sibling, 1 reply; 224+ messages in thread
From: Teng Long @ 2021-11-23  4:58 UTC (permalink / raw)
  To: git; +Cc: avarab, congdanhqx, gitster, peff, Teng Long

Sometimes, we only want to get the objects from output of `ls-tree`
and commands like `sed` or `cut` is usually used to intercept the
origin output to achieve this purpose in practice.

This commit teach the "--oid-only" option to tell the command to
only show the object name, just like "--name-only" option tells the
command to only show the path component, for each entry. These two
options are mutually exclusive.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
 Documentation/git-ls-tree.txt | 18 ++++++++++++---
 builtin/ls-tree.c             | 30 +++++++++++++++++-------
 t/t3104-ls-tree-oid.sh        | 43 +++++++++++++++++++++++++++++++++++
 3 files changed, 80 insertions(+), 11 deletions(-)
 create mode 100755 t/t3104-ls-tree-oid.sh

diff --git a/Documentation/git-ls-tree.txt b/Documentation/git-ls-tree.txt
index db02d6d79a..fd2a871ca5 100644
--- a/Documentation/git-ls-tree.txt
+++ b/Documentation/git-ls-tree.txt
@@ -9,8 +9,10 @@ git-ls-tree - List the contents of a tree object
 SYNOPSIS
 --------
 [verse]
-'git ls-tree' [-d] [-r] [-t] [-l] [-z]
-	    [--name-only] [--name-status] [--full-name] [--full-tree] [--abbrev[=<n>]]
+'git ls-tree' [-d] [-r] [-t] [-l] [-z] [-n] [-s] [-o]
+	    [--name-only | --oid-only]
+	    [--name-status | --oid-only]
+	    [--full-name] [--full-tree] [--abbrev[=<n>]]
 	    <tree-ish> [<path>...]
 
 DESCRIPTION
@@ -56,9 +58,19 @@ OPTIONS
 	\0 line termination on output and do not quote filenames.
 	See OUTPUT FORMAT below for more information.
 
+-n::
 --name-only::
---name-status::
 	List only filenames (instead of the "long" output), one per line.
+	Cannot be combined with `--oid-only`.
+
+-s::
+--name-status::
+	Consistent behavior with `--name-only`.
+
+-o::
+--oid-only::
+	List only names of the objects, one per line. Cannot be combined
+	with `--name-only` or `--name-status`.
 
 --abbrev[=<n>]::
 	Instead of showing the full 40-byte hexadecimal object
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3a442631c7..0c2153a5ad 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -18,19 +18,26 @@ static int line_termination = '\n';
 #define LS_RECURSIVE 1
 #define LS_TREE_ONLY 2
 #define LS_SHOW_TREES 4
-#define LS_NAME_ONLY 8
-#define LS_SHOW_SIZE 16
+#define LS_SHOW_SIZE 8
 static int abbrev;
 static int ls_options;
 static struct pathspec pathspec;
 static int chomp_prefix;
 static const char *ls_tree_prefix;
 
-static const  char * const ls_tree_usage[] = {
+static const char * const ls_tree_usage[] = {
 	N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
 	NULL
 };
 
+enum {
+	MODE_UNSPECIFIED = 0,
+	MODE_NAME_ONLY,
+	MODE_OID_ONLY
+};
+
+static int cmdmode = MODE_UNSPECIFIED;
+
 static int show_recursive(const char *base, int baselen, const char *pathname)
 {
 	int i;
@@ -90,7 +97,12 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	else if (ls_options & LS_TREE_ONLY)
 		return 0;
 
-	if (!(ls_options & LS_NAME_ONLY)) {
+	if (cmdmode == MODE_OID_ONLY) {
+		printf("%s\n", find_unique_abbrev(oid, abbrev));
+		return 0;
+	}
+
+	if (cmdmode == MODE_UNSPECIFIED) {
 		if (ls_options & LS_SHOW_SIZE) {
 			char size_text[24];
 			if (!strcmp(type, blob_type)) {
@@ -135,10 +147,12 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 			    N_("terminate entries with NUL byte"), 0),
 		OPT_BIT('l', "long", &ls_options, N_("include object size"),
 			LS_SHOW_SIZE),
-		OPT_BIT(0, "name-only", &ls_options, N_("list only filenames"),
-			LS_NAME_ONLY),
-		OPT_BIT(0, "name-status", &ls_options, N_("list only filenames"),
-			LS_NAME_ONLY),
+		OPT_CMDMODE('n', "name-only", &cmdmode,
+			    N_("list only filenames"), MODE_NAME_ONLY),
+		OPT_CMDMODE('s', "name-status", &cmdmode,
+			    N_("list only filenames"), MODE_NAME_ONLY),
+		OPT_CMDMODE('o', "oid-only", &cmdmode,
+			    N_("list only oids"), MODE_OID_ONLY),
 		OPT_SET_INT(0, "full-name", &chomp_prefix,
 			    N_("use full path names"), 0),
 		OPT_BOOL(0, "full-tree", &full_tree,
diff --git a/t/t3104-ls-tree-oid.sh b/t/t3104-ls-tree-oid.sh
new file mode 100755
index 0000000000..2d349f6e46
--- /dev/null
+++ b/t/t3104-ls-tree-oid.sh
@@ -0,0 +1,43 @@
+#!/bin/sh
+
+test_description='git ls-tree oids handling.'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	test_commit A &&
+	test_commit B &&
+	mkdir -p C &&
+	test_commit C/D.txt &&
+	find *.txt path* \( -type f -o -type l \) -print |
+	xargs git update-index --add &&
+	tree=$(git write-tree) &&
+	echo $tree
+'
+
+test_expect_success 'usage: --oid-only' '
+	git ls-tree --oid-only $tree >current &&
+	git ls-tree $tree >result &&
+	cut -f1 result | cut -d " " -f3 >expected &&
+	test_cmp current expected
+'
+
+test_expect_success 'usage: --oid-only with -r' '
+	git ls-tree --oid-only -r $tree >current &&
+	git ls-tree -r $tree >result &&
+	cut -f1 result | cut -d " " -f3 >expected &&
+	test_cmp current expected
+'
+
+test_expect_success 'usage: --oid-only with --abbrev' '
+	git ls-tree --oid-only --abbrev=6 $tree >current &&
+	git ls-tree --abbrev=6 $tree >result &&
+	cut -f1 result | cut -d " " -f3 >expected &&
+	test_cmp current expected
+'
+
+test_expect_success 'usage: incompatible options: --name-only with --oid-only' '
+	test_expect_code 129 git ls-tree --oid-only --name-only
+'
+
+test_done
-- 
2.33.1.10.g75523f744f.dirty



* Re: [PATCH v3 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree"
  2021-11-23  3:35                 ` Junio C Hamano
@ 2021-11-23 11:04                   ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-23 11:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Teng Long, git, congdanhqx, peff


On Mon, Nov 22 2021, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>>
>>>> That is what I would call over-engineering that I would rather not
>>>> to have in low level plumbing.
>>>> ...
>>> We've got --format for for-each-ref and family (also branch etc.), and
>>> for the "log" family.
>>
>> But I didn't comment on them. ls-tree is a lot lower-level plumbing
>> where --format does not belong in my mind.

Yes, you're just talking about ls-tree here. I'm just trying to
understand what you meant with:

    I am not interested in eliminating the _output_ by scripts.
    They should capture and format the pieces we output in any way they
    want.

Which reads to me like "we provide the data, you pretty-fy it". In this
case the proposed feature doesn't need a patch to git, but can also be
done as:

    git ls-tree HEAD | cut -d$'\t' -f1 | cut -d ' ' -f3

I think it's useful. I'm just trying to understand what you think about
such plumbing output.

> There is a lot more practical reason why I'd prefer a less flexible
> and good enough interface.
>
> I can see, without coding it myself but from mere memory of how the
> code looked like, how such a "we allow you to choose which field to
> include, but you do not get to choose the order of fields or any
> other string in the output" can be done with minimum disruption to
> the existing code and without introducing a bug.  On the other hand,
> I am fairly certain that anything more flexible than that will risk
> new bugs involved in any large shuffling of the code, which I am
> getting tired of.

To be clear, I wasn't talking about running with the WIP patch I had in
<211115.86o86lqe3c.gmgdl@evledraar.gmail.com> here, but that the
interface would leave the door open to it. So something like the below.

This does what --oid-only does without adding that switch. Instead we
add it to our list of 4 supported hardcoded formats, all of which map
to one of the MODE_* flags.

We could just document that we support that limited list for now, and
that we might add more in the future.

So it's just a way of adding a new MODE_* without supporting an ever
growing list of flags, --oid-only, --objectmode-only, --objectsize-only
etc.

Then if we'd ever want to generalize this in the future we can pick up
something like my WIP patch and we'd have support for any arbitrary
format.

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 1e4a82e669a..e1e746ae02a 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -30,10 +30,11 @@ static const char * const ls_tree_usage[] = {
 	NULL
 };
 
-enum {
+enum ls_tree_cmdmode {
 	MODE_UNSPECIFIED = 0,
 	MODE_NAME_ONLY,
-	MODE_OID_ONLY
+	MODE_OID_ONLY,
+	MODE_LONG,
 };
 
 static int cmdmode = MODE_UNSPECIFIED;
@@ -131,11 +132,22 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	return retval;
 }
 
+static struct {
+	enum ls_tree_cmdmode cmdmode;
+	const char *fmt;
+} allowed_formats[] = {
+	{ MODE_UNSPECIFIED,	"%(objectmode) %(objecttype) %(objectname)%09%(path)" },
+	{ MODE_NAME_ONLY,	"%(path)" },
+	{ MODE_OID_ONLY,	"%(objectname)" },
+	{ MODE_LONG,		"%(objectmode) %(objecttype) %(objectsize) %(objectname)%09%(path)" },
+};
+
 int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 {
 	struct object_id oid;
 	struct tree *tree;
 	int i, full_tree = 0;
+	const char *format = NULL;
 	const struct option ls_tree_options[] = {
 		OPT_BIT('d', NULL, &ls_options, N_("only show trees"),
 			LS_TREE_ONLY),
@@ -149,7 +161,8 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 			LS_SHOW_SIZE),
 		OPT_CMDMODE('n', "name-only", &cmdmode, N_("list only filenames"), MODE_NAME_ONLY),
 		OPT_CMDMODE('s', "name-status", &cmdmode, N_("list only filenames"), MODE_NAME_ONLY),
-		OPT_CMDMODE('o', "oid-only", &cmdmode, N_("list only oids"), MODE_OID_ONLY),
+		OPT_STRING(0 , "format", &format, N_("format"),
+			   N_("(limited) format to use for the output")),
 		OPT_SET_INT(0, "full-name", &chomp_prefix,
 			    N_("use full path names"), 0),
 		OPT_BOOL(0, "full-tree", &full_tree,
@@ -170,6 +183,22 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 		ls_tree_prefix = prefix = NULL;
 		chomp_prefix = 0;
 	}
+
+	if (format && cmdmode)
+		die(_("--format and --name-only, --long etc. are incompatible"));
+	if (format) {
+		size_t i;
+
+		for (i = 0; i <= ARRAY_SIZE(allowed_formats); i++) {
+			if (i == ARRAY_SIZE(allowed_formats))
+				die(_("your --format=%s is not on the whitelist of supported formats"), format);
+			if (!strcmp(format, allowed_formats[i].fmt)) {
+				cmdmode = allowed_formats[i].cmdmode;
+				break;
+			}
+		}
+	}
+
 	/* -d -r should imply -t, but -d by itself should not have to. */
 	if ( (LS_TREE_ONLY|LS_RECURSIVE) ==
 	    ((LS_TREE_ONLY|LS_RECURSIVE) & ls_options))


* Re: [PATCH v4 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree"
  2021-11-23  4:58       ` [PATCH v4 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree" Teng Long
@ 2021-11-23 22:32         ` Junio C Hamano
  2021-12-06  7:52           ` Teng Long
  0 siblings, 1 reply; 224+ messages in thread
From: Junio C Hamano @ 2021-11-23 22:32 UTC (permalink / raw)
  To: Teng Long; +Cc: git, avarab, congdanhqx, peff

Teng Long <dyroneteng@gmail.com> writes:

> Sometimes, we only want to get the objects from output of `ls-tree`
> and commands like `sed` or `cut` is usually used to intercept the
> origin output to achieve this purpose in practice.

I can guess what "intercept the origin" wants to say, but it does
not roll off the tongue very well.

> This commit teach the "--oid-only" option to tell the command to
> only show the object name, just like "--name-only" option tells the
> command to only show the path component, for each entry. These two
> options are mutually exclusive.

cf. Documentation/SubmittingPatches[[imperative-mood]]

Perhaps like

    We usually pipe the output from `git ls-tree` to tools like
    `sed` or `cut` when we only want to extract some fields.

    When we want only the pathname component, we can pass
    `--name-only` option to omit such a pipeline, but there are no
    options for extracting other fields.

    Teach the "--oid-only" option to the command to only show the
    object name.  This option cannot be used together with
    "--name-only" or "--long".

This does not work well with "--long", right?

> -'git ls-tree' [-d] [-r] [-t] [-l] [-z]
> -	    [--name-only] [--name-status] [--full-name] [--full-tree] [--abbrev[=<n>]]
> +'git ls-tree' [-d] [-r] [-t] [-l] [-z] [-n] [-s] [-o]

Where does the addition of -n/-s/-o fit in the theme of the patch?
This being a plumbing command, an existing option with long name
does not deserve a shorthand (and --oid-only should start with long
name only, too).

If we were to introduce -n as yet another synonym for --name-only
and --name-status (which we would not), the right way to spell it
in the synopsis section would be:

	[-n | --name-only | --name-status]
 
I would think, but this advice is only for your next topic.  We are
not going to add such a synonym.

> +	    [--name-only | --oid-only]
> +	    [--name-status | --oid-only]

This looks very iffy.  Has this been proof-read before sending out?

> +	    [--full-name] [--full-tree] [--abbrev[=<n>]]
>  	    <tree-ish> [<path>...]
>  
>  DESCRIPTION
> @@ -56,9 +58,19 @@ OPTIONS
>  	\0 line termination on output and do not quote filenames.
>  	See OUTPUT FORMAT below for more information.
>  
> +-n::
>  --name-only::
> ---name-status::
>  	List only filenames (instead of the "long" output), one per line.
> +	Cannot be combined with `--oid-only`.
> +
> +-s::
> +--name-status::
> +	Consistent behavior with `--name-only`.

Usually we say A is "Consistent" with B when A and B are different
but are moral equivalent in their respective contexts.  These are
identical, there is no difference.

Lose all of the above change, except for "Cannot be combined with".
The original is just fine.

> +-o::
> +--oid-only::
> +	List only names of the objects, one per line. Cannot be combined
> +	with `--name-only` or `--name-status`.

Or "--long"?  Does this work with it?

Also, lose "-o".

> diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
> index 3a442631c7..0c2153a5ad 100644
> --- a/builtin/ls-tree.c
> +++ b/builtin/ls-tree.c
> @@ -18,19 +18,26 @@ static int line_termination = '\n';
>  #define LS_RECURSIVE 1
>  #define LS_TREE_ONLY 2
>  #define LS_SHOW_TREES 4
> -#define LS_NAME_ONLY 8
> -#define LS_SHOW_SIZE 16
> +#define LS_SHOW_SIZE 8
>  static int abbrev;
>  static int ls_options;
>  static struct pathspec pathspec;
>  static int chomp_prefix;
>  static const char *ls_tree_prefix;
>  
> -static const  char * const ls_tree_usage[] = {
> +static const char * const ls_tree_usage[] = {
>  	N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
>  	NULL
>  };
>  
> +enum {
> +	MODE_UNSPECIFIED = 0,
> +	MODE_NAME_ONLY,
> +	MODE_OID_ONLY
> +};

I suspect that "--long" would be part of this, if we were to go this
route.

OPT_CMDMODE() is a handy way to ensure "--name-only", "--oid-only",
and "--long" are not given together, but it may be overkill to make
only two or three options mutually exclusive.

In any case, once we pass the parsing part, the code should
translate the option into a bitmask that specifies which among
<mode>, <type>, <object-name>, <size>, and <filename> fields are
shown.  It will result in cleaner code in show_tree() if it uses
that set of fields to decide what is shown and how without looking
at the cmdmode enum.
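
For illustration only, a minimal sketch of that translation. The
FIELD_* names and fields_for_mode() are hypothetical and do not come
from any posted patch; the enum mirrors the one under discussion with
a MODE_LONG member assumed to be added.

```c
/* Hypothetical field bits, parenthesized so they are safe in any expression. */
#define FIELD_MODE        (1u << 0)
#define FIELD_TYPE        (1u << 1)
#define FIELD_OBJECT_NAME (1u << 2)
#define FIELD_SIZE        (1u << 3)
#define FIELD_PATH        (1u << 4)

/* Default output: everything except the size column. */
#define FIELD_DEFAULT \
	(FIELD_MODE | FIELD_TYPE | FIELD_OBJECT_NAME | FIELD_PATH)

enum ls_tree_cmdmode {
	MODE_UNSPECIFIED = 0,
	MODE_NAME_ONLY,
	MODE_OID_ONLY,
	MODE_LONG,
};

/*
 * Meant to run once, right after option parsing.  The per-entry
 * callback would then test only the returned bits and never look
 * at the cmdmode enum again.
 */
static unsigned int fields_for_mode(enum ls_tree_cmdmode mode)
{
	switch (mode) {
	case MODE_NAME_ONLY:
		return FIELD_PATH;
	case MODE_OID_ONLY:
		return FIELD_OBJECT_NAME;
	case MODE_LONG:
		return FIELD_DEFAULT | FIELD_SIZE;
	default:
		return FIELD_DEFAULT;
	}
}
```

With this shape, adding another "only this field" mode would touch
the switch above but not the code that prints an entry.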



* Re: [PATCH v4 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree"
  2021-11-23 22:32         ` Junio C Hamano
@ 2021-12-06  7:52           ` Teng Long
  0 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2021-12-06  7:52 UTC (permalink / raw)
  To: gitster; +Cc: avarab, congdanhqx, dyroneteng, git, peff

On Tue, 23 Nov 2021 14:32:35 -0800, Junio C Hamano wrote:

> I can guess what "intercept the origin" wants to say, but it does
> not roll off the tongue very well.
>
> cf. Documentation/SubmittingPatches[[imperative-mood]]

I will learn from and borrow your sentences; they will appear in the
next patch :)

> This does not work well with "--long", right?

I think so.

Peff pointed out earlier that this is arguably a bug, because
`git ls-tree --long --name-only` does not really make sense (the
object size column is not shown in the result).

> Usually we say A is "Consistent" with B when A and B are different
> but are moral equivalent in their respective contexts.  These are
> identical, there is no difference.

Clearly understood now.

> Lose all of the above change, except for "Cannot be combined with".
> The original is just fine.

Will.

> Or "--long"?  Does this work with it?

As I understand the discussion so far, "--long", "--name-only" and
"--oid-only" should be mutually exclusive with each other.

> I suspect that "--long" would be part of this, if we were to go this
> route.

Agree.

> OPT_CMDMODE() is a handy way to ensure "--name-only", "--oid-only",
> and "--long" are not given together, but it may be overkill to make
> only two or three options mutually exclusive.
>
> In any case, once we pass the parsing part, the code should
> translate the option into a bitmask that specifies which among
> <mode>, <type>, <object-name>, <size>, and <filename> fields are
> shown.  It will result in cleaner code in show_tree() if it uses
> that set of fields to decide what is shown and how without looking
> at the cmdmode enum.

Makes sense.

Thanks.


* [PATCH v5 0/1] support `--object-only` option for "git-ls-tree"
  2021-11-23  4:58     ` [PATCH v4 0/1] ls-tree.c: support `--oid-only` option Teng Long
  2021-11-23  4:58       ` [PATCH v4 1/1] ls-tree.c: support `--oid-only` option for "git-ls-tree" Teng Long
@ 2021-12-08  2:08       ` Teng Long
  2021-12-08  2:08         ` [PATCH v5 1/1] ls-tree.c: " Teng Long
  2021-12-17  6:57         ` [PATCH v6 0/1] " Teng Long
  1 sibling, 2 replies; 224+ messages in thread
From: Teng Long @ 2021-12-08  2:08 UTC (permalink / raw)
  To: git; +Cc: avarab, congdanhqx, gitster, peff, Teng Long


Diffs from patch v4:

* Change `--oid-only` to `--object-only`.
  The word "oid" may not be easily understood by users.

* The commit message was rewritten following Junio's advice.

* Use "OPT_CMDMODE()" to make `--name-only`, `--object-only` and
  `--long` mutually exclusive with each other.

* After the options have been parsed, translate them into a bitmask,
  then use bitwise checks to determine which fields are shown.

* Add tests for mutually exclusive options.

* Documentation modifications about the change of option name.

Thanks.

Teng Long (1):
  ls-tree.c: support `--object-only` option for "git-ls-tree"

 Documentation/git-ls-tree.txt |   7 +-
 builtin/ls-tree.c             | 125 ++++++++++++++++++++++++----------
 t/t3103-ls-tree-misc.sh       |   8 +++
 t/t3104-ls-tree-oid.sh        |  51 ++++++++++++++
 4 files changed, 154 insertions(+), 37 deletions(-)
 create mode 100755 t/t3104-ls-tree-oid.sh

Range-diff against v4:
-:  ---------- > 1:  38d55a878c ls-tree.c: support `--object-only` option for "git-ls-tree"
-- 
2.33.1.10.gd2a07a0ec5.dirty



* [PATCH v5 1/1] ls-tree.c: support `--object-only` option for "git-ls-tree"
  2021-12-08  2:08       ` [PATCH v5 0/1] support `--object-only` " Teng Long
@ 2021-12-08  2:08         ` Teng Long
  2021-12-15 19:25           ` Junio C Hamano
  2021-12-17  6:57         ` [PATCH v6 0/1] " Teng Long
  1 sibling, 1 reply; 224+ messages in thread
From: Teng Long @ 2021-12-08  2:08 UTC (permalink / raw)
  To: git; +Cc: avarab, congdanhqx, gitster, peff, Teng Long

We usually pipe the output from `git ls-tree` to tools like
`sed` or `cut` when we only want to extract some fields.

When we want only the pathname component, we can pass
`--name-only` option to omit such a pipeline, but there are no
options for extracting other fields.

Teach the "--object-only" option to the command to only show the
object name. This option cannot be used together with
"--name-only" or "--long" (mutually exclusive).

Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
 Documentation/git-ls-tree.txt |   7 +-
 builtin/ls-tree.c             | 125 ++++++++++++++++++++++++----------
 t/t3103-ls-tree-misc.sh       |   8 +++
 t/t3104-ls-tree-oid.sh        |  51 ++++++++++++++
 4 files changed, 154 insertions(+), 37 deletions(-)
 create mode 100755 t/t3104-ls-tree-oid.sh

diff --git a/Documentation/git-ls-tree.txt b/Documentation/git-ls-tree.txt
index db02d6d79a..729370f235 100644
--- a/Documentation/git-ls-tree.txt
+++ b/Documentation/git-ls-tree.txt
@@ -10,7 +10,7 @@ SYNOPSIS
 --------
 [verse]
 'git ls-tree' [-d] [-r] [-t] [-l] [-z]
-	    [--name-only] [--name-status] [--full-name] [--full-tree] [--abbrev[=<n>]]
+	    [--name-only] [--name-status] [--object-only] [--full-name] [--full-tree] [--abbrev[=<n>]]
 	    <tree-ish> [<path>...]
 
 DESCRIPTION
@@ -59,6 +59,11 @@ OPTIONS
 --name-only::
 --name-status::
 	List only filenames (instead of the "long" output), one per line.
+	Cannot be combined with `--object-only`.
+
+--object-only::
+	List only names of the objects, one per line. Cannot be combined
+	with `--name-only` or `--name-status`.
 
 --abbrev[=<n>]::
 	Instead of showing the full 40-byte hexadecimal object
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3a442631c7..beaa8bf13b 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -16,21 +16,38 @@
 
 static int line_termination = '\n';
 #define LS_RECURSIVE 1
-#define LS_TREE_ONLY 2
-#define LS_SHOW_TREES 4
-#define LS_NAME_ONLY 8
-#define LS_SHOW_SIZE 16
+#define LS_TREE_ONLY 1 << 1
+#define LS_SHOW_TREES 1 << 2
+#define LS_NAME_ONLY 1 << 3
+#define LS_SHOW_SIZE 1 << 4
+#define LS_OBJECT_ONLY 1 << 5
 static int abbrev;
 static int ls_options;
 static struct pathspec pathspec;
 static int chomp_prefix;
 static const char *ls_tree_prefix;
+static unsigned int shown_bits = 0;
+#define SHOW_DEFAULT 29 /* 11101 size is not shown to output by default */
+#define SHOW_MODE 1 << 4
+#define SHOW_TYPE 1 << 3
+#define SHOW_OBJECT_NAME 1 << 2
+#define SHOW_SIZE 1 << 1
+#define SHOW_FILE_NAME 1
 
 static const  char * const ls_tree_usage[] = {
 	N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
 	NULL
 };
 
+enum {
+	MODE_UNSPECIFIED = 0,
+	MODE_NAME_ONLY,
+	MODE_OBJECT_ONLY,
+	MODE_LONG
+};
+
+static int cmdmode = MODE_UNSPECIFIED;
+
 static int show_recursive(const char *base, int baselen, const char *pathname)
 {
 	int i;
@@ -66,6 +83,7 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 {
 	int retval = 0;
 	int baselen;
+	int follow = 0;
 	const char *type = blob_type;
 
 	if (S_ISGITLINK(mode)) {
@@ -74,8 +92,8 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 		 *
 		 * Something similar to this incomplete example:
 		 *
-		if (show_subprojects(base, baselen, pathname))
-			retval = READ_TREE_RECURSIVE;
+		 * if (show_subprojects(base, baselen, pathname))
+		 *	retval = READ_TREE_RECURSIVE;
 		 *
 		 */
 		type = commit_type;
@@ -90,35 +108,67 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	else if (ls_options & LS_TREE_ONLY)
 		return 0;
 
-	if (!(ls_options & LS_NAME_ONLY)) {
-		if (ls_options & LS_SHOW_SIZE) {
-			char size_text[24];
-			if (!strcmp(type, blob_type)) {
-				unsigned long size;
-				if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
-					xsnprintf(size_text, sizeof(size_text),
-						  "BAD");
-				else
-					xsnprintf(size_text, sizeof(size_text),
-						  "%"PRIuMAX, (uintmax_t)size);
-			} else
-				xsnprintf(size_text, sizeof(size_text), "-");
-			printf("%06o %s %s %7s\t", mode, type,
-			       find_unique_abbrev(oid, abbrev),
-			       size_text);
+	if (shown_bits & SHOW_MODE) {
+		printf("%06o",mode);
+		follow = 1;
+	}
+	if (shown_bits & SHOW_TYPE) {
+		printf("%s%s", follow == 1 ? " " : "", type);
+		follow = 1;
+	}
+	if (shown_bits & SHOW_OBJECT_NAME) {
+		printf("%s%s", follow == 1 ? " " : "",
+		       find_unique_abbrev(oid, abbrev));
+		if (!(shown_bits ^ SHOW_OBJECT_NAME))
+			printf("%c", line_termination);
+		follow = 1;
+	}
+	if (shown_bits & SHOW_SIZE) {
+		char size_text[24];
+		if (!strcmp(type, blob_type)) {
+			unsigned long size;
+			if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
+				xsnprintf(size_text, sizeof(size_text), "BAD");
+			else
+				xsnprintf(size_text, sizeof(size_text),
+					  "%"PRIuMAX, (uintmax_t)size);
 		} else
-			printf("%06o %s %s\t", mode, type,
-			       find_unique_abbrev(oid, abbrev));
+			xsnprintf(size_text, sizeof(size_text), "-");
+		printf("%s%7s", follow == 1 ? " " : "", size_text);
+		follow = 1;
+	}
+	if (shown_bits & SHOW_FILE_NAME) {
+		if (follow)
+			printf("\t");
+		baselen = base->len;
+		strbuf_addstr(base, pathname);
+		write_name_quoted_relative(base->buf,
+					   chomp_prefix ? ls_tree_prefix : NULL,
+					   stdout, line_termination);
+		strbuf_setlen(base, baselen);
 	}
-	baselen = base->len;
-	strbuf_addstr(base, pathname);
-	write_name_quoted_relative(base->buf,
-				   chomp_prefix ? ls_tree_prefix : NULL,
-				   stdout, line_termination);
-	strbuf_setlen(base, baselen);
 	return retval;
 }
 
+static int parse_shown_fields(void)
+{
+	if (cmdmode == MODE_NAME_ONLY) {
+		shown_bits = SHOW_FILE_NAME;
+		return 0;
+	}
+	if (cmdmode == MODE_OBJECT_ONLY) {
+		shown_bits = SHOW_OBJECT_NAME;
+		return 0;
+	}
+	if (!ls_options || (ls_options & LS_RECURSIVE)
+	    || (ls_options & LS_SHOW_TREES)
+	    || (ls_options & LS_TREE_ONLY))
+		shown_bits = SHOW_DEFAULT;
+	if (cmdmode == MODE_LONG)
+		shown_bits = SHOW_DEFAULT | SHOW_SIZE;
+	return 1;
+}
+
 int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 {
 	struct object_id oid;
@@ -133,12 +183,14 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 			LS_SHOW_TREES),
 		OPT_SET_INT('z', NULL, &line_termination,
 			    N_("terminate entries with NUL byte"), 0),
-		OPT_BIT('l', "long", &ls_options, N_("include object size"),
-			LS_SHOW_SIZE),
-		OPT_BIT(0, "name-only", &ls_options, N_("list only filenames"),
-			LS_NAME_ONLY),
-		OPT_BIT(0, "name-status", &ls_options, N_("list only filenames"),
-			LS_NAME_ONLY),
+		OPT_CMDMODE('l', "long", &cmdmode, N_("include object size"),
+			    MODE_LONG),
+		OPT_CMDMODE(0, "name-only", &cmdmode, N_("list only filenames"),
+			    MODE_NAME_ONLY),
+		OPT_CMDMODE(0, "name-status", &cmdmode, N_("list only filenames"),
+			    MODE_NAME_ONLY),
+		OPT_CMDMODE(0, "object-only", &cmdmode, N_("list only objects"),
+			    MODE_OBJECT_ONLY),
 		OPT_SET_INT(0, "full-name", &chomp_prefix,
 			    N_("use full path names"), 0),
 		OPT_BOOL(0, "full-tree", &full_tree,
@@ -169,6 +221,7 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	if (get_oid(argv[0], &oid))
 		die("Not a valid object name %s", argv[0]);
 
+	parse_shown_fields();
 	/*
 	 * show_recursive() rolls its own matching code and is
 	 * generally ignorant of 'struct pathspec'. The magic mask
diff --git a/t/t3103-ls-tree-misc.sh b/t/t3103-ls-tree-misc.sh
index 14520913af..75e38b0a51 100755
--- a/t/t3103-ls-tree-misc.sh
+++ b/t/t3103-ls-tree-misc.sh
@@ -22,4 +22,12 @@ test_expect_success 'ls-tree fails with non-zero exit code on broken tree' '
 	test_must_fail git ls-tree -r HEAD
 '
 
+test_expect_success 'usage: incompatible options: --name-status with --long' '
+	test_expect_code 129 git ls-tree --long --name-status
+'
+
+test_expect_success 'usage: incompatible options: --name-only with --long' '
+	test_expect_code 129 git ls-tree --long --name-only
+'
+
 test_done
diff --git a/t/t3104-ls-tree-oid.sh b/t/t3104-ls-tree-oid.sh
new file mode 100755
index 0000000000..81304e7b13
--- /dev/null
+++ b/t/t3104-ls-tree-oid.sh
@@ -0,0 +1,51 @@
+#!/bin/sh
+
+test_description='git ls-tree objects handling.'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	test_commit A &&
+	test_commit B &&
+	mkdir -p C &&
+	test_commit C/D.txt &&
+	find *.txt path* \( -type f -o -type l \) -print |
+	xargs git update-index --add &&
+	tree=$(git write-tree) &&
+	echo $tree
+'
+
+test_expect_success 'usage: --object-only' '
+	git ls-tree --object-only $tree >current &&
+	git ls-tree $tree >result &&
+	cut -f1 result | cut -d " " -f3 >expected &&
+	test_cmp current expected
+'
+
+test_expect_success 'usage: --object-only with -r' '
+	git ls-tree --object-only -r $tree >current &&
+	git ls-tree -r $tree >result &&
+	cut -f1 result | cut -d " " -f3 >expected &&
+	test_cmp current expected
+'
+
+test_expect_success 'usage: --object-only with --abbrev' '
+	git ls-tree --object-only --abbrev=6 $tree >current &&
+	git ls-tree --abbrev=6 $tree >result &&
+	cut -f1 result | cut -d " " -f3 >expected &&
+	test_cmp current expected
+'
+
+test_expect_success 'usage: incompatible options: --name-only with --object-only' '
+	test_expect_code 129 git ls-tree --object-only --name-only
+'
+
+test_expect_success 'usage: incompatible options: --name-status with --object-only' '
+	test_expect_code 129 git ls-tree --object-only --name-status
+'
+
+test_expect_success 'usage: incompatible options: --long with --object-only' '
+	test_expect_code 129 git ls-tree --object-only --long
+'
+
+test_done
-- 
2.33.1.10.gd2a07a0ec5.dirty



* Re: [PATCH v5 1/1] ls-tree.c: support `--object-only` option for "git-ls-tree"
  2021-12-08  2:08         ` [PATCH v5 1/1] ls-tree.c: " Teng Long
@ 2021-12-15 19:25           ` Junio C Hamano
  2021-12-16 12:16             ` Teng Long
  0 siblings, 1 reply; 224+ messages in thread
From: Junio C Hamano @ 2021-12-15 19:25 UTC (permalink / raw)
  To: Teng Long; +Cc: git, avarab, congdanhqx, peff

Teng Long <dyroneteng@gmail.com> writes:

> We usually pipe the output from `git ls-tree` to tools like
> `sed` or `cut` when we only want to extract some fields.
>
> When we want only the pathname component, we can pass
> `--name-only` option to omit such a pipeline, but there are no
> options for extracting other fields.
>
> Teach the "--object-only" option to the command to only show the
> object name. This option cannot be used together with
> "--name-only" or "--long" (mutually exclusive).

I notice that this changed from --oid to --object and I agree that
it would probably be more friendly to end users.  In fact, this

    $ sed -ne '/^SYNOPSIS/,/^DESCRIPTION/p' Documentation/git-*.txt |
      grep -e -oid

did not find any hits.

> diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
> index 3a442631c7..beaa8bf13b 100644
> --- a/builtin/ls-tree.c
> +++ b/builtin/ls-tree.c
> @@ -16,21 +16,38 @@
>  
>  static int line_termination = '\n';
>  #define LS_RECURSIVE 1
> -#define LS_TREE_ONLY 2
> -#define LS_SHOW_TREES 4
> -#define LS_NAME_ONLY 8
> -#define LS_SHOW_SIZE 16
> +#define LS_TREE_ONLY 1 << 1
> +#define LS_SHOW_TREES 1 << 2
> +#define LS_NAME_ONLY 1 << 3
> +#define LS_SHOW_SIZE 1 << 4
> +#define LS_OBJECT_ONLY 1 << 5

It usually is a good idea to enclose these in () so that they are
safe to use in any context in a statement.

Luckily, bitwise-or and bitwise-and, which are the most likely
candidates for these symbols to be used with, bind looser than
left-shift, so something like 

	if ((LS_TREE_ONLY | LS_SHOW_TREES) & opt)
		... do this ...

is safe either way, but (LS_TREE_ONLY + LS_SHOW_TREES) would have
different value with and without () around (1 << N).
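
For illustration only, the difference can be checked with a few
lines of C.  The BARE_* and SAFE_* macro names and the three helper
functions are hypothetical; only the values 1 << 1 and 1 << 2 come
from the patch.  Note that some compilers warn about the
unparenthesized expansion.

```c
/* Unparenthesized, in the style of the patch under review. */
#define BARE_TREE_ONLY 1 << 1
#define BARE_SHOW_TREES 1 << 2

/* Parenthesized, as suggested above. */
#define SAFE_TREE_ONLY (1 << 1)
#define SAFE_SHOW_TREES (1 << 2)

/*
 * Expands to 1 << 1 + 1 << 2.  '+' binds tighter than '<<', so this
 * is (1 << (1 + 1)) << 2, which is 16.
 */
static int bare_sum(void)
{
	return BARE_TREE_ONLY + BARE_SHOW_TREES;
}

/* (1 << 1) + (1 << 2), which is 6. */
static int safe_sum(void)
{
	return SAFE_TREE_ONLY + SAFE_SHOW_TREES;
}

/*
 * Expands to 1 << 1 | 1 << 2.  '|' binds looser than '<<', so the
 * bare form still yields 6 here.
 */
static int bare_or(void)
{
	return BARE_TREE_ONLY | BARE_SHOW_TREES;
}
```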

> +static unsigned int shown_bits = 0;

Style: we do not initialize statics explicitly to zero.

> +#define SHOW_DEFAULT 29 /* 11101 size is not shown to output by default */
> +#define SHOW_MODE 1 << 4
> +#define SHOW_TYPE 1 << 3
> +#define SHOW_OBJECT_NAME 1 << 2
> +#define SHOW_SIZE 1 << 1
> +#define SHOW_FILE_NAME 1

Likewise.  It is a bit curious to see these listed in decreasing
order, though.

>  static const  char * const ls_tree_usage[] = {
>  	N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
>  	NULL
>  };
>  
> +enum {
> +	MODE_UNSPECIFIED = 0,
> +	MODE_NAME_ONLY,
> +	MODE_OBJECT_ONLY,
> +	MODE_LONG

It is a good idea to leave a comma even after the last element,
_unless_ there is a strong reason why the element that is currently
last MUST stay last when new elements are added to the enum.  That
way, a future patch that adds a new element can do so with less
patch noise.

> +};
> +
> +static int cmdmode = MODE_UNSPECIFIED;

This is also initializing a static variable to zero, and arguments
can be made either way: (A) unspecified is set to zero in enum
definition exactly so that we can use zero to signal the variable is
unspecified, so an explicit zero initialization here goes against
the spirit of choosing 0 as MODE_UNSPECIFIED; or (B) enum definition
can be scrambled in future changes to use something other than zero
for MODE_UNSPECIFIED, and explicitly writing it like this is more
future-proof.

I am OK with the way it is written (i.e. (B)).

> @@ -66,6 +83,7 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
>  {
>  	int retval = 0;
>  	int baselen;
> +	int follow = 0;

A better name, anybody?

This bit is to keep track of the fact that we made _some_ output
already so any further output needs an inter-field space before
writing what it wants to write out.

>  	const char *type = blob_type;
>  
>  	if (S_ISGITLINK(mode)) {
> @@ -74,8 +92,8 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
>  		 *
>  		 * Something similar to this incomplete example:
>  		 *
> -		if (show_subprojects(base, baselen, pathname))
> -			retval = READ_TREE_RECURSIVE;
> +		 * if (show_subprojects(base, baselen, pathname))
> +		 *	retval = READ_TREE_RECURSIVE;
>  		 *
>  		 */

Nice ;-)

>  		type = commit_type;
> @@ -90,35 +108,67 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
>  	else if (ls_options & LS_TREE_ONLY)
>  		return 0;
>  
> -	if (!(ls_options & LS_NAME_ONLY)) {
> -		if (ls_options & LS_SHOW_SIZE) {
> -			char size_text[24];
> -			if (!strcmp(type, blob_type)) {
> -				unsigned long size;
> -				if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
> -					xsnprintf(size_text, sizeof(size_text),
> -						  "BAD");
> -				else
> -					xsnprintf(size_text, sizeof(size_text),
> -						  "%"PRIuMAX, (uintmax_t)size);
> -			} else
> -				xsnprintf(size_text, sizeof(size_text), "-");
> -			printf("%06o %s %s %7s\t", mode, type,
> -			       find_unique_abbrev(oid, abbrev),
> -			       size_text);
> +	if (shown_bits & SHOW_MODE) {
> +		printf("%06o",mode);

SP before ','.

> +		follow = 1;
> +	}
> +	if (shown_bits & SHOW_TYPE) {
> +		printf("%s%s", follow == 1 ? " " : "", type);
> +		follow = 1;
> +	}
> +	if (shown_bits & SHOW_OBJECT_NAME) {
> +		printf("%s%s", follow == 1 ? " " : "",
> +		       find_unique_abbrev(oid, abbrev));
> +		if (!(shown_bits ^ SHOW_OBJECT_NAME))
> +			printf("%c", line_termination);

Curious.  I wonder if we can get rid of these two lines (and the
line_termination bit in the SHOW_FILE_NAME part), and have an
unconditional 

	putchar(line_termination);

at the end of the function.

That way, we could in the future choose to introduce a feature to
show only <mode, type, size> and nothing else, which may be useful
for taking per-type stats.

We need to stop using write_name_quoted_relative() in SHOW_FILE_NAME
part, because the helper insists that the name written by it must be
at the end of the entry, if we go that route, but it may be a good
change in the longer term.
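
For illustration only, a minimal sketch of that shape.  show_entry()
and the FIELD_* bits are hypothetical, the size column is left out
for brevity, and the path is written verbatim, whereas real code
would still need to quote it and strip the prefix.

```c
#include <stdio.h>

#define FIELD_MODE        (1u << 0)
#define FIELD_TYPE        (1u << 1)
#define FIELD_OBJECT_NAME (1u << 2)
#define FIELD_PATH        (1u << 3)

/*
 * Write one entry to 'out'.  Every field is optional, and 'sep'
 * becomes non-empty once anything has been written.  The terminator
 * is emitted unconditionally at the end, so no field needs to know
 * whether it is the last one on the line.
 */
static void show_entry(FILE *out, unsigned int fields, unsigned int mode,
		       const char *type, const char *hex, const char *path,
		       int term)
{
	const char *sep = "";

	if (fields & FIELD_MODE) {
		fprintf(out, "%s%06o", sep, mode);
		sep = " ";
	}
	if (fields & FIELD_TYPE) {
		fprintf(out, "%s%s", sep, type);
		sep = " ";
	}
	if (fields & FIELD_OBJECT_NAME) {
		fprintf(out, "%s%s", sep, hex);
		sep = " ";
	}
	if (fields & FIELD_PATH)
		/* A tab, not a space, separates the path from what precedes it. */
		fprintf(out, "%s%s", *sep ? "\t" : "", path);
	putc(term, out);
}
```

A combination such as FIELD_MODE | FIELD_TYPE, with no object name
or path, then needs no special casing.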

> +		follow = 1;
> +	}
> +	if (shown_bits & SHOW_SIZE) {
> +		char size_text[24];
> +		if (!strcmp(type, blob_type)) {
> +			unsigned long size;
> +			if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
> +				xsnprintf(size_text, sizeof(size_text), "BAD");
> +			else
> +				xsnprintf(size_text, sizeof(size_text),
> +					  "%"PRIuMAX, (uintmax_t)size);
>  		} else
> -			printf("%06o %s %s\t", mode, type,
> -			       find_unique_abbrev(oid, abbrev));
> +			xsnprintf(size_text, sizeof(size_text), "-");
> +		printf("%s%7s", follow == 1 ? " " : "", size_text);
> +		follow = 1;
> +	}
> +	if (shown_bits & SHOW_FILE_NAME) {
> +		if (follow)
> +			printf("\t");
> +		baselen = base->len;
> +		strbuf_addstr(base, pathname);
> +		write_name_quoted_relative(base->buf,
> +					   chomp_prefix ? ls_tree_prefix : NULL,
> +					   stdout, line_termination);
> +		strbuf_setlen(base, baselen);
>  	}

But the above nits aside, the updated organization of this function
looks much cleaner than the original.  Nicely reorganized.

> -	baselen = base->len;
> -	strbuf_addstr(base, pathname);
> -	write_name_quoted_relative(base->buf,
> -				   chomp_prefix ? ls_tree_prefix : NULL,
> -				   stdout, line_termination);
> -	strbuf_setlen(base, baselen);
>  	return retval;
>  }

Thanks.


* Re: [PATCH v5 1/1] ls-tree.c: support `--object-only` option for "git-ls-tree"
  2021-12-15 19:25           ` Junio C Hamano
@ 2021-12-16 12:16             ` Teng Long
  2021-12-16 21:26               ` Junio C Hamano
  0 siblings, 1 reply; 224+ messages in thread
From: Teng Long @ 2021-12-16 12:16 UTC (permalink / raw)
  To: gitster; +Cc: avarab, congdanhqx, dyroneteng, git, peff


> I notice that this changed from --oid to --object and I agree that
> it would probably be more friendly to end users.  In fact, this
>
>    $ sed -ne '/^SYNOPSIS/,/^DESCRIPTION/p' Documentation/git-*.txt |
>      grep -e -oid
>
> did not find any hits.

Thank you for confirming that.

> It usually is a good idea to enclose these in () so that they are
> safe to use in any context in a statement.
> 
> Luckily, bitwise-or and bitwise-and, which are the most likely
> candidates for these symbols to be used with, bind looser than
> left-shift, so something like 
> 
> 	if ((LS_TREE_ONLY | LS_SHOW_TREES) & opt)
> 		... do this ...
> 
> is safe either way, but (LS_TREE_ONLY + LS_SHOW_TREES) would have
> different value with and without () around (1 << N).

Makes sense.

Will fix in patch v6.
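
To convince myself, here is a small sketch of the difference. The
DEMO_* names are made up for this illustration and are not the macros
in builtin/ls-tree.c. A compiler may warn about the unparenthesized
sum, which is rather the point.

```c
/* Hypothetical macros for illustration only; not the ones in ls-tree.c. */
#define DEMO_BARE_TREE_ONLY 1 << 1
#define DEMO_BARE_SHOW_TREES 1 << 2
#define DEMO_SAFE_TREE_ONLY (1 << 1)
#define DEMO_SAFE_SHOW_TREES (1 << 2)

/* Expands to 1 << 1 | 1 << 2; '|' binds looser than '<<', so this is 2 | 4. */
static int demo_bare_or(void)
{
	return DEMO_BARE_TREE_ONLY | DEMO_BARE_SHOW_TREES;
}

/* Expands to 1 << 1 + 1 << 2; '+' binds tighter, so this is (1 << 2) << 2. */
static int demo_bare_sum(void)
{
	return DEMO_BARE_TREE_ONLY + DEMO_BARE_SHOW_TREES;
}

/* With parentheses each macro is a single value, so this is 2 + 4. */
static int demo_safe_sum(void)
{
	return DEMO_SAFE_TREE_ONLY + DEMO_SAFE_SHOW_TREES;
}
```

The bitwise-or form gives 6 either way, but the bare sum evaluates to
16 while the parenthesized sum gives the intended 6.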


> Style: we do not initialize statics explicitly to zero.
> ...
> Likewise.  It is a bit curious to see these listed in decreasing
> order, though.

Will fix.

> It is a good idea to leave a comma even after the last element,
> _unless_ there is a strong reason why the element that currently is
> at the last MUST stay to be last when new elements are added to the
> enum.  That way, a future patch that adds a new element can add it
> to the list with a patch noise with fewer lines.

Very clear explanation.

Will fix.

> A better name, anybody?
> 
> This bit is to keep track of the fact that we made _some_ output
> already so any further output needs an inter-field space before
> writing what it wants to write out.

I found the word "interspace", which looks a little better than the old
name. I will rename the variable to it in the next patch, and if a
better idea comes up, I will apply that instead.

> SP after ','.

Will fix.


> Curious.  I wonder if we can get rid of these two lines (and the
> line_termination bit in the SHOW_FILE_NAME part), and have an
> unconditional 
> 
> 	putchar(line_termination);
> 
> at the end of the function.
> 
> That way, we could in the future choose to introduce a feature to
> show only <mode, type, size> and nothing else, which may be useful
> for taking per-type stats.
> 
> We need to stop using write_name_quoted_relative() in SHOW_FILE_NAME
> part, because the helper insists that the name written by it must be
> at the end of the entry, if we go that route, but it may be a good
> change in the longer term.

Let me restate your suggestion to make sure I understand it correctly.

"write_name_quoted_relative" computes the file name relative to
"prefix" and writes the name, followed by a line termination, to the
given file stream.

We do not want to use "write_name_quoted_relative" here because the
function always outputs a line termination after "name", which would
be inconvenient if "name" is no longer the last field in the future.

So, instead:

We need to compute the file name (made relative to the prefix, and
quoted if needed) without "write_name_quoted_relative", and then
output the line termination before returning.


Thanks.


* Re: [PATCH v5 1/1] ls-tree.c: support `--object-only` option for "git-ls-tree"
  2021-12-16 12:16             ` Teng Long
@ 2021-12-16 21:26               ` Junio C Hamano
  2021-12-16 21:29                 ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 224+ messages in thread
From: Junio C Hamano @ 2021-12-16 21:26 UTC (permalink / raw)
  To: Teng Long; +Cc: avarab, congdanhqx, git, peff

Teng Long <dyroneteng@gmail.com> writes:

>> A better name, anybody?
>> 
>> This bit is to keep track of the fact that we made _some_ output
>> already so any further output needs an inter-field space before
>> writing what it wants to write out.
>
> I found a word "interspace", it looks like a little better than the old
> one. I will rename to it in next patch, and If there's a better idea,
> will apply further.

Yeah, it really is needs_inter_field_space but that is way too long.

>> We need to stop using write_name_quoted_relative() in SHOW_FILE_NAME
>> part, because the helper insists that the name written by it must be
>> at the end of the entry, if we go that route, but it may be a good
>> change in the longer term.
>
> Let me try to represent to make sure I understand your suggestion
> sufficiently.
>
> "write_name_quoted_relative" is used to compute the relative file name
> by "prefix" and output the name and a line termination to the given FD.
>
> We do not want use "write_name_quoted_relative" in here because the
> function alway output a line termination after "name", this may bring
> some inconvenience because the "name" may not be the last field in the
> future.
>
> So, instead:
>
> We need to calculate the file name (relative path and quotes if need)
> without "write_name_quoted_relative"  and then output the line
> termination before return.

I think we are on the same page.  We can work backwards, I think.

We have a repetitive

        if (mode should be shown) {
                show mode;
                record that we have already shown something;
        }
        if (type should be shown) {
                give inter-field-space if we have shown something;
                show type;
                record that we have already shown something;
        }
        ...

that ends with

        if (name should be shown) {
                give inter-field-space if we have shown something;
                show name PLUS line termination;
	}

But if we can make the last step to

        if (name should be shown) {
                give inter-field-space if we have shown something;
                show name;
	}

	give line termination;

it gets easier to support a combination that does not show name, and
we can have inter-record separator.
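
A minimal standalone sketch of that shape follows. It writes into a
caller-supplied buffer instead of stdout so that it can be checked in
isolation. The demo_* functions and DEMO_SHOW_* bits are invented for
this illustration, the size field is left out, and names are not
quoted; this is not the code in builtin/ls-tree.c.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical bits for illustration; they only mirror the SHOW_* idea. */
#define DEMO_SHOW_FILE_NAME 1
#define DEMO_SHOW_OBJECT_NAME (1 << 2)
#define DEMO_SHOW_TYPE (1 << 3)
#define DEMO_SHOW_MODE (1 << 4)

/* Append s to out (capacity outlen >= 1), truncating; return new length. */
static size_t demo_append(char *out, size_t outlen, size_t len, const char *s)
{
	size_t n = strlen(s);

	if (len + n >= outlen)
		n = outlen - len - 1;
	memcpy(out + len, s, n);
	out[len + n] = '\0';
	return len + n;
}

/*
 * Format one entry into out and return its length.  Every field gives
 * the inter-field separator only if something was already shown, and
 * the terminator is added unconditionally at the very end.
 */
static size_t demo_format_entry(char *out, size_t outlen, unsigned bits,
				unsigned mode, const char *type,
				const char *oid, const char *name,
				char terminator)
{
	char mode_text[16];
	size_t len = 0;
	int interspace = 0;

	out[0] = '\0';
	if (bits & DEMO_SHOW_MODE) {
		snprintf(mode_text, sizeof(mode_text), "%06o", mode);
		len = demo_append(out, outlen, len, mode_text);
		interspace = 1;
	}
	if (bits & DEMO_SHOW_TYPE) {
		if (interspace)
			len = demo_append(out, outlen, len, " ");
		len = demo_append(out, outlen, len, type);
		interspace = 1;
	}
	if (bits & DEMO_SHOW_OBJECT_NAME) {
		if (interspace)
			len = demo_append(out, outlen, len, " ");
		len = demo_append(out, outlen, len, oid);
		interspace = 1;
	}
	if (bits & DEMO_SHOW_FILE_NAME) {
		if (interspace)
			len = demo_append(out, outlen, len, "\t");
		len = demo_append(out, outlen, len, name);
	}
	/* Unconditional record terminator; no field needs to know it is last. */
	if (len + 1 < outlen) {
		out[len++] = terminator;
		out[len] = '\0';
	}
	return len;
}
```

With this shape, asking for only <mode, type> or only the object name
needs no special case: the last field shown is simply followed by the
terminator.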

But write_name_quoted_relative() does not give the caller a choice
to have no terminator, so we need to do something like this:

	if (shown_bits & SHOW_FILE_NAME) {
		const char *name;
                struct strbuf name_buf = STRBUF_INIT;

		if (follow)
			printf("\t");
		baselen = base->len;
		strbuf_addstr(base, pathname);
                
		name = relative_path(base->buf, 
				     chomp_prefix ? ls_tree_prefix : NULL,
                                     &name_buf);
		if (line_termination)
			quote_c_style(name, NULL, stdout, 0);
		else
			fputs(name, stdout);
		strbuf_release(&name_buf);
		strbuf_setlen(base, baselen);
	}

I initially thought that extending write_name_quoted() and
write_name_quoted_relative() to accept a special value or two for
terminator to tell it not to add terminator would be sufficient (see
below).  I however think it is way too ugly to have the "add no
terminator and do not quote" option at write_name_quoted() level,
simply because the caller that chooses as-is can simply do fputs()
itself without bothering to use write_name_quoted().  So I am not
convinced that it will be a good idea.

If we were to go that "ugly helper" route, the above can become even
simpler and closer to what you originally wrote, e.g.

	if (shown_bits & SHOW_FILE_NAME) {
		if (follow)
			printf("\t");
		baselen = base->len;
		strbuf_addstr(base, pathname);
		write_name_quoted_relative(base->buf,
					   chomp_prefix ? ls_tree_prefix : NULL,
					   stdout,
                                           line_termination 
                                           ? CQ_NO_TERMINATOR_C_QUOTED
                                           : CQ_NO_TERMINATOR_AS_IS);
		strbuf_setlen(base, baselen);
	}


 quote.c |  5 +++--
 quote.h | 19 +++++++++++++++++++
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git c/quote.c w/quote.c
index 26719d21d1..cbbcd8563f 100644
--- c/quote.c
+++ w/quote.c
@@ -340,12 +340,13 @@ void quote_two_c_style(struct strbuf *sb, const char *prefix, const char *path,
 
 void write_name_quoted(const char *name, FILE *fp, int terminator)
 {
-	if (terminator) {
+	if (0 < terminator || terminator == CQ_NO_TERMINATOR_C_QUOTED) {
 		quote_c_style(name, NULL, fp, 0);
 	} else {
 		fputs(name, fp);
 	}
-	fputc(terminator, fp);
+	if (0 <= terminator)
+		fputc(terminator, fp);
 }
 
 void write_name_quoted_relative(const char *name, const char *prefix,
diff --git c/quote.h w/quote.h
index 87ff458b06..5c8c7cf952 100644
--- c/quote.h
+++ w/quote.h
@@ -85,7 +85,26 @@ int unquote_c_style(struct strbuf *, const char *quoted, const char **endp);
 size_t quote_c_style(const char *name, struct strbuf *, FILE *, unsigned);
 void quote_two_c_style(struct strbuf *, const char *, const char *, unsigned);
 
+/*
+ * Write a name, typically a filename, followed by a terminator that
+ * separates it from what comes next.
+ * When terminator is NUL, the name is given as-is.  Otherwise, the
+ * name is c-quoted, suitable for text output.  HT and LF are typical
+ * values used for the terminator, but other positive values are possible.
+ *
+ * In addition to non-negative values two special values in terminator
+ * are possible.
+ * -1: show the name c-quoted, without adding any terminator.
+ * -2: show the name as-is, without adding any terminator.
+ */
+#define CQ_NO_TERMINATOR_C_QUOTED	(-1)
+#define CQ_NO_TERMINATOR_AS_IS		(-2)
 void write_name_quoted(const char *name, FILE *, int terminator);
+
+/*
+ * Similar to the above, but the name is first made relative to the prefix
+ * before being shown.
+ */ 
 void write_name_quoted_relative(const char *name, const char *prefix,
 				FILE *fp, int terminator);
 



* Re: [PATCH v5 1/1] ls-tree.c: support `--object-only` option for "git-ls-tree"
  2021-12-16 21:26               ` Junio C Hamano
@ 2021-12-16 21:29                 ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-16 21:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Teng Long, congdanhqx, git, peff


On Thu, Dec 16 2021, Junio C Hamano wrote:

> Teng Long <dyroneteng@gmail.com> writes:
>
>>> A better name, anybody?
>>> 
>>> This bit is to keep track of the fact that we made _some_ output
>>> already so any further output needs an inter-field space before
>>> writing what it wants to write out.
>>
>> I found a word "interspace", it looks like a little better than the old
>> one. I will rename to it in next patch, and If there's a better idea,
>> will apply further.
>
> Yeah, it really is needs_inter_field_space but that is way too long.
>
>>> We need to stop using write_name_quoted_relative() in SHOW_FILE_NAME
>>> part, because the helper insists that the name written by it must be
>>> at the end of the entry, if we go that route, but it may be a good
>>> change in the longer term.
>>
>> Let me try to represent to make sure I understand your suggestion
>> sufficiently.
>>
>> "write_name_quoted_relative" is used to compute the relative file name
>> by "prefix" and output the name and a line termination to the given FD.
>>
>> We do not want use "write_name_quoted_relative" in here because the
>> function alway output a line termination after "name", this may bring
>> some inconvenience because the "name" may not be the last field in the
>> future.
>>
>> So, instead:
>>
>> We need to calculate the file name (relative path and quotes if need)
>> without "write_name_quoted_relative"  and then output the line
>> termination before return.
>
> I think we are on the same page.  We can work backwards, I think.
>
> We have a repetitive
>
>         if (mode should be shown) {
>                 show mode;
>                 record that we have already shown something;
>         }
>         if (type should be shown) {
>                 give inter-field-space if we have shown something;
>                 show type;
>                 record that we have already shown something;
>         }
>         ...
>
> that ends with
>
>         if (name should be shown) {
>                 give inter-field-space if we have shown something;
>                 show name PLUS line termination;
> 	}
>
> But if we can make the last step to
>
>         if (name should be shown) {
>                 give inter-field-space if we have shown something;
>                 show name;
> 	}
>
> 	give line termination;
>
> it gets easier to support a combination that does not show name, and
> we can have inter-record separator.
>
> But write_name_quoted_relative() does not give the caller a choice
> to have no terminator, so we need to do something like this:
>
> 	if (shown_bits & SHOW_FILE_NAME) {
> 		const char *name;
>                 struct strbuf name_buf = STRBUF_INIT;
>
> 		if (follow)
> 			printf("\t");
> 		baselen = base->len;
> 		strbuf_addstr(base, pathname);
>                 
> 		name = relative_path(base->buf, 
> 				     chomp_prefix ? ls_tree_prefix : NULL,
>                                      &name_buf);
> 		if (line_termination)
> 			quote_c_style(name, NULL, stdout, 0);
> 		else
> 			fputs(name, stdout);
> 		strbuf_release(&name_buf);
> 		strbuf_setlen(base, baselen);
> 	}
>
> I initially thought that extending write_name_quoted() and
> write_name_quoted_relative() to accept a special value or two for
> terminator to tell it not to add terminator would be sufficient (see
> below).  I however think it is way too ugly to have the "add no
> terminator and do not quote" option at write_name_quoted() level,
> simply because the caller that chooses as-is can simply do fputs()
> itself without bothering to use write_name_quoted().  So I am not
> convinced that it will a good idea.
>
> If we were to go that "ugly helper" route, the above can become even
> simpler and closer to what you originally wrote, e.g.
>
> 	if (shown_bits & SHOW_FILE_NAME) {
> 		if (follow)
> 			printf("\t");
> 		baselen = base->len;
> 		strbuf_addstr(base, pathname);
> 		write_name_quoted_relative(base->buf,
> 					   chomp_prefix ? ls_tree_prefix : NULL,
> 					   stdout,
>                                            line_termination 
>                                            ? CQ_NO_TERMINATOR_C_QUOTED
>                                            : CQ_NO_TERMINATOR_AS_IS);
> 		strbuf_setlen(base, baselen);
> 	}
>
>
>  quote.c |  5 +++--
>  quote.h | 19 +++++++++++++++++++
>  2 files changed, 22 insertions(+), 2 deletions(-)
>
> diff --git c/quote.c w/quote.c
> index 26719d21d1..cbbcd8563f 100644
> --- c/quote.c
> +++ w/quote.c
> @@ -340,12 +340,13 @@ void quote_two_c_style(struct strbuf *sb, const char *prefix, const char *path,
>  
>  void write_name_quoted(const char *name, FILE *fp, int terminator)
>  {
> -	if (terminator) {
> +	if (0 < terminator || terminator == CQ_NO_TERMINATOR_C_QUOTED) {
>  		quote_c_style(name, NULL, fp, 0);
>  	} else {
>  		fputs(name, fp);
>  	}
> -	fputc(terminator, fp);
> +	if (0 <= terminator)
> +		fputc(terminator, fp);
>  }
>  
>  void write_name_quoted_relative(const char *name, const char *prefix,
> diff --git c/quote.h w/quote.h
> index 87ff458b06..5c8c7cf952 100644
> --- c/quote.h
> +++ w/quote.h
> @@ -85,7 +85,26 @@ int unquote_c_style(struct strbuf *, const char *quoted, const char **endp);
>  size_t quote_c_style(const char *name, struct strbuf *, FILE *, unsigned);
>  void quote_two_c_style(struct strbuf *, const char *, const char *, unsigned);
>  
> +/*
> + * Write a name, typically a filename, followed by a terminator that
> + * separates it from what comes next.
> + * When terminator is NUL, the name is given as-is.  Otherwise, the
> + * name is c-quoted, suitable for text output.  HT and LF are typical
> + * values used for the terminator, but other positive values are possible.
> + *
> + * In addition to non-negative values two special values in terminator
> + * are possible.
> + * -1: show the name c-quoted, without adding any terminator.
> + * -2: show the name as-is, without adding any terminator.
> + */
> +#define CQ_NO_TERMINATOR_C_QUOTED	(-1)
> +#define CQ_NO_TERMINATOR_AS_IS		(-2)
>  void write_name_quoted(const char *name, FILE *, int terminator);
> +
> +/*
> + * Similar to the above, but the name is first made relative to the prefix
> + * before being shown.
> + */ 
>  void write_name_quoted_relative(const char *name, const char *prefix,
>  				FILE *fp, int terminator);
>  

In my "just make ls-tree support --format" this whole thing is just:
    
    +               if (prefix)
    +                       name = relative_path(name, prefix, &scratch);
    +               quote_c_style(name, sb, NULL, 0);
    +               strbuf_addch(sb, line_termination);

But of course that's an implementation that's moved away from the "write
stuff to a FH for me" API in quote.c.

So I haven't looked carefully here, but you need this API just because
of some constraint in the write_name_quoted()?

Perhaps just running with that approach is better? Whether it's stealing
that approach, or doing away with --object-only for a --format...


* [PATCH v6 0/1] support `--object-only` option for "git-ls-tree"
  2021-12-08  2:08       ` [PATCH v5 0/1] support `--object-only` " Teng Long
  2021-12-08  2:08         ` [PATCH v5 1/1] ls-tree.c: " Teng Long
@ 2021-12-17  6:57         ` Teng Long
  2021-12-17  6:57           ` [PATCH v6 1/1] ls-tree.c: " Teng Long
                             ` (2 more replies)
  1 sibling, 3 replies; 224+ messages in thread
From: Teng Long @ 2021-12-17  6:57 UTC (permalink / raw)
  To: dyroneteng; +Cc: avarab, congdanhqx, git, gitster, peff

Diffs from patch v5:

* Teach "write_name_quoted_relative" two special terminator values,
  "CQ_NO_TERMINATOR_C_QUOTED" and "CQ_NO_TERMINATOR_AS_IS". They make
  the function write "name" either c-quoted or as-is, without
  printing a terminator.
  
* Rename "follow" to "interspace" (Representing whether any further
  output needs an inter-field space).

* Nits fixed.
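
To illustrate the first item, the way the terminator argument is
interpreted can be sketched as below. The two CQ_* values are the ones
from the patch; the demo_* helpers are hypothetical and exist only for
this illustration, they are not functions in quote.c.

```c
#define CQ_NO_TERMINATOR_C_QUOTED (-1)
#define CQ_NO_TERMINATOR_AS_IS (-2)

/* Non-zero when the name should go through c-quoting. */
static int demo_wants_quoting(int terminator)
{
	return 0 < terminator || terminator == CQ_NO_TERMINATOR_C_QUOTED;
}

/* Non-zero when a terminator byte should be written after the name. */
static int demo_wants_terminator(int terminator)
{
	return 0 <= terminator;
}
```

So a positive terminator such as LF means "quote and terminate", NUL
means "as-is and terminate", -1 means "quote, no terminator", and -2
means "as-is, no terminator".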

Many thanks to Junio and Ævar for their help and patient explanations.
I noticed that Ævar suggested a solution based on `--format`, but this
patch continues with the current approach. If this part of the code
needs to be improved, or if we want to support "--format" in "ls-tree"
in the future, I will be more than glad to keep contributing.

Thanks.

Teng Long (1):
  ls-tree.c: support `--object-only` option for "git-ls-tree"

 Documentation/git-ls-tree.txt |   7 +-
 builtin/ls-tree.c             | 131 ++++++++++++++++++++++++----------
 quote.c                       |   8 +--
 quote.h                       |  19 +++++
 t/t3103-ls-tree-misc.sh       |   8 +++
 t/t3104-ls-tree-oid.sh        |  51 +++++++++++++
 6 files changed, 183 insertions(+), 41 deletions(-)
 create mode 100755 t/t3104-ls-tree-oid.sh

Range-diff against v5:
1:  38d55a878c ! 1:  2e449d1c79 ls-tree.c: support `--object-only` option for "git-ls-tree"
    @@ builtin/ls-tree.c
     -#define LS_SHOW_TREES 4
     -#define LS_NAME_ONLY 8
     -#define LS_SHOW_SIZE 16
    -+#define LS_TREE_ONLY 1 << 1
    -+#define LS_SHOW_TREES 1 << 2
    -+#define LS_NAME_ONLY 1 << 3
    -+#define LS_SHOW_SIZE 1 << 4
    -+#define LS_OBJECT_ONLY 1 << 5
    ++#define LS_TREE_ONLY (1 << 1)
    ++#define LS_SHOW_TREES (1 << 2)
    ++#define LS_NAME_ONLY (1 << 3)
    ++#define LS_SHOW_SIZE (1 << 4)
    ++#define LS_OBJECT_ONLY (1 << 5)
      static int abbrev;
      static int ls_options;
      static struct pathspec pathspec;
      static int chomp_prefix;
      static const char *ls_tree_prefix;
    -+static unsigned int shown_bits = 0;
    -+#define SHOW_DEFAULT 29 /* 11101 size is not shown to output by default */
    -+#define SHOW_MODE 1 << 4
    -+#define SHOW_TYPE 1 << 3
    -+#define SHOW_OBJECT_NAME 1 << 2
    -+#define SHOW_SIZE 1 << 1
    ++static unsigned int shown_bits;
     +#define SHOW_FILE_NAME 1
    ++#define SHOW_SIZE (1 << 1)
    ++#define SHOW_OBJECT_NAME (1 << 2)
    ++#define SHOW_TYPE (1 << 3)
    ++#define SHOW_MODE (1 << 4)
    ++#define SHOW_DEFAULT 29 /* 11101 size is not shown to output by default */
      
      static const  char * const ls_tree_usage[] = {
      	N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
    @@ builtin/ls-tree.c
     +	MODE_UNSPECIFIED = 0,
     +	MODE_NAME_ONLY,
     +	MODE_OBJECT_ONLY,
    -+	MODE_LONG
    ++	MODE_LONG,
     +};
     +
     +static int cmdmode = MODE_UNSPECIFIED;
    @@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strb
      {
      	int retval = 0;
      	int baselen;
    -+	int follow = 0;
    ++	int interspace = 0;
      	const char *type = blob_type;
      
      	if (S_ISGITLINK(mode)) {
    @@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strb
     -			       find_unique_abbrev(oid, abbrev),
     -			       size_text);
     +	if (shown_bits & SHOW_MODE) {
    -+		printf("%06o",mode);
    -+		follow = 1;
    ++		printf("%06o", mode);
    ++		interspace = 1;
     +	}
     +	if (shown_bits & SHOW_TYPE) {
    -+		printf("%s%s", follow == 1 ? " " : "", type);
    -+		follow = 1;
    ++		printf("%s%s", interspace ? " " : "", type);
    ++		interspace = 1;
     +	}
     +	if (shown_bits & SHOW_OBJECT_NAME) {
    -+		printf("%s%s", follow == 1 ? " " : "",
    ++		printf("%s%s", interspace ? " " : "",
     +		       find_unique_abbrev(oid, abbrev));
     +		if (!(shown_bits ^ SHOW_OBJECT_NAME))
    -+			printf("%c", line_termination);
    -+		follow = 1;
    ++			goto LINE_FINISH;
    ++		interspace = 1;
     +	}
     +	if (shown_bits & SHOW_SIZE) {
     +		char size_text[24];
    @@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strb
     -			printf("%06o %s %s\t", mode, type,
     -			       find_unique_abbrev(oid, abbrev));
     +			xsnprintf(size_text, sizeof(size_text), "-");
    -+		printf("%s%7s", follow == 1 ? " " : "", size_text);
    -+		follow = 1;
    ++		printf("%s%7s", interspace ? " " : "", size_text);
    ++		interspace = 1;
     +	}
     +	if (shown_bits & SHOW_FILE_NAME) {
    -+		if (follow)
    ++		if (interspace)
     +			printf("\t");
     +		baselen = base->len;
     +		strbuf_addstr(base, pathname);
     +		write_name_quoted_relative(base->buf,
     +					   chomp_prefix ? ls_tree_prefix : NULL,
    -+					   stdout, line_termination);
    ++					   stdout,
    ++					   line_termination
    ++					   ? CQ_NO_TERMINATOR_C_QUOTED
    ++					   : CQ_NO_TERMINATOR_AS_IS);
     +		strbuf_setlen(base, baselen);
      	}
     -	baselen = base->len;
    @@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strb
     -				   chomp_prefix ? ls_tree_prefix : NULL,
     -				   stdout, line_termination);
     -	strbuf_setlen(base, baselen);
    ++
    ++LINE_FINISH:
    ++	putchar(line_termination);
      	return retval;
      }
      
    @@ builtin/ls-tree.c: int cmd_ls_tree(int argc, const char **argv, const char *pref
      	 * show_recursive() rolls its own matching code and is
      	 * generally ignorant of 'struct pathspec'. The magic mask
     
    + ## quote.c ##
    +@@ quote.c: void quote_two_c_style(struct strbuf *sb, const char *prefix, const char *path,
    + 
    + void write_name_quoted(const char *name, FILE *fp, int terminator)
    + {
    +-	if (terminator) {
    ++	if (0 < terminator || terminator == CQ_NO_TERMINATOR_C_QUOTED)
    + 		quote_c_style(name, NULL, fp, 0);
    +-	} else {
    ++	else
    + 		fputs(name, fp);
    +-	}
    +-	fputc(terminator, fp);
    ++	if (0 <= terminator)
    ++		fputc(terminator, fp);
    + }
    + 
    + void write_name_quoted_relative(const char *name, const char *prefix,
    +
    + ## quote.h ##
    +@@ quote.h: int unquote_c_style(struct strbuf *, const char *quoted, const char **endp);
    + #define CQUOTE_NODQ 01
    + size_t quote_c_style(const char *name, struct strbuf *, FILE *, unsigned);
    + void quote_two_c_style(struct strbuf *, const char *, const char *, unsigned);
    ++/*
    ++ * Write a name, typically a filename, followed by a terminator that
    ++ * separates it from what comes next.
    ++ * When terminator is NUL, the name is given as-is.  Otherwise, the
    ++ * name is c-quoted, suitable for text output.  HT and LF are typical
    ++ * values used for the terminator, but other positive values are possible.
    ++ *
    ++ * In addition to non-negative values two special values in terminator
    ++ * are possible.
    ++ *
    ++ * -1: show the name c-quoted, without adding any terminator.
    ++ * -2: show the name as-is, without adding any terminator.
    ++ */
    ++#define CQ_NO_TERMINATOR_C_QUOTED	(-1)
    ++#define CQ_NO_TERMINATOR_AS_IS		(-2)
    + 
    + void write_name_quoted(const char *name, FILE *, int terminator);
    ++/*
    ++ * Similar to the above, but the name is first made relative to the prefix
    ++ * before being shown.
    ++ */
    + void write_name_quoted_relative(const char *name, const char *prefix,
    + 				FILE *fp, int terminator);
    + 
    +
      ## t/t3103-ls-tree-misc.sh ##
     @@ t/t3103-ls-tree-misc.sh: test_expect_success 'ls-tree fails with non-zero exit code on broken tree' '
      	test_must_fail git ls-tree -r HEAD
-- 
2.33.1.10.g5f17c1a2c1.dirty



* [PATCH v6 1/1] ls-tree.c: support `--object-only` option for "git-ls-tree"
  2021-12-17  6:57         ` [PATCH v6 0/1] " Teng Long
@ 2021-12-17  6:57           ` Teng Long
  2021-12-17 13:09             ` Ævar Arnfjörð Bjarmason
  2021-12-17 13:30           ` [RFC PATCH 0/7] ls-tree --format Ævar Arnfjörð Bjarmason
  2022-01-01 13:50           ` [PATCH v8 0/8] ls-tree: "--object-only" and "--format" opts Teng Long
  2 siblings, 1 reply; 224+ messages in thread
From: Teng Long @ 2021-12-17  6:57 UTC (permalink / raw)
  To: dyroneteng; +Cc: avarab, congdanhqx, git, gitster, peff

We usually pipe the output from `git ls-tree` to tools like
`sed` or `cut` when we only want to extract some fields.

When we want only the pathname component, we can pass
`--name-only` option to omit such a pipeline, but there are no
options for extracting other fields.

Teach the "--object-only" option to the command to only show the
object name. This option cannot be used together with
"--name-only" or "--long" (mutually exclusive).

Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
 Documentation/git-ls-tree.txt |   7 +-
 builtin/ls-tree.c             | 131 ++++++++++++++++++++++++----------
 quote.c                       |   8 +--
 quote.h                       |  19 +++++
 t/t3103-ls-tree-misc.sh       |   8 +++
 t/t3104-ls-tree-oid.sh        |  51 +++++++++++++
 6 files changed, 183 insertions(+), 41 deletions(-)
 create mode 100755 t/t3104-ls-tree-oid.sh

diff --git a/Documentation/git-ls-tree.txt b/Documentation/git-ls-tree.txt
index db02d6d79a..729370f235 100644
--- a/Documentation/git-ls-tree.txt
+++ b/Documentation/git-ls-tree.txt
@@ -10,7 +10,7 @@ SYNOPSIS
 --------
 [verse]
 'git ls-tree' [-d] [-r] [-t] [-l] [-z]
-	    [--name-only] [--name-status] [--full-name] [--full-tree] [--abbrev[=<n>]]
+	    [--name-only] [--name-status] [--object-only] [--full-name] [--full-tree] [--abbrev[=<n>]]
 	    <tree-ish> [<path>...]
 
 DESCRIPTION
@@ -59,6 +59,11 @@ OPTIONS
 --name-only::
 --name-status::
 	List only filenames (instead of the "long" output), one per line.
+	Cannot be combined with `--object-only`.
+
+--object-only::
+	List only names of the objects, one per line. Cannot be combined
+	with `--name-only` or `--name-status`.
 
 --abbrev[=<n>]::
 	Instead of showing the full 40-byte hexadecimal object
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3a442631c7..e76c6e43e8 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -16,21 +16,38 @@
 
 static int line_termination = '\n';
 #define LS_RECURSIVE 1
-#define LS_TREE_ONLY 2
-#define LS_SHOW_TREES 4
-#define LS_NAME_ONLY 8
-#define LS_SHOW_SIZE 16
+#define LS_TREE_ONLY (1 << 1)
+#define LS_SHOW_TREES (1 << 2)
+#define LS_NAME_ONLY (1 << 3)
+#define LS_SHOW_SIZE (1 << 4)
+#define LS_OBJECT_ONLY (1 << 5)
 static int abbrev;
 static int ls_options;
 static struct pathspec pathspec;
 static int chomp_prefix;
 static const char *ls_tree_prefix;
+static unsigned int shown_bits;
+#define SHOW_FILE_NAME 1
+#define SHOW_SIZE (1 << 1)
+#define SHOW_OBJECT_NAME (1 << 2)
+#define SHOW_TYPE (1 << 3)
+#define SHOW_MODE (1 << 4)
+#define SHOW_DEFAULT 29 /* 11101 size is not shown to output by default */
 
 static const  char * const ls_tree_usage[] = {
 	N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
 	NULL
 };
 
+enum {
+	MODE_UNSPECIFIED = 0,
+	MODE_NAME_ONLY,
+	MODE_OBJECT_ONLY,
+	MODE_LONG,
+};
+
+static int cmdmode = MODE_UNSPECIFIED;
+
 static int show_recursive(const char *base, int baselen, const char *pathname)
 {
 	int i;
@@ -66,6 +83,7 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 {
 	int retval = 0;
 	int baselen;
+	int interspace = 0;
 	const char *type = blob_type;
 
 	if (S_ISGITLINK(mode)) {
@@ -74,8 +92,8 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 		 *
 		 * Something similar to this incomplete example:
 		 *
-		if (show_subprojects(base, baselen, pathname))
-			retval = READ_TREE_RECURSIVE;
+		 * if (show_subprojects(base, baselen, pathname))
+		 *	retval = READ_TREE_RECURSIVE;
 		 *
 		 */
 		type = commit_type;
@@ -90,35 +108,73 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	else if (ls_options & LS_TREE_ONLY)
 		return 0;
 
-	if (!(ls_options & LS_NAME_ONLY)) {
-		if (ls_options & LS_SHOW_SIZE) {
-			char size_text[24];
-			if (!strcmp(type, blob_type)) {
-				unsigned long size;
-				if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
-					xsnprintf(size_text, sizeof(size_text),
-						  "BAD");
-				else
-					xsnprintf(size_text, sizeof(size_text),
-						  "%"PRIuMAX, (uintmax_t)size);
-			} else
-				xsnprintf(size_text, sizeof(size_text), "-");
-			printf("%06o %s %s %7s\t", mode, type,
-			       find_unique_abbrev(oid, abbrev),
-			       size_text);
+	if (shown_bits & SHOW_MODE) {
+		printf("%06o", mode);
+		interspace = 1;
+	}
+	if (shown_bits & SHOW_TYPE) {
+		printf("%s%s", interspace ? " " : "", type);
+		interspace = 1;
+	}
+	if (shown_bits & SHOW_OBJECT_NAME) {
+		printf("%s%s", interspace ? " " : "",
+		       find_unique_abbrev(oid, abbrev));
+		if (!(shown_bits ^ SHOW_OBJECT_NAME))
+			goto LINE_FINISH;
+		interspace = 1;
+	}
+	if (shown_bits & SHOW_SIZE) {
+		char size_text[24];
+		if (!strcmp(type, blob_type)) {
+			unsigned long size;
+			if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
+				xsnprintf(size_text, sizeof(size_text), "BAD");
+			else
+				xsnprintf(size_text, sizeof(size_text),
+					  "%"PRIuMAX, (uintmax_t)size);
 		} else
-			printf("%06o %s %s\t", mode, type,
-			       find_unique_abbrev(oid, abbrev));
+			xsnprintf(size_text, sizeof(size_text), "-");
+		printf("%s%7s", interspace ? " " : "", size_text);
+		interspace = 1;
+	}
+	if (shown_bits & SHOW_FILE_NAME) {
+		if (interspace)
+			printf("\t");
+		baselen = base->len;
+		strbuf_addstr(base, pathname);
+		write_name_quoted_relative(base->buf,
+					   chomp_prefix ? ls_tree_prefix : NULL,
+					   stdout,
+					   line_termination
+					   ? CQ_NO_TERMINATOR_C_QUOTED
+					   : CQ_NO_TERMINATOR_AS_IS);
+		strbuf_setlen(base, baselen);
 	}
-	baselen = base->len;
-	strbuf_addstr(base, pathname);
-	write_name_quoted_relative(base->buf,
-				   chomp_prefix ? ls_tree_prefix : NULL,
-				   stdout, line_termination);
-	strbuf_setlen(base, baselen);
+
+LINE_FINISH:
+	putchar(line_termination);
 	return retval;
 }
 
+static int parse_shown_fields(void)
+{
+	if (cmdmode == MODE_NAME_ONLY) {
+		shown_bits = SHOW_FILE_NAME;
+		return 0;
+	}
+	if (cmdmode == MODE_OBJECT_ONLY) {
+		shown_bits = SHOW_OBJECT_NAME;
+		return 0;
+	}
+	if (!ls_options || (ls_options & LS_RECURSIVE)
+	    || (ls_options & LS_SHOW_TREES)
+	    || (ls_options & LS_TREE_ONLY))
+		shown_bits = SHOW_DEFAULT;
+	if (cmdmode == MODE_LONG)
+		shown_bits = SHOW_DEFAULT | SHOW_SIZE;
+	return 1;
+}
+
 int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 {
 	struct object_id oid;
@@ -133,12 +189,14 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 			LS_SHOW_TREES),
 		OPT_SET_INT('z', NULL, &line_termination,
 			    N_("terminate entries with NUL byte"), 0),
-		OPT_BIT('l', "long", &ls_options, N_("include object size"),
-			LS_SHOW_SIZE),
-		OPT_BIT(0, "name-only", &ls_options, N_("list only filenames"),
-			LS_NAME_ONLY),
-		OPT_BIT(0, "name-status", &ls_options, N_("list only filenames"),
-			LS_NAME_ONLY),
+		OPT_CMDMODE('l', "long", &cmdmode, N_("include object size"),
+			    MODE_LONG),
+		OPT_CMDMODE(0, "name-only", &cmdmode, N_("list only filenames"),
+			    MODE_NAME_ONLY),
+		OPT_CMDMODE(0, "name-status", &cmdmode, N_("list only filenames"),
+			    MODE_NAME_ONLY),
+		OPT_CMDMODE(0, "object-only", &cmdmode, N_("list only objects"),
+			    MODE_OBJECT_ONLY),
 		OPT_SET_INT(0, "full-name", &chomp_prefix,
 			    N_("use full path names"), 0),
 		OPT_BOOL(0, "full-tree", &full_tree,
@@ -169,6 +227,7 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	if (get_oid(argv[0], &oid))
 		die("Not a valid object name %s", argv[0]);
 
+	parse_shown_fields();
 	/*
 	 * show_recursive() rolls its own matching code and is
 	 * generally ignorant of 'struct pathspec'. The magic mask
diff --git a/quote.c b/quote.c
index 8a3a5e39eb..9f9da49fa2 100644
--- a/quote.c
+++ b/quote.c
@@ -340,12 +340,12 @@ void quote_two_c_style(struct strbuf *sb, const char *prefix, const char *path,
 
 void write_name_quoted(const char *name, FILE *fp, int terminator)
 {
-	if (terminator) {
+	if (0 < terminator || terminator == CQ_NO_TERMINATOR_C_QUOTED)
 		quote_c_style(name, NULL, fp, 0);
-	} else {
+	else
 		fputs(name, fp);
-	}
-	fputc(terminator, fp);
+	if (0 <= terminator)
+		fputc(terminator, fp);
 }
 
 void write_name_quoted_relative(const char *name, const char *prefix,
diff --git a/quote.h b/quote.h
index 049d8dd0b3..ec3f862545 100644
--- a/quote.h
+++ b/quote.h
@@ -84,8 +84,27 @@ int unquote_c_style(struct strbuf *, const char *quoted, const char **endp);
 #define CQUOTE_NODQ 01
 size_t quote_c_style(const char *name, struct strbuf *, FILE *, unsigned);
 void quote_two_c_style(struct strbuf *, const char *, const char *, unsigned);
+/*
+ * Write a name, typically a filename, followed by a terminator that
+ * separates it from what comes next.
+ * When terminator is NUL, the name is given as-is.  Otherwise, the
+ * name is c-quoted, suitable for text output.  HT and LF are typical
+ * values used for the terminator, but other positive values are possible.
+ *
+ * In addition to non-negative values two special values in terminator
+ * are possible.
+ *
+ * -1: show the name c-quoted, without adding any terminator.
+ * -2: show the name as-is, without adding any terminator.
+ */
+#define CQ_NO_TERMINATOR_C_QUOTED	(-1)
+#define CQ_NO_TERMINATOR_AS_IS		(-2)
 
 void write_name_quoted(const char *name, FILE *, int terminator);
+/*
+ * Similar to the above, but the name is first made relative to the prefix
+ * before being shown.
+ */
 void write_name_quoted_relative(const char *name, const char *prefix,
 				FILE *fp, int terminator);
 
diff --git a/t/t3103-ls-tree-misc.sh b/t/t3103-ls-tree-misc.sh
index 14520913af..75e38b0a51 100755
--- a/t/t3103-ls-tree-misc.sh
+++ b/t/t3103-ls-tree-misc.sh
@@ -22,4 +22,12 @@ test_expect_success 'ls-tree fails with non-zero exit code on broken tree' '
 	test_must_fail git ls-tree -r HEAD
 '
 
+test_expect_success 'usage: incompatible options: --name-status with --long' '
+	test_expect_code 129 git ls-tree --long --name-status
+'
+
+test_expect_success 'usage: incompatible options: --name-only with --long' '
+	test_expect_code 129 git ls-tree --long --name-only
+'
+
 test_done
diff --git a/t/t3104-ls-tree-oid.sh b/t/t3104-ls-tree-oid.sh
new file mode 100755
index 0000000000..81304e7b13
--- /dev/null
+++ b/t/t3104-ls-tree-oid.sh
@@ -0,0 +1,51 @@
+#!/bin/sh
+
+test_description='git ls-tree objects handling.'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	test_commit A &&
+	test_commit B &&
+	mkdir -p C &&
+	test_commit C/D.txt &&
+	find *.txt path* \( -type f -o -type l \) -print |
+	xargs git update-index --add &&
+	tree=$(git write-tree) &&
+	echo $tree
+'
+
+test_expect_success 'usage: --object-only' '
+	git ls-tree --object-only $tree >current &&
+	git ls-tree $tree >result &&
+	cut -f1 result | cut -d " " -f3 >expected &&
+	test_cmp current expected
+'
+
+test_expect_success 'usage: --object-only with -r' '
+	git ls-tree --object-only -r $tree >current &&
+	git ls-tree -r $tree >result &&
+	cut -f1 result | cut -d " " -f3 >expected &&
+	test_cmp current expected
+'
+
+test_expect_success 'usage: --object-only with --abbrev' '
+	git ls-tree --object-only --abbrev=6 $tree >current &&
+	git ls-tree --abbrev=6 $tree >result &&
+	cut -f1 result | cut -d " " -f3 >expected &&
+	test_cmp current expected
+'
+
+test_expect_success 'usage: incompatible options: --name-only with --object-only' '
+	test_expect_code 129 git ls-tree --object-only --name-only
+'
+
+test_expect_success 'usage: incompatible options: --name-status with --object-only' '
+	test_expect_code 129 git ls-tree --object-only --name-status
+'
+
+test_expect_success 'usage: incompatible options: --long with --object-only' '
+	test_expect_code 129 git ls-tree --object-only --long
+'
+
+test_done
-- 
2.33.1.10.g5f17c1a2c1.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v6 1/1] ls-tree.c: support `--object-only` option for "git-ls-tree"
  2021-12-17  6:57           ` [PATCH v6 1/1] ls-tree.c: " Teng Long
@ 2021-12-17 13:09             ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-17 13:09 UTC (permalink / raw)
  To: Teng Long; +Cc: congdanhqx, git, gitster, peff


On Fri, Dec 17 2021, Teng Long wrote:

> [...]
>  int cmd_ls_tree(int argc, const char **argv, const char *prefix)
>  {
>  	struct object_id oid;
> @@ -133,12 +189,14 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
>  			LS_SHOW_TREES),
>  		OPT_SET_INT('z', NULL, &line_termination,
>  			    N_("terminate entries with NUL byte"), 0),
> -		OPT_BIT('l', "long", &ls_options, N_("include object size"),
> -			LS_SHOW_SIZE),
> -		OPT_BIT(0, "name-only", &ls_options, N_("list only filenames"),
> -			LS_NAME_ONLY),
> -		OPT_BIT(0, "name-status", &ls_options, N_("list only filenames"),
> -			LS_NAME_ONLY),
> +		OPT_CMDMODE('l', "long", &cmdmode, N_("include object size"),
> +			    MODE_LONG),
> +		OPT_CMDMODE(0, "name-only", &cmdmode, N_("list only filenames"),
> +			    MODE_NAME_ONLY),
> +		OPT_CMDMODE(0, "name-status", &cmdmode, N_("list only filenames"),
> +			    MODE_NAME_ONLY),
> +		OPT_CMDMODE(0, "object-only", &cmdmode, N_("list only objects"),
> +			    MODE_OBJECT_ONLY),
>  		OPT_SET_INT(0, "full-name", &chomp_prefix,
>  			    N_("use full path names"), 0),
>  		OPT_BOOL(0, "full-tree", &full_tree,

Very nice to have the OPT_CMDMODE for asserting the usage, but this
would be even better if it were done as a separate commit. I.e. let's
first do prep cleanups, then the new --object-only mode.
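
To illustrate the usage check being discussed: a minimal sketch of
"command mode" mutual exclusion, assuming a plain int holds the chosen
mode. set_cmdmode() is a hypothetical helper written for this example,
not git's parse-options code; the enum values mirror the quoted patch.

```c
/* Mode values as in the quoted patch. */
enum {
	MODE_UNSPECIFIED = 0,
	MODE_NAME_ONLY,
	MODE_OBJECT_ONLY,
	MODE_LONG,
};

/*
 * Hypothetical helper: record "wanted" in *mode.
 * Returns 0 on success, -1 if a different mode was already chosen.
 * Asking for the same mode twice is harmless and allowed.
 */
static int set_cmdmode(int *mode, int wanted)
{
	if (*mode != MODE_UNSPECIFIED && *mode != wanted)
		return -1;
	*mode = wanted;
	return 0;
}
```

Under this sketch --name-only and --name-status can be combined, since
both map to MODE_NAME_ONLY, while --long followed by --object-only is
rejected.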

> +test_expect_success 'usage: incompatible options: --name-status with --long' '
> +	test_expect_code 129 git ls-tree --long --name-status
> +'
> +
> +test_expect_success 'usage: incompatible options: --name-only with --long' '
> +	test_expect_code 129 git ls-tree --long --name-only
> +'
> +
>  test_done
> [...]
> +test_expect_success 'usage: incompatible options: --name-only with --object-only' '
> +	test_expect_code 129 git ls-tree --object-only --name-only
> +'
> +
> +test_expect_success 'usage: incompatible options: --name-status with --object-only' '
> +	test_expect_code 129 git ls-tree --object-only --name-status
> +'
> +
> +test_expect_success 'usage: incompatible options: --long with --object-only' '
> +	test_expect_code 129 git ls-tree --object-only --long
> +'

These tests don't check what you think they check, because you don't
supply a <tree-ish>. So they're really just dying for the same reason a
bare:

    git ls-tree

would.

^ permalink raw reply	[flat|nested] 224+ messages in thread

* [RFC PATCH 0/7] ls-tree --format
  2021-12-17  6:57         ` [PATCH v6 0/1] " Teng Long
  2021-12-17  6:57           ` [PATCH v6 1/1] ls-tree.c: " Teng Long
@ 2021-12-17 13:30           ` Ævar Arnfjörð Bjarmason
  2021-12-17 13:30             ` [RFC PATCH 1/7] ls-tree: remove commented-out code Ævar Arnfjörð Bjarmason
                               ` (6 more replies)
  2022-01-01 13:50           ` [PATCH v8 0/8] ls-tree: "--object-only" and "--format" opts Teng Long
  2 siblings, 7 replies; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-17 13:30 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Teng Long, Ævar Arnfjörð Bjarmason

On Fri, Dec 17 2021, Teng Long wrote:

> Many thanks to Junio and Ævar for your help and patient explanation.
> I noticed Ævar suggest the solution with using `--format`, but in
> this patch, the current approach continues. If this part of code needs
> to be improved or we want to support "--format" in "ls-tree" in the
> future, I'm more than glad to continue to contribute.

FWIW here are the changes I had locally & cleaned up now that did the
alternate --format approach.

I think you'll probably want to steal some of this, e.g. you're
patching the dead comment I removed in 1/7. 2-4/7 can be skipped, but
I thought they were nice.

Back when I last looked at this series, your --object-only patch was
much shorter, but now it's about the same size as the generic --format
support. So maybe it's worth considering implementing the more generic
path.

One reason I didn't submit this before is that I couldn't get past the
performance regression this would introduce, i.e. if moved entirely
to strbuf_expand(). Here though I'm keeping the old code, so it's no
slower than "master", unlike your patch. But I haven't dug into why
yours is slower:
    
    $ git hyperfine -L rev origin/master,tl/object-name,avar/ls-tree-format -s 'make CFLAGS=-O3' './git -C /run/user/1001/linux ls-tree -r HEAD' --warmup 10 -r 10
    Benchmark 1: ./git -C /run/user/1001/linux ls-tree -r HEAD' in 'origin/master
      Time (mean ± σ):      67.8 ms ±   0.3 ms    [User: 48.8 ms, System: 18.9 ms]
      Range (min … max):    67.4 ms …  68.4 ms    10 runs
    
    Benchmark 2: ./git -C /run/user/1001/linux ls-tree -r HEAD' in 'tl/object-name
      Time (mean ± σ):      72.8 ms ±   0.4 ms    [User: 50.6 ms, System: 22.1 ms]
      Range (min … max):    72.0 ms …  73.2 ms    10 runs
    
    Benchmark 3: ./git -C /run/user/1001/linux ls-tree -r HEAD' in 'avar/ls-tree-format
      Time (mean ± σ):      67.6 ms ±   0.4 ms    [User: 50.5 ms, System: 17.0 ms]
      Range (min … max):    67.1 ms …  68.4 ms    10 runs
    
    Summary
      './git -C /run/user/1001/linux ls-tree -r HEAD' in 'avar/ls-tree-format' ran
        1.00 ± 0.01 times faster than './git -C /run/user/1001/linux ls-tree -r HEAD' in 'origin/master'
        1.08 ± 0.01 times faster than './git -C /run/user/1001/linux ls-tree -r HEAD' in 'tl/object-name'

I then tacked a 7/7 on at the end here to implement your --object-only in
terms of --format (but didn't update the commit message etc.). That's
slower as expected:
    
    $ git hyperfine -L rev tl/object-name,avar/ls-tree-format -s 'make CFLAGS=-O3' './git -C /run/user/1001/linux ls-tree --object-only -r HEAD' --warmup 10 -r 10
    Benchmark 1: ./git -C /run/user/1001/linux ls-tree --object-only -r HEAD' in 'tl/object-name
      Time (mean ± σ):      58.7 ms ±   0.4 ms    [User: 43.0 ms, System: 15.6 ms]
      Range (min … max):    58.4 ms …  59.6 ms    10 runs
     
    Benchmark 2: ./git -C /run/user/1001/linux ls-tree --object-only -r HEAD' in 'avar/ls-tree-format
      Time (mean ± σ):      65.6 ms ±   0.2 ms    [User: 42.4 ms, System: 23.0 ms]
      Range (min … max):    65.1 ms …  65.9 ms    10 runs
     
    Summary
      './git -C /run/user/1001/linux ls-tree --object-only -r HEAD' in 'tl/object-name' ran
        1.12 ± 0.01 times faster than './git -C /run/user/1001/linux ls-tree --object-only -r HEAD' in 'avar/ls-tree-format'

But it's not too bad, so maybe it's fine & worth making it more
generic?
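
For readers unfamiliar with the idea, here is a minimal sketch of what
"generic" means here: one routine that expands placeholders per tree
entry. This is not the strbuf_expand()-based code in this series;
struct entry, skip() and expand_entry() are hypothetical names made up
for illustration. Only the placeholder spellings are taken from the
format strings in 6/7.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical per-entry data; field names are illustrative only. */
struct entry {
	unsigned mode;
	const char *type;
	const char *oid_hex;
	const char *path;
};

/* If *s starts with prefix, advance *s past it and return 1. */
static int skip(const char **s, const char *prefix)
{
	size_t len = strlen(prefix);

	if (strncmp(*s, prefix, len))
		return 0;
	*s += len;
	return 1;
}

/*
 * Expand the placeholders in fmt into out, NUL-terminated.
 * Anything that is not a known placeholder is copied verbatim.
 * Returns 0 on success, -1 if out is too small.
 */
static int expand_entry(char *out, size_t outlen, const char *fmt,
			const struct entry *e)
{
	size_t used = 0;

	if (!outlen)
		return -1;
	while (*fmt) {
		char piece[256];
		int n;

		if (skip(&fmt, "%(objectmode)"))
			n = snprintf(piece, sizeof(piece), "%06o", e->mode);
		else if (skip(&fmt, "%(objecttype)"))
			n = snprintf(piece, sizeof(piece), "%s", e->type);
		else if (skip(&fmt, "%(objectname)"))
			n = snprintf(piece, sizeof(piece), "%s", e->oid_hex);
		else if (skip(&fmt, "%(path)"))
			n = snprintf(piece, sizeof(piece), "%s", e->path);
		else if (skip(&fmt, "%x09"))
			n = snprintf(piece, sizeof(piece), "\t");
		else
			n = snprintf(piece, sizeof(piece), "%c", *fmt++);

		/* Keep one byte free for the trailing NUL. */
		if (n < 0 || (size_t)n >= sizeof(piece) ||
		    used + (size_t)n >= outlen)
			return -1;
		memcpy(out + used, piece, (size_t)n);
		used += (size_t)n;
	}
	out[used] = '\0';
	return 0;
}
```

With this shape, an "--object-only" style output is just the format
"%(objectname)", at the cost of scanning the format string for every
entry.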

Anyway. Just food for thought and an FYI in case you're
interested. Junio noted already that he'd like the --object-only
approach first, so if you still want to pursue your current
implementation I don't mind.

I do think you should be making performance testing a part of your
testing & cover letter writing though. An 8-10% slowdown isn't nothing,
especially for exactly the sort of plumbing command that's likely to
be used to e.g. slurp up all paths in a very large repo.
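
As a rough check of that figure against the hyperfine means quoted
earlier in this mail, the relative slowdown can be computed like so.
slowdown_percent() is a hypothetical helper written for this example,
and the inputs in the comment are the mean times from the benchmark
output above.

```c
/*
 * Relative slowdown of "candidate" over "baseline", in percent.
 * Both arguments are mean wall-clock times in the same unit.
 *
 * With the means above:
 *   origin/master 67.8 ms vs. tl/object-name 72.8 ms  -> about 7.4%
 *   tl/object-name 58.7 ms vs. --format based 65.6 ms -> about 11.8%
 */
static double slowdown_percent(double baseline_ms, double candidate_ms)
{
	return (candidate_ms / baseline_ms - 1.0) * 100.0;
}
```

These are single-machine numbers over 10 runs each, so they only
indicate the order of magnitude.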

These patches really aren't "ready". There are no docs, and as I noted
in some earlier thread the tests for ls-tree are really
lacking. E.g. I seem to have a rather obvious bug in how -t and the
--format interact here, but no test catches it.

Well, that one's me not having added a test, but I'm fairly sure there
might also be hidden bugs here due to lack of testing.

Teng Long (1):
  ls-tree.c: support `--object-only` option for "git-ls-tree"

Ævar Arnfjörð Bjarmason (6):
  ls-tree: remove commented-out code
  ls-tree: add missing braces to "else" arms
  ls-tree: use "enum object_type", not {blob,tree,commit}_type
  ls-tree: use "size_t", not "int" for "struct strbuf"'s "len"
  ls-tree: split up the "init" part of show_tree()
  ls-tree: add a --format=<fmt> option

 Documentation/git-ls-tree.txt |   7 +-
 builtin/ls-tree.c             | 226 ++++++++++++++++++++++++++++++----
 t/t3103-ls-tree-misc.sh       |   8 ++
 t/t3104-ls-tree-oid.sh        |  51 ++++++++
 t/t3105-ls-tree-format.sh     |  49 ++++++++
 5 files changed, 313 insertions(+), 28 deletions(-)
 create mode 100755 t/t3104-ls-tree-oid.sh
 create mode 100755 t/t3105-ls-tree-format.sh

-- 
2.34.1.1119.g7a3fc8778ee


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [RFC PATCH 1/7] ls-tree: remove commented-out code
  2021-12-17 13:30           ` [RFC PATCH 0/7] ls-tree --format Ævar Arnfjörð Bjarmason
@ 2021-12-17 13:30             ` Ævar Arnfjörð Bjarmason
  2021-12-17 13:30             ` [RFC PATCH 2/7] ls-tree: add missing braces to "else" arms Ævar Arnfjörð Bjarmason
                               ` (5 subsequent siblings)
  6 siblings, 0 replies; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-17 13:30 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Teng Long, Ævar Arnfjörð Bjarmason

Remove code added in f35a6d3bce7 (Teach core object handling functions
about gitlinks, 2007-04-09), later patched in 7d0b18a4da1 (Add output
flushing before fork(), 2008-08-04), and then finally ending up in its
current form in d3bee161fef (tree.c: allow read_tree_recursive() to
traverse gitlink entries, 2009-01-25). All while being commented-out!

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/ls-tree.c | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3a442631c71..5f7c84950ce 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -69,15 +69,6 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	const char *type = blob_type;
 
 	if (S_ISGITLINK(mode)) {
-		/*
-		 * Maybe we want to have some recursive version here?
-		 *
-		 * Something similar to this incomplete example:
-		 *
-		if (show_subprojects(base, baselen, pathname))
-			retval = READ_TREE_RECURSIVE;
-		 *
-		 */
 		type = commit_type;
 	} else if (S_ISDIR(mode)) {
 		if (show_recursive(base->buf, base->len, pathname)) {
-- 
2.34.1.1119.g7a3fc8778ee


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [RFC PATCH 2/7] ls-tree: add missing braces to "else" arms
  2021-12-17 13:30           ` [RFC PATCH 0/7] ls-tree --format Ævar Arnfjörð Bjarmason
  2021-12-17 13:30             ` [RFC PATCH 1/7] ls-tree: remove commented-out code Ævar Arnfjörð Bjarmason
@ 2021-12-17 13:30             ` Ævar Arnfjörð Bjarmason
  2021-12-17 13:30             ` [RFC PATCH 3/7] ls-tree: use "enum object_type", not {blob,tree,commit}_type Ævar Arnfjörð Bjarmason
                               ` (4 subsequent siblings)
  6 siblings, 0 replies; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-17 13:30 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Teng Long, Ævar Arnfjörð Bjarmason

Add missing {} to the "else" arms in show_tree() per the
CodingGuidelines.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/ls-tree.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 5f7c84950ce..0a28f32ccb9 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -92,14 +92,16 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 				else
 					xsnprintf(size_text, sizeof(size_text),
 						  "%"PRIuMAX, (uintmax_t)size);
-			} else
+			} else {
 				xsnprintf(size_text, sizeof(size_text), "-");
+			}
 			printf("%06o %s %s %7s\t", mode, type,
 			       find_unique_abbrev(oid, abbrev),
 			       size_text);
-		} else
+		} else {
 			printf("%06o %s %s\t", mode, type,
 			       find_unique_abbrev(oid, abbrev));
+		}
 	}
 	baselen = base->len;
 	strbuf_addstr(base, pathname);
-- 
2.34.1.1119.g7a3fc8778ee


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [RFC PATCH 3/7] ls-tree: use "enum object_type", not {blob,tree,commit}_type
  2021-12-17 13:30           ` [RFC PATCH 0/7] ls-tree --format Ævar Arnfjörð Bjarmason
  2021-12-17 13:30             ` [RFC PATCH 1/7] ls-tree: remove commented-out code Ævar Arnfjörð Bjarmason
  2021-12-17 13:30             ` [RFC PATCH 2/7] ls-tree: add missing braces to "else" arms Ævar Arnfjörð Bjarmason
@ 2021-12-17 13:30             ` Ævar Arnfjörð Bjarmason
  2021-12-17 13:30             ` [RFC PATCH 4/7] ls-tree: use "size_t", not "int" for "struct strbuf"'s "len" Ævar Arnfjörð Bjarmason
                               ` (3 subsequent siblings)
  6 siblings, 0 replies; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-17 13:30 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Teng Long, Ævar Arnfjörð Bjarmason

Change the ls-tree.c code to use type_name() on the enum instead of
using the string constants. This doesn't matter either way for
performance, but makes this a bit easier to read as we'll no longer
need a strcmp() here.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/ls-tree.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 0a28f32ccb9..3f0225b097f 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -66,17 +66,17 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 {
 	int retval = 0;
 	int baselen;
-	const char *type = blob_type;
+	enum object_type type = OBJ_BLOB;
 
 	if (S_ISGITLINK(mode)) {
-		type = commit_type;
+		type = OBJ_COMMIT;
 	} else if (S_ISDIR(mode)) {
 		if (show_recursive(base->buf, base->len, pathname)) {
 			retval = READ_TREE_RECURSIVE;
 			if (!(ls_options & LS_SHOW_TREES))
 				return retval;
 		}
-		type = tree_type;
+		type = OBJ_TREE;
 	}
 	else if (ls_options & LS_TREE_ONLY)
 		return 0;
@@ -84,7 +84,7 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	if (!(ls_options & LS_NAME_ONLY)) {
 		if (ls_options & LS_SHOW_SIZE) {
 			char size_text[24];
-			if (!strcmp(type, blob_type)) {
+			if (type == OBJ_BLOB) {
 				unsigned long size;
 				if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
 					xsnprintf(size_text, sizeof(size_text),
@@ -95,11 +95,11 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 			} else {
 				xsnprintf(size_text, sizeof(size_text), "-");
 			}
-			printf("%06o %s %s %7s\t", mode, type,
+			printf("%06o %s %s %7s\t", mode, type_name(type),
 			       find_unique_abbrev(oid, abbrev),
 			       size_text);
 		} else {
-			printf("%06o %s %s\t", mode, type,
+			printf("%06o %s %s\t", mode, type_name(type),
 			       find_unique_abbrev(oid, abbrev));
 		}
 	}
-- 
2.34.1.1119.g7a3fc8778ee


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [RFC PATCH 4/7] ls-tree: use "size_t", not "int" for "struct strbuf"'s "len"
  2021-12-17 13:30           ` [RFC PATCH 0/7] ls-tree --format Ævar Arnfjörð Bjarmason
                               ` (2 preceding siblings ...)
  2021-12-17 13:30             ` [RFC PATCH 3/7] ls-tree: use "enum object_type", not {blob,tree,commit}_type Ævar Arnfjörð Bjarmason
@ 2021-12-17 13:30             ` Ævar Arnfjörð Bjarmason
  2021-12-17 13:30             ` [RFC PATCH 5/7] ls-tree: split up the "init" part of show_tree() Ævar Arnfjörð Bjarmason
                               ` (2 subsequent siblings)
  6 siblings, 0 replies; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-17 13:30 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Teng Long, Ævar Arnfjörð Bjarmason

The "struct strbuf"'s "len" member is a "size_t", not an "int", so
let's change our corresponding types accordingly. This also changes
the "len" and "speclen" variables, which are likewise used to store
the return value of strlen(), which returns "size_t", not "int".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/ls-tree.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3f0225b097f..eecc7482d54 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -31,7 +31,7 @@ static const  char * const ls_tree_usage[] = {
 	NULL
 };
 
-static int show_recursive(const char *base, int baselen, const char *pathname)
+static int show_recursive(const char *base, size_t baselen, const char *pathname)
 {
 	int i;
 
@@ -43,7 +43,7 @@ static int show_recursive(const char *base, int baselen, const char *pathname)
 
 	for (i = 0; i < pathspec.nr; i++) {
 		const char *spec = pathspec.items[i].match;
-		int len, speclen;
+		size_t len, speclen;
 
 		if (strncmp(base, spec, baselen))
 			continue;
@@ -65,7 +65,7 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 		const char *pathname, unsigned mode, void *context)
 {
 	int retval = 0;
-	int baselen;
+	size_t baselen;
 	enum object_type type = OBJ_BLOB;
 
 	if (S_ISGITLINK(mode)) {
-- 
2.34.1.1119.g7a3fc8778ee


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [RFC PATCH 5/7] ls-tree: split up the "init" part of show_tree()
  2021-12-17 13:30           ` [RFC PATCH 0/7] ls-tree --format Ævar Arnfjörð Bjarmason
                               ` (3 preceding siblings ...)
  2021-12-17 13:30             ` [RFC PATCH 4/7] ls-tree: use "size_t", not "int" for "struct strbuf"'s "len" Ævar Arnfjörð Bjarmason
@ 2021-12-17 13:30             ` Ævar Arnfjörð Bjarmason
  2021-12-17 13:30             ` [RFC PATCH 6/7] ls-tree: add a --format=<fmt> option Ævar Arnfjörð Bjarmason
  2021-12-17 13:30             ` [RFC PATCH 7/7] ls-tree.c: support `--object-only` option for "git-ls-tree" Ævar Arnfjörð Bjarmason
  6 siblings, 0 replies; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-17 13:30 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Teng Long, Ævar Arnfjörð Bjarmason

Split up the "init" part of the show_tree() function where we decide
what the "type" is, and whether we'll return early. This makes things
a bit less readable for now, but we'll soon re-use this in a sibling
function, and avoiding the duplication will be worth it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/ls-tree.c | 30 +++++++++++++++++++-----------
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index eecc7482d54..df8312408da 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -61,25 +61,33 @@ static int show_recursive(const char *base, size_t baselen, const char *pathname
 	return 0;
 }
 
-static int show_tree(const struct object_id *oid, struct strbuf *base,
-		const char *pathname, unsigned mode, void *context)
+static int show_tree_init(enum object_type *type, struct strbuf *base,
+			  const char *pathname, unsigned mode, int *retval)
 {
-	int retval = 0;
-	size_t baselen;
-	enum object_type type = OBJ_BLOB;
-
 	if (S_ISGITLINK(mode)) {
-		type = OBJ_COMMIT;
+		*type = OBJ_COMMIT;
 	} else if (S_ISDIR(mode)) {
 		if (show_recursive(base->buf, base->len, pathname)) {
-			retval = READ_TREE_RECURSIVE;
+			*retval = READ_TREE_RECURSIVE;
 			if (!(ls_options & LS_SHOW_TREES))
-				return retval;
+				return 1;
 		}
-		type = OBJ_TREE;
+		*type = OBJ_TREE;
 	}
 	else if (ls_options & LS_TREE_ONLY)
-		return 0;
+		return 1;
+	return 0;
+}
+
+static int show_tree(const struct object_id *oid, struct strbuf *base,
+		const char *pathname, unsigned mode, void *context)
+{
+	int retval = 0;
+	size_t baselen;
+	enum object_type type = OBJ_BLOB;
+
+	if (show_tree_init(&type, base, pathname, mode, &retval))
+		return retval;
 
 	if (!(ls_options & LS_NAME_ONLY)) {
 		if (ls_options & LS_SHOW_SIZE) {
-- 
2.34.1.1119.g7a3fc8778ee


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [RFC PATCH 6/7] ls-tree: add a --format=<fmt> option
  2021-12-17 13:30           ` [RFC PATCH 0/7] ls-tree --format Ævar Arnfjörð Bjarmason
                               ` (4 preceding siblings ...)
  2021-12-17 13:30             ` [RFC PATCH 5/7] ls-tree: split up the "init" part of show_tree() Ævar Arnfjörð Bjarmason
@ 2021-12-17 13:30             ` Ævar Arnfjörð Bjarmason
  2021-12-17 13:30             ` [RFC PATCH 7/7] ls-tree.c: support `--object-only` option for "git-ls-tree" Ævar Arnfjörð Bjarmason
  6 siblings, 0 replies; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-17 13:30 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Teng Long, Ævar Arnfjörð Bjarmason

Add a --format option to ls-tree. It has an existing default output,
and then --long and --name-only options to emit the default output
along with the object size, or to emit only the object paths,
respectively.

Rather than add --type-only, --object-only etc. we can just support a
--format using a strbuf_expand() similar to "for-each-ref
--format". We might still add such options in the future for
convenience.

The --format implementation is slower than the existing code, but this
change does not cause any performance regressions. We'll leave the
existing show_tree() unchanged, and only run show_tree_format() if
a --format different from the hardcoded built-in ones corresponding to
the existing modes is provided.

"Slower" here can be seen via the following "hyperfine"
command. This uses GIT_TEST_LS_TREE_FORMAT_BACKEND=<bool> to force the
use of the new backend:

    $ hyperfine -L env false,true -L f "-r,-r -l,-r --name-only,-r --format='%(objectname)'" 'GIT_TEST_LS_TREE_FORMAT_BACKEND={env} ./git -C ~/g/linux ls-tree {f} HEAD' -r 10
    Benchmark 1: GIT_TEST_LS_TREE_FORMAT_BACKEND=false ./git -C ~/g/linux ls-tree -r HEAD
      Time (mean ± σ):      86.1 ms ±   0.6 ms    [User: 65.2 ms, System: 20.9 ms]
      Range (min … max):    85.2 ms …  87.5 ms    10 runs

    Benchmark 2: GIT_TEST_LS_TREE_FORMAT_BACKEND=true ./git -C ~/g/linux ls-tree -r HEAD
      Time (mean ± σ):     122.5 ms ±   0.6 ms    [User: 101.3 ms, System: 21.1 ms]
      Range (min … max):   121.8 ms … 123.4 ms    10 runs

    Benchmark 3: GIT_TEST_LS_TREE_FORMAT_BACKEND=false ./git -C ~/g/linux ls-tree -r -l HEAD
      Time (mean ± σ):     277.7 ms ±   1.3 ms    [User: 234.6 ms, System: 43.0 ms]
      Range (min … max):   275.9 ms … 279.7 ms    10 runs

    Benchmark 4: GIT_TEST_LS_TREE_FORMAT_BACKEND=true ./git -C ~/g/linux ls-tree -r -l HEAD
      Time (mean ± σ):     332.8 ms ±   2.6 ms    [User: 282.0 ms, System: 50.7 ms]
      Range (min … max):   329.6 ms … 338.2 ms    10 runs

    Benchmark 5: GIT_TEST_LS_TREE_FORMAT_BACKEND=false ./git -C ~/g/linux ls-tree -r --name-only HEAD
      Time (mean ± σ):      71.8 ms ±   0.4 ms    [User: 54.1 ms, System: 17.6 ms]
      Range (min … max):    71.2 ms …  72.5 ms    10 runs

    Benchmark 6: GIT_TEST_LS_TREE_FORMAT_BACKEND=true ./git -C ~/g/linux ls-tree -r --name-only HEAD
      Time (mean ± σ):      86.6 ms ±   0.5 ms    [User: 65.7 ms, System: 20.7 ms]
      Range (min … max):    85.9 ms …  87.4 ms    10 runs

    Benchmark 7: GIT_TEST_LS_TREE_FORMAT_BACKEND=false ./git -C ~/g/linux ls-tree -r --format='%(objectname)' HEAD
      Time (mean ± σ):      85.8 ms ±   0.6 ms    [User: 66.2 ms, System: 19.5 ms]
      Range (min … max):    85.0 ms …  86.9 ms    10 runs

    Benchmark 8: GIT_TEST_LS_TREE_FORMAT_BACKEND=true ./git -C ~/g/linux ls-tree -r --format='%(objectname)' HEAD
      Time (mean ± σ):      85.3 ms ±   0.2 ms    [User: 66.6 ms, System: 18.7 ms]
      Range (min … max):    85.0 ms …  85.7 ms    10 runs

    Summary
      'GIT_TEST_LS_TREE_FORMAT_BACKEND=false ./git -C ~/g/linux ls-tree -r --name-only HEAD' ran
        1.19 ± 0.01 times faster than 'GIT_TEST_LS_TREE_FORMAT_BACKEND=true ./git -C ~/g/linux ls-tree -r --format='%(objectname)' HEAD'
        1.19 ± 0.01 times faster than 'GIT_TEST_LS_TREE_FORMAT_BACKEND=false ./git -C ~/g/linux ls-tree -r --format='%(objectname)' HEAD'
        1.20 ± 0.01 times faster than 'GIT_TEST_LS_TREE_FORMAT_BACKEND=false ./git -C ~/g/linux ls-tree -r HEAD'
        1.21 ± 0.01 times faster than 'GIT_TEST_LS_TREE_FORMAT_BACKEND=true ./git -C ~/g/linux ls-tree -r --name-only HEAD'
        1.71 ± 0.01 times faster than 'GIT_TEST_LS_TREE_FORMAT_BACKEND=true ./git -C ~/g/linux ls-tree -r HEAD'
        3.87 ± 0.03 times faster than 'GIT_TEST_LS_TREE_FORMAT_BACKEND=false ./git -C ~/g/linux ls-tree -r -l HEAD'
        4.64 ± 0.05 times faster than 'GIT_TEST_LS_TREE_FORMAT_BACKEND=true ./git -C ~/g/linux ls-tree -r -l HEAD'

I.e. something like the "--long" output would be much slower with
this, mainly due to how we need to allocate various things to do with
quote.c instead of spewing the output directly to stdout.

But even a --format='%(objectname)' is fast with the new backend, so
this is viable as a replacement for adding new formats, and we'll pay
for this added complexity as a one-off, and not again every time a new
format needs to be added. See [1] for an example of what it would
otherwise take to add an --object-only flag.

1. https://lore.kernel.org/git/2e449d1c792ff81da5f22c8bf65ed33c393d62f8.1639721750.git.dyroneteng@gmail.com/
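
As a rough illustration of the placeholder expansion described above (a
standalone sketch, not the code in this patch): the real implementation
goes through strbuf_expand() with a callback, while the hypothetical
expand_format() and "struct entry" below only mimic the idea with plain
snprintf(). The "%x09" escape and "%(objectsize)" are left out on purpose.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the per-entry data handed to the callback. */
struct entry {
	unsigned mode;
	const char *type;
	const char *oid_hex;
	const char *path;
};

/* Returns 1 if "s" begins with "prefix"; always stores strlen(prefix). */
static int starts_with(const char *s, const char *prefix, size_t *len)
{
	*len = strlen(prefix);
	return !strncmp(s, prefix, *len);
}

/*
 * Expand the placeholders in "fmt" into "out". Returns 0 on success and
 * -1 on an unknown placeholder or when "out" is too small.
 */
static int expand_format(char *out, size_t outlen, const char *fmt,
			 const struct entry *e)
{
	size_t used = 0;

	assert(outlen > 0);
	out[0] = '\0';
	while (*fmt) {
		size_t avail = outlen - used, skip = 1;
		int n;

		if (*fmt != '%')
			n = snprintf(out + used, avail, "%c", *fmt);
		else if (starts_with(fmt, "%(objectmode)", &skip))
			n = snprintf(out + used, avail, "%06o", e->mode);
		else if (starts_with(fmt, "%(objecttype)", &skip))
			n = snprintf(out + used, avail, "%s", e->type);
		else if (starts_with(fmt, "%(objectname)", &skip))
			n = snprintf(out + used, avail, "%s", e->oid_hex);
		else if (starts_with(fmt, "%(path)", &skip))
			n = snprintf(out + used, avail, "%s", e->path);
		else
			return -1; /* unknown placeholder; the patch die()s here */

		if (n < 0 || (size_t)n >= avail)
			return -1; /* output did not fit */
		used += (size_t)n;
		fmt += skip;
	}
	return 0;
}
```

Adding a new field in this scheme means adding one branch to the chain,
which is the "pay once" property the message argues for.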

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/ls-tree.c         | 167 +++++++++++++++++++++++++++++++++++++-
 t/t3105-ls-tree-format.sh |  49 +++++++++++
 2 files changed, 215 insertions(+), 1 deletion(-)
 create mode 100755 t/t3105-ls-tree-format.sh

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index df8312408da..efd85cab088 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -26,11 +26,34 @@ static struct pathspec pathspec;
 static int chomp_prefix;
 static const char *ls_tree_prefix;
 
+/*
+ * The format equivalents that show_tree() is prepared to handle.
+ */
+static const char *ls_tree_format_d = "%(objectmode) %(objecttype) %(objectname)%x09%(path)";
+static const char *ls_tree_format_l = "%(objectmode) %(objecttype) %(objectname) %(objectsize:padded)%x09%(path)";
+static const char *ls_tree_format_n = "%(path)";
+
 static const  char * const ls_tree_usage[] = {
 	N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
 	NULL
 };
 
+struct read_tree_ls_tree_data {
+	const char *format;
+	struct strbuf sb_scratch;
+	struct strbuf sb_tmp;
+};
+
+struct expand_ls_tree_data {
+	unsigned mode;
+	enum object_type type;
+	const struct object_id *oid;
+	const char *pathname;
+	const char *basebuf;
+	struct strbuf *sb_scratch;
+	struct strbuf *sb_tmp;
+};
+
 static int show_recursive(const char *base, size_t baselen, const char *pathname)
 {
 	int i;
@@ -61,6 +84,76 @@ static int show_recursive(const char *base, size_t baselen, const char *pathname
 	return 0;
 }
 
+static void expand_objectsize(struct strbuf *sb,
+			      const struct object_id *oid,
+			      const enum object_type type,
+			      unsigned int padded)
+{
+	if (type == OBJ_BLOB) {
+		unsigned long size;
+		if (oid_object_info(the_repository, oid, &size) < 0)
+			die(_("could not get object info about '%s'"), oid_to_hex(oid));
+		if (padded)
+			strbuf_addf(sb, "%7"PRIuMAX, (uintmax_t)size);
+		else
+			strbuf_addf(sb, "%"PRIuMAX, (uintmax_t)size);
+	} else if (padded) {
+		strbuf_addf(sb, "%7s", "-");
+	} else {
+		strbuf_addstr(sb, "-");
+	}
+}
+
+static size_t expand_show_tree(struct strbuf *sb,
+			       const char *start,
+			       void *context)
+{
+	struct expand_ls_tree_data *data = context;
+	const char *end;
+	const char *p;
+	size_t len;
+
+	len = strbuf_expand_literal_cb(sb, start, NULL);
+	if (len)
+		return len;
+
+	if (*start != '(')
+		die(_("bad format as of '%s'"), start);
+	end = strchr(start + 1, ')');
+	if (!end)
+		die(_("ls-tree format element '%s' does not end in ')'"),
+		    start);
+	len = end - start + 1;
+
+	if (skip_prefix(start, "(objectmode)", &p)) {
+		strbuf_addf(sb, "%06o", data->mode);
+	} else if (skip_prefix(start, "(objecttype)", &p)) {
+		strbuf_addstr(sb, type_name(data->type));
+	} else if (skip_prefix(start, "(objectsize:padded)", &p)) {
+		expand_objectsize(sb, data->oid, data->type, 1);
+	} else if (skip_prefix(start, "(objectsize)", &p)) {
+		expand_objectsize(sb, data->oid, data->type, 0);
+	} else if (skip_prefix(start, "(objectname)", &p)) {
+		strbuf_addstr(sb, find_unique_abbrev(data->oid, abbrev));
+	} else if (skip_prefix(start, "(path)", &p)) {
+		const char *name = data->basebuf;
+		const char *prefix = chomp_prefix ? ls_tree_prefix : NULL;
+
+		if (prefix)
+			name = relative_path(name, prefix, data->sb_scratch);
+		quote_c_style(name, data->sb_tmp, NULL, 0);
+		strbuf_add(sb, data->sb_tmp->buf, data->sb_tmp->len);
+
+		strbuf_reset(data->sb_tmp);
+		/* The relative_path() function resets "scratch" */
+	} else {
+		int errlen = (int)len;
+		die(_("bad ls-tree format specifier %%%.*s"), errlen, start);
+	}
+
+	return len;
+}
+
 static int show_tree_init(enum object_type *type, struct strbuf *base,
 			  const char *pathname, unsigned mode, int *retval)
 {
@@ -79,6 +172,38 @@ static int show_tree_init(enum object_type *type, struct strbuf *base,
 	return 0;
 }
 
+static int show_tree_fmt(const struct object_id *oid, struct strbuf *base,
+			 const char *pathname, unsigned mode, void *context)
+{
+	struct read_tree_ls_tree_data *data = context;
+	struct expand_ls_tree_data my_data = {
+		.mode = mode,
+		.type = OBJ_BLOB,
+		.oid = oid,
+		.pathname = pathname,
+		.sb_scratch = &data->sb_scratch,
+		.sb_tmp = &data->sb_tmp,
+	};
+	struct strbuf sb = STRBUF_INIT;
+	int retval = 0;
+	size_t baselen;
+
+	if (show_tree_init(&my_data.type, base, pathname, mode, &retval))
+		return retval;
+
+	baselen = base->len;
+	strbuf_addstr(base, pathname);
+	strbuf_reset(&sb);
+	my_data.basebuf = base->buf;
+
+	strbuf_expand(&sb, data->format, expand_show_tree, &my_data);
+	strbuf_addch(&sb, line_termination);
+	fwrite(sb.buf, sb.len, 1, stdout);
+	strbuf_setlen(base, baselen);
+
+	return retval;
+}
+
 static int show_tree(const struct object_id *oid, struct strbuf *base,
 		const char *pathname, unsigned mode, void *context)
 {
@@ -125,6 +250,12 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	struct object_id oid;
 	struct tree *tree;
 	int i, full_tree = 0;
+	const char *implicit_format = NULL;
+	const char *format = NULL;
+	struct read_tree_ls_tree_data read_tree_cb_data = {
+		.sb_scratch = STRBUF_INIT,
+		.sb_tmp = STRBUF_INIT,
+	};
 	const struct option ls_tree_options[] = {
 		OPT_BIT('d', NULL, &ls_options, N_("only show trees"),
 			LS_TREE_ONLY),
@@ -145,9 +276,12 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 		OPT_BOOL(0, "full-tree", &full_tree,
 			 N_("list entire tree; not just current directory "
 			    "(implies --full-name)")),
+		OPT_STRING_F(0 , "format", &format, N_("format"),
+			     N_("format to use for the output"), PARSE_OPT_NONEG),
 		OPT__ABBREV(&abbrev),
 		OPT_END()
 	};
+	read_tree_fn_t fn = show_tree;
 
 	git_config(git_default_config, NULL);
 	ls_tree_prefix = prefix;
@@ -164,6 +298,18 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	if ( (LS_TREE_ONLY|LS_RECURSIVE) ==
 	    ((LS_TREE_ONLY|LS_RECURSIVE) & ls_options))
 		ls_options |= LS_SHOW_TREES;
+	if (ls_options & LS_NAME_ONLY)
+		implicit_format = ls_tree_format_n;
+	if (ls_options & LS_SHOW_SIZE)
+		implicit_format = ls_tree_format_l;
+
+	if (format && implicit_format)
+		usage_msg_opt(_("providing --format cannot be combined with other format-altering options"),
+			      ls_tree_usage, ls_tree_options);
+	if (implicit_format)
+		format = implicit_format;
+	if (!format)
+		format = ls_tree_format_d;
 
 	if (argc < 1)
 		usage_with_options(ls_tree_usage, ls_tree_options);
@@ -186,6 +332,25 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	tree = parse_tree_indirect(&oid);
 	if (!tree)
 		die("not a tree object");
+
+	/*
+	 * The generic show_tree_fmt() is slower than show_tree(), so
+	 * take the fast path if possible.
+	 */
+	if (format && (!strcmp(format, ls_tree_format_d) ||
+		       !strcmp(format, ls_tree_format_l) ||
+		       !strcmp(format, ls_tree_format_n)))
+		fn = show_tree;
+	else if (format)
+		fn = show_tree_fmt;
+	/*
+	 * Allow forcing the show_tree_fmt(), to test that it can
+	 * handle the test suite.
+	 */
+	if (git_env_bool("GIT_TEST_LS_TREE_FORMAT_BACKEND", 0))
+		fn = show_tree_fmt;
+
+	read_tree_cb_data.format = format;
 	return !!read_tree(the_repository, tree,
-			   &pathspec, show_tree, NULL);
+			   &pathspec, fn, &read_tree_cb_data);
 }
diff --git a/t/t3105-ls-tree-format.sh b/t/t3105-ls-tree-format.sh
new file mode 100755
index 00000000000..79817260ce8
--- /dev/null
+++ b/t/t3105-ls-tree-format.sh
@@ -0,0 +1,49 @@
+#!/bin/sh
+
+test_description='ls-tree --format'
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+test_expect_success 'ls-tree --format usage' '
+	test_expect_code 129 git ls-tree --format=fmt -l &&
+	test_expect_code 129 git ls-tree --format=fmt --name-only &&
+	test_expect_code 129 git ls-tree --format=fmt --name-status
+'
+
+test_expect_success 'setup' '
+	mkdir dir &&
+	test_commit dir/sub-file &&
+	test_commit top-file
+'
+
+test_ls_tree_format () {
+	format=$1 &&
+	opts=$2 &&
+	shift 2 &&
+	git ls-tree $opts -r HEAD >expect.raw &&
+	sed "s/^/> /" >expect <expect.raw &&
+	git ls-tree --format="> $format" -r HEAD >actual &&
+	test_cmp expect actual
+}
+
+test_expect_success 'ls-tree --format=<default-like>' '
+	test_ls_tree_format \
+		"%(objectmode) %(objecttype) %(objectname)%x09%(path)" \
+		""
+'
+
+test_expect_success 'ls-tree --format=<long-like>' '
+	test_ls_tree_format \
+		"%(objectmode) %(objecttype) %(objectname) %(objectsize:padded)%x09%(path)" \
+		"--long"
+'
+
+test_expect_success 'ls-tree --format=<name-only-like>' '
+	test_ls_tree_format \
+		"%(path)" \
+		"--name-only"
+
+'
+
+test_done
-- 
2.34.1.1119.g7a3fc8778ee


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [RFC PATCH 7/7] ls-tree.c: support `--object-only` option for "git-ls-tree"
  2021-12-17 13:30           ` [RFC PATCH 0/7] ls-tree --format Ævar Arnfjörð Bjarmason
                               ` (5 preceding siblings ...)
  2021-12-17 13:30             ` [RFC PATCH 6/7] ls-tree: add a --format=<fmt> option Ævar Arnfjörð Bjarmason
@ 2021-12-17 13:30             ` Ævar Arnfjörð Bjarmason
  6 siblings, 0 replies; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-17 13:30 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Teng Long, Ævar Arnfjörð Bjarmason

From: Teng Long <dyroneteng@gmail.com>

We usually pipe the output from `git ls-tree` to tools like
`sed` or `cut` when we only want to extract some fields.

When we want only the pathname component, we can pass the
`--name-only` option to avoid such a pipeline, but there are no
options for extracting other fields.
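
For context, the field extraction such a pipeline performs can be
sketched in C as below. This is only an illustration under the
assumption that the input is one line of the default output,
"<mode> SP <type> SP <object> TAB <file>"; the helper name
object_field() is hypothetical and it does not handle the "-l" layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Given one line of default `git ls-tree` output, point *start at the
 * object name and return its length. Returns 0 if the line does not
 * have two spaces before the first TAB.
 */
static size_t object_field(const char *line, const char **start)
{
	const char *tab, *sp1, *sp2;

	assert(line && start);
	tab = strchr(line, '\t');
	sp1 = strchr(line, ' ');
	if (!tab || !sp1 || sp1 > tab)
		return 0;
	sp2 = strchr(sp1 + 1, ' ');
	if (!sp2 || sp2 > tab)
		return 0;
	*start = sp2 + 1;
	return (size_t)(tab - (sp2 + 1));
}
```

The search stops at the first TAB so that spaces inside the path are
never mistaken for field separators.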

Teach the "--object-only" option to the command to only show the
object name. This option cannot be used together with
"--name-only" or "--long" (mutually exclusive).

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Documentation/git-ls-tree.txt |  7 ++++-
 builtin/ls-tree.c             |  6 +++++
 t/t3103-ls-tree-misc.sh       |  8 ++++++
 t/t3104-ls-tree-oid.sh        | 51 +++++++++++++++++++++++++++++++++++
 4 files changed, 71 insertions(+), 1 deletion(-)
 create mode 100755 t/t3104-ls-tree-oid.sh

diff --git a/Documentation/git-ls-tree.txt b/Documentation/git-ls-tree.txt
index db02d6d79a9..729370f2357 100644
--- a/Documentation/git-ls-tree.txt
+++ b/Documentation/git-ls-tree.txt
@@ -10,7 +10,7 @@ SYNOPSIS
 --------
 [verse]
 'git ls-tree' [-d] [-r] [-t] [-l] [-z]
-	    [--name-only] [--name-status] [--full-name] [--full-tree] [--abbrev[=<n>]]
+	    [--name-only] [--name-status] [--object-only] [--full-name] [--full-tree] [--abbrev[=<n>]]
 	    <tree-ish> [<path>...]
 
 DESCRIPTION
@@ -59,6 +59,11 @@ OPTIONS
 --name-only::
 --name-status::
 	List only filenames (instead of the "long" output), one per line.
+	Cannot be combined with `--object-only`.
+
+--object-only::
+	List only names of the objects, one per line. Cannot be combined
+	with `--name-only` or `--name-status`.
 
 --abbrev[=<n>]::
 	Instead of showing the full 40-byte hexadecimal object
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index efd85cab088..f19b0138362 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -20,6 +20,7 @@ static int line_termination = '\n';
 #define LS_SHOW_TREES 4
 #define LS_NAME_ONLY 8
 #define LS_SHOW_SIZE 16
+#define LS_OBJECT_ONLY 32
 static int abbrev;
 static int ls_options;
 static struct pathspec pathspec;
@@ -31,6 +32,7 @@ static const char *ls_tree_prefix;
  */
 static const char *ls_tree_format_d = "%(objectmode) %(objecttype) %(objectname)%x09%(path)";
 static const char *ls_tree_format_l = "%(objectmode) %(objecttype) %(objectname) %(objectsize:padded)%x09%(path)";
+static const char *ls_tree_format_o = "%(objectname)";
 static const char *ls_tree_format_n = "%(path)";
 
 static const  char * const ls_tree_usage[] = {
@@ -271,6 +273,8 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 			LS_NAME_ONLY),
 		OPT_BIT(0, "name-status", &ls_options, N_("list only filenames"),
 			LS_NAME_ONLY),
+		OPT_BIT(0, "object-only", &ls_options, N_("list only objects"),
+			LS_OBJECT_ONLY),
 		OPT_SET_INT(0, "full-name", &chomp_prefix,
 			    N_("use full path names"), 0),
 		OPT_BOOL(0, "full-tree", &full_tree,
@@ -302,6 +306,8 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 		implicit_format = ls_tree_format_n;
 	if (ls_options & LS_SHOW_SIZE)
 		implicit_format = ls_tree_format_l;
+	if (ls_options & LS_OBJECT_ONLY)
+		implicit_format = ls_tree_format_o;
 
 	if (format && implicit_format)
 		usage_msg_opt(_("providing --format cannot be combined with other format-altering options"),
diff --git a/t/t3103-ls-tree-misc.sh b/t/t3103-ls-tree-misc.sh
index d18ba1bd84b..a8641706a6e 100755
--- a/t/t3103-ls-tree-misc.sh
+++ b/t/t3103-ls-tree-misc.sh
@@ -23,4 +23,12 @@ test_expect_success 'ls-tree fails with non-zero exit code on broken tree' '
 	test_must_fail git ls-tree -r HEAD
 '
 
+test_expect_success 'usage: incompatible options: --name-status with --long' '
+	test_expect_code 129 git ls-tree --long --name-status
+'
+
+test_expect_success 'usage: incompatible options: --name-only with --long' '
+	test_expect_code 129 git ls-tree --long --name-only
+'
+
 test_done
diff --git a/t/t3104-ls-tree-oid.sh b/t/t3104-ls-tree-oid.sh
new file mode 100755
index 00000000000..81304e7b13a
--- /dev/null
+++ b/t/t3104-ls-tree-oid.sh
@@ -0,0 +1,51 @@
+#!/bin/sh
+
+test_description='git ls-tree objects handling.'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	test_commit A &&
+	test_commit B &&
+	mkdir -p C &&
+	test_commit C/D.txt &&
+	find *.txt path* \( -type f -o -type l \) -print |
+	xargs git update-index --add &&
+	tree=$(git write-tree) &&
+	echo $tree
+'
+
+test_expect_success 'usage: --object-only' '
+	git ls-tree --object-only $tree >current &&
+	git ls-tree $tree >result &&
+	cut -f1 result | cut -d " " -f3 >expected &&
+	test_cmp current expected
+'
+
+test_expect_success 'usage: --object-only with -r' '
+	git ls-tree --object-only -r $tree >current &&
+	git ls-tree -r $tree >result &&
+	cut -f1 result | cut -d " " -f3 >expected &&
+	test_cmp current expected
+'
+
+test_expect_success 'usage: --object-only with --abbrev' '
+	git ls-tree --object-only --abbrev=6 $tree >current &&
+	git ls-tree --abbrev=6 $tree >result &&
+	cut -f1 result | cut -d " " -f3 >expected &&
+	test_cmp current expected
+'
+
+test_expect_success 'usage: incompatible options: --name-only with --object-only' '
+	test_expect_code 129 git ls-tree --object-only --name-only
+'
+
+test_expect_success 'usage: incompatible options: --name-status with --object-only' '
+	test_expect_code 129 git ls-tree --object-only --name-status
+'
+
+test_expect_success 'usage: incompatible options: --long with --object-only' '
+	test_expect_code 129 git ls-tree --object-only --long
+'
+
+test_done
-- 
2.34.1.1119.g7a3fc8778ee


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v8 0/8] ls-tree: "--object-only" and "--format" opts
  2021-12-17  6:57         ` [PATCH v6 0/1] " Teng Long
  2021-12-17  6:57           ` [PATCH v6 1/1] ls-tree.c: " Teng Long
  2021-12-17 13:30           ` [RFC PATCH 0/7] ls-tree --format Ævar Arnfjörð Bjarmason
@ 2022-01-01 13:50           ` Teng Long
  2022-01-01 13:50             ` [PATCH v8 1/8] ls-tree: remove commented-out code Teng Long
                               ` (8 more replies)
  2 siblings, 9 replies; 224+ messages in thread
From: Teng Long @ 2022-01-01 13:50 UTC (permalink / raw)
  To: dyroneteng; +Cc: avarab, congdanhqx, git, gitster, peff, tenglong.tl

Changes since v6 (the original series) and v7 (the RFC by Ævar):

1. [v6] Performance Regression

In v6, Ævar pointed out that there is a nearly 10% performance
regression on the linux repository [1]. The cause is that in v6 I chose
to use bitwise operations to check, field by field, whether each field
should be printed, which split the original output into many separate
"printf" calls that together build the final line. Some of those checks
are unnecessary: for example, we check whether to print the "mode" and
the "type", but we do not really need to, because printing only those
fields is meaningless.

So in commit cb881183cb of this series, I kept part of the bitwise
logic in "show_tree", because I think it is more intuitive than before,
and moved the original logic into the function "show_default". The
procedure in "show_tree" is now clearer: first call "show_tree_init",
then check whether only the object name or only the file name was
requested, and otherwise print the default format. With this change the
performance regression is gone; here are the benchmark results on the
linux repository in my environment:

    $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r HEAD"
    Benchmark 1: /opt/git/master/bin/git ls-tree -r HEAD
    Time (mean ± σ):     105.8 ms ±   2.7 ms    [User: 85.7 ms, System: 20.0 ms]
    Range (min … max):   101.5 ms … 111.3 ms    28 runs
    
    $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r HEAD"
    Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r HEAD
    Time (mean ± σ):     105.0 ms ±   3.0 ms    [User: 83.7 ms, System: 21.2 ms]
    Range (min … max):    99.3 ms … 109.5 ms    27 runs
    
    $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r -l HEAD"
    Benchmark 1: /opt/git/master/bin/git ls-tree -r -l HEAD
    Time (mean ± σ):     337.4 ms ±  10.9 ms    [User: 308.3 ms, System: 29.0 ms]
    Range (min … max):   323.0 ms … 355.0 ms    10 runs
    
    $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r -l HEAD"
    Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r -l HEAD
    Time (mean ± σ):     337.6 ms ±   6.2 ms    [User: 309.4 ms, System: 28.1 ms]
    Range (min … max):   330.4 ms … 349.9 ms    10 runs
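
The dispatch in "show_tree" described in item 1 can be sketched as
follows. This is a minimal illustration, not the code from the series:
the SHOW_* values are assumed (their definitions are not part of this
message), and "enum printer" and pick_printer() are hypothetical names
standing in for the three printing paths.

```c
#include <assert.h>

/* Assumed flag values; the real definitions are not shown here. */
enum {
	SHOW_FILE_NAME = 1 << 0,
	SHOW_SIZE = 1 << 1,
	SHOW_OBJECT_NAME = 1 << 2,
	SHOW_TYPE = 1 << 3,
	SHOW_MODE = 1 << 4,
	SHOW_DEFAULT = SHOW_FILE_NAME | SHOW_OBJECT_NAME | SHOW_TYPE | SHOW_MODE
};

/* Hypothetical labels for the printing paths. */
enum printer {
	PRINT_NOTHING,
	PRINT_OBJECT_ONLY,
	PRINT_NAME_ONLY,
	PRINT_DEFAULT
};

static enum printer pick_printer(unsigned shown_bits)
{
	/* "!(a ^ b)" is true exactly when a == b. */
	assert(!(shown_bits ^ SHOW_DEFAULT) == (shown_bits == SHOW_DEFAULT));

	if (!(shown_bits ^ SHOW_OBJECT_NAME))
		return PRINT_OBJECT_ONLY;
	if (!(shown_bits ^ SHOW_FILE_NAME))
		return PRINT_NAME_ONLY;
	if (!(shown_bits ^ SHOW_DEFAULT) ||
	    !(shown_bits ^ (SHOW_DEFAULT | SHOW_SIZE)))
		return PRINT_DEFAULT; /* show_default() covers both layouts */
	return PRINT_NOTHING;
}
```

Because each test compares the whole bit set at once, an entry costs at
most a few integer comparisons plus one printing call, instead of one
check and one "printf" per field as in v6.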

  2. [v6] Bugs in "t3104"

  Ævar found [2] that I forgot to supply a <tree-ish> in the tests. That is
  clearly a bug, and it is fixed in this series.

  3. [RFC v7 by Ævar] The preparatory commits

  Ævar contributed some preparatory commits in v7: 2fcff7e0d4, 6fd1dd9383,
  208654b5e2, 2637464fd8 and d77c895a4b. I think they are all reasonable, so I
  cherry-picked them into this series and based the rest of the work on them.

  My own commits come next. The first adds support for `--object-only`, as
  mentioned above. The second introduces a "shown_data" struct so that the
  following "--format" commit can reuse it. The last one (ls-tree.c: introduce
  "--format" option) adds the "--format" option.
  
  Ævar posted a commit that mainly adds "--format" support in RFC v7 [3], along
  with some design and performance-test context. My commit is based on Ævar's
  (I'm not sure whether I should credit Ævar in the commit message, because I
  only made some modifications and the idea is his), with the following changes:

      1).  Changed the format field names. The original and current names are:

          objectmode -> mode
          objecttype -> type
          objectname -> object
          path -> file

          The original names are fine; I just prefer names that are simpler to
          remember and type. In addition, the "Output Format" section of the current
          Documentation/git-ls-tree.txt describes the format as
          "<mode> SP <type> SP <object> TAB <file>".

          I think the names with the "object" prefix come from
          Documentation/git-for-each-ref.txt. There, "objectname" is not redundant,
          because it has to be distinguished from "authorname" and "refname". In
          "git-ls-tree" the prefix does not currently seem necessary, but I am not
          sure about the naming rules and may be missing something.
      
      2).  OPT_CMDMODE and OPT_BIT:

          I noticed that Ævar uses "OPT_BIT" in his patch, while I use "OPT_CMDMODE"
          (which Ævar also taught me about). Both appear to support making options
          mutually exclusive. I did not switch to "OPT_BIT" because "OPT_CMDMODE"
          seems to work well; please tell me if I have misunderstood.
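
The difference between the two approaches can be sketched without the
parse-options machinery. This is an illustration only: set_cmdmode() is
a hypothetical helper, the order of the MODE_* values after
MODE_NAME_ONLY is assumed, and the comparison with "OPT_BIT" reflects my
reading of how bit options combine rather than anything stated in this
thread.

```c
#include <assert.h>

enum {
	MODE_UNSPECIFIED = 0,
	MODE_NAME_ONLY,
	MODE_OBJECT_ONLY, /* order of the remaining values is assumed */
	MODE_LONG
};

/*
 * Record a mode option in *cmdmode. Returns -1 if a different mode was
 * already chosen, which is how a "command mode" makes the options
 * mutually exclusive at parse time.
 */
static int set_cmdmode(int *cmdmode, int requested)
{
	assert(requested != MODE_UNSPECIFIED);
	if (*cmdmode != MODE_UNSPECIFIED && *cmdmode != requested)
		return -1;
	*cmdmode = requested;
	return 0;
}
```

With independent bits, two options simply OR together and the conflict
has to be detected by a separate check after parsing; with a single mode
variable the second, different option is rejected as soon as it is seen.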

2. Performance comparison between "master" and v8

      1). Default format ("git ls-tree -r" vs "hitting builtin formats" vs "missing builtin formats")

        $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r HEAD"
        Benchmark 1: /opt/git/master/bin/git ls-tree -r HEAD
        Time (mean ± σ):     105.2 ms ±   3.3 ms    [User: 84.3 ms, System: 20.8 ms]
        Range (min … max):    99.2 ms … 113.2 ms    28 runs
    
        $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object)%x09%(file)'  HEAD"
        Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object)%x09%(file)'  HEAD
        Time (mean ± σ):     106.4 ms ±   2.7 ms    [User: 86.1 ms, System: 20.2 ms]
        Range (min … max):   100.2 ms … 110.5 ms    29 runs

        $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='> %(mode) %(type) %(object)%x09%(file)'  HEAD"
        Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='> %(mode) %(type) %(object)%x09%(file)'  HEAD
        Time (mean ± σ):     145.3 ms ±   3.9 ms    [User: 119.0 ms, System: 26.2 ms]
        Range (min … max):   139.7 ms … 150.8 ms    20 runs

      2). Default format including the object size ("git ls-tree -r -l" vs "hitting builtin formats" vs "missing builtin formats")

        $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r -l HEAD"
        Benchmark 1: /opt/git/master/bin/git ls-tree -r -l HEAD
        Time (mean ± σ):     335.1 ms ±   6.5 ms    [User: 304.6 ms, System: 30.4 ms]
        Range (min … max):   327.5 ms … 348.4 ms    10 runs
    
        $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object) %(size:padded)%x09%(file)'  HEAD"
        Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object) %(size:padded)%x09%(file)'  HEAD
        Time (mean ± σ):     337.2 ms ±   8.2 ms    [User: 309.2 ms, System: 27.9 ms]
        Range (min … max):   328.8 ms … 349.4 ms    10 runs

        $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='> %(mode) %(type) %(object) %(size:padded)%x09%(file)'  HEAD"
        Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='> %(mode) %(type) %(object) %(size:padded)%x09%(file)'  HEAD
        Time (mean ± σ):     396.9 ms ±   8.9 ms    [User: 364.2 ms, System: 32.7 ms]
        Range (min … max):   379.6 ms … 408.6 ms    10 runs
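
The "hit" and "miss" cases above differ in whether the given format
string is recognized as one of the builtin layouts. A minimal sketch of
such a check, assuming an exact string comparison as in the RFC's fast
path: the two strings are the ones used in the benchmark commands above,
and the table and is_builtin_format() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Formats taken from the benchmark commands; the table is illustrative. */
static const char *builtin_formats[] = {
	"%(mode) %(type) %(object)%x09%(file)",
	"%(mode) %(type) %(object) %(size:padded)%x09%(file)",
};

/* Returns 1 if "format" exactly matches a builtin layout, else 0. */
static int is_builtin_format(const char *format)
{
	size_t i;

	assert(format);
	for (i = 0; i < sizeof(builtin_formats) / sizeof(builtin_formats[0]); i++)
		if (!strcmp(format, builtin_formats[i]))
			return 1;
	return 0;
}
```

A format that matches can be served by the direct printing code, which
is why the second benchmark in each group stays close to "master", while
a format with even a "> " prefix falls through to the generic expansion
and pays the extra cost seen in the third benchmark.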

Thanks.

[1] https://public-inbox.org/git/RFC-cover-0.7-00000000000-20211217T131635Z-avarab@gmail.com/
[2] https://public-inbox.org/git/211217.86o85f8jey.gmgdl@evledraar.gmail.com/
[3] https://public-inbox.org/git/RFC-patch-6.7-eac299f06ff-20211217T131635Z-avarab@gmail.com/

Teng Long (3):
  ls-tree.c: support --object-only option for "git-ls-tree"
  ls-tree.c: introduce struct "shown_data"
  ls-tree.c: introduce "--format" option

Ævar Arnfjörð Bjarmason (5):
  ls-tree: remove commented-out code
  ls-tree: add missing braces to "else" arms
  ls-tree: use "enum object_type", not {blob,tree,commit}_type
  ls-tree: use "size_t", not "int" for "struct strbuf"'s "len"
  ls-tree: split up the "init" part of show_tree()

 Documentation/git-ls-tree.txt |  55 +++++-
 builtin/ls-tree.c             | 315 +++++++++++++++++++++++++++-------
 t/t3104-ls-tree-oid.sh        |  51 ++++++
 t/t3105-ls-tree-format.sh     |  55 ++++++
 4 files changed, 415 insertions(+), 61 deletions(-)
 create mode 100755 t/t3104-ls-tree-oid.sh
 create mode 100755 t/t3105-ls-tree-format.sh

Range-diff against v7:
-:  ---------- > 1:  2fcff7e0d4 ls-tree: remove commented-out code
-:  ---------- > 2:  6fd1dd9383 ls-tree: add missing braces to "else" arms
-:  ---------- > 3:  208654b5e2 ls-tree: use "enum object_type", not {blob,tree,commit}_type
-:  ---------- > 4:  2637464fd8 ls-tree: use "size_t", not "int" for "struct strbuf"'s "len"
-:  ---------- > 5:  d77c895a4b ls-tree: split up the "init" part of show_tree()
1:  2e449d1c79 ! 6:  cb881183cb ls-tree.c: support `--object-only` option for "git-ls-tree"
    @@ Metadata
     Author: Teng Long <dyroneteng@gmail.com>
     
      ## Commit message ##
    -    ls-tree.c: support `--object-only` option for "git-ls-tree"
    +    ls-tree.c: support --object-only option for "git-ls-tree"
     
         We usually pipe the output from `git ls-trees` to tools like
         `sed` or `cut` when we only want to extract some fields.
    @@ Commit message
     
         Teach the "--object-only" option to the command to only show the
         object name. This option cannot be used together with
    -    "--name-only" or "--long" (mutually exclusive).
    +    "--name-only" or "--long"; they are mutually exclusive (previously
    +    "--name-only" and "--long" could be combined, and this commit
    +    fixes that bug along the way).
    +
    +    A simple refactoring was done to the "show_tree" function, which now
    +    uses bitwise operations to recognize the format for printing to
    +    stdout. The reason for doing this is that we don't want the
    +    addition of "--object-only" to make the code harder to read;
    +    this makes this part of the logic easier to read and extend.
    +
    +    In terms of performance, there is no loss comparing to the
    +    "master" (2ae0a9cb8298185a94e5998086f380a355dd8907), here are the
    +    results of the performance tests in my environment based on linux
    +    repository:
    +
    +        $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r HEAD"
    +        Benchmark 1: /opt/git/master/bin/git ls-tree -r HEAD
    +        Time (mean ± σ):     105.8 ms ±   2.7 ms    [User: 85.7 ms, System: 20.0 ms]
    +        Range (min … max):   101.5 ms … 111.3 ms    28 runs
    +
    +        $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r HEAD"
    +        Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r HEAD
    +        Time (mean ± σ):     105.0 ms ±   3.0 ms    [User: 83.7 ms, System: 21.2 ms]
    +        Range (min … max):    99.3 ms … 109.5 ms    27 runs
    +
    +        $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r -l HEAD"
    +        Benchmark 1: /opt/git/master/bin/git ls-tree -r -l HEAD
    +        Time (mean ± σ):     337.4 ms ±  10.9 ms    [User: 308.3 ms, System: 29.0 ms]
    +        Range (min … max):   323.0 ms … 355.0 ms    10 runs
    +
    +        $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r -l HEAD"
    +        Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r -l HEAD
    +        Time (mean ± σ):     337.6 ms ±   6.2 ms    [User: 309.4 ms, System: 28.1 ms]
    +        Range (min … max):   330.4 ms … 349.9 ms    10 runs
     
         Signed-off-by: Teng Long <dyroneteng@gmail.com>
     
    @@ builtin/ls-tree.c
      	NULL
      };
      
    +-static int show_recursive(const char *base, size_t baselen, const char *pathname)
     +enum {
     +	MODE_UNSPECIFIED = 0,
     +	MODE_NAME_ONLY,
    @@ builtin/ls-tree.c
     +
     +static int cmdmode = MODE_UNSPECIFIED;
     +
    - static int show_recursive(const char *base, int baselen, const char *pathname)
    ++static int parse_shown_fields(void)
    ++{
    ++	if (cmdmode == MODE_NAME_ONLY) {
    ++		shown_bits = SHOW_FILE_NAME;
    ++		return 0;
    ++	}
    ++	if (cmdmode == MODE_OBJECT_ONLY) {
    ++		shown_bits = SHOW_OBJECT_NAME;
    ++		return 0;
    ++	}
    ++	if (!ls_options || (ls_options & LS_RECURSIVE)
    ++	    || (ls_options & LS_SHOW_TREES)
    ++	    || (ls_options & LS_TREE_ONLY))
    ++		shown_bits = SHOW_DEFAULT;
    ++	if (cmdmode == MODE_LONG)
    ++		shown_bits = SHOW_DEFAULT | SHOW_SIZE;
    ++	return 1;
    ++}
    ++
    ++static int show_recursive(const char *base, size_t baselen,
    ++			  const char *pathname)
      {
      	int i;
    -@@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strbuf *base,
    - {
    - 	int retval = 0;
    - 	int baselen;
    -+	int interspace = 0;
    - 	const char *type = blob_type;
      
    - 	if (S_ISGITLINK(mode)) {
    -@@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strbuf *base,
    - 		 *
    - 		 * Something similar to this incomplete example:
    - 		 *
    --		if (show_subprojects(base, baselen, pathname))
    --			retval = READ_TREE_RECURSIVE;
    -+		 * if (show_subprojects(base, baselen, pathname))
    -+		 *	retval = READ_TREE_RECURSIVE;
    - 		 *
    - 		 */
    - 		type = commit_type;
    +@@ builtin/ls-tree.c: static int show_recursive(const char *base, size_t baselen, const char *pathname
    + 	return 0;
    + }
    + 
    ++static int show_default(const struct object_id *oid, enum object_type type,
    ++			const char *pathname, unsigned mode,
    ++			struct strbuf *base)
    ++{
    ++	size_t baselen = base->len;
    ++
    ++	if (shown_bits & SHOW_SIZE) {
    ++		char size_text[24];
    ++		if (type == OBJ_BLOB) {
    ++			unsigned long size;
    ++			if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
    ++				xsnprintf(size_text, sizeof(size_text), "BAD");
    ++			else
    ++				xsnprintf(size_text, sizeof(size_text),
    ++					  "%" PRIuMAX, (uintmax_t)size);
    ++		} else {
    ++			xsnprintf(size_text, sizeof(size_text), "-");
    ++		}
    ++		printf("%06o %s %s %7s\t", mode, type_name(type),
    ++		find_unique_abbrev(oid, abbrev), size_text);
    ++	} else {
    ++		printf("%06o %s %s\t", mode, type_name(type),
    ++		find_unique_abbrev(oid, abbrev));
    ++	}
    ++	baselen = base->len;
    ++	strbuf_addstr(base, pathname);
    ++	write_name_quoted_relative(base->buf,
    ++				   chomp_prefix ? ls_tree_prefix : NULL, stdout,
    ++				   line_termination);
    ++	strbuf_setlen(base, baselen);
    ++	return 1;
    ++}
    ++
    + static int show_tree_init(enum object_type *type, struct strbuf *base,
    + 			  const char *pathname, unsigned mode, int *retval)
    + {
     @@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strbuf *base,
    - 	else if (ls_options & LS_TREE_ONLY)
    - 		return 0;
    + 	if (show_tree_init(&type, base, pathname, mode, &retval))
    + 		return retval;
      
     -	if (!(ls_options & LS_NAME_ONLY)) {
     -		if (ls_options & LS_SHOW_SIZE) {
     -			char size_text[24];
    --			if (!strcmp(type, blob_type)) {
    +-			if (type == OBJ_BLOB) {
     -				unsigned long size;
     -				if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
     -					xsnprintf(size_text, sizeof(size_text),
    @@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strb
     -				else
     -					xsnprintf(size_text, sizeof(size_text),
     -						  "%"PRIuMAX, (uintmax_t)size);
    --			} else
    +-			} else {
     -				xsnprintf(size_text, sizeof(size_text), "-");
    --			printf("%06o %s %s %7s\t", mode, type,
    +-			}
    +-			printf("%06o %s %s %7s\t", mode, type_name(type),
     -			       find_unique_abbrev(oid, abbrev),
     -			       size_text);
    -+	if (shown_bits & SHOW_MODE) {
    -+		printf("%06o", mode);
    -+		interspace = 1;
    -+	}
    -+	if (shown_bits & SHOW_TYPE) {
    -+		printf("%s%s", interspace ? " " : "", type);
    -+		interspace = 1;
    -+	}
    -+	if (shown_bits & SHOW_OBJECT_NAME) {
    -+		printf("%s%s", interspace ? " " : "",
    -+		       find_unique_abbrev(oid, abbrev));
    -+		if (!(shown_bits ^ SHOW_OBJECT_NAME))
    -+			goto LINE_FINISH;
    -+		interspace = 1;
    -+	}
    -+	if (shown_bits & SHOW_SIZE) {
    -+		char size_text[24];
    -+		if (!strcmp(type, blob_type)) {
    -+			unsigned long size;
    -+			if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
    -+				xsnprintf(size_text, sizeof(size_text), "BAD");
    -+			else
    -+				xsnprintf(size_text, sizeof(size_text),
    -+					  "%"PRIuMAX, (uintmax_t)size);
    - 		} else
    --			printf("%06o %s %s\t", mode, type,
    +-		} else {
    +-			printf("%06o %s %s\t", mode, type_name(type),
     -			       find_unique_abbrev(oid, abbrev));
    -+			xsnprintf(size_text, sizeof(size_text), "-");
    -+		printf("%s%7s", interspace ? " " : "", size_text);
    -+		interspace = 1;
    -+	}
    -+	if (shown_bits & SHOW_FILE_NAME) {
    -+		if (interspace)
    -+			printf("\t");
    -+		baselen = base->len;
    -+		strbuf_addstr(base, pathname);
    -+		write_name_quoted_relative(base->buf,
    -+					   chomp_prefix ? ls_tree_prefix : NULL,
    -+					   stdout,
    -+					   line_termination
    -+					   ? CQ_NO_TERMINATOR_C_QUOTED
    -+					   : CQ_NO_TERMINATOR_AS_IS);
    -+		strbuf_setlen(base, baselen);
    +-		}
    ++	if (!(shown_bits ^ SHOW_OBJECT_NAME)) {
    ++		printf("%s%c", find_unique_abbrev(oid, abbrev), line_termination);
    ++		return retval;
      	}
     -	baselen = base->len;
     -	strbuf_addstr(base, pathname);
    @@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strb
     -				   stdout, line_termination);
     -	strbuf_setlen(base, baselen);
     +
    -+LINE_FINISH:
    -+	putchar(line_termination);
    ++	if (!(shown_bits ^ SHOW_FILE_NAME)) {
    ++		baselen = base->len;
    ++		strbuf_addstr(base, pathname);
    ++		write_name_quoted_relative(base->buf,
    ++					   chomp_prefix ? ls_tree_prefix : NULL,
    ++					   stdout, line_termination);
    ++		strbuf_setlen(base, baselen);
    ++	}
    ++
    ++	if (!(shown_bits ^ SHOW_DEFAULT) ||
    ++	    !(shown_bits ^ (SHOW_DEFAULT | SHOW_SIZE)))
    ++		show_default(oid, type, pathname, mode, base);
    ++
      	return retval;
      }
      
    -+static int parse_shown_fields(void)
    -+{
    -+	if (cmdmode == MODE_NAME_ONLY) {
    -+		shown_bits = SHOW_FILE_NAME;
    -+		return 0;
    -+	}
    -+	if (cmdmode == MODE_OBJECT_ONLY) {
    -+		shown_bits = SHOW_OBJECT_NAME;
    -+		return 0;
    -+	}
    -+	if (!ls_options || (ls_options & LS_RECURSIVE)
    -+	    || (ls_options & LS_SHOW_TREES)
    -+	    || (ls_options & LS_TREE_ONLY))
    -+		shown_bits = SHOW_DEFAULT;
    -+	if (cmdmode == MODE_LONG)
    -+		shown_bits = SHOW_DEFAULT | SHOW_SIZE;
    -+	return 1;
    -+}
    -+
    - int cmd_ls_tree(int argc, const char **argv, const char *prefix)
    - {
    - 	struct object_id oid;
     @@ builtin/ls-tree.c: int cmd_ls_tree(int argc, const char **argv, const char *prefix)
      			LS_SHOW_TREES),
      		OPT_SET_INT('z', NULL, &line_termination,
    @@ builtin/ls-tree.c: int cmd_ls_tree(int argc, const char **argv, const char *pref
      		die("Not a valid object name %s", argv[0]);
      
     +	parse_shown_fields();
    ++
      	/*
      	 * show_recursive() rolls its own matching code and is
      	 * generally ignorant of 'struct pathspec'. The magic mask
     
    - ## quote.c ##
    -@@ quote.c: void quote_two_c_style(struct strbuf *sb, const char *prefix, const char *path,
    - 
    - void write_name_quoted(const char *name, FILE *fp, int terminator)
    - {
    --	if (terminator) {
    -+	if (0 < terminator || terminator == CQ_NO_TERMINATOR_C_QUOTED)
    - 		quote_c_style(name, NULL, fp, 0);
    --	} else {
    -+	else
    - 		fputs(name, fp);
    --	}
    --	fputc(terminator, fp);
    -+	if (0 <= terminator)
    -+		fputc(terminator, fp);
    - }
    - 
    - void write_name_quoted_relative(const char *name, const char *prefix,
    -
    - ## quote.h ##
    -@@ quote.h: int unquote_c_style(struct strbuf *, const char *quoted, const char **endp);
    - #define CQUOTE_NODQ 01
    - size_t quote_c_style(const char *name, struct strbuf *, FILE *, unsigned);
    - void quote_two_c_style(struct strbuf *, const char *, const char *, unsigned);
    -+/*
    -+ * Write a name, typically a filename, followed by a terminator that
    -+ * separates it from what comes next.
    -+ * When terminator is NUL, the name is given as-is.  Otherwise, the
    -+ * name is c-quoted, suitable for text output.  HT and LF are typical
    -+ * values used for the terminator, but other positive values are possible.
    -+ *
    -+ * In addition to non-negative values two special values in terminator
    -+ * are possible.
    -+ *
    -+ * -1: show the name c-quoted, without adding any terminator.
    -+ * -2: show the name as-is, without adding any terminator.
    -+ */
    -+#define CQ_NO_TERMINATOR_C_QUOTED	(-1)
    -+#define CQ_NO_TERMINATOR_AS_IS		(-2)
    - 
    - void write_name_quoted(const char *name, FILE *, int terminator);
    -+/*
    -+ * Similar to the above, but the name is first made relative to the prefix
    -+ * before being shown.
    -+ */
    - void write_name_quoted_relative(const char *name, const char *prefix,
    - 				FILE *fp, int terminator);
    - 
    -
    - ## t/t3103-ls-tree-misc.sh ##
    -@@ t/t3103-ls-tree-misc.sh: test_expect_success 'ls-tree fails with non-zero exit code on broken tree' '
    - 	test_must_fail git ls-tree -r HEAD
    - '
    - 
    -+test_expect_success 'usage: incompatible options: --name-status with --long' '
    -+	test_expect_code 129 git ls-tree --long --name-status
    -+'
    -+
    -+test_expect_success 'usage: incompatible options: --name-only with --long' '
    -+	test_expect_code 129 git ls-tree --long --name-only
    -+'
    -+
    - test_done
    -
      ## t/t3104-ls-tree-oid.sh (new) ##
     @@
     +#!/bin/sh
    @@ t/t3104-ls-tree-oid.sh (new)
     +'
     +
     +test_expect_success 'usage: incompatible options: --name-only with --object-only' '
    -+	test_expect_code 129 git ls-tree --object-only --name-only
    ++	test_expect_code 129 git ls-tree --object-only --name-only $tree
     +'
     +
     +test_expect_success 'usage: incompatible options: --name-status with --object-only' '
    -+	test_expect_code 129 git ls-tree --object-only --name-status
    ++	test_expect_code 129 git ls-tree --object-only --name-status $tree
     +'
     +
     +test_expect_success 'usage: incompatible options: --long with --object-only' '
    -+	test_expect_code 129 git ls-tree --object-only --long
    ++	test_expect_code 129 git ls-tree --object-only --long $tree
     +'
     +
     +test_done
-:  ---------- > 7:  296ebacafe ls-tree.c: introduce struct "shown_data"
-:  ---------- > 8:  e0add802fb ls-tree.c: introduce "--format" option
-- 
2.33.0.rc1.1802.gbb1c3936fb.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v8 1/8] ls-tree: remove commented-out code
  2022-01-01 13:50           ` [PATCH v8 0/8] ls-tree: "--object-only" and "--format" opts Teng Long
@ 2022-01-01 13:50             ` Teng Long
  2022-01-01 13:50             ` [PATCH v8 2/8] ls-tree: add missing braces to "else" arms Teng Long
                               ` (7 subsequent siblings)
  8 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-01 13:50 UTC (permalink / raw)
  To: dyroneteng; +Cc: avarab, congdanhqx, git, gitster, peff, tenglong.tl

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Remove code added in f35a6d3bce7 (Teach core object handling functions
about gitlinks, 2007-04-09), later patched in 7d0b18a4da1 (Add output
flushing before fork(), 2008-08-04), and then finally ending up in its
current form in d3bee161fef (tree.c: allow read_tree_recursive() to
traverse gitlink entries, 2009-01-25). All while being commented-out!

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/ls-tree.c | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3a442631c7..5f7c84950c 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -69,15 +69,6 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	const char *type = blob_type;
 
 	if (S_ISGITLINK(mode)) {
-		/*
-		 * Maybe we want to have some recursive version here?
-		 *
-		 * Something similar to this incomplete example:
-		 *
-		if (show_subprojects(base, baselen, pathname))
-			retval = READ_TREE_RECURSIVE;
-		 *
-		 */
 		type = commit_type;
 	} else if (S_ISDIR(mode)) {
 		if (show_recursive(base->buf, base->len, pathname)) {
-- 
2.33.0.rc1.1802.gbb1c3936fb.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v8 2/8] ls-tree: add missing braces to "else" arms
  2022-01-01 13:50           ` [PATCH v8 0/8] ls-tree: "--object-only" and "--format" opts Teng Long
  2022-01-01 13:50             ` [PATCH v8 1/8] ls-tree: remove commented-out code Teng Long
@ 2022-01-01 13:50             ` Teng Long
  2022-01-01 13:50             ` [PATCH v8 3/8] ls-tree: use "enum object_type", not {blob,tree,commit}_type Teng Long
                               ` (6 subsequent siblings)
  8 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-01 13:50 UTC (permalink / raw)
  To: dyroneteng; +Cc: avarab, congdanhqx, git, gitster, peff, tenglong.tl

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Add missing {} to the "else" arms in show_tree() per the
CodingGuidelines.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/ls-tree.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 5f7c84950c..0a28f32ccb 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -92,14 +92,16 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 				else
 					xsnprintf(size_text, sizeof(size_text),
 						  "%"PRIuMAX, (uintmax_t)size);
-			} else
+			} else {
 				xsnprintf(size_text, sizeof(size_text), "-");
+			}
 			printf("%06o %s %s %7s\t", mode, type,
 			       find_unique_abbrev(oid, abbrev),
 			       size_text);
-		} else
+		} else {
 			printf("%06o %s %s\t", mode, type,
 			       find_unique_abbrev(oid, abbrev));
+		}
 	}
 	baselen = base->len;
 	strbuf_addstr(base, pathname);
-- 
2.33.0.rc1.1802.gbb1c3936fb.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v8 3/8] ls-tree: use "enum object_type", not {blob,tree,commit}_type
  2022-01-01 13:50           ` [PATCH v8 0/8] ls-tree: "--object-only" and "--format" opts Teng Long
  2022-01-01 13:50             ` [PATCH v8 1/8] ls-tree: remove commented-out code Teng Long
  2022-01-01 13:50             ` [PATCH v8 2/8] ls-tree: add missing braces to "else" arms Teng Long
@ 2022-01-01 13:50             ` Teng Long
  2022-01-01 13:50             ` [PATCH v8 4/8] ls-tree: use "size_t", not "int" for "struct strbuf"'s "len" Teng Long
                               ` (5 subsequent siblings)
  8 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-01 13:50 UTC (permalink / raw)
  To: dyroneteng; +Cc: avarab, congdanhqx, git, gitster, peff, tenglong.tl

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Change the ls-tree.c code to use type_name() on the enum instead of
using the string constants. This doesn't matter either way for
performance, but makes this a bit easier to read as we'll no longer
need a strcmp() here.
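
As a rough illustration of the readability point, here is a minimal
sketch. It assumes hypothetical stand-ins (the "sketch_" names) rather
than git's real "enum object_type" and type_name(), and only contrasts
the two styles of check:

```c
#include <string.h>

/* Hypothetical stand-ins for git's object types; not git's real definitions. */
enum sketch_object_type {
	SKETCH_OBJ_COMMIT = 1,
	SKETCH_OBJ_TREE,
	SKETCH_OBJ_BLOB,
};

/* Plays the role of type_name(): map the enum back to a printable name. */
static const char *sketch_type_name(enum sketch_object_type type)
{
	switch (type) {
	case SKETCH_OBJ_COMMIT:
		return "commit";
	case SKETCH_OBJ_TREE:
		return "tree";
	case SKETCH_OBJ_BLOB:
		return "blob";
	}
	return "unknown";
}

/* Before: the type is carried as a string, so every check is a strcmp(). */
static int is_blob_by_string(const char *type)
{
	return !strcmp(type, "blob");
}

/* After: the type is carried as an enum, so the check is a plain comparison. */
static int is_blob_by_enum(enum sketch_object_type type)
{
	return type == SKETCH_OBJ_BLOB;
}
```

The string form is only needed at the point of output, which is where
the patch calls type_name().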

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/ls-tree.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 0a28f32ccb..3f0225b097 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -66,17 +66,17 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 {
 	int retval = 0;
 	int baselen;
-	const char *type = blob_type;
+	enum object_type type = OBJ_BLOB;
 
 	if (S_ISGITLINK(mode)) {
-		type = commit_type;
+		type = OBJ_COMMIT;
 	} else if (S_ISDIR(mode)) {
 		if (show_recursive(base->buf, base->len, pathname)) {
 			retval = READ_TREE_RECURSIVE;
 			if (!(ls_options & LS_SHOW_TREES))
 				return retval;
 		}
-		type = tree_type;
+		type = OBJ_TREE;
 	}
 	else if (ls_options & LS_TREE_ONLY)
 		return 0;
@@ -84,7 +84,7 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	if (!(ls_options & LS_NAME_ONLY)) {
 		if (ls_options & LS_SHOW_SIZE) {
 			char size_text[24];
-			if (!strcmp(type, blob_type)) {
+			if (type == OBJ_BLOB) {
 				unsigned long size;
 				if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
 					xsnprintf(size_text, sizeof(size_text),
@@ -95,11 +95,11 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 			} else {
 				xsnprintf(size_text, sizeof(size_text), "-");
 			}
-			printf("%06o %s %s %7s\t", mode, type,
+			printf("%06o %s %s %7s\t", mode, type_name(type),
 			       find_unique_abbrev(oid, abbrev),
 			       size_text);
 		} else {
-			printf("%06o %s %s\t", mode, type,
+			printf("%06o %s %s\t", mode, type_name(type),
 			       find_unique_abbrev(oid, abbrev));
 		}
 	}
-- 
2.33.0.rc1.1802.gbb1c3936fb.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v8 4/8] ls-tree: use "size_t", not "int" for "struct strbuf"'s "len"
  2022-01-01 13:50           ` [PATCH v8 0/8] ls-tree: "--object-only" and "--format" opts Teng Long
                               ` (2 preceding siblings ...)
  2022-01-01 13:50             ` [PATCH v8 3/8] ls-tree: use "enum object_type", not {blob,tree,commit}_type Teng Long
@ 2022-01-01 13:50             ` Teng Long
  2022-01-01 13:50             ` [PATCH v8 5/8] ls-tree: split up the "init" part of show_tree() Teng Long
                               ` (4 subsequent siblings)
  8 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-01 13:50 UTC (permalink / raw)
  To: dyroneteng; +Cc: avarab, congdanhqx, git, gitster, peff, tenglong.tl

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

The "struct strbuf"'s "len" member is a "size_t", not an "int", so
let's change our corresponding types accordingly. This also changes
the "len" and "speclen" variables, which are likewise used to store
the return value of strlen(), which returns "size_t", not "int".
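
A small sketch of the idea, assuming a hypothetical helper that is
only loosely modelled on the prefix check in show_recursive(); it is
not the code from this patch. Lengths stay "size_t" from strlen()
through the comparison, so no signed/unsigned conversion is involved:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: does "spec" start with the first "baselen"
 * bytes of "base"?
 */
static int spec_starts_with_base(const char *spec, const char *base,
				 size_t baselen)
{
	size_t speclen = strlen(spec);	/* strlen() returns size_t */

	if (speclen < baselen)
		return 0;
	return !strncmp(base, spec, baselen);
}
```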

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/ls-tree.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3f0225b097..eecc7482d5 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -31,7 +31,7 @@ static const  char * const ls_tree_usage[] = {
 	NULL
 };
 
-static int show_recursive(const char *base, int baselen, const char *pathname)
+static int show_recursive(const char *base, size_t baselen, const char *pathname)
 {
 	int i;
 
@@ -43,7 +43,7 @@ static int show_recursive(const char *base, int baselen, const char *pathname)
 
 	for (i = 0; i < pathspec.nr; i++) {
 		const char *spec = pathspec.items[i].match;
-		int len, speclen;
+		size_t len, speclen;
 
 		if (strncmp(base, spec, baselen))
 			continue;
@@ -65,7 +65,7 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 		const char *pathname, unsigned mode, void *context)
 {
 	int retval = 0;
-	int baselen;
+	size_t baselen;
 	enum object_type type = OBJ_BLOB;
 
 	if (S_ISGITLINK(mode)) {
-- 
2.33.0.rc1.1802.gbb1c3936fb.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v8 5/8] ls-tree: split up the "init" part of show_tree()
  2022-01-01 13:50           ` [PATCH v8 0/8] ls-tree: "--object-only" and "--format" opts Teng Long
                               ` (3 preceding siblings ...)
  2022-01-01 13:50             ` [PATCH v8 4/8] ls-tree: use "size_t", not "int" for "struct strbuf"'s "len" Teng Long
@ 2022-01-01 13:50             ` Teng Long
  2022-01-04  2:06               ` Junio C Hamano
  2022-01-01 13:50             ` [PATCH v8 6/8] ls-tree.c: support --object-only option for "git-ls-tree" Teng Long
                               ` (3 subsequent siblings)
  8 siblings, 1 reply; 224+ messages in thread
From: Teng Long @ 2022-01-01 13:50 UTC (permalink / raw)
  To: dyroneteng; +Cc: avarab, congdanhqx, git, gitster, peff, tenglong.tl

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Split up the "init" part of the show_tree() function where we decide
what the "type" is, and whether we'll return early. This makes things
a bit less readable for now, but we'll soon re-use this in a sibling
function, and avoiding the duplication will be worth it.
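
The shape of the split can be sketched as follows. This is a
simplified, hypothetical example (the "sketch_" names, the flags and
the return code 2 are assumptions for illustration), not the code in
this patch: the "init" helper fills in the type and the return value
through pointers, and its own return value only says whether the
caller should stop early.

```c
#include <stdio.h>

enum sketch_kind { SKETCH_FILE, SKETCH_DIR };

/*
 * "init" half: classify the entry and decide whether the caller should
 * stop early. Returns 1 for "return *retval now", 0 for "keep going".
 */
static int sketch_init(enum sketch_kind *kind, int is_dir, int dirs_hidden,
		       int *retval)
{
	if (is_dir) {
		*kind = SKETCH_DIR;
		if (dirs_hidden) {
			*retval = 2; /* hypothetical "recurse into it" code */
			return 1;
		}
	}
	return 0;
}

/* "show" half: any sibling function can reuse sketch_init() the same way. */
static int sketch_show(const char *name, int is_dir, int dirs_hidden)
{
	int retval = 0;
	enum sketch_kind kind = SKETCH_FILE;

	if (sketch_init(&kind, is_dir, dirs_hidden, &retval))
		return retval;
	printf("%s %s\n", kind == SKETCH_DIR ? "dir" : "file", name);
	return retval;
}
```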

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/ls-tree.c | 30 +++++++++++++++++++-----------
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index eecc7482d5..df8312408d 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -61,25 +61,33 @@ static int show_recursive(const char *base, size_t baselen, const char *pathname
 	return 0;
 }
 
-static int show_tree(const struct object_id *oid, struct strbuf *base,
-		const char *pathname, unsigned mode, void *context)
+static int show_tree_init(enum object_type *type, struct strbuf *base,
+			  const char *pathname, unsigned mode, int *retval)
 {
-	int retval = 0;
-	size_t baselen;
-	enum object_type type = OBJ_BLOB;
-
 	if (S_ISGITLINK(mode)) {
-		type = OBJ_COMMIT;
+		*type = OBJ_COMMIT;
 	} else if (S_ISDIR(mode)) {
 		if (show_recursive(base->buf, base->len, pathname)) {
-			retval = READ_TREE_RECURSIVE;
+			*retval = READ_TREE_RECURSIVE;
 			if (!(ls_options & LS_SHOW_TREES))
-				return retval;
+				return 1;
 		}
-		type = OBJ_TREE;
+		*type = OBJ_TREE;
 	}
 	else if (ls_options & LS_TREE_ONLY)
-		return 0;
+		return 1;
+	return 0;
+}
+
+static int show_tree(const struct object_id *oid, struct strbuf *base,
+		const char *pathname, unsigned mode, void *context)
+{
+	int retval = 0;
+	size_t baselen;
+	enum object_type type = OBJ_BLOB;
+
+	if (show_tree_init(&type, base, pathname, mode, &retval))
+		return retval;
 
 	if (!(ls_options & LS_NAME_ONLY)) {
 		if (ls_options & LS_SHOW_SIZE) {
-- 
2.33.0.rc1.1802.gbb1c3936fb.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v8 6/8] ls-tree.c: support --object-only option for "git-ls-tree"
  2022-01-01 13:50           ` [PATCH v8 0/8] ls-tree: "--object-only" and "--format" opts Teng Long
                               ` (4 preceding siblings ...)
  2022-01-01 13:50             ` [PATCH v8 5/8] ls-tree: split up the "init" part of show_tree() Teng Long
@ 2022-01-01 13:50             ` Teng Long
  2022-01-04  1:21               ` Junio C Hamano
  2022-01-01 13:50             ` [PATCH v8 7/8] ls-tree.c: introduce struct "shown_data" Teng Long
                               ` (2 subsequent siblings)
  8 siblings, 1 reply; 224+ messages in thread
From: Teng Long @ 2022-01-01 13:50 UTC (permalink / raw)
  To: dyroneteng; +Cc: avarab, congdanhqx, git, gitster, peff, tenglong.tl

We usually pipe the output from `git ls-tree` to tools like
`sed` or `cut` when we only want to extract some fields.

When we want only the pathname component, we can pass
`--name-only` option to omit such a pipeline, but there are no
options for extracting other fields.

Teach the "--object-only" option to the command to only show the
object name. This option cannot be used together with
"--name-only" or "--long"; they are mutually exclusive. (Previously
"--name-only" and "--long" could be combined; this commit fixes
that bug along the way.)

The "show_tree" function was lightly refactored to use bitwise
operations to recognize which format to print to stdout. The aim
is to keep the addition of "--object-only" from making this logic
harder to read, and to make it easier to extend.
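
For readers unfamiliar with the idiom, here is a minimal sketch of the
bit tests. The SHOW_* values are the ones defined in this patch; the
helper "is_exactly" is a hypothetical name used only for illustration.
XOR-ing two masks yields zero only when every bit agrees, so
"!(shown_bits ^ X)" is an exact-match test, unlike "shown_bits & X",
which only asks whether the bits of X are set:

```c
#define SHOW_FILE_NAME 1
#define SHOW_SIZE (1 << 1)
#define SHOW_OBJECT_NAME (1 << 2)
#define SHOW_TYPE (1 << 3)
#define SHOW_MODE (1 << 4)
/* 29 == binary 11101: every field except the size */
#define SHOW_DEFAULT (SHOW_MODE | SHOW_TYPE | SHOW_OBJECT_NAME | SHOW_FILE_NAME)

/* Exact-match test: the XOR is zero only when all bits agree. */
static int is_exactly(unsigned int shown_bits, unsigned int wanted)
{
	return !(shown_bits ^ wanted);
}
```

With these definitions, a mask of SHOW_DEFAULT contains the
SHOW_OBJECT_NAME bit but is not "exactly" SHOW_OBJECT_NAME, which is
why the object-only branch is not taken for the default output.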

In terms of performance, there is no loss compared to "master"
(2ae0a9cb8298185a94e5998086f380a355dd8907). Here are the results of
the performance tests in my environment, run against the linux
repository:

    $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r HEAD"
    Benchmark 1: /opt/git/master/bin/git ls-tree -r HEAD
    Time (mean ± σ):     105.8 ms ±   2.7 ms    [User: 85.7 ms, System: 20.0 ms]
    Range (min … max):   101.5 ms … 111.3 ms    28 runs

    $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r HEAD"
    Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r HEAD
    Time (mean ± σ):     105.0 ms ±   3.0 ms    [User: 83.7 ms, System: 21.2 ms]
    Range (min … max):    99.3 ms … 109.5 ms    27 runs

    $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r -l HEAD"
    Benchmark 1: /opt/git/master/bin/git ls-tree -r -l HEAD
    Time (mean ± σ):     337.4 ms ±  10.9 ms    [User: 308.3 ms, System: 29.0 ms]
    Range (min … max):   323.0 ms … 355.0 ms    10 runs

    $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r -l HEAD"
    Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r -l HEAD
    Time (mean ± σ):     337.6 ms ±   6.2 ms    [User: 309.4 ms, System: 28.1 ms]
    Range (min … max):   330.4 ms … 349.9 ms    10 runs

Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
 Documentation/git-ls-tree.txt |   7 +-
 builtin/ls-tree.c             | 140 +++++++++++++++++++++++++---------
 t/t3104-ls-tree-oid.sh        |  51 +++++++++++++
 3 files changed, 159 insertions(+), 39 deletions(-)
 create mode 100755 t/t3104-ls-tree-oid.sh

diff --git a/Documentation/git-ls-tree.txt b/Documentation/git-ls-tree.txt
index db02d6d79a..729370f235 100644
--- a/Documentation/git-ls-tree.txt
+++ b/Documentation/git-ls-tree.txt
@@ -10,7 +10,7 @@ SYNOPSIS
 --------
 [verse]
 'git ls-tree' [-d] [-r] [-t] [-l] [-z]
-	    [--name-only] [--name-status] [--full-name] [--full-tree] [--abbrev[=<n>]]
+	    [--name-only] [--name-status] [--object-only] [--full-name] [--full-tree] [--abbrev[=<n>]]
 	    <tree-ish> [<path>...]
 
 DESCRIPTION
@@ -59,6 +59,11 @@ OPTIONS
 --name-only::
 --name-status::
 	List only filenames (instead of the "long" output), one per line.
+	Cannot be combined with `--object-only`.
+
+--object-only::
+	List only names of the objects, one per line. Cannot be combined
+	with `--name-only` or `--name-status`.
 
 --abbrev[=<n>]::
 	Instead of showing the full 40-byte hexadecimal object
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index df8312408d..85ca7358ba 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -16,22 +16,59 @@
 
 static int line_termination = '\n';
 #define LS_RECURSIVE 1
-#define LS_TREE_ONLY 2
-#define LS_SHOW_TREES 4
-#define LS_NAME_ONLY 8
-#define LS_SHOW_SIZE 16
+#define LS_TREE_ONLY (1 << 1)
+#define LS_SHOW_TREES (1 << 2)
+#define LS_NAME_ONLY (1 << 3)
+#define LS_SHOW_SIZE (1 << 4)
+#define LS_OBJECT_ONLY (1 << 5)
 static int abbrev;
 static int ls_options;
 static struct pathspec pathspec;
 static int chomp_prefix;
 static const char *ls_tree_prefix;
+static unsigned int shown_bits;
+#define SHOW_FILE_NAME 1
+#define SHOW_SIZE (1 << 1)
+#define SHOW_OBJECT_NAME (1 << 2)
+#define SHOW_TYPE (1 << 3)
+#define SHOW_MODE (1 << 4)
+#define SHOW_DEFAULT 29 /* 11101 size is not shown to output by default */
 
 static const  char * const ls_tree_usage[] = {
 	N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
 	NULL
 };
 
-static int show_recursive(const char *base, size_t baselen, const char *pathname)
+enum {
+	MODE_UNSPECIFIED = 0,
+	MODE_NAME_ONLY,
+	MODE_OBJECT_ONLY,
+	MODE_LONG,
+};
+
+static int cmdmode = MODE_UNSPECIFIED;
+
+static int parse_shown_fields(void)
+{
+	if (cmdmode == MODE_NAME_ONLY) {
+		shown_bits = SHOW_FILE_NAME;
+		return 0;
+	}
+	if (cmdmode == MODE_OBJECT_ONLY) {
+		shown_bits = SHOW_OBJECT_NAME;
+		return 0;
+	}
+	if (!ls_options || (ls_options & LS_RECURSIVE)
+	    || (ls_options & LS_SHOW_TREES)
+	    || (ls_options & LS_TREE_ONLY))
+		shown_bits = SHOW_DEFAULT;
+	if (cmdmode == MODE_LONG)
+		shown_bits = SHOW_DEFAULT | SHOW_SIZE;
+	return 1;
+}
+
+static int show_recursive(const char *base, size_t baselen,
+			  const char *pathname)
 {
 	int i;
 
@@ -61,6 +98,39 @@ static int show_recursive(const char *base, size_t baselen, const char *pathname
 	return 0;
 }
 
+static int show_default(const struct object_id *oid, enum object_type type,
+			const char *pathname, unsigned mode,
+			struct strbuf *base)
+{
+	size_t baselen = base->len;
+
+	if (shown_bits & SHOW_SIZE) {
+		char size_text[24];
+		if (type == OBJ_BLOB) {
+			unsigned long size;
+			if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
+				xsnprintf(size_text, sizeof(size_text), "BAD");
+			else
+				xsnprintf(size_text, sizeof(size_text),
+					  "%" PRIuMAX, (uintmax_t)size);
+		} else {
+			xsnprintf(size_text, sizeof(size_text), "-");
+		}
+		printf("%06o %s %s %7s\t", mode, type_name(type),
+		find_unique_abbrev(oid, abbrev), size_text);
+	} else {
+		printf("%06o %s %s\t", mode, type_name(type),
+		find_unique_abbrev(oid, abbrev));
+	}
+	baselen = base->len;
+	strbuf_addstr(base, pathname);
+	write_name_quoted_relative(base->buf,
+				   chomp_prefix ? ls_tree_prefix : NULL, stdout,
+				   line_termination);
+	strbuf_setlen(base, baselen);
+	return 1;
+}
+
 static int show_tree_init(enum object_type *type, struct strbuf *base,
 			  const char *pathname, unsigned mode, int *retval)
 {
@@ -89,34 +159,24 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	if (show_tree_init(&type, base, pathname, mode, &retval))
 		return retval;
 
-	if (!(ls_options & LS_NAME_ONLY)) {
-		if (ls_options & LS_SHOW_SIZE) {
-			char size_text[24];
-			if (type == OBJ_BLOB) {
-				unsigned long size;
-				if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
-					xsnprintf(size_text, sizeof(size_text),
-						  "BAD");
-				else
-					xsnprintf(size_text, sizeof(size_text),
-						  "%"PRIuMAX, (uintmax_t)size);
-			} else {
-				xsnprintf(size_text, sizeof(size_text), "-");
-			}
-			printf("%06o %s %s %7s\t", mode, type_name(type),
-			       find_unique_abbrev(oid, abbrev),
-			       size_text);
-		} else {
-			printf("%06o %s %s\t", mode, type_name(type),
-			       find_unique_abbrev(oid, abbrev));
-		}
+	if (!(shown_bits ^ SHOW_OBJECT_NAME)) {
+		printf("%s%c", find_unique_abbrev(oid, abbrev), line_termination);
+		return retval;
 	}
-	baselen = base->len;
-	strbuf_addstr(base, pathname);
-	write_name_quoted_relative(base->buf,
-				   chomp_prefix ? ls_tree_prefix : NULL,
-				   stdout, line_termination);
-	strbuf_setlen(base, baselen);
+
+	if (!(shown_bits ^ SHOW_FILE_NAME)) {
+		baselen = base->len;
+		strbuf_addstr(base, pathname);
+		write_name_quoted_relative(base->buf,
+					   chomp_prefix ? ls_tree_prefix : NULL,
+					   stdout, line_termination);
+		strbuf_setlen(base, baselen);
+	}
+
+	if (!(shown_bits ^ SHOW_DEFAULT) ||
+	    !(shown_bits ^ (SHOW_DEFAULT | SHOW_SIZE)))
+		show_default(oid, type, pathname, mode, base);
+
 	return retval;
 }
 
@@ -134,12 +194,14 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 			LS_SHOW_TREES),
 		OPT_SET_INT('z', NULL, &line_termination,
 			    N_("terminate entries with NUL byte"), 0),
-		OPT_BIT('l', "long", &ls_options, N_("include object size"),
-			LS_SHOW_SIZE),
-		OPT_BIT(0, "name-only", &ls_options, N_("list only filenames"),
-			LS_NAME_ONLY),
-		OPT_BIT(0, "name-status", &ls_options, N_("list only filenames"),
-			LS_NAME_ONLY),
+		OPT_CMDMODE('l', "long", &cmdmode, N_("include object size"),
+			    MODE_LONG),
+		OPT_CMDMODE(0, "name-only", &cmdmode, N_("list only filenames"),
+			    MODE_NAME_ONLY),
+		OPT_CMDMODE(0, "name-status", &cmdmode, N_("list only filenames"),
+			    MODE_NAME_ONLY),
+		OPT_CMDMODE(0, "object-only", &cmdmode, N_("list only objects"),
+			    MODE_OBJECT_ONLY),
 		OPT_SET_INT(0, "full-name", &chomp_prefix,
 			    N_("use full path names"), 0),
 		OPT_BOOL(0, "full-tree", &full_tree,
@@ -170,6 +232,8 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	if (get_oid(argv[0], &oid))
 		die("Not a valid object name %s", argv[0]);
 
+	parse_shown_fields();
+
 	/*
 	 * show_recursive() rolls its own matching code and is
 	 * generally ignorant of 'struct pathspec'. The magic mask
diff --git a/t/t3104-ls-tree-oid.sh b/t/t3104-ls-tree-oid.sh
new file mode 100755
index 0000000000..6ce62bd769
--- /dev/null
+++ b/t/t3104-ls-tree-oid.sh
@@ -0,0 +1,51 @@
+#!/bin/sh
+
+test_description='git ls-tree objects handling.'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	test_commit A &&
+	test_commit B &&
+	mkdir -p C &&
+	test_commit C/D.txt &&
+	find *.txt path* \( -type f -o -type l \) -print |
+	xargs git update-index --add &&
+	tree=$(git write-tree) &&
+	echo $tree
+'
+
+test_expect_success 'usage: --object-only' '
+	git ls-tree --object-only $tree >current &&
+	git ls-tree $tree >result &&
+	cut -f1 result | cut -d " " -f3 >expected &&
+	test_cmp current expected
+'
+
+test_expect_success 'usage: --object-only with -r' '
+	git ls-tree --object-only -r $tree >current &&
+	git ls-tree -r $tree >result &&
+	cut -f1 result | cut -d " " -f3 >expected &&
+	test_cmp current expected
+'
+
+test_expect_success 'usage: --object-only with --abbrev' '
+	git ls-tree --object-only --abbrev=6 $tree >current &&
+	git ls-tree --abbrev=6 $tree >result &&
+	cut -f1 result | cut -d " " -f3 >expected &&
+	test_cmp current expected
+'
+
+test_expect_success 'usage: incompatible options: --name-only with --object-only' '
+	test_expect_code 129 git ls-tree --object-only --name-only $tree
+'
+
+test_expect_success 'usage: incompatible options: --name-status with --object-only' '
+	test_expect_code 129 git ls-tree --object-only --name-status $tree
+'
+
+test_expect_success 'usage: incompatible options: --long with --object-only' '
+	test_expect_code 129 git ls-tree --object-only --long $tree
+'
+
+test_done
-- 
2.33.0.rc1.1802.gbb1c3936fb.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v8 7/8] ls-tree.c: introduce struct "shown_data"
  2022-01-01 13:50           ` [PATCH v8 0/8] ls-tree: "--object-only" and "--format" opts Teng Long
                               ` (5 preceding siblings ...)
  2022-01-01 13:50             ` [PATCH v8 6/8] ls-tree.c: support --object-only option for "git-ls-tree" Teng Long
@ 2022-01-01 13:50             ` Teng Long
  2022-01-03 23:21               ` Junio C Hamano
  2022-01-01 13:50             ` [PATCH v8 8/8] ls-tree.c: introduce "--format" option Teng Long
  2022-01-06  4:31             ` [PATCH v9 0/9] " Teng Long
  8 siblings, 1 reply; 224+ messages in thread
From: Teng Long @ 2022-01-01 13:50 UTC (permalink / raw)
  To: dyroneteng; +Cc: avarab, congdanhqx, git, gitster, peff, tenglong.tl

"show_data" is a struct that packages the necessary fields for
reusing. This commit is a front-loaded commit for support
"--format" argument and does not affect any existing functionality.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
 builtin/ls-tree.c | 47 +++++++++++++++++++++++++++++------------------
 1 file changed, 29 insertions(+), 18 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 85ca7358ba..009ffeb15d 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -34,6 +34,14 @@ static unsigned int shown_bits;
 #define SHOW_MODE (1 << 4)
 #define SHOW_DEFAULT 29 /* 11101 size is not shown to output by default */
 
+struct shown_data {
+	unsigned mode;
+	enum object_type type;
+	const struct object_id *oid;
+	const char *pathname;
+	struct strbuf *base;
+};
+
 static const  char * const ls_tree_usage[] = {
 	N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
 	NULL
@@ -98,17 +106,15 @@ static int show_recursive(const char *base, size_t baselen,
 	return 0;
 }
 
-static int show_default(const struct object_id *oid, enum object_type type,
-			const char *pathname, unsigned mode,
-			struct strbuf *base)
+static int show_default(struct shown_data *data)
 {
-	size_t baselen = base->len;
+	size_t baselen = data->base->len;
 
 	if (shown_bits & SHOW_SIZE) {
 		char size_text[24];
-		if (type == OBJ_BLOB) {
+		if (data->type == OBJ_BLOB) {
 			unsigned long size;
-			if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
+			if (oid_object_info(the_repository, data->oid, &size) == OBJ_BAD)
 				xsnprintf(size_text, sizeof(size_text), "BAD");
 			else
 				xsnprintf(size_text, sizeof(size_text),
@@ -116,18 +122,18 @@ static int show_default(const struct object_id *oid, enum object_type type,
 		} else {
 			xsnprintf(size_text, sizeof(size_text), "-");
 		}
-		printf("%06o %s %s %7s\t", mode, type_name(type),
-		find_unique_abbrev(oid, abbrev), size_text);
+		printf("%06o %s %s %7s\t", data->mode, type_name(data->type),
+		find_unique_abbrev(data->oid, abbrev), size_text);
 	} else {
-		printf("%06o %s %s\t", mode, type_name(type),
-		find_unique_abbrev(oid, abbrev));
+		printf("%06o %s %s\t", data->mode, type_name(data->type),
+		find_unique_abbrev(data->oid, abbrev));
 	}
-	baselen = base->len;
-	strbuf_addstr(base, pathname);
-	write_name_quoted_relative(base->buf,
+	baselen = data->base->len;
+	strbuf_addstr(data->base, data->pathname);
+	write_name_quoted_relative(data->base->buf,
 				   chomp_prefix ? ls_tree_prefix : NULL, stdout,
 				   line_termination);
-	strbuf_setlen(base, baselen);
+	strbuf_setlen(data->base, baselen);
 	return 1;
 }
 
@@ -154,11 +160,16 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 {
 	int retval = 0;
 	size_t baselen;
-	enum object_type type = OBJ_BLOB;
+	struct shown_data data = {
+		.mode = mode,
+		.type = OBJ_BLOB,
+		.oid = oid,
+		.pathname = pathname,
+		.base = base,
+	};
 
-	if (show_tree_init(&type, base, pathname, mode, &retval))
+	if (show_tree_init(&data.type, base, pathname, mode, &retval))
 		return retval;
-
 	if (!(shown_bits ^ SHOW_OBJECT_NAME)) {
 		printf("%s%c", find_unique_abbrev(oid, abbrev), line_termination);
 		return retval;
@@ -175,7 +186,7 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 
 	if (!(shown_bits ^ SHOW_DEFAULT) ||
 	    !(shown_bits ^ (SHOW_DEFAULT | SHOW_SIZE)))
-		show_default(oid, type, pathname, mode, base);
+		show_default(&data);
 
 	return retval;
 }
-- 
2.33.0.rc1.1802.gbb1c3936fb.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v8 8/8] ls-tree.c: introduce "--format" option
  2022-01-01 13:50           ` [PATCH v8 0/8] ls-tree: "--object-only" and "--format" opts Teng Long
                               ` (6 preceding siblings ...)
  2022-01-01 13:50             ` [PATCH v8 7/8] ls-tree.c: introduce struct "shown_data" Teng Long
@ 2022-01-01 13:50             ` Teng Long
  2022-01-04 14:38               ` Johannes Schindelin
  2022-01-06  4:31             ` [PATCH v9 0/9] " Teng Long
  8 siblings, 1 reply; 224+ messages in thread
From: Teng Long @ 2022-01-01 13:50 UTC (permalink / raw)
  To: dyroneteng; +Cc: avarab, congdanhqx, git, gitster, peff, tenglong.tl

Add a --format option to ls-tree. The command already has a default
output format, plus the --long and --name-only options, which emit the
default output along with the object size, or only the object paths,
respectively.

Rather than add --type-only, --object-only etc. we can just support a
--format using a strbuf_expand() similar to "for-each-ref
--format". We might still add such options in the future for
convenience.
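
As a rough illustration of the placeholder expansion described above,
here is a minimal standalone sketch. It does not use git's
strbuf_expand(); "struct entry", lookup_field() and expand_format() are
hypothetical names made up for this example, and the fields are plain
strings rather than the real object data.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record for one tree entry (not git's struct shown_data). */
struct entry {
	const char *mode;
	const char *type;
	const char *object;
	const char *file;
};

/* Map a field name of the given length to its value, or NULL if unknown. */
static const char *lookup_field(const struct entry *e, const char *name,
				size_t len)
{
	if (len == 4 && !memcmp(name, "mode", 4))
		return e->mode;
	if (len == 4 && !memcmp(name, "type", 4))
		return e->type;
	if (len == 6 && !memcmp(name, "object", 6))
		return e->object;
	if (len == 4 && !memcmp(name, "file", 4))
		return e->file;
	return NULL;
}

/*
 * Expand "%%", "%xNN" and "%(field)" from fmt into out (capacity cap).
 * Returns 0 on success, -1 on a malformed format or a too-small buffer.
 */
static int expand_format(char *out, size_t cap, const char *fmt,
			 const struct entry *e)
{
	size_t n = 0;

	if (!cap)
		return -1;
	while (*fmt) {
		if (*fmt != '%') {
			if (n + 1 >= cap)
				return -1;
			out[n++] = *fmt++;
			continue;
		}
		fmt++; /* skip the '%' */
		if (*fmt == '%') {
			if (n + 1 >= cap)
				return -1;
			out[n++] = '%';
			fmt++;
		} else if (*fmt == 'x' && isxdigit((unsigned char)fmt[1]) &&
			   isxdigit((unsigned char)fmt[2])) {
			char hex[3] = { fmt[1], fmt[2], '\0' };

			if (n + 1 >= cap)
				return -1;
			out[n++] = (char)strtol(hex, NULL, 16);
			fmt += 3;
		} else if (*fmt == '(') {
			const char *end = strchr(fmt, ')');
			const char *val;
			size_t vlen;

			if (!end)
				return -1; /* element does not end in ')' */
			val = lookup_field(e, fmt + 1, (size_t)(end - fmt - 1));
			if (!val)
				return -1; /* unknown field name */
			vlen = strlen(val);
			if (n + vlen >= cap)
				return -1;
			memcpy(out + n, val, vlen);
			n += vlen;
			fmt = end + 1;
		} else {
			return -1;
		}
	}
	out[n] = '\0';
	return 0;
}
```

With this sketch, expanding "%(mode) %(type) %(object)%x09%(file)"
yields the mode, type and object name separated by spaces, then a TAB
and the file name. Unlike the real implementation, it reports errors by
return value instead of calling die(), and it does no path quoting.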

The --format implementation is slower than the existing code, but this
change does not cause any performance regressions. We leave the
existing show_tree() unchanged, and only run show_tree_fmt() if
a --format is provided that differs from the hardcoded built-in ones
corresponding to the existing modes.

I.e. something like the "--long" output would be much slower with
this, mainly due to how we need to allocate various things to do with
quote.c instead of spewing the output directly to stdout.
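
The fast-path decision described above can be sketched on its own as
follows. The four format strings are the ones used in this patch;
"enum show_path" and choose_show_path() are hypothetical names for this
example only (the patch itself assigns a read_tree_fn_t callback
instead).

```c
#include <stddef.h>
#include <string.h>

/* Formats equivalent to the existing output modes (taken from the patch). */
static const char *const builtin_formats[] = {
	"%(mode) %(type) %(object)%x09%(file)",                /* default */
	"%(mode) %(type) %(object) %(size:padded)%x09%(file)", /* --long */
	"%(file)",                                             /* --name-only */
	"%(object)",                                           /* --object-only */
};

enum show_path { PATH_FAST, PATH_GENERIC };

/*
 * Hypothetical helper: a NULL format (no --format given) or a format
 * that matches a built-in string exactly takes the fast path; any
 * other format goes through the generic, slower expansion.
 */
static enum show_path choose_show_path(const char *format)
{
	size_t i;

	if (!format)
		return PATH_FAST;
	for (i = 0; i < sizeof(builtin_formats) / sizeof(builtin_formats[0]); i++)
		if (!strcmp(format, builtin_formats[i]))
			return PATH_FAST;
	return PATH_GENERIC;
}
```

Note that the comparison is an exact string match, so a format that
differs from a built-in one by even a single space ends up on the
generic path.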

The '--format' option comes from Ævar Arnfjörð Bjarmason's idea and
suggestion; this commit makes modifications based on the original
discussion on the mailing list [1].

Here are the statistics from the performance tests:

1. Default format (hitting the built-in formats):

    "git ls-tree <tree-ish>" vs "--format='%(mode) %(type) %(object)%x09%(file)'"

    $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r HEAD"
    Benchmark 1: /opt/git/master/bin/git ls-tree -r HEAD
    Time (mean ± σ):     105.2 ms ±   3.3 ms    [User: 84.3 ms, System: 20.8 ms]
    Range (min … max):    99.2 ms … 113.2 ms    28 runs

    $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object)%x09%(file)'  HEAD"
    Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object)%x09%(file)'  HEAD
    Time (mean ± σ):     106.4 ms ±   2.7 ms    [User: 86.1 ms, System: 20.2 ms]
    Range (min … max):   100.2 ms … 110.5 ms    29 runs

2. Default format including the object size (hitting the built-in formats):

    "git ls-tree -l <tree-ish>" vs "--format='%(mode) %(type) %(object) %(size:padded)%x09%(file)'"

    $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r -l HEAD"
    Benchmark 1: /opt/git/master/bin/git ls-tree -r -l HEAD
    Time (mean ± σ):     335.1 ms ±   6.5 ms    [User: 304.6 ms, System: 30.4 ms]
    Range (min … max):   327.5 ms … 348.4 ms    10 runs

    $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object) %(size:padded)%x09%(file)'  HEAD"
    Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object) %(size:padded)%x09%(file)'  HEAD
    Time (mean ± σ):     337.2 ms ±   8.2 ms    [User: 309.2 ms, System: 27.9 ms]
    Range (min … max):   328.8 ms … 349.4 ms    10 runs

Links:
[1] https://public-inbox.org/git/RFC-patch-6.7-eac299f06ff-20211217T131635Z-avarab@gmail.com/

Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
 Documentation/git-ls-tree.txt |  50 ++++++++-
 builtin/ls-tree.c             | 191 ++++++++++++++++++++++++++++------
 t/t3105-ls-tree-format.sh     |  55 ++++++++++
 3 files changed, 259 insertions(+), 37 deletions(-)
 create mode 100755 t/t3105-ls-tree-format.sh

diff --git a/Documentation/git-ls-tree.txt b/Documentation/git-ls-tree.txt
index 729370f235..ef23c0fac1 100644
--- a/Documentation/git-ls-tree.txt
+++ b/Documentation/git-ls-tree.txt
@@ -10,9 +10,9 @@ SYNOPSIS
 --------
 [verse]
 'git ls-tree' [-d] [-r] [-t] [-l] [-z]
-	    [--name-only] [--name-status] [--object-only] [--full-name] [--full-tree] [--abbrev[=<n>]]
-	    <tree-ish> [<path>...]
-
+	    [--name-only] [--name-status] [--object-only]
+	    [--full-name] [--full-tree] [--abbrev[=<n>]]
+	    [--format=<format>] <tree-ish> [<path>...]
 DESCRIPTION
 -----------
 Lists the contents of a given tree object, like what "/bin/ls -a" does
@@ -79,6 +79,16 @@ OPTIONS
 	Do not limit the listing to the current working directory.
 	Implies --full-name.
 
+--format=<format>::
+	A string that interpolates `%(fieldname)` from the result
+	being shown. It also interpolates `%%` to `%`, and
+	`%xNN` where `NN` are hex digits interpolates to the character
+	with hex code `NN`; for example `%x00` interpolates to
+	`\0` (NUL), `%x09` to `\t` (TAB) and `%x0a` to `\n` (LF).
+	When specified, `--format` cannot be combined with other
+	format-altering options, including `--long`, `--name-only`
+	and `--object-only`.
+
 [<path>...]::
 	When paths are given, show them (note that this isn't really raw
 	pathnames, but rather a list of patterns to match).  Otherwise
@@ -87,6 +97,9 @@ OPTIONS
 
 Output Format
 -------------
+
+Default format:
+
         <mode> SP <type> SP <object> TAB <file>
 
 This output format is compatible with what `--index-info --stdin` of
@@ -105,6 +118,37 @@ quoted as explained for the configuration variable `core.quotePath`
 (see linkgit:git-config[1]).  Using `-z` the filename is output
 verbatim and the line is terminated by a NUL byte.
 
+Customized format:
+
+It is possible to print a customized format by using `%(fieldname)` with the `--format` option.
+For example, if you want to print only the <object> and <file> fields in a
+JSON style, run with a specific "--format" like
+
+		git ls-tree --format='{"object":"%(object)", "file":"%(file)"}' <tree-ish>
+
+The output format changes to:
+
+		{"object":"<object>", "file":"<file>"}
+
+FIELD NAMES
+-----------
+
+Various values from structured fields can be used to interpolate
+into the resulting output. For each output line, the following
+names can be used:
+
+mode::
+	The mode of the object.
+type::
+	The type of the object (`commit`, `blob` or `tree`).
+object::
+	The name of the object.
+size[:padded]::
+	The size of the object ("-" if it's a tree).
+	It also supports a padded format of size with "%(size:padded)".
+file::
+	The filename of the object.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 009ffeb15d..6e3e5a4d06 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -56,23 +56,75 @@ enum {
 
 static int cmdmode = MODE_UNSPECIFIED;
 
-static int parse_shown_fields(void)
+static const char *format;
+static const char *default_format = "%(mode) %(type) %(object)%x09%(file)";
+static const char *long_format = "%(mode) %(type) %(object) %(size:padded)%x09%(file)";
+static const char *name_only_format = "%(file)";
+static const char *object_only_format = "%(object)";
+
+static void expand_objectsize(struct strbuf *line, const struct object_id *oid,
+			      const enum object_type type, unsigned int padded)
 {
-	if (cmdmode == MODE_NAME_ONLY) {
-		shown_bits = SHOW_FILE_NAME;
-		return 0;
+	if (type == OBJ_BLOB) {
+		unsigned long size;
+		if (oid_object_info(the_repository, oid, &size) < 0)
+			die(_("could not get object info about '%s'"),
+			    oid_to_hex(oid));
+		if (padded)
+			strbuf_addf(line, "%7" PRIuMAX, (uintmax_t)size);
+		else
+			strbuf_addf(line, "%" PRIuMAX, (uintmax_t)size);
+	} else if (padded) {
+		strbuf_addf(line, "%7s", "-");
+	} else {
+		strbuf_addstr(line, "-");
 	}
-	if (cmdmode == MODE_OBJECT_ONLY) {
-		shown_bits = SHOW_OBJECT_NAME;
-		return 0;
+}
+
+static size_t expand_show_tree(struct strbuf *line, const char *start,
+			       void *context)
+{
+	struct shown_data *data = context;
+	const char *end;
+	const char *p;
+	unsigned int errlen;
+	size_t len;
+	len = strbuf_expand_literal_cb(line, start, NULL);
+	if (len)
+		return len;
+
+	if (*start != '(')
+		die(_("bad ls-tree format: as '%s'"), start);
+
+	end = strchr(start + 1, ')');
+	if (!end)
+		die(_("bad ls-tree format: element '%s' does not end in ')'"), start);
+
+	len = end - start + 1;
+	if (skip_prefix(start, "(mode)", &p)) {
+		strbuf_addf(line, "%06o", data->mode);
+	} else if (skip_prefix(start, "(type)", &p)) {
+		strbuf_addstr(line, type_name(data->type));
+	} else if (skip_prefix(start, "(size:padded)", &p)) {
+		expand_objectsize(line, data->oid, data->type, 1);
+	} else if (skip_prefix(start, "(size)", &p)) {
+		expand_objectsize(line, data->oid, data->type, 0);
+	} else if (skip_prefix(start, "(object)", &p)) {
+		strbuf_addstr(line, find_unique_abbrev(data->oid, abbrev));
+	} else if (skip_prefix(start, "(file)", &p)) {
+		const char *name = data->base->buf;
+		const char *prefix = chomp_prefix ? ls_tree_prefix : NULL;
+		struct strbuf quoted = STRBUF_INIT;
+		struct strbuf sb = STRBUF_INIT;
+		strbuf_addstr(data->base, data->pathname);
+		name = relative_path(data->base->buf, prefix, &sb);
+		quote_c_style(name, &quoted, NULL, 0);
+		strbuf_addstr(line, quoted.buf);
+	} else {
+		errlen = (unsigned long)len;
+		die(_("bad ls-tree format: %%%.*s"), errlen, start);
 	}
-	if (!ls_options || (ls_options & LS_RECURSIVE)
-	    || (ls_options & LS_SHOW_TREES)
-	    || (ls_options & LS_TREE_ONLY))
-		shown_bits = SHOW_DEFAULT;
-	if (cmdmode == MODE_LONG)
-		shown_bits = SHOW_DEFAULT | SHOW_SIZE;
-	return 1;
+	return len;
 }
 
 static int show_recursive(const char *base, size_t baselen,
@@ -106,6 +158,75 @@ static int show_recursive(const char *base, size_t baselen,
 	return 0;
 }
 
+static int show_tree_init(enum object_type *type, struct strbuf *base,
+			  const char *pathname, unsigned mode, int *retval)
+{
+	if (S_ISGITLINK(mode)) {
+		*type = OBJ_COMMIT;
+	} else if (S_ISDIR(mode)) {
+		if (show_recursive(base->buf, base->len, pathname)) {
+			*retval = READ_TREE_RECURSIVE;
+			if (!(ls_options & LS_SHOW_TREES))
+				return 1;
+		}
+		*type = OBJ_TREE;
+	}
+	else if (ls_options & LS_TREE_ONLY)
+		return 1;
+	return 0;
+}
+
+static int show_tree_fmt(const struct object_id *oid, struct strbuf *base,
+			 const char *pathname, unsigned mode, void *context)
+{
+	size_t baselen;
+	int retval = 0;
+	struct strbuf line = STRBUF_INIT;
+	struct shown_data data = {
+		.mode = mode,
+		.type = OBJ_BLOB,
+		.oid = oid,
+		.pathname = pathname,
+		.base = base,
+	};
+
+	if (show_tree_init(&data.type, base, pathname, mode, &retval))
+		return retval;
+
+	baselen = base->len;
+	strbuf_expand(&line, format, expand_show_tree, &data);
+	strbuf_addch(&line, line_termination);
+	fwrite(line.buf, line.len, 1, stdout);
+	strbuf_setlen(base, baselen);
+	return retval;
+}
+
+static int parse_shown_fields(void)
+{
+	if (cmdmode == MODE_NAME_ONLY ||
+	    (format && !strcmp(format, name_only_format))) {
+		shown_bits = SHOW_FILE_NAME;
+		return 1;
+	}
+
+	if (cmdmode == MODE_OBJECT_ONLY ||
+	    (format && !strcmp(format, object_only_format))) {
+		shown_bits = SHOW_OBJECT_NAME;
+		return 1;
+	}
+
+	if (!ls_options || (ls_options & LS_RECURSIVE)
+	    || (ls_options & LS_SHOW_TREES)
+	    || (ls_options & LS_TREE_ONLY)
+		|| (format && !strcmp(format, default_format)))
+		shown_bits = SHOW_DEFAULT;
+
+	if (cmdmode == MODE_LONG ||
+		(format && !strcmp(format, long_format)))
+		shown_bits = SHOW_DEFAULT | SHOW_SIZE;
+	return 1;
+}
+
 static int show_default(struct shown_data *data)
 {
 	size_t baselen = data->base->len;
@@ -137,24 +258,6 @@ static int show_default(struct shown_data *data)
 	return 1;
 }
 
-static int show_tree_init(enum object_type *type, struct strbuf *base,
-			  const char *pathname, unsigned mode, int *retval)
-{
-	if (S_ISGITLINK(mode)) {
-		*type = OBJ_COMMIT;
-	} else if (S_ISDIR(mode)) {
-		if (show_recursive(base->buf, base->len, pathname)) {
-			*retval = READ_TREE_RECURSIVE;
-			if (!(ls_options & LS_SHOW_TREES))
-				return 1;
-		}
-		*type = OBJ_TREE;
-	}
-	else if (ls_options & LS_TREE_ONLY)
-		return 1;
-	return 0;
-}
-
 static int show_tree(const struct object_id *oid, struct strbuf *base,
 		const char *pathname, unsigned mode, void *context)
 {
@@ -196,6 +299,7 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	struct object_id oid;
 	struct tree *tree;
 	int i, full_tree = 0;
+	read_tree_fn_t fn = show_tree;
 	const struct option ls_tree_options[] = {
 		OPT_BIT('d', NULL, &ls_options, N_("only show trees"),
 			LS_TREE_ONLY),
@@ -218,6 +322,9 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 		OPT_BOOL(0, "full-tree", &full_tree,
 			 N_("list entire tree; not just current directory "
 			    "(implies --full-name)")),
+		OPT_STRING_F(0, "format", &format, N_("format"),
+			     N_("format to use for the output"),
+			     PARSE_OPT_NONEG),
 		OPT__ABBREV(&abbrev),
 		OPT_END()
 	};
@@ -238,6 +345,10 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	    ((LS_TREE_ONLY|LS_RECURSIVE) & ls_options))
 		ls_options |= LS_SHOW_TREES;
 
+	if (format && cmdmode)
+		usage_msg_opt(
+			_("--format can't be combined with other format-altering options"),
+			ls_tree_usage, ls_tree_options);
 	if (argc < 1)
 		usage_with_options(ls_tree_usage, ls_tree_options);
 	if (get_oid(argv[0], &oid))
@@ -261,6 +372,18 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	tree = parse_tree_indirect(&oid);
 	if (!tree)
 		die("not a tree object");
-	return !!read_tree(the_repository, tree,
-			   &pathspec, show_tree, NULL);
+
+	/*
+	 * The generic show_tree_fmt() is slower than show_tree(), so
+	 * take the fast path if possible.
+	 */
+	if (format && (!strcmp(format, default_format) ||
+				   !strcmp(format, long_format) ||
+				   !strcmp(format, name_only_format) ||
+				   !strcmp(format, object_only_format)))
+		fn = show_tree;
+	else if (format)
+		fn = show_tree_fmt;
+
+	return !!read_tree(the_repository, tree, &pathspec, fn, NULL);
 }
diff --git a/t/t3105-ls-tree-format.sh b/t/t3105-ls-tree-format.sh
new file mode 100755
index 0000000000..92b4d240e8
--- /dev/null
+++ b/t/t3105-ls-tree-format.sh
@@ -0,0 +1,55 @@
+#!/bin/sh
+
+test_description='ls-tree --format'
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+test_expect_success 'ls-tree --format usage' '
+	test_expect_code 129 git ls-tree --format=fmt -l &&
+	test_expect_code 129 git ls-tree --format=fmt --name-only &&
+	test_expect_code 129 git ls-tree --format=fmt --name-status &&
+	test_expect_code 129 git ls-tree --format=fmt --object-only
+'
+
+test_expect_success 'setup' '
+	mkdir dir &&
+	test_commit dir/sub-file &&
+	test_commit top-file
+'
+
+test_ls_tree_format () {
+	format=$1 &&
+	opts=$2 &&
+	shift 2 &&
+	git ls-tree $opts -r HEAD >expect.raw &&
+	sed "s/^/> /" >expect <expect.raw &&
+	git ls-tree --format="> $format" -r HEAD >actual &&
+	test_cmp expect actual
+}
+
+test_expect_success 'ls-tree --format=<default-like>' '
+	test_ls_tree_format \
+		"%(mode) %(type) %(object)%x09%(file)" \
+		""
+'
+
+test_expect_success 'ls-tree --format=<long-like>' '
+	test_ls_tree_format \
+		"%(mode) %(type) %(object) %(size:padded)%x09%(file)" \
+		"--long"
+'
+
+test_expect_success 'ls-tree --format=<name-only-like>' '
+	test_ls_tree_format \
+		"%(file)" \
+		"--name-only"
+'
+
+test_expect_success 'ls-tree --format=<object-only-like>' '
+	test_ls_tree_format \
+		"%(object)" \
+		"--object-only"
+'
+
+test_done
-- 
2.33.0.rc1.1802.gbb1c3936fb.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v8 7/8] ls-tree.c: introduce struct "shown_data"
  2022-01-01 13:50             ` [PATCH v8 7/8] ls-tree.c: introduce struct "shown_data" Teng Long
@ 2022-01-03 23:21               ` Junio C Hamano
  2022-01-04  2:02                 ` Teng Long
  0 siblings, 1 reply; 224+ messages in thread
From: Junio C Hamano @ 2022-01-03 23:21 UTC (permalink / raw)
  To: Teng Long; +Cc: avarab, congdanhqx, git, peff, tenglong.tl

Teng Long <dyroneteng@gmail.com> writes:

> "show_data" is a struct that packages the necessary fields for

Is that shown_data?

> reusing. This commit is a front-loaded commit for support
> "--format" argument and does not affect any existing functionality.

What's a front-loaded commit?  Is that some joke around a washing
machine that I do not quite get, or something?

> Signed-off-by: Teng Long <dyroneteng@gmail.com>
> ---
>  builtin/ls-tree.c | 47 +++++++++++++++++++++++++++++------------------
>  1 file changed, 29 insertions(+), 18 deletions(-)
>
> diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
> index 85ca7358ba..009ffeb15d 100644
> --- a/builtin/ls-tree.c
> +++ b/builtin/ls-tree.c
> @@ -34,6 +34,14 @@ static unsigned int shown_bits;
>  #define SHOW_MODE (1 << 4)
>  #define SHOW_DEFAULT 29 /* 11101 size is not shown to output by default */
>  
> +struct shown_data {
> +	unsigned mode;
> +	enum object_type type;
> +	const struct object_id *oid;
> +	const char *pathname;
> +	struct strbuf *base;
> +};
> +
>  static const  char * const ls_tree_usage[] = {
>  	N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
>  	NULL
> @@ -98,17 +106,15 @@ static int show_recursive(const char *base, size_t baselen,
>  	return 0;
>  }
>  
> -static int show_default(const struct object_id *oid, enum object_type type,
> -			const char *pathname, unsigned mode,
> -			struct strbuf *base)
> +static int show_default(struct shown_data *data)
>  {
> -	size_t baselen = base->len;
> +	size_t baselen = data->base->len;
>  
>  	if (shown_bits & SHOW_SIZE) {
>  		char size_text[24];
> -		if (type == OBJ_BLOB) {
> +		if (data->type == OBJ_BLOB) {
>  			unsigned long size;
> -			if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
> +			if (oid_object_info(the_repository, data->oid, &size) == OBJ_BAD)
>  				xsnprintf(size_text, sizeof(size_text), "BAD");
>  			else
>  				xsnprintf(size_text, sizeof(size_text),
> @@ -116,18 +122,18 @@ static int show_default(const struct object_id *oid, enum object_type type,
>  		} else {
>  			xsnprintf(size_text, sizeof(size_text), "-");
>  		}
> -		printf("%06o %s %s %7s\t", mode, type_name(type),
> -		find_unique_abbrev(oid, abbrev), size_text);
> +		printf("%06o %s %s %7s\t", data->mode, type_name(data->type),
> +		find_unique_abbrev(data->oid, abbrev), size_text);
>  	} else {
> -		printf("%06o %s %s\t", mode, type_name(type),
> -		find_unique_abbrev(oid, abbrev));
> +		printf("%06o %s %s\t", data->mode, type_name(data->type),
> +		find_unique_abbrev(data->oid, abbrev));
>  	}
> -	baselen = base->len;
> -	strbuf_addstr(base, pathname);
> -	write_name_quoted_relative(base->buf,
> +	baselen = data->base->len;
> +	strbuf_addstr(data->base, data->pathname);
> +	write_name_quoted_relative(data->base->buf,
>  				   chomp_prefix ? ls_tree_prefix : NULL, stdout,
>  				   line_termination);
> -	strbuf_setlen(base, baselen);
> +	strbuf_setlen(data->base, baselen);
>  	return 1;
>  }
>  
> @@ -154,11 +160,16 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
>  {
>  	int retval = 0;
>  	size_t baselen;
> -	enum object_type type = OBJ_BLOB;
> +	struct shown_data data = {
> +		.mode = mode,
> +		.type = OBJ_BLOB,
> +		.oid = oid,
> +		.pathname = pathname,
> +		.base = base,
> +	};
>  
> -	if (show_tree_init(&type, base, pathname, mode, &retval))
> +	if (show_tree_init(&data.type, base, pathname, mode, &retval))
>  		return retval;
> -
>  	if (!(shown_bits ^ SHOW_OBJECT_NAME)) {
>  		printf("%s%c", find_unique_abbrev(oid, abbrev), line_termination);
>  		return retval;
> @@ -175,7 +186,7 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
>  
>  	if (!(shown_bits ^ SHOW_DEFAULT) ||
>  	    !(shown_bits ^ (SHOW_DEFAULT | SHOW_SIZE)))
> -		show_default(oid, type, pathname, mode, base);
> +		show_default(&data);
>  
>  	return retval;
>  }

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v8 6/8] ls-tree.c: support --object-only option for "git-ls-tree"
  2022-01-01 13:50             ` [PATCH v8 6/8] ls-tree.c: support --object-only option for "git-ls-tree" Teng Long
@ 2022-01-04  1:21               ` Junio C Hamano
  2022-01-04  7:29                 ` Teng Long
  0 siblings, 1 reply; 224+ messages in thread
From: Junio C Hamano @ 2022-01-04  1:21 UTC (permalink / raw)
  To: Teng Long; +Cc: avarab, congdanhqx, git, peff, tenglong.tl

Teng Long <dyroneteng@gmail.com> writes:

> A simple refactoring was done to the "show_tree" function, intead by
> using bitwise operations to recognize the format for printing to
> stdout. The reason for doing this is that we don't want to increase
> the readability difficulty with the addition of "-object-only",
> making this part of the logic easier to read and expand.

The resulting code looks unnecessarily complex and brittle; some
SHOW_FOO mean SHOW_FOO_ONLY_AND_NOTHING_ELSE while other SHOW_BAR
means SHOW_BAR_BUT_WE_MAY_SHOW_OTHER_THINGS_IN_LATER_PART, and the
distinction is not clear from their names (which means it is hard
to later extend and enhance the behaviour of the code).

> +	if (!(shown_bits ^ SHOW_FILE_NAME)) {

Is the use of XOR operator significant here?

I.e. "if (shown_bits & SHOW_FILE_NAME)" would have been a much more
natural way to guard "this is a block that shows the file name",
than "the result MUST BE all bits off if we flip SHOW_FILE_NAME bit
off".  If various SHOW_FOO bits are meant to be mutually exclusive,
then "if ((shown_bits & SHOW_FILE_NAME) == SHOW_FILE_NAME)" would
also make sense, but as I said upfront, it is unclear to me if
shown_bits are meant to be a collection of "this bit means this
field is shown (and it implies nothing else)", so I dunno.
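
To make the difference between these tests concrete, here is a small
standalone sketch. SHOW_MODE and SHOW_DEFAULT are as quoted in the
patch; the other bit values are assumptions chosen to be consistent
with SHOW_DEFAULT being 29, and the three helper names are hypothetical.

```c
/* Bit values: SHOW_MODE and SHOW_DEFAULT as in the patch, rest assumed. */
#define SHOW_FILE_NAME   (1 << 0)
#define SHOW_SIZE        (1 << 1)
#define SHOW_OBJECT_NAME (1 << 2)
#define SHOW_TYPE        (1 << 3)
#define SHOW_MODE        (1 << 4)
#define SHOW_DEFAULT     29 /* 11101: everything except the size */

/* "!(bits ^ flag)": true only if bits equals flag and nothing else is set. */
static int is_exactly(unsigned bits, unsigned flag)
{
	return !(bits ^ flag);
}

/* "bits & flag": true if the flag is set, regardless of the other bits. */
static int has_any(unsigned bits, unsigned flag)
{
	return (bits & flag) != 0;
}

/* "(bits & mask) == mask": true if every bit of mask is set in bits. */
static int has_all(unsigned bits, unsigned mask)
{
	return (bits & mask) == mask;
}
```

With these values, is_exactly(SHOW_DEFAULT, SHOW_FILE_NAME) is false
while has_any(SHOW_DEFAULT, SHOW_FILE_NAME) is true, which is the
"only this field" versus "this field among others" distinction that the
flag names do not convey.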

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v8 7/8] ls-tree.c: introduce struct "shown_data"
  2022-01-03 23:21               ` Junio C Hamano
@ 2022-01-04  2:02                 ` Teng Long
  0 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-04  2:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: avarab, congdanhqx, git, peff, tenglong.tl

>> "show_data" is a struct that packages the necessary fields for

>Is that shown_data?

Yes, sorry for the mistake.

>> reusing. This commit is a front-loaded commit for support
>> "--format" argument and does not affect any existing functionality.

> What's a front-loaded commit?  Is that some joke around a washing
> machine that I do not quite get, or something?

I learned this word recently (from an English training institute).
Sorry if it confused you; I think it means pre-prepared.

I found two example sentences at:
https://dictionary.cambridge.org/us/dictionary/english/front-load

I'm not sure whether I used it correctly. If it is wrong, or if it
just leaves others confused, I will use "pre-prepared" instead
next time.

Thanks.

On Tue, Jan 4, 2022 at 07:21, Junio C Hamano <gitster@pobox.com> wrote:
>
> Teng Long <dyroneteng@gmail.com> writes:
>
> > "show_data" is a struct that packages the necessary fields for
>
> Is that shown_data?
>
> > reusing. This commit is a front-loaded commit for support
> > "--format" argument and does not affect any existing functionality.
>
> What's a front-loaded commit?  Is that some joke around a washing
> machine that I do not quite get, or something?
>
> > Signed-off-by: Teng Long <dyroneteng@gmail.com>
> > ---
> >  builtin/ls-tree.c | 47 +++++++++++++++++++++++++++++------------------
> >  1 file changed, 29 insertions(+), 18 deletions(-)
> >
> > diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
> > index 85ca7358ba..009ffeb15d 100644
> > --- a/builtin/ls-tree.c
> > +++ b/builtin/ls-tree.c
> > @@ -34,6 +34,14 @@ static unsigned int shown_bits;
> >  #define SHOW_MODE (1 << 4)
> >  #define SHOW_DEFAULT 29 /* 11101 size is not shown to output by default */
> >
> > +struct shown_data {
> > +     unsigned mode;
> > +     enum object_type type;
> > +     const struct object_id *oid;
> > +     const char *pathname;
> > +     struct strbuf *base;
> > +};
> > +
> >  static const  char * const ls_tree_usage[] = {
> >       N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
> >       NULL
> > @@ -98,17 +106,15 @@ static int show_recursive(const char *base, size_t baselen,
> >       return 0;
> >  }
> >
> > -static int show_default(const struct object_id *oid, enum object_type type,
> > -                     const char *pathname, unsigned mode,
> > -                     struct strbuf *base)
> > +static int show_default(struct shown_data *data)
> >  {
> > -     size_t baselen = base->len;
> > +     size_t baselen = data->base->len;
> >
> >       if (shown_bits & SHOW_SIZE) {
> >               char size_text[24];
> > -             if (type == OBJ_BLOB) {
> > +             if (data->type == OBJ_BLOB) {
> >                       unsigned long size;
> > -                     if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
> > +                     if (oid_object_info(the_repository, data->oid, &size) == OBJ_BAD)
> >                               xsnprintf(size_text, sizeof(size_text), "BAD");
> >                       else
> >                               xsnprintf(size_text, sizeof(size_text),
> > @@ -116,18 +122,18 @@ static int show_default(const struct object_id *oid, enum object_type type,
> >               } else {
> >                       xsnprintf(size_text, sizeof(size_text), "-");
> >               }
> > -             printf("%06o %s %s %7s\t", mode, type_name(type),
> > -             find_unique_abbrev(oid, abbrev), size_text);
> > +             printf("%06o %s %s %7s\t", data->mode, type_name(data->type),
> > +             find_unique_abbrev(data->oid, abbrev), size_text);
> >       } else {
> > -             printf("%06o %s %s\t", mode, type_name(type),
> > -             find_unique_abbrev(oid, abbrev));
> > +             printf("%06o %s %s\t", data->mode, type_name(data->type),
> > +             find_unique_abbrev(data->oid, abbrev));
> >       }
> > -     baselen = base->len;
> > -     strbuf_addstr(base, pathname);
> > -     write_name_quoted_relative(base->buf,
> > +     baselen = data->base->len;
> > +     strbuf_addstr(data->base, data->pathname);
> > +     write_name_quoted_relative(data->base->buf,
> >                                  chomp_prefix ? ls_tree_prefix : NULL, stdout,
> >                                  line_termination);
> > -     strbuf_setlen(base, baselen);
> > +     strbuf_setlen(data->base, baselen);
> >       return 1;
> >  }
> >
> > @@ -154,11 +160,16 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
> >  {
> >       int retval = 0;
> >       size_t baselen;
> > -     enum object_type type = OBJ_BLOB;
> > +     struct shown_data data = {
> > +             .mode = mode,
> > +             .type = OBJ_BLOB,
> > +             .oid = oid,
> > +             .pathname = pathname,
> > +             .base = base,
> > +     };
> >
> > -     if (show_tree_init(&type, base, pathname, mode, &retval))
> > +     if (show_tree_init(&data.type, base, pathname, mode, &retval))
> >               return retval;
> > -
> >       if (!(shown_bits ^ SHOW_OBJECT_NAME)) {
> >               printf("%s%c", find_unique_abbrev(oid, abbrev), line_termination);
> >               return retval;
> > @@ -175,7 +186,7 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
> >
> >       if (!(shown_bits ^ SHOW_DEFAULT) ||
> >           !(shown_bits ^ (SHOW_DEFAULT | SHOW_SIZE)))
> > -             show_default(oid, type, pathname, mode, base);
> > +             show_default(&data);
> >
> >       return retval;
> >  }

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v8 5/8] ls-tree: split up the "init" part of show_tree()
  2022-01-01 13:50             ` [PATCH v8 5/8] ls-tree: split up the "init" part of show_tree() Teng Long
@ 2022-01-04  2:06               ` Junio C Hamano
  2022-01-04  9:49                 ` Teng Long
  0 siblings, 1 reply; 224+ messages in thread
From: Junio C Hamano @ 2022-01-04  2:06 UTC (permalink / raw)
  To: Teng Long; +Cc: avarab, congdanhqx, git, peff, tenglong.tl

Teng Long <dyroneteng@gmail.com> writes:


> -static int show_tree(const struct object_id *oid, struct strbuf *base,
> -		const char *pathname, unsigned mode, void *context)
> +static int show_tree_init(enum object_type *type, struct strbuf *base,
> +			  const char *pathname, unsigned mode, int *retval)

Don't we need some comment that explains what the function does,
what its return value means, etc.?

>  {
> -	int retval = 0;
> -	size_t baselen;
> -	enum object_type type = OBJ_BLOB;
> -
>  	if (S_ISGITLINK(mode)) {
> -		type = OBJ_COMMIT;
> +		*type = OBJ_COMMIT;
>  	} else if (S_ISDIR(mode)) {
>  		if (show_recursive(base->buf, base->len, pathname)) {
> -			retval = READ_TREE_RECURSIVE;
> +			*retval = READ_TREE_RECURSIVE;
>  			if (!(ls_options & LS_SHOW_TREES))
> -				return retval;
> +				return 1;
>  		}
> -		type = OBJ_TREE;
> +		*type = OBJ_TREE;
>  	}
>  	else if (ls_options & LS_TREE_ONLY)
> -		return 0;
> +		return 1;
> +	return 0;
> +}

It seems that even from its returned value, the caller cannot tell
if *retval was set by the function or not.  Perhaps it makes a much
cleaner API to assign 0 to *retval at the beginning of this function,
just like the original did so anyway? ...

> +static int show_tree(const struct object_id *oid, struct strbuf *base,
> +		const char *pathname, unsigned mode, void *context)
> +{
> +	int retval = 0;

... It would mean we can lose this initialization.

> +	size_t baselen;
> +	enum object_type type = OBJ_BLOB;
> +
> +	if (show_tree_init(&type, base, pathname, mode, &retval))
> +		return retval;

>  
>  	if (!(ls_options & LS_NAME_ONLY)) {
>  		if (ls_options & LS_SHOW_SIZE) {

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v8 6/8] ls-tree.c: support --object-only option for "git-ls-tree"
  2022-01-04  1:21               ` Junio C Hamano
@ 2022-01-04  7:29                 ` Teng Long
  0 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-04  7:29 UTC (permalink / raw)
  To: gitster; +Cc: avarab, congdanhqx, dyroneteng, git, peff, tenglong.tl

Junio C Hamano <gitster@pobox.com> writes:

> The resulting code looks unnecessarily complex and brittle; some
> SHOW_FOO mean SHOW_FOO_ONLY_AND_NOTHING_ELSE while other SHOW_BAR
> means SHOW_BAR_BUT_WE_MAY_SHOW_OTHER_THINGS_IN_LATER_PART, and the
> distinction is not clear from their names (which means it is hard
> to later extend and enhance the behaviour of the code).

I agree that the relevant code is not very clear, so I think I will
take these steps:

1. Rename "shown_bits" -> "shown_fields"

   What we show to the user is fields, not bits, so this fixes the
   unclear naming of "shown_bits" itself.

2. Rename the related macro definitions for "shown_fields":

       SHOW_FILE_NAME -> FILE_NAME_FIELD
       SHOW_SIZE -> SIZE_FIELD
       SHOW_OBJECT_NAME -> OBJECT_NAME_FIELD
       SHOW_TYPE -> TYPE_FIELD
       SHOW_DEFAULT -> DEFAULT_FIELDS

   I think the confusion between "SHOW_FOO_ONLY_AND_NOTHING_ELSE" and
   "SHOW_BAR_BUT_WE_MAY_SHOW_OTHER_THINGS_IN_LATER_PART" arises because
   some macro names mix "the flag" with "the operation on the flag".

   With the renaming, every definition means the same thing: it just
   defines a "field" with its own distinct bit(s).

   Fields can then be combined to show several of them, as the builtin
   "DEFAULT_FIELDS" does: it shows all the fields except the "size"
   field.

   At this point the "field(s)" are only definitions; they say nothing
   about "which" ones are shown or "how" they are shown.

3. Decide "which" fields to show

   Which fields are shown is recorded in "shown_fields"; its value is
   computed from the options by the function "parse_shown_fields()".

   The function name already describes what it does; the problem is
   that I did not rename "shown_bits" to "shown_fields" earlier. That
   may cause confusion, because we are not going to show bits but
   field(s). So I will do step 1.

4. Decide "how" the fields are shown

   Now that we know which field(s) we care about, the next step is
   to show them.

> > +	if (!(shown_bits ^ SHOW_FILE_NAME)) {

> Is the use of XOR operator significant here?
> 
> I.e. "if (shown_bits & SHOW_FILE_NAME)" would have been a much more
> natural way to guard "this is a block that shows the file name",
> than "the result MUST BE all bits off if we flip SHOW_FILE_NAME bit
> off".  If various SHOW_FOO bits are meant to be mutually exclusive,
> then "if ((shown_bits & SHOW_FILE_NAME) == SHOW_FILE_NAME)" would
> also make sense, but as I said upfront, it is unclear to me if
> shown_bits are meant to be a collection of "this bit means this
> field is shown (and it implies nothing else)", so I dunno.

    Not significant. Both work, and either is fine with me. Your
    version is more readable, and now I know how to handle this
    situation better.

    E.g., if we want to show only the "filename" field
    (with `--name-only`), we will use a check like
    "if ((shown_fields & FILE_NAME_FIELD) == FILE_NAME_FIELD)"
    to detect this situation.

    And if we want to show the fields in the builtin way
    (as described in 'git-ls-tree.txt', the default output format
    is compatible with what `--index-info --stdin` of
    'git update-index' expects), we will use "DEFAULT_FIELDS"
    instead.
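
As a rough illustration of the three checks discussed above, here is a
minimal sketch in C. The bit values, the contents of DEFAULT_FIELDS, and
the helper names (is_only_field, has_field, has_all_fields) are
assumptions made for this example; they are not taken from the patch.

```c
#include <stdbool.h>

/* Assumed bit assignments; the real values in builtin/ls-tree.c may differ. */
enum {
	FILE_NAME_FIELD   = 1 << 0,
	SIZE_FIELD        = 1 << 1,
	OBJECT_NAME_FIELD = 1 << 2,
	TYPE_FIELD        = 1 << 3,
	/* Assumed composition: every field except the size. */
	DEFAULT_FIELDS    = FILE_NAME_FIELD | OBJECT_NAME_FIELD | TYPE_FIELD
};

/* XOR form: true only when `field` is exactly the set of bits in `shown`. */
bool is_only_field(unsigned shown, unsigned field)
{
	return !(shown ^ field);
}

/* AND form: true when at least one bit of `field` is set in `shown`. */
bool has_field(unsigned shown, unsigned field)
{
	return (shown & field) != 0;
}

/* Mask-compare form: true when every bit of `mask` is set in `shown`. */
bool has_all_fields(unsigned shown, unsigned mask)
{
	return (shown & mask) == mask;
}
```

With a single-bit field, has_all_fields() and has_field() give the same
answer, so neither of them expresses "this field and nothing else"; only
the XOR form (or a plain == comparison) does.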


I am not sure whether this solves the problem you have in mind, as I
may have misunderstood. I can quickly finish this new patch and we can
look at it then.

Thanks.


^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v8 5/8] ls-tree: split up the "init" part of show_tree()
  2022-01-04  2:06               ` Junio C Hamano
@ 2022-01-04  9:49                 ` Teng Long
  0 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-04  9:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: avarab, congdanhqx, git, peff, tenglong.tl

Junio C Hamano writes:

> Don't we need some comment that explains what the function does,
> what its return value means, etc.?
>
> It seems that even from its returned value, the caller cannot tell
> if *retval was set by the function or not.  Perhaps it makes a much
> cleaner API to assign 0 to *retval at the beginning of this function,
> just like the original did so anyway? ...

Oh, sorry about that. I did not notice "retval" before because the
name is unremarkable and the tests passed anyway...

I just looked at it, and it is actually important: despite its name,
it affects the result. "retval" determines whether to CONTINUE
reading the current "tree" or BREAK into the next
one [1].

So I think this commit should be modified even though the tests pass.
First, I want to rename "retval" to something more meaningful; then
make the relevant "if" and "return" logic clearer with the new name;
finally, it will be consistent with the definitions in "read_tree_at()"
in "tree.c" [1].


[1] https://github.com/dyrone/git/blob/master/tree.c#L40
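
To make the suggested contract concrete, here is a simplified,
self-contained sketch of a helper that always assigns *retval before
doing anything else. It is not the code from the patch: the boolean
parameters stand in for the mode and option checks, the names
show_tree_init_sketch and entry_kind are hypothetical, and the value of
READ_TREE_RECURSIVE is assumed.

```c
#include <stdbool.h>

#define READ_TREE_RECURSIVE 1 /* assumed value for this sketch */

enum entry_kind { ENTRY_BLOB, ENTRY_TREE, ENTRY_COMMIT };

/*
 * Classify one tree entry.
 *
 * Returns 1 when the caller should print nothing and return *retval
 * right away; returns 0 when the caller should print the entry and
 * then return *retval.  Both *kind and *retval are assigned on every
 * path, so the caller needs no initialization of its own.
 */
int show_tree_init_sketch(enum entry_kind *kind, bool is_gitlink, bool is_dir,
			  bool recurse_into, bool show_trees, bool tree_only,
			  int *retval)
{
	*retval = 0;
	*kind = ENTRY_BLOB;

	if (is_gitlink) {
		*kind = ENTRY_COMMIT;
	} else if (is_dir) {
		if (recurse_into) {
			*retval = READ_TREE_RECURSIVE;
			if (!show_trees)
				return 1;
		}
		*kind = ENTRY_TREE;
	} else if (tree_only) {
		return 1;
	}
	return 0;
}
```

A caller would then write "if (show_tree_init_sketch(...)) return retval;"
without having to set retval to 0 beforehand.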

Thanks.

Junio C Hamano <gitster@pobox.com> wrote on Tue, Jan 4, 2022 at 10:06:
>
> Teng Long <dyroneteng@gmail.com> writes:
>
>
> > -static int show_tree(const struct object_id *oid, struct strbuf *base,
> > -             const char *pathname, unsigned mode, void *context)
> > +static int show_tree_init(enum object_type *type, struct strbuf *base,
> > +                       const char *pathname, unsigned mode, int *retval)
>
> Don't we need some comment that explains what the function does,
> what its return value means, etc.?
>
> >  {
> > -     int retval = 0;
> > -     size_t baselen;
> > -     enum object_type type = OBJ_BLOB;
> > -
> >       if (S_ISGITLINK(mode)) {
> > -             type = OBJ_COMMIT;
> > +             *type = OBJ_COMMIT;
> >       } else if (S_ISDIR(mode)) {
> >               if (show_recursive(base->buf, base->len, pathname)) {
> > -                     retval = READ_TREE_RECURSIVE;
> > +                     *retval = READ_TREE_RECURSIVE;
> >                       if (!(ls_options & LS_SHOW_TREES))
> > -                             return retval;
> > +                             return 1;
> >               }
> > -             type = OBJ_TREE;
> > +             *type = OBJ_TREE;
> >       }
> >       else if (ls_options & LS_TREE_ONLY)
> > -             return 0;
> > +             return 1;
> > +     return 0;
> > +}
>
> It seems that even from its returned value, the caller cannot tell
> if *retval was set by the function or not.  Perhaps it makes a much
> cleaner API to assign 0 to *retval at the beginning of this function,
> just like the original did so anyway? ...
>
> > +static int show_tree(const struct object_id *oid, struct strbuf *base,
> > +             const char *pathname, unsigned mode, void *context)
> > +{
> > +     int retval = 0;
>
> ... It would mean we can lose this initialization.
>
> > +     size_t baselen;
> > +     enum object_type type = OBJ_BLOB;
> > +
> > +     if (show_tree_init(&type, base, pathname, mode, &retval))
> > +             return retval;
>
> >
> >       if (!(ls_options & LS_NAME_ONLY)) {
> >               if (ls_options & LS_SHOW_SIZE) {

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v8 8/8] ls-tree.c: introduce "--format" option
  2022-01-01 13:50             ` [PATCH v8 8/8] ls-tree.c: introduce "--format" option Teng Long
@ 2022-01-04 14:38               ` Johannes Schindelin
  2022-01-04 15:17                 ` Johannes Schindelin
  2022-01-05  9:58                 ` Teng Long
  0 siblings, 2 replies; 224+ messages in thread
From: Johannes Schindelin @ 2022-01-04 14:38 UTC (permalink / raw)
  To: Teng Long; +Cc: avarab, congdanhqx, git, gitster, peff, tenglong.tl

Hi Teng,

On Sat, 1 Jan 2022, Teng Long wrote:

> diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
> index 009ffeb15d..6e3e5a4d06 100644
> --- a/builtin/ls-tree.c
> +++ b/builtin/ls-tree.c
> @@ -56,23 +56,75 @@ enum {
>
>  static int cmdmode = MODE_UNSPECIFIED;
>
> -static int parse_shown_fields(void)
> +static const char *format;
> +static const char *default_format = "%(mode) %(type) %(object)%x09%(file)";
> +static const char *long_format = "%(mode) %(type) %(object) %(size:padded)%x09%(file)";
> +static const char *name_only_format = "%(file)";
> +static const char *object_only_format = "%(object)";
> +
> +static void expand_objectsize(struct strbuf *line, const struct object_id *oid,
> +			      const enum object_type type, unsigned int padded)
>  {
> -	if (cmdmode == MODE_NAME_ONLY) {
> -		shown_bits = SHOW_FILE_NAME;
> -		return 0;
> +	if (type == OBJ_BLOB) {
> +		unsigned long size;
> +		if (oid_object_info(the_repository, oid, &size) < 0)
> +			die(_("could not get object info about '%s'"),
> +			    oid_to_hex(oid));
> +		if (padded)
> +			strbuf_addf(line, "%7" PRIuMAX, (uintmax_t)size);
> +		else
> +			strbuf_addf(line, "%" PRIuMAX, (uintmax_t)size);
> +	} else if (padded) {
> +		strbuf_addf(line, "%7s", "-");

This, along with two other similar instances, triggers the
`static-analysis` job in the CI failure of `seen`. The suggested diff is:


-- snip --
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 6e3e5a4d0634..8301d1a15f9a 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -75,7 +75,7 @@ static void expand_objectsize(struct strbuf *line, const struct object_id *oid,
 		else
 			strbuf_addf(line, "%" PRIuMAX, (uintmax_t)size);
 	} else if (padded) {
-		strbuf_addf(line, "%7s", "-");
+		strbuf_addstr(line, "-");
 	} else {
 		strbuf_addstr(line, "-");
 	}
@@ -110,7 +110,7 @@ static size_t expand_show_tree(struct strbuf *line, const char *start,
 	} else if (skip_prefix(start, "(size)", &p)) {
 		expand_objectsize(line, data->oid, data->type, 0);
 	} else if (skip_prefix(start, "(object)", &p)) {
-		strbuf_addstr(line, find_unique_abbrev(data->oid, abbrev));
+		strbuf_add_unique_abbrev(line, data->oid, abbrev);
 	} else if (skip_prefix(start, "(file)", &p)) {
 		const char *name = data->base->buf;
 		const char *prefix = chomp_prefix ? ls_tree_prefix : NULL;
@@ -119,7 +119,7 @@ static size_t expand_show_tree(struct strbuf *line, const char *start,
 		strbuf_addstr(data->base, data->pathname);
 		name = relative_path(data->base->buf, prefix, &sb);
 		quote_c_style(name, &quoted, NULL, 0);
-		strbuf_addstr(line, quoted.buf);
+		strbuf_addbuf(line, &quoted);
 	} else {
 		errlen = (unsigned long)len;
 		die(_("bad ls-tree format: %%%.*s"), errlen, start);
-- snap --

But I think that the first hunk indicates a deeper issue, as `%7s`
probably meant to pad the dash to seven dashes (which that format won't
accomplish, but `strbuf_addchars()` would)?

Ciao,
Dscho

> +	} else {
> +		strbuf_addstr(line, "-");
>  	}
> -	if (cmdmode == MODE_OBJECT_ONLY) {
> -		shown_bits = SHOW_OBJECT_NAME;
> -		return 0;
> +}
> +
> +static size_t expand_show_tree(struct strbuf *line, const char *start,
> +			       void *context)
> +{
> +	struct shown_data *data = context;
> +	const char *end;
> +	const char *p;
> +	unsigned int errlen;
> +	size_t len;
> +	len = strbuf_expand_literal_cb(line, start, NULL);
> +	if (len)
> +		return len;
> +
> +	if (*start != '(')
> +		die(_("bad ls-tree format: as '%s'"), start);
> +
> +	end = strchr(start + 1, ')');
> +	if (!end)
> +		die(_("bad ls-tree format: element '%s' does not end in ')'"), start);
> +
> +	len = end - start + 1;
> +	if (skip_prefix(start, "(mode)", &p)) {
> +		strbuf_addf(line, "%06o", data->mode);
> +	} else if (skip_prefix(start, "(type)", &p)) {
> +		strbuf_addstr(line, type_name(data->type));
> +	} else if (skip_prefix(start, "(size:padded)", &p)) {
> +		expand_objectsize(line, data->oid, data->type, 1);
> +	} else if (skip_prefix(start, "(size)", &p)) {
> +		expand_objectsize(line, data->oid, data->type, 0);
> +	} else if (skip_prefix(start, "(object)", &p)) {
> +		strbuf_addstr(line, find_unique_abbrev(data->oid, abbrev));
> +	} else if (skip_prefix(start, "(file)", &p)) {
> +		const char *name = data->base->buf;
> +		const char *prefix = chomp_prefix ? ls_tree_prefix : NULL;
> +		struct strbuf quoted = STRBUF_INIT;
> +		struct strbuf sb = STRBUF_INIT;
> +		strbuf_addstr(data->base, data->pathname);
> +		name = relative_path(data->base->buf, prefix, &sb);
> +		quote_c_style(name, &quoted, NULL, 0);
> +		strbuf_addstr(line, quoted.buf);
> +	} else {
> +		errlen = (unsigned long)len;
> +		die(_("bad ls-tree format: %%%.*s"), errlen, start);
>  	}
> -	if (!ls_options || (ls_options & LS_RECURSIVE)
> -	    || (ls_options & LS_SHOW_TREES)
> -	    || (ls_options & LS_TREE_ONLY))
> -		shown_bits = SHOW_DEFAULT;
> -	if (cmdmode == MODE_LONG)
> -		shown_bits = SHOW_DEFAULT | SHOW_SIZE;
> -	return 1;
> +	return len;
>  }
>
>  static int show_recursive(const char *base, size_t baselen,
> @@ -106,6 +158,75 @@ static int show_recursive(const char *base, size_t baselen,
>  	return 0;
>  }
>
> +static int show_tree_init(enum object_type *type, struct strbuf *base,
> +			  const char *pathname, unsigned mode, int *retval)
> +{
> +	if (S_ISGITLINK(mode)) {
> +		*type = OBJ_COMMIT;
> +	} else if (S_ISDIR(mode)) {
> +		if (show_recursive(base->buf, base->len, pathname)) {
> +			*retval = READ_TREE_RECURSIVE;
> +			if (!(ls_options & LS_SHOW_TREES))
> +				return 1;
> +		}
> +		*type = OBJ_TREE;
> +	}
> +	else if (ls_options & LS_TREE_ONLY)
> +		return 1;
> +	return 0;
> +}
> +
> +static int show_tree_fmt(const struct object_id *oid, struct strbuf *base,
> +			 const char *pathname, unsigned mode, void *context)
> +{
> +	size_t baselen;
> +	int retval = 0;
> +	struct strbuf line = STRBUF_INIT;
> +	struct shown_data data = {
> +		.mode = mode,
> +		.type = OBJ_BLOB,
> +		.oid = oid,
> +		.pathname = pathname,
> +		.base = base,
> +	};
> +
> +	if (show_tree_init(&data.type, base, pathname, mode, &retval))
> +		return retval;
> +
> +	baselen = base->len;
> +	strbuf_expand(&line, format, expand_show_tree, &data);
> +	strbuf_addch(&line, line_termination);
> +	fwrite(line.buf, line.len, 1, stdout);
> +	strbuf_setlen(base, baselen);
> +	return retval;
> +}
> +
> +static int parse_shown_fields(void)
> +{
> +	if (cmdmode == MODE_NAME_ONLY ||
> +	    (format && !strcmp(format, name_only_format))) {
> +		shown_bits = SHOW_FILE_NAME;
> +		return 1;
> +	}
> +
> +	if (cmdmode == MODE_OBJECT_ONLY ||
> +	    (format && !strcmp(format, object_only_format))) {
> +		shown_bits = SHOW_OBJECT_NAME;
> +		return 1;
> +	}
> +
> +	if (!ls_options || (ls_options & LS_RECURSIVE)
> +	    || (ls_options & LS_SHOW_TREES)
> +	    || (ls_options & LS_TREE_ONLY)
> +		|| (format && !strcmp(format, default_format)))
> +		shown_bits = SHOW_DEFAULT;
> +
> +	if (cmdmode == MODE_LONG ||
> +		(format && !strcmp(format, long_format)))
> +		shown_bits = SHOW_DEFAULT | SHOW_SIZE;
> +	return 1;
> +}
> +
>  static int show_default(struct shown_data *data)
>  {
>  	size_t baselen = data->base->len;
> @@ -137,24 +258,6 @@ static int show_default(struct shown_data *data)
>  	return 1;
>  }
>
> -static int show_tree_init(enum object_type *type, struct strbuf *base,
> -			  const char *pathname, unsigned mode, int *retval)
> -{
> -	if (S_ISGITLINK(mode)) {
> -		*type = OBJ_COMMIT;
> -	} else if (S_ISDIR(mode)) {
> -		if (show_recursive(base->buf, base->len, pathname)) {
> -			*retval = READ_TREE_RECURSIVE;
> -			if (!(ls_options & LS_SHOW_TREES))
> -				return 1;
> -		}
> -		*type = OBJ_TREE;
> -	}
> -	else if (ls_options & LS_TREE_ONLY)
> -		return 1;
> -	return 0;
> -}
> -
>  static int show_tree(const struct object_id *oid, struct strbuf *base,
>  		const char *pathname, unsigned mode, void *context)
>  {
> @@ -196,6 +299,7 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
>  	struct object_id oid;
>  	struct tree *tree;
>  	int i, full_tree = 0;
> +	read_tree_fn_t fn = show_tree;
>  	const struct option ls_tree_options[] = {
>  		OPT_BIT('d', NULL, &ls_options, N_("only show trees"),
>  			LS_TREE_ONLY),
> @@ -218,6 +322,9 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
>  		OPT_BOOL(0, "full-tree", &full_tree,
>  			 N_("list entire tree; not just current directory "
>  			    "(implies --full-name)")),
> +		OPT_STRING_F(0, "format", &format, N_("format"),
> +			     N_("format to use for the output"),
> +			     PARSE_OPT_NONEG),
>  		OPT__ABBREV(&abbrev),
>  		OPT_END()
>  	};
> @@ -238,6 +345,10 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
>  	    ((LS_TREE_ONLY|LS_RECURSIVE) & ls_options))
>  		ls_options |= LS_SHOW_TREES;
>
> +	if (format && cmdmode)
> +		usage_msg_opt(
> +			_("--format can't be combined with other format-altering options"),
> +			ls_tree_usage, ls_tree_options);
>  	if (argc < 1)
>  		usage_with_options(ls_tree_usage, ls_tree_options);
>  	if (get_oid(argv[0], &oid))
> @@ -261,6 +372,18 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
>  	tree = parse_tree_indirect(&oid);
>  	if (!tree)
>  		die("not a tree object");
> -	return !!read_tree(the_repository, tree,
> -			   &pathspec, show_tree, NULL);
> +
> +	/*
> +	 * The generic show_tree_fmt() is slower than show_tree(), so
> +	 * take the fast path if possible.
> +	 */
> +	if (format && (!strcmp(format, default_format) ||
> +				   !strcmp(format, long_format) ||
> +				   !strcmp(format, name_only_format) ||
> +				   !strcmp(format, object_only_format)))
> +		fn = show_tree;
> +	else if (format)
> +		fn = show_tree_fmt;
> +
> +	return !!read_tree(the_repository, tree, &pathspec, fn, NULL);
>  }
> diff --git a/t/t3105-ls-tree-format.sh b/t/t3105-ls-tree-format.sh
> new file mode 100755
> index 0000000000..92b4d240e8
> --- /dev/null
> +++ b/t/t3105-ls-tree-format.sh
> @@ -0,0 +1,55 @@
> +#!/bin/sh
> +
> +test_description='ls-tree --format'
> +
> +TEST_PASSES_SANITIZE_LEAK=true
> +. ./test-lib.sh
> +
> +test_expect_success 'ls-tree --format usage' '
> +	test_expect_code 129 git ls-tree --format=fmt -l &&
> +	test_expect_code 129 git ls-tree --format=fmt --name-only &&
> +	test_expect_code 129 git ls-tree --format=fmt --name-status &&
> +	test_expect_code 129 git ls-tree --format=fmt --object-only
> +'
> +
> +test_expect_success 'setup' '
> +	mkdir dir &&
> +	test_commit dir/sub-file &&
> +	test_commit top-file
> +'
> +
> +test_ls_tree_format () {
> +	format=$1 &&
> +	opts=$2 &&
> +	shift 2 &&
> +	git ls-tree $opts -r HEAD >expect.raw &&
> +	sed "s/^/> /" >expect <expect.raw &&
> +	git ls-tree --format="> $format" -r HEAD >actual &&
> +	test_cmp expect actual
> +}
> +
> +test_expect_success 'ls-tree --format=<default-like>' '
> +	test_ls_tree_format \
> +		"%(mode) %(type) %(object)%x09%(file)" \
> +		""
> +'
> +
> +test_expect_success 'ls-tree --format=<long-like>' '
> +	test_ls_tree_format \
> +		"%(mode) %(type) %(object) %(size:padded)%x09%(file)" \
> +		"--long"
> +'
> +
> +test_expect_success 'ls-tree --format=<name-only-like>' '
> +	test_ls_tree_format \
> +		"%(file)" \
> +		"--name-only"
> +'
> +
> +test_expect_success 'ls-tree --format=<object-only-like>' '
> +	test_ls_tree_format \
> +		"%(object)" \
> +		"--object-only"
> +'
> +
> +test_done
> --
> 2.33.0.rc1.1802.gbb1c3936fb.dirty
>
>
>

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v8 8/8] ls-tree.c: introduce "--format" option
  2022-01-04 14:38               ` Johannes Schindelin
@ 2022-01-04 15:17                 ` Johannes Schindelin
  2022-01-05  9:40                   ` Teng Long
  2022-01-05  9:58                 ` Teng Long
  1 sibling, 1 reply; 224+ messages in thread
From: Johannes Schindelin @ 2022-01-04 15:17 UTC (permalink / raw)
  To: Teng Long; +Cc: avarab, congdanhqx, git, gitster, peff, tenglong.tl

Hi Teng,

On Tue, 4 Jan 2022, Johannes Schindelin wrote:

> -- snip --
> diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
> index 6e3e5a4d0634..8301d1a15f9a 100644
> --- a/builtin/ls-tree.c
> +++ b/builtin/ls-tree.c
> @@ -75,7 +75,7 @@ static void expand_objectsize(struct strbuf *line, const struct object_id *oid,
>  		else
>  			strbuf_addf(line, "%" PRIuMAX, (uintmax_t)size);
>  	} else if (padded) {
> -		strbuf_addf(line, "%7s", "-");
> +		strbuf_addstr(line, "-");
>  	} else {
>  		strbuf_addstr(line, "-");
>  	}
> @@ -110,7 +110,7 @@ static size_t expand_show_tree(struct strbuf *line, const char *start,
>  	} else if (skip_prefix(start, "(size)", &p)) {
>  		expand_objectsize(line, data->oid, data->type, 0);
>  	} else if (skip_prefix(start, "(object)", &p)) {
> -		strbuf_addstr(line, find_unique_abbrev(data->oid, abbrev));
> +		strbuf_add_unique_abbrev(line, data->oid, abbrev);
>  	} else if (skip_prefix(start, "(file)", &p)) {
>  		const char *name = data->base->buf;
>  		const char *prefix = chomp_prefix ? ls_tree_prefix : NULL;
> @@ -119,7 +119,7 @@ static size_t expand_show_tree(struct strbuf *line, const char *start,
>  		strbuf_addstr(data->base, data->pathname);
>  		name = relative_path(data->base->buf, prefix, &sb);
>  		quote_c_style(name, &quoted, NULL, 0);
> -		strbuf_addstr(line, quoted.buf);
> +		strbuf_addbuf(line, &quoted);
>  	} else {
>  		errlen = (unsigned long)len;
>  		die(_("bad ls-tree format: %%%.*s"), errlen, start);
> -- snap --

In addition to that, you need these to quiet down the `linux-leaks` job of
our CI build, which is also failing with your patches:

-- snipsnap --
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 8301d1a15f9a..0dc0327e4785 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -120,6 +120,8 @@ static size_t expand_show_tree(struct strbuf *line, const char *start,
 		name = relative_path(data->base->buf, prefix, &sb);
 		quote_c_style(name, &quoted, NULL, 0);
 		strbuf_addbuf(line, &quoted);
+		strbuf_release(&sb);
+		strbuf_release(&quoted);
 	} else {
 		errlen = (unsigned long)len;
 		die(_("bad ls-tree format: %%%.*s"), errlen, start);
@@ -197,6 +199,7 @@ static int show_tree_fmt(const struct object_id *oid, struct strbuf *base,
 	strbuf_expand(&line, format, expand_show_tree, &data);
 	strbuf_addch(&line, line_termination);
 	fwrite(line.buf, line.len, 1, stdout);
+	strbuf_release(&line);
 	strbuf_setlen(base, baselen);
 	return retval;
 }

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v8 8/8] ls-tree.c: introduce "--format" option
  2022-01-04 15:17                 ` Johannes Schindelin
@ 2022-01-05  9:40                   ` Teng Long
  0 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-05  9:40 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: avarab, congdanhqx, git, Junio C Hamano, peff, tenglong.tl

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> In addition to that, you need these to quiet down the `linux-leaks` job of
> our CI build, which is also failing with your patches:

Thanks.
I will fix it in the next patch.
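
For reference, the init/use/release pattern behind that fix can be
sketched as follows. This is only an illustration: "struct buf",
BUF_INIT, buf_addstr(), buf_release() and format_entry() are
hypothetical stand-ins written for this example, not git's strbuf API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a growable string buffer. */
struct buf {
	char *data;
	size_t len;
};
#define BUF_INIT { NULL, 0 }

/* Append s to b; returns 0 on success, -1 if allocation fails. */
int buf_addstr(struct buf *b, const char *s)
{
	size_t n = strlen(s);
	char *p = realloc(b->data, b->len + n + 1);

	if (!p)
		return -1;
	memcpy(p + b->len, s, n + 1);
	b->data = p;
	b->len += n;
	return 0;
}

/* Free the buffer's memory and reset it to the empty state. */
void buf_release(struct buf *b)
{
	free(b->data);
	b->data = NULL;
	b->len = 0;
}

/*
 * Called once per entry: the temporary buffer is created, used and
 * released inside the call, so nothing accumulates across entries.
 * Returns the number of bytes appended to line (0 on failure).
 */
size_t format_entry(struct buf *line, const char *name)
{
	struct buf quoted = BUF_INIT;
	size_t added = 0;

	if (!buf_addstr(&quoted, "\"") &&
	    !buf_addstr(&quoted, name) &&
	    !buf_addstr(&quoted, "\"") &&
	    !buf_addstr(line, quoted.data))
		added = quoted.len;

	buf_release(&quoted); /* without this, every call would leak */
	return added;
}
```

The point is only that a callback invoked for every tree entry must
release each temporary buffer it initializes before returning.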

Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote on Tue, Jan 4, 2022 at 23:17:
>
> Hi Teng,
>
> On Tue, 4 Jan 2022, Johannes Schindelin wrote:
>
> > -- snip --
> > diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
> > index 6e3e5a4d0634..8301d1a15f9a 100644
> > --- a/builtin/ls-tree.c
> > +++ b/builtin/ls-tree.c
> > @@ -75,7 +75,7 @@ static void expand_objectsize(struct strbuf *line, const struct object_id *oid,
> >               else
> >                       strbuf_addf(line, "%" PRIuMAX, (uintmax_t)size);
> >       } else if (padded) {
> > -             strbuf_addf(line, "%7s", "-");
> > +             strbuf_addstr(line, "-");
> >       } else {
> >               strbuf_addstr(line, "-");
> >       }
> > @@ -110,7 +110,7 @@ static size_t expand_show_tree(struct strbuf *line, const char *start,
> >       } else if (skip_prefix(start, "(size)", &p)) {
> >               expand_objectsize(line, data->oid, data->type, 0);
> >       } else if (skip_prefix(start, "(object)", &p)) {
> > -             strbuf_addstr(line, find_unique_abbrev(data->oid, abbrev));
> > +             strbuf_add_unique_abbrev(line, data->oid, abbrev);
> >       } else if (skip_prefix(start, "(file)", &p)) {
> >               const char *name = data->base->buf;
> >               const char *prefix = chomp_prefix ? ls_tree_prefix : NULL;
> > @@ -119,7 +119,7 @@ static size_t expand_show_tree(struct strbuf *line, const char *start,
> >               strbuf_addstr(data->base, data->pathname);
> >               name = relative_path(data->base->buf, prefix, &sb);
> >               quote_c_style(name, &quoted, NULL, 0);
> > -             strbuf_addstr(line, quoted.buf);
> > +             strbuf_addbuf(line, &quoted);
> >       } else {
> >               errlen = (unsigned long)len;
> >               die(_("bad ls-tree format: %%%.*s"), errlen, start);
> > -- snap --
>
> In addition to that, you need these to quiet down the `linux-leaks` job of
> our CI build, which is also failing with your patches:
>
> -- snipsnap --
> diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
> index 8301d1a15f9a..0dc0327e4785 100644
> --- a/builtin/ls-tree.c
> +++ b/builtin/ls-tree.c
> @@ -120,6 +120,8 @@ static size_t expand_show_tree(struct strbuf *line, const char *start,
>                 name = relative_path(data->base->buf, prefix, &sb);
>                 quote_c_style(name, &quoted, NULL, 0);
>                 strbuf_addbuf(line, &quoted);
> +               strbuf_release(&sb);
> +               strbuf_release(&quoted);
>         } else {
>                 errlen = (unsigned long)len;
>                 die(_("bad ls-tree format: %%%.*s"), errlen, start);
> @@ -197,6 +199,7 @@ static int show_tree_fmt(const struct object_id *oid, struct strbuf *base,
>         strbuf_expand(&line, format, expand_show_tree, &data);
>         strbuf_addch(&line, line_termination);
>         fwrite(line.buf, line.len, 1, stdout);
> +       strbuf_release(&line);
>         strbuf_setlen(base, baselen);
>         return retval;
>  }

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v8 8/8] ls-tree.c: introduce "--format" option
  2022-01-04 14:38               ` Johannes Schindelin
  2022-01-04 15:17                 ` Johannes Schindelin
@ 2022-01-05  9:58                 ` Teng Long
  2022-01-05 13:09                   ` Johannes Schindelin
  1 sibling, 1 reply; 224+ messages in thread
From: Teng Long @ 2022-01-05  9:58 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: avarab, congdanhqx, git, Junio C Hamano, peff, tenglong.tl

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> This, along with two other similar instances, triggers the
> `static-analysis` job in the CI failure of `seen`. The suggested diff is:

I will fix the second and third in the next patch.

As for the first one, I am actually a little puzzled by this:

> -               strbuf_addf(line, "%7s", "-");
> +               strbuf_addstr(line, "-");

> But I think that the first hunk indicates a deeper issue, as `%7s`
> probably meant to pad the dash to seven dashes (which that format won't
> accomplish, but `strbuf_addchars()` would)?

`strbuf_addf(line, "%7s", "-");` is used here to align the column to a
width of seven characters, not to repeat the dash seven times.

The recommended fix, `strbuf_addstr(line, "-");`, looks a little odd to
me because it adds only a single dash, with no padding.

I think the current code produces output identical to "master" [1]; I
ran a simple test of the "strbuf_addf()" call and it seems to work
fine.
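
The difference between the two readings of "%7s" can be checked with
plain C, independent of git. In the sketch below, pad_dash() and
repeat_dash() are hypothetical helper names used only for this
illustration; snprintf() and memset() are standard C.

```c
#include <stdio.h>
#include <string.h>

/* Right-align "-" in a 7-column field, as "%7s" does. */
void pad_dash(char *out, size_t outlen)
{
	snprintf(out, outlen, "%7s", "-");
}

/* Write n copies of '-' (truncated to fit), which is the other reading. */
void repeat_dash(char *out, size_t outlen, size_t n)
{
	if (!outlen)
		return;
	if (n >= outlen)
		n = outlen - 1;
	memset(out, '-', n);
	out[n] = '\0';
}
```

pad_dash() yields six spaces followed by one dash, which keeps the size
column aligned; repeat_dash() with n = 7 yields seven dashes.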

[1] https://github.com/git/git/blob/master/builtin/ls-tree.c#L106

Thanks.

Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote on Tue, Jan 4, 2022 at 22:38:
>
> Hi Teng,
>
> On Sat, 1 Jan 2022, Teng Long wrote:
>
> > diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
> > index 009ffeb15d..6e3e5a4d06 100644
> > --- a/builtin/ls-tree.c
> > +++ b/builtin/ls-tree.c
> > @@ -56,23 +56,75 @@ enum {
> >
> >  static int cmdmode = MODE_UNSPECIFIED;
> >
> > -static int parse_shown_fields(void)
> > +static const char *format;
> > +static const char *default_format = "%(mode) %(type) %(object)%x09%(file)";
> > +static const char *long_format = "%(mode) %(type) %(object) %(size:padded)%x09%(file)";
> > +static const char *name_only_format = "%(file)";
> > +static const char *object_only_format = "%(object)";
> > +
> > +static void expand_objectsize(struct strbuf *line, const struct object_id *oid,
> > +                           const enum object_type type, unsigned int padded)
> >  {
> > -     if (cmdmode == MODE_NAME_ONLY) {
> > -             shown_bits = SHOW_FILE_NAME;
> > -             return 0;
> > +     if (type == OBJ_BLOB) {
> > +             unsigned long size;
> > +             if (oid_object_info(the_repository, oid, &size) < 0)
> > +                     die(_("could not get object info about '%s'"),
> > +                         oid_to_hex(oid));
> > +             if (padded)
> > +                     strbuf_addf(line, "%7" PRIuMAX, (uintmax_t)size);
> > +             else
> > +                     strbuf_addf(line, "%" PRIuMAX, (uintmax_t)size);
> > +     } else if (padded) {
> > +             strbuf_addf(line, "%7s", "-");
>
> This, along with two other similar instances, triggers the
> `static-analysis` job in the CI failure of `seen`. The suggested diff is:
>
>
> -- snip --
> diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
> index 6e3e5a4d0634..8301d1a15f9a 100644
> --- a/builtin/ls-tree.c
> +++ b/builtin/ls-tree.c
> @@ -75,7 +75,7 @@ static void expand_objectsize(struct strbuf *line, const struct object_id *oid,
>                 else
>                         strbuf_addf(line, "%" PRIuMAX, (uintmax_t)size);
>         } else if (padded) {
> -               strbuf_addf(line, "%7s", "-");
> +               strbuf_addstr(line, "-");
>         } else {
>                 strbuf_addstr(line, "-");
>         }
> @@ -110,7 +110,7 @@ static size_t expand_show_tree(struct strbuf *line, const char *start,
>         } else if (skip_prefix(start, "(size)", &p)) {
>                 expand_objectsize(line, data->oid, data->type, 0);
>         } else if (skip_prefix(start, "(object)", &p)) {
> -               strbuf_addstr(line, find_unique_abbrev(data->oid, abbrev));
> +               strbuf_add_unique_abbrev(line, data->oid, abbrev);
>         } else if (skip_prefix(start, "(file)", &p)) {
>                 const char *name = data->base->buf;
>                 const char *prefix = chomp_prefix ? ls_tree_prefix : NULL;
> @@ -119,7 +119,7 @@ static size_t expand_show_tree(struct strbuf *line, const char *start,
>                 strbuf_addstr(data->base, data->pathname);
>                 name = relative_path(data->base->buf, prefix, &sb);
>                 quote_c_style(name, &quoted, NULL, 0);
> -               strbuf_addstr(line, quoted.buf);
> +               strbuf_addbuf(line, &quoted);
>         } else {
>                 errlen = (unsigned long)len;
>                 die(_("bad ls-tree format: %%%.*s"), errlen, start);
> -- snap --
>
> But I think that the first hunk indicates a deeper issue, as `%7s`
> probably meant to pad the dash to seven dashes (which that format won't
> accomplish, but `strbuf_addchars()` would)?
>
> Ciao,
> Dscho
>
> > +     } else {
> > +             strbuf_addstr(line, "-");
> >       }
> > -     if (cmdmode == MODE_OBJECT_ONLY) {
> > -             shown_bits = SHOW_OBJECT_NAME;
> > -             return 0;
> > +}
> > +
> > +static size_t expand_show_tree(struct strbuf *line, const char *start,
> > +                            void *context)
> > +{
> > +     struct shown_data *data = context;
> > +     const char *end;
> > +     const char *p;
> > +     unsigned int errlen;
> > +     size_t len;
> > +     len = strbuf_expand_literal_cb(line, start, NULL);
> > +     if (len)
> > +             return len;
> > +
> > +     if (*start != '(')
> > +             die(_("bad ls-tree format: as '%s'"), start);
> > +
> > +     end = strchr(start + 1, ')');
> > +     if (!end)
> > +             die(_("bad ls-tree format: element '%s' does not end in ')'"), start);
> > +
> > +     len = end - start + 1;
> > +     if (skip_prefix(start, "(mode)", &p)) {
> > +             strbuf_addf(line, "%06o", data->mode);
> > +     } else if (skip_prefix(start, "(type)", &p)) {
> > +             strbuf_addstr(line, type_name(data->type));
> > +     } else if (skip_prefix(start, "(size:padded)", &p)) {
> > +             expand_objectsize(line, data->oid, data->type, 1);
> > +     } else if (skip_prefix(start, "(size)", &p)) {
> > +             expand_objectsize(line, data->oid, data->type, 0);
> > +     } else if (skip_prefix(start, "(object)", &p)) {
> > +             strbuf_addstr(line, find_unique_abbrev(data->oid, abbrev));
> > +     } else if (skip_prefix(start, "(file)", &p)) {
> > +             const char *name = data->base->buf;
> > +             const char *prefix = chomp_prefix ? ls_tree_prefix : NULL;
> > +             struct strbuf quoted = STRBUF_INIT;
> > +             struct strbuf sb = STRBUF_INIT;
> > +             strbuf_addstr(data->base, data->pathname);
> > +             name = relative_path(data->base->buf, prefix, &sb);
> > +             quote_c_style(name, &quoted, NULL, 0);
> > +             strbuf_addstr(line, quoted.buf);
> > +     } else {
> > +             errlen = (unsigned long)len;
> > +             die(_("bad ls-tree format: %%%.*s"), errlen, start);
> >       }
> > -     if (!ls_options || (ls_options & LS_RECURSIVE)
> > -         || (ls_options & LS_SHOW_TREES)
> > -         || (ls_options & LS_TREE_ONLY))
> > -             shown_bits = SHOW_DEFAULT;
> > -     if (cmdmode == MODE_LONG)
> > -             shown_bits = SHOW_DEFAULT | SHOW_SIZE;
> > -     return 1;
> > +     return len;
> >  }
> >
> >  static int show_recursive(const char *base, size_t baselen,
> > @@ -106,6 +158,75 @@ static int show_recursive(const char *base, size_t baselen,
> >       return 0;
> >  }
> >
> > +static int show_tree_init(enum object_type *type, struct strbuf *base,
> > +                       const char *pathname, unsigned mode, int *retval)
> > +{
> > +     if (S_ISGITLINK(mode)) {
> > +             *type = OBJ_COMMIT;
> > +     } else if (S_ISDIR(mode)) {
> > +             if (show_recursive(base->buf, base->len, pathname)) {
> > +                     *retval = READ_TREE_RECURSIVE;
> > +                     if (!(ls_options & LS_SHOW_TREES))
> > +                             return 1;
> > +             }
> > +             *type = OBJ_TREE;
> > +     }
> > +     else if (ls_options & LS_TREE_ONLY)
> > +             return 1;
> > +     return 0;
> > +}
> > +
> > +static int show_tree_fmt(const struct object_id *oid, struct strbuf *base,
> > +                      const char *pathname, unsigned mode, void *context)
> > +{
> > +     size_t baselen;
> > +     int retval = 0;
> > +     struct strbuf line = STRBUF_INIT;
> > +     struct shown_data data = {
> > +             .mode = mode,
> > +             .type = OBJ_BLOB,
> > +             .oid = oid,
> > +             .pathname = pathname,
> > +             .base = base,
> > +     };
> > +
> > +     if (show_tree_init(&data.type, base, pathname, mode, &retval))
> > +             return retval;
> > +
> > +     baselen = base->len;
> > +     strbuf_expand(&line, format, expand_show_tree, &data);
> > +     strbuf_addch(&line, line_termination);
> > +     fwrite(line.buf, line.len, 1, stdout);
> > +     strbuf_setlen(base, baselen);
> > +     return retval;
> > +}
> > +
> > +static int parse_shown_fields(void)
> > +{
> > +     if (cmdmode == MODE_NAME_ONLY ||
> > +         (format && !strcmp(format, name_only_format))) {
> > +             shown_bits = SHOW_FILE_NAME;
> > +             return 1;
> > +     }
> > +
> > +     if (cmdmode == MODE_OBJECT_ONLY ||
> > +         (format && !strcmp(format, object_only_format))) {
> > +             shown_bits = SHOW_OBJECT_NAME;
> > +             return 1;
> > +     }
> > +
> > +     if (!ls_options || (ls_options & LS_RECURSIVE)
> > +         || (ls_options & LS_SHOW_TREES)
> > +         || (ls_options & LS_TREE_ONLY)
> > +             || (format && !strcmp(format, default_format)))
> > +             shown_bits = SHOW_DEFAULT;
> > +
> > +     if (cmdmode == MODE_LONG ||
> > +             (format && !strcmp(format, long_format)))
> > +             shown_bits = SHOW_DEFAULT | SHOW_SIZE;
> > +     return 1;
> > +}
> > +
> >  static int show_default(struct shown_data *data)
> >  {
> >       size_t baselen = data->base->len;
> > @@ -137,24 +258,6 @@ static int show_default(struct shown_data *data)
> >       return 1;
> >  }
> >
> > -static int show_tree_init(enum object_type *type, struct strbuf *base,
> > -                       const char *pathname, unsigned mode, int *retval)
> > -{
> > -     if (S_ISGITLINK(mode)) {
> > -             *type = OBJ_COMMIT;
> > -     } else if (S_ISDIR(mode)) {
> > -             if (show_recursive(base->buf, base->len, pathname)) {
> > -                     *retval = READ_TREE_RECURSIVE;
> > -                     if (!(ls_options & LS_SHOW_TREES))
> > -                             return 1;
> > -             }
> > -             *type = OBJ_TREE;
> > -     }
> > -     else if (ls_options & LS_TREE_ONLY)
> > -             return 1;
> > -     return 0;
> > -}
> > -
> >  static int show_tree(const struct object_id *oid, struct strbuf *base,
> >               const char *pathname, unsigned mode, void *context)
> >  {
> > @@ -196,6 +299,7 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
> >       struct object_id oid;
> >       struct tree *tree;
> >       int i, full_tree = 0;
> > +     read_tree_fn_t fn = show_tree;
> >       const struct option ls_tree_options[] = {
> >               OPT_BIT('d', NULL, &ls_options, N_("only show trees"),
> >                       LS_TREE_ONLY),
> > @@ -218,6 +322,9 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
> >               OPT_BOOL(0, "full-tree", &full_tree,
> >                        N_("list entire tree; not just current directory "
> >                           "(implies --full-name)")),
> > +             OPT_STRING_F(0, "format", &format, N_("format"),
> > +                          N_("format to use for the output"),
> > +                          PARSE_OPT_NONEG),
> >               OPT__ABBREV(&abbrev),
> >               OPT_END()
> >       };
> > @@ -238,6 +345,10 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
> >           ((LS_TREE_ONLY|LS_RECURSIVE) & ls_options))
> >               ls_options |= LS_SHOW_TREES;
> >
> > +     if (format && cmdmode)
> > +             usage_msg_opt(
> > +                     _("--format can't be combined with other format-altering options"),
> > +                     ls_tree_usage, ls_tree_options);
> >       if (argc < 1)
> >               usage_with_options(ls_tree_usage, ls_tree_options);
> >       if (get_oid(argv[0], &oid))
> > @@ -261,6 +372,18 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
> >       tree = parse_tree_indirect(&oid);
> >       if (!tree)
> >               die("not a tree object");
> > -     return !!read_tree(the_repository, tree,
> > -                        &pathspec, show_tree, NULL);
> > +
> > +     /*
> > +      * The generic show_tree_fmt() is slower than show_tree(), so
> > +      * take the fast path if possible.
> > +      */
> > +     if (format && (!strcmp(format, default_format) ||
> > +                                !strcmp(format, long_format) ||
> > +                                !strcmp(format, name_only_format) ||
> > +                                !strcmp(format, object_only_format)))
> > +             fn = show_tree;
> > +     else if (format)
> > +             fn = show_tree_fmt;
> > +
> > +     return !!read_tree(the_repository, tree, &pathspec, fn, NULL);
> >  }
> > diff --git a/t/t3105-ls-tree-format.sh b/t/t3105-ls-tree-format.sh
> > new file mode 100755
> > index 0000000000..92b4d240e8
> > --- /dev/null
> > +++ b/t/t3105-ls-tree-format.sh
> > @@ -0,0 +1,55 @@
> > +#!/bin/sh
> > +
> > +test_description='ls-tree --format'
> > +
> > +TEST_PASSES_SANITIZE_LEAK=true
> > +. ./test-lib.sh
> > +
> > +test_expect_success 'ls-tree --format usage' '
> > +     test_expect_code 129 git ls-tree --format=fmt -l &&
> > +     test_expect_code 129 git ls-tree --format=fmt --name-only &&
> > +     test_expect_code 129 git ls-tree --format=fmt --name-status &&
> > +     test_expect_code 129 git ls-tree --format=fmt --object-only
> > +'
> > +
> > +test_expect_success 'setup' '
> > +     mkdir dir &&
> > +     test_commit dir/sub-file &&
> > +     test_commit top-file
> > +'
> > +
> > +test_ls_tree_format () {
> > +     format=$1 &&
> > +     opts=$2 &&
> > +     shift 2 &&
> > +     git ls-tree $opts -r HEAD >expect.raw &&
> > +     sed "s/^/> /" >expect <expect.raw &&
> > +     git ls-tree --format="> $format" -r HEAD >actual &&
> > +     test_cmp expect actual
> > +}
> > +
> > +test_expect_success 'ls-tree --format=<default-like>' '
> > +     test_ls_tree_format \
> > +             "%(mode) %(type) %(object)%x09%(file)" \
> > +             ""
> > +'
> > +
> > +test_expect_success 'ls-tree --format=<long-like>' '
> > +     test_ls_tree_format \
> > +             "%(mode) %(type) %(object) %(size:padded)%x09%(file)" \
> > +             "--long"
> > +'
> > +
> > +test_expect_success 'ls-tree --format=<name-only-like>' '
> > +     test_ls_tree_format \
> > +             "%(file)" \
> > +             "--name-only"
> > +'
> > +
> > +test_expect_success 'ls-tree --format=<object-only-like>' '
> > +     test_ls_tree_format \
> > +             "%(object)" \
> > +             "--object-only"
> > +'
> > +
> > +test_done
> > --
> > 2.33.0.rc1.1802.gbb1c3936fb.dirty
> >
> >
> >

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v8 8/8] ls-tree.c: introduce "--format" option
  2022-01-05  9:58                 ` Teng Long
@ 2022-01-05 13:09                   ` Johannes Schindelin
  2022-01-05 16:44                     ` Teng Long
  0 siblings, 1 reply; 224+ messages in thread
From: Johannes Schindelin @ 2022-01-05 13:09 UTC (permalink / raw)
  To: Teng Long; +Cc: avarab, congdanhqx, git, Junio C Hamano, peff, tenglong.tl

Hi Teng,

On Wed, 5 Jan 2022, Teng Long wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> > This, along with two other similar instances, triggers the
> > `static-analysis` job in the CI failure of `seen`. The suggested diff is:
>
> The second and third I will optimize in the next patch.
>
> The first one. Actually I am a little puzzled from this :
>
> > -               strbuf_addf(line, "%7s", "-");
> > +               strbuf_addstr(line, "-");
>
> > But I think that the first hunk indicates a deeper issue, as `%7s`
> > probably meant to pad the dash to seven dashes (which that format won't
> > accomplish, but `strbuf_addchars()` would)?
>
> "strbuf_addf(line, "%7s", "-");" here is used to align the columns
> with a width of seven chars, not repeat one DASH to seven.

Ah. I misremembered and thought that `"% 7s"` would do that, but you're
correct. See below for more on this.

But first, I wonder why the test suite passes with the `strbuf_addstr()`
call... Is this line not covered by any test case?

About the `%7s` thing: The most obvious resolution is to use `"      -"`
with `strbuf_addstr()`. And I would argue that this is the best
resolution.
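
For illustration, here is a minimal standalone sketch of that equivalence.
It uses plain `snprintf()` rather than git's strbuf API, and the helper
names are hypothetical. It only shows that "%7s" right-aligns its argument
in a seven-column field, so a lone dash comes out as six spaces plus the
dash:

```c
#include <stdio.h>
#include <string.h>

/* Right-align `s` in a 7-column field, as the "%7s" conversion does. */
static void pad_to_seven(char *out, size_t outlen, const char *s)
{
	snprintf(out, outlen, "%7s", s);
}

/* Returns 1 when "%7s" applied to "-" equals the hand-written literal. */
static int dash_matches_literal(void)
{
	char buf[16];

	pad_to_seven(buf, sizeof(buf), "-");
	return strcmp(buf, "      -") == 0; /* six spaces, then the dash */
}
```

The conversion pads; it does not repeat the dash, which is why the
literal and the format string are interchangeable here.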

If you disagree (and want to spin up a full `sprintf()` every time, just
to add those six space characters), feel free to integrate the following
into your patch series:

-- snip --
From a390fcf7eec261c7f0e341bda79f2b1f326d151e Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Wed, 5 Jan 2022 14:02:19 +0100
Subject: [PATCH] cocci: allow padding with `strbuf_addf()`

A convenient way to pad strings is to use something like
`strbuf_addf(&buf, "%20s", "Hello, world!")`.

However, the Coccinelle rule that forbids a format `"%s"` with a
constant string argument cast too wide a net, and also forbade such
padding.

Let's be a bit stricter in that Coccinelle rule.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 contrib/coccinelle/strbuf.cocci | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/coccinelle/strbuf.cocci b/contrib/coccinelle/strbuf.cocci
index d9ada69b432..2d6e0f58fc8 100644
--- a/contrib/coccinelle/strbuf.cocci
+++ b/contrib/coccinelle/strbuf.cocci
@@ -44,7 +44,7 @@ struct strbuf *SBP;

 @@
 expression E1, E2;
-format F =~ "s";
+format F =~ "^s$";
 @@
 - strbuf_addf(E1, "%@F@", E2);
 + strbuf_addstr(E1, E2);
--
2.33.0.windows.2
-- snap --
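
To make the effect of the anchors concrete, here is a small sketch using
POSIX `regcomp()`. This is an assumption for illustration only: Coccinelle
uses its own regex engine, and I am assuming `F` holds the text after the
`%`. For these two patterns the semantics should be the same. The helper
name `regex_matches()` is hypothetical:

```c
#include <regex.h>
#include <stddef.h>

/*
 * Returns 1 if `text` matches the extended regular expression `pattern`
 * anywhere (an unanchored search), 0 if not, -1 on a bad pattern.
 */
static int regex_matches(const char *pattern, const char *text)
{
	regex_t re;
	int ret;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB))
		return -1;
	ret = regexec(&re, text, 0, NULL, 0) == 0;
	regfree(&re);
	return ret;
}
```

With this helper, `regex_matches("s", "7s")` is 1 (the old rule also
caught padded formats), while `regex_matches("^s$", "7s")` is 0 and
`regex_matches("^s$", "s")` is 1 (the new rule only catches a bare "%s").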

Ciao,
Dscho

>
> A little weird about the fix recommendation of  "strbuf_addstr(line, "-");" ,
> because it will only add a single DASH here.
>
> It's the identical result which compares to the "master"[1]  I think with the
> current codes and I tested the "strbuf_addf()" simply and it seems to work
> fine.
>
> [1] https://github.com/git/git/blob/master/builtin/ls-tree.c#L106

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v8 8/8] ls-tree.c: introduce "--format" option
  2022-01-05 13:09                   ` Johannes Schindelin
@ 2022-01-05 16:44                     ` Teng Long
  0 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-05 16:44 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: avarab, congdanhqx, git, Junio C Hamano, peff, tenglong.tl

On Wed, Jan 5, 2022 at 9:09 PM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:

> Ah. I misremembered and thought that `"% 7s"` would do that, but you're
> correct. See below for more on this.
>
> But first, I wonder why the test suite passes with the `strbuf_addstr()`
> call... Is this line not covered by any test case?

Definitely, me too.

> About the `%7s` thing: The most obvious resolution is to use `"      -"`
> with `strbuf_addstr()`. And I would argue that this is the best
> resolution.

I agree that would be a quick fix.
Could you say more about why you think it is the best
resolution?

> If you disagree (and want to spin up a full `sprintf()` every time, just
> to add those six space characters), feel free to integrate the following
> into your patch series:
>
> -- snip --
> From a390fcf7eec261c7f0e341bda79f2b1f326d151e Mon Sep 17 00:00:00 2001
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> Date: Wed, 5 Jan 2022 14:02:19 +0100
> Subject: [PATCH] cocci: allow padding with `strbuf_addf()`
>
> A convenient way to pad strings is to use something like
> `strbuf_addf(&buf, "%20s", "Hello, world!")`.
>
> However, the Coccinelle rule that forbids a format `"%s"` with a
> constant string argument cast too wide a net, and also forbade such
> padding.
>
> Let's be a bit stricter in that Coccinelle rule.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  contrib/coccinelle/strbuf.cocci | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/contrib/coccinelle/strbuf.cocci b/contrib/coccinelle/strbuf.cocci
> index d9ada69b432..2d6e0f58fc8 100644
> --- a/contrib/coccinelle/strbuf.cocci
> +++ b/contrib/coccinelle/strbuf.cocci
> @@ -44,7 +44,7 @@ struct strbuf *SBP;
>
>  @@
>  expression E1, E2;
> -format F =~ "s";
> +format F =~ "^s$";
>  @@
>  - strbuf_addf(E1, "%@F@", E2);
>  + strbuf_addstr(E1, E2);
> --
> 2.33.0.windows.2
> -- snap --

Thanks for the input on Coccinelle and for the commit.

The current 'strbuf' rules were added in commit [1]. Their purpose seems
to be to forbid some inefficient usages and to gain as much performance
as possible.

I think "<SP*6>-" and "%7s" both produce the same result: the former is
better for performance, the latter for readability. So let's do a simple
performance test under "linux", then think about which is better for this case:

    Benchmark 1: /opt/git/ls-tree-oid-only-addf/bin/git ls-tree -r --format='> %(mode) %(type) %(object) %(size:padded)%x09%(file)'  HEAD
    Time (mean ± σ):     387.7 ms ±   8.8 ms    [User: 357.6 ms, System: 30.0 ms]
    Range (min … max):   377.5 ms … 399.5 ms    10 runs

    Benchmark 1: /opt/git/ls-tree-oid-only-addstr/bin/git ls-tree -r --format='> %(mode) %(type) %(object) %(size:padded)%x09%(file)'  HEAD
    Time (mean ± σ):     388.9 ms ±   9.0 ms    [User: 362.7 ms, System: 26.1 ms]
    Range (min … max):   373.4 ms … 399.8 ms    10 runs

The performance difference between the two is slight, well within the
measured standard deviation.

So I have decided to integrate your patch as a new commit in the current
patchset. Is it OK for me to mention in the commit message that it comes
from your guidance, or to add a "Helped-by" trailer or something similar?

Thanks.

[1] https://github.com/git/git/commit/28c23cd4c3902449aff72cb9a4a703220be0d6ac

^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v9 0/9] ls-tree.c: introduce "--format" option
  2022-01-01 13:50           ` [PATCH v8 0/8] ls-tree: "--object-only" and "--format" opts Teng Long
                               ` (7 preceding siblings ...)
  2022-01-01 13:50             ` [PATCH v8 8/8] ls-tree.c: introduce "--format" option Teng Long
@ 2022-01-06  4:31             ` Teng Long
  2022-01-06  4:31               ` [PATCH v9 1/9] ls-tree: remove commented-out code Teng Long
                                 ` (9 more replies)
  8 siblings, 10 replies; 224+ messages in thread
From: Teng Long @ 2022-01-06  4:31 UTC (permalink / raw)
  To: dyroneteng
  Cc: avarab, congdanhqx, git, gitster, peff, tenglong.tl, Johannes.Schindelin

Diff from v8:

1). [Delete] ls-tree: split up the "init" part of show_tree()
2). [New] ls-tree: optimize naming and handling of "return" in show_tree()

In the previous patch v8, I used the commit "ls-tree: split up the "init" part
of show_tree()" (d77c895a4b) from RFC patch v7 [1], and I did not notice how it
handled the return value, which it called "retval". After Junio mentioned it in
his reply [2], I took a look at "retval" and found that it needs to be handled
with care.

So I introduced this new commit in place of the old one. It takes
responsibility for renaming "retval" and tries to make the relevant code
easier to understand.

3). [Append] ls-tree.c: support --object-only option for "git-ls-tree"

Junio pointed out that some of these fields are ambiguous [2] and commented
on the necessity of using XOR.

I modified the commit, basically following the steps I described in my reply
[3], except for the XOR issue. Junio suggested that I use
"if ((shown_fields & FILE_NAME_FIELD) == FILE_NAME_FIELD)" instead, but
here I think "if (shown_fields == FILE_NAME_FIELD)" is enough.
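
As a sketch of the difference between the three spellings, the macros
below are copied from the range-diff further down (where the name is
FIELD_FILE_NAME), and the three helper functions are hypothetical names
used only for illustration:

```c
/* Field bits as defined in the patch. */
#define FIELD_FILE_NAME 1
#define FIELD_SIZE (1 << 1)
#define FIELD_OBJECT_NAME (1 << 2)
#define FIELD_TYPE (1 << 3)
#define FIELD_MODE (1 << 4)
#define FIELD_DEFAULT 29 /* 11101: size is not shown by default */
#define FIELD_LONG_DEFAULT (FIELD_DEFAULT | FIELD_SIZE)

/* v8 spelling: the XOR is zero only when every bit agrees. */
static int is_exactly_xor(unsigned int fields, unsigned int want)
{
	return !(fields ^ want);
}

/* v9 spelling: plain equality, same truth table as the XOR form. */
static int is_exactly(unsigned int fields, unsigned int want)
{
	return fields == want;
}

/* Mask-and-compare: true whenever all bits of `want` are set. */
static int has_all(unsigned int fields, unsigned int want)
{
	return (fields & want) == want;
}
```

The XOR form and the equality form always agree. The mask form is looser:
`has_all(FIELD_DEFAULT, FIELD_FILE_NAME)` is true because the default
set includes the file-name bit, whereas `is_exactly()` is true only for
the name-only case.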

4). [Append] ls-tree.c: introduce struct "show_tree_data"

I had a spelling mistake which Junio found ("show_data" should be
"shown_data"). That reminded me that "show_data" is a little abstract, so I
renamed it to "show_tree_data", since it is the data struct used in the
function "show_tree". I think that is a bit better than before.

5). [Append] ls-tree.c: introduce "--format" option

Documentation formatting changes. Fixed strbuf leaks and made some other
non-functional modifications in "ls-tree.c".
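
For readers new to the option, here is a rough standalone sketch of how
"%(field)" placeholders can be expanded. It is not the code in
"ls-tree.c": `struct entry`, `append()`, `name_is()` and `expand_entry()`
are hypothetical, it uses a fixed buffer instead of a strbuf, and it
leaves out "%(size)", "%(size:padded)" and "%xNN" literals:

```c
#include <string.h>

/* Hypothetical stand-in for one tree entry; not git's show_tree_data. */
struct entry {
	const char *mode, *type, *object, *file;
};

/* Append `len` bytes of `s` to `out`, truncating to leave room for NUL. */
static void append(char *out, size_t outlen, const char *s, size_t len)
{
	size_t used = strlen(out);

	if (used + len >= outlen)
		len = outlen - used - 1;
	memcpy(out + used, s, len);
	out[used + len] = '\0';
}

/* Does the `len`-byte span at `start` spell exactly `name`? */
static int name_is(const char *start, size_t len, const char *name)
{
	return len == strlen(name) && !memcmp(start, name, len);
}

/*
 * Expand "%(mode)", "%(type)", "%(object)" and "%(file)" in `fmt`.
 * Returns 0 on success, -1 on an unknown or unterminated placeholder.
 */
static int expand_entry(char *out, size_t outlen, const char *fmt,
			const struct entry *e)
{
	out[0] = '\0';
	while (*fmt) {
		const char *pct = strchr(fmt, '%');
		const char *name, *end, *value;
		size_t namelen;

		if (!pct) {
			/* No more placeholders: copy the tail verbatim. */
			append(out, outlen, fmt, strlen(fmt));
			break;
		}
		/* Copy the literal text that precedes the placeholder. */
		append(out, outlen, fmt, (size_t)(pct - fmt));
		if (pct[1] != '(')
			return -1;
		name = pct + 2;
		end = strchr(name, ')');
		if (!end)
			return -1;
		namelen = (size_t)(end - name);
		if (name_is(name, namelen, "mode"))
			value = e->mode;
		else if (name_is(name, namelen, "type"))
			value = e->type;
		else if (name_is(name, namelen, "object"))
			value = e->object;
		else if (name_is(name, namelen, "file"))
			value = e->file;
		else
			return -1;
		append(out, outlen, value, strlen(value));
		fmt = end + 1;
	}
	return 0;
}
```

Because this sketch writes into a caller-supplied buffer, there is nothing
to release afterwards; the real code builds temporary strbufs, which is
where the leaks fixed in this round came from.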

6). [New] cocci: allow padding with `strbuf_addf()`

Fix the static-analysis issue [4] found by Johannes Schindelin.

Thanks.

[1] https://public-inbox.org/git/xmqqwnjgfe4t.fsf@gitster.g/
[2] https://public-inbox.org/git/xmqqwnjgfe4t.fsf@gitster.g/
[3] https://public-inbox.org/git/20220104072951.10153-1-dyroneteng@gmail.com/#t
[4] https://public-inbox.org/git/nycvar.QRO.7.76.6.2201051348050.7076@tvgsbejvaqbjf.bet/

Teng Long (5):
  ls-tree: optimize naming and handling of "return" in show_tree()
  ls-tree.c: support --object-only option for "git-ls-tree"
  ls-tree.c: introduce struct "show_tree_data"
  ls-tree.c: introduce "--format" option
  cocci: allow padding with `strbuf_addf()`

Ævar Arnfjörð Bjarmason (4):
  ls-tree: remove commented-out code
  ls-tree: add missing braces to "else" arms
  ls-tree: use "enum object_type", not {blob,tree,commit}_type
  ls-tree: use "size_t", not "int" for "struct strbuf"'s "len"

 Documentation/git-ls-tree.txt   |  55 +++++-
 builtin/ls-tree.c               | 332 ++++++++++++++++++++++++++------
 contrib/coccinelle/strbuf.cocci |   2 +-
 t/t3104-ls-tree-oid.sh          |  51 +++++
 t/t3105-ls-tree-format.sh       |  55 ++++++
 5 files changed, 427 insertions(+), 68 deletions(-)
 create mode 100755 t/t3104-ls-tree-oid.sh
 create mode 100755 t/t3105-ls-tree-format.sh

Range-diff against v8:
 1:  d77c895a4b <  -:  ---------- ls-tree: split up the "init" part of show_tree()
 -:  ---------- >  1:  2fcff7e0d4 ls-tree: remove commented-out code
 -:  ---------- >  2:  6fd1dd9383 ls-tree: add missing braces to "else" arms
 -:  ---------- >  3:  208654b5e2 ls-tree: use "enum object_type", not {blob,tree,commit}_type
 -:  ---------- >  4:  2637464fd8 ls-tree: use "size_t", not "int" for "struct strbuf"'s "len"
 -:  ---------- >  5:  75503c41a7 ls-tree: optimize naming and handling of "return" in show_tree()
 2:  cb881183cb !  6:  e0274f079a ls-tree.c: support --object-only option for "git-ls-tree"
    @@ builtin/ls-tree.c
      static struct pathspec pathspec;
      static int chomp_prefix;
      static const char *ls_tree_prefix;
    -+static unsigned int shown_bits;
    -+#define SHOW_FILE_NAME 1
    -+#define SHOW_SIZE (1 << 1)
    -+#define SHOW_OBJECT_NAME (1 << 2)
    -+#define SHOW_TYPE (1 << 3)
    -+#define SHOW_MODE (1 << 4)
    -+#define SHOW_DEFAULT 29 /* 11101 size is not shown to output by default */
    ++static unsigned int shown_fields;
    ++#define FIELD_FILE_NAME 1
    ++#define FIELD_SIZE (1 << 1)
    ++#define FIELD_OBJECT_NAME (1 << 2)
    ++#define FIELD_TYPE (1 << 3)
    ++#define FIELD_MODE (1 << 4)
    ++#define FIELD_DEFAULT 29 /* 11101 size is not shown to output by default */
    ++#define FIELD_LONG_DEFAULT  (FIELD_DEFAULT | FIELD_SIZE)
      
      static const  char * const ls_tree_usage[] = {
      	N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
    @@ builtin/ls-tree.c
     +static int parse_shown_fields(void)
     +{
     +	if (cmdmode == MODE_NAME_ONLY) {
    -+		shown_bits = SHOW_FILE_NAME;
    ++		shown_fields = FIELD_FILE_NAME;
     +		return 0;
     +	}
     +	if (cmdmode == MODE_OBJECT_ONLY) {
    -+		shown_bits = SHOW_OBJECT_NAME;
    ++		shown_fields = FIELD_OBJECT_NAME;
     +		return 0;
     +	}
     +	if (!ls_options || (ls_options & LS_RECURSIVE)
     +	    || (ls_options & LS_SHOW_TREES)
     +	    || (ls_options & LS_TREE_ONLY))
    -+		shown_bits = SHOW_DEFAULT;
    ++		shown_fields = FIELD_DEFAULT;
     +	if (cmdmode == MODE_LONG)
    -+		shown_bits = SHOW_DEFAULT | SHOW_SIZE;
    ++		shown_fields = FIELD_LONG_DEFAULT;
     +	return 1;
     +}
     +
    @@ builtin/ls-tree.c: static int show_recursive(const char *base, size_t baselen, c
     +{
     +	size_t baselen = base->len;
     +
    -+	if (shown_bits & SHOW_SIZE) {
    ++	if (shown_fields & FIELD_SIZE) {
     +		char size_text[24];
     +		if (type == OBJ_BLOB) {
     +			unsigned long size;
    @@ builtin/ls-tree.c: static int show_recursive(const char *base, size_t baselen, c
     +	return 1;
     +}
     +
    - static int show_tree_init(enum object_type *type, struct strbuf *base,
    - 			  const char *pathname, unsigned mode, int *retval)
    + static void init_type(unsigned mode, enum object_type *type)
      {
    + 	if (S_ISGITLINK(mode))
     @@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strbuf *base,
    - 	if (show_tree_init(&type, base, pathname, mode, &retval))
    - 		return retval;
    + 	if (type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
    + 		return !READ_TREE_RECURSIVE;
      
     -	if (!(ls_options & LS_NAME_ONLY)) {
     -		if (ls_options & LS_SHOW_SIZE) {
    @@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strb
     -			printf("%06o %s %s\t", mode, type_name(type),
     -			       find_unique_abbrev(oid, abbrev));
     -		}
    -+	if (!(shown_bits ^ SHOW_OBJECT_NAME)) {
    ++	if (shown_fields == FIELD_OBJECT_NAME) {
     +		printf("%s%c", find_unique_abbrev(oid, abbrev), line_termination);
    -+		return retval;
    ++		return recursive;
      	}
     -	baselen = base->len;
     -	strbuf_addstr(base, pathname);
    @@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strb
     -				   stdout, line_termination);
     -	strbuf_setlen(base, baselen);
     +
    -+	if (!(shown_bits ^ SHOW_FILE_NAME)) {
    ++	if (shown_fields == FIELD_FILE_NAME) {
     +		baselen = base->len;
     +		strbuf_addstr(base, pathname);
     +		write_name_quoted_relative(base->buf,
     +					   chomp_prefix ? ls_tree_prefix : NULL,
     +					   stdout, line_termination);
     +		strbuf_setlen(base, baselen);
    ++		return recursive;
     +	}
     +
    -+	if (!(shown_bits ^ SHOW_DEFAULT) ||
    -+	    !(shown_bits ^ (SHOW_DEFAULT | SHOW_SIZE)))
    ++	if (shown_fields >= FIELD_DEFAULT)
     +		show_default(oid, type, pathname, mode, base);
     +
    - 	return retval;
    + 	return recursive;
      }
      
     @@ builtin/ls-tree.c: int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 3:  296ebacafe !  7:  725c4d0187 ls-tree.c: introduce struct "shown_data"
    @@ Metadata
     Author: Teng Long <dyroneteng@gmail.com>
     
      ## Commit message ##
    -    ls-tree.c: introduce struct "shown_data"
    +    ls-tree.c: introduce struct "show_tree_data"
     
    -    "show_data" is a struct that packages the necessary fields for
    -    reusing. This commit is a front-loaded commit for support
    -    "--format" argument and does not affect any existing functionality.
    +    "show_tree_data" is a struct that packages the necessary fields for
    +    "show_tree()". This commit is a pre-prepared commit for supporting
    +    "--format" option and it does not affect any existing functionality.
     
         Signed-off-by: Teng Long <dyroneteng@gmail.com>
     
      ## builtin/ls-tree.c ##
    -@@ builtin/ls-tree.c: static unsigned int shown_bits;
    - #define SHOW_MODE (1 << 4)
    - #define SHOW_DEFAULT 29 /* 11101 size is not shown to output by default */
    +@@ builtin/ls-tree.c: static unsigned int shown_fields;
    + #define FIELD_DEFAULT 29 /* 11101 size is not shown to output by default */
    + #define FIELD_LONG_DEFAULT  (FIELD_DEFAULT | FIELD_SIZE)
      
    -+struct shown_data {
    ++struct show_tree_data {
     +	unsigned mode;
     +	enum object_type type;
     +	const struct object_id *oid;
    @@ builtin/ls-tree.c: static int show_recursive(const char *base, size_t baselen,
     -static int show_default(const struct object_id *oid, enum object_type type,
     -			const char *pathname, unsigned mode,
     -			struct strbuf *base)
    -+static int show_default(struct shown_data *data)
    ++static int show_default(struct show_tree_data *data)
      {
     -	size_t baselen = base->len;
     +	size_t baselen = data->base->len;
      
    - 	if (shown_bits & SHOW_SIZE) {
    + 	if (shown_fields & FIELD_SIZE) {
      		char size_text[24];
     -		if (type == OBJ_BLOB) {
     +		if (data->type == OBJ_BLOB) {
    @@ builtin/ls-tree.c: static int show_default(const struct object_id *oid, enum obj
      
     @@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strbuf *base,
      {
    - 	int retval = 0;
    + 	int recursive = 0;
      	size_t baselen;
     -	enum object_type type = OBJ_BLOB;
    -+	struct shown_data data = {
    ++	struct show_tree_data data = {
     +		.mode = mode,
     +		.type = OBJ_BLOB,
     +		.oid = oid,
    @@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strb
     +		.base = base,
     +	};
      
    --	if (show_tree_init(&type, base, pathname, mode, &retval))
    -+	if (show_tree_init(&data.type, base, pathname, mode, &retval))
    - 		return retval;
    --
    - 	if (!(shown_bits ^ SHOW_OBJECT_NAME)) {
    - 		printf("%s%c", find_unique_abbrev(oid, abbrev), line_termination);
    - 		return retval;
    +-	init_type(mode, &type);
    ++	init_type(mode, &data.type);
    + 	init_recursive(base, pathname, &recursive);
    + 
    +-	if (type == OBJ_TREE && recursive && !(ls_options & LS_SHOW_TREES))
    ++	if (data.type == OBJ_TREE && recursive && !(ls_options & LS_SHOW_TREES))
    + 		return recursive;
    +-	if (type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
    ++	if (data.type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
    + 		return !READ_TREE_RECURSIVE;
    + 
    + 	if (shown_fields == FIELD_OBJECT_NAME) {
     @@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strbuf *base,
    + 	}
      
    - 	if (!(shown_bits ^ SHOW_DEFAULT) ||
    - 	    !(shown_bits ^ (SHOW_DEFAULT | SHOW_SIZE)))
    + 	if (shown_fields >= FIELD_DEFAULT)
     -		show_default(oid, type, pathname, mode, base);
     +		show_default(&data);
      
    - 	return retval;
    + 	return recursive;
      }
 4:  e0add802fb !  8:  7df58483a4 ls-tree.c: introduce "--format" option
    @@ Commit message
             Range (min … max):   328.8 ms … 349.4 ms    10 runs
     
         Links:
    -    [1] https://public-inbox.org/git/RFC-patch-6.7-eac299f06ff-20211217T131635Z-avarab@gmail.com/
    +            [1] https://public-inbox.org/git/RFC-patch-6.7-eac299f06ff-20211217T131635Z-avarab@gmail.com/
     
         Signed-off-by: Teng Long <dyroneteng@gmail.com>
     
    @@ Documentation/git-ls-tree.txt: quoted as explained for the configuration variabl
     +For example, if you want to only print the <object> and <file> fields with a
     +JSON style, executing with a specific "--format" like
     +
    -+		git ls-tree --format='{"object":"%(object)", "file":"%(file)"}' <tree-ish>
    ++        git ls-tree --format='{"object":"%(object)", "file":"%(file)"}' <tree-ish>
     +
     +The output format changes to:
     +
    -+		{"object":"<object>", "file":"<file>"}
    ++        {"object":"<object>", "file":"<file>"}
     +
     +FIELD NAMES
     +-----------
    @@ builtin/ls-tree.c: enum {
      
      static int cmdmode = MODE_UNSPECIFIED;
      
    --static int parse_shown_fields(void)
     +static const char *format;
     +static const char *default_format = "%(mode) %(type) %(object)%x09%(file)";
     +static const char *long_format = "%(mode) %(type) %(object) %(size:padded)%x09%(file)";
     +static const char *name_only_format = "%(file)";
     +static const char *object_only_format = "%(object)";
     +
    + static int parse_shown_fields(void)
    + {
    + 	if (cmdmode == MODE_NAME_ONLY) {
    +@@ builtin/ls-tree.c: static int parse_shown_fields(void)
    + 	return 1;
    + }
    + 
     +static void expand_objectsize(struct strbuf *line, const struct object_id *oid,
     +			      const enum object_type type, unsigned int padded)
    - {
    --	if (cmdmode == MODE_NAME_ONLY) {
    --		shown_bits = SHOW_FILE_NAME;
    --		return 0;
    ++{
     +	if (type == OBJ_BLOB) {
     +		unsigned long size;
     +		if (oid_object_info(the_repository, oid, &size) < 0)
    @@ builtin/ls-tree.c: enum {
     +		strbuf_addf(line, "%7s", "-");
     +	} else {
     +		strbuf_addstr(line, "-");
    - 	}
    --	if (cmdmode == MODE_OBJECT_ONLY) {
    --		shown_bits = SHOW_OBJECT_NAME;
    --		return 0;
    ++	}
     +}
     +
     +static size_t expand_show_tree(struct strbuf *line, const char *start,
     +			       void *context)
     +{
    -+	struct shown_data *data = context;
    ++	struct show_tree_data *data = context;
     +	const char *end;
     +	const char *p;
     +	unsigned int errlen;
    -+	size_t len;
    -+	len = strbuf_expand_literal_cb(line, start, NULL);
    ++	size_t len = strbuf_expand_literal_cb(line, start, NULL);
    ++
     +	if (len)
     +		return len;
    -+
     +	if (*start != '(')
     +		die(_("bad ls-tree format: as '%s'"), start);
     +
    @@ builtin/ls-tree.c: enum {
     +	} else if (skip_prefix(start, "(size)", &p)) {
     +		expand_objectsize(line, data->oid, data->type, 0);
     +	} else if (skip_prefix(start, "(object)", &p)) {
    -+		strbuf_addstr(line, find_unique_abbrev(data->oid, abbrev));
    ++		strbuf_add_unique_abbrev(line, data->oid, abbrev);
     +	} else if (skip_prefix(start, "(file)", &p)) {
     +		const char *name = data->base->buf;
     +		const char *prefix = chomp_prefix ? ls_tree_prefix : NULL;
    @@ builtin/ls-tree.c: enum {
     +		strbuf_addstr(data->base, data->pathname);
     +		name = relative_path(data->base->buf, prefix, &sb);
     +		quote_c_style(name, &quoted, NULL, 0);
    -+		strbuf_addstr(line, quoted.buf);
    ++		strbuf_addbuf(line, &quoted);
    ++		strbuf_release(&sb);
    ++		strbuf_release(&quoted);
     +	} else {
     +		errlen = (unsigned long)len;
     +		die(_("bad ls-tree format: %%%.*s"), errlen, start);
    - 	}
    --	if (!ls_options || (ls_options & LS_RECURSIVE)
    --	    || (ls_options & LS_SHOW_TREES)
    --	    || (ls_options & LS_TREE_ONLY))
    --		shown_bits = SHOW_DEFAULT;
    --	if (cmdmode == MODE_LONG)
    --		shown_bits = SHOW_DEFAULT | SHOW_SIZE;
    --	return 1;
    ++	}
     +	return len;
    - }
    - 
    ++}
    ++
      static int show_recursive(const char *base, size_t baselen,
    + 			  const char *pathname)
    + {
     @@ builtin/ls-tree.c: static int show_recursive(const char *base, size_t baselen,
      	return 0;
      }
      
    -+static int show_tree_init(enum object_type *type, struct strbuf *base,
    -+			  const char *pathname, unsigned mode, int *retval)
    ++static void init_recursive(struct strbuf *base, const char *pathname,
    ++				int *recursive)
     +{
    -+	if (S_ISGITLINK(mode)) {
    ++	if (show_recursive(base->buf, base->len, pathname))
    ++		*recursive = READ_TREE_RECURSIVE;
    ++}
    ++
    ++static void init_type(unsigned mode, enum object_type *type)
    ++{
    ++	if (S_ISGITLINK(mode))
     +		*type = OBJ_COMMIT;
    -+	} else if (S_ISDIR(mode)) {
    -+		if (show_recursive(base->buf, base->len, pathname)) {
    -+			*retval = READ_TREE_RECURSIVE;
    -+			if (!(ls_options & LS_SHOW_TREES))
    -+				return 1;
    -+		}
    ++	else if (S_ISDIR(mode))
     +		*type = OBJ_TREE;
    -+	}
    -+	else if (ls_options & LS_TREE_ONLY)
    -+		return 1;
    -+	return 0;
     +}
     +
     +static int show_tree_fmt(const struct object_id *oid, struct strbuf *base,
     +			 const char *pathname, unsigned mode, void *context)
     +{
     +	size_t baselen;
    -+	int retval = 0;
    ++	int recursive = 0;
     +	struct strbuf line = STRBUF_INIT;
    -+	struct shown_data data = {
    ++	struct show_tree_data data = {
     +		.mode = mode,
     +		.type = OBJ_BLOB,
     +		.oid = oid,
    @@ builtin/ls-tree.c: static int show_recursive(const char *base, size_t baselen,
     +		.base = base,
     +	};
     +
    -+	if (show_tree_init(&data.type, base, pathname, mode, &retval))
    -+		return retval;
    ++	init_type(mode, &data.type);
    ++	init_recursive(base, pathname, &recursive);
    ++
    ++	if (data.type == OBJ_TREE && recursive && !(ls_options & LS_SHOW_TREES))
    ++		return recursive;
    ++	if (data.type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
    ++		return !READ_TREE_RECURSIVE;
     +
     +	baselen = base->len;
     +	strbuf_expand(&line, format, expand_show_tree, &data);
     +	strbuf_addch(&line, line_termination);
     +	fwrite(line.buf, line.len, 1, stdout);
    ++	strbuf_release(&line);
     +	strbuf_setlen(base, baselen);
    -+	return retval;
    ++	return recursive;
     +}
     +
    -+static int parse_shown_fields(void)
    -+{
    -+	if (cmdmode == MODE_NAME_ONLY ||
    -+	    (format && !strcmp(format, name_only_format))) {
    -+		shown_bits = SHOW_FILE_NAME;
    -+		return 1;
    -+	}
    -+
    -+	if (cmdmode == MODE_OBJECT_ONLY ||
    -+	    (format && !strcmp(format, object_only_format))) {
    -+		shown_bits = SHOW_OBJECT_NAME;
    -+		return 1;
    -+	}
    -+
    -+	if (!ls_options || (ls_options & LS_RECURSIVE)
    -+	    || (ls_options & LS_SHOW_TREES)
    -+	    || (ls_options & LS_TREE_ONLY)
    -+		|| (format && !strcmp(format, default_format)))
    -+		shown_bits = SHOW_DEFAULT;
    -+
    -+	if (cmdmode == MODE_LONG ||
    -+		(format && !strcmp(format, long_format)))
    -+		shown_bits = SHOW_DEFAULT | SHOW_SIZE;
    -+	return 1;
    -+}
    -+
    - static int show_default(struct shown_data *data)
    + static int show_default(struct show_tree_data *data)
      {
      	size_t baselen = data->base->len;
    -@@ builtin/ls-tree.c: static int show_default(struct shown_data *data)
    +@@ builtin/ls-tree.c: static int show_default(struct show_tree_data *data)
      	return 1;
      }
      
    --static int show_tree_init(enum object_type *type, struct strbuf *base,
    --			  const char *pathname, unsigned mode, int *retval)
    +-static void init_type(unsigned mode, enum object_type *type)
     -{
    --	if (S_ISGITLINK(mode)) {
    +-	if (S_ISGITLINK(mode))
     -		*type = OBJ_COMMIT;
    --	} else if (S_ISDIR(mode)) {
    --		if (show_recursive(base->buf, base->len, pathname)) {
    --			*retval = READ_TREE_RECURSIVE;
    --			if (!(ls_options & LS_SHOW_TREES))
    --				return 1;
    --		}
    +-	else if (S_ISDIR(mode))
     -		*type = OBJ_TREE;
    --	}
    --	else if (ls_options & LS_TREE_ONLY)
    --		return 1;
    --	return 0;
    +-}
    +-
    +-static void init_recursive(struct strbuf *base, const char *pathname,
    +-				int *recursive)
    +-{
    +-	if (show_recursive(base->buf, base->len, pathname))
    +-		*recursive = READ_TREE_RECURSIVE;
     -}
     -
      static int show_tree(const struct object_id *oid, struct strbuf *base,
    @@ builtin/ls-tree.c: int cmd_ls_tree(int argc, const char **argv, const char *pref
     +	 * The generic show_tree_fmt() is slower than show_tree(), so
     +	 * take the fast path if possible.
     +	 */
    -+	if (format && (!strcmp(format, default_format) ||
    -+				   !strcmp(format, long_format) ||
    -+				   !strcmp(format, name_only_format) ||
    -+				   !strcmp(format, object_only_format)))
    ++	if (format &&
    ++	    (!strcmp(format, default_format) ||
    ++	     !strcmp(format, long_format) ||
    ++	     !strcmp(format, name_only_format) ||
    ++	     !strcmp(format, object_only_format)))
     +		fn = show_tree;
     +	else if (format)
     +		fn = show_tree_fmt;
 -:  ---------- >  9:  8dafb2b377 cocci: allow padding with `strbuf_addf()`
-- 
2.33.0.rc1.1794.g2ae0a9cb82


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v9 1/9] ls-tree: remove commented-out code
  2022-01-06  4:31             ` [PATCH v9 0/9] " Teng Long
@ 2022-01-06  4:31               ` Teng Long
  2022-01-06  4:31               ` [PATCH v9 2/9] ls-tree: add missing braces to "else" arms Teng Long
                                 ` (8 subsequent siblings)
  9 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-06  4:31 UTC (permalink / raw)
  To: dyroneteng
  Cc: avarab, congdanhqx, git, gitster, peff, tenglong.tl, Johannes.Schindelin

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Remove code added in f35a6d3bce7 (Teach core object handling functions
about gitlinks, 2007-04-09), later patched in 7d0b18a4da1 (Add output
flushing before fork(), 2008-08-04), and then finally ending up in its
current form in d3bee161fef (tree.c: allow read_tree_recursive() to
traverse gitlink entries, 2009-01-25). All while being commented-out!

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/ls-tree.c | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3a442631c7..5f7c84950c 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -69,15 +69,6 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	const char *type = blob_type;
 
 	if (S_ISGITLINK(mode)) {
-		/*
-		 * Maybe we want to have some recursive version here?
-		 *
-		 * Something similar to this incomplete example:
-		 *
-		if (show_subprojects(base, baselen, pathname))
-			retval = READ_TREE_RECURSIVE;
-		 *
-		 */
 		type = commit_type;
 	} else if (S_ISDIR(mode)) {
 		if (show_recursive(base->buf, base->len, pathname)) {
-- 
2.33.0.rc1.1794.g2ae0a9cb82


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v9 2/9] ls-tree: add missing braces to "else" arms
  2022-01-06  4:31             ` [PATCH v9 0/9] " Teng Long
  2022-01-06  4:31               ` [PATCH v9 1/9] ls-tree: remove commented-out code Teng Long
@ 2022-01-06  4:31               ` Teng Long
  2022-01-06  4:31               ` [PATCH v9 3/9] ls-tree: use "enum object_type", not {blob,tree,commit}_type Teng Long
                                 ` (7 subsequent siblings)
  9 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-06  4:31 UTC (permalink / raw)
  To: dyroneteng
  Cc: avarab, congdanhqx, git, gitster, peff, tenglong.tl, Johannes.Schindelin

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Add missing {} to the "else" arms in show_tree() per the
CodingGuidelines.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/ls-tree.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 5f7c84950c..0a28f32ccb 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -92,14 +92,16 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 				else
 					xsnprintf(size_text, sizeof(size_text),
 						  "%"PRIuMAX, (uintmax_t)size);
-			} else
+			} else {
 				xsnprintf(size_text, sizeof(size_text), "-");
+			}
 			printf("%06o %s %s %7s\t", mode, type,
 			       find_unique_abbrev(oid, abbrev),
 			       size_text);
-		} else
+		} else {
 			printf("%06o %s %s\t", mode, type,
 			       find_unique_abbrev(oid, abbrev));
+		}
 	}
 	baselen = base->len;
 	strbuf_addstr(base, pathname);
-- 
2.33.0.rc1.1794.g2ae0a9cb82


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v9 3/9] ls-tree: use "enum object_type", not {blob,tree,commit}_type
  2022-01-06  4:31             ` [PATCH v9 0/9] " Teng Long
  2022-01-06  4:31               ` [PATCH v9 1/9] ls-tree: remove commented-out code Teng Long
  2022-01-06  4:31               ` [PATCH v9 2/9] ls-tree: add missing braces to "else" arms Teng Long
@ 2022-01-06  4:31               ` Teng Long
  2022-01-06  4:31               ` [PATCH v9 4/9] ls-tree: use "size_t", not "int" for "struct strbuf"'s "len" Teng Long
                                 ` (6 subsequent siblings)
  9 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-06  4:31 UTC (permalink / raw)
  To: dyroneteng
  Cc: avarab, congdanhqx, git, gitster, peff, tenglong.tl, Johannes.Schindelin

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Change the ls-tree.c code to use type_name() on the enum instead of
using the string constants. This doesn't matter either way for
performance, but makes this a bit easier to read as we'll no longer
need a strcmp() here.
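
To illustrate the point above, here is a minimal sketch rather than git's
own code. The demo_* names are hypothetical stand-ins for git's
"enum object_type" and type_name(); it only contrasts a string-based type
check with an enum-based one.

```c
#include <string.h>

/* Hypothetical stand-ins for git's object types; not git's definitions. */
enum demo_object_type {
	DEMO_OBJ_COMMIT = 1,
	DEMO_OBJ_TREE,
	DEMO_OBJ_BLOB
};

/* Map the enum back to a printable name, in the spirit of type_name(). */
static const char *demo_type_name(enum demo_object_type type)
{
	switch (type) {
	case DEMO_OBJ_COMMIT:
		return "commit";
	case DEMO_OBJ_TREE:
		return "tree";
	case DEMO_OBJ_BLOB:
		return "blob";
	}
	return "unknown";
}

/* Before: the type is carried as a string, so the check needs strcmp(). */
static int demo_is_blob_by_string(const char *type)
{
	return !strcmp(type, "blob");
}

/* After: the type is carried as an enum, so the check is a comparison. */
static int demo_is_blob_by_enum(enum demo_object_type type)
{
	return type == DEMO_OBJ_BLOB;
}
```

The string form is still available for printing through the name lookup,
so only the comparison changes.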

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/ls-tree.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 0a28f32ccb..3f0225b097 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -66,17 +66,17 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 {
 	int retval = 0;
 	int baselen;
-	const char *type = blob_type;
+	enum object_type type = OBJ_BLOB;
 
 	if (S_ISGITLINK(mode)) {
-		type = commit_type;
+		type = OBJ_COMMIT;
 	} else if (S_ISDIR(mode)) {
 		if (show_recursive(base->buf, base->len, pathname)) {
 			retval = READ_TREE_RECURSIVE;
 			if (!(ls_options & LS_SHOW_TREES))
 				return retval;
 		}
-		type = tree_type;
+		type = OBJ_TREE;
 	}
 	else if (ls_options & LS_TREE_ONLY)
 		return 0;
@@ -84,7 +84,7 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	if (!(ls_options & LS_NAME_ONLY)) {
 		if (ls_options & LS_SHOW_SIZE) {
 			char size_text[24];
-			if (!strcmp(type, blob_type)) {
+			if (type == OBJ_BLOB) {
 				unsigned long size;
 				if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
 					xsnprintf(size_text, sizeof(size_text),
@@ -95,11 +95,11 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 			} else {
 				xsnprintf(size_text, sizeof(size_text), "-");
 			}
-			printf("%06o %s %s %7s\t", mode, type,
+			printf("%06o %s %s %7s\t", mode, type_name(type),
 			       find_unique_abbrev(oid, abbrev),
 			       size_text);
 		} else {
-			printf("%06o %s %s\t", mode, type,
+			printf("%06o %s %s\t", mode, type_name(type),
 			       find_unique_abbrev(oid, abbrev));
 		}
 	}
-- 
2.33.0.rc1.1794.g2ae0a9cb82


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v9 4/9] ls-tree: use "size_t", not "int" for "struct strbuf"'s "len"
  2022-01-06  4:31             ` [PATCH v9 0/9] " Teng Long
                                 ` (2 preceding siblings ...)
  2022-01-06  4:31               ` [PATCH v9 3/9] ls-tree: use "enum object_type", not {blob,tree,commit}_type Teng Long
@ 2022-01-06  4:31               ` Teng Long
  2022-01-06  4:31               ` [PATCH v9 5/9] ls-tree: optimize naming and handling of "return" in show_tree() Teng Long
                                 ` (5 subsequent siblings)
  9 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-06  4:31 UTC (permalink / raw)
  To: dyroneteng
  Cc: avarab, congdanhqx, git, gitster, peff, tenglong.tl, Johannes.Schindelin

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

The "struct strbuf"'s "len" member is a "size_t", not an "int", so
let's change our corresponding types accordingly. This also changes
the "len" and "speclen" variables, which are likewise used to store
the return value of strlen(), which returns "size_t", not "int".
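
As a small sketch of why the types should match (an illustration only;
demo_has_prefix() is a hypothetical helper, not a function in git), a
length taken from strlen() can be compared against a "size_t" length
without any signed/unsigned conversion:

```c
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if "spec" begins with the first "baselen" bytes of "base".
 * Both lengths are size_t, matching what strlen() returns.
 */
static int demo_has_prefix(const char *base, size_t baselen, const char *spec)
{
	size_t speclen = strlen(spec);

	if (speclen < baselen)
		return 0;
	return !strncmp(base, spec, baselen);
}
```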

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/ls-tree.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3f0225b097..eecc7482d5 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -31,7 +31,7 @@ static const  char * const ls_tree_usage[] = {
 	NULL
 };
 
-static int show_recursive(const char *base, int baselen, const char *pathname)
+static int show_recursive(const char *base, size_t baselen, const char *pathname)
 {
 	int i;
 
@@ -43,7 +43,7 @@ static int show_recursive(const char *base, int baselen, const char *pathname)
 
 	for (i = 0; i < pathspec.nr; i++) {
 		const char *spec = pathspec.items[i].match;
-		int len, speclen;
+		size_t len, speclen;
 
 		if (strncmp(base, spec, baselen))
 			continue;
@@ -65,7 +65,7 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 		const char *pathname, unsigned mode, void *context)
 {
 	int retval = 0;
-	int baselen;
+	size_t baselen;
 	enum object_type type = OBJ_BLOB;
 
 	if (S_ISGITLINK(mode)) {
-- 
2.33.0.rc1.1794.g2ae0a9cb82


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v9 5/9] ls-tree: optimize naming and handling of "return" in show_tree()
  2022-01-06  4:31             ` [PATCH v9 0/9] " Teng Long
                                 ` (3 preceding siblings ...)
  2022-01-06  4:31               ` [PATCH v9 4/9] ls-tree: use "size_t", not "int" for "struct strbuf"'s "len" Teng Long
@ 2022-01-06  4:31               ` Teng Long
  2022-01-06 20:44                 ` Junio C Hamano
  2022-01-06  4:31               ` [PATCH v9 6/9] ls-tree.c: support --object-only option for "git-ls-tree" Teng Long
                                 ` (4 subsequent siblings)
  9 siblings, 1 reply; 224+ messages in thread
From: Teng Long @ 2022-01-06  4:31 UTC (permalink / raw)
  To: dyroneteng
  Cc: avarab, congdanhqx, git, gitster, peff, tenglong.tl,
	Johannes.Schindelin, Teng Long

The variable that show_tree() returns is named "retval", which says
little about what it means. This commit makes the variable and the
related code clearer in context.

The change is made in three steps. The first is to rename "retval" to
the more meaningful "recursive".

The second addresses the inconsistent "return" statements in
show_tree(): some use "return retval;" and others use "return 0;",
which can be confusing to read. Every "return" now either returns
"recursive" or spells the value out as "!READ_TREE_RECURSIVE".

The third addresses the nested "if" statements around the "return"
statements, which make the code harder to follow. That logic moves
into two helper functions, init_type() and init_recursive().

After these steps, show_tree() has a single, consistently named return
variable. It first initializes "type" from "mode", then calls
init_recursive() to set "recursive", which indicates whether to keep
reading recursively into the tree. The code is clearer, so there is
no need to look at read_tree_at() in "tree.c" to understand what the
return value means.
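
The flow described above can be sketched in isolation as follows. This is
an illustration under stated assumptions, not the patch: every DEMO_ and
demo_ name is hypothetical, and the mode constants are defined locally
(with the conventional octal values) so that the example does not depend
on any git header.

```c
/* Locally defined mode bits; assumed values, for illustration only. */
#define DEMO_S_IFMT      0170000
#define DEMO_S_IFDIR     0040000
#define DEMO_S_IFGITLINK 0160000

/* Hypothetical option bits, standing in for the LS_* flags. */
#define DEMO_SHOW_TREES 1
#define DEMO_TREE_ONLY  2

enum demo_type { DEMO_BLOB, DEMO_TREE, DEMO_COMMIT };

/* Step 1: derive the entry type from its mode, as init_type() does. */
static enum demo_type demo_type_from_mode(unsigned mode)
{
	if ((mode & DEMO_S_IFMT) == DEMO_S_IFGITLINK)
		return DEMO_COMMIT;
	if ((mode & DEMO_S_IFMT) == DEMO_S_IFDIR)
		return DEMO_TREE;
	return DEMO_BLOB;
}

/*
 * Step 2: decide whether the entry is printed. This mirrors the two
 * early returns in show_tree(): a tree we are about to recurse into is
 * hidden unless trees are shown, and a blob is hidden in tree-only mode.
 */
static int demo_should_print(enum demo_type type, int recursive, int options)
{
	if (type == DEMO_TREE && recursive && !(options & DEMO_SHOW_TREES))
		return 0;
	if (type == DEMO_BLOB && (options & DEMO_TREE_ONLY))
		return 0;
	return 1;
}
```

Keeping the type lookup and the print decision apart is what lets the two
flat "if" statements replace the nested ones.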

Signed-off-by: Teng Long <dyronetengb@gmail.com>
---
 builtin/ls-tree.c | 38 ++++++++++++++++++++++++--------------
 1 file changed, 24 insertions(+), 14 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index eecc7482d5..7383dddf8c 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -61,25 +61,35 @@ static int show_recursive(const char *base, size_t baselen, const char *pathname
 	return 0;
 }
 
+static void init_type(unsigned mode, enum object_type *type)
+{
+	if (S_ISGITLINK(mode))
+		*type = OBJ_COMMIT;
+	else if (S_ISDIR(mode))
+		*type = OBJ_TREE;
+}
+
+static void init_recursive(struct strbuf *base, const char *pathname,
+				int *recursive)
+{
+	if (show_recursive(base->buf, base->len, pathname))
+		*recursive = READ_TREE_RECURSIVE;
+}
+
 static int show_tree(const struct object_id *oid, struct strbuf *base,
 		const char *pathname, unsigned mode, void *context)
 {
-	int retval = 0;
+	int recursive = 0;
 	size_t baselen;
 	enum object_type type = OBJ_BLOB;
 
-	if (S_ISGITLINK(mode)) {
-		type = OBJ_COMMIT;
-	} else if (S_ISDIR(mode)) {
-		if (show_recursive(base->buf, base->len, pathname)) {
-			retval = READ_TREE_RECURSIVE;
-			if (!(ls_options & LS_SHOW_TREES))
-				return retval;
-		}
-		type = OBJ_TREE;
-	}
-	else if (ls_options & LS_TREE_ONLY)
-		return 0;
+	init_type(mode, &type);
+	init_recursive(base, pathname, &recursive);
+
+	if (type == OBJ_TREE && recursive && !(ls_options & LS_SHOW_TREES))
+		return recursive;
+	if (type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
+		return !READ_TREE_RECURSIVE;
 
 	if (!(ls_options & LS_NAME_ONLY)) {
 		if (ls_options & LS_SHOW_SIZE) {
@@ -109,7 +119,7 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 				   chomp_prefix ? ls_tree_prefix : NULL,
 				   stdout, line_termination);
 	strbuf_setlen(base, baselen);
-	return retval;
+	return recursive;
 }
 
 int cmd_ls_tree(int argc, const char **argv, const char *prefix)
-- 
2.33.0.rc1.1794.g2ae0a9cb82


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v9 6/9] ls-tree.c: support --object-only option for "git-ls-tree"
  2022-01-06  4:31             ` [PATCH v9 0/9] " Teng Long
                                 ` (4 preceding siblings ...)
  2022-01-06  4:31               ` [PATCH v9 5/9] ls-tree: optimize naming and handling of "return" in show_tree() Teng Long
@ 2022-01-06  4:31               ` Teng Long
  2022-01-06  4:31               ` [PATCH v9 7/9] ls-tree.c: introduce struct "show_tree_data" Teng Long
                                 ` (3 subsequent siblings)
  9 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-06  4:31 UTC (permalink / raw)
  To: dyroneteng
  Cc: avarab, congdanhqx, git, gitster, peff, tenglong.tl, Johannes.Schindelin

We usually pipe the output from `git ls-tree` to tools like
`sed` or `cut` when we only want to extract some fields.

When we want only the pathname component, we can pass the
`--name-only` option to avoid such a pipeline, but there are no
options for extracting other fields.

Teach the command the "--object-only" option, which shows only the
object name. This option cannot be used together with
"--name-only" or "--long"; they are mutually exclusive. (Previously,
"--name-only" and "--long" could be combined; this commit fixes
that bug along the way.)

The "show_tree" function is refactored slightly to use bitwise
operations to decide which fields to print to stdout. The reason
is to keep this logic from becoming harder to read with the
addition of "--object-only", and to make it easier to extend.
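
A minimal sketch of the field bitmask idea follows. It is not the patch
itself: the DEMO_ names are hypothetical, the mapping is simplified (it
ignores the ls_options checks), and only the bit layout is taken from the
FIELD_* macros added below.

```c
/* One bit per output field, laid out like the FIELD_* macros. */
#define DEMO_FIELD_FILE_NAME   1u
#define DEMO_FIELD_SIZE        (1u << 1)
#define DEMO_FIELD_OBJECT_NAME (1u << 2)
#define DEMO_FIELD_TYPE        (1u << 3)
#define DEMO_FIELD_MODE        (1u << 4)

/* Default output: every field except the size (binary 11101). */
#define DEMO_FIELD_DEFAULT (DEMO_FIELD_MODE | DEMO_FIELD_TYPE | \
			    DEMO_FIELD_OBJECT_NAME | DEMO_FIELD_FILE_NAME)
#define DEMO_FIELD_LONG    (DEMO_FIELD_DEFAULT | DEMO_FIELD_SIZE)

/* Hypothetical command modes; at most one can be selected. */
enum demo_cmdmode {
	DEMO_MODE_UNSPECIFIED,
	DEMO_MODE_NAME_ONLY,
	DEMO_MODE_OBJECT_ONLY,
	DEMO_MODE_LONG
};

/* Translate the selected mode into the set of fields to print. */
static unsigned int demo_shown_fields(enum demo_cmdmode mode)
{
	switch (mode) {
	case DEMO_MODE_NAME_ONLY:
		return DEMO_FIELD_FILE_NAME;
	case DEMO_MODE_OBJECT_ONLY:
		return DEMO_FIELD_OBJECT_NAME;
	case DEMO_MODE_LONG:
		return DEMO_FIELD_LONG;
	case DEMO_MODE_UNSPECIFIED:
		break;
	}
	return DEMO_FIELD_DEFAULT;
}
```

Because the mode is a single value rather than a set of independent
bits, two "only" modes cannot be requested at once, which matches the
mutual exclusion described above.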

In terms of performance, there is no regression compared to
"master" (2ae0a9cb8298185a94e5998086f380a355dd8907). Here are the
results of the performance tests in my environment, based on the
linux repository:

    $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r HEAD"
    Benchmark 1: /opt/git/master/bin/git ls-tree -r HEAD
    Time (mean ± σ):     105.8 ms ±   2.7 ms    [User: 85.7 ms, System: 20.0 ms]
    Range (min … max):   101.5 ms … 111.3 ms    28 runs

    $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r HEAD"
    Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r HEAD
    Time (mean ± σ):     105.0 ms ±   3.0 ms    [User: 83.7 ms, System: 21.2 ms]
    Range (min … max):    99.3 ms … 109.5 ms    27 runs

    $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r -l HEAD"
    Benchmark 1: /opt/git/master/bin/git ls-tree -r -l HEAD
    Time (mean ± σ):     337.4 ms ±  10.9 ms    [User: 308.3 ms, System: 29.0 ms]
    Range (min … max):   323.0 ms … 355.0 ms    10 runs

    $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r -l HEAD"
    Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r -l HEAD
    Time (mean ± σ):     337.6 ms ±   6.2 ms    [User: 309.4 ms, System: 28.1 ms]
    Range (min … max):   330.4 ms … 349.9 ms    10 runs

Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
 Documentation/git-ls-tree.txt |   7 +-
 builtin/ls-tree.c             | 141 +++++++++++++++++++++++++---------
 t/t3104-ls-tree-oid.sh        |  51 ++++++++++++
 3 files changed, 160 insertions(+), 39 deletions(-)
 create mode 100755 t/t3104-ls-tree-oid.sh

diff --git a/Documentation/git-ls-tree.txt b/Documentation/git-ls-tree.txt
index db02d6d79a..729370f235 100644
--- a/Documentation/git-ls-tree.txt
+++ b/Documentation/git-ls-tree.txt
@@ -10,7 +10,7 @@ SYNOPSIS
 --------
 [verse]
 'git ls-tree' [-d] [-r] [-t] [-l] [-z]
-	    [--name-only] [--name-status] [--full-name] [--full-tree] [--abbrev[=<n>]]
+	    [--name-only] [--name-status] [--object-only] [--full-name] [--full-tree] [--abbrev[=<n>]]
 	    <tree-ish> [<path>...]
 
 DESCRIPTION
@@ -59,6 +59,11 @@ OPTIONS
 --name-only::
 --name-status::
 	List only filenames (instead of the "long" output), one per line.
+	Cannot be combined with `--object-only`.
+
+--object-only::
+	List only names of the objects, one per line. Cannot be combined
+	with `--name-only` or `--name-status`.
 
 --abbrev[=<n>]::
 	Instead of showing the full 40-byte hexadecimal object
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 7383dddf8c..6b5e3ab9dd 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -16,22 +16,60 @@
 
 static int line_termination = '\n';
 #define LS_RECURSIVE 1
-#define LS_TREE_ONLY 2
-#define LS_SHOW_TREES 4
-#define LS_NAME_ONLY 8
-#define LS_SHOW_SIZE 16
+#define LS_TREE_ONLY (1 << 1)
+#define LS_SHOW_TREES (1 << 2)
+#define LS_NAME_ONLY (1 << 3)
+#define LS_SHOW_SIZE (1 << 4)
+#define LS_OBJECT_ONLY (1 << 5)
 static int abbrev;
 static int ls_options;
 static struct pathspec pathspec;
 static int chomp_prefix;
 static const char *ls_tree_prefix;
+static unsigned int shown_fields;
+#define FIELD_FILE_NAME 1
+#define FIELD_SIZE (1 << 1)
+#define FIELD_OBJECT_NAME (1 << 2)
+#define FIELD_TYPE (1 << 3)
+#define FIELD_MODE (1 << 4)
+#define FIELD_DEFAULT 29 /* 11101 size is not shown to output by default */
+#define FIELD_LONG_DEFAULT  (FIELD_DEFAULT | FIELD_SIZE)
 
 static const  char * const ls_tree_usage[] = {
 	N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
 	NULL
 };
 
-static int show_recursive(const char *base, size_t baselen, const char *pathname)
+enum {
+	MODE_UNSPECIFIED = 0,
+	MODE_NAME_ONLY,
+	MODE_OBJECT_ONLY,
+	MODE_LONG,
+};
+
+static int cmdmode = MODE_UNSPECIFIED;
+
+static int parse_shown_fields(void)
+{
+	if (cmdmode == MODE_NAME_ONLY) {
+		shown_fields = FIELD_FILE_NAME;
+		return 0;
+	}
+	if (cmdmode == MODE_OBJECT_ONLY) {
+		shown_fields = FIELD_OBJECT_NAME;
+		return 0;
+	}
+	if (!ls_options || (ls_options & LS_RECURSIVE)
+	    || (ls_options & LS_SHOW_TREES)
+	    || (ls_options & LS_TREE_ONLY))
+		shown_fields = FIELD_DEFAULT;
+	if (cmdmode == MODE_LONG)
+		shown_fields = FIELD_LONG_DEFAULT;
+	return 1;
+}
+
+static int show_recursive(const char *base, size_t baselen,
+			  const char *pathname)
 {
 	int i;
 
@@ -61,6 +99,39 @@ static int show_recursive(const char *base, size_t baselen, const char *pathname
 	return 0;
 }
 
+static int show_default(const struct object_id *oid, enum object_type type,
+			const char *pathname, unsigned mode,
+			struct strbuf *base)
+{
+	size_t baselen = base->len;
+
+	if (shown_fields & FIELD_SIZE) {
+		char size_text[24];
+		if (type == OBJ_BLOB) {
+			unsigned long size;
+			if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
+				xsnprintf(size_text, sizeof(size_text), "BAD");
+			else
+				xsnprintf(size_text, sizeof(size_text),
+					  "%" PRIuMAX, (uintmax_t)size);
+		} else {
+			xsnprintf(size_text, sizeof(size_text), "-");
+		}
+		printf("%06o %s %s %7s\t", mode, type_name(type),
+		find_unique_abbrev(oid, abbrev), size_text);
+	} else {
+		printf("%06o %s %s\t", mode, type_name(type),
+		find_unique_abbrev(oid, abbrev));
+	}
+	baselen = base->len;
+	strbuf_addstr(base, pathname);
+	write_name_quoted_relative(base->buf,
+				   chomp_prefix ? ls_tree_prefix : NULL, stdout,
+				   line_termination);
+	strbuf_setlen(base, baselen);
+	return 1;
+}
+
 static void init_type(unsigned mode, enum object_type *type)
 {
 	if (S_ISGITLINK(mode))
@@ -91,34 +162,24 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	if (type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
 		return !READ_TREE_RECURSIVE;
 
-	if (!(ls_options & LS_NAME_ONLY)) {
-		if (ls_options & LS_SHOW_SIZE) {
-			char size_text[24];
-			if (type == OBJ_BLOB) {
-				unsigned long size;
-				if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
-					xsnprintf(size_text, sizeof(size_text),
-						  "BAD");
-				else
-					xsnprintf(size_text, sizeof(size_text),
-						  "%"PRIuMAX, (uintmax_t)size);
-			} else {
-				xsnprintf(size_text, sizeof(size_text), "-");
-			}
-			printf("%06o %s %s %7s\t", mode, type_name(type),
-			       find_unique_abbrev(oid, abbrev),
-			       size_text);
-		} else {
-			printf("%06o %s %s\t", mode, type_name(type),
-			       find_unique_abbrev(oid, abbrev));
-		}
+	if (shown_fields == FIELD_OBJECT_NAME) {
+		printf("%s%c", find_unique_abbrev(oid, abbrev), line_termination);
+		return recursive;
 	}
-	baselen = base->len;
-	strbuf_addstr(base, pathname);
-	write_name_quoted_relative(base->buf,
-				   chomp_prefix ? ls_tree_prefix : NULL,
-				   stdout, line_termination);
-	strbuf_setlen(base, baselen);
+
+	if (shown_fields == FIELD_FILE_NAME) {
+		baselen = base->len;
+		strbuf_addstr(base, pathname);
+		write_name_quoted_relative(base->buf,
+					   chomp_prefix ? ls_tree_prefix : NULL,
+					   stdout, line_termination);
+		strbuf_setlen(base, baselen);
+		return recursive;
+	}
+
+	if (shown_fields >= FIELD_DEFAULT)
+		show_default(oid, type, pathname, mode, base);
+
 	return recursive;
 }
 
@@ -136,12 +197,14 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 			LS_SHOW_TREES),
 		OPT_SET_INT('z', NULL, &line_termination,
 			    N_("terminate entries with NUL byte"), 0),
-		OPT_BIT('l', "long", &ls_options, N_("include object size"),
-			LS_SHOW_SIZE),
-		OPT_BIT(0, "name-only", &ls_options, N_("list only filenames"),
-			LS_NAME_ONLY),
-		OPT_BIT(0, "name-status", &ls_options, N_("list only filenames"),
-			LS_NAME_ONLY),
+		OPT_CMDMODE('l', "long", &cmdmode, N_("include object size"),
+			    MODE_LONG),
+		OPT_CMDMODE(0, "name-only", &cmdmode, N_("list only filenames"),
+			    MODE_NAME_ONLY),
+		OPT_CMDMODE(0, "name-status", &cmdmode, N_("list only filenames"),
+			    MODE_NAME_ONLY),
+		OPT_CMDMODE(0, "object-only", &cmdmode, N_("list only objects"),
+			    MODE_OBJECT_ONLY),
 		OPT_SET_INT(0, "full-name", &chomp_prefix,
 			    N_("use full path names"), 0),
 		OPT_BOOL(0, "full-tree", &full_tree,
@@ -172,6 +235,8 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	if (get_oid(argv[0], &oid))
 		die("Not a valid object name %s", argv[0]);
 
+	parse_shown_fields();
+
 	/*
 	 * show_recursive() rolls its own matching code and is
 	 * generally ignorant of 'struct pathspec'. The magic mask
diff --git a/t/t3104-ls-tree-oid.sh b/t/t3104-ls-tree-oid.sh
new file mode 100755
index 0000000000..6ce62bd769
--- /dev/null
+++ b/t/t3104-ls-tree-oid.sh
@@ -0,0 +1,51 @@
+#!/bin/sh
+
+test_description='git ls-tree objects handling.'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	test_commit A &&
+	test_commit B &&
+	mkdir -p C &&
+	test_commit C/D.txt &&
+	find *.txt path* \( -type f -o -type l \) -print |
+	xargs git update-index --add &&
+	tree=$(git write-tree) &&
+	echo $tree
+'
+
+test_expect_success 'usage: --object-only' '
+	git ls-tree --object-only $tree >current &&
+	git ls-tree $tree >result &&
+	cut -f1 result | cut -d " " -f3 >expected &&
+	test_cmp current expected
+'
+
+test_expect_success 'usage: --object-only with -r' '
+	git ls-tree --object-only -r $tree >current &&
+	git ls-tree -r $tree >result &&
+	cut -f1 result | cut -d " " -f3 >expected &&
+	test_cmp current expected
+'
+
+test_expect_success 'usage: --object-only with --abbrev' '
+	git ls-tree --object-only --abbrev=6 $tree >current &&
+	git ls-tree --abbrev=6 $tree >result &&
+	cut -f1 result | cut -d " " -f3 >expected &&
+	test_cmp current expected
+'
+
+test_expect_success 'usage: incompatible options: --name-only with --object-only' '
+	test_expect_code 129 git ls-tree --object-only --name-only $tree
+'
+
+test_expect_success 'usage: incompatible options: --name-status with --object-only' '
+	test_expect_code 129 git ls-tree --object-only --name-status $tree
+'
+
+test_expect_success 'usage: incompatible options: --long with --object-only' '
+	test_expect_code 129 git ls-tree --object-only --long $tree
+'
+
+test_done
-- 
2.33.0.rc1.1794.g2ae0a9cb82


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v9 7/9] ls-tree.c: introduce struct "show_tree_data"
  2022-01-06  4:31             ` [PATCH v9 0/9] " Teng Long
                                 ` (5 preceding siblings ...)
  2022-01-06  4:31               ` [PATCH v9 6/9] ls-tree.c: support --object-only option for "git-ls-tree" Teng Long
@ 2022-01-06  4:31               ` Teng Long
  2022-01-06  4:31               ` [PATCH v9 8/9] ls-tree.c: introduce "--format" option Teng Long
                                 ` (2 subsequent siblings)
  9 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-06  4:31 UTC (permalink / raw)
  To: dyroneteng
  Cc: avarab, congdanhqx, git, gitster, peff, tenglong.tl, Johannes.Schindelin

"show_tree_data" is a struct that packages the necessary fields for
"show_tree()". This commit is a pre-prepared commit for supporting
"--format" option and it does not affect any existing functionality.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
 builtin/ls-tree.c | 50 +++++++++++++++++++++++++++++------------------
 1 file changed, 31 insertions(+), 19 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 6b5e3ab9dd..12beb02423 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -35,6 +35,14 @@ static unsigned int shown_fields;
 #define FIELD_DEFAULT 29 /* 11101 size is not shown to output by default */
 #define FIELD_LONG_DEFAULT  (FIELD_DEFAULT | FIELD_SIZE)
 
+struct show_tree_data {
+	unsigned mode;
+	enum object_type type;
+	const struct object_id *oid;
+	const char *pathname;
+	struct strbuf *base;
+};
+
 static const  char * const ls_tree_usage[] = {
 	N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
 	NULL
@@ -99,17 +107,15 @@ static int show_recursive(const char *base, size_t baselen,
 	return 0;
 }
 
-static int show_default(const struct object_id *oid, enum object_type type,
-			const char *pathname, unsigned mode,
-			struct strbuf *base)
+static int show_default(struct show_tree_data *data)
 {
-	size_t baselen = base->len;
+	size_t baselen = data->base->len;
 
 	if (shown_fields & FIELD_SIZE) {
 		char size_text[24];
-		if (type == OBJ_BLOB) {
+		if (data->type == OBJ_BLOB) {
 			unsigned long size;
-			if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
+			if (oid_object_info(the_repository, data->oid, &size) == OBJ_BAD)
 				xsnprintf(size_text, sizeof(size_text), "BAD");
 			else
 				xsnprintf(size_text, sizeof(size_text),
@@ -117,18 +123,18 @@ static int show_default(const struct object_id *oid, enum object_type type,
 		} else {
 			xsnprintf(size_text, sizeof(size_text), "-");
 		}
-		printf("%06o %s %s %7s\t", mode, type_name(type),
-		find_unique_abbrev(oid, abbrev), size_text);
+		printf("%06o %s %s %7s\t", data->mode, type_name(data->type),
+		find_unique_abbrev(data->oid, abbrev), size_text);
 	} else {
-		printf("%06o %s %s\t", mode, type_name(type),
-		find_unique_abbrev(oid, abbrev));
+		printf("%06o %s %s\t", data->mode, type_name(data->type),
+		find_unique_abbrev(data->oid, abbrev));
 	}
-	baselen = base->len;
-	strbuf_addstr(base, pathname);
-	write_name_quoted_relative(base->buf,
+	baselen = data->base->len;
+	strbuf_addstr(data->base, data->pathname);
+	write_name_quoted_relative(data->base->buf,
 				   chomp_prefix ? ls_tree_prefix : NULL, stdout,
 				   line_termination);
-	strbuf_setlen(base, baselen);
+	strbuf_setlen(data->base, baselen);
 	return 1;
 }
 
@@ -152,14 +158,20 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 {
 	int recursive = 0;
 	size_t baselen;
-	enum object_type type = OBJ_BLOB;
+	struct show_tree_data data = {
+		.mode = mode,
+		.type = OBJ_BLOB,
+		.oid = oid,
+		.pathname = pathname,
+		.base = base,
+	};
 
-	init_type(mode, &type);
+	init_type(mode, &data.type);
 	init_recursive(base, pathname, &recursive);
 
-	if (type == OBJ_TREE && recursive && !(ls_options & LS_SHOW_TREES))
+	if (data.type == OBJ_TREE && recursive && !(ls_options & LS_SHOW_TREES))
 		return recursive;
-	if (type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
+	if (data.type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
 		return !READ_TREE_RECURSIVE;
 
 	if (shown_fields == FIELD_OBJECT_NAME) {
@@ -178,7 +190,7 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	}
 
 	if (shown_fields >= FIELD_DEFAULT)
-		show_default(oid, type, pathname, mode, base);
+		show_default(&data);
 
 	return recursive;
 }
-- 
2.33.0.rc1.1794.g2ae0a9cb82



* [PATCH v9 8/9] ls-tree.c: introduce "--format" option
  2022-01-06  4:31             ` [PATCH v9 0/9] " Teng Long
                                 ` (6 preceding siblings ...)
  2022-01-06  4:31               ` [PATCH v9 7/9] ls-tree.c: introduce struct "show_tree_data" Teng Long
@ 2022-01-06  4:31               ` Teng Long
  2022-01-10 19:41                 ` Martin Ågren
  2022-01-06  4:31               ` [PATCH v9 9/9] cocci: allow padding with `strbuf_addf()` Teng Long
  2022-01-13  3:42               ` [PATCH v10 0/9] ls-tree: "--object-only" and "--format" opts Teng Long
  9 siblings, 1 reply; 224+ messages in thread
From: Teng Long @ 2022-01-06  4:31 UTC (permalink / raw)
  To: dyroneteng
  Cc: avarab, congdanhqx, git, gitster, peff, tenglong.tl, Johannes.Schindelin

Add a --format option to ls-tree. The command already has a default
output, plus the --long and --name-only options, which emit the default
output along with the object size, or only the object paths, respectively.

Rather than add --type-only, --object-only etc. we can just support a
--format using a strbuf_expand() similar to "for-each-ref
--format". We might still add such options in the future for
convenience.

The --format implementation is slower than the existing code, but this
change does not cause any performance regressions. We leave the
existing show_tree() unchanged, and only run show_tree_fmt() if the
given --format differs from the hardcoded built-in ones that
correspond to the existing modes.

I.e. something like the "--long" output would be much slower with
this, mainly due to how we need to allocate various things to do with
quote.c instead of spewing the output directly to stdout.

The new '--format' option comes from Ævar Arnfjörð Bjarmason's
idea and suggestion; this commit adapts the changes from the
original discussion on the mailing list [1].

Here are the results of the performance tests:

1. Default format (hitting the built-in formats):

    "git ls-tree <tree-ish>" vs "--format='%(mode) %(type) %(object)%x09%(file)'"

    $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r HEAD"
    Benchmark 1: /opt/git/master/bin/git ls-tree -r HEAD
    Time (mean ± σ):     105.2 ms ±   3.3 ms    [User: 84.3 ms, System: 20.8 ms]
    Range (min … max):    99.2 ms … 113.2 ms    28 runs

    $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object)%x09%(file)'  HEAD"
    Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object)%x09%(file)'  HEAD
    Time (mean ± σ):     106.4 ms ±   2.7 ms    [User: 86.1 ms, System: 20.2 ms]
    Range (min … max):   100.2 ms … 110.5 ms    29 runs

2. Default format including object size (hitting the built-in formats):

    "git ls-tree -l <tree-ish>" vs "--format='%(mode) %(type) %(object) %(size:padded)%x09%(file)'"

    $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r -l HEAD"
    Benchmark 1: /opt/git/master/bin/git ls-tree -r -l HEAD
    Time (mean ± σ):     335.1 ms ±   6.5 ms    [User: 304.6 ms, System: 30.4 ms]
    Range (min … max):   327.5 ms … 348.4 ms    10 runs

    $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object) %(size:padded)%x09%(file)'  HEAD"
    Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object) %(size:padded)%x09%(file)'  HEAD
    Time (mean ± σ):     337.2 ms ±   8.2 ms    [User: 309.2 ms, System: 27.9 ms]
    Range (min … max):   328.8 ms … 349.4 ms    10 runs

Links:
	[1] https://public-inbox.org/git/RFC-patch-6.7-eac299f06ff-20211217T131635Z-avarab@gmail.com/

Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
 Documentation/git-ls-tree.txt |  50 ++++++++++-
 builtin/ls-tree.c             | 158 ++++++++++++++++++++++++++++++----
 t/t3105-ls-tree-format.sh     |  55 ++++++++++++
 3 files changed, 243 insertions(+), 20 deletions(-)
 create mode 100755 t/t3105-ls-tree-format.sh

diff --git a/Documentation/git-ls-tree.txt b/Documentation/git-ls-tree.txt
index 729370f235..2ca8667c5b 100644
--- a/Documentation/git-ls-tree.txt
+++ b/Documentation/git-ls-tree.txt
@@ -10,9 +10,9 @@ SYNOPSIS
 --------
 [verse]
 'git ls-tree' [-d] [-r] [-t] [-l] [-z]
-	    [--name-only] [--name-status] [--object-only] [--full-name] [--full-tree] [--abbrev[=<n>]]
-	    <tree-ish> [<path>...]
-
+	    [--name-only] [--name-status] [--object-only]
+	    [--full-name] [--full-tree] [--abbrev[=<n>]]
+	    [--format=<format>] <tree-ish> [<path>...]
 DESCRIPTION
 -----------
 Lists the contents of a given tree object, like what "/bin/ls -a" does
@@ -79,6 +79,16 @@ OPTIONS
 	Do not limit the listing to the current working directory.
 	Implies --full-name.
 
+--format=<format>::
+	A string that interpolates `%(fieldname)` from the result
+	being shown. It also interpolates `%%` to `%`, and
+	`%xNN` where `NN` are hex digits interpolates to character
+	with hex code `NN`; for example `%x00` interpolates to
+	`\0` (NUL), `%x09` to `\t` (TAB) and `%x0a` to `\n` (LF).
+	When specified, `--format` cannot be combined with other
+	format-altering options, including `--long`, `--name-only`
+	and `--object-only`.
+
 [<path>...]::
 	When paths are given, show them (note that this isn't really raw
 	pathnames, but rather a list of patterns to match).  Otherwise
@@ -87,6 +97,9 @@ OPTIONS
 
 Output Format
 -------------
+
+Default format:
+
         <mode> SP <type> SP <object> TAB <file>
 
 This output format is compatible with what `--index-info --stdin` of
@@ -105,6 +118,37 @@ quoted as explained for the configuration variable `core.quotePath`
 (see linkgit:git-config[1]).  Using `-z` the filename is output
 verbatim and the line is terminated by a NUL byte.
 
+Customized format:
+
+It is possible to print in a customized format by using `%(fieldname)` with
+the `--format` option. For example, to print only the <object> and <file>
+fields in a JSON style, run with a specific "--format" like
+
+        git ls-tree --format='{"object":"%(object)", "file":"%(file)"}' <tree-ish>
+
+The output format changes to:
+
+        {"object":"<object>", "file":"<file>"}
+
+FIELD NAMES
+-----------
+
+Various values from structured fields can be used to interpolate
+into the resulting output. For each output line, the following
+names can be used:
+
+mode::
+	The mode of the object.
+type::
+	The type of the object (`blob` or `tree`).
+object::
+	The name of the object.
+size[:padded]::
+	The size of the object ("-" if it's a tree).
+	It also supports a padded format of size with "%(size:padded)".
+file::
+	The filename of the object.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 12beb02423..d0ba7c4365 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -57,6 +57,12 @@ enum {
 
 static int cmdmode = MODE_UNSPECIFIED;
 
+static const char *format;
+static const char *default_format = "%(mode) %(type) %(object)%x09%(file)";
+static const char *long_format = "%(mode) %(type) %(object) %(size:padded)%x09%(file)";
+static const char *name_only_format = "%(file)";
+static const char *object_only_format = "%(object)";
+
 static int parse_shown_fields(void)
 {
 	if (cmdmode == MODE_NAME_ONLY) {
@@ -76,6 +82,72 @@ static int parse_shown_fields(void)
 	return 1;
 }
 
+static void expand_objectsize(struct strbuf *line, const struct object_id *oid,
+			      const enum object_type type, unsigned int padded)
+{
+	if (type == OBJ_BLOB) {
+		unsigned long size;
+		if (oid_object_info(the_repository, oid, &size) < 0)
+			die(_("could not get object info about '%s'"),
+			    oid_to_hex(oid));
+		if (padded)
+			strbuf_addf(line, "%7" PRIuMAX, (uintmax_t)size);
+		else
+			strbuf_addf(line, "%" PRIuMAX, (uintmax_t)size);
+	} else if (padded) {
+		strbuf_addf(line, "%7s", "-");
+	} else {
+		strbuf_addstr(line, "-");
+	}
+}
+
+static size_t expand_show_tree(struct strbuf *line, const char *start,
+			       void *context)
+{
+	struct show_tree_data *data = context;
+	const char *end;
+	const char *p;
+	unsigned int errlen;
+	size_t len = strbuf_expand_literal_cb(line, start, NULL);
+
+	if (len)
+		return len;
+	if (*start != '(')
+		die(_("bad ls-tree format: as '%s'"), start);
+
+	end = strchr(start + 1, ')');
+	if (!end)
+		die(_("bad ls-tree format: element '%s' does not end in ')'"), start);
+
+	len = end - start + 1;
+	if (skip_prefix(start, "(mode)", &p)) {
+		strbuf_addf(line, "%06o", data->mode);
+	} else if (skip_prefix(start, "(type)", &p)) {
+		strbuf_addstr(line, type_name(data->type));
+	} else if (skip_prefix(start, "(size:padded)", &p)) {
+		expand_objectsize(line, data->oid, data->type, 1);
+	} else if (skip_prefix(start, "(size)", &p)) {
+		expand_objectsize(line, data->oid, data->type, 0);
+	} else if (skip_prefix(start, "(object)", &p)) {
+		strbuf_add_unique_abbrev(line, data->oid, abbrev);
+	} else if (skip_prefix(start, "(file)", &p)) {
+		const char *name = data->base->buf;
+		const char *prefix = chomp_prefix ? ls_tree_prefix : NULL;
+		struct strbuf quoted = STRBUF_INIT;
+		struct strbuf sb = STRBUF_INIT;
+		strbuf_addstr(data->base, data->pathname);
+		name = relative_path(data->base->buf, prefix, &sb);
+		quote_c_style(name, &quoted, NULL, 0);
+		strbuf_addbuf(line, &quoted);
+		strbuf_release(&sb);
+		strbuf_release(&quoted);
+	} else {
+		errlen = (unsigned long)len;
+		die(_("bad ls-tree format: %%%.*s"), errlen, start);
+	}
+	return len;
+}
+
 static int show_recursive(const char *base, size_t baselen,
 			  const char *pathname)
 {
@@ -107,6 +179,52 @@ static int show_recursive(const char *base, size_t baselen,
 	return 0;
 }
 
+static void init_recursive(struct strbuf *base, const char *pathname,
+				int *recursive)
+{
+	if (show_recursive(base->buf, base->len, pathname))
+		*recursive = READ_TREE_RECURSIVE;
+}
+
+static void init_type(unsigned mode, enum object_type *type)
+{
+	if (S_ISGITLINK(mode))
+		*type = OBJ_COMMIT;
+	else if (S_ISDIR(mode))
+		*type = OBJ_TREE;
+}
+
+static int show_tree_fmt(const struct object_id *oid, struct strbuf *base,
+			 const char *pathname, unsigned mode, void *context)
+{
+	size_t baselen;
+	int recursive = 0;
+	struct strbuf line = STRBUF_INIT;
+	struct show_tree_data data = {
+		.mode = mode,
+		.type = OBJ_BLOB,
+		.oid = oid,
+		.pathname = pathname,
+		.base = base,
+	};
+
+	init_type(mode, &data.type);
+	init_recursive(base, pathname, &recursive);
+
+	if (data.type == OBJ_TREE && recursive && !(ls_options & LS_SHOW_TREES))
+		return recursive;
+	if (data.type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
+		return !READ_TREE_RECURSIVE;
+
+	baselen = base->len;
+	strbuf_expand(&line, format, expand_show_tree, &data);
+	strbuf_addch(&line, line_termination);
+	fwrite(line.buf, line.len, 1, stdout);
+	strbuf_release(&line);
+	strbuf_setlen(base, baselen);
+	return recursive;
+}
+
 static int show_default(struct show_tree_data *data)
 {
 	size_t baselen = data->base->len;
@@ -138,21 +256,6 @@ static int show_default(struct show_tree_data *data)
 	return 1;
 }
 
-static void init_type(unsigned mode, enum object_type *type)
-{
-	if (S_ISGITLINK(mode))
-		*type = OBJ_COMMIT;
-	else if (S_ISDIR(mode))
-		*type = OBJ_TREE;
-}
-
-static void init_recursive(struct strbuf *base, const char *pathname,
-				int *recursive)
-{
-	if (show_recursive(base->buf, base->len, pathname))
-		*recursive = READ_TREE_RECURSIVE;
-}
-
 static int show_tree(const struct object_id *oid, struct strbuf *base,
 		const char *pathname, unsigned mode, void *context)
 {
@@ -200,6 +303,7 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	struct object_id oid;
 	struct tree *tree;
 	int i, full_tree = 0;
+	read_tree_fn_t fn = show_tree;
 	const struct option ls_tree_options[] = {
 		OPT_BIT('d', NULL, &ls_options, N_("only show trees"),
 			LS_TREE_ONLY),
@@ -222,6 +326,9 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 		OPT_BOOL(0, "full-tree", &full_tree,
 			 N_("list entire tree; not just current directory "
 			    "(implies --full-name)")),
+		OPT_STRING_F(0, "format", &format, N_("format"),
+			     N_("format to use for the output"),
+			     PARSE_OPT_NONEG),
 		OPT__ABBREV(&abbrev),
 		OPT_END()
 	};
@@ -242,6 +349,10 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	    ((LS_TREE_ONLY|LS_RECURSIVE) & ls_options))
 		ls_options |= LS_SHOW_TREES;
 
+	if (format && cmdmode)
+		usage_msg_opt(
+			_("--format can't be combined with other format-altering options"),
+			ls_tree_usage, ls_tree_options);
 	if (argc < 1)
 		usage_with_options(ls_tree_usage, ls_tree_options);
 	if (get_oid(argv[0], &oid))
@@ -265,6 +376,19 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	tree = parse_tree_indirect(&oid);
 	if (!tree)
 		die("not a tree object");
-	return !!read_tree(the_repository, tree,
-			   &pathspec, show_tree, NULL);
+
+	/*
+	 * The generic show_tree_fmt() is slower than show_tree(), so
+	 * take the fast path if possible.
+	 */
+	if (format &&
+	    (!strcmp(format, default_format) ||
+	     !strcmp(format, long_format) ||
+	     !strcmp(format, name_only_format) ||
+	     !strcmp(format, object_only_format)))
+		fn = show_tree;
+	else if (format)
+		fn = show_tree_fmt;
+
+	return !!read_tree(the_repository, tree, &pathspec, fn, NULL);
 }
diff --git a/t/t3105-ls-tree-format.sh b/t/t3105-ls-tree-format.sh
new file mode 100755
index 0000000000..92b4d240e8
--- /dev/null
+++ b/t/t3105-ls-tree-format.sh
@@ -0,0 +1,55 @@
+#!/bin/sh
+
+test_description='ls-tree --format'
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+test_expect_success 'ls-tree --format usage' '
+	test_expect_code 129 git ls-tree --format=fmt -l &&
+	test_expect_code 129 git ls-tree --format=fmt --name-only &&
+	test_expect_code 129 git ls-tree --format=fmt --name-status &&
+	test_expect_code 129 git ls-tree --format=fmt --object-only
+'
+
+test_expect_success 'setup' '
+	mkdir dir &&
+	test_commit dir/sub-file &&
+	test_commit top-file
+'
+
+test_ls_tree_format () {
+	format=$1 &&
+	opts=$2 &&
+	shift 2 &&
+	git ls-tree $opts -r HEAD >expect.raw &&
+	sed "s/^/> /" >expect <expect.raw &&
+	git ls-tree --format="> $format" -r HEAD >actual &&
+	test_cmp expect actual
+}
+
+test_expect_success 'ls-tree --format=<default-like>' '
+	test_ls_tree_format \
+		"%(mode) %(type) %(object)%x09%(file)" \
+		""
+'
+
+test_expect_success 'ls-tree --format=<long-like>' '
+	test_ls_tree_format \
+		"%(mode) %(type) %(object) %(size:padded)%x09%(file)" \
+		"--long"
+'
+
+test_expect_success 'ls-tree --format=<name-only-like>' '
+	test_ls_tree_format \
+		"%(file)" \
+		"--name-only"
+'
+
+test_expect_success 'ls-tree --format=<object-only-like>' '
+	test_ls_tree_format \
+		"%(object)" \
+		"--object-only"
+'
+
+test_done
-- 
2.33.0.rc1.1794.g2ae0a9cb82



* [PATCH v9 9/9] cocci: allow padding with `strbuf_addf()`
  2022-01-06  4:31             ` [PATCH v9 0/9] " Teng Long
                                 ` (7 preceding siblings ...)
  2022-01-06  4:31               ` [PATCH v9 8/9] ls-tree.c: introduce "--format" option Teng Long
@ 2022-01-06  4:31               ` Teng Long
  2022-01-07 13:03                 ` Johannes Schindelin
  2022-01-10 18:00                 ` Ævar Arnfjörð Bjarmason
  2022-01-13  3:42               ` [PATCH v10 0/9] ls-tree: "--object-only" and "--format" opts Teng Long
  9 siblings, 2 replies; 224+ messages in thread
From: Teng Long @ 2022-01-06  4:31 UTC (permalink / raw)
  To: dyroneteng
  Cc: avarab, congdanhqx, git, gitster, peff, tenglong.tl,
	Johannes.Schindelin, Johannes Schindelin

A convenient way to pad strings is to use something like
`strbuf_addf(&buf, "%20s", "Hello, world!")`.

However, the Coccinelle rule that forbids a format `"%s"` with a
constant string argument cast too wide a net, and also forbade such
padding.

The original rule was introduced by commit:

    https://github.com/git/git/commit/28c23cd4c3902449aff72cb9a4a703220be0d6ac

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
 contrib/coccinelle/strbuf.cocci | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/coccinelle/strbuf.cocci b/contrib/coccinelle/strbuf.cocci
index d9ada69b43..2d6e0f58fc 100644
--- a/contrib/coccinelle/strbuf.cocci
+++ b/contrib/coccinelle/strbuf.cocci
@@ -44,7 +44,7 @@ struct strbuf *SBP;
 
 @@
 expression E1, E2;
-format F =~ "s";
+format F =~ "^s$";
 @@
 - strbuf_addf(E1, "%@F@", E2);
 + strbuf_addstr(E1, E2);
-- 
2.33.0.rc1.1794.g2ae0a9cb82



* Re: [PATCH v9 5/9] ls-tree: optimize naming and handling of "return" in show_tree()
  2022-01-06  4:31               ` [PATCH v9 5/9] ls-tree: optimize naming and handling of "return" in show_tree() Teng Long
@ 2022-01-06 20:44                 ` Junio C Hamano
  2022-01-11  9:14                   ` Teng Long
  0 siblings, 1 reply; 224+ messages in thread
From: Junio C Hamano @ 2022-01-06 20:44 UTC (permalink / raw)
  To: Teng Long
  Cc: avarab, congdanhqx, git, peff, tenglong.tl, Johannes.Schindelin,
	Teng Long

Teng Long <dyroneteng@gmail.com> writes:

> +static void init_type(unsigned mode, enum object_type *type)
> +{
> +	if (S_ISGITLINK(mode))
> +		*type = OBJ_COMMIT;
> +	else if (S_ISDIR(mode))
> +		*type = OBJ_TREE;
> +}
> +
> +static void init_recursive(struct strbuf *base, const char *pathname,
> +				int *recursive)
> +{
> +	if (show_recursive(base->buf, base->len, pathname))
> +		*recursive = READ_TREE_RECURSIVE;
> +}
> +
>  static int show_tree(const struct object_id *oid, struct strbuf *base,
>  		const char *pathname, unsigned mode, void *context)
>  {
> -	int retval = 0;
> +	int recursive = 0;
>  	size_t baselen;
>  	enum object_type type = OBJ_BLOB;
>  
> -	if (S_ISGITLINK(mode)) {
> -		type = OBJ_COMMIT;
> -	} else if (S_ISDIR(mode)) {
> -		if (show_recursive(base->buf, base->len, pathname)) {
> -			retval = READ_TREE_RECURSIVE;
> -			if (!(ls_options & LS_SHOW_TREES))
> -				return retval;
> -		}
> -		type = OBJ_TREE;
> -	}
> -	else if (ls_options & LS_TREE_ONLY)
> -		return 0;
> +	init_type(mode, &type);

What this one is doing sounds more like setting the type variable
based on the mode bits, and doing only half a job at it.  The name
"init" does not sound like a good match to what it does.

If we make it a separate function, we probably should add the "else"
clause to set *type to OBJ_BLOB there, so that the caller does not
say "we'd assume it is BLOB initially, but tweak it based on mode
bits".

I.e.

	type = get_type(mode);

where

	static enum object_type get_type(unsigned int mode)
	{
		return (S_ISGITLINK(mode) 
		        ? OBJ_COMMIT
			: S_ISDIR(mode)
			? OBJ_TREE
			: OBJ_BLOB);
	}

or something like that, perhaps?  But I think open-coding the whole
thing, after losing the "We assume BLOB" initialization, would be
much easier to follow, i.e.

	if (S_ISGITLINK(mode))
		type = OBJ_COMMIT;
	else if (S_ISDIR(mode))
		type = OBJ_TREE;
	else
		type = OBJ_BLOB;

without adding init_type() helper function.

> +	init_recursive(base, pathname, &recursive);

This is even less readable.  In the original, it was clear that we
only call show_recursive() on a path that is a true directory; we
now seem to unconditionally make a call to it.  Is that intended?

	Side note.  show_recursive() has a confusing name; it does
	not show anything---it only decides if we want to go
	recursive.

At least, losing the "we assume recursive is 0" upfront in the
variable declaration and writing

	if (type == OBJ_TREE && show_recursive(...))
		recursive = READ_TREE_RECURSIVE;
	else
		recursive = 0;

here, without introducing init_recursive(), would make it easier to
follow.  If we really want to add a new function, perhaps

	recursive = get_recursive(type, base, pathname);
 
where

	static int get_recursive(enum object_type type,
				 struct strbuf *base, const char *pathname)
	{
		if (type == OBJ_TREE && show_recursive(...))
			return READ_TREE_RECURSIVE;
		else
			return 0;
	}

but I fail to see the point of doing so; open-coded 4 lines here
would make the flow of thought much better to me.

In any case, I think your splitting the original into "this is about
type" and "this is about the recursive bit" is a good idea to help
making the resulting code easier to follow.

> +	if (type == OBJ_TREE && recursive && !(ls_options & LS_SHOW_TREES))
> +		return recursive;

We are looking at an entry that is a directory.  We are running in
recursive mode.  And we are not told to show the directory itself in
the output.  We skip the rest of the function, which is about to
show this single entry.  Makes sense.


> +	if (type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
> +		return !READ_TREE_RECURSIVE;

Negation of a non-zero integer constant is 0, so it is the same as
the original that returned 0, but I am not sure if it is enhancing
or hurting readability of the code.  The user of the value, in
tree.c::read_tree_at(), knows that the possible and valid values are
0 and READ_TREE_RECURSIVE, so returning 0 would probably be a better
idea.  After all, the initializer in the original for the variable
definition of "retval" used "0", not "!READ_TREE_RECURSIVE".

The name "recursive" is much more specific than the overly generic
"retval".  Its value is to be consumed by read_tree_at(), i.e. our
caller, to decide if we want it to recurse into the contents of the
directory.  I would have called it "recurse" (or even "to_recurse"),
if I were doing this change, though.
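
For reference, the two helper shapes discussed above can be put together
as a stand-alone sketch. Several things here are assumptions made so the
fragment compiles on its own: the SKETCH_* mode constants are spelled
out locally instead of using S_ISDIR() and S_ISGITLINK(), the enum is a
reduced copy, and `get_recurse()` takes the outcome of the
show_recursive() check as a plain flag instead of calling it.

```c
/* File-type bits of a tree entry mode (assumed values for this sketch). */
#define SKETCH_IFMT      0170000u
#define SKETCH_IFDIR     0040000u
#define SKETCH_IFGITLINK 0160000u

/* Reduced stand-ins for the definitions used in the review. */
enum object_type { OBJ_COMMIT = 1, OBJ_TREE = 2, OBJ_BLOB = 3 };
#define READ_TREE_RECURSIVE 1

/* Derive the object type from the mode bits; anything else is a blob. */
static enum object_type get_type(unsigned mode)
{
	if ((mode & SKETCH_IFMT) == SKETCH_IFGITLINK)
		return OBJ_COMMIT;
	if ((mode & SKETCH_IFMT) == SKETCH_IFDIR)
		return OBJ_TREE;
	return OBJ_BLOB;
}

/*
 * Only a tree can be recursed into. "path_wants_recursion" stands in
 * for the result of show_recursive() on this entry.
 */
static int get_recurse(enum object_type type, int path_wants_recursion)
{
	if (type == OBJ_TREE && path_wants_recursion)
		return READ_TREE_RECURSIVE;
	return 0;
}
```

Neither helper relies on the caller having pre-initialized the result,
which is the point made above about dropping the "we assume BLOB" and
"we assume 0" initializers.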


* Re: [PATCH v9 9/9] cocci: allow padding with `strbuf_addf()`
  2022-01-06  4:31               ` [PATCH v9 9/9] cocci: allow padding with `strbuf_addf()` Teng Long
@ 2022-01-07 13:03                 ` Johannes Schindelin
  2022-01-10  8:22                   ` Teng Long
  2022-01-10 18:34                   ` Junio C Hamano
  2022-01-10 18:00                 ` Ævar Arnfjörð Bjarmason
  1 sibling, 2 replies; 224+ messages in thread
From: Johannes Schindelin @ 2022-01-07 13:03 UTC (permalink / raw)
  To: Teng Long; +Cc: avarab, congdanhqx, git, gitster, peff, tenglong.tl

Hi Teng,

On Thu, 6 Jan 2022, Teng Long wrote:

> A convenient way to pad strings is to use something like
> `strbuf_addf(&buf, "%20s", "Hello, world!")`.
>
> However, the Coccinelle rule that forbids a format `"%s"` with a
> constant string argument cast too wide a net, and also forbade such
> padding.
>
> The original rule was introduced by commit:
>
>     https://github.com/git/git/commit/28c23cd4c3902449aff72cb9a4a703220be0d6ac

Doing this in 9/9 is too late, by this time you already introduced the
code site that requires this workaround.

At the same time, I wonder why you want to defend spinning up the
full-blown `printf()` machinery just to pad text that you can easily pad
yourself. It sounds like a lot of trouble to me to introduce this patch
and then use an uncommon method to pad a fixed string at runtime. Too much
trouble for my liking.

Ciao,
Dscho

>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> Signed-off-by: Teng Long <dyroneteng@gmail.com>
> ---
>  contrib/coccinelle/strbuf.cocci | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/contrib/coccinelle/strbuf.cocci b/contrib/coccinelle/strbuf.cocci
> index d9ada69b43..2d6e0f58fc 100644
> --- a/contrib/coccinelle/strbuf.cocci
> +++ b/contrib/coccinelle/strbuf.cocci
> @@ -44,7 +44,7 @@ struct strbuf *SBP;
>
>  @@
>  expression E1, E2;
> -format F =~ "s";
> +format F =~ "^s$";
>  @@
>  - strbuf_addf(E1, "%@F@", E2);
>  + strbuf_addstr(E1, E2);
> --
> 2.33.0.rc1.1794.g2ae0a9cb82
>
>


* Re: [PATCH v9 9/9] cocci: allow padding with `strbuf_addf()`
  2022-01-07 13:03                 ` Johannes Schindelin
@ 2022-01-10  8:22                   ` Teng Long
  2022-01-10 12:49                     ` Johannes Schindelin
  2022-01-10 18:34                   ` Junio C Hamano
  1 sibling, 1 reply; 224+ messages in thread
From: Teng Long @ 2022-01-10  8:22 UTC (permalink / raw)
  To: johannes.schindelin
  Cc: avarab, congdanhqx, dyroneteng, git, gitster, peff, tenglong.tl


I am not sure whether I have already sent this email, because it does
not show up on the mailing list. If so, sorry to bother you.

Johannes Schindelin writes:

> Doing this in 9/9 is too late, by this time you already introduced the
> code site that requires this workaround.

Yes, you are correct.
I will fix this if the patch is still part of the next iteration.

> At the same time, I wonder why you want to defend spinning up the
> full-blown `printf()` machinery just to pad text that you can easily pad
> yourself. It sounds like a lot of trouble to me to introduce this patch
> and then use an uncommon method to pad a fixed string at runtime. Too much
> trouble for my liking.

I may not have explained it clearly in the cover letter. Sorry for that; I'm
going to explain some more here. Please correct me if something is wrong, or
if the method is not recommended or is not best practice in the community.

Firstly, I think the patch needs to be introduced regardless of whether we
use "      -" or "%7s" here, because the fix recommended by the
"static-analysis" report is not accurate if someone just uses the
"addf" API:

-               strbuf_addf(line, "%7s", "-");
+               strbuf_addstr(line, "-");

They have different execution results and bring confusion to people. 
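
The difference can be shown with plain snprintf(), which follows the same
printf-style padding rules. This is a sketch, not git code:
`write_padded()` and `write_plain()` are hypothetical helpers, and the
strbuf API is not used.

```c
#include <stdio.h>
#include <stddef.h>

/* What "%7s" produces: "-" right-aligned in a field of width 7. */
static void write_padded(char *out, size_t outlen)
{
	snprintf(out, outlen, "%7s", "-");
}

/* What the suggested rewrite would produce: just "-". */
static void write_plain(char *out, size_t outlen)
{
	snprintf(out, outlen, "%s", "-");
}
```

The first helper writes six spaces followed by "-", the second writes
only "-", so replacing one with the other changes the column alignment
of the output.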

Secondly, about using "strbuf_addf(line, "%7s" , "-");" or
"strbuf_addstr(line, "      -");": I think you prefer the latter and I prefer
the former, right? (I'm not a native English speaker, so I just want to make
sure I understand your whole meaning.)

If I understand everything correctly so far, it's good :)

As I mentioned in a previous reply [1], I think there is no performance
issue here.

I prefer the former because, for this single line, I think it is more
readable. The line may not be modified very often, but if someone wants
to know how wide the padding is, they might have to count the spaces.
So I don't think "      -" is any more readable than "%7s".

That is what I think; I am looking forward to your reply.

Thanks.


[1] https://public-inbox.org/git/CADMgQSRxko6nC0zfDiVVfL2ZkdQVbBq0s59Er+6Nmg9vz4uJKQ@mail.gmail.com/


* Re: [PATCH v9 9/9] cocci: allow padding with `strbuf_addf()`
  2022-01-10  8:22                   ` Teng Long
@ 2022-01-10 12:49                     ` Johannes Schindelin
  2022-01-10 14:40                       ` Teng Long
                                         ` (2 more replies)
  0 siblings, 3 replies; 224+ messages in thread
From: Johannes Schindelin @ 2022-01-10 12:49 UTC (permalink / raw)
  To: Teng Long; +Cc: avarab, congdanhqx, git, gitster, peff, tenglong.tl

Hi Teng,

On Mon, 10 Jan 2022, Teng Long wrote:

> [...] about the using "strbuf_addf(line, "%7s" , "-");" or
> "strbuf_addstr(line, "      -");". [...]
>
> Why I prefer more of the former that is because, for the single line,
> it's more readable I think.

I strongly disagree. Using a format requires the reader to interpret a
`printf()` format, to remember (if they ever knew) the rules about padding
with `%<number>s` formats, and then to satisfy themselves that the result
is correct.

That's quite the cognitive load you put on the reader for something as
trivial as "      -".

Not a fan,
Johannes


* Re: [PATCH v9 9/9] cocci: allow padding with `strbuf_addf()`
  2022-01-10 12:49                     ` Johannes Schindelin
@ 2022-01-10 14:40                       ` Teng Long
  2022-01-10 17:47                       ` Junio C Hamano
  2022-01-10 18:02                       ` Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-10 14:40 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: avarab, congdanhqx, git, gitster, peff, tenglong.tl

> I strongly disagree. Using a format requires the reader to interpret a
> `printf()` format, to remember (if they ever knew) the rules about padding
> with `%<number>s` formats, and then to satisfy themselves that the result
> is correct.
>
> That's quite the cognitive load you put on the reader for something as
> trivial as "      -".

> Not a fan,
> Johannes

OK. I will modify the next patch according to your opinion. I just
hope to understand the problems and make better contributions in the
future.

Thanks.

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v9 9/9] cocci: allow padding with `strbuf_addf()`
  2022-01-10 12:49                     ` Johannes Schindelin
  2022-01-10 14:40                       ` Teng Long
@ 2022-01-10 17:47                       ` Junio C Hamano
  2022-01-10 18:02                       ` Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 224+ messages in thread
From: Junio C Hamano @ 2022-01-10 17:47 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Teng Long, avarab, congdanhqx, git, peff, tenglong.tl

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> [...] about the using "strbuf_addf(line, "%7s" , "-");" or
>> "strbuf_addstr(line, "      -");". [...]
>>
>> Why I prefer more of the former that is because, for the single line,
>> it's more readable I think.
>
> I strongly disagree. Using a format requires the reader to interpret a
> `printf()` format, to remember (if they ever knew) the rules about padding
> with `%<number>s` formats, and then to satisfy themselves that the result
> is correct.

Both "more readable" and "cognitive load" are quite subjective.

FWIW, I have a slight preference for the former because I do not have
to count the whitespaces to figure out at which column the construct
is trying to align to.  Most of the time, however, I may not deeply
care if the thing is aligned exactly, and "     -" might be easier
to scan than getting alarmed by seeing "%7s" and wondering if there is
something unusual going on.  When I am reading not for finding out
the precise output format but for general correctness, an unknown
number of spaces followed by a dash might be easier to see.

But once you know the language, "%7s" is not so alarming, and it
does make it easier to see both for casual scanning and for counting
columns.  It also is likely that those who know the language would
make more efficient developers to fix and/or enhance the code, so I
prefer to optimize the code for them.
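
For readers who want to check the equivalence being debated, here is a
minimal sketch. It assumes plain snprintf() as a stand-in for
strbuf_addf() (both take printf-style formats); pad7() and
same_as_literal() are hypothetical helper names, not part of Git.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format `s` right-justified in a field of at least 7 characters.
 * snprintf() stands in for strbuf_addf() in this sketch.
 */
static void pad7(char *out, size_t size, const char *s)
{
	assert(size > 0);
	snprintf(out, size, "%7s", s);
}

/* Returns 1 if "%7s" applied to "-" equals the hand-written literal. */
static int same_as_literal(void)
{
	char buf[16];

	pad7(buf, sizeof(buf), "-");
	return strcmp(buf, "      -") == 0; /* six spaces, then the dash */
}
```

Calling same_as_literal() should return 1. Note that 7 is a minimum
width: pad7() leaves a string longer than seven characters untouched
rather than truncating it.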


^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v9 9/9] cocci: allow padding with `strbuf_addf()`
  2022-01-06  4:31               ` [PATCH v9 9/9] cocci: allow padding with `strbuf_addf()` Teng Long
  2022-01-07 13:03                 ` Johannes Schindelin
@ 2022-01-10 18:00                 ` Ævar Arnfjörð Bjarmason
  2022-01-11 10:37                   ` Teng Long
  2022-01-11 16:42                   ` Taylor Blau
  1 sibling, 2 replies; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-10 18:00 UTC (permalink / raw)
  To: Teng Long
  Cc: congdanhqx, git, gitster, peff, tenglong.tl, Johannes Schindelin


On Thu, Jan 06 2022, Teng Long wrote:

> A convenient way to pad strings is to use something like
> `strbuf_addf(&buf, "%20s", "Hello, world!")`.
>
> However, the Coccinelle rule that forbids a format `"%s"` with a
> constant string argument cast too wide a net, and also forbade such
> padding.
>
> The original rule was introduced by commit:
>
>     https://github.com/git/git/commit/28c23cd4c3902449aff72cb9a4a703220be0d6ac

Let's refer to commits like this:

    28c23cd4c39 (strbuf.cocci: suggest strbuf_addbuf() to add one strbuf to an other, 2019-01-25)

> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> Signed-off-by: Teng Long <dyroneteng@gmail.com>
> ---
>  contrib/coccinelle/strbuf.cocci | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/contrib/coccinelle/strbuf.cocci b/contrib/coccinelle/strbuf.cocci
> index d9ada69b43..2d6e0f58fc 100644
> --- a/contrib/coccinelle/strbuf.cocci
> +++ b/contrib/coccinelle/strbuf.cocci
> @@ -44,7 +44,7 @@ struct strbuf *SBP;
>  
>  @@
>  expression E1, E2;
> -format F =~ "s";
> +format F =~ "^s$";
>  @@
>  - strbuf_addf(E1, "%@F@", E2);
>  + strbuf_addstr(E1, E2);

That file currently has:

     18:format F =~ "s";
     26:format F =~ "s";
     47:format F =~ "s";

If we're fixing these let's fix the other logic errors as well.
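
The difference between the two patterns can be sketched outside of
Coccinelle. The example below uses POSIX extended regular expressions
as a stand-in (Coccinelle's own regex engine is not used here), and it
assumes the format metavariable holds the text after the `%`, such as
`7s`. regex_matches() is a hypothetical helper name.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Return 1 if the POSIX extended regex `pattern` matches anywhere in
 * `text`, 0 if it does not.
 */
static int regex_matches(const char *pattern, const char *text)
{
	regex_t re;
	int rc;

	rc = regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB);
	assert(rc == 0); /* the patterns used here are known to be valid */
	rc = regexec(&re, text, 0, NULL, 0);
	regfree(&re);
	return rc == 0;
}
```

With this helper, regex_matches("s", "7s") is 1 because the unanchored
pattern finds an "s" somewhere in the string, while
regex_matches("^s$", "7s") is 0 and regex_matches("^s$", "s") is 1.
That is the narrowing the patch is after: only a bare "%s" should be
rewritten to strbuf_addstr().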

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v9 9/9] cocci: allow padding with `strbuf_addf()`
  2022-01-10 12:49                     ` Johannes Schindelin
  2022-01-10 14:40                       ` Teng Long
  2022-01-10 17:47                       ` Junio C Hamano
@ 2022-01-10 18:02                       ` Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-10 18:02 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Teng Long, congdanhqx, git, gitster, peff, tenglong.tl


On Mon, Jan 10 2022, Johannes Schindelin wrote:

> Hi Teng,
>
> On Mon, 10 Jan 2022, Teng Long wrote:
>
>> [...] about the using "strbuf_addf(line, "%7s" , "-");" or
>> "strbuf_addstr(line, "      -");". [...]
>>
>> Why I prefer more of the former that is because, for the single line,
>> it's more readable I think.
>
> I strongly disagree. Using a format requires the reader to interpret a
> `printf()` format, to remember (if they ever knew) the rules about padding
> with `%<number>s` formats, and then to satisfy themselves that the result
> is correct.
>
> That's quite the cognitive load you put on the reader for something as
> trivial as "      -".
>
> Not a fan,
> Johannes

I think you can argue that, but saying that this series must change that
existing "%7s" format just because it happened to trip over an existing
Coccinelle rule as code was changed from printf() to strbuf_addf() is
going overboard.

Also, the ls-tree output has existing alignment issues, and the
documentation says:

    "right-justified with minimum width of 7 characters"

So I'd think we'd want to keep the %7s, and in some future change make
that format dynamic so we'd align things properly if some fields
were longer than 7 characters.

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v9 9/9] cocci: allow padding with `strbuf_addf()`
  2022-01-07 13:03                 ` Johannes Schindelin
  2022-01-10  8:22                   ` Teng Long
@ 2022-01-10 18:34                   ` Junio C Hamano
  1 sibling, 0 replies; 224+ messages in thread
From: Junio C Hamano @ 2022-01-10 18:34 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Teng Long, avarab, congdanhqx, git, peff, tenglong.tl

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> The original rule was introduced by commit:
>>
>>     https://github.com/git/git/commit/28c23cd4c3902449aff72cb9a4a703220be0d6ac
>
> Doing this in 9/9 is too late, by this time you already introduced the
> code site that requires this workaround.

Good to point this out.  Thanks.

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v9 8/9] ls-tree.c: introduce "--format" option
  2022-01-06  4:31               ` [PATCH v9 8/9] ls-tree.c: introduce "--format" option Teng Long
@ 2022-01-10 19:41                 ` Martin Ågren
  2022-01-11  9:34                   ` Teng Long
  0 siblings, 1 reply; 224+ messages in thread
From: Martin Ågren @ 2022-01-10 19:41 UTC (permalink / raw)
  To: Teng Long
  Cc: Ævar Arnfjörð Bjarmason,
	Đoàn Trần Công Danh, Git Mailing List,
	Junio C Hamano, Jeff King, tenglong.tl, Johannes Schindelin

Hi Teng,

On Fri, 7 Jan 2022 at 06:34, Teng Long <dyroneteng@gmail.com> wrote:

> +--format=<format>::
> +       A string that interpolates `%(fieldname)` from the result
> +       being shown. It also interpolates `%%` to `%`, and
> +       `%xx` where `xx`are hex digits interpolates to character

Above, there is a missing space just before "are". That causes the
manpage to render a little bit funny.

> +       with hex code `xx`; for example `%00` interpolates to
> +       `\0` (NUL), `%09` to `\t` (TAB) and `%0a` to `\n` (LF).
> +       When specified, `--format` cannot be combined with other
> +       format-altering options, including `--long`, `--name-only`
> +       and `--object-only`.
> +

> +Customized format:
> +
> +It's support to print customized format by `%(fieldname)` with `--format` option.

I had to re-read this to understand. How about the following?

    It is possible to print in a custom format by using the `--format`
    option, which is able to interpolate different fields using a
    `%(fieldname)` notation.

Just a suggestion. Feel free to tweak or ignore. :-)

> +For example, if you want to only print the <object> and <file> fields with a
> +JSON style, executing with a specific "--format" like
> +
> +        git ls-tree --format='{"object":"%(object)", "file":"%(file)"}' <tree-ish>
> +
> +The output format changes to:
> +
> +        {"object":"<object>", "file":"<file>"}
> +

Nice!

Martin
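
To make the escape rules in the quoted documentation concrete, here is
a small sketch that follows that wording only: `%%` becomes `%`, and
`%` followed by two hex digits becomes the byte with that value. It
does not handle `%(fieldname)` and is not taken from the patch; what
Git actually implements may differ. hex_value() and expand_escapes()
are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Value of one hex digit, or -1 if `c` is not a hex digit. */
static int hex_value(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Expand "%%" to "%" and "%xx" (two hex digits) to the byte with that
 * value; copy everything else verbatim.  Returns the number of bytes
 * written to `out`, which is not NUL-terminated because "%00" can
 * itself produce a NUL byte, or (size_t)-1 if `out` is too small.
 */
static size_t expand_escapes(const char *fmt, char *out, size_t size)
{
	size_t n = 0;

	assert(fmt && out);
	while (*fmt) {
		char c = *fmt++;

		if (c == '%' && *fmt == '%') {
			fmt++;		/* "%%" -> "%" */
		} else if (c == '%' && hex_value(fmt[0]) >= 0 &&
			   hex_value(fmt[1]) >= 0) {
			c = (char)(hex_value(fmt[0]) * 16 + hex_value(fmt[1]));
			fmt += 2;	/* "%xx" -> one byte */
		}
		if (n == size)
			return (size_t)-1;
		out[n++] = c;
	}
	return n;
}
```

For instance, expanding "a%09b%0a" writes four bytes: 'a', TAB, 'b'
and LF. The length is returned explicitly because the output may
contain a NUL byte.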

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v9 5/9] ls-tree: optimize naming and handling of "return" in show_tree()
  2022-01-06 20:44                 ` Junio C Hamano
@ 2022-01-11  9:14                   ` Teng Long
  0 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-11  9:14 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: avarab, congdanhqx, git, peff, tenglong.tl, Johannes Schindelin,
	Teng Long

On Fri, Jan 7, 2022 at 4:44 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Teng Long <dyroneteng@gmail.com> writes:
>

> What this one is doing sounds more like setting the type variable
> based on the mode bits, and doing only half a job at it.  The name
> "init" does not sound like a good match to what it does.
>
> If we make it a separate function, we probably should add the "else"
> clause to set *type to OBJ_BLOB there, so that the caller does not
> say "we'd assume it is BLOB initially, but tweak it based on mode
> bits".
>
> I.e.
>
>         type = get_type(mode);
>
> where
>
>         static enum object_type get_type(unsigned int mode)
>         {
>                 return (S_ISGITLINK(mode)
>                         ? OBJ_COMMIT
>                         : S_ISDIR(mode)
>                         ? OBJ_TREE
>                         : OBJ_BLOB);
>         }

> or something like that, perhaps?  But I think open-coding the whole
> thing, after losing the "We assume BLOB" initialization, would be
> much easier to follow, i.e.
>
>         if (S_ISGITLINK(mode))
>                 type = OBJ_COMMIT;
>         else if (S_ISDIR(mode))
>                 type = OBJ_TREE;
>         else
>                 type = OBJ_BLOB;
>
> without adding init_type() helper function.
>
> > +     init_recursive(base, pathname, &recursive);
>
> This is even less readable.  In the original, it was clear that we
> only call show_recursive() on a path that is a true directory; we
> now seem to unconditionally make a call to it.  Is that intended?
>
>         Side note.  show_recursive() has a confusing name; it does
>         not show anything---it only decides if we want to go
>         recursive.
>
> At least, losing the "we assume recursive is 0" upfront in the
> variable declaration and writing
>
>         if (type == OBJ_TREE && show_recursive(...))
>                 recursive = READ_TREE_RECURSIVE;
>         else
>                 recursive = 0;
>
> here, without introducing init_recursive(), would make it easier to
> follow.  If we really want to add a new function, perhaps
>
>         recursive = get_recursive(type, base, pathname);
>
> where
>
>         static int get_recursive(enum object_type type,
>                                  struct strbuf *base, const char *pathname)
>         {
>                 if (type == OBJ_TREE && show_recursive(...))
>                         return READ_TREE_RECURSIVE;
>                 else
>                         return 0;
>         }
>
> but I fail to see the point of doing so; open-coded 4 lines here
> would make the flow of thought much better to me.
>
> In any case, I think your splitting the original into "this is about
> type" and "this is about the recursive bit" is a good idea to help
> making the resulting code easier to follow.
>
> > +     if (type == OBJ_TREE && recursive && !(ls_options & LS_SHOW_TREES))
> > +             return recursive;
>
> We are looking at an entry that is a directory.  We are running in
> recursive mode.  And we are not told to show the directory itself in
> the output.  We skip the rest of the function, which is about to
> show this single entry.  Makes sense.
>
>
> > +     if (type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
> > +             return !READ_TREE_RECURSIVE;
>
> Negation of a non-zero integer constant is 0, so it is the same as
> the original that returned 0, but I am not sure if it is enhancing
> or hurting readability of the code.  The user of the value, in
> tree.c::read_tree_at(), knows that the possible and valid values are
> 0 and READ_TREE_RECURSIVE, so returning 0 would probably be a better
> idea.  After all, the initializer in the original for the variable
> definition of "retval" used "0", not "!READ_TREE_RECURSIVE".
>
> The name "recursive" is much more specific than the overly generic
> "retval".  Its value is to be consumed by read_tree_at(), i.e. our
> caller, to decide if we want it to recurse into the contents of the
> directory.  I would have called it "recurse" (or even "to_recurse"),
> if I were doing this change, though.

Thanks, will apply in the next patch.
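
For reference, the two decisions discussed above can be sketched in a
form that compiles on its own. The enum, the MODE_* constants and
READ_TREE_RECURSIVE below are stand-ins defined only for this example
(Git has its own definitions, and uses the S_ISGITLINK() and S_ISDIR()
macros), and get_recurse() takes a plain flag where the real code would
call show_recursive().

```c
#include <assert.h>

/* Stand-ins for Git's definitions, so that the sketch is self-contained. */
enum object_type { OBJ_COMMIT = 1, OBJ_TREE = 2, OBJ_BLOB = 3 };
#define MODE_TYPE_MASK      0170000u
#define MODE_DIR            0040000u
#define MODE_GITLINK        0160000u
#define READ_TREE_RECURSIVE 1

/* Map the mode bits of a tree entry to an object type; no "assume BLOB" default. */
static enum object_type get_type(unsigned int mode)
{
	if ((mode & MODE_TYPE_MASK) == MODE_GITLINK)
		return OBJ_COMMIT;
	else if ((mode & MODE_TYPE_MASK) == MODE_DIR)
		return OBJ_TREE;
	else
		return OBJ_BLOB;
}

/*
 * Decide whether the caller should descend into this entry.
 * `wants_recursion` stands in for the result of show_recursive().
 */
static int get_recurse(enum object_type type, int wants_recursion)
{
	assert(wants_recursion == 0 || wants_recursion == 1);
	if (type == OBJ_TREE && wants_recursion)
		return READ_TREE_RECURSIVE;
	return 0;
}
```

Only a directory entry can yield READ_TREE_RECURSIVE; every other case
returns a literal 0, which has the same value as !READ_TREE_RECURSIVE
but states the intent directly.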

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v9 8/9] ls-tree.c: introduce "--format" option
  2022-01-10 19:41                 ` Martin Ågren
@ 2022-01-11  9:34                   ` Teng Long
  0 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-11  9:34 UTC (permalink / raw)
  To: Martin Ågren
  Cc: Ævar Arnfjörð Bjarmason,
	Đoàn Trần Công Danh, Git Mailing List,
	Junio C Hamano, Jeff King, tenglong.tl, Johannes Schindelin

On Tue, Jan 11, 2022 at 3:41 AM Martin Ågren <martin.agren@gmail.com> wrote:

> Above, there is a missing space just before "are". That causes the
> manpage to render a little bit funny.
>
> > +       with hex code `xx`; for example `%00` interpolates to
> > +       `\0` (NUL), `%09` to `\t` (TAB) and `%0a` to `\n` (LF).
> > +       When specified, `--format` cannot be combined with other
> > +       format-altering options, including `--long`, `--name-only`
> > +       and `--object-only`.
> > +

Thanks for reviewing. This will be fixed in the next patch.

> > +Customized format:
> > +
> > +It's support to print customized format by `%(fieldname)` with `--format` option.
>
> I had to re-read this to understand. How about the following?
>
>     It is possible to print in a custom format by using the `--format`
>     option, which is able to interpolate different fields using a
>     `%(fieldname)` notation.
>
> Just a suggestion. Feel free to tweak or ignore. :-)

Yours reads more smoothly than mine; it will be applied in the next patch.

Thanks.

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v9 9/9] cocci: allow padding with `strbuf_addf()`
  2022-01-10 18:00                 ` Ævar Arnfjörð Bjarmason
@ 2022-01-11 10:37                   ` Teng Long
  2022-01-11 16:42                   ` Taylor Blau
  1 sibling, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-11 10:37 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: congdanhqx, git, Junio C Hamano, peff, tenglong.tl, Johannes Schindelin

On Tue, Jan 11, 2022 at 2:02 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:

> > The original rule was introduced by commit:
> >
> >     https://github.com/git/git/commit/28c23cd4c3902449aff72cb9a4a703220be0d6ac
>
> Let's refer to commits like this:
>
>     28c23cd4c39 (strbuf.cocci: suggest strbuf_addbuf() to add one strbuf to an other, 2019-01-25)

OK, I will.

> That file currently has:
>
>      18:format F =~ "s";
>      26:format F =~ "s";
>      47:format F =~ "s";
>
> If we're fixing these let's fix the other logic errors as well.

Thanks for the reminder; they'll be fixed in the next patch.

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v9 9/9] cocci: allow padding with `strbuf_addf()`
  2022-01-10 18:00                 ` Ævar Arnfjörð Bjarmason
  2022-01-11 10:37                   ` Teng Long
@ 2022-01-11 16:42                   ` Taylor Blau
  2022-01-11 19:06                     ` René Scharfe
                                       ` (2 more replies)
  1 sibling, 3 replies; 224+ messages in thread
From: Taylor Blau @ 2022-01-11 16:42 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Teng Long, congdanhqx, git, gitster, peff, tenglong.tl,
	Johannes Schindelin

On Mon, Jan 10, 2022 at 07:00:59PM +0100, Ævar Arnfjörð Bjarmason wrote:
> On Thu, Jan 06 2022, Teng Long wrote:
> > The original rule was introduced by commit:
> >
> >     https://github.com/git/git/commit/28c23cd4c3902449aff72cb9a4a703220be0d6ac
>
> Let's refer to commits like this:
>
>     28c23cd4c39 (strbuf.cocci: suggest strbuf_addbuf() to add one strbuf to an other, 2019-01-25)

I find it helpful to have an alias like:

    $ git config alias.ll
    !git always --no-pager log -1 --pretty='tformat:%h (%s, %ad)' --date=short

in my $HOME/.gitconfig so that I can easily format commits in the
standard way.

I think that this alias came from Peff, but I can't remember.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v9 9/9] cocci: allow padding with `strbuf_addf()`
  2022-01-11 16:42                   ` Taylor Blau
@ 2022-01-11 19:06                     ` René Scharfe
  2022-01-11 20:11                       ` Taylor Blau
  2022-01-11 20:39                     ` Ævar Arnfjörð Bjarmason
  2022-01-13  3:28                     ` Teng Long
  2 siblings, 1 reply; 224+ messages in thread
From: René Scharfe @ 2022-01-11 19:06 UTC (permalink / raw)
  To: Taylor Blau, Ævar Arnfjörð Bjarmason
  Cc: Teng Long, congdanhqx, git, gitster, peff, tenglong.tl,
	Johannes Schindelin

Am 11.01.22 um 17:42 schrieb Taylor Blau:
> On Mon, Jan 10, 2022 at 07:00:59PM +0100, Ævar Arnfjörð Bjarmason wrote:
>> On Thu, Jan 06 2022, Teng Long wrote:
>>> The original rule was introduced by commit:
>>>
>>>     https://github.com/git/git/commit/28c23cd4c3902449aff72cb9a4a703220be0d6ac
>>
>> Let's refer to commits like this:
>>
>>     28c23cd4c39 (strbuf.cocci: suggest strbuf_addbuf() to add one strbuf to an other, 2019-01-25)
>
> I find it helpful to have an alias like:
>
>     $ git config alias.ll
>     !git always --no-pager log -1 --pretty='tformat:%h (%s, %ad)' --date=short
>
> in my $HOME/.gitconfig so that I can easily format commits in the
> standard way.

You can shorten "--pretty='tformat:%h (%s, %ad)' --date=short" to
"--pretty=reference" or "--format=reference".  For me that's easy enough
to remember that I don't need an alias.

Silly question, going further off-topic: What's "git always" doing?

René

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v9 9/9] cocci: allow padding with `strbuf_addf()`
  2022-01-11 19:06                     ` René Scharfe
@ 2022-01-11 20:11                       ` Taylor Blau
  2022-01-13  3:34                         ` Teng Long
  0 siblings, 1 reply; 224+ messages in thread
From: Taylor Blau @ 2022-01-11 20:11 UTC (permalink / raw)
  To: René Scharfe
  Cc: Taylor Blau, Ævar Arnfjörð Bjarmason, Teng Long,
	congdanhqx, git, gitster, peff, tenglong.tl, Johannes Schindelin

On Tue, Jan 11, 2022 at 08:06:00PM +0100, René Scharfe wrote:
> Am 11.01.22 um 17:42 schrieb Taylor Blau:
> > I find it helpful to have an alias like:
> >
> >     $ git config alias.ll
> >     !git always --no-pager log -1 --pretty='tformat:%h (%s, %ad)' --date=short
> >
> > in my $HOME/.gitconfig so that I can easily format commits in the
> > standard way.
>
> You can shorten "--pretty='tformat:%h (%s, %ad)' --date=short" to
> "--pretty=reference" or "--format=reference".  For me that's easy enough
> to remember that I don't need an alias.

Ah, of course. Peff's copy likely predates `--pretty=reference`, and I
inherited the cruft from him. Your suggestion has the nice benefit of
colorizing the output when going to the terminal.

> Silly question, going further off-topic: What's "git always" doing?

Oops, I should have mentioned. It's another alias to ensure that the
following command is always run in a Git repository (either the current
one or a hand-picked default):

    $ git config alias.always
    !git rev-parse 2>/dev/null || cd ~/src/git; git

I often read mail out of my home directory, and the above works with my
`:Git` command in Vim (which passes its arguments to `git always` and
inserts the result back into my buffer). That way I don't have to first
`:cd ~/src/git` and then `:Git ll xyz`, I can just `:Git ll xyz` and it
does what I meant most of the time.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v9 9/9] cocci: allow padding with `strbuf_addf()`
  2022-01-11 16:42                   ` Taylor Blau
  2022-01-11 19:06                     ` René Scharfe
@ 2022-01-11 20:39                     ` Ævar Arnfjörð Bjarmason
  2022-01-13  3:35                       ` Teng Long
  2022-01-13  3:28                     ` Teng Long
  2 siblings, 1 reply; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-11 20:39 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Teng Long, congdanhqx, git, gitster, peff, tenglong.tl,
	Johannes Schindelin


On Tue, Jan 11 2022, Taylor Blau wrote:

> On Mon, Jan 10, 2022 at 07:00:59PM +0100, Ævar Arnfjörð Bjarmason wrote:
>> On Thu, Jan 06 2022, Teng Long wrote:
>> > The original rule was introduced by commit:
>> >
>> >     https://github.com/git/git/commit/28c23cd4c3902449aff72cb9a4a703220be0d6ac
>>
>> Let's refer to commits like this:
>>
>>     28c23cd4c39 (strbuf.cocci: suggest strbuf_addbuf() to add one strbuf to an other, 2019-01-25)
>
> I find it helpful to have an alias like:
>
>     $ git config alias.ll
>     !git always --no-pager log -1 --pretty='tformat:%h (%s, %ad)' --date=short
>
> in my $HOME/.gitconfig so that I can easily format commits in the
> standard way.
>
> I think that this alias came from Peff, but I can't remember.

Nowadays you can do this as:

    git show -s --pretty=reference

See Documentation/SubmittingPatches

I use:

    $ git help reference
    'reference' is aliased to '!git --no-pager log --pretty=reference -1'

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v9 9/9] cocci: allow padding with `strbuf_addf()`
  2022-01-11 16:42                   ` Taylor Blau
  2022-01-11 19:06                     ` René Scharfe
  2022-01-11 20:39                     ` Ævar Arnfjörð Bjarmason
@ 2022-01-13  3:28                     ` Teng Long
  2 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-13  3:28 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Ævar Arnfjörð Bjarmason,
	Đoàn Trần Công Danh, Git Mailing List,
	Junio C Hamano, Jeff King, tenglong.tl, Johannes Schindelin

On Wed, Jan 12, 2022 at 12:42 AM Taylor Blau <me@ttaylorr.com> wrote:

>
> I find it helpful to have an alias like:
>
>     $ git config alias.ll
>     !git always --no-pager log -1 --pretty='tformat:%h (%s, %ad)' --date=short
>
> in my $HOME/.gitconfig so that I can easily format commits in the
> standard way.
>
> I think that this alias came from Peff, but I can't remember.

Wow. That's cool and efficient.

Thanks.

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v9 9/9] cocci: allow padding with `strbuf_addf()`
  2022-01-11 20:11                       ` Taylor Blau
@ 2022-01-13  3:34                         ` Teng Long
  0 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-13  3:34 UTC (permalink / raw)
  To: Taylor Blau
  Cc: René Scharfe, Ævar Arnfjörð Bjarmason,
	Đoàn Trần Công Danh, Git Mailing List,
	Junio C Hamano, Jeff King, tenglong.tl, Johannes Schindelin

On Wed, Jan 12, 2022 at 4:11 AM Taylor Blau <me@ttaylorr.com> wrote:

> > Silly question, going further off-topic: What's "git always" doing?
>
> Oops, I should have mentioned. It's another alias to ensure that the
> following command is always run in a Git repository (either the current
> one or a hand-picked default):
>
>     $ git config alias.always
>     !git rev-parse 2>/dev/null || cd ~/src/git; git
>
> I often read mail out of my home directory, and the above works with my
> `:Git` command in Vim (which passes its arguments to `git always` and
> inserts the result back into my buffer). That way I don't have to first
> `:cd ~/src/git` and then `:Git ll xyz`, I can just `:Git ll xyz` and it
> does what I meant most of the time.


I had the same question; it is clear now.
Thanks for the explanations from Taylor Blau and René Scharfe.

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v9 9/9] cocci: allow padding with `strbuf_addf()`
  2022-01-11 20:39                     ` Ævar Arnfjörð Bjarmason
@ 2022-01-13  3:35                       ` Teng Long
  0 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-13  3:35 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Taylor Blau, Đoàn Trần Công Danh,
	Git Mailing List, Junio C Hamano, Jeff King, tenglong.tl,
	Johannes Schindelin

On Wed, Jan 12, 2022 at 4:40 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:

> Nowadays you can do this as:
>
>     git show -s --pretty=reference
>
> See Documentation/SubmittingPatches
>
> I use:
>
>     $ git help reference
>     'reference' is aliased to '!git --no-pager log --pretty=reference -1'

Makes sense.
Thanks.

^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v10 0/9] ls-tree: "--object-only" and "--format" opts
  2022-01-06  4:31             ` [PATCH v9 0/9] " Teng Long
                                 ` (8 preceding siblings ...)
  2022-01-06  4:31               ` [PATCH v9 9/9] cocci: allow padding with `strbuf_addf()` Teng Long
@ 2022-01-13  3:42               ` Teng Long
  2022-01-13  3:42                 ` [PATCH v10 1/9] ls-tree: remove commented-out code Teng Long
                                   ` (9 more replies)
  9 siblings, 10 replies; 224+ messages in thread
From: Teng Long @ 2022-01-13  3:42 UTC (permalink / raw)
  To: dyroneteng
  Cc: Johannes.Schindelin, avarab, congdanhqx, git, gitster, peff,
	tenglong.tl, martin.agren

Major changes since v9:

    1. Swap the order of "cocci: allow padding with `strbuf_addf()`"
       and "ls-tree.c: introduce "--format" option".

       (Advice from Johannes Schindelin)
       Let the Coccinelle change take effect before the commit that
       would otherwise fail the rule check.

    2. ls-tree.c: support --object-only option for "git-ls-tree"

       (Advice from Junio C Hamano)
       Rename the return value variable from "recursive" to "recurse".
       Rename "init_type()" to "get_type()".
       Remove "init_recursive()".

    3. ls-tree.c: introduce "--format" option

       (Advice from Martin Ågren)
       Fix some documentation errors and wording.
       Fix a bug in "t3105" (it always returned 129 without specifying
       an argument).

    4. cocci: allow padding with `strbuf_addf()`

       (Advice from Ævar Arnfjörð Bjarmason)
       Fix the other logic errors as well.

Thanks.

Teng Long (5):
  ls-tree: optimize naming and handling of "return" in show_tree()
  ls-tree.c: support --object-only option for "git-ls-tree"
  ls-tree.c: introduce struct "show_tree_data"
  cocci: allow padding with `strbuf_addf()`
  ls-tree.c: introduce "--format" option

Ævar Arnfjörð Bjarmason (4):
  ls-tree: remove commented-out code
  ls-tree: add missing braces to "else" arms
  ls-tree: use "enum object_type", not {blob,tree,commit}_type
  ls-tree: use "size_t", not "int" for "struct strbuf"'s "len"

 Documentation/git-ls-tree.txt   |  56 +++++-
 builtin/ls-tree.c               | 328 +++++++++++++++++++++++++-------
 contrib/coccinelle/strbuf.cocci |   6 +-
 t/t3104-ls-tree-oid.sh          |  51 +++++
 t/t3105-ls-tree-format.sh       |  55 ++++++
 5 files changed, 426 insertions(+), 70 deletions(-)
 create mode 100755 t/t3104-ls-tree-oid.sh
 create mode 100755 t/t3105-ls-tree-format.sh

Range-diff against v9:
 1:  75503c41a7 <  -:  ---------- ls-tree: optimize naming and handling of "return" in show_tree()
 -:  ---------- >  1:  2fcff7e0d4 ls-tree: remove commented-out code
 -:  ---------- >  2:  6fd1dd9383 ls-tree: add missing braces to "else" arms
 -:  ---------- >  3:  208654b5e2 ls-tree: use "enum object_type", not {blob,tree,commit}_type
 -:  ---------- >  4:  2637464fd8 ls-tree: use "size_t", not "int" for "struct strbuf"'s "len"
 -:  ---------- >  5:  b04188c822 ls-tree: optimize naming and handling of "return" in show_tree()
 2:  e0274f079a !  6:  bcfbc935b8 ls-tree.c: support --object-only option for "git-ls-tree"
    @@ builtin/ls-tree.c
      {
      	int i;
      
    -@@ builtin/ls-tree.c: static int show_recursive(const char *base, size_t baselen, const char *pathname
    - 	return 0;
    +@@ builtin/ls-tree.c: static enum object_type get_type(unsigned int mode)
    + 	        : OBJ_BLOB);
      }
      
     +static int show_default(const struct object_id *oid, enum object_type type,
    @@ builtin/ls-tree.c: static int show_recursive(const char *base, size_t baselen, c
     +	return 1;
     +}
     +
    - static void init_type(unsigned mode, enum object_type *type)
    + static int show_tree(const struct object_id *oid, struct strbuf *base,
    + 		const char *pathname, unsigned mode, void *context)
      {
    - 	if (S_ISGITLINK(mode))
     @@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strbuf *base,
      	if (type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
    - 		return !READ_TREE_RECURSIVE;
    + 		return 0;
      
     -	if (!(ls_options & LS_NAME_ONLY)) {
     -		if (ls_options & LS_SHOW_SIZE) {
    @@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strb
     -		}
     +	if (shown_fields == FIELD_OBJECT_NAME) {
     +		printf("%s%c", find_unique_abbrev(oid, abbrev), line_termination);
    -+		return recursive;
    ++		return recurse;
      	}
     -	baselen = base->len;
     -	strbuf_addstr(base, pathname);
    @@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strb
     +					   chomp_prefix ? ls_tree_prefix : NULL,
     +					   stdout, line_termination);
     +		strbuf_setlen(base, baselen);
    -+		return recursive;
    ++		return recurse;
     +	}
     +
     +	if (shown_fields >= FIELD_DEFAULT)
     +		show_default(oid, type, pathname, mode, base);
     +
    - 	return recursive;
    + 	return recurse;
      }
      
     @@ builtin/ls-tree.c: int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 3:  725c4d0187 !  7:  3ddffa1027 ls-tree.c: introduce struct "show_tree_data"
    @@ builtin/ls-tree.c: static unsigned int shown_fields;
      static const  char * const ls_tree_usage[] = {
      	N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
      	NULL
    -@@ builtin/ls-tree.c: static int show_recursive(const char *base, size_t baselen,
    - 	return 0;
    +@@ builtin/ls-tree.c: static enum object_type get_type(unsigned int mode)
    + 	        : OBJ_BLOB);
      }
      
     -static int show_default(const struct object_id *oid, enum object_type type,
    @@ builtin/ls-tree.c: static int show_default(const struct object_id *oid, enum obj
      }
      
     @@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strbuf *base,
    - {
    - 	int recursive = 0;
      	size_t baselen;
    --	enum object_type type = OBJ_BLOB;
    + 	enum object_type type = get_type(mode);
    + 
     +	struct show_tree_data data = {
     +		.mode = mode,
    -+		.type = OBJ_BLOB,
    ++		.type = type,
     +		.oid = oid,
     +		.pathname = pathname,
     +		.base = base,
     +	};
    - 
    --	init_type(mode, &type);
    -+	init_type(mode, &data.type);
    - 	init_recursive(base, pathname, &recursive);
    - 
    --	if (type == OBJ_TREE && recursive && !(ls_options & LS_SHOW_TREES))
    -+	if (data.type == OBJ_TREE && recursive && !(ls_options & LS_SHOW_TREES))
    - 		return recursive;
    --	if (type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
    -+	if (data.type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
    - 		return !READ_TREE_RECURSIVE;
    - 
    - 	if (shown_fields == FIELD_OBJECT_NAME) {
    ++
    + 	if (type == OBJ_TREE && show_recursive(base->buf, base->len, pathname))
    + 		recurse = READ_TREE_RECURSIVE;
    + 	if (type == OBJ_TREE && recurse && !(ls_options & LS_SHOW_TREES))
     @@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strbuf *base,
      	}
      
    @@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strb
     -		show_default(oid, type, pathname, mode, base);
     +		show_default(&data);
      
    - 	return recursive;
    + 	return recurse;
      }
 -:  ---------- >  8:  4b58a707c2 cocci: allow padding with `strbuf_addf()`
 4:  7df58483a4 !  9:  db058bf670 ls-tree.c: introduce "--format" option
    @@ Documentation/git-ls-tree.txt: OPTIONS
     +--format=<format>::
     +	A string that interpolates `%(fieldname)` from the result
     +	being shown. It also interpolates `%%` to `%`, and
    -+	`%xx` where `xx`are hex digits interpolates to character
    ++	`%xx` where `xx` are hex digits interpolates to character
     +	with hex code `xx`; for example `%00` interpolates to
     +	`\0` (NUL), `%09` to `\t` (TAB) and `%0a` to `\n` (LF).
     +	When specified, `--format` cannot be combined with other
    @@ Documentation/git-ls-tree.txt: quoted as explained for the configuration variabl
      
     +Customized format:
     +
    -+It's support to print customized format by `%(fieldname)` with `--format` option.
    ++It is possible to print in a custom format by using the `--format` option,
    ++which is able to interpolate different fields using a `%(fieldname)` notation.
     +For example, if you want to only print the <object> and <file> fields with a
     +JSON style, executing with a specific "--format" like
     +
    @@ builtin/ls-tree.c: static int parse_shown_fields(void)
      static int show_recursive(const char *base, size_t baselen,
      			  const char *pathname)
      {
    -@@ builtin/ls-tree.c: static int show_recursive(const char *base, size_t baselen,
    - 	return 0;
    +@@ builtin/ls-tree.c: static enum object_type get_type(unsigned int mode)
    + 	        : OBJ_BLOB);
      }
      
    -+static void init_recursive(struct strbuf *base, const char *pathname,
    -+				int *recursive)
    -+{
    -+	if (show_recursive(base->buf, base->len, pathname))
    -+		*recursive = READ_TREE_RECURSIVE;
    -+}
    -+
    -+static void init_type(unsigned mode, enum object_type *type)
    -+{
    -+	if (S_ISGITLINK(mode))
    -+		*type = OBJ_COMMIT;
    -+	else if (S_ISDIR(mode))
    -+		*type = OBJ_TREE;
    -+}
    -+
     +static int show_tree_fmt(const struct object_id *oid, struct strbuf *base,
     +			 const char *pathname, unsigned mode, void *context)
     +{
     +	size_t baselen;
    -+	int recursive = 0;
    ++	int recurse = 0;
     +	struct strbuf line = STRBUF_INIT;
    ++	enum object_type type = get_type(mode);
    ++
     +	struct show_tree_data data = {
     +		.mode = mode,
    -+		.type = OBJ_BLOB,
    ++		.type = type,
     +		.oid = oid,
     +		.pathname = pathname,
     +		.base = base,
     +	};
     +
    -+	init_type(mode, &data.type);
    -+	init_recursive(base, pathname, &recursive);
    -+
    -+	if (data.type == OBJ_TREE && recursive && !(ls_options & LS_SHOW_TREES))
    -+		return recursive;
    -+	if (data.type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
    -+		return !READ_TREE_RECURSIVE;
    ++	if (type == OBJ_TREE && show_recursive(base->buf, base->len, pathname))
    ++		recurse = READ_TREE_RECURSIVE;
    ++	if (type == OBJ_TREE && recurse && !(ls_options & LS_SHOW_TREES))
    ++		return recurse;
    ++	if (type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
    ++		return 0;
     +
     +	baselen = base->len;
     +	strbuf_expand(&line, format, expand_show_tree, &data);
    @@ builtin/ls-tree.c: static int show_recursive(const char *base, size_t baselen,
     +	fwrite(line.buf, line.len, 1, stdout);
     +	strbuf_release(&line);
     +	strbuf_setlen(base, baselen);
    -+	return recursive;
    ++	return recurse;
     +}
     +
      static int show_default(struct show_tree_data *data)
      {
      	size_t baselen = data->base->len;
    -@@ builtin/ls-tree.c: static int show_default(struct show_tree_data *data)
    - 	return 1;
    - }
    - 
    --static void init_type(unsigned mode, enum object_type *type)
    --{
    --	if (S_ISGITLINK(mode))
    --		*type = OBJ_COMMIT;
    --	else if (S_ISDIR(mode))
    --		*type = OBJ_TREE;
    --}
    --
    --static void init_recursive(struct strbuf *base, const char *pathname,
    --				int *recursive)
    --{
    --	if (show_recursive(base->buf, base->len, pathname))
    --		*recursive = READ_TREE_RECURSIVE;
    --}
    --
    - static int show_tree(const struct object_id *oid, struct strbuf *base,
    - 		const char *pathname, unsigned mode, void *context)
    - {
     @@ builtin/ls-tree.c: int cmd_ls_tree(int argc, const char **argv, const char *prefix)
      	struct object_id oid;
      	struct tree *tree;
    @@ t/t3105-ls-tree-format.sh (new)
     +. ./test-lib.sh
     +
     +test_expect_success 'ls-tree --format usage' '
    -+	test_expect_code 129 git ls-tree --format=fmt -l &&
    -+	test_expect_code 129 git ls-tree --format=fmt --name-only &&
    -+	test_expect_code 129 git ls-tree --format=fmt --name-status &&
    -+	test_expect_code 129 git ls-tree --format=fmt --object-only
    ++	test_expect_code 129 git ls-tree --format=fmt -l HEAD &&
    ++	test_expect_code 129 git ls-tree --format=fmt --name-only HEAD &&
    ++	test_expect_code 129 git ls-tree --format=fmt --name-status HEAD &&
    ++	test_expect_code 129 git ls-tree --format=fmt --object-only HEAD
     +'
     +
     +test_expect_success 'setup' '
 5:  8dafb2b377 <  -:  ---------- cocci: allow padding with `strbuf_addf()`
-- 
2.34.1.390.g2ae0a9cb82.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v10 1/9] ls-tree: remove commented-out code
  2022-01-13  3:42               ` [PATCH v10 0/9] ls-tree: "--object-only" and "--format" opts Teng Long
@ 2022-01-13  3:42                 ` Teng Long
  2022-01-13  3:42                 ` [PATCH v10 2/9] ls-tree: add missing braces to "else" arms Teng Long
                                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-13  3:42 UTC (permalink / raw)
  To: dyroneteng
  Cc: Johannes.Schindelin, avarab, congdanhqx, git, gitster, peff,
	tenglong.tl, martin.agren

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Remove code added in f35a6d3bce7 (Teach core object handling functions
about gitlinks, 2007-04-09), later patched in 7d0b18a4da1 (Add output
flushing before fork(), 2008-08-04), and then finally ending up in its
current form in d3bee161fef (tree.c: allow read_tree_recursive() to
traverse gitlink entries, 2009-01-25). All while being commented-out!

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/ls-tree.c | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3a442631c7..5f7c84950c 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -69,15 +69,6 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	const char *type = blob_type;
 
 	if (S_ISGITLINK(mode)) {
-		/*
-		 * Maybe we want to have some recursive version here?
-		 *
-		 * Something similar to this incomplete example:
-		 *
-		if (show_subprojects(base, baselen, pathname))
-			retval = READ_TREE_RECURSIVE;
-		 *
-		 */
 		type = commit_type;
 	} else if (S_ISDIR(mode)) {
 		if (show_recursive(base->buf, base->len, pathname)) {
-- 
2.34.1.390.g2ae0a9cb82.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v10 2/9] ls-tree: add missing braces to "else" arms
  2022-01-13  3:42               ` [PATCH v10 0/9] ls-tree: "--object-only" and "--format" opts Teng Long
  2022-01-13  3:42                 ` [PATCH v10 1/9] ls-tree: remove commented-out code Teng Long
@ 2022-01-13  3:42                 ` Teng Long
  2022-01-13  3:42                 ` [PATCH v10 3/9] ls-tree: use "enum object_type", not {blob,tree,commit}_type Teng Long
                                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-13  3:42 UTC (permalink / raw)
  To: dyroneteng
  Cc: Johannes.Schindelin, avarab, congdanhqx, git, gitster, peff,
	tenglong.tl, martin.agren

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Add missing {} to the "else" arms in show_tree() per the
CodingGuidelines.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/ls-tree.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 5f7c84950c..0a28f32ccb 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -92,14 +92,16 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 				else
 					xsnprintf(size_text, sizeof(size_text),
 						  "%"PRIuMAX, (uintmax_t)size);
-			} else
+			} else {
 				xsnprintf(size_text, sizeof(size_text), "-");
+			}
 			printf("%06o %s %s %7s\t", mode, type,
 			       find_unique_abbrev(oid, abbrev),
 			       size_text);
-		} else
+		} else {
 			printf("%06o %s %s\t", mode, type,
 			       find_unique_abbrev(oid, abbrev));
+		}
 	}
 	baselen = base->len;
 	strbuf_addstr(base, pathname);
-- 
2.34.1.390.g2ae0a9cb82.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v10 3/9] ls-tree: use "enum object_type", not {blob,tree,commit}_type
  2022-01-13  3:42               ` [PATCH v10 0/9] ls-tree: "--object-only" and "--format" opts Teng Long
  2022-01-13  3:42                 ` [PATCH v10 1/9] ls-tree: remove commented-out code Teng Long
  2022-01-13  3:42                 ` [PATCH v10 2/9] ls-tree: add missing braces to "else" arms Teng Long
@ 2022-01-13  3:42                 ` Teng Long
  2022-01-13  3:42                 ` [PATCH v10 4/9] ls-tree: use "size_t", not "int" for "struct strbuf"'s "len" Teng Long
                                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-13  3:42 UTC (permalink / raw)
  To: dyroneteng
  Cc: Johannes.Schindelin, avarab, congdanhqx, git, gitster, peff,
	tenglong.tl, martin.agren

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Change the ls-tree.c code to use type_name() on the enum instead of
using the string constants. This doesn't matter either way for
performance, but makes this a bit easier to read as we'll no longer
need a strcmp() here.
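
As a rough illustration of the readability point (not code from this
series), the sketch below contrasts the two styles. The
"demo_object_type" enum and "demo_type_name()" are hypothetical
stand-ins for git's "enum object_type" and "type_name()"; only the
shape of the comparison is meant to carry over.

```c
#include <string.h>

/* Hypothetical stand-ins for git's object types; not git's real definitions. */
enum demo_object_type {
	DEMO_OBJ_COMMIT = 1,
	DEMO_OBJ_TREE,
	DEMO_OBJ_BLOB,
};

/* Hypothetical analogue of type_name(): map the enum to its display string. */
static const char *demo_type_name(enum demo_object_type type)
{
	switch (type) {
	case DEMO_OBJ_COMMIT:
		return "commit";
	case DEMO_OBJ_TREE:
		return "tree";
	case DEMO_OBJ_BLOB:
		return "blob";
	}
	return "unknown";
}

/* Before: the type is carried as a string, so each check needs strcmp(). */
static int is_blob_by_string(const char *type)
{
	return !strcmp(type, "blob");
}

/* After: the type is carried as an enum, so each check is a plain comparison. */
static int is_blob_by_enum(enum demo_object_type type)
{
	return type == DEMO_OBJ_BLOB;
}
```

The string is then only produced at the point of output, which is where
the patch calls type_name().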

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/ls-tree.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 0a28f32ccb..3f0225b097 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -66,17 +66,17 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 {
 	int retval = 0;
 	int baselen;
-	const char *type = blob_type;
+	enum object_type type = OBJ_BLOB;
 
 	if (S_ISGITLINK(mode)) {
-		type = commit_type;
+		type = OBJ_COMMIT;
 	} else if (S_ISDIR(mode)) {
 		if (show_recursive(base->buf, base->len, pathname)) {
 			retval = READ_TREE_RECURSIVE;
 			if (!(ls_options & LS_SHOW_TREES))
 				return retval;
 		}
-		type = tree_type;
+		type = OBJ_TREE;
 	}
 	else if (ls_options & LS_TREE_ONLY)
 		return 0;
@@ -84,7 +84,7 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	if (!(ls_options & LS_NAME_ONLY)) {
 		if (ls_options & LS_SHOW_SIZE) {
 			char size_text[24];
-			if (!strcmp(type, blob_type)) {
+			if (type == OBJ_BLOB) {
 				unsigned long size;
 				if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
 					xsnprintf(size_text, sizeof(size_text),
@@ -95,11 +95,11 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 			} else {
 				xsnprintf(size_text, sizeof(size_text), "-");
 			}
-			printf("%06o %s %s %7s\t", mode, type,
+			printf("%06o %s %s %7s\t", mode, type_name(type),
 			       find_unique_abbrev(oid, abbrev),
 			       size_text);
 		} else {
-			printf("%06o %s %s\t", mode, type,
+			printf("%06o %s %s\t", mode, type_name(type),
 			       find_unique_abbrev(oid, abbrev));
 		}
 	}
-- 
2.34.1.390.g2ae0a9cb82.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v10 4/9] ls-tree: use "size_t", not "int" for "struct strbuf"'s "len"
  2022-01-13  3:42               ` [PATCH v10 0/9] ls-tree: "--object-only" and "--format" opts Teng Long
                                   ` (2 preceding siblings ...)
  2022-01-13  3:42                 ` [PATCH v10 3/9] ls-tree: use "enum object_type", not {blob,tree,commit}_type Teng Long
@ 2022-01-13  3:42                 ` Teng Long
  2022-01-13  3:42                 ` [PATCH v10 5/9] ls-tree: optimize naming and handling of "return" in show_tree() Teng Long
                                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-13  3:42 UTC (permalink / raw)
  To: dyroneteng
  Cc: Johannes.Schindelin, avarab, congdanhqx, git, gitster, peff,
	tenglong.tl, martin.agren

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

The "struct strbuf"'s "len" member is a "size_t", not an "int", so
let's change our corresponding types accordingly. This also changes
the "len" and "speclen" variables, which are likewise used to store
the return value of strlen(), which returns "size_t", not "int".
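
A small sketch of why matching types helps, assuming a hypothetical
helper "demo_has_base_prefix()" that is not part of ls-tree.c: with
every length kept as "size_t", the comparison below involves no
signed/unsigned conversion.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: report whether "spec" begins with the first
 * "baselen" bytes of "base". All lengths are size_t, matching both
 * strlen()'s return type and the third argument of strncmp().
 */
static int demo_has_base_prefix(const char *base, size_t baselen,
				const char *spec)
{
	size_t speclen = strlen(spec);

	if (speclen < baselen)
		return 0;
	return !strncmp(base, spec, baselen);
}
```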

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/ls-tree.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3f0225b097..eecc7482d5 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -31,7 +31,7 @@ static const  char * const ls_tree_usage[] = {
 	NULL
 };
 
-static int show_recursive(const char *base, int baselen, const char *pathname)
+static int show_recursive(const char *base, size_t baselen, const char *pathname)
 {
 	int i;
 
@@ -43,7 +43,7 @@ static int show_recursive(const char *base, int baselen, const char *pathname)
 
 	for (i = 0; i < pathspec.nr; i++) {
 		const char *spec = pathspec.items[i].match;
-		int len, speclen;
+		size_t len, speclen;
 
 		if (strncmp(base, spec, baselen))
 			continue;
@@ -65,7 +65,7 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 		const char *pathname, unsigned mode, void *context)
 {
 	int retval = 0;
-	int baselen;
+	size_t baselen;
 	enum object_type type = OBJ_BLOB;
 
 	if (S_ISGITLINK(mode)) {
-- 
2.34.1.390.g2ae0a9cb82.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v10 5/9] ls-tree: optimize naming and handling of "return" in show_tree()
  2022-01-13  3:42               ` [PATCH v10 0/9] ls-tree: "--object-only" and "--format" opts Teng Long
                                   ` (3 preceding siblings ...)
  2022-01-13  3:42                 ` [PATCH v10 4/9] ls-tree: use "size_t", not "int" for "struct strbuf"'s "len" Teng Long
@ 2022-01-13  3:42                 ` Teng Long
  2022-01-13  6:49                   ` Ævar Arnfjörð Bjarmason
  2022-01-13  3:42                 ` [PATCH v10 6/9] ls-tree.c: support --object-only option for "git-ls-tree" Teng Long
                                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 224+ messages in thread
From: Teng Long @ 2022-01-13  3:42 UTC (permalink / raw)
  To: dyroneteng
  Cc: Johannes.Schindelin, avarab, congdanhqx, git, gitster, peff,
	tenglong.tl, martin.agren, Teng Long

The variable that "show_tree()" returns is named "retval", a name that
is a little hard to understand. This commit tries to make the variable
and the related code clearer in context.

First, rename "retval" to "recurse", which is a more meaningful name.
Second, introduce "get_type()" to derive the "type" from the "mode",
which removes some of the nested ifs. After this, the code is a little
clearer, so we no longer need to look at "read_tree_at()" in "tree.c"
to understand what the return value means.
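
For readers who want to try the mode-to-type mapping outside of git,
here is a standalone sketch. The "DEMO_" macros and enum are
assumptions that mirror the conventional mode bits for tree entries;
they are defined locally so that the example does not depend on git's
headers, and the function is written with early returns rather than
the nested ternary used in the patch.

```c
/* Assumed mode bits, mirroring the conventional values for tree entries. */
#define DEMO_S_IFMT      0170000
#define DEMO_S_IFDIR     0040000
#define DEMO_S_IFGITLINK 0160000

/* Hypothetical stand-in for git's "enum object_type". */
enum demo_object_type {
	DEMO_OBJ_COMMIT = 1,
	DEMO_OBJ_TREE,
	DEMO_OBJ_BLOB,
};

/* Same decision order as get_type(): gitlink first, then directory, else blob. */
static enum demo_object_type demo_get_type(unsigned int mode)
{
	if ((mode & DEMO_S_IFMT) == DEMO_S_IFGITLINK)
		return DEMO_OBJ_COMMIT;
	if ((mode & DEMO_S_IFMT) == DEMO_S_IFDIR)
		return DEMO_OBJ_TREE;
	return DEMO_OBJ_BLOB;
}
```

Because the type is known up front, the caller can test "type" and
"recurse" in flat conditions instead of nesting them inside the mode
checks.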

Signed-off-by: Teng Long <dyronetengb@gmail.com>
---
 builtin/ls-tree.c | 33 ++++++++++++++++++---------------
 1 file changed, 18 insertions(+), 15 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index eecc7482d5..9729854a3d 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -61,24 +61,27 @@ static int show_recursive(const char *base, size_t baselen, const char *pathname
 	return 0;
 }
 
+static enum object_type get_type(unsigned int mode)
+{
+	return (S_ISGITLINK(mode)
+	        ? OBJ_COMMIT
+	        : S_ISDIR(mode)
+	        ? OBJ_TREE
+	        : OBJ_BLOB);
+}
+
 static int show_tree(const struct object_id *oid, struct strbuf *base,
 		const char *pathname, unsigned mode, void *context)
 {
-	int retval = 0;
+	int recurse = 0;
 	size_t baselen;
-	enum object_type type = OBJ_BLOB;
-
-	if (S_ISGITLINK(mode)) {
-		type = OBJ_COMMIT;
-	} else if (S_ISDIR(mode)) {
-		if (show_recursive(base->buf, base->len, pathname)) {
-			retval = READ_TREE_RECURSIVE;
-			if (!(ls_options & LS_SHOW_TREES))
-				return retval;
-		}
-		type = OBJ_TREE;
-	}
-	else if (ls_options & LS_TREE_ONLY)
+	enum object_type type = get_type(mode);
+
+	if (type == OBJ_TREE && show_recursive(base->buf, base->len, pathname))
+		recurse = READ_TREE_RECURSIVE;
+	if (type == OBJ_TREE && recurse && !(ls_options & LS_SHOW_TREES))
+		return recurse;
+	if (type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
 		return 0;
 
 	if (!(ls_options & LS_NAME_ONLY)) {
@@ -109,7 +112,7 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 				   chomp_prefix ? ls_tree_prefix : NULL,
 				   stdout, line_termination);
 	strbuf_setlen(base, baselen);
-	return retval;
+	return recurse;
 }
 
 int cmd_ls_tree(int argc, const char **argv, const char *prefix)
-- 
2.34.1.390.g2ae0a9cb82.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v10 6/9] ls-tree.c: support --object-only option for "git-ls-tree"
  2022-01-13  3:42               ` [PATCH v10 0/9] ls-tree: "--object-only" and "--format" opts Teng Long
                                   ` (4 preceding siblings ...)
  2022-01-13  3:42                 ` [PATCH v10 5/9] ls-tree: optimize naming and handling of "return" in show_tree() Teng Long
@ 2022-01-13  3:42                 ` Teng Long
  2022-01-13  6:59                   ` Ævar Arnfjörð Bjarmason
  2022-01-13  3:42                 ` [PATCH v10 7/9] ls-tree.c: introduce struct "show_tree_data" Teng Long
                                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 224+ messages in thread
From: Teng Long @ 2022-01-13  3:42 UTC (permalink / raw)
  To: dyroneteng
  Cc: Johannes.Schindelin, avarab, congdanhqx, git, gitster, peff,
	tenglong.tl, martin.agren

We usually pipe the output from `git ls-tree` to tools like
`sed` or `cut` when we only want to extract some fields.

When we want only the pathname component, we can pass
`--name-only` option to omit such a pipeline, but there are no
options for extracting other fields.

Teach the "--object-only" option to the command to only show the
object name. This option cannot be used together with
"--name-only" or "--long"; they are mutually exclusive. (Previously
"--name-only" and "--long" could be combined; this commit fixes
that bug along the way.)

A simple refactoring was done to the "show_tree" function, which now
uses bitwise operations to recognize the format for printing to
stdout. The reason for doing this is that we don't want the addition
of "--object-only" to make the code harder to read; the refactoring
makes this part of the logic easier to read and extend.
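
To make the bitwise scheme concrete, here is a minimal sketch using the
same FIELD_* bit values as the diff in this message. Spelling
FIELD_DEFAULT as an OR of the individual flags (the patch writes the
literal 29) and the helper "demo_shows_size()" are illustrative choices
of this sketch, not part of the patch.

```c
/* Bit values as in the patch; one bit per output field. */
#define FIELD_FILE_NAME 1
#define FIELD_SIZE (1 << 1)
#define FIELD_OBJECT_NAME (1 << 2)
#define FIELD_TYPE (1 << 3)
#define FIELD_MODE (1 << 4)

/* Default output: mode, type, object name and file name, but no size. */
#define FIELD_DEFAULT \
	(FIELD_MODE | FIELD_TYPE | FIELD_OBJECT_NAME | FIELD_FILE_NAME)
#define FIELD_LONG_DEFAULT (FIELD_DEFAULT | FIELD_SIZE)

/* Hypothetical helper: does this field set include the size column? */
static int demo_shows_size(unsigned int shown_fields)
{
	return !!(shown_fields & FIELD_SIZE);
}
```

With this layout, "--object-only" and "--name-only" each select a
single bit, and "--long" is the default set plus FIELD_SIZE.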

In terms of performance, there is no loss compared to
"master" (2ae0a9cb8298185a94e5998086f380a355dd8907). Here are the
results of the performance tests in my environment, based on the Linux
repository:

    $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r HEAD"
    Benchmark 1: /opt/git/master/bin/git ls-tree -r HEAD
    Time (mean ± σ):     105.8 ms ±   2.7 ms    [User: 85.7 ms, System: 20.0 ms]
    Range (min … max):   101.5 ms … 111.3 ms    28 runs

    $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r HEAD"
    Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r HEAD
    Time (mean ± σ):     105.0 ms ±   3.0 ms    [User: 83.7 ms, System: 21.2 ms]
    Range (min … max):    99.3 ms … 109.5 ms    27 runs

    $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r -l HEAD"
    Benchmark 1: /opt/git/master/bin/git ls-tree -r -l HEAD
    Time (mean ± σ):     337.4 ms ±  10.9 ms    [User: 308.3 ms, System: 29.0 ms]
    Range (min … max):   323.0 ms … 355.0 ms    10 runs

    $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r -l HEAD"
    Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r -l HEAD
    Time (mean ± σ):     337.6 ms ±   6.2 ms    [User: 309.4 ms, System: 28.1 ms]
    Range (min … max):   330.4 ms … 349.9 ms    10 runs

Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
 Documentation/git-ls-tree.txt |   7 +-
 builtin/ls-tree.c             | 141 +++++++++++++++++++++++++---------
 t/t3104-ls-tree-oid.sh        |  51 ++++++++++++
 3 files changed, 160 insertions(+), 39 deletions(-)
 create mode 100755 t/t3104-ls-tree-oid.sh

diff --git a/Documentation/git-ls-tree.txt b/Documentation/git-ls-tree.txt
index db02d6d79a..729370f235 100644
--- a/Documentation/git-ls-tree.txt
+++ b/Documentation/git-ls-tree.txt
@@ -10,7 +10,7 @@ SYNOPSIS
 --------
 [verse]
 'git ls-tree' [-d] [-r] [-t] [-l] [-z]
-	    [--name-only] [--name-status] [--full-name] [--full-tree] [--abbrev[=<n>]]
+	    [--name-only] [--name-status] [--object-only] [--full-name] [--full-tree] [--abbrev[=<n>]]
 	    <tree-ish> [<path>...]
 
 DESCRIPTION
@@ -59,6 +59,11 @@ OPTIONS
 --name-only::
 --name-status::
 	List only filenames (instead of the "long" output), one per line.
+	Cannot be combined with `--object-only`.
+
+--object-only::
+	List only names of the objects, one per line. Cannot be combined
+	with `--name-only` or `--name-status`.
 
 --abbrev[=<n>]::
 	Instead of showing the full 40-byte hexadecimal object
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 9729854a3d..e1a2f8225b 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -16,22 +16,60 @@
 
 static int line_termination = '\n';
 #define LS_RECURSIVE 1
-#define LS_TREE_ONLY 2
-#define LS_SHOW_TREES 4
-#define LS_NAME_ONLY 8
-#define LS_SHOW_SIZE 16
+#define LS_TREE_ONLY (1 << 1)
+#define LS_SHOW_TREES (1 << 2)
+#define LS_NAME_ONLY (1 << 3)
+#define LS_SHOW_SIZE (1 << 4)
+#define LS_OBJECT_ONLY (1 << 5)
 static int abbrev;
 static int ls_options;
 static struct pathspec pathspec;
 static int chomp_prefix;
 static const char *ls_tree_prefix;
+static unsigned int shown_fields;
+#define FIELD_FILE_NAME 1
+#define FIELD_SIZE (1 << 1)
+#define FIELD_OBJECT_NAME (1 << 2)
+#define FIELD_TYPE (1 << 3)
+#define FIELD_MODE (1 << 4)
+#define FIELD_DEFAULT 29 /* 11101 size is not shown to output by default */
+#define FIELD_LONG_DEFAULT  (FIELD_DEFAULT | FIELD_SIZE)
 
 static const  char * const ls_tree_usage[] = {
 	N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
 	NULL
 };
 
-static int show_recursive(const char *base, size_t baselen, const char *pathname)
+enum {
+	MODE_UNSPECIFIED = 0,
+	MODE_NAME_ONLY,
+	MODE_OBJECT_ONLY,
+	MODE_LONG,
+};
+
+static int cmdmode = MODE_UNSPECIFIED;
+
+static int parse_shown_fields(void)
+{
+	if (cmdmode == MODE_NAME_ONLY) {
+		shown_fields = FIELD_FILE_NAME;
+		return 0;
+	}
+	if (cmdmode == MODE_OBJECT_ONLY) {
+		shown_fields = FIELD_OBJECT_NAME;
+		return 0;
+	}
+	if (!ls_options || (ls_options & LS_RECURSIVE)
+	    || (ls_options & LS_SHOW_TREES)
+	    || (ls_options & LS_TREE_ONLY))
+		shown_fields = FIELD_DEFAULT;
+	if (cmdmode == MODE_LONG)
+		shown_fields = FIELD_LONG_DEFAULT;
+	return 1;
+}
+
+static int show_recursive(const char *base, size_t baselen,
+			  const char *pathname)
 {
 	int i;
 
@@ -70,6 +108,39 @@ static enum object_type get_type(unsigned int mode)
 	        : OBJ_BLOB);
 }
 
+static int show_default(const struct object_id *oid, enum object_type type,
+			const char *pathname, unsigned mode,
+			struct strbuf *base)
+{
+	size_t baselen = base->len;
+
+	if (shown_fields & FIELD_SIZE) {
+		char size_text[24];
+		if (type == OBJ_BLOB) {
+			unsigned long size;
+			if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
+				xsnprintf(size_text, sizeof(size_text), "BAD");
+			else
+				xsnprintf(size_text, sizeof(size_text),
+					  "%" PRIuMAX, (uintmax_t)size);
+		} else {
+			xsnprintf(size_text, sizeof(size_text), "-");
+		}
+		printf("%06o %s %s %7s\t", mode, type_name(type),
+		find_unique_abbrev(oid, abbrev), size_text);
+	} else {
+		printf("%06o %s %s\t", mode, type_name(type),
+		find_unique_abbrev(oid, abbrev));
+	}
+	baselen = base->len;
+	strbuf_addstr(base, pathname);
+	write_name_quoted_relative(base->buf,
+				   chomp_prefix ? ls_tree_prefix : NULL, stdout,
+				   line_termination);
+	strbuf_setlen(base, baselen);
+	return 1;
+}
+
 static int show_tree(const struct object_id *oid, struct strbuf *base,
 		const char *pathname, unsigned mode, void *context)
 {
@@ -84,34 +155,24 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	if (type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
 		return 0;
 
-	if (!(ls_options & LS_NAME_ONLY)) {
-		if (ls_options & LS_SHOW_SIZE) {
-			char size_text[24];
-			if (type == OBJ_BLOB) {
-				unsigned long size;
-				if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
-					xsnprintf(size_text, sizeof(size_text),
-						  "BAD");
-				else
-					xsnprintf(size_text, sizeof(size_text),
-						  "%"PRIuMAX, (uintmax_t)size);
-			} else {
-				xsnprintf(size_text, sizeof(size_text), "-");
-			}
-			printf("%06o %s %s %7s\t", mode, type_name(type),
-			       find_unique_abbrev(oid, abbrev),
-			       size_text);
-		} else {
-			printf("%06o %s %s\t", mode, type_name(type),
-			       find_unique_abbrev(oid, abbrev));
-		}
+	if (shown_fields == FIELD_OBJECT_NAME) {
+		printf("%s%c", find_unique_abbrev(oid, abbrev), line_termination);
+		return recurse;
 	}
-	baselen = base->len;
-	strbuf_addstr(base, pathname);
-	write_name_quoted_relative(base->buf,
-				   chomp_prefix ? ls_tree_prefix : NULL,
-				   stdout, line_termination);
-	strbuf_setlen(base, baselen);
+
+	if (shown_fields == FIELD_FILE_NAME) {
+		baselen = base->len;
+		strbuf_addstr(base, pathname);
+		write_name_quoted_relative(base->buf,
+					   chomp_prefix ? ls_tree_prefix : NULL,
+					   stdout, line_termination);
+		strbuf_setlen(base, baselen);
+		return recurse;
+	}
+
+	if (shown_fields >= FIELD_DEFAULT)
+		show_default(oid, type, pathname, mode, base);
+
 	return recurse;
 }
 
@@ -129,12 +190,14 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 			LS_SHOW_TREES),
 		OPT_SET_INT('z', NULL, &line_termination,
 			    N_("terminate entries with NUL byte"), 0),
-		OPT_BIT('l', "long", &ls_options, N_("include object size"),
-			LS_SHOW_SIZE),
-		OPT_BIT(0, "name-only", &ls_options, N_("list only filenames"),
-			LS_NAME_ONLY),
-		OPT_BIT(0, "name-status", &ls_options, N_("list only filenames"),
-			LS_NAME_ONLY),
+		OPT_CMDMODE('l', "long", &cmdmode, N_("include object size"),
+			    MODE_LONG),
+		OPT_CMDMODE(0, "name-only", &cmdmode, N_("list only filenames"),
+			    MODE_NAME_ONLY),
+		OPT_CMDMODE(0, "name-status", &cmdmode, N_("list only filenames"),
+			    MODE_NAME_ONLY),
+		OPT_CMDMODE(0, "object-only", &cmdmode, N_("list only objects"),
+			    MODE_OBJECT_ONLY),
 		OPT_SET_INT(0, "full-name", &chomp_prefix,
 			    N_("use full path names"), 0),
 		OPT_BOOL(0, "full-tree", &full_tree,
@@ -165,6 +228,8 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	if (get_oid(argv[0], &oid))
 		die("Not a valid object name %s", argv[0]);
 
+	parse_shown_fields();
+
 	/*
 	 * show_recursive() rolls its own matching code and is
 	 * generally ignorant of 'struct pathspec'. The magic mask
diff --git a/t/t3104-ls-tree-oid.sh b/t/t3104-ls-tree-oid.sh
new file mode 100755
index 0000000000..6ce62bd769
--- /dev/null
+++ b/t/t3104-ls-tree-oid.sh
@@ -0,0 +1,51 @@
+#!/bin/sh
+
+test_description='git ls-tree objects handling.'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	test_commit A &&
+	test_commit B &&
+	mkdir -p C &&
+	test_commit C/D.txt &&
+	find *.txt path* \( -type f -o -type l \) -print |
+	xargs git update-index --add &&
+	tree=$(git write-tree) &&
+	echo $tree
+'
+
+test_expect_success 'usage: --object-only' '
+	git ls-tree --object-only $tree >current &&
+	git ls-tree $tree >result &&
+	cut -f1 result | cut -d " " -f3 >expected &&
+	test_cmp current expected
+'
+
+test_expect_success 'usage: --object-only with -r' '
+	git ls-tree --object-only -r $tree >current &&
+	git ls-tree -r $tree >result &&
+	cut -f1 result | cut -d " " -f3 >expected &&
+	test_cmp current expected
+'
+
+test_expect_success 'usage: --object-only with --abbrev' '
+	git ls-tree --object-only --abbrev=6 $tree >current &&
+	git ls-tree --abbrev=6 $tree >result &&
+	cut -f1 result | cut -d " " -f3 >expected &&
+	test_cmp current expected
+'
+
+test_expect_success 'usage: incompatible options: --name-only with --object-only' '
+	test_expect_code 129 git ls-tree --object-only --name-only $tree
+'
+
+test_expect_success 'usage: incompatible options: --name-status with --object-only' '
+	test_expect_code 129 git ls-tree --object-only --name-status $tree
+'
+
+test_expect_success 'usage: incompatible options: --long with --object-only' '
+	test_expect_code 129 git ls-tree --object-only --long $tree
+'
+
+test_done
-- 
2.34.1.390.g2ae0a9cb82.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v10 7/9] ls-tree.c: introduce struct "show_tree_data"
  2022-01-13  3:42               ` [PATCH v10 0/9] ls-tree: "--object-only" and "--format" opts Teng Long
                                   ` (5 preceding siblings ...)
  2022-01-13  3:42                 ` [PATCH v10 6/9] ls-tree.c: support --object-only option for "git-ls-tree" Teng Long
@ 2022-01-13  3:42                 ` Teng Long
  2022-01-13  7:03                   ` Ævar Arnfjörð Bjarmason
  2022-01-13  3:42                 ` [PATCH v10 8/9] cocci: allow padding with `strbuf_addf()` Teng Long
                                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 224+ messages in thread
From: Teng Long @ 2022-01-13  3:42 UTC (permalink / raw)
  To: dyroneteng
  Cc: Johannes.Schindelin, avarab, congdanhqx, git, gitster, peff,
	tenglong.tl, martin.agren

"show_tree_data" is a struct that packages the necessary fields for
"show_tree()". This commit is a preparatory commit for supporting the
"--format" option and does not affect any existing functionality.

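The general idea of passing one struct instead of a growing argument
list can be sketched as below. "struct demo_entry" and
"demo_format_entry()" are hypothetical and use plain strings where the
patch uses git's own types; the point is only that adding another
field later does not change the function signature.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified analogue of "struct show_tree_data". */
struct demo_entry {
	unsigned mode;
	const char *type;
	const char *oid_hex;
	const char *pathname;
};

/*
 * Format one entry in the default "<mode> <type> <object>\t<file>"
 * layout. Returns what snprintf() returns.
 */
static int demo_format_entry(char *out, size_t outlen,
			     const struct demo_entry *e)
{
	return snprintf(out, outlen, "%06o %s %s\t%s",
			e->mode, e->type, e->oid_hex, e->pathname);
}
```
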
Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
 builtin/ls-tree.c | 44 +++++++++++++++++++++++++++++---------------
 1 file changed, 29 insertions(+), 15 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index e1a2f8225b..56cc166adb 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -35,6 +35,14 @@ static unsigned int shown_fields;
 #define FIELD_DEFAULT 29 /* 11101 size is not shown to output by default */
 #define FIELD_LONG_DEFAULT  (FIELD_DEFAULT | FIELD_SIZE)
 
+struct show_tree_data {
+	unsigned mode;
+	enum object_type type;
+	const struct object_id *oid;
+	const char *pathname;
+	struct strbuf *base;
+};
+
 static const  char * const ls_tree_usage[] = {
 	N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
 	NULL
@@ -108,17 +116,15 @@ static enum object_type get_type(unsigned int mode)
 	        : OBJ_BLOB);
 }
 
-static int show_default(const struct object_id *oid, enum object_type type,
-			const char *pathname, unsigned mode,
-			struct strbuf *base)
+static int show_default(struct show_tree_data *data)
 {
-	size_t baselen = base->len;
+	size_t baselen = data->base->len;
 
 	if (shown_fields & FIELD_SIZE) {
 		char size_text[24];
-		if (type == OBJ_BLOB) {
+		if (data->type == OBJ_BLOB) {
 			unsigned long size;
-			if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
+			if (oid_object_info(the_repository, data->oid, &size) == OBJ_BAD)
 				xsnprintf(size_text, sizeof(size_text), "BAD");
 			else
 				xsnprintf(size_text, sizeof(size_text),
@@ -126,18 +132,18 @@ static int show_default(const struct object_id *oid, enum object_type type,
 		} else {
 			xsnprintf(size_text, sizeof(size_text), "-");
 		}
-		printf("%06o %s %s %7s\t", mode, type_name(type),
-		find_unique_abbrev(oid, abbrev), size_text);
+		printf("%06o %s %s %7s\t", data->mode, type_name(data->type),
+		find_unique_abbrev(data->oid, abbrev), size_text);
 	} else {
-		printf("%06o %s %s\t", mode, type_name(type),
-		find_unique_abbrev(oid, abbrev));
+		printf("%06o %s %s\t", data->mode, type_name(data->type),
+		find_unique_abbrev(data->oid, abbrev));
 	}
-	baselen = base->len;
-	strbuf_addstr(base, pathname);
-	write_name_quoted_relative(base->buf,
+	baselen = data->base->len;
+	strbuf_addstr(data->base, data->pathname);
+	write_name_quoted_relative(data->base->buf,
 				   chomp_prefix ? ls_tree_prefix : NULL, stdout,
 				   line_termination);
-	strbuf_setlen(base, baselen);
+	strbuf_setlen(data->base, baselen);
 	return 1;
 }
 
@@ -148,6 +154,14 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	size_t baselen;
 	enum object_type type = get_type(mode);
 
+	struct show_tree_data data = {
+		.mode = mode,
+		.type = type,
+		.oid = oid,
+		.pathname = pathname,
+		.base = base,
+	};
+
 	if (type == OBJ_TREE && show_recursive(base->buf, base->len, pathname))
 		recurse = READ_TREE_RECURSIVE;
 	if (type == OBJ_TREE && recurse && !(ls_options & LS_SHOW_TREES))
@@ -171,7 +185,7 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	}
 
 	if (shown_fields >= FIELD_DEFAULT)
-		show_default(oid, type, pathname, mode, base);
+		show_default(&data);
 
 	return recurse;
 }
-- 
2.34.1.390.g2ae0a9cb82.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v10 8/9] cocci: allow padding with `strbuf_addf()`
  2022-01-13  3:42               ` [PATCH v10 0/9] ls-tree: "--object-only" and "--format" opts Teng Long
                                   ` (6 preceding siblings ...)
  2022-01-13  3:42                 ` [PATCH v10 7/9] ls-tree.c: introduce struct "show_tree_data" Teng Long
@ 2022-01-13  3:42                 ` Teng Long
  2022-01-13  3:42                 ` [PATCH v10 9/9] ls-tree.c: introduce "--format" option Teng Long
  2022-02-08 12:14                 ` [PATCH v11 00/13] ls-tree: "--object-only" and "--format" opts Teng Long
  9 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-13  3:42 UTC (permalink / raw)
  To: dyroneteng
  Cc: Johannes.Schindelin, avarab, congdanhqx, git, gitster, peff,
	tenglong.tl, martin.agren, Johannes Schindelin

A convenient way to pad strings is to use something like
`strbuf_addf(&buf, "%20s", "Hello, world!")`.

However, the Coccinelle rule that forbids a format `"%s"` with a
constant string argument cast too wide a net, and also forbade such
padding.
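
To illustrate why the unanchored pattern was too broad, here is a
minimal sketch using POSIX regcomp()/regexec(). It only mimics the
matching of the conversion that follows '%' and makes no claim about how
Coccinelle evaluates its constraints internally; spec_matches() is a
name made up for this example.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Return 1 if the POSIX extended regex `pattern` matches `spec`, else 0.
 * `spec` stands in for the conversion that follows '%' in a format
 * string, e.g. "s" for "%s" or "20s" for "%20s".
 */
int spec_matches(const char *pattern, const char *spec)
{
	regex_t re;
	int rc;

	rc = regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB);
	assert(rc == 0); /* the patterns used here are known to be valid */
	rc = regexec(&re, spec, 0, NULL, 0);
	regfree(&re);
	return rc == 0;
}
```

With this helper, spec_matches("s", "20s") is true, so a padded
conversion would be caught by the old pattern, while
spec_matches("^s$", "20s") is false and only the bare "s" still matches.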

The original rule was introduced by commit:

    28c23cd4c39 (strbuf.cocci: suggest strbuf_addbuf() to add one strbuf to an other, 2019-01-25)

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
 contrib/coccinelle/strbuf.cocci | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/contrib/coccinelle/strbuf.cocci b/contrib/coccinelle/strbuf.cocci
index d9ada69b43..0970d98ad7 100644
--- a/contrib/coccinelle/strbuf.cocci
+++ b/contrib/coccinelle/strbuf.cocci
@@ -15,7 +15,7 @@ constant fmt !~ "%";
 @@
 expression E;
 struct strbuf SB;
-format F =~ "s";
+format F =~ "^s$";
 @@
 - strbuf_addf(E, "%@F@", SB.buf);
 + strbuf_addbuf(E, &SB);
@@ -23,7 +23,7 @@ format F =~ "s";
 @@
 expression E;
 struct strbuf *SBP;
-format F =~ "s";
+format F =~ "^s$";
 @@
 - strbuf_addf(E, "%@F@", SBP->buf);
 + strbuf_addbuf(E, SBP);
@@ -44,7 +44,7 @@ struct strbuf *SBP;
 
 @@
 expression E1, E2;
-format F =~ "s";
+format F =~ "^s$";
 @@
 - strbuf_addf(E1, "%@F@", E2);
 + strbuf_addstr(E1, E2);
-- 
2.34.1.390.g2ae0a9cb82.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v10 9/9] ls-tree.c: introduce "--format" option
  2022-01-13  3:42               ` [PATCH v10 0/9] ls-tree: "--object-only" and "--format" opts Teng Long
                                   ` (7 preceding siblings ...)
  2022-01-13  3:42                 ` [PATCH v10 8/9] cocci: allow padding with `strbuf_addf()` Teng Long
@ 2022-01-13  3:42                 ` Teng Long
  2022-01-13  7:16                   ` Ævar Arnfjörð Bjarmason
  2022-02-08 12:14                 ` [PATCH v11 00/13] ls-tree: "--object-only" and "--format" opts Teng Long
  9 siblings, 1 reply; 224+ messages in thread
From: Teng Long @ 2022-01-13  3:42 UTC (permalink / raw)
  To: dyroneteng
  Cc: Johannes.Schindelin, avarab, congdanhqx, git, gitster, peff,
	tenglong.tl, martin.agren

Add a --format option to ls-tree. The command already has a default
output, plus the --long and --name-only options, which emit the default
output along with the object size, or only the object paths,
respectively.

Rather than add --type-only, --object-only etc. we can just support a
--format using a strbuf_expand() similar to "for-each-ref
--format". We might still add such options in the future for
convenience.
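
As a rough illustration of the kind of expansion involved (this is not
Git's strbuf_expand(), just a fixed-buffer stand-in written for this
message), the sketch below handles `%%`, `%xNN` and `%(field)` using the
field names from this patch. struct sketch_entry, sketch_lookup() and
sketch_expand() are hypothetical names, and "size" is left out to keep
it short.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* One tree entry with pre-rendered field values (illustrative only). */
struct sketch_entry {
	const char *mode, *type, *object, *file;
};

/* Map a field name of length `len` to its value, or NULL if unknown. */
const char *sketch_lookup(const struct sketch_entry *e,
			  const char *name, size_t len)
{
	if (len == 4 && !memcmp(name, "mode", 4))
		return e->mode;
	if (len == 4 && !memcmp(name, "type", 4))
		return e->type;
	if (len == 6 && !memcmp(name, "object", 6))
		return e->object;
	if (len == 4 && !memcmp(name, "file", 4))
		return e->file;
	return NULL;
}

/*
 * Expand `fmt` into `out` (capacity `cap`, NUL-terminated on success).
 * Returns 0 on success, -1 on a malformed format or a too-small buffer.
 */
int sketch_expand(char *out, size_t cap, const char *fmt,
		  const struct sketch_entry *e)
{
	size_t n = 0;

	assert(cap > 0);
	while (*fmt) {
		const char *val;
		size_t vlen;
		char ch;

		if (*fmt != '%') {			/* literal character */
			val = fmt;
			vlen = 1;
			fmt++;
		} else if (fmt[1] == '%') {		/* "%%" -> "%" */
			val = "%";
			vlen = 1;
			fmt += 2;
		} else if (fmt[1] == 'x' &&
			   isxdigit((unsigned char)fmt[2]) &&
			   isxdigit((unsigned char)fmt[3])) {
			char hex[3] = { fmt[2], fmt[3], '\0' };
			ch = (char)strtol(hex, NULL, 16); /* "%x09" -> TAB */
			val = &ch;
			vlen = 1;
			fmt += 4;
		} else if (fmt[1] == '(') {		/* "%(field)" */
			const char *close = strchr(fmt + 2, ')');
			if (!close)
				return -1;
			val = sketch_lookup(e, fmt + 2,
					    (size_t)(close - (fmt + 2)));
			if (!val)
				return -1;
			vlen = strlen(val);
			fmt = close + 1;
		} else {
			return -1;
		}
		if (vlen >= cap - n)	/* keep room for the final NUL */
			return -1;
		memcpy(out + n, val, vlen);
		n += vlen;
	}
	out[n] = '\0';
	return 0;
}
```

Given an entry { "100644", "blob", "abc123", "t/README" } and the format
"%(mode) %(type) %(object)%x09%(file)", the buffer ends up holding
"100644 blob abc123", a TAB, and "t/README". An unknown field such as
"%(nope)" makes the function return -1.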

The --format implementation is slower than the existing code, but this
change does not cause any performance regressions. We'll leave the
existing show_tree() unchanged, and only run show_tree_fmt() if
a --format different from the hardcoded built-in ones corresponding to
the existing modes is provided.

I.e. something like the "--long" output would be much slower with
this, mainly due to how we need to allocate various things to do with
quote.c instead of spewing the output directly to stdout.
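
The selection between the two paths boils down to a string comparison
against the built-in formats. A minimal sketch of just that decision
follows (it says nothing about what each path prints);
sketch_builtin_formats and sketch_use_fast_path() are names invented
here, and the format strings are copied from this patch.

```c
#include <stddef.h>
#include <string.h>

/* Format strings copied from the patch; each mirrors an existing mode. */
const char *const sketch_builtin_formats[] = {
	"%(mode) %(type) %(object)%x09%(file)",
	"%(mode) %(type) %(object) %(size:padded)%x09%(file)",
	"%(file)",
	"%(object)",
};

/*
 * Return 1 if the existing (fast) callback should be used, 0 if the
 * generic formatting callback is needed. A NULL format means that
 * --format was not given at all.
 */
int sketch_use_fast_path(const char *format)
{
	size_t i;
	size_t nr = sizeof(sketch_builtin_formats) /
		    sizeof(sketch_builtin_formats[0]);

	if (!format)
		return 1;
	for (i = 0; i < nr; i++)
		if (!strcmp(format, sketch_builtin_formats[i]))
			return 1;
	return 0;
}
```

Note that the comparison is exact: "%(file)" takes the fast path, while
"> %(file)" (as used by the new tests) goes through the generic one.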

The new '--format' option comes from Ævar Arnfjörð Bjarmason's
idea and suggestion; this commit makes modifications based on the
original discussion on the mailing list [1].

Here are the results of the performance tests:

1. Default format (hits the builtin formats):

    "git ls-tree <tree-ish>" vs "--format='%(mode) %(type) %(object)%x09%(file)'"

    $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r HEAD"
    Benchmark 1: /opt/git/master/bin/git ls-tree -r HEAD
    Time (mean ± σ):     105.2 ms ±   3.3 ms    [User: 84.3 ms, System: 20.8 ms]
    Range (min … max):    99.2 ms … 113.2 ms    28 runs

    $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object)%x09%(file)'  HEAD"
    Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object)%x09%(file)'  HEAD
    Time (mean ± σ):     106.4 ms ±   2.7 ms    [User: 86.1 ms, System: 20.2 ms]
    Range (min … max):   100.2 ms … 110.5 ms    29 runs

2. Default format including object size (hits the builtin formats):

    "git ls-tree -l <tree-ish>" vs "--format='%(mode) %(type) %(object) %(size:padded)%x09%(file)'"

    $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r -l HEAD"
    Benchmark 1: /opt/git/master/bin/git ls-tree -r -l HEAD
    Time (mean ± σ):     335.1 ms ±   6.5 ms    [User: 304.6 ms, System: 30.4 ms]
    Range (min … max):   327.5 ms … 348.4 ms    10 runs

    $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object) %(size:padded)%x09%(file)'  HEAD"
    Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object) %(size:padded)%x09%(file)'  HEAD
    Time (mean ± σ):     337.2 ms ±   8.2 ms    [User: 309.2 ms, System: 27.9 ms]
    Range (min … max):   328.8 ms … 349.4 ms    10 runs

Links:
	[1] https://public-inbox.org/git/RFC-patch-6.7-eac299f06ff-20211217T131635Z-avarab@gmail.com/

Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
 Documentation/git-ls-tree.txt |  51 +++++++++++++-
 builtin/ls-tree.c             | 129 +++++++++++++++++++++++++++++++++-
 t/t3105-ls-tree-format.sh     |  55 +++++++++++++++
 3 files changed, 230 insertions(+), 5 deletions(-)
 create mode 100755 t/t3105-ls-tree-format.sh

diff --git a/Documentation/git-ls-tree.txt b/Documentation/git-ls-tree.txt
index 729370f235..ebdde6eae3 100644
--- a/Documentation/git-ls-tree.txt
+++ b/Documentation/git-ls-tree.txt
@@ -10,9 +10,9 @@ SYNOPSIS
 --------
 [verse]
 'git ls-tree' [-d] [-r] [-t] [-l] [-z]
-	    [--name-only] [--name-status] [--object-only] [--full-name] [--full-tree] [--abbrev[=<n>]]
-	    <tree-ish> [<path>...]
-
+	    [--name-only] [--name-status] [--object-only]
+	    [--full-name] [--full-tree] [--abbrev[=<n>]]
+	    [--format=<format>] <tree-ish> [<path>...]
 DESCRIPTION
 -----------
 Lists the contents of a given tree object, like what "/bin/ls -a" does
@@ -79,6 +79,16 @@ OPTIONS
 	Do not limit the listing to the current working directory.
 	Implies --full-name.
 
+--format=<format>::
+	A string that interpolates `%(fieldname)` from the result
+	being shown. It also interpolates `%%` to `%`, and
+	`%xx` where `xx` are hex digits interpolates to character
+	with hex code `xx`; for example `%00` interpolates to
+	`\0` (NUL), `%09` to `\t` (TAB) and `%0a` to `\n` (LF).
+	When specified, `--format` cannot be combined with other
+	format-altering options, including `--long`, `--name-only`
+	and `--object-only`.
+
 [<path>...]::
 	When paths are given, show them (note that this isn't really raw
 	pathnames, but rather a list of patterns to match).  Otherwise
@@ -87,6 +97,9 @@ OPTIONS
 
 Output Format
 -------------
+
+Default format:
+
         <mode> SP <type> SP <object> TAB <file>
 
 This output format is compatible with what `--index-info --stdin` of
@@ -105,6 +118,38 @@ quoted as explained for the configuration variable `core.quotePath`
 (see linkgit:git-config[1]).  Using `-z` the filename is output
 verbatim and the line is terminated by a NUL byte.
 
+Customized format:
+
+It is possible to print in a custom format by using the `--format` option,
+which is able to interpolate different fields using a `%(fieldname)` notation.
+For example, if you want to only print the <object> and <file> fields with a
+JSON style, executing with a specific "--format" like
+
+        git ls-tree --format='{"object":"%(object)", "file":"%(file)"}' <tree-ish>
+
+The output format changes to:
+
+        {"object":"<object>", "file":"<file>"}
+
+FIELD NAMES
+-----------
+
+Various values from structured fields can be used to interpolate
+into the resulting output. For each outputing line, the following
+names can be used:
+
+mode::
+	The mode of the object.
+type::
+	The type of the object (`blob` or `tree`).
+object::
+	The name of the object.
+size[:padded]::
+	The size of the object ("-" if it's a tree).
+	It also supports a padded format of size with "%(size:padded)".
+file::
+	The filename of the object.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 56cc166adb..e048a68ee0 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -57,6 +57,12 @@ enum {
 
 static int cmdmode = MODE_UNSPECIFIED;
 
+static const char *format;
+static const char *default_format = "%(mode) %(type) %(object)%x09%(file)";
+static const char *long_format = "%(mode) %(type) %(object) %(size:padded)%x09%(file)";
+static const char *name_only_format = "%(file)";
+static const char *object_only_format = "%(object)";
+
 static int parse_shown_fields(void)
 {
 	if (cmdmode == MODE_NAME_ONLY) {
@@ -76,6 +82,72 @@ static int parse_shown_fields(void)
 	return 1;
 }
 
+static void expand_objectsize(struct strbuf *line, const struct object_id *oid,
+			      const enum object_type type, unsigned int padded)
+{
+	if (type == OBJ_BLOB) {
+		unsigned long size;
+		if (oid_object_info(the_repository, oid, &size) < 0)
+			die(_("could not get object info about '%s'"),
+			    oid_to_hex(oid));
+		if (padded)
+			strbuf_addf(line, "%7" PRIuMAX, (uintmax_t)size);
+		else
+			strbuf_addf(line, "%" PRIuMAX, (uintmax_t)size);
+	} else if (padded) {
+		strbuf_addf(line, "%7s", "-");
+	} else {
+		strbuf_addstr(line, "-");
+	}
+}
+
+static size_t expand_show_tree(struct strbuf *line, const char *start,
+			       void *context)
+{
+	struct show_tree_data *data = context;
+	const char *end;
+	const char *p;
+	unsigned int errlen;
+	size_t len = strbuf_expand_literal_cb(line, start, NULL);
+
+	if (len)
+		return len;
+	if (*start != '(')
+		die(_("bad ls-tree format: as '%s'"), start);
+
+	end = strchr(start + 1, ')');
+	if (!end)
+		die(_("bad ls-tree format: element '%s' does not end in ')'"), start);
+
+	len = end - start + 1;
+	if (skip_prefix(start, "(mode)", &p)) {
+		strbuf_addf(line, "%06o", data->mode);
+	} else if (skip_prefix(start, "(type)", &p)) {
+		strbuf_addstr(line, type_name(data->type));
+	} else if (skip_prefix(start, "(size:padded)", &p)) {
+		expand_objectsize(line, data->oid, data->type, 1);
+	} else if (skip_prefix(start, "(size)", &p)) {
+		expand_objectsize(line, data->oid, data->type, 0);
+	} else if (skip_prefix(start, "(object)", &p)) {
+		strbuf_add_unique_abbrev(line, data->oid, abbrev);
+	} else if (skip_prefix(start, "(file)", &p)) {
+		const char *name = data->base->buf;
+		const char *prefix = chomp_prefix ? ls_tree_prefix : NULL;
+		struct strbuf quoted = STRBUF_INIT;
+		struct strbuf sb = STRBUF_INIT;
+		strbuf_addstr(data->base, data->pathname);
+		name = relative_path(data->base->buf, prefix, &sb);
+		quote_c_style(name, &quoted, NULL, 0);
+		strbuf_addbuf(line, &quoted);
+		strbuf_release(&sb);
+		strbuf_release(&quoted);
+	} else {
+		errlen = (unsigned long)len;
+		die(_("bad ls-tree format: %%%.*s"), errlen, start);
+	}
+	return len;
+}
+
 static int show_recursive(const char *base, size_t baselen,
 			  const char *pathname)
 {
@@ -116,6 +188,38 @@ static enum object_type get_type(unsigned int mode)
 	        : OBJ_BLOB);
 }
 
+static int show_tree_fmt(const struct object_id *oid, struct strbuf *base,
+			 const char *pathname, unsigned mode, void *context)
+{
+	size_t baselen;
+	int recurse = 0;
+	struct strbuf line = STRBUF_INIT;
+	enum object_type type = get_type(mode);
+
+	struct show_tree_data data = {
+		.mode = mode,
+		.type = type,
+		.oid = oid,
+		.pathname = pathname,
+		.base = base,
+	};
+
+	if (type == OBJ_TREE && show_recursive(base->buf, base->len, pathname))
+		recurse = READ_TREE_RECURSIVE;
+	if (type == OBJ_TREE && recurse && !(ls_options & LS_SHOW_TREES))
+		return recurse;
+	if (type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
+		return 0;
+
+	baselen = base->len;
+	strbuf_expand(&line, format, expand_show_tree, &data);
+	strbuf_addch(&line, line_termination);
+	fwrite(line.buf, line.len, 1, stdout);
+	strbuf_release(&line);
+	strbuf_setlen(base, baselen);
+	return recurse;
+}
+
 static int show_default(struct show_tree_data *data)
 {
 	size_t baselen = data->base->len;
@@ -195,6 +299,7 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	struct object_id oid;
 	struct tree *tree;
 	int i, full_tree = 0;
+	read_tree_fn_t fn = show_tree;
 	const struct option ls_tree_options[] = {
 		OPT_BIT('d', NULL, &ls_options, N_("only show trees"),
 			LS_TREE_ONLY),
@@ -217,6 +322,9 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 		OPT_BOOL(0, "full-tree", &full_tree,
 			 N_("list entire tree; not just current directory "
 			    "(implies --full-name)")),
+		OPT_STRING_F(0, "format", &format, N_("format"),
+			     N_("format to use for the output"),
+			     PARSE_OPT_NONEG),
 		OPT__ABBREV(&abbrev),
 		OPT_END()
 	};
@@ -237,6 +345,10 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	    ((LS_TREE_ONLY|LS_RECURSIVE) & ls_options))
 		ls_options |= LS_SHOW_TREES;
 
+	if (format && cmdmode)
+		usage_msg_opt(
+			_("--format can't be combined with other format-altering options"),
+			ls_tree_usage, ls_tree_options);
 	if (argc < 1)
 		usage_with_options(ls_tree_usage, ls_tree_options);
 	if (get_oid(argv[0], &oid))
@@ -260,6 +372,19 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	tree = parse_tree_indirect(&oid);
 	if (!tree)
 		die("not a tree object");
-	return !!read_tree(the_repository, tree,
-			   &pathspec, show_tree, NULL);
+
+	/*
+	 * The generic show_tree_fmt() is slower than show_tree(), so
+	 * take the fast path if possible.
+	 */
+	if (format &&
+	    (!strcmp(format, default_format) ||
+	     !strcmp(format, long_format) ||
+	     !strcmp(format, name_only_format) ||
+	     !strcmp(format, object_only_format)))
+		fn = show_tree;
+	else if (format)
+		fn = show_tree_fmt;
+
+	return !!read_tree(the_repository, tree, &pathspec, fn, NULL);
 }
diff --git a/t/t3105-ls-tree-format.sh b/t/t3105-ls-tree-format.sh
new file mode 100755
index 0000000000..ea0f51d866
--- /dev/null
+++ b/t/t3105-ls-tree-format.sh
@@ -0,0 +1,55 @@
+#!/bin/sh
+
+test_description='ls-tree --format'
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+test_expect_success 'ls-tree --format usage' '
+	test_expect_code 129 git ls-tree --format=fmt -l HEAD &&
+	test_expect_code 129 git ls-tree --format=fmt --name-only HEAD &&
+	test_expect_code 129 git ls-tree --format=fmt --name-status HEAD &&
+	test_expect_code 129 git ls-tree --format=fmt --object-only HEAD
+'
+
+test_expect_success 'setup' '
+	mkdir dir &&
+	test_commit dir/sub-file &&
+	test_commit top-file
+'
+
+test_ls_tree_format () {
+	format=$1 &&
+	opts=$2 &&
+	shift 2 &&
+	git ls-tree $opts -r HEAD >expect.raw &&
+	sed "s/^/> /" >expect <expect.raw &&
+	git ls-tree --format="> $format" -r HEAD >actual &&
+	test_cmp expect actual
+}
+
+test_expect_success 'ls-tree --format=<default-like>' '
+	test_ls_tree_format \
+		"%(mode) %(type) %(object)%x09%(file)" \
+		""
+'
+
+test_expect_success 'ls-tree --format=<long-like>' '
+	test_ls_tree_format \
+		"%(mode) %(type) %(object) %(size:padded)%x09%(file)" \
+		"--long"
+'
+
+test_expect_success 'ls-tree --format=<name-only-like>' '
+	test_ls_tree_format \
+		"%(file)" \
+		"--name-only"
+'
+
+test_expect_success 'ls-tree --format=<object-only-like>' '
+	test_ls_tree_format \
+		"%(object)" \
+		"--object-only"
+'
+
+test_done
-- 
2.34.1.390.g2ae0a9cb82.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v10 5/9] ls-tree: optimize naming and handling of "return" in show_tree()
  2022-01-13  3:42                 ` [PATCH v10 5/9] ls-tree: optimize naming and handling of "return" in show_tree() Teng Long
@ 2022-01-13  6:49                   ` Ævar Arnfjörð Bjarmason
  2022-01-14  7:59                     ` Teng Long
  0 siblings, 1 reply; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-13  6:49 UTC (permalink / raw)
  To: Teng Long
  Cc: Johannes.Schindelin, congdanhqx, git, gitster, peff, tenglong.tl,
	martin.agren, Teng Long


On Thu, Jan 13 2022, Teng Long wrote:

Re the $subject: Is "optimize naming" here just referring to the
s/retval/recurse/g?

Personally I think just a s/retval/ret/g here would make more sense if
we're doing any change at all, and in either case having this variable
re-rename split up as its own commit would make the proposed control
flow changes clearer.

> The variable which "show_tree()" return is named "retval", a name that's
> a little hard to understand. This commit tries to make the variable
> and the related codes more clear in the context.
>
> The commit firstly rename "retval" to "recurse" which is a more
> meaningful name than before. Secondly, "get_type()" is introduced
> to setup the "type" by "mode", this will remove some of the nested if.
> After this, The codes here become a little bit clearer, so we do not
> need to take a look at "read_tree_at()" in "tree.c" to make sure the
> context of the return value.
>
> Signed-off-by: Teng Long <dyronetengb@gmail.com>
> ---
>  builtin/ls-tree.c | 33 ++++++++++++++++++---------------
>  1 file changed, 18 insertions(+), 15 deletions(-)
>
> diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
> index eecc7482d5..9729854a3d 100644
> --- a/builtin/ls-tree.c
> +++ b/builtin/ls-tree.c
> @@ -61,24 +61,27 @@ static int show_recursive(const char *base, size_t baselen, const char *pathname
>  	return 0;
>  }
>  
> +static enum object_type get_type(unsigned int mode)
> +{
> +	return (S_ISGITLINK(mode)
> +	        ? OBJ_COMMIT
> +	        : S_ISDIR(mode)
> +	        ? OBJ_TREE
> +	        : OBJ_BLOB);
> +}

This new function is a re-invention of the object_type() utility in
cache.h, and isn't needed. I.e....

>  static int show_tree(const struct object_id *oid, struct strbuf *base,
>  		const char *pathname, unsigned mode, void *context)
>  {
> -	int retval = 0;
> +	int recurse = 0;
>  	size_t baselen;
> -	enum object_type type = OBJ_BLOB;
> -
> -	if (S_ISGITLINK(mode)) {
> -		type = OBJ_COMMIT;
> -	} else if (S_ISDIR(mode)) {
> -		if (show_recursive(base->buf, base->len, pathname)) {
> -			retval = READ_TREE_RECURSIVE;
> -			if (!(ls_options & LS_SHOW_TREES))
> -				return retval;
> -		}
> -		type = OBJ_TREE;
> -	}
> -	else if (ls_options & LS_TREE_ONLY)
> +	enum object_type type = get_type(mode);

...just drop it and do this:

	-       enum object_type type = get_type(mode);
	+       enum object_type type = object_type(mode);

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v10 6/9] ls-tree.c: support --object-only option for "git-ls-tree"
  2022-01-13  3:42                 ` [PATCH v10 6/9] ls-tree.c: support --object-only option for "git-ls-tree" Teng Long
@ 2022-01-13  6:59                   ` Ævar Arnfjörð Bjarmason
  2022-01-14  8:18                     ` Teng Long
  0 siblings, 1 reply; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-13  6:59 UTC (permalink / raw)
  To: Teng Long
  Cc: Johannes.Schindelin, congdanhqx, git, gitster, peff, tenglong.tl,
	martin.agren


On Thu, Jan 13 2022, Teng Long wrote:

> We usually pipe the output from `git ls-trees` to tools like
> `sed` or `cut` when we only want to extract some fields.
>
> When we want only the pathname component, we can pass
> `--name-only` option to omit such a pipeline, but there are no
> options for extracting other fields.
>
> Teach the "--object-only" option to the command to only show the
> object name. This option cannot be used together with
> "--name-only" or "--long" , they are mutually exclusive (actually
> "--name-only" and "--long" can be combined together before, this
> commit by the way fix this bug).

In the RFC series I sent this was first implemented in terms of the
--format option, and I skipped the custom implementation you're adding
here:
https://lore.kernel.org/git/RFC-patch-7.7-5e34df4f8dd-20211217T131635Z-avarab@gmail.com/

I think in terms of patch series structure it would make sense to do
that, and then have this custom --object-only implementation in terms of
not-"--format " follow from that, and thus with the tests for the two
(we'd add the tests you're adding here first, just for a
--format="%(objectname)" or whatever) we'd see that the two are 1:1
equivalent in terms of functionality, but that this one is <X>% more
optimized.

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v10 7/9] ls-tree.c: introduce struct "show_tree_data"
  2022-01-13  3:42                 ` [PATCH v10 7/9] ls-tree.c: introduce struct "show_tree_data" Teng Long
@ 2022-01-13  7:03                   ` Ævar Arnfjörð Bjarmason
  2022-01-14  9:12                     ` Teng Long
  0 siblings, 1 reply; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-13  7:03 UTC (permalink / raw)
  To: Teng Long
  Cc: Johannes.Schindelin, congdanhqx, git, gitster, peff, tenglong.tl,
	martin.agren


On Thu, Jan 13 2022, Teng Long wrote:

> "show_tree_data" is a struct that packages the necessary fields for
> "show_tree()". This commit is a pre-prepared commit for supporting
> "--format" option and it does not affect any existing functionality.

Is the only reason this is split off from 9/9 because you're injecting a
8/9 commit for the coccinelle rule change, and wanted to find some
logical cut-off between the two?

> Signed-off-by: Teng Long <dyroneteng@gmail.com>

For both this & 9/9 this seems to mostly/substantially be code I wrote
and submitted as part of
https://lore.kernel.org/git/RFC-patch-6.7-eac299f06ff-20211217T131635Z-avarab@gmail.com;

The convention we use in such cases is to retain the "Author" header and
just add your own Signed-off-by to patches you're modifying/splitting
up.

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v10 9/9] ls-tree.c: introduce "--format" option
  2022-01-13  3:42                 ` [PATCH v10 9/9] ls-tree.c: introduce "--format" option Teng Long
@ 2022-01-13  7:16                   ` Ævar Arnfjörð Bjarmason
  2022-01-18 12:59                     ` Teng Long
  0 siblings, 1 reply; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-13  7:16 UTC (permalink / raw)
  To: Teng Long
  Cc: Johannes.Schindelin, congdanhqx, git, gitster, peff, tenglong.tl,
	martin.agren, John Cai


On Thu, Jan 13 2022, Teng Long wrote:

> Signed-off-by: Teng Long <dyroneteng@gmail.com>
> ---
>  Documentation/git-ls-tree.txt |  51 +++++++++++++-
>  builtin/ls-tree.c             | 129 +++++++++++++++++++++++++++++++++-
>  t/t3105-ls-tree-format.sh     |  55 +++++++++++++++
>  3 files changed, 230 insertions(+), 5 deletions(-)
>  create mode 100755 t/t3105-ls-tree-format.sh
>
> diff --git a/Documentation/git-ls-tree.txt b/Documentation/git-ls-tree.txt
> index 729370f235..ebdde6eae3 100644
> --- a/Documentation/git-ls-tree.txt
> +++ b/Documentation/git-ls-tree.txt
> @@ -10,9 +10,9 @@ SYNOPSIS
>  --------
>  [verse]
>  'git ls-tree' [-d] [-r] [-t] [-l] [-z]
> -	    [--name-only] [--name-status] [--object-only] [--full-name] [--full-tree] [--abbrev[=<n>]]
> -	    <tree-ish> [<path>...]
> -
> +	    [--name-only] [--name-status] [--object-only]
> +	    [--full-name] [--full-tree] [--abbrev[=<n>]] 

Let's split up this re-flow only change into its own commit? I.e. the
only non-whitespace change here is beginning with [--format].

If it was the right thing to do to re-flow this then we didn't need
[--format=<format>] to exist to do so...

> +	    [--format=<format>] <tree-ish> [<path>...]
>  DESCRIPTION

Removing this \n breaks the formatting in the file. See "make man && man
./Documentation/git-ls-tree.1". The ./Documentation/doc-diff utility is
also handy for sanity checking the documentation formatting.

>  -----------
>  Lists the contents of a given tree object, like what "/bin/ls -a" does
> @@ -79,6 +79,16 @@ OPTIONS
>  	Do not limit the listing to the current working directory.
>  	Implies --full-name.
>  
> +--format=<format>::
> +	A string that interpolates `%(fieldname)` from the result
> +	being shown. It also interpolates `%%` to `%`, and
> +	`%xx` where `xx` are hex digits interpolates to character
> +	with hex code `xx`; for example `%00` interpolates to
> +	`\0` (NUL), `%09` to `\t` (TAB) and `%0a` to `\n` (LF).
> +	When specified, `--format` cannot be combined with other
> +	format-altering options, including `--long`, `--name-only`
> +	and `--object-only`.
> +

These new docs make sense & seem to cover all the bases, thanks!

> +
> +Default format:
> +
>          <mode> SP <type> SP <object> TAB <file>

Here because we've added --format discussing the previous pseudo-format
as a "default" format becomes confusing. Let's instead say:

        The output format of `ls-tree` is determined by either the `--format` option,
        or other format-altering options such as `--long` etc. (see `--format` above).

        The use of certain `--format` directives is equivalent to using those options,
        but invoking the full formatting machinery can be slower than using an appropriate
        formatting option.

        In cases where the `--format` would exactly map to an existing option `ls-tree` will
        use the appropriate faster path. Thus the default format is equivalent to:
        ---
        %(mode) %(type) %(object)%x09%(file)
        ---

Or something like that. We could then discuss e.g. --long being
`%(mode) %(type) %(object) %(size:padded)%x09%(file)` when we discuss
that option.

>  This output format is compatible with what `--index-info --stdin` of
> @@ -105,6 +118,38 @@ quoted as explained for the configuration variable `core.quotePath`
>  (see linkgit:git-config[1]).  Using `-z` the filename is output
>  verbatim and the line is terminated by a NUL byte.
>  
> +Customized format:
> +
> +It is possible to print in a custom format by using the `--format` option,
> +which is able to interpolate different fields using a `%(fieldname)` notation.
> +For example, if you want to only print the <object> and <file> fields with a
> +JSON style, executing with a specific "--format" like
> +
> +        git ls-tree --format='{"object":"%(object)", "file":"%(file)"}' <tree-ish>
> +
> +The output format changes to:
> +
> +        {"object":"<object>", "file":"<file>"}

This one-liner is guaranteed to result in invalid JSON on some
repositories, both because JSON is inherently a bad fit for git's data
model (JSON needs to be in one Unicode encoding, Git's tree data might
be in a mixture of encodings), and because it'll break if the file
includes a '"'.

I think it's better to just replace this with some example involving -z,
or at least prominently note that this is broken in the general case,
but can be used ad-hoc to quickly check things with "jq" or whatever.
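
To make the -z suggestion a bit more concrete: since -z emits names
verbatim and terminates each record with a NUL byte, a consumer only has
to split on NUL and never has to guess about quoting. Here is a minimal
sketch of such a consumer, assuming the whole output has already been
read into memory; split_nul_records() is a hypothetical helper, not
something in git.git.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Walk a buffer of NUL-terminated records and store a pointer to the
 * start of each record in `out` (at most `max` entries). Returns the
 * number of complete records found; a trailing fragment that lacks its
 * terminating NUL is ignored.
 */
size_t split_nul_records(const char *buf, size_t len,
			 const char **out, size_t max)
{
	size_t n = 0;
	const char *p = buf;
	const char *end = buf + len;

	assert(buf && out);
	while (p < end && n < max) {
		const char *nul = memchr(p, '\0', (size_t)(end - p));
		if (!nul)
			break;
		out[n++] = p;
		p = nul + 1;
	}
	return n;
}
```

A name containing '"' or even a newline comes through unchanged as a
single record, which is exactly the case the JSON one-liner cannot
handle.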

> +FIELD NAMES
> +-----------
> +
> +Various values from structured fields can be used to interpolate
> +into the resulting output. For each outputing line, the following
> +names can be used:
> +
> +mode::
> +	The mode of the object.
> +type::
> +	The type of the object (`blob` or `tree`).
> +object::
> +	The name of the object.
> +size[:padded]::
> +	The size of the object ("-" if it's a tree).
> +	It also supports a padded format of size with "%(size:padded)".
> +file::
> +	The filename of the object.

In
https://lore.kernel.org/git/cover.1641043500.git.dyroneteng@gmail.com/
you noted that you changed the field names of e.g. "objectname" to
"object" etc. You're right that I picked these as-is from the
git-for-each-ref formatting.

1/3 of your reasoning for doing so was to make it consistent with the
documentation examples of e.g.:

     <mode> SP <type> SP <object> TAB <file>

I think in any case (as noted above) we should change those to use the
--format, so that leaves just:

 - "I prefer to make the name more simple to memorize and type"
 - "I think the names with "object" prefix are [from git-for-each-ref
   and the object* prefixes aren't redundant there, but would be here]".

I think both of those still apply, but I think having these consistent
with git-for-each-ref outweighs the slight benefit of shorter names.

Right now only a handful of things support these sort of --format
directives, but we've already got RFC/WIP patches to add that to
git-cat-file, and are likely to add more in the future.

I'd also like us to eventually be able to combine what are now separate
built-ins with their own --format to expose more deeply some internal
APIs via IPC. E.g. now you can do this:

    git for-each-ref --format='%(refname) %(tree)'

But to list each of those trees you'd need to pipe that output into this
new 'git ls-tree --format'. But imagine being able to do something like:

    git for-each-ref --format='%(refname) %(git-ls-tree --format %%(objectname) %(tree))'

Where we'd just invoke git-ls-tree for you without running a full
sub-process. Both for that hypothetical and for working with the two
--formats as they are now, having to use %(type) in some places but
%(objecttype) etc. in others is just needlessly confusing. Let's just
consistently use the same format names everywhere.

As for your s/path/file/ name change specifically: that's just
inaccurate. Consider:

    $ ./git ls-tree --format="%(mode) %(type) %(file)" -t HEAD -- t/README
    040000 tree t
    100644 blob t/README

And:

    $ (cd t && ../git ls-tree --format="%(mode) %(type) %(file)" -t -r HEAD -- README)
    040000 tree ./
    100644 blob README

I.e. we talk about <path> in the existing SYNOPSIS for a reason. That we
had a "<file>" in the existing format demo was a bug/shorthand that we
shouldn't be propagating further.

> [...]
> +static const char *format;
> +static const char *default_format = "%(mode) %(type) %(object)%x09%(file)";
> +static const char *long_format = "%(mode) %(type) %(object) %(size:padded)%x09%(file)";
> +static const char *name_only_format = "%(file)";
> +static const char *object_only_format = "%(object)";
> +

One advantage of keeping the variable names I picked in
https://lore.kernel.org/git/RFC-patch-6.7-eac299f06ff-20211217T131635Z-avarab@gmail.com/
is that they align, so you can instantly see that the first two are
equivalent until the "%x09".

Avoiding such churn also makes this easier to review. To see what you
really changed I'm looking at a local version of a range-diff where I
renamed these, the struct you renamed etc. back, just to see what you
/really/ changed, i.e. which changes are functional vs. mere renames.

>  static int parse_shown_fields(void)
>  {
>  	if (cmdmode == MODE_NAME_ONLY) {
> @@ -76,6 +82,72 @@ static int parse_shown_fields(void)
>  	return 1;
>  }
>  
> +static void expand_objectsize(struct strbuf *line, const struct object_id *oid,
> +			      const enum object_type type, unsigned int padded)
> +{
> +	if (type == OBJ_BLOB) {
> +		unsigned long size;
> +		if (oid_object_info(the_repository, oid, &size) < 0)
> +			die(_("could not get object info about '%s'"),
> +			    oid_to_hex(oid));
> +		if (padded)
> +			strbuf_addf(line, "%7" PRIuMAX, (uintmax_t)size);
> +		else
> +			strbuf_addf(line, "%" PRIuMAX, (uintmax_t)size);

Here you changed my '"%"PRIuMAX' to '"%" PRIuMAX'. The former is the
prevailing style in this codebase, and avoiding the formatting churn
makes the inter-diff easier to read.

> +	} else if (padded) {
> +		strbuf_addf(line, "%7s", "-");
> +	} else {
> +		strbuf_addstr(line, "-");
> +	}
> +}

Ditto here: the interdiff is harder to review due to renaming churn,
i.e. s/line/sb/ in both this and expand_show_tree(). I really wouldn't
care at all, except for all the manual work in reviewing the
inter-diff between my original version & this derived version.

In the case of "line" that's not even an improvement. With a --format
we're not building a "line", the user is free to insert any arbitrary
directives including \n's, so we might be working on multiple lines.
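
As a minimal sketch of that point: with a format such as
'%(objectmode) %(objecttype)%x0a%(path)' a single tree entry already
expands to two lines. The snippet below does not call git; emit_entry
is a hypothetical stand-in for expanding that format for one entry,
with the trailing newline playing the role of line_termination.

```shell
# Hypothetical stand-in for expanding
#   --format='%(objectmode) %(objecttype)%x0a%(path)'
# for a single tree entry. It does not call git.
emit_entry () {
	# $1 = mode, $2 = type, $3 = path
	# The first \n corresponds to %x0a in the format, the second to
	# the terminator appended after each expanded entry.
	printf '%s %s\n%s\n' "$1" "$2" "$3"
}

# One entry in, two lines out: the buffer is not "a line".
emit_entry 100644 blob t/README
```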

> +test_expect_success 'ls-tree --format usage' '
> +	test_expect_code 129 git ls-tree --format=fmt -l HEAD &&
> +	test_expect_code 129 git ls-tree --format=fmt --name-only HEAD &&
> +	test_expect_code 129 git ls-tree --format=fmt --name-status HEAD &&
> +	test_expect_code 129 git ls-tree --format=fmt --object-only HEAD
> +'

This & several other changes vs. my version are good, e.g. here I seem
to have repeated the logic error I noted for your version (i.e. omitting
"HEAD"), oops!

> +test_expect_success 'setup' '
> +	mkdir dir &&
> +	test_commit dir/sub-file &&
> +	test_commit top-file
> +'
> +
> +test_ls_tree_format () {
> +	format=$1 &&
> +	opts=$2 &&
> +	shift 2 &&
> +	git ls-tree $opts -r HEAD >expect.raw &&
> +	sed "s/^/> /" >expect <expect.raw &&
> +	git ls-tree --format="> $format" -r HEAD >actual &&
> +	test_cmp expect actual
> +}
> +
> +test_expect_success 'ls-tree --format=<default-like>' '
> +	test_ls_tree_format \
> +		"%(mode) %(type) %(object)%x09%(file)" \
> +		""
> +'
> +
> +test_expect_success 'ls-tree --format=<long-like>' '
> +	test_ls_tree_format \
> +		"%(mode) %(type) %(object) %(size:padded)%x09%(file)" \
> +		"--long"
> +'
> +
> +test_expect_success 'ls-tree --format=<name-only-like>' '
> +	test_ls_tree_format \
> +		"%(file)" \
> +		"--name-only"
> +'
> +
> +test_expect_success 'ls-tree --format=<object-only-like>' '
> +	test_ls_tree_format \
> +		"%(object)" \
> +		"--object-only"
> +'
> +
> +test_done

As I noted in my RFC cover letter
(https://lore.kernel.org/git/RFC-cover-0.7-00000000000-20211217T131635Z-avarab@gmail.com/):

	"the tests for ls-tree are really
	lacking. E.g. I seem to have a rather obvious bug in how -t and the
	--format interact here, but no test catches it."

So first, in my version of adding --format I was careful to make
--name-only etc. imply a given --format, and then only at the last
minute would we take the "fast path":
https://lore.kernel.org/git/RFC-patch-6.7-eac299f06ff-20211217T131635Z-avarab@gmail.com/
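
The "imply a given --format" step can be sketched roughly like this.
It is a shell illustration only, not the C from the RFC;
implied_format is a hypothetical helper, and the strings are the
ls_tree_format_* values that appear in the range-diff below.

```shell
# Hypothetical helper: map an ls-tree output-mode option to the
# --format string it is equivalent to. An empty argument means
# "no mode option given", i.e. the default output.
implied_format () {
	case "$1" in
	'')
		printf '%s\n' '%(objectmode) %(objecttype) %(objectname)%x09%(path)' ;;
	-l|--long)
		printf '%s\n' '%(objectmode) %(objecttype) %(objectname) %(objectsize:padded)%x09%(path)' ;;
	--name-only|--name-status)
		printf '%s\n' '%(path)' ;;
	--object-only)
		printf '%s\n' '%(objectname)' ;;
	*)
		# Not a format-altering option known to this sketch.
		return 1 ;;
	esac
}

implied_format --name-only
```

With every mode reduced to a format string like this, the choice
between the fast path and the generic expander comes down to one
string comparison at the very end.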

You rewrote that in
https://lore.kernel.org/git/e0add802fbbabde7e7b3743127b2d4047f1ce760.1641043500.git.dyroneteng@gmail.com/
and removed the limited "GIT_TEST_LS_TREE_FORMAT_BACKEND" testing I
added, so now the internal --format machinery can't be run through the
existing tests we do have.

Even with that re-added I really wouldn't trust that this code is doing
the right thing (and as noted, I don't trust my own RFC version
either). I think e.g. our "coverage" Makefile targets would be a good
start as a first approximation, i.e. running the /ls-tree/ tests and
seeing if we have full coverage.

> Signed-off-by: Teng Long <dyroneteng@gmail.com>

As I noted in 7/9, I think this patch (9/9) is still mostly something I
wrote, so the "author" and Signed-off-by should be preserved. The
below is a range-diff of an amended version I've been looking at in
trying to review this. It undoes several (but not all) of your
formatting/renaming-only changes, just so that I could see what the
non-formatting changes were:

1:  6c96dff15c5 ! 1:  917bb168d45 ls-tree: add a --format=<fmt> option
    @@
      ## Metadata ##
    -Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    +Author: Teng Long <dyroneteng@gmail.com>
     
      ## Commit message ##
    -    ls-tree: add a --format=<fmt> option
    +    ls-tree.c: introduce "--format" option
     
         Add a --format option to ls-tree. It has an existing default output,
         and then --long and --name-only options to emit the default output
    @@ Commit message
     
         The --format implementation is slower than the existing code, but this
         change does not cause any performance regressions. We'll leave the
    -    existing show_tree() unchanged, and only run show_tree_format() in if
    +    existing show_tree() unchanged, and only run show_tree_fmt() in if
         a --format different than the hardcoded built-in ones corresponding to
         the existing modes is provided.
     
    -    "Slower" here can bee seen via the the following "hyperfine"
    -    command. This uses GIT_TEST_LS_TREE_FORMAT_BACKEND=<bool> to force the
    -    use of the new backend:
    -
    -        $ hyperfine -L env false,true -L f "-r,-r -l,-r --name-only,-r --format='%(objectname)'" 'GIT_TEST_LS_TREE_FORMAT_BACKEND={env} ./git -C ~/g/linux ls-tree {f} HEAD' -r 10
    -        Benchmark 1: GIT_TEST_LS_TREE_FORMAT_BACKEND=false ./git -C ~/g/linux ls-tree -r HEAD
    -          Time (mean ± σ):      86.1 ms ±   0.6 ms    [User: 65.2 ms, System: 20.9 ms]
    -          Range (min … max):    85.2 ms …  87.5 ms    10 runs
    +    I.e. something like the "--long" output would be much slower with
    +    this, mainly due to how we need to allocate various things to do with
    +    quote.c instead of spewing the output directly to stdout.
     
    -        Benchmark 2: GIT_TEST_LS_TREE_FORMAT_BACKEND=true ./git -C ~/g/linux ls-tree -r HEAD
    -          Time (mean ± σ):     122.5 ms ±   0.6 ms    [User: 101.3 ms, System: 21.1 ms]
    -          Range (min … max):   121.8 ms … 123.4 ms    10 runs
    +    The new option of '--format' comes from Ævar Arnfjörð Bjarmasonn's
    +    idea and suggestion, this commit makes modifications in terms of the
    +    original discussion on community [1].
     
    -        Benchmark 3: GIT_TEST_LS_TREE_FORMAT_BACKEND=false ./git -C ~/g/linux ls-tree -r -l HEAD
    -          Time (mean ± σ):     277.7 ms ±   1.3 ms    [User: 234.6 ms, System: 43.0 ms]
    -          Range (min … max):   275.9 ms … 279.7 ms    10 runs
    +    Here is the statistics about performance tests:
     
    -        Benchmark 4: GIT_TEST_LS_TREE_FORMAT_BACKEND=true ./git -C ~/g/linux ls-tree -r -l HEAD
    -          Time (mean ± σ):     332.8 ms ±   2.6 ms    [User: 282.0 ms, System: 50.7 ms]
    -          Range (min … max):   329.6 ms … 338.2 ms    10 runs
    +    1. Default format (hitten the builtin formats):
     
    -        Benchmark 5: GIT_TEST_LS_TREE_FORMAT_BACKEND=false ./git -C ~/g/linux ls-tree -r --name-only HEAD
    -          Time (mean ± σ):      71.8 ms ±   0.4 ms    [User: 54.1 ms, System: 17.6 ms]
    -          Range (min … max):    71.2 ms …  72.5 ms    10 runs
    +        "git ls-tree <tree-ish>" vs "--format='%(mode) %(type) %(object)%x09%(file)'"
     
    -        Benchmark 6: GIT_TEST_LS_TREE_FORMAT_BACKEND=true ./git -C ~/g/linux ls-tree -r --name-only HEAD
    -          Time (mean ± σ):      86.6 ms ±   0.5 ms    [User: 65.7 ms, System: 20.7 ms]
    -          Range (min … max):    85.9 ms …  87.4 ms    10 runs
    +        $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r HEAD"
    +        Benchmark 1: /opt/git/master/bin/git ls-tree -r HEAD
    +        Time (mean ± σ):     105.2 ms ±   3.3 ms    [User: 84.3 ms, System: 20.8 ms]
    +        Range (min … max):    99.2 ms … 113.2 ms    28 runs
     
    -        Benchmark 7: GIT_TEST_LS_TREE_FORMAT_BACKEND=false ./git -C ~/g/linux ls-tree -r --format='%(objectname)' HEAD
    -          Time (mean ± σ):      85.8 ms ±   0.6 ms    [User: 66.2 ms, System: 19.5 ms]
    -          Range (min … max):    85.0 ms …  86.9 ms    10 runs
    +        $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object)%x09%(file)'  HEAD"
    +        Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object)%x09%(file)'  HEAD
    +        Time (mean ± σ):     106.4 ms ±   2.7 ms    [User: 86.1 ms, System: 20.2 ms]
    +        Range (min … max):   100.2 ms … 110.5 ms    29 runs
     
    -        Benchmark 8: GIT_TEST_LS_TREE_FORMAT_BACKEND=true ./git -C ~/g/linux ls-tree -r --format='%(objectname)' HEAD
    -          Time (mean ± σ):      85.3 ms ±   0.2 ms    [User: 66.6 ms, System: 18.7 ms]
    -          Range (min … max):    85.0 ms …  85.7 ms    10 runs
    +    2. Default format includes object size (hitten the builtin formats):
     
    -        Summary
    -          'GIT_TEST_LS_TREE_FORMAT_BACKEND=false ./git -C ~/g/linux ls-tree -r --name-only HEAD' ran
    -            1.19 ± 0.01 times faster than 'GIT_TEST_LS_TREE_FORMAT_BACKEND=true ./git -C ~/g/linux ls-tree -r --format='%(objectname)' HEAD'
    -            1.19 ± 0.01 times faster than 'GIT_TEST_LS_TREE_FORMAT_BACKEND=false ./git -C ~/g/linux ls-tree -r --format='%(objectname)' HEAD'
    -            1.20 ± 0.01 times faster than 'GIT_TEST_LS_TREE_FORMAT_BACKEND=false ./git -C ~/g/linux ls-tree -r HEAD'
    -            1.21 ± 0.01 times faster than 'GIT_TEST_LS_TREE_FORMAT_BACKEND=true ./git -C ~/g/linux ls-tree -r --name-only HEAD'
    -            1.71 ± 0.01 times faster than 'GIT_TEST_LS_TREE_FORMAT_BACKEND=true ./git -C ~/g/linux ls-tree -r HEAD'
    -            3.87 ± 0.03 times faster than 'GIT_TEST_LS_TREE_FORMAT_BACKEND=false ./git -C ~/g/linux ls-tree -r -l HEAD'
    -            4.64 ± 0.05 times faster than 'GIT_TEST_LS_TREE_FORMAT_BACKEND=true ./git -C ~/g/linux ls-tree -r -l HEAD'
    +        "git ls-tree -l <tree-ish>" vs "--format='%(mode) %(type) %(object) %(size:padded)%x09%(file)'"
     
    -    I.e. something like the "--long" output would be much slower with
    -    this, mainly due to how we need to allocate various things to do with
    -    quote.c instead of spewing the output directly to stdout.
    +        $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r -l HEAD"
    +        Benchmark 1: /opt/git/master/bin/git ls-tree -r -l HEAD
    +        Time (mean ± σ):     335.1 ms ±   6.5 ms    [User: 304.6 ms, System: 30.4 ms]
    +        Range (min … max):   327.5 ms … 348.4 ms    10 runs
     
    -    But even a --format='%(objectname)' is fast with the new backend, so
    -    this is viable as a replacement for adding new formats, and we'll pay
    -    for this added complexity as a one-off, and not again every time a new
    -    format needs to be added. See [1] for an example of what it would
    -    otherwise take to add an --object-name flag.
    +        $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object) %(size:padded)%x09%(file)'  HEAD"
    +        Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object) %(size:padded)%x09%(file)'  HEAD
    +        Time (mean ± σ):     337.2 ms ±   8.2 ms    [User: 309.2 ms, System: 27.9 ms]
    +        Range (min … max):   328.8 ms … 349.4 ms    10 runs
     
    -    1. https://lore.kernel.org/git/2e449d1c792ff81da5f22c8bf65ed33c393d62f8.1639721750.git.dyroneteng@gmail.com/
    +    Links:
    +            [1] https://public-inbox.org/git/RFC-patch-6.7-eac299f06ff-20211217T131635Z-avarab@gmail.com/
     
    -    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    +    Signed-off-by: Teng Long <dyroneteng@gmail.com>
     
    - ## builtin/ls-tree.c ##
    -@@ builtin/ls-tree.c: static struct pathspec pathspec;
    - static int chomp_prefix;
    - static const char *ls_tree_prefix;
    + ## Documentation/git-ls-tree.txt ##
    +@@ Documentation/git-ls-tree.txt: SYNOPSIS
    + --------
    + [verse]
    + 'git ls-tree' [-d] [-r] [-t] [-l] [-z]
    +-	    [--name-only] [--name-status] [--object-only] [--full-name] [--full-tree] [--abbrev[=<n>]]
    +-	    <tree-ish> [<path>...]
    ++	    [--name-only] [--name-status] [--object-only]
    ++	    [--full-name] [--full-tree] [--abbrev[=<n>]]
    ++	    [--format=<format>] <tree-ish> [<path>...]
      
    -+/*
    -+ * The format equivalents that show_tree() is prepared to handle.
    -+ */
    -+static const char *ls_tree_format_d = "%(objectmode) %(objecttype) %(objectname)%x09%(path)";
    -+static const char *ls_tree_format_l = "%(objectmode) %(objecttype) %(objectname) %(objectsize:padded)%x09%(path)";
    -+static const char *ls_tree_format_n = "%(path)";
    + DESCRIPTION
    + -----------
    +@@ Documentation/git-ls-tree.txt: OPTIONS
    + 	Do not limit the listing to the current working directory.
    + 	Implies --full-name.
    + 
    ++--format=<format>::
    ++	A string that interpolates `%(fieldname)` from the result
    ++	being shown. It also interpolates `%%` to `%`, and
    ++	`%xx` where `xx` are hex digits interpolates to character
    ++	with hex code `xx`; for example `%00` interpolates to
    ++	`\0` (NUL), `%09` to `\t` (TAB) and `%0a` to `\n` (LF).
    ++	When specified, `--format` cannot be combined with other
    ++	format-altering options, including `--long`, `--name-only`
    ++	and `--object-only`.
     +
    - static const  char * const ls_tree_usage[] = {
    - 	N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
    - 	NULL
    - };
    + [<path>...]::
    + 	When paths are given, show them (note that this isn't really raw
    + 	pathnames, but rather a list of patterns to match).  Otherwise
    +@@ Documentation/git-ls-tree.txt: OPTIONS
      
    -+struct read_tree_ls_tree_data {
    -+	const char *format;
    -+	struct strbuf sb_scratch;
    -+	struct strbuf sb_tmp;
    -+};
    + Output Format
    + -------------
    ++
    ++Default format:
    ++
    +         <mode> SP <type> SP <object> TAB <file>
    + 
    + This output format is compatible with what `--index-info --stdin` of
    +@@ Documentation/git-ls-tree.txt: quoted as explained for the configuration variable `core.quotePath`
    + (see linkgit:git-config[1]).  Using `-z` the filename is output
    + verbatim and the line is terminated by a NUL byte.
    + 
    ++Customized format:
     +
    ++It is possible to print in a custom format by using the `--format` option,
    ++which is able to interpolate different fields using a `%(fieldname)` notation.
    ++For example, if you want to only print the <object> and <file> fields with a
    ++JSON style, executing with a specific "--format" like
    ++
    ++        git ls-tree --format='{"object":"%(object)", "file":"%(file)"}' <tree-ish>
    ++
    ++The output format changes to:
    ++
    ++        {"object":"<object>", "file":"<file>"}
    ++
    ++FIELD NAMES
    ++-----------
    ++
    ++Various values from structured fields can be used to interpolate
    ++into the resulting output. For each outputing line, the following
    ++names can be used:
    ++
    ++mode::
    ++	The mode of the object.
    ++type::
    ++	The type of the object (`blob` or `tree`).
    ++object::
    ++	The name of the object.
    ++size[:padded]::
    ++	The size of the object ("-" if it's a tree).
    ++	It also supports a padded format of size with "%(size:padded)".
    ++file::
    ++	The filename of the object.
    ++
    + GIT
    + ---
    + Part of the linkgit:git[1] suite
    +
    + ## builtin/ls-tree.c ##
    +@@ builtin/ls-tree.c: static unsigned int shown_fields;
    + #define FIELD_DEFAULT 29 /* 11101 size is not shown to output by default */
    + #define FIELD_LONG_DEFAULT  (FIELD_DEFAULT | FIELD_SIZE)
    + 
     +struct expand_ls_tree_data {
     +	unsigned mode;
     +	enum object_type type;
     +	const struct object_id *oid;
     +	const char *pathname;
    -+	const char *basebuf;
    -+	struct strbuf *sb_scratch;
    -+	struct strbuf *sb_tmp;
    ++	struct strbuf *base;
     +};
     +
    - static int show_recursive(const char *base, size_t baselen, const char *pathname)
    + static const  char * const ls_tree_usage[] = {
    + 	N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
    + 	NULL
    +@@ builtin/ls-tree.c: enum {
    + 
    + static int cmdmode = MODE_UNSPECIFIED;
    + 
    ++static const char *format;
    ++static const char *ls_tree_format_d = "%(objectmode) %(objecttype) %(objectname)%x09%(path)";
    ++static const char *ls_tree_format_l = "%(objectmode) %(objecttype) %(objectname) %(objectsize:padded)%x09%(path)";
    ++static const char *ls_tree_format_n = "%(path)";
    ++static const char *ls_tree_format_o = "%(objectname)";
    ++
    + static int parse_shown_fields(void)
      {
    - 	int i;
    -@@ builtin/ls-tree.c: static int show_recursive(const char *base, size_t baselen, const char *pathname
    - 	return 0;
    + 	if (cmdmode == MODE_NAME_ONLY) {
    +@@ builtin/ls-tree.c: static int parse_shown_fields(void)
    + 	return 1;
      }
      
    -+static void expand_objectsize(struct strbuf *sb,
    -+			      const struct object_id *oid,
    -+			      const enum object_type type,
    -+			      unsigned int padded)
    ++static void expand_objectsize(struct strbuf *sb, const struct object_id *oid,
    ++			      const enum object_type type, unsigned int padded)
     +{
     +	if (type == OBJ_BLOB) {
     +		unsigned long size;
     +		if (oid_object_info(the_repository, oid, &size) < 0)
    -+			die(_("could not get object info about '%s'"), oid_to_hex(oid));
    ++			die(_("could not get object info about '%s'"),
    ++			    oid_to_hex(oid));
     +		if (padded)
     +			strbuf_addf(sb, "%7"PRIuMAX, (uintmax_t)size);
     +		else
    @@ builtin/ls-tree.c: static int show_recursive(const char *base, size_t baselen, c
     +	}
     +}
     +
    -+static size_t expand_show_tree(struct strbuf *sb,
    -+			       const char *start,
    ++static size_t expand_show_tree(struct strbuf *sb, const char *start,
     +			       void *context)
     +{
     +	struct expand_ls_tree_data *data = context;
     +	const char *end;
     +	const char *p;
    -+	size_t len;
    ++	unsigned int errlen;
    ++	size_t len = strbuf_expand_literal_cb(sb, start, NULL);
     +
    -+	len = strbuf_expand_literal_cb(sb, start, NULL);
     +	if (len)
     +		return len;
    -+
     +	if (*start != '(')
    -+		die(_("bad format as of '%s'"), start);
    ++		die(_("bad ls-tree format: as '%s'"), start);
    ++
     +	end = strchr(start + 1, ')');
     +	if (!end)
    -+		die(_("ls-tree format element '%s' does not end in ')'"),
    -+		    start);
    -+	len = end - start + 1;
    ++		die(_("bad ls-tree format: element '%s' does not end in ')'"), start);
     +
    ++	len = end - start + 1;
     +	if (skip_prefix(start, "(objectmode)", &p)) {
     +		strbuf_addf(sb, "%06o", data->mode);
     +	} else if (skip_prefix(start, "(objecttype)", &p)) {
    @@ builtin/ls-tree.c: static int show_recursive(const char *base, size_t baselen, c
     +	} else if (skip_prefix(start, "(objectsize)", &p)) {
     +		expand_objectsize(sb, data->oid, data->type, 0);
     +	} else if (skip_prefix(start, "(objectname)", &p)) {
    -+		strbuf_addstr(sb, find_unique_abbrev(data->oid, abbrev));
    ++		strbuf_add_unique_abbrev(sb, data->oid, abbrev);
     +	} else if (skip_prefix(start, "(path)", &p)) {
    -+		const char *name = data->basebuf;
    ++		const char *name = data->base->buf;
     +		const char *prefix = chomp_prefix ? ls_tree_prefix : NULL;
    -+
    -+		if (prefix)
    -+			name = relative_path(name, prefix, data->sb_scratch);
    -+		quote_c_style(name, data->sb_tmp, NULL, 0);
    -+		strbuf_add(sb, data->sb_tmp->buf, data->sb_tmp->len);
    -+
    -+		strbuf_reset(data->sb_tmp);
    -+		/* The relative_path() function resets "scratch" */
    ++		struct strbuf quoted = STRBUF_INIT;
    ++		struct strbuf s = STRBUF_INIT;
    ++		strbuf_addstr(data->base, data->pathname);
    ++		name = relative_path(data->base->buf, prefix, &s);
    ++		quote_c_style(name, &quoted, NULL, 0);
    ++		strbuf_addbuf(sb, &quoted);
    ++		strbuf_release(&s);
    ++		strbuf_release(&quoted);
     +	} else {
    -+		unsigned int errlen = (unsigned long)len;
    -+		die(_("bad ls-tree format specifiec %%%.*s"), errlen, start);
    ++		errlen = (unsigned long)len;
    ++		die(_("bad ls-tree format: %%%.*s"), errlen, start);
     +	}
    -+
     +	return len;
     +}
     +
    - static int show_tree_init(enum object_type *type, struct strbuf *base,
    - 			  const char *pathname, unsigned mode, int *retval)
    + static int show_recursive(const char *base, size_t baselen,
    + 			  const char *pathname)
      {
    -@@ builtin/ls-tree.c: static int show_tree_init(enum object_type *type, struct strbuf *base,
    +@@ builtin/ls-tree.c: static int show_recursive(const char *base, size_t baselen,
      	return 0;
      }
      
    +-static int show_default(const struct object_id *oid, enum object_type type,
    +-			const char *pathname, unsigned mode,
    +-			struct strbuf *base)
     +static int show_tree_fmt(const struct object_id *oid, struct strbuf *base,
     +			 const char *pathname, unsigned mode, void *context)
    -+{
    -+	struct read_tree_ls_tree_data *data = context;
    -+	struct expand_ls_tree_data my_data = {
    + {
    +-	size_t baselen = base->len;
    ++	size_t baselen;
    ++	int recurse = 0;
    ++	struct strbuf line = STRBUF_INIT;
    ++	enum object_type type = object_type(mode);
    ++
    ++	struct expand_ls_tree_data data = {
     +		.mode = mode,
    -+		.type = OBJ_BLOB,
    ++		.type = type,
     +		.oid = oid,
     +		.pathname = pathname,
    -+		.sb_scratch = &data->sb_scratch,
    -+		.sb_tmp = &data->sb_tmp,
    ++		.base = base,
     +	};
    -+	struct strbuf sb = STRBUF_INIT;
    -+	int retval = 0;
    -+	size_t baselen;
     +
    -+	if (show_tree_init(&my_data.type, base, pathname, mode, &retval))
    -+		return retval;
    ++	if (type == OBJ_TREE && show_recursive(base->buf, base->len, pathname))
    ++		recurse = READ_TREE_RECURSIVE;
    ++	if (type == OBJ_TREE && recurse && !(ls_options & LS_SHOW_TREES))
    ++		return recurse;
    ++	if (type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
    ++		return 0;
     +
     +	baselen = base->len;
    -+	strbuf_addstr(base, pathname);
    -+	strbuf_reset(&sb);
    -+	my_data.basebuf = base->buf;
    -+
    -+	strbuf_expand(&sb, data->format, expand_show_tree, &my_data);
    -+	strbuf_addch(&sb, line_termination);
    -+	fwrite(sb.buf, sb.len, 1, stdout);
    ++	strbuf_expand(&line, format, expand_show_tree, &data);
    ++	strbuf_addch(&line, line_termination);
    ++	fwrite(line.buf, line.len, 1, stdout);
    ++	strbuf_release(&line);
     +	strbuf_setlen(base, baselen);
    -+
    -+	return retval;
    ++	return recurse;
     +}
     +
    - static int show_tree(const struct object_id *oid, struct strbuf *base,
    - 		const char *pathname, unsigned mode, void *context)
    - {
    ++static int show_default(struct expand_ls_tree_data *data)
    ++{
    ++	size_t baselen = data->base->len;
    + 
    + 	if (shown_fields & FIELD_SIZE) {
    + 		char size_text[24];
    +-		if (type == OBJ_BLOB) {
    ++		if (data->type == OBJ_BLOB) {
    + 			unsigned long size;
    +-			if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
    ++			if (oid_object_info(the_repository, data->oid, &size) == OBJ_BAD)
    + 				xsnprintf(size_text, sizeof(size_text), "BAD");
    + 			else
    + 				xsnprintf(size_text, sizeof(size_text),
    +@@ builtin/ls-tree.c: static int show_default(const struct object_id *oid, enum object_type type,
    + 		} else {
    + 			xsnprintf(size_text, sizeof(size_text), "-");
    + 		}
    +-		printf("%06o %s %s %7s\t", mode, type_name(type),
    +-		find_unique_abbrev(oid, abbrev), size_text);
    ++		printf("%06o %s %s %7s\t", data->mode, type_name(data->type),
    ++		find_unique_abbrev(data->oid, abbrev), size_text);
    + 	} else {
    +-		printf("%06o %s %s\t", mode, type_name(type),
    +-		find_unique_abbrev(oid, abbrev));
    ++		printf("%06o %s %s\t", data->mode, type_name(data->type),
    ++		find_unique_abbrev(data->oid, abbrev));
    + 	}
    +-	baselen = base->len;
    +-	strbuf_addstr(base, pathname);
    +-	write_name_quoted_relative(base->buf,
    ++	baselen = data->base->len;
    ++	strbuf_addstr(data->base, data->pathname);
    ++	write_name_quoted_relative(data->base->buf,
    + 				   chomp_prefix ? ls_tree_prefix : NULL, stdout,
    + 				   line_termination);
    +-	strbuf_setlen(base, baselen);
    ++	strbuf_setlen(data->base, baselen);
    + 	return 1;
    + }
    + 
    +@@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strbuf *base,
    + 	size_t baselen;
    + 	enum object_type type = object_type(mode);
    + 
    ++	struct expand_ls_tree_data data = {
    ++		.mode = mode,
    ++		.type = type,
    ++		.oid = oid,
    ++		.pathname = pathname,
    ++		.base = base,
    ++	};
    ++
    + 	if (type == OBJ_TREE && show_recursive(base->buf, base->len, pathname))
    + 		recurse = READ_TREE_RECURSIVE;
    + 	if (type == OBJ_TREE && recurse && !(ls_options & LS_SHOW_TREES))
    +@@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strbuf *base,
    + 	}
    + 
    + 	if (shown_fields >= FIELD_DEFAULT)
    +-		show_default(oid, type, pathname, mode, base);
    ++		show_default(&data);
    + 
    + 	return recurse;
    + }
     @@ builtin/ls-tree.c: int cmd_ls_tree(int argc, const char **argv, const char *prefix)
      	struct object_id oid;
      	struct tree *tree;
      	int i, full_tree = 0;
    -+	const char *implicit_format = NULL;
    -+	const char *format = NULL;
    -+	struct read_tree_ls_tree_data read_tree_cb_data = {
    -+		.sb_scratch = STRBUF_INIT,
    -+		.sb_tmp = STRBUF_INIT,
    -+	};
    ++	read_tree_fn_t fn = show_tree;
      	const struct option ls_tree_options[] = {
      		OPT_BIT('d', NULL, &ls_options, N_("only show trees"),
      			LS_TREE_ONLY),
    @@ builtin/ls-tree.c: int cmd_ls_tree(int argc, const char **argv, const char *pref
      		OPT_BOOL(0, "full-tree", &full_tree,
      			 N_("list entire tree; not just current directory "
      			    "(implies --full-name)")),
    -+		OPT_STRING_F(0 , "format", &format, N_("format"),
    -+			     N_("format to use for the output"), PARSE_OPT_NONEG),
    ++		OPT_STRING_F(0, "format", &format, N_("format"),
    ++			     N_("format to use for the output"),
    ++			     PARSE_OPT_NONEG),
      		OPT__ABBREV(&abbrev),
      		OPT_END()
      	};
    -+	read_tree_fn_t fn = show_tree;
    - 
    - 	git_config(git_default_config, NULL);
    - 	ls_tree_prefix = prefix;
     @@ builtin/ls-tree.c: int cmd_ls_tree(int argc, const char **argv, const char *prefix)
    - 	if ( (LS_TREE_ONLY|LS_RECURSIVE) ==
      	    ((LS_TREE_ONLY|LS_RECURSIVE) & ls_options))
      		ls_options |= LS_SHOW_TREES;
    -+	if (ls_options & LS_NAME_ONLY)
    -+		implicit_format = ls_tree_format_n;
    -+	if (ls_options & LS_SHOW_SIZE)
    -+		implicit_format = ls_tree_format_l;
    -+
    -+	if (format && implicit_format)
    -+		usage_msg_opt(_("providing --format cannot be combined with other format-altering options"),
    -+			      ls_tree_usage, ls_tree_options);
    -+	if (implicit_format)
    -+		format = implicit_format;
    -+	if (!format)
    -+		format = ls_tree_format_d;
      
    ++	if (format && cmdmode)
    ++		usage_msg_opt(
    ++			_("--format can't be combined with other format-altering options"),
    ++			ls_tree_usage, ls_tree_options);
      	if (argc < 1)
      		usage_with_options(ls_tree_usage, ls_tree_options);
    + 	if (get_oid(argv[0], &oid))
     @@ builtin/ls-tree.c: int cmd_ls_tree(int argc, const char **argv, const char *prefix)
      	tree = parse_tree_indirect(&oid);
      	if (!tree)
      		die("not a tree object");
    +-	return !!read_tree(the_repository, tree,
    +-			   &pathspec, show_tree, NULL);
     +
     +	/*
     +	 * The generic show_tree_fmt() is slower than show_tree(), so
     +	 * take the fast path if possible.
     +	 */
    -+	if (format && (!strcmp(format, ls_tree_format_d) ||
    -+		       !strcmp(format, ls_tree_format_l) ||
    -+		       !strcmp(format, ls_tree_format_n)))
    ++	if (format &&
    ++	    (!strcmp(format, ls_tree_format_d) ||
    ++	     !strcmp(format, ls_tree_format_l) ||
    ++	     !strcmp(format, ls_tree_format_n) ||
    ++	     !strcmp(format, ls_tree_format_o)))
     +		fn = show_tree;
     +	else if (format)
     +		fn = show_tree_fmt;
    -+	/*
    -+	 * Allow forcing the show_tree_fmt(), to test that it can
    -+	 * handle the test suite.
    -+	 */
    -+	if (git_env_bool("GIT_TEST_LS_TREE_FORMAT_BACKEND", 0))
    -+		fn = show_tree_fmt;
     +
    -+	read_tree_cb_data.format = format;
    - 	return !!read_tree(the_repository, tree,
    --			   &pathspec, show_tree, NULL);
    -+			   &pathspec, fn, &read_tree_cb_data);
    ++	return !!read_tree(the_repository, tree, &pathspec, fn, NULL);
      }
     
      ## t/t3105-ls-tree-format.sh (new) ##
    @@ t/t3105-ls-tree-format.sh (new)
     +. ./test-lib.sh
     +
     +test_expect_success 'ls-tree --format usage' '
    -+	test_expect_code 129 git ls-tree --format=fmt -l &&
    -+	test_expect_code 129 git ls-tree --format=fmt --name-only &&
    -+	test_expect_code 129 git ls-tree --format=fmt --name-status
    ++	test_expect_code 129 git ls-tree --format=fmt -l HEAD &&
    ++	test_expect_code 129 git ls-tree --format=fmt --name-only HEAD &&
    ++	test_expect_code 129 git ls-tree --format=fmt --name-status HEAD &&
    ++	test_expect_code 129 git ls-tree --format=fmt --object-only HEAD
     +'
     +
     +test_expect_success 'setup' '
    @@ t/t3105-ls-tree-format.sh (new)
     +	test_ls_tree_format \
     +		"%(path)" \
     +		"--name-only"
    ++'
     +
    ++test_expect_success 'ls-tree --format=<object-only-like>' '
    ++	test_ls_tree_format \
    ++		"%(objectname)" \
    ++		"--object-only"
     +'
     +
     +test_done

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v10 5/9] ls-tree: optimize naming and handling of "return" in show_tree()
  2022-01-13  6:49                   ` Ævar Arnfjörð Bjarmason
@ 2022-01-14  7:59                     ` Teng Long
  2022-01-14 12:00                       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 224+ messages in thread
From: Teng Long @ 2022-01-14  7:59 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Johannes Schindelin, Đoàn Trần Công Danh,
	Git Mailing List, Junio C Hamano, Jeff King, tenglong.tl,
	Martin Ågren, Teng Long

On Thu, Jan 13, 2022 at 2:55 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:

> Re the $subject: Is "optimize naming" here just referring to the
> s/retval/recurse/g?

Yes.

> Personally I think just a s/retval/ret/g here would make more sense if
> we're doing any change at all, and in either case having this variable
> re-rename split up as its own commit would make the proposed control
> flow changes clearer.

Do you mean that I should split the current commit into two, one that does
the renaming and another that does the remaining work?

If so, I will do this in the next patch.

>
> This new function is a re-invention of the object_type() utility in
> cache.h, and isn't needed. I.e....
>
> ...just drop it and do this:
>
>         -       enum object_type type = get_type(mode);
>         +       enum object_type type = object_type(mode);

You are absolutely correct.
I will replace get_type() with object_type() in the next patch.
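
For readers following along, here is a minimal sketch of the mode-to-type
mapping under discussion. It is a standalone illustration, not the code from
cache.h: the `sketch_` names and the locally defined mode constants are
assumptions made so that it compiles outside git.git. As far as I know the
real object_type() follows the same shape (directory means tree, gitlink
means commit, anything else means blob).

```c
/* Local stand-ins so this compiles outside git.git (hypothetical names). */
enum sketch_object_type {
	SKETCH_OBJ_COMMIT = 1,
	SKETCH_OBJ_TREE = 2,
	SKETCH_OBJ_BLOB = 3
};

#define SKETCH_IFMT      0170000 /* mask for the file-type bits of a mode */
#define SKETCH_IFDIR     0040000 /* tree entry (directory) */
#define SKETCH_IFGITLINK 0160000 /* submodule entry (gitlink) */

/* Map a tree-entry mode to the type of object it points at. */
static enum sketch_object_type sketch_object_type(unsigned int mode)
{
	if ((mode & SKETCH_IFMT) == SKETCH_IFDIR)
		return SKETCH_OBJ_TREE;
	if ((mode & SKETCH_IFMT) == SKETCH_IFGITLINK)
		return SKETCH_OBJ_COMMIT;
	return SKETCH_OBJ_BLOB;
}
```

With a helper of this shape already available, a second function that does
the same job only adds code to review, which is why dropping get_type() is
the simpler choice.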

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v10 6/9] ls-tree.c: support --object-only option for "git-ls-tree"
  2022-01-13  6:59                   ` Ævar Arnfjörð Bjarmason
@ 2022-01-14  8:18                     ` Teng Long
  2022-01-14 11:47                       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 224+ messages in thread
From: Teng Long @ 2022-01-14  8:18 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Johannes Schindelin, Đoàn Trần Công Danh,
	Git Mailing List, Junio C Hamano, Jeff King, tenglong.tl,
	Martin Ågren

On Thu, Jan 13, 2022 at 3:02 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:

> In the RFC series I sent this was first implemented in terms of the
> --format option, and I skipped the custom implementation you're adding
> here:
> https://lore.kernel.org/git/RFC-patch-7.7-5e34df4f8dd-20211217T131635Z-avarab@gmail.com/
>
> I think in terms of patch series structure it would make sense to do
> that, and then have this custom --object-only implementation in terms of
> not-"--format " follow from that, and thus with the tests for the two

Sorry, what does 'not-"--format"' mean?

> (we'd add the tests you're adding here first, just for a
> --format="%(objectname)" or whatever) we'd see that the two are 1=1
> equivalent in terms of functionality, but that this one is <X>% more
> optimized.

Let me check that I understand your advice: if the commit introducing
"--format" comes before the commit introducing "--object-only", the series
is better because the latter's commit message can then state how much faster
the optimized implementation is (if it is faster at all).

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v10 7/9] ls-tree.c: introduce struct "show_tree_data"
  2022-01-13  7:03                   ` Ævar Arnfjörð Bjarmason
@ 2022-01-14  9:12                     ` Teng Long
  0 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-14  9:12 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Johannes Schindelin, Đoàn Trần Công Danh,
	Git Mailing List, Junio C Hamano, Jeff King, tenglong.tl,
	Martin Ågren

On Thu, Jan 13, 2022 at 3:07 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:

> > "show_tree_data" is a struct that packages the necessary fields for
> > "show_tree()". This commit is a pre-prepared commit for supporting
> > "--format" option and it does not affect any existing functionality.
>
> Is the only reason this is split off from 9/9 because you're injecting a
> 8/9 commit for the coccinelle rule change, and wanted to find some
> logical cut-off between the two?

I want "show_tree()" and "show_tree_format()" to share this structure,
so I made this a preparatory commit with no functional changes.

After that, 9/9 can use the structure directly and focus on the functional
changes. If we merged this commit into 9/9, 9/9 would also contain the
changes that make "show_tree()" use the new structure, which have nothing
to do with "show_tree_format()", because the two are designed to go
through different execution paths. So, personally, I prefer not to mix them
together.

So the "show_tree_data" commit was not originally split off because of
"coccinelle". The only thing that is certain is that the coccinelle change
should also go before 9/9, I think. As for 8/9 and 7/9, I think the current
order is OK because they are not related.
>
> For both this & 9/9 this seems to mostly/substantially be code I wrote
> and submitted as part of
> https://lore.kernel.org/git/RFC-patch-6.7-eac299f06ff-20211217T131635Z-avarab@gmail.com;
>
> The convention we use in such cases is to retain the "Author" header and
> just add your own Signed-off-by to patches you're modifying/splitting
> up.

Oops. Sorry for that, I misunderstood it before, and I'll fix it in
the next patch.

Thanks.

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v10 6/9] ls-tree.c: support --object-only option for "git-ls-tree"
  2022-01-14  8:18                     ` Teng Long
@ 2022-01-14 11:47                       ` Ævar Arnfjörð Bjarmason
  2022-01-18  9:55                         ` Teng Long
  0 siblings, 1 reply; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-14 11:47 UTC (permalink / raw)
  To: Teng Long
  Cc: Johannes Schindelin, Đoàn Trần Công Danh,
	Git Mailing List, Junio C Hamano, Jeff King, tenglong.tl,
	Martin Ågren


On Fri, Jan 14 2022, Teng Long wrote:

> On Thu, Jan 13, 2022 at 3:02 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>
>> In the RFC series I sent this was first implemented in terms of the
>> --format option, and I skipped the custom implementation you're adding
>> here:
>> https://lore.kernel.org/git/RFC-patch-7.7-5e34df4f8dd-20211217T131635Z-avarab@gmail.com/
>>
>> I think in terms of patch series structure it would make sense to do
>> that, and then have this custom --object-only implementation in terms of
>> not-"--format " follow from that, and thus with the tests for the two
>
> Sorry, what does 'not-"--format"' mean?

Sorry about not being clear. I mean there are two potential
implementations. One that is in terms of --format='%(objectname)', and
the other with your custom (faster) code to implement it.

>> (we'd add the tests you're adding here first, just for a
>> --format="%(objectname)" or whatever) we'd see that the two are 1=1
>> equivalent in terms of functionality, but that this one is <X>% more
>> optimized.
>
> Let me check that I understand your advice: if the commit introducing
> "--format" comes before the commit introducing "--object-only", the series
> is better because the latter's commit message can then state how much faster
> the optimized implementation is (if it is faster at all).

Yes, you get the functionality you need with a simple alias of
--format='%(objectname)' to --object-name (or whatever), so the only
reason to carry the extra code is for optimization.

I wonder if the extra difference in performance is still something you
care about, or if just the --format implementation would be OK.

But in any case, starting with a simpler implementation and testing it
makes the progression easier to reason about.

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v10 5/9] ls-tree: optimize naming and handling of "return" in show_tree()
  2022-01-14  7:59                     ` Teng Long
@ 2022-01-14 12:00                       ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-14 12:00 UTC (permalink / raw)
  To: Teng Long
  Cc: Johannes Schindelin, Đoàn Trần Công Danh,
	Git Mailing List, Junio C Hamano, Jeff King, tenglong.tl,
	Martin Ågren, Teng Long


On Fri, Jan 14 2022, Teng Long wrote:

> On Thu, Jan 13, 2022 at 2:55 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>
>> Re the $subject: Is "optimize naming" here just referring to the
>> s/retval/recurse/g?
>
> Yes.
>
>> Personally I think just a s/retval/ret/g here would make more sense if
>> we're doing any change at all, and in either case having this variable
>> re-rename split up as its own commit would make the proposed control
>> flow changes clearer.
>
> Do you mean that I should split the current commit into two, one that does
> the renaming and another that does the remaining work?
>
> If so, I will do this in the next patch.

Yes, at least I would find it easier to read :)

>>
>> This new function is a re-invention of the object_type() utility in
>> cache.h, and isn't needed. I.e....
>>
>> ...just drop it and do this:
>>
>>         -       enum object_type type = get_type(mode);
>>         +       enum object_type type = object_type(mode);
>
> You are absolutely correct.
> I will replace get_type() with object_type() in the next patch.

Thanks!

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v10 6/9] ls-tree.c: support --object-only option for "git-ls-tree"
  2022-01-14 11:47                       ` Ævar Arnfjörð Bjarmason
@ 2022-01-18  9:55                         ` Teng Long
  2022-02-04 12:58                           ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 224+ messages in thread
From: Teng Long @ 2022-01-18  9:55 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Johannes Schindelin, Đoàn Trần Công Danh,
	Git Mailing List, Junio C Hamano, Jeff King, tenglong.tl,
	Martin Ågren

On Fri, Jan 14, 2022 at 7:59 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:

> Yes, you get the functionality you need with a simple alias of
> --format='%(objectname)' to --object-name (or whatever), so the only
> reason to carry the extra code is for optimization.
>
> I wonder if the extra difference in performance is still something you
> care about, or if just the --format implementation would be OK.
>
> But in any case, starting with a simpler implementation and testing it
> makes the progression easier to reason about.

Actually, at first I wanted to achieve this in a simple way, with the
"--object-only" implementation.

After the discussion in the community, I think both of them can achieve
this purpose. "--object-only" is more intuitive, while "--format" is
more flexible.
For example, if the terminal supports automatic completion, typing TAB makes
the purpose of this option obvious, which lowers the cost of using and
understanding it. "--format" also works, but one may have to check the help
document to see whether there are fields that serve the same purpose.

The community had different opinions about it. Junio might prefer
an "--object-only" approach, if I understand the context correctly.

So I am inclined to support both. However, I can accept that only
"--format" is supported.

So in the next patch, I hope to refactor the commits so that support for
"--object-only" is the top commit. If in the end we decide that "--format" is
enough, we can discard the top "--object-only" commit.

I know you are currently busy with the new 2.35 release, so a later reply
is OK.

Thanks.

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v10 9/9] ls-tree.c: introduce "--format" option
  2022-01-13  7:16                   ` Ævar Arnfjörð Bjarmason
@ 2022-01-18 12:59                     ` Teng Long
  0 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-01-18 12:59 UTC (permalink / raw)
  To: avarab
  Cc: Johannes.Schindelin, congdanhqx, dyroneteng, git, gitster,
	johncai86, martin.agren, peff, tenglong.tl

On Thu, Jan 13, 2022 at 5:04 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:

> > diff --git a/Documentation/git-ls-tree.txt b/Documentation/git-ls-tree.txt
> > index 729370f235..ebdde6eae3 100644
> > --- a/Documentation/git-ls-tree.txt
> > +++ b/Documentation/git-ls-tree.txt
> > @@ -10,9 +10,9 @@ SYNOPSIS
> >  --------
> >  [verse]
> >  'git ls-tree' [-d] [-r] [-t] [-l] [-z]
> > -         [--name-only] [--name-status] [--object-only] [--full-name] [--full-tree] [--abbrev[=<n>]]
> > -         <tree-ish> [<path>...]
> > -
> > +         [--name-only] [--name-status] [--object-only]
> > +         [--full-name] [--full-tree] [--abbrev[=<n>]]
>
> Let's split up this re-flow only change into its own commit? I.e. the
> only non-whitespace change here is beginning with [--format].
>
> If it was the right thing to do to re-flow this then we didn't need
> [--format=<format>] to exist to do so...

Agreed, especially if "--format" comes earlier as you mentioned in
another reply.

The doc change should only include the new "--format" here, so
we just:

diff --git a/Documentation/git-ls-tree.txt b/Documentation/git-ls-tree.txt
index db02d6d79a..b02f028aca 100644
--- a/Documentation/git-ls-tree.txt
+++ b/Documentation/git-ls-tree.txt
@@ -10,5 +10,5 @@ SYNOPSIS
 --------
 [verse]
 'git ls-tree' [-d] [-r] [-t] [-l] [-z]
-           [--name-only] [--name-status] [--full-name] [--full-tree] [--abbrev[=<n>]]
+           [--name-only] [--name-status] [--object-only] [--full-name] [--full-tree] [--abbrev[=<n>]] [--format=<format>]
            <tree-ish> [<path>...]

is OK.

> > +         [--format=<format>] <tree-ish> [<path>...]
> >  DESCRIPTION
>
> Removing this \n breaks the formatting in the file. See "make man && man
> ./Documentation/git-ls-tree.1". The ./Documentation/doc-diff utility is
> also handy for sanity checking the documentation formatting.

This is my mistake and will be corrected in the next patch.

> >  -----------
> >  Lists the contents of a given tree object, like what "/bin/ls -a" does
> > @@ -79,6 +79,16 @@ OPTION
> >       Do not limit the listing to the current working directory.
> >       Implies --full-name.
> >
> > +--format=<format>::
> > +     A string that interpolates `%(fieldname)` from the result
> > +     being shown. It also interpolates `%%` to `%`, and
> > +     `%xx` where `xx` are hex digits interpolates to character
> > +     with hex code `xx`; for example `%00` interpolates to
> > +     `\0` (NUL), `%09` to `\t` (TAB) and `%0a` to `\n` (LF).
> > +     When specified, `--format` cannot be combined with other
> > +     format-altering options, including `--long`, `--name-only`
> > +     and `--object-only`.
> > +
>
> These new docs make sense & seem to cover all the basis, thanks!

Actually, the content is not mine; I borrowed it from other
documentation, but I'm glad that it is described and placed correctly here.
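
To make the interpolation rules in the quoted paragraph concrete, here is a
small standalone sketch of an expander that follows them as worded:
`%(fieldname)` is replaced by a field value, `%%` becomes `%`, and `%` followed
by two hex digits becomes the byte with that code. Everything here
(`sketch_entry`, `sketch_expand`, the four field names) is hypothetical and
written for illustration; it is not the formatting machinery the patch uses,
and it does not handle the `%x09` spelling that appears in format strings
elsewhere in this thread.

```c
#include <stddef.h>
#include <string.h>

/* One tree entry, already rendered as strings (hypothetical layout). */
struct sketch_entry {
	const char *mode, *type, *object, *file;
};

/* Return the value for a field name of the given length, or NULL. */
static const char *sketch_lookup(const struct sketch_entry *e,
				 const char *name, size_t len)
{
	if (len == 4 && !strncmp(name, "mode", 4))
		return e->mode;
	if (len == 4 && !strncmp(name, "type", 4))
		return e->type;
	if (len == 6 && !strncmp(name, "object", 6))
		return e->object;
	if (len == 4 && !strncmp(name, "file", 4))
		return e->file;
	return NULL;
}

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int sketch_hexval(int c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Expand fmt into out, NUL-terminating the result. Returns the number
 * of bytes written (excluding the NUL), or -1 on an unknown field, an
 * unterminated "%(" or insufficient space.
 */
static int sketch_expand(char *out, size_t outlen, const char *fmt,
			 const struct sketch_entry *e)
{
	size_t n = 0;

	while (*fmt) {
		int hi, lo;
		char c;

		if (*fmt == '%' && fmt[1] == '(') {
			const char *end = strchr(fmt + 2, ')');
			const char *val;
			size_t vlen;

			if (!end)
				return -1;
			val = sketch_lookup(e, fmt + 2, (size_t)(end - (fmt + 2)));
			if (!val)
				return -1;
			vlen = strlen(val);
			if (n + vlen >= outlen)
				return -1;
			memcpy(out + n, val, vlen);
			n += vlen;
			fmt = end + 1;
			continue;
		}

		if (*fmt == '%' && fmt[1] == '%') {
			c = '%';
			fmt += 2;
		} else if (*fmt == '%' &&
			   (hi = sketch_hexval(fmt[1])) >= 0 &&
			   (lo = sketch_hexval(fmt[2])) >= 0) {
			c = (char)(hi * 16 + lo);
			fmt += 3;
		} else {
			/* Ordinary byte, or a lone '%' copied literally. */
			c = *fmt++;
		}
		if (n + 1 >= outlen)
			return -1;
		out[n++] = c;
	}
	out[n] = '\0';
	return (int)n;
}
```

For an entry whose object is "abc123" and whose file is "README", expanding
"%(object)%09%(file)" yields the two values separated by a TAB. Because `%00`
can put a NUL byte in the middle of the output, a caller has to rely on the
returned length rather than on strlen().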

> Here because we've added --format discussing the previous pseudo-format
> as a "default" format becomes confusing. Let's instead say:
>
>         The output format of `ls-tree` is determined by either the `--format` option,
>         or other format-altering options such as `--name-long` etc. (see `--format` above).
>
>         The use of certain `--format` directives is equivalent to using those options,
>         but invoking the full formatting machinery can be slower than using an appropriate
>         formatting option.
>
>         In cases where the `--format` would exactly map to an existing option `ls-tree` will
>         use the appropriate faster path. Thus the default format is equivalent to:
>         ---
>         %(mode) %(type) %(object)%x09%(file)
>         ---

Makes sense.

I will use this paragraph instead in the next patch, except for a tiny
nit (s/--name-long/--name-only/).

> > +The output format changes to:
> > +
> > +        {"object":"<object>", "file":"<file>"}
>
> This one-liner is guaranteed to result in invalid JSON on some
> repositories, both because JSON is inherently a bad fit for git's data
> model (JSON needs to be in one Unicode encoding, Git's tree data might
> be in a mixture of encodings), and because it'll break if the file
> includes a '"'.

Correct, especially as we use "quote_c" style quoting behind the scenes.

> I think it's better to just replace this with some example involving -z,
> or at least prominently note that this is broken in the general case,
> but can be used ad-hoc to quickly check things with "jq" or whatever.

Your suggestion is great, but personally I don't want to introduce more
complexity and other tools here, and would rather describe it in a simple way.
I think the following may be enough:

@@ -117,14 +127,10 @@ Customized format:
 
 It is possible to print in a custom format by using the `--format` option,
 which is able to interpolate different fields using a `%(fieldname)` notation.
-For example, if you want to only print the <object> and <file> fields with a
-JSON style, executing with a specific "--format" like
-
-        git ls-tree --format='{"object":"%(object)", "file":"%(file)"}' <tree-ish>
-
-The output format changes to:
+For example, if you only care about the <object> and <file> fields, you can
+execute with a specific "--format" like
 
-        {"object":"<object>", "file":"<file>"}
+        git ls-tree --format="%(object) %(file)" <tree-ish>
 
 FIELD NAMES
 -----------
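
As a side note on why the JSON example is being dropped, the breakage
described above is easy to reproduce without git at all. The sketch below is
purely illustrative: `naive_json_line()` is a hypothetical helper that pastes
values between quotes with no escaping, which is what a plain `--format`
template amounts to, and `count_quotes()` only exists to make the damage
visible.

```c
#include <stddef.h>
#include <stdio.h>

/* Paste object and path into a JSON-looking template, with no escaping. */
static int naive_json_line(char *out, size_t outlen,
			   const char *object, const char *path)
{
	return snprintf(out, outlen,
			"{\"object\":\"%s\", \"file\":\"%s\"}",
			object, path);
}

/* Count the double-quote characters in s. */
static int count_quotes(const char *s)
{
	int n = 0;

	for (; *s; s++)
		if (*s == '"')
			n++;
	return n;
}
```

A line built from an ordinary path contains exactly eight double quotes (two
keys and two values). A path that itself contains a '"' adds a ninth, so the
string value ends early and the line no longer parses as JSON. The encoding
problem mentioned above is separate and is not shown here.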

> > +FIELD NAMES
> > +-----------
> > +
> > +Various values from structured fields can be used to interpolate
> > +into the resulting output. For each outputing line, the following
> > +names can be used:
> > +
> > +mode::
> > +     The mode of the object.
> > +type::
> > +     The type of the object (`blob` or `tree`).
> > +object::
> > +     The name of the object.
> > +size[:padded]::
> > +     The size of the object ("-" if it's a tree).
> > +     It also supports a padded format of size with "%(size:padded)".
> > +file::
> > +     The filename of the object.
>
> In
> https://lore.kernel.org/git/cover.1641043500.git.dyroneteng@gmail.com/
> you noted that you changed the field names of e.g. "objectname" to
> "object" etc. You're right that I picked these as-is from the
> git-for-each-ref formatting.
>
> 1/3 of your reasoning for doing so was to make it consistent with the
> documentation examples of e.g.:
>
>      <mode> SP <type> SP <object> TAB <file>
>
> I think in any case (as noted above) we should change those to use the
> --format), so that leaves just:
>
>  - "I prefer to make the name more simple to memorize and type"
>  - "I think the names with "object" prefix are [from git-for-each-ref
>    and the object* prefixes aren't redundant there, but would be here]".
>
> I think both of those still apply, but I think having these consistent
> with git-for-each-ref outweighs the slight benefit of shorter names.
>
> Right now only a handful of things support these sort of --format
> directives, but we've already got RFC/WIP patches to add that to
> git-cat-file, and are likely to add more in the future.

New and important input for me on this.

> I'd also like us to eventually be able to combine what are now separate
> built-ins with their own --format to expose more deeply some internal
> APIs via IPC. E.g. now you can do this:
>
>     git for-each-ref --format='%(refname) %(tree)'
>
> But to list each of those trees you'd need to pipe that output into this
> new 'git ls-tree --format'. But imagine being able to do something like:
>
>     git for-each-ref --format='%(refname) %(git-ls-tree --format %%(objectname) %(tree))'

Makes sense.

> Where we'd just invoke git-ls-tree for you without running a full
> sub-process. I think both for that hypothetical and working with the two
> --formats now having to use %(type) in some places but %(objecttype)
> etc. in others is just needlessly confusing. Let's just consistently use
> the same format names everywhere.
>
> Specifically for your s/path/file/ name change, that's just inaccurate, consider:
>
>     $ ./git ls-tree --format="%(mode) %(type) %(file)" -t HEAD -- t/README
>     040000 tree t
>     100644 blob t/README
>
> And:
>
>     $ (cd t && ../git ls-tree --format="%(mode) %(type) %(file)" -t -r HEAD -- README)
>     040000 tree ./
>     100644 blob README
>
> I.e. we talk about <path> in the existing SYNOPSIS for a reason. That we
> had a "<file>" in the existing format demo was a bug/shorthand that we
> shouldn't be propagating further.

We should use "path" instead of "file" here.

Makes sense.

> > [...]
> > +static const char *format;
> > +static const char *default_format = "%(mode) %(type) %(object)%x09%(file)";
> > +static const char *long_format = "%(mode) %(type) %(object) %(size:padded)%x09%(file)";
> > +static const char *name_only_format = "%(file)";
> > +static const char *object_only_format = "%(object)";
> > +
>
> One advantage of keeping the variable names I picked in
> https://lore.kernel.org/git/RFC-patch-6.7-eac299f06ff-20211217T131635Z-avarab@gmail.com/
> is that they align, so you can instantly see that the first two are
> equivalent until the "%x09".

Ha, thanks. I think they do not align here because my variable names do not align :)

Actually, I hesitated between using names like "object" and following the
"objectname"-style rules. I would like git to have a unified naming style for
this, but I didn't have that much input at the time (such as the other usages
already on the way), so I chose shorter and probably more memorable names
based on the documentation.

But now I agree with you that we should use the same naming conventions:
when multiple commands have the same needs for format naming, uniformity
matters more than the memorability of a single command. I also think we
might need to describe and maintain these <fieldname> rules in a document
somewhere.

> It also makes it easier to review to avoid such churn, to see what you
> really changed I'm looking at a local version of a range-diff where I
> renamed these, the struct you renamed etc. back just to see what you
> /really/ changed. I.e. what are functional v.s. renaming changes.
>
> Here you changed my '"%"PRIuMAX' to '"%" PRIuMAX'. The former is the
> prevailing style in this codebase, and avoiding the formatting churn
> makes the inter-diff easier to read.

Will fix it in the next patch.

> Ditto some harder to review interdiff due to renaming
> churn. I.e. s/line/sb/ in both this and expand_show_tree(). I really
> wouldn't care at all except because of all the manual work in reviewing
> the inter-diff between my original version & this derived version.
>
> In the case of "line" that's not even an improvement. With a --format
> we're not building a "line", the user is free to insert any arbitrary
> directives including \n's, so we might be working on multiple lines.

Makes sense.

Will improve this in the next patch.

> As I noted in my RFC CL (https://lore.kernel.org/git/RFC-cover-0.7-00000000000-20211217T131635Z-avarab@gmail.com/):
>
>         "the tests for ls-tree are really
>         lacking. E.g. I seem to have a rather obvious bug in how -t and the
>         --format interact here, but no test catches it."
>
> So first, in my version of adding --format I was careful to make
> --name-only etc. imply a given --format, and then only at the last
> minute would we take the "fast path":
> https://lore.kernel.org/git/RFC-patch-6.7-eac299f06ff-20211217T131635Z-avarab@gmail.com/

I'm not sure I understand everything above; I think I have forgotten some of the
context here. As I understand it, I have to add some tests for other options
like "-t" combined with "--format".

Please correct me if I misunderstood.
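
To check my understanding of the design described above, here is a minimal
standalone sketch of it: every format-altering option first implies a format
string, and only at the very end do we decide whether the fast path can be
taken. The `sketch_` names, the enums and the idea of returning a backend tag
instead of a function pointer are all assumptions made for illustration; the
format strings are the ones quoted elsewhere in this message.

```c
#include <stddef.h>
#include <string.h>

/* Format strings as quoted in this message; illustrative only. */
static const char *const sketch_default_format =
	"%(mode) %(type) %(object)%x09%(file)";
static const char *const sketch_long_format =
	"%(mode) %(type) %(object) %(size:padded)%x09%(file)";
static const char *const sketch_name_only_format = "%(file)";
static const char *const sketch_object_only_format = "%(object)";

enum sketch_mode {
	SKETCH_DEFAULT,
	SKETCH_LONG,
	SKETCH_NAME_ONLY,
	SKETCH_OBJECT_ONLY
};

enum sketch_backend {
	SKETCH_FAST,	/* hand-written output code */
	SKETCH_GENERIC	/* generic format expansion */
};

/* Step 1: each format-altering option is just an alias for a format. */
static const char *sketch_implied_format(enum sketch_mode mode)
{
	switch (mode) {
	case SKETCH_LONG:
		return sketch_long_format;
	case SKETCH_NAME_ONLY:
		return sketch_name_only_format;
	case SKETCH_OBJECT_ONLY:
		return sketch_object_only_format;
	default:
		return sketch_default_format;
	}
}

/*
 * Step 2: at the last minute, take the fast path when the format is
 * one of the built-in ones, whether it came from an option or from a
 * user-supplied string with identical content.
 */
static enum sketch_backend sketch_pick_backend(const char *format)
{
	if (!format ||
	    !strcmp(format, sketch_default_format) ||
	    !strcmp(format, sketch_long_format) ||
	    !strcmp(format, sketch_name_only_format) ||
	    !strcmp(format, sketch_object_only_format))
		return SKETCH_FAST;
	return SKETCH_GENERIC;
}
```

With this shape, the option and the equivalent format string are expected to
produce the same output, so the tests can compare the two directly, and any
optimization is confined to the final backend choice.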

> You rewrote that in
> https://lore.kernel.org/git/e0add802fbbabde7e7b3743127b2d4047f1ce760.1641043500.git.dyroneteng@gmail.com/
> and removed the limited "GIT_TEST_LS_TREE_FORMAT_BACKEND" testing I
> added, so now the internal --format machinery can't be run through the
> existing tests we do have.

I thought "GIT_TEST_LS_TREE_FORMAT_BACKEND" was only a testing convenience in
your RFC and should be removed in the final (non-RFC) version, so I removed it.

So should we add it back?

> Even with that re-added I really wouldn't trust that this code is doing
> the right thing (and as noted, I don't trust my own RFC version
> either). I think e.g. our "coverage" Makefile targets would be a good
> start as a first approximation, i.e. running the /ls-tree/ tests and
> seeing if we have full coverage.

Yeah. I haven't tried it yet, but I will try to run the coverage targets when
the next patch is completed.

> As I noted in 7/9 I think this patch is 9/9 still mostly something I
> wrote, so that the "author" and Signed-off-by should be preserved. The
> below is a range-diff of an amended version I've been looking at in
> trying to review this. It undoes several (but not all) of your
> formatting/renaming-only changes, just so that I could see what the
> non-formatting changes were:

Sorry for that. I misunderstood something here.
I will fix it in the next patch.

Thanks.

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v10 6/9] ls-tree.c: support --object-only option for "git-ls-tree"
  2022-01-18  9:55                         ` Teng Long
@ 2022-02-04 12:58                           ` Ævar Arnfjörð Bjarmason
  2022-02-07  2:22                             ` Teng Long
  0 siblings, 1 reply; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-04 12:58 UTC (permalink / raw)
  To: Teng Long
  Cc: Johannes Schindelin, Đoàn Trần Công Danh,
	Git Mailing List, Junio C Hamano, Jeff King, tenglong.tl,
	Martin Ågren


On Tue, Jan 18 2022, Teng Long wrote:

> On Fri, Jan 14, 2022 at 7:59 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>
>> Yes, you get the functionality you need with a simple alias of
>> --format='%(objectname)' to --object-name (or whatever), so the only
>> reason to carry the extra code is for optimization.
>>
>> I wonder if the extra difference in performance is still something you
>> care about, or if just the --format implementation would be OK.
>>
>> But in any case, starting with a simpler implementation and testing it
>> makes the progression easier to reason about.
>
> Actually, at first I wanted to achieve this in a simple way, with the
> "--object-only" implementation.
>
> After the discussion in the community, I think both of them can achieve
> this purpose. "--object-only" is more intuitive, while "--format" is
> more flexible.
> For example, if the terminal supports automatic completion, typing TAB makes
> the purpose of this option obvious, which lowers the cost of using and
> understanding it. "--format" also works, but one may have to check the help
> document to see whether there are fields that serve the same purpose.
>
> The community had different opinions about it. Junio might prefer
> an "--object-only" approach, if I understand the context correctly.
>
> So I am inclined to support both. However, I can accept that only
> "--format" is supported.

I'm only talking about how it's implemented internally, not whether we
have an --object-only option in the UI. I think it's good to have the
option for completion etc.

I.e. in my RFC implementation of it here it's just a trivial wrapper
around specifying a --format:
https://lore.kernel.org/git/RFC-patch-7.7-5e34df4f8dd-20211217T131635Z-avarab@gmail.com/;
Implementing it is 6 lines of trivial C code boilerplate.

But when you picked that up & ran with it you ended up carrying your
original implementation:
https://lore.kernel.org/git/e0274f079a7d381b9a936bfcd53bad64167c18b8.1641440700.git.dyroneteng@gmail.com/

I'm not saying we shouldn't have that, but that in any case a sequence of:

 1. Add a --format option
 2. Add a --object-only alias for a --format (what my RFC 7/7 does)
 3. Add a custom more optimized --object-only implementation

Would make the patch progression much easier to read, and we'd consider
the correctness of --object-only (1 and 2) separate from the
optimization question (3).

But maybe we won't need (3) at all in the end, i.e. is (1 and 2) fast
enough for it not to matter (I think probably "yes", but I don't have a
strong opinion on that).

> So in the next patch, I hope to refactor the commits so that support for
> "--object-only" is the top commit. If in the end we decide that "--format" is
> enough, we can discard the top "--object-only" commit.

*nod*, now that I read ahead I think you pretty much agree with that plan :)

> I know you are currently busy with the new 2.35 release, so a later reply
> is OK.

Now would be a good time :)

I was reminded of this because Junio's proposed it for next at
https://lore.kernel.org/git/xmqqr18jnr2t.fsf@gitster.g/

I think per the above & other replies of mine (including not matters of
code arrangement opinion, but e.g. the doc formatting bug) we'll need at
least one more re-roll of this. Thanks for sticking with this & working
on this!

I'll indicate that in a reply to that "What's Cooking" report.

^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v10 6/9] ls-tree.c: support --object-only option for "git-ls-tree"
  2022-02-04 12:58                           ` Ævar Arnfjörð Bjarmason
@ 2022-02-07  2:22                             ` Teng Long
  0 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-02-07  2:22 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Johannes Schindelin, Đoàn Trần Công Danh,
	Git Mailing List, Junio C Hamano, Jeff King, tenglong.tl,
	Martin Ågren

On Fri, Feb 4, 2022 at 9:04 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:

> I'm not saying we shouldn't have that, but that in any case a sequence of:
>
>  1. Add a --format option
>  2. Add a --object-only alias for a --format (what my RFC 7/7 does)
>  3. Add a custom more optimized --object-only implementation
>
> Would make the patch progression much easier to read, and we'd consider
> the correctness of --object-only (1 and 2) separate from the
> optimization question (3).
>
> But maybe we won't need (3) at all in the end, i.e. is (1 and 2) fast
> enough for it not to matter (I think probably "yes", but I don't have a
> strong opinion on that).

Sorry for the late reply; I was on vacation for the last two weeks (Chinese
New Year).

I have to say this is a very valuable recommendation. I also recognise that
it is important to spend more time organizing commits up front and to keep
each step (commit) sufficiently small.

> Now would be a good time :)
>
> I was reminded of this because Junio's proposed it for next at
> https://lore.kernel.org/git/xmqqr18jnr2t.fsf@gitster.g/
>
> I think per the above & other replies of mine (including not matters of
> code arrangement opinion, but e.g. the doc formatting bug) we'll need at
> least one more re-roll of this. Thanks for sticking with this & working
> on this!
>
> I'll indicate that in a reply to that "What's Cooking" report.

Thanks for mentioning that. I will get back to work on it this week.

Thanks.

^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v11 00/13] ls-tree: "--object-only" and "--format" opts
  2022-01-13  3:42               ` [PATCH v10 0/9] ls-tree: "--object-only" and "--format" opts Teng Long
                                   ` (8 preceding siblings ...)
  2022-01-13  3:42                 ` [PATCH v10 9/9] ls-tree.c: introduce "--format" option Teng Long
@ 2022-02-08 12:14                 ` Teng Long
  2022-02-08 12:14                   ` [PATCH v11 01/13] ls-tree: remove commented-out code Teng Long
                                     ` (13 more replies)
  9 siblings, 14 replies; 224+ messages in thread
From: Teng Long @ 2022-02-08 12:14 UTC (permalink / raw)
  To: dyroneteng
  Cc: Johannes.Schindelin, avarab, congdanhqx, git, gitster,
	martin.agren, peff, tenglong.tl

Main diffs from v10:

1. Remove commit b04188c822 (ls-tree: optimize naming and handling of "return" in
   show_tree()) and split it into 2 new commits:

        99e6d47108 (ls-tree: rename "retval" to "recurse" in "show_tree()")
        3816a65fe6 (ls-tree: simplify nesting if/else logic in "show_tree()")

   Also use the existing `object_type()` instead of the new `get_type()` that v10 introduced.

   These changes are based on Ævar Arnfjörð Bjarmason's suggestion in [1].

2. Better order of "--format" and "--object-only" features

   v11 introduces `--format` before `--object-only`. This makes the patch progression
   much easier to read and makes `--object-only` simpler to implement as a case of "--format".

   The change is based on Ævar Arnfjörð Bjarmason's suggestion in [2].

3. Correct the author names of some commits.

   The change is based on Ævar Arnfjörð Bjarmason's suggestion in [3].

4. Changes to the commit that adds the "--format" feature.

   * Documentation fixes:

        1) Add back the "\n" before "DESCRIPTION".
        2) Use a simpler usage example instead of a fragile JSON example.
        3) Other problems mentioned in [4].

   * Use a uniform naming scheme for the field names instead of the old ones:

        1) mode   ->  objectmode
        2) type   ->  objecttype
        3) object ->  objectname
        4) size   ->  objectsize
        5) file   ->  path

   These changes are based on Ævar Arnfjörð Bjarmason's suggestion in [4].

5. Add more tests, such as combining "--format" with `-t`, `--full-tree` and `--full-name`.

6. Add a new commit to introduce "fast_path()".

[1] https://public-inbox.org/git/220114.86ee5attm4.gmgdl@evledraar.gmail.com/
[2] https://public-inbox.org/git/CADMgQSSNQFHhf3=K+PiaoonBnheoDcoKpWy9-zjSu90d9rDY2w@mail.gmail.com/
[3] https://public-inbox.org/git/CADMgQSQaE4EtiNXyGKebPyPS_0YTQ=HN+dU89_jD6BgQ1C470A@mail.gmail.com/
[4] https://public-inbox.org/git/20220118125939.99956-1-dyroneteng@gmail.com/
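
To illustrate what items 2 and 4 are for, here is a minimal sketch that is
not part of the series. The sample line is hypothetical data laid out in the
default "<mode> SP <type> SP <object> TAB <file>" format, and the `cut`
pipeline is the one the tests use to pick out the object name. No git binary
is invoked; the `git ls-tree` commands in the trailing comment are what
should be equivalent once this series is applied, written with the renamed
field names.

```shell
#!/bin/sh
# Hypothetical sample line in the default ls-tree output layout:
#   <mode> SP <type> SP <object> TAB <file>
# (the object name and path below are illustrative, not from a real run)
sample=$(printf '100644 blob %s\t%s' \
	e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 A.t)

# Pipeline approach: keep everything before the TAB, then take the
# third space-separated field, which is the object name.
oid=$(printf '%s\n' "$sample" | cut -f1 | cut -d " " -f3)

# The path is everything after the TAB.
path=$(printf '%s\n' "$sample" | cut -f2)

printf '%s %s\n' "$oid" "$path"

# With this series applied, the same fields should be available without
# post-processing (not executed here):
#   git ls-tree --object-only <tree-ish>
#   git ls-tree --format='%(objectname) %(path)' <tree-ish>
```

The pipeline depends on the exact column layout of the default output, which
is why a dedicated option or a "--format" string is less fragile.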


Johannes Schindelin (1):
  cocci: allow padding with `strbuf_addf()`

Teng Long (6):
  ls-tree: rename "retval" to "recurse" in "show_tree()"
  ls-tree: simplify nesting if/else logic in "show_tree()"
  ls-tree: fix "--name-only" and "--long" combined use bug
  ls-tree: slightly refactor `show_tree()`
  ls-tree: introduce function "fast_path()"
  ls-tree.c: support --object-only option for "git-ls-tree"

Ævar Arnfjörð Bjarmason (6):
  ls-tree: remove commented-out code
  ls-tree: add missing braces to "else" arms
  ls-tree: use "enum object_type", not {blob,tree,commit}_type
  ls-tree: use "size_t", not "int" for "struct strbuf"'s "len"
  ls-tree: introduce struct "show_tree_data"
  ls-tree.c: introduce "--format" option

 Documentation/git-ls-tree.txt   |  64 ++++++-
 builtin/ls-tree.c               | 322 +++++++++++++++++++++++++-------
 contrib/coccinelle/strbuf.cocci |   6 +-
 t/t3104-ls-tree-format.sh       |  93 +++++++++
 t/t3105-ls-tree-oid.sh          |  51 +++++
 5 files changed, 465 insertions(+), 71 deletions(-)
 create mode 100755 t/t3104-ls-tree-format.sh
 create mode 100755 t/t3105-ls-tree-oid.sh

Range-diff against v10:
 1:  b04188c822 <  -:  ---------- ls-tree: optimize naming and handling of "return" in show_tree()
 -:  ---------- >  1:  2fcff7e0d4 ls-tree: remove commented-out code
 -:  ---------- >  2:  6fd1dd9383 ls-tree: add missing braces to "else" arms
 -:  ---------- >  3:  208654b5e2 ls-tree: use "enum object_type", not {blob,tree,commit}_type
 -:  ---------- >  4:  2637464fd8 ls-tree: use "size_t", not "int" for "struct strbuf"'s "len"
 -:  ---------- >  5:  99e6d47108 ls-tree: rename "retval" to "recurse" in "show_tree()"
 -:  ---------- >  6:  3816a65fe6 ls-tree: simplify nesting if/else logic in "show_tree()"
 -:  ---------- >  7:  b22c2dc49e ls-tree: fix "--name-only" and "--long" combined use bug
 2:  bcfbc935b8 !  8:  41e8ed5047 ls-tree.c: support --object-only option for "git-ls-tree"
    @@ Metadata
     Author: Teng Long <dyroneteng@gmail.com>
     
      ## Commit message ##
    -    ls-tree.c: support --object-only option for "git-ls-tree"
    +    ls-tree: slightly refactor `show_tree()`
     
    -    We usually pipe the output from `git ls-trees` to tools like
    -    `sed` or `cut` when we only want to extract some fields.
    +    This is a non-functional change, we use a new int "shown_fields" to mark
    +    which columns to output, and `parse_shown_fields()` to calculate the
    +    value of "shown_fields".
     
    -    When we want only the pathname component, we can pass
    -    `--name-only` option to omit such a pipeline, but there are no
    -    options for extracting other fields.
    -
    -    Teach the "--object-only" option to the command to only show the
    -    object name. This option cannot be used together with
    -    "--name-only" or "--long" , they are mutually exclusive (actually
    -    "--name-only" and "--long" can be combined together before, this
    -    commit by the way fix this bug).
    -
    -    A simple refactoring was done to the "show_tree" function, intead by
    -    using bitwise operations to recognize the format for printing to
    -    stdout. The reason for doing this is that we don't want to increase
    -    the readability difficulty with the addition of "-object-only",
    -    making this part of the logic easier to read and expand.
    -
    -    In terms of performance, there is no loss comparing to the
    -    "master" (2ae0a9cb8298185a94e5998086f380a355dd8907), here are the
    -    results of the performance tests in my environment based on linux
    -    repository:
    -
    -        $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r HEAD"
    -        Benchmark 1: /opt/git/master/bin/git ls-tree -r HEAD
    -        Time (mean ± σ):     105.8 ms ±   2.7 ms    [User: 85.7 ms, System: 20.0 ms]
    -        Range (min … max):   101.5 ms … 111.3 ms    28 runs
    -
    -        $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r HEAD"
    -        Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r HEAD
    -        Time (mean ± σ):     105.0 ms ±   3.0 ms    [User: 83.7 ms, System: 21.2 ms]
    -        Range (min … max):    99.3 ms … 109.5 ms    27 runs
    -
    -        $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r -l HEAD"
    -        Benchmark 1: /opt/git/master/bin/git ls-tree -r -l HEAD
    -        Time (mean ± σ):     337.4 ms ±  10.9 ms    [User: 308.3 ms, System: 29.0 ms]
    -        Range (min … max):   323.0 ms … 355.0 ms    10 runs
    -
    -        $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r -l HEAD"
    -        Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r -l HEAD
    -        Time (mean ± σ):     337.6 ms ±   6.2 ms    [User: 309.4 ms, System: 28.1 ms]
    -        Range (min … max):   330.4 ms … 349.9 ms    10 runs
    +    This has the advantage of making the show_tree logic simpler and more
    +    readable, as well as making it easier to extend new options (for example,
    +    if we want to add a "--object-only" option, we just need to add a similar
    +    "if (shown_fields == FIELD_OBJECT_NAME)" short-circuit logic in
    +    "show_tree()").
     
         Signed-off-by: Teng Long <dyroneteng@gmail.com>
     
    - ## Documentation/git-ls-tree.txt ##
    -@@ Documentation/git-ls-tree.txt: SYNOPSIS
    - --------
    - [verse]
    - 'git ls-tree' [-d] [-r] [-t] [-l] [-z]
    --	    [--name-only] [--name-status] [--full-name] [--full-tree] [--abbrev[=<n>]]
    -+	    [--name-only] [--name-status] [--object-only] [--full-name] [--full-tree] [--abbrev[=<n>]]
    - 	    <tree-ish> [<path>...]
    - 
    - DESCRIPTION
    -@@ Documentation/git-ls-tree.txt: OPTIONS
    - --name-only::
    - --name-status::
    - 	List only filenames (instead of the "long" output), one per line.
    -+	Cannot be combined with `--object-only`.
    -+
    -+--object-only::
    -+	List only names of the objects, one per line. Cannot be combined
    -+	with `--name-only` or `--name-status`.
    - 
    - --abbrev[=<n>]::
    - 	Instead of showing the full 40-byte hexadecimal object
    -
      ## builtin/ls-tree.c ##
     @@
      
    @@ builtin/ls-tree.c
     +#define LS_SHOW_TREES (1 << 2)
     +#define LS_NAME_ONLY (1 << 3)
     +#define LS_SHOW_SIZE (1 << 4)
    -+#define LS_OBJECT_ONLY (1 << 5)
      static int abbrev;
      static int ls_options;
      static struct pathspec pathspec;
      static int chomp_prefix;
      static const char *ls_tree_prefix;
     +static unsigned int shown_fields;
    -+#define FIELD_FILE_NAME 1
    ++#define FIELD_PATH_NAME 1
     +#define FIELD_SIZE (1 << 1)
     +#define FIELD_OBJECT_NAME (1 << 2)
     +#define FIELD_TYPE (1 << 3)
    @@ builtin/ls-tree.c
      	NULL
      };
      
    --static int show_recursive(const char *base, size_t baselen, const char *pathname)
     +enum {
     +	MODE_UNSPECIFIED = 0,
     +	MODE_NAME_ONLY,
    -+	MODE_OBJECT_ONLY,
     +	MODE_LONG,
     +};
     +
    @@ builtin/ls-tree.c
     +static int parse_shown_fields(void)
     +{
     +	if (cmdmode == MODE_NAME_ONLY) {
    -+		shown_fields = FIELD_FILE_NAME;
    -+		return 0;
    -+	}
    -+	if (cmdmode == MODE_OBJECT_ONLY) {
    -+		shown_fields = FIELD_OBJECT_NAME;
    ++		shown_fields = FIELD_PATH_NAME;
     +		return 0;
     +	}
    ++
     +	if (!ls_options || (ls_options & LS_RECURSIVE)
     +	    || (ls_options & LS_SHOW_TREES)
     +	    || (ls_options & LS_TREE_ONLY))
    @@ builtin/ls-tree.c
     +	return 1;
     +}
     +
    -+static int show_recursive(const char *base, size_t baselen,
    -+			  const char *pathname)
    + static int show_recursive(const char *base, size_t baselen, const char *pathname)
      {
      	int i;
    - 
    -@@ builtin/ls-tree.c: static enum object_type get_type(unsigned int mode)
    - 	        : OBJ_BLOB);
    +@@ builtin/ls-tree.c: static int show_recursive(const char *base, size_t baselen, const char *pathname
    + 	return 0;
      }
      
     +static int show_default(const struct object_id *oid, enum object_type type,
    @@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strb
     -			printf("%06o %s %s\t", mode, type_name(type),
     -			       find_unique_abbrev(oid, abbrev));
     -		}
    -+	if (shown_fields == FIELD_OBJECT_NAME) {
    -+		printf("%s%c", find_unique_abbrev(oid, abbrev), line_termination);
    ++	if (shown_fields == FIELD_PATH_NAME) {
    ++		baselen = base->len;
    ++		strbuf_addstr(base, pathname);
    ++		write_name_quoted_relative(base->buf,
    ++					   chomp_prefix ? ls_tree_prefix : NULL,
    ++					   stdout, line_termination);
    ++		strbuf_setlen(base, baselen);
     +		return recurse;
      	}
     -	baselen = base->len;
    @@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strb
     -				   stdout, line_termination);
     -	strbuf_setlen(base, baselen);
     +
    -+	if (shown_fields == FIELD_FILE_NAME) {
    -+		baselen = base->len;
    -+		strbuf_addstr(base, pathname);
    -+		write_name_quoted_relative(base->buf,
    -+					   chomp_prefix ? ls_tree_prefix : NULL,
    -+					   stdout, line_termination);
    -+		strbuf_setlen(base, baselen);
    -+		return recurse;
    -+	}
    -+
     +	if (shown_fields >= FIELD_DEFAULT)
     +		show_default(oid, type, pathname, mode, base);
     +
    @@ builtin/ls-tree.c: int cmd_ls_tree(int argc, const char **argv, const char *pref
      			LS_SHOW_TREES),
      		OPT_SET_INT('z', NULL, &line_termination,
      			    N_("terminate entries with NUL byte"), 0),
    --		OPT_BIT('l', "long", &ls_options, N_("include object size"),
    --			LS_SHOW_SIZE),
    --		OPT_BIT(0, "name-only", &ls_options, N_("list only filenames"),
    --			LS_NAME_ONLY),
    --		OPT_BIT(0, "name-status", &ls_options, N_("list only filenames"),
    --			LS_NAME_ONLY),
    +-		OPT_CMDMODE('l', "long", &ls_options, N_("include object size"),
    +-			    LS_SHOW_SIZE),
    +-		OPT_CMDMODE(0, "name-only", &ls_options, N_("list only filenames"),
    +-			    LS_NAME_ONLY),
    +-		OPT_CMDMODE(0, "name-status", &ls_options, N_("list only filenames"),
    +-			    LS_NAME_ONLY),
     +		OPT_CMDMODE('l', "long", &cmdmode, N_("include object size"),
     +			    MODE_LONG),
     +		OPT_CMDMODE(0, "name-only", &cmdmode, N_("list only filenames"),
     +			    MODE_NAME_ONLY),
     +		OPT_CMDMODE(0, "name-status", &cmdmode, N_("list only filenames"),
     +			    MODE_NAME_ONLY),
    -+		OPT_CMDMODE(0, "object-only", &cmdmode, N_("list only objects"),
    -+			    MODE_OBJECT_ONLY),
      		OPT_SET_INT(0, "full-name", &chomp_prefix,
      			    N_("use full path names"), 0),
      		OPT_BOOL(0, "full-tree", &full_tree,
    @@ builtin/ls-tree.c: int cmd_ls_tree(int argc, const char **argv, const char *pref
      	/*
      	 * show_recursive() rolls its own matching code and is
      	 * generally ignorant of 'struct pathspec'. The magic mask
    -
    - ## t/t3104-ls-tree-oid.sh (new) ##
    -@@
    -+#!/bin/sh
    -+
    -+test_description='git ls-tree objects handling.'
    -+
    -+. ./test-lib.sh
    -+
    -+test_expect_success 'setup' '
    -+	test_commit A &&
    -+	test_commit B &&
    -+	mkdir -p C &&
    -+	test_commit C/D.txt &&
    -+	find *.txt path* \( -type f -o -type l \) -print |
    -+	xargs git update-index --add &&
    -+	tree=$(git write-tree) &&
    -+	echo $tree
    -+'
    -+
    -+test_expect_success 'usage: --object-only' '
    -+	git ls-tree --object-only $tree >current &&
    -+	git ls-tree $tree >result &&
    -+	cut -f1 result | cut -d " " -f3 >expected &&
    -+	test_cmp current expected
    -+'
    -+
    -+test_expect_success 'usage: --object-only with -r' '
    -+	git ls-tree --object-only -r $tree >current &&
    -+	git ls-tree -r $tree >result &&
    -+	cut -f1 result | cut -d " " -f3 >expected &&
    -+	test_cmp current expected
    -+'
    -+
    -+test_expect_success 'usage: --object-only with --abbrev' '
    -+	git ls-tree --object-only --abbrev=6 $tree >current &&
    -+	git ls-tree --abbrev=6 $tree >result &&
    -+	cut -f1 result | cut -d " " -f3 >expected &&
    -+	test_cmp current expected
    -+'
    -+
    -+test_expect_success 'usage: incompatible options: --name-only with --object-only' '
    -+	test_expect_code 129 git ls-tree --object-only --name-only $tree
    -+'
    -+
    -+test_expect_success 'usage: incompatible options: --name-status with --object-only' '
    -+	test_expect_code 129 git ls-tree --object-only --name-status $tree
    -+'
    -+
    -+test_expect_success 'usage: incompatible options: --long with --object-only' '
    -+	test_expect_code 129 git ls-tree --object-only --long $tree
    -+'
    -+
    -+test_done
 3:  3ddffa1027 !  9:  46e10a5392 ls-tree.c: introduce struct "show_tree_data"
    @@
      ## Metadata ##
    -Author: Teng Long <dyroneteng@gmail.com>
    +Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Commit message ##
    -    ls-tree.c: introduce struct "show_tree_data"
    +    ls-tree: introduce struct "show_tree_data"
     
         "show_tree_data" is a struct that packages the necessary fields for
         "show_tree()". This commit is a pre-prepared commit for supporting
         "--format" option and it does not affect any existing functionality.
     
    +    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
         Signed-off-by: Teng Long <dyroneteng@gmail.com>
     
      ## builtin/ls-tree.c ##
    @@ builtin/ls-tree.c: static unsigned int shown_fields;
      static const  char * const ls_tree_usage[] = {
      	N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
      	NULL
    -@@ builtin/ls-tree.c: static enum object_type get_type(unsigned int mode)
    - 	        : OBJ_BLOB);
    +@@ builtin/ls-tree.c: static int show_recursive(const char *base, size_t baselen, const char *pathname
    + 	return 0;
      }
      
     -static int show_default(const struct object_id *oid, enum object_type type,
    @@ builtin/ls-tree.c: static int show_default(const struct object_id *oid, enum obj
      }
      
     @@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strbuf *base,
    + 	int recurse = 0;
      	size_t baselen;
    - 	enum object_type type = get_type(mode);
    - 
    + 	enum object_type type = object_type(mode);
     +	struct show_tree_data data = {
     +		.mode = mode,
     +		.type = type,
    @@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strb
     +		.pathname = pathname,
     +		.base = base,
     +	};
    -+
    + 
      	if (type == OBJ_TREE && show_recursive(base->buf, base->len, pathname))
      		recurse = READ_TREE_RECURSIVE;
    - 	if (type == OBJ_TREE && recurse && !(ls_options & LS_SHOW_TREES))
     @@ builtin/ls-tree.c: static int show_tree(const struct object_id *oid, struct strbuf *base,
      	}
      
 4:  4b58a707c2 ! 10:  c04320b801 cocci: allow padding with `strbuf_addf()`
    @@
      ## Metadata ##
    -Author: Teng Long <dyroneteng@gmail.com>
    +Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
     
      ## Commit message ##
         cocci: allow padding with `strbuf_addf()`
 5:  db058bf670 ! 11:  5936004f13 ls-tree.c: introduce "--format" option
    @@
      ## Metadata ##
    -Author: Teng Long <dyroneteng@gmail.com>
    +Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Commit message ##
         ls-tree.c: introduce "--format" option
    @@ Commit message
         Links:
                 [1] https://public-inbox.org/git/RFC-patch-6.7-eac299f06ff-20211217T131635Z-avarab@gmail.com/
     
    +    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
         Signed-off-by: Teng Long <dyroneteng@gmail.com>
     
      ## Documentation/git-ls-tree.txt ##
    @@ Documentation/git-ls-tree.txt: SYNOPSIS
      --------
      [verse]
      'git ls-tree' [-d] [-r] [-t] [-l] [-z]
    --	    [--name-only] [--name-status] [--object-only] [--full-name] [--full-tree] [--abbrev[=<n>]]
    --	    <tree-ish> [<path>...]
    --
    -+	    [--name-only] [--name-status] [--object-only]
    -+	    [--full-name] [--full-tree] [--abbrev[=<n>]]
    -+	    [--format=<format>] <tree-ish> [<path>...]
    +-	    [--name-only] [--name-status] [--full-name] [--full-tree] [--abbrev[=<n>]]
    ++	    [--name-only] [--name-status] [--full-name] [--full-tree] [--abbrev[=<n>]] [--format=<format>]
    + 	    <tree-ish> [<path>...]
    + 
      DESCRIPTION
    - -----------
    - Lists the contents of a given tree object, like what "/bin/ls -a" does
     @@ Documentation/git-ls-tree.txt: OPTIONS
      	Do not limit the listing to the current working directory.
      	Implies --full-name.
    @@ Documentation/git-ls-tree.txt: OPTIONS
      
      Output Format
      -------------
    +-        <mode> SP <type> SP <object> TAB <file>
    ++
    ++The output format of `ls-tree` is determined by either the `--format`
    ++option, or other format-altering options such as `--name-only` etc.
    ++(see `--format` above).
     +
    -+Default format:
    ++The use of certain `--format` directives is equivalent to using those
    ++options, but invoking the full formatting machinery can be slower than
    ++using an appropriate formatting option.
     +
    -         <mode> SP <type> SP <object> TAB <file>
    ++In cases where the `--format` would exactly map to an existing option
    ++`ls-tree` will use the appropriate faster path. Thus the default format
    ++is equivalent to:
    ++
    ++        %(objectmode) %(objecttype) %(objectname)%x09%(path)
      
      This output format is compatible with what `--index-info --stdin` of
    + 'git update-index' expects.
    + 
    + When the `-l` option is used, format changes to
    + 
    +-        <mode> SP <type> SP <object> SP <object size> TAB <file>
    ++        %(objectmode) %(objecttype) %(objectname) %(objectsize:padded)%x09%(path)
    + 
    +-Object size identified by <object> is given in bytes, and right-justified
    ++Object size identified by <objectname> is given in bytes, and right-justified
    + with minimum width of 7 characters.  Object size is given only for blobs
    + (file) entries; for other entries `-` character is used in place of size.
    + 
     @@ Documentation/git-ls-tree.txt: quoted as explained for the configuration variable `core.quotePath`
      (see linkgit:git-config[1]).  Using `-z` the filename is output
      verbatim and the line is terminated by a NUL byte.
    @@ Documentation/git-ls-tree.txt: quoted as explained for the configuration variabl
     +
     +It is possible to print in a custom format by using the `--format` option,
     +which is able to interpolate different fields using a `%(fieldname)` notation.
    -+For example, if you want to only print the <object> and <file> fields with a
    -+JSON style, executing with a specific "--format" like
    -+
    -+        git ls-tree --format='{"object":"%(object)", "file":"%(file)"}' <tree-ish>
    ++For example, if you only care about the "objectname" and "path" fields, you
    ++can execute with a specific "--format" like
     +
    -+The output format changes to:
    -+
    -+        {"object":"<object>", "file":"<file>"}
    ++        git ls-tree --format='%(objectname) %(path)' <tree-ish>
     +
     +FIELD NAMES
     +-----------
    @@ Documentation/git-ls-tree.txt: quoted as explained for the configuration variabl
     +into the resulting output. For each outputing line, the following
     +names can be used:
     +
    -+mode::
    ++objectmode::
     +	The mode of the object.
    -+type::
    ++objecttype::
     +	The type of the object (`blob` or `tree`).
    -+object::
    ++objectname::
     +	The name of the object.
    -+size[:padded]::
    ++objectsize[:padded]::
     +	The size of the object ("-" if it's a tree).
     +	It also supports a padded format of size with "%(size:padded)".
    -+file::
    -+	The filename of the object.
    ++path::
    ++	The pathname of the object.
     +
      GIT
      ---
      Part of the linkgit:git[1] suite
     
      ## builtin/ls-tree.c ##
    +@@ builtin/ls-tree.c: static unsigned int shown_fields;
    + #define FIELD_MODE (1 << 4)
    + #define FIELD_DEFAULT 29 /* 11101 size is not shown to output by default */
    + #define FIELD_LONG_DEFAULT  (FIELD_DEFAULT | FIELD_SIZE)
    +-
    ++static const char *format;
    ++static const char *default_format = "%(objectmode) %(objecttype) %(objectname)%x09%(path)";
    ++static const char *long_format = "%(objectmode) %(objecttype) %(objectname) %(objectsize:padded)%x09%(path)";
    ++static const char *name_only_format = "%(path)";
    + struct show_tree_data {
    + 	unsigned mode;
    + 	enum object_type type;
     @@ builtin/ls-tree.c: enum {
      
      static int cmdmode = MODE_UNSPECIFIED;
      
    -+static const char *format;
    -+static const char *default_format = "%(mode) %(type) %(object)%x09%(file)";
    -+static const char *long_format = "%(mode) %(type) %(object) %(size:padded)%x09%(file)";
    -+static const char *name_only_format = "%(file)";
    -+static const char *object_only_format = "%(object)";
    -+
    - static int parse_shown_fields(void)
    - {
    - 	if (cmdmode == MODE_NAME_ONLY) {
    -@@ builtin/ls-tree.c: static int parse_shown_fields(void)
    - 	return 1;
    - }
    - 
     +static void expand_objectsize(struct strbuf *line, const struct object_id *oid,
     +			      const enum object_type type, unsigned int padded)
     +{
    @@ builtin/ls-tree.c: static int parse_shown_fields(void)
     +			die(_("could not get object info about '%s'"),
     +			    oid_to_hex(oid));
     +		if (padded)
    -+			strbuf_addf(line, "%7" PRIuMAX, (uintmax_t)size);
    ++			strbuf_addf(line, "%7"PRIuMAX, (uintmax_t)size);
     +		else
    -+			strbuf_addf(line, "%" PRIuMAX, (uintmax_t)size);
    ++			strbuf_addf(line, "%"PRIuMAX, (uintmax_t)size);
     +	} else if (padded) {
     +		strbuf_addf(line, "%7s", "-");
     +	} else {
    @@ builtin/ls-tree.c: static int parse_shown_fields(void)
     +	}
     +}
     +
    -+static size_t expand_show_tree(struct strbuf *line, const char *start,
    ++static size_t expand_show_tree(struct strbuf *sb, const char *start,
     +			       void *context)
     +{
     +	struct show_tree_data *data = context;
     +	const char *end;
     +	const char *p;
     +	unsigned int errlen;
    -+	size_t len = strbuf_expand_literal_cb(line, start, NULL);
    ++	size_t len = strbuf_expand_literal_cb(sb, start, NULL);
     +
     +	if (len)
     +		return len;
    @@ builtin/ls-tree.c: static int parse_shown_fields(void)
     +		die(_("bad ls-tree format: element '%s' does not end in ')'"), start);
     +
     +	len = end - start + 1;
    -+	if (skip_prefix(start, "(mode)", &p)) {
    -+		strbuf_addf(line, "%06o", data->mode);
    -+	} else if (skip_prefix(start, "(type)", &p)) {
    -+		strbuf_addstr(line, type_name(data->type));
    -+	} else if (skip_prefix(start, "(size:padded)", &p)) {
    -+		expand_objectsize(line, data->oid, data->type, 1);
    -+	} else if (skip_prefix(start, "(size)", &p)) {
    -+		expand_objectsize(line, data->oid, data->type, 0);
    -+	} else if (skip_prefix(start, "(object)", &p)) {
    -+		strbuf_add_unique_abbrev(line, data->oid, abbrev);
    -+	} else if (skip_prefix(start, "(file)", &p)) {
    ++	if (skip_prefix(start, "(objectmode)", &p)) {
    ++		strbuf_addf(sb, "%06o", data->mode);
    ++	} else if (skip_prefix(start, "(objecttype)", &p)) {
    ++		strbuf_addstr(sb, type_name(data->type));
    ++	} else if (skip_prefix(start, "(objectsize:padded)", &p)) {
    ++		expand_objectsize(sb, data->oid, data->type, 1);
    ++	} else if (skip_prefix(start, "(objectsize)", &p)) {
    ++		expand_objectsize(sb, data->oid, data->type, 0);
    ++	} else if (skip_prefix(start, "(objectname)", &p)) {
    ++		strbuf_add_unique_abbrev(sb, data->oid, abbrev);
    ++	} else if (skip_prefix(start, "(path)", &p)) {
     +		const char *name = data->base->buf;
     +		const char *prefix = chomp_prefix ? ls_tree_prefix : NULL;
     +		struct strbuf quoted = STRBUF_INIT;
    -+		struct strbuf sb = STRBUF_INIT;
    ++		struct strbuf sbuf = STRBUF_INIT;
     +		strbuf_addstr(data->base, data->pathname);
    -+		name = relative_path(data->base->buf, prefix, &sb);
    ++		name = relative_path(data->base->buf, prefix, &sbuf);
     +		quote_c_style(name, &quoted, NULL, 0);
    -+		strbuf_addbuf(line, &quoted);
    -+		strbuf_release(&sb);
    ++		strbuf_addbuf(sb, &quoted);
    ++		strbuf_release(&sbuf);
     +		strbuf_release(&quoted);
     +	} else {
     +		errlen = (unsigned long)len;
    @@ builtin/ls-tree.c: static int parse_shown_fields(void)
     +	return len;
     +}
     +
    - static int show_recursive(const char *base, size_t baselen,
    - 			  const char *pathname)
    + static int parse_shown_fields(void)
      {
    -@@ builtin/ls-tree.c: static enum object_type get_type(unsigned int mode)
    - 	        : OBJ_BLOB);
    + 	if (cmdmode == MODE_NAME_ONLY) {
    +@@ builtin/ls-tree.c: static int show_recursive(const char *base, size_t baselen, const char *pathname
    + 	return 0;
      }
      
     +static int show_tree_fmt(const struct object_id *oid, struct strbuf *base,
    @@ builtin/ls-tree.c: static enum object_type get_type(unsigned int mode)
     +{
     +	size_t baselen;
     +	int recurse = 0;
    -+	struct strbuf line = STRBUF_INIT;
    -+	enum object_type type = get_type(mode);
    ++	struct strbuf sb = STRBUF_INIT;
    ++	enum object_type type = object_type(mode);
     +
     +	struct show_tree_data data = {
     +		.mode = mode,
    @@ builtin/ls-tree.c: static enum object_type get_type(unsigned int mode)
     +		return 0;
     +
     +	baselen = base->len;
    -+	strbuf_expand(&line, format, expand_show_tree, &data);
    -+	strbuf_addch(&line, line_termination);
    -+	fwrite(line.buf, line.len, 1, stdout);
    -+	strbuf_release(&line);
    ++	strbuf_expand(&sb, format, expand_show_tree, &data);
    ++	strbuf_addch(&sb, line_termination);
    ++	fwrite(sb.buf, sb.len, 1, stdout);
    ++	strbuf_release(&sb);
     +	strbuf_setlen(base, baselen);
     +	return recurse;
     +}
    @@ builtin/ls-tree.c: int cmd_ls_tree(int argc, const char **argv, const char *pref
      			 N_("list entire tree; not just current directory "
      			    "(implies --full-name)")),
     +		OPT_STRING_F(0, "format", &format, N_("format"),
    -+			     N_("format to use for the output"),
    -+			     PARSE_OPT_NONEG),
    ++					 N_("format to use for the output"),
    ++					 PARSE_OPT_NONEG),
      		OPT__ABBREV(&abbrev),
      		OPT_END()
      	};
    @@ builtin/ls-tree.c: int cmd_ls_tree(int argc, const char **argv, const char *pref
      		die("not a tree object");
     -	return !!read_tree(the_repository, tree,
     -			   &pathspec, show_tree, NULL);
    -+
     +	/*
     +	 * The generic show_tree_fmt() is slower than show_tree(), so
     +	 * take the fast path if possible.
     +	 */
    -+	if (format &&
    -+	    (!strcmp(format, default_format) ||
    -+	     !strcmp(format, long_format) ||
    -+	     !strcmp(format, name_only_format) ||
    -+	     !strcmp(format, object_only_format)))
    ++	if (format && (!strcmp(format, default_format))) {
    ++		fn = show_tree;
    ++	} else if (format && (!strcmp(format, long_format))) {
    ++		shown_fields = shown_fields | FIELD_SIZE;
     +		fn = show_tree;
    -+	else if (format)
    ++	} else if (format && (!strcmp(format, name_only_format))) {
    ++		shown_fields = FIELD_PATH_NAME;
    ++		fn = show_tree;
    ++	} else if (format)
     +		fn = show_tree_fmt;
     +
     +	return !!read_tree(the_repository, tree, &pathspec, fn, NULL);
      }
     
    - ## t/t3105-ls-tree-format.sh (new) ##
    + ## t/t3104-ls-tree-format.sh (new) ##
     @@
     +#!/bin/sh
     +
    @@ t/t3105-ls-tree-format.sh (new)
     +test_expect_success 'ls-tree --format usage' '
     +	test_expect_code 129 git ls-tree --format=fmt -l HEAD &&
     +	test_expect_code 129 git ls-tree --format=fmt --name-only HEAD &&
    -+	test_expect_code 129 git ls-tree --format=fmt --name-status HEAD &&
    -+	test_expect_code 129 git ls-tree --format=fmt --object-only HEAD
    ++	test_expect_code 129 git ls-tree --format=fmt --name-status HEAD
     +'
     +
     +test_expect_success 'setup' '
    @@ t/t3105-ls-tree-format.sh (new)
     +test_ls_tree_format () {
     +	format=$1 &&
     +	opts=$2 &&
    ++	fmtopts=$3 &&
     +	shift 2 &&
     +	git ls-tree $opts -r HEAD >expect.raw &&
     +	sed "s/^/> /" >expect <expect.raw &&
    -+	git ls-tree --format="> $format" -r HEAD >actual &&
    ++	git ls-tree --format="> $format" -r $fmtopts HEAD >actual &&
     +	test_cmp expect actual
     +}
     +
     +test_expect_success 'ls-tree --format=<default-like>' '
     +	test_ls_tree_format \
    -+		"%(mode) %(type) %(object)%x09%(file)" \
    ++		"%(objectmode) %(objecttype) %(objectname)%x09%(path)" \
     +		""
     +'
     +
     +test_expect_success 'ls-tree --format=<long-like>' '
     +	test_ls_tree_format \
    -+		"%(mode) %(type) %(object) %(size:padded)%x09%(file)" \
    ++		"%(objectmode) %(objecttype) %(objectname) %(objectsize:padded)%x09%(path)" \
     +		"--long"
     +'
     +
     +test_expect_success 'ls-tree --format=<name-only-like>' '
     +	test_ls_tree_format \
    -+		"%(file)" \
    ++		"%(path)" \
     +		"--name-only"
     +'
     +
    -+test_expect_success 'ls-tree --format=<object-only-like>' '
    ++test_expect_success 'ls-tree combine --format=<default-like> and -t' '
    ++	test_ls_tree_format \
    ++	"%(objectmode) %(objecttype) %(objectname)%x09%(path)" \
    ++	"-t" \
    ++	"-t"
    ++'
    ++
    ++test_expect_success 'ls-tree combine --format=<default-like> and --full-name' '
     +	test_ls_tree_format \
    -+		"%(object)" \
    -+		"--object-only"
    ++	"%(objectmode) %(objecttype) %(objectname)%x09%(path)" \
    ++	"--full-name" \
    ++	"--full-name"
     +'
     +
    ++test_expect_success 'ls-tree combine --format=<default-like> and --full-tree' '
    ++	test_ls_tree_format \
    ++	"%(objectmode) %(objecttype) %(objectname)%x09%(path)" \
    ++	"--full-tree" \
    ++	"--full-tree"
    ++'
    ++
    ++test_expect_success 'ls-tree hit fast-path with --format=<default-like>' '
    ++	git ls-tree -r HEAD >expect &&
    ++	git ls-tree --format="%(objectmode) %(objecttype) %(objectname)%x09%(path)" -r HEAD >actual &&
    ++	test_cmp expect actual
    ++'
    ++
    ++test_expect_success 'ls-tree hit fast-path with --format=<name-only-like>' '
    ++	git ls-tree -r --name-only HEAD >expect &&
    ++	git ls-tree --format="%(path)" -r HEAD >actual &&
    ++	test_cmp expect actual
    ++'
     +test_done
 -:  ---------- > 12:  6d26497749 ls-tree: introduce function "fast_path()"
 -:  ---------- > 13:  e6d98f2560 ls-tree.c: support --object-only option for "git-ls-tree"
-- 
2.34.1.403.gb35f2687cf.dirty



* [PATCH v11 01/13] ls-tree: remove commented-out code
  2022-02-08 12:14                 ` [PATCH v11 00/13] ls-tree: "--object-only" and "--format" opts Teng Long
@ 2022-02-08 12:14                   ` Teng Long
  2022-02-08 12:14                   ` [PATCH v11 02/13] ls-tree: add missing braces to "else" arms Teng Long
                                     ` (12 subsequent siblings)
  13 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-02-08 12:14 UTC (permalink / raw)
  To: dyroneteng
  Cc: Johannes.Schindelin, avarab, congdanhqx, git, gitster,
	martin.agren, peff, tenglong.tl

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Remove code added in f35a6d3bce7 (Teach core object handling functions
about gitlinks, 2007-04-09), later patched in 7d0b18a4da1 (Add output
flushing before fork(), 2008-08-04), and then finally ending up in its
current form in d3bee161fef (tree.c: allow read_tree_recursive() to
traverse gitlink entries, 2009-01-25). All while being commented-out!

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/ls-tree.c | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3a442631c7..5f7c84950c 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -69,15 +69,6 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	const char *type = blob_type;
 
 	if (S_ISGITLINK(mode)) {
-		/*
-		 * Maybe we want to have some recursive version here?
-		 *
-		 * Something similar to this incomplete example:
-		 *
-		if (show_subprojects(base, baselen, pathname))
-			retval = READ_TREE_RECURSIVE;
-		 *
-		 */
 		type = commit_type;
 	} else if (S_ISDIR(mode)) {
 		if (show_recursive(base->buf, base->len, pathname)) {
-- 
2.34.1.403.gb35f2687cf.dirty



* [PATCH v11 02/13] ls-tree: add missing braces to "else" arms
  2022-02-08 12:14                 ` [PATCH v11 00/13] ls-tree: "--object-only" and "--format" opts Teng Long
  2022-02-08 12:14                   ` [PATCH v11 01/13] ls-tree: remove commented-out code Teng Long
@ 2022-02-08 12:14                   ` Teng Long
  2022-02-08 12:14                   ` [PATCH v11 03/13] ls-tree: use "enum object_type", not {blob,tree,commit}_type Teng Long
                                     ` (11 subsequent siblings)
  13 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-02-08 12:14 UTC (permalink / raw)
  To: dyroneteng
  Cc: Johannes.Schindelin, avarab, congdanhqx, git, gitster,
	martin.agren, peff, tenglong.tl

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Add missing {} to the "else" arms in show_tree() per the
CodingGuidelines.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/ls-tree.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 5f7c84950c..0a28f32ccb 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -92,14 +92,16 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 				else
 					xsnprintf(size_text, sizeof(size_text),
 						  "%"PRIuMAX, (uintmax_t)size);
-			} else
+			} else {
 				xsnprintf(size_text, sizeof(size_text), "-");
+			}
 			printf("%06o %s %s %7s\t", mode, type,
 			       find_unique_abbrev(oid, abbrev),
 			       size_text);
-		} else
+		} else {
 			printf("%06o %s %s\t", mode, type,
 			       find_unique_abbrev(oid, abbrev));
+		}
 	}
 	baselen = base->len;
 	strbuf_addstr(base, pathname);
-- 
2.34.1.403.gb35f2687cf.dirty



* [PATCH v11 03/13] ls-tree: use "enum object_type", not {blob,tree,commit}_type
  2022-02-08 12:14                 ` [PATCH v11 00/13] ls-tree: "--object-only" and "--format" opts Teng Long
  2022-02-08 12:14                   ` [PATCH v11 01/13] ls-tree: remove commented-out code Teng Long
  2022-02-08 12:14                   ` [PATCH v11 02/13] ls-tree: add missing braces to "else" arms Teng Long
@ 2022-02-08 12:14                   ` Teng Long
  2022-02-08 12:14                   ` [PATCH v11 04/13] ls-tree: use "size_t", not "int" for "struct strbuf"'s "len" Teng Long
                                     ` (10 subsequent siblings)
  13 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-02-08 12:14 UTC (permalink / raw)
  To: dyroneteng
  Cc: Johannes.Schindelin, avarab, congdanhqx, git, gitster,
	martin.agren, peff, tenglong.tl

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Change the ls-tree.c code to use type_name() on the enum instead of
using the string constants. This doesn't matter either way for
performance, but makes this a bit easier to read as we'll no longer
need a strcmp() here.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
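
As a reviewer's aside, the enum-plus-lookup pattern described above can be
sketched as follows. This is a minimal illustration only: the "demo_"
names and enum values are hypothetical stand-ins, not Git's own
"enum object_type" or type_name().

```c
#include <stddef.h>

/* Hypothetical stand-ins for Git's object types; the values are illustrative. */
enum demo_object_type {
	DEMO_OBJ_COMMIT = 1,
	DEMO_OBJ_TREE,
	DEMO_OBJ_BLOB
};

/* Map an enum value to its printable name, in the spirit of type_name(). */
const char *demo_type_name(enum demo_object_type type)
{
	switch (type) {
	case DEMO_OBJ_COMMIT:
		return "commit";
	case DEMO_OBJ_TREE:
		return "tree";
	case DEMO_OBJ_BLOB:
		return "blob";
	}
	return NULL; /* unknown value */
}

/* With an enum, "is this a blob?" is an integer comparison, not a strcmp(). */
int demo_is_blob(enum demo_object_type type)
{
	return type == DEMO_OBJ_BLOB;
}
```

The string is only produced at the point of printing, so every other
check on the type stays a plain comparison.
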
 builtin/ls-tree.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 0a28f32ccb..3f0225b097 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -66,17 +66,17 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 {
 	int retval = 0;
 	int baselen;
-	const char *type = blob_type;
+	enum object_type type = OBJ_BLOB;
 
 	if (S_ISGITLINK(mode)) {
-		type = commit_type;
+		type = OBJ_COMMIT;
 	} else if (S_ISDIR(mode)) {
 		if (show_recursive(base->buf, base->len, pathname)) {
 			retval = READ_TREE_RECURSIVE;
 			if (!(ls_options & LS_SHOW_TREES))
 				return retval;
 		}
-		type = tree_type;
+		type = OBJ_TREE;
 	}
 	else if (ls_options & LS_TREE_ONLY)
 		return 0;
@@ -84,7 +84,7 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	if (!(ls_options & LS_NAME_ONLY)) {
 		if (ls_options & LS_SHOW_SIZE) {
 			char size_text[24];
-			if (!strcmp(type, blob_type)) {
+			if (type == OBJ_BLOB) {
 				unsigned long size;
 				if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
 					xsnprintf(size_text, sizeof(size_text),
@@ -95,11 +95,11 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 			} else {
 				xsnprintf(size_text, sizeof(size_text), "-");
 			}
-			printf("%06o %s %s %7s\t", mode, type,
+			printf("%06o %s %s %7s\t", mode, type_name(type),
 			       find_unique_abbrev(oid, abbrev),
 			       size_text);
 		} else {
-			printf("%06o %s %s\t", mode, type,
+			printf("%06o %s %s\t", mode, type_name(type),
 			       find_unique_abbrev(oid, abbrev));
 		}
 	}
-- 
2.34.1.403.gb35f2687cf.dirty



* [PATCH v11 04/13] ls-tree: use "size_t", not "int" for "struct strbuf"'s "len"
  2022-02-08 12:14                 ` [PATCH v11 00/13] ls-tree: "--object-only" and "--format" opts Teng Long
                                     ` (2 preceding siblings ...)
  2022-02-08 12:14                   ` [PATCH v11 03/13] ls-tree: use "enum object_type", not {blob,tree,commit}_type Teng Long
@ 2022-02-08 12:14                   ` Teng Long
  2022-02-08 12:14                   ` [PATCH v11 05/13] ls-tree: rename "retval" to "recurse" in "show_tree()" Teng Long
                                     ` (9 subsequent siblings)
  13 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-02-08 12:14 UTC (permalink / raw)
  To: dyroneteng
  Cc: Johannes.Schindelin, avarab, congdanhqx, git, gitster,
	martin.agren, peff, tenglong.tl

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

The "struct strbuf"'s "len" member is a "size_t", not an "int", so
let's change our corresponding types accordingly. This also changes
the "len" and "speclen" variables, which are likewise used to store
the return value of strlen(), which returns "size_t", not "int".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/ls-tree.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3f0225b097..eecc7482d5 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -31,7 +31,7 @@ static const  char * const ls_tree_usage[] = {
 	NULL
 };
 
-static int show_recursive(const char *base, int baselen, const char *pathname)
+static int show_recursive(const char *base, size_t baselen, const char *pathname)
 {
 	int i;
 
@@ -43,7 +43,7 @@ static int show_recursive(const char *base, int baselen, const char *pathname)
 
 	for (i = 0; i < pathspec.nr; i++) {
 		const char *spec = pathspec.items[i].match;
-		int len, speclen;
+		size_t len, speclen;
 
 		if (strncmp(base, spec, baselen))
 			continue;
@@ -65,7 +65,7 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 		const char *pathname, unsigned mode, void *context)
 {
 	int retval = 0;
-	int baselen;
+	size_t baselen;
 	enum object_type type = OBJ_BLOB;
 
 	if (S_ISGITLINK(mode)) {
-- 
2.34.1.403.gb35f2687cf.dirty



* [PATCH v11 05/13] ls-tree: rename "retval" to "recurse" in "show_tree()"
  2022-02-08 12:14                 ` [PATCH v11 00/13] ls-tree: "--object-only" and "--format" opts Teng Long
                                     ` (3 preceding siblings ...)
  2022-02-08 12:14                   ` [PATCH v11 04/13] ls-tree: use "size_t", not "int" for "struct strbuf"'s "len" Teng Long
@ 2022-02-08 12:14                   ` Teng Long
  2022-02-08 12:14                   ` [PATCH v11 06/13] ls-tree: simplify nesting if/else logic " Teng Long
                                     ` (8 subsequent siblings)
  13 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-02-08 12:14 UTC (permalink / raw)
  To: dyroneteng
  Cc: Johannes.Schindelin, avarab, congdanhqx, git, gitster,
	martin.agren, peff, tenglong.tl

The variable that "show_tree()" returns is named "retval", a name that
says little about what it means. Rename "retval" to "recurse", which is
more meaningful in this context: readers no longer need to look at
"read_tree_at()" in "tree.c" to find out what the return value is used
for.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
 builtin/ls-tree.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index eecc7482d5..ef8c414f61 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -64,7 +64,7 @@ static int show_recursive(const char *base, size_t baselen, const char *pathname
 static int show_tree(const struct object_id *oid, struct strbuf *base,
 		const char *pathname, unsigned mode, void *context)
 {
-	int retval = 0;
+	int recurse = 0;
 	size_t baselen;
 	enum object_type type = OBJ_BLOB;
 
@@ -72,9 +72,9 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 		type = OBJ_COMMIT;
 	} else if (S_ISDIR(mode)) {
 		if (show_recursive(base->buf, base->len, pathname)) {
-			retval = READ_TREE_RECURSIVE;
+			recurse = READ_TREE_RECURSIVE;
 			if (!(ls_options & LS_SHOW_TREES))
-				return retval;
+				return recurse;
 		}
 		type = OBJ_TREE;
 	}
@@ -109,7 +109,7 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 				   chomp_prefix ? ls_tree_prefix : NULL,
 				   stdout, line_termination);
 	strbuf_setlen(base, baselen);
-	return retval;
+	return recurse;
 }
 
 int cmd_ls_tree(int argc, const char **argv, const char *prefix)
-- 
2.34.1.403.gb35f2687cf.dirty



* [PATCH v11 06/13] ls-tree: simplify nesting if/else logic in "show_tree()"
  2022-02-08 12:14                 ` [PATCH v11 00/13] ls-tree: "--object-only" and "--format" opts Teng Long
                                     ` (4 preceding siblings ...)
  2022-02-08 12:14                   ` [PATCH v11 05/13] ls-tree: rename "retval" to "recurse" in "show_tree()" Teng Long
@ 2022-02-08 12:14                   ` Teng Long
  2022-02-19  6:06                     ` Ævar Arnfjörð Bjarmason
  2022-02-08 12:14                   ` [PATCH v11 07/13] ls-tree: fix "--name-only" and "--long" combined use bug Teng Long
                                     ` (7 subsequent siblings)
  13 siblings, 1 reply; 224+ messages in thread
From: Teng Long @ 2022-02-08 12:14 UTC (permalink / raw)
  To: dyroneteng
  Cc: Johannes.Schindelin, avarab, congdanhqx, git, gitster,
	martin.agren, peff, tenglong.tl, Teng Long

Use "object_type()" to get the type, then remove some of the nested
"if" statements to make the code here cleaner.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
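
For readers unfamiliar with the mode-to-type mapping, here is a minimal
sketch of the idea. It assumes the usual tree-entry mode bits (040000 for
directories, 0160000 for gitlinks); all "DEMO_"/"demo_" names are
hypothetical and this is not a copy of Git's object_type().

```c
/* Hypothetical stand-ins; Git defines its own enum and mode macros. */
enum demo_object_type {
	DEMO_OBJ_COMMIT = 1,
	DEMO_OBJ_TREE,
	DEMO_OBJ_BLOB
};

#define DEMO_S_IFMT      0170000 /* mask for the file-type bits */
#define DEMO_S_IFDIR     0040000 /* directory, i.e. a subtree */
#define DEMO_S_IFGITLINK 0160000 /* gitlink, i.e. a submodule commit */

/* Derive the object type from a tree entry's mode in one place. */
enum demo_object_type demo_object_type(unsigned int mode)
{
	if ((mode & DEMO_S_IFMT) == DEMO_S_IFDIR)
		return DEMO_OBJ_TREE;
	if ((mode & DEMO_S_IFMT) == DEMO_S_IFGITLINK)
		return DEMO_OBJ_COMMIT;
	return DEMO_OBJ_BLOB; /* regular files and symlinks */
}
```

Once the type is known up front, the remaining checks can be written as
flat "if (type == ...)" conditions instead of a nested if/else chain.
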
 builtin/ls-tree.c | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index ef8c414f61..9c57a36c8c 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -66,19 +66,13 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 {
 	int recurse = 0;
 	size_t baselen;
-	enum object_type type = OBJ_BLOB;
+	enum object_type type = object_type(mode);
 
-	if (S_ISGITLINK(mode)) {
-		type = OBJ_COMMIT;
-	} else if (S_ISDIR(mode)) {
-		if (show_recursive(base->buf, base->len, pathname)) {
-			recurse = READ_TREE_RECURSIVE;
-			if (!(ls_options & LS_SHOW_TREES))
-				return recurse;
-		}
-		type = OBJ_TREE;
-	}
-	else if (ls_options & LS_TREE_ONLY)
+	if (type == OBJ_TREE && show_recursive(base->buf, base->len, pathname))
+		recurse = READ_TREE_RECURSIVE;
+	if (type == OBJ_TREE && recurse && !(ls_options & LS_SHOW_TREES))
+		return recurse;
+	if (type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
 		return 0;
 
 	if (!(ls_options & LS_NAME_ONLY)) {
-- 
2.34.1.403.gb35f2687cf.dirty



* [PATCH v11 07/13] ls-tree: fix "--name-only" and "--long" combined use bug
  2022-02-08 12:14                 ` [PATCH v11 00/13] ls-tree: "--object-only" and "--format" opts Teng Long
                                     ` (5 preceding siblings ...)
  2022-02-08 12:14                   ` [PATCH v11 06/13] ls-tree: simplify nesting if/else logic " Teng Long
@ 2022-02-08 12:14                   ` Teng Long
  2022-02-19  6:04                     ` Ævar Arnfjörð Bjarmason
  2022-02-08 12:14                   ` [PATCH v11 08/13] ls-tree: slightly refactor `show_tree()` Teng Long
                                     ` (6 subsequent siblings)
  13 siblings, 1 reply; 224+ messages in thread
From: Teng Long @ 2022-02-08 12:14 UTC (permalink / raw)
  To: dyroneteng
  Cc: Johannes.Schindelin, avarab, congdanhqx, git, gitster,
	martin.agren, peff, tenglong.tl

If we execute "git ls-tree" with both "--name-only" and "--long", only
the pathname is printed and the size is omitted (originally discovered
by Peff in [1]).

Fix this by using `OPT_CMDMODE()` instead, which makes the two options
mutually exclusive.

[1] https://public-inbox.org/git/YZK0MKCYAJmG+pSU@coredump.intra.peff.net/

Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
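
The "mutually exclusive" behaviour described above can be sketched as
below. This is a hypothetical emulation for illustration, not the
parse-options implementation behind OPT_CMDMODE(); the "demo_" names are
assumptions.

```c
/* Hypothetical command modes; 0 means "no mode chosen yet". */
enum demo_mode {
	DEMO_MODE_UNSPECIFIED = 0,
	DEMO_MODE_NAME_ONLY,
	DEMO_MODE_LONG
};

/*
 * Record the requested mode. Returns 0 on success, or -1 if a different
 * mode was already chosen, in which case the caller would report a
 * usage error instead of silently letting one option win.
 */
int demo_set_mode(int *mode, int requested)
{
	if (*mode != DEMO_MODE_UNSPECIFIED && *mode != requested)
		return -1;
	*mode = requested;
	return 0;
}
```

With independent bits, both options could be set at once and the code
had to pick a winner; with a single mode variable the conflict is caught
while parsing.
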
 builtin/ls-tree.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 9c57a36c8c..32147e75e6 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -120,12 +120,12 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 			LS_SHOW_TREES),
 		OPT_SET_INT('z', NULL, &line_termination,
 			    N_("terminate entries with NUL byte"), 0),
-		OPT_BIT('l', "long", &ls_options, N_("include object size"),
-			LS_SHOW_SIZE),
-		OPT_BIT(0, "name-only", &ls_options, N_("list only filenames"),
-			LS_NAME_ONLY),
-		OPT_BIT(0, "name-status", &ls_options, N_("list only filenames"),
-			LS_NAME_ONLY),
+		OPT_CMDMODE('l', "long", &ls_options, N_("include object size"),
+			    LS_SHOW_SIZE),
+		OPT_CMDMODE(0, "name-only", &ls_options, N_("list only filenames"),
+			    LS_NAME_ONLY),
+		OPT_CMDMODE(0, "name-status", &ls_options, N_("list only filenames"),
+			    LS_NAME_ONLY),
 		OPT_SET_INT(0, "full-name", &chomp_prefix,
 			    N_("use full path names"), 0),
 		OPT_BOOL(0, "full-tree", &full_tree,
-- 
2.34.1.403.gb35f2687cf.dirty



* [PATCH v11 08/13] ls-tree: slightly refactor `show_tree()`
  2022-02-08 12:14                 ` [PATCH v11 00/13] ls-tree: "--object-only" and "--format" opts Teng Long
                                     ` (6 preceding siblings ...)
  2022-02-08 12:14                   ` [PATCH v11 07/13] ls-tree: fix "--name-only" and "--long" combined use bug Teng Long
@ 2022-02-08 12:14                   ` Teng Long
  2022-02-19  5:56                     ` Ævar Arnfjörð Bjarmason
  2022-02-08 12:14                   ` [PATCH v11 09/13] ls-tree: introduce struct "show_tree_data" Teng Long
                                     ` (5 subsequent siblings)
  13 siblings, 1 reply; 224+ messages in thread
From: Teng Long @ 2022-02-08 12:14 UTC (permalink / raw)
  To: dyroneteng
  Cc: Johannes.Schindelin, avarab, congdanhqx, git, gitster,
	martin.agren, peff, tenglong.tl

This is a non-functional change. We use a new "shown_fields" bit field
to mark which columns to output, and `parse_shown_fields()` to
calculate its value.

This makes the show_tree() logic simpler and more readable, and makes
it easier to add new options (for example, to add an "--object-only"
option, we just need to add a similar
"if (shown_fields == FIELD_OBJECT_NAME)" short-circuit in
"show_tree()").

Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
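
To make the bit-field arithmetic easier to check, here is a small
standalone sketch. The "DEMO_" names mirror the FIELD_* macros in this
patch but are hypothetical; spelling the default out as an OR of the
individual bits is an illustrative choice, while the patch itself uses
the literal 29.

```c
/* One bit per output column, mirroring the FIELD_* layout in the patch. */
#define DEMO_FIELD_PATH_NAME   1u
#define DEMO_FIELD_SIZE        (1u << 1)
#define DEMO_FIELD_OBJECT_NAME (1u << 2)
#define DEMO_FIELD_TYPE        (1u << 3)
#define DEMO_FIELD_MODE        (1u << 4)

/* mode, type, object name and path, but no size: 16 + 8 + 4 + 1 = 29 */
#define DEMO_FIELD_DEFAULT \
	(DEMO_FIELD_MODE | DEMO_FIELD_TYPE | \
	 DEMO_FIELD_OBJECT_NAME | DEMO_FIELD_PATH_NAME)

/* the default columns plus the size column: 29 + 2 = 31 */
#define DEMO_FIELD_LONG_DEFAULT (DEMO_FIELD_DEFAULT | DEMO_FIELD_SIZE)

/* Return non-zero when the size column is part of the selection. */
int demo_shows_size(unsigned int shown_fields)
{
	return (shown_fields & DEMO_FIELD_SIZE) != 0;
}
```

Testing a single column is a bitwise AND, while an exact comparison such
as "shown_fields == FIELD_PATH_NAME" selects the "only this column" case.
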
 builtin/ls-tree.c | 126 ++++++++++++++++++++++++++++++++--------------
 1 file changed, 89 insertions(+), 37 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 32147e75e6..8baab7c83e 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -16,21 +16,53 @@
 
 static int line_termination = '\n';
 #define LS_RECURSIVE 1
-#define LS_TREE_ONLY 2
-#define LS_SHOW_TREES 4
-#define LS_NAME_ONLY 8
-#define LS_SHOW_SIZE 16
+#define LS_TREE_ONLY (1 << 1)
+#define LS_SHOW_TREES (1 << 2)
+#define LS_NAME_ONLY (1 << 3)
+#define LS_SHOW_SIZE (1 << 4)
 static int abbrev;
 static int ls_options;
 static struct pathspec pathspec;
 static int chomp_prefix;
 static const char *ls_tree_prefix;
+static unsigned int shown_fields;
+#define FIELD_PATH_NAME 1
+#define FIELD_SIZE (1 << 1)
+#define FIELD_OBJECT_NAME (1 << 2)
+#define FIELD_TYPE (1 << 3)
+#define FIELD_MODE (1 << 4)
+#define FIELD_DEFAULT 29 /* 11101 size is not shown to output by default */
+#define FIELD_LONG_DEFAULT  (FIELD_DEFAULT | FIELD_SIZE)
 
 static const  char * const ls_tree_usage[] = {
 	N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
 	NULL
 };
 
+enum {
+	MODE_UNSPECIFIED = 0,
+	MODE_NAME_ONLY,
+	MODE_LONG,
+};
+
+static int cmdmode = MODE_UNSPECIFIED;
+
+static int parse_shown_fields(void)
+{
+	if (cmdmode == MODE_NAME_ONLY) {
+		shown_fields = FIELD_PATH_NAME;
+		return 0;
+	}
+
+	if (!ls_options || (ls_options & LS_RECURSIVE)
+	    || (ls_options & LS_SHOW_TREES)
+	    || (ls_options & LS_TREE_ONLY))
+		shown_fields = FIELD_DEFAULT;
+	if (cmdmode == MODE_LONG)
+		shown_fields = FIELD_LONG_DEFAULT;
+	return 1;
+}
+
 static int show_recursive(const char *base, size_t baselen, const char *pathname)
 {
 	int i;
@@ -61,6 +93,39 @@ static int show_recursive(const char *base, size_t baselen, const char *pathname
 	return 0;
 }
 
+static int show_default(const struct object_id *oid, enum object_type type,
+			const char *pathname, unsigned mode,
+			struct strbuf *base)
+{
+	size_t baselen = base->len;
+
+	if (shown_fields & FIELD_SIZE) {
+		char size_text[24];
+		if (type == OBJ_BLOB) {
+			unsigned long size;
+			if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
+				xsnprintf(size_text, sizeof(size_text), "BAD");
+			else
+				xsnprintf(size_text, sizeof(size_text),
+					  "%" PRIuMAX, (uintmax_t)size);
+		} else {
+			xsnprintf(size_text, sizeof(size_text), "-");
+		}
+		printf("%06o %s %s %7s\t", mode, type_name(type),
+		find_unique_abbrev(oid, abbrev), size_text);
+	} else {
+		printf("%06o %s %s\t", mode, type_name(type),
+		find_unique_abbrev(oid, abbrev));
+	}
+	baselen = base->len;
+	strbuf_addstr(base, pathname);
+	write_name_quoted_relative(base->buf,
+				   chomp_prefix ? ls_tree_prefix : NULL, stdout,
+				   line_termination);
+	strbuf_setlen(base, baselen);
+	return 1;
+}
+
 static int show_tree(const struct object_id *oid, struct strbuf *base,
 		const char *pathname, unsigned mode, void *context)
 {
@@ -75,34 +140,19 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	if (type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
 		return 0;
 
-	if (!(ls_options & LS_NAME_ONLY)) {
-		if (ls_options & LS_SHOW_SIZE) {
-			char size_text[24];
-			if (type == OBJ_BLOB) {
-				unsigned long size;
-				if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
-					xsnprintf(size_text, sizeof(size_text),
-						  "BAD");
-				else
-					xsnprintf(size_text, sizeof(size_text),
-						  "%"PRIuMAX, (uintmax_t)size);
-			} else {
-				xsnprintf(size_text, sizeof(size_text), "-");
-			}
-			printf("%06o %s %s %7s\t", mode, type_name(type),
-			       find_unique_abbrev(oid, abbrev),
-			       size_text);
-		} else {
-			printf("%06o %s %s\t", mode, type_name(type),
-			       find_unique_abbrev(oid, abbrev));
-		}
+	if (shown_fields == FIELD_PATH_NAME) {
+		baselen = base->len;
+		strbuf_addstr(base, pathname);
+		write_name_quoted_relative(base->buf,
+					   chomp_prefix ? ls_tree_prefix : NULL,
+					   stdout, line_termination);
+		strbuf_setlen(base, baselen);
+		return recurse;
 	}
-	baselen = base->len;
-	strbuf_addstr(base, pathname);
-	write_name_quoted_relative(base->buf,
-				   chomp_prefix ? ls_tree_prefix : NULL,
-				   stdout, line_termination);
-	strbuf_setlen(base, baselen);
+
+	if (shown_fields >= FIELD_DEFAULT)
+		show_default(oid, type, pathname, mode, base);
+
 	return recurse;
 }
 
@@ -120,12 +170,12 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 			LS_SHOW_TREES),
 		OPT_SET_INT('z', NULL, &line_termination,
 			    N_("terminate entries with NUL byte"), 0),
-		OPT_CMDMODE('l', "long", &ls_options, N_("include object size"),
-			    LS_SHOW_SIZE),
-		OPT_CMDMODE(0, "name-only", &ls_options, N_("list only filenames"),
-			    LS_NAME_ONLY),
-		OPT_CMDMODE(0, "name-status", &ls_options, N_("list only filenames"),
-			    LS_NAME_ONLY),
+		OPT_CMDMODE('l', "long", &cmdmode, N_("include object size"),
+			    MODE_LONG),
+		OPT_CMDMODE(0, "name-only", &cmdmode, N_("list only filenames"),
+			    MODE_NAME_ONLY),
+		OPT_CMDMODE(0, "name-status", &cmdmode, N_("list only filenames"),
+			    MODE_NAME_ONLY),
 		OPT_SET_INT(0, "full-name", &chomp_prefix,
 			    N_("use full path names"), 0),
 		OPT_BOOL(0, "full-tree", &full_tree,
@@ -156,6 +206,8 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	if (get_oid(argv[0], &oid))
 		die("Not a valid object name %s", argv[0]);
 
+	parse_shown_fields();
+
 	/*
 	 * show_recursive() rolls its own matching code and is
 	 * generally ignorant of 'struct pathspec'. The magic mask
-- 
2.34.1.403.gb35f2687cf.dirty



* [PATCH v11 09/13] ls-tree: introduce struct "show_tree_data"
  2022-02-08 12:14                 ` [PATCH v11 00/13] ls-tree: "--object-only" and "--format" opts Teng Long
                                     ` (7 preceding siblings ...)
  2022-02-08 12:14                   ` [PATCH v11 08/13] ls-tree: slightly refactor `show_tree()` Teng Long
@ 2022-02-08 12:14                   ` Teng Long
  2022-02-08 12:14                   ` [PATCH v11 10/13] cocci: allow padding with `strbuf_addf()` Teng Long
                                     ` (4 subsequent siblings)
  13 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-02-08 12:14 UTC (permalink / raw)
  To: dyroneteng
  Cc: Johannes.Schindelin, avarab, congdanhqx, git, gitster,
	martin.agren, peff, tenglong.tl

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

"show_tree_data" is a struct that packages the fields needed by
"show_tree()". This is a preparatory commit for supporting the
"--format" option; it does not affect any existing functionality.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
 builtin/ls-tree.c | 43 ++++++++++++++++++++++++++++---------------
 1 file changed, 28 insertions(+), 15 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 8baab7c83e..293b8f9dfb 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -34,6 +34,14 @@ static unsigned int shown_fields;
 #define FIELD_DEFAULT 29 /* 11101 size is not shown to output by default */
 #define FIELD_LONG_DEFAULT  (FIELD_DEFAULT | FIELD_SIZE)
 
+struct show_tree_data {
+	unsigned mode;
+	enum object_type type;
+	const struct object_id *oid;
+	const char *pathname;
+	struct strbuf *base;
+};
+
 static const  char * const ls_tree_usage[] = {
 	N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
 	NULL
@@ -93,17 +101,15 @@ static int show_recursive(const char *base, size_t baselen, const char *pathname
 	return 0;
 }
 
-static int show_default(const struct object_id *oid, enum object_type type,
-			const char *pathname, unsigned mode,
-			struct strbuf *base)
+static int show_default(struct show_tree_data *data)
 {
-	size_t baselen = base->len;
+	size_t baselen = data->base->len;
 
 	if (shown_fields & FIELD_SIZE) {
 		char size_text[24];
-		if (type == OBJ_BLOB) {
+		if (data->type == OBJ_BLOB) {
 			unsigned long size;
-			if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
+			if (oid_object_info(the_repository, data->oid, &size) == OBJ_BAD)
 				xsnprintf(size_text, sizeof(size_text), "BAD");
 			else
 				xsnprintf(size_text, sizeof(size_text),
@@ -111,18 +117,18 @@ static int show_default(const struct object_id *oid, enum object_type type,
 		} else {
 			xsnprintf(size_text, sizeof(size_text), "-");
 		}
-		printf("%06o %s %s %7s\t", mode, type_name(type),
-		find_unique_abbrev(oid, abbrev), size_text);
+		printf("%06o %s %s %7s\t", data->mode, type_name(data->type),
+		find_unique_abbrev(data->oid, abbrev), size_text);
 	} else {
-		printf("%06o %s %s\t", mode, type_name(type),
-		find_unique_abbrev(oid, abbrev));
+		printf("%06o %s %s\t", data->mode, type_name(data->type),
+		find_unique_abbrev(data->oid, abbrev));
 	}
-	baselen = base->len;
-	strbuf_addstr(base, pathname);
-	write_name_quoted_relative(base->buf,
+	baselen = data->base->len;
+	strbuf_addstr(data->base, data->pathname);
+	write_name_quoted_relative(data->base->buf,
 				   chomp_prefix ? ls_tree_prefix : NULL, stdout,
 				   line_termination);
-	strbuf_setlen(base, baselen);
+	strbuf_setlen(data->base, baselen);
 	return 1;
 }
 
@@ -132,6 +138,13 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	int recurse = 0;
 	size_t baselen;
 	enum object_type type = object_type(mode);
+	struct show_tree_data data = {
+		.mode = mode,
+		.type = type,
+		.oid = oid,
+		.pathname = pathname,
+		.base = base,
+	};
 
 	if (type == OBJ_TREE && show_recursive(base->buf, base->len, pathname))
 		recurse = READ_TREE_RECURSIVE;
@@ -151,7 +164,7 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	}
 
 	if (shown_fields >= FIELD_DEFAULT)
-		show_default(oid, type, pathname, mode, base);
+		show_default(&data);
 
 	return recurse;
 }
-- 
2.34.1.403.gb35f2687cf.dirty



* [PATCH v11 10/13] cocci: allow padding with `strbuf_addf()`
  2022-02-08 12:14                 ` [PATCH v11 00/13] ls-tree: "--object-only" and "--format" opts Teng Long
                                     ` (8 preceding siblings ...)
  2022-02-08 12:14                   ` [PATCH v11 09/13] ls-tree: introduce struct "show_tree_data" Teng Long
@ 2022-02-08 12:14                   ` Teng Long
  2022-02-08 12:14                   ` [PATCH v11 11/13] ls-tree.c: introduce "--format" option Teng Long
                                     ` (3 subsequent siblings)
  13 siblings, 0 replies; 224+ messages in thread
From: Teng Long @ 2022-02-08 12:14 UTC (permalink / raw)
  To: dyroneteng
  Cc: Johannes.Schindelin, avarab, congdanhqx, git, gitster,
	martin.agren, peff, tenglong.tl, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

A convenient way to pad strings is to use something like
`strbuf_addf(&buf, "%20s", "Hello, world!")`.

However, the Coccinelle rule that forbids a format `"%s"` with a
constant string argument cast too wide a net, and also forbade such
padding.
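
To make the distinction concrete, here is a minimal sketch in plain C.
It uses snprintf() instead of git's strbuf API, and add_plain() and
add_padded() are hypothetical helpers for illustration, not functions
from the tree. A bare "%s" only copies its argument, which is why the
rule can rewrite it to strbuf_addstr(); "%20s" right-justifies the
argument in a 20-column field, and no plain copy does that.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: "%s" is a plain copy of the argument. */
static size_t add_plain(char *out, size_t outlen, const char *s)
{
	int n = snprintf(out, outlen, "%s", s);

	assert(n >= 0 && (size_t)n < outlen);
	return (size_t)n;
}

/* Hypothetical helper: "%20s" left-pads the argument to 20 columns. */
static size_t add_padded(char *out, size_t outlen, const char *s)
{
	int n = snprintf(out, outlen, "%20s", s);

	assert(n >= 0 && (size_t)n < outlen);
	return (size_t)n;
}
```

With a 64-byte buffer, add_plain() on "Hello, world!" yields the 13
characters unchanged, while add_padded() yields 20 characters, the
first 7 of which are spaces.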

The original rule was introduced by commit:

    28c23cd4c39 (strbuf.cocci: suggest strbuf_addbuf() to add one strbuf to an other, 2019-01-25)

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
 contrib/coccinelle/strbuf.cocci | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/contrib/coccinelle/strbuf.cocci b/contrib/coccinelle/strbuf.cocci
index d9ada69b43..0970d98ad7 100644
--- a/contrib/coccinelle/strbuf.cocci
+++ b/contrib/coccinelle/strbuf.cocci
@@ -15,7 +15,7 @@ constant fmt !~ "%";
 @@
 expression E;
 struct strbuf SB;
-format F =~ "s";
+format F =~ "^s$";
 @@
 - strbuf_addf(E, "%@F@", SB.buf);
 + strbuf_addbuf(E, &SB);
@@ -23,7 +23,7 @@ format F =~ "s";
 @@
 expression E;
 struct strbuf *SBP;
-format F =~ "s";
+format F =~ "^s$";
 @@
 - strbuf_addf(E, "%@F@", SBP->buf);
 + strbuf_addbuf(E, SBP);
@@ -44,7 +44,7 @@ struct strbuf *SBP;
 
 @@
 expression E1, E2;
-format F =~ "s";
+format F =~ "^s$";
 @@
 - strbuf_addf(E1, "%@F@", E2);
 + strbuf_addstr(E1, E2);
-- 
2.34.1.403.gb35f2687cf.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v11 11/13] ls-tree.c: introduce "--format" option
  2022-02-08 12:14                 ` [PATCH v11 00/13] ls-tree: "--object-only" and "--format" opts Teng Long
                                     ` (9 preceding siblings ...)
  2022-02-08 12:14                   ` [PATCH v11 10/13] cocci: allow padding with `strbuf_addf()` Teng Long
@ 2022-02-08 12:14                   ` Teng Long
  2022-02-19  5:44                     ` Ævar Arnfjörð Bjarmason
  2022-02-08 12:14                   ` [PATCH v11 12/13] ls-tree: introduce function "fast_path()" Teng Long
                                     ` (2 subsequent siblings)
  13 siblings, 1 reply; 224+ messages in thread
From: Teng Long @ 2022-02-08 12:14 UTC (permalink / raw)
  To: dyroneteng
  Cc: Johannes.Schindelin, avarab, congdanhqx, git, gitster,
	martin.agren, peff, tenglong.tl

From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Add a --format option to ls-tree. The command already has a default
output, plus --long and --name-only options that emit the default
output along with the object size, or only the object paths.

Rather than add --type-only, --object-only etc. we can just support a
--format using a strbuf_expand() similar to "for-each-ref
--format". We might still add such options in the future for
convenience.
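
As a rough illustration of the interpolation idea (not the
strbuf_expand() implementation in this patch), the sketch below
expands "%%", "%xNN" and "%(fieldname)" into a fixed buffer. The
struct entry type and the field_value() and expand_format() helpers
are assumptions made up for this example, and every field is assumed
to be preformatted as a string.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one tree entry; fields are preformatted. */
struct entry {
	const char *objectmode;
	const char *objecttype;
	const char *objectname;
	const char *path;
};

/* Return the value of the field named by [name, name + len), or NULL. */
static const char *field_value(const struct entry *e, const char *name,
			       size_t len)
{
	if (len == 10 && !strncmp(name, "objectmode", len))
		return e->objectmode;
	if (len == 10 && !strncmp(name, "objecttype", len))
		return e->objecttype;
	if (len == 10 && !strncmp(name, "objectname", len))
		return e->objectname;
	if (len == 4 && !strncmp(name, "path", len))
		return e->path;
	return NULL;
}

/*
 * Expand fmt into out. Returns the number of bytes written (not
 * counting the trailing NUL), or -1 on a bad format or a full buffer.
 */
static int expand_format(char *out, size_t outlen, const char *fmt,
			 const struct entry *e)
{
	size_t used = 0;

	assert(out && fmt && e);
	if (!outlen)
		return -1;
	while (*fmt) {
		const char *val;
		size_t vlen = 1;
		char lit;

		if (*fmt != '%') {
			/* ordinary character, copied as-is */
			lit = *fmt++;
			val = &lit;
		} else if (fmt[1] == '%') {
			/* "%%" is a literal percent sign */
			lit = '%';
			val = &lit;
			fmt += 2;
		} else if (fmt[1] == 'x' && isxdigit((unsigned char)fmt[2]) &&
			   isxdigit((unsigned char)fmt[3])) {
			/* "%xNN" is the byte with hex code NN */
			const char hex[3] = { fmt[2], fmt[3], '\0' };

			lit = (char)strtol(hex, NULL, 16);
			val = &lit;
			fmt += 4;
		} else if (fmt[1] == '(') {
			/* "%(fieldname)" is looked up in the entry */
			const char *end = strchr(fmt + 2, ')');

			if (!end)
				return -1;
			val = field_value(e, fmt + 2, (size_t)(end - (fmt + 2)));
			if (!val)
				return -1;
			vlen = strlen(val);
			fmt = end + 1;
		} else {
			return -1;
		}

		if (used + vlen >= outlen)
			return -1;
		memcpy(out + used, val, vlen);
		used += vlen;
	}
	out[used] = '\0';
	return (int)used;
}
```

Given an entry with mode "100644", type "blob", name "abc123" and path
"dir/file.txt", the default-like format
"%(objectmode) %(objecttype) %(objectname)%x09%(path)" expands to the
usual "<mode> SP <type> SP <object> TAB <file>" line, and an unknown
field name is rejected.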

The --format implementation is slower than the existing code, but this
change does not cause any performance regressions. We'll leave the
existing show_tree() unchanged, and only run show_tree_fmt() if
a --format different from the hardcoded built-in ones corresponding to
the existing modes is provided.

I.e. something like the "--long" output would be much slower with
this, mainly due to how we need to allocate various things to do with
quote.c instead of spewing the output directly to stdout.

The new '--format' option comes from Ævar Arnfjörð Bjarmason's
idea and suggestion; this commit makes modifications based on the
original discussion on the list [1].

Here are the results of the performance tests:

1. Default format (hitting the builtin formats):

    "git ls-tree <tree-ish>" vs "--format='%(mode) %(type) %(object)%x09%(file)'"

    $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r HEAD"
    Benchmark 1: /opt/git/master/bin/git ls-tree -r HEAD
    Time (mean ± σ):     105.2 ms ±   3.3 ms    [User: 84.3 ms, System: 20.8 ms]
    Range (min … max):    99.2 ms … 113.2 ms    28 runs

    $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object)%x09%(file)'  HEAD"
    Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object)%x09%(file)'  HEAD
    Time (mean ± σ):     106.4 ms ±   2.7 ms    [User: 86.1 ms, System: 20.2 ms]
    Range (min … max):   100.2 ms … 110.5 ms    29 runs

2. Default format including object size (hitting the builtin formats):

    "git ls-tree -l <tree-ish>" vs "--format='%(mode) %(type) %(object) %(size:padded)%x09%(file)'"

    $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r -l HEAD"
    Benchmark 1: /opt/git/master/bin/git ls-tree -r -l HEAD
    Time (mean ± σ):     335.1 ms ±   6.5 ms    [User: 304.6 ms, System: 30.4 ms]
    Range (min … max):   327.5 ms … 348.4 ms    10 runs

    $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object) %(size:padded)%x09%(file)'  HEAD"
    Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object) %(size:padded)%x09%(file)'  HEAD
    Time (mean ± σ):     337.2 ms ±   8.2 ms    [User: 309.2 ms, System: 27.9 ms]
    Range (min … max):   328.8 ms … 349.4 ms    10 runs

Links:
	[1] https://public-inbox.org/git/RFC-patch-6.7-eac299f06ff-20211217T131635Z-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
 Documentation/git-ls-tree.txt |  59 ++++++++++++++--
 builtin/ls-tree.c             | 129 +++++++++++++++++++++++++++++++++-
 t/t3104-ls-tree-format.sh     |  81 +++++++++++++++++++++
 3 files changed, 262 insertions(+), 7 deletions(-)
 create mode 100755 t/t3104-ls-tree-format.sh

diff --git a/Documentation/git-ls-tree.txt b/Documentation/git-ls-tree.txt
index db02d6d79a..db29a9efb5 100644
--- a/Documentation/git-ls-tree.txt
+++ b/Documentation/git-ls-tree.txt
@@ -10,7 +10,7 @@ SYNOPSIS
 --------
 [verse]
 'git ls-tree' [-d] [-r] [-t] [-l] [-z]
-	    [--name-only] [--name-status] [--full-name] [--full-tree] [--abbrev[=<n>]]
+	    [--name-only] [--name-status] [--full-name] [--full-tree] [--abbrev[=<n>]] [--format=<format>]
 	    <tree-ish> [<path>...]
 
 DESCRIPTION
@@ -74,6 +74,16 @@ OPTIONS
 	Do not limit the listing to the current working directory.
 	Implies --full-name.
 
+--format=<format>::
+	A string that interpolates `%(fieldname)` from the result
+	being shown. It also interpolates `%%` to `%`, and
+	`%xNN` where `NN` are hex digits interpolates to character
+	with hex code `NN`; for example `%x00` interpolates to
+	`\0` (NUL), `%x09` to `\t` (TAB) and `%x0a` to `\n` (LF).
+	When specified, `--format` cannot be combined with other
+	format-altering options, including `--long`, `--name-only`
+	and `--object-only`.
+
 [<path>...]::
 	When paths are given, show them (note that this isn't really raw
 	pathnames, but rather a list of patterns to match).  Otherwise
@@ -82,16 +92,29 @@ OPTIONS
 
 Output Format
 -------------
-        <mode> SP <type> SP <object> TAB <file>
+
+The output format of `ls-tree` is determined by either the `--format`
+option, or other format-altering options such as `--name-only` etc.
+(see `--format` above).
+
+The use of certain `--format` directives is equivalent to using those
+options, but invoking the full formatting machinery can be slower than
+using an appropriate formatting option.
+
+In cases where the `--format` would exactly map to an existing option
+`ls-tree` will use the appropriate faster path. Thus the default format
+is equivalent to:
+
+        %(objectmode) %(objecttype) %(objectname)%x09%(path)
 
 This output format is compatible with what `--index-info --stdin` of
 'git update-index' expects.
 
 When the `-l` option is used, format changes to
 
-        <mode> SP <type> SP <object> SP <object size> TAB <file>
+        %(objectmode) %(objecttype) %(objectname) %(objectsize:padded)%x09%(path)
 
-Object size identified by <object> is given in bytes, and right-justified
+Object size identified by <objectname> is given in bytes, and right-justified
 with minimum width of 7 characters.  Object size is given only for blobs
 (file) entries; for other entries `-` character is used in place of size.
 
@@ -100,6 +123,34 @@ quoted as explained for the configuration variable `core.quotePath`
 (see linkgit:git-config[1]).  Using `-z` the filename is output
 verbatim and the line is terminated by a NUL byte.
 
+Customized format:
+
+It is possible to print in a custom format by using the `--format` option,
+which is able to interpolate different fields using a `%(fieldname)` notation.
+For example, if you only care about the "objectname" and "path" fields, you
+can execute with a specific "--format" like
+
+        git ls-tree --format='%(objectname) %(path)' <tree-ish>
+
+FIELD NAMES
+-----------
+
+Various values from structured fields can be used to interpolate
+into the resulting output. For each output line, the following
+names can be used:
+
+objectmode::
+	The mode of the object.
+objecttype::
+	The type of the object (`commit`, `blob` or `tree`).
+objectname::
+	The name of the object.
+objectsize[:padded]::
+	The size of a `blob` object ("-" if it's a `commit` or `tree`).
+	It also supports a padded format of size with "%(objectsize:padded)".
+path::
+	The pathname of the object.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 293b8f9dfb..1c71e5d543 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -33,7 +33,10 @@ static unsigned int shown_fields;
 #define FIELD_MODE (1 << 4)
 #define FIELD_DEFAULT 29 /* 11101 size is not shown to output by default */
 #define FIELD_LONG_DEFAULT  (FIELD_DEFAULT | FIELD_SIZE)
-
+static const char *format;
+static const char *default_format = "%(objectmode) %(objecttype) %(objectname)%x09%(path)";
+static const char *long_format = "%(objectmode) %(objecttype) %(objectname) %(objectsize:padded)%x09%(path)";
+static const char *name_only_format = "%(path)";
 struct show_tree_data {
 	unsigned mode;
 	enum object_type type;
@@ -55,6 +58,72 @@ enum {
 
 static int cmdmode = MODE_UNSPECIFIED;
 
+static void expand_objectsize(struct strbuf *line, const struct object_id *oid,
+			      const enum object_type type, unsigned int padded)
+{
+	if (type == OBJ_BLOB) {
+		unsigned long size;
+		if (oid_object_info(the_repository, oid, &size) < 0)
+			die(_("could not get object info about '%s'"),
+			    oid_to_hex(oid));
+		if (padded)
+			strbuf_addf(line, "%7"PRIuMAX, (uintmax_t)size);
+		else
+			strbuf_addf(line, "%"PRIuMAX, (uintmax_t)size);
+	} else if (padded) {
+		strbuf_addf(line, "%7s", "-");
+	} else {
+		strbuf_addstr(line, "-");
+	}
+}
+
+static size_t expand_show_tree(struct strbuf *sb, const char *start,
+			       void *context)
+{
+	struct show_tree_data *data = context;
+	const char *end;
+	const char *p;
+	unsigned int errlen;
+	size_t len = strbuf_expand_literal_cb(sb, start, NULL);
+
+	if (len)
+		return len;
+	if (*start != '(')
+		die(_("bad ls-tree format: element '%s' does not start with '('"), start);
+
+	end = strchr(start + 1, ')');
+	if (!end)
+		die(_("bad ls-tree format: element '%s' does not end in ')'"), start);
+
+	len = end - start + 1;
+	if (skip_prefix(start, "(objectmode)", &p)) {
+		strbuf_addf(sb, "%06o", data->mode);
+	} else if (skip_prefix(start, "(objecttype)", &p)) {
+		strbuf_addstr(sb, type_name(data->type));
+	} else if (skip_prefix(start, "(objectsize:padded)", &p)) {
+		expand_objectsize(sb, data->oid, data->type, 1);
+	} else if (skip_prefix(start, "(objectsize)", &p)) {
+		expand_objectsize(sb, data->oid, data->type, 0);
+	} else if (skip_prefix(start, "(objectname)", &p)) {
+		strbuf_add_unique_abbrev(sb, data->oid, abbrev);
+	} else if (skip_prefix(start, "(path)", &p)) {
+		const char *name = data->base->buf;
+		const char *prefix = chomp_prefix ? ls_tree_prefix : NULL;
+		struct strbuf quoted = STRBUF_INIT;
+		struct strbuf sbuf = STRBUF_INIT;
+		strbuf_addstr(data->base, data->pathname);
+		name = relative_path(data->base->buf, prefix, &sbuf);
+		quote_c_style(name, &quoted, NULL, 0);
+		strbuf_addbuf(sb, &quoted);
+		strbuf_release(&sbuf);
+		strbuf_release(&quoted);
+	} else {
+		errlen = (unsigned long)len;
+		die(_("bad ls-tree format: %%%.*s"), errlen, start);
+	}
+	return len;
+}
+
 static int parse_shown_fields(void)
 {
 	if (cmdmode == MODE_NAME_ONLY) {
@@ -101,6 +170,38 @@ static int show_recursive(const char *base, size_t baselen, const char *pathname
 	return 0;
 }
 
+static int show_tree_fmt(const struct object_id *oid, struct strbuf *base,
+			 const char *pathname, unsigned mode, void *context)
+{
+	size_t baselen;
+	int recurse = 0;
+	struct strbuf sb = STRBUF_INIT;
+	enum object_type type = object_type(mode);
+
+	struct show_tree_data data = {
+		.mode = mode,
+		.type = type,
+		.oid = oid,
+		.pathname = pathname,
+		.base = base,
+	};
+
+	if (type == OBJ_TREE && show_recursive(base->buf, base->len, pathname))
+		recurse = READ_TREE_RECURSIVE;
+	if (type == OBJ_TREE && recurse && !(ls_options & LS_SHOW_TREES))
+		return recurse;
+	if (type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
+		return 0;
+
+	baselen = base->len;
+	strbuf_expand(&sb, format, expand_show_tree, &data);
+	strbuf_addch(&sb, line_termination);
+	fwrite(sb.buf, sb.len, 1, stdout);
+	strbuf_release(&sb);
+	strbuf_setlen(base, baselen);
+	return recurse;
+}
+
 static int show_default(struct show_tree_data *data)
 {
 	size_t baselen = data->base->len;
@@ -174,6 +275,7 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	struct object_id oid;
 	struct tree *tree;
 	int i, full_tree = 0;
+	read_tree_fn_t fn = show_tree;
 	const struct option ls_tree_options[] = {
 		OPT_BIT('d', NULL, &ls_options, N_("only show trees"),
 			LS_TREE_ONLY),
@@ -194,6 +296,9 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 		OPT_BOOL(0, "full-tree", &full_tree,
 			 N_("list entire tree; not just current directory "
 			    "(implies --full-name)")),
+		OPT_STRING_F(0, "format", &format, N_("format"),
+					 N_("format to use for the output"),
+					 PARSE_OPT_NONEG),
 		OPT__ABBREV(&abbrev),
 		OPT_END()
 	};
@@ -214,6 +319,10 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	    ((LS_TREE_ONLY|LS_RECURSIVE) & ls_options))
 		ls_options |= LS_SHOW_TREES;
 
+	if (format && cmdmode)
+		usage_msg_opt(
+			_("--format can't be combined with other format-altering options"),
+			ls_tree_usage, ls_tree_options);
 	if (argc < 1)
 		usage_with_options(ls_tree_usage, ls_tree_options);
 	if (get_oid(argv[0], &oid))
@@ -237,6 +346,20 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	tree = parse_tree_indirect(&oid);
 	if (!tree)
 		die("not a tree object");
-	return !!read_tree(the_repository, tree,
-			   &pathspec, show_tree, NULL);
+	/*
+	 * The generic show_tree_fmt() is slower than show_tree(), so
+	 * take the fast path if possible.
+	 */
+	if (format && (!strcmp(format, default_format))) {
+		fn = show_tree;
+	} else if (format && (!strcmp(format, long_format))) {
+		shown_fields = shown_fields | FIELD_SIZE;
+		fn = show_tree;
+	} else if (format && (!strcmp(format, name_only_format))) {
+		shown_fields = FIELD_PATH_NAME;
+		fn = show_tree;
+	} else if (format)
+		fn = show_tree_fmt;
+
+	return !!read_tree(the_repository, tree, &pathspec, fn, NULL);
 }
diff --git a/t/t3104-ls-tree-format.sh b/t/t3104-ls-tree-format.sh
new file mode 100755
index 0000000000..e08c83dc47
--- /dev/null
+++ b/t/t3104-ls-tree-format.sh
@@ -0,0 +1,81 @@
+#!/bin/sh
+
+test_description='ls-tree --format'
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+test_expect_success 'ls-tree --format usage' '
+	test_expect_code 129 git ls-tree --format=fmt -l HEAD &&
+	test_expect_code 129 git ls-tree --format=fmt --name-only HEAD &&
+	test_expect_code 129 git ls-tree --format=fmt --name-status HEAD
+'
+
+test_expect_success 'setup' '
+	mkdir dir &&
+	test_commit dir/sub-file &&
+	test_commit top-file
+'
+
+test_ls_tree_format () {
+	format=$1 &&
+	opts=$2 &&
+	fmtopts=$3 &&
+	shift 2 &&
+	git ls-tree $opts -r HEAD >expect.raw &&
+	sed "s/^/> /" >expect <expect.raw &&
+	git ls-tree --format="> $format" -r $fmtopts HEAD >actual &&
+	test_cmp expect actual
+}
+
+test_expect_success 'ls-tree --format=<default-like>' '
+	test_ls_tree_format \
+		"%(objectmode) %(objecttype) %(objectname)%x09%(path)" \
+		""
+'
+
+test_expect_success 'ls-tree --format=<long-like>' '
+	test_ls_tree_format \
+		"%(objectmode) %(objecttype) %(objectname) %(objectsize:padded)%x09%(path)" \
+		"--long"
+'
+
+test_expect_success 'ls-tree --format=<name-only-like>' '
+	test_ls_tree_format \
+		"%(path)" \
+		"--name-only"
+'
+
+test_expect_success 'ls-tree combine --format=<default-like> and -t' '
+	test_ls_tree_format \
+	"%(objectmode) %(objecttype) %(objectname)%x09%(path)" \
+	"-t" \
+	"-t"
+'
+
+test_expect_success 'ls-tree combine --format=<default-like> and --full-name' '
+	test_ls_tree_format \
+	"%(objectmode) %(objecttype) %(objectname)%x09%(path)" \
+	"--full-name" \
+	"--full-name"
+'
+
+test_expect_success 'ls-tree combine --format=<default-like> and --full-tree' '
+	test_ls_tree_format \
+	"%(objectmode) %(objecttype) %(objectname)%x09%(path)" \
+	"--full-tree" \
+	"--full-tree"
+'
+
+test_expect_success 'ls-tree hit fast-path with --format=<default-like>' '
+	git ls-tree -r HEAD >expect &&
+	git ls-tree --format="%(objectmode) %(objecttype) %(objectname)%x09%(path)" -r HEAD >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'ls-tree hit fast-path with --format=<name-only-like>' '
+	git ls-tree -r --name-only HEAD >expect &&
+	git ls-tree --format="%(path)" -r HEAD >actual &&
+	test_cmp expect actual
+'
+test_done
-- 
2.34.1.403.gb35f2687cf.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v11 12/13] ls-tree: introduce function "fast_path()"
  2022-02-08 12:14                 ` [PATCH v11 00/13] ls-tree: "--object-only" and "--format" opts Teng Long
                                     ` (10 preceding siblings ...)
  2022-02-08 12:14                   ` [PATCH v11 11/13] ls-tree.c: introduce "--format" option Teng Long
@ 2022-02-08 12:14                   ` Teng Long
  2022-02-19  5:32                     ` Ævar Arnfjörð Bjarmason
  2022-02-08 12:14                   ` [PATCH v11 13/13] ls-tree.c: support --object-only option for "git-ls-tree" Teng Long
  2022-03-04 10:42                   ` [PATCH v12 00/12] ls-tree: "--object-only" and "--format" opts Teng Long
  13 siblings, 1 reply; 224+ messages in thread
From: Teng Long @ 2022-02-08 12:14 UTC (permalink / raw)
  To: dyroneteng
  Cc: Johannes.Schindelin, avarab, congdanhqx, git, gitster,
	martin.agren, peff, tenglong.tl

The generic "show_tree_fmt()" is slower than "show_tree()", so
we want to take the fast path if possible.

When "--format=<format>" is passed, "fast_path()" determines
whether to use "show_tree()" or fall back to "show_tree_fmt()"
by checking whether the format matches one of the built-in formats.

This commit moves the related code out of "cmd_ls_tree()" and
packages it into a new function, "fast_path()".

To explain a little further: whether or not the fast path is hit,
the output must stay the same. Extracting a separate function
improves the readability of "cmd_ls_tree()" and the cohesiveness
and extensibility of the fast path logic.
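
The selection logic can be sketched in isolation as follows. This is
a simplified stand-in, not the code in the diff below: the enum
printer type, the builtin_formats[] table and choose_printer() are
names invented for this example, and the sketch leaves out the
"shown_fields" side effects that the real "fast_path()" has.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: which printing routine should run. */
enum printer {
	PRINTER_BUILTIN,	/* the hand-written show_tree()-style path */
	PRINTER_GENERIC		/* the slower show_tree_fmt()-style path */
};

/* Formats that correspond exactly to an existing output mode. */
static const char *const builtin_formats[] = {
	"%(objectmode) %(objecttype) %(objectname)%x09%(path)",
	"%(objectmode) %(objecttype) %(objectname) %(objectsize:padded)%x09%(path)",
	"%(path)",
};

static enum printer choose_printer(const char *format)
{
	size_t i;

	if (!format)
		return PRINTER_BUILTIN;	/* no --format was given */
	for (i = 0; i < sizeof(builtin_formats) / sizeof(builtin_formats[0]); i++)
		if (!strcmp(format, builtin_formats[i]))
			return PRINTER_BUILTIN;
	return PRINTER_GENERIC;
}
```

Because the comparison is an exact strcmp(), a format that differs
from a built-in one by as little as a trailing space takes the generic
path. The output is still correct in that case, only slower.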

Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
 builtin/ls-tree.c | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 1c71e5d543..ba96bcf602 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -58,6 +58,19 @@ enum {
 
 static int cmdmode = MODE_UNSPECIFIED;
 
+static int fast_path(void){
+	if (!strcmp(format, default_format)) {
+		return 1;
+	} else if (!strcmp(format, long_format)) {
+		shown_fields = shown_fields | FIELD_SIZE;
+		return 1;
+	} else if (!strcmp(format, name_only_format)) {
+		shown_fields = FIELD_PATH_NAME;
+		return 1;
+	}
+	return 0;
+}
+
 static void expand_objectsize(struct strbuf *line, const struct object_id *oid,
 			      const enum object_type type, unsigned int padded)
 {
@@ -350,15 +363,7 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	 * The generic show_tree_fmt() is slower than show_tree(), so
 	 * take the fast path if possible.
 	 */
-	if (format && (!strcmp(format, default_format))) {
-		fn = show_tree;
-	} else if (format && (!strcmp(format, long_format))) {
-		shown_fields = shown_fields | FIELD_SIZE;
-		fn = show_tree;
-	} else if (format && (!strcmp(format, name_only_format))) {
-		shown_fields = FIELD_PATH_NAME;
-		fn = show_tree;
-	} else if (format)
+	if (format && !fast_path())
 		fn = show_tree_fmt;
 
 	return !!read_tree(the_repository, tree, &pathspec, fn, NULL);
-- 
2.34.1.403.gb35f2687cf.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* [PATCH v11 13/13] ls-tree.c: support --object-only option for "git-ls-tree"
  2022-02-08 12:14                 ` [PATCH v11 00/13] ls-tree: "--object-only" and "--format" opts Teng Long
                                     ` (11 preceding siblings ...)
  2022-02-08 12:14                   ` [PATCH v11 12/13] ls-tree: introduce function "fast_path()" Teng Long
@ 2022-02-08 12:14                   ` Teng Long
  2022-02-19  5:24                     ` Ævar Arnfjörð Bjarmason
  2022-03-04 10:42                   ` [PATCH v12 00/12] ls-tree: "--object-only" and "--format" opts Teng Long
  13 siblings, 1 reply; 224+ messages in thread
From: Teng Long @ 2022-02-08 12:14 UTC (permalink / raw)
  To: dyroneteng
  Cc: Johannes.Schindelin, avarab, congdanhqx, git, gitster,
	martin.agren, peff, tenglong.tl

We usually pipe the output from `git ls-tree` to tools like
`sed` or `cut` when we only want to extract some fields.

When we want only the pathname component, we can pass
`--name-only` option to omit such a pipeline, but there are no
options for extracting other fields.
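
For reference, the post-processing that such a pipeline performs on
each default-format line ("<mode> SP <type> SP <object> TAB <file>")
amounts to something like the C sketch below. The function name
extract_object_name() is made up for this illustration and is not
part of git.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the <object> field of one default-format ls-tree line into
 * out. Returns 0 on success, -1 if the line is malformed or out is
 * too small. Mode and type never contain spaces, so the first two
 * spaces are the field separators; the path follows the TAB.
 */
static int extract_object_name(const char *line, char *out, size_t outlen)
{
	const char *sp, *oid, *tab;
	size_t len;

	sp = strchr(line, ' ');		/* end of <mode> */
	if (!sp)
		return -1;
	sp = strchr(sp + 1, ' ');	/* end of <type> */
	if (!sp)
		return -1;
	oid = sp + 1;
	tab = strchr(oid, '\t');	/* end of <object> */
	if (!tab)
		return -1;
	len = (size_t)(tab - oid);
	if (len >= outlen)
		return -1;
	memcpy(out, oid, len);
	out[len] = '\0';
	return 0;
}
```

Every consumer of the output has to repeat this kind of field
splitting, which is the motivation for printing the object name
directly.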

Teach the "--object-only" option to the command to only show the
object name. This option cannot be used together with
"--name-only" or "--long"; they are mutually exclusive (previously
"--name-only" and "--long" could be combined, and this commit
fixes that bug along the way).

In terms of performance, there is no loss compared to
"master" (2ae0a9c). Here are the results of the performance tests
in my environment, based on the linux repository:

    $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r HEAD"
    Benchmark 1: /opt/git/master/bin/git ls-tree -r HEAD
    Time (mean ± σ):     105.8 ms ±   2.7 ms    [User: 85.7 ms, System: 20.0 ms]
    Range (min … max):   101.5 ms … 111.3 ms    28 runs

    $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r HEAD"
    Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r HEAD
    Time (mean ± σ):     105.0 ms ±   3.0 ms    [User: 83.7 ms, System: 21.2 ms]
    Range (min … max):    99.3 ms … 109.5 ms    27 runs

    $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r -l HEAD"
    Benchmark 1: /opt/git/master/bin/git ls-tree -r -l HEAD
    Time (mean ± σ):     337.4 ms ±  10.9 ms    [User: 308.3 ms, System: 29.0 ms]
    Range (min … max):   323.0 ms … 355.0 ms    10 runs

    $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r -l HEAD"
    Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r -l HEAD
    Time (mean ± σ):     337.6 ms ±   6.2 ms    [User: 309.4 ms, System: 28.1 ms]
    Range (min … max):   330.4 ms … 349.9 ms    10 runs

Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
 Documentation/git-ls-tree.txt |  7 ++++-
 builtin/ls-tree.c             | 16 ++++++++++-
 t/t3104-ls-tree-format.sh     | 12 +++++++++
 t/t3105-ls-tree-oid.sh        | 51 +++++++++++++++++++++++++++++++++++
 4 files changed, 84 insertions(+), 2 deletions(-)
 create mode 100755 t/t3105-ls-tree-oid.sh

diff --git a/Documentation/git-ls-tree.txt b/Documentation/git-ls-tree.txt
index db29a9efb5..21045dd163 100644
--- a/Documentation/git-ls-tree.txt
+++ b/Documentation/git-ls-tree.txt
@@ -10,7 +10,7 @@ SYNOPSIS
 --------
 [verse]
 'git ls-tree' [-d] [-r] [-t] [-l] [-z]
-	    [--name-only] [--name-status] [--full-name] [--full-tree] [--abbrev[=<n>]] [--format=<format>]
+	    [--name-only] [--name-status] [--object-only] [--full-name] [--full-tree] [--abbrev[=<n>]] [--format=<format>]
 	    <tree-ish> [<path>...]
 
 DESCRIPTION
@@ -59,6 +59,11 @@ OPTIONS
 --name-only::
 --name-status::
 	List only filenames (instead of the "long" output), one per line.
+	Cannot be combined with `--object-only`.
+
+--object-only::
+	List only names of the objects, one per line. Cannot be combined
+	with `--name-only` or `--name-status`.
 
 --abbrev[=<n>]::
 	Instead of showing the full 40-byte hexadecimal object
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index ba96bcf602..9819a24186 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -20,6 +20,7 @@ static int line_termination = '\n';
 #define LS_SHOW_TREES (1 << 2)
 #define LS_NAME_ONLY (1 << 3)
 #define LS_SHOW_SIZE (1 << 4)
+#define LS_OBJECT_ONLY (1 << 5)
 static int abbrev;
 static int ls_options;
 static struct pathspec pathspec;
@@ -37,6 +38,7 @@ static const char *format;
 static const char *default_format = "%(objectmode) %(objecttype) %(objectname)%x09%(path)";
 static const char *long_format = "%(objectmode) %(objecttype) %(objectname) %(objectsize:padded)%x09%(path)";
 static const char *name_only_format = "%(path)";
+static const char *object_only_format = "%(objectname)";
 struct show_tree_data {
 	unsigned mode;
 	enum object_type type;
@@ -53,6 +55,7 @@ static const  char * const ls_tree_usage[] = {
 enum {
 	MODE_UNSPECIFIED = 0,
 	MODE_NAME_ONLY,
+	MODE_OBJECT_ONLY,
 	MODE_LONG,
 };
 
@@ -67,6 +70,8 @@ static int fast_path(void){
 	} else if (!strcmp(format, name_only_format)) {
 		shown_fields = FIELD_PATH_NAME;
 		return 1;
+	} else if (!strcmp(format, object_only_format)) {
+		shown_fields = FIELD_OBJECT_NAME;
 	}
 	return 0;
 }
@@ -143,7 +148,10 @@ static int parse_shown_fields(void)
 		shown_fields = FIELD_PATH_NAME;
 		return 0;
 	}
-
+	if (cmdmode == MODE_OBJECT_ONLY) {
+		shown_fields = FIELD_OBJECT_NAME;
+		return 0;
+	}
 	if (!ls_options || (ls_options & LS_RECURSIVE)
 	    || (ls_options & LS_SHOW_TREES)
 	    || (ls_options & LS_TREE_ONLY))
@@ -267,6 +275,10 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 	if (type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
 		return 0;
 
+	if (shown_fields == FIELD_OBJECT_NAME) {
+		printf("%s%c", find_unique_abbrev(oid, abbrev), line_termination);
+		return recurse;
+	}
 	if (shown_fields == FIELD_PATH_NAME) {
 		baselen = base->len;
 		strbuf_addstr(base, pathname);
@@ -304,6 +316,8 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 			    MODE_NAME_ONLY),
 		OPT_CMDMODE(0, "name-status", &cmdmode, N_("list only filenames"),
 			    MODE_NAME_ONLY),
+		OPT_CMDMODE(0, "object-only", &cmdmode, N_("list only objects"),
+			    MODE_OBJECT_ONLY),
 		OPT_SET_INT(0, "full-name", &chomp_prefix,
 			    N_("use full path names"), 0),
 		OPT_BOOL(0, "full-tree", &full_tree,
diff --git a/t/t3104-ls-tree-format.sh b/t/t3104-ls-tree-format.sh
index e08c83dc47..c0ffc8e1c3 100755
--- a/t/t3104-ls-tree-format.sh
+++ b/t/t3104-ls-tree-format.sh
@@ -46,6 +46,12 @@ test_expect_success 'ls-tree --format=<name-only-like>' '
 		"--name-only"
 '
 
+test_expect_success 'ls-tree --format=<object-only-like>' '
+	test_ls_tree_format \
+		"%(objectname)" \
+		"--object-only"
+'
+
 test_expect_success 'ls-tree combine --format=<default-like> and -t' '
 	test_ls_tree_format \
 	"%(objectmode) %(objecttype) %(objectname)%x09%(path)" \
@@ -78,4 +84,10 @@ test_expect_success 'ls-tree hit fast-path with --format=<name-only-like>' '
 	git ls-tree --format="%(path)" -r HEAD >actual &&
 	test_cmp expect actual
 '
+
+test_expect_success 'ls-tree hit fast-path with --format=<object-only-like>' '
+	git ls-tree -r --object-only HEAD >expect &&
+	git ls-tree --format="%(objectname)" -r HEAD >actual &&
+	test_cmp expect actual
+'
 test_done
diff --git a/t/t3105-ls-tree-oid.sh b/t/t3105-ls-tree-oid.sh
new file mode 100755
index 0000000000..992bb26bfa
--- /dev/null
+++ b/t/t3105-ls-tree-oid.sh
@@ -0,0 +1,51 @@
+#!/bin/sh
+
+test_description='git ls-tree objects handling.'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+test_commit A &&
+test_commit B &&
+mkdir -p C &&
+test_commit C/D.txt &&
+find *.txt path* \( -type f -o -type l \) -print |
+xargs git update-index --add &&
+tree=$(git write-tree) &&
+echo $tree
+'
+
+test_expect_success 'usage: --object-only' '
+git ls-tree --object-only $tree >current &&
+git ls-tree $tree >result &&
+cut -f1 result | cut -d " " -f3 >expected &&
+test_cmp current expected
+'
+
+test_expect_success 'usage: --object-only with -r' '
+git ls-tree --object-only -r $tree >current &&
+git ls-tree -r $tree >result &&
+cut -f1 result | cut -d " " -f3 >expected &&
+test_cmp current expected
+'
+
+test_expect_success 'usage: --object-only with --abbrev' '
+git ls-tree --object-only --abbrev=6 $tree >current &&
+git ls-tree --abbrev=6 $tree >result &&
+cut -f1 result | cut -d " " -f3 >expected &&
+test_cmp current expected
+'
+
+test_expect_success 'usage: incompatible options: --name-only with --object-only' '
+test_expect_code 129 git ls-tree --object-only --name-only $tree
+'
+
+test_expect_success 'usage: incompatible options: --name-status with --object-only' '
+test_expect_code 129 git ls-tree --object-only --name-status $tree
+'
+
+test_expect_success 'usage: incompatible options: --long with --object-only' '
+test_expect_code 129 git ls-tree --object-only --long $tree
+'
+
+test_done
-- 
2.34.1.403.gb35f2687cf.dirty


^ permalink raw reply	[flat|nested] 224+ messages in thread

* Re: [PATCH v11 13/13] ls-tree.c: support --object-only option for "git-ls-tree"
  2022-02-08 12:14                   ` [PATCH v11 13/13] ls-tree.c: support --object-only option for "git-ls-tree" Teng Long
@ 2022-02-19  5:24                     ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-19  5:24 UTC (permalink / raw)
  To: Teng Long
  Cc: Johannes.Schindelin, congdanhqx, git, gitster, martin.agren,
	peff, tenglong.tl


On Tue, Feb 08 2022, Teng Long wrote:

> We usually pipe the output from `git ls-trees` to tools like
> `sed` or `cut` when we only want to extract some fields.
>
> When we want only the pathname component, we can pass
> `--name-only` option to omit such a pipeline, but there are no
> options for extracting other fields.
>
> Teach the "--object-only" option to the command to only show the
> object name. This option cannot be used together with
> "--name-only" or "--long" , they are mutually exclusive (actually
> "--name-only" and "--long" can be combined together before, this
> commit by the way fix this bug).
>
> In terms of performance, there is no loss comparing to the
> "master" (2ae0a9c), here are the
> results of the performance tests in my environment based on linux
> repository:

I think given the re-arrangement in this v11 it would make sense to
change the commit message to say:

 * This is an alias for --format=%(objectname)
 * Per benchmark XYZ it's faster

I.e. this:

>     $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r HEAD"
>     Benchmark 1: /opt/git/master/bin/git ls-tree -r HEAD
>     Time (mean ± σ):     105.8 ms ±   2.7 ms    [User: 85.7 ms, System: 20.0 ms]
>     Range (min … max):   101.5 ms … 111.3 ms    28 runs
>
>     $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r HEAD"
>     Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r HEAD
>     Time (mean ± σ):     105.0 ms ±   3.0 ms    [User: 83.7 ms, System: 21.2 ms]
>     Range (min … max):    99.3 ms … 109.5 ms    27 runs
>
>     $hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r -l HEAD"
>     Benchmark 1: /opt/git/master/bin/git ls-tree -r -l HEAD
>     Time (mean ± σ):     337.4 ms ±  10.9 ms    [User: 308.3 ms, System: 29.0 ms]
>     Range (min … max):   323.0 ms … 355.0 ms    10 runs
>
>     $hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r -l HEAD"
>     Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r -l HEAD
>     Time (mean ± σ):     337.6 ms ±   6.2 ms    [User: 309.4 ms, System: 28.1 ms]
>     Range (min … max):   330.4 ms … 349.9 ms    10 runs

Is surely more relevant if compared to master & that --format.

> Signed-off-by: Teng Long <dyroneteng@gmail.com>
> ---
>  Documentation/git-ls-tree.txt |  7 ++++-
>  builtin/ls-tree.c             | 16 ++++++++++-
>  t/t3104-ls-tree-format.sh     | 12 +++++++++
>  t/t3105-ls-tree-oid.sh        | 51 +++++++++++++++++++++++++++++++++++
>  4 files changed, 84 insertions(+), 2 deletions(-)
>  create mode 100755 t/t3105-ls-tree-oid.sh
>
> diff --git a/Documentation/git-ls-tree.txt b/Documentation/git-ls-tree.txt
> index db29a9efb5..21045dd163 100644
> --- a/Documentation/git-ls-tree.txt
> +++ b/Documentation/git-ls-tree.txt
> @@ -10,7 +10,7 @@ SYNOPSIS
>  --------
>  [verse]
>  'git ls-tree' [-d] [-r] [-t] [-l] [-z]
> -	    [--name-only] [--name-status] [--full-name] [--full-tree] [--abbrev[=<n>]] [--format=<format>]
> +	    [--name-only] [--name-status] [--object-only] [--full-name] [--full-tree] [--abbrev[=<n>]] [--format=<format>]
>  	    <tree-ish> [<path>...]
>  
>  DESCRIPTION
> @@ -59,6 +59,11 @@ OPTIONS
>  --name-only::
>  --name-status::
>  	List only filenames (instead of the "long" output), one per line.
> +	Cannot be combined with `--object-only`.
> +
> +--object-only::
> +	List only names of the objects, one per line. Cannot be combined
> +	with `--name-only` or `--name-status`.

Hrm, I regret that in my version of v11 11/13 I didn't add to all of these something like:
    
    This is equivalent to specifying `--format=...`, but for both this
    option and that exact format the command takes a hand-optimized
    codepath instead of going through the generic formatting mechanism.

Or whatever, and perhaps have everything after ", but for[...]" part of
the generic FORMAT section (no need to say it for every option).
    
>  
>  --abbrev[=<n>]::
>  	Instead of showing the full 40-byte hexadecimal object
> diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
> index ba96bcf602..9819a24186 100644
> --- a/builtin/ls-tree.c
> +++ b/builtin/ls-tree.c
> @@ -20,6 +20,7 @@ static int line_termination = '\n';
>  #define LS_SHOW_TREES (1 << 2)
>  #define LS_NAME_ONLY (1 << 3)
>  #define LS_SHOW_SIZE (1 << 4)
> +#define LS_OBJECT_ONLY (1 << 5)
>  static int abbrev;
>  static int ls_options;
>  static struct pathspec pathspec;
> @@ -37,6 +38,7 @@ static const char *format;
>  static const char *default_format = "%(objectmode) %(objecttype) %(objectname)%x09%(path)";
>  static const char *long_format = "%(objectmode) %(objecttype) %(objectname) %(objectsize:padded)%x09%(path)";
>  static const char *name_only_format = "%(path)";
> +static const char *object_only_format = "%(objectname)";
>  struct show_tree_data {
>  	unsigned mode;
>  	enum object_type type;
> @@ -53,6 +55,7 @@ static const  char * const ls_tree_usage[] = {
>  enum {
>  	MODE_UNSPECIFIED = 0,
>  	MODE_NAME_ONLY,
> +	MODE_OBJECT_ONLY,
>  	MODE_LONG,
>  };
>  
> @@ -67,6 +70,8 @@ static int fast_path(void){
>  	} else if (!strcmp(format, name_only_format)) {
>  		shown_fields = FIELD_PATH_NAME;
>  		return 1;
> +	} else if (!strcmp(format, object_only_format)) {
> +		shown_fields = FIELD_OBJECT_NAME;
>  	}
>  	return 0;
>  }
> @@ -143,7 +148,10 @@ static int parse_shown_fields(void)
>  		shown_fields = FIELD_PATH_NAME;
>  		return 0;
>  	}
> -
> +	if (cmdmode == MODE_OBJECT_ONLY) {
> +		shown_fields = FIELD_OBJECT_NAME;
> +		return 0;
> +	}
>  	if (!ls_options || (ls_options & LS_RECURSIVE)
>  	    || (ls_options & LS_SHOW_TREES)
>  	    || (ls_options & LS_TREE_ONLY))
> @@ -267,6 +275,10 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
>  	if (type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
>  		return 0;
>  
> +	if (shown_fields == FIELD_OBJECT_NAME) {
> +		printf("%s%c", find_unique_abbrev(oid, abbrev), line_termination);
> +		return recurse;
> +	}
>  	if (shown_fields == FIELD_PATH_NAME) {
>  		baselen = base->len;
>  		strbuf_addstr(base, pathname);
> @@ -304,6 +316,8 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
>  			    MODE_NAME_ONLY),
>  		OPT_CMDMODE(0, "name-status", &cmdmode, N_("list only filenames"),
>  			    MODE_NAME_ONLY),
> +		OPT_CMDMODE(0, "object-only", &cmdmode, N_("list only objects"),
> +			    MODE_OBJECT_ONLY),
>  		OPT_SET_INT(0, "full-name", &chomp_prefix,
>  			    N_("use full path names"), 0),
>  		OPT_BOOL(0, "full-tree", &full_tree,
> diff --git a/t/t3104-ls-tree-format.sh b/t/t3104-ls-tree-format.sh
> index e08c83dc47..c0ffc8e1c3 100755
> --- a/t/t3104-ls-tree-format.sh
> +++ b/t/t3104-ls-tree-format.sh
> @@ -46,6 +46,12 @@ test_expect_success 'ls-tree --format=<name-only-like>' '
>  		"--name-only"
>  '

This looks much better/less complex than in earlier rounds.

> +test_expect_success 'ls-tree --format=<object-only-like>' '
> +	test_ls_tree_format \
> +		"%(objectname)" \
> +		"--object-only"
> +'
> +
>  test_expect_success 'ls-tree combine --format=<default-like> and -t' '
>  	test_ls_tree_format \
>  	"%(objectmode) %(objecttype) %(objectname)%x09%(path)" \
> @@ -78,4 +84,10 @@ test_expect_success 'ls-tree hit fast-path with --format=<name-only-like>' '
>  	git ls-tree --format="%(path)" -r HEAD >actual &&
>  	test_cmp expect actual
>  '
> +
> +test_expect_success 'ls-tree hit fast-path with --format=<object-only-like>' '
> +	git ls-tree -r --object-only HEAD >expect &&
> +	git ls-tree --format="%(objectname)" -r HEAD >actual &&
> +	test_cmp expect actual
> +'
>  test_done

So, you and I came up with independent tests for these two.

I wonder if this can be re-arranged so that we can share the tests, and
perhaps test all of them for both --format and --object-only in some
for-loop, or maybe it's not worth it.

> diff --git a/t/t3105-ls-tree-oid.sh b/t/t3105-ls-tree-oid.sh
> new file mode 100755
> index 0000000000..992bb26bfa
> --- /dev/null
> +++ b/t/t3105-ls-tree-oid.sh
> @@ -0,0 +1,51 @@
> +#!/bin/sh
> +
> +test_description='git ls-tree objects handling.'
> +
> +. ./test-lib.sh
> +
> +test_expect_success 'setup' '
> +test_commit A &&
> +test_commit B &&
> +mkdir -p C &&
> +test_commit C/D.txt &&
> +find *.txt path* \( -type f -o -type l \) -print |
> +xargs git update-index --add &&
> +tree=$(git write-tree) &&
> +echo $tree
> +'
> +
> +test_expect_success 'usage: --object-only' '
> +git ls-tree --object-only $tree >current &&
> +git ls-tree $tree >result &&
> +cut -f1 result | cut -d " " -f3 >expected &&
> +test_cmp current expected
> +'
> +
> +test_expect_success 'usage: --object-only with -r' '
> +git ls-tree --object-only -r $tree >current &&
> +git ls-tree -r $tree >result &&
> +cut -f1 result | cut -d " " -f3 >expected &&
> +test_cmp current expected
> +'
> +
> +test_expect_success 'usage: --object-only with --abbrev' '
> +git ls-tree --object-only --abbrev=6 $tree >current &&
> +git ls-tree --abbrev=6 $tree >result &&
> +cut -f1 result | cut -d " " -f3 >expected &&
> +test_cmp current expected
> +'
> +
> +test_expect_success 'usage: incompatible options: --name-only with --object-only' '
> +test_expect_code 129 git ls-tree --object-only --name-only $tree
> +'
> +
> +test_expect_success 'usage: incompatible options: --name-status with --object-only' '
> +test_expect_code 129 git ls-tree --object-only --name-status $tree
> +'
> +
> +test_expect_success 'usage: incompatible options: --long with --object-only' '
> +test_expect_code 129 git ls-tree --object-only --long $tree
> +'
> +
> +test_done

This whole test block seems to have lost its indentation since the v10.


* Re: [PATCH v11 12/13] ls-tree: introduce function "fast_path()"
  2022-02-08 12:14                   ` [PATCH v11 12/13] ls-tree: introduce function "fast_path()" Teng Long
@ 2022-02-19  5:32                     ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-19  5:32 UTC (permalink / raw)
  To: Teng Long
  Cc: Johannes.Schindelin, congdanhqx, git, gitster, martin.agren,
	peff, tenglong.tl


On Tue, Feb 08 2022, Teng Long wrote:

> The generic "show_tree_fmt()" is slower than "show_tree()", so
> we want to take the fast path if possible.
>
> When "--format=<format>" is passed, "fast_path()" determines
> whether to use "show_tree()" or to fall back to "show_tree_fmt()"
> by checking whether the format matches one of the built-in formats.
>
> This commit takes the related code out of "cmd_ls_tree()" and
> packages it into a new function, "fast_path()".
>
> To explain a little further: whether or not the fast path is hit,
> the final output must stay correct. Splitting this out into a separate
> function improves the readability of "cmd_ls_tree()" and the cohesion
> and extensibility of the fast path logic.

This whole commit message sounds a bit like "we're introducing this fast
path", but really it got added in 11/13, and this is just a refactoring
to split that into a function to slightly reduce the size of
cmd_ls_tree() itself.

Which I really don't mind, but it would be better if the commit message
said so, e.g.:

    In a preceding commit a fast path selection was added to cmd_ls_tree(),
    split it into a utility function because ...

But I got stuck on "..." because I couldn't find a reason :)

I.e. in 13/13 this isn't used at all, except by adding a new brace arm
to it, but then it could still live in cmd_ls_tree().

Personally I think the pre-image is a bit easier to read, but then again
I wrote that so I'm biased. I don't mind changing this, but structurally
for the series it seems better to squash it in if you'd want to keep it.
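
For reference, the selection being discussed boils down to an exact string comparison against the built-in formats. Here is a minimal standalone sketch of that idea, not the code from the series: `select_fast_path()` and `enum fast_path_kind` are hypothetical names, and only the three format strings are copied from the patches.

```c
#include <string.h>

/* Built-in format strings, copied from the patches in this thread. */
static const char *default_format = "%(objectmode) %(objecttype) %(objectname)%x09%(path)";
static const char *long_format = "%(objectmode) %(objecttype) %(objectname) %(objectsize:padded)%x09%(path)";
static const char *name_only_format = "%(path)";

/* Hypothetical result type; the series uses an int plus a global instead. */
enum fast_path_kind {
	FAST_PATH_NONE = 0,	/* fall back to the generic formatter */
	FAST_PATH_DEFAULT,
	FAST_PATH_LONG,
	FAST_PATH_NAME_ONLY,
};

/*
 * Hypothetical helper: the fast path is only taken when the user's
 * format is byte-for-byte identical to one of the built-in formats.
 */
enum fast_path_kind select_fast_path(const char *format)
{
	if (!strcmp(format, default_format))
		return FAST_PATH_DEFAULT;
	if (!strcmp(format, long_format))
		return FAST_PATH_LONG;
	if (!strcmp(format, name_only_format))
		return FAST_PATH_NAME_ONLY;
	return FAST_PATH_NONE;
}
```

Because the comparison is exact, a format that differs by a single space (say `"%(path) "`) goes through the generic formatter in this sketch. The output should stay correct either way; only the speed differs.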

> Signed-off-by: Teng Long <dyroneteng@gmail.com>
> ---
>  builtin/ls-tree.c | 23 ++++++++++++++---------
>  1 file changed, 14 insertions(+), 9 deletions(-)
>
> diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
> index 1c71e5d543..ba96bcf602 100644
> --- a/builtin/ls-tree.c
> +++ b/builtin/ls-tree.c
> @@ -58,6 +58,19 @@ enum {
>  
>  static int cmdmode = MODE_UNSPECIFIED;
>  
> +static int fast_path(void){
> +	if (!strcmp(format, default_format)) {
> +		return 1;
> +	} else if (!strcmp(format, long_format)) {
> +		shown_fields = shown_fields | FIELD_SIZE;
> +		return 1;
> +	} else if (!strcmp(format, name_only_format)) {
> +		shown_fields = FIELD_PATH_NAME;
> +		return 1;
> +	}
> +	return 0;
> +}

Just in terms of arranging things if you add a static function and it's
only used in one other function, here in cmd_ls_tree(), it's more
readable to add it immediately before that function.

>  static void expand_objectsize(struct strbuf *line, const struct object_id *oid,
>  			      const enum object_type type, unsigned int padded)
>  {
> @@ -350,15 +363,7 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
>  	 * The generic show_tree_fmt() is slower than show_tree(), so
>  	 * take the fast path if possible.
>  	 */
> -	if (format && (!strcmp(format, default_format))) {
> -		fn = show_tree;
> -	} else if (format && (!strcmp(format, long_format))) {
> -		shown_fields = shown_fields | FIELD_SIZE;
> -		fn = show_tree;
> -	} else if (format && (!strcmp(format, name_only_format))) {
> -		shown_fields = FIELD_PATH_NAME;
> -		fn = show_tree;
> -	} else if (format)
> +	if (format && !fast_path())
>  		fn = show_tree_fmt;
>  
>  	return !!read_tree(the_repository, tree, &pathspec, fn, NULL);

Also in terms of structure wouldn't it be better to end up with this:

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 9819a241869..47f7e2136b0 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -355,8 +355,6 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	if (get_oid(argv[0], &oid))
 		die("Not a valid object name %s", argv[0]);
 
-	parse_shown_fields();
-
 	/*
 	 * show_recursive() rolls its own matching code and is
 	 * generally ignorant of 'struct pathspec'. The magic mask
@@ -373,6 +371,8 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	tree = parse_tree_indirect(&oid);
 	if (!tree)
 		die("not a tree object");
+
+	parse_shown_fields();
 	/*
 	 * The generic show_tree_fmt() is slower than show_tree(), so
 	 * take the fast path if possible.

I.e. have the whole "shown_fields" decisions be near one another.


* Re: [PATCH v11 11/13] ls-tree.c: introduce "--format" option
  2022-02-08 12:14                   ` [PATCH v11 11/13] ls-tree.c: introduce "--format" option Teng Long
@ 2022-02-19  5:44                     ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-19  5:44 UTC (permalink / raw)
  To: Teng Long
  Cc: Johannes.Schindelin, congdanhqx, git, gitster, martin.agren,
	peff, tenglong.tl


On Tue, Feb 08 2022, Teng Long wrote:

> +	if (*start != '(')
> +		die(_("bad ls-tree format: as '%s'"), start);

My typo surely, but I think I meant "as of" not just "as" there:

    $ ./git ls-tree --format="%[blah)" -r HEAD
    fatal: bad ls-tree format: as of '[blah)'

> +
> +	end = strchr(start + 1, ')');
> +	if (!end)
> +		die(_("bad ls-tree format: element '%s' does not end in ')'"), start);

Or actually:

    $ ./git ls-tree --format="%(blah]" -r HEAD
    fatal: bad ls-tree format: element '(blah]' does not end in ')'

We could rather say for the first one:

    $ ./git ls-tree --format="%[blah)" -r HEAD
    fatal: bad ls-tree format: element '[blah)' does not start with '('

> [...]
> +		errlen = (unsigned long)len;
> +		die(_("bad ls-tree format: %%%.*s"), errlen, start);

I wondered why that %% is there (and I probably wrote this in the first
place, I didn't check:). But it makes sense, because strbuf_expand()
skips past the % for us, and we'd like to say e.g. %(foobar) here, not
(foobar) or whatever.
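
A small standalone sketch of that "%%" plus "%.*s" combination, in case it helps the next reader. `format_bad_element()` is a hypothetical helper that uses snprintf() rather than die(), and it takes the length as an int, which is what "%.*s" expects:

```c
#include <stdio.h>

/*
 * Hypothetical helper: formats the same message as the die() call quoted
 * above into "out". "start" points just past the '%' that the expander
 * already consumed, and "len" is the length of the offending element.
 */
int format_bad_element(char *out, size_t outsz, const char *start, int len)
{
	/* "%%" prints a literal '%'; "%.*s" prints at most len bytes of start. */
	return snprintf(out, outsz, "bad ls-tree format: %%%.*s", len, start);
}
```

Called with start pointing at "(foobar) and more" and len 8, this writes "bad ls-tree format: %(foobar)", i.e. the '%' is put back and the trailing text is cut off.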

> new file mode 100755
> index 0000000000..e08c83dc47
> --- /dev/null
> +++ b/t/t3104-ls-tree-format.sh
> @@ -0,0 +1,81 @@
> +#!/bin/sh
> +
> +test_description='ls-tree --format'
> +
> +TEST_PASSES_SANITIZE_LEAK=true

I notice now after commenting on your 13/13 that you should add
TEST_PASSES_SANITIZE_LEAK=true to it (assuming it doesn't leak, which I
don't think it does, but test with SANITIZE=leak first!)

> +test_ls_tree_format () {
> +	format=$1 &&
> +	opts=$2 &&
> +	fmtopts=$3 &&
> +	shift 2 &&
> +	git ls-tree $opts -r HEAD >expect.raw &&
> +	sed "s/^/> /" >expect <expect.raw &&
> +	git ls-tree --format="> $format" -r $fmtopts HEAD >actual &&
> +	test_cmp expect actual
> +}

I also forgot I wrote this, but also per my comment on 13/13 you can
just add your tests added in 13/13 to this file, then we'll assert that
-r etc. work the same for both.


* Re: [PATCH v11 08/13] ls-tree: slightly refactor `show_tree()`
  2022-02-08 12:14                   ` [PATCH v11 08/13] ls-tree: slightly refactor `show_tree()` Teng Long
@ 2022-02-19  5:56                     ` Ævar Arnfjörð Bjarmason
       [not found]                       ` <CADMgQSRYKB1ybxZWxQQ3uVM71fmdbzHqcK-WUPNKm2HMxw2C2g@mail.gmail.com>
  0 siblings, 1 reply; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-19  5:56 UTC (permalink / raw)
  To: Teng Long
  Cc: Johannes.Schindelin, congdanhqx, git, gitster, martin.agren,
	peff, tenglong.tl


On Tue, Feb 08 2022, Teng Long wrote:

> This is a non-functional change, we use a new int "shown_fields" to mark
> which columns to output, and `parse_shown_fields()` to calculate the
> value of "shown_fields".
>
> This has the advantage of making the show_tree logic simpler and more
> readable, as well as making it easier to extend new options (for example,
> if we want to add a "--object-only" option, we just need to add a similar
> "if (shown_fields == FIELD_OBJECT_NAME)" short-circuit logic in
> "show_tree()").

I think this and the 09/13 really don't make sense in combination.

Now, I clearly prefer to put options for the command into its own little
struct to pass around, I think it makes for easier reading than the
globals you end up with.

But tastes differ, and some built-ins use one, and some the other
pattern.

But this is really the worst of both worlds, let's just pick one or the
other, not effectively some options in that struct in 09/13, and
some in globals here...

> +static unsigned int shown_fields;
> +#define FIELD_PATH_NAME 1
> +#define FIELD_SIZE (1 << 1)
> +#define FIELD_OBJECT_NAME (1 << 2)
> +#define FIELD_TYPE (1 << 3)
> +#define FIELD_MODE (1 << 4)
> +#define FIELD_DEFAULT 29 /* 11101 size is not shown to output by default */

Why do we need some FIELD_DEFAULT here as opposed to just having it be
an enum field with a value of 0?

> +enum {
> +	MODE_UNSPECIFIED = 0,
> +	MODE_NAME_ONLY,
> +	MODE_LONG,
> +};
> +
> +static int cmdmode = MODE_UNSPECIFIED;

let's name this enum type and use it, see e.g. builtin/help.c's "static
enum help_action" for an example.

> +
> +static int parse_shown_fields(void)
> +{
> +	if (cmdmode == MODE_NAME_ONLY) {
> +		shown_fields = FIELD_PATH_NAME;
> +		return 0;
> +	}
> +
> +	if (!ls_options || (ls_options & LS_RECURSIVE)
> +	    || (ls_options & LS_SHOW_TREES)
> +	    || (ls_options & LS_TREE_ONLY))
> +		shown_fields = FIELD_DEFAULT;
> +	if (cmdmode == MODE_LONG)
> +		shown_fields = FIELD_LONG_DEFAULT;
> +	return 1;
> +}

I still don't really get why we can't just use the one MODE_*
here. E.g. doesn't MODE_LONG map to FIELD_LONG_DEFAULT, MODE_NAME_ONLY
to FIELD_PATH_NAME etc?

Is this all so we can do "shown_fields & FIELD_SIZE" in show_default()
as opposed to e.g. checking "default format or long format?" ?
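
To illustrate the question, here is a rough standalone sketch of a named enum mapped straight to a field mask. It is not a proposed patch: `enum ls_tree_cmdmode` and `fields_for_mode()` are hypothetical names, and the definition of FIELD_LONG_DEFAULT is an assumption (default fields plus the size), since it isn't shown in the quoted hunk.

```c
/* Field bits as defined in the quoted patch. */
#define FIELD_PATH_NAME 1
#define FIELD_SIZE (1 << 1)
#define FIELD_OBJECT_NAME (1 << 2)
#define FIELD_TYPE (1 << 3)
#define FIELD_MODE (1 << 4)
/* Spelled out instead of the literal 29 (binary 11101). */
#define FIELD_DEFAULT (FIELD_MODE | FIELD_TYPE | FIELD_OBJECT_NAME | FIELD_PATH_NAME)
/* Assumption: the long format is the default plus the size column. */
#define FIELD_LONG_DEFAULT (FIELD_DEFAULT | FIELD_SIZE)

/* Hypothetical enum name; zero-initialized storage means "unspecified". */
enum ls_tree_cmdmode {
	MODE_UNSPECIFIED = 0,
	MODE_NAME_ONLY,
	MODE_LONG,
};

/* Hypothetical helper: one mode in, one field mask out. */
unsigned int fields_for_mode(enum ls_tree_cmdmode mode)
{
	switch (mode) {
	case MODE_NAME_ONLY:
		return FIELD_PATH_NAME;
	case MODE_LONG:
		return FIELD_LONG_DEFAULT;
	case MODE_UNSPECIFIED:
	default:
		return FIELD_DEFAULT;
	}
}
```

With something like this, a check such as "fields & FIELD_SIZE" still works in the printing code, and the mode-to-fields relationship lives in one function.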


* Re: [PATCH v11 07/13] ls-tree: fix "--name-only" and "--long" combined use bug
  2022-02-08 12:14                   ` [PATCH v11 07/13] ls-tree: fix "--name-only" and "--long" combined use bug Teng Long
@ 2022-02-19  6:04                     ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-19  6:04 UTC (permalink / raw)
  To: Teng Long
  Cc: Johannes.Schindelin, congdanhqx, git, gitster, martin.agren,
	peff, tenglong.tl


On Tue, Feb 08 2022, Teng Long wrote:

> If we execute "git ls-tree" with "--name-only" and "--long" combined,
> only the pathname is printed and the size is omitted (the original
> discoverer was Peff in [1]).
>
> This commit fixes this issue by using `OPT_CMDMODE()` instead, making
> the two options mutually exclusive.
>
> [1] https://public-inbox.org/git/YZK0MKCYAJmG+pSU@coredump.intra.peff.net/
>
> Signed-off-by: Teng Long <dyroneteng@gmail.com>
> ---
>  builtin/ls-tree.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
> index 9c57a36c8c..32147e75e6 100644
> --- a/builtin/ls-tree.c
> +++ b/builtin/ls-tree.c
> @@ -120,12 +120,12 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
>  			LS_SHOW_TREES),
>  		OPT_SET_INT('z', NULL, &line_termination,
>  			    N_("terminate entries with NUL byte"), 0),
> -		OPT_BIT('l', "long", &ls_options, N_("include object size"),
> -			LS_SHOW_SIZE),
> -		OPT_BIT(0, "name-only", &ls_options, N_("list only filenames"),
> -			LS_NAME_ONLY),
> -		OPT_BIT(0, "name-status", &ls_options, N_("list only filenames"),
> -			LS_NAME_ONLY),
> +		OPT_CMDMODE('l', "long", &ls_options, N_("include object size"),
> +			    LS_SHOW_SIZE),
> +		OPT_CMDMODE(0, "name-only", &ls_options, N_("list only filenames"),
> +			    LS_NAME_ONLY),
> +		OPT_CMDMODE(0, "name-status", &ls_options, N_("list only filenames"),
> +			    LS_NAME_ONLY),
>  		OPT_SET_INT(0, "full-name", &chomp_prefix,
>  			    N_("use full path names"), 0),
>  		OPT_BOOL(0, "full-tree", &full_tree,

This seems like a sensible fix, but let's add a test for it. See:

    git grep 'test_expect_code 129'

For examples.
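
The behavior such a test would pin down can be sketched in isolation as below. This is not the parse-options implementation: `set_cmdmode()` is a hypothetical helper, and it assumes that choosing the same mode twice is fine while choosing two different modes is a usage error (exit code 129).

```c
/* Hypothetical stand-in for a command-mode variable; 0 means "not set yet". */
enum { MODE_UNSPECIFIED = 0, MODE_NAME_ONLY, MODE_LONG };

/*
 * Hypothetical helper, not git's parse-options code: records "mode" in
 * *cmdmode, returning 0 on success or 129 (the usage-error exit code
 * that the tests look for) when a different mode was already chosen.
 */
int set_cmdmode(int *cmdmode, int mode)
{
	if (*cmdmode != MODE_UNSPECIFIED && *cmdmode != mode)
		return 129;
	*cmdmode = mode;
	return 0;
}
```

Under that assumption "--name-only --name-status" keeps working (both map to the same mode), while "--name-only --long" is rejected.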


* Re: [PATCH v11 06/13] ls-tree: simplify nesting if/else logic in "show_tree()"
  2022-02-08 12:14                   ` [PATCH v11 06/13] ls-tree: simplify nesting if/else logic " Teng Long
@ 2022-02-19  6:06                     ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-19  6:06 UTC (permalink / raw)
  To: Teng Long
  Cc: Johannes.Schindelin, congdanhqx, git, gitster, martin.agren,
	peff, tenglong.tl, Teng Long


On Tue, Feb 08 2022, Teng Long wrote:

> This commit uses "object_type()" to get the type, then removes
> some of the nested ifs to make the code here cleaner.
>
> Signed-off-by: Teng Long <dyronetengb@gmail.com>
> ---
>  builtin/ls-tree.c | 18 ++++++------------
>  1 file changed, 6 insertions(+), 12 deletions(-)
>
> diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
> index ef8c414f61..9c57a36c8c 100644
> --- a/builtin/ls-tree.c
> +++ b/builtin/ls-tree.c
> @@ -66,19 +66,13 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
>  {
>  	int recurse = 0;
>  	size_t baselen;
> -	enum object_type type = OBJ_BLOB;
> +	enum object_type type = object_type(mode);
>  
> -	if (S_ISGITLINK(mode)) {
> -		type = OBJ_COMMIT;
> -	} else if (S_ISDIR(mode)) {
> -		if (show_recursive(base->buf, base->len, pathname)) {
> -			recurse = READ_TREE_RECURSIVE;
> -			if (!(ls_options & LS_SHOW_TREES))
> -				return recurse;
> -		}
> -		type = OBJ_TREE;
> -	}
> -	else if (ls_options & LS_TREE_ONLY)
> +	if (type == OBJ_TREE && show_recursive(base->buf, base->len, pathname))
> +		recurse = READ_TREE_RECURSIVE;
> +	if (type == OBJ_TREE && recurse && !(ls_options & LS_SHOW_TREES))
> +		return recurse;
> +	if (type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
>  		return 0;
>  
>  	if (!(ls_options & LS_NAME_ONLY)) {

I think the use of object_type() here is good, but in any case I think
doing a minimal change first for the "type" and then this proposed
refactoring would be easier to look at, and to independently decide on
the two.

I find this much easier to read, both as a diff and as end-state:

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index ef8c414f61a..0af09e94a03 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -66,20 +66,18 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 {
 	int recurse = 0;
 	size_t baselen;
-	enum object_type type = OBJ_BLOB;
+	enum object_type type = object_type(mode);
 
-	if (S_ISGITLINK(mode)) {
-		type = OBJ_COMMIT;
-	} else if (S_ISDIR(mode)) {
+	if (type == OBJ_BLOB) {
+		if (ls_options & LS_TREE_ONLY)
+			return 0;
+	} else if (type == OBJ_TREE) { 
 		if (show_recursive(base->buf, base->len, pathname)) {
 			recurse = READ_TREE_RECURSIVE;
 			if (!(ls_options & LS_SHOW_TREES))
 				return recurse;
 		}
-		type = OBJ_TREE;
 	}
-	else if (ls_options & LS_TREE_ONLY)
-		return 0;
 
 	if (!(ls_options & LS_NAME_ONLY)) {
 		if (ls_options & LS_SHOW_SIZE) {

Unrolling this from a logical if/else if to an if/if/if I think also
doesn't make sense. At the cost of a slightly larger diff (could be done
on top) we get rid of the show_recursive() branch too:

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index ef8c414f61a..d4be71bad24 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -66,20 +66,17 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
 {
 	int recurse = 0;
 	size_t baselen;
-	enum object_type type = OBJ_BLOB;
+	enum object_type type = object_type(mode);
 
-	if (S_ISGITLINK(mode)) {
-		type = OBJ_COMMIT;
-	} else if (S_ISDIR(mode)) {
-		if (show_recursive(base->buf, base->len, pathname)) {
-			recurse = READ_TREE_RECURSIVE;
-			if (!(ls_options & LS_SHOW_TREES))
-				return recurse;
-		}
-		type = OBJ_TREE;
+	if (type == OBJ_BLOB) {
+		if (ls_options & LS_TREE_ONLY)
+			return 0;
+	} else if (type == OBJ_TREE &&
+		   show_recursive(base->buf, base->len, pathname)) {
+		recurse = READ_TREE_RECURSIVE;
+		if (!(ls_options & LS_SHOW_TREES))
+			return recurse;
 	}
-	else if (ls_options & LS_TREE_ONLY)
-		return 0;
 
 	if (!(ls_options & LS_NAME_ONLY)) {
 		if (ls_options & LS_SHOW_SIZE) {

Which, I think is also nicer to read, we're not checking "is it a
tree?", setting "recurse", and then using that "recurse" as a
boolean for no reason. Let's just continue on that "else if" chain we're
already in instead...
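
Pulled out of show_tree(), the decision that "else if" chain encodes looks roughly like the standalone sketch below. `classify_entry()` and `enum entry_action` are hypothetical names used only for illustration, the object type values are stand-ins, and the value of LS_TREE_ONLY is an assumption since it isn't visible in the quoted hunks.

```c
/* Stand-ins for git's definitions so the sketch compiles on its own. */
enum object_type { OBJ_COMMIT = 1, OBJ_TREE = 2, OBJ_BLOB = 3 };
#define LS_TREE_ONLY (1 << 1)	/* assumption: value not shown in this thread */
#define LS_SHOW_TREES (1 << 2)

/* Hypothetical summary of what show_tree() does with one entry. */
enum entry_action {
	ENTRY_SKIP,		/* print nothing, do not recurse */
	ENTRY_DESCEND_ONLY,	/* recurse into the tree without printing it */
	ENTRY_PRINT,		/* print the entry, do not recurse */
	ENTRY_PRINT_AND_DESCEND,	/* print the tree entry, then recurse */
};

/*
 * "matches_recursive" stands in for the result of show_recursive();
 * it is only consulted for trees, as in the suggested diff.
 */
enum entry_action classify_entry(enum object_type type, int matches_recursive,
				 int ls_options)
{
	if (type == OBJ_BLOB) {
		if (ls_options & LS_TREE_ONLY)
			return ENTRY_SKIP;
	} else if (type == OBJ_TREE && matches_recursive) {
		if (!(ls_options & LS_SHOW_TREES))
			return ENTRY_DESCEND_ONLY;
		return ENTRY_PRINT_AND_DESCEND;
	}
	return ENTRY_PRINT;
}
```

Every case is reached by one pass down a single chain, with no intermediate flag that is set in one "if" and tested in the next.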


* Re: [PATCH v11 08/13] ls-tree: slightly refactor `show_tree()`
       [not found]                       ` <CADMgQSRYKB1ybxZWxQQ3uVM71fmdbzHqcK-WUPNKm2HMxw2C2g@mail.gmail.com>
@ 2022-02-28 16:18                         ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 224+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-28 16:18 UTC (permalink / raw)
  To: Teng Long
  Cc: Johannes Schindelin, Đoàn Trần Công Danh,
	Git Mailing List, Junio C Hamano, Martin Ågren, Jeff King,
	tenglong.tl


On Mon, Feb 28 2022, Teng Long wrote:

[Since this had HTML parts it didn't make it to the git ML, quoting it
here in full & replying on-list].

> On Sat, Feb 19, 2022 at 2:04 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>
>  > I think this and the 09/13 really don't make sense in combination.
>  > 
>  > Now, I clearly prefer to put options for the command into its own little
>  > struct to pass around, I think it makes for easier reading than the
>  > globals you end up with.
>  > 
>  > But tastes differ, and some built-ins use one, and some the other
>  > pattern.
>  > 
>  > But this is really the worst of both worlds, let's just pick one or the
>  > other, not effectively some options in that struct in 09/13, and
>  > some in globals here...
>
> I'm not 100 percent sure about it, but I agree that we can just pick one or
> the other.
>
> So, how about:
>      1. add "unsigned shown_fields" to "struct show_tree_data"
>      2. move the global "shown_fields" to "struct show_tree_data"
>      3. move "parse_shown_fields()" to "show_tree()"
>
> diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
> index 293b8f9dfb..92add01ecc 100644
> --- a/builtin/ls-tree.c
> +++ b/builtin/ls-tree.c
> @@ -25,7 +25,6 @@ static int ls_options;
>  static struct pathspec pathspec;
>  static int chomp_prefix;
>  static const char *ls_tree_prefix;
> -static unsigned int shown_fields;
>  #define FIELD_PATH_NAME 1
>  #define FIELD_SIZE (1 << 1)
>  #define FIELD_OBJECT_NAME (1 << 2)
> @@ -40,6 +39,7 @@ struct show_tree_data {
>         const struct object_id *oid;
>         const char *pathname;
>         struct strbuf *base;
> +       unsigned int shown_fields;
>  };
>  
>  static const  char * const ls_tree_usage[] = {
> @@ -55,19 +55,19 @@ enum {
>  
>  static int cmdmode = MODE_UNSPECIFIED;
>  
> -static int parse_shown_fields(void)
> +static int parse_shown_fields(unsigned int *shown_fields)
>  {
>         if (cmdmode == MODE_NAME_ONLY) {
> -               shown_fields = FIELD_PATH_NAME;
> +               *shown_fields = FIELD_PATH_NAME;
>                 return 0;
>         }
>  
>         if (!ls_options || (ls_options & LS_RECURSIVE)
>             || (ls_options & LS_SHOW_TREES)
>             || (ls_options & LS_TREE_ONLY))
> -               shown_fields = FIELD_DEFAULT;
> +               *shown_fields = FIELD_DEFAULT;
>         if (cmdmode == MODE_LONG)
> -               shown_fields = FIELD_LONG_DEFAULT;
> +               *shown_fields = FIELD_LONG_DEFAULT;
>         return 1;
>  }
>  
> @@ -105,7 +105,7 @@ static int show_default(struct show_tree_data *data)
>  {
>         size_t baselen = data->base->len;
>  
> -       if (shown_fields & FIELD_SIZE) {
> +       if (data->shown_fields & FIELD_SIZE) {
>                 char size_text[24];
>                 if (data->type == OBJ_BLOB) {
>                         unsigned long size;
> @@ -137,15 +137,19 @@ static int show_tree(const struct object_id *oid, struct strbuf *base,
>  {
>         int recurse = 0;
>         size_t baselen;
> +       unsigned int shown_fields = 0;
>         enum object_type type = object_type(mode);
> -       struct show_tree_data data = {
> +               struct show_tree_data data = {
>                 .mode = mode,
>                 .type = type,
>                 .oid = oid,
>                 .pathname = pathname,
>                 .base = base,
> +               .shown_fields = shown_fields,
>         };
>  
> +       parse_shown_fields(&shown_fields);
> +
>         if (type == OBJ_TREE && show_recursive(base->buf, base->len, pathname))
>                 recurse = READ_TREE_RECURSIVE;
>         if (type == OBJ_TREE && recurse && !(ls_options & LS_SHOW_TREES))
> @@ -219,8 +223,6 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
>         if (get_oid(argv[0], &oid))
>                 die("Not a valid object name %s", argv[0]);
>  
> -       parse_shown_fields();
> -
>         /*
>          * show_recursive() rolls its own matching code and is
>          * generally ignorant of 'struct pathspec'. The magic mask
>

Yes, this looks good, i.e. these are (I think, I didn't look in much
detail) now all stored in one place instead of two.

>  > >+#define FIELD_DEFAULT 29 /* 11101 size is not shown to output by default */
>
>  >Why do we need some FIELD_DEFAULT here as opposed to just having it by
>  >an enum field with a value of 0?
>
>  
> Do you mean using "cmdmode", or setting "FIELD_DEFAULT" to 0? If the former, I think it is a
> similar situation to your last paragraph, so I will reply there. If the
> latter, we want these as a bitmask, not only a flag, so I think "0" here would mean no fields
> are shown, and "29" is the default bitmask value (some context is also mentioned in
> https://public-inbox.org/git/xmqqmtlu7bb0.fsf@gitster.g/).

I meant (elaborated on below) that it seemed like this was conflating
"default mode" with "wants to emit size" in the state machine.

I.e. I didn't try rewriting it (or re-visited it now), but it seemed at
the time with some minor changes it could be eliminated, but maybe
not...

>  > let's name this enum type and use it, see e.g. builtin/help.c's "static
>  > enum help_action" for an example.
>
> Totally agree with that. 
> I want to name it "mutx_option", I'm not sure whether we need a better one. 

Sure, I don't think the name matters much, so whatever
you think is OK.

I just wanted to point out that some existing code makes the same
pattern simpler by relying on the 0-init and having fields starting at
0.

>  > I still don't really get why we can't just use the one MODE_*
>  > here. E.g. doesn't MODE_LONG map to FIELD_LONG_DEFAULT, MODE_NAME_ONLY
>  > to FIELD_PATH_NAME etc?
>
>  
> For now, I think the answer is yes.
>
>  > Is this all so we can do "shown_fields & FIELD_SIZE" in show_default()
>  > as opposed to e.g. checking "default format or long format?" ?
>
> Actually, in the v3 patch I used "cmdmode" to determine the output fields, and for the following patches Junio
> suggested using bitmasks for the work (https://public-inbox.org/git/xmqqmtlu7bb0.fsf@gitster.g/).
>
> I didn't understand it at first, and then I thought it might make sense. The cmdmode is used here to indicate that options are mutually exclusive, which may have the same effect in certain
> cases, such as some short-circuited options (--name-only).
>
> However, cmdmode is not a complete representation of which fields we want to output, because some options may not be mutually exclusive but can still change the output (for
> example, if we want to add --format or others). With a bitmask associated with the output fields, we can quickly and explicitly know what to output in the show_tree() phase, without
> thinking about the relationship to cmdmode or translating its meaning, and at the same time I think this makes it easier to adapt to change.

*nod*

I'm not quite sure I get what you're aiming for, but I'll look at it
again in any future re-roll & see if I can submit a patch-on-top next
time if I have any comments (if at all worth it). Thanks!


* [PATCH v12 00/12] ls-tree: "--object-only" and "--format" opts
  2022-02-08 12:14                 ` [PATCH v11 00/13] ls-tree: "--object-only" and "--format" opts Teng Long
                                     ` (12 preceding siblings ...)
  2022-02-08 12:14                   ` [PATCH v11 13/13] ls-tree.c: support --object-only option for "git-ls-tree" Teng Long
@ 2022-03-04 10:42                   ` Teng Long
  2022-03-04 10:42                     ` [PATCH v12 01/12] ls-tree: remove commented-out code Teng Long
                                       ` (13 more replies)
  13 siblings, 14 replies; 224+ messages in thread
From: Teng Long @ 2022-03-04 10:42 UTC (permalink / raw)
  To: dyroneteng
  Cc: Johannes.Schindelin, avarab, congdanhqx, git, gitster,
	martin.agren, peff, tenglong.tl

Sorry for the late reply.

Main diff from V11:

1. [v12][06/12] simplify nesting if/else logic in "show_tree()"

  * Rewrite this from a logical if/if/if to an if/else if chain

  It's from Ævar Arnfjörð Bjarmason's suggestion at:

  https://public-inbox.org/git/220219.86ee3ze5kz.gmgdl@evledraar.gmail.com/#t

  By the way, Ævar also suggested first making a minimal change to use
  "object_type()". I didn't do that, because after the unrolling the related
  code here is almost the same. So I think the current state is OK.


2. [v12][07/12] ls-tree: fix "--name-only" and "--long" combined use bug

  * Add 2 tests for the bugfix

  It's from Ævar Arnfjörð Bjarmason's suggestion at:

  https://public-inbox.org/git/220219.86iltbe6i2.gmgdl@evledraar.gmail