git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/2] ls-files: introduce "--format" and "--object-only" options
@ 2022-06-15 13:45 ZheNing Hu via GitGitGadget
  2022-06-15 13:45 ` [PATCH 1/2] ls-files: introduce "--format" option ZheNing Hu via GitGitGadget
                   ` (2 more replies)
  0 siblings, 3 replies; 61+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2022-06-15 13:45 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder,
	Ævar Arnfjörð Bjarmason, ZheNing Hu

Sometime we may need to extract some custom informations from git index
entries. Add a new option "--format" to "git ls-files" which can do such
thing, and add a new option "--object-only" which alias to
"--format=%(objectname)".

The origin discussion is here:
https://lore.kernel.org/git/pull.1250.v2.git.1654778272871.gitgitgadget@gmail.com/

ZheNing Hu (2):
  ls-files: introduce "--format" option
  ls-files: introduce "--object-only" option

 Documentation/git-ls-files.txt |  59 ++++++++++-
 builtin/ls-files.c             | 160 +++++++++++++++++++++++++++++-
 t/t3013-ls-files-format.sh     | 176 +++++++++++++++++++++++++++++++++
 3 files changed, 390 insertions(+), 5 deletions(-)
 create mode 100755 t/t3013-ls-files-format.sh


base-commit: ab336e8f1c8009c8b1aab8deb592148e69217085
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1262%2Fadlternative%2Fzh%2Fls-file-format-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1262/adlternative/zh/ls-file-format-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1262
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH 1/2] ls-files: introduce "--format" option
  2022-06-15 13:45 [PATCH 0/2] ls-files: introduce "--format" and "--object-only" options ZheNing Hu via GitGitGadget
@ 2022-06-15 13:45 ` ZheNing Hu via GitGitGadget
  2022-06-15 20:07   ` Ævar Arnfjörð Bjarmason
  2022-06-15 13:45 ` [PATCH 2/2] ls-files: introduce "--object-only" option ZheNing Hu via GitGitGadget
  2022-06-19  9:13 ` [PATCH v2] ls-files: introduce "--format" option ZheNing Hu via GitGitGadget
  2 siblings, 1 reply; 61+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2022-06-15 13:45 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder,
	Ævar Arnfjörð Bjarmason, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Add a new option --format that output index enties
informations with custom format, taking inspiration
from the option with the same name in the `git ls-tree`
command.

--format cannot used with -s, -o, -k, --resolve-undo,
--deduplicate, --debug.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 Documentation/git-ls-files.txt |  51 +++++++++++-
 builtin/ls-files.c             | 126 ++++++++++++++++++++++++++++-
 t/t3013-ls-files-format.sh     | 142 +++++++++++++++++++++++++++++++++
 3 files changed, 315 insertions(+), 4 deletions(-)
 create mode 100755 t/t3013-ls-files-format.sh

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index 0dabf3f0ddc..b22860ec8c0 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -20,7 +20,7 @@ SYNOPSIS
 		[--exclude-standard]
 		[--error-unmatch] [--with-tree=<tree-ish>]
 		[--full-name] [--recurse-submodules]
-		[--abbrev[=<n>]] [--] [<file>...]
+		[--abbrev[=<n>]] [--format=<format>] [--] [<file>...]
 
 DESCRIPTION
 -----------
@@ -192,6 +192,13 @@ followed by the  ("attr/<eolattr>").
 	to the contained files. Sparse directories will be shown with a
 	trailing slash, such as "x/" for a sparse directory "x".
 
+--format=<format>::
+	A string that interpolates %(fieldname) from the result being shown.
+	It also interpolates `%%` to `%`, and `%xx` where `xx` are hex digits
+	interpolates to character with hex code `xx`; for example `%00`
+	interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
+	--format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo`,
+	`--debug`.
 \--::
 	Do not interpret any more arguments as options.
 
@@ -223,6 +230,48 @@ quoted as explained for the configuration variable `core.quotePath`
 (see linkgit:git-config[1]).  Using `-z` the filename is output
 verbatim and the line is terminated by a NUL byte.
 
+It is possible to print in a custom format by using the `--format`
+option, which is able to interpolate different fields using
+a `%(fieldname)` notation. For example, if you only care about the
+"objectname" and "path" fields, you can execute with a specific
+"--format" like
+
+	git ls-files --format='%(objectname) %(path)'
+
+FIELD NAMES
+-----------
+Various values from structured fields can be used to interpolate
+into the resulting output. For each outputting line, the following
+names can be used:
+
+tag::
+	The tag of file status.
+objectmode::
+	The mode of the object.
+objectname::
+	The name of the object.
+stage::
+	The stage of the file.
+eol::
+	The line endings of files.
+path::
+	The pathname of the object.
+ctime::
+	The create time of file.
+mtime::
+	The modify time of file.
+dev::
+	The ID of device containing file.
+ino::
+	The inode number of file.
+uid::
+	The user id of file owner.
+gid::
+	The group id of file owner.
+size::
+	The size of the file.
+flags::
+	The flags of the file.
 
 EXCLUDE PATTERNS
 ----------------
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index e791b65e7e9..9dd6c55eeb9 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -11,6 +11,7 @@
 #include "quote.h"
 #include "dir.h"
 #include "builtin.h"
+#include "strbuf.h"
 #include "tree.h"
 #include "cache-tree.h"
 #include "parse-options.h"
@@ -48,6 +49,7 @@ static char *ps_matched;
 static const char *with_tree;
 static int exc_given;
 static int exclude_args;
+static const char *format;
 
 static const char *tag_cached = "";
 static const char *tag_unmerged = "";
@@ -58,8 +60,8 @@ static const char *tag_modified = "";
 static const char *tag_skip_worktree = "";
 static const char *tag_resolve_undo = "";
 
-static void write_eolinfo(struct index_state *istate,
-			  const struct cache_entry *ce, const char *path)
+static void write_eolinfo_internal(struct strbuf *sb, struct index_state *istate,
+				   const struct cache_entry *ce, const char *path)
 {
 	if (show_eol) {
 		struct stat st;
@@ -71,10 +73,25 @@ static void write_eolinfo(struct index_state *istate,
 							       ce->name);
 		if (!lstat(path, &st) && S_ISREG(st.st_mode))
 			w_txt = get_wt_convert_stats_ascii(path);
-		printf("i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
+		if (sb)
+			strbuf_addf(sb, "i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
+		else
+			printf("i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
 	}
 }
 
+static void write_eolinfo(struct index_state *istate,
+			  const struct cache_entry *ce, const char *path)
+{
+	write_eolinfo_internal(NULL, istate, ce, path);
+}
+
+static void write_eolinfo_to_buf(struct strbuf *sb, struct index_state *istate,
+				 const struct cache_entry *ce, const char *path)
+{
+	write_eolinfo_internal(sb, istate, ce, path);
+}
+
 static void write_name(const char *name)
 {
 	/*
@@ -85,6 +102,16 @@ static void write_name(const char *name)
 				   stdout, line_terminator);
 }
 
+static void write_name_to_buf(struct strbuf *sb, const char *name)
+{
+	name = relative_path(name, prefix_len ? prefix : NULL, sb);
+	if (line_terminator) {
+		quote_c_style(name, sb, NULL, 0);
+	} else {
+		strbuf_add(sb, name, strlen(name));
+	}
+}
+
 static const char *get_tag(const struct cache_entry *ce, const char *tag)
 {
 	static char alttag[4];
@@ -222,6 +249,86 @@ static void show_submodule(struct repository *superproject,
 	repo_clear(&subrepo);
 }
 
+struct show_index_data {
+	const char *tag;
+	const char *pathname;
+	struct index_state *istate;
+	const struct cache_entry *ce;
+};
+
+static size_t expand_show_index(struct strbuf *sb, const char *start,
+			       void *context)
+{
+	struct show_index_data *data = context;
+	const char *end;
+	const char *p;
+	unsigned int errlen;
+	const struct stat_data *sd = &data->ce->ce_stat_data;
+	size_t len = strbuf_expand_literal_cb(sb, start, NULL);
+	if (len)
+		return len;
+	if (*start != '(')
+		die(_("bad ls-files format: element '%s' does not start with '('"), start);
+
+	end = strchr(start + 1, ')');
+	if (!end)
+		die(_("bad ls-files format: element '%s' does not end in ')'"), start);
+
+	len = end - start + 1;
+	if (skip_prefix(start, "(tag)", &p)) {
+		strbuf_addstr(sb, get_tag(data->ce, data->tag));
+	} else if (skip_prefix(start, "(objectmode)", &p)) {
+		strbuf_addf(sb, "%06o", data->ce->ce_mode);
+	} else if (skip_prefix(start, "(objectname)", &p)) {
+		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
+	} else if (skip_prefix(start, "(stage)", &p)) {
+		strbuf_addf(sb, "%d", ce_stage(data->ce));
+	} else if (skip_prefix(start, "(eol)", &p)) {
+		write_eolinfo_to_buf(sb, data->istate, data->ce, data->pathname);
+	} else if (skip_prefix(start, "(path)", &p)) {
+		write_name_to_buf(sb, data->pathname);
+	} else if (skip_prefix(start, "(ctime)", &p)) {
+		strbuf_addf(sb, "ctime: %u:%u", sd->sd_ctime.sec, sd->sd_ctime.nsec);
+	} else if (skip_prefix(start, "(mtime)", &p)) {
+		strbuf_addf(sb, "mtime: %u:%u", sd->sd_mtime.sec, sd->sd_mtime.nsec);
+	} else if (skip_prefix(start, "(dev)", &p)) {
+		strbuf_addf(sb, "dev: %u", sd->sd_dev);
+	} else if (skip_prefix(start, "(ino)", &p)) {
+		strbuf_addf(sb, "ino: %u", sd->sd_ino);
+	} else if (skip_prefix(start, "(uid)", &p)) {
+		strbuf_addf(sb, "uid: %u", sd->sd_uid);
+	} else if (skip_prefix(start, "(gid)", &p)) {
+		strbuf_addf(sb, "gid: %u", sd->sd_gid);
+	} else if (skip_prefix(start, "(size)", &p)) {
+		strbuf_addf(sb, "size: %u", sd->sd_size);
+	} else if (skip_prefix(start, "(flags)", &p)) {
+		strbuf_addf(sb, "flags: %x", data->ce->ce_flags);
+	} else {
+		errlen = (unsigned long)len;
+		die(_("bad ls-files format: %%%.*s"), errlen, start);
+	}
+
+	return len;
+}
+
+static void show_ce_fmt(struct repository *repo, const struct cache_entry *ce,
+			const char *format, const char *fullname, const char *tag) {
+
+	struct show_index_data data = {
+		.tag = tag,
+		.pathname = fullname,
+		.istate = repo->index,
+		.ce = ce,
+	};
+
+	struct strbuf sb = STRBUF_INIT;
+	strbuf_expand(&sb, format, expand_show_index, &data);
+	strbuf_addch(&sb, line_terminator);
+	fwrite(sb.buf, sb.len, 1, stdout);
+	strbuf_release(&sb);
+	return;
+}
+
 static void show_ce(struct repository *repo, struct dir_struct *dir,
 		    const struct cache_entry *ce, const char *fullname,
 		    const char *tag)
@@ -236,6 +343,11 @@ static void show_ce(struct repository *repo, struct dir_struct *dir,
 				  max_prefix_len, ps_matched,
 				  S_ISDIR(ce->ce_mode) ||
 				  S_ISGITLINK(ce->ce_mode))) {
+		if (format) {
+			show_ce_fmt(repo, ce, format, fullname, tag);
+			return;
+		}
+
 		tag = get_tag(ce, tag);
 
 		if (!show_stage) {
@@ -675,6 +787,9 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 			 N_("suppress duplicate entries")),
 		OPT_BOOL(0, "sparse", &show_sparse_dirs,
 			 N_("show sparse directories in the presence of a sparse index")),
+		OPT_STRING_F(0, "format", &format, N_("format"),
+					 N_("format to use for the output"),
+					 PARSE_OPT_NONEG),
 		OPT_END()
 	};
 	int ret = 0;
@@ -699,6 +814,11 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 	for (i = 0; i < exclude_list.nr; i++) {
 		add_pattern(exclude_list.items[i].string, "", 0, pl, --exclude_args);
 	}
+
+	if (format && (show_stage || show_others || show_killed ||
+		show_resolve_undo || skipping_duplicates || debug_mode))
+			die(_("ls-files --format cannot used with -s, -o, -k, --resolve-undo, --deduplicate, --debug"));
+
 	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
 		tag_cached = "H ";
 		tag_unmerged = "M ";
diff --git a/t/t3013-ls-files-format.sh b/t/t3013-ls-files-format.sh
new file mode 100755
index 00000000000..61a2e68713a
--- /dev/null
+++ b/t/t3013-ls-files-format.sh
@@ -0,0 +1,142 @@
+#!/bin/sh
+
+test_description='git ls-files --format test'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	echo o1 >o1 &&
+	echo o2 >o2 &&
+	git add o1 o2 &&
+	git add --chmod +x o1 &&
+	git commit -m base
+'
+
+test_expect_success 'git ls-files --format tag' '
+	printf "H \nH \n" >expect &&
+	git ls-files --format="%(tag)" -t >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format objectmode' '
+	cat >expect <<-EOF &&
+	100755
+	100644
+	EOF
+	git ls-files --format="%(objectmode)" -t >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format objectname' '
+	oid1=$(git hash-object o1) &&
+	oid2=$(git hash-object o2) &&
+	cat >expect <<-EOF &&
+	$oid1
+	$oid2
+	EOF
+	git ls-files --format="%(objectname)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format eol' '
+	printf "i/lf    w/lf    attr/                 \t\n" >expect &&
+	printf "i/lf    w/lf    attr/                 \t\n" >>expect &&
+	git ls-files --format="%(eol)" --eol >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format path' '
+	cat >expect <<-EOF &&
+	o1
+	o2
+	EOF
+	git ls-files --format="%(path)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format ctime' '
+	git ls-files --debug | grep ctime >expect &&
+	git ls-files --format="  %(ctime)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format mtime' '
+	git ls-files --debug | grep mtime >expect &&
+	git ls-files --format="  %(mtime)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format dev and ino' '
+	git ls-files --debug | grep dev >expect &&
+	git ls-files --format="  %(dev)%x09%(ino)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format uid and gid' '
+	git ls-files --debug | grep uid >expect &&
+	git ls-files --format="  %(uid)%x09%(gid)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -m' '
+	echo change >o1 &&
+	cat >expect <<-EOF &&
+	o1
+	EOF
+	git ls-files --format="%(path)" -m >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -d' '
+	rm o1 &&
+	test_when_finished "git restore o1" &&
+	cat >expect <<-EOF &&
+	o1
+	EOF
+	git ls-files --format="%(path)" -d >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format size and flags' '
+	git ls-files --debug | grep size >expect &&
+	git ls-files --format="  %(size)%x09%(flags)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format imitate --stage' '
+	git ls-files --stage >expect &&
+	git ls-files --format="%(objectmode) %(objectname) %(stage)%x09%(path)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format imitate --debug' '
+	git ls-files --debug >expect &&
+	git ls-files --format="%(path)%x0a  %(ctime)%x0a  %(mtime)%x0a  %(dev)%x09%(ino)%x0a  %(uid)%x09%(gid)%x0a  %(size)%x09%(flags)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -s must fail' '
+	test_must_fail git ls-files --format="%(objectname)" -s
+'
+
+test_expect_success 'git ls-files --format with -o must fail' '
+	test_must_fail git ls-files --format="%(objectname)" -o
+'
+
+test_expect_success 'git ls-files --format with -k must fail' '
+	test_must_fail git ls-files --format="%(objectname)" -k
+'
+
+test_expect_success 'git ls-files --format with --resolve-undo must fail' '
+	test_must_fail git ls-files --format="%(objectname)" --resolve-undo
+'
+
+test_expect_success 'git ls-files --format with --deduplicate must fail' '
+	test_must_fail git ls-files --format="%(objectname)" --deduplicate
+'
+
+test_expect_success 'git ls-files --format with --debug must fail' '
+	test_must_fail git ls-files --format="%(objectname)" --debug
+'
+
+test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 2/2] ls-files: introduce "--object-only" option
  2022-06-15 13:45 [PATCH 0/2] ls-files: introduce "--format" and "--object-only" options ZheNing Hu via GitGitGadget
  2022-06-15 13:45 ` [PATCH 1/2] ls-files: introduce "--format" option ZheNing Hu via GitGitGadget
@ 2022-06-15 13:45 ` ZheNing Hu via GitGitGadget
  2022-06-15 20:15   ` Ævar Arnfjörð Bjarmason
  2022-06-19  9:13 ` [PATCH v2] ls-files: introduce "--format" option ZheNing Hu via GitGitGadget
  2 siblings, 1 reply; 61+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2022-06-15 13:45 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder,
	Ævar Arnfjörð Bjarmason, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

--object-only is an alias for --format=%(objectname),
which output objectname of index entries, taking
inspiration from the option with the same name in
the `git ls-tree` command.

--object-only cannot be used with --format, and -s, -o,
-k, --resolve-undo, --deduplicate, --debug.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 Documentation/git-ls-files.txt |  8 +++++++-
 builtin/ls-files.c             | 36 +++++++++++++++++++++++++++++++++-
 t/t3013-ls-files-format.sh     | 34 ++++++++++++++++++++++++++++++++
 3 files changed, 76 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index b22860ec8c0..c3f46bb821b 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -13,7 +13,7 @@ SYNOPSIS
 		[-c|--cached] [-d|--deleted] [-o|--others] [-i|--|ignored]
 		[-s|--stage] [-u|--unmerged] [-k|--|killed] [-m|--modified]
 		[--directory [--no-empty-directory]] [--eol]
-		[--deduplicate]
+		[--deduplicate] [--object-only]
 		[-x <pattern>|--exclude=<pattern>]
 		[-X <file>|--exclude-from=<file>]
 		[--exclude-per-directory=<file>]
@@ -199,6 +199,12 @@ followed by the  ("attr/<eolattr>").
 	interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
 	--format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo`,
 	`--debug`.
+
+--object-only::
+	List only names of the objects, one per line. This is equivalent
+	to specifying `--format='%(objectname)'`. Cannot be combined with
+	`--format=<format>`.
+
 \--::
 	Do not interpret any more arguments as options.
 
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 9dd6c55eeb9..4ac8f34baac 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -60,6 +60,27 @@ static const char *tag_modified = "";
 static const char *tag_skip_worktree = "";
 static const char *tag_resolve_undo = "";
 
+static enum ls_files_cmdmode {
+	MODE_DEFAULT = 0,
+	MODE_OBJECT_ONLY,
+} ls_files_cmdmode;
+
+struct ls_files_cmdmodee_to_fmt {
+	enum ls_files_cmdmode mode;
+	const char *const fmt;
+};
+
+static struct ls_files_cmdmodee_to_fmt ls_files_cmdmode_format[] = {
+	{
+		.mode = MODE_DEFAULT,
+		.fmt = NULL,
+	},
+	{
+		.mode = MODE_OBJECT_ONLY,
+		.fmt = "%(objectname)",
+	},
+};
+
 static void write_eolinfo_internal(struct strbuf *sb, struct index_state *istate,
 				   const struct cache_entry *ce, const char *path)
 {
@@ -747,6 +768,8 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 			DIR_SHOW_IGNORED),
 		OPT_BOOL('s', "stage", &show_stage,
 			N_("show staged contents' object name in the output")),
+		OPT_CMDMODE(0, "object-only", &ls_files_cmdmode, N_("list only objects"),
+			    MODE_OBJECT_ONLY),
 		OPT_BOOL('k', "killed", &show_killed,
 			N_("show files on the filesystem that need to be removed")),
 		OPT_BIT(0, "directory", &dir.flags,
@@ -815,9 +838,20 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 		add_pattern(exclude_list.items[i].string, "", 0, pl, --exclude_args);
 	}
 
+	if (format && ls_files_cmdmode)
+		die(_("--format can't be combined with other format-altering options"));
+
+	for (i = 0; !format && i < ARRAY_SIZE(ls_files_cmdmode_format); i++) {
+		if (ls_files_cmdmode == ls_files_cmdmode_format[i].mode) {
+			format = ls_files_cmdmode_format[i].fmt;
+			break;
+		}
+	}
+
 	if (format && (show_stage || show_others || show_killed ||
 		show_resolve_undo || skipping_duplicates || debug_mode))
-			die(_("ls-files --format cannot used with -s, -o, -k, --resolve-undo, --deduplicate, --debug"));
+		die(_("ls-files --format or other format-altering options "
+		      "cannot used with -s, -o, -k, --resolve-undo, --deduplicate, --debug"));
 
 	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
 		tag_cached = "H ";
diff --git a/t/t3013-ls-files-format.sh b/t/t3013-ls-files-format.sh
index 61a2e68713a..1c982ea13e0 100755
--- a/t/t3013-ls-files-format.sh
+++ b/t/t3013-ls-files-format.sh
@@ -139,4 +139,38 @@ test_expect_success 'git ls-files --format with --debug must fail' '
 	test_must_fail git ls-files --format="%(objectname)" --debug
 '
 
+test_expect_success 'git ls-files --object-only equal to --format=%(objectname)' '
+	git ls-files --format="%(objectname)" >expect &&
+	git ls-files --object-only >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --object-only with --format must fail' '
+	test_must_fail git ls-files --format="%(path)" --object-only
+'
+
+test_expect_success 'git ls-files --object-only with -s must fail' '
+	test_must_fail git ls-files --object-only -s
+'
+
+test_expect_success 'git ls-files --object-only with -o must fail' '
+	test_must_fail git ls-files --object-only -o
+'
+
+test_expect_success 'git ls-files --object-only with -k must fail' '
+	test_must_fail git ls-files --object-only -k
+'
+
+test_expect_success 'git ls-files --object-only with --resolve-undo must fail' '
+	test_must_fail git ls-files --object-only --resolve-undo
+'
+
+test_expect_success 'git ls-files --object-only with --deduplicate must fail' '
+	test_must_fail git ls-files --object-only --deduplicate
+'
+
+test_expect_success 'git ls-files --object-only with --debug must fail' '
+	test_must_fail git ls-files --object-only --debug
+'
+
 test_done
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH 1/2] ls-files: introduce "--format" option
  2022-06-15 13:45 ` [PATCH 1/2] ls-files: introduce "--format" option ZheNing Hu via GitGitGadget
@ 2022-06-15 20:07   ` Ævar Arnfjörð Bjarmason
  2022-06-18 10:50     ` ZheNing Hu
  0 siblings, 1 reply; 61+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-15 20:07 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Junio C Hamano, Christian Couder, ZheNing Hu


On Wed, Jun 15 2022, ZheNing Hu via GitGitGadget wrote:

> From: ZheNing Hu <adlternative@gmail.com>

Thanks a lot for pursuing this, this looks good & is much smaller than I
thought, just some nits below:

> Add a new option --format that output index enties
> informations with custom format, taking inspiration
> from the option with the same name in the `git ls-tree`
> command.
>
> --format cannot used with -s, -o, -k, --resolve-undo,
> --deduplicate, --debug.
>
> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---
>  Documentation/git-ls-files.txt |  51 +++++++++++-
>  builtin/ls-files.c             | 126 ++++++++++++++++++++++++++++-
>  t/t3013-ls-files-format.sh     | 142 +++++++++++++++++++++++++++++++++
>  3 files changed, 315 insertions(+), 4 deletions(-)
>  create mode 100755 t/t3013-ls-files-format.sh
>
> diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
> index 0dabf3f0ddc..b22860ec8c0 100644
> --- a/Documentation/git-ls-files.txt
> +++ b/Documentation/git-ls-files.txt
> @@ -20,7 +20,7 @@ SYNOPSIS
>  		[--exclude-standard]
>  		[--error-unmatch] [--with-tree=<tree-ish>]
>  		[--full-name] [--recurse-submodules]
> -		[--abbrev[=<n>]] [--] [<file>...]
> +		[--abbrev[=<n>]] [--format=<format>] [--] [<file>...]
>  
>  DESCRIPTION
>  -----------
> @@ -192,6 +192,13 @@ followed by the  ("attr/<eolattr>").
>  	to the contained files. Sparse directories will be shown with a
>  	trailing slash, such as "x/" for a sparse directory "x".
>  
> +--format=<format>::
> +	A string that interpolates %(fieldname) from the result being shown.

Missing `` for %(fieldname) ?
> +	It also interpolates `%%` to `%`, and `%xx` where `xx` are hex digits
> +	interpolates to character with hex code `xx`; for example `%00`
> +	interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
> +	--format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo`,

Replace that last "," with " or ".

> +	`--debug`.
>  \--::
>  	Do not interpret any more arguments as options.
>  
> @@ -223,6 +230,48 @@ quoted as explained for the configuration variable `core.quotePath`
>  (see linkgit:git-config[1]).  Using `-z` the filename is output
>  verbatim and the line is terminated by a NUL byte.
>  
> +It is possible to print in a custom format by using the `--format`
> +option, which is able to interpolate different fields using
> +a `%(fieldname)` notation. For example, if you only care about the
> +"objectname" and "path" fields, you can execute with a specific
> +"--format" like
> +
> +	git ls-files --format='%(objectname) %(path)'
> +
> +FIELD NAMES
> +-----------
> +Various values from structured fields can be used to interpolate
> +into the resulting output. For each outputting line, the following
> +names can be used:
> +
> +tag::
> +	The tag of file status.
> +objectmode::
> +	The mode of the object.
> +objectname::
> +	The name of the object.
> +stage::
> +	The stage of the file.
> +eol::
> +	The line endings of files.
> +path::
> +	The pathname of the object.
> +ctime::
> +	The create time of file.
> +mtime::
> +	The modify time of file.
> +dev::
> +	The ID of device containing file.
> +ino::
> +	The inode number of file.
> +uid::
> +	The user id of file owner.
> +gid::
> +	The group id of file owner.
> +size::
> +	The size of the file.
> +flags::
> +	The flags of the file.
>  
>  EXCLUDE PATTERNS
>  ----------------
> diff --git a/builtin/ls-files.c b/builtin/ls-files.c
> index e791b65e7e9..9dd6c55eeb9 100644
> --- a/builtin/ls-files.c
> +++ b/builtin/ls-files.c
> @@ -11,6 +11,7 @@
>  #include "quote.h"
>  #include "dir.h"
>  #include "builtin.h"
> +#include "strbuf.h"
>  #include "tree.h"
>  #include "cache-tree.h"
>  #include "parse-options.h"
> @@ -48,6 +49,7 @@ static char *ps_matched;
>  static const char *with_tree;
>  static int exc_given;
>  static int exclude_args;
> +static const char *format;
>  
>  static const char *tag_cached = "";
>  static const char *tag_unmerged = "";
> @@ -58,8 +60,8 @@ static const char *tag_modified = "";
>  static const char *tag_skip_worktree = "";
>  static const char *tag_resolve_undo = "";
>  
> -static void write_eolinfo(struct index_state *istate,
> -			  const struct cache_entry *ce, const char *path)
> +static void write_eolinfo_internal(struct strbuf *sb, struct index_state *istate,
> +				   const struct cache_entry *ce, const char *path)
>  {
>  	if (show_eol) {
>  		struct stat st;
> @@ -71,10 +73,25 @@ static void write_eolinfo(struct index_state *istate,
>  							       ce->name);
>  		if (!lstat(path, &st) && S_ISREG(st.st_mode))
>  			w_txt = get_wt_convert_stats_ascii(path);
> -		printf("i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
> +		if (sb)
> +			strbuf_addf(sb, "i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
> +		else
> +			printf("i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
>  	}
>  }
>  
> +static void write_eolinfo(struct index_state *istate,
> +			  const struct cache_entry *ce, const char *path)
> +{
> +	write_eolinfo_internal(NULL, istate, ce, path);
> +}
> +
> +static void write_eolinfo_to_buf(struct strbuf *sb, struct index_state *istate,
> +				 const struct cache_entry *ce, const char *path)
> +{
> +	write_eolinfo_internal(sb, istate, ce, path);
> +}
> +
>  static void write_name(const char *name)
>  {
>  	/*
> @@ -85,6 +102,16 @@ static void write_name(const char *name)
>  				   stdout, line_terminator);
>  }
>  
> +static void write_name_to_buf(struct strbuf *sb, const char *name)
> +{
> +	name = relative_path(name, prefix_len ? prefix : NULL, sb);
FWIW I'd find this a bit less "huh?" if we declared another variable
here, so just:

	const char *rel = relative_path(name, ...).

> +	if (line_terminator) {
> +		quote_c_style(name, sb, NULL, 0);
> +	} else {
> +		strbuf_add(sb, name, strlen(name));
> +	}

Can drop the {} braces here for if/else, see CodingGuidelines.

> +}
> +
>  static const char *get_tag(const struct cache_entry *ce, const char *tag)
>  {
>  	static char alttag[4];
> @@ -222,6 +249,86 @@ static void show_submodule(struct repository *superproject,
>  	repo_clear(&subrepo);
>  }
>  
> +struct show_index_data {
> +	const char *tag;
> +	const char *pathname;
> +	struct index_state *istate;
> +	const struct cache_entry *ce;
> +};
> +
> +static size_t expand_show_index(struct strbuf *sb, const char *start,
> +			       void *context)
> +{
> +	struct show_index_data *data = context;
> +	const char *end;
> +	const char *p;
> +	unsigned int errlen;
> +	const struct stat_data *sd = &data->ce->ce_stat_data;
> +	size_t len = strbuf_expand_literal_cb(sb, start, NULL);
> +	if (len)
> +		return len;
> +	if (*start != '(')
> +		die(_("bad ls-files format: element '%s' does not start with '('"), start);
> +
> +	end = strchr(start + 1, ')');
> +	if (!end)
> +		die(_("bad ls-files format: element '%s' does not end in ')'"), start);
> +
> +	len = end - start + 1;
> +	if (skip_prefix(start, "(tag)", &p)) {

Style nit, I'd much rather see us drop the {} on the whole if/else if
chain here, which we can do if...

> +		strbuf_addstr(sb, get_tag(data->ce, data->tag));
> +	} else if (skip_prefix(start, "(objectmode)", &p)) {
> +		strbuf_addf(sb, "%06o", data->ce->ce_mode);
> +	} else if (skip_prefix(start, "(objectname)", &p)) {
> +		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
> +	} else if (skip_prefix(start, "(stage)", &p)) {
> +		strbuf_addf(sb, "%d", ce_stage(data->ce));
> +	} else if (skip_prefix(start, "(eol)", &p)) {
> +		write_eolinfo_to_buf(sb, data->istate, data->ce, data->pathname);
> +	} else if (skip_prefix(start, "(path)", &p)) {
> +		write_name_to_buf(sb, data->pathname);
> +	} else if (skip_prefix(start, "(ctime)", &p)) {
> +		strbuf_addf(sb, "ctime: %u:%u", sd->sd_ctime.sec, sd->sd_ctime.nsec);
> +	} else if (skip_prefix(start, "(mtime)", &p)) {
> +		strbuf_addf(sb, "mtime: %u:%u", sd->sd_mtime.sec, sd->sd_mtime.nsec);

(too long lines, keep within 79 chars?)

> +	} else if (skip_prefix(start, "(dev)", &p)) {
> +		strbuf_addf(sb, "dev: %u", sd->sd_dev);
> +	} else if (skip_prefix(start, "(ino)", &p)) {
> +		strbuf_addf(sb, "ino: %u", sd->sd_ino);
> +	} else if (skip_prefix(start, "(uid)", &p)) {
> +		strbuf_addf(sb, "uid: %u", sd->sd_uid);
> +	} else if (skip_prefix(start, "(gid)", &p)) {
> +		strbuf_addf(sb, "gid: %u", sd->sd_gid);
> +	} else if (skip_prefix(start, "(size)", &p)) {
> +		strbuf_addf(sb, "size: %u", sd->sd_size);
> +	} else if (skip_prefix(start, "(flags)", &p)) {
> +		strbuf_addf(sb, "flags: %x", data->ce->ce_flags);
> +	} else {
> +		errlen = (unsigned long)len;
> +		die(_("bad ls-files format: %%%.*s"), errlen, start);

We just line-wrap the "(unsigned long)len" here, which seems worth it
for less line noise :)

> +	}
> +
> +	return len;
> +}
> +
> +static void show_ce_fmt(struct repository *repo, const struct cache_entry *ce,
> +			const char *format, const char *fullname, const char *tag) {
> +
> +	struct show_index_data data = {
> +		.tag = tag,
> +		.pathname = fullname,
> +		.istate = repo->index,
> +		.ce = ce,
> +	};
> +
> +	struct strbuf sb = STRBUF_INIT;
> +	strbuf_expand(&sb, format, expand_show_index, &data);
> +	strbuf_addch(&sb, line_terminator);
> +	fwrite(sb.buf, sb.len, 1, stdout);
> +	strbuf_release(&sb);
> +	return;
> +}
> +
>  static void show_ce(struct repository *repo, struct dir_struct *dir,
>  		    const struct cache_entry *ce, const char *fullname,
>  		    const char *tag)
> @@ -236,6 +343,11 @@ static void show_ce(struct repository *repo, struct dir_struct *dir,
>  				  max_prefix_len, ps_matched,
>  				  S_ISDIR(ce->ce_mode) ||
>  				  S_ISGITLINK(ce->ce_mode))) {
> +		if (format) {
> +			show_ce_fmt(repo, ce, format, fullname, tag);
> +			return;
> +		}
> +
>  		tag = get_tag(ce, tag);
>  
>  		if (!show_stage) {
> @@ -675,6 +787,9 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
>  			 N_("suppress duplicate entries")),
>  		OPT_BOOL(0, "sparse", &show_sparse_dirs,
>  			 N_("show sparse directories in the presence of a sparse index")),
> +		OPT_STRING_F(0, "format", &format, N_("format"),
> +					 N_("format to use for the output"),
> +					 PARSE_OPT_NONEG),

Odd indentation?

>  		OPT_END()
>  	};
>  	int ret = 0;
> @@ -699,6 +814,11 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
>  	for (i = 0; i < exclude_list.nr; i++) {
>  		add_pattern(exclude_list.items[i].string, "", 0, pl, --exclude_args);
>  	}
> +
> +	if (format && (show_stage || show_others || show_killed ||
> +		show_resolve_undo || skipping_duplicates || debug_mode))
> +			die(_("ls-files --format cannot used with -s, -o, -k, --resolve-undo, --deduplicate, --debug"));

Good to check this.

> +
>  	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
>  		tag_cached = "H ";
>  		tag_unmerged = "M ";
> diff --git a/t/t3013-ls-files-format.sh b/t/t3013-ls-files-format.sh
> new file mode 100755
> index 00000000000..61a2e68713a
> --- /dev/null
> +++ b/t/t3013-ls-files-format.sh
> @@ -0,0 +1,142 @@
> +#!/bin/sh
> +
> +test_description='git ls-files --format test'
> +
> +. ./test-lib.sh
> +
> +test_expect_success 'setup' '
> +	echo o1 >o1 &&
> +	echo o2 >o2 &&
> +	git add o1 o2 &&
> +	git add --chmod +x o1 &&
> +	git commit -m base
> +'
> +
> +test_expect_success 'git ls-files --format tag' '
> +	printf "H \nH \n" >expect &&
> +	git ls-files --format="%(tag)" -t >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format objectmode' '
> +	cat >expect <<-EOF &&
> +	100755
> +	100644
> +	EOF
> +	git ls-files --format="%(objectmode)" -t >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format objectname' '
> +	oid1=$(git hash-object o1) &&
> +	oid2=$(git hash-object o2) &&
> +	cat >expect <<-EOF &&
> +	$oid1
> +	$oid2
> +	EOF
> +	git ls-files --format="%(objectname)" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format eol' '
> +	printf "i/lf    w/lf    attr/                 \t\n" >expect &&
> +	printf "i/lf    w/lf    attr/                 \t\n" >>expect &&
> +	git ls-files --format="%(eol)" --eol >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format path' '
> +	cat >expect <<-EOF &&
> +	o1
> +	o2
> +	EOF
> +	git ls-files --format="%(path)" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format ctime' '
> +	git ls-files --debug | grep ctime >expect &&

For this and the rest: don't put git on the left-hand-side of a "|", it
hides its exit code (and potential segfaults)>.

Instead e.g.:

    git ... >out &&
    grep ctime out >expect &&
    ...

> +	git ls-files --format="  %(ctime)" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format mtime' '
> +	git ls-files --debug | grep mtime >expect &&

ditto here & below.

> +	git ls-files --format="  %(mtime)" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format dev and ino' '
> +	git ls-files --debug | grep dev >expect &&
> +	git ls-files --format="  %(dev)%x09%(ino)" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format uid and gid' '
> +	git ls-files --debug | grep uid >expect &&
> +	git ls-files --format="  %(uid)%x09%(gid)" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format with -m' '
> +	echo change >o1 &&
> +	cat >expect <<-EOF &&

When not using varibales use <<-\EOF, applies for the rest.

> +	o1
> +	EOF
> +	git ls-files --format="%(path)" -m >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format with -d' '
> +	rm o1 &&

Don't "rm o1" here, rather have the test that creates it do:

    test_when_finished "rm o1" &&
    [the command that creates o1]

> +	test_when_finished "git restore o1" &&
> +	cat >expect <<-EOF &&
> +	o1
> +	EOF
> +	git ls-files --format="%(path)" -d >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format size and flags' '
> +	git ls-files --debug | grep size >expect &&
> +	git ls-files --format="  %(size)%x09%(flags)" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format imitate --stage' '
> +	git ls-files --stage >expect &&
> +	git ls-files --format="%(objectmode) %(objectname) %(stage)%x09%(path)" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format imitate --debug' '
> +	git ls-files --debug >expect &&
> +	git ls-files --format="%(path)%x0a  %(ctime)%x0a  %(mtime)%x0a  %(dev)%x09%(ino)%x0a  %(uid)%x09%(gid)%x0a  %(size)%x09%(flags)" >actual &&
> +	test_cmp expect actual
> +'

These tests...:

> +test_expect_success 'git ls-files --format with -s must fail' '
> +	test_must_fail git ls-files --format="%(objectname)" -s
> +'
> +
> +test_expect_success 'git ls-files --format with -o must fail' '
> +	test_must_fail git ls-files --format="%(objectname)" -o
> +'
> +
> +test_expect_success 'git ls-files --format with -k must fail' '
> +	test_must_fail git ls-files --format="%(objectname)" -k
> +'
> +
> +test_expect_success 'git ls-files --format with --resolve-undo must fail' '
> +	test_must_fail git ls-files --format="%(objectname)" --resolve-undo
> +'
> +
> +test_expect_success 'git ls-files --format with --deduplicate must fail' '
> +	test_must_fail git ls-files --format="%(objectname)" --deduplicate
> +'
> +
> +test_expect_success 'git ls-files --format with --debug must fail' '
> +	test_must_fail git ls-files --format="%(objectname)" --debug
> +'

...would be better done with a for-loop, so:

	for flag in -s -o -k --resolve-undo [...]
	do
		test_expect_success "git ls-files --format is incompatible with $flag" '
			test_must_fail git ls-files --format="%(objectname)" $flag
		'
	done

Note the '' on the second argument, that's intentional, as we eval it
you don't need "".

> +test_done


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 2/2] ls-files: introduce "--object-only" option
  2022-06-15 13:45 ` [PATCH 2/2] ls-files: introduce "--object-only" option ZheNing Hu via GitGitGadget
@ 2022-06-15 20:15   ` Ævar Arnfjörð Bjarmason
  2022-06-18 10:59     ` ZheNing Hu
  0 siblings, 1 reply; 61+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-15 20:15 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Junio C Hamano, Christian Couder, ZheNing Hu


On Wed, Jun 15 2022, ZheNing Hu via GitGitGadget wrote:

> From: ZheNing Hu <adlternative@gmail.com>
>
> --object-only is an alias for --format=%(objectname),
> which output objectname of index entries, taking
> inspiration from the option with the same name in
> the `git ls-tree` command.
>
> --object-only cannot be used with --format, and -s, -o,
> -k, --resolve-undo, --deduplicate, --debug.
>
> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---
>  Documentation/git-ls-files.txt |  8 +++++++-
>  builtin/ls-files.c             | 36 +++++++++++++++++++++++++++++++++-
>  t/t3013-ls-files-format.sh     | 34 ++++++++++++++++++++++++++++++++
>  3 files changed, 76 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
> index b22860ec8c0..c3f46bb821b 100644
> --- a/Documentation/git-ls-files.txt
> +++ b/Documentation/git-ls-files.txt
> @@ -13,7 +13,7 @@ SYNOPSIS
>  		[-c|--cached] [-d|--deleted] [-o|--others] [-i|--|ignored]
>  		[-s|--stage] [-u|--unmerged] [-k|--|killed] [-m|--modified]
>  		[--directory [--no-empty-directory]] [--eol]
> -		[--deduplicate]
> +		[--deduplicate] [--object-only]
>  		[-x <pattern>|--exclude=<pattern>]
>  		[-X <file>|--exclude-from=<file>]
>  		[--exclude-per-directory=<file>]
> @@ -199,6 +199,12 @@ followed by the  ("attr/<eolattr>").
>  	interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
>  	--format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo`,
>  	`--debug`.
> +
> +--object-only::
> +	List only names of the objects, one per line. This is equivalent
> +	to specifying `--format='%(objectname)'`. Cannot be combined with
> +	`--format=<format>`.
> +
>  \--::
>  	Do not interpret any more arguments as options.
>  
> diff --git a/builtin/ls-files.c b/builtin/ls-files.c
> index 9dd6c55eeb9..4ac8f34baac 100644
> --- a/builtin/ls-files.c
> +++ b/builtin/ls-files.c
> @@ -60,6 +60,27 @@ static const char *tag_modified = "";
>  static const char *tag_skip_worktree = "";
>  static const char *tag_resolve_undo = "";
>  
> +static enum ls_files_cmdmode {
> +	MODE_DEFAULT = 0,
> +	MODE_OBJECT_ONLY,
> +} ls_files_cmdmode;
> +
> +struct ls_files_cmdmodee_to_fmt {
> +	enum ls_files_cmdmode mode;
> +	const char *const fmt;
> +};
> +
> +static struct ls_files_cmdmodee_to_fmt ls_files_cmdmode_format[] = {
> +	{
> +		.mode = MODE_DEFAULT,
> +		.fmt = NULL,
> +	},
> +	{
> +		.mode = MODE_OBJECT_ONLY,
> +		.fmt = "%(objectname)",
> +	},
> +};
[...snip...]

This code all looks OK from skimming it, and is substantially copied
from builtin/ls-tree.c (which is good).

But I wonder as in that case whether having such an alias is worth it at
all, especially since in the case of ls-files (unlike ls-tree) we don't
start out with various --just-the-X-field type options, this is the
first one.

So I *really* like that you took my suggestion of "why not a --format"
from a previous round, but given the above for ls-files in particular is
it really worth it to have this extra code just to type:

    --object-only

Instead of:

    --format="%(objectname)"

So, maybe, and I'm not set against it, but I think it's worth
re-evaluating in this case.

In particular because the part of ls-tree's code is missing here where
we "format optimize", i.e. we take a form like:

    --format="%(objectname)"

And dispatch it to the more optimized special function, instead of the
generic strbuf_expand(), whereas in this case it's the other way around,
the option is just an alias for --format.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 1/2] ls-files: introduce "--format" option
  2022-06-15 20:07   ` Ævar Arnfjörð Bjarmason
@ 2022-06-18 10:50     ` ZheNing Hu
  0 siblings, 0 replies; 61+ messages in thread
From: ZheNing Hu @ 2022-06-18 10:50 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: ZheNing Hu via GitGitGadget, Git List, Junio C Hamano,
	Christian Couder

Ævar Arnfjörð Bjarmason <avarab@gmail.com> 于2022年6月16日周四 04:15写道:
>
>
> > +static void write_name_to_buf(struct strbuf *sb, const char *name)
> > +{
> > +     name = relative_path(name, prefix_len ? prefix : NULL, sb);
> FWIW I'd find this a bit less "huh?" if we declared another variable
> here, so just:
>
>         const char *rel = relative_path(name, ...).
>

Yeah, It's just a wrong code copy.

> > +     o1
> > +     EOF
> > +     git ls-files --format="%(path)" -m >actual &&
> > +     test_cmp expect actual
> > +'
> > +
> > +test_expect_success 'git ls-files --format with -d' '
> > +     rm o1 &&
>
> Don't "rm o1" here, rather have the test that creates it do:
>
>     test_when_finished "rm o1" &&
>     [the command that creates o1]
>

I thought about how to test 'git ls-files -d', so maybe I need
something like:

test_expect_success 'git ls-files --format with -d' '
     echo o3 >o3 &&
     git add o3 &&
     rm o3 &&
     cat >expect <<-\EOF &&
     o3
     EOF
     git ls-files --format="%(path)" -d >actual &&
     test_cmp expect actual
'

> > +     test_when_finished "git restore o1" &&
> > +     cat >expect <<-EOF &&
> > +     o1
> > +     EOF
> > +     git ls-files --format="%(path)" -d >actual &&
> > +     test_cmp expect actual
> > +'
> > +
>
> These tests...:
>
> > +test_expect_success 'git ls-files --format with -s must fail' '
> > +     test_must_fail git ls-files --format="%(objectname)" -s
> > +'
> > +
> > +test_expect_success 'git ls-files --format with -o must fail' '
> > +     test_must_fail git ls-files --format="%(objectname)" -o
> > +'
> > +
> > +test_expect_success 'git ls-files --format with -k must fail' '
> > +     test_must_fail git ls-files --format="%(objectname)" -k
> > +'
> > +
> > +test_expect_success 'git ls-files --format with --resolve-undo must fail' '
> > +     test_must_fail git ls-files --format="%(objectname)" --resolve-undo
> > +'
> > +
> > +test_expect_success 'git ls-files --format with --deduplicate must fail' '
> > +     test_must_fail git ls-files --format="%(objectname)" --deduplicate
> > +'
> > +
> > +test_expect_success 'git ls-files --format with --debug must fail' '
> > +     test_must_fail git ls-files --format="%(objectname)" --debug
> > +'
>
> ...would be better done with a for-loop, so:
>
>         for flag in -s -o -k --resolve-undo [...]
>         do
>                 test_expect_success "git ls-files --format is incompatible with $flag" '
>                         test_must_fail git ls-files --format="%(objectname)" $flag
>                 '
>         done
>

Yeah, using this loop will be clear.

> Note the '' on the second argument, that's intentional, as we eval it
> you don't need "".
>
> > +test_done
>

Thanks for all these code style suggestions!

ZheNing Hu

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 2/2] ls-files: introduce "--object-only" option
  2022-06-15 20:15   ` Ævar Arnfjörð Bjarmason
@ 2022-06-18 10:59     ` ZheNing Hu
  0 siblings, 0 replies; 61+ messages in thread
From: ZheNing Hu @ 2022-06-18 10:59 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: ZheNing Hu via GitGitGadget, Git List, Junio C Hamano,
	Christian Couder

Ævar Arnfjörð Bjarmason <avarab@gmail.com> 于2022年6月16日周四 04:25写道:
>
>
> On Wed, Jun 15 2022, ZheNing Hu via GitGitGadget wrote:
>
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > --object-only is an alias for --format=%(objectname),
> > which output objectname of index entries, taking
> > inspiration from the option with the same name in
> > the `git ls-tree` command.
> >
> > --object-only cannot be used with --format, and -s, -o,
> > -k, --resolve-undo, --deduplicate, --debug.
> >
> > Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> > ---
> >  Documentation/git-ls-files.txt |  8 +++++++-
> >  builtin/ls-files.c             | 36 +++++++++++++++++++++++++++++++++-
> >  t/t3013-ls-files-format.sh     | 34 ++++++++++++++++++++++++++++++++
> >  3 files changed, 76 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
> > index b22860ec8c0..c3f46bb821b 100644
> > --- a/Documentation/git-ls-files.txt
> > +++ b/Documentation/git-ls-files.txt
> > @@ -13,7 +13,7 @@ SYNOPSIS
> >               [-c|--cached] [-d|--deleted] [-o|--others] [-i|--|ignored]
> >               [-s|--stage] [-u|--unmerged] [-k|--|killed] [-m|--modified]
> >               [--directory [--no-empty-directory]] [--eol]
> > -             [--deduplicate]
> > +             [--deduplicate] [--object-only]
> >               [-x <pattern>|--exclude=<pattern>]
> >               [-X <file>|--exclude-from=<file>]
> >               [--exclude-per-directory=<file>]
> > @@ -199,6 +199,12 @@ followed by the  ("attr/<eolattr>").
> >       interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
> >       --format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo`,
> >       `--debug`.
> > +
> > +--object-only::
> > +     List only names of the objects, one per line. This is equivalent
> > +     to specifying `--format='%(objectname)'`. Cannot be combined with
> > +     `--format=<format>`.
> > +
> >  \--::
> >       Do not interpret any more arguments as options.
> >
> > diff --git a/builtin/ls-files.c b/builtin/ls-files.c
> > index 9dd6c55eeb9..4ac8f34baac 100644
> > --- a/builtin/ls-files.c
> > +++ b/builtin/ls-files.c
> > @@ -60,6 +60,27 @@ static const char *tag_modified = "";
> >  static const char *tag_skip_worktree = "";
> >  static const char *tag_resolve_undo = "";
> >
> > +static enum ls_files_cmdmode {
> > +     MODE_DEFAULT = 0,
> > +     MODE_OBJECT_ONLY,
> > +} ls_files_cmdmode;
> > +
> > +struct ls_files_cmdmodee_to_fmt {
> > +     enum ls_files_cmdmode mode;
> > +     const char *const fmt;
> > +};
> > +
> > +static struct ls_files_cmdmodee_to_fmt ls_files_cmdmode_format[] = {
> > +     {
> > +             .mode = MODE_DEFAULT,
> > +             .fmt = NULL,
> > +     },
> > +     {
> > +             .mode = MODE_OBJECT_ONLY,
> > +             .fmt = "%(objectname)",
> > +     },
> > +};
> [...snip...]
>
> This code all looks OK from skimming it, and is substantially copied
> from builtin/ls-tree.c (which is good).
>
> But I wonder as in that case whether having such an alias is worth it at
> all, especially since in the case of ls-files (unlike ls-tree) we don't
> start out with various --just-the-X-field type options, this is the
> first one.
>
> So I *really* like that you took my suggestion of "why not a --format"
> from a previous round, but given the above for ls-files in particular is
> it really worth it to have this extra code just to type:
>
>     --object-only
>
> Instead of:
>
>     --format="%(objectname)"
>
> So, maybe, and I'm not set against it, but I think it's worth
> re-evaluating in this case.
>
> In particular because the part of ls-tree's code is missing here where
> we "format optimize", i.e. we take a form like:
>
>     --format="%(objectname)"
>
> And dispatch it to the more optimized special function, instead of the
> generic strbuf_expand(), whereas in this case it's the other way around,
> the option is just an alias for --format.

Thanks for clarifying that --object-only uses the fast path instead of a simple
alias of --format=%(objectname). Maybe I should do this too, or just
drop this patch
because git ls-file --format has included such a function :-)

ZheNing Hu

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v2] ls-files: introduce "--format" option
  2022-06-15 13:45 [PATCH 0/2] ls-files: introduce "--format" and "--object-only" options ZheNing Hu via GitGitGadget
  2022-06-15 13:45 ` [PATCH 1/2] ls-files: introduce "--format" option ZheNing Hu via GitGitGadget
  2022-06-15 13:45 ` [PATCH 2/2] ls-files: introduce "--object-only" option ZheNing Hu via GitGitGadget
@ 2022-06-19  9:13 ` ZheNing Hu via GitGitGadget
  2022-06-19 13:50   ` Phillip Wood
  2022-06-21  2:05   ` [PATCH v3] " ZheNing Hu via GitGitGadget
  2 siblings, 2 replies; 61+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2022-06-19  9:13 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder,
	Ævar Arnfjörð Bjarmason, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Add a new option --format that output index enties
informations with custom format, taking inspiration
from the option with the same name in the `git ls-tree`
command.

--format cannot used with -s, -o, -k, --resolve-undo,
--deduplicate and --debug.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
    ls-files: introduce "--format" options
    
    v1->v2:
    
     1. do some code style fix suggected by Ævar.
     2. remove --object-only option (I have tried to use fast path for it,
        but cannot see any performance promote compare with
        --format=%(objectname))

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1262%2Fadlternative%2Fzh%2Fls-file-format-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1262/adlternative/zh/ls-file-format-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1262

Range-diff vs v1:

 1:  432d80b8c78 ! 1:  67f2c3b8ebe ls-files: introduce "--format" option
     @@ Commit message
          command.
      
          --format cannot used with -s, -o, -k, --resolve-undo,
     -    --deduplicate, --debug.
     +    --deduplicate and --debug.
      
          Signed-off-by: ZheNing Hu <adlternative@gmail.com>
      
     @@ Documentation/git-ls-files.txt: followed by the  ("attr/<eolattr>").
       	trailing slash, such as "x/" for a sparse directory "x".
       
      +--format=<format>::
     -+	A string that interpolates %(fieldname) from the result being shown.
     ++	A string that interpolates `%(fieldname)` from the result being shown.
      +	It also interpolates `%%` to `%`, and `%xx` where `xx` are hex digits
      +	interpolates to character with hex code `xx`; for example `%00`
      +	interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
     -+	--format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo`,
     ++	--format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo` and
      +	`--debug`.
       \--::
       	Do not interpret any more arguments as options.
     @@ builtin/ls-files.c: static void write_name(const char *name)
       
      +static void write_name_to_buf(struct strbuf *sb, const char *name)
      +{
     -+	name = relative_path(name, prefix_len ? prefix : NULL, sb);
     -+	if (line_terminator) {
     -+		quote_c_style(name, sb, NULL, 0);
     -+	} else {
     -+		strbuf_add(sb, name, strlen(name));
     -+	}
     ++	const char *rel = relative_path(name, prefix_len ? prefix : NULL, sb);
     ++	if (line_terminator)
     ++		quote_c_style(rel, sb, NULL, 0);
     ++	else
     ++		strbuf_add(sb, rel, strlen(rel));
      +}
      +
       static const char *get_tag(const struct cache_entry *ce, const char *tag)
     @@ builtin/ls-files.c: static void show_submodule(struct repository *superproject,
      +	if (len)
      +		return len;
      +	if (*start != '(')
     -+		die(_("bad ls-files format: element '%s' does not start with '('"), start);
     ++		die(_("bad ls-files format: element '%s' "
     ++		      "does not start with '('"), start);
      +
      +	end = strchr(start + 1, ')');
      +	if (!end)
     -+		die(_("bad ls-files format: element '%s' does not end in ')'"), start);
     ++		die(_("bad ls-files format: element '%s'"
     ++		      "does not end in ')'"), start);
      +
      +	len = end - start + 1;
     -+	if (skip_prefix(start, "(tag)", &p)) {
     ++	if (skip_prefix(start, "(tag)", &p))
      +		strbuf_addstr(sb, get_tag(data->ce, data->tag));
     -+	} else if (skip_prefix(start, "(objectmode)", &p)) {
     ++	else if (skip_prefix(start, "(objectmode)", &p))
      +		strbuf_addf(sb, "%06o", data->ce->ce_mode);
     -+	} else if (skip_prefix(start, "(objectname)", &p)) {
     ++	else if (skip_prefix(start, "(objectname)", &p))
      +		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
     -+	} else if (skip_prefix(start, "(stage)", &p)) {
     ++	else if (skip_prefix(start, "(stage)", &p))
      +		strbuf_addf(sb, "%d", ce_stage(data->ce));
     -+	} else if (skip_prefix(start, "(eol)", &p)) {
     -+		write_eolinfo_to_buf(sb, data->istate, data->ce, data->pathname);
     -+	} else if (skip_prefix(start, "(path)", &p)) {
     ++	else if (skip_prefix(start, "(eol)", &p))
     ++		write_eolinfo_to_buf(sb, data->istate,
     ++				     data->ce, data->pathname);
     ++	else if (skip_prefix(start, "(path)", &p))
      +		write_name_to_buf(sb, data->pathname);
     -+	} else if (skip_prefix(start, "(ctime)", &p)) {
     -+		strbuf_addf(sb, "ctime: %u:%u", sd->sd_ctime.sec, sd->sd_ctime.nsec);
     -+	} else if (skip_prefix(start, "(mtime)", &p)) {
     -+		strbuf_addf(sb, "mtime: %u:%u", sd->sd_mtime.sec, sd->sd_mtime.nsec);
     -+	} else if (skip_prefix(start, "(dev)", &p)) {
     ++	else if (skip_prefix(start, "(ctime)", &p))
     ++		strbuf_addf(sb, "ctime: %u:%u",
     ++			    sd->sd_ctime.sec, sd->sd_ctime.nsec);
     ++	else if (skip_prefix(start, "(mtime)", &p))
     ++		strbuf_addf(sb, "mtime: %u:%u",
     ++			    sd->sd_mtime.sec, sd->sd_mtime.nsec);
     ++	else if (skip_prefix(start, "(dev)", &p))
      +		strbuf_addf(sb, "dev: %u", sd->sd_dev);
     -+	} else if (skip_prefix(start, "(ino)", &p)) {
     ++	else if (skip_prefix(start, "(ino)", &p))
      +		strbuf_addf(sb, "ino: %u", sd->sd_ino);
     -+	} else if (skip_prefix(start, "(uid)", &p)) {
     ++	else if (skip_prefix(start, "(uid)", &p))
      +		strbuf_addf(sb, "uid: %u", sd->sd_uid);
     -+	} else if (skip_prefix(start, "(gid)", &p)) {
     ++	else if (skip_prefix(start, "(gid)", &p))
      +		strbuf_addf(sb, "gid: %u", sd->sd_gid);
     -+	} else if (skip_prefix(start, "(size)", &p)) {
     ++	else if (skip_prefix(start, "(size)", &p))
      +		strbuf_addf(sb, "size: %u", sd->sd_size);
     -+	} else if (skip_prefix(start, "(flags)", &p)) {
     ++	else if (skip_prefix(start, "(flags)", &p))
      +		strbuf_addf(sb, "flags: %x", data->ce->ce_flags);
     -+	} else {
     ++	else {
      +		errlen = (unsigned long)len;
      +		die(_("bad ls-files format: %%%.*s"), errlen, start);
      +	}
     @@ builtin/ls-files.c: int cmd_ls_files(int argc, const char **argv, const char *cm
       		OPT_BOOL(0, "sparse", &show_sparse_dirs,
       			 N_("show sparse directories in the presence of a sparse index")),
      +		OPT_STRING_F(0, "format", &format, N_("format"),
     -+					 N_("format to use for the output"),
     -+					 PARSE_OPT_NONEG),
     ++			     N_("format to use for the output"),
     ++			     PARSE_OPT_NONEG),
       		OPT_END()
       	};
       	int ret = 0;
     @@ t/t3013-ls-files-format.sh (new)
      +'
      +
      +test_expect_success 'git ls-files --format objectmode' '
     -+	cat >expect <<-EOF &&
     ++	cat >expect <<-\EOF &&
      +	100755
      +	100644
      +	EOF
     @@ t/t3013-ls-files-format.sh (new)
      +'
      +
      +test_expect_success 'git ls-files --format path' '
     -+	cat >expect <<-EOF &&
     ++	cat >expect <<-\EOF &&
      +	o1
      +	o2
      +	EOF
     @@ t/t3013-ls-files-format.sh (new)
      +'
      +
      +test_expect_success 'git ls-files --format ctime' '
     -+	git ls-files --debug | grep ctime >expect &&
     ++	git ls-files --debug >out &&
     ++	grep ctime out >expect &&
      +	git ls-files --format="  %(ctime)" >actual &&
      +	test_cmp expect actual
      +'
      +
      +test_expect_success 'git ls-files --format mtime' '
     -+	git ls-files --debug | grep mtime >expect &&
     ++	git ls-files --debug >out &&
     ++	grep mtime out >expect &&
      +	git ls-files --format="  %(mtime)" >actual &&
      +	test_cmp expect actual
      +'
      +
      +test_expect_success 'git ls-files --format dev and ino' '
     -+	git ls-files --debug | grep dev >expect &&
     ++	git ls-files --debug >out &&
     ++	grep dev out >expect &&
      +	git ls-files --format="  %(dev)%x09%(ino)" >actual &&
      +	test_cmp expect actual
      +'
      +
      +test_expect_success 'git ls-files --format uid and gid' '
     -+	git ls-files --debug | grep uid >expect &&
     ++	git ls-files --debug >out &&
     ++	grep uid out >expect &&
      +	git ls-files --format="  %(uid)%x09%(gid)" >actual &&
      +	test_cmp expect actual
      +'
      +
      +test_expect_success 'git ls-files --format with -m' '
      +	echo change >o1 &&
     -+	cat >expect <<-EOF &&
     ++	cat >expect <<-\EOF &&
      +	o1
      +	EOF
      +	git ls-files --format="%(path)" -m >actual &&
     @@ t/t3013-ls-files-format.sh (new)
      +'
      +
      +test_expect_success 'git ls-files --format with -d' '
     -+	rm o1 &&
     -+	test_when_finished "git restore o1" &&
     -+	cat >expect <<-EOF &&
     -+	o1
     ++	echo o3 >o3 &&
     ++	git add o3 &&
     ++	rm o3 &&
     ++	cat >expect <<-\EOF &&
     ++	o3
      +	EOF
      +	git ls-files --format="%(path)" -d >actual &&
      +	test_cmp expect actual
      +'
      +
      +test_expect_success 'git ls-files --format size and flags' '
     -+	git ls-files --debug | grep size >expect &&
     ++	git ls-files --debug >out &&
     ++	grep size out >expect &&
      +	git ls-files --format="  %(size)%x09%(flags)" >actual &&
      +	test_cmp expect actual
      +'
     @@ t/t3013-ls-files-format.sh (new)
      +	test_cmp expect actual
      +'
      +
     -+test_expect_success 'git ls-files --format with -s must fail' '
     -+	test_must_fail git ls-files --format="%(objectname)" -s
     -+'
     -+
     -+test_expect_success 'git ls-files --format with -o must fail' '
     -+	test_must_fail git ls-files --format="%(objectname)" -o
     -+'
     -+
     -+test_expect_success 'git ls-files --format with -k must fail' '
     -+	test_must_fail git ls-files --format="%(objectname)" -k
     -+'
     -+
     -+test_expect_success 'git ls-files --format with --resolve-undo must fail' '
     -+	test_must_fail git ls-files --format="%(objectname)" --resolve-undo
     -+'
     -+
     -+test_expect_success 'git ls-files --format with --deduplicate must fail' '
     -+	test_must_fail git ls-files --format="%(objectname)" --deduplicate
     -+'
     -+
     -+test_expect_success 'git ls-files --format with --debug must fail' '
     -+	test_must_fail git ls-files --format="%(objectname)" --debug
     -+'
     -+
     ++for flag in -s -o -k --resolve-undo --deduplicate --debug
     ++do
     ++	test_expect_success "git ls-files --format is incompatible with $flag" '
     ++		test_must_fail git ls-files --format="%(objectname)" $flag
     ++	'
     ++done
      +test_done
 2:  81ae1280e8e < -:  ----------- ls-files: introduce "--object-only" option


 Documentation/git-ls-files.txt |  51 ++++++++++++-
 builtin/ls-files.c             | 130 ++++++++++++++++++++++++++++++++-
 t/t3013-ls-files-format.sh     | 130 +++++++++++++++++++++++++++++++++
 3 files changed, 307 insertions(+), 4 deletions(-)
 create mode 100755 t/t3013-ls-files-format.sh

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index 0dabf3f0ddc..9a88c92f1ad 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -20,7 +20,7 @@ SYNOPSIS
 		[--exclude-standard]
 		[--error-unmatch] [--with-tree=<tree-ish>]
 		[--full-name] [--recurse-submodules]
-		[--abbrev[=<n>]] [--] [<file>...]
+		[--abbrev[=<n>]] [--format=<format>] [--] [<file>...]
 
 DESCRIPTION
 -----------
@@ -192,6 +192,13 @@ followed by the  ("attr/<eolattr>").
 	to the contained files. Sparse directories will be shown with a
 	trailing slash, such as "x/" for a sparse directory "x".
 
+--format=<format>::
+	A string that interpolates `%(fieldname)` from the result being shown.
+	It also interpolates `%%` to `%`, and `%xx` where `xx` are hex digits
+	interpolates to character with hex code `xx`; for example `%00`
+	interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
+	--format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo` and
+	`--debug`.
 \--::
 	Do not interpret any more arguments as options.
 
@@ -223,6 +230,48 @@ quoted as explained for the configuration variable `core.quotePath`
 (see linkgit:git-config[1]).  Using `-z` the filename is output
 verbatim and the line is terminated by a NUL byte.
 
+It is possible to print in a custom format by using the `--format`
+option, which is able to interpolate different fields using
+a `%(fieldname)` notation. For example, if you only care about the
+"objectname" and "path" fields, you can execute with a specific
+"--format" like
+
+	git ls-files --format='%(objectname) %(path)'
+
+FIELD NAMES
+-----------
+Various values from structured fields can be used to interpolate
+into the resulting output. For each outputting line, the following
+names can be used:
+
+tag::
+	The tag of file status.
+objectmode::
+	The mode of the object.
+objectname::
+	The name of the object.
+stage::
+	The stage of the file.
+eol::
+	The line endings of files.
+path::
+	The pathname of the object.
+ctime::
+	The create time of file.
+mtime::
+	The modify time of file.
+dev::
+	The ID of device containing file.
+ino::
+	The inode number of file.
+uid::
+	The user id of file owner.
+gid::
+	The group id of file owner.
+size::
+	The size of the file.
+flags::
+	The flags of the file.
 
 EXCLUDE PATTERNS
 ----------------
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index e791b65e7e9..f037ccb58b4 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -11,6 +11,7 @@
 #include "quote.h"
 #include "dir.h"
 #include "builtin.h"
+#include "strbuf.h"
 #include "tree.h"
 #include "cache-tree.h"
 #include "parse-options.h"
@@ -48,6 +49,7 @@ static char *ps_matched;
 static const char *with_tree;
 static int exc_given;
 static int exclude_args;
+static const char *format;
 
 static const char *tag_cached = "";
 static const char *tag_unmerged = "";
@@ -58,8 +60,8 @@ static const char *tag_modified = "";
 static const char *tag_skip_worktree = "";
 static const char *tag_resolve_undo = "";
 
-static void write_eolinfo(struct index_state *istate,
-			  const struct cache_entry *ce, const char *path)
+static void write_eolinfo_internal(struct strbuf *sb, struct index_state *istate,
+				   const struct cache_entry *ce, const char *path)
 {
 	if (show_eol) {
 		struct stat st;
@@ -71,10 +73,25 @@ static void write_eolinfo(struct index_state *istate,
 							       ce->name);
 		if (!lstat(path, &st) && S_ISREG(st.st_mode))
 			w_txt = get_wt_convert_stats_ascii(path);
-		printf("i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
+		if (sb)
+			strbuf_addf(sb, "i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
+		else
+			printf("i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
 	}
 }
 
+static void write_eolinfo(struct index_state *istate,
+			  const struct cache_entry *ce, const char *path)
+{
+	write_eolinfo_internal(NULL, istate, ce, path);
+}
+
+static void write_eolinfo_to_buf(struct strbuf *sb, struct index_state *istate,
+				 const struct cache_entry *ce, const char *path)
+{
+	write_eolinfo_internal(sb, istate, ce, path);
+}
+
 static void write_name(const char *name)
 {
 	/*
@@ -85,6 +102,15 @@ static void write_name(const char *name)
 				   stdout, line_terminator);
 }
 
+static void write_name_to_buf(struct strbuf *sb, const char *name)
+{
+	const char *rel = relative_path(name, prefix_len ? prefix : NULL, sb);
+	if (line_terminator)
+		quote_c_style(rel, sb, NULL, 0);
+	else
+		strbuf_add(sb, rel, strlen(rel));
+}
+
 static const char *get_tag(const struct cache_entry *ce, const char *tag)
 {
 	static char alttag[4];
@@ -222,6 +248,91 @@ static void show_submodule(struct repository *superproject,
 	repo_clear(&subrepo);
 }
 
+struct show_index_data {
+	const char *tag;
+	const char *pathname;
+	struct index_state *istate;
+	const struct cache_entry *ce;
+};
+
+static size_t expand_show_index(struct strbuf *sb, const char *start,
+			       void *context)
+{
+	struct show_index_data *data = context;
+	const char *end;
+	const char *p;
+	unsigned int errlen;
+	const struct stat_data *sd = &data->ce->ce_stat_data;
+	size_t len = strbuf_expand_literal_cb(sb, start, NULL);
+	if (len)
+		return len;
+	if (*start != '(')
+		die(_("bad ls-files format: element '%s' "
+		      "does not start with '('"), start);
+
+	end = strchr(start + 1, ')');
+	if (!end)
+		die(_("bad ls-files format: element '%s'"
+		      "does not end in ')'"), start);
+
+	len = end - start + 1;
+	if (skip_prefix(start, "(tag)", &p))
+		strbuf_addstr(sb, get_tag(data->ce, data->tag));
+	else if (skip_prefix(start, "(objectmode)", &p))
+		strbuf_addf(sb, "%06o", data->ce->ce_mode);
+	else if (skip_prefix(start, "(objectname)", &p))
+		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
+	else if (skip_prefix(start, "(stage)", &p))
+		strbuf_addf(sb, "%d", ce_stage(data->ce));
+	else if (skip_prefix(start, "(eol)", &p))
+		write_eolinfo_to_buf(sb, data->istate,
+				     data->ce, data->pathname);
+	else if (skip_prefix(start, "(path)", &p))
+		write_name_to_buf(sb, data->pathname);
+	else if (skip_prefix(start, "(ctime)", &p))
+		strbuf_addf(sb, "ctime: %u:%u",
+			    sd->sd_ctime.sec, sd->sd_ctime.nsec);
+	else if (skip_prefix(start, "(mtime)", &p))
+		strbuf_addf(sb, "mtime: %u:%u",
+			    sd->sd_mtime.sec, sd->sd_mtime.nsec);
+	else if (skip_prefix(start, "(dev)", &p))
+		strbuf_addf(sb, "dev: %u", sd->sd_dev);
+	else if (skip_prefix(start, "(ino)", &p))
+		strbuf_addf(sb, "ino: %u", sd->sd_ino);
+	else if (skip_prefix(start, "(uid)", &p))
+		strbuf_addf(sb, "uid: %u", sd->sd_uid);
+	else if (skip_prefix(start, "(gid)", &p))
+		strbuf_addf(sb, "gid: %u", sd->sd_gid);
+	else if (skip_prefix(start, "(size)", &p))
+		strbuf_addf(sb, "size: %u", sd->sd_size);
+	else if (skip_prefix(start, "(flags)", &p))
+		strbuf_addf(sb, "flags: %x", data->ce->ce_flags);
+	else {
+		errlen = (unsigned long)len;
+		die(_("bad ls-files format: %%%.*s"), errlen, start);
+	}
+
+	return len;
+}
+
+static void show_ce_fmt(struct repository *repo, const struct cache_entry *ce,
+			const char *format, const char *fullname, const char *tag) {
+
+	struct show_index_data data = {
+		.tag = tag,
+		.pathname = fullname,
+		.istate = repo->index,
+		.ce = ce,
+	};
+
+	struct strbuf sb = STRBUF_INIT;
+	strbuf_expand(&sb, format, expand_show_index, &data);
+	strbuf_addch(&sb, line_terminator);
+	fwrite(sb.buf, sb.len, 1, stdout);
+	strbuf_release(&sb);
+	return;
+}
+
 static void show_ce(struct repository *repo, struct dir_struct *dir,
 		    const struct cache_entry *ce, const char *fullname,
 		    const char *tag)
@@ -236,6 +347,11 @@ static void show_ce(struct repository *repo, struct dir_struct *dir,
 				  max_prefix_len, ps_matched,
 				  S_ISDIR(ce->ce_mode) ||
 				  S_ISGITLINK(ce->ce_mode))) {
+		if (format) {
+			show_ce_fmt(repo, ce, format, fullname, tag);
+			return;
+		}
+
 		tag = get_tag(ce, tag);
 
 		if (!show_stage) {
@@ -675,6 +791,9 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 			 N_("suppress duplicate entries")),
 		OPT_BOOL(0, "sparse", &show_sparse_dirs,
 			 N_("show sparse directories in the presence of a sparse index")),
+		OPT_STRING_F(0, "format", &format, N_("format"),
+			     N_("format to use for the output"),
+			     PARSE_OPT_NONEG),
 		OPT_END()
 	};
 	int ret = 0;
@@ -699,6 +818,11 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 	for (i = 0; i < exclude_list.nr; i++) {
 		add_pattern(exclude_list.items[i].string, "", 0, pl, --exclude_args);
 	}
+
+	if (format && (show_stage || show_others || show_killed ||
+		show_resolve_undo || skipping_duplicates || debug_mode))
+			die(_("ls-files --format cannot used with -s, -o, -k, --resolve-undo, --deduplicate, --debug"));
+
 	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
 		tag_cached = "H ";
 		tag_unmerged = "M ";
diff --git a/t/t3013-ls-files-format.sh b/t/t3013-ls-files-format.sh
new file mode 100755
index 00000000000..1a1b09e7b3c
--- /dev/null
+++ b/t/t3013-ls-files-format.sh
@@ -0,0 +1,130 @@
+#!/bin/sh
+
+test_description='git ls-files --format test'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	echo o1 >o1 &&
+	echo o2 >o2 &&
+	git add o1 o2 &&
+	git add --chmod +x o1 &&
+	git commit -m base
+'
+
+test_expect_success 'git ls-files --format tag' '
+	printf "H \nH \n" >expect &&
+	git ls-files --format="%(tag)" -t >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format objectmode' '
+	cat >expect <<-\EOF &&
+	100755
+	100644
+	EOF
+	git ls-files --format="%(objectmode)" -t >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format objectname' '
+	oid1=$(git hash-object o1) &&
+	oid2=$(git hash-object o2) &&
+	cat >expect <<-EOF &&
+	$oid1
+	$oid2
+	EOF
+	git ls-files --format="%(objectname)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format eol' '
+	printf "i/lf    w/lf    attr/                 \t\n" >expect &&
+	printf "i/lf    w/lf    attr/                 \t\n" >>expect &&
+	git ls-files --format="%(eol)" --eol >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format path' '
+	cat >expect <<-\EOF &&
+	o1
+	o2
+	EOF
+	git ls-files --format="%(path)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format ctime' '
+	git ls-files --debug >out &&
+	grep ctime out >expect &&
+	git ls-files --format="  %(ctime)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format mtime' '
+	git ls-files --debug >out &&
+	grep mtime out >expect &&
+	git ls-files --format="  %(mtime)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format dev and ino' '
+	git ls-files --debug >out &&
+	grep dev out >expect &&
+	git ls-files --format="  %(dev)%x09%(ino)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format uid and gid' '
+	git ls-files --debug >out &&
+	grep uid out >expect &&
+	git ls-files --format="  %(uid)%x09%(gid)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -m' '
+	echo change >o1 &&
+	cat >expect <<-\EOF &&
+	o1
+	EOF
+	git ls-files --format="%(path)" -m >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -d' '
+	echo o3 >o3 &&
+	git add o3 &&
+	rm o3 &&
+	cat >expect <<-\EOF &&
+	o3
+	EOF
+	git ls-files --format="%(path)" -d >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format size and flags' '
+	git ls-files --debug >out &&
+	grep size out >expect &&
+	git ls-files --format="  %(size)%x09%(flags)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format imitate --stage' '
+	git ls-files --stage >expect &&
+	git ls-files --format="%(objectmode) %(objectname) %(stage)%x09%(path)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format imitate --debug' '
+	git ls-files --debug >expect &&
+	git ls-files --format="%(path)%x0a  %(ctime)%x0a  %(mtime)%x0a  %(dev)%x09%(ino)%x0a  %(uid)%x09%(gid)%x0a  %(size)%x09%(flags)" >actual &&
+	test_cmp expect actual
+'
+
+for flag in -s -o -k --resolve-undo --deduplicate --debug
+do
+	test_expect_success "git ls-files --format is incompatible with $flag" '
+		test_must_fail git ls-files --format="%(objectname)" $flag
+	'
+done
+test_done

base-commit: ab336e8f1c8009c8b1aab8deb592148e69217085
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH v2] ls-files: introduce "--format" option
  2022-06-19  9:13 ` [PATCH v2] ls-files: introduce "--format" option ZheNing Hu via GitGitGadget
@ 2022-06-19 13:50   ` Phillip Wood
  2022-06-20 13:32     ` ZheNing Hu
  2022-06-21  2:05   ` [PATCH v3] " ZheNing Hu via GitGitGadget
  1 sibling, 1 reply; 61+ messages in thread
From: Phillip Wood @ 2022-06-19 13:50 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget, git
  Cc: Junio C Hamano, Christian Couder,
	Ævar Arnfjörð Bjarmason, ZheNing Hu

Hi ZheNing

On 19/06/2022 10:13, ZheNing Hu via GitGitGadget wrote:
> From: ZheNing Hu <adlternative@gmail.com>
> 
> Add a new option --format that output index enties
> informations with custom format, taking inspiration
> from the option with the same name in the `git ls-tree`
> command.
> 
> --format cannot used with -s, -o, -k, --resolve-undo,
> --deduplicate and --debug.

I think this is an interesting feature that provides functionality that 
is not available by feeding index entries into cat-file.

> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
>   Documentation/git-ls-files.txt |  51 ++++++++++++-
>   builtin/ls-files.c             | 130 ++++++++++++++++++++++++++++++++-
>   t/t3013-ls-files-format.sh     | 130 +++++++++++++++++++++++++++++++++
>   3 files changed, 307 insertions(+), 4 deletions(-)
>   create mode 100755 t/t3013-ls-files-format.sh
> 
> diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
> index 0dabf3f0ddc..9a88c92f1ad 100644
> --- a/Documentation/git-ls-files.txt
> +++ b/Documentation/git-ls-files.txt
> @@ -20,7 +20,7 @@ SYNOPSIS
>   		[--exclude-standard]
>   		[--error-unmatch] [--with-tree=<tree-ish>]
>   		[--full-name] [--recurse-submodules]
> -		[--abbrev[=<n>]] [--] [<file>...]
> +		[--abbrev[=<n>]] [--format=<format>] [--] [<file>...]
>   
>   DESCRIPTION
>   -----------
> @@ -192,6 +192,13 @@ followed by the  ("attr/<eolattr>").
>   	to the contained files. Sparse directories will be shown with a
>   	trailing slash, such as "x/" for a sparse directory "x".
>   
> +--format=<format>::
> +	A string that interpolates `%(fieldname)` from the result being shown.
> +	It also interpolates `%%` to `%`, and `%xx` where `xx` are hex digits
> +	interpolates to character with hex code `xx`; for example `%00`
> +	interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
> +	--format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo` and
> +	`--debug`.
>   \--::
>   	Do not interpret any more arguments as options.
>   
> @@ -223,6 +230,48 @@ quoted as explained for the configuration variable `core.quotePath`
>   (see linkgit:git-config[1]).  Using `-z` the filename is output
>   verbatim and the line is terminated by a NUL byte.
>   
> +It is possible to print in a custom format by using the `--format`
> +option, which is able to interpolate different fields using
> +a `%(fieldname)` notation. For example, if you only care about the
> +"objectname" and "path" fields, you can execute with a specific
> +"--format" like
> +
> +	git ls-files --format='%(objectname) %(path)'
> +
> +FIELD NAMES
> +-----------
> +Various values from structured fields can be used to interpolate
> +into the resulting output. For each outputting line, the following
> +names can be used:
> +
> +tag::
> +	The tag of file status.

The documentation for -t strong discourages its use, so I wonder if we 
really want to expose it here.

> +objectmode::
> +	The mode of the object.
> +objectname::
> +	The name of the object.
> +stage::
> +	The stage of the file.
> +eol::
> +	The line endings of files.

Every other option refers to either a "file" or "object" but here we 
have "files". Looking at the implementation below this will print the 
line ending from both the index and the worktree, it would be useful to 
clarify that here.

> +path::
> +	The pathname of the object.
> +ctime::
> +	The create time of file.

It is not clear from this whether this (and all the file attributes 
below) are coming from the worktree or the index or both like eol?

> +mtime::
> +	The modify time of file.
> +dev::
> +	The ID of device containing file.
> +ino::
> +	The inode number of file.
> +uid::
> +	The user id of file owner.
> +gid::
> +	The group id of file owner.
> +size::
> +	The size of the file.
> +flags::
> +	The flags of the file.

What are the flags?

> [...]  
> +static size_t expand_show_index(struct strbuf *sb, const char *start,
> +			       void *context)
> +{
> +	struct show_index_data *data = context;
> +	const char *end;
> +	const char *p;
> +	unsigned int errlen;
 > [...]
> +	else if (skip_prefix(start, "(flags)", &p))
> +		strbuf_addf(sb, "flags: %x", data->ce->ce_flags);
> +	else {
> +		errlen = (unsigned long)len;
> +		die(_("bad ls-files format: %%%.*s"), errlen, start);

errlen is declared as an unsigned int, but you cast len which is a 
size_t to unsigned long when assigning to errlen. Then errlen is used 
where a signed int is required by die. There is also a style violation 
as if any branch of an if needs braces then they should all be braced. I 
think that the best solution would be to drop errlen and just write

	else
		die(_("bad ls-files format: %%%.*s"), (int)len, start);

It would be interesting to check the performance of this implementation 
on a large repository as it is doing a lot of branching inside a loop. I 
don't think we should change it unless it turns out to be a problem. 
Then we could try switching on the first character of the format 
specifier or some other optimization.

Best Wishes

Phillip

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2] ls-files: introduce "--format" option
  2022-06-19 13:50   ` Phillip Wood
@ 2022-06-20 13:32     ` ZheNing Hu
  0 siblings, 0 replies; 61+ messages in thread
From: ZheNing Hu @ 2022-06-20 13:32 UTC (permalink / raw)
  To: Phillip Wood
  Cc: ZheNing Hu via GitGitGadget, Git List, Junio C Hamano,
	Christian Couder, Ævar Arnfjörð Bjarmason

Phillip Wood <phillip.wood123@gmail.com> 于2022年6月19日周日 21:50写道:
>
> Hi ZheNing
>
> On 19/06/2022 10:13, ZheNing Hu via GitGitGadget wrote:
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > Add a new option --format that output index enties
> > informations with custom format, taking inspiration
> > from the option with the same name in the `git ls-tree`
> > command.
> >
> > --format cannot used with -s, -o, -k, --resolve-undo,
> > --deduplicate and --debug.
>
> I think this is an interesting feature that provides functionality that
> is not available by feeding index entries into cat-file.
>

Yeah, it cares about index state. Having this feature, maybe we can
easier check index/work-tree state.

> > Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> >   Documentation/git-ls-files.txt |  51 ++++++++++++-
> >   builtin/ls-files.c             | 130 ++++++++++++++++++++++++++++++++-
> >   t/t3013-ls-files-format.sh     | 130 +++++++++++++++++++++++++++++++++
> >   3 files changed, 307 insertions(+), 4 deletions(-)
> >   create mode 100755 t/t3013-ls-files-format.sh
> >
> > diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
> > index 0dabf3f0ddc..9a88c92f1ad 100644
> > --- a/Documentation/git-ls-files.txt
> > +++ b/Documentation/git-ls-files.txt
> > @@ -20,7 +20,7 @@ SYNOPSIS
> >               [--exclude-standard]
> >               [--error-unmatch] [--with-tree=<tree-ish>]
> >               [--full-name] [--recurse-submodules]
> > -             [--abbrev[=<n>]] [--] [<file>...]
> > +             [--abbrev[=<n>]] [--format=<format>] [--] [<file>...]
> >
> >   DESCRIPTION
> >   -----------
> > @@ -192,6 +192,13 @@ followed by the  ("attr/<eolattr>").
> >       to the contained files. Sparse directories will be shown with a
> >       trailing slash, such as "x/" for a sparse directory "x".
> >
> > +--format=<format>::
> > +     A string that interpolates `%(fieldname)` from the result being shown.
> > +     It also interpolates `%%` to `%`, and `%xx` where `xx` are hex digits
> > +     interpolates to character with hex code `xx`; for example `%00`
> > +     interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
> > +     --format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo` and
> > +     `--debug`.
> >   \--::
> >       Do not interpret any more arguments as options.
> >
> > @@ -223,6 +230,48 @@ quoted as explained for the configuration variable `core.quotePath`
> >   (see linkgit:git-config[1]).  Using `-z` the filename is output
> >   verbatim and the line is terminated by a NUL byte.
> >
> > +It is possible to print in a custom format by using the `--format`
> > +option, which is able to interpolate different fields using
> > +a `%(fieldname)` notation. For example, if you only care about the
> > +"objectname" and "path" fields, you can execute with a specific
> > +"--format" like
> > +
> > +     git ls-files --format='%(objectname) %(path)'
> > +
> > +FIELD NAMES
> > +-----------
> > +Various values from structured fields can be used to interpolate
> > +into the resulting output. For each outputting line, the following
> > +names can be used:
> > +
> > +tag::
> > +     The tag of file status.
>
> The documentation for -t strong discourages its use, so I wonder if we
> really want to expose it here.
>

I think it's ok to remove it.

> > +objectmode::
> > +     The mode of the object.
> > +objectname::
> > +     The name of the object.
> > +stage::
> > +     The stage of the file.
> > +eol::
> > +     The line endings of files.
>
> Every other option refers to either a "file" or "object" but here we
> have "files". Looking at the implementation below this will print the
> line ending from both the index and the worktree, it would be useful to
> clarify that here.
>

Sure...

> > +path::
> > +     The pathname of the object.
> > +ctime::
> > +     The create time of file.
>
> It is not clear from this whether this (and all the file attributes
> below) are coming from the worktree or the index or both like eol?
>

...I think they are basically index cache_entry attributes, except eol
cares about both
worktree and index. I will fix them.

> > +mtime::
> > +     The modify time of file.
> > +dev::
> > +     The ID of device containing file.
> > +ino::
> > +     The inode number of file.
> > +uid::
> > +     The user id of file owner.
> > +gid::
> > +     The group id of file owner.
> > +size::
> > +     The size of the file.
> > +flags::
> > +     The flags of the file.
>
> What are the flags?
>

It is cache entry flags which include In-memory only flags and some
extended on-disk flags.

> > [...]
> > +static size_t expand_show_index(struct strbuf *sb, const char *start,
> > +                            void *context)
> > +{
> > +     struct show_index_data *data = context;
> > +     const char *end;
> > +     const char *p;
> > +     unsigned int errlen;
>  > [...]
> > +     else if (skip_prefix(start, "(flags)", &p))
> > +             strbuf_addf(sb, "flags: %x", data->ce->ce_flags);
> > +     else {
> > +             errlen = (unsigned long)len;
> > +             die(_("bad ls-files format: %%%.*s"), errlen, start);
>
> errlen is declared as an unsigned int, but you cast len which is a
> size_t to unsigned long when assigning to errlen. Then errlen is used
> where a signed int is required by die. There is also a style violation
> as if any branch of an if needs braces then they should all be braced. I
> think that the best solution would be to drop errlen and just write
>
>         else
>                 die(_("bad ls-files format: %%%.*s"), (int)len, start);
>

This piece of code is copying from ls-tree. Maybe we should fix it too.

> It would be interesting to check the performance of this implementation
> on a large repository as it is doing a lot of branching inside a loop. I
> don't think we should change it unless it turns out to be a problem.
> Then we could try switching on the first character of the format
> specifier or some other optimization.
>

Just like ref-filter or something else does, it parses atoms
and then fills buffers with information. Maybe we need such performance
optimization later, but for now, it's just easier to implement this patch :)

> Best Wishes
>
> Phillip

Thanks

ZheNing Hu

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v3] ls-files: introduce "--format" option
  2022-06-19  9:13 ` [PATCH v2] ls-files: introduce "--format" option ZheNing Hu via GitGitGadget
  2022-06-19 13:50   ` Phillip Wood
@ 2022-06-21  2:05   ` ZheNing Hu via GitGitGadget
  2022-06-23 14:06     ` Phillip Wood
                       ` (2 more replies)
  1 sibling, 3 replies; 61+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2022-06-21  2:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder,
	Ævar Arnfjörð Bjarmason, Phillip Wood, ZheNing Hu,
	ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Add a new option --format that output index enties
informations with custom format, taking inspiration
from the option with the same name in the `git ls-tree`
command.

--format cannot used with -s, -o, -k, --resolve-undo,
--deduplicate and --debug.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
    ls-files: introduce "--format" options
    
    v2->v3:
    
     1. remove %(tag) because -t is deprecated, suggested by Phillip.
     2. fix some description of atoms in document, suggested by Phillip..

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1262%2Fadlternative%2Fzh%2Fls-file-format-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1262/adlternative/zh/ls-file-format-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1262

Range-diff vs v2:

 1:  67f2c3b8ebe ! 1:  aaafa35ffcd ls-files: introduce "--format" option
     @@ Documentation/git-ls-files.txt: quoted as explained for the configuration variab
      +into the resulting output. For each outputting line, the following
      +names can be used:
      +
     -+tag::
     -+	The tag of file status.
      +objectmode::
     -+	The mode of the object.
     ++	The mode of the file which is in the index.
      +objectname::
     -+	The name of the object.
     ++	The name of the file which is in the index.
      +stage::
     -+	The stage of the file.
     ++	The stage of the file which is in the index.
      +eol::
     -+	The line endings of files.
     ++	The <eolinfo> and <eolattr> of files both in the
     ++	index and the work-tree.
      +path::
     -+	The pathname of the object.
     ++	The pathname of the file which is in the index.
      +ctime::
     -+	The create time of file.
     ++	The create time of file which is in the index.
      +mtime::
     -+	The modify time of file.
     ++	The modified time of file which is in the index.
      +dev::
     -+	The ID of device containing file.
     ++	The ID of device containing file which is in the index.
      +ino::
     -+	The inode number of file.
     ++	The inode number of file which is in the index.
      +uid::
     -+	The user id of file owner.
     ++	The user id of file owner which is in the index.
      +gid::
     -+	The group id of file owner.
     ++	The group id of file owner which is in the index.
      +size::
     -+	The size of the file.
     ++	The size of the file which is in the index.
      +flags::
     -+	The flags of the file.
     ++	The flags of the file in the index which include
     ++	in-memory only flags and some extended on-disk flags.
       
       EXCLUDE PATTERNS
       ----------------
     @@ builtin/ls-files.c: static void show_submodule(struct repository *superproject,
      +	struct show_index_data *data = context;
      +	const char *end;
      +	const char *p;
     -+	unsigned int errlen;
      +	const struct stat_data *sd = &data->ce->ce_stat_data;
      +	size_t len = strbuf_expand_literal_cb(sb, start, NULL);
      +	if (len)
     @@ builtin/ls-files.c: static void show_submodule(struct repository *superproject,
      +		      "does not end in ')'"), start);
      +
      +	len = end - start + 1;
     -+	if (skip_prefix(start, "(tag)", &p))
     -+		strbuf_addstr(sb, get_tag(data->ce, data->tag));
     -+	else if (skip_prefix(start, "(objectmode)", &p))
     ++	if (skip_prefix(start, "(objectmode)", &p))
      +		strbuf_addf(sb, "%06o", data->ce->ce_mode);
      +	else if (skip_prefix(start, "(objectname)", &p))
      +		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
     @@ builtin/ls-files.c: static void show_submodule(struct repository *superproject,
      +		strbuf_addf(sb, "size: %u", sd->sd_size);
      +	else if (skip_prefix(start, "(flags)", &p))
      +		strbuf_addf(sb, "flags: %x", data->ce->ce_flags);
     -+	else {
     -+		errlen = (unsigned long)len;
     -+		die(_("bad ls-files format: %%%.*s"), errlen, start);
     -+	}
     ++	else
     ++		die(_("bad ls-files format: %%%.*s"), (int)len, start);
      +
      +	return len;
      +}
      +
      +static void show_ce_fmt(struct repository *repo, const struct cache_entry *ce,
     -+			const char *format, const char *fullname, const char *tag) {
     ++			const char *format, const char *fullname) {
      +
      +	struct show_index_data data = {
     -+		.tag = tag,
      +		.pathname = fullname,
      +		.istate = repo->index,
      +		.ce = ce,
     @@ builtin/ls-files.c: static void show_ce(struct repository *repo, struct dir_stru
       				  S_ISDIR(ce->ce_mode) ||
       				  S_ISGITLINK(ce->ce_mode))) {
      +		if (format) {
     -+			show_ce_fmt(repo, ce, format, fullname, tag);
     ++			show_ce_fmt(repo, ce, format, fullname);
      +			return;
      +		}
      +
     @@ t/t3013-ls-files-format.sh (new)
      +	git commit -m base
      +'
      +
     -+test_expect_success 'git ls-files --format tag' '
     -+	printf "H \nH \n" >expect &&
     -+	git ls-files --format="%(tag)" -t >actual &&
     -+	test_cmp expect actual
     -+'
     -+
      +test_expect_success 'git ls-files --format objectmode' '
      +	cat >expect <<-\EOF &&
      +	100755


 Documentation/git-ls-files.txt |  51 +++++++++++++-
 builtin/ls-files.c             | 124 ++++++++++++++++++++++++++++++++-
 t/t3013-ls-files-format.sh     | 124 +++++++++++++++++++++++++++++++++
 3 files changed, 295 insertions(+), 4 deletions(-)
 create mode 100755 t/t3013-ls-files-format.sh

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index 0dabf3f0ddc..39211bde797 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -20,7 +20,7 @@ SYNOPSIS
 		[--exclude-standard]
 		[--error-unmatch] [--with-tree=<tree-ish>]
 		[--full-name] [--recurse-submodules]
-		[--abbrev[=<n>]] [--] [<file>...]
+		[--abbrev[=<n>]] [--format=<format>] [--] [<file>...]
 
 DESCRIPTION
 -----------
@@ -192,6 +192,13 @@ followed by the  ("attr/<eolattr>").
 	to the contained files. Sparse directories will be shown with a
 	trailing slash, such as "x/" for a sparse directory "x".
 
+--format=<format>::
+	A string that interpolates `%(fieldname)` from the result being shown.
+	It also interpolates `%%` to `%`, and `%xx` where `xx` are hex digits
+	interpolates to character with hex code `xx`; for example `%00`
+	interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
+	--format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo` and
+	`--debug`.
 \--::
 	Do not interpret any more arguments as options.
 
@@ -223,6 +230,48 @@ quoted as explained for the configuration variable `core.quotePath`
 (see linkgit:git-config[1]).  Using `-z` the filename is output
 verbatim and the line is terminated by a NUL byte.
 
+It is possible to print in a custom format by using the `--format`
+option, which is able to interpolate different fields using
+a `%(fieldname)` notation. For example, if you only care about the
+"objectname" and "path" fields, you can execute with a specific
+"--format" like
+
+	git ls-files --format='%(objectname) %(path)'
+
+FIELD NAMES
+-----------
+Various values from structured fields can be used to interpolate
+into the resulting output. For each outputting line, the following
+names can be used:
+
+objectmode::
+	The mode of the file which is in the index.
+objectname::
+	The name of the file which is in the index.
+stage::
+	The stage of the file which is in the index.
+eol::
+	The <eolinfo> and <eolattr> of files both in the
+	index and the work-tree.
+path::
+	The pathname of the file which is in the index.
+ctime::
+	The create time of file which is in the index.
+mtime::
+	The modified time of file which is in the index.
+dev::
+	The ID of device containing file which is in the index.
+ino::
+	The inode number of file which is in the index.
+uid::
+	The user id of file owner which is in the index.
+gid::
+	The group id of file owner which is in the index.
+size::
+	The size of the file which is in the index.
+flags::
+	The flags of the file in the index which include
+	in-memory only flags and some extended on-disk flags.
 
 EXCLUDE PATTERNS
 ----------------
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index e791b65e7e9..387641b32df 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -11,6 +11,7 @@
 #include "quote.h"
 #include "dir.h"
 #include "builtin.h"
+#include "strbuf.h"
 #include "tree.h"
 #include "cache-tree.h"
 #include "parse-options.h"
@@ -48,6 +49,7 @@ static char *ps_matched;
 static const char *with_tree;
 static int exc_given;
 static int exclude_args;
+static const char *format;
 
 static const char *tag_cached = "";
 static const char *tag_unmerged = "";
@@ -58,8 +60,8 @@ static const char *tag_modified = "";
 static const char *tag_skip_worktree = "";
 static const char *tag_resolve_undo = "";
 
-static void write_eolinfo(struct index_state *istate,
-			  const struct cache_entry *ce, const char *path)
+static void write_eolinfo_internal(struct strbuf *sb, struct index_state *istate,
+				   const struct cache_entry *ce, const char *path)
 {
 	if (show_eol) {
 		struct stat st;
@@ -71,10 +73,25 @@ static void write_eolinfo(struct index_state *istate,
 							       ce->name);
 		if (!lstat(path, &st) && S_ISREG(st.st_mode))
 			w_txt = get_wt_convert_stats_ascii(path);
-		printf("i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
+		if (sb)
+			strbuf_addf(sb, "i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
+		else
+			printf("i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
 	}
 }
 
+static void write_eolinfo(struct index_state *istate,
+			  const struct cache_entry *ce, const char *path)
+{
+	write_eolinfo_internal(NULL, istate, ce, path);
+}
+
+static void write_eolinfo_to_buf(struct strbuf *sb, struct index_state *istate,
+				 const struct cache_entry *ce, const char *path)
+{
+	write_eolinfo_internal(sb, istate, ce, path);
+}
+
 static void write_name(const char *name)
 {
 	/*
@@ -85,6 +102,15 @@ static void write_name(const char *name)
 				   stdout, line_terminator);
 }
 
+static void write_name_to_buf(struct strbuf *sb, const char *name)
+{
+	const char *rel = relative_path(name, prefix_len ? prefix : NULL, sb);
+	if (line_terminator)
+		quote_c_style(rel, sb, NULL, 0);
+	else
+		strbuf_add(sb, rel, strlen(rel));
+}
+
 static const char *get_tag(const struct cache_entry *ce, const char *tag)
 {
 	static char alttag[4];
@@ -222,6 +248,85 @@ static void show_submodule(struct repository *superproject,
 	repo_clear(&subrepo);
 }
 
+struct show_index_data {
+	const char *tag;
+	const char *pathname;
+	struct index_state *istate;
+	const struct cache_entry *ce;
+};
+
+static size_t expand_show_index(struct strbuf *sb, const char *start,
+			       void *context)
+{
+	struct show_index_data *data = context;
+	const char *end;
+	const char *p;
+	const struct stat_data *sd = &data->ce->ce_stat_data;
+	size_t len = strbuf_expand_literal_cb(sb, start, NULL);
+	if (len)
+		return len;
+	if (*start != '(')
+		die(_("bad ls-files format: element '%s' "
+		      "does not start with '('"), start);
+
+	end = strchr(start + 1, ')');
+	if (!end)
+		die(_("bad ls-files format: element '%s'"
+		      "does not end in ')'"), start);
+
+	len = end - start + 1;
+	if (skip_prefix(start, "(objectmode)", &p))
+		strbuf_addf(sb, "%06o", data->ce->ce_mode);
+	else if (skip_prefix(start, "(objectname)", &p))
+		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
+	else if (skip_prefix(start, "(stage)", &p))
+		strbuf_addf(sb, "%d", ce_stage(data->ce));
+	else if (skip_prefix(start, "(eol)", &p))
+		write_eolinfo_to_buf(sb, data->istate,
+				     data->ce, data->pathname);
+	else if (skip_prefix(start, "(path)", &p))
+		write_name_to_buf(sb, data->pathname);
+	else if (skip_prefix(start, "(ctime)", &p))
+		strbuf_addf(sb, "ctime: %u:%u",
+			    sd->sd_ctime.sec, sd->sd_ctime.nsec);
+	else if (skip_prefix(start, "(mtime)", &p))
+		strbuf_addf(sb, "mtime: %u:%u",
+			    sd->sd_mtime.sec, sd->sd_mtime.nsec);
+	else if (skip_prefix(start, "(dev)", &p))
+		strbuf_addf(sb, "dev: %u", sd->sd_dev);
+	else if (skip_prefix(start, "(ino)", &p))
+		strbuf_addf(sb, "ino: %u", sd->sd_ino);
+	else if (skip_prefix(start, "(uid)", &p))
+		strbuf_addf(sb, "uid: %u", sd->sd_uid);
+	else if (skip_prefix(start, "(gid)", &p))
+		strbuf_addf(sb, "gid: %u", sd->sd_gid);
+	else if (skip_prefix(start, "(size)", &p))
+		strbuf_addf(sb, "size: %u", sd->sd_size);
+	else if (skip_prefix(start, "(flags)", &p))
+		strbuf_addf(sb, "flags: %x", data->ce->ce_flags);
+	else
+		die(_("bad ls-files format: %%%.*s"), (int)len, start);
+
+	return len;
+}
+
+static void show_ce_fmt(struct repository *repo, const struct cache_entry *ce,
+			const char *format, const char *fullname) {
+
+	struct show_index_data data = {
+		.pathname = fullname,
+		.istate = repo->index,
+		.ce = ce,
+	};
+
+	struct strbuf sb = STRBUF_INIT;
+	strbuf_expand(&sb, format, expand_show_index, &data);
+	strbuf_addch(&sb, line_terminator);
+	fwrite(sb.buf, sb.len, 1, stdout);
+	strbuf_release(&sb);
+	return;
+}
+
 static void show_ce(struct repository *repo, struct dir_struct *dir,
 		    const struct cache_entry *ce, const char *fullname,
 		    const char *tag)
@@ -236,6 +341,11 @@ static void show_ce(struct repository *repo, struct dir_struct *dir,
 				  max_prefix_len, ps_matched,
 				  S_ISDIR(ce->ce_mode) ||
 				  S_ISGITLINK(ce->ce_mode))) {
+		if (format) {
+			show_ce_fmt(repo, ce, format, fullname);
+			return;
+		}
+
 		tag = get_tag(ce, tag);
 
 		if (!show_stage) {
@@ -675,6 +785,9 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 			 N_("suppress duplicate entries")),
 		OPT_BOOL(0, "sparse", &show_sparse_dirs,
 			 N_("show sparse directories in the presence of a sparse index")),
+		OPT_STRING_F(0, "format", &format, N_("format"),
+			     N_("format to use for the output"),
+			     PARSE_OPT_NONEG),
 		OPT_END()
 	};
 	int ret = 0;
@@ -699,6 +812,11 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 	for (i = 0; i < exclude_list.nr; i++) {
 		add_pattern(exclude_list.items[i].string, "", 0, pl, --exclude_args);
 	}
+
+	if (format && (show_stage || show_others || show_killed ||
+		show_resolve_undo || skipping_duplicates || debug_mode))
+			die(_("ls-files --format cannot used with -s, -o, -k, --resolve-undo, --deduplicate, --debug"));
+
 	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
 		tag_cached = "H ";
 		tag_unmerged = "M ";
diff --git a/t/t3013-ls-files-format.sh b/t/t3013-ls-files-format.sh
new file mode 100755
index 00000000000..8c3ef2df138
--- /dev/null
+++ b/t/t3013-ls-files-format.sh
@@ -0,0 +1,124 @@
+#!/bin/sh
+
+test_description='git ls-files --format test'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	echo o1 >o1 &&
+	echo o2 >o2 &&
+	git add o1 o2 &&
+	git add --chmod +x o1 &&
+	git commit -m base
+'
+
+test_expect_success 'git ls-files --format objectmode' '
+	cat >expect <<-\EOF &&
+	100755
+	100644
+	EOF
+	git ls-files --format="%(objectmode)" -t >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format objectname' '
+	oid1=$(git hash-object o1) &&
+	oid2=$(git hash-object o2) &&
+	cat >expect <<-EOF &&
+	$oid1
+	$oid2
+	EOF
+	git ls-files --format="%(objectname)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format eol' '
+	printf "i/lf    w/lf    attr/                 \t\n" >expect &&
+	printf "i/lf    w/lf    attr/                 \t\n" >>expect &&
+	git ls-files --format="%(eol)" --eol >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format path' '
+	cat >expect <<-\EOF &&
+	o1
+	o2
+	EOF
+	git ls-files --format="%(path)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format ctime' '
+	git ls-files --debug >out &&
+	grep ctime out >expect &&
+	git ls-files --format="  %(ctime)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format mtime' '
+	git ls-files --debug >out &&
+	grep mtime out >expect &&
+	git ls-files --format="  %(mtime)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format dev and ino' '
+	git ls-files --debug >out &&
+	grep dev out >expect &&
+	git ls-files --format="  %(dev)%x09%(ino)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format uid and gid' '
+	git ls-files --debug >out &&
+	grep uid out >expect &&
+	git ls-files --format="  %(uid)%x09%(gid)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -m' '
+	echo change >o1 &&
+	cat >expect <<-\EOF &&
+	o1
+	EOF
+	git ls-files --format="%(path)" -m >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -d' '
+	echo o3 >o3 &&
+	git add o3 &&
+	rm o3 &&
+	cat >expect <<-\EOF &&
+	o3
+	EOF
+	git ls-files --format="%(path)" -d >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format size and flags' '
+	git ls-files --debug >out &&
+	grep size out >expect &&
+	git ls-files --format="  %(size)%x09%(flags)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format imitate --stage' '
+	git ls-files --stage >expect &&
+	git ls-files --format="%(objectmode) %(objectname) %(stage)%x09%(path)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format imitate --debug' '
+	git ls-files --debug >expect &&
+	git ls-files --format="%(path)%x0a  %(ctime)%x0a  %(mtime)%x0a  %(dev)%x09%(ino)%x0a  %(uid)%x09%(gid)%x0a  %(size)%x09%(flags)" >actual &&
+	test_cmp expect actual
+'
+
+for flag in -s -o -k --resolve-undo --deduplicate --debug
+do
+	test_expect_success "git ls-files --format is incompatible with $flag" '
+		test_must_fail git ls-files --format="%(objectname)" $flag
+	'
+done
+test_done

base-commit: ab336e8f1c8009c8b1aab8deb592148e69217085
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-21  2:05   ` [PATCH v3] " ZheNing Hu via GitGitGadget
@ 2022-06-23 14:06     ` Phillip Wood
  2022-06-23 15:57       ` Junio C Hamano
  2022-06-26 13:01       ` ZheNing Hu
  2022-06-24 13:25     ` Ævar Arnfjörð Bjarmason
  2022-06-26 15:29     ` [PATCH v4] " ZheNing Hu via GitGitGadget
  2 siblings, 2 replies; 61+ messages in thread
From: Phillip Wood @ 2022-06-23 14:06 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget, git
  Cc: Junio C Hamano, Christian Couder,
	Ævar Arnfjörð Bjarmason, ZheNing Hu

Hi ZheNing

On 21/06/2022 03:05, ZheNing Hu via GitGitGadget wrote:
> From: ZheNing Hu <adlternative@gmail.com>
> 
> Add a new option --format that output index enties
> informations with custom format, taking inspiration
> from the option with the same name in the `git ls-tree`
> command.
> 
> --format cannot used with -s, -o, -k, --resolve-undo,
> --deduplicate and --debug.
> 
> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---
>      ls-files: introduce "--format" options
>      
>      v2->v3:
>      
>       1. remove %(tag) because -t is deprecated, suggested by Phillip.
>       2. fix some description of atoms in document, suggested by Phillip..

Thanks for re-rolling, having taken a look a closer look at the tests 
I'm concerned about the output format for some of the specifiers, see below.

> [...]  
> +It is possible to print in a custom format by using the `--format`
> +option, which is able to interpolate different fields using
> +a `%(fieldname)` notation. For example, if you only care about the
> +"objectname" and "path" fields, you can execute with a specific
> +"--format" like
> +
> +	git ls-files --format='%(objectname) %(path)'
> +
> +FIELD NAMES
> +-----------
> +Various values from structured fields can be used to interpolate
> +into the resulting output. For each outputting line, the following
> +names can be used:
> +
> +objectmode::
> +	The mode of the file which is in the index.
> +objectname::
> +	The name of the file which is in the index.
> +stage::
> +	The stage of the file which is in the index.
> +eol::
> +	The <eolinfo> and <eolattr> of files both in the
> +	index and the work-tree.

Looking at the test for this option I think it needs more work, why 
should --format arbitrarily append a tab to the end of the output? - the 
user should be able to specify a separator if they want one as part of 
the format string. Also I'm not sure why there is so much whitespace in 
the output.

> +path::
> +	The pathname of the file which is in the index.

I think that for all these it might be clearer to say "recorded in the 
index" rather than "of the file which is in the index"

> +ctime::
> +	The create time of file which is in the index.

This is printed with a prefix 'ctime:' (the same applies to the format 
specifiers below) I think we should omit that and just print the data so 
the user can choose the format they want.

> +mtime::
> +	The modified time of file which is in the index.
> +dev::
> +	The ID of device containing file which is in the index.
> +ino::
> +	The inode number of file which is in the index.
> +uid::
> +	The user id of file owner which is in the index.
> +gid::
> +	The group id of file owner which is in the index.
> +size::
> +	The size of the file which is in the index.
> +flags::
> +	The flags of the file in the index which include
> +	in-memory only flags and some extended on-disk flags.

If %(flags) is going to be useful then I think we need to think about 
how they are printed and document that. At the moment they are printed 
as a hexadecimal number which is fine for debugging but probably not 
going to be useful for something like --format. I think printing 
documented symbolic names with some kind of separator (a comma maybe) 
between them is probably more useful

 > [...]
> +test_expect_success 'git ls-files --format eol' '
> +	printf "i/lf    w/lf    attr/                 \t\n" >expect &&
> +	printf "i/lf    w/lf    attr/                 \t\n" >>expect &&
> +	git ls-files --format="%(eol)" --eol >actual &&

I'm not sure why this is passing --eol as well as --format='%(eol)' - 
shouldn't that combination of flags be an error?

Best Wishes

Phillip

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-23 14:06     ` Phillip Wood
@ 2022-06-23 15:57       ` Junio C Hamano
  2022-06-24 10:16         ` Phillip Wood
  2022-06-24 13:20         ` Ævar Arnfjörð Bjarmason
  2022-06-26 13:01       ` ZheNing Hu
  1 sibling, 2 replies; 61+ messages in thread
From: Junio C Hamano @ 2022-06-23 15:57 UTC (permalink / raw)
  To: Phillip Wood
  Cc: ZheNing Hu via GitGitGadget, git, Christian Couder,
	Ævar Arnfjörð Bjarmason, ZheNing Hu

Phillip Wood <phillip.wood123@gmail.com> writes:

> Thanks for re-rolling, having taken a look a closer look at the tests
> I'm concerned about the output format for some of the specifiers, see
> below.

Thanks for raising these issues.  I agree with you on many of them.
In addition to what you covered ....

>> +path::
>> +	The pathname of the file which is in the index.
> I think that for all these it might be clearer to say "recorded in the
> index" rather than "of the file which is in the index"

I think we would call this "name".  The name of the existing option
that controls how they are shown is "--full-name", not "--full-path",
for example.

>> +ctime::
>> +	The create time of file which is in the index.
>
> This is printed with a prefix 'ctime:' (the same applies to the format
> specifiers below) I think we should omit that and just print the data
> so the user can choose the format they want.
>
>> +mtime::
>> +	The modified time of file which is in the index.

These are only the low-bits of the full timestamp, not ctime/mtime
themselves.

But stepping back a bit, why do we need to include them in the
output?  What workflow and use case are we trying to help?  Dump
output from "stat <path>" equivalent from ls-files and compare with
"stat ." output to see which ones are stale?  Or is there any value
to see the value of, say, ctime as an individual data item?

>> +dev::
>> +	The ID of device containing file which is in the index.
>> +ino::
>> +	The inode number of file which is in the index.
>> +uid::
>> +	The user id of file owner which is in the index.
>> +gid::
>> +	The group id of file owner which is in the index.

Again, why do we need to include these in the output?

Wouldn't it be sufficient, as well as a lot more useful, to show a
single bit "the cached stat info matches what is in the working tree
(yes/no)"?

>> +size::
>> +	The size of the file which is in the index.

This needs to explain what kind of size this is.  Is it the size of
the blob object?  Is it the size of the file in the working tree
(i.e. not cleaned)?  Is it _always_ the size, or can it become a
number that is very different from size in certain circumstances?

IOW, I do not think giving this to unsuspecting users and call it
"size of the file" hurts them more than it helps them, especially
because it is not always the size of the file.

I'd suggest getting rid of everything from ctime down to size and if
we really care about the freshness of the cached stat info, replace
them with a single bit "up-to-date".

>> +flags::
>> +	The flags of the file in the index which include
>> +	in-memory only flags and some extended on-disk flags.
>
> If %(flags) is going to be useful then I think we need to think about
> how they are printed and document that. At the moment they are printed 
> as a hexadecimal number which is fine for debugging but probably not
> going to be useful for something like --format. I think printing 
> documented symbolic names with some kind of separator (a comma maybe)
> between them is probably more useful

I am guessing that most of the above are only useful for curious
geeks and those who are debugging their new tweak to the code that
touches the index, i.e. a debugging feature.  But these folks can
run "git" under a debugger, and they probably have to do so when
they are seeing an unexpected value in the flags member of a cache
entry anyway.  So I am not sure whom this field is intended to help.

>> [...]
>> +test_expect_success 'git ls-files --format eol' '
>> +	printf "i/lf    w/lf    attr/                 \t\n" >expect &&
>> +	printf "i/lf    w/lf    attr/                 \t\n" >>expect &&
>> +	git ls-files --format="%(eol)" --eol >actual &&
>
> I'm not sure why this is passing --eol as well as --format='%(eol)' -
> shouldn't that combination of flags be an error?

Good eyes.

Thanks.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-23 15:57       ` Junio C Hamano
@ 2022-06-24 10:16         ` Phillip Wood
  2022-06-26 13:05           ` ZheNing Hu
  2022-06-24 13:20         ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 61+ messages in thread
From: Phillip Wood @ 2022-06-24 10:16 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: ZheNing Hu via GitGitGadget, git, Christian Couder,
	Ævar Arnfjörð Bjarmason, ZheNing Hu

On 23/06/2022 16:57, Junio C Hamano wrote:
> Phillip Wood <phillip.wood123@gmail.com> writes:
> 
>> Thanks for re-rolling, having taken a look a closer look at the tests
>> I'm concerned about the output format for some of the specifiers, see
>> below.
> 
> Thanks for raising these issues.  I agree with you on many of them.
> In addition to what you covered ....
> 
>>> +path::
>>> +	The pathname of the file which is in the index.
>> I think that for all these it might be clearer to say "recorded in the
>> index" rather than "of the file which is in the index"
> 
> I think we would call this "name".  The name of the existing option
> that controls how they are shown is "--full-name", not "--full-path",
> for example.

That's a good point, also I've just noticed that this is another case 
where there is a separator character is printed automatically when the 
format string is expanded. I think it is probably right to format the 
name based on whether or not -z was passed but we should leave it up to 
the user to supply a delimiter in the format string.

>>> +ctime::
>>> +	The create time of file which is in the index.
>>
>> This is printed with a prefix 'ctime:' (the same applies to the format
>> specifiers below) I think we should omit that and just print the data
>> so the user can choose the format they want.
>>
>>> +mtime::
>>> +	The modified time of file which is in the index.
> 
> These are only the low-bits of the full timestamp, not ctime/mtime
> themselves.
> 
> But stepping back a bit, why do we need to include them in the
> output?  What workflow and use case are we trying to help?  Dump
> output from "stat <path>" equivalent from ls-files and compare with
> "stat ." output to see which ones are stale?  Or is there any value
> to see the value of, say, ctime as an individual data item?
> 
>>> +dev::
>>> +	The ID of device containing file which is in the index.
>>> +ino::
>>> +	The inode number of file which is in the index.
>>> +uid::
>>> +	The user id of file owner which is in the index.
>>> +gid::
>>> +	The group id of file owner which is in the index.
> 
> Again, why do we need to include these in the output?
> 
> Wouldn't it be sufficient, as well as a lot more useful, to show a
> single bit "the cached stat info matches what is in the working tree
> (yes/no)"?

That does sound useful

>>> +flags::
>>> +	The flags of the file in the index which include
>>> +	in-memory only flags and some extended on-disk flags.
>>
>> If %(flags) is going to be useful then I think we need to think about
>> how they are printed and document that. At the moment they are printed
>> as a hexadecimal number which is fine for debugging but probably not
>> going to be useful for something like --format. I think printing
>> documented symbolic names with some kind of separator (a comma maybe)
>> between them is probably more useful
> 
> I am guessing that most of the above are only useful for curious
> geeks and those who are debugging their new tweak to the code that
> touches the index, i.e. a debugging feature.  But these folks can
> run "git" under a debugger, and they probably have to do so when
> they are seeing an unexpected value in the flags member of a cache
> entry anyway.  So I am not sure whom this field is intended to help.

I wondered about that as well, but thought there might be a plausible 
use if someone wants to check if an entry is marked intent-to-add, or 
has the skip-worktree/spare-index bits set (are there other ways to 
inspect those?)

Best Wishes

Phillip

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-23 15:57       ` Junio C Hamano
  2022-06-24 10:16         ` Phillip Wood
@ 2022-06-24 13:20         ` Ævar Arnfjörð Bjarmason
  2022-06-24 15:28           ` Junio C Hamano
  1 sibling, 1 reply; 61+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-24 13:20 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Phillip Wood, ZheNing Hu via GitGitGadget, git, Christian Couder,
	ZheNing Hu


On Thu, Jun 23 2022, Junio C Hamano wrote:

> Phillip Wood <phillip.wood123@gmail.com> writes:
>
>> Thanks for re-rolling, having taken a look a closer look at the tests
>> I'm concerned about the output format for some of the specifiers, see
>> below.
>
> Thanks for raising these issues.  I agree with you on many of them.
> In addition to what you covered ....
>
>>> +path::
>>> +	The pathname of the file which is in the index.
>> I think that for all these it might be clearer to say "recorded in the
>> index" rather than "of the file which is in the index"
>
> I think we would call this "name".  The name of the existing option
> that controls how they are shown is "--full-name", not "--full-path",
> for example.

To the extent that we got this wrong it was me in 455923e0a15 (ls-tree:
introduce "--format" option, 2022-03-23), but given that we have that I
think it makes sense to have this be consistent with ls-tree.

FWIW ls-tree also uses "name" options, but its docs talked about
"<path>", so I thought it was more helpful to pick that.

We also say that we will "show the full path names" in that
documentation.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-21  2:05   ` [PATCH v3] " ZheNing Hu via GitGitGadget
  2022-06-23 14:06     ` Phillip Wood
@ 2022-06-24 13:25     ` Ævar Arnfjörð Bjarmason
  2022-06-24 15:33       ` Junio C Hamano
  2022-06-26 13:34       ` ZheNing Hu
  2022-06-26 15:29     ` [PATCH v4] " ZheNing Hu via GitGitGadget
  2 siblings, 2 replies; 61+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-24 13:25 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Junio C Hamano, Christian Couder, Phillip Wood, ZheNing Hu


On Tue, Jun 21 2022, ZheNing Hu via GitGitGadget wrote:

> From: ZheNing Hu <adlternative@gmail.com>
> [...]
> +	if (skip_prefix(start, "(objectmode)", &p))
> +		strbuf_addf(sb, "%06o", data->ce->ce_mode);
> +	else if (skip_prefix(start, "(objectname)", &p))
> +		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
> +	else if (skip_prefix(start, "(stage)", &p))
> +		strbuf_addf(sb, "%d", ce_stage(data->ce));
> +	else if (skip_prefix(start, "(eol)", &p))
> +		write_eolinfo_to_buf(sb, data->istate,
> +				     data->ce, data->pathname);
> +	else if (skip_prefix(start, "(path)", &p))
> +		write_name_to_buf(sb, data->pathname);
> +	else if (skip_prefix(start, "(ctime)", &p))
> +		strbuf_addf(sb, "ctime: %u:%u",
> +			    sd->sd_ctime.sec, sd->sd_ctime.nsec);
> +	else if (skip_prefix(start, "(mtime)", &p))
> +		strbuf_addf(sb, "mtime: %u:%u",
> +			    sd->sd_mtime.sec, sd->sd_mtime.nsec);
> +	else if (skip_prefix(start, "(dev)", &p))
> +		strbuf_addf(sb, "dev: %u", sd->sd_dev);
> +	else if (skip_prefix(start, "(ino)", &p))
> +		strbuf_addf(sb, "ino: %u", sd->sd_ino);
> +	else if (skip_prefix(start, "(uid)", &p))
> +		strbuf_addf(sb, "uid: %u", sd->sd_uid);
> +	else if (skip_prefix(start, "(gid)", &p))
> +		strbuf_addf(sb, "gid: %u", sd->sd_gid);
> +	else if (skip_prefix(start, "(size)", &p))
> +		strbuf_addf(sb, "size: %u", sd->sd_size);
> +	else if (skip_prefix(start, "(flags)", &p))
> +		strbuf_addf(sb, "flags: %x", data->ce->ce_flags);


In my mind almost the entire point of a --format is that you can
e.g. \0-delimit it, and don't need to do other parsing games.

So this really should be adding just e.g. "%x", not "flags: %x", 

Similarly, let's no have :-delimited fields. First, for a formatted
number "1656077225:850723245" is just bizarre for %(ctime), let's use
".", not ":", so: "1656077225.850723245".

And let's call that %(ctime), then have (which is trivial to add) a
%(ctime:sec) and %(ctime:nsec), so someone who wants to format this can
parse it as they please, ditto for mtime.

Looking at your tests it seemed you went down the route of aligning the
output with the --debug output, which is already pre-formatted. I.e. to
make what you have here match:

                printf("  ctime: %u:%u\n", sd->sd_ctime.sec, sd->sd_ctime.nsec);
                printf("  mtime: %u:%u\n", sd->sd_mtime.sec, sd->sd_mtime.nsec);
                printf("  dev: %u\tino: %u\n", sd->sd_dev, sd->sd_ino);
                printf("  uid: %u\tgid: %u\n", sd->sd_uid, sd->sd_gid);
                printf("  size: %u\tflags: %x\n", sd->sd_size, ce->ce_flags);

I think that's a mistake, we should be able to emit those individual
%-specifiers instead, not that line as-is without the " " prefix and
"\n" suffix.

> +
> +	if (format && (show_stage || show_others || show_killed ||
> +		show_resolve_undo || skipping_duplicates || debug_mode))
> +			die(_("ls-files --format cannot used with -s, -o, -k, --resolve-undo, --deduplicate, --debug"));

Use usage_msg_opt() or usage_msg_optf() here instead of die(), and no
need to include "ls-files " in the message.

See die_for_incompatible_opt4, maybe you can just use that instead? A
bit painful, but:

    die_for_incompatible_opt4(format, "--format", show_stage, "-s", show_others, "-o", show_killed, "-k");
    die_for_incompatible_opt4(format, "--format", show_resolve_undo, "--resolve-undo", skipping_duplicates, "--deduplicate", debug_mode, "--debug");

But urgh, that helper really should use usage_msg_opt() instead, but
using it for now as-is probably sucks less.

I also think we should not forbid combining this wtih --debug, it's
helpful to construct a format. This seems to work:
		
	diff --git a/builtin/ls-files.c b/builtin/ls-files.c
	index 387641b32df..82f13edef7e 100644
	--- a/builtin/ls-files.c
	+++ b/builtin/ls-files.c
	@@ -343,12 +343,17 @@ static void show_ce(struct repository *repo, struct dir_struct *dir,
	 				  S_ISGITLINK(ce->ce_mode))) {
	 		if (format) {
	 			show_ce_fmt(repo, ce, format, fullname);
	-			return;
	+			if (!debug_mode)
	+				return;
	 		}
	 
	 		tag = get_tag(ce, tag);
	 
	-		if (!show_stage) {
	+		if (format) {
	+			if (!debug_mode)
	+				BUG("unreachable");
	+			; /* for --debug */
	+		} else if (!show_stage) {
	 			fputs(tag, stdout);
	 		} else {
	 			printf("%s%06o %s %d\t",
	@@ -814,7 +819,7 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
	 	}
	 
	 	if (format && (show_stage || show_others || show_killed ||
	-		show_resolve_undo || skipping_duplicates || debug_mode))
	+		show_resolve_undo || skipping_duplicates))
	 			die(_("ls-files --format cannot used with -s, -o, -k, --resolve-undo, --deduplicate, --debug"));
	 
	 	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
	
I.e. we'll get:
	
	$ ./git ls-files --debug --format='<%(flags) %(path)>'  -- po/is.po
	<flags: 0 po/is.po>
	po/is.po
	  ctime: 1654300098:369653868
	  mtime: 1654300098:369653868
	  dev: 2306     ino: 10487322
	  uid: 1001     gid: 1001
	  size: 3370    flags: 0

Which I think is quite useful when poking around in this an coming up
with a format.

> +
>  	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
>  		tag_cached = "H ";
>  		tag_unmerged = "M ";
> diff --git a/t/t3013-ls-files-format.sh b/t/t3013-ls-files-format.sh
> new file mode 100755
> index 00000000000..8c3ef2df138
> --- /dev/null
> +++ b/t/t3013-ls-files-format.sh
> @@ -0,0 +1,124 @@
> +#!/bin/sh
> +
> +test_description='git ls-files --format test'
> +

Add this line here:

TEST_PASSES_SANITIZE_LEAK=true

I.e. just before test-lib.sh, see other test examples. Then we'll test
this under SANITIZE=leak in CI, to ensure it doesn't leak memory.

> +. ./test-lib.sh
> +
> +test_expect_success 'setup' '
> +	echo o1 >o1 &&
> +	echo o2 >o2 &&
> +	git add o1 o2 &&
> +	git add --chmod +x o1 &&
> +	git commit -m base
> +'
> +
> [...]

> +for flag in -s -o -k --resolve-undo --deduplicate --debug
> +do
> +	test_expect_success "git ls-files --format is incompatible with $flag" '
> +		test_must_fail git ls-files --format="%(objectname)" $flag
> +	'
> +done

Nit: I think it's good to move these sotrs of tests before "setup", and
give them a "usage: " prefix, see some other existing examples.

We usually use test_expect_code 129 for those, depending on if you'll
end up with die() or not...

nit: missing \n before this line:

> +test_done
>
> base-commit: ab336e8f1c8009c8b1aab8deb592148e69217085


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-24 13:20         ` Ævar Arnfjörð Bjarmason
@ 2022-06-24 15:28           ` Junio C Hamano
  0 siblings, 0 replies; 61+ messages in thread
From: Junio C Hamano @ 2022-06-24 15:28 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Phillip Wood, ZheNing Hu via GitGitGadget, git, Christian Couder,
	ZheNing Hu

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> We also say that we will "show the full path names" in that
> documentation.

The primary issue is not the presence of "name" there, but the lack
of "path" in the word chosen.

Many things can have "name" (including "object name"), and "path",
not "name", in "path name" is what clarifies what kind of name it
is.  Given that --format placeholders include "objectname", it does
not make a good design to use "name" alone without saying what kind
of "name" it is.

Calling it "pathname", not just "path", is perfectly OK.  But if
there is no other things the word "path" could refer to in this
context, which I think is the case here, "path" would be acceptable.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-24 13:25     ` Ævar Arnfjörð Bjarmason
@ 2022-06-24 15:33       ` Junio C Hamano
  2022-06-26 13:35         ` ZheNing Hu
  2022-06-26 13:34       ` ZheNing Hu
  1 sibling, 1 reply; 61+ messages in thread
From: Junio C Hamano @ 2022-06-24 15:33 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: ZheNing Hu via GitGitGadget, git, Christian Couder, Phillip Wood,
	ZheNing Hu

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Tue, Jun 21 2022, ZheNing Hu via GitGitGadget wrote:
>> +	if (skip_prefix(start, "(objectmode)", &p))
>> +		strbuf_addf(sb, "%06o", data->ce->ce_mode);
>> +	else if (skip_prefix(start, "(objectname)", &p))
>> +		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
>> +	else if (skip_prefix(start, "(stage)", &p))
>> +		strbuf_addf(sb, "%d", ce_stage(data->ce));
>> +	else if (skip_prefix(start, "(path)", &p))
>> +		write_name_to_buf(sb, data->pathname);

These are just "values".

>> +	else if (skip_prefix(start, "(ctime)", &p))
>> +		strbuf_addf(sb, "ctime: %u:%u",
>> +			    sd->sd_ctime.sec, sd->sd_ctime.nsec);
>> +	else if (skip_prefix(start, "(mtime)", &p))
>> +		strbuf_addf(sb, "mtime: %u:%u",
>> +			    sd->sd_mtime.sec, sd->sd_mtime.nsec);
>> +	else if (skip_prefix(start, "(dev)", &p))
>> +		strbuf_addf(sb, "dev: %u", sd->sd_dev);
>> +	else if (skip_prefix(start, "(ino)", &p))
>> +		strbuf_addf(sb, "ino: %u", sd->sd_ino);
>> +	else if (skip_prefix(start, "(uid)", &p))
>> +		strbuf_addf(sb, "uid: %u", sd->sd_uid);
>> +	else if (skip_prefix(start, "(gid)", &p))
>> +		strbuf_addf(sb, "gid: %u", sd->sd_gid);
>> +	else if (skip_prefix(start, "(size)", &p))
>> +		strbuf_addf(sb, "size: %u", sd->sd_size);
>> +	else if (skip_prefix(start, "(flags)", &p))
>> +		strbuf_addf(sb, "flags: %x", data->ce->ce_flags);

These are not.

> In my mind almost the entire point of a --format is that you can
> e.g. \0-delimit it, and don't need to do other parsing games.
>
> So this really should be adding just e.g. "%x", not "flags: %x", 

Yes.  A very good point, if we were showing these fields (I already
said I doubt it is useful), they should also show just "values"
After all, people can do "--format=mode: %(objectmode)" if they want
an identifying tag before the value.

Thanks.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-23 14:06     ` Phillip Wood
  2022-06-23 15:57       ` Junio C Hamano
@ 2022-06-26 13:01       ` ZheNing Hu
  1 sibling, 0 replies; 61+ messages in thread
From: ZheNing Hu @ 2022-06-26 13:01 UTC (permalink / raw)
  To: Phillip Wood
  Cc: ZheNing Hu via GitGitGadget, Git List, Junio C Hamano,
	Christian Couder, Ævar Arnfjörð Bjarmason

Phillip Wood <phillip.wood123@gmail.com> 于2022年6月23日周四 22:06写道:
>
> Hi ZheNing
> > [...]
> > +It is possible to print in a custom format by using the `--format`
> > +option, which is able to interpolate different fields using
> > +a `%(fieldname)` notation. For example, if you only care about the
> > +"objectname" and "path" fields, you can execute with a specific
> > +"--format" like
> > +
> > +     git ls-files --format='%(objectname) %(path)'
> > +
> > +FIELD NAMES
> > +-----------
> > +Various values from structured fields can be used to interpolate
> > +into the resulting output. For each outputting line, the following
> > +names can be used:
> > +
> > +objectmode::
> > +     The mode of the file which is in the index.
> > +objectname::
> > +     The name of the file which is in the index.
> > +stage::
> > +     The stage of the file which is in the index.
> > +eol::
> > +     The <eolinfo> and <eolattr> of files both in the
> > +     index and the work-tree.
>
> Looking at the test for this option I think it needs more work, why
> should --format arbitrarily append a tab to the end of the output? - the
> user should be able to specify a separator if they want one as part of
> the format string. Also I'm not sure why there is so much whitespace in
> the output.
>

Because I used old output format in write_eolinfo(), now I think it's wrong,
I will separate it to three parts: %(eolinfo:index), %(eolinfo:worktree),
%(eolattr).

> If %(flags) is going to be useful then I think we need to think about
> how they are printed and document that. At the moment they are printed
> as a hexadecimal number which is fine for debugging but probably not
> going to be useful for something like --format. I think printing
> documented symbolic names with some kind of separator (a comma maybe)
> between them is probably more useful
>

Agree.

>  > [...]
> > +test_expect_success 'git ls-files --format eol' '
> > +     printf "i/lf    w/lf    attr/                 \t\n" >expect &&
> > +     printf "i/lf    w/lf    attr/                 \t\n" >>expect &&
> > +     git ls-files --format="%(eol)" --eol >actual &&
>
> I'm not sure why this is passing --eol as well as --format='%(eol)' -
> shouldn't that combination of flags be an error?
>

Thank you for reminding, will be corrected.

> Best Wishes
>
> Phillip

ZheNing Hu

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-24 10:16         ` Phillip Wood
@ 2022-06-26 13:05           ` ZheNing Hu
  0 siblings, 0 replies; 61+ messages in thread
From: ZheNing Hu @ 2022-06-26 13:05 UTC (permalink / raw)
  To: Phillip Wood
  Cc: Junio C Hamano, ZheNing Hu via GitGitGadget, Git List,
	Christian Couder, Ævar Arnfjörð Bjarmason

Phillip Wood <phillip.wood123@gmail.com> 于2022年6月24日周五 18:16写道:

> >>> +ctime::
> >>> +   The create time of file which is in the index.
> >>
> >> This is printed with a prefix 'ctime:' (the same applies to the format
> >> specifiers below) I think we should omit that and just print the data
> >> so the user can choose the format they want.
> >>
> >>> +mtime::
> >>> +   The modified time of file which is in the index.
> >
> > These are only the low-bits of the full timestamp, not ctime/mtime
> > themselves.
> >
> > But stepping back a bit, why do we need to include them in the
> > output?  What workflow and use case are we trying to help?  Dump
> > output from "stat <path>" equivalent from ls-files and compare with
> > "stat ." output to see which ones are stale?  Or is there any value
> > to see the value of, say, ctime as an individual data item?
> >
> >>> +dev::
> >>> +   The ID of device containing file which is in the index.
> >>> +ino::
> >>> +   The inode number of file which is in the index.
> >>> +uid::
> >>> +   The user id of file owner which is in the index.
> >>> +gid::
> >>> +   The group id of file owner which is in the index.
> >
> > Again, why do we need to include these in the output?
> >
> > Wouldn't it be sufficient, as well as a lot more useful, to show a
> > single bit "the cached stat info matches what is in the working tree
> > (yes/no)"?
>
> That does sound useful
>
> >>> +flags::
> >>> +   The flags of the file in the index which include
> >>> +   in-memory only flags and some extended on-disk flags.
> >>
> >> If %(flags) is going to be useful then I think we need to think about
> >> how they are printed and document that. At the moment they are printed
> >> as a hexadecimal number which is fine for debugging but probably not
> >> going to be useful for something like --format. I think printing
> >> documented symbolic names with some kind of separator (a comma maybe)
> >> between them is probably more useful
> >
> > I am guessing that most of the above are only useful for curious
> > geeks and those who are debugging their new tweak to the code that
> > touches the index, i.e. a debugging feature.  But these folks can
> > run "git" under a debugger, and they probably have to do so when
> > they are seeing an unexpected value in the flags member of a cache
> > entry anyway.  So I am not sure whom this field is intended to help.
>
> I wondered about that as well, but thought there might be a plausible
> use if someone wants to check if an entry is marked intent-to-add, or
> has the skip-worktree/spare-index bits set (are there other ways to
> inspect those?)
>

I think this feature will be useful too, but it may not belong to this patch.
We can discuss how to implement it later.

> Best Wishes
>
> Phillip

ZheNing Hu

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-24 13:25     ` Ævar Arnfjörð Bjarmason
  2022-06-24 15:33       ` Junio C Hamano
@ 2022-06-26 13:34       ` ZheNing Hu
  1 sibling, 0 replies; 61+ messages in thread
From: ZheNing Hu @ 2022-06-26 13:34 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: ZheNing Hu via GitGitGadget, Git List, Junio C Hamano,
	Christian Couder, Phillip Wood

Ævar Arnfjörð Bjarmason <avarab@gmail.com> 于2022年6月24日周五 21:46写道:
>
>
> On Tue, Jun 21 2022, ZheNing Hu via GitGitGadget wrote:
>
> > From: ZheNing Hu <adlternative@gmail.com>
> > [...]
>
> In my mind almost the entire point of a --format is that you can
> e.g. \0-delimit it, and don't need to do other parsing games.
>
> So this really should be adding just e.g. "%x", not "flags: %x",
>

Yeah, I admit that there really shouldn't use extra formatting here.

> Similarly, let's no have :-delimited fields. First, for a formatted
> number "1656077225:850723245" is just bizarre for %(ctime), let's use
> ".", not ":", so: "1656077225.850723245".
>
> And let's call that %(ctime), then have (which is trivial to add) a
> %(ctime:sec) and %(ctime:nsec), so someone who wants to format this can
> parse it as they please, ditto for mtime.
>
> Looking at your tests it seemed you went down the route of aligning the
> output with the --debug output, which is already pre-formatted. I.e. to
> make what you have here match:
>
>                 printf("  ctime: %u:%u\n", sd->sd_ctime.sec, sd->sd_ctime.nsec);
>                 printf("  mtime: %u:%u\n", sd->sd_mtime.sec, sd->sd_mtime.nsec);
>                 printf("  dev: %u\tino: %u\n", sd->sd_dev, sd->sd_ino);
>                 printf("  uid: %u\tgid: %u\n", sd->sd_uid, sd->sd_gid);
>                 printf("  size: %u\tflags: %x\n", sd->sd_size, ce->ce_flags);
>
> I think that's a mistake, we should be able to emit those individual
> %-specifiers instead, not that line as-is without the " " prefix and
> "\n" suffix.
>

Yeah, agree. But now I just want to delete all atoms from %(ctime) to %(flags),
and let --debug can work with --format.

> > +
> > +     if (format && (show_stage || show_others || show_killed ||
> > +             show_resolve_undo || skipping_duplicates || debug_mode))
> > +                     die(_("ls-files --format cannot used with -s, -o, -k, --resolve-undo, --deduplicate, --debug"));
>
> Use usage_msg_opt() or usage_msg_optf() here instead of die(), and no
> need to include "ls-files " in the message.
>
> See die_for_incompatible_opt4, maybe you can just use that instead? A
> bit painful, but:
>
>     die_for_incompatible_opt4(format, "--format", show_stage, "-s", show_others, "-o", show_killed, "-k");
>     die_for_incompatible_opt4(format, "--format", show_resolve_undo, "--resolve-undo", skipping_duplicates, "--deduplicate", debug_mode, "--debug");
>

Good suggestion. I am curious about why there is no function like
die_for_incompatible_opt4() with variable parameters?

> But urgh, that helper really should use usage_msg_opt() instead, but
> using it for now as-is probably sucks less.
>
> I also think we should not forbid combining this wtih --debug, it's
> helpful to construct a format. This seems to work:
>
>         diff --git a/builtin/ls-files.c b/builtin/ls-files.c
>         index 387641b32df..82f13edef7e 100644
>         --- a/builtin/ls-files.c
>         +++ b/builtin/ls-files.c
>         @@ -343,12 +343,17 @@ static void show_ce(struct repository *repo, struct dir_struct *dir,
>                                           S_ISGITLINK(ce->ce_mode))) {
>                         if (format) {
>                                 show_ce_fmt(repo, ce, format, fullname);
>         -                       return;
>         +                       if (!debug_mode)
>         +                               return;
>                         }
>
>                         tag = get_tag(ce, tag);
>
>         -               if (!show_stage) {
>         +               if (format) {
>         +                       if (!debug_mode)
>         +                               BUG("unreachable");
>         +                       ; /* for --debug */
>         +               } else if (!show_stage) {
>                                 fputs(tag, stdout);
>                         } else {
>                                 printf("%s%06o %s %d\t",
>         @@ -814,7 +819,7 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
>                 }
>
>                 if (format && (show_stage || show_others || show_killed ||
>         -               show_resolve_undo || skipping_duplicates || debug_mode))
>         +               show_resolve_undo || skipping_duplicates))
>                                 die(_("ls-files --format cannot used with -s, -o, -k, --resolve-undo, --deduplicate, --debug"));
>
>                 if (show_tag || show_valid_bit || show_fsmonitor_bit) {
>
> I.e. we'll get:
>
>         $ ./git ls-files --debug --format='<%(flags) %(path)>'  -- po/is.po
>         <flags: 0 po/is.po>
>         po/is.po
>           ctime: 1654300098:369653868
>           mtime: 1654300098:369653868
>           dev: 2306     ino: 10487322
>           uid: 1001     gid: 1001
>           size: 3370    flags: 0
>
> Which I think is quite useful when poking around in this an coming up
> with a format.
>

Maybe something like this will be easier?


@@ -343,6 +335,7 @@ static void show_ce(struct repository *repo,
struct dir_struct *dir,
                                  S_ISGITLINK(ce->ce_mode))) {
                if (format) {
                        show_ce_fmt(repo, ce, format, fullname);
+                       print_debug(ce);
                        return;
                }


> > +
> >       if (show_tag || show_valid_bit || show_fsmonitor_bit) {
> >               tag_cached = "H ";
> >               tag_unmerged = "M ";
> > diff --git a/t/t3013-ls-files-format.sh b/t/t3013-ls-files-format.sh
> > new file mode 100755
> > index 00000000000..8c3ef2df138
> > --- /dev/null
> > +++ b/t/t3013-ls-files-format.sh
> > @@ -0,0 +1,124 @@
> > +#!/bin/sh
> > +
> > +test_description='git ls-files --format test'
> > +
>
> Add this line here:
>
> TEST_PASSES_SANITIZE_LEAK=true
>
> I.e. just before test-lib.sh, see other test examples. Then we'll test
> this under SANITIZE=leak in CI, to ensure it doesn't leak memory.
>
> > +. ./test-lib.sh
> > +
> > +test_expect_success 'setup' '
> > +     echo o1 >o1 &&
> > +     echo o2 >o2 &&
> > +     git add o1 o2 &&
> > +     git add --chmod +x o1 &&
> > +     git commit -m base
> > +'
> > +
> > [...]
>
> > +for flag in -s -o -k --resolve-undo --deduplicate --debug
> > +do
> > +     test_expect_success "git ls-files --format is incompatible with $flag" '
> > +             test_must_fail git ls-files --format="%(objectname)" $flag
> > +     '
> > +done
>
> Nit: I think it's good to move these sotrs of tests before "setup", and
> give them a "usage: " prefix, see some other existing examples.
>

Agree.

> We usually use test_expect_code 129 for those, depending on if you'll
> end up with die() or not...
>
> nit: missing \n before this line:
>
> > +test_done
> >
> > base-commit: ab336e8f1c8009c8b1aab8deb592148e69217085
>

ZheNing Hu

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-24 15:33       ` Junio C Hamano
@ 2022-06-26 13:35         ` ZheNing Hu
  2022-06-27  8:22           ` Junio C Hamano
  0 siblings, 1 reply; 61+ messages in thread
From: ZheNing Hu @ 2022-06-26 13:35 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason,
	ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Phillip Wood

Junio C Hamano <gitster@pobox.com> 于2022年6月24日周五 23:33写道:
>
> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
> > On Tue, Jun 21 2022, ZheNing Hu via GitGitGadget wrote:
> >> +    if (skip_prefix(start, "(objectmode)", &p))
> >> +            strbuf_addf(sb, "%06o", data->ce->ce_mode);
> >> +    else if (skip_prefix(start, "(objectname)", &p))
> >> +            strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
> >> +    else if (skip_prefix(start, "(stage)", &p))
> >> +            strbuf_addf(sb, "%d", ce_stage(data->ce));
> >> +    else if (skip_prefix(start, "(path)", &p))
> >> +            write_name_to_buf(sb, data->pathname);
>
> These are just "values".
>
> >> +    else if (skip_prefix(start, "(ctime)", &p))
> >> +            strbuf_addf(sb, "ctime: %u:%u",
> >> +                        sd->sd_ctime.sec, sd->sd_ctime.nsec);
> >> +    else if (skip_prefix(start, "(mtime)", &p))
> >> +            strbuf_addf(sb, "mtime: %u:%u",
> >> +                        sd->sd_mtime.sec, sd->sd_mtime.nsec);
> >> +    else if (skip_prefix(start, "(dev)", &p))
> >> +            strbuf_addf(sb, "dev: %u", sd->sd_dev);
> >> +    else if (skip_prefix(start, "(ino)", &p))
> >> +            strbuf_addf(sb, "ino: %u", sd->sd_ino);
> >> +    else if (skip_prefix(start, "(uid)", &p))
> >> +            strbuf_addf(sb, "uid: %u", sd->sd_uid);
> >> +    else if (skip_prefix(start, "(gid)", &p))
> >> +            strbuf_addf(sb, "gid: %u", sd->sd_gid);
> >> +    else if (skip_prefix(start, "(size)", &p))
> >> +            strbuf_addf(sb, "size: %u", sd->sd_size);
> >> +    else if (skip_prefix(start, "(flags)", &p))
> >> +            strbuf_addf(sb, "flags: %x", data->ce->ce_flags);
>
> These are not.
>
Agree. So I just remove them as you see. If someone else
need them for some reason, we can add them back.

ZheNing Hu

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v4] ls-files: introduce "--format" option
  2022-06-21  2:05   ` [PATCH v3] " ZheNing Hu via GitGitGadget
  2022-06-23 14:06     ` Phillip Wood
  2022-06-24 13:25     ` Ævar Arnfjörð Bjarmason
@ 2022-06-26 15:29     ` ZheNing Hu via GitGitGadget
  2022-06-27  8:32       ` Junio C Hamano
                         ` (3 more replies)
  2 siblings, 4 replies; 61+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2022-06-26 15:29 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder,
	Ævar Arnfjörð Bjarmason, Phillip Wood, ZheNing Hu,
	ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Add a new option --format that output index enties
informations with custom format, taking inspiration
from the option with the same name in the `git ls-tree`
command.

--format cannot used with -s, -o, -k, --resolve-undo,
--deduplicate and --eol.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
    ls-files: introduce "--format" options
    
    v3->v4:
    
     1. Let --format compatible with --debug.
     2. Let --format incompatible with --eol.
     3. Split %(eol) to three atom: %(eolinfo:index), %(eolinfo:worktree)
        and %(eolattr).
     4. Remove %(ctime), %(mtime), %(dev), %(ino), %(uid), %(gid), %(size),
        %(flags).
     5. Fix output format without some dirty "prefix".
     6. Change some test.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1262%2Fadlternative%2Fzh%2Fls-file-format-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1262/adlternative/zh/ls-file-format-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1262

Range-diff vs v3:

 1:  aaafa35ffcd ! 1:  6827e44e158 ls-files: introduce "--format" option
     @@ Commit message
          command.
      
          --format cannot used with -s, -o, -k, --resolve-undo,
     -    --deduplicate and --debug.
     +    --deduplicate and --eol.
      
          Signed-off-by: ZheNing Hu <adlternative@gmail.com>
      
     @@ Documentation/git-ls-files.txt: followed by the  ("attr/<eolattr>").
      +	It also interpolates `%%` to `%`, and `%xx` where `xx` are hex digits
      +	interpolates to character with hex code `xx`; for example `%00`
      +	interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
     -+	--format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo` and
     -+	`--debug`.
     ++	--format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo`
     ++	and `--eol`.
       \--::
       	Do not interpret any more arguments as options.
       
     @@ Documentation/git-ls-files.txt: quoted as explained for the configuration variab
      +names can be used:
      +
      +objectmode::
     -+	The mode of the file which is in the index.
     ++	The mode of the file which is recorded in the index.
      +objectname::
     -+	The name of the file which is in the index.
     ++	The name of the file which is recorded in the index.
      +stage::
     -+	The stage of the file which is in the index.
     -+eol::
     -+	The <eolinfo> and <eolattr> of files both in the
     -+	index and the work-tree.
     ++	The stage of the file which is recorded in the index.
     ++eolinfo:index::
     ++	The <eolinfo> of the file which is recorded in the index.
     ++eolinfo:worktree::
     ++	The <eolinfo> of the file which is recorded in the working tree.
     ++eolattr::
     ++	The <eolattr> of the file which is recorded in the index.
      +path::
     -+	The pathname of the file which is in the index.
     -+ctime::
     -+	The create time of file which is in the index.
     -+mtime::
     -+	The modified time of file which is in the index.
     -+dev::
     -+	The ID of device containing file which is in the index.
     -+ino::
     -+	The inode number of file which is in the index.
     -+uid::
     -+	The user id of file owner which is in the index.
     -+gid::
     -+	The group id of file owner which is in the index.
     -+size::
     -+	The size of the file which is in the index.
     -+flags::
     -+	The flags of the file in the index which include
     -+	in-memory only flags and some extended on-disk flags.
     ++	The pathname of the file which is recorded in the index.
       
       EXCLUDE PATTERNS
       ----------------
     @@ builtin/ls-files.c: static char *ps_matched;
       
       static const char *tag_cached = "";
       static const char *tag_unmerged = "";
     -@@ builtin/ls-files.c: static const char *tag_modified = "";
     - static const char *tag_skip_worktree = "";
     - static const char *tag_resolve_undo = "";
     - 
     --static void write_eolinfo(struct index_state *istate,
     --			  const struct cache_entry *ce, const char *path)
     -+static void write_eolinfo_internal(struct strbuf *sb, struct index_state *istate,
     -+				   const struct cache_entry *ce, const char *path)
     - {
     - 	if (show_eol) {
     - 		struct stat st;
      @@ builtin/ls-files.c: static void write_eolinfo(struct index_state *istate,
     - 							       ce->name);
     - 		if (!lstat(path, &st) && S_ISREG(st.st_mode))
     - 			w_txt = get_wt_convert_stats_ascii(path);
     --		printf("i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
     -+		if (sb)
     -+			strbuf_addf(sb, "i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
     -+		else
     -+			printf("i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
       	}
       }
       
     -+static void write_eolinfo(struct index_state *istate,
     -+			  const struct cache_entry *ce, const char *path)
     ++static void write_index_eolinfo_to_buf(struct strbuf *sb, struct index_state *istate,
     ++				       const struct cache_entry *ce)
     ++{
     ++	const char *i_txt = "";
     ++	if (ce && S_ISREG(ce->ce_mode))
     ++		i_txt = get_cached_convert_stats_ascii(istate, ce->name);
     ++	strbuf_addstr(sb, i_txt);
     ++}
     ++
     ++static void write_worktree_eolinfo_to_buf(struct strbuf *sb, const char *path)
      +{
     -+	write_eolinfo_internal(NULL, istate, ce, path);
     ++	struct stat st;
     ++	const char *w_txt = "";
     ++	if (!lstat(path, &st) && S_ISREG(st.st_mode))
     ++		w_txt = get_wt_convert_stats_ascii(path);
     ++	strbuf_addstr(sb, w_txt);
      +}
      +
     -+static void write_eolinfo_to_buf(struct strbuf *sb, struct index_state *istate,
     -+				 const struct cache_entry *ce, const char *path)
     ++static void write_eolattr_to_buf(struct strbuf *sb, struct index_state *istate,
     ++				 const char *path)
      +{
     -+	write_eolinfo_internal(sb, istate, ce, path);
     ++	strbuf_addstr(sb, get_convert_attr_ascii(istate, path));
      +}
      +
       static void write_name(const char *name)
     @@ builtin/ls-files.c: static void show_submodule(struct repository *superproject,
       }
       
      +struct show_index_data {
     -+	const char *tag;
      +	const char *pathname;
      +	struct index_state *istate;
      +	const struct cache_entry *ce;
     @@ builtin/ls-files.c: static void show_submodule(struct repository *superproject,
      +	struct show_index_data *data = context;
      +	const char *end;
      +	const char *p;
     -+	const struct stat_data *sd = &data->ce->ce_stat_data;
      +	size_t len = strbuf_expand_literal_cb(sb, start, NULL);
      +	if (len)
      +		return len;
     @@ builtin/ls-files.c: static void show_submodule(struct repository *superproject,
      +		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
      +	else if (skip_prefix(start, "(stage)", &p))
      +		strbuf_addf(sb, "%d", ce_stage(data->ce));
     -+	else if (skip_prefix(start, "(eol)", &p))
     -+		write_eolinfo_to_buf(sb, data->istate,
     -+				     data->ce, data->pathname);
     ++	else if (skip_prefix(start, "(eolinfo:index)", &p))
     ++		write_index_eolinfo_to_buf(sb, data->istate, data->ce);
     ++	else if (skip_prefix(start, "(eolinfo:worktree)", &p))
     ++		write_worktree_eolinfo_to_buf(sb, data->pathname);
     ++	else if (skip_prefix(start, "(eolattr)", &p))
     ++		write_eolattr_to_buf(sb, data->istate, data->pathname);
      +	else if (skip_prefix(start, "(path)", &p))
      +		write_name_to_buf(sb, data->pathname);
     -+	else if (skip_prefix(start, "(ctime)", &p))
     -+		strbuf_addf(sb, "ctime: %u:%u",
     -+			    sd->sd_ctime.sec, sd->sd_ctime.nsec);
     -+	else if (skip_prefix(start, "(mtime)", &p))
     -+		strbuf_addf(sb, "mtime: %u:%u",
     -+			    sd->sd_mtime.sec, sd->sd_mtime.nsec);
     -+	else if (skip_prefix(start, "(dev)", &p))
     -+		strbuf_addf(sb, "dev: %u", sd->sd_dev);
     -+	else if (skip_prefix(start, "(ino)", &p))
     -+		strbuf_addf(sb, "ino: %u", sd->sd_ino);
     -+	else if (skip_prefix(start, "(uid)", &p))
     -+		strbuf_addf(sb, "uid: %u", sd->sd_uid);
     -+	else if (skip_prefix(start, "(gid)", &p))
     -+		strbuf_addf(sb, "gid: %u", sd->sd_gid);
     -+	else if (skip_prefix(start, "(size)", &p))
     -+		strbuf_addf(sb, "size: %u", sd->sd_size);
     -+	else if (skip_prefix(start, "(flags)", &p))
     -+		strbuf_addf(sb, "flags: %x", data->ce->ce_flags);
      +	else
      +		die(_("bad ls-files format: %%%.*s"), (int)len, start);
      +
     @@ builtin/ls-files.c: static void show_ce(struct repository *repo, struct dir_stru
       				  S_ISGITLINK(ce->ce_mode))) {
      +		if (format) {
      +			show_ce_fmt(repo, ce, format, fullname);
     ++			print_debug(ce);
      +			return;
      +		}
      +
     @@ builtin/ls-files.c: int cmd_ls_files(int argc, const char **argv, const char *cm
       	}
      +
      +	if (format && (show_stage || show_others || show_killed ||
     -+		show_resolve_undo || skipping_duplicates || debug_mode))
     -+			die(_("ls-files --format cannot used with -s, -o, -k, --resolve-undo, --deduplicate, --debug"));
     ++		show_resolve_undo || skipping_duplicates || show_eol))
     ++			usage_msg_opt("--format cannot used with -s, -o, -k, "
     ++				      "--resolve-undo, --deduplicate, --eol",
     ++				      ls_files_usage, builtin_ls_files_options);
      +
       	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
       		tag_cached = "H ";
     @@ t/t3013-ls-files-format.sh (new)
      +
      +test_description='git ls-files --format test'
      +
     ++TEST_PASSES_SANITIZE_LEAK=true
      +. ./test-lib.sh
      +
     ++for flag in -s -o -k --resolve-undo --deduplicate --eol
     ++do
     ++	test_expect_success "usage: --format is incompatible with $flag" '
     ++		test_expect_code 129 git ls-files --format="%(objectname)" $flag
     ++	'
     ++done
     ++
      +test_expect_success 'setup' '
      +	echo o1 >o1 &&
      +	echo o2 >o2 &&
     @@ t/t3013-ls-files-format.sh (new)
      +	test_cmp expect actual
      +'
      +
     -+test_expect_success 'git ls-files --format eol' '
     -+	printf "i/lf    w/lf    attr/                 \t\n" >expect &&
     -+	printf "i/lf    w/lf    attr/                 \t\n" >>expect &&
     -+	git ls-files --format="%(eol)" --eol >actual &&
     -+	test_cmp expect actual
     -+'
     -+
     -+test_expect_success 'git ls-files --format path' '
     ++test_expect_success 'git ls-files --format eolinfo:index' '
      +	cat >expect <<-\EOF &&
     -+	o1
     -+	o2
     ++	lf
     ++	lf
      +	EOF
     -+	git ls-files --format="%(path)" >actual &&
     ++	git ls-files --format="%(eolinfo:index)" >actual &&
      +	test_cmp expect actual
      +'
      +
     -+test_expect_success 'git ls-files --format ctime' '
     -+	git ls-files --debug >out &&
     -+	grep ctime out >expect &&
     -+	git ls-files --format="  %(ctime)" >actual &&
     -+	test_cmp expect actual
     -+'
     -+
     -+test_expect_success 'git ls-files --format mtime' '
     -+	git ls-files --debug >out &&
     -+	grep mtime out >expect &&
     -+	git ls-files --format="  %(mtime)" >actual &&
     ++test_expect_success 'git ls-files --format eolinfo:worktree' '
     ++	cat >expect <<-\EOF &&
     ++	lf
     ++	lf
     ++	EOF
     ++	git ls-files --format="%(eolinfo:worktree)" >actual &&
      +	test_cmp expect actual
      +'
      +
     -+test_expect_success 'git ls-files --format dev and ino' '
     -+	git ls-files --debug >out &&
     -+	grep dev out >expect &&
     -+	git ls-files --format="  %(dev)%x09%(ino)" >actual &&
     ++test_expect_success 'git ls-files --format eolattr' '
     ++	printf "\n\n" >expect &&
     ++	git ls-files --format="%(eolattr)" >actual &&
      +	test_cmp expect actual
      +'
      +
     -+test_expect_success 'git ls-files --format uid and gid' '
     -+	git ls-files --debug >out &&
     -+	grep uid out >expect &&
     -+	git ls-files --format="  %(uid)%x09%(gid)" >actual &&
     ++test_expect_success 'git ls-files --format path' '
     ++	cat >expect <<-\EOF &&
     ++	o1
     ++	o2
     ++	EOF
     ++	git ls-files --format="%(path)" >actual &&
      +	test_cmp expect actual
      +'
      +
     @@ t/t3013-ls-files-format.sh (new)
      +	test_cmp expect actual
      +'
      +
     -+test_expect_success 'git ls-files --format size and flags' '
     -+	git ls-files --debug >out &&
     -+	grep size out >expect &&
     -+	git ls-files --format="  %(size)%x09%(flags)" >actual &&
     -+	test_cmp expect actual
     -+'
     -+
      +test_expect_success 'git ls-files --format imitate --stage' '
      +	git ls-files --stage >expect &&
      +	git ls-files --format="%(objectmode) %(objectname) %(stage)%x09%(path)" >actual &&
      +	test_cmp expect actual
      +'
      +
     -+test_expect_success 'git ls-files --format imitate --debug' '
     ++test_expect_success 'git ls-files --format with --debug' '
      +	git ls-files --debug >expect &&
     -+	git ls-files --format="%(path)%x0a  %(ctime)%x0a  %(mtime)%x0a  %(dev)%x09%(ino)%x0a  %(uid)%x09%(gid)%x0a  %(size)%x09%(flags)" >actual &&
     ++	git ls-files --format="%(path)" --debug >actual &&
      +	test_cmp expect actual
      +'
      +
     -+for flag in -s -o -k --resolve-undo --deduplicate --debug
     -+do
     -+	test_expect_success "git ls-files --format is incompatible with $flag" '
     -+		test_must_fail git ls-files --format="%(objectname)" $flag
     -+	'
     -+done
      +test_done


 Documentation/git-ls-files.txt |  37 ++++++++++-
 builtin/ls-files.c             | 113 +++++++++++++++++++++++++++++++++
 t/t3013-ls-files-format.sh     | 108 +++++++++++++++++++++++++++++++
 3 files changed, 257 insertions(+), 1 deletion(-)
 create mode 100755 t/t3013-ls-files-format.sh

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index 0dabf3f0ddc..38e81cc889f 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -20,7 +20,7 @@ SYNOPSIS
 		[--exclude-standard]
 		[--error-unmatch] [--with-tree=<tree-ish>]
 		[--full-name] [--recurse-submodules]
-		[--abbrev[=<n>]] [--] [<file>...]
+		[--abbrev[=<n>]] [--format=<format>] [--] [<file>...]
 
 DESCRIPTION
 -----------
@@ -192,6 +192,13 @@ followed by the  ("attr/<eolattr>").
 	to the contained files. Sparse directories will be shown with a
 	trailing slash, such as "x/" for a sparse directory "x".
 
+--format=<format>::
+	A string that interpolates `%(fieldname)` from the result being shown.
+	It also interpolates `%%` to `%`, and `%xx` where `xx` are hex digits
+	interpolates to character with hex code `xx`; for example `%00`
+	interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
+	--format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo`
+	and `--eol`.
 \--::
 	Do not interpret any more arguments as options.
 
@@ -223,6 +230,34 @@ quoted as explained for the configuration variable `core.quotePath`
 (see linkgit:git-config[1]).  Using `-z` the filename is output
 verbatim and the line is terminated by a NUL byte.
 
+It is possible to print in a custom format by using the `--format`
+option, which is able to interpolate different fields using
+a `%(fieldname)` notation. For example, if you only care about the
+"objectname" and "path" fields, you can execute with a specific
+"--format" like
+
+	git ls-files --format='%(objectname) %(path)'
+
+FIELD NAMES
+-----------
+Various values from structured fields can be used to interpolate
+into the resulting output. For each outputting line, the following
+names can be used:
+
+objectmode::
+	The mode of the file which is recorded in the index.
+objectname::
+	The name of the file which is recorded in the index.
+stage::
+	The stage of the file which is recorded in the index.
+eolinfo:index::
+	The <eolinfo> of the file which is recorded in the index.
+eolinfo:worktree::
+	The <eolinfo> of the file which is recorded in the working tree.
+eolattr::
+	The <eolattr> of the file which is recorded in the index.
+path::
+	The pathname of the file which is recorded in the index.
 
 EXCLUDE PATTERNS
 ----------------
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index e791b65e7e9..1d52f5cb90b 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -11,6 +11,7 @@
 #include "quote.h"
 #include "dir.h"
 #include "builtin.h"
+#include "strbuf.h"
 #include "tree.h"
 #include "cache-tree.h"
 #include "parse-options.h"
@@ -48,6 +49,7 @@ static char *ps_matched;
 static const char *with_tree;
 static int exc_given;
 static int exclude_args;
+static const char *format;
 
 static const char *tag_cached = "";
 static const char *tag_unmerged = "";
@@ -75,6 +77,30 @@ static void write_eolinfo(struct index_state *istate,
 	}
 }
 
+static void write_index_eolinfo_to_buf(struct strbuf *sb, struct index_state *istate,
+				       const struct cache_entry *ce)
+{
+	const char *i_txt = "";
+	if (ce && S_ISREG(ce->ce_mode))
+		i_txt = get_cached_convert_stats_ascii(istate, ce->name);
+	strbuf_addstr(sb, i_txt);
+}
+
+static void write_worktree_eolinfo_to_buf(struct strbuf *sb, const char *path)
+{
+	struct stat st;
+	const char *w_txt = "";
+	if (!lstat(path, &st) && S_ISREG(st.st_mode))
+		w_txt = get_wt_convert_stats_ascii(path);
+	strbuf_addstr(sb, w_txt);
+}
+
+static void write_eolattr_to_buf(struct strbuf *sb, struct index_state *istate,
+				 const char *path)
+{
+	strbuf_addstr(sb, get_convert_attr_ascii(istate, path));
+}
+
 static void write_name(const char *name)
 {
 	/*
@@ -85,6 +111,15 @@ static void write_name(const char *name)
 				   stdout, line_terminator);
 }
 
+static void write_name_to_buf(struct strbuf *sb, const char *name)
+{
+	const char *rel = relative_path(name, prefix_len ? prefix : NULL, sb);
+	if (line_terminator)
+		quote_c_style(rel, sb, NULL, 0);
+	else
+		strbuf_add(sb, rel, strlen(rel));
+}
+
 static const char *get_tag(const struct cache_entry *ce, const char *tag)
 {
 	static char alttag[4];
@@ -222,6 +257,68 @@ static void show_submodule(struct repository *superproject,
 	repo_clear(&subrepo);
 }
 
+struct show_index_data {
+	const char *pathname;
+	struct index_state *istate;
+	const struct cache_entry *ce;
+};
+
+static size_t expand_show_index(struct strbuf *sb, const char *start,
+			       void *context)
+{
+	struct show_index_data *data = context;
+	const char *end;
+	const char *p;
+	size_t len = strbuf_expand_literal_cb(sb, start, NULL);
+	if (len)
+		return len;
+	if (*start != '(')
+		die(_("bad ls-files format: element '%s' "
+		      "does not start with '('"), start);
+
+	end = strchr(start + 1, ')');
+	if (!end)
+		die(_("bad ls-files format: element '%s'"
+		      "does not end in ')'"), start);
+
+	len = end - start + 1;
+	if (skip_prefix(start, "(objectmode)", &p))
+		strbuf_addf(sb, "%06o", data->ce->ce_mode);
+	else if (skip_prefix(start, "(objectname)", &p))
+		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
+	else if (skip_prefix(start, "(stage)", &p))
+		strbuf_addf(sb, "%d", ce_stage(data->ce));
+	else if (skip_prefix(start, "(eolinfo:index)", &p))
+		write_index_eolinfo_to_buf(sb, data->istate, data->ce);
+	else if (skip_prefix(start, "(eolinfo:worktree)", &p))
+		write_worktree_eolinfo_to_buf(sb, data->pathname);
+	else if (skip_prefix(start, "(eolattr)", &p))
+		write_eolattr_to_buf(sb, data->istate, data->pathname);
+	else if (skip_prefix(start, "(path)", &p))
+		write_name_to_buf(sb, data->pathname);
+	else
+		die(_("bad ls-files format: %%%.*s"), (int)len, start);
+
+	return len;
+}
+
+static void show_ce_fmt(struct repository *repo, const struct cache_entry *ce,
+			const char *format, const char *fullname) {
+
+	struct show_index_data data = {
+		.pathname = fullname,
+		.istate = repo->index,
+		.ce = ce,
+	};
+
+	struct strbuf sb = STRBUF_INIT;
+	strbuf_expand(&sb, format, expand_show_index, &data);
+	strbuf_addch(&sb, line_terminator);
+	fwrite(sb.buf, sb.len, 1, stdout);
+	strbuf_release(&sb);
+	return;
+}
+
 static void show_ce(struct repository *repo, struct dir_struct *dir,
 		    const struct cache_entry *ce, const char *fullname,
 		    const char *tag)
@@ -236,6 +333,12 @@ static void show_ce(struct repository *repo, struct dir_struct *dir,
 				  max_prefix_len, ps_matched,
 				  S_ISDIR(ce->ce_mode) ||
 				  S_ISGITLINK(ce->ce_mode))) {
+		if (format) {
+			show_ce_fmt(repo, ce, format, fullname);
+			print_debug(ce);
+			return;
+		}
+
 		tag = get_tag(ce, tag);
 
 		if (!show_stage) {
@@ -675,6 +778,9 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 			 N_("suppress duplicate entries")),
 		OPT_BOOL(0, "sparse", &show_sparse_dirs,
 			 N_("show sparse directories in the presence of a sparse index")),
+		OPT_STRING_F(0, "format", &format, N_("format"),
+			     N_("format to use for the output"),
+			     PARSE_OPT_NONEG),
 		OPT_END()
 	};
 	int ret = 0;
@@ -699,6 +805,13 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 	for (i = 0; i < exclude_list.nr; i++) {
 		add_pattern(exclude_list.items[i].string, "", 0, pl, --exclude_args);
 	}
+
+	if (format && (show_stage || show_others || show_killed ||
+		show_resolve_undo || skipping_duplicates || show_eol))
+			usage_msg_opt("--format cannot used with -s, -o, -k, "
+				      "--resolve-undo, --deduplicate, --eol",
+				      ls_files_usage, builtin_ls_files_options);
+
 	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
 		tag_cached = "H ";
 		tag_unmerged = "M ";
diff --git a/t/t3013-ls-files-format.sh b/t/t3013-ls-files-format.sh
new file mode 100755
index 00000000000..a186fe21126
--- /dev/null
+++ b/t/t3013-ls-files-format.sh
@@ -0,0 +1,108 @@
+#!/bin/sh
+
+test_description='git ls-files --format test'
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+for flag in -s -o -k --resolve-undo --deduplicate --eol
+do
+	test_expect_success "usage: --format is incompatible with $flag" '
+		test_expect_code 129 git ls-files --format="%(objectname)" $flag
+	'
+done
+
+test_expect_success 'setup' '
+	echo o1 >o1 &&
+	echo o2 >o2 &&
+	git add o1 o2 &&
+	git add --chmod +x o1 &&
+	git commit -m base
+'
+
+test_expect_success 'git ls-files --format objectmode' '
+	cat >expect <<-\EOF &&
+	100755
+	100644
+	EOF
+	git ls-files --format="%(objectmode)" -t >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format objectname' '
+	oid1=$(git hash-object o1) &&
+	oid2=$(git hash-object o2) &&
+	cat >expect <<-EOF &&
+	$oid1
+	$oid2
+	EOF
+	git ls-files --format="%(objectname)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format eolinfo:index' '
+	cat >expect <<-\EOF &&
+	lf
+	lf
+	EOF
+	git ls-files --format="%(eolinfo:index)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format eolinfo:worktree' '
+	cat >expect <<-\EOF &&
+	lf
+	lf
+	EOF
+	git ls-files --format="%(eolinfo:worktree)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format eolattr' '
+	printf "\n\n" >expect &&
+	git ls-files --format="%(eolattr)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format path' '
+	cat >expect <<-\EOF &&
+	o1
+	o2
+	EOF
+	git ls-files --format="%(path)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -m' '
+	echo change >o1 &&
+	cat >expect <<-\EOF &&
+	o1
+	EOF
+	git ls-files --format="%(path)" -m >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -d' '
+	echo o3 >o3 &&
+	git add o3 &&
+	rm o3 &&
+	cat >expect <<-\EOF &&
+	o3
+	EOF
+	git ls-files --format="%(path)" -d >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format imitate --stage' '
+	git ls-files --stage >expect &&
+	git ls-files --format="%(objectmode) %(objectname) %(stage)%x09%(path)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with --debug' '
+	git ls-files --debug >expect &&
+	git ls-files --format="%(path)" --debug >actual &&
+	test_cmp expect actual
+'
+
+test_done

base-commit: ab336e8f1c8009c8b1aab8deb592148e69217085
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-26 13:35         ` ZheNing Hu
@ 2022-06-27  8:22           ` Junio C Hamano
  2022-06-27 11:06             ` ZheNing Hu
  0 siblings, 1 reply; 61+ messages in thread
From: Junio C Hamano @ 2022-06-27  8:22 UTC (permalink / raw)
  To: ZheNing Hu
  Cc: Ævar Arnfjörð Bjarmason,
	ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Phillip Wood

ZheNing Hu <adlternative@gmail.com> writes:

>> >> +    else if (skip_prefix(start, "(path)", &p))
>> >> +            write_name_to_buf(sb, data->pathname);
>>
>> These are just "values".
>> ...
>> >> +    else if (skip_prefix(start, "(size)", &p))
>> >> +            strbuf_addf(sb, "size: %u", sd->sd_size);
>> >> +    else if (skip_prefix(start, "(flags)", &p))
>> >> +            strbuf_addf(sb, "flags: %x", data->ce->ce_flags);
>>
>> These are not.
>>
> ... If someone else
> need them for some reason, we can add them back.

If someone else needs to see "size:" printed in front of the value
of sd_size member, we DO NOT HAVE TO DO ANYTHING!  That someone else
can write "--format=size: %(size)" to do so themselves.



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4] ls-files: introduce "--format" option
  2022-06-26 15:29     ` [PATCH v4] " ZheNing Hu via GitGitGadget
@ 2022-06-27  8:32       ` Junio C Hamano
  2022-06-27 11:18         ` ZheNing Hu
  2022-06-27 18:34       ` Ævar Arnfjörð Bjarmason
                         ` (2 subsequent siblings)
  3 siblings, 1 reply; 61+ messages in thread
From: Junio C Hamano @ 2022-06-27  8:32 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Christian Couder, Ævar Arnfjörð Bjarmason,
	Phillip Wood, ZheNing Hu

"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

> Range-diff vs v3:
> ...
>       +test_done

I omitted 300 lines of range-diff, which is not exactly illuminating
in this case.  I wonder if there is a way to turn it off when it is
not helping automatically...

> +FIELD NAMES
> +-----------
> +Various values from structured fields can be used to interpolate
> +into the resulting output. For each outputting line, the following
> +names can be used:
> +
> +objectmode::
> +	The mode of the file which is recorded in the index.
> +objectname::
> +	The name of the file which is recorded in the index.
> +stage::
> +	The stage of the file which is recorded in the index.


> +eolinfo:index::
> +	The <eolinfo> of the file which is recorded in the index.
> +eolinfo:worktree::
> +	The <eolinfo> of the file which is recorded in the working tree.

These sound somewhat strange, as the above makes it sound as if we
are recording eolinfo for something (we never record eolinfo of
anything anywhere).

	eolinfo:index::
	eolinfo:worktree::
        	The <eolinfo> (see the description of the `--eol` option) of
                the contents in the index or in the worktree for the path

perhaps?  I dunno.

> +eolattr::
> +	The <eolattr> of the file which is recorded in the index.

Likewise, eolattr comes from the attribute subsystem and not
recorded in the index.  It is more like

	eolattr:
                The <eolattr> (see the description of the `--eol` option)
                that applies to the path.

Because attribute applies to the path, it applies equally to both
what is in the index and what is in the working tree.

> +path::
> +	The pathname of the file which is recorded in the index.

As ls-tree already uses %(path) for it, this is probably OK
(otherwise we would probably have called it %(pathname)).

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-27  8:22           ` Junio C Hamano
@ 2022-06-27 11:06             ` ZheNing Hu
  2022-06-27 15:41               ` Junio C Hamano
  0 siblings, 1 reply; 61+ messages in thread
From: ZheNing Hu @ 2022-06-27 11:06 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason,
	ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Phillip Wood

Junio C Hamano <gitster@pobox.com> 于2022年6月27日周一 16:22写道:
>
> ZheNing Hu <adlternative@gmail.com> writes:
>
> >> >> +    else if (skip_prefix(start, "(path)", &p))
> >> >> +            write_name_to_buf(sb, data->pathname);
> >>
> >> These are just "values".
> >> ...
> >> >> +    else if (skip_prefix(start, "(size)", &p))
> >> >> +            strbuf_addf(sb, "size: %u", sd->sd_size);
> >> >> +    else if (skip_prefix(start, "(flags)", &p))
> >> >> +            strbuf_addf(sb, "flags: %x", data->ce->ce_flags);
> >>
> >> These are not.
> >>
> > ... If someone else
> > need them for some reason, we can add them back.
>
> If someone else needs to see "size:" printed in front of the value
> of sd_size member, we DO NOT HAVE TO DO ANYTHING!  That someone else
> can write "--format=size: %(size)" to do so themselves.
>
>

Oh, sorry, I mean if someone need some atoms from %(size) to %(flags), we can
add them back.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4] ls-files: introduce "--format" option
  2022-06-27  8:32       ` Junio C Hamano
@ 2022-06-27 11:18         ` ZheNing Hu
  0 siblings, 0 replies; 61+ messages in thread
From: ZheNing Hu @ 2022-06-27 11:18 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Johannes Schindelin

Junio C Hamano <gitster@pobox.com> 于2022年6月27日周一 16:32写道:
>
> "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > Range-diff vs v3:
> > ...
> >       +test_done
>
> I omitted 300 lines of range-diff, which is not exactly illuminating
> in this case.  I wonder if there is a way to turn it off when it is
> not helping automatically...
>

I have make a issue to gitgitgadget, maybe Johannes Schindelin can
give a help: https://github.com/gitgitgadget/gitgitgadget/issues/1024

> > +FIELD NAMES
> > +-----------
> > +Various values from structured fields can be used to interpolate
> > +into the resulting output. For each outputting line, the following
> > +names can be used:
> > +
> > +objectmode::
> > +     The mode of the file which is recorded in the index.
> > +objectname::
> > +     The name of the file which is recorded in the index.
> > +stage::
> > +     The stage of the file which is recorded in the index.
>
>
> > +eolinfo:index::
> > +     The <eolinfo> of the file which is recorded in the index.
> > +eolinfo:worktree::
> > +     The <eolinfo> of the file which is recorded in the working tree.
>
> These sound somewhat strange, as the above makes it sound as if we
> are recording eolinfo for something (we never record eolinfo of
> anything anywhere).
>
>         eolinfo:index::
>         eolinfo:worktree::
>                 The <eolinfo> (see the description of the `--eol` option) of
>                 the contents in the index or in the worktree for the path
>
> perhaps?  I dunno.
>
> > +eolattr::
> > +     The <eolattr> of the file which is recorded in the index.
>
> Likewise, eolattr comes from the attribute subsystem and not
> recorded in the index.  It is more like
>
>         eolattr:
>                 The <eolattr> (see the description of the `--eol` option)
>                 that applies to the path.
>
> Because attribute applies to the path, it applies equally to both
> what is in the index and what is in the working tree.
>

Thanks for clarifying it, I will fix it.

> > +path::
> > +     The pathname of the file which is recorded in the index.
>
> As ls-tree already uses %(path) for it, this is probably OK
> (otherwise we would probably have called it %(pathname)).

Agree. Unless we want to fix it in git ls-tree too.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-27 11:06             ` ZheNing Hu
@ 2022-06-27 15:41               ` Junio C Hamano
  2022-07-01 13:30                 ` ZheNing Hu
  0 siblings, 1 reply; 61+ messages in thread
From: Junio C Hamano @ 2022-06-27 15:41 UTC (permalink / raw)
  To: ZheNing Hu
  Cc: Ævar Arnfjörð Bjarmason,
	ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Phillip Wood

ZheNing Hu <adlternative@gmail.com> writes:

> Junio C Hamano <gitster@pobox.com> 于2022年6月27日周一 16:22写道:
>>
>> ZheNing Hu <adlternative@gmail.com> writes:
>>
>> >> >> +    else if (skip_prefix(start, "(path)", &p))
>> >> >> +            write_name_to_buf(sb, data->pathname);
>> >>
>> >> These are just "values".
>> >> ...
>> >> >> +    else if (skip_prefix(start, "(size)", &p))
>> >> >> +            strbuf_addf(sb, "size: %u", sd->sd_size);
>> >> >> +    else if (skip_prefix(start, "(flags)", &p))
>> >> >> +            strbuf_addf(sb, "flags: %x", data->ce->ce_flags);
>> >>
>> >> These are not.
>> >>
>> > ... If someone else
>> > need them for some reason, we can add them back.
>>
>> If someone else needs to see "size:" printed in front of the value
>> of sd_size member, we DO NOT HAVE TO DO ANYTHING!  That someone else
>> can write "--format=size: %(size)" to do so themselves.
>
> Oh, sorry, I mean if someone need some atoms from %(size) to %(flags), we can
> add them back.

Ah, I see.  I am not sure about the %(flags) to help the debugging
mode, but giving a single bit "is it dirty?" would be more useful
than giving the cached stat info, I would think.

Thanks.


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4] ls-files: introduce "--format" option
  2022-06-26 15:29     ` [PATCH v4] " ZheNing Hu via GitGitGadget
  2022-06-27  8:32       ` Junio C Hamano
@ 2022-06-27 18:34       ` Ævar Arnfjörð Bjarmason
  2022-07-01 12:42         ` ZheNing Hu
  2022-06-28 15:19       ` Phillip Wood
  2022-07-05  6:32       ` [PATCH v5] " ZheNing Hu via GitGitGadget
  3 siblings, 1 reply; 61+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-27 18:34 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Junio C Hamano, Christian Couder, Phillip Wood, ZheNing Hu


On Sun, Jun 26 2022, ZheNing Hu via GitGitGadget wrote:

> From: ZheNing Hu <adlternative@gmail.com>
>
> Add a new option --format that output index enties
> informations with custom format, taking inspiration
> from the option with the same name in the `git ls-tree`
> command.
>
> --format cannot used with -s, -o, -k, --resolve-undo,
> --deduplicate and --eol.
>
> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---
> [...]
> +test_expect_success 'git ls-files --format with --debug' '
> +	git ls-files --debug >expect &&
> +	git ls-files --format="%(path)" --debug >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_done

I'm not sure what to make of this.

In some ways I think this makes more sense than what I suggested in
https://lore.kernel.org/git/220624.86letmi383.gmgdl@evledraar.gmail.com/;
but I had to think for a second about what's going on here.

In my version I suggested having this work with --debug, but not in this
way, in my version you'd always emit the debug output, and the format
output.

But here e.g.:

    git ls-files -t --debug

Will emit "H tag.c" or whatever, but if you add --format the -t option
is silently discarded.

So the test is relying on "%(path)" being the default format.

I think extending this to e.g. test what happens with "-t" would be a
good thing, but also in general does combining --format with -t make
sense, and are there other such options where the combination might not
make sense?

So I'm not 100% sure, but I think I'd prefer my version, but I see how
it would get hairy to support, e.g.:

    git ls-files -s --debug --format=...

Should work, but you'd have to special-case the logic for erroring if -s
is combined with --format.

Anyway, I think it would be fine to leave this in whatever state is
easy, the --debug option "just for debugging".

But re
https://lore.kernel.org/git/CAOLTT8Tc95-aUE+uN2d8QjTJpGpGw6cBJfG+bpmyE55OcXTSRA@mail.gmail.com/
I think it might be interesting to get --format to a state where we can
remove --debug entirely.

I.e. in c2a29405105 (t1091/t3705: remove 'test-tool read-cache --table',
2021-12-22) we could replace some similar test-only code with "git
ls-files". I for one wouldn't mind --debug going away entirely, and have
the t3705-add-sparse-checkout.sh tests use --format instead.

Or we could keep --debug, but just have it powerful enough to do what
print_debug() is doing now, possibly without "truly internal" stuff like
"ce_flags".

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4] ls-files: introduce "--format" option
  2022-06-26 15:29     ` [PATCH v4] " ZheNing Hu via GitGitGadget
  2022-06-27  8:32       ` Junio C Hamano
  2022-06-27 18:34       ` Ævar Arnfjörð Bjarmason
@ 2022-06-28 15:19       ` Phillip Wood
  2022-07-01 12:47         ` ZheNing Hu
  2022-07-05  6:32       ` [PATCH v5] " ZheNing Hu via GitGitGadget
  3 siblings, 1 reply; 61+ messages in thread
From: Phillip Wood @ 2022-06-28 15:19 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget, git
  Cc: Junio C Hamano, Christian Couder,
	Ævar Arnfjörð Bjarmason, ZheNing Hu

Hi ZheNing

This looks good, I don't have much to add beyond the comments others 
have left.

On 26/06/2022 16:29, ZheNing Hu via GitGitGadget wrote:
> From: ZheNing Hu <adlternative@gmail.com> 
> +FIELD NAMES
> +-----------
> +Various values from structured fields can be used to interpolate
> +into the resulting output. For each outputting line, the following
> +names can be used:
> +
> +objectmode::
> +	The mode of the file which is recorded in the index.
> +objectname::
> +	The name of the file which is recorded in the index.
> +stage::
> +	The stage of the file which is recorded in the index.
> +eolinfo:index::
> +	The <eolinfo> of the file which is recorded in the index.
> +eolinfo:worktree::
> +	The <eolinfo> of the file which is recorded in the working tree.
> +eolattr::
> +	The <eolattr> of the file which is recorded in the index.
> +path::
> +	The pathname of the file which is recorded in the index.

I think starting with this shorter list of field names is a good idea, 
we can always add more fields later if there is a demand for %(flags) etc.

> +test_expect_success 'git ls-files --format with --debug' '
> +	git ls-files --debug >expect &&
> +	git ls-files --format="%(path)" --debug >actual &&
> +	test_cmp expect actual
> +'

What's the motivation for being able to combine --format with --debug?

Best Wishes

Phillip

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4] ls-files: introduce "--format" option
  2022-06-27 18:34       ` Ævar Arnfjörð Bjarmason
@ 2022-07-01 12:42         ` ZheNing Hu
  0 siblings, 0 replies; 61+ messages in thread
From: ZheNing Hu @ 2022-07-01 12:42 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: ZheNing Hu via GitGitGadget, Git List, Junio C Hamano,
	Christian Couder, Phillip Wood

Ævar Arnfjörð Bjarmason <avarab@gmail.com> 于2022年6月28日周二 02:48写道:
>
>
> On Sun, Jun 26 2022, ZheNing Hu via GitGitGadget wrote:
>
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > Add a new option --format that output index enties
> > informations with custom format, taking inspiration
> > from the option with the same name in the `git ls-tree`
> > command.
> >
> > --format cannot used with -s, -o, -k, --resolve-undo,
> > --deduplicate and --eol.
> >
> > Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> > ---
> > [...]
> > +test_expect_success 'git ls-files --format with --debug' '
> > +     git ls-files --debug >expect &&
> > +     git ls-files --format="%(path)" --debug >actual &&
> > +     test_cmp expect actual
> > +'
> > +
> > +test_done
>
> I'm not sure what to make of this.
>
> In some ways I think this makes more sense than what I suggested in
> https://lore.kernel.org/git/220624.86letmi383.gmgdl@evledraar.gmail.com/;
> but I had to think for a second about what's going on here.
>
> In my version I suggested having this work with --debug, but not in this
> way, in my version you'd always emit the debug output, and the format
> output.
>
> But here e.g.:
>
>     git ls-files -t --debug
>
> Will emit "H tag.c" or whatever, but if you add --format the -t option
> is silently discarded.
>
> So the test is relying on "%(path)" being the default format.
>
> I think extending this to e.g. test what happens with "-t" would be a
> good thing, but also in general does combining --format with -t make
> sense, and are there other such options where the combination might not
> make sense?
>

Why not we just let -t incompatible with --format? Is this because -t can also
be considered a “debug” message, and we often use --debug and -t together?

If so, we can just do something like:

@@ -238,6 +335,13 @@ static void show_ce(struct repository *repo,
struct dir_struct *dir,
                                  S_ISGITLINK(ce->ce_mode))) {
                tag = get_tag(ce, tag);

+               if (format) {
+                       fputs(tag, stdout);
+                       show_ce_fmt(repo, ce, format, fullname);
+                       print_debug(ce);
+                       return;
+               }
+


> So I'm not 100% sure, but I think I'd prefer my version, but I see how
> it would get hairy to support, e.g.:
>
>     git ls-files -s --debug --format=...
>
> Should work, but you'd have to special-case the logic for erroring if -s
> is combined with --format.
>

Agree. it's really weird.

> Anyway, I think it would be fine to leave this in whatever state is
> easy, the --debug option "just for debugging".
>
> But re
> https://lore.kernel.org/git/CAOLTT8Tc95-aUE+uN2d8QjTJpGpGw6cBJfG+bpmyE55OcXTSRA@mail.gmail.com/
> I think it might be interesting to get --format to a state where we can
> remove --debug entirely.
>
> I.e. in c2a29405105 (t1091/t3705: remove 'test-tool read-cache --table',
> 2021-12-22) we could replace some similar test-only code with "git
> ls-files". I for one wouldn't mind --debug going away entirely, and have
> the t3705-add-sparse-checkout.sh tests use --format instead.
>
> Or we could keep --debug, but just have it powerful enough to do what
> print_debug() is doing now, possibly without "truly internal" stuff like
> "ce_flags".

Ah, though we just remove these little "useless" atoms, maybe we can add
them back later? (not in this patch?)

ZheNing Hu

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4] ls-files: introduce "--format" option
  2022-06-28 15:19       ` Phillip Wood
@ 2022-07-01 12:47         ` ZheNing Hu
  0 siblings, 0 replies; 61+ messages in thread
From: ZheNing Hu @ 2022-07-01 12:47 UTC (permalink / raw)
  To: Phillip Wood
  Cc: ZheNing Hu via GitGitGadget, Git List, Junio C Hamano,
	Christian Couder, Ævar Arnfjörð Bjarmason

Phillip Wood <phillip.wood123@gmail.com> 于2022年6月28日周二 23:19写道:
>
> Hi ZheNing
>
> This looks good, I don't have much to add beyond the comments others
> have left.
>
> On 26/06/2022 16:29, ZheNing Hu via GitGitGadget wrote:
> > From: ZheNing Hu <adlternative@gmail.com>
> > +FIELD NAMES
> > +-----------
> > +Various values from structured fields can be used to interpolate
> > +into the resulting output. For each outputting line, the following
> > +names can be used:
> > +
> > +objectmode::
> > +     The mode of the file which is recorded in the index.
> > +objectname::
> > +     The name of the file which is recorded in the index.
> > +stage::
> > +     The stage of the file which is recorded in the index.
> > +eolinfo:index::
> > +     The <eolinfo> of the file which is recorded in the index.
> > +eolinfo:worktree::
> > +     The <eolinfo> of the file which is recorded in the working tree.
> > +eolattr::
> > +     The <eolattr> of the file which is recorded in the index.
> > +path::
> > +     The pathname of the file which is recorded in the index.
>
> I think starting with this shorter list of field names is a good idea,
> we can always add more fields later if there is a demand for %(flags) etc.
>

Agree.

> > +test_expect_success 'git ls-files --format with --debug' '
> > +     git ls-files --debug >expect &&
> > +     git ls-files --format="%(path)" --debug >actual &&
> > +     test_cmp expect actual
> > +'
>
> What's the motivation for being able to combine --format with --debug?
>

I guess it may help us get debug informations together with --format output
when we are doing some tests.

> Best Wishes
>
> Phillip

ZheNing Hu

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-27 15:41               ` Junio C Hamano
@ 2022-07-01 13:30                 ` ZheNing Hu
  0 siblings, 0 replies; 61+ messages in thread
From: ZheNing Hu @ 2022-07-01 13:30 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason,
	ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Phillip Wood

Junio C Hamano <gitster@pobox.com> 于2022年6月27日周一 23:41写道:
>
> ZheNing Hu <adlternative@gmail.com> writes:
>
> > Oh, sorry, I mean if someone need some atoms from %(size) to %(flags), we can
> > add them back.
>
> Ah, I see.  I am not sure about the %(flags) to help the debugging
> mode, but giving a single bit "is it dirty?" would be more useful
> than giving the cached stat info, I would think.
>

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 061a205576..ccb3fd1676 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -296,6 +296,8 @@ static size_t expand_show_index(struct strbuf *sb,
const char *start,
                write_eolattr_to_buf(sb, data->istate, data->pathname);
        else if (skip_prefix(start, "(path)", &p))
                write_name_to_buf(sb, data->pathname);
+       else if (skip_prefix(start, "(updatetodate)", &p))
+               strbuf_addstr(sb, ce_uptodate(data->ce) ? "true" : "false");
        else
                die(_("bad ls-files format: %%%.*s"), (int)len, start);


I try to use ce_uptodate(ce) to check its flags, but unfortunately,
git ls-files --format="%(updatetodate)" output all files are "false" :(
That's because we have not mark some flags in the cache entry, right?

> Thanks.
>

ZheNing Hu

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v5] ls-files: introduce "--format" option
  2022-06-26 15:29     ` [PATCH v4] " ZheNing Hu via GitGitGadget
                         ` (2 preceding siblings ...)
  2022-06-28 15:19       ` Phillip Wood
@ 2022-07-05  6:32       ` ZheNing Hu via GitGitGadget
  2022-07-05  8:39         ` Ævar Arnfjörð Bjarmason
                           ` (2 more replies)
  3 siblings, 3 replies; 61+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2022-07-05  6:32 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder,
	Ævar Arnfjörð Bjarmason, Phillip Wood, ZheNing Hu,
	ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Add a new option --format that output index enties
informations with custom format, taking inspiration
from the option with the same name in the `git ls-tree`
command.

--format cannot used with -s, -o, -k, -t, --resolve-undo,
--deduplicate and --eol.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
    ls-files: introduce "--format" options
    
    v4->v5:
    
     1. Let --format incompatible with -t.
     2. Fix %(eolinfo) and %(eolattr) docs suggested by Junio.
    
    Looking forward to Ævar's reviewing.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1262%2Fadlternative%2Fzh%2Fls-file-format-v5
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1262/adlternative/zh/ls-file-format-v5
Pull-Request: https://github.com/gitgitgadget/git/pull/1262

Range-diff vs v4:

 1:  6827e44e158 ! 1:  1ce69d6202a ls-files: introduce "--format" option
     @@ Commit message
          from the option with the same name in the `git ls-tree`
          command.
      
     -    --format cannot used with -s, -o, -k, --resolve-undo,
     +    --format cannot used with -s, -o, -k, -t, --resolve-undo,
          --deduplicate and --eol.
      
          Signed-off-by: ZheNing Hu <adlternative@gmail.com>
     @@ Documentation/git-ls-files.txt: followed by the  ("attr/<eolattr>").
      +	It also interpolates `%%` to `%`, and `%xx` where `xx` are hex digits
      +	interpolates to character with hex code `xx`; for example `%00`
      +	interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
     -+	--format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo`
     ++	--format cannot be combined with `-s`, `-o`, `-k`, `-t`, `--resolve-undo`
      +	and `--eol`.
       \--::
       	Do not interpret any more arguments as options.
     @@ Documentation/git-ls-files.txt: quoted as explained for the configuration variab
      +stage::
      +	The stage of the file which is recorded in the index.
      +eolinfo:index::
     -+	The <eolinfo> of the file which is recorded in the index.
      +eolinfo:worktree::
     -+	The <eolinfo> of the file which is recorded in the working tree.
     ++	The <eolinfo> (see the description of the `--eol` option) of
     ++	the contents in the index or in the worktree for the path.
      +eolattr::
     -+	The <eolattr> of the file which is recorded in the index.
     ++	The <eolattr> (see the description of the `--eol` option)
     ++	that applies to the path.
      +path::
      +	The pathname of the file which is recorded in the index.
       
     @@ builtin/ls-files.c: int cmd_ls_files(int argc, const char **argv, const char *cm
       	}
      +
      +	if (format && (show_stage || show_others || show_killed ||
     -+		show_resolve_undo || skipping_duplicates || show_eol))
     -+			usage_msg_opt("--format cannot used with -s, -o, -k, "
     ++		show_resolve_undo || skipping_duplicates || show_eol || show_tag))
     ++			usage_msg_opt("--format cannot used with -s, -o, -k, -t"
      +				      "--resolve-undo, --deduplicate, --eol",
      +				      ls_files_usage, builtin_ls_files_options);
      +
     @@ t/t3013-ls-files-format.sh (new)
      +TEST_PASSES_SANITIZE_LEAK=true
      +. ./test-lib.sh
      +
     -+for flag in -s -o -k --resolve-undo --deduplicate --eol
     ++for flag in -s -o -k -t --resolve-undo --deduplicate --eol
      +do
      +	test_expect_success "usage: --format is incompatible with $flag" '
      +		test_expect_code 129 git ls-files --format="%(objectname)" $flag
     @@ t/t3013-ls-files-format.sh (new)
      +	100755
      +	100644
      +	EOF
     -+	git ls-files --format="%(objectmode)" -t >actual &&
     ++	git ls-files --format="%(objectmode)" >actual &&
      +	test_cmp expect actual
      +'
      +


 Documentation/git-ls-files.txt |  38 ++++++++++-
 builtin/ls-files.c             | 113 +++++++++++++++++++++++++++++++++
 t/t3013-ls-files-format.sh     | 108 +++++++++++++++++++++++++++++++
 3 files changed, 258 insertions(+), 1 deletion(-)
 create mode 100755 t/t3013-ls-files-format.sh

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index 0dabf3f0ddc..97d4cebba9f 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -20,7 +20,7 @@ SYNOPSIS
 		[--exclude-standard]
 		[--error-unmatch] [--with-tree=<tree-ish>]
 		[--full-name] [--recurse-submodules]
-		[--abbrev[=<n>]] [--] [<file>...]
+		[--abbrev[=<n>]] [--format=<format>] [--] [<file>...]
 
 DESCRIPTION
 -----------
@@ -192,6 +192,13 @@ followed by the  ("attr/<eolattr>").
 	to the contained files. Sparse directories will be shown with a
 	trailing slash, such as "x/" for a sparse directory "x".
 
+--format=<format>::
+	A string that interpolates `%(fieldname)` from the result being shown.
+	It also interpolates `%%` to `%`, and `%xx` where `xx` are hex digits
+	interpolates to character with hex code `xx`; for example `%00`
+	interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
+	--format cannot be combined with `-s`, `-o`, `-k`, `-t`, `--resolve-undo`
+	and `--eol`.
 \--::
 	Do not interpret any more arguments as options.
 
@@ -223,6 +230,35 @@ quoted as explained for the configuration variable `core.quotePath`
 (see linkgit:git-config[1]).  Using `-z` the filename is output
 verbatim and the line is terminated by a NUL byte.
 
+It is possible to print in a custom format by using the `--format`
+option, which is able to interpolate different fields using
+a `%(fieldname)` notation. For example, if you only care about the
+"objectname" and "path" fields, you can execute with a specific
+"--format" like
+
+	git ls-files --format='%(objectname) %(path)'
+
+FIELD NAMES
+-----------
+Various values from structured fields can be used to interpolate
+into the resulting output. For each outputting line, the following
+names can be used:
+
+objectmode::
+	The mode of the file which is recorded in the index.
+objectname::
+	The name of the file which is recorded in the index.
+stage::
+	The stage of the file which is recorded in the index.
+eolinfo:index::
+eolinfo:worktree::
+	The <eolinfo> (see the description of the `--eol` option) of
+	the contents in the index or in the worktree for the path.
+eolattr::
+	The <eolattr> (see the description of the `--eol` option)
+	that applies to the path.
+path::
+	The pathname of the file which is recorded in the index.
 
 EXCLUDE PATTERNS
 ----------------
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index e791b65e7e9..79ecdce2c9c 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -11,6 +11,7 @@
 #include "quote.h"
 #include "dir.h"
 #include "builtin.h"
+#include "strbuf.h"
 #include "tree.h"
 #include "cache-tree.h"
 #include "parse-options.h"
@@ -48,6 +49,7 @@ static char *ps_matched;
 static const char *with_tree;
 static int exc_given;
 static int exclude_args;
+static const char *format;
 
 static const char *tag_cached = "";
 static const char *tag_unmerged = "";
@@ -75,6 +77,30 @@ static void write_eolinfo(struct index_state *istate,
 	}
 }
 
+static void write_index_eolinfo_to_buf(struct strbuf *sb, struct index_state *istate,
+				       const struct cache_entry *ce)
+{
+	const char *i_txt = "";
+	if (ce && S_ISREG(ce->ce_mode))
+		i_txt = get_cached_convert_stats_ascii(istate, ce->name);
+	strbuf_addstr(sb, i_txt);
+}
+
+static void write_worktree_eolinfo_to_buf(struct strbuf *sb, const char *path)
+{
+	struct stat st;
+	const char *w_txt = "";
+	if (!lstat(path, &st) && S_ISREG(st.st_mode))
+		w_txt = get_wt_convert_stats_ascii(path);
+	strbuf_addstr(sb, w_txt);
+}
+
+static void write_eolattr_to_buf(struct strbuf *sb, struct index_state *istate,
+				 const char *path)
+{
+	strbuf_addstr(sb, get_convert_attr_ascii(istate, path));
+}
+
 static void write_name(const char *name)
 {
 	/*
@@ -85,6 +111,15 @@ static void write_name(const char *name)
 				   stdout, line_terminator);
 }
 
+static void write_name_to_buf(struct strbuf *sb, const char *name)
+{
+	const char *rel = relative_path(name, prefix_len ? prefix : NULL, sb);
+	if (line_terminator)
+		quote_c_style(rel, sb, NULL, 0);
+	else
+		strbuf_add(sb, rel, strlen(rel));
+}
+
 static const char *get_tag(const struct cache_entry *ce, const char *tag)
 {
 	static char alttag[4];
@@ -222,6 +257,68 @@ static void show_submodule(struct repository *superproject,
 	repo_clear(&subrepo);
 }
 
+struct show_index_data {
+	const char *pathname;
+	struct index_state *istate;
+	const struct cache_entry *ce;
+};
+
+static size_t expand_show_index(struct strbuf *sb, const char *start,
+			       void *context)
+{
+	struct show_index_data *data = context;
+	const char *end;
+	const char *p;
+	size_t len = strbuf_expand_literal_cb(sb, start, NULL);
+	if (len)
+		return len;
+	if (*start != '(')
+		die(_("bad ls-files format: element '%s' "
+		      "does not start with '('"), start);
+
+	end = strchr(start + 1, ')');
+	if (!end)
+		die(_("bad ls-files format: element '%s'"
+		      "does not end in ')'"), start);
+
+	len = end - start + 1;
+	if (skip_prefix(start, "(objectmode)", &p))
+		strbuf_addf(sb, "%06o", data->ce->ce_mode);
+	else if (skip_prefix(start, "(objectname)", &p))
+		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
+	else if (skip_prefix(start, "(stage)", &p))
+		strbuf_addf(sb, "%d", ce_stage(data->ce));
+	else if (skip_prefix(start, "(eolinfo:index)", &p))
+		write_index_eolinfo_to_buf(sb, data->istate, data->ce);
+	else if (skip_prefix(start, "(eolinfo:worktree)", &p))
+		write_worktree_eolinfo_to_buf(sb, data->pathname);
+	else if (skip_prefix(start, "(eolattr)", &p))
+		write_eolattr_to_buf(sb, data->istate, data->pathname);
+	else if (skip_prefix(start, "(path)", &p))
+		write_name_to_buf(sb, data->pathname);
+	else
+		die(_("bad ls-files format: %%%.*s"), (int)len, start);
+
+	return len;
+}
+
+static void show_ce_fmt(struct repository *repo, const struct cache_entry *ce,
+			const char *format, const char *fullname) {
+
+	struct show_index_data data = {
+		.pathname = fullname,
+		.istate = repo->index,
+		.ce = ce,
+	};
+
+	struct strbuf sb = STRBUF_INIT;
+	strbuf_expand(&sb, format, expand_show_index, &data);
+	strbuf_addch(&sb, line_terminator);
+	fwrite(sb.buf, sb.len, 1, stdout);
+	strbuf_release(&sb);
+	return;
+}
+
 static void show_ce(struct repository *repo, struct dir_struct *dir,
 		    const struct cache_entry *ce, const char *fullname,
 		    const char *tag)
@@ -236,6 +333,12 @@ static void show_ce(struct repository *repo, struct dir_struct *dir,
 				  max_prefix_len, ps_matched,
 				  S_ISDIR(ce->ce_mode) ||
 				  S_ISGITLINK(ce->ce_mode))) {
+		if (format) {
+			show_ce_fmt(repo, ce, format, fullname);
+			print_debug(ce);
+			return;
+		}
+
 		tag = get_tag(ce, tag);
 
 		if (!show_stage) {
@@ -675,6 +778,9 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 			 N_("suppress duplicate entries")),
 		OPT_BOOL(0, "sparse", &show_sparse_dirs,
 			 N_("show sparse directories in the presence of a sparse index")),
+		OPT_STRING_F(0, "format", &format, N_("format"),
+			     N_("format to use for the output"),
+			     PARSE_OPT_NONEG),
 		OPT_END()
 	};
 	int ret = 0;
@@ -699,6 +805,13 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 	for (i = 0; i < exclude_list.nr; i++) {
 		add_pattern(exclude_list.items[i].string, "", 0, pl, --exclude_args);
 	}
+
+	if (format && (show_stage || show_others || show_killed ||
+		show_resolve_undo || skipping_duplicates || show_eol || show_tag))
+			usage_msg_opt("--format cannot used with -s, -o, -k, -t"
+				      "--resolve-undo, --deduplicate, --eol",
+				      ls_files_usage, builtin_ls_files_options);
+
 	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
 		tag_cached = "H ";
 		tag_unmerged = "M ";
diff --git a/t/t3013-ls-files-format.sh b/t/t3013-ls-files-format.sh
new file mode 100755
index 00000000000..60c415aafd6
--- /dev/null
+++ b/t/t3013-ls-files-format.sh
@@ -0,0 +1,108 @@
+#!/bin/sh
+
+test_description='git ls-files --format test'
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+for flag in -s -o -k -t --resolve-undo --deduplicate --eol
+do
+	test_expect_success "usage: --format is incompatible with $flag" '
+		test_expect_code 129 git ls-files --format="%(objectname)" $flag
+	'
+done
+
+test_expect_success 'setup' '
+	echo o1 >o1 &&
+	echo o2 >o2 &&
+	git add o1 o2 &&
+	git add --chmod +x o1 &&
+	git commit -m base
+'
+
+test_expect_success 'git ls-files --format objectmode' '
+	cat >expect <<-\EOF &&
+	100755
+	100644
+	EOF
+	git ls-files --format="%(objectmode)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format objectname' '
+	oid1=$(git hash-object o1) &&
+	oid2=$(git hash-object o2) &&
+	cat >expect <<-EOF &&
+	$oid1
+	$oid2
+	EOF
+	git ls-files --format="%(objectname)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format eolinfo:index' '
+	cat >expect <<-\EOF &&
+	lf
+	lf
+	EOF
+	git ls-files --format="%(eolinfo:index)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format eolinfo:worktree' '
+	cat >expect <<-\EOF &&
+	lf
+	lf
+	EOF
+	git ls-files --format="%(eolinfo:worktree)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format eolattr' '
+	printf "\n\n" >expect &&
+	git ls-files --format="%(eolattr)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format path' '
+	cat >expect <<-\EOF &&
+	o1
+	o2
+	EOF
+	git ls-files --format="%(path)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -m' '
+	echo change >o1 &&
+	cat >expect <<-\EOF &&
+	o1
+	EOF
+	git ls-files --format="%(path)" -m >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -d' '
+	echo o3 >o3 &&
+	git add o3 &&
+	rm o3 &&
+	cat >expect <<-\EOF &&
+	o3
+	EOF
+	git ls-files --format="%(path)" -d >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format imitate --stage' '
+	git ls-files --stage >expect &&
+	git ls-files --format="%(objectmode) %(objectname) %(stage)%x09%(path)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with --debug' '
+	git ls-files --debug >expect &&
+	git ls-files --format="%(path)" --debug >actual &&
+	test_cmp expect actual
+'
+
+test_done

base-commit: ab336e8f1c8009c8b1aab8deb592148e69217085
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH v5] ls-files: introduce "--format" option
  2022-07-05  6:32       ` [PATCH v5] " ZheNing Hu via GitGitGadget
@ 2022-07-05  8:39         ` Ævar Arnfjörð Bjarmason
  2022-07-11 15:14           ` ZheNing Hu
  2022-07-05 19:28         ` Torsten Bögershausen
  2022-07-11 16:53         ` [PATCH v6] " ZheNing Hu via GitGitGadget
  2 siblings, 1 reply; 61+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-07-05  8:39 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Junio C Hamano, Christian Couder, Phillip Wood, ZheNing Hu


On Tue, Jul 05 2022, ZheNing Hu via GitGitGadget wrote:

> From: ZheNing Hu <adlternative@gmail.com>
>
> Add a new option --format that output index enties
> informations with custom format, taking inspiration
> from the option with the same name in the `git ls-tree`
> command.
>
> --format cannot used with -s, -o, -k, -t, --resolve-undo,
> --deduplicate and --eol.
>
> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---
>     ls-files: introduce "--format" options
>     
>     v4->v5:
>     
>      1. Let --format incompatible with -t.
>      2. Fix %(eolinfo) and %(eolattr) docs suggested by Junio.
>     
>     Looking forward to Ævar's reviewing.

Thanks again, I took a look at this and it looks good to me as-is.

If you do want to further twiddle with it at this point I applied these
changes to it locally while poking around, changes:

 * Some trivial whitespace between variable decl.

 * Removed a "return;" at the end of a function

 * I found the new write_*() helpers to be uneccesary, what I did spot
   after seeing if they could be factored out is the existing
   write_eolinfo() function.

   I see you just copied some of the code from there, but
   e.g. initializing to "" and doing an unconditional strbuf_addstr()
   looks odd IMO compared to just doing it inline as below.

   I think if helpers are to be introduced here I'd think it would make
   more sense to split out the small bits of behavior from
   write_eolinfo() so you can call it picemeal and share the code, but
   since it's calling trivial external functions I think just calling
   those directly probably makes more sense...

 * Likewise for the test I wondered if you were adding a bug by not
   reporting when lstat() failed, but then found that this is the same
   thing we do on --eol.

   So for the tests I think it's better just to demonstrate that we can
   emit the exact same thing that --eol does with --format.

 * We've gone back & forth a bit on whether this would combine with
   --debug, while it's an internal-only feature it would be nice to have a
   test for it combined with --format, noting that the behavior might
   change...

 * There is one subtle behavior change here in that I deleted the "ce
   &&" part from write_index_eolinfo_to_buf() when moving the code
   over. I'm 99% sure this is the right thing to do, as other code in
   expand_show_index() unconditionally dereferences it.

   So perhaps we don't need that guard in write_eolinfo() either? In any
   case copy/pasting it over when we're already assuming a non-NULL "ce"
   in the same "if/elseif/else" chain looks a bit odd.

   Ah, I see it's because in show_dir_entry() we explicitly pass it as
   NULL, but that doesn't apply to our new codepath, so as long as we're
   not sharing that helper with write_eolinfo() it makes sense to not do
   that check.

   Even then the helper should probably assume "ce", and write_eolinfo()
   itself should do the "is ce NULL?" check which is specific to its
   use-case.

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 79ecdce2c9c..cc3cece3830 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -77,30 +77,6 @@ static void write_eolinfo(struct index_state *istate,
 	}
 }
 
-static void write_index_eolinfo_to_buf(struct strbuf *sb, struct index_state *istate,
-				       const struct cache_entry *ce)
-{
-	const char *i_txt = "";
-	if (ce && S_ISREG(ce->ce_mode))
-		i_txt = get_cached_convert_stats_ascii(istate, ce->name);
-	strbuf_addstr(sb, i_txt);
-}
-
-static void write_worktree_eolinfo_to_buf(struct strbuf *sb, const char *path)
-{
-	struct stat st;
-	const char *w_txt = "";
-	if (!lstat(path, &st) && S_ISREG(st.st_mode))
-		w_txt = get_wt_convert_stats_ascii(path);
-	strbuf_addstr(sb, w_txt);
-}
-
-static void write_eolattr_to_buf(struct strbuf *sb, struct index_state *istate,
-				 const char *path)
-{
-	strbuf_addstr(sb, get_convert_attr_ascii(istate, path));
-}
-
 static void write_name(const char *name)
 {
 	/*
@@ -114,6 +90,7 @@ static void write_name(const char *name)
 static void write_name_to_buf(struct strbuf *sb, const char *name)
 {
 	const char *rel = relative_path(name, prefix_len ? prefix : NULL, sb);
+
 	if (line_terminator)
 		quote_c_style(rel, sb, NULL, 0);
 	else
@@ -270,6 +247,8 @@ static size_t expand_show_index(struct strbuf *sb, const char *start,
 	const char *end;
 	const char *p;
 	size_t len = strbuf_expand_literal_cb(sb, start, NULL);
+	struct stat st;
+
 	if (len)
 		return len;
 	if (*start != '(')
@@ -288,12 +267,16 @@ static size_t expand_show_index(struct strbuf *sb, const char *start,
 		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
 	else if (skip_prefix(start, "(stage)", &p))
 		strbuf_addf(sb, "%d", ce_stage(data->ce));
-	else if (skip_prefix(start, "(eolinfo:index)", &p))
-		write_index_eolinfo_to_buf(sb, data->istate, data->ce);
-	else if (skip_prefix(start, "(eolinfo:worktree)", &p))
-		write_worktree_eolinfo_to_buf(sb, data->pathname);
+	else if (skip_prefix(start, "(eolinfo:index)", &p) &&
+		 S_ISREG(data->ce->ce_mode))
+		strbuf_addstr(sb, get_cached_convert_stats_ascii(data->istate,
+								 data->ce->name));
+	else if (skip_prefix(start, "(eolinfo:worktree)", &p) &&
+		 !lstat(data->pathname, &st) && S_ISREG(st.st_mode))
+		strbuf_addstr(sb, get_wt_convert_stats_ascii(data->pathname));
 	else if (skip_prefix(start, "(eolattr)", &p))
-		write_eolattr_to_buf(sb, data->istate, data->pathname);
+		strbuf_addstr(sb, get_convert_attr_ascii(data->istate,
+							 data->pathname));
 	else if (skip_prefix(start, "(path)", &p))
 		write_name_to_buf(sb, data->pathname);
 	else
@@ -310,13 +293,12 @@ static void show_ce_fmt(struct repository *repo, const struct cache_entry *ce,
 		.istate = repo->index,
 		.ce = ce,
 	};
-
 	struct strbuf sb = STRBUF_INIT;
+
 	strbuf_expand(&sb, format, expand_show_index, &data);
 	strbuf_addch(&sb, line_terminator);
 	fwrite(sb.buf, sb.len, 1, stdout);
 	strbuf_release(&sb);
-	return;
 }
 
 static void show_ce(struct repository *repo, struct dir_struct *dir,
diff --git a/t/t3013-ls-files-format.sh b/t/t3013-ls-files-format.sh
index 60c415aafd6..baf03f9096e 100755
--- a/t/t3013-ls-files-format.sh
+++ b/t/t3013-ls-files-format.sh
@@ -40,27 +40,13 @@ test_expect_success 'git ls-files --format objectname' '
 	test_cmp expect actual
 '
 
-test_expect_success 'git ls-files --format eolinfo:index' '
-	cat >expect <<-\EOF &&
-	lf
-	lf
-	EOF
-	git ls-files --format="%(eolinfo:index)" >actual &&
-	test_cmp expect actual
-'
-
-test_expect_success 'git ls-files --format eolinfo:worktree' '
-	cat >expect <<-\EOF &&
-	lf
-	lf
-	EOF
-	git ls-files --format="%(eolinfo:worktree)" >actual &&
-	test_cmp expect actual
-'
-
-test_expect_success 'git ls-files --format eolattr' '
-	printf "\n\n" >expect &&
-	git ls-files --format="%(eolattr)" >actual &&
+HT='	'
+WS='    '
+test_expect_success 'git ls-files --format v.s. --eol' '
+	git ls-files --eol >expect 2>err &&
+	test_must_be_empty err &&
+	git ls-files --format="i/%(eolinfo:index)${WS}w/%(eolinfo:worktree)${WS}attr/${WS}${WS}${WS}${WS} ${HT}%(path)" >actual 2>err &&
+	test_must_be_empty err &&
 	test_cmp expect actual
 '
 

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH v5] ls-files: introduce "--format" option
  2022-07-05  6:32       ` [PATCH v5] " ZheNing Hu via GitGitGadget
  2022-07-05  8:39         ` Ævar Arnfjörð Bjarmason
@ 2022-07-05 19:28         ` Torsten Bögershausen
  2022-07-11 15:27           ` ZheNing Hu
  2022-07-11 16:53         ` [PATCH v6] " ZheNing Hu via GitGitGadget
  2 siblings, 1 reply; 61+ messages in thread
From: Torsten Bögershausen @ 2022-07-05 19:28 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Junio C Hamano, Christian Couder,
	Ævar Arnfjörð Bjarmason, Phillip Wood, ZheNing Hu

On Tue, Jul 05, 2022 at 06:32:40AM +0000, ZheNing Hu via GitGitGadget wrote:
> From: ZheNing Hu <adlternative@gmail.com>
>
> Add a new option --format that output index enties
> informations with custom format, taking inspiration
> from the option with the same name in the `git ls-tree`
> command.
[]
> +FIELD NAMES

Nice

> +-----------
> +Various values from structured fields can be used to interpolate
> +into the resulting output. For each outputting line, the following
> +names can be used:
> +
> +objectmode::
> +	The mode of the file which is recorded in the index.
> +objectname::
> +	The name of the file which is recorded in the index.
> +stage::
> +	The stage of the file which is recorded in the index.
> +eolinfo:index::
> +eolinfo:worktree::
> +	The <eolinfo> (see the description of the `--eol` option) of
> +	the contents in the index or in the worktree for the path.
> +eolattr::
> +	The <eolattr> (see the description of the `--eol` option)
> +	that applies to the path.

This may be a matter of taste, looking at the eol-stuff:
Should the ':' be dropped and we have 3 fieldnames like this:

eolindex
eolworktree
eolattr

> +test_expect_success 'git ls-files --format eolinfo:index' '
> +	cat >expect <<-\EOF &&
> +	lf
> +	lf
> +	EOF
> +	git ls-files --format="%(eolinfo:index)" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format eolinfo:worktree' '
> +	cat >expect <<-\EOF &&
> +	lf
> +	lf
> +	EOF
> +	git ls-files --format="%(eolinfo:worktree)" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format eolattr' '
> +	printf "\n\n" >expect &&
> +	git ls-files --format="%(eolattr)" >actual &&
> +	test_cmp expect actual
> +'
> +

What exactly should this testcases test ?
Does it make sense to set up a combination of index, worktree, attr,
which are happening in real live ?

There are some tests in t0025, t0027 and t0028 that do more
realistic tests of different combinations.



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5] ls-files: introduce "--format" option
  2022-07-05  8:39         ` Ævar Arnfjörð Bjarmason
@ 2022-07-11 15:14           ` ZheNing Hu
  0 siblings, 0 replies; 61+ messages in thread
From: ZheNing Hu @ 2022-07-11 15:14 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: ZheNing Hu via GitGitGadget, Git List, Junio C Hamano,
	Christian Couder, Phillip Wood

Ævar Arnfjörð Bjarmason <avarab@gmail.com> 于2022年7月5日周二 16:50写道:
>
>
> On Tue, Jul 05 2022, ZheNing Hu via GitGitGadget wrote:
>
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > Add a new option --format that output index enties
> > informations with custom format, taking inspiration
> > from the option with the same name in the `git ls-tree`
> > command.
> >
> > --format cannot used with -s, -o, -k, -t, --resolve-undo,
> > --deduplicate and --eol.
> >
> > Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> > ---
> >     ls-files: introduce "--format" options
> >
> >     v4->v5:
> >
> >      1. Let --format incompatible with -t.
> >      2. Fix %(eolinfo) and %(eolattr) docs suggested by Junio.
> >
> >     Looking forward to Ævar's reviewing.
>
> Thanks again, I took a look at this and it looks good to me as-is.
>
> If you do want to further twiddle with it at this point I applied these
> changes to it locally while poking around, changes:
>
>  * Some trivial whitespace between variable decl.
>
>  * Removed a "return;" at the end of a function
>
>  * I found the new write_*() helpers to be uneccesary, what I did spot
>    after seeing if they could be factored out is the existing
>    write_eolinfo() function.
>
>    I see you just copied some of the code from there, but
>    e.g. initializing to "" and doing an unconditional strbuf_addstr()
>    looks odd IMO compared to just doing it inline as below.
>

Indeed, it may be a little inelegant...

>    I think if helpers are to be introduced here I'd think it would make
>    more sense to split out the small bits of behavior from
>    write_eolinfo() so you can call it picemeal and share the code, but
>    since it's calling trivial external functions I think just calling
>    those directly probably makes more sense...
>
>  * Likewise for the test I wondered if you were adding a bug by not
>    reporting when lstat() failed, but then found that this is the same
>    thing we do on --eol.
>

Yes, write_eolinfo() ignore lstat() error too, so this would not be a problem.

>    So for the tests I think it's better just to demonstrate that we can
>    emit the exact same thing that --eol does with --format.
>
>  * We've gone back & forth a bit on whether this would combine with
>    --debug, while it's an internal-only feature it would be nice to have a
>    test for it combined with --format, noting that the behavior might
>    change...
>

Oh, if we really want --format, --debug used with --eol, -t, some user
may curious about why --format can not used with  --eol, -t (without --debug),
and I think this will make it interface more complicated. So now I pefer to keep
origin design.

>  * There is one subtle behavior change here in that I deleted the "ce
>    &&" part from write_index_eolinfo_to_buf() when moving the code
>    over. I'm 99% sure this is the right thing to do, as other code in
>    expand_show_index() unconditionally dereferences it.
>
>    So perhaps we don't need that guard in write_eolinfo() either? In any
>    case copy/pasting it over when we're already assuming a non-NULL "ce"
>    in the same "if/elseif/else" chain looks a bit odd.
>
>    Ah, I see it's because in show_dir_entry() we explicitly pass it as
>    NULL, but that doesn't apply to our new codepath, so as long as we're
>    not sharing that helper with write_eolinfo() it makes sense to not do
>    that check.
>

Agree.

>    Even then the helper should probably assume "ce", and write_eolinfo()
>    itself should do the "is ce NULL?" check which is specific to its
>    use-case.
>
> diff --git a/builtin/ls-files.c b/builtin/ls-files.c
> index 79ecdce2c9c..cc3cece3830 100644
> --- a/builtin/ls-files.c
> +++ b/builtin/ls-files.c
> @@ -77,30 +77,6 @@ static void write_eolinfo(struct index_state *istate,
>         }
>  }
>
> -static void write_index_eolinfo_to_buf(struct strbuf *sb, struct index_state *istate,
> -                                      const struct cache_entry *ce)
> -{
> -       const char *i_txt = "";
> -       if (ce && S_ISREG(ce->ce_mode))
> -               i_txt = get_cached_convert_stats_ascii(istate, ce->name);
> -       strbuf_addstr(sb, i_txt);
> -}
> -
> -static void write_worktree_eolinfo_to_buf(struct strbuf *sb, const char *path)
> -{
> -       struct stat st;
> -       const char *w_txt = "";
> -       if (!lstat(path, &st) && S_ISREG(st.st_mode))
> -               w_txt = get_wt_convert_stats_ascii(path);
> -       strbuf_addstr(sb, w_txt);
> -}
> -
> -static void write_eolattr_to_buf(struct strbuf *sb, struct index_state *istate,
> -                                const char *path)
> -{
> -       strbuf_addstr(sb, get_convert_attr_ascii(istate, path));
> -}
> -
>  static void write_name(const char *name)
>  {
>         /*
> @@ -114,6 +90,7 @@ static void write_name(const char *name)
>  static void write_name_to_buf(struct strbuf *sb, const char *name)
>  {
>         const char *rel = relative_path(name, prefix_len ? prefix : NULL, sb);
> +
>         if (line_terminator)
>                 quote_c_style(rel, sb, NULL, 0);
>         else
> @@ -270,6 +247,8 @@ static size_t expand_show_index(struct strbuf *sb, const char *start,
>         const char *end;
>         const char *p;
>         size_t len = strbuf_expand_literal_cb(sb, start, NULL);
> +       struct stat st;
> +
>         if (len)
>                 return len;
>         if (*start != '(')
> @@ -288,12 +267,16 @@ static size_t expand_show_index(struct strbuf *sb, const char *start,
>                 strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
>         else if (skip_prefix(start, "(stage)", &p))
>                 strbuf_addf(sb, "%d", ce_stage(data->ce));
> -       else if (skip_prefix(start, "(eolinfo:index)", &p))
> -               write_index_eolinfo_to_buf(sb, data->istate, data->ce);
> -       else if (skip_prefix(start, "(eolinfo:worktree)", &p))
> -               write_worktree_eolinfo_to_buf(sb, data->pathname);
> +       else if (skip_prefix(start, "(eolinfo:index)", &p) &&
> +                S_ISREG(data->ce->ce_mode))
> +               strbuf_addstr(sb, get_cached_convert_stats_ascii(data->istate,
> +                                                                data->ce->name));
> +       else if (skip_prefix(start, "(eolinfo:worktree)", &p) &&
> +                !lstat(data->pathname, &st) && S_ISREG(st.st_mode))
> +               strbuf_addstr(sb, get_wt_convert_stats_ascii(data->pathname));
>         else if (skip_prefix(start, "(eolattr)", &p))
> -               write_eolattr_to_buf(sb, data->istate, data->pathname);
> +               strbuf_addstr(sb, get_convert_attr_ascii(data->istate,
> +                                                        data->pathname));
>         else if (skip_prefix(start, "(path)", &p))
>                 write_name_to_buf(sb, data->pathname);
>         else
> @@ -310,13 +293,12 @@ static void show_ce_fmt(struct repository *repo, const struct cache_entry *ce,
>                 .istate = repo->index,
>                 .ce = ce,
>         };
> -
>         struct strbuf sb = STRBUF_INIT;
> +
>         strbuf_expand(&sb, format, expand_show_index, &data);
>         strbuf_addch(&sb, line_terminator);
>         fwrite(sb.buf, sb.len, 1, stdout);
>         strbuf_release(&sb);
> -       return;
>  }
>
>  static void show_ce(struct repository *repo, struct dir_struct *dir,
> diff --git a/t/t3013-ls-files-format.sh b/t/t3013-ls-files-format.sh
> index 60c415aafd6..baf03f9096e 100755
> --- a/t/t3013-ls-files-format.sh
> +++ b/t/t3013-ls-files-format.sh
> @@ -40,27 +40,13 @@ test_expect_success 'git ls-files --format objectname' '
>         test_cmp expect actual
>  '
>
> -test_expect_success 'git ls-files --format eolinfo:index' '
> -       cat >expect <<-\EOF &&
> -       lf
> -       lf
> -       EOF
> -       git ls-files --format="%(eolinfo:index)" >actual &&
> -       test_cmp expect actual
> -'
> -
> -test_expect_success 'git ls-files --format eolinfo:worktree' '
> -       cat >expect <<-\EOF &&
> -       lf
> -       lf
> -       EOF
> -       git ls-files --format="%(eolinfo:worktree)" >actual &&
> -       test_cmp expect actual
> -'
> -
> -test_expect_success 'git ls-files --format eolattr' '
> -       printf "\n\n" >expect &&
> -       git ls-files --format="%(eolattr)" >actual &&
> +HT='   '
> +WS='    '
> +test_expect_success 'git ls-files --format v.s. --eol' '
> +       git ls-files --eol >expect 2>err &&
> +       test_must_be_empty err &&
> +       git ls-files --format="i/%(eolinfo:index)${WS}w/%(eolinfo:worktree)${WS}attr/${WS}${WS}${WS}${WS} ${HT}%(path)" >actual 2>err &&
> +       test_must_be_empty err &&
>         test_cmp expect actual
>  '
>

Thanks for review and help :-)

ZheNing Hu

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5] ls-files: introduce "--format" option
  2022-07-05 19:28         ` Torsten Bögershausen
@ 2022-07-11 15:27           ` ZheNing Hu
  0 siblings, 0 replies; 61+ messages in thread
From: ZheNing Hu @ 2022-07-11 15:27 UTC (permalink / raw)
  To: Torsten Bögershausen
  Cc: ZheNing Hu via GitGitGadget, Git List, Junio C Hamano,
	Christian Couder, Ævar Arnfjörð Bjarmason,
	Phillip Wood

Torsten Bögershausen <tboegi@web.de> 于2022年7月6日周三 03:28写道:
>
> On Tue, Jul 05, 2022 at 06:32:40AM +0000, ZheNing Hu via GitGitGadget wrote:
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > Add a new option --format that output index enties
> > informations with custom format, taking inspiration
> > from the option with the same name in the `git ls-tree`
> > command.
> []
> > +FIELD NAMES
>
> Nice
>
> > +-----------
> > +Various values from structured fields can be used to interpolate
> > +into the resulting output. For each outputting line, the following
> > +names can be used:
> > +
> > +objectmode::
> > +     The mode of the file which is recorded in the index.
> > +objectname::
> > +     The name of the file which is recorded in the index.
> > +stage::
> > +     The stage of the file which is recorded in the index.
> > +eolinfo:index::
> > +eolinfo:worktree::
> > +     The <eolinfo> (see the description of the `--eol` option) of
> > +     the contents in the index or in the worktree for the path.
> > +eolattr::
> > +     The <eolattr> (see the description of the `--eol` option)
> > +     that applies to the path.
>
> This may be a matter of taste, looking at the eol-stuff:
> Should the ':' be dropped and we have 3 fieldnames like this:
>
> eolindex
> eolworktree
> eolattr
>

Let's see the document of --eol in git-ls-files.txt:

--eol::
     Show <eolinfo> and <eolattr> of files.
     <eolinfo> is the file content identification used by Git when
     the "text" attribute is "auto" (or not set and core.autocrlf is not false).
     <eolinfo> is either "-text", "none", "lf", "crlf", "mixed" or "".

There mentioned eolinfo and eolattr many times, so let's keep it.

> > +test_expect_success 'git ls-files --format eolinfo:index' '
> > +     cat >expect <<-\EOF &&
> > +     lf
> > +     lf
> > +     EOF
> > +     git ls-files --format="%(eolinfo:index)" >actual &&
> > +     test_cmp expect actual
> > +'
> > +
> > +test_expect_success 'git ls-files --format eolinfo:worktree' '
> > +     cat >expect <<-\EOF &&
> > +     lf
> > +     lf
> > +     EOF
> > +     git ls-files --format="%(eolinfo:worktree)" >actual &&
> > +     test_cmp expect actual
> > +'
> > +
> > +test_expect_success 'git ls-files --format eolattr' '
> > +     printf "\n\n" >expect &&
> > +     git ls-files --format="%(eolattr)" >actual &&
> > +     test_cmp expect actual
> > +'
> > +
>
> What exactly should this testcases test ?
> Does it make sense to set up a combination of index, worktree, attr,
> which are happening in real live ?
>
> There are some tests in t0025, t0027 and t0028 that do more
> realistic tests of different combinations.
>
>

Origin test is not good, But now I decide use Avar's patch version:

-test_expect_success 'git ls-files --format eolattr' '
-       printf "\n\n" >expect &&
-       git ls-files --format="%(eolattr)" >actual &&
+HT='   '
+WS='    '
+test_expect_success 'git ls-files --format v.s. --eol' '
+       git ls-files --eol >expect 2>err &&
+       test_must_be_empty err &&
+       git ls-files
--format="i/%(eolinfo:index)${WS}w/%(eolinfo:worktree)${WS}attr/${WS}${WS}${WS}${WS}
${HT}%(path)" >actual 2>err &&
+       test_must_be_empty err &&
        test_cmp expect actual

it can compare the output of git ls-files --format with git ls-files --eol.

Thanks for review!

ZheNing Hu

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v6] ls-files: introduce "--format" option
  2022-07-05  6:32       ` [PATCH v5] " ZheNing Hu via GitGitGadget
  2022-07-05  8:39         ` Ævar Arnfjörð Bjarmason
  2022-07-05 19:28         ` Torsten Bögershausen
@ 2022-07-11 16:53         ` ZheNing Hu via GitGitGadget
  2022-07-11 22:11           ` Junio C Hamano
  2022-07-13  6:07           ` [PATCH v7] " ZheNing Hu via GitGitGadget
  2 siblings, 2 replies; 61+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2022-07-11 16:53 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Torsten Bögershausen, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Add a new option --format that output index enties
informations with custom format, taking inspiration
from the option with the same name in the `git ls-tree`
command.

--format cannot used with -s, -o, -k, -t, --resolve-undo,
--deduplicate and --eol.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
    ls-files: introduce "--format" options
    
    v5->v6:
    
     1. Some code cleaning suggested by Ævar.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1262%2Fadlternative%2Fzh%2Fls-file-format-v6
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1262/adlternative/zh/ls-file-format-v6
Pull-Request: https://github.com/gitgitgadget/git/pull/1262

Range-diff vs v5:

 1:  1ce69d6202a ! 1:  57ed2c15728 ls-files: introduce "--format" option
     @@ builtin/ls-files.c: static char *ps_matched;
       
       static const char *tag_cached = "";
       static const char *tag_unmerged = "";
     -@@ builtin/ls-files.c: static void write_eolinfo(struct index_state *istate,
     - 	}
     - }
     - 
     -+static void write_index_eolinfo_to_buf(struct strbuf *sb, struct index_state *istate,
     -+				       const struct cache_entry *ce)
     -+{
     -+	const char *i_txt = "";
     -+	if (ce && S_ISREG(ce->ce_mode))
     -+		i_txt = get_cached_convert_stats_ascii(istate, ce->name);
     -+	strbuf_addstr(sb, i_txt);
     -+}
     -+
     -+static void write_worktree_eolinfo_to_buf(struct strbuf *sb, const char *path)
     -+{
     -+	struct stat st;
     -+	const char *w_txt = "";
     -+	if (!lstat(path, &st) && S_ISREG(st.st_mode))
     -+		w_txt = get_wt_convert_stats_ascii(path);
     -+	strbuf_addstr(sb, w_txt);
     -+}
     -+
     -+static void write_eolattr_to_buf(struct strbuf *sb, struct index_state *istate,
     -+				 const char *path)
     -+{
     -+	strbuf_addstr(sb, get_convert_attr_ascii(istate, path));
     -+}
     -+
     - static void write_name(const char *name)
     - {
     - 	/*
      @@ builtin/ls-files.c: static void write_name(const char *name)
       				   stdout, line_terminator);
       }
     @@ builtin/ls-files.c: static void show_submodule(struct repository *superproject,
      +	const char *end;
      +	const char *p;
      +	size_t len = strbuf_expand_literal_cb(sb, start, NULL);
     ++	struct stat st;
     ++
      +	if (len)
      +		return len;
      +	if (*start != '(')
     @@ builtin/ls-files.c: static void show_submodule(struct repository *superproject,
      +		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
      +	else if (skip_prefix(start, "(stage)", &p))
      +		strbuf_addf(sb, "%d", ce_stage(data->ce));
     -+	else if (skip_prefix(start, "(eolinfo:index)", &p))
     -+		write_index_eolinfo_to_buf(sb, data->istate, data->ce);
     -+	else if (skip_prefix(start, "(eolinfo:worktree)", &p))
     -+		write_worktree_eolinfo_to_buf(sb, data->pathname);
     ++	else if (skip_prefix(start, "(eolinfo:index)", &p) &&
     ++		 S_ISREG(data->ce->ce_mode))
     ++		strbuf_addstr(sb, get_cached_convert_stats_ascii(data->istate,
     ++								 data->ce->name));
     ++	else if (skip_prefix(start, "(eolinfo:worktree)", &p) &&
     ++		 !lstat(data->pathname, &st) && S_ISREG(st.st_mode))
     ++		strbuf_addstr(sb, get_wt_convert_stats_ascii(data->pathname));
      +	else if (skip_prefix(start, "(eolattr)", &p))
     -+		write_eolattr_to_buf(sb, data->istate, data->pathname);
     ++		strbuf_addstr(sb, get_convert_attr_ascii(data->istate,
     ++			      data->pathname));
      +	else if (skip_prefix(start, "(path)", &p))
      +		write_name_to_buf(sb, data->pathname);
      +	else
     @@ builtin/ls-files.c: static void show_submodule(struct repository *superproject,
      +
      +static void show_ce_fmt(struct repository *repo, const struct cache_entry *ce,
      +			const char *format, const char *fullname) {
     -+
      +	struct show_index_data data = {
      +		.pathname = fullname,
      +		.istate = repo->index,
      +		.ce = ce,
      +	};
     -+
      +	struct strbuf sb = STRBUF_INIT;
     ++
      +	strbuf_expand(&sb, format, expand_show_index, &data);
      +	strbuf_addch(&sb, line_terminator);
      +	fwrite(sb.buf, sb.len, 1, stdout);
      +	strbuf_release(&sb);
     -+	return;
      +}
      +
       static void show_ce(struct repository *repo, struct dir_struct *dir,
     @@ t/t3013-ls-files-format.sh (new)
      +	test_cmp expect actual
      +'
      +
     -+test_expect_success 'git ls-files --format eolinfo:index' '
     -+	cat >expect <<-\EOF &&
     -+	lf
     -+	lf
     -+	EOF
     -+	git ls-files --format="%(eolinfo:index)" >actual &&
     -+	test_cmp expect actual
     -+'
     -+
     -+test_expect_success 'git ls-files --format eolinfo:worktree' '
     -+	cat >expect <<-\EOF &&
     -+	lf
     -+	lf
     -+	EOF
     -+	git ls-files --format="%(eolinfo:worktree)" >actual &&
     -+	test_cmp expect actual
     -+'
     -+
     -+test_expect_success 'git ls-files --format eolattr' '
     -+	printf "\n\n" >expect &&
     -+	git ls-files --format="%(eolattr)" >actual &&
     ++HT='	'
     ++WS='    '
     ++test_expect_success 'git ls-files --format v.s. --eol' '
     ++	git ls-files --eol >expect 2>err &&
     ++	test_must_be_empty err &&
     ++	git ls-files --format="i/%(eolinfo:index)${WS}w/%(eolinfo:worktree)${WS}attr/${WS}${WS}${WS}${WS} ${HT}%(path)" >actual 2>err &&
     ++	test_must_be_empty err &&
      +	test_cmp expect actual
      +'
      +


 Documentation/git-ls-files.txt | 38 +++++++++++++-
 builtin/ls-files.c             | 93 +++++++++++++++++++++++++++++++++
 t/t3013-ls-files-format.sh     | 94 ++++++++++++++++++++++++++++++++++
 3 files changed, 224 insertions(+), 1 deletion(-)
 create mode 100755 t/t3013-ls-files-format.sh

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index 0dabf3f0ddc..97d4cebba9f 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -20,7 +20,7 @@ SYNOPSIS
 		[--exclude-standard]
 		[--error-unmatch] [--with-tree=<tree-ish>]
 		[--full-name] [--recurse-submodules]
-		[--abbrev[=<n>]] [--] [<file>...]
+		[--abbrev[=<n>]] [--format=<format>] [--] [<file>...]
 
 DESCRIPTION
 -----------
@@ -192,6 +192,13 @@ followed by the  ("attr/<eolattr>").
 	to the contained files. Sparse directories will be shown with a
 	trailing slash, such as "x/" for a sparse directory "x".
 
+--format=<format>::
+	A string that interpolates `%(fieldname)` from the result being shown.
+	It also interpolates `%%` to `%`, and `%xx` where `xx` are hex digits
+	interpolates to character with hex code `xx`; for example `%00`
+	interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
+	--format cannot be combined with `-s`, `-o`, `-k`, `-t`, `--resolve-undo`
+	and `--eol`.
 \--::
 	Do not interpret any more arguments as options.
 
@@ -223,6 +230,35 @@ quoted as explained for the configuration variable `core.quotePath`
 (see linkgit:git-config[1]).  Using `-z` the filename is output
 verbatim and the line is terminated by a NUL byte.
 
+It is possible to print in a custom format by using the `--format`
+option, which is able to interpolate different fields using
+a `%(fieldname)` notation. For example, if you only care about the
+"objectname" and "path" fields, you can execute with a specific
+"--format" like
+
+	git ls-files --format='%(objectname) %(path)'
+
+FIELD NAMES
+-----------
+Various values from structured fields can be used to interpolate
+into the resulting output. For each outputting line, the following
+names can be used:
+
+objectmode::
+	The mode of the file which is recorded in the index.
+objectname::
+	The name of the file which is recorded in the index.
+stage::
+	The stage of the file which is recorded in the index.
+eolinfo:index::
+eolinfo:worktree::
+	The <eolinfo> (see the description of the `--eol` option) of
+	the contents in the index or in the worktree for the path.
+eolattr::
+	The <eolattr> (see the description of the `--eol` option)
+	that applies to the path.
+path::
+	The pathname of the file which is recorded in the index.
 
 EXCLUDE PATTERNS
 ----------------
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index e791b65e7e9..6376dbcccc6 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -11,6 +11,7 @@
 #include "quote.h"
 #include "dir.h"
 #include "builtin.h"
+#include "strbuf.h"
 #include "tree.h"
 #include "cache-tree.h"
 #include "parse-options.h"
@@ -48,6 +49,7 @@ static char *ps_matched;
 static const char *with_tree;
 static int exc_given;
 static int exclude_args;
+static const char *format;
 
 static const char *tag_cached = "";
 static const char *tag_unmerged = "";
@@ -85,6 +87,15 @@ static void write_name(const char *name)
 				   stdout, line_terminator);
 }
 
+static void write_name_to_buf(struct strbuf *sb, const char *name)
+{
+	const char *rel = relative_path(name, prefix_len ? prefix : NULL, sb);
+	if (line_terminator)
+		quote_c_style(rel, sb, NULL, 0);
+	else
+		strbuf_add(sb, rel, strlen(rel));
+}
+
 static const char *get_tag(const struct cache_entry *ce, const char *tag)
 {
 	static char alttag[4];
@@ -222,6 +233,72 @@ static void show_submodule(struct repository *superproject,
 	repo_clear(&subrepo);
 }
 
+struct show_index_data {
+	const char *pathname;
+	struct index_state *istate;
+	const struct cache_entry *ce;
+};
+
+static size_t expand_show_index(struct strbuf *sb, const char *start,
+			       void *context)
+{
+	struct show_index_data *data = context;
+	const char *end;
+	const char *p;
+	size_t len = strbuf_expand_literal_cb(sb, start, NULL);
+	struct stat st;
+
+	if (len)
+		return len;
+	if (*start != '(')
+		die(_("bad ls-files format: element '%s' "
+		      "does not start with '('"), start);
+
+	end = strchr(start + 1, ')');
+	if (!end)
+		die(_("bad ls-files format: element '%s'"
+		      "does not end in ')'"), start);
+
+	len = end - start + 1;
+	if (skip_prefix(start, "(objectmode)", &p))
+		strbuf_addf(sb, "%06o", data->ce->ce_mode);
+	else if (skip_prefix(start, "(objectname)", &p))
+		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
+	else if (skip_prefix(start, "(stage)", &p))
+		strbuf_addf(sb, "%d", ce_stage(data->ce));
+	else if (skip_prefix(start, "(eolinfo:index)", &p) &&
+		 S_ISREG(data->ce->ce_mode))
+		strbuf_addstr(sb, get_cached_convert_stats_ascii(data->istate,
+								 data->ce->name));
+	else if (skip_prefix(start, "(eolinfo:worktree)", &p) &&
+		 !lstat(data->pathname, &st) && S_ISREG(st.st_mode))
+		strbuf_addstr(sb, get_wt_convert_stats_ascii(data->pathname));
+	else if (skip_prefix(start, "(eolattr)", &p))
+		strbuf_addstr(sb, get_convert_attr_ascii(data->istate,
+			      data->pathname));
+	else if (skip_prefix(start, "(path)", &p))
+		write_name_to_buf(sb, data->pathname);
+	else
+		die(_("bad ls-files format: %%%.*s"), (int)len, start);
+
+	return len;
+}
+
+static void show_ce_fmt(struct repository *repo, const struct cache_entry *ce,
+			const char *format, const char *fullname) {
+	struct show_index_data data = {
+		.pathname = fullname,
+		.istate = repo->index,
+		.ce = ce,
+	};
+	struct strbuf sb = STRBUF_INIT;
+
+	strbuf_expand(&sb, format, expand_show_index, &data);
+	strbuf_addch(&sb, line_terminator);
+	fwrite(sb.buf, sb.len, 1, stdout);
+	strbuf_release(&sb);
+}
+
 static void show_ce(struct repository *repo, struct dir_struct *dir,
 		    const struct cache_entry *ce, const char *fullname,
 		    const char *tag)
@@ -236,6 +313,12 @@ static void show_ce(struct repository *repo, struct dir_struct *dir,
 				  max_prefix_len, ps_matched,
 				  S_ISDIR(ce->ce_mode) ||
 				  S_ISGITLINK(ce->ce_mode))) {
+		if (format) {
+			show_ce_fmt(repo, ce, format, fullname);
+			print_debug(ce);
+			return;
+		}
+
 		tag = get_tag(ce, tag);
 
 		if (!show_stage) {
@@ -675,6 +758,9 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 			 N_("suppress duplicate entries")),
 		OPT_BOOL(0, "sparse", &show_sparse_dirs,
 			 N_("show sparse directories in the presence of a sparse index")),
+		OPT_STRING_F(0, "format", &format, N_("format"),
+			     N_("format to use for the output"),
+			     PARSE_OPT_NONEG),
 		OPT_END()
 	};
 	int ret = 0;
@@ -699,6 +785,13 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 	for (i = 0; i < exclude_list.nr; i++) {
 		add_pattern(exclude_list.items[i].string, "", 0, pl, --exclude_args);
 	}
+
+	if (format && (show_stage || show_others || show_killed ||
+		show_resolve_undo || skipping_duplicates || show_eol || show_tag))
+			usage_msg_opt("--format cannot used with -s, -o, -k, -t"
+				      "--resolve-undo, --deduplicate, --eol",
+				      ls_files_usage, builtin_ls_files_options);
+
 	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
 		tag_cached = "H ";
 		tag_unmerged = "M ";
diff --git a/t/t3013-ls-files-format.sh b/t/t3013-ls-files-format.sh
new file mode 100755
index 00000000000..baf03f9096e
--- /dev/null
+++ b/t/t3013-ls-files-format.sh
@@ -0,0 +1,94 @@
+#!/bin/sh
+
+test_description='git ls-files --format test'
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+for flag in -s -o -k -t --resolve-undo --deduplicate --eol
+do
+	test_expect_success "usage: --format is incompatible with $flag" '
+		test_expect_code 129 git ls-files --format="%(objectname)" $flag
+	'
+done
+
+test_expect_success 'setup' '
+	echo o1 >o1 &&
+	echo o2 >o2 &&
+	git add o1 o2 &&
+	git add --chmod +x o1 &&
+	git commit -m base
+'
+
+test_expect_success 'git ls-files --format objectmode' '
+	cat >expect <<-\EOF &&
+	100755
+	100644
+	EOF
+	git ls-files --format="%(objectmode)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format objectname' '
+	oid1=$(git hash-object o1) &&
+	oid2=$(git hash-object o2) &&
+	cat >expect <<-EOF &&
+	$oid1
+	$oid2
+	EOF
+	git ls-files --format="%(objectname)" >actual &&
+	test_cmp expect actual
+'
+
+HT='	'
+WS='    '
+test_expect_success 'git ls-files --format v.s. --eol' '
+	git ls-files --eol >expect 2>err &&
+	test_must_be_empty err &&
+	git ls-files --format="i/%(eolinfo:index)${WS}w/%(eolinfo:worktree)${WS}attr/${WS}${WS}${WS}${WS} ${HT}%(path)" >actual 2>err &&
+	test_must_be_empty err &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format path' '
+	cat >expect <<-\EOF &&
+	o1
+	o2
+	EOF
+	git ls-files --format="%(path)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -m' '
+	echo change >o1 &&
+	cat >expect <<-\EOF &&
+	o1
+	EOF
+	git ls-files --format="%(path)" -m >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -d' '
+	echo o3 >o3 &&
+	git add o3 &&
+	rm o3 &&
+	cat >expect <<-\EOF &&
+	o3
+	EOF
+	git ls-files --format="%(path)" -d >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format imitate --stage' '
+	git ls-files --stage >expect &&
+	git ls-files --format="%(objectmode) %(objectname) %(stage)%x09%(path)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with --debug' '
+	git ls-files --debug >expect &&
+	git ls-files --format="%(path)" --debug >actual &&
+	test_cmp expect actual
+'
+
+test_done

base-commit: ab336e8f1c8009c8b1aab8deb592148e69217085
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH v6] ls-files: introduce "--format" option
  2022-07-11 16:53         ` [PATCH v6] " ZheNing Hu via GitGitGadget
@ 2022-07-11 22:11           ` Junio C Hamano
  2022-07-12 13:53             ` ZheNing Hu
  2022-07-13  6:07           ` [PATCH v7] " ZheNing Hu via GitGitGadget
  1 sibling, 1 reply; 61+ messages in thread
From: Junio C Hamano @ 2022-07-11 22:11 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Christian Couder, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Torsten Bögershausen, ZheNing Hu

"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: ZheNing Hu <adlternative@gmail.com>
>
> Add a new option --format that output index enties ...

Let's quote the options and use the Oxford comma.

    ls-files: introduce "--format" option

    Add a new option "--format" that outputs index entries in a
    custom format, taking inspiration from the option with the same
    name in the `git ls-tree` command.

    "--format" cannot used with "-s", "-o", "-k", "-t", "--resolve-undo",
    "--deduplicate", and "--eol".

> +It is possible to print in a custom format by using the `--format`
> +option, which is able to interpolate different fields using

So we use the term "field" to mean different piece of information we
can present.  The definition of what fields are available come later
and the presentation order is a bit awkward, but hopefully the text
is understandable as-is.

> +a `%(fieldname)` notation. For example, if you only care about the
> +"objectname" and "path" fields, you can execute with a specific
> +"--format" like
> +
> +	git ls-files --format='%(objectname) %(path)'

And the example makes it pretty clear.  OK.

> +FIELD NAMES
> +-----------
> +Various values from structured fields can be used to interpolate

Are we dealing with unstructured fields, too?  If not, let's drop
"structured".

> +into the resulting output. For each outputting line, the following
> +names can be used:

"outputting line" sounds like a non language.


    The way each path is shown can be customized by using the
    `--format=<format>` option, where the %(fieldname) in the
    <format> string for various aspects of the index entry are
    interpolated.  The following "fieldname" are understood:

perhaps?

> +
> +objectmode::
> +	The mode of the file which is recorded in the index.
> +objectname::
> +	The name of the file which is recorded in the index.
> +stage::
> +	The stage of the file which is recorded in the index.
> +eolinfo:index::
> +eolinfo:worktree::
> +	The <eolinfo> (see the description of the `--eol` option) of
> +	the contents in the index or in the worktree for the path.
> +eolattr::
> +	The <eolattr> (see the description of the `--eol` option)
> +	that applies to the path.

> +path::
> +	The pathname of the file which is recorded in the index.

Since we are mutually exclusive with "--other", the output always
comes from the index, so this is OK.

> diff --git a/builtin/ls-files.c b/builtin/ls-files.c
> index e791b65e7e9..6376dbcccc6 100644
> --- a/builtin/ls-files.c
> +++ b/builtin/ls-files.c
> @@ -11,6 +11,7 @@
>  #include "quote.h"
>  #include "dir.h"
>  #include "builtin.h"
> +#include "strbuf.h"

This is not strictly needed (we have users of strbuf in this file
without this patch already), but OK.

> @@ -48,6 +49,7 @@ static char *ps_matched;
>  static const char *with_tree;
>  static int exc_given;
>  static int exclude_args;
> +static const char *format;
>  
>  static const char *tag_cached = "";
>  static const char *tag_unmerged = "";
> @@ -85,6 +87,15 @@ static void write_name(const char *name)
>  				   stdout, line_terminator);
>  }
>  
> +static void write_name_to_buf(struct strbuf *sb, const char *name)
> +{
> +	const char *rel = relative_path(name, prefix_len ? prefix : NULL, sb);

A blank line here between the decl and the first statement.

> +	if (line_terminator)
> +		quote_c_style(rel, sb, NULL, 0);
> +	else
> +		strbuf_add(sb, rel, strlen(rel));

It's the same thing, but strbuf_addstr() is less error prone.

> @@ -222,6 +233,72 @@ static void show_submodule(struct repository *superproject,
>  	repo_clear(&subrepo);
>  }
>  
> +struct show_index_data {
> +	const char *pathname;
> +	struct index_state *istate;
> +	const struct cache_entry *ce;
> +};
> +
> +static size_t expand_show_index(struct strbuf *sb, const char *start,
> +			       void *context)

I think this does not make "struct" and "void" align (one more SP needed).

> +{
> +	struct show_index_data *data = context;
> +	const char *end;
> +	const char *p;
> +	size_t len = strbuf_expand_literal_cb(sb, start, NULL);
> +	struct stat st;
> +
> +	if (len)
> +		return len;
> +	if (*start != '(')
> +		die(_("bad ls-files format: element '%s' "
> +		      "does not start with '('"), start);
> +
> +	end = strchr(start + 1, ')');
> +	if (!end)
> +		die(_("bad ls-files format: element '%s'"
> +		      "does not end in ')'"), start);
> +
> +	len = end - start + 1;
> +	if (skip_prefix(start, "(objectmode)", &p))


Using skip_prefix() not for the purpose of skipping (notice that
nobody uses p at all) is ugly.  We already computed start and end
(hence the length), so we should be able to do much better than
this.

But let's let it pass, as it was copy-pasted from existing code in
ls-tree.c::expand_show_tree().

> +	else if (skip_prefix(start, "(eolinfo:index)", &p) &&
> +		 S_ISREG(data->ce->ce_mode))
> +		strbuf_addstr(sb, get_cached_convert_stats_ascii(data->istate,
> +								 data->ce->name));

This is outright wrong, isn't it?

It is unlikely to see such a trivial error in the 6th round of a
series after other reviewers looked at it many times, so perhaps I
am missing something?  Or perhaps this is a new code added in this
round.

If you ask for %(eolinfo:index) for an index entry that is not a
regular file, this "else if" will not trigger, and the control will
eventually fall through to hit "bad ls-files format" but what you
detected is not a bad format at all.  Once the skip_prefix() hits,
you should be committed to handle that "field" and never let the
other choices in this if/elif/ cascade to see it.

It is OK to interpolate %(eolinfo:index) to an empty string for a
gitlink and a symbolic link, but the right way to do so would
probably be:

	else if (skip_prefix(start, "(eolinfo:index)", &p) {
		if (S_ISREG(data->ce->ce_mode))
			strbuf_addstr(...);
	} else ...

> +	else if (skip_prefix(start, "(eolinfo:worktree)", &p) &&
> +		 !lstat(data->pathname, &st) && S_ISREG(st.st_mode))
> +		strbuf_addstr(sb, get_wt_convert_stats_ascii(data->pathname));

Likewise.

> +test_expect_success 'setup' '
> +	echo o1 >o1 &&
> +	echo o2 >o2 &&
> +	git add o1 o2 &&
> +	git add --chmod +x o1 &&
> +	git commit -m base
> +'

Apparently, this set-up is too trivial to uncover the above bug that
can be spotted in 10 seconds of staring at the code.  Perhaps add a
symbolic link (use "git update-index --cacheinfo" and you do not
have to worry about Windows), a subdirectory and a submodule?


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v6] ls-files: introduce "--format" option
  2022-07-11 22:11           ` Junio C Hamano
@ 2022-07-12 13:53             ` ZheNing Hu
  2022-07-12 14:34               ` Junio C Hamano
  0 siblings, 1 reply; 61+ messages in thread
From: ZheNing Hu @ 2022-07-12 13:53 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Torsten Bögershausen

Junio C Hamano <gitster@pobox.com> 于2022年7月12日周二 06:11写道:
>
> "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > Add a new option --format that output index enties ...
>
> Let's quote the options and use the Oxford comma.
>
>     ls-files: introduce "--format" option
>
>     Add a new option "--format" that outputs index entries in a
>     custom format, taking inspiration from the option with the same
>     name in the `git ls-tree` command.
>
>     "--format" cannot used with "-s", "-o", "-k", "-t", "--resolve-undo",
>     "--deduplicate", and "--eol".
>
> > +It is possible to print in a custom format by using the `--format`
> > +option, which is able to interpolate different fields using
>
> So we use the term "field" to mean different piece of information we
> can present.  The definition of what fields are available come later
> and the presentation order is a bit awkward, but hopefully the text
> is understandable as-is.
>

OK.

> > +a `%(fieldname)` notation. For example, if you only care about the
> > +"objectname" and "path" fields, you can execute with a specific
> > +"--format" like
> > +
> > +     git ls-files --format='%(objectname) %(path)'
>
> And the example makes it pretty clear.  OK.
>
> > +FIELD NAMES
> > +-----------
> > +Various values from structured fields can be used to interpolate
>
> Are we dealing with unstructured fields, too?  If not, let's drop
> "structured".
>

OK (copy from git-ls-tree.txt too)

> > +into the resulting output. For each outputting line, the following
> > +names can be used:
>
> "outputting line" sounds like a non language.
>
>
>     The way each path is shown can be customized by using the
>     `--format=<format>` option, where the %(fieldname) in the
>     <format> string for various aspects of the index entry are
>     interpolated.  The following "fieldname" are understood:
>
> perhaps?
>

This will indeed be better.

> > +{
> > +     struct show_index_data *data = context;
> > +     const char *end;
> > +     const char *p;
> > +     size_t len = strbuf_expand_literal_cb(sb, start, NULL);
> > +     struct stat st;
> > +
> > +     if (len)
> > +             return len;
> > +     if (*start != '(')
> > +             die(_("bad ls-files format: element '%s' "
> > +                   "does not start with '('"), start);
> > +
> > +     end = strchr(start + 1, ')');
> > +     if (!end)
> > +             die(_("bad ls-files format: element '%s'"
> > +                   "does not end in ')'"), start);
> > +
> > +     len = end - start + 1;
> > +     if (skip_prefix(start, "(objectmode)", &p))
>
>
> Using skip_prefix() not for the purpose of skipping (notice that
> nobody uses p at all) is ugly.  We already computed start and end
> (hence the length), so we should be able to do much better than
> this.
>

Agree. I check the parsing format part of ref-filter.c, we just need to find the
atom's begin pos and end pos, then we can use memcmp() to know what's the
type of atom.

> But let's let it pass, as it was copy-pasted from existing code in
> ls-tree.c::expand_show_tree().
>

Yeah, maybe we can optimize it later.

> > +     else if (skip_prefix(start, "(eolinfo:index)", &p) &&
> > +              S_ISREG(data->ce->ce_mode))
> > +             strbuf_addstr(sb, get_cached_convert_stats_ascii(data->istate,
> > +                                                              data->ce->name));
>
> This is outright wrong, isn't it?
>
> It is unlikely to see such a trivial error in the 6th round of a
> series after other reviewers looked at it many times, so perhaps I
> am missing something?  Or perhaps this is a new code added in this
> round.
>
> If you ask for %(eolinfo:index) for an index entry that is not a
> regular file, this "else if" will not trigger, and the control will
> eventually fall through to hit "bad ls-files format" but what you
> detected is not a bad format at all.  Once the skip_prefix() hits,
> you should be committed to handle that "field" and never let the
> other choices in this if/elif/ cascade to see it.
>
> It is OK to interpolate %(eolinfo:index) to an empty string for a
> gitlink and a symbolic link, but the right way to do so would
> probably be:
>
>         else if (skip_prefix(start, "(eolinfo:index)", &p) {
>                 if (S_ISREG(data->ce->ce_mode))
>                         strbuf_addstr(...);
>         } else ...
>

Yeah, but we would use "{", "}" again, so just revert this code to v5,
which uses a
 wrap function.

> > +     else if (skip_prefix(start, "(eolinfo:worktree)", &p) &&
> > +              !lstat(data->pathname, &st) && S_ISREG(st.st_mode))
> > +             strbuf_addstr(sb, get_wt_convert_stats_ascii(data->pathname));
>
> Likewise.
>
> > +test_expect_success 'setup' '
> > +     echo o1 >o1 &&
> > +     echo o2 >o2 &&
> > +     git add o1 o2 &&
> > +     git add --chmod +x o1 &&
> > +     git commit -m base
> > +'
>
> Apparently, this set-up is too trivial to uncover the above bug that
> can be spotted in 10 seconds of staring at the code.  Perhaps add a
> symbolic link (use "git update-index --cacheinfo" and you do not
> have to worry about Windows), a subdirectory and a submodule?
>

Ah, Just looking at the c code, I took a long time (more than 10 minutes) to
find out where the mistake was. But yeah, use a subdirectory can quickly
meet the error,  so I need to add more cases here.

Thanks for your review.

ZheNing Hu

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v6] ls-files: introduce "--format" option
  2022-07-12 13:53             ` ZheNing Hu
@ 2022-07-12 14:34               ` Junio C Hamano
  0 siblings, 0 replies; 61+ messages in thread
From: Junio C Hamano @ 2022-07-12 14:34 UTC (permalink / raw)
  To: ZheNing Hu
  Cc: ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Torsten Bögershausen

ZheNing Hu <adlternative@gmail.com> writes:

>> > +test_expect_success 'setup' '
>> > +     echo o1 >o1 &&
>> > +     echo o2 >o2 &&
>> > +     git add o1 o2 &&
>> > +     git add --chmod +x o1 &&
>> > +     git commit -m base
>> > +'
>>
>> Apparently, this set-up is too trivial to uncover the above bug that
>> can be spotted in 10 seconds of staring at the code.  Perhaps add a
>> symbolic link (use "git update-index --cacheinfo" and you do not
>> have to worry about Windows), a subdirectory and a submodule?
>
> Ah, Just looking at the c code, I took a long time (more than 10 minutes) to
> find out where the mistake was.

It is OK---it tends to take a lot more time than it should for all
of us, experienced developers included, to find mistakes in our own
code than code written by other people.

> But yeah, use a subdirectory can quickly meet the error, so I need
> to add more cases here.

A pure sub-"directory" would not, I suspect.  A submodule or a
symbolic link would. 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v7] ls-files: introduce "--format" option
  2022-07-11 16:53         ` [PATCH v6] " ZheNing Hu via GitGitGadget
  2022-07-11 22:11           ` Junio C Hamano
@ 2022-07-13  6:07           ` ZheNing Hu via GitGitGadget
  2022-07-18  8:09             ` Ævar Arnfjörð Bjarmason
  2022-07-20 16:36             ` [PATCH v8] " ZheNing Hu via GitGitGadget
  1 sibling, 2 replies; 61+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2022-07-13  6:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Torsten Bögershausen, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Add a new option "--format" that outputs index entries
informations in a custom format, taking inspiration
from the option with the same name in the `git ls-tree`
command.

"--format" cannot used with "-s", "-o", "-k", "-t",
" --resolve-undo","--deduplicate" and "--eol".

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
    ls-files: introduce "--format" options
    
    v6->v7:
    
     1. Change documents helped by Junio.
     2. Fix bug of parsing format.
     3. Add more test cases for other mode index entries (120000 and
        160000).

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1262%2Fadlternative%2Fzh%2Fls-file-format-v7
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1262/adlternative/zh/ls-file-format-v7
Pull-Request: https://github.com/gitgitgadget/git/pull/1262

Range-diff vs v6:

 1:  57ed2c15728 ! 1:  9ca22edba94 ls-files: introduce "--format" option
     @@ Metadata
       ## Commit message ##
          ls-files: introduce "--format" option
      
     -    Add a new option --format that output index enties
     -    informations with custom format, taking inspiration
     +    Add a new option "--format" that outputs index entries
     +    informations in a custom format, taking inspiration
          from the option with the same name in the `git ls-tree`
          command.
      
     -    --format cannot used with -s, -o, -k, -t, --resolve-undo,
     -    --deduplicate and --eol.
     +    "--format" cannot used with "-s", "-o", "-k", "-t",
     +    " --resolve-undo","--deduplicate" and "--eol".
      
          Signed-off-by: ZheNing Hu <adlternative@gmail.com>
      
     @@ Documentation/git-ls-files.txt: quoted as explained for the configuration variab
      +
      +FIELD NAMES
      +-----------
     -+Various values from structured fields can be used to interpolate
     -+into the resulting output. For each outputting line, the following
     -+names can be used:
     ++The way each path is shown can be customized by using the
     ++`--format=<format>` option, where the %(fieldname) in the
     ++<format> string for various aspects of the index entry are
     ++interpolated.  The following "fieldname" are understood:
      +
      +objectmode::
      +	The mode of the file which is recorded in the index.
     @@ builtin/ls-files.c: static void write_name(const char *name)
      +static void write_name_to_buf(struct strbuf *sb, const char *name)
      +{
      +	const char *rel = relative_path(name, prefix_len ? prefix : NULL, sb);
     ++
      +	if (line_terminator)
      +		quote_c_style(rel, sb, NULL, 0);
      +	else
     -+		strbuf_add(sb, rel, strlen(rel));
     ++		strbuf_addstr(sb, rel);
      +}
      +
       static const char *get_tag(const struct cache_entry *ce, const char *tag)
     @@ builtin/ls-files.c: static void show_submodule(struct repository *superproject,
      +};
      +
      +static size_t expand_show_index(struct strbuf *sb, const char *start,
     -+			       void *context)
     ++				void *context)
      +{
      +	struct show_index_data *data = context;
      +	const char *end;
     @@ builtin/ls-files.c: static void show_submodule(struct repository *superproject,
      +		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
      +	else if (skip_prefix(start, "(stage)", &p))
      +		strbuf_addf(sb, "%d", ce_stage(data->ce));
     -+	else if (skip_prefix(start, "(eolinfo:index)", &p) &&
     -+		 S_ISREG(data->ce->ce_mode))
     -+		strbuf_addstr(sb, get_cached_convert_stats_ascii(data->istate,
     -+								 data->ce->name));
     -+	else if (skip_prefix(start, "(eolinfo:worktree)", &p) &&
     -+		 !lstat(data->pathname, &st) && S_ISREG(st.st_mode))
     -+		strbuf_addstr(sb, get_wt_convert_stats_ascii(data->pathname));
     ++	else if (skip_prefix(start, "(eolinfo:index)", &p))
     ++		strbuf_addstr(sb, S_ISREG(data->ce->ce_mode) ?
     ++			      get_cached_convert_stats_ascii(data->istate,
     ++			      data->ce->name) : "");
     ++	else if (skip_prefix(start, "(eolinfo:worktree)", &p))
     ++		strbuf_addstr(sb, !lstat(data->pathname, &st) &&
     ++			      S_ISREG(st.st_mode) ?
     ++			      get_wt_convert_stats_ascii(data->pathname) : "");
      +	else if (skip_prefix(start, "(eolattr)", &p))
      +		strbuf_addstr(sb, get_convert_attr_ascii(data->istate,
      +			      data->pathname));
     @@ t/t3013-ls-files-format.sh (new)
      +done
      +
      +test_expect_success 'setup' '
     -+	echo o1 >o1 &&
     -+	echo o2 >o2 &&
     -+	git add o1 o2 &&
     -+	git add --chmod +x o1 &&
     ++	printf "LINEONE\nLINETWO\nLINETHREE\n" >o1.txt &&
     ++	printf "LINEONE\r\nLINETWO\r\nLINETHREE\r\n" >o2.txt &&
     ++	printf "LINEONE\r\nLINETWO\nLINETHREE\n" >o3.txt &&
     ++	ln -s o3.txt o4.txt &&
     ++	git add "*.txt" &&
     ++	git add --chmod +x o1.txt &&
     ++	git update-index --add --cacheinfo 160000 $(git hash-object o1.txt) o5.txt &&
      +	git commit -m base
      +'
      +
     -+test_expect_success 'git ls-files --format objectmode' '
     -+	cat >expect <<-\EOF &&
     -+	100755
     -+	100644
     -+	EOF
     ++test_expect_success 'git ls-files --format objectmode v.s. -s' '
     ++	git ls-files -s | awk "{print \$1}" >expect &&
      +	git ls-files --format="%(objectmode)" >actual &&
      +	test_cmp expect actual
      +'
      +
     -+test_expect_success 'git ls-files --format objectname' '
     -+	oid1=$(git hash-object o1) &&
     -+	oid2=$(git hash-object o2) &&
     -+	cat >expect <<-EOF &&
     -+	$oid1
     -+	$oid2
     -+	EOF
     ++test_expect_success 'git ls-files --format objectname v.s. -s' '
     ++	git ls-files -s | awk "{print \$2}" >expect &&
      +	git ls-files --format="%(objectname)" >actual &&
      +	test_cmp expect actual
      +'
      +
     -+HT='	'
     -+WS='    '
      +test_expect_success 'git ls-files --format v.s. --eol' '
     -+	git ls-files --eol >expect 2>err &&
     ++	git ls-files --eol >tmp &&
     ++	sed -e "s/	/ /g" -e "s/  */ /g" tmp >expect 2>err &&
      +	test_must_be_empty err &&
     -+	git ls-files --format="i/%(eolinfo:index)${WS}w/%(eolinfo:worktree)${WS}attr/${WS}${WS}${WS}${WS} ${HT}%(path)" >actual 2>err &&
     ++	git ls-files --format="i/%(eolinfo:index) w/%(eolinfo:worktree) attr/%(eolattr) %(path)" >actual 2>err &&
      +	test_must_be_empty err &&
      +	test_cmp expect actual
      +'
      +
     -+test_expect_success 'git ls-files --format path' '
     -+	cat >expect <<-\EOF &&
     -+	o1
     -+	o2
     -+	EOF
     ++test_expect_success 'git ls-files --format path v.s. -s' '
     ++	git ls-files -s | awk "{print \$4}" >expect &&
      +	git ls-files --format="%(path)" >actual &&
      +	test_cmp expect actual
      +'
      +
      +test_expect_success 'git ls-files --format with -m' '
     -+	echo change >o1 &&
     ++	echo change >o1.txt &&
      +	cat >expect <<-\EOF &&
     -+	o1
     ++	o1.txt
     ++	o5.txt
      +	EOF
      +	git ls-files --format="%(path)" -m >actual &&
      +	test_cmp expect actual
      +'
      +
      +test_expect_success 'git ls-files --format with -d' '
     -+	echo o3 >o3 &&
     -+	git add o3 &&
     -+	rm o3 &&
     ++	echo o6 >o6.txt &&
     ++	git add o6.txt &&
     ++	rm o6.txt &&
      +	cat >expect <<-\EOF &&
     -+	o3
     ++	o5.txt
     ++	o6.txt
      +	EOF
      +	git ls-files --format="%(path)" -d >actual &&
      +	test_cmp expect actual


 Documentation/git-ls-files.txt | 39 +++++++++++++-
 builtin/ls-files.c             | 95 ++++++++++++++++++++++++++++++++++
 t/t3013-ls-files-format.sh     | 87 +++++++++++++++++++++++++++++++
 3 files changed, 220 insertions(+), 1 deletion(-)
 create mode 100755 t/t3013-ls-files-format.sh

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index 0dabf3f0ddc..d7986419c25 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -20,7 +20,7 @@ SYNOPSIS
 		[--exclude-standard]
 		[--error-unmatch] [--with-tree=<tree-ish>]
 		[--full-name] [--recurse-submodules]
-		[--abbrev[=<n>]] [--] [<file>...]
+		[--abbrev[=<n>]] [--format=<format>] [--] [<file>...]
 
 DESCRIPTION
 -----------
@@ -192,6 +192,13 @@ followed by the  ("attr/<eolattr>").
 	to the contained files. Sparse directories will be shown with a
 	trailing slash, such as "x/" for a sparse directory "x".
 
+--format=<format>::
+	A string that interpolates `%(fieldname)` from the result being shown.
+	It also interpolates `%%` to `%`, and `%xx` where `xx` are hex digits
+	interpolates to character with hex code `xx`; for example `%00`
+	interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
+	--format cannot be combined with `-s`, `-o`, `-k`, `-t`, `--resolve-undo`
+	and `--eol`.
 \--::
 	Do not interpret any more arguments as options.
 
@@ -223,6 +230,36 @@ quoted as explained for the configuration variable `core.quotePath`
 (see linkgit:git-config[1]).  Using `-z` the filename is output
 verbatim and the line is terminated by a NUL byte.
 
+It is possible to print in a custom format by using the `--format`
+option, which is able to interpolate different fields using
+a `%(fieldname)` notation. For example, if you only care about the
+"objectname" and "path" fields, you can execute with a specific
+"--format" like
+
+	git ls-files --format='%(objectname) %(path)'
+
+FIELD NAMES
+-----------
+The way each path is shown can be customized by using the
+`--format=<format>` option, where the %(fieldname) in the
+<format> string for various aspects of the index entry are
+interpolated.  The following "fieldname" are understood:
+
+objectmode::
+	The mode of the file which is recorded in the index.
+objectname::
+	The name of the file which is recorded in the index.
+stage::
+	The stage of the file which is recorded in the index.
+eolinfo:index::
+eolinfo:worktree::
+	The <eolinfo> (see the description of the `--eol` option) of
+	the contents in the index or in the worktree for the path.
+eolattr::
+	The <eolattr> (see the description of the `--eol` option)
+	that applies to the path.
+path::
+	The pathname of the file which is recorded in the index.
 
 EXCLUDE PATTERNS
 ----------------
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index e791b65e7e9..6f3ebcaaff7 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -11,6 +11,7 @@
 #include "quote.h"
 #include "dir.h"
 #include "builtin.h"
+#include "strbuf.h"
 #include "tree.h"
 #include "cache-tree.h"
 #include "parse-options.h"
@@ -48,6 +49,7 @@ static char *ps_matched;
 static const char *with_tree;
 static int exc_given;
 static int exclude_args;
+static const char *format;
 
 static const char *tag_cached = "";
 static const char *tag_unmerged = "";
@@ -85,6 +87,16 @@ static void write_name(const char *name)
 				   stdout, line_terminator);
 }
 
+static void write_name_to_buf(struct strbuf *sb, const char *name)
+{
+	const char *rel = relative_path(name, prefix_len ? prefix : NULL, sb);
+
+	if (line_terminator)
+		quote_c_style(rel, sb, NULL, 0);
+	else
+		strbuf_addstr(sb, rel);
+}
+
 static const char *get_tag(const struct cache_entry *ce, const char *tag)
 {
 	static char alttag[4];
@@ -222,6 +234,73 @@ static void show_submodule(struct repository *superproject,
 	repo_clear(&subrepo);
 }
 
+struct show_index_data {
+	const char *pathname;
+	struct index_state *istate;
+	const struct cache_entry *ce;
+};
+
+static size_t expand_show_index(struct strbuf *sb, const char *start,
+				void *context)
+{
+	struct show_index_data *data = context;
+	const char *end;
+	const char *p;
+	size_t len = strbuf_expand_literal_cb(sb, start, NULL);
+	struct stat st;
+
+	if (len)
+		return len;
+	if (*start != '(')
+		die(_("bad ls-files format: element '%s' "
+		      "does not start with '('"), start);
+
+	end = strchr(start + 1, ')');
+	if (!end)
+		die(_("bad ls-files format: element '%s'"
+		      "does not end in ')'"), start);
+
+	len = end - start + 1;
+	if (skip_prefix(start, "(objectmode)", &p))
+		strbuf_addf(sb, "%06o", data->ce->ce_mode);
+	else if (skip_prefix(start, "(objectname)", &p))
+		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
+	else if (skip_prefix(start, "(stage)", &p))
+		strbuf_addf(sb, "%d", ce_stage(data->ce));
+	else if (skip_prefix(start, "(eolinfo:index)", &p))
+		strbuf_addstr(sb, S_ISREG(data->ce->ce_mode) ?
+			      get_cached_convert_stats_ascii(data->istate,
+			      data->ce->name) : "");
+	else if (skip_prefix(start, "(eolinfo:worktree)", &p))
+		strbuf_addstr(sb, !lstat(data->pathname, &st) &&
+			      S_ISREG(st.st_mode) ?
+			      get_wt_convert_stats_ascii(data->pathname) : "");
+	else if (skip_prefix(start, "(eolattr)", &p))
+		strbuf_addstr(sb, get_convert_attr_ascii(data->istate,
+			      data->pathname));
+	else if (skip_prefix(start, "(path)", &p))
+		write_name_to_buf(sb, data->pathname);
+	else
+		die(_("bad ls-files format: %%%.*s"), (int)len, start);
+
+	return len;
+}
+
+static void show_ce_fmt(struct repository *repo, const struct cache_entry *ce,
+			const char *format, const char *fullname) {
+	struct show_index_data data = {
+		.pathname = fullname,
+		.istate = repo->index,
+		.ce = ce,
+	};
+	struct strbuf sb = STRBUF_INIT;
+
+	strbuf_expand(&sb, format, expand_show_index, &data);
+	strbuf_addch(&sb, line_terminator);
+	fwrite(sb.buf, sb.len, 1, stdout);
+	strbuf_release(&sb);
+}
+
 static void show_ce(struct repository *repo, struct dir_struct *dir,
 		    const struct cache_entry *ce, const char *fullname,
 		    const char *tag)
@@ -236,6 +315,12 @@ static void show_ce(struct repository *repo, struct dir_struct *dir,
 				  max_prefix_len, ps_matched,
 				  S_ISDIR(ce->ce_mode) ||
 				  S_ISGITLINK(ce->ce_mode))) {
+		if (format) {
+			show_ce_fmt(repo, ce, format, fullname);
+			print_debug(ce);
+			return;
+		}
+
 		tag = get_tag(ce, tag);
 
 		if (!show_stage) {
@@ -675,6 +760,9 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 			 N_("suppress duplicate entries")),
 		OPT_BOOL(0, "sparse", &show_sparse_dirs,
 			 N_("show sparse directories in the presence of a sparse index")),
+		OPT_STRING_F(0, "format", &format, N_("format"),
+			     N_("format to use for the output"),
+			     PARSE_OPT_NONEG),
 		OPT_END()
 	};
 	int ret = 0;
@@ -699,6 +787,13 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 	for (i = 0; i < exclude_list.nr; i++) {
 		add_pattern(exclude_list.items[i].string, "", 0, pl, --exclude_args);
 	}
+
+	if (format && (show_stage || show_others || show_killed ||
+		show_resolve_undo || skipping_duplicates || show_eol || show_tag))
+			usage_msg_opt("--format cannot used with -s, -o, -k, -t"
+				      "--resolve-undo, --deduplicate, --eol",
+				      ls_files_usage, builtin_ls_files_options);
+
 	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
 		tag_cached = "H ";
 		tag_unmerged = "M ";
diff --git a/t/t3013-ls-files-format.sh b/t/t3013-ls-files-format.sh
new file mode 100755
index 00000000000..e62bce70f3b
--- /dev/null
+++ b/t/t3013-ls-files-format.sh
@@ -0,0 +1,87 @@
+#!/bin/sh
+
+test_description='git ls-files --format test'
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+for flag in -s -o -k -t --resolve-undo --deduplicate --eol
+do
+	test_expect_success "usage: --format is incompatible with $flag" '
+		test_expect_code 129 git ls-files --format="%(objectname)" $flag
+	'
+done
+
+test_expect_success 'setup' '
+	printf "LINEONE\nLINETWO\nLINETHREE\n" >o1.txt &&
+	printf "LINEONE\r\nLINETWO\r\nLINETHREE\r\n" >o2.txt &&
+	printf "LINEONE\r\nLINETWO\nLINETHREE\n" >o3.txt &&
+	ln -s o3.txt o4.txt &&
+	git add "*.txt" &&
+	git add --chmod +x o1.txt &&
+	git update-index --add --cacheinfo 160000 $(git hash-object o1.txt) o5.txt &&
+	git commit -m base
+'
+
+test_expect_success 'git ls-files --format objectmode v.s. -s' '
+	git ls-files -s | awk "{print \$1}" >expect &&
+	git ls-files --format="%(objectmode)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format objectname v.s. -s' '
+	git ls-files -s | awk "{print \$2}" >expect &&
+	git ls-files --format="%(objectname)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format v.s. --eol' '
+	git ls-files --eol >tmp &&
+	sed -e "s/	/ /g" -e "s/  */ /g" tmp >expect 2>err &&
+	test_must_be_empty err &&
+	git ls-files --format="i/%(eolinfo:index) w/%(eolinfo:worktree) attr/%(eolattr) %(path)" >actual 2>err &&
+	test_must_be_empty err &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format path v.s. -s' '
+	git ls-files -s | awk "{print \$4}" >expect &&
+	git ls-files --format="%(path)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -m' '
+	echo change >o1.txt &&
+	cat >expect <<-\EOF &&
+	o1.txt
+	o5.txt
+	EOF
+	git ls-files --format="%(path)" -m >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -d' '
+	echo o6 >o6.txt &&
+	git add o6.txt &&
+	rm o6.txt &&
+	cat >expect <<-\EOF &&
+	o5.txt
+	o6.txt
+	EOF
+	git ls-files --format="%(path)" -d >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format imitate --stage' '
+	git ls-files --stage >expect &&
+	git ls-files --format="%(objectmode) %(objectname) %(stage)%x09%(path)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with --debug' '
+	git ls-files --debug >expect &&
+	git ls-files --format="%(path)" --debug >actual &&
+	test_cmp expect actual
+'
+
+test_done

base-commit: ab336e8f1c8009c8b1aab8deb592148e69217085
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH v7] ls-files: introduce "--format" option
  2022-07-13  6:07           ` [PATCH v7] " ZheNing Hu via GitGitGadget
@ 2022-07-18  8:09             ` Ævar Arnfjörð Bjarmason
  2022-07-19 16:19               ` ZheNing Hu
  2022-07-20 16:36             ` [PATCH v8] " ZheNing Hu via GitGitGadget
  1 sibling, 1 reply; 61+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-07-18  8:09 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Junio C Hamano, Christian Couder, Phillip Wood,
	Torsten Bögershausen, ZheNing Hu


On Wed, Jul 13 2022, ZheNing Hu via GitGitGadget wrote:

> From: ZheNing Hu <adlternative@gmail.com>
>
> Add a new option "--format" that outputs index entries
> informations in a custom format, taking inspiration
> from the option with the same name in the `git ls-tree`
> command.
>
> "--format" cannot used with "-s", "-o", "-k", "-t",
> " --resolve-undo","--deduplicate" and "--eol".
>
> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---
>     ls-files: introduce "--format" options
>     
>     v6->v7:
>     
>      1. Change documents helped by Junio.
>      2. Fix bug of parsing format.
>      3. Add more test cases for other mode index entries (120000 and
>         160000).
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1262%2Fadlternative%2Fzh%2Fls-file-format-v7
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1262/adlternative/zh/ls-file-format-v7
> Pull-Request: https://github.com/gitgitgadget/git/pull/1262
>
> Range-diff vs v6:
>
>  1:  57ed2c15728 ! 1:  9ca22edba94 ls-files: introduce "--format" option
>      @@ Metadata
>        ## Commit message ##
>           ls-files: introduce "--format" option
>       
>      -    Add a new option --format that output index enties
>      -    informations with custom format, taking inspiration
>      +    Add a new option "--format" that outputs index entries
>      +    informations in a custom format, taking inspiration
>           from the option with the same name in the `git ls-tree`
>           command.
>       
>      -    --format cannot used with -s, -o, -k, -t, --resolve-undo,
>      -    --deduplicate and --eol.
>      +    "--format" cannot used with "-s", "-o", "-k", "-t",
>      +    " --resolve-undo","--deduplicate" and "--eol".
>       
>           Signed-off-by: ZheNing Hu <adlternative@gmail.com>
>       
>      @@ Documentation/git-ls-files.txt: quoted as explained for the configuration variab
>       +
>       +FIELD NAMES
>       +-----------
>      -+Various values from structured fields can be used to interpolate
>      -+into the resulting output. For each outputting line, the following
>      -+names can be used:
>      ++The way each path is shown can be customized by using the
>      ++`--format=<format>` option, where the %(fieldname) in the
>      ++<format> string for various aspects of the index entry are
>      ++interpolated.  The following "fieldname" are understood:
>       +
>       +objectmode::
>       +	The mode of the file which is recorded in the index.
>      @@ builtin/ls-files.c: static void write_name(const char *name)
>       +static void write_name_to_buf(struct strbuf *sb, const char *name)
>       +{
>       +	const char *rel = relative_path(name, prefix_len ? prefix : NULL, sb);
>      ++
>       +	if (line_terminator)
>       +		quote_c_style(rel, sb, NULL, 0);
>       +	else
>      -+		strbuf_add(sb, rel, strlen(rel));
>      ++		strbuf_addstr(sb, rel);
>       +}
>       +
>        static const char *get_tag(const struct cache_entry *ce, const char *tag)
>      @@ builtin/ls-files.c: static void show_submodule(struct repository *superproject,
>       +};
>       +
>       +static size_t expand_show_index(struct strbuf *sb, const char *start,
>      -+			       void *context)
>      ++				void *context)
>       +{
>       +	struct show_index_data *data = context;
>       +	const char *end;
>      @@ builtin/ls-files.c: static void show_submodule(struct repository *superproject,
>       +		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
>       +	else if (skip_prefix(start, "(stage)", &p))
>       +		strbuf_addf(sb, "%d", ce_stage(data->ce));
>      -+	else if (skip_prefix(start, "(eolinfo:index)", &p) &&
>      -+		 S_ISREG(data->ce->ce_mode))
>      -+		strbuf_addstr(sb, get_cached_convert_stats_ascii(data->istate,
>      -+								 data->ce->name));
>      -+	else if (skip_prefix(start, "(eolinfo:worktree)", &p) &&
>      -+		 !lstat(data->pathname, &st) && S_ISREG(st.st_mode))
>      -+		strbuf_addstr(sb, get_wt_convert_stats_ascii(data->pathname));
>      ++	else if (skip_prefix(start, "(eolinfo:index)", &p))
>      ++		strbuf_addstr(sb, S_ISREG(data->ce->ce_mode) ?
>      ++			      get_cached_convert_stats_ascii(data->istate,
>      ++			      data->ce->name) : "");
>      ++	else if (skip_prefix(start, "(eolinfo:worktree)", &p))
>      ++		strbuf_addstr(sb, !lstat(data->pathname, &st) &&
>      ++			      S_ISREG(st.st_mode) ?
>      ++			      get_wt_convert_stats_ascii(data->pathname) : "");
>       +	else if (skip_prefix(start, "(eolattr)", &p))
>       +		strbuf_addstr(sb, get_convert_attr_ascii(data->istate,
>       +			      data->pathname));
>      @@ t/t3013-ls-files-format.sh (new)
>       +done
>       +
>       +test_expect_success 'setup' '
>      -+	echo o1 >o1 &&
>      -+	echo o2 >o2 &&
>      -+	git add o1 o2 &&
>      -+	git add --chmod +x o1 &&
>      ++	printf "LINEONE\nLINETWO\nLINETHREE\n" >o1.txt &&
>      ++	printf "LINEONE\r\nLINETWO\r\nLINETHREE\r\n" >o2.txt &&
>      ++	printf "LINEONE\r\nLINETWO\nLINETHREE\n" >o3.txt &&
>      ++	ln -s o3.txt o4.txt &&
>      ++	git add "*.txt" &&
>      ++	git add --chmod +x o1.txt &&
>      ++	git update-index --add --cacheinfo 160000 $(git hash-object o1.txt) o5.txt &&
>       +	git commit -m base
>       +'
>       +
>      -+test_expect_success 'git ls-files --format objectmode' '
>      -+	cat >expect <<-\EOF &&
>      -+	100755
>      -+	100644
>      -+	EOF
>      ++test_expect_success 'git ls-files --format objectmode v.s. -s' '
>      ++	git ls-files -s | awk "{print \$1}" >expect &&
>       +	git ls-files --format="%(objectmode)" >actual &&
>       +	test_cmp expect actual
>       +'
>       +
>      -+test_expect_success 'git ls-files --format objectname' '
>      -+	oid1=$(git hash-object o1) &&
>      -+	oid2=$(git hash-object o2) &&
>      -+	cat >expect <<-EOF &&
>      -+	$oid1
>      -+	$oid2
>      -+	EOF
>      ++test_expect_success 'git ls-files --format objectname v.s. -s' '
>      ++	git ls-files -s | awk "{print \$2}" >expect &&
>       +	git ls-files --format="%(objectname)" >actual &&
>       +	test_cmp expect actual
>       +'
>       +
>      -+HT='	'
>      -+WS='    '
>       +test_expect_success 'git ls-files --format v.s. --eol' '
>      -+	git ls-files --eol >expect 2>err &&
>      ++	git ls-files --eol >tmp &&
>      ++	sed -e "s/	/ /g" -e "s/  */ /g" tmp >expect 2>err &&
>       +	test_must_be_empty err &&
>      -+	git ls-files --format="i/%(eolinfo:index)${WS}w/%(eolinfo:worktree)${WS}attr/${WS}${WS}${WS}${WS} ${HT}%(path)" >actual 2>err &&
>      ++	git ls-files --format="i/%(eolinfo:index) w/%(eolinfo:worktree) attr/%(eolattr) %(path)" >actual 2>err &&
>       +	test_must_be_empty err &&
>       +	test_cmp expect actual
>       +'
>       +
>      -+test_expect_success 'git ls-files --format path' '
>      -+	cat >expect <<-\EOF &&
>      -+	o1
>      -+	o2
>      -+	EOF
>      ++test_expect_success 'git ls-files --format path v.s. -s' '
>      ++	git ls-files -s | awk "{print \$4}" >expect &&
>       +	git ls-files --format="%(path)" >actual &&
>       +	test_cmp expect actual
>       +'
>       +
>       +test_expect_success 'git ls-files --format with -m' '
>      -+	echo change >o1 &&
>      ++	echo change >o1.txt &&
>       +	cat >expect <<-\EOF &&
>      -+	o1
>      ++	o1.txt
>      ++	o5.txt
>       +	EOF
>       +	git ls-files --format="%(path)" -m >actual &&
>       +	test_cmp expect actual
>       +'
>       +
>       +test_expect_success 'git ls-files --format with -d' '
>      -+	echo o3 >o3 &&
>      -+	git add o3 &&
>      -+	rm o3 &&
>      ++	echo o6 >o6.txt &&
>      ++	git add o6.txt &&
>      ++	rm o6.txt &&
>       +	cat >expect <<-\EOF &&
>      -+	o3
>      ++	o5.txt
>      ++	o6.txt
>       +	EOF
>       +	git ls-files --format="%(path)" -d >actual &&
>       +	test_cmp expect actual
>
>
>  Documentation/git-ls-files.txt | 39 +++++++++++++-
>  builtin/ls-files.c             | 95 ++++++++++++++++++++++++++++++++++
>  t/t3013-ls-files-format.sh     | 87 +++++++++++++++++++++++++++++++
>  3 files changed, 220 insertions(+), 1 deletion(-)
>  create mode 100755 t/t3013-ls-files-format.sh
>
> diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
> index 0dabf3f0ddc..d7986419c25 100644
> --- a/Documentation/git-ls-files.txt
> +++ b/Documentation/git-ls-files.txt
> @@ -20,7 +20,7 @@ SYNOPSIS
>  		[--exclude-standard]
>  		[--error-unmatch] [--with-tree=<tree-ish>]
>  		[--full-name] [--recurse-submodules]
> -		[--abbrev[=<n>]] [--] [<file>...]
> +		[--abbrev[=<n>]] [--format=<format>] [--] [<file>...]
>  
>  DESCRIPTION
>  -----------
> @@ -192,6 +192,13 @@ followed by the  ("attr/<eolattr>").
>  	to the contained files. Sparse directories will be shown with a
>  	trailing slash, such as "x/" for a sparse directory "x".
>  
> +--format=<format>::
> +	A string that interpolates `%(fieldname)` from the result being shown.
> +	It also interpolates `%%` to `%`, and `%xx` where `xx` are hex digits
> +	interpolates to character with hex code `xx`; for example `%00`
> +	interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
> +	--format cannot be combined with `-s`, `-o`, `-k`, `-t`, `--resolve-undo`
> +	and `--eol`.
>  \--::
>  	Do not interpret any more arguments as options.
>  
> @@ -223,6 +230,36 @@ quoted as explained for the configuration variable `core.quotePath`
>  (see linkgit:git-config[1]).  Using `-z` the filename is output
>  verbatim and the line is terminated by a NUL byte.
>  
> +It is possible to print in a custom format by using the `--format`
> +option, which is able to interpolate different fields using
> +a `%(fieldname)` notation. For example, if you only care about the
> +"objectname" and "path" fields, you can execute with a specific
> +"--format" like
> +
> +	git ls-files --format='%(objectname) %(path)'
> +
> +FIELD NAMES
> +-----------
> +The way each path is shown can be customized by using the
> +`--format=<format>` option, where the %(fieldname) in the
> +<format> string for various aspects of the index entry are
> +interpolated.  The following "fieldname" are understood:
> +
> +objectmode::
> +	The mode of the file which is recorded in the index.
> +objectname::
> +	The name of the file which is recorded in the index.
> +stage::
> +	The stage of the file which is recorded in the index.
> +eolinfo:index::
> +eolinfo:worktree::
> +	The <eolinfo> (see the description of the `--eol` option) of
> +	the contents in the index or in the worktree for the path.
> +eolattr::
> +	The <eolattr> (see the description of the `--eol` option)
> +	that applies to the path.
> +path::
> +	The pathname of the file which is recorded in the index.
>  
>  EXCLUDE PATTERNS
>  ----------------
> diff --git a/builtin/ls-files.c b/builtin/ls-files.c
> index e791b65e7e9..6f3ebcaaff7 100644
> --- a/builtin/ls-files.c
> +++ b/builtin/ls-files.c
> @@ -11,6 +11,7 @@
>  #include "quote.h"
>  #include "dir.h"
>  #include "builtin.h"
> +#include "strbuf.h"
>  #include "tree.h"
>  #include "cache-tree.h"
>  #include "parse-options.h"
> @@ -48,6 +49,7 @@ static char *ps_matched;
>  static const char *with_tree;
>  static int exc_given;
>  static int exclude_args;
> +static const char *format;
>  
>  static const char *tag_cached = "";
>  static const char *tag_unmerged = "";
> @@ -85,6 +87,16 @@ static void write_name(const char *name)
>  				   stdout, line_terminator);
>  }
>  
> +static void write_name_to_buf(struct strbuf *sb, const char *name)
> +{
> +	const char *rel = relative_path(name, prefix_len ? prefix : NULL, sb);
> +
> +	if (line_terminator)
> +		quote_c_style(rel, sb, NULL, 0);
> +	else
> +		strbuf_addstr(sb, rel);
> +}
> +
>  static const char *get_tag(const struct cache_entry *ce, const char *tag)
>  {
>  	static char alttag[4];
> @@ -222,6 +234,73 @@ static void show_submodule(struct repository *superproject,
>  	repo_clear(&subrepo);
>  }
>  
> +struct show_index_data {
> +	const char *pathname;
> +	struct index_state *istate;
> +	const struct cache_entry *ce;
> +};
> +
> +static size_t expand_show_index(struct strbuf *sb, const char *start,
> +				void *context)
> +{
> +	struct show_index_data *data = context;
> +	const char *end;
> +	const char *p;
> +	size_t len = strbuf_expand_literal_cb(sb, start, NULL);
> +	struct stat st;
> +
> +	if (len)
> +		return len;
> +	if (*start != '(')
> +		die(_("bad ls-files format: element '%s' "
> +		      "does not start with '('"), start);
> +
> +	end = strchr(start + 1, ')');
> +	if (!end)
> +		die(_("bad ls-files format: element '%s'"
> +		      "does not end in ')'"), start);
> +
> +	len = end - start + 1;
> +	if (skip_prefix(start, "(objectmode)", &p))
> +		strbuf_addf(sb, "%06o", data->ce->ce_mode);
> +	else if (skip_prefix(start, "(objectname)", &p))
> +		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
> +	else if (skip_prefix(start, "(stage)", &p))
> +		strbuf_addf(sb, "%d", ce_stage(data->ce));
> +	else if (skip_prefix(start, "(eolinfo:index)", &p))
> +		strbuf_addstr(sb, S_ISREG(data->ce->ce_mode) ?
> +			      get_cached_convert_stats_ascii(data->istate,
> +			      data->ce->name) : "");
> +	else if (skip_prefix(start, "(eolinfo:worktree)", &p))
> +		strbuf_addstr(sb, !lstat(data->pathname, &st) &&
> +			      S_ISREG(st.st_mode) ?
> +			      get_wt_convert_stats_ascii(data->pathname) : "");
> +	else if (skip_prefix(start, "(eolattr)", &p))
> +		strbuf_addstr(sb, get_convert_attr_ascii(data->istate,
> +			      data->pathname));
> +	else if (skip_prefix(start, "(path)", &p))
> +		write_name_to_buf(sb, data->pathname);
> +	else
> +		die(_("bad ls-files format: %%%.*s"), (int)len, start);
> +
> +	return len;
> +}
> +
> +static void show_ce_fmt(struct repository *repo, const struct cache_entry *ce,
> +			const char *format, const char *fullname) {
> +	struct show_index_data data = {
> +		.pathname = fullname,
> +		.istate = repo->index,
> +		.ce = ce,
> +	};
> +	struct strbuf sb = STRBUF_INIT;
> +
> +	strbuf_expand(&sb, format, expand_show_index, &data);
> +	strbuf_addch(&sb, line_terminator);
> +	fwrite(sb.buf, sb.len, 1, stdout);
> +	strbuf_release(&sb);
> +}
> +
>  static void show_ce(struct repository *repo, struct dir_struct *dir,
>  		    const struct cache_entry *ce, const char *fullname,
>  		    const char *tag)
> @@ -236,6 +315,12 @@ static void show_ce(struct repository *repo, struct dir_struct *dir,
>  				  max_prefix_len, ps_matched,
>  				  S_ISDIR(ce->ce_mode) ||
>  				  S_ISGITLINK(ce->ce_mode))) {
> +		if (format) {
> +			show_ce_fmt(repo, ce, format, fullname);
> +			print_debug(ce);
> +			return;
> +		}
> +
>  		tag = get_tag(ce, tag);
>  
>  		if (!show_stage) {
> @@ -675,6 +760,9 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
>  			 N_("suppress duplicate entries")),
>  		OPT_BOOL(0, "sparse", &show_sparse_dirs,
>  			 N_("show sparse directories in the presence of a sparse index")),
> +		OPT_STRING_F(0, "format", &format, N_("format"),
> +			     N_("format to use for the output"),
> +			     PARSE_OPT_NONEG),
>  		OPT_END()
>  	};
>  	int ret = 0;
> @@ -699,6 +787,13 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
>  	for (i = 0; i < exclude_list.nr; i++) {
>  		add_pattern(exclude_list.items[i].string, "", 0, pl, --exclude_args);
>  	}
> +
> +	if (format && (show_stage || show_others || show_killed ||
> +		show_resolve_undo || skipping_duplicates || show_eol || show_tag))
> +			usage_msg_opt("--format cannot used with -s, -o, -k, -t"
> +				      "--resolve-undo, --deduplicate, --eol",
> +				      ls_files_usage, builtin_ls_files_options);

There's a whitespace issue here, you need to add a space after "-t",
otherwise we emit:

	fatal: --format cannot used with -s, -o, -k, -t--resolve-undo, --deduplicate, --eol
> +
>  	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
>  		tag_cached = "H ";
>  		tag_unmerged = "M ";
> diff --git a/t/t3013-ls-files-format.sh b/t/t3013-ls-files-format.sh
> new file mode 100755
> index 00000000000..e62bce70f3b
> --- /dev/null
> +++ b/t/t3013-ls-files-format.sh
> @@ -0,0 +1,87 @@
> +#!/bin/sh
> +
> +test_description='git ls-files --format test'
> +
> +TEST_PASSES_SANITIZE_LEAK=true
> +. ./test-lib.sh
> +
> +for flag in -s -o -k -t --resolve-undo --deduplicate --eol
> +do
> +	test_expect_success "usage: --format is incompatible with $flag" '
> +		test_expect_code 129 git ls-files --format="%(objectname)" $flag
> +	'
> +done
> +
> +test_expect_success 'setup' '
> +	printf "LINEONE\nLINETWO\nLINETHREE\n" >o1.txt &&
> +	printf "LINEONE\r\nLINETWO\r\nLINETHREE\r\n" >o2.txt &&
> +	printf "LINEONE\r\nLINETWO\nLINETHREE\n" >o3.txt &&

If you want to do this sort of thing in general this pattern is better:

	x="a b c" &&
	printf "%s\n" $x
        printf "%s\r\n" $x

I.e. you can use printf's auto-repeating, or test_write_lines[1]. But in
this case I tried:

	diff --git a/t/t3013-ls-files-format.sh b/t/t3013-ls-files-format.sh
	index e62bce70f3b..e4c3a788acb 100755
	--- a/t/t3013-ls-files-format.sh
	+++ b/t/t3013-ls-files-format.sh
	@@ -13,9 +13,11 @@ do
	 done
	 
	 test_expect_success 'setup' '
	-	printf "LINEONE\nLINETWO\nLINETHREE\n" >o1.txt &&
	-	printf "LINEONE\r\nLINETWO\r\nLINETHREE\r\n" >o2.txt &&
	-	printf "LINEONE\r\nLINETWO\nLINETHREE\n" >o3.txt &&
	+	lines="LO LINETWO LINETHREE" &&
	+	test_write_lines $lines >o1.txt &&
	+	# Even this passes!
	+	#>o1.txt &&
	+
	 	ln -s o3.txt o4.txt &&
	 	git add "*.txt" &&
	 	git add --chmod +x o1.txt &&

I.e. all tests pass if we don't write o2.txt and o3.txt, and continue to
pass if you uncomment that and make o1.txt an empty file.

So is this some incomplete test setup code that was never used & we
could drop?
	

> +	ln -s o3.txt o4.txt &&
> +	git add "*.txt" &&
> +	git add --chmod +x o1.txt &&
> +	git update-index --add --cacheinfo 160000 $(git hash-object o1.txt) o5.txt &&

Do:

	oid=$(git hash-object ..) &&
	git update-index ... "$(oid)"

Otherwise we hide the exit code of "git-hash-object", e.g. if it returns
the hash and then segfaults.

> +	git commit -m base
> +'
> +
> +test_expect_success 'git ls-files --format objectmode v.s. -s' '
> +	git ls-files -s | awk "{print \$1}" >expect &&

Same in this case and below, i.e. let's not hide "git" on the lhs of a
pipe. So:

	git ls-files >files &&
	awk ... <files >expect

In this case all your awk-ing can be replaced with (continued)...

> +	git ls-files --format="%(objectmode)" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format objectname v.s. -s' '
> +	git ls-files -s | awk "{print \$2}" >expect &&

...

	cut -d" " -f2

...

> +	git ls-files --format="%(objectname)" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format v.s. --eol' '
> +	git ls-files --eol >tmp &&
> +	sed -e "s/	/ /g" -e "s/  */ /g" tmp >expect 2>err &&
> +	test_must_be_empty err &&
> +	git ls-files --format="i/%(eolinfo:index) w/%(eolinfo:worktree) attr/%(eolattr) %(path)" >actual 2>err &&
> +	test_must_be_empty err &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format path v.s. -s' '
> +	git ls-files -s | awk "{print \$4}" >expect &&

...

	cut -f2

I.e. instead of the 4th whitespace field ask for the 2nd \t-delimited
field. There's nothing wrong with using awk per-se, but let's use the
simpler "cut" for such a simple use-case.

> +	git ls-files --format="%(path)" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format with -m' '
> +	echo change >o1.txt &&
> +	cat >expect <<-\EOF &&
> +	o1.txt
> +	o5.txt
> +	EOF
> +	git ls-files --format="%(path)" -m >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format with -d' '
> +	echo o6 >o6.txt &&
> +	git add o6.txt &&
> +	rm o6.txt &&
> +	cat >expect <<-\EOF &&
> +	o5.txt
> +	o6.txt
> +	EOF
> +	git ls-files --format="%(path)" -d >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format imitate --stage' '
> +	git ls-files --stage >expect &&
> +	git ls-files --format="%(objectmode) %(objectname) %(stage)%x09%(path)" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format with --debug' '
> +	git ls-files --debug >expect &&
> +	git ls-files --format="%(path)" --debug >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_done

The rest of this (and especially the C code) all looks good to me at
this point, thanks!

1. Aside: I've found the test_write_lines helper to be rather strange
   for us to carry. I.e. most helpers provide a briefer or less
   buggy/tricky way to do something, but in that case:

	test_write_lines
	printf "%s\n"

   So we have it to write something in a more verbose way than we need,
   as we can see experimentally from all tests passing with:

	perl -pi -e 's[test_write_lines ][printf "%s\\n" ]g' t/t[0-9]*.sh

   It seems to me that per
   https://lore.kernel.org/git/xmqqioqu5fr3.fsf@gitster.dls.corp.google.com/
   and
   https://lore.kernel.org/git/1398255277-26303-2-git-send-email-mst@redhat.com/
   it was suggested without knowing that we could use printf to do the
   same.

   The implementation that landed in ac9afcc31cd (test: add
   test_write_lines helper, 2014-04-27) was fixed up to use printf,
   without re-visiting why we were carrying a helper when an even
   shorter printf would do...

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v7] ls-files: introduce "--format" option
  2022-07-18  8:09             ` Ævar Arnfjörð Bjarmason
@ 2022-07-19 16:19               ` ZheNing Hu
  2022-07-19 16:47                 ` Junio C Hamano
  0 siblings, 1 reply; 61+ messages in thread
From: ZheNing Hu @ 2022-07-19 16:19 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: ZheNing Hu via GitGitGadget, Git List, Junio C Hamano,
	Christian Couder, Phillip Wood, Torsten Bögershausen

Ævar Arnfjörð Bjarmason <avarab@gmail.com> 于2022年7月18日周一 16:29写道:
>
>
> On Wed, Jul 13 2022, ZheNing Hu via GitGitGadget wrote:
>
> > +test_expect_success 'setup' '
> > +     printf "LINEONE\nLINETWO\nLINETHREE\n" >o1.txt &&
> > +     printf "LINEONE\r\nLINETWO\r\nLINETHREE\r\n" >o2.txt &&
> > +     printf "LINEONE\r\nLINETWO\nLINETHREE\n" >o3.txt &&
>
> If you want to do this sort of thing in general this pattern is better:
>
>         x="a b c" &&
>         printf "%s\n" $x
>         printf "%s\r\n" $x
>

Let see what's these cmd output:

x="a b c" &&
printf "%s\n" $x &&
printf "%s\r\n" $x

a b c
a b c

I guess what we expect is:

a
b
c
a
b
c

test_write_lines() can do this:

test_write_lines a b c
test_write_lines a b c

yeah, maybe printf do this too:

# x="a b c" we don't use a variable
printf "%s\n" a b c &&
printf "%s\r\n" a b c

> I.e. you can use printf's auto-repeating, or test_write_lines[1]. But in
> this case I tried:
>
>         diff --git a/t/t3013-ls-files-format.sh b/t/t3013-ls-files-format.sh
>         index e62bce70f3b..e4c3a788acb 100755
>         --- a/t/t3013-ls-files-format.sh
>         +++ b/t/t3013-ls-files-format.sh
>         @@ -13,9 +13,11 @@ do
>          done
>
>          test_expect_success 'setup' '
>         -       printf "LINEONE\nLINETWO\nLINETHREE\n" >o1.txt &&
>         -       printf "LINEONE\r\nLINETWO\r\nLINETHREE\r\n" >o2.txt &&
>         -       printf "LINEONE\r\nLINETWO\nLINETHREE\n" >o3.txt &&
>         +       lines="LO LINETWO LINETHREE" &&
>         +       test_write_lines $lines >o1.txt &&
>         +       # Even this passes!
>         +       #>o1.txt &&
>         +
>                 ln -s o3.txt o4.txt &&
>                 git add "*.txt" &&
>                 git add --chmod +x o1.txt &&
>
> I.e. all tests pass if we don't write o2.txt and o3.txt, and continue to
> pass if you uncomment that and make o1.txt an empty file.
>
> So is this some incomplete test setup code that was never used & we
> could drop?

No, o2.txt, o3.txt just for test some file with different
eolinfo/eolattr, so if we just
keep o1,txt and no o2.txt, o3.txt, it can certainly work with this test case:
'git ls-files --format v.s. --eol'. Other cases don't really need
o2.txt, o3.txt.

By the way: this part of code was copied from t/t0025-crlf-renormalize.sh.
it want to test for three kind file, end with "\n", "\r\n", or mix
with "\n", "\r\n".

>
>
> > +     ln -s o3.txt o4.txt &&
> > +     git add "*.txt" &&
> > +     git add --chmod +x o1.txt &&
> > +     git update-index --add --cacheinfo 160000 $(git hash-object o1.txt) o5.txt &&
>
> Do:
>
>         oid=$(git hash-object ..) &&
>         git update-index ... "$(oid)"
>
> Otherwise we hide the exit code of "git-hash-object", e.g. if it returns
> the hash and then segfaults.
>

Thanks. I will keep this in my mind.

> > +     git commit -m base
> > +'
> > +
> > +test_expect_success 'git ls-files --format objectmode v.s. -s' '
> > +     git ls-files -s | awk "{print \$1}" >expect &&
>
> Same in this case and below, i.e. let's not hide "git" on the lhs of a
> pipe. So:
>
>         git ls-files >files &&
>         awk ... <files >expect
>
> In this case all your awk-ing can be replaced with (continued)...
>
> > +     git ls-files --format="%(objectmode)" >actual &&
> > +     test_cmp expect actual
> > +'
> > +
> > +test_expect_success 'git ls-files --format objectname v.s. -s' '
> > +     git ls-files -s | awk "{print \$2}" >expect &&
>
> ...
>
>         cut -d" " -f2
>
> ...
>
> > +     git ls-files --format="%(objectname)" >actual &&
> > +     test_cmp expect actual
> > +'
> > +
> > +test_expect_success 'git ls-files --format v.s. --eol' '
> > +     git ls-files --eol >tmp &&
> > +     sed -e "s/      / /g" -e "s/  */ /g" tmp >expect 2>err &&
> > +     test_must_be_empty err &&
> > +     git ls-files --format="i/%(eolinfo:index) w/%(eolinfo:worktree) attr/%(eolattr) %(path)" >actual 2>err &&
> > +     test_must_be_empty err &&
> > +     test_cmp expect actual
> > +'
> > +
> > +test_expect_success 'git ls-files --format path v.s. -s' '
> > +     git ls-files -s | awk "{print \$4}" >expect &&
>
> ...
>
>         cut -f2
>
> I.e. instead of the 4th whitespace field ask for the 2nd \t-delimited
> field. There's nothing wrong with using awk per-se, but let's use the
> simpler "cut" for such a simple use-case.
>
> > +     git ls-files --format="%(path)" >actual &&
> > +     test_cmp expect actual
> > +'
> > +
> > +test_expect_success 'git ls-files --format with -m' '
> > +     echo change >o1.txt &&
> > +     cat >expect <<-\EOF &&
> > +     o1.txt
> > +     o5.txt
> > +     EOF
> > +     git ls-files --format="%(path)" -m >actual &&
> > +     test_cmp expect actual
> > +'
> > +
> > +test_expect_success 'git ls-files --format with -d' '
> > +     echo o6 >o6.txt &&
> > +     git add o6.txt &&
> > +     rm o6.txt &&
> > +     cat >expect <<-\EOF &&
> > +     o5.txt
> > +     o6.txt
> > +     EOF
> > +     git ls-files --format="%(path)" -d >actual &&
> > +     test_cmp expect actual
> > +'
> > +
> > +test_expect_success 'git ls-files --format imitate --stage' '
> > +     git ls-files --stage >expect &&
> > +     git ls-files --format="%(objectmode) %(objectname) %(stage)%x09%(path)" >actual &&
> > +     test_cmp expect actual
> > +'
> > +
> > +test_expect_success 'git ls-files --format with --debug' '
> > +     git ls-files --debug >expect &&
> > +     git ls-files --format="%(path)" --debug >actual &&
> > +     test_cmp expect actual
> > +'
> > +
> > +test_done
>
> The rest of this (and especially the C code) all looks good to me at
> this point, thanks!
>
> 1. Aside: I've found the test_write_lines helper to be rather strange
>    for us to carry. I.e. most helpers provide a briefer or less
>    buggy/tricky way to do something, but in that case:
>
>         test_write_lines
>         printf "%s\n"
>
>    So we have it to write something in a more verbose way than we need,
>    as we can see experimentally from all tests passing with:
>
>         perl -pi -e 's[test_write_lines ][printf "%s\\n" ]g' t/t[0-9]*.sh
>
>    It seems to me that per
>    https://lore.kernel.org/git/xmqqioqu5fr3.fsf@gitster.dls.corp.google.com/
>    and
>    https://lore.kernel.org/git/1398255277-26303-2-git-send-email-mst@redhat.com/
>    it was suggested without knowing that we could use printf to do the
>    same.
>
>    The implementation that landed in ac9afcc31cd (test: add
>    test_write_lines helper, 2014-04-27) was fixed up to use printf,
>    without re-visiting why we were carrying a helper when an even
>    shorter printf would do...

I guess it's just for letting contributers use the same way output multiple
line, otherwise contributers (like me) want to use echo or other commands
sometimes?

But in my test case here, I need a file mixed with "r\n", "\n" instread of
only "\n", so I think maybe we should keep it.

ZheNing Hu

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v7] ls-files: introduce "--format" option
  2022-07-19 16:19               ` ZheNing Hu
@ 2022-07-19 16:47                 ` Junio C Hamano
  2022-07-19 17:21                   ` ZheNing Hu
  0 siblings, 1 reply; 61+ messages in thread
From: Junio C Hamano @ 2022-07-19 16:47 UTC (permalink / raw)
  To: ZheNing Hu
  Cc: Ævar Arnfjörð Bjarmason,
	ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Phillip Wood, Torsten Bögershausen

ZheNing Hu <adlternative@gmail.com> writes:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> 于2022年7月18日周一 16:29写道:
>>
>>
>> On Wed, Jul 13 2022, ZheNing Hu via GitGitGadget wrote:
>>
>> > +test_expect_success 'setup' '
>> > +     printf "LINEONE\nLINETWO\nLINETHREE\n" >o1.txt &&
>> > +     printf "LINEONE\r\nLINETWO\r\nLINETHREE\r\n" >o2.txt &&
>> > +     printf "LINEONE\r\nLINETWO\nLINETHREE\n" >o3.txt &&
>>
>> If you want to do this sort of thing in general this pattern is better:
>>
>>         x="a b c" &&
>>         printf "%s\n" $x
>>         printf "%s\r\n" $x
>>
>
> Let see what's these cmd output:
>
> x="a b c" &&
> printf "%s\n" $x &&
> printf "%s\r\n" $x
>
> a b c
> a b c

The above makes it look as if your shell is broken or you have an
unusual IFS that does not have space in it.  Are you sure you did
not place anything around $x on the second and the third line, which
is given to printf after its contents split into words at $IFS?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v7] ls-files: introduce "--format" option
  2022-07-19 16:47                 ` Junio C Hamano
@ 2022-07-19 17:21                   ` ZheNing Hu
  0 siblings, 0 replies; 61+ messages in thread
From: ZheNing Hu @ 2022-07-19 17:21 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason,
	ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Phillip Wood, Torsten Bögershausen

Junio C Hamano <gitster@pobox.com> 于2022年7月20日周三 00:47写道:
>
> ZheNing Hu <adlternative@gmail.com> writes:
>
> > Ævar Arnfjörð Bjarmason <avarab@gmail.com> 于2022年7月18日周一 16:29写道:
> >>
> >>
> >> On Wed, Jul 13 2022, ZheNing Hu via GitGitGadget wrote:
> >>
> >> > +test_expect_success 'setup' '
> >> > +     printf "LINEONE\nLINETWO\nLINETHREE\n" >o1.txt &&
> >> > +     printf "LINEONE\r\nLINETWO\r\nLINETHREE\r\n" >o2.txt &&
> >> > +     printf "LINEONE\r\nLINETWO\nLINETHREE\n" >o3.txt &&
> >>
> >> If you want to do this sort of thing in general this pattern is better:
> >>
> >>         x="a b c" &&
> >>         printf "%s\n" $x
> >>         printf "%s\r\n" $x
> >>
> >
> > Let see what's these cmd output:
> >
> > x="a b c" &&
> > printf "%s\n" $x &&
> > printf "%s\r\n" $x
> >
> > a b c
> > a b c
>
> The above makes it look as if your shell is broken or you have an
> unusual IFS that does not have space in it.  Are you sure you did
> not place anything around $x on the second and the third line, which
> is given to printf after its contents split into words at $IFS?

Ok... That's zsh's strange feature.... I turn to use bash, it's find.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v8] ls-files: introduce "--format" option
  2022-07-13  6:07           ` [PATCH v7] " ZheNing Hu via GitGitGadget
  2022-07-18  8:09             ` Ævar Arnfjörð Bjarmason
@ 2022-07-20 16:36             ` ZheNing Hu via GitGitGadget
  2022-07-20 17:37               ` Junio C Hamano
  2022-07-23  6:44               ` [PATCH v9] " ZheNing Hu via GitGitGadget
  1 sibling, 2 replies; 61+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2022-07-20 16:36 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Torsten Bögershausen, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Add a new option "--format" that outputs index entries
informations in a custom format, taking inspiration
from the option with the same name in the `git ls-tree`
command.

"--format" cannot used with "-s", "-o", "-k", "-t",
" --resolve-undo","--deduplicate" and "--eol".

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
    ls-files: introduce "--format" options
    
    v6->v7:
    
     1. fix the usage which leak of one white blank.
     2. fix some test cases.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1262%2Fadlternative%2Fzh%2Fls-file-format-v8
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1262/adlternative/zh/ls-file-format-v8
Pull-Request: https://github.com/gitgitgadget/git/pull/1262

Range-diff vs v7:

 1:  9ca22edba94 ! 1:  e765ee28e90 ls-files: introduce "--format" option
     @@ builtin/ls-files.c: int cmd_ls_files(int argc, const char **argv, const char *cm
      +
      +	if (format && (show_stage || show_others || show_killed ||
      +		show_resolve_undo || skipping_duplicates || show_eol || show_tag))
     -+			usage_msg_opt("--format cannot used with -s, -o, -k, -t"
     ++			usage_msg_opt("--format cannot used with -s, -o, -k, -t, "
      +				      "--resolve-undo, --deduplicate, --eol",
      +				      ls_files_usage, builtin_ls_files_options);
      +
     @@ t/t3013-ls-files-format.sh (new)
      +	printf "LINEONE\nLINETWO\nLINETHREE\n" >o1.txt &&
      +	printf "LINEONE\r\nLINETWO\r\nLINETHREE\r\n" >o2.txt &&
      +	printf "LINEONE\r\nLINETWO\nLINETHREE\n" >o3.txt &&
     -+	ln -s o3.txt o4.txt &&
     -+	git add "*.txt" &&
     -+	git add --chmod +x o1.txt &&
     -+	git update-index --add --cacheinfo 160000 $(git hash-object o1.txt) o5.txt &&
     ++	git add . &&
     ++	oid=$(git hash-object o1.txt) &&
     ++	git update-index --add --cacheinfo 120000 $oid o4.txt &&
     ++	git update-index --add --cacheinfo 160000 $oid o5.txt &&
     ++	git update-index --add --cacheinfo 100755 $oid o6.txt &&
      +	git commit -m base
      +'
      +
      +test_expect_success 'git ls-files --format objectmode v.s. -s' '
     -+	git ls-files -s | awk "{print \$1}" >expect &&
     ++	git ls-files -s >files &&
     ++	cut -d" " -f1 files >expect &&
      +	git ls-files --format="%(objectmode)" >actual &&
      +	test_cmp expect actual
      +'
      +
      +test_expect_success 'git ls-files --format objectname v.s. -s' '
     -+	git ls-files -s | awk "{print \$2}" >expect &&
     ++	git ls-files -s >files &&
     ++	cut -d" " -f2 files >expect &&
      +	git ls-files --format="%(objectname)" >actual &&
      +	test_cmp expect actual
      +'
     @@ t/t3013-ls-files-format.sh (new)
      +'
      +
      +test_expect_success 'git ls-files --format path v.s. -s' '
     -+	git ls-files -s | awk "{print \$4}" >expect &&
     ++	git ls-files -s >files &&
     ++	cut -f2 files >expect &&
      +	git ls-files --format="%(path)" >actual &&
      +	test_cmp expect actual
      +'
     @@ t/t3013-ls-files-format.sh (new)
      +	echo change >o1.txt &&
      +	cat >expect <<-\EOF &&
      +	o1.txt
     ++	o4.txt
      +	o5.txt
     ++	o6.txt
      +	EOF
      +	git ls-files --format="%(path)" -m >actual &&
      +	test_cmp expect actual
      +'
      +
      +test_expect_success 'git ls-files --format with -d' '
     -+	echo o6 >o6.txt &&
     -+	git add o6.txt &&
     -+	rm o6.txt &&
     ++	echo o7 >o7.txt &&
     ++	git add o7.txt &&
     ++	rm o7.txt &&
      +	cat >expect <<-\EOF &&
     ++	o4.txt
      +	o5.txt
      +	o6.txt
     ++	o7.txt
      +	EOF
      +	git ls-files --format="%(path)" -d >actual &&
      +	test_cmp expect actual


 Documentation/git-ls-files.txt | 39 +++++++++++++-
 builtin/ls-files.c             | 95 ++++++++++++++++++++++++++++++++++
 t/t3013-ls-files-format.sh     | 95 ++++++++++++++++++++++++++++++++++
 3 files changed, 228 insertions(+), 1 deletion(-)
 create mode 100755 t/t3013-ls-files-format.sh

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index 0dabf3f0ddc..d7986419c25 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -20,7 +20,7 @@ SYNOPSIS
 		[--exclude-standard]
 		[--error-unmatch] [--with-tree=<tree-ish>]
 		[--full-name] [--recurse-submodules]
-		[--abbrev[=<n>]] [--] [<file>...]
+		[--abbrev[=<n>]] [--format=<format>] [--] [<file>...]
 
 DESCRIPTION
 -----------
@@ -192,6 +192,13 @@ followed by the  ("attr/<eolattr>").
 	to the contained files. Sparse directories will be shown with a
 	trailing slash, such as "x/" for a sparse directory "x".
 
+--format=<format>::
+	A string that interpolates `%(fieldname)` from the result being shown.
+	It also interpolates `%%` to `%`, and `%xx` where `xx` are hex digits
+	interpolates to character with hex code `xx`; for example `%00`
+	interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
+	--format cannot be combined with `-s`, `-o`, `-k`, `-t`, `--resolve-undo`
+	and `--eol`.
 \--::
 	Do not interpret any more arguments as options.
 
@@ -223,6 +230,36 @@ quoted as explained for the configuration variable `core.quotePath`
 (see linkgit:git-config[1]).  Using `-z` the filename is output
 verbatim and the line is terminated by a NUL byte.
 
+It is possible to print in a custom format by using the `--format`
+option, which is able to interpolate different fields using
+a `%(fieldname)` notation. For example, if you only care about the
+"objectname" and "path" fields, you can execute with a specific
+"--format" like
+
+	git ls-files --format='%(objectname) %(path)'
+
+FIELD NAMES
+-----------
+The way each path is shown can be customized by using the
+`--format=<format>` option, where the %(fieldname) in the
+<format> string for various aspects of the index entry are
+interpolated.  The following "fieldname" are understood:
+
+objectmode::
+	The mode of the file which is recorded in the index.
+objectname::
+	The name of the file which is recorded in the index.
+stage::
+	The stage of the file which is recorded in the index.
+eolinfo:index::
+eolinfo:worktree::
+	The <eolinfo> (see the description of the `--eol` option) of
+	the contents in the index or in the worktree for the path.
+eolattr::
+	The <eolattr> (see the description of the `--eol` option)
+	that applies to the path.
+path::
+	The pathname of the file which is recorded in the index.
 
 EXCLUDE PATTERNS
 ----------------
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index e791b65e7e9..44a2c1cb425 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -11,6 +11,7 @@
 #include "quote.h"
 #include "dir.h"
 #include "builtin.h"
+#include "strbuf.h"
 #include "tree.h"
 #include "cache-tree.h"
 #include "parse-options.h"
@@ -48,6 +49,7 @@ static char *ps_matched;
 static const char *with_tree;
 static int exc_given;
 static int exclude_args;
+static const char *format;
 
 static const char *tag_cached = "";
 static const char *tag_unmerged = "";
@@ -85,6 +87,16 @@ static void write_name(const char *name)
 				   stdout, line_terminator);
 }
 
+static void write_name_to_buf(struct strbuf *sb, const char *name)
+{
+	const char *rel = relative_path(name, prefix_len ? prefix : NULL, sb);
+
+	if (line_terminator)
+		quote_c_style(rel, sb, NULL, 0);
+	else
+		strbuf_addstr(sb, rel);
+}
+
 static const char *get_tag(const struct cache_entry *ce, const char *tag)
 {
 	static char alttag[4];
@@ -222,6 +234,73 @@ static void show_submodule(struct repository *superproject,
 	repo_clear(&subrepo);
 }
 
+struct show_index_data {
+	const char *pathname;
+	struct index_state *istate;
+	const struct cache_entry *ce;
+};
+
+static size_t expand_show_index(struct strbuf *sb, const char *start,
+				void *context)
+{
+	struct show_index_data *data = context;
+	const char *end;
+	const char *p;
+	size_t len = strbuf_expand_literal_cb(sb, start, NULL);
+	struct stat st;
+
+	if (len)
+		return len;
+	if (*start != '(')
+		die(_("bad ls-files format: element '%s' "
+		      "does not start with '('"), start);
+
+	end = strchr(start + 1, ')');
+	if (!end)
+		die(_("bad ls-files format: element '%s'"
+		      "does not end in ')'"), start);
+
+	len = end - start + 1;
+	if (skip_prefix(start, "(objectmode)", &p))
+		strbuf_addf(sb, "%06o", data->ce->ce_mode);
+	else if (skip_prefix(start, "(objectname)", &p))
+		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
+	else if (skip_prefix(start, "(stage)", &p))
+		strbuf_addf(sb, "%d", ce_stage(data->ce));
+	else if (skip_prefix(start, "(eolinfo:index)", &p))
+		strbuf_addstr(sb, S_ISREG(data->ce->ce_mode) ?
+			      get_cached_convert_stats_ascii(data->istate,
+			      data->ce->name) : "");
+	else if (skip_prefix(start, "(eolinfo:worktree)", &p))
+		strbuf_addstr(sb, !lstat(data->pathname, &st) &&
+			      S_ISREG(st.st_mode) ?
+			      get_wt_convert_stats_ascii(data->pathname) : "");
+	else if (skip_prefix(start, "(eolattr)", &p))
+		strbuf_addstr(sb, get_convert_attr_ascii(data->istate,
+			      data->pathname));
+	else if (skip_prefix(start, "(path)", &p))
+		write_name_to_buf(sb, data->pathname);
+	else
+		die(_("bad ls-files format: %%%.*s"), (int)len, start);
+
+	return len;
+}
+
+static void show_ce_fmt(struct repository *repo, const struct cache_entry *ce,
+			const char *format, const char *fullname) {
+	struct show_index_data data = {
+		.pathname = fullname,
+		.istate = repo->index,
+		.ce = ce,
+	};
+	struct strbuf sb = STRBUF_INIT;
+
+	strbuf_expand(&sb, format, expand_show_index, &data);
+	strbuf_addch(&sb, line_terminator);
+	fwrite(sb.buf, sb.len, 1, stdout);
+	strbuf_release(&sb);
+}
+
 static void show_ce(struct repository *repo, struct dir_struct *dir,
 		    const struct cache_entry *ce, const char *fullname,
 		    const char *tag)
@@ -236,6 +315,12 @@ static void show_ce(struct repository *repo, struct dir_struct *dir,
 				  max_prefix_len, ps_matched,
 				  S_ISDIR(ce->ce_mode) ||
 				  S_ISGITLINK(ce->ce_mode))) {
+		if (format) {
+			show_ce_fmt(repo, ce, format, fullname);
+			print_debug(ce);
+			return;
+		}
+
 		tag = get_tag(ce, tag);
 
 		if (!show_stage) {
@@ -675,6 +760,9 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 			 N_("suppress duplicate entries")),
 		OPT_BOOL(0, "sparse", &show_sparse_dirs,
 			 N_("show sparse directories in the presence of a sparse index")),
+		OPT_STRING_F(0, "format", &format, N_("format"),
+			     N_("format to use for the output"),
+			     PARSE_OPT_NONEG),
 		OPT_END()
 	};
 	int ret = 0;
@@ -699,6 +787,13 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 	for (i = 0; i < exclude_list.nr; i++) {
 		add_pattern(exclude_list.items[i].string, "", 0, pl, --exclude_args);
 	}
+
+	if (format && (show_stage || show_others || show_killed ||
+		show_resolve_undo || skipping_duplicates || show_eol || show_tag))
+			usage_msg_opt("--format cannot used with -s, -o, -k, -t, "
+				      "--resolve-undo, --deduplicate, --eol",
+				      ls_files_usage, builtin_ls_files_options);
+
 	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
 		tag_cached = "H ";
 		tag_unmerged = "M ";
diff --git a/t/t3013-ls-files-format.sh b/t/t3013-ls-files-format.sh
new file mode 100755
index 00000000000..92d378ad1df
--- /dev/null
+++ b/t/t3013-ls-files-format.sh
@@ -0,0 +1,95 @@
+#!/bin/sh
+
+test_description='git ls-files --format test'
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+for flag in -s -o -k -t --resolve-undo --deduplicate --eol
+do
+	test_expect_success "usage: --format is incompatible with $flag" '
+		test_expect_code 129 git ls-files --format="%(objectname)" $flag
+	'
+done
+
+test_expect_success 'setup' '
+	printf "LINEONE\nLINETWO\nLINETHREE\n" >o1.txt &&
+	printf "LINEONE\r\nLINETWO\r\nLINETHREE\r\n" >o2.txt &&
+	printf "LINEONE\r\nLINETWO\nLINETHREE\n" >o3.txt &&
+	git add . &&
+	oid=$(git hash-object o1.txt) &&
+	git update-index --add --cacheinfo 120000 $oid o4.txt &&
+	git update-index --add --cacheinfo 160000 $oid o5.txt &&
+	git update-index --add --cacheinfo 100755 $oid o6.txt &&
+	git commit -m base
+'
+
+test_expect_success 'git ls-files --format objectmode v.s. -s' '
+	git ls-files -s >files &&
+	cut -d" " -f1 files >expect &&
+	git ls-files --format="%(objectmode)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format objectname v.s. -s' '
+	git ls-files -s >files &&
+	cut -d" " -f2 files >expect &&
+	git ls-files --format="%(objectname)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format v.s. --eol' '
+	git ls-files --eol >tmp &&
+	sed -e "s/	/ /g" -e "s/  */ /g" tmp >expect 2>err &&
+	test_must_be_empty err &&
+	git ls-files --format="i/%(eolinfo:index) w/%(eolinfo:worktree) attr/%(eolattr) %(path)" >actual 2>err &&
+	test_must_be_empty err &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format path v.s. -s' '
+	git ls-files -s >files &&
+	cut -f2 files >expect &&
+	git ls-files --format="%(path)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -m' '
+	echo change >o1.txt &&
+	cat >expect <<-\EOF &&
+	o1.txt
+	o4.txt
+	o5.txt
+	o6.txt
+	EOF
+	git ls-files --format="%(path)" -m >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -d' '
+	echo o7 >o7.txt &&
+	git add o7.txt &&
+	rm o7.txt &&
+	cat >expect <<-\EOF &&
+	o4.txt
+	o5.txt
+	o6.txt
+	o7.txt
+	EOF
+	git ls-files --format="%(path)" -d >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format imitate --stage' '
+	git ls-files --stage >expect &&
+	git ls-files --format="%(objectmode) %(objectname) %(stage)%x09%(path)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with --debug' '
+	git ls-files --debug >expect &&
+	git ls-files --format="%(path)" --debug >actual &&
+	test_cmp expect actual
+'
+
+test_done

base-commit: ab336e8f1c8009c8b1aab8deb592148e69217085
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH v8] ls-files: introduce "--format" option
  2022-07-20 16:36             ` [PATCH v8] " ZheNing Hu via GitGitGadget
@ 2022-07-20 17:37               ` Junio C Hamano
  2022-07-21 15:54                 ` Ævar Arnfjörð Bjarmason
  2022-07-22  6:44                 ` ZheNing Hu
  2022-07-23  6:44               ` [PATCH v9] " ZheNing Hu via GitGitGadget
  1 sibling, 2 replies; 61+ messages in thread
From: Junio C Hamano @ 2022-07-20 17:37 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Christian Couder, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Torsten Bögershausen, ZheNing Hu

"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

It's been quite many iterations, so I'll just comment on the range-diff.

>      -+			usage_msg_opt("--format cannot used with -s, -o, -k, -t"
>      ++			usage_msg_opt("--format cannot used with -s, -o, -k, -t, "
>       +				      "--resolve-undo, --deduplicate, --eol",
>       +				      ls_files_usage, builtin_ls_files_options);

Looks good.

>      @@ t/t3013-ls-files-format.sh (new)
>       +	printf "LINEONE\nLINETWO\nLINETHREE\n" >o1.txt &&
>       +	printf "LINEONE\r\nLINETWO\r\nLINETHREE\r\n" >o2.txt &&
>       +	printf "LINEONE\r\nLINETWO\nLINETHREE\n" >o3.txt &&
>      -+	ln -s o3.txt o4.txt &&
>      -+	git add "*.txt" &&
>      -+	git add --chmod +x o1.txt &&
>      -+	git update-index --add --cacheinfo 160000 $(git hash-object o1.txt) o5.txt &&
>      ++	git add . &&

We may want to be a bit more strict (like "o?.txt") but because we
know this is the first 'setup' step, let's let it pass.

>      ++	oid=$(git hash-object o1.txt) &&
>      ++	git update-index --add --cacheinfo 120000 $oid o4.txt &&
>      ++	git update-index --add --cacheinfo 160000 $oid o5.txt &&
>      ++	git update-index --add --cacheinfo 100755 $oid o6.txt &&

It is a bit inconvenient that --cacheinfo takes only fully-spelled
raw object name that we need to use $oid like this (otherwise we
would be able to write ":o1.txt" instead), but (1) it is not a fault
of this patch, and (2) update-index is a plumbing command meant for
scripts, so it is not too big a problem.

>       +	git commit -m base
>       +'
>       +
>       +test_expect_success 'git ls-files --format objectmode v.s. -s' '
>      -+	git ls-files -s | awk "{print \$1}" >expect &&
>      ++	git ls-files -s >files &&
>      ++	cut -d" " -f1 files >expect &&

Either "awk" or "cut" is fine and flipping between them is a bit
distracting.  Cutting the pipe into two is a good move.

But is this testing the right thing?

> +test_expect_success 'git ls-files --format objectmode v.s. -s' '
> +	git ls-files -s >files &&
> +	cut -d" " -f1 files >expect &&
> +	git ls-files --format="%(objectmode)" >actual &&
> +	test_cmp expect actual
> +'

It only looks at the first column of the "-s" output, and we are
implicitly assuming that the order of output does not change between
the "-s" output and "--format=<format>" output.  I wonder if it is
more useful and less error prone to come up with a format string
that 100% reproduces the "ls-files -s" output and compare the two,
e.g. 

	format="%(objectmode) %(objectname) %(stage)	%(path)" &&
	git ls-files -s >expect &&
	git ls-files --format="$format" >actual &&
	test_cmp expect actual

I do not know if the $format I wrote without looking at the doc is
correct, but you get the idea.

Thanks.



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v8] ls-files: introduce "--format" option
  2022-07-20 17:37               ` Junio C Hamano
@ 2022-07-21 15:54                 ` Ævar Arnfjörð Bjarmason
  2022-07-21 17:22                   ` Eric Sunshine
  2022-07-21 17:23                   ` Junio C Hamano
  2022-07-22  6:44                 ` ZheNing Hu
  1 sibling, 2 replies; 61+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-07-21 15:54 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: ZheNing Hu via GitGitGadget, git, Christian Couder, Phillip Wood,
	Torsten Bögershausen, ZheNing Hu


On Wed, Jul 20 2022, Junio C Hamano wrote:

> "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> It's been quite many iterations, so I'll just comment on the range-diff.
>
>>      -+			usage_msg_opt("--format cannot used with -s, -o, -k, -t"
>>      ++			usage_msg_opt("--format cannot used with -s, -o, -k, -t, "
>>       +				      "--resolve-undo, --deduplicate, --eol",
>>       +				      ls_files_usage, builtin_ls_files_options);
>
> Looks good.

Although a nit I didn't spot before: missing _() & this should be marked
for translation, surely...

>>       +	git commit -m base
>>       +'
>>       +
>>       +test_expect_success 'git ls-files --format objectmode v.s. -s' '
>>      -+	git ls-files -s | awk "{print \$1}" >expect &&
>>      ++	git ls-files -s >files &&
>>      ++	cut -d" " -f1 files >expect &&
>
> Either "awk" or "cut" is fine and flipping between them is a bit
> distracting.  Cutting the pipe into two is a good move.

That "cut" suggestion saw mine, sorry about the churn...

> But is this testing the right thing?

On this...

>> +test_expect_success 'git ls-files --format objectmode v.s. -s' '
>> +	git ls-files -s >files &&
>> +	cut -d" " -f1 files >expect &&
>> +	git ls-files --format="%(objectmode)" >actual &&
>> +	test_cmp expect actual
>> +'
>
> It only looks at the first column of the "-s" output, and we are
> implicitly assuming that the order of output does not change between
> the "-s" output and "--format=<format>" output.  I wonder if it is
> more useful and less error prone to come up with a format string
> that 100% reproduces the "ls-files -s" output and compare the two,
> e.g. 
>
> 	format="%(objectmode) %(objectname) %(stage)	%(path)" &&
> 	git ls-files -s >expect &&
> 	git ls-files --format="$format" >actual &&
> 	test_cmp expect actual
>
> I do not know if the $format I wrote without looking at the doc is
> correct, but you get the idea.

Past rounds moved some tests towards that, maybe that's a good thing
here too I didn't look deeply this time around...

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v8] ls-files: introduce "--format" option
  2022-07-21 15:54                 ` Ævar Arnfjörð Bjarmason
@ 2022-07-21 17:22                   ` Eric Sunshine
  2022-07-21 17:23                   ` Junio C Hamano
  1 sibling, 0 replies; 61+ messages in thread
From: Eric Sunshine @ 2022-07-21 17:22 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Junio C Hamano, ZheNing Hu via GitGitGadget, Git List,
	Christian Couder, Phillip Wood, Torsten Bögershausen,
	ZheNing Hu

On Thu, Jul 21, 2022 at 12:07 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> On Wed, Jul 20 2022, Junio C Hamano wrote:
> > "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
> > It's been quite many iterations, so I'll just comment on the range-diff.
> >
> >>      -+                      usage_msg_opt("--format cannot used with -s, -o, -k, -t"
> >>      ++                      usage_msg_opt("--format cannot used with -s, -o, -k, -t, "
> >>       +                                    "--resolve-undo, --deduplicate, --eol",
> >>       +                                    ls_files_usage, builtin_ls_files_options);
> >
> > Looks good.
>
> Although a nit I didn't spot before: missing _() & this should be marked
> for translation, surely...

Probably also want: s/cannot used/cannot be used/

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v8] ls-files: introduce "--format" option
  2022-07-21 15:54                 ` Ævar Arnfjörð Bjarmason
  2022-07-21 17:22                   ` Eric Sunshine
@ 2022-07-21 17:23                   ` Junio C Hamano
  1 sibling, 0 replies; 61+ messages in thread
From: Junio C Hamano @ 2022-07-21 17:23 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: ZheNing Hu via GitGitGadget, git, Christian Couder, Phillip Wood,
	Torsten Bögershausen, ZheNing Hu

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>>>       +test_expect_success 'git ls-files --format objectmode v.s. -s' '
>>>      -+	git ls-files -s | awk "{print \$1}" >expect &&
>>>      ++	git ls-files -s >files &&
>>>      ++	cut -d" " -f1 files >expect &&
>>
>> Either "awk" or "cut" is fine and flipping between them is a bit
>> distracting.  Cutting the pipe into two is a good move.
>
> That "cut" suggestion saw mine, sorry about the churn...

As I said "cut" is perfectly fine.  Unless this part goes away,
(i.e. perhaps we may decide that it is a bad idea to check only the
pieces of lines), let's not flip back to awk ;-)

>> 	format="%(objectmode) %(objectname) %(stage)	%(path)" &&
>> 	git ls-files -s >expect &&
>> 	git ls-files --format="$format" >actual &&
>> 	test_cmp expect actual
>>
>> I do not know if the $format I wrote without looking at the doc is
>> correct, but you get the idea.
>
> Past rounds moved some tests towards that, maybe that's a good thing
> here too I didn't look deeply this time around...

OK, thanks for reviewing.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v8] ls-files: introduce "--format" option
  2022-07-20 17:37               ` Junio C Hamano
  2022-07-21 15:54                 ` Ævar Arnfjörð Bjarmason
@ 2022-07-22  6:44                 ` ZheNing Hu
  2022-07-23 18:40                   ` Junio C Hamano
  1 sibling, 1 reply; 61+ messages in thread
From: ZheNing Hu @ 2022-07-22  6:44 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Torsten Bögershausen

Junio C Hamano <gitster@pobox.com> 于2022年7月21日周四 01:37写道:
>
> >       +       git commit -m base
> >       +'
> >       +
> >       +test_expect_success 'git ls-files --format objectmode v.s. -s' '
> >      -+       git ls-files -s | awk "{print \$1}" >expect &&
> >      ++       git ls-files -s >files &&
> >      ++       cut -d" " -f1 files >expect &&
>
> Either "awk" or "cut" is fine and flipping between them is a bit
> distracting.  Cutting the pipe into two is a good move.
>
> But is this testing the right thing?
>

Yes, I am sure about that cut can do the same thing as awk, and it can
specify its delimiter.

> > +test_expect_success 'git ls-files --format objectmode v.s. -s' '
> > +     git ls-files -s >files &&
> > +     cut -d" " -f1 files >expect &&
> > +     git ls-files --format="%(objectmode)" >actual &&
> > +     test_cmp expect actual
> > +'
>
> It only looks at the first column of the "-s" output, and we are
> implicitly assuming that the order of output does not change between
> the "-s" output and "--format=<format>" output.  I wonder if it is
> more useful and less error prone to come up with a format string
> that 100% reproduces the "ls-files -s" output and compare the two,
> e.g.
>
>         format="%(objectmode) %(objectname) %(stage)    %(path)" &&
>         git ls-files -s >expect &&
>         git ls-files --format="$format" >actual &&
>         test_cmp expect actual
>

See test case: 'git ls-files --format imitate --stage' which just do such thing,
maybe I should change its name to 'git ls-files --format v.s. -s'?

> I do not know if the $format I wrote without looking at the doc is
> correct, but you get the idea.
>
> Thanks.
>
>

Thanks

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v9] ls-files: introduce "--format" option
  2022-07-20 16:36             ` [PATCH v8] " ZheNing Hu via GitGitGadget
  2022-07-20 17:37               ` Junio C Hamano
@ 2022-07-23  6:44               ` ZheNing Hu via GitGitGadget
  2022-09-08  2:01                 ` Jiang Xin
  1 sibling, 1 reply; 61+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2022-07-23  6:44 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Torsten Bögershausen, Eric Sunshine, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Add a new option "--format" that outputs index entries
informations in a custom format, taking inspiration
from the option with the same name in the `git ls-tree`
command.

"--format" cannot used with "-s", "-o", "-k", "-t",
" --resolve-undo","--deduplicate" and "--eol".

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
    ls-files: introduce "--format" options
    
    v7->v8:
    
     1. wrap the usage with _() and fix grammar error.
     2. fix test case title.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1262%2Fadlternative%2Fzh%2Fls-file-format-v9
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1262/adlternative/zh/ls-file-format-v9
Pull-Request: https://github.com/gitgitgadget/git/pull/1262

Range-diff vs v8:

 1:  e765ee28e90 ! 1:  df602d30bf4 ls-files: introduce "--format" option
     @@ builtin/ls-files.c: int cmd_ls_files(int argc, const char **argv, const char *cm
      +
      +	if (format && (show_stage || show_others || show_killed ||
      +		show_resolve_undo || skipping_duplicates || show_eol || show_tag))
     -+			usage_msg_opt("--format cannot used with -s, -o, -k, -t, "
     -+				      "--resolve-undo, --deduplicate, --eol",
     ++			usage_msg_opt(_("--format cannot be used with -s, -o, -k, -t, "
     ++				      "--resolve-undo, --deduplicate, --eol"),
      +				      ls_files_usage, builtin_ls_files_options);
      +
       	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
     @@ t/t3013-ls-files-format.sh (new)
      +	printf "LINEONE\nLINETWO\nLINETHREE\n" >o1.txt &&
      +	printf "LINEONE\r\nLINETWO\r\nLINETHREE\r\n" >o2.txt &&
      +	printf "LINEONE\r\nLINETWO\nLINETHREE\n" >o3.txt &&
     -+	git add . &&
     ++	git add o?.txt &&
      +	oid=$(git hash-object o1.txt) &&
      +	git update-index --add --cacheinfo 120000 $oid o4.txt &&
      +	git update-index --add --cacheinfo 160000 $oid o5.txt &&
     @@ t/t3013-ls-files-format.sh (new)
      +	test_cmp expect actual
      +'
      +
     -+test_expect_success 'git ls-files --format imitate --stage' '
     ++test_expect_success 'git ls-files --format v.s -s' '
      +	git ls-files --stage >expect &&
      +	git ls-files --format="%(objectmode) %(objectname) %(stage)%x09%(path)" >actual &&
      +	test_cmp expect actual


 Documentation/git-ls-files.txt | 39 +++++++++++++-
 builtin/ls-files.c             | 95 ++++++++++++++++++++++++++++++++++
 t/t3013-ls-files-format.sh     | 95 ++++++++++++++++++++++++++++++++++
 3 files changed, 228 insertions(+), 1 deletion(-)
 create mode 100755 t/t3013-ls-files-format.sh

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index 0dabf3f0ddc..d7986419c25 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -20,7 +20,7 @@ SYNOPSIS
 		[--exclude-standard]
 		[--error-unmatch] [--with-tree=<tree-ish>]
 		[--full-name] [--recurse-submodules]
-		[--abbrev[=<n>]] [--] [<file>...]
+		[--abbrev[=<n>]] [--format=<format>] [--] [<file>...]
 
 DESCRIPTION
 -----------
@@ -192,6 +192,13 @@ followed by the  ("attr/<eolattr>").
 	to the contained files. Sparse directories will be shown with a
 	trailing slash, such as "x/" for a sparse directory "x".
 
+--format=<format>::
+	A string that interpolates `%(fieldname)` from the result being shown.
+	It also interpolates `%%` to `%`, and `%xx` where `xx` are hex digits
+	interpolates to character with hex code `xx`; for example `%00`
+	interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
+	--format cannot be combined with `-s`, `-o`, `-k`, `-t`, `--resolve-undo`
+	and `--eol`.
 \--::
 	Do not interpret any more arguments as options.
 
@@ -223,6 +230,36 @@ quoted as explained for the configuration variable `core.quotePath`
 (see linkgit:git-config[1]).  Using `-z` the filename is output
 verbatim and the line is terminated by a NUL byte.
 
+It is possible to print in a custom format by using the `--format`
+option, which is able to interpolate different fields using
+a `%(fieldname)` notation. For example, if you only care about the
+"objectname" and "path" fields, you can execute with a specific
+"--format" like
+
+	git ls-files --format='%(objectname) %(path)'
+
+FIELD NAMES
+-----------
+The way each path is shown can be customized by using the
+`--format=<format>` option, where the %(fieldname) in the
+<format> string for various aspects of the index entry are
+interpolated.  The following "fieldname" are understood:
+
+objectmode::
+	The mode of the file which is recorded in the index.
+objectname::
+	The name of the file which is recorded in the index.
+stage::
+	The stage of the file which is recorded in the index.
+eolinfo:index::
+eolinfo:worktree::
+	The <eolinfo> (see the description of the `--eol` option) of
+	the contents in the index or in the worktree for the path.
+eolattr::
+	The <eolattr> (see the description of the `--eol` option)
+	that applies to the path.
+path::
+	The pathname of the file which is recorded in the index.
 
 EXCLUDE PATTERNS
 ----------------
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index e791b65e7e9..779dc18e59d 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -11,6 +11,7 @@
 #include "quote.h"
 #include "dir.h"
 #include "builtin.h"
+#include "strbuf.h"
 #include "tree.h"
 #include "cache-tree.h"
 #include "parse-options.h"
@@ -48,6 +49,7 @@ static char *ps_matched;
 static const char *with_tree;
 static int exc_given;
 static int exclude_args;
+static const char *format;
 
 static const char *tag_cached = "";
 static const char *tag_unmerged = "";
@@ -85,6 +87,16 @@ static void write_name(const char *name)
 				   stdout, line_terminator);
 }
 
+static void write_name_to_buf(struct strbuf *sb, const char *name)
+{
+	const char *rel = relative_path(name, prefix_len ? prefix : NULL, sb);
+
+	if (line_terminator)
+		quote_c_style(rel, sb, NULL, 0);
+	else
+		strbuf_addstr(sb, rel);
+}
+
 static const char *get_tag(const struct cache_entry *ce, const char *tag)
 {
 	static char alttag[4];
@@ -222,6 +234,73 @@ static void show_submodule(struct repository *superproject,
 	repo_clear(&subrepo);
 }
 
+struct show_index_data {
+	const char *pathname;
+	struct index_state *istate;
+	const struct cache_entry *ce;
+};
+
+static size_t expand_show_index(struct strbuf *sb, const char *start,
+				void *context)
+{
+	struct show_index_data *data = context;
+	const char *end;
+	const char *p;
+	size_t len = strbuf_expand_literal_cb(sb, start, NULL);
+	struct stat st;
+
+	if (len)
+		return len;
+	if (*start != '(')
+		die(_("bad ls-files format: element '%s' "
+		      "does not start with '('"), start);
+
+	end = strchr(start + 1, ')');
+	if (!end)
+		die(_("bad ls-files format: element '%s'"
+		      "does not end in ')'"), start);
+
+	len = end - start + 1;
+	if (skip_prefix(start, "(objectmode)", &p))
+		strbuf_addf(sb, "%06o", data->ce->ce_mode);
+	else if (skip_prefix(start, "(objectname)", &p))
+		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
+	else if (skip_prefix(start, "(stage)", &p))
+		strbuf_addf(sb, "%d", ce_stage(data->ce));
+	else if (skip_prefix(start, "(eolinfo:index)", &p))
+		strbuf_addstr(sb, S_ISREG(data->ce->ce_mode) ?
+			      get_cached_convert_stats_ascii(data->istate,
+			      data->ce->name) : "");
+	else if (skip_prefix(start, "(eolinfo:worktree)", &p))
+		strbuf_addstr(sb, !lstat(data->pathname, &st) &&
+			      S_ISREG(st.st_mode) ?
+			      get_wt_convert_stats_ascii(data->pathname) : "");
+	else if (skip_prefix(start, "(eolattr)", &p))
+		strbuf_addstr(sb, get_convert_attr_ascii(data->istate,
+			      data->pathname));
+	else if (skip_prefix(start, "(path)", &p))
+		write_name_to_buf(sb, data->pathname);
+	else
+		die(_("bad ls-files format: %%%.*s"), (int)len, start);
+
+	return len;
+}
+
+static void show_ce_fmt(struct repository *repo, const struct cache_entry *ce,
+			const char *format, const char *fullname) {
+	struct show_index_data data = {
+		.pathname = fullname,
+		.istate = repo->index,
+		.ce = ce,
+	};
+	struct strbuf sb = STRBUF_INIT;
+
+	strbuf_expand(&sb, format, expand_show_index, &data);
+	strbuf_addch(&sb, line_terminator);
+	fwrite(sb.buf, sb.len, 1, stdout);
+	strbuf_release(&sb);
+}
+
 static void show_ce(struct repository *repo, struct dir_struct *dir,
 		    const struct cache_entry *ce, const char *fullname,
 		    const char *tag)
@@ -236,6 +315,12 @@ static void show_ce(struct repository *repo, struct dir_struct *dir,
 				  max_prefix_len, ps_matched,
 				  S_ISDIR(ce->ce_mode) ||
 				  S_ISGITLINK(ce->ce_mode))) {
+		if (format) {
+			show_ce_fmt(repo, ce, format, fullname);
+			print_debug(ce);
+			return;
+		}
+
 		tag = get_tag(ce, tag);
 
 		if (!show_stage) {
@@ -675,6 +760,9 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 			 N_("suppress duplicate entries")),
 		OPT_BOOL(0, "sparse", &show_sparse_dirs,
 			 N_("show sparse directories in the presence of a sparse index")),
+		OPT_STRING_F(0, "format", &format, N_("format"),
+			     N_("format to use for the output"),
+			     PARSE_OPT_NONEG),
 		OPT_END()
 	};
 	int ret = 0;
@@ -699,6 +787,13 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 	for (i = 0; i < exclude_list.nr; i++) {
 		add_pattern(exclude_list.items[i].string, "", 0, pl, --exclude_args);
 	}
+
+	if (format && (show_stage || show_others || show_killed ||
+		show_resolve_undo || skipping_duplicates || show_eol || show_tag))
+			usage_msg_opt(_("--format cannot be used with -s, -o, -k, -t, "
+				      "--resolve-undo, --deduplicate, --eol"),
+				      ls_files_usage, builtin_ls_files_options);
+
 	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
 		tag_cached = "H ";
 		tag_unmerged = "M ";
diff --git a/t/t3013-ls-files-format.sh b/t/t3013-ls-files-format.sh
new file mode 100755
index 00000000000..efb7450bf1e
--- /dev/null
+++ b/t/t3013-ls-files-format.sh
@@ -0,0 +1,95 @@
+#!/bin/sh
+
+test_description='git ls-files --format test'
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+for flag in -s -o -k -t --resolve-undo --deduplicate --eol
+do
+	test_expect_success "usage: --format is incompatible with $flag" '
+		test_expect_code 129 git ls-files --format="%(objectname)" $flag
+	'
+done
+
+test_expect_success 'setup' '
+	printf "LINEONE\nLINETWO\nLINETHREE\n" >o1.txt &&
+	printf "LINEONE\r\nLINETWO\r\nLINETHREE\r\n" >o2.txt &&
+	printf "LINEONE\r\nLINETWO\nLINETHREE\n" >o3.txt &&
+	git add o?.txt &&
+	oid=$(git hash-object o1.txt) &&
+	git update-index --add --cacheinfo 120000 $oid o4.txt &&
+	git update-index --add --cacheinfo 160000 $oid o5.txt &&
+	git update-index --add --cacheinfo 100755 $oid o6.txt &&
+	git commit -m base
+'
+
+test_expect_success 'git ls-files --format objectmode v.s. -s' '
+	git ls-files -s >files &&
+	cut -d" " -f1 files >expect &&
+	git ls-files --format="%(objectmode)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format objectname v.s. -s' '
+	git ls-files -s >files &&
+	cut -d" " -f2 files >expect &&
+	git ls-files --format="%(objectname)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format v.s. --eol' '
+	git ls-files --eol >tmp &&
+	sed -e "s/	/ /g" -e "s/  */ /g" tmp >expect 2>err &&
+	test_must_be_empty err &&
+	git ls-files --format="i/%(eolinfo:index) w/%(eolinfo:worktree) attr/%(eolattr) %(path)" >actual 2>err &&
+	test_must_be_empty err &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format path v.s. -s' '
+	git ls-files -s >files &&
+	cut -f2 files >expect &&
+	git ls-files --format="%(path)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -m' '
+	echo change >o1.txt &&
+	cat >expect <<-\EOF &&
+	o1.txt
+	o4.txt
+	o5.txt
+	o6.txt
+	EOF
+	git ls-files --format="%(path)" -m >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -d' '
+	echo o7 >o7.txt &&
+	git add o7.txt &&
+	rm o7.txt &&
+	cat >expect <<-\EOF &&
+	o4.txt
+	o5.txt
+	o6.txt
+	o7.txt
+	EOF
+	git ls-files --format="%(path)" -d >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format v.s -s' '
+	git ls-files --stage >expect &&
+	git ls-files --format="%(objectmode) %(objectname) %(stage)%x09%(path)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with --debug' '
+	git ls-files --debug >expect &&
+	git ls-files --format="%(path)" --debug >actual &&
+	test_cmp expect actual
+'
+
+test_done

base-commit: ab336e8f1c8009c8b1aab8deb592148e69217085
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH v8] ls-files: introduce "--format" option
  2022-07-22  6:44                 ` ZheNing Hu
@ 2022-07-23 18:40                   ` Junio C Hamano
  2022-07-23 18:46                     ` Junio C Hamano
  2022-07-24 11:08                     ` ZheNing Hu
  0 siblings, 2 replies; 61+ messages in thread
From: Junio C Hamano @ 2022-07-23 18:40 UTC (permalink / raw)
  To: ZheNing Hu
  Cc: ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Torsten Bögershausen

ZheNing Hu <adlternative@gmail.com> writes:

>> But is this testing the right thing?
>
> Yes, I am sure about that cut can do the same thing as awk, and it can
> specify its delimiter.

That is not an answer to "is this testing the right thing?"
question, though ;-)

>> > +test_expect_success 'git ls-files --format objectmode v.s. -s' '
>> > +     git ls-files -s >files &&
>> > +     cut -d" " -f1 files >expect &&
>> > +     git ls-files --format="%(objectmode)" >actual &&
>> > +     test_cmp expect actual
>> > +'
>>
>> It only looks at the first column of the "-s" output, and we are
>> implicitly assuming that the order of output does not change between
>> the "-s" output and "--format=<format>" output.  I wonder if it is
>> more useful and less error prone to come up with a format string
>> that 100% reproduces the "ls-files -s" output and compare the two,
>> e.g.
>>
>>         format="%(objectmode) %(objectname) %(stage)    %(path)" &&
>>         git ls-files -s >expect &&
>>         git ls-files --format="$format" >actual &&
>>         test_cmp expect actual
>>
>
> See test case: 'git ls-files --format imitate --stage' which just do such thing,


That was not the point.  By extracting only "%(objectmode)" without
having any other clues (like "%(path)") on the same line, the test
is assuming that ls-files will always sort its output in the same
order regardless of the output format, whether it is "--stage" or
"--format=<spec>", and that was what the "is this testing the right
thing?" question was about.

The other test that makes sure --format=<spec> can recreate --stage
output is fine.  If some future developer breaks the output order by
mistake for --format=<spec>, we will catch such a mistake with it.


> maybe I should change its name to 'git ls-files --format v.s. -s'?

I do not think you should.  "A v.s. B" does not imply "A and B
should create identical result".  The original title describes what
it does much more clearly.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v8] ls-files: introduce "--format" option
  2022-07-23 18:40                   ` Junio C Hamano
@ 2022-07-23 18:46                     ` Junio C Hamano
  2022-07-24 11:08                     ` ZheNing Hu
  1 sibling, 0 replies; 61+ messages in thread
From: Junio C Hamano @ 2022-07-23 18:46 UTC (permalink / raw)
  To: ZheNing Hu
  Cc: ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Torsten Bögershausen

Junio C Hamano <gitster@pobox.com> writes:

> That was not the point.  By extracting only "%(objectmode)" without
> having any other clues (like "%(path)") on the same line, the test
> is assuming that ls-files will always sort its output in the same
> order regardless of the output format, whether it is "--stage" or
> "--format=<spec>", and that was what the "is this testing the right
> thing?" question was about.
>
> The other test that makes sure --format=<spec> can recreate --stage
> output is fine.  If some future developer breaks the output order by
> mistake for --format=<spec>, we will catch such a mistake with it.

Having said that, let's stop rerolling the series just for this.  An
extra test that may not catch potential breakage is fine and it is
not worth spending an extra review cycle only to remove it.

Thanks.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v8] ls-files: introduce "--format" option
  2022-07-23 18:40                   ` Junio C Hamano
  2022-07-23 18:46                     ` Junio C Hamano
@ 2022-07-24 11:08                     ` ZheNing Hu
  2022-07-25  1:03                       ` Junio C Hamano
  1 sibling, 1 reply; 61+ messages in thread
From: ZheNing Hu @ 2022-07-24 11:08 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Torsten Bögershausen

Junio C Hamano <gitster@pobox.com> 于2022年7月24日周日 02:40写道:
>
> ZheNing Hu <adlternative@gmail.com> writes:
>
> >> But is this testing the right thing?
> >
> > Yes, I am sure about that cut can do the same thing as awk, and it can
> > specify its delimiter.
>
> That is not an answer to "is this testing the right thing?"
> question, though ;-)
>
> >> > +test_expect_success 'git ls-files --format objectmode v.s. -s' '
> >> > +     git ls-files -s >files &&
> >> > +     cut -d" " -f1 files >expect &&
> >> > +     git ls-files --format="%(objectmode)" >actual &&
> >> > +     test_cmp expect actual
> >> > +'
> >>
> >> It only looks at the first column of the "-s" output, and we are
> >> implicitly assuming that the order of output does not change between
> >> the "-s" output and "--format=<format>" output.  I wonder if it is
> >> more useful and less error prone to come up with a format string
> >> that 100% reproduces the "ls-files -s" output and compare the two,
> >> e.g.
> >>
> >>         format="%(objectmode) %(objectname) %(stage)    %(path)" &&
> >>         git ls-files -s >expect &&
> >>         git ls-files --format="$format" >actual &&
> >>         test_cmp expect actual
> >>
> >
> > See test case: 'git ls-files --format imitate --stage' which just do such thing,
>
>
> That was not the point.  By extracting only "%(objectmode)" without
> having any other clues (like "%(path)") on the same line, the test
> is assuming that ls-files will always sort its output in the same
> order regardless of the output format, whether it is "--stage" or
> "--format=<spec>", and that was what the "is this testing the right
> thing?" question was about.
>

Ah, so that we should sort the ls-files output first, and then compare them.

> The other test that makes sure --format=<spec> can recreate --stage
> output is fine.  If some future developer breaks the output order by
> mistake for --format=<spec>, we will catch such a mistake with it.
>
>
> > maybe I should change its name to 'git ls-files --format v.s. -s'?
>
> I do not think you should.  "A v.s. B" does not imply "A and B
> should create identical result".  The original title describes what
> it does much more clearly.

Ok, here I don't need another rerolling to revert it, right?

Thanks for all the reviews!

ZheNing Hu

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v8] ls-files: introduce "--format" option
  2022-07-24 11:08                     ` ZheNing Hu
@ 2022-07-25  1:03                       ` Junio C Hamano
  2022-07-25 11:00                         ` ZheNing Hu
  0 siblings, 1 reply; 61+ messages in thread
From: Junio C Hamano @ 2022-07-25  1:03 UTC (permalink / raw)
  To: ZheNing Hu
  Cc: ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Torsten Bögershausen

ZheNing Hu <adlternative@gmail.com> writes:

>> That was not the point.  By extracting only "%(objectmode)" without
>> having any other clues (like "%(path)") on the same line, the test
>> is assuming that ls-files will always sort its output in the same
>> order regardless of the output format, whether it is "--stage" or
>> "--format=<spec>", and that was what the "is this testing the right
>> thing?" question was about.
>>
>
> Ah, so that we should sort the ls-files output first, and then compare them.

Imagine that there are three paths in the index and "ls-files -s"
gives

    100644 1234... 0 path1
    100644 2345... 0 path2
    100755 3456... 0 path3

but a bug causes "ls-files --format=<spec>" to show entries in a
wrong order, e.g. first for path2 and then for path1 and then for
path3.  If the test used enough fields (like the one that mimics the
full output of "ls-files -s"), then the output may be

    100644 2345... 0 path2
    100644 1234... 0 path1
    100755 3456... 0 path3

and you would notice that it is different from "ls-files -s".

But if the test only used %(objectmode), then the faulty output from
"ls-files --format=%(objectmode)" would be

    100644
    100644
    100755

that matches the "ls-files -s | cut -d' ' -f1"

If you sort, then such a breakage will become even harder to
notice.  If the faulty output showed path3 first and then path2 and
then path1, the raw output from "ls-files --format=%(objectmode)" may
be 100755/100644/100644, but if you sort it, no matter what the
broken order is, you will always get 100644/100644/100755.

So, no, we shouldn't sort.  If ls-files were allowed to show output
in any random order, then sorting the output before comparing is a
good strategy, but that does not apply here.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v8] ls-files: introduce "--format" option
  2022-07-25  1:03                       ` Junio C Hamano
@ 2022-07-25 11:00                         ` ZheNing Hu
  0 siblings, 0 replies; 61+ messages in thread
From: ZheNing Hu @ 2022-07-25 11:00 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Torsten Bögershausen

Junio C Hamano <gitster@pobox.com> 于2022年7月25日周一 09:03写道:
>
> ZheNing Hu <adlternative@gmail.com> writes:
>
> >> That was not the point.  By extracting only "%(objectmode)" without
> >> having any other clues (like "%(path)") on the same line, the test
> >> is assuming that ls-files will always sort its output in the same
> >> order regardless of the output format, whether it is "--stage" or
> >> "--format=<spec>", and that was what the "is this testing the right
> >> thing?" question was about.
> >>
> >
> > Ah, so that we should sort the ls-files output first, and then compare them.
>
> Imagine that there are three paths in the index and "ls-files -s"
> gives
>
>     100644 1234... 0 path1
>     100644 2345... 0 path2
>     100755 3456... 0 path3
>
> but a bug causes "ls-files --format=<spec>" to show entries in a
> wrong order, e.g. first for path2 and then for path1 and then for
> path3.  If the test used enough fields (like the one that mimics the
> full output of "ls-files -s"), then the output may be
>
>     100644 2345... 0 path2
>     100644 1234... 0 path1
>     100755 3456... 0 path3
>
> and you would notice that it is different from "ls-files -s".
>
> But if the test only used %(objectmode), then the faulty output from
> "ls-files --format=%(objectmode)" would be
>
>     100644
>     100644
>     100755
>
> that matches the "ls-files -s | cut -d' ' -f1"
>
> If you sort, then such a breakage will become even harder to
> notice.  If the faulty output showed path3 first and then path2 and
> then path1, the raw output from "ls-files --format=%(objectmode)" may
> be 100755/100644/100644, but if you sort it, no matter what the
> broken order is, you will always get 100644/100644/100755.
>
> So, no, we shouldn't sort.  If ls-files were allowed to show output
> in any random order, then sorting the output before comparing is a
> good strategy, but that does not apply here.

I get what you mean. So test 'git ls-files --format imitate --stage'
can help for
checking it, because every line content is different (maybe different <path>,
or the same <path> with different <stage>,<objectmode>...), we can find the
--format "disorder bug" with ease.

ZheNing Hu

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v9] ls-files: introduce "--format" option
  2022-07-23  6:44               ` [PATCH v9] " ZheNing Hu via GitGitGadget
@ 2022-09-08  2:01                 ` Jiang Xin
  2022-09-11 11:01                   ` ZheNing Hu
  0 siblings, 1 reply; 61+ messages in thread
From: Jiang Xin @ 2022-09-08  2:01 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: Git List, Junio C Hamano, Christian Couder,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Torsten Bögershausen, Eric Sunshine, ZheNing Hu

On Sat, Jul 23, 2022 at 2:54 PM ZheNing Hu via GitGitGadget
<gitgitgadget@gmail.com> wrote:

> diff --git a/builtin/ls-files.c b/builtin/ls-files.c
> index e791b65e7e9..779dc18e59d 100644
> --- a/builtin/ls-files.c
> +++ b/builtin/ls-files.c
> @@ -222,6 +234,73 @@ static void show_submodule(struct repository *superproject,
>         repo_clear(&subrepo);
>  }
>
> +struct show_index_data {
> +       const char *pathname;
> +       struct index_state *istate;
> +       const struct cache_entry *ce;
> +};
> +
> +static size_t expand_show_index(struct strbuf *sb, const char *start,
> +                               void *context)
> +{
> +       struct show_index_data *data = context;
> +       const char *end;
> +       const char *p;
> +       size_t len = strbuf_expand_literal_cb(sb, start, NULL);
> +       struct stat st;
> +
> +       if (len)
> +               return len;
> +       if (*start != '(')
> +               die(_("bad ls-files format: element '%s' "

Good, the last space acts as a separator between two lines.

> +                     "does not start with '('"), start);
> +
> +       end = strchr(start + 1, ')');
> +       if (!end)
> +               die(_("bad ls-files format: element '%s'"

Missing the last space to seperate two lines, and this leads to wrong
l10n message. See:

    https://github.com/git-l10n/pot-changes/blob/pot/main/2022-08-03.diff#L70

--
Jiang Xin

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v9] ls-files: introduce "--format" option
  2022-09-08  2:01                 ` Jiang Xin
@ 2022-09-11 11:01                   ` ZheNing Hu
  0 siblings, 0 replies; 61+ messages in thread
From: ZheNing Hu @ 2022-09-11 11:01 UTC (permalink / raw)
  To: Jiang Xin
  Cc: ZheNing Hu via GitGitGadget, Git List, Junio C Hamano,
	Christian Couder, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Torsten Bögershausen, Eric Sunshine

Jiang Xin <worldhello.net@gmail.com> 于2022年9月8日周四 10:01写道:
>
> On Sat, Jul 23, 2022 at 2:54 PM ZheNing Hu via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>
> > diff --git a/builtin/ls-files.c b/builtin/ls-files.c
> > index e791b65e7e9..779dc18e59d 100644
> > --- a/builtin/ls-files.c
> > +++ b/builtin/ls-files.c
> > @@ -222,6 +234,73 @@ static void show_submodule(struct repository *superproject,
> >         repo_clear(&subrepo);
> >  }
> >
> > +struct show_index_data {
> > +       const char *pathname;
> > +       struct index_state *istate;
> > +       const struct cache_entry *ce;
> > +};
> > +
> > +static size_t expand_show_index(struct strbuf *sb, const char *start,
> > +                               void *context)
> > +{
> > +       struct show_index_data *data = context;
> > +       const char *end;
> > +       const char *p;
> > +       size_t len = strbuf_expand_literal_cb(sb, start, NULL);
> > +       struct stat st;
> > +
> > +       if (len)
> > +               return len;
> > +       if (*start != '(')
> > +               die(_("bad ls-files format: element '%s' "
>
> Good, the last space acts as a separator between two lines.
>
> > +                     "does not start with '('"), start);
> > +
> > +       end = strchr(start + 1, ')');
> > +       if (!end)
> > +               die(_("bad ls-files format: element '%s'"
>
> Missing the last space to seperate two lines, and this leads to wrong
> l10n message. See:
>

Thank you for pointing out the error, I will fix it quickly.

>     https://github.com/git-l10n/pot-changes/blob/pot/main/2022-08-03.diff#L70
>
> --
> Jiang Xin

Thanks,
ZheNing Hu

^ permalink raw reply	[flat|nested] 61+ messages in thread

end of thread, other threads:[~2022-09-11 11:01 UTC | newest]

Thread overview: 61+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-15 13:45 [PATCH 0/2] ls-files: introduce "--format" and "--object-only" options ZheNing Hu via GitGitGadget
2022-06-15 13:45 ` [PATCH 1/2] ls-files: introduce "--format" option ZheNing Hu via GitGitGadget
2022-06-15 20:07   ` Ævar Arnfjörð Bjarmason
2022-06-18 10:50     ` ZheNing Hu
2022-06-15 13:45 ` [PATCH 2/2] ls-files: introduce "--object-only" option ZheNing Hu via GitGitGadget
2022-06-15 20:15   ` Ævar Arnfjörð Bjarmason
2022-06-18 10:59     ` ZheNing Hu
2022-06-19  9:13 ` [PATCH v2] ls-files: introduce "--format" option ZheNing Hu via GitGitGadget
2022-06-19 13:50   ` Phillip Wood
2022-06-20 13:32     ` ZheNing Hu
2022-06-21  2:05   ` [PATCH v3] " ZheNing Hu via GitGitGadget
2022-06-23 14:06     ` Phillip Wood
2022-06-23 15:57       ` Junio C Hamano
2022-06-24 10:16         ` Phillip Wood
2022-06-26 13:05           ` ZheNing Hu
2022-06-24 13:20         ` Ævar Arnfjörð Bjarmason
2022-06-24 15:28           ` Junio C Hamano
2022-06-26 13:01       ` ZheNing Hu
2022-06-24 13:25     ` Ævar Arnfjörð Bjarmason
2022-06-24 15:33       ` Junio C Hamano
2022-06-26 13:35         ` ZheNing Hu
2022-06-27  8:22           ` Junio C Hamano
2022-06-27 11:06             ` ZheNing Hu
2022-06-27 15:41               ` Junio C Hamano
2022-07-01 13:30                 ` ZheNing Hu
2022-06-26 13:34       ` ZheNing Hu
2022-06-26 15:29     ` [PATCH v4] " ZheNing Hu via GitGitGadget
2022-06-27  8:32       ` Junio C Hamano
2022-06-27 11:18         ` ZheNing Hu
2022-06-27 18:34       ` Ævar Arnfjörð Bjarmason
2022-07-01 12:42         ` ZheNing Hu
2022-06-28 15:19       ` Phillip Wood
2022-07-01 12:47         ` ZheNing Hu
2022-07-05  6:32       ` [PATCH v5] " ZheNing Hu via GitGitGadget
2022-07-05  8:39         ` Ævar Arnfjörð Bjarmason
2022-07-11 15:14           ` ZheNing Hu
2022-07-05 19:28         ` Torsten Bögershausen
2022-07-11 15:27           ` ZheNing Hu
2022-07-11 16:53         ` [PATCH v6] " ZheNing Hu via GitGitGadget
2022-07-11 22:11           ` Junio C Hamano
2022-07-12 13:53             ` ZheNing Hu
2022-07-12 14:34               ` Junio C Hamano
2022-07-13  6:07           ` [PATCH v7] " ZheNing Hu via GitGitGadget
2022-07-18  8:09             ` Ævar Arnfjörð Bjarmason
2022-07-19 16:19               ` ZheNing Hu
2022-07-19 16:47                 ` Junio C Hamano
2022-07-19 17:21                   ` ZheNing Hu
2022-07-20 16:36             ` [PATCH v8] " ZheNing Hu via GitGitGadget
2022-07-20 17:37               ` Junio C Hamano
2022-07-21 15:54                 ` Ævar Arnfjörð Bjarmason
2022-07-21 17:22                   ` Eric Sunshine
2022-07-21 17:23                   ` Junio C Hamano
2022-07-22  6:44                 ` ZheNing Hu
2022-07-23 18:40                   ` Junio C Hamano
2022-07-23 18:46                     ` Junio C Hamano
2022-07-24 11:08                     ` ZheNing Hu
2022-07-25  1:03                       ` Junio C Hamano
2022-07-25 11:00                         ` ZheNing Hu
2022-07-23  6:44               ` [PATCH v9] " ZheNing Hu via GitGitGadget
2022-09-08  2:01                 ` Jiang Xin
2022-09-11 11:01                   ` ZheNing Hu

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).