git@vger.kernel.org list mirror (unofficial, one of many)
* [PATCH 0/2] ls-files: introduce "--format" and "--object-only" options
@ 2022-06-15 13:45 ZheNing Hu via GitGitGadget
  2022-06-15 13:45 ` [PATCH 1/2] ls-files: introduce "--format" option ZheNing Hu via GitGitGadget
                   ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2022-06-15 13:45 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder,
	Ævar Arnfjörð Bjarmason, ZheNing Hu

Sometimes we need to extract custom information from git index
entries. Add a new "--format" option to "git ls-files" that can do
this, and add a new "--object-only" option that is an alias for
"--format=%(objectname)".

The original discussion is here:
https://lore.kernel.org/git/pull.1250.v2.git.1654778272871.gitgitgadget@gmail.com/

ZheNing Hu (2):
  ls-files: introduce "--format" option
  ls-files: introduce "--object-only" option

 Documentation/git-ls-files.txt |  59 ++++++++++-
 builtin/ls-files.c             | 160 +++++++++++++++++++++++++++++-
 t/t3013-ls-files-format.sh     | 176 +++++++++++++++++++++++++++++++++
 3 files changed, 390 insertions(+), 5 deletions(-)
 create mode 100755 t/t3013-ls-files-format.sh


base-commit: ab336e8f1c8009c8b1aab8deb592148e69217085
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1262%2Fadlternative%2Fzh%2Fls-file-format-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1262/adlternative/zh/ls-file-format-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1262
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 1/2] ls-files: introduce "--format" option
  2022-06-15 13:45 [PATCH 0/2] ls-files: introduce "--format" and "--object-only" options ZheNing Hu via GitGitGadget
@ 2022-06-15 13:45 ` ZheNing Hu via GitGitGadget
  2022-06-15 20:07   ` Ævar Arnfjörð Bjarmason
  2022-06-15 13:45 ` [PATCH 2/2] ls-files: introduce "--object-only" option ZheNing Hu via GitGitGadget
  2022-06-19  9:13 ` [PATCH v2] ls-files: introduce "--format" option ZheNing Hu via GitGitGadget
  2 siblings, 1 reply; 30+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2022-06-15 13:45 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder,
	Ævar Arnfjörð Bjarmason, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Add a new option --format that outputs index entry
information in a custom format, taking inspiration
from the option with the same name in the `git ls-tree`
command.

--format cannot be used with -s, -o, -k, --resolve-undo,
--deduplicate or --debug.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
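A minimal standalone sketch of the interpolation idea, for illustration
only. It is not the code in this patch: `struct entry`, `hexval()`,
`match()` and `expand()` are hypothetical names, only four of the fields
are handled, and the `%xNN` spelling follows the tests below (`%x09`,
`%x0a`). Errors are reported by returning -1 where the patch calls die().

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an index entry; not git's struct cache_entry. */
struct entry {
	unsigned int mode;
	const char *objectname;
	int stage;
	const char *path;
};

/* Value of one hex digit, or -1 if ch is not a hex digit. */
static int hexval(char ch)
{
	int c = tolower((unsigned char)ch);

	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	return -1;
}

/* Length of prefix if s starts with it, else 0. */
static size_t match(const char *s, const char *prefix)
{
	size_t n = strlen(prefix);

	return strncmp(s, prefix, n) ? 0 : n;
}

/*
 * Expand fmt for entry e into out (capacity cap, NUL-terminated).
 * Returns the number of bytes written, or -1 on an unknown element
 * or if the result does not fit.
 */
static long expand(const char *fmt, const struct entry *e,
		   char *out, size_t cap)
{
	size_t n = 0;

	assert(fmt && e && out && cap);
	while (*fmt) {
		char tmp[32];
		const char *piece = tmp;
		size_t len = 1, skip;

		if (*fmt != '%') {
			tmp[0] = *fmt;		/* ordinary character */
			skip = 1;
		} else if (fmt[1] == '%') {
			tmp[0] = '%';		/* "%%" -> "%" */
			skip = 2;
		} else if (fmt[1] == 'x' && hexval(fmt[2]) >= 0 &&
			   hexval(fmt[3]) >= 0) {
			/* "%xNN" -> the byte with hex code NN */
			tmp[0] = (char)(hexval(fmt[2]) * 16 + hexval(fmt[3]));
			skip = 4;
		} else if ((skip = match(fmt, "%(objectmode)"))) {
			len = (size_t)snprintf(tmp, sizeof(tmp), "%06o", e->mode);
		} else if ((skip = match(fmt, "%(objectname)"))) {
			piece = e->objectname;
			len = strlen(piece);
		} else if ((skip = match(fmt, "%(stage)"))) {
			len = (size_t)snprintf(tmp, sizeof(tmp), "%d", e->stage);
		} else if ((skip = match(fmt, "%(path)"))) {
			piece = e->path;
			len = strlen(piece);
		} else {
			return -1;	/* unknown or malformed element */
		}
		if (n + len >= cap)
			return -1;	/* no room left for the trailing NUL */
		memcpy(out + n, piece, len);
		n += len;
		fmt += skip;
	}
	out[n] = '\0';
	return (long)n;
}
```

With an entry whose mode is 0100644, stage 0 and path "o1", the format
"%(objectmode) %(stage)%x09%(path)" expands to "100644 0<TAB>o1", and
an unknown field such as "%(bogus)" makes expand() return -1.
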
 Documentation/git-ls-files.txt |  51 +++++++++++-
 builtin/ls-files.c             | 126 ++++++++++++++++++++++++++++-
 t/t3013-ls-files-format.sh     | 142 +++++++++++++++++++++++++++++++++
 3 files changed, 315 insertions(+), 4 deletions(-)
 create mode 100755 t/t3013-ls-files-format.sh

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index 0dabf3f0ddc..b22860ec8c0 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -20,7 +20,7 @@ SYNOPSIS
 		[--exclude-standard]
 		[--error-unmatch] [--with-tree=<tree-ish>]
 		[--full-name] [--recurse-submodules]
-		[--abbrev[=<n>]] [--] [<file>...]
+		[--abbrev[=<n>]] [--format=<format>] [--] [<file>...]
 
 DESCRIPTION
 -----------
@@ -192,6 +192,13 @@ followed by the  ("attr/<eolattr>").
 	to the contained files. Sparse directories will be shown with a
 	trailing slash, such as "x/" for a sparse directory "x".
 
+--format=<format>::
+	A string that interpolates %(fieldname) from the result being shown.
+	It also interpolates `%%` to `%`, and `%xx` where `xx` are hex digits
+	interpolates to character with hex code `xx`; for example `%00`
+	interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
+	--format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo`,
+	`--debug`.
 \--::
 	Do not interpret any more arguments as options.
 
@@ -223,6 +230,48 @@ quoted as explained for the configuration variable `core.quotePath`
 (see linkgit:git-config[1]).  Using `-z` the filename is output
 verbatim and the line is terminated by a NUL byte.
 
+It is possible to print in a custom format by using the `--format`
+option, which is able to interpolate different fields using
+a `%(fieldname)` notation. For example, if you only care about the
+"objectname" and "path" fields, you can execute with a specific
+"--format" like
+
+	git ls-files --format='%(objectname) %(path)'
+
+FIELD NAMES
+-----------
+Various values from structured fields can be used to interpolate
+into the resulting output. For each outputting line, the following
+names can be used:
+
+tag::
+	The tag of file status.
+objectmode::
+	The mode of the object.
+objectname::
+	The name of the object.
+stage::
+	The stage of the file.
+eol::
+	The line endings of files.
+path::
+	The pathname of the object.
+ctime::
+	The create time of file.
+mtime::
+	The modify time of file.
+dev::
+	The ID of device containing file.
+ino::
+	The inode number of file.
+uid::
+	The user id of file owner.
+gid::
+	The group id of file owner.
+size::
+	The size of the file.
+flags::
+	The flags of the file.
 
 EXCLUDE PATTERNS
 ----------------
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index e791b65e7e9..9dd6c55eeb9 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -11,6 +11,7 @@
 #include "quote.h"
 #include "dir.h"
 #include "builtin.h"
+#include "strbuf.h"
 #include "tree.h"
 #include "cache-tree.h"
 #include "parse-options.h"
@@ -48,6 +49,7 @@ static char *ps_matched;
 static const char *with_tree;
 static int exc_given;
 static int exclude_args;
+static const char *format;
 
 static const char *tag_cached = "";
 static const char *tag_unmerged = "";
@@ -58,8 +60,8 @@ static const char *tag_modified = "";
 static const char *tag_skip_worktree = "";
 static const char *tag_resolve_undo = "";
 
-static void write_eolinfo(struct index_state *istate,
-			  const struct cache_entry *ce, const char *path)
+static void write_eolinfo_internal(struct strbuf *sb, struct index_state *istate,
+				   const struct cache_entry *ce, const char *path)
 {
 	if (show_eol) {
 		struct stat st;
@@ -71,10 +73,25 @@ static void write_eolinfo(struct index_state *istate,
 							       ce->name);
 		if (!lstat(path, &st) && S_ISREG(st.st_mode))
 			w_txt = get_wt_convert_stats_ascii(path);
-		printf("i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
+		if (sb)
+			strbuf_addf(sb, "i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
+		else
+			printf("i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
 	}
 }
 
+static void write_eolinfo(struct index_state *istate,
+			  const struct cache_entry *ce, const char *path)
+{
+	write_eolinfo_internal(NULL, istate, ce, path);
+}
+
+static void write_eolinfo_to_buf(struct strbuf *sb, struct index_state *istate,
+				 const struct cache_entry *ce, const char *path)
+{
+	write_eolinfo_internal(sb, istate, ce, path);
+}
+
 static void write_name(const char *name)
 {
 	/*
@@ -85,6 +102,16 @@ static void write_name(const char *name)
 				   stdout, line_terminator);
 }
 
+static void write_name_to_buf(struct strbuf *sb, const char *name)
+{
+	name = relative_path(name, prefix_len ? prefix : NULL, sb);
+	if (line_terminator) {
+		quote_c_style(name, sb, NULL, 0);
+	} else {
+		strbuf_add(sb, name, strlen(name));
+	}
+}
+
 static const char *get_tag(const struct cache_entry *ce, const char *tag)
 {
 	static char alttag[4];
@@ -222,6 +249,86 @@ static void show_submodule(struct repository *superproject,
 	repo_clear(&subrepo);
 }
 
+struct show_index_data {
+	const char *tag;
+	const char *pathname;
+	struct index_state *istate;
+	const struct cache_entry *ce;
+};
+
+static size_t expand_show_index(struct strbuf *sb, const char *start,
+			       void *context)
+{
+	struct show_index_data *data = context;
+	const char *end;
+	const char *p;
+	unsigned int errlen;
+	const struct stat_data *sd = &data->ce->ce_stat_data;
+	size_t len = strbuf_expand_literal_cb(sb, start, NULL);
+	if (len)
+		return len;
+	if (*start != '(')
+		die(_("bad ls-files format: element '%s' does not start with '('"), start);
+
+	end = strchr(start + 1, ')');
+	if (!end)
+		die(_("bad ls-files format: element '%s' does not end in ')'"), start);
+
+	len = end - start + 1;
+	if (skip_prefix(start, "(tag)", &p)) {
+		strbuf_addstr(sb, get_tag(data->ce, data->tag));
+	} else if (skip_prefix(start, "(objectmode)", &p)) {
+		strbuf_addf(sb, "%06o", data->ce->ce_mode);
+	} else if (skip_prefix(start, "(objectname)", &p)) {
+		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
+	} else if (skip_prefix(start, "(stage)", &p)) {
+		strbuf_addf(sb, "%d", ce_stage(data->ce));
+	} else if (skip_prefix(start, "(eol)", &p)) {
+		write_eolinfo_to_buf(sb, data->istate, data->ce, data->pathname);
+	} else if (skip_prefix(start, "(path)", &p)) {
+		write_name_to_buf(sb, data->pathname);
+	} else if (skip_prefix(start, "(ctime)", &p)) {
+		strbuf_addf(sb, "ctime: %u:%u", sd->sd_ctime.sec, sd->sd_ctime.nsec);
+	} else if (skip_prefix(start, "(mtime)", &p)) {
+		strbuf_addf(sb, "mtime: %u:%u", sd->sd_mtime.sec, sd->sd_mtime.nsec);
+	} else if (skip_prefix(start, "(dev)", &p)) {
+		strbuf_addf(sb, "dev: %u", sd->sd_dev);
+	} else if (skip_prefix(start, "(ino)", &p)) {
+		strbuf_addf(sb, "ino: %u", sd->sd_ino);
+	} else if (skip_prefix(start, "(uid)", &p)) {
+		strbuf_addf(sb, "uid: %u", sd->sd_uid);
+	} else if (skip_prefix(start, "(gid)", &p)) {
+		strbuf_addf(sb, "gid: %u", sd->sd_gid);
+	} else if (skip_prefix(start, "(size)", &p)) {
+		strbuf_addf(sb, "size: %u", sd->sd_size);
+	} else if (skip_prefix(start, "(flags)", &p)) {
+		strbuf_addf(sb, "flags: %x", data->ce->ce_flags);
+	} else {
+		errlen = (unsigned long)len;
+		die(_("bad ls-files format: %%%.*s"), errlen, start);
+	}
+
+	return len;
+}
+
+static void show_ce_fmt(struct repository *repo, const struct cache_entry *ce,
+			const char *format, const char *fullname, const char *tag) {
+
+	struct show_index_data data = {
+		.tag = tag,
+		.pathname = fullname,
+		.istate = repo->index,
+		.ce = ce,
+	};
+
+	struct strbuf sb = STRBUF_INIT;
+	strbuf_expand(&sb, format, expand_show_index, &data);
+	strbuf_addch(&sb, line_terminator);
+	fwrite(sb.buf, sb.len, 1, stdout);
+	strbuf_release(&sb);
+	return;
+}
+
 static void show_ce(struct repository *repo, struct dir_struct *dir,
 		    const struct cache_entry *ce, const char *fullname,
 		    const char *tag)
@@ -236,6 +343,11 @@ static void show_ce(struct repository *repo, struct dir_struct *dir,
 				  max_prefix_len, ps_matched,
 				  S_ISDIR(ce->ce_mode) ||
 				  S_ISGITLINK(ce->ce_mode))) {
+		if (format) {
+			show_ce_fmt(repo, ce, format, fullname, tag);
+			return;
+		}
+
 		tag = get_tag(ce, tag);
 
 		if (!show_stage) {
@@ -675,6 +787,9 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 			 N_("suppress duplicate entries")),
 		OPT_BOOL(0, "sparse", &show_sparse_dirs,
 			 N_("show sparse directories in the presence of a sparse index")),
+		OPT_STRING_F(0, "format", &format, N_("format"),
+					 N_("format to use for the output"),
+					 PARSE_OPT_NONEG),
 		OPT_END()
 	};
 	int ret = 0;
@@ -699,6 +814,11 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 	for (i = 0; i < exclude_list.nr; i++) {
 		add_pattern(exclude_list.items[i].string, "", 0, pl, --exclude_args);
 	}
+
+	if (format && (show_stage || show_others || show_killed ||
+		show_resolve_undo || skipping_duplicates || debug_mode))
+			die(_("ls-files --format cannot used with -s, -o, -k, --resolve-undo, --deduplicate, --debug"));
+
 	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
 		tag_cached = "H ";
 		tag_unmerged = "M ";
diff --git a/t/t3013-ls-files-format.sh b/t/t3013-ls-files-format.sh
new file mode 100755
index 00000000000..61a2e68713a
--- /dev/null
+++ b/t/t3013-ls-files-format.sh
@@ -0,0 +1,142 @@
+#!/bin/sh
+
+test_description='git ls-files --format test'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	echo o1 >o1 &&
+	echo o2 >o2 &&
+	git add o1 o2 &&
+	git add --chmod +x o1 &&
+	git commit -m base
+'
+
+test_expect_success 'git ls-files --format tag' '
+	printf "H \nH \n" >expect &&
+	git ls-files --format="%(tag)" -t >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format objectmode' '
+	cat >expect <<-EOF &&
+	100755
+	100644
+	EOF
+	git ls-files --format="%(objectmode)" -t >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format objectname' '
+	oid1=$(git hash-object o1) &&
+	oid2=$(git hash-object o2) &&
+	cat >expect <<-EOF &&
+	$oid1
+	$oid2
+	EOF
+	git ls-files --format="%(objectname)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format eol' '
+	printf "i/lf    w/lf    attr/                 \t\n" >expect &&
+	printf "i/lf    w/lf    attr/                 \t\n" >>expect &&
+	git ls-files --format="%(eol)" --eol >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format path' '
+	cat >expect <<-EOF &&
+	o1
+	o2
+	EOF
+	git ls-files --format="%(path)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format ctime' '
+	git ls-files --debug | grep ctime >expect &&
+	git ls-files --format="  %(ctime)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format mtime' '
+	git ls-files --debug | grep mtime >expect &&
+	git ls-files --format="  %(mtime)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format dev and ino' '
+	git ls-files --debug | grep dev >expect &&
+	git ls-files --format="  %(dev)%x09%(ino)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format uid and gid' '
+	git ls-files --debug | grep uid >expect &&
+	git ls-files --format="  %(uid)%x09%(gid)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -m' '
+	echo change >o1 &&
+	cat >expect <<-EOF &&
+	o1
+	EOF
+	git ls-files --format="%(path)" -m >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -d' '
+	rm o1 &&
+	test_when_finished "git restore o1" &&
+	cat >expect <<-EOF &&
+	o1
+	EOF
+	git ls-files --format="%(path)" -d >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format size and flags' '
+	git ls-files --debug | grep size >expect &&
+	git ls-files --format="  %(size)%x09%(flags)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format imitate --stage' '
+	git ls-files --stage >expect &&
+	git ls-files --format="%(objectmode) %(objectname) %(stage)%x09%(path)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format imitate --debug' '
+	git ls-files --debug >expect &&
+	git ls-files --format="%(path)%x0a  %(ctime)%x0a  %(mtime)%x0a  %(dev)%x09%(ino)%x0a  %(uid)%x09%(gid)%x0a  %(size)%x09%(flags)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -s must fail' '
+	test_must_fail git ls-files --format="%(objectname)" -s
+'
+
+test_expect_success 'git ls-files --format with -o must fail' '
+	test_must_fail git ls-files --format="%(objectname)" -o
+'
+
+test_expect_success 'git ls-files --format with -k must fail' '
+	test_must_fail git ls-files --format="%(objectname)" -k
+'
+
+test_expect_success 'git ls-files --format with --resolve-undo must fail' '
+	test_must_fail git ls-files --format="%(objectname)" --resolve-undo
+'
+
+test_expect_success 'git ls-files --format with --deduplicate must fail' '
+	test_must_fail git ls-files --format="%(objectname)" --deduplicate
+'
+
+test_expect_success 'git ls-files --format with --debug must fail' '
+	test_must_fail git ls-files --format="%(objectname)" --debug
+'
+
+test_done
-- 
gitgitgadget



* [PATCH 2/2] ls-files: introduce "--object-only" option
  2022-06-15 13:45 [PATCH 0/2] ls-files: introduce "--format" and "--object-only" options ZheNing Hu via GitGitGadget
  2022-06-15 13:45 ` [PATCH 1/2] ls-files: introduce "--format" option ZheNing Hu via GitGitGadget
@ 2022-06-15 13:45 ` ZheNing Hu via GitGitGadget
  2022-06-15 20:15   ` Ævar Arnfjörð Bjarmason
  2022-06-19  9:13 ` [PATCH v2] ls-files: introduce "--format" option ZheNing Hu via GitGitGadget
  2 siblings, 1 reply; 30+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2022-06-15 13:45 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder,
	Ævar Arnfjörð Bjarmason, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

--object-only is an alias for --format=%(objectname),
which outputs the object name of each index entry, taking
inspiration from the option with the same name in
the `git ls-tree` command.

--object-only cannot be used with --format, -s, -o,
-k, --resolve-undo, --deduplicate or --debug.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
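A minimal standalone sketch of how the alias and the conflict checks
fit together, for illustration only. `struct opts`, `mode_format` and
`resolve_format()` are hypothetical names, the lookup indexes an array
by the enum value instead of looping over a mode/format table as the
patch does, and errors are returned as codes where the patch calls
die().

```c
#include <assert.h>
#include <stddef.h>

enum cmdmode {
	MODE_DEFAULT = 0,
	MODE_OBJECT_ONLY,
};

/* Hypothetical bundle of the parsed options that matter here. */
struct opts {
	const char *format;	/* value of --format, or NULL */
	enum cmdmode mode;	/* set by --object-only */
	int show_stage, show_others, show_killed;
	int show_resolve_undo, skipping_duplicates, debug_mode;
};

/* Format string implied by each mode; NULL means "no custom format". */
static const char *const mode_format[] = {
	[MODE_DEFAULT] = NULL,
	[MODE_OBJECT_ONLY] = "%(objectname)",
};

/*
 * Work out the effective format string.
 * Returns 0 and stores it in *effective on success,
 * -1 if --format is combined with a format-altering mode,
 * -2 if a format is combined with -s, -o, -k, --resolve-undo,
 *    --deduplicate or --debug.
 */
static int resolve_format(const struct opts *o, const char **effective)
{
	const char *fmt;

	assert(o && effective);
	fmt = o->format;
	if (fmt && o->mode != MODE_DEFAULT)
		return -1;
	if (!fmt)
		fmt = mode_format[o->mode];
	if (fmt && (o->show_stage || o->show_others || o->show_killed ||
		    o->show_resolve_undo || o->skipping_duplicates ||
		    o->debug_mode))
		return -2;
	*effective = fmt;
	return 0;
}
```

In this sketch, --object-only alone resolves to "%(objectname)",
--object-only together with --format fails with -1, and --object-only
together with -s fails with -2.
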
 Documentation/git-ls-files.txt |  8 +++++++-
 builtin/ls-files.c             | 36 +++++++++++++++++++++++++++++++++-
 t/t3013-ls-files-format.sh     | 34 ++++++++++++++++++++++++++++++++
 3 files changed, 76 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index b22860ec8c0..c3f46bb821b 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -13,7 +13,7 @@ SYNOPSIS
 		[-c|--cached] [-d|--deleted] [-o|--others] [-i|--|ignored]
 		[-s|--stage] [-u|--unmerged] [-k|--|killed] [-m|--modified]
 		[--directory [--no-empty-directory]] [--eol]
-		[--deduplicate]
+		[--deduplicate] [--object-only]
 		[-x <pattern>|--exclude=<pattern>]
 		[-X <file>|--exclude-from=<file>]
 		[--exclude-per-directory=<file>]
@@ -199,6 +199,12 @@ followed by the  ("attr/<eolattr>").
 	interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
 	--format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo`,
 	`--debug`.
+
+--object-only::
+	List only names of the objects, one per line. This is equivalent
+	to specifying `--format='%(objectname)'`. Cannot be combined with
+	`--format=<format>`.
+
 \--::
 	Do not interpret any more arguments as options.
 
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 9dd6c55eeb9..4ac8f34baac 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -60,6 +60,27 @@ static const char *tag_modified = "";
 static const char *tag_skip_worktree = "";
 static const char *tag_resolve_undo = "";
 
+static enum ls_files_cmdmode {
+	MODE_DEFAULT = 0,
+	MODE_OBJECT_ONLY,
+} ls_files_cmdmode;
+
+struct ls_files_cmdmodee_to_fmt {
+	enum ls_files_cmdmode mode;
+	const char *const fmt;
+};
+
+static struct ls_files_cmdmodee_to_fmt ls_files_cmdmode_format[] = {
+	{
+		.mode = MODE_DEFAULT,
+		.fmt = NULL,
+	},
+	{
+		.mode = MODE_OBJECT_ONLY,
+		.fmt = "%(objectname)",
+	},
+};
+
 static void write_eolinfo_internal(struct strbuf *sb, struct index_state *istate,
 				   const struct cache_entry *ce, const char *path)
 {
@@ -747,6 +768,8 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 			DIR_SHOW_IGNORED),
 		OPT_BOOL('s', "stage", &show_stage,
 			N_("show staged contents' object name in the output")),
+		OPT_CMDMODE(0, "object-only", &ls_files_cmdmode, N_("list only objects"),
+			    MODE_OBJECT_ONLY),
 		OPT_BOOL('k', "killed", &show_killed,
 			N_("show files on the filesystem that need to be removed")),
 		OPT_BIT(0, "directory", &dir.flags,
@@ -815,9 +838,20 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 		add_pattern(exclude_list.items[i].string, "", 0, pl, --exclude_args);
 	}
 
+	if (format && ls_files_cmdmode)
+		die(_("--format can't be combined with other format-altering options"));
+
+	for (i = 0; !format && i < ARRAY_SIZE(ls_files_cmdmode_format); i++) {
+		if (ls_files_cmdmode == ls_files_cmdmode_format[i].mode) {
+			format = ls_files_cmdmode_format[i].fmt;
+			break;
+		}
+	}
+
 	if (format && (show_stage || show_others || show_killed ||
 		show_resolve_undo || skipping_duplicates || debug_mode))
-			die(_("ls-files --format cannot used with -s, -o, -k, --resolve-undo, --deduplicate, --debug"));
+		die(_("ls-files --format or other format-altering options "
+		      "cannot used with -s, -o, -k, --resolve-undo, --deduplicate, --debug"));
 
 	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
 		tag_cached = "H ";
diff --git a/t/t3013-ls-files-format.sh b/t/t3013-ls-files-format.sh
index 61a2e68713a..1c982ea13e0 100755
--- a/t/t3013-ls-files-format.sh
+++ b/t/t3013-ls-files-format.sh
@@ -139,4 +139,38 @@ test_expect_success 'git ls-files --format with --debug must fail' '
 	test_must_fail git ls-files --format="%(objectname)" --debug
 '
 
+test_expect_success 'git ls-files --object-only equal to --format=%(objectname)' '
+	git ls-files --format="%(objectname)" >expect &&
+	git ls-files --object-only >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --object-only with --format must fail' '
+	test_must_fail git ls-files --format="%(path)" --object-only
+'
+
+test_expect_success 'git ls-files --object-only with -s must fail' '
+	test_must_fail git ls-files --object-only -s
+'
+
+test_expect_success 'git ls-files --object-only with -o must fail' '
+	test_must_fail git ls-files --object-only -o
+'
+
+test_expect_success 'git ls-files --object-only with -k must fail' '
+	test_must_fail git ls-files --object-only -k
+'
+
+test_expect_success 'git ls-files --object-only with --resolve-undo must fail' '
+	test_must_fail git ls-files --object-only --resolve-undo
+'
+
+test_expect_success 'git ls-files --object-only with --deduplicate must fail' '
+	test_must_fail git ls-files --object-only --deduplicate
+'
+
+test_expect_success 'git ls-files --object-only with --debug must fail' '
+	test_must_fail git ls-files --object-only --debug
+'
+
 test_done
-- 
gitgitgadget


* Re: [PATCH 1/2] ls-files: introduce "--format" option
  2022-06-15 13:45 ` [PATCH 1/2] ls-files: introduce "--format" option ZheNing Hu via GitGitGadget
@ 2022-06-15 20:07   ` Ævar Arnfjörð Bjarmason
  2022-06-18 10:50     ` ZheNing Hu
  0 siblings, 1 reply; 30+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-15 20:07 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Junio C Hamano, Christian Couder, ZheNing Hu


On Wed, Jun 15 2022, ZheNing Hu via GitGitGadget wrote:

> From: ZheNing Hu <adlternative@gmail.com>

Thanks a lot for pursuing this, this looks good & is much smaller than I
thought, just some nits below:

> Add a new option --format that outputs index entry
> information in a custom format, taking inspiration
> from the option with the same name in the `git ls-tree`
> command.
>
> --format cannot be used with -s, -o, -k, --resolve-undo,
> --deduplicate or --debug.
>
> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---
>  Documentation/git-ls-files.txt |  51 +++++++++++-
>  builtin/ls-files.c             | 126 ++++++++++++++++++++++++++++-
>  t/t3013-ls-files-format.sh     | 142 +++++++++++++++++++++++++++++++++
>  3 files changed, 315 insertions(+), 4 deletions(-)
>  create mode 100755 t/t3013-ls-files-format.sh
>
> diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
> index 0dabf3f0ddc..b22860ec8c0 100644
> --- a/Documentation/git-ls-files.txt
> +++ b/Documentation/git-ls-files.txt
> @@ -20,7 +20,7 @@ SYNOPSIS
>  		[--exclude-standard]
>  		[--error-unmatch] [--with-tree=<tree-ish>]
>  		[--full-name] [--recurse-submodules]
> -		[--abbrev[=<n>]] [--] [<file>...]
> +		[--abbrev[=<n>]] [--format=<format>] [--] [<file>...]
>  
>  DESCRIPTION
>  -----------
> @@ -192,6 +192,13 @@ followed by the  ("attr/<eolattr>").
>  	to the contained files. Sparse directories will be shown with a
>  	trailing slash, such as "x/" for a sparse directory "x".
>  
> +--format=<format>::
> +	A string that interpolates %(fieldname) from the result being shown.

Missing `` for %(fieldname)?

> +	It also interpolates `%%` to `%`, and `%xx` where `xx` are hex digits
> +	interpolates to character with hex code `xx`; for example `%00`
> +	interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
> +	--format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo`,

Replace that last "," with " or ".

> +	`--debug`.
>  \--::
>  	Do not interpret any more arguments as options.
>  
> @@ -223,6 +230,48 @@ quoted as explained for the configuration variable `core.quotePath`
>  (see linkgit:git-config[1]).  Using `-z` the filename is output
>  verbatim and the line is terminated by a NUL byte.
>  
> +It is possible to print in a custom format by using the `--format`
> +option, which is able to interpolate different fields using
> +a `%(fieldname)` notation. For example, if you only care about the
> +"objectname" and "path" fields, you can execute with a specific
> +"--format" like
> +
> +	git ls-files --format='%(objectname) %(path)'
> +
> +FIELD NAMES
> +-----------
> +Various values from structured fields can be used to interpolate
> +into the resulting output. For each outputting line, the following
> +names can be used:
> +
> +tag::
> +	The tag of file status.
> +objectmode::
> +	The mode of the object.
> +objectname::
> +	The name of the object.
> +stage::
> +	The stage of the file.
> +eol::
> +	The line endings of files.
> +path::
> +	The pathname of the object.
> +ctime::
> +	The create time of file.
> +mtime::
> +	The modify time of file.
> +dev::
> +	The ID of device containing file.
> +ino::
> +	The inode number of file.
> +uid::
> +	The user id of file owner.
> +gid::
> +	The group id of file owner.
> +size::
> +	The size of the file.
> +flags::
> +	The flags of the file.
>  
>  EXCLUDE PATTERNS
>  ----------------
> diff --git a/builtin/ls-files.c b/builtin/ls-files.c
> index e791b65e7e9..9dd6c55eeb9 100644
> --- a/builtin/ls-files.c
> +++ b/builtin/ls-files.c
> @@ -11,6 +11,7 @@
>  #include "quote.h"
>  #include "dir.h"
>  #include "builtin.h"
> +#include "strbuf.h"
>  #include "tree.h"
>  #include "cache-tree.h"
>  #include "parse-options.h"
> @@ -48,6 +49,7 @@ static char *ps_matched;
>  static const char *with_tree;
>  static int exc_given;
>  static int exclude_args;
> +static const char *format;
>  
>  static const char *tag_cached = "";
>  static const char *tag_unmerged = "";
> @@ -58,8 +60,8 @@ static const char *tag_modified = "";
>  static const char *tag_skip_worktree = "";
>  static const char *tag_resolve_undo = "";
>  
> -static void write_eolinfo(struct index_state *istate,
> -			  const struct cache_entry *ce, const char *path)
> +static void write_eolinfo_internal(struct strbuf *sb, struct index_state *istate,
> +				   const struct cache_entry *ce, const char *path)
>  {
>  	if (show_eol) {
>  		struct stat st;
> @@ -71,10 +73,25 @@ static void write_eolinfo(struct index_state *istate,
>  							       ce->name);
>  		if (!lstat(path, &st) && S_ISREG(st.st_mode))
>  			w_txt = get_wt_convert_stats_ascii(path);
> -		printf("i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
> +		if (sb)
> +			strbuf_addf(sb, "i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
> +		else
> +			printf("i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
>  	}
>  }
>  
> +static void write_eolinfo(struct index_state *istate,
> +			  const struct cache_entry *ce, const char *path)
> +{
> +	write_eolinfo_internal(NULL, istate, ce, path);
> +}
> +
> +static void write_eolinfo_to_buf(struct strbuf *sb, struct index_state *istate,
> +				 const struct cache_entry *ce, const char *path)
> +{
> +	write_eolinfo_internal(sb, istate, ce, path);
> +}
> +
>  static void write_name(const char *name)
>  {
>  	/*
> @@ -85,6 +102,16 @@ static void write_name(const char *name)
>  				   stdout, line_terminator);
>  }
>  
> +static void write_name_to_buf(struct strbuf *sb, const char *name)
> +{
> +	name = relative_path(name, prefix_len ? prefix : NULL, sb);

FWIW I'd find this a bit less "huh?" if we declared another variable
here, so just:

	const char *rel = relative_path(name, ...).

> +	if (line_terminator) {
> +		quote_c_style(name, sb, NULL, 0);
> +	} else {
> +		strbuf_add(sb, name, strlen(name));
> +	}

Can drop the {} braces here for if/else, see CodingGuidelines.

> +}
> +
>  static const char *get_tag(const struct cache_entry *ce, const char *tag)
>  {
>  	static char alttag[4];
> @@ -222,6 +249,86 @@ static void show_submodule(struct repository *superproject,
>  	repo_clear(&subrepo);
>  }
>  
> +struct show_index_data {
> +	const char *tag;
> +	const char *pathname;
> +	struct index_state *istate;
> +	const struct cache_entry *ce;
> +};
> +
> +static size_t expand_show_index(struct strbuf *sb, const char *start,
> +			       void *context)
> +{
> +	struct show_index_data *data = context;
> +	const char *end;
> +	const char *p;
> +	unsigned int errlen;
> +	const struct stat_data *sd = &data->ce->ce_stat_data;
> +	size_t len = strbuf_expand_literal_cb(sb, start, NULL);
> +	if (len)
> +		return len;
> +	if (*start != '(')
> +		die(_("bad ls-files format: element '%s' does not start with '('"), start);
> +
> +	end = strchr(start + 1, ')');
> +	if (!end)
> +		die(_("bad ls-files format: element '%s' does not end in ')'"), start);
> +
> +	len = end - start + 1;
> +	if (skip_prefix(start, "(tag)", &p)) {

Style nit, I'd much rather see us drop the {} on the whole if/else if
chain here, which we can do if...

> +		strbuf_addstr(sb, get_tag(data->ce, data->tag));
> +	} else if (skip_prefix(start, "(objectmode)", &p)) {
> +		strbuf_addf(sb, "%06o", data->ce->ce_mode);
> +	} else if (skip_prefix(start, "(objectname)", &p)) {
> +		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
> +	} else if (skip_prefix(start, "(stage)", &p)) {
> +		strbuf_addf(sb, "%d", ce_stage(data->ce));
> +	} else if (skip_prefix(start, "(eol)", &p)) {
> +		write_eolinfo_to_buf(sb, data->istate, data->ce, data->pathname);
> +	} else if (skip_prefix(start, "(path)", &p)) {
> +		write_name_to_buf(sb, data->pathname);
> +	} else if (skip_prefix(start, "(ctime)", &p)) {
> +		strbuf_addf(sb, "ctime: %u:%u", sd->sd_ctime.sec, sd->sd_ctime.nsec);
> +	} else if (skip_prefix(start, "(mtime)", &p)) {
> +		strbuf_addf(sb, "mtime: %u:%u", sd->sd_mtime.sec, sd->sd_mtime.nsec);

(too long lines, keep within 79 chars?)

> +	} else if (skip_prefix(start, "(dev)", &p)) {
> +		strbuf_addf(sb, "dev: %u", sd->sd_dev);
> +	} else if (skip_prefix(start, "(ino)", &p)) {
> +		strbuf_addf(sb, "ino: %u", sd->sd_ino);
> +	} else if (skip_prefix(start, "(uid)", &p)) {
> +		strbuf_addf(sb, "uid: %u", sd->sd_uid);
> +	} else if (skip_prefix(start, "(gid)", &p)) {
> +		strbuf_addf(sb, "gid: %u", sd->sd_gid);
> +	} else if (skip_prefix(start, "(size)", &p)) {
> +		strbuf_addf(sb, "size: %u", sd->sd_size);
> +	} else if (skip_prefix(start, "(flags)", &p)) {
> +		strbuf_addf(sb, "flags: %x", data->ce->ce_flags);
> +	} else {
> +		errlen = (unsigned long)len;
> +		die(_("bad ls-files format: %%%.*s"), errlen, start);

We just line-wrap the "(unsigned long)len" here, which seems worth it
for less line noise :)
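
For reference, the `*` precision of `%.*s` consumes an `int` argument.
A minimal standalone sketch of passing a `size_t` length directly in
the call, without a temporary; `format_bad_element()` is a made-up
helper and snprintf() stands in for die():

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Write "bad format: %" followed by the first len bytes of start
 * into out. Returns what snprintf() returns.
 */
static int format_bad_element(char *out, size_t cap,
			      const char *start, size_t len)
{
	assert(out && start);
	/* "%.*s" expects an int precision, so clamp and cast the size_t. */
	return snprintf(out, cap, "bad format: %%%.*s",
			len > INT_MAX ? INT_MAX : (int)len, start);
}
```

Called with start pointing at "(bogus) rest" and len 7, this writes
"bad format: %(bogus)".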

> +	}
> +
> +	return len;
> +}
> +
> +static void show_ce_fmt(struct repository *repo, const struct cache_entry *ce,
> +			const char *format, const char *fullname, const char *tag) {
> +
> +	struct show_index_data data = {
> +		.tag = tag,
> +		.pathname = fullname,
> +		.istate = repo->index,
> +		.ce = ce,
> +	};
> +
> +	struct strbuf sb = STRBUF_INIT;
> +	strbuf_expand(&sb, format, expand_show_index, &data);
> +	strbuf_addch(&sb, line_terminator);
> +	fwrite(sb.buf, sb.len, 1, stdout);
> +	strbuf_release(&sb);
> +	return;
> +}
> +
>  static void show_ce(struct repository *repo, struct dir_struct *dir,
>  		    const struct cache_entry *ce, const char *fullname,
>  		    const char *tag)
> @@ -236,6 +343,11 @@ static void show_ce(struct repository *repo, struct dir_struct *dir,
>  				  max_prefix_len, ps_matched,
>  				  S_ISDIR(ce->ce_mode) ||
>  				  S_ISGITLINK(ce->ce_mode))) {
> +		if (format) {
> +			show_ce_fmt(repo, ce, format, fullname, tag);
> +			return;
> +		}
> +
>  		tag = get_tag(ce, tag);
>  
>  		if (!show_stage) {
> @@ -675,6 +787,9 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
>  			 N_("suppress duplicate entries")),
>  		OPT_BOOL(0, "sparse", &show_sparse_dirs,
>  			 N_("show sparse directories in the presence of a sparse index")),
> +		OPT_STRING_F(0, "format", &format, N_("format"),
> +					 N_("format to use for the output"),
> +					 PARSE_OPT_NONEG),

Odd indentation?

>  		OPT_END()
>  	};
>  	int ret = 0;
> @@ -699,6 +814,11 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
>  	for (i = 0; i < exclude_list.nr; i++) {
>  		add_pattern(exclude_list.items[i].string, "", 0, pl, --exclude_args);
>  	}
> +
> +	if (format && (show_stage || show_others || show_killed ||
> +		show_resolve_undo || skipping_duplicates || debug_mode))
> +			die(_("ls-files --format cannot used with -s, -o, -k, --resolve-undo, --deduplicate, --debug"));

Good to check this.

> +
>  	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
>  		tag_cached = "H ";
>  		tag_unmerged = "M ";
> diff --git a/t/t3013-ls-files-format.sh b/t/t3013-ls-files-format.sh
> new file mode 100755
> index 00000000000..61a2e68713a
> --- /dev/null
> +++ b/t/t3013-ls-files-format.sh
> @@ -0,0 +1,142 @@
> +#!/bin/sh
> +
> +test_description='git ls-files --format test'
> +
> +. ./test-lib.sh
> +
> +test_expect_success 'setup' '
> +	echo o1 >o1 &&
> +	echo o2 >o2 &&
> +	git add o1 o2 &&
> +	git add --chmod +x o1 &&
> +	git commit -m base
> +'
> +
> +test_expect_success 'git ls-files --format tag' '
> +	printf "H \nH \n" >expect &&
> +	git ls-files --format="%(tag)" -t >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format objectmode' '
> +	cat >expect <<-EOF &&
> +	100755
> +	100644
> +	EOF
> +	git ls-files --format="%(objectmode)" -t >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format objectname' '
> +	oid1=$(git hash-object o1) &&
> +	oid2=$(git hash-object o2) &&
> +	cat >expect <<-EOF &&
> +	$oid1
> +	$oid2
> +	EOF
> +	git ls-files --format="%(objectname)" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format eol' '
> +	printf "i/lf    w/lf    attr/                 \t\n" >expect &&
> +	printf "i/lf    w/lf    attr/                 \t\n" >>expect &&
> +	git ls-files --format="%(eol)" --eol >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format path' '
> +	cat >expect <<-EOF &&
> +	o1
> +	o2
> +	EOF
> +	git ls-files --format="%(path)" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format ctime' '
> +	git ls-files --debug | grep ctime >expect &&

For this and the rest: don't put git on the left-hand side of a "|", as
that hides its exit code (and potential segfaults).

Instead e.g.:

    git ... >out &&
    grep ctime out >expect &&
    ...
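
To illustrate why the pipe form is a problem, here is a minimal sketch that
does not use git at all. "flaky_cmd" is a hypothetical stand-in for a command
that prints useful output and then fails; the sketch assumes the shell's
default behavior (no "pipefail"):

```shell
# Hypothetical stand-in for a git command that prints output and then fails.
flaky_cmd () {
	echo "  ctime: 1:2"
	return 1
}

tmp=$(mktemp -d)

# Show the default behavior even if the caller had enabled pipefail.
(set +o pipefail) 2>/dev/null && set +o pipefail

# Pipeline: only grep's exit status is visible, so the failure is hidden.
if flaky_cmd | grep ctime >"$tmp/expect"
then
	pipe_status=0
else
	pipe_status=$?
fi

# Split form: the producer's own exit status is checked.
if flaky_cmd >"$tmp/out"
then
	split_status=0
else
	split_status=$?
fi

echo "pipeline: $pipe_status, split: $split_status"
# prints "pipeline: 0, split: 1"
```

In a test's "&&" chain the split form would stop at the failing command,
while the pipeline form would carry on as if nothing had gone wrong.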

> +	git ls-files --format="  %(ctime)" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format mtime' '
> +	git ls-files --debug | grep mtime >expect &&

ditto here & below.

> +	git ls-files --format="  %(mtime)" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format dev and ino' '
> +	git ls-files --debug | grep dev >expect &&
> +	git ls-files --format="  %(dev)%x09%(ino)" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format uid and gid' '
> +	git ls-files --debug | grep uid >expect &&
> +	git ls-files --format="  %(uid)%x09%(gid)" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format with -m' '
> +	echo change >o1 &&
> +	cat >expect <<-EOF &&

When not using variables, use <<-\EOF; this applies to the rest as well.
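
A small standalone sketch of the difference (the variable "name" and the
text are made up for illustration): with an unquoted delimiter the shell
expands the here-document body, while a quoted delimiter such as \EOF keeps
it literal, which tells the reader that no expansion is intended.

```shell
name=o1

# Unquoted delimiter: $name is expanded inside the here-document.
expanded=$(cat <<-EOF
path is $name
EOF
)

# Quoted delimiter: the body is taken literally, "$name" stays as written.
literal=$(cat <<-\EOF
path is $name
EOF
)

echo "$expanded"
echo "$literal"
```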

> +	o1
> +	EOF
> +	git ls-files --format="%(path)" -m >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format with -d' '
> +	rm o1 &&

Don't "rm o1" here, rather have the test that creates it do:

    test_when_finished "rm o1" &&
    [the command that creates o1]
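
The point of registering the cleanup before the file is created is that it
still runs if a later step fails. As a rough analogy outside of the test
suite (this uses a plain "trap ... EXIT" in a subshell, not
test_when_finished itself, and the file name "o3" is just an example):

```shell
workdir=$(mktemp -d)

# Run the "test body" in a subshell; the EXIT trap plays the role that
# test_when_finished plays in the test suite (an analogy, not the same code).
if (
	cd "$workdir" &&
	trap 'rm -f o3' EXIT &&   # register cleanup before creating the file
	echo o3 >o3 &&
	false                     # simulate a later step of the test failing
)
then
	body_status=0
else
	body_status=$?
fi

# The file is gone even though the body failed part-way through.
if test -e "$workdir/o3"
then
	leftover=yes
else
	leftover=no
fi
rmdir "$workdir"
echo "body_status=$body_status leftover=$leftover"
```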

> +	test_when_finished "git restore o1" &&
> +	cat >expect <<-EOF &&
> +	o1
> +	EOF
> +	git ls-files --format="%(path)" -d >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format size and flags' '
> +	git ls-files --debug | grep size >expect &&
> +	git ls-files --format="  %(size)%x09%(flags)" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format imitate --stage' '
> +	git ls-files --stage >expect &&
> +	git ls-files --format="%(objectmode) %(objectname) %(stage)%x09%(path)" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git ls-files --format imitate --debug' '
> +	git ls-files --debug >expect &&
> +	git ls-files --format="%(path)%x0a  %(ctime)%x0a  %(mtime)%x0a  %(dev)%x09%(ino)%x0a  %(uid)%x09%(gid)%x0a  %(size)%x09%(flags)" >actual &&
> +	test_cmp expect actual
> +'

These tests...:

> +test_expect_success 'git ls-files --format with -s must fail' '
> +	test_must_fail git ls-files --format="%(objectname)" -s
> +'
> +
> +test_expect_success 'git ls-files --format with -o must fail' '
> +	test_must_fail git ls-files --format="%(objectname)" -o
> +'
> +
> +test_expect_success 'git ls-files --format with -k must fail' '
> +	test_must_fail git ls-files --format="%(objectname)" -k
> +'
> +
> +test_expect_success 'git ls-files --format with --resolve-undo must fail' '
> +	test_must_fail git ls-files --format="%(objectname)" --resolve-undo
> +'
> +
> +test_expect_success 'git ls-files --format with --deduplicate must fail' '
> +	test_must_fail git ls-files --format="%(objectname)" --deduplicate
> +'
> +
> +test_expect_success 'git ls-files --format with --debug must fail' '
> +	test_must_fail git ls-files --format="%(objectname)" --debug
> +'

...would be better done with a for-loop, so:

	for flag in -s -o -k --resolve-undo [...]
	do
		test_expect_success "git ls-files --format is incompatible with $flag" '
			test_must_fail git ls-files --format="%(objectname)" $flag
		'
	done

Note the '' around the second argument; that's intentional. Since the
test body is eval'd, $flag is expanded at that point and you don't need
"" there.
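
A standalone sketch of that behavior, with a hypothetical "run_case" helper
standing in for test_expect_success (the only assumption is that it evals
its second argument, as described above):

```shell
# Hypothetical stand-in for test_expect_success: it evals its second argument.
run_case () {
	echo "running: $1"
	eval "$2"
}

seen=
for flag in -s -o -k
do
	# The description is double-quoted, so $flag is expanded right here.
	# The body is single-quoted, so $flag is expanded only when eval'd,
	# at which point the loop variable still holds the current value.
	run_case "case for $flag" '
		seen="${seen:+$seen }$flag"
	'
done

echo "$seen"
```

Each eval'd body sees the value of $flag for its own iteration, so the
single quotes lose nothing compared to double quotes here.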

> +test_done


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/2] ls-files: introduce "--object-only" option
  2022-06-15 13:45 ` [PATCH 2/2] ls-files: introduce "--object-only" option ZheNing Hu via GitGitGadget
@ 2022-06-15 20:15   ` Ævar Arnfjörð Bjarmason
  2022-06-18 10:59     ` ZheNing Hu
  0 siblings, 1 reply; 30+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-15 20:15 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Junio C Hamano, Christian Couder, ZheNing Hu


On Wed, Jun 15 2022, ZheNing Hu via GitGitGadget wrote:

> From: ZheNing Hu <adlternative@gmail.com>
>
> --object-only is an alias for --format=%(objectname),
> which output objectname of index entries, taking
> inspiration from the option with the same name in
> the `git ls-tree` command.
>
> --object-only cannot be used with --format, and -s, -o,
> -k, --resolve-undo, --deduplicate, --debug.
>
> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---
>  Documentation/git-ls-files.txt |  8 +++++++-
>  builtin/ls-files.c             | 36 +++++++++++++++++++++++++++++++++-
>  t/t3013-ls-files-format.sh     | 34 ++++++++++++++++++++++++++++++++
>  3 files changed, 76 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
> index b22860ec8c0..c3f46bb821b 100644
> --- a/Documentation/git-ls-files.txt
> +++ b/Documentation/git-ls-files.txt
> @@ -13,7 +13,7 @@ SYNOPSIS
>  		[-c|--cached] [-d|--deleted] [-o|--others] [-i|--|ignored]
>  		[-s|--stage] [-u|--unmerged] [-k|--|killed] [-m|--modified]
>  		[--directory [--no-empty-directory]] [--eol]
> -		[--deduplicate]
> +		[--deduplicate] [--object-only]
>  		[-x <pattern>|--exclude=<pattern>]
>  		[-X <file>|--exclude-from=<file>]
>  		[--exclude-per-directory=<file>]
> @@ -199,6 +199,12 @@ followed by the  ("attr/<eolattr>").
>  	interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
>  	--format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo`,
>  	`--debug`.
> +
> +--object-only::
> +	List only names of the objects, one per line. This is equivalent
> +	to specifying `--format='%(objectname)'`. Cannot be combined with
> +	`--format=<format>`.
> +
>  \--::
>  	Do not interpret any more arguments as options.
>  
> diff --git a/builtin/ls-files.c b/builtin/ls-files.c
> index 9dd6c55eeb9..4ac8f34baac 100644
> --- a/builtin/ls-files.c
> +++ b/builtin/ls-files.c
> @@ -60,6 +60,27 @@ static const char *tag_modified = "";
>  static const char *tag_skip_worktree = "";
>  static const char *tag_resolve_undo = "";
>  
> +static enum ls_files_cmdmode {
> +	MODE_DEFAULT = 0,
> +	MODE_OBJECT_ONLY,
> +} ls_files_cmdmode;
> +
> +struct ls_files_cmdmodee_to_fmt {
> +	enum ls_files_cmdmode mode;
> +	const char *const fmt;
> +};
> +
> +static struct ls_files_cmdmodee_to_fmt ls_files_cmdmode_format[] = {
> +	{
> +		.mode = MODE_DEFAULT,
> +		.fmt = NULL,
> +	},
> +	{
> +		.mode = MODE_OBJECT_ONLY,
> +		.fmt = "%(objectname)",
> +	},
> +};
[...snip...]

This code all looks OK from skimming it, and is substantially copied
from builtin/ls-tree.c (which is good).

But I wonder as in that case whether having such an alias is worth it at
all, especially since in the case of ls-files (unlike ls-tree) we don't
start out with various --just-the-X-field type options, this is the
first one.

So I *really* like that you took my suggestion of "why not a --format"
from a previous round, but given the above for ls-files in particular is
it really worth it to have this extra code just to type:

    --object-only

Instead of:

    --format="%(objectname)"

So, maybe, and I'm not set against it, but I think it's worth
re-evaluating in this case.

In particular because the part of ls-tree's code is missing here where
we "format optimize", i.e. we take a form like:

    --format="%(objectname)"

And dispatch it to the more optimized special function, instead of the
generic strbuf_expand(), whereas in this case it's the other way around,
the option is just an alias for --format.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] ls-files: introduce "--format" option
  2022-06-15 20:07   ` Ævar Arnfjörð Bjarmason
@ 2022-06-18 10:50     ` ZheNing Hu
  0 siblings, 0 replies; 30+ messages in thread
From: ZheNing Hu @ 2022-06-18 10:50 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: ZheNing Hu via GitGitGadget, Git List, Junio C Hamano, Christian Couder

Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote on Thu, Jun 16, 2022 at 04:15:
>
>
> > +static void write_name_to_buf(struct strbuf *sb, const char *name)
> > +{
> > +     name = relative_path(name, prefix_len ? prefix : NULL, sb);
> FWIW I'd find this a bit less "huh?" if we declared another variable
> here, so just:
>
>         const char *rel = relative_path(name, ...).
>

Yeah, that was just a bad copy-paste.

> > +     o1
> > +     EOF
> > +     git ls-files --format="%(path)" -m >actual &&
> > +     test_cmp expect actual
> > +'
> > +
> > +test_expect_success 'git ls-files --format with -d' '
> > +     rm o1 &&
>
> Don't "rm o1" here, rather have the test that creates it do:
>
>     test_when_finished "rm o1" &&
>     [the command that creates o1]
>

I thought about how to test 'git ls-files -d', so maybe I need
something like:

test_expect_success 'git ls-files --format with -d' '
     echo o3 >o3 &&
     git add o3 &&
     rm o3 &&
     cat >expect <<-\EOF &&
     o3
     EOF
     git ls-files --format="%(path)" -d >actual &&
     test_cmp expect actual
'

> > +     test_when_finished "git restore o1" &&
> > +     cat >expect <<-EOF &&
> > +     o1
> > +     EOF
> > +     git ls-files --format="%(path)" -d >actual &&
> > +     test_cmp expect actual
> > +'
> > +
>
> These tests...:
>
> > +test_expect_success 'git ls-files --format with -s must fail' '
> > +     test_must_fail git ls-files --format="%(objectname)" -s
> > +'
> > +
> > +test_expect_success 'git ls-files --format with -o must fail' '
> > +     test_must_fail git ls-files --format="%(objectname)" -o
> > +'
> > +
> > +test_expect_success 'git ls-files --format with -k must fail' '
> > +     test_must_fail git ls-files --format="%(objectname)" -k
> > +'
> > +
> > +test_expect_success 'git ls-files --format with --resolve-undo must fail' '
> > +     test_must_fail git ls-files --format="%(objectname)" --resolve-undo
> > +'
> > +
> > +test_expect_success 'git ls-files --format with --deduplicate must fail' '
> > +     test_must_fail git ls-files --format="%(objectname)" --deduplicate
> > +'
> > +
> > +test_expect_success 'git ls-files --format with --debug must fail' '
> > +     test_must_fail git ls-files --format="%(objectname)" --debug
> > +'
>
> ...would be better done with a for-loop, so:
>
>         for flag in -s -o -k --resolve-undo [...]
>         do
>                 test_expect_success "git ls-files --format is incompatible with $flag" '
>                         test_must_fail git ls-files --format="%(objectname)" $flag
>                 '
>         done
>

Yeah, using this loop will be clearer.

> Note the '' on the second argument, that's intentional, as we eval it
> you don't need "".
>
> > +test_done
>

Thanks for all these code style suggestions!

ZheNing Hu

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/2] ls-files: introduce "--object-only" option
  2022-06-15 20:15   ` Ævar Arnfjörð Bjarmason
@ 2022-06-18 10:59     ` ZheNing Hu
  0 siblings, 0 replies; 30+ messages in thread
From: ZheNing Hu @ 2022-06-18 10:59 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: ZheNing Hu via GitGitGadget, Git List, Junio C Hamano, Christian Couder

Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote on Thu, Jun 16, 2022 at 04:25:
>
>
> On Wed, Jun 15 2022, ZheNing Hu via GitGitGadget wrote:
>
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > --object-only is an alias for --format=%(objectname),
> > which output objectname of index entries, taking
> > inspiration from the option with the same name in
> > the `git ls-tree` command.
> >
> > --object-only cannot be used with --format, and -s, -o,
> > -k, --resolve-undo, --deduplicate, --debug.
> >
> > Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> > ---
> >  Documentation/git-ls-files.txt |  8 +++++++-
> >  builtin/ls-files.c             | 36 +++++++++++++++++++++++++++++++++-
> >  t/t3013-ls-files-format.sh     | 34 ++++++++++++++++++++++++++++++++
> >  3 files changed, 76 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
> > index b22860ec8c0..c3f46bb821b 100644
> > --- a/Documentation/git-ls-files.txt
> > +++ b/Documentation/git-ls-files.txt
> > @@ -13,7 +13,7 @@ SYNOPSIS
> >               [-c|--cached] [-d|--deleted] [-o|--others] [-i|--|ignored]
> >               [-s|--stage] [-u|--unmerged] [-k|--|killed] [-m|--modified]
> >               [--directory [--no-empty-directory]] [--eol]
> > -             [--deduplicate]
> > +             [--deduplicate] [--object-only]
> >               [-x <pattern>|--exclude=<pattern>]
> >               [-X <file>|--exclude-from=<file>]
> >               [--exclude-per-directory=<file>]
> > @@ -199,6 +199,12 @@ followed by the  ("attr/<eolattr>").
> >       interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
> >       --format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo`,
> >       `--debug`.
> > +
> > +--object-only::
> > +     List only names of the objects, one per line. This is equivalent
> > +     to specifying `--format='%(objectname)'`. Cannot be combined with
> > +     `--format=<format>`.
> > +
> >  \--::
> >       Do not interpret any more arguments as options.
> >
> > diff --git a/builtin/ls-files.c b/builtin/ls-files.c
> > index 9dd6c55eeb9..4ac8f34baac 100644
> > --- a/builtin/ls-files.c
> > +++ b/builtin/ls-files.c
> > @@ -60,6 +60,27 @@ static const char *tag_modified = "";
> >  static const char *tag_skip_worktree = "";
> >  static const char *tag_resolve_undo = "";
> >
> > +static enum ls_files_cmdmode {
> > +     MODE_DEFAULT = 0,
> > +     MODE_OBJECT_ONLY,
> > +} ls_files_cmdmode;
> > +
> > +struct ls_files_cmdmodee_to_fmt {
> > +     enum ls_files_cmdmode mode;
> > +     const char *const fmt;
> > +};
> > +
> > +static struct ls_files_cmdmodee_to_fmt ls_files_cmdmode_format[] = {
> > +     {
> > +             .mode = MODE_DEFAULT,
> > +             .fmt = NULL,
> > +     },
> > +     {
> > +             .mode = MODE_OBJECT_ONLY,
> > +             .fmt = "%(objectname)",
> > +     },
> > +};
> [...snip...]
>
> This code all looks OK from skimming it, and is substantially copied
> from builtin/ls-tree.c (which is good).
>
> But I wonder as in that case whether having such an alias is worth it at
> all, especially since in the case of ls-files (unlike ls-tree) we don't
> start out with various --just-the-X-field type options, this is the
> first one.
>
> So I *really* like that you took my suggestion of "why not a --format"
> from a previous round, but given the above for ls-files in particular is
> it really worth it to have this extra code just to type:
>
>     --object-only
>
> Instead of:
>
>     --format="%(objectname)"
>
> So, maybe, and I'm not set against it, but I think it's worth
> re-evaluating in this case.
>
> In particular because the part of ls-tree's code is missing here where
> we "format optimize", i.e. we take a form like:
>
>     --format="%(objectname)"
>
> And dispatch it to the more optimized special function, instead of the
> generic strbuf_expand(), whereas in this case it's the other way around,
> the option is just an alias for --format.

Thanks for clarifying that ls-tree's --object-only takes a fast path rather
than being a simple alias for --format=%(objectname). Maybe I should do that
too, or just drop this patch, since git ls-files --format already provides
this functionality :-)

ZheNing Hu

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2] ls-files: introduce "--format" option
  2022-06-15 13:45 [PATCH 0/2] ls-files: introduce "--format" and "--object-only" options ZheNing Hu via GitGitGadget
  2022-06-15 13:45 ` [PATCH 1/2] ls-files: introduce "--format" option ZheNing Hu via GitGitGadget
  2022-06-15 13:45 ` [PATCH 2/2] ls-files: introduce "--object-only" option ZheNing Hu via GitGitGadget
@ 2022-06-19  9:13 ` ZheNing Hu via GitGitGadget
  2022-06-19 13:50   ` Phillip Wood
  2022-06-21  2:05   ` [PATCH v3] " ZheNing Hu via GitGitGadget
  2 siblings, 2 replies; 30+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2022-06-19  9:13 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder,
	Ævar Arnfjörð Bjarmason, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Add a new option --format that outputs index entry
information in a custom format, taking inspiration
from the option with the same name in the `git ls-tree`
command.

--format cannot be used with -s, -o, -k, --resolve-undo,
--deduplicate and --debug.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
    ls-files: introduce "--format" options
    
    v1->v2:
    
     1. Apply the code style fixes suggested by Ævar.
     2. Remove the --object-only option (I tried using a fast path for
        it, but could not see any performance improvement compared with
        --format=%(objectname)).

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1262%2Fadlternative%2Fzh%2Fls-file-format-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1262/adlternative/zh/ls-file-format-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1262

Range-diff vs v1:

 1:  432d80b8c78 ! 1:  67f2c3b8ebe ls-files: introduce "--format" option
     @@ Commit message
          command.
      
          --format cannot used with -s, -o, -k, --resolve-undo,
     -    --deduplicate, --debug.
     +    --deduplicate and --debug.
      
          Signed-off-by: ZheNing Hu <adlternative@gmail.com>
      
     @@ Documentation/git-ls-files.txt: followed by the  ("attr/<eolattr>").
       	trailing slash, such as "x/" for a sparse directory "x".
       
      +--format=<format>::
     -+	A string that interpolates %(fieldname) from the result being shown.
     ++	A string that interpolates `%(fieldname)` from the result being shown.
      +	It also interpolates `%%` to `%`, and `%xx` where `xx` are hex digits
      +	interpolates to character with hex code `xx`; for example `%00`
      +	interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
     -+	--format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo`,
     ++	--format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo` and
      +	`--debug`.
       \--::
       	Do not interpret any more arguments as options.
     @@ builtin/ls-files.c: static void write_name(const char *name)
       
      +static void write_name_to_buf(struct strbuf *sb, const char *name)
      +{
     -+	name = relative_path(name, prefix_len ? prefix : NULL, sb);
     -+	if (line_terminator) {
     -+		quote_c_style(name, sb, NULL, 0);
     -+	} else {
     -+		strbuf_add(sb, name, strlen(name));
     -+	}
     ++	const char *rel = relative_path(name, prefix_len ? prefix : NULL, sb);
     ++	if (line_terminator)
     ++		quote_c_style(rel, sb, NULL, 0);
     ++	else
     ++		strbuf_add(sb, rel, strlen(rel));
      +}
      +
       static const char *get_tag(const struct cache_entry *ce, const char *tag)
     @@ builtin/ls-files.c: static void show_submodule(struct repository *superproject,
      +	if (len)
      +		return len;
      +	if (*start != '(')
     -+		die(_("bad ls-files format: element '%s' does not start with '('"), start);
     ++		die(_("bad ls-files format: element '%s' "
     ++		      "does not start with '('"), start);
      +
      +	end = strchr(start + 1, ')');
      +	if (!end)
     -+		die(_("bad ls-files format: element '%s' does not end in ')'"), start);
     ++		die(_("bad ls-files format: element '%s'"
     ++		      "does not end in ')'"), start);
      +
      +	len = end - start + 1;
     -+	if (skip_prefix(start, "(tag)", &p)) {
     ++	if (skip_prefix(start, "(tag)", &p))
      +		strbuf_addstr(sb, get_tag(data->ce, data->tag));
     -+	} else if (skip_prefix(start, "(objectmode)", &p)) {
     ++	else if (skip_prefix(start, "(objectmode)", &p))
      +		strbuf_addf(sb, "%06o", data->ce->ce_mode);
     -+	} else if (skip_prefix(start, "(objectname)", &p)) {
     ++	else if (skip_prefix(start, "(objectname)", &p))
      +		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
     -+	} else if (skip_prefix(start, "(stage)", &p)) {
     ++	else if (skip_prefix(start, "(stage)", &p))
      +		strbuf_addf(sb, "%d", ce_stage(data->ce));
     -+	} else if (skip_prefix(start, "(eol)", &p)) {
     -+		write_eolinfo_to_buf(sb, data->istate, data->ce, data->pathname);
     -+	} else if (skip_prefix(start, "(path)", &p)) {
     ++	else if (skip_prefix(start, "(eol)", &p))
     ++		write_eolinfo_to_buf(sb, data->istate,
     ++				     data->ce, data->pathname);
     ++	else if (skip_prefix(start, "(path)", &p))
      +		write_name_to_buf(sb, data->pathname);
     -+	} else if (skip_prefix(start, "(ctime)", &p)) {
     -+		strbuf_addf(sb, "ctime: %u:%u", sd->sd_ctime.sec, sd->sd_ctime.nsec);
     -+	} else if (skip_prefix(start, "(mtime)", &p)) {
     -+		strbuf_addf(sb, "mtime: %u:%u", sd->sd_mtime.sec, sd->sd_mtime.nsec);
     -+	} else if (skip_prefix(start, "(dev)", &p)) {
     ++	else if (skip_prefix(start, "(ctime)", &p))
     ++		strbuf_addf(sb, "ctime: %u:%u",
     ++			    sd->sd_ctime.sec, sd->sd_ctime.nsec);
     ++	else if (skip_prefix(start, "(mtime)", &p))
     ++		strbuf_addf(sb, "mtime: %u:%u",
     ++			    sd->sd_mtime.sec, sd->sd_mtime.nsec);
     ++	else if (skip_prefix(start, "(dev)", &p))
      +		strbuf_addf(sb, "dev: %u", sd->sd_dev);
     -+	} else if (skip_prefix(start, "(ino)", &p)) {
     ++	else if (skip_prefix(start, "(ino)", &p))
      +		strbuf_addf(sb, "ino: %u", sd->sd_ino);
     -+	} else if (skip_prefix(start, "(uid)", &p)) {
     ++	else if (skip_prefix(start, "(uid)", &p))
      +		strbuf_addf(sb, "uid: %u", sd->sd_uid);
     -+	} else if (skip_prefix(start, "(gid)", &p)) {
     ++	else if (skip_prefix(start, "(gid)", &p))
      +		strbuf_addf(sb, "gid: %u", sd->sd_gid);
     -+	} else if (skip_prefix(start, "(size)", &p)) {
     ++	else if (skip_prefix(start, "(size)", &p))
      +		strbuf_addf(sb, "size: %u", sd->sd_size);
     -+	} else if (skip_prefix(start, "(flags)", &p)) {
     ++	else if (skip_prefix(start, "(flags)", &p))
      +		strbuf_addf(sb, "flags: %x", data->ce->ce_flags);
     -+	} else {
     ++	else {
      +		errlen = (unsigned long)len;
      +		die(_("bad ls-files format: %%%.*s"), errlen, start);
      +	}
     @@ builtin/ls-files.c: int cmd_ls_files(int argc, const char **argv, const char *cm
       		OPT_BOOL(0, "sparse", &show_sparse_dirs,
       			 N_("show sparse directories in the presence of a sparse index")),
      +		OPT_STRING_F(0, "format", &format, N_("format"),
     -+					 N_("format to use for the output"),
     -+					 PARSE_OPT_NONEG),
     ++			     N_("format to use for the output"),
     ++			     PARSE_OPT_NONEG),
       		OPT_END()
       	};
       	int ret = 0;
     @@ t/t3013-ls-files-format.sh (new)
      +'
      +
      +test_expect_success 'git ls-files --format objectmode' '
     -+	cat >expect <<-EOF &&
     ++	cat >expect <<-\EOF &&
      +	100755
      +	100644
      +	EOF
     @@ t/t3013-ls-files-format.sh (new)
      +'
      +
      +test_expect_success 'git ls-files --format path' '
     -+	cat >expect <<-EOF &&
     ++	cat >expect <<-\EOF &&
      +	o1
      +	o2
      +	EOF
     @@ t/t3013-ls-files-format.sh (new)
      +'
      +
      +test_expect_success 'git ls-files --format ctime' '
     -+	git ls-files --debug | grep ctime >expect &&
     ++	git ls-files --debug >out &&
     ++	grep ctime out >expect &&
      +	git ls-files --format="  %(ctime)" >actual &&
      +	test_cmp expect actual
      +'
      +
      +test_expect_success 'git ls-files --format mtime' '
     -+	git ls-files --debug | grep mtime >expect &&
     ++	git ls-files --debug >out &&
     ++	grep mtime out >expect &&
      +	git ls-files --format="  %(mtime)" >actual &&
      +	test_cmp expect actual
      +'
      +
      +test_expect_success 'git ls-files --format dev and ino' '
     -+	git ls-files --debug | grep dev >expect &&
     ++	git ls-files --debug >out &&
     ++	grep dev out >expect &&
      +	git ls-files --format="  %(dev)%x09%(ino)" >actual &&
      +	test_cmp expect actual
      +'
      +
      +test_expect_success 'git ls-files --format uid and gid' '
     -+	git ls-files --debug | grep uid >expect &&
     ++	git ls-files --debug >out &&
     ++	grep uid out >expect &&
      +	git ls-files --format="  %(uid)%x09%(gid)" >actual &&
      +	test_cmp expect actual
      +'
      +
      +test_expect_success 'git ls-files --format with -m' '
      +	echo change >o1 &&
     -+	cat >expect <<-EOF &&
     ++	cat >expect <<-\EOF &&
      +	o1
      +	EOF
      +	git ls-files --format="%(path)" -m >actual &&
     @@ t/t3013-ls-files-format.sh (new)
      +'
      +
      +test_expect_success 'git ls-files --format with -d' '
     -+	rm o1 &&
     -+	test_when_finished "git restore o1" &&
     -+	cat >expect <<-EOF &&
     -+	o1
     ++	echo o3 >o3 &&
     ++	git add o3 &&
     ++	rm o3 &&
     ++	cat >expect <<-\EOF &&
     ++	o3
      +	EOF
      +	git ls-files --format="%(path)" -d >actual &&
      +	test_cmp expect actual
      +'
      +
      +test_expect_success 'git ls-files --format size and flags' '
     -+	git ls-files --debug | grep size >expect &&
     ++	git ls-files --debug >out &&
     ++	grep size out >expect &&
      +	git ls-files --format="  %(size)%x09%(flags)" >actual &&
      +	test_cmp expect actual
      +'
     @@ t/t3013-ls-files-format.sh (new)
      +	test_cmp expect actual
      +'
      +
     -+test_expect_success 'git ls-files --format with -s must fail' '
     -+	test_must_fail git ls-files --format="%(objectname)" -s
     -+'
     -+
     -+test_expect_success 'git ls-files --format with -o must fail' '
     -+	test_must_fail git ls-files --format="%(objectname)" -o
     -+'
     -+
     -+test_expect_success 'git ls-files --format with -k must fail' '
     -+	test_must_fail git ls-files --format="%(objectname)" -k
     -+'
     -+
     -+test_expect_success 'git ls-files --format with --resolve-undo must fail' '
     -+	test_must_fail git ls-files --format="%(objectname)" --resolve-undo
     -+'
     -+
     -+test_expect_success 'git ls-files --format with --deduplicate must fail' '
     -+	test_must_fail git ls-files --format="%(objectname)" --deduplicate
     -+'
     -+
     -+test_expect_success 'git ls-files --format with --debug must fail' '
     -+	test_must_fail git ls-files --format="%(objectname)" --debug
     -+'
     -+
     ++for flag in -s -o -k --resolve-undo --deduplicate --debug
     ++do
     ++	test_expect_success "git ls-files --format is incompatible with $flag" '
     ++		test_must_fail git ls-files --format="%(objectname)" $flag
     ++	'
     ++done
      +test_done
 2:  81ae1280e8e < -:  ----------- ls-files: introduce "--object-only" option


 Documentation/git-ls-files.txt |  51 ++++++++++++-
 builtin/ls-files.c             | 130 ++++++++++++++++++++++++++++++++-
 t/t3013-ls-files-format.sh     | 130 +++++++++++++++++++++++++++++++++
 3 files changed, 307 insertions(+), 4 deletions(-)
 create mode 100755 t/t3013-ls-files-format.sh

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index 0dabf3f0ddc..9a88c92f1ad 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -20,7 +20,7 @@ SYNOPSIS
 		[--exclude-standard]
 		[--error-unmatch] [--with-tree=<tree-ish>]
 		[--full-name] [--recurse-submodules]
-		[--abbrev[=<n>]] [--] [<file>...]
+		[--abbrev[=<n>]] [--format=<format>] [--] [<file>...]
 
 DESCRIPTION
 -----------
@@ -192,6 +192,13 @@ followed by the  ("attr/<eolattr>").
 	to the contained files. Sparse directories will be shown with a
 	trailing slash, such as "x/" for a sparse directory "x".
 
+--format=<format>::
+	A string that interpolates `%(fieldname)` from the result being shown.
+	It also interpolates `%%` to `%`, and `%xx` where `xx` are hex digits
+	interpolates to character with hex code `xx`; for example `%00`
+	interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
+	--format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo` and
+	`--debug`.
 \--::
 	Do not interpret any more arguments as options.
 
@@ -223,6 +230,48 @@ quoted as explained for the configuration variable `core.quotePath`
 (see linkgit:git-config[1]).  Using `-z` the filename is output
 verbatim and the line is terminated by a NUL byte.
 
+It is possible to print in a custom format by using the `--format`
+option, which is able to interpolate different fields using
+a `%(fieldname)` notation. For example, if you only care about the
+"objectname" and "path" fields, you can execute with a specific
+"--format" like
+
+	git ls-files --format='%(objectname) %(path)'
+
+FIELD NAMES
+-----------
+Various values from structured fields can be used to interpolate
+into the resulting output. For each outputting line, the following
+names can be used:
+
+tag::
+	The tag of file status.
+objectmode::
+	The mode of the object.
+objectname::
+	The name of the object.
+stage::
+	The stage of the file.
+eol::
+	The line endings of files.
+path::
+	The pathname of the object.
+ctime::
+	The create time of file.
+mtime::
+	The modify time of file.
+dev::
+	The ID of device containing file.
+ino::
+	The inode number of file.
+uid::
+	The user id of file owner.
+gid::
+	The group id of file owner.
+size::
+	The size of the file.
+flags::
+	The flags of the file.
 
 EXCLUDE PATTERNS
 ----------------
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index e791b65e7e9..f037ccb58b4 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -11,6 +11,7 @@
 #include "quote.h"
 #include "dir.h"
 #include "builtin.h"
+#include "strbuf.h"
 #include "tree.h"
 #include "cache-tree.h"
 #include "parse-options.h"
@@ -48,6 +49,7 @@ static char *ps_matched;
 static const char *with_tree;
 static int exc_given;
 static int exclude_args;
+static const char *format;
 
 static const char *tag_cached = "";
 static const char *tag_unmerged = "";
@@ -58,8 +60,8 @@ static const char *tag_modified = "";
 static const char *tag_skip_worktree = "";
 static const char *tag_resolve_undo = "";
 
-static void write_eolinfo(struct index_state *istate,
-			  const struct cache_entry *ce, const char *path)
+static void write_eolinfo_internal(struct strbuf *sb, struct index_state *istate,
+				   const struct cache_entry *ce, const char *path)
 {
 	if (show_eol) {
 		struct stat st;
@@ -71,10 +73,25 @@ static void write_eolinfo(struct index_state *istate,
 							       ce->name);
 		if (!lstat(path, &st) && S_ISREG(st.st_mode))
 			w_txt = get_wt_convert_stats_ascii(path);
-		printf("i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
+		if (sb)
+			strbuf_addf(sb, "i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
+		else
+			printf("i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
 	}
 }
 
+static void write_eolinfo(struct index_state *istate,
+			  const struct cache_entry *ce, const char *path)
+{
+	write_eolinfo_internal(NULL, istate, ce, path);
+}
+
+static void write_eolinfo_to_buf(struct strbuf *sb, struct index_state *istate,
+				 const struct cache_entry *ce, const char *path)
+{
+	write_eolinfo_internal(sb, istate, ce, path);
+}
+
 static void write_name(const char *name)
 {
 	/*
@@ -85,6 +102,15 @@ static void write_name(const char *name)
 				   stdout, line_terminator);
 }
 
+static void write_name_to_buf(struct strbuf *sb, const char *name)
+{
+	const char *rel = relative_path(name, prefix_len ? prefix : NULL, sb);
+	if (line_terminator)
+		quote_c_style(rel, sb, NULL, 0);
+	else
+		strbuf_add(sb, rel, strlen(rel));
+}
+
 static const char *get_tag(const struct cache_entry *ce, const char *tag)
 {
 	static char alttag[4];
@@ -222,6 +248,91 @@ static void show_submodule(struct repository *superproject,
 	repo_clear(&subrepo);
 }
 
+struct show_index_data {
+	const char *tag;
+	const char *pathname;
+	struct index_state *istate;
+	const struct cache_entry *ce;
+};
+
+static size_t expand_show_index(struct strbuf *sb, const char *start,
+			       void *context)
+{
+	struct show_index_data *data = context;
+	const char *end;
+	const char *p;
+	unsigned int errlen;
+	const struct stat_data *sd = &data->ce->ce_stat_data;
+	size_t len = strbuf_expand_literal_cb(sb, start, NULL);
+	if (len)
+		return len;
+	if (*start != '(')
+		die(_("bad ls-files format: element '%s' "
+		      "does not start with '('"), start);
+
+	end = strchr(start + 1, ')');
+	if (!end)
+		die(_("bad ls-files format: element '%s' "
+		      "does not end in ')'"), start);
+
+	len = end - start + 1;
+	if (skip_prefix(start, "(tag)", &p))
+		strbuf_addstr(sb, get_tag(data->ce, data->tag));
+	else if (skip_prefix(start, "(objectmode)", &p))
+		strbuf_addf(sb, "%06o", data->ce->ce_mode);
+	else if (skip_prefix(start, "(objectname)", &p))
+		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
+	else if (skip_prefix(start, "(stage)", &p))
+		strbuf_addf(sb, "%d", ce_stage(data->ce));
+	else if (skip_prefix(start, "(eol)", &p))
+		write_eolinfo_to_buf(sb, data->istate,
+				     data->ce, data->pathname);
+	else if (skip_prefix(start, "(path)", &p))
+		write_name_to_buf(sb, data->pathname);
+	else if (skip_prefix(start, "(ctime)", &p))
+		strbuf_addf(sb, "ctime: %u:%u",
+			    sd->sd_ctime.sec, sd->sd_ctime.nsec);
+	else if (skip_prefix(start, "(mtime)", &p))
+		strbuf_addf(sb, "mtime: %u:%u",
+			    sd->sd_mtime.sec, sd->sd_mtime.nsec);
+	else if (skip_prefix(start, "(dev)", &p))
+		strbuf_addf(sb, "dev: %u", sd->sd_dev);
+	else if (skip_prefix(start, "(ino)", &p))
+		strbuf_addf(sb, "ino: %u", sd->sd_ino);
+	else if (skip_prefix(start, "(uid)", &p))
+		strbuf_addf(sb, "uid: %u", sd->sd_uid);
+	else if (skip_prefix(start, "(gid)", &p))
+		strbuf_addf(sb, "gid: %u", sd->sd_gid);
+	else if (skip_prefix(start, "(size)", &p))
+		strbuf_addf(sb, "size: %u", sd->sd_size);
+	else if (skip_prefix(start, "(flags)", &p))
+		strbuf_addf(sb, "flags: %x", data->ce->ce_flags);
+	else {
+		errlen = (unsigned long)len;
+		die(_("bad ls-files format: %%%.*s"), errlen, start);
+	}
+
+	return len;
+}
+
+static void show_ce_fmt(struct repository *repo, const struct cache_entry *ce,
+			const char *format, const char *fullname, const char *tag) {
+
+	struct show_index_data data = {
+		.tag = tag,
+		.pathname = fullname,
+		.istate = repo->index,
+		.ce = ce,
+	};
+
+	struct strbuf sb = STRBUF_INIT;
+	strbuf_expand(&sb, format, expand_show_index, &data);
+	strbuf_addch(&sb, line_terminator);
+	fwrite(sb.buf, sb.len, 1, stdout);
+	strbuf_release(&sb);
+	return;
+}
+
 static void show_ce(struct repository *repo, struct dir_struct *dir,
 		    const struct cache_entry *ce, const char *fullname,
 		    const char *tag)
@@ -236,6 +347,11 @@ static void show_ce(struct repository *repo, struct dir_struct *dir,
 				  max_prefix_len, ps_matched,
 				  S_ISDIR(ce->ce_mode) ||
 				  S_ISGITLINK(ce->ce_mode))) {
+		if (format) {
+			show_ce_fmt(repo, ce, format, fullname, tag);
+			return;
+		}
+
 		tag = get_tag(ce, tag);
 
 		if (!show_stage) {
@@ -675,6 +791,9 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 			 N_("suppress duplicate entries")),
 		OPT_BOOL(0, "sparse", &show_sparse_dirs,
 			 N_("show sparse directories in the presence of a sparse index")),
+		OPT_STRING_F(0, "format", &format, N_("format"),
+			     N_("format to use for the output"),
+			     PARSE_OPT_NONEG),
 		OPT_END()
 	};
 	int ret = 0;
@@ -699,6 +818,11 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 	for (i = 0; i < exclude_list.nr; i++) {
 		add_pattern(exclude_list.items[i].string, "", 0, pl, --exclude_args);
 	}
+
+	if (format && (show_stage || show_others || show_killed ||
+		show_resolve_undo || skipping_duplicates || debug_mode))
+			die(_("ls-files --format cannot be used with -s, -o, -k, --resolve-undo, --deduplicate, --debug"));
+
 	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
 		tag_cached = "H ";
 		tag_unmerged = "M ";
diff --git a/t/t3013-ls-files-format.sh b/t/t3013-ls-files-format.sh
new file mode 100755
index 00000000000..1a1b09e7b3c
--- /dev/null
+++ b/t/t3013-ls-files-format.sh
@@ -0,0 +1,130 @@
+#!/bin/sh
+
+test_description='git ls-files --format test'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	echo o1 >o1 &&
+	echo o2 >o2 &&
+	git add o1 o2 &&
+	git add --chmod +x o1 &&
+	git commit -m base
+'
+
+test_expect_success 'git ls-files --format tag' '
+	printf "H \nH \n" >expect &&
+	git ls-files --format="%(tag)" -t >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format objectmode' '
+	cat >expect <<-\EOF &&
+	100755
+	100644
+	EOF
+	git ls-files --format="%(objectmode)" -t >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format objectname' '
+	oid1=$(git hash-object o1) &&
+	oid2=$(git hash-object o2) &&
+	cat >expect <<-EOF &&
+	$oid1
+	$oid2
+	EOF
+	git ls-files --format="%(objectname)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format eol' '
+	printf "i/lf    w/lf    attr/                 \t\n" >expect &&
+	printf "i/lf    w/lf    attr/                 \t\n" >>expect &&
+	git ls-files --format="%(eol)" --eol >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format path' '
+	cat >expect <<-\EOF &&
+	o1
+	o2
+	EOF
+	git ls-files --format="%(path)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format ctime' '
+	git ls-files --debug >out &&
+	grep ctime out >expect &&
+	git ls-files --format="  %(ctime)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format mtime' '
+	git ls-files --debug >out &&
+	grep mtime out >expect &&
+	git ls-files --format="  %(mtime)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format dev and ino' '
+	git ls-files --debug >out &&
+	grep dev out >expect &&
+	git ls-files --format="  %(dev)%x09%(ino)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format uid and gid' '
+	git ls-files --debug >out &&
+	grep uid out >expect &&
+	git ls-files --format="  %(uid)%x09%(gid)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -m' '
+	echo change >o1 &&
+	cat >expect <<-\EOF &&
+	o1
+	EOF
+	git ls-files --format="%(path)" -m >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -d' '
+	echo o3 >o3 &&
+	git add o3 &&
+	rm o3 &&
+	cat >expect <<-\EOF &&
+	o3
+	EOF
+	git ls-files --format="%(path)" -d >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format size and flags' '
+	git ls-files --debug >out &&
+	grep size out >expect &&
+	git ls-files --format="  %(size)%x09%(flags)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format imitate --stage' '
+	git ls-files --stage >expect &&
+	git ls-files --format="%(objectmode) %(objectname) %(stage)%x09%(path)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format imitate --debug' '
+	git ls-files --debug >expect &&
+	git ls-files --format="%(path)%x0a  %(ctime)%x0a  %(mtime)%x0a  %(dev)%x09%(ino)%x0a  %(uid)%x09%(gid)%x0a  %(size)%x09%(flags)" >actual &&
+	test_cmp expect actual
+'
+
+for flag in -s -o -k --resolve-undo --deduplicate --debug
+do
+	test_expect_success "git ls-files --format is incompatible with $flag" '
+		test_must_fail git ls-files --format="%(objectname)" $flag
+	'
+done
+test_done

base-commit: ab336e8f1c8009c8b1aab8deb592148e69217085
-- 
gitgitgadget

* Re: [PATCH v2] ls-files: introduce "--format" option
  2022-06-19  9:13 ` [PATCH v2] ls-files: introduce "--format" option ZheNing Hu via GitGitGadget
@ 2022-06-19 13:50   ` Phillip Wood
  2022-06-20 13:32     ` ZheNing Hu
  2022-06-21  2:05   ` [PATCH v3] " ZheNing Hu via GitGitGadget
  1 sibling, 1 reply; 30+ messages in thread
From: Phillip Wood @ 2022-06-19 13:50 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget, git
  Cc: Junio C Hamano, Christian Couder,
	Ævar Arnfjörð Bjarmason, ZheNing Hu

Hi ZheNing

On 19/06/2022 10:13, ZheNing Hu via GitGitGadget wrote:
> From: ZheNing Hu <adlternative@gmail.com>
> 
> Add a new option --format that output index enties
> informations with custom format, taking inspiration
> from the option with the same name in the `git ls-tree`
> command.
> 
> --format cannot used with -s, -o, -k, --resolve-undo,
> --deduplicate and --debug.

I think this is an interesting feature that provides functionality that 
is not available by feeding index entries into cat-file.

> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
>   Documentation/git-ls-files.txt |  51 ++++++++++++-
>   builtin/ls-files.c             | 130 ++++++++++++++++++++++++++++++++-
>   t/t3013-ls-files-format.sh     | 130 +++++++++++++++++++++++++++++++++
>   3 files changed, 307 insertions(+), 4 deletions(-)
>   create mode 100755 t/t3013-ls-files-format.sh
> 
> diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
> index 0dabf3f0ddc..9a88c92f1ad 100644
> --- a/Documentation/git-ls-files.txt
> +++ b/Documentation/git-ls-files.txt
> @@ -20,7 +20,7 @@ SYNOPSIS
>   		[--exclude-standard]
>   		[--error-unmatch] [--with-tree=<tree-ish>]
>   		[--full-name] [--recurse-submodules]
> -		[--abbrev[=<n>]] [--] [<file>...]
> +		[--abbrev[=<n>]] [--format=<format>] [--] [<file>...]
>   
>   DESCRIPTION
>   -----------
> @@ -192,6 +192,13 @@ followed by the  ("attr/<eolattr>").
>   	to the contained files. Sparse directories will be shown with a
>   	trailing slash, such as "x/" for a sparse directory "x".
>   
> +--format=<format>::
> +	A string that interpolates `%(fieldname)` from the result being shown.
> +	It also interpolates `%%` to `%`, and `%xx` where `xx` are hex digits
> +	interpolates to character with hex code `xx`; for example `%00`
> +	interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
> +	--format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo` and
> +	`--debug`.
>   \--::
>   	Do not interpret any more arguments as options.
>   
> @@ -223,6 +230,48 @@ quoted as explained for the configuration variable `core.quotePath`
>   (see linkgit:git-config[1]).  Using `-z` the filename is output
>   verbatim and the line is terminated by a NUL byte.
>   
> +It is possible to print in a custom format by using the `--format`
> +option, which is able to interpolate different fields using
> +a `%(fieldname)` notation. For example, if you only care about the
> +"objectname" and "path" fields, you can execute with a specific
> +"--format" like
> +
> +	git ls-files --format='%(objectname) %(path)'
> +
> +FIELD NAMES
> +-----------
> +Various values from structured fields can be used to interpolate
> +into the resulting output. For each outputting line, the following
> +names can be used:
> +
> +tag::
> +	The tag of file status.

The documentation for -t strongly discourages its use, so I wonder if we
really want to expose it here.

> +objectmode::
> +	The mode of the object.
> +objectname::
> +	The name of the object.
> +stage::
> +	The stage of the file.
> +eol::
> +	The line endings of files.

Every other option refers to either a "file" or "object" but here we 
have "files". Looking at the implementation below this will print the 
line ending from both the index and the worktree, it would be useful to 
clarify that here.

> +path::
> +	The pathname of the object.
> +ctime::
> +	The create time of file.

It is not clear from this whether this (and all the file attributes 
below) are coming from the worktree or the index or both like eol?

> +mtime::
> +	The modify time of file.
> +dev::
> +	The ID of device containing file.
> +ino::
> +	The inode number of file.
> +uid::
> +	The user id of file owner.
> +gid::
> +	The group id of file owner.
> +size::
> +	The size of the file.
> +flags::
> +	The flags of the file.

What are the flags?

> [...]  
> +static size_t expand_show_index(struct strbuf *sb, const char *start,
> +			       void *context)
> +{
> +	struct show_index_data *data = context;
> +	const char *end;
> +	const char *p;
> +	unsigned int errlen;
 > [...]
> +	else if (skip_prefix(start, "(flags)", &p))
> +		strbuf_addf(sb, "flags: %x", data->ce->ce_flags);
> +	else {
> +		errlen = (unsigned long)len;
> +		die(_("bad ls-files format: %%%.*s"), errlen, start);

errlen is declared as an unsigned int, but you cast len which is a 
size_t to unsigned long when assigning to errlen. Then errlen is used 
where a signed int is required by die. There is also a style violation 
as if any branch of an if needs braces then they should all be braced. I 
think that the best solution would be to drop errlen and just write

	else
		die(_("bad ls-files format: %%%.*s"), (int)len, start);
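
As a standalone illustration of the point about the precision argument
(a sketch only, not part of the patch): the `*` in "%.*s" consumes an
argument of type int, so a size_t length has to be cast at the call
site. The helper name format_bad_atom() below is made up for this
example, and snprintf() stands in for die() so that the result can be
inspected.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): format the first `len`
 * bytes of `start` into `out`, the same way the die() message prints
 * the offending atom. The precision consumed by "%.*s" must be an int,
 * so the size_t is cast explicitly.
 */
int format_bad_atom(char *out, size_t outsz, const char *start, size_t len)
{
	return snprintf(out, outsz, "bad ls-files format: %%%.*s",
			(int)len, start);
}
```

The cast assumes the atom is short enough to fit in an int, which holds
for any format string given on a command line.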

It would be interesting to check the performance of this implementation 
on a large repository as it is doing a lot of branching inside a loop. I 
don't think we should change it unless it turns out to be a problem. 
Then we could try switching on the first character of the format 
specifier or some other optimization.
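
A rough sketch of what "switching on the first character" could look
like, assuming a reduced set of atoms. The enum, starts_with() and
classify_atom() are hypothetical names used only for this example; the
patch itself uses skip_prefix() in an if/else chain.

```c
#include <string.h>

/* Hypothetical atom identifiers; the names are illustrative only. */
enum atom {
	ATOM_UNKNOWN = 0,
	ATOM_OBJECTMODE,
	ATOM_OBJECTNAME,
	ATOM_STAGE,
	ATOM_SIZE,
	ATOM_PATH,
};

/* Return 1 if the text at `s` begins with `prefix`. */
static int starts_with(const char *s, const char *prefix)
{
	return strncmp(s, prefix, strlen(prefix)) == 0;
}

/*
 * Classify the atom that starts at `start`, which points at the '('
 * following a '%'. Switching on the first letter means that at most
 * two string comparisons run per atom instead of the whole chain.
 */
enum atom classify_atom(const char *start)
{
	if (start[0] != '(')
		return ATOM_UNKNOWN;

	switch (start[1]) {
	case 'o':
		if (starts_with(start, "(objectmode)"))
			return ATOM_OBJECTMODE;
		if (starts_with(start, "(objectname)"))
			return ATOM_OBJECTNAME;
		break;
	case 'p':
		if (starts_with(start, "(path)"))
			return ATOM_PATH;
		break;
	case 's':
		if (starts_with(start, "(stage)"))
			return ATOM_STAGE;
		if (starts_with(start, "(size)"))
			return ATOM_SIZE;
		break;
	}
	return ATOM_UNKNOWN;
}
```

Whether this is measurably faster than the chain is exactly the open
question, so it would need benchmarking on a large index first.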

Best Wishes

Phillip

* Re: [PATCH v2] ls-files: introduce "--format" option
  2022-06-19 13:50   ` Phillip Wood
@ 2022-06-20 13:32     ` ZheNing Hu
  0 siblings, 0 replies; 30+ messages in thread
From: ZheNing Hu @ 2022-06-20 13:32 UTC (permalink / raw)
  To: Phillip Wood
  Cc: ZheNing Hu via GitGitGadget, Git List, Junio C Hamano,
	Christian Couder, Ævar Arnfjörð Bjarmason

Phillip Wood <phillip.wood123@gmail.com> wrote on Sun, Jun 19, 2022 at 21:50:
>
> Hi ZheNing
>
> On 19/06/2022 10:13, ZheNing Hu via GitGitGadget wrote:
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > Add a new option --format that output index enties
> > informations with custom format, taking inspiration
> > from the option with the same name in the `git ls-tree`
> > command.
> >
> > --format cannot used with -s, -o, -k, --resolve-undo,
> > --deduplicate and --debug.
>
> I think this is an interesting feature that provides functionality that
> is not available by feeding index entries into cat-file.
>

Yeah, it cares about the index state. With this feature, maybe we can
check the index/work-tree state more easily.

> > Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> >   Documentation/git-ls-files.txt |  51 ++++++++++++-
> >   builtin/ls-files.c             | 130 ++++++++++++++++++++++++++++++++-
> >   t/t3013-ls-files-format.sh     | 130 +++++++++++++++++++++++++++++++++
> >   3 files changed, 307 insertions(+), 4 deletions(-)
> >   create mode 100755 t/t3013-ls-files-format.sh
> >
> > diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
> > index 0dabf3f0ddc..9a88c92f1ad 100644
> > --- a/Documentation/git-ls-files.txt
> > +++ b/Documentation/git-ls-files.txt
> > @@ -20,7 +20,7 @@ SYNOPSIS
> >               [--exclude-standard]
> >               [--error-unmatch] [--with-tree=<tree-ish>]
> >               [--full-name] [--recurse-submodules]
> > -             [--abbrev[=<n>]] [--] [<file>...]
> > +             [--abbrev[=<n>]] [--format=<format>] [--] [<file>...]
> >
> >   DESCRIPTION
> >   -----------
> > @@ -192,6 +192,13 @@ followed by the  ("attr/<eolattr>").
> >       to the contained files. Sparse directories will be shown with a
> >       trailing slash, such as "x/" for a sparse directory "x".
> >
> > +--format=<format>::
> > +     A string that interpolates `%(fieldname)` from the result being shown.
> > +     It also interpolates `%%` to `%`, and `%xx` where `xx` are hex digits
> > +     interpolates to character with hex code `xx`; for example `%00`
> > +     interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
> > +     --format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo` and
> > +     `--debug`.
> >   \--::
> >       Do not interpret any more arguments as options.
> >
> > @@ -223,6 +230,48 @@ quoted as explained for the configuration variable `core.quotePath`
> >   (see linkgit:git-config[1]).  Using `-z` the filename is output
> >   verbatim and the line is terminated by a NUL byte.
> >
> > +It is possible to print in a custom format by using the `--format`
> > +option, which is able to interpolate different fields using
> > +a `%(fieldname)` notation. For example, if you only care about the
> > +"objectname" and "path" fields, you can execute with a specific
> > +"--format" like
> > +
> > +     git ls-files --format='%(objectname) %(path)'
> > +
> > +FIELD NAMES
> > +-----------
> > +Various values from structured fields can be used to interpolate
> > +into the resulting output. For each outputting line, the following
> > +names can be used:
> > +
> > +tag::
> > +     The tag of file status.
>
> The documentation for -t strongly discourages its use, so I wonder if we
> really want to expose it here.
>

I think it's ok to remove it.

> > +objectmode::
> > +     The mode of the object.
> > +objectname::
> > +     The name of the object.
> > +stage::
> > +     The stage of the file.
> > +eol::
> > +     The line endings of files.
>
> Every other option refers to either a "file" or "object" but here we
> have "files". Looking at the implementation below this will print the
> line ending from both the index and the worktree, it would be useful to
> clarify that here.
>

Sure...

> > +path::
> > +     The pathname of the object.
> > +ctime::
> > +     The create time of file.
>
> It is not clear from this whether this (and all the file attributes
> below) are coming from the worktree or the index or both like eol?
>

...I think they are basically attributes of the index cache_entry,
except that eol covers both the worktree and the index. I will fix them.

> > +mtime::
> > +     The modify time of file.
> > +dev::
> > +     The ID of device containing file.
> > +ino::
> > +     The inode number of file.
> > +uid::
> > +     The user id of file owner.
> > +gid::
> > +     The group id of file owner.
> > +size::
> > +     The size of the file.
> > +flags::
> > +     The flags of the file.
>
> What are the flags?
>

They are the cache entry flags, which include in-memory-only flags and
some extended on-disk flags.

> > [...]
> > +static size_t expand_show_index(struct strbuf *sb, const char *start,
> > +                            void *context)
> > +{
> > +     struct show_index_data *data = context;
> > +     const char *end;
> > +     const char *p;
> > +     unsigned int errlen;
>  > [...]
> > +     else if (skip_prefix(start, "(flags)", &p))
> > +             strbuf_addf(sb, "flags: %x", data->ce->ce_flags);
> > +     else {
> > +             errlen = (unsigned long)len;
> > +             die(_("bad ls-files format: %%%.*s"), errlen, start);
>
> errlen is declared as an unsigned int, but you cast len which is a
> size_t to unsigned long when assigning to errlen. Then errlen is used
> where a signed int is required by die. There is also a style violation
> as if any branch of an if needs braces then they should all be braced. I
> think that the best solution would be to drop errlen and just write
>
>         else
>                 die(_("bad ls-files format: %%%.*s"), (int)len, start);
>

This piece of code was copied from ls-tree. Maybe we should fix it there too.

> It would be interesting to check the performance of this implementation
> on a large repository as it is doing a lot of branching inside a loop. I
> don't think we should change it unless it turns out to be a problem.
> Then we could try switching on the first character of the format
> specifier or some other optimization.
>

Just like ref-filter and others do, it parses atoms and then fills
buffers with information. Maybe we will need such a performance
optimization later, but for now it is just easier to implement the
patch this way :)
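
For reference, a much simplified sketch of the "parse atoms first, then
fill buffers" approach. Every type and function name below is
hypothetical; ref-filter's real implementation is considerably more
elaborate. Only %(path) and %(stage) are recognised, and escapes such
as `%%` are not handled.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified types for illustration only. */
enum seg_kind { SEG_LITERAL, SEG_PATH, SEG_STAGE };

struct segment {
	enum seg_kind kind;
	const char *text;	/* literal bytes, for SEG_LITERAL */
	size_t len;
};

struct entry {			/* stand-in for a cache entry */
	const char *path;
	int stage;
};

/*
 * Split `fmt` into at most `max` segments, once, before any entry is
 * shown. Returns the segment count, or -1 on an unknown atom or when
 * there are too many segments.
 */
int parse_format(const char *fmt, struct segment *segs, int max)
{
	int n = 0;

	while (*fmt) {
		struct segment s = { SEG_LITERAL, fmt, 0 };

		if (n == max)
			return -1;
		if (!strncmp(fmt, "%(path)", 7)) {
			s.kind = SEG_PATH;
			fmt += 7;
		} else if (!strncmp(fmt, "%(stage)", 8)) {
			s.kind = SEG_STAGE;
			fmt += 8;
		} else if (*fmt == '%') {
			return -1;
		} else {
			s.len = strcspn(fmt, "%");
			fmt += s.len;
		}
		segs[n++] = s;
	}
	return n;
}

/*
 * Fill `out` (outsz must be at least 1) for one entry from the parsed
 * segments; no string matching happens in this per-entry step.
 */
void render_entry(char *out, size_t outsz, const struct segment *segs,
		  int nr, const struct entry *e)
{
	size_t used = 0;
	int i;

	out[0] = '\0';
	for (i = 0; i < nr; i++) {
		size_t room = outsz - used;
		int w;

		if (segs[i].kind == SEG_PATH)
			w = snprintf(out + used, room, "%s", e->path);
		else if (segs[i].kind == SEG_STAGE)
			w = snprintf(out + used, room, "%d", e->stage);
		else
			w = snprintf(out + used, room, "%.*s",
				     (int)segs[i].len, segs[i].text);
		if (w < 0 || (size_t)w >= room)
			break;	/* error or truncated output: stop */
		used += w;
	}
}
```

The format is parsed once with parse_format(), and render_entry() is
then called for every index entry, so the per-entry cost no longer
depends on how many atom names exist.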

> Best Wishes
>
> Phillip

Thanks

ZheNing Hu

* [PATCH v3] ls-files: introduce "--format" option
  2022-06-19  9:13 ` [PATCH v2] ls-files: introduce "--format" option ZheNing Hu via GitGitGadget
  2022-06-19 13:50   ` Phillip Wood
@ 2022-06-21  2:05   ` ZheNing Hu via GitGitGadget
  2022-06-23 14:06     ` Phillip Wood
                       ` (2 more replies)
  1 sibling, 3 replies; 30+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2022-06-21  2:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder,
	Ævar Arnfjörð Bjarmason, Phillip Wood, ZheNing Hu,
	ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Add a new option --format that outputs index entry
information in a custom format, taking inspiration
from the option with the same name in the `git ls-tree`
command.

--format cannot be used with -s, -o, -k, --resolve-undo,
--deduplicate and --debug.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
    ls-files: introduce "--format" options
    
    v2->v3:
    
     1. remove %(tag) because -t is deprecated, suggested by Phillip.
     2. fix some descriptions of atoms in the documentation, suggested by Phillip.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1262%2Fadlternative%2Fzh%2Fls-file-format-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1262/adlternative/zh/ls-file-format-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1262

Range-diff vs v2:

 1:  67f2c3b8ebe ! 1:  aaafa35ffcd ls-files: introduce "--format" option
     @@ Documentation/git-ls-files.txt: quoted as explained for the configuration variab
      +into the resulting output. For each outputting line, the following
      +names can be used:
      +
     -+tag::
     -+	The tag of file status.
      +objectmode::
     -+	The mode of the object.
     ++	The mode of the file which is in the index.
      +objectname::
     -+	The name of the object.
     ++	The name of the file which is in the index.
      +stage::
     -+	The stage of the file.
     ++	The stage of the file which is in the index.
      +eol::
     -+	The line endings of files.
     ++	The <eolinfo> and <eolattr> of files both in the
     ++	index and the work-tree.
      +path::
     -+	The pathname of the object.
     ++	The pathname of the file which is in the index.
      +ctime::
     -+	The create time of file.
     ++	The create time of file which is in the index.
      +mtime::
     -+	The modify time of file.
     ++	The modified time of file which is in the index.
      +dev::
     -+	The ID of device containing file.
     ++	The ID of device containing file which is in the index.
      +ino::
     -+	The inode number of file.
     ++	The inode number of file which is in the index.
      +uid::
     -+	The user id of file owner.
     ++	The user id of file owner which is in the index.
      +gid::
     -+	The group id of file owner.
     ++	The group id of file owner which is in the index.
      +size::
     -+	The size of the file.
     ++	The size of the file which is in the index.
      +flags::
     -+	The flags of the file.
     ++	The flags of the file in the index which include
     ++	in-memory only flags and some extended on-disk flags.
       
       EXCLUDE PATTERNS
       ----------------
     @@ builtin/ls-files.c: static void show_submodule(struct repository *superproject,
      +	struct show_index_data *data = context;
      +	const char *end;
      +	const char *p;
     -+	unsigned int errlen;
      +	const struct stat_data *sd = &data->ce->ce_stat_data;
      +	size_t len = strbuf_expand_literal_cb(sb, start, NULL);
      +	if (len)
     @@ builtin/ls-files.c: static void show_submodule(struct repository *superproject,
      +		      "does not end in ')'"), start);
      +
      +	len = end - start + 1;
     -+	if (skip_prefix(start, "(tag)", &p))
     -+		strbuf_addstr(sb, get_tag(data->ce, data->tag));
     -+	else if (skip_prefix(start, "(objectmode)", &p))
     ++	if (skip_prefix(start, "(objectmode)", &p))
      +		strbuf_addf(sb, "%06o", data->ce->ce_mode);
      +	else if (skip_prefix(start, "(objectname)", &p))
      +		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
     @@ builtin/ls-files.c: static void show_submodule(struct repository *superproject,
      +		strbuf_addf(sb, "size: %u", sd->sd_size);
      +	else if (skip_prefix(start, "(flags)", &p))
      +		strbuf_addf(sb, "flags: %x", data->ce->ce_flags);
     -+	else {
     -+		errlen = (unsigned long)len;
     -+		die(_("bad ls-files format: %%%.*s"), errlen, start);
     -+	}
     ++	else
     ++		die(_("bad ls-files format: %%%.*s"), (int)len, start);
      +
      +	return len;
      +}
      +
      +static void show_ce_fmt(struct repository *repo, const struct cache_entry *ce,
     -+			const char *format, const char *fullname, const char *tag) {
     ++			const char *format, const char *fullname) {
      +
      +	struct show_index_data data = {
     -+		.tag = tag,
      +		.pathname = fullname,
      +		.istate = repo->index,
      +		.ce = ce,
     @@ builtin/ls-files.c: static void show_ce(struct repository *repo, struct dir_stru
       				  S_ISDIR(ce->ce_mode) ||
       				  S_ISGITLINK(ce->ce_mode))) {
      +		if (format) {
     -+			show_ce_fmt(repo, ce, format, fullname, tag);
     ++			show_ce_fmt(repo, ce, format, fullname);
      +			return;
      +		}
      +
     @@ t/t3013-ls-files-format.sh (new)
      +	git commit -m base
      +'
      +
     -+test_expect_success 'git ls-files --format tag' '
     -+	printf "H \nH \n" >expect &&
     -+	git ls-files --format="%(tag)" -t >actual &&
     -+	test_cmp expect actual
     -+'
     -+
      +test_expect_success 'git ls-files --format objectmode' '
      +	cat >expect <<-\EOF &&
      +	100755


 Documentation/git-ls-files.txt |  51 +++++++++++++-
 builtin/ls-files.c             | 124 ++++++++++++++++++++++++++++++++-
 t/t3013-ls-files-format.sh     | 124 +++++++++++++++++++++++++++++++++
 3 files changed, 295 insertions(+), 4 deletions(-)
 create mode 100755 t/t3013-ls-files-format.sh

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index 0dabf3f0ddc..39211bde797 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -20,7 +20,7 @@ SYNOPSIS
 		[--exclude-standard]
 		[--error-unmatch] [--with-tree=<tree-ish>]
 		[--full-name] [--recurse-submodules]
-		[--abbrev[=<n>]] [--] [<file>...]
+		[--abbrev[=<n>]] [--format=<format>] [--] [<file>...]
 
 DESCRIPTION
 -----------
@@ -192,6 +192,13 @@ followed by the  ("attr/<eolattr>").
 	to the contained files. Sparse directories will be shown with a
 	trailing slash, such as "x/" for a sparse directory "x".
 
+--format=<format>::
+	A string that interpolates `%(fieldname)` from the result being shown.
+	It also interpolates `%%` to `%`, and `%xXX` where `XX` are hex digits
+	interpolates to character with hex code `XX`; for example `%x00`
+	interpolates to `\0` (NUL), `%x09` to `\t` (TAB) and `%x0a` to `\n` (LF).
+	`--format` cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo`,
+	`--deduplicate` and `--debug`.
 \--::
 	Do not interpret any more arguments as options.
 
@@ -223,6 +230,48 @@ quoted as explained for the configuration variable `core.quotePath`
 (see linkgit:git-config[1]).  Using `-z` the filename is output
 verbatim and the line is terminated by a NUL byte.
 
+It is possible to print in a custom format by using the `--format`
+option, which is able to interpolate different fields using
+a `%(fieldname)` notation. For example, if you only care about the
+"objectname" and "path" fields, you can execute with a specific
+"--format" like
+
+	git ls-files --format='%(objectname) %(path)'
+
+FIELD NAMES
+-----------
+Various values from structured fields can be used to interpolate
+into the resulting output. For each outputting line, the following
+names can be used:
+
+objectmode::
+	The mode of the file which is in the index.
+objectname::
+	The name of the file which is in the index.
+stage::
+	The stage of the file which is in the index.
+eol::
+	The <eolinfo> and <eolattr> of files both in the
+	index and the work-tree.
+path::
+	The pathname of the file which is in the index.
+ctime::
+	The create time of file which is in the index.
+mtime::
+	The modified time of file which is in the index.
+dev::
+	The ID of device containing file which is in the index.
+ino::
+	The inode number of file which is in the index.
+uid::
+	The user id of file owner which is in the index.
+gid::
+	The group id of file owner which is in the index.
+size::
+	The size of the file which is in the index.
+flags::
+	The flags of the file in the index which include
+	in-memory only flags and some extended on-disk flags.
 
 EXCLUDE PATTERNS
 ----------------
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index e791b65e7e9..387641b32df 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -11,6 +11,7 @@
 #include "quote.h"
 #include "dir.h"
 #include "builtin.h"
+#include "strbuf.h"
 #include "tree.h"
 #include "cache-tree.h"
 #include "parse-options.h"
@@ -48,6 +49,7 @@ static char *ps_matched;
 static const char *with_tree;
 static int exc_given;
 static int exclude_args;
+static const char *format;
 
 static const char *tag_cached = "";
 static const char *tag_unmerged = "";
@@ -58,8 +60,8 @@ static const char *tag_modified = "";
 static const char *tag_skip_worktree = "";
 static const char *tag_resolve_undo = "";
 
-static void write_eolinfo(struct index_state *istate,
-			  const struct cache_entry *ce, const char *path)
+static void write_eolinfo_internal(struct strbuf *sb, struct index_state *istate,
+				   const struct cache_entry *ce, const char *path)
 {
 	if (show_eol) {
 		struct stat st;
@@ -71,10 +73,25 @@ static void write_eolinfo(struct index_state *istate,
 							       ce->name);
 		if (!lstat(path, &st) && S_ISREG(st.st_mode))
 			w_txt = get_wt_convert_stats_ascii(path);
-		printf("i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
+		if (sb)
+			strbuf_addf(sb, "i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
+		else
+			printf("i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
 	}
 }
 
+static void write_eolinfo(struct index_state *istate,
+			  const struct cache_entry *ce, const char *path)
+{
+	write_eolinfo_internal(NULL, istate, ce, path);
+}
+
+static void write_eolinfo_to_buf(struct strbuf *sb, struct index_state *istate,
+				 const struct cache_entry *ce, const char *path)
+{
+	write_eolinfo_internal(sb, istate, ce, path);
+}
+
 static void write_name(const char *name)
 {
 	/*
@@ -85,6 +102,15 @@ static void write_name(const char *name)
 				   stdout, line_terminator);
 }
 
+static void write_name_to_buf(struct strbuf *sb, const char *name)
+{
+	const char *rel = relative_path(name, prefix_len ? prefix : NULL, sb);
+	if (line_terminator)
+		quote_c_style(rel, sb, NULL, 0);
+	else
+		strbuf_add(sb, rel, strlen(rel));
+}
+
 static const char *get_tag(const struct cache_entry *ce, const char *tag)
 {
 	static char alttag[4];
@@ -222,6 +248,85 @@ static void show_submodule(struct repository *superproject,
 	repo_clear(&subrepo);
 }
 
+struct show_index_data {
+	const char *tag;
+	const char *pathname;
+	struct index_state *istate;
+	const struct cache_entry *ce;
+};
+
+static size_t expand_show_index(struct strbuf *sb, const char *start,
+			       void *context)
+{
+	struct show_index_data *data = context;
+	const char *end;
+	const char *p;
+	const struct stat_data *sd = &data->ce->ce_stat_data;
+	size_t len = strbuf_expand_literal_cb(sb, start, NULL);
+	if (len)
+		return len;
+	if (*start != '(')
+		die(_("bad ls-files format: element '%s' "
+		      "does not start with '('"), start);
+
+	end = strchr(start + 1, ')');
+	if (!end)
+		die(_("bad ls-files format: element '%s' "
+		      "does not end in ')'"), start);
+
+	len = end - start + 1;
+	if (skip_prefix(start, "(objectmode)", &p))
+		strbuf_addf(sb, "%06o", data->ce->ce_mode);
+	else if (skip_prefix(start, "(objectname)", &p))
+		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
+	else if (skip_prefix(start, "(stage)", &p))
+		strbuf_addf(sb, "%d", ce_stage(data->ce));
+	else if (skip_prefix(start, "(eol)", &p))
+		write_eolinfo_to_buf(sb, data->istate,
+				     data->ce, data->pathname);
+	else if (skip_prefix(start, "(path)", &p))
+		write_name_to_buf(sb, data->pathname);
+	else if (skip_prefix(start, "(ctime)", &p))
+		strbuf_addf(sb, "ctime: %u:%u",
+			    sd->sd_ctime.sec, sd->sd_ctime.nsec);
+	else if (skip_prefix(start, "(mtime)", &p))
+		strbuf_addf(sb, "mtime: %u:%u",
+			    sd->sd_mtime.sec, sd->sd_mtime.nsec);
+	else if (skip_prefix(start, "(dev)", &p))
+		strbuf_addf(sb, "dev: %u", sd->sd_dev);
+	else if (skip_prefix(start, "(ino)", &p))
+		strbuf_addf(sb, "ino: %u", sd->sd_ino);
+	else if (skip_prefix(start, "(uid)", &p))
+		strbuf_addf(sb, "uid: %u", sd->sd_uid);
+	else if (skip_prefix(start, "(gid)", &p))
+		strbuf_addf(sb, "gid: %u", sd->sd_gid);
+	else if (skip_prefix(start, "(size)", &p))
+		strbuf_addf(sb, "size: %u", sd->sd_size);
+	else if (skip_prefix(start, "(flags)", &p))
+		strbuf_addf(sb, "flags: %x", data->ce->ce_flags);
+	else
+		die(_("bad ls-files format: %%%.*s"), (int)len, start);
+
+	return len;
+}
+
+static void show_ce_fmt(struct repository *repo, const struct cache_entry *ce,
+			const char *format, const char *fullname) {
+
+	struct show_index_data data = {
+		.pathname = fullname,
+		.istate = repo->index,
+		.ce = ce,
+	};
+
+	struct strbuf sb = STRBUF_INIT;
+	strbuf_expand(&sb, format, expand_show_index, &data);
+	strbuf_addch(&sb, line_terminator);
+	fwrite(sb.buf, sb.len, 1, stdout);
+	strbuf_release(&sb);
+	return;
+}
+
 static void show_ce(struct repository *repo, struct dir_struct *dir,
 		    const struct cache_entry *ce, const char *fullname,
 		    const char *tag)
@@ -236,6 +341,11 @@ static void show_ce(struct repository *repo, struct dir_struct *dir,
 				  max_prefix_len, ps_matched,
 				  S_ISDIR(ce->ce_mode) ||
 				  S_ISGITLINK(ce->ce_mode))) {
+		if (format) {
+			show_ce_fmt(repo, ce, format, fullname);
+			return;
+		}
+
 		tag = get_tag(ce, tag);
 
 		if (!show_stage) {
@@ -675,6 +785,9 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 			 N_("suppress duplicate entries")),
 		OPT_BOOL(0, "sparse", &show_sparse_dirs,
 			 N_("show sparse directories in the presence of a sparse index")),
+		OPT_STRING_F(0, "format", &format, N_("format"),
+			     N_("format to use for the output"),
+			     PARSE_OPT_NONEG),
 		OPT_END()
 	};
 	int ret = 0;
@@ -699,6 +812,11 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 	for (i = 0; i < exclude_list.nr; i++) {
 		add_pattern(exclude_list.items[i].string, "", 0, pl, --exclude_args);
 	}
+
+	if (format && (show_stage || show_others || show_killed ||
+		show_resolve_undo || skipping_duplicates || debug_mode))
+			die(_("ls-files --format cannot be used with -s, -o, -k, --resolve-undo, --deduplicate, --debug"));
+
 	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
 		tag_cached = "H ";
 		tag_unmerged = "M ";
diff --git a/t/t3013-ls-files-format.sh b/t/t3013-ls-files-format.sh
new file mode 100755
index 00000000000..8c3ef2df138
--- /dev/null
+++ b/t/t3013-ls-files-format.sh
@@ -0,0 +1,124 @@
+#!/bin/sh
+
+test_description='git ls-files --format test'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	echo o1 >o1 &&
+	echo o2 >o2 &&
+	git add o1 o2 &&
+	git add --chmod +x o1 &&
+	git commit -m base
+'
+
+test_expect_success 'git ls-files --format objectmode' '
+	cat >expect <<-\EOF &&
+	100755
+	100644
+	EOF
+	git ls-files --format="%(objectmode)" -t >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format objectname' '
+	oid1=$(git hash-object o1) &&
+	oid2=$(git hash-object o2) &&
+	cat >expect <<-EOF &&
+	$oid1
+	$oid2
+	EOF
+	git ls-files --format="%(objectname)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format eol' '
+	printf "i/lf    w/lf    attr/                 \t\n" >expect &&
+	printf "i/lf    w/lf    attr/                 \t\n" >>expect &&
+	git ls-files --format="%(eol)" --eol >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format path' '
+	cat >expect <<-\EOF &&
+	o1
+	o2
+	EOF
+	git ls-files --format="%(path)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format ctime' '
+	git ls-files --debug >out &&
+	grep ctime out >expect &&
+	git ls-files --format="  %(ctime)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format mtime' '
+	git ls-files --debug >out &&
+	grep mtime out >expect &&
+	git ls-files --format="  %(mtime)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format dev and ino' '
+	git ls-files --debug >out &&
+	grep dev out >expect &&
+	git ls-files --format="  %(dev)%x09%(ino)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format uid and gid' '
+	git ls-files --debug >out &&
+	grep uid out >expect &&
+	git ls-files --format="  %(uid)%x09%(gid)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -m' '
+	echo change >o1 &&
+	cat >expect <<-\EOF &&
+	o1
+	EOF
+	git ls-files --format="%(path)" -m >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -d' '
+	echo o3 >o3 &&
+	git add o3 &&
+	rm o3 &&
+	cat >expect <<-\EOF &&
+	o3
+	EOF
+	git ls-files --format="%(path)" -d >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format size and flags' '
+	git ls-files --debug >out &&
+	grep size out >expect &&
+	git ls-files --format="  %(size)%x09%(flags)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format imitate --stage' '
+	git ls-files --stage >expect &&
+	git ls-files --format="%(objectmode) %(objectname) %(stage)%x09%(path)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format imitate --debug' '
+	git ls-files --debug >expect &&
+	git ls-files --format="%(path)%x0a  %(ctime)%x0a  %(mtime)%x0a  %(dev)%x09%(ino)%x0a  %(uid)%x09%(gid)%x0a  %(size)%x09%(flags)" >actual &&
+	test_cmp expect actual
+'
+
+for flag in -s -o -k --resolve-undo --deduplicate --debug
+do
+	test_expect_success "git ls-files --format is incompatible with $flag" '
+		test_must_fail git ls-files --format="%(objectname)" $flag
+	'
+done
+test_done

base-commit: ab336e8f1c8009c8b1aab8deb592148e69217085
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-21  2:05   ` [PATCH v3] " ZheNing Hu via GitGitGadget
@ 2022-06-23 14:06     ` Phillip Wood
  2022-06-23 15:57       ` Junio C Hamano
  2022-06-26 13:01       ` ZheNing Hu
  2022-06-24 13:25     ` Ævar Arnfjörð Bjarmason
  2022-06-26 15:29     ` [PATCH v4] " ZheNing Hu via GitGitGadget
  2 siblings, 2 replies; 30+ messages in thread
From: Phillip Wood @ 2022-06-23 14:06 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget, git
  Cc: Junio C Hamano, Christian Couder,
	Ævar Arnfjörð Bjarmason, ZheNing Hu

Hi ZheNing

On 21/06/2022 03:05, ZheNing Hu via GitGitGadget wrote:
> From: ZheNing Hu <adlternative@gmail.com>
> 
> Add a new option --format that output index enties
> informations with custom format, taking inspiration
> from the option with the same name in the `git ls-tree`
> command.
> 
> --format cannot used with -s, -o, -k, --resolve-undo,
> --deduplicate and --debug.
> 
> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---
>      ls-files: introduce "--format" options
>      
>      v2->v3:
>      
>       1. remove %(tag) because -t is deprecated, suggested by Phillip.
>       2. fix some description of atoms in document, suggested by Phillip..

Thanks for re-rolling. Having taken a closer look at the tests, 
I'm concerned about the output format for some of the specifiers, see below.

> [...]  
> +It is possible to print in a custom format by using the `--format`
> +option, which is able to interpolate different fields using
> +a `%(fieldname)` notation. For example, if you only care about the
> +"objectname" and "path" fields, you can execute with a specific
> +"--format" like
> +
> +	git ls-files --format='%(objectname) %(path)'
> +
> +FIELD NAMES
> +-----------
> +Various values from structured fields can be used to interpolate
> +into the resulting output. For each outputting line, the following
> +names can be used:
> +
> +objectmode::
> +	The mode of the file which is in the index.
> +objectname::
> +	The name of the file which is in the index.
> +stage::
> +	The stage of the file which is in the index.
> +eol::
> +	The <eolinfo> and <eolattr> of files both in the
> +	index and the work-tree.

Looking at the test for this option I think it needs more work. Why 
should --format arbitrarily append a tab to the end of the output? The 
user should be able to specify a separator as part of the format string 
if they want one. Also I'm not sure why there is so much whitespace in 
the output.

> +path::
> +	The pathname of the file which is in the index.

I think that for all these it might be clearer to say "recorded in the 
index" rather than "of the file which is in the index"

> +ctime::
> +	The create time of file which is in the index.

This is printed with a prefix 'ctime:' (the same applies to the format 
specifiers below). I think we should omit that and just print the data so 
the user can choose the format they want.

> +mtime::
> +	The modified time of file which is in the index.
> +dev::
> +	The ID of device containing file which is in the index.
> +ino::
> +	The inode number of file which is in the index.
> +uid::
> +	The user id of file owner which is in the index.
> +gid::
> +	The group id of file owner which is in the index.
> +size::
> +	The size of the file which is in the index.
> +flags::
> +	The flags of the file in the index which include
> +	in-memory only flags and some extended on-disk flags.

If %(flags) is going to be useful then I think we need to think about 
how they are printed and document that. At the moment they are printed 
as a hexadecimal number which is fine for debugging but probably not 
going to be useful for something like --format. I think printing 
documented symbolic names with some kind of separator (a comma maybe) 
between them is probably more useful.
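
As a rough illustration of the symbolic-name idea (a sketch only, not 
code from this patch): the flag names and bit values below are made up 
for the example and are not git's real CE_* definitions, and 
demo_format_flags() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical bit assignments, for illustration only. */
#define DEMO_FLAG_VALID         (1u << 0)
#define DEMO_FLAG_INTENT_TO_ADD (1u << 1)
#define DEMO_FLAG_SKIP_WORKTREE (1u << 2)

static const struct {
	unsigned int bit;
	const char *name;
} demo_flag_names[] = {
	{ DEMO_FLAG_VALID, "valid" },
	{ DEMO_FLAG_INTENT_TO_ADD, "intent-to-add" },
	{ DEMO_FLAG_SKIP_WORKTREE, "skip-worktree" },
};

/*
 * Write the names of all set flags into buf, separated by commas.
 * Returns the number of bytes written, not counting the NUL.
 * Stops before the first name that does not fit.
 */
static size_t demo_format_flags(unsigned int flags, char *buf, size_t size)
{
	size_t len = 0;
	size_t i;

	assert(buf && size > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(demo_flag_names) / sizeof(demo_flag_names[0]); i++) {
		int n;

		if (!(flags & demo_flag_names[i].bit))
			continue;
		n = snprintf(buf + len, size - len, "%s%s",
			     len ? "," : "", demo_flag_names[i].name);
		if (n < 0 || (size_t)n >= size - len) {
			buf[len] = '\0'; /* drop the partially written name */
			break;
		}
		len += (size_t)n;
	}
	return len;
}
```

With output like this a script can test for a single name instead of 
decoding a hexadecimal mask.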

> [...]
> +test_expect_success 'git ls-files --format eol' '
> +	printf "i/lf    w/lf    attr/                 \t\n" >expect &&
> +	printf "i/lf    w/lf    attr/                 \t\n" >>expect &&
> +	git ls-files --format="%(eol)" --eol >actual &&

I'm not sure why this is passing --eol as well as --format='%(eol)' - 
shouldn't that combination of flags be an error?

Best Wishes

Phillip

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-23 14:06     ` Phillip Wood
@ 2022-06-23 15:57       ` Junio C Hamano
  2022-06-24 10:16         ` Phillip Wood
  2022-06-24 13:20         ` Ævar Arnfjörð Bjarmason
  2022-06-26 13:01       ` ZheNing Hu
  1 sibling, 2 replies; 30+ messages in thread
From: Junio C Hamano @ 2022-06-23 15:57 UTC (permalink / raw)
  To: Phillip Wood
  Cc: ZheNing Hu via GitGitGadget, git, Christian Couder,
	Ævar Arnfjörð Bjarmason, ZheNing Hu

Phillip Wood <phillip.wood123@gmail.com> writes:

> Thanks for re-rolling, having taken a look a closer look at the tests
> I'm concerned about the output format for some of the specifiers, see
> below.

Thanks for raising these issues.  I agree with you on many of them.
In addition to what you covered ....

>> +path::
>> +	The pathname of the file which is in the index.
> I think that for all these it might be clearer to say "recorded in the
> index" rather than "of the file which is in the index"

I think we would call this "name".  The name of the existing option
that controls how they are shown is "--full-name", not "--full-path",
for example.

>> +ctime::
>> +	The create time of file which is in the index.
>
> This is printed with a prefix 'ctime:' (the same applies to the format
> specifiers below) I think we should omit that and just print the data
> so the user can choose the format they want.
>
>> +mtime::
>> +	The modified time of file which is in the index.

These are only the low-bits of the full timestamp, not ctime/mtime
themselves.

But stepping back a bit, why do we need to include them in the
output?  What workflow and use case are we trying to help?  Dump
output from "stat <path>" equivalent from ls-files and compare with
"stat ." output to see which ones are stale?  Or is there any value
to see the value of, say, ctime as an individual data item?

>> +dev::
>> +	The ID of device containing file which is in the index.
>> +ino::
>> +	The inode number of file which is in the index.
>> +uid::
>> +	The user id of file owner which is in the index.
>> +gid::
>> +	The group id of file owner which is in the index.

Again, why do we need to include these in the output?

Wouldn't it be sufficient, as well as a lot more useful, to show a
single bit "the cached stat info matches what is in the working tree
(yes/no)"?

>> +size::
>> +	The size of the file which is in the index.

This needs to explain what kind of size this is.  Is it the size of
the blob object?  Is it the size of the file in the working tree
(i.e. not cleaned)?  Is it _always_ the size, or can it become a
number that is very different from size in certain circumstances?

IOW, I think giving this to unsuspecting users and calling it
"size of the file" hurts them more than it helps them, especially
because it is not always the size of the file.

I'd suggest getting rid of everything from ctime down to size and if
we really care about the freshness of the cached stat info, replace
them with a single bit "up-to-date".
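
To make the "single bit" idea concrete, here is a minimal sketch.  It
is not git's implementation: struct demo_cached_stat and both helpers
are hypothetical, and the fields are kept as 32-bit values only to
mirror the "%u" output of --debug.

```c
#include <assert.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical stand-in for the stat data cached per index entry. */
struct demo_cached_stat {
	unsigned int mtime_sec;
	unsigned int ino;
	unsigned int uid;
	unsigned int gid;
	unsigned int size;
};

/* Record the (truncated) stat fields we want to remember. */
static void demo_fill_cached_stat(struct demo_cached_stat *c,
				  const struct stat *st)
{
	assert(c && st);
	c->mtime_sec = (unsigned int)st->st_mtime;
	c->ino = (unsigned int)st->st_ino;
	c->uid = (unsigned int)st->st_uid;
	c->gid = (unsigned int)st->st_gid;
	c->size = (unsigned int)st->st_size;
}

/* Return 1 if the cached data still matches st, 0 otherwise. */
static int demo_stat_matches(const struct demo_cached_stat *c,
			     const struct stat *st)
{
	struct demo_cached_stat now;

	assert(c && st);
	demo_fill_cached_stat(&now, st);
	return c->mtime_sec == now.mtime_sec &&
	       c->ino == now.ino &&
	       c->uid == now.uid &&
	       c->gid == now.gid &&
	       c->size == now.size;
}
```

A placeholder built on such a check would print one yes/no answer per
entry instead of exposing every cached field separately.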

>> +flags::
>> +	The flags of the file in the index which include
>> +	in-memory only flags and some extended on-disk flags.
>
> If %(flags) is going to be useful then I think we need to think about
> how they are printed and document that. At the moment they are printed 
> as a hexadecimal number which is fine for debugging but probably not
> going to be useful for something like --format. I think printing 
> documented symbolic names with some kind of separator (a comma maybe)
> between them is probably more useful

I am guessing that most of the above are only useful for curious
geeks and those who are debugging their new tweak to the code that
touches the index, i.e. a debugging feature.  But these folks can
run "git" under a debugger, and they probably have to do so when
they are seeing an unexpected value in the flags member of a cache
entry anyway.  So I am not sure whom this field is intended to help.

>> [...]
>> +test_expect_success 'git ls-files --format eol' '
>> +	printf "i/lf    w/lf    attr/                 \t\n" >expect &&
>> +	printf "i/lf    w/lf    attr/                 \t\n" >>expect &&
>> +	git ls-files --format="%(eol)" --eol >actual &&
>
> I'm not sure why this is passing --eol as well as --format='%(eol)' -
> shouldn't that combination of flags be an error?

Good eyes.

Thanks.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-23 15:57       ` Junio C Hamano
@ 2022-06-24 10:16         ` Phillip Wood
  2022-06-26 13:05           ` ZheNing Hu
  2022-06-24 13:20         ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 30+ messages in thread
From: Phillip Wood @ 2022-06-24 10:16 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: ZheNing Hu via GitGitGadget, git, Christian Couder,
	Ævar Arnfjörð Bjarmason, ZheNing Hu

On 23/06/2022 16:57, Junio C Hamano wrote:
> Phillip Wood <phillip.wood123@gmail.com> writes:
> 
>> Thanks for re-rolling, having taken a look a closer look at the tests
>> I'm concerned about the output format for some of the specifiers, see
>> below.
> 
> Thanks for raising these issues.  I agree with you on many of them.
> In addition to what you covered ....
> 
>>> +path::
>>> +	The pathname of the file which is in the index.
>> I think that for all these it might be clearer to say "recorded in the
>> index" rather than "of the file which is in the index"
> 
> I think we would call this "name".  The name of the existing option
> that controls how they are shown is "--full-name", not "--full-path",
> for example.

That's a good point. I've also just noticed that this is another case 
where a separator character is printed automatically when the 
format string is expanded. I think it is probably right to format the 
name based on whether or not -z was passed, but we should leave it up to 
the user to supply a delimiter in the format string.

>>> +ctime::
>>> +	The create time of file which is in the index.
>>
>> This is printed with a prefix 'ctime:' (the same applies to the format
>> specifiers below) I think we should omit that and just print the data
>> so the user can choose the format they want.
>>
>>> +mtime::
>>> +	The modified time of file which is in the index.
> 
> These are only the low-bits of the full timestamp, not ctime/mtime
> themselves.
> 
> But stepping back a bit, why do we need to include them in the
> output?  What workflow and use case are we trying to help?  Dump
> output from "stat <path>" equivalent from ls-files and compare with
> "stat ." output to see which ones are stale?  Or is there any value
> to see the value of, say, ctime as an individual data item?
> 
>>> +dev::
>>> +	The ID of device containing file which is in the index.
>>> +ino::
>>> +	The inode number of file which is in the index.
>>> +uid::
>>> +	The user id of file owner which is in the index.
>>> +gid::
>>> +	The group id of file owner which is in the index.
> 
> Again, why do we need to include these in the output?
> 
> Wouldn't it be sufficient, as well as a lot more useful, to show a
> single bit "the cached stat info matches what is in the working tree
> (yes/no)"?

That does sound useful

>>> +flags::
>>> +	The flags of the file in the index which include
>>> +	in-memory only flags and some extended on-disk flags.
>>
>> If %(flags) is going to be useful then I think we need to think about
>> how they are printed and document that. At the moment they are printed
>> as a hexadecimal number which is fine for debugging but probably not
>> going to be useful for something like --format. I think printing
>> documented symbolic names with some kind of separator (a comma maybe)
>> between them is probably more useful
> 
> I am guessing that most of the above are only useful for curious
> geeks and those who are debugging their new tweak to the code that
> touches the index, i.e. a debugging feature.  But these folks can
> run "git" under a debugger, and they probably have to do so when
> they are seeing an unexpected value in the flags member of a cache
> entry anyway.  So I am not sure whom this field is intended to help.

I wondered about that as well, but thought there might be a plausible 
use if someone wants to check if an entry is marked intent-to-add, or 
has the skip-worktree/sparse-index bits set (are there other ways to 
inspect those?)

Best Wishes

Phillip

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-23 15:57       ` Junio C Hamano
  2022-06-24 10:16         ` Phillip Wood
@ 2022-06-24 13:20         ` Ævar Arnfjörð Bjarmason
  2022-06-24 15:28           ` Junio C Hamano
  1 sibling, 1 reply; 30+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-24 13:20 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Phillip Wood, ZheNing Hu via GitGitGadget, git, Christian Couder,
	ZheNing Hu


On Thu, Jun 23 2022, Junio C Hamano wrote:

> Phillip Wood <phillip.wood123@gmail.com> writes:
>
>> Thanks for re-rolling, having taken a look a closer look at the tests
>> I'm concerned about the output format for some of the specifiers, see
>> below.
>
> Thanks for raising these issues.  I agree with you on many of them.
> In addition to what you covered ....
>
>>> +path::
>>> +	The pathname of the file which is in the index.
>> I think that for all these it might be clearer to say "recorded in the
>> index" rather than "of the file which is in the index"
>
> I think we would call this "name".  The name of the existing option
> that controls how they are shown is "--full-name", not "--full-path",
> for example.

To the extent that we got this wrong it was me in 455923e0a15 (ls-tree:
introduce "--format" option, 2022-03-23), but given that we have that I
think it makes sense to have this be consistent with ls-tree.

FWIW ls-tree also uses "name" options, but its docs talked about
"<path>", so I thought it was more helpful to pick that.

We also say that we will "show the full path names" in that
documentation.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-21  2:05   ` [PATCH v3] " ZheNing Hu via GitGitGadget
  2022-06-23 14:06     ` Phillip Wood
@ 2022-06-24 13:25     ` Ævar Arnfjörð Bjarmason
  2022-06-24 15:33       ` Junio C Hamano
  2022-06-26 13:34       ` ZheNing Hu
  2022-06-26 15:29     ` [PATCH v4] " ZheNing Hu via GitGitGadget
  2 siblings, 2 replies; 30+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-24 13:25 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Junio C Hamano, Christian Couder, Phillip Wood, ZheNing Hu


On Tue, Jun 21 2022, ZheNing Hu via GitGitGadget wrote:

> From: ZheNing Hu <adlternative@gmail.com>
> [...]
> +	if (skip_prefix(start, "(objectmode)", &p))
> +		strbuf_addf(sb, "%06o", data->ce->ce_mode);
> +	else if (skip_prefix(start, "(objectname)", &p))
> +		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
> +	else if (skip_prefix(start, "(stage)", &p))
> +		strbuf_addf(sb, "%d", ce_stage(data->ce));
> +	else if (skip_prefix(start, "(eol)", &p))
> +		write_eolinfo_to_buf(sb, data->istate,
> +				     data->ce, data->pathname);
> +	else if (skip_prefix(start, "(path)", &p))
> +		write_name_to_buf(sb, data->pathname);
> +	else if (skip_prefix(start, "(ctime)", &p))
> +		strbuf_addf(sb, "ctime: %u:%u",
> +			    sd->sd_ctime.sec, sd->sd_ctime.nsec);
> +	else if (skip_prefix(start, "(mtime)", &p))
> +		strbuf_addf(sb, "mtime: %u:%u",
> +			    sd->sd_mtime.sec, sd->sd_mtime.nsec);
> +	else if (skip_prefix(start, "(dev)", &p))
> +		strbuf_addf(sb, "dev: %u", sd->sd_dev);
> +	else if (skip_prefix(start, "(ino)", &p))
> +		strbuf_addf(sb, "ino: %u", sd->sd_ino);
> +	else if (skip_prefix(start, "(uid)", &p))
> +		strbuf_addf(sb, "uid: %u", sd->sd_uid);
> +	else if (skip_prefix(start, "(gid)", &p))
> +		strbuf_addf(sb, "gid: %u", sd->sd_gid);
> +	else if (skip_prefix(start, "(size)", &p))
> +		strbuf_addf(sb, "size: %u", sd->sd_size);
> +	else if (skip_prefix(start, "(flags)", &p))
> +		strbuf_addf(sb, "flags: %x", data->ce->ce_flags);


In my mind almost the entire point of a --format is that you can
e.g. \0-delimit it, and don't need to do other parsing games.

So this really should be adding just e.g. "%x", not "flags: %x".

Similarly, let's not have :-delimited fields. First, for a formatted
number "1656077225:850723245" is just bizarre for %(ctime), let's use
".", not ":", so: "1656077225.850723245".

And let's call that %(ctime), then have (which is trivial to add) a
%(ctime:sec) and %(ctime:nsec), so someone who wants to format this can
parse it as they please, ditto for mtime.
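
A minimal sketch of what I mean, outside of git's strbuf machinery:
demo_format_time() is a hypothetical helper, and zero-padding the
nanoseconds to nine digits is my assumption so that "sec.nsec" reads
as a decimal number. The same would apply to mtime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Expand "ctime", "ctime:sec" or "ctime:nsec" into buf, printing only
 * the value with no "ctime: " prefix.
 * Returns 0 on success, -1 on an unknown specifier or truncation.
 */
static int demo_format_time(const char *spec, unsigned int sec,
			    unsigned int nsec, char *buf, size_t size)
{
	int n;

	assert(spec && buf && size > 0);
	if (!strcmp(spec, "ctime"))
		n = snprintf(buf, size, "%u.%09u", sec, nsec);
	else if (!strcmp(spec, "ctime:sec"))
		n = snprintf(buf, size, "%u", sec);
	else if (!strcmp(spec, "ctime:nsec"))
		n = snprintf(buf, size, "%u", nsec);
	else
		return -1;
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Passing "ctime" with 1656077225 and 850723245 fills buf with
"1656077225.850723245"; the :sec and :nsec variants give each half on
its own.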

Looking at your tests it seemed you went down the route of aligning the
output with the --debug output, which is already pre-formatted. I.e. to
make what you have here match:

                printf("  ctime: %u:%u\n", sd->sd_ctime.sec, sd->sd_ctime.nsec);
                printf("  mtime: %u:%u\n", sd->sd_mtime.sec, sd->sd_mtime.nsec);
                printf("  dev: %u\tino: %u\n", sd->sd_dev, sd->sd_ino);
                printf("  uid: %u\tgid: %u\n", sd->sd_uid, sd->sd_gid);
                printf("  size: %u\tflags: %x\n", sd->sd_size, ce->ce_flags);

I think that's a mistake, we should be able to emit those individual
%-specifiers instead, not that line as-is without the " " prefix and
"\n" suffix.

> +
> +	if (format && (show_stage || show_others || show_killed ||
> +		show_resolve_undo || skipping_duplicates || debug_mode))
> +			die(_("ls-files --format cannot used with -s, -o, -k, --resolve-undo, --deduplicate, --debug"));

Use usage_msg_opt() or usage_msg_optf() here instead of die(), and no
need to include "ls-files " in the message.

See die_for_incompatible_opt4, maybe you can just use that instead? A
bit painful, but:

    die_for_incompatible_opt4(format, "--format", show_stage, "-s", show_others, "-o", show_killed, "-k");
    die_for_incompatible_opt4(format, "--format", show_resolve_undo, "--resolve-undo", skipping_duplicates, "--deduplicate", debug_mode, "--debug");

But urgh, that helper really should use usage_msg_opt() instead, but
using it for now as-is probably sucks less.

I also think we should not forbid combining this with --debug, it's
helpful to construct a format. This seems to work:
		
	diff --git a/builtin/ls-files.c b/builtin/ls-files.c
	index 387641b32df..82f13edef7e 100644
	--- a/builtin/ls-files.c
	+++ b/builtin/ls-files.c
	@@ -343,12 +343,17 @@ static void show_ce(struct repository *repo, struct dir_struct *dir,
	 				  S_ISGITLINK(ce->ce_mode))) {
	 		if (format) {
	 			show_ce_fmt(repo, ce, format, fullname);
	-			return;
	+			if (!debug_mode)
	+				return;
	 		}
	 
	 		tag = get_tag(ce, tag);
	 
	-		if (!show_stage) {
	+		if (format) {
	+			if (!debug_mode)
	+				BUG("unreachable");
	+			; /* for --debug */
	+		} else if (!show_stage) {
	 			fputs(tag, stdout);
	 		} else {
	 			printf("%s%06o %s %d\t",
	@@ -814,7 +819,7 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
	 	}
	 
	 	if (format && (show_stage || show_others || show_killed ||
	-		show_resolve_undo || skipping_duplicates || debug_mode))
	+		show_resolve_undo || skipping_duplicates))
	 			die(_("ls-files --format cannot used with -s, -o, -k, --resolve-undo, --deduplicate, --debug"));
	 
	 	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
	
I.e. we'll get:
	
	$ ./git ls-files --debug --format='<%(flags) %(path)>'  -- po/is.po
	<flags: 0 po/is.po>
	po/is.po
	  ctime: 1654300098:369653868
	  mtime: 1654300098:369653868
	  dev: 2306     ino: 10487322
	  uid: 1001     gid: 1001
	  size: 3370    flags: 0

Which I think is quite useful when poking around in this and coming up
with a format.

> +
>  	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
>  		tag_cached = "H ";
>  		tag_unmerged = "M ";
> diff --git a/t/t3013-ls-files-format.sh b/t/t3013-ls-files-format.sh
> new file mode 100755
> index 00000000000..8c3ef2df138
> --- /dev/null
> +++ b/t/t3013-ls-files-format.sh
> @@ -0,0 +1,124 @@
> +#!/bin/sh
> +
> +test_description='git ls-files --format test'
> +

Add this line here:

TEST_PASSES_SANITIZE_LEAK=true

I.e. just before test-lib.sh, see other test examples. Then we'll test
this under SANITIZE=leak in CI, to ensure it doesn't leak memory.

> +. ./test-lib.sh
> +
> +test_expect_success 'setup' '
> +	echo o1 >o1 &&
> +	echo o2 >o2 &&
> +	git add o1 o2 &&
> +	git add --chmod +x o1 &&
> +	git commit -m base
> +'
> +
> [...]

> +for flag in -s -o -k --resolve-undo --deduplicate --debug
> +do
> +	test_expect_success "git ls-files --format is incompatible with $flag" '
> +		test_must_fail git ls-files --format="%(objectname)" $flag
> +	'
> +done

Nit: I think it's good to move these sorts of tests before "setup", and
give them a "usage: " prefix, see some other existing examples.

We usually use test_expect_code 129 for those, depending on if you'll
end up with die() or not...

nit: missing \n before this line:

> +test_done
>
> base-commit: ab336e8f1c8009c8b1aab8deb592148e69217085


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-24 13:20         ` Ævar Arnfjörð Bjarmason
@ 2022-06-24 15:28           ` Junio C Hamano
  0 siblings, 0 replies; 30+ messages in thread
From: Junio C Hamano @ 2022-06-24 15:28 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Phillip Wood, ZheNing Hu via GitGitGadget, git, Christian Couder,
	ZheNing Hu

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> We also say that we will "show the full path names" in that
> documentation.

The primary issue is not the presence of "name" there, but the lack
of "path" in the word chosen.

Many things can have "name" (including "object name"), and "path",
not "name", in "path name" is what clarifies what kind of name it
is.  Given that --format placeholders include "objectname", it does
not make a good design to use "name" alone without saying what kind
of "name" it is.

Calling it "pathname", not just "path", is perfectly OK.  But if
there is no other things the word "path" could refer to in this
context, which I think is the case here, "path" would be acceptable.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-24 13:25     ` Ævar Arnfjörð Bjarmason
@ 2022-06-24 15:33       ` Junio C Hamano
  2022-06-26 13:35         ` ZheNing Hu
  2022-06-26 13:34       ` ZheNing Hu
  1 sibling, 1 reply; 30+ messages in thread
From: Junio C Hamano @ 2022-06-24 15:33 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: ZheNing Hu via GitGitGadget, git, Christian Couder, Phillip Wood,
	ZheNing Hu

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Tue, Jun 21 2022, ZheNing Hu via GitGitGadget wrote:
>> +	if (skip_prefix(start, "(objectmode)", &p))
>> +		strbuf_addf(sb, "%06o", data->ce->ce_mode);
>> +	else if (skip_prefix(start, "(objectname)", &p))
>> +		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
>> +	else if (skip_prefix(start, "(stage)", &p))
>> +		strbuf_addf(sb, "%d", ce_stage(data->ce));
>> +	else if (skip_prefix(start, "(path)", &p))
>> +		write_name_to_buf(sb, data->pathname);

These are just "values".

>> +	else if (skip_prefix(start, "(ctime)", &p))
>> +		strbuf_addf(sb, "ctime: %u:%u",
>> +			    sd->sd_ctime.sec, sd->sd_ctime.nsec);
>> +	else if (skip_prefix(start, "(mtime)", &p))
>> +		strbuf_addf(sb, "mtime: %u:%u",
>> +			    sd->sd_mtime.sec, sd->sd_mtime.nsec);
>> +	else if (skip_prefix(start, "(dev)", &p))
>> +		strbuf_addf(sb, "dev: %u", sd->sd_dev);
>> +	else if (skip_prefix(start, "(ino)", &p))
>> +		strbuf_addf(sb, "ino: %u", sd->sd_ino);
>> +	else if (skip_prefix(start, "(uid)", &p))
>> +		strbuf_addf(sb, "uid: %u", sd->sd_uid);
>> +	else if (skip_prefix(start, "(gid)", &p))
>> +		strbuf_addf(sb, "gid: %u", sd->sd_gid);
>> +	else if (skip_prefix(start, "(size)", &p))
>> +		strbuf_addf(sb, "size: %u", sd->sd_size);
>> +	else if (skip_prefix(start, "(flags)", &p))
>> +		strbuf_addf(sb, "flags: %x", data->ce->ce_flags);

These are not.

> In my mind almost the entire point of a --format is that you can
> e.g. \0-delimit it, and don't need to do other parsing games.
>
> So this really should be adding just e.g. "%x", not "flags: %x", 

Yes.  A very good point, if we were showing these fields (I already
said I doubt it is useful), they should also show just "values".
After all, people can do "--format=mode: %(objectmode)" if they want
an identifying tag before the value.

Thanks.
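
To illustrate the point about bare values, here is a minimal,
self-contained C sketch. It is not the patch's code: `struct entry`,
`expand()` and the local `skip_prefix()` are hypothetical stand-ins (the
last one only mimics the shape of git's helper of the same name), and
only three atoms are handled. Each atom emits its value alone, and any
label comes from literal text in the format string.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the index-entry fields used below. */
struct entry {
	unsigned int mode;
	int stage;
	const char *path;
};

/* Local helper that mimics the shape of git's skip_prefix(). */
static int skip_prefix(const char *s, const char *prefix, const char **rest)
{
	size_t len = strlen(prefix);

	if (strncmp(s, prefix, len))
		return 0;
	*rest = s + len;
	return 1;
}

/*
 * Expand fmt into out (always NUL-terminated when outsz > 0).
 * Atoms emit bare values; everything else is copied literally.
 */
static void expand(const char *fmt, const struct entry *e,
		   char *out, size_t outsz)
{
	size_t n = 0;

	if (!outsz)
		return;
	out[0] = '\0';
	while (*fmt && n + 1 < outsz) {
		const char *rest;
		int w;

		if (skip_prefix(fmt, "%(objectmode)", &rest))
			w = snprintf(out + n, outsz - n, "%06o", e->mode);
		else if (skip_prefix(fmt, "%(stage)", &rest))
			w = snprintf(out + n, outsz - n, "%d", e->stage);
		else if (skip_prefix(fmt, "%(path)", &rest))
			w = snprintf(out + n, outsz - n, "%s", e->path);
		else {
			/* Literal character: copy it through unchanged. */
			w = snprintf(out + n, outsz - n, "%c", *fmt);
			rest = fmt + 1;
		}
		if (w < 0 || (size_t)w >= outsz - n)
			break;	/* error or truncated: stop here */
		n += (size_t)w;
		fmt = rest;
	}
}
```

With an entry of mode 0100644, stage 0 and path "o1", the format
"mode: %(objectmode) %(path)" expands to "mode: 100644 o1", while
"%(objectmode)" alone gives just "100644", which is what a script
splitting on a delimiter would want.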

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-23 14:06     ` Phillip Wood
  2022-06-23 15:57       ` Junio C Hamano
@ 2022-06-26 13:01       ` ZheNing Hu
  1 sibling, 0 replies; 30+ messages in thread
From: ZheNing Hu @ 2022-06-26 13:01 UTC (permalink / raw)
  To: Phillip Wood
  Cc: ZheNing Hu via GitGitGadget, Git List, Junio C Hamano,
	Christian Couder, Ævar Arnfjörð Bjarmason

Phillip Wood <phillip.wood123@gmail.com> wrote on Thu, Jun 23, 2022 at 22:06:
>
> Hi ZheNing
> > [...]
> > +It is possible to print in a custom format by using the `--format`
> > +option, which is able to interpolate different fields using
> > +a `%(fieldname)` notation. For example, if you only care about the
> > +"objectname" and "path" fields, you can execute with a specific
> > +"--format" like
> > +
> > +     git ls-files --format='%(objectname) %(path)'
> > +
> > +FIELD NAMES
> > +-----------
> > +Various values from structured fields can be used to interpolate
> > +into the resulting output. For each outputting line, the following
> > +names can be used:
> > +
> > +objectmode::
> > +     The mode of the file which is in the index.
> > +objectname::
> > +     The name of the file which is in the index.
> > +stage::
> > +     The stage of the file which is in the index.
> > +eol::
> > +     The <eolinfo> and <eolattr> of files both in the
> > +     index and the work-tree.
>
> Looking at the test for this option I think it needs more work, why
> should --format arbitrarily append a tab to the end of the output? - the
> user should be able to specify a separator if they want one as part of
> the format string. Also I'm not sure why there is so much whitespace in
> the output.
>

That is because I reused the old output format from write_eolinfo(); I now
think that was wrong. I will split it into three atoms: %(eolinfo:index),
%(eolinfo:worktree) and %(eolattr).
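
To make the source of the extra whitespace and the trailing tab concrete,
here is a small C sketch. The padded format string is the one used by the
existing --eol code path; the function names `format_eol_padded()` and
`format_eol_value()` are hypothetical and only serve this illustration of
"one pre-padded blob" versus "one bare value per atom".

```c
#include <stdio.h>

/* Old style: one pre-padded string with labels and a trailing tab. */
static int format_eol_padded(char *out, size_t outsz,
			     const char *i_txt, const char *w_txt,
			     const char *a_txt)
{
	return snprintf(out, outsz, "i/%-5s w/%-5s attr/%-17s\t",
			i_txt, w_txt, a_txt);
}

/* Per-atom style: the bare value, with no label, padding or separator. */
static int format_eol_value(char *out, size_t outsz, const char *txt)
{
	return snprintf(out, outsz, "%s", txt);
}
```

For "lf", "lf" and an empty attribute, the padded form is 39 bytes long and
ends in a tab, whereas each per-atom call yields only "lf" or an empty
string, leaving the separators to the format string.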

> If %(flags) is going to be useful then I think we need to think about
> how they are printed and document that. At the moment they are printed
> as a hexadecimal number which is fine for debugging but probably not
> going to be useful for something like --format. I think printing
> documented symbolic names with some kind of separator (a comma maybe)
> between them is probably more useful
>

Agree.

>  > [...]
> > +test_expect_success 'git ls-files --format eol' '
> > +     printf "i/lf    w/lf    attr/                 \t\n" >expect &&
> > +     printf "i/lf    w/lf    attr/                 \t\n" >>expect &&
> > +     git ls-files --format="%(eol)" --eol >actual &&
>
> I'm not sure why this is passing --eol as well as --format='%(eol)' -
> shouldn't that combination of flags be an error?
>

Thank you for the reminder; I will correct it.

> Best Wishes
>
> Phillip

ZheNing Hu

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-24 10:16         ` Phillip Wood
@ 2022-06-26 13:05           ` ZheNing Hu
  0 siblings, 0 replies; 30+ messages in thread
From: ZheNing Hu @ 2022-06-26 13:05 UTC (permalink / raw)
  To: Phillip Wood
  Cc: Junio C Hamano, ZheNing Hu via GitGitGadget, Git List,
	Christian Couder, Ævar Arnfjörð Bjarmason

Phillip Wood <phillip.wood123@gmail.com> wrote on Fri, Jun 24, 2022 at 18:16:

> >>> +ctime::
> >>> +   The create time of file which is in the index.
> >>
> >> This is printed with a prefix 'ctime:' (the same applies to the format
> >> specifiers below) I think we should omit that and just print the data
> >> so the user can choose the format they want.
> >>
> >>> +mtime::
> >>> +   The modified time of file which is in the index.
> >
> > These are only the low-bits of the full timestamp, not ctime/mtime
> > themselves.
> >
> > But stepping back a bit, why do we need to include them in the
> > output?  What workflow and use case are we trying to help?  Dump
> > output from "stat <path>" equivalent from ls-files and compare with
> > "stat ." output to see which ones are stale?  Or is there any value
> > to see the value of, say, ctime as an individual data item?
> >
> >>> +dev::
> >>> +   The ID of device containing file which is in the index.
> >>> +ino::
> >>> +   The inode number of file which is in the index.
> >>> +uid::
> >>> +   The user id of file owner which is in the index.
> >>> +gid::
> >>> +   The group id of file owner which is in the index.
> >
> > Again, why do we need to include these in the output?
> >
> > Wouldn't it be sufficient, as well as a lot more useful, to show a
> > single bit "the cached stat info matches what is in the working tree
> > (yes/no)"?
>
> That does sound useful
>
> >>> +flags::
> >>> +   The flags of the file in the index which include
> >>> +   in-memory only flags and some extended on-disk flags.
> >>
> >> If %(flags) is going to be useful then I think we need to think about
> >> how they are printed and document that. At the moment they are printed
> >> as a hexadecimal number which is fine for debugging but probably not
> >> going to be useful for something like --format. I think printing
> >> documented symbolic names with some kind of separator (a comma maybe)
> >> between them is probably more useful
> >
> > I am guessing that most of the above are only useful for curious
> > geeks and those who are debugging their new tweak to the code that
> > touches the index, i.e. a debugging feature.  But these folks can
> > run "git" under a debugger, and they probably have to do so when
> > they are seeing an unexpected value in the flags member of a cache
> > entry anyway.  So I am not sure whom this field is intended to help.
>
> I wondered about that as well, but thought there might be a plausible
> use if someone wants to check if an entry is marked intent-to-add, or
> has the skip-worktree/sparse-index bits set (are there other ways to
> inspect those?)
>

I think this feature will be useful too, but it may not belong in this patch.
We can discuss how to implement it later.

> Best Wishes
>
> Phillip

ZheNing Hu

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-24 13:25     ` Ævar Arnfjörð Bjarmason
  2022-06-24 15:33       ` Junio C Hamano
@ 2022-06-26 13:34       ` ZheNing Hu
  1 sibling, 0 replies; 30+ messages in thread
From: ZheNing Hu @ 2022-06-26 13:34 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: ZheNing Hu via GitGitGadget, Git List, Junio C Hamano,
	Christian Couder, Phillip Wood

Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote on Fri, Jun 24, 2022 at 21:46:
>
>
> On Tue, Jun 21 2022, ZheNing Hu via GitGitGadget wrote:
>
> > From: ZheNing Hu <adlternative@gmail.com>
> > [...]
>
> In my mind almost the entire point of a --format is that you can
> e.g. \0-delimit it, and don't need to do other parsing games.
>
> So this really should be adding just e.g. "%x", not "flags: %x",
>

Yeah, I admit that we really shouldn't use extra formatting here.

> Similarly, let's not have :-delimited fields. First, for a formatted
> number "1656077225:850723245" is just bizarre for %(ctime), let's use
> ".", not ":", so: "1656077225.850723245".
>
> And let's call that %(ctime), then have (which is trivial to add) a
> %(ctime:sec) and %(ctime:nsec), so someone who wants to format this can
> parse it as they please, ditto for mtime.
>
> Looking at your tests it seemed you went down the route of aligning the
> output with the --debug output, which is already pre-formatted. I.e. to
> make what you have here match:
>
>                 printf("  ctime: %u:%u\n", sd->sd_ctime.sec, sd->sd_ctime.nsec);
>                 printf("  mtime: %u:%u\n", sd->sd_mtime.sec, sd->sd_mtime.nsec);
>                 printf("  dev: %u\tino: %u\n", sd->sd_dev, sd->sd_ino);
>                 printf("  uid: %u\tgid: %u\n", sd->sd_uid, sd->sd_gid);
>                 printf("  size: %u\tflags: %x\n", sd->sd_size, ce->ce_flags);
>
> I think that's a mistake, we should be able to emit those individual
> %-specifiers instead, not that line as-is without the " " prefix and
> "\n" suffix.
>

Yeah, agreed. But for now I just want to delete all the atoms from %(ctime)
to %(flags), and let --debug work together with --format.
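
Should these atoms be reintroduced later, the dot-separated form suggested
above could be produced along these lines. This is only a sketch:
`format_timestamp()` is a hypothetical name, and zero-padding the
nanoseconds to nine digits is an assumption made here so that the
fractional part stays unambiguous.

```c
#include <stdio.h>

/*
 * Format a (sec, nsec) pair as "sec.nsec".  Zero-padding nsec to nine
 * digits is an assumption of this sketch, so that 5ns reads as
 * ".000000005" rather than ".5".
 */
static int format_timestamp(char *out, size_t outsz,
			    unsigned int sec, unsigned int nsec)
{
	return snprintf(out, outsz, "%u.%09u", sec, nsec);
}
```

Separate %(ctime:sec) and %(ctime:nsec) atoms would then simply print each
integer on its own.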

> > +
> > +     if (format && (show_stage || show_others || show_killed ||
> > +             show_resolve_undo || skipping_duplicates || debug_mode))
> > +                     die(_("ls-files --format cannot used with -s, -o, -k, --resolve-undo, --deduplicate, --debug"));
>
> Use usage_msg_opt() or usage_msg_optf() here instead of die(), and no
> need to include "ls-files " in the message.
>
> See die_for_incompatible_opt4, maybe you can just use that instead? A
> bit painful, but:
>
>     die_for_incompatible_opt4(format, "--format", show_stage, "-s", show_others, "-o", show_killed, "-k");
>     die_for_incompatible_opt4(format, "--format", show_resolve_undo, "--resolve-undo", skipping_duplicates, "--deduplicate", debug_mode, "--debug");
>

Good suggestion. I am curious why there is no function like
die_for_incompatible_opt4() that takes a variable number of parameters.
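
As a rough idea of what such a variadic helper could look like, here is a
self-contained C sketch. It is purely hypothetical and not part of git:
`collect_set_opts()` and its calling convention (a pair count followed by
`(int is_set, const char *name)` pairs) are assumptions made for this
illustration. It only collects the names of the options that are set; the
caller would decide whether to report an error.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical helper (not git's API): takes npairs pairs of
 * (int is_set, const char *name), writes the names of the options that
 * are set into out, comma-separated, and returns how many were set.
 */
static int collect_set_opts(char *out, size_t outsz, int npairs, ...)
{
	va_list ap;
	size_t n = 0;
	int count = 0;
	int i;

	if (outsz)
		out[0] = '\0';
	va_start(ap, npairs);
	for (i = 0; i < npairs; i++) {
		int is_set = va_arg(ap, int);
		const char *name = va_arg(ap, const char *);
		int w;

		if (!is_set)
			continue;
		count++;
		if (n + 1 >= outsz)
			continue;	/* no room left; keep counting */
		w = snprintf(out + n, outsz - n, "%s%s", n ? ", " : "", name);
		if (w < 0 || (size_t)w >= outsz - n)
			n = outsz - 1;	/* truncated: stop appending */
		else
			n += (size_t)w;
	}
	va_end(ap);
	return count;
}
```

A caller would treat a return value greater than one as a conflict. One
drawback of the variadic form is the loss of type checking: a pointer such
as `format` would have to be passed as `!!format`, since handing a pointer
to `va_arg(ap, int)` is undefined behaviour. That may be one reason to
prefer fixed-arity helpers.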

> But urgh, that helper really should use usage_msg_opt() instead, but
> using it for now as-is probably sucks less.
>
> I also think we should not forbid combining this with --debug, it's
> helpful to construct a format. This seems to work:
>
>         diff --git a/builtin/ls-files.c b/builtin/ls-files.c
>         index 387641b32df..82f13edef7e 100644
>         --- a/builtin/ls-files.c
>         +++ b/builtin/ls-files.c
>         @@ -343,12 +343,17 @@ static void show_ce(struct repository *repo, struct dir_struct *dir,
>                                           S_ISGITLINK(ce->ce_mode))) {
>                         if (format) {
>                                 show_ce_fmt(repo, ce, format, fullname);
>         -                       return;
>         +                       if (!debug_mode)
>         +                               return;
>                         }
>
>                         tag = get_tag(ce, tag);
>
>         -               if (!show_stage) {
>         +               if (format) {
>         +                       if (!debug_mode)
>         +                               BUG("unreachable");
>         +                       ; /* for --debug */
>         +               } else if (!show_stage) {
>                                 fputs(tag, stdout);
>                         } else {
>                                 printf("%s%06o %s %d\t",
>         @@ -814,7 +819,7 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
>                 }
>
>                 if (format && (show_stage || show_others || show_killed ||
>         -               show_resolve_undo || skipping_duplicates || debug_mode))
>         +               show_resolve_undo || skipping_duplicates))
>                                 die(_("ls-files --format cannot used with -s, -o, -k, --resolve-undo, --deduplicate, --debug"));
>
>                 if (show_tag || show_valid_bit || show_fsmonitor_bit) {
>
> I.e. we'll get:
>
>         $ ./git ls-files --debug --format='<%(flags) %(path)>'  -- po/is.po
>         <flags: 0 po/is.po>
>         po/is.po
>           ctime: 1654300098:369653868
>           mtime: 1654300098:369653868
>           dev: 2306     ino: 10487322
>           uid: 1001     gid: 1001
>           size: 3370    flags: 0
>
> Which I think is quite useful when poking around in this and coming up
> with a format.
>

Maybe something like this will be easier?


@@ -343,6 +335,7 @@ static void show_ce(struct repository *repo, struct dir_struct *dir,
                                  S_ISGITLINK(ce->ce_mode))) {
                if (format) {
                        show_ce_fmt(repo, ce, format, fullname);
+                       print_debug(ce);
                        return;
                }


> > +
> >       if (show_tag || show_valid_bit || show_fsmonitor_bit) {
> >               tag_cached = "H ";
> >               tag_unmerged = "M ";
> > diff --git a/t/t3013-ls-files-format.sh b/t/t3013-ls-files-format.sh
> > new file mode 100755
> > index 00000000000..8c3ef2df138
> > --- /dev/null
> > +++ b/t/t3013-ls-files-format.sh
> > @@ -0,0 +1,124 @@
> > +#!/bin/sh
> > +
> > +test_description='git ls-files --format test'
> > +
>
> Add this line here:
>
> TEST_PASSES_SANITIZE_LEAK=true
>
> I.e. just before test-lib.sh, see other test examples. Then we'll test
> this under SANITIZE=leak in CI, to ensure it doesn't leak memory.
>
> > +. ./test-lib.sh
> > +
> > +test_expect_success 'setup' '
> > +     echo o1 >o1 &&
> > +     echo o2 >o2 &&
> > +     git add o1 o2 &&
> > +     git add --chmod +x o1 &&
> > +     git commit -m base
> > +'
> > +
> > [...]
>
> > +for flag in -s -o -k --resolve-undo --deduplicate --debug
> > +do
> > +     test_expect_success "git ls-files --format is incompatible with $flag" '
> > +             test_must_fail git ls-files --format="%(objectname)" $flag
> > +     '
> > +done
>
> Nit: I think it's good to move these sorts of tests before "setup", and
> give them a "usage: " prefix, see some other existing examples.
>

Agree.

> We usually use test_expect_code 129 for those, depending on if you'll
> end up with die() or not...
>
> nit: missing \n before this line:
>
> > +test_done
> >
> > base-commit: ab336e8f1c8009c8b1aab8deb592148e69217085
>

ZheNing Hu

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-24 15:33       ` Junio C Hamano
@ 2022-06-26 13:35         ` ZheNing Hu
  2022-06-27  8:22           ` Junio C Hamano
  0 siblings, 1 reply; 30+ messages in thread
From: ZheNing Hu @ 2022-06-26 13:35 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason,
	ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Phillip Wood

Junio C Hamano <gitster@pobox.com> wrote on Fri, Jun 24, 2022 at 23:33:
>
> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
> > On Tue, Jun 21 2022, ZheNing Hu via GitGitGadget wrote:
> >> +    if (skip_prefix(start, "(objectmode)", &p))
> >> +            strbuf_addf(sb, "%06o", data->ce->ce_mode);
> >> +    else if (skip_prefix(start, "(objectname)", &p))
> >> +            strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
> >> +    else if (skip_prefix(start, "(stage)", &p))
> >> +            strbuf_addf(sb, "%d", ce_stage(data->ce));
> >> +    else if (skip_prefix(start, "(path)", &p))
> >> +            write_name_to_buf(sb, data->pathname);
>
> These are just "values".
>
> >> +    else if (skip_prefix(start, "(ctime)", &p))
> >> +            strbuf_addf(sb, "ctime: %u:%u",
> >> +                        sd->sd_ctime.sec, sd->sd_ctime.nsec);
> >> +    else if (skip_prefix(start, "(mtime)", &p))
> >> +            strbuf_addf(sb, "mtime: %u:%u",
> >> +                        sd->sd_mtime.sec, sd->sd_mtime.nsec);
> >> +    else if (skip_prefix(start, "(dev)", &p))
> >> +            strbuf_addf(sb, "dev: %u", sd->sd_dev);
> >> +    else if (skip_prefix(start, "(ino)", &p))
> >> +            strbuf_addf(sb, "ino: %u", sd->sd_ino);
> >> +    else if (skip_prefix(start, "(uid)", &p))
> >> +            strbuf_addf(sb, "uid: %u", sd->sd_uid);
> >> +    else if (skip_prefix(start, "(gid)", &p))
> >> +            strbuf_addf(sb, "gid: %u", sd->sd_gid);
> >> +    else if (skip_prefix(start, "(size)", &p))
> >> +            strbuf_addf(sb, "size: %u", sd->sd_size);
> >> +    else if (skip_prefix(start, "(flags)", &p))
> >> +            strbuf_addf(sb, "flags: %x", data->ce->ce_flags);
>
> These are not.
>
Agreed. So I have just removed them, as you can see. If someone needs
them for some reason, we can add them back.

ZheNing Hu

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v4] ls-files: introduce "--format" option
  2022-06-21  2:05   ` [PATCH v3] " ZheNing Hu via GitGitGadget
  2022-06-23 14:06     ` Phillip Wood
  2022-06-24 13:25     ` Ævar Arnfjörð Bjarmason
@ 2022-06-26 15:29     ` ZheNing Hu via GitGitGadget
  2022-06-27  8:32       ` Junio C Hamano
                         ` (2 more replies)
  2 siblings, 3 replies; 30+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2022-06-26 15:29 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder,
	Ævar Arnfjörð Bjarmason, Phillip Wood, ZheNing Hu,
	ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Add a new option --format that outputs index entry
information in a custom format, taking inspiration
from the option with the same name in the `git ls-tree`
command.

--format cannot be used with -s, -o, -k, --resolve-undo,
--deduplicate and --eol.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
    ls-files: introduce "--format" options
    
    v3->v4:
    
     1. Make --format compatible with --debug.
     2. Make --format incompatible with --eol.
     3. Split %(eol) into three atoms: %(eolinfo:index), %(eolinfo:worktree)
        and %(eolattr).
     4. Remove %(ctime), %(mtime), %(dev), %(ino), %(uid), %(gid), %(size),
        %(flags).
     5. Print bare values, without the label "prefix" in the output.
     6. Update the tests accordingly.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1262%2Fadlternative%2Fzh%2Fls-file-format-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1262/adlternative/zh/ls-file-format-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1262

Range-diff vs v3:

 1:  aaafa35ffcd ! 1:  6827e44e158 ls-files: introduce "--format" option
     @@ Commit message
          command.
      
          --format cannot used with -s, -o, -k, --resolve-undo,
     -    --deduplicate and --debug.
     +    --deduplicate and --eol.
      
          Signed-off-by: ZheNing Hu <adlternative@gmail.com>
      
     @@ Documentation/git-ls-files.txt: followed by the  ("attr/<eolattr>").
      +	It also interpolates `%%` to `%`, and `%xx` where `xx` are hex digits
      +	interpolates to character with hex code `xx`; for example `%00`
      +	interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF).
     -+	--format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo` and
     -+	`--debug`.
     ++	--format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo`
     ++	and `--eol`.
       \--::
       	Do not interpret any more arguments as options.
       
     @@ Documentation/git-ls-files.txt: quoted as explained for the configuration variab
      +names can be used:
      +
      +objectmode::
     -+	The mode of the file which is in the index.
     ++	The mode of the file which is recorded in the index.
      +objectname::
     -+	The name of the file which is in the index.
     ++	The name of the file which is recorded in the index.
      +stage::
     -+	The stage of the file which is in the index.
     -+eol::
     -+	The <eolinfo> and <eolattr> of files both in the
     -+	index and the work-tree.
     ++	The stage of the file which is recorded in the index.
     ++eolinfo:index::
     ++	The <eolinfo> of the file which is recorded in the index.
     ++eolinfo:worktree::
     ++	The <eolinfo> of the file which is recorded in the working tree.
     ++eolattr::
     ++	The <eolattr> of the file which is recorded in the index.
      +path::
     -+	The pathname of the file which is in the index.
     -+ctime::
     -+	The create time of file which is in the index.
     -+mtime::
     -+	The modified time of file which is in the index.
     -+dev::
     -+	The ID of device containing file which is in the index.
     -+ino::
     -+	The inode number of file which is in the index.
     -+uid::
     -+	The user id of file owner which is in the index.
     -+gid::
     -+	The group id of file owner which is in the index.
     -+size::
     -+	The size of the file which is in the index.
     -+flags::
     -+	The flags of the file in the index which include
     -+	in-memory only flags and some extended on-disk flags.
     ++	The pathname of the file which is recorded in the index.
       
       EXCLUDE PATTERNS
       ----------------
     @@ builtin/ls-files.c: static char *ps_matched;
       
       static const char *tag_cached = "";
       static const char *tag_unmerged = "";
     -@@ builtin/ls-files.c: static const char *tag_modified = "";
     - static const char *tag_skip_worktree = "";
     - static const char *tag_resolve_undo = "";
     - 
     --static void write_eolinfo(struct index_state *istate,
     --			  const struct cache_entry *ce, const char *path)
     -+static void write_eolinfo_internal(struct strbuf *sb, struct index_state *istate,
     -+				   const struct cache_entry *ce, const char *path)
     - {
     - 	if (show_eol) {
     - 		struct stat st;
      @@ builtin/ls-files.c: static void write_eolinfo(struct index_state *istate,
     - 							       ce->name);
     - 		if (!lstat(path, &st) && S_ISREG(st.st_mode))
     - 			w_txt = get_wt_convert_stats_ascii(path);
     --		printf("i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
     -+		if (sb)
     -+			strbuf_addf(sb, "i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
     -+		else
     -+			printf("i/%-5s w/%-5s attr/%-17s\t", i_txt, w_txt, a_txt);
       	}
       }
       
     -+static void write_eolinfo(struct index_state *istate,
     -+			  const struct cache_entry *ce, const char *path)
     ++static void write_index_eolinfo_to_buf(struct strbuf *sb, struct index_state *istate,
     ++				       const struct cache_entry *ce)
     ++{
     ++	const char *i_txt = "";
     ++	if (ce && S_ISREG(ce->ce_mode))
     ++		i_txt = get_cached_convert_stats_ascii(istate, ce->name);
     ++	strbuf_addstr(sb, i_txt);
     ++}
     ++
     ++static void write_worktree_eolinfo_to_buf(struct strbuf *sb, const char *path)
      +{
     -+	write_eolinfo_internal(NULL, istate, ce, path);
     ++	struct stat st;
     ++	const char *w_txt = "";
     ++	if (!lstat(path, &st) && S_ISREG(st.st_mode))
     ++		w_txt = get_wt_convert_stats_ascii(path);
     ++	strbuf_addstr(sb, w_txt);
      +}
      +
     -+static void write_eolinfo_to_buf(struct strbuf *sb, struct index_state *istate,
     -+				 const struct cache_entry *ce, const char *path)
     ++static void write_eolattr_to_buf(struct strbuf *sb, struct index_state *istate,
     ++				 const char *path)
      +{
     -+	write_eolinfo_internal(sb, istate, ce, path);
     ++	strbuf_addstr(sb, get_convert_attr_ascii(istate, path));
      +}
      +
       static void write_name(const char *name)
     @@ builtin/ls-files.c: static void show_submodule(struct repository *superproject,
       }
       
      +struct show_index_data {
     -+	const char *tag;
      +	const char *pathname;
      +	struct index_state *istate;
      +	const struct cache_entry *ce;
     @@ builtin/ls-files.c: static void show_submodule(struct repository *superproject,
      +	struct show_index_data *data = context;
      +	const char *end;
      +	const char *p;
     -+	const struct stat_data *sd = &data->ce->ce_stat_data;
      +	size_t len = strbuf_expand_literal_cb(sb, start, NULL);
      +	if (len)
      +		return len;
     @@ builtin/ls-files.c: static void show_submodule(struct repository *superproject,
      +		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
      +	else if (skip_prefix(start, "(stage)", &p))
      +		strbuf_addf(sb, "%d", ce_stage(data->ce));
     -+	else if (skip_prefix(start, "(eol)", &p))
     -+		write_eolinfo_to_buf(sb, data->istate,
     -+				     data->ce, data->pathname);
     ++	else if (skip_prefix(start, "(eolinfo:index)", &p))
     ++		write_index_eolinfo_to_buf(sb, data->istate, data->ce);
     ++	else if (skip_prefix(start, "(eolinfo:worktree)", &p))
     ++		write_worktree_eolinfo_to_buf(sb, data->pathname);
     ++	else if (skip_prefix(start, "(eolattr)", &p))
     ++		write_eolattr_to_buf(sb, data->istate, data->pathname);
      +	else if (skip_prefix(start, "(path)", &p))
      +		write_name_to_buf(sb, data->pathname);
     -+	else if (skip_prefix(start, "(ctime)", &p))
     -+		strbuf_addf(sb, "ctime: %u:%u",
     -+			    sd->sd_ctime.sec, sd->sd_ctime.nsec);
     -+	else if (skip_prefix(start, "(mtime)", &p))
     -+		strbuf_addf(sb, "mtime: %u:%u",
     -+			    sd->sd_mtime.sec, sd->sd_mtime.nsec);
     -+	else if (skip_prefix(start, "(dev)", &p))
     -+		strbuf_addf(sb, "dev: %u", sd->sd_dev);
     -+	else if (skip_prefix(start, "(ino)", &p))
     -+		strbuf_addf(sb, "ino: %u", sd->sd_ino);
     -+	else if (skip_prefix(start, "(uid)", &p))
     -+		strbuf_addf(sb, "uid: %u", sd->sd_uid);
     -+	else if (skip_prefix(start, "(gid)", &p))
     -+		strbuf_addf(sb, "gid: %u", sd->sd_gid);
     -+	else if (skip_prefix(start, "(size)", &p))
     -+		strbuf_addf(sb, "size: %u", sd->sd_size);
     -+	else if (skip_prefix(start, "(flags)", &p))
     -+		strbuf_addf(sb, "flags: %x", data->ce->ce_flags);
      +	else
      +		die(_("bad ls-files format: %%%.*s"), (int)len, start);
      +
     @@ builtin/ls-files.c: static void show_ce(struct repository *repo, struct dir_stru
       				  S_ISGITLINK(ce->ce_mode))) {
      +		if (format) {
      +			show_ce_fmt(repo, ce, format, fullname);
     ++			print_debug(ce);
      +			return;
      +		}
      +
     @@ builtin/ls-files.c: int cmd_ls_files(int argc, const char **argv, const char *cm
       	}
      +
      +	if (format && (show_stage || show_others || show_killed ||
     -+		show_resolve_undo || skipping_duplicates || debug_mode))
     -+			die(_("ls-files --format cannot used with -s, -o, -k, --resolve-undo, --deduplicate, --debug"));
     ++		show_resolve_undo || skipping_duplicates || show_eol))
     ++			usage_msg_opt("--format cannot used with -s, -o, -k, "
     ++				      "--resolve-undo, --deduplicate, --eol",
     ++				      ls_files_usage, builtin_ls_files_options);
      +
       	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
       		tag_cached = "H ";
     @@ t/t3013-ls-files-format.sh (new)
      +
      +test_description='git ls-files --format test'
      +
     ++TEST_PASSES_SANITIZE_LEAK=true
      +. ./test-lib.sh
      +
     ++for flag in -s -o -k --resolve-undo --deduplicate --eol
     ++do
     ++	test_expect_success "usage: --format is incompatible with $flag" '
     ++		test_expect_code 129 git ls-files --format="%(objectname)" $flag
     ++	'
     ++done
     ++
      +test_expect_success 'setup' '
      +	echo o1 >o1 &&
      +	echo o2 >o2 &&
     @@ t/t3013-ls-files-format.sh (new)
      +	test_cmp expect actual
      +'
      +
     -+test_expect_success 'git ls-files --format eol' '
     -+	printf "i/lf    w/lf    attr/                 \t\n" >expect &&
     -+	printf "i/lf    w/lf    attr/                 \t\n" >>expect &&
     -+	git ls-files --format="%(eol)" --eol >actual &&
     -+	test_cmp expect actual
     -+'
     -+
     -+test_expect_success 'git ls-files --format path' '
     ++test_expect_success 'git ls-files --format eolinfo:index' '
      +	cat >expect <<-\EOF &&
     -+	o1
     -+	o2
     ++	lf
     ++	lf
      +	EOF
     -+	git ls-files --format="%(path)" >actual &&
     ++	git ls-files --format="%(eolinfo:index)" >actual &&
      +	test_cmp expect actual
      +'
      +
     -+test_expect_success 'git ls-files --format ctime' '
     -+	git ls-files --debug >out &&
     -+	grep ctime out >expect &&
     -+	git ls-files --format="  %(ctime)" >actual &&
     -+	test_cmp expect actual
     -+'
     -+
     -+test_expect_success 'git ls-files --format mtime' '
     -+	git ls-files --debug >out &&
     -+	grep mtime out >expect &&
     -+	git ls-files --format="  %(mtime)" >actual &&
     ++test_expect_success 'git ls-files --format eolinfo:worktree' '
     ++	cat >expect <<-\EOF &&
     ++	lf
     ++	lf
     ++	EOF
     ++	git ls-files --format="%(eolinfo:worktree)" >actual &&
      +	test_cmp expect actual
      +'
      +
     -+test_expect_success 'git ls-files --format dev and ino' '
     -+	git ls-files --debug >out &&
     -+	grep dev out >expect &&
     -+	git ls-files --format="  %(dev)%x09%(ino)" >actual &&
     ++test_expect_success 'git ls-files --format eolattr' '
     ++	printf "\n\n" >expect &&
     ++	git ls-files --format="%(eolattr)" >actual &&
      +	test_cmp expect actual
      +'
      +
     -+test_expect_success 'git ls-files --format uid and gid' '
     -+	git ls-files --debug >out &&
     -+	grep uid out >expect &&
     -+	git ls-files --format="  %(uid)%x09%(gid)" >actual &&
     ++test_expect_success 'git ls-files --format path' '
     ++	cat >expect <<-\EOF &&
     ++	o1
     ++	o2
     ++	EOF
     ++	git ls-files --format="%(path)" >actual &&
      +	test_cmp expect actual
      +'
      +
     @@ t/t3013-ls-files-format.sh (new)
      +	test_cmp expect actual
      +'
      +
     -+test_expect_success 'git ls-files --format size and flags' '
     -+	git ls-files --debug >out &&
     -+	grep size out >expect &&
     -+	git ls-files --format="  %(size)%x09%(flags)" >actual &&
     -+	test_cmp expect actual
     -+'
     -+
      +test_expect_success 'git ls-files --format imitate --stage' '
      +	git ls-files --stage >expect &&
      +	git ls-files --format="%(objectmode) %(objectname) %(stage)%x09%(path)" >actual &&
      +	test_cmp expect actual
      +'
      +
     -+test_expect_success 'git ls-files --format imitate --debug' '
     ++test_expect_success 'git ls-files --format with --debug' '
      +	git ls-files --debug >expect &&
     -+	git ls-files --format="%(path)%x0a  %(ctime)%x0a  %(mtime)%x0a  %(dev)%x09%(ino)%x0a  %(uid)%x09%(gid)%x0a  %(size)%x09%(flags)" >actual &&
     ++	git ls-files --format="%(path)" --debug >actual &&
      +	test_cmp expect actual
      +'
      +
     -+for flag in -s -o -k --resolve-undo --deduplicate --debug
     -+do
     -+	test_expect_success "git ls-files --format is incompatible with $flag" '
     -+		test_must_fail git ls-files --format="%(objectname)" $flag
     -+	'
     -+done
      +test_done


 Documentation/git-ls-files.txt |  37 ++++++++++-
 builtin/ls-files.c             | 113 +++++++++++++++++++++++++++++++++
 t/t3013-ls-files-format.sh     | 108 +++++++++++++++++++++++++++++++
 3 files changed, 257 insertions(+), 1 deletion(-)
 create mode 100755 t/t3013-ls-files-format.sh

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index 0dabf3f0ddc..38e81cc889f 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -20,7 +20,7 @@ SYNOPSIS
 		[--exclude-standard]
 		[--error-unmatch] [--with-tree=<tree-ish>]
 		[--full-name] [--recurse-submodules]
-		[--abbrev[=<n>]] [--] [<file>...]
+		[--abbrev[=<n>]] [--format=<format>] [--] [<file>...]
 
 DESCRIPTION
 -----------
@@ -192,6 +192,13 @@ followed by the  ("attr/<eolattr>").
 	to the contained files. Sparse directories will be shown with a
 	trailing slash, such as "x/" for a sparse directory "x".
 
+--format=<format>::
+	A string that interpolates `%(fieldname)` from the result being shown.
+	It also interpolates `%%` to `%`, and `%xXX` where `XX` are hex digits
+	interpolates to character with hex code `XX`; for example `%x00`
+	interpolates to `\0` (NUL), `%x09` to `\t` (TAB) and `%x0a` to `\n` (LF).
+	--format cannot be combined with `-s`, `-o`, `-k`, `--resolve-undo`,
+	`--deduplicate` and `--eol`.
 \--::
 	Do not interpret any more arguments as options.
 
@@ -223,6 +230,34 @@ quoted as explained for the configuration variable `core.quotePath`
 (see linkgit:git-config[1]).  Using `-z` the filename is output
 verbatim and the line is terminated by a NUL byte.
 
+It is possible to print in a custom format by using the `--format`
+option, which is able to interpolate different fields using
+a `%(fieldname)` notation. For example, if you only care about the
+"objectname" and "path" fields, you can execute with a specific
+"--format" like
+
+	git ls-files --format='%(objectname) %(path)'
+
+FIELD NAMES
+-----------
+Various values from structured fields can be used to interpolate
+into the resulting output. For each outputting line, the following
+names can be used:
+
+objectmode::
+	The mode of the file which is recorded in the index.
+objectname::
+	The name of the file which is recorded in the index.
+stage::
+	The stage of the file which is recorded in the index.
+eolinfo:index::
+	The <eolinfo> of the file which is recorded in the index.
+eolinfo:worktree::
+	The <eolinfo> of the file which is recorded in the working tree.
+eolattr::
+	The <eolattr> of the file which is recorded in the index.
+path::
+	The pathname of the file which is recorded in the index.
 
 EXCLUDE PATTERNS
 ----------------
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index e791b65e7e9..1d52f5cb90b 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -11,6 +11,7 @@
 #include "quote.h"
 #include "dir.h"
 #include "builtin.h"
+#include "strbuf.h"
 #include "tree.h"
 #include "cache-tree.h"
 #include "parse-options.h"
@@ -48,6 +49,7 @@ static char *ps_matched;
 static const char *with_tree;
 static int exc_given;
 static int exclude_args;
+static const char *format;
 
 static const char *tag_cached = "";
 static const char *tag_unmerged = "";
@@ -75,6 +77,30 @@ static void write_eolinfo(struct index_state *istate,
 	}
 }
 
+static void write_index_eolinfo_to_buf(struct strbuf *sb, struct index_state *istate,
+				       const struct cache_entry *ce)
+{
+	const char *i_txt = "";
+	if (ce && S_ISREG(ce->ce_mode))
+		i_txt = get_cached_convert_stats_ascii(istate, ce->name);
+	strbuf_addstr(sb, i_txt);
+}
+
+static void write_worktree_eolinfo_to_buf(struct strbuf *sb, const char *path)
+{
+	struct stat st;
+	const char *w_txt = "";
+	if (!lstat(path, &st) && S_ISREG(st.st_mode))
+		w_txt = get_wt_convert_stats_ascii(path);
+	strbuf_addstr(sb, w_txt);
+}
+
+static void write_eolattr_to_buf(struct strbuf *sb, struct index_state *istate,
+				 const char *path)
+{
+	strbuf_addstr(sb, get_convert_attr_ascii(istate, path));
+}
+
 static void write_name(const char *name)
 {
 	/*
@@ -85,6 +111,15 @@ static void write_name(const char *name)
 				   stdout, line_terminator);
 }
 
+static void write_name_to_buf(struct strbuf *sb, const char *name)
+{
+	const char *rel = relative_path(name, prefix_len ? prefix : NULL, sb);
+	if (line_terminator)
+		quote_c_style(rel, sb, NULL, 0);
+	else
+		strbuf_add(sb, rel, strlen(rel));
+}
+
 static const char *get_tag(const struct cache_entry *ce, const char *tag)
 {
 	static char alttag[4];
@@ -222,6 +257,68 @@ static void show_submodule(struct repository *superproject,
 	repo_clear(&subrepo);
 }
 
+struct show_index_data {
+	const char *pathname;
+	struct index_state *istate;
+	const struct cache_entry *ce;
+};
+
+static size_t expand_show_index(struct strbuf *sb, const char *start,
+			       void *context)
+{
+	struct show_index_data *data = context;
+	const char *end;
+	const char *p;
+	size_t len = strbuf_expand_literal_cb(sb, start, NULL);
+	if (len)
+		return len;
+	if (*start != '(')
+		die(_("bad ls-files format: element '%s' "
+		      "does not start with '('"), start);
+
+	end = strchr(start + 1, ')');
+	if (!end)
+		die(_("bad ls-files format: element '%s' "
+		      "does not end in ')'"), start);
+
+	len = end - start + 1;
+	if (skip_prefix(start, "(objectmode)", &p))
+		strbuf_addf(sb, "%06o", data->ce->ce_mode);
+	else if (skip_prefix(start, "(objectname)", &p))
+		strbuf_add_unique_abbrev(sb, &data->ce->oid, abbrev);
+	else if (skip_prefix(start, "(stage)", &p))
+		strbuf_addf(sb, "%d", ce_stage(data->ce));
+	else if (skip_prefix(start, "(eolinfo:index)", &p))
+		write_index_eolinfo_to_buf(sb, data->istate, data->ce);
+	else if (skip_prefix(start, "(eolinfo:worktree)", &p))
+		write_worktree_eolinfo_to_buf(sb, data->pathname);
+	else if (skip_prefix(start, "(eolattr)", &p))
+		write_eolattr_to_buf(sb, data->istate, data->pathname);
+	else if (skip_prefix(start, "(path)", &p))
+		write_name_to_buf(sb, data->pathname);
+	else
+		die(_("bad ls-files format: %%%.*s"), (int)len, start);
+
+	return len;
+}
+
+static void show_ce_fmt(struct repository *repo, const struct cache_entry *ce,
+			const char *format, const char *fullname) {
+
+	struct show_index_data data = {
+		.pathname = fullname,
+		.istate = repo->index,
+		.ce = ce,
+	};
+
+	struct strbuf sb = STRBUF_INIT;
+	strbuf_expand(&sb, format, expand_show_index, &data);
+	strbuf_addch(&sb, line_terminator);
+	fwrite(sb.buf, sb.len, 1, stdout);
+	strbuf_release(&sb);
+	return;
+}
+
 static void show_ce(struct repository *repo, struct dir_struct *dir,
 		    const struct cache_entry *ce, const char *fullname,
 		    const char *tag)
@@ -236,6 +333,12 @@ static void show_ce(struct repository *repo, struct dir_struct *dir,
 				  max_prefix_len, ps_matched,
 				  S_ISDIR(ce->ce_mode) ||
 				  S_ISGITLINK(ce->ce_mode))) {
+		if (format) {
+			show_ce_fmt(repo, ce, format, fullname);
+			print_debug(ce);
+			return;
+		}
+
 		tag = get_tag(ce, tag);
 
 		if (!show_stage) {
@@ -675,6 +778,9 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 			 N_("suppress duplicate entries")),
 		OPT_BOOL(0, "sparse", &show_sparse_dirs,
 			 N_("show sparse directories in the presence of a sparse index")),
+		OPT_STRING_F(0, "format", &format, N_("format"),
+			     N_("format to use for the output"),
+			     PARSE_OPT_NONEG),
 		OPT_END()
 	};
 	int ret = 0;
@@ -699,6 +805,13 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 	for (i = 0; i < exclude_list.nr; i++) {
 		add_pattern(exclude_list.items[i].string, "", 0, pl, --exclude_args);
 	}
+
+	if (format && (show_stage || show_others || show_killed ||
+		show_resolve_undo || skipping_duplicates || show_eol))
+			usage_msg_opt("--format cannot be used with -s, -o, -k, "
+				      "--resolve-undo, --deduplicate, --eol",
+				      ls_files_usage, builtin_ls_files_options);
+
 	if (show_tag || show_valid_bit || show_fsmonitor_bit) {
 		tag_cached = "H ";
 		tag_unmerged = "M ";
diff --git a/t/t3013-ls-files-format.sh b/t/t3013-ls-files-format.sh
new file mode 100755
index 00000000000..a186fe21126
--- /dev/null
+++ b/t/t3013-ls-files-format.sh
@@ -0,0 +1,108 @@
+#!/bin/sh
+
+test_description='git ls-files --format test'
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+for flag in -s -o -k --resolve-undo --deduplicate --eol
+do
+	test_expect_success "usage: --format is incompatible with $flag" '
+		test_expect_code 129 git ls-files --format="%(objectname)" $flag
+	'
+done
+
+test_expect_success 'setup' '
+	echo o1 >o1 &&
+	echo o2 >o2 &&
+	git add o1 o2 &&
+	git add --chmod +x o1 &&
+	git commit -m base
+'
+
+test_expect_success 'git ls-files --format objectmode' '
+	cat >expect <<-\EOF &&
+	100755
+	100644
+	EOF
+	git ls-files --format="%(objectmode)" -t >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format objectname' '
+	oid1=$(git hash-object o1) &&
+	oid2=$(git hash-object o2) &&
+	cat >expect <<-EOF &&
+	$oid1
+	$oid2
+	EOF
+	git ls-files --format="%(objectname)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format eolinfo:index' '
+	cat >expect <<-\EOF &&
+	lf
+	lf
+	EOF
+	git ls-files --format="%(eolinfo:index)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format eolinfo:worktree' '
+	cat >expect <<-\EOF &&
+	lf
+	lf
+	EOF
+	git ls-files --format="%(eolinfo:worktree)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format eolattr' '
+	printf "\n\n" >expect &&
+	git ls-files --format="%(eolattr)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format path' '
+	cat >expect <<-\EOF &&
+	o1
+	o2
+	EOF
+	git ls-files --format="%(path)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -m' '
+	echo change >o1 &&
+	cat >expect <<-\EOF &&
+	o1
+	EOF
+	git ls-files --format="%(path)" -m >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with -d' '
+	echo o3 >o3 &&
+	git add o3 &&
+	rm o3 &&
+	cat >expect <<-\EOF &&
+	o3
+	EOF
+	git ls-files --format="%(path)" -d >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format imitate --stage' '
+	git ls-files --stage >expect &&
+	git ls-files --format="%(objectmode) %(objectname) %(stage)%x09%(path)" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git ls-files --format with --debug' '
+	git ls-files --debug >expect &&
+	git ls-files --format="%(path)" --debug >actual &&
+	test_cmp expect actual
+'
+
+test_done

base-commit: ab336e8f1c8009c8b1aab8deb592148e69217085
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-26 13:35         ` ZheNing Hu
@ 2022-06-27  8:22           ` Junio C Hamano
  2022-06-27 11:06             ` ZheNing Hu
  0 siblings, 1 reply; 30+ messages in thread
From: Junio C Hamano @ 2022-06-27  8:22 UTC (permalink / raw)
  To: ZheNing Hu
  Cc: Ævar Arnfjörð Bjarmason,
	ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Phillip Wood

ZheNing Hu <adlternative@gmail.com> writes:

>> >> +    else if (skip_prefix(start, "(path)", &p))
>> >> +            write_name_to_buf(sb, data->pathname);
>>
>> These are just "values".
>> ...
>> >> +    else if (skip_prefix(start, "(size)", &p))
>> >> +            strbuf_addf(sb, "size: %u", sd->sd_size);
>> >> +    else if (skip_prefix(start, "(flags)", &p))
>> >> +            strbuf_addf(sb, "flags: %x", data->ce->ce_flags);
>>
>> These are not.
>>
> ... If someone else
> need them for some reason, we can add them back.

If someone else needs to see "size:" printed in front of the value
of sd_size member, we DO NOT HAVE TO DO ANYTHING!  That someone else
can write "--format=size: %(size)" to do so themselves.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4] ls-files: introduce "--format" option
  2022-06-26 15:29     ` [PATCH v4] " ZheNing Hu via GitGitGadget
@ 2022-06-27  8:32       ` Junio C Hamano
  2022-06-27 11:18         ` ZheNing Hu
  2022-06-27 18:34       ` Ævar Arnfjörð Bjarmason
  2022-06-28 15:19       ` Phillip Wood
  2 siblings, 1 reply; 30+ messages in thread
From: Junio C Hamano @ 2022-06-27  8:32 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Christian Couder, Ævar Arnfjörð Bjarmason,
	Phillip Wood, ZheNing Hu

"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

> Range-diff vs v3:
> ...
>       +test_done

I omitted 300 lines of range-diff, which is not exactly illuminating
in this case.  I wonder if there is a way to turn it off automatically
when it is not helping...

> +FIELD NAMES
> +-----------
> +Various values from structured fields can be used to interpolate
> +into the resulting output. For each outputting line, the following
> +names can be used:
> +
> +objectmode::
> +	The mode of the file which is recorded in the index.
> +objectname::
> +	The name of the file which is recorded in the index.
> +stage::
> +	The stage of the file which is recorded in the index.


> +eolinfo:index::
> +	The <eolinfo> of the file which is recorded in the index.
> +eolinfo:worktree::
> +	The <eolinfo> of the file which is recorded in the working tree.

These sound somewhat strange, as the above makes it sound as if we
are recording eolinfo for something (we never record eolinfo of
anything anywhere).

	eolinfo:index::
	eolinfo:worktree::
        	The <eolinfo> (see the description of the `--eol` option) of
                the contents in the index or in the worktree for the path

perhaps?  I dunno.

> +eolattr::
> +	The <eolattr> of the file which is recorded in the index.

Likewise, eolattr comes from the attribute subsystem and not
recorded in the index.  It is more like

	eolattr:
                The <eolattr> (see the description of the `--eol` option)
                that applies to the path.

Because attribute applies to the path, it applies equally to both
what is in the index and what is in the working tree.

> +path::
> +	The pathname of the file which is recorded in the index.

As ls-tree already uses %(path) for it, this is probably OK
(otherwise we would probably have called it %(pathname)).

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-27  8:22           ` Junio C Hamano
@ 2022-06-27 11:06             ` ZheNing Hu
  2022-06-27 15:41               ` Junio C Hamano
  0 siblings, 1 reply; 30+ messages in thread
From: ZheNing Hu @ 2022-06-27 11:06 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason,
	ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Phillip Wood

Junio C Hamano <gitster@pobox.com> wrote on Mon, Jun 27, 2022 at 16:22:
>
> ZheNing Hu <adlternative@gmail.com> writes:
>
> >> >> +    else if (skip_prefix(start, "(path)", &p))
> >> >> +            write_name_to_buf(sb, data->pathname);
> >>
> >> These are just "values".
> >> ...
> >> >> +    else if (skip_prefix(start, "(size)", &p))
> >> >> +            strbuf_addf(sb, "size: %u", sd->sd_size);
> >> >> +    else if (skip_prefix(start, "(flags)", &p))
> >> >> +            strbuf_addf(sb, "flags: %x", data->ce->ce_flags);
> >>
> >> These are not.
> >>
> > ... If someone else
> > need them for some reason, we can add them back.
>
> If someone else needs to see "size:" printed in front of the value
> of sd_size member, we DO NOT HAVE TO DO ANYTHING!  That someone else
> can write "--format=size: %(size)" to do so themselves.
>
>

Oh, sorry, I meant that if someone needs any of the atoms from %(size) to
%(flags), we can add them back.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4] ls-files: introduce "--format" option
  2022-06-27  8:32       ` Junio C Hamano
@ 2022-06-27 11:18         ` ZheNing Hu
  0 siblings, 0 replies; 30+ messages in thread
From: ZheNing Hu @ 2022-06-27 11:18 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Johannes Schindelin

Junio C Hamano <gitster@pobox.com> wrote on Mon, Jun 27, 2022 at 16:32:
>
> "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > Range-diff vs v3:
> > ...
> >       +test_done
>
> I omitted 300 lines of range-diff, which is not exactly illuminating
> in this case.  I wonder if there is a way to turn it off when it is
> not helping automatically...
>

I have opened an issue with gitgitgadget; maybe Johannes Schindelin can
help: https://github.com/gitgitgadget/gitgitgadget/issues/1024

> > +FIELD NAMES
> > +-----------
> > +Various values from structured fields can be used to interpolate
> > +into the resulting output. For each outputting line, the following
> > +names can be used:
> > +
> > +objectmode::
> > +     The mode of the file which is recorded in the index.
> > +objectname::
> > +     The name of the file which is recorded in the index.
> > +stage::
> > +     The stage of the file which is recorded in the index.
>
>
> > +eolinfo:index::
> > +     The <eolinfo> of the file which is recorded in the index.
> > +eolinfo:worktree::
> > +     The <eolinfo> of the file which is recorded in the working tree.
>
> These sound somewhat strange, as the above makes it sound as if we
> are recording eolinfo for something (we never record eolinfo of
> anything anywhere).
>
>         eolinfo:index::
>         eolinfo:worktree::
>                 The <eolinfo> (see the description of the `--eol` option) of
>                 the contents in the index or in the worktree for the path
>
> perhaps?  I dunno.
>
> > +eolattr::
> > +     The <eolattr> of the file which is recorded in the index.
>
> Likewise, eolattr comes from the attribute subsystem and not
> recorded in the index.  It is more like
>
>         eolattr:
>                 The <eolattr> (see the description of the `--eol` option)
>                 that applies to the path.
>
> Because attribute applies to the path, it applies equally to both
> what is in the index and what is in the working tree.
>

Thanks for clarifying it, I will fix it.

> > +path::
> > +     The pathname of the file which is recorded in the index.
>
> As ls-tree already uses %(path) for it, this is probably OK
> (otherwise we would probably have called it %(pathname)).

Agreed, unless we want to change it in git ls-tree too.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v3] ls-files: introduce "--format" option
  2022-06-27 11:06             ` ZheNing Hu
@ 2022-06-27 15:41               ` Junio C Hamano
  0 siblings, 0 replies; 30+ messages in thread
From: Junio C Hamano @ 2022-06-27 15:41 UTC (permalink / raw)
  To: ZheNing Hu
  Cc: Ævar Arnfjörð Bjarmason,
	ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Phillip Wood

ZheNing Hu <adlternative@gmail.com> writes:

> Junio C Hamano <gitster@pobox.com> wrote on Mon, Jun 27, 2022 at 16:22:
>>
>> ZheNing Hu <adlternative@gmail.com> writes:
>>
>> >> >> +    else if (skip_prefix(start, "(path)", &p))
>> >> >> +            write_name_to_buf(sb, data->pathname);
>> >>
>> >> These are just "values".
>> >> ...
>> >> >> +    else if (skip_prefix(start, "(size)", &p))
>> >> >> +            strbuf_addf(sb, "size: %u", sd->sd_size);
>> >> >> +    else if (skip_prefix(start, "(flags)", &p))
>> >> >> +            strbuf_addf(sb, "flags: %x", data->ce->ce_flags);
>> >>
>> >> These are not.
>> >>
>> > ... If someone else
>> > need them for some reason, we can add them back.
>>
>> If someone else needs to see "size:" printed in front of the value
>> of sd_size member, we DO NOT HAVE TO DO ANYTHING!  That someone else
>> can write "--format=size: %(size)" to do so themselves.
>
> Oh, sorry, I mean if someone need some atoms from %(size) to %(flags), we can
> add them back.

Ah, I see.  I am not sure about the %(flags) to help the debugging
mode, but giving a single bit "is it dirty?" would be more useful
than giving the cached stat info, I would think.

Thanks.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4] ls-files: introduce "--format" option
  2022-06-26 15:29     ` [PATCH v4] " ZheNing Hu via GitGitGadget
  2022-06-27  8:32       ` Junio C Hamano
@ 2022-06-27 18:34       ` Ævar Arnfjörð Bjarmason
  2022-06-28 15:19       ` Phillip Wood
  2 siblings, 0 replies; 30+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-27 18:34 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Junio C Hamano, Christian Couder, Phillip Wood, ZheNing Hu


On Sun, Jun 26 2022, ZheNing Hu via GitGitGadget wrote:

> From: ZheNing Hu <adlternative@gmail.com>
>
> Add a new option --format that output index enties
> informations with custom format, taking inspiration
> from the option with the same name in the `git ls-tree`
> command.
>
> --format cannot used with -s, -o, -k, --resolve-undo,
> --deduplicate and --eol.
>
> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---
> [...]
> +test_expect_success 'git ls-files --format with --debug' '
> +	git ls-files --debug >expect &&
> +	git ls-files --format="%(path)" --debug >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_done

I'm not sure what to make of this.

In some ways I think this makes more sense than what I suggested in
https://lore.kernel.org/git/220624.86letmi383.gmgdl@evledraar.gmail.com/;
but I had to think for a second about what's going on here.

In my version I suggested having this work with --debug, but not in this
way; in my version you'd always emit both the debug output and the format
output.

But here e.g.:

    git ls-files -t --debug

Will emit "H tag.c" or whatever, but if you add --format the -t option
is silently discarded.

So the test is relying on "%(path)" being the default format.

I think extending this to e.g. test what happens with "-t" would be a
good thing, but also in general does combining --format with -t make
sense, and are there other such options where the combination might not
make sense?

So I'm not 100% sure, but I think I'd prefer my version, but I see how
it would get hairy to support, e.g.:

    git ls-files -s --debug --format=...

Should work, but you'd have to special-case the logic for erroring if -s
is combined with --format.

Anyway, I think it would be fine to leave this in whatever state is
easy; the --debug option is "just for debugging".

But re
https://lore.kernel.org/git/CAOLTT8Tc95-aUE+uN2d8QjTJpGpGw6cBJfG+bpmyE55OcXTSRA@mail.gmail.com/
I think it might be interesting to get --format to a state where we can
remove --debug entirely.

I.e. in c2a29405105 (t1091/t3705: remove 'test-tool read-cache --table',
2021-12-22) we could replace some similar test-only code with "git
ls-files". I for one wouldn't mind --debug going away entirely, and have
the t3705-add-sparse-checkout.sh tests use --format instead.

Or we could keep --debug, but just have it powerful enough to do what
print_debug() is doing now, possibly without "truly internal" stuff like
"ce_flags".

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4] ls-files: introduce "--format" option
  2022-06-26 15:29     ` [PATCH v4] " ZheNing Hu via GitGitGadget
  2022-06-27  8:32       ` Junio C Hamano
  2022-06-27 18:34       ` Ævar Arnfjörð Bjarmason
@ 2022-06-28 15:19       ` Phillip Wood
  2 siblings, 0 replies; 30+ messages in thread
From: Phillip Wood @ 2022-06-28 15:19 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget, git
  Cc: Junio C Hamano, Christian Couder,
	Ævar Arnfjörð Bjarmason, ZheNing Hu

Hi ZheNing

This looks good, I don't have much to add beyond the comments others 
have left.

On 26/06/2022 16:29, ZheNing Hu via GitGitGadget wrote:
> From: ZheNing Hu <adlternative@gmail.com> 
> +FIELD NAMES
> +-----------
> +Various values from structured fields can be used to interpolate
> +into the resulting output. For each outputting line, the following
> +names can be used:
> +
> +objectmode::
> +	The mode of the file which is recorded in the index.
> +objectname::
> +	The name of the file which is recorded in the index.
> +stage::
> +	The stage of the file which is recorded in the index.
> +eolinfo:index::
> +	The <eolinfo> of the file which is recorded in the index.
> +eolinfo:worktree::
> +	The <eolinfo> of the file which is recorded in the working tree.
> +eolattr::
> +	The <eolattr> of the file which is recorded in the index.
> +path::
> +	The pathname of the file which is recorded in the index.

I think starting with this shorter list of field names is a good idea, 
we can always add more fields later if there is a demand for %(flags) etc.

> +test_expect_success 'git ls-files --format with --debug' '
> +	git ls-files --debug >expect &&
> +	git ls-files --format="%(path)" --debug >actual &&
> +	test_cmp expect actual
> +'

What's the motivation for being able to combine --format with --debug?

Best Wishes

Phillip

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2022-06-28 15:21 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-15 13:45 [PATCH 0/2] ls-files: introduce "--format" and "--object-only" options ZheNing Hu via GitGitGadget
2022-06-15 13:45 ` [PATCH 1/2] ls-files: introduce "--format" option ZheNing Hu via GitGitGadget
2022-06-15 20:07   ` Ævar Arnfjörð Bjarmason
2022-06-18 10:50     ` ZheNing Hu
2022-06-15 13:45 ` [PATCH 2/2] ls-files: introduce "--object-only" option ZheNing Hu via GitGitGadget
2022-06-15 20:15   ` Ævar Arnfjörð Bjarmason
2022-06-18 10:59     ` ZheNing Hu
2022-06-19  9:13 ` [PATCH v2] ls-files: introduce "--format" option ZheNing Hu via GitGitGadget
2022-06-19 13:50   ` Phillip Wood
2022-06-20 13:32     ` ZheNing Hu
2022-06-21  2:05   ` [PATCH v3] " ZheNing Hu via GitGitGadget
2022-06-23 14:06     ` Phillip Wood
2022-06-23 15:57       ` Junio C Hamano
2022-06-24 10:16         ` Phillip Wood
2022-06-26 13:05           ` ZheNing Hu
2022-06-24 13:20         ` Ævar Arnfjörð Bjarmason
2022-06-24 15:28           ` Junio C Hamano
2022-06-26 13:01       ` ZheNing Hu
2022-06-24 13:25     ` Ævar Arnfjörð Bjarmason
2022-06-24 15:33       ` Junio C Hamano
2022-06-26 13:35         ` ZheNing Hu
2022-06-27  8:22           ` Junio C Hamano
2022-06-27 11:06             ` ZheNing Hu
2022-06-27 15:41               ` Junio C Hamano
2022-06-26 13:34       ` ZheNing Hu
2022-06-26 15:29     ` [PATCH v4] " ZheNing Hu via GitGitGadget
2022-06-27  8:32       ` Junio C Hamano
2022-06-27 11:18         ` ZheNing Hu
2022-06-27 18:34       ` Ævar Arnfjörð Bjarmason
2022-06-28 15:19       ` Phillip Wood
