* [RFC PATCH 0/6] ls-tree: introduce '--pattern' option @ 2022-11-17 11:30 Teng Long 2022-11-17 11:30 ` [RFC PATCH 1/6] ls-tree: cleanup the redundant SPACE Teng Long ` (7 more replies) 0 siblings, 8 replies; 34+ messages in thread From: Teng Long @ 2022-11-17 11:30 UTC (permalink / raw) To: git; +Cc: avarab, tenglong.tl, me, Teng Long From: Teng Long <dyroneteng@gmail.com> This RFC patch introduce a new "ls-tree" option "--pattern", aim to match the entries by regex then filter the output which we may want to achieve. It also contains some commit for preparation or cleanup. The idea may be not comprehensive and the tests for it might be insufficient too, but I'd like to listen the suggestion from the community to decide if it's worth going forward with. Thanks. Teng Long (6): ls-tree: cleanup the redundant SPACE t3104: remove shift code in 'test_ls_tree_format' ls-tree: optimize params of 'show_tree_common_default_long()' ls-tree: improving cohension in the print code ls-tree: introduce 'match_pattern()' function ls-tree: introduce '--pattern' option Documentation/git-ls-tree.txt | 7 ++- builtin/ls-tree.c | 109 ++++++++++++++++++++++++++-------- t/t3104-ls-tree-format.sh | 1 - t/t3106-ls-tree-pattern.sh | 70 ++++++++++++++++++++++ 4 files changed, 161 insertions(+), 26 deletions(-) create mode 100755 t/t3106-ls-tree-pattern.sh -- 2.38.1.426.g770fc8806cb.dirty ^ permalink raw reply [flat|nested] 34+ messages in thread
* [RFC PATCH 1/6] ls-tree: cleanup the redundant SPACE 2022-11-17 11:30 [RFC PATCH 0/6] ls-tree: introduce '--pattern' option Teng Long @ 2022-11-17 11:30 ` Teng Long 2022-11-17 11:30 ` [RFC PATCH 2/6] t3104: remove shift code in 'test_ls_tree_format' Teng Long ` (6 subsequent siblings) 7 siblings, 0 replies; 34+ messages in thread From: Teng Long @ 2022-11-17 11:30 UTC (permalink / raw) To: git; +Cc: avarab, tenglong.tl, me, Teng Long From: Teng Long <dyroneteng@gmail.com> An redundant space was found in ls-tree.c, which is no doubt a small change, but it might be OK to make a commit on its own. Signed-off-by: Teng Long <dyroneteng@gmail.com> --- builtin/ls-tree.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c index c3ea09281af..8256fc0bc97 100644 --- a/builtin/ls-tree.c +++ b/builtin/ls-tree.c @@ -32,7 +32,7 @@ struct show_tree_data { struct strbuf *base; }; -static const char * const ls_tree_usage[] = { +static const char * const ls_tree_usage[] = { N_("git ls-tree [<options>] <tree-ish> [<path>...]"), NULL }; -- 2.38.1.426.g770fc8806cb.dirty ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [RFC PATCH 2/6] t3104: remove shift code in 'test_ls_tree_format' 2022-11-17 11:30 [RFC PATCH 0/6] ls-tree: introduce '--pattern' option Teng Long 2022-11-17 11:30 ` [RFC PATCH 1/6] ls-tree: cleanup the redundant SPACE Teng Long @ 2022-11-17 11:30 ` Teng Long 2022-11-17 11:30 ` [RFC PATCH 3/6] ls-tree: optimize params of 'show_tree_common_default_long()' Teng Long ` (5 subsequent siblings) 7 siblings, 0 replies; 34+ messages in thread From: Teng Long @ 2022-11-17 11:30 UTC (permalink / raw) To: git; +Cc: avarab, tenglong.tl, me, Teng Long From: Teng Long <dyroneteng@gmail.com> In t3104-ls-tree-format.sh, There is a legacy 'shift 2' code and the relevant code block no longer depends on it anymore, so let's remove it for a small cleanup. Signed-off-by: Teng Long <dyroneteng@gmail.com> --- t/t3104-ls-tree-format.sh | 1 - 1 file changed, 1 deletion(-) diff --git a/t/t3104-ls-tree-format.sh b/t/t3104-ls-tree-format.sh index 383896667b6..74053978f49 100755 --- a/t/t3104-ls-tree-format.sh +++ b/t/t3104-ls-tree-format.sh @@ -20,7 +20,6 @@ test_ls_tree_format () { format=$1 && opts=$2 && fmtopts=$3 && - shift 2 && test_expect_success "ls-tree '--format=<$format>' is like options '$opts $fmtopts'" ' git ls-tree $opts -r HEAD >expect && -- 2.38.1.426.g770fc8806cb.dirty ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [RFC PATCH 3/6] ls-tree: optimize params of 'show_tree_common_default_long()' 2022-11-17 11:30 [RFC PATCH 0/6] ls-tree: introduce '--pattern' option Teng Long 2022-11-17 11:30 ` [RFC PATCH 1/6] ls-tree: cleanup the redundant SPACE Teng Long 2022-11-17 11:30 ` [RFC PATCH 2/6] t3104: remove shift code in 'test_ls_tree_format' Teng Long @ 2022-11-17 11:30 ` Teng Long 2022-11-17 11:30 ` [RFC PATCH 4/6] ls-tree: improving cohension in the print code Teng Long ` (4 subsequent siblings) 7 siblings, 0 replies; 34+ messages in thread From: Teng Long @ 2022-11-17 11:30 UTC (permalink / raw) To: git; +Cc: avarab, tenglong.tl, me, Teng Long From: Teng Long <dyroneteng@gmail.com> Function 'show_tree_common_default_long' has 3 parameters: * struct strbuf *base * const char *pathname * const size_t baselen And the struct 'show_tree_data' which was introduced in 'e81517155e0370ae5433d256046ab287bb10d04d' is: struct show_tree_data { unsigned mode; enum object_type type; const struct object_id *oid; const char *pathname; struct strbuf *base; }; Actually, the struct includes all the data that we need to pass to the function, so we could make a small refactoring by using 'show_tree_data' struct directly. Signed-off-by: Teng Long <dyroneteng@gmail.com> --- builtin/ls-tree.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c index 8256fc0bc97..afb65af4280 100644 --- a/builtin/ls-tree.c +++ b/builtin/ls-tree.c @@ -200,15 +200,15 @@ static int show_tree_common(struct show_tree_data *data, int *recurse, return ret; } -static void show_tree_common_default_long(struct strbuf *base, - const char *pathname, - const size_t baselen) +static void show_tree_common_default_long(struct show_tree_data *data) { - strbuf_addstr(base, pathname); - write_name_quoted_relative(base->buf, + int base_len = data->base->len; + + strbuf_addstr(data->base, data->pathname); + write_name_quoted_relative(data->base->buf, chomp_prefix ? ls_tree_prefix : NULL, stdout, line_termination); - strbuf_setlen(base, baselen); + strbuf_setlen(data->base, base_len); } static int show_tree_default(const struct object_id *oid, struct strbuf *base, @@ -225,7 +225,7 @@ static int show_tree_default(const struct object_id *oid, struct strbuf *base, printf("%06o %s %s\t", data.mode, type_name(data.type), find_unique_abbrev(data.oid, abbrev)); - show_tree_common_default_long(base, pathname, data.base->len); + show_tree_common_default_long(&data); return recurse; } @@ -255,7 +255,7 @@ static int show_tree_long(const struct object_id *oid, struct strbuf *base, printf("%06o %s %s %7s\t", data.mode, type_name(data.type), find_unique_abbrev(data.oid, abbrev), size_text); - show_tree_common_default_long(base, pathname, data.base->len); + show_tree_common_default_long(&data); return recurse; } -- 2.38.1.426.g770fc8806cb.dirty ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [RFC PATCH 4/6] ls-tree: improving cohension in the print code 2022-11-17 11:30 [RFC PATCH 0/6] ls-tree: introduce '--pattern' option Teng Long ` (2 preceding siblings ...) 2022-11-17 11:30 ` [RFC PATCH 3/6] ls-tree: optimize params of 'show_tree_common_default_long()' Teng Long @ 2022-11-17 11:30 ` Teng Long 2022-11-17 13:53 ` Ævar Arnfjörð Bjarmason 2022-11-17 11:30 ` [RFC PATCH 5/6] ls-tree: introduce 'match_pattern()' function Teng Long ` (3 subsequent siblings) 7 siblings, 1 reply; 34+ messages in thread From: Teng Long @ 2022-11-17 11:30 UTC (permalink / raw) To: git; +Cc: avarab, tenglong.tl, me, Teng Long From: Teng Long <dyroneteng@gmail.com> We use 'show_tree_long()' and 'show_tree_default()' to print entries when we want the output in 'default' or 'default long' formats. Function 'show_tree_common_default_long()' prints the "path" field as the last part of output, the previous part which separately implements in 'show_tree_long()' and 'show_tree_default()'. We can package the separate implementation together by the extension of "size_text" in struct "show_tree_data". By improving the cohesion in these two locations, some benefits such as uniform processing of the output can be achieved in future. Signed-off-by: Teng Long <dyroneteng@gmail.com> --- builtin/ls-tree.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c index afb65af4280..7661170f7ca 100644 --- a/builtin/ls-tree.c +++ b/builtin/ls-tree.c @@ -30,6 +30,7 @@ struct show_tree_data { const struct object_id *oid; const char *pathname; struct strbuf *base; + char *size_text; }; static const char * const ls_tree_usage[] = { @@ -186,6 +187,7 @@ static int show_tree_common(struct show_tree_data *data, int *recurse, data->oid = oid; data->pathname = pathname; data->base = base; + data->size_text = NULL; if (type == OBJ_BLOB) { if (ls_options & LS_TREE_ONLY) @@ -204,6 +206,13 @@ static void show_tree_common_default_long(struct show_tree_data *data) { int base_len = data->base->len; + if (data->size_text) + printf("%06o %s %s %7s\t", data->mode, type_name(data->type), + find_unique_abbrev(data->oid, abbrev), data->size_text); + else + printf("%06o %s %s\t", data->mode, type_name(data->type), + find_unique_abbrev(data->oid, abbrev)); + strbuf_addstr(data->base, data->pathname); write_name_quoted_relative(data->base->buf, chomp_prefix ? ls_tree_prefix : NULL, stdout, @@ -223,8 +232,6 @@ static int show_tree_default(const struct object_id *oid, struct strbuf *base, if (early >= 0) return early; - printf("%06o %s %s\t", data.mode, type_name(data.type), - find_unique_abbrev(data.oid, abbrev)); show_tree_common_default_long(&data); return recurse; } @@ -253,8 +260,7 @@ static int show_tree_long(const struct object_id *oid, struct strbuf *base, xsnprintf(size_text, sizeof(size_text), "-"); } - printf("%06o %s %s %7s\t", data.mode, type_name(data.type), - find_unique_abbrev(data.oid, abbrev), size_text); + data.size_text = size_text; show_tree_common_default_long(&data); return recurse; } -- 2.38.1.426.g770fc8806cb.dirty ^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [RFC PATCH 4/6] ls-tree: improving cohension in the print code 2022-11-17 11:30 ` [RFC PATCH 4/6] ls-tree: improving cohension in the print code Teng Long @ 2022-11-17 13:53 ` Ævar Arnfjörð Bjarmason 0 siblings, 0 replies; 34+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-11-17 13:53 UTC (permalink / raw) To: Teng Long; +Cc: git, tenglong.tl, me On Thu, Nov 17 2022, Teng Long wrote: > From: Teng Long <dyroneteng@gmail.com> > > We use 'show_tree_long()' and 'show_tree_default()' to print entries > when we want the output in 'default' or 'default long' formats. > > Function 'show_tree_common_default_long()' prints the "path" field as > the last part of output, the previous part which separately implements > in 'show_tree_long()' and 'show_tree_default()'. > > We can package the separate implementation together by the extension of > "size_text" in struct "show_tree_data". By improving the cohesion in > these two locations, some benefits such as uniform processing of the > output can be achieved in future. > > Signed-off-by: Teng Long <dyroneteng@gmail.com> > --- > builtin/ls-tree.c | 14 ++++++++++---- > 1 file changed, 10 insertions(+), 4 deletions(-) > > diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c > index afb65af4280..7661170f7ca 100644 > --- a/builtin/ls-tree.c > +++ b/builtin/ls-tree.c > @@ -30,6 +30,7 @@ struct show_tree_data { > const struct object_id *oid; > const char *pathname; > struct strbuf *base; > + char *size_text; > }; > > static const char * const ls_tree_usage[] = { > @@ -186,6 +187,7 @@ static int show_tree_common(struct show_tree_data *data, int *recurse, > data->oid = oid; > data->pathname = pathname; > data->base = base; > + data->size_text = NULL; > > if (type == OBJ_BLOB) { > if (ls_options & LS_TREE_ONLY) > @@ -204,6 +206,13 @@ static void show_tree_common_default_long(struct show_tree_data *data) > { > int base_len = data->base->len; > > + if (data->size_text) > + printf("%06o %s %s %7s\t", data->mode, type_name(data->type), > + find_unique_abbrev(data->oid, abbrev), data->size_text); > + else > + printf("%06o %s %s\t", data->mode, type_name(data->type), > + find_unique_abbrev(data->oid, abbrev)); > + > strbuf_addstr(data->base, data->pathname); > write_name_quoted_relative(data->base->buf, > chomp_prefix ? ls_tree_prefix : NULL, stdout, > @@ -223,8 +232,6 @@ static int show_tree_default(const struct object_id *oid, struct strbuf *base, > if (early >= 0) > return early; > > - printf("%06o %s %s\t", data.mode, type_name(data.type), > - find_unique_abbrev(data.oid, abbrev)); > show_tree_common_default_long(&data); > return recurse; > } > @@ -253,8 +260,7 @@ static int show_tree_long(const struct object_id *oid, struct strbuf *base, > xsnprintf(size_text, sizeof(size_text), "-"); > } > > - printf("%06o %s %s %7s\t", data.mode, type_name(data.type), > - find_unique_abbrev(data.oid, abbrev), size_text); > + data.size_text = size_text; > show_tree_common_default_long(&data); > return recurse; > } I think this is a good example of how not to split up commits. So, in this case the reader is left wondering what the point is, how does having two callers, and introducing an exra parameter to flip between what one or the other wants make sense for a "common" function? It doesn't, that code should just belong in those callers, so this is just making things more indirect. We find out later in the series that the real reason is that this eventually becomes an append to a "struct strbuf". At that point we need to re-write these lines again, so in terms of reviewers having to look at it they'd wonder here, and then look a that code again... This is better just either squashed into a 6/6, or in an explicit "here's some refactoring to make a subsequent change smaller", which e.g. could move to using the strbuf already, which we'd later make "real" use of. ^ permalink raw reply [flat|nested] 34+ messages in thread
* [RFC PATCH 5/6] ls-tree: introduce 'match_pattern()' function 2022-11-17 11:30 [RFC PATCH 0/6] ls-tree: introduce '--pattern' option Teng Long ` (3 preceding siblings ...) 2022-11-17 11:30 ` [RFC PATCH 4/6] ls-tree: improving cohension in the print code Teng Long @ 2022-11-17 11:30 ` Teng Long 2022-11-17 14:02 ` Ævar Arnfjörð Bjarmason 2022-11-30 9:39 ` Ævar Arnfjörð Bjarmason 2022-11-17 11:30 ` [RFC PATCH 6/6] ls-tree: introduce '--pattern' option Teng Long ` (2 subsequent siblings) 7 siblings, 2 replies; 34+ messages in thread From: Teng Long @ 2022-11-17 11:30 UTC (permalink / raw) To: git; +Cc: avarab, tenglong.tl, me, Teng Long From: Teng Long <dyroneteng@gmail.com> In preparation for actually implementing the '--pattern' option, we add a new method called 'match_pattern' that uses regular expressions to match 'ls-tree' entities. Signed-off-by: Teng Long <dyroneteng@gmail.com> --- builtin/ls-tree.c | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+) diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c index 7661170f7ca..03dd3fbcb26 100644 --- a/builtin/ls-tree.c +++ b/builtin/ls-tree.c @@ -24,6 +24,7 @@ static struct pathspec pathspec; static int chomp_prefix; static const char *ls_tree_prefix; static const char *format; +static const char *pattern; struct show_tree_data { unsigned mode; enum object_type type; @@ -46,6 +47,32 @@ static enum ls_tree_cmdmode { MODE_OBJECT_ONLY, } cmdmode; +__attribute__((unused)) +static int match_pattern(const char *line) +{ + int ret = 0; + regex_t r; + regmatch_t m[1]; + char errbuf[64]; + + ret = regcomp(&r, pattern, 0); + if (ret) { + regerror(ret, &r, errbuf, sizeof(errbuf)); + die("failed regcomp() for pattern '%s' (%s)", pattern, errbuf); + } + ret = regexec(&r, line, 1, m, 0); + if (ret) { + if (ret == REG_NOMATCH) + goto cleanup; + regerror(ret, &r, errbuf, sizeof(errbuf)); + die("failed regexec() for subject '%s' (%s)", line, errbuf); + } + +cleanup: + regfree(&r); + return ret; +} + static void expand_objectsize(struct strbuf *line, const struct object_id *oid, const enum object_type type, unsigned int padded) { -- 2.38.1.426.g770fc8806cb.dirty ^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [RFC PATCH 5/6] ls-tree: introduce 'match_pattern()' function 2022-11-17 11:30 ` [RFC PATCH 5/6] ls-tree: introduce 'match_pattern()' function Teng Long @ 2022-11-17 14:02 ` Ævar Arnfjörð Bjarmason 2022-11-30 9:39 ` Ævar Arnfjörð Bjarmason 1 sibling, 0 replies; 34+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-11-17 14:02 UTC (permalink / raw) To: Teng Long; +Cc: git, tenglong.tl, me On Thu, Nov 17 2022, Teng Long wrote: > From: Teng Long <dyroneteng@gmail.com> > > In preparation for actually implementing the '--pattern' option, we > add a new method called 'match_pattern' that uses regular expressions > to match 'ls-tree' entities. > > Signed-off-by: Teng Long <dyroneteng@gmail.com> > --- > builtin/ls-tree.c | 27 +++++++++++++++++++++++++++ > 1 file changed, 27 insertions(+) > > diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c > index 7661170f7ca..03dd3fbcb26 100644 > --- a/builtin/ls-tree.c > +++ b/builtin/ls-tree.c > @@ -24,6 +24,7 @@ static struct pathspec pathspec; > static int chomp_prefix; > static const char *ls_tree_prefix; > static const char *format; > +static const char *pattern; > struct show_tree_data { > unsigned mode; > enum object_type type; > @@ -46,6 +47,32 @@ static enum ls_tree_cmdmode { > MODE_OBJECT_ONLY, > } cmdmode; > > +__attribute__((unused)) This isn't portable (we'd need a check in git-compat-util.h, and in any case just squashing this whole commit into 6/6 where it actually gets used would be better. > +static int match_pattern(const char *line) > +{ > + int ret = 0; > + regex_t r; > + regmatch_t m[1]; > + char errbuf[64]; > + > + ret = regcomp(&r, pattern, 0); > + if (ret) { > + regerror(ret, &r, errbuf, sizeof(errbuf)); > + die("failed regcomp() for pattern '%s' (%s)", pattern, errbuf); > + } > + ret = regexec(&r, line, 1, m, 0); > + if (ret) { > + if (ret == REG_NOMATCH) > + goto cleanup; > + regerror(ret, &r, errbuf, sizeof(errbuf)); > + die("failed regexec() for subject '%s' (%s)", line, errbuf); > + } > + > +cleanup: > + regfree(&r); > + return ret; > +} > + > static void expand_objectsize(struct strbuf *line, const struct object_id *oid, > const enum object_type type, unsigned int padded) > { ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFC PATCH 5/6] ls-tree: introduce 'match_pattern()' function 2022-11-17 11:30 ` [RFC PATCH 5/6] ls-tree: introduce 'match_pattern()' function Teng Long 2022-11-17 14:02 ` Ævar Arnfjörð Bjarmason @ 2022-11-30 9:39 ` Ævar Arnfjörð Bjarmason 1 sibling, 0 replies; 34+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-11-30 9:39 UTC (permalink / raw) To: Teng Long; +Cc: git, tenglong.tl, me On Thu, Nov 17 2022, Teng Long wrote: (I saw this in the latest "seen" push-out) > From: Teng Long <dyroneteng@gmail.com> > [...] > +__attribute__((unused)) > +static int match_pattern(const char *line) > +{ > + int ret = 0; > + regex_t r; > + regmatch_t m[1]; Here we hardcode the size of "m". (Re-arranged a bit) > + > + ret = regcomp(&r, pattern, 0); > + if (ret) { > + regerror(ret, &r, errbuf, sizeof(errbuf)); > + die("failed regcomp() for pattern '%s' (%s)", pattern, errbuf); Needs _(). > + } > + ret = regexec(&r, line, 1, m, 0); Here we hardcode that "1" again, but we should use ARRAY_SIZE() instead. See an existing example at: git grep -W regexec.*ARRAY (Re-arranged from above) > + char errbuf[64]; This is a super short errbuf, in other cases we hardcode this to 1024, which seems reasonable. ^ permalink raw reply [flat|nested] 34+ messages in thread
* [RFC PATCH 6/6] ls-tree: introduce '--pattern' option 2022-11-17 11:30 [RFC PATCH 0/6] ls-tree: introduce '--pattern' option Teng Long ` (4 preceding siblings ...) 2022-11-17 11:30 ` [RFC PATCH 5/6] ls-tree: introduce 'match_pattern()' function Teng Long @ 2022-11-17 11:30 ` Teng Long 2022-11-17 14:03 ` Ævar Arnfjörð Bjarmason 2022-12-12 8:32 ` Johannes Schindelin 2022-11-17 13:22 ` [RFC PATCH 0/6] " Ævar Arnfjörð Bjarmason 2022-11-17 13:48 ` [RFC PATCH 0/4] ls-tree: pass state in struct, not globals Ævar Arnfjörð Bjarmason 7 siblings, 2 replies; 34+ messages in thread From: Teng Long @ 2022-11-17 11:30 UTC (permalink / raw) To: git; +Cc: avarab, tenglong.tl, me, Teng Long From: Teng Long <dyroneteng@gmail.com> The "--pattern" option uses regular expressions to match each entry, then filter the output of "ls-tree" . Signed-off-by: Teng Long <dyroneteng@gmail.com> --- Documentation/git-ls-tree.txt | 7 ++- builtin/ls-tree.c | 82 +++++++++++++++++++++++------------ t/t3106-ls-tree-pattern.sh | 70 ++++++++++++++++++++++++++++++ 3 files changed, 131 insertions(+), 28 deletions(-) create mode 100755 t/t3106-ls-tree-pattern.sh diff --git a/Documentation/git-ls-tree.txt b/Documentation/git-ls-tree.txt index 0240adb8eec..39346409f2f 100644 --- a/Documentation/git-ls-tree.txt +++ b/Documentation/git-ls-tree.txt @@ -10,7 +10,7 @@ SYNOPSIS -------- [verse] 'git ls-tree' [-d] [-r] [-t] [-l] [-z] - [--name-only] [--name-status] [--object-only] [--full-name] [--full-tree] [--abbrev[=<n>]] [--format=<format>] + [--name-only] [--name-status] [--object-only] [--full-name] [--full-tree] [--abbrev[=<n>]] [--format=<format>] [--pattern=<pattern>] <tree-ish> [<path>...] DESCRIPTION @@ -93,6 +93,11 @@ OPTIONS format-altering options, including `--long`, `--name-only` and `--object-only`. +--pattern=<pattern>:: + The <pattern> is a string of regular expression format used to + match each entry. Unmatched entries will be filtered and not + dump to the output. + [<path>...]:: When paths are given, show them (note that this isn't really raw pathnames, but rather a list of patterns to match). Otherwise diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c index 03dd3fbcb26..576fc9ad16f 100644 --- a/builtin/ls-tree.c +++ b/builtin/ls-tree.c @@ -13,6 +13,7 @@ #include "builtin.h" #include "parse-options.h" #include "pathspec.h" +#include <stdio.h> static int line_termination = '\n'; #define LS_RECURSIVE 1 @@ -25,6 +26,7 @@ static int chomp_prefix; static const char *ls_tree_prefix; static const char *format; static const char *pattern; +static regex_t *regex; struct show_tree_data { unsigned mode; enum object_type type; @@ -47,29 +49,29 @@ static enum ls_tree_cmdmode { MODE_OBJECT_ONLY, } cmdmode; -__attribute__((unused)) static int match_pattern(const char *line) { int ret = 0; - regex_t r; regmatch_t m[1]; char errbuf[64]; - ret = regcomp(&r, pattern, 0); - if (ret) { - regerror(ret, &r, errbuf, sizeof(errbuf)); - die("failed regcomp() for pattern '%s' (%s)", pattern, errbuf); + if (!regex) { + regex = xmalloc(sizeof(*regex)); + ret = regcomp(regex, pattern, 0); + if (ret) { + regerror(ret, regex, errbuf, sizeof(errbuf)); + die("failed regcomp() for pattern '%s' (%s)", pattern, errbuf); + } } - ret = regexec(&r, line, 1, m, 0); + + ret = regexec(regex, line, 1, m, 0); if (ret) { if (ret == REG_NOMATCH) - goto cleanup; - regerror(ret, &r, errbuf, sizeof(errbuf)); + return ret; + regerror(ret, regex, errbuf, sizeof(errbuf)); die("failed regexec() for subject '%s' (%s)", line, errbuf); } -cleanup: - regfree(&r); return ret; } @@ -194,8 +196,12 @@ static int show_tree_fmt(const struct object_id *oid, struct strbuf *base, baselen = base->len; strbuf_expand(&sb, format, expand_show_tree, &data); - strbuf_addch(&sb, line_termination); - fwrite(sb.buf, sb.len, 1, stdout); + + if (!pattern || !match_pattern(sb.buf)) { + strbuf_addch(&sb, line_termination); + fwrite(sb.buf, sb.len, 1, stdout); + } + strbuf_release(&sb); strbuf_setlen(base, baselen); return recurse; @@ -232,19 +238,33 @@ static int show_tree_common(struct show_tree_data *data, int *recurse, static void show_tree_common_default_long(struct show_tree_data *data) { int base_len = data->base->len; + struct strbuf sb = STRBUF_INIT; + int sb_len = 0; if (data->size_text) - printf("%06o %s %s %7s\t", data->mode, type_name(data->type), - find_unique_abbrev(data->oid, abbrev), data->size_text); + strbuf_addf(&sb, "%06o %s %s %7s\t", data->mode, + type_name(data->type), + find_unique_abbrev(data->oid, abbrev), + data->size_text); else - printf("%06o %s %s\t", data->mode, type_name(data->type), - find_unique_abbrev(data->oid, abbrev)); + strbuf_addf(&sb, "%06o %s %s\t", data->mode, + type_name(data->type), + find_unique_abbrev(data->oid, abbrev)); strbuf_addstr(data->base, data->pathname); - write_name_quoted_relative(data->base->buf, - chomp_prefix ? ls_tree_prefix : NULL, stdout, - line_termination); + sb_len = sb.len; + strbuf_addbuf(&sb, data->base); + + if (!pattern || !match_pattern(sb.buf)) { + strbuf_setlen(&sb, sb_len); + printf("%s", sb.buf); + write_name_quoted_relative(data->base->buf, + chomp_prefix ? ls_tree_prefix : NULL, + stdout, line_termination); + } strbuf_setlen(data->base, base_len); + + strbuf_release(&sb); } static int show_tree_default(const struct object_id *oid, struct strbuf *base, @@ -306,9 +326,11 @@ static int show_tree_name_only(const struct object_id *oid, struct strbuf *base, return early; strbuf_addstr(base, pathname); - write_name_quoted_relative(base->buf, - chomp_prefix ? ls_tree_prefix : NULL, - stdout, line_termination); + if (!pattern || !match_pattern(base->buf)) { + write_name_quoted_relative(base->buf, + chomp_prefix ? ls_tree_prefix : NULL, + stdout, line_termination); + } strbuf_setlen(base, baselen); return recurse; } @@ -320,12 +342,14 @@ static int show_tree_object(const struct object_id *oid, struct strbuf *base, int early; int recurse; struct show_tree_data data = { 0 }; + const char *oid_text = find_unique_abbrev(oid, abbrev); early = show_tree_common(&data, &recurse, oid, base, pathname, mode); if (early >= 0) return early; - printf("%s%c", find_unique_abbrev(oid, abbrev), line_termination); + if (!pattern || !match_pattern(oid_text)) + printf("%s%c", oid_text, line_termination); return recurse; } @@ -391,8 +415,10 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix) N_("list entire tree; not just current directory " "(implies --full-name)")), OPT_STRING_F(0, "format", &format, N_("format"), - N_("format to use for the output"), - PARSE_OPT_NONEG), + N_("format to use for the output"), + PARSE_OPT_NONEG), + OPT_STRING(0, "pattern", &pattern, "pattern", + "pattern to use to match the output"), OPT__ABBREV(&abbrev), OPT_END() }; @@ -430,10 +456,12 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix) usage_with_options(ls_tree_usage, ls_tree_options); if (get_oid(argv[0], &oid)) die("Not a valid object name %s", argv[0]); + if (pattern && !strlen(pattern)) + die("Not a valid pattern, the value is empty"); /* * show_recursive() rolls its own matching code and is - * generally ignorant of 'struct pathspec'. The magic mask + * generally ignorant f 'struct pathspec'. The magic mask * cannot be lifted until it is converted to use * match_pathspec() or tree_entry_interesting() */ diff --git a/t/t3106-ls-tree-pattern.sh b/t/t3106-ls-tree-pattern.sh new file mode 100755 index 00000000000..e4a81c8c47e --- /dev/null +++ b/t/t3106-ls-tree-pattern.sh @@ -0,0 +1,70 @@ +#!/bin/sh + +test_description='ls-tree pattern' + +TEST_PASSES_SANITIZE_LEAK=true +. ./test-lib.sh +. "$TEST_DIRECTORY"/lib-t3100.sh + +test_expect_success 'setup' ' + setup_basic_ls_tree_data +' + +test_expect_success 'ls-tree pattern usage' ' + test_expect_code 129 git ls-tree --pattern HEAD && + test_expect_code 128 git ls-tree --pattern "" HEAD >err 2>&1 && + grep "Not a valid pattern, the value is empty" err +' + +test_expect_success 'combine with "--object-only"' ' + cat > expect <<-EOF && + 6da7993 + EOF + + git ls-tree --object-only --abbrev=7 --pattern "6da7993" HEAD > actual && + test_cmp expect actual +' + +test_expect_success 'combine with "--name-only"' ' + cat > expect <<-EOF && + .gitmodules + top-file.t + EOF + + git ls-tree --name-only --pattern "\." HEAD > actual && + test_cmp expect actual +' + +test_expect_success 'combine with "--long"' ' + cat > expect <<-EOF && + 100644 blob 6da7993 61 .gitmodules + 100644 blob 02dad95 9 top-file.t + EOF + git ls-tree --long --abbrev=7 --pattern "blob" HEAD > actual && + test_cmp expect actual +' + +test_expect_success 'combine with "--format"' ' + # Change the output format by replacing space separators with asterisks. + format="%(objectmode)*%(objecttype)*%(objectname)%x09%(path)" && + pattern="100644\*blob" && + + cat > expect <<-EOF && + 100644*blob*6da7993 .gitmodules + 100644*blob*02dad95 top-file.t + EOF + + git ls-tree --abbrev=7 --format "$format" --pattern "$pattern" HEAD >actual && + test_cmp expect actual +' + +test_expect_success 'default output format (only with "--pattern" option)' ' + cat > expect <<-EOF && + 100644 blob 6da7993ca1a3435f63c598464f77bdc6dae15aa5 .gitmodules + 100644 blob 02dad956d9274a70f7fafe45a5ffa2e123acd9a8 top-file.t + EOF + git ls-tree --pattern "blob" HEAD > actual && + test_cmp expect actual +' + +test_done -- 2.38.1.426.g770fc8806cb.dirty ^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [RFC PATCH 6/6] ls-tree: introduce '--pattern' option 2022-11-17 11:30 ` [RFC PATCH 6/6] ls-tree: introduce '--pattern' option Teng Long @ 2022-11-17 14:03 ` Ævar Arnfjörð Bjarmason 2022-12-12 8:32 ` Johannes Schindelin 1 sibling, 0 replies; 34+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-11-17 14:03 UTC (permalink / raw) To: Teng Long; +Cc: git, tenglong.tl, me On Thu, Nov 17 2022, Teng Long wrote: > diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c > index 03dd3fbcb26..576fc9ad16f 100644 > --- a/builtin/ls-tree.c > +++ b/builtin/ls-tree.c > @@ -13,6 +13,7 @@ > #include "builtin.h" > #include "parse-options.h" > #include "pathspec.h" > +#include <stdio.h> Aside from anything else I've mentionded (e.g. overall goals), don't include "<>" headers in anything except git-compat-util.h and similar. In this case we don't need this at all, but if we did it should be added there... > [...] > } > - ret = regexec(&r, line, 1, m, 0); > + > + ret = regexec(regex, line, 1, m, 0); Some whitespace-churn after the last commit... > static void show_tree_common_default_long(struct show_tree_data *data) > { > int base_len = data->base->len; > + struct strbuf sb = STRBUF_INIT; > + int sb_len = 0; It's size_t, not int, so if you need to keep track of a strbuf's length use the type of its "len". This would be better named "oldlen" or something... > + printf("%s", sb.buf); puts(sb.buf) instead; > + if (pattern && !strlen(pattern)) pattern && !*pattern is more idiomatic IMO. > +test_expect_success 'combine with "--object-only"' ' > + cat > expect <<-EOF && Style: No " " after ">" > + 6da7993 Style: Let's avoid \t\t indenting for here-docs, just use the indenting of the "cat". ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFC PATCH 6/6] ls-tree: introduce '--pattern' option 2022-11-17 11:30 ` [RFC PATCH 6/6] ls-tree: introduce '--pattern' option Teng Long 2022-11-17 14:03 ` Ævar Arnfjörð Bjarmason @ 2022-12-12 8:32 ` Johannes Schindelin 2022-12-12 23:57 ` Junio C Hamano 1 sibling, 1 reply; 34+ messages in thread From: Johannes Schindelin @ 2022-12-12 8:32 UTC (permalink / raw) To: Teng Long; +Cc: git, avarab, tenglong.tl, me Hi, On Thu, 17 Nov 2022, Teng Long wrote: > diff --git a/t/t3106-ls-tree-pattern.sh b/t/t3106-ls-tree-pattern.sh > new file mode 100755 > index 00000000000..e4a81c8c47e > --- /dev/null > +++ b/t/t3106-ls-tree-pattern.sh > @@ -0,0 +1,70 @@ > +#!/bin/sh > + > +test_description='ls-tree pattern' > + > +TEST_PASSES_SANITIZE_LEAK=true > +. ./test-lib.sh > +. "$TEST_DIRECTORY"/lib-t3100.sh > + > +test_expect_success 'setup' ' > + setup_basic_ls_tree_data > +' > + > +test_expect_success 'ls-tree pattern usage' ' > + test_expect_code 129 git ls-tree --pattern HEAD && > + test_expect_code 128 git ls-tree --pattern "" HEAD >err 2>&1 && > + grep "Not a valid pattern, the value is empty" err > +' > + > +test_expect_success 'combine with "--object-only"' ' > + cat > expect <<-EOF && > + 6da7993 > + EOF > + > + git ls-tree --object-only --abbrev=7 --pattern "6da7993" HEAD > actual && > + test_cmp expect actual > +' > + > +test_expect_success 'combine with "--name-only"' ' > + cat > expect <<-EOF && > + .gitmodules > + top-file.t > + EOF > + > + git ls-tree --name-only --pattern "\." HEAD > actual && > + test_cmp expect actual > +' > + > +test_expect_success 'combine with "--long"' ' > + cat > expect <<-EOF && > + 100644 blob 6da7993 61 .gitmodules > + 100644 blob 02dad95 9 top-file.t > + EOF > + git ls-tree --long --abbrev=7 --pattern "blob" HEAD > actual && > + test_cmp expect actual > +' > + > +test_expect_success 'combine with "--format"' ' > + # Change the output format by replacing space separators with asterisks. > + format="%(objectmode)*%(objecttype)*%(objectname)%x09%(path)" && > + pattern="100644\*blob" && > + > + cat > expect <<-EOF && > + 100644*blob*6da7993 .gitmodules > + 100644*blob*02dad95 top-file.t > + EOF > + > + git ls-tree --abbrev=7 --format "$format" --pattern "$pattern" HEAD >actual && > + test_cmp expect actual > +' > + > +test_expect_success 'default output format (only with "--pattern" option)' ' > + cat > expect <<-EOF && > + 100644 blob 6da7993ca1a3435f63c598464f77bdc6dae15aa5 .gitmodules > + 100644 blob 02dad956d9274a70f7fafe45a5ffa2e123acd9a8 top-file.t > + EOF > + git ls-tree --pattern "blob" HEAD > actual && > + test_cmp expect actual > +' > + > +test_done The hard-coded object IDs break the `linux-sha256` job, as pointed out in https://github.com/git/git/blob/6ab7651d8669/whats-cooking.txt#L522-L537. Please squash this in to address this (Junio, please feel free to cherry-pick this on top of `tl/ls-tree--pattern` to reduce the number of CI failures): -- snipsnap -- Subject: [PATCH] fixup! ls-tree: introduce '--pattern' option We need to avoid hard-coding OIDs because of the SHA-256 support. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/t3106-ls-tree-pattern.sh | 23 +++++++++++++++-------- 1 file changed, 15 insertions(+), 8 deletions(-) diff --git a/t/t3106-ls-tree-pattern.sh b/t/t3106-ls-tree-pattern.sh index e4a81c8c47e..8aaca307804 100755 --- a/t/t3106-ls-tree-pattern.sh +++ b/t/t3106-ls-tree-pattern.sh @@ -17,11 +17,12 @@ test_expect_success 'ls-tree pattern usage' ' ' test_expect_success 'combine with "--object-only"' ' + oid="$(git rev-parse --short HEAD:.gitmodules)" && cat > expect <<-EOF && - 6da7993 + $oid EOF - git ls-tree --object-only --abbrev=7 --pattern "6da7993" HEAD > actual && + git ls-tree --object-only --abbrev=7 --pattern "$oid" HEAD > actual && test_cmp expect actual ' @@ -36,22 +37,26 @@ test_expect_success 'combine with "--name-only"' ' ' test_expect_success 'combine with "--long"' ' + oid1="$(git rev-parse --short HEAD:.gitmodules)" && + oid2="$(git rev-parse --short HEAD:top-file.t)" && cat > expect <<-EOF && - 100644 blob 6da7993 61 .gitmodules - 100644 blob 02dad95 9 top-file.t + 100644 blob $oid1 61 .gitmodules + 100644 blob $oid2 9 top-file.t EOF git ls-tree --long --abbrev=7 --pattern "blob" HEAD > actual && test_cmp expect actual ' test_expect_success 'combine with "--format"' ' + oid1="$(git rev-parse --short HEAD:.gitmodules)" && + oid2="$(git rev-parse --short HEAD:top-file.t)" && # Change the output format by replacing space separators with asterisks. format="%(objectmode)*%(objecttype)*%(objectname)%x09%(path)" && pattern="100644\*blob" && cat > expect <<-EOF && - 100644*blob*6da7993 .gitmodules - 100644*blob*02dad95 top-file.t + 100644*blob*$oid1 .gitmodules + 100644*blob*$oid2 top-file.t EOF git ls-tree --abbrev=7 --format "$format" --pattern "$pattern" HEAD >actual && @@ -59,9 +64,11 @@ test_expect_success 'combine with "--format"' ' ' test_expect_success 'default output format (only with "--pattern" option)' ' + oid1="$(git rev-parse HEAD:.gitmodules)" && + oid2="$(git rev-parse HEAD:top-file.t)" && cat > expect <<-EOF && - 100644 blob 6da7993ca1a3435f63c598464f77bdc6dae15aa5 .gitmodules - 100644 blob 02dad956d9274a70f7fafe45a5ffa2e123acd9a8 top-file.t + 100644 blob $oid1 .gitmodules + 100644 blob $oid2 top-file.t EOF git ls-tree --pattern "blob" HEAD > actual && test_cmp expect actual -- 2.38.1.windows.1.24.g5ff6506c583.dirty ^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [RFC PATCH 6/6] ls-tree: introduce '--pattern' option 2022-12-12 8:32 ` Johannes Schindelin @ 2022-12-12 23:57 ` Junio C Hamano 2022-12-14 5:27 ` Junio C Hamano 2023-03-27 10:37 ` win-test: unknown terminal "xterm-256color", was " Johannes Schindelin 0 siblings, 2 replies; 34+ messages in thread From: Junio C Hamano @ 2022-12-12 23:57 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Teng Long, git, avarab, tenglong.tl, me Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > The hard-coded object IDs break the `linux-sha256` job, as pointed out in > https://github.com/git/git/blob/6ab7651d8669/whats-cooking.txt#L522-L537. > > Please squash this in to address this (Junio, please feel free to > cherry-pick this on top of `tl/ls-tree--pattern` to reduce the number of > CI failures): These days, I gather topics with known CI breakages near the tip of 'seen', and push out only the good bottom half of 'seen' until such a topic gets rerolled, at which time it gets added back to the set of topics pushed out on 'seen' again (and then ejected if it CI breaks). I excluded the part with the topic from last night's pushout. By the way do you know anything about xterm-256color error in win test(6)? https://github.com/git/git/actions/runs/3676139624/jobs/6216575838#step:5:196 I do not think we hard-code any specific terminal name (other than dumb and possibly vt100) in our tests or binaries, so it may be coming from the CI runner environment---some parts incorrectly think xterm-256color is available there while there is no support for the particular terminal? Thanks. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFC PATCH 6/6] ls-tree: introduce '--pattern' option 2022-12-12 23:57 ` Junio C Hamano @ 2022-12-14 5:27 ` Junio C Hamano 2022-12-14 10:03 ` Ævar Arnfjörð Bjarmason 2023-03-27 10:37 ` win-test: unknown terminal "xterm-256color", was " Johannes Schindelin 1 sibling, 1 reply; 34+ messages in thread From: Junio C Hamano @ 2022-12-14 5:27 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Teng Long, git, avarab, tenglong.tl, me Junio C Hamano <gitster@pobox.com> writes: > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > >> The hard-coded object IDs break the `linux-sha256` job, as pointed out in >> https://github.com/git/git/blob/6ab7651d8669/whats-cooking.txt#L522-L537. >> >> Please squash this in to address this (Junio, please feel free to >> cherry-pick this on top of `tl/ls-tree--pattern` to reduce the number of >> CI failures): After re-reading the patches, I am very much inclined to drop this topic, which does not add much value to the system and adds an odd corner case in the UI. Who needs "git ls-tree --pattern='blob 486' HEAD" that is a synonym to "git ls-tree HEAD | grep 'blob 486'"? Should we end up adding the same option to "git ls-files", "git status", "git ls-remote", "git remote", "git branch --list", etc. for consistency? This is simply insane and goes directly against the "one tool does one job well, and can be combined with other such tools via pipe", which is a key to scale the usability of a set of tools. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFC PATCH 6/6] ls-tree: introduce '--pattern' option 2022-12-14 5:27 ` Junio C Hamano @ 2022-12-14 10:03 ` Ævar Arnfjörð Bjarmason 2022-12-14 10:38 ` Junio C Hamano 0 siblings, 1 reply; 34+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-12-14 10:03 UTC (permalink / raw) To: Junio C Hamano; +Cc: Johannes Schindelin, Teng Long, git, tenglong.tl, me On Wed, Dec 14 2022, Junio C Hamano wrote: > Junio C Hamano <gitster@pobox.com> writes: > >> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: >> >>> The hard-coded object IDs break the `linux-sha256` job, as pointed out in >>> https://github.com/git/git/blob/6ab7651d8669/whats-cooking.txt#L522-L537. >>> >>> Please squash this in to address this (Junio, please feel free to >>> cherry-pick this on top of `tl/ls-tree--pattern` to reduce the number of >>> CI failures): > > After re-reading the patches, I am very much inclined to drop this > topic, which does not add much value to the system and adds an odd > corner case in the UI. > > Who needs "git ls-tree --pattern='blob 486' HEAD" that is a synonym > to "git ls-tree HEAD | grep 'blob 486'"? Should we end up adding > the same option to "git ls-files", "git status", "git ls-remote", > "git remote", "git branch --list", etc. for consistency? > > This is simply insane and goes directly against the "one tool does > one job well, and can be combined with other such tools via pipe", > which is a key to scale the usability of a set of tools. I agree, and FWIW I read https://lore.kernel.org/git/20221121114150.3473-1-tenglong.tl@alibaba-inc.com/ as the submitter of the topic agreeing to drop it ~3 weeks ago. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFC PATCH 6/6] ls-tree: introduce '--pattern' option 2022-12-14 10:03 ` Ævar Arnfjörð Bjarmason @ 2022-12-14 10:38 ` Junio C Hamano 0 siblings, 0 replies; 34+ messages in thread From: Junio C Hamano @ 2022-12-14 10:38 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: Johannes Schindelin, Teng Long, git, tenglong.tl, me Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: >> This is simply insane and goes directly against the "one tool does >> one job well, and can be combined with other such tools via pipe", >> which is a key to scale the usability of a set of tools. > > I agree, and FWIW I read > https://lore.kernel.org/git/20221121114150.3473-1-tenglong.tl@alibaba-inc.com/ > as the submitter of the topic agreeing to drop it ~3 weeks ago. I actually read "it's useful to me" as resisting, but I already discarded the topic (not just "I stopped merging it to 'seen'", but "I no longer have the topic branch with me"). Thanks. ^ permalink raw reply [flat|nested] 34+ messages in thread
* win-test: unknown terminal "xterm-256color", was Re: [RFC PATCH 6/6] ls-tree: introduce '--pattern' option 2022-12-12 23:57 ` Junio C Hamano 2022-12-14 5:27 ` Junio C Hamano @ 2023-03-27 10:37 ` Johannes Schindelin 2023-03-27 20:42 ` Junio C Hamano 1 sibling, 1 reply; 34+ messages in thread From: Johannes Schindelin @ 2023-03-27 10:37 UTC (permalink / raw) To: Junio C Hamano; +Cc: Teng Long, git, avarab, tenglong.tl, me Hi Junio On Tue, 13 Dec 2022, Junio C Hamano wrote: > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > > > The hard-coded object IDs break the `linux-sha256` job, as pointed out in > > https://github.com/git/git/blob/6ab7651d8669/whats-cooking.txt#L522-L537. > > > > Please squash this in to address this (Junio, please feel free to > > cherry-pick this on top of `tl/ls-tree--pattern` to reduce the number of > > CI failures): > > These days, I gather topics with known CI breakages near the tip of > 'seen', and push out only the good bottom half of 'seen' until such > a topic gets rerolled, at which time it gets added back to the set > of topics pushed out on 'seen' again (and then ejected if it CI > breaks). I excluded the part with the topic from last night's > pushout. > > By the way do you know anything about xterm-256color error in win > test(6)? > > https://github.com/git/git/actions/runs/3676139624/jobs/6216575838#step:5:196 Unfortunately by the time I got back to this mail, the log had expired. Here is a link to the same symptom in a newer build: https://github.com/git/git/actions/runs/4523517641/jobs/7966768829#step:6:63 > I do not think we hard-code any specific terminal name (other than > dumb and possibly vt100) in our tests or binaries, so it may be > coming from the CI runner environment---some parts incorrectly think > xterm-256color is available there while there is no support for the > particular terminal? The TERM is hard-coded in the MSYS2 runtime: https://github.com/git-for-windows/msys2-runtime/commit/bd627864ab4189984cdb0892c00f91e39c4e8243 Note: The MSYS2 runtime merely wants to ensure that `TERM` is set; If it already has a value, that value remains unchanged. And to save on bandwidth/time (in a desperate attempt to counter the ever-growing runtimes of Git's CI builds), I liberally exclude files from the "minimal subset of Git for Windows' SDK", e.g. `/usr/lib/terminfo/` and `/usr/share/terminfo/`. That's why `tput` cannot figure out what to do with this `TERM` value. If these `tput` errors become too much of a sore in your eye, I see two ways forward: - Set `TERM=dumb` for the Windows jobs - Use a simple shim like this one in `ci/` (and maybe even in `t/test-lib.sh`): tput() { printf '\e[%sm%s'"$(test sgr0 != $1 || echo '\x0f')" "$( \ case $1 in bold) echo 1;; dim) echo 2;; rev) echo 7;; setaf) echo 3$2;; setab) echo 4$2;; esac \ )" } Personally, I do not really want to work on this, not before much bigger fish are fed first. For example, the friction regarding the CI build times is becoming quite crushing. Ciao, Johannes ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: win-test: unknown terminal "xterm-256color", was Re: [RFC PATCH 6/6] ls-tree: introduce '--pattern' option 2023-03-27 10:37 ` win-test: unknown terminal "xterm-256color", was " Johannes Schindelin @ 2023-03-27 20:42 ` Junio C Hamano 2023-03-28 18:08 ` Jeff King 0 siblings, 1 reply; 34+ messages in thread From: Junio C Hamano @ 2023-03-27 20:42 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Teng Long, git, avarab, tenglong.tl, me Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > The TERM is hard-coded in the MSYS2 runtime: > https://github.com/git-for-windows/msys2-runtime/commit/bd627864ab4189984cdb0892c00f91e39c4e8243 > > Note: The MSYS2 runtime merely wants to ensure that `TERM` is set; If it > already has a value, that value remains unchanged. > > And to save on bandwidth/time (in a desperate attempt to counter the > ever-growing runtimes of Git's CI builds), I liberally exclude files from > the "minimal subset of Git for Windows' SDK", e.g. `/usr/lib/terminfo/` > and `/usr/share/terminfo/`. That's why `tput` cannot figure out what to do > with this `TERM` value. Ah, so it is not like "you ship vt100 but not xterm-color yet the runtime insists you are by default on xterm-color", so "set it to something your terminfo database understands" would not work. I am not sure what unintended fallouts there would be from setting it to dumb, though. Our tests are designed for unattended testing, and they are even capable of running without control terminal, so it should be OK to force TERM=dumb, I guess. > If these `tput` errors become too much of a sore in your eye, I see two > ways forward: Not at all. As long as it is clear that they are to be expected and to be ignored when fishing for real breakages, it is fine by me. Others may care more than I do, but then they are welcome to send fixes your way ;-) Thanks. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: win-test: unknown terminal "xterm-256color", was Re: [RFC PATCH 6/6] ls-tree: introduce '--pattern' option 2023-03-27 20:42 ` Junio C Hamano @ 2023-03-28 18:08 ` Jeff King 2023-03-28 19:31 ` Junio C Hamano 0 siblings, 1 reply; 34+ messages in thread From: Jeff King @ 2023-03-28 18:08 UTC (permalink / raw) To: Junio C Hamano Cc: Johannes Schindelin, Teng Long, git, avarab, tenglong.tl, me On Mon, Mar 27, 2023 at 01:42:11PM -0700, Junio C Hamano wrote: > > And to save on bandwidth/time (in a desperate attempt to counter the > > ever-growing runtimes of Git's CI builds), I liberally exclude files from > > the "minimal subset of Git for Windows' SDK", e.g. `/usr/lib/terminfo/` > > and `/usr/share/terminfo/`. That's why `tput` cannot figure out what to do > > with this `TERM` value. > > Ah, so it is not like "you ship vt100 but not xterm-color yet the > runtime insists you are by default on xterm-color", so "set it to > something your terminfo database understands" would not work. I am > not sure what unintended fallouts there would be from setting it to > dumb, though. Our tests are designed for unattended testing, and > they are even capable of running without control terminal, so it > should be OK to force TERM=dumb, I guess. We already force TERM=dumb in test-lib.sh, so the tests themselves should behave the same. The original terminal is only used for colorized output from the test harness itself. But I don't think we'd colorize anyway in CI, since stdout is not a terminal (and we tend to further go through "prove" anyway). So it should be fine to just set TERM=dumb everywhere. What actually confuses me is that we try to do so already: $ git grep -B1 dumb ci/lib.sh ci/lib.sh-# GitHub Action doesn't set TERM, which is required by tput ci/lib.sh:export TERM=${TERM:-dumb} Pushing a stripped-down workflow file to just run "echo $TERM" shows that it seems to already be set by Actions to "dumb" on ubuntu-latest, but is xterm-256color on windows-latest. So maybe we just want to make the line above unconditionally set $TERM? -Peff ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: win-test: unknown terminal "xterm-256color", was Re: [RFC PATCH 6/6] ls-tree: introduce '--pattern' option 2023-03-28 18:08 ` Jeff King @ 2023-03-28 19:31 ` Junio C Hamano 2023-03-28 19:59 ` Jeff King 0 siblings, 1 reply; 34+ messages in thread From: Junio C Hamano @ 2023-03-28 19:31 UTC (permalink / raw) To: Jeff King; +Cc: Johannes Schindelin, Teng Long, git, avarab, tenglong.tl, me Jeff King <peff@peff.net> writes: > So it should be fine to just set TERM=dumb everywhere. What actually > confuses me is that we try to do so already: > > $ git grep -B1 dumb ci/lib.sh > ci/lib.sh-# GitHub Action doesn't set TERM, which is required by tput > ci/lib.sh:export TERM=${TERM:-dumb} > > Pushing a stripped-down workflow file to just run "echo $TERM" shows > that it seems to already be set by Actions to "dumb" on ubuntu-latest, > but is xterm-256color on windows-latest. > > So maybe we just want to make the line above unconditionally set $TERM? I thought that Dscho earlier said xterm-256 is set by mingw when nothing is set to TERM, which explains why TERM=${TERM:-dumb} is not good enough to "fix" this one for them. I am fine with an unconditional assignment for CI. Thanks. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: win-test: unknown terminal "xterm-256color", was Re: [RFC PATCH 6/6] ls-tree: introduce '--pattern' option 2023-03-28 19:31 ` Junio C Hamano @ 2023-03-28 19:59 ` Jeff King 2023-03-28 20:43 ` Jeff King 0 siblings, 1 reply; 34+ messages in thread From: Jeff King @ 2023-03-28 19:59 UTC (permalink / raw) To: Junio C Hamano Cc: Johannes Schindelin, Teng Long, git, avarab, tenglong.tl, me On Tue, Mar 28, 2023 at 12:31:59PM -0700, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > So it should be fine to just set TERM=dumb everywhere. What actually > > confuses me is that we try to do so already: > > > > $ git grep -B1 dumb ci/lib.sh > > ci/lib.sh-# GitHub Action doesn't set TERM, which is required by tput > > ci/lib.sh:export TERM=${TERM:-dumb} > > > > Pushing a stripped-down workflow file to just run "echo $TERM" shows > > that it seems to already be set by Actions to "dumb" on ubuntu-latest, > > but is xterm-256color on windows-latest. > > > > So maybe we just want to make the line above unconditionally set $TERM? > > I thought that Dscho earlier said xterm-256 is set by mingw when > nothing is set to TERM, which explains why TERM=${TERM:-dumb} is > not good enough to "fix" this one for them. Ah, I took it to mean that in our code, we'd default to xterm256-color. But perhaps it is the outer bash which is doing so, via that runtime. I tested with a dummy workflow like: name: fake ci on: [push, pull_request] jobs: ubuntu: runs-on: ubuntu-latest steps: - name: check TERM - run: | echo TERM=$TERM windows: runs-on: windows-latest steps: - name: check TERM shell: bash run: | echo TERM=$TERM and got "dumb" for ubuntu (which is not set by us; this is before ci/lib.sh is run) and "xterm256-color" for windows (I guess because it's mingw bash). At any rate, unconditionally setting TERM=dumb should cover all cases, I'd think. I'm happy to prepare a patch. -Peff ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: win-test: unknown terminal "xterm-256color", was Re: [RFC PATCH 6/6] ls-tree: introduce '--pattern' option 2023-03-28 19:59 ` Jeff King @ 2023-03-28 20:43 ` Jeff King 2023-03-28 21:05 ` Junio C Hamano 0 siblings, 1 reply; 34+ messages in thread From: Jeff King @ 2023-03-28 20:43 UTC (permalink / raw) To: Junio C Hamano Cc: Johannes Schindelin, Teng Long, git, avarab, tenglong.tl, me On Tue, Mar 28, 2023 at 03:59:42PM -0400, Jeff King wrote: > At any rate, unconditionally setting TERM=dumb should cover all cases, > I'd think. I'm happy to prepare a patch. Hmph. So I started to do this, and notice there are a bunch of tput calls in the ci/ scripts themselves. So if I do a dummy workflow like this: -- >8 -- name: fake ci on: [push, pull_request] jobs: ubuntu: runs-on: ubuntu-latest steps: - name: check TERM run: | echo TERM=$TERM echo $(tput setaf 1)this is setaf 1$(tput sgr0) windows: runs-on: windows-latest steps: - name: check TERM shell: bash run: | echo TERM=$TERM echo $(tput setaf 1)this is setaf 1$(tput sgr0) -- 8< -- then I note two interesting things: - the attempted colorization does nothing on ubuntu, because it's already TERM=dumb - the colorization on windows works! But I thought it was exactly these sorts of tputs that are causing the error messages. So now I'm more confused than ever. I have a feeling we would be better off just using ansi codes instead of tput, as Johannes suggested earlier. But given the twists and turns here, and the fact that nobody really seems that bothered by the status quo, I'm inclined to just leave it alone for now. -Peff ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: win-test: unknown terminal "xterm-256color", was Re: [RFC PATCH 6/6] ls-tree: introduce '--pattern' option 2023-03-28 20:43 ` Jeff King @ 2023-03-28 21:05 ` Junio C Hamano 0 siblings, 0 replies; 34+ messages in thread From: Junio C Hamano @ 2023-03-28 21:05 UTC (permalink / raw) To: Jeff King; +Cc: Johannes Schindelin, Teng Long, git, avarab, tenglong.tl, me Jeff King <peff@peff.net> writes: > then I note two interesting things: > > - the attempted colorization does nothing on ubuntu, because it's > already TERM=dumb This is expected, I think. > - the colorization on windows works! But I thought it was exactly > these sorts of tputs that are causing the error messages. Yeah, tput complains about unknown terminal as terminfo database is not shipped. But perhaps it falls back to bog-standard ANSI after doing its complaining? I dunno. > ... I'm inclined to just leave it alone for now. That's fine, too. Thanks. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFC PATCH 0/6] ls-tree: introduce '--pattern' option 2022-11-17 11:30 [RFC PATCH 0/6] ls-tree: introduce '--pattern' option Teng Long ` (5 preceding siblings ...) 2022-11-17 11:30 ` [RFC PATCH 6/6] ls-tree: introduce '--pattern' option Teng Long @ 2022-11-17 13:22 ` Ævar Arnfjörð Bjarmason 2022-11-17 22:02 ` Taylor Blau 2022-11-17 13:48 ` [RFC PATCH 0/4] ls-tree: pass state in struct, not globals Ævar Arnfjörð Bjarmason 7 siblings, 1 reply; 34+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-11-17 13:22 UTC (permalink / raw) To: Teng Long; +Cc: git, tenglong.tl, me On Thu, Nov 17 2022, Teng Long wrote: > From: Teng Long <dyroneteng@gmail.com> > > This RFC patch introduce a new "ls-tree" option "--pattern", aim to match the > entries by regex then filter the output which we may want to achieve. It also > contains some commit for preparation or cleanup. > > The idea may be not comprehensive and the tests for it might be insufficient > too, but I'd like to listen the suggestion from the community to decide if it's > worth going forward with. I applied this series, compiled with CFLAGS=-O3, and: $ hyperfine './git ls-tree --pattern=[ab]c.*d -r HEAD' './git ls-tree -r HEAD | grep [ab]c.*d' -w 10 -r 20 Benchmark 1: ./git ls-tree --pattern=[ab]c.*d -r HEAD Time (mean ± σ): 14.8 ms ± 0.1 ms [User: 12.0 ms, System: 2.8 ms] Range (min … max): 14.6 ms … 15.0 ms 20 runs Benchmark 2: ./git ls-tree -r HEAD | grep [ab]c.*d Time (mean ± σ): 12.5 ms ± 0.1 ms [User: 10.0 ms, System: 4.0 ms] Range (min … max): 12.4 ms … 12.8 ms 20 runs Summary './git ls-tree -r HEAD | grep [ab]c.*d' ran 1.18 ± 0.01 times faster than './git ls-tree --pattern=[ab]c.*d -r HEAD' So the value-proposition isn't really clear to me, and the included docs, commit messages & this CL don't answer the "why not just 'grep'" question? That's faster even with another process for me, but likely that's because you're doing the regex matching really inefficiently (e.g. malloc-ing again for each line), which could be "fixed". But in any setup which cares about the performance you're likely piping to another process anyway (the thing using the data), which could do that filtering without thep "grep" process. So I don't see the value in doing this, but maybe I'm just missing something. And, in terms of the complexity for git's implementation it would be really good to avoid the complexity of a "--pattern", "--sort-lines" etc., if those use-cases can be satisfied by piping into "grep" or "sort" instead. Some of the pre-cleanup here looks good, but it's unrelated to the rest of the series. I think in any case that it would be nice to see that as another topic. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFC PATCH 0/6] ls-tree: introduce '--pattern' option 2022-11-17 13:22 ` [RFC PATCH 0/6] " Ævar Arnfjörð Bjarmason @ 2022-11-17 22:02 ` Taylor Blau 2022-11-21 11:41 ` Teng Long 0 siblings, 1 reply; 34+ messages in thread From: Taylor Blau @ 2022-11-17 22:02 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason; +Cc: Teng Long, git, tenglong.tl, me On Thu, Nov 17, 2022 at 02:22:20PM +0100, Ævar Arnfjörð Bjarmason wrote: > > On Thu, Nov 17 2022, Teng Long wrote: > > > From: Teng Long <dyroneteng@gmail.com> > > > > This RFC patch introduce a new "ls-tree" option "--pattern", aim to match the > > entries by regex then filter the output which we may want to achieve. It also > > contains some commit for preparation or cleanup. > > > > The idea may be not comprehensive and the tests for it might be insufficient > > too, but I'd like to listen the suggestion from the community to decide if it's > > worth going forward with. > > I applied this series, compiled with CFLAGS=-O3, and: > > $ hyperfine './git ls-tree --pattern=[ab]c.*d -r HEAD' './git ls-tree -r HEAD | grep [ab]c.*d' -w 10 -r 20 > Benchmark 1: ./git ls-tree --pattern=[ab]c.*d -r HEAD > Time (mean ± σ): 14.8 ms ± 0.1 ms [User: 12.0 ms, System: 2.8 ms] > Range (min … max): 14.6 ms … 15.0 ms 20 runs > > Benchmark 2: ./git ls-tree -r HEAD | grep [ab]c.*d > Time (mean ± σ): 12.5 ms ± 0.1 ms [User: 10.0 ms, System: 4.0 ms] > Range (min … max): 12.4 ms … 12.8 ms 20 runs > > Summary > './git ls-tree -r HEAD | grep [ab]c.*d' ran > 1.18 ± 0.01 times faster than './git ls-tree --pattern=[ab]c.*d -r HEAD' > > So the value-proposition isn't really clear to me, and the included > docs, commit messages & this CL don't answer the "why not just 'grep'" > question? > > That's faster even with another process for me, but likely that's > because you're doing the regex matching really inefficiently > (e.g. malloc-ing again for each line), which could be "fixed". > > But in any setup which cares about the performance you're likely piping > to another process anyway (the thing using the data), which could do > that filtering without thep "grep" process. > > So I don't see the value in doing this, but maybe I'm just missing > something. I think this falls into the same trap as the series on 'git show-ref --count' that I worked on earlier this year [1]. At the time, it seemed useful to me (since I was working in an environment where counting the number of lines from 'show-ref' was more cumbersome than teaching 'show-ref' how to perform the same task itself). And I stand by that value judgement, but sharing the patches with the Git mailing list under the intent to merge it in was wrong. Those patches were too niche to be more generally useful, and would only serve to clutter up the UI of show-ref for everybody else. So I was glad to drop that topic. Now, I'd be curious to hear from Teng whether or not there *is* something that we're missing, since if so, I definitely want to know what it is. But absent of that, I tend to agree with Ævar that I'm not compelled by replacing 'ls-tree | grep <pattern>' with 'ls-tree --pattern=<pattern>', especially if the latter is slower than the former. Thanks, Taylor [1]: https://lore.kernel.org/git/cover.1654552560.git.me@ttaylorr.com/ ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFC PATCH 0/6] ls-tree: introduce '--pattern' option 2022-11-17 22:02 ` Taylor Blau @ 2022-11-21 11:41 ` Teng Long 2022-11-21 12:12 ` Ævar Arnfjörð Bjarmason 0 siblings, 1 reply; 34+ messages in thread From: Teng Long @ 2022-11-21 11:41 UTC (permalink / raw) To: me; +Cc: avarab, dyroneteng, git, tenglong.tl Taylor Blau <me@ttaylorr.com> writes: > I think this falls into the same trap as the series on 'git show-ref > --count' that I worked on earlier this year [1]. > > At the time, it seemed useful to me (since I was working in an > environment where counting the number of lines from 'show-ref' was more > cumbersome than teaching 'show-ref' how to perform the same task > itself). > > And I stand by that value judgement, but sharing the patches with the > Git mailing list under the intent to merge it in was wrong. Those > patches were too niche to be more generally useful, and would only serve > to clutter up the UI of show-ref for everybody else. > > So I was glad to drop that topic. Now, I'd be curious to hear from Teng > whether or not there *is* something that we're missing, since if so, I > definitely want to know what it is. > > But absent of that, I tend to agree with Ævar that I'm not compelled by > replacing 'ls-tree | grep <pattern>' with 'ls-tree --pattern=<pattern>', > especially if the latter is slower than the former. > > Thanks, > Taylor > > [1]: https://lore.kernel.org/git/cover.1654552560.git.me@ttaylorr.com/ honestly, I just think it's useful to me, but omit the performance recession of the option. I originally thought about it looks like the "git tag -l <pattern>" or "git branch -l <pattern>" usage, but it seems not as a regex matching on them and it indeed executes faster than the pipe grep, because it seems like the former has the more restrictive matching conditions (because if I move the last aster, there is no output): ✗ git branch -r --list "avar*" | wc -l 1498 ✗ hyperfine 'git branch -r --list "avar*"' Benchmark 1: git branch -r --list "avar*" Time (mean ± σ): 69.8 ms ± 3.1 ms [User: 25.8 ms, System: 42.7 ms] Range (min … max): 66.6 ms … 81.8 ms 35 runs ✗ hyperfine 'git branch -r --list | grep "avar"' Benchmark 1: git branch -r --list | grep "avar" Time (mean ± σ): 76.4 ms ± 3.7 ms [User: 32.7 ms, System: 45.2 ms] Range (min … max): 72.9 ms … 85.5 ms 34 runs ➜ Documentation git:(tl/extra_bitmap_relative_path) ✗ git branch -r --list "avar" | wc -l 0 So unless someone finds other benefits, such as the implementation of "git tag -l <pattern>" to solve the performance problem, this method can bring benefits in the corresponding scenario. It's ok for me to continue or stop this patch. Thanks. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [RFC PATCH 0/6] ls-tree: introduce '--pattern' option 2022-11-21 11:41 ` Teng Long @ 2022-11-21 12:12 ` Ævar Arnfjörð Bjarmason 0 siblings, 0 replies; 34+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-11-21 12:12 UTC (permalink / raw) To: Teng Long; +Cc: me, git, tenglong.tl On Mon, Nov 21 2022, Teng Long wrote: > Taylor Blau <me@ttaylorr.com> writes: > >> I think this falls into the same trap as the series on 'git show-ref >> --count' that I worked on earlier this year [1]. >> >> At the time, it seemed useful to me (since I was working in an >> environment where counting the number of lines from 'show-ref' was more >> cumbersome than teaching 'show-ref' how to perform the same task >> itself). >> >> And I stand by that value judgement, but sharing the patches with the >> Git mailing list under the intent to merge it in was wrong. Those >> patches were too niche to be more generally useful, and would only serve >> to clutter up the UI of show-ref for everybody else. >> >> So I was glad to drop that topic. Now, I'd be curious to hear from Teng >> whether or not there *is* something that we're missing, since if so, I >> definitely want to know what it is. >> >> But absent of that, I tend to agree with Ævar that I'm not compelled by >> replacing 'ls-tree | grep <pattern>' with 'ls-tree --pattern=<pattern>', >> especially if the latter is slower than the former. >> >> Thanks, >> Taylor >> >> [1]: https://lore.kernel.org/git/cover.1654552560.git.me@ttaylorr.com/ > > honestly, I just think it's useful to me, but omit the performance recession of > the option. I originally thought about it looks like the "git tag -l <pattern>" > or "git branch -l <pattern>" usage, but it seems not as a regex matching on > them and it indeed executes faster than the pipe grep, because it seems like the > former has the more restrictive matching conditions (because if I move the > last aster, there is no output): > > > ✗ git branch -r --list "avar*" | wc -l > 1498 > > ✗ hyperfine 'git branch -r --list "avar*"' > Benchmark 1: git branch -r --list "avar*" > Time (mean ± σ): 69.8 ms ± 3.1 ms [User: 25.8 ms, System: 42.7 ms] > Range (min … max): 66.6 ms … 81.8 ms 35 runs > > ✗ hyperfine 'git branch -r --list | grep "avar"' > Benchmark 1: git branch -r --list | grep "avar" > Time (mean ± σ): 76.4 ms ± 3.7 ms [User: 32.7 ms, System: 45.2 ms] > Range (min … max): 72.9 ms … 85.5 ms 34 runs > > ➜ Documentation git:(tl/extra_bitmap_relative_path) ✗ git branch -r --list "avar" | wc -l > 0 Yeah, that's the built-in wildmatch() (as in wildmatch.c in-tree)-powered matching we do, and support in pretty much anything that takes a <path>. I *thought* this feature was something like that that when I first glanced at it, i.e. to regex match ls-tree entries, but e.g. a --pattern=abc would still match that in the OID, as it's just grepping the line. Which I think is one way to draw the "does this belong in git?" line. I.e. a "grep" or your "--pattern" doesn't need to know anything about sub-components of the line to be equivalent, whereas something that just matches the path does. Also, when you get into e.g. negative pathspecs and the like it becomes even more compelling. I do have some WIP patches somewhere to have our pathspecs support PCRE regexes. So I think it would be neat, or at least worth exploring. But just having ls-tree or some random command support it isn't the righ way to go about it IMO, it should be done via the pathspec API, if done at all. ^ permalink raw reply [flat|nested] 34+ messages in thread
* [RFC PATCH 0/4] ls-tree: pass state in struct, not globals 2022-11-17 11:30 [RFC PATCH 0/6] ls-tree: introduce '--pattern' option Teng Long ` (6 preceding siblings ...) 2022-11-17 13:22 ` [RFC PATCH 0/6] " Ævar Arnfjörð Bjarmason @ 2022-11-17 13:48 ` Ævar Arnfjörð Bjarmason 2022-11-17 13:48 ` [RFC PATCH 1/4] ls-tree: don't use "show_tree_data" for "fast" callbacks Ævar Arnfjörð Bjarmason ` (4 more replies) 7 siblings, 5 replies; 34+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-11-17 13:48 UTC (permalink / raw) To: git; +Cc: Taylor Blau, Teng Long, Ævar Arnfjörð Bjarmason These are patches I've been carrying locally since April-ish, as a follow-up to the "ls-tree --format" topic. I'm submitting them here in reply to Teng's parallel RFC[1]. Teng: This conflicts with your topic, but re my suggestion of submitting a separate clean-up series in [2] maybe you could look this over, see how they differ from yours, and see what would make sense to keep/incorporate for such a clean-up series? E.g. 1/4 here is the opposite approach of your 3/6[3], but as 3/4 eventually shows we don't need that struct for anything except that callback case. 1. https://lore.kernel.org/git/20221117113023.65865-1-tenglong.tl@alibaba-inc.com/ 2. https://lore.kernel.org/git/221117.86k03tiudl.gmgdl@evledraar.gmail.com/ 3. https://lore.kernel.org/git/20221117113023.65865-4-tenglong.tl@alibaba-inc.com/ Ævar Arnfjörð Bjarmason (4): ls-tree: don't use "show_tree_data" for "fast" callbacks ls-tree: use a "struct options" ls-tree: fold "show_tree_data" into "cb" struct ls-tree: make "line_termination" less generic builtin/ls-tree.c | 255 +++++++++++++++++++++++++++------------------- 1 file changed, 149 insertions(+), 106 deletions(-) -- 2.38.0.1473.g172bcc0511c ^ permalink raw reply [flat|nested] 34+ messages in thread
* [RFC PATCH 1/4] ls-tree: don't use "show_tree_data" for "fast" callbacks 2022-11-17 13:48 ` [RFC PATCH 0/4] ls-tree: pass state in struct, not globals Ævar Arnfjörð Bjarmason @ 2022-11-17 13:48 ` Ævar Arnfjörð Bjarmason 2022-12-21 11:47 ` Teng Long 2022-11-17 13:48 ` [RFC PATCH 2/4] ls-tree: use a "struct options" Ævar Arnfjörð Bjarmason ` (3 subsequent siblings) 4 siblings, 1 reply; 34+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-11-17 13:48 UTC (permalink / raw) To: git; +Cc: Taylor Blau, Teng Long, Ævar Arnfjörð Bjarmason As noted in [1] the code that made it in as part of 9c4d58ff2c3 (ls-tree: split up "fast path" callbacks, 2022-03-23) was a "maybe a good idea, maybe not" RFC-quality patch. I hadn't looked very carefully at the resulting patterns. The implementation shared the "struct show_tree_data data", which was introduced in e81517155e0 (ls-tree: introduce struct "show_tree_data", 2022-03-23) both for use in 455923e0a15 (ls-tree: introduce "--format" option, 2022-03-23), and because the "fat" callback hadn't been split up as 9c4d58ff2c3 did. Now that that's been done we can see that most of what show_tree_common() was doing could be done lazily by the callbacks themselves, who in the pre-image were often using an odd mis-match of their own arguments and those same arguments stuck into the "data" structure. Let's also have the callers initialize the "type", rather than grabbing it from the "data" structure afterwards. 1. https://lore.kernel.org/git/cover-0.7-00000000000-20220310T134811Z-avarab@gmail.com/ --- builtin/ls-tree.c | 44 ++++++++++++++++++-------------------------- 1 file changed, 18 insertions(+), 26 deletions(-) diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c index c3ea09281af..cbb6782f9a5 100644 --- a/builtin/ls-tree.c +++ b/builtin/ls-tree.c @@ -173,19 +173,11 @@ static int show_tree_fmt(const struct object_id *oid, struct strbuf *base, return recurse; } -static int show_tree_common(struct show_tree_data *data, int *recurse, - const struct object_id *oid, struct strbuf *base, - const char *pathname, unsigned mode) +static int show_tree_common(int *recurse, struct strbuf *base, + const char *pathname, enum object_type type) { - enum object_type type = object_type(mode); int ret = -1; - *recurse = 0; - data->mode = mode; - data->type = type; - data->oid = oid; - data->pathname = pathname; - data->base = base; if (type == OBJ_BLOB) { if (ls_options & LS_TREE_ONLY) @@ -217,15 +209,15 @@ static int show_tree_default(const struct object_id *oid, struct strbuf *base, { int early; int recurse; - struct show_tree_data data = { 0 }; + enum object_type type = object_type(mode); - early = show_tree_common(&data, &recurse, oid, base, pathname, mode); + early = show_tree_common(&recurse, base, pathname, type); if (early >= 0) return early; - printf("%06o %s %s\t", data.mode, type_name(data.type), - find_unique_abbrev(data.oid, abbrev)); - show_tree_common_default_long(base, pathname, data.base->len); + printf("%06o %s %s\t", mode, type_name(object_type(mode)), + find_unique_abbrev(oid, abbrev)); + show_tree_common_default_long(base, pathname, base->len); return recurse; } @@ -235,16 +227,16 @@ static int show_tree_long(const struct object_id *oid, struct strbuf *base, { int early; int recurse; - struct show_tree_data data = { 0 }; char size_text[24]; + enum object_type type = object_type(mode); - early = show_tree_common(&data, &recurse, oid, base, pathname, mode); + early = show_tree_common(&recurse, base, pathname, type); if (early >= 0) return early; - if (data.type == OBJ_BLOB) { + if (type == OBJ_BLOB) { unsigned long size; - if (oid_object_info(the_repository, data.oid, &size) == OBJ_BAD) + if (oid_object_info(the_repository, oid, &size) == OBJ_BAD) xsnprintf(size_text, sizeof(size_text), "BAD"); else xsnprintf(size_text, sizeof(size_text), @@ -253,9 +245,9 @@ static int show_tree_long(const struct object_id *oid, struct strbuf *base, xsnprintf(size_text, sizeof(size_text), "-"); } - printf("%06o %s %s %7s\t", data.mode, type_name(data.type), - find_unique_abbrev(data.oid, abbrev), size_text); - show_tree_common_default_long(base, pathname, data.base->len); + printf("%06o %s %s %7s\t", mode, type_name(type), + find_unique_abbrev(oid, abbrev), size_text); + show_tree_common_default_long(base, pathname, base->len); return recurse; } @@ -266,9 +258,9 @@ static int show_tree_name_only(const struct object_id *oid, struct strbuf *base, int early; int recurse; const size_t baselen = base->len; - struct show_tree_data data = { 0 }; + enum object_type type = object_type(mode); - early = show_tree_common(&data, &recurse, oid, base, pathname, mode); + early = show_tree_common(&recurse, base, pathname, type); if (early >= 0) return early; @@ -286,9 +278,9 @@ static int show_tree_object(const struct object_id *oid, struct strbuf *base, { int early; int recurse; - struct show_tree_data data = { 0 }; + enum object_type type = object_type(mode); - early = show_tree_common(&data, &recurse, oid, base, pathname, mode); + early = show_tree_common(&recurse, base, pathname, type); if (early >= 0) return early; -- 2.38.0.1473.g172bcc0511c ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [RFC PATCH 1/4] ls-tree: don't use "show_tree_data" for "fast" callbacks 2022-11-17 13:48 ` [RFC PATCH 1/4] ls-tree: don't use "show_tree_data" for "fast" callbacks Ævar Arnfjörð Bjarmason @ 2022-12-21 11:47 ` Teng Long 0 siblings, 0 replies; 34+ messages in thread From: Teng Long @ 2022-12-21 11:47 UTC (permalink / raw) To: avarab; +Cc: dyroneteng, git, me "Ævar Arnfjörð Bjarmason" <avarab@gmail.com> writes: > As noted in [1] the code that made it in as part of > 9c4d58ff2c3 (ls-tree: split up "fast path" callbacks, 2022-03-23) was > a "maybe a good idea, maybe not" RFC-quality patch. I hadn't looked > very carefully at the resulting patterns. > > The implementation shared the "struct show_tree_data data", which was > introduced in e81517155e0 (ls-tree: introduce struct "show_tree_data", > 2022-03-23) both for use in 455923e0a15 (ls-tree: introduce "--format" > option, 2022-03-23), and because the "fat" callback hadn't been split > up as 9c4d58ff2c3 did. > > Now that that's been done we can see that most of what > show_tree_common() was doing could be done lazily by the callbacks > themselves, who in the pre-image were often using an odd mis-match of > their own arguments and those same arguments stuck into the "data" > structure. Let's also have the callers initialize the "type", rather > than grabbing it from the "data" structure afterwards. > > 1. https://lore.kernel.org/git/cover-0.7-00000000000-20220310T134811Z-avarab@gmail.com/ Because in "show_tree_common(&data, &recurse, oid, base, pathname, mode)", the "data" and the other args exist redundant, we could just not to pass "data", because it's enough, do I understand right? Thanks. ^ permalink raw reply [flat|nested] 34+ messages in thread
* [RFC PATCH 2/4] ls-tree: use a "struct options" 2022-11-17 13:48 ` [RFC PATCH 0/4] ls-tree: pass state in struct, not globals Ævar Arnfjörð Bjarmason 2022-11-17 13:48 ` [RFC PATCH 1/4] ls-tree: don't use "show_tree_data" for "fast" callbacks Ævar Arnfjörð Bjarmason @ 2022-11-17 13:48 ` Ævar Arnfjörð Bjarmason 2022-11-17 13:48 ` [RFC PATCH 3/4] ls-tree: fold "show_tree_data" into "cb" struct Ævar Arnfjörð Bjarmason ` (2 subsequent siblings) 4 siblings, 0 replies; 34+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-11-17 13:48 UTC (permalink / raw) To: git; +Cc: Taylor Blau, Teng Long, Ævar Arnfjörð Bjarmason As a first step towards being able to turn this code into an API some day let's change the "static" options in builtin/ls-tree.c into a "struct ls_tree_options" that can be constructed dynamically without the help of parse_options(). Because we're now using non-static variables for this we'll need to clear_pathspec() at the end of cmd_ls_tree(), least various tests start failing under SANITIZE=leak. The memory leak was already there before, now it's just being brought to the surface. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/ls-tree.c | 203 ++++++++++++++++++++++++++-------------------- 1 file changed, 116 insertions(+), 87 deletions(-) diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c index cbb6782f9a5..54f7b604cb7 100644 --- a/builtin/ls-tree.c +++ b/builtin/ls-tree.c @@ -14,37 +14,11 @@ #include "parse-options.h" #include "pathspec.h" -static int line_termination = '\n'; -#define LS_RECURSIVE 1 -#define LS_TREE_ONLY (1 << 1) -#define LS_SHOW_TREES (1 << 2) -static int abbrev; -static int ls_options; -static struct pathspec pathspec; -static int chomp_prefix; -static const char *ls_tree_prefix; -static const char *format; -struct show_tree_data { - unsigned mode; - enum object_type type; - const struct object_id *oid; - const char *pathname; - struct strbuf *base; -}; - static const char * const ls_tree_usage[] = { N_("git ls-tree [<options>] <tree-ish> [<path>...]"), NULL }; -static enum ls_tree_cmdmode { - MODE_DEFAULT = 0, - MODE_LONG, - MODE_NAME_ONLY, - MODE_NAME_STATUS, - MODE_OBJECT_ONLY, -} cmdmode; - static void expand_objectsize(struct strbuf *line, const struct object_id *oid, const enum object_type type, unsigned int padded) { @@ -64,10 +38,39 @@ static void expand_objectsize(struct strbuf *line, const struct object_id *oid, } } +struct ls_tree_options { + int line_termination; + int abbrev; + enum ls_tree_path_options { + LS_RECURSIVE = 1 << 0, + LS_TREE_ONLY = 1 << 1, + LS_SHOW_TREES = 1 << 2, + } ls_options; + struct pathspec pathspec; + int chomp_prefix; + const char *ls_tree_prefix; + const char *format; +}; + +struct show_tree_data { + unsigned mode; + enum object_type type; + const struct object_id *oid; + const char *pathname; + struct strbuf *base; +}; + +struct show_tree_data_cb { + struct ls_tree_options *options; + struct show_tree_data *data; +}; + static size_t expand_show_tree(struct strbuf *sb, const char *start, void *context) { - struct show_tree_data *data = context; + struct show_tree_data_cb *wrapper = context; + struct ls_tree_options *options = wrapper->options; + struct show_tree_data *data = wrapper->data; const char *end; const char *p; unsigned int errlen; @@ -92,10 +95,10 @@ static size_t expand_show_tree(struct strbuf *sb, const char *start, } else if (skip_prefix(start, "(objectsize)", &p)) { expand_objectsize(sb, data->oid, data->type, 0); } else if (skip_prefix(start, "(objectname)", &p)) { - strbuf_add_unique_abbrev(sb, data->oid, abbrev); + strbuf_add_unique_abbrev(sb, data->oid, options->abbrev); } else if (skip_prefix(start, "(path)", &p)) { const char *name = data->base->buf; - const char *prefix = chomp_prefix ? ls_tree_prefix : NULL; + const char *prefix = options->chomp_prefix ? options->ls_tree_prefix : NULL; struct strbuf quoted = STRBUF_INIT; struct strbuf sbuf = STRBUF_INIT; strbuf_addstr(data->base, data->pathname); @@ -111,18 +114,19 @@ static size_t expand_show_tree(struct strbuf *sb, const char *start, return len; } -static int show_recursive(const char *base, size_t baselen, const char *pathname) +static int show_recursive(struct ls_tree_options *options, const char *base, + size_t baselen, const char *pathname) { int i; - if (ls_options & LS_RECURSIVE) + if (options->ls_options & LS_RECURSIVE) return 1; - if (!pathspec.nr) + if (!options->pathspec.nr) return 0; - for (i = 0; i < pathspec.nr; i++) { - const char *spec = pathspec.items[i].match; + for (i = 0; i < options->pathspec.nr; i++) { + const char *spec = options->pathspec.items[i].match; size_t len, speclen; if (strncmp(base, spec, baselen)) @@ -142,13 +146,13 @@ static int show_recursive(const char *base, size_t baselen, const char *pathname } static int show_tree_fmt(const struct object_id *oid, struct strbuf *base, - const char *pathname, unsigned mode, void *context UNUSED) + const char *pathname, unsigned mode, void *context) { + struct ls_tree_options *options = context; size_t baselen; int recurse = 0; struct strbuf sb = STRBUF_INIT; enum object_type type = object_type(mode); - struct show_tree_data data = { .mode = mode, .type = type, @@ -156,81 +160,89 @@ static int show_tree_fmt(const struct object_id *oid, struct strbuf *base, .pathname = pathname, .base = base, }; + struct show_tree_data_cb cb_data = { + .data = &data, + .options = options, + }; - if (type == OBJ_TREE && show_recursive(base->buf, base->len, pathname)) + if (type == OBJ_TREE && show_recursive(options, base->buf, base->len, pathname)) recurse = READ_TREE_RECURSIVE; - if (type == OBJ_TREE && recurse && !(ls_options & LS_SHOW_TREES)) + if (type == OBJ_TREE && recurse && !(options->ls_options & LS_SHOW_TREES)) return recurse; - if (type == OBJ_BLOB && (ls_options & LS_TREE_ONLY)) + if (type == OBJ_BLOB && (options->ls_options & LS_TREE_ONLY)) return 0; baselen = base->len; - strbuf_expand(&sb, format, expand_show_tree, &data); - strbuf_addch(&sb, line_termination); + strbuf_expand(&sb, options->format, expand_show_tree, &cb_data); + strbuf_addch(&sb, options->line_termination); fwrite(sb.buf, sb.len, 1, stdout); strbuf_release(&sb); strbuf_setlen(base, baselen); return recurse; } -static int show_tree_common(int *recurse, struct strbuf *base, - const char *pathname, enum object_type type) +static int show_tree_common(struct ls_tree_options *options, int *recurse, + struct strbuf *base, const char *pathname, + enum object_type type) { int ret = -1; *recurse = 0; if (type == OBJ_BLOB) { - if (ls_options & LS_TREE_ONLY) + if (options->ls_options & LS_TREE_ONLY) ret = 0; } else if (type == OBJ_TREE && - show_recursive(base->buf, base->len, pathname)) { + show_recursive(options, base->buf, base->len, pathname)) { *recurse = READ_TREE_RECURSIVE; - if (!(ls_options & LS_SHOW_TREES)) + if (!(options->ls_options & LS_SHOW_TREES)) ret = *recurse; } return ret; } -static void show_tree_common_default_long(struct strbuf *base, +static void show_tree_common_default_long(struct ls_tree_options *options, + struct strbuf *base, const char *pathname, const size_t baselen) { strbuf_addstr(base, pathname); write_name_quoted_relative(base->buf, - chomp_prefix ? ls_tree_prefix : NULL, stdout, - line_termination); + options->chomp_prefix ? options->ls_tree_prefix : NULL, stdout, + options->line_termination); strbuf_setlen(base, baselen); } static int show_tree_default(const struct object_id *oid, struct strbuf *base, const char *pathname, unsigned mode, - void *context UNUSED) + void *context) { + struct ls_tree_options *options = context; int early; int recurse; enum object_type type = object_type(mode); - early = show_tree_common(&recurse, base, pathname, type); + early = show_tree_common(options, &recurse, base, pathname, type); if (early >= 0) return early; printf("%06o %s %s\t", mode, type_name(object_type(mode)), - find_unique_abbrev(oid, abbrev)); - show_tree_common_default_long(base, pathname, base->len); + find_unique_abbrev(oid, options->abbrev)); + show_tree_common_default_long(options, base, pathname, base->len); return recurse; } static int show_tree_long(const struct object_id *oid, struct strbuf *base, const char *pathname, unsigned mode, - void *context UNUSED) + void *context) { + struct ls_tree_options *options = context; int early; int recurse; char size_text[24]; enum object_type type = object_type(mode); - early = show_tree_common(&recurse, base, pathname, type); + early = show_tree_common(options, &recurse, base, pathname, type); if (early >= 0) return early; @@ -246,48 +258,58 @@ static int show_tree_long(const struct object_id *oid, struct strbuf *base, } printf("%06o %s %s %7s\t", mode, type_name(type), - find_unique_abbrev(oid, abbrev), size_text); - show_tree_common_default_long(base, pathname, base->len); + find_unique_abbrev(oid, options->abbrev), size_text); + show_tree_common_default_long(options, base, pathname, base->len); return recurse; } static int show_tree_name_only(const struct object_id *oid, struct strbuf *base, const char *pathname, unsigned mode, - void *context UNUSED) + void *context) { + struct ls_tree_options *options = context; int early; int recurse; const size_t baselen = base->len; enum object_type type = object_type(mode); - early = show_tree_common(&recurse, base, pathname, type); + early = show_tree_common(options, &recurse, base, pathname, type); if (early >= 0) return early; strbuf_addstr(base, pathname); write_name_quoted_relative(base->buf, - chomp_prefix ? ls_tree_prefix : NULL, - stdout, line_termination); + options->chomp_prefix ? options->ls_tree_prefix : NULL, + stdout, options->line_termination); strbuf_setlen(base, baselen); return recurse; } static int show_tree_object(const struct object_id *oid, struct strbuf *base, const char *pathname, unsigned mode, - void *context UNUSED) + void *context) { + struct ls_tree_options *options = context; int early; int recurse; enum object_type type = object_type(mode); - early = show_tree_common(&recurse, base, pathname, type); + early = show_tree_common(options, &recurse, base, pathname, type); if (early >= 0) return early; - printf("%s%c", find_unique_abbrev(oid, abbrev), line_termination); + printf("%s%c", find_unique_abbrev(oid, options->abbrev), options->line_termination); return recurse; } +enum ls_tree_cmdmode { + MODE_DEFAULT = 0, + MODE_LONG, + MODE_NAME_ONLY, + MODE_NAME_STATUS, + MODE_OBJECT_ONLY, +}; + struct ls_tree_cmdmode_to_fmt { enum ls_tree_cmdmode mode; const char *const fmt; @@ -327,14 +349,18 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix) struct tree *tree; int i, full_tree = 0; read_tree_fn_t fn = NULL; + enum ls_tree_cmdmode cmdmode = MODE_DEFAULT; + struct ls_tree_options options = { + .line_termination = '\n', + }; const struct option ls_tree_options[] = { - OPT_BIT('d', NULL, &ls_options, N_("only show trees"), + OPT_BIT('d', NULL, &options.ls_options, N_("only show trees"), LS_TREE_ONLY), - OPT_BIT('r', NULL, &ls_options, N_("recurse into subtrees"), + OPT_BIT('r', NULL, &options.ls_options, N_("recurse into subtrees"), LS_RECURSIVE), - OPT_BIT('t', NULL, &ls_options, N_("show trees when recursing"), + OPT_BIT('t', NULL, &options.ls_options, N_("show trees when recursing"), LS_SHOW_TREES), - OPT_SET_INT('z', NULL, &line_termination, + OPT_SET_INT('z', NULL, &options.line_termination, N_("terminate entries with NUL byte"), 0), OPT_CMDMODE('l', "long", &cmdmode, N_("include object size"), MODE_LONG), @@ -344,29 +370,30 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix) MODE_NAME_STATUS), OPT_CMDMODE(0, "object-only", &cmdmode, N_("list only objects"), MODE_OBJECT_ONLY), - OPT_SET_INT(0, "full-name", &chomp_prefix, + OPT_SET_INT(0, "full-name", &options.chomp_prefix, N_("use full path names"), 0), OPT_BOOL(0, "full-tree", &full_tree, N_("list entire tree; not just current directory " "(implies --full-name)")), - OPT_STRING_F(0, "format", &format, N_("format"), + OPT_STRING_F(0, "format", &options.format, N_("format"), N_("format to use for the output"), PARSE_OPT_NONEG), - OPT__ABBREV(&abbrev), + OPT__ABBREV(&options.abbrev), OPT_END() }; struct ls_tree_cmdmode_to_fmt *m2f = ls_tree_cmdmode_format; + int ret; git_config(git_default_config, NULL); - ls_tree_prefix = prefix; + options.ls_tree_prefix = prefix; if (prefix) - chomp_prefix = strlen(prefix); + options.chomp_prefix = strlen(prefix); argc = parse_options(argc, argv, prefix, ls_tree_options, ls_tree_usage, 0); if (full_tree) { - ls_tree_prefix = prefix = NULL; - chomp_prefix = 0; + options.ls_tree_prefix = prefix = NULL; + options.chomp_prefix = 0; } /* * We wanted to detect conflicts between --name-only and @@ -378,10 +405,10 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix) /* -d -r should imply -t, but -d by itself should not have to. */ if ( (LS_TREE_ONLY|LS_RECURSIVE) == - ((LS_TREE_ONLY|LS_RECURSIVE) & ls_options)) - ls_options |= LS_SHOW_TREES; + ((LS_TREE_ONLY|LS_RECURSIVE) & options.ls_options)) + options.ls_options |= LS_SHOW_TREES; - if (format && cmdmode) + if (options.format && cmdmode) usage_msg_opt( _("--format can't be combined with other format-altering options"), ls_tree_usage, ls_tree_options); @@ -396,13 +423,13 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix) * cannot be lifted until it is converted to use * match_pathspec() or tree_entry_interesting() */ - parse_pathspec(&pathspec, PATHSPEC_ALL_MAGIC & - ~(PATHSPEC_FROMTOP | PATHSPEC_LITERAL), + parse_pathspec(&options.pathspec, PATHSPEC_ALL_MAGIC & + ~(PATHSPEC_FROMTOP | PATHSPEC_LITERAL), PATHSPEC_PREFER_CWD, prefix, argv + 1); - for (i = 0; i < pathspec.nr; i++) - pathspec.items[i].nowildcard_len = pathspec.items[i].len; - pathspec.has_wildcard = 0; + for (i = 0; i < options.pathspec.nr; i++) + options.pathspec.items[i].nowildcard_len = options.pathspec.items[i].len; + options.pathspec.has_wildcard = 0; tree = parse_tree_indirect(&oid); if (!tree) die("not a tree object"); @@ -412,11 +439,11 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix) */ while (m2f) { if (!m2f->fmt) { - fn = format ? show_tree_fmt : show_tree_default; - } else if (format && !strcmp(format, m2f->fmt)) { + fn = options.format ? show_tree_fmt : show_tree_default; + } else if (options.format && !strcmp(options.format, m2f->fmt)) { cmdmode = m2f->mode; fn = m2f->fn; - } else if (!format && cmdmode == m2f->mode) { + } else if (!options.format && cmdmode == m2f->mode) { fn = m2f->fn; } else { m2f++; @@ -425,5 +452,7 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix) break; } - return !!read_tree(the_repository, tree, &pathspec, fn, NULL); + ret = !!read_tree(the_repository, tree, &options.pathspec, fn, &options); + clear_pathspec(&options.pathspec); + return ret; } -- 2.38.0.1473.g172bcc0511c ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [RFC PATCH 3/4] ls-tree: fold "show_tree_data" into "cb" struct 2022-11-17 13:48 ` [RFC PATCH 0/4] ls-tree: pass state in struct, not globals Ævar Arnfjörð Bjarmason 2022-11-17 13:48 ` [RFC PATCH 1/4] ls-tree: don't use "show_tree_data" for "fast" callbacks Ævar Arnfjörð Bjarmason 2022-11-17 13:48 ` [RFC PATCH 2/4] ls-tree: use a "struct options" Ævar Arnfjörð Bjarmason @ 2022-11-17 13:48 ` Ævar Arnfjörð Bjarmason 2022-11-17 13:48 ` [RFC PATCH 4/4] ls-tree: make "line_termination" less generic Ævar Arnfjörð Bjarmason 2022-11-21 12:00 ` [RFC PATCH 0/4] ls-tree: pass state in struct, not globals Teng Long 4 siblings, 0 replies; 34+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-11-17 13:48 UTC (permalink / raw) To: git; +Cc: Taylor Blau, Teng Long, Ævar Arnfjörð Bjarmason After the the preceding two commits the only user of the "show_tree_data" struct needed it along with the "options" member, let's instead fold all of that into a "show_tree_data" struct that we'll use only for that callback. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/ls-tree.c | 18 +++++------------- 1 file changed, 5 insertions(+), 13 deletions(-) diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c index 54f7b604cb7..da664eecfb9 100644 --- a/builtin/ls-tree.c +++ b/builtin/ls-tree.c @@ -53,6 +53,7 @@ struct ls_tree_options { }; struct show_tree_data { + struct ls_tree_options *options; unsigned mode; enum object_type type; const struct object_id *oid; @@ -60,17 +61,11 @@ struct show_tree_data { struct strbuf *base; }; -struct show_tree_data_cb { - struct ls_tree_options *options; - struct show_tree_data *data; -}; - static size_t expand_show_tree(struct strbuf *sb, const char *start, void *context) { - struct show_tree_data_cb *wrapper = context; - struct ls_tree_options *options = wrapper->options; - struct show_tree_data *data = wrapper->data; + struct show_tree_data *data = context; + struct ls_tree_options *options = data->options; const char *end; const char *p; unsigned int errlen; @@ -153,17 +148,14 @@ static int show_tree_fmt(const struct object_id *oid, struct strbuf *base, int recurse = 0; struct strbuf sb = STRBUF_INIT; enum object_type type = object_type(mode); - struct show_tree_data data = { + struct show_tree_data cb_data = { + .options = options, .mode = mode, .type = type, .oid = oid, .pathname = pathname, .base = base, }; - struct show_tree_data_cb cb_data = { - .data = &data, - .options = options, - }; if (type == OBJ_TREE && show_recursive(options, base->buf, base->len, pathname)) recurse = READ_TREE_RECURSIVE; -- 2.38.0.1473.g172bcc0511c ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [RFC PATCH 4/4] ls-tree: make "line_termination" less generic 2022-11-17 13:48 ` [RFC PATCH 0/4] ls-tree: pass state in struct, not globals Ævar Arnfjörð Bjarmason ` (2 preceding siblings ...) 2022-11-17 13:48 ` [RFC PATCH 3/4] ls-tree: fold "show_tree_data" into "cb" struct Ævar Arnfjörð Bjarmason @ 2022-11-17 13:48 ` Ævar Arnfjörð Bjarmason 2022-11-21 12:00 ` [RFC PATCH 0/4] ls-tree: pass state in struct, not globals Teng Long 4 siblings, 0 replies; 34+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-11-17 13:48 UTC (permalink / raw) To: git; +Cc: Taylor Blau, Teng Long, Ævar Arnfjörð Bjarmason The "ls-tree" command isn't capable of ending "lines" with anything except '\n' or '\0', and in the latter case we can avoid calling write_name_quoted_relative() entirely. Let's do that, less for optimization and more for clarity, the write_name_quoted_relative() API itself does much the same thing. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/ls-tree.c | 58 +++++++++++++++++++++++++++++++++++------------ 1 file changed, 44 insertions(+), 14 deletions(-) diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c index da664eecfb9..a743959f2b3 100644 --- a/builtin/ls-tree.c +++ b/builtin/ls-tree.c @@ -39,7 +39,7 @@ static void expand_objectsize(struct strbuf *line, const struct object_id *oid, } struct ls_tree_options { - int line_termination; + unsigned null_termination:1; int abbrev; enum ls_tree_path_options { LS_RECURSIVE = 1 << 0, @@ -166,7 +166,7 @@ static int show_tree_fmt(const struct object_id *oid, struct strbuf *base, baselen = base->len; strbuf_expand(&sb, options->format, expand_show_tree, &cb_data); - strbuf_addch(&sb, options->line_termination); + strbuf_addch(&sb, options->null_termination ? '\0' : '\n'); fwrite(sb.buf, sb.len, 1, stdout); strbuf_release(&sb); strbuf_setlen(base, baselen); @@ -198,10 +198,22 @@ static void show_tree_common_default_long(struct ls_tree_options *options, const char *pathname, const size_t baselen) { + const char *prefix = options->chomp_prefix ? options->ls_tree_prefix : NULL; + strbuf_addstr(base, pathname); - write_name_quoted_relative(base->buf, - options->chomp_prefix ? options->ls_tree_prefix : NULL, stdout, - options->line_termination); + + if (options->null_termination) { + struct strbuf sb = STRBUF_INIT; + const char *name = relative_path(base->buf, prefix, &sb); + + fputs(name, stdout); + fputc('\0', stdout); + + strbuf_release(&sb); + } else { + write_name_quoted_relative(base->buf, prefix, stdout, '\n'); + } + strbuf_setlen(base, baselen); } @@ -264,15 +276,25 @@ static int show_tree_name_only(const struct object_id *oid, struct strbuf *base, int recurse; const size_t baselen = base->len; enum object_type type = object_type(mode); + const char *prefix; early = show_tree_common(options, &recurse, base, pathname, type); if (early >= 0) return early; + prefix = options->chomp_prefix ? options->ls_tree_prefix : NULL; strbuf_addstr(base, pathname); - write_name_quoted_relative(base->buf, - options->chomp_prefix ? options->ls_tree_prefix : NULL, - stdout, options->line_termination); + if (options->null_termination) { + struct strbuf sb = STRBUF_INIT; + const char *name = relative_path(base->buf, prefix, &sb); + + fputs(name, stdout); + fputc('\0', stdout); + + strbuf_release(&sb); + } else { + write_name_quoted_relative(base->buf, prefix, stdout, '\n'); + } strbuf_setlen(base, baselen); return recurse; } @@ -285,12 +307,19 @@ static int show_tree_object(const struct object_id *oid, struct strbuf *base, int early; int recurse; enum object_type type = object_type(mode); + const char *str; early = show_tree_common(options, &recurse, base, pathname, type); if (early >= 0) return early; - printf("%s%c", find_unique_abbrev(oid, options->abbrev), options->line_termination); + str = find_unique_abbrev(oid, options->abbrev); + if (options->null_termination) { + fputs(str, stdout); + fputc('\0', stdout); + } else { + puts(str); + } return recurse; } @@ -342,9 +371,8 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix) int i, full_tree = 0; read_tree_fn_t fn = NULL; enum ls_tree_cmdmode cmdmode = MODE_DEFAULT; - struct ls_tree_options options = { - .line_termination = '\n', - }; + int null_termination = 0; + struct ls_tree_options options = { 0 }; const struct option ls_tree_options[] = { OPT_BIT('d', NULL, &options.ls_options, N_("only show trees"), LS_TREE_ONLY), @@ -352,8 +380,8 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix) LS_RECURSIVE), OPT_BIT('t', NULL, &options.ls_options, N_("show trees when recursing"), LS_SHOW_TREES), - OPT_SET_INT('z', NULL, &options.line_termination, - N_("terminate entries with NUL byte"), 0), + OPT_BOOL('z', NULL, &null_termination, + N_("terminate entries with NUL byte")), OPT_CMDMODE('l', "long", &cmdmode, N_("include object size"), MODE_LONG), OPT_CMDMODE(0, "name-only", &cmdmode, N_("list only filenames"), @@ -383,6 +411,8 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix) argc = parse_options(argc, argv, prefix, ls_tree_options, ls_tree_usage, 0); + options.null_termination = null_termination; + if (full_tree) { options.ls_tree_prefix = prefix = NULL; options.chomp_prefix = 0; -- 2.38.0.1473.g172bcc0511c ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [RFC PATCH 0/4] ls-tree: pass state in struct, not globals 2022-11-17 13:48 ` [RFC PATCH 0/4] ls-tree: pass state in struct, not globals Ævar Arnfjörð Bjarmason ` (3 preceding siblings ...) 2022-11-17 13:48 ` [RFC PATCH 4/4] ls-tree: make "line_termination" less generic Ævar Arnfjörð Bjarmason @ 2022-11-21 12:00 ` Teng Long 4 siblings, 0 replies; 34+ messages in thread From: Teng Long @ 2022-11-21 12:00 UTC (permalink / raw) To: avarab; +Cc: dyroneteng, git, me "Ævar Arnfjörð Bjarmason" <avarab@gmail.com> write: > These are patches I've been carrying locally since April-ish, as a > follow-up to the "ls-tree --format" topic. Cool. > Teng: This conflicts with your topic, but re my suggestion of > submitting a separate clean-up series in [2] maybe you could look this > over, see how they differ from yours, and see what would make sense to > keep/incorporate for such a clean-up series? Yes, I'd like to. > E.g. 1/4 here is the opposite approach of your 3/6[3], but as 3/4 > eventually shows we don't need that struct for anything except that > callback case. Ok, I will check it out later. Thanks. ^ permalink raw reply [flat|nested] 34+ messages in thread
end of thread, other threads:[~2023-03-28 21:05 UTC | newest] Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-11-17 11:30 [RFC PATCH 0/6] ls-tree: introduce '--pattern' option Teng Long 2022-11-17 11:30 ` [RFC PATCH 1/6] ls-tree: cleanup the redundant SPACE Teng Long 2022-11-17 11:30 ` [RFC PATCH 2/6] t3104: remove shift code in 'test_ls_tree_format' Teng Long 2022-11-17 11:30 ` [RFC PATCH 3/6] ls-tree: optimize params of 'show_tree_common_default_long()' Teng Long 2022-11-17 11:30 ` [RFC PATCH 4/6] ls-tree: improving cohension in the print code Teng Long 2022-11-17 13:53 ` Ævar Arnfjörð Bjarmason 2022-11-17 11:30 ` [RFC PATCH 5/6] ls-tree: introduce 'match_pattern()' function Teng Long 2022-11-17 14:02 ` Ævar Arnfjörð Bjarmason 2022-11-30 9:39 ` Ævar Arnfjörð Bjarmason 2022-11-17 11:30 ` [RFC PATCH 6/6] ls-tree: introduce '--pattern' option Teng Long 2022-11-17 14:03 ` Ævar Arnfjörð Bjarmason 2022-12-12 8:32 ` Johannes Schindelin 2022-12-12 23:57 ` Junio C Hamano 2022-12-14 5:27 ` Junio C Hamano 2022-12-14 10:03 ` Ævar Arnfjörð Bjarmason 2022-12-14 10:38 ` Junio C Hamano 2023-03-27 10:37 ` win-test: unknown terminal "xterm-256color", was " Johannes Schindelin 2023-03-27 20:42 ` Junio C Hamano 2023-03-28 18:08 ` Jeff King 2023-03-28 19:31 ` Junio C Hamano 2023-03-28 19:59 ` Jeff King 2023-03-28 20:43 ` Jeff King 2023-03-28 21:05 ` Junio C Hamano 2022-11-17 13:22 ` [RFC PATCH 0/6] " Ævar Arnfjörð Bjarmason 2022-11-17 22:02 ` Taylor Blau 2022-11-21 11:41 ` Teng Long 2022-11-21 12:12 ` Ævar Arnfjörð Bjarmason 2022-11-17 13:48 ` [RFC PATCH 0/4] ls-tree: pass state in struct, not globals Ævar Arnfjörð Bjarmason 2022-11-17 13:48 ` [RFC PATCH 1/4] ls-tree: don't use "show_tree_data" for "fast" callbacks Ævar Arnfjörð Bjarmason 2022-12-21 11:47 ` Teng Long 2022-11-17 13:48 ` [RFC PATCH 2/4] ls-tree: use a "struct options" Ævar Arnfjörð Bjarmason 2022-11-17 13:48 ` [RFC PATCH 3/4] ls-tree: fold "show_tree_data" into "cb" struct Ævar Arnfjörð Bjarmason 2022-11-17 13:48 ` [RFC PATCH 4/4] ls-tree: make "line_termination" less generic Ævar Arnfjörð Bjarmason 2022-11-21 12:00 ` [RFC PATCH 0/4] ls-tree: pass state in struct, not globals Teng Long
Code repositories for project(s) associated with this public inbox https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).