On 2024-03-19 01:58, Junio C Hamano wrote: > Dragan Simic <dsimic@manjaro.org> writes: >> Add new configuration option grep.includeUntracked that enables >> --untracked >> option by default. This pretty much follows the logic established by >> the >> already existing configuration option grep.fallbackToNoIndex, while >> also >> respecting the dependencies of the --untracked option. >> >> Also add a few automated tests to the t7810, to cover the new >> configuration > > Do we have any non-automated tests in t7810? Good point, will be removed in the v2, if we get there. Tying "automated" to "test" is just an old habit of mine. >> option by replicating the already existing tests for --untracked. >> >> Signed-off-by: Dragan Simic <dsimic@manjaro.org> >> --- >> builtin/grep.c | 3 +++ >> t/t7810-grep.sh | 9 +++++++++ >> 2 files changed, 12 insertions(+) >> >> diff --git a/builtin/grep.c b/builtin/grep.c >> index af89c8b5cb19..71d94126fb6e 100644 >> --- a/builtin/grep.c >> +++ b/builtin/grep.c >> @@ -1041,6 +1041,9 @@ int cmd_grep(int argc, const char **argv, const >> char *prefix) >> the_repository->settings.command_requires_full_index = 0; >> } >> >> + if (use_index && !cached) >> + git_config_get_bool("grep.includeuntracked", &untracked); > > Can this ever return an error? E.g. > > [grep] includeuntracked = "not really" > > How badly would setting this configuration variable break third > party tools that assume their "git grep" invocation without the > "--untracked" option would not yield hits from untracked files? After a brief inspection of the code in cache.c, git_config_get_bool() always returns either 0 or 1, so we should be fine. Thus, any strangeness in a configuration file would end up not enabling this option. As I already explained in my earlier response, [1] I think that the usability of this option outweighs the possible issues it may cause to some users. [1] https://lore.kernel.org/git/c68a6d94bb02e5d9aa2f81bee022baa8@manjaro.org/
On 2024-03-19 01:55, Junio C Hamano wrote:
> Dragan Simic <dsimic@manjaro.org> writes:
>
>> ---no-index::
>> - Search files in the current directory that is not managed by Git.
>> -
>> --untracked::
>> In addition to searching in the tracked files in the working
>> tree, search also in untracked files.
>>
>> +--no-index::
>> + Search files in the current directory that is not managed by Git.
>> + This option cannot be used together with `--cached` or
>> `--untracked`.
>> + See also `grep.fallbackToNoIndex` in CONFIGURATION below.
>
> Hmph, this is not the fault of this patch, but the description is
> iffy. You can run "git grep --no-index" inside a directory that is
> managed by Git, and it behaves as if you gave --untracked, if I am
> not mistaken.
>
> What "--no-index" does is to pretend that there is no system called
> Git and work as if it were a strange implementation of "grep -r".
> The reader should be taught to understand the mode as such, because
> that understanding will apply whether the current directory happens
> to be part of a working tree managed by git, or not under control by
> git repository anywhere.
>
> There is no tracked or untracked or managed or anything like that,
> as we are pretending that there is no git, so it falls naturally
> that --cached or --untracked would not work.
>
> And from that point of view, swapping the order of "--no-index" and
> "--untracked" in this patch does make sense. All other options that
> specify which haystack to find the needle in are all about git;
> "--no-index" truly is an oddball that pretends that we live in a
> world without git, and as an oddball, we should move the description
> out from where it does not belong. It might also make sense to
> rethink where `--recurse-submodules` sits in the list of options
> while at it, as it also is an option that affects which haystack the
> search for the needle is carried out.
Thank you very much for your detailed review, as always. I'll try to
incorporate all your remarks into the next version of this patch.
On 2024-03-19 01:32, Junio C Hamano wrote: > Eric Sunshine <sunshine@sunshineco.com> writes: > >> It's entirely subjective, of course, so no right-or-wrong answer, but >> I personally do not find that this change improves code quality or >> readability. > > I agree that this is entirely subjective. To those who wrote these > variable decls and inits, what they wrote was the most readable, > wasn't it? It probably falls into the "to some readers the existing > code may not be perfect, but once it is written, it is not worth a > patch noise to fix it" category. There's no doubt that it was the most readable form to the people who wrote the code, which was some time ago, but the time inevitably passes and the surrounding code changes over time, maybe even some new coding guidelines become introduced, etc. Things inevitably change, that's all I'm trying to say. >> With my reviewer hat on, I spent an inordinate amount of time staring >> at this change trying to locate each variable's new location to verify >> that no initializers were dropped and that the declared type hadn't >> changed. > > It is true that "cleaning up, no behaviour changes intended" patches > are unpleasant to review. They are boring to read, and the risk of > breakage due to mistake is unnecessary and severe. > > But if the result is objectively better, such a one-time cost may be > worth it. We are investing into the better future. For example, we > may have an unsorted mess of an enum definition, and we do > appreciate in the longer run, such a definition were "more or less" > sorted within the constraint of some other criteria (like, "errors > get negative value"). If the enum is a huge one, it may need some > careful reviewing to verify such a change that turns the unsorted > mess into a sorted nice list, but the cost of doing so may be > justified. I fully agree that reviewing code-cleanup patches it usually boring and often taxing. I mean, why change something that has served us well for years, just to make it look nicer in someone's eyes? What does even "nicer" mean to everyone? Well, I sometimes look at the code as if it were a beautiful painting. To some people, it doesn't matter if a painting has some rough areas, as long as it can be hung on a wall. To them, it's just a square thing. Though, to some other people, spotting such rough areas is what makes the painting less beautiful to them. Paintings usually cannot be fixed easily, but fixing the code is much easier. Of course, "nicer" is very hard to define, but I believe that, in the domain of computing, it could be partially defined as "more consistent and flowing better". That's what I tried to achieve with this patch, and I hope I've managed to convey my viewpoint. > Does the change in this patch qualify as "objectively better"? I > dunno. I'd say that these changes qualify as "semi-objectively nicer", but surely not as "objectively better". For any changes to be objectively better, they'd need to introduce some functional changes to the code, while a well-performed code cleanup actually introduces no functional changes.
Hello Junio,
On 2024-03-19 01:21, Junio C Hamano wrote:
> Dragan Simic <dsimic@manjaro.org> writes:
>
>> This patch series introduces new config option grep.includeUntracked,
>> which makes the untracked files also searched by default when
>> git-grep(1)
>> is invoked, in addition to searching the tracked files.
>
> Hmph. I am moderately negative on any configuration that screws
> with the default haystack from which needle is sought for.
>
> I may often do "git grep --cached" but that does not mean I would
> welcome an configuration option to make "git grep" search in the
> index even when the request by the user does not have "--cached".
>
> Inclusion of untracked sources in a sense is even worse, especially
> when an unsuspecting "git grep" user (or a script) fully expects
> that any paths found in the output are to be found in "git ls-files
> -s" output but when you stray into a repository with the
> configuration set, that expectation suddelnly gets broken.
Hmm... Those are valid concerns. For example, I'd also be against
another configuration option that would make it possible to enable
--cached by default, for example, because it could easily mess the
things up really bad.
However, I think that the usability of this new configuration option
outweighs the possible issues it could cause to some users. For
example, I quite often find myself in need to specify --untracked
while running git-grep(1), to see what's found in the code changes
I haven't staged yet, so I find this new configuration very useful.
Of course, I already have a few aliases defined for grep operations,
but defining another one for "grep --untracked" simply doesn't fit
into my alias scheme. Obviously, that's why I implemented this new
option, :) hoping that it would be usable to other people, too.
I'm not sure how many users could be affected by the possible negative
effects of this configuration option, by using a specific mix of git
operations, but I think it would be quite useful.
Dragan Simic <dsimic@manjaro.org> writes: > Add new configuration option grep.includeUntracked that enables --untracked > option by default. This pretty much follows the logic established by the > already existing configuration option grep.fallbackToNoIndex, while also > respecting the dependencies of the --untracked option. > > Also add a few automated tests to the t7810, to cover the new configuration Do we have any non-automated tests in t7810? > option by replicating the already existing tests for --untracked. > > Signed-off-by: Dragan Simic <dsimic@manjaro.org> > --- > builtin/grep.c | 3 +++ > t/t7810-grep.sh | 9 +++++++++ > 2 files changed, 12 insertions(+) > > diff --git a/builtin/grep.c b/builtin/grep.c > index af89c8b5cb19..71d94126fb6e 100644 > --- a/builtin/grep.c > +++ b/builtin/grep.c > @@ -1041,6 +1041,9 @@ int cmd_grep(int argc, const char **argv, const char *prefix) > the_repository->settings.command_requires_full_index = 0; > } > > + if (use_index && !cached) > + git_config_get_bool("grep.includeuntracked", &untracked); Can this ever return an error? E.g. [grep] includeuntracked = "not really" How badly would setting this configuration variable break third party tools that assume their "git grep" invocation without the "--untracked" option would not yield hits from untracked files?
Dragan Simic <dsimic@manjaro.org> writes: > ---no-index:: > - Search files in the current directory that is not managed by Git. > - > --untracked:: > In addition to searching in the tracked files in the working > tree, search also in untracked files. > > +--no-index:: > + Search files in the current directory that is not managed by Git. > + This option cannot be used together with `--cached` or `--untracked`. > + See also `grep.fallbackToNoIndex` in CONFIGURATION below. Hmph, this is not the fault of this patch, but the description is iffy. You can run "git grep --no-index" inside a directory that is managed by Git, and it behaves as if you gave --untracked, if I am not mistaken. What "--no-index" does is to pretend that there is no system called Git and work as if it were a strange implementation of "grep -r". The reader should be taught to understand the mode as such, because that understanding will apply whether the current directory happens to be part of a working tree managed by git, or not under control by git repository anywhere. There is no tracked or untracked or managed or anything like that, as we are pretending that there is no git, so it falls naturally that --cached or --untracked would not work. And from that point of view, swapping the order of "--no-index" and "--untracked" in this patch does make sense. All other options that specify which haystack to find the needle in are all about git; "--no-index" truly is an oddball that pretends that we live in a world without git, and as an oddball, we should move the description out from where it does not belong. It might also make sense to rethink where `--recurse-submodules` sits in the list of options while at it, as it also is an option that affects which haystack the search for the needle is carried out. > --no-exclude-standard:: > Also search in ignored files by not honoring the `.gitignore` > mechanism. Only useful with `--untracked`.
Eric Sunshine <sunshine@sunshineco.com> writes: > It's entirely subjective, of course, so no right-or-wrong answer, but > I personally do not find that this change improves code quality or > readability. I agree that this is entirely subjective. To those who wrote these variable decls and inits, what they wrote was the most readable, wasn't it? It probably falls into the "to some readers the existing code may not be perfect, but once it is written, it is not worth a patch noise to fix it" category. > With my reviewer hat on, I spent an inordinate amount of time staring > at this change trying to locate each variable's new location to verify > that no initializers were dropped and that the declared type hadn't > changed. It is true that "cleaning up, no behaviour changes intended" patches are unpleasant to review. They are boring to read, and the risk of breakage due to mistake is unnecessary and severe. But if the result is objectively better, such a one-time cost may be worth it. We are investing into the better future. For example, we may have an unsorted mess of an enum definition, and we do appreciate in the longer run, such a definition were "more or less" sorted within the constraint of some other criteria (like, "errors get negative value"). If the enum is a huge one, it may need some careful reviewing to verify such a change that turns the unsorted mess into a sorted nice list, but the cost of doing so may be justified. Does the change in this patch qualify as "objectively better"? I dunno. Thanks.
Dragan Simic <dsimic@manjaro.org> writes:
> This patch series introduces new config option grep.includeUntracked,
> which makes the untracked files also searched by default when git-grep(1)
> is invoked, in addition to searching the tracked files.
Hmph. I am moderately negative on any configuration that screws
with the default haystack from which needle is sought for.
I may often do "git grep --cached" but that does not mean I would
welcome an configuration option to make "git grep" search in the
index even when the request by the user does not have "--cached".
Inclusion of untracked sources in a sense is even worse, especially
when an unsuspecting "git grep" user (or a script) fully expects
that any paths found in the output are to be found in "git ls-files
-s" output but when you stray into a repository with the
configuration set, that expectation suddelnly gets broken.
So, I dunno.
Add a handful of additional automated tests, to improve the coverage of configuration file entries whose values contain internal whitespace, leading and/or trailing whitespace, which may or may not be enclosed within quotation marks, or which contain an additional inline comment. At the same time, rework one already existing automated test a bit, to ensure consistency with the newly added tests. This change introduced no functional changes to the already existing test. Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Dragan Simic <dsimic@manjaro.org> --- Notes: Changes in v3: - Removed a few unnecessary invocations of x_to_tab() - As pointed out by Eric Sunshine, [3] it's better not to introduce new random helper functions, so x_to_tab() was replaced by a direct invocation of tr(1), in one case that requires use of literal 'Q' characters, and by invocations of q_to_tab(), in the remaining cases that actually allow use of 'Q' characters to represent HTs - Dropped the change of the name of an already existing test, to not distract the reviewers, as suggested by Eric Sunshine [4] - Renamed the first added test, as suggested by Eric Sunshine, [4] to make it consistent with the expected way for naming setup tests Changes in v2: - All new tests and one already existing test reworked according to Eric Sunshine's review suggestions; [1][2] the already existing test was reworked a bit to ensure consistency - Added a Helped-by tag [1] https://lore.kernel.org/git/CAPig+cRMPNExbG34xJ0w5npUc3DDwxQUGS_AQfam_mi4s53=sA@mail.gmail.com/ [2] https://lore.kernel.org/git/CAPig+cRG8eFxepkaiN54H+fa7D=rFGsmEHdvTP+HSSaLO_6T_A@mail.gmail.com/ [3] https://lore.kernel.org/git/CAPig+cSLb+Rsy81itvw9Tfvqv9vvKSPgO_ER9fWL04XZrgFvwg@mail.gmail.com/T/#u [4] https://lore.kernel.org/git/CAPig+cTVmQzC38DympSEtPNhgY=-+dYbZmkr0RTRbhG-hp2fmQ@mail.gmail.com/ t/t1300-config.sh | 112 ++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 109 insertions(+), 3 deletions(-) diff --git a/t/t1300-config.sh b/t/t1300-config.sh index 31c387868708..7688362826ea 100755 --- a/t/t1300-config.sh +++ b/t/t1300-config.sh @@ -11,6 +11,96 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME TEST_PASSES_SANITIZE_LEAK=true . ./test-lib.sh +test_expect_success 'setup whitespace config' ' + tr "X" "\011" >.git/config <<-\EOF + [section] + Xsolid = rock + Xsparse = big XX blue + XsparseAndTail = big XX blue + XsparseAndTailQuoted = "big XX blue " + XsparseAndBiggerTail = big XX blue X X + XsparseAndBiggerTailQuoted = "big XX blue X X" + XsparseAndBiggerTailQuotedPlus = "big XX blue X X"X + XheadAndTail = Xbig blue + XheadAndTailQuoted = "Xbig blue " + XheadAndTailQuotedPlus = "Xbig blue " + Xannotated = big blueX# to be discarded + XannotatedQuoted = "big blue"X# to be discarded + EOF +' + +test_expect_success 'no internal whitespace' ' + echo "rock" >expect && + git config --get section.solid >actual && + test_cmp expect actual +' + +test_expect_success 'internal whitespace' ' + echo "big QQ blue" | q_to_tab >expect && + git config --get section.sparse >actual && + test_cmp expect actual +' + +test_expect_success 'internal and trailing whitespace' ' + echo "big QQ blue" | q_to_tab >expect && + git config --get section.sparseAndTail >actual && + test_cmp expect actual +' + +test_expect_success 'internal and trailing whitespace, all quoted' ' + echo "big QQ blue " | q_to_tab >expect && + git config --get section.sparseAndTailQuoted >actual && + test_cmp expect actual +' + +test_expect_success 'internal and more trailing whitespace' ' + echo "big QQ blue" | q_to_tab >expect && + git config --get section.sparseAndBiggerTail >actual && + test_cmp expect actual +' + +test_expect_success 'internal and more trailing whitespace, all quoted' ' + echo "big QQ blue Q Q" | q_to_tab >expect && + git config --get section.sparseAndBiggerTailQuoted >actual && + test_cmp expect actual +' + +test_expect_success 'internal and more trailing whitespace, not all quoted' ' + echo "big QQ blue Q Q" | q_to_tab >expect && + git config --get section.sparseAndBiggerTailQuotedPlus >actual && + test_cmp expect actual +' + +test_expect_success 'leading and trailing whitespace' ' + echo "big blue" >expect && + git config --get section.headAndTail >actual && + test_cmp expect actual +' + +test_expect_success 'leading and trailing whitespace, all quoted' ' + echo "Qbig blue " | q_to_tab >expect && + git config --get section.headAndTailQuoted >actual && + test_cmp expect actual +' + +test_expect_success 'leading and trailing whitespace, not all quoted' ' + echo "Qbig blue " | q_to_tab >expect && + git config --get section.headAndTailQuotedPlus >actual && + test_cmp expect actual +' + +test_expect_success 'inline comment' ' + echo "big blue" >expect && + git config --get section.annotated >actual && + test_cmp expect actual +' + +test_expect_success 'inline comment, quoted' ' + echo "big blue" >expect && + git config --get section.annotatedQuoted >actual && + test_cmp expect actual +' + test_expect_success 'clear default config' ' rm -f .git/config ' @@ -1066,9 +1156,25 @@ test_expect_success '--null --get-regexp' ' test_cmp expect result ' -test_expect_success 'inner whitespace kept verbatim' ' - git config section.val "foo bar" && - test_cmp_config "foo bar" section.val +test_expect_success 'inner whitespace kept verbatim, spaces only' ' + echo "foo bar" >expect && + git config section.val "foo bar" && + git config --get section.val >actual && + test_cmp expect actual +' + +test_expect_success 'inner whitespace kept verbatim, horizontal tabs only' ' + echo "fooQQbar" | q_to_tab >expect && + git config section.val "$(cat expect)" && + git config --get section.val >actual && + test_cmp expect actual +' + +test_expect_success 'inner whitespace kept verbatim, horizontal tabs and spaces' ' + echo "foo Q bar" | q_to_tab >expect && + git config section.val "$(cat expect)" && + git config --get section.val >actual && + test_cmp expect actual ' test_expect_success SYMLINKS 'symlinked configuration' '
Make it more clear what the whitespace characters are in the context of git configuration files, and improve the description of the trailing whitespace handling a bit, especially how it works out together with the presence of inline comments. Signed-off-by: Dragan Simic <dsimic@manjaro.org> --- Notes: Changes in v3: - Patch description was expanded a bit, to make it more on point - No changes to the documentation were introduced Changes in v2: - No changes were introduced Documentation/config.txt | 19 +++++++++++-------- 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 782c2bab906c..20f3300dc706 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -22,9 +22,10 @@ multivalued. Syntax ~~~~~~ -The syntax is fairly flexible and permissive; whitespaces are mostly -ignored. The '#' and ';' characters begin comments to the end of line, -blank lines are ignored. +The syntax is fairly flexible and permissive. Whitespace characters, +which in this context are the space character (SP) and the horizontal +tabulation (HT), are mostly ignored. The '#' and ';' characters begin +comments to the end of line. Blank lines are ignored. The file consists of sections and variables. A section begins with the name of the section in square brackets and continues until the next @@ -64,12 +65,14 @@ The variable names are case-insensitive, allow only alphanumeric characters and `-`, and must start with an alphabetic character. A line that defines a value can be continued to the next line by -ending it with a `\`; the backslash and the end-of-line are -stripped. Leading whitespaces after 'name =', the remainder of the +ending it with a `\`; the backslash and the end-of-line are stripped. +Leading whitespace characters after 'name =', the remainder of the line after the first comment character '#' or ';', and trailing -whitespaces of the line are discarded unless they are enclosed in -double quotes. Internal whitespaces within the value are retained -verbatim. +whitespace characters of the line are discarded unless they are enclosed +in double quotes. The discarding of the trailing whitespace characters +applies regardless of the discarding of the portion of the line after +the first comment character. Internal whitespace characters within the +value are retained verbatim. Inside double quotes, double quote `"` and backslash `\` characters must be escaped: use `\"` for `"` and `\\` for `\`.
Fix a bug in function parse_value() that prevented whitespace characters (i.e. spaces and horizontal tabs) found inside configuration option values from being parsed and returned in their original form. The bug caused any number of consecutive whitespace characters to be wrongly "squashed" into the same number of space characters. This bug was introduced back in July 2009, in commit ebdaae372b46 ("config: Keep inner whitespace verbatim"). Further investigation showed that setting a configuration value, by invoking git-config(1), converts value-internal horizontal tabs into "\t" escape sequences, which the buggy value-parsing logic in function parse_value() didn't "squash" into spaces. That's why the test included in the ebdaae37 commit passed, which presumably made the bug remain undetected for this long. On the other hand, value-internal literal horizontal tab characters, found in a configuration file edited by hand, do get "squashed" by the value-parsing logic, so the right choice was to fix this bug by making the value-internal whitespace characters preserved verbatim. Signed-off-by: Dragan Simic <dsimic@manjaro.org> --- Notes: Changes in v3: - No changes were introduced Changes in v2: - Dropped the "Fixes" tag, as explained and requested by Junio, [1] because having such tags actually doesn't help us in the long run - No changes to the source code were introduced [1] https://lore.kernel.org/git/xmqq4jd7qtg6.fsf@gitster.g/ config.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/config.c b/config.c index a86a20cdf5cb..5072f12e62e4 100644 --- a/config.c +++ b/config.c @@ -817,33 +817,38 @@ static int get_next_char(struct config_source *cs) static char *parse_value(struct config_source *cs) { - int quote = 0, comment = 0, space = 0; + int quote = 0, comment = 0; + size_t trim_len = 0; strbuf_reset(&cs->value); for (;;) { int c = get_next_char(cs); if (c == '\n') { if (quote) { cs->linenr--; return NULL; } + if (trim_len) + strbuf_setlen(&cs->value, trim_len); return cs->value.buf; } if (comment) continue; if (isspace(c) && !quote) { + if (!trim_len) + trim_len = cs->value.len; if (cs->value.len) - space++; + strbuf_addch(&cs->value, c); continue; } if (!quote) { if (c == ';' || c == '#') { comment = 1; continue; } } - for (; space; space--) - strbuf_addch(&cs->value, ' '); + if (trim_len) + trim_len = 0; if (c == '\\') { c = get_next_char(cs); switch (c) {
In general, binary operators should be enclosed in a pair of leading and trailing space (SP) characters. Thus, clean up one spotted expression that for some reason had a "bunched up" operator. Signed-off-by: Dragan Simic <dsimic@manjaro.org> --- Notes: Changes in v3: - Patch description was expanded a tiny bit, to make it more accurate - No changes to the source code were introduced Changes in v2: - No changes were introduced config.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/config.c b/config.c index 3cfeb3d8bd99..a86a20cdf5cb 100644 --- a/config.c +++ b/config.c @@ -869,7 +869,7 @@ static char *parse_value(struct config_source *cs) continue; } if (c == '"') { - quote = 1-quote; + quote = 1 - quote; continue; } strbuf_addch(&cs->value, c);
This series is an evolvement from another recent series, [1] as a result of a decision to fix a longstanding bug in the parsing of configuration option values, instead of documenting the status quo. [2][3] The bufgix introduced in this series _should_ have no hidden negative effects. All of the configuration-related tests, both the old and the new ones, pass with the patches applied. In v2, this series had five patches in total, out of which the third patch (i.e. patch 3/5) was dropped in v3. [4] Other changes in v2 and v3 are described in each of the patches. There will be follow-up patches, to address the majority of the points raised during the review of this series. [5] Link to v2: https://lore.kernel.org/git/cover.1710646998.git.dsimic@manjaro.org/T/#u [1] https://lore.kernel.org/git/cover.1710258538.git.dsimic@manjaro.org/T/#u [2] https://lore.kernel.org/git/ff7b0a2ead90ad9a9456141da5e4df4a@manjaro.org/ [3] https://lore.kernel.org/git/11be11f231f3bf41d0245c780c20693f@manjaro.org/ [4] https://lore.kernel.org/git/514d832b0399ccdbc354675068477fea@manjaro.org/ [5] https://lore.kernel.org/git/f37d753485094a3ba66fde5e85d0e2dc@manjaro.org/ Dragan Simic (4): config: minor addition of whitespace config: really keep value-internal whitespace verbatim t1300: add more tests for whitespace and inline comments config.txt: describe handling of whitespace further Documentation/config.txt | 19 ++++--- config.c | 15 ++++-- t/t1300-config.sh | 112 +++++++++++++++++++++++++++++++++++++-- 3 files changed, 130 insertions(+), 16 deletions(-)
I had a quick look over the reftable bits of this series. It looks OK, but here are some comments. Nothing blocking. * reftable/error: discern locked/outdated errors It is not obvious to me why you need two different codes. Is it so you can print the offending lock file (so people can delete them manually?). FWIW, this was based on JGit, which has /** * The ref could not be locked for update/delete. * <p> * This is generally a transient failure and is usually caused by * another process trying to access the ref at the same time as this * process was trying to update it. It is possible a future operation * will be successful. */ * reftable/stack: gracefully handle failed auto-compaction due to locks It's a bit unsatisfying that you have to use details of the locking protocol to test it, but I couldn't think of a way to unittest this using only the API. Maybe it's worth considering removing the automatic compaction from the reftable-stack.h API, and have the caller (eg. in refs/reftable-backend.c) call it explicitly? -- Han-Wen Nienhuys - hanwenn@gmail.com - http://www.xs4all.nl/~hanwen
On 2024-03-18 21:02, Eric Sunshine wrote:
> On Mon, Mar 18, 2024 at 1:04 PM Dragan Simic <dsimic@manjaro.org>
> wrote:
>> Clarify that --recurse-submodules cannot be used together with
>> --untracked,
>> and improve the formatting in a couple of places, to make it visually
>> clear
>> that those are the commands or the names of configuration options.
>>
>> Signed-off-by: Dragan Simic <dsimic@manjaro.org>
>> ---
>> diff --git a/Documentation/config/grep.txt
>> b/Documentation/config/grep.txt
>> @@ -24,5 +24,5 @@ grep.fullName::
>> grep.fallbackToNoIndex::
>> - If set to true, fall back to git grep --no-index if git grep
>> + If set to true, fall back to `git grep --no-index` if `git
>> grep`
>
> Good.
>
>> diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
>> @@ -65,8 +65,8 @@ OPTIONS
>> <tree> option the prefix of all submodule output will be the
>> name of
>> - the parent project's <tree> object. This option has no effect
>> - if `--no-index` is given.
>> + the parent project's <tree> object. This option cannot be
>> used together
>> + with `--untracked`, and it has no effect if `--no-index` is
>> specified.
>
> I believe that there is a patch series currently in-flight which is
> re-styling in-prose <foo> placeholders as _<foo>_, so you may want to
> make that change as well while you're touching this.
Thanks for the notice. I'll add that to the v2 of this series, if
there will be other reasons for the v2.
On 2024-03-18 20:59, Eric Sunshine wrote:
> On Mon, Mar 18, 2024 at 1:04 PM Dragan Simic <dsimic@manjaro.org>
> wrote:
>> Move some variable definitions around, and reflow one comment block,
>> to
>> make the code a bit neater after spotting those slightly unpolished
>> areas.
>> There are no functional changes to the source code.
>>
>> Signed-off-by: Dragan Simic <dsimic@manjaro.org>
>> ---
>> diff --git a/builtin/grep.c b/builtin/grep.c
>> @@ -623,13 +623,13 @@ static int grep_tree(struct grep_opt *opt, const
>> struct pathspec *pathspec,
>> - struct repository *repo = opt->repo;
>> - int hit = 0;
>> + int hit = 0, name_base_len = 0;
>> + int old_baselen = base->len;
>> enum interesting match = entry_not_interesting;
>> + struct repository *repo = opt->repo;
>> struct name_entry entry;
>> - int old_baselen = base->len;
>> struct strbuf name = STRBUF_INIT;
>> - int name_base_len = 0;
>> @@ -890,19 +890,15 @@ static int pattern_callback(const struct option
>> *opt, const char *arg,
>> - int hit = 0;
>> + int hit = 0, seen_dashdash = 0, use_index = 1;
>> int cached = 0, untracked = 0, opt_exclude = -1;
>> - int seen_dashdash = 0;
>> int external_grep_allowed__ignored;
>> + int i, dummy, allow_revs;
>> const char *show_in_pager = NULL, *default_pager = "dummy";
>> struct grep_opt opt;
>> struct object_array list = OBJECT_ARRAY_INIT;
>> struct pathspec pathspec;
>> struct string_list path_list = STRING_LIST_INIT_DUP;
>> - int i;
>> - int dummy;
>> - int use_index = 1;
>> - int allow_revs;
>
> It's entirely subjective, of course, so no right-or-wrong answer, but
> I personally do not find that this change improves code quality or
> readability.
>
> With my reviewer hat on, I spent an inordinate amount of time staring
> at this change trying to locate each variable's new location to verify
> that no initializers were dropped and that the declared type hadn't
> changed. Taking into consideration that reviewers are a limited
> resource on this project, I'd probably have skipped this patch
> altogether if I were doing this series unless these changes concretely
> help a subsequent patch.
Oh, I'm fully aware that the reviewers are a limited resource, and I do
agree that all this is subjective. Though, I believe it makes the code
look nicer, which is the only reason why I performed and submitted those
changes in the first place.
Though, maybe it would've been better if I submitted these changes as
a separate patch, instead as part of this series.
"Mohit Marathe via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Mohit Marathe <mohitmarathe@proton.me> > > This patch replaces the code, that changes the path separators, > with the already existing function `change_path_separators()` Aside from that the name change_path_separators() needs to be updated, using it here means that the helper cannot become a NO-OP on platforms where we do not have to change backslashes we read from the system to forward slashes. So if we really wanted to use it here, and update the other existing callers in the main code to help them lose the ugly "#if WINDOWS" conditional compilation, we would need two helper functions, perhaps one that is used here that is identical to the current convert_slashes(), and the other one used to clean-up the callsites in [1/2] to also remove the conditional compilation. Even if we were to make it public, I am not sure if compat-util is the right place, though. It is not a kitchen sink. In any case, perhaps we want to do something like this in a header, ... static inline void bs_to_forward_slash_in_place(char *path) { ... } #ifdef GIT_WINDOWS_NATIVE #define normalize_slashes_in_place(path) bs_to_forward_slash_in_place(path) #else #define normalize_slashes_in_place(path) do { ; /* nothing */ } while (0) #endif ... and then use the "normalize" one to lose "#ifdef" in the callers [1/2] touches, while "bs_to_forward_slash" one here. I am not convinced if it is worth doing only for this single test, though. > @@ -88,9 +86,8 @@ static const char *make_relative(const char *location) > > /* convert backslashes to forward slashes */ > strlcpy(buf, location + prefix_len, sizeof(buf)); > - for (p = buf; *p; p++) > - if (*p == '\\') > - *p = '/'; > + p = buf; > + change_path_separators(p); > return buf; > }
"Mohit Marathe via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Mohit Marathe <mohitmarathe@proton.me> > > This patch migrates the `convert_slashes` function to `git-compat- > util.h` and renames it to `change_path_separators`. That is something a reader can tell from looking at what the patch does. What they cannot read from the diff is why the author of the patch thought it was a good idea and that is what we want to see here. > diff --git a/abspath.c b/abspath.c > index 1202cde23db..ea35e2c05ce 100644 > --- a/abspath.c > +++ b/abspath.c > @@ -58,7 +58,7 @@ static void get_root_part(struct strbuf *resolved, struct strbuf *remaining) > strbuf_reset(resolved); > strbuf_add(resolved, remaining->buf, offset); > #ifdef GIT_WINDOWS_NATIVE > - convert_slashes(resolved->buf); > + change_path_separators(resolved->buf); > #endif In the context of "#ifdef GIT_WINDOWS_NATIVE..#endif" conditional compilation, the name convert_slashes() may have made some sense, i.e. "on this system, we get a path with the kind of slash that is not suitable to be used in Git, so we correct it to the other kind of slash". But neither it, or change_path_separators(), as a name of public function makes much sense. The direction of the change is totally unclear. "normalize" may be a verb that has some connotation which direction the conversion is going, but still, without knowing why this change is being made (not the name change, but the part that makes it public), I cannot offer much better candidates for its name. > diff --git a/git-compat-util.h b/git-compat-util.h > index 7c2a6538e5a..3db90c09295 100644 > --- a/git-compat-util.h > +++ b/git-compat-util.h > @@ -1309,6 +1309,13 @@ static inline int strtol_i(char const *s, int base, int *result) > return 0; > } > > +static inline void change_path_separators(char *path) > +{ > + for (; *path; path++) > + if (*path == '\\') > + *path = '/'; > +} > + > void git_stable_qsort(void *base, size_t nmemb, size_t size, > int(*compar)(const void *, const void *)); > #ifdef INTERNAL_QSORT > diff --git a/path.c b/path.c > index 8bb223c92c9..cd7c88ffa0d 100644 > --- a/path.c > +++ b/path.c > @@ -758,7 +758,7 @@ char *interpolate_path(const char *path, int real_home) > else > strbuf_addstr(&user_path, home); > #ifdef GIT_WINDOWS_NATIVE > - convert_slashes(user_path.buf); > + change_path_separators(user_path.buf); > #endif > } else { > struct passwd *pw = getpw_str(username, username_len); If the idea were that "here we know that the path came from reading filesystem and on platforms that use backslashes as path separators, we need to normalize them into slashes before the path is usable in Git", then I might understand if a change were more like: * Introduce normalize_path_separators_in_place(char *) that is a NOOP by default, but does the backslash->slash conversion ONLY on platforms that require it (i.e. Windows). * Lose "#ifdef GIT_WINDOWS_NATIVE" and corresponding "#endif", and call that normalize thing unconditionally, which gets optimized away on most systems other then where it is required. That way, the affected .c files would become somewhat easier to follow without ugly conditional compilation. But with the proposed patch lacking any explanation why the author thought it was a good idea, the above is just my wild guess.
On 2024-03-18 20:21, Eric Sunshine wrote:
> On Mon, Mar 18, 2024 at 4:37 AM Dragan Simic <dsimic@manjaro.org>
> wrote:
>> On 2024-03-18 05:38, Junio C Hamano wrote:
>> > sed -e 's/^|//' -e 's/Q/ /g' >.git/config <<-\EOF
>> > |[section]
>> > | solid = rock
>> > | sparse = big QQ blue
>> > | ...
>> > EOF
>> >
>> This looks quite neat. Furthermore, I think we should also consider
>> the already existing tests in the t1300 that contain such indentation.
>> As I already explained in my earlier response to Eric, [1] the choice
>> of including the indentation or not seems random to me, so we should
>> perhaps consider taking some broader approach.
>>
>> How about this as a plan for moving forward:
>>
>> 1) Sprinkle a couple of tests onto the t1300, which try to be
>> focused on the verification of the indentation-handling logic;
>> maybe those additional tests could be even seen as redundant,
>> but I think they can only help with the test coverage
>>
>> 2) Create a new helper function that uses the logic you described
>> above, to make it simpler to include the indentation into configs
>>
>> 3) Finally, propagate the use of this new helper function into the
>> new test and the already existing tests in the t1300 that already
>> include the indentation
>>
>> I'd be happy to implement all of the above-proposed steps in the next
>> couple of days. Sure, it would be quite time-consuming, especially
>> the
>> third proposed step, but it should be worth it in the long run.
>
> As noted in my other response, while such cleanups may be nice in the
> long-run, the bug-fix patch series under discussion should not be held
> hostage by an ever-increasing set of potential cleanups. Let's focus
> on landing the current series; these tangential cleanups can be done
> in a separate series.
Totally agreed, let's keep this plan for the follow-up patches.
On 2024-03-18 20:17, Eric Sunshine wrote: > On Mon, Mar 18, 2024 at 4:17 AM Dragan Simic <dsimic@manjaro.org> > wrote: >> On 2024-03-18 03:48, Eric Sunshine wrote: >> > Readability wasn't my reason for bringing this up. As a reviewer, >> > every time a question pops into my mind as I'm reading the code, that >> > indicates that something about the code is unclear or that the commit >> > message doesn't properly explain why it was done in this way. People >> > coming across this code in the future may have the same questions but >> > they won't have the benefit of being able to easily ask you why it was >> > done this way. >> >> I see. How about including a small comment in the t1300 that would >> explain the additional indentation? > > I'm just one reviewer. Unless others chime in with similar > observations or questions regarding the patch, I don't think such a > comment is necessary. Aside from the other more significant points > (such as not introducing x_to_tab(), using "setup" in the function > title, etc.), this is extremely minor, and what you have here is "good > enough" (though you may want to take Junio's suggestion of using a > leading "|" to protect indentation). Just to reiterate, both x_to_tab() and the test naming have already been addressed in the future v3 of this series. >> As a note, there are already more tests in the t1300 that contain such >> indentation, so maybe we shoulddo something with those existing tests >> as well; the above-proposed comment, which would be placed at the >> very >> beginning of t1300, may provide a satisfactory explanation for all the >> tests in t1300 that contain such additional indentation. >> >> Another option would be to either add the indentation to all relevant >> tests in the t1300, or to remove the indentation from all tests in the >> t1300 that already contain it. I'd be happy to implement and submit >> patches that do that, after we choose the direction we want to follow. > > It would be better to keep this series focused on its primary goal of > fixing a bug rather than being held hostage to an ever increasing set > of potential cleanups. Such cleanups can be done as separate patch > series either atop this series or alongside it. Let's land this series > first, and then, if you wish, tackle those other less significant > issues. Thanks, I totally agree. >> > If these new tests are also checking leading whitespace behavior, then >> > to improve coverage, would it make sense to have the leading "X" on >> > some lines but not others? >> >> Good point, despite that not being the main purpose of the added >> tests. >> I'll see to add a couple of tests that check the handling of >> indentation, >> possibly at some places in the t1300 that fit the best; improving the >> tests coverage can only help in the long run. > > As above, such additional tests probably aren't mandatory for this > bug-fix series. As a reviewer, I'd like to see fewer and fewer changes > between each version of a patch series; the series should converge so > that it can land rather than diverge from iteration to iteration. Such > additional leading-whitespace tests may be perfectly appropriate for a > follow-up series. Agreed once again. Let's wrap this up, and I'll come back with the follow-up patches.
Eric Sunshine <sunshine@sunshineco.com> writes:
> As a reviewer, I'd like to see fewer and fewer changes between
> each version of a patch series; the series should converge so that
> it can land rather than diverge from iteration to iteration.
Well said.
Thanks.
On Mon, Mar 18, 2024 at 1:04 PM Dragan Simic <dsimic@manjaro.org> wrote: > Clarify that --recurse-submodules cannot be used together with --untracked, > and improve the formatting in a couple of places, to make it visually clear > that those are the commands or the names of configuration options. > > Signed-off-by: Dragan Simic <dsimic@manjaro.org> > --- > diff --git a/Documentation/config/grep.txt b/Documentation/config/grep.txt > @@ -24,5 +24,5 @@ grep.fullName:: > grep.fallbackToNoIndex:: > - If set to true, fall back to git grep --no-index if git grep > + If set to true, fall back to `git grep --no-index` if `git grep` Good. > diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt > @@ -65,8 +65,8 @@ OPTIONS > <tree> option the prefix of all submodule output will be the name of > - the parent project's <tree> object. This option has no effect > - if `--no-index` is given. > + the parent project's <tree> object. This option cannot be used together > + with `--untracked`, and it has no effect if `--no-index` is specified. I believe that there is a patch series currently in-flight which is re-styling in-prose <foo> placeholders as _<foo>_, so you may want to make that change as well while you're touching this.
On Mon, Mar 18, 2024 at 1:04 PM Dragan Simic <dsimic@manjaro.org> wrote:
> Move some variable definitions around, and reflow one comment block, to
> make the code a bit neater after spotting those slightly unpolished areas.
> There are no functional changes to the source code.
>
> Signed-off-by: Dragan Simic <dsimic@manjaro.org>
> ---
> diff --git a/builtin/grep.c b/builtin/grep.c
> @@ -623,13 +623,13 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
> - struct repository *repo = opt->repo;
> - int hit = 0;
> + int hit = 0, name_base_len = 0;
> + int old_baselen = base->len;
> enum interesting match = entry_not_interesting;
> + struct repository *repo = opt->repo;
> struct name_entry entry;
> - int old_baselen = base->len;
> struct strbuf name = STRBUF_INIT;
> - int name_base_len = 0;
> @@ -890,19 +890,15 @@ static int pattern_callback(const struct option *opt, const char *arg,
> - int hit = 0;
> + int hit = 0, seen_dashdash = 0, use_index = 1;
> int cached = 0, untracked = 0, opt_exclude = -1;
> - int seen_dashdash = 0;
> int external_grep_allowed__ignored;
> + int i, dummy, allow_revs;
> const char *show_in_pager = NULL, *default_pager = "dummy";
> struct grep_opt opt;
> struct object_array list = OBJECT_ARRAY_INIT;
> struct pathspec pathspec;
> struct string_list path_list = STRING_LIST_INIT_DUP;
> - int i;
> - int dummy;
> - int use_index = 1;
> - int allow_revs;
It's entirely subjective, of course, so no right-or-wrong answer, but
I personally do not find that this change improves code quality or
readability.
With my reviewer hat on, I spent an inordinate amount of time staring
at this change trying to locate each variable's new location to verify
that no initializers were dropped and that the declared type hadn't
changed. Taking into consideration that reviewers are a limited
resource on this project, I'd probably have skipped this patch
altogether if I were doing this series unless these changes concretely
help a subsequent patch.
On Mon, Mar 18, 2024 at 4:37 AM Dragan Simic <dsimic@manjaro.org> wrote:
> On 2024-03-18 05:38, Junio C Hamano wrote:
> > sed -e 's/^|//' -e 's/Q/ /g' >.git/config <<-\EOF
> > |[section]
> > | solid = rock
> > | sparse = big QQ blue
> > | ...
> > EOF
> >
> This looks quite neat. Furthermore, I think we should also consider
> the already existing tests in the t1300 that contain such indentation.
> As I already explained in my earlier response to Eric, [1] the choice
> of including the indentation or not seems random to me, so we should
> perhaps consider taking some broader approach.
>
> How about this as a plan for moving forward:
>
> 1) Sprinkle a couple of tests onto the t1300, which try to be
> focused on the verification of the indentation-handling logic;
> maybe those additional tests could be even seen as redundant,
> but I think they can only help with the test coverage
>
> 2) Create a new helper function that uses the logic you described
> above, to make it simpler to include the indentation into configs
>
> 3) Finally, propagate the use of this new helper function into the
> new test and the already existing tests in the t1300 that already
> include the indentation
>
> I'd be happy to implement all of the above-proposed steps in the next
> couple of days. Sure, it would be quite time-consuming, especially the
> third proposed step, but it should be worth it in the long run.
As noted in my other response, while such cleanups may be nice in the
long-run, the bug-fix patch series under discussion should not be held
hostage by an ever-increasing set of potential cleanups. Let's focus
on landing the current series; these tangential cleanups can be done
in a separate series.
On Mon, Mar 18, 2024 at 4:17 AM Dragan Simic <dsimic@manjaro.org> wrote: > On 2024-03-18 03:48, Eric Sunshine wrote: > > Readability wasn't my reason for bringing this up. As a reviewer, > > every time a question pops into my mind as I'm reading the code, that > > indicates that something about the code is unclear or that the commit > > message doesn't properly explain why it was done in this way. People > > coming across this code in the future may have the same questions but > > they won't have the benefit of being able to easily ask you why it was > > done this way. > > I see. How about including a small comment in the t1300 that would > explain the additional indentation? I'm just one reviewer. Unless others chime in with similar observations or questions regarding the patch, I don't think such a comment is necessary. Aside from the other more significant points (such as not introducing x_to_tab(), using "setup" in the function title, etc.), this is extremely minor, and what you have here is "good enough" (though you may want to take Junio's suggestion of using a leading "|" to protect indentation). > As a note, there are already more tests in the t1300 that contain such > indentation, so maybe we shoulddo something with those existing tests > as well; the above-proposed comment, which would be placed at the very > beginning of t1300, may provide a satisfactory explanation for all the > tests in t1300 that contain such additional indentation. > > Another option would be to either add the indentation to all relevant > tests in the t1300, or to remove the indentation from all tests in the > t1300 that already contain it. I'd be happy to implement and submit > patches that do that, after we choose the direction we want to follow. It would be better to keep this series focused on its primary goal of fixing a bug rather than being held hostage to an ever increasing set of potential cleanups. Such cleanups can be done as separate patch series either atop this series or alongside it. Let's land this series first, and then, if you wish, tackle those other less significant issues. > > If these new tests are also checking leading whitespace behavior, then > > to improve coverage, would it make sense to have the leading "X" on > > some lines but not others? > > Good point, despite that not being the main purpose of the added tests. > I'll see to add a couple of tests that check the handling of > indentation, > possibly at some places in the t1300 that fit the best; improving the > tests coverage can only help in the long run. As above, such additional tests probably aren't mandatory for this bug-fix series. As a reviewer, I'd like to see fewer and fewer changes between each version of a patch series; the series should converge so that it can land rather than diverge from iteration to iteration. Such additional leading-whitespace tests may be perfectly appropriate for a follow-up series.