git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Carlos L." <00xc@protonmail.com>
To: Paul Eggert <eggert@cs.ucla.edu>
Cc: Junio C Hamano <gitster@pobox.com>,
	"Carlos L. via GitGitGadget" <gitgitgadget@gmail.com>,
	git@vger.kernel.org, GNU grep developers <grep-devel@gnu.org>
Subject: Re: [PATCH] grep: add --max-count command line option
Date: Mon, 16 May 2022 08:38:05 +0000	[thread overview]
Message-ID: <MHNbacVw7D6ZU3OJvgIqqRMu70HlgYIYQPduUEUnzWCqkGUsUGRLopGGWj-CbyjNilDcUfLB6elfSRgDOaob9cPpjeAf-I6xuMArQZ0y3io=@protonmail.com> (raw)
In-Reply-To: <e89577f8-8f52-bf09-15f3-c534bf1a6c64@cs.ucla.edu>

Hi list,

Thanks to everyone who provided feedback :)

On Saturday, May 14th, 2022 at 20:16, Martin Ågren <martin.agren@gmail.com> wrote:
> I think this should be
>
> [(-m | --max-count) <num>]

> Please use `backticks` with `-v` and `--invert-match` so that they are
> set in monospace.

I will add these suggestions to my patch.

> Regarding the special value 0, it's a bit unclear what "has no effect"
> means. In particular, it can have an effect in the sense that when it
> is used like
>
> git grep -m 1 -m 0 foo
>
> it undoes the `-m 1`.
>
> But also, that's not how my grep(1) behaves: with `-m 0`, it limits the
> number of matches to zero. I don't know how useful that is (can that
> zero-case be optimized by exiting with 1 before even trying to find the
> needle!?), or if maybe different variants of grep handle this
> differently? If all grep implementations handle 0 by actually only
> emitting zero hits, I think it would be wise for us to handle 0 the same
> way.

I agree the wording is not clear. I did not see a good use case for GNU's `-m 0`, which is why I used that value as unlimited. I am not sold on using `--no-max-count` or -1 *just* for consistency, but if someone can point to a good use case of GNU's `-m 0` (especially in git grep), I will gladly concede.

> Even if we want to handle the zero just like you do, I think this patch
> needs a few tests. We should make sure to test the 0-case (whatever we
> end up wanting it to behave like), and probably the "suppress an earlier
> -m by giving --no-max-count" case. It also seems wise to set up some
> test scenario where there are several files involved so that we can see
> that we don't just print the first m matches globally, but that the
> counter is really handled per file.

This seems sound. Is there any documentation on how to write tests for git?

On Monday, May 16th, 2022 at 07:57, Junio C Hamano <gitster@pobox.com> wrote:
> Good thing that this is defined as "per-file" limit. If it were a
> global limit, the interaction between this one and "--threads=<num>"
> would have been interesting. Perhaps add a test to make sure the
> feature continues to work with "--threads=2" (I am assuming that you
> have already tested this implementation works with the option).

I did and I found no unexpected behavior.

> What "git grep -m -1" should do? IIRC, OPT_INTEGER is for signed
> integer but the new .max_count member, as well as the existing
> "count" that is compared with it, are of "unsigned" type. Either
> erroring out or treating it as unlimited is probably fine, but
> whatever we do, we should document and have a test for it.

I would favor treating it as an error. As mentioned above, using 0 to describe "unlimited matches" (e.g. the default) is my preference, but I am willing to concede if someone can think of a good use for `-m 0`. Also, from the implementation side (although not as important) it looks better: if we allow negative values, we need to distinguish between -1 (unlimited) and -4 (display error to user, probably) - the patch is much simpler right now. And just as a side note, we avoid an issue in the pretty much insignificant use case of giving a very big value (UINT_MAX) for `-m` and it overflowing into -1, thus not properly limiting the number of matches.

On Monday, May 16th, 2022 at 09:28, Paul Eggert <eggert@cs.ucla.edu> wrote:
> As I vaguely recall, if "-m 1" stops before "-m 2" does, then the idea
> was that it's reasonable for "-m 0" to stop before "-m 1" does, and the
> logical place to stop is right at the start, before any matches are
> found (i.e., exit with status 1).

As I mentioned above, I do not see what this `-m 0` behavior is useful for, but if someone could show me an use for it I would appreciate it.

Again, thank you everyone for your comments.

  reply	other threads:[~2022-05-16  8:38 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-12 13:20 [PATCH] grep: add --max-count command line option Carlos L. via GitGitGadget
2022-05-14 18:16 ` Martin Ågren
2022-05-16  5:57 ` Junio C Hamano
2022-05-16  7:28   ` Paul Eggert
2022-05-16  8:38     ` Carlos L. [this message]
2022-05-16 15:36       ` Junio C Hamano
2022-05-17  5:53         ` Paul Eggert
2022-05-16 15:18     ` Junio C Hamano
  -- strict thread matches above, loose matches on Subject: below --
2022-06-20 15:49 Carlos L. via GitGitGadget
2022-06-20 15:57 ` Paul Eggert
2022-06-20 16:25   ` Carlos L.
2022-06-20 16:32     ` Paul Eggert

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='MHNbacVw7D6ZU3OJvgIqqRMu70HlgYIYQPduUEUnzWCqkGUsUGRLopGGWj-CbyjNilDcUfLB6elfSRgDOaob9cPpjeAf-I6xuMArQZ0y3io=@protonmail.com' \
    --to=00xc@protonmail.com \
    --cc=eggert@cs.ucla.edu \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=grep-devel@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).