git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Albert Yale <surfingalbert@gmail.com>
Cc: git@vger.kernel.org, "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: Re: [PATCH/RFC v3] grep: Add the option '--exclude'
Date: Tue, 31 Jan 2012 16:31:42 -0800	[thread overview]
Message-ID: <7vsjiv5r81.fsf@alter.siamese.dyndns.org> (raw)
In-Reply-To: <1328021108-4662-1-git-send-email-surfingalbert@gmail.com> (Albert Yale's message of "Tue, 31 Jan 2012 09:45:08 -0500")

Albert Yale <surfingalbert@gmail.com> writes:

> @@ -124,6 +125,12 @@ OPTIONS
>  	Use fixed strings for patterns (don't interpret pattern
>  	as a regex).
>  
> +-x <pattern>::
> +--exclude <pattern>::
> +	In addition to those found in .gitignore (per directory) and
> +	$GIT_DIR/info/exclude, also consider these patterns to be in the
> +	set of the ignore rules in effect.

That makes it sound as if "git grep" will not produce hits for a path that
was forcibly added despite it matches a pattern in .gitignore file, which
is not true at all.

>  -n::
>  --line-number::
>  	Prefix the line number to matching lines.
> diff --git a/builtin/grep.c b/builtin/grep.c
> index 9ce064a..e9e1ac1 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -528,7 +528,8 @@ static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec, int
>  		struct cache_entry *ce = active_cache[nr];
>  		if (!S_ISREG(ce->ce_mode))
>  			continue;
> -		if (!match_pathspec_depth(pathspec, ce->name, ce_namelen(ce), 0, NULL))
> +		if (!match_pathspec_depth(pathspec, ce->name, ce_namelen(ce),
> +					  0, NULL, 1))
>  			continue;
>  		/*
>  		 * If CE_VALID is on, we assume worktree file and its cache entry
> @@ -566,6 +567,11 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
>  	while (tree_entry(tree, &entry)) {
>  		int te_len = tree_entry_len(&entry);
>  
> +		if (match_pathspec_depth(pathspec->exclude,
> +					 entry.path, strlen(entry.path),
> +					 0, NULL, 0))
> +			continue;
> +

Why???

IOW, why isn't this

	if (!match_pathspec_depth(pathspec, ...))
		continue;	

My point of the two previous review messages was that it would be nice if
we can make it so that nobody outside match_pathspec_depth() should even
need to know existence of pathspec->exclude. Either that was ignored, or
perhaps you found a compelling reason why that is not a good idea, but if
so I'd like to know why.

I suspect that you do not need to add the extra 0 at the end (it makes the
caller even harder to read) if you go that route. The extra parameter
defeats the whole point of encapsulating the new "negative match" feature
behind match_pathspec_depth() implementation.

> +	if (obj->type == OBJ_BLOB) {
> +		const char *name_without_sha1 = strchr(name, ':') + 1;
> +
> +		if (match_pathspec_depth(pathspec->exclude,
> +					 name_without_sha1,
> +					 strlen(name_without_sha1),
> +					 0, NULL, 0))
> +			return 0;

Likewise.

> diff --git a/cache.h b/cache.h
> index 9bd8c2d..683458a 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -533,9 +533,12 @@ struct pathspec {
>  		int len;
>  		unsigned int use_wildcard:1;
>  	} *items;
> +
> +	struct pathspec *exclude; /* This is never NULL. */
>  };

"This is never NULL" is blatantly wrong, as pathspec->exclude->exclude
will very likely be NULL.

I've been assuming that the resulting structure would be more like this: 

 struct pathspec_set {
         int nr;
         unsigned int has_wildcard:1;
         unsigned int recursive:1;
         struct pathspec_item {
                 const char *match;
                 int len;
                 unsigned int use_wildcard:1;
         } *items;
 };

 struct pathspec {
         const char **raw; /* get_pathspec() result, not freed by free_pathspec() */
         int max_depth;
         struct pathspec_set include;
         struct pathspec_set exclude;
 };
 
I am not including "raw" in the include/exclude because its use should be
deprecated away over time.

The various helpers used in match_pathspec_depth() that know _only_ about
matching and not about excluding will be changed to take a pointer to
"struct pathspec_set" (and possibly max_depth). The top-level callers
would not have to know any of these changes if we structure the API that
way, no?

I think something like that was more or less the structure we discussed
when Nguyễn brought up his negative pathspec work the last time.

      reply	other threads:[~2012-02-01  0:31 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-01-31 14:45 [PATCH/RFC v3] grep: Add the option '--exclude' Albert Yale
2012-02-01  0:31 ` Junio C Hamano [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7vsjiv5r81.fsf@alter.siamese.dyndns.org \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=pclouds@gmail.com \
    --cc=surfingalbert@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).