git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Stefan Beller <sbeller@google.com>
Cc: pclouds@gmail.com, git@vger.kernel.org
Subject: Re: [PATCHv8 5/5] pathspec: allow querying for attributes
Date: Thu, 19 May 2016 11:53:40 -0700	[thread overview]
Message-ID: <xmqqmvnlj2vf.fsf@gitster.mtv.corp.google.com> (raw)
In-Reply-To: <20160519010935.27856-6-sbeller@google.com> (Stefan Beller's message of "Wed, 18 May 2016 18:09:35 -0700")

Stefan Beller <sbeller@google.com> writes:

> diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
> index cafc284..aa9f220 100644
> --- a/Documentation/glossary-content.txt
> +++ b/Documentation/glossary-content.txt
> @@ -384,6 +384,23 @@ full pathname may have special meaning:
>  +
>  Glob magic is incompatible with literal magic.
>  
> +attr;;
> +	Additionally to matching the pathspec, the path must have the
> +	attribute as specified. The syntax for specifying the required
> +	attributes is "`attr: [mode] <attribute name> [=value]`"
> ++
> +Attributes can have 4 states (Set, Unset, Set to a value, unspecified) and
> +you can query each attribute for certain states. The "`[mode]`" is a special
> +character to indicate which attribute states are looked for. The following
> +modes are available:
> +
> + - an empty "`[mode]`" matches if the attribute is set
> + - "`-`" the attribute must be unset
> + - "`!`" the attribute must be unspecified
> + - an empty "`[mode]`" combined with "`[=value]`" matches if the attribute has
> +   the given value.
> ++

As an initial design, I find this much more agreeable than the
previous rounds.  I however find the phrasing of the above harder
than necessary to understand, for a few reasons.

 * Mixed use of "X matches if ..." and "... must be Y" makes it
   unclear if they are talking about different kind of things, or
   the same kind of things in merely different ways.

 * It does not make it clear "=value" is only meaningful when [mode]
   is empty.

Perhaps dropping the '[mode]' thing altogether and instead saying

        After `attr:` comes a space separated list of "attribute
       	requirements", all of which must be met in order for the
       	path to be considered a match; this is in addition to the
       	usual non-magic pathspec pattern matching.

	Each of the attribute requirements for the path takes one of
	these forms:

	- "`ATTR`" requires that the attribute `ATTR` must be set.

	- "`-ATTR`" requires that the attribute `ATTR` must be unset.

        - "`ATTR=VALUE`" requires that the attribute `ATTR` must be
          set to the string `VALUE`.

        - "`!ATTR`" requires that the attribute `ATTR` must be
          unspecified.

would make the resulting text easier to read?

> +static int match_attrs(const char *name, int namelen,
> +		       const struct pathspec_item *item)
> +{
> +	int i;
> +
> +	git_check_attr_counted(name, namelen, item->attr_check);
> +	for (i = 0; i < item->attr_match_nr; i++) {
> +		const char *value;
> +		int matched;
> +		enum attr_match_mode match_mode;
> +
> +		value = item->attr_check->check[i].value;
> +		match_mode = item->attr_match[i].match_mode;
> +
> +		if (ATTR_TRUE(value))
> +			matched = match_mode == MATCH_SET;
> +		else if (ATTR_FALSE(value))
> +			matched = match_mode == MATCH_UNSET;
> +		else if (ATTR_UNSET(value))
> +			matched = match_mode == MATCH_UNSPECIFIED;

readability nit:

	matched = (match_mode == MATCH_WHATEVER);

would be easier to view

> +		else
> +			matched = (match_mode == MATCH_VALUE &&
> +				   !strcmp(item->attr_match[i].value, value));

and would match the last case above better.

> +static void parse_pathspec_attr_match(struct pathspec_item *item, const char *value)
> +{
> +	struct string_list_item *si;
> +	struct string_list list = STRING_LIST_INIT_DUP;
> +
> +
> +	if (!value || !strlen(value))
> +		die(_("attr spec must not be empty"));
> +
> +	string_list_split(&list, value, ' ', -1);
> +	string_list_remove_empty_items(&list, 0);
> +
> +	if (!item->attr_check)
> +		item->attr_check = git_attr_check_alloc();
> +	else
> +		die(_("Only one 'attr:' specification is allowed."));
> +
> +	ALLOC_GROW(item->attr_match, item->attr_match_nr + list.nr, item->attr_match_alloc);
> +
> +	for_each_string_list_item(si, &list) {
> +		size_t attr_len;
> +
> +		int j = item->attr_match_nr++;
> +		const char *attr = si->string;
> +		struct attr_match *am = &item->attr_match[j];
> +
> +		if (attr[0] == '!')
> +			am->match_mode = MATCH_UNSPECIFIED;
> +		else if (attr[0] == '-')
> +			am->match_mode = MATCH_UNSET;
> +		else
> +			am->match_mode = MATCH_SET;
> +
> +		if (am->match_mode != MATCH_SET)
> +			/* skip first character */
> +			attr++;
> +		attr_len = strcspn(attr, "=");
> +		if (attr[attr_len] == '=') {
> +			am->match_mode = MATCH_VALUE;
> +			am->value = xstrdup(&attr[attr_len + 1]);
> +			if (strchr(am->value, '\\'))
> +				die(_("attr spec values must not contain backslashes"));
> +		} else
> +			am->value = NULL;
> +

Let me try a variant to see if we can improve it (thinking aloud).

	switch (*attr) {
        case '!':
        	am->match_mode = MATCH_UNSPECIFIED;
		attr++;
                break;
        case '-':
        	am->match_mode = MATCH_UNSET;
		attr++;
                break;
	default:
		attr_len = strcspn(attr, "=");
		if (attr[attr_len] != '=')
			am->match_mode = MATCH_SET;
		else {
			am->match_mode = MATCH_VALUE;
	                am->value = xstrdup(...);
		}
                break;
	}

Using switch/case does not save line counts at all but it may still
make the result easier to follow.  More importantly, it would help
you catch "!ATTR=VAR" as an "invalid attribute name 'ATTR=VAR' was
requested" error by arranging the code to make sure that scanning
for '=' would not happen in !/- case (in other words, while thiking
aloud with an alternative way to write the same thing, I spotted a
bug in the original ;-).

> +		if (!attr_name_valid(attr, attr_len)) {
> +			struct strbuf sb = STRBUF_INIT;
> +			am->match_mode = INVALID_ATTR;
> +			invalid_attr_name_message(&sb, attr, attr_len);
> +			die(_("invalid attribute in '%s': '%s'"), value, sb.buf);
> +		}
> +
> +		am->attr = git_attr_counted(attr, attr_len);

I'd do this the other order, i.e.

	am->attr = git_attr_counted(...);
        if (!am->attr) {
		...
                die(...);
	}

It is wasteful to call attr_name_valid() yourself before seeing an
error from git_attr_counted().

> diff --git a/t/t6134-pathspec-with-labels.sh b/t/t6134-pathspec-with-labels.sh
> new file mode 100755
> index 0000000..c0d8cda
> --- /dev/null
> +++ b/t/t6134-pathspec-with-labels.sh
> @@ -0,0 +1,166 @@
> +#!/bin/sh
> +
> +test_description='test labels in pathspecs'
> +. ./test-lib.sh
> +
> +test_expect_success 'setup a tree' '
> +	mkdir sub &&
> +	for p in fileA fileB fileC fileAB fileAC fileBC fileNoLabel fileUnsetLabel fileSetLabel fileValue fileWrongLabel; do
> +		: >$p &&
> +		git add $p &&
> +		: >sub/$p
> +		git add sub/$p
> +	done &&
> +	git commit -m $p &&
> +	git ls-files >actual &&
> +	cat <<EOF >expect &&
> +fileA
> +fileAB
> +fileAC
> +fileB
> +fileBC
> +fileC
> +fileNoLabel
> +fileSetLabel
> +fileUnsetLabel
> +fileValue
> +fileWrongLabel
> +sub/fileA
> +sub/fileAB
> +sub/fileAC
> +sub/fileB
> +sub/fileBC
> +sub/fileC
> +sub/fileNoLabel
> +sub/fileSetLabel
> +sub/fileUnsetLabel
> +sub/fileValue
> +sub/fileWrongLabel
> +EOF

	cat <<-\EOF >expect &&
        fileA
        ...
        EOF

to indent the whole thing?

> +test_expect_success 'setup .gitattributes' '
> +	cat <<EOF >.gitattributes &&

Likewise.

> +fileWrongLabel label☺
> +EOF
> +	git add .gitattributes &&
> +	git commit -m "add attributes"
> +'
> +
> +sq="'"

I do not think you use this $sq variable below, but I may have
overlooked its use.

> +
> +test_expect_success 'check specific set attr' '
> +	cat <<EOF >expect &&
> +fileSetLabel
> +sub/fileSetLabel
> +EOF
> +	git ls-files ":(attr:label)" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'check specific unset attr' '
> +	cat <<EOF >expect &&
> +fileUnsetLabel
> +sub/fileUnsetLabel
> +EOF
> +	git ls-files ":(attr:-label)" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'check specific value attr' '
> +	cat <<EOF >expect &&
> +fileValue
> +sub/fileValue
> +EOF
> +	git ls-files ":(attr:label=foo)" >actual &&
> +	test_cmp expect actual &&
> +	git ls-files ":(attr:label=bar)" >actual &&
> +	test_must_be_empty actual
> +'
> +
> +test_expect_success 'check unspecified attr' '
> +	cat <<EOF >expect &&
> +.gitattributes
> +...
> +EOF
> +	git ls-files :\(attr:\!label\) >actual &&

Why not the more normal-looking ":(attr:!label)"?  I do not think
history substitution would trigger in an script.

And no, "I may cut and paste into my interactive shell while
debugging" is not a good reason to make the end-result (i.e. script)
less readable.

> +test_expect_success 'check multiple unspecified attr' '
> +	cat <<EOF >expect &&
> +.gitattributes
> +fileC
> +fileNoLabel
> +fileWrongLabel
> +sub/fileC
> +sub/fileNoLabel
> +sub/fileWrongLabel
> +EOF
> +	git ls-files :\(attr:\!labelB\ \!labelA\ \!label\) >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'check label with more labels but excluded path' '
> +	cat <<EOF >expect &&
> +fileAB
> +fileB
> +fileBC
> +EOF
> +	git ls-files ":(attr:labelB)" ":(exclude)sub/" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'check label excluding other labels' '
> +	cat <<EOF >expect &&
> +fileAB
> +fileB
> +fileBC
> +sub/fileAB
> +sub/fileB
> +EOF
> +	git ls-files ":(attr:labelB)" ":(exclude,attr:labelC)sub/" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'abort on giving invalid label on the command line' '
> +	test_must_fail git ls-files . ":(attr:☺)" 2>actual &&
> +	test_i18ngrep "fatal" actual
> +'
> +
> +test_done

  reply	other threads:[~2016-05-19 18:54 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-19  1:09 [PATCHv8 0/5] pathspec magic extension to search for attributes Stefan Beller
2016-05-19  1:09 ` [PATCHv8 1/5] string list: improve comment Stefan Beller
2016-05-19 18:08   ` Junio C Hamano
2016-05-19 18:12     ` Stefan Beller
2016-05-19  1:09 ` [PATCHv8 2/5] Documentation: fix a typo Stefan Beller
2016-05-19  1:09 ` [PATCHv8 3/5] pathspec: move long magic parsing out of prefix_pathspec Stefan Beller
2016-05-19  1:09 ` [PATCHv8 4/5] pathspec: move prefix check out of the inner loop Stefan Beller
2016-05-19  1:09 ` [PATCHv8 5/5] pathspec: allow querying for attributes Stefan Beller
2016-05-19 18:53   ` Junio C Hamano [this message]
2016-05-19 20:42     ` Stefan Beller
2016-05-19 21:00       ` Junio C Hamano
2016-05-19 19:37   ` Junio C Hamano
2016-05-19 18:55 ` [PATCHv8 0/5] pathspec magic extension to search " Junio C Hamano
2016-05-19 21:00   ` Stefan Beller
2016-05-19 21:05     ` Junio C Hamano
2016-05-19 21:25       ` Stefan Beller
2016-05-20 17:00         ` Junio C Hamano
2016-05-20 18:12           ` Stefan Beller
2016-05-20 18:19             ` Junio C Hamano
2016-05-22 11:45         ` Duy Nguyen
2016-05-23 18:49           ` Stefan Beller
2016-05-24  2:00             ` Duy Nguyen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqmvnlj2vf.fsf@gitster.mtv.corp.google.com \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=pclouds@gmail.com \
    --cc=sbeller@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).