git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH/RFC v2 0/2] add regex match flags to git describe
@ 2015-12-28 10:30 Mostyn Bramley-Moore
  2015-12-28 10:30 ` [PATCH/RFC v2 1/2] describe: add option to use perl-compatible regexes with --match Mostyn Bramley-Moore
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Mostyn Bramley-Moore @ 2015-12-28 10:30 UTC (permalink / raw
  To: git; +Cc: Mostyn Bramley-Moore, brian m . carlson

git describe currently only supports glob matching with the --matches flag.
It would be useful to support regular expressions.

For consistency, this uses the same regex flags as those used by git-grep.

Some old discussion of this as a candidate feature is here, though nobody put
together a patch as far as I can see:
http://comments.gmane.org/gmane.comp.version-control.git/173873


-Mostyn.

Mostyn Bramley-Moore (2):
  describe: add option to use perl-compatible regexes with --match
  describe: add basic and extended posix regex matching for completeness

 Documentation/git-describe.txt |  21 ++++++-
 builtin/describe.c             | 131 ++++++++++++++++++++++++++++++++++++++++-
 t/README                       |   3 +-
 t/t6120-describe.sh            |  38 ++++++++++--
 4 files changed, 184 insertions(+), 9 deletions(-)

-- 
2.5.0

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH/RFC v2 1/2] describe: add option to use perl-compatible regexes with --match
  2015-12-28 10:30 [PATCH/RFC v2 0/2] add regex match flags to git describe Mostyn Bramley-Moore
@ 2015-12-28 10:30 ` Mostyn Bramley-Moore
  2015-12-28 10:30 ` [PATCH/RFC v2 2/2] describe: add basic and extended posix regex matching for completeness Mostyn Bramley-Moore
  2015-12-28 20:30 ` [PATCH/RFC v2 0/2] add regex match flags to git describe Junio C Hamano
  2 siblings, 0 replies; 14+ messages in thread
From: Mostyn Bramley-Moore @ 2015-12-28 10:30 UTC (permalink / raw
  To: git; +Cc: Mostyn Bramley-Moore, brian m . carlson

This allows more flexible pattern matching than the default globs.

Signed-off-by: Mostyn Bramley-Moore <mostynb@opera.com>
---
 Documentation/git-describe.txt | 15 +++++--
 builtin/describe.c             | 90 +++++++++++++++++++++++++++++++++++++++++-
 t/README                       |  3 +-
 t/t6120-describe.sh            | 24 +++++++++--
 4 files changed, 123 insertions(+), 9 deletions(-)

diff --git a/Documentation/git-describe.txt b/Documentation/git-describe.txt
index c8f28c8..b8279ec 100644
--- a/Documentation/git-describe.txt
+++ b/Documentation/git-describe.txt
@@ -81,9 +81,18 @@ OPTIONS
 	that points at object deadbee....).
 
 --match <pattern>::
-	Only consider tags matching the given `glob(7)` pattern,
-	excluding the "refs/tags/" prefix.  This can be used to avoid
-	leaking private tags from the repository.
+	Only consider tags matching the given pattern, excluding the
+	"refs/tags/" prefix.  This can be used to avoid leaking private
+	tags from the repository. The pattern type can be chosen by
+	specifying --glob or --perl-regexp.
+
+--glob::
+	Use `glob(7)` patterns with --match, this is the default.
+
+-P::
+--perl-regexp::
+	Use Perl-compatible regexp for patterns with --match. Requires
+	libpcre to be compiled in.
 
 --always::
 	Show uniquely abbreviated commit object as fallback.
diff --git a/builtin/describe.c b/builtin/describe.c
index 8a25abe..d9ab5bd 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -10,6 +10,10 @@
 #include "hashmap.h"
 #include "argv-array.h"
 
+#ifdef USE_LIBPCRE
+#include <pcre.h>
+#endif
+
 #define SEEN		(1u << 0)
 #define MAX_TAGS	(FLAG_BITS - 1)
 
@@ -19,6 +23,11 @@ static const char * const describe_usage[] = {
 	NULL
 };
 
+enum match_type {
+	MATCH_GLOB = 0,
+	MATCH_PCRE
+};
+
 static int debug;	/* Display lots of verbose info */
 static int all;	/* Any valid ref can be used */
 static int tags;	/* Allow lightweight tags */
@@ -31,6 +40,12 @@ static int have_util;
 static const char *pattern;
 static int always;
 static const char *dirty;
+static enum match_type pattern_type_arg = MATCH_GLOB;
+
+#ifdef USE_LIBPCRE
+static pcre *pcre_regex = NULL;
+static pcre_extra *extra = NULL;
+#endif
 
 /* diff-index command arguments to check if working tree is dirty. */
 static const char *diff_index_args[] = {
@@ -119,6 +134,61 @@ static void add_to_known_names(const char *path,
 	}
 }
 
+#ifdef USE_LIBPCRE
+static void pcre_init()
+{
+	const char *error = NULL;
+	int erroffset = -1;
+	int opts = PCRE_MULTILINE;
+
+	pcre_regex = pcre_compile(pattern, opts, &error, &erroffset, NULL);
+	if (!pcre_regex)
+		die("invalid PCRE at position %d of \'%s\': %s",
+			erroffset, pattern, error);
+
+	extra = pcre_study(pcre_regex, 0, &error);
+	if (!extra && error)
+		die("%s", error);
+}
+
+static void pcre_cleanup()
+{
+	pcre_free(pcre_regex);
+	pcre_regex = NULL;
+	pcre_free(extra);
+	extra = NULL;
+}
+
+/* return 1 on match, 0 on no match. */
+static int pcre_match(const char *text)
+{
+	int ovector[3], flags = 0;
+	int ret = pcre_exec(pcre_regex, extra,
+	                    text, strlen(text), 0,
+	                    flags, ovector, ARRAY_SIZE(ovector));
+	if (ret < 0 && ret != PCRE_ERROR_NOMATCH)
+		die("pcre_exec failed with error code %d", ret);
+	else if (ret > 0)
+		return 0; /* no match */
+	else
+		return 1; /* match */
+}
+#endif
+
+/* Return 1 on match, 0 on no match. */
+static int match(const char *pattern, const char *text, enum match_type t)
+{
+	if (t == MATCH_GLOB) {
+		return wildmatch(pattern, text, 0, NULL);
+#ifdef USE_LIBPCRE
+	} else if (t == MATCH_PCRE) {
+		return pcre_match(text);
+#endif
+	}
+
+	assert(0); /* None of the alternatives above were used. */
+}
+
 static int get_name(const char *path, const struct object_id *oid, int flag, void *cb_data)
 {
 	int is_tag = starts_with(path, "refs/tags/");
@@ -130,7 +200,7 @@ static int get_name(const char *path, const struct object_id *oid, int flag, voi
 		return 0;
 
 	/* Accept only tags that match the pattern, if given */
-	if (pattern && (!is_tag || wildmatch(pattern, path + 10, 0, NULL)))
+	if (pattern && (!is_tag || match(pattern, path + 10, pattern_type_arg)))
 		return 0;
 
 	/* Is it annotated? */
@@ -406,6 +476,14 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
 			    N_("consider <n> most recent tags (default: 10)")),
 		OPT_STRING(0, "match",       &pattern, N_("pattern"),
 			   N_("only consider tags matching <pattern>")),
+		OPT_SET_INT(0, "glob", &pattern_type_arg,
+			   N_("use glob patterns with --match (default)"),
+			   MATCH_GLOB),
+#ifdef USE_LIBPCRE
+		OPT_SET_INT('P', "perl-regexp", &pattern_type_arg,
+			   N_("use Perl-compatible regular expressions with --match"),
+			   MATCH_PCRE),
+#endif
 		OPT_BOOL(0, "always",        &always,
 			N_("show abbreviated commit object as fallback")),
 		{OPTION_STRING, 0, "dirty",  &dirty, N_("mark"),
@@ -429,6 +507,11 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
 	if (longformat && abbrev == 0)
 		die(_("--long is incompatible with --abbrev=0"));
 
+#ifdef USE_LIBPCRE
+	if (pattern && pattern_type_arg == MATCH_PCRE)
+		pcre_init();
+#endif
+
 	if (contains) {
 		struct argv_array args;
 
@@ -455,6 +538,11 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
 	if (!names.size && !always)
 		die(_("No names found, cannot describe anything."));
 
+#ifdef USE_LIBPCRE
+	if (pattern && pattern_type_arg == MATCH_PCRE)
+		pcre_cleanup();
+#endif
+
 	if (argc == 0) {
 		if (dirty) {
 			static struct lock_file index_lock;
diff --git a/t/README b/t/README
index 1dc908e..207257e 100644
--- a/t/README
+++ b/t/README
@@ -798,7 +798,8 @@ use these, and "test_set_prereq" for how to define your own.
  - LIBPCRE
 
    Git was compiled with USE_LIBPCRE=YesPlease. Wrap any tests
-   that use git-grep --perl-regexp or git-grep -P in these.
+   that use the --perl-regexp or -P flags to git-grep or
+   git-describe in these.
 
  - CASE_INSENSITIVE_FS
 
diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 85f2694..47427c4 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -12,13 +12,14 @@ test_description='test describe
 '
 . ./test-lib.sh
 
-check_describe () {
-	expect="$1"
-	shift
+_check_describe () {
+	pcre="$1"
+	expect="$2"
+	shift 2
 	R=$(git describe "$@" 2>err.actual)
 	S=$?
 	cat err.actual >&3
-	test_expect_success "describe $*" '
+	test_expect_success $pcre "describe $*" '
 	test $S = 0 &&
 	case "$R" in
 	$expect)	echo happy ;;
@@ -28,6 +29,14 @@ check_describe () {
 	'
 }
 
+check_describe() {
+	_check_describe "" $*
+}
+
+check_describe_pcre() {
+	_check_describe LIBPCRE $*
+}
+
 test_expect_success setup '
 
 	test_tick &&
@@ -175,12 +184,19 @@ test_expect_success 'set-up matching pattern tests' '
 '
 
 check_describe "test-annotated-*" --match="test-*"
+check_describe_pcre "test-annotated-*" --match="^test-" --perl-regexp
 
 check_describe "test1-lightweight-*" --tags --match="test1-*"
+check_describe_pcre "test1-lightweight-*" --tags --match="^test1-" \
+	--perl-regexp
 
 check_describe "test2-lightweight-*" --tags --match="test2-*"
+check_describe_pcre "test2-lightweight-*" --tags --match="^test2-" \
+	--perl-regexp
 
 check_describe "test2-lightweight-*" --long --tags --match="test2-*" HEAD^
+check_describe_pcre "test2-lightweight-*" --long --tags --match="^test2-" \
+	--perl-regexp HEAD^
 
 test_expect_success 'name-rev with exact tags' '
 	echo A >expect &&
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH/RFC v2 2/2] describe: add basic and extended posix regex matching for completeness
  2015-12-28 10:30 [PATCH/RFC v2 0/2] add regex match flags to git describe Mostyn Bramley-Moore
  2015-12-28 10:30 ` [PATCH/RFC v2 1/2] describe: add option to use perl-compatible regexes with --match Mostyn Bramley-Moore
@ 2015-12-28 10:30 ` Mostyn Bramley-Moore
  2015-12-28 20:30 ` [PATCH/RFC v2 0/2] add regex match flags to git describe Junio C Hamano
  2 siblings, 0 replies; 14+ messages in thread
From: Mostyn Bramley-Moore @ 2015-12-28 10:30 UTC (permalink / raw
  To: git; +Cc: Mostyn Bramley-Moore, brian m . carlson

Signed-off-by: Mostyn Bramley-Moore <mostynb@opera.com>
---
 Documentation/git-describe.txt |  6 ++++++
 builtin/describe.c             | 41 +++++++++++++++++++++++++++++++++++++++++
 t/t6120-describe.sh            | 14 ++++++++++++++
 3 files changed, 61 insertions(+)

diff --git a/Documentation/git-describe.txt b/Documentation/git-describe.txt
index b8279ec..0c237c3 100644
--- a/Documentation/git-describe.txt
+++ b/Documentation/git-describe.txt
@@ -89,6 +89,12 @@ OPTIONS
 --glob::
 	Use `glob(7)` patterns with --match, this is the default.
 
+-E::
+--extended-regexp::
+-G::
+--basic-regexp::
+	Use POSIX extended/basic regexp for patterns with --match.
+
 -P::
 --perl-regexp::
 	Use Perl-compatible regexp for patterns with --match. Requires
diff --git a/builtin/describe.c b/builtin/describe.c
index d9ab5bd..3bff610 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -9,6 +9,7 @@
 #include "diff.h"
 #include "hashmap.h"
 #include "argv-array.h"
+#include <regex.h>
 
 #ifdef USE_LIBPCRE
 #include <pcre.h>
@@ -25,6 +26,8 @@ static const char * const describe_usage[] = {
 
 enum match_type {
 	MATCH_GLOB = 0,
+	MATCH_BRE,
+	MATCH_ERE,
 	MATCH_PCRE
 };
 
@@ -41,6 +44,7 @@ static const char *pattern;
 static int always;
 static const char *dirty;
 static enum match_type pattern_type_arg = MATCH_GLOB;
+static regex_t posix_regex;
 
 #ifdef USE_LIBPCRE
 static pcre *pcre_regex = NULL;
@@ -134,6 +138,27 @@ static void add_to_known_names(const char *path,
 	}
 }
 
+static void re_init(enum match_type type)
+{
+	int cflags = REG_NOSUB;
+	if (type == MATCH_ERE)
+		cflags |= REG_EXTENDED;
+
+	if (regcomp(&posix_regex, pattern, cflags))
+		die("Invalid regular expression: %s", pattern);
+}
+
+static void re_cleanup()
+{
+	regfree(&posix_regex);
+}
+
+/* return 1 on match, 0 on no match. */
+static int posix_re_match(const char *text)
+{
+	return regexec(&posix_regex, text, 0, NULL, 0) != 0;
+}
+
 #ifdef USE_LIBPCRE
 static void pcre_init()
 {
@@ -180,6 +205,8 @@ static int match(const char *pattern, const char *text, enum match_type t)
 {
 	if (t == MATCH_GLOB) {
 		return wildmatch(pattern, text, 0, NULL);
+	} else if (t == MATCH_ERE || t == MATCH_BRE) {
+		return posix_re_match(text);
 #ifdef USE_LIBPCRE
 	} else if (t == MATCH_PCRE) {
 		return pcre_match(text);
@@ -479,6 +506,12 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
 		OPT_SET_INT(0, "glob", &pattern_type_arg,
 			   N_("use glob patterns with --match (default)"),
 			   MATCH_GLOB),
+		OPT_SET_INT('E', "extended-regexp", &pattern_type_arg,
+			   N_("use extended POSIX regular expressions with --match"),
+			   MATCH_ERE),
+		OPT_SET_INT('G', "basic-regexp", &pattern_type_arg,
+			   N_("use basic POSIX regular expressions with --match"),
+			   MATCH_BRE),
 #ifdef USE_LIBPCRE
 		OPT_SET_INT('P', "perl-regexp", &pattern_type_arg,
 			   N_("use Perl-compatible regular expressions with --match"),
@@ -507,6 +540,10 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
 	if (longformat && abbrev == 0)
 		die(_("--long is incompatible with --abbrev=0"));
 
+	if (pattern && (pattern_type_arg == MATCH_ERE ||
+			pattern_type_arg == MATCH_BRE))
+		re_init(pattern_type_arg);
+
 #ifdef USE_LIBPCRE
 	if (pattern && pattern_type_arg == MATCH_PCRE)
 		pcre_init();
@@ -538,6 +575,10 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
 	if (!names.size && !always)
 		die(_("No names found, cannot describe anything."));
 
+	if (pattern && (pattern_type_arg == MATCH_ERE ||
+			pattern_type_arg == MATCH_BRE))
+		re_cleanup();
+
 #ifdef USE_LIBPCRE
 	if (pattern && pattern_type_arg == MATCH_PCRE)
 		pcre_cleanup();
diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 47427c4..e3d8663 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -184,17 +184,31 @@ test_expect_success 'set-up matching pattern tests' '
 '
 
 check_describe "test-annotated-*" --match="test-*"
+check_describe "test-annotated-*" --match="^test-" --basic-regexp
+check_describe "test-annotated-*" --match="^test-" --extended-regexp
 check_describe_pcre "test-annotated-*" --match="^test-" --perl-regexp
 
 check_describe "test1-lightweight-*" --tags --match="test1-*"
+check_describe "test1-lightweight-*" --tags --match="^test1-" \
+	--extended-regexp
+check_describe "test1-lightweight-*" --tags --match="^test1-" \
+	--basic-regexp
 check_describe_pcre "test1-lightweight-*" --tags --match="^test1-" \
 	--perl-regexp
 
 check_describe "test2-lightweight-*" --tags --match="test2-*"
+check_describe "test2-lightweight-*" --tags --match="^test2-" \
+	--extended-regexp
+check_describe "test2-lightweight-*" --tags --match="^test2-" \
+	--basic-regexp
 check_describe_pcre "test2-lightweight-*" --tags --match="^test2-" \
 	--perl-regexp
 
 check_describe "test2-lightweight-*" --long --tags --match="test2-*" HEAD^
+check_describe "test2-lightweight-*" --long --tags --match="test2-*" \
+	--extended-regexp HEAD^
+check_describe "test2-lightweight-*" --long --tags --match="test2-*" \
+	--basic-regexp HEAD^
 check_describe_pcre "test2-lightweight-*" --long --tags --match="^test2-" \
 	--perl-regexp HEAD^
 
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH/RFC v2 0/2] add regex match flags to git describe
  2015-12-28 10:30 [PATCH/RFC v2 0/2] add regex match flags to git describe Mostyn Bramley-Moore
  2015-12-28 10:30 ` [PATCH/RFC v2 1/2] describe: add option to use perl-compatible regexes with --match Mostyn Bramley-Moore
  2015-12-28 10:30 ` [PATCH/RFC v2 2/2] describe: add basic and extended posix regex matching for completeness Mostyn Bramley-Moore
@ 2015-12-28 20:30 ` Junio C Hamano
  2015-12-29  0:13   ` Mostyn Bramley-Moore
  2 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2015-12-28 20:30 UTC (permalink / raw
  To: Mostyn Bramley-Moore; +Cc: git, brian m . carlson

Mostyn Bramley-Moore <mostynb@opera.com> writes:

> git describe currently only supports glob matching with the --matches flag.
> It would be useful to support regular expressions.
>
> For consistency, this uses the same regex flags as those used by git-grep.
>
> Some old discussion of this as a candidate feature is here, though nobody put
> together a patch as far as I can see:
> http://comments.gmane.org/gmane.comp.version-control.git/173873

Thanks.

I do not think it is wrong per-se to add an option to use regular
expressions instead of globs, but if we are to do so, the endgame we
aim for MUST be that we do so consistently to all the other commands
that iterate over refs and limit their output to the ones that match
given pattern (or a set of patterns), not just 'describe'.

Even if we are not adding such an option to these other commands
right now (yet), we at least need to know what these commands are
(e.g. "git tag -l" and "git for-each-ref" immediately come to mind,
but there may be others), and make sure that the option names you
choose here can be used sensibly in their context.  I think "tag"
and "for-each-ref" do no pattern matching against anything other
than the refnames, so it would be clear what a new --perl-regexp
option does in their contexts.

Unlike "grep" whose sole point is to perform pattern matching, the
filtering of refs these commands do is merely a very small tweak in
the overall picture (e.g. "git tag --contains $commit -l $pattern"
does filter by matching $pattern against the refname, but that is a
small detail compared to the filtering done by the reachability with
the $commit), so I am not sure if short -E/-G/-F/-P should be given
to these commands like "grep" does, though.  These commands may have
better uses for these shorter option names.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH/RFC v2 0/2] add regex match flags to git describe
  2015-12-28 20:30 ` [PATCH/RFC v2 0/2] add regex match flags to git describe Junio C Hamano
@ 2015-12-29  0:13   ` Mostyn Bramley-Moore
  2015-12-29 18:27     ` Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Mostyn Bramley-Moore @ 2015-12-29  0:13 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, brian m . carlson

On 12/28/2015 09:30 PM, Junio C Hamano wrote:
> Mostyn Bramley-Moore <mostynb@opera.com> writes:
>
>> git describe currently only supports glob matching with the --matches flag.
>> It would be useful to support regular expressions.
>>
>> For consistency, this uses the same regex flags as those used by git-grep.
>>
>> Some old discussion of this as a candidate feature is here, though nobody put
>> together a patch as far as I can see:
>> http://comments.gmane.org/gmane.comp.version-control.git/173873
>
> Thanks.
>
> I do not think it is wrong per-se to add an option to use regular
> expressions instead of globs, but if we are to do so, the endgame we
> aim for MUST be that we do so consistently to all the other commands
> that iterate over refs and limit their output to the ones that match
> given pattern (or a set of patterns), not just 'describe'.

There is one important distinction between 'git describe' and the other 
commands that iterate through refs- it applies an internal search 
strategy and outputs at most one match.  This makes it difficult to 
search for the closest matching tag that matches any item in a set of 
patterns without some messy scripting (ie run multiple 'git describe' 
commands, for each pattern, then figure out which result is the one you 
want).

> Even if we are not adding such an option to these other commands
> right now (yet), we at least need to know what these commands are
> (e.g. "git tag -l" and "git for-each-ref" immediately come to mind,
> but there may be others), and make sure that the option names you
> choose here can be used sensibly in their context.  I think "tag"
> and "for-each-ref" do no pattern matching against anything other
> than the refnames, so it would be clear what a new --perl-regexp
> option does in their contexts.

The commands that you mention produce a list of results, which can 
easily be processed by an external util (eg grep itself), and might not 
need similar arguments to be added.

Adding regex pattern matching to 'git describe' is just one reasonable 
(IMO) solution, but there are a couple of alternatives that might suffice:
1) Add an option to print all tags, in a well-defined/useful order. 
Then the results would be easy to process by an external command too.
2) Add support for multiple patterns, and print the first found item 
that matches any of them.

> Unlike "grep" whose sole point is to perform pattern matching, the
> filtering of refs these commands do is merely a very small tweak in
> the overall picture (e.g. "git tag --contains $commit -l $pattern"
> does filter by matching $pattern against the refname, but that is a
> small detail compared to the filtering done by the reachability with
> the $commit), so I am not sure if short -E/-G/-F/-P should be given
> to these commands like "grep" does, though.  These commands may have
> better uses for these shorter option names.

Besides 'git grep', the only regex type flag that is given a short 
option seems to be -E for 'git log' and 'git rev-list'.  I have no 
objection to dropping the short options, or leaving only -E.


-Mostyn.
-- 
Mostyn Bramley-Moore
TV and Connected Devices
Opera Software ASA
mostynb@opera.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH/RFC v2 0/2] add regex match flags to git describe
  2015-12-29  0:13   ` Mostyn Bramley-Moore
@ 2015-12-29 18:27     ` Junio C Hamano
  2015-12-30  9:52       ` Duy Nguyen
  2015-12-31  0:00       ` Mostyn Bramley-Moore
  0 siblings, 2 replies; 14+ messages in thread
From: Junio C Hamano @ 2015-12-29 18:27 UTC (permalink / raw
  To: Mostyn Bramley-Moore; +Cc: git, brian m . carlson

Mostyn Bramley-Moore <mostynb@opera.com> writes:

>> I do not think it is wrong per-se to add an option to use regular
>> expressions instead of globs, but if we are to do so, the endgame we
>> aim for MUST be that we do so consistently to all the other commands
>> that iterate over refs and limit their output to the ones that match
>> given pattern (or a set of patterns), not just 'describe'.
>
> There is one important distinction between 'git describe' and the
> other commands that iterate through refs- it applies an internal
> search strategy and outputs at most one match.  This makes it
> difficult to search for the closest matching tag...

If that was what you were trying to solve, then it sounds like a
typical XY problem.  You do not need custom matching flags; you need
a "give me N (or all) names based on possible tags" option.

And I do not think it is a bad thing to add.  I already
said that an option to match with a regular expression is not a bad
thing to add, either ;-)

> Besides 'git grep', the only regex type flag that is given a short
> option seems to be -E for 'git log' and 'git rev-list'.  I have no
> objection to dropping the short options, or leaving only -E.

They also take -F, but "log" and friends do not pattern match the
refnames, so I do not think you have to worry about them at the
moment.

It is more important to envision what we would do in the future when
a command that takes a pattern (or a set of patterns) to match the
refnames with _and_ another pattern (or a set of patterns) to match
something else, and take that into account when designing this
"allowing matching logic for refnames to be customized from glob to
something else" feature, so that we do not paint outselves into a
corner we cannot later get out of.  Imagine a hypothetical command
'git mgrep' that can look for a pattern in tips of multiple branches
that can be used this way:

    $ git mgrep -e 'froo*tz' --refs 'refs/heads/release/*'

which may look for strings that match "froo*tz" in the trees of
all branches whose name match the pattern 'release/*'.  In this
example, the pattern to match strings is a BRE (same default as 
"git grep"), and the pattern to match refnames is a glob.

Consistency & similarity with "git grep" would most likely lead us
to add -E/-F/-G/-P options to this command and to make it affect how
the pattern to match strings works.  For example:

    $ git mgrep -E -e 'fro+tz' --match-refs 'refs/heads/release/*'

may look for the same strings that would match the first example,
but the pattern is expressed in ERE.  "-P", "-G", and "-F" options
would also work the same way.

Now, the question is what this "-E" (or -P/-G/-F) should do with the
matching the command does with the refnames.  The easiest (and
laziest) way out from the implementors' point of view might be to
declare that they affect both at the same time.  But is that useful
in practice?  It probably isn't, as it forces the users to write

    $ git mgrep -E -e 'fro+tz' --match-refs 'refs/heads/release/.*'

because the ref matching suddenly starts to use ERE (not glob),
which most likely is not something users would expect.  So we may
need a separate set of options to affect the way how refs are
matched.

We cannot just say "but we do not have that 'mgrep' command yet, so
we can do whatever we want to do with 'describe' today".  When the
need eventually arises that requires us to be able to specify how
strings are matched and how refnames are matched independently, we
would end up with an inconsistent UI where 'describe' takes '-P' (or
'--perl-regexp') to affect the way how refnames are matched, while
commands like 'mgrep' would need to use '--refmatch-perl-regexp' (or
any other name that can be distinguished from '--perl-regexp') to do
the same thing because they do not want '--perl-regexp' to affect
the matching of refnames.

And at that point in the future, it is too late to fix 'describe',
as people are so used to use '--perl-regexp' to match with refs.  We
will forever regret that we did not give the option a name that can
be used independently from the existing '--perl-regexp' that is
about matching for strings, not refnames.

That is exactly the kind of thing that would paint us in a corner
that we cannot get out of, which we need to avoid, hence we need to
think ahead now.
 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH/RFC v2 0/2] add regex match flags to git describe
  2015-12-29 18:27     ` Junio C Hamano
@ 2015-12-30  9:52       ` Duy Nguyen
  2015-12-31  0:08         ` Mostyn Bramley-Moore
  2015-12-31  0:00       ` Mostyn Bramley-Moore
  1 sibling, 1 reply; 14+ messages in thread
From: Duy Nguyen @ 2015-12-30  9:52 UTC (permalink / raw
  To: Junio C Hamano; +Cc: Mostyn Bramley-Moore, Git Mailing List, brian m . carlson

On Wed, Dec 30, 2015 at 1:27 AM, Junio C Hamano <gitster@pobox.com> wrote:
> It is more important to envision what we would do in the future when
> a command that takes a pattern (or a set of patterns) to match the
> refnames with _and_ another pattern (or a set of patterns) to match
> something else, and take that into account when designing this
> "allowing matching logic for refnames to be customized from glob to
> something else" feature, so that we do not paint outselves into a
> corner we cannot later get out of.  Imagine a hypothetical command
> 'git mgrep' that can look for a pattern in tips of multiple branches
> that can be used this way:
>
>     $ git mgrep -e 'froo*tz' --refs 'refs/heads/release/*'
>
> which may look for strings that match "froo*tz" in the trees of
> all branches whose name match the pattern 'release/*'.  In this
> example, the pattern to match strings is a BRE (same default as
> "git grep"), and the pattern to match refnames is a glob.
>
> Consistency & similarity with "git grep" would most likely lead us
> to add -E/-F/-G/-P options to this command and to make it affect how
> the pattern to match strings works.  For example:
>
>     $ git mgrep -E -e 'fro+tz' --match-refs 'refs/heads/release/*'
>
> may look for the same strings that would match the first example,
> but the pattern is expressed in ERE.  "-P", "-G", and "-F" options
> would also work the same way.
>
> Now, the question is what this "-E" (or -P/-G/-F) should do with the
> matching the command does with the refnames.  The easiest (and
> laziest) way out from the implementors' point of view might be to
> declare that they affect both at the same time.
> ...

What about not using command line options? We could go with something
like pathspec magic. I think as long as we provide three options:
literal, some pattern matching and backquote, then the user has enough
flexibility to specify any set of refs. For pattern matching I'd
prefer wildmatch. regex can be used via backquote, e.g. `for-each-ref
| grep ...` (or teach for-each-ref about regex to avoid grep).
-- 
Duy

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH/RFC v2 0/2] add regex match flags to git describe
  2015-12-29 18:27     ` Junio C Hamano
  2015-12-30  9:52       ` Duy Nguyen
@ 2015-12-31  0:00       ` Mostyn Bramley-Moore
  2015-12-31  0:23         ` Junio C Hamano
  1 sibling, 1 reply; 14+ messages in thread
From: Mostyn Bramley-Moore @ 2015-12-31  0:00 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, brian m . carlson

On 12/29/2015 07:27 PM, Junio C Hamano wrote:
> Mostyn Bramley-Moore <mostynb@opera.com> writes:
>
>>> I do not think it is wrong per-se to add an option to use regular
>>> expressions instead of globs, but if we are to do so, the endgame we
>>> aim for MUST be that we do so consistently to all the other commands
>>> that iterate over refs and limit their output to the ones that match
>>> given pattern (or a set of patterns), not just 'describe'.
>>
>> There is one important distinction between 'git describe' and the
>> other commands that iterate through refs- it applies an internal
>> search strategy and outputs at most one match.  This makes it
>> difficult to search for the closest matching tag...
>
> If that was what you were trying to solve, then it sounds like a
> typical XY problem.  You do not need custom matching flags; you need
> a "give me N (or all) names based on possible tags" option.

I can submit a separate patch for this, if you think it makes sense. 
Some possibilities that spring to mind:
* --results=<all|N>
* --num-results=<all|N>
* --show-matches=<all|N>
* --multiple-results[=<N>]
* --all-matches
* -<N>

> And I do not think it is a bad thing to add.  I already
> said that an option to match with a regular expression is not a bad
> thing to add, either ;-)
>
>> Besides 'git grep', the only regex type flag that is given a short
>> option seems to be -E for 'git log' and 'git rev-list'.  I have no
>> objection to dropping the short options, or leaving only -E.
>
> They also take -F, but "log" and friends do not pattern match the
> refnames, so I do not think you have to worry about them at the
> moment.
>
> It is more important to envision what we would do in the future when
> a command that takes a pattern (or a set of patterns) to match the
> refnames with _and_ another pattern (or a set of patterns) to match
> something else, and take that into account when designing this
> "allowing matching logic for refnames to be customized from glob to
> something else" feature, so that we do not paint outselves into a
> corner we cannot later get out of.  Imagine a hypothetical command
> 'git mgrep' that can look for a pattern in tips of multiple branches
> that can be used this way:
>
>      $ git mgrep -e 'froo*tz' --refs 'refs/heads/release/*'
>
> which may look for strings that match "froo*tz" in the trees of
> all branches whose name match the pattern 'release/*'.  In this
> example, the pattern to match strings is a BRE (same default as
> "git grep"), and the pattern to match refnames is a glob.
>
> Consistency & similarity with "git grep" would most likely lead us
> to add -E/-F/-G/-P options to this command and to make it affect how
> the pattern to match strings works.  For example:
>
>      $ git mgrep -E -e 'fro+tz' --match-refs 'refs/heads/release/*'
>
> may look for the same strings that would match the first example,
> but the pattern is expressed in ERE.  "-P", "-G", and "-F" options
> would also work the same way.
>
> Now, the question is what this "-E" (or -P/-G/-F) should do with the
> matching the command does with the refnames.  The easiest (and
> laziest) way out from the implementors' point of view might be to
> declare that they affect both at the same time.  But is that useful
> in practice?  It probably isn't, as it forces the users to write
>
>      $ git mgrep -E -e 'fro+tz' --match-refs 'refs/heads/release/.*'
>
> because the ref matching suddenly starts to use ERE (not glob),
> which most likely is not something users would expect.  So we may
> need a separate set of options to affect the way how refs are
> matched.
>
> We cannot just say "but we do not have that 'mgrep' command yet, so
> we can do whatever we want to do with 'describe' today".  When the
> need eventually arises that requires us to be able to specify how
> strings are matched and how refnames are matched independently, we
> would end up with an inconsistent UI where 'describe' takes '-P' (or
> '--perl-regexp') to affect the way how refnames are matched, while
> commands like 'mgrep' would need to use '--refmatch-perl-regexp' (or
> any other name that can be distinguished from '--perl-regexp') to do
> the same thing because they do not want '--perl-regexp' to affect
> the matching of refnames.
>
> And at that point in the future, it is too late to fix 'describe',
> as people are so used to use '--perl-regexp' to match with refs.  We
> will forever regret that we did not give the option a name that can
> be used independently from the existing '--perl-regexp' that is
> about matching for strings, not refnames.
>
> That is exactly the kind of thing that would paint us in a corner
> that we cannot get out of, which we need to avoid, hence we need to
> think ahead now.

OK, brainstorming a bit, how about either of these:

1) 
--match-pattern-type=<glob|fixed-strings|basic-regexp|extended-regexp|perl-regexp>

It's a bit lengthy (maybe --match-type would be sufficient), but I like 
that the value names are shared with git grep etc option names.  And it 
seems future-proof- if we ever need to support different pattern types 
for other arguments, a --foo-pattern-type flag could be added and make 
obvious sense.

2) Interpret --match patterns that start and end with / as regular 
expressions, and just pick one regex type to support.  I would suggest 
extended posix regex (since it's supported in all builds unlike PCRE). 
Downsides: some people might assume this is PCRE, also if we ever wanted 
to support other regex types we would need to add another option like 
(1), which would then make this feature redundant.

I prefer (1).


-Mostyn.
-- 
Mostyn Bramley-Moore
TV and Connected Devices
Opera Software ASA
mostynb@opera.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH/RFC v2 0/2] add regex match flags to git describe
  2015-12-30  9:52       ` Duy Nguyen
@ 2015-12-31  0:08         ` Mostyn Bramley-Moore
  0 siblings, 0 replies; 14+ messages in thread
From: Mostyn Bramley-Moore @ 2015-12-31  0:08 UTC (permalink / raw
  To: Duy Nguyen, Junio C Hamano; +Cc: Git Mailing List, brian m . carlson

On 12/30/2015 10:52 AM, Duy Nguyen wrote:
> What about not using command line options? We could go with something
> like pathspec magic. I think as long as we provide three options:
> literal, some pattern matching and backquote, then the user has enough
> flexibility to specify any set of refs. For pattern matching I'd
> prefer wildmatch. regex can be used via backquote, e.g. `for-each-ref
> | grep ...` (or teach for-each-ref about regex to avoid grep).

I don't think that would solve my immediate problem, unless git 
for-each-ref knew how to produce output in the same order that git 
describe searches.  But perhaps I misunderstand what you're suggesting 
with the backquote/subshell.


-Mostyn.
-- 
Mostyn Bramley-Moore
TV and Connected Devices
Opera Software ASA
mostynb@opera.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH/RFC v2 0/2] add regex match flags to git describe
  2015-12-31  0:00       ` Mostyn Bramley-Moore
@ 2015-12-31  0:23         ` Junio C Hamano
  2015-12-31 10:07           ` Mostyn Bramley-Moore
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2015-12-31  0:23 UTC (permalink / raw
  To: Mostyn Bramley-Moore; +Cc: git, Duy Nguyen, brian m . carlson

Mostyn Bramley-Moore <mostynb@opera.com> writes:

> OK, brainstorming a bit, how about either of these:
>
> 1)
> --match-pattern-type=<glob|fixed-strings|basic-regexp|extended-regexp|perl-regexp>
>
> It's a bit lengthy (maybe --match-type would be sufficient), but I
> like that the value names are shared with git grep etc option names.
> And it seems future-proof- if we ever need to support different
> pattern types for other arguments, a --foo-pattern-type flag could be
> added and make obvious sense.

Swapping the option key and value may not be a bad idea, but one
problem that the above does not solve, which I outlined in the
message you are responding to, is that "match-pattern-type" does not
give any hint that this is about affecting the match that is done to
"refs", e.g. you cannot tell in

  $ git mgrep --match-pattern-type=perl-regexp -e foo --refs 'release_*'

if the perl-regexp is to be used for matching branch names or for
matching the strings the command looks for in the trees of the
matching branches.

Magic pattern annotation like we do for pathspecs Duy raised may not
be a bad idea, either, and would probably be easier to teach people.
Just like in Perl "(?i)$any_pattern" is a way to introduce the case
insensitive match with $any_pattern, we may be able to pick an
extensible magic syntax and decorate the pattern you would specify
for matching refnames to tell Git what kind of pattern it is, e.g.

  $ git mgrep -P -e foo --refs '/?glob/release_*'

I am not suggesting that we must use /?<pattern type name>/ prefix
as the "extensible magic syntax" here--I am just illustrating what
I mean by "extensible magic syntax".

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH/RFC v2 0/2] add regex match flags to git describe
  2015-12-31  0:23         ` Junio C Hamano
@ 2015-12-31 10:07           ` Mostyn Bramley-Moore
  2016-01-04 17:46             ` Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Mostyn Bramley-Moore @ 2015-12-31 10:07 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, Duy Nguyen, brian m . carlson

On 12/31/2015 01:23 AM, Junio C Hamano wrote:
> Mostyn Bramley-Moore <mostynb@opera.com> writes:
>
>> OK, brainstorming a bit, how about either of these:
>>
>> 1)
>> --match-pattern-type=<glob|fixed-strings|basic-regexp|extended-regexp|perl-regexp>
>>
>> It's a bit lengthy (maybe --match-type would be sufficient), but I
>> like that the value names are shared with git grep etc option names.
>> And it seems future-proof- if we ever need to support different
>> pattern types for other arguments, a --foo-pattern-type flag could be
>> added and make obvious sense.
>
> Swapping the option key and value may not be a bad idea, but one
> problem that the above does not solve, which I outlined in the
> message you are responding to, is that "match-pattern-type" does not
> give any hint that this is about affecting the match that is done to
> "refs", e.g. you cannot tell in
>
>    $ git mgrep --match-pattern-type=perl-regexp -e foo --refs 'release_*'
>
> if the perl-regexp is to be used for matching branch names or for
> matching the strings the command looks for in the trees of the
> matching branches.

There is a hint: --foo-pattern-type applies only to --foo.

In your mgrep example --match-pattern-type would apply to --match 
patterns only, and if we choose to implement it then --refs-pattern-type 
would apply to --refs patterns only.

eg:
$ git mgrep --match '^foo' --match-pattern-type=perl-regexp --refs 
'release_*' --refs-pattern-type=glob

> Magic pattern annotation like we do for pathspecs Duy raised may not
> be a bad idea, either, and would probably be easier to teach people.
> Just like in Perl "(?i)$any_pattern" is a way to introduce the case
> insensitive match with $any_pattern, we may be able to pick an
> extensible magic syntax and decorate the pattern you would specify
> for matching refnames to tell Git what kind of pattern it is, e.g.
>
>    $ git mgrep -P -e foo --refs '/?glob/release_*'
>
> I am not suggesting that we must use /?<pattern type name>/ prefix
> as the "extensible magic syntax" here--I am just illustrating what
> I mean by "extensible magic syntax".

I hadn't seen the pathspec magic patterns before- interesting.  We could 
possibly share syntax with pathspecs, ie
:(?pattern-options...)pattern

Would this be confusing for commands that already have --perl-regexp 
etc?  What should happen if you specify both --perl-regexp and and a 
different type of pattern like :(glob)foo (error out, I suppose)?


-Mostyn.
-- 
Mostyn Bramley-Moore
TV and Connected Devices
Opera Software ASA
mostynb@opera.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH/RFC v2 0/2] add regex match flags to git describe
  2015-12-31 10:07           ` Mostyn Bramley-Moore
@ 2016-01-04 17:46             ` Junio C Hamano
  2016-01-06  1:08               ` Mostyn Bramley-Moore
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2016-01-04 17:46 UTC (permalink / raw
  To: Mostyn Bramley-Moore; +Cc: git, Duy Nguyen, brian m . carlson

Mostyn Bramley-Moore <mostynb@opera.com> writes:

> On 12/31/2015 01:23 AM, Junio C Hamano wrote:
> ...
>> Swapping the option key and value may not be a bad idea, but one
>> problem that the above does not solve, which I outlined in the
>> message you are responding to, is that "match-pattern-type" does not
>> give any hint that this is about affecting the match that is done to
>> "refs", e.g. you cannot tell in
>>
>>    $ git mgrep --match-pattern-type=perl-regexp -e foo --refs 'release_*'
>>
>> if the perl-regexp is to be used for matching branch names or for
>> matching the strings the command looks for in the trees of the
>> matching branches.
>
> There is a hint: --foo-pattern-type applies only to --foo.

Hmph.

> In your mgrep example --match-pattern-type would apply to --match
> patterns only, and if we choose to implement it then
> --refs-pattern-type would apply to --refs patterns only.
> eg:
> $ git mgrep --match '^foo' --match-pattern-type=perl-regexp --refs
> release_*' --refs-pattern-type=glob

Most likely the hypothetical "mgrep" would not use "--match" but use
"-e" to retain similarity to "grep", and "--e-pattern-type" would be
confusing.  But I agree that "--refs-pattern-type" uniformly used
where we take pattern parameter on the command line to match with
refs may make it clear that you are only affecting the matches
against refs.

>> Magic pattern annotation like we do for pathspecs Duy raised may not
>> be a bad idea, either, and would probably be easier to teach people.
>> Just like in Perl "(?i)$any_pattern" is a way to introduce the case
>> insensitive match with $any_pattern, we may be able to pick an
>> extensible magic syntax and decorate the pattern you would specify
>> for matching refnames to tell Git what kind of pattern it is, e.g.
>>
>>    $ git mgrep -P -e foo --refs '/?glob/release_*'
>>
>> I am not suggesting that we must use /?<pattern type name>/ prefix
>> as the "extensible magic syntax" here--I am just illustrating what
>> I mean by "extensible magic syntax".
>
> I hadn't seen the pathspec magic patterns before- interesting.  We
> could possibly share syntax with pathspecs, ie
> :(?pattern-options...)pattern

Even though we have DWIM between revisions and paths on the command
line when the user omits "--" for disambiguation, I do not think we
look at the shape of the string to DWIM/decide that it is a pattern,
so as long as the magic syntax cannot be a valid pattern to match
against refs right now (and your ":(?...)"  qualifies as such, as a
refname would not contain a component that begins with a colon), it
would be possible to introduce it as the magic syntax for matching
refs.

Or did you mean to use this syntax also for patterns that match
strings in contents, e.g.

    $ git grep -e ':(?perl-regexp)...'

I am not bold enough to say that it would be a good idea, but I
offhand do not think of a reason why we shouldn't go that route,
either.

> Would this be confusing for commands that already have --perl-regexp
> etc?  What should happen if you specify both --perl-regexp and and a
> different type of pattern like :(glob)foo (error out, I suppose)?

If we were to go that route, ideally, I would say that

    $ git grep --perl-regexp -e 'A' -e ':(?basic-regexp)B' -e ':(?fixed-string)C'

should match with A as pcre, B as BRE and C as a fixed string.

I do not offhand remember if we built the underlying grep machinery
in such a way that it is easy to extend it to allow such mixture,
though.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH/RFC v2 0/2] add regex match flags to git describe
  2016-01-04 17:46             ` Junio C Hamano
@ 2016-01-06  1:08               ` Mostyn Bramley-Moore
  2016-01-06 12:23                 ` Duy Nguyen
  0 siblings, 1 reply; 14+ messages in thread
From: Mostyn Bramley-Moore @ 2016-01-06  1:08 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, Duy Nguyen, brian m . carlson

On 01/04/2016 06:46 PM, Junio C Hamano wrote:
> Mostyn Bramley-Moore <mostynb@opera.com> writes:
>
>> On 12/31/2015 01:23 AM, Junio C Hamano wrote:
>> ...
>>> Swapping the option key and value may not be a bad idea, but one
>>> problem that the above does not solve, which I outlined in the
>>> message you are responding to, is that "match-pattern-type" does not
>>> give any hint that this is about affecting the match that is done to
>>> "refs", e.g. you cannot tell in
>>>
>>>     $ git mgrep --match-pattern-type=perl-regexp -e foo --refs 'release_*'
>>>
>>> if the perl-regexp is to be used for matching branch names or for
>>> matching the strings the command looks for in the trees of the
>>> matching branches.
>>
>> There is a hint: --foo-pattern-type applies only to --foo.
>
> Hmph.
>
>> In your mgrep example --match-pattern-type would apply to --match
>> patterns only, and if we choose to implement it then
>> --refs-pattern-type would apply to --refs patterns only.
>> eg:
>> $ git mgrep --match '^foo' --match-pattern-type=perl-regexp --refs
>> release_*' --refs-pattern-type=glob
>
> Most likely the hypothetical "mgrep" would not use "--match" but use
> "-e" to retain similarity to "grep", and "--e-pattern-type" would be
> confusing.  But I agree that "--refs-pattern-type" uniformly used
> where we take pattern parameter on the command line to match with
> refs may make it clear that you are only affecting the matches
> against refs.

I don't think -e foo --e-pattern-type=bar would be confusing.

>>> Magic pattern annotation like we do for pathspecs Duy raised may not
>>> be a bad idea, either, and would probably be easier to teach people.
>>> Just like in Perl "(?i)$any_pattern" is a way to introduce the case
>>> insensitive match with $any_pattern, we may be able to pick an
>>> extensible magic syntax and decorate the pattern you would specify
>>> for matching refnames to tell Git what kind of pattern it is, e.g.
>>>
>>>     $ git mgrep -P -e foo --refs '/?glob/release_*'
>>>
>>> I am not suggesting that we must use /?<pattern type name>/ prefix
>>> as the "extensible magic syntax" here--I am just illustrating what
>>> I mean by "extensible magic syntax".
>>
>> I hadn't seen the pathspec magic patterns before- interesting.  We
>> could possibly share syntax with pathspecs, ie
>> :(?pattern-options...)pattern
>
> Even though we have DWIM between revisions and paths on the command
> line when the user omits "--" for disambiguation, I do not think we
> look at the shape of the string to DWIM/decide that it is a pattern,
> so as long as the magic syntax cannot be a valid pattern to match
> against refs right now (and your ":(?...)"  qualifies as such, as a
> refname would not contain a component that begins with a colon), it
> would be possible to introduce it as the magic syntax for matching
> refs.
>
> Or did you mean to use this syntax also for patterns that match
> strings in contents, e.g.
>
>      $ git grep -e ':(?perl-regexp)...'

I think it would be nice to support this syntax in every command that 
does pattern matching.

Corner case: what if we want to search for ":(?perl-regexp)", eg in 
git's own source?  I suppose this would do:
git grep -e ":(?fixed-strings):(?perl-regexp)"

> I am not bold enough to say that it would be a good idea, but I
> offhand do not think of a reason why we shouldn't go that route,
> either.
>
>> Would this be confusing for commands that already have --perl-regexp
>> etc?  What should happen if you specify both --perl-regexp and and a
>> different type of pattern like :(glob)foo (error out, I suppose)?
>
> If we were to go that route, ideally, I would say that
>
>      $ git grep --perl-regexp -e 'A' -e ':(?basic-regexp)B' -e ':(?fixed-string)C'
>
> should match with A as pcre, B as BRE and C as a fixed string.

That makes sense.

> I do not offhand remember if we built the underlying grep machinery
> in such a way that it is easy to extend it to allow such mixture,
> though.

I believe this would require some moderate refactoring, but I have only 
looked briefly so far.  If we can settle on a preliminary design 
(--foo-pattern-type vs magic pattern option strings), I can try 
implementing a proof-of-concept.


-Mostyn.
-- 
Mostyn Bramley-Moore
TV and Connected Devices
Opera Software ASA
mostynb@opera.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH/RFC v2 0/2] add regex match flags to git describe
  2016-01-06  1:08               ` Mostyn Bramley-Moore
@ 2016-01-06 12:23                 ` Duy Nguyen
  0 siblings, 0 replies; 14+ messages in thread
From: Duy Nguyen @ 2016-01-06 12:23 UTC (permalink / raw
  To: Mostyn Bramley-Moore; +Cc: Junio C Hamano, Git Mailing List, brian m . carlson

On Wed, Jan 6, 2016 at 8:08 AM, Mostyn Bramley-Moore <mostynb@opera.com> wrote:
> On 01/04/2016 06:46 PM, Junio C Hamano wrote:
>>>> Magic pattern annotation like we do for pathspecs Duy raised may not
>>>> be a bad idea, either, and would probably be easier to teach people.
>>>> Just like in Perl "(?i)$any_pattern" is a way to introduce the case
>>>> insensitive match with $any_pattern, we may be able to pick an
>>>> extensible magic syntax and decorate the pattern you would specify
>>>> for matching refnames to tell Git what kind of pattern it is, e.g.
>>>>
>>>>     $ git mgrep -P -e foo --refs '/?glob/release_*'
>>>>
>>>> I am not suggesting that we must use /?<pattern type name>/ prefix
>>>> as the "extensible magic syntax" here--I am just illustrating what
>>>> I mean by "extensible magic syntax".
>>>
>>>
>>> I hadn't seen the pathspec magic patterns before- interesting.  We
>>> could possibly share syntax with pathspecs, ie
>>> :(?pattern-options...)pattern
>>
>>
>> Even though we have DWIM between revisions and paths on the command
>> line when the user omits "--" for disambiguation, I do not think we
>> look at the shape of the string to DWIM/decide that it is a pattern,
>> so as long as the magic syntax cannot be a valid pattern to match
>> against refs right now (and your ":(?...)"  qualifies as such, as a
>> refname would not contain a component that begins with a colon), it
>> would be possible to introduce it as the magic syntax for matching
>> refs.
>>
>> Or did you mean to use this syntax also for patterns that match
>> strings in contents, e.g.
>>
>>      $ git grep -e ':(?perl-regexp)...'
>
>
> I think it would be nice to support this syntax in every command that does
> pattern matching.
>
> Corner case: what if we want to search for ":(?perl-regexp)", eg in git's
> own source?  I suppose this would do:
> git grep -e ":(?fixed-strings):(?perl-regexp)"

We can also override it with command line options. If you define
--perl-regexp as "if no regex syntax is specified in the pattern, pcre
is used", then you can have something like --force-perl-regexp that
says "all patterns are in pcre, there's no magic whatsoever to choose
a different regex syntax". Though I think --perl-regexp should play
the role of --force-perl-regexp so you don't surprise current scripts,
and have a new option --default-syntax=<type> instead.
-- 
Duy

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2016-01-06 12:23 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-12-28 10:30 [PATCH/RFC v2 0/2] add regex match flags to git describe Mostyn Bramley-Moore
2015-12-28 10:30 ` [PATCH/RFC v2 1/2] describe: add option to use perl-compatible regexes with --match Mostyn Bramley-Moore
2015-12-28 10:30 ` [PATCH/RFC v2 2/2] describe: add basic and extended posix regex matching for completeness Mostyn Bramley-Moore
2015-12-28 20:30 ` [PATCH/RFC v2 0/2] add regex match flags to git describe Junio C Hamano
2015-12-29  0:13   ` Mostyn Bramley-Moore
2015-12-29 18:27     ` Junio C Hamano
2015-12-30  9:52       ` Duy Nguyen
2015-12-31  0:08         ` Mostyn Bramley-Moore
2015-12-31  0:00       ` Mostyn Bramley-Moore
2015-12-31  0:23         ` Junio C Hamano
2015-12-31 10:07           ` Mostyn Bramley-Moore
2016-01-04 17:46             ` Junio C Hamano
2016-01-06  1:08               ` Mostyn Bramley-Moore
2016-01-06 12:23                 ` Duy Nguyen

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).