git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/7] grep.c: teach --column to 'git-grep(1)'
@ 2018-06-18 23:43 Taylor Blau
  2018-06-18 23:43 ` [PATCH 1/7] Documentation/config.txt: camel-case lineNumber for consistency Taylor Blau
                   ` (11 more replies)
  0 siblings, 12 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-18 23:43 UTC (permalink / raw)
  To: git; +Cc: avarab, peff, gitster

Hi,

Attached is a ``fresh start'' of my series to teach 'git grep --column'.
Since the last time I sent this, much has changed, notably the semantics
for deciding which column is the first when given (1) extended
expressions and (2) --invert.

Both (1) and (2) are described in-depth in patch 2/7, but I am happy to
answer more questions should they arise here. Peff and I worked on this
together off-list, and we are both happy with the semantics, and believe
that it covers most reasonable cases.

The notable case that it does _not_ cover is matching the following
line:

  a ... b

with the following expression

  git grep --column -e b --or -e a

This will produce the column for 'b' rather than the column for 'a',
since we short-circuit an --or when the left child finds a match, in
this case 'b'. So, we break the semantics for this case, at the benefit
of not having to do twice the work.

In the future, I'd like to revisit this, since any performance gains
that we _do_ make in this area are moot when we rescan all lines in
show_line() with --color. A path forward, I imagine, would look like a
list of regmatch_t's, or a set of locations in the expression tree, such
that we could either enumerate the list or walk the tree in order to
colorize the line. But, I think for now that is #leftoverbits.

Thanks especially to the last round of reviewers for their detailed
feedback, and I hope that starting in a new series will be OK. I figure
that enough has changed that I'd rather not clutter an already busy
thread.


Thanks,
Taylor

Taylor Blau (7):
  Documentation/config.txt: camel-case lineNumber for consistency
  grep.c: expose {,inverted} match column in match_line()
  grep.[ch]: extend grep_opt to allow showing matched column
  grep.c: display column number of first match
  builtin/grep.c: add '--column' option to 'git-grep(1)'
  grep.c: add configuration variables to show matched option
  contrib/git-jump/git-jump: jump to exact location

 Documentation/config.txt   |  7 ++-
 Documentation/git-grep.txt |  9 +++-
 builtin/grep.c             |  1 +
 contrib/git-jump/README    | 12 ++++-
 contrib/git-jump/git-jump  |  2 +-
 grep.c                     | 95 +++++++++++++++++++++++++++++---------
 grep.h                     |  2 +
 t/t7810-grep.sh            | 63 +++++++++++++++++++++++++
 8 files changed, 163 insertions(+), 28 deletions(-)

--
2.17.0.582.gccdcbd54c

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [PATCH 1/7] Documentation/config.txt: camel-case lineNumber for consistency
  2018-06-18 23:43 [PATCH 0/7] grep.c: teach --column to 'git-grep(1)' Taylor Blau
@ 2018-06-18 23:43 ` Taylor Blau
  2018-06-18 23:43 ` [PATCH 2/7] grep.c: expose {,inverted} match column in match_line() Taylor Blau
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-18 23:43 UTC (permalink / raw)
  To: git; +Cc: avarab, peff, gitster

lineNumber has casing that is inconsistent with surrounding options,
like color.grep.matchContext, and color.grep.matchSelected. Re-case this
documentation in order to be consistent with the text around it, and to
ensure that new entries are consistent, too.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/config.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index ab641bf5a9..58fde4daea 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1181,7 +1181,7 @@ color.grep.<slot>::
 	filename prefix (when not using `-h`)
 `function`;;
 	function name lines (when using `-p`)
-`linenumber`;;
+`lineNumber`;;
 	line number prefix (when using `-n`)
 `match`;;
 	matching text (same as setting `matchContext` and `matchSelected`)
-- 
2.17.0.582.gccdcbd54c


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 2/7] grep.c: expose {,inverted} match column in match_line()
  2018-06-18 23:43 [PATCH 0/7] grep.c: teach --column to 'git-grep(1)' Taylor Blau
  2018-06-18 23:43 ` [PATCH 1/7] Documentation/config.txt: camel-case lineNumber for consistency Taylor Blau
@ 2018-06-18 23:43 ` Taylor Blau
  2018-06-19 16:49   ` Junio C Hamano
  2018-06-18 23:43 ` [PATCH 3/7] grep.[ch]: extend grep_opt to allow showing matched column Taylor Blau
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 56+ messages in thread
From: Taylor Blau @ 2018-06-18 23:43 UTC (permalink / raw)
  To: git; +Cc: avarab, peff, gitster

When calling match_line(), callers presently cannot determine the
relative offset of the match because match_line() discards the
'regmatch_t' that contains this information.

Instead, teach match_line() to take in two 'ssize_t's. Fill the first
with the offset of the match produced by the given expression. If
extended, fill the later with the offset of the match produced as if
--invert were given.

For instance, matching "--not -e x" on this line produces a columnar
offset of 0, (i.e., the whole line does not contain an x), but "--invert
--not -e -x" will fill the later ssize_t of the column containing an
"x", because this expression is semantically equivalent to "-e x".

To determine the column for the inverted and non-inverted case, do the
following:

  - If matching an atom, the non-inverted column is as given from
    match_one_pattern(), and the inverted column is unset.

  - If matching a --not, the inverted column and non-inverted column swap.

  - If matching an --and, or --or, the non-inverted column is the
    minimum of the two children, with the exception that --or is
    short-circuiting. For instance, if we match "-e a --or -e b" on a
    line that contains both "a" and "b" (and "b" comes first), the match
    column will hold "a", since we inspected the left child first, and
    short-circuited over the right child.

This patch will become useful when we later pick between the two new
results in order to display the column number of the first match on a
line with --column.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 grep.c | 56 ++++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 38 insertions(+), 18 deletions(-)

diff --git a/grep.c b/grep.c
index 45ec7e636c..19c782aa9d 100644
--- a/grep.c
+++ b/grep.c
@@ -1249,10 +1249,10 @@ static int match_one_pattern(struct grep_pat *p, char *bol, char *eol,
 }
 
 static int match_expr_eval(struct grep_expr *x, char *bol, char *eol,
-			   enum grep_context ctx, int collect_hits)
+			   enum grep_context ctx, ssize_t *col,
+			   ssize_t *icol, int collect_hits)
 {
 	int h = 0;
-	regmatch_t match;
 
 	if (!x)
 		die("Not a valid grep expression");
@@ -1261,25 +1261,39 @@ static int match_expr_eval(struct grep_expr *x, char *bol, char *eol,
 		h = 1;
 		break;
 	case GREP_NODE_ATOM:
-		h = match_one_pattern(x->u.atom, bol, eol, ctx, &match, 0);
+		{
+			regmatch_t tmp;
+			h = match_one_pattern(x->u.atom, bol, eol, ctx,
+					      &tmp, 0);
+			if (h && (*col < 0 || tmp.rm_so < *col))
+				*col = tmp.rm_so;
+		}
 		break;
 	case GREP_NODE_NOT:
-		h = !match_expr_eval(x->u.unary, bol, eol, ctx, 0);
+		/*
+		 * Upon visiting a GREP_NODE_NOT, imatch and match become
+		 * swapped.
+		 */
+		h = !match_expr_eval(x->u.unary, bol, eol, ctx, icol, col, 0);
 		break;
 	case GREP_NODE_AND:
-		if (!match_expr_eval(x->u.binary.left, bol, eol, ctx, 0))
+		if (!match_expr_eval(x->u.binary.left, bol, eol, ctx, col,
+				     icol, 0))
 			return 0;
-		h = match_expr_eval(x->u.binary.right, bol, eol, ctx, 0);
+		h = match_expr_eval(x->u.binary.right, bol, eol, ctx, col,
+				    icol, 0);
 		break;
 	case GREP_NODE_OR:
 		if (!collect_hits)
-			return (match_expr_eval(x->u.binary.left,
-						bol, eol, ctx, 0) ||
-				match_expr_eval(x->u.binary.right,
-						bol, eol, ctx, 0));
-		h = match_expr_eval(x->u.binary.left, bol, eol, ctx, 0);
+			return (match_expr_eval(x->u.binary.left, bol, eol, ctx,
+						col, icol, 0) ||
+				match_expr_eval(x->u.binary.right, bol, eol,
+						ctx, col, icol, 0));
+		h = match_expr_eval(x->u.binary.left, bol, eol, ctx, col,
+				    icol, 0);
 		x->u.binary.left->hit |= h;
-		h |= match_expr_eval(x->u.binary.right, bol, eol, ctx, 1);
+		h |= match_expr_eval(x->u.binary.right, bol, eol, ctx, col,
+				     icol, 1);
 		break;
 	default:
 		die("Unexpected node type (internal error) %d", x->node);
@@ -1290,25 +1304,30 @@ static int match_expr_eval(struct grep_expr *x, char *bol, char *eol,
 }
 
 static int match_expr(struct grep_opt *opt, char *bol, char *eol,
-		      enum grep_context ctx, int collect_hits)
+		      enum grep_context ctx, ssize_t *col,
+		      ssize_t *icol, int collect_hits)
 {
 	struct grep_expr *x = opt->pattern_expression;
-	return match_expr_eval(x, bol, eol, ctx, collect_hits);
+	return match_expr_eval(x, bol, eol, ctx, col, icol, collect_hits);
 }
 
 static int match_line(struct grep_opt *opt, char *bol, char *eol,
+		      ssize_t *col, ssize_t *icol,
 		      enum grep_context ctx, int collect_hits)
 {
 	struct grep_pat *p;
-	regmatch_t match;
 
 	if (opt->extended)
-		return match_expr(opt, bol, eol, ctx, collect_hits);
+		return match_expr(opt, bol, eol, ctx, col, icol,
+				  collect_hits);
 
 	/* we do not call with collect_hits without being extended */
 	for (p = opt->pattern_list; p; p = p->next) {
-		if (match_one_pattern(p, bol, eol, ctx, &match, 0))
+		regmatch_t tmp;
+		if (match_one_pattern(p, bol, eol, ctx, &tmp, 0)) {
+			*col = tmp.rm_so;
 			return 1;
+		}
 	}
 	return 0;
 }
@@ -1763,6 +1782,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 	while (left) {
 		char *eol, ch;
 		int hit;
+		ssize_t col = -1, icol = -1;
 
 		/*
 		 * look_ahead() skips quickly to the line that possibly
@@ -1786,7 +1806,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 		if ((ctx == GREP_CONTEXT_HEAD) && (eol == bol))
 			ctx = GREP_CONTEXT_BODY;
 
-		hit = match_line(opt, bol, eol, ctx, collect_hits);
+		hit = match_line(opt, bol, eol, &col, &icol, ctx, collect_hits);
 		*eol = ch;
 
 		if (collect_hits)
-- 
2.17.0.582.gccdcbd54c


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 3/7] grep.[ch]: extend grep_opt to allow showing matched column
  2018-06-18 23:43 [PATCH 0/7] grep.c: teach --column to 'git-grep(1)' Taylor Blau
  2018-06-18 23:43 ` [PATCH 1/7] Documentation/config.txt: camel-case lineNumber for consistency Taylor Blau
  2018-06-18 23:43 ` [PATCH 2/7] grep.c: expose {,inverted} match column in match_line() Taylor Blau
@ 2018-06-18 23:43 ` Taylor Blau
  2018-06-18 23:43 ` [PATCH 4/7] grep.c: display column number of first match Taylor Blau
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-18 23:43 UTC (permalink / raw)
  To: git; +Cc: avarab, peff, gitster

To support showing the matched column when calling 'git-grep(1)', teach
'grep_opt' the normal set of options to configure the default behavior
and colorization of this feature.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 grep.c | 3 +++
 grep.h | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/grep.c b/grep.c
index 19c782aa9d..33f8572e6d 100644
--- a/grep.c
+++ b/grep.c
@@ -46,6 +46,7 @@ void init_grep_defaults(void)
 	color_set(opt->color_filename, "");
 	color_set(opt->color_function, "");
 	color_set(opt->color_lineno, "");
+	color_set(opt->color_columnno, "");
 	color_set(opt->color_match_context, GIT_COLOR_BOLD_RED);
 	color_set(opt->color_match_selected, GIT_COLOR_BOLD_RED);
 	color_set(opt->color_selected, "");
@@ -155,6 +156,7 @@ void grep_init(struct grep_opt *opt, const char *prefix)
 	opt->extended_regexp_option = def->extended_regexp_option;
 	opt->pattern_type_option = def->pattern_type_option;
 	opt->linenum = def->linenum;
+	opt->columnnum = def->columnnum;
 	opt->max_depth = def->max_depth;
 	opt->pathname = def->pathname;
 	opt->relative = def->relative;
@@ -164,6 +166,7 @@ void grep_init(struct grep_opt *opt, const char *prefix)
 	color_set(opt->color_filename, def->color_filename);
 	color_set(opt->color_function, def->color_function);
 	color_set(opt->color_lineno, def->color_lineno);
+	color_set(opt->color_columnno, def->color_columnno);
 	color_set(opt->color_match_context, def->color_match_context);
 	color_set(opt->color_match_selected, def->color_match_selected);
 	color_set(opt->color_selected, def->color_selected);
diff --git a/grep.h b/grep.h
index 399381c908..08a0b391c5 100644
--- a/grep.h
+++ b/grep.h
@@ -127,6 +127,7 @@ struct grep_opt {
 	int prefix_length;
 	regex_t regexp;
 	int linenum;
+	int columnnum;
 	int invert;
 	int ignore_case;
 	int status_only;
@@ -159,6 +160,7 @@ struct grep_opt {
 	char color_filename[COLOR_MAXLEN];
 	char color_function[COLOR_MAXLEN];
 	char color_lineno[COLOR_MAXLEN];
+	char color_columnno[COLOR_MAXLEN];
 	char color_match_context[COLOR_MAXLEN];
 	char color_match_selected[COLOR_MAXLEN];
 	char color_selected[COLOR_MAXLEN];
-- 
2.17.0.582.gccdcbd54c


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 4/7] grep.c: display column number of first match
  2018-06-18 23:43 [PATCH 0/7] grep.c: teach --column to 'git-grep(1)' Taylor Blau
                   ` (2 preceding siblings ...)
  2018-06-18 23:43 ` [PATCH 3/7] grep.[ch]: extend grep_opt to allow showing matched column Taylor Blau
@ 2018-06-18 23:43 ` Taylor Blau
  2018-06-19 16:28   ` Jeff King
  2018-06-18 23:43 ` [PATCH 5/7] builtin/grep.c: add '--column' option to 'git-grep(1)' Taylor Blau
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 56+ messages in thread
From: Taylor Blau @ 2018-06-18 23:43 UTC (permalink / raw)
  To: git; +Cc: avarab, peff, gitster

To prepare for 'git grep' learning '--column', teach grep.c's
show_line() how to show the column of the first match on non-context
lines.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 grep.c | 30 +++++++++++++++++++++++++-----
 1 file changed, 25 insertions(+), 5 deletions(-)

diff --git a/grep.c b/grep.c
index 33f8572e6d..9f5b00a471 100644
--- a/grep.c
+++ b/grep.c
@@ -1381,7 +1381,7 @@ static int next_match(struct grep_opt *opt, char *bol, char *eol,
 }
 
 static void show_line(struct grep_opt *opt, char *bol, char *eol,
-		      const char *name, unsigned lno, char sign)
+		      const char *name, unsigned lno, unsigned cno, char sign)
 {
 	int rest = eol - bol;
 	const char *match_color, *line_color = NULL;
@@ -1416,6 +1416,17 @@ static void show_line(struct grep_opt *opt, char *bol, char *eol,
 		output_color(opt, buf, strlen(buf), opt->color_lineno);
 		output_sep(opt, sign);
 	}
+	/*
+	 * Treat 'cno' as the 1-indexed offset from the start of a non-context
+	 * line to its first match. Otherwise, 'cno' is 0 indicating that we are
+	 * being called with a context line.
+	 */
+	if (opt->columnnum && cno) {
+		char buf[32];
+		xsnprintf(buf, sizeof(buf), "%d", cno);
+		output_color(opt, buf, strlen(buf), opt->color_columnno);
+		output_sep(opt, sign);
+	}
 	if (opt->color) {
 		regmatch_t match;
 		enum grep_context ctx = GREP_CONTEXT_BODY;
@@ -1521,7 +1532,7 @@ static void show_funcname_line(struct grep_opt *opt, struct grep_source *gs,
 			break;
 
 		if (match_funcname(opt, gs, bol, eol)) {
-			show_line(opt, bol, eol, gs->name, lno, '=');
+			show_line(opt, bol, eol, gs->name, lno, 0, '=');
 			break;
 		}
 	}
@@ -1586,7 +1597,7 @@ static void show_pre_context(struct grep_opt *opt, struct grep_source *gs,
 
 		while (*eol != '\n')
 			eol++;
-		show_line(opt, bol, eol, gs->name, cur, sign);
+		show_line(opt, bol, eol, gs->name, cur, 0, sign);
 		bol = eol + 1;
 		cur++;
 	}
@@ -1785,6 +1796,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 	while (left) {
 		char *eol, ch;
 		int hit;
+		ssize_t cno;
 		ssize_t col = -1, icol = -1;
 
 		/*
@@ -1850,7 +1862,15 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 				show_pre_context(opt, gs, bol, eol, lno);
 			else if (opt->funcname)
 				show_funcname_line(opt, gs, bol, lno);
-			show_line(opt, bol, eol, gs->name, lno, ':');
+			cno = opt->invert ? icol : col;
+			if (cno < 0) {
+				/*
+				 * A negative cno means that there was no match.
+				 * Clamp to the beginning of the line.
+				 */
+				cno = 0;
+			}
+			show_line(opt, bol, eol, gs->name, lno, cno + 1, ':');
 			last_hit = lno;
 			if (opt->funcbody)
 				show_function = 1;
@@ -1879,7 +1899,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 			/* If the last hit is within the post context,
 			 * we need to show this line.
 			 */
-			show_line(opt, bol, eol, gs->name, lno, '-');
+			show_line(opt, bol, eol, gs->name, lno, col + 1, '-');
 		}
 
 	next_line:
-- 
2.17.0.582.gccdcbd54c


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 5/7] builtin/grep.c: add '--column' option to 'git-grep(1)'
  2018-06-18 23:43 [PATCH 0/7] grep.c: teach --column to 'git-grep(1)' Taylor Blau
                   ` (3 preceding siblings ...)
  2018-06-18 23:43 ` [PATCH 4/7] grep.c: display column number of first match Taylor Blau
@ 2018-06-18 23:43 ` Taylor Blau
  2018-06-18 23:43 ` [PATCH 6/7] grep.c: add configuration variables to show matched option Taylor Blau
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-18 23:43 UTC (permalink / raw)
  To: git; +Cc: avarab, peff, gitster

Teach 'git-grep(1)' a new option, '--column', to show the column
number of the first match on a non-context line. This makes it possible
to teach 'contrib/git-jump/git-jump' how to seek to the first matching
position of a grep match in your editor, and allows similar additional
scripting capabilities.

For example:

  $ git grep -n --column foo | head -n3
  .clang-format:51:14:# myFunction(foo, bar, baz);
  .clang-format:64:7:# int foo();
  .clang-format:75:8:# void foo()

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/git-grep.txt |  6 +++-
 builtin/grep.c             |  1 +
 t/t7810-grep.sh            | 63 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 69 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 312409a607..31dc0392a6 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -13,7 +13,7 @@ SYNOPSIS
 	   [-v | --invert-match] [-h|-H] [--full-name]
 	   [-E | --extended-regexp] [-G | --basic-regexp]
 	   [-P | --perl-regexp]
-	   [-F | --fixed-strings] [-n | --line-number]
+	   [-F | --fixed-strings] [-n | --line-number] [--column]
 	   [-l | --files-with-matches] [-L | --files-without-match]
 	   [(-O | --open-files-in-pager) [<pager>]]
 	   [-z | --null]
@@ -169,6 +169,10 @@ providing this option will cause it to die.
 --line-number::
 	Prefix the line number to matching lines.
 
+--column::
+	Prefix the 1-indexed byte-offset of the first match from the start of the
+	matching line.
+
 -l::
 --files-with-matches::
 --name-only::
diff --git a/builtin/grep.c b/builtin/grep.c
index ee753a403e..61bcaf6e58 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -828,6 +828,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			    GREP_PATTERN_TYPE_PCRE),
 		OPT_GROUP(""),
 		OPT_BOOL('n', "line-number", &opt.linenum, N_("show line numbers")),
+		OPT_BOOL(0, "column", &opt.columnnum, N_("show column number of first match")),
 		OPT_NEGBIT('h', NULL, &opt.pathname, N_("don't show filenames"), 1),
 		OPT_BIT('H', NULL, &opt.pathname, N_("show filenames"), 1),
 		OPT_NEGBIT(0, "full-name", &opt.relative,
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 1797f632a3..daaf7b4c44 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -99,6 +99,69 @@ do
 		test_cmp expected actual
 	'
 
+	test_expect_success "grep -w $L (with --column)" '
+		{
+			echo ${HC}file:5:foo mmap bar
+			echo ${HC}file:14:foo_mmap bar mmap
+			echo ${HC}file:5:foo mmap bar_mmap
+			echo ${HC}file:14:foo_mmap bar mmap baz
+		} >expected &&
+		git grep --column -w -e mmap $H >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep -w $L (with --column, extended)" '
+		{
+			echo ${HC}file:14:foo_mmap bar mmap
+			echo ${HC}file:19:foo_mmap bar mmap baz
+		} >expected &&
+		git grep --column -w -e mmap$ --or -e baz $H >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep -w $L (with --column, --invert)" '
+		{
+			echo ${HC}file:1:foo mmap bar
+			echo ${HC}file:1:foo_mmap bar
+			echo ${HC}file:1:foo_mmap bar mmap
+			echo ${HC}file:1:foo mmap bar_mmap
+		} >expected &&
+		git grep --column --invert -w -e baz $H -- file >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L (with --column, --invert, extended)" '
+		{
+			echo ${HC}hello_world:6:HeLLo_world
+		} >expected &&
+		git grep --column --invert -e ll --or --not -e _ $H -- hello_world \
+			>actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep -w $L (with --column, -C)" '
+		{
+			echo ${HC}file:5:foo mmap bar
+			echo ${HC}file-foo_mmap bar
+			echo ${HC}file:14:foo_mmap bar mmap
+			echo ${HC}file:5:foo mmap bar_mmap
+			echo ${HC}file:14:foo_mmap bar mmap baz
+		} >expected &&
+		git grep --column -w -C1 -e mmap $H >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep -w $L (with --line-number, --column)" '
+		{
+			echo ${HC}file:1:5:foo mmap bar
+			echo ${HC}file:3:14:foo_mmap bar mmap
+			echo ${HC}file:4:5:foo mmap bar_mmap
+			echo ${HC}file:5:14:foo_mmap bar mmap baz
+		} >expected &&
+		git grep -n --column -w -e mmap $H >actual &&
+		test_cmp expected actual
+	'
+
 	test_expect_success "grep -w $L" '
 		{
 			echo ${HC}file:1:foo mmap bar
-- 
2.17.0.582.gccdcbd54c


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 6/7] grep.c: add configuration variables to show matched option
  2018-06-18 23:43 [PATCH 0/7] grep.c: teach --column to 'git-grep(1)' Taylor Blau
                   ` (4 preceding siblings ...)
  2018-06-18 23:43 ` [PATCH 5/7] builtin/grep.c: add '--column' option to 'git-grep(1)' Taylor Blau
@ 2018-06-18 23:43 ` Taylor Blau
  2018-06-18 23:43 ` [PATCH 7/7] contrib/git-jump/git-jump: jump to exact location Taylor Blau
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-18 23:43 UTC (permalink / raw)
  To: git; +Cc: avarab, peff, gitster

To support git-grep(1)'s new option, '--column', document and teach
grep.c how to interpret relevant configuration options, similar to those
associated with '--line-number'.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/config.txt   | 5 +++++
 Documentation/git-grep.txt | 3 +++
 grep.c                     | 6 ++++++
 3 files changed, 14 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 58fde4daea..e4cbed3078 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1183,6 +1183,8 @@ color.grep.<slot>::
 	function name lines (when using `-p`)
 `lineNumber`;;
 	line number prefix (when using `-n`)
+`column`;;
+	column number prefix (when using `--column`)
 `match`;;
 	matching text (same as setting `matchContext` and `matchSelected`)
 `matchContext`;;
@@ -1797,6 +1799,9 @@ gitweb.snapshot::
 grep.lineNumber::
 	If set to true, enable `-n` option by default.
 
+grep.column::
+	If set to true, enable the `--column` option by default.
+
 grep.patternType::
 	Set the default matching behavior. Using a value of 'basic', 'extended',
 	'fixed', or 'perl' will enable the `--basic-regexp`, `--extended-regexp`,
diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 31dc0392a6..0de3493b80 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -44,6 +44,9 @@ CONFIGURATION
 grep.lineNumber::
 	If set to true, enable `-n` option by default.
 
+grep.column::
+	If set to true, enable the `--column` option by default.
+
 grep.patternType::
 	Set the default matching behavior. Using a value of 'basic', 'extended',
 	'fixed', or 'perl' will enable the `--basic-regexp`, `--extended-regexp`,
diff --git a/grep.c b/grep.c
index 9f5b00a471..8ffa94c791 100644
--- a/grep.c
+++ b/grep.c
@@ -96,6 +96,10 @@ int grep_config(const char *var, const char *value, void *cb)
 		opt->linenum = git_config_bool(var, value);
 		return 0;
 	}
+	if (!strcmp(var, "grep.column")) {
+		opt->columnnum = git_config_bool(var, value);
+		return 0;
+	}
 
 	if (!strcmp(var, "grep.fullname")) {
 		opt->relative = !git_config_bool(var, value);
@@ -112,6 +116,8 @@ int grep_config(const char *var, const char *value, void *cb)
 		color = opt->color_function;
 	else if (!strcmp(var, "color.grep.linenumber"))
 		color = opt->color_lineno;
+	else if (!strcmp(var, "color.grep.column"))
+		color = opt->color_columnno;
 	else if (!strcmp(var, "color.grep.matchcontext"))
 		color = opt->color_match_context;
 	else if (!strcmp(var, "color.grep.matchselected"))
-- 
2.17.0.582.gccdcbd54c


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 7/7] contrib/git-jump/git-jump: jump to exact location
  2018-06-18 23:43 [PATCH 0/7] grep.c: teach --column to 'git-grep(1)' Taylor Blau
                   ` (5 preceding siblings ...)
  2018-06-18 23:43 ` [PATCH 6/7] grep.c: add configuration variables to show matched option Taylor Blau
@ 2018-06-18 23:43 ` Taylor Blau
  2018-06-19 16:35 ` [PATCH 0/7] grep.c: teach --column to 'git-grep(1)' Jeff King
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-18 23:43 UTC (permalink / raw)
  To: git; +Cc: avarab, peff, gitster

Take advantage of 'git-grep(1)''s new option, '--column' in order to
teach Peff's 'git-jump' script how to jump to the correct column for any
given match.

'git-grep(1)''s output is in the correct format for Vim's jump list, so
no additional cleanup is necessary.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 contrib/git-jump/README   | 12 ++++++++++--
 contrib/git-jump/git-jump |  2 +-
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/contrib/git-jump/README b/contrib/git-jump/README
index 4484bda410..2f618a7f97 100644
--- a/contrib/git-jump/README
+++ b/contrib/git-jump/README
@@ -25,6 +25,13 @@ git-jump will feed this to the editor:
 foo.c:2: printf("hello word!\n");
 -----------------------------------
 
+Or, when running 'git jump grep', column numbers will also be emitted,
+e.g. `git jump grep "hello"` would return:
+
+-----------------------------------
+foo.c:2:9: printf("hello word!\n");
+-----------------------------------
+
 Obviously this trivial case isn't that interesting; you could just open
 `foo.c` yourself. But when you have many changes scattered across a
 project, you can use the editor's support to "jump" from point to point.
@@ -35,7 +42,8 @@ Git-jump can generate four types of interesting lists:
 
   2. The beginning of any merge conflict markers.
 
-  3. Any grep matches.
+  3. Any grep matches, including the column of the first match on a
+     line.
 
   4. Any whitespace errors detected by `git diff --check`.
 
@@ -82,7 +90,7 @@ which does something similar to `git jump grep`. However, it is limited
 to positioning the cursor to the correct line in only the first file,
 leaving you to locate subsequent hits in that file or other files using
 the editor or pager. By contrast, git-jump provides the editor with a
-complete list of files and line numbers for each match.
+complete list of files, lines, and a column number for each match.
 
 
 Limitations
diff --git a/contrib/git-jump/git-jump b/contrib/git-jump/git-jump
index 80ab0590bc..931b0fe3a9 100755
--- a/contrib/git-jump/git-jump
+++ b/contrib/git-jump/git-jump
@@ -52,7 +52,7 @@ mode_merge() {
 # editor shows them to us in the status bar.
 mode_grep() {
 	cmd=$(git config jump.grepCmd)
-	test -n "$cmd" || cmd="git grep -n"
+	test -n "$cmd" || cmd="git grep -n --column"
 	$cmd "$@" |
 	perl -pe '
 	s/[ \t]+/ /g;
-- 
2.17.0.582.gccdcbd54c

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH 4/7] grep.c: display column number of first match
  2018-06-18 23:43 ` [PATCH 4/7] grep.c: display column number of first match Taylor Blau
@ 2018-06-19 16:28   ` Jeff King
  2018-06-19 16:34     ` Taylor Blau
  0 siblings, 1 reply; 56+ messages in thread
From: Jeff King @ 2018-06-19 16:28 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, avarab, gitster

On Mon, Jun 18, 2018 at 06:43:14PM -0500, Taylor Blau wrote:

>  static void show_line(struct grep_opt *opt, char *bol, char *eol,
> -		      const char *name, unsigned lno, char sign)
> +		      const char *name, unsigned lno, unsigned cno, char sign)

Here "cno" is unsigned. But later...

> +	if (opt->columnnum && cno) {
> +		char buf[32];
> +		xsnprintf(buf, sizeof(buf), "%d", cno);

...we print it with "%d". Should this be "%u"?

But ultimately, the column number comes from this code:

> @@ -1785,6 +1796,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
>  	while (left) {
>  		char *eol, ch;
>  		int hit;
> +		ssize_t cno;
>  		ssize_t col = -1, icol = -1;
>  
>  		/*
> @@ -1850,7 +1862,15 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
>  				show_pre_context(opt, gs, bol, eol, lno);
>  			else if (opt->funcname)
>  				show_funcname_line(opt, gs, bol, lno);
> -			show_line(opt, bol, eol, gs->name, lno, ':');
> +			cno = opt->invert ? icol : col;
> +			if (cno < 0) {
> +				/*
> +				 * A negative cno means that there was no match.
> +				 * Clamp to the beginning of the line.
> +				 */
> +				cno = 0;
> +			}

...which is a ssize_t. Should we just be using ssize_t consistently?

We do at least clamp the negative values here, but on 64-bit systems
ssize_t is much larger than "unsigned".  I admit that it's probably
ridiculous for any single line to overflow 32 bits, but it seems like we
should consistently use size_t/ssize_t for buffer offsets, and then we
don't have to think about it.

-Peff

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 4/7] grep.c: display column number of first match
  2018-06-19 16:28   ` Jeff King
@ 2018-06-19 16:34     ` Taylor Blau
  0 siblings, 0 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-19 16:34 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, git, avarab, gitster

On Tue, Jun 19, 2018 at 12:28:26PM -0400, Jeff King wrote:
> On Mon, Jun 18, 2018 at 06:43:14PM -0500, Taylor Blau wrote:
>
> >  static void show_line(struct grep_opt *opt, char *bol, char *eol,
> > -		      const char *name, unsigned lno, char sign)
> > +		      const char *name, unsigned lno, unsigned cno, char sign)
>
> Here "cno" is unsigned. But later...
>
> > +	if (opt->columnnum && cno) {
> > +		char buf[32];
> > +		xsnprintf(buf, sizeof(buf), "%d", cno);
>
> ...we print it with "%d". Should this be "%u"?

Thanks, that's certainly a mistake. I think (per the hunk of this
response below) that it should be "%zu" in the case that we change this
patch to take an ssize_t.

> But ultimately, the column number comes from this code:
>
> > @@ -1785,6 +1796,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
> >  	while (left) {
> >  		char *eol, ch;
> >  		int hit;
> > +		ssize_t cno;
> >  		ssize_t col = -1, icol = -1;
> >
> >  		/*
> > @@ -1850,7 +1862,15 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
> >  				show_pre_context(opt, gs, bol, eol, lno);
> >  			else if (opt->funcname)
> >  				show_funcname_line(opt, gs, bol, lno);
> > -			show_line(opt, bol, eol, gs->name, lno, ':');
> > +			cno = opt->invert ? icol : col;
> > +			if (cno < 0) {
> > +				/*
> > +				 * A negative cno means that there was no match.
> > +				 * Clamp to the beginning of the line.
> > +				 */
> > +				cno = 0;
> > +			}
>
> ...which is a ssize_t. Should we just be using ssize_t consistently?
>
> We do at least clamp the negative values here, but on 64-bit systems
> ssize_t is much larger than "unsigned".  I admit that it's probably
> ridiculous for any single line to overflow 32 bits, but it seems like we
> should consistently use size_t/ssize_t for buffer offsets, and then we
> don't have to think about it.

I agree that it's unlikely that a single line will overflow 32 bits, and
certainly at that point we might have other problems to worry about :-).

This was an unsigned in my original patch, and I left it this way in the
revised series for consistency with the other arguments to show_line().
But, I agree with your reasoning and think that this should be an
ssize_t, instead.


Thanks,
Taylor

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-18 23:43 [PATCH 0/7] grep.c: teach --column to 'git-grep(1)' Taylor Blau
                   ` (6 preceding siblings ...)
  2018-06-18 23:43 ` [PATCH 7/7] contrib/git-jump/git-jump: jump to exact location Taylor Blau
@ 2018-06-19 16:35 ` Jeff King
  2018-06-19 17:33   ` René Scharfe
  2018-06-19 16:46 ` Junio C Hamano
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 56+ messages in thread
From: Jeff King @ 2018-06-19 16:35 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, avarab, gitster

On Mon, Jun 18, 2018 at 06:43:01PM -0500, Taylor Blau wrote:

> Attached is a ``fresh start'' of my series to teach 'git grep --column'.
> Since the last time I sent this, much has changed, notably the semantics
> for deciding which column is the first when given (1) extended
> expressions and (2) --invert.
> 
> Both (1) and (2) are described in-depth in patch 2/7, but I am happy to
> answer more questions should they arise here. Peff and I worked on this
> together off-list, and we are both happy with the semantics, and believe
> that it covers most reasonable cases.

So with the exception of some minor type-quibbling in patch 4, this all
looks good to me. Since we discussed this quite a bit off-list, you can
take that review either with a giant grain of salt (reviewing something
I helped to generate in the first place) or a ringing endorsement (I
thought about it a lot more than I would have for a normal review ;) ).

> The notable case that it does _not_ cover is matching the following
> line:
> 
>   a ... b
> 
> with the following expression
> 
>   git grep --column -e b --or -e a
> 
> This will produce the column for 'b' rather than the column for 'a',
> since we short-circuit an --or when the left child finds a match, in
> this case 'b'. So, we break the semantics for this case, at the benefit
> of not having to do twice the work.
> 
> In the future, I'd like to revisit this, since any performance gains
> that we _do_ make in this area are moot when we rescan all lines in
> show_line() with --color. A path forward, I imagine, would look like a
> list of regmatch_t's, or a set of locations in the expression tree, such
> that we could either enumerate the list or walk the tree in order to
> colorize the line. But, I think for now that is #leftoverbits.

The key thing about this iteration is that it doesn't regress
performance, because we always short-circuit where we used to. The other
obvious route is to stop short-circuiting only when "--column" is in
effect, which would have the same property (at the expense of a little
extra code in match_expr_eval()).

-Peff

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-18 23:43 [PATCH 0/7] grep.c: teach --column to 'git-grep(1)' Taylor Blau
                   ` (7 preceding siblings ...)
  2018-06-19 16:35 ` [PATCH 0/7] grep.c: teach --column to 'git-grep(1)' Jeff King
@ 2018-06-19 16:46 ` Junio C Hamano
  2018-06-19 17:02   ` Taylor Blau
  2018-06-19 22:51 ` Taylor Blau
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2018-06-19 16:46 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, avarab, peff

Taylor Blau <me@ttaylorr.com> writes:

> Attached is a ``fresh start'' of my series to teach 'git grep --column'.
> Since the last time I sent this, much has changed, notably the semantics
> for deciding which column is the first when given (1) extended
> expressions and (2) --invert.
> ...
> In the future, I'd like to revisit this, since any performance gains
> that we _do_ make in this area are moot when we rescan all lines in
> show_line() with --color.

Sounds like a plan.  From a quick scan, it seems that this is
sufficiently different from the last round so I won't bother
rebuilding your "--only" series on top, and instead just drop those
two series from the older round and queue this as the new and
improved tb/grep-column topic.

Thanks.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 2/7] grep.c: expose {,inverted} match column in match_line()
  2018-06-18 23:43 ` [PATCH 2/7] grep.c: expose {,inverted} match column in match_line() Taylor Blau
@ 2018-06-19 16:49   ` Junio C Hamano
  2018-06-19 17:02     ` Taylor Blau
  0 siblings, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2018-06-19 16:49 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, avarab, peff

Taylor Blau <me@ttaylorr.com> writes:

>  	case GREP_NODE_NOT:
> -		h = !match_expr_eval(x->u.unary, bol, eol, ctx, 0);
> +		/*
> +		 * Upon visiting a GREP_NODE_NOT, imatch and match become
> +		 * swapped.
> +		 */
> +		h = !match_expr_eval(x->u.unary, bol, eol, ctx, icol, col, 0);

A minor nit, but the comment talks about something that are
different from the variable names; perhaps you called col/icol with
different names in an earlier incarnation of this patch?


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-19 16:46 ` Junio C Hamano
@ 2018-06-19 17:02   ` Taylor Blau
  0 siblings, 0 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-19 17:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, git, avarab, peff

On Tue, Jun 19, 2018 at 09:46:16AM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > Attached is a ``fresh start'' of my series to teach 'git grep --column'.
> > Since the last time I sent this, much has changed, notably the semantics
> > for deciding which column is the first when given (1) extended
> > expressions and (2) --invert.
> > ...
> > In the future, I'd like to revisit this, since any performance gains
> > that we _do_ make in this area are moot when we rescan all lines in
> > show_line() with --color.
>
> Sounds like a plan.  From a quick scan, it seems that this is
> sufficiently different from the last round so I won't bother
> rebuilding your "--only" series on top, and instead just drop those
> two series from the older round and queue this as the new and
> improved tb/grep-column topic.

Thank you. I recommend dropping the second series
(tb/grep-only-matching) entirely for now. I will rebase that on top of
this so that you can apply it later in the future with ease.

I don't expect the parts of this series that affect that one to change
much, but I'll hold off on rebasing it in general until this series is
stable, hopefully soon.

> Thanks.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 2/7] grep.c: expose {,inverted} match column in match_line()
  2018-06-19 16:49   ` Junio C Hamano
@ 2018-06-19 17:02     ` Taylor Blau
  0 siblings, 0 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-19 17:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, git, avarab, peff

On Tue, Jun 19, 2018 at 09:49:21AM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> >  	case GREP_NODE_NOT:
> > -		h = !match_expr_eval(x->u.unary, bol, eol, ctx, 0);
> > +		/*
> > +		 * Upon visiting a GREP_NODE_NOT, imatch and match become
> > +		 * swapped.
> > +		 */
> > +		h = !match_expr_eval(x->u.unary, bol, eol, ctx, icol, col, 0);
>
> A minor nit, but the comment talks about something that are
> different from the variable names; perhaps you called col/icol with
> different names in an earlier incarnation of this patch?

Good catch, thanks. I've amended my local copy.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-19 16:35 ` [PATCH 0/7] grep.c: teach --column to 'git-grep(1)' Jeff King
@ 2018-06-19 17:33   ` René Scharfe
  2018-06-19 17:44     ` Taylor Blau
  2018-06-19 17:48     ` Jeff King
  0 siblings, 2 replies; 56+ messages in thread
From: René Scharfe @ 2018-06-19 17:33 UTC (permalink / raw)
  To: Jeff King, Taylor Blau; +Cc: git, avarab, gitster

Am 19.06.2018 um 18:35 schrieb Jeff King:
> On Mon, Jun 18, 2018 at 06:43:01PM -0500, Taylor Blau wrote:
>> The notable case that it does _not_ cover is matching the following
>> line:
>>
>>    a ... b
>>
>> with the following expression
>>
>>    git grep --column -e b --or -e a
>>
>> This will produce the column for 'b' rather than the column for 'a',
>> since we short-circuit an --or when the left child finds a match, in
>> this case 'b'. So, we break the semantics for this case, at the benefit
>> of not having to do twice the work.
>>
>> In the future, I'd like to revisit this, since any performance gains
>> that we _do_ make in this area are moot when we rescan all lines in
>> show_line() with --color. A path forward, I imagine, would look like a
>> list of regmatch_t's, or a set of locations in the expression tree, such
>> that we could either enumerate the list or walk the tree in order to
>> colorize the line. But, I think for now that is #leftoverbits.
> 
> The key thing about this iteration is that it doesn't regress
> performance, because we always short-circuit where we used to. The other
> obvious route is to stop short-circuiting only when "--column" is in
> effect, which would have the same property (at the expense of a little
> extra code in match_expr_eval()).

The performance impact of the exhaustive search for --color scales with
the number of shown lines, while it would scale with the total number of
lines for --column.  Coloring the results of highly selective patterns
is relatively cheap, short-circuiting them still helps significantly.

Disabling that optimization for --column wouldn't be a regression since
it's a new option..  Picking a random result (based on the order of
evaluation) seems sloppy and is probably going to surprise users.

We could add an optimizer pass to reduce the number of regular
expressions in certain cases if that is really too slow.  E.g. this:

	$ git grep -e b -e a

... is equivalent to:

	$ git grep -e '\(b\)\|\(a\)'

In that example the optimizer should use a single kwset instead of a
regex, but you get the idea, namely to leave the short-circuiting to the
regex engine or kwset, which probably do a good job of it.

René

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-19 17:33   ` René Scharfe
@ 2018-06-19 17:44     ` Taylor Blau
  2018-06-19 17:50       ` René Scharfe
  2018-06-19 20:26       ` René Scharfe
  2018-06-19 17:48     ` Jeff King
  1 sibling, 2 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-19 17:44 UTC (permalink / raw)
  To: René Scharfe; +Cc: Jeff King, Taylor Blau, git, avarab, gitster

On Tue, Jun 19, 2018 at 07:33:39PM +0200, René Scharfe wrote:
> Am 19.06.2018 um 18:35 schrieb Jeff King:
> > On Mon, Jun 18, 2018 at 06:43:01PM -0500, Taylor Blau wrote:
> >> The notable case that it does _not_ cover is matching the following
> >> line:
> >>
> >>    a ... b
> >>
> >> with the following expression
> >>
> >>    git grep --column -e b --or -e a
> >>
> >> This will produce the column for 'b' rather than the column for 'a',
> >> since we short-circuit an --or when the left child finds a match, in
> >> this case 'b'. So, we break the semantics for this case, at the benefit
> >> of not having to do twice the work.
> >>
> >> In the future, I'd like to revisit this, since any performance gains
> >> that we _do_ make in this area are moot when we rescan all lines in
> >> show_line() with --color. A path forward, I imagine, would look like a
> >> list of regmatch_t's, or a set of locations in the expression tree, such
> >> that we could either enumerate the list or walk the tree in order to
> >> colorize the line. But, I think for now that is #leftoverbits.
> >
> > The key thing about this iteration is that it doesn't regress
> > performance, because we always short-circuit where we used to. The other
> > obvious route is to stop short-circuiting only when "--column" is in
> > effect, which would have the same property (at the expense of a little
> > extra code in match_expr_eval()).
>
> The performance impact of the exhaustive search for --color scales with
> the number of shown lines, while it would scale with the total number of
> lines for --column.  Coloring the results of highly selective patterns
> is relatively cheap, short-circuiting them still helps significantly.

Sure. I was pointing out that we are repeating work in cases where it is
unnecessary to do so. It seems natural to me to take one of the two
above approaches, and optimize where we can.

> Disabling that optimization for --column wouldn't be a regression since
> it's a new option..  Picking a random result (based on the order of
> evaluation) seems sloppy and is probably going to surprise users.

That's fair, and something I'm happy to do if others feel OK about it.
Here is a patch to that effect:

diff --git a/grep.c b/grep.c
index f3329d82ed..a09935d8c5 100644
--- a/grep.c
+++ b/grep.c
@@ -1257,8 +1257,8 @@ static int match_one_pattern(struct grep_pat *p, char *bol, char *eol,
 	return hit;
 }

-static int match_expr_eval(struct grep_expr *x, char *bol, char *eol,
-			   enum grep_context ctx, ssize_t *col,
+static int match_expr_eval(struct grep_opt *opt, struct grep_expr *x, char *bol,
+			   char *eol, enum grep_context ctx, ssize_t *col,
 			   ssize_t *icol, int collect_hits)
 {
 	int h = 0;
@@ -1282,26 +1282,27 @@ static int match_expr_eval(struct grep_expr *x, char *bol, char *eol,
 		/*
 		 * Upon visiting a GREP_NODE_NOT, col and icol become swapped.
 		 */
-		h = !match_expr_eval(x->u.unary, bol, eol, ctx, icol, col, 0);
+		h = !match_expr_eval(opt, x->u.unary, bol, eol, ctx, icol, col, 0);
 		break;
 	case GREP_NODE_AND:
-		if (!match_expr_eval(x->u.binary.left, bol, eol, ctx, col,
+		if (!match_expr_eval(opt, x->u.binary.left, bol, eol, ctx, col,
 				     icol, 0))
 			return 0;
-		h = match_expr_eval(x->u.binary.right, bol, eol, ctx, col,
+		h = match_expr_eval(opt, x->u.binary.right, bol, eol, ctx, col,
 				    icol, 0);
 		break;
 	case GREP_NODE_OR:
-		if (!collect_hits)
-			return (match_expr_eval(x->u.binary.left, bol, eol, ctx,
+		if (!(collect_hits || opt->columnnum))
+			return (match_expr_eval(opt, x->u.binary.left, bol, eol, ctx,
 						col, icol, 0) ||
-				match_expr_eval(x->u.binary.right, bol, eol,
+				match_expr_eval(opt, x->u.binary.right, bol, eol,
 						ctx, col, icol, 0));
-		h = match_expr_eval(x->u.binary.left, bol, eol, ctx, col,
+		h = match_expr_eval(opt, x->u.binary.left, bol, eol, ctx, col,
 				    icol, 0);
-		x->u.binary.left->hit |= h;
-		h |= match_expr_eval(x->u.binary.right, bol, eol, ctx, col,
-				     icol, 1);
+		if (collect_hits)
+			x->u.binary.left->hit |= h;
+		h |= match_expr_eval(opt, x->u.binary.right, bol, eol, ctx, col,
+				     icol, collect_hits);
 		break;
 	default:
 		die("Unexpected node type (internal error) %d", x->node);
@@ -1316,7 +1317,7 @@ static int match_expr(struct grep_opt *opt, char *bol, char *eol,
 		      ssize_t *icol, int collect_hits)
 {
 	struct grep_expr *x = opt->pattern_expression;
-	return match_expr_eval(x, bol, eol, ctx, col, icol, collect_hits);
+	return match_expr_eval(opt, x, bol, eol, ctx, col, icol, collect_hits);
 }

 static int match_line(struct grep_opt *opt, char *bol, char *eol,

> We could add an optimizer pass to reduce the number of regular
> expressions in certain cases if that is really too slow.  E.g. this:
>
> 	$ git grep -e b -e a
>
> ... is equivalent to:
>
> 	$ git grep -e '\(b\)\|\(a\)'
>
> In that example the optimizer should use a single kwset instead of a
> regex, but you get the idea, namely to leave the short-circuiting to the
> regex engine or kwset, which probably do a good job of it.

I think that--while this pushes that decision to the appropriate level
of indirection--that it is out of scope for this series, and that the
above patch should do a sufficient job at not surprising users.

> René

Thanks,
Taylor

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-19 17:33   ` René Scharfe
  2018-06-19 17:44     ` Taylor Blau
@ 2018-06-19 17:48     ` Jeff King
  2018-06-19 17:54       ` Taylor Blau
                         ` (2 more replies)
  1 sibling, 3 replies; 56+ messages in thread
From: Jeff King @ 2018-06-19 17:48 UTC (permalink / raw)
  To: René Scharfe; +Cc: Taylor Blau, git, avarab, gitster

On Tue, Jun 19, 2018 at 07:33:39PM +0200, René Scharfe wrote:

> > The key thing about this iteration is that it doesn't regress
> > performance, because we always short-circuit where we used to. The other
> > obvious route is to stop short-circuiting only when "--column" is in
> > effect, which would have the same property (at the expense of a little
> > extra code in match_expr_eval()).
> 
> The performance impact of the exhaustive search for --color scales with
> the number of shown lines, while it would scale with the total number of
> lines for --column.  Coloring the results of highly selective patterns
> is relatively cheap, short-circuiting them still helps significantly.

I thought that at first, too, but I think we'd still scale with the
number of shown lines. We're talking about short-circuiting OR, so by
definition we stop the short-circuit because we matched the first half
of the OR.

If you stop short-circuiting AND, then yes, you incur a penalty for
every line. But I don't think --column would need to do that.

Although there are interesting cases around inversion. For example:

  git grep --not \( --not -e a --and --not -e b \)

is equivalent to:

  git grep -e a --or -e b

Do people care if we actually hunt down the exact column where we
_didn't_ match "b" in the first case?  The two are equivalent, but I
have to wonder if somebody writing the first one really cares.

> Disabling that optimization for --column wouldn't be a regression since
> it's a new option..  Picking a random result (based on the order of
> evaluation) seems sloppy and is probably going to surprise users.

I don't see it as a random result; short-circuiting logic is well
understood and we follow the user's ordering.

I think the place where it's _most_ ugly is "--column --color", where we
may color the short-circuited value in the second pass.

> We could add an optimizer pass to reduce the number of regular
> expressions in certain cases if that is really too slow.  E.g. this:

Yes, we actually discussed this kind of transformation. I think it's way
out of scope for this patch series, though. If we do anything more, I
think it should be to disable short-circuiting when --column is in use.

-Peff

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-19 17:44     ` Taylor Blau
@ 2018-06-19 17:50       ` René Scharfe
  2018-06-19 20:26       ` René Scharfe
  1 sibling, 0 replies; 56+ messages in thread
From: René Scharfe @ 2018-06-19 17:50 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Jeff King, git, avarab, gitster

Am 19.06.2018 um 19:44 schrieb Taylor Blau:
> On Tue, Jun 19, 2018 at 07:33:39PM +0200, René Scharfe wrote:
>> Am 19.06.2018 um 18:35 schrieb Jeff King:
>>> On Mon, Jun 18, 2018 at 06:43:01PM -0500, Taylor Blau wrote:
>> We could add an optimizer pass to reduce the number of regular
>> expressions in certain cases if that is really too slow.  E.g. this:
>>
>> 	$ git grep -e b -e a
>>
>> ... is equivalent to:
>>
>> 	$ git grep -e '\(b\)\|\(a\)'
>>
>> In that example the optimizer should use a single kwset instead of a
>> regex, but you get the idea, namely to leave the short-circuiting to the
>> regex engine or kwset, which probably do a good job of it.
> 
> I think that--while this pushes that decision to the appropriate level
> of indirection--that it is out of scope for this series, and that the
> above patch should do a sufficient job at not surprising users.

Definitely.  I'm not even convinced that performance problem is real --
otherwise someone would have added such an optimization already, right?
:)

René

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-19 17:48     ` Jeff King
@ 2018-06-19 17:54       ` Taylor Blau
  2018-06-19 17:58       ` Junio C Hamano
  2018-06-19 18:50       ` René Scharfe
  2 siblings, 0 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-19 17:54 UTC (permalink / raw)
  To: Jeff King; +Cc: René Scharfe, Taylor Blau, git, avarab, gitster

On Tue, Jun 19, 2018 at 01:48:47PM -0400, Jeff King wrote:
> On Tue, Jun 19, 2018 at 07:33:39PM +0200, René Scharfe wrote:
> > Disabling that optimization for --column wouldn't be a regression since
> > it's a new option..  Picking a random result (based on the order of
> > evaluation) seems sloppy and is probably going to surprise users.
>
> I don't see it as a random result; short-circuiting logic is well
> understood and we follow the user's ordering.

Agreed. I would prefer _not_ to apply the patch that I sent to René,
since I think it adds more complexity than its worth (precisely because
the short-circuiting logic is known, though certainly the problem gets
worse as the expression tree grows).

> I think the place where it's _most_ ugly is "--column --color", where we
> may color the short-circuited value in the second pass.

Agreed again, but it's worth noting that --color is the default. To play
devil's advocate, the use-case that I imagine will be most common is
with "git jump," so perhaps that doesn't matter as much.

> > We could add an optimizer pass to reduce the number of regular
> > expressions in certain cases if that is really too slow.  E.g. this:
>
> Yes, we actually discussed this kind of transformation. I think it's way
> out of scope for this patch series, though. If we do anything more, I
> think it should be to disable short-circuiting when --column is in use.

Piggy-backing on what I said to René, agreed again.

> -Peff

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-19 17:48     ` Jeff King
  2018-06-19 17:54       ` Taylor Blau
@ 2018-06-19 17:58       ` Junio C Hamano
  2018-06-19 18:02         ` Taylor Blau
  2018-06-19 18:05         ` Jeff King
  2018-06-19 18:50       ` René Scharfe
  2 siblings, 2 replies; 56+ messages in thread
From: Junio C Hamano @ 2018-06-19 17:58 UTC (permalink / raw)
  To: Jeff King; +Cc: René Scharfe, Taylor Blau, git, avarab

Jeff King <peff@peff.net> writes:

> Although there are interesting cases around inversion. For example:
>
>   git grep --not \( --not -e a --and --not -e b \)
>
> is equivalent to:
>
>   git grep -e a --or -e b
>
> Do people care if we actually hunt down the exact column where we
> _didn't_ match "b" in the first case?  The two are equivalent, but I
> have to wonder if somebody writing the first one really cares.

I may be misunderstanding the question, but I personally would feel
that "git grep --not <ANYTHING>" is OK to say "the entire line is at
fault that it did not satisify the criteria to match <ANYTHING>".
I.e., I'd be happy if --column marked the first column as the
beginning of the match, or --color painted the entire line in the
output of the former.



^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-19 17:58       ` Junio C Hamano
@ 2018-06-19 18:02         ` Taylor Blau
  2018-06-19 18:05         ` Jeff King
  1 sibling, 0 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-19 18:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, René Scharfe, Taylor Blau, git, avarab

On Tue, Jun 19, 2018 at 10:58:30AM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > Although there are interesting cases around inversion. For example:
> >
> >   git grep --not \( --not -e a --and --not -e b \)
> >
> > is equivalent to:
> >
> >   git grep -e a --or -e b
> >
> > Do people care if we actually hunt down the exact column where we
> > _didn't_ match "b" in the first case?  The two are equivalent, but I
> > have to wonder if somebody writing the first one really cares.
>
> I may be misunderstanding the question, but I personally would feel
> that "git grep --not <ANYTHING>" is OK to say "the entire line is at
> fault that it did not satisify the criteria to match <ANYTHING>".
> I.e., I'd be happy if --column marked the first column as the
> beginning of the match, or --color painted the entire line in the
> output of the former.

I think that Peff is pointing out that a short-circuiting OR will cause
a line like

  b foo a

to print the column for "a", not "b" (when given "-e a --or -e b").

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-19 17:58       ` Junio C Hamano
  2018-06-19 18:02         ` Taylor Blau
@ 2018-06-19 18:05         ` Jeff King
  2018-06-19 18:09           ` Junio C Hamano
  1 sibling, 1 reply; 56+ messages in thread
From: Jeff King @ 2018-06-19 18:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: René Scharfe, Taylor Blau, git, avarab

On Tue, Jun 19, 2018 at 10:58:30AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > Although there are interesting cases around inversion. For example:
> >
> >   git grep --not \( --not -e a --and --not -e b \)
> >
> > is equivalent to:
> >
> >   git grep -e a --or -e b
> >
> > Do people care if we actually hunt down the exact column where we
> > _didn't_ match "b" in the first case?  The two are equivalent, but I
> > have to wonder if somebody writing the first one really cares.
> 
> I may be misunderstanding the question, but I personally would feel
> that "git grep --not <ANYTHING>" is OK to say "the entire line is at
> fault that it did not satisify the criteria to match <ANYTHING>".
> I.e., I'd be happy if --column marked the first column as the
> beginning of the match, or --color painted the entire line in the
> output of the former.

Even if it's a double-inversion? The reason we carry both `col` and
`icol` is that it allows:

  git grep --not --not --not --not -e a

to still say "we found 'a' here". That's a dumb thing to ask for, but it
is true in the end that we show lines with "a" (and will colorize them
as such).

-Peff

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-19 18:05         ` Jeff King
@ 2018-06-19 18:09           ` Junio C Hamano
  0 siblings, 0 replies; 56+ messages in thread
From: Junio C Hamano @ 2018-06-19 18:09 UTC (permalink / raw)
  To: Jeff King; +Cc: René Scharfe, Taylor Blau, git, avarab

Jeff King <peff@peff.net> writes:

> Even if it's a double-inversion? The reason we carry both `col` and
> `icol` is that it allows:
>
>   git grep --not --not --not --not -e a
>
> to still say "we found 'a' here". That's a dumb thing to ask for, but it
> is true in the end that we show lines with "a" (and will colorize them
> as such).

Yes.  I think the code is doing too much to cater to a dumb request
;-)


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-19 17:48     ` Jeff King
  2018-06-19 17:54       ` Taylor Blau
  2018-06-19 17:58       ` Junio C Hamano
@ 2018-06-19 18:50       ` René Scharfe
  2018-06-19 19:11         ` Jeff King
  2 siblings, 1 reply; 56+ messages in thread
From: René Scharfe @ 2018-06-19 18:50 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, git, avarab, gitster

Am 19.06.2018 um 19:48 schrieb Jeff King:
> On Tue, Jun 19, 2018 at 07:33:39PM +0200, René Scharfe wrote:
> 
>>> The key thing about this iteration is that it doesn't regress
>>> performance, because we always short-circuit where we used to. The other
>>> obvious route is to stop short-circuiting only when "--column" is in
>>> effect, which would have the same property (at the expense of a little
>>> extra code in match_expr_eval()).
>>
>> The performance impact of the exhaustive search for --color scales with
>> the number of shown lines, while it would scale with the total number of
>> lines for --column.  Coloring the results of highly selective patterns
>> is relatively cheap, short-circuiting them still helps significantly.
> 
> I thought that at first, too, but I think we'd still scale with the
> number of shown lines. We're talking about short-circuiting OR, so by
> definition we stop the short-circuit because we matched the first half
> of the OR.
> 
> If you stop short-circuiting AND, then yes, you incur a penalty for
> every line. But I don't think --column would need to do that.

Good point.  So disabling that optimization for --column (or in even
in general) shouldn't be a dramatic loss.

> Although there are interesting cases around inversion. For example:
> 
>    git grep --not \( --not -e a --and --not -e b \)
> 
> is equivalent to:
> 
>    git grep -e a --or -e b
> 
> Do people care if we actually hunt down the exact column where we
> _didn't_ match "b" in the first case?  The two are equivalent, but I
> have to wonder if somebody writing the first one really cares.

I'd rather not guess the intentions of someone using such convoluted
expressions. ;-)

Negation causes the whole non-matching line to match, with --column
reporting 1 or nothing in such a case, right?  Or I think doing the
same when the operator is applied a second time is explainable.

>> Disabling that optimization for --column wouldn't be a regression since
>> it's a new option..  Picking a random result (based on the order of
>> evaluation) seems sloppy and is probably going to surprise users.
> 
> I don't see it as a random result; short-circuiting logic is well
> understood and we follow the user's ordering.

When ORing multiple expressions I don't pay attention to their order
as that operator is commutative.  Having results depend on that
order would at least surprise me.

> I think the place where it's _most_ ugly is "--column --color", where we
> may color the short-circuited value in the second pass.

The double negative example you gave above has that discrepancy as
well, but I think in that case it just highlights the different
intentions (--color: highlight search terms, --column: show leftmost
match).

>> We could add an optimizer pass to reduce the number of regular
>> expressions in certain cases if that is really too slow.  E.g. this:
> 
> Yes, we actually discussed this kind of transformation. I think it's way
> out of scope for this patch series, though. If we do anything more, I
> think it should be to disable short-circuiting when --column is in use.

Yep.

René

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-19 18:50       ` René Scharfe
@ 2018-06-19 19:11         ` Jeff King
  2018-06-19 20:34           ` René Scharfe
  0 siblings, 1 reply; 56+ messages in thread
From: Jeff King @ 2018-06-19 19:11 UTC (permalink / raw)
  To: René Scharfe; +Cc: Taylor Blau, git, avarab, gitster

On Tue, Jun 19, 2018 at 08:50:16PM +0200, René Scharfe wrote:

> Negation causes the whole non-matching line to match, with --column
> reporting 1 or nothing in such a case, right?  Or I think doing the
> same when the operator is applied a second time is explainable.

Yes to your first question.

Regarding the final sentence, yes, I agree it's explainable. But I
thought that handling negation like this was one of the main complaints
of earlier iterations?

> When ORing multiple expressions I don't pay attention to their order
> as that operator is commutative.  Having results depend on that
> order would at least surprise me.

OK. Let's just disable the short-circuit for --column then (i.e., what
Taylor posted earlier). That's explainable, and I doubt the performance
implications are going to be all that noticeable.

-Peff

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-19 17:44     ` Taylor Blau
  2018-06-19 17:50       ` René Scharfe
@ 2018-06-19 20:26       ` René Scharfe
  1 sibling, 0 replies; 56+ messages in thread
From: René Scharfe @ 2018-06-19 20:26 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Jeff King, git, avarab, gitster

Am 19.06.2018 um 19:44 schrieb Taylor Blau:
> diff --git a/grep.c b/grep.c
> index f3329d82ed..a09935d8c5 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -1257,8 +1257,8 @@ static int match_one_pattern(struct grep_pat *p, char *bol, char *eol,
>   	return hit;
>   }
> 
> -static int match_expr_eval(struct grep_expr *x, char *bol, char *eol,
> -			   enum grep_context ctx, ssize_t *col,
> +static int match_expr_eval(struct grep_opt *opt, struct grep_expr *x, char *bol,
> +			   char *eol, enum grep_context ctx, ssize_t *col,
>   			   ssize_t *icol, int collect_hits)

Passing opt in is one way.  Piggy-backing on collect_hits and making it
a flags parameter for two bits might be easier.  At least it wouldn't
increase the number of function arguments any further.  Not sure.

Anyway, something like this would be needed as well; or we could
force opt->columnnum to switch opt->extended on:

diff --git a/grep.c b/grep.c
index 8ffa94c791..a724ca3010 100644
--- a/grep.c
+++ b/grep.c
@@ -1325,6 +1321,7 @@ static int match_line(struct grep_opt *opt, char *bol, char *eol,
 		      enum grep_context ctx, int collect_hits)
 {
 	struct grep_pat *p;
+	int hit = 0;
 
 	if (opt->extended)
 		return match_expr(opt, bol, eol, ctx, col, icol,
@@ -1334,11 +1331,14 @@ static int match_line(struct grep_opt *opt, char *bol, char *eol,
 	for (p = opt->pattern_list; p; p = p->next) {
 		regmatch_t tmp;
 		if (match_one_pattern(p, bol, eol, ctx, &tmp, 0)) {
-			*col = tmp.rm_so;
-			return 1;
+			hit |= 1;
+			if (!opt->columnnum)
+				break;
+			if (*col < 0 || tmp.rm_so < *col)
+				*col = tmp.rm_so;
 		}
 	}
-	return 0;
+	return hit;
 }
 
 static int match_next_pattern(struct grep_pat *p, char *bol, char *eol,

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-19 19:11         ` Jeff King
@ 2018-06-19 20:34           ` René Scharfe
  2018-06-19 20:51             ` Junio C Hamano
  0 siblings, 1 reply; 56+ messages in thread
From: René Scharfe @ 2018-06-19 20:34 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, git, avarab, gitster

Am 19.06.2018 um 21:11 schrieb Jeff King:
> On Tue, Jun 19, 2018 at 08:50:16PM +0200, René Scharfe wrote:
> 
>> Negation causes the whole non-matching line to match, with --column
>> reporting 1 or nothing in such a case, right?  Or I think doing the
>> same when the operator is applied a second time is explainable.

(Not sure where that extra "Or" came from.)

> Yes to your first question.
> 
> Regarding the final sentence, yes, I agree it's explainable. But I
> thought that handling negation like this was one of the main complaints
> of earlier iterations?

That's possible -- I didn't read the full thread, and I didn't argue
for or against any specific behavior in this regard before AFAIR.

So let's see what your example does:

   $ git grep --column --not \( --not -e foo --or --not -e bar \) trace.h
   trace.h:13: *  #define foo(format, ...) bar(format, __VA_ARGS__)
   $ git grep --column --not \( --not -e bar --or --not -e foo \) trace.h
   trace.h:13: *  #define foo(format, ...) bar(format, __VA_ARGS__)

Impressive.  That expression is confusing at first sight, but showing
the first matching column anyway requires no further explanation.  I
like it.

René

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-19 20:34           ` René Scharfe
@ 2018-06-19 20:51             ` Junio C Hamano
  0 siblings, 0 replies; 56+ messages in thread
From: Junio C Hamano @ 2018-06-19 20:51 UTC (permalink / raw)
  To: René Scharfe; +Cc: Jeff King, Taylor Blau, git, avarab

René Scharfe <l.s.r@web.de> writes:

> So let's see what your example does:
>
>    $ git grep --column --not \( --not -e foo --or --not -e bar \) trace.h
>    trace.h:13: *  #define foo(format, ...) bar(format, __VA_ARGS__)
>    $ git grep --column --not \( --not -e bar --or --not -e foo \) trace.h
>    trace.h:13: *  #define foo(format, ...) bar(format, __VA_ARGS__)
>
> Impressive.  That expression is confusing at first sight, but showing
> the first matching column anyway requires no further explanation.  I
> like it.

OK, obviously many people are a lot more math/logic minded than I
am.  I still think that any code that attempts to show the same
result as '-e bar --and -e foo' from the above is over-engineered,
but as long as it is done corretly and efficiently I won't have any
objection.

Thanks.




^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-18 23:43 [PATCH 0/7] grep.c: teach --column to 'git-grep(1)' Taylor Blau
                   ` (8 preceding siblings ...)
  2018-06-19 16:46 ` Junio C Hamano
@ 2018-06-19 22:51 ` Taylor Blau
  2018-06-20 20:05 ` [PATCH v2 " Taylor Blau
  2018-06-22 15:49 ` [PATCH v3 " Taylor Blau
  11 siblings, 0 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-19 22:51 UTC (permalink / raw)
  To: git; +Cc: avarab, peff, gitster

On Mon, Jun 18, 2018 at 06:43:01PM -0500, Taylor Blau wrote:
> Hi,
>
> Attached is a ``fresh start'' of my series to teach 'git grep --column'.
> Since the last time I sent this, much has changed, notably the semantics
> for deciding which column is the first when given (1) extended
> expressions and (2) --invert.

Hi (again),

I have posted an updated version of this series on my fork available at
[1], in case anyone is interested in the changes before I publish them
here tomorrow.

I am going to let this thread sit overnight to collect any more
feedback.


Thanks,
Taylor

[1]: https://github.com/ttaylorr/git/compare/tb/grep-colno

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [PATCH v2 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-18 23:43 [PATCH 0/7] grep.c: teach --column to 'git-grep(1)' Taylor Blau
                   ` (9 preceding siblings ...)
  2018-06-19 22:51 ` Taylor Blau
@ 2018-06-20 20:05 ` Taylor Blau
  2018-06-20 20:05   ` [PATCH v2 1/7] Documentation/config.txt: camel-case lineNumber for consistency Taylor Blau
                     ` (7 more replies)
  2018-06-22 15:49 ` [PATCH v3 " Taylor Blau
  11 siblings, 8 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-20 20:05 UTC (permalink / raw)
  To: git; +Cc: avarab, peff, gitster

Hi,

Here is a re-roll of my series to add --column to 'git-grep(1)'. Since
last time, not much has changed other than the following:

  - Fix a typo where 'col', 'icol' were spelled as 'match', 'imatch'
    [1].

  - Disable short-circuiting OR when --column is given [2].

  - Disable early-return in match_line() when multiple patterns are
    given and --column is, too [3].

  - Add some more tests in t7810.

Thanks again for your kind and through review; hopefully this re-roll
should be OK to queue as-is.


Thanks,
Taylor

[1]: https://public-inbox.org/git/xmqqwouuvi0e.fsf@gitster-ct.c.googlers.com/
[2]: https://public-inbox.org/git/20180619174452.GA47272@syl.attlocal.net/
[3]: https://public-inbox.org/git/80b9a0b1-3849-7097-fe1a-dd80835d62ae@web.de/

Taylor Blau (7):
  Documentation/config.txt: camel-case lineNumber for consistency
  grep.c: expose {,inverted} match column in match_line()
  grep.[ch]: extend grep_opt to allow showing matched column
  grep.c: display column number of first match
  builtin/grep.c: add '--column' option to 'git-grep(1)'
  grep.c: add configuration variables to show matched option
  contrib/git-jump/git-jump: jump to exact location

 Documentation/config.txt   |   7 ++-
 Documentation/git-grep.txt |   9 ++-
 builtin/grep.c             |   1 +
 contrib/git-jump/README    |  12 +++-
 contrib/git-jump/git-jump  |   2 +-
 grep.c                     | 126 ++++++++++++++++++++++++++++---------
 grep.h                     |   2 +
 t/t7810-grep.sh            |  84 +++++++++++++++++++++++++
 8 files changed, 210 insertions(+), 33 deletions(-)

Inter-diff (since v1):

diff --git a/grep.c b/grep.c
index 8ffa94c791..08d3df2855 100644
--- a/grep.c
+++ b/grep.c
@@ -1257,8 +1257,8 @@ static int match_one_pattern(struct grep_pat *p, char *bol, char *eol,
 	return hit;
 }

-static int match_expr_eval(struct grep_expr *x, char *bol, char *eol,
-			   enum grep_context ctx, ssize_t *col,
+static int match_expr_eval(struct grep_opt *opt, struct grep_expr *x, char *bol,
+			   char *eol, enum grep_context ctx, ssize_t *col,
 			   ssize_t *icol, int collect_hits)
 {
 	int h = 0;
@@ -1280,29 +1280,36 @@ static int match_expr_eval(struct grep_expr *x, char *bol, char *eol,
 		break;
 	case GREP_NODE_NOT:
 		/*
-		 * Upon visiting a GREP_NODE_NOT, imatch and match become
-		 * swapped.
+		 * Upon visiting a GREP_NODE_NOT, col and icol become swapped.
 		 */
-		h = !match_expr_eval(x->u.unary, bol, eol, ctx, icol, col, 0);
+		h = !match_expr_eval(opt, x->u.unary, bol, eol, ctx, icol, col,
+				     0);
 		break;
 	case GREP_NODE_AND:
-		if (!match_expr_eval(x->u.binary.left, bol, eol, ctx, col,
+		if (!match_expr_eval(opt, x->u.binary.left, bol, eol, ctx, col,
 				     icol, 0))
 			return 0;
-		h = match_expr_eval(x->u.binary.right, bol, eol, ctx, col,
+		h = match_expr_eval(opt, x->u.binary.right, bol, eol, ctx, col,
 				    icol, 0);
 		break;
 	case GREP_NODE_OR:
-		if (!collect_hits)
-			return (match_expr_eval(x->u.binary.left, bol, eol, ctx,
-						col, icol, 0) ||
-				match_expr_eval(x->u.binary.right, bol, eol,
-						ctx, col, icol, 0));
-		h = match_expr_eval(x->u.binary.left, bol, eol, ctx, col,
+		if (!(collect_hits || opt->columnnum)) {
+			/*
+			 * Don't short-circuit OR when given --column (or
+			 * collecting hits) to ensure we don't skip a later
+			 * child that would produce an earlier match.
+			 */
+			return (match_expr_eval(opt, x->u.binary.left, bol, eol,
+						ctx, col, icol, 0) ||
+				match_expr_eval(opt, x->u.binary.right, bol,
+						eol, ctx, col, icol, 0));
+		}
+		h = match_expr_eval(opt, x->u.binary.left, bol, eol, ctx, col,
 				    icol, 0);
-		x->u.binary.left->hit |= h;
-		h |= match_expr_eval(x->u.binary.right, bol, eol, ctx, col,
-				     icol, 1);
+		if (collect_hits)
+			x->u.binary.left->hit |= h;
+		h |= match_expr_eval(opt, x->u.binary.right, bol, eol, ctx, col,
+				     icol, collect_hits);
 		break;
 	default:
 		die("Unexpected node type (internal error) %d", x->node);
@@ -1317,7 +1324,7 @@ static int match_expr(struct grep_opt *opt, char *bol, char *eol,
 		      ssize_t *icol, int collect_hits)
 {
 	struct grep_expr *x = opt->pattern_expression;
-	return match_expr_eval(x, bol, eol, ctx, col, icol, collect_hits);
+	return match_expr_eval(opt, x, bol, eol, ctx, col, icol, collect_hits);
 }

 static int match_line(struct grep_opt *opt, char *bol, char *eol,
@@ -1325,6 +1332,7 @@ static int match_line(struct grep_opt *opt, char *bol, char *eol,
 		      enum grep_context ctx, int collect_hits)
 {
 	struct grep_pat *p;
+	int hit = 0;

 	if (opt->extended)
 		return match_expr(opt, bol, eol, ctx, col, icol,
@@ -1334,11 +1342,21 @@ static int match_line(struct grep_opt *opt, char *bol, char *eol,
 	for (p = opt->pattern_list; p; p = p->next) {
 		regmatch_t tmp;
 		if (match_one_pattern(p, bol, eol, ctx, &tmp, 0)) {
-			*col = tmp.rm_so;
-			return 1;
+			hit |= 1;
+			if (!opt->columnnum) {
+				/*
+				 * Without --column, any single match on a line
+				 * is enough to know that it needs to be
+				 * printed. With --column, scan _all_ patterns
+				 * to find the earliest.
+				 */
+				break;
+			}
+			if (*col < 0 || tmp.rm_so < *col)
+				*col = tmp.rm_so;
 		}
 	}
-	return 0;
+	return hit;
 }

 static int match_next_pattern(struct grep_pat *p, char *bol, char *eol,
@@ -1387,7 +1405,7 @@ static int next_match(struct grep_opt *opt, char *bol, char *eol,
 }

 static void show_line(struct grep_opt *opt, char *bol, char *eol,
-		      const char *name, unsigned lno, unsigned cno, char sign)
+		      const char *name, unsigned lno, ssize_t cno, char sign)
 {
 	int rest = eol - bol;
 	const char *match_color, *line_color = NULL;
@@ -1429,7 +1447,7 @@ static void show_line(struct grep_opt *opt, char *bol, char *eol,
 	 */
 	if (opt->columnnum && cno) {
 		char buf[32];
-		xsnprintf(buf, sizeof(buf), "%d", cno);
+		xsnprintf(buf, sizeof(buf), "%zu", cno);
 		output_color(opt, buf, strlen(buf), opt->color_columnno);
 		output_sep(opt, sign);
 	}
@@ -1871,8 +1889,11 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 			cno = opt->invert ? icol : col;
 			if (cno < 0) {
 				/*
-				 * A negative cno means that there was no match.
-				 * Clamp to the beginning of the line.
+				 * A negative cno indicates that there was no
+				 * match on the line. We are thus inverted and
+				 * being asked to show all lines that _don't_
+				 * match a given expression. Therefore, set cno
+				 * to 0 to suggest the whole line matches.
 				 */
 				cno = 0;
 			}
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index daaf7b4c44..bf0b572dab 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -139,6 +139,15 @@ do
 		test_cmp expected actual
 	'

+	test_expect_success "grep $L (with --column, double-negation)" '
+		{
+			echo ${HC}file:1:foo_mmap bar mmap baz
+		} >expected &&
+		git grep --column --not \( --not -e foo --or --not -e baz \) $H -- file \
+			>actual &&
+		test_cmp expected actual
+	'
+
 	test_expect_success "grep -w $L (with --column, -C)" '
 		{
 			echo ${HC}file:5:foo mmap bar
@@ -162,6 +171,18 @@ do
 		test_cmp expected actual
 	'

+	test_expect_success "grep -w $L (with non-extended patterns, --column)" '
+		{
+			echo ${HC}file:5:foo mmap bar
+			echo ${HC}file:10:foo_mmap bar
+			echo ${HC}file:10:foo_mmap bar mmap
+			echo ${HC}file:5:foo mmap bar_mmap
+			echo ${HC}file:10:foo_mmap bar mmap baz
+		} >expected &&
+		git grep --column -w -e bar -e mmap $H >actual &&
+		test_cmp expected actual
+	'
+
 	test_expect_success "grep -w $L" '
 		{
 			echo ${HC}file:1:foo mmap bar
--
2.17.0.582.gccdcbd54c

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 1/7] Documentation/config.txt: camel-case lineNumber for consistency
  2018-06-20 20:05 ` [PATCH v2 " Taylor Blau
@ 2018-06-20 20:05   ` Taylor Blau
  2018-06-20 20:05   ` [PATCH v2 2/7] grep.c: expose {,inverted} match column in match_line() Taylor Blau
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-20 20:05 UTC (permalink / raw)
  To: git; +Cc: avarab, peff, gitster

lineNumber has casing that is inconsistent with surrounding options,
like color.grep.matchContext, and color.grep.matchSelected. Re-case this
documentation in order to be consistent with the text around it, and to
ensure that new entries are consistent, too.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/config.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index ab641bf5a9..58fde4daea 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1181,7 +1181,7 @@ color.grep.<slot>::
 	filename prefix (when not using `-h`)
 `function`;;
 	function name lines (when using `-p`)
-`linenumber`;;
+`lineNumber`;;
 	line number prefix (when using `-n`)
 `match`;;
 	matching text (same as setting `matchContext` and `matchSelected`)
-- 
2.17.0.582.gccdcbd54c


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 2/7] grep.c: expose {,inverted} match column in match_line()
  2018-06-20 20:05 ` [PATCH v2 " Taylor Blau
  2018-06-20 20:05   ` [PATCH v2 1/7] Documentation/config.txt: camel-case lineNumber for consistency Taylor Blau
@ 2018-06-20 20:05   ` Taylor Blau
  2018-06-20 20:05   ` [PATCH v2 3/7] grep.[ch]: extend grep_opt to allow showing matched column Taylor Blau
                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-20 20:05 UTC (permalink / raw)
  To: git; +Cc: avarab, peff, gitster

When calling match_line(), callers presently cannot determine the
relative offset of the match because match_line() discards the
'regmatch_t' that contains this information.

Instead, teach match_line() to take in two 'ssize_t's. Fill the first
with the offset of the match produced by the given expression. If
extended, fill the later with the offset of the match produced as if
--invert were given.

For instance, matching "--not -e x" on this line produces a columnar
offset of 0, (i.e., the whole line does not contain an x), but "--invert
--not -e -x" will fill the later ssize_t of the column containing an
"x", because this expression is semantically equivalent to "-e x".

To determine the column for the inverted and non-inverted case, do the
following:

  - If matching an atom, the non-inverted column is as given from
    match_one_pattern(), and the inverted column is unset.

  - If matching a --not, the inverted column and non-inverted column
    swap.

  - If matching an --and, or --or, the non-inverted column is the
    minimum of the two children, with the exception that --or is
    short-circuiting. For instance, if we match "-e a --or -e b" on a
    line that contains both "a" and "b" (and "b" comes first), the match
    column will hold "a", since we inspected the left child first, and
    short-circuited over the right child.

This patch will become useful when we later pick between the two new
results in order to display the column number of the first match on a
line with --column.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 grep.c | 58 +++++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 39 insertions(+), 19 deletions(-)

diff --git a/grep.c b/grep.c
index 45ec7e636c..dedfe17f93 100644
--- a/grep.c
+++ b/grep.c
@@ -1248,11 +1248,11 @@ static int match_one_pattern(struct grep_pat *p, char *bol, char *eol,
 	return hit;
 }
 
-static int match_expr_eval(struct grep_expr *x, char *bol, char *eol,
-			   enum grep_context ctx, int collect_hits)
+static int match_expr_eval(struct grep_opt *opt, struct grep_expr *x, char *bol,
+			   char *eol, enum grep_context ctx, ssize_t *col,
+			   ssize_t *icol, int collect_hits)
 {
 	int h = 0;
-	regmatch_t match;
 
 	if (!x)
 		die("Not a valid grep expression");
@@ -1261,25 +1261,39 @@ static int match_expr_eval(struct grep_expr *x, char *bol, char *eol,
 		h = 1;
 		break;
 	case GREP_NODE_ATOM:
-		h = match_one_pattern(x->u.atom, bol, eol, ctx, &match, 0);
+		{
+			regmatch_t tmp;
+			h = match_one_pattern(x->u.atom, bol, eol, ctx,
+					      &tmp, 0);
+			if (h && (*col < 0 || tmp.rm_so < *col))
+				*col = tmp.rm_so;
+		}
 		break;
 	case GREP_NODE_NOT:
-		h = !match_expr_eval(x->u.unary, bol, eol, ctx, 0);
+		/*
+		 * Upon visiting a GREP_NODE_NOT, col and icol become swapped.
+		 */
+		h = !match_expr_eval(opt, x->u.unary, bol, eol, ctx, icol, col,
+				     0);
 		break;
 	case GREP_NODE_AND:
-		if (!match_expr_eval(x->u.binary.left, bol, eol, ctx, 0))
+		if (!match_expr_eval(opt, x->u.binary.left, bol, eol, ctx, col,
+				     icol, 0))
 			return 0;
-		h = match_expr_eval(x->u.binary.right, bol, eol, ctx, 0);
+		h = match_expr_eval(opt, x->u.binary.right, bol, eol, ctx, col,
+				    icol, 0);
 		break;
 	case GREP_NODE_OR:
 		if (!collect_hits)
-			return (match_expr_eval(x->u.binary.left,
-						bol, eol, ctx, 0) ||
-				match_expr_eval(x->u.binary.right,
-						bol, eol, ctx, 0));
-		h = match_expr_eval(x->u.binary.left, bol, eol, ctx, 0);
+			return (match_expr_eval(opt, x->u.binary.left, bol, eol,
+						ctx, col, icol, 0) ||
+				match_expr_eval(opt, x->u.binary.right, bol,
+						eol, ctx, col, icol, 0));
+		h = match_expr_eval(opt, x->u.binary.left, bol, eol, ctx, col,
+				    icol, 0);
 		x->u.binary.left->hit |= h;
-		h |= match_expr_eval(x->u.binary.right, bol, eol, ctx, 1);
+		h |= match_expr_eval(opt, x->u.binary.right, bol, eol, ctx, col,
+				     icol, 1);
 		break;
 	default:
 		die("Unexpected node type (internal error) %d", x->node);
@@ -1290,25 +1304,30 @@ static int match_expr_eval(struct grep_expr *x, char *bol, char *eol,
 }
 
 static int match_expr(struct grep_opt *opt, char *bol, char *eol,
-		      enum grep_context ctx, int collect_hits)
+		      enum grep_context ctx, ssize_t *col,
+		      ssize_t *icol, int collect_hits)
 {
 	struct grep_expr *x = opt->pattern_expression;
-	return match_expr_eval(x, bol, eol, ctx, collect_hits);
+	return match_expr_eval(opt, x, bol, eol, ctx, col, icol, collect_hits);
 }
 
 static int match_line(struct grep_opt *opt, char *bol, char *eol,
+		      ssize_t *col, ssize_t *icol,
 		      enum grep_context ctx, int collect_hits)
 {
 	struct grep_pat *p;
-	regmatch_t match;
 
 	if (opt->extended)
-		return match_expr(opt, bol, eol, ctx, collect_hits);
+		return match_expr(opt, bol, eol, ctx, col, icol,
+				  collect_hits);
 
 	/* we do not call with collect_hits without being extended */
 	for (p = opt->pattern_list; p; p = p->next) {
-		if (match_one_pattern(p, bol, eol, ctx, &match, 0))
+		regmatch_t tmp;
+		if (match_one_pattern(p, bol, eol, ctx, &tmp, 0)) {
+			*col = tmp.rm_so;
 			return 1;
+		}
 	}
 	return 0;
 }
@@ -1763,6 +1782,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 	while (left) {
 		char *eol, ch;
 		int hit;
+		ssize_t col = -1, icol = -1;
 
 		/*
 		 * look_ahead() skips quickly to the line that possibly
@@ -1786,7 +1806,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 		if ((ctx == GREP_CONTEXT_HEAD) && (eol == bol))
 			ctx = GREP_CONTEXT_BODY;
 
-		hit = match_line(opt, bol, eol, ctx, collect_hits);
+		hit = match_line(opt, bol, eol, &col, &icol, ctx, collect_hits);
 		*eol = ch;
 
 		if (collect_hits)
-- 
2.17.0.582.gccdcbd54c


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 3/7] grep.[ch]: extend grep_opt to allow showing matched column
  2018-06-20 20:05 ` [PATCH v2 " Taylor Blau
  2018-06-20 20:05   ` [PATCH v2 1/7] Documentation/config.txt: camel-case lineNumber for consistency Taylor Blau
  2018-06-20 20:05   ` [PATCH v2 2/7] grep.c: expose {,inverted} match column in match_line() Taylor Blau
@ 2018-06-20 20:05   ` Taylor Blau
  2018-06-20 20:05   ` [PATCH v2 4/7] grep.c: display column number of first match Taylor Blau
                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-20 20:05 UTC (permalink / raw)
  To: git; +Cc: avarab, peff, gitster

To support showing the matched column when calling 'git-grep(1)', teach
'grep_opt' the normal set of options to configure the default behavior
and colorization of this feature.

Now that we have opt->columnnum, use it to disable short-circuiting over
ORs so that col and icol are always filled with the earliest matches on
each line. In addition, don't return the first match from match_line(),
for the same reason.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 grep.c | 33 +++++++++++++++++++++++++++------
 grep.h |  2 ++
 2 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/grep.c b/grep.c
index dedfe17f93..d56d4a3a37 100644
--- a/grep.c
+++ b/grep.c
@@ -46,6 +46,7 @@ void init_grep_defaults(void)
 	color_set(opt->color_filename, "");
 	color_set(opt->color_function, "");
 	color_set(opt->color_lineno, "");
+	color_set(opt->color_columnno, "");
 	color_set(opt->color_match_context, GIT_COLOR_BOLD_RED);
 	color_set(opt->color_match_selected, GIT_COLOR_BOLD_RED);
 	color_set(opt->color_selected, "");
@@ -155,6 +156,7 @@ void grep_init(struct grep_opt *opt, const char *prefix)
 	opt->extended_regexp_option = def->extended_regexp_option;
 	opt->pattern_type_option = def->pattern_type_option;
 	opt->linenum = def->linenum;
+	opt->columnnum = def->columnnum;
 	opt->max_depth = def->max_depth;
 	opt->pathname = def->pathname;
 	opt->relative = def->relative;
@@ -164,6 +166,7 @@ void grep_init(struct grep_opt *opt, const char *prefix)
 	color_set(opt->color_filename, def->color_filename);
 	color_set(opt->color_function, def->color_function);
 	color_set(opt->color_lineno, def->color_lineno);
+	color_set(opt->color_columnno, def->color_columnno);
 	color_set(opt->color_match_context, def->color_match_context);
 	color_set(opt->color_match_selected, def->color_match_selected);
 	color_set(opt->color_selected, def->color_selected);
@@ -1284,16 +1287,23 @@ static int match_expr_eval(struct grep_opt *opt, struct grep_expr *x, char *bol,
 				    icol, 0);
 		break;
 	case GREP_NODE_OR:
-		if (!collect_hits)
+		if (!(collect_hits || opt->columnnum)) {
+			/*
+			 * Don't short-circuit OR when given --column (or
+			 * collecting hits) to ensure we don't skip a later
+			 * child that would produce an earlier match.
+			 */
 			return (match_expr_eval(opt, x->u.binary.left, bol, eol,
 						ctx, col, icol, 0) ||
 				match_expr_eval(opt, x->u.binary.right, bol,
 						eol, ctx, col, icol, 0));
+		}
 		h = match_expr_eval(opt, x->u.binary.left, bol, eol, ctx, col,
 				    icol, 0);
-		x->u.binary.left->hit |= h;
+		if (collect_hits)
+			x->u.binary.left->hit |= h;
 		h |= match_expr_eval(opt, x->u.binary.right, bol, eol, ctx, col,
-				     icol, 1);
+				     icol, collect_hits);
 		break;
 	default:
 		die("Unexpected node type (internal error) %d", x->node);
@@ -1316,6 +1326,7 @@ static int match_line(struct grep_opt *opt, char *bol, char *eol,
 		      enum grep_context ctx, int collect_hits)
 {
 	struct grep_pat *p;
+	int hit = 0;
 
 	if (opt->extended)
 		return match_expr(opt, bol, eol, ctx, col, icol,
@@ -1325,11 +1336,21 @@ static int match_line(struct grep_opt *opt, char *bol, char *eol,
 	for (p = opt->pattern_list; p; p = p->next) {
 		regmatch_t tmp;
 		if (match_one_pattern(p, bol, eol, ctx, &tmp, 0)) {
-			*col = tmp.rm_so;
-			return 1;
+			hit |= 1;
+			if (!opt->columnnum) {
+				/*
+				 * Without --column, any single match on a line
+				 * is enough to know that it needs to be
+				 * printed. With --column, scan _all_ patterns
+				 * to find the earliest.
+				 */
+				break;
+			}
+			if (*col < 0 || tmp.rm_so < *col)
+				*col = tmp.rm_so;
 		}
 	}
-	return 0;
+	return hit;
 }
 
 static int match_next_pattern(struct grep_pat *p, char *bol, char *eol,
diff --git a/grep.h b/grep.h
index 399381c908..08a0b391c5 100644
--- a/grep.h
+++ b/grep.h
@@ -127,6 +127,7 @@ struct grep_opt {
 	int prefix_length;
 	regex_t regexp;
 	int linenum;
+	int columnnum;
 	int invert;
 	int ignore_case;
 	int status_only;
@@ -159,6 +160,7 @@ struct grep_opt {
 	char color_filename[COLOR_MAXLEN];
 	char color_function[COLOR_MAXLEN];
 	char color_lineno[COLOR_MAXLEN];
+	char color_columnno[COLOR_MAXLEN];
 	char color_match_context[COLOR_MAXLEN];
 	char color_match_selected[COLOR_MAXLEN];
 	char color_selected[COLOR_MAXLEN];
-- 
2.17.0.582.gccdcbd54c


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 4/7] grep.c: display column number of first match
  2018-06-20 20:05 ` [PATCH v2 " Taylor Blau
                     ` (2 preceding siblings ...)
  2018-06-20 20:05   ` [PATCH v2 3/7] grep.[ch]: extend grep_opt to allow showing matched column Taylor Blau
@ 2018-06-20 20:05   ` Taylor Blau
  2018-06-20 20:05   ` [PATCH v2 5/7] builtin/grep.c: add '--column' option to 'git-grep(1)' Taylor Blau
                     ` (3 subsequent siblings)
  7 siblings, 0 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-20 20:05 UTC (permalink / raw)
  To: git; +Cc: avarab, peff, gitster

To prepare for 'git grep' learning '--column', teach grep.c's
show_line() how to show the column of the first match on non-context
lines.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 grep.c | 33 ++++++++++++++++++++++++++++-----
 1 file changed, 28 insertions(+), 5 deletions(-)

diff --git a/grep.c b/grep.c
index d56d4a3a37..d353d5d976 100644
--- a/grep.c
+++ b/grep.c
@@ -1399,7 +1399,7 @@ static int next_match(struct grep_opt *opt, char *bol, char *eol,
 }
 
 static void show_line(struct grep_opt *opt, char *bol, char *eol,
-		      const char *name, unsigned lno, char sign)
+		      const char *name, unsigned lno, ssize_t cno, char sign)
 {
 	int rest = eol - bol;
 	const char *match_color, *line_color = NULL;
@@ -1434,6 +1434,17 @@ static void show_line(struct grep_opt *opt, char *bol, char *eol,
 		output_color(opt, buf, strlen(buf), opt->color_lineno);
 		output_sep(opt, sign);
 	}
+	/*
+	 * Treat 'cno' as the 1-indexed offset from the start of a non-context
+	 * line to its first match. Otherwise, 'cno' is 0 indicating that we are
+	 * being called with a context line.
+	 */
+	if (opt->columnnum && cno) {
+		char buf[32];
+		xsnprintf(buf, sizeof(buf), "%zu", cno);
+		output_color(opt, buf, strlen(buf), opt->color_columnno);
+		output_sep(opt, sign);
+	}
 	if (opt->color) {
 		regmatch_t match;
 		enum grep_context ctx = GREP_CONTEXT_BODY;
@@ -1539,7 +1550,7 @@ static void show_funcname_line(struct grep_opt *opt, struct grep_source *gs,
 			break;
 
 		if (match_funcname(opt, gs, bol, eol)) {
-			show_line(opt, bol, eol, gs->name, lno, '=');
+			show_line(opt, bol, eol, gs->name, lno, 0, '=');
 			break;
 		}
 	}
@@ -1604,7 +1615,7 @@ static void show_pre_context(struct grep_opt *opt, struct grep_source *gs,
 
 		while (*eol != '\n')
 			eol++;
-		show_line(opt, bol, eol, gs->name, cur, sign);
+		show_line(opt, bol, eol, gs->name, cur, 0, sign);
 		bol = eol + 1;
 		cur++;
 	}
@@ -1803,6 +1814,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 	while (left) {
 		char *eol, ch;
 		int hit;
+		ssize_t cno;
 		ssize_t col = -1, icol = -1;
 
 		/*
@@ -1868,7 +1880,18 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 				show_pre_context(opt, gs, bol, eol, lno);
 			else if (opt->funcname)
 				show_funcname_line(opt, gs, bol, lno);
-			show_line(opt, bol, eol, gs->name, lno, ':');
+			cno = opt->invert ? icol : col;
+			if (cno < 0) {
+				/*
+				 * A negative cno indicates that there was no
+				 * match on the line. We are thus inverted and
+				 * being asked to show all lines that _don't_
+				 * match a given expression. Therefore, set cno
+				 * to 0 to suggest the whole line matches.
+				 */
+				cno = 0;
+			}
+			show_line(opt, bol, eol, gs->name, lno, cno + 1, ':');
 			last_hit = lno;
 			if (opt->funcbody)
 				show_function = 1;
@@ -1897,7 +1920,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 			/* If the last hit is within the post context,
 			 * we need to show this line.
 			 */
-			show_line(opt, bol, eol, gs->name, lno, '-');
+			show_line(opt, bol, eol, gs->name, lno, col + 1, '-');
 		}
 
 	next_line:
-- 
2.17.0.582.gccdcbd54c


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 5/7] builtin/grep.c: add '--column' option to 'git-grep(1)'
  2018-06-20 20:05 ` [PATCH v2 " Taylor Blau
                     ` (3 preceding siblings ...)
  2018-06-20 20:05   ` [PATCH v2 4/7] grep.c: display column number of first match Taylor Blau
@ 2018-06-20 20:05   ` Taylor Blau
  2018-06-20 20:05   ` [PATCH v2 6/7] grep.c: add configuration variables to show matched option Taylor Blau
                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-20 20:05 UTC (permalink / raw)
  To: git; +Cc: avarab, peff, gitster

Teach 'git-grep(1)' a new option, '--column', to show the column
number of the first match on a non-context line. This makes it possible
to teach 'contrib/git-jump/git-jump' how to seek to the first matching
position of a grep match in your editor, and allows similar additional
scripting capabilities.

For example:

  $ git grep -n --column foo | head -n3
  .clang-format:51:14:# myFunction(foo, bar, baz);
  .clang-format:64:7:# int foo();
  .clang-format:75:8:# void foo()

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/git-grep.txt |  6 ++-
 builtin/grep.c             |  1 +
 t/t7810-grep.sh            | 84 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 90 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 312409a607..31dc0392a6 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -13,7 +13,7 @@ SYNOPSIS
 	   [-v | --invert-match] [-h|-H] [--full-name]
 	   [-E | --extended-regexp] [-G | --basic-regexp]
 	   [-P | --perl-regexp]
-	   [-F | --fixed-strings] [-n | --line-number]
+	   [-F | --fixed-strings] [-n | --line-number] [--column]
 	   [-l | --files-with-matches] [-L | --files-without-match]
 	   [(-O | --open-files-in-pager) [<pager>]]
 	   [-z | --null]
@@ -169,6 +169,10 @@ providing this option will cause it to die.
 --line-number::
 	Prefix the line number to matching lines.
 
+--column::
+	Prefix the 1-indexed byte-offset of the first match from the start of the
+	matching line.
+
 -l::
 --files-with-matches::
 --name-only::
diff --git a/builtin/grep.c b/builtin/grep.c
index ee753a403e..61bcaf6e58 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -828,6 +828,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			    GREP_PATTERN_TYPE_PCRE),
 		OPT_GROUP(""),
 		OPT_BOOL('n', "line-number", &opt.linenum, N_("show line numbers")),
+		OPT_BOOL(0, "column", &opt.columnnum, N_("show column number of first match")),
 		OPT_NEGBIT('h', NULL, &opt.pathname, N_("don't show filenames"), 1),
 		OPT_BIT('H', NULL, &opt.pathname, N_("show filenames"), 1),
 		OPT_NEGBIT(0, "full-name", &opt.relative,
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 1797f632a3..bf0b572dab 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -99,6 +99,90 @@ do
 		test_cmp expected actual
 	'
 
+	test_expect_success "grep -w $L (with --column)" '
+		{
+			echo ${HC}file:5:foo mmap bar
+			echo ${HC}file:14:foo_mmap bar mmap
+			echo ${HC}file:5:foo mmap bar_mmap
+			echo ${HC}file:14:foo_mmap bar mmap baz
+		} >expected &&
+		git grep --column -w -e mmap $H >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep -w $L (with --column, extended)" '
+		{
+			echo ${HC}file:14:foo_mmap bar mmap
+			echo ${HC}file:19:foo_mmap bar mmap baz
+		} >expected &&
+		git grep --column -w -e mmap$ --or -e baz $H >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep -w $L (with --column, --invert)" '
+		{
+			echo ${HC}file:1:foo mmap bar
+			echo ${HC}file:1:foo_mmap bar
+			echo ${HC}file:1:foo_mmap bar mmap
+			echo ${HC}file:1:foo mmap bar_mmap
+		} >expected &&
+		git grep --column --invert -w -e baz $H -- file >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L (with --column, --invert, extended)" '
+		{
+			echo ${HC}hello_world:6:HeLLo_world
+		} >expected &&
+		git grep --column --invert -e ll --or --not -e _ $H -- hello_world \
+			>actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L (with --column, double-negation)" '
+		{
+			echo ${HC}file:1:foo_mmap bar mmap baz
+		} >expected &&
+		git grep --column --not \( --not -e foo --or --not -e baz \) $H -- file \
+			>actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep -w $L (with --column, -C)" '
+		{
+			echo ${HC}file:5:foo mmap bar
+			echo ${HC}file-foo_mmap bar
+			echo ${HC}file:14:foo_mmap bar mmap
+			echo ${HC}file:5:foo mmap bar_mmap
+			echo ${HC}file:14:foo_mmap bar mmap baz
+		} >expected &&
+		git grep --column -w -C1 -e mmap $H >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep -w $L (with --line-number, --column)" '
+		{
+			echo ${HC}file:1:5:foo mmap bar
+			echo ${HC}file:3:14:foo_mmap bar mmap
+			echo ${HC}file:4:5:foo mmap bar_mmap
+			echo ${HC}file:5:14:foo_mmap bar mmap baz
+		} >expected &&
+		git grep -n --column -w -e mmap $H >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep -w $L (with non-extended patterns, --column)" '
+		{
+			echo ${HC}file:5:foo mmap bar
+			echo ${HC}file:10:foo_mmap bar
+			echo ${HC}file:10:foo_mmap bar mmap
+			echo ${HC}file:5:foo mmap bar_mmap
+			echo ${HC}file:10:foo_mmap bar mmap baz
+		} >expected &&
+		git grep --column -w -e bar -e mmap $H >actual &&
+		test_cmp expected actual
+	'
+
 	test_expect_success "grep -w $L" '
 		{
 			echo ${HC}file:1:foo mmap bar
-- 
2.17.0.582.gccdcbd54c


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 6/7] grep.c: add configuration variables to show matched option
  2018-06-20 20:05 ` [PATCH v2 " Taylor Blau
                     ` (4 preceding siblings ...)
  2018-06-20 20:05   ` [PATCH v2 5/7] builtin/grep.c: add '--column' option to 'git-grep(1)' Taylor Blau
@ 2018-06-20 20:05   ` Taylor Blau
  2018-06-20 20:05   ` [PATCH v2 7/7] contrib/git-jump/git-jump: jump to exact location Taylor Blau
  2018-06-21 11:53   ` [PATCH v2 0/7] grep.c: teach --column to 'git-grep(1)' Jeff King
  7 siblings, 0 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-20 20:05 UTC (permalink / raw)
  To: git; +Cc: avarab, peff, gitster

To support git-grep(1)'s new option, '--column', document and teach
grep.c how to interpret relevant configuration options, similar to those
associated with '--line-number'.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/config.txt   | 5 +++++
 Documentation/git-grep.txt | 3 +++
 grep.c                     | 6 ++++++
 3 files changed, 14 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 58fde4daea..e4cbed3078 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1183,6 +1183,8 @@ color.grep.<slot>::
 	function name lines (when using `-p`)
 `lineNumber`;;
 	line number prefix (when using `-n`)
+`column`;;
+	column number prefix (when using `--column`)
 `match`;;
 	matching text (same as setting `matchContext` and `matchSelected`)
 `matchContext`;;
@@ -1797,6 +1799,9 @@ gitweb.snapshot::
 grep.lineNumber::
 	If set to true, enable `-n` option by default.
 
+grep.column::
+	If set to true, enable the `--column` option by default.
+
 grep.patternType::
 	Set the default matching behavior. Using a value of 'basic', 'extended',
 	'fixed', or 'perl' will enable the `--basic-regexp`, `--extended-regexp`,
diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 31dc0392a6..0de3493b80 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -44,6 +44,9 @@ CONFIGURATION
 grep.lineNumber::
 	If set to true, enable `-n` option by default.
 
+grep.column::
+	If set to true, enable the `--column` option by default.
+
 grep.patternType::
 	Set the default matching behavior. Using a value of 'basic', 'extended',
 	'fixed', or 'perl' will enable the `--basic-regexp`, `--extended-regexp`,
diff --git a/grep.c b/grep.c
index d353d5d976..08d3df2855 100644
--- a/grep.c
+++ b/grep.c
@@ -96,6 +96,10 @@ int grep_config(const char *var, const char *value, void *cb)
 		opt->linenum = git_config_bool(var, value);
 		return 0;
 	}
+	if (!strcmp(var, "grep.column")) {
+		opt->columnnum = git_config_bool(var, value);
+		return 0;
+	}
 
 	if (!strcmp(var, "grep.fullname")) {
 		opt->relative = !git_config_bool(var, value);
@@ -112,6 +116,8 @@ int grep_config(const char *var, const char *value, void *cb)
 		color = opt->color_function;
 	else if (!strcmp(var, "color.grep.linenumber"))
 		color = opt->color_lineno;
+	else if (!strcmp(var, "color.grep.column"))
+		color = opt->color_columnno;
 	else if (!strcmp(var, "color.grep.matchcontext"))
 		color = opt->color_match_context;
 	else if (!strcmp(var, "color.grep.matchselected"))
-- 
2.17.0.582.gccdcbd54c


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 7/7] contrib/git-jump/git-jump: jump to exact location
  2018-06-20 20:05 ` [PATCH v2 " Taylor Blau
                     ` (5 preceding siblings ...)
  2018-06-20 20:05   ` [PATCH v2 6/7] grep.c: add configuration variables to show matched option Taylor Blau
@ 2018-06-20 20:05   ` Taylor Blau
  2018-06-21 11:53   ` [PATCH v2 0/7] grep.c: teach --column to 'git-grep(1)' Jeff King
  7 siblings, 0 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-20 20:05 UTC (permalink / raw)
  To: git; +Cc: avarab, peff, gitster

Take advantage of 'git-grep(1)''s new option, '--column' in order to
teach Peff's 'git-jump' script how to jump to the correct column for any
given match.

'git-grep(1)''s output is in the correct format for Vim's jump list, so
no additional cleanup is necessary.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 contrib/git-jump/README   | 12 ++++++++++--
 contrib/git-jump/git-jump |  2 +-
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/contrib/git-jump/README b/contrib/git-jump/README
index 4484bda410..2f618a7f97 100644
--- a/contrib/git-jump/README
+++ b/contrib/git-jump/README
@@ -25,6 +25,13 @@ git-jump will feed this to the editor:
 foo.c:2: printf("hello word!\n");
 -----------------------------------
 
+Or, when running 'git jump grep', column numbers will also be emitted,
+e.g. `git jump grep "hello"` would return:
+
+-----------------------------------
+foo.c:2:9: printf("hello word!\n");
+-----------------------------------
+
 Obviously this trivial case isn't that interesting; you could just open
 `foo.c` yourself. But when you have many changes scattered across a
 project, you can use the editor's support to "jump" from point to point.
@@ -35,7 +42,8 @@ Git-jump can generate four types of interesting lists:
 
   2. The beginning of any merge conflict markers.
 
-  3. Any grep matches.
+  3. Any grep matches, including the column of the first match on a
+     line.
 
   4. Any whitespace errors detected by `git diff --check`.
 
@@ -82,7 +90,7 @@ which does something similar to `git jump grep`. However, it is limited
 to positioning the cursor to the correct line in only the first file,
 leaving you to locate subsequent hits in that file or other files using
 the editor or pager. By contrast, git-jump provides the editor with a
-complete list of files and line numbers for each match.
+complete list of files, lines, and a column number for each match.
 
 
 Limitations
diff --git a/contrib/git-jump/git-jump b/contrib/git-jump/git-jump
index 80ab0590bc..931b0fe3a9 100755
--- a/contrib/git-jump/git-jump
+++ b/contrib/git-jump/git-jump
@@ -52,7 +52,7 @@ mode_merge() {
 # editor shows them to us in the status bar.
 mode_grep() {
 	cmd=$(git config jump.grepCmd)
-	test -n "$cmd" || cmd="git grep -n"
+	test -n "$cmd" || cmd="git grep -n --column"
 	$cmd "$@" |
 	perl -pe '
 	s/[ \t]+/ /g;
-- 
2.17.0.582.gccdcbd54c

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-20 20:05 ` [PATCH v2 " Taylor Blau
                     ` (6 preceding siblings ...)
  2018-06-20 20:05   ` [PATCH v2 7/7] contrib/git-jump/git-jump: jump to exact location Taylor Blau
@ 2018-06-21 11:53   ` Jeff King
  2018-06-21 12:01     ` Jeff King
                       ` (2 more replies)
  7 siblings, 3 replies; 56+ messages in thread
From: Jeff King @ 2018-06-21 11:53 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, avarab, gitster

On Wed, Jun 20, 2018 at 03:05:30PM -0500, Taylor Blau wrote:

> Hi,
> 
> Here is a re-roll of my series to add --column to 'git-grep(1)'. Since
> last time, not much has changed other than the following:
> 
>   - Fix a typo where 'col', 'icol' were spelled as 'match', 'imatch'
>     [1].
> 
>   - Disable short-circuiting OR when --column is given [2].

If we're going to do this, should we be short-circuiting AND, too?

Handling just OR makes this work:

  $ ./git grep --column -e scalable --or -e fast -- README.md
  README.md:7:Git - fast, scalable, distributed revision control system
  README.md:10:Git is a fast, scalable, distributed revision control system with an

but not this:

  $ ./git grep --column -v --not -e scalable --and --not -e fast -- README.md
  README.md:13:Git - fast, scalable, distributed revision control system
  README.md:16:Git is a fast, scalable, distributed revision control system with an

I wasn't sure where we landed in the discussion on "how much crazy stuff
to support". But AFAIK, the code in this iteration handles every crazy
case already except this one. If we're going to care about OR, maybe we
should just cover all cases.

> @@ -1429,7 +1447,7 @@ static void show_line(struct grep_opt *opt, char *bol, char *eol,
>  	 */
>  	if (opt->columnnum && cno) {
>  		char buf[32];
> -		xsnprintf(buf, sizeof(buf), "%d", cno);
> +		xsnprintf(buf, sizeof(buf), "%zu", cno);

Unfortunately %z isn't portable. You have to use:

  xsnprintf(buf, sizeof(buf), "%"PRIuMAX, (uintmax_t)cno);

-Peff

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-21 11:53   ` [PATCH v2 0/7] grep.c: teach --column to 'git-grep(1)' Jeff King
@ 2018-06-21 12:01     ` Jeff King
  2018-06-22 21:45       ` Johannes Schindelin
  2018-06-21 20:52     ` Junio C Hamano
  2018-06-21 21:45     ` Taylor Blau
  2 siblings, 1 reply; 56+ messages in thread
From: Jeff King @ 2018-06-21 12:01 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Johannes Schindelin, git, avarab, gitster

On Thu, Jun 21, 2018 at 07:53:02AM -0400, Jeff King wrote:

> > @@ -1429,7 +1447,7 @@ static void show_line(struct grep_opt *opt, char *bol, char *eol,
> >  	 */
> >  	if (opt->columnnum && cno) {
> >  		char buf[32];
> > -		xsnprintf(buf, sizeof(buf), "%d", cno);
> > +		xsnprintf(buf, sizeof(buf), "%zu", cno);
> 
> Unfortunately %z isn't portable. You have to use:
> 
>   xsnprintf(buf, sizeof(buf), "%"PRIuMAX, (uintmax_t)cno);

To clarify: yes, it is in C99. But traditionally we have not required
that.

This might be a candidate for another "weather balloon" patch to see if
anybody complains, though. The last time time we dealt with this in a
major way was over 7 years ago in 28bd70d811 (unbreak and eliminate
NO_C99_FORMAT, 2011-03-16).

I know Johannes switched out some "%lu" for PRIuMAX as recently as last
August[1], but I think that is more about the Windows size_t not matching
"unsigned long", and the decision to use PRIuMAX was to match the
existing codebase. AFAIK %zu is available on Windows.

-Peff

[1] https://public-inbox.org/git/alpine.DEB.2.21.1.1708100011391.11175@virtualbox/

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-21 11:53   ` [PATCH v2 0/7] grep.c: teach --column to 'git-grep(1)' Jeff King
  2018-06-21 12:01     ` Jeff King
@ 2018-06-21 20:52     ` Junio C Hamano
  2018-06-21 21:45     ` Taylor Blau
  2 siblings, 0 replies; 56+ messages in thread
From: Junio C Hamano @ 2018-06-21 20:52 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, git, avarab

Jeff King <peff@peff.net> writes:

> I wasn't sure where we landed in the discussion on "how much crazy stuff
> to support". But AFAIK, the code in this iteration handles every crazy
> case already except this one. If we're going to care about OR, maybe we
> should just cover all cases.

I think I was the only one who didn't like the tool knowing more
boolean logic then the mere mortal user.  As long as the resulting
behaviour appears logically consistent to more math minded people, I
am actually OK if the outcome to overly complex input feels "too
magical" ;-)

And it appears that not short-circuiting AND will make the result
better in that respect, so I am all for the suggestion above.


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-21 11:53   ` [PATCH v2 0/7] grep.c: teach --column to 'git-grep(1)' Jeff King
  2018-06-21 12:01     ` Jeff King
  2018-06-21 20:52     ` Junio C Hamano
@ 2018-06-21 21:45     ` Taylor Blau
  2018-06-22  7:22       ` Jeff King
  2 siblings, 1 reply; 56+ messages in thread
From: Taylor Blau @ 2018-06-21 21:45 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, git, avarab, gitster

On Thu, Jun 21, 2018 at 07:53:02AM -0400, Jeff King wrote:
> On Wed, Jun 20, 2018 at 03:05:30PM -0500, Taylor Blau wrote:
>
> > Hi,
> >
> > Here is a re-roll of my series to add --column to 'git-grep(1)'. Since
> > last time, not much has changed other than the following:
> >
> >   - Fix a typo where 'col', 'icol' were spelled as 'match', 'imatch'
> >     [1].
> >
> >   - Disable short-circuiting OR when --column is given [2].
>
> If we're going to do this, should we be short-circuiting AND, too?
>
> Handling just OR makes this work:
>
>   $ ./git grep --column -e scalable --or -e fast -- README.md
>   README.md:7:Git - fast, scalable, distributed revision control system
>   README.md:10:Git is a fast, scalable, distributed revision control system with an
>
> but not this:
>
>   $ ./git grep --column -v --not -e scalable --and --not -e fast -- README.md
>   README.md:13:Git - fast, scalable, distributed revision control system
>   README.md:16:Git is a fast, scalable, distributed revision control system with an
>
> I wasn't sure where we landed in the discussion on "how much crazy stuff
> to support". But AFAIK, the code in this iteration handles every crazy
> case already except this one. If we're going to care about OR, maybe we
> should just cover all cases.

Good catch. As I understand it, we need to short-circuit AND because an
--invert or a --not above it in the expression tree would cause it to be
turned into an OR by de Morgan's Law, and therefore be susceptible to
the same reasoning that caused me to remove short-circuiting on OR.

> > @@ -1429,7 +1447,7 @@ static void show_line(struct grep_opt *opt, char *bol, char *eol,
> >  	 */
> >  	if (opt->columnnum && cno) {
> >  		char buf[32];
> > -		xsnprintf(buf, sizeof(buf), "%d", cno);
> > +		xsnprintf(buf, sizeof(buf), "%zu", cno);
>
> Unfortunately %z isn't portable. You have to use:
>
>   xsnprintf(buf, sizeof(buf), "%"PRIuMAX, (uintmax_t)cno);

I see. Per your next note, I've changed this to "%"PRIuMAX (and the
appropriate cast into uintmax_t), and will send another patch out in the
future changing it _back_ to "%zu" and figure out how folks feel about
it.

I published these changes on my fork at [1], and will let this thread
sit overnight again to see if anyone has any more feedback. Thank you!

> -Peff
Thanks,
Taylor

[1]: https://github.com/ttaylorr/git/compare/tb/grep-colno

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-21 21:45     ` Taylor Blau
@ 2018-06-22  7:22       ` Jeff King
  0 siblings, 0 replies; 56+ messages in thread
From: Jeff King @ 2018-06-22  7:22 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, avarab, gitster

On Thu, Jun 21, 2018 at 04:45:49PM -0500, Taylor Blau wrote:

> > but not this:
> >
> >   $ ./git grep --column -v --not -e scalable --and --not -e fast -- README.md
> >   README.md:13:Git - fast, scalable, distributed revision control system
> >   README.md:16:Git is a fast, scalable, distributed revision control system with an
> >
> > I wasn't sure where we landed in the discussion on "how much crazy stuff
> > to support". But AFAIK, the code in this iteration handles every crazy
> > case already except this one. If we're going to care about OR, maybe we
> > should just cover all cases.
> 
> Good catch. As I understand it, we need to short-circuit AND because an
> --invert or a --not above it in the expression tree would cause it to be
> turned into an OR by de Morgan's Law, and therefore be susceptible to
> the same reasoning that caused me to remove short-circuiting on OR.

Right, exactly.

> > Unfortunately %z isn't portable. You have to use:
> >
> >   xsnprintf(buf, sizeof(buf), "%"PRIuMAX, (uintmax_t)cno);
> 
> I see. Per your next note, I've changed this to "%"PRIuMAX (and the
> appropriate cast into uintmax_t), and will send another patch out in the
> future changing it _back_ to "%zu" and figure out how folks feel about
> it.

That sounds perfect. I think the time may be good for a "%zu"
weather-balloon, but it's best not to tangle it up with your other
changes.

-Peff

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [PATCH v3 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-18 23:43 [PATCH 0/7] grep.c: teach --column to 'git-grep(1)' Taylor Blau
                   ` (10 preceding siblings ...)
  2018-06-20 20:05 ` [PATCH v2 " Taylor Blau
@ 2018-06-22 15:49 ` Taylor Blau
  2018-06-22 15:49   ` [PATCH v3 1/7] Documentation/config.txt: camel-case lineNumber for consistency Taylor Blau
                     ` (7 more replies)
  11 siblings, 8 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-22 15:49 UTC (permalink / raw)
  To: git; +Cc: avarab, peff, gitster

Hi,

Attached is my third--anticipate the final--re-roll of my series to
teach 'git grep --column'.

Since the last time, only a couple of things have changed at Peff's
suggestions in [1]. The changes are summarized here, and an inter-diff
is available below:

  - Change "%zu" to PRIuMAX (and an appropriate cast into uintmax_t). I
    plan to send a follow-up patch to convert this back to "%zu" to see
    how people feel about it, but I wanted to keep that out of the
    present series in order to not hold things up.

  - Don't short-circuit AND when given --column, since an earlier NOT
    higher in the tree may cause an AND to be converted into an OR via
    de Morgan's Law, in which case the problem is reduced to the OR case
    (and should not have been short-circuited in the first place).

  - Add a test in t7810 to cover this behavior (i.e., '--not \( -e x
    --and -e y \)').

Thanks,
Taylor

[1]: https://public-inbox.org/git/20180621115302.GB15293@sigill.intra.peff.net/

Taylor Blau (7):
  Documentation/config.txt: camel-case lineNumber for consistency
  grep.c: expose {,inverted} match column in match_line()
  grep.[ch]: extend grep_opt to allow showing matched column
  grep.c: display column number of first match
  builtin/grep.c: add '--column' option to 'git-grep(1)'
  grep.c: add configuration variables to show matched option
  contrib/git-jump/git-jump: jump to exact location

 Documentation/config.txt   |   7 +-
 Documentation/git-grep.txt |   9 ++-
 builtin/grep.c             |   1 +
 contrib/git-jump/README    |  12 +++-
 contrib/git-jump/git-jump  |   2 +-
 grep.c                     | 134 +++++++++++++++++++++++++++++--------
 grep.h                     |   2 +
 t/t7810-grep.sh            |  95 ++++++++++++++++++++++++++
 8 files changed, 228 insertions(+), 34 deletions(-)

Inter-diff (since: v2)

diff --git a/grep.c b/grep.c
index 08d3df2855..992673fe7e 100644
--- a/grep.c
+++ b/grep.c
@@ -1286,11 +1286,17 @@ static int match_expr_eval(struct grep_opt *opt, struct grep_expr *x, char *bol,
 				     0);
 		break;
 	case GREP_NODE_AND:
-		if (!match_expr_eval(opt, x->u.binary.left, bol, eol, ctx, col,
-				     icol, 0))
-			return 0;
-		h = match_expr_eval(opt, x->u.binary.right, bol, eol, ctx, col,
+		h = match_expr_eval(opt, x->u.binary.left, bol, eol, ctx, col,
 				    icol, 0);
+		if (h || opt->columnnum) {
+			/*
+			 * Don't short-circuit AND when given --column, since a
+			 * NOT earlier in the tree may turn this into an OR. In
+			 * this case, see the below comment.
+			 */
+			h &= match_expr_eval(opt, x->u.binary.right, bol, eol,
+					     ctx, col, icol, 0);
+		}
 		break;
 	case GREP_NODE_OR:
 		if (!(collect_hits || opt->columnnum)) {
@@ -1447,7 +1453,7 @@ static void show_line(struct grep_opt *opt, char *bol, char *eol,
 	 */
 	if (opt->columnnum && cno) {
 		char buf[32];
-		xsnprintf(buf, sizeof(buf), "%zu", cno);
+		xsnprintf(buf, sizeof(buf), "%"PRIuMAX, (uintmax_t)cno);
 		output_color(opt, buf, strlen(buf), opt->color_columnno);
 		output_sep(opt, sign);
 	}
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index bf0b572dab..9312c8daf5 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -110,7 +110,7 @@ do
 		test_cmp expected actual
 	'

-	test_expect_success "grep -w $L (with --column, extended)" '
+	test_expect_success "grep -w $L (with --column, extended OR)" '
 		{
 			echo ${HC}file:14:foo_mmap bar mmap
 			echo ${HC}file:19:foo_mmap bar mmap baz
@@ -130,7 +130,7 @@ do
 		test_cmp expected actual
 	'

-	test_expect_success "grep $L (with --column, --invert, extended)" '
+	test_expect_success "grep $L (with --column, --invert, extended OR)" '
 		{
 			echo ${HC}hello_world:6:HeLLo_world
 		} >expected &&
@@ -139,6 +139,17 @@ do
 		test_cmp expected actual
 	'

+	test_expect_success "grep $L (with --column, --invert, extended AND)" '
+		{
+			echo ${HC}hello_world:3:Hello world
+			echo ${HC}hello_world:3:Hello_world
+			echo ${HC}hello_world:6:HeLLo_world
+		} >expected &&
+		git grep --column --invert --not -e _ --and --not -e ll $H -- hello_world \
+			>actual &&
+		test_cmp expected actual
+	'
+
 	test_expect_success "grep $L (with --column, double-negation)" '
 		{
 			echo ${HC}file:1:foo_mmap bar mmap baz

--
2.18.0

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 1/7] Documentation/config.txt: camel-case lineNumber for consistency
  2018-06-22 15:49 ` [PATCH v3 " Taylor Blau
@ 2018-06-22 15:49   ` Taylor Blau
  2018-06-22 15:49   ` [PATCH v3 2/7] grep.c: expose {,inverted} match column in match_line() Taylor Blau
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-22 15:49 UTC (permalink / raw)
  To: git; +Cc: avarab, peff, gitster

lineNumber has casing that is inconsistent with surrounding options,
like color.grep.matchContext, and color.grep.matchSelected. Re-case this
documentation in order to be consistent with the text around it, and to
ensure that new entries are consistent, too.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/config.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index ab641bf5a9..58fde4daea 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1181,7 +1181,7 @@ color.grep.<slot>::
 	filename prefix (when not using `-h`)
 `function`;;
 	function name lines (when using `-p`)
-`linenumber`;;
+`lineNumber`;;
 	line number prefix (when using `-n`)
 `match`;;
 	matching text (same as setting `matchContext` and `matchSelected`)
-- 
2.18.0


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 2/7] grep.c: expose {,inverted} match column in match_line()
  2018-06-22 15:49 ` [PATCH v3 " Taylor Blau
  2018-06-22 15:49   ` [PATCH v3 1/7] Documentation/config.txt: camel-case lineNumber for consistency Taylor Blau
@ 2018-06-22 15:49   ` Taylor Blau
  2018-06-22 15:49   ` [PATCH v3 3/7] grep.[ch]: extend grep_opt to allow showing matched column Taylor Blau
                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-22 15:49 UTC (permalink / raw)
  To: git; +Cc: avarab, peff, gitster

When calling match_line(), callers presently cannot determine the
relative offset of the match because match_line() discards the
'regmatch_t' that contains this information.

Instead, teach match_line() to take in two 'ssize_t's. Fill the first
with the offset of the match produced by the given expression. If
extended, fill the later with the offset of the match produced as if
--invert were given.

For instance, matching "--not -e x" on this line produces a columnar
offset of 0, (i.e., the whole line does not contain an x), but "--invert
--not -e -x" will fill the later ssize_t of the column containing an
"x", because this expression is semantically equivalent to "-e x".

To determine the column for the inverted and non-inverted case, do the
following:

  - If matching an atom, the non-inverted column is as given from
    match_one_pattern(), and the inverted column is unset.

  - If matching a --not, the inverted column and non-inverted column
    swap.

  - If matching an --and, or --or, the non-inverted column is the
    minimum of the two children.

Presently, the existing short-circuiting logic for AND and OR applies as
before. This will change in the following commit when we add options to
configure the --column flag. Taken together, this and the forthcoming
change will always yield the earlier column on a given line.

This patch will become useful when we later pick between the two new
results in order to display the column number of the first match on a
line with --column.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 grep.c | 58 +++++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 39 insertions(+), 19 deletions(-)

diff --git a/grep.c b/grep.c
index 45ec7e636c..dedfe17f93 100644
--- a/grep.c
+++ b/grep.c
@@ -1248,11 +1248,11 @@ static int match_one_pattern(struct grep_pat *p, char *bol, char *eol,
 	return hit;
 }
 
-static int match_expr_eval(struct grep_expr *x, char *bol, char *eol,
-			   enum grep_context ctx, int collect_hits)
+static int match_expr_eval(struct grep_opt *opt, struct grep_expr *x, char *bol,
+			   char *eol, enum grep_context ctx, ssize_t *col,
+			   ssize_t *icol, int collect_hits)
 {
 	int h = 0;
-	regmatch_t match;
 
 	if (!x)
 		die("Not a valid grep expression");
@@ -1261,25 +1261,39 @@ static int match_expr_eval(struct grep_expr *x, char *bol, char *eol,
 		h = 1;
 		break;
 	case GREP_NODE_ATOM:
-		h = match_one_pattern(x->u.atom, bol, eol, ctx, &match, 0);
+		{
+			regmatch_t tmp;
+			h = match_one_pattern(x->u.atom, bol, eol, ctx,
+					      &tmp, 0);
+			if (h && (*col < 0 || tmp.rm_so < *col))
+				*col = tmp.rm_so;
+		}
 		break;
 	case GREP_NODE_NOT:
-		h = !match_expr_eval(x->u.unary, bol, eol, ctx, 0);
+		/*
+		 * Upon visiting a GREP_NODE_NOT, col and icol become swapped.
+		 */
+		h = !match_expr_eval(opt, x->u.unary, bol, eol, ctx, icol, col,
+				     0);
 		break;
 	case GREP_NODE_AND:
-		if (!match_expr_eval(x->u.binary.left, bol, eol, ctx, 0))
+		if (!match_expr_eval(opt, x->u.binary.left, bol, eol, ctx, col,
+				     icol, 0))
 			return 0;
-		h = match_expr_eval(x->u.binary.right, bol, eol, ctx, 0);
+		h = match_expr_eval(opt, x->u.binary.right, bol, eol, ctx, col,
+				    icol, 0);
 		break;
 	case GREP_NODE_OR:
 		if (!collect_hits)
-			return (match_expr_eval(x->u.binary.left,
-						bol, eol, ctx, 0) ||
-				match_expr_eval(x->u.binary.right,
-						bol, eol, ctx, 0));
-		h = match_expr_eval(x->u.binary.left, bol, eol, ctx, 0);
+			return (match_expr_eval(opt, x->u.binary.left, bol, eol,
+						ctx, col, icol, 0) ||
+				match_expr_eval(opt, x->u.binary.right, bol,
+						eol, ctx, col, icol, 0));
+		h = match_expr_eval(opt, x->u.binary.left, bol, eol, ctx, col,
+				    icol, 0);
 		x->u.binary.left->hit |= h;
-		h |= match_expr_eval(x->u.binary.right, bol, eol, ctx, 1);
+		h |= match_expr_eval(opt, x->u.binary.right, bol, eol, ctx, col,
+				     icol, 1);
 		break;
 	default:
 		die("Unexpected node type (internal error) %d", x->node);
@@ -1290,25 +1304,30 @@ static int match_expr_eval(struct grep_expr *x, char *bol, char *eol,
 }
 
 static int match_expr(struct grep_opt *opt, char *bol, char *eol,
-		      enum grep_context ctx, int collect_hits)
+		      enum grep_context ctx, ssize_t *col,
+		      ssize_t *icol, int collect_hits)
 {
 	struct grep_expr *x = opt->pattern_expression;
-	return match_expr_eval(x, bol, eol, ctx, collect_hits);
+	return match_expr_eval(opt, x, bol, eol, ctx, col, icol, collect_hits);
 }
 
 static int match_line(struct grep_opt *opt, char *bol, char *eol,
+		      ssize_t *col, ssize_t *icol,
 		      enum grep_context ctx, int collect_hits)
 {
 	struct grep_pat *p;
-	regmatch_t match;
 
 	if (opt->extended)
-		return match_expr(opt, bol, eol, ctx, collect_hits);
+		return match_expr(opt, bol, eol, ctx, col, icol,
+				  collect_hits);
 
 	/* we do not call with collect_hits without being extended */
 	for (p = opt->pattern_list; p; p = p->next) {
-		if (match_one_pattern(p, bol, eol, ctx, &match, 0))
+		regmatch_t tmp;
+		if (match_one_pattern(p, bol, eol, ctx, &tmp, 0)) {
+			*col = tmp.rm_so;
 			return 1;
+		}
 	}
 	return 0;
 }
@@ -1763,6 +1782,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 	while (left) {
 		char *eol, ch;
 		int hit;
+		ssize_t col = -1, icol = -1;
 
 		/*
 		 * look_ahead() skips quickly to the line that possibly
@@ -1786,7 +1806,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 		if ((ctx == GREP_CONTEXT_HEAD) && (eol == bol))
 			ctx = GREP_CONTEXT_BODY;
 
-		hit = match_line(opt, bol, eol, ctx, collect_hits);
+		hit = match_line(opt, bol, eol, &col, &icol, ctx, collect_hits);
 		*eol = ch;
 
 		if (collect_hits)
-- 
2.18.0


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 3/7] grep.[ch]: extend grep_opt to allow showing matched column
  2018-06-22 15:49 ` [PATCH v3 " Taylor Blau
  2018-06-22 15:49   ` [PATCH v3 1/7] Documentation/config.txt: camel-case lineNumber for consistency Taylor Blau
  2018-06-22 15:49   ` [PATCH v3 2/7] grep.c: expose {,inverted} match column in match_line() Taylor Blau
@ 2018-06-22 15:49   ` Taylor Blau
  2018-06-22 15:49   ` [PATCH v3 4/7] grep.c: display column number of first match Taylor Blau
                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-22 15:49 UTC (permalink / raw)
  To: git; +Cc: avarab, peff, gitster

To support showing the matched column when calling 'git-grep(1)', teach
'grep_opt' the normal set of options to configure the default behavior
and colorization of this feature.

Now that we have opt->columnnum, use it to disable short-circuiting over
ORs and ANDs so that col and icol are always filled with the earliest
matches on each line. In addition, don't return the first match from
match_line(), for the same reason.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 grep.c | 47 +++++++++++++++++++++++++++++++++++++----------
 grep.h |  2 ++
 2 files changed, 39 insertions(+), 10 deletions(-)

diff --git a/grep.c b/grep.c
index dedfe17f93..c885101017 100644
--- a/grep.c
+++ b/grep.c
@@ -46,6 +46,7 @@ void init_grep_defaults(void)
 	color_set(opt->color_filename, "");
 	color_set(opt->color_function, "");
 	color_set(opt->color_lineno, "");
+	color_set(opt->color_columnno, "");
 	color_set(opt->color_match_context, GIT_COLOR_BOLD_RED);
 	color_set(opt->color_match_selected, GIT_COLOR_BOLD_RED);
 	color_set(opt->color_selected, "");
@@ -155,6 +156,7 @@ void grep_init(struct grep_opt *opt, const char *prefix)
 	opt->extended_regexp_option = def->extended_regexp_option;
 	opt->pattern_type_option = def->pattern_type_option;
 	opt->linenum = def->linenum;
+	opt->columnnum = def->columnnum;
 	opt->max_depth = def->max_depth;
 	opt->pathname = def->pathname;
 	opt->relative = def->relative;
@@ -164,6 +166,7 @@ void grep_init(struct grep_opt *opt, const char *prefix)
 	color_set(opt->color_filename, def->color_filename);
 	color_set(opt->color_function, def->color_function);
 	color_set(opt->color_lineno, def->color_lineno);
+	color_set(opt->color_columnno, def->color_columnno);
 	color_set(opt->color_match_context, def->color_match_context);
 	color_set(opt->color_match_selected, def->color_match_selected);
 	color_set(opt->color_selected, def->color_selected);
@@ -1277,23 +1280,36 @@ static int match_expr_eval(struct grep_opt *opt, struct grep_expr *x, char *bol,
 				     0);
 		break;
 	case GREP_NODE_AND:
-		if (!match_expr_eval(opt, x->u.binary.left, bol, eol, ctx, col,
-				     icol, 0))
-			return 0;
-		h = match_expr_eval(opt, x->u.binary.right, bol, eol, ctx, col,
+		h = match_expr_eval(opt, x->u.binary.left, bol, eol, ctx, col,
 				    icol, 0);
+		if (h || opt->columnnum) {
+			/*
+			 * Don't short-circuit AND when given --column, since a
+			 * NOT earlier in the tree may turn this into an OR. In
+			 * this case, see the below comment.
+			 */
+			h &= match_expr_eval(opt, x->u.binary.right, bol, eol,
+					     ctx, col, icol, 0);
+		}
 		break;
 	case GREP_NODE_OR:
-		if (!collect_hits)
+		if (!(collect_hits || opt->columnnum)) {
+			/*
+			 * Don't short-circuit OR when given --column (or
+			 * collecting hits) to ensure we don't skip a later
+			 * child that would produce an earlier match.
+			 */
 			return (match_expr_eval(opt, x->u.binary.left, bol, eol,
 						ctx, col, icol, 0) ||
 				match_expr_eval(opt, x->u.binary.right, bol,
 						eol, ctx, col, icol, 0));
+		}
 		h = match_expr_eval(opt, x->u.binary.left, bol, eol, ctx, col,
 				    icol, 0);
-		x->u.binary.left->hit |= h;
+		if (collect_hits)
+			x->u.binary.left->hit |= h;
 		h |= match_expr_eval(opt, x->u.binary.right, bol, eol, ctx, col,
-				     icol, 1);
+				     icol, collect_hits);
 		break;
 	default:
 		die("Unexpected node type (internal error) %d", x->node);
@@ -1316,6 +1332,7 @@ static int match_line(struct grep_opt *opt, char *bol, char *eol,
 		      enum grep_context ctx, int collect_hits)
 {
 	struct grep_pat *p;
+	int hit = 0;
 
 	if (opt->extended)
 		return match_expr(opt, bol, eol, ctx, col, icol,
@@ -1325,11 +1342,21 @@ static int match_line(struct grep_opt *opt, char *bol, char *eol,
 	for (p = opt->pattern_list; p; p = p->next) {
 		regmatch_t tmp;
 		if (match_one_pattern(p, bol, eol, ctx, &tmp, 0)) {
-			*col = tmp.rm_so;
-			return 1;
+			hit |= 1;
+			if (!opt->columnnum) {
+				/*
+				 * Without --column, any single match on a line
+				 * is enough to know that it needs to be
+				 * printed. With --column, scan _all_ patterns
+				 * to find the earliest.
+				 */
+				break;
+			}
+			if (*col < 0 || tmp.rm_so < *col)
+				*col = tmp.rm_so;
 		}
 	}
-	return 0;
+	return hit;
 }
 
 static int match_next_pattern(struct grep_pat *p, char *bol, char *eol,
diff --git a/grep.h b/grep.h
index 399381c908..08a0b391c5 100644
--- a/grep.h
+++ b/grep.h
@@ -127,6 +127,7 @@ struct grep_opt {
 	int prefix_length;
 	regex_t regexp;
 	int linenum;
+	int columnnum;
 	int invert;
 	int ignore_case;
 	int status_only;
@@ -159,6 +160,7 @@ struct grep_opt {
 	char color_filename[COLOR_MAXLEN];
 	char color_function[COLOR_MAXLEN];
 	char color_lineno[COLOR_MAXLEN];
+	char color_columnno[COLOR_MAXLEN];
 	char color_match_context[COLOR_MAXLEN];
 	char color_match_selected[COLOR_MAXLEN];
 	char color_selected[COLOR_MAXLEN];
-- 
2.18.0


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 4/7] grep.c: display column number of first match
  2018-06-22 15:49 ` [PATCH v3 " Taylor Blau
                     ` (2 preceding siblings ...)
  2018-06-22 15:49   ` [PATCH v3 3/7] grep.[ch]: extend grep_opt to allow showing matched column Taylor Blau
@ 2018-06-22 15:49   ` Taylor Blau
  2018-06-22 15:49   ` [PATCH v3 5/7] builtin/grep.c: add '--column' option to 'git-grep(1)' Taylor Blau
                     ` (3 subsequent siblings)
  7 siblings, 0 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-22 15:49 UTC (permalink / raw)
  To: git; +Cc: avarab, peff, gitster

To prepare for 'git grep' learning '--column', teach grep.c's
show_line() how to show the column of the first match on non-context
lines.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 grep.c | 33 ++++++++++++++++++++++++++++-----
 1 file changed, 28 insertions(+), 5 deletions(-)

diff --git a/grep.c b/grep.c
index c885101017..83fe32a6a0 100644
--- a/grep.c
+++ b/grep.c
@@ -1405,7 +1405,7 @@ static int next_match(struct grep_opt *opt, char *bol, char *eol,
 }
 
 static void show_line(struct grep_opt *opt, char *bol, char *eol,
-		      const char *name, unsigned lno, char sign)
+		      const char *name, unsigned lno, ssize_t cno, char sign)
 {
 	int rest = eol - bol;
 	const char *match_color, *line_color = NULL;
@@ -1440,6 +1440,17 @@ static void show_line(struct grep_opt *opt, char *bol, char *eol,
 		output_color(opt, buf, strlen(buf), opt->color_lineno);
 		output_sep(opt, sign);
 	}
+	/*
+	 * Treat 'cno' as the 1-indexed offset from the start of a non-context
+	 * line to its first match. Otherwise, 'cno' is 0 indicating that we are
+	 * being called with a context line.
+	 */
+	if (opt->columnnum && cno) {
+		char buf[32];
+		xsnprintf(buf, sizeof(buf), "%"PRIuMAX, (uintmax_t)cno);
+		output_color(opt, buf, strlen(buf), opt->color_columnno);
+		output_sep(opt, sign);
+	}
 	if (opt->color) {
 		regmatch_t match;
 		enum grep_context ctx = GREP_CONTEXT_BODY;
@@ -1545,7 +1556,7 @@ static void show_funcname_line(struct grep_opt *opt, struct grep_source *gs,
 			break;
 
 		if (match_funcname(opt, gs, bol, eol)) {
-			show_line(opt, bol, eol, gs->name, lno, '=');
+			show_line(opt, bol, eol, gs->name, lno, 0, '=');
 			break;
 		}
 	}
@@ -1610,7 +1621,7 @@ static void show_pre_context(struct grep_opt *opt, struct grep_source *gs,
 
 		while (*eol != '\n')
 			eol++;
-		show_line(opt, bol, eol, gs->name, cur, sign);
+		show_line(opt, bol, eol, gs->name, cur, 0, sign);
 		bol = eol + 1;
 		cur++;
 	}
@@ -1809,6 +1820,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 	while (left) {
 		char *eol, ch;
 		int hit;
+		ssize_t cno;
 		ssize_t col = -1, icol = -1;
 
 		/*
@@ -1874,7 +1886,18 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 				show_pre_context(opt, gs, bol, eol, lno);
 			else if (opt->funcname)
 				show_funcname_line(opt, gs, bol, lno);
-			show_line(opt, bol, eol, gs->name, lno, ':');
+			cno = opt->invert ? icol : col;
+			if (cno < 0) {
+				/*
+				 * A negative cno indicates that there was no
+				 * match on the line. We are thus inverted and
+				 * being asked to show all lines that _don't_
+				 * match a given expression. Therefore, set cno
+				 * to 0 to suggest the whole line matches.
+				 */
+				cno = 0;
+			}
+			show_line(opt, bol, eol, gs->name, lno, cno + 1, ':');
 			last_hit = lno;
 			if (opt->funcbody)
 				show_function = 1;
@@ -1903,7 +1926,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 			/* If the last hit is within the post context,
 			 * we need to show this line.
 			 */
-			show_line(opt, bol, eol, gs->name, lno, '-');
+			show_line(opt, bol, eol, gs->name, lno, col + 1, '-');
 		}
 
 	next_line:
-- 
2.18.0


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 5/7] builtin/grep.c: add '--column' option to 'git-grep(1)'
  2018-06-22 15:49 ` [PATCH v3 " Taylor Blau
                     ` (3 preceding siblings ...)
  2018-06-22 15:49   ` [PATCH v3 4/7] grep.c: display column number of first match Taylor Blau
@ 2018-06-22 15:49   ` Taylor Blau
  2018-06-22 15:49   ` [PATCH v3 6/7] grep.c: add configuration variables to show matched option Taylor Blau
                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-22 15:49 UTC (permalink / raw)
  To: git; +Cc: avarab, peff, gitster

Teach 'git-grep(1)' a new option, '--column', to show the column
number of the first match on a non-context line. This makes it possible
to teach 'contrib/git-jump/git-jump' how to seek to the first matching
position of a grep match in your editor, and allows similar additional
scripting capabilities.

For example:

  $ git grep -n --column foo | head -n3
  .clang-format:51:14:# myFunction(foo, bar, baz);
  .clang-format:64:7:# int foo();
  .clang-format:75:8:# void foo()

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/git-grep.txt |  6 ++-
 builtin/grep.c             |  1 +
 t/t7810-grep.sh            | 95 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 101 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 312409a607..31dc0392a6 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -13,7 +13,7 @@ SYNOPSIS
 	   [-v | --invert-match] [-h|-H] [--full-name]
 	   [-E | --extended-regexp] [-G | --basic-regexp]
 	   [-P | --perl-regexp]
-	   [-F | --fixed-strings] [-n | --line-number]
+	   [-F | --fixed-strings] [-n | --line-number] [--column]
 	   [-l | --files-with-matches] [-L | --files-without-match]
 	   [(-O | --open-files-in-pager) [<pager>]]
 	   [-z | --null]
@@ -169,6 +169,10 @@ providing this option will cause it to die.
 --line-number::
 	Prefix the line number to matching lines.
 
+--column::
+	Prefix the 1-indexed byte-offset of the first match from the start of the
+	matching line.
+
 -l::
 --files-with-matches::
 --name-only::
diff --git a/builtin/grep.c b/builtin/grep.c
index ee753a403e..61bcaf6e58 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -828,6 +828,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			    GREP_PATTERN_TYPE_PCRE),
 		OPT_GROUP(""),
 		OPT_BOOL('n', "line-number", &opt.linenum, N_("show line numbers")),
+		OPT_BOOL(0, "column", &opt.columnnum, N_("show column number of first match")),
 		OPT_NEGBIT('h', NULL, &opt.pathname, N_("don't show filenames"), 1),
 		OPT_BIT('H', NULL, &opt.pathname, N_("show filenames"), 1),
 		OPT_NEGBIT(0, "full-name", &opt.relative,
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 1797f632a3..9312c8daf5 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -99,6 +99,101 @@ do
 		test_cmp expected actual
 	'
 
+	test_expect_success "grep -w $L (with --column)" '
+		{
+			echo ${HC}file:5:foo mmap bar
+			echo ${HC}file:14:foo_mmap bar mmap
+			echo ${HC}file:5:foo mmap bar_mmap
+			echo ${HC}file:14:foo_mmap bar mmap baz
+		} >expected &&
+		git grep --column -w -e mmap $H >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep -w $L (with --column, extended OR)" '
+		{
+			echo ${HC}file:14:foo_mmap bar mmap
+			echo ${HC}file:19:foo_mmap bar mmap baz
+		} >expected &&
+		git grep --column -w -e mmap$ --or -e baz $H >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep -w $L (with --column, --invert)" '
+		{
+			echo ${HC}file:1:foo mmap bar
+			echo ${HC}file:1:foo_mmap bar
+			echo ${HC}file:1:foo_mmap bar mmap
+			echo ${HC}file:1:foo mmap bar_mmap
+		} >expected &&
+		git grep --column --invert -w -e baz $H -- file >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L (with --column, --invert, extended OR)" '
+		{
+			echo ${HC}hello_world:6:HeLLo_world
+		} >expected &&
+		git grep --column --invert -e ll --or --not -e _ $H -- hello_world \
+			>actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L (with --column, --invert, extended AND)" '
+		{
+			echo ${HC}hello_world:3:Hello world
+			echo ${HC}hello_world:3:Hello_world
+			echo ${HC}hello_world:6:HeLLo_world
+		} >expected &&
+		git grep --column --invert --not -e _ --and --not -e ll $H -- hello_world \
+			>actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L (with --column, double-negation)" '
+		{
+			echo ${HC}file:1:foo_mmap bar mmap baz
+		} >expected &&
+		git grep --column --not \( --not -e foo --or --not -e baz \) $H -- file \
+			>actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep -w $L (with --column, -C)" '
+		{
+			echo ${HC}file:5:foo mmap bar
+			echo ${HC}file-foo_mmap bar
+			echo ${HC}file:14:foo_mmap bar mmap
+			echo ${HC}file:5:foo mmap bar_mmap
+			echo ${HC}file:14:foo_mmap bar mmap baz
+		} >expected &&
+		git grep --column -w -C1 -e mmap $H >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep -w $L (with --line-number, --column)" '
+		{
+			echo ${HC}file:1:5:foo mmap bar
+			echo ${HC}file:3:14:foo_mmap bar mmap
+			echo ${HC}file:4:5:foo mmap bar_mmap
+			echo ${HC}file:5:14:foo_mmap bar mmap baz
+		} >expected &&
+		git grep -n --column -w -e mmap $H >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep -w $L (with non-extended patterns, --column)" '
+		{
+			echo ${HC}file:5:foo mmap bar
+			echo ${HC}file:10:foo_mmap bar
+			echo ${HC}file:10:foo_mmap bar mmap
+			echo ${HC}file:5:foo mmap bar_mmap
+			echo ${HC}file:10:foo_mmap bar mmap baz
+		} >expected &&
+		git grep --column -w -e bar -e mmap $H >actual &&
+		test_cmp expected actual
+	'
+
 	test_expect_success "grep -w $L" '
 		{
 			echo ${HC}file:1:foo mmap bar
-- 
2.18.0


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 6/7] grep.c: add configuration variables to show matched option
  2018-06-22 15:49 ` [PATCH v3 " Taylor Blau
                     ` (4 preceding siblings ...)
  2018-06-22 15:49   ` [PATCH v3 5/7] builtin/grep.c: add '--column' option to 'git-grep(1)' Taylor Blau
@ 2018-06-22 15:49   ` Taylor Blau
  2018-06-22 15:49   ` [PATCH v3 7/7] contrib/git-jump/git-jump: jump to exact location Taylor Blau
  2018-06-25 18:43   ` [PATCH v3 0/7] grep.c: teach --column to 'git-grep(1)' Jeff King
  7 siblings, 0 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-22 15:49 UTC (permalink / raw)
  To: git; +Cc: avarab, peff, gitster

To support git-grep(1)'s new option, '--column', document and teach
grep.c how to interpret relevant configuration options, similar to those
associated with '--line-number'.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/config.txt   | 5 +++++
 Documentation/git-grep.txt | 3 +++
 grep.c                     | 6 ++++++
 3 files changed, 14 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 58fde4daea..e4cbed3078 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1183,6 +1183,8 @@ color.grep.<slot>::
 	function name lines (when using `-p`)
 `lineNumber`;;
 	line number prefix (when using `-n`)
+`column`;;
+	column number prefix (when using `--column`)
 `match`;;
 	matching text (same as setting `matchContext` and `matchSelected`)
 `matchContext`;;
@@ -1797,6 +1799,9 @@ gitweb.snapshot::
 grep.lineNumber::
 	If set to true, enable `-n` option by default.
 
+grep.column::
+	If set to true, enable the `--column` option by default.
+
 grep.patternType::
 	Set the default matching behavior. Using a value of 'basic', 'extended',
 	'fixed', or 'perl' will enable the `--basic-regexp`, `--extended-regexp`,
diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 31dc0392a6..0de3493b80 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -44,6 +44,9 @@ CONFIGURATION
 grep.lineNumber::
 	If set to true, enable `-n` option by default.
 
+grep.column::
+	If set to true, enable the `--column` option by default.
+
 grep.patternType::
 	Set the default matching behavior. Using a value of 'basic', 'extended',
 	'fixed', or 'perl' will enable the `--basic-regexp`, `--extended-regexp`,
diff --git a/grep.c b/grep.c
index 83fe32a6a0..992673fe7e 100644
--- a/grep.c
+++ b/grep.c
@@ -96,6 +96,10 @@ int grep_config(const char *var, const char *value, void *cb)
 		opt->linenum = git_config_bool(var, value);
 		return 0;
 	}
+	if (!strcmp(var, "grep.column")) {
+		opt->columnnum = git_config_bool(var, value);
+		return 0;
+	}
 
 	if (!strcmp(var, "grep.fullname")) {
 		opt->relative = !git_config_bool(var, value);
@@ -112,6 +116,8 @@ int grep_config(const char *var, const char *value, void *cb)
 		color = opt->color_function;
 	else if (!strcmp(var, "color.grep.linenumber"))
 		color = opt->color_lineno;
+	else if (!strcmp(var, "color.grep.column"))
+		color = opt->color_columnno;
 	else if (!strcmp(var, "color.grep.matchcontext"))
 		color = opt->color_match_context;
 	else if (!strcmp(var, "color.grep.matchselected"))
-- 
2.18.0


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 7/7] contrib/git-jump/git-jump: jump to exact location
  2018-06-22 15:49 ` [PATCH v3 " Taylor Blau
                     ` (5 preceding siblings ...)
  2018-06-22 15:49   ` [PATCH v3 6/7] grep.c: add configuration variables to show matched option Taylor Blau
@ 2018-06-22 15:49   ` Taylor Blau
  2018-06-25 18:43   ` [PATCH v3 0/7] grep.c: teach --column to 'git-grep(1)' Jeff King
  7 siblings, 0 replies; 56+ messages in thread
From: Taylor Blau @ 2018-06-22 15:49 UTC (permalink / raw)
  To: git; +Cc: avarab, peff, gitster

Take advantage of 'git-grep(1)''s new option, '--column' in order to
teach Peff's 'git-jump' script how to jump to the correct column for any
given match.

'git-grep(1)''s output is in the correct format for Vim's jump list, so
no additional cleanup is necessary.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 contrib/git-jump/README   | 12 ++++++++++--
 contrib/git-jump/git-jump |  2 +-
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/contrib/git-jump/README b/contrib/git-jump/README
index 4484bda410..2f618a7f97 100644
--- a/contrib/git-jump/README
+++ b/contrib/git-jump/README
@@ -25,6 +25,13 @@ git-jump will feed this to the editor:
 foo.c:2: printf("hello word!\n");
 -----------------------------------
 
+Or, when running 'git jump grep', column numbers will also be emitted,
+e.g. `git jump grep "hello"` would return:
+
+-----------------------------------
+foo.c:2:9: printf("hello word!\n");
+-----------------------------------
+
 Obviously this trivial case isn't that interesting; you could just open
 `foo.c` yourself. But when you have many changes scattered across a
 project, you can use the editor's support to "jump" from point to point.
@@ -35,7 +42,8 @@ Git-jump can generate four types of interesting lists:
 
   2. The beginning of any merge conflict markers.
 
-  3. Any grep matches.
+  3. Any grep matches, including the column of the first match on a
+     line.
 
   4. Any whitespace errors detected by `git diff --check`.
 
@@ -82,7 +90,7 @@ which does something similar to `git jump grep`. However, it is limited
 to positioning the cursor to the correct line in only the first file,
 leaving you to locate subsequent hits in that file or other files using
 the editor or pager. By contrast, git-jump provides the editor with a
-complete list of files and line numbers for each match.
+complete list of files, lines, and a column number for each match.
 
 
 Limitations
diff --git a/contrib/git-jump/git-jump b/contrib/git-jump/git-jump
index 80ab0590bc..931b0fe3a9 100755
--- a/contrib/git-jump/git-jump
+++ b/contrib/git-jump/git-jump
@@ -52,7 +52,7 @@ mode_merge() {
 # editor shows them to us in the status bar.
 mode_grep() {
 	cmd=$(git config jump.grepCmd)
-	test -n "$cmd" || cmd="git grep -n"
+	test -n "$cmd" || cmd="git grep -n --column"
 	$cmd "$@" |
 	perl -pe '
 	s/[ \t]+/ /g;
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-21 12:01     ` Jeff King
@ 2018-06-22 21:45       ` Johannes Schindelin
  2018-06-22 22:26         ` Jeff King
  0 siblings, 1 reply; 56+ messages in thread
From: Johannes Schindelin @ 2018-06-22 21:45 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, git, avarab, gitster

Hi Peff,

On Thu, 21 Jun 2018, Jeff King wrote:

> On Thu, Jun 21, 2018 at 07:53:02AM -0400, Jeff King wrote:
> 
> > > @@ -1429,7 +1447,7 @@ static void show_line(struct grep_opt *opt, char *bol, char *eol,
> > >  	 */
> > >  	if (opt->columnnum && cno) {
> > >  		char buf[32];
> > > -		xsnprintf(buf, sizeof(buf), "%d", cno);
> > > +		xsnprintf(buf, sizeof(buf), "%zu", cno);
> > 
> > Unfortunately %z isn't portable. You have to use:
> > 
> >   xsnprintf(buf, sizeof(buf), "%"PRIuMAX, (uintmax_t)cno);
> 
> To clarify: yes, it is in C99. But traditionally we have not required
> that.
> 
> This might be a candidate for another "weather balloon" patch to see if
> anybody complains, though. The last time time we dealt with this in a
> major way was over 7 years ago in 28bd70d811 (unbreak and eliminate
> NO_C99_FORMAT, 2011-03-16).
> 
> I know Johannes switched out some "%lu" for PRIuMAX as recently as last
> August[1], but I think that is more about the Windows size_t not matching
> "unsigned long", and the decision to use PRIuMAX was to match the
> existing codebase. AFAIK %zu is available on Windows.

Nope, it's not available:

git.c: In function 'cmd_main':
git.c:733:10: error: unknown conversion type character 'z' in format [-Werror=format=]
 die("x: %z", (void *)(intptr_t)0x123456789a);

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-22 21:45       ` Johannes Schindelin
@ 2018-06-22 22:26         ` Jeff King
  0 siblings, 0 replies; 56+ messages in thread
From: Jeff King @ 2018-06-22 22:26 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Taylor Blau, git, avarab, gitster

On Fri, Jun 22, 2018 at 11:45:09PM +0200, Johannes Schindelin wrote:

> > This might be a candidate for another "weather balloon" patch to see if
> > anybody complains, though. The last time time we dealt with this in a
> > major way was over 7 years ago in 28bd70d811 (unbreak and eliminate
> > NO_C99_FORMAT, 2011-03-16).
> > 
> > I know Johannes switched out some "%lu" for PRIuMAX as recently as last
> > August[1], but I think that is more about the Windows size_t not matching
> > "unsigned long", and the decision to use PRIuMAX was to match the
> > existing codebase. AFAIK %zu is available on Windows.
> 
> Nope, it's not available:
> 
> git.c: In function 'cmd_main':
> git.c:733:10: error: unknown conversion type character 'z' in format [-Werror=format=]
>  die("x: %z", (void *)(intptr_t)0x123456789a);

Well, that resolve that, then. :)

Thanks for letting us know before we went down a dead-end.

-Peff

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-22 15:49 ` [PATCH v3 " Taylor Blau
                     ` (6 preceding siblings ...)
  2018-06-22 15:49   ` [PATCH v3 7/7] contrib/git-jump/git-jump: jump to exact location Taylor Blau
@ 2018-06-25 18:43   ` Jeff King
  2018-06-25 18:47     ` Taylor Blau
  7 siblings, 1 reply; 56+ messages in thread
From: Jeff King @ 2018-06-25 18:43 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, avarab, gitster

On Fri, Jun 22, 2018 at 10:49:26AM -0500, Taylor Blau wrote:

> Attached is my third--anticipate the final--re-roll of my series to
> teach 'git grep --column'.

You know when you say that it jinxes it, right? :)

> Since the last time, only a couple of things have changed at Peff's
> suggestions in [1]. The changes are summarized here, and an inter-diff
> is available below:
> 
>   - Change "%zu" to PRIuMAX (and an appropriate cast into uintmax_t). I
>     plan to send a follow-up patch to convert this back to "%zu" to see
>     how people feel about it, but I wanted to keep that out of the
>     present series in order to not hold things up.
> 
>   - Don't short-circuit AND when given --column, since an earlier NOT
>     higher in the tree may cause an AND to be converted into an OR via
>     de Morgan's Law, in which case the problem is reduced to the OR case
>     (and should not have been short-circuited in the first place).
> 
>   - Add a test in t7810 to cover this behavior (i.e., '--not \( -e x
>     --and -e y \)').

Jinxes aside, this interdiff looks good to me.

-Peff

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-25 18:43   ` [PATCH v3 0/7] grep.c: teach --column to 'git-grep(1)' Jeff King
@ 2018-06-25 18:47     ` Taylor Blau
  2018-06-26 16:45       ` Junio C Hamano
  0 siblings, 1 reply; 56+ messages in thread
From: Taylor Blau @ 2018-06-25 18:47 UTC (permalink / raw)
  To: Jeff King; +Cc: git, avarab, gitster

On Mon, Jun 25, 2018 at 02:43:50PM -0400, Jeff King wrote:
> On Fri, Jun 22, 2018 at 10:49:26AM -0500, Taylor Blau wrote:
> > Since the last time, only a couple of things have changed at Peff's
> > suggestions in [1]. The changes are summarized here, and an inter-diff
> > is available below:
> >
> >   - Change "%zu" to PRIuMAX (and an appropriate cast into uintmax_t). I
> >     plan to send a follow-up patch to convert this back to "%zu" to see
> >     how people feel about it, but I wanted to keep that out of the
> >     present series in order to not hold things up.
> >
> >   - Don't short-circuit AND when given --column, since an earlier NOT
> >     higher in the tree may cause an AND to be converted into an OR via
> >     de Morgan's Law, in which case the problem is reduced to the OR case
> >     (and should not have been short-circuited in the first place).
> >
> >   - Add a test in t7810 to cover this behavior (i.e., '--not \( -e x
> >     --and -e y \)').
>
> Jinxes aside, this interdiff looks good to me.

Thanks; I hope that I haven't jinxed anything :-).

I'm going to avoid sending the PRIuMAX -> "%zu" patch, since dscho
points out that it's not available on Windows [1].

Thanks,
Taylor

[1]: https://public-inbox.org/git/nycvar.QRO.7.76.6.1806222344280.11870@tvgsbejvaqbjf.bet/

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 0/7] grep.c: teach --column to 'git-grep(1)'
  2018-06-25 18:47     ` Taylor Blau
@ 2018-06-26 16:45       ` Junio C Hamano
  0 siblings, 0 replies; 56+ messages in thread
From: Junio C Hamano @ 2018-06-26 16:45 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Jeff King, git, avarab

Taylor Blau <me@ttaylorr.com> writes:

> On Mon, Jun 25, 2018 at 02:43:50PM -0400, Jeff King wrote:
>> On Fri, Jun 22, 2018 at 10:49:26AM -0500, Taylor Blau wrote:
>> > Since the last time, only a couple of things have changed at Peff's
>> > suggestions in [1]. The changes are summarized here, and an inter-diff
>> > is available below:
>> >
>> >   - Change "%zu" to PRIuMAX (and an appropriate cast into uintmax_t). I
>> >     plan to send a follow-up patch to convert this back to "%zu" to see
>> >     how people feel about it, but I wanted to keep that out of the
>> >     present series in order to not hold things up.
>> ...
>> Jinxes aside, this interdiff looks good to me.
>
> Thanks; I hope that I haven't jinxed anything :-).
>
> I'm going to avoid sending the PRIuMAX -> "%zu" patch, since dscho
> points out that it's not available on Windows [1].

OK, so what I queued on 'pu' seems to be ready to advance, which is
good.  Keeping topics in flight on 'pu', unable to convince myself
that they are ready to advance to 'next', makes me feel uneasy and
unhappy, and having to worry about one less such topic is a good
news ;-)


^ permalink raw reply	[flat|nested] 56+ messages in thread

end of thread, other threads:[~2018-06-26 16:46 UTC | newest]

Thread overview: 56+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-18 23:43 [PATCH 0/7] grep.c: teach --column to 'git-grep(1)' Taylor Blau
2018-06-18 23:43 ` [PATCH 1/7] Documentation/config.txt: camel-case lineNumber for consistency Taylor Blau
2018-06-18 23:43 ` [PATCH 2/7] grep.c: expose {,inverted} match column in match_line() Taylor Blau
2018-06-19 16:49   ` Junio C Hamano
2018-06-19 17:02     ` Taylor Blau
2018-06-18 23:43 ` [PATCH 3/7] grep.[ch]: extend grep_opt to allow showing matched column Taylor Blau
2018-06-18 23:43 ` [PATCH 4/7] grep.c: display column number of first match Taylor Blau
2018-06-19 16:28   ` Jeff King
2018-06-19 16:34     ` Taylor Blau
2018-06-18 23:43 ` [PATCH 5/7] builtin/grep.c: add '--column' option to 'git-grep(1)' Taylor Blau
2018-06-18 23:43 ` [PATCH 6/7] grep.c: add configuration variables to show matched option Taylor Blau
2018-06-18 23:43 ` [PATCH 7/7] contrib/git-jump/git-jump: jump to exact location Taylor Blau
2018-06-19 16:35 ` [PATCH 0/7] grep.c: teach --column to 'git-grep(1)' Jeff King
2018-06-19 17:33   ` René Scharfe
2018-06-19 17:44     ` Taylor Blau
2018-06-19 17:50       ` René Scharfe
2018-06-19 20:26       ` René Scharfe
2018-06-19 17:48     ` Jeff King
2018-06-19 17:54       ` Taylor Blau
2018-06-19 17:58       ` Junio C Hamano
2018-06-19 18:02         ` Taylor Blau
2018-06-19 18:05         ` Jeff King
2018-06-19 18:09           ` Junio C Hamano
2018-06-19 18:50       ` René Scharfe
2018-06-19 19:11         ` Jeff King
2018-06-19 20:34           ` René Scharfe
2018-06-19 20:51             ` Junio C Hamano
2018-06-19 16:46 ` Junio C Hamano
2018-06-19 17:02   ` Taylor Blau
2018-06-19 22:51 ` Taylor Blau
2018-06-20 20:05 ` [PATCH v2 " Taylor Blau
2018-06-20 20:05   ` [PATCH v2 1/7] Documentation/config.txt: camel-case lineNumber for consistency Taylor Blau
2018-06-20 20:05   ` [PATCH v2 2/7] grep.c: expose {,inverted} match column in match_line() Taylor Blau
2018-06-20 20:05   ` [PATCH v2 3/7] grep.[ch]: extend grep_opt to allow showing matched column Taylor Blau
2018-06-20 20:05   ` [PATCH v2 4/7] grep.c: display column number of first match Taylor Blau
2018-06-20 20:05   ` [PATCH v2 5/7] builtin/grep.c: add '--column' option to 'git-grep(1)' Taylor Blau
2018-06-20 20:05   ` [PATCH v2 6/7] grep.c: add configuration variables to show matched option Taylor Blau
2018-06-20 20:05   ` [PATCH v2 7/7] contrib/git-jump/git-jump: jump to exact location Taylor Blau
2018-06-21 11:53   ` [PATCH v2 0/7] grep.c: teach --column to 'git-grep(1)' Jeff King
2018-06-21 12:01     ` Jeff King
2018-06-22 21:45       ` Johannes Schindelin
2018-06-22 22:26         ` Jeff King
2018-06-21 20:52     ` Junio C Hamano
2018-06-21 21:45     ` Taylor Blau
2018-06-22  7:22       ` Jeff King
2018-06-22 15:49 ` [PATCH v3 " Taylor Blau
2018-06-22 15:49   ` [PATCH v3 1/7] Documentation/config.txt: camel-case lineNumber for consistency Taylor Blau
2018-06-22 15:49   ` [PATCH v3 2/7] grep.c: expose {,inverted} match column in match_line() Taylor Blau
2018-06-22 15:49   ` [PATCH v3 3/7] grep.[ch]: extend grep_opt to allow showing matched column Taylor Blau
2018-06-22 15:49   ` [PATCH v3 4/7] grep.c: display column number of first match Taylor Blau
2018-06-22 15:49   ` [PATCH v3 5/7] builtin/grep.c: add '--column' option to 'git-grep(1)' Taylor Blau
2018-06-22 15:49   ` [PATCH v3 6/7] grep.c: add configuration variables to show matched option Taylor Blau
2018-06-22 15:49   ` [PATCH v3 7/7] contrib/git-jump/git-jump: jump to exact location Taylor Blau
2018-06-25 18:43   ` [PATCH v3 0/7] grep.c: teach --column to 'git-grep(1)' Jeff King
2018-06-25 18:47     ` Taylor Blau
2018-06-26 16:45       ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).