From: Stefan Beller <sbeller@google.com>
To: sbeller@google.com
Cc: git@vger.kernel.org, gitster@pobox.com
Subject: [PATCHv2 4/5] diffcore: add a pickaxe option to find a specific blob
Date: Tue, 2 Jan 2018 16:56:20 -0800 [thread overview]
Message-ID: <20180103005620.202940-1-sbeller@google.com> (raw)
In-Reply-To: <20180103004624.222528-5-sbeller@google.com>
Sometimes users are given a hash of an object and they want to
identify it further (ex.: Use verify-pack to find the largest blobs,
but what are these? or [1])
One might be tempted to extend git-describe to also work with blobs,
such that `git describe <blob-id>` gives a description as
'<commit-ish>:<path>'. This was implemented at [2]; as seen by the sheer
number of responses (>110), it turns out this is tricky to get right.
The hard part to get right is picking the correct 'commit-ish' as that
could be the commit that (re-)introduced the blob or the blob that
removed the blob; the blob could exist in different branches.
Junio hinted at a different approach of solving this problem, which this
patch implements. Teach the diff machinery another flag for restricting
the information to what is shown. For example:
$ ./git log --oneline --find-object=v2.0.0:Makefile
b2feb64309 Revert the whole "ask curl-config" topic for now
47fbfded53 i18n: only extract comments marked with "TRANSLATORS:"
we observe that the Makefile as shipped with 2.0 was appeared in
v1.9.2-471-g47fbfded53 and in v2.0.0-rc1-5-gb2feb6430b. The
reason why these commits both occur prior to v2.0.0 are evil
merges that are not found using this new mechanism.
[1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob
[2] https://public-inbox.org/git/20171028004419.10139-1-sbeller@google.com/
Signed-off-by: Stefan Beller <sbeller@google.com>
---
We lost the tests in the first version, this includes the new file containing
the tests.
Stefan
Documentation/diff-options.txt | 10 +++++++
diff.c | 21 ++++++++++++-
diff.h | 8 ++++-
diffcore-pickaxe.c | 45 +++++++++++++++++-----------
revision.c | 3 ++
t/t4064-diff-oidfind.sh | 68 ++++++++++++++++++++++++++++++++++++++++++
6 files changed, 135 insertions(+), 20 deletions(-)
create mode 100755 t/t4064-diff-oidfind.sh
diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index dd0dba5b1d..f9cf85db05 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -492,6 +492,15 @@ occurrences of that string did not change).
See the 'pickaxe' entry in linkgit:gitdiffcore[7] for more
information.
+--find-object=<object-id>::
+ Look for differences that change the number of occurrences of
+ the specified object. Similar to `-S`, just the argument is different
+ in that it doesn't search for a specific string but for a specific
+ object id.
++
+The object can be a blob or a submodule commit. It implies the `-t` option in
+`git-log` to also find trees.
+
--pickaxe-all::
When `-S` or `-G` finds a change, show all the changes in that
changeset, not just the files that contain the change
@@ -500,6 +509,7 @@ information.
--pickaxe-regex::
Treat the <string> given to `-S` as an extended POSIX regular
expression to match.
+
endif::git-format-patch[]
-O<orderfile>::
diff --git a/diff.c b/diff.c
index 5508745dc8..a872bdcac1 100644
--- a/diff.c
+++ b/diff.c
@@ -4082,6 +4082,7 @@ void diff_setup(struct diff_options *options)
options->interhunkcontext = diff_interhunk_context_default;
options->ws_error_highlight = ws_error_highlight_default;
options->flags.rename_empty = 1;
+ options->objfind = NULL;
/* pathchange left =NULL by default */
options->change = diff_change;
@@ -4487,6 +4488,23 @@ static int parse_ws_error_highlight_opt(struct diff_options *opt, const char *ar
return 1;
}
+static int parse_objfind_opt(struct diff_options *opt, const char *arg)
+{
+ struct object_id oid;
+
+ if (get_oid(arg, &oid))
+ return error("unable to resolve '%s'", arg);
+
+ if (!opt->objfind)
+ opt->objfind = xcalloc(1, sizeof(*opt->objfind));
+
+ opt->pickaxe_opts |= DIFF_PICKAXE_KIND_OBJFIND;
+ opt->flags.recursive = 1;
+ opt->flags.tree_in_recursive = 1;
+ oidset_insert(opt->objfind, &oid);
+ return 1;
+}
+
int diff_opt_parse(struct diff_options *options,
const char **av, int ac, const char *prefix)
{
@@ -4736,7 +4754,8 @@ int diff_opt_parse(struct diff_options *options,
else if ((argcount = short_opt('O', av, &optarg))) {
options->orderfile = prefix_filename(prefix, optarg);
return argcount;
- }
+ } else if (skip_prefix(arg, "--find-object=", &arg))
+ return parse_objfind_opt(options, arg);
else if ((argcount = parse_long_opt("diff-filter", av, &optarg))) {
int offending = parse_diff_filter_opt(optarg, options);
if (offending)
diff --git a/diff.h b/diff.h
index 9ec4f824fe..8a56cac2ad 100644
--- a/diff.h
+++ b/diff.h
@@ -7,6 +7,7 @@
#include "tree-walk.h"
#include "pathspec.h"
#include "object.h"
+#include "oidset.h"
struct rev_info;
struct diff_options;
@@ -173,6 +174,8 @@ struct diff_options {
enum diff_words_type word_diff;
enum diff_submodule_format submodule_format;
+ struct oidset *objfind;
+
/* this is set by diffcore for DIFF_FORMAT_PATCH */
int found_changes;
@@ -325,8 +328,11 @@ extern void diff_setup_done(struct diff_options *);
#define DIFF_PICKAXE_KIND_S 4 /* traditional plumbing counter */
#define DIFF_PICKAXE_KIND_G 8 /* grep in the patch */
+#define DIFF_PICKAXE_KIND_OBJFIND 16 /* specific object IDs */
-#define DIFF_PICKAXE_KINDS_MASK (DIFF_PICKAXE_KIND_S | DIFF_PICKAXE_KIND_G)
+#define DIFF_PICKAXE_KINDS_MASK (DIFF_PICKAXE_KIND_S | \
+ DIFF_PICKAXE_KIND_G | \
+ DIFF_PICKAXE_KIND_OBJFIND)
#define DIFF_PICKAXE_IGNORE_CASE 32
diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
index 4b5d88ea46..72bb5a9514 100644
--- a/diffcore-pickaxe.c
+++ b/diffcore-pickaxe.c
@@ -124,13 +124,20 @@ static int pickaxe_match(struct diff_filepair *p, struct diff_options *o,
mmfile_t mf1, mf2;
int ret;
- if (!o->pickaxe[0])
- return 0;
-
/* ignore unmerged */
if (!DIFF_FILE_VALID(p->one) && !DIFF_FILE_VALID(p->two))
return 0;
+ if (o->objfind) {
+ return (DIFF_FILE_VALID(p->one) &&
+ oidset_contains(o->objfind, &p->one->oid)) ||
+ (DIFF_FILE_VALID(p->two) &&
+ oidset_contains(o->objfind, &p->two->oid));
+ }
+
+ if (!o->pickaxe[0])
+ return 0;
+
if (o->flags.allow_textconv) {
textconv_one = get_textconv(p->one);
textconv_two = get_textconv(p->two);
@@ -226,20 +233,22 @@ void diffcore_pickaxe(struct diff_options *o)
cflags |= REG_ICASE;
regcomp_or_die(®ex, needle, cflags);
regexp = ®ex;
- } else if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_CASE &&
- has_non_ascii(needle)) {
- struct strbuf sb = STRBUF_INIT;
- int cflags = REG_NEWLINE | REG_ICASE;
-
- basic_regex_quote_buf(&sb, needle);
- regcomp_or_die(®ex, sb.buf, cflags);
- strbuf_release(&sb);
- regexp = ®ex;
- } else {
- kws = kwsalloc(o->pickaxe_opts & DIFF_PICKAXE_IGNORE_CASE
- ? tolower_trans_tbl : NULL);
- kwsincr(kws, needle, strlen(needle));
- kwsprep(kws);
+ } else if (opts & DIFF_PICKAXE_KIND_S) {
+ if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_CASE &&
+ has_non_ascii(needle)) {
+ struct strbuf sb = STRBUF_INIT;
+ int cflags = REG_NEWLINE | REG_ICASE;
+
+ basic_regex_quote_buf(&sb, needle);
+ regcomp_or_die(®ex, sb.buf, cflags);
+ strbuf_release(&sb);
+ regexp = ®ex;
+ } else {
+ kws = kwsalloc(o->pickaxe_opts & DIFF_PICKAXE_IGNORE_CASE
+ ? tolower_trans_tbl : NULL);
+ kwsincr(kws, needle, strlen(needle));
+ kwsprep(kws);
+ }
}
/* Might want to warn when both S and G are on; I don't care... */
@@ -248,7 +257,7 @@ void diffcore_pickaxe(struct diff_options *o)
if (regexp)
regfree(regexp);
- else
+ if (kws)
kwsfree(kws);
return;
}
diff --git a/revision.c b/revision.c
index 5d11ecaf27..30f65b3bbd 100644
--- a/revision.c
+++ b/revision.c
@@ -2412,6 +2412,9 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
revs->diffopt.flags.follow_renames)
revs->diff = 1;
+ if (revs->diffopt.objfind)
+ revs->simplify_history = 0;
+
if (revs->topo_order)
revs->limited = 1;
diff --git a/t/t4064-diff-oidfind.sh b/t/t4064-diff-oidfind.sh
new file mode 100755
index 0000000000..3bdf317af8
--- /dev/null
+++ b/t/t4064-diff-oidfind.sh
@@ -0,0 +1,68 @@
+#!/bin/sh
+
+test_description='test finding specific blobs in the revision walking'
+. ./test-lib.sh
+
+test_expect_success 'setup ' '
+ git commit --allow-empty -m "empty initial commit" &&
+
+ echo "Hello, world!" >greeting &&
+ git add greeting &&
+ git commit -m "add the greeting blob" && # borrowed from Git from the Bottom Up
+ git tag -m "the blob" greeting $(git rev-parse HEAD:greeting) &&
+
+ echo asdf >unrelated &&
+ git add unrelated &&
+ git commit -m "unrelated history" &&
+
+ git revert HEAD^ &&
+
+ git commit --allow-empty -m "another unrelated commit"
+'
+
+test_expect_success 'find the greeting blob' '
+ cat >expect <<-EOF &&
+ Revert "add the greeting blob"
+ add the greeting blob
+ EOF
+
+ git log --format=%s --find-object=greeting^{blob} >actual &&
+
+ test_cmp expect actual
+'
+
+test_expect_success 'setup a tree' '
+ mkdir a &&
+ echo asdf >a/file &&
+ git add a/file &&
+ git commit -m "add a file in a subdirectory"
+'
+
+test_expect_success 'find a tree' '
+ cat >expect <<-EOF &&
+ add a file in a subdirectory
+ EOF
+
+ git log --format=%s -t --find-object=HEAD:a >actual &&
+
+ test_cmp expect actual
+'
+
+test_expect_success 'setup a submodule' '
+ test_create_repo sub &&
+ test_commit -C sub sub &&
+ git submodule add ./sub sub &&
+ git commit -a -m "add sub"
+'
+
+test_expect_success 'find a submodule' '
+ cat >expect <<-EOF &&
+ add sub
+ EOF
+
+ git log --format=%s --find-object=HEAD:sub >actual &&
+
+ test_cmp expect actual
+'
+
+test_done
--
2.15.1.620.gb9897f4670-goog
next prev parent reply other threads:[~2018-01-03 0:56 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-01-03 0:46 [PATCH 0/5] pickaxe refactorings and a new mode to find blobs (WAS: diffcore: add a pickaxe option to find a specific blob) Stefan Beller
2018-01-03 0:46 ` [PATCH 1/5] diff.h: Make pickaxe_opts an unsigned bit field Stefan Beller
2018-01-03 21:49 ` Junio C Hamano
2018-01-03 0:46 ` [PATCH 2/5] diff: migrate diff_flags.pickaxe_ignore_case to a pickaxe_opts bit Stefan Beller
2018-01-03 0:46 ` [PATCH 3/5] diff: introduce DIFF_PICKAXE_KINDS_MASK Stefan Beller
2018-01-03 22:01 ` Junio C Hamano
2018-01-03 0:46 ` [PATCH 4/5] diffcore: add a pickaxe option to find a specific blob Stefan Beller
2018-01-03 0:56 ` Stefan Beller [this message]
2018-01-03 0:46 ` [PATCH 5/5] diff: properly error out when combining multiple pickaxe options Stefan Beller
2018-01-03 0:58 ` Jacob Keller
2018-01-03 22:08 ` Junio C Hamano
2018-01-03 22:18 ` Stefan Beller
2018-01-04 22:50 ` [PATCHv2 0/6] pickaxe refactorings and a new mode to find blobs (WAS: diffcore: add a pickaxe option to find a specific blob) Stefan Beller
2018-01-04 22:50 ` [PATCHv2 1/6] diff.h: Make pickaxe_opts an unsigned bit field Stefan Beller
2018-01-04 22:50 ` [PATCHv2 2/6] diff: migrate diff_flags.pickaxe_ignore_case to a pickaxe_opts bit Stefan Beller
2018-01-04 22:50 ` [PATCHv2 3/6] diff: introduce DIFF_PICKAXE_KINDS_MASK Stefan Beller
2018-01-04 22:50 ` [PATCHv2 4/6] diffcore: add a pickaxe option to find a specific blob Stefan Beller
2018-01-04 22:50 ` [PATCHv2 5/6] diff: properly error out when combining multiple pickaxe options Stefan Beller
2018-01-04 22:50 ` [PATCHv2 6/6] diff: use HAS_MULTI_BITS instead of counting bits manually Stefan Beller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180103005620.202940-1-sbeller@google.com \
--to=sbeller@google.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).