git@vger.kernel.org mailing list mirror (one of many)
* [PATCH 0/1] Teaching the diff machinery about blobfind [WAS: git describe <blob>]
@ 2017-11-20 22:25 Stefan Beller
  2017-11-20 22:25 ` [PATCH 1/1] diffcore: add a filter to find a specific blob Stefan Beller
  0 siblings, 1 reply; 5+ messages in thread
From: Stefan Beller @ 2017-11-20 22:25 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

This seems to be an easier approach; thanks Junio for hinting at it.
This certainly solves our immediate needs; we may want to build
'git describe <blob>' on top of it or defer that until later.

Thanks,
Stefan

Previous discussion:
https://public-inbox.org/git/20171028004419.10139-1-sbeller@google.com/

Stefan Beller (1):
  diffcore: add a filter to find a specific blob

 Documentation/diff-options.txt |  4 ++++
 Makefile                       |  1 +
 builtin/log.c                  |  2 +-
 diff.c                         | 20 ++++++++++++++++-
 diff.h                         |  3 +++
 diffcore-blobfind.c            | 51 ++++++++++++++++++++++++++++++++++++++++++
 diffcore.h                     |  1 +
 revision.c                     |  3 ++-
 t/t4064-diff-blobfind.sh       | 35 +++++++++++++++++++++++++++++
 9 files changed, 117 insertions(+), 3 deletions(-)
 create mode 100644 diffcore-blobfind.c
 create mode 100755 t/t4064-diff-blobfind.sh

-- 
2.15.0.128.gcadd42da22



* [PATCH 1/1] diffcore: add a filter to find a specific blob
  2017-11-20 22:25 [PATCH 0/1] Teaching the diff machinery about blobfind [WAS: git describe <blob>] Stefan Beller
@ 2017-11-20 22:25 ` Stefan Beller
  2017-11-24  7:43   ` Junio C Hamano
  0 siblings, 1 reply; 5+ messages in thread
From: Stefan Beller @ 2017-11-20 22:25 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

Sometimes users are given a hash of an object and they want to
identify it further (ex.: Use verify-pack to find the largest blobs,
but what are these? or [1])

One might be tempted to extend git-describe to also work with blobs,
such that `git describe <blob-id>` gives a description as
'<commit-ish>:<path>'.  This was implemented at [2]; as seen by the sheer
number of responses (>110), it turns out this is tricky to get right.
The hard part is picking the correct 'commit-ish', as that
could be the commit that (re-)introduced the blob or the commit that
removed the blob; the blob could also exist in different branches.

Junio hinted at a different approach to solving this problem, which this
patch implements. Teach the diff machinery another flag for restricting
the information that is shown. For example:

  $ ./git log --oneline --blobfind=v2.0.0:Makefile
  b2feb64309 Revert the whole "ask curl-config" topic for now
  47fbfded53 i18n: only extract comments marked with "TRANSLATORS:"

we observe that the Makefile as shipped with 2.0 was introduced in
v1.9.2-471-g47fbfded53 and replaced in v2.0.0-rc1-5-gb2feb64309 by
a different blob.

[1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob
[2] https://public-inbox.org/git/20171028004419.10139-1-sbeller@google.com/

Signed-off-by: Stefan Beller <sbeller@google.com>
---

On playing around with this, trying to find more interesting cases, I observed:

    git log --oneline --blobfind=HEAD:COPYING
    703601d678 Update COPYING with GPLv2 with new FSF address
    
    git log --oneline --blobfind=703601d678^:COPYING
    459b8d22e5 tests: do not borrow from COPYING and README from the real source
    703601d678 Update COPYING with GPLv2 with new FSF address
    075b845a85 Add a COPYING notice, making it explicit that the license is GPLv2.

    t/diff-lib/COPYING may need an update of the address of the FSF,
    #leftoverbits I guess.
    
Another interesting case that I found was
   git log --oneline --blobfind=v2.14.0:Makefile
   3921a0b3c3 perf: add test for writing the index
   36f048c5e4 sha1dc: build git plumbing code more explicitly
   2118805b92 Makefile: add style build rule

all of which were after v2.14, such that the introduction of that blob doesn't
show up; I suspect it came in via a merge as unrelated series may have updated
the Makefile in parallel, though git-log should have told me?

Thanks,
Stefan

 Documentation/diff-options.txt |  4 ++++
 Makefile                       |  1 +
 builtin/log.c                  |  2 +-
 diff.c                         | 20 ++++++++++++++++-
 diff.h                         |  3 +++
 diffcore-blobfind.c            | 51 ++++++++++++++++++++++++++++++++++++++++++
 diffcore.h                     |  1 +
 revision.c                     |  3 ++-
 t/t4064-diff-blobfind.sh       | 35 +++++++++++++++++++++++++++++
 9 files changed, 117 insertions(+), 3 deletions(-)
 create mode 100644 diffcore-blobfind.c
 create mode 100755 t/t4064-diff-blobfind.sh

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index dd0dba5b1d..252a21cc19 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -500,6 +500,10 @@ information.
 --pickaxe-regex::
 	Treat the <string> given to `-S` as an extended POSIX regular
 	expression to match.
+--blobfind=<blob-id>::
+	Restrict the output such that one side of the diff
+	matches the given blob-id.
+
 endif::git-format-patch[]
 
 -O<orderfile>::
diff --git a/Makefile b/Makefile
index ee9d5eb11e..fdfa8f38f6 100644
--- a/Makefile
+++ b/Makefile
@@ -775,6 +775,7 @@ LIB_OBJS += date.o
 LIB_OBJS += decorate.o
 LIB_OBJS += diffcore-break.o
 LIB_OBJS += diffcore-delta.o
+LIB_OBJS += diffcore-blobfind.o
 LIB_OBJS += diffcore-order.o
 LIB_OBJS += diffcore-pickaxe.o
 LIB_OBJS += diffcore-rename.o
diff --git a/builtin/log.c b/builtin/log.c
index 6c1fa896ad..7b91f61423 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -181,7 +181,7 @@ static void cmd_log_init_finish(int argc, const char **argv, const char *prefix,
 		init_display_notes(&rev->notes_opt);
 
 	if (rev->diffopt.pickaxe || rev->diffopt.filter ||
-	    rev->diffopt.flags.follow_renames)
+	    rev->diffopt.flags.follow_renames || rev->diffopt.blobfind)
 		rev->always_show_header = 0;
 
 	if (source)
diff --git a/diff.c b/diff.c
index 0763e89263..8861f89ab1 100644
--- a/diff.c
+++ b/diff.c
@@ -4082,6 +4082,7 @@ void diff_setup(struct diff_options *options)
 	options->interhunkcontext = diff_interhunk_context_default;
 	options->ws_error_highlight = ws_error_highlight_default;
 	options->flags.rename_empty = 1;
+	options->blobfind = NULL;
 
 	/* pathchange left =NULL by default */
 	options->change = diff_change;
@@ -4487,6 +4488,19 @@ static int parse_ws_error_highlight_opt(struct diff_options *opt, const char *ar
 	return 1;
 }
 
+static int parse_blobfind_opt(struct diff_options *opt, const char *arg)
+{
+	struct object_id oid;
+
+	if (get_oid_blob(arg, &oid) || sha1_object_info(oid.hash, NULL) != OBJ_BLOB)
+		return error("object '%s' is not a blob", arg);
+
+	if (!opt->blobfind)
+		opt->blobfind = xcalloc(1, sizeof(*opt->blobfind));
+	oidset_insert(opt->blobfind, &oid);
+	return 1;
+}
+
 int diff_opt_parse(struct diff_options *options,
 		   const char **av, int ac, const char *prefix)
 {
@@ -4736,7 +4750,8 @@ int diff_opt_parse(struct diff_options *options,
 	else if ((argcount = short_opt('O', av, &optarg))) {
 		options->orderfile = prefix_filename(prefix, optarg);
 		return argcount;
-	}
+	} else if (skip_prefix(arg, "--blobfind=", &arg))
+		return parse_blobfind_opt(options, arg);
 	else if ((argcount = parse_long_opt("diff-filter", av, &optarg))) {
 		int offending = parse_diff_filter_opt(optarg, options);
 		if (offending)
@@ -5770,6 +5785,9 @@ void diffcore_std(struct diff_options *options)
 		diffcore_skip_stat_unmatch(options);
 	if (!options->found_follow) {
 		/* See try_to_follow_renames() in tree-diff.c */
+
+		if (options->blobfind)
+			diffcore_blobfind(options);
 		if (options->break_opt != -1)
 			diffcore_break(options->break_opt);
 		if (options->detect_rename)
diff --git a/diff.h b/diff.h
index 0fb18dd735..9178e498fa 100644
--- a/diff.h
+++ b/diff.h
@@ -7,6 +7,7 @@
 #include "tree-walk.h"
 #include "pathspec.h"
 #include "object.h"
+#include "oidset.h"
 
 struct rev_info;
 struct diff_options;
@@ -174,6 +175,8 @@ struct diff_options {
 	enum diff_words_type word_diff;
 	enum diff_submodule_format submodule_format;
 
+	struct oidset *blobfind;
+
 	/* this is set by diffcore for DIFF_FORMAT_PATCH */
 	int found_changes;
 
diff --git a/diffcore-blobfind.c b/diffcore-blobfind.c
new file mode 100644
index 0000000000..5d222fc336
--- /dev/null
+++ b/diffcore-blobfind.c
@@ -0,0 +1,51 @@
+/*
+ * Copyright (c) 2017 Google Inc.
+ */
+#include "cache.h"
+#include "diff.h"
+#include "diffcore.h"
+
+static void diffcore_filter_blobs(struct diff_queue_struct *q,
+				  struct diff_options *options)
+{
+	int i, j = 0, c = q->nr;
+
+	if (!options->blobfind)
+		BUG("blobfind oidset not initialized???");
+
+	for (i = 0; i < q->nr; i++) {
+		struct diff_filepair *p = q->queue[i];
+
+		if (DIFF_PAIR_UNMERGED(p) ||
+		    (DIFF_FILE_VALID(p->one) &&
+		     oidset_contains(options->blobfind, &p->one->oid)) ||
+		    (DIFF_FILE_VALID(p->two) &&
+		     oidset_contains(options->blobfind, &p->two->oid)))
+			continue;
+
+		diff_free_filepair(p);
+		q->queue[i] = NULL;
+		c--;
+	}
+
+	/* Keep it sorted. */
+	i = 0; j = 0;
+	while (i < c) {
+		while (!q->queue[j])
+			j++;
+		q->queue[i] = q->queue[j];
+		i++; j++;
+	}
+
+	q->nr = c;
+
+	if (!c) {
+		free(q->queue);
+		DIFF_QUEUE_CLEAR(q);
+	}
+}
+
+void diffcore_blobfind(struct diff_options *options)
+{
+	diffcore_filter_blobs(&diff_queued_diff, options);
+}
diff --git a/diffcore.h b/diffcore.h
index a30da161da..431917672f 100644
--- a/diffcore.h
+++ b/diffcore.h
@@ -107,6 +107,7 @@ extern struct diff_filepair *diff_queue(struct diff_queue_struct *,
 					struct diff_filespec *);
 extern void diff_q(struct diff_queue_struct *, struct diff_filepair *);
 
+extern void diffcore_blobfind(struct diff_options *);
 extern void diffcore_break(int);
 extern void diffcore_rename(struct diff_options *);
 extern void diffcore_merge_broken(void);
diff --git a/revision.c b/revision.c
index e2e691dd5a..6449619c0a 100644
--- a/revision.c
+++ b/revision.c
@@ -2409,7 +2409,8 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 	/* Pickaxe, diff-filter and rename following need diffs */
 	if (revs->diffopt.pickaxe ||
 	    revs->diffopt.filter ||
-	    revs->diffopt.flags.follow_renames)
+	    revs->diffopt.flags.follow_renames ||
+	    revs->diffopt.blobfind)
 		revs->diff = 1;
 
 	if (revs->topo_order)
diff --git a/t/t4064-diff-blobfind.sh b/t/t4064-diff-blobfind.sh
new file mode 100755
index 0000000000..b2c2964d77
--- /dev/null
+++ b/t/t4064-diff-blobfind.sh
@@ -0,0 +1,35 @@
+#!/bin/sh
+
+test_description='test finding specific blobs in the revision walking'
+. ./test-lib.sh
+
+test_expect_success 'setup ' '
+	git commit --allow-empty -m "empty initial commit" &&
+
+	echo "Hello, world!" >greeting &&
+	git add greeting &&
+	git commit -m "add the greeting blob" && # borrowed from Git from the Bottom Up
+	git tag -m "the blob" greeting $(git rev-parse HEAD:greeting) &&
+
+	echo asdf >unrelated &&
+	git add unrelated &&
+	git commit -m "unrelated history" &&
+
+	git revert HEAD^ &&
+
+	git commit --allow-empty -m "another unrelated commit"
+'
+
+test_expect_success 'find the greeting blob' '
+	cat >expect <<-EOF &&
+	Revert "add the greeting blob"
+	add the greeting blob
+	EOF
+
+	git log --abbrev=12 --oneline --blobfind=greeting^{blob} >actual.raw &&
+	cut -c 14- actual.raw >actual &&
+
+	test_cmp expect actual
+'
+
+test_done
-- 
2.15.0.128.gcadd42da22



* Re: [PATCH 1/1] diffcore: add a filter to find a specific blob
  2017-11-20 22:25 ` [PATCH 1/1] diffcore: add a filter to find a specific blob Stefan Beller
@ 2017-11-24  7:43   ` Junio C Hamano
  2017-11-25  4:59     ` Junio C Hamano
  2017-12-07 21:40     ` Junio C Hamano
  0 siblings, 2 replies; 5+ messages in thread
From: Junio C Hamano @ 2017-11-24  7:43 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

Stefan Beller <sbeller@google.com> writes:

> Sometimes users are given a hash of an object and they want to
> identify it further (ex.: Use verify-pack to find the largest blobs,
> but what are these? or [1])
>
> One might be tempted to extend git-describe to also work with blobs,
> such that `git describe <blob-id>` gives a description as
> '<commit-ish>:<path>'.  This was implemented at [2]; as seen by the sheer
> number of responses (>110), it turns out this is tricky to get right.
> The hard part is picking the correct 'commit-ish', as that
> could be the commit that (re-)introduced the blob or the commit that
> removed the blob; the blob could also exist in different branches.
>
> Junio hinted at a different approach to solving this problem, which this
> patch implements. Teach the diff machinery another flag for restricting
> the information that is shown. For example:
>
>   $ ./git log --oneline --blobfind=v2.0.0:Makefile
>   b2feb64309 Revert the whole "ask curl-config" topic for now
>   47fbfded53 i18n: only extract comments marked with "TRANSLATORS:"
>
> we observe that the Makefile as shipped with 2.0 was introduced in
> v1.9.2-471-g47fbfded53 and replaced in v2.0.0-rc1-5-gb2feb64309 by
> a different blob.
>
> [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob
> [2] https://public-inbox.org/git/20171028004419.10139-1-sbeller@google.com/
>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>
> On playing around with this, trying to find more interesting cases, I observed:
>
>     git log --oneline --blobfind=HEAD:COPYING
>     703601d678 Update COPYING with GPLv2 with new FSF address
>     
>     git log --oneline --blobfind=703601d678^:COPYING
>     459b8d22e5 tests: do not borrow from COPYING and README from the real source
>     703601d678 Update COPYING with GPLv2 with new FSF address
>     075b845a85 Add a COPYING notice, making it explicit that the license is GPLv2.
>
>     t/diff-lib/COPYING may need an update of the address of the FSF,
>     # leftoverbits I guess.

I do not think so.  See tz/fsf-address-update topic for details.

Please do not contaminate the list archive with careless mention of 
"hash-mark plus left over bits", as it will make searching the real
good bits harder.  Thanks.

> Another interesting case that I found was
>    git log --oneline --blobfind=v2.14.0:Makefile
>    3921a0b3c3 perf: add test for writing the index
>    36f048c5e4 sha1dc: build git plumbing code more explicitly
>    2118805b92 Makefile: add style build rule
>
> all of which were after v2.14, such that the introduction of that blob doesn't
> show up; I suspect it came in via a merge as unrelated series may have updated
> the Makefile in parallel, though git-log should have told me?

If that is the case, shouldn't we make this new mode imply
--full-history to forbid history simplification?  "git log" is a
tool to find _an_ explanation of the current state, and the usual
history simplification makes tons of sense there, but blobfind is
run most likely in order to find _all_ mentions of the set of blobs
given.

> diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
> index dd0dba5b1d..252a21cc19 100644
> --- a/Documentation/diff-options.txt
> +++ b/Documentation/diff-options.txt
> @@ -500,6 +500,10 @@ information.
>  --pickaxe-regex::
>  	Treat the <string> given to `-S` as an extended POSIX regular
>  	expression to match.
> +--blobfind=<blob-id>::
> +	Restrict the output such that one side of the diff
> +	matches the given blob-id.
> +
>  endif::git-format-patch[]

Can we have a blank line between these enumerations to make the
source easier to read?  Thanks.

> diff --git a/diffcore-blobfind.c b/diffcore-blobfind.c
> new file mode 100644
> index 0000000000..5d222fc336
> --- /dev/null
> +++ b/diffcore-blobfind.c
> @@ -0,0 +1,51 @@
> +/*
> + * Copyright (c) 2017 Google Inc.
> + */
> +#include "cache.h"
> +#include "diff.h"
> +#include "diffcore.h"
> +
> +static void diffcore_filter_blobs(struct diff_queue_struct *q,
> +				  struct diff_options *options)
> +{
> +	int i, j = 0, c = q->nr;
> +
> +	if (!options->blobfind)
> +		BUG("blobfind oidset not initialized???");
> +
> +	for (i = 0; i < q->nr; i++) {
> +		struct diff_filepair *p = q->queue[i];
> +
> +		if (DIFF_PAIR_UNMERGED(p) ||
> +		    (DIFF_FILE_VALID(p->one) &&
> +		     oidset_contains(options->blobfind, &p->one->oid)) ||
> +		    (DIFF_FILE_VALID(p->two) &&
> +		     oidset_contains(options->blobfind, &p->two->oid)))
> +			continue;

So, we keep an unmerged pair, as well as any pair that mentions a
sought blob on one side or the other?  I am not sure we want to keep
the unmerged pair for the purpose of this one.

> +		diff_free_filepair(p);
> +		q->queue[i] = NULL;
> +		c--;

Also, if you are shrinking the queue and have already introduced
another counter 'j' that is initialized to 0, I think it makes more
sense to do the shrinking in-place in this loop.  'i' will remain
the source-scan index that runs from 0 up to q->nr, while 'j' can be
used in this loop (where you have 'continue') to move the current
entry that is determined to survive from q->queue[i] to q->queue[j++].

Then you do not need 'c'; when the loop ends, 'j' would be the
number of surviving entries and q->nr can be adjusted to it.  Unlike
the usual pattern taken by the other diffcore transformations where
a new queue is populated and the old one discarded, this would leave
the q->queue[] over-allocated, but I do not think it is too bad.


* Re: [PATCH 1/1] diffcore: add a filter to find a specific blob
  2017-11-24  7:43   ` Junio C Hamano
@ 2017-11-25  4:59     ` Junio C Hamano
  2017-12-07 21:40     ` Junio C Hamano
  1 sibling, 0 replies; 5+ messages in thread
From: Junio C Hamano @ 2017-11-25  4:59 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> If that is the case, shouldn't we make this new mode imply
> --full-history to forbid history simplification?  "git log" is a
> tool to find _an_ explanation of the current state, and the usual
> history simplification makes tons of sense there, but blobfind is
> run most likely in order to find _all_ mentions of the set of blobs
> given.

One scenario that I think we may want to be careful about is this:

 ---o---*---*---A*--M*--o---X
     \             /
      o---*---o---B

where commits marked with '*' have the same blob M:Makefile you are
looking for at the same path Makefile, and we start traversal at X
with "git log --blobfind=M:Makefile X" (or even with a pathspec, i.e.
"git log --blobfind=M:Makefile X -- Makefile").

The usual merge simplification rules would say "Ah, M and A are
TREESAME so we do not have to look at the side branch that ends at
B".  If the user is interested in finding all the introductions and
retirements of a specific blob object, we would miss the
transitions around the '*' on that side branch and end up finding
only the transition after the fork point where the blob is
introduced, and the one after M where the blob is retired.

Another interesting case we may want to be careful about is this:

    ---A*--M*--o---X
          /
      ---B*

for the same reason.  The usual merge simplification rules are
designed to come up with _an_ explanation for the state in X,
and because M is TREESAME with both A and B, it would pick just
one (the first parent) while ignoring the other.  Again, that would
not be appropriate if the reason why the user is running the command
is to find all the introductions and retirements of an object.

It may be worth covering these in the tests (I didn't try to see
specifically if the patch has these cases already, as I didn't think
of the issue when I responded---sorry about that).

Thanks.




* Re: [PATCH 1/1] diffcore: add a filter to find a specific blob
  2017-11-24  7:43   ` Junio C Hamano
  2017-11-25  4:59     ` Junio C Hamano
@ 2017-12-07 21:40     ` Junio C Hamano
  1 sibling, 0 replies; 5+ messages in thread
From: Junio C Hamano @ 2017-12-07 21:40 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

After saying "Will merge to 'next'" in the recent "What's cooking"
report, I noticed that a few loose ends were never tied on this
topic.

>> diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
>> index dd0dba5b1d..252a21cc19 100644
>> --- a/Documentation/diff-options.txt
>> +++ b/Documentation/diff-options.txt
>> @@ -500,6 +500,10 @@ information.
>>  --pickaxe-regex::
>>  	Treat the <string> given to `-S` as an extended POSIX regular
>>  	expression to match.
>> +--blobfind=<blob-id>::
>> +	Restrict the output such that one side of the diff
>> +	matches the given blob-id.
>> +
>>  endif::git-format-patch[]
>
> Can we have a blank line between these enumerations to make the
> source easier to read?  Thanks.
>
> ...
> So, we keep an unmerged pair, a pair that mentions a sought-blob on
> one side or the other side?  I am not sure if we want to keep the
> unmerged pair for the purpose of this one.
>
>> +		diff_free_filepair(p);
>> +		q->queue[i] = NULL;
>> +		c--;
>
> Also, if you are doing the in-place shrinking and have already
> introduced another counter 'j' that is initialized to 0, I think it
> makes more sense to do the shrinking in-place.  'i' will stay to be
> the source-scan pointer that runs 0 thru q->nr, while 'j' can be
> used in this loop (where you have 'continue') to move the current
> one that is determined to survive from q->queue[i] to q->queue[j++].
>
> Then you do not need 'c'; when the loop ends, 'j' would be the
> number of surviving entries and q->nr can be adjusted to it.  Unlike
> the usual pattern taken by the other diffcore transformations where
> a new queue is populated and the old one discarded, this would leave
> the q->queue[] over-allocated, but I do not think it is too bad.

Here is a patch to illustrate the last point.  I still think we should
not keep the unmerged entries for the purpose of blobfind, but it should
be trivial to fix that.

 diffcore-blobfind.c | 33 ++++++++++++---------------------
 1 file changed, 12 insertions(+), 21 deletions(-)

diff --git a/diffcore-blobfind.c b/diffcore-blobfind.c
index 5d222fc336..bf63ba61dc 100644
--- a/diffcore-blobfind.c
+++ b/diffcore-blobfind.c
@@ -8,40 +8,31 @@
 static void diffcore_filter_blobs(struct diff_queue_struct *q,
 				  struct diff_options *options)
 {
-	int i, j = 0, c = q->nr;
+	int src, dst;
 
 	if (!options->blobfind)
 		BUG("blobfind oidset not initialized???");
 
-	for (i = 0; i < q->nr; i++) {
-		struct diff_filepair *p = q->queue[i];
+	for (src = dst = 0; src < q->nr; src++) {
+		struct diff_filepair *p = q->queue[src];
 
 		if (DIFF_PAIR_UNMERGED(p) ||
 		    (DIFF_FILE_VALID(p->one) &&
 		     oidset_contains(options->blobfind, &p->one->oid)) ||
 		    (DIFF_FILE_VALID(p->two) &&
-		     oidset_contains(options->blobfind, &p->two->oid)))
-			continue;
-
-		diff_free_filepair(p);
-		q->queue[i] = NULL;
-		c--;
-	}
-
-	/* Keep it sorted. */
-	i = 0; j = 0;
-	while (i < c) {
-		while (!q->queue[j])
-			j++;
-		q->queue[i] = q->queue[j];
-		i++; j++;
+		     oidset_contains(options->blobfind, &p->two->oid))) {
+			q->queue[dst] = p;
+			dst++;
+		} else {
+			diff_free_filepair(p);
+		}
 	}
 
-	q->nr = c;
-
-	if (!c) {
+	if (!dst) {
 		free(q->queue);
 		DIFF_QUEUE_CLEAR(q);
+	} else {
+		q->nr = dst;
 	}
 }
 

