git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: Junio C Hamano <gitster@pobox.com>
Cc: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
	"Stefan Beller" <sbeller@google.com>, "Jeff King" <peff@peff.net>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"David Turner" <novalis@novalis.org>,
	git@vger.kernel.org, "Michael Haggerty" <mhagger@alum.mit.edu>
Subject: [PATCH 22/23] ref-filter: limit traversal to prefix
Date: Wed, 17 May 2017 14:05:45 +0200	[thread overview]
Message-ID: <1f50142ccbb84a4e5d7a1cd67e6f9d30edc1d73a.1495014840.git.mhagger@alum.mit.edu> (raw)
In-Reply-To: <cover.1495014840.git.mhagger@alum.mit.edu>

From: Jeff King <peff@peff.net>

When we are matching refnames "as path" against a pattern, then we
know that the beginning of any refname that can match the pattern has
to match the part of the pattern up to the first glob character. For
example, if the pattern is `refs/heads/foo*bar`, then it can only
match a reference that has the prefix `refs/heads/foo`.

So pass that prefix to `for_each_fullref_in()`. This lets the ref code
avoid passing us the full set of refs, and in some cases avoid reading
them in the first place.

This could be generalized to the case of multiple patterns, but (a) it
probably doesn't come up that often, and (b) it is more awkward to
deal with multiple patterns (e.g., the patterns might not be
disjoint). So, since this is just an optimization, punt on the case of
multiple patterns.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 ref-filter.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 61 insertions(+), 1 deletion(-)

diff --git a/ref-filter.c b/ref-filter.c
index 1fc5e9970d..54cc6bcc13 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1665,6 +1665,66 @@ static int filter_pattern_match(struct ref_filter *filter, const char *refname)
 	return match_pattern(filter, refname);
 }
 
+/*
+ * Find the longest prefix of pattern we can pass to
+ * for_each_fullref_in(), namely the part of pattern preceding the
+ * first glob character.
+ */
+static void find_longest_prefix(struct strbuf *out, const char *pattern)
+{
+	const char *p;
+
+	for (p = pattern; *p && !is_glob_special(*p); p++)
+		;
+
+	strbuf_add(out, pattern, p - pattern);
+}
+
+/*
+ * This is the same as for_each_fullref_in(), but it tries to iterate
+ * only over the patterns we'll care about. Note that it _doesn't_ do a full
+ * pattern match, so the callback still has to match each ref individually.
+ */
+static int for_each_fullref_in_pattern(struct ref_filter *filter,
+				       each_ref_fn cb,
+				       void *cb_data,
+				       int broken)
+{
+	struct strbuf prefix = STRBUF_INIT;
+	int ret;
+
+	if (!filter->match_as_path) {
+		/*
+		 * in this case, the patterns are applied after
+		 * prefixes like "refs/heads/" etc. are stripped off,
+		 * so we have to look at everything:
+		 */
+		return for_each_fullref_in("", cb, cb_data, broken);
+	}
+
+	if (!filter->name_patterns[0]) {
+		/* no patterns; we have to look at everything */
+		return for_each_fullref_in("", cb, cb_data, broken);
+	}
+
+	if (filter->name_patterns[1]) {
+		/*
+		 * multiple patterns; in theory this could still work as long
+		 * as the patterns are disjoint. We'd just make multiple calls
+		 * to for_each_ref(). But if they're not disjoint, we'd end up
+		 * reporting the same ref multiple times. So let's punt on that
+		 * for now.
+		 */
+		return for_each_fullref_in("", cb, cb_data, broken);
+	}
+
+	find_longest_prefix(&prefix, filter->name_patterns[0]);
+
+	ret = for_each_fullref_in(prefix.buf, cb, cb_data, broken);
+	strbuf_release(&prefix);
+	return ret;
+}
+
 /*
  * Given a ref (sha1, refname), check if the ref belongs to the array
  * of sha1s. If the given ref is a tag, check if the given tag points
@@ -1911,7 +1971,7 @@ int filter_refs(struct ref_array *array, struct ref_filter *filter, unsigned int
 		else if (filter->kind == FILTER_REFS_TAGS)
 			ret = for_each_fullref_in("refs/tags/", ref_filter_handler, &ref_cbdata, broken);
 		else if (filter->kind & FILTER_REFS_ALL)
-			ret = for_each_fullref_in("", ref_filter_handler, &ref_cbdata, broken);
+			ret = for_each_fullref_in_pattern(filter, ref_filter_handler, &ref_cbdata, broken);
 		if (!ret && (filter->kind & FILTER_REFS_DETACHED_HEAD))
 			head_ref(ref_filter_handler, &ref_cbdata);
 	}
-- 
2.11.0


  parent reply	other threads:[~2017-05-17 12:06 UTC|newest]

Thread overview: 73+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-05-17 12:05 [PATCH 00/23] Prepare to separate out a packed_ref_store Michael Haggerty
2017-05-17 12:05 ` [PATCH 01/23] t3600: clean up permissions test properly Michael Haggerty
2017-05-17 12:42   ` Jeff King
2017-05-17 14:01     ` Michael Haggerty
2017-05-18  4:10   ` Junio C Hamano
2017-05-19  3:37     ` Michael Haggerty
2017-05-17 12:05 ` [PATCH 02/23] refs.h: clarify docstring for the ref_transaction_update()-related fns Michael Haggerty
2017-05-17 16:46   ` Stefan Beller
2017-05-18  4:13     ` Junio C Hamano
2017-05-17 12:05 ` [PATCH 03/23] ref_iterator_begin_fn(): fix docstring Michael Haggerty
2017-05-17 12:05 ` [PATCH 04/23] prefix_ref_iterator: don't trim too much Michael Haggerty
2017-05-17 12:55   ` Jeff King
2017-05-17 14:11     ` Michael Haggerty
2017-05-17 14:22       ` Jeff King
2017-05-18  4:19   ` Junio C Hamano
2017-05-18  4:50     ` Michael Haggerty
2017-05-17 12:05 ` [PATCH 05/23] refs_ref_iterator_begin(): don't check prefixes redundantly Michael Haggerty
2017-05-17 12:59   ` Jeff King
2017-05-17 14:21     ` Michael Haggerty
2017-05-17 12:05 ` [PATCH 06/23] refs: use `size_t` indexes when iterating over ref transaction updates Michael Haggerty
2017-05-17 16:59   ` Stefan Beller
2017-05-18  4:55     ` Michael Haggerty
2017-05-17 12:05 ` [PATCH 07/23] ref_store: take `logmsg` parameter when deleting references Michael Haggerty
2017-05-17 13:12   ` Jeff King
2017-05-17 15:01     ` Michael Haggerty
2017-05-17 15:03       ` Jeff King
2017-05-17 12:05 ` [PATCH 08/23] lockfile: add a new method, is_lock_file_locked() Michael Haggerty
2017-05-17 13:12   ` Jeff King
2017-05-17 12:05 ` [PATCH 09/23] files-backend: move `lock` member to `files_ref_store` Michael Haggerty
2017-05-17 13:15   ` Jeff King
2017-05-17 15:49     ` Michael Haggerty
2017-05-17 12:05 ` [PATCH 10/23] files_ref_store: put the packed files lock directly in this struct Michael Haggerty
2017-05-17 13:17   ` Jeff King
2017-05-17 15:05     ` Michael Haggerty
2017-05-17 17:18     ` Stefan Beller
2017-05-18  0:18       ` Brandon Williams
2017-05-19  4:00       ` Michael Haggerty
2017-05-18  0:17     ` Brandon Williams
2017-05-18  1:11       ` Jeff King
2017-05-18 15:42         ` Brandon Williams
2017-05-17 12:05 ` [PATCH 11/23] files_transaction_cleanup(): new helper function Michael Haggerty
2017-05-17 13:19   ` Jeff King
2017-05-19  4:49     ` Michael Haggerty
2017-05-17 17:26   ` Stefan Beller
2017-05-19  4:42     ` Michael Haggerty
2017-05-17 12:05 ` [PATCH 12/23] ref_transaction_commit(): break into multiple functions Michael Haggerty
2017-05-17 17:44   ` Stefan Beller
2017-05-19  7:58     ` Michael Haggerty
2017-05-17 12:05 ` [PATCH 13/23] ref_update_reject_duplicates(): expose function to whole refs module Michael Haggerty
2017-05-17 12:05 ` [PATCH 14/23] ref_update_reject_duplicates(): use `size_t` rather than `int` Michael Haggerty
2017-05-17 12:05 ` [PATCH 15/23] ref_update_reject_duplicates(): add a sanity check Michael Haggerty
2017-05-17 12:05 ` [PATCH 16/23] should_pack_ref(): new function, extracted from `files_pack_refs()` Michael Haggerty
2017-05-17 12:05 ` [PATCH 17/23] get_packed_ref_cache(): assume "packed-refs" won't change while locked Michael Haggerty
2017-05-17 17:57   ` Stefan Beller
2017-05-18  1:15     ` Jeff King
2017-05-18 16:58       ` Stefan Beller
2017-05-17 12:05 ` [PATCH 18/23] read_packed_refs(): do more of the work of reading packed refs Michael Haggerty
2017-05-17 12:05 ` [PATCH 19/23] read_packed_refs(): report unexpected fopen() failures Michael Haggerty
2017-05-17 13:28   ` Jeff King
2017-05-17 15:27     ` Michael Haggerty
2017-05-18  4:57     ` Junio C Hamano
2017-05-18  5:08       ` Jeff King
2017-05-17 12:05 ` [PATCH 20/23] refs_ref_iterator_begin(): handle `GIT_REF_PARANOIA` Michael Haggerty
2017-05-17 13:29   ` Jeff King
2017-05-17 12:05 ` [PATCH 21/23] create_ref_entry(): remove `check_name` option Michael Haggerty
2017-05-17 12:05 ` Michael Haggerty [this message]
2017-05-17 13:38   ` [PATCH 22/23] ref-filter: limit traversal to prefix Jeff King
2017-05-19 10:02     ` Michael Haggerty
2017-05-17 12:05 ` [PATCH 23/23] cache_ref_iterator_begin(): avoid priming unneeded directories Michael Haggerty
2017-05-17 13:42 ` [PATCH 00/23] Prepare to separate out a packed_ref_store Jeff King
2017-05-17 18:14   ` Stefan Beller
2017-05-18 17:14 ` Johannes Sixt
2017-05-18 17:22   ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1f50142ccbb84a4e5d7a1cd67e6f9d30edc1d73a.1495014840.git.mhagger@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=novalis@novalis.org \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    --cc=sbeller@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).