user/dev discussion of public-inbox itself
From: "Eric Wong (Contractor, The Linux Foundation)" <e@80x24.org>
To: meta@public-inbox.org
Cc: "Eric Wong (Contractor, The Linux Foundation)" <e@80x24.org>
Subject: [PATCH 10/12] extmsg: remove expensive git path checks
Date: Wed, 18 Apr 2018 09:13:14 +0000
Message-ID: <20180418091316.29114-11-e@80x24.org>
In-Reply-To: <20180418091316.29114-1-e@80x24.org>

Looking up a Message-ID across other inboxes is expensive
without SQLite (or Xapian) installed, so stop falling back to
tree lookups in git.  Since SQLite is required for Xapian
support anyway, we do not need to check Xapian, either.

Sites without SQLite installed will simply 404 if somebody
requests a message which isn't in the current inbox.
---
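 Not part of the patch: a minimal, self-contained sketch of the
 lookup order described above, for readers who want to see it
 outside the diff.  The Sketch::Msgmap and Sketch::Inbox packages
 and the find_elsewhere sub are hypothetical stand-ins and not
 public-inbox code; the real Msgmap is backed by SQLite, while a
 plain hash is assumed here so the example runs without any
 non-core modules.  Only ->mm and ->num_for mirror names used in
 the diff below.

```perl
#!/usr/bin/perl -w
use strict;
use warnings;

# Hypothetical stand-in for PublicInbox::Msgmap: maps Message-IDs to
# article numbers.  A hash replaces the SQLite table (assumption).
package Sketch::Msgmap;
sub new { my ($class, %mid2num) = @_; bless { %mid2num }, $class }
sub num_for { my ($self, $mid) = @_; $self->{$mid} }

# Hypothetical stand-in for PublicInbox::Inbox.
package Sketch::Inbox;
sub new { my ($class, %opt) = @_; bless { %opt }, $class }
sub mm { $_[0]->{mm} } # undef when no Msgmap (SQLite) is available

package main;

# Return every inbox other than $cur whose Msgmap knows $mid.
sub find_elsewhere {
	my ($cur, $mid, @all) = @_;
	my @found;
	foreach my $other (@all) {
		next if $other->{name} eq $cur->{name};
		# no Msgmap: skip this inbox, there is no git fallback
		my $mm = $other->mm or next;
		push @found, $other if defined $mm->num_for($mid);
	}
	@found;
}

my $meta = Sketch::Inbox->new(name => 'meta',
	mm => Sketch::Msgmap->new('a@example.com' => 1));
my $git = Sketch::Inbox->new(name => 'git',
	mm => Sketch::Msgmap->new('b@example.com' => 7));
my $nosql = Sketch::Inbox->new(name => 'nosql'); # no ->mm at all

my @hits = find_elsewhere($meta, 'b@example.com', $meta, $git, $nosql);
print join(',', map { $_->{name} } @hits), "\n"; # prints "git"
```

 An inbox without a Msgmap is skipped rather than queued for a
 later git check, so a Message-ID found in no Msgmap yields an
 empty list.  In this sketch that empty list corresponds to the
 404 case mentioned in the commit message.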
 lib/PublicInbox/ExtMsg.pm | 39 +++++++--------------------------------
 lib/PublicInbox/Inbox.pm  |  5 -----
 2 files changed, 7 insertions(+), 37 deletions(-)

diff --git a/lib/PublicInbox/ExtMsg.pm b/lib/PublicInbox/ExtMsg.pm
index c71510f..a6f516d 100644
--- a/lib/PublicInbox/ExtMsg.pm
+++ b/lib/PublicInbox/ExtMsg.pm
@@ -31,30 +31,19 @@ sub ext_msg {
 	my $cur = $ctx->{-inbox};
 	my $mid = $ctx->{mid};
 
-	eval { require PublicInbox::Search };
-	my $have_xap = $@ ? 0 : 1;
-	my (@nox, @ibx, @found);
+	eval { require PublicInbox::Msgmap };
+	my $have_mm = $@ ? 0 : 1;
+	my (@ibx, @found);
 
 	$ctx->{www}->{pi_config}->each_inbox(sub {
 		my ($other) = @_;
 		return if $other->{name} eq $cur->{name} || !$other->base_url;
 
-		my $s = $other->search;
-		if (!$s) {
-			push @nox, $other;
-			return;
-		}
-
-		# try to find the URL with Xapian to avoid forking
-		my $doc_id = eval { $s->find_first_doc_id('Q' . $mid) };
-		if ($@) {
-			# xapian not configured properly for this repo
-			push @nox, $other;
-			return;
-		}
+		my $mm = $other->mm or return;
 
-		# maybe we found it!
-		if (defined $doc_id) {
+		# try to find the URL with Msgmap to avoid forking
+		my $num = $mm->num_for($mid);
+		if (defined $num) {
 			push @found, $other;
 		} else {
 			# no point in trying the fork fallback if we
@@ -66,20 +55,6 @@ sub ext_msg {
 
 	return exact($ctx, \@found, $mid) if @found;
 
-	# Xapian not installed or configured for some repos,
-	# do a full MID check (this is expensive...):
-	if (@nox) {
-		my $path = mid2path($mid);
-		foreach my $other (@nox) {
-			my (undef, $type, undef) = $other->path_check($path);
-
-			if ($type && $type eq 'blob') {
-				push @found, $other;
-			}
-		}
-	}
-	return exact($ctx, \@found, $mid) if @found;
-
 	# fall back to partial MID matching
 	my $n_partial = 0;
 	my @partial;
diff --git a/lib/PublicInbox/Inbox.pm b/lib/PublicInbox/Inbox.pm
index f71493a..706089c 100644
--- a/lib/PublicInbox/Inbox.pm
+++ b/lib/PublicInbox/Inbox.pm
@@ -290,11 +290,6 @@ sub smsg_mime {
 	}
 }
 
-sub path_check {
-	my ($self, $path) = @_;
-	git($self)->check('HEAD:'.$path);
-}
-
 sub mid2num($$) {
 	my ($self, $mid) = @_;
 	my $mm = mm($self) or return;
-- 
EW


Thread overview: 13+ messages
2018-04-18  9:13 [PATCH 00/12] better dedupe, contiguous article numbers Eric Wong (Contractor, The Linux Foundation)
2018-04-18  9:13 ` [PATCH 01/12] feed: respect feedmax, again Eric Wong (Contractor, The Linux Foundation)
2018-04-18  9:13 ` [PATCH 02/12] v1: remove articles from overview DB Eric Wong (Contractor, The Linux Foundation)
2018-04-18  9:13 ` [PATCH 03/12] compact: do not merge v2 repos by default Eric Wong (Contractor, The Linux Foundation)
2018-04-18  9:13 ` [PATCH 04/12] v2writable: reduce partititions by one Eric Wong (Contractor, The Linux Foundation)
2018-04-18  9:13 ` [PATCH 05/12] search: preserve References in Xapian smsg for x=t view Eric Wong (Contractor, The Linux Foundation)
2018-04-18  9:13 ` [PATCH 06/12] v2: generate better Message-IDs for duplicates Eric Wong (Contractor, The Linux Foundation)
2018-04-18  9:13 ` [PATCH 07/12] v2: improve deduplication checks Eric Wong (Contractor, The Linux Foundation)
2018-04-18  9:13 ` [PATCH 08/12] import: cat_blob drops leading 'From ' lines like Inbox Eric Wong (Contractor, The Linux Foundation)
2018-04-18  9:13 ` [PATCH 09/12] searchidx: regenerate and avoid article number gaps on full index Eric Wong (Contractor, The Linux Foundation)
2018-04-18  9:13 ` Eric Wong (Contractor, The Linux Foundation) [this message]
2018-04-18  9:13 ` [PATCH 11/12] use %H consistently to disable abbreviations Eric Wong (Contractor, The Linux Foundation)
2018-04-18  9:13 ` [PATCH 12/12] searchidx: increase term positions for all text terms Eric Wong (Contractor, The Linux Foundation)
