user/dev discussion of public-inbox itself
* [PATCH 28/52] v2writable: make *last_commits and sync_prepare OO methods
  2020-10-27  7:54  6% [PATCH 00/52] detached external index: mostly Eric Wong
@ 2020-10-27  7:54  7% ` Eric Wong
  0 siblings, 0 replies; 2+ results
From: Eric Wong @ 2020-10-27  7:54 UTC (permalink / raw)
  To: meta

This will allow ExtSearchIdx to override or reuse them more
easily.  Unfortunately we lose prototype validation, but that
seems to be discouraged anyway given the 'signatures' feature
in Perl 5.20+.
---
 lib/PublicInbox/V2Writable.pm | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index 3d3c25ec..ca60f2a1 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -952,8 +952,9 @@ sub index_oid { # cat_async callback
 }
 
 # only update last_commit for $i on reindex iff newer than current
+# $sync will be used by subclasses
 sub update_last_commit {
-	my ($self, $git, $i, $cmt) = @_;
+	my ($self, $sync, $git, $i, $cmt) = @_;
 	my $last = last_epoch_commit($self, $i);
 	if (defined $last && is_ancestor($git, $last, $cmt)) {
 		my @cmd = (qw(rev-list --count), "$last..$cmt");
@@ -963,7 +964,7 @@ sub update_last_commit {
 	last_epoch_commit($self, $i, $cmt);
 }
 
-sub last_commits ($$) {
+sub last_commits {
 	my ($self, $sync) = @_;
 	my $heads = [];
 	for (my $i = $sync->{epoch_max}; $i >= 0; $i--) {
@@ -1028,6 +1029,7 @@ sub artnum_max { $_[0]->{mm}->num_highwater }
 
 sub sync_prepare ($$) {
 	my ($self, $sync) = @_;
+	$sync->{ranges} = sync_ranges($self, $sync);
 	my $pr = $sync->{-opt}->{-progress};
 	my $regen_max = 0;
 	my $head = $sync->{ibx}->{ref_head} || 'HEAD';
@@ -1232,7 +1234,7 @@ sub index_epoch ($$$) {
 		}
 	}
 	$all->async_wait_all;
-	$self->update_last_commit($git, $i, $stk->{latest_cmt});
+	$self->update_last_commit($sync, $git, $i, $stk->{latest_cmt});
 }
 
 sub xapian_only {
@@ -1294,7 +1296,6 @@ sub index_sync {
 		ibx => $self->{ibx},
 		epoch_max => $epoch_max,
 	};
-	$sync->{ranges} = sync_ranges($self, $sync);
 	if (sync_prepare($self, $sync)) {
 		# tmp_clone seems to fail if inside a transaction, so
 		# we rollback here (because we opened {mm} for reading)
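
A minimal sketch of the pattern this patch relies on, not the actual
PublicInbox code: the Demo::Base and Demo::Ext packages, the with_proto
sub, and the {key} field are all hypothetical stand-ins.  It shows a
subclass overriding update_last_commit and using the extra $sync
argument, and it shows that Perl does not apply prototypes to method
calls, which is why the ($$) prototype gives no validation once a sub
is called via ->.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a base class such as V2Writable
package Demo::Base;

sub new {
	my ($class, %opt) = @_;
	my $self = { %opt };
	bless $self, $class;
}

# The ($$) prototype is only checked for plain function calls;
# method calls ignore it entirely.
sub with_proto ($$) { scalar @_ }

# $sync is unused here, but is passed so subclasses may use it
sub update_last_commit {
	my ($self, $sync, $git, $i, $cmt) = @_;
	$self->{last}->[$i] = $cmt;
}

sub index_epoch {
	my ($self, $sync, $i, $cmt) = @_;
	# OO dispatch: resolves to a subclass override if one exists
	$self->update_last_commit($sync, undef, $i, $cmt);
}

# Hypothetical stand-in for a subclass such as ExtSearchIdx
package Demo::Ext;
our @ISA = ('Demo::Base');

# The override keys its state on data carried in $sync
sub update_last_commit {
	my ($self, $sync, $git, $i, $cmt) = @_;
	$self->{last}->{"$sync->{key}:$i"} = $cmt;
}

package main;

my $sync = { epoch_max => 1, key => 'example' };
my $base = Demo::Base->new(last => []);
my $ext = Demo::Ext->new(last => {});

# Same caller code, different behavior depending on the class
$base->index_epoch($sync, 0, 'abc');
$ext->index_epoch($sync, 0, 'abc');

# Three explicit arguments plus the invocant: no prototype error
my $nargs = Demo::Base->with_proto(1, 2, 3);
```

Had index_epoch called update_last_commit($self, ...) as a plain
function, the Demo::Ext override would never run; calling it through
$self-> is what lets the subclass take over.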


* [PATCH 00/52] detached external index: mostly
@ 2020-10-27  7:54  6% Eric Wong
  2020-10-27  7:54  7% ` [PATCH 28/52] v2writable: make *last_commits and sync_prepare OO methods Eric Wong
  0 siblings, 1 reply; 2+ results
From: Eric Wong @ 2020-10-27  7:54 UTC (permalink / raw)
  To: meta

...and mostly wired up for WWW, but requires manual config
editing atm.  Needs docs and tests, and IMAP support.

This will also form the basis of a mairix workalike client.

Not sure about the usability aspects, but I think this can
replace the need for per-inbox Xapian DBs and save a truckload
of disk space (and more importantly: cache space).  Per-inbox
over.sqlite3 remains required for compatibility with NNTP/IMAP
and existing WWW code.

I don't know if the command-line tool is going to be called
public-inbox-eindex or public-inbox-extindex, but probably the
latter...

"xindex" could be confusing, and "eindex" rhymes with "reindex"
which could also be confusing.  But I'm even more easily
confused than usual these days :x

Performance isn't great, it took 30+ hours to index my mirror of
lore on a SATA SSD, but the entire index is <200GB due to
deduplication between cross posts.  -compact isn't working with
these indices, yet, but will sometime...

More changes on the way, still trying to fix my brain and get
through this year...

Eric Wong (52):
  doc/standards: add RFCs for URL schemes
  search: hoist out _xdb_sharded for v2 inboxes
  extsearch: start mocking out
  searchidx: expose INDEXLEVELS as `our'
  v2writable: add git method
  v2writable: make OO calls to last_commit-related methods
  search: xdb_sharded: make this a public method for ExtSearch
  searchidx: introduce "xref3" concept
  v2writable: prepare initialization for external indices
  v2writable: hoist out write_alternates
  searchidxshard: allow msgref to be undef
  v2writable: idx_shard: simplify callers
  v2writable: count_shards: allow working without {ibx}
  overidx: introduce changes for external index
  v2: some changes for ExtSearchIdx compatibility
  inboxwritable: eidx_key for external index
  v2writable: rename remaining "remote" terminology
  v2writable: checkpoint: account for lack of {mm}
  extsearchidx: initial implementation
  searchidx: index eidx_key as a boolean term
  searchidx: xref3 delete support
  searchidxshard: special init for eidx
  searchidx: put {ibx} into $sync state
  searchidx: log2stack: simplify callers
  v2writable: more generic sync setup code
  v2writable: allow OO method references
  v2writable: rename {v2w} field to {self}
  v2writable: make *last_commits and sync_prepare OO methods
  v2writable: move size check init to sync_prepare
  extsearchidx: more compatibility with V2Writable callers
  v2writable: reduce scope of epoch-aware code
  extsearchidx: remove {unindex_range} field
  v2writable: pass oid to uindex_oid
  extsearchidx: sync unit updates
  searchidx: export prepare_stack
  extsearchidx: sync updates
  searchidx: reduce inbox-dependency, wrap ->with_umask
  searchidx: favor $sync->{ibx} (over $self->{ibx})
  Makefile.PL: do not build manpage if POD is missing
  script: add preliminary eindex implementation
  index: eindex wiring
  over: store xref3 data in over.sqlite3
  searchidx: remove xref3 support for Xapian
  t/extsearch.t: verify results and xref3 ordering
  t/v2writable: remove pointless ->barrier call
  extsearch: wire up smsg_eml
  extsearchidx: handle edits
  extsearch: wire up remaining Inbox-like methods for WWW
  searchidx: ignore exceptions from ->remove_term
  extsearchidx: set current_info in warning callbacks
  extsearchidx: support --batch-size checkpoints
  searchidxshard: make warnings with eidx_key less confusing

 Documentation/standards.perl      |   3 +
 MANIFEST                          |   4 +
 Makefile.PL                       |  16 +-
 lib/PublicInbox/Config.pm         |  12 +
 lib/PublicInbox/ExtSearch.pm      |  69 +++++
 lib/PublicInbox/ExtSearchIdx.pm   | 404 ++++++++++++++++++++++++++++++
 lib/PublicInbox/Inbox.pm          |  53 ++--
 lib/PublicInbox/InboxWritable.pm  |  23 ++
 lib/PublicInbox/Over.pm           |  19 ++
 lib/PublicInbox/OverIdx.pm        | 122 ++++++++-
 lib/PublicInbox/Search.pm         |  62 ++---
 lib/PublicInbox/SearchIdx.pm      | 135 +++++++---
 lib/PublicInbox/SearchIdxShard.pm |  77 +++++-
 lib/PublicInbox/V2Writable.pm     | 310 ++++++++++++-----------
 lib/PublicInbox/WWW.pm            |   3 +-
 lib/PublicInbox/Xapcmd.pm         |   2 +-
 script/public-inbox-eindex        |  43 ++++
 script/public-inbox-index         |   3 +-
 t/extsearch.t                     |  75 ++++++
 t/over.t                          |  24 ++
 t/search.t                        |   2 -
 t/v2writable.t                    |   3 +-
 22 files changed, 1204 insertions(+), 260 deletions(-)
 create mode 100644 lib/PublicInbox/ExtSearch.pm
 create mode 100644 lib/PublicInbox/ExtSearchIdx.pm
 create mode 100644 script/public-inbox-eindex
 create mode 100644 t/extsearch.t


Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git
