path: root/lib/PublicInbox/ExtSearchIdx.pm
Date Commit message
2020-11-08 extsearchidx: avoid needless alternates rewrite in ALL.git
As with fill_alternates in V2Writable, we do not need to update $GIT_DIR/objects/info/alternates if nothing is changed.
2020-11-07 extsearchidx: support --batch-size checkpoints
This is needed to limit the RSS of processes and ensure the stored data in over.sqlite3 and Xapian DBs are consistent if interrupted. Without checkpoints, indexing lore causes shard workers to take several GB of memory and thrash/OOM smaller systems.
2020-11-07 extsearchidx: set current_info in warning callbacks
This bit is duplicated with per-Inbox indexing in Admin; undecided if it's the right place for it.
2020-11-07 extsearchidx: handle edits
We can now handle cases where messages are edited in one inbox but not another, bifurcating the message. V2Writable::log_range handles some edge cases which could also happen in v2-only code paths, but weren't usually triggered because the default git-gc knobs do not prune immediately.
2020-11-07 searchidx: remove xref3 support for Xapian
It doesn't seem worth storing xref3 data in Xapian now that the same info is in over.sqlite3.
2020-11-07 extsearchidx: sync updates
A couple more things to prepare us to run syncs on both v1 and v2 inboxes.
2020-11-07 extsearchidx: sync unit updates
Now that the V2Writable code is more generic, we can sync with it to use `units' which represent either a v2 epoch or an entire v1 inbox.
2020-11-07 extsearchidx: remove {unindex_range} field
Moved to per-epoch "units".
2020-11-07 extsearchidx: more compatibility with V2Writable callers
We'll use `index_oid' and `unindex_oid' as our method names so V2Writable methods may use `$self->can' to access them.
2020-11-07 extsearchidx: initial implementation
It compiles...