user/dev discussion of public-inbox itself
* [PATCH 00/23] indexing: --skip-docdata + speedups
@ 2020-08-20 20:24 Eric Wong
  2020-08-20 20:24 ` [PATCH 01/23] doc: note -compact and -xcpdb are rarely used Eric Wong
                   ` (22 more replies)
  0 siblings, 23 replies; 25+ messages in thread
From: Eric Wong @ 2020-08-20 20:24 UTC (permalink / raw)
  To: meta

Some miscellaneous help and cleanup things, too.

Document data is no longer read from Xapian by read-only
daemons; that data is redundant given that over.sqlite3 always
exists.  This should slightly improve page cache hit rates for
over.sqlite3.  Being able to mass-load a bunch of rows from
SQLite speeds up the default search summary view by ~10%, too.
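
As a rough illustration of why mass-loading rows can help, the sketch
below compares one query per search result against a single batched
query.  It is written in Python purely for illustration (public-inbox
itself is Perl).  The `msg` table, its columns, and the function names
are all hypothetical; they are not the real over.sqlite3 schema or the
PublicInbox::Over API.

```python
import sqlite3


def load_rows_one_by_one(db, nums):
    """Issue one SELECT per article number (the slower pattern)."""
    rows = []
    for n in nums:
        row = db.execute(
            "SELECT num, subject FROM msg WHERE num = ?", (n,)
        ).fetchone()
        if row is not None:
            rows.append(row)
    return rows


def load_rows_batched(db, nums):
    """Fetch every requested row with a single SELECT ... IN (...)."""
    if not nums:
        return []
    # One "?" placeholder per article number, e.g. "?,?,?"
    placeholders = ",".join("?" * len(nums))
    sql = f"SELECT num, subject FROM msg WHERE num IN ({placeholders})"
    by_num = {row[0]: row for row in db.execute(sql, list(nums))}
    # SQLite may return rows in any order; restore search-result order.
    return [by_num[n] for n in nums if n in by_num]


def demo():
    """Build a throwaway in-memory table and load it both ways."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE msg (num INTEGER PRIMARY KEY, subject TEXT)")
    db.executemany(
        "INSERT INTO msg VALUES (?, ?)",
        [(i, f"subject {i}") for i in range(1, 101)],
    )
    nums = [42, 7, 99, 1000]  # 1000 is deliberately absent
    return load_rows_one_by_one(db, nums), load_rows_batched(db, nums)
```

Both functions return the same rows in the same order; the batched
version just makes one round trip into SQLite per page of results
instead of one per message.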

The new --skip-docdata option to -init and -index avoids writing
Xapian document data, saving ~1.5% in Xapian space overhead
(and associated I/O and page cache overheads).  It breaks
rollbacks to old versions, though, so it won't be the default.

Eric Wong (23):
  doc: note -compact and -xcpdb are rarely used
  admin: progress shows the inbox being indexed
  compact: support --help/-? and perform lazy loading
  init: support --help and -?
  init: support --newsgroup option
  init: drop -N alias for --skip-artnum
  search: v2: ensure shards are numerically sorted
  xapcmd: simplify {reindex} parameter passing
  www: reduce long-lived PublicInbox::Search references
  search: improve comments around constants
  search: export mdocid subroutine
  searchquery: split off from searchview
  search: make qparse_new an internal function
  smsg: reduce utf8::decode call sites
  searchview: use over.sqlite3 instead of Xapian docdata
  searchview: speed up search summary by ~10%
  searchview: convert nested and Atom display to over.sqlite3
  extmsg: avoid using Xapian docdata
  mbox: avoid Xapian docdata in search results
  smsg: remove from_mitem
  t/nntpd-v2: set PI_TEST_VERSION=2 properly
  init+index: support --skip-docdata for Xapian
  search: add mset_to_artnums method

 Documentation/public-inbox-compact.pod |   5 +
 Documentation/public-inbox-config.pod  |   2 +-
 Documentation/public-inbox-index.pod   |   8 ++
 Documentation/public-inbox-init.pod    |  37 +++++++-
 Documentation/public-inbox-xcpdb.pod   |   3 +
 MANIFEST                               |   1 +
 lib/PublicInbox/Admin.pm               |  15 ++-
 lib/PublicInbox/ExtMsg.pm              |  19 ++--
 lib/PublicInbox/IMAP.pm                |   5 +-
 lib/PublicInbox/Inbox.pm               |  11 ++-
 lib/PublicInbox/Mbox.pm                |  24 +++--
 lib/PublicInbox/Over.pm                |  28 +++---
 lib/PublicInbox/Search.pm              | 125 ++++++++++++++-----------
 lib/PublicInbox/SearchIdx.pm           |  35 +++++--
 lib/PublicInbox/SearchIdxShard.pm      |   2 +-
 lib/PublicInbox/SearchQuery.pm         |  53 +++++++++++
 lib/PublicInbox/SearchView.pm          |  90 ++++--------------
 lib/PublicInbox/Smsg.pm                |  10 +-
 lib/PublicInbox/Xapcmd.pm              |  20 ++--
 script/public-inbox-compact            |  39 ++++++--
 script/public-inbox-convert            |   3 +-
 script/public-inbox-index              |   7 +-
 script/public-inbox-init               | 101 +++++++++++++-------
 t/imapd.t                              |   6 +-
 t/inbox_idle.t                         |   2 +-
 t/index-git-times.t                    |  11 ++-
 t/init.t                               |  17 +++-
 t/nntpd-v2.t                           |   2 +-
 t/nntpd.t                              |   9 +-
 t/search.t                             |  34 ++++---
 30 files changed, 437 insertions(+), 287 deletions(-)
 create mode 100644 lib/PublicInbox/SearchQuery.pm


Thread overview: 25+ messages
2020-08-20 20:24 [PATCH 00/23] indexing: --skip-docdata + speedups Eric Wong
2020-08-20 20:24 ` [PATCH 01/23] doc: note -compact and -xcpdb are rarely used Eric Wong
2020-08-20 20:24 ` [PATCH 02/23] admin: progress shows the inbox being indexed Eric Wong
2020-08-20 20:24 ` [PATCH 03/23] compact: support --help/-? and perform lazy loading Eric Wong
2020-08-20 20:24 ` [PATCH 04/23] init: support --help and -? Eric Wong
2020-08-20 20:24 ` [PATCH 05/23] init: support --newsgroup option Eric Wong
2020-08-20 21:10   ` Eric Wong
2020-08-20 20:24 ` [PATCH 06/23] init: drop -N alias for --skip-artnum Eric Wong
2020-08-20 20:24 ` [PATCH 07/23] search: v2: ensure shards are numerically sorted Eric Wong
2020-08-20 20:24 ` [PATCH 08/23] xapcmd: simplify {reindex} parameter passing Eric Wong
2020-08-20 20:24 ` [PATCH 09/23] www: reduce long-lived PublicInbox::Search references Eric Wong
2020-08-20 20:24 ` [PATCH 10/23] search: improve comments around constants Eric Wong
2020-08-20 20:24 ` [PATCH 11/23] search: export mdocid subroutine Eric Wong
2020-08-20 20:24 ` [PATCH 12/23] searchquery: split off from searchview Eric Wong
2020-08-20 20:24 ` [PATCH 13/23] search: make qparse_new an internal function Eric Wong
2020-08-20 20:24 ` [PATCH 14/23] smsg: reduce utf8::decode call sites Eric Wong
2020-08-20 20:24 ` [PATCH 15/23] searchview: use over.sqlite3 instead of Xapian docdata Eric Wong
2020-08-20 20:24 ` [PATCH 16/23] searchview: speed up search summary by ~10% Eric Wong
2020-08-20 20:24 ` [PATCH 17/23] searchview: convert nested and Atom display to over.sqlite3 Eric Wong
2020-08-20 20:24 ` [PATCH 18/23] extmsg: avoid using Xapian docdata Eric Wong
2020-08-20 20:24 ` [PATCH 19/23] mbox: avoid Xapian docdata in search results Eric Wong
2020-08-20 20:24 ` [PATCH 20/23] smsg: remove from_mitem Eric Wong
2020-08-20 20:24 ` [PATCH 21/23] t/nntpd-v2: set PI_TEST_VERSION=2 properly Eric Wong
2020-08-20 20:24 ` [PATCH 22/23] init+index: support --skip-docdata for Xapian Eric Wong
2020-08-20 20:24 ` [PATCH 23/23] search: add mset_to_artnums method Eric Wong

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://public-inbox.org/meta
	git clone --mirror http://czquwvybam4bgbro.onion/meta
	git clone --mirror http://hjrcffqmbrq6wope.onion/meta
	git clone --mirror http://ou63pmih66umazou.onion/meta

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V1 meta meta/ https://public-inbox.org/meta \
		meta@public-inbox.org
	public-inbox-index meta

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.mail.public-inbox.meta
	nntp://ou63pmih66umazou.onion/inbox.comp.mail.public-inbox.meta
	nntp://czquwvybam4bgbro.onion/inbox.comp.mail.public-inbox.meta
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.mail.public-inbox.meta
	nntp://news.gmane.io/gmane.mail.public-inbox.general
 note: .onion URLs require Tor: https://www.torproject.org/

code repositories for the project(s) associated with this inbox:

	https://80x24.org/public-inbox.git

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git