user/dev discussion of public-inbox itself
 help / color / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: Eric Wong <e@80x24.org>
Cc: <meta@public-inbox.org> 
Subject: [PATCH 00/13]: Incremental index fixes
Date: Wed, 01 Aug 2018 11:41:47 -0500
Message-ID: <878t5qkpis.fsf@xmission.com> (raw)


Eric,

Recently I noticed some weird indexing on one of my mailboxes.  So I dug
in and made the index_sync tests more robust and found several corner
cases where the current code can be problematic.  What follows are the
test cases and fixes for those cases.

The big one is when a message with the highest message number is
deleted.  This can cause message numbers to be reassigned so I
explicitly track the highest message number assigned.

Very confusing when reading mail there is a case in SearchIdx
that has been assigning message numbers in the wrong order.  Depending
on your time range that can be very confusing as old messages show up
before newer ones.

Finally in v2 deleted messages have not been being deleted from the
msgmap.  Which while great for keeping message numbers from going
backwards it means things still show up that shouldn't.

Eric W. Biederman (13):
      Import.pm: Don't assume {in} and {out} always exist
      t/v1reindex.t: Isolate the test cases
      t/v2reindex.t: Isolate the test cases more
      t/v[12]reindex.t: Place expected second in Xapian tests
      t/v[12]reindex.t: Test that the resulting msgmap is as expected
      t/v[12]reindex.t: Test incremental indexing works
      SearchIdx.pm: Always assign numbers backwards during incremental indexing
      Msgmap.pm: Track the largest value of num ever assigned
      t/v[12]reindex.t Verify num_highwater
      t/v[12]reindex.t: Verify the num highwater is as expected
      SearchIdx,V2Writeable: Update num_highwater on optimized deletes
      V2Writeable.pm: Ensure that a found message number is in the msgmap
      V2Writeable.pm: In unindex_oid delete the message from msgmap

 lib/PublicInbox/Import.pm     |   1 +
 lib/PublicInbox/Msgmap.pm     |  23 ++-
 lib/PublicInbox/SearchIdx.pm  |  30 ++--
 lib/PublicInbox/V2Writable.pm |  20 ++-
 t/v1reindex.t                 | 359 ++++++++++++++++++++++++++++++++++++------
 t/v2reindex.t                 | 335 +++++++++++++++++++++++++++++++++++----
 6 files changed, 666 insertions(+), 102 deletions(-)


Eric



             reply index

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-01 16:41 ebiederm [this message]
2018-08-01 16:43 ` [PATCH 01/13] Import.pm: Don't assume {in} and {out} always exist Eric W. Biederman
2018-08-01 16:43 ` [PATCH 02/13] t/v1reindex.t: Isolate the test cases Eric W. Biederman
2018-08-01 16:43 ` [PATCH 03/13] t/v2reindex.t: Isolate the test cases more Eric W. Biederman
2018-08-01 16:43 ` [PATCH 04/13] t/v[12]reindex.t: Place expected second in Xapian tests Eric W. Biederman
2018-08-01 16:43 ` [PATCH 05/13] t/v[12]reindex.t: Test that the resulting msgmap is as expected Eric W. Biederman
2018-08-01 16:43 ` [PATCH 06/13] t/v[12]reindex.t: Test incremental indexing works Eric W. Biederman
2018-08-01 16:43 ` [PATCH 07/13] SearchIdx.pm: Always assign numbers backwards during incremental indexing Eric W. Biederman
2018-08-01 16:43 ` [PATCH 08/13] Msgmap.pm: Track the largest value of num ever assigned Eric W. Biederman
2018-08-02  3:00   ` Eric Wong
2018-08-02  3:44     ` [WIP] searchidx: support incremental indexing on indexlevel=basic Eric Wong
2018-08-02 12:25       ` ebiederm
2018-08-02 17:12         ` ebiederm
2018-08-02 18:15           ` ebiederm
2019-05-14  2:04       ` [PATCH 0/3] incremental index fixes for indexlevel=basic Eric Wong
2019-05-14  2:04         ` [PATCH 1/3] v1writable: new wrapper which is closer to v2writable Eric Wong
2019-05-14  2:04         ` [PATCH 2/3] v2writable: allow setting nproc via creat options Eric Wong
2019-05-14  2:04         ` [PATCH 3/3] searchidx: fix incremental index with indexlevel=basic on v1 Eric Wong
2018-08-02 12:08     ` [PATCH 08/13] Msgmap.pm: Track the largest value of num ever assigned ebiederm
2018-08-01 16:43 ` [PATCH 09/13] t/v[12]reindex.t Verify num_highwater Eric W. Biederman
2018-08-01 16:43 ` [PATCH 10/13] t/v[12]reindex.t: Verify the num highwater is as expected Eric W. Biederman
2018-08-01 16:43 ` [PATCH 11/13] SearchIdx,V2Writeable: Update num_highwater on optimized deletes Eric W. Biederman
2018-08-01 16:43 ` [PATCH 12/13] V2Writeable.pm: Ensure that a found message number is in the msgmap Eric W. Biederman
2018-08-01 16:43 ` [PATCH 13/13] V2Writeable.pm: In unindex_oid delete the message from msgmap Eric W. Biederman

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://public-inbox.org/README

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=878t5qkpis.fsf@xmission.com \
    --to=ebiederm@xmission.com \
    --cc=e@80x24.org \
    --cc=meta@public-inbox.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

user/dev discussion of public-inbox itself

Archives are clonable:
	git clone --mirror http://public-inbox.org/meta
	git clone --mirror http://czquwvybam4bgbro.onion/meta
	git clone --mirror http://hjrcffqmbrq6wope.onion/meta
	git clone --mirror http://ou63pmih66umazou.onion/meta

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.mail.public-inbox.meta
	nntp://ou63pmih66umazou.onion/inbox.comp.mail.public-inbox.meta
	nntp://czquwvybam4bgbro.onion/inbox.comp.mail.public-inbox.meta
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.mail.public-inbox.meta
	nntp://news.gmane.org/gmane.mail.public-inbox.general

 note: .onion URLs require Tor: https://www.torproject.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox