user/dev discussion of public-inbox itself
 help / color / mirror / code / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: Eric Wong <e@80x24.org>
Cc: meta@public-inbox.org
Subject: Re: Q: Did you do something to message number recently?
Date: Mon, 24 Jun 2019 17:56:41 -0500	[thread overview]
Message-ID: <877e9avbh2.fsf@xmission.com> (raw)
In-Reply-To: <20190624173319.3bb4t3zrieb4k5w2@dcvr> (Eric Wong's message of "Mon, 24 Jun 2019 17:33:19 +0000")

Eric Wong <e@80x24.org> writes:

> Eric Wong <e@80x24.org> wrote:
>> "Eric W. Biederman" <ebiederm@xmission.com> wrote:
>> > 
>> > Eric,
>> > 
>> > I am just starting to dig into this, I just noticed that I have several
>> > inboxes that are seeing huge skips in message numbers assigned in
>> > msgmap.  Do you have any idea why this would be?
>
> How big are the gaps you're hitting?

This is from my mailbox where I confirmed there was an issue.

> sqlite> select max(num) from msgmap
>    ...> ;
> 110106
> sqlite> select * from msgmap where num = 110106 ;
> 110106|84814e0e219b33ade641718dc76b3225@swift.generated
> sqlite> select * from msgmap where num = 110105 ;
> sqlite> select * from msgmap where num = 110104 ;
> sqlite> select * from msgmap where num = 110103 ;
> sqlite> select * from msgmap where num = 110102 ;
> sqlite> select * from msgmap where num = 110101 ;
> sqlite> select * from msgmap where num = 110100 ;
> sqlite> select * from msgmap where num = 110100 ;
> sqlite> select * from msgmap where num = 110099 ;
> sqlite> select * from msgmap where num = 110098 ;
> sqlite> select * from msgmap where num = 110097 ;
> sqlite> select * from msgmap where num = 110096 ;
> sqlite> select * from msgmap where num = 110095 ;
> sqlite> select * from msgmap where num = 110094 ;
> sqlite> select * from msgmap where num = 110093 ;
> sqlite> select * from msgmap where num = 110092 ;
> sqlite> select * from msgmap where num = 110091 ;
> sqlite> select * from msgmap where num = 110090 ;
> sqlite> select * from msgmap where num = 110089 ;
> 110089|be2d2a0d036c01dba1fcbe3c001d3f0e@rhempresarial.com

Hmm.  With a smarter querey I am seeing:

> sqlite> select num, mid from msgmap order by num ;
> 110361|CAC8N3VSSLZ+ANaTXb0b3-K2NNrSRxYkuozAvJrPa4TZBL70ESg@mail.gmail.com
> 110365|947beed847840a7b278d820d00158ff9@rhempresarial.com
> 110412|388652ac24bbd42b565255e60032ac38@training-programs.net
> 110534|0.0.ED.ACB.1D52978A96689C0.0@uspmta120030.emarsys.net
> 110547|20190623052647.GA9838@gmail.com
> 110555|OF1F4E7E7A.F48DBC61-ON47258422.0008F563@LocalDomain
> 110560|832660e183e06b2c917d925c438b74b1@gruyeresfvxneypf.onion.pet
> 110571|1561260206528_102468_20455_1833.sc-10_9_12_206-inbound8$ebiederm@xmission.com
> 110577|166408121.377898.1561282483109@mail.yahoo.com
> 110578|20190623024642.3CADECD60542E@mail.casanare.gov.co
> 110580|308185e7c16c72785bfa7ca1a11bebb8@contato.ufsc.br
> 110592|207A6EDACFB04E7299FCD018E0A4BFA4.MAI@srv.newbilisim.com
> 110603|20190623084720.0180DCDAF68EC@mail.casanare.gov.co
> 110612|20190623103511.B05FF259256F2@mail.07d02.mspz7.gob.ec
> 110662|20190610090733.0CF77DB98C18400A@al-reefy.com
> 110689|9d692655ffc312f5899ee60d045055df@swift.generated
> 110701|25mseb$-t-8$-$$03-v$33@1nabo
> 110744|E1hfHhs-0003HR-Qo@mx02.mta.xmission.com
> 110753|20190624052459.6136AA4A29@poczta.zarz.agh.edu.pl
> 110755|SG2PR04MB260278655E8FC19804B95C76AEE00@SG2PR04MB2602.apcprd04.prod.outlook.com
> 110756|1459657803.17189221561359906087.JavaMail.app@rapp21.atlis1
> 110759|0e2c946c07c8941ce9b5b4d8df655c22@swift.generated
> 110765|SG2PR01MB2983FD90065F6FDAEB52F3B2CCE00@SG2PR01MB2983.apcprd01.prod.exchangelabs.com
> 110772|613cadb920e6c2bd6cecc7a5e2d7af17@swift.generated
> 110774|156136730324625@kroah.com
> 110776|f67aacd0534c471aaf9f531fd7d76514@SRV-MBX13-02.chu-toulouse.fr
> 110782|CAC=vM_Z8Pr4NVsatk3ROn1W8fJjkO3B_CDs8s+2UXm56cg3scA@mail.gmail.com
> 110783|20190624092326.475497997@linuxfoundation.org
> 110784|cb83bd8a41985599698df9225cbcd322@swift.generated
> 110787|6c4e4b4ecf374e0f92f6553e96368c38@SRV-MBX13-02.chu-toulouse.fr
> 110810|20190624132129.4c772nkjbrilxtcy@willie-the-truck
> 110826|1ada2c2ef303f2e6742747bb2218a9a4@fasobf.com
> 110827|20190624105927.84B07703461ACB3F@srv155.main-hosting.eu
> 110843|20190624163442.xhk6drl7ptnq7i5o@dcvr
> 110848|CF9A396C287FC34DABB6CB7E8D92D8FD0C9C6850@V1-EXCH-01.hccsc.k12.in.us
> 110852|20190624173319.3bb4t3zrieb4k5w2@dcvr
> 110856|822002_597274_254187_778544@cvanguardia.net
> 110865|20190624184534.209896-1-joel@joelfernandes.org
> 110866|20190624185214.GA211230@google.com
> 110869|CAG48ez3maGsRbN3qr8YVb6ZCw0FDq-7GqqiTiA4yEa1mebkubw@mail.gmail.com
> 110879|20190624201023.Vlx4D%akpm@linux-foundation.org
> 110884|CAEXW_YS0YR8Au+1f-sW_BT3xVONXKo9zrcSJMBwGJizyMw0xag@mail.gmail.com

So it is pretty bad.  I am using my import_imap_mailbox script that
I am trying to clean up and so I can merge it into at leat contrib.
This is a v2 mailbox and I am using
"PublicInbox::InboxWritable($ibx)->importer(1)->add(...)"

There may have been some crashes in the past, and I am delivering to
multiple mailboxes at once.

Let's see. InboxWritable->importer->add resolves to V2Writeable->add

The add method computes the number using num_for, which uses
Msgmpa::mid_insert.

Short of the sequence number for msgmap getting scrambled I don't
see how that can go wrong.  Sigh.

I will have to instrument my client, and see if that will give
me a clue what is going on.

I definitely don't have altid's in play here.

>> I've hit this in the past using the altid stuff to keep NNTP
>> article numbers in check.
>> 
>> There's a lot of changes w.r.t. progress reporting, no-op
>> speedups,
>> 
>> > If not I will dig in and figure this out.  I just figured I would ask
>> > in case you have any handy canidates.
>> 
>> Are you hitting this with "git fetch && public-inbox-index" or
>> -watch/-mda?
>
> OK, I see this in my libc-alpha v2 mirror via -watch,
> but not my v1 -watch mirrors.  Likely to affect -mda, too.
> But the number gaps are small, however (1-2 messages), and.

Good.  I am glad you can see it as well.  One of use might be able
to spot what strange thing is going on here.

I think most of my previous work was all in the index rebuild path.
Which may be why we don't have a regression test catching this.

Grr.

I don't see a good candiate for this failure!

Eric






  reply	other threads:[~2019-06-24 22:56 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-24 15:59 Q: Did you do something to message number recently? Eric W. Biederman
2019-06-24 16:34 ` Eric Wong
2019-06-24 17:33   ` Eric Wong
2019-06-24 22:56     ` Eric W. Biederman [this message]
2019-06-24 23:42       ` Eric Wong
2019-06-25 12:01         ` Eric W. Biederman
2019-06-25 17:51           ` Eric Wong
2019-06-24 23:38 ` [PATCH] msgmap: mid_insert: use plain "INSERT" to detect duplicates Eric Wong
2019-06-25  3:48   ` Eric Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://public-inbox.org/README

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=877e9avbh2.fsf@xmission.com \
    --to=ebiederm@xmission.com \
    --cc=e@80x24.org \
    --cc=meta@public-inbox.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).