From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN: AS6315 166.70.0.0/16
X-Spam-Status: No, score=-3.7 required=3.0 tests=AWL,BAYES_00, RCVD_IN_DNSWL_LOW,SPF_PASS shortcircuit=no autolearn=ham autolearn_force=no version=3.4.1
Received: from out03.mta.xmission.com (out03.mta.xmission.com [166.70.13.233]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by dcvr.yhbt.net (Postfix) with ESMTPS id F24AD1F85E; Fri, 13 Jul 2018 20:04:07 +0000 (UTC)
Received: from in01.mta.xmission.com ([166.70.13.51]) by out03.mta.xmission.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.87) (envelope-from ) id 1fe4IQ-0000Sc-19; Fri, 13 Jul 2018 14:04:06 -0600
Received: from [97.119.167.31] (helo=x220.xmission.com) by in01.mta.xmission.com with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.87) (envelope-from ) id 1fe4IP-00082t-2L; Fri, 13 Jul 2018 14:04:05 -0600
From: ebiederm@xmission.com (Eric W. Biederman)
To: Eric Wong
Cc: meta@public-inbox.org
References: <87k1q1bky6.fsf@xmission.com> <20180712014715.dn5aouayoa3uejp4@dcvr> <87k1q07dyc.fsf@xmission.com> <20180712230946.mqv3yjw4aabf7xrf@dcvr.yhbt.net> <878t6f1ch7.fsf@xmission.com>
Date: Fri, 13 Jul 2018 15:03:59 -0500
In-Reply-To: <878t6f1ch7.fsf@xmission.com> (Eric W.
 Biederman's message of "Fri, 13 Jul 2018 08:39:32 -0500")
Message-ID: <87h8l2ykb4.fsf@xmission.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-XM-SPF: eid=1fe4IP-00082t-2L;;;mid=<87h8l2ykb4.fsf@xmission.com>;;;hst=in01.mta.xmission.com;;;ip=97.119.167.31;;;frm=ebiederm@xmission.com;;;spf=neutral
X-XM-AID: U2FsdGVkX19i2Hu362S2tBhhRrSBGm3PM6Bk9fVLsO8=
X-SA-Exim-Connect-IP: 97.119.167.31
X-SA-Exim-Mail-From: ebiederm@xmission.com
Subject: Re: Q: V2 format
X-SA-Exim-Version: 4.2.1 (built Thu, 05 May 2016 13:38:54 -0600)
X-SA-Exim-Scanned: Yes (on in01.mta.xmission.com)
List-Id:

ebiederm@xmission.com (Eric W. Biederman) writes:

> Eric Wong writes:
>
>> "Eric W. Biederman" wrote:
>>>
>>> Because of the parallelism in V2 I have noticed messages in numbered
>>> in an order that does not correspond to their commit order. So the
>>> SQLite database isn't as recoverable as it might be. Especially as the
>>> parallelism introduces an element of non-determinancy.
>>
>> *puzzled* were you able to reproduce that? The serial number
>> generation + threading happens in the main process and the
>> parallelism is limited to Xapian text indexing. -index
>> generates serial numbers by walking backwards with v2, and
>> complains on unexpected results.

Digging into this, I have found that the numbering is consistently
non-reproducible, and that the cause is deleted files rather than
the parallelism.

Apparently in both V1 and V2 a worst-case estimate is made of the
total count of numbers that are going to be needed, and numbers are
assigned backwards from there.

A fresh indexing of the git mailing list archive on v1 gives me numbers
starting with 360 and on v2 numbers starting with 355, which corresponds
with the number of deleted messages.

I am still looking to see if there are any other weird things here. I
definitely do not like not being able to reconstruct message numbers
from a backup.

Eric
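
P.S. To make the effect concrete, here is a toy model in Python. It is
only a sketch of the behaviour described above, not the public-inbox
code: the function name, the (op, msgid) history format, and the choice
of "one number per commit" as the worst-case estimate are all
assumptions made for illustration.

```python
def assign_backwards(history):
    """Toy model: number messages by walking history newest to oldest.

    history is a list of (op, msgid) tuples, oldest first, where op is
    "add" or "delete".  Both the format and the estimate are assumptions.
    """
    # Assumed worst-case estimate: every commit might need a number.
    num = len(history)
    deleted = set()
    numbers = {}
    for op, msgid in reversed(history):
        if op == "delete":
            # Remember the deletion so the earlier add gets no number.
            deleted.add(msgid)
        elif msgid in deleted:
            # This add was deleted later in history; skip it.
            deleted.discard(msgid)
        else:
            numbers[msgid] = num
            num -= 1
    return numbers


# Same three surviving messages, with and without a deleted one.
clean = [("add", "a"), ("add", "b"), ("add", "c")]
with_spam = [
    ("add", "a"),
    ("add", "spam"),
    ("add", "b"),
    ("delete", "spam"),
    ("add", "c"),
]

print(assign_backwards(clean))
print(assign_backwards(with_spam))
```

In this model the surviving messages keep their relative order, but the
first number in use moves with the number of deletions in the history,
so the same set of messages can end up numbered differently.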