From: Shawn Pearce
Subject: Re: [PATCH 09/16] documentation: add documentation for the bitmap format
Date: Mon, 1 Jul 2013 12:13:25 -0700
To: Colby Ranger
Cc: Jeff King, Vicent Martí, Junio C Hamano, git

On Mon, Jul 1, 2013 at 11:47 AM, Colby Ranger wrote:
>> But I think we are comparing
>> apples to steaks here, Vincent is (rightfully) concerned about process
>> startup performance, whereas our timings were assuming the process was
>> already running.
>>
>
> I did some timing on loading the reverse index for the kernel and it
> is pretty slow (~1200ms). I just submitted a fix to do a bucket sort
> and reduced that to ~450ms, which is still slow but much better:
> https://eclipse.googlesource.com/jgit/jgit/+/6cc532a43cf28403cb623d3df8600a2542a40a43%5E%21/

A reverse index that is hot in RAM would obviously load in about 0ms.
But a cold load of a reverse index that uses only 4 bytes per object
(as Colby did here) for 3.1M objects could take ~590ms to read from
disk, assuming spinning media moving 20 MiB/s. If 8 byte offsets were
also stored (12 bytes per object in total) this could be more like
1770ms. Numbers obviously get better if the spinning media can transfer
at 40 MiB/s; now it's more like 295ms for 4 bytes/object and 885ms for
12 bytes/object.

I think it's still reasonable to compute the reverse index on the fly.
But JGit certainly does have the benefit of reusing it across requests
by relying on process memory based caches.
C Git needs to rely on the kernel buffer cache, which requires this data be written out to a file to be shared.
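
For reference, the cold-load estimates above are plain size-over-bandwidth
arithmetic. Here is a minimal sketch in Python, assuming a purely sequential
read with no seek or filesystem overhead. The name cold_read_ms is made up
for illustration and is not part of Git or JGit.

```python
MIB = 1024 * 1024  # bytes in one MiB


def cold_read_ms(num_objects, bytes_per_object, mib_per_second):
    """Estimate milliseconds to read a reverse index from cold disk.

    Assumes one sequential read at a constant transfer rate; seeks and
    filesystem overhead are ignored (an assumption, not a measurement).
    """
    total_bytes = num_objects * bytes_per_object
    return total_bytes / (mib_per_second * MIB) * 1000.0


if __name__ == "__main__":
    objects = 3_100_000  # object count used in the estimate above
    for rate in (20, 40):        # transfer rate in MiB/s
        for width in (4, 12):    # 4-byte entry alone, or 4 + 8-byte offset
            ms = cold_read_ms(objects, width, rate)
            print(f"{rate} MiB/s, {width} bytes/object: {ms:.0f} ms")
```

The results should land within a few milliseconds of the figures quoted
above, and the 12 bytes/object case is simply three times the 4 bytes/object
case at the same transfer rate.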