git@vger.kernel.org mailing list mirror (one of many)
From: Tony Finch <dot@dotat.at>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: git@vger.kernel.org
Subject: Newton-Raphson, was Re: Performance issue of 'git branch'
Date: Thu, 23 Jul 2009 23:48:43 +0100
Message-ID: <alpine.LSU.2.00.0907232310220.22113@hermes-2.csi.cam.ac.uk>
In-Reply-To: <alpine.LFD.2.01.0907231153010.21520@localhost.localdomain>

On Thu, 23 Jul 2009, Linus Torvalds wrote:
>
> Some googling found this:
> 	http://marc.info/?l=git&m=117537594112450&w=2
> but what got merged (half a year later) was a much fancier thing by Junio.
> See sha1-lookup.c.

Thanks. Edésio Costa e Silva also gave me a useful pointer.

> That original "single iteration of newton-raphson" patch was buggy, but
> it's perhaps interesting as a concept patch.

I think Newton-Raphson is a brilliant but misleading idea. (As Junio said,
"egg of Columbus" - it certainly blew my mind!) Newton's method works
with smooth curves, whereas a pack index is a straight line plus
stochastic deviations. If you try to apply Newton's method, then the more
you zoom in, the more the random variations will send you away from the
place you want to be. So I think your first N-R patch was closer to being
right than its successors.

What you should do is ONE linear interpolation on the entire index (i.e.
if you have N objects in the pack and you want to find one with SHA-1 id
S, take the top four bytes of S as a 32-bit integer and multiply by
N/2^32). Note that if you do a level-1 256-way fan-out lookup first then
the random variations will make you LESS likely to land near the right
place.
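
To make the arithmetic concrete, here is a minimal sketch of that single
interpolation step. It is an illustration only, not code from git; the
function name interpolate_index is hypothetical, and it assumes the ids
are uniformly distributed, as SHA-1 output should be.

```python
def interpolate_index(sha1: bytes, n: int) -> int:
    """Guess where sha1 sits among n sorted, uniformly distributed ids.

    Hypothetical helper for illustration: takes the top four bytes of the
    id as a big-endian 32-bit integer and scales it by n / 2**32.
    """
    top32 = int.from_bytes(sha1[:4], "big")
    # Integer form of top32 * (n / 2**32); the result is always in [0, n-1]
    # for n >= 1, because top32 < 2**32.
    return (top32 * n) >> 32
```

For example, with N = 1000 an id starting with byte 0x80 followed by
zeros maps to slot 500, the middle of the index.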

After doing the first-order linear interpolation, it's probably sensible
to do a page-wise linear search (in case you don't land directly on
the page containing the target SHA-1) then a binary search within the
final page for efficiency with a hot cache.
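
The whole lookup could be sketched roughly as below. This is a sketch
under stated assumptions, not git's implementation: the index is modelled
as a sorted Python list of 20-byte ids, the name lookup is hypothetical,
and per_page=204 assumes 4096-byte pages holding 20-byte entries.

```python
from bisect import bisect_left

def lookup(ids, target, per_page=204):
    """Return the position of target in the sorted list ids, or -1.

    Hypothetical sketch: one interpolation, a page-wise linear search,
    then a binary search inside the final page.
    """
    n = len(ids)
    if n == 0:
        return -1
    # First-order linear interpolation over the entire index.
    guess = (int.from_bytes(target[:4], "big") * n) >> 32
    page = guess // per_page
    last_page = (n - 1) // per_page
    # Page-wise linear search: step back while target sorts before the
    # first entry of the current page...
    while page > 0 and target < ids[page * per_page]:
        page -= 1
    # ...or forward while it sorts after the last entry of the page.
    while page < last_page and target > ids[min(n, (page + 1) * per_page) - 1]:
        page += 1
    # Binary search within the single page we settled on.
    lo = page * per_page
    hi = min(n, lo + per_page)
    i = bisect_left(ids, target, lo, hi)
    return i if i < hi and ids[i] == target else -1
```

With uniformly distributed ids the two while loops are expected to run
only a few steps, so the number of pages touched stays small however
large the index is.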

This should give you O(1) seeks in the index per object lookup.

Tony.
-- 
f.anthony.n.finch  <dot@dotat.at>  http://dotat.at/
GERMAN BIGHT HUMBER: SOUTHWEST 5 TO 7. MODERATE OR ROUGH. SQUALLY SHOWERS.
MODERATE OR GOOD.

