git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Chuck Lever <cel@citi.umich.edu>
To: Daniel Barkalow <barkalow@iabervon.org>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 21/22] teach the merge algorithm about cache iterators
Date: Wed, 14 Sep 2005 11:36:08 -0400	[thread overview]
Message-ID: <43284368.8010004@citi.umich.edu> (raw)
In-Reply-To: <Pine.LNX.4.63.0509121633480.23242@iabervon.org>

[-- Attachment #1: Type: text/plain, Size: 1749 bytes --]

Daniel Barkalow wrote:
> On Mon, 12 Sep 2005, Chuck Lever wrote:
> 
> 
>>For now, we simply replace indpos with a cache cursor.  Likely more
>>changes will be needed after we successfully replace the cache array
>>with an abstract data type.
> 
> 
> The right order is probably to add the concept of a cache that isn't the 
> one that normal functions deal with, have read_cache_unmerged return such 
> a thing, call cc_init with that, and rip out all of the removal and 
> position adjustment code. Then read_tree won't care at all about the 
> internal structure of the cache type, and it can be replaced without any 
> problem.

ok, i've done this.  read_cache_unmerged now reads into a separate 
cache, and read-tree.c does the merge by moving the appropriate cache 
entries into the active cache.

the linked list prototype is done, and works correctly.  this validates 
the new cache cursor API.  unfortunately because finding a name is now 
O(n), many things are slower than before (but i expected this would be 
the case for lists).

the next step is to try out more sophisticated data types.  we have 
three on the table so far:

1.  linus' sparse hyperlist implementation.  i suspect this will have 
the same bad performance characteristics as a standard linked list.

2.  self-balancing trees.  a splay tree is a good example.  we can 
reduce the size of the tree by storing all stages of a name in each 
node.  kernel source is about 18K files, which means we can find names 
in about 15 steps, on average.

3.  hash table, hashing on ce->name.  similar to a Python dictionary. 
with an 8 kilobucket hash table and a good hash function, we can store 
the kernel source, finding names in two or three steps on average.

are there others?

[-- Attachment #2: cel.vcf --]
[-- Type: text/x-vcard, Size: 439 bytes --]

begin:vcard
fn:Chuck Lever
n:Lever;Charles
org:Network Appliance, Incorporated;Linux NFS Client Development
adr:535 West William Street, Suite 3100;;Center for Information Technology Integration;Ann Arbor;MI;48103-4943;USA
email;internet:cel@citi.umich.edu
title:Member of Technical Staff
tel;work:+1 734 763 4415
tel;fax:+1 734 763 4434
tel;home:+1 734 668 1089
x-mozilla-html:FALSE
url:http://www.monkey.org/~cel/
version:2.1
end:vcard


  parent reply	other threads:[~2005-09-14 15:36 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-09-12 14:55 [PATCH 00/22] cache cursors: an introduction Chuck Lever
2005-09-12 14:55 ` [PATCH 01/22] introduce facility to walk through the active cache Chuck Lever
2005-09-12 14:55 ` [PATCH 02/22] use cache iterator in checkout-index.c Chuck Lever
2005-09-12 14:55 ` [PATCH 03/22] teach diff.c about cache iterators Chuck Lever
2005-09-12 14:55 ` [PATCH 04/22] teach diff-index.c " Chuck Lever
2005-09-12 14:55 ` [PATCH 05/22] teach diff-files.c " Chuck Lever
2005-09-12 14:55 ` [PATCH 06/22] teach diff-stages.c " Chuck Lever
2005-09-12 14:55 ` [PATCH 07/22] teach fsck-objects.c to use " Chuck Lever
2005-09-12 14:56 ` [PATCH 08/22] teach ls-files.c " Chuck Lever
2005-09-12 14:56 ` [PATCH 09/22] teach read-tree.c " Chuck Lever
2005-09-12 14:56 ` [PATCH 10/22] teach update-index.c about cache cursors Chuck Lever
2005-09-12 14:56 ` [PATCH 11/22] teach write-tree.c to use cache iterators Chuck Lever
2005-09-12 14:56 ` [PATCH 12/22] simplify write_cache() calling sequence Chuck Lever
2005-09-12 14:56 ` [PATCH 13/22] move purge_cache() to read-cache.c Chuck Lever
2005-09-12 14:56 ` [PATCH 14/22] move read_cache_unmerged into read-cache.c Chuck Lever
2005-09-12 14:56 ` [PATCH 15/22] replace cache_name_pos Chuck Lever
2005-09-12 14:56 ` [PATCH 16/22] teach apply.c to use cache_find_name() Chuck Lever
2005-09-12 14:56 ` [PATCH 17/22] teach checkout-index.c " Chuck Lever
2005-09-12 14:56 ` [PATCH 18/22] teach diff.c " Chuck Lever
2005-09-12 14:56 ` [PATCH 19/22] teach ls-files.c " Chuck Lever
2005-09-12 14:56 ` [PATCH 20/22] teach merge-index.c " Chuck Lever
2005-09-12 14:56 ` [PATCH 21/22] teach the merge algorithm about cache iterators Chuck Lever
2005-09-12 20:43   ` Daniel Barkalow
2005-09-13  0:02     ` Chuck Lever
2005-09-14 15:36     ` Chuck Lever [this message]
2005-09-14 16:41       ` Daniel Barkalow
2005-09-14 17:50         ` Junio C Hamano
2005-09-14 19:49         ` Chuck Lever
2005-09-14 20:40           ` Daniel Barkalow
2005-09-14 22:28             ` Chuck Lever
2005-09-14 22:50               ` Linus Torvalds
2005-09-14 23:23                 ` Daniel Barkalow
2005-09-15 14:01                   ` Chuck Lever
2005-09-12 14:56 ` [PATCH 22/22] teach read-cache.c to use cache_find_name() Chuck Lever
2005-09-12 15:38 ` [PATCH 00/22] cache cursors: an introduction A Large Angry SCM
2005-09-12 16:37   ` Chuck Lever
2005-09-12 20:26     ` Daniel Barkalow
2005-09-12 20:47     ` Junio C Hamano
2005-09-13 19:06     ` Tim Ottinger
2005-09-13 19:46       ` Junio C Hamano
2005-09-13 20:06         ` Tim Ottinger
2005-09-14  8:28       ` Catalin Marinas
2005-09-14 14:49         ` Chuck Lever
2005-09-12 19:53 ` Junio C Hamano
2005-09-12 20:22   ` Daniel Barkalow
2005-09-12 20:30     ` Junio C Hamano
2005-09-12 23:59   ` Chuck Lever
2005-09-13  0:14     ` Junio C Hamano
2005-09-13  0:17     ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=43284368.8010004@citi.umich.edu \
    --to=cel@citi.umich.edu \
    --cc=barkalow@iabervon.org \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).