From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from kankan.nagaokaut.ac.jp (kankan.nagaokaut.ac.jp [133.44.2.24]) by blade.nagaokaut.ac.jp (8.12.3/8.12.3/Debian-6.6) with ESMTP id j8MFXJRJ028852 for ; Fri, 23 Sep 2005 00:33:19 +0900 Received: from funfun.nagaokaut.ac.jp (funfun.nagaokaut.ac.jp [133.44.2.201]) by kankan.nagaokaut.ac.jp (Postfix) with ESMTP id 127375987 for ; Fri, 23 Sep 2005 00:33:21 +0900 (JST) Received: from localhost (localhost.nagaokaut.ac.jp [127.0.0.1]) by funfun.nagaokaut.ac.jp (Postfix) with ESMTP id 20E20F04842 for ; Fri, 23 Sep 2005 00:33:25 +0900 (JST) Received: from voscc.nagaokaut.ac.jp (voscc.nagaokaut.ac.jp [133.44.1.100]) by funfun.nagaokaut.ac.jp (Postfix) with ESMTP id E68A2F0484A for ; Fri, 23 Sep 2005 00:33:23 +0900 (JST) Received: from beryllium.ruby-lang.org (beryllium.ruby-lang.org [210.163.138.100]) by voscc.nagaokaut.ac.jp (Postfix) with ESMTP id B9805630024 for ; Fri, 23 Sep 2005 00:33:22 +0900 (JST) Received: from beryllium.ruby-lang.org (beryllium.ruby-lang.org [127.0.0.1]) by beryllium.ruby-lang.org (Postfix) with ESMTP id 38D88339CB; Fri, 23 Sep 2005 00:33:21 +0900 (JST) Received: from localhost (beryllium.ruby-lang.org [127.0.0.1]) by beryllium.ruby-lang.org (Postfix) with ESMTP id 31FBA33A18 for ; Fri, 23 Sep 2005 00:33:20 +0900 (JST) Received: from beryllium.ruby-lang.org ([127.0.0.1]) by localhost (beryllium.ruby-lang.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 32144-02 for ; Fri, 23 Sep 2005 00:33:20 +0900 (JST) Received: from web36110.mail.mud.yahoo.com (web36110.mail.mud.yahoo.com [66.163.179.224]) by beryllium.ruby-lang.org (Postfix) with SMTP id 596D5339CB for ; Fri, 23 Sep 2005 00:33:18 +0900 (JST) Received: (qmail 80009 invoked by uid 60001); 22 Sep 2005 15:33:02 -0000 Received: from [66.68.114.146] by web36110.mail.mud.yahoo.com via HTTP; Thu, 22 Sep 2005 08:33:02 PDT Delivered-To: ruby-core@ruby-lang.org Date: Fri, 23 Sep 2005 00:33:20 +0900 Posted: Thu, 22 Sep 2005 08:33:02 -0700 (PDT) From: Eric Mahurin Reply-To: ruby-core@ruby-lang.org Subject: Re: another array patch - performance boosts all over the place To: ruby-core@ruby-lang.org Message-Id: <20050922153302.80007.qmail@web36110.mail.mud.yahoo.com> In-Reply-To: X-ML-Name: ruby-core X-Mail-Count: 05903 X-MLServer: fml [fml 4.0.3 release (20011202/4.0.3)]; post only (only members can post) X-ML-Info: If you have a question, send e-mail with the body "help" (without quotes) to the address ruby-core-ctl@ruby-lang.org; help= X-Original-To: ruby-core@ruby-lang.org DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=Message-ID:Received:Date:From:Subject:To:In-Reply-To:MIME-Version:Content-Type:Content-Transfer-Encoding; b=S1fik1atOah7xarv8fcb3iTV21sviTpguqNestPSkyIqbR8ytlLvvs/F4eI2ErFW4kpVR/1yp8L/mpT841XbrCAAL49sKv1fUPiihWORG4RI6H7Z0PU5NJsQ3C5Bdl+rMWpx8zfBnvb7ghdNqUoRBcmZDSIE72a6rrj4SGk3PM4= ; X-Virus-Scanned: by amavisd-new-20030616-p10 (Debian) at ruby-lang.org Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit Precedence: bulk List-Id: ruby-core.ruby-lang.org List-Software: fml [fml 4.0.3 release (20011202/4.0.3)] List-Post: List-Owner: List-Help: List-Unsubscribe: X-Virus-Scanned: by AMaViS snapshot-20020531 --- nobuyoshi nakada wrote: > Eric Mahurin wrote in [ruby-core:05861]: > > - instead of shared arrays all pointing to a common frozen > > array object, put shared arrays in a circular linked list > (used > > ary->shared). The original master array will have > > ELTS_SHARED=0 and the others will have ELTS_SHARED=1. > Unshared > > arrays will have ELTS_SHARED=0 and ary->shared=0. > > What kind of merit and necessity, compared with the current > code? It seems to be complicated. The changes regarding element sharing were the largest, but I believe they will be be most beneficial. This was the best solution I came up with. Here are the issues with the old/current way of element sharing that are solved by my new way: 1. After a slice is taken from a large array and then that array needs to be modified, the large array is copied to a new area so that the shared slice is not disturbed. This causes a extra copy (O(n)) of this large array compared to not doing element sharing (coping the small slice to begin with). With my solution, I copy all of the shared slices instead of the master array when the master array needs to be modified. Here is a benchmark that demonstrates the problem (FIFO taking 2 elements in/out at a time): ruby -e 'n=2**16; a=(0...n).to_a; n.times{a.push(*a.shift(2))}; puts(Process.times)' old - O(n**2) time: # new - O(n) time: # 2. If you take a small slice of a large array, modify the large array, and keep the original, the small slice will maintain a link to the original large array and keep all that space. You'll be maintaining about double the memory that you would without element sharing. Because my patch copies the slices instead of the large array, this problem doesn't exist. Here is an example (works in linux using the /proc filesystem to get memory usage): ruby -e 'n=2**13; a=(0...n).to_a; n.times{a.push(a.pop(2))}; IO.readlines("/proc/#{Process.pid}/status").grep(/VmSize/).display; puts(Process.times)' old - O(n**2) memory and time: VmSize: 199864 kB # new - O(n) memory and time: VmSize: 3360 kB # 3. If a slice is taken and held, references to all of the elements of the original array are kept even after the original array is not needed. With the method I used, when the original is GC'd, the slice is copied and all of the other elements of the original array can be GC'd. Here is an example that shows the issue: ruby -e 'n=2**13; a=[]; n.times{|i|a.push([i,(0..i).to_a][0,1])}; IO.readlines("/proc/#{Process.pid}/status").grep(/VmSize/).display; puts(Process.times)' old - O(n**2) memory, O(n**2) time: VmSize: 166644 kB # new - O(n) memory, O(n**2) time: VmSize: 7252 kB # I tried all of the above tests in perl and it has the performance/memory characteristics of my patch. If we are to have element sharing in ruby, I think the delay/memory penalty should be no worse than the same order of magnitude than if we didn't have it. This is definitely not the case currently. This is only one part of my patch - element sharing. The other major part is doing all insertions/deletions toward the closest end of the array (instead of always toward the right end). Looking at the performance of perl's unshift, they don't have this capability. Maybe this is the first time someone has done something like this. Eric __________________________________ Yahoo! Mail - PC Magazine Editors' Choice 2005 http://mail.yahoo.com