git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: kazuki saitoh <ksaitoh560@gmail.com>
To: Pete Wyckoff <pw@padd.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH v2] git-p4: Ask "p4" to interpret View setting
Date: Wed, 14 Aug 2013 09:59:48 +0900	[thread overview]
Message-ID: <CACGba4wbqyHzXDCQxG31EKawfc-D4jpVYqbB4GdmK4hM=Oi4mw@mail.gmail.com> (raw)
In-Reply-To: <20130810201123.GA31706@padd.com>

[-- Attachment #1: Type: text/plain, Size: 6083 bytes --]

> My only concern is in the commit message, about performance.  A
> change that has lots of files in it will cause many roundtrips to
> p4d to do "p4 where" on each.  When the files don't have much
> edited content, this new approach will make the import take twice
> as long, I'll guess.  Do you have a big repository where you
> could test that?

I measured performance of "git p4 clone  --use-client-spec" with a
repository it has 1925 files, 50MB.
  Original:    8.05s user 32.02s system 15% cpu 4:25.34 total
  Apply patch:    9.02s user 53.19s system 14% cpu 6:56.41 total

It is acceptable in my situation, but looks quite slow...

Then I implemented one batch query version
   7.92s user 33.03s system 15% cpu 4:25.59 total

It is same as original

My additional patch is below.
I investigate call graph (attached rough sketch) and
implement batch query in "commit()" and "splitFilesIntoBranches()".
In addition, modified "map_in_client" to just search cache value.

Could you accept?


Subject: [PATCH] git p4: Implement as one batch "p4 where" query to interpret
 view spec

Query for each file is decrese performance.
So I implement query to get client file path as one batch query.
The query must called before use client path (map_in_client() ).

Result of performance measurement, about 40% speed up

Signed-off-by: KazukiSaitoh <ksaitoh560@gmail.com>
---
 git-p4.py | 70 ++++++++++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 51 insertions(+), 19 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index 40522f7..8cbee24 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -1849,37 +1849,46 @@ class View(object):
         if not exclude:
             self.mappings.append(depot_side)

-    def map_in_client(self, depot_path):
-        """Return the relative location in the client where this
-           depot file should live.  Returns "" if the file should
-           not be mapped in the client."""
+    def convert_client_path(self, clientFile):
+        # chop off //client/ part to make it relative
+        if not clientFile.startswith(self.client_prefix):
+            die("No prefix '%s' on clientFile '%s'" %
+                (self.client_prefix, clientFile))
+        return clientFile[len(self.client_prefix):]

-        if depot_path in self.client_spec_path_cache:
-            return self.client_spec_path_cache[depot_path]
+    def update_client_spec_path_cache(self, files):
+        fileArgs = [f for f in files if f not in self.client_spec_path_cache]

-        where_result = p4CmdList(['where', depot_path])
-        if len(where_result) == 0:
-            die("No result from 'p4 where %s'" % depot_path)
-        client_path = ""
+        if len(fileArgs) == 0:
+            return  # All files in cache
+
+        where_result = p4CmdList(["-x", "-", "where"], stdin=fileArgs)
         for res in where_result:
             if "code" in res and res["code"] == "error":
                 # assume error is "... file(s) not in client view"
-                client_path = ""
                 continue
             if "clientFile" not in res:
                 die("No clientFile from 'p4 where %s'" % depot_path)
             if "unmap" in res:
                 # it will list all of them, but only one not unmap-ped
                 continue
-            # chop off //client/ part to make it relative
-            clientFile = res["clientFile"]
-            if not clientFile.startswith(self.client_prefix):
-                die("No prefix '%s' on clientFile '%s'" %
-                    (self.client_prefix, clientFile))
-            client_path = clientFile[len(self.client_prefix):]
+            self.client_spec_path_cache[res['depotFile']] =
self.convert_client_path(res["clientFile"])
+
+        # not found files or unmap files set to ""
+        for depotFile in fileArgs:
+            if depotFile not in self.client_spec_path_cache:
+                self.client_spec_path_cache[depotFile] = ""
+
+    def map_in_client(self, depot_path):
+        """Return the relative location in the client where this
+           depot file should live.  Returns "" if the file should
+           not be mapped in the client."""
+
+        if depot_path in self.client_spec_path_cache:
+            return self.client_spec_path_cache[depot_path]

-        self.client_spec_path_cache[depot_path] = client_path
-        return client_path
+        die( "Error: %s is not found in client spec path" % depot_path )
+        return ""

 class P4Sync(Command, P4UserMap):
     delete_actions = ( "delete", "move/delete", "purge" )
@@ -2006,6 +2015,22 @@ class P4Sync(Command, P4UserMap):
         """Look at each depotFile in the commit to figure out to what
            branch it belongs."""

+        # create file list and get client paths by one batch "p4 where" query
+        if self.clientSpecDirs:
+            fnum = 0
+            file_list  = []
+            while commit.has_key("depotFile%s" % fnum):
+                path =  commit["depotFile%s" % fnum]
+                found = [p for p in self.depotPaths
+                         if p4PathStartsWith(path, p)]
+                if not found:
+                    fnum = fnum + 1
+                    continue
+
+                file_list.append(path)
+                fnum = fnum + 1
+            self.clientSpecDirs.update_client_spec_path_cache(file_list)
+
         branches = {}
         fnum = 0
         while commit.has_key("depotFile%s" % fnum):
@@ -2255,6 +2280,13 @@ class P4Sync(Command, P4UserMap):
             else:
                 sys.stderr.write("Ignoring file outside of prefix:
%s\n" % f['path'])

+        # get client paths by one batch "p4 where" query
+        if self.clientSpecDirs:
+            file_list = []
+            for f in files:
+                file_list.append(f['path'])
+            self.clientSpecDirs.update_client_spec_path_cache(file_list)
+
         self.gitStream.write("commit %s\n" % branch)
 #        gitStream.write("mark :%s\n" % details["change"])
         self.committedChanges.add(int(details["change"]))
-- 
1.8.4-rc2

[-- Attachment #2: rough sketch_of_call_graph.png --]
[-- Type: image/png, Size: 59978 bytes --]

  parent reply	other threads:[~2013-08-14  1:00 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-08-06  6:45 [PATCH v2] git-p4: Ask "p4" to interpret View setting kazuki saitoh
2013-08-10 20:11 ` Pete Wyckoff
2013-08-10 20:15   ` [PATCH 1/2] git p4 test: sanitize P4CHARSET Pete Wyckoff
2013-08-10 20:15   ` [PATCH 2/2] git p4: implement view spec wildcards with "p4 where" Pete Wyckoff
2013-08-14  0:59   ` kazuki saitoh [this message]
2013-08-16  1:24     ` [PATCH v2] git-p4: Ask "p4" to interpret View setting Pete Wyckoff
2013-08-25  2:29       ` Pete Wyckoff
2013-08-27  2:43         ` kazuki saitoh
2013-08-29 22:40           ` Pete Wyckoff

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CACGba4wbqyHzXDCQxG31EKawfc-D4jpVYqbB4GdmK4hM=Oi4mw@mail.gmail.com' \
    --to=ksaitoh560@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=pw@padd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).