git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v2] git-p4: Ask "p4" to interpret View setting
@ 2013-08-06  6:45 kazuki saitoh
  2013-08-10 20:11 ` Pete Wyckoff
  0 siblings, 1 reply; 9+ messages in thread
From: kazuki saitoh @ 2013-08-06  6:45 UTC (permalink / raw)
  To: Pete Wyckoff, git

In Perforce, View setting of p4 client can describe
  -//depot/project/files/*.xls //client/project/files/*.xls
to exclude Excel files.
But "git p4 --use-client-spec" cannot support '*'.

In git-p4.py, "map_in_client" method analyzes View setting and return
client file path.
So I modify the method to just ask p4.


> Let me play with this for a bit.  I wonder about the performance
> aspects of doing a "p4 fstat" for every file.  Would it be
> possible to do one or a few batch queries higher up somewhere?
To reduce p4 access, it cache result of asking "client path".
And addition, "fstat" depends on sync status, so modify to use "p4
where" instead of "fstat".



Signed-off-by: KazukiSaitoh <ksaitoh560@gmail.com>
---
 git-p4.py                     | 53 ++++++-----------------------------
 t/lib-git-p4.sh               |  1 +
 t/t9807-git-p4-submit.sh      |  2 +-
 t/t9809-git-p4-client-view.sh | 65 +++++++++++++++++++++++++++++++++++--------
 4 files changed, 64 insertions(+), 57 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index 31e71ff..8ec8eb4 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -1819,15 +1819,6 @@ class View(object):
                variables."""

             self.ends_triple_dot = False
-            # There are three wildcards allowed in p4 views
-            # (see "p4 help views").  This code knows how to
-            # handle "..." (only at the end), but cannot deal with
-            # "%%n" or "*".  Only check the depot_side, as p4 should
-            # validate that the client_side matches too.
-            if re.search(r'%%[1-9]', self.path):
-                die("Can't handle %%n wildcards in view: %s" % self.path)
-            if self.path.find("*") >= 0:
-                die("Can't handle * wildcards in view: %s" % self.path)
             triple_dot_index = self.path.find("...")
             if triple_dot_index >= 0:
                 if triple_dot_index != len(self.path) - 3:
@@ -1903,6 +1894,7 @@ class View(object):
     #
     def __init__(self):
         self.mappings = []
+        self.client_spec_path_cache = {}  # Caching result of p4
query, use for "--use-client-spec".

     def append(self, view_line):
         """Parse a view line, splitting it into depot and client
@@ -1965,44 +1957,17 @@ class View(object):
            depot file should live.  Returns "" if the file should
            not be mapped in the client."""

-        paths_filled = []
-        client_path = ""
-
-        # look at later entries first
-        for m in self.mappings[::-1]:
-
-            # see where will this path end up in the client
-            p = m.map_depot_to_client(depot_path)
-
-            if p == "":
-                # Depot path does not belong in client.  Must remember
-                # this, as previous items should not cause files to
-                # exist in this path either.  Remember that the list is
-                # being walked from the end, which has higher precedence.
-                # Overlap mappings do not exclude previous mappings.
-                if not m.overlay:
-                    paths_filled.append(m.client_side)
+        if self.client_spec_path_cache.has_key(depot_path):
+            return self.client_spec_path_cache[depot_path]

-            else:
-                # This mapping matched; no need to search any further.
-                # But, the mapping could be rejected if the client path
-                # has already been claimed by an earlier mapping (i.e.
-                # one later in the list, which we are walking backwards).
-                already_mapped_in_client = False
-                for f in paths_filled:
-                    # this is View.Path.match
-                    if f.match(p):
-                        already_mapped_in_client = True
-                        break
-                if not already_mapped_in_client:
-                    # Include this file, unless it is from a line that
-                    # explicitly said to exclude it.
-                    if not m.exclude:
-                        client_path = p
-
-                # a match, even if rejected, always stops the search
+        client_path = ""
+        where_result = p4CmdList(['where', depot_path])
+        for res in where_result:
+            if res["code"] != "error" and not res.has_key("unmap"):
+                client_path = res["path"].replace(getClientRoot()+"/", "")
                 break

+        self.client_spec_path_cache[depot_path] = client_path
         return client_path

 class P4Sync(Command, P4UserMap):
diff --git a/t/lib-git-p4.sh b/t/lib-git-p4.sh
index 2098b9b..0d631dc 100644
--- a/t/lib-git-p4.sh
+++ b/t/lib-git-p4.sh
@@ -48,6 +48,7 @@ P4DPORT=$((10669 + ($testid - $git_p4_test_start)))
 P4PORT=localhost:$P4DPORT
 P4CLIENT=client
 P4EDITOR=:
+P4CHARSET=""
 export P4PORT P4CLIENT P4EDITOR

 db="$TRASH_DIRECTORY/db"
diff --git a/t/t9807-git-p4-submit.sh b/t/t9807-git-p4-submit.sh
index 1fb7bc7..4caf36e 100755
--- a/t/t9807-git-p4-submit.sh
+++ b/t/t9807-git-p4-submit.sh
@@ -17,7 +17,7 @@ test_expect_success 'init depot' '
  )
 '

-test_expect_failure 'is_cli_file_writeable function' '
+test_expect_success 'is_cli_file_writeable function' '
  (
  cd "$cli" &&
  echo a >a &&
diff --git a/t/t9809-git-p4-client-view.sh b/t/t9809-git-p4-client-view.sh
index 77f6349..160fd9a 100755
--- a/t/t9809-git-p4-client-view.sh
+++ b/t/t9809-git-p4-client-view.sh
@@ -75,18 +75,6 @@ test_expect_success 'init depot' '
  )
 '

-# double % for printf
-test_expect_success 'unsupported view wildcard %%n' '
- client_view "//depot/%%%%1/sub/... //client/sub/%%%%1/..." &&
- test_when_finished cleanup_git &&
- test_must_fail git p4 clone --use-client-spec --dest="$git" //depot
-'
-
-test_expect_success 'unsupported view wildcard *' '
- client_view "//depot/*/bar/... //client/*/bar/..." &&
- test_when_finished cleanup_git &&
- test_must_fail git p4 clone --use-client-spec --dest="$git" //depot
-'

 test_expect_success 'wildcard ... only supported at end of spec 1' '
  client_view "//depot/.../file11 //client/.../file11" &&
@@ -836,6 +824,59 @@ test_expect_success 'quotes on both sides' '
  git_verify "cdir 1/file11" "cdir 1/file12"
 '

+#
+# //depot
+#   - dir1
+#     - file11
+#     - file12
+#     - noneed_file11.junk
+#     - noneed_file12.junk
+#   - dir2
+#     - file21
+#     - file22
+#     - noneed_file21.junk
+#     - noneed_file22.junk
+#
+test_expect_success 'wildcard * setup' '
+ client_view "//depot/... //client/..." &&
+ (
+ p4 sync &&
+ cd "$cli" &&
+ rm files &&
+ p4 delete "//depot/..." &&
+ p4 submit -d "delete all files" &&
+ init_depot &&
+
+ cd "$cli" &&
+ p4 sync &&
+
+ echo dir1/noneed_file11.junk >dir1/noneed_file11.junk &&
+ p4 add dir1/noneed_file11.junk &&
+ p4 submit -d "dir1/noneed_file11.junk" &&
+
+ echo dir1/noneed_file12.junk >dir1/noneed_file12.junk &&
+ p4 add dir1/noneed_file12.junk &&
+ p4 submit -d "dir1/noneed_file12.junk" &&
+
+ echo dir2/noneed_file21.junk >dir2/noneed_file21.junk &&
+ p4 add dir2/noneed_file21.junk &&
+ p4 submit -d "dir2/noneed_file21.junk" &&
+
+ echo dir2/noneed_file22.junk >dir2/noneed_file22.junk &&
+ p4 add dir2/noneed_file22.junk &&
+ p4 submit -d "dir2/noneed_file22.junk"
+ )
+'
+test_expect_success 'view wildcard *' '
+ client_view "//depot/... //client/..." \
+ "-//depot/dir1/*.junk //client/dir1/*.junk" \
+ "-//depot/dir2/*.junk //client/dir2/*.junk" &&
+ files="dir1/file11 dir1/file12 dir2/file21 dir2/file22" &&
+ client_verify $files &&
+ git p4 clone --use-client-spec --dest="$git" //depot &&
+ git_verify $files
+'
+
 test_expect_success 'kill p4d' '
  kill_p4d
 '
-- 
1.8.4-rc1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] git-p4: Ask "p4" to interpret View setting
  2013-08-06  6:45 [PATCH v2] git-p4: Ask "p4" to interpret View setting kazuki saitoh
@ 2013-08-10 20:11 ` Pete Wyckoff
  2013-08-10 20:15   ` [PATCH 1/2] git p4 test: sanitize P4CHARSET Pete Wyckoff
                     ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Pete Wyckoff @ 2013-08-10 20:11 UTC (permalink / raw)
  To: kazuki saitoh; +Cc: git

ksaitoh560@gmail.com wrote on Tue, 06 Aug 2013 15:45 +0900:
> In Perforce, View setting of p4 client can describe
>   -//depot/project/files/*.xls //client/project/files/*.xls
> to exclude Excel files.
> But "git p4 --use-client-spec" cannot support '*'.
> 
> In git-p4.py, "map_in_client" method analyzes View setting and return
> client file path.
> So I modify the method to just ask p4.
> 
> 
> > Let me play with this for a bit.  I wonder about the performance
> > aspects of doing a "p4 fstat" for every file.  Would it be
> > possible to do one or a few batch queries higher up somewhere?
> To reduce p4 access, it cache result of asking "client path".
> And addition, "fstat" depends on sync status, so modify to use "p4
> where" instead of "fstat".

I played around with your patch a bit, ending up with this
teensy series.

I redid the code to use clientFile, not path, as that
will work better with AltRoots.  Also I simplified your
test and added a couple more for the now-supported wildcards.
And deleted a bunch of newly dead code.

My only concern is in the commit message, about performance.  A
change that has lots of files in it will cause many roundtrips to
p4d to do "p4 where" on each.  When the files don't have much
edited content, this new approach will make the import take twice
as long, I'll guess.  Do you have a big repository where you
could test that?

Tell me what you think.

		-- Pete

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/2] git p4 test: sanitize P4CHARSET
  2013-08-10 20:11 ` Pete Wyckoff
@ 2013-08-10 20:15   ` Pete Wyckoff
  2013-08-10 20:15   ` [PATCH 2/2] git p4: implement view spec wildcards with "p4 where" Pete Wyckoff
  2013-08-14  0:59   ` [PATCH v2] git-p4: Ask "p4" to interpret View setting kazuki saitoh
  2 siblings, 0 replies; 9+ messages in thread
From: Pete Wyckoff @ 2013-08-10 20:15 UTC (permalink / raw)
  To: kazuki saitoh; +Cc: git

From: kazuki saitoh <ksaitoh560@gmail.com>

In the tests, p4d is started without using "internationalized
mode".  Make sure this environment variable is unset, otherwise
a mis-matched user setting would break the tests.  The error
message would be "Unicode clients require a unicode enabled server."

[pw: use unset, add commit text]

Signed-off-by: Kazuki Saitoh <ksaitoh560@gmail.com>
Signed-off-by: Pete Wyckoff <pw@padd.com>
---
 t/lib-git-p4.sh | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/t/lib-git-p4.sh b/t/lib-git-p4.sh
index 2098b9b..ccd918e 100644
--- a/t/lib-git-p4.sh
+++ b/t/lib-git-p4.sh
@@ -48,7 +48,8 @@ P4DPORT=$((10669 + ($testid - $git_p4_test_start)))
 P4PORT=localhost:$P4DPORT
 P4CLIENT=client
 P4EDITOR=:
-export P4PORT P4CLIENT P4EDITOR
+unset P4CHARSET
+export P4PORT P4CLIENT P4EDITOR P4CHARSET
 
 db="$TRASH_DIRECTORY/db"
 cli="$TRASH_DIRECTORY/cli"
-- 
1.8.4.rc2.88.ga5463da

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/2] git p4: implement view spec wildcards with "p4 where"
  2013-08-10 20:11 ` Pete Wyckoff
  2013-08-10 20:15   ` [PATCH 1/2] git p4 test: sanitize P4CHARSET Pete Wyckoff
@ 2013-08-10 20:15   ` Pete Wyckoff
  2013-08-14  0:59   ` [PATCH v2] git-p4: Ask "p4" to interpret View setting kazuki saitoh
  2 siblings, 0 replies; 9+ messages in thread
From: Pete Wyckoff @ 2013-08-10 20:15 UTC (permalink / raw)
  To: kazuki saitoh; +Cc: git

From: kazuki saitoh <ksaitoh560@gmail.com>

Currently, git p4 does not support many of the view
wildcards, such as * and %%n.  It only knows the
common ... mapping, and exclusions.

Redo the entire wildcard code around the idea of
directly querying the p4 server for the mapping.  For each
unique file, invoke "p4 where //depot/path" and use
the client mapping to decide where the file goes in git.

This simplifies a lot of code, and adds support for all
wildcards supported by p4.  The downside is that each file
triggers a query to the p4 server, possibly making high file
count changes very slow.  If it turns out to be a problem,
we might consider resurrecting the old wildcard code just
for use on the "easy" cases it understands.

[pw: redo code and tests]

Signed-off-by: Kazuki Saitoh <ksaitoh560@gmail.com>
Signed-off-by: Pete Wyckoff <pw@padd.com>
---
 git-p4.py                     | 204 +++++++++---------------------------------
 t/t9809-git-p4-client-view.sh |  88 ++++++++++++------
 2 files changed, 103 insertions(+), 189 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index 31e71ff..40522f7 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -780,11 +780,14 @@ def getClientSpec():
     # dictionary of all client parameters
     entry = specList[0]
 
+    # the //client/ name
+    client_name = entry["Client"]
+
     # just the keys that start with "View"
     view_keys = [ k for k in entry.keys() if k.startswith("View") ]
 
     # hold this new View
-    view = View()
+    view = View(client_name)
 
     # append the lines, in order, to the view
     for view_num in range(len(view_keys)):
@@ -1555,8 +1558,8 @@ class P4Submit(Command, P4UserMap):
             for b in body:
                 labelTemplate += "\t" + b + "\n"
             labelTemplate += "View:\n"
-            for mapping in clientSpec.mappings:
-                labelTemplate += "\t%s\n" % mapping.depot_side.path
+            for depot_side in clientSpec.mappings:
+                labelTemplate += "\t%s\n" % depot_side
 
             if self.dry_run:
                 print "Would create p4 label %s for tag" % name
@@ -1568,7 +1571,7 @@ class P4Submit(Command, P4UserMap):
 
                 # Use the label
                 p4_system(["tag", "-l", name] +
-                          ["%s@%s" % (mapping.depot_side.path, changelist) for mapping in clientSpec.mappings])
+                          ["%s@%s" % (depot_side, changelist) for depot_side in clientSpec.mappings])
 
                 if verbose:
                     print "created p4 label for tag %s" % name
@@ -1796,117 +1799,16 @@ class View(object):
     """Represent a p4 view ("p4 help views"), and map files in a
        repo according to the view."""
 
-    class Path(object):
-        """A depot or client path, possibly containing wildcards.
-           The only one supported is ... at the end, currently.
-           Initialize with the full path, with //depot or //client."""
-
-        def __init__(self, path, is_depot):
-            self.path = path
-            self.is_depot = is_depot
-            self.find_wildcards()
-            # remember the prefix bit, useful for relative mappings
-            m = re.match("(//[^/]+/)", self.path)
-            if not m:
-                die("Path %s does not start with //prefix/" % self.path)
-            prefix = m.group(1)
-            if not self.is_depot:
-                # strip //client/ on client paths
-                self.path = self.path[len(prefix):]
-
-        def find_wildcards(self):
-            """Make sure wildcards are valid, and set up internal
-               variables."""
-
-            self.ends_triple_dot = False
-            # There are three wildcards allowed in p4 views
-            # (see "p4 help views").  This code knows how to
-            # handle "..." (only at the end), but cannot deal with
-            # "%%n" or "*".  Only check the depot_side, as p4 should
-            # validate that the client_side matches too.
-            if re.search(r'%%[1-9]', self.path):
-                die("Can't handle %%n wildcards in view: %s" % self.path)
-            if self.path.find("*") >= 0:
-                die("Can't handle * wildcards in view: %s" % self.path)
-            triple_dot_index = self.path.find("...")
-            if triple_dot_index >= 0:
-                if triple_dot_index != len(self.path) - 3:
-                    die("Can handle only single ... wildcard, at end: %s" %
-                        self.path)
-                self.ends_triple_dot = True
-
-        def ensure_compatible(self, other_path):
-            """Make sure the wildcards agree."""
-            if self.ends_triple_dot != other_path.ends_triple_dot:
-                 die("Both paths must end with ... if either does;\n" +
-                     "paths: %s %s" % (self.path, other_path.path))
-
-        def match_wildcards(self, test_path):
-            """See if this test_path matches us, and fill in the value
-               of the wildcards if so.  Returns a tuple of
-               (True|False, wildcards[]).  For now, only the ... at end
-               is supported, so at most one wildcard."""
-            if self.ends_triple_dot:
-                dotless = self.path[:-3]
-                if test_path.startswith(dotless):
-                    wildcard = test_path[len(dotless):]
-                    return (True, [ wildcard ])
-            else:
-                if test_path == self.path:
-                    return (True, [])
-            return (False, [])
-
-        def match(self, test_path):
-            """Just return if it matches; don't bother with the wildcards."""
-            b, _ = self.match_wildcards(test_path)
-            return b
-
-        def fill_in_wildcards(self, wildcards):
-            """Return the relative path, with the wildcards filled in
-               if there are any."""
-            if self.ends_triple_dot:
-                return self.path[:-3] + wildcards[0]
-            else:
-                return self.path
-
-    class Mapping(object):
-        def __init__(self, depot_side, client_side, overlay, exclude):
-            # depot_side is without the trailing /... if it had one
-            self.depot_side = View.Path(depot_side, is_depot=True)
-            self.client_side = View.Path(client_side, is_depot=False)
-            self.overlay = overlay  # started with "+"
-            self.exclude = exclude  # started with "-"
-            assert not (self.overlay and self.exclude)
-            self.depot_side.ensure_compatible(self.client_side)
-
-        def __str__(self):
-            c = " "
-            if self.overlay:
-                c = "+"
-            if self.exclude:
-                c = "-"
-            return "View.Mapping: %s%s -> %s" % \
-                   (c, self.depot_side.path, self.client_side.path)
-
-        def map_depot_to_client(self, depot_path):
-            """Calculate the client path if using this mapping on the
-               given depot path; does not consider the effect of other
-               mappings in a view.  Even excluded mappings are returned."""
-            matches, wildcards = self.depot_side.match_wildcards(depot_path)
-            if not matches:
-                return ""
-            client_path = self.client_side.fill_in_wildcards(wildcards)
-            return client_path
-
-    #
-    # View methods
-    #
-    def __init__(self):
+    def __init__(self, client_name):
         self.mappings = []
+        self.client_prefix = "//%s/" % client_name
+        # cache results of "p4 where" to lookup client file locations
+        self.client_spec_path_cache = {}
 
     def append(self, view_line):
         """Parse a view line, splitting it into depot and client
-           sides.  Append to self.mappings, preserving order."""
+           sides.  Append to self.mappings, preserving order.  This
+           is only needed for tag creation."""
 
         # Split the view line into exactly two words.  P4 enforces
         # structure on these lines that simplifies this quite a bit.
@@ -1934,75 +1836,49 @@ class View(object):
             depot_side = view_line[0:space_index]
             rhs_index = space_index + 1
 
-        if view_line[rhs_index] == '"':
-            # Second word is double quoted.  Make sure there is a
-            # double quote at the end too.
-            if not view_line.endswith('"'):
-                die("View line with rhs quote should end with one: %s" %
-                    view_line)
-            # skip the quotes
-            client_side = view_line[rhs_index+1:-1]
-        else:
-            client_side = view_line[rhs_index:]
-
         # prefix + means overlay on previous mapping
-        overlay = False
         if depot_side.startswith("+"):
-            overlay = True
             depot_side = depot_side[1:]
 
-        # prefix - means exclude this path
+        # prefix - means exclude this path, leave out of mappings
         exclude = False
         if depot_side.startswith("-"):
             exclude = True
             depot_side = depot_side[1:]
 
-        m = View.Mapping(depot_side, client_side, overlay, exclude)
-        self.mappings.append(m)
+        if not exclude:
+            self.mappings.append(depot_side)
 
     def map_in_client(self, depot_path):
         """Return the relative location in the client where this
            depot file should live.  Returns "" if the file should
            not be mapped in the client."""
 
-        paths_filled = []
-        client_path = ""
-
-        # look at later entries first
-        for m in self.mappings[::-1]:
-
-            # see where will this path end up in the client
-            p = m.map_depot_to_client(depot_path)
-
-            if p == "":
-                # Depot path does not belong in client.  Must remember
-                # this, as previous items should not cause files to
-                # exist in this path either.  Remember that the list is
-                # being walked from the end, which has higher precedence.
-                # Overlap mappings do not exclude previous mappings.
-                if not m.overlay:
-                    paths_filled.append(m.client_side)
-
-            else:
-                # This mapping matched; no need to search any further.
-                # But, the mapping could be rejected if the client path
-                # has already been claimed by an earlier mapping (i.e.
-                # one later in the list, which we are walking backwards).
-                already_mapped_in_client = False
-                for f in paths_filled:
-                    # this is View.Path.match
-                    if f.match(p):
-                        already_mapped_in_client = True
-                        break
-                if not already_mapped_in_client:
-                    # Include this file, unless it is from a line that
-                    # explicitly said to exclude it.
-                    if not m.exclude:
-                        client_path = p
-
-                # a match, even if rejected, always stops the search
-                break
+        if depot_path in self.client_spec_path_cache:
+            return self.client_spec_path_cache[depot_path]
 
+        where_result = p4CmdList(['where', depot_path])
+        if len(where_result) == 0:
+            die("No result from 'p4 where %s'" % depot_path)
+        client_path = ""
+        for res in where_result:
+            if "code" in res and res["code"] == "error":
+                # assume error is "... file(s) not in client view"
+                client_path = ""
+                continue
+            if "clientFile" not in res:
+                die("No clientFile from 'p4 where %s'" % depot_path)
+            if "unmap" in res:
+                # it will list all of them, but only one not unmap-ped
+                continue
+            # chop off //client/ part to make it relative
+            clientFile = res["clientFile"]
+            if not clientFile.startswith(self.client_prefix):
+                die("No prefix '%s' on clientFile '%s'" %
+                    (self.client_prefix, clientFile))
+            client_path = clientFile[len(self.client_prefix):]
+
+        self.client_spec_path_cache[depot_path] = client_path
         return client_path
 
 class P4Sync(Command, P4UserMap):
diff --git a/t/t9809-git-p4-client-view.sh b/t/t9809-git-p4-client-view.sh
index 77f6349..2ebc6e0 100755
--- a/t/t9809-git-p4-client-view.sh
+++ b/t/t9809-git-p4-client-view.sh
@@ -75,31 +75,6 @@ test_expect_success 'init depot' '
 	)
 '
 
-# double % for printf
-test_expect_success 'unsupported view wildcard %%n' '
-	client_view "//depot/%%%%1/sub/... //client/sub/%%%%1/..." &&
-	test_when_finished cleanup_git &&
-	test_must_fail git p4 clone --use-client-spec --dest="$git" //depot
-'
-
-test_expect_success 'unsupported view wildcard *' '
-	client_view "//depot/*/bar/... //client/*/bar/..." &&
-	test_when_finished cleanup_git &&
-	test_must_fail git p4 clone --use-client-spec --dest="$git" //depot
-'
-
-test_expect_success 'wildcard ... only supported at end of spec 1' '
-	client_view "//depot/.../file11 //client/.../file11" &&
-	test_when_finished cleanup_git &&
-	test_must_fail git p4 clone --use-client-spec --dest="$git" //depot
-'
-
-test_expect_success 'wildcard ... only supported at end of spec 2' '
-	client_view "//depot/.../a/... //client/.../a/..." &&
-	test_when_finished cleanup_git &&
-	test_must_fail git p4 clone --use-client-spec --dest="$git" //depot
-'
-
 test_expect_success 'basic map' '
 	client_view "//depot/dir1/... //client/cli1/..." &&
 	files="cli1/file11 cli1/file12" &&
@@ -793,6 +768,69 @@ test_expect_success 'overlay sync swap: cleanup' '
 	)
 '
 
+# add a .junk file, make sure it isn't mapped
+test_expect_success 'wildcard * setup' '
+	client_view "//depot/... //client/..." &&
+	(
+		cd "$cli" &&
+		p4 sync &&
+		echo dir1/file13.junk >dir1/file13.junk &&
+		p4 add dir1/file13.junk &&
+		p4 submit -d dir1/file13.junk
+	)
+'
+
+test_expect_success 'wildcard * base case' '
+	client_view "//depot/... //client/..." &&
+	files="dir1/file11 dir1/file12 dir1/file13.junk
+	       dir2/file21 dir2/file22" &&
+	client_verify $files &&
+	test_when_finished cleanup_git &&
+	git p4 clone --use-client-spec --dest="$git" //depot &&
+	git_verify $files
+'
+
+test_expect_success 'wildcard * excludes junk file' '
+	client_view "//depot/... //client/..." \
+		    "-//depot/dir1/*.junk //client/dir1/*.junk" &&
+	files="dir1/file11 dir1/file12
+	       dir2/file21 dir2/file22" &&
+	client_verify $files &&
+	test_when_finished cleanup_git &&
+	git p4 clone --use-client-spec --dest="$git" //depot &&
+	git_verify $files
+'
+
+test_expect_success 'wildcard * cleanup' '
+	client_view "//depot/... //client/..." &&
+	(
+		cd "$cli" &&
+		p4 sync &&
+		p4 delete dir1/file13.junk &&
+		p4 submit -d "remove dir1/file13.junk"
+	)
+'
+
+test_expect_success 'wildcard ... in middle' '
+	client_view "//depot/.../file11 //client/.../file11" &&
+	files="dir1/file11" &&
+	client_verify $files &&
+	test_when_finished cleanup_git &&
+	git p4 clone --use-client-spec --dest="$git" //depot &&
+	git_verify $files
+'
+
+test_expect_success 'wildcard %% mapping' '
+	client_view "//depot/%%1/... //client/map-%%1/..." &&
+	files="map-dir1/file11 map-dir1/file12
+	       map-dir2/file21 map-dir2/file22" &&
+	client_verify $files &&
+	test_when_finished cleanup_git &&
+	git p4 clone --use-client-spec --dest="$git" //depot &&
+	git_verify $files
+'
+
+
 #
 # Rename directories to test quoting in depot-side mappings
 # //depot
-- 
1.8.4.rc2.88.ga5463da

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] git-p4: Ask "p4" to interpret View setting
  2013-08-10 20:11 ` Pete Wyckoff
  2013-08-10 20:15   ` [PATCH 1/2] git p4 test: sanitize P4CHARSET Pete Wyckoff
  2013-08-10 20:15   ` [PATCH 2/2] git p4: implement view spec wildcards with "p4 where" Pete Wyckoff
@ 2013-08-14  0:59   ` kazuki saitoh
  2013-08-16  1:24     ` Pete Wyckoff
  2 siblings, 1 reply; 9+ messages in thread
From: kazuki saitoh @ 2013-08-14  0:59 UTC (permalink / raw)
  To: Pete Wyckoff; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 6083 bytes --]

> My only concern is in the commit message, about performance.  A
> change that has lots of files in it will cause many roundtrips to
> p4d to do "p4 where" on each.  When the files don't have much
> edited content, this new approach will make the import take twice
> as long, I'll guess.  Do you have a big repository where you
> could test that?

I measured performance of "git p4 clone  --use-client-spec" with a
repository it has 1925 files, 50MB.
  Original:    8.05s user 32.02s system 15% cpu 4:25.34 total
  Apply patch:    9.02s user 53.19s system 14% cpu 6:56.41 total

It is acceptable in my situation, but looks quite slow...

Then I implemented one batch query version
   7.92s user 33.03s system 15% cpu 4:25.59 total

It is same as original

My additional patch is below.
I investigate call graph (attached rough sketch) and
implement batch query in "commit()" and "splitFilesIntoBranches()".
In addition, modified "map_in_client" to just search cache value.

Could you accept?


Subject: [PATCH] git p4: Implement as one batch "p4 where" query to interpret
 view spec

Query for each file is decrese performance.
So I implement query to get client file path as one batch query.
The query must called before use client path (map_in_client() ).

Result of performance measurement, about 40% speed up

Signed-off-by: KazukiSaitoh <ksaitoh560@gmail.com>
---
 git-p4.py | 70 ++++++++++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 51 insertions(+), 19 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index 40522f7..8cbee24 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -1849,37 +1849,46 @@ class View(object):
         if not exclude:
             self.mappings.append(depot_side)

-    def map_in_client(self, depot_path):
-        """Return the relative location in the client where this
-           depot file should live.  Returns "" if the file should
-           not be mapped in the client."""
+    def convert_client_path(self, clientFile):
+        # chop off //client/ part to make it relative
+        if not clientFile.startswith(self.client_prefix):
+            die("No prefix '%s' on clientFile '%s'" %
+                (self.client_prefix, clientFile))
+        return clientFile[len(self.client_prefix):]

-        if depot_path in self.client_spec_path_cache:
-            return self.client_spec_path_cache[depot_path]
+    def update_client_spec_path_cache(self, files):
+        fileArgs = [f for f in files if f not in self.client_spec_path_cache]

-        where_result = p4CmdList(['where', depot_path])
-        if len(where_result) == 0:
-            die("No result from 'p4 where %s'" % depot_path)
-        client_path = ""
+        if len(fileArgs) == 0:
+            return  # All files in cache
+
+        where_result = p4CmdList(["-x", "-", "where"], stdin=fileArgs)
         for res in where_result:
             if "code" in res and res["code"] == "error":
                 # assume error is "... file(s) not in client view"
-                client_path = ""
                 continue
             if "clientFile" not in res:
                 die("No clientFile from 'p4 where %s'" % depot_path)
             if "unmap" in res:
                 # it will list all of them, but only one not unmap-ped
                 continue
-            # chop off //client/ part to make it relative
-            clientFile = res["clientFile"]
-            if not clientFile.startswith(self.client_prefix):
-                die("No prefix '%s' on clientFile '%s'" %
-                    (self.client_prefix, clientFile))
-            client_path = clientFile[len(self.client_prefix):]
+            self.client_spec_path_cache[res['depotFile']] =
self.convert_client_path(res["clientFile"])
+
+        # not found files or unmap files set to ""
+        for depotFile in fileArgs:
+            if depotFile not in self.client_spec_path_cache:
+                self.client_spec_path_cache[depotFile] = ""
+
+    def map_in_client(self, depot_path):
+        """Return the relative location in the client where this
+           depot file should live.  Returns "" if the file should
+           not be mapped in the client."""
+
+        if depot_path in self.client_spec_path_cache:
+            return self.client_spec_path_cache[depot_path]

-        self.client_spec_path_cache[depot_path] = client_path
-        return client_path
+        die( "Error: %s is not found in client spec path" % depot_path )
+        return ""

 class P4Sync(Command, P4UserMap):
     delete_actions = ( "delete", "move/delete", "purge" )
@@ -2006,6 +2015,22 @@ class P4Sync(Command, P4UserMap):
         """Look at each depotFile in the commit to figure out to what
            branch it belongs."""

+        # create file list and get client paths by one batch "p4 where" query
+        if self.clientSpecDirs:
+            fnum = 0
+            file_list  = []
+            while commit.has_key("depotFile%s" % fnum):
+                path =  commit["depotFile%s" % fnum]
+                found = [p for p in self.depotPaths
+                         if p4PathStartsWith(path, p)]
+                if not found:
+                    fnum = fnum + 1
+                    continue
+
+                file_list.append(path)
+                fnum = fnum + 1
+            self.clientSpecDirs.update_client_spec_path_cache(file_list)
+
         branches = {}
         fnum = 0
         while commit.has_key("depotFile%s" % fnum):
@@ -2255,6 +2280,13 @@ class P4Sync(Command, P4UserMap):
             else:
                 sys.stderr.write("Ignoring file outside of prefix:
%s\n" % f['path'])

+        # get client paths by one batch "p4 where" query
+        if self.clientSpecDirs:
+            file_list = []
+            for f in files:
+                file_list.append(f['path'])
+            self.clientSpecDirs.update_client_spec_path_cache(file_list)
+
         self.gitStream.write("commit %s\n" % branch)
 #        gitStream.write("mark :%s\n" % details["change"])
         self.committedChanges.add(int(details["change"]))
-- 
1.8.4-rc2

[-- Attachment #2: rough sketch_of_call_graph.png --]
[-- Type: image/png, Size: 59978 bytes --]

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] git-p4: Ask "p4" to interpret View setting
  2013-08-14  0:59   ` [PATCH v2] git-p4: Ask "p4" to interpret View setting kazuki saitoh
@ 2013-08-16  1:24     ` Pete Wyckoff
  2013-08-25  2:29       ` Pete Wyckoff
  0 siblings, 1 reply; 9+ messages in thread
From: Pete Wyckoff @ 2013-08-16  1:24 UTC (permalink / raw)
  To: kazuki saitoh; +Cc: git

ksaitoh560@gmail.com wrote on Wed, 14 Aug 2013 09:59 +0900:
> > My only concern is in the commit message, about performance.  A
> > change that has lots of files in it will cause many roundtrips to
> > p4d to do "p4 where" on each.  When the files don't have much
> > edited content, this new approach will make the import take twice
> > as long, I'll guess.  Do you have a big repository where you
> > could test that?
> 
> I measured performance of "git p4 clone  --use-client-spec" with a
> repository it has 1925 files, 50MB.
>   Original:    8.05s user 32.02s system 15% cpu 4:25.34 total
>   Apply patch:    9.02s user 53.19s system 14% cpu 6:56.41 total
> 
> It is acceptable in my situation, but looks quite slow...
> 
> Then I implemented one batch query version
>    7.92s user 33.03s system 15% cpu 4:25.59 total
> 
> It is same as original
> 
> My additional patch is below.
> I investigate call graph (attached rough sketch) and
> implement batch query in "commit()" and "splitFilesIntoBranches()".
> In addition, modified "map_in_client" to just search cache value.
> 
> Could you accept?

This looks good.  I've started my own performance testing
on a few-hundred-thousand file repo to confirm your findings.

If it seems to work out, we can clean up the patch.  Otherwise
maybe need to think about having both implementations and use
the by-hand one for "...".  I don't like that approach.  Let's
hope it's not needed.

		-- Pete

> Subject: [PATCH] git p4: Implement as one batch "p4 where" query to interpret
>  view spec
> 
> Query for each file is decrese performance.
> So I implement query to get client file path as one batch query.
> The query must called before use client path (map_in_client() ).
> 
> Result of performance measurement, about 40% speed up
> 
> Signed-off-by: KazukiSaitoh <ksaitoh560@gmail.com>
> ---
>  git-p4.py | 70 ++++++++++++++++++++++++++++++++++++++++++++++-----------------
>  1 file changed, 51 insertions(+), 19 deletions(-)
> 
> diff --git a/git-p4.py b/git-p4.py
> index 40522f7..8cbee24 100755
> --- a/git-p4.py
> +++ b/git-p4.py
> @@ -1849,37 +1849,46 @@ class View(object):
>          if not exclude:
>              self.mappings.append(depot_side)
> 
> -    def map_in_client(self, depot_path):
> -        """Return the relative location in the client where this
> -           depot file should live.  Returns "" if the file should
> -           not be mapped in the client."""
> +    def convert_client_path(self, clientFile):
> +        # chop off //client/ part to make it relative
> +        if not clientFile.startswith(self.client_prefix):
> +            die("No prefix '%s' on clientFile '%s'" %
> +                (self.client_prefix, clientFile))
> +        return clientFile[len(self.client_prefix):]
> 
> -        if depot_path in self.client_spec_path_cache:
> -            return self.client_spec_path_cache[depot_path]
> +    def update_client_spec_path_cache(self, files):
> +        fileArgs = [f for f in files if f not in self.client_spec_path_cache]
> 
> -        where_result = p4CmdList(['where', depot_path])
> -        if len(where_result) == 0:
> -            die("No result from 'p4 where %s'" % depot_path)
> -        client_path = ""
> +        if len(fileArgs) == 0:
> +            return  # All files in cache
> +
> +        where_result = p4CmdList(["-x", "-", "where"], stdin=fileArgs)
>          for res in where_result:
>              if "code" in res and res["code"] == "error":
>                  # assume error is "... file(s) not in client view"
> -                client_path = ""
>                  continue
>              if "clientFile" not in res:
>                  die("No clientFile from 'p4 where %s'" % depot_path)
>              if "unmap" in res:
>                  # it will list all of them, but only one not unmap-ped
>                  continue
> -            # chop off //client/ part to make it relative
> -            clientFile = res["clientFile"]
> -            if not clientFile.startswith(self.client_prefix):
> -                die("No prefix '%s' on clientFile '%s'" %
> -                    (self.client_prefix, clientFile))
> -            client_path = clientFile[len(self.client_prefix):]
> +            self.client_spec_path_cache[res['depotFile']] =
> self.convert_client_path(res["clientFile"])
> +
> +        # not found files or unmap files set to ""
> +        for depotFile in fileArgs:
> +            if depotFile not in self.client_spec_path_cache:
> +                self.client_spec_path_cache[depotFile] = ""
> +
> +    def map_in_client(self, depot_path):
> +        """Return the relative location in the client where this
> +           depot file should live.  Returns "" if the file should
> +           not be mapped in the client."""
> +
> +        if depot_path in self.client_spec_path_cache:
> +            return self.client_spec_path_cache[depot_path]
> 
> -        self.client_spec_path_cache[depot_path] = client_path
> -        return client_path
> +        die( "Error: %s is not found in client spec path" % depot_path )
> +        return ""
> 
>  class P4Sync(Command, P4UserMap):
>      delete_actions = ( "delete", "move/delete", "purge" )
> @@ -2006,6 +2015,22 @@ class P4Sync(Command, P4UserMap):
>          """Look at each depotFile in the commit to figure out to what
>             branch it belongs."""
> 
> +        # create file list and get client paths by one batch "p4 where" query
> +        if self.clientSpecDirs:
> +            fnum = 0
> +            file_list  = []
> +            while commit.has_key("depotFile%s" % fnum):
> +                path =  commit["depotFile%s" % fnum]
> +                found = [p for p in self.depotPaths
> +                         if p4PathStartsWith(path, p)]
> +                if not found:
> +                    fnum = fnum + 1
> +                    continue
> +
> +                file_list.append(path)
> +                fnum = fnum + 1
> +            self.clientSpecDirs.update_client_spec_path_cache(file_list)
> +
>          branches = {}
>          fnum = 0
>          while commit.has_key("depotFile%s" % fnum):
> @@ -2255,6 +2280,13 @@ class P4Sync(Command, P4UserMap):
>              else:
>                  sys.stderr.write("Ignoring file outside of prefix:
> %s\n" % f['path'])
> 
> +        # get client paths by one batch "p4 where" query
> +        if self.clientSpecDirs:
> +            file_list = []
> +            for f in files:
> +                file_list.append(f['path'])
> +            self.clientSpecDirs.update_client_spec_path_cache(file_list)
> +
>          self.gitStream.write("commit %s\n" % branch)
>  #        gitStream.write("mark :%s\n" % details["change"])
>          self.committedChanges.add(int(details["change"]))
> -- 
> 1.8.4-rc2

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] git-p4: Ask "p4" to interpret View setting
  2013-08-16  1:24     ` Pete Wyckoff
@ 2013-08-25  2:29       ` Pete Wyckoff
  2013-08-27  2:43         ` kazuki saitoh
  0 siblings, 1 reply; 9+ messages in thread
From: Pete Wyckoff @ 2013-08-25  2:29 UTC (permalink / raw)
  To: kazuki saitoh; +Cc: git

pw@padd.com wrote on Thu, 15 Aug 2013 21:24 -0400:
> ksaitoh560@gmail.com wrote on Wed, 14 Aug 2013 09:59 +0900:
> > > My only concern is in the commit message, about performance.  A
> > > change that has lots of files in it will cause many roundtrips to
> > > p4d to do "p4 where" on each.  When the files don't have much
> > > edited content, this new approach will make the import take twice
> > > as long, I'll guess.  Do you have a big repository where you
> > > could test that?
> > 
> > I measured performance of "git p4 clone  --use-client-spec" with a
> > repository it has 1925 files, 50MB.
> >   Original:    8.05s user 32.02s system 15% cpu 4:25.34 total
> >   Apply patch:    9.02s user 53.19s system 14% cpu 6:56.41 total
> > 
> > It is acceptable in my situation, but looks quite slow...
> > 
> > Then I implemented one batch query version
> >    7.92s user 33.03s system 15% cpu 4:25.59 total
> > 
> > It is same as original
> > 
> > My additional patch is below.
> > I investigate call graph (attached rough sketch) and
> > implement batch query in "commit()" and "splitFilesIntoBranches()".
> > In addition, modified "map_in_client" to just search cache value.
> > 
> > Could you accept?
> 
> This looks good.  I've started my own performance testing
> on a few-hundred-thousand file repo to confirm your findings.
> 
> If it seems to work out, we can clean up the patch.  Otherwise
> maybe need to think about having both implementations and use
> the by-hand one for "...".  I don't like that approach.  Let's
> hope it's not needed.

I tried with a few repos:

Small repo, single-commit clone:

    Current:     0m0.35s user 0m0.30s sys 0m11.52s elapsed 5.69 %CPU
    No batching: 0m0.66s user 0m0.77s sys 0m34.42s elapsed 4.17 %CPU
    Batching:    0m0.28s user 0m0.29s sys 0m10.85s elapsed 5.27 %CPU

Big repo, single-commit clone:

    Current:     6m21.38s user 1m35.36s sys 19m44.83s elapsed 40.23 %CPU
    No batching: 1m53.13s user 24m34.35s sys 146m13.80s elapsed 18.09 %CPU (*)
    Batching:    6m22.01s user 1m44.23s sys 21m19.73s elapsed 37.99 %CPU

    The "no batching" run died with an unrelated p4 timeout.

Big repo, 1000 incremental changes:

    Current:     0m13.43s user 0m19.82s sys 11m12.58s elapsed 4.94 %CPU
    No batching: 0m20.29s user 0m39.94s sys 38m44.69s elapsed 2.59 %CPU (*)
    Batching:    0m16.15s user 0m26.60s sys 13m55.69s elapsed 5.11 %CPU

    The "no batching" run died at 28% of the way through.

There is probably a 20%-ish slowdown in my environment with this
approach.  But given that the timescale for these operations is
measured in the tens of minutes, I don't think a couple more matters
too much to anybody.

The attractiveness of the simplicity and increased client spec feature
coverage weighs in its favor.  Let's go ahead and inflict this on the
world and see what they think.

Do you have an updated patch?  Want to take some time to clean up and
resubmit the entire series?  The batching should be incorporated with
the last 2/2 that I sent out.

		-- Pete

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] git-p4: Ask "p4" to interpret View setting
  2013-08-25  2:29       ` Pete Wyckoff
@ 2013-08-27  2:43         ` kazuki saitoh
  2013-08-29 22:40           ` Pete Wyckoff
  0 siblings, 1 reply; 9+ messages in thread
From: kazuki saitoh @ 2013-08-27  2:43 UTC (permalink / raw)
  To: Pete Wyckoff; +Cc: git

> Do you have an updated patch?  Want to take some time to clean up and
> resubmit the entire series?  The batching should be incorporated with
> the last 2/2 that I sent out.

I don't have other update.
I'm satisfied because able to want to do and it became better than my
original modification thanks to your cooperation.
(> a few-hundred-thousand file repo
I didn't think that it work with so HUGE repo.)

How should I do?
Should I create one patch mail that incorporated your sent one?
Or nothing to do?


2013/8/25 Pete Wyckoff <pw@padd.com>:
> pw@padd.com wrote on Thu, 15 Aug 2013 21:24 -0400:
>> ksaitoh560@gmail.com wrote on Wed, 14 Aug 2013 09:59 +0900:
>> > > My only concern is in the commit message, about performance.  A
>> > > change that has lots of files in it will cause many roundtrips to
>> > > p4d to do "p4 where" on each.  When the files don't have much
>> > > edited content, this new approach will make the import take twice
>> > > as long, I'll guess.  Do you have a big repository where you
>> > > could test that?
>> >
>> > I measured performance of "git p4 clone  --use-client-spec" with a
>> > repository it has 1925 files, 50MB.
>> >   Original:    8.05s user 32.02s system 15% cpu 4:25.34 total
>> >   Apply patch:    9.02s user 53.19s system 14% cpu 6:56.41 total
>> >
>> > It is acceptable in my situation, but looks quite slow...
>> >
>> > Then I implemented one batch query version
>> >    7.92s user 33.03s system 15% cpu 4:25.59 total
>> >
>> > It is same as original
>> >
>> > My additional patch is below.
>> > I investigate call graph (attached rough sketch) and
>> > implement batch query in "commit()" and "splitFilesIntoBranches()".
>> > In addition, modified "map_in_client" to just search cache value.
>> >
>> > Could you accept?
>>
>> This looks good.  I've started my own performance testing
>> on a few-hundred-thousand file repo to confirm your findings.
>>
>> If it seems to work out, we can clean up the patch.  Otherwise
>> maybe need to think about having both implementations and use
>> the by-hand one for "...".  I don't like that approach.  Let's
>> hope it's not needed.
>
> I tried with a few repos:
>
> Small repo, single-commit clone:
>
>     Current:     0m0.35s user 0m0.30s sys 0m11.52s elapsed 5.69 %CPU
>     No batching: 0m0.66s user 0m0.77s sys 0m34.42s elapsed 4.17 %CPU
>     Batching:    0m0.28s user 0m0.29s sys 0m10.85s elapsed 5.27 %CPU
>
> Big repo, single-commit clone:
>
>     Current:     6m21.38s user 1m35.36s sys 19m44.83s elapsed 40.23 %CPU
>     No batching: 1m53.13s user 24m34.35s sys 146m13.80s elapsed 18.09 %CPU (*)
>     Batching:    6m22.01s user 1m44.23s sys 21m19.73s elapsed 37.99 %CPU
>
>     The "no batching" run died with an unrelated p4 timeout.
>
> Big repo, 1000 incremental changes:
>
>     Current:     0m13.43s user 0m19.82s sys 11m12.58s elapsed 4.94 %CPU
>     No batching: 0m20.29s user 0m39.94s sys 38m44.69s elapsed 2.59 %CPU (*)
>     Batching:    0m16.15s user 0m26.60s sys 13m55.69s elapsed 5.11 %CPU
>
>     The "no batching" run died at 28% of the way through.
>
> There is probably a 20%-ish slowdown in my environment with this
> approach.  But given that the timescale for these operations is
> measured in the tens of minutes, I don't think a couple more matters
> too much to anybody.
>
> The attractiveness of the simplicity and increased client spec feature
> coverage weighs in its favor.  Let's go ahead and inflict this on the
> world and see what they think.
>
> Do you have an updated patch?  Want to take some time to clean up and
> resubmit the entire series?  The batching should be incorporated with
> the last 2/2 that I sent out.
>
>                 -- Pete

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] git-p4: Ask "p4" to interpret View setting
  2013-08-27  2:43         ` kazuki saitoh
@ 2013-08-29 22:40           ` Pete Wyckoff
  0 siblings, 0 replies; 9+ messages in thread
From: Pete Wyckoff @ 2013-08-29 22:40 UTC (permalink / raw)
  To: kazuki saitoh; +Cc: git

ksaitoh560@gmail.com wrote on Tue, 27 Aug 2013 11:43 +0900:
> > Do you have an updated patch?  Want to take some time to clean up and
> > resubmit the entire series?  The batching should be incorporated with
> > the last 2/2 that I sent out.
> 
> I don't have other update.
> I'm satisfied because able to want to do and it became better than my
> original modification thanks to your cooperation.
> (> a few-hundred-thousand file repo
> I didn't think that it work with so HUGE repo.)
> 
> How should I do?
> Should I create one patch mail that incorporated your sent one?
> Or nothing to do?

It would be good if you could fold the one I sent in with yours,
and clean up any stylistic issues as you go.

I'll play with it a bit more, then send on to Junio for
the next release.

Thanks, this is a good addition!

		-- Pete


> 2013/8/25 Pete Wyckoff <pw@padd.com>:
> > pw@padd.com wrote on Thu, 15 Aug 2013 21:24 -0400:
> >> ksaitoh560@gmail.com wrote on Wed, 14 Aug 2013 09:59 +0900:
> >> > > My only concern is in the commit message, about performance.  A
> >> > > change that has lots of files in it will cause many roundtrips to
> >> > > p4d to do "p4 where" on each.  When the files don't have much
> >> > > edited content, this new approach will make the import take twice
> >> > > as long, I'll guess.  Do you have a big repository where you
> >> > > could test that?
> >> >
> >> > I measured performance of "git p4 clone  --use-client-spec" with a
> >> > repository it has 1925 files, 50MB.
> >> >   Original:    8.05s user 32.02s system 15% cpu 4:25.34 total
> >> >   Apply patch:    9.02s user 53.19s system 14% cpu 6:56.41 total
> >> >
> >> > It is acceptable in my situation, but looks quite slow...
> >> >
> >> > Then I implemented one batch query version
> >> >    7.92s user 33.03s system 15% cpu 4:25.59 total
> >> >
> >> > It is same as original
> >> >
> >> > My additional patch is below.
> >> > I investigate call graph (attached rough sketch) and
> >> > implement batch query in "commit()" and "splitFilesIntoBranches()".
> >> > In addition, modified "map_in_client" to just search cache value.
> >> >
> >> > Could you accept?
> >>
> >> This looks good.  I've started my own performance testing
> >> on a few-hundred-thousand file repo to confirm your findings.
> >>
> >> If it seems to work out, we can clean up the patch.  Otherwise
> >> maybe need to think about having both implementations and use
> >> the by-hand one for "...".  I don't like that approach.  Let's
> >> hope it's not needed.
> >
> > I tried with a few repos:
> >
> > Small repo, single-commit clone:
> >
> >     Current:     0m0.35s user 0m0.30s sys 0m11.52s elapsed 5.69 %CPU
> >     No batching: 0m0.66s user 0m0.77s sys 0m34.42s elapsed 4.17 %CPU
> >     Batching:    0m0.28s user 0m0.29s sys 0m10.85s elapsed 5.27 %CPU
> >
> > Big repo, single-commit clone:
> >
> >     Current:     6m21.38s user 1m35.36s sys 19m44.83s elapsed 40.23 %CPU
> >     No batching: 1m53.13s user 24m34.35s sys 146m13.80s elapsed 18.09 %CPU (*)
> >     Batching:    6m22.01s user 1m44.23s sys 21m19.73s elapsed 37.99 %CPU
> >
> >     The "no batching" run died with an unrelated p4 timeout.
> >
> > Big repo, 1000 incremental changes:
> >
> >     Current:     0m13.43s user 0m19.82s sys 11m12.58s elapsed 4.94 %CPU
> >     No batching: 0m20.29s user 0m39.94s sys 38m44.69s elapsed 2.59 %CPU (*)
> >     Batching:    0m16.15s user 0m26.60s sys 13m55.69s elapsed 5.11 %CPU
> >
> >     The "no batching" run died at 28% of the way through.
> >
> > There is probably a 20%-ish slowdown in my environment with this
> > approach.  But given that the timescale for these operations is
> > measured in the tens of minutes, I don't think a couple more matters
> > too much to anybody.
> >
> > The attractiveness of the simplicity and increased client spec feature
> > coverage weighs in its favor.  Let's go ahead and inflict this on the
> > world and see what they think.
> >
> > Do you have an updated patch?  Want to take some time to clean up and
> > resubmit the entire series?  The batching should be incorporated with
> > the last 2/2 that I sent out.
> >
> >                 -- Pete
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2013-08-29 22:40 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-08-06  6:45 [PATCH v2] git-p4: Ask "p4" to interpret View setting kazuki saitoh
2013-08-10 20:11 ` Pete Wyckoff
2013-08-10 20:15   ` [PATCH 1/2] git p4 test: sanitize P4CHARSET Pete Wyckoff
2013-08-10 20:15   ` [PATCH 2/2] git p4: implement view spec wildcards with "p4 where" Pete Wyckoff
2013-08-14  0:59   ` [PATCH v2] git-p4: Ask "p4" to interpret View setting kazuki saitoh
2013-08-16  1:24     ` Pete Wyckoff
2013-08-25  2:29       ` Pete Wyckoff
2013-08-27  2:43         ` kazuki saitoh
2013-08-29 22:40           ` Pete Wyckoff

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).