git@vger.kernel.org mailing list mirror (one of many)
* [PATCH] git-p4: fix faulty paths for case insensitive systems
@ 2015-08-02 15:15 larsxschneider
  2015-08-02 15:15 ` larsxschneider
  2015-08-04 22:06 ` Luke Diamand
  0 siblings, 2 replies; 4+ messages in thread
From: larsxschneider @ 2015-08-02 15:15 UTC
  To: git; +Cc: pw, torarvid, ksaitoh560, Lars Schneider

From: Lars Schneider <larsxschneider@gmail.com>

Hi,

I want to propose this patch as it helped us to migrate a big source code base
successfully from P4 to Git. I am sorry that I don't provide a test case yet.
I would like to get advice on the patch and on the best strategy to provide a
test. Do you only run git-p4 integration tests in "t/t98??-git-p4-*.sh"? If yes,
which version of "start_p4d" should I use?

Thanks,
Lars

Lars Schneider (1):
  git-p4: fix faulty paths for case insensitive systems

 git-p4.py | 81 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 77 insertions(+), 4 deletions(-)

--
1.9.5 (Apple Git-50.3)


* [PATCH] git-p4: fix faulty paths for case insensitive systems
  2015-08-02 15:15 [PATCH] git-p4: fix faulty paths for case insensitive systems larsxschneider
@ 2015-08-02 15:15 ` larsxschneider
  2015-08-04 22:06 ` Luke Diamand
  1 sibling, 0 replies; 4+ messages in thread
From: larsxschneider @ 2015-08-02 15:15 UTC
  To: git; +Cc: pw, torarvid, ksaitoh560, Lars Schneider

From: Lars Schneider <larsxschneider@gmail.com>

PROBLEM:
We run P4 servers on Linux and P4 clients on Windows. For an unknown
reason the file path for a number of files in P4 does not match the
directory path with respect to case sensitivity.

E.g. `p4 files` might return
//depot/path/to/file1
//depot/PATH/to/file2

If you use P4/P4V then these files end up in the same directory, e.g.
//depot/path/to/file1
//depot/path/to/file2

If you use git-p4 then all files not matching the correct file path
(e.g. `file2`) will be ignored.

SOLUTION:
Identify directory paths that differ only in case. If there are any,
run `p4 dirs` to build a dictionary that maps each lowercased path to
its correctly cased form. On `clone` this dictionary is used to fix
the paths. All of this is applied only if the git config
"core.ignorecase" is set.

Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
---
 git-p4.py | 81 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 77 insertions(+), 4 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index 549022e..692f1f4 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -1859,7 +1859,7 @@ class View(object):
                 (self.client_prefix, clientFile))
         return clientFile[len(self.client_prefix):]
 
-    def update_client_spec_path_cache(self, files):
+    def update_client_spec_path_cache(self, files, fixPathCase = None):
         """ Caching file paths by "p4 where" batch query """
 
         # List depot file paths exclude that already cached
@@ -1878,6 +1878,8 @@ class View(object):
             if "unmap" in res:
                 # it will list all of them, but only one not unmap-ped
                 continue
+            if fixPathCase:
+                res['depotFile'] = fixPathCase(res['depotFile'])
             self.client_spec_path_cache[res['depotFile']] = self.convert_client_path(res["clientFile"])
 
         # not found files or unmap files set to ""
@@ -1973,7 +1975,8 @@ class P4Sync(Command, P4UserMap):
         files = []
         fnum = 0
         while commit.has_key("depotFile%s" % fnum):
-            path =  commit["depotFile%s" % fnum]
+            path = commit["depotFile%s" % fnum]
+            path = self.fixPathCase(path)
 
             if [p for p in self.cloneExclude
                 if p4PathStartsWith(path, p)]:
@@ -2037,7 +2040,9 @@ class P4Sync(Command, P4UserMap):
         branches = {}
         fnum = 0
         while commit.has_key("depotFile%s" % fnum):
-            path =  commit["depotFile%s" % fnum]
+            path = commit["depotFile%s" % fnum]
+            path = self.fixPathCase(path)
+
             found = [p for p in self.depotPaths
                      if p4PathStartsWith(path, p)]
             if not found:
@@ -2164,6 +2169,10 @@ class P4Sync(Command, P4UserMap):
             if marshalled["code"] == "error":
                 if "data" in marshalled:
                     err = marshalled["data"].rstrip()
+
+        if "depotFile" in marshalled:
+            marshalled['depotFile'] = self.fixPathCase(marshalled['depotFile'])
+
         if err:
             f = None
             if self.stream_have_file_info:
@@ -2238,6 +2247,7 @@ class P4Sync(Command, P4UserMap):
 
             # do the last chunk
             if self.stream_file.has_key('depotFile'):
+                self.stream_file['depotFile'] = self.fixPathCase(self.stream_file['depotFile'])
                 self.streamOneP4File(self.stream_file, self.stream_contents)
 
     def make_email(self, userid):
@@ -2295,7 +2305,8 @@ class P4Sync(Command, P4UserMap):
                 sys.stderr.write("Ignoring file outside of prefix: %s\n" % f['path'])
 
         if self.clientSpecDirs:
-            self.clientSpecDirs.update_client_spec_path_cache(files)
+            self.clientSpecDirs.update_client_spec_path_cache(
+                files, lambda x: self.fixPathCase(x))
 
         self.gitStream.write("commit %s\n" % branch)
 #        gitStream.write("mark :%s\n" % details["change"])
@@ -2759,6 +2770,63 @@ class P4Sync(Command, P4UserMap):
             print "IO error with git fast-import. Is your git version recent enough?"
             print self.gitError.read()
 
+    def fixPathCase(self, path):
+        if self.caseCorrectedPaths:
+            components = path.split('/')
+            filename = components.pop()
+            dirname = '/'.join(components).lower() + '/'
+            if dirname in self.caseCorrectedPaths:
+                path = self.caseCorrectedPaths[dirname] + filename
+        return path
+
+    def generatePathCaseDict(self, depotPaths):
+        # Query all files and generate a list of all used paths
+        # e.g. this files list:
+        # //depot/path/to/file1
+        # //depot/PATH/to/file2
+        #
+        # result in this path list:
+        # //depot/
+        # //depot/PATH/
+        # //depot/path/
+        # //depot/PATH/to/
+        # //depot/path/to/
+        p4_paths = set()
+        for p in depotPaths:
+            for f in p4CmdList(["files", p+"..."]):
+                components = f["depotFile"].split('/')[0:-1]
+                for i in range(3, len(components)+1):
+                    p4_paths.add('/'.join(components[0:i]) + '/')
+        p4_paths = sorted(list(p4_paths), key=len)
+
+        if len(p4_paths) > len(set([p.lower() for p in p4_paths])):
+            print "ATTENTION: File paths with different case variations detected. Fixing may take a while..."
+            found_variations = True
+            while found_variations:
+                for path in p4_paths:
+                    found_variations = False
+                    path_variations = [p for p in p4_paths if p.lower() == path.lower()]
+
+                    if len(path_variations) > 1:
+                        print  "%i different case variations for path '%s' detected." % (len(path_variations), path)
+                        # If we detect path variations (e.g. //depot/path and //depot/PATH) then we query P4 to list
+                        # the subdirectories of the parent (e.g //depot/*). P4 will return these subdirectories with
+                        # the correct case.
+                        parent_path = '/'.join(path.split('/')[0:-2])
+                        case_ok_paths = [p["dir"] + '/' for p in p4CmdList(["dirs", "-D", parent_path + '/*'])]
+
+                        # Replace all known paths with the case corrected path from P4 dirs command
+                        for case_ok_path in case_ok_paths:
+                            pattern = re.compile("^" + re.escape(case_ok_path), re.IGNORECASE)
+                            p4_paths = sorted(list(set([pattern.sub(case_ok_path, p) for p in p4_paths])), key=len)
+
+                        found_variations = True
+                        break
+            return dict((p.lower(), p) for p in p4_paths)
+        else:
+            if self.verbose:
+                print "All file paths have consistent case"
+            return None
 
     def run(self, args):
         self.depotPaths = []
@@ -2930,6 +2998,11 @@ class P4Sync(Command, P4UserMap):
 
         self.depotPaths = newPaths
 
+        if gitConfigBool("core.ignorecase"):
+            self.caseCorrectedPaths = self.generatePathCaseDict(self.depotPaths)
+        else:
+            self.caseCorrectedPaths = None
+
         # --detect-branches may change this for each branch
         self.branchPrefixes = self.depotPaths
 
-- 
1.9.5 (Apple Git-50.3)
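
For readers following the patch, the lookup step performed by fixPathCase can be sketched on its own. This is a minimal standalone sketch in Python 3, not the patch code itself: the function name `fix_path_case` and the sample dictionary are hypothetical, and the dictionary is assumed to have the shape returned by generatePathCaseDict (lowercased directory path mapped to its correctly cased form).

```python
def fix_path_case(path, case_corrected_paths):
    """Return path with its directory part replaced by the canonical spelling.

    case_corrected_paths maps a lowercased directory (with trailing '/')
    to the correctly cased directory, or is None when no fix is needed.
    """
    if not case_corrected_paths:
        return path
    # Split into directory and file name at the last '/'.
    dirname, _, filename = path.rpartition('/')
    key = dirname.lower() + '/'
    if key in case_corrected_paths:
        return case_corrected_paths[key] + filename
    # Unknown directory: leave the path untouched.
    return path


# Hypothetical dictionary for the example from the commit message.
corrected = {
    '//depot/': '//depot/',
    '//depot/path/': '//depot/path/',
    '//depot/path/to/': '//depot/path/to/',
}
print(fix_path_case('//depot/PATH/to/file2', corrected))
```

Only the directory part is looked up, so the case of the file name itself is preserved.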


* Re: [PATCH] git-p4: fix faulty paths for case insensitive systems
  2015-08-02 15:15 [PATCH] git-p4: fix faulty paths for case insensitive systems larsxschneider
  2015-08-02 15:15 ` larsxschneider
@ 2015-08-04 22:06 ` Luke Diamand
  2015-08-05  5:36   ` Lars Schneider
  1 sibling, 1 reply; 4+ messages in thread
From: Luke Diamand @ 2015-08-04 22:06 UTC
  To: larsxschneider, git; +Cc: pw, torarvid, ksaitoh560

On 02/08/15 16:15, larsxschneider@gmail.com wrote:
> From: Lars Schneider <larsxschneider@gmail.com>
>
> Hi,
>
> I want to propose this patch as it helped us to migrate a big source code base
> successfully from P4 to Git. I am sorry that I don't provide a test case, yet.

Case sensitivity is a pretty tricky area with p4 - it's very brave of 
you to have a go at fixing it!

> I would like to get advice on the patch and on the best strategy to provide a
> test. Do you only run git-p4 integration tests in "t/t98??-git-p4-*.sh"? If yes,
> which version of "start_p4d" should I use?

Only the t98* tests relate to git-p4 so if you just copy one of those it 
should do the right thing.

t9819-git-p4-case-folding.sh already has a few failing tests for this 
problem. I wrote it a while back just to illustrate the problem, so it 
might be of use to you, or you might need to start again.

Won't your change make importing much slower for people with this problem?

Also, I'm not sure you can use "core.ignorecase" to trigger this: the 
problem will arise if the *server* is ignoring case as well (which I 
think you can detect by querying the server).
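
A minimal sketch of such a server query, assuming the text output of `p4 info` contains a "Case Handling:" line with the value "sensitive" or "insensitive". The function name `server_ignores_case` is hypothetical and not part of git-p4, and the sample text is made up for illustration. The sketch only parses output that has already been captured, so it runs without a Perforce server.

```python
def server_ignores_case(p4_info_output):
    """Return True/False for the server's case handling, or None if unknown.

    p4_info_output is the plain-text output of `p4 info` (assumed format:
    one "Key: value" pair per line).
    """
    for line in p4_info_output.splitlines():
        key, sep, value = line.partition(':')
        if sep and key.strip().lower() == 'case handling':
            return value.strip().lower() == 'insensitive'
    # No "Case Handling" line found: the caller has to pick a default.
    return None


# Illustrative sample only; real `p4 info` output has more fields.
sample = "Server root: /srv/p4\nCase Handling: insensitive\n"
print(server_ignores_case(sample))
```

Returning None for a missing field keeps the decision about a default (for example falling back to "core.ignorecase") with the caller.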

I'm not trying to be negative - but this problem does have some annoying 
pitfalls! Let me know if you think I can help though.

Regards!
Luke


* Re: [PATCH] git-p4: fix faulty paths for case insensitive systems
  2015-08-04 22:06 ` Luke Diamand
@ 2015-08-05  5:36   ` Lars Schneider
  0 siblings, 0 replies; 4+ messages in thread
From: Lars Schneider @ 2015-08-05  5:36 UTC
  To: Luke Diamand; +Cc: git, pw, torarvid, ksaitoh560

Thank you for your reply. Your t9819 test case looks exactly like the right thing. Unfortunately I don’t have Internet access for the next two weeks. Afterwards I will provide proper test cases for the patch.

You are correct about the speed. All these initial “p4 dirs” calls make the clone pretty slow. However, for us it is a one-time history migration and therefore speed is not an issue. I also understand your “core.ignorecase” comment. Assuming the path correction works as expected, how and when would you trigger it? Would you only rely on the “server ignoring case” flag?

Cheers,
Lars
