git@vger.kernel.org mailing list mirror (one of many)
 help / Atom feed
From: Luke Diamand <luke@diamand.org>
To: Eric Sunshine <sunshine@sunshineco.com>
Cc: Git List <git@vger.kernel.org>,
	Lars Schneider <larsxschneider@gmail.com>,
	Miguel Torroja <miguel.torroja@gmail.com>,
	George Vanburgh <gvanburgh@bloomberg.net>
Subject: Re: [PATCH 1/1] git-p4: add format-patch subcommand
Date: Mon, 12 Mar 2018 09:40:25 +0000
Message-ID: <CAE5ih7_49-85ayR9AdHjd0AUhmTua-5Mn_CjLXGcT571Lh=xzg@mail.gmail.com> (raw)
In-Reply-To: <CAPig+cRt3cdKOCFmbuUMQPuBnP+weD3HcY-5kz9RMkPH=aP63g@mail.gmail.com>

On 26 February 2018 at 23:48, Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Mon, Feb 26, 2018 at 6:48 AM, Luke Diamand <luke@diamand.org> wrote:
>> It takes a list of P4 changelists and generates a patch for
>> each one, using "p4 describe".
>> [...]

Just to say that I almost got this working reasonably well, but it
gets pretty nasty making binary files work - in order to generate the
git binary format I need a base85 encoder which exists in Python3, but
not in Python2. And while I could obviously write it in Python easily
enough, it starts to feel like an awful lot of code implementing (and
maintaining) a slightly second-rate "git diff". And while it's
tempting to just not support binary files, in reality Perforce repos
seem to end up with lots of them.

So I think I might have another go at the previous scheme. What I will
have to do is to first create a faked-up commit representing the point
prior to the shelved commit for each file ("for file@rev-1 in files"),
and then create the real commit ("for file@rev in files"). It's
probably not as grim as it sounds and then means that we end up with
git doing all the hard work of diffing files, rather than relying on
Perforce's diff engine in conjunction with some Python.


>> Signed-off-by: Luke Diamand <luke@diamand.org>
>> ---
>> diff --git a/git-p4.py b/git-p4.py
>> @@ -3749,6 +3761,277 @@ class P4Branches(Command):
>> +class P4MakePatch(Command,P4UserMap):
>> +    def run(self, args):
>> +        if self.output and not os.path.isdir(self.output):
>> +            sys.exit("output directory %s does not exist" % self.output)
>
> For consistency with "git format-patch", this could create the output
> directory automatically rather than erroring out.
>
>> +        if self.strip_depot_prefix:
>> +            self.clientSpec = getClientSpec()
>> +        else:
>> +            self.clientSpec = None
>> +
>> +        self.loadUserMapFromCache()
>> +        if len(args) < 1:
>> +            return False
>
> Would it make sense to handle this no-arguments case earlier before
> doing work, such as loading the user map from cache?
>
>> +        for change in args:
>> +            self.make_patch(int(change))
>> +
>> +        return True
>> +
>> +    def p4_fetch_delta(self, change, files, shelved=False):
>> +        cmd = ["describe"]
>> +        if shelved:
>> +            cmd += ["-S"]
>> +        cmd += ["-du"]
>> +        cmd += ["%s" % change]
>> +        cmd = p4_build_cmd(cmd)
>> +
>> +        p4 = subprocess.Popen(cmd, shell=False, stdout=subprocess.PIPE)
>> +        try:
>> +            result = p4.stdout.readlines()
>> +        except EOFError:
>> +            pass
>> +        in_diff = False
>> +        matcher = re.compile('====\s+(.*)#(\d+)\s+\(text\)\s+====')
>> +        diffmatcher = re.compile("Differences ...")
>
> I'm not familiar with the output of "p4 describe", but does it include
> user-supplied text? If so, would it be safer to anchor 'diffmatcher',
> for instance, "^Diferences...$"?
>
>> +        delta = ""
>> +        skip_next_blank_line = False
>> +
>> +        for line in result:
>> +            if diffmatcher.match(line):
>
> Stepping back, does "Differences..." appear on a line of its own? If
> so, why does this need to be a regex at all? Can't it just do a direct
> string comparison?
>
>> +                in_diff = True
>> +                continue
>> +
>> +            if in_diff:
>> +
>> +                if skip_next_blank_line and \
>> +                    line.rstrip() == "":
>> +                    skip_next_blank_line = False
>> +                    continue
>> +
>> +                m = matcher.match(line)
>> +                if m:
>> +                    file = self.map_path(m.group(1))
>> +                    ver = m.group(2)
>> +                    delta += "diff --git a%s b%s\n" % (file, file)
>> +                    delta += "--- a%s\n" % file
>> +                    delta += "+++ b%s\n" % file
>> +                    skip_next_blank_line = True
>> +                else:
>> +                    delta += line
>> +
>> +        delta += "\n"
>> +
>> +        exitCode = p4.wait()
>> +        if exitCode != 0:
>> +            raise IOError("p4 '%s' command failed" % str(cmd))
>
> Since p4.stdout.readlines() gathered all output from the command
> before massaging it into Git format, can't the p4.wait() be done
> earlier, just after the output has been read?
>
>> +        return delta
>> +
>> +    def make_patch(self, change):
>> +        [...]
>> +        # add new or deleted files
>> +        for file in files:
>> +            [...]
>> +            if add or delete:
>> +                if add:
>> +                    [...]
>> +                else:
>> +                    [...]
>> +
>> +                [...]
>> +
>> +                if add:
>> +                    prefix = "+"
>> +                else:
>> +                    prefix = "-"
>> +
>> +                if delete:
>> +                    [...]
>> +                else:
>> +                    # added
>> +                    [...]
>> +
>> +                (lines, delta_content) = self.read_file_contents(prefix, path_rev)
>> +
>> +                if add:
>> +                    if lines > 0:
>> +                        delta += "@@ -0,0 +1,%d @@\n" % lines
>> +                else:
>> +                    delta += "@@ -1,%d +0,0 @@\n" % lines
>
> It's not clear why you sometimes check 'add' but other times 'delete'.
> Perhaps always used one or the other? For instance:
>
>     action = file["action"]
>     if action == "add" or action == "delete":
>         if action == "add":
>             before = "/dev/null"
>             ...
>         else
>             ...
>
>         delta += ...
>
>         if action == "add":
>             ...
>
> or something.
>
>> +                delta += delta_content
>> +
>> +        if self.output:
>> +            with open("%s/%s.patch" % (self.output, change), "w") as f:
>> +                f.write(delta)
>> +        else:
>> +            print(delta)
>> diff --git a/t/t9832-make-patch.sh b/t/t9832-make-patch.sh
>> @@ -0,0 +1,135 @@
>> +# Converting P4 changes into patches
>> +#
>> +# - added, deleted, modified files
>> +# - regular commits, shelved commits
>> +#
>> +# directories and symblinks don't yet work
>> +# binary files will never work
>
> s/symblinks/symlinks/
>
>> +test_expect_success 'init depot' '
>> +       (
>> +               cd "$cli" &&
>> +               echo file1 >file1 &&
>> +               p4 add file1 &&
>> +               p4 submit -d "change 1" &&                      # cl 1
>> +               cat >file_to_delete <<-EOF &&
>
> Using -\EOF would signal that you don't expect interpolation within the heredoc.
>
>> +               LINE1
>> +               LINE2
>> +               EOF
>> +               echo "non-empty" >file_to_delete &&
>> +               p4 add file_to_delete &&
>> +               p4 submit -d "change 2" &&                      # cl 2
>> +               p4 edit file1 &&
>> +               cat >>file1 <<-EOF &&
>> +               LINE1
>> +               LINE2
>> +               EOF
>> +               p4 submit -d "change 3" &&                      # cl 3
>> +               p4 delete file_to_delete &&
>> +               echo "file2" >file2 &&
>> +               p4 add file2 &&
>> +               p4 submit -d "change 4"                         # cl 4
>> +       )
>> +'
>> +test_expect_success 'add p4 symlink' '
>> +       (
>> +               cd "$cli" &&
>> +               echo "symlink_source" >symlink_source &&
>> +               ln -s symlink_source symlink &&
>
> Should this test be protected with the SYMLINKS prerequisite?
>
>> +               p4 add symlink_source symlink &&
>> +               p4 submit -d "add symlink"                      # cl 6
>> +       )
>> +'
>> +
>> +test_expect_success 'patch from symlink' '
>
> Ditto SYMLINKS?
>
>> +       test_when_finished cleanup_git &&
>> +       cd "$git" &&
>> +       (
>> +               git p4 format-patch 6 | patch -p1 &&
>> +               test_path_is_file depot/symlink_source &&
>> +               test -L depot/symlink
>> +       )
>> +'

      reply index

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-02-26 11:48 [PATCH 0/1] " Luke Diamand
2018-02-26 11:48 ` [PATCH 1/1] " Luke Diamand
2018-02-26 23:29   ` Miguel Torroja
2018-02-26 23:48   ` Eric Sunshine
2018-03-12  9:40     ` Luke Diamand [this message]

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAE5ih7_49-85ayR9AdHjd0AUhmTua-5Mn_CjLXGcT571Lh=xzg@mail.gmail.com' \
    --to=luke@diamand.org \
    --cc=git@vger.kernel.org \
    --cc=gvanburgh@bloomberg.net \
    --cc=larsxschneider@gmail.com \
    --cc=miguel.torroja@gmail.com \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

git@vger.kernel.org mailing list mirror (one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.org/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/
       or Tor2web: https://www.tor2web.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox