git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Pete Wyckoff <pw@padd.com>
To: kazuki saitoh <ksaitoh560@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH v2] git-p4: Ask "p4" to interpret View setting
Date: Sat, 24 Aug 2013 22:29:44 -0400	[thread overview]
Message-ID: <20130825022944.GA16027@padd.com> (raw)
In-Reply-To: <20130816012420.GA20985@padd.com>

pw@padd.com wrote on Thu, 15 Aug 2013 21:24 -0400:
> ksaitoh560@gmail.com wrote on Wed, 14 Aug 2013 09:59 +0900:
> > > My only concern is in the commit message, about performance.  A
> > > change that has lots of files in it will cause many roundtrips to
> > > p4d to do "p4 where" on each.  When the files don't have much
> > > edited content, this new approach will make the import take twice
> > > as long, I'll guess.  Do you have a big repository where you
> > > could test that?
> > 
> > I measured performance of "git p4 clone  --use-client-spec" with a
> > repository it has 1925 files, 50MB.
> >   Original:    8.05s user 32.02s system 15% cpu 4:25.34 total
> >   Apply patch:    9.02s user 53.19s system 14% cpu 6:56.41 total
> > 
> > It is acceptable in my situation, but looks quite slow...
> > 
> > Then I implemented one batch query version
> >    7.92s user 33.03s system 15% cpu 4:25.59 total
> > 
> > It is same as original
> > 
> > My additional patch is below.
> > I investigate call graph (attached rough sketch) and
> > implement batch query in "commit()" and "splitFilesIntoBranches()".
> > In addition, modified "map_in_client" to just search cache value.
> > 
> > Could you accept?
> 
> This looks good.  I've started my own performance testing
> on a few-hundred-thousand file repo to confirm your findings.
> 
> If it seems to work out, we can clean up the patch.  Otherwise
> maybe need to think about having both implementations and use
> the by-hand one for "...".  I don't like that approach.  Let's
> hope it's not needed.

I tried with a few repos:

Small repo, single-commit clone:

    Current:     0m0.35s user 0m0.30s sys 0m11.52s elapsed 5.69 %CPU
    No batching: 0m0.66s user 0m0.77s sys 0m34.42s elapsed 4.17 %CPU
    Batching:    0m0.28s user 0m0.29s sys 0m10.85s elapsed 5.27 %CPU

Big repo, single-commit clone:

    Current:     6m21.38s user 1m35.36s sys 19m44.83s elapsed 40.23 %CPU
    No batching: 1m53.13s user 24m34.35s sys 146m13.80s elapsed 18.09 %CPU (*)
    Batching:    6m22.01s user 1m44.23s sys 21m19.73s elapsed 37.99 %CPU

    The "no batching" run died with an unrelated p4 timeout.

Big repo, 1000 incremental changes:

    Current:     0m13.43s user 0m19.82s sys 11m12.58s elapsed 4.94 %CPU
    No batching: 0m20.29s user 0m39.94s sys 38m44.69s elapsed 2.59 %CPU (*)
    Batching:    0m16.15s user 0m26.60s sys 13m55.69s elapsed 5.11 %CPU

    The "no batching" run died at 28% of the way through.

There is probably a 20%-ish slowdown in my environment with this
approach.  But given that the timescale for these operations is
measured in the tens of minutes, I don't think a couple more matters
too much to anybody.

The attractiveness of the simplicity and increased client spec feature
coverage weighs in its favor.  Let's go ahead and inflict this on the
world and see what they think.

Do you have an updated patch?  Want to take some time to clean up and
resubmit the entire series?  The batching should be incorporated with
the last 2/2 that I sent out.

		-- Pete

  reply	other threads:[~2013-08-25  2:33 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-08-06  6:45 [PATCH v2] git-p4: Ask "p4" to interpret View setting kazuki saitoh
2013-08-10 20:11 ` Pete Wyckoff
2013-08-10 20:15   ` [PATCH 1/2] git p4 test: sanitize P4CHARSET Pete Wyckoff
2013-08-10 20:15   ` [PATCH 2/2] git p4: implement view spec wildcards with "p4 where" Pete Wyckoff
2013-08-14  0:59   ` [PATCH v2] git-p4: Ask "p4" to interpret View setting kazuki saitoh
2013-08-16  1:24     ` Pete Wyckoff
2013-08-25  2:29       ` Pete Wyckoff [this message]
2013-08-27  2:43         ` kazuki saitoh
2013-08-29 22:40           ` Pete Wyckoff

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130825022944.GA16027@padd.com \
    --to=pw@padd.com \
    --cc=git@vger.kernel.org \
    --cc=ksaitoh560@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).