From: Shawn Pearce <spearce@spearce.org>
To: Tay Ray Chuan <rctay89@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: [RFC/PATCH 0/3] teach --histogram to diff
Date: Tue, 12 Jul 2011 07:19:47 -0700
Message-ID: <CAJo=hJu5ubkzUyyPM0nqP+J9CU3hBtAHfuzaLSuN214Hux4qTA@mail.gmail.com>
In-Reply-To: <1310451027-15148-1-git-send-email-rctay89@gmail.com>

On Mon, Jul 11, 2011 at 23:10, Tay Ray Chuan <rctay89@gmail.com> wrote:
> (Shawn, I was held up with the patch messages, sorry for the delay.)
>
> Port JGit's HistogramDiff(Index) over to C. This algorithm extends the
> patience algorithm to "support low-occurrence common elements" [1].
>
> Rough numbers show that it is a faster alternative to its --patience
> cousin, as well as to the default Myers algorithm:
>
> $ time ./git log --histogram -p v1.0.0 >/dev/null
>
> real 0m12.998s
> user 0m11.506s
> sys 0m1.487s
> $ time ./git log -p v1.0.0 >/dev/null
>
> real 0m13.575s
> user 0m12.101s
> sys 0m1.468s
> $ time ./git log --patience -p v1.0.0 >/dev/null
>
> real 0m14.978s
> user 0m13.508s
> sys 0m1.464s

Nice!

Not the big difference that it is for us in JGit (between histogram
and Myers), but it's nice to see an improvement here, even if it is
only 0.5s for the entire 1.0.0 history. How do the diffs come out? One
of the arguments for patience diff is that the formatting can sometimes
be more readable for certain changes, but it's slower. Histogram tries
to apply an algorithm similar to patience in order to get the formatting
benefits, but also some performance improvements.

Have you looked at a patch that differs in output between Myers and
patience, and then compared those to the histogram version?

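One rough way to hunt for such patches, sketched under some assumptions: hash the patch text that each algorithm produces for every commit in a range, and print the commits where the hashes disagree. The helper names `patch_digest` and `compare_algorithms` are made up for illustration. The sketch assumes a git build that accepts both --patience and --histogram (the latter only with this series applied) and that the default algorithm is Myers.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of git.

# Print a hash of the patch for commit $1, generated with the extra diff
# option $2. An empty $2 means git's default algorithm (assumed Myers).
patch_digest() {
    # --pretty=format: drops the commit header so only the patch is hashed.
    # $2 is deliberately unquoted so that an empty value adds no argument.
    git show --pretty=format: -p $2 "$1" | git hash-object --stdin
}

# For every non-merge commit reachable from $1, print the commit and the
# three hashes whenever patience or histogram differs from the default.
compare_algorithms() {
    git rev-list --no-merges "$1" | while read -r commit; do
        myers=$(patch_digest "$commit" "")
        patience=$(patch_digest "$commit" --patience)
        histogram=$(patch_digest "$commit" --histogram)
        if [ "$myers" != "$patience" ] || [ "$myers" != "$histogram" ]; then
            echo "$commit myers=$myers patience=$patience histogram=$histogram"
        fi
    done
}
```

Running `compare_algorithms v1.0.0` inside a git.git checkout should then give a list of candidate commits to inspect by hand with `git show --patience` and `git show --histogram`.
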
> The first patch implements JGit's HistogramDiff(Index) proper. The
> second and third patches aren't essential but yield performance gains.
...
> [RFC/PATCH 1/3] teach --histogram to diff
> [RFC/PATCH 2/3] xdiff/xprepare: skip classification
> [RFC/PATCH 3/3] xdiff/xprepare: use a smaller sample size for histogram

Do we need sampling at all for histogram? Can you skip it?

--
Shawn.