From: Shawn Pearce <spearce@spearce.org>
To: Tay Ray Chuan <rctay89@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: [RFC/PATCH 0/3] teach --histogram to diff
Date: Tue, 12 Jul 2011 07:19:47 -0700
Message-ID: <CAJo=hJu5ubkzUyyPM0nqP+J9CU3hBtAHfuzaLSuN214Hux4qTA@mail.gmail.com>
In-Reply-To: <1310451027-15148-1-git-send-email-rctay89@gmail.com>

On Mon, Jul 11, 2011 at 23:10, Tay Ray Chuan <rctay89@gmail.com> wrote:
> (Shawn, I was held up with the patch messages, sorry for the delay.)
>
> Port JGit's HistogramDiff(Index) over to C. This algorithm extends the
> patience algorithm to "support low-occurrence common elements" [1].
>
> Rough numbers show that it is a faster alternative to its --patience
> cousin, as well as to the default Myers algorithm:
>
>  $ time ./git log --histogram -p v1.0.0 >/dev/null
>
>  real    0m12.998s
>  user    0m11.506s
>  sys     0m1.487s
>  $ time ./git log -p v1.0.0 >/dev/null
>
>  real    0m13.575s
>  user    0m12.101s
>  sys     0m1.468s
>  $ time ./git log --patience -p v1.0.0 >/dev/null
>
>  real    0m14.978s
>  user    0m13.508s
>  sys     0m1.464s

Nice!

Not the big difference that it is for us in JGit (between histogram
and Myers), but it's nice to see an improvement here, even if it is
only 0.5s for the entire 1.0.0 history. How do the diffs come out? One
of the arguments for patience diff is that the formatting can sometimes
be more readable for certain changes, but it's slower. Histogram tries
to apply an algorithm similar to patience in order to get the
formatting benefits, but also some performance improvements.
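
To make the "low-occurrence common elements" idea concrete, here is a
minimal Python sketch of the anchor-selection step only. It is an
illustration under simplifying assumptions, not the JGit or xdiff code:
the function name is hypothetical, occurrences are counted in the first
file only, and the real implementations also extend the matched region
and recurse on both sides of it. Patience diff would only accept lines
that are unique in both files; this sketch falls back to the rarest
common line instead.

```python
from collections import Counter


def lowest_occurrence_anchor(a, b):
    """Pick a split point: the line of b that also occurs in a and has
    the fewest occurrences in a. Returns (index_in_a, index_in_b), or
    None when the two sequences share no line.

    Hypothetical helper for illustration; ties go to the earliest line
    in b, and the first occurrence in a is used as the match.
    """
    counts = Counter(a)  # the "histogram" of line occurrences in a
    first_pos = {}
    for i, line in enumerate(a):
        first_pos.setdefault(line, i)

    best = None  # (occurrences_in_a, index_in_a, index_in_b)
    for j, line in enumerate(b):
        n = counts.get(line, 0)
        if n == 0:
            continue  # line is not common to both sides
        if best is None or n < best[0]:
            best = (n, first_pos[line], j)
    return None if best is None else (best[1], best[2])
```

With a = ["{", "foo", "}", "{", "bar", "}"] and
b = ["{", "bar", "}", "{", "baz", "}"], the braces occur twice in a
while "bar" occurs once, so the sketch anchors on "bar" rather than on
the first common brace.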

Have you looked at a patch that differs in output between Myers and
patience, and then compared those to the histogram version?
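
One way to run that comparison, sketched in Python. The helper names
are hypothetical, `--histogram` is the option proposed in this series
(so it only exists in a git built with these patches), and the commit
is assumed to have a parent.

```python
import subprocess

# Extra flags per algorithm; the default (Myers) needs none.
ALGORITHM_FLAGS = {
    "myers": [],
    "patience": ["--patience"],
    "histogram": ["--histogram"],  # assumes the patched git from this series
}


def diff_command(commit, flags):
    """Build the git invocation that diffs a commit against its parent."""
    return ["git", "diff", *flags, commit + "^", commit]


def group_by_output(outputs):
    """Group algorithm names whose diff text is byte-for-byte identical."""
    groups = {}
    for name, text in outputs.items():
        groups.setdefault(text, []).append(name)
    return sorted(sorted(names) for names in groups.values())


def compare(commit):
    """Run all three algorithms on one commit and group equal outputs."""
    outputs = {
        name: subprocess.run(
            diff_command(commit, flags),
            capture_output=True, text=True, check=True,
        ).stdout
        for name, flags in ALGORITHM_FLAGS.items()
    }
    return group_by_output(outputs)
```

A commit worth reading by eye is one where compare() puts "myers" and
"patience" in separate groups; which group "histogram" lands in then
shows whose formatting it follows.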

> The first patch implements JGit's HistogramDiff(Index) proper. The
> second and third patches aren't essential but yield performance gains.
...
> [RFC/PATCH 1/3] teach --histogram to diff
> [RFC/PATCH 2/3] xdiff/xprepare: skip classification
> [RFC/PATCH 3/3] xdiff/xprepare: use a smaller sample size for histogram

Do we need sampling at all for histogram? Can you skip it?

-- 
Shawn.
