git@vger.kernel.org mailing list mirror (one of many)
From: Jeff King <peff@peff.net>
To: "Jakub Narębski" <jnareb@gmail.com>
Cc: Dennis Kaarsemaker <dennis@kaarsemaker.net>, git@vger.kernel.org
Subject: Re: git blame <directory> [was: Reducing CPU load on git server]
Date: Wed, 31 Aug 2016 01:42:01 -0400
Message-ID: <20160831054201.ldlwptlmcndjmfwu@sigill.intra.peff.net>
In-Reply-To: <9fe5aa9b-5ba8-2b9a-7feb-58e115be3902@gmail.com>

On Tue, Aug 30, 2016 at 12:46:20PM +0200, Jakub Narębski wrote:

> On 29.08.2016 at 23:31, Jeff King wrote:
> 
> > Blame-tree is a GitHub-specific command (it feeds the main repository
> > view page), and is a known CPU hog. There's more clever caching for that
> > coming down the pipe, but it's not shipped yet.
> 
> I wonder if having support for 'git blame <directory>' in Git core would
> be something interesting to Git users.  I once tried to implement it,
> but it went nowhere.  Would it be hard to implement?

I think there's some interest; I have received a few off-list emails
over the years about it. There was some preliminary discussion long ago:

  http://public-inbox.org/git/20110302164031.GA18233@sigill.intra.peff.net/

The code that runs on GitHub is available in my fork of git. I haven't
submitted it upstream because there are some lingering issues. I
mentioned them on-list in the first few items of:

  http://public-inbox.org/git/20130318121243.GC14789@sigill.intra.peff.net/

That code is in the jk/blame-tree branch of https://github.com/peff/git
if you are interested in addressing them (note that I haven't touched
that code in a few years except for rebasing it forward, so it may have
bitrotted a little).

Here's a snippet from an off-list conversation I had with Dennis (cc'd)
in 2014 (I think he uses that blame-tree code as part of a custom git
web interface):

> The things I think it needs are:
> 
>   1. The max-depth patches need to be reconciled with Duy's pathspec
>      work upstream. The current implementation works only for tree
>      diffs, and is not really part of the pathspec at all.
> 
>   2. Docs/output formats for blame-tree need to be friendlier, as you
>      noticed.
> 
>   3. Blame-tree does not use revision pathspecs at all. This makes it
>      take way longer than it could, because it does not prune away side
>      branches deep in history that affect only paths whose blame we have
>      already found. But the current pathspec code is so slow that using
>      it outweighs the pruning benefit.
> 
>      I have a series, which I just pushed up to jk/faster-blame-tree,
>      which tries to improve this.  But it's got a lot of new, untested
>      code itself (we are not even running it at GitHub yet). It's also
>      based on v1.9.4; I think there are going to be a lot of conflicts
>      with the combine-tree work done in v2.0.
> 
> [...]
> 
> I also think it would probably make sense for blame-tree to support the
> same output formats as git-blame (e.g., to have an identical --porcelain
> mode, to have a reasonable human-readable format by default, etc).
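
To make the pruning in item 3 concrete, here is a toy sketch in Python.
It is not the code from jk/blame-tree and it does not call git at all;
the Commit type, the blame_tree() function, and the six-commit history
are all made up for illustration. Each commit is modeled as a mapping of
path to blob id, and each pending path is handed to a parent that has
the same blob, or blamed on the current commit if no parent does.
Commits that end up with no pending paths are never examined, which is
roughly the effect that pathspec-limited traversal would give:

```python
from typing import Dict, List, NamedTuple, Set, Tuple


class Commit(NamedTuple):
    """Hypothetical stand-in for a commit: a name, parents, and a flat tree."""
    name: str
    parents: List[str]
    tree: Dict[str, str]  # path -> blob id


def blame_tree(history: List[Commit]) -> Tuple[Dict[str, str], int]:
    """Return (path -> commit that last changed it, commits examined).

    Assumes `history` is newest-first in topological order (children
    before parents), with the tip commit at index 0.
    """
    by_name = {c.name: c for c in history}
    tip = history[0]
    # Paths each commit still has to account for; only the tip starts with any.
    pending: Dict[str, Set[str]] = {tip.name: set(tip.tree)}
    blame: Dict[str, str] = {}
    examined = 0

    for commit in history:
        paths = pending.pop(commit.name, set())
        if not paths:
            continue  # nothing left to find here: this commit is pruned
        examined += 1
        for path in sorted(paths):
            blob = commit.tree[path]
            # First parent whose copy of the path is identical, if any.
            same = next(
                (p for p in commit.parents if by_name[p].tree.get(path) == blob),
                None,
            )
            if same is None:
                blame[path] = commit.name  # changed (or introduced) here
            else:
                pending.setdefault(same, set()).add(path)
    return blame, examined


# Made-up history: T is the tip, M merges X with the side branch S1..S2,
# and R is the root. The side branch only ever touches "b".
history = [
    Commit("T", ["M"], {"a": "a2", "b": "b2"}),
    Commit("M", ["X", "S2"], {"a": "a2", "b": "b1"}),
    Commit("X", ["R"], {"a": "a2", "b": "b0"}),
    Commit("S2", ["S1"], {"a": "a1", "b": "b1"}),
    Commit("S1", ["R"], {"a": "a1", "b": "b05"}),
    Commit("R", [], {"a": "a1", "b": "b0"}),
]

blame, examined = blame_tree(history)
print(blame, examined)
```

In this made-up history T already accounts for "b", so the side branch
(S1 and S2) is never examined, and the walk stops once X accounts for
"a": only three of the six commits are looked at. A real implementation
would have to deal with nested trees, renames, and git's actual history
simplification rules, none of which this sketch attempts.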

That's all I could dig out of my archives. I'd be happy if somebody
wanted to pick it up and run with it. Polishing for upstream has been on
my list for several years now, but there's usually something more
important (or interesting) to work on at any given moment.

You might also look at how GitLab does it. I've never talked to them
about it, and as far as I know they do not use blame-tree.

-Peff

