git@vger.kernel.org mailing list mirror (one of many)
From: Barret Rhoden <brho@google.com>
To: "René Scharfe" <l.s.r@web.de>, "Junio C Hamano" <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 2/4] blame: validate and peel the object names on the ignore list
Date: Mon, 12 Oct 2020 16:39:33 -0400
Message-ID: <cd2c51da-55c6-cc5e-2da1-69db90aaf438@google.com>
In-Reply-To: <1fa730c4-eaef-2f32-e1b4-716a27ed4646@web.de>

Hi -

On 10/11/20 12:03 PM, René Scharfe wrote:
[snip]
>> Any performance improvement would be welcome.  I haven't looked at
>> the code in a while, but I don't recall any reasons why this wouldn't
>> work.
> 
> Using a commit flag instead of an oidset would only improve
> performance noticeably if the product of the number of suspects and
> ignored commits was huge, I guess.
> 
> I get weird timings for an ignore file containing basically all commits
> (created with "git log --format=%H").  With Git's own repo and rc1:
> 
> Benchmark #1: ./git-blame --ignore-revs-file hashes Makefile
>    Time (mean ± σ):      8.470 s ±  0.049 s    [User: 7.923 s, System: 0.547 s]
>    Range (min … max):    8.434 s …  8.605 s    10 runs
> 
> And with the patch at the bottom:
> 
> Benchmark #1: ./git-blame --ignore-revs-file hashes Makefile
>    Time (mean ± σ):      8.048 s ±  0.061 s    [User: 7.899 s, System: 0.146 s]
>    Range (min … max):    7.987 s …  8.175 s    10 runs
> 
> That looks like a nice speedup, but why for system time alone?  Malloc
> overhead perhaps?

Hard to say.  Maybe page faults when walking the old ignore_list?

> Anyway, here's the patch:

Looks good to me.

Barret


> ---
>   blame.c         |  2 +-
>   blame.h         |  5 +++--
>   builtin/blame.c | 16 ++++++++++++----
>   object.h        |  3 ++-
>   4 files changed, 18 insertions(+), 8 deletions(-)
> 
> diff --git a/blame.c b/blame.c
> index 686845b2b4..6e8c8fec9b 100644
> --- a/blame.c
> +++ b/blame.c
> @@ -2487,7 +2487,7 @@ static void pass_blame(struct blame_scoreboard *sb, struct blame_origin *origin,
>   	/*
>   	 * Pass remaining suspects for ignored commits to their parents.
>   	 */
> -	if (oidset_contains(&sb->ignore_list, &commit->object.oid)) {
> +	if (commit->object.flags & BLAME_IGNORE) {
>   		for (i = 0, sg = first_scapegoat(revs, commit, sb->reverse);
>   		     i < num_sg && sg;
>   		     sg = sg->next, i++) {
> diff --git a/blame.h b/blame.h
> index b6bbee4147..d35167e8bd 100644
> --- a/blame.h
> +++ b/blame.h
> @@ -16,6 +16,9 @@
>   #define BLAME_DEFAULT_MOVE_SCORE	20
>   #define BLAME_DEFAULT_COPY_SCORE	40
> 
> +/* Remember to update object flag allocation in object.h */
> +#define BLAME_IGNORE	(1u<<14)
> +
>   struct fingerprint;
> 
>   /*
> @@ -125,8 +128,6 @@ struct blame_scoreboard {
>   	/* linked list of blames */
>   	struct blame_entry *ent;
> 
> -	struct oidset ignore_list;
> -
>   	/* look-up a line in the final buffer */
>   	int num_lines;
>   	int *lineno;
> diff --git a/builtin/blame.c b/builtin/blame.c
> index bb0f29300e..1c6721b5d5 100644
> --- a/builtin/blame.c
> +++ b/builtin/blame.c
> @@ -830,21 +830,29 @@ static void build_ignorelist(struct blame_scoreboard *sb,
>   {
>   	struct string_list_item *i;
>   	struct object_id oid;
> +	const struct object_id *o;
> +	struct oidset_iter iter;
> +	struct oidset ignore_list = OIDSET_INIT;
> 
> -	oidset_init(&sb->ignore_list, 0);
>   	for_each_string_list_item(i, ignore_revs_file_list) {
>   		if (!strcmp(i->string, ""))
> -			oidset_clear(&sb->ignore_list);
> +			oidset_clear(&ignore_list);
>   		else
> -			oidset_parse_file_carefully(&sb->ignore_list, i->string,
> +			oidset_parse_file_carefully(&ignore_list, i->string,
>   						    peel_to_commit_oid, sb);
>   	}
>   	for_each_string_list_item(i, ignore_rev_list) {
>   		if (get_oid_committish(i->string, &oid) ||
>   		    peel_to_commit_oid(&oid, sb))
>   			die(_("cannot find revision %s to ignore"), i->string);
> -		oidset_insert(&sb->ignore_list, &oid);
> +		oidset_insert(&ignore_list, &oid);
>   	}
> +	oidset_iter_init(&ignore_list, &iter);
> +	while ((o = oidset_iter_next(&iter))) {
> +		struct commit *commit = lookup_commit(sb->repo, o);
> +		commit->object.flags |= BLAME_IGNORE;
> +	}
> +	oidset_clear(&ignore_list);
>   }
> 
>   int cmd_blame(int argc, const char **argv, const char *prefix)
> diff --git a/object.h b/object.h
> index 20b18805f0..6818c9296b 100644
> --- a/object.h
> +++ b/object.h
> @@ -64,7 +64,8 @@ struct object_array {
>    * negotiator/default.c:       2--5
>    * walker.c:                 0-2
>    * upload-pack.c:                4       11-----14  16-----19
> - * builtin/blame.c:                        12-13
> + * blame.c:                                     14
> + * builtin/blame.c:                        12---14
>    * bisect.c:                                        16
>    * bundle.c:                                        16
>    * http-push.c:                          11-----14
> --
> 2.28.0
> 
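
[Editorial sketch, not part of the original message or patch.] The
trade-off discussed above, testing a flag bit on the object versus
looking the object ID up in a separate set, can be illustrated in
isolation. Everything named demo_* or DEMO_* below is a hypothetical
stand-in and not Git's real struct object, oidset, or BLAME_IGNORE;
the linear scan merely stands in for a set lookup (Git's oidset is a
hash table). It is a minimal sketch under those assumptions: pay for
the lookups once while marking, then answer each later query with a
single bit test.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT Git's real types or flag. */
#define DEMO_ID_LEN 20
#define DEMO_IGNORE (1u << 14)

struct demo_id {
	unsigned char hash[DEMO_ID_LEN];
};

struct demo_object {
	struct demo_id id;
	unsigned flags;
};

/*
 * Set-style lookup: compare the id against every entry of a separate
 * list. Linear here for brevity; a hash table would be faster, but it
 * still touches memory outside the object on every query.
 */
static int demo_list_contains(const struct demo_id *list, size_t nr,
			      const struct demo_id *id)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (!memcmp(list[i].hash, id->hash, DEMO_ID_LEN))
			return 1;
	return 0;
}

/*
 * One-time pass: set DEMO_IGNORE on every object whose id appears in
 * the list. Returns the number of objects marked.
 */
static size_t demo_mark_ignored(struct demo_object *objs, size_t nr_objs,
				const struct demo_id *list, size_t nr_list)
{
	size_t i, marked = 0;

	assert(objs || !nr_objs);
	for (i = 0; i < nr_objs; i++) {
		if (demo_list_contains(list, nr_list, &objs[i].id)) {
			objs[i].flags |= DEMO_IGNORE;
			marked++;
		}
	}
	return marked;
}

/* Later queries only test a bit already stored in the object. */
static int demo_is_ignored(const struct demo_object *obj)
{
	return (obj->flags & DEMO_IGNORE) != 0;
}
```

After demo_mark_ignored() has run, the list can be discarded and
callers only need demo_is_ignored(), which mirrors the shape of the
patch above: build a temporary set, mark the objects, clear the set.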


