git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Eric Sunshine <sunshine@sunshineco.com>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>,
	Eric Sunshine <sunshine@sunshineco.com>
Subject: [PATCH v2 1/3] contrib: add git-contacts helper
Date: Wed,  3 Jul 2013 16:59:40 -0400	[thread overview]
Message-ID: <1372885182-5512-2-git-send-email-sunshine@sunshineco.com> (raw)
In-Reply-To: <1372885182-5512-1-git-send-email-sunshine@sunshineco.com>

This script lists people that might be interested in a patch by going
back through the history for each patch hunk, and finding people that
reviewed, acknowledge, signed, authored, or were Cc:'d on the code the
patch is modifying.

It does this by running git-blame incrementally on each hunk and then
parsing the commit message. After gathering all participants, it
determines each person's relevance by considering how many commits
mentioned that person compared with the total number of commits under
consideration. The final output consists only of participants who pass a
minimum threshold of participation.

Several conditions controlling a person's significance are currently
hard-coded, such as minimum participation level, blame date-limiting,
and -C level for detecting moved and copied lines. In the future, these
conditions may become configurable.

For example:

  % git contacts 0001-remote-hg-trivial-cleanups.patch
  Felipe Contreras <felipe.contreras@gmail.com>
  Jeff King <peff@peff.net>
  Max Horn <max@quendi.de>
  Junio C Hamano <gitster@pobox.com>

Thus, it can be invoked as git-send-email's --cc-cmd option, among other
possible uses.

This is a Perl rewrite of Felipe Contreras' git-related patch series[1]
written in Ruby.

[1]: http://thread.gmane.org/gmane.comp.version-control.git/226065/

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 contrib/contacts/git-contacts | 127 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 127 insertions(+)
 create mode 100755 contrib/contacts/git-contacts

diff --git a/contrib/contacts/git-contacts b/contrib/contacts/git-contacts
new file mode 100755
index 0000000..6e9ab45
--- /dev/null
+++ b/contrib/contacts/git-contacts
@@ -0,0 +1,127 @@
+#!/usr/bin/perl
+
+# List people who might be interested in a patch.  Useful as the argument to
+# git-send-email --cc-cmd option, and in other situations.
+#
+# Usage: git contacts <file> ...
+
+use strict;
+use warnings;
+use IPC::Open2;
+
+my $since = '5-years-ago';
+my $min_percent = 10;
+my $labels_rx = qr/Signed-off-by|Reviewed-by|Acked-by|Cc/i;
+my %seen;
+
+sub format_contact {
+	my ($name, $email) = @_;
+	return "$name <$email>";
+}
+
+sub parse_commit {
+	my ($commit, $data) = @_;
+	my $contacts = $commit->{contacts};
+	my $inbody = 0;
+	for (split(/^/m, $data)) {
+		if (not $inbody) {
+			if (/^author ([^<>]+) <(\S+)> .+$/) {
+				$contacts->{format_contact($1, $2)} = 1;
+			} elsif (/^$/) {
+				$inbody = 1;
+			}
+		} elsif (/^$labels_rx:\s+([^<>]+)\s+<(\S+?)>$/o) {
+			$contacts->{format_contact($1, $2)} = 1;
+		}
+	}
+}
+
+sub import_commits {
+	my ($commits) = @_;
+	return unless %$commits;
+	my $pid = open2 my $reader, my $writer, qw(git cat-file --batch);
+	for my $id (keys(%$commits)) {
+		print $writer "$id\n";
+		my $line = <$reader>;
+		if ($line =~ /^([0-9a-f]{40}) commit (\d+)/) {
+			my ($cid, $len) = ($1, $2);
+			die "expected $id but got $cid\n" unless $id eq $cid;
+			my $data;
+			# cat-file emits newline after data, so read len+1
+			read $reader, $data, $len + 1;
+			parse_commit($commits->{$id}, $data);
+		}
+	}
+	close $reader;
+	close $writer;
+	waitpid($pid, 0);
+	die "git-cat-file error: $?\n" if $?;
+}
+
+sub get_blame {
+	my ($commits, $source, $start, $len, $from) = @_;
+	$len = 1 unless defined($len);
+	return if $len == 0;
+	open my $f, '-|',
+		qw(git blame --porcelain -C), '-L', "$start,+$len",
+		'--since', $since, "$from^", '--', $source or die;
+	while (<$f>) {
+		if (/^([0-9a-f]{40}) \d+ \d+ \d+$/) {
+			my $id = $1;
+			$commits->{$id} = { id => $id, contacts => {} }
+				unless $seen{$id};
+			$seen{$id} = 1;
+		}
+	}
+	close $f;
+}
+
+sub scan_patches {
+	my ($commits, $f) = @_;
+	my ($id, $source);
+	while (<$f>) {
+		if (/^From ([0-9a-f]{40}) Mon Sep 17 00:00:00 2001$/) {
+			$id = $1;
+			$seen{$id} = 1;
+		}
+		next unless $id;
+		if (m{^--- (?:a/(.+)|/dev/null)$}) {
+			$source = $1;
+		} elsif (/^--- /) {
+			die "Cannot parse hunk source: $_\n";
+		} elsif (/^@@ -(\d+)(?:,(\d+))?/ && $source) {
+			get_blame($commits, $source, $1, $2, $id);
+		}
+	}
+}
+
+sub scan_patch_file {
+	my ($commits, $file) = @_;
+	open my $f, '<', $file or die "read failure: $file: $!\n";
+	scan_patches($commits, $f);
+	close $f;
+}
+
+if (!@ARGV) {
+	die "No input patch files\n";
+}
+
+my %commits;
+for (@ARGV) {
+	scan_patch_file(\%commits, $_);
+}
+import_commits(\%commits);
+
+my %count_per_person;
+for my $commit (values %commits) {
+	for my $contact (keys %{$commit->{contacts}}) {
+		$count_per_person{$contact}++;
+	}
+}
+
+my $ncommits = scalar(keys %commits);
+for my $contact (keys %count_per_person) {
+	my $percent = $count_per_person{$contact} * 100 / $ncommits;
+	next if $percent < $min_percent;
+	print "$contact\n";
+}
-- 
1.8.3.2

  reply	other threads:[~2013-07-03 21:00 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-07-03 20:59 [PATCH v2 0/3] Perl rewrite of Ruby git-related Eric Sunshine
2013-07-03 20:59 ` Eric Sunshine [this message]
2013-07-03 20:59 ` [PATCH v2 2/3] contrib: contacts: add ability to parse from committish Eric Sunshine
2013-07-03 20:59 ` [PATCH v2 3/3] contrib: contacts: interpret committish akin to format-patch Eric Sunshine

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1372885182-5512-2-git-send-email-sunshine@sunshineco.com \
    --to=sunshine@sunshineco.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).