git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [RFC PATCH v2 0/3] format-patch --complete / am --exact
@ 2019-10-22 11:45 Vegard Nossum
  2019-10-22 11:45 ` [RFC PATCH v2 1/3] format-patch: add --complete Vegard Nossum
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Vegard Nossum @ 2019-10-22 11:45 UTC (permalink / raw)
  To: Junio C Hamano, git; +Cc: Paolo Bonzini

[I'm intentionally keeping the recipient list short to avoid hitting
the Oracle spam filter on outgoing email, hopefully everybody on the
git side who is interested will receive this via the mailing list and
I will link this submission from the workflows list too.]

Background:

There seems to be a consensus in the Linux kernel development community
that tracking patches, patchsets, reviews, and discussion of said patches
is too difficult. One big problem is that there is often no reference to
the email discussion in git history once the patch has been merged.

In order to simplify the tracking of patches, I proposed in [1] that we
include enough metadata about a patch to reconstruct the commit SHA1s
when emailing patches; this means that, assuming a patchset is based on
a publicly available parent SHA1, we can track email patches in git and
use the git SHA1 as a stable reference to a particular submission or its
corresponding discussion. I basically view this as a foundation on which
we can build a richer kernel development experience without sacrificing
the current email-based workflow.

Since I started working on this feature, I also realised that 'git am'
already has a mechanism to amend changelogs with a reference to the
"Message-Id" of the email of a patch using the --message-id flag, and
while this should IMHO be used a lot more for the kernel, it does not
completely offset the utility of these patches.

I'm sending out an early v2 to get more feedback on the implementation,
exact choice of flags and terminology (--exact, --complete, "metadata",
etc.), changelogs.

Changes since v1:
 - moved metadata to the bottom of the diff
 - fixes to pass existing tests (0023, 3403, 4150, 4256, 5100)
 - handles format=flowed (best effort)
 - better changelogs
 - documentation
 - new tests

Todo:
 - 'git am --no-exact' _with_ known metadata could append the original
   sha1 (and/or mail reference) to the changelog
 - UTF-8/non-ASCII encodings
 - 'git am' error handling (e.g. wrong base)
 - more tests: --range-diff, --base=auto, 'am -s', etc.
 - GPG-signed commits [2]

Out of scope for now:
 - Ted's suggestion of a new flag for the base [3]
 - in-transit mangling
 - minisigs
 - empty commits and/or merge commits [4]

[1] https://lore.kernel.org/workflows/b9fb52b8-8168-6bf0-9a72-1e6c44a281a5@oracle.com/
[2] https://lore.kernel.org/workflows/56664222-6c29-09dc-ef78-7b380b113c4a@oracle.com/
[3] https://lore.kernel.org/workflows/20191017144708.GI25548@mit.edu/
[4] https://lore.kernel.org/workflows/xmqqeezc83i6.fsf@gitster-ct.c.googlers.com/



^ permalink raw reply	[flat|nested] 7+ messages in thread

* [RFC PATCH v2 1/3] format-patch: add --complete
  2019-10-22 11:45 [RFC PATCH v2 0/3] format-patch --complete / am --exact Vegard Nossum
@ 2019-10-22 11:45 ` Vegard Nossum
  2019-10-22 11:45 ` [RFC PATCH v2 2/3] mailinfo: collect commit metadata from mail Vegard Nossum
  2019-10-22 11:45 ` [RFC PATCH v2 3/3] am: add --exact Vegard Nossum
  2 siblings, 0 replies; 7+ messages in thread
From: Vegard Nossum @ 2019-10-22 11:45 UTC (permalink / raw)
  To: Junio C Hamano, git; +Cc: Paolo Bonzini, Vegard Nossum

This option causes the raw commit data to be inserted between the
changelog and the diffstat when you run git-format-patch. With a
following patch to 'git am', this will allow the exact reconstruction
of the commit to the point where the sha1 will be the same.

There is also a new config option controlling the default behaviour:

  format.complete

Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Previous-version: 622a0469a4970c5daac0c0323e2d6a77b3bebbdb
---
 Documentation/config/format.txt    |  7 ++++++
 Documentation/git-format-patch.txt |  9 ++++++++
 builtin/log.c                      | 35 ++++++++++++++++++++++++++++++
 3 files changed, 51 insertions(+)

diff --git a/Documentation/config/format.txt b/Documentation/config/format.txt
index 40cad9278f..3a38679837 100644
--- a/Documentation/config/format.txt
+++ b/Documentation/config/format.txt
@@ -87,6 +87,13 @@ format.useAutoBase::
 	A boolean value which lets you enable the `--base=auto` option of
 	format-patch by default.
 
+format.complete::
+	Provides the default value for the `--complete` option to
+	format-patch. If true, the raw commit metadata (including the
+	SHA1) is included at the bottom of the diff, before the signature.
+	This allows a recipient who has all the parent commits and/or the
+	tree to reconstruct the commit with the same SHA1.
+
 format.notes::
 	Provides the default value for the `--notes` option to
 	format-patch. Accepts a boolean value, or a ref which specifies
diff --git a/Documentation/git-format-patch.txt b/Documentation/git-format-patch.txt
index 2035d4d5d5..74fc6d8a8c 100644
--- a/Documentation/git-format-patch.txt
+++ b/Documentation/git-format-patch.txt
@@ -26,6 +26,7 @@ SYNOPSIS
 		   [--no-notes | --notes[=<ref>]]
 		   [--interdiff=<previous>]
 		   [--range-diff=<previous> [--creation-factor=<percent>]]
+		   [--[no-]complete]
 		   [--progress]
 		   [<common diff options>]
 		   [ <since> | <revision range> ]
@@ -325,6 +326,14 @@ you can use `--suffix=-patch` to get `0001-description-of-my-change-patch`.
 	range are always formatted as creation patches, independently
 	of this flag.
 
+--[no-]complete::
+	Include the raw commit metadata (including the SHA1) at the
+	bottom of the diff, before the signature. This allows a
+	recipient who has all the parent commits and/or the tree to
+	reconstruct the commit with the same SHA1. The default is
+	`--no-complete`, unless the `format.complete` configuration
+	option is set.
+
 --progress::
 	Show progress reports on stderr as patches are generated.
 
diff --git a/builtin/log.c b/builtin/log.c
index 89873d2dc2..822a0838b6 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -783,6 +783,8 @@ enum {
 	COVER_AUTO
 };
 
+static int fmt_complete;
+
 static int git_format_config(const char *var, const char *value, void *cb)
 {
 	struct rev_info *rev = cb;
@@ -888,6 +890,10 @@ static int git_format_config(const char *var, const char *value, void *cb)
 		}
 		return 0;
 	}
+	if (!strcmp(var, "format.complete")) {
+		fmt_complete = git_config_bool(var, value);
+		return 0;
+	}
 
 	return git_log_config(var, value, cb);
 }
@@ -1490,6 +1496,23 @@ static void print_bases(struct base_tree_info *bases, FILE *file)
 	oidclr(&bases->base_commit);
 }
 
+static void print_meta(struct rev_info *opt, struct commit *commit)
+{
+	const char *buffer = get_commit_buffer(commit, NULL);
+	const char *subject;
+
+	fprintf(opt->diffopt.file, "--\n");
+	fprintf(opt->diffopt.file, "commit %s\n", oid_to_hex(&commit->object.oid));
+
+	/*
+	 * TODO: hex-encode to avoid mailer mangling?
+	 */
+	if (find_commit_subject(buffer, &subject))
+		fprintf(opt->diffopt.file, "%.*s", (int) (subject - buffer), buffer);
+	else
+		fprintf(opt->diffopt.file, "%s", buffer);
+}
+
 static const char *diff_title(struct strbuf *sb, int reroll_count,
 		       const char *generic, const char *rerolled)
 {
@@ -1622,6 +1645,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 			    N_("add a signature")),
 		OPT_STRING(0, "base", &base_commit, N_("base-commit"),
 			   N_("add prerequisite tree info to the patch series")),
+		OPT_BOOL(0, "complete", &fmt_complete,
+			 N_("include all the information necessary to reconstruct commit exactly")),
 		OPT_FILENAME(0, "signature-file", &signature_file,
 				N_("add a signature from a file")),
 		OPT__QUIET(&quiet, N_("don't print the patch filenames")),
@@ -1921,6 +1946,14 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 		prepare_bases(&bases, base, list, nr);
 	}
 
+	if (fmt_complete) {
+		/*
+		 * We need the commit buffer so that we can output the exact
+		 * sequence of bytes that gets hashed as part of a commit.
+		 */
+		save_commit_buffer = 1;
+	}
+
 	if (in_reply_to || thread || cover_letter)
 		rev.ref_message_ids = xcalloc(1, sizeof(struct string_list));
 	if (in_reply_to) {
@@ -2004,6 +2037,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 			rev.shown_one = 0;
 		if (shown) {
 			print_bases(&bases, rev.diffopt.file);
+			if (fmt_complete)
+				print_meta(&rev, commit);
 			if (rev.mime_boundary)
 				fprintf(rev.diffopt.file, "\n--%s%s--\n\n\n",
 				       mime_boundary_leader,
--
commit 7fa59e79ef5474fb4c657fb2ff6a8341cc17c897
tree 891d334e23f950afbaaafc182384309fd8c73e48
parent d966095db01190a2196e31195ea6fa0c722aa732
author Vegard Nossum <vegard.nossum@oracle.com> 1570284959 +0200
committer Vegard Nossum <vegard.nossum@oracle.com> 1571666151 +0200

-- 
2.24.0.rc0.3.g4ba423c3c2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [RFC PATCH v2 2/3] mailinfo: collect commit metadata from mail
  2019-10-22 11:45 [RFC PATCH v2 0/3] format-patch --complete / am --exact Vegard Nossum
  2019-10-22 11:45 ` [RFC PATCH v2 1/3] format-patch: add --complete Vegard Nossum
@ 2019-10-22 11:45 ` Vegard Nossum
  2019-10-27 18:44   ` brian m. carlson
  2019-10-22 11:45 ` [RFC PATCH v2 3/3] am: add --exact Vegard Nossum
  2 siblings, 1 reply; 7+ messages in thread
From: Vegard Nossum @ 2019-10-22 11:45 UTC (permalink / raw)
  To: Junio C Hamano, git; +Cc: Paolo Bonzini, Vegard Nossum

"Metadata" here is used for the raw commit metadata which gets included
in a patch when using 'git format-patch --complete'.

The new email format is roughly:

  1. email headers (ends with a blank line)
  2. changelog (ends with "---" or when the diff starts)
  3. comments (optional; ends when the diff starts)
  4. diff itself (ends when there are no more files/hunks)
  5. metadata (optional; starts with "--" and then "commit [...]")
  6. signature (optional; starts with "--")

Traditionally, the comments and signature were counted as part of the diff,
and although the metadata now appears between the diff and the signature
(and is extracted into its own file) the signature is also output together
with the diff; this breaks no existing test cases and seems to be the most
backwards compatible.

Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Previous-version: 51bb531eb57320caf3761680ebf77c25b89b3719
---
 Documentation/git-mailinfo.txt |   6 +-
 builtin/am.c                   |   2 +-
 builtin/mailinfo.c             |  13 ++-
 mailinfo.c                     | 154 ++++++++++++++++++++++++++++++++-
 mailinfo.h                     |   9 +-
 t/t5100-mailinfo.sh            |  22 +++++
 t/t5100/meta-info0001          |   5 ++
 t/t5100/meta-meta0001          |  23 +++++
 t/t5100/meta-msg0001           |   6 ++
 t/t5100/meta-patch0001         |  76 ++++++++++++++++
 t/t5100/meta-samples.mbox      | 133 ++++++++++++++++++++++++++++
 11 files changed, 439 insertions(+), 10 deletions(-)
 create mode 100644 t/t5100/meta-info0001
 create mode 100644 t/t5100/meta-meta0001
 create mode 100644 t/t5100/meta-msg0001
 create mode 100644 t/t5100/meta-patch0001
 create mode 100644 t/t5100/meta-samples.mbox

diff --git a/Documentation/git-mailinfo.txt b/Documentation/git-mailinfo.txt
index 3bbc731f67..14a2cca08c 100644
--- a/Documentation/git-mailinfo.txt
+++ b/Documentation/git-mailinfo.txt
@@ -9,7 +9,7 @@ git-mailinfo - Extracts patch and authorship from a single e-mail message
 SYNOPSIS
 --------
 [verse]
-'git mailinfo' [-k|-b] [-u | --encoding=<encoding> | -n] [--[no-]scissors] <msg> <patch>
+'git mailinfo' [-k|-b] [-u | --encoding=<encoding> | -n] [--[no-]scissors] <msg> <patch> [<meta>]
 
 
 DESCRIPTION
@@ -21,6 +21,10 @@ written out to the standard output to be used by 'git am'
 to create a commit.  It is usually not necessary to use this
 command directly.  See linkgit:git-am[1] instead.
 
+If specified, <meta> specifies a filename where commit metadata
+will be written. If the e-mail does not contain such information,
+this file will be empty.
+
 
 OPTIONS
 -------
diff --git a/builtin/am.c b/builtin/am.c
index 8181c2aef3..4190383bba 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -1159,7 +1159,7 @@ static int parse_mail(struct am_state *state, const char *mail)
 
 	mi.input = xfopen(mail, "r");
 	mi.output = xfopen(am_path(state, "info"), "w");
-	if (mailinfo(&mi, am_path(state, "msg"), am_path(state, "patch")))
+	if (mailinfo(&mi, am_path(state, "msg"), am_path(state, "patch"), am_path(state, "meta")))
 		die("could not parse patch");
 
 	fclose(mi.input);
diff --git a/builtin/mailinfo.c b/builtin/mailinfo.c
index cfb667a594..6c0746fa8e 100644
--- a/builtin/mailinfo.c
+++ b/builtin/mailinfo.c
@@ -9,14 +9,14 @@
 #include "mailinfo.h"
 
 static const char mailinfo_usage[] =
-	"git mailinfo [-k | -b] [-m | --message-id] [-u | --encoding=<encoding> | -n] [--scissors | --no-scissors] <msg> <patch> < mail >info";
+	"git mailinfo [-k | -b] [-m | --message-id] [-u | --encoding=<encoding> | -n] [--scissors | --no-scissors] <msg> <patch> [<meta>] < mail >info";
 
 int cmd_mailinfo(int argc, const char **argv, const char *prefix)
 {
 	const char *def_charset;
 	struct mailinfo mi;
 	int status;
-	char *msgfile, *patchfile;
+	char *msgfile, *patchfile, *metafile;
 
 	setup_mailinfo(&mi);
 
@@ -47,7 +47,7 @@ int cmd_mailinfo(int argc, const char **argv, const char *prefix)
 		argc--; argv++;
 	}
 
-	if (argc != 3)
+	if (argc < 3 || argc > 4)
 		usage(mailinfo_usage);
 
 	mi.input = stdin;
@@ -56,10 +56,15 @@ int cmd_mailinfo(int argc, const char **argv, const char *prefix)
 	msgfile = prefix_filename(prefix, argv[1]);
 	patchfile = prefix_filename(prefix, argv[2]);
 
-	status = !!mailinfo(&mi, msgfile, patchfile);
+	metafile = NULL;
+	if (argc == 4)
+		metafile = prefix_filename(prefix, argv[3]);
+
+	status = !!mailinfo(&mi, msgfile, patchfile, metafile);
 	clear_mailinfo(&mi);
 
 	free(msgfile);
 	free(patchfile);
+	free(metafile);
 	return status;
 }
diff --git a/mailinfo.c b/mailinfo.c
index b395adbdf2..173cb58f6b 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -825,10 +825,139 @@ static int handle_commit_msg(struct mailinfo *mi, struct strbuf *line)
 	return 0;
 }
 
-static void handle_patch(struct mailinfo *mi, const struct strbuf *line)
+enum patch_parser_state {
+	SKIP_UNTIL_DIFF,
+	NEW_DIFF_OR_HUNK,
+	MATCH_HUNK_HEADER,
+	HUNK,
+};
+
+static int handle_patch(struct mailinfo *mi, const struct strbuf *line)
 {
+	int old_pos, old_size;
+	int new_pos, new_size;
+
+	switch (mi->patch_stage) {
+	case SKIP_UNTIL_DIFF:
+		if (starts_with(line->buf, "diff "))
+			mi->patch_stage = MATCH_HUNK_HEADER;
+
+		break;
+
+	case NEW_DIFF_OR_HUNK:
+		if (starts_with(line->buf, "diff ")) {
+			mi->patch_stage = MATCH_HUNK_HEADER;
+			break;
+		}
+
+		/* fallthrough */
+
+	case MATCH_HUNK_HEADER:
+		if (sscanf(line->buf, "@@ -%d,%d +%d,%d @@", &old_pos, &old_size, &new_pos, &new_size) == 4
+			|| sscanf(line->buf, "@@ -%d,%d +%d @@", &old_pos, &old_size, &new_size) == 3
+			|| sscanf(line->buf, "@@ -%d +%d,%d @@", &old_size, &old_pos, &new_size) == 3
+			|| sscanf(line->buf, "@@ -%d +%d @@", &old_size, &new_size) == 2)
+		{
+			mi->old_size = old_size;
+			mi->new_size = new_size;
+
+			if (old_size > 0 || new_size > 0)
+				mi->patch_stage = HUNK;
+
+			break;
+		}
+
+		if (mi->patch_stage == NEW_DIFF_OR_HUNK) {
+			/*
+			 * We expected a 'diff' line or a hunk header and
+			 * didn't find either; this must be the end of the
+			 * patch!
+			 */
+			return 1;
+		}
+
+		/*
+		 * We were looking for a hunk header, but found something
+		 * else, probably an 'index' line or a '---' or '+++' line.
+		 */
+		break;
+
+	case HUNK:
+		if (line->buf[0] == '-' || line->buf[0] == ' ')
+			--mi->old_size;
+		if (line->buf[0] == '+' || line->buf[0] == ' ')
+			--mi->new_size;
+
+		if (mi->format_flowed) {
+			/*
+			 * format=flowed eats blanks at line ends
+			 */
+			if (line->buf[0] == '\r' || line->buf[0] == '\n') {
+				--mi->old_size;
+				--mi->new_size;
+			}
+		}
+
+		if (mi->old_size <= 0 && mi->new_size <= 0) {
+			/* done with hunk */
+			mi->patch_stage = NEW_DIFF_OR_HUNK;
+		}
+
+		break;
+	}
+
+	fwrite(line->buf, 1, line->len, mi->patchfile);
+	mi->patch_lines++;
+	return 0;
+}
+
+/*
+ * At the end of the diff, we find a signature and (optionally) metadata.
+ *
+ * "metadata" is whatever info we need to reconstruct a commit perfectly.
+ * It does _not_ include the signature or other lines.
+ *
+ * The way we parse this is a bit strange: there may be some lines before
+ * the metadata which logically belong to the signature, even though the
+ * actual signature appears after the metadata (e.g. sometimes blank lines
+ * appear between the diff and the signature, and we don't want to count
+ * them among the metadata).
+ */
+static void handle_signature(struct mailinfo *mi, const struct strbuf *line)
+{
+	switch (mi->meta_stage) {
+	case 0:
+		/* Done with the diff */
+		if (starts_with(line->buf, "commit ")) {
+			if (mi->metafile)
+				fwrite(line->buf, 1, line->len, mi->metafile);
+
+			mi->meta_stage = 1;
+			return;
+		}
+
+		break;
+
+	case 1:
+		/* Metadata */
+		if (starts_with(line->buf, "--")) {
+			mi->meta_stage = 2;
+			return;
+		}
+
+		if (mi->metafile)
+			fwrite(line->buf, 1, line->len, mi->metafile);
+
+		return;
+
+	case 2:
+		/* Done with metadata */
+		break;
+	}
+
 	fwrite(line->buf, 1, line->len, mi->patchfile);
 	mi->patch_lines++;
+	return;
 }
 
 static void handle_filter(struct mailinfo *mi, struct strbuf *line)
@@ -840,7 +969,12 @@ static void handle_filter(struct mailinfo *mi, struct strbuf *line)
 		mi->filter_stage++;
 		/* fallthrough */
 	case 1:
-		handle_patch(mi, line);
+		if (!handle_patch(mi, line))
+			break;
+		mi->filter_stage++;
+		/* fallthrough */
+	case 2:
+		handle_signature(mi, line);
 		break;
 	}
 }
@@ -1145,7 +1279,7 @@ static void handle_info(struct mailinfo *mi)
 	fprintf(mi->output, "\n");
 }
 
-int mailinfo(struct mailinfo *mi, const char *msg, const char *patch)
+int mailinfo(struct mailinfo *mi, const char *msg, const char *patch, const char *meta)
 {
 	FILE *cmitmsg;
 	int peek;
@@ -1163,6 +1297,16 @@ int mailinfo(struct mailinfo *mi, const char *msg, const char *patch)
 		return -1;
 	}
 
+	if (meta) {
+		mi->metafile = fopen(meta, "w");
+		if (!mi->metafile) {
+			perror(meta);
+			fclose(mi->patchfile);
+			fclose(cmitmsg);
+			return -1;
+		}
+	}
+
 	mi->p_hdr_data = xcalloc(MAX_HDR_PARSED, sizeof(*(mi->p_hdr_data)));
 	mi->s_hdr_data = xcalloc(MAX_HDR_PARSED, sizeof(*(mi->s_hdr_data)));
 
@@ -1183,6 +1327,8 @@ int mailinfo(struct mailinfo *mi, const char *msg, const char *patch)
 	fwrite(mi->log_message.buf, 1, mi->log_message.len, cmitmsg);
 	fclose(cmitmsg);
 	fclose(mi->patchfile);
+	if (mi->metafile)
+		fclose(mi->metafile);
 
 	handle_info(mi);
 	strbuf_release(&line);
@@ -1212,6 +1358,8 @@ void setup_mailinfo(struct mailinfo *mi)
 	strbuf_init(&mi->log_message, 0);
 	strbuf_init(&mi->inbody_header_accum, 0);
 	mi->header_stage = 1;
+	mi->patch_stage = 0;
+	mi->meta_stage = 0;
 	mi->use_inbody_headers = 1;
 	mi->content_top = mi->content;
 	git_config(git_mailinfo_config, mi);
diff --git a/mailinfo.h b/mailinfo.h
index 79b1d6774e..fdf3ba2674 100644
--- a/mailinfo.h
+++ b/mailinfo.h
@@ -9,6 +9,7 @@ struct mailinfo {
 	FILE *input;
 	FILE *output;
 	FILE *patchfile;
+	FILE *metafile;
 
 	struct strbuf name;
 	struct strbuf email;
@@ -31,16 +32,22 @@ struct mailinfo {
 	int patch_lines;
 	int filter_stage; /* still reading log or are we copying patch? */
 	int header_stage; /* still checking in-body headers? */
+	int patch_stage;
+	int meta_stage;
 	struct strbuf inbody_header_accum;
 	struct strbuf **p_hdr_data;
 	struct strbuf **s_hdr_data;
 
 	struct strbuf log_message;
 	int input_error;
+
+	/* handle_patch() */
+	int old_size;
+	int new_size;
 };
 
 void setup_mailinfo(struct mailinfo *);
-int mailinfo(struct mailinfo *, const char *msg, const char *patch);
+int mailinfo(struct mailinfo *, const char *msg, const char *patch, const char *meta);
 void clear_mailinfo(struct mailinfo *);
 
 #endif /* MAILINFO_H */
diff --git a/t/t5100-mailinfo.sh b/t/t5100-mailinfo.sh
index 9690dcad4f..76303bfaee 100755
--- a/t/t5100-mailinfo.sh
+++ b/t/t5100-mailinfo.sh
@@ -66,6 +66,28 @@ do
 	'
 done
 
+
+test_expect_success 'split box with metadata samples' \
+	'mkdir meta &&
+	git mailsplit -ometa "$DATA/meta-samples.mbox" \
+	  >meta/last &&
+	last=$(cat meta/last) &&
+	echo total is $last &&
+	test $(cat meta/last) = 1'
+
+for mail in meta/00*
+do
+	n=$(basename $mail)
+	test_expect_success "mailinfo $mail" '
+		git mailinfo -u "$mail-msg" "$mail-patch" "$mail-meta" <"$mail" >"$mail-info" &&
+		test_cmp "$DATA/meta-msg$n" "$mail-msg" &&
+		test_cmp "$DATA/meta-patch$n" "$mail-patch" &&
+		test_cmp "$DATA/meta-meta$n" "$mail-meta" &&
+		test_cmp "$DATA/meta-info$n" "$mail-info"
+	'
+done
+
+
 test_expect_success 'respect NULs' '
 
 	git mailsplit -d3 -o. "$DATA/nul-plain" &&
diff --git a/t/t5100/meta-info0001 b/t/t5100/meta-info0001
new file mode 100644
index 0000000000..69889dc9ce
--- /dev/null
+++ b/t/t5100/meta-info0001
@@ -0,0 +1,5 @@
+Author: Vegard Nossum
+Email: vegard.nossum@oracle.com
+Subject: format-patch: add --complete
+Date: Sun, 20 Oct 2019 21:40:24 +0200
+
diff --git a/t/t5100/meta-meta0001 b/t/t5100/meta-meta0001
new file mode 100644
index 0000000000..1db7e9f715
--- /dev/null
+++ b/t/t5100/meta-meta0001
@@ -0,0 +1,23 @@
+commit 763f9b1cfd69ade5e22dcdcdc35d144697675a93
+tree 43a6b213a2891b5a863775771cab0c0dba3730dc
+parent 108b97dc372828f0e72e56bbb40cae8e1e83ece6
+author Vegard Nossum <vegard.nossum@oracle.com> 1570284959 +0200
+committer Vegard Nossum <vegard.nossum@oracle.com> 1571599909 +0200
+gpgsig -----BEGIN PGP SIGNATURE-----
+ Version: GnuPG v1
+ 
+ iQIcBAABAgAGBQJdrLYmAAoJEAvO9Nj+mLpY5zIQAJkdnnZMrCVme64r43M/KMm0
+ W1fmdeXiIMrI7i0McBdAsQ/KQ5yD4HBvaJWCyI0/g6IeLgVBf9//9xq4Y32iqsKn
+ XRut2Pk3H3Az0WfUlpLpDJUgzz7er8t/glaKnESb94XR/ac59tEELbePh+bXsFLH
+ 3+v8Y78zeHJd6ZLKrKmLyq/9ZaJQR+9xmGdKzZdnwM+8seE4aOhM1VtA8ik68Tn6
+ Tbaofp1jbsXcyY4nBG9GxK14wnb/8OZmQlH4J40LsZT4KDWQNWighGig8ude7SJa
+ 6FJXWJPLfOB6r2ThiJUnrf/UXeHbvYUWITiYoWOxEVb6c7RfNLqDbbGF9VQfTD0n
+ SEFO5NqOs6KofaYzALprtgpMrqksRHeLc7Ouh9xgLyLZLx/669I9lo8M1aQ7RJMy
+ V1KDG6sYbFrgy4gQ/4xqXj3NpBmMb/VcjOnCj3j040wo8q7hlpzb6ev5lcqAFEuP
+ y1owwhljMjqAdGIBw6sLVn2on+6gEQuIjbkoapBktPDw7xEpOKe9rzTGcWRRyANs
+ Z+pMWbn8c6TKonokNjERy0iPnu2t2j8x1YpqKdjY+oq8ApNZFMlU1U+UtXFfuLw5
+ ZKR5DtmXxWzvd+nKBenjzXyOt33v5eq4I/WMfATauXBgFu75mbfiKIFVD5VeMfxq
+ DoiIqvLtW+DDUkH99zXm
+ =2APb
+ -----END PGP SIGNATURE-----
+
diff --git a/t/t5100/meta-msg0001 b/t/t5100/meta-msg0001
new file mode 100644
index 0000000000..cc0690657d
--- /dev/null
+++ b/t/t5100/meta-msg0001
@@ -0,0 +1,6 @@
+Include the raw commit data between the changelog and the diffstat.
+This will allow 'git am' to reconstruct the commit exactly to the point
+where the sha1 will be the same.
+
+Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
+Previous-version: 622a0469a4970c5daac0c0323e2d6a77b3bebbdb
diff --git a/t/t5100/meta-patch0001 b/t/t5100/meta-patch0001
new file mode 100644
index 0000000000..8b926834ce
--- /dev/null
+++ b/t/t5100/meta-patch0001
@@ -0,0 +1,76 @@
+---
+ builtin/log.c | 30 ++++++++++++++++++++++++++++++
+ 1 file changed, 30 insertions(+)
+
+diff --git a/builtin/log.c b/builtin/log.c
+index c4b35fdaf9..cd96d579a7 100644
+--- a/builtin/log.c
++++ b/builtin/log.c
+@@ -1490,6 +1490,23 @@ static void print_bases(struct base_tree_info *bases, FILE *file)
+ 	oidclr(&bases->base_commit);
+ }
+ 
++static void print_meta(struct rev_info *opt, struct commit *commit)
++{
++	const char *buffer = get_commit_buffer(commit, NULL);
++	const char *subject;
++
++	fprintf(opt->diffopt.file, "--\n");
++	fprintf(opt->diffopt.file, "commit %s\n", oid_to_hex(&commit->object.oid));
++
++	/*
++	 * TODO: hex-encode to avoid mailer mangling?
++	 */
++	if (find_commit_subject(buffer, &subject))
++		fprintf(opt->diffopt.file, "%.*s", (int) (subject - buffer), buffer);
++	else
++		fprintf(opt->diffopt.file, "%s", buffer);
++}
++
+ static const char *diff_title(struct strbuf *sb, int reroll_count,
+ 		       const char *generic, const char *rerolled)
+ {
+@@ -1545,6 +1562,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
+ 	char *branch_name = NULL;
+ 	char *base_commit = NULL;
+ 	struct base_tree_info bases;
++	int complete = 0;
+ 	int show_progress = 0;
+ 	struct progress *progress = NULL;
+ 	struct oid_array idiff_prev = OID_ARRAY_INIT;
+@@ -1622,6 +1640,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
+ 			    N_("add a signature")),
+ 		OPT_STRING(0, "base", &base_commit, N_("base-commit"),
+ 			   N_("add prerequisite tree info to the patch series")),
++		OPT_BOOL(0, "complete", &complete,
++			 N_("include all the information necessary to reconstruct commit exactly")),
+ 		OPT_FILENAME(0, "signature-file", &signature_file,
+ 				N_("add a signature from a file")),
+ 		OPT__QUIET(&quiet, N_("don't print the patch filenames")),
+@@ -1905,6 +1925,14 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
+ 		prepare_bases(&bases, base, list, nr);
+ 	}
+ 
++	if (complete) {
++		/*
++		 * We need the commit buffer so that we can output the exact
++		 * sequence of bytes that gets hashed as part of a commit.
++		 */
++		save_commit_buffer = 1;
++	}
++
+ 	if (in_reply_to || thread || cover_letter)
+ 		rev.ref_message_ids = xcalloc(1, sizeof(struct string_list));
+ 	if (in_reply_to) {
+@@ -1988,6 +2016,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
+ 			rev.shown_one = 0;
+ 		if (shown) {
+ 			print_bases(&bases, rev.diffopt.file);
++			if (complete)
++				print_meta(&rev, commit);
+ 			if (rev.mime_boundary)
+ 				fprintf(rev.diffopt.file, "\n--%s%s--\n\n\n",
+ 				       mime_boundary_leader,
+--
+2.23.0.718.g5ad94255a8
+
diff --git a/t/t5100/meta-samples.mbox b/t/t5100/meta-samples.mbox
new file mode 100644
index 0000000000..ce68687b0e
--- /dev/null
+++ b/t/t5100/meta-samples.mbox
@@ -0,0 +1,133 @@
+From nobody Mon Sep 17 00:00:00 2001
+Received: from aserp3020.oracle.com (/141.146.126.70)
+	by default (Oracle Beehive Gateway v4.0)
+	with ESMTP ; Sun, 20 Oct 2019 12:42:33 -0700
+Received: from pps.filterd (aserp3020.oracle.com [127.0.0.1])
+	by aserp3020.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x9KJcZax145871
+	for <vegard.nossum@oracle.com>; Sun, 20 Oct 2019 19:42:33 GMT
+Authentication-Results: aserp3010.oracle.com;
+	spf=softfail smtp.mailfrom=vegard.nossum@oracle.com;
+	dmarc=none header.from=oracle.com
+Received: from t460.home (dhcp-10-175-57-121.vpn.oracle.com [10.175.57.121])
+	by aserp3020.oracle.com with ESMTP id 2vrbyxapj4-1
+	for <vegard.nossum@oracle.com>; Sun, 20 Oct 2019 19:42:33 +0000
+From: Vegard Nossum <vegard.nossum@oracle.com>
+To: vegard.nossum@oracle.com
+Subject: [PATCH 1/3] format-patch: add --complete
+Date: Sun, 20 Oct 2019 21:40:24 +0200
+Message-Id: <20191020194026.19245-1-vegard.nossum@oracle.com>
+X-Mailer: git-send-email 2.23.0.718.g5ad94255a8
+MIME-Version: 1.0
+Content-Transfer-Encoding: 8bit
+X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9416 signatures=668684
+X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=1 malwarescore=0
+ phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=676
+ adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1
+ engine=8.0.1-1908290000 definitions=main-1910200201
+
+Include the raw commit data between the changelog and the diffstat.
+This will allow 'git am' to reconstruct the commit exactly to the point
+where the sha1 will be the same.
+
+Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
+Previous-version: 622a0469a4970c5daac0c0323e2d6a77b3bebbdb
+---
+ builtin/log.c | 30 ++++++++++++++++++++++++++++++
+ 1 file changed, 30 insertions(+)
+
+diff --git a/builtin/log.c b/builtin/log.c
+index c4b35fdaf9..cd96d579a7 100644
+--- a/builtin/log.c
++++ b/builtin/log.c
+@@ -1490,6 +1490,23 @@ static void print_bases(struct base_tree_info *bases, FILE *file)
+ 	oidclr(&bases->base_commit);
+ }
+ 
++static void print_meta(struct rev_info *opt, struct commit *commit)
++{
++	const char *buffer = get_commit_buffer(commit, NULL);
++	const char *subject;
++
++	fprintf(opt->diffopt.file, "--\n");
++	fprintf(opt->diffopt.file, "commit %s\n", oid_to_hex(&commit->object.oid));
++
++	/*
++	 * TODO: hex-encode to avoid mailer mangling?
++	 */
++	if (find_commit_subject(buffer, &subject))
++		fprintf(opt->diffopt.file, "%.*s", (int) (subject - buffer), buffer);
++	else
++		fprintf(opt->diffopt.file, "%s", buffer);
++}
++
+ static const char *diff_title(struct strbuf *sb, int reroll_count,
+ 		       const char *generic, const char *rerolled)
+ {
+@@ -1545,6 +1562,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
+ 	char *branch_name = NULL;
+ 	char *base_commit = NULL;
+ 	struct base_tree_info bases;
++	int complete = 0;
+ 	int show_progress = 0;
+ 	struct progress *progress = NULL;
+ 	struct oid_array idiff_prev = OID_ARRAY_INIT;
+@@ -1622,6 +1640,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
+ 			    N_("add a signature")),
+ 		OPT_STRING(0, "base", &base_commit, N_("base-commit"),
+ 			   N_("add prerequisite tree info to the patch series")),
++		OPT_BOOL(0, "complete", &complete,
++			 N_("include all the information necessary to reconstruct commit exactly")),
+ 		OPT_FILENAME(0, "signature-file", &signature_file,
+ 				N_("add a signature from a file")),
+ 		OPT__QUIET(&quiet, N_("don't print the patch filenames")),
+@@ -1905,6 +1925,14 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
+ 		prepare_bases(&bases, base, list, nr);
+ 	}
+ 
++	if (complete) {
++		/*
++		 * We need the commit buffer so that we can output the exact
++		 * sequence of bytes that gets hashed as part of a commit.
++		 */
++		save_commit_buffer = 1;
++	}
++
+ 	if (in_reply_to || thread || cover_letter)
+ 		rev.ref_message_ids = xcalloc(1, sizeof(struct string_list));
+ 	if (in_reply_to) {
+@@ -1988,6 +2016,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
+ 			rev.shown_one = 0;
+ 		if (shown) {
+ 			print_bases(&bases, rev.diffopt.file);
++			if (complete)
++				print_meta(&rev, commit);
+ 			if (rev.mime_boundary)
+ 				fprintf(rev.diffopt.file, "\n--%s%s--\n\n\n",
+ 				       mime_boundary_leader,
+--
+commit 763f9b1cfd69ade5e22dcdcdc35d144697675a93
+tree 43a6b213a2891b5a863775771cab0c0dba3730dc
+parent 108b97dc372828f0e72e56bbb40cae8e1e83ece6
+author Vegard Nossum <vegard.nossum@oracle.com> 1570284959 +0200
+committer Vegard Nossum <vegard.nossum@oracle.com> 1571599909 +0200
+gpgsig -----BEGIN PGP SIGNATURE-----
+ Version: GnuPG v1
+ 
+ iQIcBAABAgAGBQJdrLYmAAoJEAvO9Nj+mLpY5zIQAJkdnnZMrCVme64r43M/KMm0
+ W1fmdeXiIMrI7i0McBdAsQ/KQ5yD4HBvaJWCyI0/g6IeLgVBf9//9xq4Y32iqsKn
+ XRut2Pk3H3Az0WfUlpLpDJUgzz7er8t/glaKnESb94XR/ac59tEELbePh+bXsFLH
+ 3+v8Y78zeHJd6ZLKrKmLyq/9ZaJQR+9xmGdKzZdnwM+8seE4aOhM1VtA8ik68Tn6
+ Tbaofp1jbsXcyY4nBG9GxK14wnb/8OZmQlH4J40LsZT4KDWQNWighGig8ude7SJa
+ 6FJXWJPLfOB6r2ThiJUnrf/UXeHbvYUWITiYoWOxEVb6c7RfNLqDbbGF9VQfTD0n
+ SEFO5NqOs6KofaYzALprtgpMrqksRHeLc7Ouh9xgLyLZLx/669I9lo8M1aQ7RJMy
+ V1KDG6sYbFrgy4gQ/4xqXj3NpBmMb/VcjOnCj3j040wo8q7hlpzb6ev5lcqAFEuP
+ y1owwhljMjqAdGIBw6sLVn2on+6gEQuIjbkoapBktPDw7xEpOKe9rzTGcWRRyANs
+ Z+pMWbn8c6TKonokNjERy0iPnu2t2j8x1YpqKdjY+oq8ApNZFMlU1U+UtXFfuLw5
+ ZKR5DtmXxWzvd+nKBenjzXyOt33v5eq4I/WMfATauXBgFu75mbfiKIFVD5VeMfxq
+ DoiIqvLtW+DDUkH99zXm
+ =2APb
+ -----END PGP SIGNATURE-----
+
+-- 
+2.23.0.718.g5ad94255a8
+
--
commit 53da20012e763a65071bfe0a42fbf4968d0e1e49
tree 46bdf2b042c66643bacf803bb9ffc28043343d5e
parent 7fa59e79ef5474fb4c657fb2ff6a8341cc17c897
author Vegard Nossum <vegard.nossum@oracle.com> 1571184248 +0200
committer Vegard Nossum <vegard.nossum@oracle.com> 1571666151 +0200

-- 
2.24.0.rc0.3.g4ba423c3c2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [RFC PATCH v2 3/3] am: add --exact
  2019-10-22 11:45 [RFC PATCH v2 0/3] format-patch --complete / am --exact Vegard Nossum
  2019-10-22 11:45 ` [RFC PATCH v2 1/3] format-patch: add --complete Vegard Nossum
  2019-10-22 11:45 ` [RFC PATCH v2 2/3] mailinfo: collect commit metadata from mail Vegard Nossum
@ 2019-10-22 11:45 ` Vegard Nossum
  2 siblings, 0 replies; 7+ messages in thread
From: Vegard Nossum @ 2019-10-22 11:45 UTC (permalink / raw)
  To: Junio C Hamano, git; +Cc: Paolo Bonzini, Vegard Nossum, Paul Tan

This uses exact metadata when creating the commit object, hopefully
reconstructing the commit with the exact same SHA1.

Note: In order to be forwards compatible with new commit formats we
may want a new helper for creating a commit with the exact metadata
that is present (and then validating the result) as opposed to trying
to parse the metadata and pass it piecewise to commit_tree().

Previous-version: 3120370db888889f32e07a082edb4722db8feef1
Cc: Paul Tan <pyokagan@gmail.com>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
---
 Documentation/git-am.txt |   9 +++-
 builtin/am.c             | 111 +++++++++++++++++++++++++++++++++++----
 t/t4150-am.sh            |  30 +++++++++++
 3 files changed, 138 insertions(+), 12 deletions(-)

diff --git a/Documentation/git-am.txt b/Documentation/git-am.txt
index fc3b993c33..5b75596aaf 100644
--- a/Documentation/git-am.txt
+++ b/Documentation/git-am.txt
@@ -9,7 +9,7 @@ git-am - Apply a series of patches from a mailbox
 SYNOPSIS
 --------
 [verse]
-'git am' [--signoff] [--keep] [--[no-]keep-cr] [--[no-]utf8]
+'git am' [--[no-]exact] [--signoff] [--keep] [--[no-]keep-cr] [--[no-]utf8]
 	 [--[no-]3way] [--interactive] [--committer-date-is-author-date]
 	 [--ignore-date] [--ignore-space-change | --ignore-whitespace]
 	 [--whitespace=<option>] [-C<n>] [-p<n>] [--directory=<dir>]
@@ -31,6 +31,13 @@ OPTIONS
 	supply this argument, the command reads from the standard input.
 	If you supply directories, they will be treated as Maildirs.
 
+-e::
+--[no-]exact::
+	Reconstruct the exact commit that the patch was generated from,
+	assuming the mail contains complete metadata (i.e. it was generated
+	using `git format-patch --complete`). This is only possible if all
+	the parent commits are available in the repository.
+
 -s::
 --signoff::
 	Add a `Signed-off-by:` line to the commit message, using
diff --git a/builtin/am.c b/builtin/am.c
index 4190383bba..c0fc27a2ae 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -118,6 +118,7 @@ struct am_state {
 	int allow_rerere_autoupdate;
 	const char *sign_commit;
 	int rebasing;
+	int exact;
 };
 
 /**
@@ -399,6 +400,9 @@ static void am_load(struct am_state *state)
 
 	state->rebasing = !!file_exists(am_path(state, "rebasing"));
 
+	read_state_file(&sb, state, "exact", 1);
+	state->exact = !strcmp(sb.buf, "t");
+
 	strbuf_release(&sb);
 }
 
@@ -1005,6 +1009,8 @@ static void am_setup(struct am_state *state, enum patch_format patch_format,
 	else
 		write_state_text(state, "applying", "");
 
+	write_state_bool(state, "exact", state->exact);
+
 	if (!get_oid("HEAD", &curr_head)) {
 		write_state_text(state, "abort-safety", oid_to_hex(&curr_head));
 		if (!state->rebasing)
@@ -1548,40 +1554,121 @@ static int fall_back_threeway(const struct am_state *state, const char *index_pa
  */
 static void do_commit(const struct am_state *state)
 {
+	struct object_id meta_commit = {};
+	struct object_id meta_tree = {};
+
 	struct object_id tree, parent, commit;
 	const struct object_id *old_oid;
 	struct commit_list *parents = NULL;
-	const char *reflog_msg, *author;
+	const char *reflog_msg, *author = NULL;
 	struct strbuf sb = STRBUF_INIT;
 
+	if (state->exact) {
+		/*
+		 * Scan meta file for parents + other data.
+		 *
+		 * TODO: Pass everything after the "commit ..." line
+		 * verbatim to the commit for forwards compatibility
+		 * (e.g. so we don't need to know about every type of
+		 * commit attribute that may appear in the future).
+		 */
+
+		struct strbuf line = STRBUF_INIT;
+		FILE *fp = xfopen(am_path(state, "meta"), "r");
+
+		while (!strbuf_getline_lf(&line, fp)) {
+			const char *rest;
+
+			if (skip_prefix(line.buf, "commit ", &rest)) {
+				if (get_oid_hex(rest, &meta_commit))
+					die("invalid exact metadata (commit)");
+			} else if (skip_prefix(line.buf, "tree ", &rest)) {
+				if (get_oid_hex(rest, &meta_tree))
+					die("invalid exact metadata (tree)");
+			} else if (skip_prefix(line.buf, "parent ", &rest)) {
+				if (get_oid_hex(rest, &parent))
+					die("invalid exact metadata (parent)");
+
+				commit_list_insert(lookup_commit(the_repository, &parent), &parents);
+			} else if (skip_prefix(line.buf, "author ", &rest)) {
+				author = strdup(rest);
+			} else if (skip_prefix(line.buf, "committer ", &rest)) {
+				char *name_copy;
+				char *email;
+				char *email_copy;
+				char *date;
+
+				email = strstr(rest, " <");
+				if (!email)
+					die("invalid exact metadata (committer name)");
+
+				name_copy = xstrndup(rest, email - rest);
+				email += 2;
+				setenv("GIT_COMMITTER_NAME", name_copy, 1);
+				free(name_copy);
+
+				date = strstr(email, "> ");
+				if (!date)
+					die("invalid exact metadata (committer email)");
+
+				email_copy = xstrndup(email, date - email);
+				date += 2;
+				setenv("GIT_COMMITTER_EMAIL", email_copy, 1);
+				free(email_copy);
+
+				setenv("GIT_COMMITTER_DATE", date, 1);
+			} else if (line.len == 0) {
+				break;
+			} else {
+				die("unknown exact metadata: %.*s", (int) line.len, line.buf);
+			}
+		}
+
+		fclose(fp);
+	}
+
 	if (run_hook_le(NULL, "pre-applypatch", NULL))
 		exit(1);
 
 	if (write_cache_as_tree(&tree, 0, NULL))
 		die(_("git write-tree failed to write a tree"));
 
+	if (state->exact && !oideq(&tree, &meta_tree))
+		die("tree mismatch");
+
 	if (!get_oid_commit("HEAD", &parent)) {
 		old_oid = &parent;
-		commit_list_insert(lookup_commit(the_repository, &parent),
-				   &parents);
+		if (!state->exact) {
+			commit_list_insert(lookup_commit(the_repository, &parent),
+					   &parents);
+		}
 	} else {
 		old_oid = NULL;
 		say(state, stderr, _("applying to an empty history"));
 	}
 
-	author = fmt_ident(state->author_name, state->author_email,
-		WANT_AUTHOR_IDENT,
-			state->ignore_date ? NULL : state->author_date,
-			IDENT_STRICT);
-
-	if (state->committer_date_is_author_date)
-		setenv("GIT_COMMITTER_DATE",
-			state->ignore_date ? "" : state->author_date, 1);
+	if (state->exact) {
+		/*
+		 * Already got author above.
+		 */
+	} else {
+		author = fmt_ident(state->author_name, state->author_email,
+			WANT_AUTHOR_IDENT,
+				state->ignore_date ? NULL : state->author_date,
+				IDENT_STRICT);
+
+		if (state->committer_date_is_author_date)
+			setenv("GIT_COMMITTER_DATE",
+				state->ignore_date ? "" : state->author_date, 1);
+	}
 
 	if (commit_tree(state->msg, state->msg_len, &tree, parents, &commit,
 			author, state->sign_commit))
 		die(_("failed to write commit object"));
 
+	if (state->exact && !oideq(&commit, &meta_commit))
+		die("sha1 mismatch");
+
 	reflog_msg = getenv("GIT_REFLOG_ACTION");
 	if (!reflog_msg)
 		reflog_msg = "am";
@@ -2182,6 +2269,8 @@ int cmd_am(int argc, const char **argv, const char *prefix)
 			0, PARSE_OPT_NONEG),
 		OPT_BOOL('c', "scissors", &state.scissors,
 			N_("strip everything before a scissors line")),
+		OPT_BOOL('e', "exact", &state.exact,
+			N_("preserve exact metadata, including sha1")),
 		OPT_PASSTHRU_ARGV(0, "whitespace", &state.git_apply_opts, N_("action"),
 			N_("pass it through git-apply"),
 			0),
diff --git a/t/t4150-am.sh b/t/t4150-am.sh
index 4f1e24ecbe..56a0804dcb 100755
--- a/t/t4150-am.sh
+++ b/t/t4150-am.sh
@@ -147,6 +147,7 @@ test_expect_success setup '
 		git diff-tree --no-commit-id -p second
 	} >patch1-hg.eml &&
 
+	git format-patch --stdout --complete first >patch1-complete &&
 
 	echo file >file &&
 	git add file &&
@@ -1061,6 +1062,35 @@ test_expect_success 'am --quit keeps HEAD where it is' '
 	test_cmp expected actual
 '
 
+test_expect_success 'am --no-exact with metadata succeeds' '
+	rm -fr .git/rebase-apply &&
+	git reset --hard &&
+	git checkout first &&
+	git am --no-exact patch1-complete
+'
+
+test_expect_success 'am --exact without metadata fails' '
+	rm -fr .git/rebase-apply &&
+	git reset --hard &&
+	git checkout first &&
+	test_must_fail git am --exact patch1
+'
+
+test_expect_success 'am --exact with metadata preserves sha1' '
+	rm -fr .git/rebase-apply &&
+	git reset --hard &&
+	git checkout first &&
+	git am --exact patch1-complete &&
+	test_cmp_rev second HEAD
+'
+
+test_expect_success 'am --exact with metadata applied to the wrong tree fails' '
+	rm -fr .git/rebase-apply &&
+	git reset --hard &&
+	git checkout second &&
+	test_must_fail git am --exact patch1-complete
+'
+
 test_expect_success 'am and .gitattibutes' '
 	test_create_repo attributes &&
 	(
--
commit 429e6ce36ee1fb8e020c76756894bf7e196e7c4e
tree 69ec15522af90642ea38dc761510fd1fa82bdfb2
parent 53da20012e763a65071bfe0a42fbf4968d0e1e49
author Vegard Nossum <vegard.nossum@oracle.com> 1571186178 +0200
committer Vegard Nossum <vegard.nossum@oracle.com> 1571740256 +0200

-- 
2.24.0.rc0.3.g4ba423c3c2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH v2 2/3] mailinfo: collect commit metadata from mail
  2019-10-22 11:45 ` [RFC PATCH v2 2/3] mailinfo: collect commit metadata from mail Vegard Nossum
@ 2019-10-27 18:44   ` brian m. carlson
  2019-10-28 11:48     ` Vegard Nossum
  0 siblings, 1 reply; 7+ messages in thread
From: brian m. carlson @ 2019-10-27 18:44 UTC (permalink / raw)
  To: Vegard Nossum; +Cc: Junio C Hamano, git, Paolo Bonzini

[-- Attachment #1: Type: text/plain, Size: 2094 bytes --]

On 2019-10-22 at 11:45:17, Vegard Nossum wrote:
> diff --git a/t/t5100/meta-meta0001 b/t/t5100/meta-meta0001
> new file mode 100644
> index 0000000000..1db7e9f715
> --- /dev/null
> +++ b/t/t5100/meta-meta0001
> @@ -0,0 +1,23 @@
> +commit 763f9b1cfd69ade5e22dcdcdc35d144697675a93
> +tree 43a6b213a2891b5a863775771cab0c0dba3730dc
> +parent 108b97dc372828f0e72e56bbb40cae8e1e83ece6
> +author Vegard Nossum <vegard.nossum@oracle.com> 1570284959 +0200
> +committer Vegard Nossum <vegard.nossum@oracle.com> 1571599909 +0200
> +gpgsig -----BEGIN PGP SIGNATURE-----
> + Version: GnuPG v1
> + 
> + iQIcBAABAgAGBQJdrLYmAAoJEAvO9Nj+mLpY5zIQAJkdnnZMrCVme64r43M/KMm0
> + W1fmdeXiIMrI7i0McBdAsQ/KQ5yD4HBvaJWCyI0/g6IeLgVBf9//9xq4Y32iqsKn
> + XRut2Pk3H3Az0WfUlpLpDJUgzz7er8t/glaKnESb94XR/ac59tEELbePh+bXsFLH
> + 3+v8Y78zeHJd6ZLKrKmLyq/9ZaJQR+9xmGdKzZdnwM+8seE4aOhM1VtA8ik68Tn6
> + Tbaofp1jbsXcyY4nBG9GxK14wnb/8OZmQlH4J40LsZT4KDWQNWighGig8ude7SJa
> + 6FJXWJPLfOB6r2ThiJUnrf/UXeHbvYUWITiYoWOxEVb6c7RfNLqDbbGF9VQfTD0n
> + SEFO5NqOs6KofaYzALprtgpMrqksRHeLc7Ouh9xgLyLZLx/669I9lo8M1aQ7RJMy
> + V1KDG6sYbFrgy4gQ/4xqXj3NpBmMb/VcjOnCj3j040wo8q7hlpzb6ev5lcqAFEuP
> + y1owwhljMjqAdGIBw6sLVn2on+6gEQuIjbkoapBktPDw7xEpOKe9rzTGcWRRyANs
> + Z+pMWbn8c6TKonokNjERy0iPnu2t2j8x1YpqKdjY+oq8ApNZFMlU1U+UtXFfuLw5
> + ZKR5DtmXxWzvd+nKBenjzXyOt33v5eq4I/WMfATauXBgFu75mbfiKIFVD5VeMfxq
> + DoiIqvLtW+DDUkH99zXm
> + =2APb
> + -----END PGP SIGNATURE-----

First, let me say that I'm pleased to see this series.

It would be nice if we could use the test user and test keys, since this
test data isn't going to work for SHA-256 and I'll need to generate
suitable test data for this test as part of porting it.  I won't be able
to forge a signature using your personal key.

If you wanted to generate that data yourself instead, you're welcome to
rebase your changes onto the transition-stage-4 branch from
https://github.com/bk2204/git.git and run the testsuite with
GIT_TEST_DEFAULT_HASH=sha256 to see where it fails.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH v2 2/3] mailinfo: collect commit metadata from mail
  2019-10-27 18:44   ` brian m. carlson
@ 2019-10-28 11:48     ` Vegard Nossum
  2019-10-29  1:39       ` brian m. carlson
  0 siblings, 1 reply; 7+ messages in thread
From: Vegard Nossum @ 2019-10-28 11:48 UTC (permalink / raw)
  To: brian m. carlson, Junio C Hamano, git, Paolo Bonzini


On 10/27/19 7:44 PM, brian m. carlson wrote:
> On 2019-10-22 at 11:45:17, Vegard Nossum wrote:
>> diff --git a/t/t5100/meta-meta0001 b/t/t5100/meta-meta0001
>> new file mode 100644
>> index 0000000000..1db7e9f715
>> --- /dev/null
>> +++ b/t/t5100/meta-meta0001
>> @@ -0,0 +1,23 @@
>> +commit 763f9b1cfd69ade5e22dcdcdc35d144697675a93
>> +tree 43a6b213a2891b5a863775771cab0c0dba3730dc
>> +parent 108b97dc372828f0e72e56bbb40cae8e1e83ece6
>> +author Vegard Nossum <vegard.nossum@oracle.com> 1570284959 +0200
>> +committer Vegard Nossum <vegard.nossum@oracle.com> 1571599909 +0200
>> +gpgsig -----BEGIN PGP SIGNATURE-----
>> + Version: GnuPG v1
>> +
>> + iQIcBAABAgAGBQJdrLYmAAoJEAvO9Nj+mLpY5zIQAJkdnnZMrCVme64r43M/KMm0
>> + W1fmdeXiIMrI7i0McBdAsQ/KQ5yD4HBvaJWCyI0/g6IeLgVBf9//9xq4Y32iqsKn
>> + XRut2Pk3H3Az0WfUlpLpDJUgzz7er8t/glaKnESb94XR/ac59tEELbePh+bXsFLH
>> + 3+v8Y78zeHJd6ZLKrKmLyq/9ZaJQR+9xmGdKzZdnwM+8seE4aOhM1VtA8ik68Tn6
>> + Tbaofp1jbsXcyY4nBG9GxK14wnb/8OZmQlH4J40LsZT4KDWQNWighGig8ude7SJa
>> + 6FJXWJPLfOB6r2ThiJUnrf/UXeHbvYUWITiYoWOxEVb6c7RfNLqDbbGF9VQfTD0n
>> + SEFO5NqOs6KofaYzALprtgpMrqksRHeLc7Ouh9xgLyLZLx/669I9lo8M1aQ7RJMy
>> + V1KDG6sYbFrgy4gQ/4xqXj3NpBmMb/VcjOnCj3j040wo8q7hlpzb6ev5lcqAFEuP
>> + y1owwhljMjqAdGIBw6sLVn2on+6gEQuIjbkoapBktPDw7xEpOKe9rzTGcWRRyANs
>> + Z+pMWbn8c6TKonokNjERy0iPnu2t2j8x1YpqKdjY+oq8ApNZFMlU1U+UtXFfuLw5
>> + ZKR5DtmXxWzvd+nKBenjzXyOt33v5eq4I/WMfATauXBgFu75mbfiKIFVD5VeMfxq
>> + DoiIqvLtW+DDUkH99zXm
>> + =2APb
>> + -----END PGP SIGNATURE-----
> 
> First, let me say that I'm pleased to see this series.
> 
> It would be nice if we could use the test user and test keys, since this
> test data isn't going to work for SHA-256 and I'll need to generate
> suitable test data for this test as part of porting it.  I won't be able
> to forge a signature using your personal key.
> 
> If you wanted to generate that data yourself instead, you're welcome to
> rebase your changes onto the transition-stage-4 branch from
> https://github.com/bk2204/git.git and run the testsuite with
> GIT_TEST_DEFAULT_HASH=sha256 to see where it fails.
> 

Hi!

Thanks for the feedback.

I rebased on your branch and ran the tests without any problems.

t5100 is fine for me -- the file above is only used for testing the mail
splitting algorithm, so the choice of hashing algorithm should be
irrelevant for the test. That's also why I chose to use a personal key
for the signature, the other tests also use pregenerated emails to avoid
testing the email generation in those tests. (It could be useful to
include those "historical" test cases, which should still work in case
the format ever changes again in the future.)

The other tests generate the commits/emails within the test. Are you
sure you resolved the conflict in t4150 correctly (i.e. leaving out the
gitattributes test at the end)?


Vegard

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH v2 2/3] mailinfo: collect commit metadata from mail
  2019-10-28 11:48     ` Vegard Nossum
@ 2019-10-29  1:39       ` brian m. carlson
  0 siblings, 0 replies; 7+ messages in thread
From: brian m. carlson @ 2019-10-29  1:39 UTC (permalink / raw)
  To: Vegard Nossum; +Cc: Junio C Hamano, git, Paolo Bonzini

[-- Attachment #1: Type: text/plain, Size: 1291 bytes --]

On 2019-10-28 at 11:48:14, Vegard Nossum wrote:
> Hi!
> 
> Thanks for the feedback.
> 
> I rebased on your branch and ran the tests without any problems.
> 
> t5100 is fine for me -- the file above is only used for testing the mail
> splitting algorithm, so the choice of hashing algorithm should be
> irrelevant for the test. That's also why I chose to use a personal key
> for the signature, the other tests also use pregenerated emails to avoid
> testing the email generation in those tests. (It could be useful to
> include those "historical" test cases, which should still work in case
> the format ever changes again in the future.)

Okay, great.  Thanks for checking, since it looked like we might use the
commit data in the buffer, but if this is only for mail splitting, then
I agree there's no problem.

> The other tests generate the commits/emails within the test. Are you
> sure you resolved the conflict in t4150 correctly (i.e. leaving out the
> gitattributes test at the end)?

I may have misresolved it, but I've just rebased on master and the two
versions of t4150 (master and my branch) are now identical, so hopefully
I haven't regressed anything in my new version.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-10-29  1:39 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-22 11:45 [RFC PATCH v2 0/3] format-patch --complete / am --exact Vegard Nossum
2019-10-22 11:45 ` [RFC PATCH v2 1/3] format-patch: add --complete Vegard Nossum
2019-10-22 11:45 ` [RFC PATCH v2 2/3] mailinfo: collect commit metadata from mail Vegard Nossum
2019-10-27 18:44   ` brian m. carlson
2019-10-28 11:48     ` Vegard Nossum
2019-10-29  1:39       ` brian m. carlson
2019-10-22 11:45 ` [RFC PATCH v2 3/3] am: add --exact Vegard Nossum

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).