git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Jerry Zhang via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Jerry Zhang <jerry@skydio.com>
Subject: [PATCH v3 0/7] patch-id fixes and improvements
Date: Fri, 14 Oct 2022 08:56:37 +0000	[thread overview]
Message-ID: <pull.1359.v3.git.1665737804.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.1359.v2.git.1663654859.gitgitgadget@gmail.com>

These patches add fixes and features to the "git patch-id" command, mostly
discovered through our usage of patch-id in the revup project
(https://github.com/Skydio/revup). On top of that I've tried to make general
cleanup changes where I can.

Summary:

1: Fixed a bug in the combination of --stable with binary files and
header-only, and expanded the test to cover both binary and non-binary
files.

2: Switch internal usage of patch-id in rebase / cherry-pick to use the
stable variant to reduce the number of code paths and improve testing for
bugs like above.

3: Fixed bugs with patch-id and binary diffs. Previously patch-id did not
behave correctly for binary diffs regardless of whether "--binary" was given
to "diff".

4: Fixed bugs with patch-id and mode changes. Previously mode changes were
incorrectly excluded from the patch-id.

5: Add a new "--include-whitespace" mode to patch-id that prevents
whitespace from being stripped during id calculation. Also add a config
option for the same behavior.

6: Remove unused prefix from patch-id logic.

7: Update format-patch doc to specify when patch-ids are going to be equal
to those generated by "git patch-id".

V1->V2: Fixed comment style V2->V3: The ---/+++ lines no longer get added to
the patch-id of binary diffs. Also added patches 3-7 in the series.

Signed-off-by: Jerry Zhang jerry@skydio.com

Jerry Zhang (7):
  patch-id: fix stable patch id for binary / header-only
  patch-id: use stable patch-id for rebases
  builtin: patch-id: fix patch-id with binary diffs
  patch-id: fix patch-id for mode changes
  builtin: patch-id: add --include-whitespace as a command mode
  builtin: patch-id: remove unused diff-tree prefix
  documentation: format-patch: clarify requirements for patch-ids to
    match

 Documentation/git-format-patch.txt |   4 +-
 Documentation/git-patch-id.txt     |  25 +++++--
 builtin/log.c                      |   2 +-
 builtin/patch-id.c                 | 114 +++++++++++++++++++++--------
 diff.c                             |  75 +++++++++----------
 diff.h                             |   2 +-
 patch-ids.c                        |  10 +--
 patch-ids.h                        |   2 +-
 t/t3419-rebase-patch-id.sh         |  63 +++++++++++++---
 t/t4204-patch-id.sh                |  95 ++++++++++++++++++++++--
 10 files changed, 291 insertions(+), 101 deletions(-)


base-commit: d420dda0576340909c3faff364cfbd1485f70376
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1359%2Fjerry-skydio%2Fjerry%2Frevup%2Fmaster%2Fpatch_ids-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1359/jerry-skydio/jerry/revup/master/patch_ids-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1359

Range-diff vs v2:

 1:  945508df7b6 ! 1:  7d4c2e91ce0 patch-id: fix stable patch id for binary / header-only
     @@ Commit message
          populates the object ids, and normal which populates the text
          diff. All branches will end up flushing the hunk.
      
     +    Don't populate the ---a/ and +++b/ lines for binary diffs, to correspond
     +    to those lines not being present in the "git diff" text output.
     +    This is necessary because we advertise that the patch-id calculated
     +    internally and used in format-patch is the same that what the
     +    builtin "git patch-id" would produce when piped from a diff.
     +
          Update the test to run on both binary and normal files.
      
          Signed-off-by: Jerry Zhang <jerry@skydio.com>
      
       ## diff.c ##
      @@ diff.c: static int diff_get_patch_id(struct diff_options *options, struct object_id *oid
     - 			the_hash_algo->update_fn(&ctx, p->two->path, len2);
     + 		if (p->one->mode == 0) {
     + 			patch_id_add_string(&ctx, "newfilemode");
     + 			patch_id_add_mode(&ctx, p->two->mode);
     +-			patch_id_add_string(&ctx, "---/dev/null");
     +-			patch_id_add_string(&ctx, "+++b/");
     +-			the_hash_algo->update_fn(&ctx, p->two->path, len2);
     + 		} else if (p->two->mode == 0) {
     + 			patch_id_add_string(&ctx, "deletedfilemode");
     + 			patch_id_add_mode(&ctx, p->one->mode);
     +-			patch_id_add_string(&ctx, "---a/");
     +-			the_hash_algo->update_fn(&ctx, p->one->path, len1);
     +-			patch_id_add_string(&ctx, "+++/dev/null");
     +-		} else {
     +-			patch_id_add_string(&ctx, "---a/");
     +-			the_hash_algo->update_fn(&ctx, p->one->path, len1);
     +-			patch_id_add_string(&ctx, "+++b/");
     +-			the_hash_algo->update_fn(&ctx, p->two->path, len2);
       		}
       
      -		if (diff_header_only)
     @@ diff.c: static int diff_get_patch_id(struct diff_options *options, struct object
       			the_hash_algo->update_fn(&ctx, oid_to_hex(&p->two->oid),
       					the_hash_algo->hexsz);
      -			continue;
     +-		}
     +-
     +-		xpp.flags = 0;
     +-		xecfg.ctxlen = 3;
     +-		xecfg.flags = XDL_EMIT_NO_HUNK_HDR;
     +-		if (xdi_diff_outf(&mf1, &mf2, NULL,
     +-				  patch_id_consume, &data, &xpp, &xecfg))
     +-			return error("unable to generate patch-id diff for %s",
     +-				     p->one->path);
      +		} else {
     ++			if (p->one->mode == 0) {
     ++				patch_id_add_string(&ctx, "---/dev/null");
     ++				patch_id_add_string(&ctx, "+++b/");
     ++				the_hash_algo->update_fn(&ctx, p->two->path, len2);
     ++			} else if (p->two->mode == 0) {
     ++				patch_id_add_string(&ctx, "---a/");
     ++				the_hash_algo->update_fn(&ctx, p->one->path, len1);
     ++				patch_id_add_string(&ctx, "+++/dev/null");
     ++			} else {
     ++				patch_id_add_string(&ctx, "---a/");
     ++				the_hash_algo->update_fn(&ctx, p->one->path, len1);
     ++				patch_id_add_string(&ctx, "+++b/");
     ++				the_hash_algo->update_fn(&ctx, p->two->path, len2);
     ++			}
     + 
      +			if (fill_mmfile(options->repo, &mf1, p->one) < 0 ||
      +			    fill_mmfile(options->repo, &mf2, p->two) < 0)
      +				return error("unable to read files to diff");
     @@ diff.c: static int diff_get_patch_id(struct diff_options *options, struct object
      +					  patch_id_consume, &data, &xpp, &xecfg))
      +				return error("unable to generate patch-id diff for %s",
      +					     p->one->path);
     - 		}
     --
     --		xpp.flags = 0;
     --		xecfg.ctxlen = 3;
     --		xecfg.flags = XDL_EMIT_NO_HUNK_HDR;
     --		if (xdi_diff_outf(&mf1, &mf2, NULL,
     --				  patch_id_consume, &data, &xpp, &xecfg))
     --			return error("unable to generate patch-id diff for %s",
     --				     p->one->path);
     --
     ++		}
       		if (stable)
       			flush_one_hunk(oid, &ctx);
       	}
      
       ## t/t3419-rebase-patch-id.sh ##
      @@ t/t3419-rebase-patch-id.sh: test_expect_success 'setup: 500 lines' '
     - 	git cherry-pick main >/dev/null 2>&1
     - '
     + 	git add newfile &&
     + 	git commit -q -m "add small file" &&
     + 
     +-	git cherry-pick main >/dev/null 2>&1
     +-'
     ++	git cherry-pick main >/dev/null 2>&1 &&
       
      -test_expect_success 'setup attributes' '
      -	echo "file binary" >.gitattributes
     --'
     --
     ++	git branch -f squashed main &&
     ++	git checkout -q -f squashed &&
     ++	git reset -q --soft HEAD~2 &&
     ++	git commit -q -m squashed
     + '
     + 
       test_expect_success 'detect upstream patch' '
     - 	git checkout -q main &&
     +-	git checkout -q main &&
     ++	git checkout -q main^{} &&
       	scramble file &&
     + 	git add file &&
     + 	git commit -q -m "change big file again" &&
      @@ t/t3419-rebase-patch-id.sh: test_expect_success 'detect upstream patch' '
     - 	git checkout -q other^{} &&
     - 	git rebase main &&
     - 	git rev-list main...HEAD~ >revs &&
     --	test_must_be_empty revs
     -+	test_must_be_empty revs &&
     + 	test_must_be_empty revs
     + '
     + 
     ++test_expect_success 'detect upstream patch binary' '
      +	echo "file binary" >.gitattributes &&
      +	git checkout -q other^{} &&
      +	git rebase main &&
      +	git rev-list main...HEAD~ >revs &&
      +	test_must_be_empty revs &&
     -+	rm .gitattributes
     - '
     - 
     ++	test_when_finished "rm .gitattributes"
     ++'
     ++
       test_expect_success 'do not drop patch' '
     -@@ t/t3419-rebase-patch-id.sh: test_expect_success 'do not drop patch' '
     - 	git commit -q -m squashed &&
     +-	git branch -f squashed main &&
     +-	git checkout -q -f squashed &&
     +-	git reset -q --soft HEAD~2 &&
     +-	git commit -q -m squashed &&
       	git checkout -q other^{} &&
       	test_must_fail git rebase squashed &&
      -	git rebase --quit
     -+	git rebase --abort &&
     ++	test_when_finished "git rebase --abort"
     ++'
     ++
     ++test_expect_success 'do not drop patch binary' '
      +	echo "file binary" >.gitattributes &&
      +	git checkout -q other^{} &&
      +	test_must_fail git rebase squashed &&
     -+	git rebase --abort &&
     -+	rm .gitattributes
     ++	test_when_finished "git rebase --abort" &&
     ++	test_when_finished "rm .gitattributes"
       '
       
       test_done
 2:  30ec43cd129 ! 2:  25e28b7dab3 patch-id: use stable patch-id for rebases
     @@ Commit message
          patch-id: use stable patch-id for rebases
      
          Git doesn't persist patch-ids during the rebase process, so there is
     -    no need to specifically invoke the unstable variant.
     -
     -    This allows the legacy unstable id logic to be cleaned up.
     +    no need to specifically invoke the unstable variant. Use the stable
     +    logic for all internal patch-id calculations to minimize the number of
     +    code paths and improve test coverage.
      
          Signed-off-by: Jerry Zhang <jerry@skydio.com>
      
 -:  ----------- > 3:  21642128927 builtin: patch-id: fix patch-id with binary diffs
 -:  ----------- > 4:  6e07cfd5691 patch-id: fix patch-id for mode changes
 -:  ----------- > 5:  bbaa2425ad0 builtin: patch-id: add --include-whitespace as a command mode
 -:  ----------- > 6:  a1f6f36d487 builtin: patch-id: remove unused diff-tree prefix
 -:  ----------- > 7:  69440797f30 documentation: format-patch: clarify requirements for patch-ids to match

-- 
gitgitgadget

  parent reply	other threads:[~2022-10-14  8:56 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-20  5:58 [PATCH 0/2] update internal patch-id to use "stable" algorithm Jerry Zhang via GitGitGadget
2022-09-20  5:58 ` [PATCH 1/2] patch-id: fix stable patch id for binary / header-only Jerry Zhang via GitGitGadget
2022-09-20  5:58 ` [PATCH 2/2] patch-id: use stable patch-id for rebases Jerry Zhang via GitGitGadget
2022-09-20  6:20 ` [PATCH v2 0/2] update internal patch-id to use "stable" algorithm Jerry Zhang via GitGitGadget
2022-09-20  6:20   ` [PATCH v2 1/2] patch-id: fix stable patch id for binary / header-only Jerry Zhang via GitGitGadget
2022-09-20  6:20   ` [PATCH v2 2/2] patch-id: use stable patch-id for rebases Jerry Zhang via GitGitGadget
2022-10-14  8:56   ` Jerry Zhang via GitGitGadget [this message]
2022-10-14  8:56     ` [PATCH v3 1/7] patch-id: fix stable patch id for binary / header-only Jerry Zhang via GitGitGadget
2022-10-14  8:56     ` [PATCH v3 2/7] patch-id: use stable patch-id for rebases Jerry Zhang via GitGitGadget
2022-10-14  8:56     ` [PATCH v3 3/7] builtin: patch-id: fix patch-id with binary diffs Jerry Zhang via GitGitGadget
2022-10-14 17:13       ` Junio C Hamano
2022-10-14 22:33         ` Jerry Zhang
2022-10-14 21:12       ` Junio C Hamano
2022-10-14 22:34         ` Jerry Zhang
2022-10-17 15:23           ` Junio C Hamano
2022-10-14  8:56     ` [PATCH v3 4/7] patch-id: fix patch-id for mode changes Jerry Zhang via GitGitGadget
2022-10-14 21:17       ` Junio C Hamano
2022-10-14  8:56     ` [PATCH v3 5/7] builtin: patch-id: add --include-whitespace as a command mode Jerry Zhang via GitGitGadget
2022-10-14 21:24       ` Junio C Hamano
2022-10-14 22:55         ` Jerry Zhang
2022-10-17 15:38         ` Junio C Hamano
2022-10-18 22:12           ` Jerry Zhang
2022-10-14  8:56     ` [PATCH v3 6/7] builtin: patch-id: remove unused diff-tree prefix Jerry Zhang via GitGitGadget
2022-10-14 22:03       ` Junio C Hamano
2022-10-14  8:56     ` [PATCH v3 7/7] documentation: format-patch: clarify requirements for patch-ids to match Jerry Zhang via GitGitGadget
2022-10-17 15:18       ` Junio C Hamano
2022-10-18 21:57         ` Jerry Zhang
2022-10-19 16:19           ` Junio C Hamano
2022-10-20 23:16     ` [PATCH v4 0/6] patch-id fixes and improvements Jerry Zhang via GitGitGadget
2022-10-20 23:16       ` [PATCH v4 1/6] patch-id: fix stable patch id for binary / header-only Jerry Zhang via GitGitGadget
2022-10-20 23:16       ` [PATCH v4 2/6] patch-id: use stable patch-id for rebases Jerry Zhang via GitGitGadget
2022-10-20 23:16       ` [PATCH v4 3/6] builtin: patch-id: fix patch-id with binary diffs Jerry Zhang via GitGitGadget
2022-10-20 23:16       ` [PATCH v4 4/6] patch-id: fix patch-id for mode changes Jerry Zhang via GitGitGadget
2022-10-20 23:16       ` [PATCH v4 5/6] builtin: patch-id: add --verbatim as a command mode Jerry Zhang via GitGitGadget
2022-10-20 23:16       ` [PATCH v4 6/6] builtin: patch-id: remove unused diff-tree prefix Jerry Zhang via GitGitGadget
2022-10-21  9:33         ` Junio C Hamano
2022-10-24 20:07       ` [PATCH v5 0/6] patch-id fixes and improvements Jerry Zhang via GitGitGadget
2022-10-24 20:07         ` [PATCH v5 1/6] patch-id: fix stable patch id for binary / header-only Jerry Zhang via GitGitGadget
2022-10-24 20:07         ` [PATCH v5 2/6] patch-id: use stable patch-id for rebases Jerry Zhang via GitGitGadget
2022-10-24 20:07         ` [PATCH v5 3/6] builtin: patch-id: fix patch-id with binary diffs Jerry Zhang via GitGitGadget
2022-10-24 20:07         ` [PATCH v5 4/6] patch-id: fix patch-id for mode changes Jerry Zhang via GitGitGadget
2022-10-24 20:07         ` [PATCH v5 5/6] builtin: patch-id: add --verbatim as a command mode Jerry Zhang via GitGitGadget
2022-10-24 20:07         ` [PATCH v5 6/6] builtin: patch-id: remove unused diff-tree prefix Jerry Zhang via GitGitGadget
2022-10-24 22:55         ` [PATCH v5 0/6] patch-id fixes and improvements Junio C Hamano
2022-09-21 19:16 ` [PATCH 0/2] update internal patch-id to use "stable" algorithm Junio C Hamano
2022-09-21 20:59   ` Jerry Zhang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=pull.1359.v3.git.1665737804.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=jerry@skydio.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).