From: Elijah Newren <newren@gmail.com>
To: git@vger.kernel.org
Cc: gitster@pobox.com, johannes.schindelin@gmx.de,
Elijah Newren <newren@gmail.com>
Subject: [PATCH 5/5] fast-export: do automatic reencoding of commit messages only if requested
Date: Thu, 25 Apr 2019 08:51:18 -0700 [thread overview]
Message-ID: <20190425155118.7918-6-newren@gmail.com> (raw)
In-Reply-To: <20190425155118.7918-1-newren@gmail.com>
Automatic re-encoding of commit messages (and dropping of the encoding
header) hurts attempts to do reversible history rewrites (e.g. sha1sum
<-> sha256sum transitions, some subtree rewrites), and seems
inconsistent with the general principle followed elsewhere in
fast-export of requiring explicit user requests to modify the output
(e.g. --signed-tags=strip, --tag-of-filtered-object=rewrite). Add a
--reencode flag that the user can use to specify, and like other
fast-export flags, default it to 'abort'.
Signed-off-by: Elijah Newren <newren@gmail.com>
---
builtin/fast-export.c | 35 ++++++++++++++++++++++++++++++++---
t/t9350-fast-export.sh | 31 ++++++++++++++++++++++++++++---
2 files changed, 60 insertions(+), 6 deletions(-)
diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 66331fa401..43cc52331c 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -33,6 +33,7 @@ static const char *fast_export_usage[] = {
static int progress;
static enum { SIGNED_TAG_ABORT, VERBATIM, WARN, WARN_STRIP, STRIP } signed_tag_mode = SIGNED_TAG_ABORT;
static enum { TAG_FILTERING_ABORT, DROP, REWRITE } tag_of_filtered_mode = TAG_FILTERING_ABORT;
+static enum { REENCODE_ABORT, REENCODE_PLEASE, REENCODE_NEVER } reencode_mode = REENCODE_ABORT;
static int fake_missing_tagger;
static int use_done_feature;
static int no_data;
@@ -77,6 +78,20 @@ static int parse_opt_tag_of_filtered_mode(const struct option *opt,
return 0;
}
+static int parse_opt_reencode_mode(const struct option *opt,
+ const char *arg, int unset)
+{
+ if (unset || !strcmp(arg, "abort"))
+ reencode_mode = REENCODE_ABORT;
+ else if (!strcmp(arg, "yes"))
+ reencode_mode = REENCODE_PLEASE;
+ else if (!strcmp(arg, "no"))
+ reencode_mode = REENCODE_NEVER;
+ else
+ return error("Unknown reencoding mode: %s", arg);
+ return 0;
+}
+
static struct decoration idnums;
static uint32_t last_idnum;
@@ -633,10 +648,21 @@ static void handle_commit(struct commit *commit, struct rev_info *rev,
}
mark_next_object(&commit->object);
- if (anonymize)
+ if (anonymize) {
reencoded = anonymize_commit_message(message);
- else if (!is_encoding_utf8(encoding))
- reencoded = reencode_string(message, "UTF-8", encoding);
+ } else if (encoding) {
+ switch(reencode_mode) {
+ case REENCODE_PLEASE:
+ reencoded = reencode_string(message, "UTF-8", encoding);
+ break;
+ case REENCODE_NEVER:
+ break;
+ case REENCODE_ABORT:
+ die("Encountered commit-specific encoding %s in commit "
+ "%s; use --reencode=<mode> to handle it",
+ encoding, oid_to_hex(&commit->object.oid));
+ }
+ }
if (!commit->parents)
printf("reset %s\n", refname);
printf("commit %s\nmark :%"PRIu32"\n", refname, last_idnum);
@@ -1091,6 +1117,9 @@ int cmd_fast_export(int argc, const char **argv, const char *prefix)
OPT_CALLBACK(0, "tag-of-filtered-object", &tag_of_filtered_mode, N_("mode"),
N_("select handling of tags that tag filtered objects"),
parse_opt_tag_of_filtered_mode),
+ OPT_CALLBACK(0, "reencode", &reencode_mode, N_("mode"),
+ N_("select handling of commit messages in an alternate encoding"),
+ parse_opt_reencode_mode),
OPT_STRING(0, "export-marks", &export_filename, N_("file"),
N_("Dump marks to this file")),
OPT_STRING(0, "import-marks", &import_filename, N_("file"),
diff --git a/t/t9350-fast-export.sh b/t/t9350-fast-export.sh
index 975c8c4014..4774926bb6 100755
--- a/t/t9350-fast-export.sh
+++ b/t/t9350-fast-export.sh
@@ -94,7 +94,7 @@ test_expect_success 'fast-export --show-original-ids | git fast-import' '
test $MUSS = $(git rev-parse --verify refs/tags/muss)
'
-test_expect_success 'iso-8859-7' '
+test_expect_success 'reencoding iso-8859-7' '
test_when_finished "git reset --hard HEAD~1" &&
test_when_finished "git config --unset i18n.commitencoding" &&
@@ -102,7 +102,7 @@ test_expect_success 'iso-8859-7' '
test_tick &&
echo rosten >file &&
git commit -s -m "$(printf "Pi: \360")" file &&
- git fast-export wer^..wer >iso-8859-7.fi &&
+ git fast-export --reencode=yes wer^..wer >iso-8859-7.fi &&
sed "s/wer/i18n/" iso-8859-7.fi |
(cd new &&
git fast-import &&
@@ -110,6 +110,31 @@ test_expect_success 'iso-8859-7' '
grep $(printf "\317\200") actual)
'
+test_expect_success 'aborting on iso-8859-7' '
+
+ test_when_finished "git reset --hard HEAD~1" &&
+ test_when_finished "git config --unset i18n.commitencoding" &&
+ git config i18n.commitencoding iso-8859-7 &&
+ echo rosten >file &&
+ git commit -s -m "$(printf "Pi: \360")" file &&
+ test_must_fail git fast-export --reencode=abort wer^..wer >iso-8859-7.fi
+'
+
+test_expect_success 'preserving iso-8859-7' '
+
+ test_when_finished "git reset --hard HEAD~1" &&
+ test_when_finished "git config --unset i18n.commitencoding" &&
+ git config i18n.commitencoding iso-8859-7 &&
+ echo rosten >file &&
+ git commit -s -m "$(printf "Pi: \360")" file &&
+ git fast-export --reencode=no wer^..wer >iso-8859-7.fi &&
+ sed "s/wer/i18n-no-recoding/" iso-8859-7.fi |
+ (cd new &&
+ git fast-import &&
+ git cat-file commit i18n-no-recoding >actual &&
+ grep $(printf "\360") actual)
+'
+
test_expect_success 'encoding preserved if reencoding fails' '
test_when_finished "git reset --hard HEAD~1" &&
@@ -117,7 +142,7 @@ test_expect_success 'encoding preserved if reencoding fails' '
git config i18n.commitencoding iso-8859-7 &&
echo rosten >file &&
git commit -s -m "$(printf "Pi: \360; Invalid: \377")" file &&
- git fast-export wer^..wer >iso-8859-7.fi &&
+ git fast-export --reencode=yes wer^..wer >iso-8859-7.fi &&
sed "s/wer/i18n-invalid/" iso-8859-7.fi |
(cd new &&
git fast-import &&
--
2.21.0.779.g2f4b9c5032
next prev parent reply other threads:[~2019-04-25 15:51 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-04-25 15:51 [PATCH 0/5] Fix and extend encoding handling in fast export/import Elijah Newren
2019-04-25 15:51 ` [PATCH 1/5] t9350: fix encoding test to actually test reencoding Elijah Newren
2019-04-25 17:52 ` Eric Sunshine
2019-04-26 2:40 ` Junio C Hamano
2019-04-25 15:51 ` [PATCH 2/5] fast-import: support 'encoding' commit header Elijah Newren
2019-04-25 19:36 ` Eric Sunshine
2019-04-26 11:39 ` Elijah Newren
2019-04-25 15:51 ` [PATCH 3/5] fast-export: avoid stripping encoding header if we cannot reencode Elijah Newren
2019-04-25 15:51 ` [PATCH 4/5] fast-export: differentiate between explicitly utf-8 and implicitly utf-8 Elijah Newren
2019-04-25 15:51 ` Elijah Newren [this message]
2019-04-25 15:55 ` [PATCH 0/5] Fix and extend encoding handling in fast export/import Elijah Newren
2019-04-25 15:57 ` Elijah Newren
2019-04-26 21:32 ` brian m. carlson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190425155118.7918-6-newren@gmail.com \
--to=newren@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=johannes.schindelin@gmx.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).