From: Elijah Newren <newren@gmail.com>
To: gitster@pobox.com
Cc: git@vger.kernel.org, Eric Sunshine <sunshine@sunshineco.com>,
Elijah Newren <newren@gmail.com>
Subject: [PATCH v2 0/5] Fix and extend encoding handling in fast export/import
Date: Tue, 30 Apr 2019 11:25:18 -0700
Message-ID: <20190430182523.3339-1-newren@gmail.com>

While stress testing `git filter-repo`, I noticed an issue with
encoding; further digging led to the fixes and features in this series.
See the individual commit messages for details.

Changes since v1 (full range-diff below):
  * Applied style fixes Eric pointed out in his review (thanks!)
  * Rebased on latest master (83232e38, "The seventh batch"), resolving
    a trivial merge conflict.  Now merges cleanly with next and pu as
    well.

I'm a bit under the weather so I may be slow to respond...

Elijah Newren (5):
  t9350: fix encoding test to actually test reencoding
  fast-import: support 'encoding' commit header
  fast-export: avoid stripping encoding header if we cannot reencode
  fast-export: differentiate between explicitly utf-8 and implicitly
    utf-8
  fast-export: do automatic reencoding of commit messages only if
    requested

 Documentation/git-fast-import.txt |  7 ++++
 builtin/fast-export.c             | 44 +++++++++++++++++++++----
 fast-import.c                     | 11 +++++--
 t/t9300-fast-import.sh            | 20 ++++++++++++
 t/t9350-fast-export.sh            | 53 +++++++++++++++++++++++++------
 5 files changed, 118 insertions(+), 17 deletions(-)
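
For readers unfamiliar with the stream format, here is a rough sketch of a commit stanza carrying the 'encoding' header that patch 2 teaches fast-import to accept. It is not taken from the series: the helper name emit_commit, the committer identity, and the timestamp are made up for illustration, and the header is placed between 'committer' and 'data' because that is where the parsing hunk in the range-diff below looks for it.

```shell
#!/bin/sh
# Hypothetical helper (not part of the series): print one fast-import
# 'commit' stanza that includes an 'encoding' header.
# $1 = ref, $2 = encoding name, $3 = message with printf octal escapes.
emit_commit () {
	ref=$1 encoding=$2 msg=$3
	# 'data' needs the byte count of the message, not a character count.
	len=$(printf "$msg" | wc -c | tr -d ' ')
	printf 'commit %s\n' "$ref"
	# Identity and timestamp are placeholders chosen for the example.
	printf 'committer %s <%s> %s\n' "A U Thor" "author@example.com" "1556648718 -0700"
	# The header sits after 'committer' and before 'data'.
	printf 'encoding %s\n' "$encoding"
	printf 'data %s\n' "$len"
	printf "$msg"
	printf '\n'
}

# Same message bytes as the iso-8859-7 tests use: "Pi: " followed by 0xF0.
emit_commit refs/heads/encoding iso-8859-7 'Pi: \360'
```

With the series applied, output like this could be piped to `git fast-import` in a scratch repository; the sketch itself only prints the stream and does not invoke git.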
Range-diff:
1: d6efd05142 ! 1: 9cc04242bd t9350: fix encoding test to actually test reencoding
@@ -26,8 +26,7 @@
- # use author and committer name in ISO-8859-1 to match it.
- . "$TEST_DIRECTORY"/t3901/8859-1.txt &&
+ test_when_finished "git reset --hard HEAD~1" &&
-+ test_when_finished "git config --unset i18n.commitencoding" &&
-+ git config i18n.commitencoding iso-8859-7 &&
++ test_config i18n.commitencoding iso-8859-7 &&
test_tick &&
echo rosten >file &&
- git commit -s -m den file &&
2: 02f48c7559 ! 2: 0cd023ac7a fast-import: support 'encoding' commit header
@@ -51,9 +51,8 @@
}
if (!committer)
die("Expected committer but didn't get one");
-+ if (skip_prefix(command_buf.buf, "encoding ", &encoding)) {
++ if (skip_prefix(command_buf.buf, "encoding ", &encoding))
+ read_next_command();
-+ }
parse_data(&msg, 0, NULL);
read_next_command();
parse_from(b);
@@ -69,7 +68,7 @@
+ strbuf_addf(&new_data,
+ "encoding %s\n",
+ encoding);
-+ strbuf_addf(&new_data, "\n");
++ strbuf_addch(&new_data, '\n');
strbuf_addbuf(&new_data, &msg);
free(author);
free(committer);
@@ -78,14 +77,14 @@
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@
- background_import_still_running
+ sed -e s/LFs/LLL/ W-input | tr L "\n" | test_must_fail git fast-import
'
+###
-+### series W (other new features)
++### series X (other new features)
+###
+
-+test_expect_success 'W: handling encoding' '
++test_expect_success 'X: handling encoding' '
+ test_tick &&
+ cat >input <<-INPUT_END &&
+ commit refs/heads/encoding
3: 86c348402d ! 3: 1fddf51402 fast-export: avoid stripping encoding header if we cannot reencode
@@ -41,8 +41,7 @@
+test_expect_success 'encoding preserved if reencoding fails' '
+
+ test_when_finished "git reset --hard HEAD~1" &&
-+ test_when_finished "git config --unset i18n.commitencoding" &&
-+ git config i18n.commitencoding iso-8859-7 &&
++ test_config i18n.commitencoding iso-8859-7 &&
+ echo rosten >file &&
+ git commit -s -m "$(printf "Pi: \360; Invalid: \377")" file &&
+ git fast-export wer^..wer >iso-8859-7.fi &&
4: c09b23bc59 = 4: 4a2e04b3ae fast-export: differentiate between explicitly utf-8 and implicitly utf-8
5: 24b69a0db9 ! 5: 44aacb1a0b fast-export: do automatic reencoding of commit messages only if requested
@@ -92,8 +92,7 @@
+test_expect_success 'reencoding iso-8859-7' '
test_when_finished "git reset --hard HEAD~1" &&
- test_when_finished "git config --unset i18n.commitencoding" &&
-@@
+ test_config i18n.commitencoding iso-8859-7 &&
test_tick &&
echo rosten >file &&
git commit -s -m "$(printf "Pi: \360")" file &&
@@ -109,8 +108,7 @@
+test_expect_success 'aborting on iso-8859-7' '
+
+ test_when_finished "git reset --hard HEAD~1" &&
-+ test_when_finished "git config --unset i18n.commitencoding" &&
-+ git config i18n.commitencoding iso-8859-7 &&
++ test_config i18n.commitencoding iso-8859-7 &&
+ echo rosten >file &&
+ git commit -s -m "$(printf "Pi: \360")" file &&
+ test_must_fail git fast-export --reencode=abort wer^..wer >iso-8859-7.fi
@@ -119,8 +117,7 @@
+test_expect_success 'preserving iso-8859-7' '
+
+ test_when_finished "git reset --hard HEAD~1" &&
-+ test_when_finished "git config --unset i18n.commitencoding" &&
-+ git config i18n.commitencoding iso-8859-7 &&
++ test_config i18n.commitencoding iso-8859-7 &&
+ echo rosten >file &&
+ git commit -s -m "$(printf "Pi: \360")" file &&
+ git fast-export --reencode=no wer^..wer >iso-8859-7.fi &&
@@ -134,8 +131,7 @@
test_expect_success 'encoding preserved if reencoding fails' '
test_when_finished "git reset --hard HEAD~1" &&
-@@
- git config i18n.commitencoding iso-8859-7 &&
+ test_config i18n.commitencoding iso-8859-7 &&
echo rosten >file &&
git commit -s -m "$(printf "Pi: \360; Invalid: \377")" file &&
- git fast-export wer^..wer >iso-8859-7.fi &&
--
2.21.0.782.g44aacb1a0b