git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Doan Tran Cong Danh <congdanhqx@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 3/3] sequencer: reencode to utf-8 before arrange rebase's todo list
Date: Thu, 31 Oct 2019 11:38:14 +0100 (CET)	[thread overview]
Message-ID: <nycvar.QRO.7.76.6.1910311134011.46@tvgsbejvaqbjf.bet> (raw)
In-Reply-To: <20191031092618.29073-4-congdanhqx@gmail.com>

Hi,


On Thu, 31 Oct 2019, Doan Tran Cong Danh wrote:

> On musl libc, ISO-2022-JP encoder is too eager to switch back to
> 1 byte encoding, musl's iconv always switch back after every combining
> character. Comparing glibc and musl's output for this command
> $ sed q t/t3900/ISO-2022-JP.txt| iconv -f ISO-2022-JP -t utf-8 |
> 	iconv -f utf-8 -t ISO-2022-JP | xxd
>
> glibc:
> 00000000: 1b24 4224 4f24 6c24 5224 5b24 551b 2842  .$B$O$l$R$[$U.(B
> 00000010: 0a                                       .
>
> musl:
> 00000000: 1b24 4224 4f1b 2842 1b24 4224 6c1b 2842  .$B$O.(B.$B$l.(B
> 00000010: 1b24 4224 521b 2842 1b24 4224 5b1b 2842  .$B$R.(B.$B$[.(B
> 00000020: 1b24 4224 551b 2842 0a                   .$B$U.(B.
>
> Although musl iconv's output isn't optimal, it's still correct.
>
> From commit 7d509878b8, ("pretty.c: format string with truncate respects
> logOutputEncoding", 2014-05-21), we're encoding the message to utf-8
> first, then format it and convert the message to the actual output
> encoding on git commit --squash.
>
> Thus, t3900 is failing on Linux with musl libc.
>
> Reencode to utf-8 before arranging rebase's todo list.

Since the re-encoded commit messages are only used for figuring out the
relationships between the `fixup!`/`squash!` commits and their targets,
but are not used in any of the lines that are written out, this change
looks good to me.

In fact, all three patches look good to me.

Thanks for unbreaking Git with musl!
Johannes

>
> Signed-off-by: Doan Tran Cong Danh <congdanhqx@gmail.com>
> ---
>
> Notes:
>     The todo list shown to user has already been reencoded by sequencer_make_script,
>     without this patch it looks like this:
>
>     $ head -3 .git/rebase-merge/git-rebase-todo | xxd
>     00000000: 7069 636b 2065 6633 3961 3033 201b 2442  pick ef39a03 .$B
>     00000010: 244f 1b28 421b 2442 246c 1b28 421b 2442  $O.(B.$B$l.(B.$B
>     00000020: 2452 1b28 421b 2442 245b 1b28 421b 2442  $R.(B.$B$[.(B.$B
>     00000030: 2455 1b28 420a 7069 636b 2062 3832 3931  $U.(B.pick b8291
>     00000040: 3336 2073 7175 6173 6821 201b 2442 244f  36 squash! .$B$O
>     00000050: 1b28 421b 2442 246c 1b28 421b 2442 2452  .(B.$B$l.(B.$B$R
>     00000060: 1b28 421b 2442 245b 1b28 421b 2442 2455  .(B.$B$[.(B.$B$U
>     00000070: 1b28 420a 7069 636b 2062 3532 3132 6437  .(B.pick b5212d7
>     00000080: 2069 6e74 6572 6d65 6469 6174 6520 636f   intermediate co
>     00000090: 6d6d 6974 0a                             mmit.
>
>  sequencer.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/sequencer.c b/sequencer.c
> index 9d5964fd81..69430fe23f 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -5169,7 +5169,7 @@ int todo_list_rearrange_squash(struct todo_list *todo_list)
>  		*commit_todo_item_at(&commit_todo, item->commit) = item;
>
>  		parse_commit(item->commit);
> -		commit_buffer = get_commit_buffer(item->commit, NULL);
> +		commit_buffer = logmsg_reencode(item->commit, NULL, "UTF-8");
>  		find_commit_subject(commit_buffer, &subject);
>  		format_subject(&buf, subject, " ");
>  		subject = subjects[i] = strbuf_detach(&buf, &subject_len);
> --
> 2.24.0.rc1.3.g89530838a3.dirty
>
>

  reply	other threads:[~2019-10-31 10:38 UTC|newest]

Thread overview: 89+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-31  9:26 [PATCH 0/3] Linux with musl libc improvement Doan Tran Cong Danh
2019-10-31  9:26 ` [PATCH 1/3] t0028: eliminate non-standard usage of printf Doan Tran Cong Danh
2019-10-31 17:41   ` Jeff King
2019-11-01  1:33     ` Danh Doan
2019-10-31 19:50   ` brian m. carlson
2019-10-31  9:26 ` [PATCH 2/3] configure.ac: define ICONV_OMITS_BOM if necessary Doan Tran Cong Danh
2019-10-31 18:11   ` Jeff King
2019-10-31 20:02     ` brian m. carlson
2019-11-01  1:40     ` Danh Doan
2019-10-31  9:26 ` [PATCH 3/3] sequencer: reencode to utf-8 before arrange rebase's todo list Doan Tran Cong Danh
2019-10-31 10:38   ` Johannes Schindelin [this message]
2019-10-31 19:26     ` Jeff King
2019-11-01  4:49       ` Danh Doan
2019-11-01  8:25 ` [PATCH v2 0/3] Linux with musl libc improvement Doan Tran Cong Danh
2019-11-01  8:25   ` [PATCH v2 1/3] t0028: eliminate non-standard usage of printf Doan Tran Cong Danh
2019-11-01 16:54     ` Jeff King
2019-11-01  8:25   ` [PATCH v2 2/3] configure.ac: define ICONV_OMITS_BOM if necessary Doan Tran Cong Danh
2019-11-01 16:56     ` Jeff King
2019-11-02  0:43       ` Danh Doan
2019-11-01  8:25   ` [PATCH v2 3/3] sequencer: reencode to utf-8 before arrange rebase's todo list Doan Tran Cong Danh
2019-11-01 16:59     ` Jeff King
2019-11-02  1:02       ` Danh Doan
2019-11-02 12:20         ` Danh Doan
2019-11-05  8:00         ` Jeff King
2019-11-06  1:30           ` Junio C Hamano
2019-11-06  4:03             ` Jeff King
2019-11-06 10:03               ` Danh Doan
2019-11-07  5:56                 ` Jeff King
2019-11-06  9:19 ` [PATCH v3 0/8] Correct internal working and output encoding Doan Tran Cong Danh
2019-11-06  9:19   ` [PATCH v3 1/8] t0028: eliminate non-standard usage of printf Doan Tran Cong Danh
2019-11-06  9:20   ` [PATCH v3 2/8] configure.ac: define ICONV_OMITS_BOM if necessary Doan Tran Cong Danh
2019-11-06  9:20   ` [PATCH v3 3/8] t3900: demonstrate git-rebase problem with multi encoding Doan Tran Cong Danh
2019-11-06  9:20   ` [PATCH v3 4/8] sequencer: reencode to utf-8 before arrange rebase's todo list Doan Tran Cong Danh
2019-11-06  9:20   ` [PATCH v3 5/8] sequencer: reencode revert/cherry-pick's " Doan Tran Cong Danh
2019-11-06  9:20   ` [PATCH v3 6/8] sequencer: reencode squashing commit's message Doan Tran Cong Danh
2019-11-06  9:20   ` [PATCH v3 7/8] sequencer: reencode old merge-commit message Doan Tran Cong Danh
2019-11-06 15:39     ` Eric Sunshine
2019-11-06  9:20   ` [PATCH v3 8/8] sequencer: reencode commit message for am/rebase --show-current-patch Doan Tran Cong Danh
2019-11-07  2:56 ` [PATCH v4 0/8] Correct internal working and output encoding Doan Tran Cong Danh
2019-11-07  2:56   ` [PATCH v4 1/8] t0028: eliminate non-standard usage of printf Doan Tran Cong Danh
2019-11-07  2:56   ` [PATCH v4 2/8] configure.ac: define ICONV_OMITS_BOM if necessary Doan Tran Cong Danh
2019-11-07  6:18     ` Junio C Hamano
2019-11-07  2:56   ` [PATCH v4 3/8] t3900: demonstrate git-rebase problem with multi encoding Doan Tran Cong Danh
2019-11-07  6:02     ` Jeff King
2019-11-07  6:48       ` Danh Doan
2019-11-07  8:02         ` Jeff King
2019-11-07 10:51           ` Danh Doan
2019-11-11  8:22             ` Jeff King
2019-11-07  2:56   ` [PATCH v4 4/8] sequencer: reencode to utf-8 before arrange rebase's todo list Doan Tran Cong Danh
2019-11-07  6:04     ` Jeff King
2019-11-07  2:56   ` [PATCH v4 5/8] sequencer: reencode revert/cherry-pick's " Doan Tran Cong Danh
2019-11-07  6:06     ` Jeff King
2019-11-07  2:56   ` [PATCH v4 6/8] sequencer: reencode squashing commit's message Doan Tran Cong Danh
2019-11-07  6:15     ` Jeff King
2019-11-07  2:56   ` [PATCH v4 7/8] sequencer: reencode old merge-commit message Doan Tran Cong Danh
2019-11-07  2:56   ` [PATCH v4 8/8] sequencer: reencode commit message for am/rebase --show-current-patch Doan Tran Cong Danh
2019-11-07  6:32     ` Jeff King
2019-11-07  7:48       ` Danh Doan
2019-11-07  8:03         ` Jeff King
2019-11-07 16:32           ` Danh Doan
2019-11-08  9:43 ` [PATCH v5 0/9] Improve odd encoding integration Doan Tran Cong Danh
2019-11-08  9:43   ` [PATCH v5 1/9] t0028: eliminate non-standard usage of printf Doan Tran Cong Danh
2019-11-08  9:43   ` [PATCH v5 2/9] configure.ac: define ICONV_OMITS_BOM if necessary Doan Tran Cong Danh
2019-11-08  9:43   ` [PATCH v5 3/9] t3900: demonstrate git-rebase problem with multi encoding Doan Tran Cong Danh
2019-11-08  9:43   ` [PATCH v5 4/9] sequencer: reencode to utf-8 before arrange rebase's todo list Doan Tran Cong Danh
2019-11-08  9:43   ` [PATCH v5 5/9] sequencer: reencode revert/cherry-pick's " Doan Tran Cong Danh
2019-11-08  9:43   ` [PATCH v5 6/9] sequencer: reencode squashing commit's message Doan Tran Cong Danh
2019-11-08  9:43   ` [PATCH v5 7/9] sequencer: reencode old merge-commit message Doan Tran Cong Danh
2019-11-08  9:43   ` [PATCH v5 8/9] sequencer: reencode commit message for am/rebase --show-current-patch Doan Tran Cong Danh
2019-11-08  9:43   ` [PATCH v5 9/9] sequencer: fallback to sane label in making rebase todo list Doan Tran Cong Danh
2019-11-11  1:22   ` [PATCH v5 0/9] Improve odd encoding integration Junio C Hamano
2019-11-11  4:02   ` Junio C Hamano
2019-11-11  4:43     ` Danh Doan
2019-11-11  6:14     ` Junio C Hamano
2019-11-11  6:03 ` [PATCH v6 0/9] sequencer: handle other encoding better Doan Tran Cong Danh
2019-11-11  6:03   ` [PATCH v6 1/9] t0028: eliminate non-standard usage of printf Doan Tran Cong Danh
2019-11-11  6:03   ` [PATCH v6 2/9] configure.ac: define ICONV_OMITS_BOM if necessary Doan Tran Cong Danh
2019-11-11  6:03   ` [PATCH v6 3/9] t3900: demonstrate git-rebase problem with multi encoding Doan Tran Cong Danh
2019-11-11  6:03   ` [PATCH v6 4/9] sequencer: reencode to utf-8 before arrange rebase's todo list Doan Tran Cong Danh
2019-11-11  6:03   ` [PATCH v6 5/9] sequencer: reencode revert/cherry-pick's " Doan Tran Cong Danh
2019-11-11  6:03   ` [PATCH v6 6/9] sequencer: reencode squashing commit's message Doan Tran Cong Danh
2019-11-11  6:03   ` [PATCH v6 7/9] sequencer: reencode old merge-commit message Doan Tran Cong Danh
2019-11-11  6:03   ` [PATCH v6 8/9] sequencer: reencode commit message for am/rebase --show-current-patch Doan Tran Cong Danh
2019-11-11  6:03   ` [PATCH v6 9/9] sequencer: fallback to sane label in making rebase todo list Doan Tran Cong Danh
2019-11-11  8:39     ` Jeff King
2019-11-11 16:22       ` Phillip Wood
2019-11-11 18:26     ` Johannes Schindelin
2019-11-12  4:17       ` Junio C Hamano
2019-11-11  8:40   ` [PATCH v6 0/9] sequencer: handle other encoding better Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=nycvar.QRO.7.76.6.1910311134011.46@tvgsbejvaqbjf.bet \
    --to=johannes.schindelin@gmx.de \
    --cc=congdanhqx@gmail.com \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).