* Broken interactive rebase text after some UTF-8 characters @ 2019-01-15 19:29 Michal Nowak 2019-01-16 10:33 ` Phillip Wood 0 siblings, 1 reply; 16+ messages in thread From: Michal Nowak @ 2019-01-15 19:29 UTC (permalink / raw) To: git Hello, on OpenIndiana 2018.10 (illumos kernel) line of the interactive rebase text after a particular name (Gergő Mihály Doma) is broken: pick 1ea94c756c 10202 loader: use screen-#rows to find bottom left co-ordinates Reviewed by: Toomas Soome <tsoome@me.com> Reviewed by: Gergő Mihály Doma <domag02@gmail.com> Approved by: Robert Mustacchi <rm@joyent.com> p pick cadd68ea00 10078 smatch fixes for UCB Reviewed by: Andy Fiddaman <omnios@citrus-it.net> Reviewed by: Toomas Soome <tsoome@me.com> Reviewed by: Peter Tribble <peter.tribble@gmail.com> Approved by: Robert Mustacchi <rm@joyent.com> Source: https://github.com/illumos/illumos-gate.git See the second item which only has "p". This is with LC_ALL=en_US.UTF-8 (same with cs_CZ.UTF-8), with LC_ALL=C the text is fine. I checked various editors (vim, nano, cat), they are all the same. `git log` (PAGER is set to `/usr/bin/less -ins`) shows the particular commit correctly. I tested following git version: 2.16.5, 2.19.2, 2.20.1, git.next branch. Here's the configure output: https://paste.ec/paste/oDdydNxZ#r3avan8BL+8lldFMo928cw3eXSServTsSkGPW4jRBOd Michal ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Broken interactive rebase text after some UTF-8 characters 2019-01-15 19:29 Broken interactive rebase text after some UTF-8 characters Michal Nowak @ 2019-01-16 10:33 ` Phillip Wood 2019-01-16 21:36 ` Michal Nowak 0 siblings, 1 reply; 16+ messages in thread From: Phillip Wood @ 2019-01-16 10:33 UTC (permalink / raw) To: Michal Nowak, git Dear Michal Thanks for the bug report, unfortunately I'm unable to reproduce it here using git 2.20.1 (see below). Knowing a little about how the todo list is created I cannot think how multibyte characters would break it. What command line were you using to start the rebase? Best Wishes Phillip $ git clone --shallow-since='11-1-2019' https://github.com/illumos/illumos-gate.git Cloning into 'illumos-gate'... remote: Enumerating objects: 54348, done. remote: Counting objects: 100% (54348/54348), done. remote: Compressing objects: 100% (39008/39008), done. remote: Total 54348 (delta 19671), reused 31187 (delta 11884), pack-reused 0 Receiving objects: 100% (54348/54348), 148.91 MiB | 1.18 MiB/s, done. Resolving deltas: 100% (19671/19671), done. Checking out files: 100% (47352/47352), done. $ cd ~/src/illumos-gate $ git log --oneline -1 f482e26c (HEAD -> master, origin/master, origin/HEAD) 10233 dboot: process_module() is missing newline at the end of the string Reviewed by: Sebastian Wiedenroth <wiedi@frubar.net> Reviewed by: Jason King <jbk@joyent.com> Reviewed by: Andy Stormont <astormont@racktopsystems.com> Approved by: Robert Mustacchi <rm@joyent.com> $ GIT_SEQUENCE_EDITOR=cat git rebase -i 08487eea68a2fa501b5042131c6db068089f82e1 hint: Waiting for your editor to close the file... pick 1ea94c75 10202 loader: use screen-#rows to find bottom left co-ordinates Reviewed by: Toomas Soome <tsoome@me.com> Reviewed by: Gergő Mihály Doma <domag02@gmail.com> Approved by: Robert Mustacchi <rm@joyent.com> pick cadd68ea 10078 smatch fixes for UCB Reviewed by: Andy Fiddaman <omnios@citrus-it.net> Reviewed by: Toomas Soome <tsoome@me.com> Reviewed by: Peter Tribble <peter.tribble@gmail.com> Approved by: Robert Mustacchi <rm@joyent.com> pick 05ede3db 10079 smatch Makefile changes for usr/src/lib Reviewed by: Andy Fiddaman <andy@omniosce.org> Approved by: Robert Mustacchi <rm@joyent.com> pick 5661bb76 10080 smatch Makefile changes for usr/src/cmd Reviewed by: Andy Fiddaman <andy@omniosce.org> Approved by: Robert Mustacchi <rm@joyent.com> pick 15c07adc 10081 smatch indenting fixes for usr/src/uts Reviewed by: Toomas Soome <tsoome@me.com> Reviewed by: Peter Tribble <peter.tribble@gmail.com> Reviewed by: Andy Fiddaman <andy@omniosce.org> Approved by: Robert Mustacchi <rm@joyent.com> pick c653bb47 10197 build smatch in parallel Reviewed by: Andy Fiddaman <andy@omniosce.org> Approved by: Robert Mustacchi <rm@joyent.com> pick 161294fe 10217 mdb: r9w isn't r8w 10218 CONV_CAP_VAL_HW2_BUFSIZE wasn't updated 10219 AV_386_2_AVX512VPOPCDQ elfcap name should be consistent Reviewed by: Dan McDonald <danmcd@joyent.com> Reviewed by: Rob Johnston <rob.johnston@joyent.com> Reviewed by: Patrick Mooney <patrick.mooney@joyent.com> Reviewed by: Andy Fiddaman <andy@omniosce.org> Approved by: Dan McDonald <danmcd@joyent.com> pick f482e26c 10233 dboot: process_module() is missing newline at the end of the string Reviewed by: Sebastian Wiedenroth <wiedi@frubar.net> Reviewed by: Jason King <jbk@joyent.com> Reviewed by: Andy Stormont <astormont@racktopsystems.com> Approved by: Robert Mustacchi <rm@joyent.com> # Rebase 08487eea..f482e26c onto 08487eea (8 commands) # # Commands: # p, pick <commit> = use commit # r, reword <commit> = use commit, but edit the commit message # e, edit <commit> = use commit, but stop for amending # s, squash <commit> = use commit, but meld into previous commit # f, fixup <commit> = like "squash", but discard this commit's log message # x, exec <command> = run command (the rest of the line) using shell # b, break = stop here (continue rebase later with 'git rebase --continue') # d, drop <commit> = remove commit # l, label <label> = label current HEAD with a name # t, reset <label> = reset HEAD to a label # m, merge [-C <commit> | -c <commit>] <label> [# <oneline>] # . create a merge commit using the original merge commit's # . message (or the oneline, if no original merge commit was # . specified). Use -c <commit> to reword the commit message. # # These lines can be re-ordered; they are executed from top to bottom. # # If you remove a line here THAT COMMIT WILL BE LOST. # # However, if you remove everything, the rebase will be aborted. # # Note that empty commits are commented out Successfully rebased and updated refs/heads/master. On 15/01/2019 19:29, Michal Nowak wrote: > Hello, > > on OpenIndiana 2018.10 (illumos kernel) line of the interactive rebase > text after a particular name (Gergő Mihály Doma) is broken: > > pick 1ea94c756c 10202 loader: use screen-#rows to find bottom left > co-ordinates Reviewed by: Toomas Soome <tsoome@me.com> Reviewed by: > Gergő Mihály Doma <domag02@gmail.com> Approved by: Robert Mustacchi > <rm@joyent.com> > p > pick cadd68ea00 10078 smatch fixes for UCB Reviewed by: Andy Fiddaman > <omnios@citrus-it.net> Reviewed by: Toomas Soome <tsoome@me.com> > Reviewed by: Peter Tribble <peter.tribble@gmail.com> Approved by: Robert > Mustacchi <rm@joyent.com> > > > Source: https://github.com/illumos/illumos-gate.git > > See the second item which only has "p". > > This is with LC_ALL=en_US.UTF-8 (same with cs_CZ.UTF-8), with LC_ALL=C > the text is fine. > > I checked various editors (vim, nano, cat), they are all the same. > > `git log` (PAGER is set to `/usr/bin/less -ins`) shows the particular > commit correctly. > > I tested following git version: 2.16.5, 2.19.2, 2.20.1, git.next branch. > > Here's the configure output: > https://paste.ec/paste/oDdydNxZ#r3avan8BL+8lldFMo928cw3eXSServTsSkGPW4jRBOd > > Michal > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Broken interactive rebase text after some UTF-8 characters 2019-01-16 10:33 ` Phillip Wood @ 2019-01-16 21:36 ` Michal Nowak 2019-01-17 11:04 ` Phillip Wood 0 siblings, 1 reply; 16+ messages in thread From: Michal Nowak @ 2019-01-16 21:36 UTC (permalink / raw) To: phillip.wood, git Hello Phillip, thanks for reaching out. This is what I see with your suggested command: $ GIT_SEQUENCE_EDITOR=cat git rebase -i 08487eea68a2fa501b5042131c6db068089f82e1 hint: Waiting for your editor to close the file... pick 1ea94c756c 10202 loader: use screen-#rows to find bottom left co-ordinates Reviewed by: Toomas Soome <tsoome@me.com> Reviewed by: Gergő Mihály Doma <domag02@gmail.com> Approved by: Robert Mustacchi <rm@joyent.com> p pick cadd68ea00 10078 smatch fixes for UCB Reviewed by: Andy Fiddaman <omnios@citrus-it.net> Reviewed by: Toomas Soome <tsoome@me.com> Reviewed by: Peter Tribble <peter.tribble@gmail.com> Approved by: Robert Mustacchi <rm@joyent.com> pick 05ede3db5e 10079 smatch Makefile changes for usr/src/lib Reviewed by: Andy Fiddaman <andy@omniosce.org> Approved by: Robert Mustacchi <rm@joyent.com> pick 5661bb7641 10080 smatch Makefile changes for usr/src/cmd Reviewed by: Andy Fiddaman <andy@omniosce.org> Approved by: Robert Mustacchi <rm@joyent.com> pick 15c07adc1c 10081 smatch indenting fixes for usr/src/uts Reviewed by: Toomas Soome <tsoome@me.com> Reviewed by: Peter Tribble <peter.tribble@gmail.com> Reviewed by: Andy Fiddaman <andy@omniosce.org> Approved by: Robert Mustacchi <rm@joyent.com> pick c653bb4713 10197 build smatch in parallel Reviewed by: Andy Fiddaman <andy@omniosce.org> Approved by: Robert Mustacchi <rm@joyent.com> # Rebase 08487eea68..c653bb4713 onto 08487eea68 (6 commands) # # Commands: # p, pick <commit> = use commit # r, reword <commit> = use commit, but edit the commit message # e, edit <commit> = use commit, but stop for amending # s, squash <commit> = use commit, but meld into previous commit # f, fixup <commit> = like "squash", but discard this commit's log message # x, exec <command> = run command (the rest of the line) using shell # b, break = stop here (continue rebase later with 'git rebase --continue') # d, drop <commit> = remove commit # l, label <label> = label current HEAD with a name # t, reset <label> = reset HEAD to a label # m, merge [-C <commit> | -c <commit>] <label> [# <oneline>] # . create a merge commit using the original merge commit's # . message (or the oneline, if no original merge commit was # . specified). Use -c <commit> to reword the commit message. # # These lines can be re-ordered; they are executed from top to bottom. # # If you remove a line here THAT COMMIT WILL BE LOST. # # However, if you remove everything, the rebase will be aborted. # # Note that empty commits are commented out error: missing arguments for pick error: invalid line 2: p You can fix this with 'git rebase --edit-todo' and then run 'git rebase --continue'. Or you can abort the rebase with 'git rebase --abort'. Let me know, if there's anything I can try myself to gather more information. Should you want to reproduce this on OpenIndiana on your own, Vagrant is the more straightforward way to deploy OpenIndiana. Thanks, Michal On 01/16/19 11:33 AM, Phillip Wood wrote: > Dear Michal > > Thanks for the bug report, unfortunately I'm unable to reproduce it here > using git 2.20.1 (see below). Knowing a little about how the todo list > is created I cannot think how multibyte characters would break it. What > command line were you using to start the rebase? > > Best Wishes > > Phillip > > $ git clone --shallow-since='11-1-2019' > https://github.com/illumos/illumos-gate.git > Cloning into 'illumos-gate'... > remote: Enumerating objects: 54348, done. > remote: Counting objects: 100% (54348/54348), done. > remote: Compressing objects: 100% (39008/39008), done. > remote: Total 54348 (delta 19671), reused 31187 (delta 11884), > pack-reused 0 > Receiving objects: 100% (54348/54348), 148.91 MiB | 1.18 MiB/s, done. > Resolving deltas: 100% (19671/19671), done. > Checking out files: 100% (47352/47352), done. > > $ cd ~/src/illumos-gate > $ git log --oneline -1 > f482e26c (HEAD -> master, origin/master, origin/HEAD) 10233 dboot: > process_module() is missing newline at the end of the string Reviewed > by: Sebastian Wiedenroth <wiedi@frubar.net> Reviewed by: Jason King > <jbk@joyent.com> Reviewed by: Andy Stormont > <astormont@racktopsystems.com> Approved by: Robert Mustacchi > <rm@joyent.com> > > $ GIT_SEQUENCE_EDITOR=cat git rebase -i > 08487eea68a2fa501b5042131c6db068089f82e1 > > hint: Waiting for your editor to close the file... > pick 1ea94c75 10202 loader: use screen-#rows to find bottom left > co-ordinates Reviewed by: Toomas Soome <tsoome@me.com> Reviewed by: > Gergő Mihály Doma <domag02@gmail.com> Approved by: Robert Mustacchi > <rm@joyent.com> > pick cadd68ea 10078 smatch fixes for UCB Reviewed by: Andy Fiddaman > <omnios@citrus-it.net> Reviewed by: Toomas Soome <tsoome@me.com> > Reviewed by: Peter Tribble <peter.tribble@gmail.com> Approved by: Robert > Mustacchi <rm@joyent.com> > pick 05ede3db 10079 smatch Makefile changes for usr/src/lib Reviewed by: > Andy Fiddaman <andy@omniosce.org> Approved by: Robert Mustacchi > <rm@joyent.com> > pick 5661bb76 10080 smatch Makefile changes for usr/src/cmd Reviewed by: > Andy Fiddaman <andy@omniosce.org> Approved by: Robert Mustacchi > <rm@joyent.com> > pick 15c07adc 10081 smatch indenting fixes for usr/src/uts Reviewed by: > Toomas Soome <tsoome@me.com> Reviewed by: Peter Tribble > <peter.tribble@gmail.com> Reviewed by: Andy Fiddaman <andy@omniosce.org> > Approved by: Robert Mustacchi <rm@joyent.com> > pick c653bb47 10197 build smatch in parallel Reviewed by: Andy Fiddaman > <andy@omniosce.org> Approved by: Robert Mustacchi <rm@joyent.com> > pick 161294fe 10217 mdb: r9w isn't r8w 10218 CONV_CAP_VAL_HW2_BUFSIZE > wasn't updated 10219 AV_386_2_AVX512VPOPCDQ elfcap name should be > consistent Reviewed by: Dan McDonald <danmcd@joyent.com> Reviewed by: > Rob Johnston <rob.johnston@joyent.com> Reviewed by: Patrick Mooney > <patrick.mooney@joyent.com> Reviewed by: Andy Fiddaman > <andy@omniosce.org> Approved by: Dan McDonald <danmcd@joyent.com> > pick f482e26c 10233 dboot: process_module() is missing newline at the > end of the string Reviewed by: Sebastian Wiedenroth <wiedi@frubar.net> > Reviewed by: Jason King <jbk@joyent.com> Reviewed by: Andy Stormont > <astormont@racktopsystems.com> Approved by: Robert Mustacchi > <rm@joyent.com> > > # Rebase 08487eea..f482e26c onto 08487eea (8 commands) > # > # Commands: > # p, pick <commit> = use commit > # r, reword <commit> = use commit, but edit the commit message > # e, edit <commit> = use commit, but stop for amending > # s, squash <commit> = use commit, but meld into previous commit > # f, fixup <commit> = like "squash", but discard this commit's log message > # x, exec <command> = run command (the rest of the line) using shell > # b, break = stop here (continue rebase later with 'git rebase --continue') > # d, drop <commit> = remove commit > # l, label <label> = label current HEAD with a name > # t, reset <label> = reset HEAD to a label > # m, merge [-C <commit> | -c <commit>] <label> [# <oneline>] > # . create a merge commit using the original merge commit's > # . message (or the oneline, if no original merge commit was > # . specified). Use -c <commit> to reword the commit message. > # > # These lines can be re-ordered; they are executed from top to bottom. > # > # If you remove a line here THAT COMMIT WILL BE LOST. > # > # However, if you remove everything, the rebase will be aborted. > # > # Note that empty commits are commented out > Successfully rebased and updated refs/heads/master. > > > On 15/01/2019 19:29, Michal Nowak wrote: >> Hello, >> >> on OpenIndiana 2018.10 (illumos kernel) line of the interactive rebase >> text after a particular name (Gergő Mihály Doma) is broken: >> >> pick 1ea94c756c 10202 loader: use screen-#rows to find bottom left >> co-ordinates Reviewed by: Toomas Soome <tsoome@me.com> Reviewed by: >> Gergő Mihály Doma <domag02@gmail.com> Approved by: Robert Mustacchi >> <rm@joyent.com> >> p >> pick cadd68ea00 10078 smatch fixes for UCB Reviewed by: Andy Fiddaman >> <omnios@citrus-it.net> Reviewed by: Toomas Soome <tsoome@me.com> >> Reviewed by: Peter Tribble <peter.tribble@gmail.com> Approved by: >> Robert Mustacchi <rm@joyent.com> >> >> >> Source: https://github.com/illumos/illumos-gate.git >> >> See the second item which only has "p". >> >> This is with LC_ALL=en_US.UTF-8 (same with cs_CZ.UTF-8), with LC_ALL=C >> the text is fine. >> >> I checked various editors (vim, nano, cat), they are all the same. >> >> `git log` (PAGER is set to `/usr/bin/less -ins`) shows the particular >> commit correctly. >> >> I tested following git version: 2.16.5, 2.19.2, 2.20.1, git.next branch. >> >> Here's the configure output: >> https://paste.ec/paste/oDdydNxZ#r3avan8BL+8lldFMo928cw3eXSServTsSkGPW4jRBOd >> >> >> Michal >> > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Broken interactive rebase text after some UTF-8 characters 2019-01-16 21:36 ` Michal Nowak @ 2019-01-17 11:04 ` Phillip Wood 2019-01-31 17:43 ` Alban Gruin 0 siblings, 1 reply; 16+ messages in thread From: Phillip Wood @ 2019-01-17 11:04 UTC (permalink / raw) To: Michal Nowak, phillip.wood, git Hi Michal On 16/01/2019 21:36, Michal Nowak wrote: > Hello Phillip, thanks for reaching out. > > This is what I see with your suggested command: > > > $ GIT_SEQUENCE_EDITOR=cat git rebase -i > 08487eea68a2fa501b5042131c6db068089f82e1 > hint: Waiting for your editor to close the file... pick 1ea94c756c 10202 > loader: use screen-#rows to find bottom left co-ordinates Reviewed by: > Toomas Soome <tsoome@me.com> Reviewed by: Gergő Mihály Doma > <domag02@gmail.com> Approved by: Robert Mustacchi <rm@joyent.com> > p That's really weird, I'm not sure what's going on. One thing that would be useful would be to check whether the output of git log --pretty='%s' --reverse 08487eea68a2fa5.. includes the stray 'p' or not. I had a look through the configure output you attached to your previous mail and it seems you're building without iconv and gettext support, I've built git on linux with make NO_ICONV=1 NO_GETTEXT=1 to try and test that and the rebase todo list was fine. Building with configure rather than the Makefile that comes with git can be a bit flaky sometimes as occasionally when a new compile option gets added to the Makefile updating the configure script gets overlooked but I don't know if that is the problem here or not. Best Wishes Phillip > pick cadd68ea00 10078 smatch fixes for UCB Reviewed by: Andy Fiddaman > <omnios@citrus-it.net> Reviewed by: Toomas Soome <tsoome@me.com> > Reviewed by: Peter Tribble <peter.tribble@gmail.com> Approved by: Robert > Mustacchi <rm@joyent.com> > pick 05ede3db5e 10079 smatch Makefile changes for usr/src/lib Reviewed > by: Andy Fiddaman <andy@omniosce.org> Approved by: Robert Mustacchi > <rm@joyent.com> > pick 5661bb7641 10080 smatch Makefile changes for usr/src/cmd Reviewed > by: Andy Fiddaman <andy@omniosce.org> Approved by: Robert Mustacchi > <rm@joyent.com> > pick 15c07adc1c 10081 smatch indenting fixes for usr/src/uts Reviewed > by: Toomas Soome <tsoome@me.com> Reviewed by: Peter Tribble > <peter.tribble@gmail.com> Reviewed by: Andy Fiddaman <andy@omniosce.org> > Approved by: Robert Mustacchi <rm@joyent.com> > pick c653bb4713 10197 build smatch in parallel Reviewed by: Andy > Fiddaman <andy@omniosce.org> Approved by: Robert Mustacchi <rm@joyent.com> > > # Rebase 08487eea68..c653bb4713 onto 08487eea68 (6 commands) > # > # Commands: > # p, pick <commit> = use commit > # r, reword <commit> = use commit, but edit the commit message > # e, edit <commit> = use commit, but stop for amending > # s, squash <commit> = use commit, but meld into previous commit > # f, fixup <commit> = like "squash", but discard this commit's log message > # x, exec <command> = run command (the rest of the line) using shell > # b, break = stop here (continue rebase later with 'git rebase --continue') > # d, drop <commit> = remove commit > # l, label <label> = label current HEAD with a name > # t, reset <label> = reset HEAD to a label > # m, merge [-C <commit> | -c <commit>] <label> [# <oneline>] > # . create a merge commit using the original merge commit's > # . message (or the oneline, if no original merge commit was > # . specified). Use -c <commit> to reword the commit message. > # > # These lines can be re-ordered; they are executed from top to bottom. > # > # If you remove a line here THAT COMMIT WILL BE LOST. > # > # However, if you remove everything, the rebase will be aborted. > # > # Note that empty commits are commented out > error: missing arguments for pick > error: invalid line 2: p > You can fix this with 'git rebase --edit-todo' and then run 'git rebase > --continue'. > Or you can abort the rebase with 'git rebase --abort'. > > > Let me know, if there's anything I can try myself to gather more > information. Should you want to reproduce this on OpenIndiana on your > own, Vagrant is the more straightforward way to deploy OpenIndiana. > > Thanks, > Michal > > On 01/16/19 11:33 AM, Phillip Wood wrote: >> Dear Michal >> >> Thanks for the bug report, unfortunately I'm unable to reproduce it >> here using git 2.20.1 (see below). Knowing a little about how the todo >> list is created I cannot think how multibyte characters would break >> it. What command line were you using to start the rebase? >> >> Best Wishes >> >> Phillip >> >> $ git clone --shallow-since='11-1-2019' >> https://github.com/illumos/illumos-gate.git >> Cloning into 'illumos-gate'... >> remote: Enumerating objects: 54348, done. >> remote: Counting objects: 100% (54348/54348), done. >> remote: Compressing objects: 100% (39008/39008), done. >> remote: Total 54348 (delta 19671), reused 31187 (delta 11884), >> pack-reused 0 >> Receiving objects: 100% (54348/54348), 148.91 MiB | 1.18 MiB/s, done. >> Resolving deltas: 100% (19671/19671), done. >> Checking out files: 100% (47352/47352), done. >> >> $ cd ~/src/illumos-gate >> $ git log --oneline -1 >> f482e26c (HEAD -> master, origin/master, origin/HEAD) 10233 dboot: >> process_module() is missing newline at the end of the string Reviewed >> by: Sebastian Wiedenroth <wiedi@frubar.net> Reviewed by: Jason King >> <jbk@joyent.com> Reviewed by: Andy Stormont >> <astormont@racktopsystems.com> Approved by: Robert Mustacchi >> <rm@joyent.com> >> >> $ GIT_SEQUENCE_EDITOR=cat git rebase -i >> 08487eea68a2fa501b5042131c6db068089f82e1 >> >> hint: Waiting for your editor to close the file... >> pick 1ea94c75 10202 loader: use screen-#rows to find bottom left >> co-ordinates Reviewed by: Toomas Soome <tsoome@me.com> Reviewed by: >> Gergő Mihály Doma <domag02@gmail.com> Approved by: Robert Mustacchi >> <rm@joyent.com> >> pick cadd68ea 10078 smatch fixes for UCB Reviewed by: Andy Fiddaman >> <omnios@citrus-it.net> Reviewed by: Toomas Soome <tsoome@me.com> >> Reviewed by: Peter Tribble <peter.tribble@gmail.com> Approved by: >> Robert Mustacchi <rm@joyent.com> >> pick 05ede3db 10079 smatch Makefile changes for usr/src/lib Reviewed >> by: Andy Fiddaman <andy@omniosce.org> Approved by: Robert Mustacchi >> <rm@joyent.com> >> pick 5661bb76 10080 smatch Makefile changes for usr/src/cmd Reviewed >> by: Andy Fiddaman <andy@omniosce.org> Approved by: Robert Mustacchi >> <rm@joyent.com> >> pick 15c07adc 10081 smatch indenting fixes for usr/src/uts Reviewed >> by: Toomas Soome <tsoome@me.com> Reviewed by: Peter Tribble >> <peter.tribble@gmail.com> Reviewed by: Andy Fiddaman >> <andy@omniosce.org> Approved by: Robert Mustacchi <rm@joyent.com> >> pick c653bb47 10197 build smatch in parallel Reviewed by: Andy >> Fiddaman <andy@omniosce.org> Approved by: Robert Mustacchi >> <rm@joyent.com> >> pick 161294fe 10217 mdb: r9w isn't r8w 10218 CONV_CAP_VAL_HW2_BUFSIZE >> wasn't updated 10219 AV_386_2_AVX512VPOPCDQ elfcap name should be >> consistent Reviewed by: Dan McDonald <danmcd@joyent.com> Reviewed by: >> Rob Johnston <rob.johnston@joyent.com> Reviewed by: Patrick Mooney >> <patrick.mooney@joyent.com> Reviewed by: Andy Fiddaman >> <andy@omniosce.org> Approved by: Dan McDonald <danmcd@joyent.com> >> pick f482e26c 10233 dboot: process_module() is missing newline at the >> end of the string Reviewed by: Sebastian Wiedenroth <wiedi@frubar.net> >> Reviewed by: Jason King <jbk@joyent.com> Reviewed by: Andy Stormont >> <astormont@racktopsystems.com> Approved by: Robert Mustacchi >> <rm@joyent.com> >> >> # Rebase 08487eea..f482e26c onto 08487eea (8 commands) >> # >> # Commands: >> # p, pick <commit> = use commit >> # r, reword <commit> = use commit, but edit the commit message >> # e, edit <commit> = use commit, but stop for amending >> # s, squash <commit> = use commit, but meld into previous commit >> # f, fixup <commit> = like "squash", but discard this commit's log >> message >> # x, exec <command> = run command (the rest of the line) using shell >> # b, break = stop here (continue rebase later with 'git rebase >> --continue') >> # d, drop <commit> = remove commit >> # l, label <label> = label current HEAD with a name >> # t, reset <label> = reset HEAD to a label >> # m, merge [-C <commit> | -c <commit>] <label> [# <oneline>] >> # . create a merge commit using the original merge commit's >> # . message (or the oneline, if no original merge commit was >> # . specified). Use -c <commit> to reword the commit message. >> # >> # These lines can be re-ordered; they are executed from top to bottom. >> # >> # If you remove a line here THAT COMMIT WILL BE LOST. >> # >> # However, if you remove everything, the rebase will be aborted. >> # >> # Note that empty commits are commented out >> Successfully rebased and updated refs/heads/master. >> >> >> On 15/01/2019 19:29, Michal Nowak wrote: >>> Hello, >>> >>> on OpenIndiana 2018.10 (illumos kernel) line of the interactive >>> rebase text after a particular name (Gergő Mihály Doma) is broken: >>> >>> pick 1ea94c756c 10202 loader: use screen-#rows to find bottom left >>> co-ordinates Reviewed by: Toomas Soome <tsoome@me.com> Reviewed by: >>> Gergő Mihály Doma <domag02@gmail.com> Approved by: Robert Mustacchi >>> <rm@joyent.com> >>> p >>> pick cadd68ea00 10078 smatch fixes for UCB Reviewed by: Andy Fiddaman >>> <omnios@citrus-it.net> Reviewed by: Toomas Soome <tsoome@me.com> >>> Reviewed by: Peter Tribble <peter.tribble@gmail.com> Approved by: >>> Robert Mustacchi <rm@joyent.com> >>> >>> >>> Source: https://github.com/illumos/illumos-gate.git >>> >>> See the second item which only has "p". >>> >>> This is with LC_ALL=en_US.UTF-8 (same with cs_CZ.UTF-8), with >>> LC_ALL=C the text is fine. >>> >>> I checked various editors (vim, nano, cat), they are all the same. >>> >>> `git log` (PAGER is set to `/usr/bin/less -ins`) shows the particular >>> commit correctly. >>> >>> I tested following git version: 2.16.5, 2.19.2, 2.20.1, git.next branch. >>> >>> Here's the configure output: >>> https://paste.ec/paste/oDdydNxZ#r3avan8BL+8lldFMo928cw3eXSServTsSkGPW4jRBOd >>> >>> >>> Michal >>> >> ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Broken interactive rebase text after some UTF-8 characters 2019-01-17 11:04 ` Phillip Wood @ 2019-01-31 17:43 ` Alban Gruin 2019-01-31 20:40 ` Phillip Wood 0 siblings, 1 reply; 16+ messages in thread From: Alban Gruin @ 2019-01-31 17:43 UTC (permalink / raw) To: phillip.wood, Michal Nowak, git Hi Phillip and Michal, I think I found the bug. If you look at .git/rebase-merge/git-rebase-todo.backup, which is created before the editor is opened, you can see that it does not have this problem. In between, transform_todos() is called and causes the problem reported by Michal. This seems to be caused by a single line, sequencer:4661 (on b5101f9297, "Fourth batch after 2.20", 2019-01-29)[1]. If you add just before a something like this: fwrite(item->arg, item->arg_len, sizeof(char), stdout); You will see that the argument is properly written to stdout. But if you write this: printf("%.*s\n", item->arg_len, item->arg); You will have the same broken output as in the todo file. Are we misusing C formats? [1] https://github.com/git/git/blob/master/sequencer.c#L4661 Cheers, Alban ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Broken interactive rebase text after some UTF-8 characters 2019-01-31 17:43 ` Alban Gruin @ 2019-01-31 20:40 ` Phillip Wood 2019-01-31 21:00 ` Alban Gruin 2019-01-31 21:35 ` Junio C Hamano 0 siblings, 2 replies; 16+ messages in thread From: Phillip Wood @ 2019-01-31 20:40 UTC (permalink / raw) To: Alban Gruin, phillip.wood, Michal Nowak, git; +Cc: Johannes Schindelin Hi Alban On 31/01/2019 17:43, Alban Gruin wrote: > Hi Phillip and Michal, > > I think I found the bug. Good find! Which os are you on? > > If you look at .git/rebase-merge/git-rebase-todo.backup, which is > created before the editor is opened, you can see that it does not have > this problem. In between, transform_todos() is called and causes the > problem reported by Michal. > > This seems to be caused by a single line, sequencer:4661 (on b5101f9297, > "Fourth batch after 2.20", 2019-01-29)[1]. If you add just before a > something like this: > > fwrite(item->arg, item->arg_len, sizeof(char), stdout); > > You will see that the argument is properly written to stdout. But if > you write this: > > printf("%.*s\n", item->arg_len, item->arg); > > You will have the same broken output as in the todo file. > > Are we misusing C formats? The C standard and POSIX both say that the * refers to the maximum number of bytes to print but it looks like it is being treated as the maximum number of characters on OpenIndiana. Johannes - Perhaps we should change it to use fwrite() unless printf() gets fixed and we're sure no other operating systems are affected? [1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/fprintf.html [2] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf page 309 Best Wishes Phillip > [1] https://github.com/git/git/blob/master/sequencer.c#L4661 > > Cheers, > Alban > > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Broken interactive rebase text after some UTF-8 characters 2019-01-31 20:40 ` Phillip Wood @ 2019-01-31 21:00 ` Alban Gruin 2019-01-31 21:35 ` Junio C Hamano 1 sibling, 0 replies; 16+ messages in thread From: Alban Gruin @ 2019-01-31 21:00 UTC (permalink / raw) To: phillip.wood, Michal Nowak, git; +Cc: Johannes Schindelin Hi Phillip, Le 31/01/2019 à 21:40, Phillip Wood a écrit : > Hi Alban > > On 31/01/2019 17:43, Alban Gruin wrote: >> Hi Phillip and Michal, >> >> I think I found the bug. > > Good find! Which os are you on? I’m on Linux, but I found this on OpenIndiana running under QEMU. -- Alban ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Broken interactive rebase text after some UTF-8 characters 2019-01-31 20:40 ` Phillip Wood 2019-01-31 21:00 ` Alban Gruin @ 2019-01-31 21:35 ` Junio C Hamano 2019-02-01 7:38 ` Johannes Schindelin 1 sibling, 1 reply; 16+ messages in thread From: Junio C Hamano @ 2019-01-31 21:35 UTC (permalink / raw) To: Phillip Wood Cc: Alban Gruin, phillip.wood, Michal Nowak, git, Johannes Schindelin Phillip Wood <phillip.wood@talktalk.net> writes: >> Are we misusing C formats? > > The C standard and POSIX both say that the * refers to the maximum > number of bytes to print but it looks like it is being treated as the > maximum number of characters on OpenIndiana. > > Johannes - Perhaps we should change it to use fwrite() unless printf() > gets fixed and we're sure no other operating systems are affected? Avoid such a rewrite, as "%*.s" that takes (int, char *) are used in many other places in our codebase, if you can. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Broken interactive rebase text after some UTF-8 characters 2019-01-31 21:35 ` Junio C Hamano @ 2019-02-01 7:38 ` Johannes Schindelin 2019-02-01 9:06 ` Michal Nowak 2019-02-01 16:13 ` Alban Gruin 0 siblings, 2 replies; 16+ messages in thread From: Johannes Schindelin @ 2019-02-01 7:38 UTC (permalink / raw) To: Junio C Hamano; +Cc: Phillip Wood, Alban Gruin, phillip.wood, Michal Nowak, git Hi, On Thu, 31 Jan 2019, Junio C Hamano wrote: > Phillip Wood <phillip.wood@talktalk.net> writes: > > >> Are we misusing C formats? > > > > The C standard and POSIX both say that the * refers to the maximum > > number of bytes to print but it looks like it is being treated as the > > maximum number of characters on OpenIndiana. > > > > Johannes - Perhaps we should change it to use fwrite() unless printf() > > gets fixed and we're sure no other operating systems are affected? > > Avoid such a rewrite, as "%*.s" that takes (int, char *) are used in > many other places in our codebase, if you can. Yes, this would be painful in particular in cases like master:advice.c:101: fprintf(stderr, _("%shint: %.*s%s\n"), where we want to write more than just a variable-length buffer. I am curious: is libintl (gettext) used on OpenIndiana? I ask because AFAIR fprintf() is overridden in that case, and the bug might be a lot easier to fix if it is in libintl rather than in libc. Of course, it might *still* be a bug in libc by virtue of handing '%.*s' through to libc's implementation. Alban, can you test this with NO_GETTEXT? Thanks, Johannes ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Broken interactive rebase text after some UTF-8 characters 2019-02-01 7:38 ` Johannes Schindelin @ 2019-02-01 9:06 ` Michal Nowak 2019-02-01 14:33 ` Johannes Schindelin 2019-02-01 16:15 ` Alban Gruin 2019-02-01 16:13 ` Alban Gruin 1 sibling, 2 replies; 16+ messages in thread From: Michal Nowak @ 2019-02-01 9:06 UTC (permalink / raw) To: Johannes Schindelin Cc: Junio C Hamano, Phillip Wood, Alban Gruin, phillip.wood, Michal Nowak, git Johannes, On Friday, February 1, 2019 at 8:38 AM, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > Hi, > > On Thu, 31 Jan 2019, Junio C Hamano wrote: > >> Phillip Wood <phillip.wood@talktalk.net> writes: >> >> >> Are we misusing C formats? >> > >> > The C standard and POSIX both say that the * refers to the maximum >> > number of bytes to print but it looks like it is being treated as the >> > maximum number of characters on OpenIndiana. >> > >> > Johannes - Perhaps we should change it to use fwrite() unless >> printf() >> > gets fixed and we're sure no other operating systems are affected? >> >> Avoid such a rewrite, as "%*.s" that takes (int, char *) are used in >> many other places in our codebase, if you can. > > Yes, this would be painful in particular in cases like > > master:advice.c:101: fprintf(stderr, _("%shint: %.*s%s\n"), > > where we want to write more than just a variable-length buffer. > > I am curious: is libintl (gettext) used on OpenIndiana? I ask because > AFAIR fprintf() is overridden in that case, and the bug might be a lot > easier to fix if it is in libintl rather than in libc. here you can see the full output of the OpenIndiana git build: https://hipster.openindiana.org/logs/oi-userland/latest/git.publish.log. From what I see there, libintl was found. If you believe this is illumos libc bug, it would be cool if someone created an simple testcase, which I can forward to the illumos developers. Thanks, Michal > > Of course, it might *still* be a bug in libc by virtue of handing '%.*s' > through to libc's implementation. > > Alban, can you test this with NO_GETTEXT? > > Thanks, > Johannes ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Broken interactive rebase text after some UTF-8 characters 2019-02-01 9:06 ` Michal Nowak @ 2019-02-01 14:33 ` Johannes Schindelin 2019-02-01 16:24 ` Michal Nowak 2019-02-01 16:15 ` Alban Gruin 1 sibling, 1 reply; 16+ messages in thread From: Johannes Schindelin @ 2019-02-01 14:33 UTC (permalink / raw) To: Michal Nowak; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin, phillip.wood, git Hi Michal, On Fri, 1 Feb 2019, Michal Nowak wrote: > On Friday, February 1, 2019 at 8:38 AM, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > > > > On Thu, 31 Jan 2019, Junio C Hamano wrote: > > > >> Phillip Wood <phillip.wood@talktalk.net> writes: > >> > >> >> Are we misusing C formats? > >> > > >> > The C standard and POSIX both say that the * refers to the maximum > >> > number of bytes to print but it looks like it is being treated as the > >> > maximum number of characters on OpenIndiana. > >> > > >> > Johannes - Perhaps we should change it to use fwrite() unless > >> printf() > >> > gets fixed and we're sure no other operating systems are affected? > >> > >> Avoid such a rewrite, as "%*.s" that takes (int, char *) are used in > >> many other places in our codebase, if you can. > > > > Yes, this would be painful in particular in cases like > > > > master:advice.c:101: fprintf(stderr, _("%shint: %.*s%s\n"), > > > > where we want to write more than just a variable-length buffer. > > > > I am curious: is libintl (gettext) used on OpenIndiana? I ask because > > AFAIR fprintf() is overridden in that case, and the bug might be a lot > > easier to fix if it is in libintl rather than in libc. > > here you can see the full output of the OpenIndiana git build: https://hipster.openindiana.org/logs/oi-userland/latest/git.publish.log. > > From what I see there, libintl was found. > > If you believe this is illumos libc bug, it would be cool if someone created an simple testcase, which I can forward to the illumos developers. You already have that example. Just take the UTF-8 text in your original bug report, put it into something like int main(int argc, char **argv) { char utf8[] = "... your text here..."; printf("%.*s", (int)(sizeof(utf8) - 1), utf8); return 0; } You should first verify, though, that this replicate the problem, and if it does not, use libintl (I think you have to `#include <gettext.h>` and `-lintl` or some such) and see whether that reproduces your problem. Ciao, Johannes > > Thanks, > Michal > > > > > Of course, it might *still* be a bug in libc by virtue of handing '%.*s' > > through to libc's implementation. > > > > Alban, can you test this with NO_GETTEXT? > > > > Thanks, > > Johannes > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Broken interactive rebase text after some UTF-8 characters 2019-02-01 14:33 ` Johannes Schindelin @ 2019-02-01 16:24 ` Michal Nowak 2019-02-01 17:30 ` Junio C Hamano 0 siblings, 1 reply; 16+ messages in thread From: Michal Nowak @ 2019-02-01 16:24 UTC (permalink / raw) To: Johannes Schindelin Cc: Junio C Hamano, Phillip Wood, Alban Gruin, phillip.wood, git On 02/01/19 03:33 PM, Johannes Schindelin wrote: > Hi Michal, > > On Fri, 1 Feb 2019, Michal Nowak wrote: > >> On Friday, February 1, 2019 at 8:38 AM, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: >>> >>> On Thu, 31 Jan 2019, Junio C Hamano wrote: >>> >>>> Phillip Wood <phillip.wood@talktalk.net> writes: >>>> >>>>>> Are we misusing C formats? >>>>> >>>>> The C standard and POSIX both say that the * refers to the maximum >>>>> number of bytes to print but it looks like it is being treated as the >>>>> maximum number of characters on OpenIndiana. >>>>> >>>>> Johannes - Perhaps we should change it to use fwrite() unless >>>> printf() >>>>> gets fixed and we're sure no other operating systems are affected? >>>> >>>> Avoid such a rewrite, as "%*.s" that takes (int, char *) are used in >>>> many other places in our codebase, if you can. >>> >>> Yes, this would be painful in particular in cases like >>> >>> master:advice.c:101: fprintf(stderr, _("%shint: %.*s%s\n"), >>> >>> where we want to write more than just a variable-length buffer. >>> >>> I am curious: is libintl (gettext) used on OpenIndiana? I ask because >>> AFAIR fprintf() is overridden in that case, and the bug might be a lot >>> easier to fix if it is in libintl rather than in libc. >> >> here you can see the full output of the OpenIndiana git build: https://hipster.openindiana.org/logs/oi-userland/latest/git.publish.log. >> >> From what I see there, libintl was found. >> >> If you believe this is illumos libc bug, it would be cool if someone created an simple testcase, which I can forward to the illumos developers. > > You already have that example. Just take the UTF-8 text in your original > bug report, put it into something like > > int main(int argc, char **argv) > { > char utf8[] = "... your text here..."; > > printf("%.*s", (int)(sizeof(utf8) - 1), utf8); > > return 0; > } > > You should first verify, though, that this replicate the problem, and if > it does not, use libintl (I think you have to `#include <gettext.h>` and > `-lintl` or some such) and see whether that reproduces your problem. Thank you, Johannes for the test case. However, I don't see any problem with the output on OpenIndiana: {global} newman@lenovo:~ $ cat printf.c #include <stdio.h> //#include <gettext.h> int main(int argc, char **argv) { char utf8[] = "Gergő Mihály Doma\n"; printf("%.*s", (int)(sizeof(utf8) - 1), utf8); return 0; } {global} newman@lenovo:~ $ gcc printf.c -o printf && ./printf Gergő Mihály Doma Enabled the gettext header in the source file. {global} newman@lenovo:~ $ gcc printf.c -o printf_intl -lintl -I/usr/share/gettext/ && ./printf_intl Gergő Mihály Doma {global} newman@lenovo:~ $ ldd printf printf_intl printf: libc.so.1 => /lib/libc.so.1 libm.so.2 => /lib/libm.so.2 printf_intl: libintl.so.1 => /lib/libintl.so.1 libc.so.1 => /lib/libc.so.1 libm.so.2 => /lib/libm.so.2 {global} newman@lenovo:~ $ locale LANG=en_US.UTF-8 LC_CTYPE="en_US.UTF-8" LC_NUMERIC="en_US.UTF-8" LC_TIME="en_US.UTF-8" LC_COLLATE="en_US.UTF-8" LC_MONETARY="en_US.UTF-8" LC_MESSAGES="en_US.UTF-8" LC_ALL= I even tried more arcane characters from https://www.w3.org/2001/06/utf-8-test/UTF-8-demo.html but they are displayed correctly as well. Michal > > Ciao, > Johannes > >> >> Thanks, >> Michal >> >>> >>> Of course, it might *still* be a bug in libc by virtue of handing '%.*s' >>> through to libc's implementation. >>> >>> Alban, can you test this with NO_GETTEXT? >>> >>> Thanks, >>> Johannes >> ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Broken interactive rebase text after some UTF-8 characters 2019-02-01 16:24 ` Michal Nowak @ 2019-02-01 17:30 ` Junio C Hamano 2019-02-01 19:00 ` Michal Nowak 0 siblings, 1 reply; 16+ messages in thread From: Junio C Hamano @ 2019-02-01 17:30 UTC (permalink / raw) To: Michal Nowak Cc: Johannes Schindelin, Phillip Wood, Alban Gruin, phillip.wood, git Michal Nowak <mnowak@startmail.com> writes: >> You already have that example. Just take the UTF-8 text in your original >> bug report, put it into something like >> >> int main(int argc, char **argv) >> { >> char utf8[] = "... your text here..."; >> >> printf("%.*s", (int)(sizeof(utf8) - 1), utf8); >> >> return 0; >> } When replayed literally, this is not a very good test. > {global} newman@lenovo:~ $ cat printf.c > #include <stdio.h> > //#include <gettext.h> > int main(int argc, char **argv) { > char utf8[] = "Gergő Mihály Doma\n"; > printf("%.*s", (int)(sizeof(utf8) - 1), utf8); > return 0; > } And this is replaying it literally. The current working suspicion in this thread is that the platform printf("%.*s", num, str) emits up to num "characters" starting at str, which is an incorrect implementation, as it should emit up to num "bytes". Notice that the num in this case is the byte count of that utf8[] string. That number is always larger than the number of "characters" for a string with multi-byte character(s) in it. Let's say that the sample string has N "characters", and it is N+X "bytes" long, where X > 1. If the suspicion is correct, i.e. the way the printf implementation is broken on this platform is that it shows up to num "characters", then the call is asking to show up to N+X "characters". The buggy printf shows all the available N "characters", notices the string stops there, and finishes. So you won't _see_ the bug with that test program. Instead, use something like this. #include <stdio.h> int main(int ac, char **av) { char utf8[] = "ふabc"; printf("%.*s\n", 4, utf8); return 0; } With or without gettext or i18n, the output must end with 'a' followed by a newline, and you must not see 'b' nor 'c'. Otherwise your printf is broken. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Broken interactive rebase text after some UTF-8 characters 2019-02-01 17:30 ` Junio C Hamano @ 2019-02-01 19:00 ` Michal Nowak 0 siblings, 0 replies; 16+ messages in thread From: Michal Nowak @ 2019-02-01 19:00 UTC (permalink / raw) To: Junio C Hamano Cc: Johannes Schindelin, Phillip Wood, Alban Gruin, phillip.wood, git On 02/01/19 06:30 PM, Junio C Hamano wrote: > Michal Nowak <mnowak@startmail.com> writes: > >>> You already have that example. Just take the UTF-8 text in your original >>> bug report, put it into something like >>> >>> int main(int argc, char **argv) >>> { >>> char utf8[] = "... your text here..."; >>> >>> printf("%.*s", (int)(sizeof(utf8) - 1), utf8); >>> >>> return 0; >>> } > > When replayed literally, this is not a very good test. > >> {global} newman@lenovo:~ $ cat printf.c >> #include <stdio.h> >> //#include <gettext.h> >> int main(int argc, char **argv) { >> char utf8[] = "Gergő Mihály Doma\n"; >> printf("%.*s", (int)(sizeof(utf8) - 1), utf8); >> return 0; >> } > > And this is replaying it literally. > > The current working suspicion in this thread is that the platform > printf("%.*s", num, str) emits up to num "characters" starting at > str, which is an incorrect implementation, as it should emit up to > num "bytes". > > Notice that the num in this case is the byte count of that utf8[] > string. That number is always larger than the number of "characters" > for a string with multi-byte character(s) in it. Let's say that the > sample string has N "characters", and it is N+X "bytes" long, where > X > 1. > > If the suspicion is correct, i.e. the way the printf implementation > is broken on this platform is that it shows up to num "characters", > then the call is asking to show up to N+X "characters". The buggy > printf shows all the available N "characters", notices the string > stops there, and finishes. So you won't _see_ the bug with that > test program. > > Instead, use something like this. > > #include <stdio.h> > > int main(int ac, char **av) > { > char utf8[] = "ふabc"; > printf("%.*s\n", 4, utf8); > return 0; > } > > With or without gettext or i18n, the output must end with 'a' followed > by a newline, and you must not see 'b' nor 'c'. Otherwise your printf > is broken. > The output ends with 'a' for me on OpenIndiana: {global} newman@lenovo:~ $ gcc tp.c -o tp && ./tp ふa Though, I can reproduce the problem with Alban's test case the exact same way he wrote earlier today. Michal ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Broken interactive rebase text after some UTF-8 characters 2019-02-01 9:06 ` Michal Nowak 2019-02-01 14:33 ` Johannes Schindelin @ 2019-02-01 16:15 ` Alban Gruin 1 sibling, 0 replies; 16+ messages in thread From: Alban Gruin @ 2019-02-01 16:15 UTC (permalink / raw) To: Michal Nowak, Johannes Schindelin Cc: Junio C Hamano, Phillip Wood, phillip.wood, git [-- Attachment #1: Type: text/plain, Size: 2316 bytes --] Hi Michal, Le 01/02/2019 à 10:06, Michal Nowak a écrit : > Johannes, > > On Friday, February 1, 2019 at 8:38 AM, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: >> Hi, >> >> On Thu, 31 Jan 2019, Junio C Hamano wrote: >> >>> Phillip Wood <phillip.wood@talktalk.net> writes: >>> >>>>> Are we misusing C formats? >>>> >>>> The C standard and POSIX both say that the * refers to the maximum >>>> number of bytes to print but it looks like it is being treated as the >>>> maximum number of characters on OpenIndiana. >>>> >>>> Johannes - Perhaps we should change it to use fwrite() unless >>> printf() >>>> gets fixed and we're sure no other operating systems are affected? >>> >>> Avoid such a rewrite, as "%*.s" that takes (int, char *) are used in >>> many other places in our codebase, if you can. >> >> Yes, this would be painful in particular in cases like >> >> master:advice.c:101: fprintf(stderr, _("%shint: %.*s%s\n"), >> >> where we want to write more than just a variable-length buffer. >> >> I am curious: is libintl (gettext) used on OpenIndiana? I ask because >> AFAIR fprintf() is overridden in that case, and the bug might be a lot >> easier to fix if it is in libintl rather than in libc. > > here you can see the full output of the OpenIndiana git build: https://hipster.openindiana.org/logs/oi-userland/latest/git.publish.log. > > From what I see there, libintl was found. > > If you believe this is illumos libc bug, it would be cool if someone created an simple testcase, which I can forward to the illumos developers. > I attached a test case to this email. You can build it with `gcc test-case.c', and run it with `./a.out'. Output on my Linux system: Before setting locale: Expected output: áaaa Actual output: áaaa After setting locale: Expected output: áaaa Actual output: áaaa Output on an OpenIndiana system: Before setting locale: Expected output: áaaa Actual output: áaaa After setting locale: Expected output: áaaa Actual output: áaaaa > Thanks, > Michal > >> >> Of course, it might *still* be a bug in libc by virtue of handing '%.*s' >> through to libc's implementation. >> >> Alban, can you test this with NO_GETTEXT? >> >> Thanks, >> Johannes Cheers, Alban [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: test-case.c --] [-- Type: text/x-csrc; name="test-case.c", Size: 650 bytes --] /* * Test case for OpenIndiana '%.*s' bug * Build with `gcc test-case.c' * Run with `./a.out' */ #include <stdlib.h> #include <stdio.h> #include <string.h> #include <locale.h> #include <libintl.h> static void compare_output(const char *str, int len) { puts("Expected output:"); fwrite(str, len, sizeof(char), stdout); puts("\nActual output:"); printf("%.*s\n", len, str); } int main(int argc, char **argv) { char buf[] = "áaaaaaa"; puts("Before setting locale:"); compare_output(buf, 5); setlocale(LC_ALL, ""); puts("\nAfter setting locale:"); compare_output(buf, 5); return EXIT_SUCCESS; } ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Broken interactive rebase text after some UTF-8 characters 2019-02-01 7:38 ` Johannes Schindelin 2019-02-01 9:06 ` Michal Nowak @ 2019-02-01 16:13 ` Alban Gruin 1 sibling, 0 replies; 16+ messages in thread From: Alban Gruin @ 2019-02-01 16:13 UTC (permalink / raw) To: Johannes Schindelin, Junio C Hamano Cc: Phillip Wood, phillip.wood, Michal Nowak, git Hi Johannes, Le 01/02/2019 à 08:38, Johannes Schindelin a écrit : > Hi, > > On Thu, 31 Jan 2019, Junio C Hamano wrote: > >> Phillip Wood <phillip.wood@talktalk.net> writes: >> >>>> Are we misusing C formats? >>> >>> The C standard and POSIX both say that the * refers to the maximum >>> number of bytes to print but it looks like it is being treated as the >>> maximum number of characters on OpenIndiana. >>> >>> Johannes - Perhaps we should change it to use fwrite() unless printf() >>> gets fixed and we're sure no other operating systems are affected? >> >> Avoid such a rewrite, as "%*.s" that takes (int, char *) are used in >> many other places in our codebase, if you can. > > Yes, this would be painful in particular in cases like > > master:advice.c:101: fprintf(stderr, _("%shint: %.*s%s\n"), > > where we want to write more than just a variable-length buffer. > > I am curious: is libintl (gettext) used on OpenIndiana? I ask because > AFAIR fprintf() is overridden in that case, and the bug might be a lot > easier to fix if it is in libintl rather than in libc. > > Of course, it might *still* be a bug in libc by virtue of handing '%.*s' > through to libc's implementation. > > Alban, can you test this with NO_GETTEXT? Sure. :) The bug no longer happens when git is built with NO_GETTEXT. All is working as expected. > > Thanks, > Johannes > Cheers, Alban ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2019-02-01 19:00 UTC | newest] Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-01-15 19:29 Broken interactive rebase text after some UTF-8 characters Michal Nowak 2019-01-16 10:33 ` Phillip Wood 2019-01-16 21:36 ` Michal Nowak 2019-01-17 11:04 ` Phillip Wood 2019-01-31 17:43 ` Alban Gruin 2019-01-31 20:40 ` Phillip Wood 2019-01-31 21:00 ` Alban Gruin 2019-01-31 21:35 ` Junio C Hamano 2019-02-01 7:38 ` Johannes Schindelin 2019-02-01 9:06 ` Michal Nowak 2019-02-01 14:33 ` Johannes Schindelin 2019-02-01 16:24 ` Michal Nowak 2019-02-01 17:30 ` Junio C Hamano 2019-02-01 19:00 ` Michal Nowak 2019-02-01 16:15 ` Alban Gruin 2019-02-01 16:13 ` Alban Gruin
Code repositories for project(s) associated with this public inbox https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).