git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] git gui: fix branch name encoding error on git gui
@ 2019-12-07  0:29 加藤一博
  2019-12-09 19:15 ` Junio C Hamano
  2019-12-11 16:05 ` Pratyush Yadav
  0 siblings, 2 replies; 5+ messages in thread
From: 加藤一博 @ 2019-12-07  0:29 UTC (permalink / raw)
  To: git@vger.kernel.org; +Cc: 加藤一博

After "git checkout -b '漢字'" to create a branch with UTF-8
character in it, "git gui" shows the branch name incorrectly,
as it forgets to turn the bytes
read from the "git for-each-ref" and
read from "HEAD" file
into Unicode characters.

Signed-off-by: Kazuhiro Kato <kato-k@ksysllc.co.jp>
---
 git-gui.sh     | 1 +
 lib/branch.tcl | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/git-gui.sh b/git-gui.sh
index 0d21f56..8f4a9ae 100755
--- a/git-gui.sh
+++ b/git-gui.sh
@@ -684,6 +684,7 @@ proc load_current_branch {} {
 	global current_branch is_detached
 
 	set fd [open [gitdir HEAD] r]
+	fconfigure $fd -translation binary -encoding utf-8
 	if {[gets $fd ref] < 1} {
 		set ref {}
 	}
diff --git a/lib/branch.tcl b/lib/branch.tcl
index 777eeb7..8b0c485 100644
--- a/lib/branch.tcl
+++ b/lib/branch.tcl
@@ -8,6 +8,7 @@ proc load_all_heads {} {
 	set rh_len [expr {[string length $rh] + 1}]
 	set all_heads [list]
 	set fd [git_read for-each-ref --format=%(refname) $rh]
+	fconfigure $fd -translation binary -encoding utf-8
 	while {[gets $fd line] > 0} {
 		if {!$some_heads_tracking || ![is_tracking_branch $line]} {
 			lappend all_heads [string range $line $rh_len end]
@@ -24,6 +25,7 @@ proc load_all_tags {} {
 		--sort=-taggerdate \
 		--format=%(refname) \
 		refs/tags]
+	fconfigure $fd -translation binary -encoding utf-8
 	while {[gets $fd line] > 0} {
 		if {![regsub ^refs/tags/ $line {} name]} continue
 		lappend all_tags $name
-- 
2.20.1.windows.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] git gui: fix branch name encoding error on git gui
  2019-12-07  0:29 [PATCH] git gui: fix branch name encoding error on git gui 加藤一博
@ 2019-12-09 19:15 ` Junio C Hamano
  2019-12-09 21:09   ` Pratyush Yadav
  2019-12-11 16:05 ` Pratyush Yadav
  1 sibling, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2019-12-09 19:15 UTC (permalink / raw)
  To: Pratyush Yadav; +Cc: git@vger.kernel.org, 加藤一博

加藤一博 <kato-k@ksysllc.co.jp> writes:

> After "git checkout -b '漢字'" to create a branch with UTF-8
> character in it, "git gui" shows the branch name incorrectly,
> as it forgets to turn the bytes
> read from the "git for-each-ref" and
> read from "HEAD" file
> into Unicode characters.

Thanks.

Note to the git-gui mentainer.  The above may want to be
line-wrapped a bit.

> Signed-off-by: Kazuhiro Kato <kato-k@ksysllc.co.jp>
> ---
>  git-gui.sh     | 1 +
>  lib/branch.tcl | 2 ++
>  2 files changed, 3 insertions(+)
>
> diff --git a/git-gui.sh b/git-gui.sh
> index 0d21f56..8f4a9ae 100755
> --- a/git-gui.sh
> +++ b/git-gui.sh
> @@ -684,6 +684,7 @@ proc load_current_branch {} {
>  	global current_branch is_detached
>  
>  	set fd [open [gitdir HEAD] r]
> +	fconfigure $fd -translation binary -encoding utf-8
>  	if {[gets $fd ref] < 1} {
>  		set ref {}
>  	}

A comment totally outside the scope of this fix to anybody
interested in further working on this code.

This piece of code is way too intimate with the implementation
details of HEAD and yet not intimate enough to know that HEAD can be
a symlink (in other words, it is a poor imitation of the real logic
implemented in git core).  A kosher way to implement this would be
to call

	git symbolic-ref --quiet --short HEAD

which would succeed and give the branch name to its standard output,
or would fail when the head is detached.  Set "current_branch" and
"is_detached" according to the outcome.

And yes, Kato-san's fconfigure fix in this patch will still be
relevant even after such a fix to the implementation of this proc.

> diff --git a/lib/branch.tcl b/lib/branch.tcl 
> index 777eeb7..8b0c485 100644
> --- a/lib/branch.tcl
> +++ b/lib/branch.tcl
> @@ -8,6 +8,7 @@ proc load_all_heads {} {
>  	set rh_len [expr {[string length $rh] + 1}]
>  	set all_heads [list]
>  	set fd [git_read for-each-ref --format=%(refname) $rh]
> +	fconfigure $fd -translation binary -encoding utf-8
>  	while {[gets $fd line] > 0} {
>  		if {!$some_heads_tracking || ![is_tracking_branch $line]} {
>  			lappend all_heads [string range $line $rh_len end]
> @@ -24,6 +25,7 @@ proc load_all_tags {} {
>  		--sort=-taggerdate \
>  		--format=%(refname) \
>  		refs/tags]
> +	fconfigure $fd -translation binary -encoding utf-8
>  	while {[gets $fd line] > 0} {
>  		if {![regsub ^refs/tags/ $line {} name]} continue
>  		lappend all_tags $name

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] git gui: fix branch name encoding error on git gui
  2019-12-09 19:15 ` Junio C Hamano
@ 2019-12-09 21:09   ` Pratyush Yadav
  2019-12-09 21:25     ` Junio C Hamano
  0 siblings, 1 reply; 5+ messages in thread
From: Pratyush Yadav @ 2019-12-09 21:09 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git@vger.kernel.org, 加藤一博,
	Jonathan Gilbert

On 09/12/19 11:15AM, Junio C Hamano wrote:
> 加藤一博 <kato-k@ksysllc.co.jp> writes:
> 
> > After "git checkout -b '漢字'" to create a branch with UTF-8
> > character in it, "git gui" shows the branch name incorrectly,
> > as it forgets to turn the bytes
> > read from the "git for-each-ref" and
> > read from "HEAD" file
> > into Unicode characters.
> 
> Thanks.
> 
> Note to the git-gui mentainer.  The above may want to be
> line-wrapped a bit.

Thanks. Already done :)
 
> > Signed-off-by: Kazuhiro Kato <kato-k@ksysllc.co.jp>
> > ---
> >  git-gui.sh     | 1 +
> >  lib/branch.tcl | 2 ++
> >  2 files changed, 3 insertions(+)
> >
> > diff --git a/git-gui.sh b/git-gui.sh
> > index 0d21f56..8f4a9ae 100755
> > --- a/git-gui.sh
> > +++ b/git-gui.sh
> > @@ -684,6 +684,7 @@ proc load_current_branch {} {
> >  	global current_branch is_detached
> >  
> >  	set fd [open [gitdir HEAD] r]
> > +	fconfigure $fd -translation binary -encoding utf-8
> >  	if {[gets $fd ref] < 1} {
> >  		set ref {}
> >  	}
> 
> A comment totally outside the scope of this fix to anybody
> interested in further working on this code.
> 
> This piece of code is way too intimate with the implementation
> details of HEAD and yet not intimate enough to know that HEAD can be
> a symlink (in other words, it is a poor imitation of the real logic
> implemented in git core).  A kosher way to implement this would be
> to call
> 
> 	git symbolic-ref --quiet --short HEAD
> 
> which would succeed and give the branch name to its standard output,
> or would fail when the head is detached.  Set "current_branch" and
> "is_detached" according to the outcome.

It was introduced in fc4e8da (git-gui: Internalize symbolic-ref HEAD 
reading logic, 2007-05-30). The commit message is:

  To improve performance on fork+exec impoverished systems (such as
  Windows) we want to avoid running git-symbolic-ref on every rescan
  if we can do so.  A quick way to implement such an avoidance is to
  just read the HEAD ref ourselves; we'll either see it as a symref
  (starts with "ref: ") or we'll see it as a detached head (40 hex
  digits).  In either case we can treat that as our current branch.

Now I'm not sure how relevant this still is over 12 years later, but 
AFAIK a fork+exec is still very costly on Windows.

So I wonder whether we should manually check if HEAD is a symbolic link 
or we should just use git-symbolic-ref and hope the performance doesn't 
drop too much.
 
> And yes, Kato-san's fconfigure fix in this patch will still be
> relevant even after such a fix to the implementation of this proc.

Me and Jonathan (Cc) have been having a discussion [0] about whether 
hard-coding UTF-8 as the refname encoding is the right idea. The gist of 
it is that Git _technically_ allows refnames to be in other encodings as 
long as the strings are NULL terminated. It does not restrict itself to 
valid UTF-8 only. More details in the linked thread, of course.

My position is that we should default to UTF-8 given its popularity (at 
least in the Git world), but I'm wondering whether we should also add a 
config variable to allow users to configure their encodings.

If you don't mind, your thoughts on this would be appreciated :)

[0] https://github.com/prati0100/git-gui/pull/21

-- 
Regards,
Pratyush Yadav

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] git gui: fix branch name encoding error on git gui
  2019-12-09 21:09   ` Pratyush Yadav
@ 2019-12-09 21:25     ` Junio C Hamano
  0 siblings, 0 replies; 5+ messages in thread
From: Junio C Hamano @ 2019-12-09 21:25 UTC (permalink / raw)
  To: Pratyush Yadav
  Cc: git@vger.kernel.org, 加藤一博,
	Jonathan Gilbert

Pratyush Yadav <me@yadavpratyush.com> writes:

> My position is that we should default to UTF-8 given its popularity (at 
> least in the Git world), but I'm wondering whether we should also add a 
> config variable to allow users to configure their encodings.
>
> If you don't mind, your thoughts on this would be appreciated :)
>
> [0] https://github.com/prati0100/git-gui/pull/21

Quite honestly, as long as we can keep the core part to stay
"pathnames and refnames are just runs of bytes, terminated by NUL,
and no other interpretation is necessary", I do not think I care too
deeply if the UI layers (either GUI or Porcelain) become a bit more
restrictive.

But that is just my gut feeling.



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] git gui: fix branch name encoding error on git gui
  2019-12-07  0:29 [PATCH] git gui: fix branch name encoding error on git gui 加藤一博
  2019-12-09 19:15 ` Junio C Hamano
@ 2019-12-11 16:05 ` Pratyush Yadav
  1 sibling, 0 replies; 5+ messages in thread
From: Pratyush Yadav @ 2019-12-11 16:05 UTC (permalink / raw)
  To: 加藤一博; +Cc: git@vger.kernel.org

On 07/12/19 12:29AM, 加藤一博 wrote:
> After "git checkout -b '漢字'" to create a branch with UTF-8
> character in it, "git gui" shows the branch name incorrectly,
> as it forgets to turn the bytes
> read from the "git for-each-ref" and
> read from "HEAD" file
> into Unicode characters.
> 
> Signed-off-by: Kazuhiro Kato <kato-k@ksysllc.co.jp>

Merged. Thanks.

-- 
Regards,
Pratyush Yadav

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-12-11 16:05 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-07  0:29 [PATCH] git gui: fix branch name encoding error on git gui 加藤一博
2019-12-09 19:15 ` Junio C Hamano
2019-12-09 21:09   ` Pratyush Yadav
2019-12-09 21:25     ` Junio C Hamano
2019-12-11 16:05 ` Pratyush Yadav

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).