git@vger.kernel.org mailing list mirror (one of many)
* [[PATCH v3] 0/2] module_list enhancements
@ 2013-06-14 15:56 Fredrik Gustafsson
  2013-06-14 15:56 ` [[PATCH v3] 1/2] [submodule] handle multibyte characters in name Fredrik Gustafsson
  2013-06-14 15:56 ` [[PATCH v3] 2/2] [submodule] Replace perl-code with sh Fredrik Gustafsson
  0 siblings, 2 replies; 6+ messages in thread
From: Fredrik Gustafsson @ 2013-06-14 15:56 UTC (permalink / raw)
  To: gitster; +Cc: git, iveqy, jens.lehmann

Reworded commit message for
[submodule] handle multibyte characters in name
as suggested by Junio.

Previous iteration can be found here:
http://thread.gmane.org/gmane.comp.version-control.git/227786/

Fredrik Gustafsson (2):
  [submodule] handle multibyte characters in name
  [submodule] Replace perl-code with sh

 git-submodule.sh           | 53 ++++++++++++++++++++--------------------------
 t/t7400-submodule-basic.sh | 12 +++++++++++
 2 files changed, 35 insertions(+), 30 deletions(-)

-- 
1.8.3.1.381.g2ab719e.dirty


* [[PATCH v3] 1/2] [submodule] handle multibyte characters in name
  2013-06-14 15:56 [[PATCH v3] 0/2] module_list enhancements Fredrik Gustafsson
@ 2013-06-14 15:56 ` Fredrik Gustafsson
  2013-06-14 17:23   ` Junio C Hamano
  2013-06-14 15:56 ` [[PATCH v3] 2/2] [submodule] Replace perl-code with sh Fredrik Gustafsson
  1 sibling, 1 reply; 6+ messages in thread
From: Fredrik Gustafsson @ 2013-06-14 15:56 UTC (permalink / raw)
  To: gitster; +Cc: git, iveqy, jens.lehmann

Many "git submodule" operations do not work on a submodule at a path whose
name is not in ASCII.

This is because "git ls-files" is used to find which paths in the current
working tree are bound to submodules, and its output is C-quoted by default
for non-ASCII pathnames and for pathnames that have a double-quote, a
backslash or a control character like a newline or a tab in them.

Tell "git ls-files" to not C-quote its output, which is easier than unwrapping
C-quote ourselves.

This patch still does not allow pathnames with characters that do need C-quote,
but the code didn't handle them before, so it is not making things worse. The
correct approach to solve the problem for all pathnames may be to use
"ls-files -z" and tell the Perl script that reads its output to read NUL
separated records by using $/ = "\0".

Solution-suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Fredrik Gustafsson <iveqy@iveqy.com>
---
 git-submodule.sh           |  2 +-
 t/t7400-submodule-basic.sh | 12 ++++++++++++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 79bfaac..bad051e 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -113,7 +113,7 @@ resolve_relative_url ()
 module_list()
 {
 	(
-		git ls-files --error-unmatch --stage -- "$@" ||
+		git -c core.quotepath=false ls-files --error-unmatch --stage -- "$@" ||
 		echo "unmatched pathspec exists"
 	) |
 	perl -e '
diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index ff26535..d5743ee 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -868,4 +868,16 @@ test_expect_success 'submodule deinit fails when submodule has a .git directory
 	test -n "$(git config --get-regexp "submodule\.example\.")"
 '
 
+test_expect_success 'submodule with strange name works "å äö"' '
+	mkdir "å äö" &&
+	(
+		cd "å äö" &&
+		git init &&
+		touch sub &&
+		git add sub &&
+		git commit -m "init sub"
+	) &&
+	git submodule add "/å äö" &&
+	test -n "$(git submodule | grep "å äö")"
+'
 test_done
-- 
1.8.3.1.381.g2ab719e.dirty
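
The quoting behaviour described in the commit message can be observed
directly. The following is only a minimal sketch, not part of the patch: the
function name demo_quotepath, the temporary repository and the file name are
all made up for illustration, and it assumes git and mktemp are available.

```shell
#!/bin/sh
# Sketch: compare how "git ls-files" prints a non-ASCII path with and
# without core.quotepath=false. Repository and file name are hypothetical.
demo_quotepath () {
	dir=$(mktemp -d) || return 1
	(
		cd "$dir" &&
		git init -q &&
		# "café" encoded as UTF-8
		name=$(printf 'caf\303\251') &&
		: >"$name" &&
		git add -- "$name" &&
		# printf is used instead of echo so backslashes are printed verbatim
		printf 'default: %s\n' "$(git -c core.quotepath=true ls-files)" &&
		printf 'quotepath=false: %s\n' "$(git -c core.quotepath=false ls-files)"
	)
	status=$?
	rm -rf "$dir"
	return $status
}

demo_quotepath
```

The first line shows the path wrapped in double quotes with octal escapes,
which is what the Perl scriptlet in module_list would have received; the
second shows the raw bytes.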


* [[PATCH v3] 2/2] [submodule] Replace perl-code with sh
  2013-06-14 15:56 [[PATCH v3] 0/2] module_list enhancements Fredrik Gustafsson
  2013-06-14 15:56 ` [[PATCH v3] 1/2] [submodule] handle multibyte characters in name Fredrik Gustafsson
@ 2013-06-14 15:56 ` Fredrik Gustafsson
  1 sibling, 0 replies; 6+ messages in thread
From: Fredrik Gustafsson @ 2013-06-14 15:56 UTC (permalink / raw)
  To: gitster; +Cc: git, iveqy, jens.lehmann

This avoids a fork and makes the code similar to the rest of the
file.

In the long term git-submodule.sh needs to use something other than sh to
handle newlines in filenames (and therefore needs to use a language that
accepts \0 in strings). However, I don't think that keeping that small
Perl part will ease any rewrite.

Signed-off-by: Fredrik Gustafsson <iveqy@iveqy.com>
---
 git-submodule.sh | 51 ++++++++++++++++++++++-----------------------------
 1 file changed, 22 insertions(+), 29 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index bad051e..be96934 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -112,38 +112,31 @@ resolve_relative_url ()
 #
 module_list()
 {
+	null_sha1=0000000000000000000000000000000000000000
+	unmerged=
 	(
 		git -c core.quotepath=false ls-files --error-unmatch --stage -- "$@" ||
-		echo "unmatched pathspec exists"
+		echo "#unmatched"
 	) |
-	perl -e '
-	my %unmerged = ();
-	my ($null_sha1) = ("0" x 40);
-	my @out = ();
-	my $unmatched = 0;
-	while (<STDIN>) {
-		if (/^unmatched pathspec/) {
-			$unmatched = 1;
-			next;
-		}
-		chomp;
-		my ($mode, $sha1, $stage, $path) =
-			/^([0-7]+) ([0-9a-f]{40}) ([0-3])\t(.*)$/;
-		next unless $mode eq "160000";
-		if ($stage ne "0") {
-			if (!$unmerged{$path}++) {
-				push @out, "$mode $null_sha1 U\t$path\n";
-			}
-			next;
-		}
-		push @out, "$_\n";
-	}
-	if ($unmatched) {
-		print "#unmatched\n";
-	} else {
-		print for (@out);
-	}
-	'
+	while read -r mode sha1 stage path
+	do
+		if test "$mode" = "#unmatched"
+		then
+			echo "#unmatched"
+		elif test "$mode" = "160000"
+		then
+			if test "$stage" != "0"
+			then
+				if test "$unmerged" != "$path"
+				then
+					echo "$mode $null_sha1 U $path"
+				fi
+				unmerged="$path"
+			else
+				echo "$mode $sha1 $stage $path"
+			fi
+		fi
+	done
 }
 
 die_if_unmatched ()
-- 
1.8.3.1.381.g2ab719e.dirty
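
To see what the shell loop does without a real index, it can be fed canned
"ls-files --stage" records. This is a rough sketch rather than the patched
function itself: filter_modules and sample are hypothetical names, the hashes
and paths are invented, and the "#unmatched" handling is left out.

```shell
#!/bin/sh
# Sketch of the filtering loop, fed with canned "ls-files --stage" records
# instead of a real index. Modes, hashes and paths below are invented.
null_sha1=0000000000000000000000000000000000000000

filter_modules () {
	unmerged=
	while read -r mode sha1 stage path
	do
		# Only gitlinks (mode 160000) are submodules.
		test "$mode" = "160000" || continue
		if test "$stage" != "0"
		then
			# Report each conflicted path once, whatever its stages.
			# This relies on entries for one path being adjacent.
			test "$unmerged" = "$path" || echo "$mode $null_sha1 U $path"
			unmerged=$path
		else
			echo "$mode $sha1 $stage $path"
		fi
	done
}

sample () {
	a=1111111111111111111111111111111111111111
	b=2222222222222222222222222222222222222222
	printf '100644 %s 0\tREADME\n' "$a"
	printf '160000 %s 0\tlib/one\n' "$a"
	printf '160000 %s 1\tlib/two\n' "$a"
	printf '160000 %s 2\tlib/two\n' "$b"
}

sample | filter_modules
```

The regular file is dropped, lib/one passes through, and the two conflicted
stages of lib/two collapse into a single "U" record with the null hash.
Unlike the Perl hash, the single "unmerged" variable only remembers the
previous path, which is enough as long as the input is sorted by path.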


* Re: [[PATCH v3] 1/2] [submodule] handle multibyte characters in name
  2013-06-14 15:56 ` [[PATCH v3] 1/2] [submodule] handle multibyte characters in name Fredrik Gustafsson
@ 2013-06-14 17:23   ` Junio C Hamano
  2013-06-14 18:27     ` Fredrik Gustafsson
  0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2013-06-14 17:23 UTC (permalink / raw)
  To: Fredrik Gustafsson; +Cc: git, jens.lehmann

Fredrik Gustafsson <iveqy@iveqy.com> writes:

> ... The
> correct approach to solve the problem for all pathnames may be to use
> "ls-files -z" and tell the Perl script that reads its output to read NUL
> separated records by using $/ = "\0".

I've tentatively queued the attached without 2/2; the scriptlet is
small enough not to matter in an eventual rewrite, so it shouldn't
make a difference either way.

-- >8 --
From: Fredrik Gustafsson <iveqy@iveqy.com>
Subject: [PATCH] handle multibyte characters in name

Many "git submodule" operations do not work on a submodule at a path whose
name is not in ASCII.

This is because "git ls-files" is used to find which paths in the current
working tree are bound to submodules, and its output is C-quoted by default
for non-ASCII pathnames.

Tell "git ls-files" to not C-quote its output, which is easier than unwrapping
C-quote ourselves.

Signed-off-by: Fredrik Gustafsson <iveqy@iveqy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 git-submodule.sh           |  3 ++-
 t/t7400-submodule-basic.sh | 12 ++++++++++++
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 79bfaac..48bdf84 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -113,7 +113,7 @@ resolve_relative_url ()
 module_list()
 {
 	(
-		git ls-files --error-unmatch --stage -- "$@" ||
+		git ls-files -z --error-unmatch --stage -- "$@" ||
 		echo "unmatched pathspec exists"
 	) |
 	perl -e '
@@ -121,6 +121,7 @@ module_list()
 	my ($null_sha1) = ("0" x 40);
 	my @out = ();
 	my $unmatched = 0;
+	$/ = "\0";
 	while (<STDIN>) {
 		if (/^unmatched pathspec/) {
 			$unmatched = 1;
diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index ff26535..d5743ee 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -868,4 +868,16 @@ test_expect_success 'submodule deinit fails when submodule has a .git directory
 	test -n "$(git config --get-regexp "submodule\.example\.")"
 '
 
+test_expect_success 'submodule with strange name works "å äö"' '
+	mkdir "å äö" &&
+	(
+		cd "å äö" &&
+		git init &&
+		touch sub &&
+		git add sub &&
+		git commit -m "init sub"
+	) &&
+	git submodule add "/å äö" &&
+	test -n "$(git submodule | grep "å äö")"
+'
 test_done
-- 
1.8.3.1-538-gb4d04a7
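
The effect of "-z" can be checked from the shell, even though sh cannot hold
a NUL inside a variable. The sketch below is illustrative only:
demo_nul_records and the file names are hypothetical, and it assumes git,
mktemp and a filesystem that allows a tab in a file name.

```shell
#!/bin/sh
# Sketch: with -z, ls-files emits one NUL-terminated record per path and
# leaves the path bytes unquoted. Names here are hypothetical.
demo_nul_records () {
	dir=$(mktemp -d) || return 1
	(
		cd "$dir" &&
		git init -q &&
		# A name containing a tab would be C-quoted without -z.
		tabbed=$(printf 'a\tb') &&
		: >"$tabbed" &&
		: >plain &&
		git add -- "$tabbed" plain &&
		# Count records by counting the NUL terminators.
		git ls-files -z | tr -cd '\000' | wc -c | tr -d ' '
	)
	status=$?
	rm -rf "$dir"
	return $status
}

demo_nul_records
```

Two paths were added, so two NUL terminators are counted. A reader that
splits on NUL, such as the Perl loop with $/ set to "\0", gets each pathname
intact.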


* Re: [[PATCH v3] 1/2] [submodule] handle multibyte characters in name
  2013-06-14 17:23   ` Junio C Hamano
@ 2013-06-14 18:27     ` Fredrik Gustafsson
  2013-06-14 18:33       ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Fredrik Gustafsson @ 2013-06-14 18:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Fredrik Gustafsson, git, jens.lehmann

On Fri, Jun 14, 2013 at 10:23:52AM -0700, Junio C Hamano wrote:
> Fredrik Gustafsson <iveqy@iveqy.com> writes:
> 
> > ... The
> > correct approach to solve the problem for all pathnames may be to use
> > "ls-files -z" and tell the Perl script that reads its output to read NUL
> > separated records by using $/ = "\0".
> 
> I've tentatively queued the attached without 2/2; the scriptlet is
> small enough not to matter in an eventual rewrite, so it shouldn't
> make a difference either way.

Sorry, I didn't know enough Perl to understand that that was a
suggestion rather than a hint to a future developer.

Now that I see what you meant, it looks like the best solution to me.
It seems we should now be able to handle pathnames containing newlines
here. However, git submodule add doesn't handle newlines yet, so it
really doesn't matter for now.

Thanks for the help!

-- 
Kind regards
Fredrik Gustafsson

tel: 0733-608274
e-mail: iveqy@iveqy.com


* Re: [[PATCH v3] 1/2] [submodule] handle multibyte characters in name
  2013-06-14 18:27     ` Fredrik Gustafsson
@ 2013-06-14 18:33       ` Junio C Hamano
  0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2013-06-14 18:33 UTC (permalink / raw)
  To: Fredrik Gustafsson; +Cc: git, jens.lehmann

Fredrik Gustafsson <iveqy@iveqy.com> writes:

> On Fri, Jun 14, 2013 at 10:23:52AM -0700, Junio C Hamano wrote:
>> Fredrik Gustafsson <iveqy@iveqy.com> writes:
>> 
>> > ... The
>> > correct approach to solve the problem for all pathnames may be to use
>> > "ls-files -z" and tell the Perl script that reads its output to read NUL
>> > separated records by using $/ = "\0".
>> 
>> I've tentatively queued the attached without 2/2; the scriptlet is
>> small enough not to matter in an eventual rewrite, so it shouldn't
>> make a difference either way.
>
> Sorry, I didn't know enough Perl to understand that that was a
> suggestion rather than a hint to a future developer.

Heh, no need to be sorry.  It was a hint, and I just made you a
future developer ;-)
