git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Jiang Xin <worldhello.net@gmail.com>
Cc: Git List <git@vger.kernel.org>, avarab@gmail.com
Subject: Re: [PATCH v2] Maintaince script for l10n files and commits
Date: Fri, 09 Mar 2012 16:40:30 -0800	[thread overview]
Message-ID: <7vhaxx2sqp.fsf@alter.siamese.dyndns.org> (raw)
In-Reply-To: <1331273307-64598-1-git-send-email-worldhello.net@gmail.com> (Jiang Xin's message of "Fri, 9 Mar 2012 14:08:27 +0800")

Jiang Xin <worldhello.net@gmail.com> writes:

> diff --git a/po/po-helper.sh b/po/po-helper.sh
> new file mode 100755
> index 0000000..166ebb7
> --- /dev/null
> +++ b/po/po-helper.sh
> @@ -0,0 +1,387 @@
> +#!/bin/sh
> +#
> +# Copyright (c) 2012 Jiang Xin
> +
> +GIT=git
> +PODIR=$($GIT rev-parse --show-cdup)po

David already pointed out that $GIT is a bad style.

> +POTFILE=$PODIR/git.pot
> +
> +usage() {

Style:

	usage () {

I won't repeat this anymore in this message but there are others.

> +# Init or update XX.po file from git.pot
> +update_po() {
> +	if test $# -eq 0
> +	then
> +		usage "init/update needs at least one argument"
> +	fi
> +	while test $# -gt 0
> +	do
> +		locale=$(basename $1)
> +		locale=${locale%.po}

This is bad for two reasons. (1) Your $1 that directly comes from
the end user's command line may have $IFS characters; (2) You do not
have to spawn a separate process to run basename.

But if I were writing this loop, I would probably avoided refering
to "$1" and shifting at the end by starting it like so:

	for locale
        do
                locale=${locale##*/}
                locale=${locale%.po}
		...
	done

> +		po=$PODIR/$locale.po
> +		mo=$PODIR/build/locale/$locale/LC_MESSAGES/git.mo
> +		if test -f $po

It is safer to say

	if test -f "$po"

here, even though it is not needed in the current form of this
script. Later somebody may change the definition of PODIR above to
something else that may contain $IFS.

> +		then
> +			msgmerge --add-location --backup=off -U $po $POTFILE

Likewise, both for "$po" and "$POTFILE" (I won't repeat this anymore
in this message but there are others).

> +}
> +
> +notes_for_l10n_team_leader() {
> +	locale=$(basename $1)
> +	locale=${locale%.po}

Likewise. I won't repeat this anymore in this message but there are others.

> +# Check po, pot, commits
> +check() {
> +	if test $# -eq 0
> +	then
> +		ls $PODIR/*.po |
> +		while read f
> +		do
> +			echo "============================================================"
> +			echo "Check $(basename $f)..."
> +			check $f
> +		done
> +
> +		echo "============================================================"
> +		echo "Check updates of git.pot..."
> +		check pot
> +
> +		echo "============================================================"
> +		echo "Check commits..."
> +		check commits
> +	fi
> +	while test $# -gt 0
> +	do
> +		case "$1" in
> +		*.po)
> +			check_po $1
> +			;;
> +		*pot)

This is yucky.

		pot | git.pot)

would have been easier to understand what is going on.  Either we
reached here from "check pot" when this function was called without
argument, or we were fed the git.pot from the command line.

> +			check_pot
> +			;;
> +		commit|commits)
> +			shift
> +			check_commits $@
> +			break
> +			;;

Perhaps you meant to say

	check_commits "$@"

> +		*)
> +			usage "Unkown task \"$1\" for check"
> +			;;

I think we tend to do

	usage "Unknown task '$1' for check"

> +# Syntax check on XX.po, or all .po files if nothing provided
> +check_po() {
> +	while test $# -gt 0
> +	do
> +		locale=$(basename $1)
> +		locale=${locale%.po}
> +		po=$PODIR/$locale.po
> +		mo=$PODIR/build/locale/$locale/LC_MESSAGES/git.mo
> +		if test -f $po
> +		then
> +			mkdir -p $(dirname $mo)
> +			msgfmt -o $mo --check --statistics $po
> +		else
> +			echo "Error: File $po does not exist." >&2

It would make it more obvious that this error message does go to
the standard error stream if you said it like this:

	echo >&2 "Error: File '$po' does not exist."

> +		fi
> +		shift
> +	done
> +}

Again,

	for locale
        do
        	...
	done

would have made the loop easier (less risk forgetting to shift or
to shift too many).

> +# Display summary of updates of git.pot
> +check_pot() {
> +	pnew="^.*:\([0-9]*\): this message is used but not defined in.*"
> +	pdel="^.*:\([0-9]*\): warning: this message is not used.*"
> +	new_count=0
> +	del_count=0
> +	new_lineno=""
> +	del_lineno=""
> +
> +	status=$(cd $PODIR; $GIT status --porcelain -- $(basename $POTFILE))
> +	if test -z "$status"
> +	then
> +		echo "Nothing changed."
> +	else
> +		tmpfile=$(mktemp /tmp/git.po.XXXX)

(optional) perhaps set a trap to remove this here, instead of an
explicit "rm" at the end?

> +		(cd $PODIR; LANGUAGE=C $GIT show HEAD:./$(basename $POTFILE) > $tmpfile )

		... show HEAD:./${POTFILE##*/} >"$tmpfile"

Look for "Redirection operators" in Documentation/CodingGuidelines.

> +		LANGUAGE=C msgcmp -N --use-untranslated $tmpfile $POTFILE 2>&1 | {
> +			while read line
> +			do
> +				m=$(echo $line | grep "$pnew" | sed -e "s/$pnew/\1/")

Make it a habit to suspect that you are using one process too many,
whenever you see grep and sed in the same pipe.

	... | sed -ne "s/$pnew/\1/"

would be sufficient, no?

> +				if test -n "$m"
> +				then
> +					new_count=$(( new_count + 1 ))
> +					if test -z "$new_lineno"
> +					then
> +						new_lineno="$m"
> +					else
> +						new_lineno="${new_lineno}, $m"
> +					fi
> +					continue
> +				fi
> +
> +				m=$(echo $line | grep "$pdel" | sed -e "s/$pdel/\1/")
> +				if test -n "$m"
> +				then
> +					del_count=$(( del_count + 1 ))
> +					if test -z "$del_lineno"
> +					then
> +						del_lineno="$m"
> +					else
> +						del_lineno="${del_lineno}, $m"
> +					fi
> +				fi
> +			done
> +			test $new_count -gt 1 && new_plur="s" || new_plur=""
> +			test $del_count -gt 1 && del_plur="s" || del_plur=""

Isn't plural forms "0 lines, 1 line, 2 lines,..."?  We have this in
our "gettext.h" that says "use singular form only when n is 1", not
"when n is less than 2":

	#define ngettext(s, p, n) ((n == 1) ? (s) : (p))

> +verify_commit_encoding() {
> +	c=$1
> +	subject=0
> +	non_ascii=""
> +	encoding=""
> +	log=""
> +
> +	$GIT cat-file commit $c | {
> +		while read line

At least, you should read with "read -r", if you were to write this
as a shell script.

> +		do
> +			log="$log - $line"
> +			# next line would be the commit log subject line,
> +			# if no previous empty line found.
> +			if test -z "$line"
> +			then
> +				subject=$((subject + 1))

Even though POSIX allows the above, this is preferred:

	subject=$(( $subject + 1 ))

as some shells were reported to barf without $ in front of the
variable names.

> +				continue
> +			fi
> +			if test $subject -eq 0
> +			then
> +				if echo $line | grep -q "^encoding "
> +				then
> +					encoding=${line#encoding }
> +				fi

	case "$line" in
        encoding' '*)
		encoding=${...}
		;;
	esac

> +			fi
> +			# non-ascii found in commit log
> +			m=$(echo $line | sed -e "s/\([[:alnum:][:space:][:punct:]]\)//g")
> +			if test -n "$m"
> +			then
> +				non_ascii="$m >> $line <<"
> +				if test $subject -eq 1
> +				then
> +					report_nonascii_in_subject $c "$non_ascii"
> +					return
> +				fi
> +			fi
> +			# subject has only one line
> +			test $subject -eq 1 && subject=$((subject += 1))

subject=$(( $subject + 1 ))

> +			# break if there are non-asciis and has already checked subject line
> +			if test -n "$non_ascii" && test $subject -gt 0
> +			then
> +				break

Is it OK for the body to have good line followed by a bad line?

> +			fi
> +		done
> +		if test -n "$non_ascii"
> +		then
> +			if test -z "$encoding"
> +			then
> +				echo $line | iconv -f UTF-8 -t UTF-8 -s >/dev/null ||
> +					report_bad_encoding "$c" "$non_ascii"
> +			else
> +				echo $line | iconv -f $encoding -t UTF-8 -s >/dev/null ||
> +					report_bad_encoding "$c" "$non_ascii" "$encoding"
> +			fi
> +		fi
> +	}
> +}

  parent reply	other threads:[~2012-03-10  0:40 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-03-07 18:47 [PATCH] Maintaince script for l10n files and commits Jiang Xin
2012-03-07 19:17 ` Junio C Hamano
2012-03-08 16:05   ` Jiang Xin
2012-03-08 20:41     ` Junio C Hamano
2012-03-09  0:57       ` Jiang Xin
2012-03-09  6:08       ` [PATCH v2] " Jiang Xin
2012-03-09  6:20         ` David Aguilar
2012-03-09  6:31           ` Jiang Xin
2012-03-10  0:40         ` Junio C Hamano [this message]
2012-03-10  9:17       ` [PATCH v3] " Jiang Xin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7vhaxx2sqp.fsf@alter.siamese.dyndns.org \
    --to=gitster@pobox.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=worldhello.net@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).