git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v1 0/2] convert: add alias support for 'working-tree-encoding' attributes
@ 2018-07-08 18:30 larsxschneider
  2018-07-08 18:30 ` [PATCH v1 1/2] convert: refactor conversion driver config parsing larsxschneider
  2018-07-08 18:30 ` [PATCH v1 2/2] convert: add alias support for 'working-tree-encoding' attributes larsxschneider
  0 siblings, 2 replies; 8+ messages in thread
From: larsxschneider @ 2018-07-08 18:30 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Johannes.Schindelin, jehan, whee, me,
	Lars Schneider

From: Lars Schneider <larsxschneider@gmail.com>

Hi,

this series adds Git config based alias support for
'working-tree-encoding' attributes that were introduced in 107642fe26
("convert: add 'working-tree-encoding' attribute", 2018-04-15).
The feature was suggested by Steve Groeger in [1].

The first patch is a refactoring with no functional change intended.
The second patch is the actual change.

Thanks,
Lars

[1] https://public-inbox.org/git/OF5D40FE06.C18CD7CD-ON002582B9.002B7A02-002582B9.002B7A07@notes.na.collabserv.com/

Base Ref: master
Web-Diff: https://github.com/larsxschneider/git/commit/0cbee9bd8d
Checkout: git fetch https://github.com/larsxschneider/git encoding-alias-v1 && git checkout 0cbee9bd8d

Lars Schneider (2):
  convert: refactor conversion driver config parsing
  convert: add alias support for 'working-tree-encoding' attributes

 Documentation/gitattributes.txt  |  19 ++++++
 convert.c                        | 114 ++++++++++++++++++++++---------
 t/t0028-working-tree-encoding.sh |  28 ++++++++
 3 files changed, 129 insertions(+), 32 deletions(-)


base-commit: e3331758f12da22f4103eec7efe1b5304a9be5e9
--
2.18.0


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v1 1/2] convert: refactor conversion driver config parsing
  2018-07-08 18:30 [PATCH v1 0/2] convert: add alias support for 'working-tree-encoding' attributes larsxschneider
@ 2018-07-08 18:30 ` larsxschneider
  2018-07-08 18:35   ` Lars Schneider
  2018-07-10 16:25   ` Junio C Hamano
  2018-07-08 18:30 ` [PATCH v1 2/2] convert: add alias support for 'working-tree-encoding' attributes larsxschneider
  1 sibling, 2 replies; 8+ messages in thread
From: larsxschneider @ 2018-07-08 18:30 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Johannes.Schindelin, jehan, whee, me,
	Lars Schneider

From: Lars Schneider <larsxschneider@gmail.com>

Refactor conversion driver config parsing to ease the parsing of new
configs in a subsequent patch.

No functional change intended.

Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
---
 convert.c | 64 +++++++++++++++++++++++++++----------------------------
 1 file changed, 32 insertions(+), 32 deletions(-)

diff --git a/convert.c b/convert.c
index 64d0d30e08..949bc783e4 100644
--- a/convert.c
+++ b/convert.c
@@ -1003,43 +1003,43 @@ static int read_convert_config(const char *var, const char *value, void *cb)
 	int namelen;
 	struct convert_driver *drv;
 
-	/*
-	 * External conversion drivers are configured using
-	 * "filter.<name>.variable".
-	 */
-	if (parse_config_key(var, "filter", &name, &namelen, &key) < 0 || !name)
-		return 0;
-	for (drv = user_convert; drv; drv = drv->next)
-		if (!strncmp(drv->name, name, namelen) && !drv->name[namelen])
-			break;
-	if (!drv) {
-		drv = xcalloc(1, sizeof(struct convert_driver));
-		drv->name = xmemdupz(name, namelen);
-		*user_convert_tail = drv;
-		user_convert_tail = &(drv->next);
-	}
+	if (parse_config_key(var, "filter", &name, &namelen, &key) >= 0 && name) {
+		/*
+		 * External conversion drivers are configured using
+		 * "filter.<name>.variable".
+		 */
+		for (drv = user_convert; drv; drv = drv->next)
+			if (!strncmp(drv->name, name, namelen) && !drv->name[namelen])
+				break;
+		if (!drv) {
+			drv = xcalloc(1, sizeof(struct convert_driver));
+			drv->name = xmemdupz(name, namelen);
+			*user_convert_tail = drv;
+			user_convert_tail = &(drv->next);
+		}
 
-	/*
-	 * filter.<name>.smudge and filter.<name>.clean specifies
-	 * the command line:
-	 *
-	 *	command-line
-	 *
-	 * The command-line will not be interpolated in any way.
-	 */
+		/*
+		 * filter.<name>.smudge and filter.<name>.clean specifies
+		 * the command line:
+		 *
+		 *	command-line
+		 *
+		 * The command-line will not be interpolated in any way.
+		 */
 
-	if (!strcmp("smudge", key))
-		return git_config_string(&drv->smudge, var, value);
+		if (!strcmp("smudge", key))
+			return git_config_string(&drv->smudge, var, value);
 
-	if (!strcmp("clean", key))
-		return git_config_string(&drv->clean, var, value);
+		if (!strcmp("clean", key))
+			return git_config_string(&drv->clean, var, value);
 
-	if (!strcmp("process", key))
-		return git_config_string(&drv->process, var, value);
+		if (!strcmp("process", key))
+			return git_config_string(&drv->process, var, value);
 
-	if (!strcmp("required", key)) {
-		drv->required = git_config_bool(var, value);
-		return 0;
+		if (!strcmp("required", key)) {
+			drv->required = git_config_bool(var, value);
+			return 0;
+		}
 	}
 
 	return 0;
-- 
2.18.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v1 2/2] convert: add alias support for 'working-tree-encoding' attributes
  2018-07-08 18:30 [PATCH v1 0/2] convert: add alias support for 'working-tree-encoding' attributes larsxschneider
  2018-07-08 18:30 ` [PATCH v1 1/2] convert: refactor conversion driver config parsing larsxschneider
@ 2018-07-08 18:30 ` larsxschneider
  2018-07-08 18:39   ` Lars Schneider
  2018-07-10 16:31   ` Junio C Hamano
  1 sibling, 2 replies; 8+ messages in thread
From: larsxschneider @ 2018-07-08 18:30 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Johannes.Schindelin, jehan, whee, me,
	Lars Schneider

From: Lars Schneider <larsxschneider@gmail.com>

In 107642fe26 ("convert: add 'working-tree-encoding' attribute",
2018-04-15) we added an attribute which defines the working tree
encoding of a file.

Some platforms might spell the name of a certain encoding differently or
some users might want to use different encodings on different platforms.
Add the Git config "encoding.<iconv-name>.insteadOf = <alias-name>" to
support these use-cases with a user specific mapping. If the alias
matches an existing encoding name, then the alias will take precedence.
The alias is case insensitive.

Example:

	(in .gitattributes)
	*.c	working-tree-encoding=foo

	(in config)
	[encoding "UTF-16"]
		insteadOf = foo

Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
---
 Documentation/gitattributes.txt  | 19 ++++++++++++
 convert.c                        | 50 ++++++++++++++++++++++++++++++++
 t/t0028-working-tree-encoding.sh | 28 ++++++++++++++++++
 3 files changed, 97 insertions(+)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 92010b062e..3628f0e5cf 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -366,6 +366,25 @@ command to guess the encoding:
 file foo.ps1
 ------------------------
 
+The encoding in all examples above was directly defined in the Git
+attributes. In addition, it is possible to define encodings indirectly
+using aliases set via Git config:
+
+------------------------
+[encoding "UTF-16"]
+    insteadOf = my-custom-alias
+------------------------
+
+The alias name can be used in the Git attributes instead of the actual
+encoding name:
+
+------------------------
+*.ps1   text working-tree-encoding=my-custom-alias
+------------------------
+
+This mapping can be useful if equivalent encodings are spelled
+differently across platforms. It can also be useful if a user wants to
+use different encodings on different platforms for the same file.
 
 `ident`
 ^^^^^^^
diff --git a/convert.c b/convert.c
index 949bc783e4..4f19ce1a04 100644
--- a/convert.c
+++ b/convert.c
@@ -997,6 +997,15 @@ static int apply_filter(const char *path, const char *src, size_t len,
 	return 0;
 }
 
+struct alias2enc {
+	struct hashmap_entry ent; /* must be the first member! */
+	const char *alias;
+	const char *encoding;
+};
+
+static int encoding_aliases_initialized;
+static struct hashmap encoding_map;
+
 static int read_convert_config(const char *var, const char *value, void *cb)
 {
 	const char *key, *name;
@@ -1040,6 +1049,36 @@ static int read_convert_config(const char *var, const char *value, void *cb)
 			drv->required = git_config_bool(var, value);
 			return 0;
 		}
+	} else if (
+		parse_config_key(var, "encoding", &name, &namelen, &key) >= 0 &&
+		name &&	!strcmp(key, "insteadof")) {
+		/*
+		 * Encoding aliases are configured using
+		 * "encoding.<iconv-name>.insteadOf = <alias-name>".
+		 */
+		struct alias2enc *entry;
+		if (!value)
+			return config_error_nonbool(key);
+
+		if (!encoding_aliases_initialized) {
+			encoding_aliases_initialized = 1;
+			hashmap_init(&encoding_map, NULL, NULL, 0);
+			entry = NULL;
+		} else {
+			struct alias2enc hashkey;
+			hashmap_entry_init(&hashkey, strihash(value));
+			hashkey.alias = value;
+			entry = hashmap_get(&encoding_map, &hashkey, NULL);
+		}
+
+		if (!entry) {
+			entry = xmalloc(sizeof(*entry));
+			entry->encoding = xstrndup(name, namelen);
+			entry->alias = xstrdup(value);
+
+			hashmap_entry_init(entry, strihash(value));
+			hashmap_add(&encoding_map, entry);
+		}
 	}
 
 	return 0;
@@ -1225,6 +1264,17 @@ static const char *git_path_check_encoding(struct attr_check_item *check)
 		die(_("true/false are no valid working-tree-encodings"));
 	}
 
+	/* Check if an alias was defined for the encoding in the Git config */
+	if (encoding_aliases_initialized) {
+		struct alias2enc hashkey;
+		struct alias2enc *entry;
+		hashmap_entry_init(&hashkey, strihash(value));
+		hashkey.alias = value;
+		entry = hashmap_get(&encoding_map, &hashkey, NULL);
+		if (entry)
+			value = entry->encoding;
+	}
+
 	/* Don't encode to the default encoding */
 	if (same_encoding(value, default_encoding))
 		return NULL;
diff --git a/t/t0028-working-tree-encoding.sh b/t/t0028-working-tree-encoding.sh
index 12b8eb963a..d803e00cbe 100755
--- a/t/t0028-working-tree-encoding.sh
+++ b/t/t0028-working-tree-encoding.sh
@@ -242,4 +242,32 @@ test_expect_success 'check roundtrip encoding' '
 	git reset
 '
 
+test_expect_success 'encoding alias' '
+	test_when_finished "rm -f test.foo16.git test.foo8" &&
+	test_when_finished "git reset --hard HEAD" &&
+
+	test_config encoding.UTF-16.InsteadOf foo16 &&
+
+	text="hallo there!\ncan you read me with an alias?" &&
+	printf "$text" >test.foo8 &&
+	printf "$text" | iconv -f UTF-8 -t UTF-16 >test.foo16 &&
+
+	echo "*.foo16 text working-tree-encoding=fOO16" >.gitattributes &&
+	git add test.foo16 .gitattributes &&
+
+	git cat-file -p :test.foo16 >test.foo16.git &&
+	test_cmp_bin test.foo8 test.foo16.git
+'
+
+test_expect_success 'encoding alias overwrites existing encoding' '
+	test_when_finished "git reset --hard HEAD" &&
+
+	test_config encoding.CONFUSE.insteadOf UTF-16 &&
+
+	echo "*.garbage text working-tree-encoding=UTF-16" >.gitattributes &&
+	printf "garbage" >t.garbage &&
+	test_must_fail git add t.garbage 2>err.out &&
+	test_i18ngrep "failed to encode .* from CONFUSE to UTF-8" err.out
+'
+
 test_done
-- 
2.18.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v1 1/2] convert: refactor conversion driver config parsing
  2018-07-08 18:30 ` [PATCH v1 1/2] convert: refactor conversion driver config parsing larsxschneider
@ 2018-07-08 18:35   ` Lars Schneider
  2018-07-09 20:01     ` Junio C Hamano
  2018-07-10 16:25   ` Junio C Hamano
  1 sibling, 1 reply; 8+ messages in thread
From: Lars Schneider @ 2018-07-08 18:35 UTC (permalink / raw)
  To: git, gitster; +Cc: Jeff King, Johannes.Schindelin, jehan, whee, me



> On Jul 8, 2018, at 8:30 PM, larsxschneider@gmail.com wrote:
> 
> From: Lars Schneider <larsxschneider@gmail.com>
> 
> Refactor conversion driver config parsing to ease the parsing of new
> configs in a subsequent patch.
> 
> No functional change intended.
> 
> Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
> ---
> convert.c | 64 +++++++++++++++++++++++++++----------------------------
> 1 file changed, 32 insertions(+), 32 deletions(-)
> 
> diff --git a/convert.c b/convert.c
> index 64d0d30e08..949bc783e4 100644
> --- a/convert.c
> +++ b/convert.c
> @@ -1003,43 +1003,43 @@ static int read_convert_config(const char *var, const char *value, void *cb)
> 	int namelen;
> 	struct convert_driver *drv;
> 
> ...
> 
> -	/*
> -	 * filter.<name>.smudge and filter.<name>.clean specifies
> -	 * the command line:
> -	 *
> -	 *	command-line
> -	 *
> -	 * The command-line will not be interpolated in any way.
> -	 */
> +		/*
> +		 * filter.<name>.smudge and filter.<name>.clean specifies
> +		 * the command line:
> +		 *
> +		 *	command-line
> +		 *
> +		 * The command-line will not be interpolated in any way.
> +		 */

I stumbled over this comment introduced in aa4ed402c9 
("Add 'filter' attribute and external filter driver definition.", 2007-04-21).

Is the middle "command-line" intentional?

- Lars

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v1 2/2] convert: add alias support for 'working-tree-encoding' attributes
  2018-07-08 18:30 ` [PATCH v1 2/2] convert: add alias support for 'working-tree-encoding' attributes larsxschneider
@ 2018-07-08 18:39   ` Lars Schneider
  2018-07-10 16:31   ` Junio C Hamano
  1 sibling, 0 replies; 8+ messages in thread
From: Lars Schneider @ 2018-07-08 18:39 UTC (permalink / raw)
  To: git; +Cc: gitster, Jeff King, Johannes.Schindelin, jehan, whee, me



> On Jul 8, 2018, at 8:30 PM, larsxschneider@gmail.com wrote:
> 
> From: Lars Schneider <larsxschneider@gmail.com>
> 
> In 107642fe26 ("convert: add 'working-tree-encoding' attribute",
> 2018-04-15) we added an attribute which defines the working tree
> encoding of a file.
> 
> Some platforms might spell the name of a certain encoding differently or
> some users might want to use different encodings on different platforms.
> Add the Git config "encoding.<iconv-name>.insteadOf = <alias-name>" to
> support these use-cases with a user specific mapping. If the alias
> matches an existing encoding name, then the alias will take precedence.
> The alias is case insensitive.
> 
> Example:
> 
> 	(in .gitattributes)
> 	*.c	working-tree-encoding=foo
> 
> 	(in config)
> 	[encoding "UTF-16"]
> 		insteadOf = foo
> 
> Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
> ---
> Documentation/gitattributes.txt  | 19 ++++++++++++
> convert.c                        | 50 ++++++++++++++++++++++++++++++++
> t/t0028-working-tree-encoding.sh | 28 ++++++++++++++++++
> 3 files changed, 97 insertions(+)
> 
> diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
> index 92010b062e..3628f0e5cf 100644
> --- a/Documentation/gitattributes.txt
> +++ b/Documentation/gitattributes.txt
> @@ -366,6 +366,25 @@ command to guess the encoding:
> file foo.ps1
> ------------------------
> 
> ...
> 
> 	return 0;
> @@ -1225,6 +1264,17 @@ static const char *git_path_check_encoding(struct attr_check_item *check)
> 		die(_("true/false are no valid working-tree-encodings"));
> 	}
> 
> +	/* Check if an alias was defined for the encoding in the Git config */
> +	if (encoding_aliases_initialized) {
> +		struct alias2enc hashkey;
> +		struct alias2enc *entry;
> +		hashmap_entry_init(&hashkey, strihash(value));
> +		hashkey.alias = value;
> +		entry = hashmap_get(&encoding_map, &hashkey, NULL);
> +		if (entry)
> +			value = entry->encoding;

Here I reuse the char* pointer from the hashmap.
The hashmap is static and no entry is ever removed.
Is this OK or should I rather create a copy of the string?

Thanks,
Lars


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v1 1/2] convert: refactor conversion driver config parsing
  2018-07-08 18:35   ` Lars Schneider
@ 2018-07-09 20:01     ` Junio C Hamano
  0 siblings, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2018-07-09 20:01 UTC (permalink / raw)
  To: Lars Schneider; +Cc: git, Jeff King, Johannes.Schindelin, jehan, whee, me

Lars Schneider <larsxschneider@gmail.com> writes:

>> On Jul 8, 2018, at 8:30 PM, larsxschneider@gmail.com wrote:
>> 
>> From: Lars Schneider <larsxschneider@gmail.com>
>> 
>> Refactor conversion driver config parsing to ease the parsing of new
>> configs in a subsequent patch.
>> 
>> No functional change intended.
>> 
>> Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
>> ---
>> convert.c | 64 +++++++++++++++++++++++++++----------------------------
>> 1 file changed, 32 insertions(+), 32 deletions(-)
>> 
>> diff --git a/convert.c b/convert.c
>> index 64d0d30e08..949bc783e4 100644
>> --- a/convert.c
>> +++ b/convert.c
>> @@ -1003,43 +1003,43 @@ static int read_convert_config(const char *var, const char *value, void *cb)
>> 	int namelen;
>> 	struct convert_driver *drv;
>> 
>> ...
>> 
>> -	/*
>> -	 * filter.<name>.smudge and filter.<name>.clean specifies
>> -	 * the command line:
>> -	 *
>> -	 *	command-line
>> -	 *
>> -	 * The command-line will not be interpolated in any way.
>> -	 */
>> +		/*
>> +		 * filter.<name>.smudge and filter.<name>.clean specifies
>> +		 * the command line:
>> +		 *
>> +		 *	command-line
>> +		 *
>> +		 * The command-line will not be interpolated in any way.
>> +		 */
>
> I stumbled over this comment introduced in aa4ed402c9 
> ("Add 'filter' attribute and external filter driver definition.", 2007-04-21).
>
> Is the middle "command-line" intentional?

I think it was a deliberate but ineffective attempt to emphasize the
fact that the command line is used as-is, and does not get split at
SP nor goes through interpolation of placeholders using API such as
strbuf_expand().

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v1 1/2] convert: refactor conversion driver config parsing
  2018-07-08 18:30 ` [PATCH v1 1/2] convert: refactor conversion driver config parsing larsxschneider
  2018-07-08 18:35   ` Lars Schneider
@ 2018-07-10 16:25   ` Junio C Hamano
  1 sibling, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2018-07-10 16:25 UTC (permalink / raw)
  To: larsxschneider; +Cc: git, peff, Johannes.Schindelin, jehan, whee, me

larsxschneider@gmail.com writes:

> From: Lars Schneider <larsxschneider@gmail.com>
>
> Refactor conversion driver config parsing to ease the parsing of new
> configs in a subsequent patch.
>
> No functional change intended.
>
> Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
> ---

The change unfortunately makes everything indented one level deeper,
but I can see why "we look for the only thing we are interested in
and return early when given anything else" that allowed us to avoid
the deep indentation would get in the way of the second patch.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v1 2/2] convert: add alias support for 'working-tree-encoding' attributes
  2018-07-08 18:30 ` [PATCH v1 2/2] convert: add alias support for 'working-tree-encoding' attributes larsxschneider
  2018-07-08 18:39   ` Lars Schneider
@ 2018-07-10 16:31   ` Junio C Hamano
  1 sibling, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2018-07-10 16:31 UTC (permalink / raw)
  To: larsxschneider; +Cc: git, peff, Johannes.Schindelin, jehan, whee, me

larsxschneider@gmail.com writes:

> From: Lars Schneider <larsxschneider@gmail.com>
>
> In 107642fe26 ("convert: add 'working-tree-encoding' attribute",
> 2018-04-15) we added an attribute which defines the working tree
> encoding of a file.
>
> Some platforms might spell the name of a certain encoding differently or
> some users might want to use different encodings on different platforms.
> Add the Git config "encoding.<iconv-name>.insteadOf = <alias-name>" to
> support these use-cases with a user specific mapping. If the alias
> matches an existing encoding name, then the alias will take precedence.
> The alias is case insensitive.
>
> Example:
>
> 	(in .gitattributes)
> 	*.c	working-tree-encoding=foo
>
> 	(in config)
> 	[encoding "UTF-16"]
> 		insteadOf = foo
>
> Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
> ---

Hmph, when I was reading the discussion between you and Peff, I
didn't expect it end up with a change too specific to this single
attribute.  I was instead imagining that a change would come closer
to where we call iconv (perhaps we muck with in/out_encoding
parameters to iconv_open()), so that places other than the
working-tree-encoding (e.g. commit log message encoding) would start
honoring the same configuration variable in a consistent way.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-07-10 16:31 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-08 18:30 [PATCH v1 0/2] convert: add alias support for 'working-tree-encoding' attributes larsxschneider
2018-07-08 18:30 ` [PATCH v1 1/2] convert: refactor conversion driver config parsing larsxschneider
2018-07-08 18:35   ` Lars Schneider
2018-07-09 20:01     ` Junio C Hamano
2018-07-10 16:25   ` Junio C Hamano
2018-07-08 18:30 ` [PATCH v1 2/2] convert: add alias support for 'working-tree-encoding' attributes larsxschneider
2018-07-08 18:39   ` Lars Schneider
2018-07-10 16:31   ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).