git@vger.kernel.org mailing list mirror (one of many)
* [PATCH] Prevent git-config from storing section keys that are too long
@ 2012-09-07  0:47 Ben Walton
  2012-09-07  1:33 ` Junio C Hamano
  0 siblings, 1 reply; 9+ messages in thread
From: Ben Walton @ 2012-09-07  0:47 UTC (permalink / raw)
  To: gitster, git; +Cc: Ben Walton

Key names have a length limit defined by MAXNAME in config.c.  When
reading the config file, we reserve half of this limit for the section
identifier and the other half for the key name within that section.

For example, if setting a key named url.foo.insteadOf, url.foo may use
at most half of MAXNAME.

The parser will throw an error if this condition is violated.
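
A minimal sketch of the limit described above, assuming MAXNAME is 256
(the value that the later patch in this thread removes).  The helper
name section_len_ok is hypothetical and is not a function in config.c;
it only illustrates that the part of the key before the last dot has
to fit in half of MAXNAME.

```c
#include <assert.h>
#include <string.h>

#define MAXNAME 256 /* assumed value, matching the later patch in this thread */

/*
 * Hypothetical helper: returns 1 if the section identifier of `key`
 * (everything before the last '.') fits in half of MAXNAME, else 0.
 */
static int section_len_ok(const char *key)
{
	const char *last_dot;
	size_t baselen;

	assert(key != NULL);
	last_dot = strrchr(key, '.');
	if (!last_dot)
		return 0; /* a key needs both a section and a variable name */
	baselen = (size_t)(last_dot - key);
	return baselen <= MAXNAME / 2;
}
```

With these assumptions, url.foo.insteadOf passes because its section
identifier url.foo is 7 bytes long, while a key whose section part is
longer than 128 bytes is rejected.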

This patch ensures that git-config enforces the same restriction
during the creation of a section identifier so that it never writes
a configuration file that cannot be re-read later.

This patch also adds a test to t1303-wacky-config to catch any future
issues with this check.

Signed-off-by: Ben Walton <bwalton@artsci.utoronto.ca>
---

Hi All,

I happened to notice this while running the test suite in a deeply
nested directory...

The check for baselen exceeding half of MAXNAME could be done earlier
in the function but doing it late allowed the error message to be
clearer without extra hassle.

I also wonder if MAXNAME should be increased somewhat.  Section
identifiers generated from keys like:

url./some/really/long/path.insteadOf

could overrun the current limit.  It's not a common case, of course,
or this issue would have been found sooner.  Would doubling the
current limit be out of the question?

Thanks
-Ben



 config.c                |    8 ++++++++
 t/t1303-wacky-config.sh |    4 ++++
 2 files changed, 12 insertions(+)

diff --git a/config.c b/config.c
index 2b706ea..d3f4854 100644
--- a/config.c
+++ b/config.c
@@ -1276,6 +1276,14 @@ int git_config_parse_key(const char *key, char **store_key, int *baselen_)
 	}
 	(*store_key)[i] = 0;
 
+	if (baselen > MAXNAME / 2) {
+		/* ok to destroy this value now since it will be freed */
+		(*store_key)[baselen] = '\0';
+		error("section identifier for key is too long (> %d): %s",
+		      MAXNAME / 2, *store_key);
+		goto out_free_ret_1;
+	}
+
 	return 0;
 
 out_free_ret_1:
diff --git a/t/t1303-wacky-config.sh b/t/t1303-wacky-config.sh
index 46103a1..12f0850 100755
--- a/t/t1303-wacky-config.sh
+++ b/t/t1303-wacky-config.sh
@@ -47,4 +47,8 @@ test_expect_success 'do not crash on special long config line' '
 	check section.key "$LONG_VALUE"
 '
 
+test_expect_success 'do not accept long section identifiers for key names' '
+	test_must_fail git config some.REALLYlongREALLYlongREALLYlongREALLYlongREALLYlongREALLYlongREALLYlongREALLYlongREALLYlongREALLYlongREALLYlongREALLYlongREALLYlong.key value
+'
+
 test_done
-- 
1.7.9.5


* Re: [PATCH] Prevent git-config from storing section keys that are too long
  2012-09-07  0:47 [PATCH] Prevent git-config from storing section keys that are too long Ben Walton
@ 2012-09-07  1:33 ` Junio C Hamano
  2012-09-07  2:34   ` Ben Walton
  2012-09-29 10:19   ` [PATCH] Remove the hard coded length limit on variable names in config files Ben Walton
  0 siblings, 2 replies; 9+ messages in thread
From: Junio C Hamano @ 2012-09-07  1:33 UTC (permalink / raw)
  To: Ben Walton; +Cc: git

Ben Walton <bwalton@artsci.utoronto.ca> writes:

> Key names have a length limit defined by MAXNAME in config.c.  When
> reading the config file, we reserve half of this limit for the section
> identifier and the other half for the key name within that section.
>
> For example, if setting a key named url.foo.insteadOf, url.foo may use
> at most half of MAXNAME.
>
> The parser will throw an error if this condition is violated.
>
> This patch ensures that git-config enforces the same restriction
> during the creation of a section identifier so that it never writes
> a configuration file that cannot be re-read later.
>
> This patch also adds a test to t1303-wacky-config to catch any future
> issues with this check.
>
> Signed-off-by: Ben Walton <bwalton@artsci.utoronto.ca>
> ---
>
> Hi All,
>
> I happened to notice this while running the test suite in a deeply
> nested directory...
>
> The check for baselen exceeding half of MAXNAME could be done earlier
> in the function but doing it late allowed the error message to be
> clearer without extra hassle.
>
> I also wonder if MAXNAME should be increased somewhat.  Section
> identifiers generated from keys like:
>
> url./some/really/long/path.insteadOf
>
> could overrun the current limit.  It's not a common case, of course,
> or this issue would have been found sooner.  Would doubling the
> current limit be out of the question?

Is there a reason to have _any_ limitation?  It is not like we store
configuration data by allocating one file per item (in which case we
may be limited by the filesystem limit for direntry size), so if it
is not too much trouble, I think the right thing to do is to lift
the limitation altogether, possibly using strbuf instead of a
statically sized array of characters.
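
To illustrate the suggestion above, here is a minimal sketch of a
buffer that grows on demand instead of rejecting input at a fixed
size.  It is a hypothetical stand-in (struct growbuf and its helpers
are invented names), not Git's strbuf implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a growable string; not Git's struct strbuf. */
struct growbuf {
	char *buf;
	size_t len, alloc;
};

/* Append one character, doubling the allocation when it runs out. */
static void growbuf_addch(struct growbuf *gb, int c)
{
	assert(gb != NULL);
	if (gb->len + 2 > gb->alloc) { /* room for c and the trailing NUL */
		size_t n = gb->alloc ? gb->alloc * 2 : 16;
		char *p = realloc(gb->buf, n);
		if (!p)
			abort();
		gb->buf = p;
		gb->alloc = n;
	}
	gb->buf[gb->len++] = (char)c;
	gb->buf[gb->len] = '\0';
}

/* Free the storage and return the buffer to its empty state. */
static void growbuf_release(struct growbuf *gb)
{
	free(gb->buf);
	gb->buf = NULL;
	gb->len = gb->alloc = 0;
}
```

Because the append path reallocates as needed, a name longer than 256
bytes is stored rather than rejected, which is the property the
statically sized array cannot offer.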

Of course, once you write a very long entry using a newer version of
Git, the resulting configuration file may end up unusable by older
versions of Git, so a patch to implement such a change may need to be
based on an older maintenance release (say maint-1.7.9) and then merged
upwards, but otherwise I do not offhand see a compatibility downside
of such a change.


* Re: [PATCH] Prevent git-config from storing section keys that are too long
  2012-09-07  1:33 ` Junio C Hamano
@ 2012-09-07  2:34   ` Ben Walton
  2012-09-29 10:19   ` [PATCH] Remove the hard coded length limit on variable names in config files Ben Walton
  1 sibling, 0 replies; 9+ messages in thread
From: Ben Walton @ 2012-09-07  2:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Excerpts from Junio C Hamano's message of Thu Sep 06 21:33:20 -0400 2012:

Hi Junio,

> > identifiers generated from keys like:
> >
> > url./some/really/long/path.insteadOf
> >
> > could overrun the current limit.  It's not a common case, of course,
> > or this issue would have been found sooner.  Would doubling the
> > current limit be out of the question?
> 
> Is there a reason to have _any_ limitation?  It is not like we store
> configuration data by allocating one file per item (in which case we
> may be limited by the filesystem limit for direntry size), so if it
> is not too much trouble, I think the right thing to do is to lift
> the limitation altogether, possibly using strbuf instead of a
> statically sized array of characters.

I thought it made sense to impose some sort of bound here but removing
the limit wouldn't encourage the use of ridiculously long names so
lifting it entirely shouldn't hurt.

Any chosen limit would always be somewhat arbitrary.  I had considered
extending it to (PATH_MAX + x) where x would also be arbitrary as
that would allow any valid url./path/max.insteadOf type setting but
that didn't seem like a good approach.

Removing the limit is a much better choice...

> Of course, once you write a very long entry using a newer version of
> Git, the resulting configuration file may end up unusable by older
> versions of Git, so a patch to implement such a change may need to be
> based on an older maintenance release (say maint-1.7.9) and then merged
> upwards, but otherwise I do not offhand see a compatibility downside
> of such a change.

I'm ok with this approach and will put an altered patch together
shortly.  I think it's fairly unlikely, but not impossible, that
anyone creating a config file with long key names would be in a
situation where someone else couldn't read that same config file.
I'll still base the change on maint-1.7.9 as suggested though.

Thanks
-Ben
--
Ben Walton
Systems Programmer - CHASS
University of Toronto
C:416.407.5610 | W:416.978.4302


* [PATCH] Remove the hard coded length limit on variable names in config files
  2012-09-07  1:33 ` Junio C Hamano
  2012-09-07  2:34   ` Ben Walton
@ 2012-09-29 10:19   ` Ben Walton
  2012-09-30  4:05     ` Michael Haggerty
  1 sibling, 1 reply; 9+ messages in thread
From: Ben Walton @ 2012-09-29 10:19 UTC (permalink / raw)
  To: gitster, git; +Cc: Ben Walton

Previously while reading the variable names in config files, there was
a 256 character limit with at most 128 of those characters being used
by the section header portion of the variable name.  This limitation
was only enforced while reading the config files.  It was possible to
write a config file that was not subsequently readable.

Instead of enforcing this limitation for both reading and writing,
remove it entirely by changing the var member of the config_file
struct to a strbuf instead of a fixed length buffer.  Update all of
the parsing functions in config.c to use the strbuf instead of the
static buffer.  Send the buf member of the strbuf to external callback
functions to preserve the external api.

Signed-off-by: Ben Walton <bdwalton@gmail.com>
---
Hi Junio,

(Sorry that this patch took so long to submit.  I've been busy moving.)

I think this should remove the length limitations enforced while reading
configuration file variable names.

Thanks
-Ben

 config.c |   50 +++++++++++++++++++++++---------------------------
 1 file changed, 23 insertions(+), 27 deletions(-)

diff --git a/config.c b/config.c
index 40f9c6d..ee860a7 100644
--- a/config.c
+++ b/config.c
@@ -10,8 +10,6 @@
 #include "strbuf.h"
 #include "quote.h"
 
-#define MAXNAME (256)
-
 typedef struct config_file {
 	struct config_file *prev;
 	FILE *f;
@@ -19,7 +17,7 @@ typedef struct config_file {
 	int linenr;
 	int eof;
 	struct strbuf value;
-	char var[MAXNAME];
+	struct strbuf var;
 } config_file;
 
 static config_file *cf;
@@ -191,7 +189,7 @@ static inline int iskeychar(int c)
 	return isalnum(c) || c == '-';
 }
 
-static int get_value(config_fn_t fn, void *data, char *name, unsigned int len)
+static int get_value(config_fn_t fn, void *data, struct strbuf *name)
 {
 	int c;
 	char *value;
@@ -203,11 +201,9 @@ static int get_value(config_fn_t fn, void *data, char *name, unsigned int len)
 			break;
 		if (!iskeychar(c))
 			break;
-		name[len++] = tolower(c);
-		if (len >= MAXNAME)
-			return -1;
+		strbuf_addch(name, tolower(c));
 	}
-	name[len] = 0;
+
 	while (c == ' ' || c == '\t')
 		c = get_next_char();
 
@@ -219,10 +215,10 @@ static int get_value(config_fn_t fn, void *data, char *name, unsigned int len)
 		if (!value)
 			return -1;
 	}
-	return fn(name, value, data);
+	return fn(name->buf, value, data);
 }
 
-static int get_extended_base_var(char *name, int baselen, int c)
+static int get_extended_base_var(struct strbuf *name, int c)
 {
 	do {
 		if (c == '\n')
@@ -233,7 +229,7 @@ static int get_extended_base_var(char *name, int baselen, int c)
 	/* We require the format to be '[base "extension"]' */
 	if (c != '"')
 		return -1;
-	name[baselen++] = '.';
+	strbuf_addch(name, '.');
 
 	for (;;) {
 		int c = get_next_char();
@@ -246,34 +242,30 @@ static int get_extended_base_var(char *name, int baselen, int c)
 			if (c == '\n')
 				return -1;
 		}
-		name[baselen++] = c;
-		if (baselen > MAXNAME / 2)
-			return -1;
+		strbuf_addch(name, c);
 	}
 
 	/* Final ']' */
 	if (get_next_char() != ']')
 		return -1;
-	return baselen;
+	return name->len;
 }
 
-static int get_base_var(char *name)
+static int get_base_var(struct strbuf *name)
 {
-	int baselen = 0;
+	strbuf_reset(name);
 
 	for (;;) {
 		int c = get_next_char();
 		if (cf->eof)
 			return -1;
 		if (c == ']')
-			return baselen;
+			return name->len;
 		if (isspace(c))
-			return get_extended_base_var(name, baselen, c);
+			return get_extended_base_var(name, c);
 		if (!iskeychar(c) && c != '.')
 			return -1;
-		if (baselen > MAXNAME / 2)
-			return -1;
-		name[baselen++] = tolower(c);
+		strbuf_addch(name, tolower(c));
 	}
 }
 
@@ -281,7 +273,7 @@ static int git_parse_file(config_fn_t fn, void *data)
 {
 	int comment = 0;
 	int baselen = 0;
-	char *var = cf->var;
+	struct strbuf *var = &cf->var;
 
 	/* U+FEFF Byte Order Mark in UTF8 */
 	static const unsigned char *utf8_bom = (unsigned char *) "\xef\xbb\xbf";
@@ -320,14 +312,16 @@ static int git_parse_file(config_fn_t fn, void *data)
 			baselen = get_base_var(var);
 			if (baselen <= 0)
 				break;
-			var[baselen++] = '.';
-			var[baselen] = 0;
+			strbuf_addch(var, '.');
 			continue;
 		}
 		if (!isalpha(c))
 			break;
-		var[baselen] = tolower(c);
-		if (get_value(fn, data, var, baselen+1) < 0)
+		/* Truncate the var name back to the section header prior to
+		   grabbing the suffix part of the name and the value */
+		strbuf_setlen(var, baselen+1);
+		strbuf_addch(var, tolower(c));
+		if (get_value(fn, data, var) < 0)
 			break;
 	}
 	die("bad config file line %d in %s", cf->linenr, cf->name);
@@ -842,12 +836,14 @@ int git_config_from_file(config_fn_t fn, const char *filename, void *data)
 		top.linenr = 1;
 		top.eof = 0;
 		strbuf_init(&top.value, 1024);
+		strbuf_init(&top.var, 1024);
 		cf = &top;
 
 		ret = git_parse_file(fn, data);
 
 		/* pop config-file parsing state stack */
 		strbuf_release(&top.value);
+		strbuf_release(&top.var);
 		cf = top.prev;
 
 		fclose(f);
-- 
1.7.9.5


* Re: [PATCH] Remove the hard coded length limit on variable names in config files
  2012-09-29 10:19   ` [PATCH] Remove the hard coded length limit on variable names in config files Ben Walton
@ 2012-09-30  4:05     ` Michael Haggerty
  2012-09-30 18:20       ` Ben Walton
  0 siblings, 1 reply; 9+ messages in thread
From: Michael Haggerty @ 2012-09-30  4:05 UTC (permalink / raw)
  To: Ben Walton; +Cc: gitster, git

On 09/29/2012 12:19 PM, Ben Walton wrote:
> Previously while reading the variable names in config files, there was
> a 256 character limit with at most 128 of those characters being used
> by the section header portion of the variable name.  This limitation
> was only enforced while reading the config files.  It was possible to
> write a config file that was not subsequently readable.
> 
> Instead of enforcing this limitation for both reading and writing,
> remove it entirely by changing the var member of the config_file
> struct to a strbuf instead of a fixed length buffer.  Update all of
> the parsing functions in config.c to use the strbuf instead of the
> static buffer.  Send the buf member of the strbuf to external callback
> functions to preserve the external api.
> 
> Signed-off-by: Ben Walton <bdwalton@gmail.com>
> ---
> Hi Junio,
> 
> (Sorry that this patch took so long to submit.  I've been busy moving.)

The patch doesn't apply to the current master; it appears to have been
built against master 883a2a3504 (2012-02-23) or older.  It will have to
be rebased to the current master.

Nevertheless I will add a few comments below.

Overall, I like your approach of using strbuf here, as it is simpler to
use and less error-prone.  It is also nice to get rid of an arbitrary
length limit, especially since it was not consistently enforced.

> I think this should remove the length limitations enforced while reading
> configuration file variable names.
> 
> Thanks
> -Ben
> 
>  config.c |   50 +++++++++++++++++++++++---------------------------
>  1 file changed, 23 insertions(+), 27 deletions(-)
> 
> diff --git a/config.c b/config.c
> index 40f9c6d..ee860a7 100644
> --- a/config.c
> +++ b/config.c
> @@ -10,8 +10,6 @@
>  #include "strbuf.h"
>  #include "quote.h"
>  
> -#define MAXNAME (256)
> -
>  typedef struct config_file {
>  	struct config_file *prev;
>  	FILE *f;
> @@ -19,7 +17,7 @@ typedef struct config_file {
>  	int linenr;
>  	int eof;
>  	struct strbuf value;
> -	char var[MAXNAME];
> +	struct strbuf var;
>  } config_file;
>  
>  static config_file *cf;
> @@ -191,7 +189,7 @@ static inline int iskeychar(int c)
>  	return isalnum(c) || c == '-';
>  }
>  
> -static int get_value(config_fn_t fn, void *data, char *name, unsigned int len)
> +static int get_value(config_fn_t fn, void *data, struct strbuf *name)
>  {
>  	int c;
>  	char *value;
> @@ -203,11 +201,9 @@ static int get_value(config_fn_t fn, void *data, char *name, unsigned int len)
>  			break;
>  		if (!iskeychar(c))
>  			break;
> -		name[len++] = tolower(c);
> -		if (len >= MAXNAME)
> -			return -1;
> +		strbuf_addch(name, tolower(c));
>  	}
> -	name[len] = 0;
> +
>  	while (c == ' ' || c == '\t')
>  		c = get_next_char();
>  
> @@ -219,10 +215,10 @@ static int get_value(config_fn_t fn, void *data, char *name, unsigned int len)
>  		if (!value)
>  			return -1;
>  	}
> -	return fn(name, value, data);
> +	return fn(name->buf, value, data);
>  }
>  
> -static int get_extended_base_var(char *name, int baselen, int c)
> +static int get_extended_base_var(struct strbuf *name, int c)
>  {
>  	do {
>  		if (c == '\n')
> @@ -233,7 +229,7 @@ static int get_extended_base_var(char *name, int baselen, int c)
>  	/* We require the format to be '[base "extension"]' */
>  	if (c != '"')
>  		return -1;
> -	name[baselen++] = '.';
> +	strbuf_addch(name, '.');
>  
>  	for (;;) {
>  		int c = get_next_char();
> @@ -246,34 +242,30 @@ static int get_extended_base_var(char *name, int baselen, int c)
>  			if (c == '\n')
>  				return -1;
>  		}
> -		name[baselen++] = c;
> -		if (baselen > MAXNAME / 2)
> -			return -1;
> +		strbuf_addch(name, c);
>  	}
>  
>  	/* Final ']' */
>  	if (get_next_char() != ']')
>  		return -1;
> -	return baselen;
> +	return name->len;
>  }
>  
> -static int get_base_var(char *name)
> +static int get_base_var(struct strbuf *name)
>  {
> -	int baselen = 0;
> +	strbuf_reset(name);
>  
>  	for (;;) {
>  		int c = get_next_char();
>  		if (cf->eof)
>  			return -1;
>  		if (c == ']')
> -			return baselen;
> +			return name->len;
>  		if (isspace(c))
> -			return get_extended_base_var(name, baselen, c);
> +			return get_extended_base_var(name, c);
>  		if (!iskeychar(c) && c != '.')
>  			return -1;
> -		if (baselen > MAXNAME / 2)
> -			return -1;
> -		name[baselen++] = tolower(c);
> +		strbuf_addch(name, tolower(c));
>  	}
>  }
>  
> @@ -281,7 +273,7 @@ static int git_parse_file(config_fn_t fn, void *data)
>  {
>  	int comment = 0;
>  	int baselen = 0;
> -	char *var = cf->var;
> +	struct strbuf *var = &cf->var;
>  
>  	/* U+FEFF Byte Order Mark in UTF8 */
>  	static const unsigned char *utf8_bom = (unsigned char *) "\xef\xbb\xbf";
> @@ -320,14 +312,16 @@ static int git_parse_file(config_fn_t fn, void *data)
>  			baselen = get_base_var(var);
>  			if (baselen <= 0)
>  				break;
> -			var[baselen++] = '.';
> -			var[baselen] = 0;
> +			strbuf_addch(var, '.');
>  			continue;
>  		}
>  		if (!isalpha(c))
>  			break;
> -		var[baselen] = tolower(c);
> -		if (get_value(fn, data, var, baselen+1) < 0)
> +		/* Truncate the var name back to the section header prior to
> +		   grabbing the suffix part of the name and the value */
> +		strbuf_setlen(var, baselen+1);
> +		strbuf_addch(var, tolower(c));
> +		if (get_value(fn, data, var) < 0)
>  			break;
>  	}
>  	die("bad config file line %d in %s", cf->linenr, cf->name);

The preferred format for multiline comments in the git project is

    /*
     * Truncate the var name back to the section header prior to
     * grabbing the suffix part of the name and the value.
     */

It took me a while to figure out what you were doing here.  Let me
explain why.

In the old code, get_base_var() read the string into var and returned
var's length (or -1 on error).  The fact that the length of var was
first "reset" to zero is somewhat implicit in the fact that no length
parameter is being passed to get_base_var().

But in the new version, get_base_var() is passed a strbuf.  Often,
operations with strbufs append to the strbuf, and this is what I first
assumed.  It took me a while to realize that get_base_var() calls
strbuf_reset() before getting to work.  Moreover, get_base_var() still
returns the length of what it found, which is redundant with a strbuf
and therefore unexpected.  So when the return value of get_base_var() is
stored into baselen, it is not really obvious that it is the string's
length.

Therefore, I suggest

* Call strbuf_reset() directly in git_parse_file() rather than in
get_base_var()

* Change get_base_var() to return 0 on success (rather than the length
of the string) and -1 on error (including length==0, which is also an
error in this context).

* Change how git_parse_file() initializes baselen to

        if (get_base_var(var) < 0)
                break;
        strbuf_addch(var, '.');
        baselen = var->len;

Note that baselen now includes the trailing dot.  Then later, you don't
need the "+1":

        /*
         * Truncate the var name back to (section header plus '.')
         * prior to grabbing the suffix part of the name and the value
         */
        strbuf_setlen(var, baselen);
        strbuf_addch(var, tolower(c));
        if (get_value(fn, data, var) < 0)
        [...]
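
The truncate-and-append idea above can be tried in isolation with the
following sketch.  It assumes a hypothetical growable buffer (struct
name and name_set_tail are invented for illustration and are not part
of Git's strbuf API); the point is only that a baselen that already
includes the trailing dot can be reused unchanged for every key in the
section.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable name buffer; a stand-in for Git's strbuf. */
struct name {
	char *buf;
	size_t len;
};

/* Replace everything after the first `keep` bytes with `tail`. */
static void name_set_tail(struct name *n, size_t keep, const char *tail)
{
	size_t tail_len = strlen(tail);
	char *p;

	assert(keep <= n->len);
	p = realloc(n->buf, keep + tail_len + 1);
	if (!p)
		abort();
	memcpy(p + keep, tail, tail_len + 1); /* copies the NUL as well */
	n->buf = p;
	n->len = keep + tail_len;
}
```

Starting from an empty buffer, name_set_tail(&n, 0, "core.") records
the section with its dot, baselen is then n.len, and each later call
name_set_tail(&n, baselen, key) rebuilds the full variable name
without any "+1" adjustment.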

> @@ -842,12 +836,14 @@ int git_config_from_file(config_fn_t fn, const char *filename, void *data)
>  		top.linenr = 1;
>  		top.eof = 0;
>  		strbuf_init(&top.value, 1024);
> +		strbuf_init(&top.var, 1024);
>  		cf = &top;
>  
>  		ret = git_parse_file(fn, data);
>  
>  		/* pop config-file parsing state stack */
>  		strbuf_release(&top.value);
> +		strbuf_release(&top.var);
>  		cf = top.prev;
>  
>  		fclose(f);
> 

Finally, I realize that the MAXNAME constant was not exported and I
can't find the old length limits documented anywhere, but I nevertheless
worry a little bit that one of the users of the config API has a
built-in assumption that names can never be longer than 256 characters
(for example, a config_fn_t function might try to store the name into a
fixed-length buffer).  Hopefully such code would never have been written
or accepted, but...?  If you have thought about this or audited the
callers, please mention that in your commit message.
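
For illustration, a callback that would stay safe once the limit is
gone could look like the sketch below.  It assumes only the
fn(name, value, data) calling shape visible in the patch; struct
remembered and remember_name are hypothetical names, not code from
Git.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback state; not a structure from Git. */
struct remembered {
	char *name; /* heap copy, sized from the actual name length */
};

/*
 * Callback with the same shape as the fn(name, value, data) calls in
 * the patch.  It copies the name without assuming any maximum length.
 */
static int remember_name(const char *name, const char *value, void *data)
{
	struct remembered *r = data;
	size_t len;

	assert(name != NULL && r != NULL);
	(void)value; /* unused in this sketch */
	len = strlen(name);
	free(r->name);
	r->name = malloc(len + 1);
	if (!r->name)
		return -1;
	memcpy(r->name, name, len + 1);
	return 0;
}
```

A callback that instead copied the name into something like a
256-byte array with strcpy() would be the kind of caller worth
looking for in an audit.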

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/


* Re: [PATCH] Remove the hard coded length limit on variable names in config files
  2012-09-30  4:05     ` Michael Haggerty
@ 2012-09-30 18:20       ` Ben Walton
  2012-09-30 19:44         ` Ben Walton
  2012-10-01  3:16         ` Michael Haggerty
  0 siblings, 2 replies; 9+ messages in thread
From: Ben Walton @ 2012-09-30 18:20 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: gitster, git

Hi Michael,

> The patch doesn't apply to the current master; it appears to have been
> built against master 883a2a3504 (2012-02-23) or older.  It will have to
> be rebased to the current master.

Junio had asked that it be based on maint so that's what I (thought
I?) did.  I'm happy to redo it against master if that's better though.

> The preferred format for multiline comments in the git project is
>
>     /*
>      * Truncate the var name back to the section header prior to
>      * grabbing the suffix part of the name and the value.
>      */

Oops; Will fix.

> In the old code, get_base_var() read the string into var and returned
> var's length (or -1 on error).  The fact that the length of var was
> first "reset" to zero is somewhat implicit in the fact that no length
> parameter is being passed to get_base_var().
>
> But in the new version, get_base_var() is passed a strbuf.  Often,
> operations with strbufs append to the strbuf, and this is what I first
> assumed.  It took me a while to realize that get_base_var() calls
> strbuf_reset() before getting to work.  Moreover, get_base_var() still
> returns the length of what it found, which is redundant with a strbuf
> and therefore unexpected.  So when the return value of get_base_var() is
> stored into baselen, it is not really obvious that it is the string's
> length.

Ok, that's a fair criticism.  When I was creating the patch, I thought
that placing the strbuf_reset in get_base_var() seemed nicer as it
matched the baselen = 0 which effectively reset the old character
array.  Your point is well taken though and I think it makes sense to
switch things around the way you've suggested.

> Finally, I realize that the MAXNAME constant was not exported and I
> can't find the old length limits documented anywhere, but I nevertheless
> worry a little bit that one of the users of the config API has a
> built-in assumption that names can never be longer than 256 characters
> (for example, a config_fn_t function might try to store the name into a
> fixed-length buffer).  Hopefully such code would never have been written
> or accepted, but...?  If you have thought about this or audited the
> callers, please mention that in your commit message.

I did look through the code to see if anything was relying on fixed
size buffers and I didn't see anything.  I'll update the commit
message with this info too.

I'll send a modified patch shortly.

Thanks for the review!
-Ben
-- 
---------------------------------------------------------------------------------------------------------------------------
Take the risk of thinking for yourself.  Much more happiness,
truth, beauty and wisdom will come to you that way.

-Christopher Hitchens
---------------------------------------------------------------------------------------------------------------------------


* [PATCH] Remove the hard coded length limit on variable names in config files
  2012-09-30 18:20       ` Ben Walton
@ 2012-09-30 19:44         ` Ben Walton
  2012-10-01 19:33           ` Junio C Hamano
  2012-10-01  3:16         ` Michael Haggerty
  1 sibling, 1 reply; 9+ messages in thread
From: Ben Walton @ 2012-09-30 19:44 UTC (permalink / raw)
  To: hagger, gitster, git; +Cc: Ben Walton

Previously while reading the variable names in config files, there was
a 256 character limit with at most 128 of those characters being used
by the section header portion of the variable name.  This limitation
was only enforced while reading the config files.  It was possible to
write a config file that was not subsequently readable.

Instead of enforcing this limitation for both reading and writing,
remove it entirely by changing the var member of the config_file
struct to a strbuf instead of a fixed length buffer.  Update all of
the parsing functions in config.c to use the strbuf instead of the
static buffer.

The parsing functions that returned the base length of the variable
name now return simply 0 for success and -1 for failure.  The base
length information is obtained through the strbuf's len member.
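
As a rough illustration of this return convention, the sketch below
parses a simple section header from a string.  It is a hypothetical
simplification (parse_simple_section is an invented name, it reads
from a string rather than the config file, and it performs the
empty-name check itself rather than leaving it to the caller).

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical simplification of a section-header parser: reads
 * "name]" from `in` (the '[' already consumed), stores the lowercased
 * name in *out, and returns 0 on success or -1 on failure.  The length
 * is not returned; the caller reads it from the result.
 */
static int parse_simple_section(const char *in, char **out)
{
	size_t i, n;
	char *name;

	assert(in != NULL && out != NULL);
	*out = NULL;
	for (n = 0; in[n] && in[n] != ']'; n++)
		if (!isalnum((unsigned char)in[n]) && in[n] != '-' && in[n] != '.')
			return -1;
	if (in[n] != ']' || n == 0)
		return -1; /* unterminated or empty header */
	name = malloc(n + 1);
	if (!name)
		return -1;
	for (i = 0; i < n; i++)
		name[i] = (char)tolower((unsigned char)in[i]);
	name[n] = '\0';
	*out = name;
	return 0;
}
```

The caller checks only for a negative return and then takes the
length from the stored name, so the status and the length are no
longer carried by the same integer.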

We now send the buf member of the strbuf to external callback
functions to preserve the external api.  None of the external callers
rely on the old size limitation for sizing their own buffers so
removing the limit should have no externally visible effect.

Signed-off-by: Ben Walton <bdwalton@gmail.com>
---
 config.c |   59 +++++++++++++++++++++++++++++------------------------------
 1 file changed, 29 insertions(+), 30 deletions(-)

diff --git a/config.c b/config.c
index 08e47e2..24fb2d2 100644
--- a/config.c
+++ b/config.c
@@ -10,8 +10,6 @@
 #include "strbuf.h"
 #include "quote.h"
 
-#define MAXNAME (256)
-
 typedef struct config_file {
 	struct config_file *prev;
 	FILE *f;
@@ -19,7 +17,7 @@ typedef struct config_file {
 	int linenr;
 	int eof;
 	struct strbuf value;
-	char var[MAXNAME];
+	struct strbuf var;
 } config_file;
 
 static config_file *cf;
@@ -260,7 +258,7 @@ static inline int iskeychar(int c)
 	return isalnum(c) || c == '-';
 }
 
-static int get_value(config_fn_t fn, void *data, char *name, unsigned int len)
+static int get_value(config_fn_t fn, void *data, struct strbuf *name)
 {
 	int c;
 	char *value;
@@ -272,11 +270,9 @@ static int get_value(config_fn_t fn, void *data, char *name, unsigned int len)
 			break;
 		if (!iskeychar(c))
 			break;
-		name[len++] = tolower(c);
-		if (len >= MAXNAME)
-			return -1;
+		strbuf_addch(name, tolower(c));
 	}
-	name[len] = 0;
+
 	while (c == ' ' || c == '\t')
 		c = get_next_char();
 
@@ -288,10 +284,10 @@ static int get_value(config_fn_t fn, void *data, char *name, unsigned int len)
 		if (!value)
 			return -1;
 	}
-	return fn(name, value, data);
+	return fn(name->buf, value, data);
 }
 
-static int get_extended_base_var(char *name, int baselen, int c)
+static int get_extended_base_var(struct strbuf *name, int c)
 {
 	do {
 		if (c == '\n')
@@ -302,7 +298,7 @@ static int get_extended_base_var(char *name, int baselen, int c)
 	/* We require the format to be '[base "extension"]' */
 	if (c != '"')
 		return -1;
-	name[baselen++] = '.';
+	strbuf_addch(name, '.');
 
 	for (;;) {
 		int c = get_next_char();
@@ -315,37 +311,31 @@ static int get_extended_base_var(char *name, int baselen, int c)
 			if (c == '\n')
 				goto error_incomplete_line;
 		}
-		name[baselen++] = c;
-		if (baselen > MAXNAME / 2)
-			return -1;
+		strbuf_addch(name, c);
 	}
 
 	/* Final ']' */
 	if (get_next_char() != ']')
 		return -1;
-	return baselen;
+	return 0;
 error_incomplete_line:
 	cf->linenr--;
 	return -1;
 }
 
-static int get_base_var(char *name)
+static int get_base_var(struct strbuf *name)
 {
-	int baselen = 0;
-
 	for (;;) {
 		int c = get_next_char();
 		if (cf->eof)
 			return -1;
 		if (c == ']')
-			return baselen;
+			return 0;
 		if (isspace(c))
-			return get_extended_base_var(name, baselen, c);
+			return get_extended_base_var(name, c);
 		if (!iskeychar(c) && c != '.')
 			return -1;
-		if (baselen > MAXNAME / 2)
-			return -1;
-		name[baselen++] = tolower(c);
+		strbuf_addch(name, tolower(c));
 	}
 }
 
@@ -353,7 +343,7 @@ static int git_parse_file(config_fn_t fn, void *data)
 {
 	int comment = 0;
 	int baselen = 0;
-	char *var = cf->var;
+	struct strbuf *var = &cf->var;
 
 	/* U+FEFF Byte Order Mark in UTF8 */
 	static const unsigned char *utf8_bom = (unsigned char *) "\xef\xbb\xbf";
@@ -389,17 +379,24 @@ static int git_parse_file(config_fn_t fn, void *data)
 			continue;
 		}
 		if (c == '[') {
-			baselen = get_base_var(var);
-			if (baselen <= 0)
+			/* Reset prior to determining a new stem */
+			strbuf_reset(var);
+			if (get_base_var(var) < 0 || var->len < 1)
 				break;
-			var[baselen++] = '.';
-			var[baselen] = 0;
+			strbuf_addch(var, '.');
+			baselen = var->len;
 			continue;
 		}
 		if (!isalpha(c))
 			break;
-		var[baselen] = tolower(c);
-		if (get_value(fn, data, var, baselen+1) < 0)
+		/*
+		 * Truncate the var name back to the section header
+		 * stem prior to grabbing the suffix part of the name
+		 * and the value.
+		 */
+		strbuf_setlen(var, baselen);
+		strbuf_addch(var, tolower(c));
+		if (get_value(fn, data, var) < 0)
 			break;
 	}
 	die("bad config file line %d in %s", cf->linenr, cf->name);
@@ -899,12 +896,14 @@ int git_config_from_file(config_fn_t fn, const char *filename, void *data)
 		top.linenr = 1;
 		top.eof = 0;
 		strbuf_init(&top.value, 1024);
+		strbuf_init(&top.var, 1024);
 		cf = &top;
 
 		ret = git_parse_file(fn, data);
 
 		/* pop config-file parsing state stack */
 		strbuf_release(&top.value);
+		strbuf_release(&top.var);
 		cf = top.prev;
 
 		fclose(f);
-- 
1.7.9.5


* Re: [PATCH] Remove the hard coded length limit on variable names in config files
  2012-09-30 18:20       ` Ben Walton
  2012-09-30 19:44         ` Ben Walton
@ 2012-10-01  3:16         ` Michael Haggerty
  1 sibling, 0 replies; 9+ messages in thread
From: Michael Haggerty @ 2012-10-01  3:16 UTC (permalink / raw)
  To: Ben Walton; +Cc: gitster, git

On 09/30/2012 08:20 PM, Ben Walton wrote:
>> The patch doesn't apply to the current master; it appears to have been
>> built against master 883a2a3504 (2012-02-23) or older.  It will have to
>> be rebased to the current master.
> 
> Junio had asked that it be based on maint so that's what I (thought
> I?) did.  I'm happy to redo it against master if that's better though.

That explains it.  Sorry, I hadn't seen that conversation.  (In the
future, if a patch applies against something other than master, please
add a note in the cover letter or in the patch comment after the "--".)

It is OK with me to leave the patch against maint, if that is what Junio
wanted.  It would help, though, if you rebase it to the latest maint
(the conflict seems easy to fix).

Thanks,
Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/


* Re: [PATCH] Remove the hard coded length limit on variable names in config files
  2012-09-30 19:44         ` Ben Walton
@ 2012-10-01 19:33           ` Junio C Hamano
  0 siblings, 0 replies; 9+ messages in thread
From: Junio C Hamano @ 2012-10-01 19:33 UTC (permalink / raw)
  To: Ben Walton; +Cc: hagger, git

Ben Walton <bdwalton@gmail.com> writes:

> Previously while reading the variable names in config files, there was
> a 256 character limit with at most 128 of those characters being used
> by the section header portion of the variable name.  This limitation
> was only enforced while reading the config files.  It was possible to
> write a config file that was not subsequently readable.
>
> Instead of enforcing this limitation for both reading and writing,
> remove it entirely by changing the var member of the config_file
> struct to a strbuf instead of a fixed length buffer.  Update all of
> the parsing functions in config.c to use the strbuf instead of the
> static buffer.
>
> The parsing functions that returned the base length of the variable
> name now return simply 0 for success and -1 for failure.  The base
> length information is obtained through the strbuf's len member.
>
> We now send the buf member of the strbuf to external callback
> functions to preserve the external api.  None of the external callers
> rely on the old size limitation for sizing their own buffers so
> removing the limit should have no externally visible effect.
>
> Signed-off-by: Ben Walton <bdwalton@gmail.com>
> ---
>  config.c |   59 +++++++++++++++++++++++++++++------------------------------
>  1 file changed, 29 insertions(+), 30 deletions(-)

Makes sense, and I found the patch very readable.

Thanks, both.
