git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/2] git-config and large integers
@ 2013-08-20 22:39 Jeff King
  2013-08-20 22:42 ` [PATCH 1/2] config: properly range-check integer values Jeff King
                   ` (4 more replies)
  0 siblings, 5 replies; 25+ messages in thread
From: Jeff King @ 2013-08-20 22:39 UTC (permalink / raw)
  To: git

I was playing with a hook for file size limits that wanted to store the
limit in git-config. It turns out we don't do a very good job of big
integers:

  $ git config foo.size 2g
  $ git config --int foo.size
  -2147483648

Oops. After this series, we properly notice the error:

  $ git config --int foo.size
  fatal: bad config value for 'foo.size' in .git/config

and even better, provide a way to access large values:

  $ git config --ulong foo.size
  2147483648

The patches are:

  [1/2]: config: properly range-check integer values
  [2/2]: teach git-config to output large integers

-Peff

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/2] config: properly range-check integer values
  2013-08-20 22:39 [PATCH 0/2] git-config and large integers Jeff King
@ 2013-08-20 22:42 ` Jeff King
  2013-08-20 23:07   ` Jonathan Nieder
  2013-08-20 22:44 ` [PATCH 0/2] git-config and large integers Stefan Beller
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 25+ messages in thread
From: Jeff King @ 2013-08-20 22:42 UTC (permalink / raw)
  To: git

When we look at a config value as an integer using the
git_config_int function, we carefully range-check the value
we get and complain if it is out of our range. But the range
we compare to is that of a "long", which we then cast to an
"int" in the function's return value. This means that on
systems where "int" and "long" have different sizes (e.g.,
LP64 systems), we may pass the range check, but then return
nonsense by truncating the value as we cast it to an int.

We can solve this by converting git_parse_long into
git_parse_int, and range-checking the "int" range. Nobody
actually cared that we used a "long" internally, since the
result was truncated anyway, and there were no other callers
of git_parse_long.

Of the existing callers, all of them expect small-ish values
that are fine for an "int", and should not be impacted
except when nonsense is fed to them. The one arguable
exception is http.lowSpeedLimit, which is given in bytes per
second, and is fed to curl as a long. However, nobody is
likely to consider 2GB/sec as "low speed", and even if they
did, the new behavior is much better (it barfs and complains
rather than considering the value negative and silently
ignoring it).

Signed-off-by: Jeff King <peff@peff.net>
---
I added a test. It would not fail on existing 32-bit systems, but would
on existing LP64 systems. It will pass with the new code on both.
However, it will fail on ILP64 systems (because their int is large, and
can represent 3GB). I'm not sure if any such systems are in wide use
(SPARC64?), but we would want a prereq in that case, I guess. I'm
inclined to wait to see if it actually fails for anybody.

 config.c               | 12 ++++++------
 t/t1300-repo-config.sh |  5 +++++
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/config.c b/config.c
index e13a7b6..9237d4b 100644
--- a/config.c
+++ b/config.c
@@ -468,7 +468,7 @@ static int parse_unit_factor(const char *end, uintmax_t *val)
 	return 0;
 }
 
-static int git_parse_long(const char *value, long *ret)
+static int git_parse_int(const char *value, int *ret)
 {
 	if (value && *value) {
 		char *end;
@@ -484,7 +484,7 @@ static int git_parse_long(const char *value, long *ret)
 			return 0;
 		uval = abs(val);
 		uval *= factor;
-		if ((uval > maximum_signed_value_of_type(long)) ||
+		if ((uval > maximum_signed_value_of_type(int)) ||
 		    (abs(val) > uval))
 			return 0;
 		val *= factor;
@@ -526,8 +526,8 @@ int git_config_int(const char *name, const char *value)
 
 int git_config_int(const char *name, const char *value)
 {
-	long ret = 0;
-	if (!git_parse_long(value, &ret))
+	int ret;
+	if (!git_parse_int(value, &ret))
 		die_bad_config(name);
 	return ret;
 }
@@ -559,10 +559,10 @@ int git_config_maybe_bool(const char *name, const char *value)
 
 int git_config_maybe_bool(const char *name, const char *value)
 {
-	long v = git_config_maybe_bool_text(name, value);
+	int v = git_config_maybe_bool_text(name, value);
 	if (0 <= v)
 		return v;
-	if (git_parse_long(value, &v))
+	if (git_parse_int(value, &v))
 		return !!v;
 	return -1;
 }
diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
index c4a7d84..3c5ec4b 100755
--- a/t/t1300-repo-config.sh
+++ b/t/t1300-repo-config.sh
@@ -652,6 +652,11 @@ test_expect_success numbers '
 	test_cmp expect actual
 '
 
+test_expect_success 'out-of-range integers are detected' '
+	git config giga.crash 3g &&
+	test_must_fail git config --int giga.crash
+'
+
 test_expect_success 'invalid unit' '
 	git config aninvalid.unit "1auto" &&
 	echo 1auto >expect &&
-- 
1.8.4.rc2.28.g6bb5f3f

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/2] git-config and large integers
  2013-08-20 22:39 [PATCH 0/2] git-config and large integers Jeff King
  2013-08-20 22:42 ` [PATCH 1/2] config: properly range-check integer values Jeff King
@ 2013-08-20 22:44 ` Stefan Beller
  2013-08-20 22:48   ` Jeff King
  2013-08-20 22:47 ` [PATCH 2/2] teach git-config to output " Jeff King
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 25+ messages in thread
From: Stefan Beller @ 2013-08-20 22:44 UTC (permalink / raw)
  To: Jeff King; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 787 bytes --]

On 08/21/2013 12:39 AM, Jeff King wrote:
> I was playing with a hook for file size limits that wanted to store the
> limit in git-config. It turns out we don't do a very good job of big
> integers:
> 
>   $ git config foo.size 2g
>   $ git config --int foo.size
>   -2147483648
> 
> Oops. After this series, we properly notice the error:
> 
>   $ git config --int foo.size
>   fatal: bad config value for 'foo.size' in .git/config
> 
> and even better, provide a way to access large values:
> 
>   $ git config --ulong foo.size
>   2147483648
> 

int, ulong...
How large will those be, I'd guess they are machine dependent?
So int being 32 bits as usual, but not on all machines.
(Those, which don't have 32 bits, are maybe not relevant anyways?)

Stefan




[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 899 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 2/2] teach git-config to output large integers
  2013-08-20 22:39 [PATCH 0/2] git-config and large integers Jeff King
  2013-08-20 22:42 ` [PATCH 1/2] config: properly range-check integer values Jeff King
  2013-08-20 22:44 ` [PATCH 0/2] git-config and large integers Stefan Beller
@ 2013-08-20 22:47 ` Jeff King
  2013-08-20 22:57   ` Jonathan Nieder
  2013-08-20 23:06 ` [PATCH 0/2] git-config and " Junio C Hamano
  2013-09-08  8:27 ` [PATCHv2 0/5] " Jeff King
  4 siblings, 1 reply; 25+ messages in thread
From: Jeff King @ 2013-08-20 22:47 UTC (permalink / raw)
  To: git

Internally we use "unsigned long" to represent large config
values like packed window memory sizes, packfile size
limits, etc.  On 32-bit systems, this limits these config
options to 4G (and we detect and complain about overflow).
On 64-bit systems, they get the full 64-bit range.

However, there is no way for script callers of git-config to
see the numeric values (they cannot treat them as pure
strings, since they would want git-config to process "3g"
into "3221225472").

This patch teaches git-config a "--ulong" type that can
output these unsigned long values, allowing scripts to
examine the values as the internal code would.

Signed-off-by: Jeff King <peff@peff.net>
---
I kind of hate the name "--ulong". I wanted to call it "--size" or
something and abstract away the actual platform representation, and just
make it "big enough for file sizes". But internally, the git options
for dealing with such things use "unsigned long". We could change that
(we are limiting sizes to 4GB on 32-bit systems, while most modern
32-bit systems can handle >4GB files), but I don't know if anybody
actually cares, and it would be a lot of error-prone work (we have to
change the type to off_t or similar, but then try to handle all of the
fallout throughout the code where we use "unsigned long").

So rather than try for something abstract, I tried to keep it to what
git does internally, which works pretty well in practice. As with --int,
you'll get an error if you try to use too large an integer.

 Documentation/git-config.txt | 6 ++++++
 builtin/config.c             | 8 ++++++++
 t/t1300-repo-config.sh       | 7 +++++++
 3 files changed, 21 insertions(+)

diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
index 2dbe486..a8b6626 100644
--- a/Documentation/git-config.txt
+++ b/Documentation/git-config.txt
@@ -165,6 +165,12 @@ See also <<FILES>>.
 	'git config' will ensure that the output matches the format of
 	either --bool or --int, as described above.
 
+--ulong::
+	As with `--int`, 'git config' will ensure that the output is a
+	simple decimal number, and will respect value suffixes. Unlike
+	`--int`, the range of acceptable integers will be non-negative
+	and go higher (depending on your platform, but at least 2^32-1).
+
 --path::
 	'git-config' will expand leading '{tilde}' to the value of
 	'$HOME', and '{tilde}user' to the home directory for the
diff --git a/builtin/config.c b/builtin/config.c
index 4010c43..f7a46bf 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -47,6 +47,7 @@ static int respect_includes = -1;
 #define TYPE_INT (1<<1)
 #define TYPE_BOOL_OR_INT (1<<2)
 #define TYPE_PATH (1<<3)
+#define TYPE_ULONG (1<<4)
 
 static struct option builtin_config_options[] = {
 	OPT_GROUP(N_("Config file location")),
@@ -73,6 +74,7 @@ static struct option builtin_config_options[] = {
 	OPT_BIT(0, "bool", &types, N_("value is \"true\" or \"false\""), TYPE_BOOL),
 	OPT_BIT(0, "int", &types, N_("value is decimal number"), TYPE_INT),
 	OPT_BIT(0, "bool-or-int", &types, N_("value is --bool or --int"), TYPE_BOOL_OR_INT),
+	OPT_BIT(0, "ulong", &types, N_("value is large unsigned decimal number"), TYPE_ULONG),
 	OPT_BIT(0, "path", &types, N_("value is a path (file or directory name)"), TYPE_PATH),
 	OPT_GROUP(N_("Other")),
 	OPT_BOOLEAN('z', "null", &end_null, N_("terminate values with NUL byte")),
@@ -129,6 +131,8 @@ static int collect_config(const char *key_, const char *value_, void *cb)
 	}
 	if (types == TYPE_INT)
 		sprintf(value, "%d", git_config_int(key_, value_?value_:""));
+	else if (types == TYPE_ULONG)
+		sprintf(value, "%lu", git_config_ulong(key_, value_ ? value_ : ""));
 	else if (types == TYPE_BOOL)
 		vptr = git_config_bool(key_, value_) ? "true" : "false";
 	else if (types == TYPE_BOOL_OR_INT) {
@@ -268,6 +272,10 @@ static char *normalize_value(const char *key, const char *value)
 			int v = git_config_int(key, value);
 			sprintf(normalized, "%d", v);
 		}
+		else if (types == TYPE_ULONG)
+			sprintf(normalized, "%lu",
+				git_config_ulong(key, value));
+
 		else if (types == TYPE_BOOL)
 			sprintf(normalized, "%s",
 				git_config_bool(key, value) ? "true" : "false");
diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
index 3c5ec4b..f57fb27 100755
--- a/t/t1300-repo-config.sh
+++ b/t/t1300-repo-config.sh
@@ -669,6 +669,13 @@ test_expect_success 'invalid unit' '
 	test_cmp actual expect
 '
 
+test_expect_success 'large numbers via --ulong' '
+	git config big.file 3g &&
+	echo 3221225472 >expect &&
+	git config --ulong big.file >actual &&
+	test_cmp expect actual
+'
+
 cat > expect << EOF
 true
 false
-- 
1.8.4.rc2.28.g6bb5f3f

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/2] git-config and large integers
  2013-08-20 22:44 ` [PATCH 0/2] git-config and large integers Stefan Beller
@ 2013-08-20 22:48   ` Jeff King
  0 siblings, 0 replies; 25+ messages in thread
From: Jeff King @ 2013-08-20 22:48 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

On Wed, Aug 21, 2013 at 12:44:22AM +0200, Stefan Beller wrote:

> On 08/21/2013 12:39 AM, Jeff King wrote:
> > I was playing with a hook for file size limits that wanted to store the
> > limit in git-config. It turns out we don't do a very good job of big
> > integers:
> > 
> >   $ git config foo.size 2g
> >   $ git config --int foo.size
> >   -2147483648
> > 
> > Oops. After this series, we properly notice the error:
> > 
> >   $ git config --int foo.size
> >   fatal: bad config value for 'foo.size' in .git/config
> > 
> > and even better, provide a way to access large values:
> > 
> >   $ git config --ulong foo.size
> >   2147483648
> > 
> 
> int, ulong...
> How large will those be, I'd guess they are machine dependent?
> So int being 32 bits as usual, but not on all machines.
> (Those, which don't have 32 bits, are maybe not relevant anyways?)

Yes, machine dependent. See the discussion in the patches themselves.
It's kind of ugly, but it matches what git does internally (and we
properly detect range errors at runtime).

-Peff

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/2] teach git-config to output large integers
  2013-08-20 22:47 ` [PATCH 2/2] teach git-config to output " Jeff King
@ 2013-08-20 22:57   ` Jonathan Nieder
  2013-08-21  3:00     ` Jeff King
  0 siblings, 1 reply; 25+ messages in thread
From: Jonathan Nieder @ 2013-08-20 22:57 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King wrote:

> I kind of hate the name "--ulong". I wanted to call it "--size" or
> something and abstract away the actual platform representation, and just
> make it "big enough for file sizes".

Yes, something like --size would be more pleasant.

It could still use unsigned long internally.  My only worry about
--size is that it does not make it clear we are talking about file
sizes and not in-memory sizes (size_t), and I'm not too worried about
that.

[...]
> --- a/builtin/config.c
> +++ b/builtin/config.c
[...]
> @@ -268,6 +272,10 @@ static char *normalize_value(const char *key, const char *value)
>  			int v = git_config_int(key, value);
>  			sprintf(normalized, "%d", v);
>  		}
> +		else if (types == TYPE_ULONG)
> +			sprintf(normalized, "%lu",
> +				git_config_ulong(key, value));
> +
>  		else if (types == TYPE_BOOL)

Style: uncuddled "else", stray blank line.  (The former was already
there, but it still stands out.)  I think

		if (types == TYPE_INT) {
			...
		} else if (types == TYPE_ULONG) {
			...
		} else if (types == TYPE_BOOL) {
			...
		} else if (types == TYPE_BOOL_OR_INT) {
			...
		} else {
			...
		}

would be easiest to read.

Thanks for taking this on.

Sincerely,
Jonathan

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/2] git-config and large integers
  2013-08-20 22:39 [PATCH 0/2] git-config and large integers Jeff King
                   ` (2 preceding siblings ...)
  2013-08-20 22:47 ` [PATCH 2/2] teach git-config to output " Jeff King
@ 2013-08-20 23:06 ` Junio C Hamano
  2013-08-20 23:41   ` Junio C Hamano
  2013-08-21  2:34   ` Jeff King
  2013-09-08  8:27 ` [PATCHv2 0/5] " Jeff King
  4 siblings, 2 replies; 25+ messages in thread
From: Junio C Hamano @ 2013-08-20 23:06 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> I was playing with a hook for file size limits that wanted to store the
> limit in git-config. It turns out we don't do a very good job of big
> integers:
>
>   $ git config foo.size 2g
>   $ git config --int foo.size
>   -2147483648
>
> Oops. After this series, we properly notice the error:
>
>   $ git config --int foo.size
>   fatal: bad config value for 'foo.size' in .git/config
>
> and even better, provide a way to access large values:
>
>   $ git config --ulong foo.size
>   2147483648

I may be missing something, but why do we even need a new option for
the command that is known to always produce textual output?

As you said "Oops", the first example that shows a string of digits
prefixed by a minus sign for input "2g" is buggy, and I think it is
perfectly reasonable to fix it to show a stringified representation
of 2*1024*1024*1024 when asked for "--int".

What am I missing???

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] config: properly range-check integer values
  2013-08-20 22:42 ` [PATCH 1/2] config: properly range-check integer values Jeff King
@ 2013-08-20 23:07   ` Jonathan Nieder
  2013-08-21  2:55     ` Jeff King
  0 siblings, 1 reply; 25+ messages in thread
From: Jonathan Nieder @ 2013-08-20 23:07 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King wrote:

> I added a test. It would not fail on existing 32-bit systems, but would
> on existing LP64 systems. It will pass with the new code on both.
> However, it will fail on ILP64 systems (because their int is large, and
> can represent 3GB). I'm not sure if any such systems are in wide use
> (SPARC64?), but we would want a prereq in that case, I guess. I'm
> inclined to wait to see if it actually fails for anybody.

Yuck.

What will go wrong if "git config --int" starts returning numbers too
large to fit in an 'int'?  That can already happen if "git" and a
command that uses it are built for different ABIs (e.g., ILP64 git,
32-bit custom tool that calls git).

It's possible that what the test should be checking for is "either
returns a sane answer or fails" (which would pass regardless of ABI).
Something like:

	test_expect_success 'large integers do not confuse config --int' '
		git config giga.crash 3g &&
		test_might_fail git config --int giga.crash >actual &&
		echo 3221225472 >expect &&
		{
			test_cmp expect actual ||
			test_must_fail git config --int giga.crash
		}
	'

Sensible?
Jonathan

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/2] git-config and large integers
  2013-08-20 23:06 ` [PATCH 0/2] git-config and " Junio C Hamano
@ 2013-08-20 23:41   ` Junio C Hamano
  2013-08-21  2:43     ` Jeff King
  2013-08-21  2:34   ` Jeff King
  1 sibling, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2013-08-20 23:41 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> Jeff King <peff@peff.net> writes:
>
>> I was playing with a hook for file size limits that wanted to store the
>> limit in git-config. It turns out we don't do a very good job of big
>> integers:
>>
>>   $ git config foo.size 2g
>>   $ git config --int foo.size
>>   -2147483648
>>
>> Oops. After this series, we properly notice the error:
>>
>>   $ git config --int foo.size
>>   fatal: bad config value for 'foo.size' in .git/config
>>
>> and even better, provide a way to access large values:
>>
>>   $ git config --ulong foo.size
>>   2147483648
>
> I may be missing something, but why do we even need a new option for
> the command that is known to always produce textual output?
>
> As you said "Oops", the first example that shows a string of digits
> prefixed by a minus sign for input "2g" is buggy, and I think it is
> perfectly reasonable to fix it to show a stringified representation
> of 2*1024*1024*1024 when asked for "--int".
>
> What am I missing???

If this applied on the writing side, I would understand it very
much, i.e.

	$ git config --int32 foo.size 2g
        fatal: "2g" is too large to be read as "int32".

and as a complement it may make sense as a warning mechanism to also
error out when existing value does not fit on the "platform" int, so
your 

>>   $ git config --int foo.size
>>   fatal: bad config value for 'foo.size' in .git/config

might make sense (even though I'd suggest being more explicit than
"bad value" in this case---"the value specified will not fit when
used in a variable of type int on this platform").  When .git/config
is shared on two different boxes (think: NFS), the size of "int"
might be different between them, so the logic to produce such a
warning may have to explicitly check against int32_t, not platform
int and say "will not fit in 'int' on some machines".

I dunno.


	

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/2] git-config and large integers
  2013-08-20 23:06 ` [PATCH 0/2] git-config and " Junio C Hamano
  2013-08-20 23:41   ` Junio C Hamano
@ 2013-08-21  2:34   ` Jeff King
  1 sibling, 0 replies; 25+ messages in thread
From: Jeff King @ 2013-08-21  2:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, Aug 20, 2013 at 04:06:19PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > I was playing with a hook for file size limits that wanted to store the
> > limit in git-config. It turns out we don't do a very good job of big
> > integers:
> >
> >   $ git config foo.size 2g
> >   $ git config --int foo.size
> >   -2147483648
> >
> > Oops. After this series, we properly notice the error:
> >
> >   $ git config --int foo.size
> >   fatal: bad config value for 'foo.size' in .git/config
> >
> > and even better, provide a way to access large values:
> >
> >   $ git config --ulong foo.size
> >   2147483648
> 
> I may be missing something, but why do we even need a new option for
> the command that is known to always produce textual output?

We could do all math with int64_t (for example) and then print the
stringified representation. But then we would not be matching the same
checks that git is doing internally.

For example, on a 32-bit system, setting core.packedGitLimit to 4G will
produce an error for "git config --int core.packedgitlimit", as we
cannot represent the size internally. We could do the conversion in such
a way that we print the correct size, but it does not represent what
happens when "git pack-objects" is run (you get the same error).

> As you said "Oops", the first example that shows a string of digits
> prefixed by a minus sign for input "2g" is buggy, and I think it is
> perfectly reasonable to fix it to show a stringified representation
> of 2*1024*1024*1024 when asked for "--int".

The negative value is a little bit of a sidetrack. For both "git config
--int" and for internal use, we do not correctly range-check integer
values. And that's why we print the negative value, when we should say
"this is a bogus out-of-range value". The latter is what we have always
done for values outside of range, both internal and external, and it is
only that our range check was bogus (and we fed negative crap to the
code instead of complaining). That is fixed by the first patch.

And that leads to the second patch. The "--int" option provides a range
check of (typically) -2^32 to 2^32-1. But many of our values internally
use a larger range. We can either drop that range check (which means we
will let you inspect values with config that git internally will barf
on, with no clue), or we need to add another option with a different
range to retrieve those values. I chose to add another option because I
think the range check has value.

Does that explain the problem more fully?

-Peff

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/2] git-config and large integers
  2013-08-20 23:41   ` Junio C Hamano
@ 2013-08-21  2:43     ` Jeff King
  0 siblings, 0 replies; 25+ messages in thread
From: Jeff King @ 2013-08-21  2:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, Aug 20, 2013 at 04:41:30PM -0700, Junio C Hamano wrote:

> If this applied on the writing side, I would understand it very
> much, i.e.
> 
> 	$ git config --int32 foo.size 2g
>         fatal: "2g" is too large to be read as "int32".

It does, by the way. When you request a type on the writing side, we
normalize (and complain in the same way as we do when reading).

> and as a complement it may make sense as a warning mechanism to also
> error out when existing value does not fit on the "platform" int, so
> your 
> 
> >>   $ git config --int foo.size
> >>   fatal: bad config value for 'foo.size' in .git/config
> 
> might make sense (even though I'd suggest being more explicit than
> "bad value" in this case---"the value specified will not fit when
> used in a variable of type int on this platform").

Yes, the error message is terrible, and I think an extra patch on top to
improve it is worth doing. But note that I am not introducing that error
here at all. On 32-bit systems, we already correctly range-checked and
produced that error. It is only on 64-bit systems that the range check
was flat out wrong. It checked against "long"'s precision, but then cast
the result to an int, losing bits. A possibly worse example than the
negative one is:

  $ git config foo.bar 4g
  $ git config --int foo.bar
  0

Again, that is what git's internal code is seeing. And that is why
keeping the range check for git-config has value: it lets you see what
git would see internally.

> When .git/config is shared on two different boxes (think: NFS), the
> size of "int" might be different between them, so the logic to produce
> such a warning may have to explicitly check against int32_t, not
> platform int and say "will not fit in 'int' on some machines".

I don't really see the value in that. You can always write whatever you
like in the config file. The reader is responsible during parsing for
saying "Hey, I am 32-bit and I can't handle this". And we already do
that, and it works fine. So if you have an NFS-shared .git/config, and
you set "pack.deltacachesize" to "4g", a 64-bit machine will do fine
with that, and a 32-bit machine will complain. Which seems like the only
sane thing to do.

There are a few config options that use "unsigned long" that I would
argue should be "off_t" or something (for example,
core.bigFileThreshold, which cannot be more than 4G on a 32-bit machine,
simply because we can't represent the size. On the other hand, there is
probably a ton of stuff that does not work with 4G files on such a
system, because we use unsigned long all over the place inside the
code).
-Peff

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] config: properly range-check integer values
  2013-08-20 23:07   ` Jonathan Nieder
@ 2013-08-21  2:55     ` Jeff King
  0 siblings, 0 replies; 25+ messages in thread
From: Jeff King @ 2013-08-21  2:55 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git

On Tue, Aug 20, 2013 at 04:07:49PM -0700, Jonathan Nieder wrote:

> > I added a test. It would not fail on existing 32-bit systems, but would
> > on existing LP64 systems. It will pass with the new code on both.
> > However, it will fail on ILP64 systems (because their int is large, and
> > can represent 3GB). I'm not sure if any such systems are in wide use
> > (SPARC64?), but we would want a prereq in that case, I guess. I'm
> > inclined to wait to see if it actually fails for anybody.
> 
> Yuck.
> 
> What will go wrong if "git config --int" starts returning numbers too
> large to fit in an 'int'?  That can already happen if "git" and a
> command that uses it are built for different ABIs (e.g., ILP64 git,
> 32-bit custom tool that calls git).

I'm not sure I understand your question. "git config --int" cannot
return numbers that do not fit in an "int", since we use an int as an
intermediate representation type. The intent of the code is to
range-check the input and complain. But the code gets it wrong, and
sometimes truncates the value instead. So we continue to never show
anything that would not fit in an "int", but now our range-check
correctly complains rather than truncating in some cases.

If you are worried about a 32-bit command calling an ILP64 git, then
nothing is made worse by this patch. We return the same range of values
as before.

Short of adding "--int32", etc to git-config, I don't think git can be
responsible for this situation. It can only say "This does not fit in my
internal representation, and I would barf on it". If you do not tell it
that your int is smaller, then it cannot say "you would barf on it".

> It's possible that what the test should be checking for is "either
> returns a sane answer or fails" (which would pass regardless of ABI).
> Something like:
> 
> 	test_expect_success 'large integers do not confuse config --int' '
> 		git config giga.crash 3g &&
> 		test_might_fail git config --int giga.crash >actual &&
> 		echo 3221225472 >expect &&
> 		{
> 			test_cmp expect actual ||
> 			test_must_fail git config --int giga.crash
> 		}
> 	'
> 
> Sensible?

Yes, that would cover an ILP64 system, though it very much muddies the
point of the test (which is to find a value that is represented by a
long, but not an int; such a value does not exist at all on ILP64).

Which is why I wondered about avoiding it with a prerequisite.

-Peff

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/2] teach git-config to output large integers
  2013-08-20 22:57   ` Jonathan Nieder
@ 2013-08-21  3:00     ` Jeff King
  2013-08-21  4:38       ` Jonathan Nieder
  0 siblings, 1 reply; 25+ messages in thread
From: Jeff King @ 2013-08-21  3:00 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git

On Tue, Aug 20, 2013 at 03:57:45PM -0700, Jonathan Nieder wrote:

> Jeff King wrote:
> 
> > I kind of hate the name "--ulong". I wanted to call it "--size" or
> > something and abstract away the actual platform representation, and just
> > make it "big enough for file sizes".
> 
> Yes, something like --size would be more pleasant.
> 
> It could still use unsigned long internally.  My only worry about
> --size is that it does not make it clear we are talking about file
> sizes and not in-memory sizes (size_t), and I'm not too worried about
> that.

I almost sent it as "--size" with unsigned long internally. But try
writing the documentation for it. You want to say something like "it's
big enough to handle file sizes". Except that on 32-bit, it's _not_.
It's only 4G.

You really want something that uses off_t internally, so 32-bit systems
with largefile support do the sane thing. But now you have no way of
emulating the way that git parses stuff internally. You cannot say "git
config --size core.bigFileThreshold" and get the same results that git
will have internally when it looks at that file (because it uses
"unsigned long" internally").

I think there is an argument to be made that git should be using off_t
internally for such things. But it is a lot of code to change and check,
and I'm not sure that anybody even really cares that much.

> Style: uncuddled "else", stray blank line.  (The former was already
> there, but it still stands out.)  I think

Yes, I was trying to follow the existing style (but obviously the extra
line was just a typo).

> 		if (types == TYPE_INT) {
> 			...
> 		} else if (types == TYPE_ULONG) {
> 			...
> 		} else if (types == TYPE_BOOL) {
> 			...
> 		} else if (types == TYPE_BOOL_OR_INT) {
> 			...
> 		} else {
> 			...
> 		}
> 
> would be easiest to read.

But that is adding brackets for one-liner conditional bodies that do not
need it. Which is more evil?

My usual method is to do what looks the most readable to me, but I admit
I have a hard time using my intuition with the cuddled-elses, as I think
they look terrible (yes, I'm aware they are in our style guide and I am
not arguing to take them out, only that my personal sense of "looks
good" is helpless with them).

-Peff

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/2] teach git-config to output large integers
  2013-08-21  3:00     ` Jeff King
@ 2013-08-21  4:38       ` Jonathan Nieder
  2013-08-21  5:00         ` Jeff King
  0 siblings, 1 reply; 25+ messages in thread
From: Jonathan Nieder @ 2013-08-21  4:38 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King wrote:

> I almost sent it as "--size" with unsigned long internally. But try
> writing the documentation for it. You want to say something like "it's
> big enough to handle file sizes". Except that on 32-bit, it's _not_.
> It's only 4G.
>
> You really want something that uses off_t internally, so 32-bit systems
> with largefile support do the sane thing. But now you have no way of
> emulating the way that git parses stuff internally.

Let's take a step back for a moment.  What problem is this patch
solving?

>From the motivating example, I thought it was

	When reading or writing an integer config item, git sometimes
	encounters integer overflow and doesn't know how to deal with it.
	Worse, this means that some meaningful values are unrepresentable
	in config files.  Fix it in two steps:

	 1. Catch overflow, and error out instead of pretending to be
	    able to handle it.

	 2. Provide at least an option to use a wider integer type and
	    handle larger meaningful values.

	This involves a new option --size instead of making --int use
	intmax_t for the following compatibility reason: ...

For example, the compatibility reason could be that some scripts
calling "git config" were not able to handle large integers and that
we do not want to expose them to unexpectedly large values.

But that reason doesn't sound realistic to me.  So what is the actual
reason not to always use a wider range?

That is what I was trying to get at in discussing the test.  It is not
"We would like --int to reject values higher than this, but some
platforms do not allow us to", but "Either rejecting this value, or
even better, computing the right size and printing it, is an
acceptable behavior, and this test checks for those."

Hoping that clarifies,
Jonathan

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/2] teach git-config to output large integers
  2013-08-21  4:38       ` Jonathan Nieder
@ 2013-08-21  5:00         ` Jeff King
  2013-08-21  6:34           ` Jonathan Nieder
  0 siblings, 1 reply; 25+ messages in thread
From: Jeff King @ 2013-08-21  5:00 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git

On Tue, Aug 20, 2013 at 09:38:41PM -0700, Jonathan Nieder wrote:

> Jeff King wrote:
> 
> > I almost sent it as "--size" with unsigned long internally. But try
> > writing the documentation for it. You want to say something like "it's
> > big enough to handle file sizes". Except that on 32-bit, it's _not_.
> > It's only 4G.
> >
> > You really want something that uses off_t internally, so 32-bit systems
> > with largefile support do the sane thing. But now you have no way of
> > emulating the way that git parses stuff internally.
> 
> Let's take a step back for a moment.  What problem is this patch
> solving?

That you cannot currently ask git-config for a value larger than 2g.

> From the motivating example, I thought it was
> 
> 	When reading or writing an integer config item, git sometimes
> 	encounters integer overflow and doesn't know how to deal with it.
> 	Worse, this means that some meaningful values are unrepresentable
> 	in config files.  Fix it in two steps:
> 
> 	 1. Catch overflow, and error out instead of pretending to be
> 	    able to handle it.

No, this first step is not being added. It is already the case that we
error out for overflow. The first patch is catching a case where we
failed to do that properly, and instead truncated. And it has nothing to
do with git-config itself; it is only that we must use git-config to
test it from the shell. The same problem may happen internally (but
tends not to, because large things tend to use git_config_ulong instead
of git_config_int).

> 	 2. Provide at least an option to use a wider integer type and
> 	    handle larger meaningful values.
> 
> 	This involves a new option --size instead of making --int use
> 	intmax_t for the following compatibility reason: ...
> 
> For example, the compatibility reason could be that some scripts
> calling "git config" were not able to handle large integers and that
> we do not want to expose them to unexpectedly large values.
> 
> But that reason doesn't sound realistic to me.  So what is the actual
> reason not to always use a wider range?

My reason is that it does not represent the same range checks that git
is doing internally (and which vary from platform to platform). So you
might ask git-config "what is the value of pack.deltacachelimit"; I
would expect it to apply the same range checks there that would be used
by git-pack-objects.

> That is what I was trying to get at in discussing the test.  It is not
> "We would like --int to reject values higher than this, but some
> platforms do not allow us to", but "Either rejecting this value, or
> even better, computing the right size and printing it, is an
> acceptable behavior, and this test checks for those."

You are conflating the two patches, I think. The test we were discussing
is for the _first_ patch, which fixes a bug in the range check. It is
not meant to test git-config in particular, but to test that values
higher than INT_MAX and lower than LONG_MAX are properly range-checked.

Forget the second patch for a moment. I believe the first one is a bug
fix that we would want even if we do not take the second patch at all.

-Peff

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/2] teach git-config to output large integers
  2013-08-21  5:00         ` Jeff King
@ 2013-08-21  6:34           ` Jonathan Nieder
  0 siblings, 0 replies; 25+ messages in thread
From: Jonathan Nieder @ 2013-08-21  6:34 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King wrote:
> On Tue, Aug 20, 2013 at 09:38:41PM -0700, Jonathan Nieder wrote:

>> That is what I was trying to get at in discussing the test.  It is not
>> "We would like --int to reject values higher than this, but some
>> platforms do not allow us to", but "Either rejecting this value, or
>> even better, computing the right size and printing it, is an
>> acceptable behavior, and this test checks for those."
>
> You are conflating the two patches, I think. The test we were discussing
> is for the _first_ patch, which fixes a bug in the range check. It is
> not meant to test git-config in particular, but to test that values
> higher than INT_MAX and lower than LONG_MAX are properly range-checked.
>
> Forget the second patch for a moment. I believe the first one is a bug
> fix that we would want even if we do not take the second patch at all.

Sure.  I'm not conflating the patches.  What I mean is that tests are
supposed to test desirable behavior, whatever that is --- they are not
about preventing all behavior changes but only about preventing
regressions.

So talking about tests is a (perhaps overly roundabout) way to figure
out the desirable behavior.

In particular, at first glance I would think computing 3 * 2^20
instead of erroring out would be a *good* behavior, not a regression.
If that's right, it doesn't make sense to me to go to careful lengths
either to test that git continues to error out on most platforms, or
to introduce new options to ensure "git config --int" continues to
error out.

That is what I am trying to understand.  Everything about the first
patch except for the test makes sense to me, but the test doesn't.  As
you noted, we know the test won't pass on some platforms.  Why is it
something we should *want* to pass?

Jonathan

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCHv2 0/5] git-config and large integers
  2013-08-20 22:39 [PATCH 0/2] git-config and large integers Jeff King
                   ` (3 preceding siblings ...)
  2013-08-20 23:06 ` [PATCH 0/2] git-config and " Junio C Hamano
@ 2013-09-08  8:27 ` Jeff King
  2013-09-08  8:29   ` [PATCH 1/5] config: factor out integer parsing from range checks Jeff King
                     ` (5 more replies)
  4 siblings, 6 replies; 25+ messages in thread
From: Jeff King @ 2013-09-08  8:27 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jonathan Nieder

Here's a re-roll (and re-design) of my series to help with "git config"
and large integers. It takes the different approach we discussed of
simply removing (well, increasing) the range limits for "git config
--int".

I also improved the error messages for bogus config (patches 3-4).

  [1/5]: config: factor out integer parsing from range checks
  [2/5]: config: properly range-check integer values
  [3/5]: config: set errno in numeric git_parse_* functions
  [4/5]: config: make numeric parsing errors more clear
  [5/5]: git-config: always treat --int as 64-bit internally

-Peff

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/5] config: factor out integer parsing from range checks
  2013-09-08  8:27 ` [PATCHv2 0/5] " Jeff King
@ 2013-09-08  8:29   ` Jeff King
  2013-09-08  8:33   ` [PATCH 2/5] config: properly range-check integer values Jeff King
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 25+ messages in thread
From: Jeff King @ 2013-09-08  8:29 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jonathan Nieder

When we are parsing integers for config, we use an intmax_t
(or uintmax_t) internally, and then check against the size
of our result type at the end. We can parameterize the
maximum representable value, which will let us re-use the
parsing code for a variety of range checks.

Unfortunately, we cannot combine the signed and unsigned
parsing functions easily, as we have to rely on the signed
and unsigned C types internally.

Signed-off-by: Jeff King <peff@peff.net>
---
The fact that we have two nearly-identical functions next to each other
really bugs me (especially since I'm going to make identical changes to
them later in this series). But I think trying to factor them out into a
single function would end up being even less readable. I'd be happy if
somebody wants to prove me wrong.

 config.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/config.c b/config.c
index 9f9bf0c..ee7c1f8 100644
--- a/config.c
+++ b/config.c
@@ -468,7 +468,7 @@ static int parse_unit_factor(const char *end, uintmax_t *val)
 	return 0;
 }
 
-static int git_parse_long(const char *value, long *ret)
+static int git_parse_signed(const char *value, intmax_t *ret, intmax_t max)
 {
 	if (value && *value) {
 		char *end;
@@ -484,8 +484,7 @@ static int git_parse_long(const char *value, long *ret)
 			return 0;
 		uval = abs(val);
 		uval *= factor;
-		if ((uval > maximum_signed_value_of_type(long)) ||
-		    (abs(val) > uval))
+		if (uval > max || abs(val) > uval)
 			return 0;
 		val *= factor;
 		*ret = val;
@@ -494,7 +493,7 @@ static int git_parse_long(const char *value, long *ret)
 	return 0;
 }
 
-int git_parse_ulong(const char *value, unsigned long *ret)
+int git_parse_unsigned(const char *value, uintmax_t *ret, uintmax_t max)
 {
 	if (value && *value) {
 		char *end;
@@ -508,8 +507,7 @@ int git_parse_ulong(const char *value, unsigned long *ret)
 		oldval = val;
 		if (!parse_unit_factor(end, &val))
 			return 0;
-		if ((val > maximum_unsigned_value_of_type(long)) ||
-		    (oldval > val))
+		if (val > max || oldval > val)
 			return 0;
 		*ret = val;
 		return 1;
@@ -517,6 +515,24 @@ int git_parse_ulong(const char *value, unsigned long *ret)
 	return 0;
 }
 
+static int git_parse_long(const char *value, long *ret)
+{
+	intmax_t tmp;
+	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(long)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_ulong(const char *value, unsigned long *ret)
+{
+	uintmax_t tmp;
+	if (!git_parse_unsigned(value, &tmp, maximum_unsigned_value_of_type(long)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
 static void die_bad_config(const char *name)
 {
 	if (cf && cf->name)
-- 
1.8.4.2.g87d4a77

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/5] config: properly range-check integer values
  2013-09-08  8:27 ` [PATCHv2 0/5] " Jeff King
  2013-09-08  8:29   ` [PATCH 1/5] config: factor out integer parsing from range checks Jeff King
@ 2013-09-08  8:33   ` Jeff King
  2013-09-08  8:36   ` [PATCH 3/5] config: set errno in numeric git_parse_* functions Jeff King
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 25+ messages in thread
From: Jeff King @ 2013-09-08  8:33 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jonathan Nieder

When we look at a config value as an integer using the
git_config_int function, we carefully range-check the value
we get and complain if it is out of our range. But the range
we compare to is that of a "long", which we then cast to an
"int" in the function's return value. This means that on
systems where "int" and "long" have different sizes (e.g.,
LP64 systems), we may pass the range check, but then return
nonsense by truncating the value as we cast it to an int.

We can solve this by converting git_parse_long into
git_parse_int, and range-checking the "int" range. Nobody
actually cared that we used a "long" internally, since the
result was truncated anyway. And the only other caller of
git_parse_long is git_config_maybe_bool, which should be
fine to just use int (though we will now forbid out-of-range
nonsense like setting "merge.ff" to "10g" to mean "true",
which is probably a good thing).

Signed-off-by: Jeff King <peff@peff.net>
---

I tested this with:

	diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
	index b1a6365..5444952 100755
	--- a/t/t6500-gc.sh
	+++ b/t/t6500-gc.sh
	@@ -25,4 +25,8 @@ test_expect_success 'gc -h with invalid configuration' '
	 	test_i18ngrep "[Uu]sage" broken/usage
	 '
	 
	+test_expect_success 'gc complains about out-of-range integers' '
	+	test_must_fail git -c gc.auto=3g gc --auto
	+'
	+
	 test_done

but that test is horrible for a few reasons:

  1. It depends on knowing that gc.auto uses git_config_int internally.

  2. It passes without the patch on 32-bit platforms.

  3. It won't pass with the patch on ILP64 systems.

We can't use "git config --int" to test it, because it's going to stop
using git_config_int later in the series. So I think we should just go
without a test.

 config.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/config.c b/config.c
index ee7c1f8..0a65ac2 100644
--- a/config.c
+++ b/config.c
@@ -515,10 +515,10 @@ static int git_parse_long(const char *value, long *ret)
 	return 0;
 }
 
-static int git_parse_long(const char *value, long *ret)
+static int git_parse_int(const char *value, int *ret)
 {
 	intmax_t tmp;
-	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(long)))
+	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int)))
 		return 0;
 	*ret = tmp;
 	return 1;
@@ -542,8 +542,8 @@ int git_config_int(const char *name, const char *value)
 
 int git_config_int(const char *name, const char *value)
 {
-	long ret = 0;
-	if (!git_parse_long(value, &ret))
+	int ret;
+	if (!git_parse_int(value, &ret))
 		die_bad_config(name);
 	return ret;
 }
@@ -575,10 +575,10 @@ int git_config_maybe_bool(const char *name, const char *value)
 
 int git_config_maybe_bool(const char *name, const char *value)
 {
-	long v = git_config_maybe_bool_text(name, value);
+	int v = git_config_maybe_bool_text(name, value);
 	if (0 <= v)
 		return v;
-	if (git_parse_long(value, &v))
+	if (git_parse_int(value, &v))
 		return !!v;
 	return -1;
 }
-- 
1.8.4.2.g87d4a77

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 3/5] config: set errno in numeric git_parse_* functions
  2013-09-08  8:27 ` [PATCHv2 0/5] " Jeff King
  2013-09-08  8:29   ` [PATCH 1/5] config: factor out integer parsing from range checks Jeff King
  2013-09-08  8:33   ` [PATCH 2/5] config: properly range-check integer values Jeff King
@ 2013-09-08  8:36   ` Jeff King
  2013-09-09  0:36     ` Eric Sunshine
  2013-09-08  8:38   ` [PATCH 4/5] config: make numeric parsing errors more clear Jeff King
                     ` (2 subsequent siblings)
  5 siblings, 1 reply; 25+ messages in thread
From: Jeff King @ 2013-09-08  8:36 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jonathan Nieder

When we are parsing an integer or unsigned long, we use
the strto*max functions, which properly set errno to ERANGE
if we get a large value. However, we also do further range
checks after applying our multiplication factor, but do not
set ERANGE. This means that a caller cannot tell if an error
was caused by ERANGE or if the input was simply not a valid
number.

This patch teaches git_parse_signed and git_parse_unsigned
reliably set ERANGE for range errors, and EINVAL for other
errors.

Signed-off-by: Jeff King <peff@peff.net>
---
I'm a little iffy on using errno like this. The "normal" way in git
would be to return an enum like:

  enum config_parse_error {
    CONFIG_PARSE_SUCCESS = 0,
    CONFIG_PARSE_RANGE = -1,
    CONFIG_PARSE_INVALID_UNIT = -2
  };

But that would be changing the return semantics of git_parse_ulong
(which currently is "0" for fail, "1" for success) without changing its
signature. I figured this was the path of least resistance since we
already get ERANGE out of strtoimax, but I'm happy to switch it, too.

 config.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/config.c b/config.c
index 0a65ac2..7332b06 100644
--- a/config.c
+++ b/config.c
@@ -480,16 +480,21 @@ static int git_parse_signed(const char *value, intmax_t *ret, intmax_t max)
 		val = strtoimax(value, &end, 0);
 		if (errno == ERANGE)
 			return 0;
-		if (!parse_unit_factor(end, &factor))
+		if (!parse_unit_factor(end, &factor)) {
+			errno = EINVAL;
 			return 0;
+		}
 		uval = abs(val);
 		uval *= factor;
-		if (uval > max || abs(val) > uval)
+		if (uval > max || abs(val) > uval) {
+			errno = ERANGE;
 			return 0;
+		}
 		val *= factor;
 		*ret = val;
 		return 1;
 	}
+	errno = EINVAL;
 	return 0;
 }
 
@@ -505,13 +510,18 @@ int git_parse_unsigned(const char *value, uintmax_t *ret, uintmax_t max)
 		if (errno == ERANGE)
 			return 0;
 		oldval = val;
-		if (!parse_unit_factor(end, &val))
+		if (!parse_unit_factor(end, &val)) {
+			errno = EINVAL;
 			return 0;
-		if (val > max || oldval > val)
+		}
+		if (val > max || oldval > val) {
+			errno = ERANGE;
 			return 0;
+		}
 		*ret = val;
 		return 1;
 	}
+	errno = EINVAL;
 	return 0;
 }
 
-- 
1.8.4.2.g87d4a77

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 4/5] config: make numeric parsing errors more clear
  2013-09-08  8:27 ` [PATCHv2 0/5] " Jeff King
                     ` (2 preceding siblings ...)
  2013-09-08  8:36   ` [PATCH 3/5] config: set errno in numeric git_parse_* functions Jeff King
@ 2013-09-08  8:38   ` Jeff King
  2013-09-08  8:40   ` [PATCH 5/5] git-config: always treat --int as 64-bit internally Jeff King
  2013-09-09 18:58   ` [PATCHv2 0/5] git-config and large integers Junio C Hamano
  5 siblings, 0 replies; 25+ messages in thread
From: Jeff King @ 2013-09-08  8:38 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jonathan Nieder

If we try to parse an integer config argument and get a
number outside of the representable range, we die with the
cryptic message: "bad config value for '%s'".

We can improve two things:

  1. Show the value that produced the error (e.g., bad
     config value '3g' for 'foo.bar').

  2. Mention the reason the value was rejected (e.g.,
     "invalid unit" versus "out of range").

A few tests need to be updated with the new output, but that
should not be representative of real-world breakage, as
scripts should not be depending on the exact text of our
stderr output, which is subject to i18n anyway.

Signed-off-by: Jeff King <peff@peff.net>
---
I was tempted to clean up the tortured string construction in
die_bad_number(), but my understanding is that translators prefer to see
the string in as whole a piece as we can get it, rather than building it
piecemeal from "if"s. And though this isn't marked for translation, it
probably should be.

 config.c                | 17 ++++++++++++-----
 t/t1300-repo-config.sh  |  6 +++---
 t/t4055-diff-context.sh |  2 +-
 3 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/config.c b/config.c
index 7332b06..87f87ba 100644
--- a/config.c
+++ b/config.c
@@ -543,18 +543,25 @@ int git_config_int(const char *name, const char *value)
 	return 1;
 }
 
-static void die_bad_config(const char *name)
+static void die_bad_number(const char *name, const char *value)
 {
+	const char *reason = errno == ERANGE ?
+			     "out of range" :
+			     "invalid unit";
+	if (!value)
+		value = "";
+
 	if (cf && cf->name)
-		die("bad config value for '%s' in %s", name, cf->name);
-	die("bad config value for '%s'", name);
+		die("bad numeric config value '%s' for '%s' in %s: %s",
+		    value, name, cf->name, reason);
+	die("bad numeric config value '%s' for '%s': %s", value, name, reason);
 }
 
 int git_config_int(const char *name, const char *value)
 {
 	int ret;
 	if (!git_parse_int(value, &ret))
-		die_bad_config(name);
+		die_bad_number(name, value);
 	return ret;
 }
 
@@ -562,7 +569,7 @@ unsigned long git_config_ulong(const char *name, const char *value)
 {
 	unsigned long ret;
 	if (!git_parse_ulong(value, &ret))
-		die_bad_config(name);
+		die_bad_number(name, value);
 	return ret;
 }
 
diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
index c4a7d84..20aee6e 100755
--- a/t/t1300-repo-config.sh
+++ b/t/t1300-repo-config.sh
@@ -657,11 +657,11 @@ test_expect_success 'invalid unit' '
 	echo 1auto >expect &&
 	git config aninvalid.unit >actual &&
 	test_cmp expect actual &&
-	cat > expect <<-\EOF
-	fatal: bad config value for '\''aninvalid.unit'\'' in .git/config
+	cat >expect <<-\EOF
+	fatal: bad numeric config value '\''1auto'\'' for '\''aninvalid.unit'\'' in .git/config: invalid unit
 	EOF
 	test_must_fail git config --int --get aninvalid.unit 2>actual &&
-	test_cmp actual expect
+	test_i18ncmp expect actual
 '
 
 cat > expect << EOF
diff --git a/t/t4055-diff-context.sh b/t/t4055-diff-context.sh
index 97172b4..cd04543 100755
--- a/t/t4055-diff-context.sh
+++ b/t/t4055-diff-context.sh
@@ -73,7 +73,7 @@ test_expect_success 'non-integer config parsing' '
 test_expect_success 'non-integer config parsing' '
 	git config diff.context no &&
 	test_must_fail git diff 2>output &&
-	test_i18ngrep "bad config value" output
+	test_i18ngrep "bad numeric config value" output
 '
 
 test_expect_success 'negative integer config parsing' '
-- 
1.8.4.2.g87d4a77

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 5/5] git-config: always treat --int as 64-bit internally
  2013-09-08  8:27 ` [PATCHv2 0/5] " Jeff King
                     ` (3 preceding siblings ...)
  2013-09-08  8:38   ` [PATCH 4/5] config: make numeric parsing errors more clear Jeff King
@ 2013-09-08  8:40   ` Jeff King
  2013-09-09 18:58   ` [PATCHv2 0/5] git-config and large integers Junio C Hamano
  5 siblings, 0 replies; 25+ messages in thread
From: Jeff King @ 2013-09-08  8:40 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jonathan Nieder

When you run "git config --int", the maximum size of integer
you get depends on how git was compiled, and what it
considers to be an "int".

This is almost useful, because your scripts calling "git
config" will behave similarly to git internally. But relying
on this is dubious; you have to actually know how git treats
each value internally (e.g., int versus unsigned long),
which is not documented and is subject to change. And even
if you know it is "unsigned long", we do not have a
git-config option to match that behavior.

Furthermore, you may simply be asking git to store a value
on your behalf (e.g., configuration for a hook). In that
case, the relevant range check has nothing at all to do with
git, but rather with whatever scripting tools you are using
(and git has no way of knowing what the appropriate range is
there).

Not only is the range check useless, but it is actively
harmful, as there is no way at all for scripts to look
at config variables with large values. For instance, one
cannot reliably get the value of pack.packSizeLimit via
git-config. On an LP64 system, git happily uses a 64-bit
"unsigned long" internally to represent the value, but the
script cannot read any value over 2G.

Ideally, the "--int" option would simply represent an
arbitrarily large integer. For practical purposes, however,
a 64-bit integer is large enough, and is much easier to
implement (and if somebody overflows it, we will still
notice the problem, and not simply return garbage).

Signed-off-by: Jeff King <peff@peff.net>
---
I kind of picked 64-bit as "big enough". But I suppose we could also go
with intmax_t. The only thing we can't do is an unsigned value.

Also, this is the first use of PRId64. I notice we have compatibility
macros for PRIu*, but I'm not sure what to put in one for PRId64.

 builtin/config.c       |  7 ++++---
 cache.h                |  1 +
 config.c               | 17 +++++++++++++++++
 t/t1300-repo-config.sh |  7 +++++++
 4 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/builtin/config.c b/builtin/config.c
index 4010c43..8b182d2 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -128,7 +128,8 @@ static int collect_config(const char *key_, const char *value_, void *cb)
 		must_print_delim = 1;
 	}
 	if (types == TYPE_INT)
-		sprintf(value, "%d", git_config_int(key_, value_?value_:""));
+		sprintf(value, "%"PRId64,
+			git_config_int64(key_, value_ ? value_ : ""));
 	else if (types == TYPE_BOOL)
 		vptr = git_config_bool(key_, value_) ? "true" : "false";
 	else if (types == TYPE_BOOL_OR_INT) {
@@ -265,8 +266,8 @@ static char *normalize_value(const char *key, const char *value)
 	else {
 		normalized = xmalloc(64);
 		if (types == TYPE_INT) {
-			int v = git_config_int(key, value);
-			sprintf(normalized, "%d", v);
+			int64_t v = git_config_int64(key, value);
+			sprintf(normalized, "%"PRId64, v);
 		}
 		else if (types == TYPE_BOOL)
 			sprintf(normalized, "%s",
diff --git a/cache.h b/cache.h
index 85b544f..ac4525a 100644
--- a/cache.h
+++ b/cache.h
@@ -1190,6 +1190,7 @@ extern int git_config_int(const char *, const char *);
 extern int git_config_early(config_fn_t fn, void *, const char *repo_config);
 extern int git_parse_ulong(const char *, unsigned long *);
 extern int git_config_int(const char *, const char *);
+extern int64_t git_config_int64(const char *, const char *);
 extern unsigned long git_config_ulong(const char *, const char *);
 extern int git_config_bool_or_int(const char *, const char *, int *);
 extern int git_config_bool(const char *, const char *);
diff --git a/config.c b/config.c
index 87f87ba..6588cf5 100644
--- a/config.c
+++ b/config.c
@@ -534,6 +534,15 @@ static int git_parse_int(const char *value, int *ret)
 	return 1;
 }
 
+static int git_parse_int64(const char *value, int64_t *ret)
+{
+	intmax_t tmp;
+	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int64_t)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
 int git_parse_ulong(const char *value, unsigned long *ret)
 {
 	uintmax_t tmp;
@@ -565,6 +574,14 @@ int git_config_int(const char *name, const char *value)
 	return ret;
 }
 
+int64_t git_config_int64(const char *name, const char *value)
+{
+	int64_t ret;
+	if (!git_parse_int64(value, &ret))
+		die_bad_number(name, value);
+	return ret;
+}
+
 unsigned long git_config_ulong(const char *name, const char *value)
 {
 	unsigned long ret;
diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
index 20aee6e..b66c632 100755
--- a/t/t1300-repo-config.sh
+++ b/t/t1300-repo-config.sh
@@ -652,6 +652,13 @@ test_expect_success numbers '
 	test_cmp expect actual
 '
 
+test_expect_success '--int is at least 64 bits' '
+	git config giga.watts 121g &&
+	echo 129922760704 >expect &&
+	git config --int --get giga.watts >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'invalid unit' '
 	git config aninvalid.unit "1auto" &&
 	echo 1auto >expect &&
-- 
1.8.4.2.g87d4a77

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] config: set errno in numeric git_parse_* functions
  2013-09-08  8:36   ` [PATCH 3/5] config: set errno in numeric git_parse_* functions Jeff King
@ 2013-09-09  0:36     ` Eric Sunshine
  2013-09-09 19:53       ` Jeff King
  0 siblings, 1 reply; 25+ messages in thread
From: Eric Sunshine @ 2013-09-09  0:36 UTC (permalink / raw)
  To: Jeff King; +Cc: Git List, Junio C Hamano, Jonathan Nieder

On Sun, Sep 8, 2013 at 4:36 AM, Jeff King <peff@peff.net> wrote:
> When we are parsing an integer or unsigned long, we use
> the strto*max functions, which properly set errno to ERANGE
> if we get a large value. However, we also do further range
> checks after applying our multiplication factor, but do not
> set ERANGE. This means that a caller cannot tell if an error
> was caused by ERANGE or if the input was simply not a valid
> number.
>
> This patch teaches git_parse_signed and git_parse_unsigned
> reliably set ERANGE for range errors, and EINVAL for other

Missing "to": s/reliably/to reliably/

Or, if you don't like splitting the infinitive:

s/reliably set ERANGE/to set ERANGE reliably/

> errors.
>
> Signed-off-by: Jeff King <peff@peff.net>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCHv2 0/5] git-config and large integers
  2013-09-08  8:27 ` [PATCHv2 0/5] " Jeff King
                     ` (4 preceding siblings ...)
  2013-09-08  8:40   ` [PATCH 5/5] git-config: always treat --int as 64-bit internally Jeff King
@ 2013-09-09 18:58   ` Junio C Hamano
  5 siblings, 0 replies; 25+ messages in thread
From: Junio C Hamano @ 2013-09-09 18:58 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Jonathan Nieder

Yikes. The updated series looks very sensible, but I already have
the previous one in 'next'.

Will do the usual "revert and requeue" dance...

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] config: set errno in numeric git_parse_* functions
  2013-09-09  0:36     ` Eric Sunshine
@ 2013-09-09 19:53       ` Jeff King
  0 siblings, 0 replies; 25+ messages in thread
From: Jeff King @ 2013-09-09 19:53 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, Junio C Hamano, Jonathan Nieder

On Sun, Sep 08, 2013 at 08:36:35PM -0400, Eric Sunshine wrote:

> On Sun, Sep 8, 2013 at 4:36 AM, Jeff King <peff@peff.net> wrote:
> > When we are parsing an integer or unsigned long, we use
> > the strto*max functions, which properly set errno to ERANGE
> > if we get a large value. However, we also do further range
> > checks after applying our multiplication factor, but do not
> > set ERANGE. This means that a caller cannot tell if an error
> > was caused by ERANGE or if the input was simply not a valid
> > number.
> >
> > This patch teaches git_parse_signed and git_parse_unsigned
> > reliably set ERANGE for range errors, and EINVAL for other
> 
> Missing "to": s/reliably/to reliably/
> 
> Or, if you don't like splitting the infinitive:
> 
> s/reliably set ERANGE/to set ERANGE reliably/

Thanks. That is what I get for last-minute tweaking of the commit
message (and I am OK with splitting the infinitive :) ).

-Peff

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2013-09-09 19:53 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-08-20 22:39 [PATCH 0/2] git-config and large integers Jeff King
2013-08-20 22:42 ` [PATCH 1/2] config: properly range-check integer values Jeff King
2013-08-20 23:07   ` Jonathan Nieder
2013-08-21  2:55     ` Jeff King
2013-08-20 22:44 ` [PATCH 0/2] git-config and large integers Stefan Beller
2013-08-20 22:48   ` Jeff King
2013-08-20 22:47 ` [PATCH 2/2] teach git-config to output " Jeff King
2013-08-20 22:57   ` Jonathan Nieder
2013-08-21  3:00     ` Jeff King
2013-08-21  4:38       ` Jonathan Nieder
2013-08-21  5:00         ` Jeff King
2013-08-21  6:34           ` Jonathan Nieder
2013-08-20 23:06 ` [PATCH 0/2] git-config and " Junio C Hamano
2013-08-20 23:41   ` Junio C Hamano
2013-08-21  2:43     ` Jeff King
2013-08-21  2:34   ` Jeff King
2013-09-08  8:27 ` [PATCHv2 0/5] " Jeff King
2013-09-08  8:29   ` [PATCH 1/5] config: factor out integer parsing from range checks Jeff King
2013-09-08  8:33   ` [PATCH 2/5] config: properly range-check integer values Jeff King
2013-09-08  8:36   ` [PATCH 3/5] config: set errno in numeric git_parse_* functions Jeff King
2013-09-09  0:36     ` Eric Sunshine
2013-09-09 19:53       ` Jeff King
2013-09-08  8:38   ` [PATCH 4/5] config: make numeric parsing errors more clear Jeff King
2013-09-08  8:40   ` [PATCH 5/5] git-config: always treat --int as 64-bit internally Jeff King
2013-09-09 18:58   ` [PATCHv2 0/5] git-config and large integers Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).