git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] sequencer: clarify intention to break out of loop
@ 2018-10-28 15:31 Martin Ågren
  2018-10-28 19:01 ` Eric Sunshine
  0 siblings, 1 reply; 8+ messages in thread
From: Martin Ågren @ 2018-10-28 15:31 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin

When we find a space, we set `len = i`, which gives us the answer we are
looking for, but which also breaks out of the loop through these steps:

  1. `len = i`

  2. `i = i + 1`

  3. Is `i < len`? No, so break out.

Since `i` is signed, step 2 is undefined if `i` has the value `INT_MAX`.
It can't actually have that value, but that doesn't stop my copy of gcc
7.3.0 from throwing the following:

> sequencer.c:2853:3: error: assuming signed overflow does not occur when
> assuming that (X + c) < X is always false [-Werror=strict-overflow]
>    for (i = 0; i < len; i++)
>    ^~~

That is, the compiler has realized that the code is essentially
evaluating "(len + 1) < len" and that for `len = INT_MAX`, this is
undefined behavior. What it hasn't figured out is that if `i` and `len`
are both `INT_MAX` after step 1, then `len` must have had a value larger
than `INT_MAX` before that step, which it can't have had.

Let's be explicit about breaking out of the loop. This helps the
compiler grok our intention. As a bonus, it might make it (even) more
obvious to human readers that the loop stops at the first space.

While at it, reduce the scope of `i`.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 sequencer.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 0c164d5f98..a351638ad9 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2829,7 +2829,7 @@ static int do_reset(const char *name, int len, struct replay_opts *opts)
 	struct tree_desc desc;
 	struct tree *tree;
 	struct unpack_trees_options unpack_tree_opts;
-	int ret = 0, i;
+	int ret = 0;
 
 	if (hold_locked_index(&lock, LOCK_REPORT_ON_ERROR) < 0)
 		return -1;
@@ -2849,10 +2849,14 @@ static int do_reset(const char *name, int len, struct replay_opts *opts)
 		}
 		oidcpy(&oid, &opts->squash_onto);
 	} else {
+		int i;
 		/* Determine the length of the label */
-		for (i = 0; i < len; i++)
-			if (isspace(name[i]))
+		for (i = 0; i < len; i++) {
+			if (isspace(name[i])) {
 				len = i;
+				break;
+			}
+		}
 
 		strbuf_addf(&ref_name, "refs/rewritten/%.*s", len, name);
 		if (get_oid(ref_name.buf, &oid) &&
-- 
2.19.1.593.gc670b1f876.dirty


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] sequencer: clarify intention to break out of loop
  2018-10-28 15:31 [PATCH] sequencer: clarify intention to break out of loop Martin Ågren
@ 2018-10-28 19:01 ` Eric Sunshine
  2018-10-28 20:37   ` Martin Ågren
  2018-10-29  3:43   ` Junio C Hamano
  0 siblings, 2 replies; 8+ messages in thread
From: Eric Sunshine @ 2018-10-28 19:01 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Git List, Johannes Schindelin

On Sun, Oct 28, 2018 at 11:32 AM Martin Ågren <martin.agren@gmail.com> wrote:
> [...]
> Let's be explicit about breaking out of the loop. This helps the
> compiler grok our intention. As a bonus, it might make it (even) more
> obvious to human readers that the loop stops at the first space.

This did come up in review[1,2]. I had a hard time understanding the
loop because it looked idiomatic but wasn't (and we have preconceived
notions about behavior of things which look idiomatic).

[1]: https://public-inbox.org/git/CAPig+cQbG2s-LrAo9+7C7=dXifbWFJ3SzuNa-QePHDk7egK=jg@mail.gmail.com/
[2]: https://public-inbox.org/git/CAPig+cRjU6niXpT2FrDWZ0x1HmGf1ojVZj3uk2qXEGe-S7i_HQ@mail.gmail.com/

> Signed-off-by: Martin Ågren <martin.agren@gmail.com>
> ---
> diff --git a/sequencer.c b/sequencer.c
> @@ -2849,10 +2849,14 @@ static int do_reset(const char *name, int len, struct replay_opts *opts)
>                 /* Determine the length of the label */
> +               for (i = 0; i < len; i++) {
> +                       if (isspace(name[i])) {
>                                 len = i;
> +                               break;
> +                       }
> +               }
>                 strbuf_addf(&ref_name, "refs/rewritten/%.*s", len, name);

At least for me, this would be more idiomatic if it was coded like this:

    for (i = 0; i < len; i++)
        if (isspace(name[i]))
            break;
    strbuf_addf(..., i, name);

or, perhaps (less concise):

    for (i = 0; i < len; i++)
        if (isspace(name[i]))
            break;
    len = i;
    strbuf_addf(..., len, name);

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] sequencer: clarify intention to break out of loop
  2018-10-28 19:01 ` Eric Sunshine
@ 2018-10-28 20:37   ` Martin Ågren
  2018-10-29  3:43   ` Junio C Hamano
  1 sibling, 0 replies; 8+ messages in thread
From: Martin Ågren @ 2018-10-28 20:37 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git Mailing List, Johannes Schindelin

On Sun, 28 Oct 2018 at 20:01, Eric Sunshine <sunshine@sunshineco.com> wrote:
>
> On Sun, Oct 28, 2018 at 11:32 AM Martin Ågren <martin.agren@gmail.com> wrote:
> > [...]
> > Let's be explicit about breaking out of the loop. This helps the
> > compiler grok our intention. As a bonus, it might make it (even) more
> > obvious to human readers that the loop stops at the first space.
>
> This did come up in review[1,2]. I had a hard time understanding the
> loop because it looked idiomatic but wasn't (and we have preconceived
> notions about behavior of things which look idiomatic).
>
> [1]: https://public-inbox.org/git/CAPig+cQbG2s-LrAo9+7C7=dXifbWFJ3SzuNa-QePHDk7egK=jg@mail.gmail.com/
> [2]: https://public-inbox.org/git/CAPig+cRjU6niXpT2FrDWZ0x1HmGf1ojVZj3uk2qXEGe-S7i_HQ@mail.gmail.com/

Hmm, I should have been able to dig those up myself. Thanks for the
pointers.

> >                 /* Determine the length of the label */
> > +               for (i = 0; i < len; i++) {
> > +                       if (isspace(name[i])) {
> >                                 len = i;
> > +                               break;
> > +                       }
> > +               }
> >                 strbuf_addf(&ref_name, "refs/rewritten/%.*s", len, name);
>
> At least for me, this would be more idiomatic if it was coded like this:
>
>     for (i = 0; i < len; i++)
>         if (isspace(name[i]))
>             break;
>     strbuf_addf(..., i, name);
>
> or, perhaps (less concise):
>
>     for (i = 0; i < len; i++)
>         if (isspace(name[i]))
>             break;
>     len = i;
>     strbuf_addf(..., len, name);

This second one is more true to the original in that it updates `len` to
the new, shorter length. Which actually seems to be needed -- towards
the very end of the function, `len` is used, so the first of these
suggestions would change the behavior.

Thanks a lot for a review. I'll wait for any additional comments and
will try a v2 with your second suggestion.

Martin

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] sequencer: clarify intention to break out of loop
  2018-10-28 19:01 ` Eric Sunshine
  2018-10-28 20:37   ` Martin Ågren
@ 2018-10-29  3:43   ` Junio C Hamano
  2018-10-30  8:09     ` [PATCH v2] sequencer: break out of loop explicitly Martin Ågren
  1 sibling, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2018-10-29  3:43 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Martin Ågren, Git List, Johannes Schindelin

Eric Sunshine <sunshine@sunshineco.com> writes:

>> diff --git a/sequencer.c b/sequencer.c
>> @@ -2849,10 +2849,14 @@ static int do_reset(const char *name, int len, struct replay_opts *opts)
>>                 /* Determine the length of the label */
>> +               for (i = 0; i < len; i++) {
>> +                       if (isspace(name[i])) {
>>                                 len = i;
>> +                               break;
>> +                       }
>> +               }
>>                 strbuf_addf(&ref_name, "refs/rewritten/%.*s", len, name);
>
> At least for me, this would be more idiomatic if it was coded like this:
>
>     for (i = 0; i < len; i++)
>         if (isspace(name[i]))
>             break;
>     strbuf_addf(..., i, name);
>
> or, perhaps (less concise):
>
>     for (i = 0; i < len; i++)
>         if (isspace(name[i]))
>             break;
>     len = i;
>     strbuf_addf(..., len, name);

Yup, the former is more idiomatic between these two; the latter is
also fine.

The result of Martin's patch shares the same "Huh?" factor as the
original that mucks with the "supposedly constant" side of the
termination condition, and from that point of view, it is not that
much of a readability improvement, I would think.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2] sequencer: break out of loop explicitly
  2018-10-29  3:43   ` Junio C Hamano
@ 2018-10-30  8:09     ` Martin Ågren
  2018-10-31 14:54       ` Johannes Schindelin
  0 siblings, 1 reply; 8+ messages in thread
From: Martin Ågren @ 2018-10-30  8:09 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Eric Sunshine, Johannes Schindelin

It came up in review [1, 2] that this non-idiomatic loop is a bit tricky.
When we find a space, we set `len = i`, which gives us the answer we are
looking for, but which also breaks out of the loop.

It turns out that this loop can confuse compilers as well. My copy of
gcc 7.3.0 realizes that we are essentially evaluating `(len + 1) < len`
and warns that the behavior is undefined if `len` is `INT_MAX`. (Because
the assignment `len = i` is guaranteed to decrease `len`, such undefined
behavior is not actually possible.)

Rewrite the loop to a more idiomatic variant which doesn't muck with
`len` in the loop body. That should help compilers and human readers
figure out what is going on here. But do note that we need to update
`len` since it is not only used just after this loop (where we could
have used `i` directly), but also later in this function.

While at it, reduce the scope of `i`.

[1] https://public-inbox.org/git/CAPig+cQbG2s-LrAo9+7C7=dXifbWFJ3SzuNa-QePHDk7egK=jg@mail.gmail.com/

[2] https://public-inbox.org/git/CAPig+cRjU6niXpT2FrDWZ0x1HmGf1ojVZj3uk2qXEGe-S7i_HQ@mail.gmail.com/

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 Thanks for the comments on v1. Based on them, I decided to go for
 Eric's option 2, and to make the log message less technical in favor of
 "compilers and humans alike can get this wrong".

 sequencer.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 0c164d5f98..e7aa4d5020 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2829,7 +2829,7 @@ static int do_reset(const char *name, int len, struct replay_opts *opts)
 	struct tree_desc desc;
 	struct tree *tree;
 	struct unpack_trees_options unpack_tree_opts;
-	int ret = 0, i;
+	int ret = 0;
 
 	if (hold_locked_index(&lock, LOCK_REPORT_ON_ERROR) < 0)
 		return -1;
@@ -2849,10 +2849,13 @@ static int do_reset(const char *name, int len, struct replay_opts *opts)
 		}
 		oidcpy(&oid, &opts->squash_onto);
 	} else {
+		int i;
+
 		/* Determine the length of the label */
 		for (i = 0; i < len; i++)
 			if (isspace(name[i]))
-				len = i;
+				break;
+		len = i;
 
 		strbuf_addf(&ref_name, "refs/rewritten/%.*s", len, name);
 		if (get_oid(ref_name.buf, &oid) &&
-- 
2.19.1.593.gc670b1f876.dirty


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] sequencer: break out of loop explicitly
  2018-10-30  8:09     ` [PATCH v2] sequencer: break out of loop explicitly Martin Ågren
@ 2018-10-31 14:54       ` Johannes Schindelin
  2018-10-31 17:28         ` Eric Sunshine
  0 siblings, 1 reply; 8+ messages in thread
From: Johannes Schindelin @ 2018-10-31 14:54 UTC (permalink / raw)
  To: Martin Ågren; +Cc: git, Junio C Hamano, Eric Sunshine

[-- Attachment #1: Type: text/plain, Size: 2547 bytes --]

Hi Martin,

On Tue, 30 Oct 2018, Martin Ågren wrote:

> It came up in review [1, 2] that this non-idiomatic loop is a bit tricky.
> When we find a space, we set `len = i`, which gives us the answer we are
> looking for, but which also breaks out of the loop.
> 
> It turns out that this loop can confuse compilers as well. My copy of
> gcc 7.3.0 realizes that we are essentially evaluating `(len + 1) < len`
> and warns that the behavior is undefined if `len` is `INT_MAX`. (Because
> the assignment `len = i` is guaranteed to decrease `len`, such undefined
> behavior is not actually possible.)
> 
> Rewrite the loop to a more idiomatic variant which doesn't muck with
> `len` in the loop body. That should help compilers and human readers
> figure out what is going on here. But do note that we need to update
> `len` since it is not only used just after this loop (where we could
> have used `i` directly), but also later in this function.
> 
> While at it, reduce the scope of `i`.
> 
> [1] https://public-inbox.org/git/CAPig+cQbG2s-LrAo9+7C7=dXifbWFJ3SzuNa-QePHDk7egK=jg@mail.gmail.com/
> 
> [2] https://public-inbox.org/git/CAPig+cRjU6niXpT2FrDWZ0x1HmGf1ojVZj3uk2qXEGe-S7i_HQ@mail.gmail.com/
> 
> Helped-by: Eric Sunshine <sunshine@sunshineco.com>
> Signed-off-by: Martin Ågren <martin.agren@gmail.com>
> ---

ACK. Thanks for cleaning up after me,
Dscho

>  Thanks for the comments on v1. Based on them, I decided to go for
>  Eric's option 2, and to make the log message less technical in favor of
>  "compilers and humans alike can get this wrong".
> 
>  sequencer.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/sequencer.c b/sequencer.c
> index 0c164d5f98..e7aa4d5020 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -2829,7 +2829,7 @@ static int do_reset(const char *name, int len, struct replay_opts *opts)
>  	struct tree_desc desc;
>  	struct tree *tree;
>  	struct unpack_trees_options unpack_tree_opts;
> -	int ret = 0, i;
> +	int ret = 0;
>  
>  	if (hold_locked_index(&lock, LOCK_REPORT_ON_ERROR) < 0)
>  		return -1;
> @@ -2849,10 +2849,13 @@ static int do_reset(const char *name, int len, struct replay_opts *opts)
>  		}
>  		oidcpy(&oid, &opts->squash_onto);
>  	} else {
> +		int i;
> +
>  		/* Determine the length of the label */
>  		for (i = 0; i < len; i++)
>  			if (isspace(name[i]))
> -				len = i;
> +				break;
> +		len = i;
>  
>  		strbuf_addf(&ref_name, "refs/rewritten/%.*s", len, name);
>  		if (get_oid(ref_name.buf, &oid) &&
> -- 
> 2.19.1.593.gc670b1f876.dirty
> 
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] sequencer: break out of loop explicitly
  2018-10-31 14:54       ` Johannes Schindelin
@ 2018-10-31 17:28         ` Eric Sunshine
  2018-10-31 18:41           ` Martin Ågren
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Sunshine @ 2018-10-31 17:28 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Martin Ågren, Git List, Junio C Hamano

On Wed, Oct 31, 2018 at 10:54 AM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> On Tue, 30 Oct 2018, Martin Ågren wrote:
> > Rewrite the loop to a more idiomatic variant which doesn't muck with
> > `len` in the loop body. That should help compilers and human readers
> > figure out what is going on here. But do note that we need to update
> > `len` since it is not only used just after this loop (where we could
> > have used `i` directly), but also later in this function.
> >
> > Signed-off-by: Martin Ågren <martin.agren@gmail.com>
> > ---
>
> ACK. Thanks for cleaning up after me,

Looks good to me, as well. Thanks for working on it.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] sequencer: break out of loop explicitly
  2018-10-31 17:28         ` Eric Sunshine
@ 2018-10-31 18:41           ` Martin Ågren
  0 siblings, 0 replies; 8+ messages in thread
From: Martin Ågren @ 2018-10-31 18:41 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Johannes Schindelin, Git Mailing List, Junio C Hamano

On Wed, 31 Oct 2018 at 18:28, Eric Sunshine <sunshine@sunshineco.com> wrote:
>
> On Wed, Oct 31, 2018 at 10:54 AM Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:

> > ACK. Thanks for cleaning up after me,
>
> Looks good to me, as well. Thanks for working on it.

Thanks, both of you.

Martin

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-10-31 18:42 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-28 15:31 [PATCH] sequencer: clarify intention to break out of loop Martin Ågren
2018-10-28 19:01 ` Eric Sunshine
2018-10-28 20:37   ` Martin Ågren
2018-10-29  3:43   ` Junio C Hamano
2018-10-30  8:09     ` [PATCH v2] sequencer: break out of loop explicitly Martin Ågren
2018-10-31 14:54       ` Johannes Schindelin
2018-10-31 17:28         ` Eric Sunshine
2018-10-31 18:41           ` Martin Ågren

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).