git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH ps/stash-in-c] strbuf_vinsertf: provide the correct buffer size to vsnprintf
@ 2019-02-03 16:51 Johannes Sixt
  2019-02-04  7:25 ` Johannes Sixt
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Johannes Sixt @ 2019-02-03 16:51 UTC (permalink / raw)
  To: Paul-Sebastian Ungureanu; +Cc: Git Mailing List

strbuf_vinsertf inserts a formatted string in the middle of an existing
strbuf value. It makes room in the strbuf by moving existing string to
the back, then formats the string to insert directly into the hole.

It uses vsnprintf to format the string. The buffer size provided in the
invocation is the number of characters available in the allocated space
behind the final string. This does not make any sense at all.

Fix it to pass the length of the inserted string plus one for the NUL.
(The functions saves and restores the character that the NUL occupies.)

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
---
 I found this, because in my environment I have to compile with
 SNPRINTF_RETURNS_BOGUS. Our implementation of vsnprintf in
 compat/snprintf.c writes into the end of the buffer unconditionally,
 at a spot that is unrelated to the formatted string, and this leads to
 "BUG: a NUL byte in commit log message not allowed" in some "stash"
 tests.

 strbuf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/strbuf.c b/strbuf.c
index bfbbdadbf3..87ecf7f975 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -270,7 +270,7 @@ void strbuf_vinsertf(struct strbuf *sb, size_t pos, const char *fmt, va_list ap)
 	memmove(sb->buf + pos + len, sb->buf + pos, sb->len - pos);
 	/* vsnprintf() will append a NUL, overwriting one of our characters */
 	save = sb->buf[pos + len];
-	len2 = vsnprintf(sb->buf + pos, sb->alloc - sb->len, fmt, ap);
+	len2 = vsnprintf(sb->buf + pos, len + 1, fmt, ap);
 	sb->buf[pos + len] = save;
 	if (len2 != len)
 		BUG("your vsnprintf is broken (returns inconsistent lengths)");
-- 
2.20.1.86.gb0de946387

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH ps/stash-in-c] strbuf_vinsertf: provide the correct buffer size to vsnprintf
  2019-02-03 16:51 [PATCH ps/stash-in-c] strbuf_vinsertf: provide the correct buffer size to vsnprintf Johannes Sixt
@ 2019-02-04  7:25 ` Johannes Sixt
  2019-02-04 10:38   ` Johannes Schindelin
  2019-02-04 10:54 ` Johannes Schindelin
  2019-02-04 21:57 ` Junio C Hamano
  2 siblings, 1 reply; 13+ messages in thread
From: Johannes Sixt @ 2019-02-04  7:25 UTC (permalink / raw)
  To: Paul-Sebastian Ungureanu; +Cc: Git Mailing List

Am 03.02.19 um 17:51 schrieb Johannes Sixt:
> strbuf_vinsertf inserts a formatted string in the middle of an existing
> strbuf value.

Quite frankly, this is a really unusual operation, and I'd prefer to get
rid of it. There is only one call, and it looks like it only wants to be
lazy and save one strbuf variable.

This helper adds way more code than a non-lazy caller would need. There
wouldn't even be a mental burden. Like this (except that strbuf_addstr
doesn't do what I thought it would do...).

diff --git a/builtin/stash.c b/builtin/stash.c
index 74e6ff62b5..95d202aea3 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -1101,7 +1101,7 @@ static int stash_working_tree(struct stash_info *info, struct pathspec ps)
 	return ret;
 }
 
-static int do_create_stash(struct pathspec ps, struct strbuf *stash_msg_buf,
+static int do_create_stash(struct pathspec ps, const char *stash_msg,
 			   int include_untracked, int patch_mode,
 			   struct stash_info *info, struct strbuf *patch,
 			   int quiet)
@@ -1117,6 +1117,7 @@ static int do_create_stash(struct pathspec ps, struct strbuf *stash_msg_buf,
 	struct strbuf msg = STRBUF_INIT;
 	struct strbuf commit_tree_label = STRBUF_INIT;
 	struct strbuf untracked_files = STRBUF_INIT;
+	struct strbuf stash_msg_buf = STRBUF_INIT;
 
 	prepare_fallback_ident("git stash", "git@stash");
 
@@ -1188,10 +1189,12 @@ static int do_create_stash(struct pathspec ps, struct strbuf *stash_msg_buf,
 		}
 	}
 
-	if (!stash_msg_buf->len)
-		strbuf_addf(stash_msg_buf, "WIP on %s", msg.buf);
-	else
-		strbuf_insertf(stash_msg_buf, 0, "On %s: ", branch_name);
+	if (!*stash_msg) {
+		strbuf_addf(&stash_msg_buf, "WIP on %s", msg.buf);
+	} else {
+		strbuf_addf(&stash_msg_buf, "On %s: ", branch_name);
+		strbuf_addstr(&stash_msg_buf, stash_msg);
+	}
 
 	/*
 	 * `parents` will be empty after calling `commit_tree()`, so there is
@@ -1206,7 +1209,7 @@ static int do_create_stash(struct pathspec ps, struct strbuf *stash_msg_buf,
 			   &parents);
 	commit_list_insert(head_commit, &parents);
 
-	if (commit_tree(stash_msg_buf->buf, stash_msg_buf->len, &info->w_tree,
+	if (commit_tree(stash_msg_buf.buf, stash_msg_buf.len, &info->w_tree,
 			parents, &info->w_commit, NULL, NULL)) {
 		if (!quiet)
 			fprintf_ln(stderr, _("Cannot record "
@@ -1216,6 +1219,7 @@ static int do_create_stash(struct pathspec ps, struct strbuf *stash_msg_buf,
 	}
 
 done:
+	strbuf_release(&stash_msg_buf);
 	strbuf_release(&commit_tree_label);
 	strbuf_release(&msg);
 	strbuf_release(&untracked_files);
@@ -1236,7 +1240,7 @@ static int create_stash(int argc, const char **argv, const char *prefix)
 	if (!check_changes_tracked_files(ps))
 		return 0;
 
-	if (!(ret = do_create_stash(ps, &stash_msg_buf, 0, 0, &info, NULL, 0)))
+	if (!(ret = do_create_stash(ps, stash_msg_buf.buf, 0, 0, &info, NULL, 0)))
 		printf_ln("%s", oid_to_hex(&info.w_commit));
 
 	strbuf_release(&stash_msg_buf);
@@ -1300,7 +1304,7 @@ static int do_push_stash(struct pathspec ps, const char *stash_msg, int quiet,
 
 	if (stash_msg)
 		strbuf_addstr(&stash_msg_buf, stash_msg);
-	if (do_create_stash(ps, &stash_msg_buf, include_untracked, patch_mode,
+	if (do_create_stash(ps, stash_msg_buf.buf, include_untracked, patch_mode,
 			    &info, &patch, quiet)) {
 		ret = -1;
 		goto done;

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH ps/stash-in-c] strbuf_vinsertf: provide the correct buffer size to vsnprintf
  2019-02-04  7:25 ` Johannes Sixt
@ 2019-02-04 10:38   ` Johannes Schindelin
  2019-02-04 21:13     ` Johannes Sixt
  0 siblings, 1 reply; 13+ messages in thread
From: Johannes Schindelin @ 2019-02-04 10:38 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Paul-Sebastian Ungureanu, Git Mailing List

Hi Hannes,

On Mon, 4 Feb 2019, Johannes Sixt wrote:

> Am 03.02.19 um 17:51 schrieb Johannes Sixt:
> > strbuf_vinsertf inserts a formatted string in the middle of an existing
> > strbuf value.
> 
> Quite frankly, this is a really unusual operation, and I'd prefer to get
> rid of it. There is only one call, and it looks like it only wants to be
> lazy and save one strbuf variable.

The only reason why there are not more callers is that I did not convert
any of the appropriate places. We have quite a few places where we
allocate a new strbuf for the sole purpose of formatting something that is
then inserted into an already existing strbuf (possibly extending the
buffer, which might require a move of the buffer just because that
temporary strbuf is in the way).

It does not sound like good practice to me to allocate things left and
right, only to reallocate something that was just allocated anyway and to
copy things into that and then release things left and right.

Ciao,
Dscho

> This helper adds way more code than a non-lazy caller would need. There
> wouldn't even be a mental burden. Like this (except that strbuf_addstr
> doesn't do what I thought it would do...).
> 
> diff --git a/builtin/stash.c b/builtin/stash.c
> index 74e6ff62b5..95d202aea3 100644
> --- a/builtin/stash.c
> +++ b/builtin/stash.c
> @@ -1101,7 +1101,7 @@ static int stash_working_tree(struct stash_info *info, struct pathspec ps)
>  	return ret;
>  }
>  
> -static int do_create_stash(struct pathspec ps, struct strbuf *stash_msg_buf,
> +static int do_create_stash(struct pathspec ps, const char *stash_msg,
>  			   int include_untracked, int patch_mode,
>  			   struct stash_info *info, struct strbuf *patch,
>  			   int quiet)
> @@ -1117,6 +1117,7 @@ static int do_create_stash(struct pathspec ps, struct strbuf *stash_msg_buf,
>  	struct strbuf msg = STRBUF_INIT;
>  	struct strbuf commit_tree_label = STRBUF_INIT;
>  	struct strbuf untracked_files = STRBUF_INIT;
> +	struct strbuf stash_msg_buf = STRBUF_INIT;
>  
>  	prepare_fallback_ident("git stash", "git@stash");
>  
> @@ -1188,10 +1189,12 @@ static int do_create_stash(struct pathspec ps, struct strbuf *stash_msg_buf,
>  		}
>  	}
>  
> -	if (!stash_msg_buf->len)
> -		strbuf_addf(stash_msg_buf, "WIP on %s", msg.buf);
> -	else
> -		strbuf_insertf(stash_msg_buf, 0, "On %s: ", branch_name);
> +	if (!*stash_msg) {
> +		strbuf_addf(&stash_msg_buf, "WIP on %s", msg.buf);
> +	} else {
> +		strbuf_addf(&stash_msg_buf, "On %s: ", branch_name);
> +		strbuf_addstr(&stash_msg_buf, stash_msg);
> +	}
>  
>  	/*
>  	 * `parents` will be empty after calling `commit_tree()`, so there is
> @@ -1206,7 +1209,7 @@ static int do_create_stash(struct pathspec ps, struct strbuf *stash_msg_buf,
>  			   &parents);
>  	commit_list_insert(head_commit, &parents);
>  
> -	if (commit_tree(stash_msg_buf->buf, stash_msg_buf->len, &info->w_tree,
> +	if (commit_tree(stash_msg_buf.buf, stash_msg_buf.len, &info->w_tree,
>  			parents, &info->w_commit, NULL, NULL)) {
>  		if (!quiet)
>  			fprintf_ln(stderr, _("Cannot record "
> @@ -1216,6 +1219,7 @@ static int do_create_stash(struct pathspec ps, struct strbuf *stash_msg_buf,
>  	}
>  
>  done:
> +	strbuf_release(&stash_msg_buf);
>  	strbuf_release(&commit_tree_label);
>  	strbuf_release(&msg);
>  	strbuf_release(&untracked_files);
> @@ -1236,7 +1240,7 @@ static int create_stash(int argc, const char **argv, const char *prefix)
>  	if (!check_changes_tracked_files(ps))
>  		return 0;
>  
> -	if (!(ret = do_create_stash(ps, &stash_msg_buf, 0, 0, &info, NULL, 0)))
> +	if (!(ret = do_create_stash(ps, stash_msg_buf.buf, 0, 0, &info, NULL, 0)))
>  		printf_ln("%s", oid_to_hex(&info.w_commit));
>  
>  	strbuf_release(&stash_msg_buf);
> @@ -1300,7 +1304,7 @@ static int do_push_stash(struct pathspec ps, const char *stash_msg, int quiet,
>  
>  	if (stash_msg)
>  		strbuf_addstr(&stash_msg_buf, stash_msg);
> -	if (do_create_stash(ps, &stash_msg_buf, include_untracked, patch_mode,
> +	if (do_create_stash(ps, stash_msg_buf.buf, include_untracked, patch_mode,
>  			    &info, &patch, quiet)) {
>  		ret = -1;
>  		goto done;
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH ps/stash-in-c] strbuf_vinsertf: provide the correct buffer size to vsnprintf
  2019-02-03 16:51 [PATCH ps/stash-in-c] strbuf_vinsertf: provide the correct buffer size to vsnprintf Johannes Sixt
  2019-02-04  7:25 ` Johannes Sixt
@ 2019-02-04 10:54 ` Johannes Schindelin
  2019-02-04 21:57 ` Junio C Hamano
  2 siblings, 0 replies; 13+ messages in thread
From: Johannes Schindelin @ 2019-02-04 10:54 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Paul-Sebastian Ungureanu, Git Mailing List

Hi Hannes,

On Sun, 3 Feb 2019, Johannes Sixt wrote:

> strbuf_vinsertf inserts a formatted string in the middle of an existing
> strbuf value. It makes room in the strbuf by moving existing string to
> the back, then formats the string to insert directly into the hole.
> 
> It uses vsnprintf to format the string. The buffer size provided in the
> invocation is the number of characters available in the allocated space
> behind the final string. This does not make any sense at all.
> 
> Fix it to pass the length of the inserted string plus one for the NUL.
> (The functions saves and restores the character that the NUL occupies.)
> 
> Signed-off-by: Johannes Sixt <j6t@kdbg.org>
> ---
>  I found this, because in my environment I have to compile with
>  SNPRINTF_RETURNS_BOGUS. Our implementation of vsnprintf in
>  compat/snprintf.c writes into the end of the buffer unconditionally,
>  at a spot that is unrelated to the formatted string, and this leads to
>  "BUG: a NUL byte in commit log message not allowed" in some "stash"
>  tests.
> 
>  strbuf.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/strbuf.c b/strbuf.c
> index bfbbdadbf3..87ecf7f975 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -270,7 +270,7 @@ void strbuf_vinsertf(struct strbuf *sb, size_t pos, const char *fmt, va_list ap)
>  	memmove(sb->buf + pos + len, sb->buf + pos, sb->len - pos);
>  	/* vsnprintf() will append a NUL, overwriting one of our characters */
>  	save = sb->buf[pos + len];
> -	len2 = vsnprintf(sb->buf + pos, sb->alloc - sb->len, fmt, ap);
> +	len2 = vsnprintf(sb->buf + pos, len + 1, fmt, ap);

It is really unfortunate that we use a non-dynamic code review system
where it is pretty impossible to increase the amount of context lines
easily.

And in this instance, a single line before the shown context would
suffice:

	strbuf_grow(sb, len);

Which is in line with moving from `pos` to `pos + len`, and it is also in
line with saving the byte at `pos + len`. And since we must consider that
byte as part of the buffer, and since `vsnprintf()` wants that size of the
buffer (including trailing NUL), `len + 1` is correct.

And since `strbuf_grow(sb, len)` would not in general grow the buffer by
exactly `len` bytes, you indeed fixed a bug.

For historical context, when I first implemented `strbuf_vinsertf()`, I
first grew the buffer, then let `vsnprintf()` write to the end, and then
would rotate the bytes into their correct location. This required the
implementation of an in-place rotation scheme, which was a lot of fun, and
totally unnecessary. That `sb->alloc - sb->len` parameter you fixed was a
remainder of that fun side project.

Thanks for fixing it,
Dscho

P.S.: Side note: I just realized that we could also write

	save = sb->buf[pos];
  	memmove(sb->buf + pos + len + 1, sb->buf + pos + 1, sb->len - pos - 1);

instead, i.e. not move the first byte just to have it overwritten by
vsnprintf() immediately, saving on moving one byte. But quite frankly, in
this case I do agree that readability is more important than trying to
squeeze out the last bit of performance.

>  	sb->buf[pos + len] = save;
>  	if (len2 != len)
>  		BUG("your vsnprintf is broken (returns inconsistent lengths)");
> -- 
> 2.20.1.86.gb0de946387
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH ps/stash-in-c] strbuf_vinsertf: provide the correct buffer size to vsnprintf
  2019-02-04 10:38   ` Johannes Schindelin
@ 2019-02-04 21:13     ` Johannes Sixt
  2019-02-05 10:38       ` Johannes Schindelin
  0 siblings, 1 reply; 13+ messages in thread
From: Johannes Sixt @ 2019-02-04 21:13 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Paul-Sebastian Ungureanu, Git Mailing List

Am 04.02.19 um 11:38 schrieb Johannes Schindelin:
> On Mon, 4 Feb 2019, Johannes Sixt wrote:
> 
>> Am 03.02.19 um 17:51 schrieb Johannes Sixt:
>>> strbuf_vinsertf inserts a formatted string in the middle of an existing
>>> strbuf value.
>>
>> Quite frankly, this is a really unusual operation, and I'd prefer to get
>> rid of it. There is only one call, and it looks like it only wants to be
>> lazy and save one strbuf variable.
> 
> The only reason why there are not more callers is that I did not convert
> any of the appropriate places. We have quite a few places where we
> allocate a new strbuf for the sole purpose of formatting something that is
> then inserted into an already existing strbuf (possibly extending the
> buffer, which might require a move of the buffer just because that
> temporary strbuf is in the way).
> 
> It does not sound like good practice to me to allocate things left and
> right, only to reallocate something that was just allocated anyway and to
> copy things into that and then release things left and right.

I prefer separation of concerns at the expense of a bit of resource
consumption that is not measurable. But that is the only argument that I
have.

-- Hannes

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH ps/stash-in-c] strbuf_vinsertf: provide the correct buffer size to vsnprintf
  2019-02-03 16:51 [PATCH ps/stash-in-c] strbuf_vinsertf: provide the correct buffer size to vsnprintf Johannes Sixt
  2019-02-04  7:25 ` Johannes Sixt
  2019-02-04 10:54 ` Johannes Schindelin
@ 2019-02-04 21:57 ` Junio C Hamano
  2019-02-05 11:06   ` Johannes Schindelin
  2 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2019-02-04 21:57 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Paul-Sebastian Ungureanu, Git Mailing List

Johannes Sixt <j6t@kdbg.org> writes:

> strbuf_vinsertf inserts a formatted string in the middle of an existing
> strbuf value. It makes room in the strbuf by moving existing string to
> the back, then formats the string to insert directly into the hole.
>
> It uses vsnprintf to format the string. The buffer size provided in the
> invocation is the number of characters available in the allocated space
> behind the final string. This does not make any sense at all.
>
> Fix it to pass the length of the inserted string plus one for the NUL.
> (The functions saves and restores the character that the NUL occupies.)
>
> Signed-off-by: Johannes Sixt <j6t@kdbg.org>
> ---
>  I found this, because in my environment I have to compile with
>  SNPRINTF_RETURNS_BOGUS. Our implementation of vsnprintf in
>  compat/snprintf.c writes into the end of the buffer unconditionally,
>  at a spot that is unrelated to the formatted string, and this leads to
>  "BUG: a NUL byte in commit log message not allowed" in some "stash"
>  tests.

An embarrassing indication that not every line is read during
development or review with fine toothed comb.  I guess it won't
trigger without the returns-bogus thing, and the "testing" done on
platforms did not help.

Thanks for finding it.  This needs to be squashed into bfc3fe33
("strbuf.c: add `strbuf_insertf()` and `strbuf_vinsertf()`",
2018-12-20)?

As you mention in the direct response of this message, it might make
sense to get rid of the helper with YAGNI, but that's something we
need to see what its potential users should be doing (note: I did
not say "what its potential users are doing").

It could turn out that there may be many places that could use this
helper, but it may merely be an indication that these many places
are structured poorly to compute the "shell" string too early before
the string to be inserted becomes computable, in which case the real
fix may not be to use this helper but to change the order of
building up the string.  Perhaps we want a string that is
concatenation of part #1, part #2 and part #3, but we somehow first
compute concatenation of part #1 and part #3 in a buffer, which
requires us to insert part #2 in between.  If we restructure the
code to keep parts #1 and #3 separate, or delay computing part #3,
until part #2 becomes available, that would fix the "potential
users" in a better way without having to make calls into this
helper, for example.


>  strbuf.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/strbuf.c b/strbuf.c
> index bfbbdadbf3..87ecf7f975 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -270,7 +270,7 @@ void strbuf_vinsertf(struct strbuf *sb, size_t pos, const char *fmt, va_list ap)
>  	memmove(sb->buf + pos + len, sb->buf + pos, sb->len - pos);
>  	/* vsnprintf() will append a NUL, overwriting one of our characters */
>  	save = sb->buf[pos + len];
> -	len2 = vsnprintf(sb->buf + pos, sb->alloc - sb->len, fmt, ap);
> +	len2 = vsnprintf(sb->buf + pos, len + 1, fmt, ap);
>  	sb->buf[pos + len] = save;
>  	if (len2 != len)
>  		BUG("your vsnprintf is broken (returns inconsistent lengths)");

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH ps/stash-in-c] strbuf_vinsertf: provide the correct buffer size to vsnprintf
  2019-02-04 21:13     ` Johannes Sixt
@ 2019-02-05 10:38       ` Johannes Schindelin
  0 siblings, 0 replies; 13+ messages in thread
From: Johannes Schindelin @ 2019-02-05 10:38 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Paul-Sebastian Ungureanu, Git Mailing List

Hi Hannes,

On Mon, 4 Feb 2019, Johannes Sixt wrote:

> Am 04.02.19 um 11:38 schrieb Johannes Schindelin:
> > On Mon, 4 Feb 2019, Johannes Sixt wrote:
> > 
> >> Am 03.02.19 um 17:51 schrieb Johannes Sixt:
> >>> strbuf_vinsertf inserts a formatted string in the middle of an existing
> >>> strbuf value.
> >>
> >> Quite frankly, this is a really unusual operation, and I'd prefer to get
> >> rid of it. There is only one call, and it looks like it only wants to be
> >> lazy and save one strbuf variable.
> > 
> > The only reason why there are not more callers is that I did not convert
> > any of the appropriate places. We have quite a few places where we
> > allocate a new strbuf for the sole purpose of formatting something that is
> > then inserted into an already existing strbuf (possibly extending the
> > buffer, which might require a move of the buffer just because that
> > temporary strbuf is in the way).
> > 
> > It does not sound like good practice to me to allocate things left and
> > right, only to reallocate something that was just allocated anyway and to
> > copy things into that and then release things left and right.
> 
> I prefer separation of concerns at the expense of a bit of resource
> consumption that is not measurable. But that is the only argument that I
> have.

As far as I am concerned, the `strbuf_vinsertf()` function does not
conflate any concerns... It has one, very easily explained concern: to
interpolate text into the middle of a `strbuf`, much like `strbuf_vaddf()`
interpolates text at the end of one...

You found a bug in the implementation of that concern, is all, and fixed
it, for which I am grateful.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH ps/stash-in-c] strbuf_vinsertf: provide the correct buffer size to vsnprintf
  2019-02-04 21:57 ` Junio C Hamano
@ 2019-02-05 11:06   ` Johannes Schindelin
  2019-02-05 18:01     ` Junio C Hamano
  2019-02-05 20:30     ` Johannes Sixt
  0 siblings, 2 replies; 13+ messages in thread
From: Johannes Schindelin @ 2019-02-05 11:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Sixt, Paul-Sebastian Ungureanu, Git Mailing List

Hi Junio,

On Mon, 4 Feb 2019, Junio C Hamano wrote:

> Johannes Sixt <j6t@kdbg.org> writes:
> 
> > strbuf_vinsertf inserts a formatted string in the middle of an existing
> > strbuf value. It makes room in the strbuf by moving existing string to
> > the back, then formats the string to insert directly into the hole.
> >
> > It uses vsnprintf to format the string. The buffer size provided in the
> > invocation is the number of characters available in the allocated space
> > behind the final string. This does not make any sense at all.
> >
> > Fix it to pass the length of the inserted string plus one for the NUL.
> > (The functions saves and restores the character that the NUL occupies.)
> >
> > Signed-off-by: Johannes Sixt <j6t@kdbg.org>
> > ---
> >  I found this, because in my environment I have to compile with
> >  SNPRINTF_RETURNS_BOGUS. Our implementation of vsnprintf in
> >  compat/snprintf.c writes into the end of the buffer unconditionally,
> >  at a spot that is unrelated to the formatted string, and this leads to
> >  "BUG: a NUL byte in commit log message not allowed" in some "stash"
> >  tests.
> 
> An embarrassing indication that not every line is read during
> development or review with fine toothed comb.

Or that they are read with a fine toothed comb, but that the focus lies
more on style and maintainability than correctness. We talked about this
in the past.

> I guess it won't trigger without the returns-bogus thing, and the
> "testing" done on platforms did not help.

It won't trigger with implementations of vsnprintf() that write just the
necessary bytes. You will note that our compat/ version writes at the end
of the provided buffer, which is legal, but actually not necessary.

> Thanks for finding it.  This needs to be squashed into bfc3fe33
> ("strbuf.c: add `strbuf_insertf()` and `strbuf_vinsertf()`",
> 2018-12-20)?

Since you want to open that can of worms again, you will also want to
squash ed5d77f7d382 (stash: fix segmentation fault when files were added with
intent, 2019-01-18) into 1f5a011d90ec (stash: convert create to builtin,
2018-12-20). It will have trivial merge conflicts, and the addition of the
regression test will be surprising without modifying the commit message to
explain that there was a regression that was fixed very late in the
development, and that that regression test intends to guarantee that it
won't need to be fixed again.

> As you mention in the direct response of this message, it might make
> sense to get rid of the helper with YAGNI, but that's something we
> need to see what its potential users should be doing (note: I did
> not say "what its potential users are doing").

As I already mentioned, there are a couple of potential users that I even
included in my original proposal to Paul, but we decided that this is a
separate concern and should therefore be addressed in a separate patch
series.

> It could turn out that there may be many places that could use this
> helper, but it may merely be an indication that these many places
> are structured poorly to compute the "shell" string too early before
> the string to be inserted becomes computable, in which case the real
> fix may not be to use this helper but to change the order of
> building up the string.

We do not usually do that, as our reviewers would almost certainly point
out such a convoluted operation.

And you seem to make a case against having `strbuf_insert()` to begin
with. So I am wondering whether you want to take a step back before going
further down that road.

The real examples are much more mundane, and very different from what you
suspected, e.g. inserting the tag header before the tag message in
`create_tag()` in `builtin/tag.c`. Basically, it is building up a strbuf
for the sake of calling `strbuf_insert()` to insert that string elsewhere.

In any case, the mere existence, and use, of `strbuf_insert()` is a pretty
clear counter case to the idea that `strbuf_vinsertf()` would encourage
invalid code flows.

Ciao,
Dscho

> Perhaps we want a string that is concatenation of part #1, part #2 and
> part #3, but we somehow first compute concatenation of part #1 and part
> #3 in a buffer, which requires us to insert part #2 in between.  If we
> restructure the code to keep parts #1 and #3 separate, or delay
> computing part #3, until part #2 becomes available, that would fix the
> "potential users" in a better way without having to make calls into this
> helper, for example.
> 
> 
> >  strbuf.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/strbuf.c b/strbuf.c
> > index bfbbdadbf3..87ecf7f975 100644
> > --- a/strbuf.c
> > +++ b/strbuf.c
> > @@ -270,7 +270,7 @@ void strbuf_vinsertf(struct strbuf *sb, size_t pos, const char *fmt, va_list ap)
> >  	memmove(sb->buf + pos + len, sb->buf + pos, sb->len - pos);
> >  	/* vsnprintf() will append a NUL, overwriting one of our characters */
> >  	save = sb->buf[pos + len];
> > -	len2 = vsnprintf(sb->buf + pos, sb->alloc - sb->len, fmt, ap);
> > +	len2 = vsnprintf(sb->buf + pos, len + 1, fmt, ap);
> >  	sb->buf[pos + len] = save;
> >  	if (len2 != len)
> >  		BUG("your vsnprintf is broken (returns inconsistent lengths)");
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH ps/stash-in-c] strbuf_vinsertf: provide the correct buffer size to vsnprintf
  2019-02-05 11:06   ` Johannes Schindelin
@ 2019-02-05 18:01     ` Junio C Hamano
  2019-02-06 11:41       ` Johannes Schindelin
  2019-02-05 20:30     ` Johannes Sixt
  1 sibling, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2019-02-05 18:01 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Johannes Sixt, Paul-Sebastian Ungureanu, Git Mailing List

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Or that they are read with a fine toothed comb, but that the focus lies
> more on style and maintainability than correctness. We talked about this
> in the past.

Perhaps we can do better the next time, then.  I find unreadable
code is impossible to reason about its correctness, so you'd need
to pay attention to style and maintenance issues to quickly get past
that step.

>> Thanks for finding it.  This needs to be squashed into bfc3fe33
>> ("strbuf.c: add `strbuf_insertf()` and `strbuf_vinsertf()`",
>> 2018-12-20)?
>
> Since you want to open that can of worms again, you will also want to
> squash ed5d77f7d382 (stash: fix segmentation fault when files were added with
> intent, 2019-01-18) into 1f5a011d90ec (stash: convert create to builtin,
> 2018-12-20). It will have trivial merge conflicts, and the addition of the
> regression test will be surprising without modifying the commit message to
> explain that there was a regression that was fixed very late in the
> development, and that that regression test intends to guarantee that it
> won't need to be fixed again.

Are you saying that I should not merge the series as is but expect
an update that does these squashing?  I was planning to make this a
merge-down day, but let me exclude this topic from the "for next"
batch just in case for today.

Thanks.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH ps/stash-in-c] strbuf_vinsertf: provide the correct buffer size to vsnprintf
  2019-02-05 11:06   ` Johannes Schindelin
  2019-02-05 18:01     ` Junio C Hamano
@ 2019-02-05 20:30     ` Johannes Sixt
  2019-02-06 11:56       ` Johannes Schindelin
  1 sibling, 1 reply; 13+ messages in thread
From: Johannes Sixt @ 2019-02-05 20:30 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Junio C Hamano, Paul-Sebastian Ungureanu, Git Mailing List

Am 05.02.19 um 12:06 schrieb Johannes Schindelin:
> The real examples are much more mundane, and very different from what you
> suspected, e.g. inserting the tag header before the tag message in
> `create_tag()` in `builtin/tag.c`. Basically, it is building up a strbuf
> for the sake of calling `strbuf_insert()` to insert that string elsewhere.

I had a look, and this example does not convince me at all. If
separation of concerns were applied well around the launch_editor APIs,
you would not need strbuf_insert() in the first place. But, alas, these
functions focus more on DRY, and that is often the opposite of
separation of concerns.

> In any case, the mere existence, and use, of `strbuf_insert()` is a pretty
> clear counter case to the idea that `strbuf_vinsertf()` would encourage
> invalid code flows.

Nobody wants to get rid of strbuf_insert(). I have worked with at least
3 string APIs. All have insert operations, and some have string
formatting capabilities, but none of them conflate the two operations
into one. Maybe, there is a plan behind it? strbuf is the first (my 4th)
API that does, and it was non-trivial to get it right...

-- Hannes

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH ps/stash-in-c] strbuf_vinsertf: provide the correct buffer size to vsnprintf
  2019-02-05 18:01     ` Junio C Hamano
@ 2019-02-06 11:41       ` Johannes Schindelin
  0 siblings, 0 replies; 13+ messages in thread
From: Johannes Schindelin @ 2019-02-06 11:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Sixt, Paul-Sebastian Ungureanu, Git Mailing List

Hi Junio,

On Tue, 5 Feb 2019, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> >> Thanks for finding it.  This needs to be squashed into bfc3fe33
> >> ("strbuf.c: add `strbuf_insertf()` and `strbuf_vinsertf()`",
> >> 2018-12-20)?
> >
> > Since you want to open that can of worms again, you will also want to
> > squash ed5d77f7d382 (stash: fix segmentation fault when files were
> > added with intent, 2019-01-18) into 1f5a011d90ec (stash: convert
> > create to builtin, 2018-12-20). It will have trivial merge conflicts,
> > and the addition of the regression test will be surprising without
> > modifying the commit message to explain that there was a regression
> > that was fixed very late in the development, and that that regression
> > test intends to guarantee that it won't need to be fixed again.
> 
> Are you saying that I should not merge the series as is but expect
> an update that does these squashing?

No, I thought I had made my wish clear in the past: to advance
ps/stash-in-c to `next` a long time ago, mainly because it became obvious
that Paul is too occupied with University things to really pay attention
to the discussions on the Git mailing list. Consequently, we cannot really
do the regular thing where the branch is cooked in `pu` because the main
contributor is not really able to spend the time cooking. So to me, it
sounded like the safest way forward (without losing the entire
ps/stash-in-c branch) would be to merge it into `next`, with known fixes
on top, and cook it from there, with more cooks involved (which I heard
some people believe is not a wise thing to do, but then, we do that in Git
all the time).

So what I tried to say is that I am a bit opposed to start squashing into
Paul's patches, and rather do things on top as the --intent-to-add fixup
that I already provided. You seemed to suggest completely otherwise, which
got me puzzled.

> I was planning to make this a merge-down day, but let me exclude this
> topic from the "for next" batch just in case for today.

Well, you're the boss of git.git. I find it sad though, personally, that
we could not move this forward, especially since we introduced the safety
hatch for the very purpose of moving this forward, all the way into the
upcoming release.

I would have wished that ps/stash-in-c went into `next` already, what with
the few corner case bug fixes we had recently.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH ps/stash-in-c] strbuf_vinsertf: provide the correct buffer size to vsnprintf
  2019-02-05 20:30     ` Johannes Sixt
@ 2019-02-06 11:56       ` Johannes Schindelin
  2019-02-06 18:11         ` Johannes Sixt
  0 siblings, 1 reply; 13+ messages in thread
From: Johannes Schindelin @ 2019-02-06 11:56 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Junio C Hamano, Paul-Sebastian Ungureanu, Git Mailing List

Hi Hannes,

On Tue, 5 Feb 2019, Johannes Sixt wrote:

> Am 05.02.19 um 12:06 schrieb Johannes Schindelin:
> > The real examples are much more mundane, and very different from what you
> > suspected, e.g. inserting the tag header before the tag message in
> > `create_tag()` in `builtin/tag.c`. Basically, it is building up a strbuf
> > for the sake of calling `strbuf_insert()` to insert that string elsewhere.
> 
> I had a look, and this example does not convince me at all. If
> separation of concerns were applied well around the launch_editor APIs,
> you would not need strbuf_insert() in the first place. But, alas, these
> functions focus more on DRY, and that is often the opposite of
> separation of concerns.

So you actually are convinced that it is needed in this instance. Good.

> > In any case, the mere existence, and use, of `strbuf_insert()` is a
> > pretty clear counter case to the idea that `strbuf_vinsertf()` would
> > encourage invalid code flows.
> 
> Nobody wants to get rid of strbuf_insert(). I have worked with at least
> 3 string APIs. All have insert operations, and some have string
> formatting capabilities, but none of them conflate the two operations
> into one. Maybe, there is a plan behind it? strbuf is the first (my 4th)
> API that does, and it was non-trivial to get it right...

Sorry, by that token `strbuf_vaddf()` would already be a violation of the
separation of concerns: it also makes space first, and then it also
formats a string.

Render me unconvinced.

I am still glad you found and fixed the bug,
Dscho

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH ps/stash-in-c] strbuf_vinsertf: provide the correct buffer size to vsnprintf
  2019-02-06 11:56       ` Johannes Schindelin
@ 2019-02-06 18:11         ` Johannes Sixt
  0 siblings, 0 replies; 13+ messages in thread
From: Johannes Sixt @ 2019-02-06 18:11 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Junio C Hamano, Paul-Sebastian Ungureanu, Git Mailing List

Am 06.02.19 um 12:56 schrieb Johannes Schindelin:
> On Tue, 5 Feb 2019, Johannes Sixt wrote:
>> Am 05.02.19 um 12:06 schrieb Johannes Schindelin:
>>> The real examples are much more mundane, and very different from what you
>>> suspected, e.g. inserting the tag header before the tag message in
>>> `create_tag()` in `builtin/tag.c`. Basically, it is building up a strbuf
>>> for the sake of calling `strbuf_insert()` to insert that string elsewhere.
>>
>> I had a look, and this example does not convince me at all. If
>> separation of concerns were applied well around the launch_editor APIs,
>> you would not need strbuf_insert() in the first place. But, alas, these
>> functions focus more on DRY, and that is often the opposite of
>> separation of concerns.
> 
> So you actually are convinced that it is needed in this instance. Good.

I thought I said quite the opposite.

But never mind. Since I'm not going to help refactor any C code anyway
as long as the result is still C code, who am I to complain about the
existence of strbuf_vinsertf()?

-- Hannes

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-02-06 18:12 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-03 16:51 [PATCH ps/stash-in-c] strbuf_vinsertf: provide the correct buffer size to vsnprintf Johannes Sixt
2019-02-04  7:25 ` Johannes Sixt
2019-02-04 10:38   ` Johannes Schindelin
2019-02-04 21:13     ` Johannes Sixt
2019-02-05 10:38       ` Johannes Schindelin
2019-02-04 10:54 ` Johannes Schindelin
2019-02-04 21:57 ` Junio C Hamano
2019-02-05 11:06   ` Johannes Schindelin
2019-02-05 18:01     ` Junio C Hamano
2019-02-06 11:41       ` Johannes Schindelin
2019-02-05 20:30     ` Johannes Sixt
2019-02-06 11:56       ` Johannes Schindelin
2019-02-06 18:11         ` Johannes Sixt

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).