git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] http: redact curl h2h3 headers in info
@ 2022-11-09  0:52 Glen Choo via GitGitGadget
  2022-11-10  2:52 ` Taylor Blau
                   ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-11-09  0:52 UTC (permalink / raw)
  To: git; +Cc: Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

With GIT_TRACE_CURL=1 or GIT_CURL_VERBOSE=1, sensitive headers like
"Authorization" and "Cookie" get redacted. However, since [1], curl's
h2h3 module also prints headers in its "info", which don't get redacted.
For example,

  echo 'github.com	TRUE	/	FALSE	1698960413304	o	foo=bar' >cookiefile &&
  GIT_TRACE_CURL=1 GIT_TRACE_CURL_NO_DATA=1 git \
    -c 'http.cookiefile=cookiefile' \
    -c 'http.version=' \
    ls-remote https://github.com/git/git refs/heads/main 2>output &&
  grep 'cookie' output

produces output like:

  23:04:16.920495 http.c:678              == Info: h2h3 [cookie: o=foo=bar]
  23:04:16.920562 http.c:637              => Send header: cookie: o=<redacted>

Teach http.c to check for h2h3 headers in info and redact them using the
existing header redaction logic.

[1] https://github.com/curl/curl/commit/f8c3724aa90472c0e617ddbbc420aa199971eb77

Signed-off-by: Glen Choo <chooglen@google.com>
---
    http: redact curl h2h3 headers in info
    
    I initially sent this to the security list, but the general impression
    is that this isn't sensitive enough for an embargoed fix, so this is
    better discussed in the open instead.
    
    Since this comes from curl's HTTP2.0/3.0 module, this can be mitigated
    by setting http.version to 1.X, e.g. "git -c http.version=HTTP/1.1".
    
    According to [1], the susceptible curl versions appear to be 7.86.0,
    7.85.0, 7.84.0, 7.83.1, 7.83.0, 7.82.0, but I'm not sure which platforms
    are vulnerable.
    
    This patch fixes the issue on my machine running curl 7.85.0, so I think
    it is okay to merge as-is. That said, I would strongly prefer to add
    tests, but I haven't figured out how. In particular:
    
     * Do we have a way of using HTTP/2.0 in our tests? A cursory glance at
       our httpd config suggests that we only use HTTP/1.1.
     * How could we set up end-to-end tests to ensure that we're testing
       this against affected versions of curl? To avoid regressions, I'd
       also prefer to test against future versions of curl too.
    
    [1]
    https://github.com/curl/curl/commit/f8c3724aa90472c0e617ddbbc420aa199971eb77

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1377%2Fchooglen%2Fhttp%2Fredact-h2h3-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1377/chooglen/http/redact-h2h3-v1
Pull-Request: https://github.com/git/git/pull/1377

 http.c | 40 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 38 insertions(+), 2 deletions(-)

diff --git a/http.c b/http.c
index 5d0502f51fd..cbcc7c3f5b6 100644
--- a/http.c
+++ b/http.c
@@ -560,8 +560,10 @@ static void set_curl_keepalive(CURL *c)
 }
 #endif
 
-static void redact_sensitive_header(struct strbuf *header)
+/* Return 0 if redactions been made, 1 otherwise. */
+static int redact_sensitive_header(struct strbuf *header)
 {
+	int ret = 1;
 	const char *sensitive_header;
 
 	if (trace_curl_redact &&
@@ -575,6 +577,7 @@ static void redact_sensitive_header(struct strbuf *header)
 		/* Everything else is opaque and possibly sensitive */
 		strbuf_setlen(header,  sensitive_header - header->buf);
 		strbuf_addstr(header, " <redacted>");
+		ret = 0;
 	} else if (trace_curl_redact &&
 		   skip_iprefix(header->buf, "Cookie:", &sensitive_header)) {
 		struct strbuf redacted_header = STRBUF_INIT;
@@ -612,6 +615,27 @@ static void redact_sensitive_header(struct strbuf *header)
 
 		strbuf_setlen(header, sensitive_header - header->buf);
 		strbuf_addbuf(header, &redacted_header);
+		ret = 0;
+	}
+	return ret;
+}
+
+/* Redact headers in info */
+static void redact_sensitive_info_header(struct strbuf *header)
+{
+	const char *sensitive_header;
+
+	if (trace_curl_redact &&
+	    skip_iprefix(header->buf, "h2h3 [", &sensitive_header)) {
+		struct strbuf inner = STRBUF_INIT;
+
+		/* Drop the trailing "]" */
+		strbuf_add(&inner, sensitive_header, strlen(sensitive_header) - 1);
+		if (!redact_sensitive_header(&inner)) {
+			strbuf_setlen(header, strlen("h2h3 ["));
+			strbuf_addbuf(header, &inner);
+			strbuf_addch(header, ']');
+		}
 	}
 }
 
@@ -668,6 +692,18 @@ static void curl_dump_data(const char *text, unsigned char *ptr, size_t size)
 	strbuf_release(&out);
 }
 
+static void curl_print_info(char *data, size_t size)
+{
+	struct strbuf buf = STRBUF_INIT;
+
+	strbuf_add(&buf, data, size);
+
+	redact_sensitive_info_header(&buf);
+	trace_printf_key(&trace_curl, "== Info: %s", buf.buf);
+
+	strbuf_release(&buf);
+}
+
 static int curl_trace(CURL *handle, curl_infotype type, char *data, size_t size, void *userp)
 {
 	const char *text;
@@ -675,7 +711,7 @@ static int curl_trace(CURL *handle, curl_infotype type, char *data, size_t size,
 
 	switch (type) {
 	case CURLINFO_TEXT:
-		trace_printf_key(&trace_curl, "== Info: %s", data);
+		curl_print_info(data, size);
 		break;
 	case CURLINFO_HEADER_OUT:
 		text = "=> Send header";

base-commit: c03801e19cb8ab36e9c0d17ff3d5e0c3b0f24193
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] http: redact curl h2h3 headers in info
  2022-11-09  0:52 [PATCH] http: redact curl h2h3 headers in info Glen Choo via GitGitGadget
@ 2022-11-10  2:52 ` Taylor Blau
  2022-11-10 17:48   ` Glen Choo
  2022-11-10 21:57 ` [PATCH] http: redact curl h2h3 headers in info Emily Shaffer
  2022-11-10 22:57 ` [PATCH v2] " Glen Choo via GitGitGadget
  2 siblings, 1 reply; 24+ messages in thread
From: Taylor Blau @ 2022-11-10  2:52 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget; +Cc: git, Glen Choo

On Wed, Nov 09, 2022 at 12:52:31AM +0000, Glen Choo via GitGitGadget wrote:
> From: Glen Choo <chooglen@google.com>
>
> With GIT_TRACE_CURL=1 or GIT_CURL_VERBOSE=1, sensitive headers like
> "Authorization" and "Cookie" get redacted. However, since [1], curl's
> h2h3 module also prints headers in its "info", which don't get redacted.
> For example,
>
>   echo 'github.com	TRUE	/	FALSE	1698960413304	o	foo=bar' >cookiefile &&
>   GIT_TRACE_CURL=1 GIT_TRACE_CURL_NO_DATA=1 git \
>     -c 'http.cookiefile=cookiefile' \
>     -c 'http.version=' \
>     ls-remote https://github.com/git/git refs/heads/main 2>output &&
>   grep 'cookie' output
>
> produces output like:
>
>   23:04:16.920495 http.c:678              == Info: h2h3 [cookie: o=foo=bar]
>   23:04:16.920562 http.c:637              => Send header: cookie: o=<redacted>

Oops. Thanks for sharing this patch, I agree that the general approach
makes sense.

>     I initially sent this to the security list, but the general impression
>     is that this isn't sensitive enough for an embargoed fix, so this is
>     better discussed in the open instead.

Indeed, this only really matters if the would-be victim is convinced to
somehow share the contents of their GIT_CURL_VEROBSE or GIT_TRACE_CURL
output. So I don't think there are any security implications here,
though I appreciate your caution in double checking with the
git-security list first.

>      * How could we set up end-to-end tests to ensure that we're testing
>        this against affected versions of curl? To avoid regressions, I'd
>        also prefer to test against future versions of curl too.

Does that necessarily matter? We want to make sure that we don't see
sensitive headers from the h2h3 module with any version of cURL, no?
>
> +/* Redact headers in info */
> +static void redact_sensitive_info_header(struct strbuf *header)
> +{
> +	const char *sensitive_header;
> +
> +	if (trace_curl_redact &&
> +	    skip_iprefix(header->buf, "h2h3 [", &sensitive_header)) {
> +		struct strbuf inner = STRBUF_INIT;
> +
> +		/* Drop the trailing "]" */
> +		strbuf_add(&inner, sensitive_header, strlen(sensitive_header) - 1);

Makes sense. If the incoming header contains the "h2h3 [...]" marker,
and we are redacting sensitive headers, and there are header to redact,
redact them.

> +		if (!redact_sensitive_header(&inner)) {
> +			strbuf_setlen(header, strlen("h2h3 ["));
> +			strbuf_addbuf(header, &inner);

This leaks inner.buf, no?

> +			strbuf_addch(header, ']');
> +		}
>  	}
>  }

The rest looks good.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] http: redact curl h2h3 headers in info
  2022-11-10  2:52 ` Taylor Blau
@ 2022-11-10 17:48   ` Glen Choo
  2022-11-10 21:50     ` Jeff King
  0 siblings, 1 reply; 24+ messages in thread
From: Glen Choo @ 2022-11-10 17:48 UTC (permalink / raw)
  To: Taylor Blau, Glen Choo via GitGitGadget; +Cc: git

Taylor Blau <me@ttaylorr.com> writes:

>>      * How could we set up end-to-end tests to ensure that we're testing
>>        this against affected versions of curl? To avoid regressions, I'd
>>        also prefer to test against future versions of curl too.
>
> Does that necessarily matter? We want to make sure that we don't see
> sensitive headers from the h2h3 module with any version of cURL, no?

It would help, but it might not be worth setting up infrastructure for
just this use case alone. Given the various platforms running tests
against the Git codebase, we probably get close to a representative
sample of the population with enough time.

I think it would be more important to have tests against HTTP/2.0. If we
did, we probably would have already caught this, e.g.
t/t5551-http-fetch-smart.sh:'GIT_TRACE_CURL redacts auth details' and
friends.

>> +		if (!redact_sensitive_header(&inner)) {
>> +			strbuf_setlen(header, strlen("h2h3 ["));
>> +			strbuf_addbuf(header, &inner);
>
> This leaks inner.buf, no?

Ah, you're right. Thanks.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] http: redact curl h2h3 headers in info
  2022-11-10 17:48   ` Glen Choo
@ 2022-11-10 21:50     ` Jeff King
  2022-11-10 22:53       ` Glen Choo
  0 siblings, 1 reply; 24+ messages in thread
From: Jeff King @ 2022-11-10 21:50 UTC (permalink / raw)
  To: Glen Choo; +Cc: Taylor Blau, Glen Choo via GitGitGadget, git

On Thu, Nov 10, 2022 at 09:48:38AM -0800, Glen Choo wrote:

> Taylor Blau <me@ttaylorr.com> writes:
> 
> >>      * How could we set up end-to-end tests to ensure that we're testing
> >>        this against affected versions of curl? To avoid regressions, I'd
> >>        also prefer to test against future versions of curl too.
> >
> > Does that necessarily matter? We want to make sure that we don't see
> > sensitive headers from the h2h3 module with any version of cURL, no?
> 
> It would help, but it might not be worth setting up infrastructure for
> just this use case alone. Given the various platforms running tests
> against the Git codebase, we probably get close to a representative
> sample of the population with enough time.
> 
> I think it would be more important to have tests against HTTP/2.0. If we
> did, we probably would have already caught this, e.g.
> t/t5551-http-fetch-smart.sh:'GIT_TRACE_CURL redacts auth details' and
> friends.

There's some discussion in b66c77a64e (http: match headers
case-insensitively when redacting, 2021-09-22) about testing with
HTTP/2. Which ironically is basically this exact same bug in a different
form. ;)

The short answer is that it's do-able, but probably there are some
headaches to make it work portably.

I agree with you that trying various curl versions isn't worth doing. If
enough people/platforms run Git's suite, one of them will eventually
see the problem.

-Peff

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] http: redact curl h2h3 headers in info
  2022-11-09  0:52 [PATCH] http: redact curl h2h3 headers in info Glen Choo via GitGitGadget
  2022-11-10  2:52 ` Taylor Blau
@ 2022-11-10 21:57 ` Emily Shaffer
  2022-11-10 22:14   ` Glen Choo
  2022-11-10 22:57 ` [PATCH v2] " Glen Choo via GitGitGadget
  2 siblings, 1 reply; 24+ messages in thread
From: Emily Shaffer @ 2022-11-10 21:57 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget; +Cc: git, Glen Choo

On Wed, Nov 09, 2022 at 12:52:31AM +0000, Glen Choo via GitGitGadget wrote:
> 
> With GIT_TRACE_CURL=1 or GIT_CURL_VERBOSE=1, sensitive headers like
> "Authorization" and "Cookie" get redacted. However, since [1], curl's
> h2h3 module also prints headers in its "info", which don't get redacted.
> For example,
> 
>   echo 'github.com	TRUE	/	FALSE	1698960413304	o	foo=bar' >cookiefile &&
>   GIT_TRACE_CURL=1 GIT_TRACE_CURL_NO_DATA=1 git \
>     -c 'http.cookiefile=cookiefile' \
>     -c 'http.version=' \
>     ls-remote https://github.com/git/git refs/heads/main 2>output &&
>   grep 'cookie' output
> 
> produces output like:
> 
>   23:04:16.920495 http.c:678              == Info: h2h3 [cookie: o=foo=bar]
>   23:04:16.920562 http.c:637              => Send header: cookie: o=<redacted>
> 
> Teach http.c to check for h2h3 headers in info and redact them using the
> existing header redaction logic.
> 
> [1] https://github.com/curl/curl/commit/f8c3724aa90472c0e617ddbbc420aa199971eb77
> 
> Signed-off-by: Glen Choo <chooglen@google.com>
> ---
>     http: redact curl h2h3 headers in info
>     
>     I initially sent this to the security list, but the general impression
>     is that this isn't sensitive enough for an embargoed fix, so this is
>     better discussed in the open instead.
>     
>     Since this comes from curl's HTTP2.0/3.0 module, this can be mitigated
>     by setting http.version to 1.X, e.g. "git -c http.version=HTTP/1.1".
>     
>     According to [1], the susceptible curl versions appear to be 7.86.0,
>     7.85.0, 7.84.0, 7.83.1, 7.83.0, 7.82.0, but I'm not sure which platforms
>     are vulnerable.
>     
>     This patch fixes the issue on my machine running curl 7.85.0, so I think
>     it is okay to merge as-is. That said, I would strongly prefer to add
>     tests, but I haven't figured out how. In particular:
>     
>      * Do we have a way of using HTTP/2.0 in our tests? A cursory glance at
>        our httpd config suggests that we only use HTTP/1.1.
>      * How could we set up end-to-end tests to ensure that we're testing
>        this against affected versions of curl? To avoid regressions, I'd
>        also prefer to test against future versions of curl too.
>     
>     [1]
>     https://github.com/curl/curl/commit/f8c3724aa90472c0e617ddbbc420aa199971eb77
> 
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1377%2Fchooglen%2Fhttp%2Fredact-h2h3-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1377/chooglen/http/redact-h2h3-v1
> Pull-Request: https://github.com/git/git/pull/1377
> 
>  http.c | 40 ++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 38 insertions(+), 2 deletions(-)
> 
> diff --git a/http.c b/http.c
> index 5d0502f51fd..cbcc7c3f5b6 100644
> --- a/http.c
> +++ b/http.c
> @@ -560,8 +560,10 @@ static void set_curl_keepalive(CURL *c)
>  }
>  #endif
>  
> -static void redact_sensitive_header(struct strbuf *header)
> +/* Return 0 if redactions been made, 1 otherwise. */

Does it make sense to reverse the retval here?

`if (!redact_sensitive_header())` sounds like "if not redacted, ..." -
but here it means the opposite, right?

> +static int redact_sensitive_header(struct strbuf *header)
>  {
> +	int ret = 1;
>  	const char *sensitive_header;
>  
>  	if (trace_curl_redact &&
> @@ -575,6 +577,7 @@ static void redact_sensitive_header(struct strbuf *header)
>  		/* Everything else is opaque and possibly sensitive */
>  		strbuf_setlen(header,  sensitive_header - header->buf);
>  		strbuf_addstr(header, " <redacted>");
> +		ret = 0;
>  	} else if (trace_curl_redact &&
>  		   skip_iprefix(header->buf, "Cookie:", &sensitive_header)) {
>  		struct strbuf redacted_header = STRBUF_INIT;
> @@ -612,6 +615,27 @@ static void redact_sensitive_header(struct strbuf *header)
>  
>  		strbuf_setlen(header, sensitive_header - header->buf);
>  		strbuf_addbuf(header, &redacted_header);
> +		ret = 0;
> +	}
> +	return ret;
> +}
> +
> +/* Redact headers in info */
> +static void redact_sensitive_info_header(struct strbuf *header)
> +{
> +	const char *sensitive_header;
> +
> +	if (trace_curl_redact &&
> +	    skip_iprefix(header->buf, "h2h3 [", &sensitive_header)) {
> +		struct strbuf inner = STRBUF_INIT;
> +
> +		/* Drop the trailing "]" */
> +		strbuf_add(&inner, sensitive_header, strlen(sensitive_header) - 1);
> +		if (!redact_sensitive_header(&inner)) {
> +			strbuf_setlen(header, strlen("h2h3 ["));
> +			strbuf_addbuf(header, &inner);
> +			strbuf_addch(header, ']');

I'd really like some more comments in this function - even just one
describing the string we're trying to redact, or showing a sample line.
Navigating string parsing is always a bit difficult.

> +		}
>  	}
>  }
>  
> @@ -668,6 +692,18 @@ static void curl_dump_data(const char *text, unsigned char *ptr, size_t size)
>  	strbuf_release(&out);
>  }
>  
> +static void curl_print_info(char *data, size_t size)

Nit: Every other helper in this file calls it _dump_, so should this
also say _dump_ instead of _print_?

> +{
> +	struct strbuf buf = STRBUF_INIT;
> +
> +	strbuf_add(&buf, data, size);
> +
> +	redact_sensitive_info_header(&buf);
> +	trace_printf_key(&trace_curl, "== Info: %s", buf.buf);
> +
> +	strbuf_release(&buf);
> +}
> +
>  static int curl_trace(CURL *handle, curl_infotype type, char *data, size_t size, void *userp)
>  {
>  	const char *text;
> @@ -675,7 +711,7 @@ static int curl_trace(CURL *handle, curl_infotype type, char *data, size_t size,
>  
>  	switch (type) {
>  	case CURLINFO_TEXT:
> -		trace_printf_key(&trace_curl, "== Info: %s", data);
> +		curl_print_info(data, size);
>  		break;
>  	case CURLINFO_HEADER_OUT:
>  		text = "=> Send header";
> 
> base-commit: c03801e19cb8ab36e9c0d17ff3d5e0c3b0f24193

Otherwise functionally it seems fine to me. case CURLINFO_TEXT is the
one case that's not already using a curl_dump_* helper, so we're adding
one, and to that helper we're adding a call out to
redact_sensitive_header().

Thanks.
 - Emily

> -- 
> gitgitgadget

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] http: redact curl h2h3 headers in info
  2022-11-10 21:57 ` [PATCH] http: redact curl h2h3 headers in info Emily Shaffer
@ 2022-11-10 22:14   ` Glen Choo
  2022-11-11  2:35     ` Taylor Blau
  0 siblings, 1 reply; 24+ messages in thread
From: Glen Choo @ 2022-11-10 22:14 UTC (permalink / raw)
  To: Emily Shaffer, Glen Choo via GitGitGadget; +Cc: git

Emily Shaffer <nasamuffin@google.com> writes:

> On Wed, Nov 09, 2022 at 12:52:31AM +0000, Glen Choo via GitGitGadget wrote:
>> 
>> With GIT_TRACE_CURL=1 or GIT_CURL_VERBOSE=1, sensitive headers like
>> "Authorization" and "Cookie" get redacted. However, since [1], curl's
>> h2h3 module also prints headers in its "info", which don't get redacted.
>> For example,
>> 
>>   echo 'github.com	TRUE	/	FALSE	1698960413304	o	foo=bar' >cookiefile &&
>>   GIT_TRACE_CURL=1 GIT_TRACE_CURL_NO_DATA=1 git \
>>     -c 'http.cookiefile=cookiefile' \
>>     -c 'http.version=' \
>>     ls-remote https://github.com/git/git refs/heads/main 2>output &&
>>   grep 'cookie' output
>> 
>> produces output like:
>> 
>>   23:04:16.920495 http.c:678              == Info: h2h3 [cookie: o=foo=bar]
>>   23:04:16.920562 http.c:637              => Send header: cookie: o=<redacted>
>> 
>> Teach http.c to check for h2h3 headers in info and redact them using the
>> existing header redaction logic.
>> 
>> [1] https://github.com/curl/curl/commit/f8c3724aa90472c0e617ddbbc420aa199971eb77
>> 
>> Signed-off-by: Glen Choo <chooglen@google.com>
>> ---
>>     http: redact curl h2h3 headers in info
>>     
>>     I initially sent this to the security list, but the general impression
>>     is that this isn't sensitive enough for an embargoed fix, so this is
>>     better discussed in the open instead.
>>     
>>     Since this comes from curl's HTTP2.0/3.0 module, this can be mitigated
>>     by setting http.version to 1.X, e.g. "git -c http.version=HTTP/1.1".
>>     
>>     According to [1], the susceptible curl versions appear to be 7.86.0,
>>     7.85.0, 7.84.0, 7.83.1, 7.83.0, 7.82.0, but I'm not sure which platforms
>>     are vulnerable.
>>     
>>     This patch fixes the issue on my machine running curl 7.85.0, so I think
>>     it is okay to merge as-is. That said, I would strongly prefer to add
>>     tests, but I haven't figured out how. In particular:
>>     
>>      * Do we have a way of using HTTP/2.0 in our tests? A cursory glance at
>>        our httpd config suggests that we only use HTTP/1.1.
>>      * How could we set up end-to-end tests to ensure that we're testing
>>        this against affected versions of curl? To avoid regressions, I'd
>>        also prefer to test against future versions of curl too.
>>     
>>     [1]
>>     https://github.com/curl/curl/commit/f8c3724aa90472c0e617ddbbc420aa199971eb77
>> 
>> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1377%2Fchooglen%2Fhttp%2Fredact-h2h3-v1
>> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1377/chooglen/http/redact-h2h3-v1
>> Pull-Request: https://github.com/git/git/pull/1377
>> 
>>  http.c | 40 ++++++++++++++++++++++++++++++++++++++--
>>  1 file changed, 38 insertions(+), 2 deletions(-)
>> 
>> diff --git a/http.c b/http.c
>> index 5d0502f51fd..cbcc7c3f5b6 100644
>> --- a/http.c
>> +++ b/http.c
>> @@ -560,8 +560,10 @@ static void set_curl_keepalive(CURL *c)
>>  }
>>  #endif
>>  
>> -static void redact_sensitive_header(struct strbuf *header)
>> +/* Return 0 if redactions been made, 1 otherwise. */
>
> Does it make sense to reverse the retval here?
>
> `if (!redact_sensitive_header())` sounds like "if not redacted, ..." -
> but here it means the opposite, right?

I struggled with this for a bit since I wasn't sure what the convention
is here. Enumerating some off the top of my head, we have:

- For 'booleans', we "0" for false and "1" for true (e.g.
  starts_with()).
- For functions that may fail with error, we have "0" for success and
  nonzero to signal the failure type (e.g. strbuf_getdelim()).
- For functions that don't fail we have "0" for "nothing was done" and
  "1" for something was done (e.g. skip_prefix()).

(Tangent: from a readability perspective, this is pretty poor. I need to
know beforehand whether or not the function may fail with error before I
know what the return value means?)

This probably falls into the last category, so for consistency, I think
this should return "1" for "redactions have happened" (as you
suggested).

>> +static int redact_sensitive_header(struct strbuf *header)
>>  {
>> +	int ret = 1;
>>  	const char *sensitive_header;
>>  
>>  	if (trace_curl_redact &&
>> @@ -575,6 +577,7 @@ static void redact_sensitive_header(struct strbuf *header)
>>  		/* Everything else is opaque and possibly sensitive */
>>  		strbuf_setlen(header,  sensitive_header - header->buf);
>>  		strbuf_addstr(header, " <redacted>");
>> +		ret = 0;
>>  	} else if (trace_curl_redact &&
>>  		   skip_iprefix(header->buf, "Cookie:", &sensitive_header)) {
>>  		struct strbuf redacted_header = STRBUF_INIT;
>> @@ -612,6 +615,27 @@ static void redact_sensitive_header(struct strbuf *header)
>>  
>>  		strbuf_setlen(header, sensitive_header - header->buf);
>>  		strbuf_addbuf(header, &redacted_header);
>> +		ret = 0;
>> +	}
>> +	return ret;
>> +}
>> +
>> +/* Redact headers in info */
>> +static void redact_sensitive_info_header(struct strbuf *header)
>> +{
>> +	const char *sensitive_header;
>> +
>> +	if (trace_curl_redact &&
>> +	    skip_iprefix(header->buf, "h2h3 [", &sensitive_header)) {
>> +		struct strbuf inner = STRBUF_INIT;
>> +
>> +		/* Drop the trailing "]" */
>> +		strbuf_add(&inner, sensitive_header, strlen(sensitive_header) - 1);
>> +		if (!redact_sensitive_header(&inner)) {
>> +			strbuf_setlen(header, strlen("h2h3 ["));
>> +			strbuf_addbuf(header, &inner);
>> +			strbuf_addch(header, ']');
>
> I'd really like some more comments in this function - even just one
> describing the string we're trying to redact, or showing a sample line.
> Navigating string parsing is always a bit difficult.

Ah yes, I should include a description of the string.

>> +		}
>>  	}
>>  }
>>  
>> @@ -668,6 +692,18 @@ static void curl_dump_data(const char *text, unsigned char *ptr, size_t size)
>>  	strbuf_release(&out);
>>  }
>>  
>> +static void curl_print_info(char *data, size_t size)
>
> Nit: Every other helper in this file calls it _dump_, so should this
> also say _dump_ instead of _print_?

Sure, I have no opinion here, so I'll do that.

>> +{
>> +	struct strbuf buf = STRBUF_INIT;
>> +
>> +	strbuf_add(&buf, data, size);
>> +
>> +	redact_sensitive_info_header(&buf);
>> +	trace_printf_key(&trace_curl, "== Info: %s", buf.buf);
>> +
>> +	strbuf_release(&buf);
>> +}
>> +
>>  static int curl_trace(CURL *handle, curl_infotype type, char *data, size_t size, void *userp)
>>  {
>>  	const char *text;
>> @@ -675,7 +711,7 @@ static int curl_trace(CURL *handle, curl_infotype type, char *data, size_t size,
>>  
>>  	switch (type) {
>>  	case CURLINFO_TEXT:
>> -		trace_printf_key(&trace_curl, "== Info: %s", data);
>> +		curl_print_info(data, size);
>>  		break;
>>  	case CURLINFO_HEADER_OUT:
>>  		text = "=> Send header";
>> 
>> base-commit: c03801e19cb8ab36e9c0d17ff3d5e0c3b0f24193
>
> Otherwise functionally it seems fine to me. case CURLINFO_TEXT is the
> one case that's not already using a curl_dump_* helper, so we're adding
> one, and to that helper we're adding a call out to
> redact_sensitive_header().
>
> Thanks.
>  - Emily
>
>> -- 
>> gitgitgadget

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] http: redact curl h2h3 headers in info
  2022-11-10 21:50     ` Jeff King
@ 2022-11-10 22:53       ` Glen Choo
  2022-11-11  2:29         ` Jeff King
  0 siblings, 1 reply; 24+ messages in thread
From: Glen Choo @ 2022-11-10 22:53 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, Glen Choo via GitGitGadget, git

Jeff King <peff@peff.net> writes:

> On Thu, Nov 10, 2022 at 09:48:38AM -0800, Glen Choo wrote:
>
>> Taylor Blau <me@ttaylorr.com> writes:
>> 
>> >>      * How could we set up end-to-end tests to ensure that we're testing
>> >>        this against affected versions of curl? To avoid regressions, I'd
>> >>        also prefer to test against future versions of curl too.
>> >
>> > Does that necessarily matter? We want to make sure that we don't see
>> > sensitive headers from the h2h3 module with any version of cURL, no?
>> 
>> It would help, but it might not be worth setting up infrastructure for
>> just this use case alone. Given the various platforms running tests
>> against the Git codebase, we probably get close to a representative
>> sample of the population with enough time.
>> 
>> I think it would be more important to have tests against HTTP/2.0. If we
>> did, we probably would have already caught this, e.g.
>> t/t5551-http-fetch-smart.sh:'GIT_TRACE_CURL redacts auth details' and
>> friends.
>
> There's some discussion in b66c77a64e (http: match headers
> case-insensitively when redacting, 2021-09-22) about testing with
> HTTP/2. Which ironically is basically this exact same bug in a different
> form. ;)
>
> The short answer is that it's do-able, but probably there are some
> headaches to make it work portably.

Argh, what a shame :( Okay, maybe it's not worth trying to use httpd
then.

Some other ideas I had were:

- Create a test-tool that calls the redaction logic directly (without
  involving about curl), and we pass the strings we want to redacted to
  it. Way less than ideal, since we'd never be able to proactively catch
  failures, but better than nothing I suppose.

- Write our own HTTP/2 server for redaction tests. I assume this won't
  be trivial, but maybe not prohibitive, e.g. [1] implements its own
  http server for credential helper tests.

[1] https://lore.kernel.org/git/4947e81546a51883365d0087ce616b6b77e24a63.1667426970.git.gitgitgadget@gmail.com/

>
> I agree with you that trying various curl versions isn't worth doing. If
> enough people/platforms run Git's suite, one of them will eventually
> see the problem.
>
> -Peff

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v2] http: redact curl h2h3 headers in info
  2022-11-09  0:52 [PATCH] http: redact curl h2h3 headers in info Glen Choo via GitGitGadget
  2022-11-10  2:52 ` Taylor Blau
  2022-11-10 21:57 ` [PATCH] http: redact curl h2h3 headers in info Emily Shaffer
@ 2022-11-10 22:57 ` Glen Choo via GitGitGadget
  2022-11-11  2:36   ` Taylor Blau
                     ` (2 more replies)
  2 siblings, 3 replies; 24+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-11-10 22:57 UTC (permalink / raw)
  To: git; +Cc: Taylor Blau, Jeff King, Emily Shaffer, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

With GIT_TRACE_CURL=1 or GIT_CURL_VERBOSE=1, sensitive headers like
"Authorization" and "Cookie" get redacted. However, since [1], curl's
h2h3 module also prints headers in its "info", which don't get redacted.
For example,

  echo 'github.com	TRUE	/	FALSE	1698960413304	o	foo=bar' >cookiefile &&
  GIT_TRACE_CURL=1 GIT_TRACE_CURL_NO_DATA=1 git \
    -c 'http.cookiefile=cookiefile' \
    -c 'http.version=' \
    ls-remote https://github.com/git/git refs/heads/main 2>output &&
  grep 'cookie' output

produces output like:

  23:04:16.920495 http.c:678              == Info: h2h3 [cookie: o=foo=bar]
  23:04:16.920562 http.c:637              => Send header: cookie: o=<redacted>

Teach http.c to check for h2h3 headers in info and redact them using the
existing header redaction logic.

[1] https://github.com/curl/curl/commit/f8c3724aa90472c0e617ddbbc420aa199971eb77

Signed-off-by: Glen Choo <chooglen@google.com>
---
    http: redact curl h2h3 headers in info
    
    Thanks for the feedback, all :) For this patch, it sounds like it would
    be too onerous to do the kinds of testing I initially envisioned, so I
    think v2 is ready as-is.
    
    Changes in v2:
    
     * Describe the redacted string in comments.
     * Return 1 for "redactions have happened".
     * Fix a leak of the "inner" strbuf.
     * Rename function, fix typo.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1377%2Fchooglen%2Fhttp%2Fredact-h2h3-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1377/chooglen/http/redact-h2h3-v2
Pull-Request: https://github.com/git/git/pull/1377

Range-diff vs v1:

 1:  e9232396736 ! 1:  a8c35ff4ddf http: redact curl h2h3 headers in info
     @@ http.c: static void set_curl_keepalive(CURL *c)
       #endif
       
      -static void redact_sensitive_header(struct strbuf *header)
     -+/* Return 0 if redactions been made, 1 otherwise. */
     ++/* Return 1 if redactions have been made, 0 otherwise. */
      +static int redact_sensitive_header(struct strbuf *header)
       {
     -+	int ret = 1;
     ++	int ret = 0;
       	const char *sensitive_header;
       
       	if (trace_curl_redact &&
     @@ http.c: static void redact_sensitive_header(struct strbuf *header)
       		/* Everything else is opaque and possibly sensitive */
       		strbuf_setlen(header,  sensitive_header - header->buf);
       		strbuf_addstr(header, " <redacted>");
     -+		ret = 0;
     ++		ret = 1;
       	} else if (trace_curl_redact &&
       		   skip_iprefix(header->buf, "Cookie:", &sensitive_header)) {
       		struct strbuf redacted_header = STRBUF_INIT;
     @@ http.c: static void redact_sensitive_header(struct strbuf *header)
       
       		strbuf_setlen(header, sensitive_header - header->buf);
       		strbuf_addbuf(header, &redacted_header);
     -+		ret = 0;
     ++		ret = 1;
      +	}
      +	return ret;
      +}
     @@ http.c: static void redact_sensitive_header(struct strbuf *header)
      +{
      +	const char *sensitive_header;
      +
     ++	/*
     ++	 * curl's h2h3 prints headers in info, e.g.:
     ++	 *   h2h3 [<header-name>: <header-val>]
     ++	 */
      +	if (trace_curl_redact &&
      +	    skip_iprefix(header->buf, "h2h3 [", &sensitive_header)) {
      +		struct strbuf inner = STRBUF_INIT;
      +
      +		/* Drop the trailing "]" */
      +		strbuf_add(&inner, sensitive_header, strlen(sensitive_header) - 1);
     -+		if (!redact_sensitive_header(&inner)) {
     ++		if (redact_sensitive_header(&inner)) {
      +			strbuf_setlen(header, strlen("h2h3 ["));
      +			strbuf_addbuf(header, &inner);
      +			strbuf_addch(header, ']');
      +		}
     ++
     ++		strbuf_release(&inner);
       	}
       }
       
     @@ http.c: static void curl_dump_data(const char *text, unsigned char *ptr, size_t
       	strbuf_release(&out);
       }
       
     -+static void curl_print_info(char *data, size_t size)
     ++static void curl_dump_info(char *data, size_t size)
      +{
      +	struct strbuf buf = STRBUF_INIT;
      +
     @@ http.c: static int curl_trace(CURL *handle, curl_infotype type, char *data, size
       	switch (type) {
       	case CURLINFO_TEXT:
      -		trace_printf_key(&trace_curl, "== Info: %s", data);
     -+		curl_print_info(data, size);
     ++		curl_dump_info(data, size);
       		break;
       	case CURLINFO_HEADER_OUT:
       		text = "=> Send header";


 http.c | 46 ++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 44 insertions(+), 2 deletions(-)

diff --git a/http.c b/http.c
index 5d0502f51fd..8135fac283e 100644
--- a/http.c
+++ b/http.c
@@ -560,8 +560,10 @@ static void set_curl_keepalive(CURL *c)
 }
 #endif
 
-static void redact_sensitive_header(struct strbuf *header)
+/* Return 1 if redactions have been made, 0 otherwise. */
+static int redact_sensitive_header(struct strbuf *header)
 {
+	int ret = 0;
 	const char *sensitive_header;
 
 	if (trace_curl_redact &&
@@ -575,6 +577,7 @@ static void redact_sensitive_header(struct strbuf *header)
 		/* Everything else is opaque and possibly sensitive */
 		strbuf_setlen(header,  sensitive_header - header->buf);
 		strbuf_addstr(header, " <redacted>");
+		ret = 1;
 	} else if (trace_curl_redact &&
 		   skip_iprefix(header->buf, "Cookie:", &sensitive_header)) {
 		struct strbuf redacted_header = STRBUF_INIT;
@@ -612,6 +615,33 @@ static void redact_sensitive_header(struct strbuf *header)
 
 		strbuf_setlen(header, sensitive_header - header->buf);
 		strbuf_addbuf(header, &redacted_header);
+		ret = 1;
+	}
+	return ret;
+}
+
+/* Redact headers in info */
+static void redact_sensitive_info_header(struct strbuf *header)
+{
+	const char *sensitive_header;
+
+	/*
+	 * curl's h2h3 prints headers in info, e.g.:
+	 *   h2h3 [<header-name>: <header-val>]
+	 */
+	if (trace_curl_redact &&
+	    skip_iprefix(header->buf, "h2h3 [", &sensitive_header)) {
+		struct strbuf inner = STRBUF_INIT;
+
+		/* Drop the trailing "]" */
+		strbuf_add(&inner, sensitive_header, strlen(sensitive_header) - 1);
+		if (redact_sensitive_header(&inner)) {
+			strbuf_setlen(header, strlen("h2h3 ["));
+			strbuf_addbuf(header, &inner);
+			strbuf_addch(header, ']');
+		}
+
+		strbuf_release(&inner);
 	}
 }
 
@@ -668,6 +698,18 @@ static void curl_dump_data(const char *text, unsigned char *ptr, size_t size)
 	strbuf_release(&out);
 }
 
+static void curl_dump_info(char *data, size_t size)
+{
+	struct strbuf buf = STRBUF_INIT;
+
+	strbuf_add(&buf, data, size);
+
+	redact_sensitive_info_header(&buf);
+	trace_printf_key(&trace_curl, "== Info: %s", buf.buf);
+
+	strbuf_release(&buf);
+}
+
 static int curl_trace(CURL *handle, curl_infotype type, char *data, size_t size, void *userp)
 {
 	const char *text;
@@ -675,7 +717,7 @@ static int curl_trace(CURL *handle, curl_infotype type, char *data, size_t size,
 
 	switch (type) {
 	case CURLINFO_TEXT:
-		trace_printf_key(&trace_curl, "== Info: %s", data);
+		curl_dump_info(data, size);
 		break;
 	case CURLINFO_HEADER_OUT:
 		text = "=> Send header";

base-commit: c03801e19cb8ab36e9c0d17ff3d5e0c3b0f24193
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] http: redact curl h2h3 headers in info
  2022-11-10 22:53       ` Glen Choo
@ 2022-11-11  2:29         ` Jeff King
  2022-11-11  2:31           ` Taylor Blau
  2022-11-11 14:49           ` [PATCH] t: run t5551 tests with both HTTP and HTTP/2 Jeff King
  0 siblings, 2 replies; 24+ messages in thread
From: Jeff King @ 2022-11-11  2:29 UTC (permalink / raw)
  To: Glen Choo; +Cc: Taylor Blau, Glen Choo via GitGitGadget, git

On Thu, Nov 10, 2022 at 02:53:54PM -0800, Glen Choo wrote:

> > There's some discussion in b66c77a64e (http: match headers
> > case-insensitively when redacting, 2021-09-22) about testing with
> > HTTP/2. Which ironically is basically this exact same bug in a different
> > form. ;)
> >
> > The short answer is that it's do-able, but probably there are some
> > headaches to make it work portably.
> 
> Argh, what a shame :( Okay, maybe it's not worth trying to use httpd
> then.
> 
> Some other ideas I had were:
> 
> - Create a test-tool that calls the redaction logic directly (without
>   involving about curl), and we pass the strings we want to redacted to
>   it. Way less than ideal, since we'd never be able to proactively catch
>   failures, but better than nothing I suppose.

I don't think this is worth the effort. It's nice to exercise the code a
bit, but it wouldn't have actually found this regression, since the
unexpected thing here was curl changing.

> - Write our own HTTP/2 server for redaction tests. I assume this won't
>   be trivial, but maybe not prohibitive, e.g. [1] implements its own
>   http server for credential helper tests.

These seems like a lot more work than just setting up HTTP/2 support for
Apache. I checked the recipe from b66c77a64e, and it still works. It
does indeed find the bug (my curl is 7.86.0) and confirms your fix.

I think a simple path forward could be something like:

  - teach lib-httpd to conditionally use the current set up versus the
    http2 one outlined in b66c77a64e

  - push most of t5551 into a lib-http-fetch.sh; the client-side set up
    from b66c77a64e checks for an HTTP2 prereq. The test that looks for
    chunked encoding (and only works on HTTP1) checks for !HTTP2.

  - t5551 tells lib-httpd to use the usual setup, and then sources
    lib-http-fetch; it behaves as before

  - t5559 (sadly, not contiguous without renumbering intermediate tests)
    tells lib-httpd to use http2, and sets the HTTP2 prereq. It runs the
    same tests but via http2.

We don't get the _whole_ test suite running with http2, but hopefully it
gives us a fairly representative sample. And it does find this bug.

I can try to work the above into patch form, but I may not get to it for
a day or two.

-Peff

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] http: redact curl h2h3 headers in info
  2022-11-11  2:29         ` Jeff King
@ 2022-11-11  2:31           ` Taylor Blau
  2022-11-11 14:49           ` [PATCH] t: run t5551 tests with both HTTP and HTTP/2 Jeff King
  1 sibling, 0 replies; 24+ messages in thread
From: Taylor Blau @ 2022-11-11  2:31 UTC (permalink / raw)
  To: Jeff King; +Cc: Glen Choo, Taylor Blau, Glen Choo via GitGitGadget, git

On Thu, Nov 10, 2022 at 09:29:15PM -0500, Jeff King wrote:
> > - Write our own HTTP/2 server for redaction tests. I assume this won't
> >   be trivial, but maybe not prohibitive, e.g. [1] implements its own
> >   http server for credential helper tests.
>
> These seems like a lot more work than just setting up HTTP/2 support for
> Apache. I checked the recipe from b66c77a64e, and it still works. It
> does indeed find the bug (my curl is 7.86.0) and confirms your fix.
>
> I think a simple path forward could be something like:
>
> [...]

Thanks. Your plan looks very sane and seems like it hopefully wouldn't
be too much effort.

I'll leave it to you and Glen to figure out how to divvy up the patches,
but I'm sure you can figure it out ;-). Thanks in advance.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] http: redact curl h2h3 headers in info
  2022-11-10 22:14   ` Glen Choo
@ 2022-11-11  2:35     ` Taylor Blau
  0 siblings, 0 replies; 24+ messages in thread
From: Taylor Blau @ 2022-11-11  2:35 UTC (permalink / raw)
  To: Glen Choo; +Cc: Emily Shaffer, Glen Choo via GitGitGadget, git

On Thu, Nov 10, 2022 at 02:14:11PM -0800, Glen Choo wrote:
> >> @@ -560,8 +560,10 @@ static void set_curl_keepalive(CURL *c)
> >>  }
> >>  #endif
> >>
> >> -static void redact_sensitive_header(struct strbuf *header)
> >> +/* Return 0 if redactions been made, 1 otherwise. */
> >
> > Does it make sense to reverse the retval here?
> >
> > `if (!redact_sensitive_header())` sounds like "if not redacted, ..." -
> > but here it means the opposite, right?
>
> I struggled with this for a bit since I wasn't sure what the convention
> is here. Enumerating some off the top of my head, we have:

For what it's worth, the "return zero if redactions were made" is what I
would have expected. I think of it as returning zero if we didn't
encounter an error (and returning a negative, non-zero value if we did).

> - For functions that don't fail we have "0" for "nothing was done" and
>   "1" for something was done (e.g. skip_prefix()).
>
> (Tangent: from a readability perspective, this is pretty poor. I need to
> know beforehand whether or not the function may fail with error before I
> know what the return value means?)
>
> This probably falls into the last category, so for consistency, I think
> this should return "1" for "redactions have happened" (as you
> suggested).

...But I don't really care that much ;-). As long as you choose
consistently, and document your choice where it is unclear, it is fine.

> >> +/* Redact headers in info */
> >> +static void redact_sensitive_info_header(struct strbuf *header)
> >> +{
> >> +	const char *sensitive_header;
> >> +
> >> +	if (trace_curl_redact &&
> >> +	    skip_iprefix(header->buf, "h2h3 [", &sensitive_header)) {
> >> +		struct strbuf inner = STRBUF_INIT;
> >> +
> >> +		/* Drop the trailing "]" */
> >> +		strbuf_add(&inner, sensitive_header, strlen(sensitive_header) - 1);
> >> +		if (!redact_sensitive_header(&inner)) {
> >> +			strbuf_setlen(header, strlen("h2h3 ["));
> >> +			strbuf_addbuf(header, &inner);
> >> +			strbuf_addch(header, ']');
> >
> > I'd really like some more comments in this function - even just one
> > describing the string we're trying to redact, or showing a sample line.
> > Navigating string parsing is always a bit difficult.
>
> Ah yes, I should include a description of the string.

Eh. To be honest, I probably wouldn't have documented it any more than
you did. At most, I would add an example "before" and "after" string
that shows what we're trying to generate.

I agree that string manipulation can end up with some fairly convoluted
code. But I think what is written here is straightforward, and that
any attempt to comment it more than suggested would end up just
repeating what the code does in English.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] http: redact curl h2h3 headers in info
  2022-11-10 22:57 ` [PATCH v2] " Glen Choo via GitGitGadget
@ 2022-11-11  2:36   ` Taylor Blau
  2022-11-11  2:38   ` Jeff King
  2022-11-11 22:35   ` [PATCH v3 0/2] " Glen Choo via GitGitGadget
  2 siblings, 0 replies; 24+ messages in thread
From: Taylor Blau @ 2022-11-11  2:36 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget; +Cc: git, Jeff King, Emily Shaffer, Glen Choo

On Thu, Nov 10, 2022 at 10:57:34PM +0000, Glen Choo via GitGitGadget wrote:
>     Thanks for the feedback, all :) For this patch, it sounds like it would
>     be too onerous to do the kinds of testing I initially envisioned, so I
>     think v2 is ready as-is.

Yeah, I'm fine to start merging this one down as-is, especially with
some verification from others that this does in fact fix what you say it
does.

I would be happy to see more tests here in the future, but I agree that
they can be done on top.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] http: redact curl h2h3 headers in info
  2022-11-10 22:57 ` [PATCH v2] " Glen Choo via GitGitGadget
  2022-11-11  2:36   ` Taylor Blau
@ 2022-11-11  2:38   ` Jeff King
  2022-11-11  2:39     ` Taylor Blau
  2022-11-11 17:55     ` Glen Choo
  2022-11-11 22:35   ` [PATCH v3 0/2] " Glen Choo via GitGitGadget
  2 siblings, 2 replies; 24+ messages in thread
From: Jeff King @ 2022-11-11  2:38 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget; +Cc: git, Taylor Blau, Emily Shaffer, Glen Choo

On Thu, Nov 10, 2022 at 10:57:34PM +0000, Glen Choo via GitGitGadget wrote:

> +/* Redact headers in info */
> +static void redact_sensitive_info_header(struct strbuf *header)
> +{
> +	const char *sensitive_header;
> +
> +	/*
> +	 * curl's h2h3 prints headers in info, e.g.:
> +	 *   h2h3 [<header-name>: <header-val>]
> +	 */
> +	if (trace_curl_redact &&
> +	    skip_iprefix(header->buf, "h2h3 [", &sensitive_header)) {
> +		struct strbuf inner = STRBUF_INIT;
> +
> +		/* Drop the trailing "]" */
> +		strbuf_add(&inner, sensitive_header, strlen(sensitive_header) - 1);

This will misbehave if fed the string "h2h3 [", because that strlen()
becomes 0, and the subtraction underflows.

Unlikely, since we are being fed by curl, but possibly worth asserting
(though see below for an alternative which drops this line).

> +		if (redact_sensitive_header(&inner)) {
> +			strbuf_setlen(header, strlen("h2h3 ["));

This strlen may be better spelled as:

  sensitive_header - header->buf

which IMHO makes it more clear that our intent is to truncate based on
the pointer we computed by skipping (and has no chance of getting out of
sync with the earlier copy of the string).

It's also a little more robust, in that it doesn't depend on "h2h3"
being at the beginning of the string (though in practice it must be,
because that's where skip_iprefix() is checking). See below on that.

> +			strbuf_addbuf(header, &inner);
> +			strbuf_addch(header, ']');
> +		}
> +
> +		strbuf_release(&inner);

This will do a new allocation/free for each info line, even if it's not
redacted. It's probably premature optimization to worry about it, but
you could do it all in the original strbuf, if we inform
redact_sensitive_header() of the offset at which it should look for the
header (and because it uses "sensitive_header - header->buf" for the
truncation, it handles the extra "h2h3" at the beginning just fine).
Something like:

diff --git a/http.c b/http.c
index 8135fac283..8a5ba3f477 100644
--- a/http.c
+++ b/http.c
@@ -561,14 +561,14 @@ static void set_curl_keepalive(CURL *c)
 #endif
 
 /* Return 1 if redactions have been made, 0 otherwise. */
-static int redact_sensitive_header(struct strbuf *header)
+static int redact_sensitive_header(struct strbuf *header, size_t offset)
 {
 	int ret = 0;
 	const char *sensitive_header;
 
 	if (trace_curl_redact &&
-	    (skip_iprefix(header->buf, "Authorization:", &sensitive_header) ||
-	     skip_iprefix(header->buf, "Proxy-Authorization:", &sensitive_header))) {
+	    (skip_iprefix(header->buf + offset, "Authorization:", &sensitive_header) ||
+	     skip_iprefix(header->buf + offset, "Proxy-Authorization:", &sensitive_header))) {
 		/* The first token is the type, which is OK to log */
 		while (isspace(*sensitive_header))
 			sensitive_header++;
@@ -579,7 +579,7 @@ static int redact_sensitive_header(struct strbuf *header)
 		strbuf_addstr(header, " <redacted>");
 		ret = 1;
 	} else if (trace_curl_redact &&
-		   skip_iprefix(header->buf, "Cookie:", &sensitive_header)) {
+		   skip_iprefix(header->buf + offset, "Cookie:", &sensitive_header)) {
 		struct strbuf redacted_header = STRBUF_INIT;
 		const char *cookie;
 
@@ -631,17 +631,10 @@ static void redact_sensitive_info_header(struct strbuf *header)
 	 */
 	if (trace_curl_redact &&
 	    skip_iprefix(header->buf, "h2h3 [", &sensitive_header)) {
-		struct strbuf inner = STRBUF_INIT;
-
-		/* Drop the trailing "]" */
-		strbuf_add(&inner, sensitive_header, strlen(sensitive_header) - 1);
-		if (redact_sensitive_header(&inner)) {
-			strbuf_setlen(header, strlen("h2h3 ["));
-			strbuf_addbuf(header, &inner);
+		if (redact_sensitive_header(header, sensitive_header - header->buf)) {
+			/* redaction ate our closing bracket */
 			strbuf_addch(header, ']');
 		}
-
-		strbuf_release(&inner);
 	}
 }
 
@@ -659,7 +652,7 @@ static void curl_dump_header(const char *text, unsigned char *ptr, size_t size,
 
 	for (header = headers; *header; header++) {
 		if (hide_sensitive_header)
-			redact_sensitive_header(*header);
+			redact_sensitive_header(*header, 0);
 		strbuf_insertstr((*header), 0, text);
 		strbuf_insertstr((*header), strlen(text), ": ");
 		strbuf_rtrim((*header));

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] http: redact curl h2h3 headers in info
  2022-11-11  2:38   ` Jeff King
@ 2022-11-11  2:39     ` Taylor Blau
  2022-11-11 17:55     ` Glen Choo
  1 sibling, 0 replies; 24+ messages in thread
From: Taylor Blau @ 2022-11-11  2:39 UTC (permalink / raw)
  To: Jeff King
  Cc: Glen Choo via GitGitGadget, git, Taylor Blau, Emily Shaffer,
	Glen Choo

On Thu, Nov 10, 2022 at 09:38:08PM -0500, Jeff King wrote:
> On Thu, Nov 10, 2022 at 10:57:34PM +0000, Glen Choo via GitGitGadget wrote:
>
> > +/* Redact headers in info */
> > +static void redact_sensitive_info_header(struct strbuf *header)
> > +{
> > +	const char *sensitive_header;
> > +
> > +	/*
> > +	 * curl's h2h3 prints headers in info, e.g.:
> > +	 *   h2h3 [<header-name>: <header-val>]
> > +	 */
> > +	if (trace_curl_redact &&
> > +	    skip_iprefix(header->buf, "h2h3 [", &sensitive_header)) {
> > +		struct strbuf inner = STRBUF_INIT;
> > +
> > +		/* Drop the trailing "]" */
> > +		strbuf_add(&inner, sensitive_header, strlen(sensitive_header) - 1);
>
> This will misbehave if fed the string "h2h3 [", because that strlen()
> becomes 0, and the subtraction underflows.
>
> Unlikely, since we are being fed by curl, but possibly worth asserting
> (though see below for an alternative which drops this line).

Eek. Thanks for spotting. Let's hold off on this one until Glen submits
another version, or you and him coordinate a combined series.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH] t: run t5551 tests with both HTTP and HTTP/2
  2022-11-11  2:29         ` Jeff King
  2022-11-11  2:31           ` Taylor Blau
@ 2022-11-11 14:49           ` Jeff King
  2022-11-11 15:06             ` Ævar Arnfjörð Bjarmason
  2022-11-11 15:20             ` Jeff King
  1 sibling, 2 replies; 24+ messages in thread
From: Jeff King @ 2022-11-11 14:49 UTC (permalink / raw)
  To: Glen Choo; +Cc: Taylor Blau, Glen Choo via GitGitGadget, git

On Thu, Nov 10, 2022 at 09:29:15PM -0500, Jeff King wrote:

> We don't get the _whole_ test suite running with http2, but hopefully it
> gives us a fairly representative sample. And it does find this bug.
> 
> I can try to work the above into patch form, but I may not get to it for
> a day or two.

So here's a patch which finds the bug you're fixing. It's set up to be
prepared before your patch, which helps confirm the bug (and that we're
correctly using http/2 in the tests!). You'll want to unmark the !HTTP2
prereqs as part of your patch.

Alternatively, it could come after your patch to confirm the fix, in
which case it would omit those !HTTP2 prereqs from the get-go.

Let me know if you'd like to pick it up as part of your series.
Otherwise, I can submit it separately in the on-top form (but my
preference is for you to take it, since it makes testing the fix
easier).

-- >8 --
Subject: t: run t5551 tests with both HTTP and HTTP/2

We have occasionally seen bugs that affect Git running only against an
HTTP/2 web server, not an HTTP one. For instance, b66c77a64e (http:
match headers case-insensitively when redacting, 2021-09-22). But since
we have no test coverage using HTTP/2, we only uncover these bugs in the
wild.

That commit gives a recipe for converting our Apache setup to support
HTTP/2, but:

  - it's not necessarily portable

  - we don't want to just test HTTP/2; we really want to do a variety of
    basic tests for _both_ protocols

This patch handles both problems by running a duplicate of t5551
(labeled as t5559 here) with an alternate-universe setup that enables
HTTP/2. So we'll continue to run t5551 as before, but run the same
battery of tests again with HTTP/2. If HTTP/2 isn't supported on a given
platform, then t5559 should bail during the webserver setup, and
gracefully skip all tests (unless GIT_TEST_HTTPD has been changed from
"auto" to "yes", where the point is to complain when webserver setup
fails).

In theory other http-related test scripts could benefit from the same
duplication, but doing t5551 should give us a reasonable check of basic
functionality, and would have caught both bugs we've seen in the wild
with HTTP/2.

A few notes on the implementation:

  - a script enables the server side config by calling enable_http2
    before starting the webserver. This avoids even trying to load any
    HTTP/2 config for t5551 (which is what lets it keep working with
    regular HTTP even on systems that don't support it). This also sets
    a prereq which can be used by individual tests.

  - As discussed in b66c77a64e, the http2 module isn't compatible with
    the "prefork" mpm, so we need to pick something else. I chose
    "event" here, which works on my Debian system, but it's possible
    there are platforms which would prefer something else. We can adjust
    that later if somebody finds such a platform.

  - The test "large fetch-pack requests can be sent using chunked
    encoding" makes sure we use a chunked transfer-encoding by looking
    for that header in the trace. But since HTTP/2 has its own streaming
    mechanisms, we won't find such a header. We could skip the test
    entirely by marking it with !HTTP2. But there's some value in making
    sure that the fetch itself succeeded. So instead, we'll confirm that
    either we're using HTTP2 _or_ we saw the expected chunked header.

  - the redaction tests fail under HTTP/2 with recent versions of curl.
    This is a bug! I've marked them with !HTTP2 here to skip them under
    t5559 for the moment. Using test_expect_failure would be more
    appropriate, but would require a bunch of boilerplate. Since we'll
    be fixing them momentarily, let's just skip them for now to keep the
    test suite bisectable, and we can re-enable them in the commit that
    fixes the bug.

  - one alternative layout would be to push most of t5551 into a
    lib-t5551.sh script, then source it from both t5551 and t5559.
    Keeping t5551 intact seemed a little simpler, as its one less level
    of indirection for people fixing bugs/regressions in the non-HTTP/2
    tests.

Signed-off-by: Jeff King <peff@peff.net>
---
 t/lib-httpd.sh                    |  5 +++++
 t/lib-httpd/apache.conf           | 19 ++++++++++++++++---
 t/t5551-http-fetch-smart.sh       | 19 ++++++++++++++-----
 t/t5559-http-fetch-smart-http2.sh |  4 ++++
 4 files changed, 39 insertions(+), 8 deletions(-)
 create mode 100755 t/t5559-http-fetch-smart-http2.sh

diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index 1f6b9b08d1..ba9fe36772 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -174,6 +174,11 @@ prepare_httpd() {
 	fi
 }
 
+enable_http2 () {
+	HTTPD_PARA="$HTTPD_PARA -DHTTP2"
+	test_set_prereq HTTP2
+}
+
 start_httpd() {
 	prepare_httpd >&3 2>&4
 
diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf
index 706799391b..0294739a77 100644
--- a/t/lib-httpd/apache.conf
+++ b/t/lib-httpd/apache.conf
@@ -29,6 +29,11 @@ ErrorLog error.log
 	LoadModule setenvif_module modules/mod_setenvif.so
 </IfModule>
 
+<IfDefine HTTP2>
+LoadModule http2_module modules/mod_http2.so
+Protocols h2c
+</IfDefine>
+
 <IfVersion < 2.4>
 LockFile accept.lock
 </IfVersion>
@@ -64,12 +69,20 @@ LockFile accept.lock
 <IfModule !mod_access_compat.c>
 	LoadModule access_compat_module modules/mod_access_compat.so
 </IfModule>
-<IfModule !mod_mpm_prefork.c>
-	LoadModule mpm_prefork_module modules/mod_mpm_prefork.so
-</IfModule>
 <IfModule !mod_unixd.c>
 	LoadModule unixd_module modules/mod_unixd.so
 </IfModule>
+
+<IfDefine HTTP2>
+<IfModule !mod_mpm_event.c>
+	LoadModule mpm_event_module modules/mod_mpm_event.so
+</IfModule>
+</IfDefine>
+<IfDefine !HTTP2>
+<IfModule !mod_mpm_prefork.c>
+	LoadModule mpm_prefork_module modules/mod_mpm_prefork.so
+</IfModule>
+</IfDefine>
 </IfVersion>
 
 PassEnv GIT_VALGRIND
diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
index 64c6c9f59e..9826631926 100755
--- a/t/t5551-http-fetch-smart.sh
+++ b/t/t5551-http-fetch-smart.sh
@@ -1,13 +1,19 @@
 #!/bin/sh
 
-test_description='test smart fetching over http via http-backend'
+: ${HTTP_PROTO:=HTTP}
+test_description="test smart fetching over http via http-backend ($HTTP_PROTO)"
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-httpd.sh
+test "$HTTP_PROTO" = "HTTP/2" && enable_http2
 start_httpd
 
+test_expect_success HTTP2 'enable client-side http/2' '
+	git config --global http.version HTTP/2
+'
+
 test_expect_success 'setup repository' '
 	git config push.default matching &&
 	echo content >file &&
@@ -194,7 +200,7 @@ test_expect_success 'redirects send auth to new location' '
 	expect_askpass both user@host auth/smart/repo.git
 '
 
-test_expect_success 'GIT_TRACE_CURL redacts auth details' '
+test_expect_success !HTTP2 'GIT_TRACE_CURL redacts auth details' '
 	rm -rf redact-auth trace &&
 	set_askpass user@host pass@host &&
 	GIT_TRACE_CURL="$(pwd)/trace" git clone --bare "$HTTPD_URL/auth/smart/repo.git" redact-auth &&
@@ -206,7 +212,7 @@ test_expect_success 'GIT_TRACE_CURL redacts auth details' '
 	grep -i "Authorization: Basic <redacted>" trace
 '
 
-test_expect_success 'GIT_CURL_VERBOSE redacts auth details' '
+test_expect_success !HTTP2 'GIT_CURL_VERBOSE redacts auth details' '
 	rm -rf redact-auth trace &&
 	set_askpass user@host pass@host &&
 	GIT_CURL_VERBOSE=1 git clone --bare "$HTTPD_URL/auth/smart/repo.git" redact-auth 2>trace &&
@@ -347,7 +353,10 @@ test_expect_success CMDLINE_LIMIT \
 test_expect_success 'large fetch-pack requests can be sent using chunked encoding' '
 	GIT_TRACE_CURL=true git -c http.postbuffer=65536 \
 		clone --bare "$HTTPD_URL/smart/repo.git" split.git 2>err &&
-	grep "^=> Send header: Transfer-Encoding: chunked" err
+	{
+		test_have_prereq HTTP2 ||
+		grep "^=> Send header: Transfer-Encoding: chunked" err
+	}
 '
 
 test_expect_success 'test allowreachablesha1inwant' '
@@ -473,7 +482,7 @@ test_expect_success 'fetch by SHA-1 without tag following' '
 		--no-tags origin $(cat bar_hash)
 '
 
-test_expect_success 'cookies are redacted by default' '
+test_expect_success !HTTP2 'cookies are redacted by default' '
 	rm -rf clone &&
 	echo "Set-Cookie: Foo=1" >cookies &&
 	echo "Set-Cookie: Bar=2" >>cookies &&
diff --git a/t/t5559-http-fetch-smart-http2.sh b/t/t5559-http-fetch-smart-http2.sh
new file mode 100755
index 0000000000..9eece71c2c
--- /dev/null
+++ b/t/t5559-http-fetch-smart-http2.sh
@@ -0,0 +1,4 @@
+#!/bin/sh
+
+HTTP_PROTO=HTTP/2
+. ./t5551-http-fetch-smart.sh
-- 
2.38.1.808.gb4bb4d4010


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] t: run t5551 tests with both HTTP and HTTP/2
  2022-11-11 14:49           ` [PATCH] t: run t5551 tests with both HTTP and HTTP/2 Jeff King
@ 2022-11-11 15:06             ` Ævar Arnfjörð Bjarmason
  2022-11-11 15:19               ` Jeff King
  2022-11-11 15:20             ` Jeff King
  1 sibling, 1 reply; 24+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-11 15:06 UTC (permalink / raw)
  To: Jeff King; +Cc: Glen Choo, Taylor Blau, Glen Choo via GitGitGadget, git


On Fri, Nov 11 2022, Jeff King wrote:

> On Thu, Nov 10, 2022 at 09:29:15PM -0500, Jeff King wrote:
>
>> We don't get the _whole_ test suite running with http2, but hopefully it
>> gives us a fairly representative sample. And it does find this bug.
>> 
>> I can try to work the above into patch form, but I may not get to it for
>> a day or two.
>
> So here's a patch which finds the bug you're fixing. It's set up to be
> prepared before your patch, which helps confirm the bug (and that we're
> correctly using http/2 in the tests!). You'll want to unmark the !HTTP2
> prereqs as part of your patch.
>
> Alternatively, it could come after your patch to confirm the fix, in
> which case it would omit those !HTTP2 prereqs from the get-go.
>
> Let me know if you'd like to pick it up as part of your series.
> Otherwise, I can submit it separately in the on-top form (but my
> preference is for you to take it, since it makes testing the fix
> easier).
>
> -- >8 --
> Subject: t: run t5551 tests with both HTTP and HTTP/2
>
> We have occasionally seen bugs that affect Git running only against an
> HTTP/2 web server, not an HTTP one. For instance, b66c77a64e (http:
> match headers case-insensitively when redacting, 2021-09-22). But since
> we have no test coverage using HTTP/2, we only uncover these bugs in the
> wild.
>
> That commit gives a recipe for converting our Apache setup to support
> HTTP/2, but:
>
>   - it's not necessarily portable
>
>   - we don't want to just test HTTP/2; we really want to do a variety of
>     basic tests for _both_ protocols
>
> This patch handles both problems by running a duplicate of t5551
> (labeled as t5559 here) with an alternate-universe setup that enables
> HTTP/2. So we'll continue to run t5551 as before, but run the same
> battery of tests again with HTTP/2. If HTTP/2 isn't supported on a given
> platform, then t5559 should bail during the webserver setup, and
> gracefully skip all tests (unless GIT_TEST_HTTPD has been changed from
> "auto" to "yes", where the point is to complain when webserver setup
> fails).
>
> In theory other http-related test scripts could benefit from the same
> duplication, but doing t5551 should give us a reasonable check of basic
> functionality, and would have caught both bugs we've seen in the wild
> with HTTP/2.

Looking at the diff below, would it be much more work to just support a
GIT_TEST_HTTP_VERSION=2?

Then:

> diff --git a/t/t5559-http-fetch-smart-http2.sh b/t/t5559-http-fetch-smart-http2.sh
> new file mode 100755
> index 0000000000..9eece71c2c
> --- /dev/null
> +++ b/t/t5559-http-fetch-smart-http2.sh
> @@ -0,0 +1,4 @@
> +#!/bin/sh
> +
> +HTTP_PROTO=HTTP/2
> +. ./t5551-http-fetch-smart.sh

This would just skip_all if it was already set to 2, as we'd know we
were testing it in other tests, and lib-httpd.sh would do
GIT_TEST_HTTP_VERSION=1 by default.

There's the "!HTTP2" additions, which you mention you'll be "fixing
momentarily", but this is a stand-alone patch, so maybe I'm missing that
context...

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] t: run t5551 tests with both HTTP and HTTP/2
  2022-11-11 15:06             ` Ævar Arnfjörð Bjarmason
@ 2022-11-11 15:19               ` Jeff King
  0 siblings, 0 replies; 24+ messages in thread
From: Jeff King @ 2022-11-11 15:19 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Glen Choo, Taylor Blau, Glen Choo via GitGitGadget, git

On Fri, Nov 11, 2022 at 04:06:20PM +0100, Ævar Arnfjörð Bjarmason wrote:

> > In theory other http-related test scripts could benefit from the same
> > duplication, but doing t5551 should give us a reasonable check of basic
> > functionality, and would have caught both bugs we've seen in the wild
> > with HTTP/2.
> 
> Looking at the diff below, would it be much more work to just support a
> GIT_TEST_HTTP_VERSION=2?

No, but 99% of the tests would not benefit (because they don't run http
at all), and it would consume a bunch of extra CPU.

> There's the "!HTTP2" additions, which you mention you'll be "fixing
> momentarily", but this is a stand-alone patch, so maybe I'm missing that
> context...

The other context is Glen's patch that started this thread.

-Peff

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] t: run t5551 tests with both HTTP and HTTP/2
  2022-11-11 14:49           ` [PATCH] t: run t5551 tests with both HTTP and HTTP/2 Jeff King
  2022-11-11 15:06             ` Ævar Arnfjörð Bjarmason
@ 2022-11-11 15:20             ` Jeff King
  1 sibling, 0 replies; 24+ messages in thread
From: Jeff King @ 2022-11-11 15:20 UTC (permalink / raw)
  To: Glen Choo; +Cc: Taylor Blau, Glen Choo via GitGitGadget, git

On Fri, Nov 11, 2022 at 09:49:50AM -0500, Jeff King wrote:

>   - As discussed in b66c77a64e, the http2 module isn't compatible with
>     the "prefork" mpm, so we need to pick something else. I chose
>     "event" here, which works on my Debian system, but it's possible
>     there are platforms which would prefer something else. We can adjust
>     that later if somebody finds such a platform.

By the way, I confirmed that t5559 runs in CI (I checked linux-gcc, but
all of the ubuntu-based builds should run it, as they all install
apache).

-Peff

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] http: redact curl h2h3 headers in info
  2022-11-11  2:38   ` Jeff King
  2022-11-11  2:39     ` Taylor Blau
@ 2022-11-11 17:55     ` Glen Choo
  1 sibling, 0 replies; 24+ messages in thread
From: Glen Choo @ 2022-11-11 17:55 UTC (permalink / raw)
  To: Jeff King, Glen Choo via GitGitGadget; +Cc: git, Taylor Blau, Emily Shaffer

Jeff King <peff@peff.net> writes:

> On Thu, Nov 10, 2022 at 10:57:34PM +0000, Glen Choo via GitGitGadget wrote:
>
>> +/* Redact headers in info */
>> +static void redact_sensitive_info_header(struct strbuf *header)
>> +{
>> +	const char *sensitive_header;
>> +
>> +	/*
>> +	 * curl's h2h3 prints headers in info, e.g.:
>> +	 *   h2h3 [<header-name>: <header-val>]
>> +	 */
>> +	if (trace_curl_redact &&
>> +	    skip_iprefix(header->buf, "h2h3 [", &sensitive_header)) {
>> +		struct strbuf inner = STRBUF_INIT;
>> +
>> +		/* Drop the trailing "]" */
>> +		strbuf_add(&inner, sensitive_header, strlen(sensitive_header) - 1);
>
> This will misbehave if fed the string "h2h3 [", because that strlen()
> becomes 0, and the subtraction underflows.
>
> Unlikely, since we are being fed by curl, but possibly worth asserting
> (though see below for an alternative which drops this line).
>
>> +		if (redact_sensitive_header(&inner)) {
>> +			strbuf_setlen(header, strlen("h2h3 ["));
>
> This strlen may be better spelled as:
>
>   sensitive_header - header->buf
>
> which IMHO makes it more clear that our intent is to truncate based on
> the pointer we computed by skipping (and has no chance of getting out of
> sync with the earlier copy of the string).
>
> It's also a little more robust, in that it doesn't depend on "h2h3"
> being at the beginning of the string (though in practice it must be,
> because that's where skip_iprefix() is checking). See below on that.
>
>> +			strbuf_addbuf(header, &inner);
>> +			strbuf_addch(header, ']');
>> +		}
>> +
>> +		strbuf_release(&inner);
>
> This will do a new allocation/free for each info line, even if it's not
> redacted. It's probably premature optimization to worry about it, but
> you could do it all in the original strbuf, if we inform
> redact_sensitive_header() of the offset at which it should look for the
> header (and because it uses "sensitive_header - header->buf" for the
> truncation, it handles the extra "h2h3" at the beginning just fine).
> Something like:
>
> diff --git a/http.c b/http.c
> index 8135fac283..8a5ba3f477 100644
> --- a/http.c
> +++ b/http.c
> @@ -561,14 +561,14 @@ static void set_curl_keepalive(CURL *c)
>  #endif
>  
>  /* Return 1 if redactions have been made, 0 otherwise. */
> -static int redact_sensitive_header(struct strbuf *header)
> +static int redact_sensitive_header(struct strbuf *header, size_t offset)
>  {
>  	int ret = 0;
>  	const char *sensitive_header;
>  
>  	if (trace_curl_redact &&
> -	    (skip_iprefix(header->buf, "Authorization:", &sensitive_header) ||
> -	     skip_iprefix(header->buf, "Proxy-Authorization:", &sensitive_header))) {
> +	    (skip_iprefix(header->buf + offset, "Authorization:", &sensitive_header) ||
> +	     skip_iprefix(header->buf + offset, "Proxy-Authorization:", &sensitive_header))) {
>  		/* The first token is the type, which is OK to log */
>  		while (isspace(*sensitive_header))
>  			sensitive_header++;
> @@ -579,7 +579,7 @@ static int redact_sensitive_header(struct strbuf *header)
>  		strbuf_addstr(header, " <redacted>");
>  		ret = 1;
>  	} else if (trace_curl_redact &&
> -		   skip_iprefix(header->buf, "Cookie:", &sensitive_header)) {
> +		   skip_iprefix(header->buf + offset, "Cookie:", &sensitive_header)) {
>  		struct strbuf redacted_header = STRBUF_INIT;
>  		const char *cookie;
>  
> @@ -631,17 +631,10 @@ static void redact_sensitive_info_header(struct strbuf *header)
>  	 */
>  	if (trace_curl_redact &&
>  	    skip_iprefix(header->buf, "h2h3 [", &sensitive_header)) {
> -		struct strbuf inner = STRBUF_INIT;
> -
> -		/* Drop the trailing "]" */
> -		strbuf_add(&inner, sensitive_header, strlen(sensitive_header) - 1);
> -		if (redact_sensitive_header(&inner)) {
> -			strbuf_setlen(header, strlen("h2h3 ["));
> -			strbuf_addbuf(header, &inner);
> +		if (redact_sensitive_header(header, sensitive_header - header->buf)) {
> +			/* redaction ate our closing bracket */
>  			strbuf_addch(header, ']');
>  		}
> -
> -		strbuf_release(&inner);
>  	}
>  }
>  
> @@ -659,7 +652,7 @@ static void curl_dump_header(const char *text, unsigned char *ptr, size_t size,
>  
>  	for (header = headers; *header; header++) {
>  		if (hide_sensitive_header)
> -			redact_sensitive_header(*header);
> +			redact_sensitive_header(*header, 0);
>  		strbuf_insertstr((*header), 0, text);
>  		strbuf_insertstr((*header), strlen(text), ": ");
>  		strbuf_rtrim((*header));

As someone who's still trying to wrap my head around pointer
manipulations, these suggestions are very welcome, thanks!

I'll take these suggestions along with the HTTP2 one.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v3 0/2] http: redact curl h2h3 headers in info
  2022-11-10 22:57 ` [PATCH v2] " Glen Choo via GitGitGadget
  2022-11-11  2:36   ` Taylor Blau
  2022-11-11  2:38   ` Jeff King
@ 2022-11-11 22:35   ` Glen Choo via GitGitGadget
  2022-11-11 22:35     ` [PATCH v3 1/2] t: run t5551 tests with both HTTP and HTTP/2 Jeff King via GitGitGadget
                       ` (2 more replies)
  2 siblings, 3 replies; 24+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-11-11 22:35 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Jeff King, Emily Shaffer,
	Ævar Arnfjörð Bjarmason, Glen Choo

Big thanks to Peff for the feedback last round. I've incorporated all of the
suggestions. I'm so glad that we finally have tests here :)

Changes in v3:

 * Add the HTTP2 test from [1] to the start of the series
 * Drop struct strbuf inner in favor of doing work on the original strbuf

Changes in v2:

 * Describe the redacted string in comments.
 * Return 1 for "redactions have happened".
 * Fix a leak of the "inner" strbuf.
 * Rename function, fix typo.

[1] https://lore.kernel.org/git/Y25hDr7aHvKnxso3@coredump.intra.peff.net

Glen Choo (1):
  http: redact curl h2h3 headers in info

Jeff King (1):
  t: run t5551 tests with both HTTP and HTTP/2

 http.c                            | 47 +++++++++++++++++++++++++++----
 t/lib-httpd.sh                    |  5 ++++
 t/lib-httpd/apache.conf           | 19 +++++++++++--
 t/t5551-http-fetch-smart.sh       | 13 +++++++--
 t/t5559-http-fetch-smart-http2.sh |  4 +++
 5 files changed, 77 insertions(+), 11 deletions(-)
 create mode 100755 t/t5559-http-fetch-smart-http2.sh


base-commit: c03801e19cb8ab36e9c0d17ff3d5e0c3b0f24193
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1377%2Fchooglen%2Fhttp%2Fredact-h2h3-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1377/chooglen/http/redact-h2h3-v3
Pull-Request: https://github.com/git/git/pull/1377

Range-diff vs v2:

 -:  ----------- > 1:  09194dba8cd t: run t5551 tests with both HTTP and HTTP/2
 1:  a8c35ff4ddf ! 2:  bb5df1a48b9 http: redact curl h2h3 headers in info
     @@ Commit message
      
          With GIT_TRACE_CURL=1 or GIT_CURL_VERBOSE=1, sensitive headers like
          "Authorization" and "Cookie" get redacted. However, since [1], curl's
     -    h2h3 module also prints headers in its "info", which don't get redacted.
     -    For example,
     +    h2h3 module (invoked when using HTTP/2) also prints headers in its
     +    "info", which don't get redacted. For example,
      
            echo 'github.com      TRUE    /       FALSE   1698960413304   o       foo=bar' >cookiefile &&
            GIT_TRACE_CURL=1 GIT_TRACE_CURL_NO_DATA=1 git \
     @@ Commit message
            23:04:16.920562 http.c:637              => Send header: cookie: o=<redacted>
      
          Teach http.c to check for h2h3 headers in info and redact them using the
     -    existing header redaction logic.
     +    existing header redaction logic. This fixes the broken redaction logic
     +    that we noted in the previous commit, so mark the redaction tests as
     +    passing under HTTP2.
      
          [1] https://github.com/curl/curl/commit/f8c3724aa90472c0e617ddbbc420aa199971eb77
      
     +    Helped-by: Jeff King <peff@peff.net>
          Signed-off-by: Glen Choo <chooglen@google.com>
      
       ## http.c ##
     @@ http.c: static void set_curl_keepalive(CURL *c)
       
      -static void redact_sensitive_header(struct strbuf *header)
      +/* Return 1 if redactions have been made, 0 otherwise. */
     -+static int redact_sensitive_header(struct strbuf *header)
     ++static int redact_sensitive_header(struct strbuf *header, size_t offset)
       {
      +	int ret = 0;
       	const char *sensitive_header;
       
       	if (trace_curl_redact &&
     +-	    (skip_iprefix(header->buf, "Authorization:", &sensitive_header) ||
     +-	     skip_iprefix(header->buf, "Proxy-Authorization:", &sensitive_header))) {
     ++	    (skip_iprefix(header->buf + offset, "Authorization:", &sensitive_header) ||
     ++	     skip_iprefix(header->buf + offset, "Proxy-Authorization:", &sensitive_header))) {
     + 		/* The first token is the type, which is OK to log */
     + 		while (isspace(*sensitive_header))
     + 			sensitive_header++;
      @@ http.c: static void redact_sensitive_header(struct strbuf *header)
       		/* Everything else is opaque and possibly sensitive */
       		strbuf_setlen(header,  sensitive_header - header->buf);
       		strbuf_addstr(header, " <redacted>");
      +		ret = 1;
       	} else if (trace_curl_redact &&
     - 		   skip_iprefix(header->buf, "Cookie:", &sensitive_header)) {
     +-		   skip_iprefix(header->buf, "Cookie:", &sensitive_header)) {
     ++		   skip_iprefix(header->buf + offset, "Cookie:", &sensitive_header)) {
       		struct strbuf redacted_header = STRBUF_INIT;
     + 		const char *cookie;
     + 
      @@ http.c: static void redact_sensitive_header(struct strbuf *header)
       
       		strbuf_setlen(header, sensitive_header - header->buf);
     @@ http.c: static void redact_sensitive_header(struct strbuf *header)
      +	 */
      +	if (trace_curl_redact &&
      +	    skip_iprefix(header->buf, "h2h3 [", &sensitive_header)) {
     -+		struct strbuf inner = STRBUF_INIT;
     -+
     -+		/* Drop the trailing "]" */
     -+		strbuf_add(&inner, sensitive_header, strlen(sensitive_header) - 1);
     -+		if (redact_sensitive_header(&inner)) {
     -+			strbuf_setlen(header, strlen("h2h3 ["));
     -+			strbuf_addbuf(header, &inner);
     ++		if (redact_sensitive_header(header, sensitive_header - header->buf)) {
     ++			/* redaction ate our closing bracket */
      +			strbuf_addch(header, ']');
      +		}
     -+
     -+		strbuf_release(&inner);
       	}
       }
       
     +@@ http.c: static void curl_dump_header(const char *text, unsigned char *ptr, size_t size,
     + 
     + 	for (header = headers; *header; header++) {
     + 		if (hide_sensitive_header)
     +-			redact_sensitive_header(*header);
     ++			redact_sensitive_header(*header, 0);
     + 		strbuf_insertstr((*header), 0, text);
     + 		strbuf_insertstr((*header), strlen(text), ": ");
     + 		strbuf_rtrim((*header));
      @@ http.c: static void curl_dump_data(const char *text, unsigned char *ptr, size_t size)
       	strbuf_release(&out);
       }
     @@ http.c: static int curl_trace(CURL *handle, curl_infotype type, char *data, size
       		break;
       	case CURLINFO_HEADER_OUT:
       		text = "=> Send header";
     +
     + ## t/t5551-http-fetch-smart.sh ##
     +@@ t/t5551-http-fetch-smart.sh: test_expect_success 'redirects send auth to new location' '
     + 	expect_askpass both user@host auth/smart/repo.git
     + '
     + 
     +-test_expect_success !HTTP2 'GIT_TRACE_CURL redacts auth details' '
     ++test_expect_success 'GIT_TRACE_CURL redacts auth details' '
     + 	rm -rf redact-auth trace &&
     + 	set_askpass user@host pass@host &&
     + 	GIT_TRACE_CURL="$(pwd)/trace" git clone --bare "$HTTPD_URL/auth/smart/repo.git" redact-auth &&
     +@@ t/t5551-http-fetch-smart.sh: test_expect_success !HTTP2 'GIT_TRACE_CURL redacts auth details' '
     + 	grep -i "Authorization: Basic <redacted>" trace
     + '
     + 
     +-test_expect_success !HTTP2 'GIT_CURL_VERBOSE redacts auth details' '
     ++test_expect_success 'GIT_CURL_VERBOSE redacts auth details' '
     + 	rm -rf redact-auth trace &&
     + 	set_askpass user@host pass@host &&
     + 	GIT_CURL_VERBOSE=1 git clone --bare "$HTTPD_URL/auth/smart/repo.git" redact-auth 2>trace &&
     +@@ t/t5551-http-fetch-smart.sh: test_expect_success 'fetch by SHA-1 without tag following' '
     + 		--no-tags origin $(cat bar_hash)
     + '
     + 
     +-test_expect_success !HTTP2 'cookies are redacted by default' '
     ++test_expect_success 'cookies are redacted by default' '
     + 	rm -rf clone &&
     + 	echo "Set-Cookie: Foo=1" >cookies &&
     + 	echo "Set-Cookie: Bar=2" >>cookies &&

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v3 1/2] t: run t5551 tests with both HTTP and HTTP/2
  2022-11-11 22:35   ` [PATCH v3 0/2] " Glen Choo via GitGitGadget
@ 2022-11-11 22:35     ` Jeff King via GitGitGadget
  2022-11-11 22:35     ` [PATCH v3 2/2] http: redact curl h2h3 headers in info Glen Choo via GitGitGadget
  2022-11-14 22:33     ` [PATCH v3 0/2] " Jeff King
  2 siblings, 0 replies; 24+ messages in thread
From: Jeff King via GitGitGadget @ 2022-11-11 22:35 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Jeff King, Emily Shaffer,
	Ævar Arnfjörð Bjarmason, Glen Choo, Jeff King

From: Jeff King <peff@peff.net>

We have occasionally seen bugs that affect Git running only against an
HTTP/2 web server, not an HTTP one. For instance, b66c77a64e (http:
match headers case-insensitively when redacting, 2021-09-22). But since
we have no test coverage using HTTP/2, we only uncover these bugs in the
wild.

That commit gives a recipe for converting our Apache setup to support
HTTP/2, but:

  - it's not necessarily portable

  - we don't want to just test HTTP/2; we really want to do a variety of
    basic tests for _both_ protocols

This patch handles both problems by running a duplicate of t5551
(labeled as t5559 here) with an alternate-universe setup that enables
HTTP/2. So we'll continue to run t5551 as before, but run the same
battery of tests again with HTTP/2. If HTTP/2 isn't supported on a given
platform, then t5559 should bail during the webserver setup, and
gracefully skip all tests (unless GIT_TEST_HTTPD has been changed from
"auto" to "yes", where the point is to complain when webserver setup
fails).

In theory other http-related test scripts could benefit from the same
duplication, but doing t5551 should give us a reasonable check of basic
functionality, and would have caught both bugs we've seen in the wild
with HTTP/2.

A few notes on the implementation:

  - a script enables the server side config by calling enable_http2
    before starting the webserver. This avoids even trying to load any
    HTTP/2 config for t5551 (which is what lets it keep working with
    regular HTTP even on systems that don't support it). This also sets
    a prereq which can be used by individual tests.

  - As discussed in b66c77a64e, the http2 module isn't compatible with
    the "prefork" mpm, so we need to pick something else. I chose
    "event" here, which works on my Debian system, but it's possible
    there are platforms which would prefer something else. We can adjust
    that later if somebody finds such a platform.

  - The test "large fetch-pack requests can be sent using chunked
    encoding" makes sure we use a chunked transfer-encoding by looking
    for that header in the trace. But since HTTP/2 has its own streaming
    mechanisms, we won't find such a header. We could skip the test
    entirely by marking it with !HTTP2. But there's some value in making
    sure that the fetch itself succeeded. So instead, we'll confirm that
    either we're using HTTP2 _or_ we saw the expected chunked header.

  - the redaction tests fail under HTTP/2 with recent versions of curl.
    This is a bug! I've marked them with !HTTP2 here to skip them under
    t5559 for the moment. Using test_expect_failure would be more
    appropriate, but would require a bunch of boilerplate. Since we'll
    be fixing them momentarily, let's just skip them for now to keep the
    test suite bisectable, and we can re-enable them in the commit that
    fixes the bug.

  - one alternative layout would be to push most of t5551 into a
    lib-t5551.sh script, then source it from both t5551 and t5559.
    Keeping t5551 intact seemed a little simpler, as its one less level
    of indirection for people fixing bugs/regressions in the non-HTTP/2
    tests.

Signed-off-by: Jeff King <peff@peff.net>
---
 t/lib-httpd.sh                    |  5 +++++
 t/lib-httpd/apache.conf           | 19 ++++++++++++++++---
 t/t5551-http-fetch-smart.sh       | 19 ++++++++++++++-----
 t/t5559-http-fetch-smart-http2.sh |  4 ++++
 4 files changed, 39 insertions(+), 8 deletions(-)
 create mode 100755 t/t5559-http-fetch-smart-http2.sh

diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index 1f6b9b08d1d..ba9fe36772a 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -174,6 +174,11 @@ prepare_httpd() {
 	fi
 }
 
+enable_http2 () {
+	HTTPD_PARA="$HTTPD_PARA -DHTTP2"
+	test_set_prereq HTTP2
+}
+
 start_httpd() {
 	prepare_httpd >&3 2>&4
 
diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf
index 706799391bd..0294739a77a 100644
--- a/t/lib-httpd/apache.conf
+++ b/t/lib-httpd/apache.conf
@@ -29,6 +29,11 @@ ErrorLog error.log
 	LoadModule setenvif_module modules/mod_setenvif.so
 </IfModule>
 
+<IfDefine HTTP2>
+LoadModule http2_module modules/mod_http2.so
+Protocols h2c
+</IfDefine>
+
 <IfVersion < 2.4>
 LockFile accept.lock
 </IfVersion>
@@ -64,12 +69,20 @@ LockFile accept.lock
 <IfModule !mod_access_compat.c>
 	LoadModule access_compat_module modules/mod_access_compat.so
 </IfModule>
-<IfModule !mod_mpm_prefork.c>
-	LoadModule mpm_prefork_module modules/mod_mpm_prefork.so
-</IfModule>
 <IfModule !mod_unixd.c>
 	LoadModule unixd_module modules/mod_unixd.so
 </IfModule>
+
+<IfDefine HTTP2>
+<IfModule !mod_mpm_event.c>
+	LoadModule mpm_event_module modules/mod_mpm_event.so
+</IfModule>
+</IfDefine>
+<IfDefine !HTTP2>
+<IfModule !mod_mpm_prefork.c>
+	LoadModule mpm_prefork_module modules/mod_mpm_prefork.so
+</IfModule>
+</IfDefine>
 </IfVersion>
 
 PassEnv GIT_VALGRIND
diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
index 6a38294a476..ad0a0639e6b 100755
--- a/t/t5551-http-fetch-smart.sh
+++ b/t/t5551-http-fetch-smart.sh
@@ -1,13 +1,19 @@
 #!/bin/sh
 
-test_description='test smart fetching over http via http-backend'
+: ${HTTP_PROTO:=HTTP}
+test_description="test smart fetching over http via http-backend ($HTTP_PROTO)"
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-httpd.sh
+test "$HTTP_PROTO" = "HTTP/2" && enable_http2
 start_httpd
 
+test_expect_success HTTP2 'enable client-side http/2' '
+	git config --global http.version HTTP/2
+'
+
 test_expect_success 'setup repository' '
 	git config push.default matching &&
 	echo content >file &&
@@ -194,7 +200,7 @@ test_expect_success 'redirects send auth to new location' '
 	expect_askpass both user@host auth/smart/repo.git
 '
 
-test_expect_success 'GIT_TRACE_CURL redacts auth details' '
+test_expect_success !HTTP2 'GIT_TRACE_CURL redacts auth details' '
 	rm -rf redact-auth trace &&
 	set_askpass user@host pass@host &&
 	GIT_TRACE_CURL="$(pwd)/trace" git clone --bare "$HTTPD_URL/auth/smart/repo.git" redact-auth &&
@@ -206,7 +212,7 @@ test_expect_success 'GIT_TRACE_CURL redacts auth details' '
 	grep -i "Authorization: Basic <redacted>" trace
 '
 
-test_expect_success 'GIT_CURL_VERBOSE redacts auth details' '
+test_expect_success !HTTP2 'GIT_CURL_VERBOSE redacts auth details' '
 	rm -rf redact-auth trace &&
 	set_askpass user@host pass@host &&
 	GIT_CURL_VERBOSE=1 git clone --bare "$HTTPD_URL/auth/smart/repo.git" redact-auth 2>trace &&
@@ -347,7 +353,10 @@ test_expect_success CMDLINE_LIMIT \
 test_expect_success 'large fetch-pack requests can be sent using chunked encoding' '
 	GIT_TRACE_CURL=true git -c http.postbuffer=65536 \
 		clone --bare "$HTTPD_URL/smart/repo.git" split.git 2>err &&
-	grep "^=> Send header: Transfer-Encoding: chunked" err
+	{
+		test_have_prereq HTTP2 ||
+		grep "^=> Send header: Transfer-Encoding: chunked" err
+	}
 '
 
 test_expect_success 'test allowreachablesha1inwant' '
@@ -473,7 +482,7 @@ test_expect_success 'fetch by SHA-1 without tag following' '
 		--no-tags origin $(cat bar_hash)
 '
 
-test_expect_success 'cookies are redacted by default' '
+test_expect_success !HTTP2 'cookies are redacted by default' '
 	rm -rf clone &&
 	echo "Set-Cookie: Foo=1" >cookies &&
 	echo "Set-Cookie: Bar=2" >>cookies &&
diff --git a/t/t5559-http-fetch-smart-http2.sh b/t/t5559-http-fetch-smart-http2.sh
new file mode 100755
index 00000000000..9eece71c2c2
--- /dev/null
+++ b/t/t5559-http-fetch-smart-http2.sh
@@ -0,0 +1,4 @@
+#!/bin/sh
+
+HTTP_PROTO=HTTP/2
+. ./t5551-http-fetch-smart.sh
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 2/2] http: redact curl h2h3 headers in info
  2022-11-11 22:35   ` [PATCH v3 0/2] " Glen Choo via GitGitGadget
  2022-11-11 22:35     ` [PATCH v3 1/2] t: run t5551 tests with both HTTP and HTTP/2 Jeff King via GitGitGadget
@ 2022-11-11 22:35     ` Glen Choo via GitGitGadget
  2022-11-14 22:33     ` [PATCH v3 0/2] " Jeff King
  2 siblings, 0 replies; 24+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-11-11 22:35 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Jeff King, Emily Shaffer,
	Ævar Arnfjörð Bjarmason, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

With GIT_TRACE_CURL=1 or GIT_CURL_VERBOSE=1, sensitive headers like
"Authorization" and "Cookie" get redacted. However, since [1], curl's
h2h3 module (invoked when using HTTP/2) also prints headers in its
"info", which don't get redacted. For example,

  echo 'github.com	TRUE	/	FALSE	1698960413304	o	foo=bar' >cookiefile &&
  GIT_TRACE_CURL=1 GIT_TRACE_CURL_NO_DATA=1 git \
    -c 'http.cookiefile=cookiefile' \
    -c 'http.version=' \
    ls-remote https://github.com/git/git refs/heads/main 2>output &&
  grep 'cookie' output

produces output like:

  23:04:16.920495 http.c:678              == Info: h2h3 [cookie: o=foo=bar]
  23:04:16.920562 http.c:637              => Send header: cookie: o=<redacted>

Teach http.c to check for h2h3 headers in info and redact them using the
existing header redaction logic. This fixes the broken redaction logic
that we noted in the previous commit, so mark the redaction tests as
passing under HTTP2.

[1] https://github.com/curl/curl/commit/f8c3724aa90472c0e617ddbbc420aa199971eb77

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Glen Choo <chooglen@google.com>
---
 http.c                      | 47 ++++++++++++++++++++++++++++++++-----
 t/t5551-http-fetch-smart.sh |  6 ++---
 2 files changed, 44 insertions(+), 9 deletions(-)

diff --git a/http.c b/http.c
index 5d0502f51fd..8a5ba3f4776 100644
--- a/http.c
+++ b/http.c
@@ -560,13 +560,15 @@ static void set_curl_keepalive(CURL *c)
 }
 #endif
 
-static void redact_sensitive_header(struct strbuf *header)
+/* Return 1 if redactions have been made, 0 otherwise. */
+static int redact_sensitive_header(struct strbuf *header, size_t offset)
 {
+	int ret = 0;
 	const char *sensitive_header;
 
 	if (trace_curl_redact &&
-	    (skip_iprefix(header->buf, "Authorization:", &sensitive_header) ||
-	     skip_iprefix(header->buf, "Proxy-Authorization:", &sensitive_header))) {
+	    (skip_iprefix(header->buf + offset, "Authorization:", &sensitive_header) ||
+	     skip_iprefix(header->buf + offset, "Proxy-Authorization:", &sensitive_header))) {
 		/* The first token is the type, which is OK to log */
 		while (isspace(*sensitive_header))
 			sensitive_header++;
@@ -575,8 +577,9 @@ static void redact_sensitive_header(struct strbuf *header)
 		/* Everything else is opaque and possibly sensitive */
 		strbuf_setlen(header,  sensitive_header - header->buf);
 		strbuf_addstr(header, " <redacted>");
+		ret = 1;
 	} else if (trace_curl_redact &&
-		   skip_iprefix(header->buf, "Cookie:", &sensitive_header)) {
+		   skip_iprefix(header->buf + offset, "Cookie:", &sensitive_header)) {
 		struct strbuf redacted_header = STRBUF_INIT;
 		const char *cookie;
 
@@ -612,6 +615,26 @@ static void redact_sensitive_header(struct strbuf *header)
 
 		strbuf_setlen(header, sensitive_header - header->buf);
 		strbuf_addbuf(header, &redacted_header);
+		ret = 1;
+	}
+	return ret;
+}
+
+/* Redact headers in info */
+static void redact_sensitive_info_header(struct strbuf *header)
+{
+	const char *sensitive_header;
+
+	/*
+	 * curl's h2h3 prints headers in info, e.g.:
+	 *   h2h3 [<header-name>: <header-val>]
+	 */
+	if (trace_curl_redact &&
+	    skip_iprefix(header->buf, "h2h3 [", &sensitive_header)) {
+		if (redact_sensitive_header(header, sensitive_header - header->buf)) {
+			/* redaction ate our closing bracket */
+			strbuf_addch(header, ']');
+		}
 	}
 }
 
@@ -629,7 +652,7 @@ static void curl_dump_header(const char *text, unsigned char *ptr, size_t size,
 
 	for (header = headers; *header; header++) {
 		if (hide_sensitive_header)
-			redact_sensitive_header(*header);
+			redact_sensitive_header(*header, 0);
 		strbuf_insertstr((*header), 0, text);
 		strbuf_insertstr((*header), strlen(text), ": ");
 		strbuf_rtrim((*header));
@@ -668,6 +691,18 @@ static void curl_dump_data(const char *text, unsigned char *ptr, size_t size)
 	strbuf_release(&out);
 }
 
+static void curl_dump_info(char *data, size_t size)
+{
+	struct strbuf buf = STRBUF_INIT;
+
+	strbuf_add(&buf, data, size);
+
+	redact_sensitive_info_header(&buf);
+	trace_printf_key(&trace_curl, "== Info: %s", buf.buf);
+
+	strbuf_release(&buf);
+}
+
 static int curl_trace(CURL *handle, curl_infotype type, char *data, size_t size, void *userp)
 {
 	const char *text;
@@ -675,7 +710,7 @@ static int curl_trace(CURL *handle, curl_infotype type, char *data, size_t size,
 
 	switch (type) {
 	case CURLINFO_TEXT:
-		trace_printf_key(&trace_curl, "== Info: %s", data);
+		curl_dump_info(data, size);
 		break;
 	case CURLINFO_HEADER_OUT:
 		text = "=> Send header";
diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
index ad0a0639e6b..a2ebce1787b 100755
--- a/t/t5551-http-fetch-smart.sh
+++ b/t/t5551-http-fetch-smart.sh
@@ -200,7 +200,7 @@ test_expect_success 'redirects send auth to new location' '
 	expect_askpass both user@host auth/smart/repo.git
 '
 
-test_expect_success !HTTP2 'GIT_TRACE_CURL redacts auth details' '
+test_expect_success 'GIT_TRACE_CURL redacts auth details' '
 	rm -rf redact-auth trace &&
 	set_askpass user@host pass@host &&
 	GIT_TRACE_CURL="$(pwd)/trace" git clone --bare "$HTTPD_URL/auth/smart/repo.git" redact-auth &&
@@ -212,7 +212,7 @@ test_expect_success !HTTP2 'GIT_TRACE_CURL redacts auth details' '
 	grep -i "Authorization: Basic <redacted>" trace
 '
 
-test_expect_success !HTTP2 'GIT_CURL_VERBOSE redacts auth details' '
+test_expect_success 'GIT_CURL_VERBOSE redacts auth details' '
 	rm -rf redact-auth trace &&
 	set_askpass user@host pass@host &&
 	GIT_CURL_VERBOSE=1 git clone --bare "$HTTPD_URL/auth/smart/repo.git" redact-auth 2>trace &&
@@ -482,7 +482,7 @@ test_expect_success 'fetch by SHA-1 without tag following' '
 		--no-tags origin $(cat bar_hash)
 '
 
-test_expect_success !HTTP2 'cookies are redacted by default' '
+test_expect_success 'cookies are redacted by default' '
 	rm -rf clone &&
 	echo "Set-Cookie: Foo=1" >cookies &&
 	echo "Set-Cookie: Bar=2" >>cookies &&
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 0/2] http: redact curl h2h3 headers in info
  2022-11-11 22:35   ` [PATCH v3 0/2] " Glen Choo via GitGitGadget
  2022-11-11 22:35     ` [PATCH v3 1/2] t: run t5551 tests with both HTTP and HTTP/2 Jeff King via GitGitGadget
  2022-11-11 22:35     ` [PATCH v3 2/2] http: redact curl h2h3 headers in info Glen Choo via GitGitGadget
@ 2022-11-14 22:33     ` Jeff King
  2022-11-14 22:43       ` Taylor Blau
  2 siblings, 1 reply; 24+ messages in thread
From: Jeff King @ 2022-11-14 22:33 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget
  Cc: git, Taylor Blau, Emily Shaffer,
	Ævar Arnfjörð Bjarmason, Glen Choo

On Fri, Nov 11, 2022 at 10:35:04PM +0000, Glen Choo via GitGitGadget wrote:

> Big thanks to Peff for the feedback last round. I've incorporated all of the
> suggestions. I'm so glad that we finally have tests here :)
> 
> Changes in v3:
> 
>  * Add the HTTP2 test from [1] to the start of the series
>  * Drop struct strbuf inner in favor of doing work on the original strbuf

Thanks, both look good to me (unsurprisingly ;) ).

-Peff

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 0/2] http: redact curl h2h3 headers in info
  2022-11-14 22:33     ` [PATCH v3 0/2] " Jeff King
@ 2022-11-14 22:43       ` Taylor Blau
  0 siblings, 0 replies; 24+ messages in thread
From: Taylor Blau @ 2022-11-14 22:43 UTC (permalink / raw)
  To: Jeff King
  Cc: Glen Choo via GitGitGadget, git, Taylor Blau, Emily Shaffer,
	Ævar Arnfjörð Bjarmason, Glen Choo

On Mon, Nov 14, 2022 at 05:33:26PM -0500, Jeff King wrote:
> On Fri, Nov 11, 2022 at 10:35:04PM +0000, Glen Choo via GitGitGadget wrote:
>
> > Big thanks to Peff for the feedback last round. I've incorporated all of the
> > suggestions. I'm so glad that we finally have tests here :)
> >
> > Changes in v3:
> >
> >  * Add the HTTP2 test from [1] to the start of the series
> >  * Drop struct strbuf inner in favor of doing work on the original strbuf
>
> Thanks, both look good to me (unsurprisingly ;) ).

Thanks, both. Will start merging this down in this evening's push-out.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2022-11-14 22:44 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-09  0:52 [PATCH] http: redact curl h2h3 headers in info Glen Choo via GitGitGadget
2022-11-10  2:52 ` Taylor Blau
2022-11-10 17:48   ` Glen Choo
2022-11-10 21:50     ` Jeff King
2022-11-10 22:53       ` Glen Choo
2022-11-11  2:29         ` Jeff King
2022-11-11  2:31           ` Taylor Blau
2022-11-11 14:49           ` [PATCH] t: run t5551 tests with both HTTP and HTTP/2 Jeff King
2022-11-11 15:06             ` Ævar Arnfjörð Bjarmason
2022-11-11 15:19               ` Jeff King
2022-11-11 15:20             ` Jeff King
2022-11-10 21:57 ` [PATCH] http: redact curl h2h3 headers in info Emily Shaffer
2022-11-10 22:14   ` Glen Choo
2022-11-11  2:35     ` Taylor Blau
2022-11-10 22:57 ` [PATCH v2] " Glen Choo via GitGitGadget
2022-11-11  2:36   ` Taylor Blau
2022-11-11  2:38   ` Jeff King
2022-11-11  2:39     ` Taylor Blau
2022-11-11 17:55     ` Glen Choo
2022-11-11 22:35   ` [PATCH v3 0/2] " Glen Choo via GitGitGadget
2022-11-11 22:35     ` [PATCH v3 1/2] t: run t5551 tests with both HTTP and HTTP/2 Jeff King via GitGitGadget
2022-11-11 22:35     ` [PATCH v3 2/2] http: redact curl h2h3 headers in info Glen Choo via GitGitGadget
2022-11-14 22:33     ` [PATCH v3 0/2] " Jeff King
2022-11-14 22:43       ` Taylor Blau

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).