git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/2] fetch-pack: redact packfile urls in traces
@ 2021-10-08 16:03 Ivan Frade via GitGitGadget
  2021-10-08 16:03 ` [PATCH 1/2] " Ivan Frade via GitGitGadget
                   ` (2 more replies)
  0 siblings, 3 replies; 43+ messages in thread
From: Ivan Frade via GitGitGadget @ 2021-10-08 16:03 UTC (permalink / raw)
  To: git; +Cc: Ivan Frade

In some setups, packfile uris act as bearer token. It is not recommended to
expose them plainly in logs, although in special circunstances (e.g. debug)
it makes sense to write them.

Redact the packfile-uri lines by default, unless the GIT_TRACE_REDACT
variable is set to false. This mimics the redacting of the Authorization
header in HTTP.

Signed-off-by: Ivan Frade ifrade@google.com

Ivan Frade (2):
  fetch-pack: redact packfile urls in traces
  Documentation: packfile-uri hash can be longer than 40 hex chars

 Documentation/technical/protocol-v2.txt |  8 ++---
 fetch-pack.c                            | 11 +++++++
 http-fetch.c                            |  4 ++-
 pkt-line.c                              |  7 +++-
 pkt-line.h                              |  1 +
 t/t5702-protocol-v2.sh                  | 43 +++++++++++++++++++++++++
 6 files changed, 68 insertions(+), 6 deletions(-)


base-commit: 0785eb769886ae81e346df10e88bc49ffc0ac64e
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1052%2Fifradeo%2Fredact-packfile-uri-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1052/ifradeo/redact-packfile-uri-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1052
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH 1/2] fetch-pack: redact packfile urls in traces
  2021-10-08 16:03 [PATCH 0/2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
@ 2021-10-08 16:03 ` Ivan Frade via GitGitGadget
  2021-10-08 19:36   ` Ævar Arnfjörð Bjarmason
  2021-10-08 16:03 ` [PATCH 2/2] Documentation: packfile-uri hash can be longer than 40 hex chars Ivan Frade via GitGitGadget
  2021-10-09  2:20 ` [PATCH v2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
  2 siblings, 1 reply; 43+ messages in thread
From: Ivan Frade via GitGitGadget @ 2021-10-08 16:03 UTC (permalink / raw)
  To: git; +Cc: Ivan Frade, Ivan Frade

From: Ivan Frade <ifrade@google.com>

In some setups, packfile uris act as bearer token. It is not
recommended to expose them plainly in logs, although in special
circunstances (e.g. debug) it makes sense to write them.

Redact the packfile-uri lines by default, unless the GIT_TRACE_REDACT
variable is set to false. This mimics the redacting of the
Authorization header in HTTP.

Signed-off-by: Ivan Frade <ifrade@google.com>
---
 fetch-pack.c           | 11 +++++++++++
 http-fetch.c           |  4 +++-
 pkt-line.c             |  7 ++++++-
 pkt-line.h             |  1 +
 t/t5702-protocol-v2.sh | 43 ++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 64 insertions(+), 2 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index a9604f35a3e..05c85eeafa1 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1518,7 +1518,16 @@ static void receive_wanted_refs(struct packet_reader *reader,
 static void receive_packfile_uris(struct packet_reader *reader,
 				  struct string_list *uris)
 {
+	int original_options;
 	process_section_header(reader, "packfile-uris", 0);
+	/*
+	 * In some setups, packfile-uris act as bearer tokens,
+	 * redact them by default.
+	 */
+	original_options = reader->options;
+	if (git_env_bool("GIT_TRACE_REDACT", 1))
+		reader->options |= PACKET_READ_REDACT_ON_TRACE;
+
 	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
 		if (reader->pktlen < the_hash_algo->hexsz ||
 		    reader->line[the_hash_algo->hexsz] != ' ')
@@ -1526,6 +1535,8 @@ static void receive_packfile_uris(struct packet_reader *reader,
 
 		string_list_append(uris, reader->line);
 	}
+	reader->options = original_options;
+
 	if (reader->status != PACKET_READ_DELIM)
 		die("expected DELIM");
 }
diff --git a/http-fetch.c b/http-fetch.c
index fa642462a9e..d35e33e4f65 100644
--- a/http-fetch.c
+++ b/http-fetch.c
@@ -63,7 +63,9 @@ static void fetch_single_packfile(struct object_id *packfile_hash,
 	if (start_active_slot(preq->slot)) {
 		run_active_slot(preq->slot);
 		if (results.curl_result != CURLE_OK) {
-			die("Unable to get pack file %s\n%s", preq->url,
+			int showUrl = git_env_bool("GIT_TRACE_REDACT", 1);
+			die("Unable to get offloaded pack file %s\n%s",
+			    showUrl ? preq->url : "<redacted>",
 			    curl_errorstr);
 		}
 	} else {
diff --git a/pkt-line.c b/pkt-line.c
index de4a94b437e..8da8ed88ccf 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -443,7 +443,12 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 		len--;
 
 	buffer[len] = 0;
-	packet_trace(buffer, len, 0);
+	if (options & PACKET_READ_REDACT_ON_TRACE) {
+		const char *redacted = "<redacted>";
+		packet_trace(redacted, strlen(redacted), 0);
+	} else {
+		packet_trace(buffer, len, 0);
+	}
 
 	if ((options & PACKET_READ_DIE_ON_ERR_PACKET) &&
 	    starts_with(buffer, "ERR "))
diff --git a/pkt-line.h b/pkt-line.h
index 82b95e4bdd3..44c02f3bc6e 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -88,6 +88,7 @@ void packet_fflush(FILE *f);
 #define PACKET_READ_CHOMP_NEWLINE        (1u<<1)
 #define PACKET_READ_DIE_ON_ERR_PACKET    (1u<<2)
 #define PACKET_READ_GENTLE_ON_READ_ERROR (1u<<3)
+#define PACKET_READ_REDACT_ON_TRACE      (1u<<4)
 int packet_read(int fd, char **src_buffer, size_t *src_len, char
 		*buffer, unsigned size, int options);
 
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index d527cf6c49f..a620a678a56 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -1107,6 +1107,49 @@ test_expect_success 'packfile-uri with transfer.fsckobjects fails when .gitmodul
 	test_i18ngrep "disallowed submodule name" err
 '
 
+test_expect_success 'packfile-uri redacted in trace' '
+	P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	rm -rf "$P" http_child log &&
+
+	git init "$P" &&
+	git -C "$P" config "uploadpack.allowsidebandall" "true" &&
+
+	echo my-blob >"$P/my-blob" &&
+	git -C "$P" add my-blob &&
+	git -C "$P" commit -m x &&
+
+	configure_exclusion "$P" my-blob >h &&
+
+	GIT_TRACE=1 GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \
+	git -c protocol.version=2 \
+		-c fetch.uriprotocols=http,https \
+		clone "$HTTPD_URL/smart/http_parent" http_child &&
+
+	grep -A1 "clone<\ ..packfile-uris" log | grep "clone<\ <redacted>"
+'
+
+test_expect_success 'packfile-uri not redacted in trace when GIT_TRACE_REDACT=0' '
+	P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	rm -rf "$P" http_child log &&
+
+	git init "$P" &&
+	git -C "$P" config "uploadpack.allowsidebandall" "true" &&
+
+	echo my-blob >"$P/my-blob" &&
+	git -C "$P" add my-blob &&
+	git -C "$P" commit -m x &&
+
+	configure_exclusion "$P" my-blob >h &&
+
+	GIT_TRACE=1 GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \
+	GIT_TRACE_REDACT=0 \
+	git -c protocol.version=2 \
+		-c fetch.uriprotocols=http,https \
+		clone "$HTTPD_URL/smart/http_parent" http_child &&
+
+	grep -A1 "clone<\ ..packfile-uris" log  | grep -E "clone<\ ..[[:alnum:]]{40,64}\ http"
+'
+
 test_expect_success 'http:// --negotiate-only' '
 	SERVER="$HTTPD_DOCUMENT_ROOT_PATH/server" &&
 	URI="$HTTPD_URL/smart/server" &&
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 2/2] Documentation: packfile-uri hash can be longer than 40 hex chars
  2021-10-08 16:03 [PATCH 0/2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
  2021-10-08 16:03 ` [PATCH 1/2] " Ivan Frade via GitGitGadget
@ 2021-10-08 16:03 ` Ivan Frade via GitGitGadget
  2021-10-08 19:43   ` Ævar Arnfjörð Bjarmason
  2021-10-09  2:20 ` [PATCH v2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
  2 siblings, 1 reply; 43+ messages in thread
From: Ivan Frade via GitGitGadget @ 2021-10-08 16:03 UTC (permalink / raw)
  To: git; +Cc: Ivan Frade, Ivan Frade

From: Ivan Frade <ifrade@google.com>

Packfile-uri line specifies a hash of 40 hex character, but with SHA256
this hash size is 64. There are already tests using SHA256 (e.g. in
ubuntu-latest/linux-clang).

Update protocol-v2 documentation to indicate that the hash size depends
on the hash algorithm in use.

Signed-off-by: Ivan Frade <ifrade@google.com>
---
 Documentation/technical/protocol-v2.txt | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 21e8258ccf3..a23f12d6c2b 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -393,7 +393,7 @@ header. Most sections are sent only when the packfile is sent.
     wanted-ref = obj-id SP refname
 
     packfile-uris = PKT-LINE("packfile-uris" LF) *packfile-uri
-    packfile-uri = PKT-LINE(40*(HEXDIGIT) SP *%x20-ff LF)
+    packfile-uri = PKT-LINE((40|64)*(HEXDIGIT) SP *%x20-ff LF)
 
     packfile = PKT-LINE("packfile" LF)
 	       *PKT-LINE(%x01-03 *%x00-ff)
@@ -476,9 +476,9 @@ header. Most sections are sent only when the packfile is sent.
 	* For each URI the server sends, it sends a hash of the pack's
 	  contents (as output by git index-pack) followed by the URI.
 
-	* The hashes are 40 hex characters long. When Git upgrades to a new
-	  hash algorithm, this might need to be updated. (It should match
-	  whatever index-pack outputs after "pack\t" or "keep\t".
+	* The hashes length is defined by the hash algorithm (40 hex
+	  characters in SHA-1, 64 in SHA-256). It should match whatever
+	  index-pack outputs after "pack\t" or "keep\t".
 
     packfile section
 	* This section is only included if the client has sent 'want'
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH 1/2] fetch-pack: redact packfile urls in traces
  2021-10-08 16:03 ` [PATCH 1/2] " Ivan Frade via GitGitGadget
@ 2021-10-08 19:36   ` Ævar Arnfjörð Bjarmason
  2021-10-08 23:15     ` Ivan Frade
  0 siblings, 1 reply; 43+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-08 19:36 UTC (permalink / raw)
  To: Ivan Frade via GitGitGadget; +Cc: git, Ivan Frade


On Fri, Oct 08 2021, Ivan Frade via GitGitGadget wrote:

> diff --git a/http-fetch.c b/http-fetch.c
> index fa642462a9e..d35e33e4f65 100644
> --- a/http-fetch.c
> +++ b/http-fetch.c
> @@ -63,7 +63,9 @@ static void fetch_single_packfile(struct object_id *packfile_hash,
>  	if (start_active_slot(preq->slot)) {
>  		run_active_slot(preq->slot);
>  		if (results.curl_result != CURLE_OK) {
> -			die("Unable to get pack file %s\n%s", preq->url,
> +			int showUrl = git_env_bool("GIT_TRACE_REDACT", 1);
> +			die("Unable to get offloaded pack file %s\n%s",
> +			    showUrl ? preq->url : "<redacted>",
>  			    curl_errorstr);
>  		}
>  	} else {

Your CL and commit message just talk about traes, but this is a die()
message.

Perhaps it makes sense to redact it there too for some reason, but that
seems to be a thing to separately argue for.

This message is shown interactively to users, and I could see it be
annoying to not have the URL that failed in your terminal output, even
if it has some one-time token.

Which is presumably different from the use-cases you're thinking of, I'm
assuming some logging of detached processes, or central logging of user
actions.

> +test_expect_success 'packfile-uri redacted in trace' '
> +	P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> +	rm -rf "$P" http_child log &&
> +
> +	git init "$P" &&
> +	git -C "$P" config "uploadpack.allowsidebandall" "true" &&
> +
> +	echo my-blob >"$P/my-blob" &&
> +	git -C "$P" add my-blob &&
> +	git -C "$P" commit -m x &&
> +
> +	configure_exclusion "$P" my-blob >h &&
> +
> +	GIT_TRACE=1 GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \
> +	git -c protocol.version=2 \
> +		-c fetch.uriprotocols=http,https \
> +		clone "$HTTPD_URL/smart/http_parent" http_child &&
> +
> +	grep -A1 "clone<\ ..packfile-uris" log | grep "clone<\ <redacted>"

We don't rely on GNU options like those for the test suite, it'll break
on various supported platformrs.

In this case the whole LHS of the pipe looks like it could be dropped,
why not grep for "^clone< <redacted>"?

Also you don't need to quote the space character in regexes, it's not a
metacharacter.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 2/2] Documentation: packfile-uri hash can be longer than 40 hex chars
  2021-10-08 16:03 ` [PATCH 2/2] Documentation: packfile-uri hash can be longer than 40 hex chars Ivan Frade via GitGitGadget
@ 2021-10-08 19:43   ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 43+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-08 19:43 UTC (permalink / raw)
  To: Ivan Frade via GitGitGadget; +Cc: git, Ivan Frade


On Fri, Oct 08 2021, Ivan Frade via GitGitGadget wrote:

> From: Ivan Frade <ifrade@google.com>
>
> Packfile-uri line specifies a hash of 40 hex character, but with SHA256
> this hash size is 64. There are already tests using SHA256 (e.g. in
> ubuntu-latest/linux-clang).
>
> Update protocol-v2 documentation to indicate that the hash size depends
> on the hash algorithm in use.
>
> Signed-off-by: Ivan Frade <ifrade@google.com>
> ---
>  Documentation/technical/protocol-v2.txt | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
> index 21e8258ccf3..a23f12d6c2b 100644
> --- a/Documentation/technical/protocol-v2.txt
> +++ b/Documentation/technical/protocol-v2.txt
> @@ -393,7 +393,7 @@ header. Most sections are sent only when the packfile is sent.
>      wanted-ref = obj-id SP refname
>  
>      packfile-uris = PKT-LINE("packfile-uris" LF) *packfile-uri
> -    packfile-uri = PKT-LINE(40*(HEXDIGIT) SP *%x20-ff LF)
> +    packfile-uri = PKT-LINE((40|64)*(HEXDIGIT) SP *%x20-ff LF)
>  
>      packfile = PKT-LINE("packfile" LF)
>  	       *PKT-LINE(%x01-03 *%x00-ff)
> @@ -476,9 +476,9 @@ header. Most sections are sent only when the packfile is sent.
>  	* For each URI the server sends, it sends a hash of the pack's
>  	  contents (as output by git index-pack) followed by the URI.
>  
> -	* The hashes are 40 hex characters long. When Git upgrades to a new
> -	  hash algorithm, this might need to be updated. (It should match
> -	  whatever index-pack outputs after "pack\t" or "keep\t".
> +	* The hashes length is defined by the hash algorithm (40 hex
> +	  characters in SHA-1, 64 in SHA-256). It should match whatever
> +	  index-pack outputs after "pack\t" or "keep\t".
>  
>      packfile section
>  	* This section is only included if the client has sent 'want'

(I forgot to say in my first reply, but welcome to the Git Mailing
List!)

This is well spotted, but it seems even better to simply drop this
exhaustive listing of 40 or 64 hex digits here.

In protocol-common.txt we talk about "obj-id", and that's then used
elsewhere in protocol-v2.txt matter-of-factly, e.g. (quoting from a
handy part that happens to use "obj-id"):

    [...]
    obj-id-or-unborn = (obj-id | "unborn")
    ref = PKT-LINE(obj-id-or-unborn SP refname *(SP ref-attribute) LF)
    [...]

So let's just have packfile-uri do the same.

Now, the thing that *does* need to be updated then is
protocol-common.txt, or this part:

  zero-id   =  40*"0"
  obj-id    =  40*(HEXDIGIT)

Because now if you use obj-id that'll just refer back to that, but
that's also a problem with all the rest of the protocol docs.

It would seem that all our SHA-256 tests and client/servers are in
violation of the documentation, and should truncate their OIDs to 40
chars, or we could fix the docs :)

Anyway, whatever we do here this improvement is unrelated to whatever
we're doing with log redaction in your 1/2, I think it would be better
to submit as its own 1 or 2 patch series.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 1/2] fetch-pack: redact packfile urls in traces
  2021-10-08 19:36   ` Ævar Arnfjörð Bjarmason
@ 2021-10-08 23:15     ` Ivan Frade
  0 siblings, 0 replies; 43+ messages in thread
From: Ivan Frade @ 2021-10-08 23:15 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Ivan Frade via GitGitGadget, git

On Fri, Oct 8, 2021 at 12:42 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Fri, Oct 08 2021, Ivan Frade via GitGitGadget wrote:
>
> > diff --git a/http-fetch.c b/http-fetch.c
> > index fa642462a9e..d35e33e4f65 100644
> > --- a/http-fetch.c
> > +++ b/http-fetch.c
> > @@ -63,7 +63,9 @@ static void fetch_single_packfile(struct object_id *packfile_hash,
> >       if (start_active_slot(preq->slot)) {
> >               run_active_slot(preq->slot);
> >               if (results.curl_result != CURLE_OK) {
> > -                     die("Unable to get pack file %s\n%s", preq->url,
> > +                     int showUrl = git_env_bool("GIT_TRACE_REDACT", 1);
> > +                     die("Unable to get offloaded pack file %s\n%s",
> > +                         showUrl ? preq->url : "<redacted>",
> >                           curl_errorstr);
> >               }
> >       } else {
>
> Your CL and commit message just talk about traes, but this is a die()
> message.
>
> Perhaps it makes sense to redact it there too for some reason, but that
> seems to be a thing to separately argue for.
>
> This message is shown interactively to users, and I could see it be
> annoying to not have the URL that failed in your terminal output, even
> if it has some one-time token.


For a regular user the URL could be confusing (should they click on
it? try to download it by themselves?). I also got a suggestion to
print e.g. only the domain and maybe the packname.

In any case, I agree it is a different thing than trace logging. I
removed it from this patch.

>
> > +
> > +     grep -A1 "clone<\ ..packfile-uris" log | grep "clone<\ <redacted>"
>
> We don't rely on GNU options like those for the test suite, it'll break
> on various supported platformrs.
>
> In this case the whole LHS of the pipe looks like it could be dropped,
> why not grep for "^clone< <redacted>"?
>
>
> Also you don't need to quote the space character in regexes, it's not a
> metacharacter.

Updated the grep expressions to look only for the relevant lines and
removed the escaping of the space char.

I was trying to limit the grep to the "packfile-uri" section, not to
match something else by accident, but I think "obj-id http://"
shouldn't match anything else in the clone response (no ref can start
with http://).

Thanks for the quick review!

Ivan

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH v2] fetch-pack: redact packfile urls in traces
  2021-10-08 16:03 [PATCH 0/2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
  2021-10-08 16:03 ` [PATCH 1/2] " Ivan Frade via GitGitGadget
  2021-10-08 16:03 ` [PATCH 2/2] Documentation: packfile-uri hash can be longer than 40 hex chars Ivan Frade via GitGitGadget
@ 2021-10-09  2:20 ` Ivan Frade via GitGitGadget
  2021-10-11 20:39   ` Junio C Hamano
  2021-10-19 22:57   ` [PATCH v3] " Ivan Frade via GitGitGadget
  2 siblings, 2 replies; 43+ messages in thread
From: Ivan Frade via GitGitGadget @ 2021-10-09  2:20 UTC (permalink / raw)
  To: git; +Cc: Ævar Arnfjörð Bjarmason, Ivan Frade, Ivan Frade

From: Ivan Frade <ifrade@google.com>

In some setups, packfile uris act as bearer token. It is not
recommended to expose them plainly in logs, although in special
circunstances (e.g. debug) it makes sense to write them.

Redact the packfile-uri lines by default, unless the GIT_TRACE_REDACT
variable is set to false. This mimics the redacting of the
Authorization header in HTTP.

Changes since v1:
- Removed non-POSIX flags in tests
- More accurate regex for the non-encrypted packfile line
- Dropped documentation change
- Dropped redacting the die message in http-fetch

Signed-off-by: Ivan Frade <ifrade@google.com>
---
    fetch-pack: redact packfile urls in traces
    
    In some setups, packfile uris act as bearer token. It is not recommended
    to expose them plainly in logs, although in special circunstances (e.g.
    debug) it makes sense to write them.
    
    Redact the packfile-uri lines by default, unless the GIT_TRACE_REDACT
    variable is set to false. This mimics the redacting of the Authorization
    header in HTTP.
    
    Signed-off-by: Ivan Frade ifrade@google.com
    
    cc: Ævar Arnfjörð Bjarmason avarab@gmail.com

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1052%2Fifradeo%2Fredact-packfile-uri-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1052/ifradeo/redact-packfile-uri-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1052

Range-diff vs v1:

 1:  b473f145a87 ! 1:  701cb7a6ab9 fetch-pack: redact packfile urls in traces
     @@ Commit message
          variable is set to false. This mimics the redacting of the
          Authorization header in HTTP.
      
     +    Changes since v1:
     +    - Removed non-POSIX flags in tests
     +    - More accurate regex for the non-encrypted packfile line
     +    - Dropped documentation change
     +    - Dropped redacting the die message in http-fetch
     +
          Signed-off-by: Ivan Frade <ifrade@google.com>
      
       ## fetch-pack.c ##
     @@ fetch-pack.c: static void receive_packfile_uris(struct packet_reader *reader,
       		die("expected DELIM");
       }
      
     - ## http-fetch.c ##
     -@@ http-fetch.c: static void fetch_single_packfile(struct object_id *packfile_hash,
     - 	if (start_active_slot(preq->slot)) {
     - 		run_active_slot(preq->slot);
     - 		if (results.curl_result != CURLE_OK) {
     --			die("Unable to get pack file %s\n%s", preq->url,
     -+			int showUrl = git_env_bool("GIT_TRACE_REDACT", 1);
     -+			die("Unable to get offloaded pack file %s\n%s",
     -+			    showUrl ? preq->url : "<redacted>",
     - 			    curl_errorstr);
     - 		}
     - 	} else {
     -
       ## pkt-line.c ##
      @@ pkt-line.c: enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
       		len--;
     @@ t/t5702-protocol-v2.sh: test_expect_success 'packfile-uri with transfer.fsckobje
      +		-c fetch.uriprotocols=http,https \
      +		clone "$HTTPD_URL/smart/http_parent" http_child &&
      +
     -+	grep -A1 "clone<\ ..packfile-uris" log | grep "clone<\ <redacted>"
     ++	grep "clone< <redacted>" log
      +'
      +
      +test_expect_success 'packfile-uri not redacted in trace when GIT_TRACE_REDACT=0' '
     @@ t/t5702-protocol-v2.sh: test_expect_success 'packfile-uri with transfer.fsckobje
      +		-c fetch.uriprotocols=http,https \
      +		clone "$HTTPD_URL/smart/http_parent" http_child &&
      +
     -+	grep -A1 "clone<\ ..packfile-uris" log  | grep -E "clone<\ ..[[:alnum:]]{40,64}\ http"
     ++	grep -E "clone< ..[0-9a-f]{40,64} http://" log
      +'
      +
       test_expect_success 'http:// --negotiate-only' '
 2:  497c5fd18d7 < -:  ----------- Documentation: packfile-uri hash can be longer than 40 hex chars


 fetch-pack.c           | 11 +++++++++++
 pkt-line.c             |  7 ++++++-
 pkt-line.h             |  1 +
 t/t5702-protocol-v2.sh | 43 ++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 61 insertions(+), 1 deletion(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index a9604f35a3e..05c85eeafa1 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1518,7 +1518,16 @@ static void receive_wanted_refs(struct packet_reader *reader,
 static void receive_packfile_uris(struct packet_reader *reader,
 				  struct string_list *uris)
 {
+	int original_options;
 	process_section_header(reader, "packfile-uris", 0);
+	/*
+	 * In some setups, packfile-uris act as bearer tokens,
+	 * redact them by default.
+	 */
+	original_options = reader->options;
+	if (git_env_bool("GIT_TRACE_REDACT", 1))
+		reader->options |= PACKET_READ_REDACT_ON_TRACE;
+
 	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
 		if (reader->pktlen < the_hash_algo->hexsz ||
 		    reader->line[the_hash_algo->hexsz] != ' ')
@@ -1526,6 +1535,8 @@ static void receive_packfile_uris(struct packet_reader *reader,
 
 		string_list_append(uris, reader->line);
 	}
+	reader->options = original_options;
+
 	if (reader->status != PACKET_READ_DELIM)
 		die("expected DELIM");
 }
diff --git a/pkt-line.c b/pkt-line.c
index de4a94b437e..8da8ed88ccf 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -443,7 +443,12 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 		len--;
 
 	buffer[len] = 0;
-	packet_trace(buffer, len, 0);
+	if (options & PACKET_READ_REDACT_ON_TRACE) {
+		const char *redacted = "<redacted>";
+		packet_trace(redacted, strlen(redacted), 0);
+	} else {
+		packet_trace(buffer, len, 0);
+	}
 
 	if ((options & PACKET_READ_DIE_ON_ERR_PACKET) &&
 	    starts_with(buffer, "ERR "))
diff --git a/pkt-line.h b/pkt-line.h
index 82b95e4bdd3..44c02f3bc6e 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -88,6 +88,7 @@ void packet_fflush(FILE *f);
 #define PACKET_READ_CHOMP_NEWLINE        (1u<<1)
 #define PACKET_READ_DIE_ON_ERR_PACKET    (1u<<2)
 #define PACKET_READ_GENTLE_ON_READ_ERROR (1u<<3)
+#define PACKET_READ_REDACT_ON_TRACE      (1u<<4)
 int packet_read(int fd, char **src_buffer, size_t *src_len, char
 		*buffer, unsigned size, int options);
 
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index d527cf6c49f..f0273317861 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -1107,6 +1107,49 @@ test_expect_success 'packfile-uri with transfer.fsckobjects fails when .gitmodul
 	test_i18ngrep "disallowed submodule name" err
 '
 
+test_expect_success 'packfile-uri redacted in trace' '
+	P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	rm -rf "$P" http_child log &&
+
+	git init "$P" &&
+	git -C "$P" config "uploadpack.allowsidebandall" "true" &&
+
+	echo my-blob >"$P/my-blob" &&
+	git -C "$P" add my-blob &&
+	git -C "$P" commit -m x &&
+
+	configure_exclusion "$P" my-blob >h &&
+
+	GIT_TRACE=1 GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \
+	git -c protocol.version=2 \
+		-c fetch.uriprotocols=http,https \
+		clone "$HTTPD_URL/smart/http_parent" http_child &&
+
+	grep "clone< <redacted>" log
+'
+
+test_expect_success 'packfile-uri not redacted in trace when GIT_TRACE_REDACT=0' '
+	P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	rm -rf "$P" http_child log &&
+
+	git init "$P" &&
+	git -C "$P" config "uploadpack.allowsidebandall" "true" &&
+
+	echo my-blob >"$P/my-blob" &&
+	git -C "$P" add my-blob &&
+	git -C "$P" commit -m x &&
+
+	configure_exclusion "$P" my-blob >h &&
+
+	GIT_TRACE=1 GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \
+	GIT_TRACE_REDACT=0 \
+	git -c protocol.version=2 \
+		-c fetch.uriprotocols=http,https \
+		clone "$HTTPD_URL/smart/http_parent" http_child &&
+
+	grep -E "clone< ..[0-9a-f]{40,64} http://" log
+'
+
 test_expect_success 'http:// --negotiate-only' '
 	SERVER="$HTTPD_DOCUMENT_ROOT_PATH/server" &&
 	URI="$HTTPD_URL/smart/server" &&

base-commit: 0785eb769886ae81e346df10e88bc49ffc0ac64e
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH v2] fetch-pack: redact packfile urls in traces
  2021-10-09  2:20 ` [PATCH v2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
@ 2021-10-11 20:39   ` Junio C Hamano
  2021-10-26 19:32     ` Ivan Frade
  2021-10-19 22:57   ` [PATCH v3] " Ivan Frade via GitGitGadget
  1 sibling, 1 reply; 43+ messages in thread
From: Junio C Hamano @ 2021-10-11 20:39 UTC (permalink / raw)
  To: Ivan Frade via GitGitGadget
  Cc: git, Ævar Arnfjörð Bjarmason, Ivan Frade

"Ivan Frade via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Ivan Frade <ifrade@google.com>
>
> In some setups, packfile uris act as bearer token. It is not
> recommended to expose them plainly in logs, although in special
> circunstances (e.g. debug) it makes sense to write them.
>
> Redact the packfile-uri lines by default, unless the GIT_TRACE_REDACT
> variable is set to false. This mimics the redacting of the
> Authorization header in HTTP.

Well explained.

It of course is a different matter if the explained idea is
agreeable, though ;-).  Hiding the entire packet, based on the "it
might be in some setups" seems a bit too much.

Is it often the case that the whole URI is sensitive, or perhaps
leading "<scheme>://<host>/pack-<abc>.pack" part is not sensitive at
all, and what follows after that "public" part has some "nonce"
material that makes it sensitive?

> Changes since v1:
> - Removed non-POSIX flags in tests
> - More accurate regex for the non-encrypted packfile line
> - Dropped documentation change
> - Dropped redacting the die message in http-fetch

These are not for those who read "git log" in 3 months, as they may
not even have seen the "v1".  But these are very helpful for those
who read the "v1" to see how good this round is.  Please write such
material below the three-dash line.

> Signed-off-by: Ivan Frade <ifrade@google.com>
> ---

i.e. here.

>     fetch-pack: redact packfile urls in traces
>     
>     In some setups, packfile uris act as bearer token. It is not recommended
>     to expose them plainly in logs, although in special circunstances (e.g.
>     debug) it makes sense to write them.
>     
>     Redact the packfile-uri lines by default, unless the GIT_TRACE_REDACT
>     variable is set to false. This mimics the redacting of the Authorization
>     header in HTTP.
>     
>     Signed-off-by: Ivan Frade ifrade@google.com
>     
>     cc: Ævar Arnfjörð Bjarmason avarab@gmail.com

And there is no need to duplicate the log message here ;-)

> diff --git a/fetch-pack.c b/fetch-pack.c
> index a9604f35a3e..05c85eeafa1 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -1518,7 +1518,16 @@ static void receive_wanted_refs(struct packet_reader *reader,
>  static void receive_packfile_uris(struct packet_reader *reader,
>  				  struct string_list *uris)
>  {
> +	int original_options;
>  	process_section_header(reader, "packfile-uris", 0);
> +	/*
> +	 * In some setups, packfile-uris act as bearer tokens,
> +	 * redact them by default.
> +	 */
> +	original_options = reader->options;
> +	if (git_env_bool("GIT_TRACE_REDACT", 1))
> +		reader->options |= PACKET_READ_REDACT_ON_TRACE;
> +
>  	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
>  		if (reader->pktlen < the_hash_algo->hexsz ||
>  		    reader->line[the_hash_algo->hexsz] != ' ')
> @@ -1526,6 +1535,8 @@ static void receive_packfile_uris(struct packet_reader *reader,
>  
>  		string_list_append(uris, reader->line);
>  	}
> +	reader->options = original_options;

So "original_options" is used to save away the reader->options so
that it can be restored before returning to our caller?  

OK (it may be more common in this codebase to call such a variable
"saved_X", though).

> diff --git a/pkt-line.c b/pkt-line.c
> index de4a94b437e..8da8ed88ccf 100644
> --- a/pkt-line.c
> +++ b/pkt-line.c
> @@ -443,7 +443,12 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
>  		len--;
>  
>  	buffer[len] = 0;
> -	packet_trace(buffer, len, 0);
> +	if (options & PACKET_READ_REDACT_ON_TRACE) {
> +		const char *redacted = "<redacted>";
> +		packet_trace(redacted, strlen(redacted), 0);
> +	} else {
> +		packet_trace(buffer, len, 0);
> +	}
> ...
> +	GIT_TRACE=1 GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \
> +	git -c protocol.version=2 \
> +		-c fetch.uriprotocols=http,https \
> +		clone "$HTTPD_URL/smart/http_parent" http_child &&
> +
> +	grep "clone< <redacted>" log

This checks only that "redacted" string appears, but what the theme
of the change really cares about is different, no?  You want to
ensure that no sensitive substring of the URI appears in the log.

Imagine somebody breaking the redact logic by making it prepend that
string to the payload, instead of replacing the payload with that
string---this test will not catch such a regression.

Thanks.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH v3] fetch-pack: redact packfile urls in traces
  2021-10-09  2:20 ` [PATCH v2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
  2021-10-11 20:39   ` Junio C Hamano
@ 2021-10-19 22:57   ` Ivan Frade via GitGitGadget
  2021-10-20 11:41     ` Ævar Arnfjörð Bjarmason
  2021-10-26 22:49     ` [PATCH v4 0/2] " Ivan Frade via GitGitGadget
  1 sibling, 2 replies; 43+ messages in thread
From: Ivan Frade via GitGitGadget @ 2021-10-19 22:57 UTC (permalink / raw)
  To: git; +Cc: Ivan Frade, Ivan Frade

From: Ivan Frade <ifrade@google.com>

In some setups, packfile uris act as bearer token. It is not
recommended to expose them plainly in logs, although in special
circunstances (e.g. debug) it makes sense to write them.

Redact the packfile URL paths by default, unless the GIT_TRACE_REDACT
variable is set to false. This mimics the redacting of the Authorization
header in HTTP.

Signed-off-by: Ivan Frade <ifrade@google.com>
---
    fetch-pack: redact packfile urls in traces
    
    Changes since v1:
    
     * Redact only the path of the URL
     * Test are now strict, validating the exact line expected in the log
    
    Changes since v1:
    
     * Removed non-POSIX flags in tests
     * More accurate regex for the non-encrypted packfile line
     * Dropped documentation change
     * Dropped redacting the die message in http-fetch

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1052%2Fifradeo%2Fredact-packfile-uri-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1052/ifradeo/redact-packfile-uri-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1052

Range-diff vs v2:

 1:  701cb7a6ab9 ! 1:  9afe0093af4 fetch-pack: redact packfile urls in traces
     @@ Commit message
          recommended to expose them plainly in logs, although in special
          circunstances (e.g. debug) it makes sense to write them.
      
     -    Redact the packfile-uri lines by default, unless the GIT_TRACE_REDACT
     -    variable is set to false. This mimics the redacting of the
     -    Authorization header in HTTP.
     -
     -    Changes since v1:
     -    - Removed non-POSIX flags in tests
     -    - More accurate regex for the non-encrypted packfile line
     -    - Dropped documentation change
     -    - Dropped redacting the die message in http-fetch
     +    Redact the packfile URL paths by default, unless the GIT_TRACE_REDACT
     +    variable is set to false. This mimics the redacting of the Authorization
     +    header in HTTP.
      
          Signed-off-by: Ivan Frade <ifrade@google.com>
      
     @@ fetch-pack.c: static void receive_wanted_refs(struct packet_reader *reader,
       static void receive_packfile_uris(struct packet_reader *reader,
       				  struct string_list *uris)
       {
     -+	int original_options;
     ++	int saved_options;
       	process_section_header(reader, "packfile-uris", 0);
      +	/*
      +	 * In some setups, packfile-uris act as bearer tokens,
      +	 * redact them by default.
      +	 */
     -+	original_options = reader->options;
     ++	saved_options = reader->options;
      +	if (git_env_bool("GIT_TRACE_REDACT", 1))
     -+		reader->options |= PACKET_READ_REDACT_ON_TRACE;
     ++		reader->options |= PACKET_READ_REDACT_URL_PATH;
      +
       	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
       		if (reader->pktlen < the_hash_algo->hexsz ||
     @@ fetch-pack.c: static void receive_packfile_uris(struct packet_reader *reader,
       
       		string_list_append(uris, reader->line);
       	}
     -+	reader->options = original_options;
     ++	reader->options = saved_options;
      +
       	if (reader->status != PACKET_READ_DELIM)
       		die("expected DELIM");
       }
      
       ## pkt-line.c ##
     +@@ pkt-line.c: int packet_length(const char lenbuf_hex[4])
     + 	return (val < 0) ? val : (val << 8) | hex2chr(lenbuf_hex + 2);
     + }
     + 
     ++static int find_url_path_start(const char* buffer)
     ++{
     ++	const char *URL_MARK = "://";
     ++	char *p = strstr(buffer, URL_MARK);
     ++	if (!p) {
     ++		return -1;
     ++	}
     ++
     ++	p += strlen(URL_MARK);
     ++	while (*p && *p != '/')
     ++		p++;
     ++
     ++	// Position after '/'
     ++	if (*p && *(p + 1))
     ++		return (p + 1) - buffer;
     ++
     ++	return -1;
     ++}
     ++
     + enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
     + 						size_t *src_len, char *buffer,
     + 						unsigned size, int *pktlen,
     +@@ pkt-line.c: enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
     + {
     + 	int len;
     + 	char linelen[4];
     ++	int url_path_start;
     + 
     + 	if (get_packet_data(fd, src_buffer, src_len, linelen, 4, options) < 0) {
     + 		*pktlen = -1;
      @@ pkt-line.c: enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
       		len--;
       
       	buffer[len] = 0;
      -	packet_trace(buffer, len, 0);
     -+	if (options & PACKET_READ_REDACT_ON_TRACE) {
     ++	if (options & PACKET_READ_REDACT_URL_PATH &&
     ++	    (url_path_start = find_url_path_start(buffer)) != -1) {
      +		const char *redacted = "<redacted>";
     -+		packet_trace(redacted, strlen(redacted), 0);
     ++		struct strbuf tracebuf = STRBUF_INIT;
     ++		strbuf_insert(&tracebuf, 0, buffer, len);
     ++		strbuf_splice(&tracebuf, url_path_start,
     ++			      len - url_path_start, redacted, strlen(redacted));
     ++		packet_trace(tracebuf.buf, tracebuf.len, 0);
     ++		strbuf_release(&tracebuf);
      +	} else {
      +		packet_trace(buffer, len, 0);
      +	}
     @@ pkt-line.h: void packet_fflush(FILE *f);
       #define PACKET_READ_CHOMP_NEWLINE        (1u<<1)
       #define PACKET_READ_DIE_ON_ERR_PACKET    (1u<<2)
       #define PACKET_READ_GENTLE_ON_READ_ERROR (1u<<3)
     -+#define PACKET_READ_REDACT_ON_TRACE      (1u<<4)
     ++#define PACKET_READ_REDACT_URL_PATH      (1u<<4)
       int packet_read(int fd, char **src_buffer, size_t *src_len, char
       		*buffer, unsigned size, int options);
       
     @@ t/t5702-protocol-v2.sh: test_expect_success 'packfile-uri with transfer.fsckobje
       	test_i18ngrep "disallowed submodule name" err
       '
       
     -+test_expect_success 'packfile-uri redacted in trace' '
     ++test_expect_success 'packfile-uri path redacted in trace' '
      +	P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
      +	rm -rf "$P" http_child log &&
      +
     @@ t/t5702-protocol-v2.sh: test_expect_success 'packfile-uri with transfer.fsckobje
      +	git -C "$P" add my-blob &&
      +	git -C "$P" commit -m x &&
      +
     -+	configure_exclusion "$P" my-blob >h &&
     ++	git -C "$P" hash-object my-blob >objh &&
     ++	git -C "$P" pack-objects "$HTTPD_DOCUMENT_ROOT_PATH/mypack" <objh >packh &&
     ++	git -C "$P" config --add \
     ++		"uploadpack.blobpackfileuri" \
     ++		"$(cat objh) $(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" &&
      +
      +	GIT_TRACE=1 GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \
      +	git -c protocol.version=2 \
      +		-c fetch.uriprotocols=http,https \
      +		clone "$HTTPD_URL/smart/http_parent" http_child &&
      +
     -+	grep "clone< <redacted>" log
     ++	grep -F "clone< \\1$(cat packh) $HTTPD_URL/<redacted>" log
      +'
      +
     -+test_expect_success 'packfile-uri not redacted in trace when GIT_TRACE_REDACT=0' '
     ++test_expect_success 'packfile-uri path not redacted in trace when GIT_TRACE_REDACT=0' '
      +	P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
      +	rm -rf "$P" http_child log &&
      +
     @@ t/t5702-protocol-v2.sh: test_expect_success 'packfile-uri with transfer.fsckobje
      +	git -C "$P" add my-blob &&
      +	git -C "$P" commit -m x &&
      +
     -+	configure_exclusion "$P" my-blob >h &&
     ++	git -C "$P" hash-object my-blob >objh &&
     ++	git -C "$P" pack-objects "$HTTPD_DOCUMENT_ROOT_PATH/mypack" <objh >packh &&
     ++	git -C "$P" config --add \
     ++		"uploadpack.blobpackfileuri" \
     ++		"$(cat objh) $(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" &&
      +
      +	GIT_TRACE=1 GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \
      +	GIT_TRACE_REDACT=0 \
     @@ t/t5702-protocol-v2.sh: test_expect_success 'packfile-uri with transfer.fsckobje
      +		-c fetch.uriprotocols=http,https \
      +		clone "$HTTPD_URL/smart/http_parent" http_child &&
      +
     -+	grep -E "clone< ..[0-9a-f]{40,64} http://" log
     ++	grep -F "clone< \\1$(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" log
      +'
      +
       test_expect_success 'http:// --negotiate-only' '


 fetch-pack.c           | 11 +++++++++
 pkt-line.c             | 33 ++++++++++++++++++++++++++-
 pkt-line.h             |  1 +
 t/t5702-protocol-v2.sh | 51 ++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 95 insertions(+), 1 deletion(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index a9604f35a3e..1587b9ae662 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1518,7 +1518,16 @@ static void receive_wanted_refs(struct packet_reader *reader,
 static void receive_packfile_uris(struct packet_reader *reader,
 				  struct string_list *uris)
 {
+	int saved_options;
 	process_section_header(reader, "packfile-uris", 0);
+	/*
+	 * In some setups, packfile-uris act as bearer tokens,
+	 * redact them by default.
+	 */
+	saved_options = reader->options;
+	if (git_env_bool("GIT_TRACE_REDACT", 1))
+		reader->options |= PACKET_READ_REDACT_URL_PATH;
+
 	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
 		if (reader->pktlen < the_hash_algo->hexsz ||
 		    reader->line[the_hash_algo->hexsz] != ' ')
@@ -1526,6 +1535,8 @@ static void receive_packfile_uris(struct packet_reader *reader,
 
 		string_list_append(uris, reader->line);
 	}
+	reader->options = saved_options;
+
 	if (reader->status != PACKET_READ_DELIM)
 		die("expected DELIM");
 }
diff --git a/pkt-line.c b/pkt-line.c
index de4a94b437e..1a9e6870559 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -386,6 +386,25 @@ int packet_length(const char lenbuf_hex[4])
 	return (val < 0) ? val : (val << 8) | hex2chr(lenbuf_hex + 2);
 }
 
+static int find_url_path_start(const char* buffer)
+{
+	const char *URL_MARK = "://";
+	char *p = strstr(buffer, URL_MARK);
+	if (!p) {
+		return -1;
+	}
+
+	p += strlen(URL_MARK);
+	while (*p && *p != '/')
+		p++;
+
+	// Position after '/'
+	if (*p && *(p + 1))
+		return (p + 1) - buffer;
+
+	return -1;
+}
+
 enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 						size_t *src_len, char *buffer,
 						unsigned size, int *pktlen,
@@ -393,6 +412,7 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 {
 	int len;
 	char linelen[4];
+	int url_path_start;
 
 	if (get_packet_data(fd, src_buffer, src_len, linelen, 4, options) < 0) {
 		*pktlen = -1;
@@ -443,7 +463,18 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 		len--;
 
 	buffer[len] = 0;
-	packet_trace(buffer, len, 0);
+	if (options & PACKET_READ_REDACT_URL_PATH &&
+	    (url_path_start = find_url_path_start(buffer)) != -1) {
+		const char *redacted = "<redacted>";
+		struct strbuf tracebuf = STRBUF_INIT;
+		strbuf_insert(&tracebuf, 0, buffer, len);
+		strbuf_splice(&tracebuf, url_path_start,
+			      len - url_path_start, redacted, strlen(redacted));
+		packet_trace(tracebuf.buf, tracebuf.len, 0);
+		strbuf_release(&tracebuf);
+	} else {
+		packet_trace(buffer, len, 0);
+	}
 
 	if ((options & PACKET_READ_DIE_ON_ERR_PACKET) &&
 	    starts_with(buffer, "ERR "))
diff --git a/pkt-line.h b/pkt-line.h
index 82b95e4bdd3..853d20688c8 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -88,6 +88,7 @@ void packet_fflush(FILE *f);
 #define PACKET_READ_CHOMP_NEWLINE        (1u<<1)
 #define PACKET_READ_DIE_ON_ERR_PACKET    (1u<<2)
 #define PACKET_READ_GENTLE_ON_READ_ERROR (1u<<3)
+#define PACKET_READ_REDACT_URL_PATH      (1u<<4)
 int packet_read(int fd, char **src_buffer, size_t *src_len, char
 		*buffer, unsigned size, int options);
 
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index d527cf6c49f..f01af2f2ed3 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -1107,6 +1107,57 @@ test_expect_success 'packfile-uri with transfer.fsckobjects fails when .gitmodul
 	test_i18ngrep "disallowed submodule name" err
 '
 
+test_expect_success 'packfile-uri path redacted in trace' '
+	P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	rm -rf "$P" http_child log &&
+
+	git init "$P" &&
+	git -C "$P" config "uploadpack.allowsidebandall" "true" &&
+
+	echo my-blob >"$P/my-blob" &&
+	git -C "$P" add my-blob &&
+	git -C "$P" commit -m x &&
+
+	git -C "$P" hash-object my-blob >objh &&
+	git -C "$P" pack-objects "$HTTPD_DOCUMENT_ROOT_PATH/mypack" <objh >packh &&
+	git -C "$P" config --add \
+		"uploadpack.blobpackfileuri" \
+		"$(cat objh) $(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" &&
+
+	GIT_TRACE=1 GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \
+	git -c protocol.version=2 \
+		-c fetch.uriprotocols=http,https \
+		clone "$HTTPD_URL/smart/http_parent" http_child &&
+
+	grep -F "clone< \\1$(cat packh) $HTTPD_URL/<redacted>" log
+'
+
+test_expect_success 'packfile-uri path not redacted in trace when GIT_TRACE_REDACT=0' '
+	P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	rm -rf "$P" http_child log &&
+
+	git init "$P" &&
+	git -C "$P" config "uploadpack.allowsidebandall" "true" &&
+
+	echo my-blob >"$P/my-blob" &&
+	git -C "$P" add my-blob &&
+	git -C "$P" commit -m x &&
+
+	git -C "$P" hash-object my-blob >objh &&
+	git -C "$P" pack-objects "$HTTPD_DOCUMENT_ROOT_PATH/mypack" <objh >packh &&
+	git -C "$P" config --add \
+		"uploadpack.blobpackfileuri" \
+		"$(cat objh) $(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" &&
+
+	GIT_TRACE=1 GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \
+	GIT_TRACE_REDACT=0 \
+	git -c protocol.version=2 \
+		-c fetch.uriprotocols=http,https \
+		clone "$HTTPD_URL/smart/http_parent" http_child &&
+
+	grep -F "clone< \\1$(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" log
+'
+
 test_expect_success 'http:// --negotiate-only' '
 	SERVER="$HTTPD_DOCUMENT_ROOT_PATH/server" &&
 	URI="$HTTPD_URL/smart/server" &&

base-commit: 9d530dc0024503ab4218fe6c4395b8a0aa245478
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH v3] fetch-pack: redact packfile urls in traces
  2021-10-19 22:57   ` [PATCH v3] " Ivan Frade via GitGitGadget
@ 2021-10-20 11:41     ` Ævar Arnfjörð Bjarmason
  2021-10-26 22:49     ` [PATCH v4 0/2] " Ivan Frade via GitGitGadget
  1 sibling, 0 replies; 43+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-20 11:41 UTC (permalink / raw)
  To: Ivan Frade via GitGitGadget; +Cc: git, Ivan Frade


On Tue, Oct 19 2021, Ivan Frade via GitGitGadget wrote:

> From: Ivan Frade <ifrade@google.com>
>
> In some setups, packfile uris act as bearer token. It is not
> recommended to expose them plainly in logs, although in special
> circunstances (e.g. debug) it makes sense to write them.
>
> Redact the packfile URL paths by default, unless the GIT_TRACE_REDACT
> variable is set to false. This mimics the redacting of the Authorization
> header in HTTP.
>
> Signed-off-by: Ivan Frade <ifrade@google.com>
> ---
>     fetch-pack: redact packfile urls in traces
>     
>     Changes since v1:

Just context for other reviewers:

s/Changes since v1/Changes since v2/ I see, from the context of
https://lore.kernel.org/git/pull.1052.v2.git.1633746024175.gitgitgadget@gmail.com/

>      * Redact only the path of the URL
>      * Test are now strict, validating the exact line expected in the log

And this was changed in v2...

>     Changes since v1:
>     
>      * Removed non-POSIX flags in tests
>      * More accurate regex for the non-encrypted packfile line

[...]

>      * Dropped documentation change
>      * Dropped redacting the die message in http-fetch

Since both of those were done I think in response to my feedback I just
want to clarify (if needed):

 * That the documentation change is still good to have, although I had
   feedback on fixing that more generally in the protocol v2 docs. It
   would be great if you could still pursue it (and that I didn't
   discourage you from doing so).

 * I think having this redaction in die() could still be valuable,
   e.g. your packfile-uri's start failing, and now users are
   copy/pasting "private" URLs that contain their passwords or whatever
   to try to get help, that would be bad.   

   But perhaps if you don't have private URLs redacting them
   unconditionally would slow down debugging for some, i.e. you have 10x
   pasted URLs, and all the errors are from one set of servers (although
   your current redaction includes the hostname, which I think would
   address most cases of say one CDN node failing).

   It was really just a comment that your v1's commit message didn't
   mention or justify it, but just having it make a mention of it would
   also be an OK solution, or fold that into another patch or
   whatever...

>  {
> +	int saved_options;
>  	process_section_header(reader, "packfile-uris", 0);
> +	/*
> +	 * In some setups, packfile-uris act as bearer tokens,
> +	 * redact them by default.
> +	 */
> +	saved_options = reader->options;

nit: no need to pre-declare "int saved_options" here, just move this &
the comment above "process_section_header" (in this case I'd say a
comment isn't even needed, obvious from context...), or it should be at
the definition of PACKET_READ_REDACT_URL_PATH... (more below)

> +	if (git_env_bool("GIT_TRACE_REDACT", 1))

If we're going to use GIT_TRACE_REDACT for this the documentation needs updating:

Documentation/git.txt:`GIT_TRACE_REDACT`::
Documentation/git.txt-  By default, when tracing is activated, Git redacts the values of
Documentation/git.txt-  cookies, the "Authorization:" header, and the "Proxy-Authorization:"
Documentation/git.txt-  header. Set this variable to `0` to prevent this redaction.

> +		reader->options |= PACKET_READ_REDACT_URL_PATH;

(continued)... but that was from a really narrow reading of the code, I
think this whole flip-flopping of options back and forth isn't needed at
all, and you should just assign this flag at the top of
do_fetch_pack_v2(), no?  The setting of it also looks like it belongs
with the reading of "GIT_TEST_SIDEBAND_ALL".

I.e. nothing else uses PACKET_READ_REDACT_URL_PATH, why do we need to be
flipping it back & forth? Keeping these flags in the "reader" is what
that member is for, isn't it? Maybe I'm missing something.

> +static int find_url_path_start(const char* buffer)
> +{
> +	const char *URL_MARK = "://";
> +	char *p = strstr(buffer, URL_MARK);
> +	if (!p) {
> +		return -1;
> +	}
> +
> +	p += strlen(URL_MARK);
> +	while (*p && *p != '/')
> +		p++;
> +
> +	// Position after '/'
> +	if (*p && *(p + 1))
> +		return (p + 1) - buffer;
> +
> +	return -1;
> +}

I think that packfile URI only supports http:// and https://, not
file:// or whatever, so I wonder if either curl or we have a helper
function for this that we can use....(more below)

>  enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
>  						size_t *src_len, char *buffer,
>  						unsigned size, int *pktlen,
> @@ -393,6 +412,7 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
>  {
>  	int len;
>  	char linelen[4];
> +	int url_path_start;
>  
>  	if (get_packet_data(fd, src_buffer, src_len, linelen, 4, options) < 0) {
>  		*pktlen = -1;
> @@ -443,7 +463,18 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
>  		len--;
>  
>  	buffer[len] = 0;
> -	packet_trace(buffer, len, 0);
> +	if (options & PACKET_READ_REDACT_URL_PATH &&
> +	    (url_path_start = find_url_path_start(buffer)) != -1) {
> +		const char *redacted = "<redacted>";
> +		struct strbuf tracebuf = STRBUF_INIT;
> +		strbuf_insert(&tracebuf, 0, buffer, len);
> +		strbuf_splice(&tracebuf, url_path_start,
> +			      len - url_path_start, redacted, strlen(redacted));
> +		packet_trace(tracebuf.buf, tracebuf.len, 0);
> +		strbuf_release(&tracebuf);
> +	} else {
> +		packet_trace(buffer, len, 0);
> +	}

...If we're redacting the URL isn't (and this might be less code with a
helper function) saying:

    failed to get 'https' url from 'somehost.example.com' (full URL redacted due to XYZ setting)

Friendlier than something like (which this function sets up):

    failed to get 'https://somehost.example.com/<redacted>' url

I.e. it allows us to use a get_schema_from_url() and get_host_from_url()
functions (I don't know if/where we have those, but seems likelier than
"find path url boundary" (or maybe I'm wrong and we always feed those
as-is to curl et al).

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2] fetch-pack: redact packfile urls in traces
  2021-10-11 20:39   ` Junio C Hamano
@ 2021-10-26 19:32     ` Ivan Frade
  0 siblings, 0 replies; 43+ messages in thread
From: Ivan Frade @ 2021-10-26 19:32 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ivan Frade via GitGitGadget, git,
	Ævar Arnfjörð Bjarmason

It seems I sent my original reply only to the github PR. Sorry for the
confusion:

On Mon, Oct 11, 2021 at 1:39 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Ivan Frade via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Ivan Frade <ifrade@google.com>
...
>
> It of course is a different matter if the explained idea is
> agreeable, though ;-).  Hiding the entire packet, based on the "it
> might be in some setups" seems a bit too much.
>
> Is it often the case that the whole URI is sensitive, or perhaps
> leading "<scheme>://<host>/pack-<abc>.pack" part is not sensitive at
> all, and what follows after that "public" part has some "nonce"
> material that makes it sensitive?

In the specific case I am working on, the path of the URL is an
encrypted string that shouldn't be completely exposed (exposing part
of it would be fine). In general, I think we can assume that
<scheme>://<host>/ are always "public" but the path could be
sensitive.

We could redact only the path (<scheme>://<host>/REDACTED), or even a
fixed length of the URL? (<scheme>://<host>/pack-<xxREDACTED).

In the next patch version I go with redacting the path.


> > Changes since v1:
...
>  Please write such material below the three-dash line.
Done

> And there is no need to duplicate the log message here ;-)
Done

> So "original_options" is used to save away the reader->options so
> that it can be restored before returning to our caller?
>
> OK (it may be more common in this codebase to call such a variable
> "saved_X", though).

In the latest iteration, the option is enabled for all sections and
there is no need to set/unset the flag.

> > +     grep "clone< <redacted>" log
>
> This checks only that "redacted" string appears, but what the theme
> of the change really cares about is different, no?  You want to
> ensure that no sensitive substring of the URI appears in the log.
>
> Imagine somebody breaking the redact logic by making it prepend that
> string to the payload, instead of replacing the payload with that
> string---this test will not catch such a regression.

Now the tests verify the expected packfile-uri full line is in the log.

Thanks,

Ivan

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH v4 0/2] fetch-pack: redact packfile urls in traces
  2021-10-19 22:57   ` [PATCH v3] " Ivan Frade via GitGitGadget
  2021-10-20 11:41     ` Ævar Arnfjörð Bjarmason
@ 2021-10-26 22:49     ` Ivan Frade via GitGitGadget
  2021-10-26 22:49       ` [PATCH v4 1/2] " Ivan Frade via GitGitGadget
                         ` (2 more replies)
  1 sibling, 3 replies; 43+ messages in thread
From: Ivan Frade via GitGitGadget @ 2021-10-26 22:49 UTC (permalink / raw)
  To: git; +Cc: Ivan Frade

Changes since v3:

 * Enable redacting URLs for all sections
 * Redact only URL path (it was until the end of line)
 * Redact URL in die() with more friendly message
 * Update doc to mention that packfile URIs are also redacted.

Changes since v2:

 * Redact only the path of the URL
 * Test are now strict, validating the exact line expected in the log

Changes since v1:

 * Removed non-POSIX flags in tests
 * More accurate regex for the non-encrypted packfile line
 * Dropped documentation change
 * Dropped redacting the die message in http-fetch

Ivan Frade (2):
  fetch-pack: redact packfile urls in traces
  http-fetch: redact url on die() message

 Documentation/git.txt  |  5 +++--
 fetch-pack.c           |  3 +++
 http-fetch.c           | 15 +++++++++++--
 pkt-line.c             | 40 ++++++++++++++++++++++++++++++++-
 pkt-line.h             |  1 +
 t/t5702-protocol-v2.sh | 51 ++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 110 insertions(+), 5 deletions(-)


base-commit: e9e5ba39a78c8f5057262d49e261b42a8660d5b9
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1052%2Fifradeo%2Fredact-packfile-uri-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1052/ifradeo/redact-packfile-uri-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1052

Range-diff vs v3:

 1:  9afe0093af4 ! 1:  973a250752c fetch-pack: redact packfile urls in traces
     @@ Commit message
      
          Signed-off-by: Ivan Frade <ifrade@google.com>
      
     - ## fetch-pack.c ##
     -@@ fetch-pack.c: static void receive_wanted_refs(struct packet_reader *reader,
     - static void receive_packfile_uris(struct packet_reader *reader,
     - 				  struct string_list *uris)
     - {
     -+	int saved_options;
     - 	process_section_header(reader, "packfile-uris", 0);
     -+	/*
     -+	 * In some setups, packfile-uris act as bearer tokens,
     -+	 * redact them by default.
     -+	 */
     -+	saved_options = reader->options;
     -+	if (git_env_bool("GIT_TRACE_REDACT", 1))
     -+		reader->options |= PACKET_READ_REDACT_URL_PATH;
     -+
     - 	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
     - 		if (reader->pktlen < the_hash_algo->hexsz ||
     - 		    reader->line[the_hash_algo->hexsz] != ' ')
     -@@ fetch-pack.c: static void receive_packfile_uris(struct packet_reader *reader,
     + ## Documentation/git.txt ##
     +@@ Documentation/git.txt: for full details.
       
     - 		string_list_append(uris, reader->line);
     + `GIT_TRACE_REDACT`::
     + 	By default, when tracing is activated, Git redacts the values of
     +-	cookies, the "Authorization:" header, and the "Proxy-Authorization:"
     +-	header. Set this variable to `0` to prevent this redaction.
     ++	cookies, the "Authorization:" header, the "Proxy-Authorization:"
     ++	header and packfile URLs. Set this variable to `0` to prevent this
     ++	redaction.
     + 
     + `GIT_LITERAL_PATHSPECS`::
     + 	Setting this variable to `1` will cause Git to treat all
     +
     + ## fetch-pack.c ##
     +@@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
     + 		reader.me = "fetch-pack";
       	}
     -+	reader->options = saved_options;
     + 
     ++	if (git_env_bool("GIT_TRACE_REDACT", 1))
     ++		reader.options |= PACKET_READ_REDACT_URL_PATH;
      +
     - 	if (reader->status != PACKET_READ_DELIM)
     - 		die("expected DELIM");
     - }
     + 	while (state != FETCH_DONE) {
     + 		switch (state) {
     + 		case FETCH_CHECK_LOCAL:
      
       ## pkt-line.c ##
      @@ pkt-line.c: int packet_length(const char lenbuf_hex[4])
       	return (val < 0) ? val : (val << 8) | hex2chr(lenbuf_hex + 2);
       }
       
     -+static int find_url_path_start(const char* buffer)
     ++static char *find_url_path(const char* buffer, int *path_len)
      +{
      +	const char *URL_MARK = "://";
     -+	char *p = strstr(buffer, URL_MARK);
     -+	if (!p) {
     -+		return -1;
     -+	}
     ++	char *path = strstr(buffer, URL_MARK);
     ++	if (!path)
     ++		return NULL;
      +
     -+	p += strlen(URL_MARK);
     -+	while (*p && *p != '/')
     -+		p++;
     ++	path += strlen(URL_MARK);
     ++	while (*path && *path != '/')
     ++		path++;
      +
     -+	// Position after '/'
     -+	if (*p && *(p + 1))
     -+		return (p + 1) - buffer;
     ++	if (!*path || !*(path + 1))
     ++		return NULL;
     ++
     ++	// position after '/'
     ++	path++;
     ++
     ++	if (path_len) {
     ++		char *url_end = strchrnul(path, ' ');
     ++		*path_len = url_end - path;
     ++	}
      +
     -+	return -1;
     ++	return path;
      +}
      +
       enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
     @@ pkt-line.c: enum packet_read_status packet_read_with_status(int fd, char **src_b
       {
       	int len;
       	char linelen[4];
     -+	int url_path_start;
     ++	char *url_path_start;
     ++	int url_path_len;
       
       	if (get_packet_data(fd, src_buffer, src_len, linelen, 4, options) < 0) {
       		*pktlen = -1;
     @@ pkt-line.c: enum packet_read_status packet_read_with_status(int fd, char **src_b
       	buffer[len] = 0;
      -	packet_trace(buffer, len, 0);
      +	if (options & PACKET_READ_REDACT_URL_PATH &&
     -+	    (url_path_start = find_url_path_start(buffer)) != -1) {
     ++	    (url_path_start = find_url_path(buffer, &url_path_len))) {
      +		const char *redacted = "<redacted>";
      +		struct strbuf tracebuf = STRBUF_INIT;
      +		strbuf_insert(&tracebuf, 0, buffer, len);
     -+		strbuf_splice(&tracebuf, url_path_start,
     -+			      len - url_path_start, redacted, strlen(redacted));
     ++		strbuf_splice(&tracebuf, url_path_start - buffer,
     ++			      url_path_len, redacted, strlen(redacted));
      +		packet_trace(tracebuf.buf, tracebuf.len, 0);
      +		strbuf_release(&tracebuf);
      +	} else {
     @@ pkt-line.h: void packet_fflush(FILE *f);
       #define PACKET_READ_DIE_ON_ERR_PACKET    (1u<<2)
       #define PACKET_READ_GENTLE_ON_READ_ERROR (1u<<3)
      +#define PACKET_READ_REDACT_URL_PATH      (1u<<4)
     - int packet_read(int fd, char **src_buffer, size_t *src_len, char
     - 		*buffer, unsigned size, int options);
     + int packet_read(int fd, char *buffer, unsigned size, int options);
       
     + /*
      
       ## t/t5702-protocol-v2.sh ##
      @@ t/t5702-protocol-v2.sh: test_expect_success 'packfile-uri with transfer.fsckobjects fails when .gitmodul
 -:  ----------- > 2:  c7f0977cabd http-fetch: redact url on die() message

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH v4 1/2] fetch-pack: redact packfile urls in traces
  2021-10-26 22:49     ` [PATCH v4 0/2] " Ivan Frade via GitGitGadget
@ 2021-10-26 22:49       ` Ivan Frade via GitGitGadget
  2021-10-28  1:01         ` Junio C Hamano
  2021-10-26 22:49       ` [PATCH v4 2/2] http-fetch: redact url on die() message Ivan Frade via GitGitGadget
  2021-10-28 22:51       ` [PATCH v5 0/2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
  2 siblings, 1 reply; 43+ messages in thread
From: Ivan Frade via GitGitGadget @ 2021-10-26 22:49 UTC (permalink / raw)
  To: git; +Cc: Ivan Frade, Ivan Frade

From: Ivan Frade <ifrade@google.com>

In some setups, packfile uris act as bearer token. It is not
recommended to expose them plainly in logs, although in special
circunstances (e.g. debug) it makes sense to write them.

Redact the packfile URL paths by default, unless the GIT_TRACE_REDACT
variable is set to false. This mimics the redacting of the Authorization
header in HTTP.

Signed-off-by: Ivan Frade <ifrade@google.com>
---
 Documentation/git.txt  |  5 +++--
 fetch-pack.c           |  3 +++
 pkt-line.c             | 40 ++++++++++++++++++++++++++++++++-
 pkt-line.h             |  1 +
 t/t5702-protocol-v2.sh | 51 ++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 97 insertions(+), 3 deletions(-)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index d63c65e67d8..f64c8ce5183 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -832,8 +832,9 @@ for full details.
 
 `GIT_TRACE_REDACT`::
 	By default, when tracing is activated, Git redacts the values of
-	cookies, the "Authorization:" header, and the "Proxy-Authorization:"
-	header. Set this variable to `0` to prevent this redaction.
+	cookies, the "Authorization:" header, the "Proxy-Authorization:"
+	header and packfile URLs. Set this variable to `0` to prevent this
+	redaction.
 
 `GIT_LITERAL_PATHSPECS`::
 	Setting this variable to `1` will cause Git to treat all
diff --git a/fetch-pack.c b/fetch-pack.c
index a9604f35a3e..ad8ac49ca50 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1581,6 +1581,9 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 		reader.me = "fetch-pack";
 	}
 
+	if (git_env_bool("GIT_TRACE_REDACT", 1))
+		reader.options |= PACKET_READ_REDACT_URL_PATH;
+
 	while (state != FETCH_DONE) {
 		switch (state) {
 		case FETCH_CHECK_LOCAL:
diff --git a/pkt-line.c b/pkt-line.c
index 2dc8ac274bd..ba0a2d65f0c 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -370,6 +370,31 @@ int packet_length(const char lenbuf_hex[4])
 	return (val < 0) ? val : (val << 8) | hex2chr(lenbuf_hex + 2);
 }
 
+static char *find_url_path(const char* buffer, int *path_len)
+{
+	const char *URL_MARK = "://";
+	char *path = strstr(buffer, URL_MARK);
+	if (!path)
+		return NULL;
+
+	path += strlen(URL_MARK);
+	while (*path && *path != '/')
+		path++;
+
+	if (!*path || !*(path + 1))
+		return NULL;
+
+	// position after '/'
+	path++;
+
+	if (path_len) {
+		char *url_end = strchrnul(path, ' ');
+		*path_len = url_end - path;
+	}
+
+	return path;
+}
+
 enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 						size_t *src_len, char *buffer,
 						unsigned size, int *pktlen,
@@ -377,6 +402,8 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 {
 	int len;
 	char linelen[4];
+	char *url_path_start;
+	int url_path_len;
 
 	if (get_packet_data(fd, src_buffer, src_len, linelen, 4, options) < 0) {
 		*pktlen = -1;
@@ -427,7 +454,18 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 		len--;
 
 	buffer[len] = 0;
-	packet_trace(buffer, len, 0);
+	if (options & PACKET_READ_REDACT_URL_PATH &&
+	    (url_path_start = find_url_path(buffer, &url_path_len))) {
+		const char *redacted = "<redacted>";
+		struct strbuf tracebuf = STRBUF_INIT;
+		strbuf_insert(&tracebuf, 0, buffer, len);
+		strbuf_splice(&tracebuf, url_path_start - buffer,
+			      url_path_len, redacted, strlen(redacted));
+		packet_trace(tracebuf.buf, tracebuf.len, 0);
+		strbuf_release(&tracebuf);
+	} else {
+		packet_trace(buffer, len, 0);
+	}
 
 	if ((options & PACKET_READ_DIE_ON_ERR_PACKET) &&
 	    starts_with(buffer, "ERR "))
diff --git a/pkt-line.h b/pkt-line.h
index 467ae013573..a610ecb88e8 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -87,6 +87,7 @@ void packet_fflush(FILE *f);
 #define PACKET_READ_CHOMP_NEWLINE        (1u<<1)
 #define PACKET_READ_DIE_ON_ERR_PACKET    (1u<<2)
 #define PACKET_READ_GENTLE_ON_READ_ERROR (1u<<3)
+#define PACKET_READ_REDACT_URL_PATH      (1u<<4)
 int packet_read(int fd, char *buffer, unsigned size, int options);
 
 /*
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index d527cf6c49f..f01af2f2ed3 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -1107,6 +1107,57 @@ test_expect_success 'packfile-uri with transfer.fsckobjects fails when .gitmodul
 	test_i18ngrep "disallowed submodule name" err
 '
 
+test_expect_success 'packfile-uri path redacted in trace' '
+	P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	rm -rf "$P" http_child log &&
+
+	git init "$P" &&
+	git -C "$P" config "uploadpack.allowsidebandall" "true" &&
+
+	echo my-blob >"$P/my-blob" &&
+	git -C "$P" add my-blob &&
+	git -C "$P" commit -m x &&
+
+	git -C "$P" hash-object my-blob >objh &&
+	git -C "$P" pack-objects "$HTTPD_DOCUMENT_ROOT_PATH/mypack" <objh >packh &&
+	git -C "$P" config --add \
+		"uploadpack.blobpackfileuri" \
+		"$(cat objh) $(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" &&
+
+	GIT_TRACE=1 GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \
+	git -c protocol.version=2 \
+		-c fetch.uriprotocols=http,https \
+		clone "$HTTPD_URL/smart/http_parent" http_child &&
+
+	grep -F "clone< \\1$(cat packh) $HTTPD_URL/<redacted>" log
+'
+
+test_expect_success 'packfile-uri path not redacted in trace when GIT_TRACE_REDACT=0' '
+	P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	rm -rf "$P" http_child log &&
+
+	git init "$P" &&
+	git -C "$P" config "uploadpack.allowsidebandall" "true" &&
+
+	echo my-blob >"$P/my-blob" &&
+	git -C "$P" add my-blob &&
+	git -C "$P" commit -m x &&
+
+	git -C "$P" hash-object my-blob >objh &&
+	git -C "$P" pack-objects "$HTTPD_DOCUMENT_ROOT_PATH/mypack" <objh >packh &&
+	git -C "$P" config --add \
+		"uploadpack.blobpackfileuri" \
+		"$(cat objh) $(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" &&
+
+	GIT_TRACE=1 GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \
+	GIT_TRACE_REDACT=0 \
+	git -c protocol.version=2 \
+		-c fetch.uriprotocols=http,https \
+		clone "$HTTPD_URL/smart/http_parent" http_child &&
+
+	grep -F "clone< \\1$(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" log
+'
+
 test_expect_success 'http:// --negotiate-only' '
 	SERVER="$HTTPD_DOCUMENT_ROOT_PATH/server" &&
 	URI="$HTTPD_URL/smart/server" &&
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v4 2/2] http-fetch: redact url on die() message
  2021-10-26 22:49     ` [PATCH v4 0/2] " Ivan Frade via GitGitGadget
  2021-10-26 22:49       ` [PATCH v4 1/2] " Ivan Frade via GitGitGadget
@ 2021-10-26 22:49       ` Ivan Frade via GitGitGadget
  2021-10-28 16:39         ` Ævar Arnfjörð Bjarmason
  2021-10-28 22:51       ` [PATCH v5 0/2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
  2 siblings, 1 reply; 43+ messages in thread
From: Ivan Frade via GitGitGadget @ 2021-10-26 22:49 UTC (permalink / raw)
  To: git; +Cc: Ivan Frade, Ivan Frade

From: Ivan Frade <ifrade@google.com>

http-fetch prints the URL after failing to fetch it. This can be
confusing to users (they cannot really do anything with it) but even
worse, they can share by accident a sensitive URL (e.g. with
credentials) while looking for help.

Redact the URL unless the GIT_TRACE_REDACT variable is set to false. This
mimics the redaction of other sensitive information in git, like the
Authorization header in HTTP.

Signed-off-by: Ivan Frade <ifrade@google.com>
---
 http-fetch.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/http-fetch.c b/http-fetch.c
index fa642462a9e..bbe09a6ad9f 100644
--- a/http-fetch.c
+++ b/http-fetch.c
@@ -4,6 +4,7 @@
 #include "http.h"
 #include "walker.h"
 #include "strvec.h"
+#include "urlmatch.h"
 
 static const char http_fetch_usage[] = "git http-fetch "
 "[-c] [-t] [-a] [-v] [--recover] [-w ref] [--stdin | --packfile=hash | commit-id] url";
@@ -63,8 +64,18 @@ static void fetch_single_packfile(struct object_id *packfile_hash,
 	if (start_active_slot(preq->slot)) {
 		run_active_slot(preq->slot);
 		if (results.curl_result != CURLE_OK) {
-			die("Unable to get pack file %s\n%s", preq->url,
-			    curl_errorstr);
+			struct url_info url;
+			char *nurl = url_normalize(preq->url, &url);
+			if (!git_env_bool("GIT_TRACE_REDACT", 1) || !nurl) {
+				die("Unable to get pack file %s\n%s", preq->url,
+				    curl_errorstr);
+			} else {
+				char *schema = xstrndup(url.url, url.scheme_len);
+				char *host = xstrndup(&url.url[url.host_off], url.host_len);
+				die("failed to get '%s' url from '%s' "
+				    "(full URL redacted due to GIT_TRACE_REDACT setting)\n%s",
+				    schema, host, curl_errorstr);
+			}
 		}
 	} else {
 		die("Unable to start request");
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 1/2] fetch-pack: redact packfile urls in traces
  2021-10-26 22:49       ` [PATCH v4 1/2] " Ivan Frade via GitGitGadget
@ 2021-10-28  1:01         ` Junio C Hamano
  2021-10-28 22:15           ` Ivan Frade
  0 siblings, 1 reply; 43+ messages in thread
From: Junio C Hamano @ 2021-10-28  1:01 UTC (permalink / raw)
  To: Ivan Frade via GitGitGadget; +Cc: git, Ivan Frade, Jonathan Tan

"Ivan Frade via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Ivan Frade <ifrade@google.com>
>
> In some setups, packfile uris act as bearer token. It is not
> recommended to expose them plainly in logs, although in special
> circunstances (e.g. debug) it makes sense to write them.
>
> Redact the packfile URL paths by default, unless the GIT_TRACE_REDACT
> variable is set to false. This mimics the redacting of the Authorization
> header in HTTP.
>
> Signed-off-by: Ivan Frade <ifrade@google.com>
> ---
>  Documentation/git.txt  |  5 +++--
>  fetch-pack.c           |  3 +++
>  pkt-line.c             | 40 ++++++++++++++++++++++++++++++++-
>  pkt-line.h             |  1 +
>  t/t5702-protocol-v2.sh | 51 ++++++++++++++++++++++++++++++++++++++++++
>  5 files changed, 97 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/git.txt b/Documentation/git.txt
> index d63c65e67d8..f64c8ce5183 100644
> --- a/Documentation/git.txt
> +++ b/Documentation/git.txt
> @@ -832,8 +832,9 @@ for full details.
>  
>  `GIT_TRACE_REDACT`::
>  	By default, when tracing is activated, Git redacts the values of
> -	cookies, the "Authorization:" header, and the "Proxy-Authorization:"
> -	header. Set this variable to `0` to prevent this redaction.
> +	cookies, the "Authorization:" header, the "Proxy-Authorization:"
> +	header and packfile URLs. Set this variable to `0` to prevent this
> +	redaction.

Just a curiosity.  Do we call these packfile URI, or packfile URL?

> diff --git a/pkt-line.c b/pkt-line.c
> index 2dc8ac274bd..ba0a2d65f0c 100644
> --- a/pkt-line.c
> +++ b/pkt-line.c
> @@ -370,6 +370,31 @@ int packet_length(const char lenbuf_hex[4])
>  	return (val < 0) ? val : (val << 8) | hex2chr(lenbuf_hex + 2);
>  }
>  
> +static char *find_url_path(const char* buffer, int *path_len)
> +{
> +	const char *URL_MARK = "://";
> +	char *path = strstr(buffer, URL_MARK);
> +	if (!path)
> +		return NULL;

Hmph, the format we expect is "<hash> <uri>"; don't we need to
validate the leading <hash> followed by SP?

    len = strspn(buffer, "0123456789abcdefABCDEF");
    if (len != 40 || len != 64 || buffer[len] != ' ')
	return NULL; /* required "<hash> SP" not seen */
    path = strstr(buffer + len + 1, URL_MARK);

or somesuch?

> +	path += strlen(URL_MARK);

OK.

> +	while (*path && *path != '/')
> +		path++;

strchr()?

> +	if (!*path || !*(path + 1))
> +		return NULL;

OK.

> +	// position after '/'

No // comments in our codebase, please.  Unless it is a borrowed
code, that is.

> +	path++;
> +
> +	if (path_len) {
> +		char *url_end = strchrnul(path, ' ');

Is this because SP is not a valid character in packfile URI, or at
this point in the callchain it would be encoded or something?  The
format we expect is "<hash> <uri>", so we shouldn't even have to
look for SP but just redact everything to the end, no?

Apparently we are assuming that there won't be more than one such
URL-path that needs redacting in the packet, but that is perfectly
fine, as the sole goal of this helper is to identify the packfile
URI packet and redact it in the log.

> +		*path_len = url_end - path;
> +	}
> +
> +	return path;
> +}

> -	packet_trace(buffer, len, 0);
> +	if (options & PACKET_READ_REDACT_URL_PATH &&
> +	    (url_path_start = find_url_path(buffer, &url_path_len))) {
> +		const char *redacted = "<redacted>";
> +		struct strbuf tracebuf = STRBUF_INIT;
> +		strbuf_insert(&tracebuf, 0, buffer, len);
> +		strbuf_splice(&tracebuf, url_path_start - buffer,
> +			      url_path_len, redacted, strlen(redacted));
> +		packet_trace(tracebuf.buf, tracebuf.len, 0);
> +		strbuf_release(&tracebuf);

I briefly wondered if the repeated allocation (and more
fundamentally, preparing the redacted copy of packet whether we are
actually tracing the packet in the first place) is blindly wasting
the resources too much, but this only happens in the protocol header
part, so it might be OK.

Even if that is not the case, we should be able to update
fetch_pack.c::do_fetch_pack_v2() so that the REDACT_URL_PATH bit is
turned on in a much narrower region of code, right?  Enable when we
enter the GET_PACK state and drop the bit when we are done with the
packfile URI packets, or something?

Thanks for working on this.


> +	} else {
> +		packet_trace(buffer, len, 0);
> +	}

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 2/2] http-fetch: redact url on die() message
  2021-10-26 22:49       ` [PATCH v4 2/2] http-fetch: redact url on die() message Ivan Frade via GitGitGadget
@ 2021-10-28 16:39         ` Ævar Arnfjörð Bjarmason
  2021-10-28 17:25           ` Eric Sunshine
                             ` (2 more replies)
  0 siblings, 3 replies; 43+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-28 16:39 UTC (permalink / raw)
  To: Ivan Frade via GitGitGadget; +Cc: git, Ivan Frade


On Tue, Oct 26 2021, Ivan Frade via GitGitGadget wrote:

> From: Ivan Frade <ifrade@google.com>
>
> http-fetch prints the URL after failing to fetch it. This can be
> confusing to users (they cannot really do anything with it) but even
> worse, they can share by accident a sensitive URL (e.g. with
> credentials) while looking for help.
>
> Redact the URL unless the GIT_TRACE_REDACT variable is set to false. This
> mimics the redaction of other sensitive information in git, like the
> Authorization header in HTTP.
>
> Signed-off-by: Ivan Frade <ifrade@google.com>
> ---
>  http-fetch.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/http-fetch.c b/http-fetch.c
> index fa642462a9e..bbe09a6ad9f 100644
> --- a/http-fetch.c
> +++ b/http-fetch.c
> @@ -4,6 +4,7 @@
>  #include "http.h"
>  #include "walker.h"
>  #include "strvec.h"
> +#include "urlmatch.h"
>  
>  static const char http_fetch_usage[] = "git http-fetch "
>  "[-c] [-t] [-a] [-v] [--recover] [-w ref] [--stdin | --packfile=hash | commit-id] url";
> @@ -63,8 +64,18 @@ static void fetch_single_packfile(struct object_id *packfile_hash,
>  	if (start_active_slot(preq->slot)) {
>  		run_active_slot(preq->slot);
>  		if (results.curl_result != CURLE_OK) {
> -			die("Unable to get pack file %s\n%s", preq->url,
> -			    curl_errorstr);
> +			struct url_info url;
> +			char *nurl = url_normalize(preq->url, &url);
> +			if (!git_env_bool("GIT_TRACE_REDACT", 1) || !nurl) {
> +				die("Unable to get pack file %s\n%s", preq->url,
> +				    curl_errorstr);

small nit: arrange if's from "if (cheap || expensive)", i.e. no need for
getenv() if !nurl, but maybe compilers are smart enough for that...

nit: die() messages should start with lower-case (in CodingGuidelines), and I think it's better to quote both, so:

    die("unable to get pack '%s': '%s'", ...)

Or maybe without the second '%s', as in 3e8084f1884 (http: check
CURLE_SSL_PINNEDPUBKEYNOTMATCH when emitting errors, 2021-09-24) (which
I authored, but just copy/pasted the convention in the surrounding
code)>

> +			} else {
> +				char *schema = xstrndup(url.url, url.scheme_len);
> +				char *host = xstrndup(&url.url[url.host_off], url.host_len);
> +				die("failed to get '%s' url from '%s' "
> +				    "(full URL redacted due to GIT_TRACE_REDACT setting)\n%s",
> +				    schema, host, curl_errorstr);

Hrm, I haven't tested, but aren't both of those xstrndup's redundant to
using %*s instead of %s for the printf format? I.e.:

    die("failed to get '%*s'[...]", url.schema_len, url.url, )

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 2/2] http-fetch: redact url on die() message
  2021-10-28 16:39         ` Ævar Arnfjörð Bjarmason
@ 2021-10-28 17:25           ` Eric Sunshine
  2021-10-28 22:44             ` Ivan Frade
  2021-10-28 22:41           ` Ivan Frade
  2021-10-29 23:18           ` Junio C Hamano
  2 siblings, 1 reply; 43+ messages in thread
From: Eric Sunshine @ 2021-10-28 17:25 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Ivan Frade via GitGitGadget, Git List, Ivan Frade

On Thu, Oct 28, 2021 at 12:46 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> On Tue, Oct 26 2021, Ivan Frade via GitGitGadget wrote:
> >               if (results.curl_result != CURLE_OK) {
> > -                     die("Unable to get pack file %s\n%s", preq->url,
> > -                         curl_errorstr);
> > +                     struct url_info url;
> > +                     char *nurl = url_normalize(preq->url, &url);
> > +                     if (!git_env_bool("GIT_TRACE_REDACT", 1) || !nurl) {
> > +                             die("Unable to get pack file %s\n%s", preq->url,
> > +                                 curl_errorstr);
>
> small nit: arrange if's from "if (cheap || expensive)", i.e. no need for
> getenv() if !nurl, but maybe compilers are smart enough for that...

I had the same passing thought when glancing over this code (although
this appears to be an error patch, thus not performance critical, so
not terribly important).

> nit: die() messages should start with lower-case (in CodingGuidelines), and I think it's better to quote both, so:
>
>     die("unable to get pack '%s': '%s'", ...)
>
> Or maybe without the second '%s', as in 3e8084f1884 (http: check
> CURLE_SSL_PINNEDPUBKEYNOTMATCH when emitting errors, 2021-09-24) (which
> I authored, but just copy/pasted the convention in the surrounding
> code)>

Note that this is not a new die() call; it just got indented as-is by
this patch, so the changes you suggest to the message string are
potentially outside the scope of this patch. Possibilities: (1) make
the changes in this patch but mention them in the commit message; (2)
make the changes in a preparatory patch; (3) punt on the changes for
now.

> > +                     } else {
> > +                             char *schema = xstrndup(url.url, url.scheme_len);
> > +                             char *host = xstrndup(&url.url[url.host_off], url.host_len);
> > +                             die("failed to get '%s' url from '%s' "
> > +                                 "(full URL redacted due to GIT_TRACE_REDACT setting)\n%s",
> > +                                 schema, host, curl_errorstr);
>
> Hrm, I haven't tested, but aren't both of those xstrndup's redundant to
> using %*s instead of %s for the printf format? I.e.:
>
>     die("failed to get '%*s'[...]", url.schema_len, url.url, )

I wondered the same when reading the patch. Thanks for mentioning it.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 1/2] fetch-pack: redact packfile urls in traces
  2021-10-28  1:01         ` Junio C Hamano
@ 2021-10-28 22:15           ` Ivan Frade
  2021-10-28 22:46             ` Junio C Hamano
  0 siblings, 1 reply; 43+ messages in thread
From: Ivan Frade @ 2021-10-28 22:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ivan Frade via GitGitGadget, git, Jonathan Tan

On Wed, Oct 27, 2021 at 6:01 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Ivan Frade via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Ivan Frade <ifrade@google.com>
> >

> Just a curiosity.  Do we call these packfile URI, or packfile URL?

The feature is "packfile URI" (and the section is called so in the
protocol). I changed all "url" to "uri".

> > diff --git a/pkt-line.c b/pkt-line.c
> > index 2dc8ac274bd..ba0a2d65f0c 100644
> > --- a/pkt-line.c
> > +++ b/pkt-line.c
> > @@ -370,6 +370,31 @@ int packet_length(const char lenbuf_hex[4])
> >       return (val < 0) ? val : (val << 8) | hex2chr(lenbuf_hex + 2);
> >  }
> >
> > +static char *find_url_path(const char* buffer, int *path_len)
> > +{
> > +     const char *URL_MARK = "://";
> > +     char *path = strstr(buffer, URL_MARK);
> > +     if (!path)
> > +             return NULL;
>
> Hmph, the format we expect is "<hash> <uri>"; don't we need to
> validate the leading <hash> followed by SP?

I was trying to find a uri in a packet in general, not counting on the
packfile-uri line format. That is probably an overgeneralization.

Next patch version follows these suggestions to look for a packfile-uri line.

> > +     if (path_len) {
> > +             char *url_end = strchrnul(path, ' ');
>
> Is this because SP is not a valid character in packfile URI, or at
> this point in the callchain it would be encoded or something?  The
> format we expect is "<hash> <uri>", so we shouldn't even have to
> look for SP but just redact everything to the end, no?

Yes, now that we count on the packfile-uri line format, we can redact
everything to the end and there is no need to return the length.

> > -     packet_trace(buffer, len, 0);
> > +     if (options & PACKET_READ_REDACT_URL_PATH &&
> > +         (url_path_start = find_url_path(buffer, &url_path_len))) {
> > +             const char *redacted = "<redacted>";
> > +             struct strbuf tracebuf = STRBUF_INIT;
> > +             strbuf_insert(&tracebuf, 0, buffer, len);
> > +             strbuf_splice(&tracebuf, url_path_start - buffer,
> > +                           url_path_len, redacted, strlen(redacted));
> > +             packet_trace(tracebuf.buf, tracebuf.len, 0);
> > +             strbuf_release(&tracebuf);
>
> I briefly wondered if the repeated allocation (and more
> fundamentally, preparing the redacted copy of packet whether we are
> actually tracing the packet in the first place) is blindly wasting
> the resources too much, but this only happens in the protocol header
> part, so it might be OK.

We only allocate and redact if it looks like a packfile-uri line, so
it shouldn't happen too frequently.

> Even if that is not the case, we should be able to update
> fetch_pack.c::do_fetch_pack_v2() so that the REDACT_URL_PATH bit is
> turned on in a much narrower region of code, right?  Enable when we
> enter the GET_PACK state and drop the bit when we are done with the
> packfile URI packets, or something?

I move the set/unset of the redacting flag to the FETCH_GET_PACK
around the "packfile-uris" section.
There is no need to check every incoming packet for a packfile-uri
line, we know when they should come.

Thanks,

Ivan

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 2/2] http-fetch: redact url on die() message
  2021-10-28 16:39         ` Ævar Arnfjörð Bjarmason
  2021-10-28 17:25           ` Eric Sunshine
@ 2021-10-28 22:41           ` Ivan Frade
  2021-10-29 23:18           ` Junio C Hamano
  2 siblings, 0 replies; 43+ messages in thread
From: Ivan Frade @ 2021-10-28 22:41 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Ivan Frade via GitGitGadget, git

On Thu, Oct 28, 2021 at 9:46 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Tue, Oct 26 2021, Ivan Frade via GitGitGadget wrote:
>
> > From: Ivan Frade <ifrade@google.com>
> >
...
> > +                     if (!git_env_bool("GIT_TRACE_REDACT", 1) || !nurl) {
> > +                             die("Unable to get pack file %s\n%s", preq->url,
> > +                                 curl_errorstr);
>
> small nit: arrange if's from "if (cheap || expensive)", i.e. no need for
> getenv() if !nurl, but maybe compilers are smart enough for that...

Done

> nit: die() messages should start with lower-case (in CodingGuidelines), and I think it's better to quote both, so:
>
>     die("unable to get pack '%s': '%s'", ...)
>
> Or maybe without the second '%s', as in 3e8084f1884 (http: check
> CURLE_SSL_PINNEDPUBKEYNOTMATCH when emitting errors, 2021-09-24) (which
> I authored, but just copy/pasted the convention in the surrounding
> code)>

Done

> > +                     } else {
> > +                             char *schema = xstrndup(url.url, url.scheme_len);
> > +                             char *host = xstrndup(&url.url[url.host_off], url.host_len);
> > +                             die("failed to get '%s' url from '%s' "
> > +                                 "(full URL redacted due to GIT_TRACE_REDACT setting)\n%s",
> > +                                 schema, host, curl_errorstr);
>
> Hrm, I haven't tested, but aren't both of those xstrndup's redundant to
> using %*s instead of %s for the printf format? I.e.:
>
>     die("failed to get '%*s'[...]", url.schema_len, url.url, )

Indeed, "%.*s" did the trick. Thanks!

Ivan

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 2/2] http-fetch: redact url on die() message
  2021-10-28 17:25           ` Eric Sunshine
@ 2021-10-28 22:44             ` Ivan Frade
  0 siblings, 0 replies; 43+ messages in thread
From: Ivan Frade @ 2021-10-28 22:44 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Ævar Arnfjörð Bjarmason,
	Ivan Frade via GitGitGadget, Git List

On Thu, Oct 28, 2021 at 10:26 AM Eric Sunshine <sunshine@sunshineco.com> wrote:
>
> On Thu, Oct 28, 2021 at 12:46 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:

>
> > nit: die() messages should start with lower-case (in CodingGuidelines), and I think it's better to quote both, so:
> >
> >     die("unable to get pack '%s': '%s'", ...)
> >
> > Or maybe without the second '%s', as in 3e8084f1884 (http: check
> > CURLE_SSL_PINNEDPUBKEYNOTMATCH when emitting errors, 2021-09-24) (which
> > I authored, but just copy/pasted the convention in the surrounding
> > code)>
>
> Note that this is not a new die() call; it just got indented as-is by
> this patch, so the changes you suggest to the message string are
> potentially outside the scope of this patch. Possibilities: (1) make
> the changes in this patch but mention them in the commit message; (2)
> make the changes in a preparatory patch; (3) punt on the changes for
> now.

I went for option (1). It is a minimal change and we are moving that
line in this commit.

Ivan

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 1/2] fetch-pack: redact packfile urls in traces
  2021-10-28 22:15           ` Ivan Frade
@ 2021-10-28 22:46             ` Junio C Hamano
  0 siblings, 0 replies; 43+ messages in thread
From: Junio C Hamano @ 2021-10-28 22:46 UTC (permalink / raw)
  To: Ivan Frade; +Cc: Ivan Frade via GitGitGadget, git, Jonathan Tan

Ivan Frade <ifrade@google.com> writes:

>> Hmph, the format we expect is "<hash> <uri>"; don't we need to
>> validate the leading <hash> followed by SP?
>
> I was trying to find a uri in a packet in general, not counting on the
> packfile-uri line format. That is probably an overgeneralization.

Ah, I see.  This is merely a tracing, so we might benefit from a
generalized version of redactor, and from that point of view, the
use of strstr and stopping at the whitespace do make sort-of sense
to me, but then we lack any attempt to redact more than one instance
of URL in a packet, so the generalization may have quite a limited
usefulness.

> Next patch version follows these suggestions to look for a packfile-uri line.

Yeah, I think that is a good way to go, at least for now.  When we
want a more general one, we can revisit it, but not now.

>> > -     packet_trace(buffer, len, 0);
>> > +     if (options & PACKET_READ_REDACT_URL_PATH &&
>> > +         (url_path_start = find_url_path(buffer, &url_path_len))) {
>> > +             const char *redacted = "<redacted>";
>> > +             struct strbuf tracebuf = STRBUF_INIT;
>> > +             strbuf_insert(&tracebuf, 0, buffer, len);
>> > +             strbuf_splice(&tracebuf, url_path_start - buffer,
>> > +                           url_path_len, redacted, strlen(redacted));
>> > +             packet_trace(tracebuf.buf, tracebuf.len, 0);
>> > +             strbuf_release(&tracebuf);
>>
>> I briefly wondered if the repeated allocation (and more
>> fundamentally, preparing the redacted copy of packet whether we are
>> actually tracing the packet in the first place) is blindly wasting
>> the resources too much, but this only happens in the protocol header
>> part, so it might be OK.
>
> We only allocate and redact if it looks like a packfile-uri line, so
> it shouldn't happen too frequently.

I was mostly wondering about the cost of determining "if it looks
like?".  But we do this only for the protocol header part, so we
won't have thousands of attempts to match, I guess.  Oh, or if we
also do this for the ref advertisement packets, then we might have
quite a many.  Hmph.

> I move the set/unset of the redacting flag to the FETCH_GET_PACK
> around the "packfile-uris" section.
> There is no need to check every incoming packet for a packfile-uri
> line, we know when they should come.

Yeah, that is quite a wise design decision, I would think.

Thanks.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH v5 0/2] fetch-pack: redact packfile urls in traces
  2021-10-26 22:49     ` [PATCH v4 0/2] " Ivan Frade via GitGitGadget
  2021-10-26 22:49       ` [PATCH v4 1/2] " Ivan Frade via GitGitGadget
  2021-10-26 22:49       ` [PATCH v4 2/2] http-fetch: redact url on die() message Ivan Frade via GitGitGadget
@ 2021-10-28 22:51       ` Ivan Frade via GitGitGadget
  2021-10-28 22:51         ` [PATCH v5 1/2] " Ivan Frade via GitGitGadget
                           ` (2 more replies)
  2 siblings, 3 replies; 43+ messages in thread
From: Ivan Frade via GitGitGadget @ 2021-10-28 22:51 UTC (permalink / raw)
  To: git; +Cc: Ævar Arnfjörð Bjarmason, Eric Sunshine, Ivan Frade

Changes since v4:

 * Use "uri" instead of "url"
 * Look specifically for a line with packfile-uri format (instead of for a
   URL in general)
 * Limit the redacting to the packfile-uri section in do_fetch_pack_v2
 * Use "%.*s" instead of duplicating parts of the string to print

Changes since v3:

 * Enable redacting URLs for all sections
 * Redact only URL path (it was until the end of line)
 * Redact URL in die() with more friendly message
 * Update doc to mention that packfile URIs are also redacted.

Changes since v2:

 * Redact only the path of the URL
 * Test are now strict, validating the exact line expected in the log

Changes since v1:

 * Removed non-POSIX flags in tests
 * More accurate regex for the non-encrypted packfile line
 * Dropped documentation change
 * Dropped redacting the die message in http-fetch

Ivan Frade (2):
  fetch-pack: redact packfile urls in traces
  http-fetch: redact url on die() message

 Documentation/git.txt  |  5 +++--
 fetch-pack.c           |  4 ++++
 http-fetch.c           | 14 ++++++++++--
 pkt-line.c             | 39 +++++++++++++++++++++++++++++++-
 pkt-line.h             |  1 +
 t/t5702-protocol-v2.sh | 51 ++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 109 insertions(+), 5 deletions(-)


base-commit: e9e5ba39a78c8f5057262d49e261b42a8660d5b9
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1052%2Fifradeo%2Fredact-packfile-uri-v5
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1052/ifradeo/redact-packfile-uri-v5
Pull-Request: https://github.com/gitgitgadget/git/pull/1052

Range-diff vs v4:

 1:  973a250752c ! 1:  c95b3cafcd6 fetch-pack: redact packfile urls in traces
     @@ Documentation/git.txt: for full details.
      -	cookies, the "Authorization:" header, and the "Proxy-Authorization:"
      -	header. Set this variable to `0` to prevent this redaction.
      +	cookies, the "Authorization:" header, the "Proxy-Authorization:"
     -+	header and packfile URLs. Set this variable to `0` to prevent this
     ++	header and packfile URIs. Set this variable to `0` to prevent this
      +	redaction.
       
       `GIT_LITERAL_PATHSPECS`::
     @@ Documentation/git.txt: for full details.
      
       ## fetch-pack.c ##
      @@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
     - 		reader.me = "fetch-pack";
     - 	}
     + 				receive_wanted_refs(&reader, sought, nr_sought);
       
     -+	if (git_env_bool("GIT_TRACE_REDACT", 1))
     -+		reader.options |= PACKET_READ_REDACT_URL_PATH;
     -+
     - 	while (state != FETCH_DONE) {
     - 		switch (state) {
     - 		case FETCH_CHECK_LOCAL:
     + 			/* get the pack(s) */
     ++			if (git_env_bool("GIT_TRACE_REDACT", 1))
     ++				reader.options |= PACKET_READ_REDACT_URI_PATH;
     + 			if (process_section_header(&reader, "packfile-uris", 1))
     + 				receive_packfile_uris(&reader, &packfile_uris);
     ++			reader.options &= ~PACKET_READ_REDACT_URI_PATH;
     ++
     + 			process_section_header(&reader, "packfile", 0);
     + 
     + 			/*
      
       ## pkt-line.c ##
      @@ pkt-line.c: int packet_length(const char lenbuf_hex[4])
       	return (val < 0) ? val : (val << 8) | hex2chr(lenbuf_hex + 2);
       }
       
     -+static char *find_url_path(const char* buffer, int *path_len)
     ++static char *find_packfile_uri_path(const char *buffer)
      +{
     -+	const char *URL_MARK = "://";
     -+	char *path = strstr(buffer, URL_MARK);
     -+	if (!path)
     -+		return NULL;
     ++	const char *URI_MARK = "://";
     ++	char *path;
     ++	int len;
      +
     -+	path += strlen(URL_MARK);
     -+	while (*path && *path != '/')
     -+		path++;
     ++	/* First char is sideband mark */
     ++	buffer += 1;
      +
     -+	if (!*path || !*(path + 1))
     -+		return NULL;
     ++	len = strspn(buffer, "0123456789abcdefABCDEF");
     ++	if (!(len == 40 || len == 64) || buffer[len] != ' ')
     ++		return NULL; /* required "<hash>SP" not seen */
      +
     -+	// position after '/'
     -+	path++;
     ++	path = strstr(buffer + len + 1, URI_MARK);
     ++	if (!path)
     ++		return NULL;
      +
     -+	if (path_len) {
     -+		char *url_end = strchrnul(path, ' ');
     -+		*path_len = url_end - path;
     -+	}
     ++	path = strchr(path + strlen(URI_MARK), '/');
     ++	if (!path || !*(path + 1))
     ++		return NULL;
      +
     -+	return path;
     ++	/* position after '/' */
     ++	return ++path;
      +}
      +
       enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
     @@ pkt-line.c: enum packet_read_status packet_read_with_status(int fd, char **src_b
       {
       	int len;
       	char linelen[4];
     -+	char *url_path_start;
     -+	int url_path_len;
     ++	char *uri_path_start;
       
       	if (get_packet_data(fd, src_buffer, src_len, linelen, 4, options) < 0) {
       		*pktlen = -1;
     @@ pkt-line.c: enum packet_read_status packet_read_with_status(int fd, char **src_b
       
       	buffer[len] = 0;
      -	packet_trace(buffer, len, 0);
     -+	if (options & PACKET_READ_REDACT_URL_PATH &&
     -+	    (url_path_start = find_url_path(buffer, &url_path_len))) {
     ++	if (options & PACKET_READ_REDACT_URI_PATH &&
     ++	    (uri_path_start = find_packfile_uri_path(buffer))) {
      +		const char *redacted = "<redacted>";
      +		struct strbuf tracebuf = STRBUF_INIT;
      +		strbuf_insert(&tracebuf, 0, buffer, len);
     -+		strbuf_splice(&tracebuf, url_path_start - buffer,
     -+			      url_path_len, redacted, strlen(redacted));
     ++		strbuf_splice(&tracebuf, uri_path_start - buffer,
     ++			      strlen(uri_path_start), redacted, strlen(redacted));
      +		packet_trace(tracebuf.buf, tracebuf.len, 0);
      +		strbuf_release(&tracebuf);
      +	} else {
     @@ pkt-line.h: void packet_fflush(FILE *f);
       #define PACKET_READ_CHOMP_NEWLINE        (1u<<1)
       #define PACKET_READ_DIE_ON_ERR_PACKET    (1u<<2)
       #define PACKET_READ_GENTLE_ON_READ_ERROR (1u<<3)
     -+#define PACKET_READ_REDACT_URL_PATH      (1u<<4)
     ++#define PACKET_READ_REDACT_URI_PATH      (1u<<4)
       int packet_read(int fd, char *buffer, unsigned size, int options);
       
       /*
 2:  c7f0977cabd ! 2:  6912a690197 http-fetch: redact url on die() message
     @@ Commit message
          http-fetch: redact url on die() message
      
          http-fetch prints the URL after failing to fetch it. This can be
     -    confusing to users (they cannot really do anything with it) but even
     -    worse, they can share by accident a sensitive URL (e.g. with
     -    credentials) while looking for help.
     +    confusing to users (they cannot really do anything with it), and they
     +    can share by accident a sensitive URL (e.g. with credentials) while
     +    looking for help.
      
          Redact the URL unless the GIT_TRACE_REDACT variable is set to false. This
          mimics the redaction of other sensitive information in git, like the
          Authorization header in HTTP.
      
     +    Fix also capitalization of previous die() message (must start in
     +    lowercase).
     +
          Signed-off-by: Ivan Frade <ifrade@google.com>
      
       ## http-fetch.c ##
     @@ http-fetch.c: static void fetch_single_packfile(struct object_id *packfile_hash,
      -			    curl_errorstr);
      +			struct url_info url;
      +			char *nurl = url_normalize(preq->url, &url);
     -+			if (!git_env_bool("GIT_TRACE_REDACT", 1) || !nurl) {
     -+				die("Unable to get pack file %s\n%s", preq->url,
     ++			if (!nurl || !git_env_bool("GIT_TRACE_REDACT", 1)) {
     ++				die("unable to get pack file '%s'\n%s", preq->url,
      +				    curl_errorstr);
      +			} else {
     -+				char *schema = xstrndup(url.url, url.scheme_len);
     -+				char *host = xstrndup(&url.url[url.host_off], url.host_len);
     -+				die("failed to get '%s' url from '%s' "
     ++				die("failed to get '%.*s' url from '%.*s' "
      +				    "(full URL redacted due to GIT_TRACE_REDACT setting)\n%s",
     -+				    schema, host, curl_errorstr);
     ++				    (int)url.scheme_len, url.url,
     ++				    (int)url.host_len, &url.url[url.host_off], curl_errorstr);
      +			}
       		}
       	} else {

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH v5 1/2] fetch-pack: redact packfile urls in traces
  2021-10-28 22:51       ` [PATCH v5 0/2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
@ 2021-10-28 22:51         ` Ivan Frade via GitGitGadget
  2021-10-28 23:21           ` Junio C Hamano
  2021-10-28 22:51         ` [PATCH v5 2/2] http-fetch: redact url on die() message Ivan Frade via GitGitGadget
  2021-10-29 18:42         ` [PATCH v6 0/2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
  2 siblings, 1 reply; 43+ messages in thread
From: Ivan Frade via GitGitGadget @ 2021-10-28 22:51 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, Eric Sunshine, Ivan Frade,
	Ivan Frade

From: Ivan Frade <ifrade@google.com>

In some setups, packfile uris act as bearer token. It is not
recommended to expose them plainly in logs, although in special
circunstances (e.g. debug) it makes sense to write them.

Redact the packfile URL paths by default, unless the GIT_TRACE_REDACT
variable is set to false. This mimics the redacting of the Authorization
header in HTTP.

Signed-off-by: Ivan Frade <ifrade@google.com>
---
 Documentation/git.txt  |  5 +++--
 fetch-pack.c           |  4 ++++
 pkt-line.c             | 39 +++++++++++++++++++++++++++++++-
 pkt-line.h             |  1 +
 t/t5702-protocol-v2.sh | 51 ++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 97 insertions(+), 3 deletions(-)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index d63c65e67d8..c91aa2737f0 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -832,8 +832,9 @@ for full details.
 
 `GIT_TRACE_REDACT`::
 	By default, when tracing is activated, Git redacts the values of
-	cookies, the "Authorization:" header, and the "Proxy-Authorization:"
-	header. Set this variable to `0` to prevent this redaction.
+	cookies, the "Authorization:" header, the "Proxy-Authorization:"
+	header and packfile URIs. Set this variable to `0` to prevent this
+	redaction.
 
 `GIT_LITERAL_PATHSPECS`::
 	Setting this variable to `1` will cause Git to treat all
diff --git a/fetch-pack.c b/fetch-pack.c
index a9604f35a3e..62ea90541c5 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1653,8 +1653,12 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 				receive_wanted_refs(&reader, sought, nr_sought);
 
 			/* get the pack(s) */
+			if (git_env_bool("GIT_TRACE_REDACT", 1))
+				reader.options |= PACKET_READ_REDACT_URI_PATH;
 			if (process_section_header(&reader, "packfile-uris", 1))
 				receive_packfile_uris(&reader, &packfile_uris);
+			reader.options &= ~PACKET_READ_REDACT_URI_PATH;
+
 			process_section_header(&reader, "packfile", 0);
 
 			/*
diff --git a/pkt-line.c b/pkt-line.c
index 2dc8ac274bd..06013d2a54a 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -370,6 +370,31 @@ int packet_length(const char lenbuf_hex[4])
 	return (val < 0) ? val : (val << 8) | hex2chr(lenbuf_hex + 2);
 }
 
+static char *find_packfile_uri_path(const char *buffer)
+{
+	const char *URI_MARK = "://";
+	char *path;
+	int len;
+
+	/* First char is sideband mark */
+	buffer += 1;
+
+	len = strspn(buffer, "0123456789abcdefABCDEF");
+	if (!(len == 40 || len == 64) || buffer[len] != ' ')
+		return NULL; /* required "<hash>SP" not seen */
+
+	path = strstr(buffer + len + 1, URI_MARK);
+	if (!path)
+		return NULL;
+
+	path = strchr(path + strlen(URI_MARK), '/');
+	if (!path || !*(path + 1))
+		return NULL;
+
+	/* position after '/' */
+	return ++path;
+}
+
 enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 						size_t *src_len, char *buffer,
 						unsigned size, int *pktlen,
@@ -377,6 +402,7 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 {
 	int len;
 	char linelen[4];
+	char *uri_path_start;
 
 	if (get_packet_data(fd, src_buffer, src_len, linelen, 4, options) < 0) {
 		*pktlen = -1;
@@ -427,7 +453,18 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 		len--;
 
 	buffer[len] = 0;
-	packet_trace(buffer, len, 0);
+	if (options & PACKET_READ_REDACT_URI_PATH &&
+	    (uri_path_start = find_packfile_uri_path(buffer))) {
+		const char *redacted = "<redacted>";
+		struct strbuf tracebuf = STRBUF_INIT;
+		strbuf_insert(&tracebuf, 0, buffer, len);
+		strbuf_splice(&tracebuf, uri_path_start - buffer,
+			      strlen(uri_path_start), redacted, strlen(redacted));
+		packet_trace(tracebuf.buf, tracebuf.len, 0);
+		strbuf_release(&tracebuf);
+	} else {
+		packet_trace(buffer, len, 0);
+	}
 
 	if ((options & PACKET_READ_DIE_ON_ERR_PACKET) &&
 	    starts_with(buffer, "ERR "))
diff --git a/pkt-line.h b/pkt-line.h
index 467ae013573..6d2a63db238 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -87,6 +87,7 @@ void packet_fflush(FILE *f);
 #define PACKET_READ_CHOMP_NEWLINE        (1u<<1)
 #define PACKET_READ_DIE_ON_ERR_PACKET    (1u<<2)
 #define PACKET_READ_GENTLE_ON_READ_ERROR (1u<<3)
+#define PACKET_READ_REDACT_URI_PATH      (1u<<4)
 int packet_read(int fd, char *buffer, unsigned size, int options);
 
 /*
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index d527cf6c49f..f01af2f2ed3 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -1107,6 +1107,57 @@ test_expect_success 'packfile-uri with transfer.fsckobjects fails when .gitmodul
 	test_i18ngrep "disallowed submodule name" err
 '
 
+test_expect_success 'packfile-uri path redacted in trace' '
+	P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	rm -rf "$P" http_child log &&
+
+	git init "$P" &&
+	git -C "$P" config "uploadpack.allowsidebandall" "true" &&
+
+	echo my-blob >"$P/my-blob" &&
+	git -C "$P" add my-blob &&
+	git -C "$P" commit -m x &&
+
+	git -C "$P" hash-object my-blob >objh &&
+	git -C "$P" pack-objects "$HTTPD_DOCUMENT_ROOT_PATH/mypack" <objh >packh &&
+	git -C "$P" config --add \
+		"uploadpack.blobpackfileuri" \
+		"$(cat objh) $(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" &&
+
+	GIT_TRACE=1 GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \
+	git -c protocol.version=2 \
+		-c fetch.uriprotocols=http,https \
+		clone "$HTTPD_URL/smart/http_parent" http_child &&
+
+	grep -F "clone< \\1$(cat packh) $HTTPD_URL/<redacted>" log
+'
+
+test_expect_success 'packfile-uri path not redacted in trace when GIT_TRACE_REDACT=0' '
+	P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	rm -rf "$P" http_child log &&
+
+	git init "$P" &&
+	git -C "$P" config "uploadpack.allowsidebandall" "true" &&
+
+	echo my-blob >"$P/my-blob" &&
+	git -C "$P" add my-blob &&
+	git -C "$P" commit -m x &&
+
+	git -C "$P" hash-object my-blob >objh &&
+	git -C "$P" pack-objects "$HTTPD_DOCUMENT_ROOT_PATH/mypack" <objh >packh &&
+	git -C "$P" config --add \
+		"uploadpack.blobpackfileuri" \
+		"$(cat objh) $(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" &&
+
+	GIT_TRACE=1 GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \
+	GIT_TRACE_REDACT=0 \
+	git -c protocol.version=2 \
+		-c fetch.uriprotocols=http,https \
+		clone "$HTTPD_URL/smart/http_parent" http_child &&
+
+	grep -F "clone< \\1$(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" log
+'
+
 test_expect_success 'http:// --negotiate-only' '
 	SERVER="$HTTPD_DOCUMENT_ROOT_PATH/server" &&
 	URI="$HTTPD_URL/smart/server" &&
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v5 2/2] http-fetch: redact url on die() message
  2021-10-28 22:51       ` [PATCH v5 0/2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
  2021-10-28 22:51         ` [PATCH v5 1/2] " Ivan Frade via GitGitGadget
@ 2021-10-28 22:51         ` Ivan Frade via GitGitGadget
  2021-10-29 18:42         ` [PATCH v6 0/2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
  2 siblings, 0 replies; 43+ messages in thread
From: Ivan Frade via GitGitGadget @ 2021-10-28 22:51 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, Eric Sunshine, Ivan Frade,
	Ivan Frade

From: Ivan Frade <ifrade@google.com>

http-fetch prints the URL after failing to fetch it. This can be
confusing to users (they cannot really do anything with it), and they
can share by accident a sensitive URL (e.g. with credentials) while
looking for help.

Redact the URL unless the GIT_TRACE_REDACT variable is set to false. This
mimics the redaction of other sensitive information in git, like the
Authorization header in HTTP.

Fix also capitalization of previous die() message (must start in
lowercase).

Signed-off-by: Ivan Frade <ifrade@google.com>
---
 http-fetch.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/http-fetch.c b/http-fetch.c
index fa642462a9e..c7c7d391ac5 100644
--- a/http-fetch.c
+++ b/http-fetch.c
@@ -4,6 +4,7 @@
 #include "http.h"
 #include "walker.h"
 #include "strvec.h"
+#include "urlmatch.h"
 
 static const char http_fetch_usage[] = "git http-fetch "
 "[-c] [-t] [-a] [-v] [--recover] [-w ref] [--stdin | --packfile=hash | commit-id] url";
@@ -63,8 +64,17 @@ static void fetch_single_packfile(struct object_id *packfile_hash,
 	if (start_active_slot(preq->slot)) {
 		run_active_slot(preq->slot);
 		if (results.curl_result != CURLE_OK) {
-			die("Unable to get pack file %s\n%s", preq->url,
-			    curl_errorstr);
+			struct url_info url;
+			char *nurl = url_normalize(preq->url, &url);
+			if (!nurl || !git_env_bool("GIT_TRACE_REDACT", 1)) {
+				die("unable to get pack file '%s'\n%s", preq->url,
+				    curl_errorstr);
+			} else {
+				die("failed to get '%.*s' url from '%.*s' "
+				    "(full URL redacted due to GIT_TRACE_REDACT setting)\n%s",
+				    (int)url.scheme_len, url.url,
+				    (int)url.host_len, &url.url[url.host_off], curl_errorstr);
+			}
 		}
 	} else {
 		die("Unable to start request");
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 1/2] fetch-pack: redact packfile urls in traces
  2021-10-28 22:51         ` [PATCH v5 1/2] " Ivan Frade via GitGitGadget
@ 2021-10-28 23:21           ` Junio C Hamano
  2021-10-29 18:42             ` Ivan Frade
  0 siblings, 1 reply; 43+ messages in thread
From: Junio C Hamano @ 2021-10-28 23:21 UTC (permalink / raw)
  To: Ivan Frade via GitGitGadget
  Cc: git, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Ivan Frade

"Ivan Frade via GitGitGadget" <gitgitgadget@gmail.com> writes:

> diff --git a/pkt-line.c b/pkt-line.c
> index 2dc8ac274bd..06013d2a54a 100644
> --- a/pkt-line.c
> +++ b/pkt-line.c
> @@ -370,6 +370,31 @@ int packet_length(const char lenbuf_hex[4])
>  	return (val < 0) ? val : (val << 8) | hex2chr(lenbuf_hex + 2);
>  }
>  
> +static char *find_packfile_uri_path(const char *buffer)
> +{
> +	const char *URI_MARK = "://";
> +	char *path;
> +	int len;
> +
> +	/* First char is sideband mark */
> +	buffer += 1;
> +
> +	len = strspn(buffer, "0123456789abcdefABCDEF");
> +	if (!(len == 40 || len == 64) || buffer[len] != ' ')
> +		return NULL; /* required "<hash>SP" not seen */

People may have comments on hardcoded 40/64 here and offer a better
way to write it ;-)

> +	path = strstr(buffer + len + 1, URI_MARK);
> +	if (!path)
> +		return NULL;
> +
> +	path = strchr(path + strlen(URI_MARK), '/');
> +	if (!path || !*(path + 1))
> +		return NULL;
> +
> +	/* position after '/' */
> +	return ++path;
> +}

Other than that, the patch this round looks quite clean.

Nicely done.

Thanks, will queue.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 1/2] fetch-pack: redact packfile urls in traces
  2021-10-28 23:21           ` Junio C Hamano
@ 2021-10-29 18:42             ` Ivan Frade
  2021-10-29 19:59               ` Junio C Hamano
  0 siblings, 1 reply; 43+ messages in thread
From: Ivan Frade @ 2021-10-29 18:42 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ivan Frade via GitGitGadget, git,
	Ævar Arnfjörð Bjarmason, Eric Sunshine

On Thu, Oct 28, 2021 at 4:21 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Ivan Frade via GitGitGadget" <gitgitgadget@gmail.com> writes:
>

> > +     len = strspn(buffer, "0123456789abcdefABCDEF");
> > +     if (!(len == 40 || len == 64) || buffer[len] != ' ')
> > +             return NULL; /* required "<hash>SP" not seen */
>
> People may have comments on hardcoded 40/64 here and offer a better
> way to write it ;-)

Latest version uses the_hash_algo->hexsz:

+       if (len != (int)the_hash_algo->hexsz || buffer[len] != ' ')
+               return NULL; /* required "<hash>SP" not seen */

Thanks!

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH v6 0/2] fetch-pack: redact packfile urls in traces
  2021-10-28 22:51       ` [PATCH v5 0/2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
  2021-10-28 22:51         ` [PATCH v5 1/2] " Ivan Frade via GitGitGadget
  2021-10-28 22:51         ` [PATCH v5 2/2] http-fetch: redact url on die() message Ivan Frade via GitGitGadget
@ 2021-10-29 18:42         ` Ivan Frade via GitGitGadget
  2021-10-29 18:42           ` [PATCH v6 1/2] " Ivan Frade via GitGitGadget
                             ` (2 more replies)
  2 siblings, 3 replies; 43+ messages in thread
From: Ivan Frade via GitGitGadget @ 2021-10-29 18:42 UTC (permalink / raw)
  To: git; +Cc: Ævar Arnfjörð Bjarmason, Eric Sunshine, Ivan Frade

Changes since v5:

 * Use hexsz instead of hardcoded hash sizes

Changes since v4:

 * Use "uri" instead of "url"
 * Look specifically for a line with packfile-uri format (instead of for a
   URL in general)
 * Limit the redacting to the packfile-uri section in do_fetch_pack_v2
 * Use "%.*s" instead of duplicating parts of the string to print

Changes since v3:

 * Enable redacting URLs for all sections
 * Redact only URL path (it was until the end of line)
 * Redact URL in die() with more friendly message
 * Update doc to mention that packfile URIs are also redacted.

Changes since v2:

 * Redact only the path of the URL
 * Test are now strict, validating the exact line expected in the log

Changes since v1:

 * Removed non-POSIX flags in tests
 * More accurate regex for the non-encrypted packfile line
 * Dropped documentation change
 * Dropped redacting the die message in http-fetch

Ivan Frade (2):
  fetch-pack: redact packfile urls in traces
  http-fetch: redact url on die() message

 Documentation/git.txt  |  5 +++--
 fetch-pack.c           |  4 ++++
 http-fetch.c           | 14 ++++++++++--
 pkt-line.c             | 39 +++++++++++++++++++++++++++++++-
 pkt-line.h             |  1 +
 t/t5702-protocol-v2.sh | 51 ++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 109 insertions(+), 5 deletions(-)


base-commit: e9e5ba39a78c8f5057262d49e261b42a8660d5b9
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1052%2Fifradeo%2Fredact-packfile-uri-v6
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1052/ifradeo/redact-packfile-uri-v6
Pull-Request: https://github.com/gitgitgadget/git/pull/1052

Range-diff vs v5:

 1:  c95b3cafcd6 ! 1:  a6098f98946 fetch-pack: redact packfile urls in traces
     @@ pkt-line.c: int packet_length(const char lenbuf_hex[4])
      +	buffer += 1;
      +
      +	len = strspn(buffer, "0123456789abcdefABCDEF");
     -+	if (!(len == 40 || len == 64) || buffer[len] != ' ')
     ++	if (len != (int)the_hash_algo->hexsz || buffer[len] != ' ')
      +		return NULL; /* required "<hash>SP" not seen */
      +
      +	path = strstr(buffer + len + 1, URI_MARK);
 2:  6912a690197 = 2:  38859ae7b7d http-fetch: redact url on die() message

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH v6 1/2] fetch-pack: redact packfile urls in traces
  2021-10-29 18:42         ` [PATCH v6 0/2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
@ 2021-10-29 18:42           ` Ivan Frade via GitGitGadget
  2021-11-08 23:01             ` Jonathan Tan
  2021-10-29 18:42           ` [PATCH v6 2/2] http-fetch: redact url on die() message Ivan Frade via GitGitGadget
  2021-11-10 23:51           ` [PATCH v7 0/2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
  2 siblings, 1 reply; 43+ messages in thread
From: Ivan Frade via GitGitGadget @ 2021-10-29 18:42 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, Eric Sunshine, Ivan Frade,
	Ivan Frade

From: Ivan Frade <ifrade@google.com>

In some setups, packfile uris act as bearer token. It is not
recommended to expose them plainly in logs, although in special
circunstances (e.g. debug) it makes sense to write them.

Redact the packfile URL paths by default, unless the GIT_TRACE_REDACT
variable is set to false. This mimics the redacting of the Authorization
header in HTTP.

Signed-off-by: Ivan Frade <ifrade@google.com>
---
 Documentation/git.txt  |  5 +++--
 fetch-pack.c           |  4 ++++
 pkt-line.c             | 39 +++++++++++++++++++++++++++++++-
 pkt-line.h             |  1 +
 t/t5702-protocol-v2.sh | 51 ++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 97 insertions(+), 3 deletions(-)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index d63c65e67d8..c91aa2737f0 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -832,8 +832,9 @@ for full details.
 
 `GIT_TRACE_REDACT`::
 	By default, when tracing is activated, Git redacts the values of
-	cookies, the "Authorization:" header, and the "Proxy-Authorization:"
-	header. Set this variable to `0` to prevent this redaction.
+	cookies, the "Authorization:" header, the "Proxy-Authorization:"
+	header and packfile URIs. Set this variable to `0` to prevent this
+	redaction.
 
 `GIT_LITERAL_PATHSPECS`::
 	Setting this variable to `1` will cause Git to treat all
diff --git a/fetch-pack.c b/fetch-pack.c
index a9604f35a3e..62ea90541c5 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1653,8 +1653,12 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 				receive_wanted_refs(&reader, sought, nr_sought);
 
 			/* get the pack(s) */
+			if (git_env_bool("GIT_TRACE_REDACT", 1))
+				reader.options |= PACKET_READ_REDACT_URI_PATH;
 			if (process_section_header(&reader, "packfile-uris", 1))
 				receive_packfile_uris(&reader, &packfile_uris);
+			reader.options &= ~PACKET_READ_REDACT_URI_PATH;
+
 			process_section_header(&reader, "packfile", 0);
 
 			/*
diff --git a/pkt-line.c b/pkt-line.c
index 2dc8ac274bd..5a69ddc2e77 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -370,6 +370,31 @@ int packet_length(const char lenbuf_hex[4])
 	return (val < 0) ? val : (val << 8) | hex2chr(lenbuf_hex + 2);
 }
 
+static char *find_packfile_uri_path(const char *buffer)
+{
+	const char *URI_MARK = "://";
+	char *path;
+	int len;
+
+	/* First char is sideband mark */
+	buffer += 1;
+
+	len = strspn(buffer, "0123456789abcdefABCDEF");
+	if (len != (int)the_hash_algo->hexsz || buffer[len] != ' ')
+		return NULL; /* required "<hash>SP" not seen */
+
+	path = strstr(buffer + len + 1, URI_MARK);
+	if (!path)
+		return NULL;
+
+	path = strchr(path + strlen(URI_MARK), '/');
+	if (!path || !*(path + 1))
+		return NULL;
+
+	/* position after '/' */
+	return ++path;
+}
+
 enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 						size_t *src_len, char *buffer,
 						unsigned size, int *pktlen,
@@ -377,6 +402,7 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 {
 	int len;
 	char linelen[4];
+	char *uri_path_start;
 
 	if (get_packet_data(fd, src_buffer, src_len, linelen, 4, options) < 0) {
 		*pktlen = -1;
@@ -427,7 +453,18 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 		len--;
 
 	buffer[len] = 0;
-	packet_trace(buffer, len, 0);
+	if (options & PACKET_READ_REDACT_URI_PATH &&
+	    (uri_path_start = find_packfile_uri_path(buffer))) {
+		const char *redacted = "<redacted>";
+		struct strbuf tracebuf = STRBUF_INIT;
+		strbuf_insert(&tracebuf, 0, buffer, len);
+		strbuf_splice(&tracebuf, uri_path_start - buffer,
+			      strlen(uri_path_start), redacted, strlen(redacted));
+		packet_trace(tracebuf.buf, tracebuf.len, 0);
+		strbuf_release(&tracebuf);
+	} else {
+		packet_trace(buffer, len, 0);
+	}
 
 	if ((options & PACKET_READ_DIE_ON_ERR_PACKET) &&
 	    starts_with(buffer, "ERR "))
diff --git a/pkt-line.h b/pkt-line.h
index 467ae013573..6d2a63db238 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -87,6 +87,7 @@ void packet_fflush(FILE *f);
 #define PACKET_READ_CHOMP_NEWLINE        (1u<<1)
 #define PACKET_READ_DIE_ON_ERR_PACKET    (1u<<2)
 #define PACKET_READ_GENTLE_ON_READ_ERROR (1u<<3)
+#define PACKET_READ_REDACT_URI_PATH      (1u<<4)
 int packet_read(int fd, char *buffer, unsigned size, int options);
 
 /*
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index d527cf6c49f..f01af2f2ed3 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -1107,6 +1107,57 @@ test_expect_success 'packfile-uri with transfer.fsckobjects fails when .gitmodul
 	test_i18ngrep "disallowed submodule name" err
 '
 
+test_expect_success 'packfile-uri path redacted in trace' '
+	P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	rm -rf "$P" http_child log &&
+
+	git init "$P" &&
+	git -C "$P" config "uploadpack.allowsidebandall" "true" &&
+
+	echo my-blob >"$P/my-blob" &&
+	git -C "$P" add my-blob &&
+	git -C "$P" commit -m x &&
+
+	git -C "$P" hash-object my-blob >objh &&
+	git -C "$P" pack-objects "$HTTPD_DOCUMENT_ROOT_PATH/mypack" <objh >packh &&
+	git -C "$P" config --add \
+		"uploadpack.blobpackfileuri" \
+		"$(cat objh) $(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" &&
+
+	GIT_TRACE=1 GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \
+	git -c protocol.version=2 \
+		-c fetch.uriprotocols=http,https \
+		clone "$HTTPD_URL/smart/http_parent" http_child &&
+
+	grep -F "clone< \\1$(cat packh) $HTTPD_URL/<redacted>" log
+'
+
+test_expect_success 'packfile-uri path not redacted in trace when GIT_TRACE_REDACT=0' '
+	P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	rm -rf "$P" http_child log &&
+
+	git init "$P" &&
+	git -C "$P" config "uploadpack.allowsidebandall" "true" &&
+
+	echo my-blob >"$P/my-blob" &&
+	git -C "$P" add my-blob &&
+	git -C "$P" commit -m x &&
+
+	git -C "$P" hash-object my-blob >objh &&
+	git -C "$P" pack-objects "$HTTPD_DOCUMENT_ROOT_PATH/mypack" <objh >packh &&
+	git -C "$P" config --add \
+		"uploadpack.blobpackfileuri" \
+		"$(cat objh) $(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" &&
+
+	GIT_TRACE=1 GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \
+	GIT_TRACE_REDACT=0 \
+	git -c protocol.version=2 \
+		-c fetch.uriprotocols=http,https \
+		clone "$HTTPD_URL/smart/http_parent" http_child &&
+
+	grep -F "clone< \\1$(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" log
+'
+
 test_expect_success 'http:// --negotiate-only' '
 	SERVER="$HTTPD_DOCUMENT_ROOT_PATH/server" &&
 	URI="$HTTPD_URL/smart/server" &&
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v6 2/2] http-fetch: redact url on die() message
  2021-10-29 18:42         ` [PATCH v6 0/2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
  2021-10-29 18:42           ` [PATCH v6 1/2] " Ivan Frade via GitGitGadget
@ 2021-10-29 18:42           ` Ivan Frade via GitGitGadget
  2021-11-08 23:06             ` Jonathan Tan
  2021-11-10 23:51           ` [PATCH v7 0/2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
  2 siblings, 1 reply; 43+ messages in thread
From: Ivan Frade via GitGitGadget @ 2021-10-29 18:42 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, Eric Sunshine, Ivan Frade,
	Ivan Frade

From: Ivan Frade <ifrade@google.com>

http-fetch prints the URL after failing to fetch it. This can be
confusing to users (they cannot really do anything with it), and they
can share by accident a sensitive URL (e.g. with credentials) while
looking for help.

Redact the URL unless the GIT_TRACE_REDACT variable is set to false. This
mimics the redaction of other sensitive information in git, like the
Authorization header in HTTP.

Fix also capitalization of previous die() message (must start in
lowercase).

Signed-off-by: Ivan Frade <ifrade@google.com>
---
 http-fetch.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/http-fetch.c b/http-fetch.c
index fa642462a9e..c7c7d391ac5 100644
--- a/http-fetch.c
+++ b/http-fetch.c
@@ -4,6 +4,7 @@
 #include "http.h"
 #include "walker.h"
 #include "strvec.h"
+#include "urlmatch.h"
 
 static const char http_fetch_usage[] = "git http-fetch "
 "[-c] [-t] [-a] [-v] [--recover] [-w ref] [--stdin | --packfile=hash | commit-id] url";
@@ -63,8 +64,17 @@ static void fetch_single_packfile(struct object_id *packfile_hash,
 	if (start_active_slot(preq->slot)) {
 		run_active_slot(preq->slot);
 		if (results.curl_result != CURLE_OK) {
-			die("Unable to get pack file %s\n%s", preq->url,
-			    curl_errorstr);
+			struct url_info url;
+			char *nurl = url_normalize(preq->url, &url);
+			if (!nurl || !git_env_bool("GIT_TRACE_REDACT", 1)) {
+				die("unable to get pack file '%s'\n%s", preq->url,
+				    curl_errorstr);
+			} else {
+				die("failed to get '%.*s' url from '%.*s' "
+				    "(full URL redacted due to GIT_TRACE_REDACT setting)\n%s",
+				    (int)url.scheme_len, url.url,
+				    (int)url.host_len, &url.url[url.host_off], curl_errorstr);
+			}
 		}
 	} else {
 		die("Unable to start request");
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 1/2] fetch-pack: redact packfile urls in traces
  2021-10-29 18:42             ` Ivan Frade
@ 2021-10-29 19:59               ` Junio C Hamano
  2021-11-08 22:43                 ` Jonathan Tan
  0 siblings, 1 reply; 43+ messages in thread
From: Junio C Hamano @ 2021-10-29 19:59 UTC (permalink / raw)
  To: Ivan Frade
  Cc: Ivan Frade via GitGitGadget, git,
	Ævar Arnfjörð Bjarmason, Eric Sunshine

Ivan Frade <ifrade@google.com> writes:

> On Thu, Oct 28, 2021 at 4:21 PM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> "Ivan Frade via GitGitGadget" <gitgitgadget@gmail.com> writes:
>>
>
>> > +     len = strspn(buffer, "0123456789abcdefABCDEF");
>> > +     if (!(len == 40 || len == 64) || buffer[len] != ' ')
>> > +             return NULL; /* required "<hash>SP" not seen */
>>
>> People may have comments on hardcoded 40/64 here and offer a better
>> way to write it ;-)
>
> Latest version uses the_hash_algo->hexsz:
>
> +       if (len != (int)the_hash_algo->hexsz || buffer[len] != ' ')
> +               return NULL; /* required "<hash>SP" not seen */
>
> Thanks!

OK.  If the <hash> is given by this side (as opposed to "you started
to talk to a remote, and it turns out that you are still talking
SHA-1 but the other side talks SHA-256 and their <hash> size that is
64 does not match your 40" case), then checking against
the_hash_algo->hexsz should be sufficient.  The original suggestion
was tried both because I didn't know where <hash> originates, and we
would want to redact even in such a hash type mismatch case.

Thanks.  Will take a look at the updated one.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 2/2] http-fetch: redact url on die() message
  2021-10-28 16:39         ` Ævar Arnfjörð Bjarmason
  2021-10-28 17:25           ` Eric Sunshine
  2021-10-28 22:41           ` Ivan Frade
@ 2021-10-29 23:18           ` Junio C Hamano
  2021-11-09  1:54             ` Ævar Arnfjörð Bjarmason
  2 siblings, 1 reply; 43+ messages in thread
From: Junio C Hamano @ 2021-10-29 23:18 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Ivan Frade via GitGitGadget, git, Ivan Frade

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> +			if (!git_env_bool("GIT_TRACE_REDACT", 1) || !nurl) {
>> +				die("Unable to get pack file %s\n%s", preq->url,
>> +				    curl_errorstr);
>
> small nit: arrange if's from "if (cheap || expensive)", i.e. no need for
> getenv() if !nurl, but maybe compilers are smart enough for that...

They typically do not see what happens inside git_env_bool() while
compling this compilation unit, and cannot tell if the programmer
wanted to call it first for its side effects, hence they cannot
swap them safely.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 1/2] fetch-pack: redact packfile urls in traces
  2021-10-29 19:59               ` Junio C Hamano
@ 2021-11-08 22:43                 ` Jonathan Tan
  0 siblings, 0 replies; 43+ messages in thread
From: Jonathan Tan @ 2021-11-08 22:43 UTC (permalink / raw)
  To: gitster; +Cc: ifrade, gitgitgadget, git, avarab, sunshine, Jonathan Tan

> OK.  If the <hash> is given by this side (as opposed to "you started
> to talk to a remote, and it turns out that you are still talking
> SHA-1 but the other side talks SHA-256 and their <hash> size that is
> 64 does not match your 40" case), then checking against
> the_hash_algo->hexsz should be sufficient.  The original suggestion
> was tried both because I didn't know where <hash> originates, and we
> would want to redact even in such a hash type mismatch case.
> 
> Thanks.  Will take a look at the updated one.

This is when reading from the remote, so <hash> comes from the other
side. I don't think that the remote sending the wrong hash size (and
then needing to redact) is a big concern, but there is definitely no
harm in checking for both (and commenting that these are the SHA-1 and
SHA-256 hash sizes).

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 1/2] fetch-pack: redact packfile urls in traces
  2021-10-29 18:42           ` [PATCH v6 1/2] " Ivan Frade via GitGitGadget
@ 2021-11-08 23:01             ` Jonathan Tan
  2021-11-09  1:36               ` Ævar Arnfjörð Bjarmason
  2021-11-10 21:18               ` Ivan Frade
  0 siblings, 2 replies; 43+ messages in thread
From: Jonathan Tan @ 2021-11-08 23:01 UTC (permalink / raw)
  To: gitgitgadget; +Cc: git, avarab, sunshine, ifrade, Jonathan Tan

> diff --git a/fetch-pack.c b/fetch-pack.c
> index a9604f35a3e..62ea90541c5 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -1653,8 +1653,12 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
>  				receive_wanted_refs(&reader, sought, nr_sought);
>  
>  			/* get the pack(s) */
> +			if (git_env_bool("GIT_TRACE_REDACT", 1))
> +				reader.options |= PACKET_READ_REDACT_URI_PATH;
>  			if (process_section_header(&reader, "packfile-uris", 1))
>  				receive_packfile_uris(&reader, &packfile_uris);
> +			reader.options &= ~PACKET_READ_REDACT_URI_PATH;

Probably worth commenting why you're resetting the flag (avoid the
relatively expensive URI check when we don't need it).

> diff --git a/pkt-line.c b/pkt-line.c
> index 2dc8ac274bd..5a69ddc2e77 100644
> --- a/pkt-line.c
> +++ b/pkt-line.c
> @@ -370,6 +370,31 @@ int packet_length(const char lenbuf_hex[4])
>  	return (val < 0) ? val : (val << 8) | hex2chr(lenbuf_hex + 2);
>  }
>  
> +static char *find_packfile_uri_path(const char *buffer)
> +{
> +	const char *URI_MARK = "://";
> +	char *path;
> +	int len;
> +
> +	/* First char is sideband mark */
> +	buffer += 1;
> +
> +	len = strspn(buffer, "0123456789abcdefABCDEF");
> +	if (len != (int)the_hash_algo->hexsz || buffer[len] != ' ')
> +		return NULL; /* required "<hash>SP" not seen */

Optional: As I said in my reply (just sent out), checking for both SHA-1
and SHA-256 lengths is reasonable too.

[1] https://lore.kernel.org/git/20211108224335.569596-1-jonathantanmy@google.com/

> diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
> index d527cf6c49f..f01af2f2ed3 100755
> --- a/t/t5702-protocol-v2.sh
> +++ b/t/t5702-protocol-v2.sh
> @@ -1107,6 +1107,57 @@ test_expect_success 'packfile-uri with transfer.fsckobjects fails when .gitmodul
>  	test_i18ngrep "disallowed submodule name" err
>  '
>  
> +test_expect_success 'packfile-uri path redacted in trace' '
> +	P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> +	rm -rf "$P" http_child log &&
> +
> +	git init "$P" &&
> +	git -C "$P" config "uploadpack.allowsidebandall" "true" &&
> +
> +	echo my-blob >"$P/my-blob" &&
> +	git -C "$P" add my-blob &&
> +	git -C "$P" commit -m x &&
> +
> +	git -C "$P" hash-object my-blob >objh &&
> +	git -C "$P" pack-objects "$HTTPD_DOCUMENT_ROOT_PATH/mypack" <objh >packh &&
> +	git -C "$P" config --add \
> +		"uploadpack.blobpackfileuri" \
> +		"$(cat objh) $(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" &&
> +
> +	GIT_TRACE=1 GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \

No need for GIT_TRACE=1 since you're not checking stdout. Also I don't
think GIT_TEST_SIDEBAND_ALL=1 is needed - we should check that it works
even without a test variable (and I've checked and it seems to work).

[snip]

> +test_expect_success 'packfile-uri path not redacted in trace when GIT_TRACE_REDACT=0' '
> +	P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> +	rm -rf "$P" http_child log &&
> +
> +	git init "$P" &&
> +	git -C "$P" config "uploadpack.allowsidebandall" "true" &&
> +
> +	echo my-blob >"$P/my-blob" &&
> +	git -C "$P" add my-blob &&
> +	git -C "$P" commit -m x &&
> +
> +	git -C "$P" hash-object my-blob >objh &&
> +	git -C "$P" pack-objects "$HTTPD_DOCUMENT_ROOT_PATH/mypack" <objh >packh &&
> +	git -C "$P" config --add \
> +		"uploadpack.blobpackfileuri" \
> +		"$(cat objh) $(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" &&
> +
> +	GIT_TRACE=1 GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \

Same comment here.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 2/2] http-fetch: redact url on die() message
  2021-10-29 18:42           ` [PATCH v6 2/2] http-fetch: redact url on die() message Ivan Frade via GitGitGadget
@ 2021-11-08 23:06             ` Jonathan Tan
  0 siblings, 0 replies; 43+ messages in thread
From: Jonathan Tan @ 2021-11-08 23:06 UTC (permalink / raw)
  To: gitgitgadget; +Cc: git, avarab, sunshine, ifrade, Jonathan Tan

> @@ -63,8 +64,17 @@ static void fetch_single_packfile(struct object_id *packfile_hash,
>  	if (start_active_slot(preq->slot)) {
>  		run_active_slot(preq->slot);
>  		if (results.curl_result != CURLE_OK) {
> -			die("Unable to get pack file %s\n%s", preq->url,
> -			    curl_errorstr);
> +			struct url_info url;
> +			char *nurl = url_normalize(preq->url, &url);
> +			if (!nurl || !git_env_bool("GIT_TRACE_REDACT", 1)) {
> +				die("unable to get pack file '%s'\n%s", preq->url,
> +				    curl_errorstr);
> +			} else {
> +				die("failed to get '%.*s' url from '%.*s' "
> +				    "(full URL redacted due to GIT_TRACE_REDACT setting)\n%s",
> +				    (int)url.scheme_len, url.url,
> +				    (int)url.host_len, &url.url[url.host_off], curl_errorstr);
> +			}

I was confused why nurl was set but never used in "else", but I see that
it's because url_normalize() also sets that value in the urlinfo struct.
This patch looks good (and patch 1 too, with my suggested changes).

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 1/2] fetch-pack: redact packfile urls in traces
  2021-11-08 23:01             ` Jonathan Tan
@ 2021-11-09  1:36               ` Ævar Arnfjörð Bjarmason
  2021-11-10 23:44                 ` Ivan Frade
  2021-11-10 21:18               ` Ivan Frade
  1 sibling, 1 reply; 43+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-09  1:36 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: gitgitgadget, git, sunshine, ifrade


On Mon, Nov 08 2021, Jonathan Tan wrote:

>> diff --git a/fetch-pack.c b/fetch-pack.c
>> index a9604f35a3e..62ea90541c5 100644
>> --- a/fetch-pack.c
>> +++ b/fetch-pack.c
>> @@ -1653,8 +1653,12 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
>>  				receive_wanted_refs(&reader, sought, nr_sought);
>>  
>>  			/* get the pack(s) */
>> +			if (git_env_bool("GIT_TRACE_REDACT", 1))
>> +				reader.options |= PACKET_READ_REDACT_URI_PATH;
>>  			if (process_section_header(&reader, "packfile-uris", 1))
>>  				receive_packfile_uris(&reader, &packfile_uris);
>> +			reader.options &= ~PACKET_READ_REDACT_URI_PATH;
>
> Probably worth commenting why you're resetting the flag (avoid the
> relatively expensive URI check when we don't need it).

...yeah...

>> diff --git a/pkt-line.c b/pkt-line.c
>> index 2dc8ac274bd..5a69ddc2e77 100644
>> --- a/pkt-line.c
>> +++ b/pkt-line.c
>> @@ -370,6 +370,31 @@ int packet_length(const char lenbuf_hex[4])
>>  	return (val < 0) ? val : (val << 8) | hex2chr(lenbuf_hex + 2);
>>  }
>>  
>> +static char *find_packfile_uri_path(const char *buffer)
>> +{
>> +	const char *URI_MARK = "://";
>> +	char *path;
>> +	int len;
>> +
>> +	/* First char is sideband mark */
>> +	buffer += 1;
>> +
>> +	len = strspn(buffer, "0123456789abcdefABCDEF");
>> +	if (len != (int)the_hash_algo->hexsz || buffer[len] != ' ')
>> +		return NULL; /* required "<hash>SP" not seen */
>
> Optional: As I said in my reply (just sent out), checking for both SHA-1
> and SHA-256 lengths is reasonable too.
>
> [1] https://lore.kernel.org/git/20211108224335.569596-1-jonathantanmy@google.com/

Correct me if I'm wrong, but I find it really strange that we're trying
to parse things in pkt-line.c like this.

We'll only get these from a client in code that's invoked in
fetch-pack.c, specifically the process_section_header() quoted above,
no?

From there we'll call packet_reader_read(), which will call
packet_read_with_status(), and from there we'll call packet_trace().

Then right after all this happens we've got a loop that parses out these
packfile URIs, including this being-done-first-here parsing of the hex
value just for logging, except in pkt-line.c we've lost the information
about what hash algorithm length we should be using, which fetch-pack.c
of course needs to know.

Why can't that process_section_header() in fetch-pack.c just be made to
call some pkt-line.c API saying "don't log yet", i.e. something like
this pseudocode:

diff --git a/fetch-pack.c b/fetch-pack.c
index a9604f35a3e..31f5ee7fc6b 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1518,14 +1518,18 @@ static void receive_wanted_refs(struct packet_reader *reader,
 static void receive_packfile_uris(struct packet_reader *reader,
                                  struct string_list *uris)
 {
+       struct string_list log = STRING_LIST_INIT_DUP;
+
        process_section_header(reader, "packfile-uris", 0);
-       while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+       while (packet_reader_read_log_to(reader, &log) == PACKET_READ_NORMAL) {
                if (reader->pktlen < the_hash_algo->hexsz ||
                    reader->line[the_hash_algo->hexsz] != ' ')
                        die("expected '<hash> <uri>', got: %s\n", reader->line);
 
+               /* move the parsing of the URLs here */
                string_list_append(uris, reader->line);
        }
+       log_stuff(&log);
        if (reader->status != PACKET_READ_DELIM)
                die("expected DELIM");
 }

I.e. we'll eventually call trace_strbuf() in pkt-line.c, and we know
that we're doing these packfile-uris, and we know that we're just about
to parse them. Let's just:

 1. Start reading the section
 2. Turn off tracing
 3. Parse the URIs as we go
 3. When done (or on the fly), scrub URIs, log any backlog suppressed trace, and turn on tracing again

Instead of:

 1. Set a flag to scrub stuff
 2. Because of the disconnect between fetch-pack.c and pkt-line.c,
    effectively implement a new parser for data we're already going to be
    parsing some microseconds later during the course of the request.

That "turn off the trace" could be passing down a string_list/strbuf, or
even doing the same via a nev member in "struct packet_reader", both
would be simpler than needing to re-do the parse. Probably simplest is just:

    struct string_list log = STRING_LIST_INIT_DUP;

    reader.deferred_trace = &log;
    /* packet_reader_read() etc. code, unchanged from now */
    /* parse URIs (just move the existing code around a bit) */
    packet_reader.deferred_trace = NULL;
    for_each...(item, &log)
        trace_strbuf(...);

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 2/2] http-fetch: redact url on die() message
  2021-10-29 23:18           ` Junio C Hamano
@ 2021-11-09  1:54             ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 43+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-09  1:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ivan Frade via GitGitGadget, git, Ivan Frade


On Fri, Oct 29 2021, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>>> +			if (!git_env_bool("GIT_TRACE_REDACT", 1) || !nurl) {
>>> +				die("Unable to get pack file %s\n%s", preq->url,
>>> +				    curl_errorstr);
>>
>> small nit: arrange if's from "if (cheap || expensive)", i.e. no need for
>> getenv() if !nurl, but maybe compilers are smart enough for that...
>
> They typically do not see what happens inside git_env_bool() while
> compling this compilation unit, and cannot tell if the programmer
> wanted to call it first for its side effects, hence they cannot
> swap them safely.

*nod*, but since that function is just:
    
    int git_env_bool(const char *k, int def)
    {
            const char *v = getenv(k);
            return v ? git_config_bool(k, v) : def;
    }

I was hedging and pondering if some compilers were smart enough these
days to optimize things like that.

I.e. in this case getenv() is a simple C library function, the env
variable is constant, and we do a boolean test of it before calling
git_config_bool().

So a sufficiently smart compiler could turn that into:

     /* global, probably something iterated over env already */
    static int __have_seen_GIT_TRACE_REDACT = 0;
    ...

    if ((!__have_seen_GIT_TRACE_REDACT || !nurl) ||
        (__have_seen_GIT_TRACE_REDACT && git_env_bool_without_v_bool_check(...)))

But probably not, since it wolud need quite a bit of C library
cooperation/hooks...

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 1/2] fetch-pack: redact packfile urls in traces
  2021-11-08 23:01             ` Jonathan Tan
  2021-11-09  1:36               ` Ævar Arnfjörð Bjarmason
@ 2021-11-10 21:18               ` Ivan Frade
  1 sibling, 0 replies; 43+ messages in thread
From: Ivan Frade @ 2021-11-10 21:18 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: gitgitgadget, git, avarab, sunshine

On Mon, Nov 8, 2021 at 3:01 PM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> > +                     reader.options &= ~PACKET_READ_REDACT_URI_PATH;
>
> Probably worth commenting why you're resetting the flag (avoid the
> relatively expensive URI check when we don't need it).

Done

>
> > diff --git a/pkt-line.c b/pkt-line.c
...
> > +     len = strspn(buffer, "0123456789abcdefABCDEF");
> > +     if (len != (int)the_hash_algo->hexsz || buffer[len] != ' ')
> > +             return NULL; /* required "<hash>SP" not seen */
>
> Optional: As I said in my reply (just sent out), checking for both SHA-1
> and SHA-256 lengths is reasonable too.

Done (with a comment indicating they are the hash sizes of SHA1 and SHA256)

> > +     GIT_TRACE=1 GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \
>
> No need for GIT_TRACE=1 since you're not checking stdout. Also I don't
> think GIT_TEST_SIDEBAND_ALL=1 is needed - we should check that it works
> even without a test variable (and I've checked and it seems to work).

Done in both tests.

Thanks,

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 1/2] fetch-pack: redact packfile urls in traces
  2021-11-09  1:36               ` Ævar Arnfjörð Bjarmason
@ 2021-11-10 23:44                 ` Ivan Frade
  2021-11-11  0:01                   ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 43+ messages in thread
From: Ivan Frade @ 2021-11-10 23:44 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Jonathan Tan, gitgitgadget, git, sunshine

On Mon, Nov 8, 2021 at 5:53 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>
...
>... Let's just:
>
>  1. Start reading the section
>  2. Turn off tracing
>  3. Parse the URIs as we go
>  3. When done (or on the fly), scrub URIs, log any backlog suppressed trace, and turn on tracing again

This is a more generic redacting mechanism, but I understood that
there is no need for it. Previous comments went in the direction of
removing generality (e.g. not looking for a URI anywhere in the
packet, but specifically for the packfile line format) and now this
patch is very specific to redact packfile-uri lines in the protocol.

> Instead of:
>
>  1. Set a flag to scrub stuff
>  2. Because of the disconnect between fetch-pack.c and pkt-line.c,
>     effectively implement a new parser for data we're already going to be
>     parsing some microseconds later during the course of the request.

pkt-line is only looking for the "<n-hex-chars>SP" shape. True that it
encodes some protocol knowledge, but it is hardly a new parser.

> That "turn off the trace" could be passing down a string_list/strbuf, or
> even doing the same via a nev member in "struct packet_reader", both
> would be simpler than needing to re-do the parse.

Saving the lines and delaying the tracing could also produce weird
outputs, no? e.g. 3 lines received, the second doesn't validate, the
program aborts and the trace doesn't show any of the lines that caused
the problem. Or we would need to iterate in parallel through lines and
saved-log-lines assuming they match 1:1. Nothing unsolvable, but I am
not sure it is worthy the effort now.

Thanks,

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH v7 0/2] fetch-pack: redact packfile urls in traces
  2021-10-29 18:42         ` [PATCH v6 0/2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
  2021-10-29 18:42           ` [PATCH v6 1/2] " Ivan Frade via GitGitGadget
  2021-10-29 18:42           ` [PATCH v6 2/2] http-fetch: redact url on die() message Ivan Frade via GitGitGadget
@ 2021-11-10 23:51           ` Ivan Frade via GitGitGadget
  2021-11-10 23:51             ` [PATCH v7 1/2] " Ivan Frade via GitGitGadget
                               ` (2 more replies)
  2 siblings, 3 replies; 43+ messages in thread
From: Ivan Frade via GitGitGadget @ 2021-11-10 23:51 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Jonathan Tan, Ivan Frade

Changes since v6:

 * Use specific hash sizes instead of hexsz
 * Remove unnecessary env vars in tests
 * Added comment on bit toggle

Changes since v5:

 * Use hexsz instead of hardcoded hash sizes

Changes since v4:

 * Use "uri" instead of "url"
 * Look specifically for a line with packfile-uri format (instead of for a
   URL in general)
 * Limit the redacting to the packfile-uri section in do_fetch_pack_v2
 * Use "%.*s" instead of duplicating parts of the string to print

Changes since v3:

 * Enable redacting URLs for all sections
 * Redact only URL path (it was until the end of line)
 * Redact URL in die() with more friendly message
 * Update doc to mention that packfile URIs are also redacted.

Changes since v2:

 * Redact only the path of the URL
 * Test are now strict, validating the exact line expected in the log

Changes since v1:

 * Removed non-POSIX flags in tests
 * More accurate regex for the non-encrypted packfile line
 * Dropped documentation change
 * Dropped redacting the die message in http-fetch

Ivan Frade (2):
  fetch-pack: redact packfile urls in traces
  http-fetch: redact url on die() message

 Documentation/git.txt  |  5 +++--
 fetch-pack.c           |  5 +++++
 http-fetch.c           | 14 ++++++++++--
 pkt-line.c             | 40 ++++++++++++++++++++++++++++++++-
 pkt-line.h             |  1 +
 t/t5702-protocol-v2.sh | 51 ++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 111 insertions(+), 5 deletions(-)


base-commit: 88d915a634b449147855041d44875322de2b286d
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1052%2Fifradeo%2Fredact-packfile-uri-v7
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1052/ifradeo/redact-packfile-uri-v7
Pull-Request: https://github.com/gitgitgadget/git/pull/1052

Range-diff vs v6:

 1:  a6098f98946 ! 1:  bbfdc346ede fetch-pack: redact packfile urls in traces
     @@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
      +				reader.options |= PACKET_READ_REDACT_URI_PATH;
       			if (process_section_header(&reader, "packfile-uris", 1))
       				receive_packfile_uris(&reader, &packfile_uris);
     ++			/* We don't expect more URIs. Reset to avoid expensive URI check. */
      +			reader.options &= ~PACKET_READ_REDACT_URI_PATH;
      +
       			process_section_header(&reader, "packfile", 0);
     @@ pkt-line.c: int packet_length(const char lenbuf_hex[4])
      +	buffer += 1;
      +
      +	len = strspn(buffer, "0123456789abcdefABCDEF");
     -+	if (len != (int)the_hash_algo->hexsz || buffer[len] != ' ')
     ++	/* size of SHA1 and SHA256 hash */
     ++	if (!(len == 40 || len == 64) || buffer[len] != ' ')
      +		return NULL; /* required "<hash>SP" not seen */
      +
      +	path = strstr(buffer + len + 1, URI_MARK);
     @@ t/t5702-protocol-v2.sh: test_expect_success 'packfile-uri with transfer.fsckobje
      +		"uploadpack.blobpackfileuri" \
      +		"$(cat objh) $(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" &&
      +
     -+	GIT_TRACE=1 GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \
     ++	GIT_TRACE_PACKET="$(pwd)/log" \
      +	git -c protocol.version=2 \
      +		-c fetch.uriprotocols=http,https \
      +		clone "$HTTPD_URL/smart/http_parent" http_child &&
     @@ t/t5702-protocol-v2.sh: test_expect_success 'packfile-uri with transfer.fsckobje
      +		"uploadpack.blobpackfileuri" \
      +		"$(cat objh) $(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" &&
      +
     -+	GIT_TRACE=1 GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \
     ++	GIT_TRACE_PACKET="$(pwd)/log" \
      +	GIT_TRACE_REDACT=0 \
      +	git -c protocol.version=2 \
      +		-c fetch.uriprotocols=http,https \
 2:  38859ae7b7d = 2:  3b210735bc8 http-fetch: redact url on die() message

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH v7 1/2] fetch-pack: redact packfile urls in traces
  2021-11-10 23:51           ` [PATCH v7 0/2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
@ 2021-11-10 23:51             ` Ivan Frade via GitGitGadget
  2021-11-10 23:51             ` [PATCH v7 2/2] http-fetch: redact url on die() message Ivan Frade via GitGitGadget
  2021-11-12  4:43             ` [PATCH v7 0/2] fetch-pack: redact packfile urls in traces Junio C Hamano
  2 siblings, 0 replies; 43+ messages in thread
From: Ivan Frade via GitGitGadget @ 2021-11-10 23:51 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Jonathan Tan, Ivan Frade, Ivan Frade

From: Ivan Frade <ifrade@google.com>

In some setups, packfile uris act as bearer token. It is not
recommended to expose them plainly in logs, although in special
circunstances (e.g. debug) it makes sense to write them.

Redact the packfile URL paths by default, unless the GIT_TRACE_REDACT
variable is set to false. This mimics the redacting of the Authorization
header in HTTP.

Signed-off-by: Ivan Frade <ifrade@google.com>
---
 Documentation/git.txt  |  5 +++--
 fetch-pack.c           |  5 +++++
 pkt-line.c             | 40 ++++++++++++++++++++++++++++++++-
 pkt-line.h             |  1 +
 t/t5702-protocol-v2.sh | 51 ++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 99 insertions(+), 3 deletions(-)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index 281c5f8caef..13f83a2a3a1 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -832,8 +832,9 @@ for full details.
 
 `GIT_TRACE_REDACT`::
 	By default, when tracing is activated, Git redacts the values of
-	cookies, the "Authorization:" header, and the "Proxy-Authorization:"
-	header. Set this variable to `0` to prevent this redaction.
+	cookies, the "Authorization:" header, the "Proxy-Authorization:"
+	header and packfile URIs. Set this variable to `0` to prevent this
+	redaction.
 
 `GIT_LITERAL_PATHSPECS`::
 	Setting this variable to `1` will cause Git to treat all
diff --git a/fetch-pack.c b/fetch-pack.c
index a9604f35a3e..8b8c75f33aa 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1653,8 +1653,13 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 				receive_wanted_refs(&reader, sought, nr_sought);
 
 			/* get the pack(s) */
+			if (git_env_bool("GIT_TRACE_REDACT", 1))
+				reader.options |= PACKET_READ_REDACT_URI_PATH;
 			if (process_section_header(&reader, "packfile-uris", 1))
 				receive_packfile_uris(&reader, &packfile_uris);
+			/* We don't expect more URIs. Reset to avoid expensive URI check. */
+			reader.options &= ~PACKET_READ_REDACT_URI_PATH;
+
 			process_section_header(&reader, "packfile", 0);
 
 			/*
diff --git a/pkt-line.c b/pkt-line.c
index 2dc8ac274bd..8e43c2def4c 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -370,6 +370,32 @@ int packet_length(const char lenbuf_hex[4])
 	return (val < 0) ? val : (val << 8) | hex2chr(lenbuf_hex + 2);
 }
 
+static char *find_packfile_uri_path(const char *buffer)
+{
+	const char *URI_MARK = "://";
+	char *path;
+	int len;
+
+	/* First char is sideband mark */
+	buffer += 1;
+
+	len = strspn(buffer, "0123456789abcdefABCDEF");
+	/* size of SHA1 and SHA256 hash */
+	if (!(len == 40 || len == 64) || buffer[len] != ' ')
+		return NULL; /* required "<hash>SP" not seen */
+
+	path = strstr(buffer + len + 1, URI_MARK);
+	if (!path)
+		return NULL;
+
+	path = strchr(path + strlen(URI_MARK), '/');
+	if (!path || !*(path + 1))
+		return NULL;
+
+	/* position after '/' */
+	return ++path;
+}
+
 enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 						size_t *src_len, char *buffer,
 						unsigned size, int *pktlen,
@@ -377,6 +403,7 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 {
 	int len;
 	char linelen[4];
+	char *uri_path_start;
 
 	if (get_packet_data(fd, src_buffer, src_len, linelen, 4, options) < 0) {
 		*pktlen = -1;
@@ -427,7 +454,18 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 		len--;
 
 	buffer[len] = 0;
-	packet_trace(buffer, len, 0);
+	if (options & PACKET_READ_REDACT_URI_PATH &&
+	    (uri_path_start = find_packfile_uri_path(buffer))) {
+		const char *redacted = "<redacted>";
+		struct strbuf tracebuf = STRBUF_INIT;
+		strbuf_insert(&tracebuf, 0, buffer, len);
+		strbuf_splice(&tracebuf, uri_path_start - buffer,
+			      strlen(uri_path_start), redacted, strlen(redacted));
+		packet_trace(tracebuf.buf, tracebuf.len, 0);
+		strbuf_release(&tracebuf);
+	} else {
+		packet_trace(buffer, len, 0);
+	}
 
 	if ((options & PACKET_READ_DIE_ON_ERR_PACKET) &&
 	    starts_with(buffer, "ERR "))
diff --git a/pkt-line.h b/pkt-line.h
index 467ae013573..6d2a63db238 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -87,6 +87,7 @@ void packet_fflush(FILE *f);
 #define PACKET_READ_CHOMP_NEWLINE        (1u<<1)
 #define PACKET_READ_DIE_ON_ERR_PACKET    (1u<<2)
 #define PACKET_READ_GENTLE_ON_READ_ERROR (1u<<3)
+#define PACKET_READ_REDACT_URI_PATH      (1u<<4)
 int packet_read(int fd, char *buffer, unsigned size, int options);
 
 /*
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index d527cf6c49f..78f85b0714a 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -1107,6 +1107,57 @@ test_expect_success 'packfile-uri with transfer.fsckobjects fails when .gitmodul
 	test_i18ngrep "disallowed submodule name" err
 '
 
+test_expect_success 'packfile-uri path redacted in trace' '
+	P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	rm -rf "$P" http_child log &&
+
+	git init "$P" &&
+	git -C "$P" config "uploadpack.allowsidebandall" "true" &&
+
+	echo my-blob >"$P/my-blob" &&
+	git -C "$P" add my-blob &&
+	git -C "$P" commit -m x &&
+
+	git -C "$P" hash-object my-blob >objh &&
+	git -C "$P" pack-objects "$HTTPD_DOCUMENT_ROOT_PATH/mypack" <objh >packh &&
+	git -C "$P" config --add \
+		"uploadpack.blobpackfileuri" \
+		"$(cat objh) $(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" &&
+
+	GIT_TRACE_PACKET="$(pwd)/log" \
+	git -c protocol.version=2 \
+		-c fetch.uriprotocols=http,https \
+		clone "$HTTPD_URL/smart/http_parent" http_child &&
+
+	grep -F "clone< \\1$(cat packh) $HTTPD_URL/<redacted>" log
+'
+
+test_expect_success 'packfile-uri path not redacted in trace when GIT_TRACE_REDACT=0' '
+	P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	rm -rf "$P" http_child log &&
+
+	git init "$P" &&
+	git -C "$P" config "uploadpack.allowsidebandall" "true" &&
+
+	echo my-blob >"$P/my-blob" &&
+	git -C "$P" add my-blob &&
+	git -C "$P" commit -m x &&
+
+	git -C "$P" hash-object my-blob >objh &&
+	git -C "$P" pack-objects "$HTTPD_DOCUMENT_ROOT_PATH/mypack" <objh >packh &&
+	git -C "$P" config --add \
+		"uploadpack.blobpackfileuri" \
+		"$(cat objh) $(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" &&
+
+	GIT_TRACE_PACKET="$(pwd)/log" \
+	GIT_TRACE_REDACT=0 \
+	git -c protocol.version=2 \
+		-c fetch.uriprotocols=http,https \
+		clone "$HTTPD_URL/smart/http_parent" http_child &&
+
+	grep -F "clone< \\1$(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" log
+'
+
 test_expect_success 'http:// --negotiate-only' '
 	SERVER="$HTTPD_DOCUMENT_ROOT_PATH/server" &&
 	URI="$HTTPD_URL/smart/server" &&
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v7 2/2] http-fetch: redact url on die() message
  2021-11-10 23:51           ` [PATCH v7 0/2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
  2021-11-10 23:51             ` [PATCH v7 1/2] " Ivan Frade via GitGitGadget
@ 2021-11-10 23:51             ` Ivan Frade via GitGitGadget
  2021-11-12  4:43             ` [PATCH v7 0/2] fetch-pack: redact packfile urls in traces Junio C Hamano
  2 siblings, 0 replies; 43+ messages in thread
From: Ivan Frade via GitGitGadget @ 2021-11-10 23:51 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Jonathan Tan, Ivan Frade, Ivan Frade

From: Ivan Frade <ifrade@google.com>

http-fetch prints the URL after failing to fetch it. This can be
confusing to users (they cannot really do anything with it), and they
can share by accident a sensitive URL (e.g. with credentials) while
looking for help.

Redact the URL unless the GIT_TRACE_REDACT variable is set to false. This
mimics the redaction of other sensitive information in git, like the
Authorization header in HTTP.

Fix also capitalization of previous die() message (must start in
lowercase).

Signed-off-by: Ivan Frade <ifrade@google.com>
---
 http-fetch.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/http-fetch.c b/http-fetch.c
index fa642462a9e..c7c7d391ac5 100644
--- a/http-fetch.c
+++ b/http-fetch.c
@@ -4,6 +4,7 @@
 #include "http.h"
 #include "walker.h"
 #include "strvec.h"
+#include "urlmatch.h"
 
 static const char http_fetch_usage[] = "git http-fetch "
 "[-c] [-t] [-a] [-v] [--recover] [-w ref] [--stdin | --packfile=hash | commit-id] url";
@@ -63,8 +64,17 @@ static void fetch_single_packfile(struct object_id *packfile_hash,
 	if (start_active_slot(preq->slot)) {
 		run_active_slot(preq->slot);
 		if (results.curl_result != CURLE_OK) {
-			die("Unable to get pack file %s\n%s", preq->url,
-			    curl_errorstr);
+			struct url_info url;
+			char *nurl = url_normalize(preq->url, &url);
+			if (!nurl || !git_env_bool("GIT_TRACE_REDACT", 1)) {
+				die("unable to get pack file '%s'\n%s", preq->url,
+				    curl_errorstr);
+			} else {
+				die("failed to get '%.*s' url from '%.*s' "
+				    "(full URL redacted due to GIT_TRACE_REDACT setting)\n%s",
+				    (int)url.scheme_len, url.url,
+				    (int)url.host_len, &url.url[url.host_off], curl_errorstr);
+			}
 		}
 	} else {
 		die("Unable to start request");
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 1/2] fetch-pack: redact packfile urls in traces
  2021-11-10 23:44                 ` Ivan Frade
@ 2021-11-11  0:01                   ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 43+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-11  0:01 UTC (permalink / raw)
  To: Ivan Frade; +Cc: Jonathan Tan, gitgitgadget, git, sunshine


On Wed, Nov 10 2021, Ivan Frade wrote:

> On Mon, Nov 8, 2021 at 5:53 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>>
> ...
>>... Let's just:
>>
>>  1. Start reading the section
>>  2. Turn off tracing
>>  3. Parse the URIs as we go
>>  3. When done (or on the fly), scrub URIs, log any backlog suppressed trace, and turn on tracing again
>
> This is a more generic redacting mechanism, but I understood that
> there is no need for it. Previous comments went in the direction of
> removing generality (e.g. not looking for a URI anywhere in the
> packet, but specifically for the packfile line format) and now this
> patch is very specific to redact packfile-uri lines in the protocol.

It's less generic, because it would live in the loop that consumes the
lines. 

>> Instead of:
>>
>>  1. Set a flag to scrub stuff
>>  2. Because of the disconnect between fetch-pack.c and pkt-line.c,
>>     effectively implement a new parser for data we're already going to be
>>     parsing some microseconds later during the course of the request.
>
> pkt-line is only looking for the "<n-hex-chars>SP" shape. True that it
> encodes some protocol knowledge, but it is hardly a new parser.

Yeah, but why have find_packfile_uri_path() at all instead of just
moving the parsing code around?

We've already got the code that parses these lines, it's just a few
lines removed from the code you're adding...

>> That "turn off the trace" could be passing down a string_list/strbuf, or
>> even doing the same via a nev member in "struct packet_reader", both
>> would be simpler than needing to re-do the parse.
>
> Saving the lines and delaying the tracing could also produce weird
> outputs, no? e.g. 3 lines received, the second doesn't validate, the
> program aborts and the trace doesn't show any of the lines that caused
> the problem. Or we would need to iterate in parallel through lines and
> saved-log-lines assuming they match 1:1. Nothing unsolvable, but I am
> not sure it is worthy the effort now.

It would only be weird if you do :

    download_later =
    while (consume lines)
        download_later += buffer_lines;
    log lines;

I'm suggesting:

    download_later =
    while (consume lines)
        raw, to_log = parse line
        log line(to_log)
        download_later += raw

Sure, you'll need to do something in the case where the line doesn't
validate, should you redact it still, or log it as is? Anyway, that's
also a caveat you've got now.

That's not iterating in parallel, having one for-loop instead of two.

I see now that that approach would also solve at least one
bug/misfeature in the packfile-uri handling, i.e.:

        for (i = 0; i < packfile_uris.nr; i++) {
            [...]
            start_command(...) [... to download the URI ...]
            [...]
            die("fetch-pack: pack downloaded from %s does not match expected hash %.*s",
        }

I.e. we've already received all the URIs, but then do validation on them
one at a time, so we might only notice that the server has sent us bad
data for the Nth URI after first downloading the first N-1 URIs.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v7 0/2] fetch-pack: redact packfile urls in traces
  2021-11-10 23:51           ` [PATCH v7 0/2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
  2021-11-10 23:51             ` [PATCH v7 1/2] " Ivan Frade via GitGitGadget
  2021-11-10 23:51             ` [PATCH v7 2/2] http-fetch: redact url on die() message Ivan Frade via GitGitGadget
@ 2021-11-12  4:43             ` Junio C Hamano
  2 siblings, 0 replies; 43+ messages in thread
From: Junio C Hamano @ 2021-11-12  4:43 UTC (permalink / raw)
  To: Ivan Frade via GitGitGadget
  Cc: git, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Jonathan Tan, Ivan Frade

"Ivan Frade via GitGitGadget" <gitgitgadget@gmail.com> writes:

> Changes since v6:
>
>  * Use specific hash sizes instead of hexsz
>  * Remove unnecessary env vars in tests
>  * Added comment on bit toggle
> Ivan Frade (2):
>   fetch-pack: redact packfile urls in traces
>   http-fetch: redact url on die() message

Thanks, will queue.

^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2021-11-12  4:43 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-08 16:03 [PATCH 0/2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
2021-10-08 16:03 ` [PATCH 1/2] " Ivan Frade via GitGitGadget
2021-10-08 19:36   ` Ævar Arnfjörð Bjarmason
2021-10-08 23:15     ` Ivan Frade
2021-10-08 16:03 ` [PATCH 2/2] Documentation: packfile-uri hash can be longer than 40 hex chars Ivan Frade via GitGitGadget
2021-10-08 19:43   ` Ævar Arnfjörð Bjarmason
2021-10-09  2:20 ` [PATCH v2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
2021-10-11 20:39   ` Junio C Hamano
2021-10-26 19:32     ` Ivan Frade
2021-10-19 22:57   ` [PATCH v3] " Ivan Frade via GitGitGadget
2021-10-20 11:41     ` Ævar Arnfjörð Bjarmason
2021-10-26 22:49     ` [PATCH v4 0/2] " Ivan Frade via GitGitGadget
2021-10-26 22:49       ` [PATCH v4 1/2] " Ivan Frade via GitGitGadget
2021-10-28  1:01         ` Junio C Hamano
2021-10-28 22:15           ` Ivan Frade
2021-10-28 22:46             ` Junio C Hamano
2021-10-26 22:49       ` [PATCH v4 2/2] http-fetch: redact url on die() message Ivan Frade via GitGitGadget
2021-10-28 16:39         ` Ævar Arnfjörð Bjarmason
2021-10-28 17:25           ` Eric Sunshine
2021-10-28 22:44             ` Ivan Frade
2021-10-28 22:41           ` Ivan Frade
2021-10-29 23:18           ` Junio C Hamano
2021-11-09  1:54             ` Ævar Arnfjörð Bjarmason
2021-10-28 22:51       ` [PATCH v5 0/2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
2021-10-28 22:51         ` [PATCH v5 1/2] " Ivan Frade via GitGitGadget
2021-10-28 23:21           ` Junio C Hamano
2021-10-29 18:42             ` Ivan Frade
2021-10-29 19:59               ` Junio C Hamano
2021-11-08 22:43                 ` Jonathan Tan
2021-10-28 22:51         ` [PATCH v5 2/2] http-fetch: redact url on die() message Ivan Frade via GitGitGadget
2021-10-29 18:42         ` [PATCH v6 0/2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
2021-10-29 18:42           ` [PATCH v6 1/2] " Ivan Frade via GitGitGadget
2021-11-08 23:01             ` Jonathan Tan
2021-11-09  1:36               ` Ævar Arnfjörð Bjarmason
2021-11-10 23:44                 ` Ivan Frade
2021-11-11  0:01                   ` Ævar Arnfjörð Bjarmason
2021-11-10 21:18               ` Ivan Frade
2021-10-29 18:42           ` [PATCH v6 2/2] http-fetch: redact url on die() message Ivan Frade via GitGitGadget
2021-11-08 23:06             ` Jonathan Tan
2021-11-10 23:51           ` [PATCH v7 0/2] fetch-pack: redact packfile urls in traces Ivan Frade via GitGitGadget
2021-11-10 23:51             ` [PATCH v7 1/2] " Ivan Frade via GitGitGadget
2021-11-10 23:51             ` [PATCH v7 2/2] http-fetch: redact url on die() message Ivan Frade via GitGitGadget
2021-11-12  4:43             ` [PATCH v7 0/2] fetch-pack: redact packfile urls in traces Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).